From Browser Tabs to Insights: My Friday Reads 9/27/2024

Like anyone in tech, I'm constantly bombarded with papers, links, and resources to read. I try to keep up, but it's often impossible to get through them all, and they inevitably pile up in my browser's collection. Despite my efforts to be selective, the list grows faster than I can manage. So I dedicate Fridays to tackling this "technical debt" by reading through as much as possible. Starting today, I'm sharing what I've learned along the way, hoping it might be helpful to others as well. Enjoy!

Tines launched Workbench

Tines is a workflow automation platform for security, IT, and engineering teams that connects tools and data so complex workflows can run with little manual effort. It recently introduced Workbench, an AI-powered chat interface that uses templates to build automated workflows. The aim is to give companies more confidence in moving from manual workflows to AI-driven automation. One standout on their website is a phishing triage and response demo, which shows how organizations can automate critical security processes.

CrowdStrike announced a set of innovations at Fal.Con

CrowdStrike hosted its annual Fal.Con conference last week, unveiling several new AI-powered features. Some key highlights include:

AI-powered parsers: Logs can be notoriously difficult to work with, and despite efforts like OCSF, log standardization remains inconsistent. To address this, CrowdStrike introduced an AI-powered data ingestion pipeline. Users can provide a sample of logs, and Falcon SIEM, leveraging large language models (LLMs), will analyze the structure and content to automatically generate parsers. Interestingly, CrowdStrike emphasized the use of "multiple" LLMs, hinting that both encoder and decoder models may be at play.
AI-powered Attack Path Analysis: Many organizations still rely on CVSS scores to prioritize vulnerabilities, though it's well known that CVSS alone isn't sufficient to identify the most critical risks. Falcon Exposure Management's Attack Path Analysis goes beyond CVSS scores, incorporating environmental factors. AI is then applied to reason through this data, enabling better prioritization of vulnerabilities.
Automated Leads, GenAI-Based Response and Triage: Another notable announcement was on the XDR front, aimed at helping security teams, particularly Security Operations Centers (SOCs), cut through the noise and focus on the most impactful alerts and incidents. CrowdStrike claims that its Charlotte AI can now handle incident and alert triage, often considered one of the most challenging and time-consuming tasks in a SOC.
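
To make the Attack Path Analysis point concrete, here is a toy sketch of why ranking by CVSS alone can mislead. Everything in it is an assumption for illustration: the Finding fields, the exposure weights, and the scoring formula are hypothetical and are not CrowdStrike's actual model.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    name: str
    cvss: float               # CVSS base score, 0.0 to 10.0
    internet_exposed: bool    # hypothetical environmental factor
    asset_criticality: float  # hypothetical weight in [0, 1]


def contextual_risk(finding: Finding) -> float:
    """Scale the CVSS score by made-up environmental weights."""
    exposure = 1.0 if finding.internet_exposed else 0.3
    return finding.cvss * exposure * finding.asset_criticality


findings = [
    Finding("lab-server-rce", cvss=9.8, internet_exposed=False, asset_criticality=0.2),
    Finding("payment-api-ssrf", cvss=7.5, internet_exposed=True, asset_criticality=1.0),
]

# Ranking by CVSS alone puts the isolated lab server first.
by_cvss = sorted(findings, key=lambda f: f.cvss, reverse=True)

# Adding environmental context moves the exposed, critical asset to the top.
by_context = sorted(findings, key=contextual_risk, reverse=True)

print([f.name for f in by_cvss])
print([f.name for f in by_context])
```

The two rankings disagree: the lower-scored vulnerability on an internet-facing, business-critical system ends up ahead of the higher-scored one on an isolated lab machine.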

For more details, you can explore these topics further on the CrowdStrike blog.

To Code, or Not To Code? Exploring Impact of Code in Pre-training

It is widely accepted in the LLM community that including code in pretraining data helps, but until now there has not been a comprehensive ablation study demonstrating the effect. This paper systematically examines how adding code to the pretraining dataset influences LLM performance. It doesn't fully address the "why," but it is a significant empirical study of the impact of code in pretraining.

What other types of data could yield similar benefits? It's an open question, and perhaps the next breakthrough in LLM development lies in identifying those data types.

TaskGen: A Task-Based Memory-Infused Agentic Framework using StrictJSON

This paper comes from a startup called Simbian, which focuses on developing autonomous agents to tackle security challenges. It presents a new agent-based framework that enforces strict JSON output, which reportedly leads to significant time and cost savings. The framework also introduces the concepts of Inner Agent and Meta Agents, which add an interesting dimension to the design and functionality of autonomous systems.
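
For a feel of what "enforcing strict JSON output" can mean in practice, here is a minimal sketch of the general idea: validate the model's reply against an expected set of keys and types, and ask again if it fails. This is not TaskGen's or StrictJSON's actual API. The function names, the schema format, and the fake model below are all hypothetical stand-ins.

```python
import json


def parse_strict(raw: str, schema: dict) -> dict:
    """Parse raw text as JSON and require exactly the keys and types in schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(schema):
        raise ValueError(f"expected keys {sorted(schema)}, got {sorted(data)}")
    for key, expected_type in schema.items():
        if not isinstance(data[key], expected_type):
            raise ValueError(f"field {key!r} is not {expected_type.__name__}")
    return data


def ask_with_retries(llm, prompt: str, schema: dict, max_attempts: int = 3) -> dict:
    """Call llm until its reply passes parse_strict, feeding the error back."""
    last_error = None
    for _ in range(max_attempts):
        full_prompt = prompt
        if last_error is not None:
            full_prompt += f"\nYour previous output was invalid: {last_error}"
        try:
            return parse_strict(llm(full_prompt), schema)
        except ValueError as err:
            last_error = err
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts: {last_error}")


def make_fake_llm():
    """Hypothetical stand-in for a model: chatty reply first, then valid JSON."""
    replies = iter([
        'Sure! Here is the JSON: {"verdict": "phishing"}',
        '{"verdict": "phishing", "confidence": 0.93}',
    ])
    return lambda prompt: next(replies)


schema = {"verdict": str, "confidence": float}
result = ask_with_retries(make_fake_llm(), "Classify this email.", schema)
print(result)
```

Because downstream code only ever sees output that passed validation, malformed replies are caught and retried at the boundary instead of breaking a later step of the agent.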