Technology

Databricks' KARL Cuts Agent Costs

Databricks' new KARL AI agent drastically cuts costs and latency for enterprise knowledge tasks using custom reinforcement learning.

Mar 5 at 3:31 PM · 2 min read

Databricks has unveiled KARL, a new AI agent designed to accelerate enterprise knowledge work. Built using custom reinforcement learning (RL), KARL aims to dramatically lower the operational costs associated with deploying large language models for complex tasks like document search, fact-finding, and multi-step reasoning.

The explosion of AI agents for knowledge work comes with a steep price tag. Inference costs for these powerful models are growing unsustainably for many organizations. Databricks claims its approach, detailed in a recent blog post, not only matches the quality of leading proprietary models but also outperforms them on inference cost and latency.

KARL Tackles Grounded Reasoning

KARL specializes in 'grounded reasoning': answering questions by actively searching and cross-referencing enterprise data. This capability is crucial for Databricks products like Agent Bricks Knowledge Assistant, but it poses a distinct training challenge, because the correct answer is often subjective and hard to score. Traditional training methods struggle to guide models effectively in such scenarios.
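Databricks has not published KARL's internals, but the core pattern of grounded reasoning, answering only from retrieved documents and citing them, can be sketched in a few lines. Everything below (the `Document` type, the toy corpus, the `search` and `grounded_answer` helpers) is illustrative, not KARL's actual API:

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

# Toy in-memory "enterprise corpus" standing in for a real search index.
CORPUS = [
    Document("hr-001", "The PTO policy grants 20 vacation days per year."),
    Document("hr-002", "Remote employees accrue PTO at the same 20-day rate."),
]

def search(query: str) -> list[Document]:
    """Naive keyword match; a real agent would call a retrieval service."""
    terms = query.lower().split()
    return [d for d in CORPUS if any(t in d.text.lower() for t in terms)]

def grounded_answer(question: str) -> dict:
    """Answer only from retrieved documents, citing every source used."""
    hits = search(question)
    if not hits:
        # Grounding means refusing rather than guessing when nothing matches.
        return {"answer": None, "sources": []}
    return {
        "answer": hits[0].text,
        "sources": [d.doc_id for d in hits],
    }

print(grounded_answer("How many PTO vacation days do employees get?"))
```

The key property is that every answer carries the document IDs that grounded it, which is also what makes training hard: whether the cited evidence actually supports the answer is often a judgment call rather than an exact-match check.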

Databricks leveraged its internal RL infrastructure, developed for its Agent Bricks product, to train KARL. The company reports achieving these quality, cost, and latency gains with only a few thousand GPU hours and entirely synthetic training data, a fraction of the resources typically required.
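The blog-level description, RL post-training where a synthetic grader scores agent behavior, follows the general shape of policy-gradient methods. The toy REINFORCE loop below shows that shape on a two-action "policy"; the actions, the grader, and every constant are stand-ins with no relation to Databricks' actual pipeline:

```python
import math
import random

random.seed(0)

# Two candidate behaviors for a toy agent:
# action 0 = "skim one source", action 1 = "cross-check sources".
logits = [0.0, 0.0]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def synthetic_reward(action: int) -> float:
    # Synthetic grader: cross-checking (action 1) is what gets rewarded.
    return 1.0 if action == 1 else 0.0

lr = 0.5
baseline = 0.5  # constant baseline to reduce gradient variance
for _ in range(200):
    probs = softmax(logits)
    action = random.choices([0, 1], weights=probs)[0]
    advantage = synthetic_reward(action) - baseline
    # REINFORCE update: advantage * d(log pi(action)) / d(logits)
    for a in range(2):
        grad = (1.0 if a == action else 0.0) - probs[a]
        logits[a] += lr * advantage * grad

print(softmax(logits))  # probability mass shifts toward action 1
```

Because rewards come from a synthetic grader rather than labeled data, a loop like this only needs compute, which is consistent with the article's claim of training on entirely synthetic data.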

RL Infrastructure for Customers

The underlying RL pipelines and infrastructure used to create KARL are now being offered to Databricks customers. This move aims to empower businesses to optimize their own high-volume AI agent workloads, reducing costs and improving efficiency for domain-specific tasks. This follows similar industry trends, such as the use of RL in tools like Cursor's Composer model, which saw significant speed and quality gains.

Databricks is launching a private preview of its Custom RL offering, backed by Serverless GPU Compute, letting customers build more efficient, domain-specific versions of their existing agents. The company also highlighted related research, including Instructed Retriever for system-level reasoning in search agents and MemAlign for building better LLM judges. Together, these efforts underscore its commitment to advancing enterprise knowledge agents and exploring new frontiers in Reinforcement Learning for Enterprise Agents.