At Code w/ Claude, Craig Wiley, Sr. Director of Product for AI/ML at Databricks, articulated a compelling vision for enterprise AI, emphasizing the shift from general intelligence to domain-specific "Data Intelligence." Wiley highlighted that while foundational models trained on broad datasets are powerful, their true value for businesses emerges when "AI is connected to your customer data and able to solve domain-specific problems." This core insight underscores Databricks’ strategy: leveraging proprietary data as the linchpin for impactful AI solutions.
Enterprises today grapple with a fragmented data estate—a labyrinth of data lakes, warehouses, streaming systems, and various AI/ML tools spread across multiple clouds and vendors. The result is "a complexity nightmare of high costs and proprietary formats" that severely hinders efforts to extract meaningful value from vast data reserves. Databricks positions its Lakehouse architecture and Unity Catalog as the foundational remedy: a unified platform that brings together structured and unstructured data alongside governance, orchestration, and machine learning capabilities.
The partnership with Anthropic's Claude is central to Databricks' approach to unlocking this data intelligence. Wiley shared the case of FactSet, a Fortune 500 financial services company, which initially struggled to translate natural language queries into their proprietary FactSet Query Language (FQL) using a general-purpose LLM. This attempt yielded a mere "59% accuracy" and "15s latency." By implementing a multi-agent system on Databricks, powered by Claude, FactSet achieved a dramatic improvement: "85% accuracy" with just "6s latency." This significant leap demonstrates how decomposing complex tasks and leveraging specialized agents can transform raw LLM capabilities into robust, accurate, and efficient solutions for critical business functions.
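Wiley did not walk through FactSet's pipeline in detail, but the decomposition pattern is straightforward to illustrate. The minimal sketch below, written against the Anthropic Python SDK rather than Databricks' model serving layer, splits the work across a planner, a query writer, and a reviewer agent; the prompts, the `ask` helper, and the example question are illustrative assumptions, not FactSet's implementation.

```python
# Illustrative multi-agent sketch (not FactSet's actual system): a planner
# decomposes the request, a query writer drafts the domain-specific query,
# and a reviewer checks the draft before it is returned.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment
MODEL = "claude-sonnet-4-20250514"

def ask(system: str, user: str) -> str:
    """Single Claude call with a role-specific system prompt."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=system,
        messages=[{"role": "user", "content": user}],
    )
    return response.content[0].text

def translate_to_query(question: str) -> str:
    # Agent 1: break the business question into discrete data requirements.
    plan = ask(
        "You are a planner. List the data fields, filters, and time ranges "
        "needed to answer the user's question, one per line.",
        question,
    )
    # Agent 2: draft the domain-specific query from the plan.
    draft = ask(
        "You are a query writer. Given a plan of required fields and filters, "
        "write a single query in the target query language. Return only the query.",
        f"Question: {question}\nPlan:\n{plan}",
    )
    # Agent 3: review the draft against the original question and return a fix.
    return ask(
        "You are a reviewer. Check the query against the question and plan; "
        "return a corrected query only.",
        f"Question: {question}\nPlan:\n{plan}\nDraft query:\n{draft}",
    )

if __name__ == "__main__":
    print(translate_to_query("Average quarterly revenue for US tech firms since 2020"))
```

The design point is the one Wiley made: each agent handles a narrower, checkable task, which is what moves a system from raw LLM output toward the accuracy and latency figures FactSet reported.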
For AI to be trustworthy in "areas of financial and reputational risk," two pillars are paramount: governance and evaluation. Databricks' platform offers native governance across all data assets, AI models, and tools, enforcing access controls, setting rate limits, and tracking lineage. Critically, it also provides robust agent evaluation capabilities, allowing enterprises to measure the quality, cost, and latency of their AI applications, including Retrieval Augmented Generation (RAG) applications and agent chains. This comprehensive evaluation framework, which includes human and machine evaluators, is crucial for identifying quality issues and determining root causes, ensuring AI systems meet stringent enterprise standards.
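The specific evaluation tooling was not shown in the talk, but the loop it implements, running an agent over a labeled set and scoring answers while tracking latency, can be sketched in a few lines. The `evaluate_agent` helper and exact-match scoring below are illustrative stand-ins for the human and machine evaluators described above, not Databricks' evaluation API.

```python
# Illustrative evaluation harness: run an agent over a labeled set and report
# accuracy and average latency, two of the metrics discussed above.
import time
from statistics import mean

def evaluate_agent(agent, examples):
    """agent: callable(question) -> answer; examples: list of (question, expected)."""
    correct, latencies = 0, []
    for question, expected in examples:
        start = time.perf_counter()
        answer = agent(question)
        latencies.append(time.perf_counter() - start)
        # Exact-match scoring stands in for human or LLM-judge evaluators.
        correct += int(answer.strip() == expected.strip())
    return {
        "accuracy": correct / len(examples),
        "avg_latency_s": mean(latencies),
    }

# Usage: metrics = evaluate_agent(translate_to_query, labeled_examples)
```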
Databricks' integrated platform simplifies the creation of high-quality AI applications and agents directly on a company's data. The native availability of Claude across Azure, AWS, and GCP, coupled with Databricks' tooling, allows for seamless integration and deployment. This convergence means businesses can draw on their existing Databricks commitments, keep data secure within their own perimeter, and tap into high-scale batch inference with full governance and lineage (see the sketch below). The success of internal projects like ARIA, which uses Claude to automate industry analyst responses, delivering "100s of hours saved per response" alongside quality improvements, further validates this approach. Similarly, Block's agent-based LLM project, Goose, built with Claude on Databricks, has driven a "40-50% weekly user adoption increase" and "8-10 hours saved per week" for employees working with complex data. Databricks believes the deepest value lies in tightly integrating the AI and data layers, offering enterprises a path to confidently deploy powerful, domain-specific AI.
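On the batch-inference point, a rough sketch of the pattern might look like the following, assuming a Databricks notebook (where `spark` is predefined), a provisioned Claude serving endpoint, and a Unity Catalog table; the endpoint and table names are hypothetical.

```python
# Hypothetical batch-inference sketch: apply a Claude serving endpoint to every
# row of a governed table and write the results back to Unity Catalog, so
# lineage and access controls apply end to end.
batch_results = spark.sql("""
    SELECT
        ticket_id,
        ai_query(
            'claude-endpoint',
            CONCAT('Summarize this analyst question in one sentence: ', question_text)
        ) AS summary
    FROM main.support.analyst_questions
""")
batch_results.write.mode("overwrite").saveAsTable("main.support.analyst_summaries")
```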

