Databricks Taps NVIDIA Vera for AI Agents

Databricks and NVIDIA are expanding their collaboration, aiming to streamline the entire AI lifecycle, from training and inference to the burgeoning field of agentic AI. The partnership integrates NVIDIA's accelerated computing, including its new NVIDIA Vera CPU, directly into the Databricks platform.

The expanded alliance focuses on delivering an end-to-end AI solution. It promises to accelerate model training, inference, and the development of agentic AI applications built on governed enterprise data. Databricks is also bringing serverless NVIDIA GPUs to its Free Edition, broadening access for developers, students, and startups.

Training and Fine-Tuning

Databricks AI Runtime (AIR) now directly integrates NVIDIA GPU acceleration. This allows data and AI teams to train and fine-tune models on governed data without managing separate GPU infrastructure. For multi-node distributed training, AIR supports NVIDIA Hopper GPUs connected by NVIDIA Quantum InfiniBand, an interconnect intended to remove communication bottlenecks between nodes.

The platform is also being prepped for the upcoming NVIDIA Blackwell architecture. Furthermore, Databricks will soon support NVIDIA NGC containers and custom CUDA environments for native execution within the platform.

Inference: NVIDIA Acceleration in Databricks Model Serving

Databricks Model Serving is being enhanced with NVIDIA hardware and software for low-latency, high-throughput inference at scale. This includes support for leading inference-optimized GPUs and the Triton Inference Server. Customers can serve models trained on NVIDIA hardware directly through managed Databricks infrastructure.

Agentic Infrastructure: NVIDIA Vera CPU

The rise of autonomous agents introduces new infrastructure demands. While GPUs handle inference, tasks like tool calling, analytics, and multi-step reasoning often bottleneck on traditional CPUs. NVIDIA Vera is designed to address this, targeting agentic workloads, reinforcement learning, and CPU-based data analytics.

Vera features NVIDIA-designed, Arm-compatible cores promising up to 3x faster SQL queries and 80% faster agentic performance. Its massive memory bandwidth and fast core-to-core communication aim to provide predictable performance for complex agentic operations. The vision is an end-to-end NVIDIA-accelerated stack on Databricks, with models on GPUs and agent orchestration on Vera CPUs.
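To make the split between GPU inference and CPU orchestration concrete, here is a minimal sketch of an agent loop in plain Python. It does not use any Databricks or NVIDIA API. The names `fake_model`, `TOOLS`, and `run_agent` are hypothetical, and the model is a scripted stand-in for a GPU-served endpoint. The point is only to show where the work lands: the model call is the inference step, while the loop, the tool dispatch, and the history bookkeeping all run on the CPU.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One model decision: either request a tool call or give a final answer."""
    tool: str = ""
    argument: str = ""
    final_answer: str = ""


def fake_model(history: List[str]) -> Step:
    """Hypothetical stand-in for a GPU-served model endpoint.

    Scripted behavior: request one tool call, then answer using its result.
    """
    if not any(line.startswith("tool:") for line in history):
        return Step(tool="row_count", argument="orders")
    return Step(final_answer=f"Done after seeing {history[-1]}")


# Hypothetical tools; in a real agent these might be SQL queries or API calls.
TOOLS: Dict[str, Callable[[str], str]] = {
    "row_count": lambda table: f"{table} has 3 rows",
}


def run_agent(question: str, max_steps: int = 5) -> str:
    """Alternate between model calls and tool calls until an answer appears."""
    history = [f"user: {question}"]
    for _ in range(max_steps):
        step = fake_model(history)                 # inference (GPU side)
        if step.final_answer:
            return step.final_answer
        result = TOOLS[step.tool](step.argument)   # tool call (CPU side)
        history.append(f"tool: {result}")
    return "step limit reached"
```

Calling `run_agent("How many orders?")` makes one tool call and then returns the scripted answer. In a production agent the tool step is typically where analytics queries and other CPU-bound work accumulate, which is the bottleneck the article says Vera targets.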

Developer Experience

The NVIDIA Agent Toolkit is now deployable on Databricks Apps. This open-source platform enables building, customizing, and deploying agentic AI workflows directly within the Databricks environment. Capabilities include guardrails, tool use, retrieval-augmented generation, and multi-step reasoning.

Databricks Apps act as the hosting layer, offering managed applications with built-in authentication, networking, and governance via Unity Catalog. Developers can access governed data and call models through the platform without leaving the environment.

Genie Code is being introduced to simplify GPU workload management. This agent-first approach allows conversational debugging of GPU issues and performance optimization, drawing on NVIDIA-specific knowledge such as CUDA and NCCL. Genie Code integrates with Databricks Notebooks, MLflow, and Model Serving for enhanced monitoring and debugging.

Industry AI

Databricks and NVIDIA are also integrating NVIDIA's industry-specific AI frameworks. This allows customers to accelerate use cases in sectors like healthcare, life sciences, supply chain, robotics, digital twins, and document intelligence, all running on governed Databricks data.

© 2026 StartupHub.ai. All rights reserved.