Sunil Pai on AI Agents & the Future of Software

Cloudflare's Sunil Pai discusses the future of AI agents, moving from tool-calling to code generation for more efficient and powerful interactions.

Apr 19 at 6:01 PM · 4 min read

Image: Sunil Pai, Founder & Principal Systems Engineer at Cloudflare, speaking on stage at AI Engineer Europe about AI agents and code generation. Credit: AI Engineer

Sunil Pai, Founder and Principal Systems Engineer at Cloudflare, recently took the stage at AI Engineer Europe to discuss the evolving capabilities of AI agents. His talk, titled "Code Mode: Let the code do the talking," explored the transition from traditional tool-calling mechanisms to a more sophisticated model where AI directly generates and executes code.

Understanding "Code Mode"

Pai began by highlighting the challenges of scaling AI agent interactions. "Tool calling gets weird at scale," he stated, explaining that while simple tools work well for short-run tasks, managing hundreds or thousands of tools and their associated API calls becomes inefficient and prone to errors. This complexity leads to slow response times and a breakdown in the model's ability to compose actions effectively.

The solution, as proposed by Pai, is to shift from explicit tool definitions to a "code mode." In this paradigm, the AI agent generates code—typically in a language like JavaScript or Python—that directly interacts with the underlying systems or APIs. This approach offers several advantages:

Type safety: Code provides a more robust and type-safe interface compared to JSON definitions.
Execution: The model can execute the generated code directly, streamlining the process.
Flexibility: It allows for more complex logic, such as loops, conditional statements, and error handling, which are difficult to express with simple tool calls.
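
The flexibility point is easier to see with a small sketch. Everything in it is hypothetical: FakeApi and its methods list_zones, zone_status, and purge_cache are invented stand-ins, not Cloudflare's API. It shows the kind of script an agent might emit in code mode, where a loop, a conditional, and error handling all run inside one generated program instead of one model round-trip per tool call.

```python
from dataclasses import dataclass, field


@dataclass
class FakeApi:
    """Hypothetical in-memory stand-in for a real API client."""
    zones: dict[str, str] = field(default_factory=lambda: {
        "example.com": "active",
        "staging.example.com": "paused",
        "broken.example.com": "active",
    })
    calls: int = 0  # counts every API call the script makes

    def list_zones(self) -> list[str]:
        self.calls += 1
        return sorted(self.zones)

    def zone_status(self, zone: str) -> str:
        self.calls += 1
        return self.zones[zone]

    def purge_cache(self, zone: str) -> None:
        self.calls += 1
        if zone.startswith("broken."):
            raise RuntimeError(f"purge failed for {zone}")


def agent_generated_script(api: FakeApi) -> dict[str, list[str]]:
    """The kind of code a model might write in one shot in code mode."""
    report: dict[str, list[str]] = {"purged": [], "skipped": [], "failed": []}
    for zone in api.list_zones():                # loop
        if api.zone_status(zone) != "active":    # conditional
            report["skipped"].append(zone)
            continue
        try:                                     # error handling
            api.purge_cache(zone)
            report["purged"].append(zone)
        except RuntimeError:
            report["failed"].append(zone)
    return report


api = FakeApi()
report = agent_generated_script(api)
print(report)
print("API calls made by one generated script:", api.calls)
```

With plain tool calling, each of those six API calls would typically be a separate tool invocation with the model deciding the next step in between. Here the model decides once and the code does the rest.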

From 2,594 Endpoints to Two Tools

Pai shared a compelling example from Cloudflare's own experience. Rather than exposing each of the API's 2,594 endpoints as its own tool, the team exposed just two core tools: a search tool and an execute tool. The agent writes code against these two capabilities instead of choosing among thousands of tool definitions. The reported result was a drop in token usage to approximately 1,000 tokens, a 99.9% reduction, which implies a baseline on the order of a million tokens.

"We were able to shrink that entire API surface and make it really fast," Pai explained. This transition allows the AI to interact with the system more natively, leveraging the full power of programming languages to express complex intents and actions.
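
A minimal sketch of what a two-tool surface could look like follows. It is an illustration under stated assumptions, not Cloudflare's implementation: the catalog entries, the call dispatcher, and the exact shapes of search and execute are all invented here. The idea it shows is that the model never sees the full endpoint list; it searches for what it needs, then submits code that uses it.

```python
# Hypothetical endpoint catalog; a real one would hold thousands of entries.
CATALOG = [
    {"name": "dns.records.list", "summary": "List DNS records for a zone"},
    {"name": "dns.records.create", "summary": "Create a DNS record in a zone"},
    {"name": "cache.purge", "summary": "Purge cached files for a zone"},
]


def call(name: str, **params):
    """Hypothetical dispatcher standing in for real HTTP requests."""
    if name not in {entry["name"] for entry in CATALOG}:
        raise KeyError(f"unknown endpoint: {name}")
    return {"endpoint": name, "params": params, "ok": True}


def search(query: str) -> list[dict]:
    """Tool 1: return only the catalog entries that match every query word."""
    words = query.lower().split()
    return [
        entry for entry in CATALOG
        if all(w in (entry["name"] + " " + entry["summary"]).lower() for w in words)
    ]


def execute(code: str):
    """Tool 2: run model-written code that may use `call`; return `result`.

    NOTE: plain exec() is NOT a security boundary. This only shows the
    shape of the tool, not how to isolate untrusted code.
    """
    scope = {"call": call}
    exec(code, scope)
    return scope.get("result")


# The model first discovers what exists, then writes code against it.
hits = search("dns list")
generated = f'result = call("{hits[0]["name"]}", zone="example.com")'
print(hits)
print(execute(generated))
```

Only the two tool descriptions and the handful of search hits need to enter the model's context, which is where the token savings in this pattern would come from.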

The Harness: A New Architecture for AI Interaction

The core of Pai's argument centers on a new architectural pattern he calls "the harness." This harness acts as a secure, scoped runtime environment that allows AI agents to execute code safely. The harness provides explicit capabilities to the agent, ensuring that it can only perform actions within its defined permissions. This contrasts with traditional containerization, which is often more cumbersome and less efficient for the rapid execution required by AI agents.

Pai elaborated on the benefits of this approach:

Capability-based security: Instead of broad permissions, the harness grants specific, granular capabilities, enhancing security.
Fast startup: Unlike traditional containers or VMs, these sandboxed environments can start almost instantaneously.
Full observability: The harness provides deep insights into the agent's execution, including detailed logs and audit trails.
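
The capability and observability ideas can be sketched in a few lines. This is a toy model under loud assumptions: the Harness class, read_record, and delete_record are hypothetical names, and Python's exec() with emptied builtins is not real isolation (it can be escaped), so a production harness would need an actual sandboxed runtime. The sketch only shows the contract: generated code sees the capabilities it was granted and nothing else, and every use of a capability is logged.

```python
from typing import Any, Callable


class Harness:
    """Hypothetical harness: runs code with only explicitly granted capabilities."""

    def __init__(self, capabilities: dict[str, Callable[..., Any]]):
        self.audit_log: list[tuple[str, tuple, dict]] = []
        # Wrap each capability so every invocation is recorded.
        self._granted = {
            name: self._audited(name, fn) for name, fn in capabilities.items()
        }

    def _audited(self, name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args, **kwargs):
            self.audit_log.append((name, args, kwargs))
            return fn(*args, **kwargs)
        return wrapper

    def run(self, code: str) -> Any:
        # Only granted names are visible to the generated code.
        # NOTE: illustration only; this is NOT a secure sandbox.
        scope = {"__builtins__": {}, **self._granted}
        exec(code, scope)
        return scope.get("result")


def read_record(name: str) -> str:
    return f"A record for {name}"


def delete_record(name: str) -> str:
    return f"deleted {name}"


# Grant read access only; delete_record exists but is never handed over.
harness = Harness({"read_record": read_record})
print(harness.run('result = read_record("example.com")'))

try:
    harness.run('result = delete_record("example.com")')
except NameError as err:
    print("denied:", err)

print(harness.audit_log)
```

The denied call fails because the name was never placed in scope, not because a permission check rejected it. That is the essence of a capability model: authority is whatever references the code has been given.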

Breaking Down the Programmer/User Divide

Pai concluded by emphasizing how this shift is breaking down the traditional divide between programmers and users. "Programmers got code. Everyone else got buttons," he remarked, highlighting the historical separation. However, with AI agents capable of generating and executing code, this distinction is blurring. Every user, in essence, can now have a personalized program tailored to their specific needs and workflows.

"Your next billion users will never touch your UI," Pai stated, suggesting that future interactions will be driven by natural language prompts that translate into complex, generated code. The focus is shifting from building static interfaces to creating dynamic, user-specific software that adapts and evolves with the user's needs. The key takeaway is that the runtime contract—the underlying mechanism of interaction—remains consistent, regardless of the language the agent uses to write its code.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.
