The Pragmatic Pivot: Why Anthropic Champions Skills Over Agents

Dec 6, 2025 at 10:15 PM

The AI industry is at a critical juncture, grappling with the promise of autonomous agents versus the practicalities of deployment. Barry Zhang, Head of Product, and Mahesh Murag, Engineering Lead, both from Anthropic, cut through the hype with a compelling argument: the future of reliable AI lies not in building monolithic agents, but in developing modular, robust "skills." This perspective challenges the prevailing narrative, offering a more grounded, engineering-centric approach to AI product development that prioritizes reliability and composability over unfulfilled autonomy.

During a recent panel discussion hosted by Cognition Labs, Zhang and Murag delineated their pragmatic philosophy, dissecting the inherent limitations of current large language models when tasked with complex, multi-step operations. Their core insight is that while LLMs excel at reasoning and planning, their execution reliability diminishes significantly over extended chains of action, leading to unpredictable failures. This isn't merely a technical nuance; it calls for a fundamental re-evaluation of how founders and VCs approach investing in and building AI solutions.

Mahesh Murag succinctly articulated this paradigm shift, stating, "We've been talking about agents for a long time, but I think what we're actually building are more like skills." This distinction is crucial. An agent, in the common conception, is an autonomous entity capable of performing complex, multi-faceted tasks end-to-end. A skill, conversely, is a discrete, well-defined capability designed to perform a specific function reliably. Think of an LLM not as a self-sufficient project manager, but as a brilliant coordinator who needs a team of highly specialized, dependable contractors (skills) to execute individual tasks.

The fundamental challenge, as Barry Zhang elaborated, is that "LLMs are very good at reasoning, but they're not necessarily good at execution reliability over long chains of actions." This disconnect between reasoning prowess and execution consistency is the Achilles' heel of many ambitious agentic projects. An LLM might brilliantly deduce a multi-step plan, but its ability to reliably carry out each step without deviation or hallucination remains a significant hurdle. This often results in a "hallucination of execution," where the model believes it has successfully completed a task, only for it to fall short in reality.
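One common way to guard against a "hallucination of execution" is to check the outcome of each step independently rather than trusting the executor's own report. The sketch below is a minimal illustration of that idea, not something described by the panelists; `run_verified`, `StepResult`, and the example actions are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    claimed_success: bool  # what the executor (e.g. a model-driven step) reports
    verified: bool         # what an independent check actually observed


def run_verified(
    action: Callable[[dict], bool],
    check: Callable[[dict], bool],
    state: dict,
) -> StepResult:
    """Run one step, then confirm its effect with a separate check."""
    claimed = action(state)
    return StepResult(claimed_success=claimed, verified=check(state))


# Hypothetical step that reports success without doing the work.
def flaky_write(state: dict) -> bool:
    return True


# Hypothetical step that really writes the report.
def real_write(state: dict) -> bool:
    state["report"] = "done"
    return True


# Independent postcondition: the report must exist in the state.
def report_exists(state: dict) -> bool:
    return "report" in state


if __name__ == "__main__":
    print(run_verified(flaky_write, report_exists, {}))
    print(run_verified(real_write, report_exists, {}))
```

A mismatch between `claimed_success` and `verified` is the signal to retry, escalate, or stop before the next step in the chain builds on a result that does not exist.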

This leads to a core insight for product builders: focus on developing capabilities that are inherently robust. Murag emphasized the key attributes of a valuable skill: "You want a skill to be robust. You want it to be composable. You want it to be debuggable." Robustness ensures the skill performs its function consistently, even with minor variations in input. Composability allows for skills to be combined and recombined in flexible ways, enabling dynamic adaptation to new problems without requiring a complete system overhaul. Debuggability is paramount for identifying and rectifying failures efficiently, a stark contrast to the opaque, difficult-to-diagnose failures of complex agentic systems.
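The three attributes can be made concrete with a small sketch. It assumes a skill is just a named function, that composition means running skills in sequence, and that debuggability means recording which skill ran and what it produced. `Skill`, `Trace`, `compose`, and the two example skills are hypothetical names for illustration, not Anthropic's API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class Skill:
    name: str
    run: Callable[[Any], Any]


@dataclass
class Trace:
    # One (skill name, status, detail) tuple per executed skill.
    entries: list = field(default_factory=list)


def compose(skills: list) -> Callable[[Any, Trace], Any]:
    """Chain skills so each one's output feeds the next."""
    def pipeline(value: Any, trace: Trace) -> Any:
        for skill in skills:
            try:
                value = skill.run(value)
                trace.entries.append((skill.name, "ok", value))
            except Exception as exc:
                # Record which skill failed, then re-raise for the caller.
                trace.entries.append((skill.name, "failed", repr(exc)))
                raise
        return value
    return pipeline


# Two small, single-purpose example skills.
normalize = Skill("normalize", lambda s: s.strip().lower())
count_words = Skill("count_words", lambda s: len(s.split()))

if __name__ == "__main__":
    trace = Trace()
    result = compose([normalize, count_words])("  Hello World  ", trace)
    print(result)
    for entry in trace.entries:
        print(entry)
```

Because each skill is isolated behind the same interface, the same `normalize` skill can be reused in other pipelines, and a failure shows up in the trace under the name of the skill that caused it.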

For startup founders, this perspective offers a clearer path to delivering tangible value. Instead of aiming for a grand, general-purpose agent that may prove brittle and difficult to scale, teams can focus on shipping highly reliable, atomic skills that solve specific problems. This incremental approach allows for faster iteration, clearer product-market fit validation, and a more manageable development lifecycle. Barry Zhang highlighted this practical advantage: "The product surface area for a skill is much more constrained, which makes it easier to build and deploy." This focus on constrained, well-defined problems reduces complexity and accelerates time to market, a critical factor in the competitive AI landscape.


The implications for safety and control are also profound. By modularizing AI systems into distinct skills, developers gain finer-grained control and transparency. Failures can be isolated to specific components, making systems easier to audit, easier to monitor, and, crucially, easier to intervene in when necessary. This human-in-the-loop orchestration of skills, rather than blind delegation to an autonomous agent, offers a more responsible and controllable pathway for deploying AI in sensitive or high-stakes environments. This pragmatic approach acknowledges the current limitations of LLMs while strategically leveraging their strengths, paving the way for more dependable and valuable AI applications.
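As a rough illustration of human-in-the-loop orchestration, the sketch below runs a list of steps, asks an approval callback before any step marked sensitive, and records each failure against the step that caused it. The `orchestrate` function, the step tuple layout, and the example steps are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

# Each step: (name, function to run, whether a human must approve it first).
Step = Tuple[str, Callable[[], None], bool]


def orchestrate(steps: List[Step], approve: Callable[[str], bool]) -> list:
    """Run steps in order, gating sensitive ones on human approval."""
    log = []
    for name, fn, sensitive in steps:
        if sensitive and not approve(name):
            log.append((name, "skipped: not approved"))
            continue
        try:
            fn()
            log.append((name, "ok"))
        except Exception as exc:
            # The failure is attributed to one named step, not the whole run.
            log.append((name, f"failed: {exc}"))
    return log


def broken_step() -> None:
    raise RuntimeError("upstream service unavailable")


if __name__ == "__main__":
    steps = [
        ("draft_email", lambda: None, False),
        ("send_email", lambda: None, True),  # sensitive: needs approval
        ("sync_crm", broken_step, False),
    ]
    # A real approver would prompt a person; this one denies everything.
    for entry in orchestrate(steps, approve=lambda name: False):
        print(entry)
```

This sketch records a failure and moves on to the next step; in a high-stakes setting it may be safer to halt the run at the first failure or denial instead.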

© 2025 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.
