The proliferation of agent orchestration frameworks like LangGraph, CrewAI, and Google ADK has centered on an external orchestrator managing LLM state and routing. However, a controlled comparison detailed on arXiv reveals a simpler, more effective paradigm for procedural tasks: embedding the entire procedure directly into the system prompt, allowing the LLM to self-orchestrate.
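As a rough illustration of what "embedding the entire procedure directly into the system prompt" could look like, the minimal sketch below serializes a small procedure graph into one prompt string. The `Step` class, the `render_system_prompt` helper, the prompt wording, and the three-node claims flow are all hypothetical, invented here for illustration; none of them come from the paper or from any framework's API.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One node of a hypothetical procedure graph (not from the paper)."""
    node_id: str
    instruction: str
    # Maps a plain-text condition to the id of the next node.
    transitions: dict[str, str] = field(default_factory=dict)


def render_system_prompt(task_name: str, steps: list[Step]) -> str:
    """Serialize the whole procedure into a single system prompt string."""
    lines = [
        f"You are an agent handling: {task_name}.",
        "Follow the procedure below. Track the current step yourself",
        "and move to the next step only when its condition is met.",
        "",
        "PROCEDURE:",
    ]
    for step in steps:
        lines.append(f"[{step.node_id}] {step.instruction}")
        for condition, target in step.transitions.items():
            lines.append(f"  - if {condition}: go to [{target}]")
        if not step.transitions:
            # A node with no outgoing edges ends the procedure.
            lines.append("  - end of procedure")
    return "\n".join(lines)


# Hypothetical three-node claims flow, invented for illustration only.
steps = [
    Step("verify_policy", "Ask for the policy number and confirm it is active.",
         {"policy is active": "collect_details", "policy is inactive": "close"}),
    Step("collect_details", "Collect the incident date and a short description.",
         {"details are complete": "close"}),
    Step("close", "Summarize the outcome and end the conversation."),
]

system_prompt = render_system_prompt("insurance claims intake", steps)
print(system_prompt)
```

The resulting string would be passed as the system message to whichever LLM client is in use. No external state machine tracks the current node; the model is expected to do that itself from the conversation history.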
Self-Orchestration Dominates Procedural Tasks
Across travel booking, Zoom technical support, and insurance claims processing, the in-context approach consistently outperformed external orchestration. On a 55-node insurance claims task, the in-context method scored between 4.53 and 5.00 on a 5-point scale, while a LangGraph orchestrator using the same LLM scored between 4.17 and 4.84. The two ranges overlap, but the in-context approach leads at both the low and the high end.