Subagents, lightweight helper agents that isolate context and break down complex problems, have become indispensable for efficient AI-driven development. Yet, as Brian John, Principal Full Stack Engineer at BetterUp, explained in his presentation "Hacking Subagents Into Codex CLI," even advanced tools like OpenAI's Codex CLI ship without this critical functionality, leading developers to devise ingenious workarounds. John's work provides a compelling case study in extending the utility of powerful AI tools through clever architectural design and persistent problem-solving.
John described his effort to integrate subagents into Codex CLI, a significant enhancement for developers accustomed to the robust context management features of other AI coding assistants. At BetterUp, where AI enablement for R&D is a core focus, his motivation stemmed from a practical need: helping his team members ship faster and with higher quality. His deep experience with AI, built over more than eight years at BetterUp, positioned him to tackle this challenge directly.
A primary driver for John was the desire to circumvent vendor lock-in. Having extensively used Claude Code, a tool he praises for its bells and whistles and great models, he expressed a common developer apprehension: "I don't want to be locked in to one tool, and I really don't want to be locked in to one model family." This sentiment resonates deeply within the tech community, where flexibility and interoperability are paramount. Codex CLI, with its promising models, presented an opportunity to diversify, but its lack of native subagent support was a critical impediment.
The fundamental advantage of subagents lies in their ability to manage context effectively. A main agent can delegate a specific problem to a subagent, which executes the task in its own context window and returns only a concise answer to the main agent, as the sketch below illustrates. This modular approach prevents the main context window from becoming bloated, improving efficiency and reducing token costs. It is a workflow John found transformative, particularly when navigating large codebases, and he credited Dexter Horthy's talk "Advanced Context Engineering for Agents" as a pivotal influence.
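To make the context-isolation idea concrete, here is a minimal, self-contained Python sketch; the function names, paths, and step counts are invented for illustration and are not from John's implementation. The subagent may churn through a long internal transcript, but the parent's context only ever grows by the size of the short answer.

```python
# Toy illustration of delegation: the subagent's long working transcript
# never enters the parent's context window; only the short answer does.

def subagent(task: str) -> str:
    # Stand-in for a child agent session with its own context window.
    transcript = [f"step {i}: exploring the codebase for '{task}'"
                  for i in range(100)]  # tokens spent here stay here
    return f"'{task}': found in config/rate_limit.rb ({len(transcript)} steps hidden)"

parent_context: list[str] = []  # stand-in for the parent's token window
parent_context.append(subagent("locate the rate-limiting config"))
print(parent_context)           # one short line, not 100 steps
```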
John's design for integrating subagents into Codex CLI is conceptually simple. A parent Codex session launches a wrapper script, `agent-exec`. This script reads the subagent's definition, constructs a prompt, and invokes a child Codex CLI process. The child acts as the subagent, operating within its own sandbox, responding to the prompt, and writing its result to a temporary file. The `agent-exec` script then reads that file, prints the result to standard output, and thereby returns it to the parent Codex session.
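The talk summary does not include the script itself, so the following Python sketch is a re-imagining of the flow John described, not his code; the temp-file paths, the agent-definition location, and the use of Codex CLI's non-interactive `codex exec` mode are assumptions.

```python
#!/usr/bin/env python3
"""Hypothetical `agent-exec`-style wrapper, sketched from the described flow."""
import os
import subprocess
import tempfile
from pathlib import Path

def main() -> None:
    # 1. Read the requested agent and query (written beforehand by the parent).
    agent_name = Path("/tmp/agent-name.txt").read_text().strip()
    query = Path("/tmp/agent-query.txt").read_text().strip()

    # 2. Load the subagent's definition and build the child's prompt,
    #    telling it where to write its final answer.
    definition = Path(f"agents/{agent_name}.md").read_text()
    fd, out_path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    prompt = (f"{definition}\n\nTask: {query}\n\n"
              f"Write your final answer to {out_path} and nothing else.")

    # 3. Run a child Codex CLI session as the subagent, in its own sandbox.
    subprocess.run(["codex", "exec", prompt], check=True)

    # 4. Relay the subagent's answer to the parent session via stdout.
    print(Path(out_path).read_text())

if __name__ == "__main__":
    main()
```

Routing the answer through a file rather than scraping the child's standard output presumably keeps the child session's own transcript noise out of what the parent reads back.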
However, the implementation proved more challenging than anticipated. The stringent security measures of Codex's sandbox posed significant hurdles. John candidly admitted, "Codex's sandbox really seems to not want to let you do this... to get it to work with the normal set of permissions actually was really, really hard." Overcoming the permission issues meant meticulously configuring access for both the parent and child processes, ensuring the child could read the parent's OpenAI credentials and write to the temporary output file, and disabling the "Rollout recorder" feature that otherwise blocked the necessary file system access.
Security considerations were paramount. John referenced Meta's "Agents Rule of 2," which holds that an agent session should combine no more than two of three risky capabilities: processing untrustworthy inputs, accessing sensitive systems or private data, and changing state or communicating externally. His solution accesses proprietary code (sensitive data) and communicates with OpenAI's API (external communication), but avoids processing untrustworthy inputs, keeping it within the rule. Still, as John rightly cautioned, "Lower risk does not mean no risk!" Integrating powerful AI agents with system access demands ongoing vigilance, and developers must make their own determinations about acceptable risk.
The practical implementation centers on an `AGENTS.md` file that defines the available subagents, their "reasoning effort" (e.g., light, medium, high), and their specific prompts. The file also instructs Codex on when to use subagents, either when explicitly requested by the user or proactively when it would be helpful, and, crucially, how to invoke them: by writing the agent name and query to temporary files and then executing the `agent-exec` wrapper script. This file-based handoff means Codex's permission system requires only a one-time approval for the `agent-exec` command rather than repeated approvals for each subagent invocation, avoiding the tedious user interaction that would otherwise make the system impractical. A hypothetical excerpt following this structure appears below.
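John's actual file was not published with the talk summary, but an `AGENTS.md` excerpt following the structure he described might look like this; the agent names, effort labels, and paths are invented for illustration.

```markdown
## Subagents

Use a subagent when the user explicitly asks for one, or proactively when
delegating would keep this session's context small.

To invoke a subagent:
1. Write the agent name to /tmp/agent-name.txt
2. Write the query to /tmp/agent-query.txt
3. Run the `agent-exec` script and read its standard output as the answer.

### code-searcher (reasoning effort: light)
Locates files, symbols, and usages in the repository; returns only paths
and one-line summaries.

### refactor-planner (reasoning effort: high)
Produces a step-by-step refactoring plan for a named module; returns the
plan only, no code.
```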
One notable trade-off in this serial execution model is speed. Unlike Claude Code, which can run processes asynchronously, John's solution executes everything sequentially. This means that while the subagent functionality is achieved, the overall process is slower. John suggests this might be an intentional design choice by OpenAI, positioning Codex as a more "hands-off, unattended type of tool" compared to the iterative nature of Claude Code. Despite the speed difference, John finds this acceptable for his workflow, highlighting that the gains in context management and flexibility outweigh the performance hit for specific use cases.
John's innovative hack to bring subagent capabilities to Codex CLI exemplifies the resourceful spirit of software engineering in the age of AI. By carefully navigating technical constraints and security considerations, he has provided a valuable blueprint for extending the functionality of powerful, yet sometimes limited, AI tools. This work enhances developer flexibility and reinforces the critical role of context management in complex AI tasks.