OpenAI is shifting from models that excel at single tasks to agents capable of complex workflows. By giving models access to a computer environment, the Responses API now supports a far wider range of use cases, from running services to generating structured reports. This move addresses practical challenges such as managing intermediate files, handling large datasets, and securing network access. According to OpenAI, the Responses API now equips developers with the components needed for reliable real-world task execution.
The Shell Tool and Agent Loop
At the core of this new capability is the shell tool, which allows models to interact with a computer via the command line. This significantly expands an agent's reach, enabling it to perform tasks using familiar Unix utilities like grep and curl. Unlike earlier code interpreters limited to Python, the shell tool supports multiple programming languages and can run servers, paving the way for more sophisticated agentic tasks. It forms the backbone of the agent loop: the model proposes actions, the platform executes them, and the results feed back into the next step.
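The propose-execute-feed-back cycle can be sketched in a few lines. This is a minimal illustration, not OpenAI's implementation: `fake_model` is a hypothetical stand-in for the model, and commands run locally via subprocess rather than in a managed container.

```python
import subprocess

def run_shell(command: str) -> str:
    """Execute a proposed shell command and capture its output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=30
    )
    return result.stdout + result.stderr

def fake_model(history: list) -> dict:
    """Hypothetical stand-in for the model: proposes one shell
    command, then finishes once it has seen the output."""
    if not history:
        return {"action": "shell", "command": "echo hello from the agent"}
    return {"action": "done", "summary": history[-1]["output"].strip()}

def agent_loop() -> str:
    """Propose -> execute -> feed results back, until done."""
    history = []
    while True:
        step = fake_model(history)
        if step["action"] == "done":
            return step["summary"]
        output = run_shell(step["command"])
        history.append({"command": step["command"], "output": output})

print(agent_loop())
```

In the real platform, the execution side runs in a sandboxed container and the model side is a hosted API call, but the control flow is the same shape.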
The Responses API orchestrates this loop by assembling context, including the user prompt and tool instructions. When a model trained to propose shell commands (such as GPT-5.2 and later) emits one, the API forwards it to a container runtime for execution. The output streams back to the model, informing its next action, and the loop repeats until the task is complete.
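From the developer's side, enabling this loop amounts to declaring the shell tool in the request. The payload below is a hypothetical sketch: the tool type string `"shell"` and the exact input schema are assumptions based on the description above, so check the official Responses API reference for the precise field names.

```python
import json

# Hypothetical Responses API request enabling the shell tool.
# The model id comes from the text above; the "shell" tool type
# and input shape are illustrative assumptions, not confirmed values.
payload = {
    "model": "gpt-5.2",
    "tools": [{"type": "shell"}],  # opt in to command-line execution
    "input": [
        {
            "role": "user",
            "content": "Count the lines in every .py file in this repo.",
        }
    ],
}

# The platform handles the rest: the model proposes commands, the
# container runtime executes them, and outputs stream back in.
print(json.dumps(payload, indent=2))
```

The key point is that the developer declares capabilities once; the iterative command loop itself happens inside the platform rather than in application code.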