OpenAI is shifting from models that excel at single tasks to agents capable of complex workflows. By giving models access to a computer environment, the Responses API now supports a far wider range of use cases, from running services to generating structured reports. This move addresses practical challenges such as managing intermediate files, handling large datasets, and securing network access. According to OpenAI, the Responses API now equips developers with the components needed for reliable real-world task execution.
The Shell Tool and Agent Loop
At the core of this new capability is the shell tool, which allows models to interact with a computer via the command line. This significantly expands an agent's reach, enabling it to perform tasks using familiar Unix utilities like grep and curl. Unlike earlier code interpreters limited to Python, the shell tool supports multiple programming languages and can run servers, paving the way for more sophisticated agentic tasks. It forms the backbone of the agent loop: the model proposes actions, the platform executes them, and the results feed back into the next step.
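The propose-execute-feed-back cycle can be sketched in a few lines. This is a minimal illustration, not OpenAI's implementation: `fake_model` is a hypothetical stand-in for the model, and commands run locally via subprocess rather than in a managed container.

```python
import subprocess

def run_shell(command: str) -> str:
    """Execute a proposed shell command and capture its output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=30
    )
    return result.stdout + result.stderr

def fake_model(history: list) -> dict:
    """Hypothetical stand-in for the model: proposes one shell
    command, then finishes once it has seen the output."""
    if not history:
        return {"action": "shell", "command": "echo hello from the agent"}
    return {"action": "done", "summary": history[-1]["output"].strip()}

def agent_loop() -> str:
    """Propose -> execute -> feed results back, until done."""
    history = []
    while True:
        step = fake_model(history)
        if step["action"] == "done":
            return step["summary"]
        output = run_shell(step["command"])
        history.append({"command": step["command"], "output": output})

print(agent_loop())
```

In the real platform, the execution side runs in a sandboxed container and the model side is a hosted API call, but the control flow is the same shape.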
The Responses API orchestrates this loop by assembling context, including the user prompt and tool instructions. When a model trained to propose shell commands (such as GPT-5.2 and later) emits one, the API forwards it to a container runtime for execution. The output streams back to the model, informing its next action, and the loop repeats until the task is complete.
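From the developer's side, enabling this loop amounts to declaring the shell tool in the request. The payload below is a hypothetical sketch: the tool type string `"shell"` and the exact input schema are assumptions based on the description above, so check the official Responses API reference for the precise field names.

```python
import json

# Hypothetical Responses API request enabling the shell tool.
# The model id comes from the text above; the "shell" tool type
# and input shape are illustrative assumptions, not confirmed values.
payload = {
    "model": "gpt-5.2",
    "tools": [{"type": "shell"}],  # opt in to command-line execution
    "input": [
        {
            "role": "user",
            "content": "Count the lines in every .py file in this repo.",
        }
    ],
}

# The platform handles the rest: the model proposes commands, the
# container runtime executes them, and outputs stream back in.
print(json.dumps(payload, indent=2))
```

The key point is that the developer declares capabilities once; the iterative command loop itself happens inside the platform rather than in application code.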