OpenAI is expanding its GPT-5.4 family with the introduction of two new, more compact models: GPT-5.4 mini and GPT-5.4 nano. These releases, detailed on OpenAI News, are designed for efficiency and speed, targeting applications where rapid response times and cost-effectiveness are paramount.
GPT-5.4 mini represents a substantial upgrade over the previous GPT-5 mini, delivering enhanced performance in coding, reasoning, and multimodal understanding. Notably, it runs more than twice as fast as GPT-5 mini and approaches the larger GPT-5.4's scores on benchmarks such as SWE-Bench Pro and OSWorld-Verified.
The GPT-5.4 nano model is positioned as the most economical and swift option within the GPT-5.4 series. OpenAI recommends it for tasks such as classification, data extraction, and powering simpler subagents, where minimal latency and cost are key drivers.
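As a sketch of the kind of classification workload described above, the snippet below builds a request payload for a single-label sentiment call. The model identifier "gpt-5.4-nano" and the exact payload shape are assumptions based on the familiar Chat Completions convention, not confirmed details from the announcement.

```python
# Sketch: constructing a classification request for a small, low-latency model.
# The model name "gpt-5.4-nano" is assumed from the announcement; verify the
# actual identifier against the OpenAI API documentation before use.

def build_classification_request(text: str, labels: list[str]) -> dict:
    """Return a Chat Completions-style payload asking the model to pick one label."""
    return {
        "model": "gpt-5.4-nano",  # assumed identifier
        "messages": [
            {
                "role": "system",
                "content": (
                    "Classify the user's text with exactly one of these labels: "
                    + ", ".join(labels)
                ),
            },
            {"role": "user", "content": text},
        ],
        "max_tokens": 5,   # a single label needs very few tokens
        "temperature": 0,  # deterministic output suits classification
    }

payload = build_classification_request(
    "The checkout flow keeps timing out.",
    ["bug", "feature_request", "praise"],
)
print(payload["model"])
```

Keeping the output to a handful of tokens is what makes a model priced for high-volume work attractive here: latency and cost both scale with generated tokens.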
Optimized for Action
These new models are engineered for workloads where responsiveness directly impacts user experience. This includes coding assistants demanding near-instantaneous feedback, subagents executing routine tasks quickly, and computer-using systems that process screenshots in real time.
The company emphasizes that for many applications, the most effective model isn't necessarily the largest. Instead, it's the one that balances speed, reliable tool usage, and strong performance on complex professional tasks.
Performance Benchmarks
Early benchmarks indicate GPT-5.4 mini's strengths across various domains. In coding tasks, it significantly outperforms GPT-5 mini at similar latencies, approaching GPT-5.4's success rate while running much faster. This offers a compelling performance-per-latency trade-off for developers.
GPT-5.4 mini also shows promise in multimodal applications, particularly those involving computer vision. It can swiftly interpret complex user interfaces from screenshots, demonstrating performance close to GPT-5.4 on the OSWorld-Verified benchmark.
Subagents and Compositional AI
The introduction of these smaller, faster models also facilitates more sophisticated AI architectures. OpenAI highlights their use in subagent systems, where larger models can orchestrate tasks, delegating specialized sub-functions to GPT-5.4 mini or nano for rapid execution.
This compositional approach, where developers combine models of varying sizes for optimal efficiency, becomes increasingly viable as smaller models gain capability. It allows for systems where a central model handles planning, while smaller, specialized models execute specific tasks at scale.
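One way to picture this compositional pattern is a planner that splits work into typed subtasks and routes each to the cheapest capable model. Everything below is illustrative: the task types, the routing table, and the stubbed `call_model` function stand in for real API calls and are not part of any published OpenAI interface.

```python
# Illustrative planner/subagent routing: a larger model plans, smaller models
# execute. Model identifiers are assumed from the announcement.

PLANNER_MODEL = "gpt-5.4"
ROUTES = {
    "classify": "gpt-5.4-nano",   # minimal-latency, minimal-cost tasks
    "extract": "gpt-5.4-nano",
    "code_edit": "gpt-5.4-mini",  # stronger coding at similar latency
}

def call_model(model: str, task: str, payload: str) -> str:
    # Stub: a real system would invoke the API here.
    return f"[{model}] handled {task}: {payload}"

def run_pipeline(subtasks: list[tuple[str, str]]) -> list[str]:
    """Route each (task_type, payload) pair to a model; unknown types fall back to the planner."""
    results = []
    for task_type, payload in subtasks:
        model = ROUTES.get(task_type, PLANNER_MODEL)
        results.append(call_model(model, task_type, payload))
    return results

plan = [("classify", "ticket #4512"), ("code_edit", "fix null check")]
print(run_pipeline(plan))
```

The design choice worth noting is that the routing table, not the subagents, encodes the cost/capability trade-off, so swapping a task from nano to mini is a one-line change.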
Availability and Pricing
GPT-5.4 mini is now available via the OpenAI API, Codex, and ChatGPT. The API version supports text and image inputs and tool use, with a 400k context window; it is priced at $0.75 per 1M input tokens and $4.50 per 1M output tokens.
GPT-5.4 nano is exclusive to the API, offered at a more aggressive price point of $0.20 per 1M input tokens and $1.25 per 1M output tokens, making it suitable for high-volume, cost-sensitive applications.
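The listed rates make per-request costs straightforward to estimate. The helper below applies the announced prices to hypothetical token counts (the token counts are illustrative, not from the announcement):

```python
# Per-request cost estimate from the announced per-1M-token rates.
PRICES = {  # USD per 1M tokens (input, output), as listed in the announcement
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the published rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a 2,000-token prompt with a 500-token reply.
mini = request_cost("gpt-5.4-mini", 2_000, 500)
nano = request_cost("gpt-5.4-nano", 2_000, 500)
print(f"mini: ${mini:.6f}  nano: ${nano:.6f}")
# → mini: $0.003750  nano: $0.001025
```

At these rates nano is roughly a quarter the cost of mini per request, which is the margin that matters for high-volume classification and extraction workloads.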