OpenAI today launched a research preview of GPT‑5.3-Codex‑Spark, a stripped-down version of its larger GPT‑5.3‑Codex model. This new iteration is the company's first AI specifically engineered for real-time coding assistance, marking a significant step in its collaboration with Cerebras, announced earlier this year.
Codex‑Spark is built for speed, optimized to deliver near-instantaneous responses on ultra-low latency hardware. It boasts over 1000 tokens per second, a critical metric for interactive coding where immediate feedback is essential.

A New Mode for Codex
While OpenAI’s larger frontier models excel at complex, long-running autonomous tasks, Codex‑Spark targets immediate, interactive coding. Developers can use it for quick edits, logic refactoring, or interface adjustments, seeing results instantly.
This dual capability means Codex now supports both ambitious, multi-day projects and rapid, in-the-moment development. OpenAI plans to gather developer feedback to refine the model and expand access.
The research preview offers a 128k context window and is text-only. Usage will have separate rate limits during this phase, with potential queuing during peak demand to ensure reliability.
