OpenAI Unveils Custom AI Chip

OpenAI and Broadcom unveil Jalapeño, a custom LLM inference chip designed for higher performance and efficiency, marking a key step in OpenAI's full-stack strategy.

[Figure: architecture diagram of the Jalapeño AI inference chip developed by OpenAI and Broadcom, designed for optimized LLM inference. Credit: OpenAI News]

OpenAI and Broadcom have officially revealed Jalapeño, OpenAI's first custom-designed inference processor. This accelerator is engineered specifically for the demands of large language models (LLMs) and represents a significant step in OpenAI's ambition to build out its entire infrastructure stack.

The collaboration aims to deliver a multi-generation compute platform designed to make advanced AI faster and more accessible. Early tests indicate that the first-generation Jalapeño chip will offer substantially better performance per watt than existing state-of-the-art accelerators. The chip is built from the ground up around the needs of current and future LLMs across the industry.
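For readers unfamiliar with the metric, performance per watt for an inference chip is commonly expressed as tokens generated per second divided by power draw, which works out to tokens per joule. The sketch below shows only the arithmetic. The function names and every number in it are hypothetical placeholders, not figures published by OpenAI or Broadcom.

```python
def perf_per_watt(tokens_per_second: float, watts: float) -> float:
    """Return tokens per joule: throughput divided by power draw."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tokens_per_second / watts


def relative_gain(candidate: float, baseline: float) -> float:
    """Return how many times better the candidate is than the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return candidate / baseline


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the calculation.
    baseline = perf_per_watt(tokens_per_second=10_000, watts=1_000)
    candidate = perf_per_watt(tokens_per_second=12_000, watts=800)
    print(f"baseline:  {baseline:.1f} tokens/joule")
    print(f"candidate: {candidate:.1f} tokens/joule")
    print(f"gain:      {relative_gain(candidate, baseline):.2f}x")
```

Note that a chip can improve on this metric by raising throughput, by lowering power, or both, so a performance-per-watt claim alone does not say which of the two changed.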

Accelerated Development and Full-Stack Vision

The chip went through an accelerated nine-month tape-out process, and its design was informed by OpenAI's deep understanding of LLM fundamentals and its roadmap of models and serving systems. Broadcom and Celestica were key partners in industrializing the platform, handling chip implementation, board design, rack system integration, and scalable production.

Jalapeño is designed for flexibility and is intended to work with a wide range of LLMs. Engineering samples are already running workloads, including GPT‑5.3‑Codex‑Spark, at target specifications. The architecture focuses on reducing data movement and optimizing compute, memory, and networking resources to achieve high utilization rates.
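One common way to reason about why data movement limits utilization is the roofline model, a general performance model that is not specific to Jalapeño. It caps attainable compute at the smaller of the chip's peak rate and its memory bandwidth multiplied by the workload's arithmetic intensity (operations per byte moved). The sketch below is a minimal illustration of that model. The hardware figures are hypothetical, and nothing here describes Jalapeño's actual specifications or design.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float,
                     arithmetic_intensity: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * FLOPs per byte)."""
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def utilization(peak_flops: float, mem_bandwidth: float,
                arithmetic_intensity: float) -> float:
    """Fraction of peak compute the workload can reach under the roofline."""
    return attainable_flops(peak_flops, mem_bandwidth,
                            arithmetic_intensity) / peak_flops


if __name__ == "__main__":
    # Hypothetical accelerator: 1e15 FLOP/s peak, 1e12 bytes/s of memory bandwidth.
    PEAK = 1e15
    BANDWIDTH = 1e12

    # Low intensity (few operations per byte fetched): memory-bound.
    print(f"intensity 2:    {utilization(PEAK, BANDWIDTH, 2):.3%} of peak")
    # High intensity (data reused many times once loaded): compute-bound.
    print(f"intensity 2000: {utilization(PEAK, BANDWIDTH, 2000):.3%} of peak")
```

Under this model, a workload that moves fewer bytes per operation sits further to the right on the roofline and can keep more of the compute units busy, which is the general sense in which reduced data movement and high utilization go together.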

"The world is moving to a compute-powered economy," stated Greg Brockman, President and Co-Founder of OpenAI. "Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant."

A Multi-Generational Commitment

This partnership signifies a commitment to scaling the physical infrastructure necessary for AI's next decade. Broadcom's role includes providing its silicon implementation and networking technologies, such as Tomahawk networking silicon, to enable large-scale production.

The Jalapeño chip is not an adaptation of a general-purpose processor but a blank-slate design tailored for LLM inference. It draws insights from OpenAI's daily operations across products like ChatGPT and its API. The goal is to combine high throughput with low latency, crucial for interactive AI products.

OpenAI's strategy involves controlling every layer of its stack, from chip architecture to product experience. This vertical integration allows for optimization across the board, aiming to make its models faster, more reliable, and more affordable. This approach strengthens a flywheel effect: better infrastructure enables more efficient training and serving, leading to more capable models, which in turn power better products, driving usage and revenue for further reinvestment.

The chip's nine-month development cycle is being described as potentially the fastest ASIC development effort for a high-performance semiconductor. It was enabled by deep software-hardware co-development and by the use of AI models in the design process itself.

Jalapeño is the first component of a multi-generation compute platform planned for deployment starting in late 2026, with broader expansion in subsequent years. This initiative is fundamentally about inference, the point where AI models deliver value to users. Improvements in cost, speed, and reliability directly translate to better user experiences, such as faster responses from ChatGPT or more cost-effective API usage.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.