NVIDIA’s Blackwell GPUs just swept a new, independent AI benchmark, but the real story isn’t only record-breaking speed. It’s the brutal economics of running AI at scale. Citing results from the InferenceMAX v1 benchmark by SemiAnalysis, NVIDIA is making a bold claim: its hardware can deliver a 15x return on investment.
The headline number is staggering: a $5 million investment in a GB200 NVL72 system can supposedly generate $75 million in token revenue. This is the core of NVIDIA’s new pitch for the era of “AI factories,” where the focus shifts from theoretical performance to the total cost of ownership and profitability.
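The arithmetic behind that headline figure is easy to sketch. The snippet below is a minimal illustration, not anything published by NVIDIA or SemiAnalysis: it simply divides the quoted token revenue by the quoted system cost. The function name `revenue_multiple` is hypothetical, and the calculation assumes the $75 million is gross revenue, with no power, hosting, or other operating costs subtracted.

```python
def revenue_multiple(investment_usd: float, token_revenue_usd: float) -> float:
    """Return token revenue as a multiple of the up-front investment.

    Assumption: revenue is treated as gross; operating costs are ignored.
    """
    if investment_usd <= 0:
        raise ValueError("investment_usd must be positive")
    return token_revenue_usd / investment_usd


# Figures quoted in the article for a GB200 NVL72 system.
investment = 5_000_000   # $5 million system cost
revenue = 75_000_000     # $75 million in claimed token revenue

multiple = revenue_multiple(investment, revenue)
print(f"Revenue multiple: {multiple:.0f}x")
```

On the quoted figures this works out to the 15x multiple in NVIDIA’s pitch. Any real deployment would need its operating costs subtracted before the result could be called a return.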
The InferenceMAX benchmark, released by industry analysis firm SemiAnalysis, is designed to measure exactly that. It’s the first independent test to model the total cost of compute across diverse, real-world scenarios. As AI models evolve from simple chatbots to complex reasoning agents that use tools and generate lengthy responses, the sheer volume of tokens, and the cost to produce them, is exploding. This benchmark aims to capture that reality.
