• StartupHub.ai
    StartupHub.aiAI Intelligence
Discover
  • Home
  • Search
  • Trending
  • News
Intelligence
  • Market Analysis
  • Comparison
  • Market Map
Workspace
  • Email Validator
  • Pricing
Company
  • About
  • Editorial
  • Terms
  • Privacy
  • v1.0.0
  1. Home
  2. News
  3. Azures Gb300 Cluster Openais New Ai Superpower
Back to News
Ai research

Azure's GB300 Cluster: OpenAI's New AI Superpower

S
StartupHub Team
Oct 9, 2025 at 7:01 PM3 min read
Azure's GB300 Cluster: OpenAI's New AI Superpower

** Microsoft Azure has just pulled back the curtain on a monumental leap in AI infrastructure: the world’s first production-scale NVIDIA GB300 NVL72 supercomputing cluster. This isn't just another server farm; it's a purpose-built behemoth designed to tackle OpenAI’s most demanding AI inference workloads, signaling a new frontier for reasoning models and agentic AI systems.

According to the announcement, this supercomputer-scale cluster boasts over 4,600 NVIDIA Blackwell Ultra GPUs, all interconnected via the cutting-edge NVIDIA Quantum-X800 InfiniBand networking platform. Microsoft and NVIDIA’s years-long partnership has culminated in a radical engineering feat, optimizing memory and networking to deliver the sheer compute scale needed for high inference and training throughput. This isn't just about raw power; it's about making that power accessible and efficient for the next generation of AI.

Inside the GB300 Engine

At the core of Azure’s new NDv6 GB300 VM series lies the liquid-cooled, rack-scale NVIDIA GB300 NVL72 system. Each rack is an independent powerhouse, integrating 72 NVIDIA Blackwell Ultra GPUs and 36 NVIDIA Grace CPUs. This configuration creates a staggering 37 terabytes of fast memory and 1.44 exaflops of FP4 Tensor Core performance per VM, forming a massive, unified memory space crucial for complex multimodal generative AI and advanced reasoning tasks.

The performance gains are already evident. In recent MLPerf Inference v5.1 benchmarks, the NVIDIA GB300 NVL72 systems delivered record-setting results, showing up to 5x higher throughput per GPU on the 671-billion-parameter DeepSeek-R1 reasoning model compared to the previous Hopper architecture. This kind of performance is critical for the real-time, complex decision-making required by agentic AI.

Connecting these thousands of GPUs into a single, cohesive supercomputer required a sophisticated two-tiered networking architecture. Within each GB300 NVL72 rack, the fifth-generation NVIDIA NVLink Switch fabric provides 130 TB/s of direct bandwidth, effectively turning the entire rack into a unified accelerator. To scale beyond the rack, the cluster leverages the NVIDIA Quantum-X800 InfiniBand platform, offering 800 Gb/s of bandwidth per GPU across all 4,608 GPUs. This ensures seamless, low-latency communication essential for trillion-parameter-scale AI.

This achievement underscores Microsoft Azure’s commitment to building the foundational infrastructure for future AI breakthroughs. By deploying the world’s first production NVIDIA GB300 NVL72 cluster at this scale, Azure is not just offering more compute; it's redefining what's possible for AI development, directly empowering partners like OpenAI to push the boundaries of what AI can do.

**

Related Reading

- [Blackwell AI Inference: NVIDIA's Extreme-Scale Bet](https://www.startuphub.ai/ai-news/ai-research/2025/blackwell-ai-inference-nvidias-extreme-scale-bet/) - [NVIDIA's Open Play for Agentic AI with Nemotron](https://www.startuphub.ai/ai-news/ai-research/2025/nvidias-open-play-for-agentic-ai-with-nemotron/) - [UK Sovereign AI Boosts Welsh Language, Public Services](https://www.startuphub.ai/ai-news/ai-research/2025/uk-sovereign-ai-boosts-welsh-language-public-services/)

#AI
#AI Infrastructure
#Generative AI
#Launch
#Microsoft Azure
#NVIDIA
#OpenAI
#Supercomputing

AI Daily Digest

Get the most important AI news daily.

GoogleSequoiaOpenAIa16z
+40k readers