The race to serve the next generation of efficient, open AI agents is heating up, and FriendliAI is positioning itself as a key infrastructure layer. The company recently announced that it is an official launch partner for NVIDIA’s Nemotron 3 Nano, a move that underscores FriendliAI’s specialization in high-performance inference for complex, modern model architectures.
Nemotron 3 Nano represents a significant architectural shift designed specifically for agentic workflows. It employs a hybrid Mamba-Transformer Mixture-of-Experts (MoE) architecture paired with a 1-million-token context window. According to NVIDIA, this design is engineered to deliver up to 13x higher token generation efficiency, using techniques like multi-token prediction and NVFP4 quantization.
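For readers unfamiliar with the MoE idea, the toy sketch below shows the general principle of top-k expert routing: a router scores every expert for each token, but only the highest-scoring few actually run. It is an illustration only and does not describe Nemotron 3 Nano's actual router; the function names, the four toy experts, and the choice of k=2 are all assumptions made for this example.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and blend their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical router scores for one token; experts 1 and 3 score highest.
routes = route_top_k([0.1, 2.0, -1.0, 2.0], k=2)
output = moe_forward(1.0, experts, [0.1, 2.0, -1.0, 2.0], k=2)
```

Here only two of the four experts are evaluated for the token, which is why MoE models can hold many parameters while spending compute on a small fraction of them per token.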