Mistral Small 4 Unifies AI Capabilities

Mistral AI unveils Mistral Small 4, a unified model combining text, image, reasoning, and coding capabilities under an open-source license.

[Image: Mistral Small 4 AI model announcement graphic. Credit: mistral.ai]

Mistral AI has launched Mistral Small 4, a significant update to its Small family of models. This new release aims to consolidate multiple AI functionalities into a single, versatile model, eliminating the need for users to switch between specialized tools.

Mistral Small 4 combines the strengths of previous models: the instruction-following capabilities of Mistral Small, the reasoning power of Magistral, the multimodal features of Pixtral, and the agentic coding abilities of Devstral. This unification gives users a fast instruct model, a powerful reasoning engine, and a multimodal assistant in a single package, with configurable reasoning effort and improved efficiency.

The model is designed as a hybrid, optimized for general chat, coding, agentic tasks, and complex reasoning. Its architecture supports both text and image inputs, broadening its applicability across various use cases.

A New Standard for Hybrid AI

Mistral Small 4 represents a step towards more integrated AI solutions. Its architecture features a Mixture of Experts (MoE) design with 128 experts, four of which are activated per token for efficient scaling. The model has 119 billion parameters in total, with 6 billion active per token, and supports a substantial 256k context window for long-form interactions and document analysis.
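The sparsity figures quoted above can be sanity-checked with a quick back-of-the-envelope calculation, which is the core appeal of an MoE design: only a small fraction of the weights do work on any given token.

```python
# Back-of-the-envelope look at the sparsity figures quoted in the article.
total_params = 119e9       # total parameters
active_params = 6e9        # parameters active per token
num_experts = 128          # experts in the MoE layers
active_experts = 4         # experts routed per token

param_ratio = active_params / total_params
expert_ratio = active_experts / num_experts

print(f"Active parameter fraction: {param_ratio:.1%}")  # -> 5.0%
print(f"Active expert fraction:    {expert_ratio:.1%}")  # -> 3.1%
```

In other words, each token pays roughly the compute cost of a ~6B dense model while drawing on the capacity of a 119B one.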

A key innovation is the configurable reasoning effort. Users can select settings ranging from fast, low-latency responses akin to Mistral Small 3.2, to deep, step-by-step reasoning comparable to Magistral models. This flexibility allows for tailored performance based on the task at hand.
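The announcement doesn't specify how the effort setting is exposed to callers. Purely as an illustration, a request payload for an OpenAI-style chat endpoint might carry it as an extra field; the `reasoning_effort` name, the effort levels, and the `mistral-small-4` model id below are all assumptions, not documented API.

```python
import json

def build_request(prompt: str, effort: str = "low") -> dict:
    """Sketch of a chat request with a configurable reasoning effort.
    Field name "reasoning_effort", the allowed levels, and the model id
    are hypothetical placeholders."""
    assert effort in {"low", "medium", "high"}  # assumed levels
    return {
        "model": "mistral-small-4",  # assumed model id
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # assumed parameter name
    }

payload = build_request("Summarize this contract.", effort="high")
print(json.dumps(payload, indent=2))
```

The practical point is that the same deployed model could serve both quick chat turns (`"low"`) and deliberate step-by-step reasoning (`"high"`) without switching checkpoints.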

Performance benchmarks indicate a 40% reduction in end-to-end completion time in latency-optimized setups and a threefold increase in requests per second in throughput-optimized configurations compared to Mistral Small 3.
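To put those percentages in concrete terms, here is the arithmetic applied to a hypothetical baseline (the 10-second completion time and 5 req/s figures are invented for illustration; only the 40% and 3x factors come from the announcement):

```python
# Illustrative math for the quoted speedups; baseline numbers are assumed.
baseline_latency_s = 10.0  # hypothetical Mistral Small 3 completion time
baseline_rps = 5.0         # hypothetical Mistral Small 3 requests/second

latency_small4 = baseline_latency_s * (1 - 0.40)  # 40% faster end to end
rps_small4 = baseline_rps * 3                     # 3x throughput

print(f"Latency:    {baseline_latency_s:.1f}s -> {latency_small4:.1f}s")
print(f"Throughput: {baseline_rps:.1f} -> {rps_small4:.1f} req/s")
```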

Mistral Small 4 also demonstrates competitive performance against models like GPT-OSS 120B, achieving comparable or superior scores on benchmarks such as LiveCodeBench and AIME 2025, while generating significantly shorter, more efficient outputs. This efficiency translates to lower latency, reduced inference costs, and a better user experience, which is especially critical for enterprise deployments.

Continuing Mistral AI's commitment to open access, Mistral Small 4 is released under the Apache 2.0 license. It is available across the ecosystem, including vLLM, llama.cpp, and Hugging Face. Developers can prototype the model for free on NVIDIA accelerated computing, with production deployments available as NVIDIA NIM microservices for optimized inference and customization via NVIDIA NeMo.

Mistral AI also announced its founding membership in the NVIDIA Nemotron Coalition, underscoring its dedication to open-source collaboration and advancing AI development.

The model is intended for developers working on coding automation, enterprises needing general assistants or multimodal analysis tools, and researchers tackling complex reasoning tasks. Its open-source nature facilitates fine-tuning and specialization for specific applications.