Mistral Small 4 Unifies AI Capabilities

Mistral AI unveils Mistral Small 4, a unified model combining text, image, reasoning, and coding capabilities under an open-source license.

[Image: Mistral Small 4 AI model announcement graphic. Credit: mistral.ai]

Mistral AI has launched Mistral Small 4, a significant update to its Small family of models. This new release aims to consolidate multiple AI functionalities into a single, versatile model, eliminating the need for users to switch between specialized tools.

Mistral Small 4 combines the strengths of previous models: the instruction-following capabilities of Mistral Small, the reasoning power of Magistral, the multimodal features of Pixtral, and the agentic coding abilities of Devstral. This unification gives users a fast instruct model, a powerful reasoning engine, and a multimodal assistant in a single package, with configurable reasoning effort and improved efficiency.

The model is designed as a hybrid, optimized for general chat, coding, agentic tasks, and complex reasoning. Its architecture supports both text and image inputs, broadening its applicability across various use cases.

A New Standard for Hybrid AI

Mistral Small 4 represents a step towards more integrated AI solutions. Its architecture features a Mixture of Experts (MoE) design with 128 experts, four of which are activated per token for efficient scaling. The model has 119 billion parameters in total, with 6 billion active per token, and supports a substantial 256k context window for long-form interactions and document analysis.
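The sparsity figures quoted above can be sanity-checked with a quick back-of-the-envelope calculation, which is the core appeal of an MoE design: only a small fraction of the weights do work on any given token.

```python
# Back-of-the-envelope look at the sparsity figures quoted in the article.
total_params = 119e9       # total parameters
active_params = 6e9        # parameters active per token
num_experts = 128          # experts in the MoE layers
active_experts = 4         # experts routed per token

param_ratio = active_params / total_params
expert_ratio = active_experts / num_experts

print(f"Active parameter fraction: {param_ratio:.1%}")  # -> 5.0%
print(f"Active expert fraction:    {expert_ratio:.1%}")  # -> 3.1%
```

In other words, each token pays roughly the compute cost of a ~6B dense model while drawing on the capacity of a 119B one.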

A key innovation is the configurable reasoning effort. Users can select settings ranging from fast, low-latency responses akin to Mistral Small 3.2, to deep, step-by-step reasoning comparable to Magistral models. This flexibility allows for tailored performance based on the task at hand.
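The announcement doesn't specify how the effort setting is exposed to callers. Purely as an illustration, a request payload for an OpenAI-style chat endpoint might carry it as an extra field; the `reasoning_effort` name, the effort levels, and the `mistral-small-4` model id below are all assumptions, not documented API.

```python
import json

def build_request(prompt: str, effort: str = "low") -> dict:
    """Sketch of a chat request with a configurable reasoning effort.
    Field name "reasoning_effort", the allowed levels, and the model id
    are hypothetical placeholders."""
    assert effort in {"low", "medium", "high"}  # assumed levels
    return {
        "model": "mistral-small-4",  # assumed model id
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # assumed parameter name
    }

payload = build_request("Summarize this contract.", effort="high")
print(json.dumps(payload, indent=2))
```

The practical point is that the same deployed model could serve both quick chat turns (`"low"`) and deliberate step-by-step reasoning (`"high"`) without switching checkpoints.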

Performance benchmarks indicate a 40% reduction in end-to-end completion time in latency-optimized setups and a threefold increase in requests per second in throughput-optimized configurations compared to Mistral Small 3.
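To put those percentages in concrete terms, here is the arithmetic applied to a hypothetical baseline (the 10-second completion time and 5 req/s figures are invented for illustration; only the 40% and 3x factors come from the announcement):

```python
# Illustrative math for the quoted speedups; baseline numbers are assumed.
baseline_latency_s = 10.0  # hypothetical Mistral Small 3 completion time
baseline_rps = 5.0         # hypothetical Mistral Small 3 requests/second

latency_small4 = baseline_latency_s * (1 - 0.40)  # 40% faster end to end
rps_small4 = baseline_rps * 3                     # 3x throughput

print(f"Latency:    {baseline_latency_s:.1f}s -> {latency_small4:.1f}s")
print(f"Throughput: {baseline_rps:.1f} -> {rps_small4:.1f} req/s")
```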

Mistral Small 4 also demonstrates competitive performance against models like GPT-OSS 120B, achieving comparable or superior scores on benchmarks such as LiveCodeBench and AIME 2025, while generating significantly shorter, more efficient outputs. This efficiency translates to lower latency, reduced inference costs, and a better user experience, which is especially critical for enterprise deployments.

Continuing Mistral AI's commitment to open access, Mistral Small 4 is released under the Apache 2.0 license. It is available across the ecosystem, including vLLM, llama.cpp, and Hugging Face. Developers can prototype the model for free on NVIDIA accelerated computing, with production deployments available as NVIDIA NIM microservices for optimized inference and customization via NVIDIA NeMo.

Mistral AI also announced its founding membership in the NVIDIA Nemotron Coalition, underscoring its dedication to open-source collaboration and advancing AI development.

The model is intended for developers working on coding automation, enterprises needing general assistants or multimodal analysis tools, and researchers tackling complex reasoning tasks. Its open-source nature facilitates fine-tuning and specialization for specific applications.