The release of Cogito v2 marks a pivotal moment in the pursuit of advanced artificial intelligence, shifting the paradigm from mere inference-time search to genuine self-improvement. This development introduces a novel approach to scaling AI capabilities, one that internalizes reasoning processes rather than simply extending computational effort at inference time. It represents a significant stride towards building more intuitive and, ultimately, superintelligent systems.
Cogito v2 unveils four new hybrid reasoning models, ranging from a 70B dense model to a formidable 671B Mixture-of-Experts (MoE) model. The largest of these stands among the most capable open-source models globally, demonstrating performance competitive with DeepSeek v3 and R1, and even approaching the benchmarks set by closed frontier models like o3 and Claude 4 Opus. This achievement underscores the potential for open innovation to challenge established leaders.
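Since the weights are released openly, a natural first step is to load one of the checkpoints and query it locally. The sketch below assumes a standard Hugging Face Transformers workflow; the repository ID is a hypothetical placeholder, and the exact mechanism for toggling extended reasoning should be taken from the official model card rather than from this example.

```python
# Minimal sketch of running an open-weight hybrid reasoning model with
# Hugging Face Transformers. The repo ID below is a hypothetical placeholder;
# substitute the actual Cogito v2 checkpoint name from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepcogito/cogito-v2-example"  # hypothetical placeholder ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hybrid reasoning models typically expose a toggle (a system prompt or a
# chat-template flag) that switches between direct answers and extended
# reasoning; check the model card for the exact mechanism. A plain request:
messages = [{"role": "user", "content": "What is the capital of France?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The smaller dense checkpoints can run this way on a single high-memory GPU; the 671B MoE variant would require a multi-GPU inference server or quantized builds.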
The Genesis of AI Intuition
The core innovation lies in extending Iterated Distillation and Amplification (IDA), a research direction previously explored in Cogito v1. Unlike conventional Large Language Models (LLMs) that primarily enhance performance by "searching more" through longer reasoning chains, Cogito v2 distills the discoveries made during that search back into the model's parameters. This process cultivates a stronger "intelligence prior," or "intuition": the model learns to anticipate the outcome of its own reasoning, effectively developing a deeper understanding of the problem space.
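To make the amplify-then-distill loop concrete, the toy sketch below traces the IDA control flow: amplification spends extra compute to obtain a better answer than the policy's quick guess, and distillation writes that answer back into the policy's parameters so the next direct pass recovers it without search. Everything here (the lookup-table "policy", the function names, the arithmetic task) is an illustrative assumption, not the Cogito training code.

```python
# Toy, self-contained sketch of an Iterated Distillation and Amplification
# (IDA) loop. The "policy" is a lookup table and "amplification" is literal
# evaluation of an arithmetic expression, purely to expose the control flow.
from typing import Dict, Tuple

Params = Dict[str, str]  # the policy's "weights": prompt -> memorized answer

def fast_policy(params: Params, prompt: str) -> str:
    """One cheap forward pass: answer from 'intuition' (the distilled prior)."""
    return params.get(prompt, "unknown")

def amplify(params: Params, prompt: str) -> str:
    """Amplification: spend extra compute (here, actually evaluating the
    expression) to beat the cheap policy's answer when intuition falls short."""
    guess = fast_policy(params, prompt)
    if guess != "unknown":
        return guess                 # the distilled prior already suffices
    return str(eval(prompt))         # stand-in for long-chain reasoning/search

def distill(params: Params, prompt: str, amplified_answer: str) -> Params:
    """Distillation: push the amplified result back into the parameters so the
    next fast pass produces it directly, without the extra search."""
    params[prompt] = amplified_answer
    return params

def ida_round(params: Params, prompts: Tuple[str, ...]) -> Params:
    """One IDA iteration: amplify every prompt, then distill the results."""
    for p in prompts:
        params = distill(params, p, amplify(params, p))
    return params

if __name__ == "__main__":
    params: Params = {}
    tasks = ("2+2", "17*3", "(5+7)*2")
    params = ida_round(params, tasks)               # slow: relies on amplification
    print([fast_policy(params, p) for p in tasks])  # fast: now answered directly
```

In the actual recipe, the fast pass is a forward pass of the LLM, amplification is longer reasoning chains or search at inference time, and distillation presumably pushes those results back into the weights (for example, by fine-tuning on the amplified outputs); the loop then repeats from the improved model.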
This shift from extensive search to internalized intuition yields tangible benefits. Cogito models produce reasoning chains up to 60% shorter than DeepSeek R1's while matching or exceeding its performance. This efficiency is not merely about speed; it signifies a more profound grasp of the underlying logic, allowing the model to arrive at solutions more directly and elegantly. It suggests a move beyond brute-force computation towards a more sophisticated form of AI cognition.
Perhaps most remarkably, this breakthrough in self-improvement has been achieved with unprecedented cost efficiency. The combined training cost for all eight Cogito models (from 3B to 671B) is reported at less than $3.5 million. This challenges the prevailing notion that such advanced AI research necessitates immense capital investment, potentially democratizing access to cutting-edge development and accelerating the pace of innovation across the field.
The developers explicitly frame this work as a scalable training recipe for unbounded iterative intelligence improvements, with the ultimate goal of building superintelligence. Their commitment to open-sourcing all models, both current and future, ensures that these foundational advancements will benefit the wider research community. This transparency fosters collaborative progress and accelerates the collective journey towards more capable AI.
An intriguing emergent property of Cogito v2 is its ability to perform reasoning over visual domains, despite being trained exclusively on text inputs. This unexpected multimodal capability, arising purely from transfer learning, highlights the potential for these intuitive models to generalize across different data modalities without explicit multimodal training. It opens new avenues for bootstrapping training data and exploring reinforcement learning for visual reasoning.
Cogito v2 is a proof of concept for a fundamentally different approach to AI scaling. By prioritizing the development of internal "intuition" and self-improvement mechanisms, it lays crucial groundwork for AI systems that can genuinely learn, adapt, and evolve their own intelligence, paving a clearer path towards advanced general AI. Iterated Distillation and Amplification is proving to be a powerful mechanism for this evolution, and the implications for future large language model reasoning and the broader landscape of open-source AI models are profound.
