Diffusion Is the Foundational AI Technique Every Founder Must Master

StartupHub Team · Jan 22 at 5:24 PM · 4 min read

The most profound advances in artificial intelligence often arrive not through brute-force scaling, but through mathematical simplification. When Y Combinator General Partner Ankit Gupta spoke with YC Visiting Partner Francois Chaubard on a recent episode of Decoded, the central topic was Diffusion, the machine learning framework underpinning everything from OpenAI’s Sora to Stable Diffusion and Google’s GenCast. Chaubard, a computer vision veteran, laid bare the framework’s surprising versatility and the philosophical reasons why it may offer a more biologically plausible path toward Artificial General Intelligence than the current generation of large language models.

Chaubard defined diffusion as a fundamental machine learning framework designed to learn the underlying probability distribution of any data set, regardless of domain. While this is the goal of all generative models, diffusion stands out for its capacity to map effectively from one high-dimensional space to another, even in "low data regimes." The core concept relies on a two-part process: a forward diffusion step, where noise is incrementally added to a clean data sample until it becomes pure static, and a reverse process, where a trained model learns to remove that noise step-by-step to reconstruct the original data.
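
Concretely, the two halves of that process fit in a few lines of code. The sketch below is a minimal illustration, assuming PyTorch, a generic model(x_t, t) network that predicts the added noise, and a standard linear noise schedule; none of these specifics come from the episode.

```python
import torch

# Linear noise schedule: how much corruption is applied at each of 1,000 steps.
betas = torch.linspace(1e-4, 0.02, 1000)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def forward_diffuse(x0, t):
    """Forward process: mix a clean sample x0 with Gaussian noise at step t."""
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod.to(x0.device)[t].view(-1, *([1] * (x0.dim() - 1)))
    xt = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise
    return xt, noise

def denoising_loss(model, x0):
    """Reverse-process training: the model learns to predict the noise that was
    added, so it can later be removed step-by-step to reconstruct clean data."""
    t = torch.randint(0, alphas_cumprod.numel(), (x0.shape[0],), device=x0.device)
    xt, noise = forward_diffuse(x0, t)
    return torch.nn.functional.mse_loss(model(xt, t), noise)
```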

The conversation quickly shifted to the evolution of this technique, tracing its roots back to a 2015 paper on non-equilibrium thermodynamics. Early implementations relied on complex Kullback-Leibler (KL) divergence losses and intricate noise schedules to manage the gradual corruption and reconstruction of data. However, the subsequent trajectory of diffusion research has been marked by a surprising trend: mathematical simplification leading to vastly improved results. The most pivotal of these innovations, Flow Matching, significantly streamlined the training objective. Instead of focusing on predicting the original data or the noise itself, Flow Matching trains the model to predict a simple, straight-line velocity vector between the noisy input and the clean data. This elegant reformulation allows the model to learn the data distribution with remarkable stability and efficiency. Chaubard highlighted the power of this abstraction, noting that the core training logic is domain-agnostic: “This code here has nothing to do with images. It could be weather data, it could be stock market data... it’s all the exact same code.” This abstraction is key to diffusion’s sprawling applicability across diverse fields.
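
Stated in code, the streamlined objective is remarkably small. The following is a hedged sketch of a flow-matching loss under the same assumed model(x_t, t) interface as above, not code from the episode; notably, nothing in it is specific to images, which is exactly the domain-agnosticism Chaubard was pointing at.

```python
import torch

def flow_matching_loss(model, x1):
    """x1 is a batch of clean data of any shape: images, weather grids, or
    market time series all go through the identical code path."""
    x0 = torch.randn_like(x1)                      # pure-noise endpoint
    t = torch.rand(x1.shape[0], device=x1.device)  # random time in [0, 1]
    t_b = t.view(-1, *([1] * (x1.dim() - 1)))
    xt = (1.0 - t_b) * x0 + t_b * x1               # point on the straight path
    target_velocity = x1 - x0                      # constant straight-line velocity
    return torch.nn.functional.mse_loss(model(xt, t), target_velocity)
```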

The practical applications of this increasingly elegant framework are broad and rapidly expanding beyond the visual domain where it first gained prominence. While Stable Diffusion, Midjourney, and recent video models like Sora have captured public attention, diffusion is now state-of-the-art in areas like protein folding (AlphaFold 3), molecular design (DiffDock), and real-world robotics control (Diffusion Policy). The technique’s inherent ability to handle complex, high-dimensional spaces makes it uniquely suited for systems where the output space is massive, such as predicting the intricate movements of a robot arm or generating accurate multi-day weather forecasts.

Chaubard drew a sharp philosophical line between the foundational processes of diffusion models and auto-regressive large language models (LLMs). He argued that while LLMs are powerful, their sequential nature—generating one token at a time and never looking back to revise previous tokens—is fundamentally restrictive. This contrasts sharply with the recursive, iterative nature of diffusion, which allows the model to refine and improve its output across many steps. When compared to biological intelligence, diffusion models appear to be a closer analogue to how the human brain constructs complex thoughts: generating high-level concepts and then recursively decoding them into lower-level manifestations. Chaubard stated that the LLM is “stuck. It can’t do more than in one step, even though it might want to.” Diffusion, conversely, leverages randomness and recursive refinement, mirroring the dynamic, back-and-forth process of biological thought.
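
That difference is easiest to see in the sampling loop itself. In the sketch below (again assuming the flow-matching model above and a plain Euler solver, both illustrative choices), the model revisits and refines the entire output at every step, whereas an auto-regressive decoder only ever appends the next token.

```python
import torch

@torch.no_grad()
def sample(model, shape, steps=50, device="cpu"):
    """Generate by iterative refinement: start from pure noise and integrate the
    learned velocity field with simple Euler steps."""
    x = torch.randn(shape, device=device)
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + model(x, t) * dt   # every step refines the whole sample at once
    return x
```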

For founders and technical professionals, the message is clear: diffusion is not just a tool for image generation; it is a core, general-purpose primitive for modeling complex data distributions. Given the fundamental simplicity and broad applicability demonstrated by Flow Matching and subsequent research, any team working with high-dimensional data—from novel drug discovery to robotics control—should be seriously evaluating diffusion as a core component of its ML architecture. Chaubard advised founders to "update your prior on how good these things are getting." The rapid evolution toward simpler, more robust diffusion procedures means that this once-complex technique is now accessible enough to redefine entire sectors of the economy.
