How Fei-Fei Li Splits ‘World Models’ Into Three Distinct Functions

Fei-Fei Li’s June 2026 taxonomy places every world model into one of three functions: renderer, simulator, or planner. Here is what each one does, and why World Labs’ Marble sits at the intersection of the first two.

Fei-Fei Li speaking at the AI for Good Global Summit, 2017. Photo by ITU Pictures, via Wikimedia Commons (CC BY 2.0)

On June 3, 2026, Fei-Fei Li published a taxonomy that assigns every system calling itself a "world model" to one of three functions: renderer, simulator, or planner. The piece, posted to Li's personal Substack, is the most precise public statement the World Labs CEO has made about what separates a genuine world model from a video generator with better marketing.


Fei-Fei Li co-created ImageNet in 2009, a dataset of 14 million labeled images that became foundational infrastructure for the deep learning era. She co-founded World Labs in late 2023; the company has since raised $1.23 billion across two rounds to build what she calls spatial intelligence.

From Video Generator to Physics Engine: the Renderer-Simulator Gap

The sharpest point in Li's taxonomy is the distinction between a renderer and a simulator. A renderer takes inputs, whether text, image, or video, and produces pixels: a visual representation of what a scene looks like. Most AI systems currently described as world models, including video generators such as Sora, Runway, and Kling, are renderers by this definition. Li's taxonomy is explicit: systems that stop at the renderer stage are not true world models.

A simulator outputs state rather than pixels. Its output is a geometrically and physically faithful representation that programs can compute on directly: collider meshes, material properties, spatial coordinates, the data a game engine or robotics simulation pipeline consumes. The distinction is consequential for any application where the downstream consumer is a machine rather than a human viewer. A renderer conveys what the world looks like; a simulator conveys how it behaves.

The third function, the planner, closes the loop. A planner takes observations as input and produces actions, connecting perception to behavior. Li's piece positions it as the synthesis of the first two: a planner depends on a grounded simulator, because without one its action outputs are disconnected from physical reality. Together, renderer, simulator, and planner form what Li describes as "an interconnected loop" that underpins spatial intelligence, per TechTimes' June 2026 coverage.
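One way to read the taxonomy is as three function signatures that differ in what they emit: pixels, state, or actions. The sketch below is a toy illustration of that reading, not code from Li's piece or from World Labs. Every name in it (`State`, `Frame`, `StripRenderer`, `PointMassSimulator`, `PDPlanner`, `run_loop`) is hypothetical, and as a simplification the renderer here draws from simulator state rather than from a text, image, or video prompt.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class State:
    """Machine-readable scene state: the kind of output a simulator emits."""
    x_m: float    # position along one axis, in metres
    v_mps: float  # velocity, in metres per second


@dataclass(frozen=True)
class Frame:
    """Pixels only: the kind of output a renderer emits."""
    width: int
    height: int
    rgb: bytes  # width * height * 3 bytes, row-major


class Renderer(Protocol):
    def render(self, state: State) -> Frame: ...


class Simulator(Protocol):
    def step(self, state: State, force_n: float, dt_s: float) -> State: ...


class Planner(Protocol):
    def act(self, observation: State) -> float: ...


class StripRenderer:
    """Draws the point as one white pixel on a black, 1-pixel-high strip."""

    def __init__(self, width: int = 11) -> None:
        self.width = width

    def render(self, state: State) -> Frame:
        # Map x in [0, 1] metres to a column index, clamped to the strip.
        col = min(self.width - 1, max(0, round(state.x_m * (self.width - 1))))
        rgb = bytearray(self.width * 3)
        rgb[col * 3: col * 3 + 3] = b"\xff\xff\xff"
        return Frame(self.width, 1, bytes(rgb))


class PointMassSimulator:
    """Unit point mass advanced with semi-implicit Euler integration."""

    def step(self, state: State, force_n: float, dt_s: float) -> State:
        v = state.v_mps + force_n * dt_s  # mass is 1 kg, so acceleration = force
        return State(x_m=state.x_m + v * dt_s, v_mps=v)


class PDPlanner:
    """Proportional-derivative controller that pushes the mass to a target."""

    def __init__(self, target_m: float, kp: float = 4.0, kd: float = 4.0) -> None:
        self.target_m, self.kp, self.kd = target_m, kp, kd

    def act(self, observation: State) -> float:
        error = self.target_m - observation.x_m
        return self.kp * error - self.kd * observation.v_mps


def run_loop(sim: Simulator, planner: Planner, renderer: Renderer,
             start: State, steps: int, dt_s: float = 0.05) -> tuple[State, Frame]:
    state = start
    for _ in range(steps):
        force = planner.act(state)            # observation -> action
        state = sim.step(state, force, dt_s)  # action -> next state
    return state, renderer.render(state)      # state -> pixels for a viewer


if __name__ == "__main__":
    final, frame = run_loop(PointMassSimulator(), PDPlanner(target_m=1.0),
                            StripRenderer(), State(0.0, 0.0), steps=200)
    print(final, len(frame.rgb))
```

In this toy, only `State` can be fed back into the planner or the next simulation step. `Frame` is a dead end for a program, which is the gap the taxonomy draws between a renderer and a simulator.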

Diagram: the three functions in Li's June 2026 taxonomy (renderer, simulator, planner), shown in loop order. Source: Fei-Fei Li, "A Functional Taxonomy of World Models," Substack, June 3, 2026.

How Marble Sits at the Renderer-Simulator Intersection

Marble, World Labs' first commercial product, launched in limited beta in November 2025 and moved to general commercial availability in February 2026, timed to the company's $1 billion growth round. Its technical architecture was designed specifically to bridge the renderer-simulator divide.

When a user submits an image, video, or text prompt, Marble produces two outputs in a single pass: triangle meshes (higher-fidelity geometry compatible with standard 3D software packages) and collider meshes (lower-fidelity geometry that a physics engine can run collision detection against). This dual output is what places Marble at the intersection of the renderer and simulator categories rather than among purely visual tools, as TechCrunch reported at Marble's November 2025 launch.
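To make the visual-versus-collider split concrete, here is a minimal sketch of one object carrying both kinds of geometry. It does not reflect Marble's actual export format, which the coverage does not describe. The `Mesh` and `SceneObject` types, the `hut` example, and the `hits_collider` helper are all assumptions for illustration, and the collision query is a plain axis-aligned bounding-box check (a common broad-phase test), not full mesh collision.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]
Tri = tuple[int, int, int]


@dataclass(frozen=True)
class Mesh:
    vertices: tuple[Vec3, ...]
    triangles: tuple[Tri, ...]  # each entry indexes three vertices

    def bounds(self) -> tuple[Vec3, Vec3]:
        """Axis-aligned bounding box as (min corner, max corner)."""
        xs, ys, zs = zip(*self.vertices)
        return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


@dataclass(frozen=True)
class SceneObject:
    visual: Mesh    # detailed geometry for 3D software and rendering
    collider: Mesh  # simplified geometry for collision queries


def unit_box() -> Mesh:
    """Unit cube with 8 vertices and 12 triangles (winding not normalised)."""
    # Vertex index is 4*x + 2*y + z for x, y, z in {0, 1}.
    verts = tuple((float(x), float(y), float(z))
                  for x in (0, 1) for y in (0, 1) for z in (0, 1))
    tris = ((0, 1, 3), (0, 3, 2), (4, 5, 7), (4, 7, 6),   # faces x = 0, x = 1
            (0, 1, 5), (0, 5, 4), (2, 3, 7), (2, 7, 6),   # faces y = 0, y = 1
            (0, 2, 6), (0, 6, 4), (1, 3, 7), (1, 7, 5))   # faces z = 0, z = 1
    return Mesh(verts, tris)


def hut() -> SceneObject:
    """A box with a pyramid roof; the collider keeps only the box."""
    box = unit_box()
    apex = (0.5, 1.5, 0.5)
    # Four extra triangles join the top face (y = 1) to the apex at index 8.
    roof = ((2, 3, 8), (3, 7, 8), (7, 6, 8), (6, 2, 8))
    visual = Mesh(box.vertices + (apex,), box.triangles + roof)
    return SceneObject(visual=visual, collider=box)


def hits_collider(obj: SceneObject, point: Vec3) -> bool:
    """Broad-phase test: is the point inside the collider's bounding box?"""
    lo, hi = obj.collider.bounds()
    return all(l <= p <= h for l, p, h in zip(lo, point, hi))


if __name__ == "__main__":
    obj = hut()
    print(len(obj.visual.triangles), len(obj.collider.triangles))
    print(hits_collider(obj, (0.5, 0.5, 0.5)), hits_collider(obj, (0.5, 1.2, 0.5)))
```

The second query point lies inside the roof of the visual mesh but outside the collider, which shows the trade the article describes: the collider gives up detail so that collision checks stay cheap.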

Chisel, an experimental mode available through the World API (launched January 2026), extends this further. Users block out coarse spatial layouts using geometric primitives; Marble fills in visual detail and physical structure. Decoupling spatial layout from appearance has direct implications for game developers, architects, and simulation engineers, who typically work from rough spatial logic before committing to aesthetics. Autodesk's $200 million participation in the February 2026 funding round reflects the industrial 3D software market's specific interest in this kind of programmable spatial output, per TechCrunch.
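The idea of decoupling layout from appearance can be sketched as a data structure in which geometry and style are separate fields. This is purely illustrative and is not the World API's request schema, which the coverage does not document. `Primitive`, `Blockout`, `restyle`, and the field names are all hypothetical.

```python
from dataclasses import dataclass, replace

Vec3 = tuple[float, float, float]


@dataclass(frozen=True)
class Primitive:
    kind: str     # e.g. "box", "cylinder", "plane"
    center: Vec3  # position in metres
    size: Vec3    # extent along x, y, z in metres


@dataclass(frozen=True)
class Blockout:
    primitives: tuple[Primitive, ...]  # coarse spatial layout
    style_prompt: str                  # appearance, kept separate from layout


def restyle(blockout: Blockout, style_prompt: str) -> Blockout:
    """Return a copy with a new appearance and the layout left untouched."""
    return replace(blockout, style_prompt=style_prompt)


if __name__ == "__main__":
    room = Blockout(
        primitives=(
            Primitive("plane", (0.0, 0.0, 0.0), (6.0, 0.0, 4.0)),  # floor
            Primitive("box", (1.0, 0.5, 1.0), (2.0, 1.0, 1.0)),    # table
        ),
        style_prompt="mid-century living room",
    )
    cabin = restyle(room, "rustic log cabin")
    print(cabin.style_prompt, cabin.primitives == room.primitives)
```

Because the layout is an immutable value of its own, the same blockout can be restyled repeatedly without touching the geometry, which mirrors the workflow of settling spatial logic before committing to aesthetics.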

Chart: World Labs funding by round, Series A (2024) and growth round (2026), $1.23B in total. Sources: TechCrunch (Feb 18, 2026); World Labs blog (worldlabs.ai/blog/funding-2026).

From ImageNet to World Models: the Same Infrastructure Pattern

Li's career has a recognizable structure. In 2009, she co-created ImageNet at Princeton, assembling 14 million manually labeled photographs across 22,000 categories. The dataset existed without broad application until 2012, when Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton trained their AlexNet architecture against it and produced results that reshaped the field's trajectory. Li did not build the winning application; she built the substrate that made applications possible.

At World Labs, Li has framed spatial intelligence in similar terms. "I believe spatial intelligence is as critical as, and complementary to, language intelligence," she said in the February 2026 funding announcement, per the World Labs blog. The June taxonomy is the theoretical version of that argument: by naming the three functions and specifying where current AI systems fall short, Li is defining the problem space before the full solutions exist. The planner layer, in particular, remains largely unoccupied; World Labs has not released a standalone planning product, and the taxonomy frames this as the open frontier of the field rather than a gap in World Labs' current capabilities.

Bloomberg reported in January 2026 that World Labs was in funding discussions at a $5 billion valuation, up from the $1 billion post-money valuation at the September 2024 Series A. The company has not confirmed the final post-money valuation for the February 2026 round.

Chart: World Labs valuation at two reported milestones, $1B in 2024 and a $5B target in January 2026. Sources: TechCrunch (Sep 2024); Bloomberg (Jan 23, 2026).

What It Means

The taxonomy functions as a competitive positioning document as much as a technical one. By placing Marble explicitly at the renderer-simulator intersection and noting the planner layer is still being built across the industry, Li signals where World Labs intends to expand without committing to a timeline. The $1.23 billion raised to date, with Autodesk, AMD, Fidelity, Nvidia, and Sea among investors in the growth round, gives the company capital to move from simulation toward planner integration. The practical question is whether Marble's dual-output architecture proves as substrate-defining for spatial AI as ImageNet was for visual recognition.


© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.