Fei-Fei Li, a luminary in AI, argues that spatial intelligence is the critical next frontier for artificial intelligence, beyond the current dominance of large language models. Together with Justin Johnson, her former PhD student and now co-founder at World Labs, Li laid out a vision for machines that not only process information but also understand and interact with the three-dimensional world. In an interview with Shawn Wang and Alessio Fanelli of Latent Space, the pair introduced Marble, World Labs' generative "world model," designed to bridge the gap between abstract language and embodied reality.
World Labs grew out of a shared conviction that AI must evolve beyond language-centric models. Li and Johnson, whose careers span foundational work like ImageNet and early vision-language research, saw a bottleneck coming for systems that learn from language alone. As Li puts it, "language is a lossy, low-bandwidth channel for describing the rich 3D/4D world we live in." Human intelligence, they argue, is inherently multimodal, and spatial reasoning plays a central role in how we understand physics, causality, and interaction. This insight drove their collaborative effort to build AI systems capable of perceiving, understanding, and building in 3D space.
