The next transformative wave of artificial intelligence is unfolding not in the digital ether, but in the tangible, messy reality of the physical world. This was the central thesis articulated by Sanjit Biswas, CEO of Samsara, in a recent discussion with Sequoia Capital's Sonya Huang and Pat Grady. Biswas, a serial founder known for scaling technology in physical domains, first with Meraki and now with the $20B+ public company Samsara, offered a sharp analysis of why "physical AI" presents fundamentally different challenges and unparalleled opportunities compared to its cloud-based counterpart.
Biswas, a Sequoia-backed founder, has a background rooted in MIT's Roofnet project, and he co-founded Meraki, which Cisco acquired for $1.2 billion. The conversation centered on how Samsara, with sensors deployed across millions of vehicles and job sites capturing 90 billion miles of driving data annually, is navigating the complexities of bringing AI to asset-heavy industries like logistics, field service, and construction.
Biswas highlighted that physical AI operates under a distinct set of constraints that cloud-based AI does not face. Running inference on low-power edge devices, typically drawing between two and ten watts, demands an entirely different engineering approach than the virtually limitless compute power of centralized data centers. Furthermore, the "messy diversity of real-world data," encompassing everything from unpredictable weather and varied road conditions to the long tail of human behavior, presents both the biggest challenge and the greatest opportunity for embodied AI. This inherent variability necessitates robust, adaptable models that can perform reliably in unpredictable environments, a stark contrast to the often controlled or simulated environments of purely digital AI.
The "why now" for Samsara, as Biswas articulated, stemmed from a powerful confluence of three compounding technological curves: ubiquitous connectivity, advancements in compute, and the proliferation of high-quality sensors. He recalled the early 2000s, when Wi-Fi was nascent and internet access expensive, contrasting it with the present where connectivity is pervasive. The emergence of powerful, yet compact, embedded GPUs, exemplified by devices like the Nintendo Switch, signaled a significant leap in on-device processing capabilities. Simultaneously, the mass adoption of smartphones had driven down the cost and improved the quality of camera sensors dramatically. These three pillars, once combined, created an inflection point for physical AI, enabling real-world data capture and processing at an unprecedented scale and cost-effectiveness.
The impact of this physical AI extends beyond mere risk detection, venturing into proactive coaching and efficiency gains. Biswas explained how AI is beginning to "coach frontline workers—not just detect risk, but recognize good driving and improve fuel efficiency." This shift from punitive oversight to positive reinforcement, identifying and amplifying desirable behaviors, is a powerful motivator for workforces. Automation, by lowering costs and increasing operational speeds, unlocks latent demand that was previously uneconomical to serve. For instance, the ability to deliver a needed part to a field service technician for five dollars instead of fifty could dramatically increase the volume of service calls and overall operational velocity.
Scaling physical AI, however, is not without its formidable challenges. It demands a "village" of effort, encompassing everything from hardware installation in millions of units to extensive workforce training and change management. Samsara's deep investment, approaching $3 billion in R&D and customer success, underscores the sheer willpower and capital required to bridge the gap between cutting-edge technology and real-world deployment. The diversity of data, from urban to rural, residential to industrial, across varied weather conditions, creates an "incredible dataset" for training robust world models. Yet, these large models must be distilled into highly specialized, energy-efficient versions capable of running on low-power edge devices, a complex engineering feat.
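The distillation step mentioned above can be illustrated with the standard knowledge-distillation objective, in which a small "student" model is trained to match the softened output distribution of a large "teacher." This is only a minimal sketch of the general technique, not Samsara's training code: the function names, the example logits, and the temperature value are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the usual scaling that keeps gradient magnitudes
    comparable across temperatures. The temperature is an assumed
    hyperparameter, not a value from the interview.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    # Hypothetical logits over three made-up driving-event classes.
    teacher = [2.0, 1.0, 0.1]
    good_student = [1.9, 1.1, 0.2]
    poor_student = [0.1, 1.0, 2.0]
    print(distillation_loss(teacher, good_student))
    print(distillation_loss(teacher, poor_student))
```

A student whose outputs track the teacher's yields a loss near zero, while a student that ranks the classes differently is penalized more heavily. In practice this term is usually combined with an ordinary supervised loss on labeled data.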
Biswas emphasized that while Samsara doesn't build foundational models, it leverages their evolving capabilities, distilling them for specific use cases. The company's unique distributed architecture, running inference directly on cameras and other edge devices, allows for continuous risk monitoring without overwhelming cloud infrastructure with raw data streams. This on-device processing, operating at 2-10 watts, significantly reduces bandwidth costs and latency. The resulting "tokens" and metadata are then streamed to the cloud, where more sophisticated video reasoning models can analyze complex events, such as understanding the full context of a driving incident. This hybrid approach allows for immediate, actionable insights at the edge while enabling deeper, holistic analysis in the cloud.
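The edge-to-cloud split described above can be sketched roughly as follows: the device scores each frame locally and uplinks only a compact event record when something noteworthy is detected, rather than streaming raw video. Everything here is hypothetical, including the `EdgeEvent` type, the label names, the confidence threshold, and the JSON payload format. It illustrates the general pattern and does not describe Samsara's actual system.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EdgeEvent:
    """Compact metadata record sent to the cloud in place of raw video."""
    timestamp_s: float
    label: str
    confidence: float


def summarize_frame(timestamp_s, scores, threshold=0.8):
    """Reduce one frame's class scores to an event, or None if unremarkable.

    `scores` maps label -> confidence and stands in for the output of an
    on-device model (assumed, not a real API). The threshold is an
    illustrative value.
    """
    label, confidence = max(scores.items(), key=lambda kv: kv[1])
    if label == "normal" or confidence < threshold:
        return None  # nothing worth uplinking for this frame
    return EdgeEvent(timestamp_s, label, round(confidence, 3))


def encode_for_uplink(events):
    """Serialize events to a small JSON payload for the cloud side."""
    return json.dumps([asdict(e) for e in events]).encode("utf-8")


if __name__ == "__main__":
    # Made-up per-frame scores from a hypothetical edge model.
    frames = [
        (0.0, {"normal": 0.95, "harsh_braking": 0.05}),
        (1.0, {"normal": 0.10, "harsh_braking": 0.90}),
        (2.0, {"normal": 0.40, "phone_use": 0.60}),
    ]
    events = [e for t, s in frames if (e := summarize_frame(t, s)) is not None]
    payload = encode_for_uplink(events)
    print(len(events), "event(s),", len(payload), "bytes uplinked")
```

Filtering on the device is what keeps bandwidth low: most frames produce no uplink at all, and the cloud side receives only the events that a larger reasoning model would need to examine in context.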
The ultimate vision for Samsara and the broader physical AI landscape is one of digital transformation, where real-time visibility and intelligent automation permeate every facet of physical operations. Biswas envisions a future where a unified platform provides a comprehensive view of frontline workers, vehicles, and assets, enabling predictive maintenance, optimized routing, and enhanced safety. This includes empowering workers with new tools like augmented reality wearables and AI-powered voice assistants that provide real-time guidance, freeing their hands and minds for complex tasks. The future of humans and AI in the physical world, Biswas suggests, is not one of replacement but of augmentation. In this view, AI serves as a co-pilot that enhances human capabilities and improves efficiency and safety across the industries that form the backbone of global infrastructure.

