The future of human-technology interaction is being audibly reshaped by ElevenLabs, a company co-founded by Mati Staniszewski, who recently sat down with Sarah Guo on the No Priors podcast. Their discussion illuminated how voice AI is transcending its early, often robotic, iterations to become a profoundly natural and intuitive interface, powering everything from content creation to deeply personalized AI companions. Staniszewski detailed ElevenLabs’ rapid ascent since its 2022 founding, its dual focus on foundational research and product deployment, and the ambitious vision for voice as the ultimate medium for digital engagement across industries.
ElevenLabs has grown quickly, scaling to 350 employees globally and reaching $300 million in annual recurring revenue (ARR), split evenly between its self-serve creative platform and its enterprise-focused agent platform. With hubs in major tech centers like London, New York, Tokyo, and San Francisco, the company is not merely building tools but "solving how humans and technology interact," as Staniszewski articulated. This involves developing foundational audio models capable of generating human-like speech, understanding complex vocal nuances, and orchestrating interactive voice components to create seamless experiences.
ElevenLabs' approach grew out of a personal frustration with the poor quality of voiceovers, particularly in Polish-dubbed content, where a single, monotonous voice would narrate all characters regardless of gender or emotion. This "terrible experience," as Staniszewski described it, sparked a realization: voice AI could retain the original speaker's emotions and intonations across languages and contexts. This core insight, born from a seemingly niche problem, became a driving force for the company's dual strategy of deep research and product development.
ElevenLabs' strategy is not about chasing every trend but rather identifying "broken" experiences and building a specialized "lab" around each problem. This involves a dedicated team of researchers, engineers, and operators who collaboratively tackle the technical challenges, then build a simple product layer, and finally expand its applications. This agile, problem-centric methodology allows ElevenLabs to push the boundaries of voice AI quality and apply it to diverse, high-impact use cases.
For instance, their creative product offers solutions for narrations, audiobooks, and dubbing, addressing the very problem that inspired the company. The agent platform, however, represents a more profound shift. It moves beyond traditional reactive customer support to proactive, personalized AI experiences. Staniszewski cited examples like Meesho, a major Indian e-commerce platform, which utilizes ElevenLabs' agents as a front-end interface, helping customers with refunds, tracking, and even product recommendations. Similarly, Square is leveraging voice AI for ordering and discovery, transforming the customer journey in commerce.
The vision extends further into personalized learning and interactive experiences. ElevenLabs has partnered with Chess.com to allow users to learn chess from the AI-generated voices of grandmasters like Hikaru Nakamura and Magnus Carlsen. Another intriguing collaboration involves Chris Voss, a former FBI hostage negotiator, whose voice AI can be used for negotiation practice, offering a truly immersive and personalized educational experience. These applications underscore a fundamental shift from static content consumption to dynamic, interactive engagement, where the AI acts as a personal tutor or guide.
Beyond commerce and education, ElevenLabs is venturing into agentic government services, partnering with Ukraine's Ministry of Digital Transformation to explore how voice AI can change the way citizens interact with public services, from benefits inquiries to immigration processes. This ambitious undertaking highlights Staniszewski’s belief that voice is the ultimate interface for the future, capable of breaking down language barriers globally and fostering more intuitive interactions across all facets of life.
When considering the competitive landscape, Staniszewski acknowledges that foundational models will eventually commoditize. The enduring value, he argues, lies in the surrounding ecosystem: the branding, distribution channels, extensive collection of voices, robust integrations, and streamlined workflows that ElevenLabs builds around its core technology. The company’s focus on a platform that allows partners and enterprises to deploy customized solutions, rather than a one-size-fits-all approach, is a key differentiator. ElevenLabs also actively engages in foundational research, consistently outperforming competitors on benchmarks for text-to-speech, speech-to-text, and orchestration, a testament to the continuous investment in core AI capabilities. This blend of deep research and product-market fit, driven by a vision to make human-technology interaction profoundly natural, positions ElevenLabs at the forefront of the voice AI revolution.



