ElevenLabs Gives Chat Agents a Voice

Luke Harries from ElevenLabs discusses the increasing importance of voice for AI chat agents, highlighting the benefits of speed, accessibility, and user experience.

Luke Harries of ElevenLabs presenting on giving chat agents a voice.
Image credit: AI Engineer

Luke Harries from ElevenLabs presented on the future of AI interaction, focusing on how to equip chat agents with a voice. He posited that by 2025, businesses will either have integrated chat agents into their software-as-a-service (SaaS) offerings or will have adopted an AI-first approach where chat agents are the primary user interface.

ElevenLabs Gives Chat Agents a Voice - AI Engineer

The Evolution of Chat Agents

Harries cited a viral tweet that captures a prevalent trend: by 2025, a SaaS business will either have died or added a chat agent. The sentiment points to the growing importance of conversational AI in user experience. He noted that while chat agents are becoming standard, voice remains the more natural and efficient medium for human-computer interaction. He described voice as three times faster and more accessible, and as a way to enable omnichannel experiences.

The Power of Voice in AI Interaction

The presentation emphasized that voice interaction unlocks new possibilities for AI agents. Harries stated, "Voice is the natural medium... it's way quicker, it's way more interactive, it's also much more accessible." This accessibility is crucial for users who may struggle with traditional keyboard interfaces due to conditions like dyslexia. The ability to integrate voice allows AI agents to participate in voice calls, offer real-time feedback, and provide a more engaging user experience.

Building and Integrating Voice Agents

ElevenLabs provides developers with tools to integrate voice capabilities into their existing chat agents. Harries showcased the architecture of a voice agent, which typically comprises two layers: a 'Voice Engine' for audio processing and 'Agent Orchestration' for managing the LLM, knowledge base, and integrations. The Voice Engine handles tasks like turn-taking, speech-to-text, and text-to-speech, ensuring natural and context-aware interactions. The Agent Orchestration layer connects the LLM to knowledge bases and Retrieval-Augmented Generation (RAG) systems to provide intelligent responses.
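The two-layer split described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in, not ElevenLabs code: the class names `VoiceEngine` and `AgentOrchestration` simply mirror the labels from the talk, the speech-to-text, text-to-speech, and LLM functions are fakes that pass text through, and retrieval is a toy keyword match in place of a real RAG system. Real-time turn-taking is omitted; the sketch handles one complete turn at a time.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentOrchestration:
    """Hypothetical orchestration layer: retrieves context, then calls an LLM."""
    llm: Callable[[str], str]
    knowledge_base: dict[str, str] = field(default_factory=dict)

    def retrieve(self, query: str) -> list[str]:
        # Toy stand-in for RAG: return entries whose key appears in the query.
        q = query.lower()
        return [text for key, text in self.knowledge_base.items() if key in q]

    def respond(self, user_text: str) -> str:
        context = " ".join(self.retrieve(user_text))
        prompt = f"Context: {context}\nUser: {user_text}"
        return self.llm(prompt)


@dataclass
class VoiceEngine:
    """Hypothetical voice layer: speech-to-text in, text-to-speech out."""
    speech_to_text: Callable[[bytes], str]
    text_to_speech: Callable[[str], bytes]
    agent: AgentOrchestration

    def handle_turn(self, audio_in: bytes) -> bytes:
        # One conversational turn: transcribe, get a reply, synthesize it.
        transcript = self.speech_to_text(audio_in)
        reply = self.agent.respond(transcript)
        return self.text_to_speech(reply)


# Fakes so the sketch runs without any audio or model dependency.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def fake_llm(prompt: str) -> str:
    # Echo the last prompt line (the user turn) instead of calling a model.
    return "Echo: " + prompt.splitlines()[-1]


engine = VoiceEngine(
    speech_to_text=fake_stt,
    text_to_speech=fake_tts,
    agent=AgentOrchestration(
        llm=fake_llm,
        knowledge_base={"refund": "Refunds take 5 days."},
    ),
)
audio_out = engine.handle_turn(b"How do I get a refund?")
```

The point of the separation is that the orchestration layer only ever sees and returns text, so an existing chat agent could in principle sit behind the voice layer unchanged.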

Harries demonstrated the simplicity of integrating ElevenLabs' technology. He explained that developers can leverage their existing chat agent infrastructure and add a voice wrapper. The process involves obtaining a conversation token, starting a session, and then using the server SDK to attach the voice engine. This allows the agent to process voice input, generate responses, and output them as speech. The client SDK simplifies this further, enabling developers to add a voice widget to their site with just a few lines of code.
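The flow Harries described (obtain a conversation token, start a session, attach voice to an existing agent) might look roughly like the sketch below. It does not use the ElevenLabs SDK: `VoiceWrapper`, `VoiceSession`, `create_conversation_token`, and `start_session` are hypothetical names invented for illustration, the single-use token rule is an assumption, and audio is replaced by already-transcribed text so the example stays runnable.

```python
import secrets
from typing import Callable

# An existing chat agent, modeled as a text-in, text-out function.
ChatAgent = Callable[[str], str]


class VoiceSession:
    """Hypothetical session that wraps a text-only chat agent."""

    def __init__(self, token: str, agent: ChatAgent) -> None:
        self.token = token
        self.agent = agent
        self.transcript: list[tuple[str, str]] = []

    def on_user_speech(self, transcribed_text: str) -> str:
        # A real voice layer would transcribe audio before this call and
        # speak the reply afterwards; here both sides stay as text.
        reply = self.agent(transcribed_text)
        self.transcript.append((transcribed_text, reply))
        return reply


class VoiceWrapper:
    """Hypothetical server-side wrapper; not the ElevenLabs SDK."""

    def __init__(self) -> None:
        self._valid_tokens: set[str] = set()

    def create_conversation_token(self) -> str:
        # Step 1: issue a short-lived token for one conversation.
        token = secrets.token_urlsafe(16)
        self._valid_tokens.add(token)
        return token

    def start_session(self, token: str, agent: ChatAgent) -> VoiceSession:
        # Step 2: validate the token and bind the existing agent to a session.
        if token not in self._valid_tokens:
            raise PermissionError("unknown conversation token")
        self._valid_tokens.discard(token)  # assumption: tokens are single use
        return VoiceSession(token, agent)


def existing_chat_agent(message: str) -> str:
    # Placeholder for whatever chat logic already exists.
    return f"You said: {message}"


wrapper = VoiceWrapper()
token = wrapper.create_conversation_token()
session = wrapper.start_session(token, existing_chat_agent)
reply = session.on_user_speech("hello")
```

The chat agent itself is untouched in this sketch; only the session around it changes, which is the sense in which voice acts as a wrapper.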

ElevenLabs' Offerings: Voice Engine vs. Agents Platform

ElevenLabs offers two distinct paths for developers: the 'Voice Engine' and the 'Agents Platform'. The Voice Engine provides maximum flexibility: developers bring their own LLM and orchestration logic, their own RAG and business logic, and can keep their servers text-only. Both paths use the same Conversation SDK. The Agents Platform, conversely, focuses on maximum performance, with fully managed LLMs, built-in tools, a knowledge base, a dashboard for non-developers, and out-of-the-box telephony, and it offers the lowest possible latency.

Harries concluded by offering a glimpse into the future, predicting that by 2026, the choice will be stark: either a company's chat agent will die, or it will evolve into a voice agent. He expressed enthusiasm for collaborating with early adopters and design partners who are looking to lead this transition, emphasizing the ease with which existing chat agents can be enhanced with voice capabilities.
