Google Rolls Out Gemini 3.5 Live Translate

Google is releasing Gemini 3.5 Live Translate, a new audio model designed to deliver fluid, natural-sounding voice translation in more than 70 languages. The release builds on two decades of Google's machine learning work on language translation.

The system automatically detects the languages being spoken and generates speech that mirrors the original speaker's intonation, pacing, and pitch. Unlike traditional turn-by-turn translation, Gemini 3.5 Live Translate generates audio continuously, keeping a lag of a few seconds behind the speaker so it has enough context to translate accurately while staying in sync.
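The difference between turn-by-turn and continuous output can be illustrated with a toy buffer. The sketch below is not Google's implementation and makes no calls to the Gemini Live API; the function names `turn_by_turn` and `fixed_lag_stream`, the string "chunks", and the `lag` parameter are all hypothetical stand-ins for audio segments and the few-second delay described above.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple


def turn_by_turn(chunks: Iterable[str]) -> Iterator[str]:
    """Toy turn-by-turn mode: wait for the whole utterance, then emit once."""
    yield " ".join(chunks)


def fixed_lag_stream(
    chunks: Iterable[str], lag: int
) -> Iterator[Tuple[str, List[str]]]:
    """Toy continuous mode: emit each chunk once `lag` later chunks exist.

    Yields (chunk_to_emit, look_ahead), where look_ahead is the later
    context already heard. A real system would translate using that
    context; here we only show the timing.
    """
    pending: Deque[str] = deque()
    for chunk in chunks:
        pending.append(chunk)
        if len(pending) > lag:
            oldest = pending.popleft()
            yield oldest, list(pending)
    # The speaker has stopped: flush what is still buffered.
    while pending:
        oldest = pending.popleft()
        yield oldest, list(pending)


if __name__ == "__main__":
    speech = ["where", "is", "the", "station"]
    print(list(turn_by_turn(speech)))
    for emitted, context in fixed_lag_stream(speech, lag=2):
        print(emitted, context)
```

With `lag=2`, output for the first chunk starts as soon as the third chunk arrives rather than after the speaker finishes, which is the trade-off the article describes: a short, fixed delay in exchange for look-ahead context.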

Broader Rollout and Developer Access

The technology is rolling out across Google products. Developers can access it via the Gemini Live API and Google AI Studio. Enterprise users will see it in private preview within Google Meet starting this month.

For consumers, Gemini 3.5 Live Translate will be available through the Google Translate app on Android and iOS. In the app, it supports more than 70 languages, a significant expansion over what was previously available.

Enhanced User Experience

Android users will benefit from a new 'listening mode' in the Google Translate app. This feature allows users to hear translations directly through their phone's earpiece, useful for private listening without headphones.

Google Meet will also see improvements, supporting over 2,000 language combinations in a single meeting, a large step up from its previous English-centric translation capabilities. The goal is smoother communication among participants who do not share a language.
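The article does not say how the "over 2,000 combinations" figure is counted. As a rough sanity check, and assuming exactly 70 languages (the article says "over 70"), the number of language pairs can be worked out directly. The helper names below are illustrative only.

```python
import math


def unordered_pairs(n_languages: int) -> int:
    """Pairs where A<->B counts once: n choose 2."""
    return math.comb(n_languages, 2)


def ordered_pairs(n_languages: int) -> int:
    """Pairs where A->B and B->A count separately: n * (n - 1)."""
    return math.perm(n_languages, 2)


if __name__ == "__main__":
    n = 70  # assumption: the article only says "over 70"
    print(unordered_pairs(n))  # 2415
    print(ordered_pairs(n))    # 4830
```

Under either counting, 70 languages yield more than 2,000 pairs, so the two figures in the article are at least consistent with each other.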

The model's robustness to background noise and its ability to handle multilingual input without manual configuration are key features for real-world use. They make it suited to live interpretation in meetings, lessons, and broadcasts.

Developer Partnerships and Real-World Testing

Developer platforms like Agora and LiveKit are integrating the Gemini Live API to simplify the creation of voice translation apps. This allows developers to focus on user experience while the API handles complex streaming infrastructure.

Companies like Grab, which handles millions of voice calls each month, are already testing the technology for real-time communication between drivers and riders. Early feedback from partners highlights translation quality, accuracy, and low latency.

All AI-generated audio will be watermarked with SynthID so that it can be detected as synthetic, a measure intended to help limit misinformation.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.