Sundar Pichai at Google I/O 2026: Gemini Hits 900M Users

At Google I/O 2026, Sundar Pichai declared the agentic Gemini era, announced Gemini 3.5 Flash, reported 900M monthly Gemini app users, detailed a split eighth-generation TPU strategy with 3x training compute, and cut AI Ultra pricing from $250 to $200 per month.

Sundar Pichai at the European Commission, Brussels, May 2023. Photo by Lukasz Kobus (European Commission), via Wikimedia Commons (CC BY 4.0)

Google's Gemini app has passed 900 million monthly active users, more than double the 400 million reported a year earlier, Sundar Pichai announced at Google I/O on 19 May 2026. AI Mode in Search separately surpassed one billion monthly active users, according to Pichai's keynote blog post.

Gemini 3.5 Flash and the pivot from chat to action

Pichai opened with a statement that functioned as a mission update: "It's clear we're firmly in our agentic Gemini era." The phrase reframes what Google is asking consumers to expect from its AI products. Gemini 3.5 Flash, the first model released that day across the Gemini app, Google Search, and the Gemini API, is described by Google as "our first in a series of models combining frontier intelligence with action," a deliberate shift away from question-answering toward task completion, per the official keynote post.

The model surpasses Gemini 3.1 Pro across coding, agentic, and multimodal benchmarks while running at four times the output-token speed of other frontier models, CNBC reported. Gemini 3.5 Pro, the heavier sibling, is in testing and due in June 2026. Two additional releases complete the I/O slate: Gemini Omni, an any-to-any model that accepts and emits image, audio, video, and text; and Gemini Spark, an agentic assistant initially exclusive to AI Ultra subscribers, per 9to5Google.

The user-growth data gives these releases urgency. Daily request volume on Gemini grew sevenfold year-on-year, per Pichai's keynote. AI Mode in Search, one year since launch, hit one billion MAU with queries "more than doubling every quarter." The scale of that distribution is what separates Google's position from every pure-play AI lab. For additional context on how Google builds agent systems, see our coverage of DeepMind's agent infrastructure.

Gemini app monthly active users more than doubled in a year; Search AI Mode crossed one billion. Source: Sundar Pichai's Google I/O 2026 keynote.
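
As a rough check on the growth figures above, the short Python sketch below works through the arithmetic. The MAU numbers are the ones quoted from the keynote; the four-quarter horizon applied to the "more than doubling every quarter" remark is our assumption, not a figure Google gave, so the result is an illustrative lower bound only.

```python
# Monthly active users for the Gemini app, as quoted in the keynote.
gemini_mau_2025 = 400_000_000
gemini_mau_2026 = 900_000_000

# Year-on-year growth expressed as a multiple and as a percentage.
growth_multiple = gemini_mau_2026 / gemini_mau_2025
growth_percent = (growth_multiple - 1) * 100

# ASSUMPTION (ours, not Google's): AI Mode queries doubled in each of
# four consecutive quarters. "More than doubling" makes this a floor.
assumed_quarters = 4
min_annual_query_multiple = 2 ** assumed_quarters

print(f"Gemini app MAU growth: {growth_multiple:.2f}x ({growth_percent:.0f}%)")
print(f"AI Mode query growth over {assumed_quarters} quarters: "
      f"at least {min_annual_query_multiple}x")
```

On these inputs the app's user base grew 2.25 times, while sustained quarterly doubling would compound to at least a sixteenfold rise in queries over a year.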

Silicon as strategy: the TPU 8t and 8i split

The most structurally significant announcement at I/O was not a model but a chip architecture decision first disclosed at Google Cloud Next in April and reinforced on stage. Google's eighth-generation TPU family divides into two purpose-built designs: the TPU 8t for training and the TPU 8i for inference. The split is a public signal that Google now regards training compute and inference compute as different economic problems requiring different silicon, per the Google infrastructure blog.

TPU 8t scales to 9,600 chips in a single superpod with two petabytes of shared high-bandwidth memory, achieving three times the raw compute of the previous-generation chip at up to twice the performance per watt, Bloomberg reported in April. TPU 8i, the inference sibling, connects 1,152 chips per pod with three times the on-chip SRAM; Google notes the design prioritises latency because "interactions between agents at scale magnify even small inefficiencies." Both chips target general availability later in 2026.
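
To put the pod figures in perspective, here is a back-of-envelope sketch in Python. It assumes decimal units (1 PB = 10^15 bytes) and an even spread of the shared memory across chips; neither assumption comes from Google, and the actual per-chip memory may differ.

```python
# ASSUMPTION: decimal units; Google's post may use binary units instead.
PETABYTE = 10**15
GIGABYTE = 10**9

# Figures quoted in the article.
tpu_8t_chips_per_superpod = 9_600
tpu_8t_shared_hbm_bytes = 2 * PETABYTE
tpu_8i_chips_per_pod = 1_152

# ASSUMPTION: shared HBM is divided evenly across every chip in the superpod.
hbm_per_chip_gb = tpu_8t_shared_hbm_bytes / tpu_8t_chips_per_superpod / GIGABYTE

# How many times larger a training superpod is than an inference pod, by chip count.
pod_size_ratio = tpu_8t_chips_per_superpod / tpu_8i_chips_per_pod

print(f"Implied HBM per TPU 8t chip: {hbm_per_chip_gb:.1f} GB")
print(f"Training superpod vs inference pod: {pod_size_ratio:.2f}x the chips")
```

Under those assumptions each TPU 8t chip would hold a little over 200 GB of high-bandwidth memory, and a training superpod would contain roughly eight times as many chips as an inference pod.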

This is Pichai's most direct structural answer to Nvidia's dominance in AI compute. Proprietary training and inference silicon lets Google price Gemini API access below the cost achievable by cloud providers reliant on third-party GPUs, a margin advantage that compounds as agent-call volumes grow. For context on the broader capital flows reshaping AI infrastructure, see our piece on the inference-layer funding wave of May 2026.

TPU 8t scales to 9,600 chips in a training superpod; TPU 8i prioritises inference latency with 1,152 chips per pod. Source: Google infrastructure blog; Bloomberg.

Products and pricing: Android Halo, Gemini Intelligence, and the restructured AI Ultra tiers

Pichai used I/O to show how Gemini Intelligence, an agentic layer embedded into Android, ChromeOS, Wear OS, and Android Auto, shifts the product question from "what can the chatbot answer" to "what can the agent complete while the phone is in a pocket." Android Halo, a persistent agent interface on Android phones, is due in summer 2026 and is designed to run Gemini Spark-class tasks in the background, including booking, research, and summarisation, without requiring the user to open an app, per 9to5Google. Android XR audio glasses with iPhone support follow in autumn 2026.

The pricing restructure signals confidence in the consumer AI subscription market. AI Ultra now costs $200 per month, down from $250 (a 20% cut), and a new $100-per-month tier sits below it; Google's stated aim is to align price points with direct competitors in the professional-user segment, per CNBC. Cutting the top tier by $50 while adding a cheaper tier suggests Google sees the AI subscription customer base as elastic enough to grow significantly at lower prices, and that it would rather pursue that growth than protect existing revenue at the margin.

Google cut AI Ultra from $250 to $200 per month and added a $100 tier below it. Source: CNBC.

What it means

Google I/O 2026 is the clearest public statement Pichai has made about where he believes the AI value chain settles. The company is betting that the combination of proprietary training silicon, proprietary inference silicon, a billion-user Search distribution layer, and a 900-million-user Gemini app creates a cost and reach position that pure-play AI labs cannot replicate. The word "agentic" ran throughout the keynote not as marketing but as a structural claim: that the next phase of AI revenue comes from background task completion, not from conversations that users must initiate. For Pichai, I/O 2026 was less a product launch than a statement of the full-stack position Google intends to hold.


© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.