Google has released a comprehensive analysis detailing the energy, emissions, and water consumption associated with its Gemini AI models during inference. The company's new technical paper, announced on the Google Cloud Blog, aims to provide a more accurate and holistic understanding of AI's environmental impact, revealing figures for Gemini prompts that are substantially lower than many prevailing public estimates.
According to Google's calculations, a median Gemini Apps text prompt consumes approximately 0.24 watt-hours (Wh) of energy, emits 0.03 grams of carbon dioxide equivalent (gCO2e), and uses 0.26 milliliters of water—roughly five drops. These figures, based on a May 2025 analysis, are presented as a more complete picture of operational impact, contrasting with less comprehensive methodologies that often overlook critical factors in real-world AI deployment. The energy impact per prompt is likened to watching television for less than nine seconds, underscoring the efficiency achieved.
The announcement also highlights significant efficiency gains within Google's AI infrastructure. Over a recent 12-month period, the energy footprint of the median Gemini Apps text prompt fell by a factor of 33, while its total carbon footprint fell by a factor of 44. These improvements occurred concurrently with enhancements in response quality, demonstrating that environmental responsibility can align with performance gains. Google attributes them to continuous research, software and hardware optimizations, and its ongoing commitment to carbon-free energy and water replenishment across its data centers.
Unpacking Google's Full-Stack Efficiency Strategy
Google's methodology for measuring AI's environmental footprint is designed to capture the complexities of serving AI at scale. Unlike simpler models that might only consider active machine consumption, Google's approach accounts for full system dynamic power, including actual chip utilization at production scale, which can be lower than theoretical maximums. It also factors in the energy consumed by idle machines provisioned for high availability, the crucial role of host CPUs and RAM, and the significant energy draw from data center overhead like cooling systems and power distribution, measured by Power Usage Effectiveness (PUE). Water consumption for cooling is also integrated into the calculation, acknowledging its direct link to energy use.
This comprehensive view contrasts sharply with what Google describes as "optimistic" scenarios that only consider active TPU and GPU consumption. Such limited calculations, Google states, would estimate a median Gemini text prompt at 0.10 Wh of energy, 0.02 gCO2e, and 0.12 mL of water, substantially underestimating the true operational footprint. By sharing its detailed methodology, Google hopes to foster industry-wide consistency and accuracy in reporting AI's resource consumption.
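The gap between the two estimates can be illustrated with a toy accounting exercise. Google has not published a per-component breakdown, so the overhead fractions below are invented for illustration only; the exercise simply shows how host overhead, provisioned-but-idle capacity, and facility PUE compound on top of the chip-only number.

```python
# Toy per-prompt energy accounting contrasting a chip-only ("optimistic")
# estimate with a full-system estimate that also counts host CPU/RAM,
# idle machines provisioned for availability, and data-center overhead (PUE).
# The overhead fractions are illustrative assumptions, not Google's figures.

def chip_only_wh(accelerator_wh: float) -> float:
    """Optimistic estimate: active TPU/GPU energy only."""
    return accelerator_wh

def full_system_wh(accelerator_wh: float,
                   host_overhead: float,   # host CPU/RAM, as a fraction of chip energy
                   idle_overhead: float,   # provisioned-but-idle capacity, as a fraction
                   pue: float) -> float:   # Power Usage Effectiveness (>= 1.0)
    """Full estimate: total IT energy scaled up by facility overhead (PUE)."""
    it_energy_wh = accelerator_wh * (1 + host_overhead + idle_overhead)
    return it_energy_wh * pue

optimistic = chip_only_wh(0.10)  # the 0.10 Wh chip-only figure from the paper
full = full_system_wh(0.10, host_overhead=0.7, idle_overhead=0.5,
                      pue=1.09)  # fractions invented; PUE is Google's fleet average
print(f"chip-only: {optimistic:.2f} Wh, full-system: {full:.2f} Wh")
```

With these invented fractions the full-system number lands near the reported 0.24 Wh, which is the point of the exercise: the "optimistic" and "comprehensive" figures describe the same prompt, differing only in what overhead they count.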
The substantial efficiency improvements in Gemini are a direct result of Google's "full-stack" approach to AI development. This strategy integrates efficiency at every layer, from custom hardware to model architectures and serving systems. Key innovations include the use of Transformer model architectures, which offer 10-100x efficiency boosts, and inherently efficient structures like Mixture-of-Experts (MoE) models that activate only a subset of a large model for specific queries.
Further optimizations come from refined algorithms like Accurate Quantized Training (AQT) and advanced inference techniques such as speculative decoding and model distillation, which yields smaller, more efficient models like Gemini Flash and Flash-Lite. Google's custom-built TPUs, including the latest Ironwood generation, are co-designed with AI models for maximum performance per watt; Ironwood is 30x more energy-efficient than Google's first publicly available TPU. The company also employs optimized idling strategies, dynamically moving models based on demand, and leverages its ultra-efficient data centers, which boast a fleet-wide average PUE of 1.09. These efforts are complemented by responsible data center operations focused on 24/7 carbon-free energy and a 120% water replenishment goal.
Google acknowledges that while significant progress has been made, the demand for AI is growing, necessitating continued investment in reducing power provisioning costs and water consumption per prompt. The company's commitment to sharing its findings and methodology underscores a broader ambition to drive industry-wide progress toward more efficient and responsible AI development.

