Uber's Smart Database Overload Fix

Uber evolved its database overload protection from static rate-limiting to an intelligent, priority-aware system called Cinnamon, enhancing stability and user experience.

Figure: An overview of Uber's approach to intelligent load management for its critical database systems. Source: Uber Engineering

Uber's infrastructure, which supports over 170 million monthly users, relies on in-house databases such as Docstore and Schemaless. These systems, built on MySQL, handle petabytes of data and tens of millions of requests per second. At this scale, minor overloads can cascade into system-wide failures. Ensuring fairness in a multitenant environment, where no single tenant should hog resources, adds another layer of complexity. This engineering challenge led Uber to develop an intelligent load management system.

Visual TL;DR. Uber's massive scale creates database overload risk, which led to static rate limiting. The limitations of static rate limiting revealed concurrency as the real load signal, which informed early load management. Early load management evolved into the Cinnamon system, which resulted in enhanced stability, fairness, and a better user experience.

  1. Massive Uber Scale: supporting over 170 million monthly users and petabytes of data
  2. Database Overload Risk: minor overloads can cascade, leading to system-wide failures
  3. Static Rate Limiting: quota-based limits proved problematic and imprecise
  4. Concurrency Load Signal: real load signal is concurrency, not just request count
  5. Early Load Management: CoDel and Scorecard techniques were initial steps
  6. Cinnamon System: intelligent, priority-aware load management system
  7. Enhanced Stability: improved system stability and reliability
  8. Fairness & User Exp: ensuring fairness and better user experience

From Static Limits to Dynamic Control

Initially, Uber experimented with quota-based rate limiting within its query engine. This involved assigning capacity costs to requests and enforcing fixed quotas. However, this approach proved problematic. It added complexity with external Redis calls and failed to accurately reflect the actual load on storage partitions. The cost model was imprecise, treating heavy and light queries similarly, and static quotas required constant stakeholder adjustments.

A key insight emerged: overload management needed to sit closer to the storage nodes themselves. The focus shifted to identifying signals more reliable than a simple queries-per-second (QPS) count.

Concurrency: The Real Load Signal

Concurrency, the number of operations in flight, proved a more dependable indicator of system load. It directly correlates with resource usage, following Little's Law (Concurrency = Throughput × Latency). This metric became central to detecting overload.
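A small worked example of Little's Law shows why request count alone can hide an overload. The throughput and latency figures below are hypothetical, chosen only for illustration.

```python
def concurrency(throughput_rps: float, latency_s: float) -> float:
    """Little's Law: average operations in flight = throughput x latency."""
    return throughput_rps * latency_s


# Hypothetical storage node: the request rate is unchanged,
# but latency grows tenfold as the node becomes overloaded.
healthy = concurrency(throughput_rps=2000, latency_s=0.005)
degraded = concurrency(throughput_rps=2000, latency_s=0.050)

print(f"healthy:  {healthy:.0f} operations in flight")
print(f"degraded: {degraded:.0f} operations in flight")
```

A QPS-based limit sees the same 2,000 requests per second in both cases, while the in-flight count rises tenfold, which is why concurrency tracks resource pressure more directly.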

Balancing Resilience and Fairness

A core challenge in multitenant systems is balancing overall system resilience with per-tenant fairness. The system needed to shed low-priority traffic during global overload while also preventing individual 'noisy neighbors' from monopolizing resources.

Early Load Management: CoDel and Scorecard

The first iteration used CoDel (Controlled Delay) queues, a technique adapted from networking that manages how long requests wait rather than just how long the queue is. Separate queues were implemented for reads, writes, and slow operations. Under pressure, the queues switched to adaptive LIFO (Last-In, First-Out) behavior, favoring newer, more relevant requests over stale ones so that stale work fails fast and less effort is wasted.
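The queueing behavior described above can be sketched roughly as follows. This is a simplified illustration rather than Uber's implementation: real CoDel tracks the minimum sojourn time over an interval, whereas this sketch only looks at the oldest queued request. The class name `AdaptiveQueue` and the parameters `target_delay` and `max_delay` are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    enqueued_at: float


class AdaptiveQueue:
    """CoDel-inspired queue with adaptive LIFO (illustrative sketch only)."""

    def __init__(self, target_delay: float, max_delay: float):
        self.target_delay = target_delay  # wait above this means "under pressure"
        self.max_delay = max_delay        # requests waiting longer are shed as stale
        self.items = deque()
        self.dropped = []

    def put(self, name: str, now: float) -> None:
        self.items.append(Request(name, now))

    def get(self, now: float):
        # Shed stale work first; the oldest request sits at the left end.
        while self.items and now - self.items[0].enqueued_at > self.max_delay:
            self.dropped.append(self.items.popleft().name)
        if not self.items:
            return None
        under_pressure = now - self.items[0].enqueued_at > self.target_delay
        # FIFO when healthy, LIFO under pressure so fresh requests go first.
        req = self.items.pop() if under_pressure else self.items.popleft()
        return req.name


q = AdaptiveQueue(target_delay=0.1, max_delay=1.0)
q.put("a", now=0.00)
q.put("b", now=0.05)
first = q.get(now=0.06)   # queue is healthy, so the oldest request is served
q.put("c", now=0.07)
second = q.get(now=0.50)  # "b" has waited too long, so the newest is served
third = q.get(now=2.00)   # "b" is now stale and gets shed
```

Passing the clock in as `now` keeps the sketch deterministic; a production queue would read a monotonic clock instead.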

To enforce per-tenant limits, the Scorecard engine was introduced. This rule-based system acts as a lightweight quota mechanism, preventing any single tenant from dominating shared infrastructure even under normal conditions. It aids in incident containment by isolating and capping misbehaving tenants without affecting others.
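A per-tenant cap of this kind might look like the sketch below. The article does not describe Scorecard's rule format, so the choice of a per-tenant in-flight limit with optional overrides, and the names `TenantScorecard`, `try_acquire`, and `release`, are assumptions for illustration.

```python
from collections import defaultdict


class TenantScorecard:
    """Caps in-flight requests per tenant (hypothetical rule shape)."""

    def __init__(self, default_limit: int, overrides=None):
        self.default_limit = default_limit
        self.overrides = overrides or {}   # tenant name -> custom limit
        self.inflight = defaultdict(int)

    def try_acquire(self, tenant: str) -> bool:
        limit = self.overrides.get(tenant, self.default_limit)
        if self.inflight[tenant] >= limit:
            return False                   # this tenant is capped; others are unaffected
        self.inflight[tenant] += 1
        return True

    def release(self, tenant: str) -> None:
        if self.inflight[tenant] > 0:
            self.inflight[tenant] -= 1


scorecard = TenantScorecard(default_limit=2, overrides={"batch-export": 1})
results = [scorecard.try_acquire("rides") for _ in range(3)]
other_tenant_ok = scorecard.try_acquire("eats")
```

Because each tenant has its own counter, a tenant that hits its cap is rejected without reducing the capacity available to anyone else.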

Additional plug-in regulators, such as write byte limits and partition key throttling, were added to address subtler forms of overload not captured by concurrency alone. These regulators guard against issues like large write payloads or traffic skewed to specific partition keys.
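One way to picture plug-in regulators is as a chain of independent admission checks, as in this sketch. The request shape (a dict with `partition_key` and `write_bytes`), the per-window counter, and all names here are hypothetical; the article does not describe the actual interface.

```python
from collections import Counter


def write_bytes_regulator(max_bytes: int):
    """Rejects requests whose write payload exceeds max_bytes."""
    def check(request: dict) -> bool:
        return request.get("write_bytes", 0) <= max_bytes
    return check


class PartitionKeyRegulator:
    """Rejects requests once a partition key exceeds its per-window budget."""

    def __init__(self, max_per_window: int):
        self.max_per_window = max_per_window
        self.counts = Counter()

    def __call__(self, request: dict) -> bool:
        key = request["partition_key"]
        if self.counts[key] >= self.max_per_window:
            return False
        self.counts[key] += 1
        return True

    def reset_window(self) -> None:
        self.counts.clear()


def admit(request: dict, regulators) -> bool:
    # Every regulator must approve; all() stops at the first rejection.
    return all(regulator(request) for regulator in regulators)


hot_keys = PartitionKeyRegulator(max_per_window=2)
regulators = [write_bytes_regulator(max_bytes=1024), hot_keys]

small = {"partition_key": "user-1", "write_bytes": 100}
large = {"partition_key": "user-2", "write_bytes": 5000}
decisions = [admit(small, regulators) for _ in range(3)]
large_admitted = admit(large, regulators)
```

Keeping each check behind the same callable interface means a new regulator can be added to the list without touching the others.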

Limitations and the Evolution to Cinnamon

Despite these improvements, the CoDel-based system had limitations. It treated all requests equally, dropping low-priority and user-facing traffic alike, which hurt customer experience. Fixed queue timeouts and static concurrency limits required frequent manual tuning, creating operational toil and contributing to a 'thundering herd' problem on retries.

This highlighted the need for dynamic and priority-aware load shedding. The solution came with Cinnamon, a priority-aware load shedder developed by Uber's Delivery team. Cinnamon considers request rank (priority) and dynamic system state to make smarter shedding decisions. Critical user-facing traffic (Tier 1) is protected over lower-priority asynchronous jobs (Tier 5).
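The idea of shedding by priority can be illustrated with a bounded queue that evicts its least important entry when full. This is a minimal sketch, not Cinnamon's actual algorithm; the class `PriorityShedder`, its `capacity` parameter, and the request names are assumptions. Tier numbers follow the article, where Tier 1 is most critical and Tier 5 least.

```python
import heapq
import itertools


class PriorityShedder:
    """Bounded queue that sheds the lowest-priority request when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # Heap ordered by (-tier, arrival), so the least important
        # request (highest tier number) is always at index 0.
        self._heap = []
        self._seq = itertools.count()

    def offer(self, name: str, tier: int):
        """Queue a request; return the name of whichever request is shed, if any."""
        entry = (-tier, next(self._seq), name)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return None
        worst = self._heap[0]
        if -worst[0] > tier:
            # Incoming request outranks the least important queued one.
            heapq.heapreplace(self._heap, entry)
            return worst[2]
        return name  # incoming request is no more important than anything queued

    def queued(self):
        return sorted(name for _, _, name in self._heap)


shedder = PriorityShedder(capacity=2)
shedder.offer("async-job-1", tier=5)
shedder.offer("rider-request-1", tier=1)
shed_a = shedder.offer("rider-request-2", tier=1)  # evicts the Tier 5 job
shed_b = shedder.offer("async-job-2", tier=5)      # rejected on arrival
```

Under overload, the Tier 5 jobs are the ones shed while both Tier 1 requests stay queued.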

Cinnamon simplified the queue structure by marking long-running operations with lower priority, rather than using separate queues. This priority-aware shedding ensures that during overloads, critical user-facing requests are protected with minimal impact. Furthermore, Cinnamon's Auto Tuner dynamically adjusts queue timeout thresholds and inflight limits using P90 latency and error rate signals, eliminating manual tuning and providing more nuanced load absorption than CoDel's static approach.
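A tuner driven by P90 latency and error rate could be sketched as below. The article does not specify the adjustment rule, so the additive-increase, multiplicative-decrease policy used here is a stand-in, and the function names, thresholds, and bounds are all hypothetical.

```python
def p90(samples):
    """Nearest-rank 90th percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = (9 * len(ordered) + 9) // 10   # integer form of ceil(0.9 * n)
    return ordered[rank - 1]


def tune_inflight_limit(limit, latencies, errors, total, *,
                        target_p90, max_error_rate,
                        min_limit=1, max_limit=1000):
    """Return the next in-flight limit (assumed AIMD policy, not Cinnamon's)."""
    error_rate = errors / total if total else 0.0
    overloaded = p90(latencies) > target_p90 or error_rate > max_error_rate
    if overloaded:
        new_limit = limit * 4 // 5        # back off by 20 percent
    else:
        new_limit = limit + 1             # probe gently for more capacity
    return max(min_limit, min(max_limit, new_limit))


healthy_window = [0.01] * 10
slow_window = [0.01] * 8 + [0.2] * 2

grown = tune_inflight_limit(100, healthy_window, errors=0, total=10,
                            target_p90=0.05, max_error_rate=0.01)
shrunk = tune_inflight_limit(100, slow_window, errors=0, total=10,
                             target_p90=0.05, max_error_rate=0.01)
```

The same feedback loop could adjust queue timeout thresholds; the point of the sketch is only that limits follow observed signals instead of being hand-set.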

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.