Uber's massive infrastructure, supporting over 170 million monthly users, relies on in-house databases like Docstore and Schemaless. These systems, built on MySQL, handle petabytes of data and tens of millions of requests per second. At this scale, minor overloads can cascade, leading to system-wide failures. Ensuring fairness in a multitenant environment, where one user shouldn't hog resources, adds another layer of complexity. This engineering challenge led Uber to develop an intelligent load management system.
From Static Limits to Dynamic Control
Initially, Uber experimented with quota-based rate limiting within its query engine. This involved assigning capacity costs to requests and enforcing fixed quotas. However, this approach proved problematic. It added complexity with external Redis calls and failed to accurately reflect the actual load on storage partitions. The cost model was imprecise, treating heavy and light queries similarly, and static quotas required constant stakeholder adjustments.
