Uber's Hybrid Core Allocation

Uber Engineering's hybrid core allocation system blends dedicated and shared CPUs for better efficiency and reliability.

May 28 at 12:16 AM · 9 min read

Illustration of NUMA-aware allocation of dedicated and shared CPU cores on a host. Credit: Uber Engineering

Visual TL;DR: Old CPU Allocation → Odin Container Orchestration → Hybrid Core Allocation. Hybrid Core Allocation branches into Dedicated Cores, Shared Core Pool, NUMA Considerations, and Vertical Scaling Logic. The Shared Core Pool relies on Linux cpu.shares. Outcomes: Improved Efficiency (via dedicated cores) and Handles Bursty Workloads (via the shared pool).

Old CPU Allocation: strict dedicated cores, one-minute averages insufficient for bursty workloads
Odin Container Orchestration: Uber's system for managing containerized applications and their resources
Hybrid Core Allocation: blends dedicated and shared CPUs for better efficiency and reliability
Dedicated Cores: guaranteed CPU resources for critical workloads
Shared Core Pool: pooled per host, over-allocated using a defined ratio
Linux cpu.shares: dynamically distributes shared CPU time based on allocation
Improved Efficiency: better utilization of CPU resources across hosts
Handles Bursty Workloads: adapts to dynamic, high-demand CPU patterns more effectively
NUMA Considerations: optimizing allocation across Non-Uniform Memory Access architectures
Vertical Scaling Logic: refining how CPU scaling decisions are made


Uber's engineering team has introduced hybrid core allocation in Odin, its container orchestration system. The change moves away from strictly dedicated CPU assignments toward a more flexible model designed to handle bursty workloads.

Historically, Uber relied on a vertical CPU scaler that made two assumptions: that CPU usage could be gauged by one-minute averages, and that allocated cores were exclusively dedicated. These assumptions proved insufficient for dynamic, high-demand CPU patterns. The new system, detailed on the Uber Engineering blog, adds shared core allocation alongside dedicated resources, a middle ground between fully dedicated and fully shared CPUs.

From Dedicated to Shared

The hybrid model assigns each workload a set of guaranteed dedicated cores plus an optional allocation from a pool of shared cores. Shared cores are pooled per host and can be over-allocated using a defined ratio. Linux's cpu.shares mechanism distributes shared CPU time in proportion to allocation size, so that contended CPU time is split according to what each workload was assigned.
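
To make the over-allocation idea concrete, here is a minimal Python sketch of a per-host shared pool that accepts assignments up to its physical size multiplied by a ratio. The class name SharedPool, its fields, and the ratio of 2.0 are illustrative assumptions, not details taken from Uber's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SharedPool:
    """Hypothetical model of one host's shared core pool."""

    physical_cores: int
    overallocation_ratio: float  # assumed value, e.g. 2.0; the real ratio is not given
    assigned: dict[str, float] = field(default_factory=dict)

    @property
    def capacity(self) -> float:
        # Shared CPUs that may be promised to workloads in total.
        return self.physical_cores * self.overallocation_ratio

    def try_assign(self, workload: str, shared_cpus: float) -> bool:
        # Ignore the workload's previous assignment so a resize is not counted twice.
        used = sum(v for k, v in self.assigned.items() if k != workload)
        if used + shared_cpus > self.capacity:
            return False
        self.assigned[workload] = shared_cpus
        return True
```

With 8 physical cores and a ratio of 2.0, the pool would accept up to 16 shared CPUs in total, on the expectation that workloads rarely burst at the same time.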

The system's vertical scaler has been upgraded to calculate optimal allocations of both dedicated and shared cores for each workload. An empirical study using Linux cpuset.cpus and cpu.shares found that the Linux scheduler generally balances workloads effectively, but that the actual ratio of CPU time can become skewed during severe contention. The formula used for shares is round((dedicated_CPUs + shared_CPUs_for_contention) × 100), so each CPU is worth 100 shares and fractional CPUs can be expressed in hundredths.
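
The shares formula above can be written out directly. The sketch below also shows the idealized split that proportional shares imply when every workload is busy at once; the function names are hypothetical, and the fractions are a model only, since the study noted that the real ratio can skew under severe contention.

```python
def cpu_shares(dedicated_cpus: float, shared_cpus_for_contention: float) -> int:
    """Apply the article's formula: 100 shares per CPU, rounded to an integer."""
    return round((dedicated_cpus + shared_cpus_for_contention) * 100)


def ideal_contended_fractions(shares: dict[str, int]) -> dict[str, float]:
    """Fraction of contended CPU time each workload would get in the ideal case."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Example: a workload with 4 dedicated CPUs and 1.5 shared CPUs for contention.
example_shares = cpu_shares(4, 1.5)
```

Only the relative size of the values matters: a workload holding three times the shares of its neighbor is entitled to roughly three times the CPU time when both compete for the same cores.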

NUMA Considerations and Allocation Strategy

Efficient memory access is critical on multi-socket systems with Non-Uniform Memory Access (NUMA). Uber observed that the Linux scheduler typically prioritizes CPUs closest to a process's memory. While shared libraries can complicate this, the kernel generally optimizes for local memory usage, reducing the need for explicit NUMA pinning.

Uber's strategy allocates a workload's dedicated cores within a single NUMA node while distributing the shared pool across multiple nodes. This configuration balances workload performance with host agent efficiency. The odin-agent manages the distribution: it aims to spread shared cores evenly across nodes and, when cores are left over, gives the additional shared cores to the nodes carrying the heavier dedicated CPU load. Dedicated cores are kept on the same NUMA node as their associated shared cores to minimize latency.
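
The placement rules described above might look roughly like the following Python sketch. It is not odin-agent code: the function names, the best-fit choice of node for dedicated cores, and the tie-breaking by node id are all assumptions made for illustration.

```python
def place_dedicated(free_by_node: dict[int, list[int]], count: int) -> tuple[int, list[int]]:
    """Take `count` free cores from a single NUMA node; never span nodes."""
    candidates = [n for n, cores in free_by_node.items() if len(cores) >= count]
    if not candidates:
        raise ValueError("no single NUMA node has enough free cores")
    # Assumed policy: best fit, i.e. the smallest node that still has room.
    node = min(candidates, key=lambda n: (len(free_by_node[n]), n))
    taken = free_by_node[node][:count]
    free_by_node[node] = free_by_node[node][count:]
    return node, taken


def split_shared_pool(pool_size: int, dedicated_load: dict[int, int]) -> dict[int, int]:
    """Spread shared cores evenly; leftover cores go to the nodes with more dedicated cores."""
    nodes = sorted(dedicated_load, key=lambda n: (-dedicated_load[n], n))
    base, extra = divmod(pool_size, len(nodes))
    return {n: base + (1 if i < extra else 0) for i, n in enumerate(nodes)}
```

For a two-node host with 3 and 5 dedicated cores in use, a 5-core shared pool would be split 2 and 3, with the extra core landing on the node that carries more dedicated load.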

A key feature is in-place vertical scaling at the host level, which is crucial for stateful fleets with local disks. Resources can be adjusted on the fly without the costly and disruptive process of moving entire workloads, which improves fleet reliability.
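
As a rough illustration of what an in-place resize involves, the sketch below rewrites the two cgroup files the article names, cpuset.cpus and cpu.shares, for an already running workload. The function names and the two directory parameters are hypothetical; the sketch assumes a cgroup v1 layout where the cpuset and cpu controllers live in separate hierarchies, and it says nothing about how Odin actually performs the update.

```python
from pathlib import Path


def format_cpuset(cores: list[int]) -> str:
    """Render core ids in the kernel's list format, e.g. [0, 1, 2, 5] -> '0-2,5'."""
    ranges: list[list[int]] = []
    for core in sorted(set(cores)):
        if ranges and core == ranges[-1][1] + 1:
            ranges[-1][1] = core  # extend the current run of consecutive ids
        else:
            ranges.append([core, core])
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def resize_in_place(cpuset_dir: Path, cpu_dir: Path, cores: list[int], shares: int) -> None:
    """Write the new core list and share weight; the workload keeps running."""
    (cpuset_dir / "cpuset.cpus").write_text(format_cpuset(cores))
    (cpu_dir / "cpu.shares").write_text(str(shares))
```

Because only control files change, the process and its local disk state stay where they are, which is the property that matters for stateful fleets.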

Vertical Scaling Logic

The scaler sizes dedicated cores from a workload's average CPU usage and sizes shared cores from the deviation around that average, since shared cores are reserved for occasional bursts. The shared pool on a host is typically over-allocated: the sum of shared CPUs assigned to individual workloads exceeds the number of cores physically in the pool. During contention, each workload's cpu.shares value is set in proportion to its allocation so that CPU time is divided fairly.
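
One way to read this sizing rule is sketched below in Python. The article does not give the exact statistics, so the choices here are assumptions: dedicated cores are the mean usage rounded up to whole cores, and shared cores are a hypothetical multiplier (burst_factor) times the population standard deviation of the usage samples.

```python
import math
import statistics


def size_allocation(usage_samples: list[float], burst_factor: float = 2.0) -> tuple[int, float]:
    """Return (dedicated_cores, shared_cpus) for one workload.

    usage_samples: CPU usage in cores, sampled at a fine enough interval to see bursts.
    burst_factor: assumed tuning knob; how many deviations of headroom to request.
    """
    mean_usage = statistics.mean(usage_samples)
    deviation = statistics.pstdev(usage_samples)
    dedicated = math.ceil(mean_usage)            # steady load gets guaranteed whole cores
    shared = round(burst_factor * deviation, 2)  # bursts are served from the shared pool
    return dedicated, shared


# Example: a workload that alternates between 2 and 4 cores of usage.
dedicated, shared = size_allocation([2.0, 2.0, 4.0, 4.0])
```

A steady workload ends up with almost no shared allocation, while a spiky one gets a smaller dedicated footprint plus burst headroom that other workloads can use when it is idle.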

Bridging Kubernetes Gaps

To align with cloud-native patterns, Uber extracted its cgroups management code into a standalone library. This library is then wrapped into a Kubernetes CRI plugin, enabling Odin to adopt Kubernetes without sacrificing host-level efficiency, despite Kubernetes' current lack of native support for hybrid core allocation.

Limitations and Future Improvements

Current CPU allocation decisions are made at the host level. Moving this decision-making to a higher-level workload scheduler, with a fleet-wide view, could offer further optimization. The system also faces potential CPU allocation fragmentation over time, which the host agent could address through periodic defragmentation passes.

The hybrid approach aims to use CPU resources efficiently, balance load across cores, and keep workloads stable under varying demand.

© 2026 StartupHub.ai. All rights reserved.
