LinkedIn Overhauls Log Storage

LinkedIn is replacing Kafka with Northguard, a new log storage system, and Xinfra, a virtualized Pub/Sub layer, to improve scalability and operability for its more than 1.2 billion members.

Diagram illustrating the Northguard data model with topics, ranges, and segments.
A look at how Northguard structures data for scalable log storage. (Image: LinkedIn Engineering)

LinkedIn is phasing out its long-standing Kafka infrastructure for a new homegrown system dubbed Northguard. This overhaul aims to address the exponential growth in data volume and complexity since Kafka was first implemented 15 years ago. The move is detailed in a recent LinkedIn Engineering post.

Visual TL;DR. LinkedIn's growth led to challenges with Kafka, which in turn necessitated Northguard. Northguard integrates with the Xinfra layer, and both contribute to improved scalability. Northguard also provides enhanced operability and supports data reprocessing.

  1. LinkedIn's Growth: over 1.2 billion members and exponential data volume increase
  2. Kafka Challenges: metadata bottlenecks, cluster size, load balancing difficulties
  3. Introducing Northguard: LinkedIn's new homegrown log storage system
  4. Xinfra Layer: part of the new log storage infrastructure
  5. Improved Scalability: handling 32 trillion records daily across 17 petabytes
  6. Enhanced Operability: addressing challenges of the previous Kafka system
  7. Data Reprocessing: crucial for debugging and verification of data streams

Data streams are fundamental to LinkedIn's thousands of services, which subscribe to and process information published by other services. The ability to reprocess that data is crucial for debugging and verification. Kafka, developed by LinkedIn a decade and a half ago, became the backbone for these streams, exposing each one as an ordered sequence of records known as a log and supporting everything from user activity to AI features.


However, scaling Kafka to serve over 1.2 billion members has proven increasingly challenging. The platform now handles 32 trillion records, or about 17 petabytes, per day, spread across more than 10,000 machines. Key issues include metadata and cluster-size bottlenecks, difficulty balancing load, and a design that trades consistency for availability.
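To put those figures in perspective, here is a rough back-of-envelope calculation. It assumes both totals are per-day figures, that petabytes are decimal, and that the fleet is exactly 10,000 machines with load spread perfectly evenly; none of these assumptions comes from LinkedIn, so treat the results as orders of magnitude only.

```python
# Back-of-envelope rates derived from the figures quoted in the article.
# Assumptions (not from LinkedIn): per-day totals, decimal petabytes,
# exactly 10,000 machines, perfectly even load.
RECORDS_PER_DAY = 32 * 10**12
BYTES_PER_DAY = 17 * 10**15
MACHINES = 10_000
SECONDS_PER_DAY = 24 * 60 * 60

records_per_second = RECORDS_PER_DAY / SECONDS_PER_DAY
bytes_per_record = BYTES_PER_DAY / RECORDS_PER_DAY
records_per_second_per_machine = records_per_second / MACHINES

print(f"{records_per_second:,.0f} records/s fleet-wide")
print(f"{bytes_per_record:.2f} bytes per record on average")
print(f"{records_per_second_per_machine:,.0f} records/s per machine")
```

Under these assumptions the fleet absorbs roughly 370 million records per second, with an average record of a little over 500 bytes.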

Introducing Northguard

Northguard is engineered for high scalability and operability. It shards both data and metadata, minimizes global state, and uses a decentralized group membership protocol. Log striping is central to its design: rather than keeping a whole log on one set of machines, Northguard breaks it into smaller replicated chunks called segments, which spreads load evenly across the cluster.
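The effect of striping can be illustrated with a minimal sketch. The rotating placement policy, the function name `place_segments`, and the broker names are all hypothetical; the article does not describe how Northguard actually chooses replicas.

```python
from collections import Counter

def place_segments(num_segments, brokers, replication_factor=3):
    """Give each segment its own replica set by rotating the starting broker.

    This is an illustrative policy, not Northguard's real placement logic.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds broker count")
    placement = []
    for seg in range(num_segments):
        replicas = [
            brokers[(seg + offset) % len(brokers)]
            for offset in range(replication_factor)
        ]
        placement.append(replicas)
    return placement

brokers = [f"broker-{i}" for i in range(6)]
placement = place_segments(12, brokers)

# Count how many segment replicas each broker holds.
load = Counter(broker for replicas in placement for broker in replicas)
for broker in brokers:
    print(broker, load[broker])
```

With 12 segments, 3 replicas each, and 6 brokers, every broker ends up holding the same number of replicas. Without striping, the same log would sit entirely on a single set of three brokers.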

The system's data model consists of records, segments (sequences of records, which serve as the unit of replication), ranges (contiguous slices of a topic's keyspace, each stored as a sequence of segments), and topics (named collections of ranges that together cover the keyspace). Because replication happens per segment rather than per log, placement and recovery can be handled in small units.
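The four levels of the data model might be sketched as plain data classes. The field names, the 16-bit keyspace, and the use of SHA-256 to map a key to a point in the keyspace are assumptions made for illustration; only the record/segment/range/topic hierarchy comes from the description above.

```python
import hashlib
from dataclasses import dataclass, field

KEYSPACE = 2**16  # hypothetical keyspace size, chosen for readability

@dataclass
class Record:
    key: bytes
    value: bytes

@dataclass
class Segment:
    replicas: list          # brokers holding a copy (unit of replication)
    records: list = field(default_factory=list)

@dataclass
class Range:
    start: int              # inclusive lower bound in the keyspace
    end: int                # exclusive upper bound in the keyspace
    segments: list = field(default_factory=list)

    def covers(self, point):
        return self.start <= point < self.end

@dataclass
class Topic:
    name: str
    ranges: list

    def range_for(self, key):
        # Assumed mapping: first two bytes of SHA-256 give a point in 0..65535.
        point = int.from_bytes(hashlib.sha256(key).digest()[:2], "big")
        for rng in self.ranges:
            if rng.covers(point):
                return rng
        raise LookupError("ranges do not cover the keyspace")

    def append(self, record):
        rng = self.range_for(record.key)
        rng.segments[-1].records.append(record)  # last segment is the active one
        return rng

half = KEYSPACE // 2
topic = Topic("page-views", [
    Range(0, half, [Segment(["broker-0", "broker-1", "broker-2"])]),
    Range(half, KEYSPACE, [Segment(["broker-3", "broker-4", "broker-5"])]),
])
for member_id in range(100):
    topic.append(Record(str(member_id).encode(), b"viewed"))
```

Records with the same key always land in the same range, which preserves per-key ordering while different ranges are served by different machines.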

Northguard's metadata is managed by vnodes, which are fault-tolerant replicated state machines built on Raft. A coordinator, the leader of a vnode, handles metadata operations and persists state, so that a new leader can take over seamlessly. Metadata is sharded across vnodes using consistent hashing, an arrangement LinkedIn calls a Dynamically-Sharded Replicated State Machine (DS-RSM), which prevents hotspots.
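Consistent hashing in general can be sketched as a hash ring, as below. This is the textbook technique rather than LinkedIn's DS-RSM implementation, whose details the article does not give; the class name `VnodeRing`, the `points_per_vnode` parameter, and the vnode names are hypothetical.

```python
import bisect
import hashlib

def _hash(value):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

class VnodeRing:
    """Generic consistent-hash ring assigning metadata keys to vnodes."""

    def __init__(self, vnodes, points_per_vnode=64):
        # Each vnode gets several ring positions so ownership is spread evenly.
        self._ring = sorted(
            (_hash(f"{vnode}#{i}"), vnode)
            for vnode in vnodes
            for i in range(points_per_vnode)
        )
        self._positions = [position for position, _ in self._ring]

    def owner(self, metadata_key):
        # The first ring position clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._positions, _hash(metadata_key))
        return self._ring[idx % len(self._ring)][1]

keys = [f"topic-{i}" for i in range(1000)]
before = VnodeRing(["vnode-a", "vnode-b", "vnode-c", "vnode-d"])
after = VnodeRing(["vnode-a", "vnode-b", "vnode-c", "vnode-d", "vnode-e"])

moved = [k for k in keys if before.owner(k) != after.owner(k)]
print(f"{len(moved)} of {len(keys)} keys moved after adding a vnode")
```

The property that matters for operability is that adding a vnode relocates only the keys the new vnode takes over; every other key keeps its owner, so resharding does not disturb the whole cluster.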

The Xinfra Layer

To ease the transition from Kafka, LinkedIn developed Xinfra, a virtualized Pub/Sub layer that sits above both Kafka and Northguard. Because applications talk to this abstraction rather than to a specific cluster, existing applications keep working while they gain Northguard's improved scalability and operability.
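One way to picture such a layer is a facade that routes each topic to a backend and lets operators switch the route without touching application code. The sketch below assumes the layer can route a topic to either system during migration; every class and method name in it (`VirtualPubSub`, `InMemoryBackend`, `migrate`, `produce`) is hypothetical and does not reflect Xinfra's real API.

```python
class InMemoryBackend:
    """Stand-in for a real log system; it just remembers what it was sent."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def produce(self, topic, key, value):
        self.records.append((topic, key, value))

class VirtualPubSub:
    """Hypothetical facade: applications name a topic, never a cluster."""

    def __init__(self, backends, default):
        self._backends = backends   # backend name -> backend object
        self._default = default     # where unmigrated topics live
        self._routes = {}           # topic -> backend name

    def migrate(self, topic, backend_name):
        if backend_name not in self._backends:
            raise KeyError(f"unknown backend: {backend_name}")
        self._routes[topic] = backend_name

    def produce(self, topic, key, value):
        backend_name = self._routes.get(topic, self._default)
        self._backends[backend_name].produce(topic, key, value)

kafka = InMemoryBackend("kafka")
northguard = InMemoryBackend("northguard")
pubsub = VirtualPubSub({"kafka": kafka, "northguard": northguard}, default="kafka")

pubsub.produce("page-views", b"member-1", b"before migration")
pubsub.migrate("page-views", "northguard")
pubsub.produce("page-views", b"member-1", b"after migration")
```

The application calls `produce` the same way before and after the migration; only the routing table changes.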

This evolution reflects LinkedIn's continued investment in distributed log infrastructure, building on what it learned from running Kafka at massive scale. Northguard and Xinfra are meant to ensure that the platform's infrastructure can keep pace with growth in both members and data.

The adoption of Northguard and Xinfra underscores how much modern, data-intensive platforms, including those building AI features, depend on scalable log storage.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.