LinkedIn Sales Navigator Search Speed Boost

LinkedIn engineers drastically cut Sales Navigator's search data processing time by optimizing its Spark pipeline, enabling faster results for users.

Optimizing the complex data pipeline powering LinkedIn Sales Navigator's search. · LinkedIn Engineering

LinkedIn has overhauled its Sales Navigator search system, cutting down the time it takes to process data and deliver fresh prospect insights. The engineering team detailed how they moved the core data manipulation pipeline from the older MapReduce framework to Apache Spark, implementing a series of targeted optimizations.

Visual TL;DR: Slow search processing was addressed by migrating to Spark, which brought both job-level optimization and a reduction in pipeline complexity. Optimizing the Spark jobs cut execution time; the shorter execution time and the simpler pipeline enabled faster search results, which in turn improved the user experience.

  1. Slow Search Processing: Sales Navigator search data processing time was too long
  2. Migrate to Spark: Moved core data manipulation pipeline from MapReduce to Apache Spark
  3. Optimize Spark Jobs: Implemented targeted optimizations on over 100 individual Spark jobs
  4. Reduce Pipeline Complexity: Tackled complex data manipulation pipeline for critical Sales Navigator features
  5. Faster Search Results: Enabled quicker delivery of updated search results for sales professionals
  6. Cut Execution Time: Reduced total execution time from 6-7 hours down to roughly three hours
  7. Improved User Experience: Drastically cut search data processing time for users

This initiative focused on the search system that underpins critical Sales Navigator features like Lead Search, Relationship Explorer, and Lead Recommendations. The goal was to accelerate the delivery of updated search results, a crucial factor for sales professionals making timely decisions.

The pipeline comprises more than 100 individual Spark jobs, which made it a prime candidate for optimization. Engineers reduced its total execution time from 6-7 hours to roughly three hours.

Under the Hood: Sales Navigator's Search Architecture

The search system operates in three tiers: offline, nearline, and serving. The offline tier, which relies heavily on Spark, handles large-scale batch processing and transforms raw data into immutable base indexes. The nearline tier captures real-time updates and builds a live index that is periodically flushed to disk.

The serving layer then orchestrates query requests, distributing them to various search servers that retrieve and rank results from index shards.
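
The fan-out described above follows a common scatter-gather pattern. The sketch below is only an illustration of that pattern, not LinkedIn's serving code: the shard layout, the scores, and the names `search_shard` and `serve_query` are all assumptions made for the example.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical shard contents: document ID -> relevance score.
Shard = Dict[str, float]


def search_shard(shard: Shard, top_k: int) -> List[Tuple[float, str]]:
    """Return the shard's top_k (score, doc_id) pairs, best first."""
    return heapq.nlargest(top_k, ((score, doc_id) for doc_id, score in shard.items()))


def serve_query(shards: List[Shard], top_k: int) -> List[str]:
    """Fan the query out to every shard, then merge the partial rankings."""
    partial = [hit for shard in shards for hit in search_shard(shard, top_k)]
    return [doc_id for _, doc_id in heapq.nlargest(top_k, partial)]


shards = [
    {"lead-1": 0.91, "lead-2": 0.40},
    {"lead-3": 0.75, "lead-4": 0.88},
    {"lead-5": 0.10},
]
top_leads = serve_query(shards, top_k=3)
print(top_leads)  # ['lead-1', 'lead-4', 'lead-3']
```

Each shard only has to return its own best `top_k` hits, so the merge step stays small no matter how large the individual shards are.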

Tackling Pipeline Complexity

Operating such an extensive data pipeline presents inherent challenges. Complex job dependencies can obscure performance bottlenecks: a slowdown in one job cascades to every job downstream of it and delays the entire workflow.

Additionally, strict resource caps ruled out simply adding more compute power, pushing the team to find more efficient solutions.

Uneven data distribution across jobs, particularly when unioning datasets of vastly different sizes, also created significant performance hurdles.

Strategic Optimization Techniques

The optimization process began with pruning the job graph. By identifying and consolidating jobs with no external dependencies, LinkedIn removed unnecessary intermediate data writes and reads, saving over 30 minutes on one segment alone.
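
One way to picture this pruning step is as fusing linear chains in the job graph. The sketch below is an assumption about how such candidates could be found, not LinkedIn's tooling: the job names are hypothetical, and "no external dependencies" is interpreted here as "the intermediate dataset has exactly one reader, and that reader has no other input".

```python
from typing import Dict, List


def consolidate_chains(consumers: Dict[str, List[str]]) -> List[List[str]]:
    """Group jobs into stages by fusing linear producer -> consumer links.

    `consumers` maps every job to the jobs that read its output. A link is
    fused when the producer has exactly one consumer and that consumer has
    exactly one producer, so nothing else reads the intermediate dataset.
    """
    producers: Dict[str, List[str]] = {job: [] for job in consumers}
    for job, readers in consumers.items():
        for reader in readers:
            producers[reader].append(job)

    def fusable(job: str) -> bool:
        readers = consumers[job]
        return len(readers) == 1 and len(producers[readers[0]]) == 1

    stages = []
    for job in consumers:
        # Skip jobs that are already absorbed into their predecessor's stage.
        if len(producers[job]) == 1 and fusable(producers[job][0]):
            continue
        stage = [job]
        while fusable(stage[-1]):
            stage.append(consumers[stage[-1]][0])
        stages.append(stage)
    return stages


# Hypothetical job graph: extract -> clean -> enrich -> {index, report}
jobs = {
    "extract": ["clean"],
    "clean": ["enrich"],
    "enrich": ["index", "report"],
    "index": [],
    "report": [],
}
stages = consolidate_chains(jobs)
print(stages)  # [['extract', 'clean', 'enrich'], ['index'], ['report']]
```

In this toy graph the first three jobs collapse into one stage, which removes two intermediate writes and the matching reads.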

Focus then shifted to identifying bottlenecks on the critical path of the job execution flow. Optimizing these key jobs directly impacts the overall pipeline duration.
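
The critical path is the longest chain of dependent jobs, measured by runtime. A minimal sketch of how it can be computed on a job graph follows; the job names and durations are invented for illustration and do not come from LinkedIn's pipeline.

```python
from functools import lru_cache
from typing import Dict, List, Tuple


def critical_path(durations: Dict[str, int],
                  upstream: Dict[str, List[str]]) -> Tuple[int, List[str]]:
    """Return (total minutes, job names) for the longest dependency chain."""

    @lru_cache(maxsize=None)
    def finish(job: str) -> Tuple[int, Tuple[str, ...]]:
        # A job can start only after its slowest upstream chain has finished.
        best_time, best_path = 0, ()
        for dep in upstream.get(job, []):
            dep_time, dep_path = finish(dep)
            if dep_time > best_time:
                best_time, best_path = dep_time, dep_path
        return best_time + durations[job], best_path + (job,)

    total, path = max(finish(job) for job in durations)
    return total, list(path)


# Hypothetical runtimes in minutes.
durations = {"ingest": 30, "join_profiles": 120, "score": 45, "build_index": 60}
upstream = {
    "join_profiles": ["ingest"],
    "score": ["ingest"],
    "build_index": ["join_profiles", "score"],
}
total_minutes, path = critical_path(durations, upstream)
print(total_minutes, path)  # 210 ['ingest', 'join_profiles', 'build_index']
```

In this example, speeding up `score` would not shorten the pipeline at all, because it is not on the critical path; only the three jobs on the path determine the 210-minute total.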

Spark Job Tuning for Speed

Data skewness, a common Spark performance killer, was addressed through repartitioning. By redistributing data more evenly based on unique search document IDs, one job's execution time dropped from two hours to just 30 minutes.
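
The effect of the partitioning key can be shown with a small simulation. The source does not say which key the job was partitioned on before the change, so the skewed `account_id` below is purely an assumption, as are the record counts. In PySpark, the corresponding operation would be `DataFrame.repartition(numPartitions, *cols)`.

```python
from collections import Counter
from typing import Iterable, List


def partition_sizes(keys: Iterable[int], num_partitions: int) -> List[int]:
    """Count how many records land in each partition under modulo hashing."""
    counts = Counter(key % num_partitions for key in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical records: (doc_id, account_id). Account 7 owns 90% of the docs.
records = [(doc_id, 7 if doc_id < 900 else doc_id) for doc_id in range(1000)]

by_account = partition_sizes((account for _, account in records), num_partitions=4)
by_doc_id = partition_sizes((doc_id for doc_id, _ in records), num_partitions=4)

print(by_account)  # [25, 25, 25, 925]
print(by_doc_id)   # [250, 250, 250, 250]
```

With the skewed key, one task processes 925 records while the others process 25 each, so the stage finishes only when that one straggler does. Partitioning on a unique ID spreads the same work evenly.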

Careful adjustment of shuffle partition counts, aligned with the number of Spark executors, also yielded significant time savings, reducing a job’s runtime by over 30 minutes.
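
The article gives no concrete numbers for this change, so the arithmetic below is a sketch under assumptions: task durations are treated as uniform, and the executor and core counts are hypothetical. In Spark SQL, the partition count for shuffles is controlled by `spark.sql.shuffle.partitions`, which defaults to 200.

```python
import math


def task_slots(executors: int, cores_per_executor: int) -> int:
    """Number of tasks the cluster can run at the same time."""
    return executors * cores_per_executor


def task_waves(partitions: int, slots: int) -> int:
    """Number of scheduling rounds needed to run one task per partition."""
    return math.ceil(partitions / slots)


def idle_slots_in_last_wave(partitions: int, slots: int) -> int:
    """Slots left without work while the final wave runs."""
    return (-partitions) % slots


def aligned_partitions(target: int, slots: int) -> int:
    """Round the target partition count up to a multiple of the slot count."""
    return math.ceil(target / slots) * slots


slots = task_slots(executors=50, cores_per_executor=4)   # 200 slots
print(task_waves(500, slots), idle_slots_in_last_wave(500, slots))  # 3 100
aligned = aligned_partitions(500, slots)
print(aligned, idle_slots_in_last_wave(aligned, slots))  # 600 0
```

With 500 partitions, the third wave leaves half the slots idle. Rounding up to 600 keeps the same three waves but makes each task smaller and every slot busy.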

Broadcast joins proved effective for merging datasets of dramatically different sizes. Broadcasting a 40 MB table to all executors reduced a job’s runtime from over an hour to approximately 20 minutes.
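
A broadcast join avoids shuffling the large dataset: every executor receives a full copy of the small table and joins its own partitions locally. The pure-Python sketch below illustrates the mechanism with invented data; it is not the team's code. In PySpark, the equivalent hint is `pyspark.sql.functions.broadcast`, and automatic broadcasting is governed by `spark.sql.autoBroadcastJoinThreshold`, whose default is 10 MB, so a 40 MB table would presumably need an explicit hint or a raised threshold if defaults were in effect.

```python
from typing import Dict, Iterator, List, Tuple

Row = Tuple[int, str]


def join_partition(rows: List[Row],
                   small_table: Dict[int, str]) -> Iterator[Tuple[int, str, str]]:
    """Inner-join one partition of the large dataset against a local copy
    of the small table, using a hash lookup instead of a shuffle."""
    for key, value in rows:
        if key in small_table:
            yield key, value, small_table[key]


# Hypothetical data: the large side is split across two partitions.
large_partitions = [
    [(1, "a"), (2, "b")],
    [(3, "c"), (1, "d")],
]
small_table = {1: "US", 3: "DE"}  # the "broadcast" side, copied to every task

joined = [list(join_partition(part, small_table)) for part in large_partitions]
print(joined)  # [[(1, 'a', 'US')], [(3, 'c', 'DE'), (1, 'd', 'US')]]
```

Rows never leave their original partition, which is why the technique pays off when one side is small enough to fit in each executor's memory.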

The team also leveraged LinkedIn’s internal auto-tuning tool, Right-Sizing, which analyzes historical job runs to adjust Spark parameters automatically.
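
The article does not describe how Right-Sizing works internally. Purely as a generic illustration of tuning from historical runs, and not as a description of LinkedIn's tool, a recommender could size executor memory from past peak usage plus some headroom. The function name, the 20% headroom, and the sample history are all hypothetical.

```python
import math
from typing import List


def recommend_executor_memory_gb(peak_usage_gb: List[float],
                                 headroom: float = 0.2) -> int:
    """Suggest executor memory as the highest observed peak plus headroom,
    rounded up to a whole gigabyte. Hypothetical heuristic, for illustration."""
    if not peak_usage_gb:
        raise ValueError("need at least one historical run")
    return math.ceil(max(peak_usage_gb) * (1 + headroom))


# Hypothetical peak memory per executor (GB) from three earlier runs.
history = [5.1, 6.4, 5.8]
recommended = recommend_executor_memory_gb(history)
print(recommended)  # 8
```

A job that had been statically configured with far more memory than this would be a candidate for shrinking, which frees capacity under a fixed resource cap.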

Taken together, these changes show how targeted Spark tuning (pruning the job graph, fixing data skew, sizing shuffle partitions, and broadcasting small tables) can roughly halve the runtime of a large production pipeline even under fixed resource caps.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.