LinkedIn is overhauling its search infrastructure with large language models (LLMs) to deliver a more intuitive and personalized experience. This shift, detailed on the LinkedIn Engineering blog, aims to move beyond simple keyword matching towards understanding user intent through natural language processing.
The company has introduced AI Job Search and AI-powered People Search, features that interpret queries semantically. Instead of relying on exact word matches, these tools infer user goals and preferences, overcoming vocabulary gaps to better align search results with how professionals articulate their career ambitions.
According to LinkedIn, rebuilding the search stack around LLMs makes retrieval more flexible and more accurate than keyword matching alone.
Semantic Search Infrastructure at Scale
At its core, LinkedIn's semantic search employs a multi-stage process. User queries are first processed by a query understanding module, which generates embeddings. These embeddings are then used for embedding-based retrieval (EBR) on GPUs to identify a broad set of candidate documents.
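The retrieval step can be illustrated with a minimal sketch. It assumes embeddings are plain float vectors compared by cosine similarity and scanned exhaustively. The source does not say which similarity measure or index structure LinkedIn uses, and a production system would run an approximate nearest-neighbor search on GPUs rather than this loop. The function names and toy vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: Vector, index: Dict[str, Vector], k: int) -> List[Tuple[str, float]]:
    """Return the k documents whose embeddings are most similar to the query embedding."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: hand-written 3-dimensional vectors standing in for learned embeddings.
index = {
    "job_ml_engineer": [0.9, 0.1, 0.0],
    "job_data_scientist": [0.8, 0.3, 0.1],
    "job_chef": [0.0, 0.1, 0.9],
}
query_vec = [1.0, 0.2, 0.0]  # pretend output of the query understanding module

candidates = retrieve(query_vec, index, k=2)
candidate_ids = [doc_id for doc_id, _ in candidates]
```

The point of this stage is recall: it returns a broad candidate set cheaply and leaves fine-grained ordering to the ranking stage.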
A subsequent ranking stage refines these candidates using a cross-encoder Small Language Model (SLM). Served with SGLang, the model takes query, job, and member features together and scores each candidate for relevance and expected engagement.
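The ranking stage can be sketched as a re-rank over the retrieved candidates. A cross-encoder reads the query and the document together and returns one score per pair, so the sketch below takes the scorer as a pluggable function. `toy_joint_scorer` is a hypothetical placeholder based on token overlap, not the SLM, and member features are left out for brevity. In a real deployment the scorer would be a call to the served model.

```python
from typing import Callable, Dict, List, Tuple

# A joint scorer sees the query and the document text at the same time.
Scorer = Callable[[str, str], float]


def toy_joint_scorer(query: str, doc_text: str) -> float:
    """Placeholder for a cross-encoder: fraction of query tokens present in the document."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc_text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(query: str, candidates: Dict[str, str], scorer: Scorer, top_n: int) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair and keep the top_n highest-scoring candidates."""
    scored = [(doc_id, scorer(query, text)) for doc_id, text in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical candidate set, as if returned by the retrieval stage.
retrieved = {
    "job_a": "senior machine learning engineer",
    "job_b": "learning and development manager",
    "job_c": "line cook",
}
ranked = rerank("machine learning engineer", retrieved, toy_joint_scorer, top_n=2)
ranked_ids = [doc_id for doc_id, _ in ranked]
```

Because the scorer runs once per candidate, its cost grows with the size of the candidate set, which is why this stage is applied only after retrieval has narrowed the field.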
To stay efficient at scale, the ranking pipeline adds score caching, a ranking-depth controller, and traffic shaping. These optimizations are meant to reduce latency and improve result quality across millions of real-time queries.
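Two of these optimizations can be sketched in a few lines. The source does not describe how LinkedIn implements them, so everything here is an assumption made for illustration: the cache is keyed on the (query, document ID) pair, and the depth controller shrinks the number of candidates sent to the ranker linearly as a load signal between 0 and 1 rises. The class and function names and the depth bounds are hypothetical, and traffic shaping is not covered.

```python
from typing import Callable, Dict, Tuple


class CachedScorer:
    """Wraps an expensive scorer and reuses scores for repeated (query, doc_id) pairs."""

    def __init__(self, scorer: Callable[[str, str], float]) -> None:
        self._scorer = scorer
        self._cache: Dict[Tuple[str, str], float] = {}
        self.model_calls = 0  # how many times the underlying scorer actually ran

    def score(self, query: str, doc_id: str) -> float:
        key = (query, doc_id)
        if key not in self._cache:
            self.model_calls += 1
            self._cache[key] = self._scorer(query, doc_id)
        return self._cache[key]


def ranking_depth(load: float, min_depth: int = 50, max_depth: int = 500) -> int:
    """Number of candidates to rank: max_depth when idle, min_depth at full load."""
    load = min(max(load, 0.0), 1.0)  # clamp the load signal to [0, 1]
    return int(round(max_depth - load * (max_depth - min_depth)))


# Stand-in scorer; a real one would call the ranking model.
cached = CachedScorer(lambda query, doc_id: float(len(doc_id)))
first = cached.score("ml engineer", "job_a")
second = cached.score("ml engineer", "job_a")  # served from the cache

depth_idle = ranking_depth(0.0)
depth_busy = ranking_depth(1.0)
```

The trade-off in both cases is the same: each one gives up some freshness or ranking depth in exchange for fewer model calls per query. A real cache would also need an eviction policy and an expiry time, which this sketch omits.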