Uber Eats' Search Engine Gets Smarter

Uber Eats enhances its delivery search with semantic AI, leveraging LLMs and optimized infrastructure for speed, scale, and accuracy.

[Illustration: a smartphone showing the Uber Eats app interface with a search bar. Caption: Uber Eats' new semantic search aims for intuitive discovery. Credit: Uber Engineering]

Search is the gateway to orders on Uber Eats, directly impacting conversion rates and user satisfaction. Traditional keyword matching struggles with synonyms, typos, and language nuances, leading to missed intent. Uber Eats has shifted to semantic search, which matches meaning rather than just words by encoding queries and documents into vector embeddings.
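To make "matching meaning rather than words" concrete, here is a minimal sketch of how embeddings are typically compared. The three-dimensional vectors and item names are invented for illustration; a real model produces far larger vectors, and the article does not say which similarity function Uber Eats uses. Cosine similarity is assumed here because it is a common choice.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: the query shares no keyword with either
# document, so only the vector geometry decides the ranking.
query_vec = [0.9, 0.1, 0.2]  # stands in for the query "soda"
doc_vecs = {
    "cola 12-pack": [0.8, 0.2, 0.1],
    "garden salad": [0.1, 0.9, 0.3],
}

ranked = sorted(
    doc_vecs,
    key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    reverse=True,
)
print(ranked)
```

The query "soda" never appears in "cola 12-pack", yet the two vectors point in nearly the same direction, which is the property a keyword matcher cannot exploit.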

Visual TL;DR. Keyword Search Limits leads to Semantic Search Shift. Semantic Search Shift uses LLMs & Vector Embeddings. LLMs & Vector Embeddings enables Two-Tower Architecture. Two-Tower Architecture requires Optimized Infrastructure. Optimized Infrastructure enables Improved Search Accuracy. Improved Search Accuracy results in Enhanced User Satisfaction.

  1. Keyword Search Limits: traditional keyword matching struggles with synonyms, typos, and language nuances
  2. Semantic Search Shift: matches meaning rather than just words by encoding queries and documents
  3. LLMs & Vector Embeddings: leveraging large language models for flexible embedding dimensions and fine-tuning
  4. Two-Tower Architecture: decoupling query and document embedding calculations for efficient processing
  5. Optimized Infrastructure: robust tech stack including deployment, indexing, and monitoring at scale
  6. Improved Search Accuracy: better capture user intent across stores, dishes, and items
  7. Enhanced User Satisfaction: directly impacting conversion rates and overall user experience

This move aims to better capture user intent across stores, dishes, and items, even in multilingual markets. As detailed by Uber Engineering, building this at scale involves more than just a model; it requires a robust tech stack including deployment, indexing, and monitoring.

Architecture and Model Training

The system employs a two-tower architecture, which decouples query and document embedding calculations. Query embeddings are generated online in real time, while document embeddings are computed offline in batches. The team uses Matryoshka Representation Learning (MRL) so that a single model can serve embeddings at several dimensions, and fine-tunes large language models (LLMs) such as Qwen as the backbone for their world knowledge and cross-lingual capabilities.
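The practical appeal of MRL is that a prefix of the full embedding is itself a usable embedding. The sketch below shows only the serving-side step, truncating and re-normalizing a vector; it assumes the model was already trained with an MRL objective, and the 8-dimensional vector and the function name are hypothetical.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    prefix = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in prefix]


# Hypothetical full-size embedding from the document tower.
full = [0.5, 0.5, 0.5, 0.5, 0.1, -0.1, 0.05, 0.0]

# A cheaper 4-dimensional view of the same document.
small = truncate_and_normalize(full, 4)
print(len(small), small)
```

In a two-tower setup the same truncation would have to be applied to both the offline document vectors and the online query vector, otherwise the two sides would no longer live in the same space.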

This single embedding model now serves all Uber Eats verticals and markets. Training is orchestrated using PyTorch and Ray, with large-scale runs relying on PyTorch's DistributedDataParallel (DDP) and DeepSpeed (ZeRO-3) to fit and train large LLM backbones. Artifacts are versioned and tracked for reproducibility.

Scaling and Optimization

Offline inference is crucial for embedding Uber's vast document corpus. Embeddings are calculated at the feature level and then joined back to the full catalog. These embeddings are stored in feature store tables and used to build search indexes supported by HNSW graphs, offering both non-quantized and quantized vector representations.
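One plausible reading of "calculated at the feature level and then joined back" is that each distinct feature value is embedded once and the result is reused for every catalog row that carries it. The sketch below illustrates that idea only; `fake_embed`, the catalog rows, and the choice of `name` as the feature are all assumptions, not details from Uber's pipeline.

```python
def fake_embed(text):
    """Stand-in for the document tower. NOT a real model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


# Hypothetical catalog: two stores sell an item with the same name.
catalog = [
    {"item_id": 1, "store_id": "A", "name": "Margherita Pizza"},
    {"item_id": 2, "store_id": "B", "name": "Margherita Pizza"},
    {"item_id": 3, "store_id": "B", "name": "Caesar Salad"},
]


def embed_unique(texts):
    """Embed each distinct text once and return a lookup table."""
    table = {}
    for text in texts:
        if text not in table:
            table[text] = fake_embed(text)
    return table


# Step 1: feature-level inference over distinct values only.
feature_table = embed_unique(row["name"] for row in catalog)

# Step 2: join the embeddings back onto every catalog row.
joined = [{**row, "embedding": feature_table[row["name"]]} for row in catalog]

print(len(feature_table), "model calls for", len(joined), "rows")
```

When many rows share a feature value, this kind of deduplication cuts the number of model calls, which matters when the encoder is an LLM.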

Balancing retrieval accuracy with infrastructure costs was a primary challenge. Uber Eats tunes Approximate Nearest Neighbor (ANN) parameters, employs quantization strategies (such as int7 scalar quantization, SQ), and serves different embedding dimensions via MRL. These optimizations significantly reduced cost and latency without compromising retrieval quality.
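Scalar quantization trades a little precision for much smaller vectors by mapping each float onto a small integer grid. The following is a generic uniform quantizer with 7 bits (codes 0 to 127), written as a sketch; it is not the exact scheme of any particular search engine, and the sample vectors and function names are made up.

```python
def fit_range(vectors):
    """Find the global min and max used to define the quantization grid."""
    values = [x for vec in vectors for x in vec]
    return min(values), max(values)


def quantize(vec, lo, hi, bits=7):
    """Map floats in [lo, hi] to integer codes in [0, 2**bits - 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    levels = (1 << bits) - 1  # 127 for 7 bits
    scale = (hi - lo) / levels
    return [min(levels, max(0, round((x - lo) / scale))) for x in vec]


def dequantize(codes, lo, hi, bits=7):
    """Approximate the original floats from their integer codes."""
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels
    return [lo + code * scale for code in codes]


vectors = [[-1.0, 0.0, 0.5], [1.0, 0.25, -0.5]]
lo, hi = fit_range(vectors)
codes = quantize(vectors[0], lo, hi)
restored = dequantize(codes, lo, hi)
print(codes, restored)
```

Each component now fits in one byte instead of four, and the reconstruction error per component is bounded by half a grid step. Whether that error is acceptable is an empirical question answered by measuring recall against the non-quantized index.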

The system also incorporates locale-aware lexical fields and boolean pre-filters to shrink the candidate set before ANN search. A lightweight re-ranking step further refines results before they reach downstream rankers.
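The three stages described above (filter, vector search, light re-rank) can be sketched as follows. An exact scan stands in for the ANN index, and the field names (`city`, `open`, `popularity`) and the 0.8/0.2 blend weights are hypothetical choices for illustration, not values from the article.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical documents with filterable fields and an embedding.
docs = [
    {"id": "d1", "city": "sf", "open": True, "vec": [1.0, 0.0], "popularity": 0.1},
    {"id": "d2", "city": "sf", "open": True, "vec": [0.9, 0.3], "popularity": 0.9},
    {"id": "d3", "city": "sf", "open": False, "vec": [1.0, 0.0], "popularity": 1.0},
    {"id": "d4", "city": "nyc", "open": True, "vec": [1.0, 0.0], "popularity": 1.0},
    {"id": "d5", "city": "sf", "open": True, "vec": [0.0, 1.0], "popularity": 0.5},
]


def search(query_vec, city, k=2):
    # 1. Boolean pre-filter shrinks the candidate set before any vector math.
    candidates = [d for d in docs if d["city"] == city and d["open"]]
    # 2. Vector retrieval (exact scan here, an ANN index in production).
    top_k = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)[:k]
    # 3. Lightweight re-rank blending similarity with a hypothetical signal.
    def blended(d):
        return 0.8 * cosine(query_vec, d["vec"]) + 0.2 * d["popularity"]
    return [d["id"] for d in sorted(top_k, key=blended, reverse=True)]


print(search([1.0, 0.0], "sf"))
```

Filtering first means the closed store and the out-of-city store never reach the vector stage, however similar their embeddings are to the query.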

Productionization and Reliability

Uber's data is dynamic, necessitating a biweekly retraining and index update cadence. A blue/green deployment strategy at the index column level ensures seamless model refreshes and rollback capabilities. Each index maintains two columns (embedding_blue and embedding_green), with the active model version mapped via configuration.
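A minimal sketch of the column-level blue/green idea follows. Only the column names `embedding_blue` and `embedding_green` come from the article; the `IndexConfig` class, its methods, and the version strings are assumptions made for illustration.

```python
from dataclasses import dataclass

COLUMNS = ("embedding_blue", "embedding_green")


@dataclass
class IndexConfig:
    active: str      # column currently served to readers
    model_ids: dict  # column name -> model version written to that column

    def inactive(self):
        return COLUMNS[1] if self.active == COLUMNS[0] else COLUMNS[0]

    def stage(self, model_id):
        """Record that a new model's embeddings now fill the idle column."""
        self.model_ids[self.inactive()] = model_id

    def switch(self):
        """Flip reads to the other column. Calling it again rolls back."""
        self.active = self.inactive()


config = IndexConfig(
    active="embedding_blue",
    model_ids={"embedding_blue": "v1", "embedding_green": None},
)
config.stage("v2")   # offline job writes v2 embeddings into the idle column
config.switch()      # readers now use embedding_green
print(config.active, config.model_ids[config.active])
```

Because the previous column is left untouched, a rollback is just another configuration flip rather than a re-index.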

Automated validations gate deployments, checking for completeness, backward compatibility, and correctness against real queries in non-prod environments. These checks prevent data corruption and ensure new indexes perform at least as well as the current production index.
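A deployment gate of this kind might look like the sketch below, which covers two of the checks named above: completeness and "at least as good as production" on real queries. The backward-compatibility check is not modeled. The metric (recall@k), the 0.999 completeness threshold, and all function names are assumptions.

```python
def completeness(expected_ids, indexed_ids):
    """Fraction of expected documents that made it into the new index."""
    expected = set(expected_ids)
    return len(expected & set(indexed_ids)) / len(expected)


def recall_at_k(results, relevant, k):
    """Share of relevant documents found in the top-k results."""
    hits = sum(1 for doc_id in results[:k] if doc_id in relevant)
    return hits / min(k, len(relevant))


def mean_recall(runs, k):
    """Average recall@k over (results, relevant_set) pairs."""
    return sum(recall_at_k(res, rel, k) for res, rel in runs) / len(runs)


def passes_gate(expected_ids, indexed_ids, candidate_runs, production_runs,
                k=3, min_completeness=0.999):
    if completeness(expected_ids, indexed_ids) < min_completeness:
        return False  # missing documents: block the deployment
    # The candidate index must do at least as well as production.
    return mean_recall(candidate_runs, k) >= mean_recall(production_runs, k)


# Hypothetical replay of one real query against both indexes.
production_runs = [(["a", "b", "c"], {"a", "x"})]
candidate_runs = [(["a", "x", "c"], {"a", "x"})]
print(passes_gate([1, 2, 3, 4], [1, 2, 3, 4], candidate_runs, production_runs))
```

Running the gate in a non-production environment means a failing index is rejected before any reader can be routed to it.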

Serving-time reliability checks further guard against errors. The system verifies that the model generating the query embedding matches the model ID on the active index column. Mismatches trigger alerts and automatic rollbacks, preventing outages without impacting read path latency.
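The model-ID check can be sketched as below. The rollback policy shown (switch to whichever column matches the query model) is an assumption, as are the function name and the dictionary layout. The check is also written synchronously for brevity, whereas the article says the real one does not add read-path latency, which suggests it runs off the request path.

```python
def verify_and_heal(query_model_id, config, alerts):
    """Return the column to search; alert and roll back on a mismatch."""
    active = config["active"]
    indexed_model_id = config["model_ids"][active]
    if indexed_model_id == query_model_id:
        return active

    alerts.append(
        f"model mismatch: query={query_model_id} index={indexed_model_id}"
    )
    # Assumed policy: fall back to the column built by the query's model.
    for column, model_id in config["model_ids"].items():
        if model_id == query_model_id:
            config["active"] = column
            return column
    raise RuntimeError("no index column matches the query model")


config = {
    "active": "embedding_green",
    "model_ids": {"embedding_blue": "v1", "embedding_green": "v2"},
}
alerts = []

# The query tower is still serving v1, but the index points at v2.
column = verify_and_heal("v1", config, alerts)
print(column, alerts)
```

The reason the check matters is that vectors from two different model versions are not comparable, so a silent mismatch would return plausible-looking but meaningless neighbors.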

Conclusion

Uber Eats has built a scalable, multilingual search system that powers discovery across its verticals. By combining LLM-based embedding models, efficient techniques such as Matryoshka Representation Learning, and a production-first design with blue/green deployments and automated validation, the team delivers a faster, more intuitive search experience. This approach to semantic search at scale, similar to advancements seen in platforms like LinkedIn's AI Search Upgrade, highlights the critical role of thoughtful engineering in modern discovery platforms.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.