ML System Design: Query Understanding — Rewriting, Expansion, Classification, and Spell Correction

Design the query understanding stack behind web search, e-commerce search, and internal enterprise retrieval: tokenization, spelling, intent classification, synonym expansion, PII redaction, and safe query rewriting for hybrid vector and lexical retrieval. The guide covers how Amazon-style search decomposes the problem into cascaded lightweight models that run under single-digit millisecond budgets before heavy ranking.

52 min read · 2 sections · 1 interview question
Query Understanding, Search Ranking, Query Rewriting, Spell Correction, Intent Classification, Synonym Expansion, BERT, Elasticsearch, Retrieval, PII Redaction, Latency Budget, E-commerce Search

Query Understanding Sits Upstream of Retrieval Quality

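Bad queries waste the approximate nearest neighbor (ANN) budget of an index such as FAISS and poison BM25 signals. The query understanding (QU) stack normalizes the query before retrieval: Unicode normalization, casing, spelling correction, language detection, intent routing (for example, product versus support document), expansion with a controlled vocabulary, and PII stripping before the query is logged or sent to third-party LLMs.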
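The cheapest of these steps can be sketched with the Python standard library alone. The following is a minimal illustration, not a production redactor: the function names `normalize_query` and `redact_pii`, the placeholder tokens, and both regular expressions are assumptions made for this example, and the phone pattern is deliberately crude (it would also redact long numeric model numbers).

```python
import re
import unicodedata

# Illustrative patterns only; real PII detection needs far more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def normalize_query(raw: str) -> str:
    """Apply Unicode NFKC folding, case folding, and whitespace collapsing."""
    text = unicodedata.normalize("NFKC", raw)  # e.g. full-width letters -> ASCII
    text = text.casefold()                     # aggressive, locale-independent lowercasing
    return " ".join(text.split())              # trim and collapse runs of whitespace


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholder tokens."""
    text = EMAIL_RE.sub("<email>", text)
    return PHONE_RE.sub("<phone>", text)


if __name__ == "__main__":
    query = normalize_query("  Refund for Jane.Doe@Example.com  ")
    print(redact_pii(query))
```

Running redaction after normalization keeps the patterns simple, because they only have to handle one canonical form of the text. The redacted string is what would be written to logs or forwarded to an external LLM; the retrieval path may still need the unredacted query.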

Interviewers expect a latency cascade: microseconds for rules, sub-millisecond for small classifiers, and 1–3 ms for compact transformers on CPU, all before retrieval, which typically takes about 10–30 ms for hybrid stacks at moderate QPS.
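One way to reason about such a cascade is as an ordered list of stages, cheapest first, where each stage runs only if its reserved cost still fits in the remaining budget. The sketch below assumes this design; the `Stage` type, the `run_cascade` function, the stage names, and the per-stage budgets are hypothetical and chosen only to mirror the orders of magnitude quoted above.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    budget_ms: float           # assumed cost reserved for this stage
    run: Callable[[str], str]  # takes a query, returns the (possibly rewritten) query


def run_cascade(
    query: str,
    stages: List[Stage],
    total_budget_ms: float,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[str, List[str]]:
    """Run stages in order, stopping once the next stage would exceed the budget."""
    start = clock()
    executed: List[str] = []
    for stage in stages:
        elapsed_ms = (clock() - start) * 1000.0
        if elapsed_ms + stage.budget_ms > total_budget_ms:
            break  # stages are ordered cheap to expensive, so later ones will not fit either
        query = stage.run(query)
        executed.append(stage.name)
    return query, executed


if __name__ == "__main__":
    stages = [
        Stage("rules", 0.05, str.strip),               # microsecond-scale rules
        Stage("intent_classifier", 0.5, lambda q: q),  # stand-in for a small classifier
        Stage("rewriter", 3.0, lambda q: q.lower()),   # stand-in for a compact transformer
    ]
    print(run_cascade("  Red Shoes ", stages, total_budget_ms=5.0))
```

The clock is injectable so the skipping behavior can be tested deterministically. When the budget runs out, the query simply continues to retrieval with whatever normalization the cheaper stages already applied.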
