Design a vector search system that can handle billion-scale embeddings. Cover index types (HNSW, IVF), quantization strategies, distributed search, and real-time updates.
HNSW index for high recall at low latency. Product quantization (PQ) to compress vectors 4-8x. Shard index across machines by hash. Replicate shards for availability. Write-ahead log for real-time updates. Two-phase: coarse ANN → exact reranking.
Ready to practice?
Write your structured answer, then compare to a strong model answer.