SQL Query Optimization: Indexes, Query Plans, and Performance at Scale
Production SQL is not about writing queries that work; it is about writing queries that scale. This guide covers B-tree and hash index mechanics, reading EXPLAIN ANALYZE output, index selectivity and covering indexes, when indexes hurt write throughput, correlated subquery elimination, and the partitioning strategies used in large analytics warehouses.
Why Query Optimization Is an Interview Topic
Most SQL interviews start with "write a query." The follow-up is almost always "how does this perform at 10 billion rows?" or "your query has been running for 20 minutes. How do you debug it?"
Query optimization is the domain where data engineers and ML engineers prove they understand the system beneath the SQL. Writing a correct window function query is table stakes. Knowing why that query does a full table scan on 500M rows, and which index would fix it, is the signal interviewers at Stripe, Databricks, and Snowflake are testing for.
This guide focuses on the mechanics that matter in analytics and ML engineering: B-tree indexes, reading query plans, and the patterns that cause production slowdowns, namely correlated subqueries, low-selectivity indexes, and N+1 queries in ORM-generated SQL.
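To make the "full table scan, then add an index" scenario concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under stated assumptions, not a production recipe. SQLite's EXPLAIN QUERY PLAN stands in for PostgreSQL's EXPLAIN ANALYZE, which reports different and more detailed output. The events table, its columns, and the index name idx_events_user_id are hypothetical. The exact plan wording also varies between SQLite versions.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); only the detail text is kept.
    return [row[3] for row in rows]


# Hypothetical table, small enough to build in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO events (user_id, amount) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

query = "SELECT * FROM events WHERE user_id = ?"

# Without an index on user_id, the planner has to scan every row.
plan_before = plan(conn, query, (42,))

# A B-tree index on the filter column lets the planner seek instead of scan.
conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
plan_after = plan(conn, query, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should report a scan of events, and the second should mention idx_events_user_id. The workflow is the same on any engine: read the plan, change one thing, read the plan again.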