SQL & Databases
guides
SQL Foundations for Data & ML Interviews: JOINs, Aggregations, and Window Functions
Master the SQL patterns that appear in every data and ML engineer interview — JOINs, GROUP BY, CTEs, and NULL semantics. This guide covers what most tutorials skip: why LEFT JOIN silently drops rows, when HAVING beats WHERE, and how NULL poisons aggregates in ways that hide bugs for months.
SQL Query Optimization: Indexes, Query Plans, and Performance at Scale
Production SQL is not about writing queries that work — it's about queries that scale. This guide covers B-tree and hash index mechanics, reading EXPLAIN ANALYZE output, index selectivity and covering indexes, when indexes hurt write throughput, correlated subquery elimination, and partitioning strategies used in analytics warehouses at scale.
Window Functions for Analytics: ROW_NUMBER, RANK, LAG/LEAD, and Running Totals
Window functions are the single most-tested SQL topic in data and ML engineer interviews. This guide covers the full OVER() clause mechanics, every ranking function and when each is wrong, LAG/LEAD for time-series analysis, running aggregates, and the ROWS vs RANGE frame clause distinction that trips up even experienced engineers.
SQL Joins & Aggregations: INNER, LEFT, GROUP BY, and Aggregation Pitfalls
Joins and GROUP BY form the backbone of every SQL interview. This guide covers the four join types with precise NULL semantics, GROUP BY with HAVING, the COUNT(*) vs COUNT(col) distinction that fails most candidates, multi-table aggregation patterns, and the self-join technique for hierarchical and comparison queries asked at Meta, Stripe, and Airbnb.
Subqueries and CTEs: WITH Clauses, Correlated Subqueries, and Recursive Patterns
Subqueries and CTEs are the SQL constructs that separate mid-level analysts from senior engineers. This guide covers scalar subqueries, correlated subqueries and their performance implications, the WITH (CTE) clause for readable multi-step logic, and recursive CTEs for hierarchical data — all tested at Meta, Stripe, Google, and Airbnb's data rounds.
SQL Indexes and Query Performance: B-Tree, Composite, and Covering Indexes
Index selection is what separates a working SQL query from one that runs in production. This guide covers B-tree index mechanics, composite index column ordering, covering indexes for index-only scans, when indexes hurt writes, and how to read EXPLAIN plans — skills tested at Google, Stripe, and Meta for senior data and backend roles.