
Causal Inference: DiD, Instrumental Variables, RDD, and When A/B Tests Fail

The toolkit every senior data scientist needs when A/B tests aren't possible. Covers DiD, Instrumental Variables (IV), RDD, Propensity Score Matching, and Double ML for machine learning interviews — the exact methods used at Airbnb and Microsoft to estimate causal effects from observational data when randomization is impossible.

50 min read · 2 sections · 1 interview question
Causal Inference · Difference-in-Differences · DiD · Instrumental Variables · Regression Discontinuity · Propensity Score Matching · Average Treatment Effect · ATE · Parallel Trends · Observational Data · Double ML · Counterfactual · DAG · Selection Bias · A/B Test

Why Causal Inference? The Problem with Observational Data

The central problem in data science: observing a correlation does not tell you whether one variable causes another. Ice cream sales and drowning deaths are correlated (both driven by summer heat). Users with longer watch times also have higher retention (both driven by content quality). If you cut ice cream sales, drowning deaths won't fall. If you lengthen videos, retention may drop.

The gold standard: Randomized Controlled Trials (A/B tests). Randomly assign users to treatment and control. By randomization, the only systematic difference between groups is the treatment. The observed difference in outcomes IS the causal effect.
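A minimal simulation (with made-up numbers) shows why randomization works: treatment assignment is independent of everything else about the user, so the raw difference in means recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_effect = 2.0

# Each user has a baseline outcome driven by unobserved factors.
baseline = rng.normal(10, 3, n)

# Randomization: treatment is assigned independently of baseline.
treated = rng.random(n) < 0.5

# Observed outcome = baseline + causal effect (for treated) + noise.
outcome = baseline + true_effect * treated + rng.normal(0, 1, n)

# Under randomization, the difference in means estimates the causal effect.
ate_hat = outcome[treated].mean() - outcome[~treated].mean()
print(f"estimated effect: {ate_hat:.2f}")  # close to the true 2.0
```

Because `treated` is independent of `baseline`, the two groups have the same expected baseline, and the difference in group means is an unbiased estimate of the treatment effect.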

But A/B tests are often impossible:

  • Ethical constraints: You can't randomly assign users to receive addictive content, potentially harmful features, or services affecting their livelihoods.
  • Network effects: If you treat 50% of users, the control group is affected (treated users interact with control users on the same platform). Switchback or cluster-randomized experiments are used instead.
  • Long-term effects: An A/B test of a 3-month subscription product requires waiting 3 months for results. The business needs answers in days.
  • External events: Evaluating impact of an iOS policy change, economic downturn, or competitor move — you can't randomly assign who was affected.
  • Lack of randomization: An ML model routes users to different features. The model's targeting is correlated with the outcome, producing selection bias.

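The last bullet is worth seeing concretely. A hypothetical sketch: if a model targets highly engaged users, and engagement also drives the outcome, the naive treated-vs-control comparison is badly biased even though the true effect is known by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
true_effect = 2.0

# Confounder: engagement drives both who gets treated and the outcome.
engagement = rng.normal(10, 3, n)

# The "model" targets users it predicts are engaged (noisy threshold).
treated = engagement + rng.normal(0, 1, n) > 10

# Outcome depends on engagement AND on treatment.
outcome = engagement + true_effect * treated + rng.normal(0, 1, n)

# Naive comparison mixes the treatment effect with the engagement gap
# between groups, so it lands well above the true effect of 2.0.
naive = outcome[treated].mean() - outcome[~treated].mean()
print(f"naive estimate: {naive:.2f} (true effect: {true_effect})")
```

The naive estimate is inflated because the treated group was more engaged to begin with; the methods in this guide are different strategies for removing exactly this kind of bias.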
The potential outcomes framework (Rubin, 1974): For each unit i, define potential outcomes:

  • Y_i(1) = outcome if treated
  • Y_i(0) = outcome if not treated

The individual treatment effect = Y_i(1) - Y_i(0). The fundamental problem: you can only observe one potential outcome per unit — the one that actually happened. The other is the counterfactual.

What we can estimate: Average Treatment Effect (ATE) = E[Y(1) - Y(0)]. Causal inference methods are attempts to credibly estimate this counterfactual outcome from data where treatment was not randomized.
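A small simulation (toy numbers) makes the fundamental problem tangible: in a simulation we can write down both potential outcomes and compute the ATE directly, but the observed data reveals only one of the two per unit.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# In a simulation we know BOTH potential outcomes for every unit.
y0 = rng.normal(10, 2, n).round(1)  # Y_i(0): outcome if not treated
y1 = y0 + 2.0                       # Y_i(1): outcome if treated
treated = rng.random(n) < 0.5

# ATE = E[Y(1) - Y(0)] -- computable only because this is simulated.
ate = (y1 - y0).mean()

# In reality, each unit reveals exactly one potential outcome;
# the other is the unobserved counterfactual.
observed = np.where(treated, y1, y0)
for i in range(n):
    shown = f"Y({int(treated[i])}) = {observed[i]}"
    missing = f"Y({int(not treated[i])}) = ?"
    print(f"unit {i}: observed {shown}, counterfactual {missing}")

print(f"true ATE (known only in simulation): {ate}")
```

Every causal inference method in this guide is a strategy for filling in those `?` entries credibly enough to estimate the ATE.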
