Machine Learning·Intermediate

Multiple Testing Corrections: FWER, FDR, Bonferroni, Benjamini–Hochberg, and When Each Fails

Running twenty metrics at α=0.05 each does not leave your program at a 5% false positive rate: if the metrics were independent and every null were true, the chance of at least one false positive would be about 64%. This guide covers Bonferroni, Holm, Benjamini–Hochberg FDR, false discovery proportion intuition, and how Meta-style experimentation teams pair primary-metric discipline with exploratory FDR on secondary reads.

36 min read · 2 sections · 1 interview question

Multiple Comparisons · Bonferroni · Benjamini-Hochberg · False Discovery Rate · FWER · Holm-Bonferroni · Hypothesis Testing · A/B Testing · Experimentation · Primary Metric · q-value · Westfall-Young

The Family-Wise Error Explosion

Suppose you test $m$ independent null hypotheses, each at level $\alpha$. If **all** nulls are true, the probability of **at least one** false rejection is $1 - (1 - \alpha)^m$. For $m = 10$ and $\alpha = 0.05$, that is about **40%**, not 5%. Product experiments violate independence (metrics co-move), but the qualitative lesson survives: **naive multi-metric dashboards** generate phantom wins. Interviewers want you to separate three objects: (1) a **pre-registered primary** metric for ship decisions, (2) **FWER-controlled** procedures when any single false alarm is catastrophic, (3) **FDR-controlled** procedures for exploratory science on many hypotheses.
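The family-wise error arithmetic, and the difference between FWER-controlled and FDR-controlled procedures, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the guide's own code: the function names and the ten p-values are hypothetical, and the p-values were chosen only so that the three textbook procedures (Bonferroni, Holm, Benjamini–Hochberg) give different answers.

```python
def fwer_independent(m, alpha):
    """P(at least one false rejection) for m independent true nulls at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni(pvals, alpha=0.05):
    """FWER control: reject H_i when p_i <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def holm(pvals, alpha=0.05):
    """FWER control, step-down: compare the k-th smallest p-value (k = 0, 1, ...)
    with alpha / (m - k) and stop at the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for k, i in enumerate(order):
        if pvals[i] > alpha / (m - k):
            break
        reject[i] = True
    return reject


def benjamini_hochberg(pvals, alpha=0.05):
    """FDR control, step-up: find the largest rank k with p_(k) <= k * alpha / m
    and reject the k smallest p-values. The guarantee assumes independent or
    positively dependent tests."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


# Hypothetical p-values for ten metrics (illustration only, not real data).
pvals = [0.001, 0.0055, 0.012, 0.03, 0.04, 0.2, 0.5, 0.7, 0.9, 0.95]

print(round(fwer_independent(10, 0.05), 3))  # 0.401
print(sum(bonferroni(pvals)))                # 1
print(sum(holm(pvals)))                      # 2
print(sum(benjamini_hochberg(pvals)))        # 3
```

On this toy input, Holm rejects everything Bonferroni rejects plus one more, and Benjamini–Hochberg rejects more again. That ordering reflects the trade-off in the paragraph above: FDR control tolerates a fraction of false discoveries in exchange for power, which suits exploratory reads but not a ship decision on a primary metric.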
