User Segmentation & Behavioral Analytics: RFM, Clustering, Personas, and Production Guardrails
Segmentation powers targeting, pricing, and product prioritization — but k-means on raw features without scaling, leakage from future data, and unstable personas kill trust. This guide covers RFM, hierarchical clustering vs k-means, behavioral sequence features, evaluation metrics (silhouette with caveats), and how LinkedIn-scale teams ship segments with drift monitoring.
Segmentation Is an Interface Between ML and the Business
Segments are groups of users who behave similarly enough that differentiated actions (creative, discount, product surface) beat a one-size-fits-all treatment. Interviews test whether you know that unsupervised learning is underspecified: many clusterings of the same data are mathematically valid, and any of them is useless if PMs cannot act on it.
Strong candidates lead with business actions and then choose algorithms; weak candidates dump k-means centroid coordinates on stakeholders.
A practical litmus test: if a segment does not change messaging, pricing, routing, or product treatment, it is analytics theater. Mature teams require a segment card for each cluster that includes target action, owner, expected KPI movement, and sunset criteria. This is also where many candidates miss governance risk — segments can become stale as behavior changes, so they must be treated like model artifacts with retrain cadence and drift monitoring.
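One way to make the segment card concrete is a small record type that carries the fields listed above plus a review check. This is a minimal sketch, not a standard schema: the class name `SegmentCard`, every field name, and the two sunset triggers (age since last training, and absolute change in the segment's share of users) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SegmentCard:
    """Hypothetical segment card; all field names are illustrative."""
    name: str
    target_action: str        # what the business does differently for this segment
    owner: str                # team or person accountable for the segment
    kpi: str                  # metric the action is expected to move
    expected_kpi_lift: float  # expected relative movement, e.g. 0.05 means +5%
    last_trained: date        # when the segment definition was last refit
    retrain_every_days: int   # assumed retrain cadence
    baseline_share: float     # fraction of users in the segment at training time
    max_share_drift: float    # assumed sunset trigger: absolute change in share

    def is_actionable(self) -> bool:
        # The litmus test: a segment with no action or no owner is theater.
        return bool(self.target_action.strip()) and bool(self.owner.strip())

    def needs_review(self, today: date, current_share: float) -> bool:
        # Flag the segment if it is older than its cadence or its size has shifted.
        stale = (today - self.last_trained).days > self.retrain_every_days
        drifted = abs(current_share - self.baseline_share) > self.max_share_drift
        return stale or drifted


# Example card with made-up values.
card = SegmentCard(
    name="lapsed_high_value",
    target_action="win-back email with a discount",
    owner="lifecycle-marketing",
    kpi="30-day reactivation rate",
    expected_kpi_lift=0.05,
    last_trained=date(2024, 1, 1),
    retrain_every_days=90,
    baseline_share=0.12,
    max_share_drift=0.03,
)
```

A change in segment share is only a coarse proxy for drift. A production check would more likely compare feature distributions or assignment stability, but the structure is the same: each segment carries its own cadence and sunset thresholds, so staleness is evaluated per segment instead of being left to memory.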