ML System Design: Job Recommendation (LinkedIn-Style Marketplace)
Design a production job recommendation system for a professional marketplace, end to end. Covers cold-start and short-lived job postings; the LinkSAGE / LiGNN pattern of **nearline** GNN embeddings fed as **features** into a low-latency two-tower ranker (not a full GNN pass on every list load); position bias and delayed labels (apply → hire); eligibility hard filters; and two-sided evaluation with employer-quality guardrails. Citations: LiGNN and LinkSAGE (arXiv 2024) for large-scale job graph learning at LinkedIn.
Why This Is Not ItemCF on Movie IDs
Postings are short-lived and sparse. A job ID may receive few interactions before close or fill. Pure collaborative filtering on posting ID starves before the graph has signal.
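One common way around ID starvation is to score a posting from its content, so it is rankable with zero interactions. The sketch below is illustrative only: `skill_vector` and `cosine` are hypothetical helpers, the skill lists are made up, and a bag-of-skills cosine stands in for whatever learned content tower a production system would actually use.

```python
import math
from collections import Counter

def skill_vector(skills):
    """Bag-of-skills counts keyed by lowercased skill name (stand-in for a learned content encoder)."""
    return Counter(s.strip().lower() for s in skills)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# A brand-new posting has no clicks or applies, but it does have content.
member = skill_vector(["Python", "SQL", "Spark"])
new_posting = skill_vector(["python", "spark", "airflow"])
score = cosine(member, new_posting)  # two shared skills out of three on each side
```

The point of the design is that nothing in the score depends on the posting ID, so the posting can be retrieved on the day it is published.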
The marketplace is two-sided. Clicks, applies, and recruiter replies measure different things. A model that maximizes click alone optimizes for sensational titles; one that maximizes apply alone can favor one-click spam applications that waste recruiter time. Production teams define a funnel and quality guardrails.
Labels are delayed and structurally missing. Hire and onsite outcomes can arrive weeks or months after the impression. Training only on short-horizon clicks without inverse propensity correction or multi-task structure systematically misaligns the model with employer value: you think you optimized applies; you may have increased low-quality volume.
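A minimal sketch of one way to handle delayed hire labels: treat an impression as a negative only once its attribution window has fully elapsed, and mask it from the hire objective until then. The 90-day `HIRE_WINDOW` and the `hire_label` helper are assumptions for illustration, not values from the cited papers.

```python
from datetime import datetime, timedelta

HIRE_WINDOW = timedelta(days=90)  # assumed attribution window; tune per marketplace

def hire_label(impression_at, hired_at, now):
    """Return 1 (hire inside window), 0 (window elapsed, no hire), or None (not yet known)."""
    if hired_at is not None and hired_at - impression_at <= HIRE_WINDOW:
        return 1
    if now - impression_at >= HIRE_WINDOW:
        return 0
    return None  # still inside the window: exclude from the hire loss, do not count as negative

t0 = datetime(2024, 1, 1)
matured_positive = hire_label(t0, t0 + timedelta(days=30), t0 + timedelta(days=40))
matured_negative = hire_label(t0, None, t0 + timedelta(days=120))
still_pending = hire_label(t0, None, t0 + timedelta(days=10))
```

In a multi-task setup, rows that return `None` can still train the click and apply heads while contributing nothing to the hire head.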
The industrial graph insight (LiGNN, LinkSAGE). LinkedIn published LiGNN (large-scale heterogeneous GNN training; arXiv:2402.11139) and LinkSAGE (GNN for job matching with nearline encoder inference feeding DNN rankers; arXiv:2402.13430). The interview lesson is not "run a GNN in the request path" — it is: GNN encoders materialize embeddings; the hot path remains retrieval + neural rank in the tens of milliseconds, with graph signal entering as dense features in the ranker, not as an online full-graph solve for every list load.
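To make the "embeddings as features" pattern concrete, here is a hedged sketch of the request-path side: the ranker only does a key lookup into a table written by a nearline job, and falls back to a zero vector plus a missing-indicator when the entry is absent or stale. The store layout, `EMBED_DIM`, `MAX_AGE_S`, and both function names are hypothetical; the papers do not prescribe this interface.

```python
EMBED_DIM = 4          # assumed embedding width, kept tiny for readability
MAX_AGE_S = 6 * 3600   # assumed freshness budget for a nearline embedding

def gnn_features(store, key, now):
    """Look up a precomputed embedding; return embedding + [1.0] if fresh, zeros + [0.0] otherwise."""
    entry = store.get(key)
    if entry is None or now - entry[1] > MAX_AGE_S:
        return [0.0] * EMBED_DIM + [0.0]
    return list(entry[0]) + [1.0]

def ranker_input(member_id, job_id, other_features, store, now):
    """Concatenate ordinary ranker features with member-side and job-side graph features."""
    return (
        list(other_features)
        + gnn_features(store, ("member", member_id), now)
        + gnn_features(store, ("job", job_id), now)
    )

# Hypothetical nearline table: key -> (embedding, written_at in epoch seconds).
store = {("member", 7): ([0.1, 0.2, 0.3, 0.4], 1000.0)}
fresh = ranker_input(7, 99, [1.0], store, now=1060.0)
stale = ranker_input(7, 99, [1.0], store, now=1000.0 + MAX_AGE_S + 1)
```

No graph traversal happens at request time; the cost of the GNN is paid by the nearline writer, and the ranker sees an explicit signal when graph features are unavailable.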
What Interviewers Are Evaluating
Mid-level: Two-tower, skills overlap, in-batch negatives, NDCG.
Senior: FAISS/HNSW retrieval, PIT joins, position bias mitigation (IPW or randomized buckets), eligibility gating, job inventory staleness.
Staff: Nearline GNN embedding tables with freshness SLO; two-sided A/B with interference awareness; EEO / fairness stratification for hiring surfaces; explicit reference to LinkSAGE transfer pattern.
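The IPW option named at the senior level can be sketched as a weighted log loss. This is one common simplification rather than a canonical recipe: clicks are up-weighted by the inverse of an examination propensity for their display position, non-clicks keep weight 1, and weights are clipped to limit variance. The `PROPENSITY` table and `MAX_WEIGHT` are assumed values; in practice propensities would be estimated, for example from randomized buckets.

```python
import math

PROPENSITY = {1: 1.0, 2: 0.5, 3: 0.25}  # assumed examination propensity per position
MAX_WEIGHT = 3.0                        # assumed clip to bound variance

def ipw_weight(position, clicked):
    """Inverse-propensity weight for a clicked row; unclicked rows are left at 1.0."""
    if not clicked:
        return 1.0
    return min(1.0 / PROPENSITY[position], MAX_WEIGHT)

def weighted_log_loss(rows):
    """rows: iterable of (predicted_probability, clicked, position); returns weight-normalised log loss."""
    num = 0.0
    den = 0.0
    for p, clicked, position in rows:
        w = ipw_weight(position, clicked)
        y = 1.0 if clicked else 0.0
        num += -w * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
        den += w
    return num / den

loss = weighted_log_loss([(0.5, True, 2), (0.5, False, 1)])
```

A click at position 3 counts for more than a click at position 1 because the member was less likely to have looked there at all.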