ML System Design: Job Recommendation (LinkedIn-Style Marketplace)

Design a production job recommendation system for a professional marketplace, end to end. The guide covers cold start and short-lived job postings; the LinkSAGE / LiGNN pattern, in which **nearline** GNN embeddings enter a low-latency two-tower ranker as **features** (rather than running a full GNN on every list load); position bias and delayed labels (apply → hire); eligibility hard filters; and two-sided evaluation with employer quality guardrails. Citations: LiGNN and LinkSAGE (arXiv 2024) for large-scale job graph learning at LinkedIn.

58 min read · 3 sections · 1 interview question

Tags: Job Recommendation, Two-Tower, Graph Neural Networks, LinkSAGE, LiGNN, FAISS, HNSW, Position Bias, Learning to Rank, Point-in-Time, Cold Start, Economic Graph, Fairness, Delayed Labels

Why This Is Not ItemCF on Movie IDs

Postings are short-lived and sparse. A job ID may receive only a handful of interactions before the posting closes or is filled. Pure collaborative filtering on posting ID starves: the posting is often gone before its ID has accumulated enough signal to learn from.
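One common way around ID starvation is to represent a posting by its content rather than its ID. The sketch below is illustrative only: the skill vocabulary, the 3-dimensional vectors, and the `embed` and `cosine` helpers are all hypothetical, and a real system would learn these vectors in a tower rather than hard-code them. It shows that a brand-new posting with zero interactions still gets a usable vector.

```python
from math import sqrt

# Hypothetical hand-written skill vectors; a real system would learn these.
SKILL_VECTORS = {
    "python": [1.0, 0.0, 0.0],
    "sql": [0.0, 1.0, 0.0],
    "spark": [0.5, 0.5, 0.0],
    "nursing": [0.0, 0.0, 1.0],
}


def embed(skills):
    """Average the vectors of known skills; unknown skills are skipped."""
    known = [SKILL_VECTORS[s] for s in skills if s in SKILL_VECTORS]
    if not known:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


member = embed(["python", "sql"])
new_job = embed(["spark", "sql"])      # posted seconds ago, no interactions
unrelated_job = embed(["nursing"])

# The fresh posting is still comparable to the member through shared skills.
assert cosine(member, new_job) > cosine(member, unrelated_job)
```

The design choice is that nothing in the scoring path depends on the posting having history, so the posting is rankable from the moment it is created.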

The marketplace is two-sided. Clicks, applies, and recruiter replies measure different things. A model that maximizes click alone optimizes for sensational titles; one that maximizes apply alone can favor one-click spam applications that waste recruiter time. Production teams define a funnel and quality guardrails.
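A funnel with a quality guardrail can be sketched as a launch check: a treatment must raise the apply rate without letting the recruiter reply rate per apply fall by more than a tolerance. Everything here is an assumption for illustration, including the `ArmMetrics` fields, the `passes_guardrail` name, and the 0.02 tolerance; a real decision would also need a significance test, which this sketch omits.

```python
from dataclasses import dataclass


@dataclass
class ArmMetrics:
    """Aggregate counts for one experiment arm (hypothetical schema)."""
    impressions: int
    applies: int
    recruiter_replies: int

    @property
    def apply_rate(self):
        return self.applies / self.impressions

    @property
    def reply_rate(self):
        # Recruiter replies per apply: a proxy for application quality.
        return self.recruiter_replies / self.applies if self.applies else 0.0


def passes_guardrail(control, treatment, max_reply_rate_drop=0.02):
    """Require more applies AND no material drop in reply rate per apply."""
    lifts_applies = treatment.apply_rate > control.apply_rate
    keeps_quality = treatment.reply_rate >= control.reply_rate - max_reply_rate_drop
    return lifts_applies and keeps_quality


control = ArmMetrics(impressions=10_000, applies=500, recruiter_replies=100)
spammy = ArmMetrics(impressions=10_000, applies=800, recruiter_replies=96)
healthy = ArmMetrics(impressions=10_000, applies=550, recruiter_replies=112)
```

In this toy data the `spammy` arm raises applies by 60% but its reply rate per apply falls from 0.20 to 0.12, so it is rejected; the `healthy` arm raises applies modestly while holding quality, so it passes.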

Labels are delayed and structurally missing. Hire and onsite outcomes can arrive weeks or months after the impression. Training only on short-horizon clicks without inverse propensity correction or multi-task structure systematically misaligns the model with employer value: you think you optimized applies; you may have increased low-quality volume.
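As a rough illustration of inverse propensity correction, the sketch below up-weights clicks by the inverse of an estimated examination propensity for the position where the job was shown. The `PROPENSITY` table, the clipping floor, and the choice to leave non-clicks at weight 1 are simplifying assumptions, not a description of any production system; in practice propensities are estimated, for example from randomized traffic.

```python
from math import log

# Hypothetical probability that a member examines each list position.
PROPENSITY = {1: 1.0, 2: 0.6, 3: 0.4}


def ipw_log_loss(examples, min_propensity=0.1):
    """Mean log loss with clicks weighted by 1 / examination propensity.

    examples: list of (predicted_probability, clicked, position) tuples.
    Propensities are clipped at min_propensity to bound the weights.
    """
    total = 0.0
    for prob, clicked, position in examples:
        p = min(max(prob, 1e-7), 1.0 - 1e-7)  # guard against log(0)
        if clicked:
            propensity = max(PROPENSITY.get(position, min_propensity), min_propensity)
            total += -log(p) / propensity
        else:
            # Simplification: non-clicks keep weight 1.
            total += -log(1.0 - p)
    return total / len(examples)


# The same prediction error costs more for a click seen at position 3,
# because that click was less likely to be observed at all.
top_click = ipw_log_loss([(0.5, 1, 1)])
deep_click = ipw_log_loss([(0.5, 1, 3)])
assert deep_click > top_click
```

Clipping trades a small amount of bias for lower variance: without the floor, a single click at a rarely examined position could dominate a batch.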

The industrial graph insight (LiGNN, LinkSAGE). LinkedIn published LiGNN (large-scale heterogeneous GNN training; arXiv:2402.11139) and LinkSAGE (GNN for job matching with nearline encoder inference feeding DNN rankers; arXiv:2402.13430). The interview lesson is not "run a GNN in the request path" — it is: GNN encoders materialize embeddings; the hot path remains retrieval + neural rank in the tens of milliseconds, with graph signal entering as dense features in the ranker, not as an online full-graph solve for every list load.
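The split between nearline encoding and the hot path can be sketched as a key-value lookup. In this illustration a nearline job (not shown) writes GNN embeddings into a table, and the request path only reads them and derives dense features for the ranker. The `EmbeddingTable` class, the `graph_features` function, the six-hour freshness SLO, and the policy of treating stale rows as missing are all assumptions made for the example, not details from the LiGNN or LinkSAGE papers.

```python
FRESHNESS_SLO_SECONDS = 6 * 3600  # hypothetical freshness SLO


class EmbeddingTable:
    """In-memory stand-in for a nearline-populated embedding store."""

    def __init__(self):
        self._rows = {}  # entity_id -> (vector, written_at_seconds)

    def put(self, entity_id, vector, written_at):
        self._rows[entity_id] = (vector, written_at)

    def get(self, entity_id, now):
        """Return the vector, or None if absent or older than the SLO."""
        row = self._rows.get(entity_id)
        if row is None:
            return None
        vector, written_at = row
        if now - written_at > FRESHNESS_SLO_SECONDS:
            return None
        return vector


def graph_features(table, member_id, job_id, now):
    """Dense ranker features from precomputed embeddings; no GNN call here."""
    member_vec = table.get(member_id, now)
    job_vec = table.get(job_id, now)
    if member_vec is None or job_vec is None:
        # Let the ranker fall back on its other features.
        return {"gnn_dot": 0.0, "gnn_missing": 1.0}
    dot = sum(m * j for m, j in zip(member_vec, job_vec))
    return {"gnn_dot": dot, "gnn_missing": 0.0}


table = EmbeddingTable()
table.put("member:1", [0.5, 0.5], written_at=0)
table.put("job:9", [1.0, 0.0], written_at=0)
fresh = graph_features(table, "member:1", "job:9", now=100)
stale = graph_features(table, "member:1", "job:9", now=FRESHNESS_SLO_SECONDS + 1)
```

The request path does two dictionary reads and a dot product, which is what keeps it within a millisecond-scale budget; the explicit `gnn_missing` flag lets the ranker learn how much to trust its remaining features when graph signal is unavailable.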

IMPORTANT

What Interviewers Are Evaluating

Mid-level: Two-tower, skills overlap, in-batch negatives, NDCG.

Senior: FAISS/HNSW retrieval, PIT joins, position bias mitigation (IPW or randomized buckets), eligibility gating, job inventory staleness.

Staff: Nearline GNN embedding tables with freshness SLO; two-sided A/B with interference awareness; EEO / fairness stratification for hiring surfaces; explicit reference to LinkSAGE transfer pattern.
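The point-in-time (PIT) join listed at the senior level is the mechanism that keeps future information out of training features: each impression is joined to the feature value that was known at impression time, not the latest one. A minimal sketch, assuming feature history is kept as a timestamp-sorted list; the `point_in_time_lookup` name and the sample data are hypothetical.

```python
from bisect import bisect_right


def point_in_time_lookup(history, event_time):
    """Return the latest feature value written at or before event_time.

    history: list of (timestamp, value) pairs sorted by timestamp ascending.
    Returns None when no value existed yet at event_time.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, event_time)
    return history[idx - 1][1] if idx > 0 else None


# Hypothetical history of a job's cumulative apply count.
job_apply_count = [(10, 3), (20, 7), (30, 12)]

# An impression logged at t=25 must see 7, not the later value 12.
assert point_in_time_lookup(job_apply_count, 25) == 7
```

This version includes a value written exactly at `event_time`. If feature writes can lag the events they describe, a stricter cutoff (`bisect_left`, or subtracting a safety margin from `event_time`) is the more conservative choice.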
