ML System Design: Real-Time Bidding Optimization

Design a production DSP bidding system under 50ms — covering contextual bandits (UCB vs Thompson Sampling), budget pacing from PID controllers to RL, distributed budget state with token buckets, and ultra-low-latency hot-path engineering. Includes the auction bias problem with IPS correction, bid shading for first-price auctions, and failure mode analysis for production RTB systems.


RTB Is a Market, Not a Pipeline — Get the Framing Right

Every time you load an ad-supported webpage, an auction runs. Multiple advertisers compete for the right to show you an ad, the auction resolves, and the winner's creative renders, all before your browser finishes painting the DOM. The entire sequence typically must complete in around 100 milliseconds.

Real-Time Bidding (RTB) processes over 10 million auctions per second at peak across the industry. Google's AdX, Amazon's AAP, and The Trade Desk each operate at this scale individually.

Most candidates describe RTB as a pipeline. It is actually a market. The difference matters because markets have adversarial dynamics, information asymmetry, and equilibria that pipelines don't have.

The participants:

  • Publisher: Owns the ad inventory (webpage, app, video). Wants maximum revenue per impression.
  • SSP (Supply-Side Platform): Aggregates publisher inventory. Runs the auction.
  • Ad Exchange: Matching layer between SSPs and DSPs. OpenRTB protocol standardizes message format.
  • DSP (Demand-Side Platform): Represents advertisers. Receives bid requests, evaluates them, returns bids. This is where the ML system lives.
  • Advertiser: Has a campaign budget, target audience, and performance goal (clicks, conversions, awareness).
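
To make the DSP's role concrete, here is a minimal sketch of the message exchange it sits in. The field names (id, at, tmax, imp, bidfloor, seatbid, bid, impid, price) follow OpenRTB 2.x, but the request is heavily trimmed, every value is invented for illustration, and build_bid_response is a hypothetical helper rather than part of any library.

```python
from typing import Optional

# Hypothetical, heavily trimmed OpenRTB 2.x bid request as a DSP might receive it.
bid_request = {
    "id": "req-123",
    "at": 1,      # auction type: 1 = first price, 2 = second price plus
    "tmax": 120,  # max ms the exchange waits for bids, network time included
    "imp": [{"id": "1", "bidfloor": 0.50, "banner": {"w": 300, "h": 250}}],
    "site": {"domain": "example.com"},
    "device": {"ua": "Mozilla/5.0", "ip": "203.0.113.7"},
}


def build_bid_response(request: dict, price_cpm: float) -> Optional[dict]:
    """Return a minimal bid response, or None to signal a no-bid."""
    imp = request["imp"][0]
    if price_cpm < imp.get("bidfloor", 0.0):
        return None  # a bid below the floor cannot win, so do not send one
    return {
        "id": request["id"],  # must echo the request id
        "seatbid": [
            {"bid": [{"id": "bid-1", "impid": imp["id"], "price": price_cpm}]}
        ],
    }


response = build_bid_response(bid_request, price_cpm=1.25)
```

The price is expressed as a CPM even though the transaction is for a single impression, which is why the bidding logic in a DSP usually works in CPM units end to end.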

Two auction models you must know:

  • Second-price auction (Vickrey): Highest bidder wins, pays second-highest bid + ε. Honest bidding incentive: optimal strategy is to bid your true value. Standard for many years in programmatic advertising.
  • First-price auction: Highest bidder wins, pays exactly what they bid. Google and many exchanges migrated to first-price in 2019-2021. The optimal strategy changes significantly: bidding your true value leaves you zero surplus, so you must predict the highest competing bid and bid just above it, never exceeding your true value. This is the dominant model in 2026. Your bid optimization algorithm changes shape depending on which you're designing for.
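
The two payment rules can be compared side by side with a small sketch. The function name, the string labels for the auction type, and the epsilon of 0.01 are assumptions made for illustration; ties and price floors are ignored.

```python
def resolve_auction(bids, auction_type, epsilon=0.01):
    """Return (winner, price_paid) for a dict of bidder -> bid in CPM.

    auction_type is "first_price" or "second_price" (labels chosen here).
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, top_bid = ranked[0]
    if auction_type == "first_price":
        return winner, top_bid  # winner pays exactly what it bid
    # Second price: pay the runner-up's bid plus epsilon, capped at own bid.
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, min(top_bid, runner_up + epsilon)


bids = {"dsp_a": 5.0, "dsp_b": 3.0, "dsp_c": 1.0}
first = resolve_auction(bids, "first_price")
second = resolve_auction(bids, "second_price")
```

With the same bids, dsp_a wins either way, but under first-price it pays its full 5.0 while under second-price it pays just over 3.0. That gap is the surplus a first-price bidder tries to recover by lowering its bid.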

What Interviewers Are Evaluating at Each Level

Mid-level: Can you describe the SSP/DSP/Exchange architecture and the OpenRTB request/response flow? Do you understand CTR prediction as the core ML task? Can you name the 100ms latency constraint and explain what happens if you miss it?

Senior-level: Can you give the 50ms DSP budget broken down by operation? Do you understand the bandit framing for segment exploration? Can you describe at least one concrete budget pacing approach beyond 'use Redis'? Do you name Go or C++ as the serving language choice and explain why?

Staff-level: Can you describe auction bias and the IPS correction? Do you understand bid shading for first-price auctions and why it's necessary? Can you design the distributed budget state system? Do you analyze the feedback loop pathologies (budget oscillation, winner's curse) with specific interventions?

Clarifying Questions — Ask These First

01

What auction type?

Second-price vs first-price determines the entire bid optimization strategy. In second-price: bid your true value. In first-price: bid below your true value, with the size of the reduction set by the predicted clearing price (bid shading). Most production systems in 2026 operate on first-price auctions.
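
One common way to frame bid shading is as a search for the bid that maximizes expected surplus, (true value - bid) x P(win | bid). The sketch below assumes a logistic win-rate curve with made-up parameters; in a real system that curve would be a learned model, and the function names here are hypothetical.

```python
import math


def win_probability(bid, median_clearing=2.0, scale=0.5):
    """Hypothetical win-rate curve: logistic in the bid (CPM).

    The parameters are invented; production systems learn this from auction logs.
    """
    return 1.0 / (1.0 + math.exp(-(bid - median_clearing) / scale))


def shaded_bid(true_value, step=0.01):
    """Grid-search bids in [0, true_value] for the highest expected surplus."""
    best_bid, best_surplus = 0.0, 0.0
    n_steps = int(round(true_value / step))
    for i in range(n_steps + 1):
        bid = i * step
        surplus = (true_value - bid) * win_probability(bid)
        if surplus > best_surplus:
            best_bid, best_surplus = bid, surplus
    return best_bid


bid_for_value_4 = shaded_bid(4.0)
bid_for_value_3 = shaded_bid(3.0)
```

Under these assumed parameters the chosen bid sits well below the true value and rises as the true value rises, which is the qualitative behavior bid shading is meant to produce.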

02

What campaign objective?

CPC (cost per click), CPM (cost per thousand impressions), CPA (cost per acquisition), ROAS (return on ad spend). The objective determines what CTR/CVR model output you optimize. CPA is hardest — conversion signal is delayed 24-48 hours, creating training data lag.
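
As a rough sketch of how the objective changes what the models feed into the bid, each goal can be converted into an expected value per thousand impressions. The function and parameter names are hypothetical, and the formulas assume well-calibrated click and conversion probabilities.

```python
def value_per_mille(objective, goal, p_click=0.0, p_conv_given_click=0.0,
                    conv_value=0.0):
    """Expected value of an impression, expressed as a CPM.

    goal is the advertiser's target: a CPM, a CPC, a CPA, or a ROAS ratio.
    """
    if objective == "CPM":
        return goal
    if objective == "CPC":
        # Willing to pay `goal` per click, so value scales with P(click).
        return goal * p_click * 1000
    if objective == "CPA":
        # Value scales with P(click) * P(conversion | click).
        return goal * p_click * p_conv_given_click * 1000
    if objective == "ROAS":
        # goal = revenue per unit of spend; spend at most expected revenue / goal.
        expected_revenue = p_click * p_conv_given_click * conv_value
        return expected_revenue / goal * 1000
    raise ValueError(f"unknown objective: {objective}")


cpc_value = value_per_mille("CPC", goal=2.0, p_click=0.01)
cpa_value = value_per_mille("CPA", goal=50.0, p_click=0.01,
                            p_conv_given_click=0.02)
```

The CPA and ROAS branches multiply two model outputs, so calibration errors compound, and the conversion model is the one trained on delayed labels.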

03

What latency budget?

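Google AdX: 80ms total exchange window. OpenRTB: the exchange states the deadline per request in the tmax field, commonly around 120ms. Both figures include network round-trip time, so the DSP gets roughly 50ms of that for its own computation. This is not a soft limit: miss the window and your bid is ignored, budget goes unspent, and the campaign misses its goals.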
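
A minimal sketch of how a bidder might turn the exchange window into an internal compute budget and abandon work once it is spent. The 30ms network allowance, the 5ms safety margin, and all names are assumptions for illustration; the clock is injectable so the logic can be tested without real sleeps.

```python
import time


def dsp_compute_budget_ms(tmax_ms, network_rtt_ms=30.0, safety_margin_ms=5.0):
    """Exchange window minus assumed network time and a safety margin."""
    return max(0.0, tmax_ms - network_rtt_ms - safety_margin_ms)


class Deadline:
    """Tracks time remaining against a budget, using a monotonic clock."""

    def __init__(self, budget_ms, clock=time.monotonic):
        self._clock = clock
        self._expires = clock() + budget_ms / 1000.0

    def remaining_ms(self):
        return max(0.0, (self._expires - self._clock()) * 1000.0)

    def expired(self):
        return self.remaining_ms() <= 0.0


def evaluate(tmax_ms, stages, clock=time.monotonic):
    """Run stages in order; return None (no-bid) if the deadline passes."""
    deadline = Deadline(dsp_compute_budget_ms(tmax_ms), clock)
    result = None
    for stage in stages:
        if deadline.expired():
            return None  # a late bid is ignored anyway, so stop early
        result = stage(result)
    return None if deadline.expired() else result
```

With the assumed defaults, an 80ms window leaves 45ms of compute, close to the roughly 50ms figure above. Checking the deadline between stages keeps a slow feature lookup from consuming CPU on a bid that can no longer win.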

04

Single advertiser or multi-tenant?

A multi-tenant DSP serving hundreds of advertisers has a budget state management problem that a single-advertiser system doesn't. The distributed budget counter becomes a real problem only at scale, when many bidder nodes spend against many campaign budgets concurrently.
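
One way to keep budget checks off the hot path is for each bidder node to lease a chunk of budget from a shared store and spend from it locally. The sketch below simulates that in a single process for one campaign; CentralBudget stands in for whatever shared store a real system would use, and both class names are hypothetical.

```python
import threading


class CentralBudget:
    """Stand-in for a shared budget store; here just a locked counter."""

    def __init__(self, total):
        self._remaining = total
        self._lock = threading.Lock()

    def lease(self, amount):
        """Grant up to `amount`, never more than what is left."""
        with self._lock:
            granted = min(amount, self._remaining)
            self._remaining -= granted
            return granted

    @property
    def remaining(self):
        return self._remaining


class BidderNode:
    """Spends from a local lease and refills from the central store."""

    def __init__(self, central, chunk):
        self.central = central
        self.chunk = chunk
        self.local = 0.0

    def try_spend(self, cost):
        if self.local < cost:
            self.local += self.central.lease(self.chunk)
        if self.local < cost:
            return False  # budget exhausted: do not bid
        self.local -= cost
        return True


central = CentralBudget(total=10.0)
node_a, node_b = BidderNode(central, chunk=4.0), BidderNode(central, chunk=4.0)
wins = sum(node.try_spend(1.0) for _ in range(20) for node in (node_a, node_b))
```

Total spend can never exceed the central budget because every unit is leased before it is spent. The trade-off is that budget leased to an idle node is stranded until it is returned or expires, so chunk size becomes a tuning knob.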

05

What budget granularity?

Daily caps, hourly caps, or total campaign budgets — each creates different pacing challenges. Hourly caps are harder to smooth because the time window is short and traffic variance is high.
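
The effect of the window length shows up even in the simplest pacing scheme. The sketch below is a proportional-only controller (not a full PID loop) that adjusts the fraction of eligible auctions to bid on, assuming the target is to spend uniformly across the window; the function name and the gain of 0.5 are illustrative assumptions.

```python
def pacing_rate(spent, budget, elapsed_s, window_s, current_rate, gain=0.5):
    """Return the new participation rate in [0, 1].

    Compares actual spend with a uniform-spend target for this point in the
    window and nudges the rate in proportion to the error.
    """
    target_spend = budget * (elapsed_s / window_s)
    error = (target_spend - spent) / budget  # > 0 means underspending
    return min(1.0, max(0.0, current_rate + gain * error))


# Halfway through an hourly cap of 100, having already spent 80: slow down.
hourly = pacing_rate(spent=80.0, budget=100.0, elapsed_s=1800,
                     window_s=3600, current_rate=0.5)

# The same spend against a daily cap of 2400 at the same moment: speed up.
daily = pacing_rate(spent=80.0, budget=2400.0, elapsed_s=1800,
                    window_s=86400, current_rate=0.5)
```

With an hourly cap the same absolute overspend is a large fraction of the budget, so the controller reacts strongly and has little time left to correct, which is one way to see why short windows are harder to smooth.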
