How to Approach an HLD System Design Interview
The pre-game mindset, signal management, and communication strategy for HLD interviews. Covers what interviewers evaluate, time budgeting, recovery patterns, and the failure modes that cause strong engineers to underperform on system design interviews.
What This Page Is (and Isn't)
This page is the pre-game: how to think about an HLD interview before you draw a single box. The companion page, How to Design at HLD, covers the execution: turning a blank whiteboard into a defensible architecture with concrete patterns and templates.
The reason for the split is that strong engineers fail HLD interviews not because they lack technical depth — they have shipped real distributed systems — but because they make the wrong moves at the meta level: they design before clarifying, propose technologies without naming the constraint that forced the choice, or run out of time on a deep dive while leaving the failure analysis untouched.
Read this page to understand the signals interviewers grade on, the traps that catch senior candidates, and the recovery patterns when you realize at minute 25 that you missed something critical. Then read the design page for the mechanical playbook.
The asymmetric truth: in 45 minutes, the interviewer cannot evaluate your full distributed-systems knowledge. They sample your judgment and your communication. A candidate who designs only a partial system but makes every decision defensibly outperforms a candidate who designs a complete system but cannot articulate why each piece is there.
The Five Signals Interviewers Actually Score
Every FAANG HLD rubric is some variation of these five signals. Memorize them — they tell you what to optimize at every moment of the interview:
- Structured thinking under ambiguity — do you ask requirements before designing? Do you confirm scope before going deep?
- Trade-off reasoning — every choice must be framed as "I picked X over Y because <metric>; if <condition> changed, I'd reconsider."
- Quantified estimation — can you derive QPS, storage, and bandwidth from product assumptions, then use those numbers to justify architecture?
- Failure mode awareness — for each component, can you say what breaks it, how it's detected, and how the system recovers?
- Communication — do you narrate decisions out loud so the interviewer can score your reasoning, not just your conclusions?
Junior candidates miss signals 1 and 4. Senior candidates lose on 2 and 5 — they know the answers but never make their reasoning visible. Staff+ candidates win on 2 and 3 — they treat numbers as the language of system design.
The 7 Mindset Rules for HLD Interviews
Rule 1 — Treat ambiguity as a signal, not a bug
The vague prompt 'design Twitter' is deliberate. The interviewer is testing whether you ask scope questions or fill in assumptions silently. Silent assumptions = senior-level failure. Always say: 'Before I design, I want to confirm a few things — is the feed chronological or ranked? Are we including DMs in scope? What's the user count we're sizing for?'
Rule 2 — Numbers come before architecture
100 QPS and 100K QPS are different systems even for the same product. Estimate read QPS, write QPS, and storage first. Every technology choice downstream is justified by these numbers. A candidate who says 'use Cassandra' without naming the write rate is guessing; one who says '50K writes/sec, leaderless multi-region → Cassandra' is reasoning.
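To make Rule 2 concrete, here is the kind of arithmetic you should be able to do in under a minute. All product numbers below are illustrative assumptions, not figures for any real system:

```python
# Back-of-envelope sizing for a hypothetical feed product.
# Every input here is an assumption you would state out loud.

DAU = 100_000_000              # 100M daily active users (assumed)
reads_per_user_per_day = 50    # feed loads, profile views, etc.
writes_per_user_per_day = 2    # posts, likes that persist
avg_post_bytes = 1_000         # ~1 KB per post including metadata
SECONDS_PER_DAY = 86_400

read_qps = DAU * reads_per_user_per_day / SECONDS_PER_DAY
write_qps = DAU * writes_per_user_per_day / SECONDS_PER_DAY
peak_read_qps = read_qps * 3   # rough peak-to-average factor

storage_per_day_gb = DAU * writes_per_user_per_day * avg_post_bytes / 1e9
storage_5y_tb = storage_per_day_gb * 365 * 5 / 1e3

print(f"avg read QPS:   {read_qps:,.0f}")        # ~58K
print(f"avg write QPS:  {write_qps:,.0f}")       # ~2.3K
print(f"peak read QPS:  {peak_read_qps:,.0f}")
print(f"storage/day:    {storage_per_day_gb:,.0f} GB")
print(f"5-year storage: {storage_5y_tb:,.0f} TB")
```

Notice how the output immediately shapes the design: ~58K average read QPS with a 25:1 read/write ratio points at a read-optimized path (caching, fan-out-on-write), and those are exactly the sentences the interviewer wants to hear.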
Rule 3 — Drive the conversation; do not wait to be led
Interviewers expect senior candidates to propose the agenda. Say at minute 5: 'Now that requirements are clear, I'll spend ~10 min on rough design, then deep dive on the feed pipeline since that's the most interesting part — does that work?' This signals ownership.
Rule 4 — Every component decision = 1 metric + 1 alternative
Anchor each choice: the metric that forced it (write rate, latency budget, consistency requirement) and the alternative you ruled out (and why). 'Redis here for <1ms reads. Memcached works but I want sorted sets for the leaderboard view.'
Rule 5 — Pick 2-3 deep dives, not 5
Your deep dives are where depth is graded. Two thorough dives on hard parts (celebrity fan-out, cache invalidation) outperform five shallow dives. Decline the trap: 'I could go deep on the CDN, but I think the timeline service is the more interesting design — happy to switch if you'd rather.'
Rule 6 — Surface failure modes unprompted
Don't wait for 'what about reliability?' — proactively say 'before I move on, the failure mode for this component is X; the detection signal is Y; recovery takes Z seconds.' This is the L5+ signal that distinguishes you from candidates who design only the happy path.
Rule 7 — Watch the clock; abandon perfection
At minute 35 of a 45-minute interview, stop adding components. Wrap up with explicit prioritization: 'Given the time, here's what I'd cover next if we had 15 more minutes — failover for the primary DB, the analytics pipeline, and rate limiting at the gateway.' This signals you understand engineering tradeoffs in real time.
The 45-Minute HLD Interview Timeline
A rough budget, assembled from the rules above: minutes 0-5 — requirements and scope questions; minutes 5-10 — back-of-envelope estimation; minutes 10-20 — rough end-to-end design; minutes 20-35 — two or three deep dives on the hard parts; minutes 35-45 — failure modes, wrap-up, and explicit prioritization of what you'd cover next.
Anti-Patterns That Lose Points (and the Senior Fix)
| Anti-pattern | Why it costs points | Senior fix |
|---|---|---|
| Drawing a diagram in the first 2 minutes | Signals you skip requirements gathering — explicit L5+ red flag | Spend the first 5 min asking scope and confirming assumptions out loud |
| Listing 'I'd use Redis, Kafka, Cassandra, Spark...' upfront | Pattern-matching, not reasoning — interviewers downgrade for vocabulary dumps | Introduce each technology only when a specific metric forces the choice |
| Designing for 1M users when the prompt implies 1B | Fundamental mis-sizing — every downstream decision is wrong | Anchor scale early: 'I'll size for 100M DAU; tell me if you'd rather a different scale' |
| Skipping back-of-envelope to 'save time' | Without numbers, every choice looks arbitrary | Numbers ARE the time-saver — they let you pre-empt 80% of follow-up questions |
| Deep-diving on whatever you know best | Often the easiest, least differentiating component (CDN, load balancer) | Pick the genuinely hard part — what would actually be argued about in a real design review |
| Saying 'add a cache' without strategy | Half-answer: doesn't address invalidation, fallback, hit rate | Always cover: write strategy, TTL, eviction, miss path, failure path, target hit rate |
| Treating failure modes as an afterthought | Reads as 'designs only the happy path' | Surface failure modes per component as you introduce them, unprompted |
| Going silent during whiteboarding | Interviewer can't grade what they can't hear | Narrate every decision: 'I see two options here — A and B. Picking B because...' |
| Refusing to abandon a component when steered | Reads as inflexible, defensive | 'Happy to pivot — let me close the loop on this point first, then I'll switch to X' |
| Building 'just in case' over-engineered systems | Reads as junior — assumes everything must scale to billions | Explicitly state: 'For 100K DAU this is overkill — I'd start with a single Postgres and revisit at 10x scale' |
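The cache row above can be turned into a literal checklist. A minimal sketch, with field names of my own invention, that treats an incomplete cache proposal as not yet an answer:

```python
from dataclasses import dataclass

@dataclass
class CacheDecision:
    """Everything 'add a cache' should carry, per the anti-pattern table."""
    write_strategy: str    # e.g. "write-through" vs "write-behind"
    ttl_seconds: int       # how long entries stay valid
    eviction: str          # e.g. "LRU"
    miss_path: str         # where a miss falls through to
    failure_path: str      # behavior when the cache tier is down
    target_hit_rate: float # 0..1

    def is_complete(self) -> bool:
        """True only if every dimension of the decision is specified."""
        return all([self.write_strategy, self.ttl_seconds > 0, self.eviction,
                    self.miss_path, self.failure_path,
                    0 < self.target_hit_rate <= 1])

# "Add a Redis cache" stated the senior way:
feed_cache = CacheDecision(
    write_strategy="write-through on post creation",
    ttl_seconds=300,
    eviction="LRU",
    miss_path="fall through to the timeline service, then backfill",
    failure_path="serve from DB with degraded latency; no hard dependency",
    target_hit_rate=0.95,
)
print(feed_cache.is_complete())
```

The point is not the code itself but the shape of the answer: six fields, and if you cannot fill one in, that gap is exactly what the interviewer will probe.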
The Most Expensive Mistake — Silent Assumptions
The single highest-leverage failure mode in HLD interviews: making an assumption in your head and never voicing it.
Example: the interviewer says 'design a chat system.' You silently assume 1:1 chat (no group chat). You design a clean per-user inbox. The interviewer asks at minute 30: 'how do groups of 10K members work?' Your design now needs major revisions — fan-out is wrong, message storage is wrong, indexing is wrong.
Cost: 5-10 minutes of recovery, plus the perception that you didn't ask in the first place.
Fix: state every assumption out loud, even ones that feel obvious. 'I'm assuming 1:1 chat for now — should I include group chat with up to 10K members?' takes 4 seconds. The cost of asking and being told 'yes, 1:1 only' is zero. The cost of not asking and being wrong is the entire interview.
Recovery Patterns When Things Go Wrong
When you realize at minute 25 you missed a requirement
Acknowledge cleanly: 'Stepping back — I should have asked about <X> at the start. Let me confirm: <question>?' Then walk through what changes in the design. This recovers more credit than pretending you accounted for it.
When the interviewer says 'what about <topic>' and you haven't thought about it
Buy 30 seconds of thinking time without going silent: 'Good call — let me think through that out loud.' Then work through it systematically. Silence reads as panic; structured reasoning under pressure reads as senior.
When you propose a technology and the interviewer pushes back
Do not defend; reason. Say 'Fair point — let me reconsider.' Walk through the tradeoff again. Often the interviewer is testing whether you'll dig in defensively or reason openly — the right move is the latter.
When the interviewer wants you to deep dive on a component you barely sketched
Take 60 seconds to establish the contract first: 'Before I dive in, the interface for this is <X>; latency budget <Y>; throughput <Z>. Now the internals...' This shows you don't deep dive without context.
When you're 5 minutes from the end with major gaps
Don't try to cover everything — explicit prioritization is itself the signal: 'In the remaining 5 min, I'll do a fast pass on failure modes since that's most differentiating; the analytics and CDN I'd cover next if we had more time.'
When you give a wrong answer and realize it 30 seconds later
Self-correct out loud: 'Actually, what I just said about <X> isn't right — the correct way is <Y>.' Self-correction is a positive signal; it shows reflection. Pretending it didn't happen is far worse.
The Decision-Making Loop You Should Run on Every Choice
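The loop's shape follows from Rules 2, 4, and 6: metric first, then choice, then the rejected alternative, the condition that would flip the decision, and the failure mode. A hedged sketch — the `DecisionRecord` fields and step names are my own illustration, not an official rubric:

```python
from dataclasses import dataclass

@dataclass
class DecisionRecord:
    """One run of the loop, captured so it can be narrated out loud."""
    component: str
    driving_metric: str     # Rule 2: the number that forces the choice
    choice: str
    alternative: str        # Rule 4: what you ruled out
    why_not_alternative: str
    flip_condition: str     # what change would make you reconsider
    failure_mode: str       # Rule 6: what breaks, how detected, how recovered

    def narrate(self) -> str:
        """Render the record as the sentence you actually say (signal 5)."""
        return (f"For {self.component}: {self.driving_metric} -> {self.choice}. "
                f"I considered {self.alternative} but {self.why_not_alternative}. "
                f"If {self.flip_condition}, I'd reconsider. "
                f"Failure mode: {self.failure_mode}.")

rec = DecisionRecord(
    component="write-path datastore",
    driving_metric="50K writes/sec, multi-region",
    choice="Cassandra (leaderless replication)",
    alternative="sharded Postgres",
    why_not_alternative="cross-region write latency and manual resharding cost",
    flip_condition="write rate dropped 10x or we needed strict transactions",
    failure_mode="replica lag; detected via repair backlog; fixed by read repair",
)
print(rec.narrate())
```

Run this loop for every component you place on the board; a decision you cannot fill all seven fields for is a decision you are not ready to defend.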
What Different Interview Levels Actually Test
| Level | Primary signal | How to demonstrate |
|---|---|---|
| L4 / Mid (E4) | Can you build a working system from a clear spec? | Get the rough design right; reasonable technology choices; some tradeoff awareness |
| L5 / Senior (E5) | Can you make defensible decisions under ambiguity? | Quantified tradeoffs · proactive failure modes · clean communication · drives the conversation |
| L6 / Staff (E6) | Can you identify the highest-leverage architectural decision and reason about it? | Names the *one* decision the rest of the system depends on · debates the alternative seriously · discusses long-term evolution |
| L7 / Senior Staff (E7) | Can you connect technical choices to business and org constraints? | Discusses cost · team operability · data sovereignty · multi-year migration paths · alignment with broader infra strategy |
How to Practice (and What to Practice)
The wrong practice: solving 50 system design problems shallowly. The right practice: solving 5-10 problems deeply — with a friend playing interviewer, on a real whiteboard, with a 45-minute timer, narrating every decision.
What to drill specifically:
- Estimation reflexes: pick a handful of product scales (e.g., 1M DAU, 100M DAU, 1B DAU) and have the QPS, storage, and bandwidth math automatic. The math is trivial; the speed of recall is what creates space for design thinking.
- The 7-step decision loop: practice running it out loud for any choice (DB, cache, queue, replication strategy). Run it on real systems you've worked with, not just interview problems.
- Failure mode coverage: for the 8 most common components (LB, gateway, service, cache, primary DB, replica, queue, object store), memorize the failure mode + detection + recovery for each. This becomes a checklist you can run mentally.
- Recovery patterns: deliberately practice recovering from interviews that are going badly — your interview partner should ambush you with hard questions at minute 25.
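The failure-mode drill above can be kept as a literal lookup table you rehearse until recall is instant. A sketch with deliberately compressed, illustrative entries — your own answers should be more specific to the systems you've run:

```python
# Failure-mode drill: component -> (what breaks, how it's detected, how it recovers).
# Entries are illustrative one-liners, not a definitive catalogue.
FAILURE_DRILL = {
    "load balancer": ("instance dies", "health-check failures",
                      "DNS/anycast failover to a standby"),
    "gateway": ("thread-pool exhaustion", "p99 latency + 5xx rate",
                "shed load, then autoscale"),
    "service": ("bad deploy", "error-rate canary alarms",
                "roll back, drain traffic"),
    "cache": ("cold cache / eviction storm", "hit-rate drop",
              "request coalescing, staged warmup"),
    "primary DB": ("primary crash", "replication heartbeat loss",
                   "promote a replica, fence the old primary"),
    "replica": ("replication lag", "lag metric over threshold",
                "route reads to primary or a fresher replica"),
    "queue": ("consumer backlog", "queue depth / oldest-message age",
              "scale consumers, apply backpressure"),
    "object store": ("region outage", "elevated SDK error rate",
                     "fail over to a secondary region/bucket"),
}

for component, (breaks, detect, recover) in FAILURE_DRILL.items():
    print(f"{component}: breaks when {breaks}; "
          f"detected via {detect}; recover by {recover}")
```

Eight components, three facts each: twenty-four one-liners that cover the failure-mode signal for almost any HLD prompt.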
What NOT to over-practice: memorizing specific architectures (Twitter, Instagram, Uber). Interviewers expect you to derive an architecture from requirements, not regurgitate one. A candidate who has memorized Twitter's architecture will fail when asked to design something slightly different.