HLD Interview Framework: CIRCLE Method
A battle-tested 6-step framework for HLD interviews with timing guidance, back-of-envelope estimation references, and the exact questions to ask at each step.
What Interviewers Actually Evaluate
HLD interviews at FAANG evaluate 5 signals:
- Structured thinking under ambiguity — do you ask requirements before designing?
- Trade-off reasoning — can you explain "I chose X over Y because..."?
- Back-of-envelope estimation — can you derive real numbers that drive design choices?
- Failure mode awareness — what happens when each component fails?
- Communication — do you drive the conversation clearly and concisely?
Understanding what interviewers are testing changes how you answer. The question "Design Twitter" is not a test of whether you know Twitter's architecture. It's a test of whether you can navigate ambiguity, make decisions under incomplete information, and articulate trade-offs coherently — all signals for how you'll perform in real engineering discussions.
What fails at the senior level: jumping straight to a component diagram without asking clarifying questions. Interviewers at L5+ are explicitly watching for this. A candidate who starts drawing boxes within the first 2 minutes of the question has already signaled a lack of structured process. The first 5 minutes should be requirements gathering and scale estimation — even if you "know" the answer already.
The asymmetry of mistakes: proposing the wrong technology can be corrected mid-interview if you explain your reasoning. Proposing a solution without explaining your reasoning signals that you can't think out loud under pressure — a much harder signal to recover from. Every design decision must be accompanied by the explicit trade-off: "I chose Cassandra here over MySQL because the write volume (100K/sec) and the query pattern (key-value lookup with no joins) favor wide-column storage — if the query patterns were more relational, I'd reconsider."
Scale estimates drive design, not the other way around: 100 QPS is a very different system from 100K QPS, even for the same problem. A system that needs to store 1MB/user for 1M users (1TB total) fits on a single server. The same system at 1B users (1PB) requires distributed object storage. Your estimation must happen before you draw any architecture — it determines every technology choice downstream.
CIRCLE — The 6-Step Framework
C — Clarify Requirements (5 min)
Never assume. Ask: What are the core features? Who are the users and where are they located? What scale are we designing for (users, requests/day)? Is the workload read-heavy or write-heavy? Do we need strong consistency, or is eventual consistency acceptable — i.e., where do we sit on consistency vs availability? What are the SLAs?
I — Identify Scale with Estimation (5 min)
Calculate: QPS (requests per second), storage needs (GB/TB/PB), bandwidth. Use these numbers to drive decisions: 'At 10,000 QPS a single DB won't cut it, so we need read replicas.' Numbers make abstract problems concrete.
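To make the arithmetic concrete, here is a minimal Python sketch; the peak factor and bytes-per-record defaults are assumptions you would state out loud, not fixed constants:

```python
# Back-of-envelope helpers for the "Identify Scale" step.
SECONDS_PER_DAY = 86_400

def qps(requests_per_day: float, peak_factor: float = 3.0) -> tuple[float, float]:
    """Return (average QPS, peak QPS) for a daily request count."""
    avg = requests_per_day / SECONDS_PER_DAY
    return avg, avg * peak_factor

def storage_tb(records_per_day: float, bytes_per_record: float, years: float) -> float:
    """Total storage in TB after `years` of retention."""
    return records_per_day * bytes_per_record * 365 * years / 1e12

avg, peak = qps(100_000_000)                        # 100M requests/day
print(f"avg {avg:,.0f} QPS, peak {peak:,.0f} QPS")  # ~1,157 avg, ~3,472 peak
print(f"{storage_tb(100_000_000, 1_000, 3):,.0f} TB")  # ~110 TB at 1 KB/record over 3 years
```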
R — Rough Design: API + Data Model + Flow (10 min)
Define: Key APIs (REST endpoints or event interfaces). Core data model (what entities, what fields, rough schema). High-level request flow — draw boxes and arrows. This is the skeleton; details come later.
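As an illustration (the endpoints and fields below are hypothetical, not a prescribed API), the skeleton for a photo-sharing service might look like:

```python
# Rough-design skeleton for a hypothetical photo-sharing service.
from dataclasses import dataclass
from datetime import datetime

# Key APIs (REST):
#   POST /v1/photos                           -> upload, returns photo_id
#   GET  /v1/photos/{photo_id}                -> metadata + CDN URL
#   GET  /v1/users/{user_id}/feed?cursor=...  -> paginated feed

@dataclass
class Photo:
    photo_id: str        # globally unique, e.g. Snowflake-style ID
    user_id: str         # owner; also a natural shard key later
    caption: str
    blob_url: str        # object-storage key, served via CDN
    created_at: datetime

# Request flow, in words:
# client -> API gateway -> app server -> object storage (photo blob)
#                                     -> metadata DB (Photo row)
```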
C — Core Components Deep Dive (15 min)
Pick 2-3 most interesting/challenging components and deep-dive. Be guided by the interviewer. Common deep dives: Caching strategy, Database choice and sharding, Message queue design, CDN usage. Always explain WHY you chose each technology.
L — Latency & Bottleneck Analysis (10 min)
Where are the hot paths? What are the slowest operations? Apply solutions: CDN for static content, Redis for hot reads, connection pooling for DB, async queues for slow writes. Estimate the improvement each optimization makes.
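The cache-aside read path is the most common of these to sketch. A minimal version with redis-py, where the key format, TTL, and db_fetch stub are placeholders:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def db_fetch(user_id: str) -> dict:
    # stand-in for the real database query
    return {"user_id": user_id, "name": "example"}

def get_user_profile(user_id: str, ttl_seconds: int = 300) -> dict:
    key = f"user:{user_id}:profile"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)        # cache hit: sub-millisecond path
    profile = db_fetch(user_id)          # cache miss: 1-5 ms indexed DB read
    r.set(key, json.dumps(profile), ex=ttl_seconds)   # populate with a TTL
    return profile
```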
E — Edge Cases & Failure Modes (5 min)
What happens when: cache goes down? DB primary fails? A service crashes? Network partition occurs? Discuss: retry with backoff, circuit breakers, fallback strategies, data consistency during failures.
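Retry with backoff is worth being able to sketch on demand. A minimal version with full jitter; the attempt count and base delay are illustrative:

```python
import random
import time

def call_with_retry(fn, max_attempts: int = 5, base_delay: float = 0.1):
    """Call fn(), retrying transient failures with exponential backoff + jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise                    # attempts exhausted: surface the failure
            # full jitter: sleep somewhere in [0, base * 2^attempt]
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```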
Back-of-Envelope Estimation Cheat Sheet
| Metric | Rule of Thumb |
|---|---|
| 1M requests/day | ≈ 12 QPS |
| 100M requests/day | ≈ 1,200 QPS |
| 1B requests/day | ≈ 12,000 QPS |
| Peak factor | 3-5x average QPS |
| 1 KB per record, 1B records | = 1 TB storage |
| Average tweet | 280 bytes text + metadata ≈ 1 KB |
| Read from memory | 0.1 ms |
| Read from SSD | 1 ms |
| Network round trip (same region) | 1-5 ms |
| Network round trip (cross-region) | 30-150 ms |
| Single DB server | handles ~1,000-5,000 QPS |
| Cache hit rate target | > 90% |
Database Selection Guide
| Use Case | Best Choice | Why |
|---|---|---|
| User profiles, relationships | PostgreSQL / MySQL | Strong consistency, rich queries, ACID |
| Session cache, leaderboards | Redis | In-memory, sub-ms latency, sorted sets |
| Time-series (metrics, logs) | Cassandra / InfluxDB | Write-optimized, excellent time-range queries |
| Document store (flexible schema) | MongoDB | Schemaless, horizontal scale |
| Graph relationships | Neo4j / Amazon Neptune | Native graph traversal, relationship-heavy queries |
| Full-text search | Elasticsearch | Inverted index, relevance scoring |
| Real-time location/geo queries | Redis with GEO | GEORADIUS commands, in-memory speed |
| Blob storage (images, video) | S3 / GCS | Cheap, durable, CDN-ready |
[Diagram: Generic Production Architecture — The 5 Layers]
The Most Common Mistake
Jumping to solutions before clarifying requirements. Interviewers deliberately leave requirements ambiguous to see if you ask. The first 5 minutes of clarification often change the entire design. For example, asking 'is this read-heavy?' might reveal that 90% of traffic is reads → you need aggressive caching, not write optimization.
[Diagram: CIRCLE Framework — HLD Interview Flow]
Full Back-of-Envelope: Design Instagram (Worked Example)
Scale assumptions
1B total users, 100M DAU. 50M photos uploaded/day. 500M photo views/day. Average photo = 3MB original, 200KB compressed. Average request = 400 bytes.
Write QPS
50M uploads/day ÷ 86400 seconds = 580 uploads/sec. Peak = 3× average = 1,740 uploads/sec. At 200KB each: 1,740 × 200KB = 348MB/sec upload bandwidth.
Read QPS
500M photo views/day ÷ 86400 = 5,800 reads/sec. Peak = 3× = 17,400 reads/sec. At 200KB each: 17,400 × 200KB = 3.48GB/sec. A CDN must serve this — a single server can't deliver 3.5GB/sec.
Storage
New photos: 50M/day × 200KB = 10TB/day. Over 3 years: 10TB/day × 365 × 3 ≈ 11PB. Storing 3 sizes (original, medium, thumbnail) ≈ 30PB in 3 years. S3 cost: 30PB × $0.023/GB/month ≈ $700K/month for storage alone.
Database
Metadata (photo_id, user_id, caption, timestamp): 50M rows/day × 365 × 3 ≈ 55B rows at 500 bytes each ≈ 27TB. PostgreSQL handles up to ~10TB with good tuning → sharding is needed after year 1. Shard by user_id: start with 10 shards, ~2.7TB each by year 3.
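A hash-mod router is the usual starting point for 'shard by user_id'. This is a sketch only; production systems often use consistent hashing or a directory service so shards can be added without remapping every key:

```python
import hashlib

NUM_SHARDS = 10

def shard_for(user_id: str) -> int:
    # stable hash (Python's built-in hash() is randomized per process)
    digest = hashlib.md5(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

print(shard_for("user_42"))   # the same user always routes to the same shard
```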
CDN sizing
Provision for the 3.5GB/sec peak, with headroom toward 10GB/sec for viral events. CloudFront at ~$0.085/GB in the US: billing conservatively at the peak rate around the clock, 3.5GB/sec × 86400 × 30 days ≈ 9PB/month → ~$765K/month CDN cost. Cache hit rate target: 95% (popular photos served from CDN, not origin). With a 95% hit rate, origin servers see 5% × 17,400 reads/sec = 870 reads/sec — a single server handles this easily.
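The whole estimate reduces to a few lines of arithmetic, which is handy for sanity-checking the numbers above (inputs mirror the stated assumptions):

```python
DAY = 86_400
uploads_per_day, views_per_day = 50e6, 500e6
photo_bytes = 200e3                               # 200 KB compressed

write_qps = uploads_per_day / DAY                 # ~580 uploads/sec
peak_read_qps = 3 * views_per_day / DAY           # ~17,400 reads/sec
peak_egress = peak_read_qps * photo_bytes / 1e9   # ~3.5 GB/sec

one_size_pb = uploads_per_day * photo_bytes * 365 * 3 / 1e15   # ~11 PB over 3 years
storage_cost = 3 * one_size_pb * 1e6 * 0.023      # 3 sizes at $0.023/GB/mo -> ~$0.75M/mo
cdn_cost = peak_egress * DAY * 30 * 0.085         # ~9 PB/mo at $0.085/GB -> ~$765K/mo
origin_qps = 0.05 * peak_read_qps                 # 870 reads/sec at a 95% CDN hit rate
```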
Latency Numbers Every Engineer Should Know (2025)
| Operation | Latency | Notes |
|---|---|---|
| L1 cache access | ~0.5 ns | CPU registers + L1 cache |
| L2 cache access | ~7 ns | |
| L3 cache access | ~40 ns | |
| Main memory (RAM) read | ~100 ns = 0.1 μs | 200× slower than L1 |
| SSD sequential read (4KB) | ~150 μs = 0.15 ms | |
| HDD read (seek + rotation) | ~5-10 ms | 50,000× slower than RAM |
| Network: same datacenter | ~0.5 ms RTT | |
| Network: US cross-country | ~40 ms RTT | |
| Network: US to Europe | ~80 ms RTT | |
| Network: US to Asia-Pacific | ~150 ms RTT | |
| Redis GET | ~0.5–1 ms | Network + hash lookup |
| PostgreSQL query (indexed) | ~1–5 ms | SSD read + query planning |
| PostgreSQL query (seq scan, large table) | ~100ms–seconds | Avoid in hot paths |
| Object storage (S3 GET) | ~50–200 ms | Varies by region |
| HTTP request to external API | ~50–500 ms | Includes DNS + TLS |
Interview Scenario: Design a Global Notification System
Problem: send push notifications to 500M mobile users. 10M notifications/day normally, spikes to 100M during major events (sports scores, breaking news).
Scale: 100M/day peak = 1,160/sec average, 10,000/sec peak. Each notification = 500 bytes, so peak bandwidth = 10,000/sec × 500 bytes = 5MB/sec.
Architecture: (1) API Gateway + notification service: accepts notification requests, validates, enriches with user preferences (do not disturb, language). Writes to Kafka topic 'notifications'. Does NOT send directly (decoupling).
(2) Kafka cluster: buffer for notification bursts. If push provider is slow (Apple APNS/Google FCM), the Kafka topic accumulates and is drained at a steady rate. Topic partition by user_id for ordering guarantees.
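Keyed production is what provides the per-user ordering. A sketch with the kafka-python client; the broker address and payload shape are placeholders:

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    key_serializer=lambda k: k.encode(),
    value_serializer=lambda v: json.dumps(v).encode(),
)

def enqueue_notification(user_id: str, payload: dict) -> None:
    # same key -> same partition -> ordered delivery per user
    producer.send("notifications", key=user_id, value=payload)
```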
(3) Fan-out service: reads from Kafka, looks up user's device tokens from Redis/DynamoDB (< 5ms lookup). Routes to correct push provider. Handles retry with exponential backoff.
(4) Device token store: Redis Hash per user. user:{user_id}:devices → hash of {device_id: push_token}. DynamoDB for persistence. Redis for hot cache.
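A sketch of the token store with redis-py, using the key format described above (the DynamoDB write-through for persistence is omitted):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def register_device(user_id: str, device_id: str, push_token: str) -> None:
    r.hset(f"user:{user_id}:devices", device_id, push_token)

def device_tokens(user_id: str) -> dict[str, str]:
    # hot-path lookup the fan-out service performs (< 5 ms)
    return r.hgetall(f"user:{user_id}:devices")
```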
Scaling for 100M/event spike: Kafka handles the burst (consumer lag is acceptable). Fan-out service scales horizontally to 50 workers. Rate limit push provider calls (APNS: 1M/sec limit). Pre-warm connections to APNS/FCM.
Deduplication: notification_id in Redis SET with 24h TTL. If notification_id already in set, skip. Prevents duplicate sends during retry.
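Because Redis sets cannot expire individual members, one common way to implement this check is a per-ID key written with SET NX EX. A sketch:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

def should_send(notification_id: str) -> bool:
    # SET NX succeeds only for the first writer of this key;
    # EX gives the 24h TTL so the dedup window expires automatically
    first = r.set(f"dedup:{notification_id}", 1, nx=True, ex=24 * 3600)
    return bool(first)
```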
Key trade-offs to mention: delivery guarantees (at-most-once vs at-least-once), message expiry (don't send 'game just started' 6 hours later), respect user time zones.