High-Level Design
Distributed systems the way FAANG interviewers ask about them: latency budgets, capacity math, sharding strategies, and failure modes, all grounded in real system blueprints.
guides
Design a URL Shortener (TinyURL)
End-to-end design of a URL shortening service handling 10 billion stored URLs, 100K redirects/sec, and sub-5ms p99 redirect latency. Covers ID generation, read-heavy caching, multi-region replication, and async analytics.
HLD Interview Framework: CIRCLE Method
A battle-tested 6-step framework for HLD interviews with timing guidance, back-of-envelope estimation references, and the exact questions to ask at each step.
How to Approach an HLD System Design Interview
The pre-game mindset, signal management, and communication strategy for HLD interviews. Covers what interviewers evaluate, time budgeting, recovery patterns, and the failure modes that cause strong engineers to underperform.
How to Design at HLD: From Blank Whiteboard to Defensible Architecture
The mechanical playbook for HLD interview execution. Covers the API-first design rule, capacity-driven technology selection, the five-layer reference architecture, sharding decision tree, caching strategy, and the patterns FAANG candidates use to deliver consistently strong system designs.
Caching: Strategy, Redis Internals & Distributed Patterns
Caching is the most tested topic in HLD interviews. Master cache write strategies, Redis data structures and their time complexities, distributed cache topology, the cache invalidation problem, and how to design multi-level caching architectures for real systems.
CDN: Edge Caching, Push vs Pull, and Invalidation at Global Scale
CDNs cut global latency from hundreds of milliseconds to tens, offload 80-99% of origin bandwidth, and shield origins from bursty traffic. Master push vs pull, cache hierarchy, invalidation, and the failure modes that break CDN-backed systems.
Cloud Services Architecture for System Design Interviews
How to choose AWS, GCP, and Azure managed services in system design interviews with clear tradeoffs. Covers compute, storage, messaging, networking, identity, and serverless-vs-container decisions tied to latency, reliability, and cost.
Distributed Queue Design: Ordering, Retries, and Throughput
Design production-grade distributed message queue systems covering Kafka vs SQS vs RabbitMQ tradeoffs, delivery semantics, consumer group patterns, backpressure, and dead-letter queues. The system design interview's most-tested async primitive — master it to handle any event-driven architecture question.
Load Balancer Design: L4/L7 Routing, Health Checks, and Failover
Design scalable load balancing for modern distributed systems. Covers L4 vs L7 tradeoffs, routing algorithms (round-robin, least-connections, P2C, consistent hashing), health check design, connection draining, sticky sessions, and global load balancing with GeoDNS and Anycast. Builds the mental model interviewers use to assess system design maturity.
Rate Limiter Design: Token Bucket, Sliding Window, and Distributed Enforcement
Design distributed rate limiters for APIs and gateways. Covers the tradeoffs among five algorithms (token bucket, leaky bucket, fixed window, sliding window log, sliding window counter), Redis data structure choices, hot-key mitigation, race conditions, and multi-region consistency. One of the most frequently asked HLD fundamentals at FAANG.
Design a Chat System (WhatsApp)
End-to-end design of a real-time encrypted messaging platform serving 2 billion users and 100 billion messages per day. Covers WebSocket connection management, the Signal Protocol for E2EE, store-and-forward offline delivery, and group message fan-out.
Design Google Docs (Real-Time Collaborative Editing)
System design deep-dive for real-time collaborative editing at 1B+ scale. Covers OT vs CRDTs, why Google Docs uses OT with a centralized server, WebSocket sync, offline editing with Yjs, and the tombstone problem that limits CRDT scalability.
Design a File Storage and Sync System (Dropbox / Google Drive)
End-to-end system design for a Dropbox-scale file storage and sync platform serving 500M users and 2.5EB of data. Covers content-addressable chunking, metadata-blob separation, the sync protocol, shared-folder consistency, version history, and the garbage-collection edge cases that interviewers target.
Design Google Maps (Routing, ETA & Map Tile Serving)
Deep-dive system design for Google Maps covering map tile serving at CDN scale, Contraction Hierarchies-based routing on a 36M-node road graph, and real-time ETA prediction from 100M GPS probe devices — with production numbers, algorithm tradeoffs, and what senior engineers get wrong.
News Feed Architecture: Fan-Out on Write vs Read, Hybrid Strategies & Feed Ranking
The fan-out problem is the central challenge behind every social feed — Twitter, Instagram, Facebook, LinkedIn. Master fan-out on write (push), fan-out on read (pull), the hybrid approach for celebrity accounts, feed storage in Redis, Cassandra schema design, and how feed ranking layers on top of chronological ordering.
Design a Multi-Channel Notification Service at Scale
End-to-end design of a notification platform delivering 1B notifications/day across push, email, and SMS. Covers the fan-out problem for broadcast sends, per-provider rate limiting, idempotency, retries, and compliance (GDPR/CAN-SPAM/TCPA) — the system design topics interviewers actually grill on.
Design a Payment System (Stripe-style Processor)
System design deep-dive for a merchant payment processor handling 10K TPS at peak with sub-500ms p99 latency. Covers PCI DSS compliance, idempotent charges, double-entry ledger accounting, fraud scoring, and exactly-once semantics under partial failure.
Design Proximity Services (Yelp / Google Places)
System design deep-dive for a geo-search platform serving 200M+ monthly users with sub-100ms nearby-business search across 50M+ geo-indexed points. Covers geohash/S2/H3 indexing tradeoffs, a read/write split at 100:1, sharding by geohash prefix, and ranking by distance, rating, and personalization signals.
Design Uber (Ride-Sharing Platform)
End-to-end design of a real-time ride-matching platform handling millions of simultaneous GPS location updates, sub-10-second driver matching, dynamic surge pricing, and global trip management across 70 countries.
Design Search Autocomplete / Typeahead (Google-scale)
System design deep-dive for a search typeahead service delivering sub-100ms top-K suggestions at 10B queries/day. Covers sharded tries, precomputed top-K at each node, multi-tier caching, real-time trending pipelines, horizontal scaling, and personalization.
Design a Stock Exchange (Order Book & Matching Engine)
System design for a stock exchange at NYSE/NASDAQ scale: 10B+ messages/day, sub-microsecond matching via LMAX Disruptor and FPGA, price-time priority order book, UDP multicast for market data, and why the matching engine is intentionally single-threaded for determinism and audit compliance.
Design an Event Ticketing System (BookMyShow / Ticketmaster)
System design deep-dive for a flash-sale ticketing platform at Ticketmaster scale — 600K+ concurrent users competing for 50K seats in seconds — covering seat reservation with TTL, oversell prevention via optimistic locking, virtual waiting rooms, WebSocket availability overlays, and idempotent payment flows.
Design Twitter/X Feed (News Feed)
End-to-end design of a social media news feed serving 200 million daily active users. The central design challenge is the fan-out problem: how to show each user a personalized timeline of tweets from all the accounts they follow, with sub-200ms latency at 180K reads/sec.
Design Netflix (Video Streaming)
End-to-end design of a global video streaming platform serving 250M subscribers across 190 countries, handling 25 Tbps of peak traffic. Covers video ingestion and encoding pipelines, the Open Connect CDN, adaptive bitrate streaming, and recommendation at scale.
Design a Web Crawler at Google Scale
End-to-end system design for a distributed web crawler at Google/Bing scale: 15B+ pages, ~6B refreshed daily, covering URL frontier design, OPIC-based priority scoring, politeness enforcement, SimHash near-duplicate detection, distributed sharding, and freshness-vs-depth tradeoffs most resources skip entirely.
API Gateway Design: Auth, Rate Limiting, Routing, and BFF Patterns
Production API gateway architecture covering Kong, Envoy, AWS API Gateway, and BFF patterns. Where to terminate TLS, validate JWTs, enforce rate limits, aggregate requests — and how to avoid turning the gateway into a distributed monolith.
Authentication & Authorization: JWT, OAuth 2.0, OIDC & Zanzibar ReBAC
Deep-dive on modern auth systems: AuthN vs AuthZ, session vs JWT tradeoffs, OAuth 2.0 flows (Authorization Code + PKCE, Client Credentials, Device), OIDC identity tokens, RBAC vs ABAC vs Google Zanzibar ReBAC, JWT revocation, key rotation, and WebAuthn passkeys for FAANG system design interviews.
Consensus Protocols: Raft vs Paxos, Leader Election & Log Replication
Distributed consensus is what makes ZooKeeper, etcd, and CockroachDB correct. Master Raft's three-phase algorithm (leader election → log replication → commit), why Raft is simpler than Paxos, the split-brain scenario, Byzantine fault tolerance, and where consensus appears in real systems — Kafka, Kubernetes, and Google Spanner.
Consistency Models: Strong vs Eventual vs Causal, Linearizability, CRDTs & CAP Theorem
Consistency models define what values a distributed read can return after a write. Master linearizability (strong consistency), sequential consistency, causal consistency, eventual consistency, and read-your-writes — with practical implications for database selection, microservice design, and the CAP theorem's real-world limitations.
Containers and Kubernetes for System Design Interviews
A production-focused guide to Docker and Kubernetes for backend system design interviews. Covers pod scheduling, Deployment vs StatefulSet decisions, autoscaling, service mesh tradeoffs, and failure handling with concrete latency and reliability constraints.
Data Partitioning & Sharding: Consistent Hashing, Range Sharding & Hotspot Elimination
Sharding is what makes databases scale beyond a single machine. Master horizontal vs vertical partitioning, range sharding, hash sharding, consistent hashing with virtual nodes, hotspot detection, and resharding strategies — with real numbers from Cassandra, DynamoDB, and YouTube's architecture.
Data Warehouse Architecture: Columnar Storage, MPP, and Lakehouse Tradeoffs
A practical system design guide to modern warehouse architecture using Snowflake, BigQuery, and Redshift patterns. Covers columnar storage internals, MPP execution, partitioning and clustering strategy, and warehouse-vs-lakehouse design choices.
Databases: Sharding, Indexing & Replication
Database engineering for large-scale systems — the most tested HLD sub-topic after caching. Covers B-Tree and LSM-Tree storage engines, indexing strategies (covering, composite, partial), sharding strategies and hotspot handling, replication (sync vs async, leader election), and how to choose between SQL and NoSQL in system design interviews.
Distributed Locks: Redlock, ZooKeeper, Fencing Tokens & Exactly-Once Guarantees
Distributed locks are the mechanism behind inventory reservations, payment deduplication, and leader election. Master Redis SET NX EX, the Redlock algorithm and its controversy, ZooKeeper ephemeral znodes, fencing tokens for safe expiry, and when distributed locks are the wrong tool entirely.
Distributed Systems Patterns
The 8 core distributed systems patterns every senior engineer must know: consistent hashing, CAP theorem, saga pattern, CQRS, event sourcing, two-phase commit, gossip protocol, and leader election.
Distributed Transactions: 2PC, Saga Pattern, and Compensating Transactions
Distributed transactions coordinate state changes across multiple services or databases. Two-Phase Commit (2PC) provides strong consistency but sacrifices availability. The Saga pattern achieves eventual consistency through compensating transactions — the approach used by Uber, Amazon, and Stripe for multi-step business workflows. This guide covers both models, their failure modes, and when each applies.
Event Sourcing and CQRS: Audit Logs, Temporal Queries, and Read/Write Separation
Event sourcing stores state as an immutable event log rather than mutable rows. CQRS separates the write and read models so each can scale independently. Together they power audit logs, temporal queries, and high-read systems at Stripe and Microsoft. Covers failure modes and the projection rebuild problem most candidates miss.
Message Queues & Streaming: Kafka, Delivery Semantics, and Consumer Groups
Async messaging is the backbone of every scalable architecture. Master the queue-vs-log distinction, Kafka partitioning and consumer groups, delivery guarantees, ordering semantics, and the production failure modes that separate staff engineers from the crowd.
Microservices Architecture: Decomposition, Service Mesh, Circuit Breakers & Saga Pattern
When and how to decompose a monolith into microservices — and the distributed systems complexity that follows. Covers domain-driven decomposition, the strangler fig migration pattern, service discovery, API gateway, circuit breakers (Hystrix/Resilience4j), service mesh (Istio), and the saga pattern for distributed transactions.
Observability: Metrics, Distributed Tracing, Structured Logging & SLO Design
Observability is what separates systems that are operated from systems that are debugged by guessing. Master the three pillars (metrics, logs, traces), SLI/SLO/SLA design, error budgets, structured logging, OpenTelemetry for distributed tracing, and the on-call runbook pattern. Includes the staff-level synthesis: designing observability before coding the system.
Search Internals: Inverted Index, TF-IDF, Elasticsearch Architecture & Relevance Ranking
Full-text search sits behind the search box in most applications. Master the inverted index data structure, TF-IDF relevance scoring, BM25 (the modern standard), Elasticsearch's distributed shard architecture, the query execution pipeline, and the tradeoffs between exact-match, fuzzy, and semantic search.
Stream Processing Systems: Flink, Kafka Streams, Windows, and Exactly-Once
A system design deep dive on real-time stream processing architecture. Learn how to choose Flink vs Kafka Streams, design windowed aggregations, handle late data, and implement exactly-once semantics with production-grade failure recovery.