High-Level Design · Intermediate

Design a URL Shortener (TinyURL)

End-to-end design of a URL shortening service handling 10 billion stored URLs, 100K redirects/sec, and sub-5ms p99 redirect latency. Covers ID generation, read-heavy caching, multi-region replication, and async analytics.

40 min read · 12 sections · 6 interview questions
Tags: Caching · Hashing · Base62 · NoSQL · CDN · Kafka · Analytics · Read-Heavy

Problem Statement

A URL shortener converts a long URL (e.g. https://example.com/very/long/path?query=params) into a short 7-character alias (e.g. tiny.url/aB3xY9z) that redirects to the original.

Three things make this hard at scale: (1) generating unique IDs without coordination across regions, (2) serving 100K+ redirects/sec under Zipf-skewed hot-set load with sub-5ms latency, and (3) collecting per-click analytics without coupling them to the critical redirect path.

Why the redirect is the interview-critical path: reads (redirects) outnumber writes (URL creation) by roughly 100:1. The system is overwhelmingly read-dominated, and the read path must hit p99 < 5ms on cache hits (p99 < 15ms on misses) to feel seamless. A user clicking a short link has no tolerance for a loading screen; even 200ms of added latency is noticeable. This immediately tells you the architecture must be cache-first for reads.

The 301 vs 302 decision is a canonical trade-off question: HTTP 301 (Moved Permanently) signals to browsers that the redirect is permanent; they cache it locally and future clicks never hit your server. This is optimal for latency but destroys analytics (you never see the repeat traffic). HTTP 302 (Found) forces the browser back to your server on every click: analytics work, but you give up browser-side caching. The right answer: 302 if analytics are a business requirement (as in most production URL shorteners), 301 if latency minimization and infrastructure cost are the priority.
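To make the trade-off concrete, here is a minimal framework-free sketch of the redirect handler; `resolver` and `analytics` are stand-ins for the cache lookup and the Kafka producer, not production APIs:

```python
def redirect(short_code: str, resolver, analytics: list, permanent: bool = False):
    """Return (status, headers) for a redirect request.

    302 keeps every click on the server so `analytics` sees it; with 301
    the browser caches the mapping and repeat clicks never reach this code.
    """
    long_url = resolver(short_code)
    if long_url is None:
        return 404, {}
    analytics.append({"code": short_code})  # stand-in for a Kafka publish
    if permanent:
        return 301, {"Location": long_url}
    # no-store discourages intermediaries from caching the temporary redirect
    return 302, {"Location": long_url, "Cache-Control": "no-store"}

# Usage:
clicks = []
status, headers = redirect(
    "aB3xY9z", {"aB3xY9z": "https://example.com/long"}.get, clicks
)
```

With `permanent=False` (the analytics-friendly default) every resolved click lands in the event queue; unknown codes short-circuit to 404 without emitting an event.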

The ID generation problem is the staff-level depth question: at 1,000 creates/sec across multiple regions, a single global auto-increment counter becomes a coordination bottleneck. Candidates who propose MySQL AUTO_INCREMENT are missing this. The production solution requires a distributed ID generation strategy — Snowflake-style IDs, per-region counter ranges, or hash-based codes with collision detection.

Functional Requirements

01

Short URL Creation

POST /urls — accept a long URL, return a unique 7-character short code. Optionally support custom aliases, expiration TTLs (1d / 7d / 30d / 1y / never), and bulk creation up to 1,000 URLs per request.

02

Redirect

GET /{shortCode} — look up the long URL and return HTTP 302 (not 301 — 302 prevents browser caching so every click hits the analytics layer).

03

Analytics

Track total clicks, geographic distribution, referrer, and device type per short code. Dashboard freshness target: < 5 seconds from click to dashboard.

04

Expiration & Deletion

Expired URLs must stop resolving. Owners can delete their URLs via DELETE /urls/{shortCode}. Expired rows are reaped automatically by TTL.

Non-Functional Requirements

Requirement → Target
Redirect latency (cache hit, server-side) → p50 < 2ms / p99 < 5ms
Redirect latency (cache miss, server-side) → p50 < 8ms / p99 < 15ms
Create latency → p50 < 20ms / p99 < 50ms
Redirect throughput → 100K/sec sustained globally
Availability → 99.99% (≤ 52 min downtime/year)
Short code length → 7 characters (Base62 = 3.5 trillion combinations)
Analytics freshness → < 5s from click to dashboard
URL durability → Zero data loss for non-expired URLs
NOTE

Scale Numbers

10 billion URLs stored. 1,000 creates/sec (~86M/day). 100,000 redirects/sec (~8.6B/day). 100:1 read-to-write ratio. 200 bytes average long URL. 7-character short code. Storage: ~350 bytes/row × 10B rows × RF=3 per DC × 3 DCs ≈ 31.5 TB total replicated across all regions.

High-Level Architecture

[Diagram: high-level architecture]

Short Code Generation Strategy

At 1K creates/sec across 3 regions (~333/sec per region), a single global auto-increment counter is a coordination bottleneck — every create requires a round-trip to the global counter authority. At higher volumes, this becomes a hard limit on write throughput.

The production solution: per-region counter ranges. Each region claims a range of 100,000 IDs atomically via ScyllaDB Lightweight Transactions (LWT), then mints codes locally with zero cross-region coordination. Region A gets IDs 1–100,000, Region B gets 100,001–200,000. When a region exhausts its range, it claims the next block. At 333/sec per region, each block lasts 5 minutes. The LWT claim is the only cross-region operation in the write path — everything else is local.

Why sequential IDs are a security problem: if ID 10,001 maps to aB3xY9z, a malicious actor can enumerate all short codes by iterating IDs. A Feistel cipher shuffle on the numeric ID before Base62 encoding breaks sequential predictability: ID 10,001 → shuffled 7,823,019 → xK9mP2q. The Feistel transform is reversible (for internal ID lookup) but appears random to external observers. This is the same idea behind format-preserving encryption, used for example in payment-card tokenization.

Base62 math: 62 characters (a–z + A–Z + 0–9) raised to the 7th power gives 62⁷ ≈ 3.5 trillion unique codes. At 1,000 creates/sec, exhausting the namespace takes ~111 years. 6-character codes give 62⁶ ≈ 56.8 billion, which the same rate burns through in ~1.8 years. 5 characters (62⁵ ≈ 916M) would last only ~10 days. 7 characters is the right length for a system designed to last decades.
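The namespace-lifetime arithmetic is worth verifying rather than memorizing; a quick check at the stated 1,000 creates/sec:

```python
CREATES_PER_SEC = 1_000
SECONDS_PER_YEAR = 365 * 24 * 3600

def years_to_exhaust(code_length: int, base: int = 62) -> float:
    """Years until every code of the given length has been handed out."""
    return base ** code_length / CREATES_PER_SEC / SECONDS_PER_YEAR

lifetimes = {n: years_to_exhaust(n) for n in (5, 6, 7)}
# 5 chars: ~10 days; 6 chars: ~1.8 years; 7 chars: ~111 years
```

Running the same function at a projected higher create rate (say 10K/sec after growth) is a one-line change, which is why showing the formula beats quoting the number in an interview.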

The hash-based alternative: MD5(long_url) → take the first 7 Base62 characters. Stateless — no counter required, no coordination. The problem is collisions: by the birthday bound, a table of just 1M URLs already has a ~13% chance of containing a collision, and at 1B URLs collisions are a statistical certainty (~140,000 expected colliding pairs). Every create therefore needs collision detection: a table read plus retry logic. The counter-based approach has zero collisions by construction and simpler code — counters are strictly preferred for high-volume production systems.

Base62 Encoding

Short code generation (Python)
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
BASE = 62
CODE_LENGTH = 7
MASK_21 = (1 << 21) - 1

def to_base62(n: int) -> str:
    """Encode an integer to a 7-character Base62 string."""
    chars = []
    while n:
        chars.append(ALPHABET[n % BASE])
        n //= BASE
    # Pad with ALPHABET[0] (value 0), not the digit '0' (value 52),
    # so padded codes still decode back to the original integer.
    return "".join(reversed(chars)).rjust(CODE_LENGTH, ALPHABET[0])

def feistel_shuffle(n: int, rounds: int = 4, key: int = 0xDEADBEEF) -> int:
    """Obscure sequential IDs so short codes don't look guessable."""
    def mix(r: int) -> int:
        # Deterministic round function; Python's built-in hash() is
        # seed-randomized per process and would change codes on restart.
        return ((r ^ key) * 0x9E3779B1) & MASK_21
    L, R = (n >> 21) & MASK_21, n & MASK_21  # split 42 bits
    for _ in range(rounds):
        L, R = R, L ^ mix(R)
    return (L << 21) | R

# Usage:
# regional_counter = fetch_next_from_scylla_range()
# short_code = to_base62(feistel_shuffle(regional_counter))
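Because a balanced Feistel network is inverted by replaying its rounds in reverse, internal services can map a short code back to the raw counter ID. A self-contained round-trip sketch (it substitutes a fixed multiplicative mix for Python's `hash()`, which is seed-randomized per process; note that 62⁷ ≈ 3.52T is slightly below 2⁴² ≈ 4.40T, so production code would keep counters below 62⁷ or cycle-walk to guarantee 7-character output):

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
MASK_21 = (1 << 21) - 1

def _mix(r: int, key: int) -> int:
    # Deterministic keyed round function over 21 bits.
    return ((r ^ key) * 0x9E3779B1) & MASK_21

def shuffle(n: int, rounds: int = 4, key: int = 0xDEAD) -> int:
    L, R = (n >> 21) & MASK_21, n & MASK_21
    for _ in range(rounds):
        L, R = R, L ^ _mix(R, key)
    return (L << 21) | R

def unshuffle(n: int, rounds: int = 4, key: int = 0xDEAD) -> int:
    # Each step undoes one forward round: recover (L, R) from (R, L ^ F(R)).
    L, R = (n >> 21) & MASK_21, n & MASK_21
    for _ in range(rounds):
        L, R = R ^ _mix(L, key), L
    return (L << 21) | R

def to_base62(n: int) -> str:
    chars = []
    while n:
        chars.append(ALPHABET[n % 62])
        n //= 62
    return "".join(reversed(chars)).rjust(7, ALPHABET[0])  # pad with value-0 char

def from_base62(s: str) -> int:
    n = 0
    for c in s:
        n = n * 62 + ALPHABET.index(c)
    return n
```

The full pipeline (counter → shuffle → encode) and its inverse (decode → unshuffle → counter) compose to the identity, which is what makes internal lookups by raw ID possible while external codes stay opaque.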

Read Path (Redirect Flow)

[Diagram: redirect read path]
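The read path reduces to a cache-aside lookup. A minimal sketch with dict-backed stand-ins for Valkey and ScyllaDB (TTL and negative caching omitted):

```python
class Metrics:
    """Tiny hit/miss counter standing in for real instrumentation."""
    def __init__(self):
        self.hits = 0
        self.misses = 0

def resolve(short_code: str, cache: dict, db: dict, metrics: Metrics):
    """Check Valkey first; fall back to ScyllaDB on a miss and backfill."""
    url = cache.get(short_code)
    if url is not None:
        metrics.hits += 1
        return url
    metrics.misses += 1
    url = db.get(short_code)        # LOCAL_QUORUM read in production
    if url is not None:
        cache[short_code] = url     # backfill; real code sets a 24h TTL
    return url
```

The first lookup of a code misses and backfills; every subsequent lookup is served from memory, which is how a Zipf-shaped hot set turns into a 95%+ hit rate.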

Database Schema (ScyllaDB / CQL)

ScyllaDB schema (CQL)
CREATE KEYSPACE url_shortener
WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'us-east': 3, 'eu-central': 3, 'ap-northeast': 3
} AND durable_writes = true;

-- Primary URL mapping — immutable after create; TTL auto-expires rows
CREATE TABLE url_shortener.urls (
    short_code      text PRIMARY KEY,
    long_url        text,
    user_id         uuid,
    created_at      timestamp,
    expires_at      timestamp,
    is_custom_alias boolean,
    region_id       tinyint,
    metadata        map<text, text>
) WITH gc_grace_seconds = 864000   -- 10 days, > repair interval
  AND compaction = {
      'class': 'TimeWindowCompactionStrategy',
      'compaction_window_unit': 'DAYS',
      'compaction_window_size': 30
  };

-- Counter range allocation (per-region, LWT-protected)
CREATE TABLE url_shortener.counter_ranges (
    region_id   tinyint PRIMARY KEY,
    next_start  bigint,
    range_size  int
);

-- Custom alias uniqueness guard
CREATE TABLE url_shortener.custom_aliases (
    alias       text PRIMARY KEY,
    short_code  text,
    created_at  timestamp
);
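On the create path, the API's expiration options map onto Scylla's row TTL via USING TTL (TTL 0 means no expiry in CQL). A sketch of that translation; the statement text and helper name are illustrative:

```python
TTL_SECONDS = {
    "1d": 86_400,
    "7d": 7 * 86_400,
    "30d": 30 * 86_400,
    "1y": 365 * 86_400,
    "never": 0,  # CQL: TTL 0 disables expiry for the row
}

# Illustrative prepared-statement text for the create path.
INSERT_CQL = (
    "INSERT INTO url_shortener.urls "
    "(short_code, long_url, user_id, created_at, expires_at) "
    "VALUES (?, ?, ?, ?, ?) USING TTL ?"
)

def ttl_for(option: str) -> int:
    """Resolve an API expiration option to a row TTL in seconds."""
    if option not in TTL_SECONDS:
        raise ValueError(f"unsupported expiration: {option}")
    return TTL_SECONDS[option]
```

Letting the database reap expired rows (together with gc_grace_seconds and TWCS from the schema above) means no application-level cleanup job is needed for the common case.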

Storage Technology Choices

Store · Technology · Why
Primary URL store · ScyllaDB (Cassandra-compatible) · Shard-per-core → sub-ms p99 reads. Native TTL + TWCS for expiration. Active-active multi-region with NetworkTopologyStrategy.
Hot URL cache · Valkey 8 Cluster · In-memory LRU. 150M URLs × ~400B per entry (value + key + overhead) ≈ 60GB per region. 24h TTL. Sub-ms lookups absorb ~80% of origin traffic.
Analytics store · ClickHouse · Columnar. ~10× compression on click events. Sub-second aggregations over billions of rows. Partitioned by day.
Event bus · Kafka · Durable click-event log. Decouples the redirect path from analytics. Flink consumes in 5-second micro-batches.
Edge cache · CDN (CloudFront/Cloudflare) · 60s TTL serves viral URLs (~25K/sec globally) at the edge, so they never reach origin servers.
⚠ WARNING

Zipf Distribution — The Key Assumption

The entire caching strategy depends on Zipf-shaped URL popularity: the top 1–2% of URLs account for ~80% of redirects. A 60GB Valkey cluster covering 150M hot URLs achieves a 95%+ cache hit rate under this distribution. If traffic were uniform (all 10B URLs equally popular), you'd need 5TB of cache — not feasible. Always validate your traffic distribution assumption before sizing the cache.
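One quick way to sanity-check the assumption before sizing the cache: under a pure Zipf with exponent s = 1, the share of traffic hitting the top k of n items is the ratio of harmonic numbers, H(k)/H(n) ≈ (ln k + γ)/(ln n + γ). A sketch of that estimate (s = 1 exactly gives ~82% coverage for 150M of 10B items; sustained 95%+ hit rates imply a steeper exponent or the CDN absorbing the extreme head, so validate against real traces):

```python
import math

EULER_GAMMA = 0.5772156649

def zipf_top_k_coverage(k: int, n: int) -> float:
    """Approximate share of requests hitting the k hottest of n items
    under Zipf(s=1), using H(m) ≈ ln m + γ for large m."""
    return (math.log(k) + EULER_GAMMA) / (math.log(n) + EULER_GAMMA)

hot_set_share = zipf_top_k_coverage(150_000_000, 10_000_000_000)
```

Because coverage grows only logarithmically in k at s = 1, doubling the cache buys just a few points of hit rate — the exponent of the real traffic distribution, not raw cache size, dominates the outcome.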

IMPORTANT

Failure Modes & Mitigations

  • Valkey node failure: Cluster mode with a replica per shard. On primary loss, that shard's hash slots fail over to the promoted replica; the brief window of extra misses is absorbed by ScyllaDB.
  • ScyllaDB region failure: Active-active multi-region. Other regions serve traffic with LOCAL_QUORUM. Hinted handoff queues writes until the failed region recovers.
  • ID collision on custom aliases: LWT (Paxos-backed) INSERT IF NOT EXISTS on the custom_aliases table guarantees uniqueness. Race conditions return 409 Conflict to the caller.
  • Kafka lag spike: Flink consumer applies backpressure. Analytics freshness degrades gracefully — redirects are never affected. Alert at > 30s lag, page at > 5 min lag.
  • CDN stale URL: 302 (not 301) means browsers don't cache redirects. CDN 60s TTL means at most 60s of stale redirect after a URL is deleted. Acceptable for the SLA.
