System design for a stock exchange at NYSE/NASDAQ scale: 10B+ messages/day, sub-microsecond matching via LMAX Disruptor and FPGA, price-time priority order book, UDP multicast for market data, and why the matching engine is intentionally single-threaded for determinism and audit compliance.

Why Stock Exchanges Are the Hardest Latency Problem in Software

A stock exchange is one of the few systems in software engineering where latency is measured in nanoseconds and correctness is a legal requirement. NYSE's matching engine targets ~20 microseconds end-to-end (published); Nasdaq handles a large share of US equity volume; the entire system handles 10B+ messages per day. That scale creates two distinct hard problems that most candidates conflate into one.

The first hard problem is low-latency order matching: getting an order from receipt to matched trade in microseconds. Every microsecond of unnecessary latency gives high-frequency trading firms an edge they will exploit; exchanges compete on this metric.

The second hard problem is deterministic, auditable execution: every order must execute in a provably fair, reproducible sequence; regulators (SEC, FINRA) require a complete audit trail with nanosecond timestamps, and any non-determinism makes that trail unreproducible.

These two requirements pull in opposite directions: the most obvious way to scale throughput (multi-threading) destroys determinism. Understanding that tension, and the engineering choices that resolve it, is what separates a senior answer from a staff answer on this topic. The interviewer is testing whether you understand that the matching engine is intentionally a single-threaded bottleneck, that FPGA matching moves logic before the OS networking stack, and that UDP multicast is the correct choice for market data distribution not despite packet loss, but partly because of its simplicity under that constraint.
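The point about UDP multicast and packet loss is easier to see with sequence numbers. The sketch below is an illustration, not any exchange's actual feed handler. It assumes each packet carries the sequence number of its first message plus a message count, and the receiver only has to notice when numbers are skipped. The class name `GapDetector` and its return values are hypothetical.

```python
class GapDetector:
    """Tracks the next expected sequence number on a one-way feed.

    Assumption: every packet is labeled with the sequence number of its
    first message and the number of messages it contains.
    """

    def __init__(self, first_expected=1):
        self.expected = first_expected

    def on_packet(self, seq, count=1):
        """Classify a packet as 'ok', 'gap', or 'duplicate'.

        For a gap, also return the inclusive range of missing sequence
        numbers so the caller can request them from a recovery service.
        """
        end = seq + count  # sequence number expected after this packet
        if end <= self.expected:
            # Everything in this packet was already seen.
            return ("duplicate", None)
        if seq > self.expected:
            missing = (self.expected, seq - 1)
            self.expected = end
            return ("gap", missing)
        self.expected = end
        return ("ok", None)


detector = GapDetector()
print(detector.on_packet(1, 3))  # messages 1..3
print(detector.on_packet(4, 2))  # messages 4..5
print(detector.on_packet(8, 1))  # messages 6..7 never arrived
```

The sender never waits for acknowledgements, so one slow receiver cannot delay the others. Loss recovery becomes the receiver's problem and is handled on a separate path.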

IMPORTANT

What Interviewers Are Testing at Each Level

Mid-level: Order book data structure (price-time priority, sorted map of price levels), basic matching logic (market vs. limit orders), and that the system needs high availability.

Senior: Why the matching engine is single-threaded (determinism + no lock contention), LMAX Disruptor ring buffer for inter-thread communication, market data via UDP multicast (not TCP), and pre-trade risk checks outside the hot path.

Staff: FPGA matching (logic before OS networking stack, ~200ns latency), kernel bypass with DPDK/RDMA, the difference between execution (matching engine) and settlement (DTCC clearing, T+1), and the regulatory requirements that constrain architecture choices.

The single most disqualifying answer at senior level: suggesting multi-threaded matching to increase throughput.
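The mid-level expectations (price levels with FIFO queues, limit-order matching) can be sketched in a few dozen lines. This is a minimal single-threaded illustration under stated assumptions, not a production design. The names `Order` and `OrderBook` are hypothetical, prices are integer ticks, and the best level is found with a linear scan where a real engine would use a sorted structure or array-indexed levels.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    side: str   # "buy" or "sell"
    price: int  # limit price in integer ticks (e.g. cents)
    qty: int


class OrderBook:
    """Limit order book with price-time priority."""

    def __init__(self):
        self.bids = {}  # price -> deque of resting buy orders, oldest first
        self.asks = {}  # price -> deque of resting sell orders, oldest first

    def submit(self, order):
        """Match an incoming limit order and rest any remainder.

        Returns a list of (maker_id, taker_id, price, qty) trades.
        """
        trades = []
        if order.side == "buy":
            own, opposite, best = self.bids, self.asks, min
            crosses = lambda level: level <= order.price
        else:
            own, opposite, best = self.asks, self.bids, max
            crosses = lambda level: level >= order.price

        while order.qty > 0 and opposite:
            level = best(opposite)      # price priority: best level first
            if not crosses(level):
                break
            queue = opposite[level]
            maker = queue[0]            # time priority: oldest order first
            fill = min(order.qty, maker.qty)
            trades.append((maker.order_id, order.order_id, level, fill))
            order.qty -= fill
            maker.qty -= fill
            if maker.qty == 0:
                queue.popleft()
            if not queue:
                del opposite[level]

        if order.qty > 0:
            own.setdefault(order.price, deque()).append(order)
        return trades


book = OrderBook()
book.submit(Order(1, "sell", 10100, 50))
book.submit(Order(2, "sell", 10100, 50))
book.submit(Order(3, "sell", 10050, 30))
for trade in book.submit(Order(4, "buy", 10100, 100)):
    print(trade)
```

Order 3 trades first because it has the better price. Orders 1 and 2 then trade in arrival order at the same price. Because `submit` is a pure function of the book state and the incoming order, replaying the same input sequence reproduces the same trades, which is the determinism property the senior-level answer relies on.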

Clarifying Questions to Ask Before Designing

What asset classes?

Equities only (simpler: one order book per instrument), or futures/options too? Options require a separate book per strike+expiry, multiplying book count by 10–100x. Futures have different settlement. Assume equities unless told otherwise.

What order types must be supported?

At minimum: Market (execute now at best price), Limit (execute at price or better), Stop (trigger when price crosses threshold). Advanced: IOC (Immediate-Or-Cancel: fill what you can, cancel the rest), FOK (Fill-Or-Kill: fill entirely or cancel), GTC (Good-Til-Canceled: persist across trading sessions). More order types = more matching engine complexity.
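The difference between IOC, FOK, and GTC comes down to what happens to the unfilled quantity. The sketch below shows only that decision. It assumes the engine has already computed how much liquidity is available at acceptable prices. The function `apply_time_in_force` and its return shape are hypothetical, and stop orders and market orders are left out.

```python
from enum import Enum


class TimeInForce(Enum):
    IOC = "IOC"  # Immediate-Or-Cancel
    FOK = "FOK"  # Fill-Or-Kill
    GTC = "GTC"  # Good-Til-Canceled


def apply_time_in_force(tif, requested, available):
    """Split an order's quantity into (filled, rested, cancelled).

    requested: order quantity.
    available: quantity on the opposite side at acceptable prices
               (assumed to be computed by the matching loop).
    """
    if requested <= 0 or available < 0:
        raise ValueError("requested must be positive, available non-negative")

    if tif is TimeInForce.FOK:
        # All or nothing: never produce a partial fill.
        if available >= requested:
            return (requested, 0, 0)
        return (0, 0, requested)

    filled = min(requested, available)
    remainder = requested - filled
    if tif is TimeInForce.IOC:
        return (filled, 0, remainder)   # remainder is cancelled
    if tif is TimeInForce.GTC:
        return (filled, remainder, 0)   # remainder rests on the book
    raise ValueError(f"unsupported time in force: {tif}")


print(apply_time_in_force(TimeInForce.IOC, 100, 60))
print(apply_time_in_force(TimeInForce.FOK, 100, 60))
print(apply_time_in_force(TimeInForce.GTC, 100, 60))
```

FOK is the only one that must know the total available quantity before executing anything. It therefore needs a look-ahead pass over the book that IOC and GTC can skip.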

Market hours only or 24/7?

Traditional exchanges (NYSE, Nasdaq) have defined market hours (9:30 AM – 4:00 PM ET) plus pre/after-hours sessions with lower liquidity. Running 24/7, as crypto exchanges do, means there is no overnight window in which to clear open orders, and risk models differ. Assume standard market hours unless told otherwise.

Co-location support?

Co-location lets HFT firms place servers in the exchange's data center on dedicated racks, accessing the matching engine via a direct cross-connect with sub-100ns network latency. This is a revenue stream for exchanges ($1M+/year per rack) and a latency SLA commitment. Requires strict fairness guarantees — all co-lo clients get equal cable length.

Market data feed requirements?

Who consumes market data: just exchange participants, or public aggregators like the SIP (Securities Information Processor)? Public equity feeds are consolidated by the SIPs (the CTA and UTP plans); OPRA plays the same role for options. Direct feeds (proprietary) are faster and sold separately. Each is a different protocol and latency tier.

What is the throughput target?

NYSE and Nasdaq process hundreds of thousands of order updates per second at peak. Nasdaq ITCH feed publishes ~40–80M messages/day. Design for at least 500K messages/sec sustained, 1M/sec peak, with sub-millisecond matching latency at median.
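A quick back-of-envelope check turns the design targets into a daily volume and a per-message time budget. It assumes, for simplicity, that the sustained rate holds for a full 6.5-hour regular session (9:30 AM to 4:00 PM ET) and that a single matching thread handles every message. Real traffic is bursty, so treat the result as an order-of-magnitude estimate. The function names are hypothetical.

```python
SESSION_SECONDS = 6 * 3600 + 30 * 60  # 9:30 AM to 4:00 PM ET = 23,400 s


def messages_per_session(rate_per_sec, seconds=SESSION_SECONDS):
    """Total messages if the given rate were sustained for the session."""
    return rate_per_sec * seconds


def per_message_budget_ns(rate_per_sec):
    """Average time one thread can spend per message, in nanoseconds."""
    return 1_000_000_000 // rate_per_sec


print(messages_per_session(500_000))     # 11700000000
print(per_message_budget_ns(500_000))    # 2000
print(per_message_budget_ns(1_000_000))  # 1000
```

At 500K messages/sec sustained, a session works out to about 11.7 billion messages. At the 1M/sec peak, a single-threaded engine has roughly one microsecond per message on average, so the hot path cannot afford locks, system calls, or allocation.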
