Design a Payment System (Stripe-style Processor)
System design deep-dive for a merchant payment processor handling 10K TPS at peak with sub-500ms p99 latency, PCI DSS compliance, idempotent charges, double-entry ledger accounting, fraud scoring, and exactly-once processing semantics under partial failure.
Why Payment Systems Are a Consistency-First Problem
A payment system is the canonical workload where correctness dominates availability. Losing a single charge or double-billing a customer is far worse than a 60-second outage: a duplicate $500 charge across 10K customers is a regulatory event, not a bug. This flips the usual CAP intuition. Most consumer systems lean AP (eventual consistency is fine for a timeline), but payments lean CP, and the interesting engineering is in how you claw back availability without sacrificing strong consistency. Every design decision (the database, the retry semantics, the reconciliation job, the webhook delivery contract) is downstream of one invariant: every dollar is accounted for, exactly once.
Interviewers testing senior candidates on this topic are probing whether you understand that idempotency, double-entry ledgers, and reconciliation are not optional add-ons; they are the core of the system. Candidates who describe a payment system as 'API → DB → charge card' without those three layers fail the bar immediately.
The second dimension they test is whether you understand that the external card path (your PSP or acquirer and the Visa/Mastercard networks behind it) is the slowest, least reliable component in the pipeline. It typically takes 300–400ms at p50 and occasionally times out with the charge in an unknown state, so your system must be designed around its failure modes, not yours.
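To make the idempotency requirement concrete, here is a minimal sketch of a charge call keyed by a client-supplied idempotency key. The names `IdempotentCharger`, `ChargeResult`, and `fake_psp` are hypothetical, and the in-memory dict stands in for what would need to be a durable store with a unique constraint on the key. The sketch only covers retries of a completed call; handling a PSP timeout with unknown state would also need reconciliation, which is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ChargeResult:
    charge_id: str
    amount_cents: int
    status: str  # "succeeded" or "failed"


@dataclass
class IdempotentCharger:
    # psp_call is a stand-in for the real card-network request (hypothetical).
    psp_call: Callable[[int], ChargeResult]
    # Assumption: a dict replaces the durable idempotency table for this sketch.
    _results: Dict[str, ChargeResult] = field(default_factory=dict)

    def charge(self, idempotency_key: str, amount_cents: int) -> ChargeResult:
        cached = self._results.get(idempotency_key)
        if cached is not None:
            # Same key with a different payload is a client bug, not a retry.
            if cached.amount_cents != amount_cents:
                raise ValueError("idempotency key reused with a different amount")
            return cached  # Retry: return the stored result, do not charge again.
        result = self.psp_call(amount_cents)
        self._results[idempotency_key] = result
        return result


# Usage: a retried request reaches the PSP only once.
calls: List[int] = []


def fake_psp(amount_cents: int) -> ChargeResult:
    calls.append(amount_cents)
    return ChargeResult(f"ch_{len(calls)}", amount_cents, "succeeded")


charger = IdempotentCharger(psp_call=fake_psp)
first = charger.charge("order-123", 50_000)
retry = charger.charge("order-123", 50_000)
```

In this sketch the retry returns the stored `ChargeResult` and `fake_psp` is invoked once, which is the behavior a client retrying after a network error relies on.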
Clarifying Questions to Ask First
What type of payment system?
Merchant processor (Stripe/Adyen) vs P2P wallet (Venmo/Cash App) vs subscription billing (Chargebee) vs marketplace payouts (Stripe Connect). Pick one — they have different ledger models. Assume merchant processor unless told otherwise.
What scale?
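Transactions per second (TPS) at peak? Stripe reportedly peaked around 13K TPS (Black Friday 2023). Default assumption: 10K TPS peak, 1K TPS average. Note that 1K TPS sustained works out to about 31.5B transactions/year (1,000 × 86,400 × 365), while ~1B transactions/year is only about 32 TPS on average, so confirm whether the target is stated as average TPS or as annual volume.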
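A quick sanity check on the scale numbers is worth doing out loud. The sketch below converts between average TPS and annual volume; the function names are illustrative only.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


def transactions_per_year(avg_tps: int) -> int:
    """Annual volume implied by a sustained average TPS."""
    return avg_tps * SECONDS_PER_DAY * DAYS_PER_YEAR


def avg_tps_for_annual_volume(transactions: float) -> float:
    """Average TPS implied by an annual transaction count."""
    return transactions / (SECONDS_PER_DAY * DAYS_PER_YEAR)


annual_at_1k_tps = transactions_per_year(1_000)       # 31_536_000_000
avg_tps_at_1b = avg_tps_for_annual_volume(1e9)        # about 31.7
peak_to_avg_ratio = 10_000 / 1_000                    # 10.0
```

A 1K TPS average implies roughly 31.5B transactions per year, whereas 1B per year implies an average near 32 TPS. The two figures differ by about 30x, so it matters which one the capacity plan is built on.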
Which payment methods?
Cards only, or ACH/SEPA/UPI/wallets? Cards dominate initially. ACH adds 2–3 day settlement windows. UPI brings an expectation of sub-second confirmation. Each payment rail is a separate PSP integration.
What is the regulatory scope?
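PCI DSS (applies to anyone who stores, processes, or transmits cardholder data; a processor at this volume is assessed at Level 1, the strictest tier), PSD2/SCA in the EU (strong customer authentication), GDPR (personal-data protection and cross-border transfer rules). These shape the architecture: card data must be isolated in a tokenization vault and never flow through business logic.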
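The tokenization idea can be sketched in a few lines: the vault is the only component that ever sees the card number (PAN), and everything else handles an opaque token. `TokenVault` and the `tok_` prefix are assumptions for illustration. A real vault would encrypt PANs at rest and run in an isolated, separately audited environment, none of which is modeled here.

```python
import secrets
from typing import Dict


class TokenVault:
    """Toy vault: maps random tokens to PANs. In-memory only, for illustration."""

    def __init__(self) -> None:
        self._pan_by_token: Dict[str, str] = {}

    def tokenize(self, pan: str) -> str:
        # The token is random, so it reveals nothing about the PAN.
        token = "tok_" + secrets.token_urlsafe(16)
        self._pan_by_token[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        # Only the component that talks to the PSP should be allowed to call this.
        return self._pan_by_token[token]


vault = TokenVault()
test_pan = "4242424242424242"  # widely used test card number, not a real card
token = vault.tokenize(test_pan)

# Business logic stores and passes around only the token.
order = {"order_id": "order-123", "payment_token": token, "amount_cents": 50_000}
```

Because `order` contains only the token, services that handle orders hold no cardholder data, which is how a vault narrows the set of systems exposed to PCI DSS requirements.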
Refunds, disputes, chargebacks?
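Full and partial refunds, typically allowed within a 60- or 120-day window. Chargeback flow: the merchant is notified, has an evidence submission window (7–21 days), and then receives a decision. Refunds, disputes, and chargebacks each need their own state machine.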
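As one way to picture the chargeback state machine, the sketch below encodes the notify, evidence, decision flow as an explicit transition table. The state names are hypothetical and not taken from any specific processor's API.

```python
from enum import Enum
from typing import Dict, Set


class DisputeState(Enum):
    NEEDS_RESPONSE = "needs_response"  # merchant notified, evidence window open
    UNDER_REVIEW = "under_review"      # evidence submitted, awaiting decision
    WON = "won"
    LOST = "lost"


# Allowed transitions. Terminal states map to an empty set.
ALLOWED: Dict[DisputeState, Set[DisputeState]] = {
    # LOST is reachable directly if the evidence window expires.
    DisputeState.NEEDS_RESPONSE: {DisputeState.UNDER_REVIEW, DisputeState.LOST},
    DisputeState.UNDER_REVIEW: {DisputeState.WON, DisputeState.LOST},
    DisputeState.WON: set(),
    DisputeState.LOST: set(),
}


def transition(current: DisputeState, target: DisputeState) -> DisputeState:
    """Return the new state, or raise if the move is not permitted."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = DisputeState.NEEDS_RESPONSE
state = transition(state, DisputeState.UNDER_REVIEW)
state = transition(state, DisputeState.WON)
```

Keeping the transitions in a table makes illegal moves, such as reopening a decided dispute, fail loudly instead of silently corrupting the record.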
Latency SLO?
p99 < 500ms for charge creation is the standard bar. The card network itself takes 300–400ms, so your platform budget is ~100ms in the worst case. This single constraint kills most naive designs.
Consistency requirements?
Strict financial correctness — no double-charge, no lost charge, ledger balances to zero. This rules out eventually-consistent stores for the authoritative ledger.
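The "ledger balances to zero" requirement can be illustrated with a minimal double-entry sketch. The account names and the 3% fee are made up for the example, and the in-memory `Ledger` stands in for what would be a strongly consistent, transactional store.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Sequence, Tuple


@dataclass(frozen=True)
class Entry:
    account: str
    amount_cents: int  # positive = debit, negative = credit


class Ledger:
    def __init__(self) -> None:
        self._balances: DefaultDict[str, int] = defaultdict(int)
        self._journal: List[Tuple[Entry, ...]] = []  # append-only history

    def post(self, entries: Sequence[Entry]) -> None:
        # Core invariant: debits and credits in one transaction must cancel.
        if sum(e.amount_cents for e in entries) != 0:
            raise ValueError("unbalanced transaction rejected")
        for e in entries:
            self._balances[e.account] += e.amount_cents
        self._journal.append(tuple(entries))

    def balance(self, account: str) -> int:
        return self._balances[account]

    def total(self) -> int:
        # Sum over all accounts; zero whenever every posted transaction balanced.
        return sum(self._balances.values())


ledger = Ledger()
# A $500 charge with a hypothetical 3% platform fee.
ledger.post([
    Entry("psp_clearing", 50_000),       # money owed to us by the PSP
    Entry("merchant_payable", -48_500),  # money we owe the merchant
    Entry("platform_fees", -1_500),      # our revenue
])
```

Because `post` rejects any transaction whose entries do not sum to zero, `ledger.total()` stays at zero no matter how many charges are recorded, and a nonzero total signals a bug rather than a business event.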