
Deep-dive on modern auth systems: AuthN vs AuthZ, session vs JWT tradeoffs, OAuth 2.0 flows (Authorization Code + PKCE, Client Credentials, Device), OIDC identity tokens, RBAC vs ABAC vs Google Zanzibar ReBAC, JWT revocation, key rotation, and WebAuthn passkeys for FAANG system design interviews.


Tags: JWT, OAuth 2.0, OIDC, OpenID Connect, RBAC, ABAC, ReBAC, Google Zanzibar, SpiceDB, WebAuthn, Passkeys, PKCE, Authentication, Authorization, Argon2

Why Authentication Systems Are the Hardest Part of System Design

Auth is the system that every other system trusts. Get it wrong and every endpoint is exposed; get it subtly wrong and you ship silent vulnerabilities that take years to discover. Real-world failures follow predictable patterns: algorithm confusion attacks on JWTs (an attacker signs a token with HS256 using the RS256 public key as the HMAC secret, and a lax verifier accepts it, as in CVE-2016-10555), missing aud claim validation that lets a token minted for one service be replayed at another (the "confused deputy"), session fixation on shared workstations, and the OAuth Implicit flow leaking tokens via URL fragments into browser history.

The interview question "design authentication for a multi-tenant SaaS" tests whether you can navigate five orthogonal decisions without conflating them:

AuthN vs AuthZ — who are you, vs what can you do. Separate systems, separate failure modes.
Session vs stateless token — where you store state (server DB vs client JWT) changes revocation, scale, and latency tradeoffs.
OAuth flow selection — Authorization Code + PKCE for web/mobile, Client Credentials for service-to-service, Device Code for TVs. Wrong flow = wrong security model.
Authorization model — RBAC (simple, rigid), ABAC (flexible, attribute-driven), ReBAC (relationship graphs — Google Docs "can user X edit this file because they're in team Z").
Key management — signing key rotation, refresh token rotation, JWK sets. The hardest operational problem in auth.

Staff-level answers distinguish these explicitly. Mid-level answers conflate AuthN with AuthZ, propose JWTs without a revocation plan, and pick RBAC without considering when it breaks (it breaks the moment you add "share a document with one user from another org").

IMPORTANT

What Interviewers Actually Evaluate

Auth system design questions probe five distinct competencies. Cover all five to hit the staff bar.

Conceptual clarity: Can you say "authentication is who, authorization is what" without fumbling? Can you explain why OAuth is not an authentication protocol and why OIDC was built on top of it?
Token mechanics: Do you know a JWT payload is base64, not encrypted (anyone can read it)? Do you know the difference between HS256 (shared secret — breaks at scale because every verifier holds the signing key) and RS256 (asymmetric — only the issuer signs, verifiers hold the public key)?
OAuth flow fluency: Can you name four flows, match them to use cases (web app = Authorization Code with PKCE, SPA = same, mobile = same, CLI/TV = Device Code, service-to-service = Client Credentials), and explain why Implicit flow was deprecated?
Revocation: The canonical trick question. JWTs are stateless — how do you revoke one? (Short TTL + refresh tokens + optional blocklist.)
Authorization at scale: For "can user X edit document Y," when does RBAC break? What does Google Zanzibar / SpiceDB solve? The candidate who mentions Zanzibar without prompting stands out.

The anti-signal: "just use JWT," "store passwords hashed" (without saying bcrypt/Argon2 with specific cost parameter), or proposing OAuth Implicit flow in 2024.

Clarifying Questions Before You Design

Is this authentication or authorization — or both?

If the interviewer asks 'design login for a social app,' that's AuthN with SSO. 'Design permission checks for Google Docs' is AuthZ. Scope matters.

What clients — web, mobile, CLI, service-to-service?

Each maps to a different OAuth flow. Web SPA = Auth Code + PKCE. Mobile = same. Backend cron job = Client Credentials. TV app = Device Code.

Single-tenant or multi-tenant? Is SSO required?

Multi-tenant SaaS with enterprise customers needs SAML/OIDC federation with customer IdPs (Okta, Azure AD). That's a different architecture than a consumer app.

What's the scale — 10k users or 100M?

Session tables in Postgres work to ~1M users. Above that, JWT + refresh tokens or session data in Redis cluster. 100M users needs sharded session store or JWT.

What compliance / revocation requirements?

SOC 2 / HIPAA requires audit logs of auth events and rapid session revocation. PCI-DSS forces short token lifetimes. These constrain the AuthN architecture.

What authorization complexity — roles, attributes, or relationships?

Roles (admin/user) = RBAC. Attributes (dept + clearance + time-of-day) = ABAC. Relationships ('can edit document X because in team Y which owns folder Z') = ReBAC / Zanzibar. Pick the simplest that covers requirements.

What MFA / account-recovery requirements?

Consumer app: TOTP or WebAuthn (passkeys). Enterprise: hardware keys, SAML. Recovery flows are the #1 source of account takeovers — design them as first-class.

AuthN vs AuthZ — The Distinction That Interviewers Actually Test

Authentication (AuthN) answers "who are you?" It verifies identity — a password check, a hardware key signature, a SAML assertion from Okta. The output is an authenticated principal: a user ID, an email, plus claims about how the authentication happened (MFA used? single sign-on? session age?).

Authorization (AuthZ) answers "what can you do?" Given an authenticated principal and a requested action on a resource, decide allow/deny. The input is the principal, the action (docs.edit), the resource (document:abc123), and the context (IP, time, MFA recency). The output is a boolean plus a reason.

These are separate systems for separate reasons:

Different failure modes: a broken AuthN means anyone can pretend to be anyone. A broken AuthZ means authenticated users do things they shouldn't.
Different cadence: AuthN happens once per session (plus refresh). AuthZ happens on every request — must be fast (< 10ms).
Different data stores: AuthN holds user records, passwords, MFA factors. AuthZ holds role assignments, ACLs, or relationship graphs.
Different scaling concerns: AuthN handles login spikes (Monday morning). AuthZ handles every API call — needs caching, precomputed materialized views.

The conflation trap: candidates design "a login system" and include permission checks in the same service. In production this is wrong — AuthN and AuthZ are independent microservices. Auth0 and Okta are AuthN platforms. Google Zanzibar, AWS IAM, and SpiceDB are AuthZ platforms. The boundary is the access token — AuthN issues it, AuthZ consumes it.
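
To make the AuthZ contract concrete, here is a minimal Python sketch of the check described above: principal, action, resource, and context go in, and a boolean plus a reason comes out. The names `AuthzRequest`, `Decision`, and `check`, the in-memory `GRANTS` set, and the 15-minute MFA rule are all illustrative assumptions, not the API of any platform named here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AuthzRequest:
    principal: str   # authenticated user ID, produced by AuthN
    action: str      # e.g. "docs.edit"
    resource: str    # e.g. "document:abc123"
    context: dict = field(default_factory=dict)  # IP, time, MFA recency, ...


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


# Hypothetical grants store: (principal, action, resource) triples.
GRANTS = {("user_123", "docs.edit", "document:abc123")}

MAX_MFA_AGE_SEC = 15 * 60  # assumed policy: edits need MFA within 15 minutes


def check(req: AuthzRequest) -> Decision:
    """Fail closed: anything not explicitly granted is denied."""
    if (req.principal, req.action, req.resource) not in GRANTS:
        return Decision(False, "no matching grant")
    mfa_age = req.context.get("mfa_age_sec")
    if mfa_age is None or mfa_age > MAX_MFA_AGE_SEC:
        return Decision(False, "MFA too old or missing")
    return Decision(True, "explicit grant with recent MFA")
```

Note that nothing in `check` knows how the principal logged in; it only consumes the identity and context that AuthN already established, which is the boundary the section argues for.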

Sessions vs Tokens — Where You Store State

Every request after login needs to carry identity. You have two choices: session-based (server stores state, client holds an opaque ID) or token-based (client holds self-contained signed token, server stores nothing).

Session-based (cookie holds session_id=abc123, server looks up abc123 in a session store):

Pros: trivial revocation (delete the session row), small client payload (~32 bytes), full state lives server-side so you can attach arbitrary data.
Cons: every request hits the session store — must be a fast cache (Redis). Sticky sessions or a shared session store are required for horizontal scaling. Cross-domain sessions need careful cookie configuration (SameSite, HttpOnly, Secure).
Scale: Redis-backed sessions work to ~10M concurrent sessions per Redis cluster. Beyond that, shard by session ID.

Token-based (JWT or opaque bearer token):

Pros: stateless verification — no server lookup needed if the token is signed (JWT). Scales to any number of concurrent sessions without a session store. Good for microservices — every service validates independently with the public key.
Cons: revocation is hard (the whole point of statelessness is that you don't check a server). Solutions: short TTL (5-15 min access tokens) + refresh tokens + optional blocklist for high-value revocations. Payload size is larger (500-2000 bytes typical).
Scale: JWT is the right choice when you have >100 services that all need to validate identity — the alternative is every service hitting a central session store on every request.

The decision tree is simple: monolith or small service mesh → sessions (simpler, trivial revocation). Large microservices architecture or multi-region → JWT (avoid the shared session store becoming a bottleneck or SPOF). Hybrid is common: JWT for cross-service identity propagation, but a central session record in the AuthN service to support global logout.
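
The session side of that tradeoff can be sketched in a few lines. The class below is a hypothetical in-memory stand-in for a Redis-backed store (the name `SessionStore` and its methods are assumptions for illustration); it shows why revocation is trivial and why every request pays for a lookup.

```python
import secrets
import time
from typing import Optional


class SessionStore:
    """In-memory stand-in for a shared session store such as Redis."""

    def __init__(self, ttl_sec: int = 3600) -> None:
        self._ttl = ttl_sec
        # session_id -> (user_id, expires_at)
        self._sessions: dict[str, tuple[str, float]] = {}

    def create(self, user_id: str) -> str:
        session_id = secrets.token_urlsafe(32)  # opaque, unguessable ID
        self._sessions[session_id] = (user_id, time.time() + self._ttl)
        return session_id

    def lookup(self, session_id: str) -> Optional[str]:
        """Runs on every request: this is the per-request cost of sessions."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        user_id, expires_at = entry
        if time.time() >= expires_at:
            del self._sessions[session_id]
            return None
        return user_id

    def revoke(self, session_id: str) -> None:
        """Revocation is a delete; the very next lookup fails."""
        self._sessions.pop(session_id, None)

    def revoke_all(self, user_id: str) -> None:
        """Global logout: drop every session that belongs to the user."""
        self._sessions = {sid: entry for sid, entry in self._sessions.items()
                          if entry[0] != user_id}
```

In the hybrid design, a record like this lives only in the AuthN service and is consulted at refresh time, while short-lived JWTs carry identity between services.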

Session vs JWT Decision Matrix

Dimension	Server Session	JWT (Stateless)	Winner
Revocation	Delete row — instant	Hard — needs blocklist or short TTL	Session
Latency per request	~1-5ms (Redis lookup)	< 1ms (public-key verify, cached)	JWT
Payload size	~32 bytes (opaque ID)	500-2000 bytes (signed claims)	Session
Horizontal scale	Needs shared Redis / sticky sessions	Fully stateless — any replica	JWT
Cross-service use	Every service hits session store	Every service verifies independently	JWT
Data inspection by client	Opaque — client sees nothing	Base64 payload — client reads all claims	Session (privacy)
Audit / session listing	Trivial — SELECT * from sessions	Requires separate tracking table	Session
Best for	Monolith, trusted single domain	Microservices, multi-region, SSO federation	Depends

JWT Deep Dive — Structure, Signing, and the Common Mistakes

A JWT is three base64url-encoded parts joined by dots: header.payload.signature. Example: eyJhbGciOiJSUzI1NiIsImtpZCI6ImtleTEifQ.eyJzdWIiOiJ1c2VyXzEyMyIsImV4cCI6MTcxNDE1NjgwMH0.signature_bytes_base64.

Header: {"alg": "RS256", "kid": "key1"} carries the algorithm and the key ID (used for rotation). Payload (claims): {"iss": "auth.example.com", "sub": "user_123", "aud": "api.example.com", "exp": 1714156800, "iat": 1714155900} carries issuer, subject (user), audience, expiry, and issued-at. Signature: the signing server signs base64url(header).base64url(payload) with its private key (RS256) or shared secret (HS256).

The #1 JWT mistake is writing sensitive data into the payload. The payload is base64url-encoded, not encrypted. Anyone who obtains the token (from a stray console.log, an Authorization header captured in a log line, or a network capture) can decode it and read every claim. Email, role, internal user ID: all readable. Never put anything in a JWT you wouldn't print in a log.
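
To see how little protection the encoding gives, the sketch below reads the claims of the example token shown earlier using only the Python standard library. No key is involved. The helper names `b64url_decode` and `peek_claims` are made up for this illustration, and the function deliberately verifies nothing, so it must never be used for an auth decision.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    # JWTs strip the "=" padding; restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def peek_claims(token: str) -> dict:
    """Read the claims WITHOUT verifying the signature. Not for auth."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


token = ("eyJhbGciOiJSUzI1NiIsImtpZCI6ImtleTEifQ"
         ".eyJzdWIiOiJ1c2VyXzEyMyIsImV4cCI6MTcxNDE1NjgwMH0"
         ".signature_bytes_base64")
print(peek_claims(token))  # {'sub': 'user_123', 'exp': 1714156800}
```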

HS256 vs RS256 — pick asymmetric at any scale beyond a single service:

HS256 (HMAC-SHA256): symmetric — signer and verifier share a secret. If 10 services need to verify tokens, they all hold the signing key — one compromised service leaks the ability to mint tokens for the entire platform. Acceptable only in a single monolith or tightly-trusted environment.
RS256 (RSA-SHA256): asymmetric — auth service signs with private key, everyone verifies with public key. Standard for production. Publish a JWK set (/.well-known/jwks.json) so verifiers can fetch and cache public keys. Supports key rotation with a kid header — include multiple keys in the JWK set during rotation.

Key validation checks every verifier must do:

Signature valid (RSA verification).
exp not passed (token not expired).
iss matches expected issuer (prevents tokens from random auth servers).
aud matches this service's identifier (prevents confused deputy — a token minted for service A replayed at service B).
nbf, if present, is in the past ("not before": reject a token that is not yet valid).

Skipping aud validation is the second-most-common real-world JWT bug after base64-vs-encryption confusion.

[Diagram: OAuth 2.0 Authorization Code Flow with PKCE + JWT Issuance]

OAuth 2.0 Flows — Pick the Right One for Your Client

OAuth 2.0 is a delegation framework: it lets a user grant an app access to a resource without sharing their password. There are five flows worth knowing. Three are still recommended, one (ROPC) survives only for legacy migrations, and one (Implicit) should no longer be used; OAuth 2.1 drops both of the latter.

Authorization Code + PKCE (the default — use this): user redirects to auth server, logs in, gets a one-time code, client exchanges code + PKCE code_verifier for tokens. PKCE (Proof Key for Code Exchange) binds the code exchange to the client that initiated the flow — even if the code leaks (in the URL, in browser history, in a logging system), an attacker can't exchange it without the verifier. Use for: web apps (with or without backend), SPAs, mobile apps. This is the only correct answer for user-facing clients in 2024.

Client Credentials (service-to-service): backend service presents its own client_id + client_secret directly to the token endpoint, gets an access token scoped to itself. No user involved. Use for: cron jobs, backend-to-backend API calls, CI/CD pulling secrets, internal microservices. The client_secret lives in your secret store (Vault, AWS Secrets Manager) — never in source.

Device Code (input-constrained devices): user is shown a short code on their TV, logs in on their phone via a URL, enters the code. TV polls the token endpoint until the user approves. Use for: smart TVs (Netflix on Apple TV), CLI tools (gcloud auth login, aws sso login), IoT with limited input.

Resource Owner Password Credentials (ROPC): user gives their password to the client app directly, client exchanges password for token. Deprecated by OAuth 2.1. Only use if migrating a legacy system where the user explicitly trusts the client (first-party mobile apps pre-OAuth). Avoid if at all possible — it defeats the entire purpose of OAuth.

Implicit flow: token returned in the URL fragment. Deprecated due to token leakage via browser history, Referer headers, and logs. Every client that previously used Implicit should migrate to Authorization Code + PKCE. If a candidate proposes Implicit in 2024, that's a strong negative signal.
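
The PKCE binding mentioned under Authorization Code + PKCE is small enough to sketch with the standard library. This follows the S256 method from RFC 7636 (the challenge is the unpadded base64url encoding of the SHA-256 of the verifier); the function names are assumptions for illustration, and a real client would send the challenge on the authorization request and the verifier on the token request.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    # 32 random bytes give a 43-character base64url string
    # (RFC 7636 allows 43 to 128 characters).
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    # S256: BASE64URL(SHA256(ASCII(code_verifier))), without padding.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_accepts(stored_challenge: str, presented_verifier: str) -> bool:
    # Token endpoint side: recompute the challenge from the presented
    # verifier and compare it with the one stored alongside the code.
    recomputed = make_code_challenge(presented_verifier)
    return secrets.compare_digest(recomputed, stored_challenge)
```

An attacker who intercepts the authorization code sees at most the challenge, and cannot produce a verifier that hashes to it, so the code exchange fails.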

OAuth vs OIDC — Why Both Exist

OAuth 2.0 is authorization (delegation): "can app X access resource Y on user's behalf?" It issues access tokens scoped to APIs. It does not tell the app who the user is — the access token's contents are opaque to the client by design.

OpenID Connect (OIDC) is a thin identity layer on top of OAuth 2.0. It adds an id_token (a JWT) alongside the access token, containing user claims (sub, email, name, picture). Now the client knows who the user is — OIDC is authentication built on OAuth's flow machinery.

The practical mapping:

"Sign in with Google" → OIDC. You want to know the user's identity. The id_token gives you their email and Google user ID.
"Access my Google Calendar" → OAuth. You want delegated access to the Calendar API. The access token has scope calendar.read.
Usually both — when a user clicks "Sign in with Google" in your app, you use OIDC to get their identity and OAuth to get an access token for their Google data (if your app needs that).

Three OIDC concepts that matter in interviews:

id_token vs access token: the id_token is meant for the client (signed JWT with user claims, verified by the client). The access token is meant for resource servers (opaque to client, validated by API). Don't use an id_token to call APIs — wrong audience.
UserInfo endpoint: OIDC defines /userinfo — the client can exchange the access token for user claims even without parsing the id_token. Useful when claims change mid-session.
OIDC Discovery: /.well-known/openid-configuration returns the auth server's endpoints and supported flows. All major IdPs (Google, Okta, Azure AD, Auth0) implement it — write your integration against discovery metadata, not hardcoded URLs.

Authorization Models — RBAC vs ABAC vs Google Zanzibar ReBAC

Authorization models differ in how you express permission. Pick the simplest model that handles your requirements — complexity is real cost.

RBAC (Role-Based Access Control): users have roles, roles have permissions. alice is admin; admin can do everything. Simple, auditable, the default for 80% of systems. Implement as (user_id, role) + (role, permission) tables. Breaks when: a role means different things in different contexts ("manager of team A, but not team B"), or permissions depend on data ("can edit their own documents, not others"). At that point you've invented ABAC or ReBAC poorly.

ABAC (Attribute-Based Access Control): permission is a function of attributes — user attributes (department, clearance), resource attributes (classification, owner), environment (time of day, IP range). Expressed as policies: allow if user.dept == resource.owner_dept AND user.clearance >= resource.required_clearance. Engines: AWS IAM policies, OPA (Open Policy Agent), Cedar (AWS's policy language). Use when: permissions are rule-driven, requirements change (compliance adds "no access after 6pm"), or you have heterogeneous resources. Breaks when: permissions depend on arbitrary relationships between users and resources, like document sharing.

ReBAC (Relationship-Based Access Control), the Google Zanzibar model: permissions are graph edges. "User X can edit document Y because X is in team Z, and team Z owns folder F, and folder F contains document Y." Zanzibar backs Google Docs, Drive, and YouTube; Airbnb built a Zanzibar-inspired system of its own. The canonical reference is Google's 2019 Zanzibar paper. SpiceDB (from AuthZed) and Ory Keto are open-source implementations; AuthZed and Warrant provide hosted offerings. The data model is tuples (user:alice, relation:editor, object:document:xyz), and queries answer "does user:alice have relation:editor on object:document:xyz?" by graph traversal. Use when: resource sharing is core (collaboration products, social graphs), permissions cascade through hierarchies (Drive folders), or you need high-throughput checks (the Zanzibar paper reports millions of authorization requests per second with 95th-percentile latency under 10ms).

The interview signal: naming Zanzibar for Google-Docs-style permissions without prompting is a strong positive. Defaulting to RBAC when the scenario clearly needs ReBAC ("design Dropbox sharing") is a negative.
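
The "team Z owns folder F, which contains document Y" example can be sketched as a toy relationship check. This is a simplified illustration in the spirit of Zanzibar, not its actual schema language or SpiceDB's API: the tuple layout, the `RULES` table, and the `check` function are all assumptions made for this sketch.

```python
# Relationship tuples: (object, relation, subject). A subject is a user or a
# "userset" such as "team:z#member" (everyone holding `member` on team:z).
TUPLES = {
    ("team:z", "member", "user:alice"),
    ("folder:f", "owner", "team:z#member"),
    ("document:y", "parent", "folder:f"),
}

# Rewrite rules per (object type, relation):
#   ("computed", rel)          same object, holding `rel` implies this relation
#   ("via", link_rel, rel)     follow `link_rel` to another object, check `rel`
RULES = {
    ("document", "editor"): [("via", "parent", "owner")],
    ("document", "viewer"): [("computed", "editor")],
}


def check(user: str, relation: str, obj: str, _seen=None) -> bool:
    seen = _seen if _seen is not None else set()
    if (obj, relation) in seen:  # cycle guard
        return False
    seen.add((obj, relation))

    # 1. Direct tuples, expanding usersets recursively.
    for o, r, subject in TUPLES:
        if o != obj or r != relation:
            continue
        if subject == user:
            return True
        if "#" in subject:
            sub_obj, sub_rel = subject.split("#", 1)
            if check(user, sub_rel, sub_obj, seen):
                return True

    # 2. Rewrite rules for this object type.
    obj_type = obj.split(":", 1)[0]
    for rule in RULES.get((obj_type, relation), []):
        if rule[0] == "computed":
            if check(user, rule[1], obj, seen):
                return True
        elif rule[0] == "via":
            _, link_rel, target_rel = rule
            for o, r, linked in TUPLES:
                if o == obj and r == link_rel and check(user, target_rel, linked, seen):
                    return True
    return False


print(check("user:alice", "editor", "document:y"))  # True
```

Sharing the document with one outside user is a single new tuple, with no new role to define, which is exactly the case where RBAC starts to strain.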

RBAC vs ABAC vs ReBAC — When to Use Each

Model	Data Shape	Query Cost	Best For	Breaks When
RBAC	(user, role) + (role, perm) tables	O(1) — single lookup	Simple apps: admin/user distinction, internal tools	Permissions depend on resource ownership or sharing
ABAC	Policies evaluated over attributes	O(policy size) — rule evaluation	Compliance-driven (SOC 2, HIPAA), AWS IAM-style APIs	Needs arbitrary graph relationships
ReBAC (Zanzibar)	Tuples (user, relation, object) in a graph	O(graph traversal) — cached ~5ms	Collaboration (Docs, Drive, Dropbox), social graphs	Small apps — massive overhead vs benefit
Hybrid (common)	RBAC for coarse, ABAC/ReBAC for fine	Two-tier check	Most production systems at scale	Requires careful policy boundary design
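
For contrast with the tuple-based model, the ABAC policy quoted earlier (same department, sufficient clearance) plus the "no access after 6pm" rule can be written as plain predicates over attributes. This is a hand-rolled sketch, not OPA or Cedar syntax; the `User` and `Resource` classes and the `hour` environment key are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    dept: str
    clearance: int


@dataclass(frozen=True)
class Resource:
    owner_dept: str
    required_clearance: int


def same_dept(user: User, resource: Resource, env: dict) -> bool:
    return user.dept == resource.owner_dept


def cleared(user: User, resource: Resource, env: dict) -> bool:
    return user.clearance >= resource.required_clearance


def before_6pm(user: User, resource: Resource, env: dict) -> bool:
    return env["hour"] < 18  # assumed: env carries the local hour, 0-23


# The policy is the conjunction of all rules; adding a compliance rule
# means appending one predicate, not redesigning roles.
POLICY = [same_dept, cleared, before_6pm]


def is_allowed(user: User, resource: Resource, env: dict) -> bool:
    return all(rule(user, resource, env) for rule in POLICY)
```

The evaluation cost grows with the number of rules, which matches the O(policy size) entry in the table above.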

Key Storage, Rotation, and JWK Sets

Signing keys are the crown jewels — anyone with the private key can mint valid tokens for your platform. Three operational practices are non-negotiable.

1. Store signing keys in HSM / KMS, never in config: AWS KMS, GCP Cloud KMS, or a hardware HSM holds the private key. The auth server signs by calling the KMS API, so the raw private key never leaves the HSM boundary, and a leaked environment file or a compromised app server doesn't leak the signing key. Performance: KMS sign calls take ~5-20ms each. If you issue >100 tokens/sec and that latency is a problem, the common alternative is to sign in-process with a key fetched at startup and refreshed periodically, which trades away the guarantee that the key never leaves the HSM.

2. Rotate signing keys regularly (quarterly minimum): publish a JWK set at /.well-known/jwks.json containing multiple active keys. During rotation, add the new key to the set first, wait at least one verifier cache TTL so that every verifier has seen it, then start signing with it (identified by the kid header) while continuing to verify tokens signed by the old key. After all old tokens have expired (wait max_token_lifetime + buffer), remove the old key. Verifiers cache the JWK set with a TTL (15 min to 1 hour), so they pick up new keys automatically.

3. Rotate refresh tokens on every use: issue a new refresh token each time the client exchanges one for an access token; invalidate the previous. If an attacker steals a refresh token and uses it, the legitimate client's next refresh fails — detectable as a refresh token reuse attack. On detection, invalidate the entire token family (all tokens descended from the stolen one) and force re-authentication. This is the OAuth 2.1 recommendation.
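
Refresh token rotation with reuse detection, as described in point 3, can be sketched as follows. The store is an in-memory illustration; the names `RefreshTokenStore`, `start_family`, `rotate`, and `ReuseDetected` are assumptions, and a real deployment would persist hashed tokens rather than raw ones.

```python
import secrets


class ReuseDetected(Exception):
    """A rotated-out refresh token was presented again."""


class RefreshTokenStore:
    def __init__(self) -> None:
        self._family_of: dict[str, str] = {}  # every token issued -> family ID
        self._active: dict[str, str] = {}     # family ID -> the one valid token

    def start_family(self) -> str:
        """Called at login: begin a new token family."""
        return self._issue(secrets.token_hex(8))

    def _issue(self, family_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._family_of[token] = family_id
        self._active[family_id] = token
        return token

    def rotate(self, presented: str) -> str:
        """Exchange a refresh token for a new one; the old one stops working."""
        family_id = self._family_of.get(presented)
        if family_id is None:
            raise KeyError("unknown refresh token")
        if self._active.get(family_id) != presented:
            # An already-used token came back (or the family is revoked):
            # revoke the whole family and force re-authentication.
            self._active.pop(family_id, None)
            raise ReuseDetected(family_id)
        return self._issue(family_id)
```

Whichever party replays the old token trips the check, and after that neither the attacker's nor the legitimate client's current token can be rotated.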

Passwords: store with Argon2id (for example m=64 MiB, t=3, p=4, the low-memory profile recommended in RFC 9106) or bcrypt (cost factor 12+). Never MD5, SHA-1, or plain SHA-256: those are fast hashes built for data integrity, which makes offline guessing cheap. Always use a unique per-user salt. At login, compare hashes with a constant-time comparison so that timing differences do not reveal a hash prefix.
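
The salt-then-slow-hash-then-constant-time-compare pattern looks roughly like this. Important assumption: the sketch uses scrypt from the Python standard library as a stand-in, purely so that it runs without third-party packages. It is not Argon2id; a production system following the advice above would call an Argon2id or bcrypt library instead. The work factors shown are illustrative.

```python
import hashlib
import hmac
import secrets

# Illustrative scrypt work factors (stand-in for Argon2id parameters).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = secrets.token_bytes(16)  # unique per user, stored beside the hash
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    # Constant-time comparison: run time does not depend on where the
    # first mismatching byte is.
    return hmac.compare_digest(candidate, expected)
```

Because each user gets a fresh salt, two users with the same password end up with different stored hashes.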

JWT Validation with JWK Caching and Full Claim Checks

Python: jwt_validator.py

from dataclasses import dataclass

# PyJWT: `pyjwt.decode` verifies the token, PyJWKClient fetches and caches keys.
import jwt as pyjwt
from jwt import InvalidTokenError, PyJWKClient

@dataclass
class AuthContext:
    user_id: str
    issuer: str
    audience: str
    scopes: list[str]
    issued_at: int
    expires_at: int


class JWTValidator:
    """Validates RS256 JWTs issued by a trusted auth server.

    - Fetches and caches the JWK set (1-hour TTL).
    - Validates signature, exp, iss, aud, nbf in one pass.
    - Raises InvalidTokenError on any failure — never silently accepts.
    """

    def __init__(self, jwks_uri: str, expected_issuer: str,
                 expected_audience: str, clock_skew_sec: int = 30):
        self._jwks_client = PyJWKClient(jwks_uri, cache_keys=True,
                                        lifespan=3600)  # 1h cache
        self._issuer = expected_issuer
        self._audience = expected_audience
        self._clock_skew = clock_skew_sec

    def validate(self, token: str) -> AuthContext:
        """Returns AuthContext on success, raises InvalidTokenError on failure."""
        try:
            # Picks the correct key via `kid` header — supports rotation.
            signing_key = self._jwks_client.get_signing_key_from_jwt(token)
        except Exception as e:
            # Chain the original error so the root cause stays in the traceback.
            raise InvalidTokenError(f"Failed to fetch signing key: {e}") from e

        # Decode + verify — pyjwt checks exp, iss, aud, nbf, signature all at once.
        # Do NOT disable any of these — each is a real attack vector.
        payload = pyjwt.decode(
            token,
            signing_key.key,
            algorithms=["RS256"],  # explicit allow-list — prevents alg confusion
            audience=self._audience,
            issuer=self._issuer,
            leeway=self._clock_skew,
            options={
                "require": ["exp", "iat", "iss", "aud", "sub"],
                "verify_signature": True,
                "verify_exp": True,
                "verify_iat": True,
                "verify_aud": True,
                "verify_iss": True,
            },
        )

        return AuthContext(
            user_id=payload["sub"],
            issuer=payload["iss"],
            audience=payload["aud"] if isinstance(payload["aud"], str)
                      else payload["aud"][0],
            scopes=payload.get("scope", "").split(),
            issued_at=payload["iat"],
            expires_at=payload["exp"],
        )


# Usage in an API middleware
validator = JWTValidator(
    jwks_uri="https://auth.example.com/.well-known/jwks.json",
    expected_issuer="https://auth.example.com",
    expected_audience="https://api.example.com",
)

def middleware(request) -> AuthContext:
    auth_header = request.headers.get("Authorization", "")
    if not auth_header.startswith("Bearer "):
        raise InvalidTokenError("Missing Bearer token")
    token = auth_header[len("Bearer "):]
    return validator.validate(token)  # raises on any failure

MFA and Passwordless — WebAuthn Is the Future

Password-only authentication is a broken model: reused passwords, phishing, credential-stuffing attacks from breach databases. Modern auth adds a second factor or removes passwords entirely.

TOTP (Time-based One-Time Password) — RFC 6238, the algorithm behind Google Authenticator, Authy, 1Password TOTP. User scans a QR code that encodes a shared secret; app generates a 6-digit code that rotates every 30 seconds. Strong against password-breach attacks. Weak against phishing — a fake login page can relay the code in real time to the real service.

SMS / email codes: common but compromised. SMS is vulnerable to SIM swap attacks (the attacker convinces the carrier to move the victim's number to a new SIM) and to SS7 protocol flaws that allow message interception. Email is only as secure as the email account. NIST SP 800-63B (2017) classifies SMS as a restricted authenticator; use it only as a fallback and never for high-value accounts (banking, enterprise admin).

WebAuthn / Passkeys — the current gold standard. Cryptographic challenge-response using a hardware-backed key pair (Touch ID, Face ID, YubiKey, Windows Hello). The private key never leaves the device. The server sends a challenge; the client signs with the private key; the server verifies with the public key registered during signup. Phishing-resistant by design — the browser binds the challenge to the domain (example.com), so a phishing site (example-login.com) cannot trigger a valid signature. Apple, Google, and Microsoft are pushing passkeys as password replacement — sync via iCloud Keychain, Google Password Manager, or Windows Hello. Use in 2024 onward for any new system.

Magic links — email a one-click login URL. Low friction, but email inbox security becomes auth security. Acceptable for low-stakes products; inappropriate for anything touching money or sensitive data.

The staff-level recommendation: TOTP + recovery codes as baseline, WebAuthn passkeys as preferred primary, SMS only as account-recovery fallback with explicit user opt-in.
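
Since TOTP is the recommended baseline, it helps to see how small the algorithm is. The sketch below implements RFC 4226 (HOTP) and RFC 6238 (TOTP) with HMAC-SHA1 using only the standard library. The function names and the one-step verification window are assumptions for illustration; a real service would also rate-limit attempts and reject codes that were already used.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # RFC 4226: HMAC-SHA1 over the 8-byte big-endian counter,
    # then dynamic truncation to a 31-bit integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    # RFC 6238: the counter is the number of 30-second steps since the epoch.
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify_totp(secret: bytes, submitted: str, at: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept the current step plus/minus `window` steps for clock drift."""
    now = time.time() if at is None else at
    counter = int(now // step)
    ok = False
    for delta in range(-window, window + 1):
        if counter + delta < 0:
            continue
        if hmac.compare_digest(hotp(secret, counter + delta), submitted):
            ok = True
    return ok


# RFC 6238 test secret; at t=59s the 8-digit code is 94287082.
print(totp(b"12345678901234567890", at=59, digits=8))  # 94287082
```

Nothing in the code binds the result to a domain, which is why a phishing page can relay it, and why WebAuthn's origin binding is the stronger option.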

Failure Modes — The Ways Auth Systems Get Broken

Real attacks exploit specific subsystems. Your design should explicitly address each.

JWT algorithm confusion (CVE-2016-10555): verifier accepts tokens signed with alg=HS256 when it should only accept RS256. Attacker takes the public key, uses it as the HMAC secret, and forges tokens. Mitigation: hardcode algorithms=["RS256"] in the verifier — never trust the alg header alone.

Missing aud claim validation (confused deputy): the auth server issues a JWT intended for service A; an attacker replays it at service B; service B accepts it because both trust the same issuer. Mitigation: every service validates that aud matches its own identifier.

Stolen refresh token: attacker exfiltrates a long-lived refresh token via XSS or a compromised device. Mitigation: rotate refresh tokens on every use + detect reuse (if a used token is presented again, the family is stolen — revoke all). Bind tokens to device via DPoP or mTLS when available.

Token replay: attacker captures a valid JWT and replays it before expiry. Mitigation: short access-token TTL (5-15 min), jti claim + seen-set for high-value operations, TLS everywhere (no plaintext auth headers).

Timing attacks on password comparison: naïve == comparison returns early on the first mismatched byte — an attacker measures response times to recover the hash. Mitigation: hmac.compare_digest() or equivalent constant-time comparison.

Session fixation: attacker sets the victim's session ID before login; after login the attacker knows the session. Mitigation: regenerate session ID on every privilege escalation (login, MFA, permission grant).

SSRF via OIDC discovery: attacker points your server at a malicious .well-known/openid-configuration URL, gets it to fetch internal resources. Mitigation: allowlist the set of valid issuer URLs.

JWKS poisoning: verifier fetches JWKS from a URL the attacker can influence. Mitigation: pin the JWKS URI to the auth server's known domain + TLS cert.
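
The algorithm-confusion entry at the top of this list is easier to reason about with a toy reproduction. The sketch below is a deliberately simplified model using only the standard library: `vulnerable_verify` lets the token header choose the algorithm, while `strict_verify` applies an allow-list first. Both functions, the placeholder key bytes, and the omission of real RS256 verification are assumptions for illustration, not the code of any actual JWT library.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def hs256_sign(signing_input: str, key: bytes) -> str:
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256)
    return b64url(mac.digest())


def vulnerable_verify(token: str, verification_key: bytes) -> dict:
    """BUG: the attacker-controlled header picks the algorithm."""
    header_b64, payload_b64, sig = token.split(".")
    alg = json.loads(b64url_decode(header_b64))["alg"]
    if alg == "HS256":
        # The RSA *public* key ends up being used as an HMAC secret.
        expected = hs256_sign(f"{header_b64}.{payload_b64}", verification_key)
        if hmac.compare_digest(expected, sig):
            return json.loads(b64url_decode(payload_b64))
    raise ValueError("invalid token")  # RS256 branch omitted in this sketch


def strict_verify(token: str, verification_key: bytes,
                  allowed: tuple = ("RS256",)) -> dict:
    """Check the allow-list before touching the key."""
    alg = json.loads(b64url_decode(token.split(".")[0]))["alg"]
    if alg not in allowed:
        raise ValueError(f"algorithm {alg!r} not allowed")
    raise NotImplementedError("RS256 verification needs a crypto library")


# The attacker knows only the public key, which is published in the JWK set.
PUBLIC_KEY_PEM = b"-----BEGIN PUBLIC KEY-----\n(placeholder)\n-----END PUBLIC KEY-----\n"
header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "admin"}).encode())
forged = f"{header}.{payload}." + hs256_sign(f"{header}.{payload}", PUBLIC_KEY_PEM)

print(vulnerable_verify(forged, PUBLIC_KEY_PEM))  # {'sub': 'admin'}
```

Passing the same forged token to `strict_verify` raises before any key material is used, which is the effect of hardcoding algorithms=["RS256"] in a real verifier.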

TIP

What to Say in an Auth System Design Interview

A high-signal opening for "design authentication for [large app]":

"I'd separate AuthN from AuthZ as distinct services. AuthN uses OIDC — OAuth 2.0 Authorization Code flow with PKCE for all user-facing clients (web, mobile, SPA) — issuing short-lived RS256 JWT access tokens (5-15 min TTL) plus longer refresh tokens with rotation on every use. Signing keys live in KMS, published via a JWK set with kid-based rotation. For AuthZ, I'd start with RBAC for coarse admin/user roles and use a Zanzibar-style ReBAC system (SpiceDB) for resource-level sharing permissions — the 'can user X edit document Y because they're in team Z' case. For MFA, WebAuthn passkeys as primary, TOTP as fallback, SMS only for recovery. Revocation is handled by short token TTL + a blocklist for high-value revocations (password reset, device loss). For scale, every resource service validates JWT locally with a cached public key — p99 under 1ms — rather than hitting a central session store. I'd fail-closed on all auth checks."

That covers the eight axes interviewers care about: AuthN/AuthZ split, OAuth flow, token format, key management, authorization model, MFA, revocation, and scale — in under two minutes. Everything else is drill-down.
