Sections

A practical framework for the six dimensions of code quality that interviewers and senior engineers actually evaluate: correctness, clarity, testability, error handling, performance awareness, and maintainability. Covers naming as design, the rule of three, when to comment and what to say, the hidden cost of cleverness, and the before/after transformations that turn adequate code into production-grade code.

50 min read 14 sections 7 interview questions

Code QualityEngineering CraftClean CodeNamingError HandlingTestabilityRefactoringSoftware DesignPythonTypeScriptCode ReadabilityMaintainabilitySingle ResponsibilityAbstraction

The Six Dimensions — What 'Good Code' Actually Means

"Good code" is a phrase everyone uses and almost no one defines precisely. In interviews and on teams, it gets used as a proxy for "code I would have written" — which is not a useful definition.

The six dimensions that actually matter, in the order they should be prioritized:

1. Correctness. The code does what it is supposed to do, including in edge cases and under failure conditions. Non-negotiable; supersedes all other dimensions.

2. Clarity. The code communicates its intent clearly to the future reader — not the author. This is the most underrated dimension and the one that most differentiates senior from junior code.

3. Error handling. Errors are treated as first-class citizens: explicitly named, explicitly handled, explicitly surfaced to the caller. The most common quality failure in production codebases is not algorithmic bugs — it is silenced exceptions and missing error paths.

4. Testability. The code is designed to be tested without excessive mocking, global state, or access to internal details. Testability is an architectural property, not a testing afterthought.

5. Performance awareness. The code is not gratuitously inefficient, and the author is aware of where the expensive operations are. This does not mean premature optimization — it means not writing O(N²) where O(N) exists, and not making 1,000 database calls where 1 would work.

6. Maintainability. The code will be easy to change correctly six months from now. This is the dimension most affected by the other five.

These dimensions are in tension. Maximizing clarity can reduce performance. Maximum testability can require more abstraction than clarity prefers. Good engineering judgment is knowing which dimension to optimize for in context — and making that trade-off explicit rather than implicit.

How to Prioritize When Dimensions Conflict

Step 1: Correctness always wins

No other quality dimension justifies shipping incorrect code. A beautifully readable function that computes the wrong answer is a bug, not a feature. If you are time-constrained and must cut corners, cut them on clarity or performance — never on correctness. Write the test that proves correctness before you polish the code.

Step 2: When clarity conflicts with performance, default to clarity

Most performance problems are not in the code you would make clever. They are in the algorithm choice (O(N²) vs O(N)), the I/O pattern (N+1 queries), or the data structure (list membership check vs set). Premature micro-optimization of clear code rarely moves the needle. If profiling confirms a specific function is the bottleneck, optimize it with a comment explaining why the non-obvious implementation was necessary.

Step 3: When testability conflicts with simplicity, add one layer of abstraction

If a function is simple but untestable because it directly calls a database or HTTP client, extract the dependency into a parameter. This adds one level of abstraction but makes the function testable without infrastructure. The abstraction cost is a function signature with one more parameter; the benefit is a test suite that runs in milliseconds and catches regressions in CI.

Step 4: When maintainability conflicts with delivering quickly, name everything

If you genuinely do not have time to decompose a function correctly, the minimum investment in maintainability is naming: extract magic numbers to named constants, name intermediate variables descriptively, add one comment explaining the 'why.' A future engineer can refactor a function that has good names. They cannot easily refactor a function where every variable is x, d, or result.

Step 5: Style is always last and always automated

If your project does not have a linter and auto-formatter, add one. Style decisions should be made once (when setting up the formatter config) and enforced automatically. If you are spending time in code review on formatting, indentation, or import ordering, you are solving a solved problem manually. Configure black (Python), prettier (TypeScript/JS), or gofmt (Go) and never discuss style in a code review again.

The Six Quality Dimensions — Priority and Interaction

Rendering diagram...

TIP

What Interviewers Are Testing

L4/Mid signal: Writes code that is correct and passes obvious edge cases. Uses meaningful names. Avoids magic numbers. Writes some tests. Has basic error handling.

L5/Senior signal: Explicit error handling in every non-trivial code path. Names that communicate intent to a future reader without context. Functions that do one thing and can be tested in isolation. Comments that explain why, not what. Does not write clever code when clear code achieves the same result.

Staff signal: Code is designed for the reader who joins the company in 6 months. Every abstraction has a name, a reason, and a home. Error types are explicit and catchable. Performance characteristics are known and documented where non-obvious. The code anticipates the ways it will need to change and is structured to make those changes easy without rework.

Naming as Design — The Hardest Part of Programming

Phil Karlton's observation that "there are only two hard things in computer science: cache invalidation and naming things" has endured because it is precisely correct. Bad naming is the most pervasive quality problem in production codebases, and it compounds over time as each new developer reads the bad name and interprets it differently.

The naming test: Can the reader understand what this variable/function/class contains or does without reading any other code? If the answer is no, the name is wrong.

Common naming failures and their fixes:

Abbreviations that require context: usr, dt, cfg, msg. Use full names: user, date_range, database_config, error_message.

Generic names that describe structure instead of content: data, result, temp, obj, list. These say what the thing is syntactically without saying what it is semantically. validated_user_record, checkout_session, pending_transactions are semantic names.

Boolean names that don't read as booleans: status, flag, enabled. Prefer the is/has/can/should prefix: is_active, has_permission, can_checkout, should_retry.

Function names that describe implementation instead of intent: process_data, handle_request, do_thing. Name functions by what they return or accomplish: validate_payment_details, fetch_user_by_email, calculate_tax_for_region.

Names that lie: get_user that creates a user if one doesn't exist. validate_config that also applies the config. send_notification that also logs the notification. These are the worst category — they produce bugs that are hard to trace because the code appears to do one thing while doing another.

Naming — Before and After

pythonnaming_examples.py

# ❌ Before: requires reading the entire function to understand any part
def proc(d, f=False):
    res = []
    for item in d:
        if item["s"] == 1 or f:
            res.append(item)
    return res


# ✅ After: each piece communicates its role without context
ACTIVE_STATUS = 1

def filter_active_orders(
    orders: list[dict],
    include_pending: bool = False,
) -> list[dict]:
    """
    Returns orders with ACTIVE status.
    include_pending=True also includes PENDING orders, used during
    checkout to show in-flight orders that are not yet confirmed.
    """
    if include_pending:
        return [o for o in orders if o["status"] in (ACTIVE_STATUS, PENDING_STATUS)]
    return [o for o in orders if o["status"] == ACTIVE_STATUS]


# ❌ Before: class name, attributes, and methods are all generic
class Manager:
    def __init__(self):
        self.items = []
        self.flag = False

    def process(self, obj):
        if not self.flag:
            self.items.append(obj)


# ✅ After: semantic names at every level
class CheckoutCartManager:
    def __init__(self) -> None:
        self.pending_line_items: list[LineItem] = []
        self.is_payment_in_progress: bool = False

    def add_item(self, line_item: LineItem) -> None:
        # Payment in progress: cart is locked to prevent concurrent modification
        if not self.is_payment_in_progress:
            self.pending_line_items.append(line_item)

Error Handling as a First-Class Citizen

The most common quality failure in production Python and TypeScript codebases is not wrong algorithms — it is swallowed exceptions and silent error paths. Code that works correctly on the happy path but fails invisibly on any error condition.

The three tiers of error handling quality:

(1) Ignore the error (the worst):

try:
    result = payment_gateway.charge(amount)
except Exception:
    pass  # ← This is a bug in every codebase it appears in

The caller has no way to know the charge failed. The user sees a success message. The charge did not happen.

(2) Log and re-raise (acceptable for infrastructure code):

try:
    result = payment_gateway.charge(amount)
except PaymentGatewayError as e:
    logger.error("Payment charge failed", extra={"amount": amount, "error": str(e)})
    raise  # Re-raises the original exception with its traceback intact

Logging is not error handling — it is observation. The caller still needs to handle the failure.

(3) Explicit typed errors surfaced to the caller (production standard):

class PaymentDeclined(Exception):
    def __init__(self, reason: str, decline_code: str) -> None:
        self.reason = reason
        self.decline_code = decline_code

def charge_payment_method(amount_cents: int, payment_method_id: str) -> ChargeResult:
    try:
        return gateway.charge(amount_cents, payment_method_id)
    except GatewayDeclineError as e:
        raise PaymentDeclined(reason=e.message, decline_code=e.code) from e
    except GatewayTimeoutError as e:
        raise PaymentServiceUnavailable(retry_after_seconds=30) from e

The caller knows exactly what can go wrong, can handle each case appropriately, and is not required to read the implementation to know what exceptions to expect.

The golden rule: Every function that can fail should either return an explicit error type (Result pattern) or raise a named exception that the caller can catch. Silent failures — returning None, returning empty, catching and ignoring — are bugs waiting to be discovered in production.

The Single Responsibility Principle — As Practiced, Not As Preached

The Single Responsibility Principle is one of the most cited and least understood principles in software engineering. The common misunderstanding: "a function should do one thing." The useful definition: a function/class should have one reason to change.

In practice, the test is: if requirements change in a specific way, how many functions do I need to modify? If the answer is "many functions scattered across the codebase," the responsibility is not localized.

The pattern that violates SRP most often: functions that combine data retrieval with data transformation with side effects.

# ❌ Three reasons to change: the query changes, the format changes, or the side effect changes
def get_and_process_orders(user_id: int) -> None:
    orders = db.query("SELECT * FROM orders WHERE user_id = ?", user_id)
    processed = [{"id": o["id"], "total": o["price"] * o["qty"]} for o in orders]
    cache.set(f"orders:{user_id}", processed)
    email_service.send_summary(user_id, processed)

# ✅ Each function has exactly one reason to change
def fetch_raw_orders(user_id: int) -> list[RawOrder]:
    return db.query("SELECT * FROM orders WHERE user_id = ?", user_id)

def calculate_order_totals(raw_orders: list[RawOrder]) -> list[OrderSummary]:
    return [OrderSummary(id=o.id, total=o.price * o.quantity) for o in raw_orders]

def cache_order_summaries(user_id: int, summaries: list[OrderSummary]) -> None:
    cache.set(f"orders:{user_id}", summaries)

def send_order_summary_email(user_id: int, summaries: list[OrderSummary]) -> None:
    email_service.send_summary(user_id, summaries)

The separated version has four functions instead of one. The payoff: calculate_order_totals is now pure — no database, no cache, no email — and can be tested with zero mocking. The query can change without touching the transformation. The email behavior can change without touching the cache. Each function has exactly one reason to change.

The rule of three: If you find yourself copy-pasting a block of code a third time, extract it into a named function. The name is the value — it documents what the block is doing and gives the future reader a handle to understand it. Extract at the third duplication, not the second: two instances of similar code might be genuinely different; three is a pattern.

Testability — Design for Tests, Not Tests for Design

typescripttestability_example.ts

// ❌ Hard to test: depends on global state, real time, and real external service
class OrderProcessor {
  async processOrder(orderId: string): Promise<void> {
    const order = await db.query(`SELECT * FROM orders WHERE id = '${orderId}'`);
    const now = new Date();
    if (now.getHours() < 9 || now.getHours() > 17) {
      throw new Error("Orders can only be processed during business hours");
    }
    await fetch(`https://payment-gateway.com/charge`, {
      method: "POST",
      body: JSON.stringify({ amount: order.total }),
    });
    await db.query(`UPDATE orders SET status = 'processed' WHERE id = '${orderId}'`);
  }
}
// Testing this requires: a real database, a real payment gateway, and running at 10am.


// ✅ Testable: dependencies are injected, time is an injectable, pure transformations
interface OrderRepository {
  findById(orderId: string): Promise<Order | null>;
  updateStatus(orderId: string, status: OrderStatus): Promise<void>;
}

interface PaymentGateway {
  charge(amountCents: number): Promise<ChargeResult>;
}

interface Clock {
  now(): Date;
}

class OrderProcessor {
  constructor(
    private readonly orders: OrderRepository,
    private readonly payments: PaymentGateway,
    private readonly clock: Clock,
  ) {}

  async processOrder(orderId: string): Promise<void> {
    const order = await this.orders.findById(orderId);
    if (!order) throw new OrderNotFoundError(orderId);

    this.assertBusinessHours(this.clock.now());

    const charge = await this.payments.charge(order.totalCents);
    if (!charge.succeeded) throw new PaymentDeclinedError(charge.declineReason);

    await this.orders.updateStatus(orderId, OrderStatus.PROCESSED);
  }

  private assertBusinessHours(now: Date): void {
    const hour = now.getHours();
    if (hour < 9 || hour > 17) {
      throw new OutsideBusinessHoursError();
    }
  }
}
// Now: test with in-memory fakes for OrderRepository and PaymentGateway.
// Test business hours logic by injecting a mock Clock with any time.
// No real database. No real HTTP calls. Tests run in milliseconds.

Comments — What to Write and What Not to Write

The most useful mental model for comments: comments explain why, code explains what.

If a reader needs a comment to understand what a piece of code does, the code is not clear enough. Rename the variables, extract the function, or simplify the logic until the code itself communicates its intent. A comment that says "// increment the counter" is noise — the code already says counter += 1.

The four cases where comments genuinely add value:

(1) Non-obvious decisions. When you made a choice that a future engineer will question, explain it.

# Sleep for 100ms before retry to avoid thundering herd against the payment gateway.
# The gateway rate-limits at 500 req/s; at our peak traffic of 450 req/s, without
# backoff all retries would arrive simultaneously and trigger a second rate limit.
time.sleep(0.1)

(2) Links to external context. Bug reports, spec documents, or known limitations that are not captured in the code.

# Stripe rounds to the nearest cent in certain currencies (JPY, HUF) — do not
# apply our own rounding before passing to the gateway or we get double-rounding.
# See: https://stripe.com/docs/currencies#zero-decimal

(3) Known limitations or invariants. Preconditions the caller must satisfy, or postconditions the function guarantees.

def calculate_shipping_rate(weight_kg: float, destination: Address) -> Money:
    # Assumes weight_kg > 0 and destination.country is a valid ISO 3166 code.
    # Returns a non-negative Money amount in the destination country's currency.
    ...

(4) Performance rationale. When an unusual implementation exists for performance reasons, document it so the next engineer does not "simplify" it away.

# Using a set instead of a list here: membership checks are O(1) vs O(N).
# At ~10K items per request, the list version took ~80ms; the set version takes ~0.1ms.
seen_user_ids: set[int] = set()

Comments that should be deleted:

# Returns the user             ← the function name already says this
# Loop through the orders      ← the code shows this
# TODO: fix this               ← create a ticket instead; comments don't get resolved
# This is a hack               ← explain why it's a hack and what the right fix is

⚠ WARNING

The Hidden Cost of Clever Code

Every engineer has written code they were proud of for its elegance or compactness. Some of that code is still causing bugs in production systems because the next engineer could not understand it.

The rule: prefer clear over clever. Clever code optimizes for the author's satisfaction. Clear code optimizes for the reader's comprehension.

Examples of clever code that becomes a maintenance problem:

One-liners that compress too much logic:

# ❌ Clever: what does this return when user is None? When has_orders is False?
result = user and user.is_active and user.has_orders() and user.cart or []

# ✅ Clear: each case is explicit and readable
if user is None or not user.is_active:
    return []
return user.cart if user.has_orders() else []

Implicit boolean coercion for None checks:

# ❌ Ambiguous: is this checking for None, or for any falsy value (0, "", [])?
if not count:
    return default

# ✅ Explicit: clear that None is specifically being checked
if count is None:
    return default

Deeply nested ternaries and comprehensions:

# ❌ Requires the reader to mentally parse three levels of nesting
result = [f(x) for x in data if x > 0 if x % 2 == 0] if data else []

# ✅ The logic is visible without mental parsing
if not data:
    return []
even_positives = [x for x in data if x > 0 and x % 2 == 0]
return [f(x) for x in even_positives]

The test: can a competent engineer who has never seen this codebase understand this code correctly on first read? If not, it is too clever.

Performance Awareness — Know Where Your Expensive Operations Are

Performance awareness does not mean premature optimization. It means: before you write code that runs in a loop, a hot path, or a tight deadline, you know what the expensive operations are and you have not placed them carelessly.

The three performance anti-patterns that appear in every codebase:

(1) N+1 queries. The single most common production performance bug. A loop that fetches related data inside the loop body:

# ❌ N+1: 1 query for orders + N queries for each order's user
orders = Order.objects.filter(status="pending")
for order in orders:
    user = User.objects.get(id=order.user_id)  # 1 query per order
    print(f"{user.email}: {order.total}")

# ✅ 2 queries total regardless of order count
orders = Order.objects.filter(status="pending").select_related("user")
for order in orders:
    print(f"{order.user.email}: {order.total}")  # no additional queries

(2) Sequential I/O where concurrent is possible. HTTP calls, database queries, or file reads that can run in parallel but are written sequentially:

# ❌ Sequential: 3 × API latency (~300ms total if each takes 100ms)
user = fetch_user(user_id)
orders = fetch_orders(user_id)
preferences = fetch_preferences(user_id)

# ✅ Concurrent: max(API latency) (~100ms total)
user, orders, preferences = await asyncio.gather(
    fetch_user(user_id),
    fetch_orders(user_id),
    fetch_preferences(user_id),
)

(3) Repeated computation in a loop. A function call whose result does not change across iterations being called inside the loop:

# ❌ Calls len(blocked_users) on every iteration — O(N) if blocked_users is a list
for user in all_users:
    if len(blocked_users) > 0 and user.id in blocked_users:
        ...

# ✅ Compute once, outside the loop. Use a set for O(1) membership checks.
blocked_user_ids = set(u.id for u in blocked_users)  # computed once
for user in all_users:
    if user.id in blocked_user_ids:  # O(1) lookup
        ...

The guideline: before writing any loop over a collection that might be large, ask — am I making a network call or database query inside this loop? If yes, find the batch API or restructure the query to fetch all data before the loop.

The Six Dimensions Across Real Code Decisions

Scenario	Dimension at Stake	Poor Choice	Good Choice	Why
Function name for finding or creating a user	Clarity	get_user(email)	find_or_create_user(email)	get implies read-only; find_or_create is honest about the side effect
A database call inside a render loop	Performance	Fetch each user's avatar URL in a loop over 100 orders	Fetch all avatar URLs in one query before the loop	N+1 queries; 100 network round-trips vs. 1
Exception from a payment service	Error handling	except Exception: return None	except PaymentDeclinedError as e: raise PaymentFailed(code=e.decline_code)	Silent None is invisible to the caller; typed exception is catchable and informative
A function that validates and saves	Testability / SRP	def validate_and_save_user(data)	separate validate_user_data(data) and save_user(validated)	Combined function requires DB to test validation; separated function tests validation without DB
Magic number 86400 in a timeout	Clarity / Maintainability	if age > 86400:	SECONDS_IN_ONE_DAY = 86_400 and if age > SECONDS_IN_ONE_DAY:	86400 is opaque; named constant communicates intent and is refactored in one place
Comment on a one-liner sort	Clarity	# sort by date	Remove the comment; name the variable sorted_by_creation_date	The comment says what the code says; rename the variable instead

Level Differentiation: Code Quality by Engineering Level

Level	Correctness	Clarity	Error Handling	Testability	What They Get Wrong
L3 / Junior	Happy path works; edge cases missing	Functional names; some abbreviations; magic numbers	Exceptions caught and ignored or not caught at all	Code works but cannot be tested without global state or real dependencies	Does not think about the future reader; error handling is an afterthought
L4 / Mid	Edge cases mostly covered; error handling in hot paths	Mostly semantic names; avoids magic numbers; some SRP violations	Errors logged; not always re-raised or typed	Some injection; tests exist but mock heavily	Function does 2-3 things; naming is correct but not precise; performance anti-patterns in non-obvious places
L5 / Senior	Comprehensive: edge cases, error paths, concurrent access	Semantic names throughout; SRP applied; comments explain why not what	Typed exceptions surfaced to caller; no swallowed errors; retry logic explicit	Dependencies injected; pure functions for business logic; tests don't require real infrastructure	Rare: occasionally over-engineers for flexibility that won't be needed
Staff	Designs for correctness at the system level, not just the function level	Code is readable by the engineer who will join 6 months from now	Error taxonomy is designed: which errors are retryable, which are fatal, which need user action	Code is designed so tests describe system behavior, not implementation details	This is the target — every dimension explicit, every trade-off documented

Interview Questions

Click to reveal answers

Test your knowledge

Sign in to take the Quiz

This topic has 15 quiz questions with instant feedback and detailed explanations. Sign in to unlock quizzes.