Loss Functions: Choosing the Right Objective for Every ML Problem
The most underestimated ML interview topic. Covers regression losses (MSE, MAE, Huber), classification losses (cross-entropy, focal loss), ranking and embedding losses (triplet, InfoNCE), and the exact decision framework for choosing which loss to use — and why the wrong choice silently destroys model quality.
Why Loss Function Choice Is a Design Decision, Not a Default
Most practitioners accept the loss function as a default: cross-entropy for classification, MSE for regression. This is a mistake. The loss function defines what your model actually optimizes; it is the most direct expression of the business objective in mathematical form. Using the wrong loss produces a model that minimizes the metric you specified, not the outcome you wanted.
Three examples where defaults fail:
MSE for skewed targets (revenue prediction): MSE penalizes errors in proportion to their square, so a $100 error receives 10,000× the penalty of a $1 error (100² vs 1²). On a heavy-tailed revenue distribution, the handful of large transactions dominates the gradient, and the model fits the high-value tail at the expense of typical transactions. Use MAE or Huber loss for heavy-tailed targets.
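To make the tail-domination concrete, here is a minimal PyTorch sketch comparing the per-example penalties that MSE, MAE, and Huber assign to a typical error versus a tail error. The delta of 1,000 is an arbitrary assumption for illustration; in practice you would tune it to the scale of "typical" errors on your data.

```python
import torch
import torch.nn.functional as F

# One typical prediction (off by $1) and one tail prediction (off by $10,000).
pred   = torch.tensor([1_000.0,   990_000.0])
target = torch.tensor([1_001.0, 1_000_000.0])

mse   = F.mse_loss(pred, target, reduction="none")    # squared error
mae   = F.l1_loss(pred, target, reduction="none")     # absolute error
huber = F.huber_loss(pred, target, reduction="none",  # quadratic near zero,
                     delta=1_000.0)                   # linear beyond delta (assumed value)

print(mse)    # tensor([1.0000e+00, 1.0000e+08])  -> tail error weighted 10^8x
print(mae)    # tensor([1.0000e+00, 1.0000e+04])  -> tail error weighted 10^4x
print(huber)  # tensor([5.0000e-01, 9.5000e+06])  -> tail penalty grows only linearly
```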
Cross-entropy for extreme class imbalance (fraud detection, 1% positive rate): Standard cross-entropy weights every example equally, so with 99:1 imbalance the loss is dominated by easy negatives; a model can reach 99% accuracy by predicting the majority class while learning nothing about fraud. Use focal loss, which down-weights well-classified examples so that training focuses on the hard, rare positives.
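Here is a minimal sketch of binary focal loss as formulated by Lin et al. (2017); gamma=2.0 and alpha=0.25 are the paper's defaults, and the example logits below are made up purely for illustration.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss: scales BCE by (1 - p_t)^gamma so that
    confidently-correct (easy) examples contribute almost nothing."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)              # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balance weight
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

# An easy negative (confidently correct) vs. a hard positive (misclassified):
logits  = torch.tensor([-6.0, -1.0])
targets = torch.tensor([ 0.0,  1.0])
print(focal_loss(logits, targets))  # loss is dominated by the hard positive
```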
MSE for generative models (image reconstruction): MSE averages pixel errors independently, so when several sharp outputs are equally plausible, the loss is minimized by their pixelwise mean, which is a blurry image. Use perceptual loss or adversarial loss (GAN) for sharp generation.
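Below is a minimal sketch of one common form of perceptual loss: comparing images in the feature space of a frozen, pretrained VGG16. The cut at features[:16] (through relu3_3) and the use of MSE on the feature maps are conventional choices, not the only ones; inputs are assumed to be in [0, 1] and are normalized with the ImageNet statistics pretrained VGG expects.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

# Frozen VGG16 feature extractor, cut after relu3_3 (features[:16]) -- a common choice.
_vgg = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()
for p in _vgg.parameters():
    p.requires_grad_(False)

# ImageNet channel statistics that pretrained VGG expects.
_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
_STD  = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def perceptual_loss(recon, target):
    """Compare images in VGG feature space rather than pixel space.
    recon, target: (N, 3, H, W) tensors with values in [0, 1]."""
    recon  = (recon  - _MEAN) / _STD
    target = (target - _MEAN) / _STD
    return F.mse_loss(_vgg(recon), _vgg(target))
```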