The Limits of Accuracy: A Broader Approach to Model Evaluation

6 min read · Apr 4, 2025

Suppose we have a model that predicts the colour of a ball. We have 5 red balls and 5 blue balls, and we ask our model to predict the colour of each one. How can we evaluate our model’s success?

N.B. Throughout this article, we’ll refer to red as the positive class and blue as the negative.

The problem with accuracy

One approach would be to count the number of predictions that our model gets right. We count the red balls that the model predicts to be red (the number of True Positives) and the blue balls that the model predicts to be blue (the number of True Negatives). The sum of these two values is the total number of correct predictions. Dividing this by the total number of balls gives us the accuracy of the model.
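To make the calculation concrete, here is a minimal Python sketch of accuracy for the ball example. The function name `accuracy` and the list of predictions are made up for illustration (the article does not give the model's actual output); only the 5 red and 5 blue balls, with red as the positive class, come from the text.

```python
def accuracy(actual, predicted, positive="red"):
    """Return (True Positives + True Negatives) / total predictions."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one prediction")

    pairs = list(zip(actual, predicted))
    # True Positives: red balls that the model predicted to be red.
    true_positives = sum(1 for a, p in pairs if a == positive and p == positive)
    # True Negatives: blue balls that the model predicted to be blue.
    true_negatives = sum(1 for a, p in pairs if a != positive and p != positive)

    return (true_positives + true_negatives) / len(actual)


# 5 red balls and 5 blue balls, as in the example above.
actual = ["red"] * 5 + ["blue"] * 5

# Hypothetical predictions (an assumption for this sketch):
# 4 red balls called red, 1 red ball called blue,
# 3 blue balls called blue, 2 blue balls called red.
predicted = ["red"] * 4 + ["blue"] + ["blue"] * 3 + ["red"] * 2

print(accuracy(actual, predicted))  # (4 + 3) / 10
```

With these made-up predictions the model gets 4 True Positives and 3 True Negatives, so 7 of its 10 predictions are correct.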

Published in Data Science Collective

Written by James Wilkins
