Average Precision

If you have ever wondered how to evaluate an Object Detection algorithm, you have probably heard of the well-known mean Average Precision (mAP) metric. As the name suggests, mAP is calculated as the mean of Average Precision scores. So, to understand mAP, you must first understand the Average Precision concept. On this page, we will:

Cover the logic behind the Average Precision metric;
Find out how to interpret the metric’s value;
Calculate Average Precision on a simple example;
And see how to work with Average Precision using Python.

Let’s jump in.

As the name suggests, Average Precision is based on the Precision score metric derived from the Confusion matrix. Also, it uses the Recall and Precision-Recall curve concepts. So, to better grasp the metric, please check out the Confusion matrix, Precision score, Recall score, and Precision-Recall curve pages first.

What is the Average Precision score?

Like the Area under the Precision-Recall curve (AUC-PR) metric, Average Precision is a way to summarize the PR curve into a single value. To define the term, the Average Precision metric (or just AP) is the weighted mean of Precision scores achieved at each PR curve threshold, with the increase in Recall from the previous threshold used as the weight.

Average Precision formula

Sure, such a definition might be tough to process. Still, everything will become accessible as soon as you look at the formula.
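
One common way to write the definition above is shown below. The form is assumed here to match the one given in the Scikit-learn documentation, and the symbols $P_n$ and $R_n$ (Precision and Recall at the $n$-th threshold) are introduced only for illustration:

```latex
\mathrm{AP} = \sum_{n} \left( R_n - R_{n-1} \right) P_n
```

In this notation, the thresholds are indexed so that Recall does not decrease as $n$ grows, which makes each difference $R_n - R_{n-1}$ the Recall gained at threshold $n$.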

So, the general Average Precision calculation algorithm is as follows:

1. Get the predictions from your model, define the thresholds, and build a Precision-Recall curve (in the multiclass case, you can compute Micro or Macro Precision/Recall, for example);
2. Use a loop that goes through all Precision/Recall pairs;
3. Calculate the difference between the current and next Recall values (this is the weight);
4. Multiply the weight by the current Precision value;
5. Repeat steps 2-4 for the next pair;
6. Sum up the obtained scores;
7. Analyze the Average Precision value.
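
The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `average_precision` is hypothetical, and the sketch assumes the pairs are already ordered by decreasing Recall and end with the Recall 0 point of the curve.

```python
def average_precision(precisions, recalls):
    """Sum Precision values weighted by the drop in Recall to the next pair.

    Assumes the pairs are ordered by decreasing Recall and that the last
    pair is the end point of the curve (Recall 0), which carries no weight.
    """
    if len(precisions) != len(recalls):
        raise ValueError("precisions and recalls must have the same length")
    total = 0.0
    # Walk through consecutive Precision/Recall pairs.
    for i in range(len(recalls) - 1):
        weight = recalls[i] - recalls[i + 1]  # current Recall minus next Recall
        total += weight * precisions[i]       # weight times current Precision
    return total                              # the accumulated sum is the AP


# Hypothetical three-point curve, used only for illustration.
ap = average_precision([0.5, 1.0, 1.0], [1.0, 0.5, 0.0])
print(ap)  # 0.75
```

The last pair never contributes a term of its own because there is no following Recall value to subtract.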

Additionally, in real life, if you face a multiclass case, you might want to calculate the Average Precision score for each class separately. Such an approach will give you a better view of the algorithm’s performance as you will build PR curves for each category and understand whether your model is good at detecting specific class objects.
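
A per-class calculation could look like the sketch below. It treats each class as a separate one-vs-rest problem and uses a rank-based form of Average Precision (the mean of the Precision values at the rank of every positive sample), which gives the same result as the Recall-weighted sum when all scores are distinct. The helper `ap_from_scores`, the class names, and the scores are all hypothetical.

```python
def ap_from_scores(is_positive, scores):
    """AP for one class: mean Precision at the rank of every positive.

    Assumes all scores are distinct, so each positive raises Recall by
    exactly 1 / (number of positives).
    """
    n_pos = sum(is_positive)
    if n_pos == 0:
        raise ValueError("the class has no positive samples")
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, positive) in enumerate(ranked, start=1):
        if positive:
            hits += 1
            total += hits / rank  # Precision at this cut-off
    return total / n_pos


# Hypothetical labels and one-vs-rest scores for three classes.
y_true = ["cat", "dog", "bird", "cat", "dog", "bird"]
scores = {
    "cat": [0.90, 0.20, 0.10, 0.60, 0.70, 0.30],
    "dog": [0.05, 0.80, 0.30, 0.25, 0.60, 0.10],
    "bird": [0.05, 0.40, 0.20, 0.15, 0.30, 0.60],
}

per_class_ap = {
    name: ap_from_scores([label == name for label in y_true], class_scores)
    for name, class_scores in scores.items()
}
for name, value in per_class_ap.items():
    print(name, round(value, 3))
```

Comparing the per-class values shows which categories the model ranks well and which ones it confuses with others.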

Interpreting Average Precision

It is easy to understand the AP value itself. The more correct predictions the model makes, the better the PR curve and, as a result, the higher the Average Precision. The higher the metric value, the better. The best possible score is 1, and the worst is 0.
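
To get a feel for the scale, the small sketch below compares three rankings of the same four samples. The helper `ap_ranked` is hypothetical and expects labels that are already sorted by descending score, with 1 marking a positive sample and at least one positive present.

```python
def ap_ranked(labels_in_ranked_order):
    """AP for labels already sorted by descending score (1 = positive)."""
    hits, total = 0, 0.0
    for rank, label in enumerate(labels_in_ranked_order, start=1):
        if label:
            hits += 1
            total += hits / rank  # Precision at the rank of this positive
    return total / hits


perfect = ap_ranked([1, 1, 0, 0])         # every positive above every negative
mixed = ap_ranked([1, 0, 1, 0])           # one negative ranked above a positive
reversed_order = ap_ranked([0, 0, 1, 1])  # every negative above every positive

print(round(perfect, 3), round(mixed, 3), round(reversed_order, 3))
```

On such a small sample, even the fully reversed ranking stays well above 0, because Precision at the last cut-off always equals the share of positives in the data.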

However, it is difficult to set universal benchmarks for Average Precision because what counts as a good score varies widely depending on the Machine Learning task, the type of problem (binary or multiclass), and so on. So, we suggest you dive deeper into your task and develop your own benchmarking logic if you want to use Average Precision as an evaluation metric.

Average Precision calculation example

Let’s check out how to calculate Average Precision on a simple example. Suppose we have the following Precision/Recall pairs, listed from the highest Recall to the lowest.

Precision	Recall
0.5	1
0.7	0.6
0.75	0.5
0.9	0.3
1	0

Let’s start with calculating the weights:

1 - 0.6 = 0.4;
0.6 - 0.5 = 0.1;
0.5 - 0.3 = 0.2;
0.3 - 0 = 0.3.

Precision	Recall	Recall weight
0.5	1	0.4
0.7	0.6	0.1
0.75	0.5	0.2
0.9	0.3	0.3
1	0	-

Now it is time to multiply the weights by the corresponding Precision values:

0.5 * 0.4 = 0.2;
0.7 * 0.1 = 0.07;
0.75 * 0.2 = 0.15;
0.9 * 0.3 = 0.27.

The final step is to sum up the obtained values:

Average Precision = 0.2 + 0.07 + 0.15 + 0.27 = 0.69
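
The arithmetic above can be double-checked with a short plain-Python sketch. The variable names are chosen for illustration only, and the values are the Precision/Recall pairs from this example.

```python
# Precision/Recall pairs from the example, ordered by decreasing Recall.
precision = [0.5, 0.7, 0.75, 0.9, 1.0]
recall = [1.0, 0.6, 0.5, 0.3, 0.0]

# Weight of each pair: its Recall minus the Recall of the next pair.
weights = [current - following for current, following in zip(recall, recall[1:])]

# The last pair (Recall 0) has no following pair, so zip leaves it out.
terms = [p * w for p, w in zip(precision, weights)]
average_precision = sum(terms)

print([round(w, 2) for w in weights])  # [0.4, 0.1, 0.2, 0.3]
print(round(average_precision, 2))     # 0.69
```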

Average Precision in Python

Average Precision as a standalone Machine Learning metric is not that popular in the industry. In practice, it mostly serves as the basis for the somewhat more complex mean Average Precision metric. For that reason, this page includes a single code block showing how to work with Average Precision in Python through the Scikit-learn (Sklearn) library.

In Sklearn, Average Precision is available as the average_precision_score function in the sklearn.metrics module.

Average Precision in Sklearn (Scikit-learn)

  

```python
# Importing the libraries and functions
import numpy as np
from sklearn.metrics import average_precision_score

# Defining the arrays
y_true = np.array([0, 0, 1, 1])
# Predicted probabilities that each object belongs to class 1
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

# Calculating the result
print(average_precision_score(y_true, y_scores))  # approximately 0.83
```
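
To see where that value comes from, the same arrays can be processed by hand without any library. The sketch below is only an illustration under stated assumptions: every distinct score is used as a threshold, a sample is predicted positive when its score is greater than or equal to the threshold, and the curve is closed with the conventional point of Precision 1 and Recall 0. It is not meant to reproduce the internals of Sklearn.

```python
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]

n_pos = sum(y_true)
pairs = []  # (Precision, Recall) per threshold, lowest threshold first
for threshold in sorted(set(y_scores)):
    predicted = [score >= threshold for score in y_scores]
    tp = sum(1 for p, t in zip(predicted, y_true) if p and t)
    fp = sum(1 for p, t in zip(predicted, y_true) if p and not t)
    pairs.append((tp / (tp + fp), tp / n_pos))
pairs.append((1.0, 0.0))  # assumed end point of the curve

# Weight each Precision by the drop in Recall to the next pair, then sum.
ap = sum(p * (r - r_next) for (p, r), (_, r_next) in zip(pairs, pairs[1:]))
print(round(ap, 2))  # 0.83
```

Two of the four thresholds leave Recall unchanged, so their weights are 0 and only the remaining two terms contribute to the sum.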

