- Getting started
- Introduction to the Wiki
- Overview of topics
- How to contribute
- General best practices
- Key principles of Computer Vision
- Convolution
- Advanced convolution techniques and layers
- Pooling
- Overfitting
- Underfitting
- Overfitting Vs. Underfitting in Machine Learning
- Upsampling and Downsampling techniques in Machine Learning
- Computer Vision tasks
- The complete glossary of the modern Computer Vision tasks
- Classification / Tagging
- Object Detection
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation
- Attribute Prediction
- Computer Vision model architectures
- ResNet
- Faster R-CNN
- Mask R-CNN
- DeepLabv3+
- U-Net
- FBNetV3
- U-Net++
- Efficient Net
- PAN
- PSPNet
- LinkNet
- FPN
- RetinaNet
- Cascade R-CNN
- FBNetV3IS
- FBNetV3OD
- CascadeMask R-CNN
- HybridTask Cascade
- Computer Vision metrics
- Confusion Matrix
- Intersection over Union (IoU)
- Accuracy
- Hamming score
- Precision
- Recall
- Precision-Recall curve and AUC-PR
- F-score
- Average Precision
- mean Average Precision (mAP)
- Loss functions in Machine Learning
- Comprehensive overview of loss functions in Machine Learning
- Cross-Entropy Loss
- Binary Cross-Entropy Loss
- Focal loss
- Bounding Box Regression Loss
- CrossEntropyIoULoss2D
- Average Loss
- Solver / Optimizer
- Comprehensive overview of solvers/optimizers in Deep Learning
- Adam
- SGD
- Adadelta
- Adagrad
- AdaMax
- Adamw
- ASGD
- Rprop
- RMSprop
- Lion
- Weight Decay
- Base Learning Rate
- Momentum (SGD)
- Epsilon Coefficient
- Training Parameters
- Patience
- Min delta
- Seed
- Everything you need to know about batches in Machine Learning
- Iterations
- Epoch
- Scheduler
- Comprehensive overview of learning rate schedulers in Machine Learning
- ExponentialLR
- CyclicLR
- StepLR
- MultiStepLR
- ReduceLROnPlateau
- CosineAnnealingLR
- Computer Vision augmentations
- Comprehensive overview of augmentations in Machine Learning
- Horizontal Flip
- Vertical Flip
- Random Crop
- Random Sized Crop
- Rotate
- Resize
- Blur
- Smallest max size
- Center Crop
- Color Jitter
- Gaussian Noise
- Shift Scale Rotate
- Longest max size
- Equalize
- To gray
- Shear
- Mosaic
- Copy Paste
- Extrapolation methods
- Interpolation methods
- Deployment
- Primitive deployment using web frameworks
- Commonly used web frameworks
- Containerized Deployment
- Orchestrated Deployment
- Challenges of Deployment
- Splits
- Data Splitting in Machine Learning

If you have ever tried solving a Classification task using a Machine Learning (ML) algorithm, you have probably heard of the well-known Precision score metric. On this page, we will:

- Cover the logic behind the metric (both for the binary and multiclass cases);
- Check out the metric’s formula;
- Find out how to interpret the Precision value;
- Calculate Precision on simple examples;
- Dive a bit deeper into the Micro and Macro Precision scores;
- And see how to work with the Precision score using Python.

Let’s jump in.

The Precision score is derived from the Confusion matrix. So, to better grasp the metric, please check out the Confusion matrix page first.

The Accuracy score has two drawbacks: it is misleading on imbalanced data, and it is uninformative as a standalone Machine Learning metric. To address them, Data Scientists developed two new metrics that give researchers a better view of a model’s performance.

To date, these metrics are widely used to evaluate Classification algorithms across the industry. They are called:

- Precision;
- And Recall.

Usually, Precision and Recall are used together as they nicely complement each other. So, to learn more, please check out the Recall page when you finish reading this one.

To define the term, in Machine Learning, the Precision score (or just Precision) is a Classification metric that measures the fraction of Positive class predictions that are actually Positive according to the ground truth. In other words, Precision measures the ability of a classifier not to label a Negative sample as Positive.

In a sense, Accuracy and Precision are somewhat similar, and many newcomers mistake these metrics for interchangeable ones. They are not. They differ in concept, so please keep that in mind when choosing a metric for your next Machine Learning project.

Anyway, to evaluate a Classification model using the Precision score, you need to have:

- The ground truth classes;
- And the model’s predictions.

The Precision score is an intuitive metric, so you should not experience any challenges in understanding it.

To get the Precision score value, you need to divide the number of True Positives (the predicted class for a sample is Positive and the ground truth class is also Positive) by the number of all Positive predictions, that is, True Positives plus False Positives (the predicted class for a sample is Positive, but the ground truth class is Negative).

Precision addresses the imbalance problem of Accuracy. If the class distribution is skewed, Accuracy lets you assign all the samples to the majority class and still get a high metric value, although such a model has no predictive power. With Precision, this shortcut does not pay off: labeling every sample as Positive floods the predictions with False Positives and lowers the score, while labeling every sample as Negative yields no True Positives at all. Because True Negatives never enter the formula, a large Negative class cannot inflate the metric, which is why Precision stays informative on imbalanced data.
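To make the contrast with Accuracy concrete, here is a minimal pure-Python sketch on a skewed dataset (100 Positive and 10000 Negative samples). The helper names are hypothetical, and returning 0.0 when a model makes no Positive predictions is an assumption of this sketch, not part of the metric's definition.

```python
def accuracy(y_true, y_pred):
    """Share of samples whose predicted class equals the ground truth."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """TP / (TP + FP) for the class labeled `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    # Assumption: report 0.0 when there are no Positive predictions at all
    return tp / (tp + fp) if (tp + fp) else 0.0


# Skewed ground truth: 100 Positive samples, 10000 Negative samples
y_true = [1] * 100 + [0] * 10000

always_negative = [0] * len(y_true)  # a "model" that only predicts Negative
always_positive = [1] * len(y_true)  # a "model" that only predicts Positive

print('Accuracy (always Negative): %.3f' % accuracy(y_true, always_negative))
print('Precision (always Negative): %.3f' % precision(y_true, always_negative))
print('Precision (always Positive): %.3f' % precision(y_true, always_positive))
```

The always-Negative model reaches an Accuracy of about 0.99 without learning anything, while neither degenerate model gets a Precision above 0.01.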

To simplify the formula, let’s visualize it.
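Written out in formula form, with TP and FP denoting the numbers of True Positive and False Positive predictions as defined above:

```latex
\text{Precision} = \frac{TP}{TP + FP}
```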

As you can see, Precision can be easily described using the Confusion matrix terms such as True Positive and False Positive. Still, as described on the Confusion matrix page, these terms are mainly used for the binary Classification tasks.

So, the Precision score algorithm for the binary Classification task is as follows:

- Get predictions from your model;
- Calculate the number of True Positive and False Positive predictions;
- Use the formal Precision formula;
- And analyze the obtained value.
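The steps above can be sketched in a few lines of plain Python. The function name and the toy labels are hypothetical, and returning 0.0 when there are no Positive predictions is an assumption made for this sketch.

```python
def binary_precision(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP) for the class labeled `positive`."""
    tp = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if pred != positive:
            continue  # only Positive predictions enter the formula
        if truth == positive:
            tp += 1   # predicted Positive, ground truth Positive
        else:
            fp += 1   # predicted Positive, ground truth Negative
    if tp + fp == 0:
        return 0.0    # assumption: no Positive predictions -> 0.0
    return tp / (tp + fp)


# Toy ground truth and predictions (illustrative values only)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]

# 5 Positive predictions, 3 of them correct
print('Precision: %.3f' % binary_precision(y_true, y_pred))
```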

For the binary case, the workflow is straightforward. However, there are also multiclass use cases, and this is when things might get a bit tricky. In general, there are various approaches you can take when calculating Precision for the multiclass task. There are at least three different options, as you can see in the sklearn Precision score metric function:

- Micro;
- Macro;
- And Weighted.

Each of these approaches is solid and can be very helpful in model evaluation. Also, in real life, you will likely calculate the metric value using all of them to get a more comprehensive view of a problem. Please check out the micro and macro Precision score calculation examples below or the scikit-learn documentation page if you want to learn more.

So, the Precision score algorithm for the multiclass Classification task is as follows:

- Get predictions from your model;
- Identify the multiclass calculation approach you feel is the best for your task;
- Use a Machine Learning library (for example, sklearn) to do the calculations for you;
- And analyze the obtained value while keeping in mind the approach you used to get it.
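The Weighted option is the only one of the three that is not worked through by hand on this page, so here is a small sketch of it. It averages the per-class Precision values, weighting each class by its support (the number of ground truth samples of that class). The function names and the toy labels are hypothetical; in practice a library call would do this for you.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: precision}, treating each label in turn as Positive."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t == label)
        predicted = sum(1 for p in y_pred if p == label)  # TP + FP
        # Assumption: a class that is never predicted gets 0.0
        scores[label] = tp / predicted if predicted else 0.0
    return scores


def weighted_precision(y_true, y_pred):
    """Per-class Precision averaged with weights equal to class support."""
    scores = per_class_precision(y_true, y_pred)
    support = Counter(y_true)  # number of ground truth samples per class
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy labels (illustrative values only)
y_true = ['cat', 'cat', 'cat', 'dog', 'dog', 'bird']
y_pred = ['cat', 'dog', 'cat', 'dog', 'cat', 'bird']

print(per_class_precision(y_true, y_pred))
print('Weighted Precision: %.3f' % weighted_precision(y_true, y_pred))
```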

In the Precision case, the metric value interpretation is straightforward. The larger the share of Positive predictions that turn out to be correct, the higher the Precision score. The higher the metric value, the better. The best possible value is 1 (if every Positive prediction is correct), and the worst is 0 (if none of the Positive predictions is correct).

From our experience, for both multiclass and binary use cases, you should consider **Precision > 0.85** as an excellent score, **Precision > 0.7** as a good one, and any other score as a poor one. Still, you can set your own thresholds, as your logic and task might differ greatly from ours. Also, please be careful in the multiclass cases, as you might get a model with a high metric value on one class but a low one on another. So, always try to see the bigger picture. Do not rely on a single value averaged across the classes.

Understanding whether you got a high or low Precision value is good, but what does this value mean in the grand scheme of things? The higher the Precision score value, the higher the probability that a sample the classifier labels as Positive is actually Positive. So, with high Precision, you can trust the model’s Positive predictions. Note that Precision says nothing about the Positive samples the model missed, which is what Recall measures.

Let’s say we have a binary Classification task. For example, you are trying to determine whether a cat or a dog is on an image, with the cat being the Positive class. You have a model and want to evaluate its performance using Precision. You pass **15** pictures with a cat and **20** images with a dog to the model. From the given **15** cat images, the algorithm predicts **9** pictures as the dog ones, and from the **20** dog images, it predicts **6** pictures as the cat ones. Let’s build a Confusion matrix first (you can check the detailed calculation on the Confusion matrix page).

Excellent, now let’s calculate the Precision score using the formula for the binary Classification use case (the number of correct predictions is in the green cells of the table, and the number of the incorrect ones is in the red cells).

**Precision** = TP / (TP + FP) = 6 / (6 + 6) = 0.5
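The same arithmetic can be checked in Python. The sketch below assumes the cat is the Positive class, which is the reading that reproduces the numbers in the formula; the variable names are hypothetical.

```python
# Counts taken from the example above; the cat is treated as the Positive class
cat_images, dog_images = 15, 20
cats_predicted_as_dog = 9
dogs_predicted_as_cat = 6

tp = cat_images - cats_predicted_as_dog  # cats predicted as cats
fp = dogs_predicted_as_cat               # dogs predicted as cats
fn = cats_predicted_as_dog               # cats predicted as dogs
tn = dog_images - dogs_predicted_as_cat  # dogs predicted as dogs

# Only TP and FP are needed for Precision
precision = tp / (tp + fp)
print('TP=%d FP=%d FN=%d TN=%d' % (tp, fp, fn, tn))
print('Precision: %.3f' % precision)
```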

Ok, great. Let’s expand the task and add another class, for example, the bird one. You pass **15** pictures with a cat, **20** images with a dog, and **12** pictures with a bird to the model. The predictions are as follows:

- **15** cat images: **9** dog pictures, **3** bird ones, and **15 - 9 - 3 = 3** cat images;
- **20** dog images: **6** cat pictures, **4** bird ones, and **20 - 6 - 4 = 10** dog images;
- **12** bird images: **4** dog pictures, **2** cat ones, and **12 - 4 - 2 = 6** bird images.

Let’s build the matrix.
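As a rough sketch, the matrix can be rebuilt in Python from the counts listed above. The layout used here (rows are predicted classes, columns are ground truth classes, in the order cat, dog, bird) is an assumption chosen so that the row-by-row Precision calculation that follows works as described.

```python
classes = ['cat', 'dog', 'bird']

# predictions[truth][predicted] = number of images, copied from the example above
predictions = {
    'cat':  {'cat': 3, 'dog': 9,  'bird': 3},
    'dog':  {'cat': 6, 'dog': 10, 'bird': 4},
    'bird': {'cat': 2, 'dog': 4,  'bird': 6},
}

# Assumed layout: one row per predicted class, one column per ground truth class
matrix = [[predictions[truth][pred] for truth in classes] for pred in classes]

# Printing the matrix with a header row
print(f"{'pred/truth':<10} " + ' '.join(f'{c:>5}' for c in classes))
for pred, row in zip(classes, matrix):
    print(f'{pred:<10} ' + ' '.join(f'{n:>5}' for n in row))
```

With this layout, the diagonal holds the True Positives of each class, and the rest of each row holds that class's False Positives.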

The Macro Precision score is a way to study the classification as a whole. To calculate it, you need to compute the metric independently for each class and take the unweighted mean of the per-class values. The Macro approach treats all the classes equally, regardless of their size, as it aims to see the bigger picture and evaluate the algorithm’s performance across all the classes in one value.

Let’s calculate the Precision value for each class. To do so, we need to go row by row (the green cell holds the True Positive predictions for a specific class, whereas the red cells hold the False Positives):

- **Dog Precision**: 10 / (4 + 9 + 10) = 10 / 23 ~ 0.43;
- **Bird Precision**: 6 / (4 + 3 + 6) = 6 / 13 ~ 0.46;
- **Cat Precision**: 3 / (6 + 2 + 3) = 3 / 11 ~ 0.27;
- **Macro Precision score**: (Dog Precision + Bird Precision + Cat Precision) / 3 = (0.43 + 0.46 + 0.27) / 3 ~ 0.39.
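The per-class and Macro values can be reproduced with a short script. The matrix layout (rows are predicted classes, columns are ground truth classes) is an assumption of this sketch. The script works with unrounded per-class values, so it gives a Macro Precision of about 0.39.

```python
# Assumed layout: rows = predicted class, columns = ground truth class
classes = ['cat', 'dog', 'bird']
matrix = [
    [3, 6, 2],   # predicted cat:  3 cats, 6 dogs, 2 birds
    [9, 10, 4],  # predicted dog:  9 cats, 10 dogs, 4 birds
    [3, 4, 6],   # predicted bird: 3 cats, 4 dogs, 6 birds
]

# Per-class Precision: diagonal cell (TP) divided by the row total (TP + FP)
per_class = {
    name: matrix[i][i] / sum(matrix[i])
    for i, name in enumerate(classes)
}

# Macro Precision: unweighted mean of the per-class values
macro = sum(per_class.values()) / len(classes)

for name, value in per_class.items():
    print('%s Precision: %.2f' % (name, value))
print('Macro Precision: %.2f' % macro)
```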

On the other hand, the Micro Precision score works at the level of individual predictions rather than classes. To calculate it, you need to sum the True Positives across all the classes and divide by the sum of all True Positive and False Positive predictions across all the classes. Thus, every prediction contributes equally to the Micro Precision score, so the classes that are predicted more often have a larger influence on it.

Let’s calculate the Micro Precision score value for our use case:

**Micro Precision score**: (TP Dog + TP Bird + TP Cat) / ((TP + FP) Dog + (TP + FP) Bird + (TP + FP) Cat) = (10 + 6 + 3) / ((4 + 9 + 10) + (4 + 3 + 6) + (6 + 2 + 3)) = 19 / 47 ~ 0.4
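The Micro calculation can be verified the same way. As before, the matrix layout (rows are predicted classes, columns are ground truth classes) is an assumption of this sketch.

```python
# Assumed layout: rows = predicted class, columns = ground truth class
classes = ['cat', 'dog', 'bird']
matrix = [
    [3, 6, 2],   # predicted cat
    [9, 10, 4],  # predicted dog
    [3, 4, 6],   # predicted bird
]

# Sum of True Positives across all the classes (the diagonal)
tp_total = sum(matrix[i][i] for i in range(len(classes)))

# Sum of (TP + FP) across all the classes, which is every prediction made
predicted_total = sum(sum(row) for row in matrix)

micro = tp_total / predicted_total
print('Micro Precision: %d / %d = %.3f' % (tp_total, predicted_total, micro))
```

In a single-label multiclass task like this one, every prediction is a Positive prediction for exactly one class, so the denominator equals the total number of samples and Micro Precision coincides with Accuracy.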

The Precision score is widely used in the industry, so most major Machine and Deep Learning libraries have their own implementation of this metric. For this page, we prepared three code blocks that show how to calculate Precision in Python. In detail, you can check out:

- Precision in Scikit-learn (Sklearn);
- Precision in TensorFlow;
- Precision in PyTorch.

Scikit-learn is the most popular Python library for classical Machine Learning. From our experience, Sklearn is the tool you will likely use the most to calculate Precision (especially if you are working with tabular data). Fortunately, it takes only a few lines of code.

```
# Importing the function
from sklearn.metrics import precision_score
# Initializing the ground truth array
act_pos = [1 for _ in range(100)]
act_neg = [0 for _ in range(10000)]
y_true = act_pos + act_neg
# Initializing the predictions array
pred_pos = [0 for _ in range(10)] + [1 for _ in range(90)]
pred_neg = [0 for _ in range(10000)]
y_pred = pred_pos + pred_neg
# Calculating and printing the result
precision = precision_score(y_true, y_pred, average='binary')
print('Precision: %.3f' % precision)
```

In the vision AI field, the Precision score calculation is slightly different. For instance segmentors, semantic segmentors, and object detectors, a prediction counts as correct (a True Positive) if the predicted class equals the ground truth one and the prediction's IoU is above a certain threshold (often, a threshold of 0.5 is used).
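The correctness rule described above can be sketched for axis-aligned bounding boxes. This is a simplified illustration: it compares each prediction against a single ground truth object, whereas real evaluation pipelines first match predictions to ground truth objects. The function names, the box format `(x_min, y_min, x_max, y_max)`, and the sample boxes are all assumptions of this sketch.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """Correct class AND IoU above the threshold."""
    return pred['label'] == truth['label'] and iou(pred['box'], truth['box']) > threshold


# One ground truth object and three hypothetical predictions
truth = {'label': 'cat', 'box': (0, 0, 10, 10)}
predictions = [
    {'label': 'cat', 'box': (0, 0, 10, 8)},    # right class, IoU = 0.8
    {'label': 'cat', 'box': (5, 5, 15, 15)},   # right class, IoU is too low
    {'label': 'dog', 'box': (0, 0, 10, 10)},   # perfect box, wrong class
]

tp = sum(1 for p in predictions if is_true_positive(p, truth))
fp = len(predictions) - tp
print('Precision: %.3f' % (tp / (tp + fp)))
```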

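```
import tensorflow as tf
# Creating the Keras Precision metric with its default arguments
precision = tf.keras.metrics.Precision(
    thresholds=None, top_k=None, class_id=None, name=None, dtype=None)
```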

```
# Installing the library first (shell or notebook cell): pip install torchmetrics
# Importing the libraries
import torch
from torchmetrics import Precision
# Initializing the input tensors
preds = torch.tensor([2, 0, 2, 1])
target = torch.tensor([1, 1, 2, 0])
# Note: recent torchmetrics releases require the `task` argument
# Calculating and printing the result with the macro average
# 'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
precision_macro = Precision(task='multiclass', average='macro', num_classes=3)
print(precision_macro(preds, target))
# Calculating and printing the result with the micro average
# 'micro': Calculate the metric globally, across all samples and classes.
precision_micro = Precision(task='multiclass', average='micro', num_classes=3)
print(precision_micro(preds, target))
```


© 2010-2024 CloudFactory Limited. All rights reserved.