Mosaic

If you have ever worked on a Computer Vision project, you probably know that using augmentations to diversify the dataset is a common best practice. On this page, we will:

Cover the Mosaic augmentation;
Check out its parameters;
See how Mosaic affects an image;
And check out how to work with Mosaic in Python.

Let's get into it!

Mosaic augmentation explained

Mosaic data augmentation combines 4 training images into one in random proportions. The algorithm is as follows:

Take 4 images from the train set;
Resize them to the same size;
Integrate the images into a 2x2 grid;
Crop a random image patch from the center. This will be the final augmented image.
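
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation of any particular library: the names mosaic, resize_nearest, tile_size, and out_size are hypothetical, the resize uses simple nearest-neighbor sampling, and the crop position is assumed to be sampled uniformly inside the grid.

```python
import numpy as np


def resize_nearest(image, height, width):
    """Resize an H x W x C image with nearest-neighbor sampling (NumPy only)."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows][:, cols]


def mosaic(images, tile_size=320, out_size=320, rng=None):
    """Combine 4 images into a 2x2 grid and crop a random out_size patch."""
    if len(images) != 4:
        raise ValueError("mosaic expects exactly 4 images")
    if not 0 < out_size <= 2 * tile_size:
        raise ValueError("out_size must fit inside the 2x2 grid")
    if rng is None:
        rng = np.random.default_rng()

    # Step 2: bring every image to the same size.
    tiles = [resize_nearest(img, tile_size, tile_size) for img in images]

    # Step 3: place the tiles in a 2x2 grid.
    top = np.concatenate(tiles[:2], axis=1)
    bottom = np.concatenate(tiles[2:], axis=1)
    grid = np.concatenate([top, bottom], axis=0)

    # Step 4: crop a random patch that stays inside the grid
    # (assumption: the offset is sampled uniformly).
    max_offset = 2 * tile_size - out_size
    y0 = int(rng.integers(0, max_offset + 1))
    x0 = int(rng.integers(0, max_offset + 1))
    return grid[y0:y0 + out_size, x0:x0 + out_size]


# Step 1: take 4 images (random arrays stand in for training images here).
rng = np.random.default_rng(0)
sizes = [(100, 150), (200, 120), (64, 64), (300, 400)]
images = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8) for h, w in sizes]
augmented = mosaic(images, rng=rng)
print(augmented.shape)  # (320, 320, 3)
```

With out_size equal to tile_size, as in the defaults here, most crops span several of the four tiles, which is what mixes the contexts.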

Figure: Starting Mosaic grid of 4 images. Small boxes represent annotations, and the red square represents the random crop.

Figure: The final augmented image produced by the Mosaic transformation.
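
Because the annotations travel with the images, the boxes have to follow the crop as well. The sketch below is one plausible way to do this, not a confirmed implementation: it assumes boxes are already expressed in grid coordinates as [x_min, y_min, x_max, y_max], and the helper name crop_boxes and the min_size threshold are hypothetical.

```python
import numpy as np


def crop_boxes(boxes, x0, y0, crop_w, crop_h, min_size=1.0):
    """Shift [x_min, y_min, x_max, y_max] boxes into crop coordinates and clip them."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    # Move the origin to the top-left corner of the crop.
    shifted = boxes - np.array([x0, y0, x0, y0], dtype=float)
    # Clip every box to the crop rectangle.
    shifted[:, [0, 2]] = np.clip(shifted[:, [0, 2]], 0, crop_w)
    shifted[:, [1, 3]] = np.clip(shifted[:, [1, 3]], 0, crop_h)
    # Drop boxes that ended up (almost) entirely outside the crop.
    widths = shifted[:, 2] - shifted[:, 0]
    heights = shifted[:, 3] - shifted[:, 1]
    keep = (widths >= min_size) & (heights >= min_size)
    return shifted[keep]


grid_boxes = [[100, 100, 200, 200], [0, 0, 50, 50], [300, 300, 400, 420]]
result = crop_boxes(grid_boxes, x0=150, y0=150, crop_w=320, crop_h=320)
print(result.tolist())  # [[0.0, 0.0, 50.0, 50.0], [150.0, 150.0, 250.0, 270.0]]
```

The second box lies entirely outside the crop, so it is removed. The first one is only partly visible, so it is clipped to the crop border.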

Mosaic augmentation teaches the model to recognize objects in different locations without relying too much on one specific context. This can improve the model’s performance by making it more robust to the surroundings of the objects.

Using Mosaic can be really helpful if objects in your dataset:

appear in different contexts;
might have different locations in the image.

Images of people, animals, electronic components, satellite imagery, and many others might be good examples.

However, using Mosaic might not yield significant improvement if your dataset consists of:

Textual documents;
Large objects in the foreground;
Objects that have a fixed location in the image (for example, a dataset in which a medical test tube is always in the same place in every image).

Mosaic augmentation parameters

Alpha - the fractional jitter applied to the center coordinate of the mosaic. If alpha is 0, the center of the mosaic is exactly in the center of the image. If alpha is 1, the center can be any pixel in the image.

python

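import numpy as np

# The formula for Alpha: integer bounds, with +1 so that alpha = 0 gives a non-empty range
xc = np.random.randint(int(x_center - x_center * self.alpha), int(x_center + x_center * self.alpha) + 1)
yc = np.random.randint(int(y_center - y_center * self.alpha), int(y_center + y_center * self.alpha) + 1)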
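
The formula above depends on self.alpha, x_center, and y_center from its surrounding class, so it cannot be run on its own. A standalone sketch of the same idea might look like the following. The function name sample_mosaic_center and its arguments are hypothetical, and the upper bound is clamped to the last pixel so that alpha = 1 covers the whole image without leaving it.

```python
import numpy as np


def sample_mosaic_center(width, height, alpha, rng=None):
    """Sample the mosaic center (xc, yc) with alpha jitter around the image center."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if rng is None:
        rng = np.random.default_rng()

    x_center, y_center = width // 2, height // 2
    # Maximum jitter in pixels along each axis.
    dx, dy = int(x_center * alpha), int(y_center * alpha)

    # rng.integers excludes the upper bound, hence the +1.
    x_high = min(x_center + dx, width - 1)
    y_high = min(y_center + dy, height - 1)
    xc = int(rng.integers(x_center - dx, x_high + 1))
    yc = int(rng.integers(y_center - dy, y_high + 1))
    return xc, yc


print(sample_mosaic_center(640, 480, alpha=0.0))  # (320, 240)
```

With alpha = 0.5 and a 640x480 image, xc falls between 160 and 480 and yc between 120 and 360.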