Instance Segmentation
Computer Vision (CV) is a scientific field that studies software systems trained to extract information from visual data, analyze it, and draw conclusions from the analysis. The field is made up of so-called CV, or vision AI, tasks. Each task has its own techniques and heuristics for acquiring, processing, analyzing, and understanding the data and for extracting details from it. One of these tasks is Instance Segmentation. On this page, we will:
- Understand the basics of the Image Segmentation field in Machine Learning;
- Cover in-depth the Instance Segmentation vision AI task;
- See how Instance Segmentation compares with Semantic and Panoptic Segmentation;
- Research the real-life applications of Instance Segmentation;
- Cover some popular Instance Segmentation datasets and SOTA results on them;
- See features that CloudFactory offers for streamlining an Instance Segmentation task.
Let’s jump in.
What is Image Segmentation in Machine Learning?
It is essential to start with the bigger picture, so let's first define Image Segmentation itself.
Segmentation is a well-known term in business and marketing. In short, it is the process of splitting customers (or a whole market) into separate groups based on specific patterns in their behavior. Fortunately, this definition is close to what we mean by Image Segmentation in ML.
Image Segmentation in Machine Learning is a part of the vision AI field that covers different methods of dividing visual data (for example, an image) into segments, where each segment groups pixels that carry similar, meaningful information and share the same class label.
Today, corporate Data Science regularly solves Image Segmentation challenges in various spheres. At CloudFactory, we have seen demand for high-quality Segmentation solutions grow rapidly over the past couple of years. The same is true of demand for Data Scientists who specialize in the Image Segmentation field. As a result, the industry keeps developing and growing, bringing new SOTAs, solution techniques, and challenges.
The Image Segmentation field is commonly divided into three vision AI tasks. These are:
- Semantic Segmentation;
- Instance Segmentation;
- Panoptic Segmentation.
Let’s take a closer look at the Instance Segmentation vision AI task.
What is Instance Segmentation in Machine Learning?
Moving from the bigger picture to the details, let’s see what Instance Segmentation is.
Instance Segmentation is a Computer Vision task that combines the ideas behind the Object Detection and Semantic Segmentation tasks to produce a more holistic output. As you might know, Object Detection focuses on localizing an object in an image (via a bounding box) and classifying it. Semantic Segmentation, on the other hand, classifies every pixel in a picture to create a pixel-perfect segmentation map of the image.
Combining the two gives you Instance Segmentation, which detects every distinct object in an image, classifies it, and produces a pixel-perfect segmentation mask for it. Here is a simple example of what an Instance Segmentation model output might look like:
The data annotation process for an Instance Segmentation task is as follows:
- You predefine some instance target classes (for example, ‘person’ and ‘bicycle’);
- You create a segmentation mask for each distinct instance of the target classes.
To summarize, Instance Segmentation receives an image and some instance target classes as input. As an output, you get a bounding box, a segmentation mask, and the class for each instance of target classes.
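To make that output format more concrete, here is a minimal sketch in plain Python. The `Instance` container is hypothetical and does not come from any particular library, and nested lists stand in for the arrays a real pipeline would use. It shows that two objects of the same class keep separate masks, and that a bounding box can be derived from each mask.

```python
from dataclasses import dataclass
from typing import List, Tuple

Mask = List[List[int]]  # binary mask: 1 = pixel belongs to the instance


@dataclass
class Instance:
    """One predicted object (hypothetical container, not a library type)."""
    label: str
    mask: Mask

    @property
    def bbox(self) -> Tuple[int, int, int, int]:
        """Tightest box around a non-empty mask: (x_min, y_min, x_max, y_max), inclusive."""
        rows = [y for y, row in enumerate(self.mask) if any(row)]
        cols = [x for row in self.mask for x, value in enumerate(row) if value]
        return min(cols), min(rows), max(cols), max(rows)


# Two people in a 4x6 image: same class, separate masks.
person_a = Instance("person", [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])
person_b = Instance("person", [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 0],
])
prediction = [person_a, person_b]

for inst in prediction:
    print(inst.label, inst.bbox)
```

Real models usually also attach a confidence score to every instance; it is left out here to keep the sketch short.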
Semantic Segmentation Vs. Instance Segmentation
Although Instance and Semantic Segmentation live in the same Image Segmentation field and share some similarities, they are still different vision AI tasks. Keep this in mind when identifying which task you aim to solve.
Semantic Segmentation classifies every pixel in an image, so all instances of the same category share the same class label. As a result, you get a segmentation map.
On the other hand, Instance Segmentation ensures that all the objects of the same class are viewed as distinct instances. So, from an Instance Segmentation model, you get a bounding box and a segmentation mask for each object.
If you are unsure whether to use Instance or Semantic Segmentation as your vision AI task, consider the following question: do you want your model to be able to count distinct objects? If so, label data for an Instance Segmentation task. It is generally easy to use labels created for Instance Segmentation to solve a Semantic Segmentation task. However, the other way around is tricky. Please think carefully before rushing into the annotation process.
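The sketch below illustrates why going from Instance to Semantic labels is the easy direction. It assumes annotations are stored as (class name, binary mask) pairs and uses a made-up class-to-id mapping; both are illustrative choices, not a required format.

```python
from collections import Counter

# Hypothetical instance annotations for a 2x4 image: (class name, binary mask).
instances = [
    ("person", [[1, 1, 0, 0], [1, 1, 0, 0]]),
    ("person", [[0, 0, 0, 1], [0, 0, 0, 1]]),
    ("bicycle", [[0, 0, 1, 0], [0, 0, 0, 0]]),
]
class_ids = {"background": 0, "person": 1, "bicycle": 2}  # assumed mapping


def to_semantic_map(instances, class_ids, height, width):
    """Paint every instance mask with its class id; instance identity is dropped."""
    semantic = [[class_ids["background"]] * width for _ in range(height)]
    for label, mask in instances:
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    semantic[y][x] = class_ids[label]
    return semantic


semantic_map = to_semantic_map(instances, class_ids, height=2, width=4)

# Counting objects only works on the instance labels.
counts = Counter(label for label, _ in instances)
print(semantic_map)
print(counts)
```

Once the masks are merged, the semantic map no longer records which 'person' pixels belong to which person, so recovering instances from it requires extra assumptions, such as treating every connected region as one object, and that breaks down when objects touch or overlap.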
Panoptic Segmentation Vs. Instance Segmentation
Although Panoptic and Instance Segmentation live in the same Image Segmentation field and share many similarities, they are still different vision AI tasks. Keep this in mind when identifying which task you aim to solve.
Instance Segmentation ensures that all the objects of the same class are viewed as distinct instances. So, from an Instance Segmentation model, you get a bounding box and a segmentation mask for each object.
On the other hand, Panoptic Segmentation is a combination of Semantic and Instance Segmentation. From the Instance Segmentation side, you get the number of objects for every countable target class, along with a bounding box and a segmentation mask for each object. From the Semantic Segmentation side, you also get a segmentation map of the whole image, so you know the target class of each pixel. As a result, Panoptic Segmentation provides a more holistic understanding of a scene.
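One way to picture this combination is a grid in which every pixel carries both a class id and an instance id. The sketch below assumes that representation, with instance id 0 reserved for regions that are not countable objects. The function name, class ids, and encoding are illustrative assumptions rather than a standard API.

```python
def to_panoptic(semantic_map, instance_masks):
    """Merge a semantic map with per-instance binary masks.

    Returns a grid of (class_id, instance_id) pairs. instance_id 0 means
    "no instance" (an uncountable region); countable objects get ids from 1.
    """
    panoptic = [[(class_id, 0) for class_id in row] for row in semantic_map]
    for instance_id, mask in enumerate(instance_masks, start=1):
        for y, row in enumerate(mask):
            for x, inside in enumerate(row):
                if inside:
                    panoptic[y][x] = (semantic_map[y][x], instance_id)
    return panoptic


# Assumed class ids: 0 = road (uncountable), 1 = car (countable).
semantic_map = [
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1],
]
car_masks = [
    [[0, 1, 1, 0, 0], [0, 1, 1, 0, 0]],
    [[0, 0, 0, 0, 1], [0, 0, 0, 0, 1]],
]

panoptic = to_panoptic(semantic_map, car_masks)
car_count = len({iid for row in panoptic for _, iid in row if iid})
print(panoptic[0])
print(car_count)
```

Every pixel still has a class, as in Semantic Segmentation, while the car pixels can additionally be grouped and counted per object, as in Instance Segmentation.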
Instance Segmentation Vs. Panoptic Segmentation Vs. Semantic Segmentation
Task | Goal | Input | Output |
Instance Segmentation | Find every distinct object of target classes | An image and some instance target classes | A bounding box and a segmentation mask for each instance of target classes |
Semantic Segmentation | Classify every pixel in an image | An image and some semantic target classes | Pixel-perfect segmentation map of the whole image |
Panoptic Segmentation | Combine Instance and Semantic Segmentation | An image and some instance and semantic target classes | Pixel-perfect segmentation map of the whole image, plus a bounding box and a segmentation mask for each instance of target classes |
Instance Segmentation real-life applications
Nowadays, Image Segmentation is widely used across various industries. In our experience, Instance Segmentation is more popular than Panoptic and Semantic Segmentation. Still, this does not mean that Instance Segmentation dominates the market, as there is always room for other tasks and approaches.
Here are some noteworthy Instance Segmentation applications:
- Autonomous driving systems (for example, segmenting the camera's visual input to identify pedestrians on the road, vehicles, signs, etc.);
- Medical data processing (for example, segmenting MRI images to find tissues, tumors, and anomalies and to measure characteristics such as area, dynamics, etc.);
- Aerial, satellite, and UAV image processing (for example, segmentation of a landscape);
- Video surveillance systems;
- And many other use cases.
Instance Segmentation datasets
When it comes to Instance Segmentation datasets and SOTA solutions, it is worth mentioning that many Image Segmentation datasets include Instance Segmentation annotations in their labels. Moreover, for most of them, Instance Segmentation is the primary focus.
This means that there are many trustworthy benchmark datasets regularly used to evaluate recent Instance Segmentation model architectures and approaches. The most popular ones are:
- COCO;
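For a feel of what such annotations look like, here is a hedged sketch of a single record in the COCO instance annotation style, where `segmentation` holds flat polygon coordinates and `bbox` is stored as [x, y, width, height]. The ids and coordinates below are invented for illustration, `polygon_bbox` is a hypothetical helper, and the `area` field of the real format is omitted.

```python
import json

# Illustrative record; the ids and coordinates are invented for this sketch.
annotation = {
    "id": 1,
    "image_id": 42,
    "category_id": 1,
    # One polygon as a flat list: [x1, y1, x2, y2, ...]
    "segmentation": [[10.0, 10.0, 50.0, 10.0, 50.0, 40.0, 10.0, 40.0]],
    "iscrowd": 0,
}


def polygon_bbox(flat_polygon):
    """Return [x, y, width, height] for a flat [x1, y1, x2, y2, ...] polygon."""
    xs, ys = flat_polygon[0::2], flat_polygon[1::2]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


annotation["bbox"] = polygon_bbox(annotation["segmentation"][0])
print(json.dumps(annotation, indent=2))
```

An object made of several disconnected parts would list several polygons under `segmentation`, in which case the box should cover all of them.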
How do we solve an Instance Segmentation task in CloudFactory?
Over its years in the industry, CloudFactory's IT team has developed many internal instruments that our cloudworkers and Data Scientists use when working on client cases.
Let’s go through the available options step-by-step. To streamline the Instance Segmentation annotation experience, CloudFactory's internal data labeling tool supports:
- Manual annotation tools such as Polygon and Brush. Also, there is an option to convert polygon to mask and vice versa in a single click;
- Semi-automated annotation tools such as ATOM powered by SAM and Box to instance;
- AI-powered Instance Segmentation assistant.
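To give an idea of what a polygon-to-mask conversion involves, here is a minimal sketch based on the even-odd (ray casting) rule applied to pixel centers. It is not CloudFactory's implementation, which is not described here, and the function name is hypothetical. Production tools typically rely on optimized rasterizers instead.

```python
def polygon_to_mask(polygon, height, width):
    """Rasterize a polygon [(x, y), ...] into a binary mask.

    A pixel is set when its center (x + 0.5, y + 0.5) lies inside the
    polygon according to the even-odd (ray casting) rule.
    """
    mask = [[0] * width for _ in range(height)]
    n = len(polygon)
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5
            inside = False
            for i in range(n):
                (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
                # Does this edge cross the horizontal ray cast to the right
                # of the pixel center?
                if (y1 > py) != (y2 > py):
                    x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                    if px < x_cross:
                        inside = not inside
            mask[y][x] = int(inside)
    return mask


square = [(1, 1), (4, 1), (4, 3), (1, 3)]
mask = polygon_to_mask(square, height=4, width=6)
for row in mask:
    print(row)
```

The opposite direction, mask to polygon, needs contour tracing and usually some simplification of the resulting outline, so it is not shown here.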
As for the annotation quality control process, CloudFactory has you covered with its AI Consensus Scoring (AI CS) feature, which has a separate Instance Segmentation review option. With the help of AI CS, you can find missing labels, extra labels, and other artefacts. You will also better understand how a machine sees your data, which might be valuable for your annotation strategy.
Regarding model development, CloudFactory's internal model-building tool supports many modern neural network architectures. For Instance Segmentation, these include:
- Mask R-CNN;
- HybridTaskCascade;
- CascadeMaskRCNN;
- FBNetV3.
As a backbone for these architectures, CloudFactory offers:
- ResNet with adjustable depth.
As a Machine Learning metric for the Instance Segmentation case, CloudFactory implements mask mean Average Precision (mask mAP).
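As a rough illustration of what sits behind mask mAP, the sketch below computes the IoU between binary masks and a simple, non-interpolated Average Precision for one class at a single IoU threshold. This is an assumption-laden simplification: the exact variant CloudFactory implements is not specified here, and COCO-style evaluation, for example, averages over several IoU thresholds and uses an interpolated precision-recall curve. Both function names are hypothetical.

```python
def mask_iou(a, b):
    """Intersection over union of two equally sized binary (0/1) masks."""
    pairs = [(p, q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b)]
    intersection = sum(1 for p, q in pairs if p and q)
    union = sum(1 for p, q in pairs if p or q)
    return intersection / union if union else 0.0


def average_precision(predictions, ground_truths, iou_threshold=0.5):
    """Non-interpolated AP for one class; predictions are (score, mask) pairs."""
    matched = set()
    true_positives = 0
    precision_sum = 0.0
    ranked = sorted(predictions, key=lambda pred: pred[0], reverse=True)
    for rank, (_, mask) in enumerate(ranked, start=1):
        # Find the best ground-truth mask that has not been matched yet.
        best_iou, best_idx = 0.0, None
        for idx, gt_mask in enumerate(ground_truths):
            iou = mask_iou(mask, gt_mask)
            if idx not in matched and iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this hit
    return precision_sum / len(ground_truths) if ground_truths else 0.0


ground_truths = [[[1, 1, 0, 0]], [[0, 0, 1, 1]]]
predictions = [
    (0.9, [[1, 1, 0, 0]]),  # exact match with the first object
    (0.8, [[0, 1, 1, 0]]),  # overlaps both objects poorly: false positive
    (0.6, [[0, 0, 1, 1]]),  # exact match with the second object
]
ap = average_precision(predictions, ground_truths)
print(round(ap, 4))
```

The "mean" in mAP then comes from averaging such per-class AP values over all target classes.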
Hasty's YouTube channel features video tutorials on working with various Computer Vision tasks, including one about Instance Segmentation. Check it out to see how the process looks from the developer's side.
As of today, these are the key technical options CloudFactory has for Instance Segmentation cases. If you want a more detailed overview, please check out the further resources or book some time with us to get deeper into CloudFactory with our help.
Further Resources
- Guide on how to choose between Object Detection, Instance, and Semantic Segmentation for your use case;
- Creating an Instance Segmentation split in the Model Playground environment;
- Annotating data for an Instance Segmentation task with CloudFactory capabilities;
- Using, analyzing, and interpreting the AI-powered data Quality Control run for an Instance Segmentation task.