Nuclei segmentation using deep learning: Methodology essentials

Written by:

Elisa Opriessnig and Fanny Dobrenova

State-of-the-art nuclei segmentation techniques using deep learning technology solve many of the problems related to extracting reliable quantitative information at the level of the cell nucleus.

This article sheds light on recent DL-based methods used for nuclei segmentation tasks. Learn about the most common challenges researchers face during tissue analysis, such as over- and under-segmentation, overlapping or touching nuclei and domain shift.

We introduce you to a number of pre-processing, annotation and labeling best practices, intended to help you yield optimal results when implementing nucleus segmentation in digital image analysis workflows. 

Find out about the various applications of algorithm training for nuclear segmentation tasks in the field of histopathology.

How the automated analysis method can help you advance your nucleus segmentation project

Advanced methods for nuclei segmentation using deep learning (DL) have risen in popularity in recent years. These approaches aim at detecting and measuring properties of nuclei in tissue sections based on automated image recognition algorithms.

Being able to correctly detect and segment cell nuclei in microscopy images is an essential task throughout various disciplines such as pathology and histology. Changes in the morphological characteristics of the nuclei in tissue samples are often considered an indication of pathological processes.

The benefits of deep learning approaches for nuclei segmentation
5 Benefits of a DL-based method for nuclei segmentation prepared by KML Vision

We have summed up the core benefits of using a DL-based method in nuclei segmentation tasks for you:

  • Facilitating a faster and more reliable morphological assessment
  • Improving analysis objectivity
  • Improving results reproducibility
  • Increasing efficiency in research and diagnostics

Yet, the choice of the right DL-architecture for your research project should depend primarily on its performance in terms of processing efficiency and segmentation accuracy.

The cell nuclei segmentation workflow

The automated segmentation of cell nuclei involves a number of stages: preprocessing, segmentation, postprocessing and evaluation.

  1. During the preprocessing stage, the quality of the input images is improved to ensure optimal segmentation results.
  2. The objective of the actual segmentation stage is to extract the cell nuclei present in the foreground of the image.
  3. During the postprocessing stage, the segmentation results are optimized. This involves sorting out false findings, i.e. removing objects that are not actual nuclei, such as image artifacts, from the segmentation results, for example by means of morphological filtering. Postprocessing is also applied to refine nuclear boundaries.
  4. At the evaluation stage, the researcher assesses the quality of the quantitative segmentation results based on benchmark metrics. The most common metrics used at this stage are precision, recall and F1-measure (Win et al., 2018).
The cell nucleus segmentation deep learning workflow
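To make the postprocessing stage more concrete, here is a minimal sketch of one simple filtering rule: discarding segmented objects that are too small to be nuclei. It assumes the segmentation result is available as a label map (0 for background, a positive number per object) and uses plain Python lists in place of image arrays. The function name and the `min_area` value are illustrative assumptions, not part of any specific tool.

```python
from collections import Counter


def remove_small_objects(labels, min_area):
    """Set every labeled object smaller than min_area pixels to background (0)."""
    areas = Counter(label for row in labels for label in row if label != 0)
    keep = {label for label, area in areas.items() if area >= min_area}
    return [[label if label in keep else 0 for label in row] for row in labels]


# Toy label map: object 1 covers 6 pixels, object 2 is a 1-pixel artifact.
segmentation = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 2],
    [0, 0, 0, 0, 0],
]
cleaned = remove_small_objects(segmentation, min_area=3)
for row in cleaned:
    print(row)
```

In a real workflow the area threshold would have to be chosen based on the magnification and the expected nucleus size in your dataset.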

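Precision is the share of detected objects that are actual nuclei, i.e. the number of true positives divided by the number of all detections. Recall is the share of actual nuclei that the model detects, i.e. the number of true positives divided by the number of all nuclei in the ground truth. The F1-measure is the harmonic mean of precision and recall and summarizes the object detection capabilities of the network in a single number. All of the above-mentioned metrics take values from 0 to 1. The closer the value is to 1, the greater the predictive power of the model (Fujita & Han, 2020).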
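The three metrics can be computed directly from object-level counts. The sketch below uses the standard definitions, with precision as TP / (TP + FP), recall as TP / (TP + FN) and F1 as their harmonic mean. The counts in the example are made up for illustration.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, F1) from true positive, false positive
    and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical result: 80 nuclei found correctly, 20 false detections, 10 missed.
precision, recall, f1 = detection_scores(tp=80, fp=20, fn=10)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints: precision=0.800 recall=0.889 f1=0.842
```

How a detection is counted as a true positive (for example via a minimum overlap with a ground-truth nucleus) is a separate design decision that this sketch leaves open.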

Deep learning methods for nuclei segmentation: analysis frameworks, challenges and opportunities

A frequent and central goal of tissue analysis is determining the population of individual cell nuclei in microscopy images. Based on the task, an object detection algorithm may be sufficient to localize nuclei instances.

Morphological properties of the nucleus such as shape and texture allow observers to obtain quantitative details at the individual object level. This data may subsequently be utilized to visualize size distributions and detect abnormal patterns.

An essential step in this analysis workflow is the accurate splitting of the image into semantic classes. These classes represent coherent regions such as nuclei (on the intracellular level), connective tissue, epithelial and stromal areas within a histology dataset.

Each semantic class needs to be further processed by identifying the boundaries of every single nucleus i.e. performing boundary detection. Thus, three classes of pixel types are identified during the nuclear segmentation process: background, nuclei interior and nuclear boundaries (Caicedo et al., 2019).
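As an illustration of the three pixel classes, the following sketch derives such a training target from an annotated label map in which each nucleus carries its own number. It is a simplified assumption of how this could be done with plain Python lists, where a nucleus pixel counts as boundary if one of its four direct neighbors has a different label. The names are hypothetical.

```python
BACKGROUND, INTERIOR, BOUNDARY = 0, 1, 2


def three_class_target(labels):
    """Convert a label map (0 = background, k > 0 = nucleus k) into a
    per-pixel class map: background, nucleus interior, nuclear boundary."""
    height, width = len(labels), len(labels[0])
    target = [[BACKGROUND] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            label = labels[y][x]
            if label == 0:
                continue
            on_boundary = False
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < height and 0 <= nx < width
                # A neighbor with another label (background or a different
                # nucleus) makes this pixel part of the boundary.
                if inside and labels[ny][nx] != label:
                    on_boundary = True
                    break
            target[y][x] = BOUNDARY if on_boundary else INTERIOR
    return target


labels = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in three_class_target(labels):
    print(row)
```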

Tips and tricks: Using state-of-the-art deep learning techniques allows you to detect individual instances of a particular type of nuclei.

Traditional nucleus segmentation methods

The most common traditional approaches for nuclei segmentation analysis are thresholding and seeded watershed.

Thresholding involves the use of threshold intensity values in grayscale histograms of microscopy images in order to filter objects of interest. Intensity thresholding is thus well-suited for separating nuclei in the foreground from the background (Win et al., 2018; Caicedo et al., 2019).
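One widely used way to pick such a threshold automatically is Otsu's method, which selects the gray level that best separates the histogram into two groups. The sketch below is a minimal version for 8-bit intensities given as a flat list. It is meant as an illustration and not as the procedure used in the cited studies, and the pixel values are invented.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes the between-class variance."""
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for threshold in range(levels):
        weight_low += histogram[threshold]  # pixels with value <= threshold
        sum_low += threshold * histogram[threshold]
        weight_high = total - weight_low
        if weight_low == 0:
            continue
        if weight_high == 0:
            break
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_threshold, best_variance = threshold, variance
    return best_threshold


# Hypothetical grayscale values: dark nuclei on a bright background.
pixels = [30, 35, 40, 32, 200, 210, 205, 220, 215, 198]
threshold = otsu_threshold(pixels)
nuclei_mask = [value <= threshold for value in pixels]
print(threshold, nuclei_mask)
```

Whether nuclei are darker or brighter than the background depends on the imaging modality, so the comparison in the last step may need to be inverted, for example for fluorescence images.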

The classical seeded watershed method partitions microscopy images along watershed lines that follow the boundaries of morphological structures such as nuclei. It starts from the approximate locations of the nuclei, referred to as seeds, and relies on region growing to expand each seed until the boundaries of the nucleus are reached (Atta-Fosu et al., 2016).
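The region-growing idea behind seeded watershed can be sketched in a few lines. The example below floods a small "elevation" map from two seeds, always expanding the lowest pixel first, so that the two regions meet at the ridge between them. It assumes the seeds are already known and uses an invented elevation map. Real implementations add masking and more careful tie handling.

```python
import heapq


def seeded_watershed(elevation, seeds):
    """Grow labeled seeds over a 2D elevation map, lowest pixels first.

    elevation: 2D list of numbers (e.g. gradient magnitude of the image).
    seeds: dict mapping a label (int > 0) to a (row, column) position.
    """
    height, width = len(elevation), len(elevation[0])
    labels = [[0] * width for _ in range(height)]
    queue = []
    for label, (y, x) in seeds.items():
        labels[y][x] = label
        heapq.heappush(queue, (elevation[y][x], y, x))
    while queue:
        _, y, x = heapq.heappop(queue)
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width and labels[ny][nx] == 0:
                # An unlabeled neighbor joins the region it was reached from.
                labels[ny][nx] = labels[y][x]
                heapq.heappush(queue, (elevation[ny][nx], ny, nx))
    return labels


# Two basins (two touching nuclei) separated by a ridge in column 3.
elevation = [[1, 2, 5, 9, 5, 2, 1] for _ in range(3)]
result = seeded_watershed(elevation, seeds={1: (1, 0), 2: (1, 6)})
for row in result:
    print(row)
```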

However, accurate and robust nuclear instance segmentation using traditional methods can be challenging due to morphological variations in the nuclei among different organs and tissue types. Moreover, spatial configurations (e.g. clusters, touching nuclei), tissue preparation specifics or complex backgrounds containing imaging artefacts (e.g. folds, out-of-focus regions) significantly limit the applicability of such methods (Zhou et al., 2019).

At the same time, innovative deep learning-based methods have been on the rise in recent years and have proven to be more efficient, accurate and robust in segmenting cell nuclei across various tissue types (Jang et al., 2019). We provide a brief overview of the most common DL-based approaches used for nuclei segmentation.

Deep learning model architectures for nucleus segmentation

Deep neural network architectures apply semantic and instance segmentation in order to identify and locate nuclei in microscopy images (Naylor et al., 2017).

Most common DL-based approaches for nuclei segmentation:

  • Semantic segmentation: associates every pixel of an image with a class label
  • Instance segmentation: treats multiple objects of the same class as distinct individual instances

Semantic segmentation involves assigning areas of the image to different semantic classes, such as nuclei area, cytoplasm area, nuclei edge or boundary area and background area. The task of semantic segmentation DL-algorithms is to detect the presence of nuclei in the foreground of the digital slide and subsequently to determine the boundary of each nucleus by segmenting the connected foreground area (Cui et al., 2019; Kowal et al., 2020).
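Segmenting the connected foreground area can be illustrated with connected-component labeling: every group of adjacent foreground pixels in a binary nuclei mask receives its own number. The sketch below is a minimal breadth-first version on plain Python lists and uses an invented mask. Note that touching nuclei end up in the same component, which is one reason why a separate boundary class is helpful.

```python
from collections import deque


def label_connected_components(mask):
    """Give each 4-connected foreground region of a binary mask its own label."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Binary "nuclei" mask with two separate foreground regions.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
labels, count = label_connected_components(mask)
print(count)
for row in labels:
    print(row)
```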

Instance segmentation involves two stages: object identification and semantic segmentation.

First, the algorithm detects objects of interest and creates bounding boxes around them. Second, the model produces a pixel-wise mask for each individual object, i.e. it performs semantic segmentation on each bounding box. Instance segmentation offers the advantage of providing additional information on rich morphological features of cellular structures (Zhou et al., 2019).

DL-algorithms classify and segment objects in the images by learning features from large amounts of representative input data (Hayakawa et al., 2019). Various types of DL-architectures have been used for segmenting biomedical images, yet they all differ in layer configuration and model depth (Kowal et al., 2020). 

Other DL-architectures used for nuclei segmentation:

  • Convolutional neural networks (CNN)
  • Fully convolutional neural networks (FCN)
  • U-Net
  • Mask R-CNN

Tips and tricks: When selecting the right model for your nucleus segmentation project, you should choose one that is robust and flexible enough to adapt to different samples, research designs and segmentation tasks.

Preprocessing, annotation and labeling strategies

Prior to conducting the actual quantitative segmentation analysis, a preparation stage is necessary to get the image data ready for algorithm training. Different strategies for the preprocessing, annotation and labeling of the image dataset have been proposed in the existing literature. We present several proven strategies that can help you extract the most accurate nucleus segmentation data with trained algorithms.

7 Training Techniques for Nuclei Segmentation prepared by KML Vision

Data preparation techniques for nuclei segmentation tasks

Training data selection and preparation

DL-algorithms tend to be effective in segmenting nuclei even when trained on a small dataset, when the task permits it. Yet, providing a variety of training images improves the predictive power of the algorithm. DL-algorithms can be used across different experimental settings as long as the relevant variations of the morphological structures of interest have been introduced to the algorithm beforehand (Caicedo et al., 2019). Otherwise, a “domain shift” can occur: a model that was trained in one specific setting has to perform in a different one, which limits its applicability (Liu et al., 2020; Hsu et al., 2021).

The careful selection of input images is also recommended in order to balance the dataset and reduce the number of images that contain background only or are of poor quality (Araujo et al., 2019).

Image data augmentation

Data transformation techniques in the spatial or color domain may effectively be used as part of a data augmentation pipeline. Augmentation refers to the artificial extension of a dataset to increase the amount and variance of training data so that it best represents the expected target domain. Data augmentation is amongst the most important steps in preprocessing and building a robust model for production use.

Common augmentation strategies involve random or parameterized spatial transformations such as flipping, rotation or deformation, frequently combined with adding noise or color transformations (Allehaibi et al., 2019). A nuclei segmentation model generally benefits from these operations, because when using meaningful data augmentation, the total amount of manually annotated and labeled data required for training can be kept at minimum, facilitating faster model development.
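A minimal sketch of spatial augmentation is shown below. The key point is that the image and its annotation mask must receive exactly the same random transformation, otherwise the labels no longer match the pixels. The sketch uses plain Python lists and hypothetical function names. It covers only flips and 90-degree rotations, while deformation, noise and color transformations are left out.

```python
import random


def flip_horizontal(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def rotate_90(grid):
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def augment_pair(image, mask, rng):
    """Apply the same random flip and rotation to an image and its mask."""
    if rng.random() < 0.5:
        image, mask = flip_horizontal(image), flip_horizontal(mask)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        image, mask = rotate_90(image), rotate_90(mask)
    return image, mask


# Toy image with a matching binary mask (1 where the pixel is bright).
image = [[0, 50, 200], [10, 220, 30], [240, 5, 15]]
mask = [[int(value > 100) for value in row] for row in image]

rng = random.Random(42)  # fixed seed so the example is repeatable
augmented_image, augmented_mask = augment_pair(image, mask, rng)
for image_row, mask_row in zip(augmented_image, augmented_mask):
    print(image_row, mask_row)
```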

Color processing and normalization

During the segmentation of stained virtual slides, color information serves as an important indicator for detecting objects of interest. However, differences in staining or in the acquisition devices used for the whole-slide images may make it difficult for DL-algorithms to properly process the data.

Different types of stains are used to mark various objects of interest within histopathological images. Stains such as hematoxylin and eosin are used to mark those objects with a particular color. Thus hematoxylin is often used to mark cell nuclei and paint them blue. Eosin stains are normally absorbed in the cytoplasm marking it reddish in color.

Certain color variations in the markings may occur as a result of using different staining protocols, stain brands or microscope and scanning devices. This may affect the performance of the deep-learning network, especially if the color hues of the training input images differ from those of the images to be processed (Kowal et al., 2020).

Therefore, you should aim to ensure the homogeneity of input data in terms of color variation. You can do this by transforming the images into an illumination-resistant color space and normalizing the individual color space components. The following color normalization techniques will help you achieve the desired outcome: histogram matching, color transfer and spectral matching (Hayakawa et al., 2019; Cui et al., 2019; Mahmood et al., 2019; Kowal et al., 2020).
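Of the techniques listed, histogram matching is the easiest to sketch. The idea is to remap the intensities of one color channel so that their distribution follows that of a reference image. The version below works on flat lists of values and maps each source value to the reference value at the same quantile. It is a simplified illustration with invented numbers, to be applied per channel.

```python
from bisect import bisect_right


def match_histogram(source, reference):
    """Map source values so that their distribution follows the reference.

    Both arguments are flat lists of intensities from one color channel.
    """
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n_src, n_ref = len(src_sorted), len(ref_sorted)
    mapping = {}
    for value in set(source):
        # Number of source pixels with an intensity <= value (its CDF rank).
        rank = bisect_right(src_sorted, value)
        # Reference value at the same quantile; integer form of
        # ceil(rank * n_ref / n_src) - 1.
        index = (rank * n_ref + n_src - 1) // n_src - 1
        mapping[value] = ref_sorted[index]
    return [mapping[value] for value in source]


# Hypothetical intensities of the same channel from two different scanners.
source_channel = [10, 20, 30, 40]
reference_channel = [100, 110, 120, 130]
print(match_histogram(source_channel, reference_channel))
# prints: [100, 110, 120, 130]
```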

Annotation and labeling strategies 

The terms “annotation” and “labeling” are often used interchangeably and usually mean the same thing: adding a semantic meaning to a specific, spatially constrained image region. In this article, annotating refers to the process of marking image regions manually using geometric shapes, for instance a polygon, and labeling refers to the process of adding a specific term or meaning to the respective geometry.

The broad choice of annotation tool shapes available in the IKOSA Platform is there to help you outline even the most intricate morphological features.

End-to-end deep-learning-based segmentation models essentially learn to create semantic masks in the image, where each pixel is assigned to a specific semantic class. Hence, a pixel mask is also needed as a learning target for the system. While a pixel-wise annotation or “dense annotation” method requires a lot of manual effort (Qu et al., 2020) for the accurate outlining of individual objects, there are a number of annotation techniques discussed in literature to make the algorithm training process more time-efficient.

Partial annotation, or “weak annotation”, is a technique that relies on annotating only a small number of objects in the training dataset. This strategy is also referred to as “sparse labeling.” Usually, this technique does not directly result in the desired masks and requires additional preprocessing steps to subsequently generate pixel-level annotations.

While the generated masks may not perfectly delineate the nuclei, this approach can help you save time, especially when you are creating the initial basic draft of your algorithm. Moreover, algorithms trained on partially annotated image datasets have reportedly shown similar performance results when compared to approaches involving a dense pixel-level annotation (Qu et al., 2020; Bruch et al., 2020; Ho et al., 2021).

Point annotations are probably the most time-saving method of manual image annotation. Recent work has shown that training nuclei segmentation algorithms only by marking a single location of the nucleus without actually delineating the exact boundaries is possible, which significantly reduces the effort of annotation (Yoo et al., 2019).
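To give an idea of how point annotations can be turned into coarse pixel-level regions, the sketch below assigns every pixel to its nearest annotated point, which yields a Voronoi-style partition with one cell per nucleus. This is only one possible preprocessing idea and an assumption for illustration. This article does not describe the exact procedures of the cited studies, and the point coordinates are invented.

```python
def voronoi_labels(height, width, points):
    """Assign each pixel the 1-based index of its nearest annotated point.

    points: list of (row, column) positions, one per annotated nucleus.
    """
    labels = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            nearest = min(
                range(len(points)),
                key=lambda i: (points[i][0] - y) ** 2 + (points[i][1] - x) ** 2,
            )
            labels[y][x] = nearest + 1
    return labels


# Two hypothetical point annotations on a 3 x 8 image tile.
points = [(1, 1), (1, 6)]
for row in voronoi_labels(3, 8, points):
    print(row)
```

The cell borders can then serve as a rough separation between neighboring nuclei, while the actual nuclear outlines still have to be learned or refined by other means.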

While fine-tuning your algorithm, carefully made annotations will definitely help you get more accurate predictions. Level up your annotating skills with our special article on how to make annotations in IKOSA.

Tips and Tricks: Another smart way to save time is to annotate certain regions-of-interest (ROIs) instead of the entire images. Learn how to solve ROI-drawing tasks in IKOSA.

Dealing with touching and overlapping nuclei during algorithm development

One object class of particular interest in the analysis of histological images is the nucleus of cells. The detection of overlapping, touching and heavily clustered nuclei is one of the most challenging issues when training automated nuclei segmentation models. The presence of multiple layers of cells e.g. in a pap smear sample often results in an upper layer of cells covering and obscuring the underlying cell layers on the microscopy slide image.

Challenges during tissue analysis:

  • Over- and under-segmentation
  • Overlapping nuclei
  • Touching nuclei
  • Domain shift

Overlapping cell nuclei have intersecting boundaries at points of concavity (Cloppet & Boucher, 2008; Ishrad et al., 2013). Thus, extracting reliable and accurate nuclear boundaries in regions of overlap may be problematic with the majority of solutions currently available on the market.

Overlaps and touching regions in the whole-slide images may result in the inaccurate identification of nuclear boundaries and may cause either under-segmentation, i.e. segmenting overlapping nuclei as one, or over-segmentation, i.e. segmenting one single nucleus as multiple ones (Mahmood et al., 2019; Zhou et al., 2019).
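Both error types can be counted once a ground-truth label map and a predicted label map are available. In the sketch below, a predicted object that overlaps more than one true nucleus counts as a merge (under-segmentation), and a true nucleus that is covered by more than one predicted object counts as a split (over-segmentation). The function name, the `min_overlap` parameter and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def count_merges_and_splits(truth, prediction, min_overlap=1):
    """Return (merges, splits) for two 2D label maps (0 = background)."""
    overlap = defaultdict(int)
    for truth_row, pred_row in zip(truth, prediction):
        for t, p in zip(truth_row, pred_row):
            if t and p:
                overlap[(t, p)] += 1

    truths_per_prediction = defaultdict(set)
    predictions_per_truth = defaultdict(set)
    for (t, p), area in overlap.items():
        if area >= min_overlap:
            truths_per_prediction[p].add(t)
            predictions_per_truth[t].add(p)

    merges = sum(1 for ts in truths_per_prediction.values() if len(ts) > 1)
    splits = sum(1 for ps in predictions_per_truth.values() if len(ps) > 1)
    return merges, splits


# Nuclei 1 and 2 are predicted as one object, nucleus 3 as two objects.
truth = [[1, 1, 0, 2, 2, 0, 3, 3, 3, 3]]
prediction = [[1, 1, 1, 1, 1, 0, 2, 2, 3, 3]]
print(count_merges_and_splits(truth, prediction))
# prints: (1, 1)
```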

Pap smear image ©MIS. See how cell and nuclei boundaries touch and overlap

Several studies have attempted to address this issue using non-DL methods such as watershed transform algorithms (Cloppet & Boucher, 2008) or active shape models (ASM) (Plissiti & Nikou, 2012). DL-based models for splitting touching and overlapping nuclei areas are scarcer and often involve complementing the DL-algorithm with traditional techniques like seeded watershed (Kowal et al., 2020).

Yet, most DL-algorithms for splitting overlapping nuclei used in literature are not completely technically mature and still require additional postprocessing steps like binary map transformation and thresholding (Cui et al., 2019).

Recent advances in the field provide a solution to this common problem. Instance segmentation is often pointed out as a proven approach to splitting touching or overlapping nuclei. Contour-aware instance segmentation networks are especially robust in such cases, as they are particularly sensitive to the spatial and textural dependencies between cell nuclei (Zhou et al., 2019).

Tips and tricks: Use the instance segmentation method to reduce the risk of over- or under-segmentation of the nucleus instances in your dataset.

Summing up, here is a brief checklist to help you improve the outcomes of your analysis:

  • Provide a variety of training images
  • Select representative input images carefully
  • Augment the input images to increase the amount and variance of the training data
  • Perform color transformation to avoid staining-related variations
  • Consider partial annotations or annotating regions-of-interest (ROIs) to save time
  • Consider instance segmentation to split touching or heavily clustered nuclei

The use of deep-learning nuclei segmentation techniques in histopathology image analysis

We take a closer look at the uses of DL-based nucleus segmentation in the field of histopathology. In histopathology studies nuclear segmentation is an essential task, which enables nuclei morphology analysis, cell type classification as well as cancer detection and grading.

One of the biggest challenges that DL-based methods face during the analysis of histopathology image data is the segmentation of abnormal cell structures. Abnormalities in the morphological characteristics of the nuclei, such as size, shape, texture, color, volume or an abnormal nuclear-to-cytoplasmic ratio, serve as indicators of pathological processes detected in histology images (Ishrad et al., 2013; Rączkowski et al., 2019).

A large body of histopathology nuclei segmentation research work sets a thematic focus on two types of nuclei: lymphocyte nuclei and epithelial nuclei, since these two types are often associated with inflammatory processes on a cellular level (Ishrad et al., 2013). Nuclei segmentation studies using histopathological images have already been performed on different types of tissues, the most common ones being breast, cervix and prostate tissue.

Cell nuclei in mouse prostate tissue

Nuclear segmentation on the basis of histopathology image data often poses a challenge to researchers and pathologists due to:

  • color and contrast variations
  • the presence of staining artifacts
  • the morphological variability of nucleus shapes
  • the occurrence of clustered structures and overlapping regions (Ishrad et al., 2013).

Furthermore, non-deep-learning computer vision approaches often fail to correctly segment the nuclei of cancerous or malignant cells, since these algorithms have been previously trained solely on a benign cell sample and lack the ability to adapt to novel data.

A DL-model can alleviate this problem as it automatically extracts features from raw data. Using a DL-based method proves to be more effective, due to the ability of DL-algorithms to autonomously and adaptively learn from novel image data. Furthermore, DL-methods are more robust and showcase a more reliable performance when it comes to object classification, detection, and segmentation. In particular, a better predictive performance can be achieved (Araujo et al., 2019; Wang et al., 2020).

This is how deep learning methods for cell nuclei segmentation work.

In histopathology studies the number of cells undergoing mitosis i.e. cell division is often reported as an indicator of tumor severity. That is why it is essential to precisely track instances of mitotic nuclei. Yet, the segmentation of nuclei undergoing mitosis is also a difficult task due to variability in the morphology and texture of mitotic cells (Wang et al., 2014).

So far, nuclei segmentation studies in histopathology have been conducted on different tissue types: skin tissue / basal cells (Cruz-Roa et al., 2013), breast tissue (Naylor et al., 2017), breast tissue / mitotic cells (Wang et al., 2014; Saha et al., 2018), breast tissue / lymphocyte cells (Lu et al., 2020), prostate tissue (Ren et al., 2017; Ali et al., 2020), cervix tissue (Araujo et al., 2019) and multiorgan tissue (Kang et al., 2019; Cui et al., 2019; Jung et al., 2019).

Several histopathology studies even suggest that the same trained algorithms often yield different performance metrics for tissues from different organs (Kang et al., 2019; Cui et al., 2019). That is why there is a high demand for effective nuclei segmentation methods which can be generalized across various cell, tissue and organ types.

Over- or under-segmentation in cases of touching and overlapping nuclei and the misidentification of micronuclei and mitotic nuclei pose different types of issues. Prior research also points to differences in algorithm performance between benign and malignant samples. For example, Ali et al. (2020) report that their algorithm achieves higher precision, recall and F-measure scores for malignant cells than for benign cells.

In order to deal with these issues, further post-processing and intervention are usually necessary (Mahmood et al., 2019; Caicedo et al., 2019).

Nucleus segmentation in prostate tissue samples

Several studies have placed an emphasis on segmenting nuclei in prostate biopsy tissue samples (Ren et al., 2017; Ali et al., 2020). When analyzing prostate tissue, the focus is on glandular structures and their components, such as epithelial nuclei, stromal nuclei, epithelial cytoplasm, stromal cytoplasm and lumen. Accurate nuclei segmentation in the histopathology analysis of prostate tissue biopsies is a decisive factor in the diagnosis and grading of prostate cancer. A concentration of epithelial nuclei on the boundaries of the prostate gland indicates that the structure of the tissue is intact and benign. The spread of epithelial nuclei with irregular shapes across the stroma areas can indicate that the biopsy sample is malignant.

Prostate tissue microscopy image

The procedure proposed by Ali et al. (2020) employs a few preprocessing steps i.e. affine transformation prior to performing a coarse segmentation of the input images. The next stage involves the extraction of glandular regions of the tissue. While doing so, benign tissue samples are distinguished from malignant ones.

More recent approaches also apply both boundary segmentation and region growing, while employing prior knowledge on the specifics of prostate tissue structure. For the purposes of Gleason scoring, which is typically used for prostate cancer grading, morphological features like glandular growth patterns and the degree of differentiation have to be considered as seen under the microscope at low magnification.

The nuclei in neoplastic lesions are often enlarged and show prominent nucleoli. They may also exhibit variations in size and shape, while mitotic figures are in general extremely uncommon. Using deep learning approaches, Bhattacharjee et al. (2019) report accuracies of 81% for classifying input images into benign vs. malignant samples, 75% for grade 3 vs. grades 4 and 5 prostate cancer samples and 76% for grade 4 vs. grade 5 prostate cancer samples.

A multistage segmentation approach proves to be more efficient as it allows for an easier differentiation between epithelial nuclei and stroma nuclei in standard H&E stained images (Ali et al., 2020).

Annotated mouse prostate microscopy images

In the case of malignant prostate tissue samples, cell nuclei have a different appearance based on the pathological pattern associated with the dataset. With the help of advanced deep learning techniques you can extract reliable quantitative information on such specific morphological features of cell nuclei in prostate tissue datasets.

Mouse prostate tissue deep learning segmentation model

Developing new nucleus segmentation algorithms based on your own image data may be time-consuming and requires programming skills. Some of the deep learning architectures we have presented above are available in open source libraries. However, adjusting them for the purposes of your research design still requires expert knowledge.

Extract reliable nuclei segmentation data with Sparkfinder

Our original bioimage analysis application Sparkfinder can help you master complex nuclei segmentation tasks with ease. Developed especially for the study of multichannel fluorescence images, Sparkfinder is an excellent aid when extracting useful quantitative details about the cell nuclei in your dataset including: 

  • color intensity
  • distance and
  • density.

On top of that, Sparkfinder proves highly effective for splitting touching and overlapping nuclei thanks to its advanced instance segmentation capabilities. Tap the full potential of Sparkfinder to advance your nuclei segmentation research project.

Our Sparkfinder app is specifically designed to solve complex nuclei segmentation tasks with the help of instance segmentation

Try Sparkfinder now

You can try Sparkfinder yourself in the trial version of the IKOSA Platform. If you are interested in finding out more about Sparkfinder and our other image analysis solutions contact us at office@kmlvision.com.
