
Unet

No Author Given

Laboratory of Mathematics, Computer Science & Applications-Security of Information, Department of Mathematics.


Faculty of Sciences, Mohammed V University in Rabat, Morocco
birjmel@gmail.com

Abstract. In cancer research,

Introduction

Tumor segmentation, which involves identifying and isolating the boundaries of tumors in medical images,

Related Works

In recent years, there has been growing interest in using neural networks for 3D segmentation of spheroid models
in the field of microscopy. This section reviews the most recent and relevant studies published from 2017 to 2022,
in increasing order of publication year, and outlines the methods used, their results, and their impact on the field.

– In 2017: Kecheril Sadanandan et al. [17] proposed a novel approach for segmenting spheroids in 2D bright-field
microscopy images using a multiscale deep adversarial network. Their results showed that this method outper-
formed other networks and did not require any configuration. Schmitz et al. [14] presented a multiscale image
analysis pipeline that combines light sheet-based fluorescence microscopy with automated nuclei segmentation
and concepts from graph analysis and computational topology to produce high-quality images of intact spheroids
at cellular resolution. This approach was applied to breast carcinoma spheroids and revealed two concentric
layers of different cell density, providing a multiscale understanding of tissue architecture.
– In 2018: Khoshdeli et al. [18] proposed a new 3D segmentation model based on convolutional neural networks for
nuclear segmentation in human mammary epithelial cell line spheroids. The model was validated and demon-
strated superior pixel-based segmentation compared to previous methods. Rasti et al. [7] introduced a method
for using supervised machine learning for 3D microscopy without manual annotation. They trained and tested
a deep learning detection approach using synthetic images and showed that it could achieve over 90% accuracy.
– In 2019: Michálek et al. [16] presented a detailed workflow that uses algorithms for imaging spheroids, combining
data from mass spectrometry and fluorescence microscopy, and evaluating drug penetration and cell distribution
in 3D in-vitro cell culture models. Souadih et al. [8] proposed a method for automatic segmentation of the
sphenoid sinus in CT scans using a 3D convolutional neural network.
– In 2020: Tasnadi et al. [13] introduced 3D-Cell-Annotator, an open-source software for semi-automated segmen-
tation of single cells in 3D microscopy images. The software employs shape descriptors and integrates with the
3D interface of the Medical Imaging Interaction Toolkit (MITK) to reach a precision level equivalent to that of
a human expert.
– In 2021: Ahmad et al. [15] proposed a method for clearing spheroids for 3D fluorescent microscopy. It combines
safe and soft chemicals with a deep convolutional neural network to enhance visualization and enable high-
quality image acquisition and analysis. The authors also proposed a combination of local contrast metrics and
deep convolutional neural network-based segmentation of nuclei to quantify clearing efficiency. Lacalle et al. [19]
proposed SpheroidJ, an open-source deep learning-based tool set for spheroid segmentation. It can handle images
of single or multiple spheroids and does not require configuration. Grexa et al. [12] proposed SpheroidPicker,
a deep learning-based method for automated 3D cell culture manipulation. The system uses light microscopy
and a micromanipulator to select uniform spheroids for transfer to various sample holders. Chen et al. [11]
proposed a deep learning-based method for automating the detection and analysis of tumor spheroid behavior
in 3D culture. The method improves the application of 3D tumor spheroids in high-throughput drug screening.
Wen et al. [10] proposed 3DeeCellTracker, a deep learning-based software pipeline for segmenting and tracking
cells in 3D time-lapse images. The method successfully segments and tracks cells with high accuracy, enabling
the analysis of previously difficult image datasets.

– In 2022: Rettenberger et al. [9] proposed a framework to reduce annotation efforts in image segmentation. It uses
a combination of unsupervised machine learning, thresholding methods, and a convolutional neural network to
generate better annotations than traditional methods.

The reviewed studies highlight the growing adoption of neural networks for 3D segmentation of spheroid models.
The proposed methods have improved the accuracy and efficiency of image segmentation and have enabled high
precision 3D analysis of spheroids in various fields of microscopy, especially in drug screening and cell tracking.
One approach to enhancing the accuracy of deep learning-based 3D spheroid segmentation is to combine it with
traditional image processing methods. There have been several recent publications on this subject, including:

– In 2017, Kecheril Sadanandan et al. presented a multiscale deep adversarial network approach for spheroid
segmentation [17]. The authors generated ground truth data through semi-automatic segmentation with manual
correction for one dataset and manual annotation for another. They utilized data augmentation and a deep neural
network architecture influenced by U-Net, with long and short skip connections, inception-residual blocks, and
multiscale networks. They evaluated the performance of their method using the Dice coefficient, a metric that
compares predicted segmentation masks with the ground-truth masks (a minimal implementation is sketched after this list).
– In 2021, Vaidyanathan et al. proposed a deep neural network-based machine learning pipeline for automatically
segmenting spheroid images [6]. This model combined a VGG16 encoder for feature extraction with a U-Net
decoder for image segmentation. The authors utilized a set of 23 diverse spheroid images
that were manually annotated and used to generate binary mask images, which served as the training dataset for
the network. Meanwhile, Lacalle et al. proposed an open-source set of tools, SpheroidJ, that merges traditional
imaging processing methods with deep learning techniques to develop a generic spheroid segmentation algorithm
[19]. They found that their best model generalizes well to different experimental conditions and made the
algorithm and models publicly available to facilitate the analysis of spheroids under various conditions and
advance our understanding of tumor behavior.
– In 2022, Rettenberger et al. presented a framework for reducing manual annotation efforts in image segmentation
[9]. The authors employed a thresholding method to generate imperfect annotations, which were then utilized in
a deep neural network (U-Net) to learn segmentation masks. The idea is that the U-Net can filter out the noise
in the imperfect annotations and improve the generated segmentation masks.
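
As a point of reference for the metric mentioned above, here is a minimal Python sketch of the Dice coefficient on binary masks (a generic implementation, not the authors' code):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient between two binary segmentation masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * intersection / total
```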

All these studies highlight the significance of combining image processing methods with deep learning models to
enhance the accuracy of 3D spheroid segmentation.
Table 1: Summary of the reviewed studies on deep learning-based spheroid segmentation.

Paper | Methodology | Dataset | Results
Kecheril et al. (2017) | Multiscale deep adversarial network with two different types of deep feature extraction at different scales; adversarial loss linearly increased against the segmentation network for stability | 2D bright-field images of spheroids with varying sizes and shapes from different cell types, treatments, and experimental conditions | Outperformed baseline and modified U-Net networks in segmenting spheroids from 2D bright-field microscopy images, with mean IoU of 0.88 and mean Dice coefficient of 0.93 on dataset 1, and 0.74 and 0.83 on dataset 2, respectively
Schmitz et al. (2017) | Deep learning methods (U-Net, Dist, and StarDist) applied to segmentation with transfer learning and an end-to-end learning process | RapiClear- and glycerol-cleared samples compared to non-cleared samples | Glycerol clearing provides the best segmentation performance; F1-score and AJI used as metrics
Rasti et al. (2018) | Support Vector Machine (SVM) with linear kernel applied to a Local Binary Pattern (LBP) descriptor for one classification approach; convolutional neural network (CNN) applied to downscaled images for the other | Training set of 120,000 image patches generated by simulating the diffusion of light through numerical spheres, and a testing set of 3,500 patches based on real images, annotated using a semi-automated cell detector | CNN classifier provides better overall detection performance than the LBP+SVM classifier
Khoshdeli et al. (2018) | Multilayer encoder-decoder network based on convolutional neural networks | Manually annotated data of human mammary epithelial cell lines | Proposed method gives superior pixel-based segmentation; an F1-score of 0.95 is reported
Souadih et al. (2019) | Deep 3D CNN (DeepMedic architecture) with two parallel convolution paths, small convolutional kernels, and a fully convolutional approach | Small dataset of 28 volume CT scan images of the sphenoid sinus | Preliminary results on CT volumes appear very promising (evaluated using Dice); more efficient and compact than traditional segmentation methods, but requires a small dataset for training
Vaidyanathan et al. (2021) | VGG16-U-Net deep neural network for spheroid segmentation and morphological clustering | VSMC spheroid images | Successfully identified morphological subpopulations of drug-treated VSMC spheroids
Wen et al. (2021) | 3D U-Net and data augmentation for tumor spheroid segmentation | Four datasets from different laboratories | Outperformed recently developed 2D/3D tracking software using 3DeeCellTracker
Chen et al. (2021) | SMART system with PSP-U-Net for invasive tumor spheroid boundary detection | Six cancer cell lines | PSP-U-Net method outperformed existing image dissection methods
Grexa et al. (2021) | Mask R-CNN and U-Net for automated spheroid detection and segmentation (SpheroidPicker) | Unique image database of annotated spheroids | Models can detect and segment spheroids with high reliability
Lacalle et al. (2021) | Combination of traditional image processing methods and deep learning techniques (DeepLabV3+, HRNet-Seg, Mask R-CNN, U-Net, U2-Net) for generic spheroid segmentation (SpheroidJ) | Images of two different tumor spheroids under different experimental conditions, captured using different equipment | Constructed a generic algorithm particularised to different scenarios
Rettenberger et al. (2022) | U-Net for spheroid segmentation in high-throughput Droplet Microarray | High-throughput Droplet Microarray images | Achieved high segmentation accuracy with U-Net

DeepLabv3+

Atrous Convolution and DeepLab Family

Atrous or dilated convolutions allow for an increase in the receptive field without changing the size of the feature
map or the number of parameters (figure 1). The output Y(m, n) of a 2D atrous convolution of an input X(m, n)
with a convolution filter W(i, j), where the rate factor r determines the sampling stride, can be expressed as
follows:

Y(m, n) = \sum_{i=1}^{a} \sum_{j=1}^{b} X(m + ri, n + rj) \, W(i, j)

Here, the integer parameter r = 1 corresponds to standard convolution, while r > 1 yields atrous convolution.
Atrous convolution therefore allows filters with different effective sizes to cover a wider context.

Fig. 1: Dilated convolution with a 3 × 3 kernel at different dilation rates: (a) dilation rate = 1; (b) dilation rate = 2.
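
To make the parameter-count claim concrete, the following minimal PyTorch sketch (the framework choice is ours, not the paper's) compares a standard 3 × 3 convolution with an atrous one at rate r = 2: both preserve the feature map size and use exactly nine weights, but the atrous filter covers a 5 × 5 region.

```python
import torch
import torch.nn as nn

# Standard 3x3 convolution (dilation rate r = 1).
standard = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3,
                     padding=1, dilation=1, bias=False)

# Atrous 3x3 convolution with rate r = 2: the kernel samples the input
# with a stride of 2 between taps, covering a 5x5 region.
atrous = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3,
                   padding=2, dilation=2, bias=False)

x = torch.randn(1, 1, 64, 64)
print(standard(x).shape, atrous(x).shape)  # both: torch.Size([1, 1, 64, 64])

# Identical parameter count for both: 3 * 3 = 9 weights.
print(sum(p.numel() for p in standard.parameters()),
      sum(p.numel() for p in atrous.parameters()))  # 9 9
```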

This type of convolution finds its application in image segmentation. Several models are built on atrous
convolution, the most famous among them being the DeepLab family proposed by Google [5] and DenseASPP [3].
DeepLab has achieved state-of-the-art performance on various benchmark datasets, such as PASCAL VOC,
PASCAL-Context, and Cityscapes.
The first model in this family, DeepLab v1, was introduced in 2014 by the Google Research team [1]. The
original DeepLab model used a Convolutional Neural Network (CNN) architecture followed by an upsampling stage
to produce dense per-pixel prediction maps to perform semantic segmentation. The network is trained end-to-end
to optimize a per-pixel cross-entropy loss.
The DeepLab family has since evolved, with subsequent versions introducing various improvements to the model’s
performance. One major improvement was the introduction of atrous convolution (also known as dilated convolu-
tion), which allowed for the incorporation of large-scale context without increasing the number of parameters. An-
other improvement was the use of Atrous Spatial Pyramid Pooling (ASPP), which was first introduced in DeepLab
v2 [5], achieving state-of-the-art results on the PASCAL VOC 2012 dataset. Subsequently, DeepLab v3 was intro-
duced [2], which merges cascaded and parallel dilated convolution modules. The parallel modules are assembled in
the ASPP and include a 1 × 1 convolution and batch normalization. The ASPP’s outputs are concatenated and
further processed with another 1 × 1 convolution to generate the ultimate output, which consists of logits for each
pixel.
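
A minimal PyTorch sketch of an ASPP block follows (the 256-channel width and rates 6/12/18 match the common DeepLab configuration; the class itself is our illustrative assumption, not code from [2] or [5]):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling: parallel atrous branches plus image pooling."""
    def __init__(self, in_ch: int, out_ch: int = 256, rates=(6, 12, 18)):
        super().__init__()
        # 1x1 convolution branch.
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(in_ch, out_ch, 1, bias=False),
                          nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        ])
        # 3x3 atrous branches with different dilation rates.
        for r in rates:
            self.branches.append(
                nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=r,
                                        dilation=r, bias=False),
                              nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)))
        # Image-level pooling branch.
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        # 1x1 projection over the concatenated branch outputs.
        self.project = nn.Sequential(
            nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [branch(x) for branch in self.branches]
        pooled = F.interpolate(self.image_pool(x), size=x.shape[2:],
                               mode="bilinear", align_corners=False)
        feats.append(pooled)
        return self.project(torch.cat(feats, dim=1))
```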


DeepLab v3+ [4] was released later in 2018 and further improved the model’s performance by incorporating
an encoder-decoder structure. The encoder module encodes multi-scale contextual information by applying atrous
convolution at multiple scales, while the simple yet effective decoder module refines the segmentation results along
object boundaries.
The DeepLabv3+ model (figure 2) uses an encoder module based on the Aligned Xception model, which employs
depth-wise separable convolutions to extract features at a higher resolution for semantic segmentation. The
encoder terminates in the ASPP module, which fuses multi-scale features from five parallel branches, including
atrous convolutions with different dilation rates and an image-pooling branch, to capture context at various
scales. The decoder module then refines these features using low-level features from the encoder; its output is
upsampled to the original image resolution through bilinear interpolation, and a softmax function is applied to
generate the final segmentation map.
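
A hedged sketch of the DeepLabv3+ decoder stage may clarify this flow (the 48-channel low-level projection and the two 4× bilinear upsamplings follow the published design [4]; the class is otherwise an illustrative assumption):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepLabV3PlusDecoder(nn.Module):
    """Refines ASPP output with low-level encoder features (DeepLabv3+ style)."""
    def __init__(self, low_level_ch: int, aspp_ch: int = 256, num_classes: int = 2):
        super().__init__()
        # Reduce low-level features to few channels so they do not dominate.
        self.reduce = nn.Sequential(
            nn.Conv2d(low_level_ch, 48, 1, bias=False),
            nn.BatchNorm2d(48), nn.ReLU(inplace=True))
        self.fuse = nn.Sequential(
            nn.Conv2d(aspp_ch + 48, 256, 3, padding=1, bias=False),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.Conv2d(256, num_classes, 1))

    def forward(self, aspp_out, low_level, image_size):
        low = self.reduce(low_level)
        # Upsample the ASPP output by 4x to the low-level feature resolution.
        x = F.interpolate(aspp_out, size=low.shape[2:], mode="bilinear",
                          align_corners=False)
        x = self.fuse(torch.cat([x, low], dim=1))
        # Final 4x upsampling to the input resolution, then per-pixel softmax.
        x = F.interpolate(x, size=image_size, mode="bilinear", align_corners=False)
        return x.softmax(dim=1)
```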
Fig. 2: The DeepLabv3+ model: an encoder applying a DCNN with atrous convolutions (a 1 × 1 convolution; 3 × 3 convolutions at rates 6, 12, and 18; image pooling), and a decoder that concatenates the upsampled encoder output with low-level features before prediction. From [4].
Mathematically, the DeepLab models can be represented as a function F that maps an input image I to a pixel-
wise segmentation mask Y. The input image I is first passed through a series of convolutional layers, which extract
features from the image. These features are then passed through an atrous convolutional module, which performs
dilated convolutions to capture large-scale context. The output of the atrous convolutional module is then passed
through a decoder module (in DeepLab v3+), which refines the segmentation mask.
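
Under this functional view, a forward pass composes the stages sketched above, F = decoder(ASPP(backbone(I))). The toy backbone below is purely hypothetical and stands in for Aligned Xception; ASPP and DeepLabV3PlusDecoder refer to the earlier sketches:

```python
import torch
import torch.nn as nn

# Toy backbone standing in for Aligned Xception: two strided stages producing
# low-level (stride-4) and high-level (stride-16) feature maps.
class ToyBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.low = nn.Sequential(nn.Conv2d(3, 64, 3, stride=4, padding=1),
                                 nn.ReLU(inplace=True))
        self.high = nn.Sequential(nn.Conv2d(64, 512, 3, stride=4, padding=1),
                                  nn.ReLU(inplace=True))

    def forward(self, x):
        low = self.low(x)            # stride-4 features
        return low, self.high(low)   # stride-16 features

# F = decoder(ASPP(backbone(I))), reusing the earlier sketches.
backbone = ToyBackbone()
aspp = ASPP(in_ch=512)
decoder = DeepLabV3PlusDecoder(low_level_ch=64)

# Eval mode: the image-pooling branch's BatchNorm needs batch statistics otherwise.
for m in (backbone, aspp, decoder):
    m.eval()

image = torch.randn(1, 3, 256, 256)
with torch.no_grad():
    low, high = backbone(image)
    mask = decoder(aspp(high), low, image_size=image.shape[2:])
print(mask.shape)  # torch.Size([1, 2, 256, 256]): per-pixel class probabilities
```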
Acknowledgement

This work was supported by the Ministry of Higher Education, Scientific Research and Innovation, the Digital
Development Agency (DDA) and the CNRST of Morocco.
References

1. Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2014). Semantic image segmentation with deep
convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062.
2. Chen, L. C., Papandreou, G., Schroff, F., & Adam, H. (2017). Rethinking atrous convolution for semantic image
segmentation. arXiv preprint arXiv:1706.05587.
3. Yang, M., Yu, K., Zhang, C., Li, Z., & Yang, K. (2018). DenseASPP for semantic segmentation in street scenes. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3684-3692).
4. Chen, L. C., Zhu, Y., Papandreou, G., Schroff, F., & Adam, H. (2018). Encoder-decoder with atrous separable convolution
for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 801-818).
5. Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). DeepLab: Semantic image segmentation
with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 40(4), 834-848.

6. Vaidyanathan, K., Wang, C., Krajnik, A., Yu, Y., Choi, M., Lin, B., ... & Bae, Y. (2021). A machine learning pipeline
revealing heterogeneous responses to drug perturbations on vascular smooth muscle cell spheroid morphology and for-
mation. Scientific reports, 11(1), 1-15.
7. Rasti, P., Huaman, R., Riviere, C., & Rousseau, D. (2018, May). Supervised machine learning for 3D microscopy without
manual annotation: application to spheroids. In Unconventional Optical Imaging (Vol. 10677, pp. 420-425). SPIE.
8. Souadih, K., Belaid, A., & Salem, D. B. (2019). Automatic segmentation of the sphenoid sinus in CT-scans volume with
DeepMedics 3D CNN architecture. Medical Technologies Journal, 3(1), 334-346.
9. Rettenberger, L., Schilling, M., & Reischl, M. (2022). Annotation Efforts in Image Segmentation can be Reduced by
Neural Network Bootstrapping. Current Directions in Biomedical Engineering, 8(2), 329-332.
10. Wen, C., Miura, T., Voleti, V., Yamaguchi, K., Tsutsumi, M., Yamamoto, K., ... & Kimura, K. D. (2021). 3DeeCellTracker,
a deep learning-based pipeline for segmenting and tracking cells in 3D time lapse images. Elife, 10, e59187.
11. Chen, Z., Ma, N., Sun, X., Li, Q., Zeng, Y., Chen, F., ... & Gu, Z. (2021). Automated evaluation of tumor spheroid
behavior in 3D culture using deep learning-based recognition. Biomaterials, 272, 120770.
12. Grexa, I., Diosdi, A., Harmati, M., Kriston, A., Moshkov, N., Buzas, K., ... & Horvath, P. (2021). SpheroidPicker for
automated 3D cell culture manipulation using deep learning. Scientific reports, 11(1), 1-11.
13. Tasnadi, E. A., Toth, T., Kovacs, M., Diosdi, A., Pampaloni, F., Molnar, J., ... & Horvath, P. (2020). 3D-Cell-Annotator:
an open-source active surface tool for single-cell segmentation in 3D microscopy images. Bioinformatics, 36(9), 2948-2949.
14. Schmitz, A., Fischer, S. C., Mattheyer, C., Pampaloni, F., & Stelzer, E. H. (2017). Multiscale image analysis reveals
structural heterogeneity of the cell microenvironment in homotypic spheroids. Scientific reports, 7(1), 1-13.
15. Ahmad, A., Goodarzi, S., Frindel, C., Recher, G., Riviere, C., & Rousseau, D. (2021). Clearing spheroids for 3D fluorescent
microscopy: combining safe and soft chemicals with deep convolutional neural network. bioRxiv.
16. Michálek, J., Štěpka, K., Kozubek, M., Navrátilová, J., Pavlatovská, B., Machálková, M., ... & Pruška, A. (2019).
Quantitative assessment of anti-cancer drug efficacy from coregistered mass spectrometry and fluorescence microscopy
images of multicellular tumor spheroids. Microscopy and Microanalysis, 25(6), 1311-1322.
17. Kecheril Sadanandan, S., Karlsson, J., & Wahlby, C. (2017). Spheroid segmentation using multiscale deep adversarial
networks. In Proceedings of the IEEE International Conference on Computer Vision Workshops (pp. 36-41).
18. Khoshdeli, M., Winkelmaier, G., & Parvin, B. (2018). Multilayer encoder-decoder network for 3D nuclear segmentation
in spheroid models of human mammary epithelial cell lines. In Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition Workshops (pp. 2239-2245).
19. Lacalle, D., Castro-Abril, H. A., Randelovic, T., Domínguez, C., Heras, J., Mata, E., ... & Ochoa, I. (2021). SpheroidJ:
an open-source set of tools for spheroid segmentation. Computer Methods and Programs in Biomedicine, 200, 105837.
