Current Medical Imaging, 2021, 17, 1-00
REVIEW ARTICLE
A Tour of Unsupervised Deep Learning for Medical Image Analysis

Khalid Raza1,* and Nripendra Kumar Singh2
1Department of Computer Science, Jamia Millia Islamia, New Delhi, India; 2Department of Computer Science, Jamia Millia Islamia, New Delhi, India
Abstract: Background: Interpretation of medical images for the diagnosis and treatment of complex diseases from high-dimensional and heterogeneous data remains a key challenge in transforming healthcare. In the last few years, both supervised and unsupervised deep learning have achieved promising results in the area of medical image analysis. Several reviews on supervised deep learning have been published, but hardly any rigorous review on unsupervised deep learning for medical image analysis is available.
Objectives: The objective of this review is to systematically present various unsupervised deep learning models, tools, and benchmark datasets applied to medical image analysis. Some of the discussed models are autoencoders and their variants, restricted Boltzmann machines (RBM), deep belief networks (DBN), deep Boltzmann machines (DBM), and generative adversarial networks (GAN). Future research opportunities and challenges of unsupervised deep learning techniques for medical image analysis are also discussed.
Conclusion: Currently, interpretation of medical images for diagnostic purposes is usually performed by human experts, who may be replaced by computer-aided diagnosis owing to advances in machine learning techniques, including deep learning, and the availability of cheap computing infrastructure through cloud computing. Both supervised and unsupervised machine learning approaches are widely applied in medical image analysis, each having certain pros and cons. Since human supervision is not always available, or may be inadequate or biased, unsupervised learning algorithms offer considerable promise for biomedical image analysis.
ARTICLE HISTORY
Received: May 30, 2020
Revised: November 17, 2020
Accepted: December 16, 2020
DOI: 10.2174/1573405617666210127154257
Keywords: Unsupervised learning, medical image analysis, autoencoders, restricted Boltzmann machine, deep belief network.
*Address correspondence to this author at the Department of Computer Science, Jamia Millia Islamia, New Delhi, India; E-mail: kraza@jmi.ac.in
1573-4056/21 $65.00+.00 © 2021 Bentham Science Publishers
1. INTRODUCTION

Medical imaging techniques, including magnetic resonance imaging (MRI), positron emission tomography (PET), computed tomography (CT), mammography, ultrasound, X-ray, and digital pathology images, are frequently used diagnostic tools for the early detection, diagnosis, and treatment of various complex diseases [1]. In the clinics, the images are mostly interpreted by human experts such as radiologists and physicians. Because of major variations in pathology and the potential fatigue of human experts, scientists and doctors have started using computer-assisted interventions. The advancement in machine learning techniques, including deep learning, and the availability of computing infrastructure through cloud computing have fuelled the field of computer-assisted medical image analysis and computer-assisted diagnosis (CAD). Deep learning is about learning representations, i.e., learning intermediate concepts or features that are important for capturing dependencies from input variables to output variables in supervised learning, or between subsets of variables in unsupervised learning. Both supervised and unsupervised machine learning approaches are widely applied in medical image analysis; each of them has its pros and cons. Some of the widely used supervised (deep) learning algorithms are Feedforward Neural Network (FFNN), Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), Support Vector Machine (SVM), and so on [2]. There are many scenarios where human supervision is unavailable, inadequate, or biased, and therefore a supervised learning algorithm cannot be directly used. Unsupervised learning algorithms, including their deep architectures, offer many advantages and have been widely applied in several areas of medical and engineering problems, including medical image analysis.
This review presents unsupervised deep learning models, their applications in medical image analysis, a list of software tools/packages and benchmark datasets, and discusses opportunities and future challenges in the area.
2. WHY UNSUPERVISED LEARNING?
In the majority of machine learning projects, the workflow is designed in a supervised way, where the algorithm is guided by humans through tasks with already specified labels. In such supervised architecture, the potential of
the algorithms is limited in three ways: (i) a huge manual effort is needed to create labels, (ii) biases must be prepared to check the algorithms' supervised functioning, and (iii) the scalability of the target function at hand is reduced.
To solve these issues, an unsupervised machine learning algorithm can be used. Unsupervised machine learning algorithms not only derive insights directly from the data and group the data based on their patterns but also use these insights for data-driven decision-making. Also, unsupervised models are more robust in the sense that they act as a base for several different complex tasks, where they can serve as a foundation for learning and classification. In fact, classification is not the only task for which they are employed; other tasks such as compression, dimensionality reduction, denoising, super-resolution, and some degree of decision making are also performed. Therefore, it is more useful to construct a model without having any knowledge about the tasks. In a nutshell, we can think of unsupervised learning as a preparation (pre-processing) step for supervised learning tasks, a step that may allow better generalization of a classifier [2].
Unsupervised pre-training gained popularity because of its simplicity and has been widely adopted by the medical imaging community. Moreover, autoencoders, along with their several variants and other unsupervised methods, have been reported and are being extensively applied in medical image analysis.
3. UNSUPERVISED DEEP LEARNING MODELS

The unsupervised deep learning models can be broadly classified as shown in Fig. (1). These have a myriad of uses in different aspects of medical image processing.
Fig. (1). Unsupervised deep learning models. (A higher resolution / colour version of this figure is available in the electronic copy of the article).
3.1. Autoencoders and their Variants
3.1.1. Autoencoders and Stacked Autoencoder
Autoencoders (AEs) [3] are simple unsupervised learning models consisting of a single-layer neural network that transforms the input into a latent or compressed representation by minimizing the reconstruction errors between input and output values of the network. By constraining the dimension of the latent representation (which may differ from that of the input), it is possible to discover relevant patterns from the data. The autoencoder framework defines a feature-extracting function with specific parameters [4]. This function, $f_\theta$, is called the encoder, and $h = f_\theta(x)$ is the feature vector or representation computed from input $x$. Another parameterized function, $g_\theta$, called the decoder, maps from feature space back to input space. In short, basic AEs are trained to minimize the reconstruction error by finding a value of the parameter $\theta$ that minimizes:
$$J_{AE}(\theta) = \sum_{t} L\left(x^{(t)},\, g_\theta\left(f_\theta\left(x^{(t)}\right)\right)\right) \tag{1}$$
where $x^{(t)}$ is a training example and $L$ is the reconstruction error. The most commonly used forms for the encoder and decoder are affine mappings, optionally followed by a non-linearity, as given by:
$$f_\theta(x) = s_f(b + Wx) \tag{2}$$
$$g_\theta(h) = s_g(d + W'h) \tag{3}$$
where $s_f$ and $s_g$ are the encoder and decoder activation functions (normally sigmoid, hyperbolic tangent, or an identity function), while the parameters of the model are $\theta = \{W, b, W', d\}$, where $W$ and $W'$ are the encoder and decoder weight matrices, and $b$ and $d$ are the encoder and decoder bias vectors, respectively. Moreover, regularization or sparse constraints may be applied to boost the discovery process. Otherwise, if the hidden layer(s) has the same size as the input layer and no non-linearity is added, the model would simply learn an identity function. Fig. (2) and Fig. (3) illustrate the basic structure of AE.
Stacked autoencoders (SAEs), also known as deep AEs, are constructed by organizing AEs on top of one another. They consist of multiple AEs stacked into multiple layers where the output of each layer is wired to the inputs of the successive layer (Fig. 2(B)). To obtain good parameters, SAE uses greedy layer-wise training. The benefit of this type of autoencoder is that it enjoys the greater expressive power of a deep network. Furthermore, it usually captures a useful hierarchical grouping of the input [5].
Fig. (2 A-D). A collage of networks of autoencoders and their variants. (A higher resolution / colour version of this figure is available in the electronic copy of the article).
3.1.2. Denoising Autoencoder
Denoising autoencoder (DAE) is another variant of the autoencoder. Denoising is important and is used as a training criterion for efficient and robust learning and to extract useful features [6]. It also prevents the model from learning a trivial solution: the model (see Litjens et al. [7]) is trained to reconstruct a clean input from a version corrupted by noise or another corruption process. This is done by corrupting the initial input $x$ into $\tilde{x}$ by using a stochastic function $\tilde{x} \sim q_D(\tilde{x} \mid x)$. The corrupted input is then mapped to a hidden representation $y = f_\theta(\tilde{x})$ and a reconstruction $z = g_{\theta'}(y)$. A schematic representation of DAE is shown in Fig. (2(C)). Parameters $\theta$ and $\theta'$ are initialized randomly and trained using stochastic gradient descent to minimize the average reconstruction error. The denoising autoencoder thus minimizes the same reconstruction loss as before, now between the clean $x$ and its reconstruction from $y$. This amounts to maximizing a lower bound on the mutual information between input $x$ and representation $y$; the difference is that $y$ is obtained by applying the mapping $f_\theta$ to a corrupted input. Hence, such learning does better than the identity, and it extracts features useful for denoising. Stacked denoising autoencoder (SDAE) is a deep network utilizing the power of DAE [6, 8] and RBMs [9, 10].
3.1.3. Sparse Autoencoder
The limitation of autoencoders to have only small numbers of hidden units can be overcome by adding a sparse constraint, where a large number of hidden units (more than one) can be introduced. The aim of sparse autoencoder (SAE) is to make a large number of neurons have a low average output so that neurons are inactive most of the time. Sparsity can be achieved by introducing a penalty in the loss function during training or by manually zeroing all but the few strongest hidden unit activations. A schematic representation of SAE is shown in Fig. (2(D)).
If the activation function of the hidden neurons is $a_j$, the average activation of each hidden neuron $j$ will then be given by:
$$\hat{P}_j = \frac{1}{m} \sum_{t=1}^{m} a_j\left(x^{(t)}\right) \tag{4}$$
where $m$ denotes the number of training examples. The objective of sparse constraints is to constrain $\hat{P}_j$ so that $\hat{P}_j = P$, where $P$ is a sparse constraint very close to 0, such as 0.05. To enforce sparse constraints, a penalty term is added to the cost function, which penalizes $\hat{P}_j$ deviating significantly from $P$. The penalty term is the Kullback-Leibler (KL) divergence between Bernoulli random variables, which can be calculated as [11, 12]:
$$\sum_{j=1}^{N_2} \mathrm{KL}\left(P \,\|\, \hat{P}_j\right) \tag{5}$$
where $N_2$ is the number of neurons in the hidden layers, and index $j$ is summing over the hidden units in the network.
$$\mathrm{KL}\left(P \,\|\, \hat{P}_j\right) = P \log \frac{P}{\hat{P}_j} + (1 - P) \log \frac{1 - P}{1 - \hat{P}_j} \tag{6}$$
The property of the penalty function is that $\mathrm{KL}(P \,\|\, \hat{P}_j) = 0$ if $\hat{P}_j = P$; otherwise, it increases monotonically as $\hat{P}_j$ diverges from $P$.
3.1.4. Convolutional Autoencoder
As current research shows, the stacked AE is the most popular and widely used network model in deep unsupervised architecture [7], and it requires layer-wise pre-training. When the network goes deeper and the pre-training process involves fully connected layers, the entire process becomes very tedious and time-consuming. Li et al. [13] made the first attempt to train convolutional autoencoders directly in an end-to-end manner without pre-training. Guo et al. [14] suggested that the convolutional autoencoder (CAE) is beneficial for learning features from images, preserving the local structure of data, and avoiding distortion of feature space. A typical architecture of CAE is depicted in Fig. (3).
Fig. (3). A collage of networks in (A) Convolutional autoencoder, (B) Variational autoencoder, and (C) Contractive autoencoders. (A higher resolution / colour version of this figure is available in the electronic copy of the article).
3.1.5. Variational Autoencoder
Another variant of autoencoder, called variational autoencoder (VAE), was introduced as a generative model [15]. A typical architecture of VAE is given in Fig. (3). VAEs utilize the strategy of deriving a lower bound estimator from directed graphical models with a continuous distribution of latent variables. The generative parameter $\theta$ in the decoder (generative model) assists the learning process of the variational parameter $\phi$ in the encoder (variational approximation model). VAEs apply the variational approach to latent representation learning, which adds a loss component and uses specific training estimators, known as Stochastic Gradient Variational Bayes (SGVB) and Auto-Encoding Variational Bayes (AEVB) [15]. The method optimizes the parameters $\phi$ and $\theta$ for the probabilistic encoder $q_\phi(z \mid x)$, which is an approximation to the posterior of the generative model $p_\theta(x, z)$, where $z$ is the latent variable and $x$ is a continuous or discrete variable. It aims to maximize the probability of each $x$ in the training data set under the entire generative process. However, an alternative configuration of generative latent variable modeling gives rise to deep generative models (DGMs) that replace the existing assumption of a symmetric Gaussian posterior [16].
Recently, Ilse and collaborators [114] extended the VAE framework to the domain invariant variational autoencoder (DIVA) to tackle the problem of domain generalization, i.e., learning representations that carry over to previously unseen domains. Considering a perfectly disentangled latent space, it is hypothesized that there exists a latent subspace that is invariant to changes in the domain $d$ (domain invariant). DIVA considers three independent latent subspaces: domain ($z_d$), class ($z_y$), and residual variations ($z_x$). As $z_d$ and $z_y$ are marginally independent, the model would learn representations $z_y$ that are invariant with respect to the domain $d$. Finally, these three latent variables are used by a single decoder $p_\theta(x \mid z_d, z_x, z_y)$ to reconstruct $x$. A detailed discussion on DIVA can be found in [114].
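The lower-bound objective used by the VAE of Section 3.1.5 can be illustrated with a minimal NumPy sketch. It is not taken from [15]: it assumes a diagonal Gaussian encoder with a standard normal prior, and uses a squared-error reconstruction term as a stand-in for the decoder likelihood. The function names `reparameterize`, `kl_to_standard_normal`, and `negative_elbo` are hypothetical.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I), so z is a differentiable function of (mu, log_var)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Per-sample KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)

def negative_elbo(x, x_hat, mu, log_var):
    """Reconstruction term plus KL term, averaged over the batch.
    Assumption: squared error stands in for the decoder log-likelihood."""
    recon = np.sum((x - x_hat) ** 2, axis=1)
    return np.mean(recon + kl_to_standard_normal(mu, log_var))

# Toy usage with made-up encoder outputs (illustrative values only).
rng = np.random.default_rng(0)
x = rng.random((4, 6))                      # 4 samples, 6 input features
mu = rng.normal(size=(4, 2))                # encoder means, 2 latent dimensions
log_var = rng.normal(scale=0.1, size=(4, 2))
z = reparameterize(mu, log_var, rng)        # latent sample fed to a decoder
x_hat = np.full_like(x, 0.5)                # placeholder for a decoder output
loss = negative_elbo(x, x_hat, mu, log_var)
```

In a real model, `mu`, `log_var`, and `x_hat` would be produced by trainable encoder and decoder networks, and `loss` would be minimized with stochastic gradient descent.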
3.1.6. Contractive Autoencoder

The contractive autoencoder was presented by Rifai et al. (2011) [17] as a novel approach for training deterministic autoencoders. It has an additional explicit regularizer in the objective function that encourages the model to learn a function that is robust to slight variations in the input values. This additional regularizer corresponds to the squared Frobenius norm of the Jacobian matrix of the encoder activations with respect to the input. The general architecture of the contractive autoencoder is given in Fig. (3). Adding this regularization term to the reconstruction loss yields the final objective function:
$$J_{CAE}(\theta) = \sum_{t} \left( L\left(x^{(t)},\, g_\theta\left(f_\theta\left(x^{(t)}\right)\right)\right) + \lambda \left\| J_f\left(x^{(t)}\right) \right\|_F^2 \right) \tag{7}$$
where $L$ is the reconstruction error, $J_f(x)$ is the Jacobian matrix of the encoder $f_\theta$ at $x$, and $\lambda$ is a hyperparameter weighting the regularizer.
The difference between contractive AE and DAE is stated in earlier work [6]. The former explicitly encourages robustness of representation, whereas the latter stresses the robustness of reconstruction. This property makes contractive AE a better choice than DAE for learning useful features. Table 1 presents a summary of autoencoders and their variants, and Table 2 presents the applications of autoencoders and their variants for medical image analysis.
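As an illustration of the contractive regularizer described in Section 3.1.6, the following NumPy sketch computes the squared Frobenius norm of the encoder Jacobian for a single input. It assumes a sigmoid encoder h = sigmoid(Wx + b), for which the norm has a simple closed form; the function names are hypothetical, and the finite-difference helper is included only to check the closed form.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def contractive_penalty(x, W, b):
    """Squared Frobenius norm of dh/dx for h = sigmoid(W x + b), single input x.
    Since J[j, i] = h_j (1 - h_j) W[j, i], the norm factorizes per hidden unit."""
    h = sigmoid(W @ x + b)
    return np.sum((h * (1.0 - h)) ** 2 * np.sum(W ** 2, axis=1))

def numerical_jacobian(x, W, b, eps=1e-6):
    """Central finite-difference Jacobian of the encoder, used as a sanity check."""
    J = np.zeros((W.shape[0], x.size))
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        J[:, i] = (sigmoid(W @ (x + step) + b) - sigmoid(W @ (x - step) + b)) / (2 * eps)
    return J

# Toy usage: 4 input features, 3 hidden units (illustrative sizes only).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
x = rng.normal(size=4)
penalty = contractive_penalty(x, W, b)
penalty_check = np.sum(numerical_jacobian(x, W, b) ** 2)
```

In training, this penalty would be multiplied by a weighting hyperparameter and added to the reconstruction loss for each sample.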
Table 1. Summary of autoencoders and their variants.
Types | Descriptions | References
Autoencoder | One of the simplest forms that aim to learn a representation (encoding) for a set of data | [18, 19]
Stacking autoencoder | An autoencoder having multiple layers where the outputs of each layer are given as inputs of the successive layer | [20]
Sparse autoencoder | Encourages hidden units to be zero or near to zero | [21]
Denoising autoencoder | Capable of predicting true inputs from noisy data | [22, 23]
Convolutional autoencoder | Learns features, preserves the local structure of data, and avoids distortion of feature space | [14]
Variational autoencoder | A generative model utilizing the strategy of deriving a lower bound estimator from directed graphical models with a continuous distribution of latent variables | [15]
Contractive autoencoder | Forces the encoder to take small derivatives | [17]
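To make the first rows of Table 1 concrete, the sketch below shows a single forward pass of a basic autoencoder, its denoising variant, and a sparsity penalty in NumPy. It is a minimal illustration under stated assumptions: sigmoid activations, squared-error reconstruction loss, masking noise as the corruption process, and a penalty weight of 0.1 are all choices made here, and every function name is hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def encode(x, W, b):
    """Encoder: affine map followed by a sigmoid; x has shape (n_samples, n_in)."""
    return sigmoid(x @ W.T + b)

def decode(h, W_prime, d):
    """Decoder: affine map back to input space followed by a sigmoid."""
    return sigmoid(h @ W_prime.T + d)

def reconstruction_loss(x, x_hat):
    """Mean over samples of the squared reconstruction error (one possible loss)."""
    return np.mean(np.sum((x - x_hat) ** 2, axis=1))

def corrupt(x, drop_prob, rng):
    """Masking noise: each input entry is set to zero with probability drop_prob."""
    return x * (rng.random(x.shape) >= drop_prob)

def kl_sparsity_penalty(h, p=0.05, eps=1e-8):
    """Sum over hidden units of KL(p || p_hat_j), with p_hat_j the mean activation of unit j."""
    p_hat = np.clip(h.mean(axis=0), eps, 1.0 - eps)
    return np.sum(p * np.log(p / p_hat) + (1 - p) * np.log((1 - p) / (1 - p_hat)))

# Toy usage with random, untrained parameters (illustrative sizes only).
rng = np.random.default_rng(0)
n_in, n_hidden = 8, 3
x = rng.random((5, n_in))
W = rng.normal(scale=0.1, size=(n_hidden, n_in))
b = np.zeros(n_hidden)
W_prime = rng.normal(scale=0.1, size=(n_in, n_hidden))
d = np.zeros(n_in)

h = encode(x, W, b)
ae_loss = reconstruction_loss(x, decode(h, W_prime, d))           # basic AE
x_tilde = corrupt(x, 0.3, rng)
dae_loss = reconstruction_loss(x, decode(encode(x_tilde, W, b), W_prime, d))  # DAE: clean target
sparse_loss = ae_loss + 0.1 * kl_sparsity_penalty(h)              # sparse AE
```

Note that the denoising loss compares the reconstruction with the clean input `x`, not with the corrupted `x_tilde`; this is what distinguishes it from a basic autoencoder applied to noisy data.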
Table 2. Applications of autoencoders and their variants for medical image analysis.
Method | Task | Image Type | Remarks | References
SAE | AD/MCI classification | MRI | SAE accompanied by supervised fine-tuning | [24]
SAE | AD/MCI/HC classification | MRI and PET | Extraction of latent features on a huge set of features obtained from MRI and PET images using SAE | [25]
SAE | AD/MCI/HC classification | MRI | SAE used to pre-train 3D CNN | [29]
SAE | MCI/HC classification | fMRI | SAE used for feature extraction, HMM as a generative model on top | [26]
SAE | Hippocampus segmentation | MRI | SAE used for representation learning and measurement of target/atlas patch | [30]
SAE | Visual pathway segmentation | MRI | SAE used to learn appearance features to steer the shape model for segmentation | [31]
SAE | Denoising DCE-MRI | MRI | Uses an ensemble of denoising SAE (pre-trained with RBMs). Denoising contrast-enhanced MRI sequences using expert DNNs | [32]
SAE | COVID-19 diagnosis | CT | SAE was used to detect coronavirus disease 2019 (COVID-19) infections from the CT image | [121]
SSAE | Nucleus detection | Digital pathology image | Detection of nuclei on breast cancer digital histopathological images | [33]
SAE | Stain normalization | Digital pathology image | SAE is applied to classify tissues and their subsequent histogram matching | [34]
SAE | Density classification | Mammography | Unsupervised CNN with SAE to learn features from unlabeled data for breast texture and density classification | [28]
SAE | Lesion classification | MRI | Learn to extract features from multi-parametric MRI data, subsequently creates a hierarchical classification to detect prostate cancer | [27]
SAE | Detection of heart, kidney, and liver location | MRI | SAE used for acquisition of spatio-temporal features on 2D along with time DCE-MRI | [5]
SAE | Cell segmentation | Digital pathology image | Learning spatial relationships | [35]
SAE | Segmentation of right ventricle in cardiac MRI | MRI | SAE applied to obtain initial right ventricle segmentation | [36]
SDAE | Cell segmentation | Digital pathology image | The SDAE trained with data and their structured labels for cell segmentation | [37]
Post-DAE | Image segmentation | X-ray, MRI | A post-processing method that is based on DAE to improve the anatomical plausibility of image segmentation algorithms | [112]
SSAE | AD | MRI | SSAE for early detection of Alzheimer's disease from brain MRI | [38]
SSAE | Brain tumor detection | MRI | A two-layer SSAE for the detection of brain tumors from MRI slices | [115]
SDAE | Breast lesion | Ultrasound and CT | Stacked Denoising AE for diagnosis of breast nodules and lesions | [39]
SDAE | Patient clinical events | Patient clinical history | SDAE for an unsupervised early prediction of a patient's future clinical events and disease | [40]
SDAE | - | CT/MRI | Multi-modal SDAE is used to pre-train the DNN | [41]
DCAE | Modeling task fMRI | tfMRI | Deep Convolutional AE to model tfMRI | [42]
DCAE | Tuberculosis severity scoring | CT | DCAE is used for descriptors extraction from a 3D CT image | [113]
CAE | AD/MCI/HC classification | fMRI | CAE used to pre-train 3D CNN | [43]
CAE | Nucleus detection | Digital pathology image | Sparse CAE to detect and encode nuclei and feature extraction from tissue section images | [44]
CAE | 3D liver segmentation | CT | CAE was used to extract features from unlabeled data for a CNN-based 3D image segmentation | [111]
[Abbreviations: CAE: Convolutional autoencoder; DCAE: Deep convolutional autoencoder; SAE: Sparse autoencoder; SSAE: Stacked sparse autoencoder; SDAE: Stack denoising autoencoder; DAE: Deep autoencoders; H&E: hematoxylin and eosin staining; AD: Alzheimer's disease; MCI: Mild cognitive impairment; fMRI: Functional magnetic resonance imaging; sMRI: Structural magnetic resonance imaging; rs-fMRI: Resting-state fMRI; EEG: electroencephalography; DBN: Deep belief network; RBM: Restricted Boltzmann machine].
SAEs were used by Suk and Shen [24] and Suk et al. [25] to extract low-level, latent features from multimodal imaging datasets of MCI, AD, and HC (healthy control subjects). SAEs were pre-trained on MRI, FDG-PET, and cerebrospinal fluid (CSF) biomarker data, followed by fusion of the extracted latent features with a multi-kernel SVM to yield the diagnosis. Supervised fine-tuning of the whole network, using the pre-trained parameters as the starting point, led to considerable accuracy of the output. Another variant of SAE, namely Deep Autoencoders (DAE), was used along with the Hidden Markov Model (HMM) to diagnose MCI from rs-fMRI data [26]. Apart from their use in classification, prediction, and diagnosis of various brain disorders, several studies use SAEs for lesion classification in prostate cancer, texture and density classification in breast cancer, detection of visceral organ locations from DCE-MRI images, and so on [5, 27, 28].
Besides the autoencoders and variants listed in Table 1, many other, less prominent variants have recently been proposed and applied to various aspects of medical image analysis. For instance, Supervised Switching Autoencoders (SSAs) were proposed for Alzheimer's disease classification from single-slice images and disease regional analysis [116], Spatiotemporal Attention Autoencoder (STAAE) for attention deficit hyperactivity disorder (ADHD) classification [117], Disentangled Autoencoder for cross-stain feature extraction [118], 3D residual autoencoder (3D ResAE) to model deep representations of fMRI [119], and Dense Residual Convolutional Auto Encoder (DRCAE) for retinal blood vessel segmentation [120].

3.2. Restricted Boltzmann Machines

Restricted Boltzmann Machines (RBMs) are a variant of Markov Random Field (MRF), constituting a single-layer undirected graphical model with an input or visible layer $x = (x_1, x_2, \ldots, x_N)$ and a hidden layer $h = (h_1, h_2, \ldots, h_M)$. The connection between nodes/units is bidirectional, so each given input vector x can be mapped to the latent feature representation h and vice versa. These are generative models that learn a probability distribution over the given input space and can generate new data points [45]. An illustration of a typical RBM is shown in Fig. (4(A)). In fact, RBMs are restricted versions of Boltzmann machines where neurons must form a bipartite graph. Due to this restriction, each pair of nodes consisting of one visible and one hidden node has a symmetric connection between them, and nodes within a group have no internal connections. This restriction makes the training of RBMs more efficient than in the general case of the Boltzmann machine. Hinton et al. (2010) [45] proposed a practical guide to train RBMs.
RBMs have been utilized in various aspects of medical image analysis, such as classification, segmentation, or detection of objects from radiological images or pathological data. Yoo et al. (2014) [46] proposed a method for segmentation of multiple sclerosis (MS) lesions in multi-channel 3D MRI images. Huang et al. (2016) [47] applied the RBM for Blind Source Separation (BSS) of fMRI data instead of Independent Component Analysis (ICA) to detect latent sources that differentiate internal or functional interaction in the special brain region. Cai et al. (2016) [48] put forward a novel architecture called Transformed Deep Convolutional Network (TDCN), a multi-output Convolutional Restricted Boltzmann Machine (CRBM) implemented to fuse the features learned from different modalities, e.g., CT and MRI, in an unsupervised manner, which improved the recognition of vertebra patterns of different species. Jaumard-Hakoun et al. (2016) [49] trained a deep autoencoder in two phases. In the first phase, a Translational RBM (tRBM) learned the relationship between the input data (ultrasound (US) images and contours) and, in the second phase, it reconstructed the contour of the tongue motion during speech from US images. However, RBMs and their variants have mostly been applied to classification problems. Cao et al. (2015) [50] addressed the classification of imbalanced data in breast mammograms using an RBM-based oversampling and semi-supervised learning model for false-positive reduction. Zhang et al. (2016) [51] achieved a classification AUC of 93.40% in differentiating between benign and malignant breast tumors. Van
Fig. (4). Various unsupervised learning models (A) Restricted Boltzmann Machine, (B) Deep Belief Network, (C) Deep Boltzmann Machine, (D) Generative Adversarial Network. (A higher resolution / colour version of this figure is available in the electronic copy of the article).
Table 3. Applications of RBM for medical image analysis.
Method | Task | Image Type | Remarks | References
RBM | Multiple sclerosis lesions | 3D MRI | Uses multi-channel 3D MR images of multiple sclerosis (MS) lesion for MS segmentation | [46]
RBM | Mass detection in breast cancer | Mammography | RBM based method for oversampling and semi-supervised learning to solve the classification of imbalanced data with a few labeled samples | [50]
RBM | fMRI blind source separation | fMRI | RBM is used for both internal and functional interaction-induced latent source detection | [47]
RBM | Vertebrae localization | CT, MRI | RBMs to locate the exact position of the vertebrae | [48]
RBM | Benign/Malignant classification | Ultrasound | Shear wave elastography for classification of benign and malignant mammary gland tumors using RBM | [51]
tRBM | Tongue contour extraction | Ultrasound | Analysis of tongue motion during the speech, using autoencoders in combination with RBM | [49]
CRBM | Lung tissue classification and airway detection | CT | Discriminative and generative learning by CRBM to develop filters for training as well as classification | [52]
RBM | Cardiac arrhythmia classification | ECG | Achieves average recognition accuracy for ventricular and supraventricular ectopic beats (93.63% and 95.57%, respectively) for cardiac arrhythmia classification | [53]
RBM | Brain lesion segmentation | MRI | RBM is used for feature learning and a Random Forest as a classifier | [54]
RBM | Breast-image classification | MRI | A Restricted Boltzmann machine with backpropagation has been used for histopathological breast-image classification | [55]
RBM | Motor imagery classification | EEG | Four-layer RBM is trained with the frequency signal of EEG used in brain-computer interface research | [84]
[tRBM: Translational RBM; CRBM: Convolutional RBM].
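The bipartite structure of the RBM in Section 3.2 means that all hidden units can be sampled at once given the visible units, and vice versa. The NumPy sketch below illustrates this with a binary RBM and one step of contrastive divergence (CD-1), a commonly used training rule that the text above does not spell out. The class `RBM`, its method names, the binary units, and the learning rate are assumptions made for illustration, not the setup of any study listed in Table 3.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class RBM:
    """Binary-binary RBM with weights W (n_hidden x n_visible) and two bias vectors."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = rng.normal(scale=0.01, size=(n_hidden, n_visible))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.rng = rng

    def hidden_prob(self, v):
        """P(h_j = 1 | v) for every hidden unit; no hidden-hidden connections are needed."""
        return sigmoid(v @ self.W.T + self.c)

    def visible_prob(self, h):
        """P(v_i = 1 | h) for every visible unit, using the same symmetric weights."""
        return sigmoid(h @ self.W + self.b)

    def sample(self, p):
        """Draw binary states from Bernoulli probabilities p."""
        return (self.rng.random(p.shape) < p).astype(float)

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 step on a batch v0; returns the mean squared reconstruction error."""
        h0_prob = self.hidden_prob(v0)
        h0 = self.sample(h0_prob)
        v1_prob = self.visible_prob(h0)
        v1 = self.sample(v1_prob)
        h1_prob = self.hidden_prob(v1)
        n = v0.shape[0]
        self.W += lr * (h0_prob.T @ v0 - h1_prob.T @ v1) / n
        self.b += lr * np.mean(v0 - v1, axis=0)
        self.c += lr * np.mean(h0_prob - h1_prob, axis=0)
        return np.mean((v0 - v1_prob) ** 2)

# Toy usage on random binary vectors (illustrative data only).
rng = np.random.default_rng(0)
data = (rng.random((20, 6)) < 0.5).astype(float)
rbm = RBM(n_visible=6, n_hidden=4, rng=rng)
errors = [rbm.cd1_update(data) for _ in range(5)]
```

Real-valued image intensities would typically need a different visible-unit model (for example Gaussian units) or prior binarization; that choice is outside the scope of this sketch.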
Tulder and de Bruijne (2016) [52] proposed the Convolutional RBM, which performed well on both the training data and lung texture classification. Mathews et al. (2018) [53] demonstrated classification of single-lead ECG signals for the detection of ventricular and supraventricular heartbeats. Pereira et al. (2017) [54] highlighted a combined RBM-Random Forest classifier, which jointly correlated features of the imaging data for brain tumor segmentation and penumbra estimation. A brief account of the applications of RBMs in medical image analysis is shown in Table 3.

3.3. Deep Belief Networks

Deep Belief Networks (DBN) are a kind of deep neural network described by Bengio (2009) [56]. A DBN is trained with a greedy layer-wise unsupervised learning algorithm and has several layers of hidden variables [8]. Layer-wise unsupervised training helps the optimization and provides weight initialization for better generalization. In fact, DBN is a hybrid single probabilistic generative model, like a typical RBM. The deep architecture is constructed like SAEs, with the AE layers replaced by RBMs: a DBN has one lowest visible layer $V$, representing the state of the input data vector, and a series of hidden layers $h^1, h^2, h^3, \ldots, h^L$. The following function represents the joint distribution of the visible units $V$ and the hidden layers $h^l$ ($l = 1, 2, \ldots, L$):
$$P\left(V, h^1, \ldots, h^L\right) = P\left(V \mid h^1\right) \left( \prod_{l=1}^{L-2} P\left(h^l \mid h^{l+1}\right) \right) P\left(h^{L-1}, h^L\right) \tag{8}$$
When multiple RBMs are stacked hierarchically, an undirected generative model is formed by the top two layers, and a directed generative model is formed by the lower layers. Fig. (4(B)) illustrates the structure of DBN. Hinton et al. [57] put forward the "wake-sleep" algorithm for unsupervised neural networks. Further, Hinton et al. (2006a) [10] applied a fast learning algorithm based on a layer-wise training procedure, where lower layers learn low-level features and, subsequently, higher layers learn high-level features.
DBN has received a lot of attention in the field of computer vision for its success in object recognition. Still, it was too expensive for tasks such as training on 3D images because of the large number of training parameters, so this limitation could not be overcome. Recently, a new variant of DBN was introduced, namely, convolutional DBN (ConvDBNs) [58]. Other work [59] [60] demonstrated a DBN architecture that learns a low-dimensional manifold of brain 3D-MRI, detects different variations in morphological changes, and correlates with Alzheimer's disease parameters and lesion distribution in multiple sclerosis. Plis et al. (2014) [61] and Pinaya et al. (2016) [62] evaluated deep learning methods to characterize brain networks in neurocognitive disorders like Huntington's disease and various brain regions in schizophrenia using DBN and supervised fine-tuning, whereas Ortiz et al. (2016) [63] classified Mild Cognitive Impairment (MCI) and AD for simpler diagnosis and treatment based on Automated Anatomical Labeling (AAL), which provides a grey matter (GM) image from each brain region. Carneiro et al. (2012) [64], Carneiro and Nascimento (2013) [65] built rigid and non-rigid derivatives
Table 4. Applications of DBNs for medical image analysis.
Method | Task | Image Type | Remarks | References
DBN | Manifold learning for AD/HC classification | MRI | DBNs with convolutional RBMs for manifold learning | [59]
DBN, Convolutional RBM | Multiple sclerosis | MRI | DBM along with convolutional RBM layers to efficiently train DBMs to detect morphological changes in the brain in normal as well as disease conditions | [60]
DBN | Huntington's disease and Schizophrenia | MRI | Evaluation of DBN to estimate brain networks in neurocognitive disorders like Huntington's disease and Schizophrenia | [61]
DBN | AD/MCI/HC classification | MRI | A group of voting schemes clubbed using an SVM to better classify AD and MCI from the brain's 3D gray matter images | [63]
DBN | Left ventricle segmentation | Ultrasound | A DBN-assisted system, exploiting non-rigid registration, landmarks, and patches to maneuver multi-atlas segmentation | [64, 65]
DBN | Schizophrenia/NH classification | MRI | Characterizing differences in morphology of various brain regions in schizophrenia using DBN and supervised fine-tuning | [62]
DBN | Lesion classification | Ultrasound | Training DBN to extract features from prostate ultrasonography images to classify benign and malignant lesions | [67]
DBN | Left ventricle segmentation | MRI | The combination of DBN and level set method to yield automated segmentation of the left ventricle from cardiac cine MRI | [66]
DBN | Autism spectrum disorders classification | rs-fMRI, sMRI | Classifies Autism spectrum disorders (ASDs) in children using rs-fMRI and sMRI data based on Random Neural Network clustering | [68]
DBN | Bone disease prediction | Multi-modal records | Risk factor analysis and osteoporosis prediction based on the heterogeneous electronic health records (EHRs) | [85]
DWT-DBN | Glioblastoma tumor detection | MRI | Discrete Wavelet Transform (DWT) is combined with DBN to detect Glioblastoma tumor in MRI | [122]
which are based on training a classifier for automatic segmentation of the left ventricle of the heart from ultrasound sequence data at the same time. Ngo et al. (2017) [66] introduced a combination of a DBN and a level set model to yield automated segmentation of the left ventricle from cardiac cine MRI. A DBN has also been trained to classify benign and malignant lesions from prostate ultrasonography images [67], while another work [68] classifies Autism spectrum disorders (ASDs) in children using rs-fMRI and sMRI data based on Random Neural Network clustering. Li et al. (2014) [85] developed a risk analysis framework for osteoporosis prediction based on heterogeneous electronic health records (EHRs). To improve the classification accuracy, the discrete wavelet transform (DWT) was combined with a DBN and applied to detect Glioblastoma tumors from MRI [122]. A summary of DBN applications for medical image analysis is presented in Table 4.

3.4. Deep Boltzmann Machine

Deep Boltzmann Machine (DBM) is a robust deep learning model proposed by Salakhutdinov et al. (2009) [69] and Salakhutdinov et al. (2012) [70]. They stacked multiple RBMs in a hierarchical manner to handle ambiguous input robustly. Fig. (4(C)) represents the architecture of a DBM as a composite model of RBMs, which clearly shows how a DBM differs from a DBN. Unlike a DBN, a DBM forms an undirected generative model combining information from both lower and upper layers, which improves the representation power of DBMs. The layer-wise greedy training algorithm for DBMs [71, 72] is obtained by modifying the procedure used for DBNs.

Salakhutdinov et al. (2015) [71] and Dinggang et al. (2017) [73] presented a three-layer DBM where the DBM learns the parameters θ = {w1, w2}. Given the values of the neighboring layer(s), the probabilities of the visible and hidden units are computed using a logistic sigmoid function. The derivative of the log-likelihood of the observation (V) with respect to the model parameters (θ) is computed as,

$$\frac{\partial}{\partial w^{l}} \ln P(V; \theta) = E_{data}\left[ h^{l-1} (h^{l})^{\top} \right] - E_{model}\left[ h^{l-1} (h^{l})^{\top} \right], \quad l \in \{1, 2\}, \; h^{0} \equiv V \qquad (9)$$

where E_data[.] denotes the data-dependent statistics obtained from the visible units and E_model[.] denotes the data-independent statistics obtained from the model.

Compared with the preceding deep networks (e.g., RBM or DBN), the deep Boltzmann machine has not gained much attention from the research communities in the area of medical image analysis. Still, given the strong modeling abilities of DBMs even with a small amount of training data, the DBM might get its due place in the future. Suk et al. (2014) [74] proposed a high-level latent and shared feature representation that fuses multimodal images from MRI and PET using a DBM, and evaluated it on the binary classification of Alzheimer's disease (AD) and Mild Cognitive Impairment (MCI). Cao et al. (2014) [75] developed a deep DBM for a next-generation medical image retrieval system as an alternative to content-based image retrieval (CBIR), which learned a joint density from the multimodal learning model to infer missing modalities. Recently, Wu et al. (2018) [76] developed a frame-by-frame heart segmentation method employing a three-layered DBM on cine MRI images to find local and global features of heart contours. A summary of the DBM applications for medical image analysis is shown in Table 5.

3.5. Generative Adversarial Network (GAN)

Generative Adversarial Network (GAN) [77] is one of the recent promising techniques for building flexible deep generative unsupervised architectures. Goodfellow et al. (2014) [77] proposed two models, a generative model G and a discriminative model D, where G captures the data distribution (p_g) over real data t, and D estimates the probability of a sample coming from the training data (m) rather than from G. During training iterations, the generator and the discriminator, both trained by backpropagation, compete with each other. The training procedure for G is to maximize the probability of D making a mistake. This framework functions like a mini-max two-player game, whose value function V(G, D) is given by:

$$\min_{G} \max_{D} V(G, D) = E_{t \sim p_{data}(t)}\left[ \log D(t) \right] + E_{z \sim p_{z}(z)}\left[ \log\left( 1 - D(G(z)) \right) \right] \qquad (10)$$

where D(t) represents the probability that t came from the training data m rather than from G, p_data is the distribution of real-world data, and z is the input noise of the generator drawn from a prior p_z. The model is considered stable and optimal when p_g = p_data. A typical architecture of GAN is depicted in Fig. (4(D)). In fact, these two adversaries, the Generator and the Discriminator, continuously battle during the training process. GAN has been applied to generate samples of photorealistic images to visualize new designs.
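As a minimal sketch of the mini-max game described in Section 3.5, the following Python snippet evaluates the GAN value function on a small discrete sample space. It assumes the standard formulation of Goodfellow et al. [77], rewritten as expectations over p_data and p_g; the three-outcome distributions and the function names are illustrative assumptions and are not taken from any of the reviewed works.

```python
import math


def value_function(p_data, p_g, d):
    """V(G, D) = E_{t~p_data}[log D(t)] + E_{t~p_g}[log(1 - D(t))].

    All three arguments are lists over the same finite set of outcomes;
    d[i] is the discriminator output D(t_i) and must lie strictly in (0, 1).
    """
    return sum(
        pd * math.log(dt) + pg * math.log(1.0 - dt)
        for pd, pg, dt in zip(p_data, p_g, d)
    )


def optimal_discriminator(p_data, p_g):
    """Pointwise maximiser of V for a fixed generator:
    D*(t) = p_data(t) / (p_data(t) + p_g(t))."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]


# Hypothetical toy distributions over three outcomes.
p_data = [0.5, 0.3, 0.2]
p_g_bad = [0.2, 0.3, 0.5]   # generator that mismatches the data
p_g_good = list(p_data)     # generator that matches the data exactly

v_bad = value_function(p_data, p_g_bad, optimal_discriminator(p_data, p_g_bad))
v_good = value_function(p_data, p_g_good, optimal_discriminator(p_data, p_g_good))

print("V with mismatched generator:", v_bad)
print("V with matched generator:  ", v_good)
```

For a fixed generator the best discriminator is D*(t) = p_data(t) / (p_data(t) + p_g(t)). When p_g = p_data it outputs 1/2 everywhere and the value function equals -log 4 (up to floating-point rounding), which is the lowest value any generator can reach against its optimal discriminator.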
Table 5. Applications of DBMs for medical image analysis.

| Method | Application | Modality | Description | Ref. |
|---|---|---|---|---|
| DBM | Heart motion tracking | MRI | Deep Boltzmann Machine-Driven level set for heart segmentation during radiation therapy of cancer patients on cine MRI images | [76] |
| DBM/RBM | AD/MCI/HC classification | MRI/PET | DBMs on multimodal images from MRI and PET scans for disease classification | [74] |
| DBM | Medical image retrieval | Multi digital image | DBM based multi-model learning to learn joint density model | [75] |
| DBM | Cancers region classification | Hyperspectral image | Binary classification of cancers region using three-layer with backpropagation architecture | [83] |
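To make the DBM description in Section 3.4 more concrete, the sketch below computes the logistic-sigmoid conditional probabilities of a three-layer DBM (visible layer v, hidden layers h1 and h2) with parameters θ = {w1, w2}. It is an illustration under stated assumptions: bias terms are omitted, units are binary, and the weight values and function names are hypothetical. The point it shows is that the middle layer h1 receives input from both its lower and upper neighbours, which is what distinguishes a DBM from a DBN.

```python
import math


def sigmoid(x):
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + math.exp(-x))


def p_h1_given_v_h2(v, h2, w1, w2):
    """P(h1_j = 1 | v, h2): bottom-up input through w1 plus top-down input through w2.

    w1 has shape (len(v), n_h1) and w2 has shape (n_h1, len(h2)).
    """
    n_h1 = len(w2)
    return [
        sigmoid(
            sum(w1[i][j] * v[i] for i in range(len(v)))
            + sum(w2[j][k] * h2[k] for k in range(len(h2)))
        )
        for j in range(n_h1)
    ]


def p_v_given_h1(h1, w1):
    """P(v_i = 1 | h1): the visible layer only sees its single neighbour h1."""
    return [
        sigmoid(sum(w1[i][j] * h1[j] for j in range(len(h1))))
        for i in range(len(w1))
    ]


def p_h2_given_h1(h1, w2):
    """P(h2_k = 1 | h1): the top layer only sees its single neighbour h1."""
    return [
        sigmoid(sum(w2[j][k] * h1[j] for j in range(len(h1))))
        for k in range(len(w2[0]))
    ]


# Hypothetical toy model: 2 visible units, 2 units in h1, 1 unit in h2.
w1 = [[0.5, -0.5], [1.0, 1.0]]
w2 = [[2.0], [0.0]]
v = [1, 0]

with_top_down = p_h1_given_v_h2(v, [1], w1, w2)
bottom_up_only = p_h1_given_v_h2(v, [0], w1, w2)
print(with_top_down, bottom_up_only)
```

In this toy setting, switching the top unit on raises the activation probability of the first h1 unit (connected to h2 with weight 2.0) and leaves the second h1 unit unchanged (weight 0.0), illustrating how top-down information enters the inference.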
The recent applications of GAN produce promising results in many diverse fields of medical image analysis. Hu et al. (2017) [78] employed conditional generative adversarial networks (cGAN) to simulate ultrasound images at given 3D spatial locations, while Bi et al. (2017) [79] designed a Multi-channel GAN (M-GAN) to synthesize PET data, which improves the AUC of a PET-based CAD system. Bi et al. (2018) [80] used an improved GAN, which uses dual-path adversarial learning for Fully Convolutional Network (FCN) based image segmentation of the region of interest (ROI) without any medical image specification. Iqbal and Ali (2018) [81] and Canas et al. (2018) [82] proposed MI-GAN, which generates precise segmented images for the application of supervised learning of retinal images, retaining pathological quality. Wang et al. (2018) [83-87] proposed a 3D-conditional GAN that synthesizes high-quality brain region PET images from low-dose PET images for the treatment of MCI disease. In several applications of GAN and its variants, medical image generation or augmentation of various modalities such as CT, MRI, X-ray, and PET was carried out to maximize the size of the training dataset with realistic, high-quality images while preserving all deciding features in the synthetic images, which largely impacts the performance of segmenting the ROI or of the analysis [88-99]. Some further applications of cGAN include brain tumor segmentation from MRI [100], tissue segmentation from ultrasound [101], retinal image synthesis [98], and the effective generation of high-quality and realistic coronavirus disease 2019 (COVID-19) CT images [130]. Zhang et al. (2020) [127] proposed a task-driven GAN (TD-GAN) to achieve both synthesis and parsing simultaneously for unseen real X-ray images without requiring any annotations from the X-ray image domain. The TD-GAN model provides a promising average Dice score of 86%, which is a similar level of accuracy to supervised training. A few other applications of GAN include a recurrent GAN (RNN-GAN) to mitigate data imbalance problems in medical image semantic segmentation [128], a GAN with dual discrimination (DD-GAN) to improve the recognition in skin lesion segmentation [129], and so on.

There are several recent reviews on GAN and its variants in medical image analysis, including medical image generation [102], mammogram analysis [123], and medical imaging and analysis [124-126]. A summary of the GAN applications for medical image analysis is presented in Table 6.
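The segmentation results above are reported as a Dice score (for example, the average Dice of 86% for TD-GAN). As a small illustration of how this overlap measure is computed, the sketch below implements the usual definition Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks. The flat 0/1 list representation, the function name, and the toy masks are assumptions made for this example; they do not reproduce the evaluation code of any cited work.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1.

    Returns 1.0 when both masks are empty, since they then agree perfectly.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 8-pixel masks: predicted segmentation vs. reference annotation.
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
reference = [0, 1, 1, 0, 0, 0, 1, 1]

score = dice_coefficient(predicted, reference)
print(f"Dice = {score:.2f}")
```

Here three pixels overlap and each mask contains four foreground pixels, so the score is 2 × 3 / (4 + 4) = 0.75. A score of 1 means perfect overlap and 0 means no overlap.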
Table 6. Applications of GAN for medical image analysis.

| Method | Application | Modality | Description | Ref. |
|---|---|---|---|---|
| GAN | Synthesis of retinal images | Retinal | MI-GAN generates precise segmented images for the application of supervised learning of retinal images | [81] |
| GAN | Chest X-ray | X-ray | GAN used to produce photorealistic images which retain pathological quality | [82] |
| Dual GAN-FCN | Segmentation of regions of interest (ROIs) | - | Improve GAN using dual-path adversarial learning for Fully Convolutional Network-based image segmentation | [80] |
| GAN | Treatment of lymphomas and lung cancer | PET | Multi-channel GANs are used to synthesize PET data | [79] |
| GAN | Simulation of B-mode ultrasound images | Ultrasound | Conditional GANs are used to simulate ultrasound images at given 3D spatial locations | [78] |
| GAN | Reconstruction | MRI | Least-squares GAN used to model the low dimension image to high-quality images | [86] |
| GAN | Mild cognitive impairment treatment | PET | 3D-conditional GAN for generating high-quality PET image from the low dose PET image | [87] |
| GAN | Reconstruction | CT | GAN based denoising for improving the quality of the reconstructed image | [88] |
| GAN | Multiphase coronary angiography | CT | Cycle-consistent GAN based denoising of low-dose CT image to equivalent routine CT images | [89] |
| DCGAN | Liver lesion classification | CT | Deep convolutional GAN network used to generate a synthetic image of lesion region which improve the classification performance | [90] |
| DCGAN | Lung cancer diagnosis | CT | Utilize unsupervised learning to generate realistic fake lung nodules which easily discriminate between benign and malignant lung nodules | [91] |
| GAN | Brain MRI segmentation | MRI | Semi-supervised learning for segmenting labeled and unlabeled 3D multimodal image | [92] |
| GAN | Chest pathology classification | X-ray | GAN framework used for augmentation of an artificial X-ray image | [93] |
| GAN | Chest X-ray classification | X-ray | GAN based semi-supervised learning methods used for classification of cardiac abnormalities in chest X-ray | [94] |
| LAP-GAN | Skin lesion synthesis | Images (ISIC 2017 dataset) | Laplacian GAN (LAP-GAN) framework utilize to generate high-resolution realistic skin lesion image | [95] |
| GAN | Retinal vessel segmentation | Retinal | An unsupervised learning framework allows blood vessel segmentation in retinal fundus images under extremely low annotation | [96] |
| GAN + VAE | Retinal image synthesis | Retinal | GAN and autoencoder based framework for generating synthetic retinal image and perform precise retinal vessel segmentation | [97] |
| cGAN | Retinal image synthesis | Retinal | Synthesizing realistic retinal image using conditional GAN (cGAN) and preserving the same structured annotation | [98] |
| GAN | Brain tumor segmentation | MRI | Generating synthetic brain tumors images to achieve precise segmentation of ROI | [99] |
| cGAN | Brain tumor segmentation | MRI | GAN framework used for MRI brain tumor image augmentation to increase the size of training data | [100] |
| cGAN | Tissue segmentation | Ultrasound | GAN approach for synthetic ultrasound image generation and simulation of pathological ultrasound images to preserve the intensities regarding specific tissue | [101] |
| cGAN | Chest CT synthesis | CT | cGAN applied to effectively generate high-quality and realistic COVID-19 CT images | [130] |
| TD-GAN | Image segmentation | X-ray | A task-driven GAN (TD-GAN) was built to achieve both synthesis and parsing simultaneously for unseen real X-ray images without any annotations by the X-ray radiologists | [127] |
| RNN-GAN | Image segmentation | Images (ACDC-2017, HVSMR-2016, LiTS-2017 datasets) | A recurrent GAN (RNN-GAN) was proposed to mitigate the data imbalance problems in medical image semantic segmentation | [128] |
| DD-GAN | Skin lesion segmentation | Images (ISIC Skin Lesion Challenge Datasets 2017 and 2018) | A dual discrimination GAN (DD-GAN) was proposed to improve the recognition in skin lesion segmentation | [129] |
Table 7. Strengths and weaknesses of unsupervised deep learning models.

| Unsupervised Models | Strength | Weakness |
|---|---|---|
| Autoencoders (AEs) and Stacked autoencoder | AEs minimize the reconstruction error between input and output values for both linear and non-linear transformations, whereas Stacked AEs extend the benefits of a deep network | Having only a few hidden layers forces the network to learn a compressed representation. A simple AE is very prone to identity learning |
| Denoising autoencoder (DAE) | DAEs are capable of reconstructing a clean input from a noisy or corrupted version of it and address the risk of learning the identity function | The problem of over-generalization or fooling during training |
| Sparse autoencoder (SAE) | SAEs use a simple regularization method where the introduced weight of the autoencoder is penalized in the cost function, which controls the complexity of the model and prevents over-fitting | - |
| Convolutional autoencoder (CAEs) | CAEs are highly scalable and train the model directly in an end-to-end manner without pre-training, which removes the time required by layer-wise pre-training in Stacked AEs | Training is very complex and difficult |
| Variational autoencoder (VAE) | VAE inherits the architecture of the autoencoder with the extension of latent representation learning | Due to injected noise, imperfect reconstruction, and the use of direct mean squared error, the generated images are likely to be more blurry |
| Contractive autoencoder (CAE) | Contractive AE explicitly encourages robustness of representation by adding an explicit regularizer in the objective function | The trade-off between reconstruction error and a regularization term |
| Restricted Boltzmann Machines (RBMs) | RBMs are more efficient than a simple Boltzmann machine and learn a probability distribution over the given input | Only one layer of hidden units and no connection between hidden units |
| Deep Belief Networks (DBN) | A DBN consists of multiple RBMs stacked hierarchically and trained in a greedy manner, so it extends the strength of RBMs for better generalization | During learning, the network is more vulnerable to generating the observed data |
| Deep Boltzmann machine (DBM) | Unlike DBNs, DBMs consist of symmetrically connected binary units forming an undirected generative model that can learn complex input tasks, which improves the representation power of DBMs | Only approximate maximum likelihood learning, which limits the speed of performance and functionality |
| Generative Adversarial Network (GAN) | GAN is a flexible deep generative unsupervised architecture; it produces a deeper understanding of data by generating new fake data and learning through it | The biggest problem for GAN is to learn the implicit structure in object classes and data structures, since 2D or 3D image data have complex variation. Also, the difficulty of training, evaluation of generated samples, and theoretical limitations (Suganuma et al., 2018) |
Table 8. List of software tools/packages for unsupervised learning models.

| S. No. | Tools/Packages Name | Models/Methods | Description | Language/Technology | URL |
|---|---|---|---|---|---|
| 1 | deeplearning4j | Autoencoders | Deep learning APIs for Java having an implementation of several deep learning techniques | Java | https://deeplearning4j.org/ |
| 2 | unsup under torch7 | Autoencoder, etc. | A scientific computing framework with good support for machine learning algorithms that puts GPUs first. Unsup package provides few unsupervised learning algorithms such as autoencoders, clustering, etc. | Lua | https://github.com/torch/torch7 |
| 3 | DeepPy | Autoencoders | MIT licensed deep learning framework that runs on CPU or GPUs and implements autoencoders, in addition to other supervised learning algorithms | Python | https://github.com/andersbll/deeppy, http://andersbll.github.io/deeppy-website/ |
| 4 | SAENET.train | Stacked autoencoder | Build a stacked autoencoder in the R environment for pre-training of feed-forward NN and dimension reduction of features | R package | https://rdrr.io/cran/SAENET/man/SAENET.train.html |
| 5 | kdsb17 | Convolutional autoencoder | Gaussian Mixture Convolutional Autoencoder (GMCAE) used for CT lung scan using Keras/TensorFlow | Python, Keras, Tensorflow-gpu | https://github.com/alegonz/kdsb17 |
| 6 | autoencoder | Deep autoencoder | Training a deep autoencoder for MNIST digits datasets | Matlab | http://www.cs.toronto.edu/~hinton/code/Autoencoder_Code.tar |
| 7 | H2O | Deep autoencoder | Parallelized implementations of many supervised and unsupervised machine learning algorithms, including GLM, GBM, RF, DNN, K-Means, PCA, Deep AE, etc. | R package | https://cran.r-project.org/web/packages/h2o/ |
| 8 | DIVA | Domain invariant variational autoencoder (DIVA) | Implementation of DIVA, an extension of variational autoencoder | Python, Scikit-image, Scikit-learn | https://github.com/AMLab-Amsterdam/DIVAs |
| 9 | Dbn | DBN | Deep belief network pre-train in an unsupervised manner with stacks of RBM, which in return fine-tuned DBN | R package | https://rdrr.io/github/TimoMatzen/RBM/src/R/DBN.R |
| 10 | darch | DBN, RBM | Restricted Boltzmann machine, deep belief network implementation | R package | https://github.com/maddin79/darch |
| 11 | deepnet | DBN, RBM, deep autoencoders | Implementation of RBM, DBN, deep-stacked autoencoders | R package | https://cran.r-project.org/web/packages/deepnet/ |
| 12 | Vulpes | DBN | DBN and other deep learning implementation in F# | Visual Studio | https://github.com/fsprojects/Vulpes |
| 13 | pydbm | DBM/RBM | RBM and DBM are implemented in Python for pre-learning or dimension reduction | Python | https://pypi.org/project/pydbm/ |
| 14 | RBM | RBM | Simple RBM implementation in Python | Python | https://github.com/echen/restricted-boltzmann-machines |
| 15 | xRBM | RBM and its variants | Implementation of RBM and its variants in Tensorflow | Python | https://github.com/omimo/xRBM |
| 16 | DCGAN.torch | GAN | Unsupervised representation learning using Deep Convolutional GAN | Lua | https://github.com/soumith/dcgan.torch |
| 17 | pix2pix | GAN | Conditional Adversarial Networks for Image-to-image translation synthesizing from the image | Linux Shell Script | https://github.com/phillipi/pix2pix |
| 18 | ebgan | GAN | Energy-based GAN equivalent to probabilistic GANs produces high-resolution images | Python | https://github.com/eriklindernoren/PyTorch-GAN/tree/master/implementations/ebgan |
[Abbreviations: ADNI: Alzheimer’s Disease Neuroimaging Initiative; ABIDE: Autism Brain Imaging Data Exchange; DICOM: Digital Imaging and Communications in Medicine; BCDR: Breast Cancer Digital Repository; CIVM: Center for in vivo Microscopy; DDSM: Digital Database for Screening Mammography; DRIVE: Digital Retinal Images for Vessel Extraction; IDA: Image & Data Archive; ISDIS: International Society for Digital Imaging of the Skin; NBIA: National Biomedical Imaging Archive; OASIS: Open Access Series of Imaging Studies; TCGA: The Cancer Genome Atlas; TCIA: The Cancer Imaging Archive].
Table 9. List of benchmark medical image datasets.

| S. No. | Data Set | Modalities | Medical Condition | Accessibility | URL |
|---|---|---|---|---|---|
| 1 | ABIDE | MRI | Autism spectrum disorder | Open access | http://fcon_1000.projects.nitrc.org/indi/abide/ |
| 2 | ADNI | MRI | Alzheimer’s disease | Paid | http://adni.loni.usc.edu/data-samples/access-data/ |
| 3 | BCDR | Mammography | Breast cancer | Open access | https://bcdr.eu/ |
| 4 | CIVM | 3D-MRM | Histology of the Embryonic and Neonatal Mouse | Limited access | http://www.civm.duhs.duke.edu/devatlas/ |
| 5 | DDSM | Mammography | Breast cancer | Open access | http://marathon.csee.usf.edu/Mammography/Database.html |
| 6 | DermNet | Photo dermatology | A huge database of various skin diseases | Limited access | http://www.dermnet.com/ |
| 7 | DICOM | MRI, CT, etc. | A variety of medical images, videos, and signal files | Open access | https://www.dicomlibrary.com |
| 8 | DRIVE | 2D color images of the retina | Retinal blood vessel segmentation to study diabetic retinopathy | Open access | http://www.isi.uu.nl/Research/Databases/DRIVE/download.php |
| 9 | IDA | - | An online resource for neuroscience images | Open access | https://ida.loni.usc.edu/ |
| 10 | ISDIS | Dermoscopy, telemedicine, spectroscopy, etc. | Skin disease | Paid | https://isdis.org/ |
| 11 | MedPix | Variety of imaging data | Online database of medical images, teaching cases, and clinical topics | Open access | https://medpix.nlm.nih.gov |
| 12 | NBIA | CT, PET, MRI, etc. | A database of the National Cancer Institute providing medical images of various conditions and anatomical sites | Limited/open access | https://imaging.nci.nih.gov/ |
| 13 | OASIS | MRI and PET | Normal aging or mild to moderate Alzheimer's Disease | Open access | http://www.oasis-brains.org/ |
| 14 | TCIA | Collection of MRI, CT, etc. | A multimodal image archive for various types of cancer | Limited/open access | http://www.cancerimagingarchive.net/ |
| 15 | TCGA | Histopathology slide images | Histopathology slide images from sample portions of various types of cancers | Open | https://cancergenome.nih.gov/ |
| 16 | PROMISE12 | MRI | Prostate MR Image Segmentation | Open | https://promise12.grand-challenge.org/ |
4. COMPARISON AMONG UNSUPERVISED DEEP LEARNING MODELS

So far, we have discussed the working and applications of various unsupervised deep learning models in the field of medical image analysis. Each model has its own strengths and weaknesses, as presented in Table 7.

5. LIST OF SOFTWARE TOOLS/PACKAGES AND BENCHMARK DATASETS

A plethora of software tools and packages implementing the unsupervised learning models discussed in this paper has been developed and made available to the research community and data analysts. These tools and packages are readily usable by researchers and are available in various programming language environments, including R, Python, Lua, Java, Matlab, and Linux shell script. In the last few years, several image datasets have been made available to the research community and are used as benchmark datasets. Some of the tools/packages and medical image benchmark datasets are listed in Tables 8 and 9, respectively.

6. DISCUSSION, OPPORTUNITIES, AND CHALLENGES

Medical imaging and diagnostic techniques are the most widely used approaches for the early detection, diagnosis, and treatment of complex diseases. After significant advancements in machine learning and deep learning (both supervised and unsupervised), there is a paradigm shift from the manual interpretation of medical images by human experts such as radiologists and physicians to automated analysis and interpretation, called computer-assisted diagnosis (CAD). As unsupervised learning algorithms can derive insights directly from data, we can use them for data-driven decision making. They are often regarded as more robust and are sometimes described as the holy grail of learning and classification problems. Furthermore, these models are also utilized for other important tasks, including compression, dimensionality reduction, denoising, super-resolution, and some degree of decision making.

Unsupervised learning and CAD are both in their infancy. Researchers and practitioners have many opportunities in this area. Some of them are: (i) it allows exploratory analysis of data, (ii) it allows preprocessing for a supervised algorithm when it is used to generate a new representation of data, which improves learning accuracy and reduces memory and time overheads, (iii) the recent development of cloud computing, GPU-based computing, and parallel computing, together with their falling cost, allows big data processing, image analysis, and execution of complex deep learning algorithms very easily, and (iv) it provides a seamless and secure smart healthcare system at reasonable bandwidth speed [105-108]. Ghoneim et al. contributed a system for medical image forgery detection using a support vector machine-based classifier [103].

Some of the challenges and research directions are:

(i) Difficult to evaluate whether the algorithm has learned anything useful: Due to the lack of labels in unsupervised learning, it is nearly impossible to quantify its accuracy. For instance, how can we assess whether the K-means algorithm found the right clusters? In this direction, there is a need to develop algorithms that can give an objective performance measure in unsupervised learning.

(ii) Difficult to select the right algorithm and hardware: Selection of the right algorithm for a particular type of medical image analysis is not a trivial task because the performance of an algorithm is highly dependent on the type of data. Similarly, hardware requirements also vary from problem to problem.

(iii) Will unsupervised learning work for me?: It is a frequently asked question, but its answer depends on the problem at hand. In the image segmentation problem, the clustering algorithm will only work if the images fit into natural groups.

(iv) Not a common choice for medical image analysis: Unsupervised learning is not a common choice for medical image analysis. However, the literature reveals that these models (autoencoders and their variants, DBN, RBM, etc.) are mostly used to learn a hierarchy of features for classification tasks. It is expected that unsupervised learning will play a pivotal role in solving complex medical imaging problems, as it is not only scalable to a large amount of unlabeled data but also suitable for performing unsupervised and supervised learning tasks simultaneously (Yi et al., 2018).

(v) Development of patient-specific anatomical and organ model: Anatomical skeletons play a crucial role in understanding diseases and pathology. Patient-specific anatomical models are frequently used for surgery and interventions. They help to plan procedures, perform measurements for device sizing, and predict post-surgery complications. Hence, algorithms need to be developed to construct patient-specific anatomical and organ models from medical images.

(vi) Heterogeneous image data: In the last two to three decades, more emphasis was given to well-defined medical image analysis applications, where the developed algorithms were validated on well-defined types of images with well-defined acquisition protocols. Algorithms that can work on more heterogeneous data are now required.

(vii) Semantic segmentation of images: Semantic segmentation is the task of complete scene understanding, leading to knowledge inference from imagery. Scene understanding is a core computer vision problem with several applications, including human-computer interaction, self-driving vehicles, virtual reality, and medical image analysis. For instance, MRI prostate segmentation using a deep supervised CNN and boundary-weighted neural network-based segmentation were carried out earlier [108, 110]. In another study, a bidirectional convolution targets all slices of 3D volume data instead of an individual slice [104]. The semantic segmentation of medical images with acceptable accuracy is still challenging.

(viii) Medical video transmission: Enabling 3D video in recently adopted telemedicine and U-healthcare applications results in more natural viewing conditions and better diagnosis. Remote surgery can also benefit from 3D video because of the additional dimension of depth. However, it is crucial to transmit data-hungry 3D medical video streams in real time through limited-bandwidth channels. Hence, efficient encoding and decoding techniques for 3D video data transmission are required.

(ix) Need extensive inter-organizational collaborations: Inter-professional and inter-organizational collaboration is important for better functioning of the healthcare system, eliminating some of the pitfalls such as limited resources, lack of expertise, aging populations, and combating chronic diseases (Karam et al., 2017). Medical image-based
CAD needs extensive inter-organizational collaborations Unsupervised learning algorithms derive insights direct-
among doctors, radiologists, medical image analysts, and ly from data and use them for data-driven decision making.
computational data analysts. Unsupervised models are more robust and they can be util-
ized as the holy grail of learning and classification
(x) Need to capitalize on the big medical imaging mar-
problems. These models are also used for other tasks includ-
ket: According to the IHS Market report (https://technolo-
ing compression, dimensionality reduction, denoising, su-
gy.ihs.com.), the medical imaging market has a total global
per-resolution, and some degree of decision making. There-
revenue of $21.2 billion in 2016, which is forecasted to
fore, it is better to construct a model without knowing what
touch $24.0 billion by 2020. According to WHO, the global tasks will be at hand and whether or not we would use repre-
population will rise from 12% to 22% from 2015 to 2050. sentation (or model). All in all, we can think of unsuper-
Population aging leads to an increased rate of chronic diseas- vised learning as a preparation (pre-processing) step for su-
es globally, and hence there is a need to capitalize on a big pervised learning tasks, where unsupervised learning of rep-
medical imaging market worldwide. resentation may allow better generalization of a classifier.
(xi) Black-box and its acceptance by health professio-
nals: Machine learning algorithms are boons which solve CONSENT FOR PUBLICATION
the problems earlier thought to be unsolvable. However, it Not applicable.
suffers from being “black-box”, i.e., how output arrives
from the model is very complicated to interpret. Particularly, FUNDING
deep learning models are almost non-interpretable but are
still used for complex medical image analysis. Hence, its ac- None.
ceptance by health professionals is still questionable.
CONFLICT OF INTEREST
(xii) Will technology replace radiologists? For the pro-
The authors declare no conflicts of interest, financial or
cessing of medical images, deep learning algorithms help to
otherwise.
select and extract important features and construct new ones,
leading to a new representation of images not seen before. ACKNOWLEDGEMENTS
For the image interpretation side, deep learning helps to iden-
tify, classify, and quantify disease patterns. It also allows the None.
measurement of predictive targets, makes predictive models,
and so on. So, will technology “replace radiologists”, or mi- REFERENCES
grate to “virtual radiologist assistant” in the near future? [1] Wani N, Raza K. Multiple Kernel-Learning Approach for Medical
Hence, the following slogan is quite relevant in this context: Image Analysis. 2018.
“Embrace it, it will make you stronger; reject it, it may make http://dx.doi.org/10.1016/B978-0-12-813087-2.00002-6
you irrelevant”.

In a nutshell, unsupervised learning is an open topic where researchers can make contributions by developing new unsupervised methods to train the network (e.g., solving a puzzle, generating image patterns, or comparing image patches). Rethinking how to create a good unsupervised feature representation (e.g., what is the object and what is the background?) is closely analogous to how the human visual system works.

CONCLUSION

Medical imaging is one of the important techniques for the early detection, diagnosis, and treatment of complex diseases. Interpretation of medical images is usually performed by human experts such as radiologists and physicians. Following the success of machine learning techniques, including deep learning, and the availability of cheap computing infrastructure through cloud computing, there has been a paradigm shift in the field of computer-assisted diagnosis (CAD). Both supervised and unsupervised machine learning approaches are widely applied in medical image analysis, each with its own pros and cons. Because human supervision is not always available and can be inadequate or biased, unsupervised learning algorithms, including their deep architectures, offer considerable promise for the future.
[12] Makhzani A, Frey B. k-Sparse Autoencoders. arxiv: preprint for nuclei detection on breast cancer histopathology images. IEEE
2013. Trans Med Imaging 2016; 35(1): 119-30.
[13] Li F, Qiao H, Zhang B. Discriminatively boosted image clustering http://dx.doi.org/10.1109/TMI.2015.2458702 PMID: 26208307
with fully convolutional auto-encoders. Pattern Recognit 2018; [34] Janowczyk A, Basavanhally A, Madabhushi A. Stain Normaliza-
83: 161-73. tion using Sparse AutoEncoders (StaNoSA): Application to digital
http://dx.doi.org/10.1016/j.patcog.2018.05.019 pathology. Comput Med Imaging Graph 2017; 57: 50-61.
[14] Guo X, Liu X, Zhu E, Yin J. Deep clustering with convolutional http://dx.doi.org/10.1016/j.compmedimag.2016.05.003 PMID:
autoencoders. International Conference on Neural Information Pro- 27373749
cessing. 373-82. [35] Hatipoglu N, Bilgin G. Cell segmentation in histopathological im-
[15] Kingma and Max Welling. Auto-encoding variationalbayes. ages with deep learning algorithms by utilizing spatial relation-
CoRRabs 2013. ships. Med Biol Eng Comput 2017; 55(10): 1829-48.
[16] Partaourides H, Chatzis SP. Asymmetric deep generative models. http://dx.doi.org/10.1007/s11517-017-1630-1 PMID: 28247185
Neurocomputing 2017; 241: 90. [36] Avendi MR, Kheradvar A, Jafarkhani H. Automatic segmentation
http://dx.doi.org/10.1016/j.neucom.2017.02.028 of the right ventricle from cardiac MRI using a learning-based ap-
[17] Rifai S, Vincent P, Muller X, et al. Contractive auto-encoders: ex- proach. Magn Reson Med 2017; 78(6): 2439-48.
plicit invariance during feature extraction. Proceedings of the 28th http://dx.doi.org/10.1002/mrm.26631 PMID: 28205298
International Conference on International Conference on Machine [37] Su H, Xing F, Kong X, et al. Robust Cell Detection and Segmenta-
Learning (ICML 2011). 833-40. tion in Histopathological Images Using Sparse Reconstruction and
[18] Ballard DH. Modular Learning in Neural Networks.AAAI. 1987; Stacked Denoising Autoencoders. Lect Notes Comput Sci 2018;
pp. 279-84. 9351.
[19] Pinaya WHL, Sandra V, Rafael G-D, et al. Autoencoders Machine [38] Liu S, Liu S, Cai W, et al. Early diagnosis of Alzheimer’s disease
Learning Academic Press. 2020; pp. 193-208. with deep learning. IEEE Int Symp Biomed Imaging 2014;
[20] Zabalza J, Ren J, Zheng J, et al. Novel segmented stacked autoen- 1015-8.
coder for effective dimensionality reduction and feature extraction http://dx.doi.org/10.1109/ISBI.2014.6868045
in hyperspectral imaging. Neurocomputing 2016; 185: 1-10. [39] Cheng J-Z, Ni D, Chou Y-H, et al. Computer-aided diagnosis with
http://dx.doi.org/10.1016/j.neucom.2015.11.044 deep learning architecture: applications to breast lesions in US im-
[21] Goodfellow I, Lee H, Le Q, et al. Measuring invariances in deep ages and pulmonary nodules in CT scans. Sci Rep 2016; 6: 24454.
networks. Adv Neural Inf Process Syst 2009; 22: 646-54. http://dx.doi.org/10.1038/srep24454 PMID: 27079888
[22] Gallinari P, LeCun Y, Thiria S, et al. Memoires associative dis- [40] Miotto R, Li L, Kidd BA, Dudley JT. Deep patient: an unsuper-
tributes. Proceedings of COGNITIVA. vised representation to predict the future of patients from the elec-
[23] Vincent H, Larochelle Y. Extracting and composing robust fea- tronic health records. Sci Rep 2016; 6: 26094.
tures with denoising autoencoders. In: Cohen WW, McCallum A, http://dx.doi.org/10.1038/srep26094 PMID: 27185194
Roweis ST, Eds. Proceedings of the Twenty-fifth International [41] Cheng LZ. YefengZheng. Deep similarity learning for multimodal
Conference on Machine Learning (ICML’08). 1096-103. medical images. Comput Methods Biomech Biomed Eng Imaging
http://dx.doi.org/10.1145/1390156.1390294 Vis 2018; 6(3): 248-52.
[24] Suk H-I, Shen D. Deep learning-based feature representation for http://dx.doi.org/10.1080/21681163.2015.1135299
AD/MCI classification.Proceedings of the Medical Image Comput- [42] Huang H, Hu X, Zhao Y, et al. Modeling task fMRI data via deep
ing and Computer-Assisted Intervention. 8150: 583-90. convolutional autoencoder. IEEE Trans Med Imaging 2018;
[25] Suk H-I, Lee S-W, Shen D. Alzheimer’s Disease Neuroimaging 37(7): 1551-61.
Initiative. Latent feature representation with stacked auto-encoder http://dx.doi.org/10.1109/TMI.2017.2715285 PMID: 28641247
for AD/MCI diagnosis. Brain Struct Funct 2015; 220(2): 841-59. [43] Hosseini-Asl E, Gimelfarb G, El-Baz A. Alzheimer’s disease diag-
http://dx.doi.org/10.1007/s00429-013-0687-3 PMID: 24363140 nostics by a deeply supervised adaptable 3D convolutional net-
[26] Suk H-I, Wee C-Y, Lee S-W, Shen D. State-space model with work. arxiv 2016.
deep learning for functional dynamics estimation in resting-state [44] Hou L, Nguyen V, Kanevsky AB, et al. Sparse Autoencoder for
fMRI. Neuroimage 2016; 129: 292-307. Unsupervised Nucleus Detection and Representation in Histo-
http://dx.doi.org/10.1016/j.neuroimage.2016.01.005 PMID: pathology Images. Pattern Recognit 2019; 86: 188-200.
26774612 http://dx.doi.org/10.1016/j.patcog.2018.09.007 PMID: 30631215
[27] Zhu Y, Wang L, Liu M, et al. MRI-based prostate cancer detec- [45] Hinton G. A practical guide to training restricted boltzmann
tion with high-level representation and hierarchical classification. machines. Momentum 2010; 9(1): 926.
Med Phys 2017; 44(3): 1028-39. [46] Yoo Y, Brosch T, Traboulsee A, et al. Deep learning of image fea-
http://dx.doi.org/10.1002/mp.12116 PMID: 28107548 tures from unlabeled data for multiple sclerosis lesion segmenta-
[28] Kallenberg M, Petersen K, Nielsen M, et al. Unsupervised deep tion. International Workshop on Machine Learning in Medical
learning applied to breast density segmentation and mammograph- Imaging. 117-24.
ic risk scoring. IEEE Trans Med Imaging 2016; 35(5): 1322-31. http://dx.doi.org/10.1007/978-3-319-10581-9_15
http://dx.doi.org/10.1109/TMI.2016.2532122 PMID: 26915120 [47] Huang H, Hu X, Han J, et al. Latent source mining in FMRI data
[29] Payan A, Montana G. Predicting Alzheimer’s disease: a neuroi- via deep neural network. Proceedings of the IEEE Int Symp
maging study with 3D convolutional neural networks. arXiv Biomed Imaging. 638-41.
preprin 2015. http://dx.doi.org/10.1109/ISBI.2016.7493348
[30] Guo Y, Wu G, Commander L-A, et al. Segmenting hippocampus [48] Cai Y, Landis M, Laidley DT, Kornecki A, Lum A, Li S. Multi-
from infant brains by sparse patch matching with deep-learned fea- modal vertebrae recognition using Transformed Deep Convolu-
tures. International Conference on Medical Image Computing and tion Network. Comput Med Imaging Graph 2016; 51: 11-9.
Computer-Assisted Intervention. 308-15. http://dx.doi.org/10.1016/j.compmedimag.2016.02.002 PMID:
http://dx.doi.org/10.1007/978-3-319-10470-6_39 27104497
[31] Mansoor A, Cerrolaza JJ, Idrees R, et al. Deep learning guided par- [49] Jaumard-Hakoun A, Xu K, Roussel-Ragot P, et al. Tongue con-
titioned shape model for anterior visual pathway segmentation. tour extraction from ultrasound images based on deep neural net-
IEEE Trans Med Imaging 2016; 35(8): 1856-65. work. arxiv 2016.
http://dx.doi.org/10.1109/TMI.2016.2535222 PMID: 26930677 [50] Cao P, Liu X, Bao H, Yang J, Zhao D. Restricted Boltzmann
[32] Benou A, Veksler R, Friedman A, et al. De-noising of contrast-en- machines based oversampling and semi-supervised learning for
hanced MRI sequences by an ensemble of expert deep neural net- false positive reduction in breast CAD. Biomed Mater Eng 2015;
works.Deep Learning and Data Labeling for Medical Applica- 26 (Suppl. 1): S1541-7.
tions. Cham: Springer 2016; pp. 95-110. http://dx.doi.org/10.3233/BME-151453 PMID: 26405918
http://dx.doi.org/10.1007/978-3-319-46976-8_11 [51] Zhang Q, Xiao Y, Dai W, et al. Deep learning based classification
of breast tumors with shear-wave elastography. Ultrasonics 2016; 2018; 31(6): 895-903.
72: 150-7. http://dx.doi.org/10.1007/s10278-018-0093-8 PMID: 29736781
http://dx.doi.org/10.1016/j.ultras.2016.08.004 PMID: 27529139 [69] Salakhutdinov R, Hinton G. Deep Boltzmann machines.Artificial
[52] van Tulder G, de Bruijne M. Combining generative and discrimi- Intelligence and Statistics. PMLR 2009; pp. 448-55.
native representation learning for lung CT analysis with convolu- [70] Salakhutdinov R, Hinton G. An efficient learning procedure for
tional restricted boltzmann machines. IEEE Trans Med Imaging deep Boltzmann machines. Neural Comput 2012; 24(8):
2016; 35(5): 1262-72. 1967-2006.
http://dx.doi.org/10.1109/TMI.2016.2526687 PMID: 26886968 http://dx.doi.org/10.1162/NECO_a_00311 PMID: 22509963
[53] Mathews SM, Kambhamettu C, Barner KE. A novel application [71] Salakhutdinov R. Learning deep generative models. Annu Rev
of deep learning for single-lead ECG classification. Comput Biol Stat Appl 2015; 2: 361-85.
Med 2018; 99: 53-62. http://dx.doi.org/10.1146/annurev-statistics-010814-020120
http://dx.doi.org/10.1016/j.compbiomed.2018.05.013 PMID: [72] Goodfellow I, Mirza M, Courville A, Bengio Y. Multi-prediction
29886261 deep Boltzmann machines. Advances in Neural Information Pro-
[54] Pereira S, Meier R, McKinley R, et al. Enhancing interpretability cessing Systems 2013; 548-56.
of automatically extracted machine learning features: application [73] Dinggang S, Wu G. SukHeung-Il. Deep Learning in Medical Im-
to a RBM-Random Forest system on brain lesion segmentation. age Analysis. Annu Rev Biomed Eng 2017; •••: 19.
Med Image Anal 2018; 44: 228-44. [74] Suk H-I, Lee S-W, Shen D. Alzheimer’s Disease Neuroimaging
http://dx.doi.org/10.1016/j.media.2017.12.009 PMID: 29289703 Initiative. Hierarchical feature representation and multimodal fu-
[55] Nahid A-A, Mikaelian A, Kong Y. Histopathological breast-im- sion with deep learning for AD/MCI diagnosis. Neuroimage 2014;
age classification with restricted Boltzmann machine along with 101: 569-82.
backpropagation. Biomed Res (Aligarh) 2018; 29(10): 2068-77. http://dx.doi.org/10.1016/j.neuroimage.2014.06.077 PMID:
[56] Bengio Y. Learning Deep Architectures for AI. Found Trends 25042445
Mach Learn 2019; 2(1): 1-127. [75] Cao Y, Steffey S, He J, et al. Medical image retrieval: A multimo-
http://dx.doi.org/10.1561/2200000006 dal approach. Cancer Inform 2015; 13 (Suppl. 3): 125-36.
[57] Hinton GE, Dayan P, Frey BJ, Neal RM. The “wake-sleep” algo- PMID: 26309389
rithm for unsupervised neural networks. Science 1995; 268(5214): [76] Wu J, Ruan S, Mazur TR, et al. Heart motion tracking on cine
1158-61. MRI based on a deep Boltzmann machine-driven level set
http://dx.doi.org/10.1126/science.7761831 PMID: 7761831 method. Proceedings of IEEE Int Symp Biomed Imaging 2018;
[58] Lee H, Grosse R, Ranganath R, et al. Unsupervised learning of hi- 1153-6.
erarchicalrepresentations with convolutional deep belief networks. http://dx.doi.org/10.1109/ISBI.2018.8363775
Commun ACM 2011; 54(10): 95-103. [77] Goodfellow JP-A, Mirza M, Xu B, Warde-Farley D. Generative
http://dx.doi.org/10.1145/2001269.2001295 adversarial nets. Advances in Neural Information Processing Sys-
[59] Brosch T, Tam R. Manifold learning of brain MRIs by deep learn- tems 2014; 2672-80.
ing. Lect Notes Comput Sci 2013; 8150: 633-40. [78] Hu Y, Gibson E, Lee L-L, et al. Freehand ultrasound image simu-
http://dx.doi.org/10.1007/978-3-642-40763-5_78 lation with spatially-conditioned generative adversarial networks.
[60] Brosch T, Yoo Y, Li DKB, Traboulsee A, Tam R. Modeling the Lect Notes Comput Sci 2017; 10555: 105-15.
variability in brain morphology and lesion distribution in multiple http://dx.doi.org/10.1007/978-3-319-67564-0_11
sclerosis by deep learning. Lect Notes Comput Sci 2014; 8674: [79] Bi L, Kim J, Kumar A, et al. Synthesis of positron emission to-
462-9. mography (PET) images via multi-channel generative adversarial
http://dx.doi.org/10.1007/978-3-319-10470-6_58 networks (GANs). Lect Notes Comput Sci 2017; 10555: 43-51.
[61] Plis SM, Hjelm DR, Salakhutdinov R, et al. Deep learning for neu- http://dx.doi.org/10.1007/978-3-319-67564-0_5
roimaging: a validation study. Front Neurosci 2014; 8: 229. [80] Bi L, Feng D, Kim J. Dual-Path Adversarial Learning for Fully
http://dx.doi.org/10.3389/fnins.2014.00229 PMID: 25191215 Convolutional Network (FCN)-Based Medical Image Segmenta-
[62] Pinaya WHL, Gadelha A, Doyle OM, et al. Using deep belief net- tion. Vis Comput 2018; 34(6-8): 1043-52.
work modelling to characterize differences in brain morphometry http://dx.doi.org/10.1007/s00371-018-1519-5
in schizophrenia. Sci Rep 2016; 6: 38897. [81] Iqbal T, Ali H. Generative Adversarial Network for Medical Im-
http://dx.doi.org/10.1038/srep38897 PMID: 27941946 ages (MI-GAN). J Med Syst 2018; 42(11): 231.
[63] Ortiz A, Munilla J, Górriz JM, Ramírez J. Ensembles of deep http://dx.doi.org/10.1007/s10916-018-1072-9 PMID: 30315368
learning architectures for the early diagnosis of the Alzheimer’s [82] Canas K, Liu X, Ubiera B, et al. Scalable biomedical image synth-
disease. Int J Neural Syst 2016; 26(7) esis with GAN. ACM International Conference Proceeding Series.
http://dx.doi.org/10.1142/S0129065716500258 PMID: 27478060 [83] Jeyaraj P, Nadar ERS. Deep Boltzmann Machine Algorithm for
[64] Carneiro G, Nascimento JC, Freitas A. The segmentation of the Accurate Medical Image Analysis for Classification of Cancerous
left ventricle of the heart from ultrasound data using deep learning Region. Cognitive Computation and Systems 2019; 1(3): 85-90.
architectures and derivative-based search methods. IEEE Trans Im- http://dx.doi.org/10.1049/ccs.2019.0004
age Process 2012; 21(3): 968-82. [84] Lu N, Li T, Ren X, Miao H. A Deep Learning Scheme for Motor
http://dx.doi.org/10.1109/TIP.2011.2169273 PMID: 21947526 Imagery Classification based on Restricted Boltzmann Machines.
[65] Carneiro G, Nascimento JC. Combining multiple dynamic models IEEE Trans Neural Syst Rehabil Eng 2017; 25(6): 566-76.
and deep learning architectures for tracking the left ventricle endo- http://dx.doi.org/10.1109/TNSRE.2016.2601240 PMID:
cardium in ultrasound data. IEEE Trans Pattern Anal Mach Intell 27542114
2013; 35(11): 2592-607. [85] Li H, Li X, Ramanathan M, Zhang A. Identifying informative risk
http://dx.doi.org/10.1109/TPAMI.2013.96 PMID: 24051722 factors and predicting bone disease progression via deep belief net-
[66] Ngo TA, Lu Z, Carneiro G. Combining deep learning and level set works. Methods 2014; 69(3): 257-65.
for the automated segmentation of the left ventricle of the heart http://dx.doi.org/10.1016/j.ymeth.2014.06.011 PMID: 24979059
from cardiac cine magnetic resonance. Med Image Anal 2017; 35: [86] Mardani M, Gong E, Cheng JY, et al. Deep generative adversarial
159-71. neural networks for compressive sensing MRI. IEEE Trans Med
http://dx.doi.org/10.1016/j.media.2016.05.009 PMID: 27423113 Imaging 2019; 38(1): 167-79.
[67] Azizi S, Imani F, Ghavidel S, et al. Detection of prostate cancer http://dx.doi.org/10.1109/TMI.2018.2858752 PMID: 30040634
using temporal sequences of ultrasound data: a large clinical feasi- [87] Wang Y, Yu B, Wang L, et al. 3D conditional generative adver-
bility study. Int J CARS 2016; 11(6): 947-56. sarial networks for high-quality PET image estimation at low
http://dx.doi.org/10.1007/s11548-016-1395-2 PMID: 27059021 dose. Neuroimage 2018; 174: 550-62.
[68] Akhavan Aghdam M, Sharifi A, Pedram MM. Combination of rs- http://dx.doi.org/10.1016/j.neuroimage.2018.03.045 PMID:
fMRI and sMRI Data to Discriminate Autism Spectrum Disorders 29571715
in Young Children Using Deep Belief Network. J Digit Imaging [88] Liu Z, Bicer T, Kettimuthu R, Gursoy D, De Carlo F, Foster I. To-
moGAN: low-dose synchrotron x-ray tomography with generative http://dx.doi.org/10.4018/IJCAC.2018010108

adversarial networks: discussion. J Opt Soc Am A Opt Image Sci [107] Guo P, Evans A, Bhattacharya P. Nuclei segmentation for quantifi-
Vis 2020; 37(3): 422-34. cation of brain tumors in digital pathology images. Int J Softw Sci
http://dx.doi.org/10.1364/JOSAA.375595 PMID: 32118926 Comput Intell 2018; 10(2): •••.
[89] Kang E, Koo HJ, Yang DH, Seo JB, Ye JC. Cycle-consistent ad- http://dx.doi.org/10.4018/IJSSCI.2018040103
versarial denoising network for multiphase coronary CT angiogra- [108] Liu H, Guo Q, Wang G, Gupta BB, Zhang C. Medical image reso-
phy. Med Phys 2019; 46(2): 550-62. lution enhancement for healthcare using nonlocal self-similarity
http://dx.doi.org/10.1002/mp.13284 PMID: 30449055 and low-rank prior. Multimedia Tools Appl 2019; 78(7)
[90] Frid-Adar M, Diamant I, Klang E, et al. GAN-based synthetic http://dx.doi.org/10.1007/s11042-017-5277-6
medical image augmentation for increased CNN performance in [109] Zhu Q, Du B, Turkbey B, et al. Deeply-supervised CNN for pros-
liver lesion classification. Neurocomputing 2018; 321(10): tate segmentation 2017 International Joint Conference on Neural
321-31. Networks. 178-84.
http://dx.doi.org/10.1016/j.neucom.2018.09.013 http://dx.doi.org/10.1109/IJCNN.2017.7965852
[91] Chuquicusma MJM, Hussein S, Burt J, et al. How to fool radiolo- [110] Zhu Q, Du B, Yan P. Boundary-weighted domain adaptive neural
gists with generative adversarial networks?A visual turing test for network for prostate MR image segmentation. IEEE Trans Med
lung cancer diagnosis. Proceedings - IEEE Int Symp Biomed Imaging 2020; 39(3): 753-63.
Imaging 2018; 240-4. http://dx.doi.org/10.1109/TMI.2019.2935018 PMID: 31425022
[92] Mondal AK, Dolz J, Desrosiers C. Few-shot 3D Multi-modal Med- [111] Sital C, Brosch T, Tio D, Raaijmakers A, Weese J. 3D medical im-
ical Image Segmentation using Generative Adversarial Learning. age segmentation with labeled and unlabeled data using autoen-
arXiv preprint 2018. coders at the example of liver segmentation in CT images. arXiv
[93] Salehinejad H, Valaee S, Dowdell T, et al. Generalization of Deep preprint 2020.
Neural Networks for Chest Pathology Classification in X-Rays Us- [112] Larrazabal AJ, Martínez C, Glocker B, Ferrante E. Post-dae: Ana-
ing Generative Adversarial Networks. Proceeding IEEE Internatio- tomically plausible segmentation via post-processing with denois-
nal Conference on Acoustics, Speech and Signal Processing. ing autoencoders. IEEE Trans Med Imaging 2020; 39(12):
990-4. 3813-20.
http://dx.doi.org/10.1109/ICASSP.2018.8461430 http://dx.doi.org/10.1109/TMI.2020.3005297 PMID: 32746125
[94] Madani A, Moradi M, Karargyris A, et al. [113] Kazlouski S. Tuberculosis CT Image Analysis Using Image Fea-
[95] Baur C, Albarqouni S, Navab N. MelanoGANs : High Resolution tures Extracted by 3D Autoencoder. International Conference of
Skin Lesion Synthesis with GANs. arXiv preprint 2018. the Cross-Language Evaluation Forum for European Languages.
[96] Lahiri A, Jain V, Mondal A, et al. Retinal Vessel Segmentation 131-40.
Under Extreme Low Annotation: A Gan Based Semi-Supervised http://dx.doi.org/10.1007/978-3-030-58219-7_12
Approach. IEEE International Conference on Image Processing (I- [114] Ilse M, Tomczak JM, Louizos C, Welling M. Domain invariant
CIP). 418-22. variational autoencoders.Medical Imaging with Deep Learning
http://dx.doi.org/10.1109/ICIP40778.2020.9190882 2020; 322-48.
[97] Costa P, Galdran A, Meyer MI, et al. End-to-End Adversarial Reti- [115] Amin J, Sharif M, Gul N, et al. Brain Tumor Detection by Using
nal Image Synthesis. IEEE Trans Med Imaging 2018; 37(3): Stacked Autoencoders in Deep Learning. J Med Syst 2019; 44(2):
781-91. 32.
http://dx.doi.org/10.1109/TMI.2017.2759102 PMID: 28981409 http://dx.doi.org/10.1007/s10916-019-1483-2 PMID: 31848728
[98] Zhao H, Li H, Maurer-Stroh S, Cheng L. Synthesizing retinal and [116] Mendoza-Léon R, Puentes J, Uriza LF, Hernández Hoyos M. Sin-
neuronal images with generative adversarial nets. Med Image gle-slice Alzheimer’s disease classification and disease regional
Anal 2018; 49: 14-26. analysis with Supervised Switching Autoencoders. Comput Biol
http://dx.doi.org/10.1016/j.media.2018.07.001 PMID: 30007254 Med 2020; 116
[99] Shin HC, Tenenholtz NA, Rogers JK, et al. Medical image synthe- http://dx.doi.org/10.1016/j.compbiomed.2019.103527 PMID:
sis for data augmentation and anonymization using generative ad- 31765915
versarial networks. Lect Notes Comput Sci 2018; •••: 1-11. [117] Spatiotemporal Attention Autoencoder (STAAE) for ADHD Clas-
http://dx.doi.org/10.1007/978-3-030-00536-8_1 sification Dong Q, Qiang N, Lv J, Li X, Liu T, Li Q. Spatiotempo-
[100] Mok TCW, Chung ACS. Learning data augmentation for brain tu- ral Attention Autoencoder (STAAE) for ADHD Classifica-
mor segmentation with coarse-to-fine generative adversarial net- tion.Lect. Notes Comput. Sci2020; 12267.
works. Lect Notes Comput Sci 2019; 11383. http://dx.doi.org/10.1007/978-3-030-59728-3_50
http://dx.doi.org/10.1007/978-3-030-11723-8_7 [118] Hecht H, Sarhan MH, Popovici V. Disentangled Autoencoder for
[101] Tom F, Sheet D. Simulating patho-realistic ultrasound images us- Cross-Stain Feature Extraction in Pathology Image Analysis. Appl
ing deep generative networks with adversarial learning. Proceed- Sci (Basel) 2020; 10(18): 6427.
ings IEEE Int Symp Biomed Imaging. Washington DC. 2018; pp. http://dx.doi.org/10.3390/app10186427
1174-7. [119] Dong Q, Qiang N, Lv J, et al. Discovering Functional Brain Net-
http://dx.doi.org/10.1109/ISBI.2018.8363780 works with 3D Residual Autoencoder (ResAE). Lect Notes Com-
[102] Singh NK, Raza K. Medical Image Generation using Generative put Sci 2020; 12267.
Adversarial Networks. Stud Comput Intell http://dx.doi.org/10.1007/978-3-030-59728-3_49
[103] Ghoneim A, Muhammad G, Amin SU, et al. Medical Image [120] Adarsh R, Amarnageswarao G, Pandeeswari R, Deivalakshmi S.
Forgery Detection for Smart Healthcare. IEEE Commun Mag Dense Residual Convolutional Auto Encoder For Retinal Blood
2018; 56(4): 33-7. Vessels Segmentation. 2020 6th International Conference on Ad-
http://dx.doi.org/10.1109/MCOM.2018.1700817 vanced Computing and Communication Systems (ICACCS).
[104] Zhu Q. Exploiting interslice correlation for MRI prostate image 280-4.
segmentation, from recursive neural networks aspect. Complexity http://dx.doi.org/10.1109/ICACCS48705.2020.9074172
2018; (Feb): 2018. [121] Li D, Fu Z, Xu J. Stacked-autoencoder-based model for
http://dx.doi.org/10.1155/2018/4185279 COVID-19 diagnosis on CT images. Appl Intell 2020.
[105] Golea NE-H, Melkemi KE. ROI-based fragile watermarking for http://dx.doi.org/10.1007/s10489-020-02002-w
medical image tamper detection. International Journal of High-Per- [122] Reddy AVN, Krishna CP, Mallick PK, et al. Analyzing MRI
formance Computing and Networking 2019; 13(2): 199-210. scans to detect glioblastoma tumor using hybrid deep belief net-
http://dx.doi.org/10.1504/IJHPCN.2019.097508 works. J Big Data 2020; 7: 35.
[106] Dorgham O, Al-Rahamneh B, Ai-Hadidi M, Khatatneh KF, Almo- http://dx.doi.org/10.1186/s40537-020-00311-y
mani A. Enhancing the security of exchanging and storing DI- [123] Gopal A, Gandhimaruthian L, Ali J. Role of General Adversarial
COM medical images on the cloud. Int J Cloud Appl Comput Networks in Mammogram Analysis: A Review. Curr Med Imag-
2018; 8(1): •••. ing 2020; 16(7): 863-77.
http://dx.doi.org/10.2174/1573405614666191115102318 PMID: [127] Zhang Y, Miao S, Mansi T, Liao R. Unsupervised X-ray image
33059556 segmentation with task driven generative adversarial networks.
[124] Wolterink JM, Kamnitsas K, Ledig C, Išgum I. Generative adver- Med Image Anal 2020; 62
sarial networks and adversarial methods in biomedical image anal- http://dx.doi.org/10.1016/j.media.2020.101664 PMID: 32120268
ysis. arXiv preprint 2018. [128] Rezaei M, Yang H, Meinel C. Recurrent generative adversarial
network for learning imbalanced medical image semantic segmen-
[125] Yi X, Walia E, Babyn P. Generative adversarial network in medi- tation. Multimedia Tools Appl 2020; 79(21): 15329-48.
cal imaging: A review. Med Image Anal 2019; 58 http://dx.doi.org/10.1007/s11042-019-7305-1
http://dx.doi.org/10.1016/j.media.2019.101552 PMID: 31521965 [129] Lei B, Xia Z, Jiang F, et al. Skin lesion segmentation via genera-
[126] Kazeminia S, Baur C, Kuijper A, et al. GANs for medical image tive adversarial networks with dual discriminators. Med Image
analysis. Artif Intell Med 2020; 9 Anal 2020; 64
http://dx.doi.org/10.1016/j.artmed.2020.101938 http://dx.doi.org/10.1016/j.media.2020.101716 PMID: 32492581

1573-4056/21 $65.00+.00 © 2021 Bentham Science Publishers
