
Multimedia Tools and Applications

https://doi.org/10.1007/s11042-021-11066-w

1155T: ADVANCED MACHINE LEARNING ALGORITHMS FOR BIOMEDICAL DATA AND IMAGING

A novel deep learning framework for lung nodule detection in 3D CT images

Reza Majidpourkhoei 1 & Mehdi Alilou 1 & Kambiz Majidzadeh 1 & Amin Babazadehsangar 1

Received: 1 November 2019 / Revised: 28 April 2021 / Accepted: 11 May 2021

© Springer Science+Business Media, LLC, part of Springer Nature 2021

Abstract
Lung cancer is one of the deadliest cancers worldwide. One indication of lung cancer is the
presence of lung nodules, which can appear in isolation or attached to the lung walls. Early
detection of these nodules is crucial for saving patients' lives. Machine learning and image
processing techniques, generally embedded in computer-aided diagnosis (CAD) systems, can
help radiologists locate these nodules and assess their risk. Accordingly, in this paper we
present a framework for identifying pulmonary nodules in lung CT images, together with a
convolutional neural network (CNN) approach that automatically extracts features from lung
images and then classifies suspicious regions as either nodule or non-nodule objects. The
proposed model is based on the LeNet architecture, and a lightweight model is obtained after
a series of modifications. A subset of the public LIDC dataset comprising N = 7072 CT slices
with nodules of varying sizes (1 mm to 5 mm) is used to train and validate this approach.
The proposed framework automatically carries out all stages of lung segmentation as well as
the detection and classification of the existing nodules. Training and validation of the
network on a machine with a 2.4 GHz Core i5 processor, 8 GB of memory, and Intel Graphics
520 take approximately six hours, and the system achieves accuracy = 90.1%, sensitivity =
84.1%, and specificity = 91.7% for identifying nodules. Compared to other well-known CNN
architectures, the proposed model is agile (light and fast) and performs adequately, which
makes it suitable for real-time medical image analysis.

Keywords Computer-aided diagnosis system · Convolutional neural network · Deep learning · Lung nodules

* Mehdi Alilou
me.alilou@gmail.com

Extended author information available on the last page of the article



1 Introduction

Pulmonary nodules are small, often round lesions in their early state that might lead to lung
cancer if they are not detected early enough. Owing to the low contrast, tiny scale, and
position of the nodules, CT scans may show no clear signs in the early phases of lung cancer.
Symptoms appear only once the disease has reached more advanced stages. In both men and
women, lung cancer (both small cell and non-small cell) is the second most prevalent cancer,
excluding skin cancer. Prostate cancer is more prevalent in men, whereas breast cancer is
more frequent among women. Lung cancer accounts for about 25% of all cancer deaths. The
American Cancer Society released the following lung cancer figures for 2020:

• There were approximately 228,820 new cases of lung cancer (116,300 in men and 112,520
in women)
• Lung cancer claimed the lives of roughly 135,720 people (72,500 men and 63,220 women)

Survival statistics indicate the proportion of patients with a similar cancer stage and type
who are still alive a given period (typically five years) after diagnosis. Although they cannot
predict how long an individual patient will live, these statistics give a reasonable picture of
the probability of treatment success [7]. Early diagnosis of lung cancer can provide patients
with better treatment options and improve their chances of survival [40]. Lung cancer is
usually treated with radiotherapy, surgery, and chemotherapy. However, the survival rate of
patients with late-stage lung cancer treated with these methods is not encouraging. The
survival rate of lung cancer patients diagnosed in the early stages seems to rise from 16% to
50% [6]. Various studies have worked on computer-aided diagnosis (CAD) models to help
radiologists interpret medical images more accurately [51]. CAD systems can aid radiologists
in detecting and assessing lung nodules in their early stages. Computer vision, image
enhancement, and machine learning are common components of these systems. Segmentation
techniques are used to partition the image into meaningful constituents. In addition, machine
learning approaches such as support vector machines (SVM) or artificial neural networks
(ANN) have been used to obtain image characteristics and train predictive models [48].
Suspicious candidate identification and false-positive elimination are the two primary goals
of a CAD system. Every CAD system for lung nodule detection aims for high sensitivity and a
low false-positive (FP) rate. Nevertheless, CAD systems are still not commonly employed in
clinical practice for a variety of reasons, such as the poor sensitivity or high false-positive
rates of the available systems [30, 36]. The wide range of nodule morphology and the
identification of easily overlooked small nodules are the critical challenges of this task. The
majority of previous studies relied on manually designed (hand-crafted) features, which were
then fed to linear classifiers to distinguish between benign and malignant cases [42]. However,
CAD systems based on hand-crafted features face several problems. Firstly, hand-crafted
features depend on the segmentation of the lung nodule, which is itself unreliable [18].
Secondly, hand-crafted features are based entirely on prior knowledge and therefore depend on
the abilities of the CAD system designers. As a result, CAD systems based on hand-crafted
features are difficult to use in clinical settings [20].

1.1 A review of deep learning and recent CNN architecture

Sequential forward floating selection (SFFS) and genetic algorithms (GA) are two examples of
feature selection algorithms that can help generate optimal feature combinations [50].
Nevertheless, they are less effective for large-scale features such as images [52]. Deep
learning has recently gained popularity in CAD systems as a result of its success in a variety
of applications [27]. Owing to end-to-end deep learning, several medical image processing
applications have been successful [8]. Recently, deep learning architectures have been used in
pulmonary nodule detection systems on CT images [13]. These systems outperform those that
depend on hand-crafted features [55]. Given the large number of data attributes and the image
volume, CNNs are also a more reliable alternative for learning complex patterns from medical
imaging data. Deep learning algorithms can be defined as a subset of machine learning
approaches that includes artificial neural networks and a number of heuristic algorithms [38].
Deep learning techniques are a family of algorithms that model high-level abstractions by
learning from the available data at multiple levels and layers of abstraction. Reduced network
parameters, abstract learning capabilities, and automated learning abilities are only a few of
the properties of such networks [1]. According to various studies, deep learning algorithms
outperform traditional machine learning models in a variety of tasks, including speech
recognition, natural language processing, and computer vision. Deep learning applications
have progressed from simple image classification, such as the recognition of handwritten
digits, to more complex classification tasks [31]. Autoencoders, restricted Boltzmann
machines (RBMs), sparse coding, extreme learning, recurrent neural networks (RNN), and
convolutional neural networks (CNN) are among the predominant deep learning algorithms.
CNNs are typically employed in the analysis of medical images, and a series of recent studies
have reported that such strategies outperformed human readers in terms of precision [10]. In a
nutshell, a convolutional neural network is made up of a series of distinct layers that are
trained to extract valuable information from the input data automatically, without feature
engineering [38].
LeNet (its architecture is depicted in Fig. 1) was the first CNN and was initially presented
in a paper published in 1998 [33]. Convolution filters were used for the first time in this
network [33, 56]. Following LeNet-5 [33], several deep learning networks have been developed
with varying numbers of layers and adaptable structures, including the eight-layer AlexNet,
designed by Alex Krizhevsky. It served as the foundation for the convolutional neural networks
used in the ILSVRC-2012 challenge [31]. In addition, VGG-VD comprised CNN structures with 16
and 19 layers. The 22-layer GoogLeNet network comprises base architectures designed to manage
the conflict between overfitting and the growing number of training parameters [53]. ResNet is
roughly 20 times deeper than AlexNet and 8 times deeper than VGGNet [25]. Other architectures,
such as SENet, have also been discussed [28].

Fig. 1 Architecture of LeNet-5 proposed by Yann LeCun [33]

1.2 Segmentation and CNN for object detection

Well-known segmentation approaches regularly use hand-crafted features (e.g., gray levels,
optical flow, color) and purely heuristic foreground-related assumptions (e.g., background
priors, local motion differences) in order to separate objects from the background
automatically. In the absence of training data, the majority of these procedures operate
entirely unsupervised. With the advent of deep learning, further studies have addressed these
issues using deep learning frameworks [58]. A cognition-based multi-layer detector was designed
to spot moving objects [16], and motion saliency was combined with deep learning-based instance
embeddings to improve efficiency [34]. Others examined fully convolutional networks (FCNs) by
presenting two-stream networks that combine appearance and motion data [12, 35], or models
with more effective feature extraction as well as LSTM variants to help locate foreground
objects efficiently [49, 58].
Object detection is a valuable task in computer vision since it allows computers to perceive
the surrounding environment and react to it. It also has many uses in novel applications. Deep
convolutional neural networks (CNNs) have demonstrated promising results in object recognition
in recent years [43].

1.3 Related works

Convolutional neural networks have been shown to be successful in visual recognition tasks
[31]. The LIDC-IDRI database was used in the majority of previous studies [9]. Typically, a
selection criterion is used to pick a set of samples from the database. A number of the
detection systems have been tailored to particular nodule types. This section focuses on works
that have used the LIDC database; however, a few other noteworthy works are mentioned as well.
First, studies that used hand-crafted features to identify pulmonary nodules are reviewed.
Using 19 GLCM features derived from different sub-bands of the wavelet transform along with an
SVM-based classifier, Orozco et al. established a classification scheme for lung nodules. A
subset of the LIDC database was used to achieve an accuracy of 82% [37]. To extract texture
features of nodules, Froz et al. employed artificial crawlers and rose diagram techniques,
together with an SVM with a radial basis kernel for classification [17]. In a similar vein,
Aghabalaei et al. used an SVM to classify candidate nodules after designing a series of
texture, spectral, and shape features to characterize nodules [2]. Nithila et al. created a
CAD system for the detection of lung nodules with an emphasis on heuristic search algorithms,
optimizing the network using particle clustering algorithms [39]. Alam et al. suggested a
patch-based multi-atlas approach for detecting nodules [4].
Second, the detection and classification of pulmonary nodules on the basis of deep learning
are discussed. Dou et al. [15] employed 3D convolutional neural networks to encode multilevel
contextual information with the intention of minimizing false positives and false negatives.
Ozdemir et al. [41] used 3D convolutional neural networks based on V-Net to build an
end-to-end system for the detection of nodules. Zhang et al. [61] used a densely dilated 3D
convolutional neural network to minimize the false-positive rate, and constrained multi-scale
Laplacian of Gaussian filters to localize possible nodule candidates.
Another approach, suggested by Xie et al. [59], is a nodule detection system based on a
faster region-based CNN (Faster R-CNN). The system obtained 86.4% precision on a database
comprising 150,414 images by employing a 2D convolutional operation to reduce the
false-positive rate. Additionally, Shen et al. [46] used a multi-scale convolutional neural
network with two layers to detect lung cancers on the LIDC database with 86.84% accuracy.
Moreover, Kumar et al. [32] used deep features derived from an autoencoder, evaluated their
algorithm on 157 cases from the same dataset, and achieved 75.01% accuracy.
Despite the critiques in the literature, there have been bold attempts with decent results.
However, the problem of providing optimal models with satisfactory accuracy and sensitivity
along with a minimal rate of false positives remains unsolved. Moreover, heavy architectures
were used in previous attempts, which slowed down processing, degraded the evaluation
criteria, and raised the number of false positives. Consequently, a light architecture is
proposed in the present paper.

1.4 Major contributions of the work

Consistent with the present review, significant progress has been made in the detection of
lung nodules over the last decade. Difficulties persist, however, and they involve detecting
and classifying nodules that vary widely in shape, size, and density. As a result, a fully
automated framework that can address several of these difficulties is required. Moreover, the
available CAD schemes, despite their high sensitivity, continue to produce a large number of
false positives (classifying non-nodules as nodules). In the present study, a CNN architecture
is presented that is both light (not too deep) and suitable for medical image processing. The
specific contributions of this paper are as follows:
• Reducing the size of the region of interest by segmenting the lung regions from the
surrounding anatomy as a preprocessing step;
• Detecting micro-nodules smaller than 5 mm;
• Achieving good performance with a small number of CNN layers;
• Selecting a light network architecture and finding the best configuration through extensive
experiments, which yields a lightweight network that avoids overfitting and performs well;
• Preparing sufficient, valid, and realistic samples for training;
• Choosing a suitable number of training epochs for the sake of agility;
• Improving evaluation criteria such as accuracy, sensitivity, and specificity, and reducing
false positives and computation time, which are the main challenges in this field;
• Using a patch-based CNN for image classification, which reduces the computational cost and
training time of the whole framework. In other words, the whole detection pipeline is
designed as a single 3D convolutional neural network (CNN) with dense connections, trained
in an end-to-end, supervised manner.

The paper is structured as follows: Section 2 presents the dataset and methods, Section 3
provides the experimental results, and Section 4 presents the conclusion and future work.

2 Materials and methods

2.1 Dataset

One of the challenges in implementing deep learning algorithms is the lack of a sufficient
amount of labeled medical image data. This limitation, which applies to most deep learning
applications, is mostly due to the confidentiality of patient medical information. One publicly
available lung CT image dataset is the Lung Image Database Consortium (LIDC-IDRI) database [9].
Since the whole LIDC dataset is massive (125 gigabytes), a subset of it comprising 3536
positive and 3536 negative samples/slices extracted from 300 CT scans is used in our
experiments. Positive samples are 2D image regions (patches) of 80 × 80 pixels that contain a
nodule, whereas negative samples are patches of the same size inside the lung parenchyma that
contain no nodule.

2.2 The proposed framework

The workflow of the proposed framework is shown in Fig. 2. Our framework includes the steps
of pre-processing, segmentation, automatic identification (feature extraction by the
convolutional network rather than through hand-specified features, followed by network
training on the dataset), and classification based on the CNN rather than a traditional
classifier. In line with the objectives of this research (as noted in Section 1.4), above all
fast analysis, high accuracy, and a low false-positive rate, a light base model, LeNet, was
chosen and implemented on our dataset. To improve the results, changes were applied to this
CNN model and the results were observed over several runs. Finally, an optimized architecture
compatible with our dataset and the objectives of this research was obtained.
An efficient method for enhancing CT scan images should first aim to reduce the noise in the
region of interest. In this research, we used an image enhancement method based on [14] to
increase the contrast of the images. The first step of our framework was to segment the lung
regions, that is, to extract the main areas of the lung from the background and remove other
parts and adjacent tissues from the image. This step is important for limiting the regions of
interest and decreasing the load and input data of the subsequent deep model. In this stage,
we used an image segmentation approach based on the method employed in [5]. To accomplish the
lung segmentation task, we generated four types of 3D masks: the initial lung mask, the body
mask, the secondary lung mask, and the final lung mask.

Fig. 2 The workflow of the proposed framework and model (stages shown: CT Images; Pre-Processing of the Lung Region; Determine Topology of CNN; Labeling Image Regions; Image Import in CNN Model for Training; Pattern Classification; Results (Nodule or Not Nodule); Evaluation Based on Criteria; Feedback to Optimizing)



Figure 3 presents an example input (left panel) and output/final mask (right panel) of the lung
segmentation step.
The second step determines the training strategy and prepares the data, which includes
pre-processing and partitioning of the images as well as positive and negative labeling of the
samples. A CT scan consists of n slices, and each slice is a two-dimensional image of
512 × 512 pixels. Initially, the patient's CT scan is stored in a 3D array. Then, the
three-dimensional CT volume is converted into a set of smaller 80 × 80 pixel sections (i.e.,
patches). We used the xyz coordinates of the nodules, which are available in the LIDC dataset,
to label the 2D patches as positive or negative. Figure 4 shows how a CT slice is broken down
into a set of patches. A number of positive and negative patches are presented in Fig. 5.
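The patch labeling just described can be illustrated with a minimal sketch. It assumes a non-overlapping 80 × 80 grid on a 512 × 512 slice (so the 32 leftover pixels per axis are dropped) and labels a patch positive when an annotated nodule centre falls inside it. The paper does not state its exact tiling or labeling rule, and the function names and the example coordinate are hypothetical.

```python
def patch_origins(image_size=512, patch_size=80):
    """Top-left (row, col) corners of a non-overlapping patch grid (assumed tiling)."""
    per_axis = image_size // patch_size  # 512 // 80 = 6; the remaining 32 px are dropped
    return [(r * patch_size, c * patch_size)
            for r in range(per_axis) for c in range(per_axis)]


def contains(origin, point, patch_size=80):
    """True if the (row, col) point lies inside the patch starting at origin."""
    row0, col0 = origin
    row, col = point
    return row0 <= row < row0 + patch_size and col0 <= col < col0 + patch_size


def label_patches(nodule_centres, image_size=512, patch_size=80):
    """Map each patch origin to 1 (contains a nodule centre) or 0 (does not)."""
    return {
        origin: int(any(contains(origin, p, patch_size) for p in nodule_centres))
        for origin in patch_origins(image_size, patch_size)
    }


# Hypothetical nodule centre at row 100, column 250 of one slice.
labels = label_patches([(100, 250)])
positive = [origin for origin, label in labels.items() if label == 1]
print(len(labels), positive)
```

Under these assumptions each slice yields 36 candidate patches, and the single hypothetical nodule marks exactly one of them as positive.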
The information for each of these patches, including their coordinates and image data, is
stored for later stages. This is done for all patients, and the results are stored as labeled
data. We could in principle train the CNN at the pixel level; however, to avoid the increased
computational cost and training time, we decided to process images with a patch-based approach
rather than pixel-based processing. Increasing the number of layers in the network can be
beneficial because features can then be learned at different levels of abstraction. However,
very deep networks require more parameters to be learned, which increases the network
complexity, training time, generalization error, and risk of overfitting. Thus, deeper CNNs
are not always better, and it is crucial to consider the following factors when designing a
network rather than relying on network depth alone: the available dataset, the size and shape
of the objects to be classified, and parameters such as the kernel size [54]. In our study, we
explored CNNs with three convolutional layers and developed an agile CNN model based on the
LeNet-5 model. This CNN model is introduced in the third step. We formally describe the CNN
model by the following equations.
The convolutional layer is given by Eq. (1):

$$f_j = \mathrm{PReLU}\left(\sum_i c_{ij} * f_i + b_j\right) \qquad (1)$$

$f_i$: the i-th input feature map
$b_j$: the bias of the j-th output feature map
$f_j$: the j-th output feature map
$c_{ij}$: the convolution kernel between $f_i$ and $f_j$ [29]

Fig. 3 Output of lung segmentation: (left) the original image and (right) the segmented image

Fig. 4 A CT slice split into patches

After each convolution layer (Conv), a parametric rectified linear unit (PReLU) applies the
nonlinearity defined in Eq. (2):

$$\mathrm{PReLU}(f_j) = \begin{cases} f_j & \text{if } f_j > 0 \\ a_j f_j & \text{if } f_j \le 0 \end{cases} \qquad (2)$$

$a_j$: the learnable parameter, for example 0.25 [24]
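As an illustration of Eqs. (1) and (2), the following sketch applies a single 2D kernel to one input feature map and passes the result through PReLU. It is a simplified, pure-Python example rather than the paper's implementation: it assumes one input map, no padding, a stride of one, and the sliding-window (cross-correlation) form of convolution used by most deep learning libraries. The input values and the kernel are made up.

```python
def prelu(x, a=0.25):
    """Eq. (2): identity for positive inputs, slope a for non-positive inputs."""
    return x if x > 0 else a * x


def conv2d_prelu(fmap, kernel, bias=0.0, a=0.25):
    """Eq. (1) for a single input map: sliding-window product, plus bias, then PReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(fmap) - kh + 1
    out_w = len(fmap[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            s = sum(kernel[i][j] * fmap[r + i][c + j]
                    for i in range(kh) for j in range(kw))
            row.append(prelu(s + bias, a))
        out.append(row)
    return out


# Made-up 3 x 3 input map and 2 x 2 kernel.
fmap = [[1.0, 2.0, 0.0],
        [0.0, 1.0, 3.0],
        [2.0, 1.0, 1.0]]
kernel = [[1.0, -1.0],
          [-1.0, 1.0]]
result = conv2d_prelu(fmap, kernel)
print(result)
```

Negative responses are scaled by the slope a instead of being zeroed, which is the difference between PReLU and the plain ReLU.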


The pooling (parameter reduction) layer is given by Eq. (3):

$$C_{\mathrm{pooled}} = \begin{bmatrix} \mathrm{pool}\big(\alpha(c_1 + b_1 * e)\big) \\ \vdots \\ \mathrm{pool}\big(\alpha(c_n + b_n * e)\big) \end{bmatrix} \qquad (3)$$

$\alpha(\cdot)$: activation function, as Max or Average
$b$: the bias of each feature vector
$c_i$: the i-th convolution feature vector
$e$: unit vector, the same size as $c_i$ [31]

Fig. 5 Examples of positive and negative labeled patches

A two-class softmax function is placed in the last layer and used to obtain the probability
of each positive or negative label. The main purpose is to increase the probability of the
correct label and reduce the entropy, according to Eq. (4):

$$P_k = \frac{\exp(O_k)}{\sum_{h \in \{0,1\}} \exp(O_h)} \qquad (4)$$

$P_k$: the probability of label k
$O_k$: the k-th output
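Zero denotes the negative label and one denotes the positive label. The loss function is used
to reduce the entropy during training, according to Eq. (5):

$$L(w) = -\frac{1}{N} \sum_{n=1}^{N} \left[ y_n \log \hat{y}_n + (1 - y_n) \log(1 - \hat{y}_n) \right] + \lambda |w| \qquad (5)$$

$N$: the number of training samples
$y_n$: the true label of sample n
$\hat{y}_n$: the probability predicted by the CNN
$\lambda |w| = 5 \times 10^{-4}$ [19]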
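Eqs. (4) and (5) can be checked numerically with a short sketch. Two points are assumptions rather than statements from the paper: the term λ|w| is interpreted here as an L1 penalty with coefficient λ = 5 × 10⁻⁴, and predicted probabilities are clipped by a small epsilon to avoid taking the logarithm of zero. The function names are hypothetical.

```python
import math


def softmax2(o0, o1):
    """Eq. (4) for two outputs; the maximum is subtracted for numerical stability."""
    m = max(o0, o1)
    e0, e1 = math.exp(o0 - m), math.exp(o1 - m)
    total = e0 + e1
    return e0 / total, e1 / total


def loss(y_true, y_prob, weights, lam=5e-4, eps=1e-12):
    """Eq. (5): mean binary cross-entropy plus an (assumed L1) weight penalty."""
    n = len(y_true)
    ce = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        ce -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return ce / n + lam * sum(abs(w) for w in weights)


p_negative, p_positive = softmax2(0.0, 0.0)   # equal outputs give equal probabilities
uninformed = loss([1, 0], [p_positive, p_positive], weights=[])
print(p_negative, p_positive, uninformed)
```

For an uninformed classifier that always predicts 0.5, the cross-entropy equals ln 2, which gives a convenient reference value when reading loss curves.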
The weights are updated by the SGD algorithm during training, according to Eq. (6):

$$w_{t+1} = w_t + v_{t+1}, \qquad v_{t+1} = \mu v_t - \alpha \nabla L(w_t) \qquad (6)$$

$t$: the iteration number
$\alpha$: the training rate; the initial training rate is $\alpha_0 = 6 \times 10^{-5}$
$v$: the update value, whose initial value is zero
$\mu$: the momentum coefficient, 0.9
$\nabla L(w)$: the gradient, proportional to the size of the data [23]

The training rate is calculated according to Eq. (7) [45]:

$$\alpha_{t+1} = \alpha_0 (1 + \gamma t)^{-p} \qquad (7)$$

$\gamma = 0.0001$, $p = 0.75$
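The update rule of Eq. (6) and the decay schedule of Eq. (7) can be sketched as follows, using the constants given above (α₀ = 6 × 10⁻⁵, μ = 0.9, γ = 0.0001, p = 0.75). As a simplifying assumption, the rate returned for index t is applied at iteration t, ignoring the one-step index shift in Eq. (7). The gradient is supplied by the caller and the example values are made up.

```python
def learning_rate(t, alpha0=6e-5, gamma=1e-4, p=0.75):
    """Eq. (7): inverse-power decay of the training rate."""
    return alpha0 * (1.0 + gamma * t) ** (-p)


def sgd_momentum_step(w, v, grad, t, mu=0.9):
    """Eq. (6): update the velocity first, then add it to the weights."""
    alpha = learning_rate(t)
    v_next = [mu * vi - alpha * gi for vi, gi in zip(v, grad)]
    w_next = [wi + vi for wi, vi in zip(w, v_next)]
    return w_next, v_next


# One made-up step: a single weight of 1.0, zero initial velocity, gradient 2.0.
w1, v1 = sgd_momentum_step(w=[1.0], v=[0.0], grad=[2.0], t=0)
print(w1, v1, learning_rate(10000))
```

With these constants the rate decays slowly: after 10,000 iterations it is still about 59% of its initial value (a factor of 2 to the power -0.75).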

In the third step, the labeled data from the previous step are divided into training, testing,
and validation subsets, and our model is used to carry out the training, testing, and
validation steps. In the training phase, the labeled data were provided to the network, and
the network's weights were automatically optimized to classify the patches into positive and
negative classes. Then, the model's performance was evaluated with a confusion matrix, and
further performance criteria such as sensitivity, specificity, accuracy, and false-positive
and true-positive rates were calculated. To achieve the desired performance relative to
similar studies, we experimented with several different network configurations and patch
sizes. Our CNN architecture used three convolutional layers and 400 training rounds (epochs),
with each epoch taking 45 s on average to train. The architecture of the model is shown in
Fig. 6 and Table 1.
The fourth step of our framework was visualization and 3D reconstruction. Once a region is
classified as a positive sample (i.e., a nodule is present), a 3D representation of the nodule
can be rendered. Since the image data and coordinates of each patch are already available, the
location of the patch in the 2D slice together with the slice number in the 3D CT scan makes
three-dimensional reconstruction possible (see Fig. 7).

Fig. 6 Proposed CNN architecture (80 × 80 CT section → Convolution3D with batch normalization, 80 × 80 × 32 feature maps → Convolution3D with batch normalization and MaxPooling3D, 40 × 40 × 64 feature maps → Convolution3D with batch normalization and dropout, 40 × 40 × 64 feature maps)

The hardware used to implement this research comprised a 2.4 GHz Core i5 processor, 8 GB of
RAM, and Intel Graphics 520. MATLAB was used to prepare and preprocess the images and to
isolate the lung from the background. Python was the main programming language for developing
the subsequent steps of the framework. Anaconda3 (64-bit), Jupyter Notebook, and the
TensorFlow and Keras libraries were used to implement the CNN.

3 Experimental results

The criteria for evaluating the performance of the presented framework are the following.
Classification accuracy measures how close the recognition is to reality and is obtained by
dividing the number of correctly classified samples by the total number of samples.
Sensitivity (recall, or true positive rate) shows what fraction of the positive samples
(patches containing a nodule) are correctly classified as positive. Precision (positive
predictive value) is the fraction of samples classified as positive that are truly positive.
Specificity (true negative rate) is the fraction of the negative-class samples that are
correctly recognized as negative [11].
In this research, we used 5-fold cross-validation for the learning and testing steps. After
training the CNN model, we tested the model with 1415 images and computed the confusion
matrix, shown in Fig. 8. The average time to run this system was 6 h.
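A minimal sketch of how a 5-fold split of the 7072 patches might be formed is given here. It assumes contiguous, unshuffled folds; the paper does not describe how its folds were drawn, and the helper name is hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # the first `extra` folds get one more sample
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(7072, k=5))
test_sizes = [len(test) for _, test in folds]
print(test_sizes)
```

With 7072 samples the largest folds hold 1415 patches, which matches the size of the test set reported above.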
The evaluation criteria were calculated from the confusion matrix as follows:
Accuracy = (Tp + Tn)/D = (261 + 1014)/1415 = 90.1%, where Tp is the number of true positives,
Tn is the number of true negatives, and D is the total number of test samples;

Table 1 Architecture of the network layers with 3D convolutions: a Conv3D layer followed by another Conv3D layer, max pooling, a third Conv3D layer, and a Dropout layer (test_split = 0.2, patch_size = 2, batch_size = 3000, nb_classes = 2, nb_epoch = 400)

Layer (type)                    Output Shape            Param #
conv3d_4 (Conv3D)               (None, 39, 39, 3, 8)    80
batch_normalization_4 (Batch    (None, 39, 39, 3, 8)    32
conv3d_5 (Conv3D)               (None, 19, 19, 3, 8)    584
batch_normalization_5 (Batch    (None, 19, 19, 3, 8)    32
max_pooling3d_2 (MaxPooling3    (None, 9, 9, 3, 8)      0
conv3d_6 (Conv3D)               (None, 4, 4, 3, 8)      584
batch_normalization_6 (Batch    (None, 4, 4, 3, 8)      32
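The output shapes and parameter counts in Table 1 can be reproduced with simple arithmetic. The sketch assumes a single-channel 80 × 80 × 3 input, 3 × 3 × 1 kernels with an in-plane stride of 2 and no padding, 8 filters per layer, and 2 × 2 in-plane max pooling. These settings are inferred because they match the table; the paper does not state them.

```python
def conv_out(size, kernel, stride):
    """Output length along one axis for an unpadded convolution or pooling."""
    return (size - kernel) // stride + 1


def conv3d_params(kernel_shape, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    kh, kw, kd = kernel_shape
    return (kh * kw * kd * in_channels + 1) * filters


def batchnorm_params(channels):
    """Scale, shift, moving mean, and moving variance for each channel."""
    return 4 * channels


KERNEL = (3, 3, 1)  # assumed kernel shape
FILTERS = 8         # last dimension of every output shape in Table 1

side, channels = 80, 1  # assumed single-channel 80 x 80 input
summary = []
for name in ("conv3d_4", "conv3d_5", "max_pooling3d_2", "conv3d_6"):
    if name.startswith("conv3d"):
        side = conv_out(side, kernel=3, stride=2)
        summary.append((name, side, conv3d_params(KERNEL, channels, FILTERS)))
        channels = FILTERS
    else:
        side = conv_out(side, kernel=2, stride=2)
        summary.append((name, side, 0))

for name, side, params in summary:
    print(f"{name:16s} ({side}, {side}, 3, {FILTERS})  params={params}")
```

Under these assumptions the convolutional part of the network has 80 + 584 + 584 = 1248 weights, plus 32 batch-normalization parameters after each convolution.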

Fig. 7 3D construction of the detected nodule inside the lung regions is possible since we already segmented the
lung region and the xyz coordinates of the patches are available

Precision (PPV) = Tp/(Tp + Fp) = 261/(261 + 91) = 74.1%, where Fp is the number of false
positives;
Recall or Sensitivity or TPR = Tp/P = 261/310 = 84.1%, where P is the number of samples that
belong to the positive class;
Specificity = Tn/N = 1 - FPR = 1014/1105 = 91.7%, where N is the number of samples that
belong to the negative class;
FPR = Fp/N = 1 - Specificity = 91/1105 = 0.082 = 8.2%;
NPV = Tn/(Tn + Fn) = 1014/(1014 + 49) = 95.3%, where Fn is the number of false negatives;
FDR = Fp/(Fp + Tp) = 91/(91 + 261) = 0.258 = 25.8%.
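The criteria listed above follow directly from the four confusion-matrix counts (Tp = 261, Tn = 1014, Fp = 91, Fn = 49). The helper below is a small illustrative sketch with a hypothetical name, not code from the paper.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Standard binary classification criteria from confusion-matrix counts."""
    positives = tp + fn   # P: samples in the positive class
    negatives = tn + fp   # N: samples in the negative class
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "precision": tp / (tp + fp),
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
        "fpr": fp / negatives,
        "npv": tn / (tn + fn),
        "fdr": fp / (fp + tp),
    }


metrics = confusion_metrics(tp=261, tn=1014, fp=91, fn=49)
for name, value in metrics.items():
    print(f"{name:12s} {100 * value:.2f}%")
```

The computed values agree with the percentages reported above to within one unit in the last reported digit.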
The loss and accuracy curves are shown in Figs. 9 and 10.
Table 2 presents a comparison between our approach and other similar published methods.
According to this comparison and the other studies mentioned in the literature review, the
evaluation criteria have improved in a balanced manner; therefore, the proposed method
provides acceptable performance.

Fig. 8 Confusion matrix (rows: true label; columns: predicted label; classes: No-nodule and Nodule)

Fig. 9 The model’s loss curve for training and test sets

4 Conclusion and future works

Smoking, toxic gases, and air pollution are among the many factors that have led to an
increase in respiratory and pulmonary diseases and endangered human health. Prevention,
timely and correct diagnosis, and proper treatment of pulmonary disease are vital. Diagnosis
of pulmonary diseases requires careful and time-consuming assessment by a specialist
physician. Human error due to the large number and complexity of images can adversely affect
diagnosis and treatment. Given the increasing spread of pulmonary diseases in today's
industrialized societies, using computerized methods to support fast and accurate diagnosis is
one of the main concerns of physicians and engineers.
In this paper, we proposed an intelligent system based on deep learning algorithms for
detecting lung nodules in CT images. Because of the uncertainty in the available methods for
image feature extraction, the proposed framework employed a CNN to extract lung image
characteristics and to automate the classification and diagnosis of the existing nodules.

Fig. 10 The model’s accuracy for training and test sets


Table 2 Comparison of the proposed method with previous similar studies (LUNA16: Lung Nodule Analysis 2016; NA: not available)

Study                                 Database   CT images used         Sensitivity (%)        Specificity (%)  Accuracy (%)
Song et al. (2017) [48]               LIDC-IDRI  4581                   83.96                  84.32            84.15
Gruetzemacher and Gupta (2016) [21]   LIDC-IDRI  1000                   78.19                  86.13            82.1
Ypsilantis and Montana (2016) [60]    LIDC-IDRI  1018                   90.5                   NA               NA
Alakwaa et al. (2017) [3]             LUNA16     1397                   NA                     NA               86.6
Setio et al. (2016) [44]              LIDC-IDRI  888                    85.4 and 90.1          NA               NA
Heeneman and Hoogendoorn (2018) [26]  LIDC-IDRI  799                    NA                     NA               95% confidence
Wang et al. (2018) [57]               LUNA16     888                    95.8                   NA               NA
Xie et al. (2019) [59]                LUNA16     1018                   86.42, 73.4 and 74.4   NA               NA
Silveira et al. (2007) [47]           LIDC-IDRI  800                    84.3                   85.9             84.8
Hassanpour et al. (2011) [22]         LIDC-IDRI  800                    78.8                   81.2             80.4
                                      ELCAP      50                     72.5                   60               67.5
Proposed method                       LIDC-IDRI  300 CTs, 7072 patches  84.1                   91.7             90.1

We used a publicly available dataset in this study to provide a benchmark for other studies.
The design of the CNN layers was based on the LeNet-5 model and on an experimental search for
the best-performing architecture and configuration.
The highlights of this paper are a CAD pipeline for detecting lung nodules, a significant
increase in overall processing speed owing to the use of a CNN and fast extraction of image
features, and a CNN structure designed to avoid overfitting with a reasonable amount of
training data.
In future work, we plan to develop a mobile app based on our model as a decision support
system (DSS) for public use. In addition, we only used the LIDC dataset, and the
generalizability of our CNN model to other independent datasets is not known. These
limitations will be addressed in a future study. Moreover, our model could be generalized to
nodules in other body tissues using transfer learning. In the future, it could also be
possible to extend our current model not only to detect nodules but also to classify them as
cancerous or not. And the challenge continues.

Acknowledgements The authors thank the reviewers for their constructive comments and
suggestions, which greatly contributed to improving this paper.

Declarations

Ethical approval This article does not contain any studies with human participants or animals performed by
any of the authors.

Conflict of interest The authors, Reza Majidpourkhoei, Mehdi Alilou, Kambiz Majidzadeh and Amin
Babazadehsangar, declare that they have no conflict of interest.

References

1. Affonso C, Sassi RJ, Barreiros RM (2015) Biological image classification using rough-fuzzy artificial
neural network. Expert Syst Appl 42(24):9482–9488. https://doi.org/10.1016/j.eswa.2015.07.075
2. Aghabalaei Khordehchi E, Ayatollahi A, Daliri MR (2017) Automatic lung nodule detection based on
statistical region merging and support vector machines. Image Anal Stereol 36(2):65.
https://doi.org/10.5566/ias.1679
3. Alakwaa W, Nassef M, Badr A (2017) Lung Cancer detection and classification with 3D convolutional
neural network (3D-CNN). Int J Adv Comput Sci Appl 8(8). https://doi.org/10.14569/ijacsa.2017.080853
4. Alam M, Sankaranarayanan G, Devarajan V (2016) Lung Nodule Detection and Segmentation Using a
Patch-Based Multi-Atlas Method. 2016 International Conference on Computational Science and
Computational Intelligence (CSCI). https://doi.org/10.1109/csci.2016.0012.
5. Alilou M, Kovalev V, Snezhko E, Taimouri V (2014) A comprehensive framework for automatic detection
of pulmonary nodules in lung CT images. Image Anal Stereol 33(1):13.
https://doi.org/10.5566/ias.v33.p13-27
6. American Cancer Society, Cancer Facts and Figures (2015)
https://www.cancer.org/acs/groups/content/@editorial/documents/document/acspc-044552.pdf.
7. American Cancer Society, Key Statistics for Lung Cancer (2020)
https://www.cancer.org/cancer/lung-cancer/about/key-statistics.html.
8. Amin SU, Alsulaiman M, Muhammad G, Mekhtiche MA, Shamim Hossain M (2019) Deep learning for
EEG motor imagery classification based on multi-layer CNNs feature fusion. Futur Gener Comput Syst
101:542–554. https://doi.org/10.1016/j.future.2019.06.027
9. Armato SG, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, … Hoffman EA (2011) The
lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed
reference database of lung nodules on CT scans. Med Phys 38(2):915–931.
https://doi.org/10.1118/1.3528204
10. Bakator M, Radosav D (2018) Deep learning and medical diagnosis: a review of literature. Multimodal
Technol Interact 2(3):47. https://doi.org/10.3390/mti2030047
11. Bengio Y (2012) Deep learning of representations for unsupervised and transfer learning. In Proceedings of
ICML workshop on unsupervised and transfer learning (pp. 17-36).
12. Cao J, Pang Y, Li X (2019) Triply Supervised Decoder Networks for Joint Detection and Segmentation.
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
https://doi.org/10.1109/cvpr.2019.00757.
13. De Carvalho Filho AO, Silva AC, de Paiva AC, Nunes RA, Gattass M (2018) Classification of patterns of
benignity and malignancy based on CT using topology-based phylogenetic diversity index and
convolutional neural network. Pattern Recogn 81:200–212. https://doi.org/10.1016/j.patcog.2018.03.032
14. De Oliveira JEE (2011) Content-based image retrieval applied to BI-RADS tissue classification in screening
mammography. World J Radiol 3(1):24–31. https://doi.org/10.4329/wjr.v3.i1.24
15. Dou Q, Chen H, Yu L, Qin J, Heng P-A (2017) Multilevel contextual 3-D CNNs for false positive reduction
in pulmonary nodule detection. IEEE Trans Biomed Eng 64(7):1558–1567.
https://doi.org/10.1109/tbme.2016.2613502
16. Fragkiadaki K, Arbelaez P, Felsen P, Malik J (2015) Learning to segment moving objects in videos. 2015
IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
https://doi.org/10.1109/cvpr.2015.7299035.
17. Froz BR, de Carvalho Filho AO, Silva AC, de Paiva AC, Acatauassú Nunes R, Gattass M (2017) Lung
nodule classification using artificial crawlers, directional texture and support vector machine. Expert Syst
Appl 69:176–188. https://doi.org/10.1016/j.eswa.2016.10.039
18. Gillies RJ, Kinahan PE, Hricak H (2016) Radiomics: images are more than pictures, they are data.
Radiology 278(2):563–577. https://doi.org/10.1148/radiol.2015151169
19. Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In
Proceedings of the thirteenth international conference on artificial intelligence and statistics (pp. 249-256).
20. Greenspan H, van Ginneken B, Summers RM (2016) Guest editorial deep learning in medical imaging:
overview and future promise of an exciting new technique. IEEE Trans Med Imaging 35(5):1153–1159.
https://doi.org/10.1109/tmi.2016.2553401
21. Gruetzemacher R, Gupta A (2016) Using deep learning for pulmonary nodule detection &
diagnosis. Twenty-second Americas Conference on Information Systems, San Diego.
22. Hassanpour H, Yousefian H, Zehtabi A (2011) Pixon-based image segmentation. Image Segmentation.
https://doi.org/10.5772/15941.
23. Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Pal C, Jodoin PM, Larochelle H
(2017) Brain tumor segmentation with deep neural networks. Med Image Anal 35:18–31.
https://doi.org/10.1016/j.media.2016.05.004
24. He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on
ImageNet classification. 2015 IEEE International Conference on Computer Vision (ICCV), 1026–1034.
https://doi.org/10.1109/iccv.2015.123.
25. He K, Zhang X, Ren S, Sun J (2016) Deep Residual Learning for Image Recognition. 2016 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr.2016.90.
26. Heeneman T, Hoogendoorn M (2018) Lung nodule detection by using deep learning. University of
Amsterdam, Research Paper.
27. Hossain MS, Muhammad G (2014) Cloud-based collaborative media service framework for HealthCare. Int
J Distrib Sens Netw 10(3):858712. https://doi.org/10.1155/2014/858712
28. Hu J, Shen L, Albanie S, Sun G, Wu E (2020) Squeeze-and-excitation networks. IEEE Trans Pattern Anal
Mach Intell 42(8):2011–2023. https://doi.org/10.1109/tpami.2019.2913372
29. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal
covariate shift. arXiv preprint arXiv:1502.03167.
30. Jacobs C, van Rikxoort EM, Murphy K, Prokop M, Schaefer-Prokop CM, van Ginneken B (2016)
Computer-aided detection of pulmonary nodules: a comparative study using the public LIDC/IDRI
database. Eur Radiol 26(7):2139–2147. https://doi.org/10.1007/s00330-015-4030-7
31. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural
networks. Commun ACM 60(6):84–90. https://doi.org/10.1145/3065386
32. Kumar D, Wong A, Clausi DA (2015) Lung nodule classification using deep features in CT images. 2015
12th conference on computer and robot vision, 133–138. https://doi.org/10.1109/crv.2015.25.
33. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition.
Proc IEEE 86(11):2278–2324. https://doi.org/10.1109/5.726791
34. Li S, Seybold B, Vorobyov A, Fathi A, Huang Q, Kuo C-CJ (2018) Instance Embedding Transfer to
Unsupervised Video Object Segmentation. 2018 IEEE/CVF Conference on Computer Vision and Pattern
Recognition. https://doi.org/10.1109/cvpr.2018.00683.
35. Li G, Xie Y, Wei T, Wang K, Lin L (2018) Flow Guided Recurrent Neural Encoder for Video Salient
Object Detection. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
https://doi.org/10.1109/cvpr.2018.00342.
36. Liang M, Tang W, Xu DM, Jirapatnakul AC, Reeves AP, Henschke CI, Yankelevitz D (2016) Low-dose
CT screening for lung Cancer: computer-aided detection of missed lung cancers. Radiology 281(1):279–
288. https://doi.org/10.1148/radiol.2016150063
37. Madero Orozco H, Vergara Villegas OO, Cruz Sánchez VG, Ochoa Domínguez H de J, Nandayapa Alfaro
M de J (2015) Automated system for lung nodules classification based on wavelet feature descriptor and
support vector machine. Biomed Eng Online 14(1):9. https://doi.org/10.1186/s12938-015-0003-y
38. Monkam P, Qi S, Ma H, Gao W, Yao Y, Qian W (2019) Detection and classification of pulmonary nodules
using convolutional neural networks: a survey. IEEE Access 7:78075–78091.
https://doi.org/10.1109/access.2019.2920980
39. Nithila EE, Kumar SS (2017) Automatic detection of solitary pulmonary nodules using swarm intelligence
optimized neural networks on CT images. Eng Sci Technol, Int J 20(3):1192–1202.
https://doi.org/10.1016/j.jestch.2016.12.006
40. Oudkerk M, Devaraj A, Vliegenthart R, Henzler T, Prosch H, Heussel CP, Bastarrika G, Sverzellati N,
Mascalchi M, Delorme S, Baldwin DR, Callister ME, Becker N, Heuvelmans MA, Rzyman W, Infante
MV, Pastorino U, Pedersen JH, Paci E, Duffy SW, de Koning H, Field JK (2017) European position
statement on lung cancer screening. Lancet Oncol 18(12):e754–e766.
https://doi.org/10.1016/S1470-2045(17)30861-6
41. Ozdemir O, Russell RL, Berlin AA (2020) A 3D probabilistic deep learning system for detection and
diagnosis of lung Cancer using low-dose CT scans. IEEE Trans Med Imaging 39(5):1419–1429.
https://doi.org/10.1109/tmi.2019.2947595
42. Qian W, Song D, Lei M, Sankar R, Eikman E (2007) Computer-aided mass detection based on ipsilateral
multiview mammograms. Acad Radiol 14(5):530–538. https://doi.org/10.1016/j.acra.2007.01.012
43. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region
proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149.
https://doi.org/10.1109/tpami.2016.2577031
44. Setio AAA, Ciompi F, Litjens G, Gerke P, Jacobs C, van Riel SJ, Wille MMW, Naqibullah M, Sanchez CI,
van Ginneken B (2016) Pulmonary nodule detection in CT images: false positive reduction using
multi-view convolutional networks. IEEE Trans Med Imaging 35(5):1160–1169.
https://doi.org/10.1109/tmi.2016.2536809
45. Severyn A, Moschitti A (2015) UNITN: training deep convolutional neural network for twitter sentiment
classification. Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015).
https://doi.org/10.18653/v1/s15-2079.
46. Shen W, Zhou M, Yang F, Yang C, Tian J (2015) Multi-scale convolutional neural networks for lung
nodule classification. Inform Process Med Imag 588–599. https://doi.org/10.1007/978-3-319-19992-4_46
47. Silveira M, Nascimento J, Marques J (2007) Automatic segmentation of the lungs using robust level sets.
2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society.
https://doi.org/10.1109/iembs.2007.4353317.
48. Song Q, Zhao L, Luo X, Dou X (2017) Using deep learning for classification of lung nodules on computed
tomography images. J Healthcare Eng 2017:8314740–8314747. https://doi.org/10.1155/2017/8314740
49. Song H, Wang W, Zhao S, Shen J, Lam K-M (2018) Pyramid dilated deeper ConvLSTM for video salient object
detection. Lect Notes Comput Sci 744–760. https://doi.org/10.1007/978-3-030-01252-6_44.
50. Sun W, Tseng TL, Qian W, Zhang J, Saltzstein EC, Zheng B, Lure F, Yu H, Zhou S (2015) Using
multiscale texture and density features for near-term breast cancer risk analysis. Med Phys 42(6):2853–
2862. https://doi.org/10.1118/1.4919772
51. Sun W, Tseng TL, Zhang J, Qian W (2016) Computerized breast cancer analysis system using three stage
semi-supervised learning method. Comput Methods Prog Biomed 135:77–88.
https://doi.org/10.1016/j.cmpb.2016.07.017
52. Sun W, Huang X, Tseng T-L, Zhang J, Qian W (2016) Computerized lung cancer malignancy level analysis
using 3D texture features. Medical Imaging 2016: Computer-Aided Diagnosis.
https://doi.org/10.1117/12.2216329.
53. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, … Rabinovich A (2015) Going deeper with
convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
https://doi.org/10.1109/cvpr.2015.7298594.
54. Tajbakhsh N, Suzuki K (2017) Comparing two classes of end-to-end machine-learning models in lung
nodule detection and classification: MTANNs vs. CNNs. Pattern Recogn 63:476–486.
https://doi.org/10.1016/j.patcog.2016.09.029
55. Wang Y, Qiu Y, Thai T, Moore K, Liu H, Zheng B (2017) A two-step convolutional neural network based
computer-aided detection scheme for automatically segmenting adipose tissue volume depicting on CT
images. Comput Methods Prog Biomed 144:97–104. https://doi.org/10.1016/j.cmpb.2017.03.017
56. Wang S, Zhou M, Liu Z, Liu Z, Gu D, Zang Y, Dong D, Gevaert O, Tian J (2017) Central focused
convolutional neural networks: developing a data-driven model for lung nodule segmentation. Med Image
Anal 40:172–183. https://doi.org/10.1016/j.media.2017.06.014
57. Wang B, Qi G, Tang S, Zhang L, Deng L, Zhang Y (2018) Automated pulmonary nodule detection: high
sensitivity with few candidates. Lecture notes in computer science, 759–767.
https://doi.org/10.1007/978-3-030-00934-2_84.
58. Wang W, Lu X, Shen J, Crandall DJ, Shao L (2019) Zero-shot video object segmentation via attentive graph
neural networks. In Proceedings of the IEEE international conference on computer vision (pp. 9236-9245).
https://doi.org/10.1109/iccv.2019.00933.
59. Xie H, Yang D, Sun N, Chen Z, Zhang Y (2019) Automated pulmonary nodule detection in CT images
using deep convolutional neural networks. Pattern Recogn 85:109–119.
https://doi.org/10.1016/j.patcog.2018.07.031
60. Ypsilantis PP, Montana G (2016) Recurrent convolutional networks for pulmonary nodule detection in CT
imaging. arXiv preprint arXiv:1609.09143.
61. Zhang J, Xia Y, Zeng H, Zhang Y (2018) Nodule: combining constrained multi-scale log filters with
densely dilated 3D deep convolutional neural network for pulmonary nodule detection. Neurocomputing
317:159–167. https://doi.org/10.1016/j.neucom.2018.08.022

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.

Affiliations

Reza Majidpourkhoei 1 & Mehdi Alilou 1 & Kambiz Majidzadeh 1 & Amin
Babazadehsangar 1

Reza Majidpourkhoei
rmajidpour@yahoo.com

Kambiz Majidzadeh
K.Majidzadeh@iaurmia.ac.ir

Amin Babazadehsangar
bsamin2@liveutm.onmicrosoft.com

1 Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran
