
ResNet-50 vs VGG-19 vs Training from Scratch: A comparative analysis of the segmentation and
classification of Pneumonia from chest x-ray images

Victor Ikechukwu A.a,*, Murali S.b, Deepu R.c, Shivamurthy R. C.d

a Research Scholar, Department of CSE, Maharaja Institute of Technology Mysore-571477, India
b,c,d Professor, Department of CSE, Maharaja Institute of Technology Mysore-571477, India
* Corresponding author. Tel.: +91-78928 19690; E-mail address: victor.agughasi@gmail.com

Abstract

In medical imaging, segmentation plays a vital role in the interpretation of X-ray images, where salient features are extracted with its help. Without resorting to surgery, clinicians employ various modalities, ranging from X-rays and CT scans to ultrasonography and other imaging techniques, to visualise and examine internal human body organs and structures. Training a deep convolutional neural network (CNN) from scratch to proper convergence is difficult, since it requires long computational time, a large amount of labelled training data and a considerable degree of experience. Fine-tuning a CNN that has been pre-trained on, for instance, a large labelled dataset is a viable alternative. In this paper, a comparative study was conducted between the pre-trained models VGG-19 and ResNet-50 and a network trained from scratch. To reduce overfitting, data augmentation and dropout regularization were used. With a recall of 92.03%, our analysis showed that Iyke-Net, a CNN trained from scratch, is comparable with the pre-trained models given proper fine-tuning.

Keywords: Chest x-rays; CNNs; Deep Learning; Medical imaging; Pneumonia detection; ResNet-50; Segmentation; VGG-19

1. Introduction

In many clinical applications, medical image segmentation has a significant impact on all subsequent analysis and diagnosis. Manual annotation, on the other hand, relies on domain expertise and expert skills; it is tedious, time-consuming, and prone to intra- and inter-observer inconsistencies. As a result, creating automated segmentation algorithms for exact annotation of medical images is both clinically useful and necessary. Convolutional neural networks (CNNs), which have found useful application in computer vision for several decades [1][2], have become a standard deep learning algorithm amongst researchers in various application areas, such as natural language processing and medical image segmentation and analysis [3][4][5], because of their ability to discriminate features at differing abstraction levels. However, it is often difficult to train deep neural networks (without the use of pre-trained models) because of several limitations: first, data imbalance, where there are more samples of normal than of malignant cases; second, the availability of large unlabelled datasets which require annotation by experts; third, poor generalization, where the model performs below average on an unseen (validation) dataset; and finally, the dependence on an enormous amount of computational resources [6][7][8].

Fine-tuning a CNN that has been trained on a large, labelled dataset from a separate application is a viable alternative to building a CNN from scratch. Pre-trained models have been used successfully as a baseline for transfer learning in a variety of computer vision tasks [9][10]. In this paper, we seek to answer the central question: will the accuracy of deep neural networks trained from scratch for pneumonia detection supersede that of pre-trained models such as VGG-19 and ResNet-50? This question is pertinent because of the limited computational resources we employed. To address it, we conducted experiments on three (3) publicly available chest X-ray datasets, namely CheXpert from Stanford University [11], the NIH dataset, and the chest X-ray images of pneumonia obtained from the Kaggle repository. We compared the performance of VGG-19 and ResNet-50 with our fine-tuned CNN model trained from scratch on the chest x-ray images.

In the next section, related works on chest x-ray image segmentation and pneumonia classification using deep learning are introduced, and the major contributions of our work are highlighted. Section 3 describes the methods used in detail. In Section 4, the experimental results obtained through a performance comparison of the proposed method with existing state-of-the-art (SOTA) medical image classification techniques are discussed. The conclusion of the work, its limitations and future directions are given in the last section.

2. Related Works

The early application of CNNs in medical imaging dates back to the 1990s, when a computer-aided diagnostic (CAD) tool was used in mammography [12][13]. The evolution of powerful computing (GPUs and TPUs) fueled more research in medical imaging, which translated into more literature and proposals for new detection systems with better performance. The application of CNNs is not limited to the development of clinical decision support systems (CDSS) but extends to the segmentation and classification of disease subtypes.

2.1. Deep Learning in Lung Segmentation

Conventionally, medical image processing involves the following steps: image formation (acquisition and digitization), image enhancement (calibration, registration, and transformation) and image analysis (feature extraction, segmentation, and classification). The very first step towards building a robust CDSS lies in efficient lung segmentation and localization. Standard image processing techniques such as clustering, edge detection, thresholding and vector quantization have been studied for lung segmentation in chest x-ray (CXR) images [14][15][16]. Such image processing approaches, however, use relatively basic algorithms and perform poorly when the input image contains noise and other artefacts. With the emergence of deep learning, lung segmentation using convolutional neural networks is being intensively investigated. Researchers are currently striving to improve lung segmentation performance by employing a variety of approaches, including attention modules, in addition to proposing more advanced image segmentation network techniques.

Current research revolves around using state-of-the-art algorithms as a benchmark. A robust segmentation algorithm called Total Variation-based Active Contour (TVAC) was proposed by Narathip et al. [17], which proved successful when compared with the U-Net architecture. Faizan et al. [18] used Generative Adversarial Networks (GANs) to train different discriminators for lung segmentation from CXR; the proposed approach, with a dice score of 97% and an IoU of 94%, outperformed existing state-of-the-art techniques. A deep learning based method called "Self-Attention Deep Neural Network" was proposed by Kim et al. [19] for the automatic segmentation of CXR; experimental results showed performance comparable to most existing techniques in terms of the dice score.

A common trend across the existing literature is the absence of a well-annotated training dataset, which yields low performance in situations where the lungs are either deformed or occluded.

2.2. Deep Learning in Pneumonia Classification

The correct detection of lung borders is one of the most crucial steps in automatic CXR analysis: boundary regions are extracted, which helps in diagnosing pneumonia, pneumothorax, emphysema, and cardiomegaly, a family of lung diseases commonly known as Chronic Obstructive Pulmonary Diseases (COPDs) [20][21]. The work of [22] used various CNN architectures for the classification of viral pneumonia diseases from non-COVID-19 cases. An overall accuracy of 99.51% and an AUC of 99.4% were reported, showing that ResNet-101 could distinguish COVID-19 from non-COVID-19 cases. Wang et al. [23] trained a CNN for the discrimination of viral pneumonia from COVID-19 cases based on clinical changes in CT images collected from 453 unique patients. Similarly, Zhang et al. [24] proposed an automated intelligent assistive system for the detection and quantification of COVID-19 pneumonia using CT images of 2215 patients with multiple lesions. A detailed study on deep learning approaches for automatic detection and prediction of pneumonia in the era of COVID-19 was presented by Shoeibi et al. [25]; the study also highlighted the challenges faced by clinicians and future research directions. The use of transfer learning with the help of data augmentation was proposed by Chowdhury et al. [26] for the screening of COVID-19 pneumonia from digital CXR images with 97% accuracy. A deep neural network called Deep-COVID was proposed by Minaee et al. [27] for predicting COVID-19 pneumonia from chest x-ray images. A comprehensive dataset of CXR and CT scan images from a variety of sources was presented for the diagnosis of COVID-19 pneumonia [28]; experimental results showed that transfer learning via pre-trained models was superior to training a model from scratch.

In summary, the "transferability" of knowledge incorporated in pre-trained CNNs is one of the most crucial aspects of deep neural networks. However, little research has been done on the performance evaluation of CNNs trained from scratch against pre-trained networks.

2.3. Contributions

Our primary contributions are:
• We present a fine-tuned CNN based model for classification of pneumonia in chest x-ray images.
• We demonstrate that, although it is computationally intensive, training a deep neural network from scratch using limited resources is possible.
• We analyze the effects of hyperparameter optimization, in particular dropout variations, tailored towards achieving better accuracy than most traditional approaches.

We present an overall accuracy of 87% and a recall of 98% using pre-trained models with appropriate fine-tuning, which is comparable with state-of-the-art techniques in medical image analysis.
3. Methods

Deep convolutional neural networks have proven to yield better accuracy when dealing with large volumes of data, and many researchers tend to use them as de facto standards. This was made possible by transfer learning, where models trained on large datasets like ImageNet are reused with modified optimizers.

We present the experiments and procedures undertaken to verify the effectiveness of our proposed model on the publicly available chest x-ray (CXR) datasets from the NIH and Kaggle repositories. Keras and TensorFlow, both open-source Python libraries, were used to train the CNNs to discriminate features for the classification of pneumonia from chest x-ray images.

3.1. Dataset Description

A more reliable database than GitHub's is the very popular chest X-ray database on Kaggle, with anterior-to-posterior (AP) and posterior-to-anterior (PA) CXR images, consisting of 5856 images of normal, bacterial, and viral pneumonia at resolutions varying from 400p to 2000p. However, only chest X-ray images of normal and viral pneumonia cases were used in this study, as shown in figure (1).

Fig. 1: The chest x-ray images of (a) a patient with highly enhanced edges, and (b) a normal patient.

The dataset was further subdivided into training, testing and validation sets, each containing both normal and pneumonia annotated images. For pneumonia cases, 3875 images were used for training, 390 for testing and 8 for validation, as illustrated in figure (2). The remaining normal images were 1341, 234 and 8 respectively, as shown in figure (3).

Fig. 2: The chest x-ray images (a), (b) of pneumonia patients.

Fig. 3: The distribution of chest x-ray images.

The dataset is highly imbalanced, with more pneumonia cases than normal cases; hence data augmentation was used to balance the dataset, thereby reducing the possibility of overfitting the model. To evaluate the performance on other COPD diseases, 4999 CXR images were randomly chosen from the NIH dataset, with 2999 used for training and 1000 each for testing and validation.

3.2. System Requirement

All experiments were run on an 8th generation Core i7 laptop PC with 16GB of RAM, an NVIDIA GeForce 1060Ti GPU with 6GB of memory and a 256GB SSD, running Windows.

3.3. Baseline for preprocessing

There was no need for image quality enhancement, as the dataset images had already been denoised and were in high resolution. However, to make the model more robust, rescaling and normalization were used. Furthermore, during training, images were randomly rotated by up to 30 degrees, flipped horizontally, and shifted horizontally and vertically by a factor of 0.25, as illustrated in table (1).

Table 1: Parameters for image augmentation.

Method             Default       Adjusted
Horizontal flip    None          True
Horizontal shift   0             0.25
Vertical shift     0             0.25
Shear range        0             0.3
Rescale            -             1./255
Zoom range         -             0.3
Fixed image size   1024 x 1024   224 x 224
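As a minimal sketch, the Table 1 settings map onto the Keras ImageDataGenerator API mentioned above as follows; the directory path is a hypothetical placeholder, not a path given in the paper.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings taken from Table 1.
train_datagen = ImageDataGenerator(
    rescale=1. / 255,         # normalize pixel values
    rotation_range=30,        # random rotation of up to 30 degrees
    width_shift_range=0.25,   # horizontal shift
    height_shift_range=0.25,  # vertical shift
    shear_range=0.3,
    zoom_range=0.3,
    horizontal_flip=True,
)

train_gen = train_datagen.flow_from_directory(
    "chest_xray/train",      # hypothetical directory layout
    target_size=(224, 224),  # resized from 1024 x 1024 (Table 1)
    batch_size=64,
    class_mode="binary",     # Normal vs Pneumonia
)
```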

3.4. ResNet-50

The ResNet-50 (residual neural network) is a variation of the ResNet architecture, with 50 deep layers, that has been trained on at least one million images from the ImageNet database. Double- or triple-layer skips with nonlinearities (ReLU) and batch normalisation are used in most ResNet models. HighwayNet, a model that employs an additional weight matrix to learn the skip weights, is often incorporated. The repeated bottleneck block of the conv2_x stage is given in equation (1):

\[
\begin{bmatrix}
1 \times 1, & 64 \\
3 \times 3, & 64 \\
1 \times 1, & 256
\end{bmatrix} \times 3
\tag{1}
\]
The ResNet-50 architecture consists of sequences of convolutional blocks with average pooling; softmax is used at the last layer for classification. The basic idea of ResNet-50 was explained by Quingge et al. [29] in their study on the identification of macular diseases from optical coherence tomography images, as shown in figure (4).

Table 2: ResNet-50 architecture summary

Layer name   Output size   Building block (50-layer)                  Repeats (18/34/50/101/152-layer)
conv1        112 x 112     7 x 7, 64, stride 2                        shared by all variants
conv2_x      56 x 56       3 x 3 max pool, stride 2, then
                           [1 x 1, 64 | 3 x 3, 64 | 1 x 1, 256]       x2 / x3 / x3 / x3 / x3
conv3_x      28 x 28       [1 x 1, 128 | 3 x 3, 128 | 1 x 1, 512]     x2 / x4 / x4 / x4 / x8
conv4_x      14 x 14       [1 x 1, 256 | 3 x 3, 256 | 1 x 1, 1024]    x2 / x6 / x6 / x23 / x36
conv5_x      7 x 7         [1 x 1, 512 | 3 x 3, 512 | 1 x 1, 2048]    x2 / x3 / x3 / x3 / x3
             1 x 1         average pool, 1000-d fc, softmax           shared by all variants
FLOPs                      1.8 x 10^9 / 3.6 x 10^9 / 3.8 x 10^9 / 7.6 x 10^9 / 11.3 x 10^9
(The 18- and 34-layer variants use two-layer basic blocks of 3 x 3 convolutions rather than the three-layer bottlenecks shown.)
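To make equation (1) concrete, the following is a minimal sketch of one conv2_x bottleneck block in Keras (which the paper uses); the helper name, input shape and functional-API layout are our own illustration rather than code from the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def bottleneck_block(x, filters=64, stride=1):
    # One bottleneck from equation (1): 1x1,64 -> 3x3,64 -> 1x1,256,
    # with a skip connection added before the final ReLU.
    out_channels = filters * 4  # 256 when filters = 64
    shortcut = x

    y = layers.Conv2D(filters, 1, strides=stride, use_bias=False)(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same", use_bias=False)(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(out_channels, 1, use_bias=False)(y)
    y = layers.BatchNormalization()(y)

    # Project the shortcut when its shape differs (first block of a stage).
    if stride != 1 or x.shape[-1] != out_channels:
        shortcut = layers.Conv2D(out_channels, 1, strides=stride,
                                 use_bias=False)(x)
        shortcut = layers.BatchNormalization()(shortcut)

    return layers.ReLU()(layers.Add()([y, shortcut]))

# conv2_x: the block of equation (1) repeated three times.
inputs = tf.keras.Input(shape=(56, 56, 64))
x = inputs
for _ in range(3):
    x = bottleneck_block(x)
conv2_x = tf.keras.Model(inputs, x)
```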

Fig. 4: The basic architecture of ResNet-50 (only 34 layers are shown for simplicity).

The ResNet-50 consists of five convolutional stages, namely conv1, conv2_x, conv3_x, conv4_x and conv5_x. Once the input image is loaded, it is passed through a convolutional layer (conv1) with 64 filters of kernel size 7 x 7 and stride 2, followed by a max pooling layer, also with a stride of 2. Next, the layers are grouped because of how residual networks are connected: the matrix shown in equation (1) means there are two layers of 1 x 1, 64 and 3 x 3, 64 filters respectively, and another layer of 1 x 1, 256 filters, repeated thrice, corresponding to the conv2_x stage. The process continues similarly until the fifth convolutional stage, after which average pooling is applied, followed by a fully connected layer and softmax for classification.

3.5. VGG-19

The VGG network is a tradename for the pre-trained CNN model proposed by Simonyan and Zisserman [30] in early 2014 at the University of Oxford, UK. VGG (Visual Geometry Group) was trained on the ImageNet ILSVRC dataset of 1.3 million images in 1000 classes, of which 100,000 images were used for testing and 50,000 for validation. VGG-19, a variant of the VGG architecture, has 19 deeply connected layers and has consistently achieved better performance compared with other state-of-the-art models. The model consists of highly connected convolutional and fully-connected layers, which enable better feature extraction, and uses max pooling (in place of average pooling) for downsampling prior to classification with a softmax activation function. The architecture of VGG-19 is shown in figure (5).

Fig. 5: VGG-19 architecture (CREDIT: [30]).
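As an illustration of this transfer-learning setup, the following is a minimal sketch assuming the Keras applications API; the pooling, dropout rate and single-unit sigmoid head are our own choices, since the paper does not spell out its exact classification head.

```python
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG19  # ResNet50 swaps in identically

# Pre-trained ImageNet backbone without its 1000-class head.
base = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze convolutional features before fine-tuning

x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dropout(0.2)                                 # dropout regularization
outputs = layers.Dense(1, activation="sigmoid")(x(base.output) if False else x(
    layers.GlobalAveragePooling2D()(base.output)))  # see note below
```

The snippet above is clearer written linearly, so the intended version is:

```python
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG19  # ResNet50 swaps in identically

base = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze convolutional features before fine-tuning

x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dropout(0.2)(x)                           # dropout regularization
outputs = layers.Dense(1, activation="sigmoid")(x)   # Normal vs Pneumonia
model = Model(base.input, outputs)

model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_gen, validation_data=val_gen, epochs=100)
```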
3.6. Iyke-Net: A CNN trained from Scratch

This section illustrates the design of the proposed CNN approach for identifying pneumonia cases from chest x-ray images, as well as the implementation details of the proposed CNN models. A simple CNN architecture built from the ground up is included in the proposal: CNN algorithms extract meaningful features using a series of convolutional layers and fully connected neural layers. The block diagram for our Iyke-Net training process is summarized in figure (6).

Fig. 6: Block diagram of the training process.

The first step was to load the input images from the directory, followed by data augmentation. In this study, two different pretrained models were trained and tested, and their performance was compared against a tuned convolutional neural network.

Fig. 7: Block diagram of Iyke-Net.

Proposed CNN model (Iyke-Net): The architecture of Iyke-Net, depicted in figure (7), mirrors that of VGG-11; 10 filters were used, followed by batch normalization, with dropout regularization added in between the convolutional layers of each block. The rectified linear unit (ReLU) is used as the activation function, with a fully connected (FC) layer and softmax for classification.

Input Layer: Since the architecture is like that of the VGG networks, the input image must be resized to the standard 224 x 224 pixels. This is needful because medical images such as x-rays, when taken from different devices, come in various sizes. For efficient preprocessing, the images must be resized to 224 x 224 x 3, depicting the width, height, and channel number (3 for RGB) respectively.

Convolutional Layer: This is the most important layer in our proposed CNN model, as it is where the majority of the computation is done. This layer's main job is to retrieve features from the image while keeping the spatial relationship between image pixels intact. This is accomplished by utilising a series of filters to learn the retrieved features. In this study, a two-dimensional convolution was performed using 10 filters, each built with a 7 x 7 filter size. The filters move along the input images, calculating the dot product, also known as the convolved features.

Batch Normalization Layer: Batch normalisation is a technique for training very deep neural networks that standardises each mini-batch of inputs to a layer. This stabilises the learning process and significantly reduces the number of training epochs needed to create deep networks. It is a transformation that keeps the mean output close to 0 and the standard deviation of the output close to 1; the hyperparameters epsilon = 0.001 and momentum = 0.99 were used for training our model.

ReLU Layer: This layer is responsible for replacing all negative values with zero while allowing positive numbers to assume their respective values from the convolved features, thereby introducing non-linearity in the feature map.

Fully Connected (FC) Layer: All the activations from the preceding layer are connected to the neurons in this layer. Its primary function in this study is to classify the returned convolved features from the dataset images into their respective classes.

SoftMax Layer: After the fully connected layers, a proper interpretation of the probabilities is needed, and that is the function of the SoftMax layer. It maps the values to between 0 and 1, i.e. 0% and 100%.

Output Layer: The final layer, consisting of the two classes (Normal and Pneumonia), is presented at this layer.
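Putting these layer descriptions together, a minimal sketch of an Iyke-Net-style model is shown below; the number of blocks and the dense-layer width are our assumptions, since the full design is given only in figure (7).

```python
from tensorflow.keras import Sequential, layers

model = Sequential([
    # Block 1: 10 filters of size 7 x 7, batch normalisation with the
    # stated hyperparameters, ReLU, and dropout between blocks.
    layers.Conv2D(10, (7, 7), padding="same", input_shape=(224, 224, 3)),
    layers.BatchNormalization(epsilon=0.001, momentum=0.99),
    layers.ReLU(),
    layers.MaxPooling2D(),
    layers.Dropout(0.2),  # varied over 20%, 30%, 45% and 55% in Section 4

    # Block 2: assumed repetition of the same pattern.
    layers.Conv2D(10, (7, 7), padding="same"),
    layers.BatchNormalization(epsilon=0.001, momentum=0.99),
    layers.ReLU(),
    layers.MaxPooling2D(),
    layers.Dropout(0.2),

    layers.Flatten(),
    layers.Dense(128, activation="relu"),   # fully connected (FC) layer
    layers.Dense(2, activation="softmax"),  # Normal vs Pneumonia
])
```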
ResNet-50 vs VGG-19: We conducted two (2) experiments using pre-trained models, as it is easier to fine-tune their parameters than those of a network trained from scratch; the results are presented in table (4).

Iyke-Net (Trained from Scratch): Subsequently, we trained from scratch a deep neural network called Iyke-Net, whose architecture is illustrated in figure (7), to check computational intensity and accuracy. As a standard approach, the input images were first resized to 224 x 224 with three (3) colour channels, and a learning rate of 0.001 was used. To improve the accuracy of our model, further experiments were performed, as detailed in figures (8), (9) and (10).
3.7. Performance Evaluation Metrics

To compare the performance of the various deep learning algorithms for classifying the prevalence of pneumonia from chest x-ray images, standard metrics were used. It is worth highlighting that the experiment's performance measures are based on an average of 3 simulated runs, consisting of 100, 200 and 400 epochs with a mini-batch size of 64. The accuracy, specificity (NPV), precision, recall (sensitivity) and F1-score employed in this investigation are defined in equations (2)-(6):

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2} \]

\[ \text{Specificity (NPV)} = \frac{TN}{TN + FP} \tag{3} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{4} \]

\[ \text{Recall (sensitivity)} = \frac{TP}{TP + FN} \tag{5} \]

\[ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6} \]

where TP, TN, FP and FN denote the true positives, true negatives, false positives and false negatives, with the pneumonia class (pneu) taken as the positive class.
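A minimal NumPy sketch of equations (2)-(6) follows; the helper name and the binary label encoding (pneumonia = 1) are our own conventions.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Confusion-matrix metrics of equations (2)-(6), treating the
    pneumonia (pneu) class as the positive label 1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))

    accuracy    = (tp + tn) / (tp + tn + fp + fn)   # eq. (2)
    specificity = tn / (tn + fp)                    # eq. (3)
    precision   = tp / (tp + fp)                    # eq. (4)
    recall      = tp / (tp + fn)                    # eq. (5), sensitivity
    f1          = 2 * precision * recall / (precision + recall)  # eq. (6)
    return accuracy, specificity, precision, recall, f1
```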
4. Results and Discussion

The results were based on two different approaches. First, we conducted experiments using the pre-trained VGG-19 and ResNet-50 models to discriminate pneumonia images from normal chest x-ray images taken from the anterior-to-posterior (AP/PA) view at high resolution. The choice of ResNet-50 over ResNet-101 was to compensate for the limited resources at our disposal. As a standard technique, each image was reduced to a smaller size and then passed into the convolutional neural network for classification. The validation accuracy of our model was slightly higher when compared to other conventional approaches due to the efficiency of using pretrained models.

Second, we created a model to detect and categorise pneumonia from frontal chest images. The technique starts by reducing the input size of the chest X-ray images; the convolutional neural network, which extracts features from the images and classifies them, is then used for identification in the next stage. We experimented on training from scratch by starting with 20 epochs and increasing to 100 epochs at a fixed batch size of 64.

Table (4) shows that our proposed CNN approach works effectively on both the normal and pneumonia X-ray images utilised in our study. It is quite interesting to achieve 93.6% accuracy and 92.03% recall using our approach. On the other hand, the pretrained VGG-19 and ResNet-50 perform admirably on X-ray images, correctly identifying the pneumonia images with 97.3% accuracy and 99.2% recall on typical X-ray images.

Table 3: Model accuracy under various epochs

Epochs   Training Accuracy (%)   Validation Accuracy (%)
10       65.02                   45.34
100      81.24                   79.02
200      83.97                   83.91
300      93.6                    93.55
400      88.60                   88.68

It can be seen from table (3) that the accuracy was initially below average, which led us to conduct more experiments. Data augmentation was done and the number of epochs was increased to 100 while keeping all other hyperparameters constant. We observed improvements in the performance of the baseline model, highlighting the possibility of higher accuracy as we fine-tune the model. The model shown in figure (8) trained in 02Hours:31Mins:05Secs with a validation accuracy of 79.02%.

Fig. 8: Cross entropy loss and classification accuracy curves for 100 epochs.

Furthermore, the model was fine-tuned by adding dropout layers in between the convolutional layers. We started by dropping out 20% of the units at random during training. Variations of dropout (30%, 45%, and 55%) were used alongside increasing the epochs to 200 while keeping the other parameters constant; a validation accuracy of 83.91% was obtained, as shown in figure (9), with the model training in 05Hours:06Mins:11Secs. Although a slight improvement in accuracy was recorded, it came at the cost of resources.

Fig. 9: Cross entropy loss and classification accuracy curves for 200 epochs.

Finally, we tried tuning the hyperparameters: the learning rate was changed from 0.001 to 0.0001 and the epochs were increased to 400, which took 16Hours:43Mins:11Secs to train, with an accuracy of 88.68% as depicted in figure (10).

Fig. 10: Cross entropy loss and classification accuracy curves for 400 epochs.
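For concreteness, one run of this schedule could be expressed as follows; the Adam optimizer is an assumption (the paper does not name its optimizer), while the learning rate and epoch count follow the reported settings.

```python
from tensorflow.keras.optimizers import Adam

# model, train_gen and val_gen are taken from the earlier sketches;
# the batch size of 64 is set on the data generator itself.
model.compile(optimizer=Adam(learning_rate=1e-4),  # reduced from 1e-3
              loss="binary_crossentropy",
              metrics=["accuracy"])
history = model.fit(train_gen, validation_data=val_gen, epochs=400)
```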

It can be observed that the model did not improve beyond 300 epochs, suggesting that we may have overtrained it. We therefore retrained the model, decreasing the number of epochs to 300 and the learning rate to 0.00001. This model trained in 13Hours:05Mins:33Secs and yielded an accuracy of 93.6%, better than with 400 epochs. In summary, the performance accuracy of each model is presented in Table (4).

Table 4: Accuracy of VGG-19, ResNet-50 and Iyke-Net.

Metrics                 VGG-19   ResNet-50   Iyke-Net
Accuracy (%)            97.3     96.2        93.60
Specificity (NPV) (%)   97.2     96.4        91.66
Precision (%)           96.7     95.3        91.30
Recall (%)              99.2     98.4        92.03

Training Results: Our findings show that pretrained models such as VGG-19 and ResNet-50, trained on the same datasets, gave better results compared to training from scratch. Further, to validate the performance of our approach, a comparative analysis was done against state-of-the-art models.

Validation Results: The performance of our model compared against existing state-of-the-art (SOTA) approaches, which in most cases serve as the "ground truth", highlighted the fact that Iyke-Net (our network trained from scratch) outperforms most robust pre-trained models, as shown in table (5).

Table 5: Comparison with state-of-the-art models.

Method                             Models                Accuracy (%)   F1-Score (%)   Precision (%)   Recall (%)
Rajpurkar et al. (2017) [31]       CheXNet               85             95             -               -
Loey et al. (2020) [32]            ResNet-50             84.5           -              -               0.10
Apostolopoulos et al. (2020) [33]  VGG-19                93.5           -              -               86.0
Proposed method                    Proposed (Iyke-Net)   93.60          91.66          91.30           92.03

Note: Bold numbers indicate best performance metrics.

It was observed that CheXNet [31] gave an accuracy of 85% for positive pneumonia cases with an F1-score of 95%, a very robust approach considering the size of the dataset used (112,120 frontal-view chest X-ray images), although the authors did not report precision and recall. The work of Loey et al. [32] yielded an accuracy lower than our model's despite using only 158 images of pneumonia patients. A similar approach was investigated by Apostolopoulos et al. [33], with an accuracy of 93.50% and no reported F1-score or precision. Their work on the classification of COVID-19, pneumonia and lung cancer from chest x-ray images using pre-trained networks validates our finding that a pre-trained model with adequate fine-tuning yields performance similar to a CNN trained from scratch.
5. Conclusion

The advances in transfer learning notwithstanding, we demonstrated the possibility of training a deep neural network on a "not-so-powerful" laptop PC with comparable accuracy. Our approach was able to discriminate the salient features that helped in the classification of pneumonia from chest x-ray images. We compared the performance of two pre-trained networks against a deep neural network trained from scratch by fine-tuning appropriate hyper-parameters. Our examinations showed that our model is able to generalize to an unseen dataset, a positive step towards building a robust computer-aided diagnostic tool. However, our experiments proved that it is computationally intensive to achieve state-of-the-art accuracy by training a deep neural network from scratch, as shown in Table (5). Our findings highlight that it is expensive to guarantee efficient generalization when training a deep neural network from scratch.

Moreover, we used only chest x-ray images as a benchmark, where CT scan images could also have been used, and only binary classification (pneumonia vs normal) was studied. This work will be extended to detect and classify all the families of COPDs from both chest X-ray and CT images, suggesting that deep learning could be useful in the diagnosis of chronic obstructive pulmonary diseases.

References

[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[2] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[3] J. Duan et al., "Automatic 3D Bi-Ventricular Segmentation of Cardiac Images by a Shape-Refined Multi-Task Deep Learning Approach," IEEE Trans. Med. Imaging, vol. 38, no. 9, pp. 2151–2164, 2019.
[4] H. Bogunovic et al., "RETOUCH: The Retinal OCT Fluid Detection and Segmentation Benchmark and Challenge," IEEE Trans. Med. Imaging, vol. 38, no. 8, pp. 1858–1874, 2019.
[5] O. Oktay et al., "Anatomically constrained neural networks (ACNN): Application to cardiac image enhancement and segmentation," IEEE Trans. Med. Imaging, vol. 37, no. 2, pp. 384–395, 2018.
[6] V. K. Singh, "Segmentation and Classification of Multimodal Medical Images based on Generative Adversarial Learning and Convolutional Neural Networks," 2019.
[7] D. Abdelhafiz, C. Yang, R. Ammar, and S. Nabavi, "Deep convolutional neural networks for mammography: Advances, challenges and applications," BMC Bioinformatics, vol. 20, suppl. 11, 2019.
[8] Y. Song, M. N. A. Rana, J. Qu, and C. Liu, "A Survey of Deep Learning Based Methods in Medical Image Processing," Curr. Signal Transduct. Ther., vol. 15, pp. 1–14, 2019.
[9] V. A. dos Santos et al., "CorneaNet: fast segmentation of cornea OCT scans of healthy and keratoconic eyes using deep learning," Biomed. Opt. Express, vol. 10, no. 2, p. 622, 2019.
[10] J. Xing et al., "Lesion Segmentation in Ultrasound Using Semi-pixel-wise Cycle Generative Adversarial Nets," IEEE/ACM Trans. Comput. Biol. Bioinform., pp. 1–1, 2020.
[11] J. Irvin et al., "CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison," in Proc. 33rd AAAI Conf. Artif. Intell. (AAAI 2019), pp. 590–597, 2019.
[12] W. Zhang, K. Doi, M. L. Giger, Y. Wu, R. M. Nishikawa, and R. A. Schmidt, "Computerized Detection of Clustered Microcalcifications in Digital Mammograms Using a Shift-Invariant Artificial Neural Network," Med. Phys., vol. 21, no. 4, pp. 517–524, 1994.
[13] H. P. Chan, S. C. B. Lo, B. Sahiner, K. L. Lam, and M. A. Helvie, "Computer-aided detection of mammographic microcalcifications: Pattern recognition with an artificial neural network," Med. Phys., vol. 22, no. 10, pp. 1555–1567, 1995.
[14] M. N. Saad, Z. Muda, N. S. Ashaari, and H. A. Hamid, "Image segmentation for lung region in chest X-ray images using edge detection and morphology," in Proc. 4th IEEE Int. Conf. Control Syst. Comput. Eng. (ICCSCE 2014), pp. 46–51, 2014.
[15] R. Noviana, F. Febriani, I. Rasal, and E. U. C. Lubis, "Axial segmentation of lungs CT scan images using canny method and morphological operation," AIP Conf. Proc., vol. 1867, 2017.
[16] A. Zotin, Y. Hamad, K. Simonov, and M. Kurako, "Lung boundary detection for chest X-ray images classification based on GLCM and probabilistic neural networks," Procedia Comput. Sci., vol. 159, pp. 1439–1448, 2019.
[17] N. Reamaroon et al., "Robust segmentation of lung in chest x-ray: applications in analysis of acute respiratory distress syndrome," BMC Med. Imaging, vol. 20, no. 1, pp. 1–13, 2020.
[18] F. Munawar, S. Azmat, T. Iqbal, C. Gronlund, and H. Ali, "Segmentation of Lungs in Chest X-Ray Image Using Generative Adversarial Networks," IEEE Access, vol. 8, pp. 153535–153545, 2020.
[19] M. Kim and B. D. Lee, "Automatic lung segmentation on chest x-rays using self-attention deep neural network," Sensors, vol. 21, no. 2, pp. 1–12, 2021.
[20] O. Stephen, M. Sain, U. J. Maduh, and D. U. Jeong, "An Efficient Deep Learning Approach to Pneumonia Classification in Healthcare," J. Healthc. Eng., vol. 2019, 2019.
[21] J. Peng, C. Chen, M. Zhou, X. Xie, Y. Zhou, and C. H. Luo, "A Machine-learning Approach to Forecast Aggravation Risk in Patients with Acute Exacerbation of Chronic Obstructive Pulmonary Disease with Clinical Indicators," Sci. Rep., vol. 10, no. 1, pp. 1–9, 2020.
[22] A. A. Ardakani, A. R. Kanafi, U. R. Acharya, N. Khadem, and A. Mohammadi, "Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks," Comput. Biol. Med., vol. 121, p. 103795, 2020.
[23] S. Wang et al., "A deep learning algorithm using CT images to screen for Corona virus disease (COVID-19)," Eur. Radiol., pp. 1–19, 2021.
[24] H. T. Zhang et al., "Automated detection and quantification of COVID-19 pneumonia: CT imaging analysis by a deep learning-based software," Eur. J. Nucl. Med. Mol. Imaging, vol. 47, no. 11, pp. 2525–2532, 2020.
[25] A. Shoeibi et al., "Automated Detection and Forecasting of COVID-19 using Deep Learning Techniques: A Review," 2020.
[26] M. E. H. Chowdhury et al., "Can AI Help in Screening Viral and COVID-19 Pneumonia?," IEEE Access, vol. 8, pp. 132665–132676, 2020.
[27] S. Minaee, R. Kafieh, M. Sonka, S. Yazdani, and G. Jamalipour Soufi, "Deep-COVID: Predicting COVID-19 from chest X-ray images using deep transfer learning," Med. Image Anal., vol. 65, pp. 1–9, 2020.
[28] H. Maghdid, A. T. Asaad, K. Z. Ghafoor, A. S. Sadiq, S. Mirjalili, and M. K. Khan, "Diagnosing COVID-19 pneumonia from x-ray and CT images using deep learning and transfer learning algorithms," p. 26, 2021.
[29] Q. Ji, J. Huang, W. He, and Y. Sun, "Optimized deep convolutional neural networks for identification of macular diseases from optical coherence tomography images," Algorithms, vol. 12, no. 3, pp. 1–12, 2019.
[30] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in Proc. 3rd Int. Conf. Learn. Represent. (ICLR 2015), pp. 1–14, 2015.
[31] P. Rajpurkar et al., "CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning," arXiv, pp. 3–9, 2017.
[32] M. Loey, F. Smarandache, and N. E. M. Khalifa, "Within the lack of chest COVID-19 X-ray dataset: A novel detection model based on GAN and deep transfer learning," Symmetry, vol. 12, no. 4, 2020.
[33] I. D. Apostolopoulos and T. A. Mpesiana, "Covid-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks," Phys. Eng. Sci. Med., vol. 43, no. 2, pp. 635–640, 2020.