CHAPTER-1
INTRODUCTION
In recent years, the advent of Deep Learning has revolutionized medical diagnostics,
particularly in the realm of pneumonia detection. Leveraging Convolutional Neural Networks
(CNNs) and transfer learning techniques, researchers have developed robust predictive
models capable of analyzing CT scans with unprecedented efficiency. These models operate
by extracting salient features from CT images through convolutional layers, subsequently
employing dense networks for classification. The primary objective of these models is to
discern whether a given CT scan indicates the presence of pneumonia or not, thereby
facilitating prompt and accurate diagnosis.
In this context, this project aims to explore the application of Deep Learning methodologies
in pneumonia detection from CT scans. Through the utilization of CNNs and transfer
learning, we endeavour to construct predictive models that exhibit superior performance in
distinguishing pneumonia-affected CT scans from healthy ones. By harnessing the power of
advanced computational techniques, we aspire to contribute to the advancement of medical
diagnostics, ultimately improving patient outcomes and mitigating the burden of pneumonia
on global healthcare systems.
CHAPTER-2
LITERATURE SURVEY
2.1. Pneumonia detection in chest X-ray images using an ensemble of deep learning
models:
Pneumonia detection using chest X-ray images has been a subject of extensive research in
recent years, driven by the need for accurate and efficient diagnostic tools to combat this
prevalent respiratory infection. The literature survey on this topic encompasses various
studies focusing on different aspects of pneumonia detection, including image analysis
techniques, deep learning models, and ensemble methods. Some of the key findings and
approaches in the existing literature are:[2]
Traditional image processing techniques have been utilized for pneumonia detection
in chest X-ray images, including thresholding, edge detection, and texture analysis.
However, these methods often lack robustness and may struggle to handle the
complexities and variations present in medical images.
Transfer learning has been extensively employed to address the challenge of limited
training data in medical image analysis tasks. Pre-trained CNN models trained on
large-scale image datasets (e.g., ImageNet) are fine-tuned on pneumonia-specific
datasets to leverage learned features and improve model generalization.
2.2. A Deep Learning based model for the Detection of Pneumonia from Chest X-Ray
Images using VGG-16 and Neural Networks:
Pneumonia is an infection of the lungs that affects a significant proportion of individuals, especially
in developing and low-income countries where pollution, overcrowding, and unsanitary
living conditions are widespread, along with a lack of healthcare infrastructure.
Pneumonia can produce pleural effusion, a condition in which fluid fills the space around the lungs and
causes breathing difficulty. Recognizing the presence of pneumonia quickly is difficult but necessary in
order to receive treatment and improve the chances of survival. Deep learning is a field of
artificial intelligence that is used in the successful development of prediction models. There
are various ways of detecting pneumonia such as CT-scan, pulse oximetry, and many more
among which the most common way is X-ray tomography. On the other hand, examining
chest X-rays (CXR) is a tough process susceptible to subjective variability. In this work, a
deep learning (DL) model using VGG16 is utilized for detecting and classifying pneumonia
using two CXR image datasets. The VGG16 with Neural Networks (NN) provides an
accuracy of 92.15%, a recall of 0.9308, a precision of 0.9428, and an F1-score of 0.937 for the
first dataset. Furthermore, the experiment using NN with VGG16 has been performed on
another CXR dataset containing 6,436 images of pneumonia, normal and covid-19. The
results for the second dataset provide accuracy, recall, precision, and F1-score as 95.4%,
0.954, 0.954, and 0.954, respectively. The research outcome exhibits that VGG16 with NN
provides better performance than VGG16 with Support Vector Machine (SVM), VGG16 with
K-Nearest Neighbor (KNN), VGG16 with Random Forest (RF), and VGG16 with Naïve
Bayes (NB) for both datasets.[3]
CHAPTER-3
EXISTING SYSTEM
In this system, they developed a computer-aided diagnosis system for automatic pneumonia
detection using chest X-ray images. They employed deep transfer learning to handle the
scarcity of available data and designed an ensemble of three convolutional neural network
models: GoogLeNet, ResNet-18, and DenseNet-121. A weighted average ensemble technique
was adopted, wherein the weights assigned to the base learners were determined using a
novel approach.
The scores of four standard evaluation metrics, precision, recall, f1-score, and the area under
the curve, are fused to form the weight vector, which in studies in the literature was
frequently set experimentally, a method that is prone to error. The existing approach was
evaluated on two publicly available pneumonia X-ray datasets, provided by Kermany et al.
and the Radiological Society of North America (RSNA), respectively, using a five-fold cross-
validation scheme. The existing system achieved accuracy rates of 98.81% and 86.85% and
sensitivity rates of 98.80% and 87.02% on the Kermany and RSNA datasets, respectively.
Ensemble learning is a popular strategy in which the decisions of multiple classifiers are
fused to obtain the final prediction for a test sample. It is performed to capture the
discriminative information from all the base classifiers, and thus, results in more accurate
predictions. Some of the ensemble techniques that were most frequently used in studies in the
literature are average probability, weighted average probability, and majority voting. The
average probability-based ensemble assigns equal priority to each constituent base learner.
However, for a particular problem, a certain base classifier may be able to capture
information better than others. Thus, a more effective strategy is to assign weights to all the
base classifiers. However, for ensuring the enhanced performance of the ensemble, the value
of the weights assigned to each classifier is the most essential factor. Most approaches set this
value based on experimental results. In this study, we devised a novel strategy for weight
allocation, where four evaluation metrics, precision, recall, f1-score, and area under receiver
operating characteristics (ROC) curve (AUC), were used to assign the optimal weight to three
base CNN models, GoogLeNet, ResNet-18, and DenseNet-121. In studies in the literature, in
general, only the classification accuracy was considered for assigning weights to the base
learners, which may be an inadequate measure, in particular when the datasets are
class-imbalanced. Other metrics may provide better information for prioritizing the base learners.
Early detection of pneumonia is crucial for determining the appropriate treatment of the
disease and preventing it from threatening the patient’s life. Chest radiographs are the most
widely used tool for diagnosing pneumonia; however, they are subject to inter-class
variability and the diagnosis depends on the clinicians’ expertise in detecting early
pneumonia traces. To assist medical practitioners, an automated CAD system was developed
in this study, which uses deep transfer learning-based classification to classify chest X-ray
images into two classes “Pneumonia” and “Normal.” An ensemble framework was developed
that considers the decision scores obtained from three CNN models, GoogLeNet, ResNet-18,
and DenseNet-121, to form a weighted average ensemble. The weights assigned to the
classifiers were calculated using a novel strategy wherein four evaluation metrics, precision,
recall, f1-score, and AUC, were fused using the hyperbolic tangent function. The framework,
evaluated on two publicly available pneumonia chest X-ray datasets, obtained an accuracy
rate of 98.81%, a sensitivity rate of 98.80%, a precision rate of 98.82%, and an f1-score of
98.79% on the Kermany dataset and an accuracy rate of 86.86%, a sensitivity rate of 87.02%,
a precision rate of 86.89%, and an f1-score of 86.95% on the RSNA challenge dataset, using
a five-fold cross-validation scheme. It outperformed state-of-the-art methods on these two
datasets. Statistical analyses of the proposed model using McNemar’s and ANOVA tests
indicate the viability of the approach. Furthermore, the proposed ensemble model is
domain-independent and thus can be applied to a large variety of computer vision tasks.
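The weighting scheme described above can be illustrated with a short sketch. It assumes that each base learner's weight is the sum of the hyperbolic tangent of its four metric scores, and that the final score is the weight-normalised average of the models' class probabilities. The text names the tanh fusion but does not give the exact formula, so this form, the metric values, and the function names are illustrative assumptions.

```python
import math

def tanh_weight(metrics):
    """Fuse precision, recall, f1-score and AUC (each in [0, 1]) into one weight."""
    return sum(math.tanh(m) for m in metrics)

def weighted_average(probabilities, weights):
    """Combine per-model class probabilities using normalised weights."""
    total = sum(weights)
    n_classes = len(probabilities[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]

# Hypothetical validation metrics (precision, recall, f1, AUC) for three base learners.
model_metrics = {
    "GoogLeNet": (0.95, 0.94, 0.945, 0.97),
    "ResNet-18": (0.97, 0.96, 0.965, 0.98),
    "DenseNet-121": (0.96, 0.97, 0.965, 0.99),
}
weights = [tanh_weight(m) for m in model_metrics.values()]

# Hypothetical [P(Normal), P(Pneumonia)] outputs of the three models for one X-ray.
probs = [[0.30, 0.70], [0.10, 0.90], [0.45, 0.55]]
fused = weighted_average(probs, weights)
label = "Pneumonia" if fused[1] > fused[0] else "Normal"
print(label, [round(p, 3) for p in fused])
```

Because the weights are normalised, the fused scores still sum to one and can be read as class probabilities.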
3.1. Advantages:
3.2. Disadvantages:
Limited Sensitivity and Specificity: While X-rays can detect structural abnormalities
in the lungs, they may lack sensitivity and specificity in distinguishing between
different types of pulmonary conditions. This can lead to false positives or false
negatives in pneumonia diagnosis, potentially resulting in misdiagnosis or delayed
treatment.
Radiation Exposure: X-ray imaging involves exposure to ionizing radiation, albeit at
relatively low levels. While the risk of radiation-induced harm from a single X-ray is
minimal, repeated exposure over time may increase the cumulative risk, particularly
in vulnerable populations such as children and pregnant women.
False Positives and Negatives: Like any diagnostic tool, chest X-rays are not
infallible. False positives (indicating pneumonia when it is not present) and false
negatives (missing pneumonia when it is present) can occur. This highlights the
importance of considering clinical symptoms and other diagnostic information.
CHAPTER-4
PROPOSED SYSTEM
Building a web application that takes CT scan images of lungs as input and predicts whether
pneumonia is present or not.
4.2. Objective:
To overcome the limitations of the existing system, in this study we developed a
computer-aided diagnosis system for automatic pneumonia detection using CT scan images. We
employed deep transfer learning to handle the scarcity of available data and trained
seven convolutional neural network models: EfficientNetV2B1, InceptionV3,
VGG16, DenseNet121, ResNet50V2, MobileNetV2, and a custom model.
The proposed system takes the three CNN models with the highest accuracy among
these and combines them in a bagging-style ensemble that averages their outputs. Higher
diagnostic accuracy supports earlier treatment and can thereby help reduce deaths from pneumonia.
The proposed approach was evaluated on a publicly available pneumonia CT scan dataset
hosted on Kaggle. Few such datasets are available because patients' medical data are
generally not disclosed for privacy reasons.
4.3. Models:
4.3.1 EfficientNetV2B1:
EfficientNetV2 is an extension of the EfficientNet architecture, designed to achieve
improved performance with fewer parameters and computational resources.
It builds on the compound scaling method introduced with EfficientNet, which scales network
width, depth, and resolution together using a set of fixed scaling coefficients.
4.3.2. InceptionV3:
It is known for its Inception modules, which consist of multiple parallel convolutional
branches with different filter sizes.
4.3.3. VGG16:
4.3.4. DenseNet121:
4.3.5. ResNet50V2:
4.3.6. MobileNetV2:
4.3.7. Custom Model:
This custom model is a convolutional neural network (CNN) designed for binary image
classification. Its architecture is summarized below:
Input Rescaling: The input images are rescaled so that the pixel values range from 0
to 1.
Convolutional Layers: The model consists of three convolutional layers, each
followed by a rectified linear unit (ReLU) activation function. These layers use a 3x3
kernel to extract features from the input images.
Max Pooling Layers: After each convolutional layer, a max-pooling layer is applied
to reduce the spatial dimensions of the feature maps and capture the most important
information.
Flatten Layer: A flatten layer is used to convert the 2D feature maps into a 1D
vector, which can be fed into a fully connected neural network.
Fully Connected Layers: There are two dense (fully connected) layers in the model.
The first dense layer consists of 128 neurons with ReLU activation, allowing the
model to learn complex patterns from the flattened feature maps. The second dense
layer has a single neuron with a sigmoid activation function, which produces the final
binary classification output.
Compilation: The model is compiled using the Adam optimizer and binary cross-
entropy loss function. It is optimized to classify binary labels, and accuracy is used as
the evaluation metric.
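To make the architecture concrete, the sketch below traces the feature-map size through the three convolution and pooling stages. It assumes a 180x180 input, 32 filters per convolutional layer, unpadded ("valid") stride-1 convolutions and 2x2 pooling with stride 2, which are the Keras defaults. The helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """Output width of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of a non-overlapping max-pooling layer (floor division)."""
    return size // pool

size = 180      # input height/width assumed for the custom model
filters = 32    # assumed number of filters in every convolutional layer
for stage in range(1, 4):
    size = pool_out(conv_out(size))
    print(f"after conv+pool stage {stage}: {size}x{size}x{filters}")

flat = size * size * filters        # length of the flattened vector
dense_params = flat * 128 + 128     # weights + biases of the 128-unit dense layer
print("flattened features:", flat)
print("parameters in first dense layer:", dense_params)
```

Under these assumptions the spatial size shrinks from 180 to 89, 43 and finally 20, so most of the model's parameters sit in the first dense layer rather than in the convolutional layers.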
Feature Extraction: The system should employ convolutional neural network (CNN)
models for feature extraction from CT scan images. Transfer learning techniques
should be used to leverage pre-trained models for better performance.
Classification: Extracted features should be fed into dense neural networks for
classification, distinguishing between pneumonia-affected and unaffected CT scans.
Model Evaluation: The system should evaluate the performance of trained CNN
models using appropriate metrics such as accuracy, precision, recall, and F1 score to
assess their effectiveness in pneumonia detection.
Security: The system should adhere to strict security standards to protect patient data
and ensure confidentiality, integrity, and availability.
Usability: The system should have an intuitive user interface that is easy to navigate,
facilitating seamless interaction for healthcare professionals.
Robustness: The system should be resilient to variations in input data quality, noise,
and artifacts commonly encountered in medical imaging.
Compliance: The system should comply with relevant regulations and standards for
medical devices and software, ensuring legal and ethical compliance in healthcare
settings.
CHAPTER-6
DESIGN AND METHODOLOGY
UML diagrams are visual representations used to design and model software systems.
They are standardized diagrams used in software engineering to communicate design
decisions and system architecture.
UML diagrams can help developers, stakeholders, and team members understand the
structure and behaviour of a system before it is implemented.
Use case diagrams illustrate the interactions between users (actors) and a system to
accomplish specific tasks or goals. They help to identify and define the functional
requirements of a system from the user's perspective.
Sequence diagrams visualize the interactions between objects in a particular scenario or use
case. They show the sequence of messages exchanged between objects over time, helping to
understand the dynamic behaviour of the system.
Activity diagrams represent the flow of control within a system, showing the sequence of
activities and decision points. They are useful for modeling business processes, workflow,
and the logic of complex operations.
Class diagrams depict the static structure of a system by showing classes, their attributes,
methods, and relationships between classes. They provide a blueprint for the implementation
of the system's objects and their interactions.
Training: Train the CNN models on the preprocessed CT scan dataset using
supervised learning. Use techniques like cross-validation and hyperparameter tuning
to optimize model performance.
Evaluation: Evaluate the trained models using metrics such as accuracy, precision,
recall, and F1-score on a separate test set. Perform error analysis to identify areas for
improvement.
Deployment: Deploy the trained models in a clinical setting for pneumonia detection
from CT scans. Ensure seamless integration with existing healthcare systems and
compliance with regulatory standards.
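The cross-validation mentioned in the training step can be sketched as a plain k-fold index split. This is a minimal illustration; the fold count of five, the sample count and the function name are assumptions rather than details taken from the project.

```python
import random

def k_fold_indices(n_samples, k=5, seed=42):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)            # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]       # k nearly equal folds
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val

splits = list(k_fold_indices(n_samples=23, k=5))
for fold_no, (train, val) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train)} training, {len(val)} validation samples")
```

Each sample appears in exactly one validation fold, so every scan is used for validation once and for training in the remaining folds.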
CHAPTER-7
IMPLEMENTATION
Sample Code:
#importing Libraries
import tensorflow as tf
import cv2
import os
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import random
def load_images_from_folder(folder):
    # Each sub-folder name is used as the class label of the images it contains
    images = []
    labels = []
    for classes in os.listdir(folder):
        count = 0
        for filename in os.listdir(folder+'/'+classes):
            img = cv2.imread(os.path.join(folder,classes,filename))
            if img is not None:
                images.append(img)
                labels.append(classes)
                count+=1
        print("no of images in",classes,"is",count)
    return images,labels
imgs,labels = load_images_from_folder('G:/Engg/Project/CTDataset')
data = {'images':imgs,'labels':labels}
dataF = pd.DataFrame(data)
The 3-channel RGB images are then converted to grayscale using cv2.
The original images are 512 x 512. They are resized to 160 x 160 for the transfer learning
models and to 180 x 180 for the custom model.
Fig 7.2. Gray scale conversion and image resizing: (512, 512, 3) to (180, 180, 1) for the custom model and (160, 160, 1) for the transfer learning models
7.1.3. Balancing:
As the dataset is imbalanced, we balanced it by applying data augmentation to the class with
fewer samples (Normal CTs) and under-sampling to the class with more samples
(Pneumonic CTs).
Fig. 7.3.1 Class distribution diagram Before Oversampling and Down sampling
Fig. 7.3.2 Class distribution diagram After Oversampling and Down sampling
size+=1
i+=1
def DownSampling(data,maj_class,size,target_size):
    # Randomly drop majority-class rows until only target_size of them remain
    data_maj_indexes = list(data[data['labels']==maj_class].index)
    while(target_size<size):
        t = random.randint(0, size-1)
        i = data_maj_indexes[t]
        data.drop(i,inplace=True)
        data_maj_indexes.pop(t)
        size-=1
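For the minority class, over-sampling by augmentation could look like the sketch below. The source does not state which augmentations were applied, so horizontal flipping is used purely as an assumed example, and `oversample_with_flips` is a hypothetical helper name.

```python
import random
import numpy as np

def oversample_with_flips(images, target_size, seed=0):
    """Grow a list of image arrays to target_size by appending flipped copies."""
    rng = random.Random(seed)                 # seeded for reproducible sampling
    augmented = list(images)
    while len(augmented) < target_size:
        source = rng.choice(images)           # pick an original minority-class image
        augmented.append(np.fliplr(source))   # mirror it left-to-right
    return augmented

# Three tiny arrays stand in for minority-class (Normal) scans.
normal_scans = [np.arange(6).reshape(2, 3) + k for k in range(3)]
balanced = oversample_with_flips(normal_scans, target_size=5)
print(len(balanced))
```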
The dataset is split in a 4:1 ratio: 80% of the data for training and 20% for validation.
The test set is stored in a separate directory and is about 10% of the size of the training set.
train_ds = tf.keras.utils.image_dataset_from_directory(
batch_size=batch_size)
val_ds = tf.keras.utils.image_dataset_from_directory(
Transfer learning using Keras involves leveraging pre-trained models to address new tasks
efficiently. With Keras, one can easily import popular architectures like VGG, ResNet, or
Inception, which are trained on massive datasets like ImageNet. By removing the top layers
and adding new ones tailored to the specific task at hand, such as image classification or
object detection, one can retrain the model on a smaller dataset. This approach significantly
reduces training time and resource requirements while often achieving impressive
performance, making Keras transfer learning models a go-to choice for many machine
learning practitioners.
Out of the 27 Keras transfer learning models considered, we selected the following six
based on accuracy and a reasonable number of parameters:
1. ResNet50V2
2. EfficientNetV2B1
3. VGG16
4. DenseNet121
5. MobileNetV2
6. InceptionV3
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1./255),  # scale pixel values to [0, 1]
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')  # outputs a probability
])
model.compile(
    optimizer='adam',
    # The last layer already applies a sigmoid, so the loss receives probabilities, not logits
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=['accuracy'])
model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=20
)
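Transfer Learning:
# Each base model is loaded the same way; the classification head below is
# attached to whichever base_model was assigned last.
#Resnet50V2
base_model = tf.keras.applications.ResNet50V2(input_shape=(160,160,3),
                                              include_top=False,
                                              weights='imagenet')
#EfficientNetV2B1
base_model = tf.keras.applications.EfficientNetV2B1(input_shape=(160,160,3),
                                                    include_top=False,
                                                    weights='imagenet')
base_model.trainable = False  # freeze the pre-trained weights
inputs = tf.keras.Input(shape=(160,160,3))
# Use the preprocessing function that matches the chosen base model
# (e.g. tf.keras.applications.resnet_v2.preprocess_input for ResNet50V2)
x = tf.keras.applications.efficientnet_v2.preprocess_input(inputs)
x = base_model(x, training=False)  # keep the base in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x) # Average pooling
x = tf.keras.layers.BatchNormalization()(x) # Introduce batch norm
x = tf.keras.layers.Dropout(0.2)(x) # Regularize with dropout
# Assumed head: a single sigmoid unit, consistent with from_logits=False below
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs, outputs)
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=['accuracy'])
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=20
)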
CHAPTER-8
TESTING
While training the models, we held out 20% of the training set as validation data to check
each model's performance on unseen data.
The training accuracies are lower than the validation accuracies because dropout and the other
regularization techniques used to prevent overfitting are active only during training.
1. Custom model
2. EfficientNetV2B1
3. ResNet50V2
4. DenseNet121
5. VGG16
6. MobileNetV2
7. InceptionV3
Out of the seven models, the custom model, ResNet50V2, and EfficientNetV2B1 achieved the
best accuracy.
Sample Code:
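acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.figure(figsize=(8, 8))
# Top panel: accuracy per epoch
plt.subplot(2, 1, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.ylabel('Accuracy')
plt.ylim([min(plt.ylim()),1])
# Bottom panel: loss per epoch
plt.subplot(2, 1, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.ylabel('Cross Entropy')
plt.ylim([0,1.0])
plt.xlabel('epoch')
plt.show()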
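def preprocess_image(image_path, target_size=(180, 180)):
    # target_size must match the size the model was trained on
    # (180x180 for the custom model, 160x160 for the transfer learning models)
    img = cv2.imread(image_path)
    img = cv2.resize(img, target_size)
    return np.expand_dims(img, axis=0)  # add the batch axis expected by predict()
def predict_single_image(image_path):
    img = preprocess_image(image_path)
    prediction = model2.predict(img)
    return prediction
def evaluate_folder(folder_path):
    # Run the model on every file in the folder
    predictions = []
    for filename in os.listdir(folder_path):
        image_path = os.path.join(folder_path, filename)
        prediction = predict_single_image(image_path)
        predictions.append(prediction)
    return predictions
test_folder_normal = 'G:/Engg/Project/Test/Normal'
test_folder_pneumonic = 'G:/Engg/Project/Test/Pneumonic'
predictions_normal = evaluate_folder(test_folder_normal)
predictions_pneumonic = evaluate_folder(test_folder_pneumonic)
predicted_labels = []
predicted_labels.extend(predictions_normal)
predicted_labels.extend(predictions_pneumonic)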
8.1.1. ResNet50V2:
During training, the ResNet50V2 model reached an accuracy of 96% on the training set and 98% on the validation set.
8.1.2. EfficientNetV2B1:
The EfficientNetV2B1 model reached an accuracy of 97.41% on the training set and 99% on the validation set.
The custom model gave the top performance, with 99.3% training accuracy and 99.4%
validation accuracy.
8.2.1. Custom Model:
Accuracy : 98.78
Precision : 99.70
Recall : 98.25
8.2.2. ResNet50V2:
Accuracy : 96.88
Precision : 96.03
Recall : 98.83
8.2.3. EfficientNetV2B1:
Accuracy : 98.27
Precision : 97.17
Recall : 100.0
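For reference, the three reported metrics follow directly from confusion-matrix counts, as the sketch below shows. The counts are made up for illustration and are not the project's test results; pneumonia is treated as the positive class.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision and recall (as percentages) from raw counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return tuple(round(100 * m, 2) for m in (accuracy, precision, recall))

# Hypothetical counts: 90 true positives, 5 false positives,
# 0 false negatives and 105 true negatives.
accuracy, precision, recall = classification_metrics(tp=90, fp=5, fn=0, tn=105)
print(f"Accuracy : {accuracy}")
print(f"Precision : {precision}")
print(f"Recall : {recall}")
```

A recall of 100 means there were no false negatives, that is, no pneumonia case in the test set was missed.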
CHAPTER-9
RESULTS AND DISCUSSION
The user is provided with an interface to upload a CT scan image of the lung and view the
results.
The interface is built using Streamlit, a popular Python library for building web apps
for machine learning and deep learning applications. After the user uploads a scan,
the image is passed to the three models: ResNet50V2, EfficientNetV2B1 and the custom
model. The outputs generated by the models are averaged for a more robust
prediction of the disease.
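The averaging step can be sketched as follows. The sketch assumes that each model returns a single sigmoid probability for the pneumonia class and that a 0.5 threshold is applied to the mean; the probabilities, the threshold and the function name are illustrative.

```python
def ensemble_predict(probabilities, threshold=0.5):
    """Average per-model pneumonia probabilities and threshold the mean."""
    mean_prob = sum(probabilities) / len(probabilities)
    label = "Pneumonia" if mean_prob >= threshold else "Normal"
    return mean_prob, label

# Hypothetical sigmoid outputs of ResNet50V2, EfficientNetV2B1 and the custom model.
model_outputs = [0.91, 0.62, 0.38]
mean_prob, label = ensemble_predict(model_outputs)
print(f"mean probability = {mean_prob:.3f} -> {label}")
```

In this example one model votes "Normal" on its own, but the averaged probability still favours "Pneumonia", which is how the ensemble dampens the effect of a single model's error.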
Firstly, the custom model can be specifically tailored to the intricacies of the dataset at hand.
By designing a model architecture and training it on the dataset, it can capture subtle nuances
and features that are particularly relevant to pneumonia detection from CT scans. This
customization allows the model to adapt precisely to the characteristics of the images and the
specific patterns indicative of pneumonia.
On the other hand, incorporating pre-trained models such as ResNet50V2 and EfficientNetV2B1
brings the advantage of leveraging the knowledge learned from vast datasets like
ImageNet. These models have been trained on diverse visual recognition tasks, learning
hierarchical representations of features that are generally useful across a wide range of image
classification tasks. This means that these models have already learned to extract meaningful
features from images, including those relevant to pneumonia detection.
Ensembling these models combines the strengths of both approaches. The custom model
provides domain-specific insights and fine-tuned features, while the pre-trained models offer
a solid foundation of generalizable features. By aggregating the predictions of these models,
either through averaging their outputs or more sophisticated techniques like stacking, the
ensemble can effectively smooth out individual model biases and uncertainties. This results
in a more robust and reliable prediction, less susceptible to errors or noise in any single
model's predictions.
Furthermore, ensembling helps mitigate the risk of overfitting by reducing the reliance on
any one model's predictions. Instead, the ensemble combines multiple viewpoints, leading to
more balanced and accurate predictions across different cases and variations within the
dataset. Higher accuracy, in turn, supports timely diagnosis and can help reduce the death rate.
CHAPTER-11
REFERENCES
Daniel Joseph Alapat, Malavika Venu Menon, and Sharmila Ashok. “Detection of
Pneumonia in Chest X-ray Images Using Neural Networks”, in Vellore Institute of
Technology, Tamil Nadu, India, 2022.[1]
Rohit Kundu, Ritacheta Das, Zong Woo Geem, Gi-Tae Han, Ram Sarkar. “Pneumonia
detection in chest X-ray images using an ensemble of deep learning models”, in
Jadavpur University, Kolkata, India and Gachon University, Seongnam, South Korea,
2021.[2]
Shagun Sharma, Kalpna Guleria. “A Deep Learning based model for the Detection of
Pneumonia from Chest X-Ray Images using VGG-16 and Neural Networks”, in
Chitkara University, Rajpura, 140401, Punjab, India, 2023.[3]