
A Progress Report
on
LUNG DISEASE DETECTION USING DEEP LEARNING AFTER BONE SUPPRESSION IN CHEST X-RAYS

carried out as part of the course CSE CS3270

Submitted by

Vedansh Maheshwari
199301043
VI-CSE

in partial fulfilment for the award of the degree

of

BACHELOR OF TECHNOLOGY
In

Computer Science & Engineering

Under the guidance of


Mr. Harish Sharma

Department of Computer Science and Engineering


School of Computing and Information Technology
Manipal University Jaipur
Jaipur, Rajasthan
CERTIFICATE

This is to certify that the project entitled "Lung disease detection using deep learning after bone suppression in chest X-rays" is a bona fide work carried out as part of the course Minor Project: CS3270, under my guidance, by Vedansh Maheshwari, student of Bachelor of Technology (B.Tech.) in Computer Science and Engineering (CSE) at the Department of Computer Science & Engineering, Manipal University Jaipur, during academic semester VI of the year 2021-22.

Place: Manipal University Jaipur

Date: 8 June 2022


Abstract:
Lung disorders, also known as respiratory diseases, are illnesses that affect the airways and other structures of the lungs. If not recognised and treated in time, many of them can be fatal. Clinically, lung illnesses can be diagnosed by examining chest X-ray images. Because medical images are used for many purposes in hospitals, pathology labs and diagnostic centres, collections of medical images are growing rapidly. We therefore implement a model that classifies lung diseases in chest X-rays using a CNN and transfer learning.

A lot of research has already been done on this topic, but existing approaches still lack the accuracy and reliability needed for real-life use. The difficulty of querying and managing large datasets has motivated the use of deep convolutional neural networks. We propose and evaluate a deep convolutional neural network designed for classifying chest diseases. The proposed model combines transfer learning, ResNet18 and the ReLU activation function.

We use a publicly available dataset called Chest X-Ray 14, which consists of fifteen classes: Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pneumothorax, Consolidation, Edema, Emphysema, Fibrosis, Pleural Thickening, Hernia and No Finding. From it we select the images labelled Pneumonia, Pneumothorax, Infiltration and No Finding to train and evaluate the classifier.
LIST OF FIGURES
S.no Title

1 Eight visual examples of common thorax diseases
2 Literature Review Table
3 No. of each disease by patient gender
4 Distributions of 14 disease categories with co-occurrence statistics
5 System architecture flow chart
6 Detailed methodology chart


1.Introduction
1.1 Motivation

The world is changing so fast that the pressure on public health keeps increasing: adverse changes in climate, the environment and human lifestyles raise the risk of disease. The lungs are among the organs most severely affected. In 2020, more than 3 million people had to face chronic obstructive pulmonary disease (COPD), caused mainly by smoking and pollution. According to the Forum of International Respiratory Societies, about 334 million people suffer from asthma and, each year, tuberculosis kills 1.4 million people and lung cancer kills 1.6 million, while pneumonia also kills millions. The COVID-19 pandemic affected the whole world, infecting millions of people and burdening healthcare systems. Lung diseases are clearly among the leading causes of death and disability worldwide. With modern technology and computing power, lung diseases can be identified earlier and more accurately, which can save many lives and reduce the pressure on healthcare systems.
Lung diseases are usually detected through a skin test, blood test, sputum sample test, chest X-ray examination or computed tomography (CT) scan. A chest X-ray is a very common and cost-effective medical imaging technique. It is a diagnostic test that helps clinicians identify and treat medical problems. A chest X-ray produces images of the blood vessels, lungs, airways, heart, spine and chest bones. For this project we use the NIH chest X-ray dataset, a public dataset of chest X-rays labelled with lung diseases.

Our machine learning model should classify each X-ray sample into one of the fourteen classes and thereby detect the disease. The model detects and differentiates between diseases by learning where the infection is localised in each of them, as shown in Figure 1.

Figure 1. Localization of different diseases in lungs

There has been a lot of advancement in machine learning, particularly deep learning, which helps in the identification, quantification and classification of patterns in medical images. These developments were made possible by the ability of deep learning to learn features directly from data, instead of relying on hand-designed features based on domain-specific knowledge. These advancements assist clinicians in detecting and classifying certain medical conditions efficiently.

We will try to improve on existing medical image classification methods by applying transfer learning with ResNet18, where the information from the last layer serves as input to a new classifier. The CNN model is applied after bone suppression using U-Net.

2.Literature Review

Each entry lists: Year, Authors, Problem Statement, Methodology, Dataset, Challenges, Results.

Year: 2021
Authors: Ankita Shelke, Madhura Inamdar, Vruddhi Shah, Amanshu Tiwari
Problem Statement: Lung X-ray Classification Using Deep Learning for Automated Covid-19 Screening
Methodology: VGG-16, DenseNet-161, ResNet-18
Dataset: Chest images from Clinico Diagnostic Lab, Mumbai, India
Challenges: The size of the dataset and the number of data points used were small, which could lead to overfitting.
Results: The VGG-16, DenseNet-161 and ResNet-18 models gave accuracies of 95.9 %, 98.9 % and 76 % respectively.

Year: 2021
Authors: Siddhanth Tripathi, Sinchana Shetty, Somil Jain, Vanshika Sharma
Problem Statement: Lung Disease Detection Using Deep Learning
Methodology: Capsule network with CNN, and CNN+VGG+data+STN
Dataset: Full NIH Chest X-ray dataset
Challenges: Capsnet has slower convergence. CNN+VGG+data+STN needs more epochs for better results.
Results: Accuracy of 69.3% with CNN+VGG+data+STN and 63.% with capsnet

Year: 2020
Authors: Subrato Roy, Prajoy Poddar, M. Rubaiyat Hussain Mondal
Problem Statement: Hybrid deep learning for detecting lung diseases from X-ray images
Methodology: Hybrid deep learning framework combining VGG, data augmentation and spatial transformer network (STN) with CNN (VGG Data STN with CNN)
Dataset: 5% random sample of NIH Chest X-ray dataset
Challenges: Higher training time. A very complex "locnet" module has been used.
Results: VGG Data STN with CNN model gives 73% validation accuracy and 74% AUC

Year: 2020
Authors: Asmaa Abbas, Mohammed M. Abdelsamea, Mohamed Medhat Gaber
Problem Statement: Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network
Methodology: Deep CNN called Decompose, Transfer learning (with shallow tuning, deep tuning and fine tuning), and Compose (DeTraC)
Dataset: Combination of 80 samples of normal CXR images from the Japanese Society of Radiological Technology (JSRT) and a second set of CXR images containing 105 and 11 samples of COVID-19 and SARS
Challenges: Only binary classification of COVID-19 positive or negative
Results: Accuracy of 98.23%

Year: 2020
Authors: Zhizhen Zhou, Luping Zhou, Kaikai Shen
Problem Statement: Dilated Conditional GAN for Bone Suppression in Chest Radiographs with Enforced Semantic Features
Methodology: Conditional GANs, U-Net-like generator, PatchGAN based 2 discriminator
Dataset: Japanese Society of Radiological Technology (JSRT) dataset for X-ray bone shadow suppression
Challenges: cGAN trained on this dataset is an approximation of DES. Potential risk that the model may not preserve lung nodules.
Results: 398 35.5 ± 4.418 PSNR, 0.975 ± 0.079 SSIM

Year: 2020
Authors: not given
Problem Statement: Classification and case predictions of Lung Diseases from Chest X-rays using MobileNet
Methodology: MobileNet use model (UCMobN)
Dataset: NIH Chest X-ray dataset
Challenges: Classification into only 10 classes and lower AUC
Results: AUC for Atelectasis (0.58), Consolidation (0.67), Edema (0.74), Effusion (0.65), Emphysema (0.54), Fibrosis (0.65), Infiltration (0.57), Mass (0.51), Nodule (0.58) and Pneumonia (0.62)

Year: 2020
Authors: Jia Liang, Yu-Xing Tang, You-Bao Tang, Jing Xiao, Ronald M. Summers
Problem Statement: Bone Suppression on Chest Radiographs With Adversarial Learning
Methodology: Cycle-GAN, Pix2Pix
Dataset: Dual-energy PA chest radiographs from the picture archiving and communication system (PACS); NIH ChestX-ray8
Challenges: Only about bone suppression
Results: 36.078 ± 0.305 PSNR with paired training of GAN and 0.948 ± 0.004 AUC

Year: 2018
Authors: Rahib H. Abiyev and Mohammad Khaleel Sallam Ma’aitah
Problem Statement: Deep Convolutional Neural Networks for Chest Diseases Detection
Methodology: Backpropagation neural network (BPNN), competitive neural network (CpNN) and convolutional neural network (CNN)
Dataset: NIH Chest X-ray dataset
Challenges: Required computation time and the number of iterations were roughly higher for CNN, but CNN had better generalization power.
Results: BPNN with 80.04% recognition rate, CpNN with 89.57% recognition rate, CNN with 92.4% recognition rate

Year: 2018
Authors: Worawate Ausawalaithong, Arjaree Thirach, Sanparith Marukatat, Theerawit Wilaiprasitporn
Problem Statement: Automatic Lung Cancer Prediction from Chest X-ray Images Using the Deep Learning Approach
Methodology: Histogram equalization, median filtering, 121-layer Densely Connected Convolutional Network, Transfer Learning
Dataset: JSRT Dataset, ChestX-ray14 Dataset
Challenges: Small dataset for lung cancer, and only binary classification for detecting cancer
Results: 74.43 ± 6.01% mean accuracy, 74.96 ± 9.85% mean specificity and 74.68 ± 15.33% mean sensitivity

3.DataSet

The dataset used for training the bone-suppression model is the X-ray bone shadow suppression dataset uploaded by hmchuong. It has two parts: 4080 normal chest X-ray images and the corresponding 4080 bone-suppressed images.

The dataset used for training the CNN was obtained from the National Institutes of Health Clinical Center. It contains 112,120 frontal-view X-ray images of 30,805 unique patients, with fourteen text-mined disease labels. The fourteen common thoracic pathologies are Atelectasis, Consolidation, Infiltration, Pneumothorax, Edema, Emphysema, Fibrosis, Effusion, Pneumonia, Pleural_thickening, Cardiomegaly, Nodule, Mass and Hernia. We use a random sample containing 5 percent of the full dataset, which is 5,606 images.

Every image is paired with the following metadata: Image Index, Finding Labels, Follow-up #, Patient ID, Patient Age, Patient Gender, View Position, Original Image Size and Original Image Pixel Spacing.

Every image in the dataset belongs to one or more of the 14 disease classes, or else to the No Finding class.
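Because an image can carry several labels, selecting the four classes used in this project means parsing the Finding Labels field. The following is a minimal sketch of one possible selection rule, not the exact procedure used in the project. It assumes that multiple labels are joined with a "|" character, and it keeps only images whose labels all fall inside the target set. The CSV rows are made-up examples and the function name is hypothetical.

```python
import csv
import io

# Hypothetical excerpt in the layout of the NIH sample label file. Only two of
# the metadata columns are shown, and the rows are made up for illustration.
SAMPLE_CSV = """Image Index,Finding Labels
00000001_000.png,Cardiomegaly
00000002_000.png,No Finding
00000003_000.png,Infiltration|Pneumonia
00000004_000.png,Pneumothorax
00000005_000.png,Effusion|Infiltration
"""

TARGET_CLASSES = {"Pneumonia", "Pneumothorax", "Infiltration", "No Finding"}


def select_target_rows(csv_text, targets=TARGET_CLASSES):
    """Keep images whose labels are all inside the target classes.

    Assumption: multiple labels are separated by '|'.
    """
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        labels = set(row["Finding Labels"].split("|"))
        if labels <= targets:  # subset test: no label outside the targets
            selected.append((row["Image Index"], sorted(labels)))
    return selected


selected = select_target_rows(SAMPLE_CSV)
for image_name, labels in selected:
    print(image_name, labels)
```

A looser rule (keep an image if any of its labels is a target class) would retain more images but mix in other pathologies, so the choice depends on how clean the training labels need to be.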

Figure 2 ORIGINAL DATASET DISTRIBUTION OF CLASSES

Class distribution in the sample we are using:

Figure 3 POSITION-WISE DISTRIBUTION IN SAMPLE DATASET

Figure 4 DISTRIBUTION OF CLASSES IN SAMPLE DATASET

4.Methodology and Framework

4.1 System Architecture

Figure 5 SYSTEM ARCHITECTURE

4.2 Algorithms
• Generative Adversarial Networks (GANs) are an approach to generative modelling using deep learning methods such as convolutional neural networks. Generative modelling is an unsupervised learning task in machine learning that involves automatically discovering and learning the regularities or patterns in input data, so that the model can generate new examples that plausibly could have been drawn from the original dataset.
• U-Net is a U-shaped architecture for semantic segmentation. It consists of a contracting path and an expansive path. The contracting path follows the typical architecture of a convolutional network.
• ResNet18 is a 72-layer architecture with 18 deep layers. The architecture aims to let networks with a large number of convolutional layers train efficiently. Adding many deep layers to a plain network often degrades its output; ResNet counters this with residual (skip) connections that add a block's input to its output.
• CNN is one of the most popular types of deep neural network. It learns features directly from the input data and uses 2D convolutional layers, which makes the architecture well suited to processing 2D data such as images. It eliminates the need for manual feature extraction: the relevant features are learned while the network trains on a dataset.
• Transfer learning (TL) is a research problem in machine learning (ML) that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem.
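The degradation problem noted for ResNet18 in the list above is what residual (skip) connections are meant to address: a block computes y = ReLU(F(x) + x), so it can fall back to passing its input through unchanged. The toy sketch below illustrates this with small dense layers in plain Python. Real ResNet blocks use convolutions and batch normalisation, so the layer type, sizes and function names here are simplifying assumptions.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vector]


def linear(weights, vector):
    """Matrix-vector product; each row of `weights` yields one output."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def plain_block(x, w1, w2):
    """Two stacked layers without a skip connection."""
    return relu(linear(w2, relu(linear(w1, x))))


def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F(x) = W2 . relu(W1 . x)."""
    branch = linear(w2, relu(linear(w1, x)))
    return relu([b + xi for b, xi in zip(branch, x)])


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all-zero weights the plain block destroys the signal, while the
# residual block still passes the (non-negative) input through unchanged.
plain_out = plain_block(x, zeros, zeros)
residual_out = residual_block(x, zeros, zeros)
print(plain_out, residual_out)
```

The point of the comparison is that extra residual blocks can behave like identity mappings, so adding them should not make the network worse than a shallower one.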

4.3 Detailed design Methodology

4.3.1 Bone suppression


Bone suppression is an autoencoder-like method for removing bone shadows from chest X-ray images. Training the model requires two types of X-ray images: normal and bone-suppressed. The trained model removes bone shadows from chest X-ray images, which should allow our downstream classifier to diagnose lung ailments more accurately.

Figure 6 BONE SUPPRESSION IN XRAYS

We use the U-Net architecture for this purpose. U-Net is used for image segmentation, which combines localization and classification. Its contracting path consists of the repeated application of two 3x3 convolutions, each followed by a rectified linear unit (ReLU), and a 2x2 max pooling operation with stride 2 for downsampling. At the final layer, a 1x1 convolution maps each 64-component feature vector to the desired number of classes. The network has a total of 23 convolutional layers.
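The layer count and the effect of the convolutions on image size can be checked with a little arithmetic. The sketch below assumes the layout of the original U-Net design: four downsampling steps, unpadded 3x3 convolutions (each trims 2 pixels), 2x2 up-convolutions counted as convolutional layers, and a 572x572 input. These values are assumptions for illustration; an implementation that pads its convolutions keeps the spatial size unchanged.

```python
def unet_summary(input_size=572, depth=4):
    """Return (output_size, conv_layer_count) for a U-Net-style network.

    Assumptions: unpadded 3x3 convolutions, 2x2 max pooling with stride 2,
    2x2 up-convolutions, and `depth` downsampling steps.
    """
    size = input_size
    conv_layers = 0

    # Contracting path: two 3x3 convolutions, then 2x2 max pooling.
    for _ in range(depth):
        size -= 4              # two unpadded 3x3 convs, 2 pixels each
        conv_layers += 2
        if size % 2:
            raise ValueError("feature map size must be even before pooling")
        size //= 2             # 2x2 max pooling, stride 2

    # Bottleneck: two more 3x3 convolutions at the lowest resolution.
    size -= 4
    conv_layers += 2

    # Expansive path: 2x2 up-convolution, then two 3x3 convolutions.
    for _ in range(depth):
        size *= 2              # up-convolution doubles the resolution
        conv_layers += 1
        size -= 4
        conv_layers += 2

    conv_layers += 1           # final 1x1 convolution to the output classes
    return size, conv_layers


output_size, conv_layers = unet_summary()
print(output_size, conv_layers)
```

Under these assumptions the count is 8 convolutions in the contracting path, 2 in the bottleneck, 12 in the expansive path and 1 at the output, which matches the total of 23 stated above.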

Figure 7 UNET ARCHITECTURE

With U-Net we did not get ideal results, but the results were still helpful. For better results we will implement GANs for bone suppression.
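One way to put a number on how close a predicted bone-suppressed image is to its target is the peak signal-to-noise ratio (PSNR), the metric reported by the bone-suppression papers in the literature review. The sketch below is a plain-Python illustration and not the evaluation used in this project. It assumes 8-bit pixel values (maximum 255), and the tiny 2x2 images are made up.

```python
import math


def psnr(predicted, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images.

    Images are lists of rows of pixel values. Higher is better, and
    identical images give infinity.
    """
    flat_pred = [p for row in predicted for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("images must have the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(flat_pred, flat_target)) / len(flat_pred)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up 2x2 example: two of the four pixels are off by 10 grey levels.
target_image = [[100, 120], [140, 160]]
predicted_image = [[110, 120], [140, 150]]
score = psnr(predicted_image, target_image)
print(f"PSNR: {score:.2f} dB")
```

Averaging this score over a held-out set of paired images would allow the U-Net output and a later GAN-based model to be compared on the same scale.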

Figure 8 BONE SUPPRESSION IN OUR MODEL
4.3.2 Classification of lung diseases

For classification, we apply the bone suppression model to the NIH sample dataset. We had to resize the images to make the data suitable for the model. We then accounted for the imbalanced class frequencies. We use transfer learning with ResNet18 for classification. The residual network (ResNet) is a CNN model used for image recognition; ResNet18 has a 72-layer architecture with 18 deep layers. The dataset is split into 5000 images for training and 303 images for testing and validation.
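The report does not say how the class imbalance was handled, so the following is only a sketch of one common option: weighting each class inversely to its frequency, so that errors on rare classes count more in the loss. The weight formula is total / (number of classes x class count), and the label counts below are made up for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Map each class to total / (n_classes * class_count).

    A class with the average frequency gets weight 1.0; rarer classes get
    larger weights and more frequent classes get smaller ones.
    """
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {name: total / (n_classes * count) for name, count in counts.items()}


# Hypothetical label list; the real class frequencies come from the sample dataset.
labels = (
    ["No Finding"] * 6
    + ["Infiltration"] * 3
    + ["Pneumothorax"] * 2
    + ["Pneumonia"] * 1
)
weights = inverse_frequency_weights(labels)
for name, weight in sorted(weights.items()):
    print(f"{name}: {weight:.2f}")
```

Such weights are typically passed to a weighted loss function during training. Oversampling the rare classes is an alternative that achieves a similar effect.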

Figure 9 RESNET 18 ARCHITECTURE
5. Work Done
• Literature review of related works
• Acquisition of datasets for classification and bone suppression
• Implementation of a bone suppression model using U-Net
• Conversion of the NIH dataset images to bone-suppressed images
• Classification of lung diseases into 15 classes using ResNet18 after bone suppression

6.Results and discussion


The performance results of the proposed model are shown in Figures 10 to 12.

Figure 10 TRAINING SET ACCURACY

Figure 11 TEST SET ACCURACY

Figure 12 VALIDATION SET ACCURACY

7. Conclusion and Future Plan


Conclusion:
We have tried a different approach: first removing the bones from the data and then applying classification, hoping this would give better results.
From our observations, our model achieves higher accuracy for diseases that are spread throughout the lungs rather than localised in one particular area.

Future Plan:
Since our U-Net model for bone suppression was not good enough, we will try to implement GANs for this task.

Figure 13 GAN ARCHITECTURE
References

• https://www.hindawi.com/journals/jhe/2018/4168538/
• https://www.sciencedirect.com/science/article/pii/S2352914820300290#sec4
• https://www.kaggle.com/datasets/nih-chest-xrays/data
• https://arxiv.org/pdf/2002.03073.pdf
• https://www.udemy.com/course/machinelearning
• https://machinelearningmastery.com/what-are-generative-adversarial-networks-gans/
• https://ieeexplore.ieee.org/abstract/document/9033668/metrics#metrics
• https://www.kaggle.com/datasets/nih-chest-xrays/sample
• https://www.kaggle.com/datasets/abhishek/pretrained-model-weights-pytorch
• https://www.kaggle.com/datasets/hmchuong/xray-bone-shadow-supression
• https://towardsdatascience.com/an-overview-of-resnet-and-its-variants-5281e2f56035
• https://towardsdatascience.com/understanding-and-coding-a-resnet-in-keras-446d7ff84d33
