(ICSMDI 2021)
Digvijay Desai^a, Shreyash Zanjal^b, Harish Chavan^c, Rushikesh Patil^d, Pravin Futane^e
a, b, c, d, e: Department of Information Technology, Vishwakarma Institute of Information Technology, Pune, 411048, India
E-mail address: {digvijay.21810440, shreyash.21810616, harish.21810663, rushikesh.21810558, pravin.futane}@viit.ac.in
ABSTRACT
Chest radiograph or chest X-ray (CXR) is one of the most frequently conducted radiology examinations, yet it is also considered difficult to interpret. The increasing availability of imaging equipment has also increased the demand for highly trained staff. Doctors are skilled at reaching a diagnosis, but some minor details can be overlooked. In this paper, a deep learning method is used to predict Thorax disease categories from CXRs and their metadata using a Convolutional Neural Network (CNN) and the MobileNets neural network architecture. The large NIH (National Institutes of Health, Maryland) chest X-ray dataset is used for a multiclass image classification problem with 15 different labels. This model will provide a sanity test in the form of a second opinion for radiologists and doctors to achieve more confidence in predicting an accurate diagnosis. The accuracy of this model is appreciable as compared to the current practice of predicting diseases by radiologists. This paper successfully classifies different categories of Thorax disease with 94.68% accuracy, which indicates that this model trained with a deep learning neural network has a real-world application and can be used in the prediction of Thorax diseases. The scope of this model would increase along with the accuracy when more data is collected from other lab tests, clinical notes, or other scans.
Keywords: Neural networks, Multiclass classification, Chest x-ray, Medical image processing.
India has one of the highest disease burdens in the world. The frequency of respiratory disease, along with its economic burden, is rising [1]. Although the chest radiograph or CXR is one of the most frequently conducted radiology studies, it is also considered difficult to interpret [2].
The thorax or chest is the part of the body located between the neck and the abdomen. It is secured and supported by the rib cage, shoulder girdle and spine. Diseases related to this part of the body are called thorax diseases. These diseases include pneumothorax, which is an abnormal collection of air in the space that exists between the lung and the chest wall. Pneumonia is another thorax disease; it is an inflammatory condition of the lungs that affects the small air sacs in the lungs known as alveoli. The symptoms include cough, rapid breathing, difficulty in breathing, fever, chest pain, etc. These diseases can be predicted using the chest X-ray of the patient.
Today, India has nearly 10,000 practicing radiologists. Because less expensive imaging equipment is now manufactured in the country, access to imaging equipment has become easier, but this growing availability has also increased the demand for highly trained staff [3]. In fact, in some parts of India there is only one radiologist for every 100,000 people, compared to a United States ratio of 1 for every 10,000 [4].
In a medical division, the highest accuracy and confidence in CXR reporting is achieved by specialist registrars (StRs) and consultants, followed by core medical trainees (CMT) and general practitioners (GPST) [5]. However, the error rate for radiology diagnosis is approximately 10-15%. The rate of clinically significant errors in radiology, as found in a 2001 review, was between 2% and 20% [6].
In this paper, deep learning methods are used to predict 14 different Thorax disease categories using CXRs and their metadata. Multi-class classification is defined as the process of classifying instances into one of three or more classes present in the data. Classification of the 14 different diseases using Softmax regression will provide a sanity test in the form of a second opinion for radiologists to achieve more confidence in predicting an accurate diagnosis.
Detecting Malaria is an intensive manual process, which has been automated using deep learning with the help of the Lister Hill National Centre for Biomedical Communications (LHNCBC), part of the National Library of Medicine (NLM), which collected a dataset of healthy and infected blood smear images [7]. That work focused on a single disease, i.e., Malaria detection, whereas this paper considers a broader problem, i.e., classification of 14 Thorax diseases.
Image processing of X-rays using deep learning has been used to predict 14 different categories of Thorax diseases on the same NIH dataset, with an F1 score of 71.8% [8]. The main aim of that work was to cast the problem as a multi-class, multi-label (one patient may have multiple diseases) image classification challenge.
2.3. Predicting COVID-19 from chest x-ray images using deep transfer learning
Detecting COVID-19 from radiography or radiology images is one of the most preferred and quickest ways to diagnose a patient. A dataset of 5000 CXRs is publicly available. A board-certified radiologist identified the X-rays showing COVID-19 disease. ResNet50, ResNet18, DenseNet-121 and SqueezeNet were trained by transfer learning on a subset of 2000 X-ray images.
In treatment planning and clinical diagnosis, accurate identification and localization of anomalies in X-ray images are important. In this model, these tasks can be achieved with only a small amount of location annotation. It effectively outputs both class information and limited location annotation, and significantly outperforms the comparative reference baseline in both classification and localization tasks. For multi-label classification (multiple diseases can be identified in one chest X-ray image), a binary classifier is defined for each disease type.
In this paper, the NIH dataset is adapted and used for experimentation. The study focuses on presenting the probability distribution over all 14 diseases, while utilizing all the data given by NIH.
3. Problem statement
4. Dataset
The NIH Chest X-ray dataset has 112,120 CXRs with disease labels from 30,805 unique patients. It includes information such as patient ID, patient gender, patient age, number of visits, view position, etc., which are treated as the patient's traits in the data analysis [9].
Labels for the images were created by using NLP to text-mine disease classifications from the associated CXR reports. The accuracy of this NLP labelling is stated to be greater than 90%.
There are 15 classes in the expected distribution, consisting of 14 Thorax diseases and one "No findings" label. The images can be analysed as:
• “No findings”.
• Most probable disease found.
Out of the 112,120 X-ray images in the NIH dataset, as shown in Table 1, 91,312 X-ray images are used after removing the multi-labelled X-ray images. This is necessary because the Softmax activation function yields a single probability distribution over all 15 classes, so each image is assumed to belong to exactly one class. A multi-labelled X-ray therefore cannot be represented by this distribution. The new labels after removing the multi-labelled images are shown in Table 2.
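The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the sample rows are hypothetical, and the sketch assumes that multiple findings for one image are joined with a "|" separator in the label field.

```python
# Hypothetical rows: (image file name, finding labels joined by "|").
rows = [
    ("img_001.png", "No Finding"),
    ("img_002.png", "Cardiomegaly|Emphysema"),
    ("img_003.png", "Pneumonia"),
    ("img_004.png", "Effusion|Infiltration|Nodule"),
]


def keep_single_label(rows, sep="|"):
    """Return only the rows whose label field names exactly one class."""
    return [(name, label) for name, label in rows if sep not in label]


single = keep_single_label(rows)
for name, label in single:
    print(name, label)
```

Only the rows with a single label remain, so each kept image maps to exactly one of the 15 classes, which is what a Softmax output layer expects.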
Pneumonia, Fibrosis, Infiltration, Atelectasis, Edema, Consolidation, Nodule, Mass, Pneumothorax, Pleural thickening, Cardiomegaly, Effusion, Emphysema, and Hernia are the 14 different categories of Thorax diseases to be predicted. Fig. 1.1. and Fig. 1.1. show examples of the thorax diseases and their labels with their identification.
5. Method
To classify Thorax diseases, a Convolutional Neural Network (CNN) is used along with the MobileNets neural network architecture. Softmax regression is used as the activation function in the final layer; it distributes the probability of disease among the 15 different categories. The MobileNets architecture is used in the early layers to create a lightweight deep CNN.
Images are fed as input into the CNN, a Deep Learning algorithm that identifies the feature set in each image with the help of filters (kernels) by training the weights and biases. These feature sets help in differentiating between images.
The CNN architecture, as shown in Fig. 2., is comparable to the connectivity pattern of neurons in the human brain and is analogous to the organization of the visual cortex. The receptive field is the region of the visual field in which an individual neuron responds to stimuli. The receptive fields overlap and together cover the complete visual area.
Step 1: Convolution
Convolution is the process of adding each element of the image, pixel by pixel, to its local neighbours, weighted by a learnable filter or kernel. This helps in extracting features from the image and reduces the dependency of the network on individual neurons. Convolution is a matrix operation, as shown in Fig. 3., and is not matrix multiplication, despite being similarly denoted by *.
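The sliding-window operation described above can be illustrated with a small pure-Python sketch. The function name, the toy image and the kernel are illustrative assumptions; like most CNN libraries, the sketch applies the kernel without flipping it (strictly a cross-correlation), with no padding and a stride of 1.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the pixel neighbourhood under the kernel.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


image = [
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 2],
    [1, 0, 1, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, -1, -1], [-1, 1, 1], [2, 0, -3]]
```

Each output value is an element-wise product and sum over one 2x2 neighbourhood, which is why the operation differs from ordinary matrix multiplication. In a trained network the kernel entries are learned rather than fixed by hand.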
Step 2: Pooling
The pooling layer reduces the spatial size of the features extracted in the convolution step. It acts as an extra step for extracting dominant features. Reducing the dimensions of the feature sets after the convolution step subsequently lowers the computational power needed for data processing.
There are two types of pooling, as shown in Fig. 4.: i) Max Pooling ii) Average Pooling.
Max Pooling returns the highest value from the portion of the feature map covered by the filter.
Average Pooling returns the average of all the values from the portion of the feature map covered by the filter.
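Both pooling types can be shown on a small feature map. The sketch below is illustrative only: the function name, the 2x2 window and the non-overlapping stride are assumptions, since the window size used in the paper is not stated.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window (stride equals size)."""
    out = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        out.append(row)
    return out


feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [4, 2, 0, 1],
    [3, 1, 2, 5],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
print(max_pooled)  # [[7, 8], [4, 5]]
print(avg_pooled)  # [[4.0, 5.0], [2.5, 2.0]]
```

In both cases a 4x4 map shrinks to 2x2, which is the dimensionality reduction the pooling step is meant to provide.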
Step 4: Flattening
Softmax regression (or multinomial logistic regression) is a generalization of logistic regression to multiple classes. In logistic regression, only binary labels can be classified, i.e., $y^{(i)} \in \{0, 1\}$, while Softmax regression handles multiple classes [27]. Softmax regression allows us to handle labels $y^{(i)} \in \{1, \dots, K\}$, where $K$ is the number of classes [28]. The scores $p \in \mathbb{R}^{15}$ were obtained through the matrix calculation $p = Wx$, where $x$ is the input data and $W \in \mathbb{R}^{15 \times 1024^2}$. Denoting $W$ (the weight matrix) as
$$W = \begin{bmatrix} - \; \theta_1^T \; - \\ - \; \theta_2^T \; - \\ - \; \theta_3^T \; - \\ \vdots \\ - \; \theta_{15}^T \; - \end{bmatrix} \tag{1}$$
In equation (1), $\theta_1, \dots, \theta_{15}$ are the parameters.
W was calculated through optimization of the following cost function,
$$J(W) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{15} 1\{y^{(i)} = j\} \log \frac{\exp(\theta_j^T x^{(i)})}{\sum_{l=1}^{15} \exp(\theta_l^T x^{(i)})} \right] \tag{2}$$
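To make the cost function in equation (2) concrete, the following minimal sketch evaluates it on a toy problem. It is an illustration under stated assumptions, not the training code used in the paper: the function names are hypothetical, labels are 0-based rather than 1-based, and 3 classes with 2 features stand in for the 15 classes and image inputs.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_cost(W, X, y):
    """Average negative log-probability of the true class, as in equation (2).

    W: list of class weight vectors (the rows theta_j of the weight matrix),
    X: list of input vectors, y: list of 0-based class labels.
    """
    cost = 0.0
    for x, label in zip(X, y):
        scores = [sum(w * v for w, v in zip(theta, x)) for theta in W]
        cost -= math.log(softmax(scores)[label])
    return cost / len(X)


# Toy example: 3 classes and 2 features.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
X = [[2.0, 0.0], [0.0, 2.0]]
y = [0, 1]
probs = softmax([2.0, 0.0, -2.0])
cost = cross_entropy_cost(W, X, y)
print(probs, cost)
```

The indicator term in equation (2) simply selects the log-probability of the true class for each example, which is why the code indexes the Softmax output with the label. With all weights set to zero every class is equally likely and the cost equals log K.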
The MobileNet neural network architecture is used to build a lightweight deep CNN. The MobileNet model is trained on the ImageNet dataset, which consists of images from 1000 different classes. It also introduces two simple global hyper-parameters that trade off latency against accuracy, allowing the model builder to choose a model of the right size for their application.
Before training, the images were converted to 128x128 pixels rather than using the high-resolution 1024x1024 greyscale X-rays. This speeds up training because less computation is required.
Fig. 6 shows the final architecture of the neural network:
To increase the accuracy of the model and to reduce the computational power required, the data is pre-processed before training. The steps involved in pre-processing are:
• Reading the images: In this step, the path to the image dataset is stored in a variable, and a function is created to load the folders containing all the images into arrays.
• Resizing the images: The images are 1024x1024 pixels in size, which requires a lot of computational power to process. So, the images are reduced to 128x128 pixels. These images are further denoised to improve accuracy when training the model.
• Data augmentation: This process increases the number of images by making slight changes to existing images, such as slight rotation, flipping, cropping, etc. This step increases the data size, which helps increase the accuracy of the final trained model.
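The resizing and augmentation steps listed above can be sketched on a toy image. This is a hedged illustration: the paper does not state which resizing method or library was used, so block averaging and the helper names below are assumptions, and only horizontal flipping is shown as an augmentation.

```python
def downsample(image, factor):
    """Shrink a 2-D greyscale image by averaging each factor x factor block."""
    h, w = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(image[r * factor + i][c * factor + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for c in range(w)
        ]
        for r in range(h)
    ]


def flip_horizontal(image):
    """Mirror the image left to right, one simple augmentation."""
    return [row[::-1] for row in image]


# A 1024x1024 X-ray would need factor=8 to reach 128x128;
# a 4x4 toy image with factor=2 is used here.
toy = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [1, 1, 3, 3],
    [1, 1, 3, 3],
]
small = downsample(toy, 2)
augmented = [small, flip_horizontal(small)]
print(augmented)  # [[[2.0, 6.0], [1.0, 3.0]], [[6.0, 2.0], [3.0, 1.0]]]
```

Reducing each side by a factor of 8 cuts the number of pixels per image by a factor of 64, which is where the saving in computation comes from.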
The data is split into Training, Validation and Test sets: 72% for training, 8% for validation and 20% for testing. Evaluation of the model was based on accuracy as the metric. The overall Precision, Recall and F1-score are shown in Table 4. The results show that the neural network performed well at predicting thorax diseases.
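One way to obtain a 72/8/20 split is to hold out 20% for testing and then take 10% of the remaining 80% for validation, since 0.8 x 0.1 = 0.08 and 0.8 x 0.9 = 0.72. The sketch below assumes this two-stage procedure; the paper does not say how the split was performed, and the function name and seed are hypothetical.

```python
import random


def split_dataset(items, test_frac=0.20, val_frac_of_rest=0.10, seed=0):
    """Shuffle, hold out the test set, then carve validation out of the remainder."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_frac)
    test, rest = items[:n_test], items[n_test:]
    n_val = round(len(rest) * val_frac_of_rest)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 720 80 200
```

Fixing the seed keeps the split reproducible, so the same images stay in the test set across training runs.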
Overall accuracy was 94.68% as shown in Table 3, which is good when compared with previous works with similar intentions. The precision score achieved is 0.69 as shown in Table 4, which is on the lower side. This is due to the fact that the model has 15 classes in total; taking this into consideration, the precision achieved is respectable. The overall F1-score achieved is 61.5% and the F-beta score at beta = 0.5 is 66%, which is considerable by medical standards. The labelled data used for training the model is NLP-extracted, with accuracy greater than 90%. Hence, up to 10% of the labels in the dataset may be inaccurate, which also affects the performance of the model. If most of these inaccurate labels are disease labels, the precision and recall of the model are reduced drastically. Finding and correcting these labels might help improve the results of the proposed model. Also, some diseases, such as Fibrosis, Edema, Pneumonia and Hernia, have fewer than 1000 examples in the complete dataset, which makes it more difficult to get better results. Increasing the number of input images for these diseases will also help in improving the results.
Validation 95.48
Test 94.87
Table 4 – Results.
Data        Precision  Recall  F1 Score  F Beta Score
Train       0.68       0.55    0.60      0.65
Validation  0.70       0.57    0.63      0.67
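The F1 and F-beta columns in Table 4 are related to precision and recall through the standard F-beta formula. The sketch below recomputes them from the validation row as a consistency check; the function name is hypothetical, and if the paper's scores were averaged per class the relation would hold only approximately.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall; beta < 1 favours precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Validation-row values from Table 4.
precision, recall = 0.70, 0.57
f1 = f_beta(precision, recall)            # about 0.63
f_half = f_beta(precision, recall, 0.5)   # about 0.67
print(round(f1, 2), round(f_half, 2))
```

With beta = 0.5 the score weights precision more heavily than recall, which is why the F-beta value sits between the F1 score and the precision.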
As shown in Fig. 2., the first layer is the MobileNet network without any pretrained weights.
The second layer is a pooling layer, which reduces the dimensions of the feature set after the convolution step.
The third layer is a dropout layer with a probability of 0.5, which helps avoid overfitting.
The fourth layer is a dense layer with 512 units, again followed by dropout with a probability of 0.5.
The final layer consists of 15 units, corresponding to the 15 classes, with Softmax as the activation function.
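The layer stack described above could be expressed in Keras roughly as follows. This is a sketch under stated assumptions rather than the authors' implementation: the paper does not name the pooling type (global average pooling is assumed), the activation of the 512-unit dense layer (ReLU is assumed), or the number of input channels (3 is assumed). TensorFlow is imported inside the function so that the constants can be inspected without it installed.

```python
INPUT_SHAPE = (128, 128, 3)  # 128x128 from the paper; 3 channels is an assumption
NUM_CLASSES = 15             # 14 diseases plus the "No findings" class
DENSE_UNITS = 512
DROPOUT_RATE = 0.5


def build_model(input_shape=INPUT_SHAPE):
    """Assemble MobileNet -> pooling -> dropout -> dense -> dropout -> softmax."""
    from tensorflow import keras  # imported lazily; requires TensorFlow
    from tensorflow.keras import layers

    # MobileNet backbone without pretrained weights and without its classifier head.
    base = keras.applications.MobileNet(
        input_shape=input_shape, include_top=False, weights=None
    )
    return keras.Sequential([
        base,
        layers.GlobalAveragePooling2D(),                 # pooling type assumed
        layers.Dropout(DROPOUT_RATE),
        layers.Dense(DENSE_UNITS, activation="relu"),    # activation assumed
        layers.Dropout(DROPOUT_RATE),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
```

Calling `build_model().summary()` with TensorFlow installed prints the layers and their parameter counts, which can be compared against the model synopsis the paper reports.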
The model was built using Keras, an open-source software library written in the Python programming language for building deep learning models.
Fig. 7 shows the synopsis of the proposed model. In Fig. 7, param refers to the number of parameters in the model.
Future work will focus on improving the results of the model by increasing the dataset size and obtaining more images of the diseases that have fewer images in the dataset. The number of diseases that can be predicted using the proposed model will also be increased. Currently, this model predicts 14 thorax diseases.
7. Conclusion
This model has unlocked new potential for Thorax disease prediction using CXRs. The accuracy of this model is considerable as compared to the current practice of predicting diseases by radiologists. This research successfully classified different categories of Thorax disease with 94.68% accuracy, which indicates that this model trained with a deep learning neural network has a real-world application and can be used in the prediction of Thorax diseases. The scope of this model would increase along with the accuracy when more data is collected from other lab tests, clinical notes, or other scans. The analysis would be better when it is correlated with more traits of patients with the diseases found. Additional experiments with the resolution of the images, along with a custom neural network architecture, will help improve preprocessing. This model will provide a sanity test in the form of a second opinion for radiologists to achieve more confidence in predicting an accurate diagnosis. The results can be improved by increasing the dataset size for training the model. Future work will focus on this and will also increase the number of diseases that can be predicted using the proposed model.
Acknowledgements
This research did not receive any specific grant from any funding agencies in the public, commercial, or not-for-profit sectors.
REFERENCES
[1] Salvi, S., Apte, K., Madas, S., Barne, M., Chhowala, S., Sethi, T., Aggarwal, K., Agrawal, A., & Gogtay, J. (2015). Symptoms and medical conditions in
204 912 patients visiting primary health-care practitioners in India: a 1-day point prevalence study (the POSEIDON study). The Lancet Global Health, 3,
776–784.