
B.Tech ELECTRONICS AND COMMUNICATION

19ECE357 – PATTERN RECOGNITION


TERM WORK-5TH SEMESTER

GROUP NO.: 10 (ECE - C)


GROUP MEMBERS:

S. No.   Name                Roll Number
1.       SHAILESH.R          CB.EN.U4ECE20254
2.       SHRI RAMYA. KR      CB.EN.U4ECE20255
3.       DHARUN KUMAR.S      CB.EN.U4ECE20213

PNEUMONIA DISEASE DETECTION USING DEEP LEARNING MODELS

ABSTRACT
Pneumonia, as defined by the WHO, is a form of acute respiratory infection most
commonly caused by viruses or bacteria. It can cause mild to life-threatening illness in
people of all ages, and it is the single largest infectious cause of death in children
worldwide.
According to UNICEF, over 880,000 children die from pneumonia each year, accounting
for 16% of all deaths in children under the age of five. Early diagnosis of paediatric
pneumonia can help hasten the healing process.
The Chest X-Ray Images (Pneumonia) dataset from Kaggle was used for the
experimentation. The first, second, third and fourth models use one, two, three and four
convolutional layers, respectively.
The four classifier models were built using CNNs for the detection of pneumonia. The
accuracy of a model is directly correlated with the size of the dataset, and no correlation
was observed between the number of convolutional layers and the accuracy of the model.
The main aim is to create CNN models that can detect the presence of pneumonia from
patients' chest X-rays, evaluated with parameters such as accuracy, recall and F1 score.
Recall is usually favoured because it reflects the number of false negatives in the results;
knowing the number of false negatives is imperative, as it has high significance in the
medical world and estimates the real-world performance of these models.
A model achieving high accuracy but low recall is considered under-performing, because
a false negative means the model predicts a diseased person as normal. This is detrimental
to the patient's health and could eventually cost the patient's life.
FEATURE(S) TAKEN:

Dataset Description:
A dataset of 1.16 GB was imported from Kaggle, containing a total of 5,856 JPEG images
split into Train, Test and Val folders, each further divided into the categories Pneumonia
and Normal. The images are chest X-rays of paediatric patients aged one to five years from
the Guangzhou Women and Children's Medical Centre, Guangzhou.

CNN Model:
CNN stands for Convolutional Neural Network. It is a subfield of deep learning mostly
used for the analysis of visual imagery. It is a specific type of artificial neural network
that employs perceptrons. Perceptrons are the building blocks of neural networks; they are
simple models of a biological neuron in an ANN setting.
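As a minimal illustrative sketch (not part of the project code), a single perceptron computes a weighted sum of its inputs plus a bias and passes it through an activation function; the input, weight and bias values below are arbitrary examples.

import numpy as np

# A single perceptron: weighted sum of inputs plus a bias, passed through
# a step activation. The numbers here are arbitrary example values.
x = np.array([0.5, -1.0, 2.0])   # example input features
w = np.array([0.8, 0.2, -0.5])   # example weights
b = 0.1                          # example bias
z = np.dot(w, x) + b             # weighted sum
output = 1 if z > 0 else 0       # step activation
print(z, output)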

Convolutional Layer:
These are the building blocks of the CNN. Since our data consists of images, the input
image is first converted into a matrix. Just as convolution is carried out in mathematics, a
convolution filter is applied to this matrix representation of the input image. The filter
slides over it, performing an element-wise multiplication and storing the sum at each
position; this produces a feature map. 3*3 filters are generally used to create 2D feature
maps when the images are black and white. When the input image is represented as a 3D
matrix, the convolution is performed in 3D, with the third dimension corresponding to
the RGB colour channels.
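The sliding-filter operation can be sketched in a few lines of NumPy; the 5*5 image and 3*3 filter below are toy example values, not taken from the dataset.

import numpy as np

# Toy 2D convolution (strictly, cross-correlation as used in CNNs): a 3x3
# filter slides over a 5x5 grayscale image, and at each position the
# element-wise products are summed to form one entry of the feature map.
image = np.arange(25, dtype=float).reshape(5, 5)   # example 5x5 "image"
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)       # example 3x3 edge filter

out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1
feature_map = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i + 3, j:j + 3]
        feature_map[i, j] = np.sum(patch * kernel)  # multiply element-wise, then sum
print(feature_map)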

Activation Functions:
The four models use two different activation functions, namely the ReLU and softmax
functions.

ReLU:
Standing for Rectified Linear Unit, this activation function is broadly used in CNNs.
It handles the problem of vanishing gradients and is useful for increasing the non-linearity
of layers. ReLU has many variants, namely noisy ReLUs, leaky ReLUs and parametric
ReLUs. Its advantage over other activation functions is attributed to its computational
simplicity.
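In code, ReLU is simply max(0, x) applied element-wise; the input vector below is an arbitrary example.

import numpy as np

# ReLU clips negative activations to zero: relu(x) = max(0, x)
x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(np.maximum(0, x))   # [0.  0.  0.  1.5 3. ]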
Softmax:
This activation function calculates relative probabilities by taking in the vector of raw
outputs of the neural network and returning a vector of probability scores. The softmax
function is

softmax(z)_i = e^(z_i) / Σ_j e^(z_j)

where z is the vector of raw outputs from the neural network, e is approximately 2.718,
and i stands for the i-th entry, which can be interpreted as the predicted probability of the
test input belonging to class i. The softmax function is used in all four models. Since it
returns a vector of probability scores, this activation function normalizes the input into a
probability distribution.
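A minimal NumPy sketch of this computation, with an arbitrary example vector of raw scores:

import numpy as np

# Softmax turns raw network outputs (logits) into a probability distribution
z = np.array([2.0, 1.0, 0.1])             # example raw outputs
probs = np.exp(z) / np.sum(np.exp(z))
print(probs)                               # approx. [0.659 0.242 0.099]
print(probs.sum())                         # sums to 1.0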

Pooling layers:
After the convolutional layers, pooling layers come into the picture. Pooling layers are
layers of neural nodes that reduce the size of the input feature set. We employ the
max-pooling technique as it assists in recognizing the salient features of the image.
Max-pooling is performed after convolution; since it keeps only the pixel of maximum
value in each region, it is a simple way to downsize an image while extracting the most
important feature in that region. Beyond downsizing, max-pooling also adds invariance,
which matters when we need to know the presence or absence of a feature rather than its
precise location. Here, a max-pooling layer of dimension 2*2 selects the maximum pixel
intensity values.
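A small sketch of 2*2 max-pooling on a toy 4*4 feature map (example values only):

import numpy as np

# Each non-overlapping 2x2 block of the feature map is replaced by its
# maximum value, halving the height and width.
feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 2],
                        [7, 2, 8, 3],
                        [1, 0, 4, 9]], dtype=float)
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6. 5.]
                #  [7. 9.]]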

Flattening layer and fully connected layers:

The input image is passed through the convolutional and pooling layers and is then fed
into the flattening layer. This layer, as the name suggests, flattens the feature maps into a
single column, thereby reducing their complexity. The result is then fed into the fully
connected layer, also known as the dense layer. This comprises multiple layers in which
every node in one layer is connected to every node in the next, allowing each layer to
extract features. On this basis, the network makes a prediction; this process is called
forward propagation. A cost function is then calculated: the cost function measures how
well the model performs on a particular dataset. The cost function used in the four models
is categorical cross-entropy. Cross-entropy is mainly used as a loss function for multi-class
classification models with two or more labels; it builds upon entropy and measures the
difference between two probability distributions. After this, back-propagation takes place,
and the whole process is repeated until the network achieves an optimum performance.
For this, the Adam optimization algorithm has been used in all four models.
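As an illustration of the loss used here, categorical cross-entropy compares the predicted probability vector with the one-hot true label; the numbers below are example values, not outputs of the trained models.

import numpy as np

# Categorical cross-entropy: loss = -sum(y_true * log(y_pred))
y_true = np.array([1.0, 0.0])              # one-hot label for a 2-class problem
y_pred = np.array([0.8, 0.2])              # example softmax output
loss = -np.sum(y_true * np.log(y_pred))    # = -log(0.8), approx. 0.223
print(loss)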

Reducing Over-fitting:
Over-fitting occurs when a model fits the training data too closely and fails to generalize
beyond it. Substantial over-fitting is exhibited by the first model; the next three models
use the dropout technique to counter it. Dropout randomly cuts connections between
neurons in successive layers during the training process, which discourages neurons from
co-adapting and pushes each neuron to form its own representation of the input data.
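A minimal Keras sketch of where a dropout layer sits between dense layers; the layer sizes and dropout rate here are illustrative and are not the exact architecture of the four classifiers.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# During training, a random 50% of the activations leaving the 128-unit
# layer are dropped on each update, discouraging co-adaptation of neurons.
demo_model = Sequential([
    Dense(128, activation='relu', input_shape=(100,)),
    Dropout(0.5),
    Dense(2, activation='softmax'),
])
demo_model.summary()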

Algorithm:
The number of epochs in these classifier models was fixed at 20, after observing that
models trained for more epochs over-fit the training datasets. Several optimizers were also
studied; the Adam optimizer was chosen as it produced the best results. Adam makes use
of exponentially decaying averages of past gradients.
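A minimal sketch of a single Adam update for one parameter, assuming the usual default hyperparameters (beta1 = 0.9, beta2 = 0.999); the gradient and parameter values are arbitrary examples.

import numpy as np

# m is an exponentially decaying average of past gradients, v of past
# squared gradients; both are bias-corrected before the parameter update.
beta1, beta2, lr, eps = 0.9, 0.999, 0.0001, 1e-8
m, v, t = 0.0, 0.0, 0
theta = 1.0      # example parameter
grad = 0.5       # example gradient for this step

t += 1
m = beta1 * m + (1 - beta1) * grad
v = beta2 * v + (1 - beta2) * grad ** 2
m_hat = m / (1 - beta1 ** t)
v_hat = v / (1 - beta2 ** t)
theta -= lr * m_hat / (np.sqrt(v_hat) + eps)
print(theta)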

First Classifier: A simple classifier model was trained, with the input images resized to
64*64, one convolutional layer with 32 feature maps, the ReLU activation function, and a
fully connected dense layer with 128 perceptrons.

Second Classifier: To improve on the first classifier, another convolutional layer with 64
feature maps was added for better feature extraction, and the number of perceptrons was
increased from 128 to 256 so that better learning could occur.

Third Classifier: Three convolutional layers, with 128 feature maps in the third
convolutional layer; the dense layer kept intact; a dropout layer introduced; and the
learning rate of the optimizer reduced to 0.0001. The third convolutional layer was
introduced for more detailed feature extraction, and the learning rate was reduced to
prevent over-fitting.

Fourth Classifier: Four convolutional layers, with 256 feature maps in the fourth
convolutional layer. Other parameters were kept the same as in the third model.
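As a rough Keras sketch of the first classifier described above (the later classifiers extend it with further convolutional layers, more perceptrons and dropout): the 3*3 kernel size, 2*2 max-pooling and single grayscale channel are assumptions, since they are not stated explicitly.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import Adam

# First classifier: one convolutional layer with 32 feature maps on 64x64
# input, ReLU activations, a 128-perceptron dense layer and a softmax
# output over the two classes (Pneumonia / Normal).
first_classifier = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(2, activation='softmax'),
])
first_classifier.compile(optimizer=Adam(), loss='categorical_crossentropy',
                         metrics=['accuracy'])
first_classifier.summary()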
MODEL:
The proposed work uses deep learning models, namely Convolutional Neural Network
(CNN) models, to detect pneumonia from scanned chest X-ray images of a patient.

PROGRAM :
import matplotlib.pyplot as plt

import tensorflow as tf

from tensorflow import keras

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense, Activation, Conv2D, MaxPooling2D, Flatten, Dropout, BatchNormalization

from tensorflow.keras.optimizers import Adam

from tensorflow.keras.callbacks import EarlyStopping

from tensorflow.keras.preprocessing.image import ImageDataGenerator

from sklearn.metrics import precision_recall_curve, roc_curve, accuracy_score, confusion_matrix, precision_score, recall_score

from sklearn.decomposition import PCA

from sklearn.model_selection import train_test_split


import seaborn as sns

plt.style.use('fivethirtyeight')

import pickle

import os

import numpy as np

import cv2
# %matplotlib inline

labels = ['PNEUMONIA', 'NORMAL']

img_size = 200

def get_training_data(data_dir):
    data = []
    for label in labels:
        path = os.path.join(data_dir, label)
        class_num = labels.index(label)
        for img in os.listdir(path):
            try:
                # Read each X-ray in grayscale and resize to img_size x img_size
                img_arr = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                resized_arr = cv2.resize(img_arr, (img_size, img_size))
                data.append([resized_arr, class_num])
            except Exception as e:
                print(e)
    # dtype=object because each entry pairs an image array with an integer label
    return np.array(data, dtype=object)

from google.colab import drive

drive.mount('/content/drive')

train = get_training_data('/content/drive/MyDrive/train')

test = get_training_data('/content/drive/MyDrive/test')

val = get_training_data('/content/drive/MyDrive/val')

pneumonia = 0
normal = 0

# Count how many training samples fall in each class (class 0 = PNEUMONIA)
for i, j in train:
    if j == 0:
        pneumonia += 1
    else:
        normal += 1

print('Pneumonia:', pneumonia)
print('Normal:', normal)
print('Pneumonia - Normal:', pneumonia - normal)

plt.imshow(train[1][0], cmap='gray')

plt.axis('off')

print(labels[train[1][1]])

X = []
y = []

# Combine the train, test and val splits into one pool; a fresh split is made below
for feature, label in train:
    X.append(feature)
    y.append(label)

for feature, label in test:
    X.append(feature)
    y.append(label)

for feature, label in val:
    X.append(feature)
    y.append(label)

X = np.array(X).reshape(-1, img_size, img_size, 1)

y = np.array(y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=32)

X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.20, random_state=32)

X_train = X_train / 255

X_test = X_test / 255

X_val = X_val / 255

datagen = ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    zca_whitening=False,
    rotation_range=90,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    vertical_flip=True)

datagen.fit(X_train)

model = Sequential()

model.add(Conv2D(256, (3, 3), input_shape=X_train.shape[1:], padding='same'))

model.add(Activation('relu'))

model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))

model.add(BatchNormalization(axis=1))

model.add(Conv2D(64, (3, 3), padding='same'))

model.add(Activation('relu'))

model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))

model.add(BatchNormalization(axis=1))

model.add(Conv2D(16, (3, 3), padding='same'))

model.add(Activation('relu'))

model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))

model.add(BatchNormalization(axis=1))

model.add(Flatten())

model.add(Dropout(0.5))

model.add(Dense(64))

model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(1))

model.add(Activation('sigmoid'))

early_stop = EarlyStopping(patience=3, monitor='val_loss', restore_best_weights=True)

adam = Adam(learning_rate=0.0001)

model.compile(loss='binary_crossentropy',optimizer=adam,metrics=['acc'])

model.summary()

history = model.fit(datagen.flow(X_train, y_train, batch_size=10), callbacks=[early_stop],
                    validation_data=(X_val, y_val), epochs=15)

model.evaluate(X_test, y_test)

plt.figure(figsize=(16, 9))

plt.plot(history.epoch, history.history['acc'])

plt.title('Model Accuracy')

plt.legend(['train'], loc='upper left')

plt.show()

plt.figure(figsize=(16, 9))

plt.plot(history.epoch, history.history['loss'])

plt.title('Model Loss')

plt.legend(['train'], loc='upper left')

plt.show()

plt.figure(figsize=(16, 9))

plt.plot(history.epoch, history.history['val_acc'])

plt.title('Model Validation Accuracy')

plt.legend(['validation'], loc='upper left')

plt.show()

plt.figure(figsize=(16, 9))
plt.plot(history.epoch, history.history['val_loss'])

plt.title('Model Validation Loss')

plt.legend(['validation'], loc='upper left')

plt.show()

pred = model.predict(X_train)

precisions, recalls, thresholds = precision_recall_curve(y_train, pred)

fpr, tpr, thresholds2 = roc_curve(y_train, pred)

def plot_precision_recall(precisions, recalls, thresholds):
    # Precision and recall arrays have one more entry than thresholds,
    # so the last value is dropped before plotting against the thresholds.
    plt.plot(thresholds, precisions[:-1], 'b--')
    plt.plot(thresholds, recalls[:-1], 'g-')
    plt.title('Precision vs. Recall')
    plt.xlabel('Thresholds')
    plt.legend(['Precision', 'Recall'], loc='best')
    plt.show()

def plot_roc(fpr, tpr):
    plt.plot(fpr, tpr)
    plt.plot([0, 1], [0, 1], 'k--')
    plt.title('FPR (False Positive Rate) vs TPR (True Positive Rate)')
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate (Recall)')
    plt.show()

plot_precision_recall(precisions, recalls, thresholds)

plot_roc(fpr, tpr)

predictions = model.predict(X_test)

binary_predictions = []

# Use the first threshold at which precision reaches at least 0.80
threshold = thresholds[np.argmax(precisions >= 0.80)]

for i in predictions:
    if i >= threshold:
        binary_predictions.append(1)
    else:
        binary_predictions.append(0)

# sklearn metrics expect (y_true, y_pred) in that order
print('Accuracy on testing set:', accuracy_score(y_test, binary_predictions))
print('Precision on testing set:', precision_score(y_test, binary_predictions))
print('Recall on testing set:', recall_score(y_test, binary_predictions))

# Confusion matrix layout illustration:
# https://silvrback.s3.amazonaws.com/uploads/4ab81a17-4a77-4e9e-b092-de5fac2afa07/confusionmatrix_large.png

matrix = confusion_matrix(y_test, binary_predictions)  # rows = true labels, columns = predicted labels

plt.figure(figsize=(16, 9))

ax= plt.subplot()

sns.heatmap(matrix, annot=True, ax = ax)

ax.set_xlabel('Predicted Labels', size=20)

ax.set_ylabel('True Labels', size=20)

ax.set_title('Confusion Matrix', size=20)

ax.xaxis.set_ticklabels(labels)

ax.yaxis.set_ticklabels(labels)

plt.figure(figsize=(10, 10))

# Show the first 25 test images with their predicted labels:
# blue captions mark correct predictions, red captions mark mistakes.
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_test.reshape(-1, img_size, img_size)[i], cmap='gray')
    if binary_predictions[i] == y_test[i]:
        plt.xlabel(labels[binary_predictions[i]], color='blue')
    else:
        plt.xlabel(labels[binary_predictions[i]], color='red')

plt.show()

model.save('pneumonia_detection_ai_version_3.h5')

PERFORMANCE METRICS/ OUTPUT

Accuracy, precision and recall on the testing set are used as the evaluation metrics.

 Convolutional Neural Network

Accuracy on testing set: 0.9115646258503401


Precision on testing set: 0.8704819277108434
Recall on testing set: 0.8257142857142857
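The abstract also lists the F1 score as an evaluation parameter; although the program does not print it, it follows directly from the precision and recall reported above:

# F1 is the harmonic mean of precision and recall, computed here from the
# values reported above for the testing set.
precision = 0.8704819277108434
recall = 0.8257142857142857
f1 = 2 * precision * recall / (precision + recall)
print('F1 score on testing set:', f1)   # approx. 0.8475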
OUTPUT SCREENSHOTS
REFERENCES
Kaushik, V. & Nayyar, Anand & Kataria, Gaurav & Jain, Rachna. (2020). Pneumonia
Detection Using Convolutional Neural Networks (CNNs). Lecture Notes in Networks and
Systems. 471-483. 10.1007/978-981-15-3369-3_36.
