
International Journal of Control and Automation
Vol. 13, No. 3, (2020), pp. 154-160

Lung Cancer Prediction Using Deep Learning Framework


R. Raja Subramanian, R. Nikhil Mourya, V. Prudhvi Teja Reddy, B. Narendra Reddy,
Srikar Amara
Kalasalingam Academy of Research and Education, India
rajasubramanian.r@klu.ac.in, vprudhvitejareddy@gmail.com,
nrendrareddyvishal326@gmail.com, mourya.ravella@gmail.com, srikar.ask@gmail.com

Abstract
Lung carcinoma, commonly known as lung cancer, is one of the most dangerous diseases in the
world. It is caused by the uncontrolled growth of cells in the lung tissues and can be cured only
if treatment is started in its early stage. The disease is detected using Computed Tomography
(CT) scans and blood test reports. With blood tests, the tumour is typically detected only after the
patient has been affected for a span of at least four years, so CT scanning is used to identify the
cancer at an early stage. The CT images are classified into normal and abnormal, and an abnormal
image is detected by focusing on the tumour region. The dataset is composed of Computed
Tomography (CT) images in jpg format. The proposed model is trained using a Convolutional
Neural Network (CNN). Pretrained ImageNet models, including LeNet, AlexNet and VGG-16, are
used to detect lung cancer. The proposed model uses AlexNet, and the features obtained from its
last fully connected layer are applied as input to a softmax classifier. The combination of AlexNet
and the softmax layer gives an accuracy of 99.52%. The proposed model serves as a consistent
and sustainable diagnostic model for lung cancer detection.
Keywords: Deep Learning, Lung Cancer Prediction, AlexNet, Softmax layer, CT images

1. Introduction
Cancer is a disease caused by the abnormal growth of cells or tissues in the human body. For
many years, cancer has been one of the deadliest diseases threatening human health. It is
estimated that 1.8 million new cancer cases will be diagnosed and more than 606,520 deaths will
occur in the United States in 2020, including an estimated 247,270 new lung cancer cases. A study
conducted in 2018 reported that 18.1 million new cancer cases would be added to the existing
cases worldwide and that around 9.6 million of them would result in death. Lung carcinoma is the
most common type of cancer in the world, with a rate of 13%. In the clinical field, the assessment
and diagnosis of lung CT images is a delicate procedure that requires time and high expertise.
Subjective assessment leads to variability among observers, so computer-based systems are in
demand. The diagnosis procedure can be carried out using existing open learning technologies,
which drastically reduces the cost. For this reason, one of the basic models preferred for lung
cancer (carcinoma) diagnosis is deep learning image networks. We have used convolutional
neural network models such as AlexNet, LeNet and VGG-16. The AlexNet model is used for
analysing and classifying lung CT images. It consists of eight layers: five of them are
convolutional and max-pooling layers, and three are fully connected layers. We use a hybrid
model in which the AlexNet CNN is connected to a softmax layer. In our study, the CT image
dataset is classified into cancer and non-cancer classes, and the 'Adam' optimizer is used for
training.

2. Background and Related work


In this section, a summary of the literature on lung cancer detection and analysis is presented.
Experiments were carried out on an open dataset composed of Computed Tomography (CT)
images, with CNNs used for feature extraction and classification [1]. Common types of lung
cancer and their characteristics are enumerated, and inputs are collected online from patients for
online detection using a reinforcement learning algorithm [2]. Noise is removed by applying a weighted
mean histogram equalization approach, and an improved profuse clustering technique (IPCT) is
used to enhance image quality [3]. A probabilistic model predicts lung cancer using a CNN-based
technique for deep feature extraction, in which a low-dose CT scan dataset is combined with
full-mode iterative recombination (IMR) [4]. A hybrid SVM-CNN model is built for pattern
recognition by replacing the last output layer of the CNN with an SVM [6, 7]. A deep CNN model
is used to develop a classification scheme over microscopic lung cancer images, with a softmax
layer used alongside the CNN for classification; the network consists of two fully connected
layers, three convolutional layers and three pooling layers. To overcome overfitting during
training, each image is passed through a set of operations such as reflection, rotation and
filtering, and the scheme achieved a 71% success rate in classification [7]. In another study, the
effect of including lymph node size as an input to a backpropagation ANN was analysed on 133
patients with non-small cell lung cancer, and an accuracy of 99.2% was claimed [9]. To predict
non-small cell lung cancer (NSCLC) in its early stage, an image classifier based on computational
histomorphometry, using texture, shape and nuclear orientation features from digitized H&E
tissue microarrays (TMA), was introduced; this model gives an accuracy of 81% [10].
A lung cancer detection system is developed using a computer-aided diagnostic system (CADS).
Edges in the images are extracted in three stages: in the primary stage, the vertical direction of
the edge map is extracted; in the secondary stage, the horizontal direction of the edge map is
extracted; and in the final stage, the vertical and horizontal directions of the edge map over all
transitions are subtracted and combined [12].

3. Proposed Model for Lung Cancer Prediction


The dataset is gathered from various sources, including open-source repositories and hospitals.
We gathered 100 images: 50 cancer images and 50 normal images [13]. The original format is
DICOM, which is later converted to jpg for training due to the limited format support in Python.
The resolution of each image is 504 x 504 pixels. Since the dataset is small, data augmentation
techniques are used to obtain better accuracy [14]. Different data augmentation strategies can
increase the classification accuracy of CNNs and traditional machine learning algorithms. In this
study, the image augmentation procedures are implemented through the Keras library in Python.
The augmentation techniques, applied to each image in the dataset, include cutting (shearing),
rotation, horizontal flipping, width and height shifting, and filling operations [15], as sketched
below.
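A minimal sketch of how such augmentation might be configured with Keras's ImageDataGenerator
follows; the specific parameter values, image size and directory layout are assumptions made for
illustration, not settings reported in the paper.

# Illustrative only: augmentation settings, image size and paths are assumed,
# not taken from the paper.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=15,        # random rotation
    width_shift_range=0.1,    # width change
    height_shift_range=0.1,   # height change
    shear_range=0.1,          # shearing ("cutting")
    horizontal_flip=True,     # horizontal flipping
    fill_mode='nearest',      # filling operation for newly created pixels
    rescale=1.0 / 255,        # scale pixel values to [0, 1]
)

# Hypothetical directory layout: dataset/train/cancer and dataset/train/normal
train_generator = augmenter.flow_from_directory(
    'dataset/train', target_size=(227, 227),
    batch_size=8, class_mode='categorical')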
AlexNet is an early and widely used architecture for image classification tasks and one of the
most successful CNN models. The AlexNet architecture consists of five convolutional layers,
three pooling layers and three fully connected layers, with convolutional and pooling filters of
size 3x3 and 2x2. The final layer used here is the softmax layer. The test data were used without
applying any augmentation techniques, to prevent overfitting.
A deep learning model is developed with the following layers:

3.1. Convolution
Convolution examines the image piece by piece; the pieces it searches for are known as
features [16]. The learned feature is compared with the input by matching it against each position
of the image. In this way, a CNN becomes better at recognizing local correspondence than
whole-image matching schemes [17]. A minimal sketch of this patch-wise matching is given below.
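The following toy sketch illustrates the patch-by-patch matching with a hand-written filter; the
3x3 edge detector is only an example and not a filter learned by the proposed model.

# Toy illustration of convolution: slide a small filter over the image and record
# how strongly each patch matches it. The filter here is hand-picked, not learned.
import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)   # similarity of patch and filter
    return feature_map

image = np.random.rand(8, 8)                   # stand-in for a small CT image patch
vertical_edge = np.array([[1, 0, -1]] * 3)     # simple 3x3 vertical-edge feature
print(convolve2d(image, vertical_edge).shape)  # -> (6, 6) feature map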


3.2. ReLU Layer


The rectified linear unit (ReLU) layer separates the values of the feature maps into negative and
positive: whenever a value is negative, it is assigned zero [18]. This is done to prevent the values
from summing towards zero. The transfer function only activates a node if the input is above a
certain quantity: while the input is below zero, the output is zero, but once the input rises above
the threshold, it has a linear relationship with the dependent variable [19], as illustrated in the
sketch below.
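A one-line sketch of this behaviour (negative inputs become zero, positive inputs pass through
linearly):

# ReLU: negative values are assigned zero, positive values are kept unchanged.
import numpy as np

def relu(x):
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # negatives become 0, positives pass through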

3.3. Max Pooling


Pooling is used to shrink the image stack to a smaller size. First, a window size and a stride are
chosen; the window is then walked across each filtered image, and from every window the
maximum value is taken. This is applied to every filtered feature map, as sketched below.
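A small sketch of 2x2 max pooling with a stride of 2, matching the description above:

# Max pooling sketch: walk a window over the feature map and keep only the
# maximum value from each window.
import numpy as np

def max_pool(feature_map, window=2, stride=2):
    h = (feature_map.shape[0] - window) // stride + 1
    w = (feature_map.shape[1] - window) // stride + 1
    pooled = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            pooled[i, j] = feature_map[i * stride:i * stride + window,
                                       j * stride:j * stride + window].max()
    return pooled

fm = np.arange(16).reshape(4, 4)  # toy 4x4 feature map
print(max_pool(fm))               # 2x2 map of window maxima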

3.4. Fully Connected Layer


This is the last layer, where the actual classification occurs. The filtered and shrunken feature
maps are flattened and placed into a single list [20]; this is repeated until the multi-dimensional
output is reduced to a single vector. The input vector is then compared with the trained data, and
the data is classified according to the values learned during training. If the input matches the
trained dataset, the layer provides the best possible output for that input [21]. The most common
CNN models are AlexNet, LeNet and VGG-16; in the proposed model we use AlexNet. AlexNet
was developed by Alex Krizhevsky, Ilya Sutskever and Geoffrey Hinton in 2012 [22].
The network design of AlexNet consists of eight layers: five of them are convolutional and
max-pooling layers and three are fully connected layers. ReLU (Rectified Linear Unit) is used as
a non-saturating activation function, and softmax is used as the last activation function in the
proposed algorithm. Classification of lung cancer CT images is done using deep CNN
models [23]. Deep feature extraction is performed by the last fully connected layer of AlexNet,
and softmax is the last layer, fed with these deep features.
In the proposed model, we use five convolutional and max-pooling layers, with each convolutional
layer followed by a max-pooling layer that uses a stride of 2. The activation function used in this
model is ReLU, which assigns zero to negative values and keeps positive values unchanged [24];
this is done to prevent the values from summing towards zero. A flatten layer is used to reshape
the n-dimensional feature maps into the correct format; it is always placed after the last
max-pooling layer, converts all the resulting arrays into a single long continuous linear vector,
and does not affect the batch size. The flatten layer is followed by dense layers, and the last two
layers of the proposed model are dense layers. The first dense layer uses ReLU as its activation
function with 64 units and is connected to the softmax layer, whose two units correspond to the
two classes. The softmax layer produces a probability distribution with values ranging from 0 to
1, and the class with the highest probability is predicted as the result [25]. The architecture of the
proposed model, consisting of 8 layers, is shown in figure 1. The lung image dataset was trained
using the proposed algorithm, with augmentation techniques such as cutting, rotation, horizontal
flipping, and height and width shifting applied to the training images, which were then trained
with the AlexNet CNN model. For testing, the data is used without augmentation, which gives the
best output and avoids overfitting. The test set is inferred separately and passed to the softmax
layer to obtain posterior probabilities, which predict the output. A minimal Keras sketch of this
pipeline is given below.
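The following minimal Keras sketch follows the description above: five convolution + max-pooling
stages with stride-2 pooling, a flatten layer, a 64-unit ReLU dense layer and a two-unit softmax
output, trained with the Adam optimizer. The filter counts, kernel sizes, input resolution and loss
function are assumptions made for illustration; they are not fully specified in the paper.

# Illustrative sketch of the proposed AlexNet-style classifier; layer widths,
# kernel sizes, input size and loss are assumed where the paper does not state them.
from tensorflow.keras import layers, models

def build_model(input_shape=(227, 227, 3), num_classes=2):
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    # Five convolution + max-pooling stages, pooling with a stride of 2.
    for filters in (32, 64, 128, 128, 256):        # assumed filter counts
        model.add(layers.Conv2D(filters, (3, 3), activation='relu', padding='same'))
        model.add(layers.MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(layers.Flatten())                     # reshape feature maps into one long vector
    model.add(layers.Dense(64, activation='relu'))  # first dense layer with 64 units
    model.add(layers.Dense(num_classes, activation='softmax'))  # two units: cancer vs. normal
    model.compile(optimizer='adam',                 # 'Adam' optimizer as stated in the paper
                  loss='categorical_crossentropy',  # assumed loss for a two-class softmax
                  metrics=['accuracy'])
    return model

model = build_model()
model.summary()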


Figure 1. Deep Learning Architecture for Lung Cancer Prediction Model

4. Experimentation and Results


The model is evaluated on the lung cancer data and the empirical evaluations are reported below.
The loss function measures how far the predicted values deviate from the actual values and is
computed on both the training and validation data. A high loss represents poor prediction: a loss
of zero corresponds to a perfect prediction, while a higher loss indicates a worse model [13, 14];
a toy numerical illustration is given after figure 2. The loss percentage of the model is depicted in
figure 2, and it is evident that the model exhibits a lower loss percentage in subsequent epochs.
The accuracy of training the model is presented in figure 3; the model achieves an accuracy of
99.52%.

Figure 2. Loss Percentage Graph
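As a toy numerical illustration of the loss behaviour described above (zero for a perfect
prediction, larger for a poor one), a cross-entropy loss is used here as an assumed example; the
paper itself reports loss only as a percentage.

# Toy example: cross-entropy is zero for a perfect prediction and grows as the
# predicted probabilities move away from the true label (assumed loss for illustration).
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred))

print(cross_entropy(np.array([0, 1]), np.array([0.0, 1.0])))  # 0.0 -> perfect prediction
print(cross_entropy(np.array([0, 1]), np.array([0.7, 0.3])))  # ~1.2 -> poor prediction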
A confusion matrix, also known as an error matrix, is a table used to describe the performance of
a classifier on a test dataset for which the true values are known. In a classification problem, it
summarizes the predicted results: its main aim is to summarize the correct and incorrect
predictions, with count values for each class, and it shows the types of errors made by the
classifier. The matrix relates the actual class and the predicted class, each of which can be
positive or negative. Positive means the observation (actual value) is true; negative means the
observation (actual value) is false.


From these positive and negative classes, four terms are derived for ease of understanding: True
Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) [15, 16, 17]. In
TP, both the observation and the prediction are true; in TN, both are false; in FP, the observation
is false but the prediction is true; in FN, the observation is true but the prediction is false. The
confusion matrix is graphically depicted in figure 4, and a small sketch of how these counts can
be obtained follows the figure.
Figure 3. Accuracy of the Proposed Model

Figure 4. Confusion Matrix Representation
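A brief sketch of how the TP, TN, FP and FN counts can be read from a confusion matrix with
scikit-learn; the label vectors below are made up for illustration and are not the paper's test data.

# Hypothetical labels (1 = cancer, 0 = normal) used only to illustrate how the
# four confusion-matrix counts are obtained.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1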

Table 1. Model Parameters and Empirical Results


Model                      ACC %    Loss %    Precision    Recall    F-Score
AlexNet + SVM              98.62    0.724     98.895       86.459    92.258
AlexNet + Deep kNN         97.75    0.761     98.478       84.125    90.737
AlexNet Model + softmax    99.52    0.649     99.203       88.265    93.416
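For reference, the precision, recall and F-score columns in Table 1 follow from the
confusion-matrix counts in the usual way; the counts in this brief sketch are made up and do not
reproduce the reported numbers.

# How precision, recall and F-score relate to TP, FP and FN (counts are hypothetical).
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=86, fp=1, fn=13)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")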

5. Conclusion
A deep learning model leveraging the pretrained AlexNet model combined with a softmax layer is
developed and used to efficiently classify lung CT images for cancer. The proposed model is
compared with existing state-of-the-art models, and the empirical evaluations show that it
outperforms similar models available in the literature, with an accuracy of 99.52%. In addition, a
user interface leveraging python-tkinter is developed for effective use by the general public. This
health care project aims to be a sustainable application for classifying lung CT images for cancer
prediction, and the results from the application are on par with clinical results.
In the future, we intend to identify the influential factors for lung cancer in the human body and
use them as features for cancer prediction. Physiological parameters for prediction include
pressure, oxygen level and body temperature. An IoMT application can be developed to extract
these physiological parameters, and effective classification and analysis can be performed
leveraging fog-assisted cloud computing.


References
[1] M. Togacara, B. Ergen, Z. Cömert, “Detection of lung cancer on chest CT images using
minimum redundancy maximum relevance feature selection method with convolutional
neural networks”, Elsevier Journal of Biocybernetics and Biomedical engineering vol. 40,
no. 1, (2020) pp. 23-34.
[2] Z. Liu, C. Yao, H. Yu, T. Wu, “Deep reinforcement learning with its application for lung
cancer detection in medical Internet of Things”, Elsevier Journal of Future Generation
Computer Systems, vol. 97, (2019) pp. 1- 9.
[3] P. M. Shakeel, M. A. Burhanuddin, M. I. Desa, “Lung cancer detection from CT image
using improved profuse clustering and deep learning instantaneously trained neural
networks”, Elsevier Journal of Measurement, vol. 145, (2019) 702-712.
[4] Y. Liu, H. Wang, Y. Gua, X. Lv, “Image classification toward lung cancer recognition by
learning deep quality model”, Elsevier Journal of Visual Communication and Visual
Representation, vol. 63, (2019) pp. 274-282.
[5] P. Huang, C. T. Lin, Y. Li, M. Tammemagi, “Prediction of lung cancer risk at follow-up
screening with low-dose CT: a training and validation study of a deep learning method”,
The Lancet Digital Health, vol. 1, (7), (2019).
[6] S. K. Lakshmanaprabu, S. N. Mohanty, K. Shankar, N. Arunkumar, G. Ramirez, “Optimal deep
learning model for classification of lung cancer on CT images”, Future Generation Computer
Systems, vol. 92, (2018) pp. 374-382.
[7] S. Alagarsamy, K. Kamatchi, V. Govindaraj, Y. D. Zhang, A. Thiyagarajan, “Multi-
channeled MR brain image segmentation: A new automated approach combining BAT and
clustering technique for better identification of heterogeneous tumors”, Biocybernetics and
Biomedical Engineering, vol.39, no.4 (2019), pp.1005-1035.
[8] S. Alagarsamy, K. Kamatchi, V. Govindaraj, A. Thiyagarajan “A fully automated hybrid
methodology using Cuckoo-based fuzzy clustering technique for magnetic resonance brain
image segmentation”, International journal of Imaging systems and technology, vol. 27,
no.4, (2017), pp. 317-332.
[9] S. Alagarsamy, K. Kamatchi, V. Govindaraj, “A Novel Technique for identification of
tumor region in MR Brain Image,” 3rd International conference on Electronics,
Communication and Aerospace Technology (ICECA), (2019), pp. 1061-1066.
[10] V. Govindaraj, V, Sofia Fazila, M. Beevi, M. Marikani, “An Imitating Device for Assisting
Trans-Radial Amputees using Gestures,” International Journal of Advanced Science and
Technology, vol. 29, no.7s, (2020), pp. 2922-2931.
[11] V. Govindaraj, V. Sivasankar, T. Raja, S. Alagarsamy, A. Thiyagarajan, “Light Guidance
for Easy Coagulation, Cutting and Desiccation in Surgical Diathermy,” Journal of Advanced
Science and Technology, vol. 29, no.7s, (2020), pp. 2941-2947.
[12] V. Govindaraj, K. Venkatesan, K. Seethapathy, S. Suthanthiram, S. Alagarsamy, A.
Thiyagarajan, “An Imitating Wearable Kidney: A Dialectical Replacement For The
Cumbersome Dialysis Procedure For Renal Failures,” Journal of Advanced Science and
Technology, vol. 29, no.7s, (2020), pp. 2932-2940.
[13] A. Thiyagarajan, P. Sankar, A.R. Chandragurunathan, J.M. Jaffer, V. Govindaraj,
S. Alagarsamy, “Detection of Alzheimer’s Disease Using Soft Computing Techniques,”
Journal of Advanced Science and Technology, vol. 29, no.8s, (2020), pp. 2914-2921.
[14] S. Alagarsamy, V. Govindaraj, K. Meenakumari, K. Priyadhara, “Identification of various
diseases in guava fruit using spiral optimization (SPO) technique,” Test Engineering and
Management, vol.83, (2020), pp.9561-9566.
[15] R.R. Subramanian, K. Seshadri, “Design and Evaluation of a Hybrid Hierarchical Feature
Tree Based Authorship Inference Technique”, In: Kolhe M., Trivedi M., Tiwari S., Singh V.
(eds) Advances in Data and Information Sciences. Lecture Notes in Networks and Systems,
vol 39. Springer, Singapore (2019).
[16] QZ. Song, L. Zhao, XK. Luo, XC. Dou, “Using deep learning for classification of lung
nodules on computed tomography images”, Journal of Health Engineering, vol. 72 (7),
(2017).


[17] L. K. Toney, H. J. Vesselle, “Neural networks for nodal staging of non-small cell lung cancer
with FDG PET and CT: importance of combining uptake values and sizes of nodes and
primary tumor”, Radiology, vol.270, no.1 (2014), pp.91- 98.
[18] R.R Subramanian, R. Ramar, “Design of Offline and Online Writer Inference Technique”,
International Journal of Innovative Technology and Exploring Engineering, vol. 9, (2019),
pp. 1-4.
[19] X. Wang, A. Janowczyk, Y. Zhou, R. Thawani, P. Fu, K. Schalper, “Prediction of
recurrence in early stage non-small cell lung cancer using computer extracted nuclear
features from digital H&E images”, Sci Rep (2017) 7:13543
[20] Potghan S., “Multi-layer perceptron-based lung tumor classification”, Second int conf
Electron common aerospace technology, (2018) pp. 499- 502.
[21] R.R. Sudharsan, J. Deny, E. Kumaran, A.S. Geege, “An Analysis of Different Biopotential
Electrodes Used for Electromyography”, Journal of Nano- and Electronic Physics (2020) pp.
1-7.
[22] R.R. Krishna, P.S. Kumar, R.R. Sudharsan, “Optimization of wire-length and block
rearrangements for a modern IC placement using evolutionary techniques”, IEEE
International Conference on Intelligent Techniques in Control, Optimization and Signal
Processing, (2017) pp. 1-4.
[23] J. Deny, R.R Sudharsan, “Block Rearrangements and TSVs for a Standard Cell 3D IC
Placement”, In Intelligent Computing and Innovation on Data Science, Springer, (2020) pp.
207-214.
[24] R.R. Sudharsan, J. Deny, “Field Programmable Gate Array (FPGA)-Based Fast and Low-
Pass Finite Impulse Response (FIR) Filter”, In Intelligent Computing and Innovation on
Data Science, Springer, (2020) pp. 199-206.
[25] C. D. Scott and R. E. Smalley, “Diagnostic Ultrasound: Principles and Instruments”, Journal
of Nanosci. Nanotechnology., vol. 3, no. 2, (2003), pp. 75-80.
