A DEEP LEARNING APPROACH ON COLON CANCER SUBTYPE

CLASSIFICATION

BY
Meherol Hasan
ID: 192-15-13234

This Report Presented in Partial Fulfillment of the Requirements for the


Degree of Bachelor of Science in Computer Science and Engineering

Supervised By

Mohammad Jahangir Alam


Associate Professor
Department of CSE
Daffodil International University

Co-Supervised By

Assistant Professor
Department of CSE
Daffodil International University

DAFFODIL INTERNATIONAL UNIVERSITY


DHAKA, BANGLADESH
JANUARY 2024
APPROVAL

©Daffodil International University


This thesis titled “A DEEP LEARNING APPROACH ON COLON CANCER
SUBTYPE CLASSIFICATION”, submitted by Meherol Hasan, ID: 192-15-13234, to
the Department of Computer Science and Engineering, Daffodil International University,
has been accepted as satisfactory for the partial fulfillment of the requirements for the
degree of B.Sc. in Computer Science and Engineering and approved as to its style and
contents. The presentation was held on 22 July 2023.

BOARD OF EXAMINERS

Dr. Sheik Hayder Noori


Professor and Head (Chairman)
Department of CSE
Faculty of Science & Information Technology
Daffodil International University

(Name) Internal Examiner


Designation
Department of CSE
Faculty of Science & Information Technology
Daffodil International University

(Name) Internal Examiner


Designation
Department of CSE
Faculty of Science & Information Technology
Daffodil International University

(Name) External Examiner


Designation
Department of -------
Jahangirnagar University



DECLARATION
I hereby declare that this project has been done by me under the supervision of
Mohammad Jahangir Alam, Associate Professor, Department of CSE, Daffodil
International University. I also declare that neither this project nor any part of this project
has been submitted elsewhere for the award of any degree or diploma.

Supervised by:

Mohammad Jahangir Alam


Associate Professor
Department of CSE
Daffodil International University

Co-Supervised by:

Assistant Professor
Department of CSE
Daffodil International University

Submitted by:

Meherol Hasan
ID: 192-15-13234
Department of CSE
Daffodil International University



ACKNOWLEDGEMENT

First, I express my heartiest thanks and gratefulness to Almighty Allah, whose divine
blessing has made it possible for me to complete the final-year project successfully.

I am really grateful to, and wish to express my profound indebtedness to, Mohammad
Jahangir Alam, Associate Professor, Department of CSE, Daffodil International
University, Dhaka. My supervisor's deep knowledge of and keen interest in the field of
Machine Learning enabled me to carry out this project. His endless patience, scholarly
guidance, continual encouragement, constant and energetic supervision, constructive
criticism, valuable advice, and reading and correcting of many inferior drafts at every
stage have made it possible to complete this project.

I would like to express my heartiest gratitude to Dr. Sheik Hayder Noori, Head,
Department of CSE, for his kind help in finishing my project, and also to the other
faculty members and the staff of the CSE department of Daffodil International University.

Finally, I must acknowledge with due respect the constant support and patience of my
parents.



ABSTRACT

The treatment of gastrointestinal and colorectal cancers using traditional chemotherapy
approaches presents considerable difficulties. Immunotherapy, however, has emerged as a
promising alternative, particularly for tumors harboring specific mutations, such as
Microsatellite Instability (MSI) cancers, which exhibit deficiencies in the DNA
Mismatch-Repair system (dMMR). Approximately 85% of these cancers possess a
proficient DNA Mismatch-Repair system (pMMR) and are classified as Microsatellite
Stable (MSS) tumors, while around 15% of patients demonstrate deficiencies in their
DNA Mismatch-Repair system (dMMR), leading to Microsatellite Instability (MSI)
[1]. Immunotherapy has shown promise in treating MSI tumors, but it is ineffective
against MSS tumors. It is therefore essential to accurately classify MSI versus MSS
tumors in order to implement tailored treatment strategies. Detecting MSI cancers
beyond stage III remains challenging, even though such tumors are sensitive to immune
checkpoint inhibitors such as pembrolizumab.
This research introduces a deep learning-based transfer learning approach that uses a
modified EfficientNetB2 model for the classification of MSI and MSS cancers. The
classification is performed using histological images derived from formalin-fixed
paraffin-embedded (FFPE) samples. The proposed model achieved remarkable accuracy
of 98.29%, an F1 score of 98.0%, and an impressive AUC of 99.80%, outperforming
existing models. These findings highlight the potential of the modified EfficientNetB2
model in distinguishing MSI from MSS tumors, which can improve diagnostic accuracy
and treatment strategies for gastrointestinal cancer.

TABLE OF CONTENTS

CONTENTS PAGE NO.



Board of Examiners ii

Declaration iii

Acknowledgments iv

Abstract v

CHAPTER 1: INTRODUCTION 01-06


1.1 Introduction 1

1.2 Motivation 2

1.3 Rationale of the Study 3

1.4 Expected Output 4

1.5 Research Question 4

1.6 Report Layout 5

CHAPTER 2: BACKGROUND 6-9


2.1 Preliminaries/Terminologies 6

2.2 Related works 6

2.3 Comparative Analysis and Summary 8

2.4 Scope of the Problem 9

2.5 Challenges 9

CHAPTER 3: RESEARCH METHODOLOGY 11-26


3.1 Introduction 11

3.2 Working Process 12

3.3 Data Collection Procedure 14

3.4 Image Pre-processing 15

3.5 CNN Transfer Learning Development 16

3.6 Selection of Transfer Learning Models 17-22



3.7 EfficientNetB2 Architecture 20
3.8 Training and Testing 21

CHAPTER 4: EXPERIMENTAL RESULTS AND 23-40


DISCUSSION
4.1 Introduction 23

4.2 Experimental Results 24

4.3 Descriptive Analysis 48

4.4 Summary 39

CHAPTER 5: IMPACT ON SOCIETY, 42-45


ENVIRONMENT, AND SUSTAINABILITY
5.1 Impact on Society 41

5.2 Impact on Environment 42

5.3 Ethical Aspects 43

5.4 Sustainability Plan 44

CHAPTER 6: SUMMARY, CONCLUSION, 46-48


RECOMMENDATION, AND IMPLICATIONS FOR
FUTURE RESEARCH
6.1 Summary of the Study 46

6.2 Conclusion 46

6.3 Implications for Future Work 48

REFERENCES 48-50

LIST OF FIGURES
FIGURES PAGE NO



Figure 3.1: An overview of the entire classification process 14
Figure 3.3: Dataset Ratio 15
Figure 3.4: Some images of MSS and MSI 16
Figure 3.5: The standard CNN model architecture 17

Figure 4.2.1.1: EfficientNetB2 Model Performance 25

Figure 4.2.1.2: Confusion matrix of the EfficientNetB2 model 26

Figure 4.2.2.1: InceptionV3 Model Performance 27

Figure 4.2.2.2: Confusion matrix of the InceptionV3 model 28

Figure 4.2.3.1: MobileNet Model Performance 28

Figure 4.2.3.2: Confusion matrix of the MobileNet model 29


Figure 4.2.4.1: MobileNetV2 Model Performance 30
Figure 4.2.4.2: Confusion matrix of the MobileNetV2 model 30
Figure 4.2.5.1: ResNet50 Model Performance 31
Figure 4.2.5.2: Confusion matrix of the ResNet50 model 31
Figure 4.2.6.1: VGG16 Model Performance 32
Figure 4.2.6.2: Confusion matrix of the VGG16 model 33
Figure 4.2.7.1: VGG19 Model Performance 33
Figure 4.2.7.2: Confusion matrix of the VGG19 model 34
Figure 4.2.8.1: DenseNet201 Model Performance 35
Figure 4.2.8.2: Confusion matrix of DenseNet201 model 35
Figure 4.2.9: Accuracy comparison of different Models 37

LIST OF TABLES
TABLES PAGE NO

Table 1: Accuracy Comparison of Different Models 36

CHAPTER 1
INTRODUCTION

1.1 Introduction
Colon cancer is a prevalent and deadly disease, ranking as the third leading cause of
cancer-related deaths globally [2]. Molecular research
has revealed different subtypes of colon cancer with varying clinical outcomes. One such
classification is based on microsatellite instability (MSI) status, where MSI-high colon
cancer shows a distinct response to chemotherapy and improved survival compared to
microsatellite-stable (MSS) cancer [3]. At present, pathologists are relied upon for
visually diagnosing colon cancer, which can lead to subjectivity and diagnostic errors due
to the time-consuming nature of the process [4]. However, deep learning techniques can
potentially revolutionize medical image analysis by providing automated and efficient
solutions. In recent years, the healthcare sector has conducted significant research into
deep learning (DL) technology, particularly in the field of cancer detection [5]. Deep
learning has proven to be a reliable method for analyzing medical images and accurately
classifying colon cancer MSS vs. MSI from histological images [6]. In this study, I am
examining how effective it is to use deep learning to classify colon cancer MSS and MSI
based on histological image analysis. By training and evaluating deep learning models on
a comprehensive dataset of colon cancer patients, I aim to assess their classification
performance, robustness, and generalizability through rigorous testing and validation.
This research has the potential to impact clinical practice by aiding clinicians in making
informed treatment decisions, leading to improved patient care and outcomes. By
advancing the medical image analysis field, I aspire to contribute to enhanced diagnoses,
personalized treatment approaches, and improved patient outcomes in the battle against
colon cancer. In particular, I am examining how transfer learning techniques can enhance
the proficiency of my deep learning models. Transfer learning lets me leverage pre-
trained models on large datasets to extract relevant features from histological images,
improving classification accuracy. My results demonstrate the effectiveness of transfer
learning in classifying colon cancer MSS vs. MSI, with my models achieving high
accuracy and robustness. Furthermore, I have investigated the interpretability of my



models by visualizing the learned features and analyzing their relevance to colon cancer
classification. Through this approach, I can better understand my models’ internal
processes and the factors that contribute to their outstanding performance, including their
decision-making abilities. Overall, my research adds to the existing studies on using deep
learning to analyze medical images and can potentially impact clinical practice by
providing an automated and reliable solution for colon cancer classification. I believe my
findings can pave the way for future research and improve patient care and outcomes in
the fight against colon cancer.

1.2 Motivation
Colon cancer (GC) is a highly heterogeneous and prevalent malignancy, ranking as the
second leading cause of cancer-related deaths worldwide and particularly prevalent in
East Asia. The disease can be classified into different molecular subtypes based on
genomic features, such as microsatellite instability (MSI) or microsatellite stability
(MSS). MSI is characterized by a high mutation rate due to DNA mismatch repair
(MMR) gene defects, while MSS is associated with chromosomal instability and a low
mutation rate. The MSI and MSS subtypes of GC exhibit distinct clinical and biological
characteristics, influencing prognosis, chemotherapy response, immune infiltration, and
tumor mutation burden. Accurate classification of GC based on MSI/MSS status is
essential for enhancing diagnosis, prognosis, and treatment strategies. However, existing
methods for MSI/MSS classification face challenges including cost, invasiveness,
variability, and low sensitivity. Although histological images of GC tissue samples are
routinely obtained, they are not extensively utilized for MSI/MSS classification. Recent
studies have demonstrated the potential of deep learning techniques in analyzing
histological images to extract relevant features and patterns for MSI/MSS classification.
Deep learning, a branch of machine learning utilizing artificial neural networks, can learn
from vast amounts of data and perform complex tasks such as image recognition, natural
language processing, and speech synthesis. In the biomedical field, deep learning has
shown successful applications in cancer detection, diagnosis, prognosis, and treatment. I
chose this topic because I have a strong interest in applying deep learning techniques to
biomedical problems, especially cancer. I have a background in computer science and



bioinformatics, and I have experience in developing and applying deep learning models
to various datasets. I was inspired by the recent advances and challenges in GC molecular
subtyping and the potential of deep learning to provide novel insights and solutions.
Moreover, I have a personal motivation for this topic, as one of my aunts was diagnosed
with GC and underwent surgery and chemotherapy. I hope that my research will
contribute to improving the outcomes and quality of life of GC patients like her.

1.3 Rationale of the Study


Colon cancer is a significant health problem around the world, and unfortunately, there
are not many effective treatments available, and the outlook for those diagnosed with it is
not optimistic [7]. It is a heterogeneous disease with several subtypes based on molecular
characteristics. One such classification is based on microsatellite instability (MSI) status.
MSI is a condition that arises when the DNA mismatch repair system fails, leading to an
accumulation of mutations in repetitive DNA sequences known as microsatellites [7].
Colon cancers can be classified as Microsatellite Instability (MSI) or microsatellite
stability (MSS). Accurate classification of these subtypes using histological images is
crucial for effective treatment. Deep learning is an effective method for image analysis
and has shown great promise in medical imaging. This research aims to build a deep-
learning model that can differentiate between the MSS and MSI subtypes of colon cancer
using histology images. The model will be trained and validated on a sizeable histological
image dataset and evaluated for accuracy and robustness. Developing such a model can
improve the accuracy of colon cancer subtype classification and ultimately lead to more
effective treatment for patients. This research will add to the increasing amount of
information on the topic. Deep learning has the potential to significantly impact medical
imaging research, especially when it comes to examining colon cancer [8] [9] [10]. In
addition to developing the deep learning model, this study will investigate the underlying
mechanisms that allow the model to classify colon cancer subtypes accurately. This will
involve analyzing the features that the model uses to make its predictions and
determining their biological relevance. This information can offer valuable understanding
and perspective on the biology of colon cancer and may result in the identification of
novel biomarkers and Therapeutic targets. Overall, this study has the potential to make



significant contributions to colon cancer research by developing a powerful tool for
subtype classification and providing new information has been discovered about the
biology of the illness.

1.4 Expected Output


a) To Develop a robust and accurate image analysis model for extracting relevant features
from histological images of gastrointestinal cancer.
b) To Investigate and compare different machine learning algorithms and techniques for
classifying MSI and MSS subtypes.
c) To Optimize the classification model to accurately distinguish between MSI and MSS
gastrointestinal cancer based on histological images.
d) To Assess the generalizability of the proposed classification system by evaluating its
performance on diverse datasets and variations in staining techniques, tissue preparation,
and image quality.
e) To Conduct a comparative analysis and performance evaluation of the developed
classification system against existing manual and automated approaches.
f) To Explore the interpretability and explainability of the classification model to provide
insights into the discriminative features contributing to the classification decision.
g) To Investigate the developed system’s potential clinical implications and utility in
supporting treatment planning and managing gastrointestinal oncology patients.
h) To Provide recommendations and guidelines for integrating and deploying the
developed classification system in clinical practice or research settings.

1.5 Research Questions


a) Can a deep learning model accurately classify colon cancer subtypes (MSS vs. MSI)
using histological images?
b) How does the performance of the deep learning model compare to existing methods
for colon cancer subtype classification?
c) What underlying mechanisms allow the deep learning model to classify colon cancer
subtypes accurately?



d) Which features does the deep learning model use to make predictions, and how
biologically relevant are they?
e) Can the deep learning model provide new insights into the biology of colon cancer and
potentially identify new biomarkers or therapeutic targets?

1.6 Report Layout


Chapter 1: This chapter introduces the topic of the thesis, the personal motivation behind
it, the problem definition, the research question, the research methodology, and the
research objective.
Chapter 2: This chapter reviews the literature on the history of the study, its related
activities, scope of the problem and challenges.
Chapter 3: This chapter describes the research methodology and architecture of this
study, including the data collection, preprocessing, and analysis steps.
Chapter 4: This chapter presents and evaluates the performance of the proposed model,
using metrics such as the recall, precision, f1-score value and confusion matrix. It also
compares and analyzes the results with existing methods for MSI/MSS classification of
GC histological images.
Chapter 5: This chapter discusses the study's impact on society and the environment, its
ethical aspects, and the sustainability plan.
Chapter 6: This chapter summarizes the main findings and contributions of this study
and suggests some directions for future research.



CHAPTER 2
LITERATURE REVIEW

2.1 Preliminaries/Terminologies
Colon cancer is a complex disease with many variations, making it difficult to diagnose
and treat. Researchers are exploring new ways to better understand and manage this
disease, including the use of deep learning to classify colon cancer subtypes. This
approach has the potential to provide more personalized treatment options for patients.
Colon cancer is known for its heterogeneity in terms of its microenvironment, genome
instability, and oncogenic signatures. Despite this, there is still a lack of classification that
combines these features. A recent analysis by The Cancer Genome Atlas (TCGA) has
categorized colon cancer into four molecular subtypes: Epstein–Barr virus (EBV)-
positive, microsatellite instability (MSI), genomically stable, and chromosomally
unstable tumors. Researchers have used deep learning to detect subtypes of colon cancer that are
sensitive to immunotherapy using histologic images. A new framework for cancer
classification, called deep cancer subtype classification (DeepCC), has also been
developed based on deep learning of functional spectra. In this chapter, I will dive into
the background of colon cancer subtype classification and explore the role of deep
learning in this field.

2.2 Related works


Classification of MSI and MSS gastrointestinal cancer using deep learning is a
challenging task that requires accurate and efficient methods to distinguish between the
two subtypes of colorectal and colon cancer. MSI and MSS are molecular biomarkers
that indicate whether defects are present in the DNA mismatch repair system, which
affects the response to immunotherapy and chemotherapy. Different deep learning
architectures have been suggested to address this problem, using different strategies such
as transfer learning, feature fusion, attention mechanisms, and lightweight models. Some
of the related papers are:



In one study [11], the authors utilized a pre-trained Xception network to categorize
histological images of gastrointestinal cancer as MSI or MSS. They trained the network
on 153,849 augmented images and confirmed its accuracy by validating it on 19,230
images, achieving a success rate of 93.18%. Additionally, they tested the network on a
further 19,230 images and achieved a testing accuracy of 90.17% and a test AUC of
0.932. The study successfully demonstrated the effectiveness of transfer learning with the
Xception network for histological image classification. In the study [12], the authors
utilized a modified ResNet model that analyzed 192,000 histological images categorized
into 80% for training, 10% for testing, and 10% for validation. The model aimed to
distinguish between the MSI and MSS types of gastrointestinal cancer. The authors
compared their model to the baseline, transfer learning models, and existing literature.
Their model achieved the highest accuracy and F1-score, at 89.81% and 91.78%,
respectively. The authors effectively demonstrated the modified ResNet model's
usefulness in classifying MSI and MSS gastrointestinal cancer. They also noted some
potential improvements and limitations of the model. In the study [13], the authors used
Naïve Bayes classification and radiomics feature selection and obtained an AUC of 0.598
for the clinical model. The AUC for the Radiomics model was 0.688, while the AUC for
the combined (Radiomics plus clinical) model was 0.752. The authors of [14] created a
deep learning model called MSINet, which was designed to predict MSI status using
H&E-stained WSIs from colorectal cancer patients. They trained the model on 100 WSIs
from Stanford University Medical Center and validated it on 15 WSIs from the same
source and 484 WSIs from The Cancer Genome Atlas. The model achieved high
AUROC, NPV, sensitivity, and specificity on both datasets and outperformed five
gastrointestinal pathologists on a reader experiment. They demonstrated the feasibility of
using deep learning to detect MSI from histology images of colorectal cancer. In [15], the
authors employed deep learning methods, specifically convolutional neural networks and
transfer learning, to classify microsatellite instability in colorectal cancer using
hematoxylin and eosin-stained histopathological images. They trained the VGG16 model
on 150,000 images from Kaggle and tested it on 20% of the data. They achieved an
accuracy of 89.4%, a precision of 92.9%, a sensitivity of 85.3%, and an AUC of 89.4%
with the proposed model. They suggested that their model can assist pathologists in



computer-aided diagnosis in the clinical setting. These papers demonstrate the
advancements and efficacy of various deep-learning architectures for classifying MSI and
MSS gastrointestinal cancer. They emphasize the relevance of architectural choices in
boosting classification accuracy and efficiency, such as transfer learning, feature fusion,
attention methods, and lightweight models. These findings give helpful insights and lay
the groundwork for future research into constructing optimized deep-learning
architectures for classifying MSI and MSS gastrointestinal cancer.

2.3 Comparative Analysis


In the area of classifying gastrointestinal cancer, several studies have been conducted to
evaluate different models and techniques. In this comparative analysis, I examined three
prominent studies: one by Khan and Loganathan [11], another by Sai Venkatesh et al.
[12], and my own study. Khan and Loganathan [11] employed transfer learning
techniques with the Xception network as their model. They achieved an accuracy of
approximately 90.17% and an AUC of 0.932. While their approach yielded promising
results, it is important to note that they used a different dataset and focused on a broader
classification task. In the research conducted by Sai Venkatesh et al. [12], a modified
ResNet model was utilized for the classification of MSI and MSS. With a dataset of
192,000 histological images, they achieved an accuracy of 89.81%. Furthermore, they
reported F1-scores of 91.78% and notable TP and TN values of 6,338 and 10,936,
respectively. Their work demonstrated the effectiveness of the modified ResNet model in
this specific context. My own study aimed to further contribute to the field by utilizing
the EfficientNetB2 model with pre-trained weights from "imagenet." With a dataset of
10,600 images of MSI and MSS, I have successfully achieved an impressive accuracy of
98.29% and AUC of 99.80%. The classification report indicates high precision, recall,
and F1 scores for both MSIMUT and MSS classes. Comparing these studies, it is evident
that my approach using the EfficientNetB2 model yielded superior results in accuracy
and AUC. This can be attributed to the powerful representation learning capabilities of
the EfficientNet architecture and the utilization of pre-trained weights from "imagenet."
Additionally, my study had the advantage of a focused dataset consisting of 100,600
images specifically related to MSI and MSS classification.



Overall, these findings highlight the advancements made in gastrointestinal cancer
classification. My research adds to the current understanding and highlights the potential
of the EfficientNetB2 model for precise classification. This could have important
consequences for clinical diagnoses and treatment choices.
2.4 Scope of the Problem
Correctly identifying MSI and MSS gastrointestinal cancers is a crucial task for
diagnosing and treating colorectal and colon cancer patients. MSI and MSS are molecular
biomarkers that indicate whether any defects are present in the DNA mismatch repair
system, which affects the response to immunotherapy and chemotherapy [16]. However,
the current methods for detecting MSI and MSS are based on additional genetic or
methods for detecting MSI and MSS are based on additional genetic or
immunohistochemical tests, which are time-consuming, costly, and not universally
available. Therefore, generating other options and methods to classify MSI and MSS
directly from histological images routinely obtained from biopsy samples is essential.
Deep learning is a promising technique for learning complex patterns and features from
histological images and provides accurate and efficient classification results [4].
However, there are still many challenges and opportunities for applying deep learning to
this task, such as data availability, data quality, model interpretability, model
generalization, and clinical integration. This thesis aims to review the existing literature
on deep learning methods for classifying MSI and MSS gastrointestinal cancer, compare
their performance and limitations, and propose novel methods to overcome some
challenges and improve classification accuracy and efficiency.

2.5 Challenges
This study encountered some research challenges that are described below:
a) Data Collection: One of this study's major challenges was obtaining sufficient
histological images for classifying MSI and MSS gastrointestinal cancer. In my country,
collecting this colon cancer data from any medical center was very hard, as they either
did not have the data or did not want to share it for research purposes. Therefore, I had to
look for alternative sources to gather the needed data. Online platforms such as Kaggle
were very helpful in providing the histological images for this research project.
Despite my difficulties and limitations, these online sources offered a rich source of colon



cancer data, allowing me to conduct the study and contribute to the deep learning-based
classification of colon cancer.

b) Data Quality: Another challenge of this study was ensuring the quality of the
collected data for classification. Depending on the source, scanner, and stain used,
histological images may vary in quality, resolution, format, and annotation. Some images
may be corrupted, incomplete, or mislabeled due to human or technical errors. These
issues may affect the performance and reliability of the deep learning model. Therefore, I
had to properly preprocess the images for classification, which involved converting the
images to a common format and size, removing noise and artifacts, enhancing contrast
and brightness, and verifying labels.
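To make these preprocessing steps concrete, the minimal NumPy sketch below center-crops a tile to a common size and applies a simple min–max contrast stretch. It is my own illustration, not the exact pipeline used in this thesis; the function names and the 224×224 target size are assumptions.

```python
import numpy as np

def center_crop(img, size=224):
    """Crop the central size x size region of an H x W x C image."""
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

def stretch_contrast(img):
    """Min-max normalize pixel intensities to the [0, 1] range."""
    img = img.astype(np.float32)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8)

# Example: a synthetic 300 x 300 RGB "histology tile"
tile = np.random.randint(0, 256, (300, 300, 3), dtype=np.uint8)
patch = stretch_contrast(center_crop(tile, 224))
print(patch.shape)  # (224, 224, 3)
```

In a real pipeline the same two functions would be applied to every image before it is fed to the network, so that all inputs share one size and intensity range.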
c) Select Deep Learning Approach: This research aimed to establish the most favorable
option for classifying MSI and MSS gastrointestinal cancer using histological images.
Deep learning is a highly effective technique for learning complex patterns and features
from histological images, and it provides accurate and efficient classification results
[4]. However, there are many different deep learning techniques that have been proposed
for various medical image analysis tasks, such as convolutional neural networks (CNNs),
recurrent neural networks (RNNs), generative adversarial networks (GANs), and
transformers. Each technique has advantages and disadvantages regarding accuracy,
efficiency, interpretability, and generalization. Therefore, I had to compare different deep
learning techniques and select the one that best suited the task and the data.
d) Accuracy Improvement: A final challenge of this study was to improve the
overall performance of the chosen deep learning model and select the best model for the
task. I improved the model’s performance by adjusting hyperparameters like learning
rate, batch size, number of layers and filters, and activation function. I have also used
data augmentation techniques like rotation, flipping, cropping, and scaling and
regularization techniques like dropout, batch normalization, and weight decay.
Additionally, I incorporated domain knowledge such as clinical features or molecular
markers. Choosing the appropriate model is of utmost importance for a given task. I
confidently evaluated the model’s performance on different metrics, including accuracy,
precision, recall, F1-score, and AUC. I conducted statistical tests or used confidence



intervals to compare the model with existing methods or baselines. I utilized visualization
and explanation techniques to gain deeper insights into the model.
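The evaluation metrics named above (accuracy, precision, recall, F1-score) can all be derived from the binary confusion matrix. The sketch below shows the standard formulas in plain Python; the TP/FP/FN/TN counts are invented for illustration and are not results from this study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Illustrative counts only (not from this thesis)
m = binary_metrics(tp=950, fp=30, fn=20, tn=1000)
print({k: round(v, 4) for k, v in m.items()})
# {'accuracy': 0.975, 'precision': 0.9694, 'recall': 0.9794, 'f1': 0.9744}
```

The same formulas underlie the per-class figures reported in the classification reports of Chapter 4; AUC, in contrast, is computed from the ranking of predicted probabilities rather than from a single confusion matrix.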

CHAPTER 3
RESEARCH METHODOLOGY

3.1 Introduction
Colon cancer is a widespread and fatal illness that causes significant rates of illness and
death worldwide. Properly identifying the subtypes of colon cancer is critical for
determining the most effective treatment options and predicting patient outcomes.
Histological analysis has recently become a valuable tool for characterizing colon tumors
and determining their microsatellite instability (MSI) status, a molecular characteristic
associated with distinct biological behaviors and clinical implications in colon cancer.
However, identifying MSI status through traditional tissue examination can be difficult
and subjective, highlighting the need for more objective and efficient methods. In my
research, I propose a new approach that uses images of tissue samples to differentiate
between MSI and microsatellite-stable (MSS) tumors in colon cancer. By harnessing the
power of deep learning and image analysis techniques, I aim to develop a robust and
accurate classification model to assist in the diagnosis and treatment of colon cancer
patients. My research process consists of four main stages: data collection, image
preparation, model selection, and result analysis. In the data collection stage, I obtain
tissue sample images from a well-curated repository and carefully select subsets of colon
cancer patients with known MSI and MSS tumors. These images serve as the foundation
for subsequent analysis. In the image preparation stage, I apply various techniques such
as resizing, color normalization, data augmentation, and noise reduction to standardize
the images and improve their quality. Next, in the model selection stage, I explored
different deep-learning models to find the most suitable one for my classification task. I
aim to use transfer learning, a powerful technique that allows me to adapt pre-trained
models to my specific problem domain. By leveraging the learned representations from
large-scale datasets, I hope to achieve higher accuracy and efficiency in my classification
model. Finally, in the result analysis stage, I evaluate my model’s performance on test
images and compare it with existing methods. This analysis will provide valuable insights
into the effectiveness of my approach and its potential for clinical application. By
combining advanced image analysis techniques with deep learning, I anticipate that my
proposed approach will provide a reliable and objective means of distinguishing between
MSI and MSS tumors in colon cancer. This can have significant implications for
personalized treatment decisions, prognostic assessments, and overall patient care.
Furthermore, my research contributes to the growing field of digital pathology by
demonstrating the potential of histological image analysis in improving cancer diagnosis
and management. In the following sections, I provide detailed descriptions of each
stage of my research process, including dataset preparation, image preparation steps,
CNN transfer learning development, and transfer learning model selection. Through this
comprehensive approach, I aim to advance the field of colon cancer classification and
contribute to ongoing efforts in precision medicine.

3.2 Research Subject and Instrumentation


In this study, I offer a novel approach that uses histological images to distinguish
between MSI and MSS in colon cancer. The method involves four main stages:
i) Data collection,
ii) Image pre-processing,
iii) Model selection,
iv) Result analysis.
The research subject of this study focuses on utilizing histological images to distinguish
between microsatellite instability (MSI) and microsatellite-stable (MSS) tumors in colon
cancer. The aim is to develop an objective and accurate classification model that can aid
in the diagnosis and management of colon cancer patients.
Colon cancer is a complex disease with diverse molecular subtypes and clinical
behaviors. Identifying the MSI status of colon tumors is of significant importance, as it
can provide valuable information for treatment decision-making and prognosis
prediction. However, traditional histopathological examination for MSI determination
can be subjective and challenging. Therefore, leveraging advanced image analysis
techniques and deep learning algorithms to analyze histological images offers a
promising approach for improving the accuracy and efficiency of MSI classification in
colon cancer. To accomplish the research objectives, the study employs a combination of
digital pathology, image analysis, and deep learning techniques. The following
instrumentation components are utilized:

Histological Images: The primary data source for this study is a collection of
histological images obtained from the Zenodo repository. These images represent colon
cancer patients with known MSI and MSS tumors. The images are preprocessed and
labeled according to the microsatellite status of the tumors.
Image Pre-processing Tools: Image pre-processing plays a crucial role in enhancing the
quality and standardizing the input data for analysis. Various tools and techniques are
employed for tasks such as resizing the images to a standardized dimension, color
normalization to remove variations, data augmentation for increasing dataset diversity,
and noise reduction to improve image clarity.
Convolutional Neural Networks (CNNs): CNNs are a class of deep learning models
specifically designed for image analysis tasks. These networks consist of multiple layers,
including convolutional, pooling, normalization, and fully connected layers. CNNs excel
at learning hierarchical representations and extracting features from images. In this study,
CNNs are trained and fine-tuned using transfer learning, where pre-trained models are
adapted to the specific colon cancer classification task.
Transfer Learning Models: To determine the most effective transfer learning model for
the classification task, several pre-trained models are evaluated. The models considered
include VGG16, VGG19, MobileNet, MobileNetV2, InceptionV3, ResNet50,
DenseNet201, and EfficientNetB2. Each model has been previously trained on large-
scale datasets and has demonstrated high accuracy in image recognition tasks.
Evaluation Metrics: To assess the performance of the classification model, standard
evaluation metrics such as accuracy, precision, recall, and F1 score are employed. These
metrics provide quantitative measures of the model's ability to correctly classify MSI and
MSS tumors based on the histological images.

By leveraging these instrumentation components, the study aims to develop a reliable and
objective classification model for distinguishing between MSI and MSS tumors in colon
cancer. The combination of digital pathology, image analysis, and deep learning
techniques offers a powerful approach to address the challenges associated with
traditional histopathological examination and improve the accuracy of colon cancer
classification. The data collection stage consists of obtaining the histological images from
the Zenodo repository and selecting the subsets of colon cancer patients with MSS and
MSI tumors. The image pre-processing stage includes resizing, color normalization, and
data augmentation of the images. The model selection stage involves choosing a suitable
deep-learning model for the classification task and training it on the pre-processed
images. The result analysis stage comprises evaluating the performance of the model on
the test images and comparing it with other methods. Figure 3.1 illustrates the overview
of the working process from the data collection to the result analysis. I provide more
details about each stage in the following sections.

Figure 3.1: An overview of the entire classification process

3.3 Data Collection Procedure


In this study, I have used histological images of colon cancer patients from the TCGA
cohort as my MSI vs. MSS classification dataset. The original images were obtained from
the Zenodo repository [27], where they were preprocessed and labeled according to the
microsatellite status of the tumors. The preprocessing steps included automatic tumor
detection, resizing to 224 x 224 pixels at a resolution of 0.5 µm/px, color normalization
with the Macenko method [28], and randomization of patients to training and testing sets.
The dataset contained 411,890 unique image patches derived from formalin-fixed
paraffin-embedded (FFPE) diagnostic slides of colorectal and colon cancer patients. I
have selected two subsets of the dataset for my study: STAD_TRAIN_MSS and
STAD_TRAIN_MSIMUT, which contained training images for colon cancer patients
with MSS and MSI tumors, respectively. Each subgroup had 50,285 image patches,
resulting in a balanced dataset of 100,570 images. I divided the images into three sets:
80% for training, 10% for validation, and 10% for testing.
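The 80:10:10 split can be sketched as follows; `split_dataset` is a hypothetical helper for illustration, not code from the study:

```python
import random

def split_dataset(items, train=0.8, val=0.1, seed=42):
    """Shuffle items and split them into train/validation/test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# The balanced dataset contains 100,570 image patches (50,285 per class).
train_set, val_set, test_set = split_dataset(range(100_570))
print(len(train_set), len(val_set), len(test_set))  # 80456 10057 10057
```

In practice the split would be done over image file paths (or patient IDs, to avoid patient-level leakage between sets).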

Figure 3.3: Dataset Ratio

3.4 Image Pre-processing


This section outlines the image pre-processing steps to prepare the dataset for training
and evaluation. The dataset used for this research was balanced, eliminating the need for
explicit class-balancing techniques. The pre-processing steps involved in preparing the
images are as follows:
Image Resizing: The original images in the dataset were resized to a standardized
dimension of [224x224]. Resizing the images ensured uniformity in the input data and
alignment with the expected input size of the machine-learning model.
Image Normalization: Each image's pixel values were adjusted to a range of 0 to 1 for
normalization. This normalization step facilitates training convergence and prevents bias
toward specific pixel intensity ranges. It also improves the stability of the learning
process.
Data Augmentation: Data augmentation techniques were employed to enhance the
model's robustness and generalization capabilities. Random rotations, horizontal flips,
and zooming were applied to the images, artificially expanding the dataset and providing
the model with more diverse training examples.
Noise Reduction: A noise reduction filter was applied to minimize the impact of noise
and artifacts in the images. This filtering process enhanced the clarity of the images and
improved the model's ability to extract relevant features.
The pre-processing steps outlined above aimed to standardize the input data, enhance the
model's feature learning capabilities, and improve its generalization performance.
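The resizing, normalization, and augmentation steps can be illustrated with a minimal NumPy sketch; the study itself would use library routines, and these helper names are hypothetical:

```python
import numpy as np

def resize_nearest(image, size=(224, 224)):
    """Nearest-neighbour resize to a fixed input dimension (a simple stand-in
    for the library resizing routine used in practice)."""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return image[rows][:, cols]

def normalize(image):
    """Scale uint8 pixel values into the [0, 1] range."""
    return image.astype(np.float32) / 255.0

def augment(image, rng):
    """Apply a random horizontal flip and a random 90-degree rotation."""
    if rng.random() < 0.5:
        image = np.fliplr(image)
    return np.rot90(image, k=int(rng.integers(0, 4)))

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)
out = augment(normalize(resize_nearest(patch)), rng)
```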

Figure 3.4: Some images of MSS and MSI

3.5 CNN Transfer Learning


Convolutional Neural Networks (CNNs) are a subset of neural networks that can
recognize objects in images. They have become very popular in recent years due to
their impressive performance [17] [18]. A typical CNN architecture consists of various
layers, such as convolution, pooling, normalization, and fully connected layers. The
network is built sequentially by stacking convolution, pooling, and normalization layers.
These layers create high-level features from the images that are then used for
classification. The classification process is carried out in the fully connected layer, which
utilizes the features extracted from the preceding layers. Many parameters in the CNN
architecture need to be tuned during training. The standard backpropagation method is
often used when training a convolutional neural network (CNN). Adding more layers
allows for more complex models.
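The stacked convolution / pooling / normalization pattern described above can be sketched in Keras as a minimal illustration; the layer counts and sizes here are arbitrary choices for the sketch, not the architecture used in this study:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Convolution -> normalization -> pooling blocks extract high-level features;
# the fully connected layers at the end perform the classification (MSI vs. MSS).
model = tf.keras.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```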

Figure 3.5: The standard CNN model architecture

3.6 Selection of Transfer Learning Models


Transfer learning is a deep learning technique that uses a model trained for a specific task
to improve the performance of a related task [19]. Transfer learning is often used when
the new data is smaller than the original data that was used to train the pre-trained model
This study compares eight transfer learning models that achieve high accuracy and finds
that EfficientNetB2 is the best among them. The eight pre-trained models are VGG16,
VGG19, MobileNet, MobileNetV2, InceptionV3, ResNet50, DenseNet201 and
EfficientNetB2. They are trained on the training data and evaluated on the testing data. A
brief description of these models is given below:

3.6.1 MobileNet
MobileNet is a class of efficient models designed for mobile and embedded vision
applications. These models use depth-wise separable convolutions to create lightweight
deep neural networks with a streamlined architecture. The MobileNet model comes in
different sizes. The standard version has 4.2 million parameters, while smaller versions
have 1.32 million. The MobileNet model comprises 27 convolutional layers that consist
of 13 depth-wise convolutional layers, one average pool layer, one fully connected layer,
and one softmax layer [21].

3.6.2 MobileNetV2

MobileNetV2 is a recently developed mobile architecture that enhances the performance
of mobile models on various tasks and benchmarks, regardless of the model size. The
design of MobileNetV2 uses a structure called inverted residual, which involves using
thin bottleneck layers for the input and output of the residual block. This contradicts
conventional residual models that utilize expanded representations in the input.
MobileNetV2 utilizes lightweight depth-wise convolutions in the intermediate expansion
layer to filter features effectively [22].

3.6.3 VGG16
VGG16 is a deep learning model for image recognition developed in 2014 by the
University of Oxford's K. Simonyan and A. Zisserman. It has 16 layers of convolutions
[23] that can learn to recognize and categorize various objects in an image. It can also
generate captions for images, detect and segment objects, and classify images. It can also
transfer its learned features to other neural networks for different tasks. VGG16 is widely
regarded as a highly effective model for image recognition and has demonstrated
impressive performance on the ImageNet challenge, achieving a remarkably low error
rate of 7.3%.

3.6.4 VGG19
VGG19 is a deep-learning model for image recognition and classification created by
Karen Simonyan and Andrew Zisserman in 2014. It belongs to the Visual Geometry
Group (VGG) network family and is the 19th model in the series. VGG19 has 19 layers
[24] that can handle various computer vision tasks. VGG19 has a simple but effective
structure that consists of five blocks of convolutions and three layers of fully connected
neurons. The blocks of convolutions have multiple layers of convolutions with non-linear
activations, pooling layers, and batch normalization layers. A max-pooling layer with a
stride of 2 is placed after each set of convolutions. The fully connected layers have 4096,
4096, and 1000 neurons each. The result produced by the VGG19 model is a vector with
a thousand dimensions that predicts the class of an image.

3.6.5 InceptionV3

InceptionV3 is a new design of the Inception network that aims to reduce the
computational power required by previous Inception models. It achieves this using
regularization, dimension reduction, convolution factorization, and parallel computation
techniques. InceptionV3 introduced significant improvements over earlier Inception
models, such as label smoothing and factorized 7x7 convolutional layers [25]. It also uses
an auxiliary classifier to propagate label information across the network.

3.6.6 ResNet50
ResNet50 is a powerful and widely recognized convolutional neural network (CNN)
architecture that has made significant contributions to the field of computer vision.
Introduced by Microsoft Research, ResNet50 is known for its deep structure, enabling it
to effectively learn complex representations from images. What sets ResNet50 apart is its
use of residual connections, also known as skip connections, which alleviate the
vanishing gradient problem. By introducing these connections, the network can
efficiently propagate information from earlier layers to later layers, allowing for the
successful training of very deep models. ResNet50 consists of 50 layers, including
convolutional layers, pooling layers, fully connected layers, and shortcut connections.
The core building blocks of ResNet50 are residual blocks, which contain multiple
convolutional layers. These blocks enable the network to learn and refine increasingly
abstract features as the information passes through the layers. The skip connections in
ResNet50 enable the network to learn residual mappings, allowing for easier optimization
and improved gradient flow during training. This architectural innovation has been
instrumental in training deeper neural networks more effectively and has contributed to
breakthroughs in various computer vision tasks such as image classification, object
detection, and semantic segmentation. ResNet50's remarkable performance and accuracy
have been demonstrated in competitions such as the ImageNet challenge, where it has
achieved state-of-the-art results. Due to its strong performance and robustness, ResNet50
has become a popular choice for image recognition tasks and serves as a foundation for
many subsequent CNN architectures.

3.6.7 DenseNet201

DenseNet201 is a deep-learning image recognition model consisting of a series of dense
blocks and transition layers. A dense block has several convolutional layers and connects
to a transition layer that reduces the output size. The output of a dense block goes to the
next dense block. This structure helps the model learn more complex features and
patterns. DenseNet201 has some benefits over other image recognition models, such as
ResNet and InceptionNet. It has fewer parameters, which makes it more efficient and
easier to train. It also has a faster inference time and is less likely to overfit.

3.6.8 EfficientNetB2
EfficientNetB2 is a convolutional neural network that was designed specifically to
achieve high accuracy and efficiency for image recognition and classification tasks. It is
part of the EfficientNet family of models developed using neural architecture search and
scaling techniques [26]. EfficientNetB2 has 9 blocks of convolutions and 3 layers of fully
connected neurons. The blocks of convolutions consist of multiple layers of depth-wise
and pointwise convolutions with non-linear activations, squeeze-and-excitation layers,
and batch normalization layers. A dropout layer and a max-pooling layer with a stride of 2
follow each block of convolutions. The fully connected layers have 1408, 1408, and 1000
neurons, respectively. The EfficientNetB2 model generates a 1000-dimensional vector that predicts the
class of images. EfficientNetB2 has fewer parameters and a faster training speed than
previous models, such as VGG19 and InceptionV3 [26].

3.7 EfficientNetB2 Architecture


In this study, I have used EfficientNetB2 as the base model for my image classification
task. EfficientNetB2 is a convolutional neural network that uses efficient building blocks
and scaling techniques to achieve high accuracy and efficiency on image recognition and
classification tasks [26]. It consists of a stem convolutional layer, 23 inverted residual
blocks with squeeze-and-excitation modules, and a final convolutional layer. The
inverted residual blocks use depth-wise separable convolutions, which decreases
parameters and computational costs compared to standard convolutions. The squeeze-
and-excitation modules use global average pooling and two fully connected layers to
recalibrate channel-wise feature responses adaptively. The compound scaling method
scales the network width, depth, and resolution uniformly with a fixed ratio, balancing
network capacity and efficiency. EfficientNetB2 has 9 million parameters and achieves
80.3% top-1 accuracy on ImageNet [27]. I have fine-tuned the base model by adding
some custom layers on top of it. The input layer takes images of shapes (224, 224, 3) and
passes them to the base model. The base model does not include the top classification
layer but instead uses max pooling to reduce the feature map size to (1, 1, 1408). The
output of the base model is fed to a batch normalization layer, which normalizes the
activations and improves the stability and speed of training. The batch normalization layer
is followed by a dense layer with 256 units and ReLU activation; this layer acts as a
hidden layer that learns non-linear combinations of the features extracted by the base
model. In order to avoid overfitting and improve generalization, the dense layer uses the
L1 and L2 regularization techniques. The dense layer is followed by a dropout layer with
a rate of 0.45, which randomly sets some of the units to zero during training. This layer
also helps to prevent overfitting and improve generalization by reducing the co-
adaptation of units. Another dense layer with class_count units and softmax activation
follows the dropout layer. This layer acts as the output layer that predicts the probability
of each class for the input image. The model is compiled with an Adamax optimizer with
a learning rate of 0.001, categorical cross-entropy loss function, and several metrics such
as accuracy, AUC, true positives, false positives, true negatives, precision, and recall.
These metrics help evaluate the model's performance on different aspects of the
classification task.
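Under the description above, the fine-tuned model can be sketched in Keras as follows. The L1/L2 regularization strengths are illustrative, since the text does not specify them, and `weights=None` stands in for the ImageNet weights only to keep the sketch self-contained:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

class_count = 2  # MSIMUT and MSS

# The study used pre-trained ImageNet weights; weights=None avoids a download.
base = tf.keras.applications.EfficientNetB2(
    include_top=False, weights=None,
    input_shape=(224, 224, 3), pooling="max")  # yields a 1408-d feature vector

model = tf.keras.Sequential([
    base,
    layers.BatchNormalization(),
    layers.Dense(256, activation="relu",
                 # Regularization strengths are assumed, not given in the text.
                 kernel_regularizer=regularizers.l1_l2(l1=1e-4, l2=1e-4)),
    layers.Dropout(0.45),
    layers.Dense(class_count, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adamax(learning_rate=0.001),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The full metric list used in the study (AUC, true/false positives and negatives, precision, recall) can be added to `metrics=` via the corresponding `tf.keras.metrics` classes.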

3.8 Training and Testing


For training and testing purposes, I have split the dataset 80:10:10: around 80% of the
images were used for training the model, 10% were used for validating the model, and the
remaining 10% were used for testing the model. All of the models are trained using a
transfer learning approach, with categorical cross-entropy as the loss function, given in
equation (1):

$L_{CE} = -\sum_{i=1}^{n} t_i \log(p_i)$    (1)

The learning rate was set at 0.001 with the Adam optimizer, and softmax was used as the
activation function of the output layer for all the architectures, as shown in equation (2):

$f_i = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}$    (2)
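A small numerical check of the softmax and cross-entropy formulas, using made-up logits:

```python
import numpy as np

def softmax(z):
    """Softmax activation: f_i = exp(z_i) / sum_j exp(z_j)."""
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(t, p):
    """Categorical cross-entropy: L_CE = -sum_i t_i * log(p_i)."""
    return float(-np.sum(t * np.log(p)))

logits = np.array([2.0, 0.5])                  # made-up scores for (MSIMUT, MSS)
p = softmax(logits)                            # approx. [0.818, 0.182]
loss = cross_entropy(np.array([1.0, 0.0]), p)  # approx. 0.201 for true class MSIMUT
```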

CHAPTER 4
EXPERIMENTAL RESULTS AND DISCUSSION

4.1 Introduction
Colon cancer is a significant health concern worldwide, accounting for a considerable
number of cancer-related deaths. Early and accurate detection of colon cancer plays a
crucial role in improving patient outcomes and enabling timely interventions. With
advancements in machine learning and computer vision techniques, there has been
growing interest in utilizing these technologies to develop efficient and accurate models
for colon cancer detection. This study focuses on the application of Convolutional Neural
Networks (CNNs) and transfer learning (TL) models for colon cancer classification and
detection. CNNs have shown great promise in image classification tasks, including
medical image analysis. Transfer learning, on the other hand, allows leveraging pre-
trained models on large datasets to achieve improved performance and faster
convergence on specific tasks with limited training data. The primary objective of this
study is to evaluate the performance of eight CNN-based transfer learning models,
namely VGG16, VGG19, MobileNet, MobileNetV2, InceptionV3, EfficientNetB2,
ResNet50, and DenseNet201, in accurately classifying colon cancer. A dataset consisting
of 100,570 preprocessed colon cancer images is utilized for training and evaluation
purposes. Various performance metrics, such as accuracy, precision, recall, and F1 score,
are employed to assess the validity and effectiveness of these models. To ensure efficient
processing and computation, the experiments are conducted on the Kaggle platform
utilizing a dedicated GPU. The models undergo training for 10-12 epochs, employing a
custom callback function called LRA, which dynamically adjusts the learning rate based
on training accuracy and validation loss. This approach enhances training efficiency and
overall model performance. By developing and evaluating these CNN-based transfer
learning models, this study aims to contribute to the advancement of colon cancer
detection, enabling more accurate and timely diagnoses. The findings have the potential
to support clinical decision-making, improve patient outcomes, and facilitate
personalized treatment strategies. The subsequent sections will present the results and
discussions on the performance evaluation of each model, including the analysis of loss,
accuracy, and confusion matrices. These evaluations provide insights into the strengths
and limitations of the proposed models and highlight their effectiveness in colon cancer
classification and detection.

4.2 Experimental Results
The confusion matrix has been used to evaluate the study's success. In the confusion
matrix, there are four parameters: true positive (TP), true negative (TN), false positive
(FP), and false negative (FN). TP represents cases where the model correctly classifies
colon cancer. TN indicates the cases where the model correctly identifies non-colon
cancer. FP represents cases where the model incorrectly classifies non-colon cancer as
colon cancer. FN denotes cases where the model incorrectly classifies colon cancer as
non-colon cancer. Several metrics can be used to measure the validity of the model,
including accuracy, specificity, recall, precision, and f1-score. The formulas for these
metrics are:

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$    (3)

$\text{Precision} = \frac{TP}{TP + FP}$    (4)

$\text{Recall} = \frac{TP}{TP + FN}$    (5)

$\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$    (6)
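Equations (3) to (6) translate directly into code; the counts below are made-up values for illustration, not the study's results:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 score as in equations (3)-(6)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Made-up confusion-matrix counts:
acc, prec, rec, f1 = classification_metrics(tp=90, tn=85, fp=15, fn=10)
print(round(acc, 3), round(prec, 3), round(rec, 3), round(f1, 3))  # 0.875 0.857 0.9 0.878
```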

This study's proposed model can classify and detect colon cancer based on CNN, which
is a transfer learning (TL) model. A dataset comprising 100,570 preprocessed colon
cancer images is utilized for classification purposes. Eight CNN transfer learning models,
namely VGG16, VGG19, MobileNet, MobileNetV2, InceptionV3, EfficientNetB2,
ResNet50, and DenseNet201, are evaluated and compared in terms of their
performance in accuracy, completion time, and data loss. To ensure efficient
processing and computation, all experiments are conducted on the Kaggle platform using
a dedicated GPU. The models are trained for 10-12 epochs, employing a custom callback
function called LRA, which dynamically adjusts the learning rate during training based
on the training accuracy and validation loss. The implementation of this callback function
provides significant advantages, including more efficient training and improved
performance. Through the development and evaluation of these CNN-based transfer
learning models, this study aims to contribute to the advancement of colon cancer
detection, enabling more accurate and timely diagnoses, which in turn can improve
patient outcomes and support clinical decision-making. The training and validation loss
and accuracy curves, together with the confusion matrix for each model, are given below
to visualize each model's performance.
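The exact implementation of the LRA callback is not given in the text; a hypothetical reconstruction of the behavior described (adjusting the learning rate when validation loss stops improving) might look like this:

```python
import tensorflow as tf

class LRA(tf.keras.callbacks.Callback):
    """Hypothetical learning-rate-adjusting callback: halve the learning rate
    whenever validation loss fails to improve for `patience` epochs."""

    def __init__(self, factor=0.5, patience=1):
        super().__init__()
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.wait = 0

    def on_epoch_end(self, epoch, logs=None):
        val_loss = (logs or {}).get("val_loss")
        if val_loss is None:
            return
        if val_loss < self.best:
            self.best, self.wait = val_loss, 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                lr = float(self.model.optimizer.learning_rate)
                self.model.optimizer.learning_rate = lr * self.factor
                self.wait = 0
```

Passed to `model.fit(..., callbacks=[LRA()])`, this mimics the described behavior; the study's actual callback also consults training accuracy.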

4.2.1 Performance Evaluation of EfficientNetB2


The following figure shows how the loss and accuracy of the model change on the
training and validation sets.

Figure 4.2.1.1: EfficientNetB2 Model Performance


This figure is a graphical representation of the performance of the EfficientNetB2 Model
on histological images. The model was trained on a subset of the data, called the training
set, and then evaluated on another subset of the data, called the validation set. The
validation set is used to check how well the model generalizes to new and unseen data.
The left plot shows how the loss of the model changes over time. The loss is a measure of
how much the model’s predictions differ from the actual outcomes. A lower loss means a
better fit. The red line shows the loss on the training set, and the green line shows the loss
on the validation set. Ideally, both lines should decrease as the model learns from the
data, and converge to a low value. This graph shows that both lines are decreasing, which
indicates a positive aspect. The right plot shows how the accuracy of the model changes
over time. The accuracy is a measure of how often the model’s predictions are correct. A
higher accuracy means a better performance. The red line shows the accuracy of the
training set, and the green line shows the accuracy of the validation set. Ideally, both lines
should increase as the model learns from the data, and converge to a high value. Here
both lines are increasing which indicates good performance. The x-axis of both plots is
labeled epochs. An epoch is one complete cycle of passing all the training data through
the model. The more epochs, the more the model learns from the data. However, too
many epochs can also lead to overfitting, which is when the model memorizes the
training data and fails to generalize to new data. I have run 12 epochs for training this
model. The blue dots on both plots indicate the best epoch for the model. This is the
epoch where the validation loss is the lowest and the validation accuracy is the highest.
This means that this is the point where the model has learned enough from the data
without overfitting or underfitting. The best epoch for this model is 12.

Figure 4.2.1.2: Confusion matrix of the EfficientNetB2 model


This figure shows the confusion matrix for the EfficientNetB2 model. A confusion matrix
shows how many times the model correctly or incorrectly predicted each category of the
data, compared to the actual labels. The two categories on both axes are “MSIMUT” and
“MSS”, which are two subtypes of colon cancer. The blue quadrants show the correct
predictions of the model, also known as true positives and true negatives. The top left
quadrant has the value “4915”, which means that the model correctly predicted 4915
samples as MSIMUT (true positives). The bottom right quadrant has the value “4957”,
which means that the model correctly predicted 4957 samples as MSS (true negatives).
The white quadrants show the incorrect predictions of the model, the false positives and
false negatives. The top right quadrant has the value “114”, which means that the model
incorrectly predicted 114 samples as MSS when they were actually MSIMUT (false
negatives, taking MSIMUT as the positive class). The bottom left quadrant has the value
“71”, which means that the model incorrectly predicted 71 samples as MSIMUT when
they were actually MSS (false positives). The accuracy of the model can be calculated by
dividing the sum of correct predictions (true positives and true negatives) by the total
number of predictions. In this case, the accuracy is (4915 + 4957) / (4915 + 4957 + 114 +
71) ≈ 0.982, which means that the model is about 98.2% accurate on this data set.

4.2.2 Performance evaluation of InceptionV3


The following figure shows how the loss and accuracy of the model change on the
training and validation sets.

Figure 4.2.2.1: InceptionV3 Model Performance


This figure illustrates the performance of the model as it is trained over multiple epochs. As
the number of epochs increases, the validation loss decreases, indicating that the
model is improving its ability to generalize to new data. Additionally, both the training
and validation accuracy increase with the number of epochs, further demonstrating the
model’s improved performance. The best epoch for this model is 10, as indicated by the
lowest validation loss and highest validation accuracy at this point. Overall, these trends
suggest that the model is performing well.

Figure 4.2.2.2: Confusion matrix of the InceptionV3 model
This figure presents the performance of the InceptionV3 model in predicting the two
categories of data: MSIMUT and MSS. The model correctly predicted 4807 samples as
MSIMUT and 4798 samples as MSS. However, it misclassified 222 MSIMUT samples as
MSS and 230 MSS samples as MSIMUT. These results indicate the model’s ability to
accurately predict both categories, with a small number of incorrect predictions.

4.2.3 Performance Evaluation of MobileNet


The following figure shows how the loss and accuracy of the model change on the
training and validation sets.

Figure 4.2.3.1: MobileNet Model Performance

This figure shows how the training and validation loss change over time. Here the
validation loss increased at epoch 4, then decreased at epoch 5, then increased slightly at
epoch 6, and then decreased steadily with the training loss. The right plot shows how the
training and validation accuracy change over time. The training accuracy increased
monotonically, but the validation accuracy fluctuated more. It increased and decreased
several times, reaching its highest value at epoch 9. This was also the best epoch for the
model, as indicated by the blue dots on both plots.

Figure 4.2.3.2: Confusion matrix of the MobileNet model


This figure illustrates the predictions of the model for two categories of data: MSIMUT
and MSS. The model correctly predicted 4782 samples as MSIMUT and 4842 samples as
MSS. However, the model incorrectly predicted 247 MSIMUT samples as MSS and 186
MSS samples as MSIMUT. The ratio of correct and incorrect predictions shows that the
model performed well, with a high proportion of correct predictions.
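The reported counts can be turned directly into the summary metrics used later in this chapter. A minimal sketch, treating MSIMUT as the positive class and using the counts from Figure 4.2.3.2:

```python
# Deriving summary metrics from the MobileNet confusion matrix reported above.
# Counts come from Figure 4.2.3.2; MSIMUT is treated as the positive class.
tp, fn = 4782, 247   # MSIMUT samples correctly / incorrectly classified
tn, fp = 4842, 186   # MSS samples correctly / incorrectly classified

accuracy  = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1        = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
# accuracy evaluates to about 0.9569, matching the 95.69% listed in Table 4.2
```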

4.2.4 Performance Evaluation of MobileNetV2


The following figure shows how the loss and accuracy of the model change on the
training and validation sets.

Figure 4.2.4.1: MobileNetV2 Model Performance
This figure illustrates the changes in the loss and accuracy of the model on the training
and validation sets. The training loss decreased steadily from the beginning to the end,
but the validation loss oscillated between decreasing and increasing. Similarly, the
training accuracy increased smoothly throughout the epochs, but the validation accuracy
varied more. It increased and decreased several times, reaching its peak at epoch 8. This
was also the best epoch for the model, as shown by the blue dots on both plots.

Figure 4.2.4.2: Confusion matrix of the MobileNetV2 model


This figure demonstrates the predictions of the model for two categories of data:
MSIMUT and MSS. The model correctly predicted 4845 samples as MSIMUT and 4772
samples as MSS. However, the model incorrectly predicted 184 MSIMUT samples as
MSS and 256 MSS samples as MSIMUT.

4.2.5 Performance Evaluation of ResNet50
The following figure shows how the loss and accuracy of the model change on the
training and validation sets.

Figure 4.2.5.1: ResNet50 Model Performance


This figure demonstrates the changes in the loss and accuracy of the model on the
training and validation sets. The training loss decreased consistently from the start to the
finish, but the validation loss fluctuated between decreasing and increasing. Likewise, the
training accuracy increased steadily throughout the epochs, but the validation accuracy
varied more. It increased and decreased several times, with large gaps between the peaks
and valleys. The highest peak was at epoch 9, which was also the best epoch for the
model, as indicated by the blue dots on both plots.

Figure 4.2.5.2: Confusion matrix of the ResNet50 model

This figure presents the predictions of the model for two categories of data: MSIMUT
and MSS. The model accurately predicted 4269 samples as MSIMUT and 4655 samples
as MSS. However, the model erroneously predicted 760 MSIMUT samples as MSS and
373 MSS samples as MSIMUT. The ratio of accurate and erroneous predictions indicates
that the model achieved moderate performance, with a noticeably higher error rate on
the MSIMUT class.

4.2.6 Performance Evaluation of VGG16 Model


The following figure shows how the loss and accuracy of the model change on the
training and validation sets.

Figure 4.2.6.1: VGG16 Model Performance


This figure illustrates the changes in the loss and accuracy of the model on the training
and validation sets. The training loss decreased steadily from the beginning to the end,
but the validation loss oscillated between decreasing and increasing. Similarly, the
training accuracy increased smoothly throughout the epochs, but the validation accuracy
varied more. It increased and decreased several times, with very large gaps between the
peaks and valleys. The highest peak was at epoch 8, which was also the best epoch for
the model, as shown by the blue dots on both plots.

Figure 4.2.6.2: Confusion matrix of the VGG16 model
This figure illustrates the predictions of the model for two categories of data: MSIMUT
and MSS. The model correctly predicted 3715 samples as MSIMUT and 4867 samples as
MSS. However, the model incorrectly predicted 1314 MSIMUT samples as MSS and 161
MSS samples as MSIMUT. The ratio of correct and incorrect predictions indicates that
the model performed poorly on the MSIMUT class, misclassifying roughly a quarter of
its samples, while performing well on the MSS class.
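The imbalance between the two classes is easiest to see as per-class recall; a small calculation from the counts in Figure 4.2.6.2:

```python
# Per-class recall from the VGG16 confusion matrix above, showing how much
# worse the model does on MSIMUT than on MSS (counts from Figure 4.2.6.2).
recall_msimut = 3715 / (3715 + 1314)   # fraction of MSIMUT samples recovered
recall_mss    = 4867 / (4867 + 161)    # fraction of MSS samples recovered
print(f"MSIMUT recall: {recall_msimut:.3f}, MSS recall: {recall_mss:.3f}")
```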

4.2.7 Performance Evaluation of VGG19 Model


The following figure shows how the loss and accuracy of the model change on the
training and validation sets.

Figure 4.2.7.1: VGG19 Model Performance

This figure shows how the training and validation loss change over time. Both losses
remained nearly constant from epoch 0 to epoch 5. Then, at epoch 6, the validation loss
increased sharply before decreasing back to its previous level, while the training loss
stayed nearly flat throughout. The right plot shows how the training and validation
accuracy change over time. The training accuracy increased steadily across all epochs,
but the validation accuracy fluctuated more, increasing and decreasing several times
before reaching its highest value at epoch 9. This was also the best epoch for the model,
as indicated by the blue dots on both plots.

Figure 4.2.7.2: Confusion matrix of the VGG19 model


This figure illustrates the predictions of the model for two categories of data: MSIMUT
and MSS. The model correctly predicted 4809 samples as MSIMUT and 4024 samples as
MSS. However, the model incorrectly predicted 220 MSIMUT samples as MSS and 1004
MSS samples as MSIMUT. The ratio of correct and incorrect predictions shows that the
model performed poorly on the MSS class, misclassifying about a fifth of its samples,
while performing well on the MSIMUT class.

4.2.8 Performance Evaluation of DenseNet201 Model


The following figure shows how the loss and accuracy of the model change on the
training and validation sets.

Figure 4.2.8.1: DenseNet201 Model Performance
This figure illustrates the changes in the training and validation loss over time. Here the
training loss was constant and did not increase or decrease for all epochs. However, the
validation loss increased significantly after epoch 6 and then decreased sharply in epoch
8. Then it increased again in the next epoch. The right plot shows the changes in the
training and validation accuracy over time. The training accuracy increased consistently.
But the validation accuracy dropped a lot on epoch 5 and then recovered on the next
epoch. Then it decreased slightly and then increased slightly. The best epoch was 8, as
shown by the blue dots on both plots.

Figure 4.2.8.2: Confusion matrix of DenseNet201 model

This figure shows the predictions of the model for two categories of data: MSIMUT and
MSS. The model correctly predicted 4868 samples as MSIMUT and 4606 samples as
MSS. However, the model incorrectly predicted 161 MSIMUT samples as MSS and 423
MSS samples as MSIMUT. The ratio of correct and incorrect predictions indicates that
the model performed well, with a high percentage of correct predictions.
By analyzing the loss and accuracy of the model on the training and validation set and the
confusion matrix, I can conclude that EfficientNetB2 outperforms all other models in
terms of predicting the microsatellite stability status of colorectal cancer patients.
Therefore, I suggest using EfficientNetB2 for future work in this domain. To facilitate a
better understanding of the performance of different models, I have presented a table that
compares the accuracy and other metrics such as recall, precision, F1-score, and AUC
values of all models. The performance metrics of all the models, including accuracy, the
area under the curve (AUC), recall, precision, and F1-score, are summarized in the
following table:
Table 4.2: Accuracy Comparison of Different Models

Model            Accuracy   AUC      Recall   Precision   F1-score

EfficientNetB2   98.16%     99.70%   98.5%    98.5%       98.0%
MobileNet        95.69%     99.19%   95.5%    95.5%       95.5%
VGG19            87.83%     94.78%   88.0%    89.0%       88.0%
VGG16            85.33%     93.41%   85.5%    87.5%       85.0%
InceptionV3      95.51%     99.19%   95.5%    95.5%       96.0%
MobileNetV2      95.62%     99.26%   95.5%    95.5%       96.0%
DenseNet201      94.19%     98.55%   94.5%    94.5%       94.0%
ResNet50         88.73%     95.22%   89.0%    89.0%       88.5%

The table compares the models in terms of accuracy, AUC, recall, precision, and
F1-score. These metrics measure how well the models can predict the microsatellite
stability status of colorectal cancer patients. Among the models, EfficientNetB2 has the
highest values for all metrics, indicating that it is the most accurate and reliable model.
MobileNet and MobileNetV2 also have high values for all metrics, suggesting that they
are strong models. VGG16 has the lowest values for all metrics, and VGG19 scores only
slightly higher, making them the least accurate models. InceptionV3 and DenseNet201
have moderate values for all metrics. ResNet50 falls below the moderate group but
scores slightly above VGG19 and VGG16 on every metric. To visually represent the
performance of the different CNN transfer learning models in terms of accuracy and
F1-score, a chart has been created. The chart illustrates the accuracy and F1-score
values obtained for each model, allowing for a clear comparison and identification of
the most accurate model. The results clearly demonstrate that EfficientNetB2
outperforms the other models, achieving an accuracy of 98.16% with a 98.0% F1-score.
This chart provides a visual confirmation of the quantitative analysis presented earlier,
reinforcing the claim that EfficientNetB2 is the most accurate model for identifying
colon cancer. The visualization serves as additional evidence to support the selection of
EfficientNetB2 as the optimal model for this study, emphasizing its potential to
significantly impact clinical practice and improve patient outcomes.
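The ranking discussed above can be reproduced directly from the table; a brief sketch using the accuracy values of Table 4.2:

```python
# Ranking the models of Table 4.2 by accuracy (values copied from the table).
results = {
    "EfficientNetB2": 98.16, "MobileNet": 95.69, "VGG19": 87.83,
    "VGG16": 85.33, "InceptionV3": 95.51, "MobileNetV2": 95.62,
    "DenseNet201": 94.19, "ResNet50": 88.73,
}
ranked = sorted(results, key=results.get, reverse=True)
print(ranked[0])   # best model by accuracy
print(ranked[-1])  # weakest model by accuracy
```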

Figure 4.2.9: Accuracy comparison of different Models
These results clearly demonstrate the superior performance of EfficientNetB2 in terms of
accuracy, recall, precision, and F1-score, making it the most suitable model for colon
cancer identification in terms of both accuracy and efficiency. By leveraging the power of
transfer learning and deep learning techniques, this research aims to provide a robust and
accurate automated solution for colon cancer classification, ultimately contributing to
improved patient care and treatment decision-making in the fight against this deadly
disease.

4.3 Descriptive Analysis


In addition to evaluating the performance of the CNN-based transfer learning models for
colon cancer detection, this study also includes a descriptive analysis of the dataset used
for training and evaluation. Understanding the characteristics and composition of the
dataset provides valuable insights into the underlying data and can help interpret the
models' performance. The dataset utilized in this study consists of 100,570 preprocessed
colon cancer images. These images are obtained from histological samples and represent
different subtypes of colon cancer, namely MSIMUT and MSS. A summary of the
performance metrics of the CNN-based transfer learning models, including accuracy,
AUC, recall, precision, and F1-score, is given in Table 4.2. Analyzing the dataset, we
find that it comprises a
substantial number of colon cancer images, providing a rich and diverse dataset for
training the models. The dataset is carefully preprocessed, ensuring the quality and
relevance of the images for the classification task. To gain a better understanding of the
dataset, it is important to examine its composition. The distribution of classes within the
dataset is crucial, as a balanced dataset with equal representation of each class is
desirable for training models to achieve optimal performance. Deviations from a balanced
distribution may introduce biases and affect the models' predictions. In this dataset, both
the MSIMUT and MSS classes are well-represented, allowing for robust training and
evaluation of the models. Additionally, exploring the properties of the images themselves
provides insights into their characteristics. Analyzing the image size distribution reveals
any variations in dimensions, which may require preprocessing or resizing to ensure
uniformity during training. Examining the color distribution helps identify potential
variations in image quality or staining techniques that may impact the models'
performance. Furthermore, it is crucial to check for potential biases or artifacts present in
the dataset. These biases could arise from the data collection process, image acquisition
techniques, or other factors that may introduce systematic errors. Detecting and
addressing such biases is essential to ensure the models' generalizability and robustness
across different datasets and settings. The descriptive analysis of the dataset provides
important insights into its composition and characteristics. The dataset consists of
100,570 preprocessed colon cancer images, representing two subtypes of colon cancer,
MSIMUT and MSS. The classes are well-balanced, enabling the models to learn and
generalize effectively. The performance metrics of the CNN-based transfer learning
models further reinforce their effectiveness in colon cancer classification. The models
exhibit high accuracy values, ranging from 85.33% to 98.16%. The AUC scores, which
measure the models' ability to distinguish between positive and negative samples, range
from 93.41% to 99.70%, indicating excellent discrimination. The recall values of the
models range from 85.5% to 98.5%, demonstrating their ability to correctly identify
colon cancer cases. Precision values ranging from 87.5% to 98.5% indicate the models'
ability to minimize false positives. The F1-scores, which provide a balanced measure of
precision and recall, range from 85.0% to 98.0%, showcasing the models' overall
performance. The dataset's composition, with a diverse range of colon cancer images, and
the models' strong performance across various metrics, contribute to the robustness and
reliability of the findings. These insights gained from the descriptive analysis enable a
better understanding of the dataset's characteristics and guide the interpretation of the
model’s performance in colon cancer classification. In summary, the descriptive analysis
of the dataset and the performance metrics of the CNN-based transfer learning models
collectively provide a comprehensive assessment of their effectiveness in accurately
classifying colon cancer. The well-balanced dataset, combined with the models' high
accuracy, AUC, recall, precision, and F1-score values, substantiates their potential for
improving colon cancer detection and supporting clinical decision-making.
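The class balance described above can be checked with a short script; the per-class counts (50,285 images each) are taken from this thesis:

```python
# Sketch: verifying the class balance described in the descriptive analysis.
# The per-class counts come from the dataset description in this thesis.
counts = {"MSIMUT": 50285, "MSS": 50285}
total = sum(counts.values())
for label, n in counts.items():
    print(f"{label}: {n} images ({n / total:.1%})")

print(total)  # should equal the 100,570 images reported above
```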

4.4 Summary

This study aimed to evaluate the performance of CNN-based transfer learning models for
colon cancer classification and detection. Eight models, including EfficientNetB2,
MobileNet, VGG19, VGG16, InceptionV3, MobileNetV2, DenseNet201, and ResNet50,
were trained and evaluated using a dataset of 100,570 preprocessed colon cancer images.
The models' performance was assessed using various metrics, including accuracy, AUC,
recall, precision, and F1-score. The results demonstrated the effectiveness of the CNN-
based transfer learning models in accurately classifying colon cancer. The models
achieved high accuracy values, ranging from 85.33% to 98.16%, indicating their ability
to correctly classify colon cancer cases. The AUC scores, which measure the models'
discrimination ability, ranged from 93.41% to 99.70%, further confirming their efficacy
in distinguishing between different subtypes of colon cancer. Moreover, the models
exhibited high recall values, ranging from 85.5% to 98.5%, indicating their ability to
correctly identify positive cases of colon cancer. Precision values ranged from 87.5% to
98.5%, demonstrating the models' ability to minimize false positives. The F1-scores,
providing a balanced measure of precision and recall, ranged from 85.0% to 98.0%,
highlighting the models' overall performance. The descriptive analysis of the dataset
revealed a well-balanced distribution of classes and provided insights into image
properties such as size and color distribution. The dataset's composition, combined with
the robust performance of the models, further reinforced the reliability and effectiveness
of the findings. The study's outcomes contribute to the advancement of colon cancer
detection by demonstrating the potential of CNN-based transfer learning models.
Accurate and timely detection of colon cancer can aid in improving patient outcomes,
supporting clinical decision-making, and facilitating personalized treatment strategies.
Further research can focus on refining the models, exploring additional transfer learning
architectures, and expanding the dataset to enhance the models' performance and
generalizability. Additionally, the models can be validated on independent datasets to
assess their real-world applicability and reliability. In conclusion, the CNN-based transfer
learning models evaluated in this study exhibit strong performance in colon cancer
classification and detection. The findings provide valuable insights into the potential of
these models to contribute to the field of medical image analysis and enhance colon
cancer diagnosis and treatment.
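The AUC metric cited throughout this chapter has a simple probabilistic reading: the chance that a randomly chosen positive sample is scored above a randomly chosen negative one. A from-scratch sketch with made-up scores:

```python
# AUC computed from its pairwise-ranking definition: the probability that a
# random positive sample outscores a random negative one (ties count half).
def auc(scores_pos, scores_neg):
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Made-up classifier scores for illustration only.
print(auc([0.9, 0.8, 0.7], [0.6, 0.4, 0.75]))
```

A perfect classifier, whose every positive outscores every negative, yields an AUC of 1.0 under this definition.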

CHAPTER 5
IMPACT ON SOCIETY, ENVIRONMENT, AND SUSTAINABILITY

5.1 Impact on Society


Colon cancer, particularly its Microsatellite Instability (MSI) and Microsatellite Stable
(MSS) subtypes, poses a significant threat to public health, leading to numerous
complications and even death if not detected and treated promptly. Early detection and
diagnosis of these types of colon cancer are crucial for managing and treating these
conditions effectively. Traditional methods of colon cancer detection and diagnosis, such
as manual inspection, laboratory testing, or expert consultation, are often time-
consuming, costly, and may not always be accessible. Therefore, there is a pressing need
for developing more efficient, accurate, and accessible methods of colon cancer detection
and diagnosis.
One promising approach is the use of Deep Learning, specifically Convolutional Neural
Networks (CNNs) based on transfer learning, for colon cancer classification. Transfer
learning is a technique that can significantly improve the performance of colon cancer
classification systems by leveraging the knowledge learned from large datasets to extract
relevant features from new classes. In our study, we used eight CNN-based models,
including EfficientNetB2, MobileNet, MobileNetV2, VGG16, VGG19, InceptionV3,
DenseNet201, and ResNet50, trained on a dataset of 100,570 preprocessed histological
colon cancer images. These models achieved high accuracy values, ranging from
85.33% to 98.16%, indicating their ability to
correctly classify colon cancer images. They also exhibited high Area Under the Curve
(AUC) scores, ranging from 93.41% to 99.70%, further confirming their efficacy in
distinguishing between different classes of colon cancer. The use of transfer learning for
colon cancer classification can have a profound impact on society. It can benefit
healthcare professionals in detecting and diagnosing colon cancer in a timely and
accurate manner. This can help them take appropriate actions to manage and treat colon
cancer, such as recommending lifestyle changes, prescribing medications, or referring
patients to specialists. This can also help them improve their diagnostic and treatment
protocols, such as optimizing imaging techniques or adjusting treatment plans based on
the severity of the disease. Moreover, this can help them reduce the cost and time of
colon cancer detection and diagnosis, as well as improve the quality and quantity of
patient care. Transfer learning for colon cancer classification can also have a positive
impact on society by contributing to better health outcomes and reducing healthcare
costs. By enabling early and accurate detection of colon cancer, it can lead to earlier
intervention and treatment, potentially preventing serious complications and improving
patient survival rates. By improving diagnostic and treatment protocols, it can lead to
more effective and personalized care, potentially reducing the burden on healthcare
systems and increasing patient satisfaction. In conclusion, transfer learning for colon
cancer classification can have a positive impact on society, as it can benefit healthcare
professionals in detecting and diagnosing colon cancer in a timely and accurate manner,
as well as contribute to better health outcomes and reduced healthcare costs. Transfer
learning can also have a positive impact on society by creating new opportunities for
research and innovation in the field of colon cancer detection and diagnosis.

5.2 Impact on Environment


Colon cancer, particularly its Microsatellite Instability (MSI) and Microsatellite Stable
(MSS) subtypes, places a significant burden on healthcare systems, and indirectly on the
environment, due to its high incidence and mortality rates. Traditional methods of colon
cancer detection and diagnosis, such as
manual inspection, laboratory testing, or expert consultation, are often time-consuming,
costly, and may not always be accessible. Therefore, there is a pressing need for
developing more efficient, accurate, and accessible methods of colon cancer detection
and diagnosis. One promising approach is the use of Deep Learning, specifically
Convolutional Neural Networks (CNNs) based on transfer learning, for colon cancer
classification. Transfer learning is a technique that can significantly improve the
performance of colon cancer classification systems by leveraging the knowledge learned
from large datasets to extract relevant features from new classes. In our study, we used
eight CNN-based models, including EfficientNetB2, MobileNet, MobileNetV2, VGG16,
VGG19, InceptionV3, DenseNet201, and ResNet50, trained on a dataset of 100,570
preprocessed histological colon cancer images. These models achieved high accuracy
values, ranging from 85.33% to 98.16%, indicating their ability to correctly classify
colon cancer images. They also exhibited high Area Under the Curve (AUC) scores,
ranging from 93.41% to 99.70%, further confirming their efficacy in distinguishing
between different classes of colon cancer. The use of
transfer learning for colon cancer classification can have a profound impact on the
environment. It can benefit healthcare professionals in detecting and diagnosing colon
cancer in a timely and accurate manner. This can help them take appropriate actions to
manage and treat colon cancer, such as recommending lifestyle changes, prescribing
medications, or referring patients to specialists. This can also help them improve their
diagnostic and treatment protocols, such as optimizing imaging techniques or adjusting
treatment plans based on the severity of the disease. Moreover, this can help them reduce
the cost and time of colon cancer detection and diagnosis, as well as improve the quality
and quantity of patient care. Transfer learning for colon cancer classification can also
have a positive impact on the environment by contributing to better health outcomes and
reducing healthcare costs. By enabling early and accurate detection of colon cancer, it
can lead to earlier intervention and treatment, potentially preventing serious
complications and improving patient survival rates. By improving diagnostic and
treatment protocols, it can lead to more effective and personalized care, potentially
reducing the burden on healthcare systems and increasing patient satisfaction. In
conclusion, transfer learning for colon cancer classification can have a positive impact on
the environment, as it can benefit healthcare professionals in detecting and diagnosing
colon cancer in a timely and accurate manner, as well as contribute to better health
outcomes and reduced healthcare costs. Transfer learning can also have a positive impact
on the environment by creating new opportunities for research and innovation in the field
of colon cancer detection and diagnosis.

5.3 Ethical Aspects


The application of deep learning, specifically Convolutional Neural Networks
(CNNs) based on transfer learning, for colon cancer classification brings forth several
ethical considerations. Firstly, the use of large datasets for training these models raises
concerns about privacy and data security. Patients' sensitive medical information,
including their genetic data, may be used without explicit consent, which could
potentially violate their rights to privacy and autonomy. To address this, strict data
protection measures should be implemented, ensuring that patient data is anonymized and
securely stored.
Secondly, the potential for bias in the models themselves is a significant concern. If the
training data does not accurately represent the population, the models may produce
biased results, leading to unfair treatment or discrimination. To mitigate this risk, efforts
should be made to ensure diversity in the training data, reflecting the wide range of
factors that influence colon cancer risk and progression. Thirdly, the reliance on artificial
intelligence for critical medical decisions raises questions about accountability. While AI
can greatly enhance diagnostic accuracy, it is ultimately human healthcare professionals
who will make the final decisions based on these predictions. Therefore, there needs to be
clear guidelines and protocols in place to define how these AI tools should be used, and
what steps should be taken if the AI makes an error. Lastly, the economic implications of
widespread adoption of AI in healthcare cannot be ignored. While AI has the potential to
revolutionize medicine by making it more efficient and accurate, it also risks
exacerbating existing health disparities. Access to advanced technology like AI is not
equal, and those without access may be left behind. Therefore, policies should be put in
place to ensure equitable access to these technologies. In conclusion, while deep learning
offers exciting possibilities for colon cancer classification, it is crucial to consider and
address these ethical aspects to ensure that its benefits are realized without causing harm.

5.4 Sustainability Plan


The sustainability plan for the implementation of deep learning for colon cancer
classification involves several key strategies. Firstly, continuous training and updating of
the models is essential. As medical research advances, so too must the models. Regular
updates with new data will ensure that the models remain accurate and relevant. This will
require a dedicated team of data scientists and medical experts to monitor and update the
models. Secondly, the development of a robust infrastructure is necessary. This includes
not only the physical servers needed to run the models but also the software and hardware
required to integrate the models into existing healthcare systems. This infrastructure must
be scalable to handle increased demand and capable of handling large volumes of data.
Thirdly, a comprehensive education and training program for healthcare professionals is
crucial. They will need to understand how to interpret the outputs of the models, how to
use them in conjunction with other diagnostic tools, and how to respond appropriately
when the models indicate a potential diagnosis of colon cancer. This will involve
workshops, seminars, and possibly even online courses. Fourthly, regular audits and
reviews of the system will be necessary. This will ensure that the models are still
performing as expected, that the infrastructure is functioning properly, and that the
system is meeting its objectives. Any issues identified during these audits should be
addressed promptly to minimize disruption to patient care. Finally, a plan for long-term
maintenance and support is important. This includes ongoing funding for the system, as
well as a plan for what will happen if the system becomes obsolete or encounters
unforeseen challenges.
In conclusion, the sustainability of the deep learning system for colon cancer
classification depends on continuous improvement, robust infrastructure, comprehensive
training, regular audits, and long-term planning. With these strategies in place, the system
can provide valuable assistance to healthcare professionals for many years to come.

CHAPTER 6
SUMMARY, CONCLUSION, RECOMMENDATION, AND
IMPLICATIONS FOR FUTURE RESEARCH

6.1 Summary of the Study


This study evaluated the effectiveness of CNN-based transfer learning models for
classifying and detecting colon cancer. A dataset of 100,570 preprocessed colon cancer
images was used, and eight models were trained and evaluated. The models showed
strong performance in accurately classifying colon cancer, with high accuracy and AUC
scores. The recall, precision, and F1-scores also demonstrated the models’ ability to
correctly identify positive cases and minimize false positives. The study’s analysis of the
dataset supported the reliability and generalizability of the models’ performance. The
findings highlight the potential of CNN-based transfer learning models in improving
colon cancer detection, and future research may involve refining the models and
expanding the dataset. Overall, this study shows the effectiveness of these models in
colon cancer classification and detection.

6.2 Conclusion
This study addressed the classification of colon cancer subtypes, specifically MSS
(Microsatellite Stability) and MSIMUT (Microsatellite Instability). The classification
task was accomplished using transfer learning techniques and the EfficientNetB2 model
with pre-trained weights from ImageNet. The dataset consisted of 100,570 images, with
50285 images from the MSS class and 50285 from the MSIMUT class. My experiments
show that the suggested method is effective. The test results show that the model
achieved an impressive accuracy rate of 98.16%, with an AUC of 99.70%, indicating
excellent discrimination power. The evaluation metrics of precision, recall, and F1-score
demonstrated a consistently high level of performance for both the MSS and MSIMUT
classes, indicating the model's robustness. These findings underscore the potential of
deep learning and transfer learning in accurately classifying colon cancer subtypes. The
large dataset was utilized in this study and the state-of-the-art EfficientNetB2 model
contributed to the exceptional performance achieved. The obtained results suggest that
the developed model can serve as a valuable tool in assisting medical professionals in the
early and accurate detection of colon cancer subtypes. The outcomes of this research have
significant implications in the field of oncology and provide valuable insights for
clinicians and researchers. Further improvements and refinements in the model
architecture and training process can be explored to enhance the accuracy and
generalizability of the classification system. Overall, the findings presented in this paper
contribute to the body of knowledge on colon cancer classification and demonstrate the
potential of deep learning techniques in improving diagnostic accuracy. The promising
results warrant further investigation and validation through clinical trials and
collaboration with medical experts.
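The transfer-learning setup described in this conclusion (a frozen pretrained backbone feeding a small trainable head) can be caricatured without any deep-learning framework; the features and labels below are synthetic stand-ins, not thesis data, and in the actual work the features would come from EfficientNetB2 with ImageNet weights:

```python
import numpy as np

# Conceptual sketch of transfer learning: the backbone is frozen, so only a
# small classification head (here, logistic regression) is trained on its
# feature outputs. Features and labels are synthetic stand-ins.
rng = np.random.default_rng(42)
features = rng.normal(size=(200, 16))                 # frozen-backbone outputs
labels = (features[:, 0] + features[:, 1] > 0).astype(float)  # toy binary labels

w, b = np.zeros(16), 0.0                              # trainable head parameters
for _ in range(500):                                  # plain gradient descent
    p = 1 / (1 + np.exp(-(features @ w + b)))
    grad = p - labels
    w -= 0.1 * features.T @ grad / len(labels)
    b -= 0.1 * grad.mean()

acc = ((1 / (1 + np.exp(-(features @ w + b))) > 0.5) == labels).mean()
print(f"head training accuracy: {acc:.2f}")
```

Because the backbone stays fixed, only the 17 head parameters are updated, which is why transfer learning can work with far fewer labeled images than training a CNN from scratch.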

6.3 Future Work


Although the proposed model has demonstrated excellent performance in classifying
MSS and MSIMUT subtypes of colon cancer, there are several avenues for future
research and improvement. Some potential areas of focus for future work include:
Multi-class Classification: Expanding the model to classify additional subtypes of colon
cancer beyond just MSS and MSIMUT. This could involve collecting and annotating a
larger dataset encompassing a broader range of colon cancer subtypes to enhance the
model's ability to differentiate between different classes.
Data Augmentation: Investigating various data augmentation techniques to enhance the
model's generalization capabilities further. Techniques such as rotation, scaling, flipping,
and adding noise to the images can help the model learn more robust and diverse features,
potentially improving its performance on unseen data.
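The augmentations named above can be illustrated with a minimal, framework-free sketch. In practice a library such as Keras or Albumentations would be used on the histology tiles; this stand-in operates on a 2-D grayscale image represented as a list of rows, and the noise level (0.05) is an arbitrary illustrative choice.

```python
import random

def augment(image, seed=None):
    """Return simple augmented variants of a 2-D grayscale image:
    horizontal flip, vertical flip, 90-degree rotation, and a copy
    with Gaussian noise added."""
    rng = random.Random(seed)
    h_flip = [row[::-1] for row in image]          # mirror left-right
    v_flip = image[::-1]                           # mirror top-bottom
    rot90 = [list(col) for col in zip(*image[::-1])]  # rotate clockwise
    noisy = [[px + rng.gauss(0, 0.05) for px in row] for row in image]
    return {"h_flip": h_flip, "v_flip": v_flip, "rot90": rot90, "noisy": noisy}

img = [[0.1, 0.2], [0.3, 0.4]]
variants = augment(img, seed=0)
print(variants["h_flip"])  # [[0.2, 0.1], [0.4, 0.3]]
```

Each augmented variant keeps the original label, effectively multiplying the number of training examples the model sees.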
Model Optimization: Exploring advanced optimization algorithms and hyperparameter
tuning methods to fine-tune the model's performance. In future work, techniques such as
grid search, random search, or Bayesian optimization could be employed to improve the
model's robustness and overall performance.
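Of the techniques listed, random search is the simplest to sketch. The search space and the objective below are hypothetical placeholders; in a real experiment the objective would train the network with the sampled configuration and return its validation accuracy.

```python
import random

def random_search(objective, space, n_trials=20, seed=42):
    """Randomly sample hyperparameter configurations from `space`
    and return the best configuration and its score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Hypothetical search space; the objective stands in for validation
# accuracy and simply prefers lr=1e-3 and batch size 32.
space = {"learning_rate": [1e-2, 1e-3, 1e-4], "batch_size": [16, 32, 64]}
def fake_objective(cfg):
    return -abs(cfg["learning_rate"] - 1e-3) - abs(cfg["batch_size"] - 32) / 100

best, score = random_search(fake_objective, space, n_trials=30)
print(best)
```

Grid search enumerates every combination instead of sampling, and Bayesian optimization replaces the random sampler with a surrogate model that proposes promising configurations.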
Ensemble Learning: Exploring the use of ensemble learning methods to combine the
predictions of multiple models that have been trained on different subsets of the data or
have different architectures. Ensemble methods such as bagging and boosting can help
improve the model's overall performance by leveraging the diversity of multiple models.
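A minimal form of ensembling is soft voting: averaging the class-probability outputs of several models and predicting the class with the highest mean probability. The probabilities below are hypothetical outputs for the two classes (MSS, MSIMUT), not results from this thesis.

```python
def soft_vote(prob_lists):
    """Average class-probability vectors from several models
    (soft voting) and return the winning class index and the
    averaged probabilities."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__), avg

# Hypothetical [P(MSS), P(MSIMUT)] outputs from three models
preds = [[0.62, 0.38], [0.45, 0.55], [0.70, 0.30]]
winner, avg = soft_vote(preds)
print(winner)  # 0, i.e. the MSS class wins the vote
```

Even though one of the three models disagrees, the averaged probabilities favor class 0; this smoothing over disagreeing models is what makes ensembles more robust than any single member.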
Interpretability and Explainability: Developing methods to interpret and explain the
model's decisions to provide insights into the features and patterns it relies on for
classification. Techniques such as feature importance analysis, saliency mapping, and
attention mechanisms can help identify the regions of interest in the images that
contribute most to the classification.
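One simple, model-agnostic way to approximate such a saliency map is occlusion analysis: mask each region of the input in turn and record how much the model's score drops. The scoring function below is a deliberately trivial stand-in (it responds only to the top-left pixel) so that the expected map is obvious; a real study would call the trained classifier instead.

```python
def occlusion_importance(image, score_fn, fill=0.0):
    """Estimate per-pixel importance by occluding each pixel with
    `fill` and measuring the drop in the model's score (a simple
    occlusion-based stand-in for gradient saliency maps)."""
    base = score_fn(image)
    h, w = len(image), len(image[0])
    importance = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            occluded = [row[:] for row in image]  # copy, then mask one pixel
            occluded[i][j] = fill
            importance[i][j] = base - score_fn(occluded)
    return importance

# Toy scorer that only looks at the top-left pixel, so the map
# should highlight exactly that location.
score = lambda img: img[0][0]
imp = occlusion_importance([[0.9, 0.1], [0.2, 0.3]], score)
print(imp)  # [[0.9, 0.0], [0.0, 0.0]]
```

On histology tiles the same loop would slide a larger occlusion patch, and the resulting heatmap can be overlaid on the tile to show which tissue regions drive the MSS/MSIMUT decision.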
Clinical Validation: Conducting extensive clinical validation studies to assess the model's
performance and reliability in real-world settings. Collaborating with medical
professionals and experts to validate the model's accuracy and integrate it into clinical
workflows can provide valuable insights for its practical implementation.
Deployment and Scalability: Exploring methods to deploy the model in a scalable and
user-friendly manner, such as developing a web-based or mobile application for easy
access and utilization by healthcare professionals. Ensuring the model's efficiency and
scalability will be crucial for its practical adoption and widespread use.
By addressing these aspects in future research, I can improve the proposed model and
make the diagnosis of colon cancer subtypes more accurate and efficient, ultimately
leading to improved patient outcomes and better disease management.



REFERENCES

[1] Sinicrope, F. A., Foster, N. R., Thibodeau, S. N., Marsoni, S., Monges, G., et al., "DNA mismatch
repair status and colon cancer recurrence and survival in," JNCI: Journal of the National Cancer Institute,
vol. 103, no. 11, pp. 863-875, 2011.

[2] Bray, F., Ferlay, J., Soerjomataram, I., Siegel, R. L., Torre, L. A., & Jemal, A., "Global cancer
statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185
countries," CA: A Cancer Journal for Clinicians, vol. 68, no. 6, pp. 394-424, 2018.

[3] Ratti, M., Lampis, A., Hahne, J. C., Passalacqua, R., & Valeri, N., "Microsatellite instability in colon
cancer: molecular bases, clinical perspectives, and new treatment approaches," Cellular and
Molecular Life Sciences, vol. 75, no. 22, pp. 4151-4162, 2018.

[4] Kather, J. N., Pearson, A. T., Halama, N., Jäger, D., Krause, J., Loosen, S. H., Marx, A., Boor, P.,
Tacke, F., Neumann, U. P., Grabsch, H. I., Yoshikawa, T., Brenner, H., Chang-Claude, J.,
Hoffmeister, M., Trautwein, C., & Luedde T., "Deep learning can predict microsatellite instability
directly from histology in gastrointestinal cancer," Nature Medicine, vol. 25, no. 7, pp. 1054-1056,
2019.

[5] Khoa A. Tran, Olga Kondrashova, Andrew Bradley, Elizabeth D. Williams, John V. Pearson, Nicola
Waddell, "Deep learning in cancer diagnosis, prognosis and treatment selection," Genome Medicine,
vol. 13, no. 1, pp. 152-153, 2021.

[6] Zhang, Y., Wang, X., Liu, J., & Liang, C., "Deep learning for colon cancer classification based on
histopathological images," BMC Bioinformatics, vol. 21, no. 1, p. 546, 2020.

[7] Cristescu, R., Lee, J., Nebozhyn, M., Kim, K.-M., Ting, J. C., Wong, S. S., Liu, J., Yue, Y. G., Wang,
J., Yu, K., Ye, X.-S., Do I.-G., Liu, S., Gong, L., Fu, J., Jin, J.-G., Choi, M.-G., Sohn T.S., Lee J.H.,
Bae J.M., Kim S.T., Park S.H., Sohn I., Jung S, "Molecular analysis of colon cancer identifies
subtypes associated with distinct clinical outcomes," Nature Medicine, vol. 21, no. 5, pp. 449-456,
2015.

[8] Iizuka O., Horie H., Morikawa T., Maeda H., Abe H., Kudo S.E. & Mori Y., "Deep learning models
for histopathological classification of colon and colonic epithelial tumors," Scientific Reports, vol. 10,
no. 1, p. 1504, 2020.

[9] Qin Y., Li Z.Y., Zhang X.F. & Li Z.Y., "Artificial intelligence in the imaging of colon cancer: current
applications and future direction," Frontiers in Oncology, vol. 11, pp. 631686-631687, 2021.

[10] Lee J.-H. & Kim J.-H., "A review of the application of deep learning in medical image analysis,"
Journal of Biomedical Engineering Research, vol. 41, no. 2, pp. 77-88, 2020.

[11] Zabiha Khan and R. Loganathan, "Transfer Learning Based Classification of MSI and MSS
Gastrointestinal Cancer," International Journal of Health Sciences, vol. 6, no. S1, pp. 1857-1872,
2022.

[12] CH Sai Venkatesh, Caleb Meriga, M.G.V.L Geethika, T Lakshmi Gayatri, V.B.K.L Aruna, "Modified
ResNet Model for MSI and MSS Classification of Gastrointestinal Cancer," arXiv, 2022.

[13] Shuxuan Fan, Xubin Li, Xiaonan Cui, Lei Zheng, Xiaoyi Ren, Wenjuan Ma, Zhaoxiang Ye,
"Computed Tomography-Based Radiomic Features Could Potentially Predict Microsatellite
Instability Status in Stage II Colorectal Cancer: A Preliminary Study," Academic Radiology, vol. 26,
no. 12, pp. 1633-1640, 2019.

[14] Rikiya Yamashita, Jin Long, Teri Longacre, Lan Peng, Gerald Berry, Brock Martin, John Higgins,
Daniel L Rubin, Jeanne Shen, "Deep learning model for the prediction of microsatellite instability in
colorectal cancer: a diagnostic study," The Lancet Oncology, vol. 22, no. 1, pp. 132-141, 2021.

[15] Hüseyin Erikçi, Ziynet Pamuk, "VGG16 classification of microsatellite instability in colorectal
cancer by deep learning," International Refereed Journal of Engineering and Sciences, 2022.

[16] Franceska Dedeurwaerdere, Kathleen BM Claes, Jo Van Dorpe, Isabelle Rottiers, Joni Van der
Meulen, Joke Breyne, Koen Swaerts, Geert Martens, "Comparison of microsatellite instability
detection by immunohistochemistry and molecular techniques in colorectal and endometrial cancer,"
Scientific Reports, vol. 11, 2021.

[17] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, "Imagenet classification with deep
convolutional neural networks," Communications of the ACM, vol. 60, no. 6, p. 84–90, 2017.

[18] Karen Simonyan, Andrew Zisserman, "Very Deep Convolutional Networks for Large-Scale Image
Recognition," arXiv, 2015.

[19] Yuqing Gao, Khalid M. Mosalam, "Deep Transfer Learning for Image-Based Structural Damage
Recognition," Computer-Aided Civil and Infrastructure Engineering, vol. 33, no. 9, pp. 748-768,
2018.

[20] Larsen-Freeman, Diane, "Transfer of Learning Transformed," Language Learning, vol. 63, no. s1, pp.
107-129, 2013.

[21] Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand,
Marco Andreetto, Hartwig Adam, "MobileNets: Efficient Convolutional Neural Networks for Mobile
Vision Applications," arXiv, 2017.

[22] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen,
"MobileNetV2: Inverted Residuals and Linear Bottlenecks," arXiv, 2018.

[23] Hussam Qassim, Abhishek Verma, "Compressed residual-VGG16 CNN model for big data places
image recognition," 2018 IEEE 8th Annual Computing and Communication Workshop and
Conference (CCWC), 2018.



[24] Shengqi Guan, Ming Lei, Hao Lu, "A Steel Surface Defect Recognition Algorithm Based on
Improved Deep Learning Network Model Using Feature Visualization and Quality Evaluation," IEEE
Access, vol. 8, pp. 49885 - 49895, 2020.

[25] Cheng Wang, Delei Chen, Hao Lin, Xuebo Liu, "Pulmonary Image Classification Based on
Inception-v3 Transfer Learning Model," IEEE Access, vol. 7, pp. 146533 - 146541, 2019.

[26] Mingxing Tan, Quoc V. Le, "EfficientNet: Rethinking Model Scaling for Convolutional Neural
Networks," arXiv, 2019.

[27] Mingxing Tan, Quoc V. Le, "EfficientNet: Rethinking Model Scaling for Convolutional Neural
Networks," arXiv, 2019.

[28] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, "Imagenet classification with deep
convolutional neural networks," Communications of the ACM, vol. 60, no. 6, p. 84–90, 2017.



PLAGIARISM
