
Article

Transfer Learning Approach for Classification of Histopathology Whole Slide Images

Shakil Ahmed 1, Asadullah Shaikh 2,*, Hani Alshahrani 2, Abdullah Alghamdi 2, Mesfer Alrizq 2, Junaid Baber 1 and Maheen Bakhtyar 1

1 Department of Computer Science and Information Technology, University of Balochistan, Quetta 87300, Pakistan; shakil.mscs@um.uob.edu.pk (S.A.); junaidbaber@ieee.org (J.B.); maheenbakhtyar@um.uob.edu.pk (M.B.)
2 College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia; hmalshahrani@nu.edu.sa (H.A.); aaalghamdi@nu.edu.sa (A.A.); msalrizq@nu.edu.sa (M.A.)
* Correspondence: asshaikh@nu.edu.sa

Abstract: The classification of Whole Slide Images (WSIs) provides physicians with an accurate analysis of diseases and also helps them to treat patients effectively. The classification can be linked to further detailed analysis and diagnosis. Deep Learning (DL) has made significant advances in the medical industry, including the use of Magnetic Resonance Imaging (MRI) scans, Computed Tomography (CT) scans, and ECGs to detect life-threatening diseases such as heart disease, cancer, and brain tumors. However, more advancement is needed in the field of pathology, and the main hurdle for the slow progress is the shortage of large labeled datasets of histopathology images for training the models. The Kimia Path24 dataset was created specifically for the classification and retrieval of histopathology images; it contains 23,916 histopathology patches with 24 tissue texture classes. A transfer learning-based framework is proposed and evaluated on two famous DL models, Inception-V3 and VGG-16. To improve the productivity of Inception-V3 and VGG-16, we use their pretrained weights, concatenated with the image vector, as input for training the same architectures. Experiments show that the proposed innovation improves the accuracy of both famous models. The patch-to-scan accuracy of VGG-16 improves from 0.65 to 0.77, and that of Inception-V3 improves from 0.74 to 0.79.

Keywords: deep learning; transfer learning; histopathology

1. Introduction
In the field of medical science, automatic analysis of histological images has created great convenience for doctors and scientists. Experts from different fields of computing and machine learning are contributing to medical science owing to the availability of labeled data and of technology that can digitize the data used in everyday analysis. Recently, in the field of pathology, it has become technologically easy to digitally scan the slides used for microscopy analysis and to use the scans for computer-aided analysis and diagnosis. The digital scan of the sample on a slide is called a WSI, which enables the sample to be stored on a computer as a digital image. The WSI can be used for detailed analysis and diagnosis by experts remotely, or it can serve as a reference for future predictions. A saved WSI can easily be shared with experts in an entirely different corner of the world for their swift analysis of the image. WSI processing has provided great convenience to practitioners and has also motivated scientists to build more robust and reliable automatic diagnostic models. Medical image analysis software is powered by machine learning, particularly deep learning-based models. Deep learning with Convolutional Neural Networks (CNNs) is a quickly expanding field in histological image analysis, and machine learning using CNNs has recently drawn the research community's interest in a variety of image analysis fields [1]. It provides physicians with an accurate analysis of diseases and helps them to correlate a case with previously stored samples, which leads to more effective medical decisions. Preliminary CNN-based architectures have been proposed for computerized applications, including the use of Magnetic Resonance Imaging (MRI) scans, Computed Tomography (CT) scans, and electrocardiograms (ECGs) to detect life-threatening diseases such as heart disease, cancer, and brain tumors. Training deep learning models from scratch is problematic because state-of-the-art CNNs require a large training set [2] and substantial computational resources. CNNs such as DenseNet [3] obtain high accuracy when trained on ImageNet [4] because ImageNet offers a huge databank of images for training CNN models. Deep features and transfer learning have allowed these deep models to be used in a variety of domains, including medical applications [4],[5]. Methods that retrieve histopathological images by similarity between feature vectors can be difficult to apply when extracting data from a large database. Therefore, more advancement in the field of digital pathology is expected and needed. Microscopic analysis of histopathology images is time-consuming and difficult; automated histopathology image diagnosis reduces pathologists' workload and helps them to concentrate on more sensitive cases.
Deep Neural Network (DNN) architectures are versatile and have been trained to perform complex tasks such as classification and facial recognition using large image collections (e.g., ImageNet). Using "pre-trained" networks is a realistic way to apply them to medical image classification, as it sidesteps the problem of not having a massive, well-labeled, and well-balanced image dataset.
These deep models have performed well in several domains, including medical applications and deep characteristics of medical images. Given domain data and a network trained to discriminate on a large nonspecific dataset (e.g., ImageNet, a huge databank of objects with more than 10,000 categories), there are three alternatives for adapting the classification model to the current domain:
a) Training from scratch: the architecture is initialized with random weights and trained for several epochs. The model learns characteristics from the data and computes weights using backpropagation at every epoch. If the dataset is not large enough, this method cannot produce the most accurate results; it serves as a reference point for the other two methods.
b) Feature extraction: the model is initialized with weights trained on a wider dataset. A pre-trained CNN is used as a feature extractor by freezing all convolutional blocks and then training the connected layers on the new dataset. The layer just before the classifier is assumed to be a feature layer; instead of the pre-trained CNN's classifier, a classifier such as an SVM or a neural network can be used for classification (a minimal sketch of this option is given after this list).
c) Fine-tuning: a pre-trained CNN is adapted as a classifier by retraining only the network's final layers (the domain layers), in addition to retraining the classifier at the end of the fully connected network.
The major advantages of this research work are as follows:
i. All the images in the Kimia Path24 database were used for training and testing purposes and were classified into 24 classes of grayscale histopathology images.
ii. Training the entire Inception-V3 and VGG16 [6],[7] models from scratch after transferring the pretrained weights of the same models improved classification accuracy compared to fine-tuning (training the last few layers of the base network) or using high-level feature-extractor techniques for classifying the grayscale images in the Path24 dataset.
iii. The proposed pre-trained CNN models have a fully automated end-to-end structure and do not need any hand-crafted feature extraction method.
In this paper, we analyze and assess the effectiveness of the Inception-V3 and VGG-16 pre-trained models on Kimia Path24 using a transfer learning approach that involves full training of the pre-trained models, fine-tuning the early layers as well, for automatic classification of histopathology images. The paper is structured as follows: Section 2 presents a concise overview of the applicable literature; Section 3 explains the dataset and methodology in detail; experiments, results, and discussion are given in Section 4; and Section 5 covers the conclusion.

2. Related Works


Pre-trained models, which have been trained on a huge databank of images, are used as feature extractors or weight initializers for the classification of histopathological images [6]-[8]. The high dimensionality of digital pathology images makes processing and storage difficult; therefore, using soft-computing approaches to understand regions of importance in an image helps in quicker diagnosis and identification [9]. Scanning and segmentation, as well as detection and retrieval, are all traditional image processing tasks that have increased in importance over time. Cell structures such as nuclei, glands, and lymphocytes have distinctive features that can be used as indicators to identify cancer cells, especially in histopathology [10]. Moreover, the digitization of whole slide images is setting a landmark for laboratory standards, enabling more accurate and speedy diagnosis of diseases [11].
Image extraction and image classification are the main components of histopathology whole slide image (WSI) analysis [12],[13]. They are also used to help medical doctors make more specific and accurate decisions on a patient's medical condition. There are several benefits to digitizing pathology images; in addition, better image processing algorithms can make image retrieval efficient, a property that clinicians and quality management staff may take advantage of. The digitized version of pathology glass slides is one of the most recent and prominent examples of extensive automated evidence [14]. Whole scan images of pathology samples can be gigabytes in size [15]; as a result, storing, processing, and transferring the images in real time is complicated. Yet learning deep features from massive digitized histopathology scans is a decent way to discover hidden patterns that humans cannot recognize. Furthermore, pathology image processing is now considered the "gold standard" for diagnosing multiple diseases, including all forms of cancer [16].
Over the past few years, clinicians and researchers have been interested in machine learning techniques for the automated analysis of digital pathology scans. These images offer high variety, rich structures, and wide dimensionality, but they also come with special challenges. As a result, scholars are looking at different image processing methods and how they can be applied to digital pathology [11]. The use of deep features as image descriptors is a fairly new advancement, mainly based on CNNs, which are trained from the initial layers or used post-training for classification, extracting the high-dimensional characteristics embedded in the fully connected layer [17],[18],[19]. CNNs and several other discriminative deep architectures need optimal training on a large amount of labeled (and balanced) data to avoid the adverse effects of overfitting [20],[21],[22]. Deep solutions have been widely used in histopathology image extraction. In [23], a sparse autoencoder was used to extract features from histopathology files. The authors of [24] demonstrated a patch-based CNN and proposed an expectation-maximization (EM) technique for training it. In [25], a CNN-based nuclei-guided feature extraction technique for histopathological imaging was proposed. There are also a number of frameworks based on handcrafted features [26],[27],[28],[29],[30]. The rapid growth of available digital data, along with efficient computing capabilities, is one of the key reasons for these developments [31]. Deep architectures can be thought of as a more advanced variant of traditional neural networks [31],[32]. A CNN is a type of multilayer neural network that works with two-dimensional data, such as images and video frames [33]. CNNs are feed-forward networks whose structure is designed to reduce sensitivity to input image distortions, including translations and rotations. Deep learning methods are well known for their predictive ability, particularly when they are provided with a huge number of training samples. The learning approach applies multiple operations, including non-linear transformations, to produce precise and useful representations [34]. In 2015, Bar et al. [35] applied DL methods for disease detection in chest X-ray data, using CNNs trained on non-medical images for their experiments. According to their findings, combining DL (Decaf) and PiCodes ("Picture Codes") yields the best results.
The use of pre-trained networks for operations outside their original domain has gained attention [36]. This is especially relevant in the medical field, where the most obvious reason is the lack of sufficient labeled data needed to train a deep network. Groups using pre-trained networks for medical imaging studies have achieved better results [36],[37],[38]; hence, other organizations use ImageNet (a huge databank of images divided into 1000+ categories) for training their networks [38],[39]. Kumar and coauthors extracted deep features using AlexNet or VGG16 and computed precision from similarity distances among the 4,096 extracted features [39]. Kieffer and coauthors used Kimia Path24 to look into the use of deep features, using pre-trained architectures, adjusting for the effects of transfer learning, and comparing pre-trained networks against training from scratch [40].
Later in this section, some famous and recent works on medical imaging are reported: breast cancer classification [25],[41]-[42],[47]; cervical cancer pathology image classification [49]; MRI diagnosis [50]; skin disease classification [51]; and lung cancer diagnosis [52]. All the mentioned frameworks are either built on deep learning architectures or have used transfer learning.
In recent years, CNNs have improved the accuracy of medical image classification tasks, moving from traditional diagnosis to automatic diagnosis with excellent performance at different levels. An example of these tasks is breast cancer: hematoxylin and eosin-stained breast biopsy images fall into four categories, namely invasive carcinoma, in-situ carcinoma, benign tumor, and normal tissue. Saha et al. [41] proposed automatic detection of mitoses from breast histopathology WSIs with a precision of 0.92 and a recall of 0.88. Han et al. [42] proposed a framework to distinguish breast cancer histopathology images using a hierarchical deep learning model; their proposed system divided breast cancer imaging into three subcategories (lobular carcinoma, ductal carcinoma, and fibroadenoma) with an overall accuracy of 0.93. Zheng et al. [25] created a CNN to categorize breast cancer images into two groups (benign and malignant) with a precision of 0.96 on their dataset. Jia et al. [43] used a multi-instance learning algorithm to implement a fully connected network that segments cancer areas in histopathological images. Xu et al. [44] used a transfer learning approach in which a CNN was applied to segment and label histopathology WSIs. Shi et al. [45] applied a deep hashing method to retrieve and classify histopathology images; the suggested model was tested on a lung cancer dataset and reported an accuracy of 0.97. Another study [46] suggested three different CNN models, Inception-ResNetV2, InceptionV3, and ResNet50, to classify coronavirus infection in X-ray cases. In terms of detection and identification, ResNet50 outperforms InceptionV3 and Inception-ResNetV2 with 0.98 accuracy, whereas InceptionV3 attained 0.97 and Inception-ResNetV2 0.87. An ensemble-based framework to classify in vivo endoscopic images as normal or abnormal using VGG, DenseNet, and Inception-based networks is proposed in [48].
Huang et al. [49] proposed a Cervical Cancer Pathology Image Classification Algorithm (AF-SENet) that uses convolutional feature fusion and is the first comprehensive recognition algorithm for each pathology stage of cervical cancer. The proposed AF-SENet fusion subnetwork can effectively distinguish the pathology stages of cervical cancer.
MRI is very important for establishing a correct diagnosis and, therefore, for prescribing appropriate treatment. MRI segmentation to delineate tissue and establish a diagnosis is still an active area of research. Deep learning methods mainly use convolutional neural networks to extract spatial information with various filters at various network levels and then use dense neural network layers for classification or regression. This architecture has been successfully implemented in many biomedical applications, including Alzheimer's disease diagnosis [50].
Srinivasu et al. [51] proposed a computerized skin disease classification process based on deep learning, using MobileNet V2 and Long Short-Term Memory (LSTM). Experiments show that the MobileNet V2 model can run on lightweight computing devices with high accuracy and high efficiency.
Masud et al. [52] introduced a classification framework that can differentiate five types of lung and colon tissue (two benign and three malignant) from histopathological images, and can thus categorize cancerous tissues.
Pathologists examine pathology slides at various resolutions and fields of view in a similar manner. Likewise, as many others do, we apply a deep learning approach to small portions of the image; by doing so, the classifier is extended to every element of the entire slide. This study used WSIs from the Kimia Path24 dataset, which is specially designed for examining the classification and retrieval of histopathology images. In total, there are 1,325 images for testing and 22,590 for training. The DNN works on raw pixel values, requires no extra effort from humans, and can learn a variety of graphical characteristics from the training data.

3. Materials and Methods


Transfer learning is widely used for various applications. Pre-trained models learn small patterns such as shapes and diagonals in the initial layers and then combine these components in subsequent layers to learn composite features. Using the patterns learned in previous layers, the models build meaningful constructs in the final layer.

3.1. Proposed Model


We take famous models for feature extraction and then use those features to train the model. The difference between existing practice and the proposed methodology is that we concatenate the features extracted from the existing model with the processed images and then train the model from scratch, as shown in Figure 2. Weights from previously trained models are transferred to the same architecture by infusing the weights with the raw image pixel values. To project the weights and the pixel values into the same feature space, unit normalization is performed both before and after concatenation. We trained all of the network's layers because of their ability to extract both common and specific features. By doing so, we pass to the new model the information (weight values) about simple features gained in the first and middle layers. Histopathology images are classified using the basic constructs that the proposed CNN models have learned while distinguishing various images in ImageNet. The following are the major contributions of this work:
• Inception-V3 and VGG16 are evaluated for classifying histopathology images automatically.
• The classification effectiveness of the proposed pre-trained models is tested by infusing the feature vectors from the pretrained network with normalized image pixels. We used grayscale histopathology images.
By training the Inception-V3 and VGG16 models after transferring the weights of the same models trained on very large and independent datasets, the classification accuracy for histopathology images has increased. Fine-tuning and feature-extractor-based experiments have already been conducted in many recent papers; in contrast, we take the features from the pretrained models and concatenate them with the original image before training the model from scratch, as sketched below.
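The following is a minimal sketch of this fusion step, under assumptions: the features come from an ImageNet-pretrained VGG16 with global average pooling (a 512-dimensional vector), grayscale patches are replicated to three channels, and the fused, unit-normalized vector feeds a small dense classifier trained from scratch. The exact wiring into the full architecture follows Figure 2 and may differ in detail.

```python
# Sketch of the fusion described above; layer sizes are illustrative.
import numpy as np
import tensorflow as tf

extractor = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(128, 128, 3))

def fused_vector(img):
    """img: (128, 128, 3) float array; returns the unit-normalized fusion
    of pretrained features and raw pixel values."""
    feat = extractor(img[np.newaxis], training=False).numpy().ravel()
    feat /= np.linalg.norm(feat) + 1e-12                # normalize features
    pix = img.ravel() / (np.linalg.norm(img) + 1e-12)   # normalize pixels
    fused = np.concatenate([feat, pix])
    return fused / (np.linalg.norm(fused) + 1e-12)      # normalize again

# Classifier trained from scratch on the fused vectors (24 classes).
clf = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(512 + 128 * 128 * 3,)),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Dense(24, activation="softmax"),
])
```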

3.2. Dataset


We used Kimia Path24, an open-source histopathology image dataset, for our experiments. It was designed with digital pathology image classification and retrieval in mind. The dataset was created from 350 whole scan images (WSIs) of different body parts, stained with different techniques such as immunohistochemistry (IHC), hematoxylin and eosin (H&E), and Masson's trichrome. A TissueScope LE 1.0 scanner was used to record the images in bright field with a 0.75 NA lens1. A total of 24 WSIs were chosen based on visual differentiation by non-medical experts. There are 22,591 training patches and 1,325 testing patches, each of 1000 x 1000 pixels (0.5 mm x 0.5 mm), from 24 classes [47]. The dataset is quite challenging and computationally expensive due to the high dimensionality of the images. Figure 1 shows some colored images from the dataset; the dataset is freely available online2. A hypothetical loading sketch is given at the end of this subsection.

Figure 1. Random sample images from the training and test sets of the Kimia Path24 dataset.

1 http://www.hurondigitalpathology.com/tissuescope-le-3/
2 https://kimialab.uwaterloo.ca/kimia/index.php/pathology-images-kimia-path24/
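For convenience, here is a minimal loading sketch. The per-class directory layout (data/train/c0 ... c23) and the .tif extension are hypothetical assumptions for illustration; the actual distribution of the dataset may organize patches differently.

```python
# Hypothetical loader: reads patches from per-class folders, converts to
# grayscale, and resizes to the 128 x 128 input used in our experiments.
import glob, os
import numpy as np
from PIL import Image

def load_split(root, num_classes=24, size=(128, 128)):
    images, labels = [], []
    for c in range(num_classes):
        for path in glob.glob(os.path.join(root, f"c{c}", "*.tif")):
            img = Image.open(path).convert("L").resize(size)
            images.append(np.asarray(img, dtype=np.float32) / 255.0)
            labels.append(c)
    return np.stack(images), np.array(labels)

# train_x, train_y = load_split("data/train")  # assumed layout
```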

3.3. Accuracy Calculation

The final accuracy calculation for the Kimia Path24 dataset is based on two measures, namely patch-to-scan and whole-scan accuracy, established by [53].

The total number of test patches is denoted by $n_{tot}$, and for this dataset $n_{tot} = 1325$. There are 24 classes (one for each whole-slide image), denoted by the set $S = \{c_0, c_1, \ldots, c_{23}\}$. Any given test patch is denoted by $P_s^i$, where $s \in S$ is its class and $i \in [1, n_{\Gamma_s}]$ indexes it among the patches associated with class $s$. Here $\Gamma_s = \{P_s^i \mid s \in S,\ i = 1, 2, \ldots, n_{\Gamma_s}\}$ is the set of patches belonging to class $s$, and $n_{\Gamma_s}$ is the number of patches in the $s$-th class.

The patch-to-scan accuracy $\eta_p$ is calculated using Equation (1), where $R$ represents the images retrieved in each experiment:

$$\eta_p = \frac{\sum_{s \in S} |R \cap \Gamma_s|}{n_{tot}} \tag{1}$$

The whole-scan accuracy $\eta_w$ is expressed as Equation (2):

$$\eta_w = \frac{1}{24} \sum_{s \in S} \frac{|R \cap \Gamma_s|}{n_{\Gamma_s}} \tag{2}$$

The total accuracy is calculated as shown in Equation (3) [53]:

$$\eta_{total} = \eta_p \cdot \eta_w \tag{3}$$
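For concreteness, a small sketch of how these measures can be computed for a classifier, assuming y_true and y_pred hold the true and predicted class indices of the test patches (so that $|R \cap \Gamma_s|$ is the number of correctly classified patches of class $s$):

```python
# Sketch of Eqs. (1)-(3) for a classifier over the 24 Kimia Path24 classes.
import numpy as np

def kimia_accuracies(y_true, y_pred, num_classes=24):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n_tot = len(y_true)
    eta_p = np.sum(y_true == y_pred) / n_tot           # Eq. (1)
    per_class = [np.mean(y_pred[y_true == s] == s)     # |R ∩ Γ_s| / n_Γs
                 for s in range(num_classes)]
    eta_w = np.mean(per_class)                         # Eq. (2)
    return eta_p, eta_w, eta_p * eta_w                 # Eq. (3)
```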

4. Experiments and Results


VGG-16 [6] and Inception-V3 [7] are employed to categorize the Kimia Path24 grayscale histopathology patches into 24 categories. Kimia Path24 contains 22,590 images for training and 1,325 images for testing. To provide a validation set, 80 percent of the training images are kept for training while the remaining 20 percent are set aside for validation. The performance of both models is later tested on the test set.

Moreover, the original Inception-V3 and VGG16 models with their default configurations are also trained from scratch on the same dataset using the Keras3 library. Keras is an open-source machine learning library based on the Python programming language.

4.1. Experimental Setup


As stated earlier, transfer learning is used to train the VGG16 and Inception-V3 models. The pre-trained models were trained on very large datasets; we give the image as input to the pre-trained network and extract the features from the (n-1)-th layer, and those features are then concatenated with the unit-normalized image. To normalize the values in the concatenated vector, the whole vector is unit-normalized again, as demonstrated in Figure 2.

3 Chollet, F. "Keras," https://github.com/fchollet/keras, 2015


Figure 2. Proposed framework based on transfer learning: the given image is passed to the pre-trained network and features are extracted; those features are then concatenated with the same image, which is vectorized by unit normalization.
During training, Adamax was used to optimize the network parameters, with a learning rate of 10⁻⁶. To regularize the deep models, a dropout ratio of 0.25 was chosen. All hyperparameters were selected through experimental trials, with initial values taken as suggested in the original papers. In the case of VGG-16, the suggested dropout ratio of 0.5 was not optimal in our validation trials. Different batch sizes, ranging from 30 to 150 due to hardware constraints, were tried during learning, and the optimal batch size was 140. A larger batch size could also be used if the training set were enlarged, either by expert annotation or by data augmentation. A sketch of this training configuration is given below.
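A minimal sketch of this configuration in Keras, assuming `model` is the fused-input network and `train_x/train_y`, `val_x/val_y` are the 80/20 split of the training patches (these names are placeholders):

```python
# Training configuration used in our trials: Adamax, lr = 1e-6,
# batch size 140, 50 epochs, categorical cross-entropy loss.
import tensorflow as tf

def train(model, train_x, train_y, val_x, val_y):
    model.compile(
        optimizer=tf.keras.optimizers.Adamax(learning_rate=1e-6),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model.fit(train_x, train_y,
                     validation_data=(val_x, val_y),
                     batch_size=140, epochs=50)
```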

4.2. Results


The Inception-V3 and VGG16 pre-trained CNN models were trained to categorize grayscale histopathology images for 50 epochs. All images were used at a resolution of 128 x 128 pixels to test each pre-trained model. Figure 3 illustrates the training and validation loss curves for VGG16 [6] and Inception-V3 [7], as well as the validation accuracy. Experiments show that there was no accuracy gain after 50 epochs; instead, accuracy started to deteriorate.
Figure 3. Performance graphs of the Inception-V3 and VGG16 pre-trained models on grayscale images: (a) validation accuracy and training and validation loss of the Inception-V3 model; (b) validation accuracy and training and validation loss of the VGG16 model.


According to the evaluation results, Inception-V3 provided better classification accuracy for grayscale images than VGG-16. Figure 4 illustrates the confusion matrices obtained with the Inception-V3 and VGG16 models.

Figure 4. The confusion matrices for the Inception-V3 and VGG16 pre-trained models.
On the grayscale test dataset, the Inception-V3 model correctly classified 1,058 out of 1,325 images, while the VGG16 model correctly classified 1,025 out of 1,325. Table 1 shows the precision, recall, and F1-score values of the Inception-V3 and VGG16 models on the grayscale test-set images. Inception-V3 yielded an average of 80 percent for precision, recall, and F1-score on grayscale histopathology images, whereas VGG16 achieved 77 percent under the same assessment criteria.

Table 1. Per-class classification report (precision, recall, and F1-score) of the Inception-V3 and VGG16 models on the grayscale test dataset.

Class | Amount of Data | Precision (Inception-V3 / VGG16) | Recall (Inception-V3 / VGG16) | F1-Score (Inception-V3 / VGG16)
c0  | 64 | 0.83 / 0.70 | 0.84 / 0.81 | 0.84 / 0.75
c1  | 65 | 0.92 / 0.94 | 1.00 / 0.95 | 0.96 / 0.95
c2  | 65 | 0.80 / 0.80 | 0.91 / 0.85 | 0.85 / 0.82
c3  | 75 | 0.64 / 0.66 | 0.91 / 0.83 | 0.75 / 0.73
c4  | 15 | 0.00 / 0.00 | 0.00 / 0.00 | 0.00 / 0.00
c5  | 40 | 0.80 / 0.52 | 0.20 / 0.42 | 0.32 / 0.47
c6  | 70 | 0.90 / 0.83 | 0.89 / 0.86 | 0.89 / 0.85
c7  | 50 | 0.63 / 0.70 | 0.74 / 0.56 | 0.68 / 0.62
c8  | 60 | 0.79 / 0.79 | 0.77 / 0.77 | 0.78 / 0.78
c9  | 60 | 0.93 / 0.76 | 0.87 / 0.87 | 0.90 / 0.81
c10 | 70 | 0.90 / 0.83 | 0.90 / 0.83 | 0.90 / 0.83
c11 | 70 | 0.87 / 0.82 | 0.87 / 0.90 | 0.87 / 0.86
c12 | 70 | 0.76 / 0.70 | 0.93 / 0.87 | 0.84 / 0.78
c13 | 60 | 0.85 / 0.84 | 0.87 / 0.77 | 0.86 / 0.80
c14 | 60 | 0.97 / 0.84 | 0.97 / 0.93 | 0.97 / 0.88
c15 | 30 | 0.00 / 0.93 | 0.00 / 0.43 | 0.00 / 0.59
c16 | 45 | 0.81 / 0.76 | 0.64 / 0.56 | 0.72 / 0.64
c17 | 45 | 0.65 / 0.68 | 0.93 / 0.80 | 0.76 / 0.73
c18 | 25 | 0.00 / 0.89 | 0.00 / 0.32 | 0.00 / 0.47
c19 | 25 | 0.91 / 1.00 | 0.40 / 0.52 | 0.56 / 0.68
c20 | 65 | 0.68 / 0.74 | 1.00 / 0.98 | 0.81 / 0.85
c21 | 65 | 0.80 / 0.77 | 0.91 / 0.83 | 0.85 / 0.80
c22 | 65 | 0.84 / 0.80 | 0.65 / 0.74 | 0.73 / 0.77
c23 | 65 | 0.78 / 0.82 | 0.94 / 0.71 | 0.85 / 0.76
A few classes can be seen to have zero precision; most of these have only a small number of instances for training and testing.

4.3. Discussion


Kieffer et al. [40] used the Inception-V3 and VGG-16 models with fine-tuning and feature-extraction approaches to categorize the histopathology WSIs in Kimia Path24. Table 2 compares our work with state-of-the-art frameworks on the same dataset (Kimia Path24).

Table 2. Comparison of the proposed innovation with famous state-of-the-art models.

p w total
Paper Model Method
(%)
(%) (%)
Babaie et al. [53] CNN Train from scratch 64.98 64.75 42.07
Kieffer et al. [40] VGG-16 Feature Extractor 65.21 64.96 42.36
Kieffer et al. [40] VGG-16 Fine-tuning 63.85 66.23 42.29
Kieffer et al. [40] Inception-V3 Feature Extractor 70.94 72.24 50.54
Kieffer et al. [40] Inception-v3 Fine-tuning 74.87 76.10 56.98
Simonyan et al. [6] VGG-16 base Train from scratch 69.89 71.09 49.68
model
Sensors 2021, 21, x FOR PEER REVIEW 10 of 13

Szegedy et a. [7] Inception-V3 Train from scratch 72.65 73.00 53.03


base model
The Proposed VGG-16 Feature Extractor 77.41 71.27 55.17
The Proposed Inception-V3 Feature Extractor 79.90 71.33 57.00
333
To categorize grayscale histopathology images, Babaie et al. proposed bag-of-words (BoW), local binary pattern (LBP) histogram, and CNN models in 2017. For the BoW, LBP, and CNN methods, the total accuracy (η_total) was recorded as 39.65 percent, 41.33 percent, and 42.07 percent, respectively.

Kieffer et al. [40] employed the Inception-V3 and VGG16 pre-trained CNN models to categorize histopathology images using transfer learning techniques. For grayscale histopathology images, Inception-V3 obtained the maximum patch-to-scan accuracy of 74.87 percent.
Table 2 shows that the accuracy of Inception-V3 and VGG16 improved after concatenating their pre-trained weights with the image pixel values. Before concatenation, the pretrained feature vector and the image are unit-normalized so that they lie in the same Euclidean space, and the final vector is unit-normalized again after concatenation. Transferring pretrained weights into the same model for training actually improves that model's accuracy. The total accuracies on the test set were 57.00% for Inception-V3 and 55.17% for VGG16. The accuracy of the original VGG-16 and Inception-V3 is also competitive without transfer learning. The main reason VGG-16 and Inception-V3 performed better than the work of Babaie et al. is that both models are deep, while the training dataset was comparatively small for them. The proposed approach uses the same two models but achieves better performance and is comparatively more generalized: besides the dataset used in the experiments, the weights of these two models, trained on millions of images, are also incorporated during training.

5. Conclusions
The adoption of deep learning in digital pathology would be extremely beneficial, as it would move human appraisal of histology toward higher-quality, non-repetitive tasks. It gives pathologists the ability to analyze data at high speed while maintaining accuracy. For the automatic classification of histopathology images, this paper proposes training the entire pre-trained model from pre-trained weights concatenated with raw image pixels. According to the findings, the pre-trained Inception-V3 and VGG-16 models outperformed existing studies in the literature on Kimia Path24 grayscale histopathology scans. Both models had better patch-to-scan accuracy; VGG-16 had a noticeable increase in total accuracy, whereas Inception-V3 had a slight improvement.

The main limitations of the study are the size of the concatenated vector and the size of the dataset used for training the pre-trained models. We might have obtained better accuracy with a huge number of samples, such as millions of histopathology WSIs, or if the proposed models had been pre-trained on medical imaging, because the architecture would then be adjusted appropriately for this work. The training dataset is also imbalanced, as a few classes have very few instances. In future work, we are interested in exploring deep models for data augmentation to address the imbalanced nature of the dataset.
Author Contributions: S.A., A.S., and H.A.; methodology, S.A., A.A., and M.A.; formal analysis, S.A., J.B., and M.B.; software, validation, and writing—original draft preparation, S.A. and A.S.; writing—review and editing, A.A., H.A., and M.A.; supervision, project administration, and funding acquisition, A. All authors have read and approved the published version of the manuscript.

Funding: The authors would like to thank the Deputy for Study and Innovation, Ministry of Education, Kingdom of Saudi Arabia, for funding this research through a grant (NU/IFC/INT/01/008) from the Najran University Institutional Funding Committee.

Conflicts of Interest: The authors declare no conflicts of interest.

References
1. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521(7553), 436-444.
2. Sharma, S.; Mehra, R. Effect of layer-wise fine-tuning in magnification-dependent classification of breast cancer histopathological image. The Visual Computer 2020, 36(9), 1755-1769.
3. Huang, G., et al. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
4. Deng, J., et al. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009.
5. Huang, Z.; Pan, Z.; Lei, B. Transfer learning with deep convolutional neural network for SAR target classification with limited labeled data. Remote Sensing 2017, 9(9), 907.
6. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
7. Szegedy, C., et al. Rethinking the Inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
8. Alhindi, T.J., et al. Comparing LBP, HOG and deep features for classification of histopathology images. In 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, 2018.
9. Alinsaif, S.; Lang, J. Histological image classification using deep features and transfer learning. In 2020 17th Conference on Computer and Robot Vision (CRV), IEEE, 2020.
10. Caicedo, J.C.; González, F.A.; Romero, E. Content-based histopathology image retrieval using a kernel-based semantic annotation framework. Journal of Biomedical Informatics 2011, 44(4), 519-528.
11. Gurcan, M.N., et al. Histopathological image analysis: A review. IEEE Reviews in Biomedical Engineering 2009, 2, 147-171.
12. Pantanowitz, L., et al. Review of the current state of whole slide imaging in pathology. Journal of Pathology Informatics 2011, 2.
13. Zheng, Y., et al. Size-scalable content-based histopathological image retrieval from database that consists of WSIs. IEEE Journal of Biomedical and Health Informatics 2017, 22(4), 1278-1287.
14. Zheng, Y., et al. Histopathological whole slide image analysis using context-based CBIR. IEEE Transactions on Medical Imaging 2018, 37(7), 1641-1652.
15. Ghaznavi, F., et al. Digital imaging in pathology: whole-slide imaging and beyond. Annual Review of Pathology: Mechanisms of Disease 2013, 8, 331-359.
16. Madabhushi, A. Digital pathology image analysis: opportunities and challenges. Imaging in Medicine 2009, 1(1), 7.
17. Rubin, R.; Strayer, D.S.; Rubin, E. Rubin's Pathology: Clinicopathologic Foundations of Medicine; Lippincott Williams & Wilkins, 2008.
18. Babenko, A.; Lempitsky, V. Aggregating local deep features for image retrieval. In Proceedings of the IEEE International Conference on Computer Vision, 2015.
19. Chatfield, K., et al. Return of the devil in the details: Delving deep into convolutional nets. arXiv preprint arXiv:1405.3531, 2014.
20. Oquab, M., et al. Learning and transferring mid-level image representations using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.
21. Wan, J., et al. Deep learning for content-based image retrieval: A comprehensive study. In Proceedings of the 22nd ACM International Conference on Multimedia, 2014.
22. Khatami, A., et al. Parallel deep solutions for image retrieval from imbalanced medical imaging archives. Applied Soft Computing 2018, 63, 197-205.
23. Cruz-Roa, A.A., et al. A deep learning architecture for image representation, visual interpretability and automated basal-cell carcinoma cancer detection. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2013.
24. Hou, L., et al. Patch-based convolutional neural network for whole slide tissue image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
25. Zheng, Y., et al. Feature extraction from histopathological images based on nucleus-guided convolutional neural network for breast lesion classification. Pattern Recognition 2017, 71, 14-25.
26. Weston, J.; Bengio, S.; Usunier, N. Large scale image annotation: learning to rank with joint word-image embeddings. Machine Learning 2010, 81(1), 21-35.
27. Seide, F.; Li, G.; Yu, D. Conversational speech transcription using context-dependent deep neural networks. In Twelfth Annual Conference of the International Speech Communication Association, 2011.
28. Glorot, X.; Bordes, A.; Bengio, Y. Domain adaptation for large-scale sentiment classification: A deep learning approach. In ICML, 2011.
29. Boulanger-Lewandowski, N.; Bengio, Y.; Vincent, P. Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription. arXiv preprint arXiv:1206.6392, 2012.
30. Ciregan, D.; Meier, U.; Schmidhuber, J. Multi-column deep neural networks for image classification. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2012.
31. Bengio, Y. Learning Deep Architectures for AI; Now Publishers Inc., 2009.
32. Hey, A.J.; Trefethen, A.E. The data deluge: An e-science perspective. 2003.
33. Arel, I.; Rose, D.C.; Karnowski, T.P. Deep machine learning - a new frontier in artificial intelligence research. IEEE Computational Intelligence Magazine 2010, 5(4), 13-18.
34. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013, 35(8), 1798-1828.
35. Bar, Y., et al. Deep learning with non-medical training used for chest pathology identification. In Medical Imaging 2015: Computer-Aided Diagnosis, International Society for Optics and Photonics, 2015.
36. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 2009, 22(10), 1345-1359.
37. Shin, H.-C., et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Transactions on Medical Imaging 2016, 35(5), 1285-1298.
38. Girshick, R., et al. Region-based convolutional networks for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2015, 38(1), 142-158.
39. Kumar, M.D.; Babaie, M.; Tizhoosh, H.R. Deep barcodes for fast retrieval of histopathology scans. In 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, 2018.
40. Kieffer, B., et al. Convolutional neural networks for histopathology image classification: Training vs. using pre-trained networks. In 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), IEEE, 2017.
41. Saha, M.; Chakraborty, C.; Racoceanu, D. Efficient deep learning model for mitosis detection using breast histopathology images. Computerized Medical Imaging and Graphics 2018, 64, 29-40.
42. Han, Z., et al. Breast cancer multi-classification from histopathological images with structured deep learning model. Scientific Reports 2017, 7(1), 1-10.
43. Jia, Z., et al. Constrained deep weak supervision for histopathology image segmentation. IEEE Transactions on Medical Imaging 2017, 36(11), 2376-2388.
44. Xu, Y., et al. Large scale tissue histopathology image classification, segmentation, and visualization via deep convolutional activation features. BMC Bioinformatics 2017, 18(1), 1-17.
45. Shi, X., et al. Pairwise based deep ranking hashing for histopathology image classification and retrieval. Pattern Recognition 2018, 81, 14-22.
46. Le Van, C., et al. Detecting lumbar implant and diagnosing scoliosis from Vietnamese X-ray imaging using the pre-trained API models and transfer learning. CMC-Computers, Materials & Continua 2021, 66(1), 17-33.
47. Hameed, Z., et al. Breast cancer histopathology image classification using an ensemble of deep learning models. Sensors 2020, 20(16), 4373.
48. Nguyen, D.T., et al. Enhanced image-based endoscopic pathological site classification using an ensemble of deep learning models. Sensors 2020, 20(21), 5982.
49. Huang, P., et al. AF-SENet: classification of cancer in cervical tissue pathological images based on fusing deep convolution features. Sensors 2021, 21(1), 122.
50. Yamanakkanavar, N.; Choi, J.Y.; Lee, B. MRI segmentation and classification of human brain using deep learning for diagnosis of Alzheimer's disease: a survey. Sensors 2020, 20(11), 3243.
51. Srinivasu, P.N., et al. Classification of skin disease using deep learning neural networks with MobileNet V2 and LSTM. Sensors 2021, 21(8), 2852.
52. Masud, M., et al. A machine learning approach to diagnosing lung and colon cancer using a deep learning-based classification framework. Sensors 2021, 21(3), 748.
53. Babaie, M., et al. Classification and retrieval of digital pathology scans: A new dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017.

You might also like