SSRN Id4186534

Monkeypox diagnostic-aid system with skin images using convolutional neural networks

Luis Muñoz-Saavedra^a, Elena Escobar-Linero^a, Javier Civit-Masot^a, Francisco Luna-Perejón^a, Antón Civit^a,b, Manuel Domínguez-Morales^a,b,∗

a Architecture and Computer Technology department (ATC), Robotics and Technology of Computers Lab (RTC), E.T.S. Ingeniería Informática, Avda. Reina Mercedes s/n, Universidad de Sevilla, Seville, 41012, Spain
b Computer Engineering Research Institute (I3US), E.T.S. Ingeniería Informática, Avda. Reina Mercedes s/n, Universidad de Sevilla, Seville, 41012, Spain

Abstract
Background: Monkeypox is similar to classical smallpox. Until recently, only isolated cases had been detected in Africa, but in 2022 the disease spread around the world and was declared an international health emergency by the WHO at the end of July.
Objective: In this work, we collect monkeypox images from official government websites and public datasets (as well as images of healthy people and people with other skin diseases) and create the publicly available MonkeypoxSkin dataset, which contains close-up images of the skin showing the pustules formed by the disease. With these images, we design, implement, and evaluate several diagnostic-aid systems for monkeypox.
Methods: The classifiers developed and evaluated are based on Convolutional Neural Network (CNN) models and on ensembles that combine those models. Given a close-up image of skin tissue, they automatically classify it as healthy, monkeypox, or other skin damage.
Results: The results show a system accuracy greater than 93% when using a single CNN model (VGG-19 and ResNet50), and greater than 98% when using a CNN ensemble formed by ResNet50, EfficientNet-B0 and MobileNet-V2.
∗ Corresponding author
Email address: mjdominguez@us.es (Manuel Domínguez-Morales)
Preprint submitted to Elsevier August 9, 2022
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
Conclusions: In contrast to the work found to date, this classifier focusses on disease classification based on a low-resolution image of the skin surface, so it can be integrated into a mobile application for mass screening. To do this, it has been necessary to build a custom dataset, which is publicly available. Furthermore, the results obtained improve on the accuracy of previous works.
Keywords: Monkeypox, Deep Learning, Convolutional Neural Networks, Skin damage, Skin dataset
1. Introduction
The first recorded case of monkeypox dates back to 1958, when two outbreaks of a smallpox-like disease occurred in monkey colonies under investigation. Despite being named “monkeypox”, the origin of the disease remains unknown. However, African rodents and non-human primates (such as monkeys) could harbour the virus and infect humans (United Nations, 2022).
The first human case of monkeypox was reported in 1970. Before the 2022 outbreak, monkeypox cases had been reported in people in several countries in Central and West Africa. Previously, almost all monkeypox cases in people outside of Africa were associated with international travel to countries where the disease occurs frequently or with imported animals (World Health Organization, 2022).
According to the latest report from the CDC (Centers for Disease Control and Prevention), dated 28 July 2022, more than 23000 cases have been reported worldwide since the beginning of the 2022 outbreak (Centre of Disease Control and Prevention, 2022). The two countries with the most reported cases are the United States, with more than 5800, and Spain, with more than 4200. Furthermore, monkeypox deaths have been reported in Spain and Brazil in recent days (The Guardian, 2022).

Although the spread of this virus does not resemble that of other recent pandemics such as COVID-19, its expansion in Western countries has been surprising. Therefore, it is important to find a mechanism that facilitates a rapid diagnosis of the disease without requiring expensive diagnostic equipment or a long response time. These requirements are of vital importance in less developed countries without access to medical centres with advanced equipment.
To implement such a system, the information it receives must be something that a user without medical training can provide. The easiest input for such a user is an image and, so that it can be captured easily, it should be a superficial image of the skin. Furthermore, this type of system must take into account skin damage caused by similar viruses or by other diseases such as smallpox, chickenpox, or measles.
However, there is currently no dataset of close-up skin images, although some datasets of full-body images are available. Therefore, for such large-scale work, it is necessary to collect an appropriate dataset with which to design the classifier.
Regarding the development of classification models, the use of artificial intelligence (AI) techniques has spread in recent years within the field of health: from the analysis of physiological signals (Torres-Soto et al., 2020; Rim et al., 2020; Muñoz-Saavedra et al., 2020) to the study of bad habits and abnormalities during daily-life activities (Zhu et al., 2020; Luna-Perejón et al., 2020, 2021a; Escobar-Linero et al., 2022a). Furthermore, in relation to this work, techniques derived from AI and machine learning (ML) are widely used in the design of automatic classifiers for medical images, making use of advanced Deep Learning (DL) techniques such as Convolutional Neural Networks (CNN). These techniques have been applied in multiple research works in recent years, obtaining very positive results with a correct-diagnosis rate higher than 80% (Roncato et al., 2020; Liu et al., 2021; Civit-Masot et al., 2021), and even reaching, in multiple cases, accuracy values higher than 95% (Kundu et al., 2021; Lotter et al., 2021; Thomas et al., 2021; Civit-Masot et al., 2020b).
This research group has extensive experience in the field of machine and deep learning applied to eHealth. Some of its works have been referenced above; others include the monitoring of physiological signals (Luna-Perejón et al., 2021b), biomechanical gait studies (Domínguez-Morales et al., 2019), fall detectors (Luna-Perejón et al., 2019; Escobar-Linero et al., 2022b), and even aid systems for the diagnosis of cancer and other diseases (Civit-Masot et al., 2020a).
Once the problem to be addressed and the experience of this group have been described, the main objectives of this work are as follows:
• Designing a dataset of superficial skin photographs containing images of monkeypox cases, healthy people, and people with other types of diseases that produce skin rashes. This dataset is provided publicly to the community.
• Studying various alternatives of classifiers based on convolutional neural networks to distinguish between the three classes described above.
• Combining different classifier models to form ensemble systems that perform the same task, comparing the results with those obtained previously.
The rest of the manuscript is structured as follows: the methods used to develop and test the diagnosis-aid system are presented in Section 2, including the description of the dataset developed for this work. The results obtained after testing the classifiers are detailed in Section 3. Finally, Section 4 discusses the results obtained and presents the final conclusions of this work and future research lines.
2. Methods
To carry out the study described above, this section presents the custom dataset used to train and test the classification system, the developed classifiers, and the metrics used to evaluate them.
Figure 1 shows the general processing scheme of the system proposed in this work.
2.1. Dataset
Figure 1: Full system scheme

The custom dataset consists of 224x244 pixel images in RGB format. Images were obtained from verified online sources (such as the CDC or the WHO) and from public datasets. The only public datasets found are the one collected by Ali et al. (2022) and the one collected by Ahsan et al. (2022a,b). For the dataset collected in this work, three main premises are taken into account:
• Classes: the dataset needs to distinguish between healthy and ill tissue (with monkeypox). In addition, it is essential to include other skin diseases to determine the degree of importance of the injury.
• Number of images: it is important to develop a balanced dataset with the same number of images for each class.
• Type of images: to develop a classifier that can be integrated into a mobile device, the images must have similar characteristics for all classes and must highlight skin lesions (avoiding images taken from a distance or of whole body parts).
Following these premises, the final dataset collected is composed of three classes: “monkeypox”, “healthy”, and “other diseases” that cause skin damage. Each class contains 100 images of the skin surface. The summary of the collected dataset can be seen in Table 1, which also compares it with the public datasets found. The dataset collected for this work is publicly accessible (Domínguez-Morales et al., 2022).
| Dataset | Classes | Number of images | Type of images |
|---|---|---|---|
| Ali et al. (2022) | [2] Monkeypox, Others | 102, 126 | Full body, Limbs, Face, Trunk |
| Ahsan et al. (2022a,b) | [4] Healthy, Monkeypox, Chickenpox, Measles | 54, 43, 47, 17 | Full body, Limbs, Face, Trunk |
| MonkeypoxSkin dataset (this work) (Domínguez-Morales et al., 2022) | [3] Healthy, Monkeypox, Other skin diseases | 100, 100, 100 | Close skin tissue |

Table 1: Public monkeypox datasets compared with the dataset collected for this work.
The main problems observed in Table 1 for the previous public datasets are described next:
• Ali et al. (2022): the main problem of this dataset is its classes. It has only two classes and no “Healthy” class. This absence can lead to failures when evaluating the classifier with images of healthy tissue: since there is no class for them, any image the classifier receives is labelled as damaged tissue (by monkeypox or some other disease). This problem was previously addressed in other works (Muñoz-Saavedra et al., 2021).
• Ahsan et al. (2022a): this second dataset contains a good number of classes (including a “Healthy” one), but it has two main drawbacks. The first is the total number of images: only the “Healthy” class exceeds 50 images (54 images), and one of the classes (“Measles”) has only 17. The second drawback is data balance: in the worst case, the “Healthy” class has more than three times as many images as the “Measles” class.
Also, in both datasets, the images do not follow a common pattern: they include full-body images and images of single body parts (even images in which more than one person appears). For correct training of the classifier, it would be necessary to discard a significant part of the images in these datasets due to deficiencies such as those indicated, or even cases where two different images overlap.
Figure 2: Sample images from the MonkeypoxSkin dataset (rows: Healthy, Monkeypox, Other diseases).
To address these drawbacks, and as shown in Table 1, the dataset collected for this work is perfectly balanced between classes, which facilitates the classifier training task. In addition, the use of close-up images of the skin surface for all classes yields a more uniform dataset that better represents the information that can be obtained from a photograph taken with a mobile device. Figure 2 shows some examples of images from the collected dataset.
The images of the dataset are used to train the classifiers, using 60% of the images for training, 20% for validation and the remaining 20% for testing. In addition, to improve the training, a data augmentation process is applied only to the images of the training set (performing random rotations of the images until a total of 9000 images is obtained). This division is presented in Table 2.
| Class | Original: Train | Original: Validation | Original: Test | Augmented: Train | Augmented: Validation | Augmented: Test |
|---|---|---|---|---|---|---|
| Healthy | 60 | 20 | 20 | 3000 | 20 | 20 |
| Monkeypox | 60 | 20 | 20 | 3000 | 20 | 20 |
| Other skin damages | 60 | 20 | 20 | 3000 | 20 | 20 |
| TOTAL | 180 | 60 | 60 | 9000 | 60 | 60 |

Table 2: Dataset elaborated for this work and subsets division.
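The subset sizes in Table 2 can be reproduced with a short script. The following is only a minimal sketch of the bookkeeping, not the authors' pipeline: the image identifiers, the random seed, the rotation range of 0 to 360 degrees, and the choice of an equal number of rotated copies per training image are all assumptions made for illustration.

```python
import random

CLASSES = ["healthy", "monkeypox", "other_skin_damages"]
IMAGES_PER_CLASS = 100
TARGET_TRAIN_PER_CLASS = 3000  # 9000 augmented training images / 3 classes


def split_class(image_ids, rng, train=0.6, val=0.2):
    """Shuffle one class and cut it into train/validation/test subsets."""
    ids = list(image_ids)
    rng.shuffle(ids)
    n_train = round(len(ids) * train)
    n_val = round(len(ids) * val)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def augment_plan(train_ids, target, rng):
    """Pair each training image with random rotation angles (assumed range)."""
    copies = target // len(train_ids)
    return [(img, rng.uniform(0.0, 360.0)) for img in train_ids for _ in range(copies)]


rng = random.Random(0)  # fixed seed so the split is repeatable
summary = {}
for name in CLASSES:
    # Hypothetical identifiers standing in for the real image files.
    ids = [f"{name}_{i:03d}" for i in range(IMAGES_PER_CLASS)]
    train, val, test = split_class(ids, rng)
    plan = augment_plan(train, TARGET_TRAIN_PER_CLASS, rng)
    summary[name] = (len(train), len(val), len(test), len(plan))

print(summary)
```

Under these assumptions each of the 60 training images per class receives 50 rotated variants, while the validation and test subsets are left untouched, as in Table 2.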
2.2. Classifiers
The classifiers evaluated are pre-trained models in which the weights of the first layers are kept fixed and only the weights of the last layers are trained. Each model is evaluated independently, and combinations of models are subsequently built to evaluate whether classification improves. The hyperparameters are those commonly used for these models: a batch size of 32, a learning rate of 1e-4, and 50 training epochs. The individual models and the ensembles are presented below.
2.2.1. Individual models
The individual models used in this work are as follows:
• VGG-16: a convolutional neural network proposed by Simonyan & Zisserman (2014) of Oxford University, which gained prominence in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) in 2014, where it took first place in the localisation task and second place in classification.
• VGG-19: a variant with more layers than VGG-16 and, therefore, heavier in memory storage and computational requirements.
• ResNet50: introduced by Microsoft, it won the ILSVRC competition in 2015 (He et al., 2016). It makes it possible to increase the number of layers by introducing residual connections, which pass the input of a block directly to the next one and improve the learning process.
• MobileNet-V2: a convolutional neural network architecture designed to perform well on mobile devices. It is based on an inverted residual structure in which the residual connections are between the bottleneck layers (Sandler et al., 2018).
• EfficientNet-B0: a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a compound coefficient (Tan & Le, 2019). The base EfficientNet-B0 network is built from the inverted bottleneck residual blocks of MobileNetV2, in addition to squeeze-and-excitation blocks.
2.2.2. Ensemble classifiers

Second, the models listed above are combined to form ensembles of three models each. These ensembles are:
• Ensemble 1: VGG-16 + VGG-19 + ResNet50
• Ensemble 2: VGG-16 + ResNet50 + EfficientNet-B0
• Ensemble 3: ResNet50 + EfficientNet-B0 + MobileNet-V2
The models of each ensemble are run in parallel, and their outputs are subsequently merged. The merging follows the three classical approaches for ensembles of neural networks:
• Concatenation Ensemble: this is the most common technique for merging different data sources. A concatenation ensemble receives different inputs, whatever their dimensions, and concatenates them along a given axis. This operation can be dispersive, preventing the final part of the network from learning important information, or can result in overfitting.
• Average Ensemble: this can be considered the opposite of the concatenation operation. The pooled outputs of the networks are passed through dense layers with a fixed number of neurons to equalise their sizes, and the average is then computed. The drawback of this approach is the loss of information inherent in the averaging operation.
• Weighted Ensemble: this is a special form of the average operation, in which the output tensors are multiplied by weights and then linearly combined. These weights determine the contribution of each model to the final result; they are not fixed, because their values are optimised during the training process.
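The three merging operations described above can be illustrated on small vectors. This is a simplified sketch rather than the networks used in this work: it operates on hypothetical three-value outputs instead of pooled feature tensors, it omits the dense layers, and the weights are fixed illustrative numbers, whereas in the weighted ensemble they would be learned during training.

```python
def concatenate(outputs):
    """Join the model outputs end to end along the feature axis."""
    return [value for vec in outputs for value in vec]


def average(outputs):
    """Element-wise mean; all outputs must already have the same length."""
    n = len(outputs)
    return [sum(values) / n for values in zip(*outputs)]


def weighted(outputs, weights):
    """Element-wise linear combination of the outputs, one weight per model."""
    return [sum(w * v for w, v in zip(weights, values)) for values in zip(*outputs)]


# Hypothetical outputs of three models for one image
# (order: healthy, monkeypox, other skin damages).
outputs = [
    [0.10, 0.70, 0.20],
    [0.20, 0.50, 0.30],
    [0.30, 0.60, 0.10],
]
fixed_weights = [0.5, 0.25, 0.25]  # illustrative only; not values from this work

merged_concat = concatenate(outputs)                # 9 values passed on to later layers
merged_avg = average(outputs)                       # 3 values
merged_weighted = weighted(outputs, fixed_weights)  # 3 values

print(len(merged_concat), merged_avg, merged_weighted)
```

The sketch makes the trade-off visible: concatenation keeps every value but triples the size of the input to the final layers, while both averaging variants keep the size constant at the cost of collapsing the individual contributions.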
2.3. Evaluation metrics
To evaluate the effectiveness of classification systems, it is common to use several well-known metrics: accuracy (the most widely used), sensitivity (also known as recall), specificity, precision and F1score (Sokolova & Lapalme, 2009).
To apply them, the classification results obtained for each class must be tagged individually as “True Positive” (TP: belonging to the evaluated class and classified as that class), “True Negative” (TN: belonging to another class and not classified as the evaluated class), “False Positive” (FP: belonging to another class but classified as the evaluated class), or “False Negative” (FN: belonging to the evaluated class but classified as another class). Based on these counts, the high-level metrics are given by the following equations:
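\[ \mathrm{Accuracy} = \sum_{c} \frac{TP_c + TN_c}{TP_c + FP_c + TN_c + FN_c}, \quad c \in \mathrm{classes} \tag{1} \]

\[ \mathrm{Specificity} = \sum_{c} \frac{TN_c}{TN_c + FP_c}, \quad c \in \mathrm{classes} \tag{2} \]

\[ \mathrm{Precision} = \sum_{c} \frac{TP_c}{TP_c + FP_c}, \quad c \in \mathrm{classes} \tag{3} \]

\[ \mathrm{Sensitivity} = \sum_{c} \frac{TP_c}{TP_c + FN_c}, \quad c \in \mathrm{classes} \tag{4} \]

\[ F1_{score} = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{sensitivity}}{\mathrm{precision} + \mathrm{sensitivity}} \tag{5} \]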
About those metrics:
• Accuracy: proportion of samples classified correctly among all samples (see Equation 1).
• Specificity: proportion of “true negative” values among all cases that do not belong to this class (see Equation 2).
• Precision: proportion of “true positive” values among all cases that have been classified as this class (see Equation 3).
• Sensitivity (or Recall): proportion of “true positive” values among all cases that belong to this class (see Equation 4).
• F1score: it considers both the precision and the sensitivity (recall) of the test to compute the score. It is the harmonic mean of both parameters (see Equation 5).
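As a worked illustration of these definitions, the per-class terms can be computed directly from a confusion matrix. The sketch below is an assumption-laden example, not the evaluation code of this work: the function name is hypothetical, and the 60-image matrix is a made-up outcome chosen to be compatible with the VGG-16 values reported in Table 3; it is not read from any figure.

```python
def per_class_metrics(cm, k):
    """One-vs-rest metrics for class k; cm[i][j] = true class i predicted as j."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # class k samples predicted as another class
    fp = sum(row[k] for row in cm) - tp  # other samples predicted as class k
    tn = total - tp - fn - fp
    # Note: the ratios below assume every class is predicted at least once.
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


LABELS = ["monkeypox", "healthy", "other"]
# Hypothetical test outcome with 20 images per class (rows: true, columns: predicted).
cm = [
    [18, 0, 2],
    [0, 19, 1],
    [0, 2, 18],
]

results = {name: per_class_metrics(cm, k) for k, name in enumerate(LABELS)}
global_accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))

for name, metrics in results.items():
    print(name, {key: round(100 * value, 2) for key, value in metrics.items()})
print("global accuracy", round(100 * global_accuracy, 2))
```

With 20 test images per class, a single misclassified image changes the sensitivity of its class by 5 points, which is why the reported values move in steps of 5%.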
2.4. Previous works

A global search is performed on the main academic search engines (Scopus, IEEE Xplore, and Google Scholar) with the following search phrase: “monkeypox” AND (“deep learning” OR “convolutional neural network”). Because the monkeypox outbreak is so recent, the results have not been filtered or restricted by year. Moreover, preprints and arXiv/bioRxiv works that are awaiting acceptance are included as well.
However, after the search and the elimination of articles not focused on the design of a classifier, very few works remain in this area (all of them from 2022). The works used for the comparison are therefore only those detailed next:
• Ali et al. (2022): in this work, a binary classification between monkeypox and other skin diseases is performed using skin images taken by users. The authors tested several convolutional neural network models, such as VGG-16, ResNet50, and Inception-V3. The dataset contains 102 monkeypox images and 126 images of other skin diseases, but it does not include images of healthy skin tissue. The best classifier has an accuracy greater than 82%.
• Ahsan et al. (2022a): this work uses a custom dataset formed by the four classes “healthy”, “measles”, “chickenpox” and “monkeypox”, containing 54, 17, 47 and 43 images, respectively. Although the authors use a data augmentation process, the dataset has very few images for some classes and is quite unbalanced. Moreover, the developed classifiers are trained for only two classes (“monkeypox” versus “others”, and “monkeypox” versus “chickenpox”). Using a VGG-16 CNN for each implemented system, the authors obtain an accuracy of 83% for the “monkeypox” vs “others” study on the training subset, and an accuracy of 78% for the second experiment, also on the training subset.
We believe that the lack of work in this area is due, first, to the novelty of the global spread of the virus and, second, to the lack of a sufficiently balanced and well-structured public dataset. This latter point is supported by the fact that the works detailed in this section are the same as those named in Section 2.1 when describing the existing datasets (i.e., each work uses its own dataset). Addressing this gap is one of the main objectives of this work in collecting a new public dataset.
3. Results
This section presents the results obtained after the training process for each model or ensemble. These results are reported using the most commonly used metrics (described earlier) and the confusion matrices. First, the individual model results are detailed; then, the ensemble results are shown.
3.1. Individual CNN results
In this subsection, the results for each individual model are detailed. These models are VGG-16, VGG-19, ResNet50, MobileNet-V2 and EfficientNet-B0.
3.1.1. VGG-16
The first individual model evaluated is VGG-16. The results obtained for this model are detailed in Table 3. As can be observed, global accuracy exceeds 91%, with a value of almost 97% for the “monkeypox” class.
| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 96.67 | 100 | 100 | 90 | 94.73 |
| Healthy | 95 | 95 | 90.47 | 95 | 92.68 |
| Other skin damages | 91.67 | 92.5 | 85.71 | 90 | 87.8 |
| Global | 91.67 | 95.83 | 91.67 | 91.67 | 91.67 |

Table 3: Classification results obtained for VGG-16 classifier.
Figure 3: Confusion matrix for VGG-16 classification.

Figure 3 shows the classification details for the test subset. As shown in the confusion matrix, there are no false positives for the “monkeypox” class. However, 10% of the monkeypox samples are misclassified as other skin damages; these false negatives are the most dangerous cases.
3.1.2. VGG-19
The second model is VGG-19. The results obtained for this model are detailed in Table 4. As can be observed, global accuracy exceeds 93%, with a value of almost 97% for the monkeypox class. These results are better than those obtained for the VGG-16 model, although, individually, the only class whose accuracy improves is the “other skin damages” class.
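| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 96.67 | 97.5 | 95 | 95 | 95 |
| Healthy | 95 | 97.5 | 94.73 | 90 | 92.31 |
| Other skin damages | 95 | 95 | 90.47 | 95 | 92.68 |
| Global | 93.33 | 96.67 | 93.33 | 93.33 | 93.33 |

Table 4: Classification results obtained for VGG-19 classifier.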
Figure 4: Confusion matrix for VGG-19 classification.

These results can be easily observed in Figure 4. In this case, there are false positives for the “monkeypox” class (5% of the “healthy” samples), although false negatives are reduced to 5% (monkeypox samples classified as “other skin damages”).
3.1.3. ResNet50
The next model is ResNet50. The results obtained for this model are detailed in Table 5. As can be observed, the global accuracy is 95%, with a value of almost 97% for each class individually. These results are the best obtained so far (better than those obtained for the VGG-19 model).
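| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 96.67 | 95 | 90.91 | 100 | 95.24 |
| Healthy | 96.67 | 100 | 100 | 90 | 94.74 |
| Other skin damages | 96.67 | 97.5 | 95 | 95 | 95 |
| Global | 95 | 97.75 | 95 | 95 | 95 |

Table 5: Classification results obtained for ResNet50 classifier.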
Figure 5: Confusion matrix for ResNet50 classification.

Moreover, there is an essential difference between the ResNet50 and VGG-19 results, as can be seen in Figure 5: the absence of false negatives for the monkeypox class. In this case, the reduction in false negatives comes at the cost of an increase in false positives. Both its accuracy and the absence of false negatives make this model the most suitable for diagnostic purposes.
3.1.4. MobileNet-V2
Next, the MobileNet-V2 model is evaluated. The results obtained for this model are detailed in Table 6. In this case, the global accuracy drops below 89%, making it the worst model evaluated so far. Accordingly, there is a substantial drop in the accuracy of both the “monkeypox” class and the “other skin damages” class.
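| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 91.67 | 95 | 89.47 | 85 | 87.18 |
| Healthy | 95 | 95 | 90.48 | 95 | 92.68 |
| Other skin damages | 90 | 92.5 | 85 | 85 | 85 |
| Global | 88.33 | 94.17 | 88.33 | 88.33 | 88.33 |

Table 6: Classification results obtained for MobileNet-V2 classifier.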
Figure 6: Confusion matrix for MobileNet-V2 classification.

These data can be verified by looking at the confusion matrix in Figure 6. For the monkeypox class, 15% of the samples are classified as “other skin damages” (false negatives) and 5% of the samples of the other two classes are classified as monkeypox (false positives). As detailed above, this model obtains the worst results, mainly due to those false negatives (reflected in the low sensitivity value).
3.1.5. EfficientNet-B0
The last individual model evaluated is EfficientNet-B0. The results obtained for this model are detailed in Table 7. The results are better than those obtained with the MobileNet-V2 model (global accuracy of 90% and monkeypox accuracy of almost 97%), but worse than those obtained with the ResNet50 model.
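| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 96.67 | 97.5 | 95 | 95 | 95 |
| Healthy | 91.67 | 90 | 82.61 | 95 | 88.37 |
| Global | 90 | 95 | 90 | 90 | 90 |

Table 7: Classification results obtained for EfficientNet-B0 classifier.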
Figure 7: Confusion matrix for EfficientNet-B0 classification.

As can be observed in Figure 7, both false negatives and false positives are reduced to 5% each, but they are not completely eliminated. So, although these results are better than those obtained with the MobileNet-V2 model, they are worse than those obtained with the rest of the evaluated models.
Therefore, the best classification results obtained so far are those achieved with the ResNet50 model. Next, the three ensemble networks are evaluated.
3.2. Ensemble CNN results
As indicated previously, the three ensemble classifiers evaluated consist of: [VGG-16 + VGG-19 + ResNet50], [VGG-16 + ResNet50 + EfficientNet-B0] and [ResNet50 + EfficientNet-B0 + MobileNet-V2]. The outputs of the models in each ensemble are combined by concatenation, simple average, and weighted average. Therefore, in addition to the ensembles themselves, the mechanism used to combine the outputs is evaluated.
3.2.1. VGG16+VGG19+ResNet50
For this first ensemble, the results obtained for each combination mechanism are detailed in Table 8 (concatenation), Table 9 (simple average) and Table 10 (weighted average).
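| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 83.33 | 85 | 72.72 | 80 | 76.19 |
| Healthy | 91.67 | 95 | 89.47 | 85 | 87.18 |
| Global | 81.67 | 90.83 | 81.67 | 81.67 | 81.67 |

Table 8: Classification results obtained for the ensemble VGG-16 + VGG-19 + ResNet50. Combination results obtained by concatenation.

| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 85 | 82.5 | 72 | 90 | 80 |
| Healthy | 95 | 97.5 | 94.74 | 90 | 92.31 |
| Other skin damages | 86.67 | 95 | 87.5 | 70 | 77.78 |
| Global | 83.3 | 91.67 | 83.33 | 83.33 | 83.33 |

Table 9: Classification results obtained for the ensemble VGG-16 + VGG-19 + ResNet50. Combination results obtained by simple average.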
The best results for this ensemble are obtained using the weighted average mechanism (see Table 10). Here, a global accuracy of almost 92% is obtained, close to but below the best result obtained with individual models (95% with ResNet50); the increase in computational cost therefore does not justify the use of this ensemble. The main problem of this ensemble is the false negatives for the monkeypox class (10% of monkeypox samples), as can be observed in Figure 8.

| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 93.33 | 95 | 90 | 90 | 90 |
| Healthy | 96.67 | 100 | 100 | 90 | 94.74 |
| Global | 91.67 | 95.83 | 91.67 | 91.67 | 91.67 |

Table 10: Classification results obtained for the ensemble VGG-16 + VGG-19 + ResNet50. Combination results obtained by weighted average.

Figure 8: Confusion matrix for the ensemble composed of VGG-16, VGG-19 and ResNet50. Mixed results obtained by (a) concatenation, (b) simple average and (c) weighted average.
3.2.2. VGG16+ResNet50+EfficientNetB0
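For the second ensemble, the results obtained for each combination mechanism are detailed in Table 11 (concatenation), Table 12 (simple average), and Table 13 (weighted average).

| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 91.67 | 97.5 | 94.12 | 80 | 86.49 |
| Healthy | 93.33 | 95 | 90 | 90 | 90 |
| Other skin damages | 95 | 92.5 | 89.96 | 100 | 93.02 |
| Global | 90 | 95 | 90 | 90 | 90 |

Table 11: Classification results obtained for the ensemble VGG-16 + ResNet50 + EfficientNet-B0. Combination results obtained by concatenation.

| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 91.67 | 95 | 89.47 | 85 | 87.18 |
| Healthy | 91.67 | 95 | 89.47 | 85 | 87.18 |
| Global | 88.33 | 94.17 | 88.33 | 88.33 | 88.33 |

Table 12: Classification results obtained for the ensemble VGG-16 + ResNet50 + EfficientNet-B0. Combination results obtained by simple average.

| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 93.33 | 95 | 90 | 90 | 90 |
| Healthy | 96.67 | 100 | 100 | 90 | 94.74 |
| Global | 91.67 | 95.83 | 91.67 | 91.67 | 91.67 |

Table 13: Classification results obtained for the ensemble VGG-16 + ResNet50 + EfficientNet-B0. Combination results obtained by weighted average.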
As with the previous ensemble, the best results are obtained with the weighted average and, moreover, the accuracy values are identical. The individual accuracy of each class and the false positives and false negatives are also the same. Therefore, the problem is the same as before: this system is too computationally complex for the results obtained (especially when compared with the individual ResNet50 model).
So far, the ensembles developed for this work do not obtain better accuracy results than those obtained with individual models. However, the last ensemble is analysed in the next (and last) subsection of the Results section.

Figure 9: Confusion matrix for the ensemble composed of VGG-16, ResNet50 and EfficientNet-B0. Mixed results obtained by (a) concatenation, (b) simple average and (c) weighted average.
3.2.3. ResNet50+EfficientNet+MobileNetV2
For this last ensemble network, the results are detailed in the same way as for the previous ones, divided by combination mechanism: concatenation (see Table 14), simple average (see Table 15) and weighted average (see Table 16).
For this last ensemble, the results obtained are better than all the previous ones. For the simple average and weighted average combinations, an overall accuracy greater than 98% is obtained. Moreover, for the “healthy” class the accuracy is 100%, while for the other two classes (including “monkeypox”) the accuracy is 98.33%.
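| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 95 | 100 | 100 | 85 | 91.89 |
| Healthy | 96.67 | 95 | 90.91 | 100 | 95.24 |
| Other skin damages | 95 | 95 | 90.48 | 95 | 92.68 |
| Global | 93.33 | 96.67 | 93.33 | 93.33 | 93.33 |

Table 14: Classification results obtained for the ensemble ResNet50 + EfficientNet-B0 + MobileNet-V2. Combination results obtained by concatenation.

| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 98.33 | 97.5 | 95.24 | 100 | 97.56 |
| Healthy | 100 | 100 | 100 | 100 | 100 |
| Other skin damages | 98.33 | 100 | 100 | 95 | 97.44 |
| Global | 98.33 | 99.17 | 98.33 | 98.33 | 98.33 |

Table 15: Classification results obtained for the ensemble ResNet50 + EfficientNet-B0 + MobileNet-V2. Combination results obtained by simple average.

| Class | Accuracy (%) | Specificity (%) | Precision (%) | Sensitivity (%) | F1score (%) |
|---|---|---|---|---|---|
| Monkeypox | 98.33 | 97.5 | 95.24 | 100 | 97.56 |
| Healthy | 100 | 100 | 100 | 100 | 100 |
| Other skin damages | 98.33 | 100 | 100 | 95 | 97.44 |
| Global | 98.33 | 99.17 | 98.33 | 98.33 | 98.33 |

Table 16: Classification results obtained for the ensemble ResNet50 + EfficientNet-B0 + MobileNet-V2. Combination results obtained by weighted average.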
As can be seen in the confusion matrices (see Figure 10), for the simple and weighted average combinations (Figure 10-b and Figure 10-c), no false negatives are observed (sensitivity of 100%), although 5% of the samples of the “other skin damages” class are false positives.
The results obtained with this ensemble are 3.33 percentage points better than those obtained so far with the individual ResNet50 model (98.33% versus 95% global accuracy). For the best classifier, there are no misclassifications in the “monkeypox” and “healthy” classes, and 5% of the samples of the “other skin damages” class are misclassified (as “monkeypox”).
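The size of this gap can be made concrete on the 60-image test subset of Table 2. The following worked arithmetic is only a reading aid derived from the reported global accuracies, assuming both values refer to the same 60 test images:

```latex
\begin{align*}
0.9833 \times 60 &\approx 59 \ \text{correctly classified images (best ensemble)} \\
0.95 \times 60 &= 57 \ \text{correctly classified images (ResNet50)} \\
98.33\% - 95\% &= 3.33 \ \text{percentage points} \approx \tfrac{2}{60}
\end{align*}
```

Under that assumption, the difference between the best ensemble and ResNet50 corresponds to two test images, since each image accounts for about 1.67 points of global accuracy.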

Figure 10: Confusion matrix for the ensemble composed of ResNet50, EfficientNet-B0 and
MobileNet-V2. Mixed results obtained by (a) concatenation, (b) simple average and (c)
weighted average.

4. Discussion
The results presented above can now be evaluated together to reach general
conclusions.
First, the public monkeypox datasets composed of skin images have been examined
in detail. The search reveals two public datasets with several shortcomings,
which were analysed previously: the dataset of Ali et al. (2022) has only two
classes with very varied images and a slight imbalance between them (102 images
for the “monkeypox” class and 126 images for the “others” class); and the dataset
of Ahsan et al. (2022b) has four classes with equally varied images, very few
images and a large imbalance between classes (54 for “healthy”, 43 for
“monkeypox”, 47 for “chickenpox” and 17 for “measles”).
In contrast, the dataset collected for this work is composed of three classes of
the same type and is completely balanced (100 images for “monkeypox”, 100
images for “healthy” and 100 images for “other skin damages”). Therefore, in
this respect, there is a substantial improvement over the public datasets
available so far, which may help future work in this area.
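The class balance of the three datasets can be quantified with a short calculation. The sketch below uses the image counts quoted above; the imbalance ratio (largest class size divided by smallest class size) is a measure chosen here for illustration and is not a metric used in this work.

```python
# Image counts per class, as quoted in the text.
datasets = {
    "Ali et al. (2022)": {"monkeypox": 102, "others": 126},
    "Ahsan et al. (2022b)": {
        "healthy": 54, "monkeypox": 43, "chickenpox": 47, "measles": 17,
    },
    "MonkeypoxSkin (this work)": {
        "monkeypox": 100, "healthy": 100, "other skin damages": 100,
    },
}


def imbalance_ratio(counts):
    """Largest class size divided by smallest class size (1.0 means balanced)."""
    return max(counts.values()) / min(counts.values())


ratios = {name: imbalance_ratio(counts) for name, counts in datasets.items()}
totals = {name: sum(counts.values()) for name, counts in datasets.items()}

for name in datasets:
    print(f"{name}: {totals[name]} images, imbalance ratio {ratios[name]:.2f}")
```

A ratio of 1.0 corresponds to a perfectly balanced dataset; larger values indicate that the smallest class is under-represented.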
Second, with respect to the classifiers evaluated, Table 17 summarises the
accuracy results obtained in the previous section. In this table, columns 2 to 4
show the results obtained depending on the mechanism used by the ensemble to
combine the results, which is why these columns are not filled for individual
models.

                               Accuracy results
Classifier                     Concatenated  Simple avr  Weighted avr  BEST
VGG-16                         −             −           −             91.67
VGG-19                         −             −           −             93.33
ResNet50                       −             −           −             95
EfficientNet-B0                −             −           −             90
MobileNet-V2                   −             −           −             88.33
VGG16+VGG19+ResNet             81.67         83.3        91.67         91.67
VGG16+ResNet+EfficientNet      90            88.33       91.67         91.67
ResNet+EfficientNet+MobileNet  93.33         98.33       98.33         98.33
Table 17: Summarized results for all the classifiers.

Looking at the information presented in Table 17, the best results are those
obtained with the individual model ResNet50 and with the ensemble formed by
ResNet50 + EfficientNet-B0 + MobileNet-V2. The individual model obtained an
accuracy of 95% and the ensemble an accuracy of 98.33%, improving the results of
the individual models by 3.33 percentage points.
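The selection described above can be reproduced directly from the values in Table 17. The following sketch copies the accuracies from the table; the dictionary layout and variable names are illustrative choices, not part of the original evaluation code.

```python
# Accuracy values (%) copied from Table 17.
individual = {
    "VGG-16": 91.67,
    "VGG-19": 93.33,
    "ResNet50": 95.0,
    "EfficientNet-B0": 90.0,
    "MobileNet-V2": 88.33,
}
ensembles = {
    "VGG16+VGG19+ResNet": {
        "concatenated": 81.67, "simple avr": 83.3, "weighted avr": 91.67,
    },
    "VGG16+ResNet+EfficientNet": {
        "concatenated": 90.0, "simple avr": 88.33, "weighted avr": 91.67,
    },
    "ResNet+EfficientNet+MobileNet": {
        "concatenated": 93.33, "simple avr": 98.33, "weighted avr": 98.33,
    },
}

# Best individual model by accuracy.
best_individual = max(individual, key=individual.get)

# BEST column: highest accuracy over the three combination mechanisms.
best_per_ensemble = {name: max(scores.values()) for name, scores in ensembles.items()}
best_ensemble = max(best_per_ensemble, key=best_per_ensemble.get)

# Gain of the best ensemble over the best individual model, in percentage points.
gain = best_per_ensemble[best_ensemble] - individual[best_individual]

print(f"best individual model: {best_individual} ({individual[best_individual]}%)")
print(f"best ensemble: {best_ensemble} ({best_per_ensemble[best_ensemble]}%)")
print(f"gain: {gain:.2f} percentage points")
```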
It could be questioned whether it is worth using an ensemble network consisting
of three individual networks to improve accuracy by slightly more than 3%;
however, it is important to note that, analysing each class individually, an
improvement is obtained in all of them: a 2% improvement in the “monkeypox” class
(keeping false negatives at zero and reducing false positives); a 4% improvement
in the “healthy” class (reducing both false positives and false negatives to
zero); and a 2% improvement in the “other skin damages” class (reducing false
positives to zero, although some false negatives remain). In any case, all the
indicators analysed show an improvement when using the ensemble network.
Therefore, we conclude that the use of the indicated ensemble seems to be the
best solution for a monkeypox skin image classifier.
Finally, the results of the classifier developed in this work can be analysed in
comparison with the two previous works detailed in Section 2. The comparative
results are presented in Table 18.

Work                  Dataset        Classes                                      Classifier       Results (%)
Ali et al. (2022)     Own            [2] Monkeypox, Others                        VGG-16           Acc: 81.48, Pre: 85, Sen: 81, F1: 83
Ali et al. (2022)     Own            [2] Monkeypox, Others                        ResNet50         Acc: 82.96, Pre: 87, Sen: 83, F1: 84
Ali et al. (2022)     Own            [2] Monkeypox, Others                        Inception-V3     Acc: 74.03, Pre: 74, Sen: 81, F1: 78
Ali et al. (2022)     Own            [2] Monkeypox, Others                        Ensemble         Acc: 79.26, Pre: 84, Sen: 79, F1: 81
Ahsan et al. (2022a)  Own            [2] Monkeypox, Chickenpox                    VGG-16 (Case 1)  Acc: 83, Pre: 88, Sen: 83, Spe: 66, F1: 83
Ahsan et al. (2022a)  Own            [2] Monkeypox, Others                        VGG-16 (Case 2)  Acc: 78, Pre: 75, Sen: 75, Spe: 83, F1: 75
This work (2022)      MonkeypoxSkin  [3] Healthy, Monkeypox, Other skin diseases  VGG-16           Acc: 91.67, Pre: 92, Sen: 92, Spe: 96, F1: 92
This work (2022)      MonkeypoxSkin  [3] Healthy, Monkeypox, Other skin diseases  VGG-19           Acc: 93.33, Pre: 93, Sen: 93, Spe: 97, F1: 93
This work (2022)      MonkeypoxSkin  [3] Healthy, Monkeypox, Other skin diseases  ResNet50         Acc: 95, Pre: 95, Sen: 95, Spe: 97.75, F1: 95
This work (2022)      MonkeypoxSkin  [3] Healthy, Monkeypox, Other skin diseases  MobileNet-V2     Acc: 88.33, Pre: 88, Sen: 88, Spe: 94, F1: 88
This work (2022)      MonkeypoxSkin  [3] Healthy, Monkeypox, Other skin diseases  EfficientNet-B0  Acc: 90, Pre: 90, Sen: 90, Spe: 95, F1: 90
This work (2022)      MonkeypoxSkin  [3] Healthy, Monkeypox, Other skin diseases  Ensemble 1       Acc: 91.67, Pre: 92, Sen: 92, Spe: 96, F1: 92
Table 18: Comparison with previous works.

Compared with the results obtained in previous works, the classifier developed
for this work improves the overall accuracy of the system (in addition to the
individual accuracy of all classes).
The first compared work uses its own two-class dataset and, therefore, its
classifier is a two-class classifier (“monkeypox” and “others”). Its best
classification results are obtained with the individual ResNet50 model (almost
83%). In this respect, it is similar to our work, since our best results for
individual models are also obtained with the ResNet50 model; even so, our work
improves the classification results for the ResNet50 model by about 12
percentage points (it should also be taken into account that our work uses three
classes, not two). Finally, this first previous work also evaluates an ensemble
(consisting of VGG-16 + ResNet50 + Inception-V3), obtaining worse results than
those obtained with ResNet50.
The second compared work also uses its own dataset, which in this case contains
four classes. However, in that work, two two-class classifiers are developed:
one classifies between “monkeypox” and “chickenpox”, and the other classifies
between “monkeypox” and “others”. That work only analyses the VGG-16 model,
obtaining accuracies of 83% for the first case and 78% for the second one
(taking into account the results presented for the test subset). Compared with
that work, we obtain an improvement of about 9 percentage points using the
VGG-16 model (although it is one of the worst results obtained in our work).
Overall, the improvement we obtain with respect to that work is more than 15
percentage points.
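The improvements quoted in this comparison are differences between accuracies. As a hedged illustration, the sketch below computes them both in percentage points and as a relative change, using accuracy values taken from Tables 17 and 18; the pairing of classifiers and the helper name are choices made here for illustration.

```python
# (label, accuracy in this work, accuracy in the previous work), values in %.
comparisons = [
    ("ResNet50 vs Ali et al. (2022)", 95.0, 82.96),
    ("VGG-16 vs Ahsan et al. (2022a), case 1", 91.67, 83.0),
    ("Best ensemble vs Ahsan et al. (2022a), case 1", 98.33, 83.0),
]


def improvement(ours, theirs):
    """Return the gain in percentage points and as a relative change (%)."""
    points = ours - theirs
    relative = 100 * points / theirs
    return points, relative


results = {label: improvement(ours, theirs) for label, ours, theirs in comparisons}

for label, (points, relative) in results.items():
    print(f"{label}: +{points:.2f} points ({relative:.2f}% relative)")
```

The two measures differ: a gain of about 12 percentage points over an 82.96% baseline is a relative improvement of roughly 14.5%.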
In summary, looking at our work in general, we have analysed several classifiers
to obtain a nearly optimal solution with an accuracy of over 98%. These results
are better than those of all the work published so far for all the evaluated
metrics.
Furthermore, the dataset collected for this work is more balanced and regular
than those published so far.
Declarations
Acknowledgments. This work was supported by the Andalusian Regional I+D+i FEDER
Project under Grant DAFNE US-1381619.
Funding. Open Access funding provided thanks to the CRUE-CSIC agreement with Elsevier.
Competing Interest. The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence the work reported in
this document.
Ethical statement. No ethical committee has been consulted, as this work uses a publicly
available dataset.
Author contributions. Conceptualization: Manuel Domínguez-Morales, Francisco Luna-Perejón,
Javier Civit-Masot; Methodology: Manuel Domínguez-Morales, Luis Muñoz-Saavedra,
Elena Escobar-Linero; Formal analysis and investigation: Luis Muñoz-Saavedra,
Elena Escobar-Linero; Writing - original draft preparation: Luis Muñoz-Saavedra,
Elena Escobar-Linero; Writing - review and editing: Francisco Luna-Perejón,
Javier Civit-Masot; Funding acquisition: Antón Civit; Resources: Manuel Domínguez-Morales;
Supervision: Antón Civit, Manuel Domínguez-Morales.
References
Ahsan, M. M., Uddin, M. R., Farjana, M., Sakib, A. N., Momin, K. A., & Luna, S. A. (2022a).
Image data collection and implementation of deep learning-based model in detecting
monkeypox disease using modified VGG16. arXiv preprint arXiv:2206.01862.
Ahsan, M. M., Uddin, M. R., & Luna, S. A. (2022b). Monkeypox image data collection.
arXiv preprint arXiv:2206.01774.
Ali, S. N., Ahmed, M., Paul, J., Jahan, T., Sani, S., Noor, N., Hasan, T. et al. (2022).
Monkeypox skin lesion detection using deep learning models: A feasibility study.
arXiv preprint arXiv:2207.03342.
Centers for Disease Control and Prevention (2022). Technical report: Multi-national
monkeypox outbreak.
https://www.cdc.gov/poxvirus/monkeypox/clinicians/technical-report.html.
Accessed: 2022-08-02.
Civit-Masot, J. et al. (2020a). Deep learning system for COVID-19 diagnosis aid using
X-ray pulmonary images. Applied Sciences, 10, 4640.
Civit-Masot, J. et al. (2020b). Dual machine-learning system to aid glaucoma diagnosis
using disc and cup feature extraction. IEEE Access, 8, 127519–127529.
Civit-Masot, J. et al. (2021). A study on the use of edge TPUs for eye fundus image
segmentation. Engineering Applications of Artificial Intelligence, 104, 104384.
Domínguez-Morales, M. et al. (2019). Smart footwear insole for recognition of foot
pronation and supination using neural networks. Applied Sciences, 9, 3970.
Domínguez-Morales, M. et al. (2022). MonkeypoxSkin dataset.
https://github.com/mjdominguez/MonkeypoxSkinImages.
Escobar-Linero, E., Domínguez-Morales, M., & Sevillano, J. L. (2022a). Worker’s physical
fatigue classification using neural networks. Expert Systems with Applications, 198,
116784.
Escobar-Linero, E., Luna-Perejón, F., Muñoz-Saavedra, L., Sevillano, J. L., &
Domínguez-Morales, M. (2022b). On the feature extraction process in machine learning.
An experimental study about guided versus non-guided process in falling detection
systems. Engineering Applications of Artificial Intelligence, 114, 105170.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image
recognition. In Proceedings of the IEEE conference on computer vision and pattern
recognition (pp. 770–778).
Kundu, R. et al. (2021). Pneumonia detection in chest X-ray images using an ensemble of
deep learning models. PLoS ONE, 16, e0256630.
Liu, Z. et al. (2021). Deep learning framework based on integration of S-Mask R-CNN and
Inception-v3 for ultrasound image-aided diagnosis of prostate cancer. Future Generation
Computer Systems, 114, 358–367.
Lotter, W. et al. (2021). Robust breast cancer detection in mammography and digital
breast tomosynthesis using an annotation-efficient deep learning approach. Nature
Medicine, 27, 244–249.
Luna-Perejón, F. et al. (2019). Wearable fall detector using recurrent neural networks.
Sensors, 19, 4885.
Luna-Perejón, F. et al. (2020). Low-power embedded system for gait classification using
neural networks. Journal of Low Power Electronics and Applications, 10, 14.
Luna-Perejón, F. et al. (2021a). AnkFall—falls, falling risks and daily-life activities
dataset with an ankle-placed accelerometer and training using recurrent neural
networks. Sensors, 21, 1889.
Luna-Perejón, F. et al. (2021b). IoT garment for remote elderly care network. Biomedical
Signal Processing and Control, 69, 102848.
Muñoz-Saavedra, L., Civit-Masot, J., Luna-Perejón, F., Domínguez-Morales, M., & Civit, A.
(2021). Does two-class training extract real features? A COVID-19 case study. Applied
Sciences, 11, 1424.
Muñoz-Saavedra, L. et al. (2020). Affective state assistant for helping users with
cognition disabilities using neural networks. Electronics, 9, 1843.
Rim, B. et al. (2020). Deep learning in physiological signal data: A survey. Sensors,
20, 969.
Roncato, C. et al. (2020). Colour doppler ultrasound of temporal arteries for the
diagnosis of giant cell arteritis: a multicentre deep learning study. Clin Exp
Rheumatol, 38, S120–25.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L.-C. (2018). MobileNetV2:
Inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on
computer vision and pattern recognition (pp. 4510–4520).
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale
image recognition. arXiv preprint arXiv:1409.1556.
Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for
classification tasks. Inf. Process. & Manag., 45, 427–437.
Tan, M., & Le, Q. (2019). EfficientNet: Rethinking model scaling for convolutional
neural networks. In International conference on machine learning (pp. 6105–6114). PMLR.
The Guardian (2022). Spain reports second death related to monkeypox.
https://www.theguardian.com/world/2022/jul/30/spain-reports-second-death-related-to-monkeypox.
Accessed: 2022-08-02.
Thomas, S. M. et al. (2021). Interpretable deep learning systems for multi-class
segmentation and classification of non-melanoma skin cancer. Medical Image Analysis,
68, 101915.
Torres-Soto, J. et al. (2020). Multi-task deep learning for cardiac rhythm detection in
wearable devices. NPJ digital medicine, 3, 1–8.
United Nations (2022). What is monkeypox? https://news.un.org/en/story/2022/07/1123212.
Accessed: 2022-08-02.
World Health Organization (2022). Monkeypox details.
https://www.who.int/news-room/fact-sheets/detail/monkeypox. Accessed: 2022-08-02.
Zhu, H. et al. (2020). A deep learning approach for recognizing activity of daily living
(ADL) for senior care: Exploiting interaction dependency and temporal patterns.
Forthcoming at MIS Quarterly.

Title
Monkeypox diagnostic-aid system with skin images using convolutional neural
networks
Luis Muñoz-Saavedra 1, Elena Escobar-Linero 1, Javier Civit-Masot 1, Francisco Luna-Perejón 1,
Antón Civit 1,2, Manuel Domínguez-Morales 1,2,*
1 Architecture and Computer Technology dept., E.T.S. Ingeniería Informática, Avda. Reina
Mercedes, s/n, Universidad de Sevilla, Seville, Spain
2 Computer Engineering Research Institute (I3US), Universidad de Sevilla, Seville, Spain
* Corresponding author:
Manuel Domínguez-Morales, E.T.S. Ingeniería Informática, Avda. Reina Mercedes s/n, office
F0.71, 41012, Universidad de Sevilla, Seville, Spain. +34 95 4551206
Abstract
Background: Monkeypox is similar to classical smallpox and, until now, isolated cases had
been detected in Africa; in 2022, however, it spread around the world and was declared an
international health emergency by the WHO at the end of July.
Objective: In this work, we collect monkeypox images from official government websites and
public datasets (as well as images of healthy people and people with other skin diseases) and
create the MonkeypoxSkin dataset (publicly available), which contains close-up images of the
skin showing the pustules formed by the disease. With these images, we design,
implement, and evaluate several diagnostic aid systems for monkeypox disease.
Methods: The classifiers developed and evaluated are based on Convolutional Neural Network
models and on ensembles composed of combinations of those models, which automatically
classify a close-up skin image as healthy, monkeypox or other skin damages.
Results: The results show a system accuracy greater than 93% when using a single CNN model
(VGG-19 and ResNet50), and greater than 98% when using a CNN ensemble formed by
ResNet50, EfficientNet-B0 and MobileNet-V2.
Conclusions: In contrast to the work found to date, this classifier focusses on disease
classification based on a low-resolution image obtained from the skin surface, so it can be
integrated into a mobile application for mass screening. To do this, it has been necessary to
build our own dataset, which is publicly available for use. Furthermore, the results
obtained improve the accuracy of the previous works.
Keywords: Monkeypox, Deep Learning, Convolutional Neural Networks, Skin damage, Skin
dataset
