You are on page 1of 31

ed

Monkeypox diagnostic-aid system with skin images


using convolutional neural networks

iew
Luis Muñoz-Saavedraa , Elena Escobar-Lineroa , Javier Civit-Masota , Francisco
Luna-Perejóna , Antón Civita,b , Manuel Domı́nguez-Moralesa,b,∗
a Architecture
and Computer Technology department (ATC), Robotics and Technology of
Computers Lab (RTC), E.T.S. Ingenierı́a Informática, Avda. Reina Mercedes s/n,

ev
Universidad de Sevilla, Seville, 41012, Spain
b Computer Engineering Research Institute (I3US), E.T.S. Ingenierı́a Informática, Avda.

Reina Mercedes s/n, Universidad de Sevilla, Seville, 41012, Spain

r
Abstract

er
Background: Monkeypox is similar to classical smallpox and, until now,
isolated cases have been detected in Africa, but in 2022 it has spread around
the world and was declared an international health emergency by the WHO at
pe
the end of July.
Objective: In this work, we collect monkeypox images from official gov-
ernment websites and public datasets (as well as images of healthy people and
people with other skin diseases) and create the MonkeypoxSkin dataset (pub-
ot

licly available), which contains close-up images of the skin where you can see
the posthulas formed by the disease. With these images, we design, implement,
and evaluate several diagnostic aid systems for monkeypox disease.
tn

Methods: The classifiers developed and evaluated are based on Convolu-


tional Neural Networks models and some ensembles composed of a combina-
tion of those models, obtaining automatic classification results between healthy,
rin

monkeypox and other skin damages, given a close skin tissue image.
Results: The results show a system accuracy greater than 93% when using a
unique CNN model (VGG-19 and ResNet50), and greater than 98% when using
a CNN ensemble formed by ResNet50, EfficientNet-B0 and MobileNet-V2.
ep

∗ Correspondingauthor
Email address: mjdominguez@us.es (Manuel Domı́nguez-Morales)
Pr

Preprint submitted to Elsevier August 9, 2022

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
Conclusions: In contrast to the work found to date, this classifier focusses

iew
on disease classification based on a low-resolution image obtained from the skin
surface, so it can be integrated into a mobile application for mass screening. To
do this, it has been necessary to design a proprietary dataset that is publicly
available for use. Furthermore, the results obtained improve the accuracy of the
previous works.

ev
Keywords: Monkeypox, Deep Learning, Convolutional Neural Networks, Skin
damage, Skin dataset

r
1. Introduction

The first recorded case of monkeypox dates back to 1958, when two outbreaks

er
of a smallpox-like disease occurred in monkey colonies under investigation. De-
spite being named “monkeypox”, the origin of the disease remains unknown.
pe
5 However, African rodents and non-human primates (such as monkeys) could
harbour the virus and infect humans (United Nations, 2022).
The first human case of monkeypox was reported in 1970. Before the 2022
outbreak, monkeypox cases had been reported in people in several countries
in Central and West Africa. Previously, almost all monkeypox cases in people
ot

10 outside of Africa were associated with international travel to countries where


the disease occurs frequently or through imported animals (World Heath Orga-
tn

nization, 2022).
According to the last CDC report (Centre for Disease Control and Preven-
tion) on 28 July 2022, more than 23000 cases have been reported worldwide since
the beginning of the 2022 outbreak (Centre of Disease Control and Prevention,
rin

15

2022). Regarding those cases, the two countries with the most reported cases
are the United States with more than 5800 and Spain with more than 4200.
Furthermore, in Spain and Brazil, monkeypox deaths have been reported in
ep

recent days (The Guardian, 2022).


20 Although the spread of this virus does not resemble other recent pandemics
such as COVID-19, its expansion in Western countries has been surprising.
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
Therefore, it is important to find a mechanism to help facilitate a rapid diag-

iew
nosis of the disease that does not require expensive diagnostic equipment or a
long response time. Those premises are of vital importance in less developed
25 countries without access to medical centres with advanced equipment.
In order to implement such a system, it is necessary that the information it
receives can be provided by a nonsanitary user; therefore, the easiest input for

ev
a common user is an image, and, in order to be easily captured for an average
user, it should be a superficial image of the skin. Furthermore, this type of
30 system must take into account other skin damages caused by similar viruses or

r
other diseases such as smallpox, chickenpox, or measles.
However, currently there is no dataset of close-up skin images, although some

35
er
datasets of full-body images are available. Therefore, for such large-scale work,
it would be necessary to collect an appropriate dataset to design the classifier.
Regarding the development of classifying models, the use of artificial intel-
pe
ligence techniques has been extended in recent years within the field of health:
from the analysis of physiological signals (Torres-Soto et al., 2020; Rim et al.,
2020; Muñoz-Saavedra et al., 2020) to the study of bad habits and abnormalities
during daily-life activities (Zhu et al., 2020; Luna-Perejón et al., 2020, 2021a;
ot

40 Escobar-Linero et al., 2022a). Furthermore, in relation to this work, the use


of techniques derived from AI and machine learning (ML) has a wide use in
the design of automatic classifiers with medical images, making use of advanced
tn

Deep Learning (DL) techniques in Convolutional Neural Networks (CNN). Thus,


these techniques have been applied in multiple research works in recent years,
45 obtaining very positive results with a correct diagnoses rate higher than 80%
rin

(Roncato et al., 2020; Liu et al., 2021; Civit-Masot et al., 2021); even reaching,
in multiple cases, values higher than 95% accuracy (Kundu et al., 2021; Lotter
et al., 2021; Thomas et al., 2021; Civit-Masot et al., 2020b).
This research group has extensive experience in the field of machine and deep
ep

50 learning applied to eHealth. Some of the works have been referenced previously,
but other works include monitoring physiological signals (Luna-Perejón et al.,
2021b), biomechanical gait studies (Domı́nguez-Morales et al., 2019), fall detec-
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
tors (Luna-Perejón et al., 2019; Escobar-Linero et al., 2022b), even aid systems

iew
for the diagnosis of cancer and other diseases (Civit-Masot et al., 2020a).
55 Once the problem to be addressed and the experience of this group have
been described, the main objectives of this work are described next:

• Designing a dataset of superficial skin photographs containing images of


monkeypox cases, healthy people, and people with other types of diseases

ev
that produce skin rashes. This dataset is provided publicly to the com-
60 munity.

• Studying various alternatives of classifiers based on convolutional neural

r
networks to distinguish between the three classes described above.

er
• Combining different classifier models to form ensemble systems that per-
form the same task, comparing the results with those obtained previously.
pe
65 The rest of the manuscript is structured as follows: The methods used to
develop and test the diagnosis-aid system are presented in Section 2, including
the description of the dataset developed for this work. The results obtained
after testing the classifier are detailed in Section 3. Finally, in Section 4, a
ot

discussion of the results obtained is exposed, including the final conclusions of


70 this work and future research lines.
tn

2. Methods

To carry out the study previously exposed, this section presents the custom
dataset used to train and test the classification system, the developed classifiers,
rin

and the metrics used to evaluate them.


75 Figure 1 shows the general processing scheme of the system proposed in this
work.
ep

2.1. Dataset

The custom dataset consists of 224x244 pixel images in RGB format. Images
were obtained from contrasted online media (such as CDC or WHO) and from
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
iew
r ev
er
pe
Figure 1: Full system scheme

80 public datasets. The only public datasets found are the one collected by Ali
ot

et al. (2022) and the one collected by Ahsan et al. (2022a,b). For the dataset
collected in this work, three main premises are taken into account:
tn

• Classes: the dataset needs to distinguish between healthy and ill tissue
(with Monkeypox). In addition, it is essential to include other skin diseases
85 to determine the degree of importance of the injury.
rin

• Number of images: It is important to develop a balanced dataset with the


same number of images for each class.

• Type of images: To develop a classifier that can be integrated into a mobile


ep

device, images with similar characteristics for all classes and highlighting
90 skin lesions (avoiding images taken from a distance or of whole body parts)
are required.
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
Following these premises, the final dataset collected is composed of three

iew
classes: “monkeypox”, “healthy”, and “other diseases” that cause skin damage.
Each class contains 100 images of the skin surface. The summary of the collected
95 dataset can be seen in Table 1, which also contains the comparison with the
public dataset found. The dataset collected for this work is publicly accessible
(Domı́nguez-Morales et al., 2022).

ev
Number Type
Dataset Classes
of images of images

Full body, Limbs,


Ali et al. (2022) [2] Monkeypox, Others 102, 126
Face, Trunk

r
[4] Healthy, Monkeypox, 54, 43, Full body, Limbs,
Ahsan et al. (2022a,b)
Chickenpox, Measles 47, 17 Face, Trunk
MonkeypoxSkin datase (this work) [3] Healthy, Monkeypox, 100, 100,
(Domı́nguez-Morales et al., 2022)

er
Other skin diseases 100
Close skin tissue

Table 1: Public monkeypox datasets compared with the dataset collected for this work.
pe
The main problems observed in Table 1 for the previous public datasets are
described next:

100 • Ali et al. (2022): the main problem of this dataset is the image classes. It
has only two classes, and, moreover, there is no “Healthy” class, and this
ot

absence can lead to failures when evaluating the classifier with healthy
tissue images; since there is no class containing it, any image that it clas-
tn

sifies is labelled as damaged tissue (by Monkeypox or some other disease).


105 This problem was previously addressed in other works (Muñoz-Saavedra
et al., 2021).

• Ahsan et al. (2022a): this second dataset contains a good number of classes
rin

(including “Healthy” ones), but it has two main drawbacks. The first is
the total number of images, only surpassing 50 images for the “Healthy”
110 class (54 images) and giving only 17 images for one of the classes (Measles).
ep

The second drawback is data balance: the worst case is observed with the
“Healthy” class, which has more than three times the images given by the
“Measles” class.
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
Also, for both datasets, the images used do not follow a similarity pattern,

iew
115 including full-body images or images of single body parts (even with images in
which more than one person appears). For correct training of the classifier, it
would be necessary to discard a significant part of the images in these datasets
due to deficiencies such as those indicated, or even cases where two different
images overlap.

ev
Healthy

r
er
Monkeypox

pe
ot
Other diseases

tn

Figure 2: Sample images from MonkeySkin dataset.


rin

120 Trying to solve these drawbacks, and shown in Table 1, the dataset collected
for this work has a perfect balance between classes, which facilitates the classi-
fier training task. In addition, the use of close images of the skin surface for all
ep

classes helps to present a more uniform dataset that better represents the infor-
mation that can be obtained from a photograph with a mobile device. Figure 2
125 shows some examples of images from the collected dataset.
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
The images of the dataset are used to train the classifiers, using 60% of the

iew
images for training, 20% for validation and the last 20% for testing. In addition,
to improve the training, a data augmentation process is performed only on the
images of the training set (performing random rotations of the images until a
130 total of 9000 images are obtained). This division is presented in Table 2.

Original dataset Augmented dataset

ev
Class
Train Validation Test Train Validation Test
Healthy 60 20 20 3000 20 20
Monkeypox 60 20 20 3000 20 20

r
Other skin damages 60 20 20 3000 20 20
TOTAL 180 60 60 9000 60 60

er
Table 2: Dataset elaborated for this work and subsets division.

2.2. Classifiers
pe
The classifiers evaluated are pre-trained models on which the weights of the
first layers of neurons will be maintained and trained on the weights of the last
layers. Each model will be independently evaluated, and subsequently combina-
135 tions of models will be performed to evaluate whether there is an improvement
ot

in classification. As for the hyperparameters used, they will be those commonly


used for these models, which are a batch size of 32 and a learning rate of 1e-4,
tn

with training of 50 epochs. The individual models used and the ensembles will
be presented below.

140 2.2.1. Individual models


rin

The individual models used in this work are as follows:

• VGG-16: is a convolutional neural network proposed by Simonyan & Zis-


serman (2014) of Oxford University, and gained notoriety by winning the
ep

ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) in 2014.

145 • VGG-19: It is a variant with more computational layers than VGG-16,


therefore, heavier in memory storage and computational requirements.
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
• ResNet50: It was introduced by Microsoft and won the ILSVRC (Ima-

iew
geNet Large-Scale Visual Recognition Challenge) competition in 2015 (He
et al., 2016). Its operation is based on increasing the number of layers by
150 introducing a residual connection, moving on to the next one directly, and
improving the learning process.

• MobileNet-V2: is a convolutional neural network architecture that seeks

ev
to perform well on mobile devices. It is based on an inverted residual
structure in which the residual connections are between the bottleneck
155 layers (Sandler et al., 2018).

r
• EfficientNet-B0: It is a convolutional neural network architecture and scal-

er
ing method that uniformly scales all dimensions of depth/width/resolution
using a compound coefficient (Tan & Le, 2019). The base EfficientNet-
B0 network is based on the inverted bottleneck residual blocks of Mo-
pe
160 bileNetV2, in addition to squeeze-and-excitation blocks.

2.2.2. Ensemble classifiers


Second, the models listed above are combined to form ensembles of three
models each. These ensembles are those formed by:
ot

• Ensemble 1: VGG-16 + VGG-19 + ResNet50

165 • Ensemble 2: VGG-16 + ResNet50 + EfficientNet-B0


tn

• Ensemble 3: ResNet50 + EfficientNet-B0 + MobileNet-V2

The models of each ensemble are run in parallel, and the results of each
ensemble are subsequently merged. The combinations of these models are per-
rin

formed following the three classical approaches of ensemble neural networks:

170 • Concatenation Ensemble: This is the most common technique to merge


different data sources. A concatenation ensemble receives different inputs,
ep

whatever their dimensions, and concatenates them on a given axis. This


operation can be dispersive, not allowing the final part of the network to
learn important information, or resulting in overfitting.
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
175 • Average Ensemble: this can be considered as the opposite of the concate-

iew
nation operation. The pooled outputs of the networks are passed through
dense layers with a fixed number of neurons to equalise them. In this way,
the average is computed. The drawback of this approach is the loss of
information caused by the nature of the average operation.

• Weighted Ensemble: This is a special form of average operation, where

ev
180

the tensor outputs are multiplied by a weight and then linearly combined.
These weights determine the contribution of each model in the final re-
sult, but they are not fixed because these values are optimised during the

r
training process.

185 2.3. Evaluation metrics


er
To evaluate the effectiveness of the classification systems, it is common to
use different and well-known metrics: accuracy (most-used metric), sensitivity
pe
(also known as recall), specificity, precision and F1score (Sokolova & Lapalme,
2009).
190 To apply them, the classification results obtained for each class must be
tagged individually as ”True Positive” (TP: belonging to a class and classified as
ot

the same class), ”True Negative” (TN: belonging to another class and classified
as that class), ”False Positive” (FP: belonging to another class and classified
tn

to the evaluated class), or ”False Negative” (FN: belonging to the class and
195 classified as other class). According to them, the high-level metrics are presented
in the following equations:
rin

X T Pc + T Nc
Accuracy = , c ∈ classes (1)
c
T Pc + F Pc + T Nc + F Nc

X T Nc
Specif icity = , c ∈ classes (2)
T Nc + F Pc
ep

X T Pc
P recision = , c ∈ classes (3)
c
T Pc + F Pc
Pr

10

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
X T Pc

iew
Sensitivity = , c ∈ classes (4)
c
T Pc + F Nc

precision ∗ sensitivity
F 1score = 2 ∗ . (5)
precision + sensitivity
About those metrics:

ev
• Accuracy: All samples classified correctly compared to all samples (see
Equation 1).

200 • Specificity: proportion of “true negative” values in all cases that do not

r
belong to this class (see Equation 2).

classified as it (see Equation 3). er


• Precision: Proportion of “true positive” values in all cases that have been

• Sensitivity (or Recall): Proportion of “true positive” values in all cases


pe
205 that belong to this class (see Equation 4).

• F1score : It considers both the precision and the sensitivity (recall) of the
test to compute the score. It is the harmonic mean of both parameters
ot

(see Equation 5).

2.4. Previous works


tn

210 A global search is performed on the main search engines (Scopus, IEEEx-
plorer, and Google Scholar) with the following search phrase: ”monkeypox”
AND (”deep learning” OR ”convolutional neural network”). Due to the cur-
rin

rent spread of the monkeypox outbreak, the results have not been filtered or
restricted by year. Moreover, those preprints or arXiv/bioRxiv works that are
215 waiting for acceptance are selected as well.
However, the results after the searching process and the elimination of ar-
ep

ticles not focused on the design of a classifier, reflect a very few works in this
area (focused only in 2022). So, the works used for the comparison will only be
those detailed next:
Pr

11

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
220 • Ali et al. (2022): In this work, a binary classification of monkeypox and

iew
other skin diseases is performed using skin images taken by users. The
authors tested several convolutional neural network models such as VGG-
16, ResNet50, and Inception-V3. The dataset uses 102 monkeypox images
and 126 images of other skin diseases, but it does not include images of
225 healthy skin tissue. The best classifier has an accuracy greater than 82%.

ev
• Ahsan et al. (2022a): this work uses a custom dataset formed by the four
classes “healthy”, “measles”, “chickenpox” and “monkeypox”, containing
54, 17, 47 and 43 images for each class, respectively. Although the au-

r
thors use a data augmentation process, the dataset has very few images
230 for some classes, and it is quite unbalanced. However, the developed clas-

er
sifiers are trained only for two classes (“monkeypox” versus “others”, and
“monkeypox” versus “chickenpox”). Using a VGG-16 CNN for each im-
pe
plemented system, authors obtain a 83% accuracy for the “monkeypox” vs
“others” study for the training subset, and a 78% accuracy for the second
235 experiment using the training subset.

We believe that the lack of work in this area is due, first, to the novelty
ot

of the global spread of the virus; but, second, it is also due to the lack of a
sufficiently balanced and formal public dataset. This latter point is founded on
the fact that the works detailed in this section are the same as those named in
tn

240 Section 2.1, when exposing the existing datasets (i.e., each work uses its own
dataset). This aspect is one of the main objectives of this work when collecting
a new public dataset.
rin

3. Results

This section presents the results obtained after the training process for each
ep

245 model or ensemble. These results are shown by the most commonly used met-
rics (described earlier) and the confusion matrices. First, the individual model
results are detailed; and then the ensembles results are shown.
Pr

12

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
3.1. Individual CNN results

iew
In this subsection, the results for each individual model are detailed. These
250 models are VGG-16, VGG-19, ResNet50, MobileNet-V2 and EfficientNet-B0.

3.1.1. VGG-16
The first individual model evaluated is VGG-16. The results obtained for

ev
this model are detailed in Table 3. As can be observed, global accuracy exceeds
91% on average, obtaining a value of almost 97% for the “monkeypox” class.

Class Accuracy Specificity Precision Sensitivity F1score

r
Monkeypox 96.67 100 100 90 94.73
Healthy 95 95 90.47 95 92.68
Other skin damages 91.67 92.5 85.71 90 87.8
Global 91.67
er
95.83 91.67 91.67

Table 3: Classification results obtained for VGG-16 classifier.


91.67
pe
ot
tn
rin

Figure 3: Confusion matrix for VGG-16 classification.

255 Figure 3 shows the classification details for the test subset. As shown in the
confusion matrix, there are no false positives for the “monkeypox” class. How-
ep

ever, 10% failures of Monkeypox samples are classified as other skin damages;
therefore, there are false negatives, which are the most dangerous cases.
Pr

13

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
3.1.2. VGG-19

iew
260 The second model is VGG-19. The results obtained for this model are de-
tailed in Table 4. As can be observed, global accuracy exceeds 93% on average,
obtaining a value of almost 97% for the monkeypox class. These results are
better than those obtained for the VGG-16 model; although, individually, the
only class that improves its own accuracy is the “other skin damages” class.

ev
Class Accuracy Specificity Precision Sensitivity F1score
Monkeypox 96.67 97.5 95 95 95
Healthy 95 97.5 94.73 90 92.31
Other skin damages 95 95 90.47 95 92.68

r
Global 93.33 96.67 93.33 93.33 93.33

Table 4: Classification results obtained for VGG-19 classifier.

er
pe
ot
tn

Figure 4: Confusion matrix for VGG-19 classification.


rin

265 These results can be easily observed in Figure 4. For this case, there are
false positive cases for the “monkeypox” class (5% of the “healthy” class), al-
though false negative cases are reduced to 5% (classifying them as “other skin
damages”).
ep
Pr

14

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
3.1.3. ResNet50

iew
270 The next model is ResNet50. The results obtained for this model are detailed
in Table 5. As can be observed, the global accuracy is 95% on average, obtaining
a value of almost 97% for each class individually. These results are the best
obtained so far (better than those obtained for the VGG-19 model).

Class Accuracy Specificity Precision Sensitivity F1score

ev
Monkeypox 96.67 95 90.91 100 95.24
Healthy 96.67 100 100 90 94.74
Other skin damages 96.67 97.5 95 95 95
Global 95 97.75 95 95 95

r
Table 5: Classification results obtained for ResNet50 classifier.

er
pe
ot
tn

Figure 5: Confusion matrix for ResNet50 classification.

Moreover, there is an essential difference between the ResNet50 and VGG-19


rin

275 results, as can be seen in Figure 5. This difference relies on the absence of false
negatives for the monkeypox class. For this case, the false negative decreasing
is changed for an increase of false positive cases. Both the accuracy and the
absence of false negatives makes this model the most efficient for diagnostic
ep

purposes.
Pr

15

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
280 3.1.4. MobileNet-V2

iew
Next, the MobileNet-V2 model is evaluated. The results obtained for this
model are detailed in Table 6. For this case, the global accuracy decreases
below 89%, being the worst evaluated model to date. Not surprisingly, there
is a substantial drop in the accuracy of both the “monkeypox” class and the
285 “other skin damages” class.

ev
Class Accuracy Specificity Precision Sensitivity F1score
Monkeypox 91.67 95 89.47 85 87.18
Healthy 95 95 90.48 95 92.68
Other skin damages 90 92.5 85 85 85

r
Global 88.33 94.17 88.33 88.33 88.33

Table 6: Classification results obtained for MobileNet-V2 classifier.

er
pe
ot
tn

Figure 6: Confusion matrix for MobileNet-V2 classification.


rin

These data can be verified by looking at the confusion matrix in Figure


6. For the monkeypox class, 15% of the samples are classified as “other skin
damages” (false negatives) and 5% of the samples of the other two classes are
classified as monkeypox (false positives). As detailed above, this model obtains
ep

290 the worst results, mainly due to those false negatives (detailed by the low value
of sensitivity).
Pr

16

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
3.1.5. EfficientNet-B0

iew
The last individual model evaluated is EfficientNet-B0. The results obtained
for this model are detailed in Table 7. The results are now better than those
295 obtained with the MobileNet-V2 model (global accuracy of 90% and monkeypox
accuracy of almost 97%); but these results are worse than those obtained with
the ResNet50 model.

ev
Class Accuracy Specificity Precision Sensitivity F1score
Monkeypox 96.67 97.5 95 95 95
Healthy 91.67 90 82.61 95 88.37
Other skin damages 91.67 97.5 94.12 80 86.49

r
Global 90 95 90 90 90

Table 7: Classification results obtained for EfficientNet-B0 classifier.

er
pe
ot
tn

Figure 7: Confusion matrix for EfficientNet-B0 classification.


rin

As can be observed in Figure 7, both false negatives and false positives are
reduced to 5% for each, but they are not completely eliminated. So, although
300 these results are better than those obtained with the MobileNet-V2 model, they
are worse than those obtained with the rest of the evaluated models.
ep

Therefore, the best classification results obtained so far are those achieved
with the ResNet50 model. Next, the three different ensemble networks are
evaluated.
Pr

17

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
305 3.2. Ensemble CNN results

iew
As indicated previously, the three ensemble classifiers evaluated will consist
of: [VGG-16 + VGG-19 + ResNet50], [VGG-16 + ResNet50 + EfficientNet-
B0] and [ResNet50 + EfficientNet-B0 + MobileNet-V2]. The results of these
ensembles will be combined by concatenation, simple average, and weighted av-
310 erage. Therefore, in addition to the ensembles, the mechanism used to combine

ev
the results will be evaluated.

3.2.1. VGG16+VGG19+ResNet50

r
For this first ensemble, the results obtained for each combination mechanism
are detailed in Table 8 (concatenation), Table 9 (simple average) and Table 10
315 (weighted average).

Class
Monkeypox
Accuracy
83.33
er
Specificity
85
Precision
72.72
Sensitivity
80
F1score
76.19
pe
Healthy 91.67 95 89.47 85 87.18
Other skin damages 88.33 92.5 84.21 80 82.05
Global 81.67 90.83 81.67 81.67 81.67

Table 8: Classification results obtained for the ensemble VGG-16 + VGG-19 + ResNet50.
Combination results obtained by concatenation.
ot

Class Accuracy Specificity Precision Sensitivity F1score


Monkeypox 85 82.5 72 90 80
tn

Healthy 95 97.5 94.74 90 92.31


Other skin damages 86.67 95 87.5 70 77.78
Global 83.3 91.67 83.33 83.33 83.33

Table 9: Classification results obtained for the ensemble VGG-16 + VGG-19 + ResNet50.
rin

Combination results obtained by simple average.

The best results for this ensemble are obtained using the weighted average
mechanism (see Table 10). Here, a global accuracy of almost 92% is obtained,
ep

very similar to the best result obtained with individual models (93.33% with
ResNet50); however, the increase in computational cost does not justify the use
320 of this ensemble. The main problems of this ensemble are the false negatives
Pr

18

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
Class Accuracy Specificity Precision Sensitivity F1score
Monkeypox 93.33 95 90 90 90

iew
Healthy 96.67 100 100 90 94.74
Other skin damages 93.33 92.5 86.36 95 90.48
Global 91.67 95.83 91.67 91.67 91.67

Table 10: Classification results obtained for the ensemble VGG-16 + VGG-19 + ResNet50.
Combination results obtained by weighted average.

ev
for the monkeypox class (10% of monkeypox samples), as can be observed in
Figure 8.

r
er
pe

(a) (b)
ot
tn
rin

(c)
ep

Figure 8: Confusion matrix for the ensemble composed by VGG-16, VGG-19 and ResNet50.
Mixed results obtained by (a) concatenation, (b) simple average and (c) weighted average.
Pr

19

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
3.2.2. VGG16+ResNet50+EfficientNetB0

iew
For the second ensemble, the results obtained for each combination mecha-
325 nism are detailed in Table 11 (concatenation), Table 12 (simple average), and
Table 13 (weighted average).

Class Accuracy Specificity Precision Sensitivity F1score


Monkeypox 91.67 97.5 94.12 80 86.49

ev
Healthy 93.33 95 90 90 90
Other skin damages 95 92.5 89.96 100 93.02
Global 90 95 90 90 90

Table 11: Classification results obtained for the ensemble VGG-16 + ResNet50 + EfficientNet-

r
B0. Combination results obtained by concatenation.

Class
Monkeypox
Healthy
Accuracy
91.67
91.67
er
Specificity
95
95
Precision
89.47
89.47
Sensitivity
85
85
F1score
87.18
87.18
pe
Other skin damages 93.33 92.5 86.36 95 90.48
Global 88.33 94.17 88.33 88.33 88.33

Table 12: Classification results obtained for the ensemble VGG-16 + ResNet50 + EfficientNet-
B0. Combination results obtained by simple average.
ot

Class Accuracy Specificity Precision Sensitivity F1score


Monkeypox 93.33 95 90 90 90
Healthy 96.67 100 100 90 94.74
tn

Other skin damages 93.33 92.5 86.36 95 90.48


Global 91.67 95.83 91.67 91.67 91.67

Table 13: Classification results obtained for the ensemble VGG-16 + ResNet50 + EfficientNet-
B0. Combination results obtained by weighted average.
rin

As happened for the previous ensemble, the best results are obtained with
the weighted average and, moreover, the accuracy results are the same too.
Moreover, both the individual accuracy of each class and the false positives
ep

330 and negatives are the same. Therefore, the problem is the same as before: this
system is too computationally complex for the results obtained (especially when
comparing them with the individual ResNet50 model).
Pr

20

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
So far, the ensembles developed for this work do not obtain better accuracy

iew
results than those obtained with individual models. However, the last ensemble
335 is analysed in the next (and last) subsection of the Results section.

r ev
(a) er (b)
pe
ot
tn

(c)

Figure 9: Confusion matrix for the ensemble composed by VGG-16, ResNet50 and
EfficientNet-B0. Mixed results obtained by (a) concatenation, (b) simple average and (c)
rin

weighted average.

3.2.3. ResNet50+EfficientNet+MobileNetV2
For this last ensemble network, the results are detailed in the same way as for
ep

the previous ones, dividing them by the combination mechanism: concatenation


(see Table 14), simple average (see Table 15) and weighted average (see Table
340 16).
Pr

21

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
For this last ensemble, the results obtained are better than all the previous

iew
ones. For combinations of simple average and weighted average, an overall
accuracy greater than 98% is obtained. Moreover, for the “healthy” class, the
accuracy is 100%, while for the other two classes (including “monkeypox”), the
345 accuracy obtained is 98.33%.

Class Accuracy Specificity Precision Sensitivity F1score

ev
Monkeypox 95 100 100 85 91.89
Healthy 96.67 95 90.91 100 95.24
Other skin damages 95 95 90.48 95 92.68
Global 93.33 96.67 93.33 93.33 93.33

r
Table 14: Classification results obtained for the ensemble ResNet50 + EfficientNet-B0 +
MobileNet-V2. Combination results obtained by concatenation.

Class
Monkeypox
Accuracy
98.33
er
Specificity
97.5
Precision
95.24
Sensitivity
100
F1score
97.56
pe
Healthy 100 100 100 100 100
Other skin damages 98.33 100 100 95 97.44
Global 98.33 99.17 98.33 98.33 98.33

Table 15: Classification results obtained for the ensemble ResNet50 + EfficientNet-B0 +
MobileNet-V2. Combination results obtained by simple average.
ot

Class Accuracy Specificity Precision Sensitivity F1score


Monkeypox 98.33 97.5 95.24 100 97.56
tn

Healthy 100 100 100 100 100


Other skin damages 98.33 100 100 95 97.44
Global 98.33 99.17 98.33 98.33 98.33

Table 16: Classification results obtained for the ensemble ResNet50 + EfficientNet-B0 +
rin

MobileNet-V2. Combination results obtained by weighted average.

As can be seen in the confusion matrices (see Figure 10), for the cases of
simple and weighted average (Figure 10-a and Figure 10-b), no false negatives
ep

are observed (sensitivity value of 100%), although there is a 5% false positives


cases from the “other skin damages” class.
350 The results obtained with this ensemble are almost a 4% better than those
Pr

22

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
obtained so far with the individual ResNet50 model. The final results obtained

iew
for the best classifier include no misclassifications for both “monkeypox” and
“healthy” classes, and a 5% misclassified samples for the “other skin damages”
class (that are classified as “monkeypox”).

r ev
(a)
er (b)
pe
ot
tn

(c)

Figure 10: Confusion matrix for the ensemble composed by ResNet50, EfficientNet-B0 and
rin

MobileNet-V2. Mixed results obtained by (a) concatenation, (b) simple average and (c)
weighted average.
ep

355 4. Discussion

Following the results presented above, they can be evaluated together to


reach general conclusions.
Pr

23

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
First, in relation to the public monkeypox datasets composed of skin im-

iew
ages, those found have been examined in detail. The search reveals two public
360 datasets with several shortcomings that were previously analysed: dataset (Ali
et al., 2022) has only two classes with very varied images and a slight imbalance
between them (102 images for the “monkeypox” class and 126 images for the
“others” class); and dataset (Ahsan et al., 2022b) has four classes with equally

ev
varied images, very few images and a large imbalance between classes (54 for
365 “healthy”, 43 for “monkeypox”, 47 for “chickenpox” and 17 for “measles”).
However, the dataset collected for this work is composed of three classes of

r
the same type and is completely balanced (100 images for “monkeypox”, 100
images for “healthy” and 100 images for “other skin damages”). Therefore, in

370 er
this aspect, there is a substantial improvement compared to the public datasets
so far, which may help future work in this area.
In second place, with respect to the classifiers evaluated, Table 17 sum-
pe
marises the accuracy results obtained in the previous section. In this table,
columns 2 to 4 show the results obtained depending on the mechanism used by
the ensemble to combine the results; this is the reason why these columns are
375 not filled for individual models.
ot

Accuracy results
Classifier
Concatenated Simple avr Weighted avr BEST
VGG-16 − − − 91.67
tn

VGG-19 − − − 93.33
ResNet50 − − − 95
EfficientNet-B0 − − − 90
MobileNet-V2 − − − 88.33
VGG16+VGG19+ResNet 81.67 83.3 91.67 91.67
rin

VGG16+ResNet+EfficientNet 90 88.33 91.67 91.67


ResNet+EfficientNet+MobileNet 93.33 98.33 98.33 98.33

Table 17: Summarized results for all the classifiers.


ep

Looking at the information presented in Table 17, the best results are those
obtained with the individual model ResNet50, and with the ensemble formed
by ResNet50 + EfficientNet-B0 and MobileNet-V2. Of all these cases, the in-
Pr

24

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
dividual model obtained an accuracy of 95% and the ensemble an accuracy of

iew
380 98.33%, improving the results of the individual models by 3.33%.
It could be considered whether it is worth using an ensemble network con-
sisting of three individual networks to improve by more than a 3%; however, it
is important to note that, analysing each class individually, an improvement is
obtained in all of them: 2% improvement in the “monkeypox” class (keeping

ev
385 false negatives to zero and reducing false positives); 4% improvement for the
“healthy” class (reducing both false positives and false negatives to zero); and
2% improvement in the “other skin damages” class (reducing false positives to

r
zero, while slightly maintaining some false negatives). In any case, all the indi-
cators analysed show an improvement using the ensemble network. Therefore,
390

for a Monkeypox skin image classifier. er


we conclude that the use of the indicated ensemble seems to be the best solution

Finally, the results of the classifier developed in this work can be analysed in
pe
comparison with the two previous works detailed in Section 2. The comparative
results are presented in Table 18.
Work Dataset Classes Classifier Results (%)

VGG-16 [V GG16] Acc: 81.48, Pre: 85, Sen: 81, F1: 83


ResNet50 [ResN et] Acc: 82.96, Pre: 87, Sen: 83, F1: 84
Ali et al. (2022) Own [2] Monkeypox, Others
ot

Inception-V3 [Inception] Acc: 74.03, Pre: 74, Sen: 81, F1: 78


Ensemble [Ensemble] Acc: 79.26, Pre: 84, Sen: 79, F1: 81
[2] Monkeypox, Chickenpox [Case1] Acc: 83, Pre: 88, Sen: 83, Spe: 66, F1: 83
Ahsan et al. (2022a) Own VGG-16
[2] Monkeypox, Others [Case2] Acc: 78, Pre: 75, Sen: 75, Spe: 83, F1: 75
VGG-16 [V GG16] Acc: 91.67, Pre: 92, Sen: 92, Spe: 96, F1: 92
VGG-19 [V GG19] Acc: 93.33, Pre: 93, Sen: 93, Spe: 97, F1: 93
tn

ResNet50 [ResN et] Acc: 95, Pre: 95, Sen: 95, Spe: 97.75, F1: 95
[3] Healthy, Monkeypox, MobileNet-V2 [M obileN et] Acc: 88.33, Pre: 88, Sen: 88, Spe: 94, F1: 88
This work (2022) MonkeypoxSkin
Other skin diseases EfficientNet-B0 [Ef f icientN et] Acc: 90, Pre: 90, Sen: 90, Spe: 95, F1: 90
Ensemble 1 [Ensemble1] Acc: 91.67, Pre: 92, Sen: 92, Spe: 96, F1: 92
Ensemble 2 [Ensemble2] Acc: 91.67, Pre: 92, Sen: 92, Spe: 96, F1: 92
Ensemble 3 [Ensemble3] Acc: 98.33, Pre: 98, Sen: 98, Spe: 99, F1: 98
rin

Table 18: Comparison with previous works.

395 If we look at the results obtained in previous work, the classifier developed
for this work improves the overall accuracy of the system (in addition to the
ep

individual accuracy of all classes).


The first compared work uses its own two-class dataset and, therefore, its
classifier is a two-class classifier (“monkeypox” and “others”). The best classi-
Pr

25

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
400 fication results are obtained with the individual ResNet50 model (almost 83%).

iew
In this aspect, it is similar to our work, since the best results for individual
models are obtained with the ResNet50 model; even so, our work improves the
classification results for the ResNet50 model by a 12% (it should also be taken
into account that our work uses three classes, not two). Finally, this first pre-
405 vious work also evaluates an ensemble (consisting of VGG-16 + ResNet50 +

ev
Inception-V3), obtaining worse results than those obtained with ResNet50.
The second compared work uses its own dataset too, but in this case it con-
tains four classes. However, in that work, two two-class classifiers are developed:

r
one classifies between “monkeypox” and “chickenpox”, and the other classifies
410 between “monkeypox” and “others”. This work only analyses the VGG-16

er
model, obtaining classification results of 83% for the first case and 78% for the
second one (taking into account the results presented for the test subset). If
we compare our work with this one, we obtain an improvement of 9% using the
pe
VGG-16 model (although it is one of the worst results obtained in our work).
415 In general, the improvement we obtain with respect to this work is more than
15%.
In summary, looking at our work in general, we have analysed several clas-
ot

sifiers to obtain a nearly optimal solution with an accuracy of over 98%. These
results are better than all the work published so far on all the parameters eval-
420 uated.
tn

Furthermore, the dataset collected for this work is more balanced and regular
than those published so far.
rin

Declarations

Acknowledgments. This work was supported by the Andalusian Regional I+D+i FEDER
425 Project under Grant DAFNE US-1381619.
Funding. Open Access funding provided thanks to the CRUE-CSIC agreement with Elsevier.
ep

Competing Interest. The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence the work reported in
this document.
Pr

26

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
430 Ethical statement. No ethical committee has been consulted, as this work uses a publicly
available dataset.

iew
Author contributions. Conceptualization: Manuel Domı́nguez-Morales, Francisco Luna-
Perejón, Javier Civit-Masot; Methodology: Manuel Domı́nguez-Morales, Luis Muñoz-Saavedra,
Elena Escobar-Linero; Formal analysis and investigation: Luis Muñoz-Saavedra, Elena Escobar-
435 Linero; Writing - original draft preparation: Luis Muñoz-Saavedra, Elena Escobar-Linero;
Writing - review and editing: Francisco Luna-Perejón, Javier Civit-Masot; Funding acquisi-

ev
tion: Antón Civit; Resources: Manuel Domı́nguez-Morales; Supervision: Antón Civit, Manuel
Domı́nguez-Morales.

References

r
440 Ahsan, M. M., Uddin, M. R., Farjana, M., Sakib, A. N., Momin, K. A., & Luna,

er
S. A. (2022a). Image data collection and implementation of deep learning-
based model in detecting monkeypox disease using modified vgg16. arXiv
preprint arXiv:2206.01862 , .
pe
Ahsan, M. M., Uddin, M. R., & Luna, S. A. (2022b). Monkeypox image data
445 collection. arXiv preprint arXiv:2206.01774 , .

Ali, S. N., Ahmed, M., Paul, J., Jahan, T., Sani, S., Noor, N., Hasan, T.
ot

et al. (2022). Monkeypox skin lesion detection using deep learning models: A
feasibility study. arXiv preprint arXiv:2207.03342 , .

Centre of Disease Control and Prevention (2022). Technical report:


tn

450 Multi-national monkeypox outbreak. https://www.cdc.gov/poxvirus/


monkeypox/clinicians/technical-report.html. Accessed: 2022-08-02.

Civit-Masot, J. et al. (2020a). Deep learning system for covid-19 diagnosis aid
rin

using x-ray pulmonary images. Applied Sciences, 10 , 4640.

Civit-Masot, J. et al. (2020b). Dual machine-learning system to aid glaucoma


diagnosis using disc and cup feature extraction. IEEE Access, 8 , 127519–
ep

455

127529.
Pr

27

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
Civit-Masot, J. et al. (2021). A study on the use of edge tpus for eye fundus

iew
image segmentation. Engineering Applications of Artificial Intelligence, 104 ,
104384.

460 Domı́nguez-Morales, M. et al. (2019). Smart footwear insole for recognition of


foot pronation and supination using neural networks. Applied Sciences, 9 ,
3970.

ev
Domı́nguez-Morales, M. et al. (2022). MonkeypoxSkin dataset. https://
github.com/mjdominguez/MonkeypoxSkinImages.

r
465 Escobar-Linero, E., Domı́nguez-Morales, M., & Sevillano, J. L. (2022a).
Worker’s physical fatigue classification using neural networks. Expert Sys-
tems with Applications, 198 , 116784.
er
Escobar-Linero, E., Luna-Perejón, F., Muñoz-Saavedra, L., Sevillano, J. L., &
pe
Domı́nguez-Morales, M. (2022b). On the feature extraction process in ma-
470 chine learning. an experimental study about guided versus non-guided process
in falling detection systems. Engineering Applications of Artificial Intelli-
gence, 114 , 105170.
ot

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image
recognition. In Proceedings of the IEEE conference on computer vision and
pattern recognition (pp. 770–778).
tn

475

Kundu, R. et al. (2021). Pneumonia detection in chest x-ray images using an


ensemble of deep learning models. PloS one, 16 , e0256630.
rin

Liu, Z. et al. (2021). Deep learning framework based on integration of s-mask r-


cnn and inception-v3 for ultrasound image-aided diagnosis of prostate cancer.
480 Future Generation Computer Systems, 114 , 358–367.
ep

Lotter, W. et al. (2021). Robust breast cancer detection in mammography


and digital breast tomosynthesis using an annotation-efficient deep learning
approach. Nature Medicine, 27 , 244–249.
Pr

28

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
Luna-Perejón, F. et al. (2019). Wearable fall detector using recurrent neural

iew
485 networks. Sensors, 19 , 4885.

Luna-Perejón, F. et al. (2020). Low-power embedded system for gait classifica-


tion using neural networks. Journal of Low Power Electronics and Applica-
tions, 10 , 14.

ev
Luna-Perejón, F. et al. (2021a). Ankfall—falls, falling risks and daily-life activ-
490 ities dataset with an ankle-placed accelerometer and training using recurrent
neural networks. Sensors, 21 , 1889.

r
Luna-Perejón, F. et al. (2021b). Iot garment for remote elderly care network.
Biomedical Signal Processing and Control , 69 , 102848.

495
er
Muñoz-Saavedra, L., Civit-Masot, J., Luna-Perejón, F., Domı́nguez-Morales,
M., & Civit, A. (2021). Does two-class training extract real features? a
pe
covid-19 case study. Applied Sciences, 11 , 1424.

Muñoz-Saavedra, L. et al. (2020). Affective state assistant for helping users with
cognition disabilities using neural networks. Electronics, 9 , 1843.

Rim, B. et al. (2020). Deep learning in physiological signal data: A survey.


ot

500 Sensors, 20 , 969.

Roncato, C. et al. (2020). Colour doppler ultrasound of temporal arteries for


tn

the diagnosis of giant cell arteritis: a multicentre deep learning study. Clin
Exp Rheumatol , 38 , S120–25.

Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L.-C. (2018). Mo-
rin

505 bilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the


IEEE conference on computer vision and pattern recognition (pp. 4510–4520).

Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for
ep

large-scale image recognition. arXiv preprint arXiv:1409.1556 , .

Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance


510 measures for classification tasks. Inf. Process. & Manag., 45 , 427–437.
Pr

29

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
ed
Tan, M., & Le, Q. (2019). Efficientnet: Rethinking model scaling for convolu-

iew
tional neural networks. In International conference on machine learning (pp.
6105–6114). PMLR.

The Guardian (2022). Spain reports second death related to mon-


515 keypox. https://www.theguardian.com/world/2022/jul/30/
spain-reports-second-death-related-to-monkeypox. Accessed: 2022-

ev
08-02.

Thomas, S. M. et al. (2021). Interpretable deep learning systems for multi-class

r
segmentation and classification of non-melanoma skin cancer. Medical Image
520 Analysis, 68 , 101915.

er
Torres-Soto, J. et al. (2020). Multi-task deep learning for cardiac rhythm de-
tection in wearable devices. NPJ digital medicine, 3 , 1–8.
pe
United Nations (2022). What is monkeypox? https://news.un.org/en/
story/2022/07/1123212. Accessed: 2022-08-02.

525 World Heath Organization (2022). Monkeypox details. https://www.who.int/


news-room/fact-sheets/detail/monkeypox. Accessed: 2022-08-02.
ot

Zhu, H. et al. (2020). A deep learning approach for recognizing activity of daily
living (adl) for senior care: Exploiting interaction dependency and temporal
tn

patterns. Forthcoming at MIS Quarterly, .


rin
ep
Pr

30

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534
Title

ed
Monkeypox diagnostic-aid system with skin images using convolutional neural
networks
Luis Muñoz-Saavedra 1, Elena Escobar-Linero 1, Javier Civit-Masot 1, Francisco Luna-Perejón 1,
Antón Civit 1,2, Manuel Domínguez-Morales 1,2,*

iew
1 Architecture and Computer Technology dept., E.T.S. Ingeniería Informática, Avda. Reina
Mercedes, s/n, Universidad de Sevilla, Seville, Spain
2 Computer Engineering Research Institute (I3US), Universidad de Sevilla, Seville, Spain

* Corresponding autor:

v
Manuel Domínguez-Morales, E.T.S. Ingeniería Informática, Avda. Reina Mercedes s/n, office
F0.71, 41012, Universidad de Sevilla, Seville, Spain. +34 95 4551206

re
Abstract
Background: Monkeypox is similar to classical smallpox and, until now, isolated cases have
er
been detected in Africa, but in 2022 it has spread around the world and was declared an
international health emergency by the WHO at the end of July.
Objective: In this work, we collect monkeypox images from official government websites and
public datasets (as well as images of healthy people and people with other skin diseases) and
pe
create the MonkeypoxSkin dataset (publicly available), which contains close-up images of the
skin where you can see the posthulas formed by the disease. With these images, we design,
implement, and evaluate several diagnostic aid systems for monkeypox disease.
Methods: The classifiers developed and evaluated are based on Convolutional Neural Networks
models and some ensembles composed of a combination of those models, obtaining automatic
ot

classification results between healthy, monkeypox and other skin damages, given a close skin
tissue image.
Results: The results show a system accuracy greater than 93% when using a unique CNN model
tn

(VGG-19 and ResNet50), and greater than 98% when using a CNN ensemble formed by
ResNet50, EfficientNet-B0 and MobileNet-V2.
Conclusions: In contrast to the work found to date, this classifier focusses on disease
classification based on a low-resolution image obtained from the skin surface, so it can be
rin

integrated into a mobile application for mass screening. To do this, it has been necessary to
design a proprietary dataset that is publicly available for use. Furthermore, the results
obtained improve the accuracy of the previous works.
Keywords: Monkeypox, Deep Learning, Convolutional Neural Networks, Skin damage, Skin
ep

dataset
Pr

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4186534

You might also like