A Deep Learning-Based Mobile Application for
Monkeypox Detection
Haifa F. Alhasson * , Elaf Almozainy, Manar Alharbi, Naseem Almansour, Shuaa S. Alharbi
and Rehan Ullah Khan

Department of Information Technology, College of Computer, Qassim University, Buraydah 52571, Saudi Arabia;
392206338@qu.edu.sa (E.A.); 392206339@qu.edu.sa (M.A.); 392206376@qu.edu.sa (N.A.);
shuaa.s.alharbi@qu.edu.sa (S.S.A.); re.khan@qu.edu.sa (R.U.K.)
* Correspondence: hhson@qu.edu.sa

Abstract: The recent outbreak of monkeypox has raised significant concerns in the field of public
health, primarily because it has quickly spread to over 40 countries outside of Africa. Detecting
monkeypox in its early stages can be quite challenging because its symptoms can resemble those of
chickenpox and measles. However, computer-assisted tools offer a promising way to identify monkeypox cases rapidly and efficiently. In particular, deep learning methods have proven effective in automatically detecting skin lesions when sufficient training examples are available. To improve monkeypox diagnosis through mobile applications, we employed MobileNetV2, a lightweight convolutional neural network, which enables us to identify suspected monkeypox cases more accurately than classical machine learning approaches. The proposed approach was evaluated using recall, precision, F1-score, and accuracy. The experimental results show that our architecture achieves an accuracy of 99%, a recall of 100%, an F1-score of 98%, and a precision of 95%. We believe that such experimental evaluation will contribute to the medical domain and many use cases.

Keywords: artificial neural network; deep learning; detection performance; image analysis; image
classification; machine learning models; neural network; object detectors
1. Introduction
The Mpox virus, also known as monkeypox, is a type of virus that belongs to the family Poxviridae and the genus Orthopoxvirus. It is believed that monkeypox is transmitted to humans through rodents. The symptoms of monkeypox, such as lesions and rashes, are similar to those caused by measles, chickenpox, and smallpox. Due to this similarity, early diagnosis of the disease can be very challenging [1]. However, electron microscopy can be used to definitively diagnose the virus. It is crucial to isolate individuals who have the disease to prevent the spread of the virus [2].
Recent outbreaks of the monkeypox virus have caused concern in many parts of the world. Considering its clinical characteristics, monkeypox closely resembles chickenpox, measles, and smallpox. The monkeypox virus has gained attention due to its severity and widespread occurrence. In 1970, a nine-month-old child from the Democratic Republic of Congo became the first documented case of the disease. Since then, monkeypox has become an emerging zoonotic disease that poses a significant public health threat [3]. As a result, outbreaks have increased but are primarily restricted to the continent of Africa. Due to its rapid spread in more than 40 countries outside of Africa, the recent monkeypox outbreak has become a public health concern [4]. The transmission of the monkeypox virus from infected animals to humans has historically been limited due to the need for close and prolonged contact with wild animals. Human-to-human transmission is rare,
as it typically requires direct interaction with an infected individual. This limited spread
resulted in the disease being confined to specific regions where such interactions occur
regularly. However, recent outbreaks have shown an alarming increase in the number of
monkeypox cases in countries where it was not previously considered endemic [5]. These
outbreaks have highlighted the urgent need for early detection and accurate diagnosis of
monkeypox infections [6]. Monkeypox is typically a self-limiting disease. A monkeypox rash appears between 1 and 5 days after the onset of the initial signs. The severity of the disease depends on the amount of virus exposure the patient has received, their health status, and any complications that occur. As the disease progresses, the rash spreads to other parts of the body and may be mistaken for chickenpox due to the similarity of the rashes. Eventually, the rash becomes crusty, making it important to accurately detect and diagnose monkeypox early on [7,8].
The symptoms of monkeypox, smallpox, and measles are almost identical, as shown in
Figure 1, making it difficult to distinguish them without a laboratory test [9]. Vaccination is
the only way to completely eradicate the disease [10]. While the virus is not life-threatening,
it can cause complications in severe cases, such as sepsis, pneumonia, and blindness [9].

Figure 1. Sample images from the Monkeypox Skin Images Dataset (MSID): (a) measles, (b) chickenpox, (c) monkeypox.

Based on research on virus-related disease detection using Deep Learning (DL) meth-
ods, the majority of studies use the transfer learning approach, which uses well-established
pre-trained DL methods. To the best of our knowledge, there is no extensive research
available on the detection of the monkeypox virus except for the work in ref. [9]. In ref. [9],
interesting results have been achieved. It was determined that the state of the art has three
major limitations.
• Most of the models are limited to binary classification with limited performance.
• There is a limited number of models that offer high accuracy in classifying monkeypox
for quick, real-time diagnosis using smartphones.
• Most of the models have been evaluated using only a limited set of evaluation metrics, so the best-performing models in the literature cannot be directly compared with others and are not sufficiently generalized, as in ref. [11].
A limited number of research studies have been found to demonstrate the potential of
Machine Learning (ML) methods in diagnosing monkeypox disease using image processing
techniques. In a previous study [12], two main factors were identified as the cause for the
lack of a basis for the advancement of an image-based diagnosis of monkeypox disease:
1. Only a small number of publicly accessible datasets are available to train and test ML models for diagnosing monkeypox.
2. Because the virus has only recently gained significant exposure in numerous nations, suitable ML algorithms and accompanying datasets are still being assembled; moreover, building a model on image data requires additional time.
In this article, we develop, evaluate, and deploy a DL-based mobile application to identify and classify monkeypox disease among other skin lesions. From an application
perspective, monkeypox images can be distinguished from chickenpox and measles images
using pre-trained deep learning networks in the Android mobile application.
The main contribution of this article is a proposed mobile-based deep-learning model
for detecting and classifying monkeypox skin lesions. The model achieves an accurate and
affordable diagnosis, leading to improved treatment outcomes and a reduction in healthcare
costs. It also suggests that deep learning can have broader applications in medical imaging
tasks beyond monkeypox diagnosis.
The rest of the paper is organized as follows: Section 2 explains the deep learning-based approaches utilized in this paper and details the datasets used to evaluate the method. The results and discussion are provided in Section 3. Section 4 concludes the article.

1.1. Monkeypox Detection Methods


Computer-aided algorithms can detect monkeypox lesions, supporting surveillance and the rapid identification of suspected cases. Ali et al. [10] introduced a new dataset of skin lesions caused by monkeypox, chickenpox, and measles. Almost all images were obtained from websites, news outlets, and publicly available case reports. The current state of automatic detection of monkeypox skin lesions is hampered by a significant deficiency in training examples. This dearth of data is primarily due to the challenges associated with obtaining the necessary images for training purposes.
In addition, the Monkeypox Skin Lesion Dataset (MSLD) [13] was created in response to the recent outbreak of monkeypox. It contains web-scraped images of monkeypox and non-monkeypox cases (measles and chickenpox) covering various body parts (face, neck, hand, arm, leg). Their study also explores the feasibility of leveraging transfer learning with VGG16, ResNet50, and InceptionV3.
Yang et al. [14] argue that an AI-based monkeypox detector for autonomous mobile clinics (AICOM-MP) is required to handle images taken by various devices, especially low-resolution images taken on resource-constrained devices. Irmak et al. [2] provide publicly available monkeypox image datasets. Their dataset, however, is limited to binary classification with limited performance.

1.2. Classification Approaches


Several researchers classified similar skin diseases before monkeypox detection attracted attention. Wei et al. [15] used Support Vector Machines (SVM) to classify three skin diseases: dermatitis, herpes, and psoriasis. Bhadula et al. [16] employed convolutional neural networks (CNNs) together with five distinct ML classifiers to identify three kinds of skin diseases (acne, lichen planus, and SJS-TEN). Furthermore, Sriwong et al. [17] developed three types of learning models: transfer learning using the existing AlexNet architecture (AlexNet-TL), SVM modeling based on features extracted from images (FESVM), and SVM modeling based on image features combined with patient data (FVM). Roy et al. [18] utilized different segmentation approaches to detect skin diseases such as chickenpox. In their work, they applied different image processing techniques such as adaptive thresholding, edge detection, K-means clustering, and morphology-based image segmentation.
The earliest trials to detect other skin diseases similar to monkeypox were conducted by Sahin et al. [19]. They used several pre-trained DL models on the MSLD dataset and found that the MobileNetV2 and EfficientNetB0 models performed well. Arias et al. [20] developed models to detect varicella zoster using K-Nearest Neighbors (KNN), neural networks, and logistic regression.
Sahin et al. [21] proposed a smartphone-based skin disease-detection method utilizing MobileNetV2. In addition, using the data presented in Hussain's work [22], they evaluated the performance of ResNet50, DenseNet121, Inception-V3, SqueezeNet, MnasNet-A1, MobileNet-V2, and ShuffleNet-V2.
Akin et al. [23] analyzed monkeypox skin lesions using explainable-AI-assisted CNNs. ResNet-18, ResNet-50, VGG-16, DenseNet-161, EfficientNet B7, EfficientNet V2, GoogLeNet, MobileNet V2, MobileNet V3, ResNeXt-50, ShuffleNetV2, and ConvNeXt were used in the experiments, with the MobileNetV2 model achieving the best accuracy compared with the other models. Islam et al. [13] proposed a web-scraping-based data collection system for monkeypox skin lesions. In the classification task, ResNet50, Inception-V3, DenseNet121, MnasNet-A1, MobileNet-V2, ShuffleNet-V2, and SqueezeNet models were used, and ShuffleNet-V2 outperformed the other models.
Ahsan et al. [24] generated a new monkeypox classification dataset. Based on the 1915 augmented images, the VGG16 model was implemented. Ali et al. [10] applied different deep learning models to the MSID dataset in order to classify monkeypox. These models included VGG16, ResNet50, and Inception-V3, with ResNet50 achieving the best accuracy.
Sahin et al. [21] presented an Android mobile application that uses a smartphone camera to detect and classify monkeypox. They used four different DL networks: VGG16, ResNet50, InceptionV3, and an ensemble. In their experiments, EfficientNetB0 and MobileNetV2 showed better performance in terms of accuracy compared to the other models.
Haque et al. [25] applied the MSLD dataset to different transfer-learning-based models using VGG19, DenseNet121, Xception, EfficientNetB3, and MobileNetV2. Ali et al. [10] analyzed monkeypox skin lesions and other diseases (chickenpox, measles) using pre-trained models such as VGG-16, ResNet50, and InceptionV3. On the other hand, Hussain et al. [22] compared the performance of monkeypox classification using a wide range of models: ResNet50, DenseNet121, Inception-V3, SqueezeNet, MnasNet-A1, MobileNet-V2, and ShuffleNet-V2. Ahsan et al. [9] applied a modified VGG16 to identify monkeypox and explained its feature extraction and predictions using Local Interpretable Model-Agnostic Explanations (LIME). Sitaula et al. [11] introduced a new method based on DenseNet-169 and Xception on the Monkeypox-2022 image dataset, whereas Bajwa et al. [26] utilized DenseNet-161, ResNet-152, NASNet, and SE-ResNeXt-101 on the DermNet dataset.
Manne et al. [27] reviewed several articles using CNNs to classify skin lesions; recently, machine learning algorithms have reduced the rate of misclassification of skin lesions compared to dermatologists. Their review presents the use of CNNs in successfully classifying skin cancer types, along with the methods that have been implemented and their success rates. Rajput et al. [28] proposed an efficient convolutional neural network (CNN) model for identifying skin cancer with high accuracy; to classify the HAM10K data, they customized the AlexNet model.
Other methods that utilize a collection of transfer learning models to improve the classification task are discussed in [29,30]. Raza et al. [29] introduced an ensemble model for the classification of skin lesions that combines transfer-learning backbones such as Xception, InceptionV3, DenseNet121, and DenseNet201. On the other hand, Gouda et al. [30] present a method that uses ESRGAN for image enhancement and classifies images using ResNet50, InceptionV3, and InceptionResNet. Table 1 summarizes the various approaches used in previous studies, along with the techniques applied and the datasets used to evaluate the models.
Table 1. Summary of the different approaches in previous studies.

Research Paper | Network Used and Accuracy Value (as reported) | Dataset Used | Additional Evaluation Measurement
Sahin et al. [21] | ResNet18 (86.8%), GoogleNet (82.22%), EfficientNetb0 (91.11%), NasnetMobile (86.67%), ShuffleNet (80.00%), MobileNetv2 (91.11%) | Prepared | 1-fold cross-validation
Manjurul et al. [12] | VGGNet (71%), MobileNet (94.4%) | MSLD | Precision, Recall, F1-score, and Specificity
Haque et al. [25] | VGG19-CBAM-Dense (71.86%), Xception-CBAM-Dense (83.89%), DenseNet121-CBAM-Dense (78.27%), MobileNetV2-CBAM-Dense (74.07%), EfficientNetB3-CBAM-Dense (81.43%), Ensemble (79.26%), ShuffleNet (80.00%) | Monkeypox Skin Lesion Dataset (MSLD) | 4-fold cross-validation
Almuayqil et al. [31] | Xception (96.83%), Resnet50 (96.93%), DenseNet201 (97.16%), InceptionV3 (96.62%), VGG19 (96.97%), InceptionResnet (96.65%) | Prepared | –
Akin et al. [23] | ResNet-18 (98.25%), ResNet50 (96.49%), VGG-16 (92.98%), Densenet-161 (96.49%), EfficientNetB7 (94.74%), GoogLeNet (96.49%), EfficientNetV2 (96.49%), MobileNetV2 (98.25%), MobileNetV3 (75.44%), ResNeXt-50 (92.98%), ShuffleNetV2 (78.95%), ConvNeXt (96.49%) | Monkeypox Skin Images Dataset (MSID) | Sensitivity
Ahsan et al. [9] | VGG16 (0.93), InceptionResNetV2 (0.98), ResNet50 (0.72), ResNet101 (0.72), MobileNetV2 (0.99), VGG19 (0.90) | Prepared | Precision
Hussain et al. [22] | ResNet50 (0.72), Inception-V3 (0.71), DenseNet121 (0.78), MnasNet-A1 (0.72), MobileNet-V2 (0.77), ShuffleNet-V2 (0.79), SqueezeNet (0.65) | Prepared | 5-fold cross-validation

1.3. DL Approaches
1.3.1. CNN
Utilizing deep learning (DL) techniques enables the automated acquisition of complex
features necessary for visual pattern recognition. Convolutional neural networks (CNNs),
a specific type of DL approach, have been extensively employed in various computer
vision tasks, encompassing facial expression recognition, text recognition, face recognition,
gender classification, age classification, and action recognition. Moreover, CNNs have
exhibited remarkable performance in biomedical applications, specifically in the domains
of pattern recognition and computer vision. Different CNN architectures employ different
combinations of convolutional, pooling, and fully connected layers. The convolutional
process involves the conversion of input data into filters [21,32]. The convolution process is
visually depicted in Figure 2.
Suppose an N × N square layer of neurons is followed by a convolutional layer. If we use an m × m filter w, the convolutional layer output will be of size (N − m + 1) × (N − m + 1). To calculate the pre-nonlinearity input to a unit $x_{ij}^{l}$ in our layer, we must sum up the contributions (weighted by the filter components) from the previous layer's cells:

$$x_{ij}^{l} \;=\; \sum_{a=0}^{m-1} \sum_{b=0}^{m-1} w_{ab}\, y_{(i+a)(j+b)}^{l-1} \qquad (1)$$
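As an illustration of Equation (1), the following short NumPy sketch (our own addition, not code from the original study) computes the pre-nonlinearity map for a single channel; the array sizes are arbitrary example values.

```python
import numpy as np

def valid_convolution(y_prev: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Compute Eq. (1): x[i, j] = sum_{a,b} w[a, b] * y_prev[i + a, j + b]."""
    n = y_prev.shape[0]          # N x N activation map from the previous layer
    m = w.shape[0]               # m x m filter
    out = np.zeros((n - m + 1, n - m + 1))
    for i in range(n - m + 1):
        for j in range(n - m + 1):
            out[i, j] = np.sum(w * y_prev[i:i + m, j:j + m])
    return out

# Example: an 8x8 activation map convolved with a 3x3 filter gives a 6x6 output.
y_prev = np.random.rand(8, 8)
w = np.random.rand(3, 3)
print(valid_convolution(y_prev, w).shape)  # (6, 6)
```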
Figure 2. The structure of CNN [32].

1.3.2. MobileNetV2
Howard et al. [33,34] introduced a class of efficient models designed for applications
involving mobile and embedded vision, which they call MobileNets. These MobileNets
are built upon a simplified architecture that utilizes depth-wise separable convolutions
to construct deep and lightweight neural networks. The key idea behind depth-wise
separable convolution is that it achieves the same output as the conventional convolution
method but with increased efficiency. This efficiency arises from a reduction in the number
of parameters involved in the computation, as in ref. [33]. Furthermore, the authors
incorporate inverted residual layers into their network architecture. These layers are added
right after the initial convolution layer and feature 32 filters. This is then followed by
a pointwise convolution which produces output with a size of 7 × 7 × 1280 pixels. In
convolutional networks, residual blocks serve to convey information from the beginning
to the end of the convolutional block using a skip-connection. It is commonly observed
that the layers between the beginning and end of a residual block have more channels
than those between the beginning and end of the residual block. MobileNetV2 uses an
inverted residual block in which the connected layers have fewer channels than the layers
between, resulting in a much lower number of parameters than the standard residual block.
Therefore, the authors present two simple global hyperparameters that are designed to
efficiently trade off latency and accuracy. By selecting these hyperparameters, the model
creator is able to select the appropriate model size for their application based on the
constraints of the problem.
Under the same level of complexity, MobileNet performs inference significantly faster. When the computational resources are relatively low, MobileNet uses a slow down-sampling strategy, which can cause severe performance degradation. MobileNetV2 takes input images
with the size of 224 × 224 × 3 pixels. Therefore, the input images on the dataset have to be
resized and cropped. Figure 3 shows the architecture of MobileNet that contains several
convolutional blocks.
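The parameter savings of depth-wise separable convolution can be seen in a minimal Keras sketch (our illustration; the channel counts are example values, not layers taken from MobileNetV2 itself):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(224, 224, 32))

# Standard 3x3 convolution: 3*3*32*64 weights (+64 biases).
standard = layers.Conv2D(64, 3, padding="same")(inputs)

# Depth-wise separable convolution: one 3x3 filter per input channel,
# followed by a 1x1 pointwise convolution -- far fewer parameters.
depthwise = layers.DepthwiseConv2D(3, padding="same")(inputs)
pointwise = layers.Conv2D(64, 1, padding="same")(depthwise)

print(models.Model(inputs, standard).count_params())   # 18,496
print(models.Model(inputs, pointwise).count_params())  # 320 + 2,112 = 2,432
```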

1.3.3. Transfer Learning


In transfer learning, knowledge is transferred from a model that has been utilized for
another purpose to learn how to perform a new task. As a base model, this method uses a
pre-trained model trained on a large dataset, which will be used for other tasks that utilize
different datasets. By transferring knowledge from the large dataset into the pre-trained
model, the model obtains knowledge, which is represented in the network in the form of
weights [35]. In order to avoid having to train the second network from scratch using a new
dataset, we “transfer” what we have learned from the first network. In the initial layer, the
model learns to identify general features present in the images, such as lines, edges, and
shapes. This is achieved through the application of convolutional filters, which scan the
input image and detect these basic visual elements. By analyzing patterns and variations in
the pixel values, the network gradually builds an understanding of the fundamental visual
components that make up an image. Therefore, during transfer learning, the weights of
the base network or model are generally frozen. In the last layer of the network, which
is usually a fully connected layer, the network is trained to learn the specific features of
the new dataset. A number of layers are added to the base model, called the head model,
which improves the performance of the network.

Figure 3. The structure of MobileNetV2 [33].
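A minimal Keras sketch of this transfer-learning recipe is shown below: the pre-trained base is frozen and only a new head is trained. The head sizes here are placeholders for illustration; the actual head used in our model is described in Section 2.3.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Pre-trained base: MobileNetV2 weights learned on ImageNet, classifier removed.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the transferred weights

# New head trained on the target skin-lesion dataset (illustrative sizes).
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(4, activation="softmax"),  # four classes, as in MSID
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```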

2. Methods
2.1. Datasets
In order to train and test the proposed model, we focus on two datasets from the literature, as shown in Table 2. The two datasets were chosen so that the images differ in terms of clarity, illumination, shadow, and noise.

Table 2. Selected datasets from the literature.

Dataset Name | No. of Classes | No. of Monkeypox Images | No. of Images of Other Classes
MSLD [13] | 2 | 102 | 126
MSID [24] | 4 | 279 | 491

2.1.1. MSLD Dataset


A primary aim of MSLD is to distinguish monkeypox cases from similar non-monkeypox
cases. As a result, in addition to the monkeypox class, we also classified images of chick-
enpox rash and pustules in an additional class named ‘Others’, as they have similar
characteristics to monkeypox. Vega et al. [36] studied the work of Ahsan et al. in ref. [24]. They observed that, even after the relevant areas of the images had been blinded, the resulting model was still able to accurately classify the given classes in both studies [12,24]. According to ref. [36], this indicates that the model constructed by Ahsan et al. does not adequately represent the phenomena it is meant to describe. As a consequence, such a solution is not suitable for clinical diagnosis, nor is the dataset suitable for medical machine-learning research. Based on the results of this rebuttal experiment, caution is warranted when similar methodologies and datasets are used in other works following a similar approach.

2.1.2. MSID Dataset


In order to achieve high accuracy rates, it is essential to have a large dataset. Generally,
ML models perform better when the dataset is equally divided and non-biased. As a result,
we acquired a diverse and large dataset. For experimental evaluation, we used Kaggle's Monkeypox Skin Images Dataset (MSID), which consists of 770 images and is comparatively state of the art for the detection of monkeypox. We split the dataset into training (40%), validation (20%), and testing (20%) parts before conducting augmentation, to avoid inducing any bias in the model performance results. Then, we applied data augmentation to the dataset and increased the number of images to 4074 before dividing the images into four categories. Figure 4a,b display some sample images from the MSID and MSLD datasets.

Figure 4. Sample images from the two datasets: (a) MSID dataset; (b) MSLD dataset.

2.2. Data Augmentation and Balancing


Data augmentation is the process of generating new data instances from existing data in order to artificially increase the amount of data. Because this dataset is small and unbalanced, as shown in (Supplementary Materials, Figure S1a), which may not lead to the production of an accurate model, we applied data augmentation to the dataset and increased its total to 5234 images divided into four categories.
As a result of the data augmentation, the dataset was expanded from 770 to 5234 images. The augmentation operations applied to our dataset were auto-orientation, grayscale, saturation, and resizing. The images were resized to 256 × 256, which is the input size used by our MobileNetV2-based model. Grayscale conversion was applied to 25% of the images, as in [37]. The auto-orientation strategies included rotation about the x-axis, the y-axis, and both axes, shifts along the x and y axes in a specified direction, rotation through 360 degrees in increments of 30 or 45 degrees, and image shearing. Samples of the augmentation results are shown in Figure 5. This process aims to improve the generalizability of the model and helps to minimize over-fitting by creating new, artificially augmented data on which it can be trained. Supplementary Materials, Figure S1b shows the result of this augmentation step.

Figure 5. Augmentation approaches used on the MSID dataset: (a) resizing and grayscale effect, (b) resizing and rotation effect, and (c) resizing and saturation effect.
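As a rough sketch of this kind of pipeline (not the exact augmentation tool or configuration used in our experiments), the operations above can be approximated with TensorFlow image operations; only the 25% grayscale probability and the 256 × 256 target size are taken from the text, while the remaining parameters are illustrative.

```python
import tensorflow as tf

IMG_SIZE = 256  # input size expected by our MobileNetV2-based model

def augment(image: tf.Tensor) -> tf.Tensor:
    """Approximate the augmentations described above: resize, random rotation,
    saturation jitter, and grayscale conversion with 25% probability."""
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    # Random rotation in 90-degree steps (a coarse stand-in for 30/45-degree steps).
    image = tf.image.rot90(image, k=tf.random.uniform([], 0, 4, dtype=tf.int32))
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_saturation(image, 0.7, 1.3)
    # Convert to grayscale with 25% probability, keeping three channels.
    image = tf.cond(
        tf.random.uniform([]) < 0.25,
        lambda: tf.image.grayscale_to_rgb(tf.image.rgb_to_grayscale(image)),
        lambda: image,
    )
    return image

# Example usage inside a tf.data pipeline (a dataset of RGB tensors is assumed):
# ds = ds.map(lambda img, label: (augment(img), label))
```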
Previous studies have highlighted the problem of class imbalances [37], where some
classes had significantly higher volumes than others. The dataset was balanced using
SMOTETomek [38], which combines oversampling and under-sampling techniques. Prior
to implementing the SMOTETomek technique, the target variable count was found to be
imbalanced, as indicated in (Supplementary Materials, Figure S1a). The method involves
combining oversampling and undersampling techniques to generate synthetic samples
for minority classes, as well as removing noisy samples from majority classes. As shown
in (Supplementary Materials, Figure S1b), the SMOTETomek technique has resulted in a
target variable count of 1220–1356 for all classes.
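A minimal sketch of this balancing step with the imbalanced-learn library is given below; the feature matrix, image size, and class counts are made-up placeholders, since SMOTETomek operates on flattened feature vectors rather than raw images.

```python
import numpy as np
from imblearn.combine import SMOTETomek

# Illustrative data: 300 flattened 64x64x3 images and an imbalanced
# four-class label vector (the counts here are invented for the example).
X = np.random.rand(300, 64 * 64 * 3).astype(np.float32)
y = np.repeat([0, 1, 2, 3], [150, 80, 50, 20])

# SMOTETomek combines SMOTE over-sampling of the minority classes with
# Tomek-link removal of noisy samples near the class boundaries.
X_balanced, y_balanced = SMOTETomek(random_state=42).fit_resample(X, y)
print(np.bincount(y_balanced))  # class counts are now approximately equal
```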

2.3. Proposed Model


Our proposed model builds on the MobileNet paradigm. MobileNet does not use standard convolutions, but instead uses depth-wise separable convolutions, which require roughly one-eighth of the computation. MobileNet has two hyper-parameters: a width multiplier and a resolution multiplier. At the same level of complexity, MobileNet performs inference significantly faster. When the computational budget is relatively low, MobileNet uses a slow down-sampling strategy, which can cause severe performance degradation. Figure 6 shows the role of MobileNetV2 in our framework. We have changed
the image size to match MobileNetV2 (pre-trained using ImageNet dataset [39] that con-
sisted of about 1000 object classes, 1,281,167 training images, 50,000 validation images, and
100,000 test images). The input image size is 256 × 256 for optimal performance in the
base model (MobileNetV2). In addition, we have incorporated a head model that includes
denser layers compared to our previous approach. This modification aims to facilitate
intensive training, considering the significant similarity observed among the images be-
longing to different classes in the dataset. Furthermore, we have deliberately decelerated
the learning process to allow the model ample time to comprehend the intricacies of the
images. This adjustment was necessary, as our previous attempts at rapid learning yielded
inaccurate outcomes and low levels of precision. Figure 7 shows the different layers of the
modified MobileNetV2 architecture and the details of the head model in our framework
where all operations have been performed in the dense layer.

Figure 6. The structure of our model using samples of MSID dataset [24].
2.3.1. Base Model Architecture


• A 3 × 3 convolution is followed by batch normalization and ReLU activation. The first layer of MobileNet has a kernel dimension of 3 × 3 × 3 × 32; the input dimension is 224 × 224 × 3 and the output dimension is 112 × 112 × 32.
• A 3 × 3 depthwise convolution is followed by batch normalization and ReLU activation. A 1 × 1 convolution follows this sub-unit, again followed by batch normalization and ReLU activation. The sequence of these two sub-units forms the second unit.
• The MobileNet-Tiny network is designed to process RGB images with dimensions of
224 × 224 × 3 as input. These images are then passed through a series of convolution
layers and Bottleneck Residual Blocks (BRBs). The purpose of these layers and blocks
is to extract and learn relevant features from the input image. After completing this
process, the network produces a feature map with dimensions of 7 × 7 × 320.
• This feature map represents a condensed representation of the original image, where
each element on the map corresponds to a specific feature that the network has learned
to recognize.
• Afterwards, the feature map, along with the other feature maps from BRB4, BRB5,
and BRB6, is fed into the SSDLite predictor layers to generate detections. These pass
through a Non-Maximum Suppression layer, which then filters them and generates
the final detections and bounding boxes.

2.3.2. Head Model


To reduce the computational cost required to operate the neural network, we used nine dense layers with unit sizes of 1024, 512, 256, 164, 128, 64, 32, 16, and 4, with ReLU as the activation function and SoftMax as the activation function of the last layer. The computational cost of additional ReLU activations grows only linearly as the CNN size increases. Furthermore, we use Dropout, a technique used in neural networks to prevent overfitting. It involves the random removal of nodes from the input and hidden layers during training, which forces the network to learn more robust and generalizable features. When a node is dropped, all connections associated with it are temporarily removed, resulting in a modified network architecture. This helps to prevent the network from relying too heavily on specific nodes and encourages the learning of more diverse representations. In Figure 7, the head model and its layers are shown.
The proposed method was trained and validated using the MSID dataset, as shown
in Table 3.

Table 3. The details of hyperparameters used in our proposed model.

Hyperparameters Value
Learning rate 0.001
Optimizer Adamax
Activation function Softmax
Regularizers L2 (0.01)
Early stopping Yes
Batch size 128
Dropout 0.2
Dense activation function ReLU
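The following Keras sketch reflects our reading of the head configuration above and the hyperparameters in Table 3 (unit sizes 1024 down to 4, ReLU, SoftMax output, L2 regularization of 0.01, dropout of 0.2, Adamax with a learning rate of 0.001, early stopping, batch size 128); details such as where pooling and dropout are inserted are illustrative rather than an exact reproduction of our implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# Frozen MobileNetV2 base pre-trained on ImageNet, 256 x 256 input as in Section 2.3.
base = tf.keras.applications.MobileNetV2(
    input_shape=(256, 256, 3), include_top=False, weights="imagenet")
base.trainable = False

head_units = [1024, 512, 256, 164, 128, 64, 32, 16]  # dense layers of the head
x = layers.GlobalAveragePooling2D()(base.output)
for units in head_units:
    x = layers.Dense(units, activation="relu",
                     kernel_regularizer=regularizers.l2(0.01))(x)
    x = layers.Dropout(0.2)(x)
outputs = layers.Dense(4, activation="softmax")(x)  # final 4-unit SoftMax layer

model = models.Model(base.input, outputs)
model.compile(optimizer=tf.keras.optimizers.Adamax(learning_rate=0.001),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(patience=5,
                                              restore_best_weights=True)
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=50, batch_size=128, callbacks=[early_stop])
```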
Figure 7. The head model.

2.3.3. Mobile Perspective


We tested the inference and then implemented the mobile application following the design specifications detailed in Table 4. Consistent with the model's accuracy score of 99%, the application test results produced correct diagnoses for the four classes, as shown in Figure S2 in the Supplementary Materials. The evaluation and results are discussed further in the following sections; Figure 7 provides a visual representation of the structure of the network and of how the Dropout technique is applied.

Table 4. Design decisions for the mobile application.

Feature Type Feature Name


Smartphone Operating System Android OS
IDE Android Studio Chipmunk 2021.2.1
SDK Android 12 SDK (API level 31)
Minimum SDK Android 5.0 SDK (API level 21)
Programming Language Java
Camera API Camera2 API
ML Library TensorFlow Lite
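To run the model inside the Android application with TensorFlow Lite, the trained Keras model is converted to a .tflite file; a minimal conversion sketch is shown below (file names are placeholders, not the paths used in our project).

```python
import tensorflow as tf

# Load the trained Keras model from Section 2.3 (placeholder file name).
model = tf.keras.models.load_model("monkeypox_mobilenetv2.h5")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional size/latency optimization
tflite_model = converter.convert()

with open("monkeypox_classifier.tflite", "wb") as f:
    f.write(tflite_model)
# The .tflite file is then bundled in the Android app's assets and loaded
# with the TensorFlow Lite Interpreter from the Java API.
```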

3. Results and Discussion


3.1. Performance Evaluation Metrics
By examining model metrics, we are able to gain an understanding of the model’s
performance. In order to evaluate the model, we use Accuracy, Recall, Precision, and
F1-score. The metrics are described in Table S1 in Supplementary Materials.
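For reference, the four metrics can be computed with scikit-learn as in the following sketch; the label vectors are placeholders, not our actual test predictions.

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

# Placeholder ground-truth and predicted class indices for the four classes.
y_true = [0, 1, 2, 3, 2, 2, 1, 0]
y_pred = [0, 1, 2, 3, 2, 1, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1-score :", f1_score(y_true, y_pred, average="macro"))
```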
3.2. Performance Analysis in Terms of Evaluation Metrics


The proposed method achieves 99% accuracy on the MSID dataset, which contains more classes than the other datasets considered. Table 5 shows the details of the measured metrics for each class. The proposed approach achieves high precision and recall and effectively minimizes false positives. Its balanced accuracy, recall, and precision make it an optimal model for monkeypox detection. The data used are highly imbalanced, with some classes containing far fewer samples than others. Therefore, for this dataset, further processing, such as the over-sampling algorithm used in this study or a weighted cost function [34], has been implemented to address the imbalanced dataset problem. Figure 8a shows the smooth loss of both the training and validation processes, and Figure 8b shows the achieved performance curve of both training and validation. The proposed method achieves a higher accuracy of 99% than that of Akin et al. [23], who applied MobileNetV2 and achieved an accuracy of 98.25%. The AUC (area under the ROC curve) is another way to measure the performance of a machine-learning model and summarizes the ROC curve in a single number. Our model achieved an AUC of 0.99. Figure 9 shows a graph that displays the highest score among the evaluation metrics used to evaluate the proposed method.

Table 5. F1-score, Recall, and Precision testing results for all classes.

Metric | Chickenpox | Measles | Monkeypox | Normal
F1-score | 0.88 | 0.99 | 0.98 | 0.99
Recall | 0.79 | 1.00 | 1.00 | 0.99
Precision | 1.00 | 1.00 | 0.95 | 0.99
Figure 8. Performance of the training process and validation of the proposed method on the MSID dataset: (a) training vs. validation loss; (b) training vs. validation accuracy (x-axis: epoch).
Figure 9. A graph showing the highest score among the evaluation metrics (accuracy, recall, precision, AUC) across epochs.
3.3. Performance Analysis in Terms of Confusion Metric


The confusion matrix is a widely used tool for evaluating classification models, from which a variety of measures are derived, including accuracy, error rate, and sensitivity. Figure 10
shows the calculation of each class label’s true positives, true negatives, false positives, and
false negatives. A true positive is the number of instances in which the actual and predicted
values are the same for a given label. The true negatives are the sum of the values of both
the rows and columns in the confusion matrix, except for the row and column for the class
label. False positives are the summation of the values in the column of the class other than
the true positive value. Lastly, the values of false negatives are the sum of all of the values
in the row of the class except the true positive value.

Figure 10. Confusion Matrix of monkeypox classification using MSID dataset.
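These per-class quantities can be read directly off a multi-class confusion matrix, as in the short sketch below (the matrix values are illustrative and do not reproduce Figure 10).

```python
import numpy as np

# Illustrative 4-class confusion matrix: rows = actual class, columns = predicted class.
cm = np.array([[50,  2,  1,  0],
               [ 3, 60,  0,  1],
               [ 0,  1, 70,  2],
               [ 1,  0,  2, 65]])

for k, label in enumerate(["Chickenpox", "Measles", "Monkeypox", "Normal"]):
    tp = cm[k, k]                  # correctly predicted as class k
    fp = cm[:, k].sum() - tp       # other classes predicted as k (column minus TP)
    fn = cm[k, :].sum() - tp       # class k predicted as something else (row minus TP)
    tn = cm.sum() - tp - fp - fn   # everything not involving class k
    print(f"{label}: TP={tp} FP={fp} FN={fn} TN={tn}")
```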

3.4. Performance Comparison with Earlier Works


In our study, we obtained a higher accuracy than previous studies that used the same dataset (MSID), as shown in Table 6. This improvement reflects factors such as adding more layers to the network, controlling the training speed, tuning the hyperparameters, and the model's learning ability. The results are reported specifically on MSID after fine-tuning and optimizing the hyperparameters.

Table 6. Comparison between the suggested approach and related studies.

Research Paper | Accuracy Measurement
Akin et al. [23] | MobileNetV2 (98.25%)
Our methodology | Modified MobileNetV2 (99%)

3.5. Scientific Contribution of the Present Study


This study focuses on the application of advanced deep-learning techniques for the
early detection of monkeypox in a real-time system, filling a gap in the current literature.
The proposed model demonstrates high speed and accuracy in classifying and diagnosing
monkeypox skin lesions. Data preprocessing and augmentation methods were utilized to
enhance the accuracy of image classification. The findings demonstrate that the suggested
model exhibits exceptional accuracy in classification, making it a valuable diagnostic tool
for rapid and precise diagnoses in clinical environments. Implementation of this model
could aid healthcare professionals in improving treatment outcomes, reducing healthcare
costs, and facilitating prompt diagnoses across various ailments. This study also highlights
the potential significance of deep learning in other medical imaging tasks.
Nevertheless, it is important to acknowledge the limitations of this study. One limitation is the relatively small sample size of the dataset used, which may limit the model's applicability to other datasets. Another limitation is that the dataset utilized solely consists of pox-related images
without any samples of “no skin lesions”. Future research should consider incorporating
larger and more diverse datasets that include various types of skin lesions as well as cases
without any skin lesions for a comprehensive analysis. Furthermore, it would be valuable
to evaluate the performance of the proposed model on different platforms and devices to
ensure its practicality and scalability.

4. Conclusions
The recent monkeypox outbreak has become a major cause for concern within the field
of public health, particularly due to its rapid spread to more than 40 countries outside of
Africa. Detecting monkeypox in its early stages presents a significant challenge because
its symptoms can easily be mistaken for those of chickenpox and measles. One promising
avenue is the use of computer-assisted tools, specifically harnessing the power of deep
learning methods. These methods have already demonstrated their effectiveness in auto-
matically identifying skin lesions, provided they are trained with sufficient examples. To
contribute to the diagnosis of monkeypox through easily accessible mobile applications, in
this article, we used a specific type of neural network known as MobileNetV2.
Our proposed approach was rigorously evaluated using various metrics, including recall, precision, F1-score, and accuracy. The experimental results showed the effectiveness of our proposed approach: our architecture achieved an accuracy of 99%, a recall of 100%, an F1-score of 98%, and a precision of 95%. These results
underscore the potential of deep learning and MobileNetV2 as valuable tools in the battle
against monkeypox, offering both precision and efficiency in diagnosis. We believe that
such experimental evaluation will contribute to the medical domain and many use cases.

Supplementary Materials: The following supporting information can be downloaded at: https://www.
mdpi.com/article/10.3390/app132312589/s1, Figure S1: The distribution of skin disease classes in
the MSID dataset before and after augmentation (a) before augmentation and (b) after augmentation.;
Figure S2:An illustration of different use cases in the application.; Table S1: The description of used
evaluation metrics in the study.
Author Contributions: Methodology, H.F.A., E.A., M.A. and N.A.; Software, E.A., M.A., N.A. and
H.F.A.; Validation, E.A., M.A. and N.A.; Writing—original draft, H.F.A.; Writing—review & editing,
H.F.A., S.S.A. and R.U.K.; Supervision, H.F.A.; Project administration, H.F.A. All authors have read
and agreed to the published version of the manuscript.
Funding: The researchers would like to thank the Deanship of Scientific Research, Qassim University
for funding the publication of this project.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available in the article.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Qureshi, M.; Khan, S.; Bantan, R.A.; Daniyal, M.; Elgarhy, M.; Marzo, R.R.; Lin, Y. Modeling and Forecasting Monkeypox Cases
Using Stochastic Models. J. Clin. Med. 2022, 11, 6555. [CrossRef]
2. Irmak, M.C.; Aydin, T.; Yağanoğlu, M. Monkeypox Skin Lesion Detection with MobileNetV2 and VGGNet Models. In Proceedings
of the Medical Technologies Congress (TIPTEKNO), Antalya, Turkey, 31 October–2 November 2022; pp. 1–4.
3. Rahim, M.A. Monkeypox: An emerging zoonotic disease with pandemic potential. BIRDEM Med J. 2022, 12, 170–171. [CrossRef]
4. Fowotade, A.; Fasuyi, T.; Bakare, R. Re-emergence of monkeypox in Nigeria: A cause for concern and public enlightenment. Afr.
J. Clin. Exp. Microbiol. 2018, 19, 307–313. [CrossRef]
5. Lim, C.K.; McKenzie, C.; Deerain, J.; Chow, E.P.; Towns, J.; Chen, M.Y.; Fairley, C.K.; Tran, T.; Williamson, D.A. Correlation
between monkeypox viral load and infectious virus in clinical specimens. J. Clin. Virol. 2023, 161, 105421. [CrossRef]
6. Radhakumar, D.S.; Thiyagarajan, S.; Rajaram, K.; Parsanathan, R. Human antimicrobial peptide Histatin 1, 3, and its autopro-
teolytic cleaved peptides target the monkeypox virus surface proteins: Molecular modelling and docking studies. Biotechnol.
Bioprocess. 2023, 4, 1–10.
7. Bernard, S.M.; Anderson, S.A. Qualitative assessment of risk for monkeypox associated with domestic trade in certain animal
species, United States. Emerg. Infect. Dis. 2006, 12, 1827. [CrossRef]
8. Dubois, M.E.; Slifka, M.K. Retrospective analysis of monkeypox infection. Emerg. Infect. Dis. 2008, 14, 592. [CrossRef]
9. Ahsan, M.M.; Abdullah, T.A.; Ali, M.S.; Jahora, F.; Islam, M.K.; Alhashim, A.G.; Gupta, K.D. Transfer learning and Local
interpretable model agnostic based visual approach in Monkeypox Disease Detection and Classification: A Deep Learning
insights. arXiv 2022, arXiv:2211.05633.
10. Ali, S.N.; Ahmed, M.T.; Paul, J.; Jahan, T.; Sani, S.M.S.; Noor, N.; Hasan, T. Monkeypox skin lesion detection using deep learning
models: A feasibility study. arXiv 2022, arXiv:2207.03342.
11. Sitaula, C.; Shahi, T.B. Monkeypox virus detection using pre-trained deep learning-based approaches. J. Med Syst. 2022, 46, 78.
[CrossRef]
12. Ahsan, M.M.; Uddin, M.R.; Farjana, M.; Sakib, A.N.; Momin, K.A.; Luna, S.A. Image Data collection and implementation of deep
learning-based model in detecting Monkeypox disease using modified VGG16. arXiv 2022, arXiv:2206.01862.
13. Islam, T.; Hussain, M.A.; Chowdhury, F.U.H.; Islam, B.R. A Web-scrapped Skin Image Database of Monkeypox, Chickenpox,
Smallpox, Cowpox, and Measles. bioRxiv 2022. [CrossRef]
14. Yang, T.; Yang, T.; Liu, A.; Tang, J.; An, N.; Liu, S.; Liu, X. AICOM-MP: An AI-based Monkeypox Detector for
Resource-Constrained Environments. arXiv 2022, arXiv:2211.14313.
15. Wei, L.s.; Gan, Q.; Ji, T. Skin disease recognition method based on image color and texture features. Comput. Math. Methods Med.
2018, 2018, 8145713. [CrossRef]
16. Bhadula, S.; Sharma, S.; Juyal, P.; Kulshrestha, C. Machine learning algorithms based skin disease detection. Int. J. Innov. Technol.
Explor. Eng. (IJITEE) 2019, 9, 4044–4049. [CrossRef]
17. Sriwong, K.; Bunrit, S.; Kerdprasop, K.; Kerdprasop, N. Dermatological classification using deep learning of skin image and
patient background knowledge. Int. J. Mach. Learn. Comput. 2019, 9, 862–867. [CrossRef]
18. Roy, K.; Chaudhuri, S.S.; Ghosh, S.; Dutta, S.K.; Chakraborty, P.; Sarkar, R. Skin Disease detection based on different Segmentation
Techniques. In Proceedings of the International Conference on Opto-Electronics and Applied Optics (Optronix), Kolkata, India,
18–20 March 2019; pp. 1–5.
19. Teo, J. Early detection of silent hypoxia in COVID-19 pneumonia using smartphone pulse oximetry. J. Med Syst. 2020, 44, 1–2.
[CrossRef]
20. Arias, R.; Mejía, J. Varicella zoster early detection with deep learning. In Proceedings of the IEEE Engineering International
Research Conference (EIRCON), Lima, Peru, 21–23 October 2020; pp. 1–4.
21. Sahin, V.H.; Oztel, I.; Yolcu Oztel, G. Human Monkeypox Classification from Skin Lesion Images with Deep Pre-trained Network
using Mobile Application. J. Med Syst. 2022, 46, 134. [CrossRef]
22. Hussain, M.A.; Islam, T.; Chowdhury, F.U.H.; Islam, B.R. Can artificial intelligence detect monkeypox from digital skin images?
bioRxiv 2022. [CrossRef]
23. Akin, K.D.; Gurkan, C.; Budak, A.; Karataş, H. Classification of Monkeypox Skin Lesion using the Explainable Artificial
Intelligence Assisted Convolutional Neural Networks. Avrupa Bilim Teknol. Derg. 2022, 40, 106–110.
24. Ahsan, M.M.; Uddin, M.R.; Luna, S.A. Monkeypox Image Data collection. arXiv 2022, arXiv:2206.01774.
25. Haque, M.E.; Ahmed, M.R.; Nila, R.S.; Islam, S. Classification of Human Monkeypox Disease Using Deep Learning Models and
Attention Mechanisms. arXiv 2022, arXiv:2211.15459.
26. Bajwa, M.N.; Muta, K.; Malik, M.I.; Siddiqui, S.A.; Braun, S.A.; Homey, B.; Dengel, A.; Ahmed, S. Computer-aided diagnosis of
skin diseases using deep neural networks. Appl. Sci. 2020, 10, 2488. [CrossRef]
27. Manne, R.; Kantheti, S.; Kantheti, S. Classification of Skin cancer using deep learning, Convolutional Neural Networks—
Opportunities and vulnerabilities—A systematic Review. Int. J. Mod. Trends Sci. Technol. 2020, 6, 2455–3778.
28. Rajput, G.; Agrawal, S.; Raut, G.; Vishvakarma, S.K. An accurate and noninvasive skin cancer screening based on imaging
technique. Int. J. Imaging Syst. Technol. 2022, 32, 354–368. [CrossRef]
29. Raza, R.; Zulfiqar, F.; Tariq, S.; Anwar, G.B.; Sargano, A.B.; Habib, Z. Melanoma classification from dermoscopy images using
ensemble of convolutional neural networks. Mathematics 2022, 10, 26. [CrossRef]
30. Gouda, W.; Sama, N.U.; Al-Waakid, G.; Humayun, M.; Jhanjhi, N.Z. Detection of Skin Cancer Based on Skin Lesion Images Using
Deep Learning. Healthcare 2022, 10, 1183. [CrossRef]
31. Almuayqil, S.N.; Abd El-Ghany, S.; Elmogy, M. Computer-Aided Diagnosis for Early Signs of Skin Diseases Using Multi Types
Feature Fusion Based on a Hybrid Deep Learning Model. Electronics 2022, 11, 4009. [CrossRef]
32. Albawi, S.; Mohammed, T.A.; Al-Zawi, S. Understanding of a convolutional neural network. In Proceedings of the International
Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017; pp. 1–6.
33. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018;
pp. 4510–4520.
34. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient
convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
35. Krishna, S.T.; Kalluri, H.K. Deep learning and transfer learning approaches for image classification. Int. J. Recent Technol. Eng.
(IJRTE) 2019, 7, 427–432.
36. Vega, C.; Schneider, R.; Satagopam, V. Analysis: Flawed Datasets of Monkeypox Skin Images. J. Med Syst. 2023, 47, 37. [CrossRef]
37. Chen, R.J.; Lu, M.Y.; Chen, T.Y.; Williamson, D.F.; Mahmood, F. Synthetic data in machine learning for medicine and healthcare.
Nat. Biomed. Eng. 2021, 5, 493–497. [CrossRef]
38. Batista, G.E.; Bazzan, A.L.; Monard, M.-C. Balancing training data for automated annotation of keywords: A case study. Wob
2003, 3, 10–18.
39. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of
the IEEE Conference on Computer Vision And Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
