
SCHOOL OF ARCHITECTURE, COMPUTING & ENGINEERING

Melanoma Skin Disease Prediction System Utilizing Convolutional Neural Networks
(Chapter 3: Methodology)

Student Name: Sunita Upreti


Student Number: U2250878
Module Code: CN7000
Table of Contents
3.1 Research Design
3.1.1 Data Scaling
3.1.2 Data Augmentation
3.1.3 Feature Extraction
3.1.4 Deep Learning Classification
3.2 Convolution Neural Network
3.3 Data Collection and Preprocessing
3.3.1 Data Collection
3.3.2 Data Preprocessing
3.3.2.1 Dataset Augmentation and Balancing
3.3.3 Performance Evaluation Metrics Method
3.4 Implementation Results
3.5 Evaluation of the Proposed System


Chapter 3
Methodology
This chapter presents the general process flow for the multi-class classification of skin
lesions into seven categories. The first step of the strategy is to preprocess the melanoma
images to meet the requirements of the model. The prototype is developed using agile
methodologies and involves the following steps:

• Data scaling and augmentation.

• Segmentation.

• Deep learning (DL) classification.

• Feature extraction.

• Parameter optimization.

• Performance metrics.

The purpose of this preprocessing step is to ensure that all input images are properly prepared
and uniform. The preprocessed images are then fed into the feature extraction and refinement
architecture, which must capture the traits and patterns present in the skin lesion images. The
architecture converts the source images into a representative feature space that emphasizes the
information most relevant to classification. Following preprocessing and feature extraction, the
images are classified using a variety of classifiers: InceptionV3, Xception, DenseNet, MobileNet,
ResNet-50, CNN, and VGG16. Based on the learned features, each of these classifiers can
discriminate between several kinds of skin lesions. To further improve classification performance,
five different stacking models are created using the weights of these distinct models. In stacking,
predictions from several base models are combined to provide a more reliable and precise final
prediction. By combining the outputs of various classifiers in this ensemble approach, the system
is better able to capture the intricate characteristics that distinguish each type of skin cancer.
3.1 Research Design

Figure 1: melanoma skin disease prediction architecture


For the design of the mobile app prototype, a comprehensive set of user requirements has been
identified to ensure that the app effectively meets the needs of its users. These requirements
guide the development process and ensure that the final product is user-friendly, functional, and
aligned with its intended purpose. The mobile app prototype should allow users to capture or
upload images of skin lesions, as shown in Figure 1, either through the device's camera or by
selecting images from the device's gallery. At the core of the application is the MobileNet model,
which is responsible for predicting whether a skin lesion is indicative of melanoma. The selection
of the VGG16 architecture is a strategic choice that balances performance, interpretability, and
efficiency. By building upon the proven success of VGG16 in image classification and leveraging
its pre-trained weights, the Melanoma Skin Disease Prediction system aims to achieve accurate and
reliable predictions while maintaining transparency in its decision-making process. The
architecture typically consists of convolutional layers, pooling layers, and fully connected layers.
The trained MobileNet model was converted into a TFLite model for use in the mobile application.
Such a conversion can be done with frameworks like TensorFlow Lite or Core ML, which produce a
format that runs efficiently on mobile devices. Once the CNN processes the input image, it outputs
a prediction score, which the application should interpret and display to the user.

3.1.1 Data Scaling

Scaling techniques are used to equalize the scale of features or independent variables in data.
Scaling is typically done during the data pre-processing stage to fit the data inside a certain
range (Milletari F, 2022). In melanoma skin disease prediction, CNNs offer great promise, but
careful data preprocessing is essential to unlock the full potential of these complex models.
Four scaling techniques are considered to improve the performance of the proposed CNN-based
prediction system: standardization, normalization, min-max scaling, and max-absolute scaling.
• Standardization transforms the data so that it has a mean of 0 and a standard deviation of 1.
In the context of CNNs for melanoma prediction, this fosters uniformity among input features.
Neural networks are sensitive to feature magnitudes, and features with large values can
overshadow the others. Standardization levels the field, enabling the network to learn more
effectively, improving training stability, and potentially speeding up convergence.

• Normalization recalibrates the data into the [0, 1] range by dividing each value by the
maximum value. This puts all pixel intensities on a common scale, which helps the CNN extract
and learn the intricate patterns in melanoma images and strengthens the model's predictive
capabilities.

• Min-max scaling confines feature values within [0, 1] by subtracting the minimum and dividing
by the range. For melanoma prediction, this keeps image pixel values in a bounded interval.
CNNs tend to train well on inputs within this range, which smooths gradient updates during
training and reduces sensitivity to lighting and pixel-intensity differences across images.

• Max-absolute scaling, less conventional but still useful, divides values by their absolute
maximum, so that every value lies within [-1, 1]. In melanoma prediction using CNNs, it anchors
feature values to a fixed range without shifting them, which helps keep learning stable.
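
The four scaling techniques above can be sketched with NumPy as follows. This is a minimal illustration rather than the project's actual preprocessing code: the function names and the toy 2x2 patch are assumptions made for the example, and each function assumes a non-constant input so that no division by zero occurs.

```python
import numpy as np


def standardize(x):
    """Shift to zero mean and rescale to unit standard deviation."""
    return (x - x.mean()) / x.std()


def normalize(x):
    """Divide every value by the maximum value."""
    return x / x.max()


def min_max_scale(x):
    """Map the minimum to 0 and the maximum to 1."""
    return (x - x.min()) / (x.max() - x.min())


def max_abs_scale(x):
    """Divide every value by the largest absolute value."""
    return x / np.abs(x).max()


# Hypothetical 2x2 grayscale patch with 8-bit intensities (illustration only).
patch = np.array([[0.0, 64.0], [128.0, 255.0]])

standardized = standardize(patch)
normalized = normalize(patch)
min_max_scaled = min_max_scale(patch)
max_abs_scaled = max_abs_scale(patch)
```

For non-negative pixel data whose minimum is 0, as in this patch, normalization, min-max scaling, and max-absolute scaling give the same result; they differ once the data contains negative values or a non-zero minimum.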
3.1.2 Data Augmentation
Image data augmentation is a technique used to generate modified copies of images to
artificially increase the dataset size (Mikołajczyk A, 2022). Data augmentation helps deal with
the "not enough data" problem, reduces over-fitting, and improves the models' capacity for
generalization (Balaha HM, 2021). After an augmentation is applied to an image, the new
coordinates of any point can be obtained using the corresponding transformation matrix. In the
experiments for the current study, the following image augmentation techniques were used:
(1) flipping, (2) rotation, (3) shifting, (4) shearing, (5) zooming, (6) cropping, (7) color
change, and (8) brightness change.

1. Flipping: Images may be flipped both vertically and horizontally. Functions for vertical
flips are absent in some frameworks. Instead, a vertical flip can be obtained by rotating
the picture by 180 degrees and then flipping it horizontally.

2. Rotation: The picture may be rotated clockwise or counterclockwise, around its center or
any other point, by an angle between 1 and 359 degrees. As the degree of rotation
increases, the data labels may no longer be preserved.

3. Shifting: Shift augmentation is the process of moving all of the pixels in a picture from
one place to another. There are two forms of shifting: horizontal-axis shift augmentation
and vertical-axis shift augmentation.

4. Shearing: Shearing is used to change the orientation of the picture and shift one portion
of it, much like a parallelogram.

5. Zooming: Images with various zoom levels are produced by applying zooming. This
augmentation enlarges or shrinks the image, adding new pixels where needed; the image
may be zoomed either in or out.

6. Cropping: Random cropping selects a random region of the image. Center cropping is
used instead when the middle of the image carries more information than the corners.

7. Color changing: Digital image data is commonly represented as a tensor with
dimensions (height, width, color channels). When the color is changed, the pixel values
rather than the pixel locations are altered.

8. Brightness changing: Another method of data augmentation is to adjust the image's
brightness. The resulting image is either brighter or darker than the original.
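
A few of the augmentations listed above can be sketched on a toy array with NumPy. The helper names (`shift_right`, `center_crop`, `adjust_brightness`) and the 3x4 test image are assumptions made for this illustration, not part of the study's pipeline; the sketch also checks the claim that a 180-degree rotation followed by a horizontal flip equals a vertical flip.

```python
import numpy as np

# Toy 3x4 grayscale image (illustration only).
image = np.arange(12, dtype=np.float64).reshape(3, 4)

# Flipping: horizontal, vertical, and vertical built from rotation + horizontal flip.
h_flip = np.fliplr(image)
v_flip = np.flipud(image)
v_flip_via_rotation = np.fliplr(np.rot90(image, 2))


def shift_right(img, pixels):
    """Horizontal shift by a positive number of pixels, filling the gap with zeros."""
    out = np.zeros_like(img)
    out[:, pixels:] = img[:, :-pixels]
    return out


def center_crop(img, height, width):
    """Keep the central height x width region of the image."""
    top = (img.shape[0] - height) // 2
    left = (img.shape[1] - width) // 2
    return img[top:top + height, left:left + width]


def adjust_brightness(img, factor):
    """Scale pixel values and clip them to the 8-bit range [0, 255]."""
    return np.clip(img * factor, 0.0, 255.0)
```

A factor above 1 in `adjust_brightness` brightens the image and a factor below 1 darkens it; pixel locations are unchanged, only values.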
3.1.3. Feature Extraction

Convolutional neural networks (CNNs) will be used as a powerful deep-learning tool for feature
extraction. High-level features will be extracted from the images using pre-trained CNN
models such as VGG-16.

3.1.4. Deep Learning Classification

Classification is the process of grouping a given data set into categories. Both structured and
unstructured data may be classified. Its major objective is to map input variables to discrete
output variables to determine which class the incoming data belongs to. Typically, the classes
are described as labels, categories, or targets. Several DL techniques, including CNNs
(Broomhead DS, 2022), recurrent neural networks (Schuster M, 2022), long short-term memory
networks [60], generative adversarial networks (Goodfellow I, 2022), radial basis function
networks (Goodfellow I, 2022), deep belief networks (Hinton GE, 2022), and autoencoders
(Rumelhart DE, 2019), can be utilized to carry out the classification job. Only CNN models are
employed to carry out the classification in the current study.
3.2 Convolution Neural Network

The CNN is a subset of machine learning (Georgevici A.I, 2019). These networks consist of
layers of nodes, including an input layer, one or more hidden layers, and an output layer. Each
node within each layer has a weight and threshold and is connected to other nodes. When the
node's output is greater than the threshold, it is triggered and sends data to the next layer. If the
output is lower than the threshold, no data is sent (Cao Z, 2021).

Neural networks offer diverse architectures tailored to specific tasks. Convolutional Neural
Networks (CNNs), showcased by (Krizhevsky A, 2021), excel in computer vision and
classification, while recurrent neural networks (RNNs), demonstrated by (Mikołajczyk A, 2022),
shine in speech recognition and language processing. CNNs have transformed image analysis,
supplanting manual feature extraction. Formerly, object identification relied on painstaking
feature engineering. CNNs ushered in a scalable approach, autonomously learning hierarchical
features from raw image data, vastly enhancing object recognition efficiency.

Given these strides, the prototype opts for CNNs. Their aptitude for deciphering intricate
image patterns aligns well with the goal of melanoma skin lesion prediction. Using CNNs, a
predictive model is created that examines skin lesions to predict malignancy. Although CNNs
are resource-intensive, particularly in their demand for GPUs (Lee S, 2020), their ability to
yield accurate results justifies their use. This choice helps the system deliver robust
predictions and supports early melanoma detection, a pivotal facet of skin cancer diagnosis.

Convolution layer: The convolutional layer, where most of the processing takes place, is the
fundamental component of a CNN. It requires three things: input data, a filter, and a feature
map. The filter, also known as a kernel, is a 2D array of weights that represents part of an
image; it is convolved with the input to determine whether a feature is present, and the result
is the feature map. Backpropagation and gradient descent are used throughout training to adjust
parameters such as the weight values. However, three hyperparameters, listed in (Loussaief S,
2018), need to be set before training starts:
• The number of filters: It determines the depth of the output. For example, three distinct
filters yield three feature maps, giving an output depth of three.

• Stride: The stride is the number of pixels by which the kernel moves across the input matrix.

• Padding: Discussed in the paragraph that follows.


There are different types of padding:

• Zero padding: It is frequently employed when the filter does not fit the input image. All
elements that fall outside the input matrix are set to zero, which results in a larger or
equally sized output.

• Valid padding: It is sometimes referred to as no padding. In this type, if the dimensions
do not match, the last convolution is dropped.

• Same padding: This type guarantees that the output layer has the same size as the input
layer.

• Full padding: In this case, the output size is increased by padding the input border with
zeros.
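
How stride and padding determine the output size can be illustrated with the standard formula floor((W + 2P - K) / S) + 1, where W is the input width, K the kernel width, P the padding on each side, and S the stride. The function below is a small sketch under the assumption of square inputs and kernels; its name and the example values are chosen for illustration only.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension.

    Positions where the kernel would run past the (padded) input are dropped,
    which is what integer division expresses here.
    """
    return (input_size + 2 * padding - kernel_size) // stride + 1


# A 128-pixel-wide input with a 3x3 kernel and stride 1:
valid_size = conv_output_size(128, 3, padding=0)  # valid (no) padding
same_size = conv_output_size(128, 3, padding=1)   # same padding: (K - 1) // 2
full_size = conv_output_size(128, 3, padding=2)   # full padding: K - 1

# With stride 2 and no padding, the last partial position is dropped.
strided_size = conv_output_size(128, 3, stride=2)
```

With these values, valid padding shrinks the output to 126, same padding keeps it at 128, and full padding grows it to 130, matching the descriptions above.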

Activation function: The outputs of the linear convolution are passed through a nonlinear
activation function. Previously, smooth nonlinear functions such as the hyperbolic tangent
(tanh) or sigmoid functions were used (Ramachandran P, 2021). The rectified linear unit
(ReLU) is now the most commonly used function (Balaha HM S. M., 2022). ReLU is a piecewise
linear function that returns 0 for negative input and the input value in all other cases, so
its output range is 0 to infinity. It has become the default activation function for many types
of neural networks because models that use it are easier to train and frequently perform better
(J, 2019).
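
The three activation functions mentioned above can be written in a few lines of NumPy. This is a minimal sketch for comparison; the sample input values are arbitrary.

```python
import numpy as np


def relu(x):
    """Return 0 for negative inputs and the input itself otherwise."""
    return np.maximum(0.0, x)


def sigmoid(x):
    """Smooth function that squashes inputs into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def tanh(x):
    """Smooth function that squashes inputs into the open interval (-1, 1)."""
    return np.tanh(x)


# Arbitrary sample inputs (illustration only).
samples = np.array([-2.0, 0.0, 3.0])
relu_out = relu(samples)
sigmoid_out = sigmoid(samples)
tanh_out = tanh(samples)
```

Unlike sigmoid and tanh, ReLU does not saturate for large positive inputs, which is one commonly cited reason it is easier to train with.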

Pooling layer: Pooling layers (also known as down-sampling layers) are employed to perform
dimensionality reduction, reducing the number of parameters in the input. Like the convolution
operation, the pooling operation passes a filter across the full input, but this filter has no
weights. Instead, the kernel populates the output array by applying an aggregation function
to the values in the receptive field (Cao Z, 2021). The main forms of pooling are:
• Max pooling: As the filter moves across the input, it selects the pixel with the maximum
value to send to the output array. Patches are extracted from the input's feature maps,
the maximum value in each patch is kept, and all other values are discarded
(Scherer D, 2021).

• Average pooling: As the filter moves across the input, it computes the average value
within the receptive field and sends it to the output array.

Although a lot of information is lost in the pooling layer, pooling brings several advantages:
it helps to reduce the CNN's complexity, improve efficiency, and lessen the risk of
overfitting (S, 2018).
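
Max and average pooling can be sketched as follows for a single 2D feature map. The function `pool2d` and the 4x4 example are assumptions made for illustration; the sketch uses non-overlapping windows (stride equal to the window size) and trims any rows or columns that do not fill a complete window.

```python
import numpy as np


def pool2d(feature_map, size=2, mode="max"):
    """Pool a 2D feature map with non-overlapping size x size windows."""
    height, width = feature_map.shape
    out_h, out_w = height // size, width // size
    # Trim incomplete windows, then view the map as a grid of size x size blocks.
    blocks = feature_map[:out_h * size, :out_w * size].reshape(out_h, size, out_w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")


# Toy 4x4 feature map (illustration only).
feature_map = np.array([
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 6.0, 1.0, 1.0],
    [0.0, 2.0, 9.0, 7.0],
    [1.0, 1.0, 8.0, 6.0],
])

max_pooled = pool2d(feature_map, size=2, mode="max")
avg_pooled = pool2d(feature_map, size=2, mode="average")
```

Both outputs are 2x2, a quarter of the original number of values, which is the down-sampling effect described above.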
3.3 Data Collection and Preprocessing

3.3.1 Data Collection

This research uses the open-source SIIM-ISIC Melanoma Classification dataset
(https://www.kaggle.com/competitions/siim-isic-melanoma-classification/data), which provides
medical imaging information on skin lesions. The collection contains skin lesion images in
several file formats, including DICOM, JPEG, and TFRecord. These images serve as input to the
prediction model and provide visual details about the skin lesions. The dataset contains the
following columns:
• image_name: A unique identifier that points to the filename of the related DICOM image.
• patient_id: A unique identifier for each patient.
• sex: The sex of the patient. If the sex is unknown, the field will be blank.
• age_approx: An approximate value representing the patient's age at the time of imaging.
• anatom_site_general_challenge: Indicates the location of the imaged site on the
patient's body.
• diagnosis: This field is available only in the training dataset and provides detailed
diagnosis information.
• benign_malignant: An indicator that classifies whether the imaged lesion is benign or
malignant.
• target: A binarized version of the target variable, representing the malignancy status (1
for malignant, 0 for benign).
In addition to the images, the dataset contains metadata in CSV files: train.csv for the
training set, test.csv for the test set, and sample_submission.csv, a sample submission file in
the correct format. Besides patient IDs, sex, the anatomical site of the lesion, detailed
diagnosis information (for the training set), and an indicator of malignancy (benign or
malignant), the metadata provides other information about the images and patients.
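
As a rough sketch of how the metadata columns listed above might be inspected, the snippet below counts benign and malignant cases with Python's standard `csv` module. The rows in `SAMPLE_CSV` are made up purely to mimic the column layout; they are not taken from the real train.csv, and the helper name `class_counts` is an assumption for this example.

```python
import csv
import io
from collections import Counter

# Made-up rows that only mimic the column layout described above;
# they are NOT real records from the SIIM-ISIC train.csv.
SAMPLE_CSV = """image_name,patient_id,sex,age_approx,anatom_site_general_challenge,diagnosis,benign_malignant,target
img_a,patient_1,male,45,torso,nevus,benign,0
img_b,patient_1,male,45,upper extremity,unknown,benign,0
img_c,patient_2,female,60,head/neck,melanoma,malignant,1
img_d,patient_3,,50,torso,unknown,benign,0
"""


def class_counts(csv_text):
    """Count how many rows fall into each value of the binary target column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row["target"]) for row in reader)


counts = class_counts(SAMPLE_CSV)
malignant_share = counts[1] / sum(counts.values())
```

Checking the class balance in this way is a natural first step before the augmentation and balancing stage, since a small malignant share means the classes are imbalanced.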

Figure 1: Society for Imaging Informatics in Medicine - International Skin Imaging Collaboration images of
melanoma
3.3.2 Data Preprocessing

The current study uses Eq. 1 for standardization, Eq. 2 for normalization, Eq. 3 for the
min-max scaler, and Eq. 4 for the max-absolute scaler, where μ is the image mean and σ is the
image standard deviation. Data scaling is described in Sect. 3.1.1.

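\[
\text{Output} = \frac{\text{input} - \mu}{\sigma} \tag{1}
\]

\[
\text{Output} = \frac{\text{input}}{\max(\text{input})} \tag{2}
\]

\[
\text{Output} = \frac{\text{input} - \min(\text{input})}{\max(\text{input}) - \min(\text{input})} \tag{3}
\]

\[
\text{Output} = \frac{\text{input}}{\max(|\text{input}|)} \tag{4}
\]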

3.3.2.1. Dataset Augmentation and Balancing

The data augmentation technique uses the following ranges: (1) 25 degrees for rotation; (2) 15%
for width and height shifts; (3) 15% for shearing; (4) horizontal and vertical flipping; and (5)
brightness adjustment in the range [0.8, 1.2]. To minimize over-fitting and boost diversity,
data augmentation is also applied to the images throughout the learning and optimization
phase [17]. Eq. 5 is the transformation matrix used for horizontal flipping (i.e., about the
x-axis), Eq. 6 is used for rotation, Eq. 7 for shifting, Eq. 8 for shearing, and Eq. 9 for
zooming, where θ is the rotation angle, tx and ty determine the shift along the x-axis and
y-axis, shx and shy determine the shear along the x-axis and y-axis, and cx and cy determine
the zoom factor along the x-axis and y-axis, respectively.
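\[
\text{Flipping Matrix} =
\begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\theta & -\sin\theta \\
0 & \sin\theta & \cos\theta
\end{bmatrix} \tag{5}
\]

\[
\text{Rotation Matrix} =
\begin{bmatrix}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{bmatrix} \tag{6}
\]

\[
\text{Shifting Matrix} =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
t_x & t_y & 1
\end{bmatrix} \tag{7}
\]

\[
\text{Shearing Matrix} =
\begin{bmatrix}
1 & sh_y & 0 \\
sh_x & 1 & 0 \\
0 & 0 & 1
\end{bmatrix} \tag{8}
\]

\[
\text{Zooming Matrix} =
\begin{bmatrix}
c_x & 0 & 0 \\
0 & c_y & 0 \\
0 & 0 & 1
\end{bmatrix} \tag{9}
\]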

These transformation matrices allow for the creation of augmented images by applying various
geometric operations. Data augmentation is crucial for increasing the diversity of the training dataset
and enhancing the model's ability to generalize to different variations and perspectives of the skin
lesion images.
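
To show how such matrices act on pixel coordinates, the sketch below builds the rotation, shifting, shearing, and zooming matrices and applies them to a point. It assumes the row-vector convention, in which a point is written as [x, y, 1] and multiplied on the left of the matrix; this is inferred from the shifting matrix, which carries tx and ty in its bottom row. The function names are chosen for this example only.

```python
import numpy as np


def rotation_matrix(theta):
    """Rotation by angle theta (in radians) about the origin."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])


def shifting_matrix(tx, ty):
    """Translation by tx along the x-axis and ty along the y-axis."""
    return np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [tx, ty, 1.0]])


def shearing_matrix(shx, shy):
    """Shear by shx along the x-axis and shy along the y-axis."""
    return np.array([[1.0, shy, 0.0], [shx, 1.0, 0.0], [0.0, 0.0, 1.0]])


def zooming_matrix(cx, cy):
    """Scaling by cx along the x-axis and cy along the y-axis."""
    return np.array([[cx, 0.0, 0.0], [0.0, cy, 0.0], [0.0, 0.0, 1.0]])


def transform_point(x, y, matrix):
    """Apply a 3x3 matrix to the point (x, y) using the row-vector convention."""
    new_x, new_y, _ = np.array([x, y, 1.0]) @ matrix
    return float(new_x), float(new_y)


# Transformations compose by matrix multiplication: shift first, then zoom.
shift_then_zoom = shifting_matrix(3.0, 4.0) @ zooming_matrix(2.0, 3.0)
```

For example, `transform_point(1.0, 2.0, shifting_matrix(3.0, 4.0))` moves the point (1, 2) to (4, 6), and applying `shift_then_zoom` to the same point gives (8, 18).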
3.3.2 Segmentation phase

The goal of the segmentation phase is to accurately detect and isolate the tumor region in the
medical skin images. The U-Net, U-Net++, Attention U-Net, V-Net, and Swin U-Net models
are used for image segmentation. The U-Net model consists of a left contraction path and a
right expansion path. In this investigation, we used three U-Net models (U-Net, U-Net++, and
Attention U-Net) in four distinct configurations, with the following settings: (1) using the
default left contraction path in two configurations and the VGG19 and DenseNet121
architectures in the other two, (2) initializing VGG19 and DenseNet121 with weights
pre-trained on ImageNet, (3) freezing the ImageNet weights so that they are not updated,
(4) setting the architecture depth to five, with [64, 128, 256, 512, 1024] filters in the
successive levels (or blocks), and (5) setting the input image size to 128 × 128 × 3.
Through this segmentation procedure, we aim to deliver accurate and reliable results.
3.3.3 Hardware and software requirements

Experimental simulations were run using Google Colab on a machine with an 11th-generation
Intel Core i7 CPU, 16 GB of RAM, and NVIDIA RTX 3050 GPUs with scalable link interface (SLI).
The performance of the developed algorithm is assessed using a dataset of 2000 dermoscopic
images.

V. Nivedita and K. Subramaniam (Nivedita, 2022) propose a technique for diagnosing skin diseases
using image processing, Python, and the YOLOv3 tool. By analyzing images of the affected area,
applying image processing methods, and extracting relevant features, the system aims to provide
reliable and fast results without the need for physical examination. Their paper focuses on four
specific skin diseases (acne, melanoma, blisters, and cold sores) and aims for high accuracy
in disease prediction. This method proves particularly beneficial in areas with limited access to
dermatologists. Mohammed A. Al-masni and Al-antari (Al-masni, 2019) propose an integrated model
for automated diagnosis of skin lesion diseases using medical dermoscopy images. The model
consists of two cascaded deep learning networks: a full-resolution convolutional network
(FrCN) for segmentation of skin lesion boundaries, and a deep residual network (ResNet-50) for
lesion classification. The model is evaluated on the ISIC 2017 challenge dataset and achieves an
overall accuracy of 94.03% and an average Jaccard similarity index of 77.11% for lesion
segmentation using FrCN. For the classification task, the model achieves an overall prediction
accuracy of 81.57% and an F1-score of 75.75% using ResNet-50.

The images used for the training, validation, and testing portions were chosen at random. For
fair image processing, every image in the dataset was resized to 640 × 480 pixels. The WOA
approach was used to train the proposed CNN, with the learning rate ranging between 0.2 and 0.9.

Additionally, nearly all the training pixels were successful. Ideally, the neural network with
the fewest neurons is chosen. According to (Fujiyoshi H, 2019), the performance ratio may be
used to choose an appropriate learning rate. The suggested network was trained for 30,000
iterations; however, because the performance ratio is important, a trade-off between
performance ratio and training time was made, and the learning rate was set to 0.9.
To obtain an accurate and independent assessment of the images, the training stage was
repeated 60 times, and the final results are reported as mean values.

Five performance measures were used to demonstrate how well the suggested system performed.
They are defined as follows:

1. Sensitivity (True Positive Rate)

Sensitivity, also known as the True Positive Rate or Recall, measures the proportion of actual
positive cases (skin melanoma) that are correctly identified by the system. It's calculated as the
ratio of the number of correctly detected skin cancer cases to the total number of actual skin
cancer cases.

\[
\text{Sensitivity} = \frac{\text{Number of correctly detected skin cancer cases}}{\text{Total number of skin cancer cases}}
\]

2. Specificity (True Negative Rate):

Specificity measures the proportion of actual negative cases (healthy skin) that are correctly identified
as negative by the system. It's calculated as the ratio of the number of correctly detected healthy skin
cases to the total number of actual healthy skin cases.

\[
\text{Specificity} = \frac{\text{Number of correctly detected healthy skin cases}}{\text{Total number of healthy skin cases}}
\]

3. Positive Predictive Value (PPV):

PPV, also known as Precision, measures the proportion of correctly detected skin cancer cases among
all the cases that the system identified as positive. It's calculated as the ratio of the number of correctly
detected skin cancer cases to the total number of cases that were identified as skin cancer.

\[
\text{PPV} = \frac{\text{Number of correctly detected skin cancer cases}}{\text{Total number of cases detected as skin cancer}}
\]


4. Negative Predictive Value (NPV):

NPV measures the proportion of correctly detected healthy skin cases among all the cases that the
system identified as negative. It's calculated as the ratio of the number of correctly detected healthy
skin cases to the total number of cases that were identified as healthy skin.

\[
\text{NPV} = \frac{\text{Number of correctly detected healthy skin cases}}{\text{Total number of cases detected as healthy skin}}
\]

5. Accuracy:

Accuracy measures the overall correctness of the system's predictions. It calculates the proportion of all
cases, both positive and negative, that were correctly identified by the system.

\[
\text{Accuracy} = \frac{\text{Number of correctly detected cases}}{\text{Total number of cases}}
\]
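
The five measures above can be computed from the four entries of a binary confusion matrix: true positives (TP, skin cancer correctly detected), false negatives (FN, skin cancer missed), true negatives (TN, healthy skin correctly detected), and false positives (FP, healthy skin flagged as cancer). The sketch below is illustrative; the function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the five evaluation measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # correctly detected cancer / all cancer cases
        "specificity": tn / (tn + fp),  # correctly detected healthy / all healthy cases
        "ppv": tp / (tp + fp),          # correctly detected cancer / all detected as cancer
        "npv": tn / (tn + fn),          # correctly detected healthy / all detected as healthy
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for 100 test images (not results from this study).
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Reporting all five measures together matters for imbalanced data: a model can reach high accuracy while still having low sensitivity if malignant cases are rare.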

3.4 Method
Skin cancer is a known cause of death in humans. It arises from the abnormal growth of skin
cells. It most often develops on sun-exposed skin, but it can occur anywhere on the human
body (Khalid M. Hosny, Feb 2019).

The convolutional neural network (CNN) is a type of feed-forward neural network generally used
for image classification tasks. CNNs can help provide more accessible healthcare to underserved
areas. Melanoma is the most dangerous form of skin cancer: it kills more than 10,000 people in
the United States every year and accounts for about 75% of skin cancer deaths (Singh, 2015).
The estimated 5-year survival rate for melanoma is more than 99% when it is detected in its
early stages but drops to about 14% in later stages, so early detection and treatment are
critical (Lauren E. Davis, 2019). However, according to IMS Health, the United States has only
9,600 dermatologists and 7,800 dermatology clinics serving 323 million people (Benlagha, 2021).

Figure 2: CNN Architecture (J. Daghrir, 2020)

Using convolutional neural networks (CNNs), DeepMelanoma will classify skin lesion images into
cancer types using state-of-the-art models. The model has been trained and tested on a dataset
provided by the International Skin Imaging Collaboration (ISIC) (Bakheet, 2017). Such models can
be used to analyze lesion images for early detection of risk.
3.5 Data Collection
The primary data-collecting approach for developing a Melanoma Skin Disease Prediction system
utilizing Convolutional Neural Networks (CNNs) would be dermoscopy pictures. Dermoscopy is a
non-invasive imaging technology used to examine skin lesions in great detail. Dermoscopy pictures
give crucial visual information for the diagnosis and prediction of melanoma.

The following stages may be included in the data-collecting process:

• Collaboration with Healthcare Organizations: Access to a wide range of dermoscopy
pictures can be obtained by collaboration with hospitals, dermatological clinics, or research
organizations. Collaboration with these institutions can aid in the acquisition of a big and
representative dataset.

• Informed Consent: It is critical to obtain informed consent from patients or persons whose
dermoscopy photos will be utilized in order to address ethical concerns and protect privacy.
Consent forms should explicitly describe the objective of data collection as well
as how the data will be anonymized and used for research.

• Image Acquisition: Dermoscopy images can be acquired using specialist dermoscopy
instruments or digital dermatoscopes. These devices provide enlarged and well-illuminated
views of skin lesions. Trained medical experts or researchers may carry out the image
capture procedure, ensuring that the photos are of excellent quality and appropriately
reflect various forms of skin lesions.

• Data Annotation: Annotation of dermoscopy pictures may be required to identify and
indicate places of interest, such as melanoma lesions. Manual annotation can be performed
by experienced dermatologists or skilled annotators. The annotation process generates
ground truth labels that are used to train and evaluate machine learning models.

In addition to the SIIM-ISIC dataset, other publicly available datasets and resources may be
utilized to supplement the training and testing of the CNN model. These datasets may include
additional dermoscopy images of melanoma and non-melanoma skin lesions to enhance the
diversity and robustness of the training data.
3.6 Research Procedure

The research for the DeepMelanoma mobile application can take place in various locations,
including:

1. On Mobile Devices: The primary location for the research is on users' mobile devices,
where they install and interact with the DeepMelanoma application. This allows for real-
time data collection and user engagement within the application.

2. Online Platforms: The application may have an online presence, such as a website or a
dedicated portal, where users can access additional information, support, or resources
related to the application. These online platforms can serve as supplementary spaces for
research activities.

3. Research Facility: If the research involves conducting specific experiments, usability


tests, or user studies, a controlled research facility may be utilized. This facility can
provide an environment conducive to data collection, observation, and evaluation of user
interactions with the application.

4. Collaborative Spaces: In some cases, research activities may involve collaboration with
external partners, such as healthcare institutions or research organizations. These
collaborations can involve conducting studies or collecting data on their premises or
jointly organizing research activities.
3.7 Resource Tools

The tools and technologies required for this project are as follows:

Programming language Requirements

1. Python: Machine learning model training.

2. TensorFlow Lite: Machine learning model implementation in the Flutter application.

3. Dart: For developing the mobile application.

4. Flutter Framework: For developing mobile applications.

Software Requirements

1. Visual Studio or Android Studio

2. Figma

3. GitHub

4. Firebase

5. Google Maps

6. Gantt Chart

7. Draw.io

8. Android Studio Emulator

Hardware Requirements

1. Mobile Device: Android device

2. A PC with the following characteristics is used: 16 GB RAM, 8 GB graphics card, 512 GB SSD,
11th-generation Intel Core i7 processor, NVIDIA RTX 3050, and Windows 11 OS.

Dataset
The SIIM-ISIC Melanoma Classification: The dataset used in this research comprises different data
formats commonly utilized in the medical imaging domain, namely DICOM, JPEG, and TFRecord.
These formats contain both the essential image data and associated metadata necessary for the
analysis of melanoma. The dataset has been meticulously curated and encompasses comprehensive
clinical details, including patient demographics such as sex and age, anatomical site information
indicating the location of the lesion, diagnosis specifics providing additional information about the
characteristics of the lesion, and a target variable that indicates the malignancy status of the lesion.
This diverse dataset with rich clinical information enables robust analysis and facilitates accurate
prediction of melanoma cases.
3.8 Limitation of Research
A key limitation of this research is its reliance on available datasets for training the Convolutional Neural Networks
(CNNs). The quality and diversity of the dataset may impact the generalization capability of the
developed prediction model. If the dataset is not representative of the broader population or lacks a
balanced distribution of various skin lesion types, the model's performance may be affected, leading
to biased predictions. Additionally, the use of pre-trained CNN models introduces a dependency on
the features learned from external datasets, which might not fully capture the intricacies of the
specific problem domain. Furthermore, while data augmentation techniques are employed to
enhance dataset diversity, there is a possibility that artificially generated samples might not fully
encapsulate the complexities of real-world skin lesion images. Despite these limitations, rigorous
validation and evaluation processes will be implemented to assess the model's reliability and
address potential biases, contributing to the robustness of the research outcomes.
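
One way such a bias check might look, as a sketch only, is to compare sensitivity (recall on the malignant class) across patient subgroups such as sex. The function name `recall_by_group` and the toy labels below are hypothetical and are not results from this study.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups):
    """Sensitivity (recall for label 1) computed separately for each subgroup.

    Subgroups with no malignant cases are omitted, since recall is undefined there.
    """
    tp = defaultdict(int)  # malignant cases predicted malignant
    fn = defaultdict(int)  # malignant cases predicted benign
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            if p == 1:
                tp[g] += 1
            else:
                fn[g] += 1
    seen = set(tp) | set(fn)
    return {g: tp[g] / (tp[g] + fn[g]) for g in seen}


# Toy example with invented labels (1 = malignant, 0 = benign).
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1]
groups = ["male", "male", "male", "female", "female", "female"]

per_group = recall_by_group(y_true, y_pred, groups)
for group, recall in sorted(per_group.items()):
    print(f"{group}: sensitivity = {recall:.2f}")
```

A large gap between subgroups would suggest that the training data under-represents one of them and that further balancing or data collection is needed.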

3.9 Ethical Consideration


In this research, stringent ethical considerations will be upheld to protect the well-being and rights of
all participants. Central to this approach is obtaining informed consent: participants
will be given comprehensive information about the research objectives, procedures, potential
risks, and their right to participate voluntarily. Emphasis will be placed on preserving privacy and
confidentiality, achieved through meticulous data anonymization and secure storage practices.
Rigorous risk assessments will be carried out to identify and mitigate potential physical,
psychological, or emotional risks, prioritizing participants' safety. Adequate support services will be
accessible, and open channels of communication will be established to address any concerns. The
research will be subject to ethical approval from relevant review boards, attesting to its alignment with
ethical guidelines and participant protection. Researchers will be well-trained in ethical research
conduct, ensuring the upholding of participant rights and welfare. Inclusivity, transparency, and
continuous monitoring will be maintained throughout the study, fostering a respectful and responsible
environment that prioritizes participant well-being and the advancement of knowledge within a
morally principled framework.
References
Baghdadi NA, M. A. (2022). An automated diagnosis and classification of COVID-19 from chest CT
images using a transfer learning-based convolutional neural network. Comput Biol Med, 144-105383.

Balaha HM, A. H. (2021). Automatic recognition of handwritten Arabic characters: a comprehensive
review. Neural Comput Appl, 3011-3034.

Balaha HM, A. H. (2021). Recognizing Arabic handwritten characters using deep learning and
genetic algorithms. Multimed Tools Appl 80, 32473-32509.

Balaha HM, S. M. (2022). Hybrid deep learning and genetic algorithms approach (HMB-DLGAHA) for
the early ultrasound diagnoses of breast cancer. Neural Comput Appl, 8671-8695.

Broomhead DS, L. D. (2022). Radial basis functions, multi-variable functional interpolation and
adaptive networks. Tech. rep., Royal Signals and Radar Establishment, Malvern (United Kingdom).

Cao Z, Y. H. (2021). Attention fusion for one-stage multispectral pedestrian detection. Sensors,
4184.

Fujiyoshi H, H. T. (2019). Deep learning-based image recognition for autonomous driving. IATSS Res,
244–252.

Georgevici A.I, T. M. (2019). Neural networks and deep learning: a brief introduction.

Goodfellow I, P.-A. J.-F. (2022). Generative adversarial nets. Adv Neural Inf Process Syst, 27.

Gu J, W. Z. (2018). Recent advances in convolutional neural networks. Pattern Recogn, 354-377.

Hinton GE, O. S.-W. (2022). A fast learning algorithm for deep belief nets. Neural Comput, 1527-1554.

J, B. (2019). A gentle introduction to the rectified linear unit (ReLU). Mach Learn Mastery 6.

Krizhevsky A, S. I. (2021). Imagenet classification with deep convolutional neural networks. Adv
Neural Inf Process, 1097-1105.

Lee S, K. H. (2020). CNN-based image recognition for topology optimization. Knowl-Based Syst, 198-105887.

Loussaief S, A. A. (2018). Convolutional neural network hyper-parameters optimization based on
genetic algorithms. Int J Adv Comput Sci Appl, 252-266.

Mikołajczyk A, G. M. (2022). Data augmentation for improving deep learning in image classification
problem. 2022 international interdisciplinary PhD workshop (IIPhDW), 117-122.

Milletari F, N. N. (2022). V-net: Fully convolutional neural networks for volumetric medical image
segmentation. 2022 fourth international conference on 3D vision (3DV), 665-571.

Ramachandran P, Z. B. (2021). Searching for activation functions. arXiv preprint, 34-90.

Rumelhart DE, H. G. (2019). Learning internal representations by error propagation. California Univ San
Diego La Jolla Inst for Cognitive Science.

S, T. (2018). Neural networks and deep learning. Mach Learn, 875-936.

Scherer D, M. A. (2021). Evaluation of pooling operations in convolutional architectures for object
recognition. In: International conference on artificial neural networks, 92-101.

Schuster M, P. K. (2022). Bidirectional recurrent neural networks. IEEE Trans Signal Process, 2673-2681.
