Professional Documents
Culture Documents
Firdaus
Intelligent System Research Group
Faculty Of Computer Science Sriwijaya
University
Palembang, Indonesia
virdauz@gmail.com
Abstract— Breast cancer has been identified as the most applied, and has proved to be a powerful method to face
widespread cancer amongst women and also the major cause of of complex problems [6]
female cancer death all over the world. In this paper, we build Some researches about breast cancer using various
the classification model of a person who is exposed to breast methods have been done. Alexander W. Forsyth et al
cancer based on recurrences-event and no-recurrences event.
This classification using datasets from the University of
2018 using Natural Languange Processing method, in this
Medicine Center, Institute Of Oncology, Ljublijana, Yugoslavia study NLP and machine learning method are used to
of the 286 datasets consist 2 classes, 201 No-Recurrences- extract symptoms of breast cancer, conclusions from this
Events classes, 85 Recurrences-events classes and 10 attributes study demonstrate the potential of machine learning to
including classes. The algorithm used for breast cancer collect, track and analyze the symptoms experienced by
classification is the Multilayer Perceptron algorithm with the cancer patients during chemotherapy. The final model
accuracy level of 96.5% and high evaluation is 69.93% in 8-fold achieved precision of 0.82, 0.86, and 0.99 and recall of
cross validation from 10-fold cross validation. 0.56, 0.69, and 1.00 for positive, negative, and neutral
symptom labels, respectively. The most common positive
Keywords— breast cancer, deep learning, multilayer
symptoms were pain, fatigue, and nausea. Machine based
perceptron, classification
labeling of 103,564 sentences took two minutes [7]
Ahmed M. Abdel-Zaher, et al 2015 using Deep
I. INTRODUCTION Belief Network to detecting breast cancer based on DBN
Breast cancer is one of the most deadly deceases unsupervised pre-training phase followed by a supervised
affecting women worldwide. This deceases is the leading back propagation neural network phase (DBN-NN). This
cause of cancer death. Breast cancer examination is technique was tested on the Wisconsin Breast Cancer
includes categories such as cancer category or not, Dataset (WBCD). The classifier complex gives an
recurrent-event or no-recurrent-event, and benign or accuracy of 99.68% indicating promising results over
malignant. [1] [2][3] previously-published studies. The proposed system
In giving results, an anatomical pathologist must provides an effective classification model for breast
calculate what the anatomic pathology laboratory has cancer. [8]
produced. Doctor's calculations are not timely due to the Hiba Chougrad, et al 2018 using Convolutional
verdict of a breast cancer patient. Therefore, it motivates Neural Network. In this paper, they developed a
the use of methods that can be used by computers.[4] Computer-aided Diagnosis (CAD) system based on deep
The use of computer science method is deep learning. Convolutional Neural Networks (CNN). The developed
In recent years deep learning has attracted wide attention framework could predict and provide the correct diagnosis
in the field of classification. The classification algorithm for 98.23% of the images from MIAS, 97.35% from
in medical needs is still highly relied upon, especially if it DDSM 95.50% for INbreast and 96.67% for BCDR. The
is associated with deep learning [5]. The use of deep results obtained demonstrate that the proposed framework
learning classification applications has been widely is performant and can indeed be used to predict if the
mass lesions are benign or malignant. [9]
!
"
237
!
"
238
D. Classification Result
Table 2. Table Testing Data
To identify breast cancer, the most popular Neural
Network (NN) model – general multilayer perceptron
(MLP) with back propagation learning rule is used [17].
The first layer is the input layer that receives input
vectors. Last layer is called output layer; layers in
between first and last ones are called hidden layers. Each
neuron in the hidden layer and output layer receives
output vectors from the previous layer to evaluate the
weighted sum and to achieve the output vectors by the
activation functions of the neurons. The main goal of a
classifier system using NN is to recognize an unknown
pattern based on some deterministic or statistical The designed MLP in this study consists of an input
measurements. layer and one hidden layer with variable number of nodes
depending on the number of input markers and an output
E. Evaluation layer WIth one neuron. The network is fed with different
Evaluation phase using 10-fold Cross Validation combination of markers in each nm to investigate the
which is one technique to assess the accuracy of a model predictive significance of each marker. Hence, the number
built on the dataset. Model making aims to predict the of input neurons is defined by the number of markers and
dataset. The data used in the model development process the number of hidden neurons is optimized for each
is called training data, while the data to be used to validate marker combination. The network is then trained using
the model is called data testing. In this technique the Multilayer Perceptron Algorithm and validated with k-
dataset is divided into 10 partitions randomly. Then a fold cross validation.
number of 10 experiments were conducted, each The following picture is a screenshot of the multilayer
experiment using the 10th partition data as data testing perceptron training process with weka app
and utilizing the rest of the other partition as training data.
Classification and Evaluation using Weka 38 applications.
III. RESULT
The result of the arff data structure of the 286 datasets
used consisted of 2 classes, 201 classes of No-
Recurrence-Events and 85 for the Recurrence-Events
class. The following is a table of training data to be used
for calculating Multilayer Perceptron. There are 10
attributes: Age, Menopause, Tumor-Size, nv-nodes,
Node-Caps, Deg-malig, Breast, Breast-Quad, Irradiate
and class.
Tabel-1 Table Training Data
!
"
239
!
"
240
!
"
241
!
"
242