
International Conference on Computational Intelligence & IoT (ICCIIoT) 2018

Performance Evaluation of Breast Cancer Classifiers using Different Tools

Sushil Kumar Saroj (a), Nitika Nigam (b), Nagendra Pratap Singh (c)
(a, b) Department of CSE, M.M.M. University of Technology, Gorakhpur, India
(c) Department of CSE, National Institute of Technology, Hamirpur, India
sushil.mnnit10@gmail.com, nigamniti8@gmail.com, nps010175@gmail.com

Abstract: Breast cancer is one of the deadliest diseases in the world; around two million new cases were registered in the year 2018. In this scenario, detecting and classifying breast cancer at the right time becomes crucial for diagnosis. Several approaches and tools exist to detect and classify breast cancer, but efforts to improve them are still continuing. In our approach, we classify breast cancer and compare our approach with other approaches implemented in different tools. We find that our approach performs better than them.

Keywords: Breast cancer, Support vector classifier, K-NN classifier.

1. Introduction

Our body is made up of cells, which grow and divide every microsecond. Sometimes cells in the body grow uncontrollably and create an extra, unwanted mass. The mass formed by these extra cells leads to a tumour, which may be malignant (cancerous) or benign (non-cancerous). A benign tumour grows in the body but does not spread to other parts of the body, whereas a malignant tumour grows rapidly.

Breast cancer is the second most dangerous disease leading to death [1]. It is most commonly found in women. According to data provided by the World Cancer Research Fund International (WCRF), breast cancer mostly occurs after menopause, and in 2012 about 1.7 million new cases were observed [2]. There are many reasons why breast cancer occurs, such as hormonal imbalance, early menstruation, late menopause, first pregnancy after the age of 30 and miscarriages. The highest incidence of breast cancer in the world is found in Belgium, followed by Denmark. The disease also causes other health issues such as depression, mood swings, anger and anxiety [3]. Many methods have been used so far for the detection of breast cancer, such as removing a sample of breast cells for testing (biopsy), mammography [4] and breast magnetic resonance imaging (MRI) [5]. These techniques do not provide fully accurate results in the detection of breast cancer.

Breast cancer is staged by the characteristics of the tumour, chiefly its size, and is basically divided into four stages. Doctors determine the stage from the patient's tumour size and provide treatment accordingly; if the tumour is larger than 2 cm, doctors perform chemotherapy. In stage 1 the tumour is usually small and contained within the breast. In stage 2 the tumour grows larger but does not spread to other parts of the body, although in some cases the cancerous cells have spread into the lymph nodes. In stage 3 the cancerous cells have started to spread and the tumour is becoming larger. In stage 4 the tumour has spread throughout the body and there is little chance of saving the patient; this stage is also known as secondary or metastatic cancer [6]. All the stages are shown in Figure 1.

There are many types of breast cancer, of which the most common are ductal carcinoma in situ, invasive ductal carcinoma and invasive lobular carcinoma. If the cancer is in situ, it does not spread anywhere, but if it is invasive there is a high chance of it spreading into the surrounding breast tissue.

As early detection of breast cancer is very helpful to control and cure the disease, many machine learning approaches have been used for its early detection. Automatic detection of cancer without medical experts can be done with the help of classifiers. Many types of classifiers are used to classify breast cancer as malignant or benign; the Support Vector Machine (SVM) [19] and the K-Nearest Neighbour (KNN) classifier are used in this paper.

SUSHIL KUMAR SAROJ ET AL. 1



Figure 1. Various stages of breast cancer [6].

2. Literature Survey

The Edwin Smith Papyrus, discovered in an Egyptian tomb in 1860, contains the first known account of this disease; it describes eight different cases of tumours of the breast. The first successful treatment was performed by two surgeons, J. L. Petit and B. Bell, who removed the infected parts, i.e. the lymph nodes, breast tissue and chest muscles [7]. In 1970 a new methodology came into notice which removes only the cancerous tumour, and in 1985 researchers found a new method, radiation, which is given after lumpectomy. After World War II a new treatment known as chemotherapy came into use; in this treatment the cancerous part is shrunk before surgery, which also prevents the tumour from recurring. In 1923 the English researcher Janet Lane-Claypon carried out a large-scale survey of women with breast cancer, which identified several risk factors such as menopause, pregnancy and hormonal imbalance [8].

Many machine learning techniques have been used for the early detection of breast cancer. These techniques help medical experts diagnose the disease easily, as they provide effective classification and recognition. Kemal Polat et al. [9] used a least-squares support vector machine for diagnosing breast cancer; the accuracy obtained was about 98.53% using 10-fold cross-validation when the dataset was divided in a 50-50 ratio. Y. Ireaneus Anna Rejani et al. [10] used an SVM classifier and achieved about 88.75% accuracy. They detect the tumour using a mammogram, segment the grey-scale image, extract features after segmentation and then apply the SVM classifier. Azar et al. [11] proposed a probabilistic neural network for the detection of breast cancer. They compared three classification algorithms: the probabilistic neural network (PNN), the radial basis function (RBF) network and the multi-layer perceptron (MLP). PNN performed better than the other classification algorithms, achieving 100% and 98.66% accuracy in the training and testing phases respectively. In 1999 Pena-Reyes et al. [12] combined a fuzzy system with an evolutionary algorithm, named the fuzzy-genetic approach. They obtained 97.36% accuracy and their system had high classification performance; since they also applied a rule-based system, it was human-interpretable. Ahmad et al. [13] proposed a method for automatic diagnosis of breast cancer: a genetic-algorithm-based multi-objective optimization of an artificial neural network classifier (GA-MOO-NN). They achieved a best accuracy of 98.85% when the dataset was divided into three parts: training (50%), testing (25%) and validation (25%). Akay [1] diagnosed breast cancer using a combination of two methods, SVM and feature selection. He divided the dataset into 50:50, 70:30 and 80:20 training-testing groups; the best accuracy (99.51%) was given by the 80:20 training-testing partition.

Chen et al. [14] proposed a swarm method based on the SVM classifier and achieved about 99.30% accuracy using 10-fold cross-validation. In [15] the authors compare SVM classifiers, namely proximal SVM, Lagrangian SVM, the Newton method for Lagrangian SVM, linear programming SVM and smooth SVM; among all these classifiers, linear programming SVM gives the best accuracy (97.14%). In [16] the authors used the SVM and K-Nearest Neighbour (KNN) machine learning approaches for predicting breast cancer; they obtained 98.57% accuracy with SVM and 97.14% with KNN using 10-fold cross-validation.

3. Classifiers

3.1. Support Vector Machine (SVM)

SVM was first proposed by V. Vapnik in 1963 for classification and regression tasks. It is a supervised machine learning method which assigns new instances to one of the classes by defining a separating hyperplane. If an (n-1)-dimensional hyperplane separates the n-dimensional sample data, it is a linear SVM classifier. A linear classifier is used to classify a labelled dataset consisting of n-dimensional vectors (x1, y1), (x2, y2), …, (xn, yn). Each pair consists of a vector xi ∈ R^n and a label yi ∈ {1, 0} which defines the binary class. The main aim is to find the maximum-margin hyperplane, as it decides how correctly the classification is done. The maximum-margin hyperplane separates the points of class 1 from those of class 0, as shown in Figure 2. The points closest to the hyperplane are known as support vectors, and the hyperplane is given as

h(x) = w·x + b    (1)

where w is a coefficient vector, x is a sample point and b is a constant giving the distance from the origin. In the case of a linear SVM, if h(x) >= 0 the point falls in class 1, otherwise in class 0. The maximum distance between the two different support vectors is defined as

distance = 2 / ||w||
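The linear decision rule of Eq. (1) and the margin width 2/||w|| can be sketched as follows. This is a minimal illustration with a hypothetical weight vector and bias, not a model fitted to the WBC data:

```python
import math

def h(w, x, b):
    """Linear SVM decision function h(x) = w·x + b (Eq. 1)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, x, b):
    """Assign class 1 if h(x) >= 0, otherwise class 0."""
    return 1 if h(w, x, b) >= 0 else 0

def margin_width(w):
    """Distance between the two support-vector boundaries, 2/||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical separating hyperplane and sample points (illustrative only).
w, b = [1.0, 1.0], -3.0
print(classify(w, [2.5, 2.0], b))  # h = 1.5 >= 0, so class 1
print(classify(w, [0.5, 1.0], b))  # h = -1.5 < 0, so class 0
print(round(margin_width(w), 4))   # 2/sqrt(2) = 1.4142
```

A real SVM would learn w and b from the training data by solving the optimization problem described in this section; here they are fixed by hand purely to show how the decision rule behaves.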




SVM is also known as a non-probabilistic binary linear classifier, since it predicts which of the two classes a new instance belongs to.

Figure 2. Hyperplane H2 shows the maximum margin [19].

Non-linear datasets are handled with the help of kernel functions in the SVM classifier, as proposed by Aizerman. Many kernel functions are used to find the maximum-margin hyperplane, such as the polynomial (homogeneous and non-homogeneous), the Gaussian radial basis function (GRBF, the most widely used) and the hyperbolic tangent. The kernelized decision function is given as

h(x) = w·ɸ(x) + b    (2)

where w is a coefficient vector, b is a constant and ɸ(x) is the mapping function of the input data, which defines the kernel function k(xi, xj) = ɸ(xi)·ɸ(xj). The optimization problem is given as

Max [ Σ_{i=1..n} a(i) − (1/2) Σ_{i,j=1..n} a(i)a(j)y(i)y(j) k(x(i), x(j)) ]    (3)

The kernel function can be RBF or polynomial.

3.2. K-Nearest Neighbour Classifier (KNN)

KNN is a simple algorithm which falls into the category of supervised learning. It is a lazy, non-parametric method used in classification and regression. It stores all the available cases during training and classifies a new observation by the majority vote of its k nearest neighbours. In the training process the algorithm stores the feature vectors and labels of the dataset, while in the testing process a new observation is classified by calculating similarity measures such as the Euclidean distance, cosine similarity etc. The selection of the value k is the most important factor in KNN classification, as it determines how well the data can be generalized.

KNN falls in the supervised learning family of algorithms. Informally, this means that we are given a labelled dataset of training observations (x1, y1), (x2, y2), … and would like to capture the relationship between the x and y values. More formally, our goal is to learn a function h: X → Y so that, given an unseen observation x, h(x) can confidently predict the corresponding output y. The KNN classifier is also a non-parametric and instance-based learning algorithm.

• Non-parametric means it makes no explicit assumptions about the functional form of h, avoiding the danger of mismodelling the underlying distribution of the data. For example, suppose our data is highly non-Gaussian but the learning model we choose assumes a Gaussian form; in that case, our algorithm would make extremely poor predictions.

• Instance-based learning means that our algorithm does not explicitly learn a model. Instead, it memorizes the training instances, which are subsequently used as "knowledge" in the prediction phase. Concretely, this means that only when a query to our database is made (i.e. when we ask it to predict a label given an input) does the algorithm use the training instances to produce an answer. A popular choice of distance is the Euclidean distance, given by

d(x, x′) = √((x1 − x′1)² + (x2 − x′2)² + … + (xn − x′n)²)

4. Implementation

4.1 Data Collection and Preparation

The breast cancer dataset is collected from the UCI machine learning repository and is named the Wisconsin Breast Cancer (WBC) dataset [17]. It contains 699 instances, each with 10 attributes, plus one further attribute named "class" which indicates whether or not a patient has a cancerous tumour: it is given as 2 for benign and 4 for malignant. There are 458 benign and 241 malignant instances. We removed the 16 instances with missing values, so our final dataset contains 683 instances.

4.2 Performance Measure

We can measure the performance of machine learning algorithms by calculating performance indices. These indices are calculated on the basis of the confusion matrix, which consists of four parameters: true positive (TP), false positive (FP), true negative (TN) and false negative (FN). The confusion matrix is a table which describes the performance of a classifier in terms of the actual and predicted classes [18]. Its layout for two classes is given in Table 1, where

TP = the prediction is positive and the actual class is positive (correctly recognized).

TN = the prediction is negative and the actual class is negative (correctly rejected).

FP = the prediction is positive but the actual class is negative (incorrectly recognized).

FN = the prediction is negative but the actual class is positive (incorrectly rejected).




Table 1. Representation of the confusion matrix.

Table 2. Confusion matrix.

Table 3. Performance measure indices.

Accuracy is the measure that captures the overall effectiveness of a classifier, and it is given as

Accuracy = (TP + TN) / (TP + TN + FN + FP)    (4)

Specificity shows how many negatives are correctly identified, and it is given as

Specificity = TN / (TN + FP)    (5)

Sensitivity shows how many positives are correctly identified:

Sensitivity = TP / (TP + FN)    (6)
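Equations (4)-(6) can be computed directly from the four confusion-matrix counts. A minimal sketch, using hypothetical counts rather than the paper's experimental results:

```python
def accuracy(tp, tn, fp, fn):
    """Eq. (4): fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def specificity(tn, fp):
    """Eq. (5): fraction of actual negatives correctly identified."""
    return tn / (tn + fp)

def sensitivity(tp, fn):
    """Eq. (6): fraction of actual positives correctly identified."""
    return tp / (tp + fn)

# Hypothetical confusion-matrix counts, for illustration only.
tp, tn, fp, fn = 90, 95, 5, 10
print(round(accuracy(tp, tn, fp, fn), 4))  # 0.925
print(round(specificity(tn, fp), 4))       # 0.95
print(round(sensitivity(tp, fn), 4))       # 0.9
```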

5. R Language

R is a programming language for graphics and statistical computing which is maintained by the R Foundation for Statistical Computing. It is freely available under the GNU General Public License, has an integrated development environment and is accessed through a command-line interpreter. In our experiments we use R version 3.4 together with its packages.

Table 4. Comparison with other existing methods.


6. Experiments and Result Analysis

The experiments have been done on the WBC dataset for classification of breast cancer as malignant or benign. We have used 10-fold cross-validation with 70:30 and 80:20 training-testing splits. With the 70:30 partition, 138 of 204 instances are correctly identified in both cases (SVM and KNN), and with the 80:20 partition, 96 of 136 instances are correctly identified in both cases; the confusion matrices are shown in Table 2. Here we can observe that as the size of the training set increases, the accuracy of both classifiers decreases. The evaluated parameters, namely specificity, precision and sensitivity, are shown in Table 3. It depicts that SVM performance is much better than KNN, as the specificity, precision and sensitivity obtained by SVM are 98.41%, 97.87% and 97.87% respectively in the case of the 70:30 partition.
We compare the previous results with our result using the R language, as shown in Table 4. In the comparison we found that the authors of [15] used a 4-fold technique in MATLAB 7.0, whereas the authors of [20] used a 10-fold technique in WEKA. In our approach we use a 10-fold technique, i.e. the dataset is divided into 10 portions, repeated 3 times, using the R tool. We observed that we obtain higher accuracy on the same dataset used by [15] and [20] using a different tool.
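The 10-fold protocol used above — partition the data into 10 portions, then train on 9 portions and test on the held-out one, rotating through all folds — can be sketched as follows. This is a generic illustration of the splitting step only, not the paper's R code, and it ignores the shuffling and 3 repetitions mentioned above:

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validation_splits(n, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    folds = k_fold_indices(n, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 683 WBC instances, 10 folds: every instance is tested exactly once.
splits = list(cross_validation_splits(683, 10))
print(len(splits))                           # 10
print(len(splits[0][1]), len(splits[0][0]))  # 69 614
```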




7. Conclusion

In this paper we use the SVM and KNN methods to classify breast cancer, implemented in the R language tool. We can see that our approach achieves better results than other approaches implemented in different tools. In future work we will test all the standard classifiers in different tools and compare the results.

REFERENCES

[1] M. F. Akay, "Support vector machines combined with feature selection for breast cancer diagnosis," Expert Systems with Applications, 36(2), 3240-3247, 2009.
[2] "Breast cancer statistics," [Online]. Available: http://www.wcrf.org/int/cancer-facts-figures/data-specific-ancers/breastcancer-statistics, accessed on: May 20, 2018.
[3] A. D. Vinokur, B. A. Threatt, D. V. Kaplan, W. A. Satariano, "The process of recovery from breast cancer for younger and older patients: changes during the first year," Cancer, 65(5), 1242-1254, 1990.
[4] A. T. Azar, S. A. El-Said, "Probabilistic neural network for breast cancer classification," Neural Computing and Applications, Springer, vol. 23, pp. 1737-1751, 2013.
[5] E. Warner, H. Messersmith, P. Causer, "Systematic review: using magnetic resonance imaging to screen women at high risk for breast cancer," Annals of Internal Medicine, 148(9), 671-679, 6 May 2008.
[6] "Grades and stages," [Online]. Available: http://breastcancernow.org/about-breast-cancer/have-you-recently-been-diagnosed-with-breast-cancer/understanding-your-results/grades-and-stages, accessed on: April 30, 2018.
[7] "A Brief History of Breast Cancer," [Online]. Available: https://www.healthcentral.com/slideshow/a-brief-history-of-breast-cancer#slide=7, accessed on: April 30, 2018.
[8] J. E. Lane-Claypon, "A Further Report on Cancer of the Breast with Special Reference to its Associated Antecedent Conditions," (32), 1926.
[9] K. Polat, S. Güneş, "Breast cancer diagnosis using least square support vector machine," Digital Signal Processing, 17(4), 694-701, 2007.
[10] Y. I. A. Rejani, S. T. Selvi, "Early detection of breast cancer using SVM classifier technique," arXiv preprint arXiv:0912.2314, 2009.
[11] "Breast cancer cases," [Online]. Available: https://www.breastcancer.org/symptoms/understand_bc/statistics.
[12] C. A. Pena-Reyes, M. Sipper, "A fuzzy-genetic approach to breast cancer diagnosis," Artificial Intelligence in Medicine, (17), 131-155, 1999.
[13] F. Ahmad, N. A. M. Isa, Z. Hussain, S. N. Sulaiman, "A genetic algorithm-based multi-objective optimization of an artificial neural network classifier for breast cancer diagnosis," Neural Computing and Applications, Springer, vol. 23, issue 5, pp. 1427-1435, October 2013.
[14] H. L. Chen, B. Yang, G. Wang, S. J. Wang, J. Liu, D. Y. Liu, "Support vector machine based diagnostic system for breast cancer using swarm intelligence," Journal of Medical Systems, 36(4), 2505-2519, 2012.
[15] A. T. Azar, S. A. El-Said, "Performance analysis of support vector machines classifiers in breast cancer mammography recognition," Neural Computing and Applications, Springer, vol. 24, issue 5, pp. 1163-1177, April 2014.
[16] M. M. Islam, H. Iqbal, M. R. Haque, M. K. Hasan, "Prediction of breast cancer using support vector machine and K-Nearest neighbours," in Humanitarian Technology Conference (R10-HTC), 2017 IEEE Region 10, pp. 226-229, IEEE, December 2017.
[17] "Breast Cancer Wisconsin (Original) Data Set," [Online]. Available: https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data, accessed on: April 30, 2018.
[18] R. Kohavi, "Glossary of terms," Machine Learning, 30, 271-274, 1998.
[19] "Support vector machine," in Wikipedia, The Free Encyclopedia. Retrieved June 11, 2018. Available: https://en.wikipedia.org/w/index.php?title=Support_vector_machine&oldid=842541213.
[20] V. Chaurasia, S. Pal, "A novel approach for breast cancer detection using data mining techniques," International Journal of Innovative Research in Computer and Communication Engineering, ISSN (Online): 2320-9801.
