
J Med Syst (2012) 36:2901–2907

DOI 10.1007/s10916-011-9768-0

ORIGINAL PAPER

Prediction of Breast Cancer Using Artificial Neural Networks


Ismail Saritas

Received: 31 May 2011 / Accepted: 2 August 2011 / Published online: 12 August 2011
© Springer Science+Business Media, LLC 2011

Abstract In this study, an artificial neural network (ANN) was developed to determine whether patients have breast cancer. Whether a patient has cancer and, if so, its type can be determined using the ANN together with the BI-RADS evaluation, the age of the patient, and the mass shape, mass border and mass density. Although this system cannot diagnose cancer conclusively, it helps physicians decide whether a biopsy is required by providing information about whether the patient has breast cancer. Data were obtained from 800 patients who had received a definitive diagnosis through biopsy. The definitive diagnosis for each patient and the ANN model results were compared using confusion matrix and ROC analyses. On the test data of the ANN model implemented as a result of these analyses, the disease prediction rate was 90.5% and the health ratio was 80.9%. These high predictive values show that the ANN model is fast, reliable and free of risk, and can therefore be of great help to physicians.

Keywords Artificial neural network · BI-RADS · Breast cancer · Breast cancer prediction

I. Saritas (*)
Department of Electronics and Computer Education, Technical Education Faculty, Selcuk University, Konya, Turkey
e-mail: isaritas@selcuk.edu.tr

Introduction

Breast cancer is the most frequently encountered cancer type in women and the second most deadly after lung cancer [1, 2]. It is also seen in men, though less frequently. Cancer risk increases with age. Mammography is a method used to screen asymptomatic women and can identify breast cancer at an early stage (a-c). The combined use of mammography and physical examination significantly reduces mortality in breast cancer.

Breast calcifications are often seen in mammograms and most of them are benign, but calcifications smaller than 1 mm in particular are the most precise mammographic finding of early breast cancer. 70% of in situ carcinomas manifest themselves only through microcalcifications [3, 4]. Therefore, identification of calcifications constitutes an important field of mammographic evaluation [5].

Microcalcifications are an important characteristic of various breast lesions, including carcinomas. The appearance (size and shape), number and distribution of microcalcifications are generally very important for radiologists in determining whether a lesion identified through mammography is benign or malign. However, the lesion needs to be examined histopathologically for a definitive diagnosis [6].

Although interpretations of microcalcifications may vary among radiologists, an effort must be made to standardize evaluation as much as possible, and an unambiguous conclusion must be given to clinicians. In 1993, the American College of Radiology (ACR) developed a standard reporting system called the ‘Breast Imaging Reporting and Data System’ (BI-RADS) in order to improve communication between clinicians and radiologists and to provide a common terminology among radiologists [7]. A qualified evaluation of breast density (thickness) is possible through BI-RADS, which is commonly used these days. A sample mammogram image of the 4 categories of breast density is given in Fig. 1 [8].

BI-RADS was updated in 2011 and divided into 6 categories (Table 1) [9].

Fig. 1 Representations of the 4 Breast Imaging Reporting and Data System (BI-RADS) breast density categories, with qualitative and quantitative assessments

The shape, boundary and density of the tissue are determined through ultrasound and mammography. This helps the doctor to decide on a biopsy. On the other hand, it has been reported that only about 10–30% of all breast biopsies indicate malign breast cancers [10]. Unnecessary breast biopsies lead to redundant expenses for tests as well as major psychological and physical distress for the patient [10, 11].

In recent years, computer-assisted diagnostic (CAD) systems have been designed based on BI-RADS [12] standards to determine tissue deformation so that the doctor can follow the suspected area or perform a breast biopsy. Generally, BI-RADS features such as the shape and boundary of the mass are gathered from different radiology centers for a suspected area seen in the mammogram, and are provided by differently trained physicians.

Initially, an ANN model was devised to draw conclusions from BI-RADS definitions [13, 14]. Then, alternative approaches based on Bayesian networks and case-based reasoning (CBR) were designed [15–19]. The advantage of CBR is that it is a process of clear judgment that leads to diagnosis. A CBR system makes inferences on the basis of a stored database. An ANN, on the other hand, is an intelligent system that continuously renews itself while using the database in a similar way. Therefore, it yields more accurate predictive results for diagnosis than CBR.

Besides image processing, ultrasonography and physical examination, a biopsy is needed to make a definitive diagnosis of breast cancer. Despite this need, patients with a low risk of cancer tend to avoid a biopsy due to the complications and high costs that it may involve. For these reasons, patients may prefer a different method that can make an accurate decision before they have a biopsy performed. To this end, an ANN model was devised to help doctors decide whether a biopsy is required or the patient should be monitored closely, according to the patient’s BI-RADS value, age, mass shape, mass boundary and mass density.

Table 1 BI-RADS mammography classification

BI-RADS category   Definition                        Risk of malignancy   Recommended follow-up
0                  Incomplete assessment             N/A                  Further workup
1                  Negative study                    N/A                  Repeat mammogram in 1 year
2                  Benign                            N/A                  Repeat mammogram in 1 year
3                  Probably benign                   <2%                  Repeat mammogram in 6 months
4                  Suspicious                        20%                  Biopsy should be considered
5                  Highly suggestive of malignancy   90%                  Appropriate action should be taken
6                  Known biopsy-proven malignancy    N/A                  Appropriate action should be taken

Materials and methods

In this study, literature data provided by M. Elter, R. Schulz-Wendtland and T. Wittenberg were used [10, 20]. Records that involved ambiguity were removed, and thus 822 out of 956 records were used.

MATLAB Neural Network Toolbox software was used to develop the ANN model and to carry out the analyses.

Artificial neural network (ANN)

Artificial neural networks are computer systems developed to perform automatically skills that are characteristic of the brain, such as generating, forming and discovering new information through learning.

Generally, an ANN consists of three kinds of layers, namely an input layer, one or more hidden layers and an output layer (Fig. 2). Each layer has a certain number of interconnected elements called neurons or nodes. Each neuron is connected to the others via communication links and their associated link weights, and signals travel between neurons along these weighted links. Each neuron receives multiple inputs from other neurons in proportion to their link weights and generates an output signal that can in turn be passed on to other neurons [21, 22].

To develop an ANN model, the network is subjected to two processes, namely training and testing. In training, the network is taught to predict an output from the input data. In testing, on the other hand, the network is tested in order to decide whether training should stop or continue, and is then used to predict an output.

Network training is stopped when the error being tested has reached the desired tolerance value [21, 22].

The Back-Propagation (BP) algorithm is the most popular and most widely used training algorithm. BP consists of two phases, i.e. the feedforward and back-propagation procedures.

During feedforward, information is processed from the input layer through to the output layer. During back-propagation, on the other hand, the difference between the network output value obtained from the feedforward procedure and the desired value is compared with the desired tolerance, and the error in the output layer is calculated. The obtained error is back-propagated to update the link weights back to the input layer neurons [21, 23].

The BP training algorithm is a gradient descent algorithm. It improves the performance of the network by reducing the total error through changing the weights along the gradient. Training is stopped when the mean square error (MSE) values stop decreasing and begin to increase, which is an indication of over-training [21, 24].

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(d_i - O_i\right)^2 \qquad (1)$$

Here, $d_i$ is the target or true value, $O_i$ is the network output or predicted value, and $n$ is the number of output data.
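The MSE of Eq. 1 and the stopping criterion described above can be sketched as follows. This is a minimal illustration and not the MATLAB toolbox routine used in the study; the function names, the `patience` parameter and the sample values are hypothetical and introduced only for this example.

```python
import numpy as np

def mse(d, o):
    """Mean square error of Eq. 1: the mean of (d_i - O_i)^2."""
    d = np.asarray(d, dtype=float)
    o = np.asarray(o, dtype=float)
    return float(np.mean((d - o) ** 2))

def should_stop(val_errors, patience=3):
    """Hypothetical stopping rule: stop once the validation MSE has not
    improved on its earlier best value for `patience` consecutive epochs."""
    if len(val_errors) <= patience:
        return False
    best_before = min(val_errors[:-patience])
    return min(val_errors[-patience:]) >= best_before

# Made-up targets (0 = benign, 1 = malign) and network outputs.
targets = [1, 0, 1, 1]
outputs = [0.9, 0.2, 0.6, 0.8]
print("MSE:", mse(targets, outputs))

# Made-up validation errors that fall and then rise (over-training).
print("stop:", should_stop([0.50, 0.40, 0.30, 0.31, 0.32, 0.33]))
```

Monitoring the error on data held out from training, rather than on the training data itself, is what allows the rise in MSE to be read as a sign of over-training.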

Fig. 2 Structure of an ANN



Application of artificial neural network to the data

The purpose of the devised ANN is to predict the severity (benign or malign) of a mammographic mass by using a patient’s age and BI-RADS features.

The data set that was used contained the BI-RADS assessment, the patient’s age, three BI-RADS attributes and the severity field. These data were collected on full-field digital mammograms for 399 benign and 423 malign masses at the radiology institute of Erlangen-Nuremberg University in 2000–2006 [10, 20]. 169 of the data were selected randomly to test the model later. The remaining 653 data, on the other hand, were selected randomly to be used in the training of the model, i.e. 80% for training, 5% for validation and 15% for testing.

Each sample was examined by the physician and defined as follows [10, 20].

1. BI-RADS assessment: 1–5 (ordinal, non-predictive)
2. Age: age of the patient (integer)
3. Shape: mass shape: round = 1, oval = 2, lobular = 3, irregular = 4 (nominal)
4. Margin: mass margin: circumscribed = 1, microlobulated = 2, obscured = 3, ill-defined = 4, spiculated = 5 (nominal)
5. Density: mass density: high = 1, iso = 2, low = 3, fat-containing = 4 (ordinal)
6. Severity: benign = 0, malign = 1 (binominal, goal field)

In this study, a feedforward network structure containing an input layer, a hidden layer and an output layer (Fig. 2) was used. After the ANN structure was developed, the data set was normalized to the 0–1 range using Eq. 2 [21] in order to improve its training characteristics.

$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (2)$$

The training data set was used to determine the ANN neuron and bias weight values. Training was repeated while changing the number of neurons in the hidden layer and the number of iterations in order to obtain the lowest error value. After the most suitable network structure was determined, the trained network was applied to the test data set. The ANN parameters that were used are given in Table 2.

The ANN was created using the BP algorithm, with the structure that gave the minimum error having 9 neurons in its hidden layer. A mathematical formula for the model output was obtained from the trained ANN (Eq. 6).

For Eq. 4, $\mathrm{NET}_j$ is obtained using Eq. 3.

$$\mathrm{NET}_j = (W_1)_{j,1} \times \mathrm{BIRADS} + (W_1)_{j,2} \times \mathrm{Age} + (W_1)_{j,3} \times \mathrm{Shape} + (W_1)_{j,4} \times \mathrm{Margin} + (W_1)_{j,5} \times \mathrm{Density} - 2.1191 \qquad (3)$$

The link weight constants $W_1$ for the BP algorithm with 9 neurons in its hidden layer are given in Table 3. In Eq. 3, $\mathrm{NET}_j$ is the sum of the products of the input parameters and their weights. The subscript $j$ is the hidden neuron number (the rows of Table 3) and the second subscript is the input number (the columns of Table 3).

The BI-RADS features and age were used as the input parameters in the ANN input layer, 9 neurons were used in the hidden layer, and the severity diagnosis was used in the output layer. Nine equations each were used for $\mathrm{NET}_1$–$\mathrm{NET}_9$ and $F_1$–$F_9$, as the summation and activation functions respectively.

The transfer function used for this approach is the TANSIG transfer function given in Eq. 4.

$$F_j = \frac{2}{1 + e^{-2\,\mathrm{NET}_j}} - 1 \qquad (4)$$

For Eq. 6, $A$ is obtained using Eq. 5.

$$A = 0.4421 \times F_1 - 0.3955 \times F_2 - 0.4658 \times F_3 + 0.3157 \times F_4 - 1.1796 \times F_5 - 0.0061 \times F_6 - 0.2972 \times F_7 + 0.4249 \times F_8 - 0.2708 \times F_9 - 2.1191 \qquad (5)$$
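The min-max normalization of Eq. 2 can be sketched per input column as follows. This is an illustrative sketch only: the function name and the three sample rows are hypothetical, and the actual minimum and maximum values used in the study are not reported, so here they are taken from the sample itself.

```python
import numpy as np

def min_max_normalize(x, x_min=None, x_max=None):
    """Scale each column of x to the 0-1 range as in Eq. 2.
    If x_min / x_max are not supplied they are taken from x itself."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0) if x_min is None else np.asarray(x_min, dtype=float)
    x_max = x.max(axis=0) if x_max is None else np.asarray(x_max, dtype=float)
    span = x_max - x_min
    span = np.where(span == 0, 1.0, span)  # constant columns map to 0
    return (x - x_min) / span

# Made-up rows: BI-RADS, age, shape, margin, density.
rows = [
    [5, 67, 3, 5, 3],
    [4, 43, 1, 1, 3],
    [2, 28, 1, 1, 2],
]
print(min_max_normalize(rows))
```

When a trained model is applied to new patients, the minimum and maximum values from the training data would normally be reused (passed in as `x_min` and `x_max`) so that new inputs are scaled consistently.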

Table 2 Parameters that were used in ANN

Parameters                                     Properties
The number of neurons in the input layer       5
The number of hidden layers                    1
The number of neurons in the hidden layer      9
The number of neurons in the output layer      1
Learning rate (α)                              0.3
Training rate coefficient (β)                  0.3
Learning algorithm                             Gradient descent algorithm (trainlm)
Transfer function                              Hyperbolic tangent sigmoid (tansig)

Table 3 Weight values for Eq. 3

Hidden-layer neuron   BI-RADS W1(i,j)   Age W1(i,j)   Shape W1(i,j)   Margin W1(i,j)   Density W1(i,j)
1                      1.1627           −1.2199       −1.0579         −0.4368          −0.8746
2                     −0.3600            0.2081        1.1471          0.9569           1.3134
3                     −0.9299           −1.2773        0.4513         −0.0430           1.2915
4                      1.2861           −1.1498       −0.9214         −0.4671          −0.7741
5                     −1.9271           −0.0753       −0.2703         −0.5172          −0.8155
6                      1.0994           −0.9916       −1.4366          0.0361           0.6090
7                     −0.3945           −1.2386       −1.1503         −1.1803           0.3392
8                      1.2736            1.1961        0.1942          0.9558          −0.7063
9                     −0.1239            0.9133       −1.3526          1.0869          −0.8882
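The forward pass defined by Eqs. 3 to 6 can be sketched with the Table 3 weights as follows. This is a hedged illustration, not the author's MATLAB code. Several points are assumptions: the inputs are taken to be already normalized to the 0–1 range with Eq. 2, the signs of the output-layer coefficients and of the bias term −2.1191 follow a reading of Eqs. 3 and 5 as printed, and the names `W1`, `W2`, `tansig`, `severity` and the sample input are introduced only for this example.

```python
import numpy as np

# Hidden-layer weights from Table 3: one row per hidden neuron,
# columns ordered as BI-RADS, Age, Shape, Margin, Density.
W1 = np.array([
    [ 1.1627, -1.2199, -1.0579, -0.4368, -0.8746],
    [-0.3600,  0.2081,  1.1471,  0.9569,  1.3134],
    [-0.9299, -1.2773,  0.4513, -0.0430,  1.2915],
    [ 1.2861, -1.1498, -0.9214, -0.4671, -0.7741],
    [-1.9271, -0.0753, -0.2703, -0.5172, -0.8155],
    [ 1.0994, -0.9916, -1.4366,  0.0361,  0.6090],
    [-0.3945, -1.2386, -1.1503, -1.1803,  0.3392],
    [ 1.2736,  1.1961,  0.1942,  0.9558, -0.7063],
    [-0.1239,  0.9133, -1.3526,  1.0869, -0.8882],
])
B1 = -2.1191  # constant term of Eq. 3 (sign assumed from the printed equation)

# Output-layer coefficients of F1..F9 in Eq. 5 (signs assumed as printed).
W2 = np.array([0.4421, -0.3955, -0.4658, 0.3157, -1.1796,
               -0.0061, -0.2972, 0.4249, -0.2708])
B2 = -2.1191  # constant term of Eq. 5

def tansig(x):
    """Transfer function of Eqs. 4 and 6; mathematically equal to tanh(x)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def severity(x):
    """Evaluate Eqs. 3-6 for one input vector
    [BI-RADS, Age, Shape, Margin, Density] (assumed normalized to 0-1)."""
    net = W1 @ np.asarray(x, dtype=float) + B1   # Eq. 3
    f = tansig(net)                              # Eq. 4
    a = W2 @ f + B2                              # Eq. 5
    return float(tansig(a))                      # Eq. 6

# Hypothetical normalized input, for illustration only.
x_example = np.array([1.0, 0.5, 1.0, 1.0, 0.5])
print("severity output:", severity(x_example))
```

Because the output neuron uses tansig, the returned value lies between −1 and 1; how that value is mapped to the benign (0) or malign (1) label is not stated in the text, so no threshold is applied here.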

Equation 6 gives the severity result.

$$\mathrm{Severity} = \frac{2}{1 + e^{-2A}} - 1 \qquad (6)$$

Results and suggestions

Confusion matrix [25] and ROC [24] analyses were performed in order to assess the success of the study.

Confusion matrix analysis

In the confusion matrix analysis:

Sensitivity is the rate at which the system can identify real patients among those that were definitively diagnosed to be sick (Eq. 7).

$$\mathrm{Sensitivity} = \frac{A}{A + C} \qquad (7)$$

Specificity is the rate at which the system can identify the really healthy among those that were definitively diagnosed to be healthy (Eq. 8).

$$\mathrm{Specificity} = \frac{D}{D + B} \qquad (8)$$

Accuracy is the proportion of all diagnoses, sick and healthy, that are correct (Eq. 9).

$$\mathrm{Accuracy} = \frac{A + D}{A + B + C + D} \qquad (9)$$

The predictive value of the positive result (PVP) is the actual probability of being sick when the diagnostic test gives the judgment of sick (Eq. 10).

$$\mathrm{PVP} = \frac{A}{A + B} \qquad (10)$$

The predictive value of the negative result (PVN) is the probability of being healthy when the diagnostic test gives the judgment of healthy (Eq. 11).

$$\mathrm{PVN} = \frac{D}{D + C} \qquad (11)$$

Of the 693 training data, the ANN correctly classified 285 out of 344 malign data and 296 out of 349 benign data (Table 4).

Table 4 Training data set confusion matrix

Classification    Malign     Benign     Sum of rows
Malign            285 (A)    59 (B)     344
Benign            53 (C)     296 (D)    349
Sum of columns    338        355        693

When the diagnostic test performed on the training data in Table 4 was examined using Eqs. 7, 8, 9, 10 and 11 in turn, the following values were found: sensitivity = 0.843, specificity = 0.834, accuracy = 0.838, PVP = 0.828 and PVN = 0.848.

Moreover, of the 131 test data that were not used in ANN training, the ANN correctly classified 55 out of 68 malign data and 57 out of 63 benign data (Table 5).

When the diagnostic test performed on the test data in Table 5 was examined, the following results were reached:

Table 5 Test data set confusion matrix

Classification    Malign     Benign     Sum of rows
Malign            55 (A)     13 (B)     68
Benign            6 (C)      57 (D)     63
Sum of columns    61         70         131

Using Eqs. 7, 8, 9, 10 and 11 respectively, the following values were found: sensitivity = 0.902, specificity = 0.814, accuracy = 0.855, PVP = 0.809 and PVN = 0.905.
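The reported values can be reproduced from the cell counts of Tables 4 and 5 with a short sketch of Eqs. 7 to 11. The function name `confusion_metrics` is introduced only for this example, and the cell labels A, B, C and D follow the tables as printed.

```python
def confusion_metrics(a, b, c, d):
    """Eqs. 7-11 applied to the confusion matrix cells A, B, C, D."""
    return {
        "sensitivity": a / (a + c),            # Eq. 7
        "specificity": d / (d + b),            # Eq. 8
        "accuracy": (a + d) / (a + b + c + d), # Eq. 9
        "PVP": a / (a + b),                    # Eq. 10
        "PVN": d / (d + c),                    # Eq. 11
    }

train = confusion_metrics(285, 59, 53, 296)  # Table 4
test = confusion_metrics(55, 13, 6, 57)      # Table 5

for name, metrics in (("training", train), ("test", test)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Rounded to three decimals, the training row gives 0.843, 0.834, 0.838, 0.828 and 0.848, and the test row gives 0.902, 0.814, 0.855, 0.809 and 0.905, in agreement with the values stated in the text.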

ROC analysis

The basic idea behind medical tests is to estimate the patient’s probability of being sick on the basis of their test results. ROC analysis is used to determine the true accuracy of medical diagnostic tests. Of the terms used in this analysis, sensitivity is the ratio of the sick persons whose tests were positive (True Positive) to the number of patients, whereas specificity is the ratio of healthy persons whose tests were negative (True Negative) to the number of healthy persons [21, 22, 25]. ROC analysis is a standard approach used to determine the sensitivity and specificity of a diagnostic procedure. To this end, ROC curves that define the relationship between the sensitivity and specificity of the diagnosis are used. The axes of the curves are TP (calling the sick sick) and FP (calling the healthy sick). The curves lie between the limits of 0 and 1; proximity to the y-axis and the upper boundary indicates a successful test, while curves with an inclination of 45° indicate an unsuccessful test [21, 22, 25].

Thus, the success of the test can be determined by examining ROC curves. In a successful test, the area under the curve is expected to be larger. Figure 3 shows the ROC curves that indicate the accuracy on the data set used in this study at the training stage, while Fig. 4 shows the ROC curves that indicate test accuracy.

Fig. 3 Graph of the ROC analysis of training data

Fig. 4 Graph of the ROC analysis of test data

Using the equation obtained in the study (Eq. 6), the ANN severity prediction value can be determined quickly and easily from the patient’s BI-RADS features and age.

According to the test data set confusion matrix analysis, the rate of identifying the sick among the truly sick is 90.2%, the rate of identifying the healthy among the truly healthy is 81.4%, and the proportion of all diagnoses, sick and healthy, that are correct is 85.5%. According to these results, when the ANN gives the judgment of sick, the probability of being truly sick is 80.9%. Also, when the ANN gives the judgment of healthy, the probability of being truly healthy is 90.5%.

According to the ROC analysis, the area under the ROC curve for the ANN training data (Fig. 3) was found to be 0.850017, whereas the area under the ROC curve for the ANN test data (Fig. 4) was determined to be 0.870017. The fact that these areas are large (close to 1) indicates that the study was successful in breast cancer diagnostic predictions.

According to both analyses, the BI-RADS features and age data that were used as input parameters can be used to make accurate predictions in breast cancer severity diagnosis.

When this study is compared with studies in the relevant literature that were conducted using different methods, it is seen that the model can be used as a supplementary tool that may eliminate unnecessary biopsies. Moreover, better breast cancer prediction results can be obtained by increasing the number of patient data, adding appropriate input parameters and using artificial intelligence methods that can be used in conjunction with ANN.
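How an ROC curve and the area under it are obtained from network outputs can be sketched as follows. This is a generic illustration using the trapezoidal rule, not the procedure used to produce Figs. 3 and 4; the function names, labels and scores are hypothetical and chosen only for the example.

```python
import numpy as np

def roc_points(labels, scores):
    """Sweep a decision threshold over the scores and return the
    false-positive and true-positive rates at each threshold."""
    labels = np.asarray(labels, dtype=int)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    fpr, tpr = [0.0], [0.0]
    for t in np.sort(np.unique(scores))[::-1]:
        predicted_sick = scores >= t
        tpr.append(float((predicted_sick & (labels == 1)).sum()) / n_pos)
        fpr.append(float((predicted_sick & (labels == 0)).sum()) / n_neg)
    return np.array(fpr), np.array(tpr)

def area_under_curve(fpr, tpr):
    """Trapezoidal-rule area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Made-up labels (1 = malign) and network outputs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(labels, scores)
print("AUC:", area_under_curve(fpr, tpr))
```

An area of 0.5 corresponds to the 45° diagonal of an uninformative test, and an area of 1 corresponds to a curve that follows the y-axis and the upper boundary.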

Acknowledgements This study was supported by Selcuk University Coordination Office of Scientific Research Projects (BAP). Moreover, I would like to express my heartfelt thanks to Prof. Dr. Unal SERT of Selcuk University, Meram Medicine Faculty, who was my advisor and helped me in the evaluation of breast cancer data.

References

1. Firat, D., and Hayran, M., Cancer Statistics in Turkey and in the World (1990–1992). Turkish Association for Cancer Research and Control. İz Press, Ankara, 1995.
2. Şengelen, M., Kutluk, T., and Fırat, D., Cancer statistics in Turkey and in the World (1996–2003). Turkish Association for Cancer Research and Control. İz Press, Ankara, 2007.
3. Stomper, P. C., Connolly, J. C., Meyer, J. E., et al., Clinically occult ductal carcinoma in-situ detected with mammography: Analysis of 100 cases with radiologic-pathologic correlation. Radiology 172–235, 1989.
4. Holland, R., Ductal carcinoma in situ (DCIS). Eur. Radiol. 10:327–330, 2000.
5. American College of Radiology, Breast imaging reporting and data system (BI-RADS), 3rd edition. American College of Radiology, Reston, VA, 1998.
6. Erkul, Z. K., Erkuş, M., Taşkın, F., and Meteoğlu, İ., Liesegang ring calcification in breast biopsy: case report. J. Breast Healthy. 1(1):22–24, 2005.
7. Gülsün, M., Demirkazık, F. B., Köksal, A., and Arıyürek, M., According to BI-RADS assessment of breast microcalcifications and to investigate the agreement between reviewers. Off. J. Turkish Soc. Radiol. 8(3):358–363, 2002.
8. Renee, W., Pinsky, M. D., Mark, A., and Helvie, M. D., Medscape mammographic breast density: Effect on imaging and breast cancer risk: Breast density measurement. JNCCN-J. Natl. Compr. Cancer Netw. 8:1157–1165, 2010.
9. Edward, T. B., Rick, K., and Robert, R., Conn’s current therapy. Elsevier INC, Saunders, Philadelphia, 2011.
10. Elter, M., Schulz-Wendtland, R., and Wittenberg, T., The prediction of breast cancer biopsy outcomes using two CAD approaches that both emphasize an intelligible decision process. Med. Phys. 34(11):4164–4172, 2007.
11. Nassif, H., Page, D., Ayvaci, M., Shavlik, J., and Burnside, E. S., Uncovering age-specific invasive and DCIS breast cancer rules using inductive logic programming. In: Veinot, T. (Ed.), Proceedings of the 1st ACM International Health Informatics Symposium (IHI ‘10). ACM, New York, pp. 76–82, 2010.
12. American College of Radiology, Breast imaging reporting and data system (BI-RADS), 4th edition. American College of Radiology, Reston, VA, 2003.
13. Baker, J. A., Kornguth, P. J., Lo, J. Y., Williford, M. E., and Floyd, C. E., Breast cancer: Prediction with artificial neural network based on BI-RADS standardized lexicon. Radiology 196:817–822, 1995.
14. Markey, M. K., Lo, J. Y., Vargas-Voracek, R., Tourassi, G. D., and Floyd, C. E., Perception error surface analysis: A case study in breast cancer diagnosis. Comput. Biol. Med. 32:99–109, 2002.
15. Floyd, C. E., Lo, J. Y., and Tourassi, G. D., Case-based reasoning computer algorithm that uses mammographic findings for breast biopsy decisions. AJR Am. J. Roentgenol. 175:1347–1352, 2000.
16. Bilska-Wolak, A. O., and Floyd, C. E., Investigating different similarity measures for a case-based reasoning classifier to predict breast cancer. Proc SPIE 4322:1862–1866, 2001.
17. Bilska-Wolak, A. O., and Floyd, C. E., Development and evaluation of a case-based reasoning classifier for prediction of breast biopsy outcome with BI-RADS lexicon. Med. Phys. 29:2090–2100, 2002.
18. Bilska-Wolak, A. O., Floyd, C. E., Lo, J. Y., and Baker, J. A., Computer aid for decision to biopsy breast masses on mammography: Validation on new cases. Acad. Radiol. 12:671–680, 2005.
19. Markey, M. K., Fischer, E. A., and Lo, J. Y., Bayesian networks of BIRADS descriptors for breast lesion classifications. International Conference of the IEEE Engineering in Medicine and Biology Society. San Francisco, California 3031–3034, 2004.
20. UCI Machine Learning Repository: Data Sets 2007. http://archive.ics.uci.edu/ml/datasets/Mammographic+Mass, Accessed 29 March 2011.
21. Saritas, I., Ozkan, I. A., and Sert, I. U., Prognosis of prostate cancer by artificial neural networks. Expert Syst. Appl. 37(9):6646–6650, 2010.
22. Ronco, A. L., and Fernandez, R., Improving ultrasonographic diagnosis of prostate cancer with neural networks. Ultrasound Med. Biol. 25(5):729–733, 1999.
23. Allahverdi, N., Expert systems. An artificial intelligence application. Atlas Press, Istanbul, 2002.
24. Türker, N., Tokan, F., and Yıldırım, T., Determination of artificial neural networks performances by roc analysis on the diagnosis of heart disease. Biomedical Engineering National Congres, BİYOMUT 2005, İstanbul 206–208, 2005.
25. Wichard, J. D., Cammann, H., Stephan, C., and Tolxdorff, T., Classification models for early detection of prostate cancer. Hindawi Publishing Corporation Journal of Biomedicine and Biotechnology 7, 2008.
