
International Machine Vision and Image Processing Conference

GA Based Neuro Fuzzy Techniques for Breast Cancer Identification


Arpita Das1 and Mahua Bhattacharya2
1 Institute of Radio Physics & Electronics, University of Calcutta, 92, A.P.C. Road, Kolkata-700009, India (e-mail: dasarpita_rpe@yahoo.co.in)
2 Indian Institute of Information Technology & Management, Gwalior, Morena Link Road, Gwalior-474010, India (e-mail: mb@iiitm.ac.in)

Abstract
An intelligent computer-aided diagnostic system can assist radiologists in recognizing the masses/lesions appearing in the breast and in grouping them as benign or malignant. In the present work we attempt to develop a computer-assisted treatment planning system that implements Genetic Algorithm based neuro-fuzzy approaches. The boundary-based features of the tumor lesions appearing in the breast are extracted for classification. The shape features, represented by Fourier Descriptors, introduce a large number of feature vectors. To classify different boundaries, a standard classifier therefore needs a large number of inputs, and training it requires a large number of training cycles. This invites the problem of over-learning, followed by a chance of misclassification. In the proposed methodology, a Genetic Algorithm (GA) is used to search for the significant input feature vectors. Finally, an adaptive neuro-fuzzy classifier is introduced for classification of tumor masses in the breast.

1. Introduction
The objective of digital mammography is to detect breast cancer at an early stage of development. Masses appearing in the breast are three-dimensional lesions that represent a sign of breast cancer. Masses are described by their shape, margin and textural characteristics, and may affect the surrounding tissues. The margin is the border of the mass, and it is one of the most important criteria for determining whether the mass belongs to the benign or the malignant group. Round or round-to-oval masses with sharply defined borders have a high likelihood of being benign. A benign mass generally possesses a circumscribed margin. In our earlier work [1],[2] we suggested a shape similarity measure for finding the prognosis of diseases, where the idea of shape similarity was implemented by minimizing a distance function D between the contours of the tumor lesions and the model.

In recent years, considerable effort has gone into developing automated methods for detection and classification of masses. Many researchers have used shape descriptors for the detection of microcalcifications [3]-[4]. A comprehensive study of methods using shape descriptors for classification has been reported in [5]. In breast cancer identification, the development of a fully automated technique for segmentation of tumor masses remains a great challenge. Several investigators have exploited methods that use intensity values to decide whether a pixel belongs to the region of interest or to the background [6]. Sahiner et al. [7] developed an automated, three-stage segmentation algorithm consisting of clustering, active contour, and spiculation detection stages.

The authors of [11] have reported methods for discrimination of benign and malignant lesions in the breast using ultrasound. In the present paper we describe an improved segmentation process for breast tumors using the fuzzy c-means clustering algorithm. To describe the nodular and stellate margins of masses precisely, we introduce Fourier descriptors as shape-representing features [8]. In the next stage, a classifier is designed using adaptive neuro-fuzzy techniques [9] to discriminate benign from malignant growth of the tumor. This method also incorporates automated false positive reduction of mass boundaries.

2. Morphology of Tumor Mass


Masses in mammograms are compact areas that appear brighter than the tissue in which they are embedded because of higher attenuation of X-rays. The primary features that indicate malignancy are related to the mass shape and margin. Fig-1 illustrates the morphological spectrum of breast masses frequently seen on mammograms [10].

978-0-7695-3332-2/08 $25.00 2008 IEEE DOI 10.1109/IMVIP.2008.19


Fig-1: Morphology of mammographic masses

3. Proposed Methodology
In the present study, tumor masses are extracted from the surrounding normal breast tissue by a fuzzy segmentation technique. The significant shape-based boundary features are searched by a genetic algorithm and are fed to an adaptive neuro-fuzzy classifier, which decides whether the masses are benign or malignant. An overview of the proposed method is presented in Fig-2.

Fig-2: Brief overview of our work

3.1 Segmentation of Tumor Mass Using Fuzzy C-Means Clustering Algorithm

Proper segmentation of masses in dense breast tissue identified in mammograms is an important and challenging task. The final classification of a tumor mass into different grades of benignancy/malignancy depends upon the appropriateness of the segmentation technique.

Presently the fuzzy c-means clustering algorithm is used to develop a fully automated, intensity-based segmentation technique for masses. The chosen number of fuzzy cluster centers is three, as shown in Fig-3. Cluster A represents the healthy breast tissue, cluster B represents the false presence of masses (due to dense fibroglandular tissue) and cluster C represents the actual mass region.

Fig-3: Final Fuzzy Partition Membership Functions

The final fuzzy partition membership functions are shown in Fig-3, which shows that the membership functions overlap. The decision rule developed for the present problem is as follows: if the membership grade of a pixel of the mammogram in cluster C is greater than 0.5, the pixel is considered to belong to a calcified lesion. According to this rule, the shaded region in Fig-3 indicates the actual mass. The other two clusters are suppressed to reduce the effect of false positive presence of masses.

Algorithm: Let $X = \{x_1, x_2, \dots, x_n\}$ be a set of given data. A fuzzy c-partition of X is a family of fuzzy subsets of X, denoted by $P = \{A_1, A_2, \dots, A_c\}$, which satisfies

$$\sum_{i=1}^{c} A_i(x_k) = 1 \qquad (1)$$

for every $x_k$ in X. The performance index of a fuzzy partition P, $J_m(P)$, is defined in terms of the cluster centers $v_1, \dots, v_c$ by the formula

$$J_m(P) = \sum_{k=1}^{n} \sum_{i=1}^{c} [A_i(x_k)]^m \, \| x_k - v_i \|^2 \qquad (2)$$

where $\| x_k - v_i \|^2$ represents the squared distance between $x_k$ and $v_i$ (the $v_i$ being the cluster centers) and m > 1 is the weighting exponent that controls the fuzziness of the partition. Clearly, the smaller the value of $J_m(P)$, the better the fuzzy partition P. Thus, the goal of the fuzzy c-means clustering method is to find a fuzzy partition P that minimizes the performance index $J_m(P)$, which yields the cluster centers

$$v_i = \frac{\sum_{k=1}^{n} [A_i(x_k)]^m \, x_k}{\sum_{k=1}^{n} [A_i(x_k)]^m} \qquad (3)$$
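The fuzzy c-means iteration of Eqs. (1) to (3) and the 0.5 membership rule for the mass cluster can be sketched as follows. This is a minimal illustration on one-dimensional intensity values, not the authors' implementation: the function names, the quantile-based initialization, the default m = 2 and the choice of the brightest cluster as cluster C are all assumptions made for the example.

```python
import numpy as np

def fuzzy_c_means(x, c=3, m=2.0, n_iter=100, tol=1e-6):
    """Cluster 1-D intensities; returns (centers v, memberships a[i, k] = A_i(x_k))."""
    x = np.asarray(x, dtype=float).ravel()
    # Assumed initialization: spread the initial centers over the data quantiles.
    v = np.quantile(x, (np.arange(c) + 0.5) / c)
    for _ in range(n_iter):
        # Squared distances ||x_k - v_i||^2, floored to avoid division by zero.
        d2 = np.maximum((x[None, :] - v[:, None]) ** 2, 1e-12)
        # Standard membership update; every column sums to 1, as Eq. (1) requires.
        inv = d2 ** (-1.0 / (m - 1.0))
        a = inv / inv.sum(axis=0, keepdims=True)
        # Eq. (3): centers are the membership-weighted means of the data.
        am = a ** m
        v_new = (am @ x) / am.sum(axis=1)
        done = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if done:
            break
    return v, a

def performance_index(x, v, a, m=2.0):
    """Eq. (2): membership-weighted sum of squared distances to the centers."""
    x = np.asarray(x, dtype=float).ravel()
    return float(np.sum((a ** m) * (x[None, :] - v[:, None]) ** 2))

def mass_mask(v, a, threshold=0.5):
    """Keep pixels whose membership in the brightest cluster exceeds the threshold."""
    brightest = int(np.argmax(v))  # assumption: cluster C is the brightest one
    return a[brightest] > threshold
```

For an image, the pixel intensities would be flattened before clustering and the resulting mask reshaped back to the image size.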

3.2 Extraction of boundary as feature using Fourier Descriptors

Feature selection is the choice of descriptors in a particular application. The boundary features carry the information about the shape and margin of the segmented masses. Presently we introduce Fourier Descriptors to represent the mass boundary. Conventional shape descriptors such as compactness, area, and the number of concavity/convexity points cannot discriminate between nodular and stellate masses [Fig-1]. The proposed boundary description using Fourier Descriptors differentiates these two types of masses very precisely.

Algorithm: Consider a tumor described by a K-point digital boundary in the x-y plane, starting at an arbitrary point $(x_0, y_0)$ and passing through the coordinate pairs $(x_1, y_1), (x_2, y_2), \dots, (x_{K-1}, y_{K-1})$ along the boundary. These coordinates are written as $x(k) = x_k$ and $y(k) = y_k$. Thus the boundary can be represented as

$$s(k) = [x(k), y(k)], \quad k = 0, 1, 2, \dots, K-1 \qquad (4)$$

Each coordinate pair can be treated as a complex number, so that

$$s(k) = x(k) + j\,y(k), \quad k = 0, 1, 2, \dots, K-1 \qquad (5)$$

The x-axis is treated as the real axis and the y-axis as the imaginary axis. The Discrete Fourier Transform (DFT) of s(k) is

$$a(u) = \frac{1}{K} \sum_{k=0}^{K-1} s(k)\, e^{-j 2 \pi u k / K}, \quad u = 0, 1, 2, \dots, K-1 \qquad (6)$$

The complex coefficients a(u) are known as the Fourier Descriptors of the boundary.

Compactness is another frequently used boundary descriptor. It is defined as (perimeter)^2/area. Compactness is minimal for a round figure, and it is insensitive to the orientation of the image. In this paper, the compactness measurement is used as an important shape feature.

3.3 Introduction to Genetic Algorithm for Reduction of Feature Subspace

The feature selection problem (FSP) is an important issue in machine learning. It basically consists of finding a feature subset of the input training and test patterns that is able to describe all of the information required to classify them. Boundary or margin detection of masses based on Fourier Descriptors introduces a large number of feature vectors. To classify different boundaries, a standard classifier therefore needs a large number of inputs, which leads to the problem of over-learning and may introduce a chance of misclassification. To solve this problem we introduce an optimization technique for feature selection based on a Genetic Algorithm (GA). A GA uses three operators, namely selection (or reproduction), crossover and mutation, to achieve the goal of evolution. Presently the GA is used to search for two significant Fourier shape descriptors that are able to represent a particular class of tumors. The compactness measure is used as the third important shape feature. Different image boundaries are recognized on the basis of their Fourier Descriptors, which play the role of the objective functions that are maximized in the search for the significant descriptors.

3.4 Classification of Significant Features

The proposed method uses an adaptive neuro-fuzzy network for classification of the features into benign and malignant groups. The approach is robust, since the Adaptive Neuro-Fuzzy Inference System (ANFIS) is a soft computing approach that is able to handle the uncertainty present in the system. As a result, the decision taken by the intelligent expert system is closer to reality.

Adaptive neuro-fuzzy architecture: Fig-4 illustrates the structure of the adaptive neuro-fuzzy inference architecture for boundary detection of tumor masses, where the nodes of the same layer have similar functions, as described below.

Layer 1: Every node i in this layer is an adaptive node with a node function $O_{1,i} = \mu_{A_i}(x)$, $O_{1,i} = \mu_{B_i}(y)$, for i = 1, 2, where x (or y) is the input to node i and $A_i$ (or $B_i$) is a linguistic label (such as "large" or "small") associated with this node. In other words, $O_{1,i}$ is the membership grade of a fuzzy set A ($A_1$, $A_2$) or B ($B_1$, $B_2$). The membership function for A may be any appropriate parameterized membership function, such as the generalized bell function shown below, with $\{a_i, b_i, c_i\}$ as the premise parameter set. As the values of these parameters change, the bell-shaped function varies accordingly.
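The generalized bell membership function used in Layer 1 of the ANFIS (Eq. 7) can be written in a few lines. The sketch below assumes the standard form of that function with premise parameters a (width), b (slope) and c (center); the function name `gbell` is hypothetical.

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell membership grade: 1 / (1 + |(x - c) / a|^(2b))."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))
```

The grade is 1 at x = c and 0.5 at x = c ± a, and a larger b makes the shoulders of the bell steeper, which is how changing the premise parameters reshapes the membership function.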

$$\mu_{A_i}(x) = \frac{1}{1 + \left| \dfrac{x - c_i}{a_i} \right|^{2 b_i}} \qquad (7)$$

Layer 2: Every node in this layer is a fixed node labeled Π, whose output is the product of all the incoming signals:

$$O_{2,i} = w_i = \mu_{A_i}(x)\, \mu_{B_i}(y), \quad i = 1, 2 \qquad (8)$$

In general, any T-norm operator that performs fuzzy AND can be used as the node function in this layer.

Layer 3: Every node in this layer is a fixed node labeled N. The i-th node calculates the ratio of the i-th rule's firing strength to the sum of all rules' firing strengths:

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \qquad (9)$$

Layer 4: Every node i in this layer is an adaptive node with a node function

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \,(p_i x + q_i y + r_i) \qquad (10)$$

where $\bar{w}_i$ is the normalized firing strength from layer 3 and $\{p_i, q_i, r_i\}$ is the parameter set of this node. The parameters of this layer are referred to as consequent parameters.

Layer 5: The single node in this layer is a fixed node labeled Σ, which computes the overall output as the summation of all incoming signals:

$$O_{5,1} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \qquad (11)$$

Fig-4: The ANFIS Model for Final classification

3.4.1 Hybrid Learning Rule for Training ANFIS
The hybrid learning rule combines the Gradient Descent (GD) method and the least-squares estimator (LSE) for fast identification of the parameters of the adaptive neuro-fuzzy model. In the hybrid learning approach, each epoch is composed of a forward pass and a backward pass, as shown in Table-1. The hybrid method converges much faster than pure back-propagation learning, since it reduces the dimension of the search space. The hybridization of neuro-fuzzy approaches is also robust and adaptive, even in a noisy, uncertain environment. In the present paper we reduce the input feature vector size to 3, and two bell-shaped membership functions are assigned to each input variable. Thus the number of fuzzy if-then rules for ANFIS learning is 2^3 = 8.

Table 1: Summary of hybrid learning procedure

                        | Forward pass | Backward pass
Premise parameters      | Fixed        | GD
Consequent parameters   | LSE          | Fixed
Signals                 | Node outputs | Error signals
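The forward pass of the hybrid learning procedure, for three inputs with two bell-shaped membership functions each (8 rules), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the premise parameter values used with it are hypothetical, the product is used as the T-norm, and the backward pass (gradient descent on the premise parameters) is omitted.

```python
import itertools
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell membership grade, Eq. (7)."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))

def normalized_firing(X, premise):
    """Layers 1 to 3. premise[j] is a list of (a, b, c) tuples for input j."""
    X = np.asarray(X, dtype=float)
    # Layer 1: membership grades, shape (n_samples, n_mfs) for each input.
    grades = [np.stack([gbell(X[:, j], *p) for p in mfs], axis=1)
              for j, mfs in enumerate(premise)]
    # Layer 2: one rule per combination of membership functions (product T-norm).
    combos = itertools.product(*[range(len(mfs)) for mfs in premise])
    w = np.stack([np.prod([grades[j][:, k] for j, k in enumerate(combo)], axis=0)
                  for combo in combos], axis=1)
    # Layer 3: normalize the firing strengths across rules.
    return w / w.sum(axis=1, keepdims=True)

def design_matrix(X, w_bar):
    """Layer 4 written as a matrix that is linear in the consequent parameters."""
    X = np.asarray(X, dtype=float)
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # inputs plus bias term
    return (w_bar[:, :, None] * X1[:, None, :]).reshape(X.shape[0], -1)

def fit_consequents(X, y, premise):
    """Forward pass: premise parameters fixed, consequents found by least squares."""
    phi = design_matrix(X, normalized_firing(X, premise))
    theta, *_ = np.linalg.lstsq(phi, np.asarray(y, dtype=float), rcond=None)
    return theta

def predict(X, premise, theta):
    """Layer 5: overall output as the sum of the weighted rule outputs."""
    return design_matrix(X, normalized_firing(X, premise)) @ theta
```

Because the network output is linear in the consequent parameters once the premise parameters are fixed, the least-squares step solves for all of them at once, which is the reason the forward pass needs no iterative search.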

3.5 Decision Making Logic
Design of an appropriate decision rule is the most important step in a successful pattern recognition scheme. Presently the classification of tumor masses is carried out by extracting shape features of the patterns. The proposed ANFIS model has been trained with round-shaped benign masses as indicated by the radiologists. Our objective is to classify each of the test masses as belonging to the benign or the malignant stage. For this purpose, it is necessary to define a Euclidean distance function $\varepsilon_1$ that determines the deviation from roundness:

$$\varepsilon_1 = D_1 - O_1 \qquad (12)$$

where $D_1$ is the desired output value of a benign mass and $O_1$ is the obtained output value of the test mass. The degree of malignancy is higher for higher values of $\varepsilon_1$. The decisions regarding the prognosis of test masses are defined below:

If $\varepsilon_1 \le 20$, the shape and margin of the test mass are considered "Almost Round or Round to Oval Shape & Smooth Boundary" (Benign).
If $20 \le \varepsilon_1 \le 40$, the shape and margin of the test mass are considered "Lobulated & Non-Circumscribed Boundary" (Tendency towards Malignancy).
If $\varepsilon_1 > 40$, the shape and margin of the test mass are considered "Irregular & Ill-defined Boundary" (Malignant).
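The three-way decision rule on the distance value of Eq. (12) can be sketched as a small function. The name `classify_mass` is hypothetical, and because the stated ranges overlap at exactly 20, the sketch assumes that a value of 20 is assigned to the benign group.

```python
def classify_mass(distance):
    """Map the distance value of Eq. (12) to (shape and margin, prognosis)."""
    if distance <= 20:  # assumption: the boundary value 20 counts as benign
        return ("Almost Round or Round to Oval Shape & Smooth Boundary", "Benign")
    if distance <= 40:
        return ("Lobulated & Non-Circumscribed Boundary", "Tendency towards Malignancy")
    return ("Irregular & Ill-defined Boundary", "Malignant")
```

Applied to the values reported in Table-2, 8.4197 falls in the benign group, 30.1899 in the intermediate group and 43.2399 in the malignant group.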

4. Experimental Results
We have applied the proposed algorithm to a database consisting of 200 images. The classifier was first trained with obvious benign masses as identified by the radiologists, and the other, non-obvious test cases were classified during the experiment. The segmentation and final classification into benign or malignant of a few non-obvious cases are given in Fig-5 and Table-2. The successful classification rate of the proposed methodology is almost 87%.

Fig-5: Accurate segmentation of non-obvious test masses, cases [1] to [6]

Fig-5(a): Mammograms from MIAS Database

Fig-5(b): Suspicious cutouts of masses, extracted from the mammograms

Fig-5(c): Accurate contour segmentation of masses using FCM

Table-2: Final decision on the degree of benignancy/malignancy of the test masses

Database | Value of distance ε1 (Eq. 12) | Decision on the Mass Shape    | Final Decision on Prognosis
Data-1   | 8.4197                        | Almost Round or Round to Oval | Benign Stage
Data-2   | 43.2399                       | Irregular & Ill-defined       | Possibly in Malignant Stage
Data-3   | 42.7143                       | Irregular & Ill-defined       | Possibly in Malignant Stage
Data-4   | 43.2407                       | Irregular & Ill-defined       | Possibly in Malignant Stage
Data-5   | 43.8239                       | Irregular & Ill-defined       | Possibly in Malignant Stage
Data-6   | 30.1899                       | Lobulated & Non-Circumscribed | Tendency towards Malignant Stage

5. Discussion
The proposed methodology for breast tumor classification from mammograms is based on an adaptive neuro-fuzzy model applied to the extracted boundary of the lesion, which is the region of interest. The classification predicts the prognosis of the disease, either towards benignancy or towards malignancy. The output node value of the classifier indicates the deviation, or distance function, of the test masses with respect to the trained benign masses. The genetic algorithm is used to overcome the problem of over-learning and the chance of misclassification that arise from feature extraction and representation for the adaptive neuro-fuzzy classifier. The performance of the proposed technique is satisfactory in 87% of cases. The results have been further verified by physicians/radiologists.

Acknowledgement
The authors would like to thank Dr. S. K. Sharma of EKO X-ray and Imaging Institute, Kolkata. The authors also acknowledge CSIR, Govt. of India for financial support to continue this research work.

References
[1]. M. Bhattacharya, D. Dutta Majumder, Knowledge Based Approach to Medical Image Processing in Pattern Directed Information Analysis (Algorithms, Architecture & Applications), publisher: New Age Wiely, 2008.
[2]. D. Dutta Majumder and M. Bhattacharya, Cybernetic Approach To Medical Technology: Application To Cancer Screening And Other Diagnostics, Millennium Volume of Kybernetes, International Journal of Systems & Cybernetics, MCB publications UK, vol. 29, no. 7/8, pp. 871-895, 2000.
[3]. D. H. Davies and D. R. Dance, Automatic computer detection of clustered calcifications in digital mammograms, Phys. Med. Biol., vol. 35, no. 8, pp. 1111-1118, 1990.
[4]. L. Shen, R. M. Rangayyan, and J. E. L. Desautels, Application of shape analysis to mammographic calcifications, IEEE Trans. Med. Imag., vol. 13, pp. 263-274, 1994.
[5]. J. Kilday, F. Palmieri, and M. D. Fox, Classifying mammographic lesions using computerized image analysis, IEEE Trans. Med. Imag., vol. 12, pp. 664-669, 1993.
[6]. A. J. Mendez, P. G. Tahoces, M. J. Lado, M. Souto, and J. J. Vidal, Computer-aided diagnosis: Automatic detection of malignant masses in digitized mammograms, Medical Physics, vol. 25, no. 6, pp. 957-964, 1998.
[7]. B. Sahiner, H. P. Chan, N. Petrick, M. A. Helvie, and L. M. Hadjiiski, Improvement of mammographic mass characterization using spiculation measures and morphological features, Med. Physics, vol. 28, pp. 1455-1465, 2001.
[8]. M. Bhattacharya and A. Das, Discrimination for Malignant and Benign Masses in Breast Using Mammogram: A Study on Adaptive Neuro-Fuzzy Approaches, Proc. of Indian International Conference on Artificial Intelligence (IICAI-07), Pune, India, by Springer Link, pp. 1007-1026, 17-19 Dec. 2007.
[9]. J.-S. R. Jang, ANFIS: Adaptive-Network based Fuzzy Inference Systems, IEEE Trans. Systems, Man and Cybernetics, vol. 23, no. 3, pp. 665-685, 1993.
[10]. L. M. Bruce and R. R. Adhami, Classifying Mammographic Mass Shapes Using the Wavelet Transform Modulus-Maxima Method, IEEE Trans. Medical Imaging, vol. 18, no. 12, pp. 1170-1177, 1999.
[11]. C. K. Abbey, R. J. Zemp, J. Liu, K. K. Lindfors, and M. F. Insana, Observer Efficiency in Discrimination Tasks Simulating Malignant and Benign Breast Lesions Imaged With Ultrasound, IEEE Trans. Medical Imaging, vol. 25, no. 2, pp. 198-209, 2006.
