Journal of Computer Applications (JCA) ISSN: 0974-1925, Volume VI, Issue 3, 2013

Dyadic Wavelet Transform based Classification of Microcalcifications Using SVM
Suman Mishra a,*, Hariharan Ranganathan b,1
Abstract - Breast cancer is the leading cause of cancer-related deaths among women worldwide. The mortality rate can be reduced significantly by detecting the disease at an early stage. Among the various screening programs, X-ray mammography is an effective tool for breast cancer detection. Detecting microcalcifications in a mammogram is cumbersome and time consuming for radiologists. Suspicion of breast cancer arises from the presence of microcalcification clusters; however, radiologists are prone to misjudging whether a suspected region is malignant or benign. We have developed an algorithm to assist radiologists in the diagnosis of microcalcification clusters. In this work we investigate the performance of a Computer Aided Diagnosis (CAD) system for the detection of clustered microcalcifications and their classification as benign or malignant. Dyadic Wavelet Transform (DWT) features are extracted from preprocessed images and passed to a support vector machine (SVM) classifier. The methodology is evaluated on mammogram images downloaded from the DDSM database. The results show that the proposed system achieves good classification accuracy.

Index Terms – Benign, Computer Aided Diagnosis (CAD) System, DDSM Database, Dyadic Wavelet Transformation, Malignant, Microcalcification Clusters, Support Vector Machine, X-ray Mammography

Manuscript received 10/September/2013. Suman Mishra, Research Scholar, Department of Electronics Engineering, Sathyabama University, Chennai, India, Phone / Mobile No. 9444176981, (E-mail: emailssuman@gamil.com). Hariharan Ranganathan, Principal, Rajiv Gandhi College of Engineering, Sriperumbudur, India, Phone / Mobile No. 8939953116, (E-mail: ranlal@yahoo.com).

I. INTRODUCTION
Breast cancer is the most common form of cancer in women today and the second most common cause of cancer-related deaths. If it is detected early, however, the chances of treating it successfully are considerably better. As per WHO reports, nearly two million women are diagnosed with breast cancer worldwide every year. Digital mammography has proven to be an efficient tool for detecting breast cancer before clinical symptoms appear [1]. Mammography produces X-ray images of the breast, making the internal breast structure visible so that any abnormality present can be detected. Microcalcifications are among the earliest signs of breast carcinoma and can indicate non-palpable breast disease. Owing to its high spatial resolution, mammography enables the detection of microcalcifications at an early stage; however, it is tedious and time consuming for radiologists to distinguish malignant microcalcifications from benign ones. This results in unnecessary and painful biopsy examinations [2, 3]. Because of their subtle nature, microcalcifications are often missed in the mammogram. Some studies state that in up to 40% of cases unambiguous signs of cancer were missed by radiologists, resulting in patient deaths. Thus the reliable classification of microcalcifications into malignant and benign categories plays a crucial role in early breast cancer diagnosis.

The proposed system for classification of microcalcifications consists of several steps. In the first step, regions of interest (ROI) are specified. The second step performs preprocessing and enhancement using the dyadic wavelet transform. The final step performs feature extraction and classification with an SVM classifier. The methodology is applied to mammographic images taken from the DDSM database [4].

The remainder of this paper is organized as follows. Section II gives a brief review of related work. The methodology is explained in Section III. Experimental results are discussed in Section IV, and the conclusion is presented in the last section.

II. RELATED WORK
Researchers have used different techniques for the classification of microcalcifications. Microcalcifications appear in mammograms in varying sizes, and researchers have therefore attempted to classify them into malignant and benign categories by a variety of techniques. Most of the recently developed techniques use a feature-based approach built on multiscale filter bank decompositions. H. Yoshida et al. [5] applied a Discrete Wavelet Transform (DWT) with dyadic scales. Each wavelet scale is multiplied by a weight factor, and the image is then reconstructed by applying the inverse transform. The weights are determined by supervised learning using a set of training cases. N.V.S. Sree Rathna Lakshmi et al. [6] developed a method for detecting microcalcifications in mammograms based on a combined feature set with Ant Colony Optimization (ACO). The diagonal matrix 'S' obtained from the Singular Value Decomposition (SVD) of the LL band of the wavelet transform was used as one of the feature sets for classification of the mammogram. Jacobi moments were employed for detecting microcalcifications in mammograms. ACO was used to reduce the dimensionality of the Jacobi feature set by selecting a subset of features that performed well in the classification phase. R. Nakayama et al. [7] developed a filter bank based on the concept of the Hessian matrix for classification. Eight features were extracted from each region of interest and used with a Bayes discriminant function. The method was evaluated using 600 mammograms.


Liyang Wei et al. [8] studied several machine learning methods for the classification of clustered microcalcifications: support vector machine (SVM), kernel Fisher discriminant (KFD), relevance vector machine (RVM), and committee machines. Many techniques for handling imbalanced data in binary-class and multi-class problems have also been formulated. Armando Bazzani et al. [9] investigated the performance of a Computer Aided Diagnosis (CAD) system for the detection of clustered microcalcifications in mammograms. Their detection algorithm combines difference-image techniques and Gaussianity statistical tests to find the most obvious signals, and exploits a multiresolution analysis by means of the wavelet transform to discover more subtle microcalcifications. A false-positive reduction step then separates false signals from microcalcifications by means of an SVM classifier. The algorithm was tested on 40 images from the Nijmegen database. Akram I. Omara et al. [10] used both the wavelet coefficients and statistical measures of different wavelet detail levels as features that effectively describe normal and abnormal regions. Two techniques were used for the classification stage: the minimum distance classifier and the voting K-Nearest Neighbor classifier.

We observe that there is a need for techniques that classify microcalcifications in mammogram images automatically, with little or no human participation. Such an objective gives greater importance to mammogram denoising, enhancement methods, and efficient classification.

III. METHODOLOGY
The proposed methodology consists of region of interest (ROI) specification, preprocessing, enhancement, feature extraction and classification. The block diagram illustrating the methodology is shown in Fig. 1. This work does not address the detection of microcalcifications.

A. Region of Interest (ROI) Identification
The first stage of microcalcification classification is ROI identification. The mammogram image is decomposed by an undecimated wavelet transform (a filter bank implementation without down-sampling). The resulting horizontal or vertical detail image is used to identify the regions encircling the microcalcification clusters. The third and fourth order statistical parameters, skewness and kurtosis [13], are used to find these regions. An estimate of the skewness is given by

$$\gamma_3 = \frac{1}{N}\sum_{i=1}^{N}\frac{(x_i-\bar{x})^3}{\sigma^3} \qquad (1)$$

and the kurtosis is given by

$$\gamma_4 = \frac{1}{N}\sum_{i=1}^{N}\frac{(x_i-\bar{x})^4}{\sigma^4} \qquad (2)$$

where $x_i$ is the input data over N observations, $\bar{x}$ is the ensemble average of $x_i$ and $\sigma$ is its standard deviation. The two estimates are calculated for every overlapping 32x32 square region of the horizontal or vertical band-pass image. A region with skewness greater than 0.2 and kurtosis greater than 4 is marked as a region of interest (ROI).
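The ROI test described above can be illustrated with a minimal sketch. It assumes that skewness and kurtosis are the third and fourth standardized moments with 1/N normalization, and that neighbouring 32x32 blocks overlap by half a block (a step of 16 pixels); the function names and the step size are hypothetical choices for illustration, not taken from the paper.

```python
import math


def skewness_kurtosis(values):
    """Return (skewness, kurtosis) as third and fourth standardized moments."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0.0:
        # A flat block has no asymmetry or impulsiveness to measure.
        return 0.0, 0.0
    sigma = math.sqrt(variance)
    skew = sum((v - mean) ** 3 for v in values) / (n * sigma ** 3)
    kurt = sum((v - mean) ** 4 for v in values) / (n * sigma ** 4)
    return skew, kurt


def find_roi_blocks(image, block=32, step=16, skew_min=0.2, kurt_min=4.0):
    """Return top-left (row, col) corners of blocks flagged as ROI.

    `image` is a 2-D list of band-pass coefficients; `step` < `block`
    makes neighbouring blocks overlap (assumed 50% here).
    """
    rows, cols = len(image), len(image[0])
    flagged = []
    for r in range(0, rows - block + 1, step):
        for c in range(0, cols - block + 1, step):
            pixels = [image[i][j]
                      for i in range(r, r + block)
                      for j in range(c, c + block)]
            skew, kurt = skewness_kurtosis(pixels)
            if skew > skew_min and kurt > kurt_min:
                flagged.append((r, c))
    return flagged


# Usage: a flat 64x64 image with a single bright spot near the corner.
demo = [[0.0] * 64 for _ in range(64)]
demo[5][5] = 10.0
print(find_roi_blocks(demo))
```

An isolated bright coefficient makes the block distribution strongly right-skewed and heavy-tailed, which is why only the block containing the spot passes both thresholds.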

B. Preprocessing
Artifacts and noise in the mammogram are removed in the preprocessing step. Preprocessing also removes the pectoral muscle to reduce the area processed during classification of microcalcifications. In this phase the mammogram is morphologically opened with a structuring element and the image is reconstructed. A thresholding operation is then applied to the difference image, with a threshold value obtained experimentally. Morphological operators and a Sobel edge detector are applied to smooth irregularities and to detect the edges.

C. Microcalcification Enhancement
Microcalcifications appear in the mammogram as subtle bright spots whose size varies from 0.3 mm to 1 mm. Enhancing the microcalcification regions is not easy, since the surrounding dense breast tissue makes the abnormal areas almost invisible. Microcalcifications are high-frequency in nature and can therefore be extracted by high-pass filtering. However, conventional enhancement techniques such as unsharp masking, homomorphic filtering and high-boost filtering tend to change the characteristics of the microcalcifications. To overcome this limitation, the microcalcification regions are enhanced by the dyadic wavelet transform, which does not modify their characteristics.

Wavelet analysis permits the decomposition of an image at different levels of resolution. Fig. 2 shows the filter bank structure of the two-dimensional wavelet transform from level j to level j+1, which generates four sub-images at level j+1. Let Sj be the input image at level j. The approximation sub-image Sj+1 is obtained by applying the vertical low-pass filter followed by the horizontal low-pass filter to Sj. The sub-image Sj+1 LH is obtained by applying the vertical low-pass filter followed by the horizontal high-pass filter. The sub-image Sj+1 HL is obtained by applying the vertical high-pass filter followed by the horizontal low-pass filter. Finally, the sub-image Sj+1 HH is obtained by applying the vertical and horizontal high-pass filters successively [11]. Down-sampling by a factor of 2 follows each filtering stage. The same procedure is repeated on the approximation coefficients of each level until Sj+n is reached.

The digitized mammograms, of size 1024 x 1024 pixels, were taken from the Digital Database for Screening Mammography (DDSM). Each 1024 x 1024 grayscale mammogram was decomposed to 6 levels by applying the dyadic wavelet transform with the Daubechies-4 wavelet and a decimation factor of 2. The result is the coarsest approximation image S6 (nominally 16 x 16 pixels after six decimations by 2 of a 1024 x 1024 image). Since microcalcifications are high-frequency structures, enhancement is achieved by setting S6 to zero. The detail coefficients are enhanced as per equation (3), where x and y are the spatial coordinates, D represents all horizontal, vertical and diagonal sub-bands, and Tj is a non-negative threshold obtained by taking the standard deviation of the respective sub-image. The best visual quality of the microcalcifications is obtained when the gain (g) is set to 1.2.
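A minimal sketch of threshold-gated gain on one detail sub-band follows. It assumes a piecewise rule in which coefficients whose magnitude reaches the threshold Tj are multiplied by the gain g and all others are left unchanged; this rule is an assumption made for illustration and is not necessarily the exact form of equation (3). The function name `enhance_subband` is hypothetical.

```python
import statistics


def enhance_subband(coeffs, gain=1.2):
    """Amplify strong detail coefficients of one sub-band.

    `coeffs` is a 2-D list of wavelet detail coefficients D_j(x, y).
    The threshold T_j is the standard deviation of the sub-band, as
    described in the text; the piecewise gain rule itself is assumed.
    """
    flat = [d for row in coeffs for d in row]
    threshold = statistics.pstdev(flat)  # T_j >= 0 by construction
    return [[gain * d if abs(d) >= threshold else d for d in row]
            for row in coeffs]


# Usage: only the coefficient that stands out from the background is boosted.
subband = [[0.0, 0.0], [0.0, 4.0]]
print(enhance_subband(subband))
```

Gating the gain by the sub-band standard deviation keeps low-amplitude coefficients, which are more likely to be noise or smooth tissue, from being amplified along with the bright spots.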


Figure 1. Flow diagram of the complete CAD system (DDSM Database -> Digital Mammogram -> ROI Specification -> Preprocessing & Enhancement by Dyadic Wavelet Transform -> Feature Extraction -> SVM Based Classification -> Result: Malignant / Benign)

Figure 2. Two dimensional dyadic wavelet structure (input Sj passes through the filters HL(Z) and HH(Z), each followed by down-sampling by 2, producing the sub-images Sj+1, Sj+1 LH, Sj+1 HL and Sj+1 HH)

Figure 3. Preprocessing Stages (a) Original Image (b) Resized Image (c) Morphological Dilated Image


Table 1 Classification accuracy of the proposed approach based on energy features

                 Classification accuracy (%)
Decomposition    KNN                    SVM
Level            Benign   Malignant     Benign   Malignant
1                74       91            70       97
2                72       90            68       99
3                84       92            66       99
4                72       89            66       100
5                74       87            66       100
6                82       87            66       100

Table 2 Classification accuracy of the proposed approach based on entropy features

                 Classification accuracy (%)
Decomposition    KNN                    SVM
Level            Benign   Malignant     Benign   Malignant
1                80       88            76       91
2                78       90            66       95
3                78       92            70       96
4                76       85            66       99
5                80       86            68       100
6                76       86            66       100

Figure 5. Average classification accuracy obtained from KNN and SVM classifier using entropy features (x-axis: Decomposition Level, 1 to 6; y-axis: Classification Accuracy (%); series: KNN, SVM)

Table 3 Classification accuracy of the proposed approach based on energy and entropy features

                 Classification accuracy (%)
Decomposition    KNN                    SVM
Level            Benign   Malignant     Benign   Malignant
1                92       94            98       94
2                93       95            96       96
3                90       92            93       98
4                96       91            96       100
5                95       97            98       100
6                96       94            96       100

Figure 6. Average classification accuracy obtained from KNN and SVM classifier using energy and entropy features (x-axis: Decomposition Level, 1 to 6; y-axis: Classification Accuracy (%); series: KNN, SVM)

Reconstruction from the weighted high-frequency sub-bands makes the microcalcification regions more visible than the other breast regions.

D. Feature Extraction
Feature extraction reduces the amount of resources required to describe a large set of data accurately. Analysis with a large number of variables generally requires a large amount of memory and computation power, and may lead to a classification algorithm that overfits the training sample and generalizes poorly to new samples. Feature extraction is a general term for methods of constructing combinations of the variables to get around these problems while still describing the data with sufficient accuracy. To classify potential microcalcifications accurately, two features, energy and entropy, are extracted. These features are given to the SVM classifier for classification.

E. SVM Classifier
The support vector machine is a learning technique that is well founded in modern statistical learning theory [14]. Support vector machines use the training data to create the optimal separating hyperplane between the two classes. The optimal hyperplane maximizes the margin to the closest data points. In this way the SVM reduces the misclassification probability of new cases. The optimal separating hyperplane is computed as a decision surface of the form:

Figure 4. Average classification accuracy obtained from KNN and SVM classifier using energy features (x-axis: Decomposition Level, 1 to 6; y-axis: Classification Accuracy (%); series: KNN, SVM)

$$g(x) = \sum_{i=1}^{l_s} \alpha_i d_i K(x_i, x) + b \qquad (4)$$

where $x_i$ are the support vectors, which are determined from the training data; $K(\cdot,\cdot)$ is the inner product kernel, which must satisfy Mercer's theorem [14] and is used to map the data from its original dimension to a higher dimension in which the data is linearly separable; $l_s$ is the number of support vectors; $d_i \in \{-1,+1\}$ is the class indicator of $x_i$; and b is the bias. The coefficients $\alpha_i$ are calculated by solving the quadratic programming problem:

Maximize
$$W(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} \alpha_i \alpha_j d_i d_j K(x_i, x_j) \qquad (5)$$

subject to
$$\sum_{i=1}^{l} \alpha_i d_i = 0, \qquad 0 \le \alpha_i \le C \quad \text{for } i = 1, \ldots, l \qquad (6)$$

where l is the number of training samples and C is a user-specified positive regularization parameter used to control the amount of allowed overlap between classes. Given the expression g(x), the decision is based on the sign of g(x):

$$\text{Decision} = \begin{cases} +1, & g(x) \ge 0 \\ -1, & g(x) < 0 \end{cases} \qquad (7)$$

The average classification accuracies of the SVM and KNN classifiers with the energy feature, the entropy feature and the combination of both are shown in Fig. 4, Fig. 5 and Fig. 6. It is evident that the accuracy of the SVM classifier is better than that of the KNN classifier when both features, energy and entropy, are taken into consideration.

V. CONCLUSION
Breast cancer is the most common life-threatening type of cancer affecting women. Mammography is an effective screening method for breast cancer. Microcalcification clusters are a feature associated with the disease. Microcalcifications are tiny objects, and classifying them into malignant and benign categories is a challenging task. In this work we have attempted to develop a method for the classification of microcalcifications. For accurate classification we have developed a CAD system that uses the dyadic wavelet transform and an SVM classifier. This classifier classifies the potential microcalcifications effectively because it handles the imbalanced data well. We have implemented our method and applied it to the DDSM dataset. The classification step is time consuming, but our approach increases the classifier performance.

REFERENCES
[1] Tripty Singh, Sarita Singh Bhadauoria, A. K. Wadwani and S. Wadhwani, "Contrast Enhancement of Clusters In Images Using Fuzzy-Rule Based Algorithm", International Journal of Advances in Science and Technology, Vol. 2, No. 2, pp. 18-28, Feb. 2011.
[2] H. D. Cheng et al., "Computer-aided detection and classification of microcalcifications in mammograms: A survey", Pattern Recognition, Vol. 36, No. 12, pp. 2967-2991, 2003.
[3] H. Soltanian-Zadeh et al., "Comparison of multiwavelet, wavelet, Haralick, and shape features for microcalcification classification in mammograms", Pattern Recognition, Vol. 37, No. 10, pp. 1973-1986, 2004.
[4] D. K. M. Heath, K. W. Bowyer, "Current status of the digital database for screening mammography", in Proceedings of the Fourth International Workshop on Digital Mammography, Kluwer Academic Publishers, pp. 457-460, 1998.
[5] H. Yoshida, K. Doi and R. M. Nishikawa, "Automated detection of clustered microcalcifications in digital mammograms using wavelet transform techniques", Medical Imaging, SPIE, pp. 868-886, 1994.
[6] N.V.S. Sree Rathna Lakshmi and C. Manoharan, "Wavelet Analysis and Orthogonal Moments based Classification of Microcalcification in Digital Mammograms", Journal of Computer Science, Vol. 7, No. 10, pp. 1541-1544, 2011.
[7] R. Nakayama, Y. Uchiyama, K. Yamamoto, et al., "Computer-aided diagnosis scheme using a filter bank for detection of microcalcification clusters in mammograms", IEEE Transactions on Biomedical Engineering, Vol. 53, No. 2, pp. 273-283, 2006.
[8] Liyang Wei, Yongyi Yang, Robert M. Nishikawa, and Yulei Jiang, "A Study on Several Machine-Learning Methods for Classification of Malignant and Benign Clustered Microcalcifications", IEEE Transactions on Medical Imaging, Vol. 24, No. 3, March 2005.
[9] Armando Bazzani et al., "Automatic detection of clustered microcalcifications in digital mammograms using an SVM classifier", ESANN'2000 proceedings, European Symposium on Artificial Neural Networks, Bruges (Belgium), 26-28 April 2000, D-Facto public., ISBN 2-930307-00-5, pp. 195-200.
[10] Akram I. Omara, Ahmed S. Mohamed, Abo-Bakr M. Youssef, and Yasser M. Kadah, "Computer Aided Diagnosis in Digital Mammography", Proceedings of the 3rd Cairo International Biomedical Engineering Conference (CIBEC'06), Cairo, Dec. 2005.
[11] R. Nakayama, Y. Uchiyama, K. Yamamoto, R. Watanabe, K. Namba, "Computer-aided diagnosis scheme using a filter bank for detection of microcalcification clusters in mammograms", IEEE Transactions on Biomedical Engineering, Vol. 53, No. 2, pp. 273-283, February 2006.
[12] I. Zyout, "Toward automated detection and diagnosis of mammographic microcalcifications", Doctoral dissertation, Dept. of Elect. & Comp. Eng., Western Michigan University, 2010.

In this work a radial basis function (RBF) is chosen as the inner product kernel, which is defined as:

$$K(x, y) = \exp\left(-\frac{\lVert x - y \rVert^2}{2\sigma^2}\right) \qquad (8)$$

where σ > 0 is a user-specified constant which defines the kernel width. In an RBF-kernel support vector machine, the number and values of the support vectors determine the number of kernels and their centers [15]. Using the RBF as the inner product kernel allows classification of data that is not linearly separable, and hence discrimination of the microcalcification texture features.

IV. EXPERIMENTAL RESULTS
For the development and evaluation of the proposed methodology, we used the Digital Database for Screening Mammography (DDSM) [4], a publicly available database of digitized screen-film mammograms. It contains 2620 cases acquired from Massachusetts General Hospital, Wake Forest University, and Washington University in St. Louis School of Medicine. The data comprise studies of patients from different ethnic and racial backgrounds. The DDSM contains descriptions of breast lesions in terms of the American College of Radiology's breast imaging lexicon, the Breast Imaging Reporting and Data System (BI-RADS) [4]. Mammograms in the DDSM database were digitized by different scanners depending on the institutional source of the data. We used cases (studies) of the DDSM from the volumes benign_2, benign_3, benign_5, cancer_6, cancer_7, and cancer_8. Data from 50 benign cases and 100 malignant cases, each containing calcifications, were analyzed. The software used for accessing the mammograms is available on the DDSM website.

Our methodology receives a digital mammogram and processes it through five stages: region of interest (ROI) identification, preprocessing, enhancement, feature extraction, and classification. Preprocessing extracts the breast contour and removes the pectoral muscle. Figure 3 shows the result of preprocessing. The extracted features are fed to the SVM classifier and the KNN classifier.
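The step from features to an RBF-kernel decision can be sketched as follows. The paper does not give formulas for the energy and entropy features, so this sketch assumes energy is the mean squared coefficient of a sub-band and entropy is the Shannon entropy of the normalized squared coefficients; the RBF kernel is assumed to have the Gaussian form exp(-||x - y||^2 / (2σ^2)). The support vectors, coefficients and bias in the usage example are hypothetical values chosen only to exercise the code, not trained parameters.

```python
import math


def energy(coeffs):
    """Mean squared coefficient of a 2-D sub-band (assumed definition)."""
    flat = [d for row in coeffs for d in row]
    return sum(d * d for d in flat) / len(flat)


def entropy(coeffs):
    """Shannon entropy (bits) of normalized squared coefficients (assumed)."""
    squares = [d * d for row in coeffs for d in row]
    total = sum(squares)
    if total == 0:
        return 0.0
    return -sum((s / total) * math.log2(s / total) for s in squares if s > 0)


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel with width sigma."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-dist2 / (2.0 * sigma ** 2))


def decision(x, support_vectors, labels, alphas, bias, sigma=1.0):
    """Return +1 or -1 from the sign of g(x) = sum(alpha*d*K(sv, x)) + b."""
    g = sum(a * d * rbf_kernel(sv, x, sigma)
            for sv, d, a in zip(support_vectors, labels, alphas)) + bias
    return 1 if g >= 0 else -1


# Usage with hypothetical values: one feature vector per sub-band.
subband = [[1.0, -1.0], [2.0, 0.0]]
features = [energy(subband), entropy(subband)]
svs = [[1.5, 1.25], [0.1, 0.2]]   # hypothetical support vectors
labels = [1, -1]                  # class indicators d_i
alphas = [1.0, 1.0]               # hypothetical coefficients alpha_i
print(features, decision(features, svs, labels, alphas, bias=0.0))
```

In practice the coefficients, bias and support vectors would come from solving the quadratic programming problem on the training set; the sketch only evaluates the resulting decision surface.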
Table 1, Table 2 and Table 3 give the accuracy obtained from the two classifiers with the energy feature, the entropy feature and the combination of both at decomposition levels 1 to 6. The classification accuracy is also plotted against the decomposition level.

[13] M. N. Gurcan, Y. Yardimci, A. E. Cetin and R. Ansari, "Automated Detection and Enhancement of Microcalcification on Digital Mammograms using Wavelet Transform Techniques", Dept. of Radiology, Univ. of Chicago, 1997.
[14] V. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[15] S. Chatterjee, "Classification of natural textures using Gaussian Markov random field models", in Markov Random Fields, Theory and Applications, R. Chellappa, A. Jain, Ed., Academic Press, pp. 159-177, 1993.

BIOGRAPHY
Suman Mishra obtained his Bachelor's degree in Electronics and Communication Engineering from the Institution of Engineers India. He then obtained his Master's degree in Applied Electronics from the University of Madras. He is pursuing his Ph.D. in the Department of Electronics Engineering at Sathyabama University, Chennai, in the area of Medical Image Processing. Currently, he is an Associate Professor in the Department of Electronics and Communication Engineering at Rajiv Gandhi College of Engineering, Chennai, India. His current research interest is Digital Image Processing applications in the field of Medical Imaging.

Dr. Hariharan Ranganathan completed his B.E. and M.Sc. (Engg.) at the College of Engineering, Guindy, Madras, India in 1975 and 1978, respectively, specializing in Electronics and Communication Engineering. He has experience in the computer industry in hardware and software, supporting major clients all over India. He started teaching in the year 2000 and taught in the Gulf for a little over a year. He completed his Ph.D. at Anna University, Chennai (Madras), India in 2007. He has authored many papers in refereed international journals and international conferences, and has been a reviewer for journal papers and Ph.D. theses. His interests include wireless communication networks, antennas, medical electronics, and artificial neural networks. Currently he is the Principal of Rajiv Gandhi College of Engineering, Sriperumbudur, India. He has been a member of various conferences.
