
International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)

Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.com Volume 2, Issue 5, September October 2013 ISSN 2278-6856

A Feature Combination Framework for Brain MRI Analysis


Huang Ying1, Liang Yihong1, Zhang Yong2, Luo Weishi2, Zhang Gang1, Huang Xiaobo1, Qian Dongxiang3, Lai Rigui3, Zhang Yang2, Hong Jiaming4

1 Faculty of Automation, Guangdong University of Technology, Guangzhou, 510006, China
2 Department of Neurosurgery, Guangdong No.2 Provincial People's Hospital, Guangzhou, 510317, China
3 Department of Neurosurgery, the Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510150, China
4 School of Medical Information Engineering, Guangzhou University of Chinese Medicine, Guangzhou, 510006, China

Corresponding authors: Zhang Yong; Zhang Gang.

Abstract: A good feature representation captures the key characteristics of an image. Determining a proper feature representation is a subjective problem that relies heavily on experience. We propose a machine learning framework that learns an optimal feature space from a set of basic feature representations for brain MRI analysis. It learns a weight vector that combines the different features through a large-margin-based learning procedure. Evaluation results on a synthetic and a real data set show that the method is effective.

Keywords: feature extraction, feature combination, large margin nearest neighbor learning, 2-dimensional wavelet transformation

1. INTRODUCTION
MRI is an important examination method for brain disease diagnosis. It generates an inner view of what happens in the patient's brain, which provides important clues for diagnosis [1]. Since brain diseases are common, a large number of brain MRI images are produced every day, which becomes a great burden for doctors and laboratory experts [2]. Intelligent methods based on artificial intelligence, data mining or machine learning have been introduced to analyze brain MRI images automatically. On the one hand, some studies attempt to construct a model that captures the knowledge of how to cut a brain MRI image into meaningful regions, such as white matter (WM), grey matter (GM), cerebrospinal fluid (CSF) and lesion regions with obvious characteristics [3]. On the other hand, researchers go a step further towards building a computer-aided diagnosis (CAD) system that gives diagnostic advice or a decision for a given brain MRI image. For the first direction, the focus is on the graphical discriminant function in some unknown space [4]. Supervised, semi-supervised or unsupervised learning methods can be used, and constructive results have been achieved in some subdomains of brain MRI image cutting. We consider the second direction to build on the success of the first, i.e. a brain MRI image can be processed only if it is properly cut and expressed as a feature vector [5].

Since cutting brain MRI images has been studied extensively in the literature, we focus in this paper on the feature extraction problem for brain MRI images. We consider it a key step of MRI image analysis, but to the best of our knowledge it has not been explored in depth. Though there is a large number of studies on image feature extraction [6], methods specific to brain MRI image feature extraction are seldom found in the literature. Figure 1 sketches these two ideas in brain MRI image analysis.

Figure 1 Two ideas for brain MRI image analysis

We propose a novel feature extraction method for brain MRI images. Our goal is to find an optimal strategy for combining several previously proposed features, where the optimal strategy is the one that gives a classifier (e.g. SVM) its best performance on a training data set under some feature expression. The motivation is that different feature representations lead to different performance for the same model: if the model and the training set are fixed, the feature expression controls the performance of the whole system. However, determining the best feature expression for a problem is not easy. It requires human expertise and experience, which is subjective and does not always yield a stable model. Hence we take another direction: we combine several existing features of a brain MRI image into a new feature representation and find a weight vector that gives the best result. Thus the problem can be formalized as Eq. (1):

$$w^{*} = \arg\min_{w} \mathcal{L}\big(l, D, g(w, f)\big) \tag{1}$$

where $\mathcal{L}$ is a function quantifying the loss of a learner $l$ on a training data set $D$, and $g(w, f)$ is a function computing the weighted combination of a set of feature representations $f$.

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 presents the main method, a large margin nearest neighbor approach to learning an optimal weight vector. Section 4 reports evaluation results on a synthetic and a real data set, and Section 5 concludes the paper.

2. RELATED WORK
Our work builds on brain MRI image analysis and large margin nearest neighbor learning. Below we review important related work in both areas.

For brain MRI image analysis, many efforts have been made to introduce intelligent methods in various domains. Loizou et al. [7] introduced multiscale amplitude-modulation frequency-modulation (AM-FM) texture analysis of multiple sclerosis (MS). They proposed a texture-analysis-based model to identify potential associations between lesion texture and disease progression. Texture-based features suit brain MRI images: medically speaking, lesion regions in a brain MRI image always have texture characteristics different from those of normal regions. However, texture alone is not sufficient, since visual borders also provide useful information for diagnosis. Harrison et al. [8] studied the influence of texture parameters in MRI. They reported an empirical study of the best discrimination in lymphoma MRI texture, and their results indicated that fine-tuned texture parameters can greatly improve model performance.

For large margin nearest neighbor learning, we review the work that forms the basis of this paper. Liu et al. [9] proposed a feature selection algorithm based on large margin subspace learning. They seek a projection matrix that maximizes the class margin of a given data set, making use of the information provided by the nearest samples of a given sample. The learned projection matrix can be used as a weighted Euclidean distance in subsequent learning problems. They also impose a 1-norm on the matrix to enforce sparsity of its rows. The idea of their method can be applied to our work: though they tackle feature selection, the projection matrix can be regarded as a special case of combination weights. One concern is that they do not deal with the normalization problem. Zhang and Liu [10] proposed a multicategory large-margin unified machine to address the choice between soft and hard classifiers. From a probabilistic perspective, a soft classifier, e.g. logistic regression, estimates the class conditional probability, while a hard classifier, e.g. support vector machine, directly finds the decision boundary. Multicategory classification makes the problem increasingly difficult, and constructing an effective model requires a deep understanding of it. They proposed an effective algorithm to solve the problem and achieved competitive performance.

3. THE METHOD
3.1 Problem definition
Let $D$ be a set of images with standard annotation terms, $D = \{(I_i, T_i) \mid i = 1, \dots, n\}$, where $I_i$ is a 2D brain MRI image in matrix representation and $T_i$ is a set of annotation terms indicating the characteristics of $I_i$. Let $\phi$ be a function mapping a given image to a d-dimensional feature vector, $\phi: I \mapsto \mathbb{R}^{d}$. Our goal is to find an optimal $\phi$ that maximizes the classification accuracy on $D$ for some fixed learner. In this work we use the standard support vector machine (SVM) implementation of [11] as the learner. To make the problem clearer, we impose some constraints on $\phi$. Motivated by the feature combination of skin biopsy image analysis [12], we confine $\phi$ to the linear combination form of Eq. (2):

$$\phi(I_i) = \sum_{k} w_k f_k(I_i) \tag{2}$$

where $f_k(I_i)$ stands for the k-th feature representation of image $I_i$ and $w_k$ is the corresponding weight for $f_k$. We also impose a normalization constraint on $w$, namely $w_k \ge 0$ and $\sum_{k} w_k = 1$, leading to a convex combination. Below we present how to determine the $f_k$ and how to find an optimal $w$.

3.2 Region-based feature extraction
In a brain MRI image, different local areas may be associated with different annotation terms. Hence it is meaningful to discriminate these local areas and extract features from them. Our previous work [13] indicated that visually disjoint borders may act as good separators for these areas, and we apply the same method here to generate visually disjoint local areas. Roughly speaking, we use a well-known cutting algorithm, Normalized Cut [14], to generate visually disjoint areas in an unsupervised manner. Note that the number of regions must be set before running the algorithm. To obtain a feature set with enough information, four methods are applied to extract features, as shown in Table 1.

Table 1: Four feature extraction methods
No.   Description                             Reference
F1    2D wavelet transformation as features   Zhang [13]
F2    9-level histogram                       Dou [6]
F3    Cluster with DWT coefficients           Chen [15]
F4    Bag of Features                         Caicedo [16]
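The convex combination of Eq. (2) can be illustrated with a minimal Python sketch. It assumes that the four feature groups of Table 1 have already been extracted for one region as vectors of equal length; the function name `combine_features`, the tolerance `tol` and the numeric values are hypothetical and serve only as illustration, not as the implementation used in the paper.

```python
from typing import List, Sequence


def combine_features(
    groups: Sequence[Sequence[float]],
    weights: Sequence[float],
    tol: float = 1e-9,
) -> List[float]:
    """Return sum_k weights[k] * groups[k], computed element by element."""
    if not groups or len(groups) != len(weights):
        raise ValueError("need exactly one weight per feature group")
    dim = len(groups[0])
    if any(len(g) != dim for g in groups):
        raise ValueError("all feature groups must have the same length")
    # Convex-combination constraints: every w_k >= 0 and sum_k w_k = 1.
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to 1")
    return [sum(w * g[j] for w, g in zip(weights, groups)) for j in range(dim)]


# Hypothetical feature groups F1..F4 for one region (made-up values,
# shortened to three elements to keep the example small).
f1 = [0.2, 0.4, 0.6]
f2 = [1.0, 0.0, 0.5]
f3 = [0.3, 0.3, 0.3]
f4 = [0.0, 1.0, 0.0]
combined = combine_features([f1, f2, f3, f4], [0.4, 0.3, 0.2, 0.1])
```

Because the weights are constrained to be non-negative and to sum to one, each element of the combined vector stays within the range spanned by the corresponding elements of the original groups, which is one reason all groups must share the same length and scale.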

For each generated region, we compute four feature groups according to Table 1. To make weighted combination possible, we set each feature group to be a 9-element vector. F1 and F2 are naturally 9-element vectors. For F3, we run a clustering algorithm to generate 9 clusters. For F4, we first generate a codebook and then apply binary coding over 9 segments to generate a real vector of 9 elements.

One notable problem remains. The method used to generate visually disjoint regions produces a natural cutting with irregular shapes, but most of our feature extraction methods can only process regular shapes (e.g. rectangles). To fill this gap, we apply a minimal pixel padding method that fills an irregular region out to a rectangle. Our previous work [13] indicated that, with proper processing, the filling pixels do not much affect the model. Figure 2 shows the pixel padding of irregular regions. Details of the padding method can be found in [13].

Figure 2 Pixel padding for irregular regions

3.3 Large margin feature combination
The main idea of large margin feature combination is to find an optimal distance function under which all samples with the same concept label are located near each other, while samples with different concept labels lie apart from each other. Figure 3 gives an example of this idea.

Figure 3 An example of large margin nearest neighbors learning

To formalize this, we follow the idea of [17] and seek an optimal distance function that best obeys the principle of k nearest neighbors. We first write down the target function as Eq. (3):

(3)

According to Eq. (2), the distance function depends on the weight vector w. Thus we can express the cost function as Eq. (4):

(4)

where the indicator marks whether two samples are neighbors, and a trade-off parameter controls the balance between accuracy and model complexity. The remaining function is a regularization term that evaluates the model complexity; in this work we apply a hinge loss function in it. We should note that the first term is a penalty term that only punishes neighbor pairs with different labels.

The above problem can be solved as a standard semidefinite program (SDP). An SDP is a linear program with the additional constraint that a matrix whose elements are linear in the unknown variables must be positive semidefinite (PSD). To see this, we rewrite Eq. (3) in matrix form:

(4)

With this substitution, Eq. (3) and (4) become an SDP with the following constraints:

(5)
(6)

It can be solved with a standard SDP solver. As in the optimization procedure of SVM, a soft margin can be added to the model to tolerate some mistakes in the kNN procedure. Hence a set of slack variables can be introduced into Eq. (5), as shown in Eq. (7):

(7)

One additional constraint must be added to the constraint set:

(1)

Once the PSD matrix is obtained, we use a decomposition to find the optimal weight vector, which gives the optimal feature combination. The method requires solving an SDP, whose time complexity depends on the number of training samples. When running the algorithm, we keep the training data set relatively small so that the algorithm runs efficiently.

4. EVALUATION
The proposed algorithm is evaluated on a synthetic data set and a real clinical brain MRI data set. The synthetic data set is generated by a web-based program [18] that captures the essence of brain MRI images. It is useful for evaluating a model because it can generate images with different noise levels, clarity, numbers of lesion regions and views of the brain. Table 2 shows some details of our data sets. We change the parameters to obtain data sets with different settings. In Table 2, the columns L, C, N, R and S stand for the number of lesion regions, clarity, noise level, image rotation and data set size, respectively.

Table 2: Details of the synthetic data set
Name   L   C      N   R   S
S1     3   M      2   N   100
S2     2   High   2   Y   120
S3     2   High   1   N   120
S4     1   Low    3   N   100

It should be noted that the quality of the real data set is better than that of the synthetic data set. This is because the real data set is obtained through careful examination by MRI operators. Table 3 shows the details of the real clinical data set used for our evaluation.

Table 3: Details of the real clinical data set
Name   Description      Size
R1     Normal Subject   112
R2     Tumor 1 + 2      141
R3     20 Normals       98
R4     T1Weighted-      102

The real data set is downloaded directly from the IBSR project [19]. The size of each subset is similar to that of the synthetic data sets. For the main learner, we directly use the SVM implementation from the LibSVM project [11] with default parameter settings. According to Table 1, we first compute four groups of features and then use the method proposed in subsection 3.3 to find optimal combination weights for them. The result is also a 9-element feature vector. The learned feature vectors are then fed to the SVM model for training and testing. Performance is evaluated by the classification accuracy of normal-region and lesion-region detection, which we record separately. The performance of the four original features is also reported for comparison. The ratio between training and testing data is 5 : 5. Table 4 shows the accuracy on our evaluation data sets.

Table 4: Evaluation results (classification accuracy)
Name   L       N       Name   L       N
S1     78.4%   90.3%   R1     88.4%   87.6%
S2     81.2%   89.6%   R2     85.9%   86.3%
S3     83.1%   93.5%   R3     88.1%   84.2%
S4     79.0%   91.7%   R4     86.4%   89.5%

It is interesting to find that, for the synthetic data sets, the detection accuracy of lesion regions is lower than that of normal regions. However, when it comes to normal-region detection, the case is totally different. We attribute this phenomenon to the uncertainty manually added to the synthetic data sets; the clinical data set is carefully prepared for diagnosis, so its quality is better. We also report comparison results between the proposed method and the four original feature representations. Figures 4 and 5 show the comparison results for lesion-region detection on the synthetic data sets and the real data sets.

Figure 4 Comparison results on the synthetic data sets

Figure 5 Comparison results on the real data sets

In both Figures 4 and 5, F1, F2, F3 and F4 are the same as in Table 1, and Comb stands for the proposed feature combination method. Both figures show that the proposed method learns a feature that achieves the best performance against all original features on both the synthetic and the real data sets.
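The evaluation protocol of this section (a 5 : 5 split between training and testing data, with accuracy recorded separately for lesion and normal regions) can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `split_half`, `per_class_accuracy` and `train_nearest_centroid` are hypothetical, the labels "lesion" and "normal" are chosen for readability, and the nearest-centroid learner is a simple stand-in so that the sketch stays self-contained. It is not the LibSVM classifier used in the paper; any learner that returns a prediction function could be plugged in instead.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]          # (feature vector, label)
Predictor = Callable[[Sequence[float]], str]  # feature vector -> label


def split_half(samples: Sequence[Sample], seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle the samples and split them 5 : 5 into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def per_class_accuracy(predict: Predictor, test: Sequence[Sample]) -> Dict[str, float]:
    """Accuracy computed separately for each label present in the test set."""
    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for features, label in test:
        total[label] = total.get(label, 0) + 1
        if predict(features) == label:
            correct[label] = correct.get(label, 0) + 1
    return {label: correct.get(label, 0) / n for label, n in total.items()}


def train_nearest_centroid(train: Sequence[Sample]) -> Predictor:
    """Stand-in learner (NOT the SVM of the paper): predict the closest class mean."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

    def predict(features: Sequence[float]) -> str:
        return min(
            centroids,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(features, centroids[label])),
        )

    return predict
```

Reporting accuracy per class rather than as a single number matches the separate L and N columns of Table 4 and keeps a large class from hiding poor performance on a small one.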

5. CONCLUSION
We have proposed a feature combination framework for brain MRI image analysis. Unlike previous methods, we do not propose a new feature extraction algorithm specific to some application. Instead, we apply a supervised machine learning framework to find an optimal combination of existing feature extraction methods. The proposed framework can provide a suitable feature expression when given a training data set. Future work includes feature fusion for different sources of data. Note that the proposed method requires features to be homogeneous, which greatly limits its application.

Acknowledgment
This work is supported by the Science and Technology Planning Project of Haizhu District, Guangzhou (2011YL-05), the 2012 College Student Career and Innovation Training Plan Project (1184512043), and the 2011 Higher Education Research Fund of GDUT (2013Y04).

References
[1] Krit Somkantha et al. Boundary Detection in Medical Images Using Edge Following Algorithm Based on Intensity Gradient and Texture Gradient Features. IEEE Transactions on Biomedical Engineering, 2011, vol. 58, no. 3.
[2] Shapiro, Linda G. and Stockman, George C. Computer Vision. Prentice Hall, 2002. ISBN 0-13-030796-3.
[3] Li, N., M. Liu and Y. Li. Image segmentation algorithm using watershed transform and level set method. In Proc. ICASSP 2007, IEEE International Conference on Acoustics, Speech and Signal Processing, 2007, 1: 613-616.
[4] Tisdall, D. and M.S. Atkins. MRI Denoising via Phase Error Estimation. In Proc. SPIE, Medical Imaging 2005: Image Processing, 2005, 5747: 646-654.
[5] Robb, R.A. Biomedical Imaging, Visualization and Analysis. Wiley-Liss, USA, 2000.
[6] Weibei Dou, Yuan Ren, Yanping Chen, et al. Histogram-Based generation method of membership function for extracting features of brain tissues on MRI images. In Proceedings of the Second International Conference on Fuzzy Systems and Knowledge Discovery, 2005, Vol. Part I. Springer-Verlag, Berlin, Heidelberg, 189-194.
[7] C. P. Loizou, V. Murray, M. M. Pattichis, et al. Multiscale Amplitude-Modulation Frequency-Modulation (AM-FM) Texture Analysis of Multiple Sclerosis in Brain MRI Images. Trans. Info. Tech. Biomed. 15, 1, 2011, 119-129.
[8] L. Harrison, P. Dastidar, H. Eskola, et al. Texture analysis on MRI images of non-Hodgkin lymphoma. Comput. Biol. Med. 2008, 38, 4, 519-524.
[9] Bo Liu, Bin Fang, Xinwang Liu, Jie Chen, Zhenghong Huang, and Xiping He. Large Margin Subspace Learning for feature selection. Pattern Recogn. 46, 10, 2013, 2798-2806.
[10] Chong Zhang and Yufeng Liu. Multicategory large-margin unified machines. J. Mach. Learn. Res. 14, 1, 2013, 1349-1386.
[11] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 3, Article 27, 2011, 27 pages.
[12] Bunte, K.; Biehl, M.; Jonkman, M. F. & Petkov, N. Learning effective color features for content based image retrieval in dermatology. Pattern Recogn., 2011, 44, 1892-1902.
[13] Zhang, G.; Shu, X.; Liang, Z.; Liang, Y.; Chen, S. & Yin, J. Multi-instance learning for skin biopsy image features recognition. Proceedings - 2012 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2012, 2012, 83-88.
[14] Shi, J. & Malik, J. Normalized Cuts and Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell., IEEE Computer Society, 2000, 22, 888-905.
[15] Chen, Y. & Wang, J. Z. Image Categorization by Learning and Reasoning with Regions. J. Mach. Learn. Res., 2004, 5, 913-939.
[16] Caicedo, J. C.; Cruz-Roa, A. & González, F. A. Histopathology Image Classification Using Bag of Features and Kernel Functions. In Combi, C.; Shahar, Y. & Abu-Hanna, A. (Eds.), AIME, 2009, 5651, 126-135.
[17] Kilian Q. Weinberger and Lawrence K. Saul. Distance Metric Learning for Large Margin Nearest Neighbor Classification. J. Mach. Learn. Res. 10, 2009, 207-244.
[18] http://brainweb.bic.mni.mcgill.ca/brainweb/
[19] IBSR Project: http://www.nitrc.org/projects/ibsr

AUTHOR
Huang Ying is an MD of the Faculty of Automation, Guangdong University of Technology. Her research interests include intelligent information processing and computer vision.

Zhang Yong, M.D., is director of the Neurosurgery department of Guangdong No.2 Provincial People's Hospital and a top expert in China in the field of cranial nerve diseases.

Gang Zhang is a PhD candidate in the School of Information Science and Technology at Sun Yat-sen University, China. He received his MSc degree in Computer Software and Theory from Sun Yat-sen University, China, in 2005. His current research interests include data mining, machine learning, and their applications to bioinformatics and Traditional Chinese Medicine. He is now a lecturer in the School of Automation, Guangdong University of Technology.

Luo Weishi is an attending doctor of neurosurgery and Master of neurosurgery, specializing in the diagnosis and treatment of cranial nerve diseases.

Qian Dongxiang, PhD, MD, is with the Third Affiliated Hospital of Guangzhou Medical University. As director of the Neurosurgery department, he works concurrently in clinical practice, medical education and medical research. His research direction is injury and repair of the central nervous system. For his dedication to academic work he has received many honors in neuroscience academia, and he has been entrusted with various social duties, such as associate director of the Neurosurgery branch of the Guangzhou Institute of Medicine, the national commission of the Society for Neuroscience of China, and the commission of the Neurosurgeon branch of the Guangdong Physicians Society.

Lai Rigui is a master's candidate at Guangzhou Medical University. His research direction is injury and repair of the central nervous system.

Zhang Yang is a neurosurgery resident specializing in intraoperative neural electrophysiological monitoring.
