
Automated annotation of X-ray images

A THESIS

submitted by

SUMATHI GANESAN

for the award of the degree


of
DOCTOR OF PHILOSOPHY

[Annamalai University logo]

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

ANNAMALAI UNIVERSITY, ANNAMALAINAGAR


March 2016
To my parents, family
and well-wishers
Dr. V. Srinivasan, B.E., M.E., M.Tech. Annamalainagar - 608002
Professor of Computer Science and Engineering Tamilnadu, India
Annamalai University Date: 08-02-2016

CERTIFICATE

This is to certify that the thesis entitled Automated annotation of medical
X-ray images is a bonafide record of the work done by Ms. Sumathi Ganesan,
Research Scholar, Department of Computer Science and Engineering, under my
guidance during the period 2011 - 2016, and that this thesis has not previously
formed the basis for the award of any degree, diploma, associateship, fellowship
or other similar title to the candidate.

This is also to certify that the thesis represents the independent work of the
candidate.

Signature of the Research Guide

ACKNOWLEDGEMENTS

The completion of this doctoral dissertation was possible only with the support of
several people. I would like to express my sincere gratitude to all of them. First and
foremost, I am extremely grateful to my research guide, Dr. T.S. Subashini, Associate
Professor of Computer Science and Engineering, for the valuable guidance, scholarly inputs
and consistent encouragement I received throughout my research work. This exercise was
possible only because of the unconditional support provided by her. With an amicable and
positive disposition, she has always made herself available to clarify my doubts despite her
busy schedule, and I consider it a great opportunity to do my doctoral programme under
her guidance and to learn from her research expertise. It is my immense pleasure to thank
her for all her help and support.
I owe a deep sense of gratitude to Dr. S. Manian, Vice Chancellor, Annamalai Univer-
sity and Dr. K. Arumugam, Registrar, Annamalai University for giving me an opportunity
to pursue Ph.D in this University.
I am grateful to Dr. C. Antony Jeyasehar, Dean, Faculty of Engineering and Tech-
nology for the excellent research environment he has created to learn and pursue research
work. I thank Prof. V. Srinivasan, Head of the Department of Computer Science and
Engineering for his constant encouragement.
I am also grateful to Dr. V. Ramalingam, Professor, Department of Computer Science
and Engineering, whose constant motivation and encouragement urged me to complete this
research work.
I especially wish to express my sincere gratitude and respect to Dr. S. Palanivel,
and Dr. M. Kalaiselvi Geetha, Professors of Computer Science and Engineering whose
encouragement and support at various levels enabled me to develop a passion for research.
I am also thankful to Dr. A.D. Sampath Kumar, Department of Orthopaedics, Government
Medical College, Salem, Dr. N. Amudhavalli, and Dr. P. Gunasekar, Professors of
Radiology, Raja Muthiah Medical College Hospital, Annamalainagar for their valuable help
and comments in carrying out this work.
All my fellow research scholars inspired me in research through their interactions. I
acknowledge my sincere thanks to them for the precious moments we shared. In particular,
I would like to thank Mr. S. Nagarajan, Assistant Professor of Computer Science and

Engineering and Mrs. K. Vaidehi and Mr. G.N. Balaji, Research Scholars for spending
their precious time with me in discussing technical and other research matters.
My deepest gratitude goes to my family for their unflagging love and support. Especially,
I am indebted to my father Mr. A. Ganesan, who sacrificed many things to provide the best
possible environment for my study. I would like to extend my sincere gratitude to and feel
proud of my sister Dr. G. Shanthi, Professor, Raja Muthaiah Medical College, and my
uncle Dr. S. Selvakumar, who has always been one of my well-wishers, and I thank them
for being the inspiration in completing my doctoral research. I would like to extend my
wholehearted thanks to my children S. Servesh and S. Sriversh for willingly sacrificing
some of their childhood pleasures to help me complete my research work.
I would like to extend my sincere thanks to Dr. S. Raja Soma Sekar, Assistant Professor,
Department of Electrical Engineering, and Mr. E. Pavendhan and Mr. S. Sudhakar, Post
Graduate students, Annamalai University, for their valuable help in collecting X-rays.
Further, I am also extremely thankful to all my friends and relations who extended their
moral support and help in the successful completion of this work.
Finally, and most importantly, I would like to thank the almighty God for his kind grace
in completing this research successfully without any obstacles.

Sumathi Ganesan

ABSTRACT
KEYWORDS: Pre-processing, Segmentation, Classification, Orientation detection,
Abnormality detection and Automated annotation.

The research advancements in the field of image processing enable us to quantitatively
analyze and visualize all modalities of medical images such as X-ray, computerized
tomography (CT), magnetic resonance imaging (MRI) and ultrasonic images. Of these
four modalities, X-ray diagnosis is the one commonly used unless the abnormalities are
complicated, in which case CT, MRI or ultrasonic images may be needed for further
diagnosis and surgery planning.
This research utilizes medical images taken from the image retrieval in medical
applications (IRMA) database of the Department of Diagnostic Radiology, Aachen
University of Technology (RWTH), Aachen, Germany. Besides, images were also obtained
from the Raja Muthaiah Medical College and Hospital, Annamalai University, and a few
from the Government Medical College and Hospital, Salem, India, for substantiation and
implementation of the proposed algorithms. The images are collected in such a way that
they exhibit high intra-class variability and inter-class similarity.
In this research, six different classes of X-ray images are taken, namely chest, skull,
palm, neck, spine and foot. The proposed annotation process involves pre-processing
of X-rays to make them fit for further processing. It is followed by segmentation,
classification, orientation detection, abnormality detection and automated annotation.
Of the three major steps in the proposed annotation process, classification is carried
out using a combination of shape and texture features, while orientation detection and
abnormality detection are carried out using wavelet coefficients.
As medical X-ray images are gray scale images with similar texture characteristics,
an attempt to categorize them using texture features alone will not yield good results.
Hence, the texture features are combined with shape features extracted using Zernike
moments, and three different classifiers are employed, namely back propagation neural
network (BPNN), probabilistic neural network (PNN) and support vector machine (SVM).

Out of the three classifiers considered for the study, the SVM classifier outperformed
BPNN and PNN in classifying the six different classes of X-ray images.
For detecting the orientation of the X-ray images, both model based and template
based approaches are followed, using the SVM classifier with DWT features, the Harris
corner algorithm and the SURF algorithm. The best results were obtained with the SVM
classifier and DWT features. For an automated diagnosis of abnormalities, classifiers
like decision tree, extreme learning machine (ELM) and SVM are employed. The results
show that SVM detects the abnormalities present in the X-ray images best.
Finally, a 21-bit annotation code is generated incorporating various information such
as the patient ID, age, gender, X-ray class and X-ray view, along with the information
as to whether the given X-ray image is a normal or an abnormal one.

TABLE OF CONTENTS

Thesis Certificate i
Acknowledgements ii
Abstract iv

List of Tables x

List of Figures xiii


Abbreviations xviii
Notations xx

1 Introduction 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 X-rays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Principle of X-rays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Types of X-ray Images . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4.1 Views of X-ray Images . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Need for Annotation of X-ray Images . . . . . . . . . . . . . . . . . . . 9
1.6 Digital Image Representation . . . . . . . . . . . . . . . . . . . . . . . 10
1.7 Computer Aided System for Detection and Analysis of X-ray Images . 10
1.7.1 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.7.2 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.7.3 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.7.4 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.7.5 Orientation Detection . . . . . . . . . . . . . . . . . . . . . . . 13
1.7.6 Abnormality Detection . . . . . . . . . . . . . . . . . . . . . . 13
1.7.7 Annotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.8 Data Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.9 Objectives of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.10 Organization of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . 14

2 Review of Literature 16
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.6 Orientation Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.7 Abnormality Detection . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.8 Annotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.9 Techniques used in the Proposed Work . . . . . . . . . . . . . . . . . . 32
2.10 Segmentation Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.10.1 Connected Component Labeling (CCL) . . . . . . . . . . . . . . 32
2.10.2 Expectation Maximization (EM) . . . . . . . . . . . . . . . . . 33
2.11 Feature Extraction Techniques . . . . . . . . . . . . . . . . . . . . . . . 35
2.11.1 Gray Level Co-Occurrence Matrix (GLCM) . . . . . . . . . . . 35
2.11.2 Zernike Moment (ZM) . . . . . . . . . . . . . . . . . . . . . . . 36
2.11.3 Discrete Wavelet Transform (DWT) . . . . . . . . . . . . . . . . 39
2.12 Modeling Techniques used in the Proposed Work . . . . . . . . . . . . 43
2.12.1 Back Propagation Neural Network (BPNN) . . . . . . . . . . . 43
2.12.2 Probabilistic Neural Network (PNN) . . . . . . . . . . . . . . . 46
2.12.3 Support Vector Machine (SVM) . . . . . . . . . . . . . . . . . . 47
2.12.4 Speeded up Robust Features(SURF) . . . . . . . . . . . . . . . 48
2.13 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

3 Classification of X-ray images using texture and shape features 50


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.2 Proposed Methodology for Classification of X-ray Images . . . . . . . . 51

3.3 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.4 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.5 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.6 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.7 Performance Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.8 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

4 Orientation Detection 76
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.2 Proposed Methodology for Orientation Detection . . . . . . . . . . . . 77
4.3 Model based Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3.1 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.3.2 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.3.3 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.3.4 Model based SVM . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.4 Harris Corner Detection Algorithm . . . . . . . . . . . . . . . . . . . . 85
4.5 Template based Classification using Speeded Up Robust Features (SURF) 91
4.5.1 SURF Algorithm for Orientation Detection . . . . . . . . . . . 91
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

5 Abnormality Detection 100


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.2 Proposed methodology for Abnormalities Detection . . . . . . . . . . . 102
5.3 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.4 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.5 Feature extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.6 Proposed Modeling Techniques for Abnormality Detection . . . . . . . 105
5.6.1 Decision tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.6.2 ELM Classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.6.3 SVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

6 Annotation 115
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.2 Proposed Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.2.1 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . 118
6.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

7 Summary and Conclusion 122


7.1 Summary of the Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7.2 Major Contributions of the Work . . . . . . . . . . . . . . . . . . . . . 123
7.3 Directions for Future Research . . . . . . . . . . . . . . . . . . . . . . . 124

Bibliography 125
List of Publications 132

List of Tables

2.1 Summary of GLCM features . . . . . . . . . . . . . . . . . . . . . . . . 36


2.2 List of Zernike polynomials up to 4th order. . . . . . . . . . . . . . . 38

3.1 Confusion matrix of SVM with GLCM features for X-ray classification 58
3.2 Performance of SVM with GLCM features for X-ray classification . . . 58
3.3 Confusion matrix of SVM with ZM features for X-ray classification . . 59
3.4 Performance of SVM with ZM features for X-ray classification . . . . . 59
3.5 Confusion matrix of SVM with GLCM and ZM features for X-ray clas-
sification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.6 Performance of SVM with GLCM and ZM features for X-ray classification 60
3.7 Overall performance of SVM in classifying X-ray images with different
sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.8 Confusion matrix of BPNN with GLCM features for X-ray classification 63
3.9 Performance of BPNN with GLCM features for X-ray classification . . 64
3.10 Confusion matrix of BPNN with ZM features for X-ray classification . . 64
3.11 Performance of BPNN with ZM features for X-ray classification . . . . 65
3.12 Confusion matrix of BPNN with GLCM and ZM features for X-ray
classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.13 Performance of BPNN with GLCM and ZM features for X-ray classifi-
cation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.14 Overall performance of BPNN in classifying X-ray images with different
set of features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.15 Confusion matrix of PNN with GLCM features for X-ray classification . 69
3.16 Performance of PNN with GLCM features for X-ray classification . . . 69
3.17 Confusion matrix of PNN with ZM features for X-ray classification . . 70
3.18 Performance of PNN with ZM features for X-ray classification . . . . . 70
3.19 Confusion matrix of PNN with GLCM and ZM features for X-ray clas-
sification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.20 Performance of PNN with GLCM and ZM features for X-ray classification 71
3.21 Overall performance of PNN classifier in classifying X-ray images with
different sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.22 Overall performance measures of BPNN, PNN and SVM classifiers . . 74

4.1 Confusion matrix of SVM with DWT features for the detection of the
X-ray view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.2 Performance of SVM with DWT features for the detection of the X-ray
view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.3 Confusion matrix for orientation detection using Harris corner algorithm 89
4.4 Performance measures for orientation detection using Harris corner al-
gorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.5 Confusion matrix for orientation detection using SURF algorithm . . . 95
4.6 Performance measures of SURF algorithm in detecting the orientation
of X-rays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.7 The overall performance measures of SVM classifier, Harris corner and
SURF algorithm in detecting the X-ray views. . . . . . . . . . . . . . . 98

5.1 Confusion matrix for abnormality detection using decision tree with
DWT features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.2 Performance measures of decision tree in detecting abnormality in X-rays 107
5.3 Confusion matrix for abnormality detection using ELM with DWT fea-
tures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.4 Performance measures of ELM with DWT features in detecting abnor-
mality in X-rays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.5 Confusion matrix of using SVM with DWT features in detecting the
abnormality. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

5.6 Performance measures of SVM with DWT features in detecting abnor-
mality in X-rays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.7 Overall performance of decision tree, ELM and SVM classifier with
DWT features for detecting the abnormality in X-rays. . . . . . . . . . 113

List of Figures

1.1 Principle of X-ray machine. . . . . . . . . . . . . . . . . . . . . . . . . 3


1.2 Modern X-ray machine. . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Different classes of X-ray. . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 (a) AP positioning and (b) AP view. . . . . . . . . . . . . . . . . . . . 6
1.5 (a) PA positioning and (b) PA view. . . . . . . . . . . . . . . . . . . . 6
1.6 (a) Left positioning and (b) Left lateral view. . . . . . . . . . . . . . . 7
1.7 (a) Right positioning and (b) Right lateral view. . . . . . . . . . . . . . 7
1.8 (a) Right oblique positioning and (b) Right oblique view. . . . . . . . . 8
1.9 (a) Left oblique positioning and (b) Left oblique view. . . . . . . . . . . 8
1.10 Computer aided system for annotation of X-ray images. . . . . . . . . . 11

2.1 DWT breakdown of the signal. . . . . . . . . . . . . . . . . . . . . . . . 39


2.2 Filterbank representation of DWT dilations. . . . . . . . . . . . . . . . 42
2.3 A computational neural model. . . . . . . . . . . . . . . . . . . . . . . 43
2.4 A feed forward back propagation neural network. . . . . . . . . . . . . 44
2.5 Architecture of PNN. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.6 Maximum margin hyperplane and support vectors. . . . . . . . . . . . 48

3.1 Block diagram of the proposed methodology for classification of X-ray


images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 M3 filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.3 Original X-ray images (a) Chest (b) Spine (c) Palm, (d) Foot (e) Neck
and (f) Skull . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.4 Pre-processed X-ray images using M3 filter (a) Chest (b) Spine (c) Palm
(d) Foot (e) Neck and (f) Skull . . . . . . . . . . . . . . . . . . . . . . 53
3.5 Segmented X-ray images for six different classes of X-rays. . . . . . . . 54
3.6 A comparison of the accuracy of SVM in classifying X-ray images with
different sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.7 A comparison of the sensitivity of SVM in classifying X-ray images with
different sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.8 A comparison of the specificity of SVM in classifying X-ray images with
different sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.9 A comparison of the accuracy of BPNN in classifying X-ray images with
different sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.10 A comparison of the sensitivity of BPNN in classifying X-ray images
with different sets of features . . . . . . . . . . . . . . . . . . . . . . . . 67
3.11 A comparison of the specificity of BPNN in classifying X-ray images
with different sets of features . . . . . . . . . . . . . . . . . . . . . . . . 68
3.12 A comparison of the accuracy of PNN in classifying X-ray images with
different sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.13 A comparison of the sensitivity of PNN in classifying X-ray images with
different sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.14 A comparison of the specificity of PNN in classifying X-ray images with
different sets of features . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.15 Comparison of accuracy of the three classifiers with the combination of
Zernike moments and GLCM features . . . . . . . . . . . . . . . . . . . 75

4.1 The architecture of the proposed methodology for orientation detection. 77


4.2 The proposed methodology for Orientation Detection using SVM classifier. 78
4.3 Pre-processed X-ray images (a) Chest AP view, (b) Chest lateral view,
(c) Foot AP view (d) Foot oblique view (e) Neck AP view (f) Neck
lateral view (g) Palm AP view (h) Palm oblique view (i) Skull AP view
(j) Skull lateral view (k) Spine AP view and (l) Spine lateral view. . . 79
4.4 Segmented X-ray images (a) Chest AP view, (b) Chest lateral view, (c)
Foot AP view (d) Foot oblique view (e) Neck AP view (f) Neck lateral
view (g) Palm AP view (h) Palm oblique view (i) Skull AP view (j)
Skull lateral view (k) Spine AP view and (l) Spine lateral view. . . . . 81

4.5 Class-wise accuracy of SVM in detecting the orientation of the X-ray
images using DWT features. . . . . . . . . . . . . . . . . . . . . . . . 84
4.6 Class-wise sensitivity of SVM in detecting the orientation of the X-ray
images using DWT features. . . . . . . . . . . . . . . . . . . . . . . . 84
4.7 Class-wise specificity of SVM in detecting orientation of the X-ray im-
ages using DWT features. . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.8 The block diagram of the Harris corner detection. . . . . . . . . . . . . 86
4.9 The feature point detected using Harris corner algorithm for chest, skull
and neck X-rays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.10 The result of the sample view of skull X-ray image for orientation de-
tection using Harris corner detector. . . . . . . . . . . . . . . . . . . . . 88
4.11 The result of the sample view of neck X-ray image for orientation de-
tection using Harris corner detector. . . . . . . . . . . . . . . . . . . . . 88
4.12 Class-wise accuracy of Harris corner algorithm in detecting X-ray view 90
4.13 Class-wise sensitivity of Harris corner algorithm in detecting X-ray view 90
4.14 Class-wise specificity of Harris corner algorithm in detecting X-ray view 91
4.15 Overall block diagram of the proposed method for orientation detection
using SURF algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.16 The result of the sample view of three different X-rays namely chest,
skull and palm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.17 The GUI showing the view of skull X-ray image. . . . . . . . . . . . . . 94
4.18 The GUI showing view of palm X-ray image. . . . . . . . . . . . . . . . 94
4.19 Class-wise accuracy of SURF algorithm in detecting the X-ray views. . 96
4.20 Class-wise sensitivity of SURF algorithm in detecting the X-ray views. 96
4.21 Class-wise specificity of SURF algorithm in detecting the X-ray views. . 97
4.22 Graph showing overall accuracy of SVM classifier, Harris corner algo-
rithm and SURF in detecting the X-ray views. . . . . . . . . . . . . . . 98

5.1 Six different classes of X-ray images namely (a) Normal chest image,
(b) Abnormal chest image, (c) Normal skull image, (d) Abnormal skull
image, (e) Normal palm image, (f) Abnormal palm image, (g) Normal
foot image, (h) Abnormal foot image. . . . . . . . . . . . . . . . . . . . 101
5.2 Block diagram of the proposed methodology for abnormalities detection. 102
5.3 Pre-processed X-ray images (a) Normal chest image, (b) Abnormal chest
image, (c) Normal skull image, (d) Abnormal skull image, (e) Normal
palm image, (f) Abnormal palm image, (g) Normal foot image, (h)
Abnormal foot image. . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.4 Segmented X-ray images (a) Normal chest image, (b) Abnormal chest
image, (c) Normal skull image, (d) Abnormal skull image, (e) Normal
palm image, (f) Abnormal palm image, (g) Normal foot image, (h)
Abnormal foot image. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.5 The structure of the decision tree . . . . . . . . . . . . . . . . . . . . . 106
5.6 Class-wise performance of decision tree classifier in detecting abnormal-
ity in X-rays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.7 Class-wise performance of ELM classifier in detecting abnormality in
X-rays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.8 Class-wise performance of SVM classifier in detecting abnormality in
X-rays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.9 A comparison of the classification performances of decision tree, ELM
and SVM with DWT features for detecting the abnormality. . . . . . . 113

6.1 GUI model for annotation of medical X-ray images. . . . . . . . . . . . 115


6.2 Distribution of generated code. . . . . . . . . . . . . . . . . . . . . . . 116
6.3 GUI model for proposed method of automated annotation of X-ray images. 117
6.4 The automated code generated for the sample chest anterior-posterior
normal view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.5 The automated code generated for the sample chest lateral normal view. 118
6.6 The automated code generated for the sample skull lateral abnormal
view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

6.7 The automated code generated for the sample skull lateral normal view. 119
6.8 The automated code generated for the sample palm oblique normal view. 120
6.9 The automated code generated for the sample palm anterior-posterior
normal view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

ABBREVIATIONS
AANN – Auto associative Neural Network
AHE – Adaptive Histogram Equalization
AIA – Automatic Image Annotation
ANN – Artificial Neural Network
AP – Anterior-Posterior View
BPNN – Back Propagation Neural Network
BOW – Bag of Words
CAD – Computer Aided Detection/Diagnosis
CBIR – Content Based Image Retrieval
CCL – Connected Component Labeling
CT – Computed Tomography
DWT – Discrete Wavelet Transform
ELM – Extreme Learning Machine
EM – Expectation Maximization
FD – Fourier Descriptor
GCS-LBP – Gabor-based Centre Symmetric-Local Binary Patterns
GLCM – Gray Level Co-occurrence Matrix
GO – Gabor Orientation
HRST – Hybrid and Respective Smoothing-Sharpening
IGD – Intensity Gradient Direction
IM – Invariant Moments
k-NN – k-Nearest Neighbor
LBP – Local Binary Pattern
LSA – Latent Semantic Analysis
MLP – Multi Layer Perceptron
MRF – Markov Random Field
MRI – Magnetic Resonance Imaging
NB – Naive Bayes
PA – Posterior-Anterior View

PLSA – Probabilistic Latent Semantic Analysis
PNN – Probabilistic Neural Network
RBF – Radial Basis Function
RBFNN – Radial Basis Function Neural Network
ROI – Region of Interest
RWTH – Rheinisch-Westfälische Technische Hochschule (Aachen University of Technology)
SIFT – Scale Invariant Feature Transform
SURF – Speeded up Robust Features
SVM – Support Vector Machine
TRUS – Transrectal Ultrasound
ZM – Zernike Moments

NOTATION
Lower case boldface letters are used to denote vectors and upper case boldface letters
to denote matrices. In addition, the following convention is used throughout the thesis:

English Symbols
I, P – gray level image
x, y – feature vector
H, V, D – horizontal, vertical and diagonal detail coefficient vector
d – direction of the pixels
P(i, j) – gray value of the pixel at coordinate position (i, j) of the original image
Zn,m – Zernike moment of order n and repetition m
(x, y), (i, j) – pair of coordinates
P (.) – probability of likelihood score
F – activation function
g[n], h[n] – impulse response of the filter
p, q, r – consequent parameters
e – squared error
W – weight matrix
H(X, σ) – Hessian matrix
R – threshold value
H – eigenvalue
X, Y – input derivatives of image
R – root node
H – hidden node
C – class/Chance node
Z – output node
T – terminal node
tp, tn, fp, fn – true positive, true negative, false positive and false negative counts respectively
k – empirical constant

Greek Symbols

µ – mean vector
Σ – covariance matrix
µ – mean gray value
σ – standard deviation of gray value
∑ – summation
φ – scaling function
α – learning rate
Θ – angle

Chapter 1

Introduction

1.1 Introduction
The need for medical imaging technologies in the field of clinical sciences and surgery
has been inevitable over the past couple of decades, as the science of imaging has
undergone major advancements and a paradigm shift. An imaging technique that has the
ability to capture, conceive and correlate information about the human body is the need
of the hour. Keeping these objectives in view, researchers in the field of imaging
sciences have come up over the years with a variety of state-of-the-art imaging
modalities like X-rays, ultrasound scan, computerized tomography (CT) scan, magnetic
resonance imaging (MRI) scan etc., which have not only shortened the time taken for
diagnosis, but have also helped in extending the longevity of patients. Of all of them,
X-rays are reliable, simple, cheap and versatile, and quite handy for the early,
painless diagnosis of almost any disease. Their ease of availability, short acquisition
time, non-invasive nature and modest requirement of expertise make them the best in
class for the early detection and diagnosis of medical issues.
However, owing to these inherent advantages, the amount of digital X-ray images
being taken has been increasing extensively, and there arises an obvious need for a
sophisticated, yet concrete, automated and effective methodology that can address
issues such as acquiring, analyzing, classifying, storing, managing and accessing
images from a vast database of digital images, with regard to any query at hand, viz.
the patient ID, the different views of X-rays, pathological findings etc. Thus, it can
be easily understood that the archival management, efficient storage and accessing of
medical images by automated means can really be a boon to physicians and radiologists.
Until the recent past, physicians and radiologists had been relying much on their
expertise and domain knowledge to interpret X-rays. Manual annotation is a challenging
task; it is a tedious, time consuming and expertise-centric affair. With automated
annotation, the desired images can be obtained from a vast database with less
processing time and less domain specific knowledge as well. For all these reasons, the
automated annotation of medical images has become popular and is gaining more
significance over the years. In this thesis, the entire work is focused on creating an
automatic annotation scheme which includes various processes ranging from acquiring,
analyzing and classifying X-ray images to generating code words for the various
classes.

1.2 X-rays
Since their discovery, X-rays have been useful to mankind in many ways, leaving
a strong mark through their unparalleled usage in the field of imaging sciences.
Basically, X-rays are used to image the internal organs of the body to diagnose
diseases. X-ray imaging uses a small amount of ionizing radiation and the procedure is
absolutely non-invasive. The amount of radiation depends upon the surface area of the
body; smaller and larger areas receive smaller and larger doses of radiation
respectively, and in case of pregnancy, doctors adopt alternative methods of
examination. Radiographers and radiologists are two different kinds of practitioners
involved with X-rays: a radiographer conducts the X-ray examination procedure and a
radiologist interprets it. The test is very common, with about seven million X-ray
examinations made every year. Some of its uses include diagnosis of fractures for
detecting broken bones, diagnosis of dislocations, diagnosis of abnormal positions of
joints, diagnosis of bone or joint conditions which helps to detect some types of
cancer or arthritis, diagnosis of chest abnormalities such as pneumonia, lung cancer,
emphysema or heart failure, and detection of foreign objects like bullet fragments or
swallowed coins.

1.3 Principle of X-rays
Basically, an X-ray machine produces a controlled beam of radiation, which is used to
create an image of the body region it penetrates. The beam is directed at the area
being examined and, after passing through it, falls on a piece of film or a special
plate, where it casts a type of shadow from which the image can be read. As far as the
human body is concerned, different types of tissues block or absorb the radiation
differently; for example, dense tissues such as bones block most of the radiation and
appear white on the film. Contrarily, soft tissues such as muscles block less radiation
and appear darker on the film. Often, in practice, multiple images are taken from
different angles, so that a more complete view of the area is available. The images
obtained during X-ray examinations may be viewed on film or put through a process
called digitizing, so that they can be viewed on a computer screen as well. Sometimes
an X-ray examination includes contrast, for which a drug called the contrast agent is
used to highlight or contrast parts of the body so that they show more clearly on the
X-ray images [1]. Fig. 1.1 shows the principle of the X-ray machine and Fig. 1.2 shows
a modern X-ray machine.

Fig. 1.1: Principle of X-ray machine.

Fig. 1.2: Modern X-ray machine.

1.4 Types of X-ray Images


As far as the utility level of X-rays is concerned, they can be pivotal in viewing,
monitoring and diagnosing bone fractures, artery blockages, abdominal pain, joint
injuries and infections, etc., based on which they can be termed the abdominal X-ray,
barium X-ray, bone X-ray, chest X-ray, dental X-ray, extremity X-ray, foot X-ray, joint
X-ray, lumbosacral spine X-ray, neck X-ray, palm X-ray, pelvis X-ray, sinus X-ray,
skull X-ray, and thoracic spine X-ray [2]. Fig. 1.3 shows the different types of
X-rays. In this thesis, for the automated annotation of X-ray images, six classes of
X-ray images, namely foot, skull, palm, neck, chest and spine, are considered.

Fig. 1.3: Different classes of X-ray.

1.4.1 Views of X-ray Images

In order to have a clear view of the intricate organs of the human body and for better
interpretation by physicians and radiologists, different views of X-rays need to be
taken. The formal imaging methods can be broadly classified, based on human anatomy,
into the anterior-posterior (AP) view, posterior-anterior (PA) view, lateral view,
oblique view etc. In X-ray imaging jargon, the AP view describes the direction of the
beam through the patient from anterior to posterior, i.e., from the front to the back,
as if one views the patient from front to back regardless of the projection, whereas
for the PA view it is vice versa. The lateral and oblique views differ from both: a
lateral view is a projection that is neither frontal nor dorsal, and an oblique view is
one taken in a slanting direction, at a variation from the perpendicular or horizontal
plane [3]. The representations of the various views mentioned are given in Fig. 1.4 to
Fig. 1.9.

Anterior-Posterior (AP) View

Radiographs are taken with the patient facing the X-ray tube, so that the X-ray beam
enters their anterior side, and exits posteriorly.

Fig. 1.4: (a) AP positioning and (b) AP view.

Posterior-Anterior (PA) View

Posterior-Anterior (PA) view films are obtained while the patient faces away from the
X-ray tube, so that the X-ray beam enters their posterior side and exits anteriorly.
Fig. 1.4 and Fig. 1.5 show the principles of the AP and PA views and their
corresponding X-rays.

Fig. 1.5: (a) PA positioning and (b) PA view.

Lateral View

Lateral radiographs are ones in which the patient stands sideways to the X-ray tube.
They can be taken with either the patient's left or right side against the film. If the
patient is placed next to the left side of the film, it is called a left lateral; when
placed next to the right side of the film, a right lateral. Fig. 1.6 and Fig. 1.7 show
the principles of the left and right lateral views and their corresponding X-rays.

Fig. 1.6: (a) Left positioning and (b) Left lateral view.

Fig. 1.7: (a) Right positioning and (b) Right lateral view.

Oblique View

An oblique view is halfway between an AP (or PA) view and a lateral view. The patient
is rotated about 45 degrees from the lateral (or frontal) position; if the patient's
left side is closer to the film, the view is a left oblique. Furthermore, if the
patient is turned so that they obliquely face the film, that is, with their anterior
side closer to the film, then the view is an anterior oblique, and for the posterior
side it is vice versa. Fig. 1.8 and Fig. 1.9 show the principles of the right and left
oblique views and their corresponding X-rays.

Fig. 1.8: (a) Right oblique positioning and (b) Right oblique view.

Fig. 1.9: (a) Left oblique positioning and (b) Left oblique view.

1.5 Need for Annotation of X-ray Images
Due to the ever increasing volume of digital X-ray images in the field of medical
sciences, the burden of storing and retrieving these images from large collections of
image databases is escalating, and can consume considerable operational time as well.
So, there is a need to efficiently store and retrieve these images. In recent years,
content based image retrieval (CBIR) systems have been developed to browse, search
and retrieve images from vast databases. CBIR systems search images using low level
features such as color, texture, shape, spatial layout etc., which can be automatically
extracted and used to index images. Automated annotation uses a set of linguistic
terms or code words that categorize the images with regard to a concept and provide an
effective means of accessing them in a vast database.
In some scenarios, desired pictorial information can be efficiently described by
means of code words. The process of assigning a code word or text to an image is
termed annotation. Image annotation systems attempt to reduce the semantic gap. The
task of automatically assigning semantic labels to images is known as automatic image
annotation (AIA), also called auto annotation or linguistic indexing. In the last
decade, AIA has remained a highly popular topic in the field of information retrieval
research. The main idea of AIA is to automatically learn semantic descriptors from a
large number of image samples, and to use the learnt concept models to label new
images.
Once images are annotated with semantic labels, they can be retrieved by code
words. In recent years, a variety of learning methods have been actively researched
for automatically annotating images. The main purpose of these methods is to assign a
code word to each image. Different strategies including the co-occurrence model,
machine translation model, latent space approaches, classification approaches and
relevance language models have been proposed in the literature, and each strategy
tries to improve upon the previous one.
Most of the techniques define a parametric or non-parametric model to capture
the relationship between image features and code words. Automated annotation
therefore remains a boon for physicians and radiologists for the accurate
interpretation of X-rays.

1.6 Digital Image Representation


Researchers in the field of medical imaging work with digital images, which are
basically two-dimensional pictures or photographs formed from picture elements,
referred to as pixels. Owing to the digitization of images, an array of image
processing techniques and pattern classification methods has been deployed throughout
this thesis. Categorically, digital images are classified into gray level images and
color images. A gray level image can be defined as a two-dimensional function I(x, y),
where x and y are the spatial or plane coordinates, and the amplitude of the image I
at any pair of coordinates (x, y) is called the intensity or gray level of the image
at that point. When (x, y) and the amplitude values of I are all finite and discrete
quantities, the image is a digital image. Each pixel in a gray level image is
typically represented using 8 bits, so that the total number of gray levels is 256.
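To make this representation concrete, the following minimal Python sketch (using
NumPy, with purely illustrative values) shows a gray level image as a two-dimensional
8-bit array whose entries are the intensities I(x, y):

    import numpy as np

    # A gray level image I(x, y) as a 2-D array of 8-bit intensities.
    # A small synthetic 4x4 image stands in for a real radiograph.
    I = np.array([[ 12,  40,  40,  12],
                  [ 40, 200, 200,  40],
                  [ 40, 200, 255,  40],
                  [ 12,  40,  40,  12]], dtype=np.uint8)

    print(I.shape)                     # spatial extent: (rows, columns)
    print(I[2, 2])                     # gray level at coordinates (2, 2) -> 255
    print(int(I.min()), int(I.max()))  # 8 bits give 256 levels: 0 (black) to 255 (white)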

1.7 Computer Aided System for Detection and Analysis of X-ray Images
The process of annotation begins with a database collection of medical X-ray images
in digital format. Pre-processing and segmentation of these images are followed by
feature extraction and classification. Through careful selection of appropriate
methodologies, each X-ray is classified into one of the aforesaid six classes. This is
followed by the detection of its orientation among the four possibilities considered,
after which any abnormalities present are identified. Then automated annotation is
carried out, which involves assigning a code to the X-ray image that takes into
account the patient ID, X-ray class and view of the image, along with its
characterization as a normal or abnormal X-ray. The schematic representation of the
proposed work is shown in Fig. 1.10.

Fig. 1.10: Computer aided system for annotation of X-ray images.

The annotation of X-ray images proposed in this thesis is methodically attainable
through different processes, ranging from the formation of the medical image database
to the annotation itself, which are briefed one after the other as follows:

1.7.1 Pre-processing

In general, radiographic images are poor in quality because of their physical
characteristics and the size of the defects with varied intensities. Hence, it is
desirable to pre-process the images. Pre-processing is one of the preliminary steps
and is highly essential to ensure the high accuracy of the subsequent steps of
automated annotation of X-ray images. It includes manipulation of intensity and
contrast, noise reduction, background removal, edge sharpening, filtering, etc. Out of
the variety of existing techniques, the median filter and the M3 filter have been
exploited in this work to enhance the X-ray images.
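As an illustration of this step, the minimal Python sketch below applies a median
filter followed by a simple linear contrast stretch using SciPy. It is a stand-in for
the pre-processing used in this work; the window size and the synthetic input image
are assumptions for demonstration only.

    import numpy as np
    from scipy import ndimage

    def preprocess(xray, window=3):
        # Median filter suppresses impulsive noise without blurring edges.
        smoothed = ndimage.median_filter(xray, size=window)
        # Simple linear contrast stretch to the full 8-bit range.
        lo, hi = float(smoothed.min()), float(smoothed.max())
        stretched = (smoothed - lo) / max(hi - lo, 1.0) * 255.0
        return stretched.astype(np.uint8)

    noisy = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in image
    print(preprocess(noisy).shape, preprocess(noisy).dtype)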

1.7.2 Segmentation

Segmentation is typically used to locate objects and boundaries in an image. More
precisely, image segmentation is the process of assigning a label to every pixel in
an image such that pixels with the same label share certain visual characteristics.
Segmentation is a stage where a significant effort is made to delineate regions of
interest (ROI) and to discriminate them from the background regions of the image. In
many cases, the segmentation approach dictates the outcome of the entire analysis,
since feature extraction and further classification of the abnormality depend on the
accuracy of the segmented regions. Usually, segmentation algorithms operate on the
intensity or texture variations using techniques that include threshold logic,
edge-based methods, region-based techniques, connectivity-based methods and
pattern-recognition techniques. In this work, ROI segmentation is carried out using
connected component labeling (CCL) and the expectation-maximization (EM) algorithm
for the classification of the X-ray images, orientation detection and abnormality
detection.
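The following minimal sketch illustrates the connected component labeling idea with
SciPy: the image is binarized with an assumed global threshold and the largest
connected component is kept as the ROI. The threshold and the synthetic input are
illustrative assumptions, not the exact procedure followed in the thesis.

    import numpy as np
    from scipy import ndimage

    def largest_component_roi(xray, threshold=50):
        # Binarize with an assumed global threshold, then label connected regions.
        binary = xray > threshold
        labels, num = ndimage.label(binary)        # connected component labeling
        if num == 0:
            return np.zeros_like(binary)
        # Keep the largest component as the region of interest (ROI).
        sizes = ndimage.sum(binary, labels, index=range(1, num + 1))
        return labels == (1 + int(np.argmax(sizes)))

    xray = np.random.randint(0, 256, (128, 128), dtype=np.uint8)  # stand-in image
    print(int(largest_component_roi(xray).sum()), "pixels retained in the ROI")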

1.7.3 Feature Extraction

The content of an image is described by its features, which may be color, shape,
texture etc. Feature extraction is a technique used to extract and represent the
contents of the X-ray images for further processing. Quantification algorithms can be
applied to the segmented regions of the X-ray image so as to extract the essential
information needed for the process of automated annotation. Since the types and views
of X-ray images vary considerably, a number of techniques that address the needs of a
specific application are needed. In this work, statistical features and wavelet
coefficients are extracted to classify and to detect the orientation and abnormality
of the six different types of X-ray images considered in this study.
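As an example of wavelet-based feature extraction, the sketch below computes a
one-level 2-D discrete wavelet transform with PyWavelets and summarizes each sub-band
with simple statistics. The choice of wavelet and of statistics is illustrative, not
the exact feature set used in the thesis.

    import numpy as np
    import pywt  # PyWavelets

    def dwt_features(xray, wavelet="haar"):
        # One-level 2-D DWT: approximation cA plus horizontal, vertical and
        # diagonal detail coefficient sub-bands (cH, cV, cD).
        cA, (cH, cV, cD) = pywt.dwt2(xray.astype(np.float64), wavelet)
        feats = []
        for band in (cA, cH, cV, cD):
            feats += [band.mean(), band.std(), np.abs(band).sum()]
        return np.array(feats)  # fixed-length vector to feed a classifier

    xray = np.random.randint(0, 256, (128, 128), dtype=np.uint8)  # stand-in image
    print(dwt_features(xray).shape)  # (12,)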

1.7.4 Classification

A mathematical or statistical model called a classifier is used to classify the
regions of interest (pathology) into different classes. Classifiers can automatically
derive knowledge from the extracted features and use this knowledge to recall
previously seen patterns and to classify new patterns with high accuracy. This
knowledge may assist physicians in making the diagnostic process more objective and
more reliable. This thesis deals with the extraction of shape and texture features
that are used to classify six different classes of X-ray images using different types
of classifiers.
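A minimal sketch of this classification step with scikit-learn is given below; the
feature vectors and labels are random placeholders standing in for the shape and
texture features of the six X-ray classes.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Placeholder data: one combined shape+texture feature vector per X-ray,
    # labels 0..5 for the six classes (chest, skull, palm, neck, spine, foot).
    X = np.random.rand(180, 16)
    y = np.repeat(np.arange(6), 30)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)  # multi-class SVM (one-vs-one)
    print("held-out accuracy:", clf.score(X_te, y_te))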

1.7.5 Orientation Detection

In order to have a clear view of the intricate organs of the human body and to help
physicians and radiologists make better interpretations, different views of X-rays
need to be taken. Radiographic positioning is now highly standardized in order to
facilitate interpretation, so there can be many possible views of X-rays, of which the
most common ones include the AP, PA, lateral and oblique views. In this work, the
Harris corner detection algorithm, a support vector machine with wavelet coefficients,
and the SURF algorithm are utilized to identify the X-ray views.
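For instance, the Harris corner response can be computed with OpenCV as sketched
below. The parameter values are common defaults and the input is a synthetic stand-in;
the rule that maps the detected corner points to a particular view is described in
Chapter 4.

    import cv2
    import numpy as np

    xray = np.random.randint(0, 256, (256, 256), dtype=np.uint8)  # stand-in image

    # Harris response with neighbourhood blockSize=2, Sobel aperture ksize=3
    # and empirical constant k=0.04 (cf. the notation section).
    response = cv2.cornerHarris(np.float32(xray), 2, 3, 0.04)
    corners = np.argwhere(response > 0.01 * response.max())  # strong corners only
    print(len(corners), "corner points detected")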

1.7.6 Abnormality Detection

Once an X-ray image has been classified into one of the six classes, the abnormalities
in it need to be detected. This is taken care of by employing algorithms like decision
tree, extreme learning machine and SVM, wherein the extracted features play a
significant role in pinpointing the abnormalities.

1.7.7 Annotation

Annotation is essentially a classification problem, i.e., classifying a given image
into one of the pre-defined labels. It is the process of assigning meaningful words to
an image taking its content into account. Image annotation is the process of
automatically assigning metadata, in the form of keywords, to digital images by a
computerized system. Through some bits of code, the patient ID is linked with the
X-ray along with its type and orientation. Thus, by assigning a suitable number of
bits to each piece of metadata associated with the X-ray image, such as the patient
ID, type of X-ray, orientation and abnormality, the X-ray images can be automatically
annotated.
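A minimal sketch of such bit packing is shown below. The 21-bit width follows the
thesis, but the field-by-field allocation used here is a hypothetical split for
illustration; the actual distribution of the generated code is defined in Chapter 6.

    def annotation_code(patient_id, age, gender, xray_class, view, abnormal):
        # Hypothetical 21-bit split (the thesis's exact layout is in Chapter 6):
        # patient_id 7 | age 7 | gender 1 | class 3 | view 2 | abnormal 1 bits.
        code = (patient_id & 0x7F) << 14
        code |= (age & 0x7F) << 7
        code |= (gender & 0x1) << 6
        code |= (xray_class & 0x7) << 3
        code |= (view & 0x3) << 1
        code |= abnormal & 0x1
        return format(code, "021b")

    # Patient 25, age 42, gender code 0, chest (class 0), AP view (0), normal (0):
    print(annotation_code(25, 42, 0, 0, 0, 0))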

1.8 Data Source
The images employed in this work have been received with due permission from the
IRMA group of the RWTH University Hospital, Aachen, Germany [4]. Besides, images
were also obtained from the Raja Muthaiah Medical College and Hospital, Annamalai
University, Chidambaram, and a few from the Government Medical College and Hospital,
Salem. Throughout the work, care has been taken to ensure that the database employed
contains images representing different ages, genders, views, positions, pathologies
etc. In this work, six different classes of X-ray images have been taken, wherein
each class consists of 30 images having different views.

1.9 Objectives of the Thesis


The core objective of this research work is to automatically annotate X-rays with
reduced computational burden and increased user-friendliness, requiring less domain
expertise, in addition to addressing the following intricacies.

• Classification of the X-ray images.

• Orientation detection of the X-ray images.

• Checking for abnormalities.

• Generation of annotation code automatically.

1.10 Organization of the Thesis


The focus of the research work presented in this thesis is to automatically annotate the
X-ray images using image processing techniques and pattern classification methods.
Chapter 1 gives a general introduction to the automatic annotation of X-ray images,
its need, applications and the technicalities behind it, followed by the objectives of
the work and the organization of the thesis.
Chapter 2 deals with the elaborate survey conducted on the existing literature
related to the proposed ideology, with a motto to realize the extent of the technology
so far and to find a fitting optimal solution to the flaws, if any, inherently present
in the existing approaches. Thereby, a degree of novelty can be ensured by
meticulously bridging the gap between the shortcomings found and the innovative
technologies of the future.
Chapter 3 throws light upon the classification of the six different classes of medical
X-ray images undertaken for this research, namely foot, skull, palm, neck, chest and
spine, based on the combination of shape and texture features, using different
classifiers like the back propagation neural network (BPNN), probabilistic neural
network (PNN) and support vector machine (SVM).
Chapter 4 focuses on the detection of the view of the X-ray images, namely the
AP, PA, lateral and oblique views using various algorithms like Harris-corner detector,
SURF algorithms and SVM along with wavelet coefficients.
Chapter 5 checks for abnormalities in X-ray images by employing suitable algo-
rithms like decision-tree, extreme learning machine (ELM) and SVM.
Chapter 6 discusses automated annotation, which involves code generation taking into
account the patient ID, X-ray type and orientation of the X-ray images, along with the
information as to whether the X-ray is normal or abnormal.
Chapter 7 summarizes the work presented in the thesis through its merits, appli-
cations, scopes and the possibilities of future-extension, besides claiming the contribu-
tions of the overall work executed.

Chapter 2

Review of Literature

2.1 Introduction
This chapter gives an overview of the image processing and pattern classification
techniques used in the literature for automated annotation of medical X-ray images,
and it is organized as follows: Section 2.2 explains the various image processing
techniques employed in the literature for pre-processing of medical X-ray images.
Section 2.3 covers the literature related to segmentation techniques for X-ray images.
Section 2.4 presents a review of the various feature extraction techniques found in
the literature. Sections 2.5, 2.6 and 2.7 present a review of modeling techniques for
classification, orientation detection and abnormality detection of medical X-ray
images. Finally, Section 2.8 presents a review on automated annotation of medical
X-ray images.

2.2 Pre-processing
The images collected by different types of sensors are generally affected by different
types of noise [5]. Radiographic images are prone to noise because of their physical
characteristics and intensity variations, which degrades the quality of the X-ray
images. The accuracy of interpretation of X-ray images depends mainly on the quality
of the radiograph. The objective of pre-processing is to improve the quality of the
X-ray image and make it ready for further processing by removing the irrelevant noise
and unwanted parts in the background of the X-ray image [6]. In order to improve the
quality of images, different types of filters such as mean, median and Wiener filters
are used in [7] and [8]. The mean filter is basically a convolution filter. The median
filter is similar to the mean filter, but it reduces noise without blurring the edges
of the X-ray image. Wiener filters are used mainly to minimize the mean square error
of the restored X-ray images.
The authors in [9] discuss de-noising algorithms, the filtering approach and the
wavelet based approach for removing different kinds of noise, such as Gaussian, salt
and pepper and speckle noise, present in X-ray images. The wavelet based approach has
been proved to be the best for de-noising images.
A statistical filter, a modified version of the hybrid median filter used for noise
reduction, is described in [10]. The work in [11] describes the contrast enhancement
of X-ray images and presents a new approach for contrast enhancement based upon an
adaptive neighborhood technique. A comparative analysis of the proposed technique
against the existing major contrast enhancement techniques is performed, and the
results of the proposed technique are found to be promising.
The authors in [12] apply the adaptive histogram equalization (AHE) technique to
the soft tissue of lateral neck radiographs. Here, the image is assessed and evaluated
before and after processing by radiologists. The results showed that AHE achieves
better contrast enhancement, which aids the detection of soft tissue.
The work in [13] describes a novel hybrid and respective smoothing-sharpening
(HRST) technique to remove the random noise present in digital X-ray images.
Transrectal ultrasound (TRUS) images are pre-processed with the M3 filter and
segmented using DBSCAN clustering after applying morphological operators in [14].
The work in [15] presents 2D adaptive noise removal and median filtering to remove
noise. The authors in [16] and [17] propose the M3 filter, which is a hybridization
of the mean and median filters. It replaces the central pixel by the maximum of the
mean and median for each sub-window, thereby preserving the high frequency components
in an image. Therefore, it is most suitable for denoising all types of medical images,
and is a simple, intuitive, effective and easy to implement method for smoothing
images.
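Based on this description, a minimal Python sketch of the M3 filter is given below
(assuming SciPy is available): each pixel is replaced by the maximum of the local mean
and the local median over a sub-window.

    import numpy as np
    from scipy import ndimage

    def m3_filter(xray, window=3):
        # M3: replace each pixel by max(local mean, local median) over the
        # sub-window, preserving high frequency components better than either.
        img = xray.astype(np.float64)
        local_mean = ndimage.uniform_filter(img, size=window)
        local_median = ndimage.median_filter(img, size=window)
        return np.maximum(local_mean, local_median).astype(np.uint8)

    noisy = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in image
    print(m3_filter(noisy).shape, m3_filter(noisy).dtype)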

2.3 Segmentation
Segmentation is the process of partitioning the region of interest from the background
of the image; it is an essential analysis function for which numerous algorithms have
been developed in the field of image processing.
In medical image analysis, segmentation is important for feature extraction, image
measurements and image display. It is useful to classify image pixels into anatomical
regions, such as bones, muscles and blood vessels, and also into pathological regions,
such as cancer and tissue deformities. Hence, it is a fundamental step in X-ray image
analysis, and different approaches to segmentation are discussed in the literature
reviewed here. Segmentation techniques are categorized into supervised and
unsupervised approaches. Supervised segmentation uses prior knowledge about the region
of interest and the background regions to be segmented; machine learning methods or
model-based approaches are used here. Unsupervised segmentation partitions an image,
based on its properties (gray level, texture, shape or color), into a set of regions
that are uniform or distinct.
Segmentation approaches are generally categorized into model based methods,
contour based methods, region based methods, clustering and thresholding methods.
While model based methods find out similar patterns, and contour based methods rely
on boundary of regions, the region based methods partition the images into spatially
inter-related homogeneous regions. On the other hand, clustering methods group to-
gether the pixels having the same characteristics, thereby resulting in non-connected
regions. Recently, many studies concentrate on machine learning approaches to mea-
sure the similarity between the automated segmented regions and radiographer marked
regions.
The work in [18] presents the region-growing algorithm, a simple pixel-based image segmentation method that involves the selection of pixels (the seeds) and the growing of regions around these seeds using a homogeneity criterion. The authors
in [19] apply watershed transform to the gradient of an image to acquire the boundary
of the X-ray image. This type of segmentation method is simple, intuitive and has
good properties, which make it useful for many classes of X-ray image segmentation.
The work in [20] summarizes that region growing algorithms are fast and perform accurate segmentation of regions that are spatially separated yet have the same features.
The authors in [21] and [22] present edge-based segmentation methods to find edges
in the chest X-ray images. Segmentation is used to localize the suspicious area from the
image and it also helps in separating the suspicious area from the remaining structure
using edge detection methods, skeletonization, contour, thresholding and boundary de-
tection. The authors in [18], [23] and [24] summarize edge detection methods including the Sobel, Prewitt and Roberts detectors for chest X-ray image classification. These operators are very sensitive to noise, and they can detect edges in any direction. Although they detect only strong edges, their implementation is simple and less complex.
The works in [25] and [26] describe Canny edge detection, which can detect edges in all possible directions. However, its implementation is complex, and its performance depends on the size of the object and on its sensitivity to noise.
In [27], a model based approach is used for segmenting tooth X-ray images. Such methods often require manual interaction and exhibit poor convergence to concave boundaries. The authors in [28] present an automatic contour extraction method for tooth segmentation in dental X-ray images.
The authors in [29] present a segmentation method that consists of three steps: image enhancement, region of interest localization, and tooth segmentation using morphological operations and the snake (or active contour) model.
on iterative thresholding and adaptive thresholding for dental X-ray image segmenta-
tion are described in [30] and [31]. The authors in [32] analyze the connected compo-
nents to obtain the desired region of interests (ROIs) using mathematical morphology
approach.
The work in [33] introduces mean-shift algorithm for segmenting X-ray images.
The two steps of the mean shift algorithm are: the filtering step, where the original
image is filtered in the feature space; and the clustering step, where the filtered data
points are grouped using linkage clustering or edge-directed clustering. The authors
in [34] and [35] present the fuzzy C-means algorithm for X-ray image segmentation for
the automated classification of X-ray images.
Connected component labeling (CCL) is applied for the segmentation of mammograms in order to detect abnormalities (malignant or non-malignant) in [36] and [37]. CCL scans an image and groups its pixels into components based on pixel connectivity, i.e., all pixels in a connected component share similar pixel intensity values and are in some way connected with each other. Once all groups have been determined, each pixel is labeled with a gray level or a color (color labeling) according to its component. CCL is well suited to medical image analysis and applications; hence, in this work, this method is proposed for the segmentation of the X-ray images.
The work in [38] describes the classification-based techniques using an adaptive
fuzzy method for lateral skull segmentation. However, classification-based algorithms
are generally not effective for X-ray image segmentation, due to their intrinsic proper-
ties. The EM algorithm, used to find the most likely distribution function describing the relation between observed variables and latent variables, is discussed in [39]. The works in [39], [40] and [41] introduce a novel classification method based on the hybridization of the GA and EM algorithms for the classification of X-ray images, which produces better results.

2.4 Feature Extraction


One of the most significant issues in developing an automated annotation of X-ray
images is to extract the contents that help to discriminate the different kinds of features
found in the X-ray images. The content means some property extracted from the image
such as texture, shape and color. These contents are extracted using image processing
techniques and are called features, and they provide useful information for further
analysis. Color features are not suitable for gray scale medical X-ray images. So
in this work, texture features and shape features are used for feature extraction and
analyzed in this section. Texture has been one of the most important characteristics
which has been used to classify and recognize objects for image retrieval [42], [43]
and [44].
Texture features are concerned with the spatial distribution and frequency rela-
tionship of gray levels. Texture is one of the most important low-level features in
computer vision and pattern recognition area, and is very much useful for medical
image applications. There are numerous types of texture features like geometrical,
statistical, model-based features. The work in [45] uses the gray level co-occurrence
matrix (GLCM) for extracting the texture features. Gray-level co-occurrence matrix
(GLCM) is one of the well-known statistical tools for extracting texture information
from the X-ray images. It provides information about position of pixels having similar
gray level values. GLCM extracts contrast, energy, homogeneity and entropy features
of the image at four different directions (0◦ ,45◦ , 90◦ , and 135◦ ). The work in [42] ob-
tained a classification accuracy of 90.7% using GLCM features on a dataset consisting
of 116 classes of X-ray images.
The authors in [43] propose an approach which combines three levels of features (global, local and pixel level) and gives an accuracy of 89.0% for two different medical classes, namely chest and hand X-rays, using an SVM classifier.
The authors in [46] obtained a classification accuracy of 94.2% using SVM to
classify 21 classes of X-ray images present in IRMA database. The authors used
a combination of shape features, local pattern and gray level co-occurrence matrix
features to model the SVM classifier.
Eight gray-level co-occurrence matrices were constructed for eight different orientations (0◦, 45◦, 90◦, 135◦, −45◦, −90◦, −135◦ and −180◦) in [47]. The total dimensionality
of feature vector for each given X-ray image was 565 among which 93 features were
extracted at the global level, 372 features were computed at the local level, and 100 fea-
tures were extracted at the pixel level. Finally, the combined image features from three
different levels for each X-ray image were stored in a big feature vector and clustering
was performed and the accuracy obtained was 81.4% using K-means clustering.
The work in [48] discusses low-level image representations such as the gray level co-occurrence matrix (GLCM), Canny edge operator, local binary pattern (LBP), pixel
value, and local patch-based image representation such as Bag of Words (BoW). These
features have been exploited in different algorithms for automatic classification of med-
ical X-ray images. The classification performance obtained with regard to the various
image representation techniques was analyzed. These experiments were evaluated on
Image CLEF 2007 database which consists of 11000 medical X-Ray images under 116
classes. Experimental results show that the classification performance obtained by ex-
ploiting LBP and BOW outperformed the other algorithms with respect to the image
representation techniques discussed.

Shape Feature

Shape is one of the most important and effective low level visual features. Shape feature
extraction methods are usually divided into contour-based features and region-based
features. Commonly used contour-based shape feature extraction methods include
Fourier, wavelet, curvature scale space descriptors, shape signatures, moments and
function of moments and are described in [49].
The authors in [50] describe geometric features, invariant moments (IM) and
Zernike moments (ZM) that are usually concise, robust and easy to compute for ori-
entation detection. These moments are also invariant to scaling, rotation and translation of the object. The Fourier descriptor (FD) is likewise used as a valid description tool for detecting the orientation of X-ray images.
Considering medical X-ray image characteristics and also to avoid complexity, a
novel feature is proposed in [46] which is the combination of shape and texture features.
The feature extraction process starts with edge and shape information extraction from the original medical X-ray images.
Finally, Gabor filter is used to extract spectral texture features from shape images.
Furthermore, in order to study the effect of feature fusion on the classification perfor-
mance, different effective features like local binary pattern and gray level co-occurrence
matrix are utilized and their performance evaluated.
The authors in [51] present a novel feature extraction scheme for medical X-ray im-
age categorization. Gabor-based centre symmetric-local binary patterns (GCS-LBP)
feature is introduced. The proposed scheme is implemented on a subset of IRMA
dataset for 15 different X-ray categories. Experimental results show that the proposed
method provides an accuracy of 85.7%.
The proposed model in [52] analyzes X-ray images using various statistical measures such as mean, standard deviation, entropy, skewness and kurtosis.

Zernike Moments

Zernike moments consist of a set of orthogonal, complex-valued moments which have some very important properties. Firstly, the Zernike moments' magnitudes are invariant under image rotation. Secondly, Zernike moments have multilevel representation capabilities. Thirdly, Zernike moments are less sensitive to image noise.
In [53], the pseudo-Zernike moments of an image are used as shape descriptor,
which have better feature representation capabilities and are more robust to noise
than other moment representations.
The authors in [54] discuss that Zernike moments, with orthogonal basis functions
are less sensitive to noise than geometric moments and are more powerful in discrim-
inating objects. ZM technique is used to reduce the problem of translation, rotation
and scaling of X-ray images.
This is done by normalizing the Zernike moment values. The reconstructed image is compared with the original one. The orthogonality property simplifies the work of reconstructing the image and produces better results for different types of X-ray images; hence this method is proposed for the classification of X-ray images.

2.5 Classification
Automatic machine classification and grouping of patterns are important problems in
a variety of engineering and scientific disciplines such as computer vision, medicine,
marketing, biology, psychology and remote sensing. The pattern classification can be
either supervised or unsupervised. In supervised classification, the input pattern is
assigned to a member of a pre-defined class, and in the unsupervised classification the
pattern is assigned to an unknown class. A wide range of supervised and unsupervised
architectures are seen in the literature which includes clustering techniques, neural
networks, association rule based classifiers, support vector machine (SVM) classifiers
and many more. Several studies demonstrate that the practice of combining several
base classifier models into one aggregated classifier leads to significant gains in classi-
fication performance over its constituent members. This section gives a brief review of
the various classification techniques found in the literature for automatic annotation
of medical X-rays.

K-Nearest Neighbor(KNN)

KNN is a supervised learning algorithm and the simplest of all machine learning algorithms; an object is classified by a majority vote of its nearest neighbors.
In [48] and [55], KNN has been used for the classification of lung X-ray images. Testing on the Japanese Society of Radiological Technology dataset for lung cancer detection achieved 96.0% accuracy.
In [56], 750 X-ray images covering four structural groups, namely head, neck, upper limb and lower limb, were taken; 500 X-ray images were used for training and 250 for testing with a KNN classifier using intensity, texture and histogram of oriented gradients (HOG) features, obtaining a classification accuracy of 86.0%.

Artificial Neural Network

An artificial neural network (ANN) is an information processing system which contains a large number of highly interconnected processing neurons. These neurons work together in a distributed manner to learn from the input information, to coordinate internal processing, and to optimize the final output, as explained in [57]. The
three layer back propagation neural network, the radial basis function neural network
(RBFNN) and auto associative neural network used for classification of mammogram
are described in [58].
The authors in [59] detected nodules in the diseased areas of chest X-ray images and obtained an accuracy of 96.0% using the pixel-based technique, while the feature-based technique produced an accuracy of 88.0% using an ANN classifier.
Radial Basis Function Neural Network (RBFNN)

In [60], a radial basis function neural network classifier is used for lung cancer detection. A correct classification rate of 96.0% is achieved using curvelet transform based features of chest X-ray images.

Back Propagation Neural Network (BPNN)

The back propagation neural network is the most commonly used neural network in
classification applications. It is used for the classification of different classes of X-ray images using the Fourier transform in [59] and [61].

Probabilistic Neural Network (PNN)

The PNN is a feed forward neural network that uses a supervised learning algorithm. The authors in [62] analyzed skull X-ray images to detect tumors (benign or malignant) for the primary level classification of brain tumors using PCA and PNN classifiers. The PNN classifier produced a high score of 78.05% using texture features.

Support Vector Machine (SVM)

SVM is a powerful machine learning technique for classification and regression. The
authors in [63] used 2655 radiographic images from IRMA dataset and good perfor-
mance was achieved.
In [64] and [65] Coiflet wavelets are used to extract features from the CT images,
the extracted features are then classified using support vector machine (SVM) with
radial basis function (RBF) kernel and 90.0% accuracy was obtained.
The authors in [66] proposed an automatic X-ray image classification with multi-
level feature extraction using SVM and KNN classifiers. The accuracy of the SVM was higher, at 89.0%, compared to the K-nearest neighbour classifier, which gave an accuracy of 82.0%.
The authors in [64] developed an efficient lung nodule detection scheme by per-
forming nodule segmentation through multi-scale wavelet based edge detection and
morphological operations followed by SVM classification. This methodology used three
different types of kernels namely linear, radial basis function (RBF) and polynomial,
among which the RBF kernel gave better performance with an accuracy of 92.86%.
Hence, in this work SVM with RBF is proposed for automated annotation of X-ray
images.

Clustering Techniques

The K-means algorithm, elaborately discussed in [67], assigns each point to the cluster whose center (also called the centroid) is nearest. The center is the average of all the points in the cluster; that is, its coordinates are the arithmetic mean over all points in the cluster, computed separately for each dimension. The main advantages of this algorithm are its simplicity and speed, which allow it to run on large datasets. Its disadvantage is that it does not yield the same result with each run, since the resulting clusters depend on the initial random assignments. It minimizes intra-cluster variance, but does not ensure that the result is a global minimum of variance. Another disadvantage is the requirement that a mean be definable, which is not always the case.
The work in [63] describes a novel fuzzy scheme for medical X-ray image classifica-
tion. The shape and texture features are extracted. The proposed method is evaluated
using 2655 X-ray images from the IRMA dataset. The classification accuracy rates obtained by the fuzzy classifier are higher than those of the multilayer perceptron.
An automatic medical X-ray image clustering system was developed by merging the outputs from different neural-network classifiers with different features. These features are based on pixel values, local binary patterns, global means of rows and columns, local-partition means of rows and columns, and local histogram features. Merging the outputs from different classifiers improves the overall accuracy over the individual classifiers. The best individual classifier is the global means of rows and columns classifier, with an 83.2% accuracy rate. The merged outputs from the five classifiers gave an
accuracy of 86.2%. The merged outputs from the top three classifiers produced an
accuracy of 87.2% and are summarized in [68].

Decision Trees

The work in [69] presents a fast and efficient method for classifying X-ray images
using random forests with wavelet-based local binary pattern (LBP) to improve image
classification performance and to reduce training and testing time.
In the work in [70], Naive Bayes and decision tree classifiers are used to train, test and classify CT images, and their performance is measured. The decision tree classifier outperformed the other classifiers in detecting tumors.

2.6 Orientation Detection


In order to have a clear view of the various organs of the human body, different views
of X-rays are taken. This plays a vital role for viewing the particular portions of the
organs to be examined for correct diagnosis.
The authors in [71] present a novel method for detecting the view of the chest X-
ray images automatically. The chest rib-orientation was measured using a generalized
line histogram technique and resulted in 90.0% accuracy.
An efficient automated method for identifying the angle of curvature of the spine
and angles between the vertebrae is proposed in [72].
FAST, SIFT, SURF and BRISK not only detect feature points, but also encode the
information about the surrounding region of the feature point. These descriptor values
are usually invariant to scaling and rotation, and are used to establish correspondences
between feature points in successive frames.
The work in [73] and [74] summarizes two robust feature detection algorithms
namely scale invariant feature transform (SIFT) and speeded up robust features (SURF)
for detecting the keypoint and keypoint descriptors of the X-ray images.
The authors in [75] present a scale, rotation and color illumination invariant feature
detector and descriptor for medical applications using SIFT and SURF algorithm.
The work in [76] details an original procedure for automatically detecting vertebrae
in X-ray images using SIFT features with SVM classifier. The experiments were con-
ducted on 50 radiographs, mainly focusing on the cervical vertebrae C3 to C7. The
obtained results were found to be very promising.
The authors in [77] present a novel method which combines a template matching
model and K-means clustering to identify cervical vertebrae in X-ray images. The
proposed approach was successfully tested with 330 X-ray images with 97.5% accuracy.
The SURF and Harris corner algorithms are widely used for detecting feature points in X-ray images; hence these two algorithms are proposed for the orientation detection of X-ray images in this work.

2.7 Abnormality Detection


Abnormality detection based on medical image classification is an area of research which has proved to be a challenging task during the past two decades. This field has gained more attention due to the new challenges posed by voluminous image databases
[78]. Among the four modalities, (X-ray, CT, MRI, Ultrasound), X-ray diagnosis is
commonly used for abnormalities detection unless the abnormalities are complicated
(e.g. stress fractures). In such cases the CT, MRI or ultrasound may be needed for
further diagnosis and operation.
X-ray is one of the oldest and most frequently used imaging modalities; it produces images of any bone in the body, including the hand, wrist, arm, elbow, shoulder, foot, ankle, leg (shin), knee, thigh, hip, pelvis or spine, to detect abnormalities. Detection of
abnormalities is considered important, as a wrong diagnosis often leads to ineffective
patient management, increased dissatisfaction and expensive litigation.
The authors in [79] and [80] reviewed various classification algorithms that can be used to classify X-ray images as normal or abnormal. Although many classification approaches have been developed, the selection of a suitable classifier requires consideration of many factors, such as classification accuracy, algorithm performance and computational resources.
In [81], [82] and [83] fusion-based classifiers are constructed with various features
such as contrast, homogeneity, energy, entropy, mean, variance, standard deviation,
correlation, Gabor orientation (GO), markov random field (MRF), and intensity gra-
dient direction (IGD) to train and test the three classifiers namely BPNN, SVM and
Naive Bayes(NB), for the purpose of detecting abnormalities in X-ray images.
The work in [84] and [85] present an approach to detect lung abnormalities using
SVM classifier with SIFT features in 12,000 chest X-ray images.
A fully automatic method to detect abnormalities is presented in [86] using frontal
chest X-rays. 388 X-ray images are taken, and segmented using active shape models.
Texture features are extracted from each region using the moments. Additional differ-
ence features are obtained by subtracting feature vectors from corresponding regions
in the left and right lung fields. A separate training set is constructed for each region.
All regions are classified by voting among the nearest neighbors with the leave-one-out method. Then, the classification results of the regions are combined using a weighted multiplier.
The authors in [87] analyze the applicability of five different edge detection algo-
rithms for computer aided fracture detection systems for bone X-ray images. The five algorithms considered are Canny, Sobel, Prewitt, LoG and Roberts. These algorithms
were selected because of their popularity in image analysis systems. Various experi-
ments proved that the Sobel edge detection algorithm is fast and efficient in identifying
the fracture.
The work in [88] detects abnormalities automatically in hand X-ray images using neural network, BayesNet, Naive Bayes and J48 (a decision tree) classifiers. The BayesNet classifier outperformed the other classifiers with 86.0% accuracy.
A GLCM based method is proposed in [89] to segment the hand X-ray image and to separate the bone regions from the soft tissue regions. K-means clustering is applied using GLCM texture features to separate the bones from the soft tissues.
The authors in [90] developed a computer aided diagnosis system which uses the factorizing personalized Markov chains (FPMC) algorithm for segmentation and a rule based technique for classifying the cancer nodules. The learning is performed with the help of an extreme learning machine and results in better classification rates.
The paper in [91] presents a neural network based approach to detect lung cancer
in chest X-ray images. The authors used image processing techniques to denoise,
enhance, segment and to detect edges in the X-ray images. The area, perimeter and
shape features of the nodule are extracted and considered as the inputs to train the
artificial neural network and to discriminate whether the extracted nodule is malignant
or non-malignant.
The authors in [92] developed an automatic pulmonary nodule diagnosis system for early detection of lung cancer. In total, 133 lung X-ray images are taken as input and enhanced using Gaussian and median filters. The region growing segmentation algorithm is applied on the enhanced lung regions. Then, the shape of the nodule is calculated using a shape formula with the help of the area and perimeter of the nodule. Finally, the extracted features help to find the cancerous and non-cancerous nodules in chest X-ray images using an ANN classifier. To differentiate the cancerous nodules from other suspected nodule areas, an artificial neural network using back propagation is developed.

2.8 Annotation
Owing to the rapid development of modern medical devices, more and more medical
images are generated. For instance, over 640 million medical images were stored in more than 100 National Health Service trusts in the UK as of March 2008, as described in [93]. The cost of manually managing these images is very high, and the process is also prone to errors. As
a result, there is an increased demand for a computerized system to index, compare,
analyze and annotate these valuable resources. As such, the research on data mining
and medical content-based image retrieval has progressed in recent times. It is believed
that the quality of such a medical system and patient care system can be improved
by successful categorization which uses the information directly extracted from the
images. Unlike earlier years of this research when the categorization of medical images
was restricted to a few classes only, today, this task is challenged when it deals with
real-life constraints of content-based medical image classification. This is where the ImageCLEF (Cross Language Evaluation Forum) medical image annotation challenge was born. The goal of this challenge is to categorize images into pre-defined classes automatically and to assign correct labels to unseen test images. The database used in this study is provided by the IRMA group for medical image classification [4].
The works in [94] and [95] summarize that digital X-ray images are used for diagnosis, treatment of diseases and management of patients. Nowadays, with the development of digital technologies, medical images such as X-ray, MRI, magnetic resonance spectroscopy (MRS) and CT images are stored in very large databases or file systems along with information including radiology reports, findings and visual properties. With the increase in medical images, their management, effective evaluation and reliable access have become important issues.
Image annotations describe an image with one or more words that express the
content and meaning of the image. The papers [96] and [97] describe that the anno-
tation can be done by traditional method and automated method. The traditional or
manual method of annotation is done by humans, but automated annotation is done
by machines.
In [95] the latent semantic analysis (LSA) technique significantly improves the
quality of the annotation by combining different visual features (i.e. local descriptor
and global feature). In addition, the LSA is used to reduce the complexity of compu-
tation on large matrices and to obtain good performance. Experiments made on five types of radiological images show that this approach works best with skull X-ray images.
The authors in [98] and [99], propose a pathology-based medical image annota-
tion using a statistical machine translation approach and the overall performance is
promising and will be useful to doctors and medical professionals.
The work in [100] uses a direct and two hierarchical classification schemes, and the ImageCLEF 2009 database is used for annotation. The direct scheme employs an SVM classifier to automatically annotate X-ray images. The two hierarchical schemes divide the classification task into sub-problems. The first hierarchical scheme exploits ensemble SVMs trained on IRMA sub-codes. The second learns from sub-groups of data defined by the frequency of classes. Experiments show that hierarchical annotation of images by training individual SVMs over each IRMA sub-code dominates its rivals in annotation accuracy, with increased processing time relative to the direct scheme.
The three different annotation techniques such as annotation by binary classifi-
cation, probabilistic latent semantic analysis (PLSA) based image annotation, and
annotation using top similar images to the query image are discussed for automatic
annotation of X-ray images in [101].
The authors in [102] describe the main parameters of annotation and retrieval, in which some of the techniques used are multilevel classification, the IRMA code, visual bag of words (BoW) and an improved multilayer perceptron (MLP) neural network. Finally, the authors highlight the advantages and disadvantages of each technique along with their accuracy rates.

2.9 Techniques used in the Proposed Work


The following techniques have been used in this work for classification, view classification and abnormality detection of medical X-ray images.

2.10 Segmentation Algorithms


In this work, connected component labeling (CCL) is used to find the region of interest for the classification of medical X-ray images, and the expectation maximization (EM) algorithm is used for view classification and abnormality detection of medical X-ray images.

2.10.1 Connected Component Labeling (CCL)

Connected component labeling is used to find the region of interest. Connected com-
ponent labeling works by scanning an image, pixel-by-pixel (from top to bottom and
left to right) in order to identify connected pixel regions, i.e. regions of adjacent pixels
which share the same set of intensity values V.
Connected component labeling works on binary or gray level images, and different measures of connectivity are possible; the default is 8-connectivity. For a binary image, V = {1}. The connected components labeling operator scans the image by moving along a row until it comes to a point p (where p denotes the pixel to be labeled at any stage in the scanning process) for which V = 1. When this is true, it examines the four neighbors of p which have already been encountered in the scan, i.e. the neighbors (i) to the left of p, (ii) above it, and (iii and iv) the two upper diagonal pixels. Based on this information, the labeling of p occurs as follows:

• If all four neighbors are 0, assign a new label to p, else

• If only one neighbor has V = 1, assign its label to p, else

• If more than one of the neighbors have V = 1, assign one of the labels to p
and make a note of the equivalences.

After completing the scan, the equivalent label pairs are sorted into equivalence classes and a unique label is assigned to each class. As a final step, a second scan is made through the image, during which each label is replaced by the label assigned to its equivalence class.
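As an illustration, the two scans above can be sketched directly in Python with NumPy. This is a minimal sketch: the union-find bookkeeping is one common way to record and resolve the noted equivalences, not something the description above prescribes.

```python
import numpy as np

def two_pass_ccl(binary):
    """Two-pass connected component labeling with 8-connectivity."""
    labels = np.zeros(binary.shape, dtype=int)
    parent = {}                      # union-find forest over provisional labels

    def find(x):                     # representative of x's equivalence class
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):                 # note the equivalence of two labels
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    next_label = 1
    rows, cols = binary.shape
    # first scan: provisional labels from the four already-visited neighbors
    for r in range(rows):
        for c in range(cols):
            if not binary[r, c]:
                continue
            neigh = []
            if c > 0 and labels[r, c - 1]:
                neigh.append(labels[r, c - 1])
            if r > 0:
                for dc in (-1, 0, 1):
                    if 0 <= c + dc < cols and labels[r - 1, c + dc]:
                        neigh.append(labels[r - 1, c + dc])
            if not neigh:                      # all four neighbors are 0
                labels[r, c] = next_label
                parent[next_label] = next_label
                next_label += 1
            else:                              # one or more labeled neighbors
                m = min(neigh)
                labels[r, c] = m
                for n in neigh:
                    union(m, n)
    # second scan: replace each label by its equivalence-class representative
    for r in range(rows):
        for c in range(cols):
            if labels[r, c]:
                labels[r, c] = find(labels[r, c])
    return labels
```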

2.10.2 Expectation Maximization (EM)

EM stands for Expectation Maximization.

• Expectation: this step computes the expectation of the likelihood, assuming the current parameters.

• Maximization: this step computes the maximum likelihood estimate of the parameters by maximizing the expected likelihood found in the E-step.

In E-Step

Assume parameters for the two Gaussians. The parameters θ1 and θ2 of the two Gaussians are their means and variances, as given in equation (2.1):

\theta_1 = (\mu_1, \sigma_1) \qquad \theta_2 = (\mu_2, \sigma_2) \qquad (2.1)

Assuming means µ1 and µ2, the variances are calculated as

\sigma_1^2 = \frac{\sum_i (x_i - \mu_1)^2}{n} \qquad \sigma_2^2 = \frac{\sum_i (x_i - \mu_2)^2}{n} \qquad (2.2)

Now the probability density function for both the Gaussians is computed as

p(x_i \mid \theta_1) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left[-\frac{1}{2}\left(\frac{x_i - \mu_1}{\sigma_1}\right)^2\right] \qquad (2.3)

Then the normalized probability values X_i and Y_i for the two Gaussians are calculated as

X_i = \frac{p(x_i \mid \theta_1)}{p(x_i \mid \theta_1) + p(x_i \mid \theta_2)} \qquad (2.4)

Y_i = \frac{p(x_i \mid \theta_2)}{p(x_i \mid \theta_1) + p(x_i \mid \theta_2)} \qquad (2.5)

In M-Step

Using the normalized probabilities, new means µ1 and µ2 are calculated:

\mu_1 = \sum_{i=1}^{n} X_i x_i \qquad \mu_2 = \sum_{i=1}^{n} Y_i x_i \qquad (2.6)

n_X = \sum_{i=1}^{n} X_i \qquad n_Y = \sum_{i=1}^{n} Y_i \qquad (2.7)

where n_X and n_Y are the total normalized probabilities of Gaussian 1 and Gaussian 2 respectively.

\mu_1 = \frac{\mu_1}{n_X} \qquad \mu_2 = \frac{\mu_2}{n_Y} \qquad (2.8)

Using the new means µ1 and µ2, the new variances for both Gaussians are calculated:

\sigma_1^2 = \sum_{i=1}^{n} X_i (x_i - \mu_1)^2 \qquad \sigma_2^2 = \sum_{i=1}^{n} Y_i (x_i - \mu_2)^2 \qquad (2.9)

\sigma_1^2 = \frac{\sigma_1^2}{n_X} \qquad \sigma_2^2 = \frac{\sigma_2^2}{n_Y} \qquad (2.10)
Iterations between the E-step and the M-step are continued till the means remain the same. The EM method is iterative: it starts from an initial guess of the set of parameters, i.e., it assumes the parameters. Then, using the feature values of the images, it computes the mean, variance, probability density function and normalized probability values in the E-step. The new mean and new variance are calculated from these values in the M-step. These steps are repeated in such a way that the behaviour produced by the model becomes more and more similar to the behaviour described by the samples [103].
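A minimal sketch of this loop for a 1-D feature vector, following equations (2.1) to (2.10) directly (the initial means passed by the caller and the convergence tolerance below are illustrative assumptions, not values prescribed by the text):

```python
import numpy as np

def em_two_gaussians(x, mu1, mu2, tol=1e-6, max_iter=500):
    """EM for two 1-D Gaussians, following equations (2.1)-(2.10)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    var1 = np.sum((x - mu1) ** 2) / n          # eq. (2.2)
    var2 = np.sum((x - mu2) ** 2) / n
    for _ in range(max_iter):
        # E-step: densities (2.3) and normalized probabilities (2.4)-(2.5)
        p1 = np.exp(-0.5 * (x - mu1) ** 2 / var1) / np.sqrt(2 * np.pi * var1)
        p2 = np.exp(-0.5 * (x - mu2) ** 2 / var2) / np.sqrt(2 * np.pi * var2)
        X, Y = p1 / (p1 + p2), p2 / (p1 + p2)
        # M-step: new means (2.6)-(2.8) and new variances (2.9)-(2.10)
        nX, nY = X.sum(), Y.sum()
        new_mu1, new_mu2 = np.sum(X * x) / nX, np.sum(Y * x) / nY
        var1 = np.sum(X * (x - new_mu1) ** 2) / nX
        var2 = np.sum(Y * (x - new_mu2) ** 2) / nY
        # iterate until the means remain (almost) the same
        converged = abs(new_mu1 - mu1) < tol and abs(new_mu2 - mu2) < tol
        mu1, mu2 = new_mu1, new_mu2
        if converged:
            break
    return (mu1, var1), (mu2, var2)
```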

2.11 Feature Extraction Techniques


In this work, gray level co-occurrence matrix (GLCM) and Zernike moment (ZM) features are used for the classification of medical X-ray images, and discrete wavelet transform (DWT) features are used for view classification and for detecting abnormalities in medical X-ray images.

2.11.1 Gray Level Co-Occurrence Matrix (GLCM)

In statistical texture analysis, texture features are computed from the statistical dis-
tribution of observed combinations of intensities at specified positions relative to each
other in the image. According to the number of intensity points (pixels) in each combi-
nation, statistics are classified into first-order, second-order and higher-order statistics.
The gray level co-occurrence matrix (GLCM) is a way of extracting second order statistical texture features. GLCM is an estimate of the joint probability density function of gray level pairs in an image, and can be expressed as

P_{d,\theta}(i, j), \qquad i, j = 0, 1, \ldots, N - 1 \qquad (2.11)

where i and j denote the gray levels of two pixels separated by a distance d at an angle θ, and N is the number of gray levels in the image. For obtaining fine texture details, d = 4 and d = 8 were tried; d = 4 provided better results, and the direction θ = 0 is used because there is no significant dependence of the discriminatory power of the texture features on the direction of the pixel pairs. Out of the 18 GLCM features, the most relevant 13 features are used in this work; they are given in Table 2.1.

Table 2.1: Summary of GLCM features

Feature | Expression
Autocorrelation | C(p) = \frac{m}{m-p} \cdot \sum_{i=1}^{m-p} f(i) f(i+p) \big/ \sum_{i=1}^{m} f^2(i)
Contrast | C = \sum_{n=0}^{N-1} n^2 \sum_{i=1}^{N} \sum_{j=1}^{N} p(i, j), \ |i-j| = n
Cluster shade | S = \sum_{i=0}^{g-1} \sum_{j=0}^{g-1} (i + j - \mu_x - \mu_y)^3 \, P(i, j)
Dissimilarity | D = \sum_{i,j} |i - j| \, P(i, j)
Sum of squares variance | S = \sum_i \sum_j (i - \mu)^2 \, P(i, j)
Sum variance | S = \sum_{i=2}^{2N} (i - SE)^2 \, P_{x+y}(i)
Correlation | Cor = \sum_{i,j} \frac{(i - \mu_i)(j - \mu_j) \, p(i, j)}{\sigma_i \sigma_j}
Entropy | Entro = -\sum_{i=1}^{N_g} \sum_{j=1}^{N_g} P_d(i, j) \log P_d(i, j)
Angular second moment | ASM = \sum_{i=0}^{L-1} \sum_{j=1}^{L} [p(i, j, d, \theta)]^2
Sum entropy | SE = -\sum_{i=2}^{2N_g} P_{x+y}(i) \log P_{x+y}(i)
Energy | E = \sum_{i,j} P(i, j)^2
Inverse difference moment | IDM = \sum_{i=0}^{n_g} \sum_{j=0}^{n_g} \frac{1}{1 + (i-j)^2} \, p(i, j)
Sum average | SA = \sum_{i=2}^{2n_g} i \, P_{x+y}(i)
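As a practical sketch, features of this kind can be computed with scikit-image (the graycomatrix/graycoprops names are the current scikit-image spellings; older releases use greycomatrix/greycoprops). The distance d = 4 and direction θ = 0 follow the choices above; entropy, which graycoprops does not supply, is computed directly from the normalized matrix:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(image_u8):
    """A subset of the Table 2.1 features at d = 4 and theta = 0."""
    glcm = graycomatrix(image_u8, distances=[4], angles=[0],
                        levels=256, symmetric=True, normed=True)
    feats = {prop: graycoprops(glcm, prop)[0, 0]
             for prop in ('contrast', 'dissimilarity', 'homogeneity',
                          'energy', 'correlation', 'ASM')}
    p = glcm[:, :, 0, 0]                  # normalized co-occurrence matrix
    feats['entropy'] = -np.sum(p[p > 0] * np.log(p[p > 0]))
    return feats
```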

2.11.2 Zernike Moment (ZM)

The computation of Zernike moments from an input image consists of three steps: (i) computation of the radial polynomials, (ii) computation of the Zernike basis functions and (iii) computation of the Zernike moments by projecting the image onto the Zernike basis functions.
The real-valued 1-D radial polynomial R_{n,m}(\rho) is defined as

R_{n,m}(\rho) = \sum_{s=0}^{(n-|m|)/2} (-1)^s \, \frac{(n-s)!}{s! \left(\frac{n+|m|}{2} - s\right)! \left(\frac{n-|m|}{2} - s\right)!} \, \rho^{n-2s} \qquad (2.12)

where n is a non-negative integer and m is a non-zero integer subject to the constraints that n − |m| is even and |m| ≤ n; θ represents the azimuthal angle, and ρ is the length of the vector from the origin to (x, y).
Using the radial polynomial R_{n,m}, the complex-valued 2-D Zernike basis functions, which are defined within a unit circle, are formed as

V_{n,m}(\rho, \theta) = R_{n,m}(\rho) \, e^{jm\theta}, \qquad |\rho| \le 1 \qquad (2.13)

The complex Zernike polynomials satisfy the orthogonality condition

\int_0^{2\pi} \int_0^1 V_{n,m}^*(\rho, \theta) \, V_{p,q}(\rho, \theta) \, \rho \, d\rho \, d\theta = \begin{cases} \pi/(n+1), & n = p \text{ and } m = q \\ 0, & \text{otherwise} \end{cases} \qquad (2.14)

where * denotes the complex conjugate. The orthogonality implies no redundancy or


overlap of information between the moments with different orders and repetitions. This
property enables the contribution of each moment to be unique and independent from
the information in an image. Complex Zernike moments of order n with repetition m
are defined as
2π 1
n+1
Z Z

Zn,m = f (ρ, θ)Vn,m (ρ, θ)ρdρdθ (2.15)
π 0 0

For digital images, the integrals can be replaced by summations, and f(c, r) is
the image function where, c and r denote the column and row number of the image
respectively. In addition, the coordinates of the image must be normalized into [0,1]
by a mapping transform.
It is to be noted that in this case, the pixels located on the outside of the circle
are not involved in the computation of the Zernike moments. Eventually, the discrete
form of the Zernike moments for an image with the size N x N is expressed as follows:
N −1 N −1 N −1 N −1
n+1 XX ∗ n+1 XX
Zn,m = f (c, r)Vn,m(c, r) = f (c, r)Rn,m (ρcr )e−jmθcr (2.16)
λN c=0 r=0 λN c=0 r=0
where 0 ≤ ρcr ≤ 1, and λN is a normalization factor. In the discrete implementation
of Zernike moments, the normalization factor must be the number of pixels located
in the unit circle by the mapping transform, which corresponds to the area of a unit
circle, π, in the continuous domain. Table 2.2 shows the list of Zernike polynomials.

Table 2.2: List of Zernike polynomials up to 4th order.

Index | N | M | Zernike polynomial
0 | 0 | 0 | 1
1 | 1 | −1 | 2ρ sin θ
2 | 1 | 1 | 2ρ cos θ
3 | 2 | −2 | √6 ρ² sin 2θ
4 | 2 | 0 | √3 (2ρ² − 1)
5 | 2 | 2 | √6 ρ² cos 2θ
6 | 3 | −3 | √8 ρ³ sin 3θ
7 | 3 | −1 | √8 (3ρ³ − 2ρ) sin θ
8 | 3 | 1 | √8 (3ρ³ − 2ρ) cos θ
9 | 3 | 3 | √8 ρ³ cos 3θ
10 | 4 | −4 | √10 ρ⁴ sin 4θ
11 | 4 | −2 | √10 (4ρ⁴ − 3ρ²) sin 2θ
12 | 4 | 0 | √5 (6ρ⁴ − 6ρ² + 1)
13 | 4 | 2 | √10 (4ρ⁴ − 3ρ²) cos 2θ
14 | 4 | 4 | √10 ρ⁴ cos 4θ
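In practice these moments need not be coded from scratch; as a sketch, the mahotas library offers a ready implementation (the degree and the radius choice below are illustrative assumptions). The returned values are the magnitudes |Z_{n,m}|, which are the rotation-invariant quantities used as shape features:

```python
import numpy as np
import mahotas

def zernike_shape_features(roi, degree=4):
    """Zernike moment magnitudes of a segmented ROI up to the given order."""
    roi = np.asarray(roi, dtype=float)
    radius = min(roi.shape) // 2     # maps the unit circle onto the image
    return mahotas.features.zernike_moments(roi, radius, degree=degree)
```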
2.11.3 Discrete Wavelet Transform (DWT)

When analyzing signals of a non-stationary nature, it is often beneficial to be able to


acquire a correlation between the time and frequency domains of a signal. The Fourier
transform provides information about the frequency domain, however time localized
information is essentially lost in the process. In contrast to the Fourier transform, the
wavelet transform allows exceptional localization in both the time domain via transla-
tion/shifting of the mother wavelet, and in the frequency domain via dilation/scaling.
The translation and dilation operations applied to the mother wavelet are performed to
calculate the wavelet coefficients, which represent the correlation between the wavelet
and a localized section of the signal. The wavelet coefficients are calculated for each
wavelet segment, giving a time-scale function relating the wavelets correlation to the
signal. This process of translation and dilation of the mother wavelet is depicted below.

Fig. 2.1: DWT breakdown of the signal.
The effect of this shifting and scaling process is to produce a time-scale representa-
tion as depicted in Fig. 2.1. The wavelet transform offers superior temporal resolution
of the high frequency components and scale (frequency) resolution of the low frequency
components.
A time-scale representation of a digital signal is obtained using digital filtering
techniques. Filters of different cut-off frequencies are used to analyze the signal at
different scales. The signal is passed through a series of high pass filters to analyze
the high frequencies, and it is passed through a series of low pass filters to analyze
the low frequencies. The resolution of the signal, which is a measure of the amount
of detailed information in the signal, is changed by the filtering operations, and the
scale is changed by up-sampling and down-sampling (sub-sampling) operations. Sub-
sampling a signal corresponds to reducing the sampling rate, or removing some of the
samples of the signal. For example, sub-sampling by two refers to dropping every other
sample of the signal. Sub-sampling by a factor n reduces the number of samples in
the signal n times. Up-sampling a signal corresponds to increasing the sampling rate
of a signal by adding new samples to the signal. For example, up-sampling by two
refers to adding a new sample, usually a zero or an interpolated value, between every
two samples of the signal. Up-sampling a signal by a factor of n increases the number
of samples in the signal by a factor of n .
The procedure starts with passing the signal (sequence) through a half-band dig-
ital lowpass filter with impulse response h[n]. Filtering a signal corresponds to the
mathematical operation of convolution of the signal with the impulse response of the
filter. The convolution operation in discrete time is defined as follows:

x[n] * h[n] = \sum_{k=-\infty}^{\infty} x[k] \cdot h[n - k] \qquad (2.17)

A half-band lowpass filter removes all frequencies that are above half of the highest frequency in the signal. For example, if a signal has a maximum frequency component of 1000 Hz, then half-band lowpass filtering removes all frequencies above 500 Hz.
After passing the signal through a half-band lowpass filter, half of the samples can
be eliminated. Lowpass filtering removes the high frequency information, but leaves
the scale unchanged. Only the sub-sampling process changes the scale. Resolution, on
the other hand, is related to the amount of information in the signal, and therefore, it
is affected by the filtering operations. Half band lowpass filtering removes half of the
frequencies, which can be interpreted as losing half of the information. Therefore, the
resolution is halved after the filtering operation. However, the sub-sampling operation
after filtering does not affect the resolution, since removing half of the spectral compo-
nents from the signal makes half the number of samples redundant anyway. Half the
samples can be discarded without any loss of information. In summary, the lowpass
filtering halves the resolution, but leaves the scale unchanged. The signal is then sub-sampled by two, since half the samples are redundant; this doubles the scale. This procedure can mathematically be expressed as

y[n] = \sum_{k=-\infty}^{\infty} h[k] \cdot x[2n - k] \qquad (2.18)

The DWT analyzes the signal at different frequency bands with different resolutions by
decomposing the signal into a coarse approximation and detailed information. DWT
employs two sets of functions, called scaling functions and wavelet functions, which
are associated with low pass and highpass filters respectively. The decomposition of
the signal into different frequency bands is simply obtained by successive highpass and
lowpass filtering of the time domain signal. The original signal x[n] is first passed
through a half-band highpass filter g[n] and a lowpass filter h[n] . This constitutes one
level of decomposition and can mathematically be expressed as

y_{high}[k] = \sum_{n} x[n] \cdot g[2k - n] \qquad (2.19)

y_{low}[k] = \sum_{n} x[n] \cdot h[2k - n] \qquad (2.20)

where yhigh [k] and ylow [k] are the outputs of the highpass and lowpass filters, respec-
tively, after sub-sampling by two. This decomposition halves the time resolution since
only half the number of samples now characterize the entire signal. However, this
operation doubles the frequency resolution, since the frequency band of the signal now
spans only half the previous frequency band, effectively reducing the uncertainty in
the frequency by half. The above procedure, which is also known as the sub-band
coding, can be repeated for further decomposition. At every level, the filtering and
sub-sampling will result in half the number of samples (and hence half the time resolu-
tion) and half the frequency band spanned (and hence double the frequency resolution).
The dilation function of the discrete wavelet transform can be represented as a tree of lowpass and highpass filters, with each step transforming the lowpass output, as shown in Fig. 2.2.

Fig. 2.2: Filterbank representation of DWT dilations.

The original signal is successively decomposed into components of lower resolution, while the high frequency components are not analyzed further. The maximum number of dilations that can be performed depends on the input size of the data to be analyzed, with 2^N data samples enabling the breakdown of the signal into N discrete levels using the discrete wavelet transform.
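A sketch of this sub-band coding with the PyWavelets package (the Haar/db1 wavelet and the three decomposition levels are illustrative choices): each level applies the h[n], g[n] filter pair of equations (2.19) and (2.20) with sub-sampling by two, then repeats on the lowpass branch.

```python
import numpy as np
import pywt

x = np.random.randn(1024)                 # stand-in for a 1-D signal

# one decomposition level: approximation (lowpass) and detail (highpass)
cA, cD = pywt.dwt(x, 'db1')               # eqs. (2.19)-(2.20) plus sub-sampling

# repeated decomposition of the lowpass branch (the filter tree of Fig. 2.2)
coeffs = pywt.wavedec(x, 'db1', level=3)  # [cA3, cD3, cD2, cD1]

# the same idea extends to images: one level yields four sub-bands
image = np.random.rand(512, 512)
LL, (LH, HL, HH) = pywt.dwt2(image, 'db1')
```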
2.12 Modeling Techniques used in the Proposed
Work
The following modeling techniques have been used in this work for classification, view classification and abnormality detection of medical X-ray images.

2.12.1 Back Propagation Neural Network (BPNN)

Neural networks are adaptive networks which are composed of simple elements oper-
ating in parallel. These elements are inspired by the biological nervous systems. As
in nature, the network function is determined largely by the connections between ele-
ments. Commonly, neural networks are adjusted or trained so that a particular input
leads to specific target output. Neural networks have been trained to perform com-
plex functions in various application areas including pattern recognition, identification,
classification, speech, vision and control systems. BPNN uses a set of interconnected neural elements that process the information in a layered manner.

Fig. 2.3: A computational neural model.

A computational neural element, called a perceptron, provides an output as a thresholded weighted sum of all inputs. The basic function of the perceptron is shown in Fig. 2.3 and is analogous to the synaptic activities of a biological neuron. In a layered network structure, a neural element may receive its input from an input vector or from other neural elements.
A weighted sum of these inputs constitutes the argument of a non-linear activation
function such as a sigmoidal function. The resulting threshold value of the activation
function is the output of the neural element. The output is distributed along weighted
connections to other neural elements. The computational output of a neural element
can be expressed as
y = F\left(\sum_{i=1}^{n} w_i x_i + w_{n+1}\right) \qquad (2.21)

where F is a non-linear activation function that is used to threshold the weighted sum of inputs x_i, and w_i is the respective weight. Assume a multi-layer feed forward neural network with L layers of N perceptrons in each layer, as shown in Fig. 2.4, such that

y^{(k)} = F\left(w^{(k)} y^{(k-1)}\right), \qquad k = 1, 2, \ldots, L \qquad (2.22)

Fig. 2.4: A feed forward back propagation neural network.
where y^{(k)} is the output of the k-th layer of neural elements, with k = 0 representing the input layer, and w^{(k)} is the weight matrix of the k-th layer. The neural network is trained by presenting classified examples of input and output patterns. Each example consists of the input and output vectors {y^{(0)}, y^{(L)}} or {x, y^{(L)}} that are encoded for
the desired class. The objective of the training is to determine a weight matrix that
would provide the desired output, respectively for each input vector in the training
set. Training the feed forward network consists of the following steps:

(1) Assign random weights in the range [−1, +1] to all weights w_{ij}^{(k)}.

(2) For each classified pattern pair {y^{(0)}, y^{(L)}} in the training set, the following steps are carried out:

• Compute the output values of each neural element using the current weight
matrix.

• Find the error e(k) between the computed output vector and the desired
output vector for the classified pattern pair.

• Adjust the weight matrix using the change ∆W^{(k)} computed as in (2.23):

\Delta W^{(k)} = \alpha e^{(k)} [y^{(k-1)}], \qquad \text{for all layers } k = 1, \ldots, L \qquad (2.23)

where α is the learning rate that can be set between 0 and 1.

(3) Repeat step 2 for all classified pattern pairs in the training set until the error
vector for each training example is sufficiently low or zero.

The non-linear activation function is an important consideration in computing the


error vector for each classified pattern pair in the training set. A sigmoidal activation
function can be used and is given in (2.24).
F(y) = \frac{1}{1 + e^{-y}} \qquad (2.24)
The algorithm described above, also called the back propagation neural network, is sensitive to the selection of initial weights and to noise in the training set, which can cause the algorithm to get stuck in local minima. This causes poor generalization performance when the network is used to classify new patterns. Another problem with BPNN is finding the optimal network architecture, i.e., the optimal number of hidden layers and of perceptrons in each hidden layer [103].
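A minimal NumPy sketch of the training loop above for a single hidden layer (the 13-8-6 layer sizes and the learning rate are illustrative assumptions; the update follows the form of (2.23) with the sigmoid of (2.24)):

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))          # activation of eq. (2.24)

rng = np.random.default_rng(0)
# step (1): random weights in [-1, +1]; the extra column holds the bias w_{n+1}
W1 = rng.uniform(-1, 1, size=(8, 13 + 1))
W2 = rng.uniform(-1, 1, size=(6, 8 + 1))

def train_step(x, t, W1, W2, alpha=0.1):
    """Step (2): forward pass, error computation and weight adjustment."""
    x1 = np.append(x, 1.0)                   # input vector with bias term
    h = sigmoid(W1 @ x1)                     # hidden layer, eq. (2.21)/(2.22)
    h1 = np.append(h, 1.0)
    y = sigmoid(W2 @ h1)                     # output layer
    # errors back-propagated through the sigmoid derivative
    e2 = (t - y) * y * (1 - y)
    e1 = (W2[:, :-1].T @ e2) * h * (1 - h)
    W2 += alpha * np.outer(e2, h1)           # Delta W = alpha e [y^(k-1)]
    W1 += alpha * np.outer(e1, x1)
    return np.sum((t - y) ** 2)              # squared error for this pair
```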

2.12.2 Probabilistic Neural Network (PNN)

In PNN, the operations are organized into a multilayered feed forward network with
four layers:

1. Input layer

2. Pattern layer

3. Summation layer

4. Output layer

Fig. 2.5: Architecture of PNN.

The architecture of PNN is shown in Fig. 2.5. Feature vectors are represented in the
input layer. The training set for the PNN, represented in the hidden layer, is fully interconnected with the input layer. Finally, an output layer represents each of the possible classes into which the input data can be classified. However, the hidden layer
is not fully interconnected to the output layer. The example nodes for a given class
connect only to that class’s output node and none other.
In this work, the feature vectors are taken as input, and their mean and variance are computed to calculate the weight W; this weight is multiplied with the input vector F in the hidden layer.

W = \exp\left[\frac{x - x_{11}}{\sigma}\right] \qquad (2.25)

H = W_i F \qquad (2.26)

In the summation layer C_j, the products of the vectors from the hidden layer are summed and the average value is computed:

C_j = \frac{\sum_{i=1}^{n} e^{(h_i - 1)/\gamma^2}}{n} \qquad (2.27)

The maximum of the average value is compared with the values of the test image in the output layer. If the estimated value is similar to the test value in the output layer, the image is correctly classified; otherwise it is mis-classified [103].
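A sketch of this flow in NumPy; the Gaussian pattern-layer kernel below is the usual PNN choice and stands in for equations (2.25) to (2.27), and the smoothing parameter sigma is an illustrative assumption:

```python
import numpy as np

def pnn_classify(test_vec, train_vecs, train_labels, sigma=0.5):
    """PNN: pattern layer, per-class summation layer, argmax output layer."""
    test_vec = np.asarray(test_vec, dtype=float)
    train_vecs = np.asarray(train_vecs, dtype=float)
    train_labels = np.asarray(train_labels)
    scores = {}
    for cls in np.unique(train_labels):
        # pattern layer: one Gaussian node per training example of the class
        examples = train_vecs[train_labels == cls]
        d2 = np.sum((examples - test_vec) ** 2, axis=1)
        activations = np.exp(-d2 / (2.0 * sigma ** 2))
        # summation layer: average the pattern-layer activations (cf. 2.27)
        scores[cls] = activations.mean()
    # output layer: the class with the maximum average activation
    return max(scores, key=scores.get)
```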

2.12.3 Support Vector Machine (SVM)

SVM is a powerful machine learning technique for classification and regression. It was
developed by Vapnik, and is based on statistical learning theory. This gave rise to
a new class of theoretically elegant learning machines that use a central concept of
support vectors and kernels for a number of learning tasks. Kernel machines provide a
modular framework that can be adapted to different tasks and domains by the choice
of the kernel function and the base algorithm. They are replacing neural networks in
a variety of fields.
In literature, a predictor variable is called an attribute, and a transformed attribute
that is used to define the hyperplane is called a feature. The task of choosing the most
suitable representation is known as feature selection. A set of features that describe
one case (i.e., a row of predictor values) is called a vector.
So the goal of SVM modeling is to find the optimal hyperplane that separates clusters of vectors in such a way that cases with one category of the target variable are on one side of the plane and cases with the other category are on the other side. The vectors nearest the hyperplane are the support vectors. Fig. 2.6 shows the maximum margin hyperplane and the support vectors [103].
Fig. 2.6: Maximum margin hyperplane and support vectors.
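A sketch of an SVM classifier with the RBF kernel adopted in this work, via scikit-learn (the feature matrix, labels and the C and gamma hyperparameters below are placeholders and illustrative assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# X: one row of texture/shape features per X-ray image; y: class labels
X = np.random.rand(120, 13)             # placeholder feature vectors
y = np.random.randint(0, 6, size=120)   # placeholder labels for six classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel='rbf', C=10.0, gamma='scale').fit(X_tr, y_tr)
print('test accuracy:', clf.score(X_te, y_te))
```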

2.12.4 Speeded up Robust Features(SURF)

Among the fast tracking algorithms, SURF is mainly useful for quickly finding correspondences between similar images. It is effective in collecting class-specific information and is robust to viewpoint changes. SURF has become one of the most popular feature detectors and descriptors in the computer vision field. It is able to generate scale-invariant and rotation-invariant interest points with descriptors, and evaluations show its superior performance in terms of repeatability, distinctiveness and robustness. SURF is selected as the interest point detector and descriptor for the following reasons: (1) an X-ray image could be taken under conditions of (i) view variation, (ii) size variation and (iii) shape variation, and interest points with descriptors generated by SURF are invariant to these variations and to location changes; (2) the computational cost of SURF is small, which enables fast interest point localization and matching.

The SURF detector is based on the Hessian matrix because of its good performance in computational cost and accuracy. For a point (x, y) in an image I, the Hessian matrix H(X, σ) at scale σ is defined as

H(X, \sigma) = \begin{bmatrix} L_{xx}(x, y, \sigma) & L_{xy}(x, y, \sigma) \\ L_{xy}(x, y, \sigma) & L_{yy}(x, y, \sigma) \end{bmatrix} \qquad (2.28)

Modern feature extractors select prominent features by first searching for pixels that
demonstrate rapid changes in intensity values in both the horizontal and vertical di-
rections.
Such pixels yield high Harris corner detection scores and are referred to as keypoints. Keypoints are searched over a subspace of (x, y, σ) ∈ R³, where σ represents the Gaussian scale at which the keypoint exists. In SURF, a descriptor vector of length 64 is constructed using a histogram of gradient orientations in the local neighborhood around each keypoint.
SURF extracts the salient features and descriptors from the images. This extractor is preferred over SIFT due to its concise descriptor length: whereas the standard SIFT implementation uses a descriptor consisting of 128 floating point values, SURF condenses the descriptor to 64 floating point values.
The template consists of a sample image of each category to be classified from
which the proposed system extracts knowledge. SURF first detects the interest points
and generates the corresponding descriptors.
The pre-computed SURF descriptors of the template images in each category are then matched against the extracted descriptors of the input image; when the input image matches the template images of a category, it is said to belong to that category.
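A sketch of this template matching with OpenCV (SURF is patented and ships only in the opencv-contrib build, where it may be disabled; the Hessian threshold and the ratio-test value are illustrative assumptions):

```python
import cv2

def surf_match_count(input_img, template_img, ratio=0.75):
    """Count good SURF descriptor matches between an input and a template."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    kp1, des1 = surf.detectAndCompute(input_img, None)    # 64-D descriptors
    kp2, des2 = surf.detectAndCompute(template_img, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    # Lowe's ratio test keeps only distinctive correspondences
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return len(good)

# the input image is assigned the category whose template yields most matches
```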

2.13 Summary
In this chapter, literature related to automated annotation of medical images was
presented. The different image processing techniques employed in the literature for
pre-processing and segmentation of the region of interest were discussed. Review of the
different types of features used in orientation detection and abnormalities detection
and a brief review of the existing methods for classification were also presented.
Chapter 3

Classification of X-ray images using


texture and shape features

3.1 Introduction
In practice, quite a large amount of medical images are captured and stored in digital
format, which are classified based on their anatomical parts. New trends for image
retrieval using automatic image classification and annotation have been investigated
in the past few years. It is believed that the archival and management of huge volume
of medical images can be improved by successful classification. Usually, annotation is performed manually by physicians and medical experts, which is time-consuming and requires domain-specific knowledge. Therefore, automatic medical image annotation is becoming increasingly important for more effective image classification.

In this work, an attempt has been made to classify X-ray images which will act as a
preliminary step for image annotation. The X-ray images are classified into six different
classes namely foot, skull, palm, neck, chest and spine. The framework proposed in
this work involves the following steps: pre-processing, segmentation, feature extraction
and classification. X-ray images are pre-processed and segmented to suppress the
unwanted distortions and to enhance some image features which are important for
further processing. Then, the most relevant features are extracted, which are used
for the classification of X-ray images with three different classifiers and finally the
performances of the classifiers are measured and compared. The classification results
are useful for the development of the automated X-ray annotation system.

3.2 Proposed Methodology for Classification of X-
ray Images
In this work, medical X-ray images are taken as input. Initially the X-ray images
are pre-processed using M3 filter to reduce noise and to enhance the quality of the
X-ray images. Then, the filtered images are segmented using connected component
labeling (CCL), from which texture and shape features are extracted from the region
of interest (ROI). The extracted features are then utilized for classifying the X-ray
images. The classifiers used in this research are back propagation neural network
(BPNN), probabilistic neural network (PNN) and support vector machine (SVM).
The various processes involved in the proposed system are shown in Fig. 3.1.

Fig. 3.1: Block diagram of the proposed methodology for classification of X-ray images

3.3 Pre-processing
Pre-processing helps to improve the quality of visualization and interpretation of med-
ical X-ray images. Hence, it is desirable to pre-process the image such that the noise
is reduced and the visibility is improved. In order to carry out this task without de-
stroying the details of the X-ray images, an M3 filter is applied in this work. The M3
filter is quite popular because, for certain types of random noise, it has excellent noise-
reduction capabilities, with considerably less blurring than linear smoothing filters of
similar size. This filter is very effective in removing noise as well as in pre-
serving the sharpness of edges and retaining the important information of the X-ray
images.

Fig. 3.2: M3 filter

In this work, X-ray image of size 512 × 512 is taken. A 3 × 3 M3 filter which is ba-
sically a combination of mean and median filter removes the noise by replacing the
center pixel with the maximum of the median and mean of the 3 × 3 neighborhood.
The idea of M3 filter is shown in Fig. 3.2. The original X-ray images are shown in
Fig. 3.3 and the pre-processed images are shown in Fig. 3.4.
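A compact sketch of the M3 filter in Python (using scipy's generic_filter; the function and parameter names are illustrative, not the implementation used in this work) could be:

    import numpy as np
    from scipy.ndimage import generic_filter

    def m3_filter(image, size=3):
        # replace each pixel with the maximum of the mean and the median
        # of its size x size neighborhood
        def m3(window):
            return max(np.mean(window), np.median(window))
        return generic_filter(image.astype(float), m3, size=size)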

Fig. 3.3: Original X-ray images (a) Chest (b) Spine (c) Palm, (d) Foot (e) Neck and
(f) Skull

Fig. 3.4: Pre-processed X-ray images using M3 filter (a) Chest (b) Spine (c) Palm (d)
Foot (e) Neck and (f) Skull

3.4 Segmentation
Segmentation is a process carried out to find the region of interest (ROI). Connected
component labeling (CCL) is applied for this purpose and the largest labeled compo-
nent is the required ROI. CCL scans an image and groups its pixels into components
based on pixel connectivity, i.e., all pixels in a connected component share similar pixel
intensity values and are in some way or the other connected with each other. Once all
groups have been determined, all the pixels belonging to a component are labeled with
a grey-level or a color (color labeling). The results of the segmented images are shown
in Fig. 3.5.

Fig. 3.5: Segmented X-ray images for six different classes of X-rays.
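A minimal sketch of this ROI extraction, keeping the largest connected component of a binarized image (scipy-based; the function name is illustrative), might be:

    import numpy as np
    from scipy import ndimage

    def largest_component_roi(binary_img):
        # label connected components and keep the largest one as the ROI
        labels, n = ndimage.label(binary_img)
        if n == 0:
            return binary_img
        sizes = ndimage.sum(binary_img, labels, range(1, n + 1))
        return labels == (np.argmax(sizes) + 1)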

3.5 Feature Extraction


Feature extraction is an important part of supervised classification. A specific com-
bination of features is required in any image processing and analysis applications. In
the literature, transform based features, region based features, shape based features
and texture based features are discussed. However, it is noticed from the literature
that texture features represent the gray level intensity measures much better than the
other features, and that shape features help in identifying the X-ray class. Hence, in
this work, shape and texture features are used for the classification of X-ray images.
Though many feature extraction methods are available in the literature to extract
the shape and texture information, GLCM features and Zernike moment features are
widely used and they play a vital role for analyzing various classes of medical images.

GLCM Features

In statistical texture analysis, texture features are computed from the statistical dis-
tribution of observed combinations of intensities at specified positions relative to each
other in the image. According to the number of intensity points (pixels) in each combi-
nation, statistics are classified into first-order, second-order and higher-order statistics.
The gray level co-occurrence matrix (GLCM) is a way of extracting second order sta-
tistical texture features. GLCM is an estimate of a joint probability density function
of gray level pairs in an image. GLCM could be expressed according to the following
expression

P_{d,θ}(i, j),   i, j = 0, 1, ..., N − 1    (3.1)

where i and j denote the gray levels of two pixels separated by a distance d at an
angle θ, and N is the number of gray levels in the image. For obtaining fine
texture details, d = 4 and d = 8 were tried; d = 4 provided better results, and the direction
θ = 0 is used because there is no significant dependence of the discriminatory power of
the texture features on the direction of the pixel pairs. Out of the 18 GLCM features,
the most relevant 13 features are used in this work. The detailed information about
the 13 features is given in Chapter 2.
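As an illustration (not the implementation used in this work), a few of these second-order features can be computed with scikit-image; the properties below are a representative subset of the 13 features:

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def glcm_features(img, d=4, theta=0, levels=256):
        # img: 2-D uint8 grayscale image; GLCM at distance d and angle theta
        glcm = graycomatrix(img, distances=[d], angles=[theta],
                            levels=levels, symmetric=True, normed=True)
        props = ['contrast', 'correlation', 'energy',
                 'homogeneity', 'dissimilarity']
        return np.array([graycoprops(glcm, p)[0, 0] for p in props])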

Zernike Moments

Zernike moment has been suggested to be a good descriptor for shape. Zernike mo-
ments are computed by projecting the image onto a set of complex Zernike polynomials
which satisfy the orthogonal property. This helps to ascertain that the images have no
overlapping of information between the moments. Knowledge of the precise boundary
of an object is not required to compute the Zernike moment features and a detailed
description of the same is given in Chapter 2. In this work, Zernike moment shape
features are extracted using the 15 Zernike polynomials up to the 4th order.
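A brief sketch using the mahotas library, which returns the magnitudes of the Zernike moments of a binary region up to a given order (the radius choice is an illustrative assumption), could be:

    import mahotas.features as mf

    def zernike_features(binary_roi, degree=4):
        # unit-disk radius: half the smaller ROI dimension
        radius = min(binary_roi.shape) // 2
        return mf.zernike_moments(binary_roi, radius, degree=degree)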

3.6 Classification
This stage makes the final decision regarding the classification of X-ray images. In this
work, supervised learning method is used for the classification of the X-ray images.
Support vector machine (SVM), back propagation neural network (BPNN) and proba-
bilistic neural network (PNN) are used to classify the medical X-ray images into the six
classes, namely foot, skull, palm, neck, chest and spine. The features thus extracted are used
for classification using SVM, BPNN and PNN classifiers. An elaborate description of
SVM, BPNN and PNN has been given in Chapter 2.

3.7 Performance Measures


The performance of the classifiers is evaluated by several metrics such as sensitivity,
specificity and accuracy. The quality of classification is measured from a confusion
matrix which records correctly and incorrectly recognized examples of each class. The
actual and predicted cases produced by the classification system can be provided by
a confusion matrix. tp is the number of true positives, fp is the number of false
positives, tn is the number of true negatives and fn is the number of false negatives.
Accuracy measures the quality of the classification by finding the true/false positives
and true/false negatives. Whereas sensitivity deals only with positive cases, specificity
deals only with negative cases.
The performance measures, sensitivity, specificity, and accuracy are calculated as
follows:

Accuracy = \frac{tp + tn}{tp + tn + fp + fn}    (3.2)

Sensitivity = \frac{tp}{tp + fn}    (3.3)

Specificity = \frac{tn}{tn + fp}    (3.4)
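As a worked check of these formulas, a small helper (the input values below are taken from the spine row of Table 3.1) gives back the corresponding row of Table 3.2:

    def classifier_metrics(tp, tn, fp, fn):
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        return accuracy, sensitivity, specificity

    # spine row of Table 3.1: tp=10, tn=76, fp=2, fn=2
    # -> accuracy 0.9555, sensitivity 0.8333, specificity 0.9743,
    #    matching the spine row of Table 3.2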

3.8 Experimental Results


In this work, the shape, texture and a combination of shape and texture features are
extracted from 180 X-ray images, under six different classes, ensuring that each class
consists of 30 images. For shape features, 15 Zernike moment features are extracted
and for the texture features 13 GLCM features are extracted. Each image
belongs to one of the six classes, namely foot, skull, palm, neck, chest and spine.
All the attributes are first normalized between -1 and +1 so that the classifiers, namely
BPNN, PNN and SVM, have a common range to work with. In this work, true
positive refers to the correct classification of an X-ray image into its class, and true
negative refers to the correct rejection of images of the other classes. A one-versus-rest
procedure has been employed in order to
evaluate the performances of the various classifiers.

Modeling using Support Vector Machine

SVM is trained in multiclass mode so as to distinguish one class from the remaining 5
classes out of the 6 classes, viz., chest, palm, neck, spine, foot and skull. For example,
if the SVM classifier is intended to be trained for classifying the chest images, then
chest feature vectors are assigned +1 and all the remaining five classes are assigned
-1. The same procedure is followed for all the other classes, wherein at each step, the
concerned class will be assigned +1 and the others -1, and each class is trained
independently.
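A compact sketch of this one-versus-rest training with scikit-learn, as a stand-in for the implementation used in this work (synthetic data stands in for the real feature vectors):

    import numpy as np
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.svm import SVC

    X = np.random.rand(180, 28)          # stand-in for 28 GLCM+ZM features
    y = np.repeat(np.arange(6), 30)      # six classes of 30 images each

    X_scaled = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)
    clf = OneVsRestClassifier(SVC(kernel='rbf'))  # one binary SVM per class
    clf.fit(X_scaled, y)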
The confusion matrix of SVM with GLCM, ZM, and the combination of GLCM and
ZM features for classification of X-rays are shown in Table 3.1, Table 3.3 and Table 3.5.
The performance measures are evaluated using three different sets of features, namely
GLCM features, ZM features and a combination of Zernike and GLCM features. The
performance is measured in terms of accuracy, sensitivity and specificity, and the
results are tabulated in Table 3.2, Table 3.4 and Table 3.6. Besides, a graphical
illustration of the performance of SVM using the various features is shown in Fig. 3.6
to Fig. 3.8.

Table 3.1: Confusion matrix of SVM with GLCM features for X-ray classification

X-ray Class tp tn fp fn

Chest 8 72 6 4

Skull 10 74 3 3

Foot 9 72 5 4

Palm 11 73 4 2

Spine 10 76 2 2

Neck 10 72 2 6

From Table 3.2 it could be seen that SVM using GLCM features was able to
classify spine X-ray images with the highest accuracy of 95.55% followed by palm and
skull with an accuracy of 93.33% each.

Table 3.2: Performance of SVM with GLCM features for X-ray classification

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 88.88 66.66 92.30

Skull 93.33 76.92 96.10

Foot 90.00 69.23 93.50

Palm 93.33 84.61 94.80

Spine 95.55 83.33 97.43

Neck 91.11 62.50 97.29

Overall Performance 92.03 73.87 95.23

Table 3.3: Confusion matrix of SVM with ZM features for X-ray classification

X-ray Class tp tn fp fn

Chest 11 72 4 3

Skull 10 75 3 2

Foot 9 74 4 3

Palm 8 74 2 6

Spine 10 76 1 3

Neck 11 73 2 4

Table 3.4 shows the performance of SVM in classifying X-ray images using Zernike
moment features. The table indicates that the Zernike moment features were well modeled
by SVM; the overall accuracy is 93.14%, and all classes of
X-ray images were classified with good accuracy.

Table 3.4: Performance of SVM with ZM features for X-ray classification

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 92.22 78.57 94.73

Skull 94.44 83.33 96.15

Foot 92.22 75.00 94.87

Palm 91.11 57.14 97.36

Spine 95.55 76.92 98.70

Neck 93.33 73.33 97.33

Overall Performance 93.14 74.04 96.52

Experiments were conducted to find out how SVM would perform when both
shape and texture features are combined. Table 3.6 shows the performance result of
SVM when modeled using the combined features.

Table 3.5: Confusion matrix of SVM with GLCM and ZM features for X-ray classifi-
cation

X-ray Class tp tn fp fn

Chest 13 75 0 2

Skull 12 77 0 1

Foot 12 76 1 1

Palm 10 77 2 1

Spine 11 77 1 1

Neck 12 74 1 3

Table 3.6: Performance of SVM with GLCM and ZM features for X-ray classification

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 97.77 86.66 100.00

Skull 98.88 92.30 100.00

Foot 97.77 92.30 98.70

Palm 96.66 90.90 97.46

Spine 97.77 91.66 98.71

Neck 95.55 80.00 98.66

Overall Performance 97.40 88.97 98.92

The overall performance of SVM in classifying X-ray images with the three different
sets of features is given in Table 3.7. It is found that the results were quite promising,
with an overall accuracy of 97.40%. The sensitivity and specificity are also quite high.
Thus it can be concluded that the SVM classifier with the combined features of GLCM
and ZM gives better overall performance results than the other two sets of features.

Table 3.7: Overall performance of SVM in classifying X-ray images with different sets
of features

S.no Features Overall Accuracy% Overall Sensitivity% Overall Specificity%

1 GLCM 92.03 73.87 95.23

2 ZM 93.14 74.04 96.52

3 GLCM+ZM 97.40 88.97 98.92

The graphs showing the overall performance measures of the SVM classifier are shown in
Fig. 3.6 to Fig. 3.8.

Fig. 3.6: A comparison of the accuracy of SVM in classifying X-ray images with
different sets of features

Fig. 3.7: A comparison of the sensitivity of SVM in classifying X-ray images with
different sets of features

Fig. 3.8: A comparison of the specificity of SVM in classifying X-ray images with
different sets of features

Modeling using BPNN

The thirteen GLCM features extracted from the 180 X-ray images are fed to the three
layered BPNN classifier. In this work, sigmoid function is used as an activation func-
tion. The performance of the classifier is checked for different network structures and
the best performance is achieved with the network structure having 13 input neurons
and 7 hidden neurons. Table 3.8 shows the confusion matrix of BPNN with GLCM
features for X-ray classification and the performances are tabulated in Table 3.9.
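A sketch of such a network with scikit-learn's MLPClassifier as a stand-in for the BPNN (13 inputs, 7 hidden neurons, sigmoid activation; the solver, iteration count and synthetic data are illustrative choices):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    X = np.random.rand(180, 13)          # stand-in for the 13 GLCM features
    y = np.repeat(np.arange(6), 30)      # six classes of 30 images each

    bpnn = MLPClassifier(hidden_layer_sizes=(7,), activation='logistic',
                         solver='sgd', max_iter=2000, random_state=0)
    bpnn.fit(X, y)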

Table 3.8: Confusion matrix of BPNN with GLCM features for X-ray classification

X-ray Class tp tn fp fn

Chest 9 77 1 3

Skull 8 66 11 5

Foot 7 66 9 8

Palm 10 66 10 4

Spine 8 67 10 5

Neck 11 65 8 6

From Table 3.9, it could be inferred that the BPNN modeled using GLCM features
gave an overall accuracy of 85.18%. It was able to classify chest X-ray images more
accurately when compared to the other X-ray image types. The classification accuracy of
foot X-rays was the poorest, at 81.11%.
The fifteen ZM features extracted from the 180 X-ray images are fed to the three
layered BPNN classifier. Table 3.10 gives the confusion matrix of BPNN with ZM
features for X-ray classification and the performance measures of the BPNN with
ZM features are tabulated in Table 3.11. The BPNN modeled using shape features
was successful in classifying chest X-ray images with a very high accuracy of 97.77%;
however, it gave comparatively poor performance with regard to the other X-ray image
types, with an overall accuracy of 87.95%.
In order to find the effect of combining both the shape and texture features, another
attempt has been made with a three-layered classification network.

Table 3.9: Performance of BPNN with GLCM features for X-ray classification

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 95.55 75.00 98.71

Skull 82.22 61.53 85.71

Foot 81.11 46.66 84.61

Palm 84.44 71.42 86.84

Spine 83.33 61.53 87.01

Neck 84.44 64.70 89.04

Overall Performance 85.18 63.47 88.65

Table 3.10: Confusion matrix of BPNN with ZM features for X-ray classification

X-ray Class tp tn fp fn

Chest 10 78 1 1

Skull 9 69 8 4

Foot 8 69 7 6

Palm 9 67 8 6

Spine 9 68 9 4

Neck 12 67 5 6

Table 3.11: Performance of BPNN with ZM features for X-ray classification

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 97.77 90.90 98.73

Skull 86.66 69.23 89.61

Foot 85.55 57.14 90.78

Palm 84.44 60.00 89.33

Spine 85.55 69.23 94.44

Neck 87.77 66.66 91.78

Overall Performance 87.95 68.86 92.44

The 28 combined GLCM and ZM features serve as the input neurons, along with 15 hidden
neurons. Table 3.12 gives the confusion matrix of BPNN with GLCM and ZM features
for X-ray classification. Table 3.13 gives the performance of BPNN when the combination
of GLCM and ZM features was employed. The overall accuracy is 89.62%, and
not much of an improvement could be seen when the features are combined.

Table 3.12: Confusion matrix of BPNN with GLCM and ZM features for X-ray clas-
sification

X-ray Class tp tn fp fn

Chest 12 76 0 2

Skull 11 69 4 6

Foot 9 69 7 5

Palm 11 68 7 4

Spine 10 69 8 3

Neck 12 68 5 5

Table 3.13: Performance of BPNN with GLCM and ZM features for X-ray classifica-
tion

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 97.77 85.71 100

Skull 88.88 64.70 94.52

Foot 86.66 64.28 90.78

Palm 87.77 73.33 90.66

Spine 87.77 76.92 89.61

Neck 88.88 70.58 93.15

Overall Performance 89.62 72.58 93.12

The overall performance of the BPNN classifier with the three sets of features is given in
Table 3.14. From Table 3.14, it could be concluded that GLCM+ZM with BPNN was
able to model the classes better when compared to GLCM features and ZM features alone,
with an accuracy of 89.62%, a specificity of 93.12% and a sensitivity of 72.58%.

Table 3.14: Overall performance of BPNN in classifying X-ray images with different
set of features

S.no Features Overall Accuracy% Overall Sensitivity% Overall Specificity%

1 GLCM 85.18 63.47 88.65

2 ZM 87.95 68.86 92.44

3 GLCM+ZM 89.62 72.58 93.12

A graphical illustration of the performances of BPNN with the three sets of features
is given in Fig. 3.9 to Fig. 3.11.

Fig. 3.9: A comparison of the accuracy of BPNN in classifying X-ray images with
different sets of features

Fig. 3.10: A comparison of the sensitivity of BPNN in classifying X-ray images with
different sets of features

Fig. 3.11: A comparison of the specificity of BPNN in classifying X-ray images with
different sets of features

Modeling using PNN

Thirteen GLCM features extracted from 180 X-ray images are fed to the PNN classifier.
The performance of the classifier is checked for different network structures and the
best performance is achieved with the network structure having 13 input neurons and 7
hidden neurons. Table 3.15 shows the confusion matrix of PNN with GLCM features
for X-ray classification, and the performance measures are tabulated in Table 3.16,
from which it is noticed that the skull X-ray images were better classified, with an
accuracy of 81.11%, compared to the other X-ray classes.
The PNN was then trained using ZM features; the 15 extracted features were used to
model the PNN. The best performance achieved was with the network structure
having 15 input neurons and 9 hidden neurons. Table 3.17 shows the confusion matrix
of PNN with ZM features for X-ray classification, and the performance measures of the
PNN with ZM features are tabulated in Table 3.18. It is observed that PNN was
able to classify chest and skull X-ray images comparatively better than the other X-ray
types.

Table 3.15: Confusion matrix of PNN with GLCM features for X-ray classification

X-ray Class tp tn fp fn

Chest 5 62 14 9

Skull 8 65 11 6

Foot 6 61 15 8

Palm 5 60 14 11

Spine 7 61 13 9

Neck 6 63 13 8

Table 3.16: Performance of PNN with GLCM features for X-ray classification

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 74.44 35.71 81.57

Skull 81.11 57.14 85.52

Foot 74.44 42.85 80.26

Palm 72.22 31.25 81.04

Spine 75.55 43.75 82.43

Neck 76.66 42.85 82.89

Overall Performance 75.73 42.25 82.28

However, the performance of PNN with ZM features was relatively better when
compared to the PNN with GLCM features.
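Since a PNN is essentially a Parzen-window density classifier, a minimal sketch can be written directly (an illustrative implementation, not the code used in this work; sigma is the smoothing parameter of the Gaussian kernels):

    import numpy as np

    def pnn_predict(X_train, y_train, x, sigma=0.1):
        # one Gaussian kernel per training sample, averaged per class;
        # the class with the highest estimated density wins
        classes = np.unique(y_train)
        scores = []
        for c in classes:
            d = X_train[y_train == c] - x
            scores.append(np.mean(np.exp(-np.sum(d * d, axis=1)
                                         / (2.0 * sigma ** 2))))
        return classes[int(np.argmax(scores))]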

Table 3.17: Confusion matrix of PNN with ZM features for X-ray classification

X-ray Class tp tn fp fn

Chest 6 71 6 7

Skull 8 67 8 7

Foot 6 61 15 8

Palm 5 65 11 9

Spine 7 61 16 6

Neck 5 61 15 9

Table 3.18: Performance of PNN with ZM features for X-ray classification

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 85.55 46.15 92.20

Skull 83.33 53.33 89.33

Foot 74.44 42.85 80.26

Palm 77.77 35.71 85.52

Spine 75.55 53.84 79.22

Neck 73.33 35.71 80.26

Overall Performance 78.32 44.59 84.46

A combinatorial variant involving both the shape and texture features was used
to model the PNN. In total, 28 features, which include the 13 GLCM and 15 ZM features,
are used in the classification of X-ray images. A three layered PNN classification
network is fed with these 28 features as input neurons along with 15 hidden neurons.
Table 3.19 shows the confusion matrix of PNN with GLCM and ZM features for X-ray
classification and the performance measures are given in Table 3.20.

Table 3.19: Confusion matrix of PNN with GLCM and ZM features for X-ray classi-
fication

X-ray Class tp tn fp fn

Chest 8 67 7 8

Skull 6 64 10 10

Foot 7 65 12 6

Palm 8 67 9 6

Spine 9 61 15 5

Neck 8 63 13 6

Table 3.20: Performance of PNN with GLCM and ZM features for X-ray classification

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 83.33 50.00 90.54

Skull 77.77 37.5 86.48

Foot 80.00 46.66 84.41

Palm 83.33 57.14 88.15

Spine 77.77 64.28 80.26

Neck 78.88 57.14 82.89

Overall Performance 80.18 53.31 85.45

The overall performance slightly improved when compared to PNN modeled with
GLCM and ZM features separately. Though the overall performance with all three
feature sets is not particularly high, it could be inferred that the combination of
GLCM and ZM is relatively better than the features taken individually when classified
using PNN. Table 3.21 gives the comparison of the classification performance of PNN
using the three sets of features, and the graphical illustrations are given in Fig. 3.12 to
Fig. 3.14.

Table 3.21: Overall performance of PNN classifier in classifying X-ray images with
different sets of features

S.no Features Overall Accuracy% Overall Sensitivity% Overall Specificity%

1 GLCM 75.73 42.25 82.28

2 ZM 78.32 44.59 84.46

3 GLCM+ZM 80.18 53.31 85.45

Fig. 3.12: A comparison of the accuracy of PNN in classifying X-ray images with
different sets of features

Fig. 3.13: A comparison of the sensitivity of PNN in classifying X-ray images with
different sets of features

Fig. 3.14: A comparison of the specificity of PNN in classifying X-ray images with
different sets of features

Table 3.22 depicts the overall accuracy, sensitivity and specificity of the BPNN, PNN
and SVM classifiers with the three types of features employed in this study. It could
be seen from Table 3.22 that the performance of all the classifiers is comparatively
better when the texture and shape features are combined.
Thus it could be concluded that the combined texture and shape features better
represented the X-ray images and the classifiers were able to model these features more
effectively. Also from Table 3.22, it could be inferred that out of the three classifiers
considered for the study, the SVM classifier outperformed BPNN and PNN in correctly
classifying the X-rays.

Table 3.22: Overall performance measures of BPNN, PNN and SVM classifiers

Classifier Features Accuracy Sensitivity Specificity

BPNN GLCM 85.18 63.47 88.65

ZM 87.95 68.86 92.44

GLCM+ZM 89.62 72.58 93.12

PNN GLCM 75.73 42.25 82.28

ZM 78.32 44.59 84.46

GLCM+ZM 80.18 53.31 85.45

SVM GLCM 92.03 73.87 95.23

ZM 93.14 74.04 96.52

GLCM+ZM 97.40 88.97 98.92

A performance comparison of the three classifiers when the combination of Zernike
moment and GLCM features is used is depicted in the graph shown in Fig. 3.15.

Fig. 3.15: Comparison of accuracy of the three classifiers with the combination of
Zernike moments and GLCM features

3.9 Summary
In this work, three classification techniques, namely the BPNN, PNN and SVM classifiers,
are employed for classifying the X-ray images. Out of these three classifiers, the SVM
classifier with the combination of GLCM + ZM features outperformed BPNN and
PNN in correctly classifying the X-ray images. Hence, it could be concluded that the SVM
classifier could be employed in the processing pipeline of the automated annotation
system.

Chapter 4

Orientation Detection

4.1 Introduction
Orientation of X-ray images is referred to as radiographic positioning or view of an
X-ray image. Radiographic positioning is highly standardized so that the images can
be interpreted accurately, leading to correct diagnosis by physicians and
radiologists. Hence, it plays a pivotal role in viewing the particular portions or the
areas to be examined. Radiographic positions viz. lateral view, oblique view, anterio-
posterior view, posterio-anterior view etc., are based on the way the X-ray images are
radiographed with respect to the object and the film. When the X-rays pass through
the object from front to back of the patient, it is referred to as Anterio-Posterior (AP)
view. If it is taken from back to front of the patient, it is said to be Posterio-Anterior
(PA) view. When it is passed through the object from the side of the patient, it
is called a lateral view. In the oblique view, X-rays pass through the object at
an angle. Hence, there is a clear need for suitable automatic
algorithms for the detection of the orientation of the X-ray images.
Keeping such objectives in view, an attempt has been made in this work to detect
the orientation of the X-ray images. It could be an initial step for image retrieval and
automated annotation. In this work, both model-based approaches and template-based
matching approaches are used for orientation detection. In model-based approach, a
mathematical model is used to find the similarity between the images, and in template-
matching, orientation is detected by comparing the test image with the template
images of each category. The label of the best matching reference image is taken as the
orientation of the respective query image. For the proposed method, the different views
that are taken into account are the AP, PA, lateral and oblique views. For the
images of chest, skull, spine and neck, the AP and lateral views are considered,
while the AP and oblique views are considered for the images of foot and palm. However, the AP
and PA views are one and the same, and so, throughout this thesis, the work confines
itself to AP views only. The lateral view is a side view of the X-ray image, which will not
cover the inner position of the foot and palm. Hence, the lateral view of the foot
and palm is mostly not used for any prediction. So, the oblique view is used for foot
and palm X-ray images. The detailed description about the different types of X-ray
views are given in Chapter 1.

4.2 Proposed Methodology for Orientation Detec-
tion
The architecture of the proposed methodology for orientation detection is given in
Fig. 4.1. In this work, the orientation of X-ray images are detected using model-based

Fig. 4.1: The architecture of the proposed methodology for orientation detection.

approach and template-matching based approach. Harris corner detection algorithm
and SVM classifier are used for model-based whereas SURF algorithm is employed for
template-based approach.

4.3 Model based Approach


It is inferred from the results of Chapter 3 that the SVM classifier produces better
overall results for the classification of X-ray images. Hence, in this work, the SVM classifier is
proposed for the model-based approach. The input X-ray images are pre-processed and
then segmented using the expectation maximization (EM) algorithm. The features are
extracted from the region of interest (ROI) using the discrete wavelet transform (DWT).
The extracted features are fed into the SVM classifier for detecting the orientation
of the X-ray images. The proposed methodology for orientation detection using SVM
classifier is given in Fig. 4.2.

Fig. 4.2: The proposed methodology for Orientation Detection using SVM classifier.

4.3.1 Pre-processing

Initially, to eliminate noise and to improve the contrast of the images, X-ray images
are pre-processed as discussed in Chapter 3. These pre-processed images are taken as
input for orientation detection. The pre-processed results are given in Fig. 4.3.

Fig. 4.3: Pre-processed X-ray images (a) Chest AP view, (b) Chest lateral view, (c)
Foot AP view (d) Foot oblique view (e) Neck AP view (f) Neck lateral view (g) Palm
AP view (h) Palm oblique view (i) Skull AP view (j) Skull lateral view (k) Spine AP
view and (l) Spine lateral view.

4.3.2 Segmentation

The success of image analysis depends on the reliability of segmentation, but an accu-
rate partitioning of an image is generally a very challenging problem. The goal is to
simplify the representation of the images and to find the ROI. In the literature, region
based method, contour based method, clustering method and model based method are
described and used. However, it is observed from the literature that the clustering
methods are widely used for medical image segmentation to detect the position of the
images.
Although many clustering algorithms are available in the literature, one of the
most commonly used technique in image segmentation is the expectation-maximization
(EM) algorithm. EM is a more statistically formalized method, which includes the idea
of partial membership in classes. It has better convergence properties and is generally
preferred for classification applications.
It consists of an E-step and an M-step. The E-step computes an expectation of the likelihood
given the current parameters, and the M-step computes maximum likelihood estimates of the parame-
ters by maximizing the expected likelihood found in the E-step. The detailed information
about the EM algorithm is given in Chapter 2.
The EM algorithm finds maximum likelihood estimates of parameters in probabilis-
tic models. The algorithm iterates between the E-step and M-step until convergence
occurs. The parameters are initialized first.
Then, using the feature values of the images, the mean, variance,
probability density function and normalized probability values are computed in the E-step.
The new mean and variance are calculated from these values in the
M-step. These steps are repeated until the model resembles the behavior described by
the samples. The segmented results are given in Fig. 4.4.
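A compact sketch of this intensity-based EM segmentation using scikit-learn's Gaussian mixture model, as a stand-in for the implementation used here (two components and the bright-foreground assumption are illustrative):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def em_segment(img, n_components=2):
        # fit a Gaussian mixture to the pixel intensities via EM
        X = img.reshape(-1, 1).astype(float)
        gmm = GaussianMixture(n_components=n_components,
                              random_state=0).fit(X)
        labels = gmm.predict(X).reshape(img.shape)
        # take the component with the brightest mean as the ROI
        return labels == np.argmax(gmm.means_.ravel())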

Fig. 4.4: Segmented X-ray images (a) Chest AP view, (b) Chest lateral view, (c) Foot
AP view (d) Foot oblique view (e) Neck AP view (f) Neck lateral view (g) Palm AP
view (h) Palm oblique view (i) Skull AP view (j) Skull lateral view (k) Spine AP view
and (l) Spine lateral view.

4.3.3 Feature Extraction

Feature extraction is a technique which is used to extract and represent the contents of
the X-ray images. In this work, features are extracted using discrete wavelet transform
(DWT). In the literature, transform based features, shape-based features and texture
based features are illustrated. From the literature it is observed that the transform
based features are widely used and play a vital role in identifying the position of the
image which is elaborated in Chapter 2.
The first step in feature extraction is the wavelet decomposition. This operation
returns the wavelet decomposition of the image at a predefined scale. The decompo-
sition consists of one approximation coefficient vector and three detail coefficient
vectors, namely the horizontal, vertical and diagonal detail coefficients. In this work, the ver-
tical, horizontal and approximation coefficient vectors were taken up to four levels using
Haar wavelets, and then a normalization process is employed to scale the coefficient
values, so that all coefficient values become less than or equal to one. Next, the energy of
each vector is computed by squaring every element in that vector. Feature reduction
is then carried out by summing 100 consecutive
energy values (coefficients) together. Finally, the reduced features are used for classifi-
cation. Reduced features from levels 2 to 5 were used to train and test the classifier, out
of which the third level gives better results. To further reduce the number of features,
summing 1000 consecutive energy values together was also tried; how-
ever, summing too many coefficients may hide the variations between the coefficients,
making it difficult for the classifier to distinguish the various classes of X-rays.
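A minimal sketch of this feature pipeline with PyWavelets (function and parameter names are illustrative assumptions, not the code of this work) could be:

    import numpy as np
    import pywt

    def dwt_energy_features(img, level=3, bin_size=100):
        # Haar decomposition; keep approximation, horizontal and
        # vertical detail coefficients at every level
        coeffs = pywt.wavedec2(img.astype(float), 'haar', level=level)
        parts = [coeffs[0].ravel()]
        for cH, cV, _cD in coeffs[1:]:
            parts += [cH.ravel(), cV.ravel()]
        v = np.concatenate(parts)
        v = v / np.max(np.abs(v))      # normalize so |coeff| <= 1
        energy = v ** 2                # per-coefficient energy
        n = len(energy) // bin_size * bin_size
        # sum groups of bin_size consecutive energy values
        return energy[:n].reshape(-1, bin_size).sum(axis=1)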

4.3.4 Model based SVM

A brief overview of SVM is given in Section 2.6.3 of Chapter 2. In this work, the
SVM with radial basis function (RBF) kernel is used. A hold-out procedure has
been adopted in testing the performance: out of the 180 X-ray images considered, SVM is
trained on one half of the data (90 X-ray images) and tested with the other half of the
data for evaluating the effectiveness of the classifier. Both the training and testing data
cover the three view types encountered across the classes (30 X-ray images per class), namely AP,
lateral and oblique views. For chest, skull, neck and spine, the X-rays are classified as
either AP view or lateral view, whereas for foot and palm, the X-rays are classified as
either AP view or oblique view. Accordingly, two binary
class SVMs are created: one to classify chest, skull, neck and spine, and the other to
classify foot and palm. Each SVM is trained in two-class mode to provide a value of +1 for
the AP view and -1 for the other view concerned.
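A sketch of one such binary view classifier with scikit-learn (illustrative synthetic data in place of the DWT energy features described above):

    import numpy as np
    from sklearn.svm import SVC

    X = np.random.rand(60, 40)        # stand-in for DWT energy features
    y = np.repeat([1, -1], 30)        # +1 = AP view, -1 = lateral view

    view_svm = SVC(kernel='rbf', gamma='scale')
    view_svm.fit(X[::2], y[::2])                   # train on one half
    accuracy = view_svm.score(X[1::2], y[1::2])    # test on the other half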

Table 4.1: Confusion matrix of SVM with DWT features for the detection of the
X-ray view

Class name tp tn fp fn

Chest 13 75 1 1

Foot 13 76 0 1

Palm 11 74 1 4

Neck 10 76 2 2

Skull 13 75 0 2

Spine 12 77 0 1

Table 4.2: Performance of SVM with DWT features for the detection of the X-ray
view

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 97.97 92.85 98.68

Foot 98.88 92.85 100

Palm 94.44 73.33 98.66

Neck 95.55 83.33 97.43

Skull 97.97 86.66 100

Spine 98.88 92.30 100

Overall performance 97.28 86.88 99.12

The confusion matrix for the 90 test images is shown in Table 4.1. From Table 4.1
it is inferred that, out of the 90 images, 72 are correctly classified. The performance
calculated using this confusion matrix is shown in Table 4.2 and it is observed that
the overall accuracy obtained is 97.28%. Figs. 4.5 to Fig. 4.7 shows the performance
of SVM in detecting the view of the X-ray image with DWT features.

Fig. 4.5: Class-wise accuracy of SVM in detecting the orientation of the X-ray images
using DWT features.

Fig. 4.6: Class-wise sensitivity of SVM in detecting the orientation of the X-ray
images using DWT features.

Fig. 4.7: Class-wise specificity of SVM in detecting orientation of the X-ray images
using DWT features.

4.4 Harris Corner Detection Algorithm


In this method, orientation detection is based on the feature points obtained using
the Harris corner detector algorithm. In order to obtain the best features of medical X-ray
images for the recognition of the X-ray view, the detection of the corners of the images
is important, and this is facilitated by the Harris corner detector. It
aims at finding corner points of the image in any direction, without
any limit on the angle. The block diagram for orientation detection using the Harris
corner detection algorithm is given in Fig. 4.8.
Harris corner detection is robust to the rotation and transformation of the images,
which helps in distinguishing the two view variants of the X-ray im-
ages. Using this algorithm, each image is vertically partitioned at (x/2, y). The feature
points on both sides are counted and compared against a threshold, which
is computed using the following algorithm.

Fig. 4.8: The block diagram of the Harris corner detection.

1. Compute the X and Y derivatives of the image

   I_x = G^x_σ ∗ I,   I_y = G^y_σ ∗ I    (4.1)

2. Compute the products of the derivatives at every pixel

   I_{x²} = I_x · I_x,   I_{y²} = I_y · I_y,   I_{xy} = I_x · I_y    (4.2)

3. Compute the sums of the products of the derivatives at each pixel, weighted by a Gaussian G_{σ′}

   S_{x²} = G_{σ′} ∗ I_{x²},   S_{y²} = G_{σ′} ∗ I_{y²},   S_{xy} = G_{σ′} ∗ I_{xy}    (4.3)

4. Define at each pixel (x, y) the matrix

   H = \begin{pmatrix} S_{x²}(x, y) & S_{xy}(x, y) \\ S_{xy}(x, y) & S_{y²}(x, y) \end{pmatrix}

5. Compute the response of the detector at each pixel

   R = Det(H) − k (Trace(H))²    (4.4)

   (k is an empirical constant, k = 0.04-0.06)

6. Threshold on the value of R.
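A minimal sketch of these steps with OpenCV's built-in cornerHarris (the block size, aperture and relative threshold below are illustrative choices, not the settings of this work):

    import cv2
    import numpy as np

    def harris_points(gray, block_size=2, ksize=3, k=0.04, rel_thresh=0.01):
        # gray: 2-D uint8 image; returns (row, col) corner coordinates
        response = cv2.cornerHarris(np.float32(gray), block_size, ksize, k)
        return np.argwhere(response > rel_thresh * response.max())

    # counting feature points on each half of a vertically partitioned image:
    # pts = harris_points(img)
    # left = int((pts[:, 1] < img.shape[1] // 2).sum()); right = len(pts) - left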

The X-ray image is given as input, and the X and Y derivatives of the image are
calculated; then the products and the sums of the products of the derivatives at each pixel
are also calculated. Using these values, the matrix H and the corner response R are
computed. The detected feature points are plotted on the X-ray image, in which the
green points represent the right portion of the image and the yellow points represent the
left portion of the image, as shown in Fig. 4.9.

Fig. 4.9: The feature point detected using Harris corner algorithm for chest, skull and
neck X-rays.

The number of feature points on the right and left portions of the image are
counted and compared with the threshold value R. If the count is greater than or equal to R, the
image is an anterior-view X-ray; otherwise, it is a lateral/oblique-view X-ray. The GUI to detect X-ray
orientation using Harris corner algorithm is shown in the Fig. 4.10 and Fig. 4.11.

Fig. 4.10: The result of the sample view of skull X-ray image for orientation detection
using Harris corner detector.

Fig. 4.11: The result of the sample view of neck X-ray image for orientation detection
using Harris corner detector.

To evaluate the efficacy of the Harris corner algorithm, the confusion matrix is
derived. The performance is calculated using the confusion matrix which is shown in
Table 4.3. Table 4.4 shows the performance of the Harris corner detector algorithm;
it produces the highest accuracy of 95.55% for foot X-ray images, and the
overall accuracy obtained is 91.47%.

Table 4.3: Confusion matrix for orientation detection using Harris corner algorithm

Class name tp tn fp fn

Chest 22 140 10 8

Foot 24 148 2 6

Palm 22 146 4 8

Neck 22 148 2 8

Skull 24 140 10 6

Spine 20 138 12 10

Table 4.4: Performance measures for orientation detection using Harris corner algo-
rithm

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 90 73.33 93.33

Foot 95.55 80.0 98.66

Palm 93.33 73.33 97.33

Neck 91.11 73.33 98.66

Skull 91.11 80 93.33

Spine 87.77 66.6 92.0

Overall performance 91.47 74.44 95.55

It could be seen from Table 4.2 and Table 4.4 that the model-based approach
using the SVM classifier produces an accuracy of 97.28% and gives better performance
than the Harris corner detector algorithm for all the six different classes of the X-ray
images. A graphical illustration for class-wise accuracy, sensitivity and specificity of
Harris corner algorithm in detecting X-ray view are shown in Figs 4.12 to 4.14.

Fig. 4.12: Class-wise accuracy of Harris corner algorithm in detecting X-ray view

Fig. 4.13: Class-wise sensitivity of Harris corner algorithm in detecting X-ray view

Fig. 4.14: Class-wise specificity of Harris corner algorithm in detecting X-ray view

4.5 Template based Classification using speeded up
robust features (SURF)
Speeded up robust features (SURF) is one of the most useful algorithms in the
computer vision field. It finds correspondences between similar
images easily, is effective in collecting the most specific information, and is robust
in dealing with viewpoint changes. It generates scale-invariant and rotation-invariant
interest points with descriptors. SURF is selected as the interest point detector and
descriptor because X-ray images are of different sizes, views and shapes. The computational
cost of SURF is low, which enables fast interest point localization and matching. The
block diagram of the proposed method is given in Fig. 4.15.

4.5.1 SURF Algorithm for Orientation Detection

The orientation of the X-ray images is detected by finding point correspondences between
the query image and the reference images, which are stored for each category or class of
X-ray images. The discrete image point correspondence is carried out in three steps,
namely feature point selection, feature point representation and feature point comparison,
as elaborated earlier. The interest points are selected
Fig. 4.15: Overall block diagram of the proposed method for orientation detection
using SURF algorithm .

at distinct locations of the image, like corners, blobs, etc. This process is done iteratively
to ensure the same set of points is found under different viewing conditions. Fig. 4.16 exhibits
the detected points in both the query image and a reference or template image. The next step
is to represent the selected interest points using a feature vector. The final step is to
match the feature vectors of the images using distance-based similarity measures.
In this research, the Euclidean distance is adopted since it is not computationally
expensive. The dimension of the feature vector plays a major role, and here the interest
points are represented using 64-dimensional descriptors. The distance between the reference
image and the query image is calculated using these 64-dimensional descriptors.
The query image is assigned to the class of the reference image having the minimum
distance from it. A GUI is developed which allows the user to select a test image,
shows the matching points between the test image and the reference image, and
finally displays the orientation of the X-ray image.
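A sketch of this minimum-distance template matching, building on the SURF descriptors shown earlier (the template dictionary and the mean-distance score are illustrative choices):

    import cv2
    import numpy as np

    def match_view(query_desc, templates):
        # templates: dict mapping a view label to the precomputed
        # SURF descriptor array of the corresponding template image
        bf = cv2.BFMatcher(cv2.NORM_L2)  # Euclidean distance on descriptors
        best_label, best_dist = None, np.inf
        for label, ref_desc in templates.items():
            matches = bf.match(query_desc, ref_desc)
            d = np.mean([m.distance for m in matches])
            if d < best_dist:
                best_label, best_dist = label, d
        return best_label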
The GUIs showing the views of three different X-rays, namely chest, skull and palm, are

Fig. 4.16: The result of the sample view of three different X-rays namely chest, skull
and palm.

shown in Figs. 4.17 and 4.18.


In this work, 180 X-ray images are taken as input for identifying the X-ray view, and
the confusion matrix is given in Table 4.5. The performance metrics are evaluated and
shown in Table 4.6. The overall accuracy obtained is 94.62%. The SURF algorithm
gives better results for foot, palm, neck and spine X-ray images. It can be seen
that the overall accuracy of the SURF algorithm is better than that of the Harris corner algorithm;
however, SVM is comparatively better than both SURF and the Harris corner algorithm
in accurately detecting X-ray views. The class-wise accuracy, sensitivity and specificity of the
SURF algorithm in detecting the X-ray views are graphically illustrated in Fig. 4.19
to 4.21.

Fig. 4.17: The GUI showing the view of skull X-ray image.

Fig. 4.18: The GUI showing view of palm X-ray image.

Table 4.5: Confusion matrix for orientation detection using SURF algorithm

Class name tp tn fp fn

Chest 20 140 10 10

Foot 30 146 4 0

Palm 28 148 2 2

Neck 27 148 2 3

Skull 23 140 10 7

Spine 25 147 5 3

Table 4.6: Performance measures of SURF algorithm in detecting the orientation of


X-rays.

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 88.88 66.66 93.33

Foot 97.77 100 97.33

Palm 97.77 93.33 98.66

Neck 97.22 90 98.66

Skull 90.55 76.66 93.33

Spine 95.55 89.28 96.71

Overall performance 94.62 85.98 96.33

Fig. 4.19: Class-wise accuracy of SURF algorithm in detecting the X-ray views.

Fig. 4.20: Class-wise sensitivity of SURF algorithm in detecting the X-ray views.

Fig. 4.21: Class-wise specificity of SURF algorithm in detecting the X-ray views.

The overall performance results were calculated for both model based approach
and template matching based approach and are shown in Table 4.7. The overall
class wise accuracy, sensitivity and specificity of SVM classifier, Harris corner and
SURF algorithm in detecting the X-ray views are graphically illustrated in Fig. 4.22.
From Table 4.7, it is inferred that the specificity,
sensitivity and accuracy are found to be better with Harris corner for chest and skull
X-ray images, whereas for neck, palm, spine and foot, the SURF algorithm gives better
results. Overall, the SURF algorithm gives an accuracy of 94.62%, which is higher than that
of the Harris corner algorithm. Comparing the SVM classifier and the SURF algorithm, SVM
exhibited an accuracy of 97.28% and produced more reliable results than SURF.

Table 4.7: The overall performance measures of SVM classifier, Harris corner and
SURF algorithm in detecting the X-ray views.

Techniques Approach Accuracy Sensitivity Specificity

Model based SVM 97.28 86.88 99.12

Model based Harris corner 91.47 74.44 95.55

Template matching based SURF 94.62 85.98 96.33

Fig. 4.22: Graph showing overall accuracy of SVM classifier, Harris corner algorithm
and SURF in detecting the X-ray views.

4.6 Summary
In this work, model-based and template-matching-based methods were employed to
automatically detect the X-ray views. SVM and the Harris corner detector algorithm
come under the model-based approach, whereas SURF comes under the template-matching-
based approach. Among the three approaches, SVM outperforms SURF and the Harris corner
algorithm. Hence, it could be finally concluded that the SVM classifier can be used
to detect the X-ray view and could be employed in the processing pipeline of the
automated annotation system.

Chapter 5

Abnormality Detection

5.1 Introduction
As explained in the previous chapters, medical imaging plays a key role in early
detection, diagnosis and treatment. It gives the most significant information to
the physicians and radiologists for treating the various ailments of patients. Over the
past decades, the growth in the field of medical imaging has been incredible and expo-
nential. This enables the experts of the medical field to treat various diseases and
aids diagnostic procedures efficiently. Thus, digitization in the field of radiography
has already brought sea changes in depicting the structural modalities and
their abnormalities. Abnormalities may be present in any of the specified organs of
the human body.
The interpretation of radiography images is usually based on visual inspection, which of late
is being replaced by state-of-the-art technologies such as computer-aided detection
/ computer-aided diagnosis, so as to increase the speed and accuracy of diagnosis.
With the advancements in computing techniques, there are umpteen procedures for the
automatic detection of abnormalities. It is a well-known fact that there is no common
method that can be applied to all abnormality detection problems. In
line with this, this chapter focuses on developing an automated technique to detect
the abnormalities in an efficient way.
In this work, for abnormality detection, six classes of X-ray images are taken, viz.
foot, skull, palm, neck, chest and spine, each consisting of 30 images including both
normal and abnormal X-ray images. Out of the above six classes, the abnormalities
may be found in terms of either some anatomical anomalies or pathological findings.
In chest X-rays, the abnormalities usually arise due to reasons like rib fracture, infected
lungs, etc.; for the skull X-rays, the abnormalities include trauma conditions like skull
fracture and infection due to sinus, tumor, etc.; with respect to spine, neck, palm
and foot X-rays, abnormalities are because of accident cases leading to fracture and
dislocation, and also owing to congenital and degenerative causes. They may include
clubbed foot and osteoporosis / arthritis.
Though the abnormalities mentioned above are the various possibilities found in
the six classes, the following abnormality for each class is adhered to in this
work: for chest X-rays, lung infection is considered; for skull X-ray images, fracture
/ tumor in the skull is considered; and for spine, neck, foot and palm, only fractures are
considered. The sample normal and abnormal X-ray images for the six different classes
are shown in Fig. 5.1. The abnormality detection is obtained through a successive
are shown in Fig. 5.1. The abnormality detection is obtained through a successive

Fig. 5.1: Six different classes of X-ray images namely (a) Normal chest image, (b)
Abnormal chest image, (c) Normal skull image, (d) Abnormal skull image, (e) Normal
palm image, (f) Abnormal palm image, (g) Normal foot image, (h) Abnormal foot
image.

procedure involving pre-processing, segmentation, feature extraction and classification
to identify whether the X-ray image is normal or abnormal.

5.2 Proposed Methodology for Abnormality De-
tection
The objective of abnormality detection is accomplished by first applying a filter-
ing algorithm, which is used to smooth the X-ray images and to remove different
types of noise (darkness, brightness, blurring, etc.) present in the image. It is then
followed by the segmentation of X-ray image and the features are extracted from the
segmented regions. From the extracted features, a suitable classifier is then employed
to distinguish the image as to whether it is normal or abnormal. The performance of
the classifiers are evaluated to show their efficacy in detecting the abnormality. The
block diagram of the proposed methodology for abnormality detection is shown in
Fig. 5.2.

Fig. 5.2: Block diagram of the proposed methodology for abnormalities detection.

5.3 Pre-processing
X-ray images are frequently degraded by noise, which deteriorates the visual quality
of the image and hides the important information required for accurate diagnosis.
Therefore the X-ray image needs to be pre-processed to eliminate the noise. It is
achieved using M3 filter which has been already elaborated in Chapter 2. The pre-
processed X-ray images are given in Fig. 5.3.

Fig. 5.3: Pre-processed X-ray images (a) Normal chest image, (b) Abnormal chest
image, (c) Normal skull image, (d) Abnormal skull image, (e) Normal palm image, (f)
Abnormal palm image, (g) Normal foot image, (h) Abnormal foot image.

5.4 Segmentation
Segmentation is the process of dividing an image into regions with similar properties
such as gray level, texture, brightness, and contrast. The role of segmentation is to
sub-divide the objects of an X-ray image in order to study its anatomical structure that
helps in decision making. Many researchers have proposed various segmentation algorithms
to extract the region-of-interest from X-ray images. The expectation maximization
(EM) algorithm is employed for the process of segmentation, and a detailed description
of it is given in Chapter 2. The results of the segmentation are shown in Fig. 5.4.

Fig. 5.4: Segmented X-ray images (a) Normal chest image, (b) Abnormal chest im-
age, (c) Normal skull image, (d) Abnormal skull image, (e) Normal palm image, (f)
Abnormal palm image, (g) Normal foot image, (h) Abnormal foot image.

5.5 Feature extraction


One of the most important issues in the automated abnormality detection of X-ray
images is to extract the contents that help in differentiating the features
found in the X-ray image, and in this work DWT features are used. The vertical, horizontal
and approximation coefficient vectors were taken up to four levels using Haar wavelets,
and then they are normalized so that all coefficient values are less than
or equal to one. The number of features is reduced by summing 100 consecutive
energy values together, and the reduced features are then used for the classification.
The features obtained from
level 3 decomposition are used to train and test the classifiers, as the third level features
produced better results. The abnormality present in the X-ray images are detected
using three different classifiers namely, extreme learning machine (ELM), SVM and
decision tree.

5.6 Proposed Modeling Techniques for Abnormal-
ity Detection
The modeling techniques namely decision tree, ELM and SVM have been used in this
work for determining whether the given X-ray is normal or abnormal.

5.6.1 Decision tree

The decision tree algorithm is a formal, structured approach for making decisions based
on rules. It helps to decompose a complex problem into smaller, manageable
sub-problems. It is also non-parametric, because no assumption is required about the
distribution of the variables for each class. It consists of a root node represented by 'R',
chance or leaf nodes 'C', and terminal nodes 'T'.
Each node contains a decision criterion based on one feature. Features with close
resemblance are used for the first split, and this procedure is repeated until there is no
further split. The decision tree is constructed from the training dataset, which consists of
objects described by a set of attributes and a class label.
The class that is associated with the leaf is the output of the tree. The tree
misclassifies an image when the output of the tree does not match the class
label. The structure of the decision tree is shown in Fig. 5.5. In this work, the six chance
nodes C1 to C6 represent the six different classes of X-ray images. A probability is
assigned to each branch emanating from each chance node. Based on the probability,
the two terminal nodes T1 and T2 for each class are assigned to the normal and
abnormal cases.
If the probability is greater than 0.6, the X-ray image is
said to be normal; else, it is said to be abnormal.
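A brief sketch of training such a binary tree with scikit-learn, as a stand-in for the implementation used here (the data below is synthetic):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    X_train = np.random.rand(96, 40)   # stand-in for DWT energy features
    y_train = np.tile([0, 1], 48)      # 0 = normal, 1 = abnormal

    tree = DecisionTreeClassifier(criterion='gini', random_state=0)
    tree.fit(X_train, y_train)
    # tree.predict(test_features) -> 0 (normal) or 1 (abnormal)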

Fig. 5.5: The structure of the decision tree

Experimental Results

180 X-ray images are taken, out of which 96 X-ray images are used as the training set and
84 X-ray images are used as the testing set. Each class consists of 30 X-ray images (both
normal and abnormal images), and in each class 16 X-ray images are used for training
and 14 X-ray images are used for testing. DWT coefficients are used as input features
to classify a test image as either normal or abnormal. The input features are used
to train the binary classifier, which can automatically infer whether an image is a
normal X-ray image or an abnormal one.
The performance of the classifier is evaluated in terms of sensitivity, specificity
and accuracy. The classification performance of the decision tree classifier is evaluated
using the confusion matrix shown in Table 5.1.
The system achieves 76.94% sensitivity and 75.04% specificity, and
produces an overall accuracy of 78.56%, as shown in Table 5.2. Fig. 5.6 gives the class-
wise performance of decision tree in identifying the X-ray as normal or abnormal using
DWT features.

Table 5.1: Confusion matrix for abnormality detection using decision tree with DWT
features

Class name tp tn fp fn

Chest 5 5 2 2

foot 7 5 1 1

palm 6 6 0 2

skull 5 7 1 1

neck 5 6 1 2

spine 6 3 3 2

Table 5.2: Performance measures of decision tree in detecting abnormality in X-rays

X-ray Class Accuracy% Sensitivity% Specificity%

chest 71.42 71.42 71.42

foot 85.71 85.50 83.33

palm 85.71 75 75

skull 85.71 83.3 85.50

neck 78.57 71.42 75

spine 64.28 75.0 60

Overall performance 78.56 76.94 75.04

Fig. 5.6: Class-wise performance of decision tree classifier in detecting abnormality
in X-rays.

5.6.2 ELM Classifier

The second technique is based on the extreme learning machine (ELM) using discrete wavelet
transform (DWT) features and is used to classify X-ray images
into two classes (abnormal and normal). ELM consists of three layers, namely an input layer,
a single hidden layer, and an output layer. The learning speed of ELM is extremely fast,
and it has better generalization performance than gradient-based learning in ANNs.
The traditional gradient-based learning algorithm may face several issues
such as local minima, an improper learning rate, overfitting, etc. The gradient-based
learning algorithm works only for differentiable activation functions, whereas ELM could be
used with many non-differentiable activation functions.
is accomplished in two different phases. In the first phase, a model is built using the
training samples of the dataset. In the second phase, each test sample in the dataset
is classified using the model evolved in the training phase. In each phase, the accuracy
is computed along with class output for each test sample of the dataset.
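Because the output weights of an ELM are obtained analytically, a minimal version can be sketched in a few lines (an illustrative implementation, not the code of this work; labels are +1 for normal and -1 for abnormal):

    import numpy as np

    class ELM:
        def __init__(self, n_hidden=50, seed=0):
            self.n_hidden = n_hidden
            self.rng = np.random.default_rng(seed)

        def fit(self, X, y):
            # phase 1: random input weights, analytic output weights
            self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
            self.b = self.rng.normal(size=self.n_hidden)
            H = np.tanh(X @ self.W + self.b)    # hidden-layer outputs
            self.beta = np.linalg.pinv(H) @ y   # least-squares solution
            return self

        def predict(self, X):
            # phase 2: classify test samples with the learned weights
            H = np.tanh(X @ self.W + self.b)
            return np.sign(H @ self.beta)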

Experimental Results

180 X-ray images covering the six different classes of X-rays, namely chest, palm,
skull, spine, neck and foot, are taken. Out of the 180 X-ray images, 96 are
used for training and 84 are used for testing.
Each class consists of 30 X-ray images (both normal and abnormal images) and
in each class 16 X-ray images are used for training and 14 X-ray images are used for
testing.
The DWT features are used to train the binary classifier which can automatically
infer whether an image is a normal or abnormal one and its confusion matrix is given
in Table 5.3.

Table 5.3: Confusion matrix for abnormality detection using ELM with DWT features

Class name tp tn fp fn

Chest 6 6 1 1

foot 7 6 0 1

palm 6 6 1 1

skull 6 7 0 1

neck 5 6 1 2

spine 7 5 1 1

The performance of the classifier is evaluated in terms of sensitivity, specificity


and accuracy. The system achieves 83.25% sensitivity and 90.07% specificity, and an
accuracy of 86.89%.
ELM produces better overall accuracy than that of the decision tree algorithm.
Table 5.4 gives the performance measures for the ELM classifier. Fig. 5.7 shows class
wise performances of ELM classifier using DWT features.

Table 5.4: Performance measures of ELM with DWT features in detecting abnormality
in X-rays

X-ray Class Accuracy% Sensitivity% Specificity%

Chest 85.71 85.71 85.71

foot 92.85 87.50 100

palm 85.71 85.71 85.71

skull 92.85 85.71 100

neck 78.57 71.42 85.71

spine 85.71 87.50 83.33

Overall performance 86.89 83.25 90.07

Fig. 5.7: Class-wise performance of ELM classifier in detecting abnormality in X-rays.

5.6.3 SVM

The SVM with a radial basis function (RBF) kernel is utilized here. Out of 180 X-ray
images, 96 are used for training and 84 for testing. Each class consists of 30 X-ray
images (both normal and abnormal); in each class, 16 images are used for training and
14 for testing.
A binary SVM classifier is created and trained to distinguish one class from the
other. The SVM is trained in two-class mode, so as to output a value of +1 for normal
images and -1 for abnormal images, as sketched below. Table 5.5 shows the confusion
matrix of the SVM classifier with DWT features.
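A minimal sketch of this setup with scikit-learn follows; the feature-vector length and the SVC hyper-parameters are placeholder assumptions, while the RBF kernel and the +1/-1 labelling come from the text.

    import numpy as np
    from sklearn.svm import SVC

    n_features = 64                                     # placeholder DWT feature length

    X_train = np.random.rand(96, n_features)            # stand-in for 96 training images
    y_train = np.where(np.arange(96) % 2 == 0, 1, -1)   # +1 = normal, -1 = abnormal

    clf = SVC(kernel="rbf", gamma="scale", C=1.0)       # RBF-kernel binary SVM
    clf.fit(X_train, y_train)

    X_test = np.random.rand(84, n_features)             # stand-in for 84 test images
    pred = clf.predict(X_test)                          # one +1/-1 label per image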

Experimental Results

The performance is measured using the confusion matrix shown in Table 5.5, and the
performance measures derived from it are given in Table 5.6. It can be seen that the
DWT features are well modeled by the SVM classifier: the overall accuracy obtained is
88.09%, and the abnormalities present in all six classes of X-ray images are
identified with good accuracy rates. SVM produces better overall performance than both
the ELM and the decision tree. Fig. 5.8 shows the class-wise performance of the SVM
classifier using DWT features.

Table 5.5: Confusion matrix for abnormality detection using SVM with DWT features

Class name    TP    TN    FP    FN
Chest          6     7     0     1
Foot           6     7     0     1
Palm           6     5     2     1
Skull          7     6     1     0
Neck           6     6     1     1
Spine          7     5     2     0

Table 5.6: Performance measures of SVM with DWT features in detecting abnormality in X-rays

X-ray class            Accuracy (%)    Sensitivity (%)    Specificity (%)
Chest                     92.85             85.71             100.00
Foot                      92.85             85.71             100.00
Palm                      78.57             85.71              71.42
Skull                     92.85            100.00              85.71
Neck                      85.71             85.71              85.71
Spine                     85.71            100.00              71.42
Overall performance       88.09             90.47              85.71

Fig. 5.8: Class-wise performance of SVM classifier in detecting abnormality in X-rays.

Table 5.7: Overall performance of decision tree, ELM and SVM classifiers with DWT
features for detecting the abnormality in X-rays

Classifier           Accuracy (%)    Sensitivity (%)    Specificity (%)
Decision tree           78.56             76.94              75.04
ELM                     86.89             83.25              90.07
SVM                     88.09             90.47              85.71

Fig. 5.9: A comparison of the classification performances of decision tree, ELM and
SVM with DWT features for detecting the abnormality.

The overall performance measures of the decision tree, ELM and SVM classifiers in
terms of sensitivity, specificity and accuracy are given in Table 5.7, and a
comparison of their classification performances is graphically represented in
Fig. 5.9.

5.7 Summary
In this work, three classification techniques, namely decision tree, ELM and SVM, are
employed to detect abnormalities present in X-ray images using DWT features. Of the
three classifiers, SVM outperformed the decision tree and ELM in correctly identifying
abnormal X-ray images. Hence, it can be concluded that the SVM classifier can be
employed in the processing pipeline of the automated annotation system.

Chapter 6

Annotation

6.1 Introduction
Over the past two decades, the number of modalities used to image the organs of the
human body has steadily increased. Until recently, physicians, radiologists and
researchers stored and retrieved these images manually. Manual storage and retrieval
of medical images from such vast databases suffers from various difficulties, such as
the need for expertise, excessive time consumption, and the inability to retrieve the
expected images on search. To overcome these issues, automated annotation of medical
images is now preferred. Automated annotation not only eases the work of physicians,
radiologists and researchers, including those without medical domain knowledge, but
also results in less time consumption and higher search accuracy. With the soft
computing technologies now at hand, the objective of automated annotation can be
effectively accomplished; the graphical user interface (GUI) created for the
annotation of medical X-rays is shown in Fig. 6.1.

Fig. 6.1: GUI model for annotation of medical X-ray images.

6.2 Proposed Work
After detecting the class and orientation of the given X-ray image, the X-ray image is
checked for the presence of any abnormality. The information so obtained is used in
this chapter to automatically annotate the image. A 21 bit code is generated
incorporating information such as the patient ID, age, gender, X-ray class and X-ray
orientation, along with information about the pathology. Of the 21 bits, the first 12
bits encode the patient ID, the next 2 bits the age, 1 bit the gender (male or
female), 3 bits the class of the X-ray image, 2 bits the view/orientation of the X-ray
image, and 1 bit whether the X-ray is normal or abnormal. The bits corresponding to
the various annotation data are shown in Fig. 6.2.

Fig. 6.2: Distribution of generated code.

Initially, to obtain the general information of the patient, a GUI is created as shown
in Fig. 6.3, where the patient ID, age and gender are entered; these make up the first
15 bits of the 21 bit code. It has been shown in Chapter 3 that a support vector
machine (SVM) with the combination of gray level co-occurrence matrix (GLCM) and
Zernike moment features is able to classify X-rays with very high accuracy. So, for
classifying X-ray images for the purpose of annotation, the combination of GLCM and
Zernike moment features with SVM is applied. After determining the class of the X-ray,
it is automatically assigned a predefined 3 bit code (chest: 001, foot: 010, skull:
011, neck: 100, palm: 101, spine: 110) to form part of the 21 bit annotation code.

Fig. 6.3: GUI model for proposed method of automated annotation of X-ray images.

Next, the orientation of the X-ray is automatically determined. Chapter 4 discussed
three different methods for orientation detection, namely an SVM classifier, the
Harris corner detector, and the SURF algorithm. Since SVM with discrete wavelet
transform (DWT) features was found to be the most effective in determining X-ray
views, SVM with DWT is used in this chapter for automatically identifying the view of
the X-ray for the purpose of annotation. After determining the view of the X-ray, it
is assigned a predefined 2 bit code (01: anterior view, 10: lateral view), which forms
the 19th and 20th bits of the annotation code.
Similarly, from Chapter 5 it is understood that SVM with DWT features is able to
detect abnormalities in X-ray images. So, to determine whether the given X-ray is
normal or abnormal, SVM with DWT is used in this work. After this determination, the
final bit of the 21 bit code is assigned (1: normal, 0: abnormal).
Thus, the 21 bit code is automatically generated once an X-ray image is given as
input to the proposed system. A sketch of the resulting bit packing follows.
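To make the packing concrete, here is a minimal Python sketch of the code generation. The bit widths and the class and view codes are taken from the text above; the field order, the two-bit age grouping, and the use of 1 for male in the gender bit are illustrative assumptions.

    # Class and view codes as defined in the text.
    X_RAY_CLASS = {"chest": 0b001, "foot": 0b010, "skull": 0b011,
                   "neck": 0b100, "palm": 0b101, "spine": 0b110}
    VIEW = {"anterior": 0b01, "lateral": 0b10}

    def annotation_code(patient_id, age_code, is_male, xray_class, view, is_normal):
        """Pack the six annotation fields into a 21-character bit string."""
        assert 0 <= patient_id < 2 ** 12 and 0 <= age_code < 4
        return (format(patient_id, "012b")                # 12 bits: patient ID
                + format(age_code, "02b")                 # 2 bits: age group
                + ("1" if is_male else "0")               # 1 bit: gender (1 = male, assumed)
                + format(X_RAY_CLASS[xray_class], "03b")  # 3 bits: X-ray class
                + format(VIEW[view], "02b")               # 2 bits: view/orientation
                + ("1" if is_normal else "0"))            # 1 bit: 1 = normal, 0 = abnormal

    # e.g. a normal anterior-view chest X-ray of patient 321, age group 2:
    # annotation_code(321, 2, True, "chest", "anterior", True)
    # -> '000101000001' + '10' + '1' + '001' + '01' + '1'  (21 bits in total)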

6.2.1 Experimental Results

Fig. 6.4: The automated code generated for the sample chest anterior-posterior normal
view.

Fig. 6.5: The automated code generated for the sample chest lateral normal view.

Fig. 6.6: The automated code generated for the sample skull lateral abnormal view.

Fig. 6.7: The automated code generated for the sample skull lateral normal view.

Fig. 6.8: The automated code generated for the sample palm oblique normal view.

Fig. 6.9: The automated code generated for the sample palm anterior-posterior normal
view.

The snapshot of the user interface in which a normal anterior-posterior view chest
X-ray is annotated is shown in Fig. 6.4. The snapshot in which a normal lateral view
chest X-ray is annotated is shown in Fig. 6.5. The snapshot in which an abnormal
lateral view skull X-ray is annotated is shown in Fig. 6.6. The snapshot in which a
normal lateral view skull X-ray is annotated is shown in Fig. 6.7. The snapshot in
which a normal oblique view palm X-ray is annotated is shown in Fig. 6.8. Finally, the
snapshot in which a normal anterior-posterior view palm X-ray is annotated is shown in
Fig. 6.9.

6.3 Summary
Thus, a 21 bit code is automatically generated, incorporating information such as the
patient ID, age, gender, X-ray class and X-ray view, along with information about the
pathology. This annotation code could be used to facilitate automated archival and
retrieval.

Chapter 7

Summary and Conclusion

7.1 Summary of the Work


In this thesis, comprehensive methodologies have been proposed for the automated
annotation of medical X-ray images using image processing techniques and pattern
classification methods. The performance of the proposed systems has been evaluated
using test images obtained from the IRMA database, from Raja Muthiah Medical College
and Hospital, Annamalainagar, and from Government Medical College and Hospital, Salem.
Denoising algorithms, in particular the M3 filter, are used to remove noise from the
X-ray images and to improve their quality. The pre-processed images are segmented to
find the region of interest using connected component labeling, which significantly
reduces the number of pixels that need to be considered when extracting features for
further processing. The test-case results and their interpretations demonstrate the
performance of the algorithms implemented for classification, orientation detection
and abnormality detection, culminating in the automated code generation.
For the classification of X-ray images, connected component labeling has been
exploited to segment the image. This method helps to focus on the region of interest,
which in turn reduces unnecessary computational overhead. A combination of shape and
texture features, namely GLCM and Zernike moment features derived from the region of
interest, has been used to classify six different classes of X-ray images.
Furthermore, for the identification of the orientation of X-ray images, the
expectation maximization algorithm is used to segment the region of interest, and
features are extracted based on wavelet coefficient analysis. Both model-based and
template-based algorithms are used to detect the orientation of the X-ray image: in
the model-based approach, the Harris corner algorithm and the SVM classifier are used,
and in the template-based approach, the SURF algorithm is used. The best performance
is achieved by the SVM classifier with DWT features.
Also, an intelligent diagnosis system based on wavelet coefficient analysis has been
proposed for the classification of normal and abnormal X-ray images across the
different classes. Three classifiers, namely decision tree, ELM and SVM, are employed
to classify images as normal or abnormal. The results obtained show that SVM can
effectively be used in the detection of abnormalities.
Finally, a 21 bit code is generated, of which 12 bits are used for the patient ID, 2
bits for the age, 1 bit for the gender, 3 bits for the class of the X-ray image, 2
bits for the orientation of the X-ray image, and 1 bit to represent whether the given
X-ray is normal or abnormal.

7.2 Major Contributions of the Work


The most important claim of this research is that it provides automated annotation of
medical X-ray images in a comprehensive manner, yet almost independently of
domain-specific knowledge. The experimental results validate the effectiveness of the
proposed methods. In a nutshell, the unique and novel contributions of this research
are listed below:

• Various issues are addressed in the automated annotation of X-ray images, namely
X-ray pre-processing, region of interest segmentation, feature extraction,
classification of X-ray images, orientation and abnormality detection, and finally
the code generation.

• The M3 filter is proposed for noise elimination and for improving the quality of
the images.

• For the classification of X-ray images, a combination of GLCM and ZM features is
extracted from the region of interest, which is located using CCL. Of the three
classifiers evaluated, namely BPNN, PNN and SVM, SVM provides the best results.

• For orientation detection of the X-ray images, the region of interest is segmented
using the EM algorithm and features are extracted from the wavelet coefficients.
Orientation is detected using three different techniques, namely the Harris corner
algorithm, the SVM classifier and the SURF algorithm. The best performance is
achieved when the SVM classifier is employed with DWT coefficients.

• For the detection of abnormalities, the EM algorithm is employed for segmentation,
and DWT features extracted from the segmented region are classified using decision
tree, ELM and SVM classifiers to ascertain whether the X-ray image is normal or
abnormal. Among these, SVM produces better results than the other models.

7.3 Directions for Future Research


In this thesis, six of the most important classes of X-ray images, viz. skull, neck,
chest, spine, palm and foot, have been taken into consideration for generating the
automated annotation. This is achieved through a sequence of processes ranging from
pre-processing and segmentation to feature extraction, classification, abnormality
detection and code generation. In future, this work can be extended to various other
X-ray types such as the hip, upper and lower limbs, knee, elbow and ankle. The six
classes chosen for this work were determined based on the statistical likelihood of
the trauma and degeneration associated with them. Though this work has been confined
to X-rays, there is scope in the future to include other modalities such as
computerized tomography (CT), ultrasonography (USG) and magnetic resonance imaging
(MRI). In this thesis, only one abnormality, namely fracture, has received attention;
future work can address other pathologies such as malignancy at the bone or tissue
level, arthritis, etc.

Bibliography
[1] Otto Chan, ABC of emergency radiology, John Wiley & Sons, 2012.

[2] Ravikiran Ongole and BN Praveen, Textbook of Oral Medicine, Oral Diagnosis and
Oral Radiology, Elsevier Health Sciences, 2014.

[3] Frank Levy and Kyoung-Hee Yu, “Offshoring radiology services to india,” Industrial
Performance Center, Massachusetts Institute of Technology,
web.mit.edu/ipc/publications/pdf/06-005.pdf, September 2006.

[4] Thomas M Lehmann, Henning Schubert, Daniel Keysers, Michael Kohnen, and
Berthold B Wein, “The irma code for unique classification of medical images,” in
Medical Imaging 2003. International Society for Optics and Photonics, 2003, pp. 440–
451.

[5] Mukesh C Motwani, Mukesh C Gadiya, Rakhi C Motwani, and Frederick C Harris,
“Survey of image denoising techniques,” in Proceedings of GSPX, 2004, pp. 27–30.

[6] M Rajalakshmi and P Subashini, International Journal of Emerging Technologies in
Computational and Applied Sciences (IJETCAS), www.iasir.net.

[7] L Sahawneh and B Carroll, “Stochastic image denoising using minimum mean squared
error (wiener) filtering,” Electrical and Computer Engineering, pp. 471–474, 2009.

[8] S Rajeshwari and T Sree Sharmila, “Efficient quality analysis of mri image using pre-
processing techniques,” in Information & Communication Technologies (ICT), 2013
IEEE Conference on. IEEE, 2013, pp. 391–396.

[9] G Amar Tej and Prashanth K Shah, “Efficient quality analysis and enhancement of
mri image using filters and wavelets,” .

[10] Zeinab Mustafa, Banazier Abrahim, Yasser M Kadah, et al., “K11. modified hybrid
median filter for image denoising,” in Radio Science Conference (NRSC), 2012 29th
National. IEEE, 2012, pp. 705–712.

[11] Navdeep Kanwal, Akshay Girdhar, and Savita Gupta, “Region based adaptive contrast
enhancement of medical x-ray images,” in Bioinformatics and Biomedical Engineer-
ing,(iCBBE) 2011 5th International Conference on. IEEE, 2011, pp. 1–5.

[12] Noorhayati Mohamed Noor, Noor Elaiza Abd Khalid, Mohd Hanafi Ali, and Alice
Demi Anak Numpang, “Fish bone impaction using adaptive histogram equalization
(ahe),” in Computer Research and Development, 2010 Second International Confer-
ence on. IEEE, 2010, pp. 163–167.

[13] Sadeer G Al-Kindi, Ghassan Al-Kindi, et al., “Breast sonogram and mammogram en-
hancement using hybrid and repetitive smoothing-sharpening technique,” in Biomed-
ical Engineering (MECBME), 2011 1st Middle East Conference on. IEEE, 2011, pp.
446–449.

[14] R Manavalan and K Thangavel, “Evaluation of textural feature extraction from grlm
for prostate cancer trus medical images,” International Journal of Advances in Engi-
neering & Technology, vol. 1, no. 6, 2012.

[15] Amir Rajaei, Rangarajan Lalitha, and Elham Dallalzadeh, “Medical image texture
segmentation using range filter,” Computer Science and Information Technology, vol. 2,
no. 1, 2012.

[16] K Thangavel, R Manavalan, and I Laurence Aroquiaraj, “Removal of speckle noise
from ultrasound medical image based on special filters: comparative study,”
ICGST-GVIP Journal, vol. 9, no. 3, pp. 25–32, 2009.

[17] Pierre Soille, “Filtering,” in Morphological Image Analysis, pp. 241–265. Springer,
2004.

[18] Cristina Stolojescu-CriŞan and Ştefan Holban, “A comparison of x-ray image segmen-
tation techniques,” Advances in Electrical and Computer Engineering, vol. 13, no. 3,
2013.

[19] SV Kasmir Raja, A Shaik Abdul Khadir, and Dr SS Riaz Ahamed, “Moving toward
region-based image segmentation techniques: A study,” Journal of Theoretical and
Applied Information Technology, vol. 5, no. 13, pp. 81–87, 2009.

[20] Ding Feng, Segmentation of bone structures in X-ray images, Ph.D. thesis, Citeseer,
2006.

[21] W Burgern and Mark J Burge, “Principles of digital image processing fundamental
techniques,” 2009.

[22] Manoj R Tarambale and Nitin S Lingayat, “Computer based performance evaluation
of segmentation methods for chest x-ray image,” International Journal of Bioscience,
Biochemistry and Bioinformatics, vol. 3, no. 6, pp. 545, 2013.

[23] Aditya A Tirodkar, “A multi-stage algorithm for enhanced x-ray image segmentation,”
International Journal of Engineering Science Technology (IJESE), vol. 3, no. 9, 2011.

[24] P Annangi, S Thiruvenkadam, A Raja, Hao Xu, XiWen Sun, and Ling Mao, “A region
based active contour method for x-ray lung segmentation using prior shape and low
level features,” in Biomedical Imaging: From Nano to Macro, 2010 IEEE International
Symposium on. IEEE, 2010, pp. 892–895.

[25] Felice Andrea Pellegrino, Walter Vanzella, and Vincent Torre, “Edge detection revis-
ited,” Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on,
vol. 34, no. 3, pp. 1500–1518, 2004.

[26] Milan Sonka, Vaclav Hlavac, and Roger Boyle, Image processing, analysis, and ma-
chine vision, Cengage Learning, 2014.

[27] Dzung L Pham, Chenyang Xu, and Jerry L Prince, “Current methods in medical
image segmentation 1,” Annual review of biomedical engineering, vol. 2, no. 1, pp.
315–337, 2000.

[28] Anil K Jain and Hong Chen, “Matching of dental x-ray images for human identifica-
tion,” Pattern recognition, vol. 37, no. 7, pp. 1519–1532, 2004.

[29] Jindan Zhou and Mohamed Abdel-Mottaleb, “A content-based system for human
identification based on bitewing dental x-ray images,” Pattern Recognition, vol. 38,
no. 11, pp. 2132–2142, 2005.

[30] Omaima Nomir and Mohamed Abdel-Mottaleb, “Human identification from dental
x-ray images based on the shape and appearance of the teeth,” Information Forensics
and Security, IEEE Transactions on, vol. 2, no. 2, pp. 188–197, 2007.

[31] Fazel Keshtkar and Wail Gueaieb, “Segmentation of dental radiographs using a swarm
intelligence approach,” in Electrical and Computer Engineering, 2006. CCECE’06.
Canadian Conference on. IEEE, 2006, pp. 328–331.

[32] Eyad Haj Said, Diaa Eldin M Nassar, Gamal Fahmy, and Hany H Ammar, “Teeth
segmentation in digitized dental x-ray films using mathematical morphology,” Infor-
mation Forensics and Security, IEEE Transactions on, vol. 1, no. 2, pp. 178–189, 2006.

[33] Dorin Comaniciu and Peter Meer, “Mean shift: A robust approach toward feature
space analysis,” Pattern Analysis and Machine Intelligence, IEEE Transactions on,
vol. 24, no. 5, pp. 603–619, 2002.

[34] Bing Nan Li, Chee Kong Chui, Stephen Chang, and Sim Heng Ong, “Integrating
spatial fuzzy clustering with level set methods for automated medical image segmen-
tation,” Computers in Biology and Medicine, vol. 41, no. 1, pp. 1–10, 2011.

[35] Laura Florea, Corneliu Florea, Constantin Vertan, and Alina Sultana, “Automatic
tools for diagnosis support of total hip replacement follow-up,” Advances in Electrical
and Computer Engineering, vol. 11, no. 4, pp. 55–62, 2011.

[36] TS Subashini, Vennila Ramalingam, and S Palanivel, “Automated assessment of breast
tissue density in digital mammograms,” Computer Vision and Image Understanding,
vol. 114, no. 1, pp. 33–43, 2010.

[37] TS Subashini, Vennila Ramalingam, and S Palanivel, “Breast mass classification based
on cytological patterns using rbfnn and svm,” Expert Systems with Applications, vol.
36, no. 3, pp. 5284–5290, 2009.

[38] I El-Feghi, Shanjin Huang, MA Sid-Ahmed, and M Ahmadi, “X-ray image segmen-
tation using auto adaptive fuzzy index measure,” in Circuits and Systems, 2004.
MWSCAS’04. The 2004 47th Midwest Symposium on. IEEE, 2004, vol. 3, pp. iii–499.

[39] Yanbin Peng, “Service discovery framework supported by em algorithm and bayesian
classifier,” Physics Procedia, vol. 33, pp. 206–211, 2012.

[40] Xianbin Wen, Hua Zhang, Jianguang Zhang, Xu Jiao, and Lei Wang, “Multiscale
modeling for classification of sar imagery using hybrid em algorithm and genetic algo-
rithm,” Progress in Natural Science, vol. 19, no. 8, pp. 1033–1036, 2009.

[41] Stephen G Eick, Michael C Nelson, and Jerry D Schmidt, “Graphical analysis of
computer log files,” Commun. ACM, vol. 37, no. 12, pp. 50–56, 1994.

[42] Mohammad Reza Zare, Ahmed Mueen, Woo Chaw Seng, and Mohammad Hamza
Awedh, “Combined feature extraction on medical x-ray images,” in Computational
Intelligence, Communication Systems and Networks (CICSyN), 2011 Third Interna-
tional Conference on. IEEE, 2011, pp. 264–268.

[43] A Mueen, M Sapiyan Baha, and R Zainuddin, “Multilevel feature extraction and x-ray
image classification,” Journal of Applied Sciences, pp. 1224–1229, 2007.

[44] Hayit Greenspan and Adi T Pinhas, “Medical image categorization and retrieval for
pacs using the gmm-kl framework,” Information Technology in Biomedicine, IEEE
Transactions on, vol. 11, no. 2, pp. 190–202, 2007.

[45] N S Usha, E Saranya, and S Praveenkumar, “A review based study of classification
of x-ray images using content based image retrieval (cbir),” International Journal of
Advanced Scientific Research Development (IJASRD), vol. 11, no. 2, pp. 190–202, 2011.

[46] Seyyed Mohammad Mohammadi, Mohammad Sadegh Helfroush, and Kamran Kazemi,
“Novel shape-texture feature extraction for medical x-ray image classification,” Int J
Innov Comput Inf Control, vol. 8, pp. 659–76, 2012.

[47] A Ray and Krishnendu Sasmal, “A new approach for clustering of x-ray images,”
2010.

[48] Mohammad Reza Zare, Woo Chaw Seng, and Ahmed Mueen, “Automatic classification
of medical x-ray images,” Malaysian Journal of Computer Science, vol. 26, no. 1, pp.
9–22, 2013.

[49] Nikhil R Pal and Debrup Chakraborty, “Mountain and subtractive clustering method:
improvements and generalizations,” International Journal of Intelligent Systems, vol.
15, no. 4, pp. 329–341, 2000.

[50] Mingqiang Yang, Kidiyo Kpalma, and Joseph Ronsin, “A survey of shape feature
extraction techniques,” Pattern recognition, pp. 43–90, 2008.

[51] Fatemeh Ghofrani, Mohammad Sadegh Helfroush, Habibollah Danyali, and Kamran
Kazemi, “Medical x-ray image classification using gabor-based cs-local binary pat-
terns,” in Int Conf Electron Biomed Eng Appl (ICEBEA), 2012, vol. 284, p. 8.

[52] Yimo Tao, Zhigang Peng, Arun Krishnan, and Xiang Sean Zhou, “Robust learning-
based parsing and annotation of medical radiographs,” Medical Imaging, IEEE Trans-
actions on, vol. 30, no. 2, pp. 338–350, 2011.

[53] B Jyothi, Y Madhavee Latha, PG Krishna Mohan, and VSK Reddy, “Medical im-
age retrieval using moments,” International Journal of Application or Innovation in
Engineering & Management (IJAIEM), vol. 2, no. 1, pp. 195–200, 2013.

[54] A Chalechale, A Bahari, and M Vatanchian, “Vision-based bone image recognition
using geometric properties,” Iranian Journal of Science & Technology, Transaction B:
Engineering, vol. 34, no. B6, pp. 597–604, 2010.

[55] Youness Mobssite, B Belhaouari Samir, and Ahmed Fadzil B Mohamad Hani, “Sig-
nal and image processing for early detection of coronary artery diseases: A review,”
in INTERNATIONAL CONFERENCE ON FUNDAMENTAL AND APPLIED SCI-
ENCES 2012:(ICFAS2012). AIP Publishing, 2012, vol. 1482, pp. 712–723.

[56] Moshe Aboud, Assaf B Spanier, and Leo Joskowicz, “Automatic classification of body
parts x-ray images,” .

[57] Jianmin Jiang, P Trundle, and Jinchang Ren, “Medical image analysis with artificial
neural networks,” Computerized Medical Imaging and Graphics, vol. 34, no. 8, pp.
617–631, 2010.

[58] KAG Udeshani, RGN Meegama, and TGI Fernando, “Statistical feature-based neural
network approach for the detection of lung cancer in chest x-ray images,” International
Journal of Image Processing (IJIP), vol. 5, no. 4, pp. 425, 2011.

[59] SN Deepa and B Aruna Devi, “A survey on artificial intelligence approaches for medical
image classification,” Indian Journal of Science and Technology, vol. 4, no. 11, pp.
1583–1595, 2011.

[60] M Obayya and Mohamed Ghandour, “Lung cancer classification using curvelet trans-
form and neural network with radial basis function,” International Journal of Com-
puter Applications, vol. 120, no. 13, 2015.

[61] Atta Elalfi, Mohamed Eisa, and Hosnia Ahmed, “Artificial neural networks in medical
images for diagnosis heart valve diseases,” IJCSI) International Journal of Computer
Science Issues, vol. 10, no. 5, pp. 83–90, 2013.

[62] Nikolaos Pagonis, Dionisis A Cavouras, Konstantinos Sidiropoulos, Georgios C
Sakellaropoulos, and Georgios C Nikiforidis, “Improving the classification accuracy of
computer aided diagnosis through multimodality breast imaging,” 2015.

[63] Fatemeh Ghofrani, Mohammad Sadegh Helfroush, Mahmoud Rashidpour, and Kamran
Kazemi, “Fuzzy-based medical x-ray image classification,” Journal of medical signals
and sensors, vol. 2, no. 2, pp. 73, 2012.

[64] NT Renukadevi and P Thangaraj, “Performance evaluation of svm–rbf kernel for
medical image classification,” Global Journal of Computer Science and Technology,
vol. 13, no. 4, 2013.

[65] Matteo Masotti, “A ranklet-based image representation for mass classification in dig-
ital mammograms,” Medical physics, vol. 33, no. 10, pp. 3951–3961, 2006.

[66] KP Aarthy and US Ragupathy, “Detection of lung nodule using multiscale wavelets
and support vector machine,” International Journal of Soft Computing and Engineer-
ing (IJSCE), vol. 2, no. 3, 2012.

[67] Saima Hassan, Jafreezal Jaafar, Brahim S Belhaouari, and Abbas Khosravi, “A new
genetic fuzzy system approach for parameter estimation of arima model,” in IN-
TERNATIONAL CONFERENCE ON FUNDAMENTAL AND APPLIED SCIENCES
2012:(ICFAS2012). AIP Publishing, 2012, vol. 1482, pp. 455–459.

[68] Ibrahim Zeiadan, Amr Zamel, and Ahmed Al Zohairy, “Clustering of medical x-ray
images by merging outputs of different classification techniques,” in CEUR Workshop
Proceedings, CEURWS. org, Toulouse, France (September 8-11 2015).

[69] Byoung Chul Ko, Seong Hoon Kim, and Jae-Yeal Nam, “X-ray image classification
using random forests with local wavelet-based cs-local binary patterns,” Journal of
digital imaging, vol. 24, no. 6, pp. 1141–1151, 2011.

[70] C Bhuvaneswari, P Aruna, and D Loganathan, “Classification of lung diseases by
image processing techniques using computed tomography images,” International Journal
of Advanced Computer Research, vol. 4, no. 1, pp. 87, 2014.

[71] Said Mahmoudi and Mohammed Benjelloun, “Corner points detection for vertebral
mobility analysis,” in Signal Processing and Communications, 2007. ICSPC 2007.
IEEE International Conference on. IEEE, 2007, pp. 1275–1278.

[72] Moritz Klüppel, Jian Wang, David Bernecker, Peter Fischer, and Joachim Hornegger,
“On feature tracking in x-ray images,” in Bildverarbeitung für die Medizin 2014, pp.
132–137. Springer, 2014.

[73] Faiz M Hasanuzzaman, Xiaodong Yang, and YingLi Tian, “Robust and effective
component-based banknote recognition for the blind,” Systems, Man, and Cyber-
netics, Part C: Applications and Reviews, IEEE Transactions on, vol. 42, no. 6, pp.
1021–1030, 2012.

[74] Nabeel Younus Khan, Brendan McCane, and Geoff Wyvill, “Sift and surf performance
evaluation against various image deformations on benchmark dataset,” in Digital Im-
age Computing Techniques and Applications (DICTA), 2011 International Conference
on. IEEE, 2011, pp. 501–506.

[75] Dusty Sargent, Chao-I Chen, Chang-Ming Tsai, Yuan-Fang Wang, and Daniel Kop-
pel, “Feature detector and descriptor for medical images,” in SPIE Medical Imaging.
International Society for Optics and Photonics, 2009, pp. 72592Z–72592Z.

[76] Fabian Lecron, Mohammed Benjelloun, and Saı̈d Mahmoudi, “Fully automatic verte-
bra detection in x-ray images based on multi-class svm,” in SPIE Medical Imaging.
International Society for Optics and Photonics, 2012, pp. 83142D–83142D.

[77] Mohamed Amine Larhmam, Mohammed Benjelloun, and Saı̈d Mahmoudi, “Vertebra
identification using template matching modelmp and k-means clustering,” Interna-
tional journal of computer assisted radiology and surgery, vol. 9, no. 2, pp. 177–187,
2014.

[78] K Balachandran and Dr R Anitha, “Supervisory expert system approach for pre-
diagnosis of lung cancer,” Published in International Journal of Advanced Engineering
& Applications, 2010.

[79] Samuel Febrianto Kurniawan, I Ketut Gede Darma Putra, and AA Kompiang Oka
Sudana, “Bone fracture detection using OpenCV,” Journal of Theoretical & Applied
Information Technology, vol. 64, no. 1, 2014.

[80] SK Mahendran and S Santhosh Baboo, “Ensemble systems for automatic fracture
detection,” IACSIT International Journal of Engineering and Technology, vol. 4, no.
1, 2012.

[81] SK Mahendran and S Santhosh Baboo, “An enhanced tibia fracture detection tool
using image processing and classification fusion techniques in x-ray images,” Global
Journal of Computer Science and Technology (GJCST), vol. 11, no. 14, pp. 23–28,
2011.

[82] Joshua Congfu He, Wee Kheng Leow, and Tet Sen Howe, “Hierarchical classifiers for
detection of fractures in x-ray images,” in Computer Analysis of Images and Patterns.
Springer, 2007, pp. 962–969.

[83] Tian Tai Peng et al., “Detection of femur fractures in x-ray images,” Master of Science
Thesis, National University of Singapore, 2002.

[84] Uri Avni, Hayit Greenspan, Eli Konen, Michal Sharon, and Jacob Goldberger, “X-ray
categorization and retrieval on the organ and pathology level, using patch-based visual
words,” Medical Imaging, IEEE Transactions on, vol. 30, no. 3, pp. 733–746, 2011.

[85] Bram Van Ginneken, Shigehiko Katsuragawa, Bart M ter Haar Romeny, Max
Viergever, et al., “Automatic detection of abnormalities in chest radiographs using
local texture analysis,” Medical Imaging, IEEE Transactions on, vol. 21, no. 2, pp.
139–149, 2002.

[86] SK Mahendran, “A comparative study on edge detection algorithms for computer
aided fracture detection systems,” International Journal of Engineering & Innovative
Technology (ISSN: 2249-0604), vol. 2, no. 5, 2012.

[87] Mahmoud Al-Ayyoub, Ismail Hmeidi, and Haya Rababah, “Detecting hand bone frac-
tures in x-ray images,” Journal of Multimedia Processing and Technologies (JMPT),
vol. 4, no. 3, pp. 155–168, 2013.

[88] Hum Yan Chai, Lai Khin Wee, Tan Tian Swee, and Sheikh Hussain, “GLCM based
adaptive crossed reconstructed (ACR) k-mean clustering hand bone segmentation,”
pp. 192–197, 2011.

[89] M Gomathi and P Thangaraj, “A computer aided diagnosis system for detection
of lung cancer nodules using extreme learning machine,” International Journal of
Engineering Science and Technology, vol. 2, no. 10, pp. 5770–5779, 2010.

[90] Jia Tong, Zhao Da-Zhe, Wei Ying, Zhu Xin-Hua, and Wang Xu, “Computer-aided
lung nodule detection based on ct images,” in Complex Medical Engineering, 2007.
CME 2007. IEEE/ICME International Conference on. IEEE, 2007, pp. 816–819.

[91] Vinod Kumar and Anil Saini, “Detection system for lung cancer based on neural
network: X-ray validation performance,” International Journal of Enhanced Research
in Management & Computer Applications, ISSN, pp. 2319–7471, 2013.

[92] Vinod Kumar and Kanwal Garg, “Neural network based approach for detection of ab-
normal regions of lung cancer in x-ray image,” in International Journal of Engineering
Research and Technology. ESRSA Publications, 2012, vol. 1.

[93] Mohammad Reza Zare, Abdullah Mueen, and Woo Chaw Seng, “Automatic classifi-
cation of medical x-ray images using a bag of visual words,” Computer Vision, IET,
vol. 7, no. 2, pp. 105–114, 2013.

[94] Ivica Dimitrovski, Dragi Kocev, Suzana Loskovska, and Sašo Džeroski, “Hierarchical
annotation of medical images,” Pattern Recognition, vol. 44, no. 10, pp. 2436–2449,
2011.

[95] Riadh Bouslimi, Abir Messaoudi, and Jalel Akaichi, “Using a bag of words
for automatic medical image annotation with a latent semantic,” arXiv preprint
arXiv:1306.0178, 2013.

[96] Riadh Bouslimi and Jalel Akaichi, “Using hausdorff distance for new medical image
annotation,” arXiv preprint arXiv:1203.1793, 2012.

[97] Ivica Dimitrovski, Dragi Kocev, Suzana Loskovska, and Sašo Džeroski, “Imageclef
2009 medical image annotation task: Pcts for hierarchical multi-label classification,”
in Multilingual Information Access Evaluation II. Multimedia Experiments, pp. 231–
238. Springer, 2010.

[98] Tianxia Gong, Shimiao Li, Chew Lim Tan, Boon Chuan Pang, CC Tchoyoson Lim,
Cheng Kiang Lee, Qi Tian, and Zhuo Zhang, “Automatic pathology annotation on
medical images: A statistical machine translation framework,” in Pattern Recognition
(ICPR), 2010 20th International Conference on. IEEE, 2010, pp. 2504–2507.

[99] Dengsheng Zhang, Md Monirul Islam, and Guojun Lu, “A review on automatic image
annotation techniques,” Pattern Recognition, vol. 45, no. 1, pp. 346–362, 2012.

[100] Devrim Ünay, Octavian Soldea, Süreyya Akyüz, Müjdat Çetin, and Aytül Erçil, “Med-
ical image retrieval and automatic annotation: Vpa-sabanci at imageclef 2009,” The
Cross-Language Evaluation Forum (CLEF), 2009.

[101] Mohammad Reza Zare, Ahmed Mueen, and Woo Chaw Seng, “Automatic medical
x-ray image classification using annotation,” Journal of digital imaging, vol. 27, no. 1,
pp. 77–89, 2014.

[102] Shimiao Li, Tianxia Gong, Jie Wang, Ruizhe Liu, Chew Lim Tan, Tze Yun Leong,
Boon Chuan Pang, CC Tchoyoson Lim, Cheng Kiang Lee, Qi Tian, et al., “Tbidoc:
3d content-based ct image retrieval system for traumatic brain injury,” in SPIE Medical
Imaging. International Society for Optics and Photonics, 2010, pp. 762427–762427.

[103] Richard O Duda, Peter E Hart, and David G Stork, Pattern classification, John Wiley
& Sons, 2012.

LIST OF PUBLICATIONS

REFEREED INTERNATIONAL JOURNALS

• Sumathi Ganesan, T.S. Subashini, “An Approach towards the efficient indexing
and retrieval on medical X-Ray images”, in International Journal of Computer
Applications (ISSN: 0975-8887), vol. 76, 2013.

• Sumathi Ganesan, T.S. Subashini, “Classification of medical X-ray images for
automated annotation,” in Journal of Theoretical and Applied Information
Technology (ISSN: 1992-8645), vol. 63, 2014.

• Sumathi Ganesan, T.S. Subashini and E. Pavendhan, “Automated Annotation
of X-ray images using Statistical Moment Features”, in International Journal of
Applied Engineering Research (ISSN: 0973-4562), vol. 9, 2014 (Annexure-II).

• Sumathi Ganesan, T.S. Subashini, “Orientation of Medical X-Ray Images using
Harris Corner Detector and Speed up Robust Features Algorithm”, in International
Advanced Research Journal in Science, Engineering and Technology
(ISSN: 2394-1588), vol. 2, 2015.

• Sumathi Ganesan, T.S. Subashini, “View Classification of Medical X-ray Images
using PNN classifier, Decision Tree Algorithm and SVM Classifier”, in International
Journal of Research in Engineering and Technology (ISSN: 2321-7308),
vol. 4, 2015 (Annexure-II).

• Sumathi Ganesan, T.S. Subashini, “Classification of medical X-ray Images using
PNN and SVM Classifier”, in International Research Journal of Emerging Trend
in Multidisciplinary (ISSN: 2395-4434), vol. 1, 2015.

• Sumathi Ganesan, T.S. Subashini, “Detection of three Different Views of Medical
X-ray Images using Harris Corner Detector and Decision Tree”, communicated to
International Journal of Enterprise Network Management (ISSN: 1748-1260),
March 2015 (Annexure-II).
PRESENTATIONS IN INTERNATIONAL CONFERENCES

• Sumathi Ganesan, T.S. Subashini and E. Saranya, “Classification of X-ray images
for image retrieval in medical applications”, in International Conference on
Computational Intelligence and Advanced Manufacturing Research, Vels University,
Chennai, 5th and 6th April 2013.

• Sumathi Ganesan, T.S. Subashini and M. Karthikeyan, “Content based image
retrieval system for X-ray images”, in International Conference on Computer
Science and Engineering, organised by Interscience Research Network, 14th April
2013.

• Sumathi Ganesan and T.S. Subashini, “Automatic X-ray Image Classification
using shape features based on edge oriented histogram”, in World Conference on
Infectious Diseases, organised by the Jaya Charitable and Educational Trust,
Tamil Nadu, India, ISBN 978-81-928547-0-0, December 2013.

• Sumathi Ganesan, T.S. Subashini and K. Jayalakshmi, “Classification of X-rays
using statistical moments and SVM”, in 3rd IEEE International Conference on
Communication and Signal Processing, organised by AdhiParasakthi Engineering
College, 3rd to 5th April 2014.

• Sumathi Ganesan, T.S. Subashini, “Breast Density Classification for content
based image retrieval using Support vector machine”, in International Conference
on Synchronizing Management Theories and Business Practices, Challenges Ahead,
27th to 29th July 2012.

• Sumathi Ganesan, T.S. Subashini and E. Saranya, “Feature Extraction for content
based image retrieval in X-ray Images using Shape and texture features”, in
Second International Conference on Futuristic Trends in Computer Science
Engineering and Information Technology, organised by Thiruvalluvar College of
Engineering and Technology, Vandavasi, 2nd and 3rd March 2013.

• Sumathi Ganesan, T.S. Subashini, “Classification of medical X-ray images using
PNN and SVM classifier”, in International Conference on Science, Technology,
Engineering and Management, organised by Jeppiaar Engineering College, 26th
and 27th March 2015.

PRESENTATIONS IN NATIONAL CONFERENCES

• Sumathi Ganesan, T.S. Subashini and S. Sabarathinam, “Unique Code Classification
of X-ray Images”, in National Conference on Computing Techniques, University
College of Engineering, Villupuram, 5th April 2013.

• Sumathi Ganesan, T.S. Subashini and E. Saranya, “An ontological study of Content
based image retrieval in X-ray Images”, in National Conference on Computing
Techniques, University College of Engineering, Villupuram, 5th April 2013.

• Sumathi Ganesan, T.S. Subashini and E. Saranya, “Classification of X-ray images
for content based image retrieval using SVM,” in National Conference on Advances
in Computer Science and Applications, organised by R.M.D. Engineering College,
9th March 2013.

• Sumathi Ganesan, T.S. Subashini, “Abnormality detection of medical X-ray images
using SVM classifier,” in National Conference on Multimedia Signal Processing,
organised by the Department of Computer Science and Engineering, 12th and 13th
February 2016.