
Automatic screening of fundus images using a combination of convolutional neural network and hand-crafted features

Balazs Harangi, Member, IEEE, Janos Toth, Member, IEEE, Agnes Baran, and Andras Hajdu, Senior Member, IEEE

Abstract— Diabetic retinopathy (DR) and especially diabetic macular edema (DME) are common causes of vision loss as complications of diabetes. In this work, we consider an ensemble that organizes a convolutional neural network (CNN) and traditional hand-crafted features into a single architecture for retinal image classification. This approach allows the joint training of the CNN and the fine-tuning of the weights of the hand-crafted features to provide a final prediction. Our solution is dedicated to the automatic classification of fundus images according to the severity level of DR and DME. For an objective evaluation of our approach, we have tested its performance on the official test datasets of the IEEE International Symposium on Biomedical Imaging (ISBI) 2018 Challenge 2: Diabetic Retinopathy Segmentation and Grading Challenge, section B. Disease Grading: Classification of fundus images according to the severity level of diabetic retinopathy and diabetic macular edema. Based on testing on the Indian Diabetic Retinopathy Image Dataset (IDRiD), the classification accuracies have been found to be 90.07% for the 5-class DR challenge and 96.85% for the 3-class DME one.

Index Terms— diabetic retinopathy screening, hand-crafted features, deep learning, ensemble learning

I. INTRODUCTION

Diabetic Retinopathy (DR) is an eye disease associated with long-standing diabetes and the leading cause of blindness in the working-age population of the developed world. In 2015, an estimated 30.3 million people of all ages (or 9.4% of the population) had diabetes in the U.S., and the World Health Organization estimates that 422 million people have the disease worldwide [1]. Around 40% to 45% of U.S. citizens with diabetes have some stage of DR. Progression to vision impairment can be slowed down or prevented if DR is detected in time; however, detection can be difficult, as the disease often shows few symptoms until it is too late to provide efficient treatment.

Clinical experts can identify DR by the presence of lesions associated with the vascular abnormalities caused by the disease. While this approach is effective, its resource needs are high, and there is a lack of expertise and required equipment in many areas where the rate of diabetes is high. As the number of individuals with diabetes grows, the infrastructure needed to prevent blindness due to DR will become more and more difficult to access. The need for a comprehensive and automated method of DR screening has long been recognized, and previous efforts have made good progress using image classification, pattern recognition, and machine learning. Recently, deep learning has been increasingly applied in medical image analysis, and a number of methods have been proposed in which CNNs are used for microaneurysm or exudate detection (e.g., [2], [3]).

Last year, the organizers of the 2018 ISBI Challenge on Diabetic Retinopathy - Segmentation and Grading sub-challenge 2 called for participation in developing efficient methods to classify RGB fundus images according to the severity level of DR and diabetic macular edema (DME). DR refers to a clinical diagnosis characterized by the presence of one or more retinal lesions such as microaneurysms (MA), hemorrhages (HE), hard exudates (EX), and soft exudates (SE). DME is a complication associated with DR in which retinal thickening or accumulation of fluid can occur at any stage of DR. The risk of having DME is classified into no risk and two probable risks based on the location of HEs. According to the instructions of the challenge, two classification tasks had to be addressed. First, the input image had to be classified according to the five different stages of DR. Second, it had to be labeled by the three stages of DME (see Fig. 1 for examples of the different DR/DME stages).

In this paper, we combine powerful, self-extracted, CNN-based features with traditional, hand-crafted ones in a single framework to enhance classification performance. Namely, we have extracted 68 conventional features (see Section II for details) and modified the CNN AlexNet [4] to compose a combined system, where the traditional features are embedded into the CNN via a fully-connected (FC) layer. In this way, we have created a single network architecture, which can be trained by back-propagation in the usual way for neural nets. During the training stage, the pre-trained AlexNet is fine-tuned and the weights of the hand-crafted features are optimized together. Our experimental evaluation was performed on the publicly available IDRiD database [5], collated by a retinal specialist in Nanded, Maharashtra, India.
*This work was supported in part by the projects EFOP-3.6.2-16-2017-00015 and EFOP-3.6.3-VEKOP-16-2017-00002, supported by the European Union and co-financed by the European Social Fund.
Balazs Harangi, Agnes Baran, Andras Hajdu and Janos Toth are with the Faculty of Informatics, University of Debrecen, POB 400, H-4002, Debrecen, Hungary (phone: +36-52-512-900/75121; e-mail: harangi.balazs, baran.agnes, hajdu.andras, toth.janos@inf.unideb.hu).

II. METHODOLOGY

In this section, we give an overview of the image level and lesion specific methods used to extract the traditional features considered in our system. We also describe how the CNN and conventional features are fused in a single architecture.

978-1-5386-1311-5/19/$31.00 ©2019 IEEE 2699


Fig. 1. Sample images from the training dataset of the ISBI 2018 challenge: (a) DR0 and DME0; (b) DR1 and DME1; (c) DR2 and DME2; (d) DR3; (e) DR4.

A. AM-FM Based Image Level Feature Extraction

The amplitude-frequency modulation (AM-FM) method extracts information from an image by decomposing its green channel at different scales into AM-FM components [6]. The different scales are obtained using a set of twenty-five band-pass channel filters associated with four frequency scales. After all image features are extracted, k-means clustering is used to cluster this information into 30 groups. As a result, a feature vector is obtained which reflects the intensity, geometry, and texture of the structures contained in the image [7].

B. Lesion Specific Feature Extraction

To extract lesion specific features related to MAs and EXs, we used two detector ensembles, each consisting of a set of ⟨preprocessing method, candidate extractor⟩ pairs (⟨PP, CE⟩ for short) organized into a voting system. Such a ⟨PP, CE⟩ pair is formed by applying the PP to the retinal image and the CE to its output. In this way, a ⟨PP, CE⟩ pair extracts a set of lesion candidates from the input image, acting like a single detector algorithm.

1) Number of microaneurysms: MAs are the earliest signs of DR and indicators of its development; therefore, their number is a key feature in DR classification. MAs are swellings of the capillaries and appear as small red dots; however, their detection is difficult due to their similarity to vessel fragments. The combined output of the MA detector ensemble is obtained in the following manner: the candidates of the result sets of the ⟨PP, CE⟩ pairs are merged if their Euclidean distances are smaller than a predefined value. To each joint candidate of the ensemble, a confidence value is assigned showing the ratio of the number of ⟨PP, CE⟩ pairs suggesting this candidate to the total number of pairs of the ensemble.

To construct the ⟨PP, CE⟩ pairs of the ensemble, we relied on the results of Antal and Hajdu [8]. Five PPs (contrast limited adaptive histogram equalization (CLAHE) [9], illumination equalization (IE) [10], vessel removal with inpainting (VR) [11], Walter-Klein contrast enhancement (WK) [12], and no preprocessing (NP) for formal reasons) and three CEs (the methods of Lazar et al. [13], Walter et al. [14], and Zhang et al. [15]) were used; the pairs of our MA ensemble are listed in Table I (a). Finally, the result candidate set of the ensemble is thresholded at six different confidence levels, and the number of candidates is counted at each level in order to obtain six features.

2) Exudate specific features: Features related to EXs are important in assessing the severity of non-proliferative DR and the risk of DME. EXs are lipid residues of serous leakage from damaged capillaries and appear as bright spots with varied shapes in retinal images. The combined output of the EX detector ensemble is obtained as follows: each ⟨PP, CE⟩ pair produces a binary mask that contains EX candidates. Then, a probability map is generated using the output of the different pairs, where the probability of a pixel belonging to an EX is determined by the ratio of the number of pairs that marked the pixel as EX to the total number of pairs.

For the construction of this ensemble, we used the results of Nagy et al. [16]. Four PPs (gray-world normalization (GN) [17], illumination equalization (IE) [10], morphological contrast enhancement (MC) [18], and vessel removal with inpainting (VR) [11]) and three CEs (the methods of Sopharak et al. [19], Walter et al. [20], and Welfer et al. [21]) were used. The eight ⟨PP, CE⟩ pairs used in the ensemble are shown in Table I (b).

TABLE I
THE ⟨PP, CE⟩ PAIRS OF THE (a) MA AND (b) EX DETECTOR ENSEMBLES.

(a)
No.  PP     CE
1    NP     Lazar et al.
2    CLAHE  Lazar et al.
3    IE     Lazar et al.
4    WK     Lazar et al.
5    NP     Walter et al.
6    CLAHE  Walter et al.
7    NP     Zhang et al.
8    VR     Zhang et al.
9    WK     Zhang et al.

(b)
No.  PP  CE
1    GN  Sopharak et al.
2    MC  Sopharak et al.
3    VR  Sopharak et al.
4    IE  Walter et al.
5    MC  Walter et al.
6    VR  Walter et al.
7    IE  Welfer et al.
8    MC  Welfer et al.

Finally, the result set of the ensemble is thresholded at eight different confidence levels, and the following features are calculated at each level to obtain a total of 32 features: the ratio of all EX pixels to ROI pixels, the number of EXs (8-connected components), the ratio of the largest EX (8-connected component) to the ROI, and the ratio of the average EX size to the ROI.
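The voting scheme shared by both detector ensembles can be sketched in a few lines: each ⟨PP, CE⟩ pair yields a binary lesion mask, the per-pixel confidence is the fraction of pairs marking that pixel, and features are read off at fixed confidence thresholds. The following toy example is only an illustration of this idea, not the authors' implementation; the mask sizes, threshold values, and helper names (`probability_map`, `lesion_pixel_ratio`) are hypothetical stand-ins.

```python
def probability_map(masks):
    """Per-pixel ratio of <PP, CE> pairs that marked the pixel as lesion."""
    n = len(masks)
    h, w = len(masks[0]), len(masks[0][0])
    return [[sum(m[y][x] for m in masks) / n for x in range(w)] for y in range(h)]

def lesion_pixel_ratio(prob, threshold, roi_pixels):
    """Ratio of pixels at or above the confidence threshold to the ROI size."""
    hits = sum(1 for row in prob for p in row if p >= threshold)
    return hits / roi_pixels

# Toy 2x3 binary masks produced by three hypothetical <PP, CE> pairs:
masks = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 0, 0]],
]
prob = probability_map(masks)  # e.g. prob[0][0] == 1.0 (all pairs agree)
features = [lesion_pixel_ratio(prob, t, roi_pixels=6)
            for t in (0.25, 0.5, 0.75, 1.0)]  # one feature per confidence level
```

In the paper's pipeline, analogous thresholded statistics (candidate counts for MAs, pixel and component ratios for EXs) form the 68-dimensional hand-crafted feature vector.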

Fig. 2. Visual explanation of the hand-crafted feature extraction and their combination with CNN features.
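The combination pictured in Fig. 2 can be illustrated with a small numerical sketch: the CNN feature vector and the hand-crafted feature vector are weighted by trainable matrices A1 and A2, summed into class scores, and passed through a softmax, with a mean squared error loss for training. The dimensions and values below are toy stand-ins (the paper uses 4096 CNN features, 68 hand-crafted features, and 5 or 3 classes), and the helper names are hypothetical; this is a sketch of the mechanism, not the authors' code.

```python
import math

def matvec_T(A, v):
    """Compute A^T v, with A stored as len(v) rows of n_classes entries."""
    n_classes = len(A[0])
    return [sum(A[i][j] * v[i] for i in range(len(v))) for j in range(n_classes)]

def softmax(z):
    """Numerically stable softmax over the class scores."""
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def mse_loss(preds, targets):
    """Mean squared error over M samples: sum of squared errors / (2M)."""
    M = len(preds)
    return sum((p - t) ** 2
               for yp, yt in zip(preds, targets)
               for p, t in zip(yp, yt)) / (2 * M)

# Toy sizes: 4 "CNN" features, 2 "hand-crafted" features, 3 classes.
y1 = [0.5, -0.2, 0.1, 0.3]   # stand-in for the 4096-dim AlexNet vector
y2 = [1.0, 0.0]              # stand-in for the 68-dim hand-crafted vector
A1 = [[0.1, 0.0, 0.2], [0.0, 0.3, 0.0], [0.4, 0.1, 0.0], [0.2, 0.0, 0.1]]
A2 = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]

scores = [a + b for a, b in zip(matvec_T(A1, y1), matvec_T(A2, y2))]  # y = A1^T y1 + A2^T y2
probs = softmax(scores)  # predictive class probabilities
```

In training, the entries of A1 and A2 (and the CNN weights upstream) would be updated jointly by backpropagation through this sum, which is what allows the hand-crafted feature weights to be fine-tuned together with AlexNet.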

C. The Fusion of CNN and Hand-Crafted Features

To build our network, we have extended the FC layer $FC_{fuse}$ of AlexNet, which originally contains 4096 neurons and precedes the FC layer $FC_{class}$ that reduces these activations to class probabilities (5 classes for DR, 3 classes for DME). That is, we have added the 68-dimensional vector containing the traditional features to $FC_{fuse}$ to obtain 4164 features, as can be seen in the visual architecture of the proposed ensemble framework in Fig. 2. Then, the 4164×5 (resp. 4164×3) layer $FC_{class}$ is considered for the DR (resp. DME) classification task. In this way, both the final weighting $FC_{class}$ of the CNN/traditional features and the 4096 AlexNet features could be trained by backpropagation. For a more precise formal description, for a given training image $x^{(i)}$, let $y_1^{(i)}$ and $y_2^{(i)}$ denote the feature vectors provided by AlexNet and the traditional extractors, respectively. These two vectors are combined by $FC_{class}$ using the weight matrices $A_1$ and $A_2$ via

$$y^{(i)} = A_1^T y_1^{(i)} + A_2^T y_2^{(i)}, \quad (1)$$

where $A_1$ has size 4096×5 (resp. 4096×3), and $A_2$ has size 68×5 (resp. 68×3) for grading DR (resp. DME). As usual in CNN architectures, $FC_{class}$ is followed by a softmax layer to obtain the predictive class probabilities $\hat{y}^{(i)}$ for the image $x^{(i)}$, and by a final classification layer. To train our network, we have used the mean squared error loss function

$$L = \frac{1}{2M} \sum_{i=1}^{M} \left( \hat{y}^{(i)} - d^{(i)} \right)^2, \quad (2)$$

where $d^{(i)}$ is the ground truth label for the $i$th training image $x^{(i)}$, while $M$ is the cardinality of the training dataset. The entries of the weighting matrices $A_1$ and $A_2$ are trainable parameters of the network; they are initialized with random values.

III. EXPERIMENTAL RESULTS

For the experimental analyses, we have considered the dataset made available during the ISBI 2018 challenge [22]. The organizers shared data for both the DR and DME classification tasks. The DR task is a 5-class classification problem to assign grades 0 to 4 according to the severity of the disease, while the DME task is a 3-class one. In the Indian Diabetic Retinopathy Image Dataset [5], 413 images are available for training and 103 for testing for both classification problems. The number of training images is rather low for efficient deep learning, and there are too few images in the test set for a comprehensive performance evaluation. Thus, to increase the training data, we have included a large dataset (22,700 images) made publicly available in a 2015 DR challenge on Kaggle [23]. This dataset contains ground truth for the DR task, while the missing DME labels were assigned by a local ophthalmologist for 11,600 images of this set. For the measurement of system accuracy, we used all the images (516) from the IDRiD dataset. Thus, for training purposes we have used the Kaggle dataset, and tested classification performance on the ISBI 2018 data.

To evaluate our approach, we have considered the common error measure accuracy corresponding to a given confidence

level on the receiver operating characteristic curve, as it is the simplest way to measure the performance of our methods on the 5- and 3-class classification tasks. An image is a true case if the predicted label and the ground truth label match; otherwise, it is a false case. To calculate accuracy, the total number of true cases is divided by the total number of input images. In this way, the classification performances have been found to be 90.07% for the 5-class DR challenge and 96.85% for the 3-class DME one, respectively.

As a comparative analysis between our method and other approaches in this specific field, we have included in the quantitative comparison all the state-of-the-art approaches that were presented during the 2018 ISBI Challenge on Diabetic Retinopathy - Segmentation and Grading sub-challenge 2. Notice that in this challenge only a subset of the IDRiD dataset [5] was considered for the evaluation, and the organizers calculated the average of the 5- and 3-class classification accuracies for the final scoreboard. According to this metric, our approach (HarangiM2) finished in 4th place and outperforms 9 other solutions of the 13 submitted to the challenge, as can be seen in Table II.

We note here that some of the algorithms ranked higher in this challenge used information about the number of images in the different classes of the training set for the final decision. Such a bias can be an advantage in an artificial environment like a challenge, but it could not be exploited in a real-world application.

TABLE II
THE OFFICIAL RESULTS OF THE 2018 ISBI CHALLENGE ON DIABETIC RETINOPATHY - SEGMENTATION AND GRADING SUB-CHALLENGE 2.

No.  Team name     Authors                 Accuracy
1    Mammoth       Junyan Wu et al.        0.9322
2    SDNU          Xiaodan Sui et al.      0.8789
3    HarangiM1     Balazs Harangi et al.   0.8741
4    HarangiM2     Balazs Harangi et al.   0.8692
5    AVSASVA       Varghese Alex et al.    0.8426
6    VRT           Jaemin Son et al.       0.7554
7    LzyUNCC       Zhongyu Li et al.       0.5327
8    Py            Siddhesh Thakur et al.  0.5206
9    SZU           Xuechen Li et al.       0.4964
10   SS            K V Sai Sundar et al.   0.4843
11   ZJU-BII-SGEX  Xingzheng Lyu et al.    0.3995
12   deepdr        Ling Dai et al.         0.3148
13   NTHU-CVLab    Chih-Hsuan Liu et al.   0.1719

IV. CONCLUSION

In this work, we have presented an ensemble-based system dedicated to DR and DME grading. Our ensemble consists of a convolutional neural network (based on AlexNet) and several hand-crafted features traditionally used in fundus image analysis in recent decades. The hand-crafted features are inserted into the CNN architecture so that the network is trained in consideration of them as well. As for the performance of our approach, we have measured 90.07% classification accuracy for the 5-class DR task and 96.85% for the 3-class DME one using the required protocol of the challenge.

REFERENCES

[1] World Health Organization, Global report on diabetes. World Health Organization, 2016.
[2] J. Shan and L. Li, "A deep learning method for microaneurysm detection in fundus images," in Connected Health: Applications, Systems and Engineering Technologies (CHASE), 2016 IEEE First International Conference on. IEEE, 2016, pp. 357–358.
[3] J. H. Tan, H. Fujita, S. Sivaprasad, S. V. Bhandary, A. K. Rao, K. C. Chua, and U. R. Acharya, "Automated segmentation of exudates, haemorrhages, microaneurysms using single convolutional neural network," Information Sciences, vol. 420, pp. 66–76, 2017.
[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Commun. ACM, vol. 60, no. 6, pp. 84–90, 2017.
[5] P. Porwal, S. Pachade, R. Kamble, et al., "Indian Diabetic Retinopathy Image Dataset (IDRiD)," 2018.
[6] J. P. Havlicek, "AM-FM image models," Ph.D. dissertation, The University of Texas at Austin, 1996.
[7] C. Agurto, V. Murray, E. Barriga, S. Murillo, M. Pattichis, H. Davis, S. Russell, M. Abramoff, and P. Soliz, "Multiscale AM-FM methods for diabetic retinopathy lesion detection," IEEE Transactions on Medical Imaging, vol. 29, no. 2, pp. 502–512, Feb 2010.
[8] B. Antal and A. Hajdu, "Improving microaneurysm detection using an optimally selected subset of candidate extractors and preprocessing methods," Pattern Recognition, vol. 45, no. 1, pp. 264–270, 2012.
[9] K. Zuiderveld, "Contrast limited adaptive histogram equalization," in Graphics Gems IV, P. S. Heckbert, Ed. Academic Press Professional, Inc., 1994, pp. 474–485.
[10] A. Youssif, A. Ghalwash, and A. Ghoneim, "Comparative study of contrast enhancement and illumination equalization methods for retinal vasculature segmentation," in Cairo International Biomedical Engineering Conference, 2006, pp. 1–5.
[11] A. Criminisi, P. Perez, and K. Toyama, "Object removal by exemplar-based inpainting," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, 2003, pp. 721–728.
[12] T. Walter and J.-C. Klein, "Automatic detection of microaneurysms in color fundus images of the human retina by means of the bounding box closing," in Medical Data Analysis: Third International Symposium (ISMDA), 2002, pp. 210–220.
[13] I. Lazar and A. Hajdu, "Retinal microaneurysm detection through local rotating cross-section profile analysis," IEEE Transactions on Medical Imaging, vol. 32, no. 2, pp. 400–407, 2013.
[14] T. Walter, P. Massin, A. Erginay, R. Ordonez, C. Jeulin, and J.-C. Klein, "Automatic detection of microaneurysms in color fundus images," Medical Image Analysis, vol. 11, no. 6, pp. 555–566, 2007.
[15] B. Zhang, X. Wu, J. You, Q. Li, and F. Karray, "Detection of microaneurysms using multi-scale correlation coefficients," Pattern Recognition, vol. 43, no. 6, pp. 2237–2248, 2010.
[16] B. Nagy, B. Harangi, B. Antal, and A. Hajdu, "Ensemble-based exudate detection in color fundus images," in Symposium on Image and Signal Processing and Analysis, 2011, pp. 700–703.
[17] G. D. Finlayson, B. Schiele, and J. L. Crowley, "Comprehensive colour image normalization," in Computer Vision — ECCV'98, H. Burkhardt and B. Neumann, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 475–490.
[18] P. Soille, Morphological Image Analysis: Principles and Applications. Springer-Verlag, 2004.
[19] A. Sopharak, B. Uyyanonvara, S. Barman, and T. H. Williamson, "Automatic detection of diabetic retinopathy exudates from non-dilated retinal images using mathematical morphology methods," Comp. Med. Im. Grap., vol. 32, no. 8, pp. 720–727, 2008.
[20] T. Walter, J.-C. Klein, P. Massin, and A. Erginay, "A contribution of image processing to the diagnosis of diabetic retinopathy-detection of exudates in color fundus images of the human retina," Transactions on Medical Imaging, vol. 21, no. 10, pp. 1236–1243, 2002.
[21] D. Welfer, J. Scharcanski, and D. Marinho, "A coarse-to-fine strategy for automatically detecting exudates in color eye fundus images," Computerized Medical Imaging and Graphics, vol. 34, no. 3, pp. 228–235, 2010.
[22] "IEEE International Symposium on Biomedical Imaging 2018 Challenge 2: Diabetic Retinopathy Segmentation and Grading Challenge," https://idrid.grand-challenge.org/, accessed: 2019-02-10.
[23] "Kaggle Diabetic Retinopathy Detection competition," https://www.kaggle.com/c/diabetic-retinopathy-detection, accessed: 2019-02-10.

