https://doi.org/10.1007/s13246-020-00890-3
SCIENTIFIC PAPER
Abstract
Diabetic retinopathy (DR) is a complication of diabetes mellitus that damages the blood vessels in the retina. DR is considered
a serious vision-threatening impediment that most diabetic subjects are at risk of developing. Effective automatic detection
of DR is challenging. Feature extraction plays an important role in the effective classification of disease. Here we focus on
a feature extraction technique that combines two feature extractors, speeded up robust features and binary robust invariant
scalable keypoints, to extract the relevant features from retinal fundus images. The selection of top-ranked features using
the MR-MR (maximum relevance-minimum redundancy) feature selection and ranking method enhances the efficiency of
classification. The system is evaluated across various classifiers, such as support vector machine, Adaboost, Naive Bayes,
Random Forest, and multi-layer perceptron (MLP), using image features extracted from standard datasets (IDRiD,
MESSIDOR, and DIARETDB0). The performances of the classifiers were analyzed by comparing their specificity, precision,
recall, false positive rate, and accuracy values. We found that, with the proposed feature extraction and selection technique,
MLP outperforms all the other classifiers for all datasets in binary and multiclass classification.
Keywords DR detection · Retinal fundus images · SURF · BRISK · Feature selection and ranking · MR-MR method · 10-fold cross validation
Physical and Engineering Sciences in Medicine
Fig. 1 Different stages of DR in fundus images
these form yellow flecks in the retina, called exudates. MAs and HMs are categorized as dark lesions and exudates as bright lesions. It is necessary to implement a technique that can extract both of these lesion types. Intra-retinal microvascular abnormalities (IRMAs)[4] are among the most important factors for detecting DR. IRMAs involve the abnormal branching or expansion of retinal blood vessels.

DR is difficult to cure once it has progressed to an advanced stage. Ultimately, DR leads to complete vision loss. It is important to reduce the global prevalence of DR. To achieve this, many techniques have been developed to detect DR in its early stages. It is important to make the technique as accurate as possible, as well as to reduce the implementation cost. There are already many effective algorithms for the detection of macular edema from retinal optical coherence tomography (OCT) images [5]. A subject who has reached PDR will have progressed gradually through all the other stages of DR, ultimately leading to vision loss. In this regard, early detection of DR in its primary stage (mild NPDR), or at least in the moderate NPDR stage, is important. In the primary stage, the lesions appear, but they might not be clearly visible and are difficult to differentiate from normal retinal conditions. Extracting each of the lesions in the retina is a difficult task, and it makes DR detection more complicated. For accurate automated detection, highly efficient feature extraction techniques are needed to get the most reliable features from the fundus image that can differentiate normal and DR images. This work concentrates on such an efficient automated feature learning method that can be used to extract features from retinal fundus images for binary and multiclass classification. If all features that are related to DR can be automatically extracted, then the detection process will become easier. For that, artificial intelligence will be useful. Here we propose a simple and efficient algorithm for binary and multiclass classification of DR from fundus images.

Related works

There have been many studies addressing DR detection and grading the severity of the disease. The probabilistic latent semantic analysis (PLSA) technique has been combined with the Bag of Visual Words (BoVW) method to separate diseased and normal retinal images using the SVM (support vector machine) classifier[6]. A comparative study of the different feature extraction techniques is conducted in[7]. According to these authors' analysis, SIFT (scale invariant feature transform), SURF (speeded up robust features), and BRISK (binary robust invariant scalable keypoints) are the most scale-invariant feature detectors and cope with widespread scale variations. In[8], SURF and BoVW are used to extract features, which are given to the SVM classifier for further classification, yielding an accuracy of around 94%. Segmentation of retinal vessels using the Adaboost classifier is evaluated in[9]. This feature-based classifier was trained and tested with the DRIVE database and achieved an area under the ROC (receiver operating characteristic) curve of 0.9561. The detection and classification of glaucoma from retinal images by combining clinical and multiresolution features is described in[10]. A fast feature extraction algorithm using SURF and BRISK
is proposed in[11]; the extracted features are then matched using k-Nearest Neighbours (k-NN) for a retina identification task. This feature extraction concept can be adapted here, as it combines two strong local feature extraction techniques. A decision support system for early detection of DR is introduced in[12]. The system was developed with Gabor and Discrete Fourier Transform (DFT) attributes. Then, spectral regression discriminant analysis is used to perform the dimensionality reduction. Random forest and logistic regression classifiers are used for the classification. For bright lesion detection, feature extraction using SIFT is applied in[13]. Then, dimensionality reduction is carried out using Laplacian Eigen (LE) maps. In[14], local features of retinal images are extracted using Local Binary Patterns (LBP). Then, this is evaluated across Artificial Neural Network (ANN), Random Forest, and SVM for the detection task. In[15], feature selection from the retinal OCT images is performed by the Laplacian score (L-score) method for angle-closure glaucoma detection. Then, the maximum relevance-minimum redundancy (MR-MR) method is used for dimensionality reduction. The classification was performed using the AdaBoost classifier. A sparse coding technique with linear SVM for retinal image classification is proposed in[16]. These authors make use of the BoVW technique for feature extraction. According to their evaluation, a dictionary size of 100 achieves better sensitivity and specificity. A data fusion method with a meta-SVM classifier for DR detection is implemented in[17]. In[18], a scanning window analysis (SWA) and the hybrid method of morphology are applied for retinal feature extraction. Principal component analysis (PCA) is adapted to locate the optic disc for retinal feature extraction in[19]. Also, to detect the disc boundary, a modified active shape model (ASM) is proposed. In[20], PCA is used for localization of the optic disc and segmentation is based on a Markov random field. In[21], the Naive Bayes classifier is used to classify DR and normal images, which yielded higher accuracy than SVM. In[22], the DR classification was done using a fuzzy image processing technique. The evaluation is performed for k-NN, Polynomial and RBF (Radial Basis Function) kernel SVM, and a naive Bayes classifier, of which the k-NN classifier showed the best performance.

Methodology

The proposed method for detecting DR at its earlier stage consists of five steps: database selection, feature extraction, dimensionality reduction, classification, and performance analysis. The proposed method is illustrated in Fig. 2. SURF and BRISK techniques are applied to extract the features from the input image. The feature descriptors in this work mainly focus on the local features, which means local regions in the image. All of the features obtained from the extractors are passed as input to the MR-MR technique for feature selection and ranking. Here, the features are ranked according to their relevancy, and then the most relevant 30 features are selected while maintaining minimum redundancy. This, in turn, enhances the speed of the classifier. In this work, the strongest 30 features of each image are selected as input to the classifier for training. The feature selection is performed according to the relevancy of a feature in describing the characteristics of the image. Then, 10-fold cross-validation is applied for validating the classifier. The script for the work is written in Python 3.7.

Feature extraction techniques

Feature extraction is an important task that increases the efficiency of the whole system. In the proposed work, a combination of two local feature extractors (SURF and BRISK) is used. Local features illustrate the local properties of an image (i.e., they extract a set of salient points that can be detected repeatedly from the same image irrespective of scale variance, illumination variance, and orientation variance and describe the gradient properties around the salient points). Thus, an automatic feature learning technique is implemented using SURF and BRISK.

Speeded up robust features (SURF)

As discussed, SURF[23] is an efficient feature detector and descriptor that can be applied for object recognition and classification.
The steps involved in this feature extraction are:

1. Selection of interest points
2. Extracting the descriptors
3. Matching descriptor vectors of different images

For interest point selection, it is necessary to filter the images using box filters. The filtering time can be reduced if we use integral images[23] instead of the original images. Each pixel of an integral image is calculated from the original image by summing the pixels above and to the left of it. Then, the convolution with the box filters produces a matrix, called a Hessian matrix. This matrix is used to find points of interest. The points are selected such that the determinant of the Hessian matrix is maximum. Let us assume a point in the image as s = (p, q). Then, the Hessian matrix H(s, σ) in s at scale σ is given in Eq. (2). The scaling is applied by using box filters of different sizes (mainly up-scaling the size) without changing the image size. Then the approximate scale can be calculated using Eq. (1).
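The integral-image step described for SURF interest point selection can be sketched in a few lines. This is a minimal pure-Python illustration and not the authors' script: the function names `integral_image` and `box_sum` are hypothetical, and the table is padded with an extra zero row and column (an implementation choice assumed here) so that any box sum needs only four lookups.

```python
def integral_image(img):
    """Summed-area table padded with a zero row and column.

    ii[r][c] holds the sum of all pixels img[y][x] with y < r and x < c.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current row up to column c
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Toy 3 x 3 "image" used only for illustration.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
total = box_sum(ii, 0, 0, 3, 3)   # whole image
centre = box_sum(ii, 1, 1, 3, 3)  # lower-right 2 x 2 block: 5 + 6 + 8 + 9
```

Because every box sum costs four lookups regardless of the box size, larger box filters can be applied to the same image without extra cost, which is why the scale is changed through the filter size rather than by resizing the image.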
Fig. 2 Proposed method (diagram labels recovered: Retinal fundus image, Classifier, Performance evaluation)
$$\sigma_{approx} = \text{current filter size} \times \left( \frac{\text{initial filter scale}}{\text{initial filter size}} \right) \tag{1}$$

$$H(s, \sigma) = \begin{bmatrix} L_{pp}(s, \sigma) & L_{pq}(s, \sigma) \\ L_{pq}(s, \sigma) & L_{qq}(s, \sigma) \end{bmatrix} \tag{2}$$

where $L_{pp}(s, \sigma)$ is the convolution of the image at point s with the Gaussian second-order derivative (approximated by a box filter). The other components are obtained in the same way. Then, the determinant of the Hessian matrix can be obtained as seen in Eq. (3):

$$\det(H) = L_{pp}(s, \sigma) \times L_{qq}(s, \sigma) - L_{pq}(s, \sigma)^2 \tag{3}$$

After getting the determinant value, non-maximum suppression can be used to select the point with the maximum determinant value in each 3 × 3 neighborhood in the image. Thus, interest points can be selected. Then, the descriptors for these key points must be extracted. For that we need to go through two steps.

– Step 1: Assigning orientation
– Step 2: Constructing the square region and extracting descriptors

After getting the interest points, our aim is to select a reproducible orientation for those interest points. This is achieved based on the information from a circular region around the required points. Then, a square region is constructed that is aligned to the selected orientation, and the descriptors corresponding to that interest point are extracted. The feature matching process is done using Laplacian sign indexing of interest points. The Laplacian sign distinguishes bright blobs on dark backgrounds from dark blobs on bright backgrounds. In the matching stage, features are compared only if they have the same type of contrast. This information provides faster matching without affecting the performance of the descriptor. The SURF algorithm exhibits scale, lighting, rotation, and translation invariance[23].

Binary robust invariant scalable keypoints (BRISK)

BRISK[24] computes brightness comparisons to form a binary descriptor string from configurable circular sampling patterns. This is a scale and rotation invariant algorithm. It offers high-quality features, mainly in time-critical applications. The steps involved in this feature extraction are:
octaves and intraoctaves are illustrated in[24]. In the first pyramid, the first layer is the original image, and all the other layers are derived by successive half sampling. In the case of the second pyramid, only the initial intra-octave is derived by downsampling the original image by a factor of 1.5, and the other intra-octave layers are obtained by successive half sampling. If s denotes the scale, then $s(v_i) = 2^i$ and $s(b_i) = 2^i \cdot 1.5$. It is mentioned in[24] that by using downsampling of 1.5, the computational effectiveness can be maintained. In the proposed work, using the same method, we were also able to construct a computationally effective scale space of the original image. The predominant points are selected among all the neighbouring octave and intra-octave layers alternately. Thus, after an iterative process, we get a set of scale-space key points. The BRISK descriptors are composed as binary strings. The keypoint description is related to positioning the sampling pattern for each keypoint. It is important that, for each keypoint, the sampling pattern is exactly scaled and rotated. Then, Hamming distance is used for matching purposes[25].

An example of SURF and BRISK feature extraction is illustrated in Fig. 3. The feature extraction is carried out in the gray scale image and the detected keypoints are marked as red dots in the blue channel image. The difference in the feature point selection for each category is clearly visible in each image.

The features selected by using the maximum relevance approach may be highly correlated with one another, which means they can be highly redundant. Therefore, it is required to consider the minimum redundancy condition, which is formulated in Eq. (6). Let R(M) represent the average mutual information $\mu(r_i, r_j)$ between pairs of features $r_i$ and $r_j$ in set M. Then, the equation for mutual information can be modified as:

$$\min[R(M)], \quad R(M) = \frac{1}{|M|^2} \sum_{r_i, r_j \in M} \mu(r_i, r_j) \tag{6}$$

The method that combines Eqs. (5) and (6) is called the "MR-MR" method. This method finds the compact set of features with the highest relevance and the least redundancy by maximizing D(M, s) and minimizing R(M). To combine both criteria (to optimize 'D' and 'R'), consider an objective function $\Psi(D, R)$. It can be defined as:

$$\max[\Psi(D, R)], \quad \Psi = D - R \tag{7}$$

The near-optimal features defined by $\Psi(\cdot)$ can be found using incremental search methods[26]. Generally, the feature that maximizes the optimization criterion is selected for further processing.

Classifier
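Before turning to the individual classifiers, the incremental MR-MR search that produces their input features can be sketched as follows. This is an illustrative sketch, not the authors' implementation. It assumes the features have already been discretised so that mutual information can be estimated from counts, it uses the common greedy form in which redundancy is averaged over the features selected so far, and the names `mutual_information` and `mrmr_select` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())


def mrmr_select(features, labels, k):
    """Greedy MR-MR selection.

    features: list of feature columns (each a list of discrete values).
    Returns the indices of the k selected columns, in selection order.
    """
    relevance = [mutual_information(f, labels) for f in features]
    selected = []
    candidates = list(range(len(features)))

    def score(j):
        # Relevance minus mean redundancy with already selected features (D - R).
        if not selected:
            return relevance[j]
        redundancy = sum(mutual_information(features[j], features[s])
                         for s in selected) / len(selected)
        return relevance[j] - redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: column 1 duplicates column 0, column 2 is independent of both.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
labels = [2 * i + j for i, j in zip(a, b)]
chosen = mrmr_select([a, list(a), b], labels, k=2)
```

In the toy data all three columns are equally relevant to the label, but the duplicate column is fully redundant with the first pick, so the search skips it in favour of the independent column. In the proposed work the same idea would be applied with k = 30.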
AdaBoost classifier
The AdaBoost algorithm has the power to select only those features known to improve the predictive power of the model[29], thereby improving the execution time of the classifier by eliminating the irrelevant features. Mathematically, this classifier[9] can be defined as:

$$D_T(x) = \sum_{t=1}^{T} f_t(x) \tag{10}$$

Random forest

3. Again use the best split method to split the others into branches
4. Repeat steps 1–3 until a tree is formed from a root node, with the targets as the leaf nodes
5. Construct the forest by repeating steps 1–4 n times to create n trees
Multi layer perceptron (MLP)

MLP is a multi-layer feed-forward network that maps inputs to outputs in a nonlinear manner. The MLP base structure contains an input layer, hidden layers, and an output layer, with each node fully connected to the nodes in the next layer with appropriate weights, which is schematically represented in Fig. 6. In the proposed work only one hidden layer is used, considering the advantages of a single hidden layer MLP mentioned in[34]. The number of nodes in the hidden layer is taken as the average of the number of attributes and the number of classes. MLP uses the backpropagation method for training, and each node applies a non-linear activation function, which distinguishes it from a linear perceptron. In MLP, the sigmoid function is generally used, and it is described in Eq. (11).

$$y_i(s_i) = (1 + e^{-s_i})^{-1} \tag{11}$$

where $y_i$ depicts the ith node output and the weighted sum of the input synapses is denoted by $s_i$. In the back propagation algorithm[35], the aim is to reduce the error propagated in the network by adjusting the weights at each node. The error $e_j(n)$ at the jth output node for the nth data point can be calculated using the actual output value $a_j(n)$ and predicted output value $y_j(n)$ as in Eq. (12).

$$e_j(n) = a_j(n) - y_j(n) \tag{12}$$

In order to minimize the error in the entire output, the corrections in weights at each node are made using Eq. (13), and the new weight for each node can be acquired from Eq. (14).

$$\sigma(n) = \frac{1}{2} \sum_j e_j^2(n) \tag{13}$$

$$\Delta W_{ji}(n) = -\alpha \frac{\partial \sigma(n)}{\partial s_j(n)} y_i(n) \tag{14}$$

where $\alpha$ is the learning rate and $y_i(n)$ is the output of the previous node. The iterative process continues until the error no longer changes.

Performance analysis

The performance of the classifier is analyzed using K-fold cross-evaluation[36]. In this evaluation technique, the entire available dataset is split into K subsections (K = 1, 2, 3, ...). Then, each subsection is used as the validation set in one iteration.
The general steps in K-fold validation are as follows:

1. Randomly shuffle the dataset;
2. Split the data into K subgroups (if K = 10, then split the data into ten groups);
3. Perform the evaluation process for each group;

– Use one group as a test set,
– Use the remaining groups as the training dataset,
– Train the classifier with this dataset and evaluate the model with the test data,
– Retain the evaluation score and repeat the steps by selecting another group.

4. Summarize the model efficiency using the evaluation scores.

The results are stored in the form of a confusion matrix[37]. The structure of the confusion matrix that depicts the characteristics of a binary classifier is shown in Table 1. In that matrix, P, Q, R, and S represent the number of true positives (TP), false negatives (FN), false positives (FP), and true negatives (TN) respectively. TP and TN give the correctly classified data while FP and FN give the incorrectly classified data. Using these values, we can calculate the accuracy, F-score, specificity, precision and recall of the classifier to examine system efficiency.

Table 1 Confusion matrix
Actual diagnosis    Predicted diagnosis
                    DR      NO DR
DR                  P       Q
NO DR               R       S

Accuracy defines the overall power of the system. It can be obtained from the confusion matrix using the formula,

$$\text{Accuracy} = \frac{P + S}{P + Q + R + S} \tag{15}$$

False Positive Rate (FPR) gives the rate of incorrect positive predictions. The best FPR for a good classifier is 0.0.

$$\text{FPR} = \frac{R}{R + S} \tag{16}$$

Precision gives the positive prediction value. This value provides information on how efficiently our system avoids FPs. It can be measured as,

$$\text{Precision} = \frac{P}{P + R} \tag{17}$$

Recall, also called sensitivity, gives information about how efficiently the model reduces FNs. This can be calculated as,

$$\text{Recall} = \frac{P}{P + Q} \tag{18}$$
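The K-fold splitting steps and the confusion-matrix measures can be sketched together in plain Python. This is a minimal illustration rather than the script used in the work: the helper names `k_fold_indices` and `binary_metrics` are hypothetical, the fixed random seed is an assumption made for repeatability, and the classifier training step is left as a comment. The counts passed to `binary_metrics` are the MLP entries of the 1200-image binary confusion matrix reported in the results, read with the Table 1 layout (P = TP, Q = FN, R = FP, S = TN).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def binary_metrics(P, Q, R, S):
    """Measures computed from the Table 1 counts."""
    return {
        "accuracy": (P + S) / (P + Q + R + S),
        "fpr": R / (R + S),
        "precision": P / (P + R),
        "recall": P / (P + Q),
    }


folds = k_fold_indices(1200, k=10)
splits = []
for i, test_fold in enumerate(folds):
    # Every fold except the i-th one forms the training set.
    train = [j for f, fold in enumerate(folds) if f != i for j in fold]
    # A classifier would be trained on `train` and scored on `test_fold` here.
    splits.append((train, test_fold))

m = binary_metrics(P=642, Q=12, R=13, S=533)
```

With these counts the accuracy evaluates to 1175/1200, about 97.92%, which agrees with the value reported for MLP on MESSIDOR.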
Results and discussions

The proposed system is evaluated for both binary and multi-class classification. For binary classification, all DR categories in the MESSIDOR and IDRiD databases are combined into a single category of DR, whereas in the DIARETDB0 database, only normal and DR images are available. For multiclass classification, we used the categories available in the MESSIDOR and IDRiD databases. In MESSIDOR, the images are categorized into four classes: normal, mild DR, moderate DR, and severe DR. In IDRiD, there are five classes (normal, mild NPDR, moderate NPDR, severe NPDR, and PDR).

                    DR      No DR
(a) SVM
DR                  638     16
No DR               11      535
(b) AdaBoost
DR                  654     0
No DR               66      480
(c) Naive Bayes
DR                  614     40
No DR               23      523
(d) Random Forest
DR                  637     17
No DR               18      528
(e) MLP
DR                  642     12
No DR               13      533
combined feature extraction method. Here, a proportion-based feature selection is not used because sometimes the SURF or BRISK methods provide all relevant features that can describe the characteristics of the input image. In that case, selecting a fixed ratio of features from each method may miss predominant features. In the proposed work, all of the extracted features are combined and ranked according to their relevance using the MR-MR method and the 30 top-ranked features are fed into the classifier. For the performance analysis, a 10-fold cross-validation method is also utilized. The classifiers evaluated are SVM, Adaboost, Naive Bayes, Random Forest, and MLP. The confusion matrices in Tables 2, 3 and 4 are used to evaluate the performance of each classifier with MR-MR selected features. From these confusion matrices, the correctly classified and wrongly classified instances can be recognized. While evaluating the matrices, we noted that when the IDRiD and MESSIDOR datasets are used, the number of FNs is zero for the Adaboost classifier, and the number of FNs is zero for the MLP classifier when the DIARETDB0 database is used. The number of FPs is higher in SVM and Adaboost

Table 4 Confusion matrix for the evaluation of each classifier using DIARETDB0 Database
                    DR      No DR
(a) SVM
DR                  107     3
No DR               1       19
(b) AdaBoost
DR                  107     3
No DR               1       19
(c) Naive Bayes
DR                  94      16
No DR               6       14
(d) Random Forest
DR                  109     1
No DR               4       16
(e) MLP
DR                  107     3
No DR               0       20
Bold characters indicate the evaluation metrics values that give best classification performance in the pro-
posed work
Table 9 Validation accuracy of each classifier using SURF+BRISK-MR-MR selected features from IDRiD Database images for binary classification
Classifier        Correctly classified instances    Accuracy (%)
SVM               402                               97.33
Adaboost          403                               97.57
Naive Bayes       406                               98.31
Random Forest     407                               98.55
MLP               408                               98.78

Table 10 Validation accuracy of each classifier using SURF+BRISK-MR-MR selected features from MESSIDOR Database images
Classifier        Correctly classified instances    Accuracy (%)
SVM               1173                              97.75
Adaboost          1134                              94.52
Naive Bayes       1137                              94.75
Random Forest     1165                              97.08
MLP               1175                              97.92
Table 11 Validation accuracy of each classifier using SURF+BRISK-MR-MR selected features from DIARETDB0 Database images
Classifier        Correctly classified instances    Accuracy (%)
SVM               126                               96.92
Adaboost          126                               96.92
Naive Bayes       108                               83.07
Random Forest     125                               96.15
MLP               127                               97.69

Table 12 Performance of each classifier (binary case) with proposed feature extraction with training on MESSIDOR and IDRiD and testing on DIARETDB0
Trained Database    Classifier        Correctly classified instances    Accuracy (%)
IDRiD               SVM               114                               87.6
                    Adaboost          115                               88.4
                    Naive Bayes       106                               81.5
                    Random Forest     119                               91.5
                    MLP               122                               93.8
MESSIDOR            SVM               111                               85.3
                    Adaboost          115                               88.4
                    Naive Bayes       104                               80.0
                    Random Forest     118                               90.7
                    MLP               120                               92.3

Table 13 Confusion matrix for the evaluation of each classifier for multiclass classification using MESSIDOR Database
                    Normal    Mild    Moderate    Severe
(a) SVM
Normal              532       14      0           0
Mild                6         141     6           0
Moderate            0         6       234         7
Severe              0         0       11          243
(b) Adaboost
Normal              546       0       0           0
Mild                1         0       0           152
Moderate            0         0       0           247
Severe              0         0       0           254
(c) Naive Bayes
Normal              501       39      0           6
Mild                15        129     7           2
Moderate            1         19      208         19
Severe              0         0       60          194
(d) Random Forest
Normal              539       7       0           0
Mild                2         142     9           0
Moderate            0         7       228         12
Severe              0         0       4           250
(e) MLP
Normal              546       0       0           0
Mild                4         146     3           0
Moderate            1         3       239         4
Severe              0         1       5           248
when compared to Naive Bayes, Random Forest, and MLP. The MLP confusion matrix shows the fewest misclassifications of all the classifiers.

The detailed class-wise efficiency measures, such as FP rate, specificity, precision, recall, and F1 score, are given in Tables 5, 6 and 7 for each classifier using the IDRiD, MESSIDOR and DIARETDB0 datasets respectively. The analysis shows a low FPR for all classifiers, which is one of the signatures of a good classifier, but classifier efficiency cannot be judged from this single measure. When the details in Table 5 were analyzed, for the Adaboost classifier, the precision for the normal class is 1.00, but for the DR class the precision is 0.965. At the same time, for the DR class, recall is 1.00, and for the normal class it is 0.925. These measures also affect the efficiency of the system. The same pattern appears for the Adaboost classifier with the IDRiD database and for MLP with the DIARETDB0 dataset. The efficiency of the system varies according to its weighted average values. The weighted average details of the efficiency measures are given in Table 8. From the analysis, it is clear that the MLP classifier produces the lowest FP rate (i.e., the amount of misclassification is comparatively low when the MLP classifier is used with SURF-BRISK features). The FP rate is 0.017 with the IDRiD dataset, 0.021 with the MESSIDOR dataset, and 0.004 with DIARETDB0. The weighted averages of precision, recall, and F1 score are high for MLP in all cases.

The accuracies obtained for each classifier analyzed using the IDRiD, MESSIDOR, and DIARETDB0 databases are given in Tables 9, 10 and 11. It is clear from the accuracy evaluation that the MLP shows the highest accuracy in all cases. The weighted averages of the detailed measures for the IDRiD, MESSIDOR, and DIARETDB0 databases are: 0.988, 0.979, and 0.980 precision; 0.988, 0.979, and 0.977 recall; and 0.988, 0.979, and 0.978 F1 score respectively. There was little variation in the accuracy of classifier while
Table 17 Confusion matrix for the evaluation of each classifier for multiclass classification using IDRiD Database
                    Normal    Mild NPDR    Moderate NPDR    Severe NPDR    PDR
(a) SVM
Normal              132       0            2                0              0
Mild NPDR           11        0            9                0              0
Moderate NPDR       0         0            131              5              0
Severe NPDR         0         0            11               59             4
PDR                 0         0            0                14             35
(b) Adaboost
Normal              134       0            0                0              0
Mild NPDR           20        0            0                0              0
Moderate NPDR       1         0            135              0              0
Severe NPDR         0         0            74               0              0
PDR                 0         0            49               0              0
(c) Naive Bayes
Normal              120       10           4                0              0
Mild NPDR           3         16           1                0              0
Moderate NPDR       78        8            37               13             0
Severe NPDR         2         0            24               21             27
PDR                 0         0            0                6              43
(d) Random Forest
Normal              130       3            1                0              0
Mild NPDR           5         4            11               0              0
Moderate NPDR       1         1            125              9              0
Severe NPDR         0         0            3                67             4
PDR                 0         0            0                3              46
(e) MLP
Normal              134       0            0                0              0
Mild NPDR           12        5            3                0              0
Moderate NPDR       0         0            136              0              0
Severe NPDR         0         0            2                71             1
PDR                 0         0            4                11             34

MESSIDOR database (detailed efficiency measures shown in Table 14), the MLP classifier performs better than all other classifiers, with a lower FPR and higher specificity, precision, recall, and F1-score values. The weighted average values of FPR, specificity, precision, recall, and F1 score of the MLP classifier are 0.007, 0.993, 0.982, 0.983, and 0.982 respectively (Table 15). The MLP classifier gives the best performance measures in multiclass classification with an accuracy of 98.25%, while the lowest accuracy was achieved by the Adaboost classifier (66.67%). The accuracies of classifiers using the MESSIDOR database are shown in Table 16.

By using the IDRiD database, the proposed algorithm shows an accuracy of 92.01% (Table 20). While analyzing the detailed efficiency measures, the MLP works better than the other classifiers. In Table 18, the performance measures are very poor for the Adaboost classifier. The weighted average values in Table 19 show that MLP gives good performance over the other classifiers, with an FPR of 0.031, specificity of 0.969, precision of 0.925, recall of 0.920, and F1 score of 0.908. The accuracies for each classifier using the IDRiD database are given in Table 20. The categories used in each database for multiclass classification are different, so training and testing the proposed work on different databases is not possible in the case of multiclass classification.

Comparison of the proposed system with pre-trained models

In the proposed feature extraction, two local feature extractors are combined and the higher ranked features are used for classification to reduce the system complexity and to increase system performance. It is necessary to compare the performance of the hybrid feature extraction with existing pre-trained models. For the comparative study we used features from the RESNET-50[41] and VGG-16[42] pre-trained models. The accuracies of the classifiers obtained using the features extracted from the pre-trained models RESNET-50 and VGG-16 are given in Table 21 for binary classification and in Table 22 for multiclass classification. When compared to the proposed work, the classification accuracy using pre-trained network features is comparatively lower. These ImageNet pre-trained models are mainly trained on natural or general images. Thus we can infer that the selected pre-trained networks are not able to provide fine features for the DR classification task.
Table 19 Weighted average values calculated for each measure in Table 18 for multiclass classification using IDRiD Database
Classifier        FP rate    Specificity    Precision    Recall    F1 score
SVM               0.050      0.95           0.823        0.864     0.841
Adaboost          0.171      0.829          0.453        0.651     0.527
Naive Bayes       0.152      0.848          0.566        0.574     0.532
Random Forest     0.033      0.967          0.889        0.901     0.892
MLP               0.031      0.969          0.925        0.920     0.908

Table 20 Validation accuracy of each classifier for multiclass classification using IDRiD Database
Classifier        Correctly classified instances    Accuracy (%)
SVM               357                               86.44
Adaboost          269                               65.13
Naive Bayes       237                               57.39
Random Forest     372                               90.07
MLP               380                               92.01
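The weighted averages in Table 19 and the accuracy in Table 20 can be cross-checked against the MLP confusion matrix in Table 17. The sketch below is only an illustration: it assumes that rows are actual classes and columns are predicted classes (the layout of Table 1), that each class-wise measure is weighted by the number of actual instances of that class, and the function name `weighted_metrics` is hypothetical.

```python
def weighted_metrics(cm):
    """Support-weighted precision, recall and FP rate, plus overall accuracy.

    cm is a square confusion matrix: rows = actual class, columns = predicted class.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    support = [sum(row) for row in cm]                               # actual count per class
    predicted = [sum(cm[r][c] for r in range(n)) for c in range(n)]  # predicted count per class
    out = {"precision": 0.0, "recall": 0.0, "fpr": 0.0}
    for k in range(n):
        tp = cm[k][k]
        fp = predicted[k] - tp
        weight = support[k] / total
        out["precision"] += weight * (tp / predicted[k] if predicted[k] else 0.0)
        out["recall"] += weight * (tp / support[k] if support[k] else 0.0)
        out["fpr"] += weight * fp / (total - support[k])
    out["accuracy"] = sum(cm[k][k] for k in range(n)) / total
    return out


# MLP block of Table 17 (IDRiD, five classes).
mlp_idrid = [
    [134, 0, 0, 0, 0],
    [12, 5, 3, 0, 0],
    [0, 0, 136, 0, 0],
    [0, 0, 2, 71, 1],
    [0, 0, 4, 11, 34],
]
w = weighted_metrics(mlp_idrid)
```

Under these assumptions the sketch gives a weighted precision of about 0.925, recall of about 0.920, FP rate of about 0.031, and an accuracy of 380/413 (about 92.01%), in line with the MLP rows of Tables 19 and 20.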
Table 21 Accuracy measures of classifier using pre-trained network feature extraction for binary classification
Pre-trained network    Database    Classifier    Accuracy using pre-trained network (%)

Table 22 Accuracy measures of classifier using pre-trained network feature extraction for multiclass classification
Pre-trained network    Database    Classifier    Accuracy using pre-trained network (%)
classifier, the accuracies were 98.25% and 92.01% using the MESSIDOR and IDRiD databases, respectively. The average accuracy of the system with the MLP classifier was 95.13%. The weighted average values of precision, recall, and F1 score for MLP in all cases mark the quality of the classifier. The weighted average FP rate is lower for MLP than for the other classifiers, which increases the efficiency of the classifier. As per the evaluations, the MLP serves as a good classifier when using the SURF-BRISK extracted, MR-MR selected features. Future work should aim to develop simpler, more efficient, and novel feature extraction techniques for five-class grading of DR. Also, efforts should be made to derive novel methods using deep learning techniques with efficient architectures for efficient DR classification.

Compliance with ethical standards

2. Zachariah S, Wykes W, Yorston D (2015) Grading diabetic retinopathy (DR) using the Scottish grading protocol. Commun Eye Health 28:72–73
3. Abramoff MD, Garvin MK, Sonka M (2010) Retinal imaging and image analysis. IEEE Rev Biomed Eng 3:169–208. https://doi.org/10.1109/RBME.2010.2084567
4. Ali R, Usman Akram M (2018) Analysing vascular structure to determine intra retinal microvascular abnormalities (IRMA), pp 49–52. https://doi.org/10.1109/CIBEC.2018.8641825
5. Jemshi KM, Gopi VP, Issac Niwas S (2018) Development of an efficient algorithm for the detection of macular edema from optical coherence tomography images. Int J Comput Assist Radiol Surg 13(9):1369–1377. https://doi.org/10.1007/s11548-018-1795-6
6. Sreejini K, Govindan V (2019) Retrieval of pathological retina images using bag of visual words and PLSA model. Int J Eng Sci Technol 22:777–785. https://doi.org/10.1016/j.jestch.2019.02.002
7. Tareen SAK, Saleem Z (2018) A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK. In: International conference on computing, mathematics and engineering technologies (iCoMET), pp 1–10. https://doi.org/10.1109/ICOMET.2018.8346440
8. Kamil R, Al-Saedi K, Al-Azawi R (2018) An accurate system to measure the diabetic retinopathy using SVM classifier. Ciência e
Conflict of interest The authors declare that they have no conflict of Técnica Vitivinícola 33:135–139
interest. 9. Lupascu CA, Tegolo D, Trucco E (2010) FABC: retinal vessel
segmentation using adaboost. IEEE Trans Inf Technol Biomed
Ethical approval For this type of study, formal consent is not required. 14(5):1267–1274. https://doi.org/10.1109/TITB.2010.2052282
10. Kausu T, Gopi VP, Wahid KA, Doma W, Niwas SI (2018) Combi-
Informed consent This article does not contain any studies with nation of clinical and multiresolution features for glaucoma detec-
human participants or animals performed by any of the authors. tion and its classification using fundus images. Biocybern Biomed
Eng 38(2):329–341. https://doi.org/10.1016/j.bbe.2018.02.003
11. Abdulmunem M, Fatoohi Z (2018) Propose retina identifica-
tion system based on the combination of surf detector and brisk
descriptor. Iraqi J Sci 59(2B):946–955
References 12. Akyol K, BAYIR S, Sen B (2017) A decision support system for
early-stage diabetic retinopathy lesions. Int J Adv Comput Sci
1. Cheung N, Wang JJ, Klein R, Couper DJ, Sharrett AR, Wong Appl 8:369–379. https://doi.org/10.14569/IJACSA.2017.081249
TY (2007) Diabetic retinopathy and the risk of coronary heart 13. Naga Sai Prasad VG, Ratna B, Rajesh V (2018) Feature extrac-
disease. Diabetes Care 30(7):1742–1746. https://doi.org/10.2337/ tion based retinal image analysis for bright lesion classification
dc07-0264
13
Physical and Engineering Sciences in Medicine
in fundus image. Biomed Res 29:3648–3653. https : //doi. features. Multimed Tools Appl. https://doi.org/10.1007/s1104
org/10.4066/biomedicalresearch.29-16-2170 2-019-7485-8
14. de la Calleja J, Tecuapetla L, Auxilio Medina M, Bárcenas E, 28. Daqi G, Tao Z (2007) Support vector machine classifiers using
Urbina Nájera AB (2014) LBP and machine learning for diabetic RBF kernels with clustering-based centers and widths. In: 2007
retinopathy detection. Int Conf Intell Data Eng Autom Learn international joint conference on neural networks, pp 2971–2976.
8669:110–117. https://doi.org/10.1007/978-3-319-10840-7_14 https://doi.org/10.1109/IJCNN.2007.4371433
15. Issac Niwas S, Lin W, Kwoh CK, Kuo CJ, Sng CC, Aquino MC, 29. Wang R (2012) Adaboost for feature selection, classification and
Chew PTK (2016) Cross-examination for angle-closure glaucoma its relation with svm, a review. Phys Procedia 25:800–807. https: //
feature detection. IEEE J Biomed Health Inform 20(1):343–354. doi.org/10.1016/j.phpro.2012.03.160. International conference
https://doi.org/10.1109/JBHI.2014.2387207 on solid state devices and materials science, macao
16. Sidibé D, Sadek I, Mériaudeau F (2015) Discrimination of retinal 30. Schapire RE (2013) Explaining AdaBoost. Springer, Berlin, pp
images containing bright lesions using sparse coded features and 37–52. https://doi.org/10.1007/978-3-642-41136-6_5
svm. Comput Biol Med 62:175–184. https://doi.org/10.1016/j. 31. Roychowdhury A, Banerjee S (2018) Random forests in the clas-
compbiomed.2015.04.026 sification of diabetic retinopathy retinal images. In: Bhattacharyya
17. Jelinek HF, Pires R, Padilha R, Goldenstein S, Wainer J, Bos- S, Gandhi T, Sharma K, Dutta P (eds) Advanced computational
somaier T, Rocha A (2012) Data fusion for multi-lesion diabetic and communication paradigms, vol 475. Springer, Singapore, pp
retinopathy detection. In: 25th IEEE international symposium 168–176. https://doi.org/10.1007/978-981-10-8240-5_19
on computer-based medical systems (CBMS), pp 1–4. https:// 32. Breiman L (2001a) Random forests. Mach Learn 45(1):5–32. https
doi.org/10.1109/CBMS.2012.6266342 ://doi.org/10.1023/A:1010933404324
18. Panchal P, Bhojani R, Panchal T (2016) An algorithm for reti- 33. Breiman L (2001b) Random forests. Mach Learn 45:5–32. https
nal feature extraction using hybrid approach. Procedia Com- ://doi.org/10.1023/A:1010933404324
put Sci 79:61–68. https://doi.org/10.1016/j.procs.2016.03.009. 34. Huang G-B, Chen Y-Q, Babri HA (2000) Classification ability
Proceedings of international conference on communication, of single hidden layer feed forward neural networks. IEEE Trans
computing and virtualization (ICCCV) 2016 Neural Netw 11(3):799–801. https://doi.org/10.1109/72.846750
19. Li H, Chutatape O (2004) Automated feature extraction in color 35. Saifuddin H, Vijayalakshmi H (2016) Prediction of diabetic retin-
retinal images by a model based approach. IEEE Trans Biomed opathy using multi layer perceptron. Int J Adv Res 4:658–664.
Eng 51(2):246–254. https://doi.org/10.1109/TBME.2003.82040 https://doi.org/10.21474/IJAR01/714
0 36. Yadav S, Shukla S (2016) Analysis of k-fold cross-validation over
20. Gopi VP, Anjali MS, Niwas SI (2017) Pca-based localization hold-out validation on colossal datasets for quality classification.
approach for segmentation of optic disc. Int J Comput Assist In: 2016 IEEE 6th international conference on advanced comput-
Radiol Surg 12(12):2195–2204. https://doi.org/10.1007/s1154 ing (IACC), pp 78–83. https://doi.org/10.1109/IACC.2016.25
8-017-1670-x 37. Visa S, Ramsay B, Ralescu A, Knaap E (2011) Confusion matrix-
21. Sudha V, Karthikeyan C (2018) Analysis of diabetic retinopathy based feature selection. CEUR Workshop Proc 710:120–127
using naive bayes classifier technique. Int J Eng Technol 7:440– 38. Porwal P, Pachade S, Kamble R, Kokare M, Deshmukh G, Sahas-
442. https://doi.org/10.14419/ijet.v7i2.21.12462 rabuddhe V, Meriaudeau F (2018) Indian diabetic retinopathy
22. Rahim SS, Palade V, Shuttleworth J, Jayne C (2016) Automatic image dataset (IDRiD): a database for diabetic retinopathy screen-
screening and classification of diabetic retinopathy and maculopa- ing research. Data 3:1–8
thy using fuzzy image processing. Brain Inform 3(4):249–267. 39. Decencière E, Zhang X, Cazuguel G, Lay B, Cochener B, Trone
https://doi.org/10.1007/s40708-016-0045-3 C, Gain P, Ordonez R, Massin P, Erginay A, Charton B, Klein JC
23. Bay H, Tuytelaars T, Van Gool L (2006) Surf: speeded up robust (2014) Feedback on a publicly distributed database: the messi-
features. In: Leonardis A, Bischof H, Pinz A (eds) Computer dor database. Image Anal Stereol 33(3):231–234. https://doi.
vision—ECCV 2006, vol 3951. Springer, Berlin, pp 404–417 org/10.5566/ias.1155
24. Leutenegger S, Chli M, Siegwart RY (2011) Brisk: binary robust 40. Kalesnykiene V, kristian Kamarainen J, Lensu L, Sorri I, Uusi-
invariant scalable keypoints. In: 2011 international conference talo H, Kälviäinen H, Pietilä J (2007) DIARETDB0: Evaluation
on computer vision, pp 2548–2555. https://doi.org/10.1109/ database and methodology for diabetic retinopathy algorithms
ICCV.2011.6126542 41. He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for
25. Gularte A, Thomasi C, De Bem R, Adamatti D (2013) Perfor- image recognition. In: 2016 IEEE conference on computer vision
mance evaluation of brisk algorithm on mobile devices. VISAPP and pattern recognition (CVPR), pp 770–778
2013 Proc Int Conf Comput Vis Theory Appl 2:5–11 42. Simonyan K, Zisserman A (2014) Very deep convolutional net-
26. Peng H, Long F, Ding C (2005) Feature selection based on mutual works for large-scale image recognition. CoRR abs/1409.1556
information criteria of max-dependency, max-relevance, and min-
redundancy. IEEE Trans Pattern Anal Mach Intell 27(8):1226– Publisher’s Note Springer Nature remains neutral with regard to
1238. https://doi.org/10.1109/TPAMI.2005.159 jurisdictional claims in published maps and institutional affiliations.
27. Kandhasamy JP, Kadry Balamurali S, Ramasamy LK (2019)
Diagnosis of diabetic retinopathy using multi level set segmenta-
tion algorithm with feature extraction using svm with selective
13