
Knowledge-Based Systems 81 (2015) 56–64

Contents lists available at ScienceDirect

Knowledge-Based Systems
journal homepage: www.elsevier.com/locate/knosys

Computer-aided diagnosis of diabetic subjects by heart rate variability signals using discrete wavelet transform method

U. Rajendra Acharya a,b, Vidya K. Sudarshan a,*, Dhanjoo N. Ghista c, Wei Jie Eugene Lim a, Filippo Molinari d, Meena Sankaranarayanan e

a Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore 599489, Singapore
b Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Malaysia
c University 2020 Foundation, MA, USA
d Biolab, Department of Electronics and Telecommunications, Politecnico di Torino, Torino, Italy
e Department of Mathematics, Anand Institute of Higher Technology, Kazhipattur, Chennai 603 103, India

Article info

Article history:
Received 9 June 2014
Received in revised form 3 February 2015
Accepted 5 February 2015
Available online 12 February 2015
Keywords:
Diabetes
HRV
Classifier
DWT
Feature extraction
Feature ranking

Abstract
Diabetes Mellitus (DM), a chronic lifelong condition, is characterized by increased blood sugar levels. As there is no cure for DM, the major focus lies on controlling the disease. Therefore, DM diagnosis and treatment are of great importance. The most common complications of DM include retinopathy, neuropathy, nephropathy and cardiomyopathy. Diabetes causes cardiovascular autonomic neuropathy that affects the Heart Rate Variability (HRV). Hence, in the absence of other causes, HRV analysis can be used to diagnose diabetes. The present work aims at developing an automated system for classification of normal and diabetes classes by using the heart rate (HR) information extracted from the Electrocardiogram (ECG) signals. The spectral analysis of HRV recognizes patients with autonomic diabetic neuropathy, and gives an earlier diagnosis of impairment of the Autonomic Nervous System (ANS). Significant correlations with the impaired ANS are observed for the HRV spectral indices obtained by using the Discrete Wavelet Transform (DWT) method. Herein, in order to diagnose and detect DM automatically, we have performed DWT decomposition up to 5 levels, and extracted the energy, sample entropy, approximate entropy, kurtosis and skewness features at various detailed coefficient levels of the DWT. We have extracted relative wavelet energy and entropy features up to the 5th level of DWT coefficients extracted from HR signals. These features are ranked by using various ranking methods, namely, Bhattacharyya space algorithm, t-test, Wilcoxon test, Receiver Operating Characteristic (ROC) and entropy.
The ranked features are then fed into different classifiers, which include Decision Tree (DT), K-Nearest Neighbor (KNN), Naïve Bayes (NBC) and Support Vector Machine (SVM). Our results have shown maximum diagnostic differentiation performance by using a minimum number of features. With our system, we have obtained an average accuracy of 92.02%, sensitivity of 92.59% and specificity of 91.46%, by using the DT classifier with ten-fold cross validation.
© 2015 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +65 64608393.
E-mail address: vidya.2kus@gmail.com (K.S. Vidya).
http://dx.doi.org/10.1016/j.knosys.2015.02.005
0950-7051/© 2015 Elsevier B.V. All rights reserved.

1. Introduction
According to the International Diabetes Federation (IDF), it is estimated that in 2013 a total of 381 million people were diagnosed with diabetes across the globe, of whom 23 million are from Southeast Asian countries [26]. Due to lack of finance or access to healthcare, most of the populations around the world are unaware that they may be suffering from diabetes [26]. Statistics show that around 1.9 million people are diagnosed with diabetes in the USA every year and 79 million have pre-diabetic conditions [7]. By 2030, the number of diabetes subjects is estimated to almost double (2.8% in 2000 and 4.4% in 2030), as its incidence is increasing rapidly every year (Sarah et al. [47]). Diabetes and its complications have shown a notable impact on individuals, families, health systems and national economies. The USA alone spends around $245 billion annually on diagnosed diabetes patients. It is predicted that by 2050, 1 in 3 American adults may have diabetes if the current trend continues [7,10].
Diabetes mellitus (DM) is a condition that is defined by a hyperglycemic state (elevated blood glucose level), which in turn leads to microvascular and macrovascular damage [60]. Even though finding a cure for DM is difficult, emphasis is laid on early diagnosis of DM. In this regard, it is well known that a person with diabetes exhibits autonomic neuropathy (AN), damage to the nervous system, or cardiovascular autonomic neuropathy (CAN), a well-known complication of DM that affects the central and peripheral vascular systems and causes abnormalities in the heart rate signal [1]. Thus, diabetes can also be diagnosed by studying the heart rate variability.
Concerning heart rate variability, the heart rate (HR), a
non-stationary/nonlinear signal, is obtained by calculating the time
elapsed between two ventricular contractions or the time between
two consecutive R-waves (RR interval) on the ECG signals [27].
The HR Variability (HRV) is one of the reliable methods for
qualifying physiological dysfunction in terms of the condition of
sympathetic and parasympathetic nervous system [6,25]. The analysis of HRV enables us to evaluate overall cardiac health in terms
of the heart rate regulation, based on the status of the autonomic
nervous system responsible for regulating cardiac activity [37].
Spectral analysis of the short-term HRV enables quantitative
evaluation of the neurologic oscillations, and delivers values for
neural regulation of heart rate [9,51,34]. The spectral analysis of
HRV (spectral parameters like the power spectrum of HRV signal)
recognizes patients with autonomic diabetic neuropathy, and gives
an earlier diagnosis of impairment of the autonomic nervous
system (ANS) [15,21]. Significant correlations are observed
between impaired ANS and the HRV indices obtained by spectral
analyses using nonparametric and parametric methods namely,
fast Fourier Transform (FFT) and autoregressive (AR) method
respectively [12]. Time–frequency domain analysis of HRV makes
it easier to quantify the ANS activity in DM subjects [48]. Even
though the autonomic functions are better assessed by using frequency domain features, the accuracy of spectral power is limited
by the low level of the signal to noise ratio [6].
Nonlinear dynamic techniques are used in HRV signal analysis
to circumvent the limitations of time and frequency domain analysis [4]. Nonlinear methods are needed for the analysis of nonlinear
signals and systems [35]. The non-linear methods have been
applied in HRV analysis [50,30] to predict diabetes [14,5,20] and
cardiovascular disease (CVD) [23]. Nonlinear techniques can be
coupled to frequency analysis techniques. Among all of these
techniques, the DWT has the advantage of providing multiple
resolutions. This method provides discrimination between two
different signals with the same spectrum magnitude, thus distinguishing the subtle changes in the signals [17,2,56].
In normal and diabetic subjects, the HRV signal has been used to
study and measure the activity and symptoms of the cardiac
parasympathetic nervous system [41]. Their study reported that
diabetic subjects exhibit diminished cardiac parasympathetic
activity before the appearance of autonomic neuropathy symptoms. Several studies conducted (Table 3) have reported that
diabetic patients are characterized by reduced HRV, with less information about HRV across the spectrum of blood glucose levels. In
2000, Singh et al. [52] studied the correlation between hyperglycemia (increased blood glucose level) and reduced HRV. They
reported reduced HRV variables in DM subjects and in subjects
with impaired blood (plasma) glucose levels by using time domain
features.
Awdah et al. [11] studied diabetic subjects with and without autonomic neuropathy by using the time domain analysis of HRV. Their results showed a significant decrease in all the time domain measures for diabetic subjects with and without diabetic neuropathy compared to the control class. In 2005, Flynn et al. [22] used detrended fluctuation analysis (DFA) to study the HRV changes over short-time ECG recordings of 20 min. Their study reported reduced values of HRV for diabetic subjects. Chemla et al. [16] used FFT and autoregressive (AR) methods to study the HRV spectral components in diabetic patients. They found that diabetic subjects exhibit decreased spectral values, and that the FFT method is more suitable for evaluation of short-term HRV spectral components in diabetic subjects.
Analysis of the RR interval in the time and frequency domains has been carried out by Ahamed Seyd et al. [8], to quantify the autonomic nervous system (ANS) activity in DM patients. Significant differences in high frequency (HF) power, very low frequency (VLF) power and low frequency (LF) power were noted between DM patients and normal classes in the frequency domain analysis of the extracted data (NN intervals, i.e. normal-to-normal intervals). This study also observed significant differences in the time domain measures, namely the root mean square of successive NN interval differences (RMSSD) and the standard deviation of NN intervals (SDNN), between the DM and control groups.
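The two time domain measures named above can be written down directly from their definitions. The following is a minimal sketch, assuming the NN intervals are given in milliseconds and that SDNN uses the sample (N - 1) standard deviation; the function names and the example values are hypothetical and are not taken from the cited study.

```python
import math

def sdnn(nn_ms):
    """Standard deviation of the NN intervals (sample standard deviation, in ms)."""
    n = len(nn_ms)
    mean = sum(nn_ms) / n
    return math.sqrt(sum((x - mean) ** 2 for x in nn_ms) / (n - 1))

def rmssd(nn_ms):
    """Root mean square of successive NN interval differences (in ms)."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical NN intervals in milliseconds, for illustration only.
nn = [800.0, 810.0, 790.0, 805.0, 795.0]
sdnn_value = sdnn(nn)
rmssd_value = rmssd(nn)
```

SDNN reflects the overall spread of the intervals, while RMSSD responds only to beat-to-beat changes, which is why the two are usually reported together.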
Multiscale entropy (MSE) analysis method has also been used
to diagnose the autonomic dysregulation in DM patients by
Trunkvalterova et al. [58]. Their study performed the analysis of
heart rate (HR) signal, systolic and diastolic blood pressure (SBP
and DBP) signals in both normal and diabetic subjects, to evaluate
the SampEn and linear measures. They reported that in young
patients with DM, the changes in cardiovascular control were
detected by the MSE analysis of SBP and DBP oscillations and HR
signals. The relationship between HRV and duration of type 2 diabetes based on sex differences was studied by Nolan et al. [38]. According to
this study, an inverse relationship was reported between the
Type 1 and Type 2 diabetes duration and HRV measures among
male subjects only. The inverse association of HRV with increasing
age of diabetes diagnosis, as well as increasing severity of coronary
heart disease risk and obesity was observed in female subjects.
Then in 2012, Faust et al. [20] used time and frequency domain
and nonlinear methods to study the HRV signals of both diabetic
and normal subjects; they have proposed unique ranges for various
features of the two classes. The HRV parameters in diabetic and non-diabetic patients with renal transplantation have been investigated in the time and frequency domains by Kirvela et al. [31]; their results highlighted that, in end-stage diabetic neuropathy patients, autonomic neuropathy is the main cause of severe impairment of HRV, with co-existing heart disease contributing in part.
Recently, a novel Diabetic Integrated Index (DII) has been developed by Acharya et al. [3], by using nonlinear parameters extracted from the HRV signal. This DII is a single number which can distinguish and classify the two classes. They also reported that the AdaBoost classifier yielded a high classification accuracy of 86% for the two classes (normal and diabetic). In the same research group, Swapna et al. [54] used Higher Order Spectral (HOS) features to classify diabetic patients from normal subjects; their method reported the highest accuracy, sensitivity and specificity of 90.5%, 85.7% and 95.2% respectively, by using a Gaussian mixture model classifier. The magnitude plots of the HOS bispectrum obtained from HRV signals have been subjected to principal component analysis for feature reduction [28]. These principal components with an SVM classifier reported an accuracy of 79.93%. However, Acharya et al. [5] reported 90% accuracy, 92.5% sensitivity and 88.7% specificity with an AdaBoost classifier coupled with four nonlinear features. Pachori et al. [42] (in press) proposed a new nonlinear method based on Empirical Mode Decomposition (EMD) to discriminate between normal and diabetic RR-interval signals. In their proposed method, EMD decomposes the RR-interval signal into intrinsic mode functions (IMFs), from which five features (Fourier–Bessel series expansion, amplitude modulation bandwidth, frequency modulation bandwidth, analytic signal representation and second order difference plot) are extracted. The study results show that the extracted features exhibit statistically significant differences between normal and diabetic classes.
In our present work, in order to automatically diagnose and detect DM, we have performed DWT decomposition up to 5 levels and have extracted the energy, sample entropy, approximate entropy, kurtosis and skewness features at various detailed coefficient levels of the DWT. Fig. 1 shows an overview of our proposed methodology for diabetic HR signal classification. In the off-line system, normal and diabetes RR signal data are analyzed by DWT, performed up to 5 levels of decomposition. Energy, sample entropy, approximate entropy, kurtosis and skewness features are extracted from each level of the detailed coefficients of the DWT. Then, these features are ranked by using Bhattacharyya space algorithm, t-test, Wilcoxon test, Receiver Operating Characteristic (ROC), and entropy method. The ranked features are fed to DT, KNN, NBC and SVM classifiers to obtain the highest classification performance using the minimum number of features. In the on-line system, decomposition up to five levels is performed by using the DWT method and the features (energy, ApEn, SampEn, kurtosis, and skewness) are extracted. These features are ranked and fed to the selected classifiers for automated classification as normal and DM.
The flow of the paper is as follows. Section 2 delineates (i) the data acquisition process and pre-processing, (ii) feature extraction method and feature ranking methods, and (iii) classification. The results of this novel diagnostic system are presented in Section 3. The discussion of the results is carried out in Section 4, and the conclusion is provided in Section 5.
2. Methods for HRV analysis
2.1. Data acquisition/pre-processing
The electrocardiogram (ECG) signals were acquired from 30 subjects (15 subjects with DM and 15 healthy subjects) in a relaxed supine position for 60 min. The ECG recordings were performed by using BIOPAC (Aero Camino Goleta, CA, USA) equipment, and the AcqKnowledge software built into the equipment was used to convert the recordings into heart rate time series. Fig. 2 shows the RR signals of normal and DM patients. The ECG sampling rate was kept at 500 Hz. A total of 81 datasets from 15 diabetic subjects (10 male and 5 female) and 82 datasets from 15 normal subjects (8 male and 7 female) were used in this study, with each dataset having 1000 samples. All the subjects were informed about the aim of the study and signed an informed consent form before being examined. The study received approval from the Kasturba Medical Hospital, in Manipal, India. A band reject filter with a center frequency of 50 Hz was used to remove the power-line interference noise. RR points were detected using the Pan and Tompkins algorithm [40].
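The 50 Hz band reject step can be illustrated with a second-order notch filter. This is only a sketch under stated assumptions: the paper does not give the filter design, so the biquad form, the quality factor q = 30 and the function names below are hypothetical choices; only the 50 Hz centre frequency and the 500 Hz sampling rate come from the text.

```python
import math

def notch_coefficients(f0, fs, q=30.0):
    """Biquad notch (band reject) coefficients centred on f0 Hz at sampling rate fs Hz.

    The quality factor q is an assumed value; it is not specified in the paper.
    """
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a

def biquad_filter(x, b, a):
    """Apply a second-order IIR filter (direct form I) to the sequence x."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

fs = 500.0                                  # ECG sampling rate stated in the text
b, a = notch_coefficients(50.0, fs)         # 50 Hz power-line frequency
t = [n / fs for n in range(3000)]
mains = [math.sin(2.0 * math.pi * 50.0 * ti) for ti in t]   # synthetic interference
slow = [math.sin(2.0 * math.pi * 5.0 * ti) for ti in t]     # synthetic in-band component
mains_out = biquad_filter(mains, b, a)
slow_out = biquad_filter(slow, b, a)
```

After the initial transient, the 50 Hz component is suppressed while the 5 Hz component passes almost unchanged, which is the behaviour the pre-processing step relies on.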

2.2. Feature extraction


The feature extraction step is a crucial process in biomedical signal analysis and interpretation. We have performed DWT on the HR signals up to five levels, and extracted features of Energy (E), Approximate Entropy (ApEn), Sample Entropy (SampEn), Kurtosis (Kur) and Skewness (Skw) from these different levels of DWT coefficients. The DWT method and the features extracted are described briefly in the following sections.

2.2.1. Discrete Wavelet Transform (DWT)


The DWT transforms the signal from the time domain to the wavelet domain and delivers different coefficient values. In the DWT, the given heart rate signals are passed through high pass and low pass filters. Once filtering is done, half of the samples are eliminated as the output is sub-sampled by 2. This is the first level of decomposition. Then the low pass filter coefficients are subjected to low pass and high pass filtering again and this procedure is repeated for different levels of decomposition. At each level, the number of samples and the frequency band are halved [55]. This converts a signal into low pass (approximate) coefficients and high pass (detailed) coefficients. In this work, we have used the db8 mother wavelet function [17]. We performed DWT on HR signals up to five levels, and then extracted features of energy, ApEn, SampEn, kurtosis, and skewness.
In this work, A5 is the fifth level of the approximate coefficients and D1–D5 correspond to the first to fifth level detailed coefficients. Fig. 3 shows the DWT performed on RR interval signals of normal and DM patients.
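The filter-and-sub-sample recursion described above can be sketched in a few lines. Note the substitution: the paper uses the db8 wavelet, whereas this sketch uses the Haar wavelet, because its two-tap filters can be written out without a wavelet library. The function names and the 1024-sample synthetic signal are hypothetical; only the five-level structure and the D1–D5/A5 naming follow the text.

```python
import math

def haar_dwt_step(signal):
    """One decomposition level: low pass / high pass filtering, then sub-sampling by 2.

    If the input length is odd, the last sample is ignored.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavedec(signal, levels=5):
    """Return the detail bands D1..D<levels> and the final approximation A<levels>."""
    coeffs = {}
    approx = list(signal)
    for level in range(1, levels + 1):
        # Only the low pass (approximate) output is decomposed again.
        approx, detail = haar_dwt_step(approx)
        coeffs["D%d" % level] = detail
    coeffs["A%d" % levels] = approx
    return coeffs

# Synthetic stand-in for an HR series (hypothetical data, power-of-two length).
signal = [math.sin(0.05 * n) + 0.1 * math.sin(1.3 * n) for n in range(1024)]
coeffs = wavedec(signal, levels=5)
lengths = {name: len(band) for name, band in coeffs.items()}
```

Each level halves the number of samples, so a 1024-sample input gives 512 coefficients in D1 and 32 each in D5 and A5. Because the Haar transform is orthonormal, the total energy of the sub-bands equals that of the input signal.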

Fig. 1. Proposed system.


Fig. 2. Typical RR interval signals: (a) normal subject and (b) diabetic subject.

2.2.2. Energy (E)


It is obtained from the squared DWT coefficients of the heart rate signal.
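A minimal sketch of the energy feature follows, under two assumptions that the text does not spell out: the energy of a sub-band is taken as the sum of its squared coefficients, and the relative wavelet energy mentioned in the abstract is taken as each sub-band's share of the total. The function names and the coefficient values are hypothetical.

```python
def band_energy(coefficients):
    """Energy of one sub-band: the sum of its squared DWT coefficients."""
    return sum(c * c for c in coefficients)

def relative_energies(subbands):
    """Share of the total energy carried by each sub-band (the shares sum to 1)."""
    energies = {name: band_energy(c) for name, c in subbands.items()}
    total = sum(energies.values())
    return {name: e / total for name, e in energies.items()}

# Hypothetical coefficients for a two-level decomposition, for illustration only.
subbands = {"D1": [0.5, -0.5], "D2": [1.0], "A2": [2.0]}
rel = relative_energies(subbands)
```

Normalising by the total makes the feature independent of the overall signal amplitude, which may be why relative energies are preferred when comparing subjects.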
2.2.3. Approximate Entropy (ApEn)
It is a method used to quantify the amount of regularity and
unpredictability of signal variations [43]. This regularity statistic
has potential application in ECG and heart rate data analysis/time
series [44]. A signal varying rhythmically has small ApEn and vice
versa. Herein, we have used the ApEn formula proposed by Pincus
et al. [45].
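The ApEn statistic can be sketched as follows. This follows the commonly used textbook definition attributed to Pincus; the formula itself is not reproduced in this paper, and the embedding dimension m = 2 and tolerance r = 0.2 times the standard deviation are assumed defaults rather than values reported by the authors. All names are hypothetical.

```python
import math
import random

def _phi(x, m, r):
    """Average log-frequency of length-m templates that stay within tolerance r."""
    n = len(x) - m + 1
    templates = [x[i:i + m] for i in range(n)]
    total = 0.0
    for ti in templates:
        # Chebyshev distance; each template matches itself, so the count is never zero.
        matches = sum(1 for tj in templates
                      if max(abs(a - b) for a, b in zip(ti, tj)) <= r)
        total += math.log(matches / n)
    return total / n

def approximate_entropy(x, m=2, r_factor=0.2):
    """ApEn(m, r) with r = r_factor * standard deviation (assumed parameters)."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    r = r_factor * sd
    return _phi(x, m, r) - _phi(x, m + 1, r)

regular = [0.0, 1.0] * 50                          # strictly rhythmic signal
rng = random.Random(0)
noisy = [rng.gauss(0.0, 1.0) for _ in range(100)]  # irregular signal
apen_regular = approximate_entropy(regular)
apen_noisy = approximate_entropy(noisy)
```

The rhythmic sequence gives an ApEn close to zero and the irregular one a larger value, in line with the statement that a signal varying rhythmically has small ApEn.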
2.2.4. Sample Entropy (SampEn)
It is a modification of approximate entropy used for the assessment of complexity and regularity of physiological time-series [57]. Unlike ApEn, SampEn is largely independent of data length and performs more consistently. A signal with more repeating patterns will have small SampEn and vice versa.
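A sketch of SampEn in its commonly used form is given below. As with ApEn, m = 2 and r = 0.2 times the standard deviation are assumed defaults, not values stated in the paper, and the names are hypothetical. The main difference from ApEn is that self-matches are excluded and a single logarithm of a ratio of counts is taken.

```python
import math
import random

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn(m, r) = -ln(A / B), with r = r_factor * standard deviation (assumed)."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * sd

    def count_pairs(length):
        # The same N - m template start points are used for both lengths,
        # and a template is never compared with itself.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_pairs(m)       # matching template pairs of length m
    a = count_pairs(m + 1)   # matching template pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")  # not defined for this record length and tolerance
    return -math.log(a / b)

regular = [0.0, 1.0] * 50
rng = random.Random(0)
noisy = [rng.gauss(0.0, 1.0) for _ in range(100)]
sampen_regular = sample_entropy(regular)
sampen_noisy = sample_entropy(noisy)
```

For the strictly repeating sequence every template match of length m extends to length m + 1, so the ratio is 1 and SampEn is zero; the irregular sequence yields a larger value.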
2.2.5. Kurtosis and skewness
These two values are used to assess the probability distributions of the signal series [46]. Kurtosis indicates whether the data is peaked or flat relative to the normal distribution. Skewness measures the asymmetry of the tails of the distribution. The kurtosis (Kur) and skewness (Skw) are defined as

$$\mathrm{Kur} = \frac{E\{(X - \mu)^4\}}{\sigma^4}$$

$$\mathrm{Skw} = \frac{E\{(X - \mu)^3\}}{\sigma^3}$$

where X is the random variable representing the signal, μ is the mean value of the data set, and σ represents the standard deviation of the data set.

Fig. 3. Typical DWT plots of RR interval signals: (a) normal and (b) diabetic subject.

2.3. Feature ranking
Ranking methods are among the fastest approaches to the feature selection problem. Feature ranking is used to select a subset of features, which reduces the classifier's complexity without degrading its performance. In our work, different feature ranking methods, namely the Bhattacharyya space algorithm, t-test, Wilcoxon test, Receiver Operating Characteristic (ROC), and entropy, are used to rank the significant features. These feature ranking methods are briefly explained below.
2.3.1. Bhattacharyya method
In this method, the features are ranked according to their ability to discriminate the training data. The Bhattacharyya ranking method yields a single evaluation route, thereby reducing the number of classifications needed as each feature is added [29].
2.3.2. t-test
The Student's t-test is used to determine whether the means of two sets are different or not [13]. The test gives the p-value and t-value for the features extracted for the two groups of data. Statistically, a low p-value is preferred (p < 0.05), and the higher the t-value, the better the ranking. Hence in this work, the low p-value features are selected and the t-values are used to rank them.
2.3.3. Wilcoxon test
It assesses the difference between two related samples. This is a paired test that is suitable for comparing two different measurement sets made on the same data [61].
2.3.4. Receiver Operating Characteristic (ROC) method
In this method, the sensitivity and specificity of a diagnostic test are evaluated at different threshold values to obtain the ROC curve, which is plotted as sensitivity versus 1 − specificity. A test that perfectly discriminates between the two groups would yield a curve that passes through the upper-left corner of the plot; then, by determining the area under the curve, the soundness of a test can be assessed. In practice, the area varies between 0.5 and 1; the closer the area is to 1, the better the test, and the closer it is to 0.5, the worse the test [39].
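The ROC construction and its area can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes that the values of a single feature are used as the test score, that larger scores indicate the diabetic class, and that the area is computed with the trapezoidal rule. The scores, labels and function names are hypothetical.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) pairs obtained by sweeping the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical feature values and class labels (1 = diabetic, 0 = normal).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
area = auc(roc_points(scores, labels))

# A feature that separates the classes perfectly gives an area of 1.
perfect_area = auc(roc_points([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 1, 0, 0, 0]))
```

For ranking purposes, features can then be sorted by their area, with values nearer to 1 placed first.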
2.3.5. Entropy based test
This method is based on the fact that entropy is lower for orderly layout and higher for disorderly layout. In this method, the features are ranked in descending order of relevance, by finding the descending order of the entropies after removing each feature one at a time [18].

2.4. Classification
In our work, we have used the ten-fold cross validation method to evaluate the classifiers [2]. Our main objective is to obtain the best classification accuracy by using the minimum number of ranked features, and to identify the best classifier. In this method, the whole set of ranked features is first divided into 10 approximately equal parts, with 9 parts (147 data files) being used for training the classifier, followed by using the trained classifier on the one remaining part (16 data files) to evaluate its performance. This whole process is repeated 10 times by taking different parts for the training and testing datasets. The classifier performance is measured by using the average value of the ten folds. The different classifiers used in our study are explained below.
2.4.1. Decision Tree (DT)
This classifier uses the significant features from the training data to construct a tree [33]. The two classes are defined by using the rules extracted from the constructed tree. Then the class of the test data is determined using these rules. The main advantage of this classifier is its ability to break down a complex decision-making process into a collection of simpler decisions, thereby providing a solution which is often easier to interpret. There may be difficulties involved in designing an optimal DT classifier. The performance of a DT classifier strongly depends on how well the tree is designed.
2.4.2. K-Nearest Neighbor (KNN)


It is a simple classifier that determines the k nearest neighbors of a test sample by using the minimum distance between the testing and training data [32]. The test sample is assigned the class that is most common among its k nearest neighbors. This classifier has poor run-time performance when the training set is large. In this work, we have used k = 3.
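The KNN rule with k = 3, evaluated with the ten-fold scheme of Section 2.4, can be sketched as below. Several details are assumptions made only for illustration: Euclidean distance, fold assignment by sample index modulo 10 (no shuffling or stratification), and a synthetic two-cluster data set. None of the names come from the paper.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority class among the k training samples closest (Euclidean) to the query."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def ten_fold_accuracy(x, y, k=3, folds=10):
    """Average test accuracy over the folds; sample i is placed in fold i % folds."""
    accuracies = []
    for fold in range(folds):
        test_idx = [i for i in range(len(x)) if i % folds == fold]
        train_idx = [i for i in range(len(x)) if i % folds != fold]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_x, train_y, x[i], k) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds

# Hypothetical, well separated two-class feature vectors (0 = normal, 1 = diabetic).
x = [[0.01 * i, 0.0] for i in range(20)] + [[5.0 + 0.01 * i, 5.0] for i in range(20)]
y = [0] * 20 + [1] * 20
mean_accuracy = ten_fold_accuracy(x, y, k=3)
```

Every prediction requires a distance to each training sample, which illustrates the run-time cost noted above when the training set is large.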
2.4.3. Naive Bayes Classifier (NBC)
It is a probabilistic classifier which works on the principle of Bayes' theorem, and on the assumption that the features are independent random variables [24]. The main advantage of this classifier is that it requires a small amount of training data to estimate the parameters (means and variances of the variables) required for classification. The most important downside of this classifier is that it has strong feature independence assumptions.
2.4.4. Support Vector Machine (SVM)
It is one of the most widely used classifiers, which constructs a separating hyperplane in a feature space that separates the training data into two classes [19]. If the data are not linearly separable, kernel functions are used to map the original input data to a higher dimensional feature space where the features might become linearly separable. This work concerns polynomial kernel functions of order 1, 2 and 3 and radial basis function (RBF) kernels. We have used Least Square SVM (LS-SVM) in this work [53]. A major advantage of SVM is that it mitigates the curse of dimensionality and the local minima problems of traditional machine learning. Its generalization ability is good when dealing with small sample size problems. The biggest limitation of the SVM lies in the choice of the kernel, and the most serious one from a practical point of view is the high algorithmic complexity and extensive memory requirements.
3. Results
In our work, we have extracted a total of twenty-six features from HRV signals by using the DWT method. Table 1 shows the results of statistical analysis. The results of automated detection and classification of HRV signals of DM subjects are tabulated in Table 2. Ten-fold cross validation was performed on the features ranked by the different ranking methods; this resulted in an average accuracy of 92.02%, sensitivity of 92.59% and specificity of 91.46%, as shown in Table 2.

Table 1
Range (Mean ± Standard Deviation) of features extracted from normal and diabetic RR interval signals.

| Features | Normal (Mean) | Normal (SD) | Diabetes (Mean) | Diabetes (SD) | p-value | t-value |
|---|---|---|---|---|---|---|
| ApEn_D1 | 0.878684 | 0.09963 | 0.766539 | 0.226049 | 6.37E-05 | 4.106849 |
| Kur_D3 | 0.050443 | 0.049304 | 0.14026 | 0.205666 | 0.000174 | 3.8445 |
| SampEn_A5 | 0.414618 | 0.141522 | 0.348454 | 0.106587 | 0.000946 | 3.368416 |
| Kur_D2 | 0.027187 | 0.042692 | 0.098319 | 0.19369 | 0.00142 | 3.246782 |
| Kur_D1 | 0.021703 | 0.052228 | 0.079683 | 0.157082 | 0.001826 | 3.169823 |
| ApEn_D3 | 0.850749 | 0.084797 | 0.785439 | 0.168723 | 0.002089 | 3.128092 |
| ApEn_D2 | 0.859135 | 0.108898 | 0.786305 | 0.214654 | 0.006906 | 2.736556 |
| Kur_D4 | 0.097679 | 0.090308 | 0.150161 | 0.166174 | 0.013084 | 2.509336 |
| ApEn_A5 | 0.665725 | 0.172749 | 0.609154 | 0.147757 | 0.026095 | 2.245531 |
| SampEn_D1 | 0.712377 | 0.113272 | 0.646319 | 0.251198 | 0.031581 | 2.168592 |
| Skw_D2 | 0.397354 | 0.039024 | 0.428037 | 0.138865 | 0.055938 | 1.92543 |
| ApEn_D4 | 0.731534 | 0.147349 | 0.688021 | 0.191352 | 0.105526 | 1.627786 |
| E_D1 | 0.002803 | 0.003401 | 0.019003 | 0.113083 | 0.196578 | 1.296734 |
| Skw_A5 | 0.503106 | 0.121693 | 0.528192 | 0.143809 | 0.23085 | 1.202721 |
| ApEn_D5 | 0.667284 | 0.158245 | 0.637826 | 0.183521 | 0.273877 | 1.097925 |
| E_A5 | 7.58E-05 | 2.93E-05 | 0.012382 | 0.111107 | 0.317341 | 1.003052 |
| E_D5 | 0.000253 | 0.000313 | 0.01238 | 0.111107 | 0.324434 | 0.988411 |
| Skw_D1 | 0.636675 | 0.052334 | 0.620989 | 0.133937 | 0.325119 | 0.987009 |
| E_D4 | 0.000343 | 0.00035 | 0.012399 | 0.111105 | 0.327228 | 0.982703 |
| E_D2 | 0.002461 | 0.002364 | 0.014506 | 0.111103 | 0.32778 | 0.981578 |
| Skw_D3 | 0.424324 | 0.042139 | 0.438184 | 0.121263 | 0.330019 | 0.977032 |
| E_D3 | 0.001619 | 0.001354 | 0.01256 | 0.111088 | 0.37381 | 0.891839 |
| Skw_D4 | 0.391949 | 0.066628 | 0.380989 | 0.12243 | 0.478117 | 0.710994 |
| Skw_D5 | 0.436783 | 0.188226 | 0.455444 | 0.163141 | 0.499988 | 0.676036 |
| Kur_D5 | 0.193536 | 0.179643 | 0.204018 | 0.163548 | 0.697491 | 0.389405 |
| Kur_A5 | 0.145777 | 0.130321 | 0.151545 | 0.177971 | 0.813513 | 0.236283 |
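The t-values in Table 1 can be related to the tabulated means and standard deviations. The sketch below assumes the equal-variance (pooled) form of the two-sample Student t statistic, reported as an absolute value; the paper does not state which variant was used, so this is an assumption, chosen because it closely reproduces the tabulated value for ApEn_D1. The group sizes (82 normal and 81 diabetic datasets) come from Section 2.1; the function name is hypothetical.

```python
import math

def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Absolute two-sample Student t statistic (equal-variance form) from group summaries."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return abs(mean1 - mean2) / standard_error

# ApEn_D1 row of Table 1: normal group first, diabetic group second.
t_apen_d1 = pooled_t_from_summary(0.878684, 0.09963, 82, 0.766539, 0.226049, 81)
```

The result agrees with the tabulated t-value of 4.106849 to within the rounding of the table entries. Ranking by this statistic places the features with the largest standardized difference in group means first.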
Fig. 4 shows the plot of accuracy versus number of features for various ranking methods. It clearly shows that the t-test method yields the highest classification accuracy for 21 ranked features, beyond which there is a drop in the accuracy level. Fig. 5 shows the plot of average accuracy (%), sensitivity (%) and specificity (%) versus different folds of ten-fold cross-validation for DT classifier.
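The sensitivity, specificity and accuracy reported in Table 2 follow from the confusion counts (TP, TN, FP, FN) by the usual definitions. A minimal sketch, with a hypothetical function name and the KNN row of Table 2 as the worked example:

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy (in %) from confusion-matrix counts."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

# KNN row of Table 2: TP = 74, TN = 76, FP = 6, FN = 7.
sens, spec, acc = diagnostic_metrics(74, 76, 6, 7)
```

With these counts the function gives approximately 91.36% sensitivity, 92.68% specificity and 92.02% accuracy, matching the KNN row.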
It can be noted from Table 1 that all the entropies in the different levels of detailed coefficients have decreased for the diabetic class due to a decrease in the variability. Also, the kurtosis and energy of the detailed coefficients, and the skewness at most levels (D2, D3 and D5), have higher values for the diabetic class than for the normal class.
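The kurtosis and skewness compared above are defined in Section 2.2.5 as the fourth and third central moments, each scaled by the matching power of the standard deviation. A minimal sketch, assuming population (1/N) moments; the function names and the two example sequences are hypothetical.

```python
import math

def standardized_moment(x, order):
    """E{(X - mu)^order} / sigma^order, using population (1/N) moments."""
    n = len(x)
    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    return sum((v - mu) ** order for v in x) / n / sigma ** order

def skewness(x):
    return standardized_moment(x, 3)

def kurtosis(x):
    return standardized_moment(x, 4)

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]      # symmetric about its mean
right_tailed = [0.0, 0.0, 0.0, 0.0, 10.0]    # one large value forms a right tail
skw_symmetric = skewness(symmetric)
kur_symmetric = kurtosis(symmetric)
skw_right = skewness(right_tailed)
```

A symmetric sequence has zero skewness, while a sequence with a long right tail has positive skewness.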
4. Discussion
In our work, we have developed an automated DM diagnostic system by extracting the energy and entropy features of the first five levels of detailed coefficients of DWT. Table 3 provides a summary of the works that discriminate DM automatically by using HRV analysis.
Our results show that the entropy features (namely the variables ApEn and SampEn of Table 1) are consistently lower for DM than for controls. This is in accordance with some very recent studies that showed how diabetes reduced the entropy of the EMG signals [59] and of the near-infrared signals measuring muscle metabolism [36]. This decreased signal entropy is found in the electrical activation of the muscles, and it suggests that, during metabolism, diabetes might alter the muscle fiber conduction velocity and membrane functioning. We believe the entropy of the signal is also a very important parameter when analyzing the HRV signal, because it might directly reflect a neuromuscular effect of DM.

Table 2
Results of classification by using various classifiers (features ranked using t-test method).

| Classifiers | Features | TP | TN | FP | FN | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| DT | 8 | 75 | 76 | 6 | 6 | 92.59 | 92.68 | 92.64 |
| KNN | 5 | 74 | 76 | 6 | 7 | 91.36 | 92.68 | 92.02 |
| NBC | 13 | 24 | 78 | 4 | 57 | 29.63 | 95.12 | 62.58 |
| SVM Polynomial 1 | 4 | 57 | 78 | 4 | 24 | 70.37 | 95.12 | 82.82 |
| SVM Polynomial 2 | 6 | 67 | 72 | 10 | 14 | 82.72 | 87.80 | 85.28 |
| SVM Polynomial 3 | 6 | 75 | 67 | 15 | 6 | 92.59 | 81.71 | 87.12 |

Table 3
Studies conducted to discriminate normal and diabetic subjects using HRV signals.

| Authors | Methods | Features | Classifier, number of features | Performance |
|---|---|---|---|---|
| Pfeifer et al. [41] | Time domain | RR variations | Nil, one | Supine HRV during a beta-adrenergic blockade and deep respiratory rate can effectively estimate parasympathetic nervous activity in diabetic and control subjects |
| Singh et al. [52] | Time and frequency domain | SDNN, high and Low Frequency (LF) power, LF/HF | Nil, four | LF power and LF/HF ratio were lower in DM |
| Awdah et al. [11] | Time domain | SDRR, NN50, RMSSD, pNN50%, etc | Nil, eight | Decreased with diabetes |
| Faust et al. [20] | Time, Frequency domain and nonlinear | All features | Nil, Time domain: seven, Freq domain: three, Nonlinear: twenty-two | Decreased with diabetes |
| Kirvela et al. [31] | Time and frequency domain | All time domain and frequency domain features | Nil, All time and frequency domain features | Diminished HRV has been observed in diabetic autonomic neuropathy |
| Flynn et al. [22] | Detrended fluctuation analysis | Short range correlation (α1) | Nil, One | Short range correlation (α1) decreases for diabetes subjects |
| Chemla et al. [16] | FFT and Autoregressive spectral analysis | LF/HF ratio, LF(nu), and HF(nu) | Nil, Three | Decreased value for diabetes subjects |
| Schroeder et al. [49] | Time domain | SD, root mean square of successive differences in normal-to-normal R-R intervals | Nil, Three | Decreased value for diabetes subjects |
| Ahamed Seyd et al. [8] | Time and frequency domain | All time domain and frequency domain features | Nil, Time domain: nine, Freq domain: eleven | All parameters reduced with diabetes |
| Trunkvalterova et al. [58] | Nonlinear | Multiscale entropy (MSE) | Nil, MSE | MSE was significantly reduced on scales 2 and 3 in DM |
| Nolan et al. [38] | Time and frequency domain | High frequency (HF) power, root mean square of successive differences between RR intervals, total RR variability | Nil, Three | Between HRV measures, duration of Type 1 and Type 2 diabetes relationship is inverse |
| Acharya et al. [3] | Nonlinear | RQA features, Correlation dimension, long term variability | Perceptron-AdaBoost, five | Diabetes Index, Accuracy: 86% Sensitivity: 87.5% Specificity: 84.6% |
| Swapna et al. [54] | HOS | Bispectrum moments, entropies and weighted centers | GMM, eight | Accuracy: 90.5% Sensitivity: 85.7% Specificity: 95.2% |
| Jian et al. [28] | HOS | PCA features | SVM | Accuracy: 79.93% |
| Acharya et al. [5] | Nonlinear | RQA features, ApEn | Least Squares-AdaBoost, four | Accuracy: 90.0% Sensitivity: 92.5% Specificity: 88.7% |
| Pachori et al. [42] (in press) | Nonlinear | Mean frequency using Fourier–Bessel series expansion, two bandwidth parameters (amplitude modulation and frequency modulation bandwidths) and Analytic Signal Representation (ASR) and Second Order Difference Plot (SODP) | Kruskal–Wallis statistical test, five | Features provide statistically significant difference between diabetic and normal classes |
| This work | DWT | Entropies, energy, skewness and kurtosis | DT, eight | Accuracy: 92.02% Sensitivity: 92.59% Specificity: 91.46% |

This newly developed system has the following advantages:

(a) The developed software is repeatable and not prone to any
inter/intra-observer variability.
(b) This diagnostic tool will eliminate the need for repeated tests
to confirm DM, and thereby provide a more reliable and
faster diagnosis.
(c) This method is highly effective in situations where a lot
of data must be collected over long durations to understand
and identify the abnormality.
(d) Our method performed better than the rest of the techniques
reported in the above table.
(e) The proposed system is robust (ten-fold stratified cross-validation)
and reduces the burden on the clinicians.
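
The ten-fold stratified cross-validation mentioned in (e) can be illustrated with a minimal sketch. It only shows how samples might be dealt into ten folds so that each fold keeps roughly the same normal-to-diabetic ratio; the function name `stratified_folds`, the toy labels and the random seed are assumptions made for illustration and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share
        # of this class; the offset continues where the last class stopped.
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)
    return folds


# Hypothetical labels: 0 = normal, 1 = diabetic (not the study data).
labels = [0] * 30 + [1] * 20
folds = stratified_folds(labels, k=10)

for test_idx in folds:
    train_idx = [i for fold in folds if fold is not test_idx for i in fold]
    # A classifier would be trained on train_idx and tested on test_idx here.
```

Each fold serves once as the test set while the remaining nine folds form the training set, so every sample is tested exactly once.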


Fig. 4. Plot of accuracy versus number of features for the various ranking methods.

Fig. 5. Plot of average accuracy (%), sensitivity (%) and specificity (%) versus different folds of ten-fold cross-validation for DT classifier.
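
For reference, accuracy, sensitivity and specificity are conventionally derived from the confusion counts of a binary classifier. The sketch below assumes that the diabetic class is the positive class and uses made-up labels; the function names are illustrative only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def performance(tp, fp, tn, fn):
    """Return accuracy, sensitivity and specificity as percentages."""
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    sensitivity = 100.0 * tp / (tp + fn)  # share of positive cases detected
    specificity = 100.0 * tn / (tn + fp)  # share of negative cases recognized
    return accuracy, sensitivity, specificity


# Hypothetical test-fold labels: 1 = diabetic, 0 = normal (not the study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
acc, sen, spe = performance(*confusion_counts(y_true, y_pred))
```

In a ten-fold setting these three values would be computed once per fold and then averaged.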

5. Conclusion

Diabetes is identified as one of the most rapidly growing health concerns
in rural and urban areas of developed and developing countries. Early
intervention and continued treatment help to keep diabetes under control.
In this work, we have provided a tutorial on how diabetes is associated
with cardiovascular autonomic neuropathy, which affects HRV. Hence, we can
detect diabetes by carrying out HRV spectral analysis. We have presented
an automated DM detection system that uses DWT features (energy and
entropy) extracted from the HRV signals. Using the presented method, we
have obtained an accuracy, sensitivity and specificity of 92.02%, 92.59%
and 91.46%, respectively, with the DT classifier. The proposed method can
be further extended to develop a CAD system that can assist clinicians in
screening diabetic subjects.
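
To make the idea of DWT energy and entropy features concrete, the following is a minimal sketch under stated assumptions. It uses the Haar wavelet and the Shannon entropy of normalized squared coefficients, both chosen here for simplicity; the wavelet family, decomposition depth and entropy definitions used in this work may differ, and the RR-interval values are invented.

```python
import math


def haar_dwt_step(signal):
    """One Haar DWT level: return (approximation, detail) coefficients.

    A trailing unpaired sample (odd-length input) is dropped.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level Haar DWT: return final approximation and detail sub-bands."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return approx, details


def energy(coeffs):
    """Sum of squared coefficients in one sub-band."""
    return sum(c * c for c in coeffs)


def shannon_entropy(coeffs):
    """Shannon entropy (bits) of the normalized squared coefficients."""
    total = energy(coeffs)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in coeffs:
        p = c * c / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical RR intervals in seconds (not the study data).
rr = [0.80, 0.82, 0.79, 0.85, 0.81, 0.78, 0.83, 0.80]
approx, details = haar_dwt(rr, levels=2)
features = []
for band in details + [approx]:
    features.extend([energy(band), shannon_entropy(band)])
```

The resulting `features` list holds one energy and one entropy value per sub-band, which is the kind of vector that could then be ranked and passed to a classifier such as a decision tree.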

References

[1] I.V. Aaron, E.M. Raelene, D.M. Braxton, R. Roy, Diabetic autonomic neuropathy,
Diabetes Care 26 (2003) 1553-1579.
[2] U.R. Acharya, S. Vinitha Sree, C.A. Ang Peng, S.S. Jasjit, Use of principal
component analysis for automatic classification of epileptic EEG activities in
wavelet framework, Expert Syst. Appl. 39 (2012) 9072-9078.
[3] U.R. Acharya, O. Faust, S. Vinitha Sree, D.N. Ghista, S. Dua, P. Joseph, A.V.I.
Thajudin, N. Janarthanan, T. Tamura, An integrated diabetic index using heart
rate variability signal features for diagnosis of diabetes, Comput. Method
Biomech. Biomed. Eng. 16 (2013) 222-234.
[4] U.R. Acharya, N. Kannathal, S.M. Krishna, Comprehensive analysis of cardiac
health using heart rate signals, Physiol. Meas. 25 (2004) 1139-1151.
[5] U.R. Acharya, O. Faust, N.A. Kadri, J.S. Suri, W. Yu, Automated identification of
normal and diabetes heart rate signals using nonlinear measures, Comput. Biol.
Med. 43 (10) (2013) 1523-1529.
[6] U.R. Acharya, K.P. Joseph, N. Kannathal, M.L. Choo, J.S. Suri, Heart rate
variability: a review, Med. Biol. Eng. Comput. 44 (2006) 1031-1051.
[7] American Diabetes Association (ADA), Fast Facts: Data and statistics about
diabetes, 2013.
[8] P.T. Ahamed Seyd, T.V.I. Ahamed, J. Jeevamma, P.K. Joseph, Time and frequency
domain analysis of heart rate variability and their correlations in diabetes
mellitus, World Acad. Sci., Eng. Technol. 2 (2008) 583-586.
[9] S. Akselrod, D. Gordon, J.B. Madwed, D.C. Snidman, R.J. Cohen, Hemodynamic
regulation: investigation by spectral analysis, Am. J. Physiol. 249 (1985)
867-875.
[10] American Diabetes Association (ADA), Diagnosis and classification of diabetes
mellitus, Diabetes Care, vol. 27, 2004.
[11] A. Awdah, A. Nabil, S. Ahmad, Q. Reem, A. Khidir, Time-domain analysis of
heart rate variability in diabetic patients with and without autonomic
neuropathy, Ann. Saudi Med. 22 (2002) 56.
[12] P. Aurelien, R. Manuel, A.J. Sophie, B. Claire De, Anre Denjean, Spectral analysis
of heart rate variability: interchangeability between autoregressive analysis
and fast Fourier transform, J. Electrocardiol. 39 (2006) 31-37.
[13] J.F. Box, Guinness, Gosset, Fisher, and small samples, Statist. Sci. 2 (1987)
45-52.
[14] Roy Bhaskar, G. Sobhendu, Nonlinear methods to assess changes in heart rate
variability in type 2 diabetic patients, Arq. Bras. Cardiol. (2013).
[15] S. Cerutti, A. Bianchi, B. Bontempi, G. Comi, Power spectrum analysis of heart
rate variability signal in the diagnosis of diabetic neuropathy, Proceedings of
the Annual International Conference of the IEEE Engineering in Medicine and
Biology Society 1 (1989) 12-13.


[16] D. Chemla, J. Young, F. Badilini, P. Maison-Blanche, H. Affres, Y. Lecarpentier, P.
Chanson, Comparison of fast Fourier transform and auto-regressive spectral
analysis for the study of heart rate variability in diabetic patients, Int. J.
Cardiol. 104 (3) (2005) 307-313.
[17] G. Donna, U.R. Acharya, J.M. Roshan, S. Vinitha Sree, T.C. Lim, A.V.I. Thajudin, J.S.
Suri, Automated diagnosis of coronary artery disease affected patients using
LDA, PCA, ICA and discrete wavelet transform, Knowledge Based Syst. 37
(2012) 274-282.
[18] M. Dash, H. Liu, Handling large unsupervised data via dimensionality
reduction, ACM SIGMOD Workshop on Research Issues in Data Mining and
Knowledge Discovery, 1999.
[19] E. Osuna Edgar, F. Robert, G. Federico, Support vector machines: training and
applications, technical report, MIT AI Lab. Centre for Biological and
Computational Learning, March 1997.
[20] O. Faust, U.R. Acharya, F. Molinari, S. Chattopadhyay, T. Tamura, Linear and
non-linear analysis of cardiac health in diabetic subjects, Biomed. Signal
Process. Control 7 (3) (2012) 295-302.
[21] Federico Bellavere, Italo Balzani, Giovanni De Masi, Maurizio Carraro, Pasquale
Carenza, Claudio Cobelli, Karl Thomaseth, Power spectral analysis of heart rate
variations improves assessment of diabetic cardiac autonomic neuropathy,
Diabetes 41 (1992) 633-640.
[22] A.C. Flynn, H.F. Jelinek, M. Smith, Heart rate variability analysis: a useful
assessment tool for diabetes associated cardiac dysfunction in rural and
remote areas, Aust. J. Rural Health 3 (2) (2005) 77-82.
[23] F.J. Herbert, M.I. Hasan, A.A. Hayder, H.K. Ahsan, Association of cardiovascular
risk using nonlinear heart rate variability measures with the Framingham risk
score in a rural population, Front. Physiol., Comput. Physiol. Med. (2013) 4.
[24] J. Han, M. Kamber, J. Pei, Data Mining: Concepts and Techniques, Morgan
Kaufmann, Waltham, MA, 2005.
[25] Chu Duc Hoang Chu, Phan Kien Nguyen, Viet Dung Nguyen, A review of heart
rate variability and its applications, APCBEE Proc. 7 (2013) 80-85.
[26] International Diabetes Federation, Diabetes Atlas, sixth ed., 2013.
[27] Constant Isabelle, Dominique Laude, Isabelle Murat, Jean-Luc Elghozi, Pulse
rate variability is not a surrogate for heart rate variability, Clin. Sci. 97 (1999)
391-397.
[28] Wei Jian Lee, Cheng Lim Teik, Automated detection of diabetes by means of
higher order spectral features obtained from heart rate signals, J. Med. Imaging
Health Inform. 3 (2013) 440-447.
[29] T. Kailath, The divergence and Bhattacharyya distance measures in signal
selection, IEEE Trans. Commun. Technol. 15 (1) (1967) 52-60.
[30] G. Kheder, A. Kachouri, R. Taleb, M.M. Ben, M. Samet, Feature extraction by
wavelet transforms to analyse the heart rate variability during two meditation
technique, 6th WSEAS International Conference on Circuits, Systems,
Electronics, Control and Signal Processing, 2007.
[31] M. Kirvela, K. Salmela, L. Toivonen, A.M. Koivusalo, L. Lindgren, Heart rate
variability in diabetic and non-diabetic renal transplant patients, Acta
Anaesthesiol. Scand. 40 (7) (1996) 804-808.
[32] D.T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining,
KNN, Wiley Interscience, New Jersey, USA, 2004, pp. 90-106 (Chapter 5).
[33] D.T. Larose, Decision trees, Chapter 6 in Discovering Knowledge in Data: An
Introduction to Data Mining, Wiley Interscience, Hoboken, NJ, 2004, pp. 108-126.
[34] A. Malliani, F. Lombardi, M. Pagani, Power spectrum analysis of heart rate
variability: a tool to explore neural regulatory mechanisms, Brit. Heart J. 71
(1994) 1-2.
[35] A. Metin, Nonlinear Biomedical Signal Processing. Fuzzy Logic, Neural
Networks, and New Algorithms, vol. 1, IEEE Press, Fuzzy logic, 2000.
[36] F. Molinari, U.R. Acharya, R.J. Martis, R. De Luca, G. Petraroli, W. Liboni, Entropy
analysis of muscular near-infrared spectroscopy (NIRS) signals during exercise
programme of type 2 diabetic patients: quantitative assessment of muscle
metabolic pattern, Comp. Method Prog. Biomed. 112 (2013) 518-528.
[37] Karim Nasim, Hasan Jahan Ara, Ali Syed Sanowar, Heart rate variability - a
review, J. Basic Appl. Sci. 7 (2011) 71-77.
[38] R.P. Nolan, S.M. Barry-Bianchi, A.E. Mechetiuc, M.H. Chen, Sex-based
differences in the association between duration of type 2 diabetes, Diab.
Vasc. Dis. Res. 6 (2009) 276-282.

[39] N.A. Obuchowski, Receiver operating characteristic curves and their use in
radiology, Radiology 229 (2003) 3-8.
[40] J. Pan, W.J. Tompkins, A real time QRS detection algorithm, IEEE Trans. Biomed.
Eng. 32 (3) (1985) 230-236.
[41] M.A. Pfeifer, D. Cook, J. Brodsky, D. Tice, A. Reenan, S. Swedine, J.B. Halter, D.
Porte, Quantitative evaluation of cardiac parasympathetic activity in normal
and diabetic man, Diabetes 31 (4) (1982) 339-345.
[42] R.B. Pachori, P. Avinash, K. Shashank, R. Sharma, U.R. Acharya, Application of
empirical mode decomposition for analysis of normal and diabetic RR-interval
signals, Expert Systems with Applications, 2015 (in press).
[43] S.M. Pincus, Approximate entropy as a measure of system complexity, Proc.
Nat. Acad. Sci. 88 (1991) 2297-2301.
[44] S.M. Pincus, I.M. Gladstone, A.E. Richard, A regularity statistic for medical data
analysis, J. Clin. Monit. (1991) 7.
[45] S.M. Pincus, D.L. Keefe, Quantification of hormone pulsatility via an
approximate entropy algorithm, Am. J. Physiol. 262 (1992) E741-E754.
[46] Shi Ping, Hu Sijung, Z. Yisheng, A preliminary attempt to understand
compatibility of photoplethysmographic pulse rate variability with
electrocardiogramic heart rate variability, J. Med. Biol. Eng. 28 (2008)
173-180.
[47] W. Sarah, S. Richard, R. Gojka, G. Anders, K. Hilary, Global prevalence of
diabetes: estimates for the year 2000 and projections for 2030, Diabetes Care
27 (2004) 1047-1053.
[48] Sarika Tale, T.R. Sontakke, Time-frequency analysis of heart rate variability
signal in prognosis of type 2 diabetic autonomic neuropathy, 2011
International Conference on Biomedical Engineering and Technology, vol. 11,
2011.
[49] E.B. Schroeder, L.E. Chambless, D. Liao, R.J. Prineas, J.W. Evans, W.D. Rosamond,
G. Heiss, Diabetes, glucose, insulin, and heart rate variability: the
atherosclerosis risk in communities (ARIC) study, Diabetes Care 28 (3) (2005)
668-674.
[50] A. Schumacher, Linear and nonlinear approaches to the analysis of RR interval
variability, Biol. Res. Nursing 5 (2004) 211-221.
[51] B. Pomeranz, R.J.B. Macaulay, M.A. Caudill, I. Kutz, D. Adam, K.M. Kilborn, A.C.
Barger, D.C. Shannon, R.J. Cohen, H. Benson, Assessment of autonomic
function in humans by heart rate spectral analysis, Am. J. Physiol. 248
(1985) 151-153.
[52] J.P. Singh, M.G. Larson, C.J. O'Donnell, P.F. Wilson, H. Tsuji, D.M. Lloyd-Jones, D.
Levy, Association of hyperglycemia with reduced heart rate variability (the
Framingham heart study), Am. J. Cardiol. 86 (3) (2000) 309-312.
[53] J.A.K. Suykens, J. Vandewalle, Least square support vector machine classifiers,
Neural Process. Lett. 9 (1999) 293-300.
[54] G. Swapna, U.R. Acharya, V.S. Sree, J.S. Suri, Automated diagnosis of diabetes
using higher order spectra features extracted from heart rate signals, Intell.
Data Anal. 17 (2) (2013) 309-326.
[55] M. Ratnakar, K.S. Sunil, J. Nitisha, Signal filtering using discrete wavelet
transform, Int. J. Recent Trends Eng. (2009) 2.
[56] J.M. Roshan, U.R. Acharya, C.M. Lim, ECG beat classification using PCA, LDA,
ICA, and discrete wavelet transform, Biomed. Signal Process. Control 8 (2013)
437-448.
[57] J.S. Richman, J.R. Moorman, Physiological time-series analysis using approximate
entropy and sample entropy, Am. J. Physiol. Heart Circ. Physiol. 278 (2000)
2039-2049.
[58] Z. Trunkvalterova, M. Javorka, I. Tonhajzerova, J. Javorkova, Z. Lazarova, K.
Javorka, M. Baumert, Reduced short-term complexity of heart rate and blood
pressure dynamics in patients with diabetes mellitus type 1: multiscale
entropy analysis, Physiol. Meas. 29 (7) (2008) 817-828.
[59] K. Watanabe, T. Miyamoto, Y. Tanaka, K. Fukuda, T. Moritani, Type 2 diabetes
mellitus patients manifest characteristic spatial EMG potential distribution
pattern during sustained isometric contraction, Diab. Res. Clin. Pract. 97 (3)
(2012) 468-473.
[60] WHO Consultation: definition and diagnosis of diabetes mellitus and
intermediate hyperglycemia, 2006.
[61] F. Wilcoxon, Individual comparisons by ranking methods, Biometric Bull. 1
(1945) 80-83.
