Knowledge-Based Systems
journal homepage: www.elsevier.com/locate/knosys
a Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore 599489, Singapore
b Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Malaysia
c University 2020 Foundation, MA, USA
d Biolab, Department of Electronics and Telecommunications, Politecnico di Torino, Torino, Italy
e Department of Mathematics, Anand Institute of Higher Technology, Kazhipattur, Chennai 603 103, India
Article info
Article history:
Received 9 June 2014
Received in revised form 3 February 2015
Accepted 5 February 2015
Available online 12 February 2015
Keywords:
Diabetes
HRV
Classifier
DWT
Feature extraction
Feature ranking
Abstract
Diabetes Mellitus (DM), a chronic lifelong condition, is characterized by increased blood sugar levels. As there is no cure for DM, the major focus lies on controlling the disease. Therefore, DM diagnosis and treatment are of great importance. The most common complications of DM include retinopathy, neuropathy, nephropathy and cardiomyopathy. Diabetes causes cardiovascular autonomic neuropathy that affects the Heart Rate Variability (HRV). Hence, in the absence of other causes, HRV analysis can be used to diagnose diabetes. The present work aims at developing an automated system for classification of normal and diabetes classes by using the heart rate (HR) information extracted from the Electrocardiogram (ECG) signals. The spectral analysis of HRV recognizes patients with autonomic diabetic neuropathy, and gives an earlier diagnosis of impairment of the Autonomic Nervous System (ANS). The HRV spectral indices obtained by using the Discrete Wavelet Transform (DWT) method show significant correlations with the impaired ANS. Herein, in order to diagnose and detect DM automatically, we have performed DWT decomposition up to 5 levels, and extracted the energy, sample entropy, approximate entropy, kurtosis and skewness features at various detailed coefficient levels of the DWT. We have extracted relative wavelet energy and entropy features up to the 5th level of DWT coefficients extracted from HR signals. These features are ranked by using various ranking methods, namely, Bhattacharyya space algorithm, t-test, Wilcoxon test, Receiver Operating Characteristic (ROC) curve and entropy.
The ranked features are then fed into different classifiers, which include Decision Tree (DT), K-Nearest Neighbor (KNN), Naïve Bayes (NBC) and Support Vector Machine (SVM). Our results have shown maximum diagnostic differentiation performance by using a minimum number of features. With our system, we have obtained an average accuracy of 92.02%, sensitivity of 92.59% and specificity of 91.46%, by using the DT classifier with ten-fold cross validation.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction
According to the International Diabetes Federation (IDF), it is estimated that in 2013 a total of 381 million people were diagnosed with diabetes across the globe, out of which 23 million people are from Southeast Asian countries [26]. Due to lack of finance or access to healthcare, a large part of the population around the world is unaware that they may be suffering from diabetes [26]. Statistics show that around 1.9 million people are diagnosed with diabetes in the USA every year and 79 million have pre-diabetic conditions [7]. By 2030,
Corresponding author. Tel.: +65 64608393.
E-mail address: vidya.2kus@gmail.com (K.S. Vidya).
http://dx.doi.org/10.1016/j.knosys.2015.02.005
0950-7051/© 2015 Elsevier B.V. All rights reserved.
more suitable for evaluation of short-term HRV spectral components in diabetic subjects.
Time and frequency domain analysis of the RR interval has been carried out by Ahmad Seyd et al. [8] to quantify autonomic nervous system (ANS) activity in DM patients. In the frequency domain analysis of the extracted NN (normal-to-normal) interval data, significant differences in high frequency (HF), low frequency (LF) and very low frequency (VLF) power were noted between DM patients and normal subjects. The study also observed significant differences between the DM and control groups in two time domain measures: the root mean square of successive NN interval differences (RMSSD) and the standard deviation of NN intervals (SDNN).
The multiscale entropy (MSE) analysis method has also been used by Trunkvalterova et al. [58] to diagnose autonomic dysregulation in DM patients. Their study analyzed the heart rate (HR) signal and the systolic and diastolic blood pressure (SBP and DBP) signals in both normal and diabetic subjects, in order to evaluate SampEn and linear measures. They reported that in young patients with DM, the changes in cardiovascular control were detected by the MSE analysis of SBP and DBP oscillations and HR signals. The relationship between HRV and duration of type 2 diabetes based on sex differences was studied by Nolan et al. [38]. This study reported an inverse relationship between the Type 1 and Type 2 diabetes duration and HRV measures among male subjects only. In female subjects, an inverse association of HRV with increasing age at diabetes diagnosis, as well as with increasing severity of coronary heart disease risk and obesity, was observed.
In 2012, Faust et al. [20] used time domain, frequency domain and nonlinear methods to study the HRV signals of both diabetic and normal subjects; they proposed unique ranges for various features of the two classes. The HRV parameters of diabetic and non-diabetic patients with renal transplantation have been investigated in the time and frequency domains by Kirvela et al. [31]; their results highlighted that, in patients with end-stage diabetic neuropathy, severe impairment of HRV is caused mainly by autonomic neuropathy and partly by co-existing heart disease.
Recently, a novel Diabetic Integrated Index (DII) has been developed by Acharya et al. [3] by using nonlinear parameters extracted from the HRV signal. The DII is a single number that distinguishes and classifies the two classes. They also reported that the AdaBoost classifier yielded a high classification accuracy of 86% for the two classes (normal and diabetic). From the same research group, Swapna et al. [54] used Higher Order Spectral (HOS) features to classify diabetic patients from normal subjects; their method reported the highest accuracy, sensitivity and specificity of 90.5%, 85.7% and 95.2% respectively, by using a Gaussian mixture model classifier. The magnitude plots of the HOS bispectrum obtained from HRV signals have been subjected to principal component analysis for feature reduction [28]. These principal components with an SVM classifier reported an accuracy of 79.93%. However, Acharya et al. [5] reported 90% accuracy, 92.5% sensitivity and 88.7% specificity with an AdaBoost classifier coupled with four nonlinear features. Pachori et al. [42] (in press) proposed a new nonlinear method based on Empirical Mode Decomposition (EMD) to discriminate between normal and diabetic RR-interval signals. In their proposed method, EMD decomposes the RR-interval signal into IMFs from which five features (Fourier-Bessel series expansion, amplitude modulation bandwidth, frequency modulation bandwidths, analytic signal representation and second order difference plot) are extracted. The study results show that the extracted features exhibit statistically significant differences between normal and diabetic classes.
In our present work, in order to automatically diagnose and detect DM, we have performed DWT decomposition up to 5 levels and have extracted the energy, sample entropy, approximate entropy, kurtosis and skewness features at various detailed coefficient levels of the DWT. Fig. 1 shows an overview of our proposed methodology for diabetic HR signal classification. In the off-line system, normal and diabetes RR signal data are analyzed by DWT, performed up to five levels of decomposition. Energy, sample entropy (SampEn), approximate entropy (ApEn), kurtosis and skewness features are extracted from each level of the detailed coefficients of the DWT. Then, these features are ranked by using the Bhattacharyya space algorithm, t-test, Wilcoxon test, Receiver Operating Characteristic (ROC) curve, and entropy method. The ranked features are fed to DT, KNN, NBC and SVM classifiers to obtain the highest classification performance using the minimum number of features. In the on-line system, decomposition up to five levels is performed by using the DWT method and the features (energy, ApEn, SampEn, kurtosis, and skewness) are extracted. These features are ranked and fed to the selected classifiers for automated classification as normal and DM.
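The decomposition and energy steps of this pipeline can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Haar wavelet is an assumption (this section does not name the mother wavelet), and the function names `haar_dwt_step`, `dwt_decompose` and `relative_energy` are hypothetical. The band labels D1 to D5 and A5 follow the feature names used in the paper.

```python
import math

def haar_dwt_step(signal):
    """One Haar analysis step: returns (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:                      # pad odd-length input with its last sample
        signal.append(signal[-1])
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def dwt_decompose(signal, levels=5):
    """Repeatedly split the approximation band; returns D1..D<levels> and A<levels>."""
    bands = {}
    approx = list(signal)
    for level in range(1, levels + 1):
        approx, detail = haar_dwt_step(approx)
        bands[f"D{level}"] = detail
    bands[f"A{levels}"] = approx
    return bands

def relative_energy(bands):
    """Energy of each band divided by the total energy over all bands."""
    energy = {name: sum(c * c for c in coeffs) for name, coeffs in bands.items()}
    total = sum(energy.values())
    return {name: e / total for name, e in energy.items()}

# Example on a synthetic 64-sample signal (assumed data, not from the study)
rr = [0.8 + 0.05 * math.sin(0.3 * i) for i in range(64)]
features = relative_energy(dwt_decompose(rr, levels=5))
```

The entropy, kurtosis and skewness features would be computed on the same bands. In practice a wavelet library would normally replace the hand-written Haar step.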
The flow of the paper is as follows. Section 2 delineates (i) the data acquisition process and pre-processing, (ii) the feature extraction and feature ranking methods, and (iii) classification. The results of this novel diagnostic system are presented in Section 3. The results are discussed in Section 4, and the conclusion is provided in Section 5.
2. Methods for HRV analysis
2.1. Data acquisition/pre-processing
The electrocardiogram (ECG) signals were acquired from 30 subjects (15 subjects with DM and 15 healthy subjects) in a relaxed supine position for 60 min. The ECG recordings were performed by using BIOPAC (Aero Camino Goleta, CA, USA) equipment, and the AcqKnowledge software built into the equipment was used to convert the recordings into heart rate time series. Fig. 2 shows the RR signals of normal and DM patients. The ECG sampling rate was set to 500 Hz. A total of 81 datasets from 15 diabetic subjects (10 male and 5 female) and 82 datasets from 15 normal subjects (8 male and 7 female) were used in this study, with each dataset having 1000 samples. All the subjects were instructed about the aim of the study and signed an informed consent form before being examined. The study received approval from the Kasturba Medical Hospital, Manipal, India. A band-reject filter with a center frequency of 50 Hz was used to remove the power-line interference noise. RR points were detected using the Pan and Tompkins algorithm [40].
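A band-reject filter of the kind described here can be sketched as a second-order (biquad) notch. The 50 Hz center frequency and 500 Hz sampling rate come from the text; the biquad design, the quality factor `q=30` and the function name `notch_filter` are assumptions made for illustration, since the paper does not specify the filter design.

```python
import math

def notch_filter(samples, fs=500.0, f0=50.0, q=30.0):
    """Second-order notch centered at f0 Hz for a signal sampled at fs Hz."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)          # controls the notch width
    b0, b1, b2 = 1.0, -2.0 * math.cos(w0), 1.0
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0                   # previous inputs and outputs
    for x in samples:
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

# Synthetic check: a 50 Hz tone is suppressed, a 5 Hz tone passes through
fs = 500.0
hum = [math.sin(2.0 * math.pi * 50.0 * i / fs) for i in range(2000)]
slow = [math.sin(2.0 * math.pi * 5.0 * i / fs) for i in range(2000)]
hum_residual = max(abs(v) for v in notch_filter(hum)[-500:])
slow_amplitude = max(abs(v) for v in notch_filter(slow)[-500:])
```

A larger `q` gives a narrower notch at the cost of a longer settling time, which is why the check looks only at the last 500 output samples.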
Fig. 2. Typical RR interval signals: (a) normal subject and (b) diabetic subject.
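$$\mathrm{Kur} = \frac{E\{(X-\mu)^4\}}{\sigma^4}$$

$$\mathrm{Skw} = \frac{E\{(X-\mu)^3\}}{\sigma^3}$$

where $\mu$ and $\sigma$ are the mean and standard deviation of $X$.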
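Kurtosis and skewness are the fourth and third standardized moments, E{(X - μ)^4}/σ^4 and E{(X - μ)^3}/σ^3. A minimal sketch of how they could be computed on one band of coefficients is given below. It assumes population (1/N) estimates of the expectations, and the function names are hypothetical.

```python
import math

def central_moment(values, order):
    """Population central moment: mean of (x - mean) ** order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    sigma = math.sqrt(central_moment(values, 2))
    return central_moment(values, 3) / sigma ** 3

def kurtosis(values):
    """Fourth standardized moment (not excess kurtosis)."""
    sigma = math.sqrt(central_moment(values, 2))
    return central_moment(values, 4) / sigma ** 4

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
skw = skewness(symmetric)   # symmetric data, so the third moment vanishes
kur = kurtosis(symmetric)   # m4 = 6.8 and sigma^4 = 4, giving 1.7
```

Both functions divide by a power of σ, so a constant band (σ = 0) would need to be handled separately.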
Fig. 3. Typical DWT plots of RR interval signals: (a) normal and (b) diabetic subject.
and it is plotted as sensitivity versus 1 − specificity. A test that perfectly discriminates between the two groups would yield a curve; then, by determining the area under the curve, the soundness of a test can be assessed. In practice, the area varies between 0.5 and 1: the closer the area is to 1, the better the test, and the closer it is to 0.5, the worse the test [39].
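As an illustration of ranking by area under the ROC curve, the sketch below computes the area for a single feature as the fraction of (diabetic, normal) pairs in which the diabetic value is the larger one, counting ties as one half. This pairwise form equals the area under the empirical ROC curve. The function names and the choice to score a feature by its distance from 0.5 in either direction are assumptions, not details taken from the paper.

```python
def roc_auc(normal_values, diabetic_values):
    """Area under the empirical ROC curve for one feature."""
    wins = 0.0
    for d in diabetic_values:
        for n in normal_values:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(diabetic_values) * len(normal_values))

def rank_features_by_auc(feature_table):
    """feature_table maps a feature name to (normal_values, diabetic_values)."""
    scored = []
    for name, (normal, diabetic) in feature_table.items():
        auc = roc_auc(normal, diabetic)
        # A feature that is lower in diabetics (AUC < 0.5) is equally useful
        scored.append((max(auc, 1.0 - auc), name))
    return [name for _, name in sorted(scored, reverse=True)]

# Toy data (assumed, not from the study)
table = {
    "weak": ([1, 2, 3, 4], [2, 3, 1, 4]),
    "strong": ([1, 2, 3], [4, 5, 6]),
}
ranking = rank_features_by_auc(table)
```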
2.3.5. Entropy based test
This method is based on the fact that entropy is lower for an orderly layout and higher for a disorderly one. In this method, the features are ranked in descending order of relevance, by finding the descending order of the entropies after removing each feature one at a time [18].
2.4. Classification
In our work, we have used the ten-fold cross validation method to evaluate the classifiers [2]. Our main objective is to obtain the best classification accuracy by using the minimum number of ranked features, and to identify the best classifier. In this method, the whole set of ranked features is first divided into 10 equal parts, with the first 9 parts (147 data files) being used for training the classifier, followed by using the trained classifier on the one remaining part (16 data files) to evaluate its performance. This whole process is repeated 10 times by taking different parts for the training and testing datasets. The classifier performance is measured by using the
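The ten-fold procedure described in this section can be sketched with the standard library alone. The shuffling seed and the function name `ten_fold_splits` are assumptions. With 163 data files the folds cannot be exactly equal, so each test fold holds 16 or 17 files, close to the 147/16 split quoted in the text.

```python
import random

def ten_fold_splits(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)           # reproducible shuffle
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

splits = list(ten_fold_splits(163))
test_sizes = sorted({len(test) for _, test in splits})
```

Each pair would be used to train a classifier on `train` and score it on `test`, and the ten scores would then be averaged.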
| Feature | Normal Mean | Normal SD | Diabetes Mean | Diabetes SD | p-value | t-value |
|---|---|---|---|---|---|---|
| ApEn_D1 | 0.878684 | 0.09963 | 0.766539 | 0.226049 | 6.37E-05 | 4.106849 |
| Kur_D3 | 0.050443 | 0.049304 | 0.14026 | 0.205666 | 0.000174 | 3.8445 |
| SampEn_A5 | 0.414618 | 0.141522 | 0.348454 | 0.106587 | 0.000946 | 3.368416 |
| Kur_D2 | 0.027187 | 0.042692 | 0.098319 | 0.19369 | 0.00142 | 3.246782 |
| Kur_D1 | 0.021703 | 0.052228 | 0.079683 | 0.157082 | 0.001826 | 3.169823 |
| ApEn_D3 | 0.850749 | 0.084797 | 0.785439 | 0.168723 | 0.002089 | 3.128092 |
| ApEn_D2 | 0.859135 | 0.108898 | 0.786305 | 0.214654 | 0.006906 | 2.736556 |
| Kur_D4 | 0.097679 | 0.090308 | 0.150161 | 0.166174 | 0.013084 | 2.509336 |
| ApEn_A5 | 0.665725 | 0.172749 | 0.609154 | 0.147757 | 0.026095 | 2.245531 |
| SampEn_D1 | 0.712377 | 0.113272 | 0.646319 | 0.251198 | 0.031581 | 2.168592 |
| Skw_D2 | 0.397354 | 0.039024 | 0.428037 | 0.138865 | 0.055938 | 1.92543 |
| ApEn_D4 | 0.731534 | 0.147349 | 0.688021 | 0.191352 | 0.105526 | 1.627786 |
| E_D1 | 0.002803 | 0.003401 | 0.019003 | 0.113083 | 0.196578 | 1.296734 |
| Skw_A5 | 0.503106 | 0.121693 | 0.528192 | 0.143809 | 0.23085 | 1.202721 |
| ApEn_D5 | 0.667284 | 0.158245 | 0.637826 | 0.183521 | 0.273877 | 1.097925 |
| E_A5 | 7.58E-05 | 2.93E-05 | 0.012382 | 0.111107 | 0.317341 | 1.003052 |
| E_D5 | 0.000253 | 0.000313 | 0.01238 | 0.111107 | 0.324434 | 0.988411 |
| Skw_D1 | 0.636675 | 0.052334 | 0.620989 | 0.133937 | 0.325119 | 0.987009 |
| E_D4 | 0.000343 | 0.00035 | 0.012399 | 0.111105 | 0.327228 | 0.982703 |
| E_D2 | 0.002461 | 0.002364 | 0.014506 | 0.111103 | 0.32778 | 0.981578 |
| Skw_D3 | 0.424324 | 0.042139 | 0.438184 | 0.121263 | 0.330019 | 0.977032 |
| E_D3 | 0.001619 | 0.001354 | 0.01256 | 0.111088 | 0.37381 | 0.891839 |
| Skw_D4 | 0.391949 | 0.066628 | 0.380989 | 0.12243 | 0.478117 | 0.710994 |
| Skw_D5 | 0.436783 | 0.188226 | 0.455444 | 0.163141 | 0.499988 | 0.676036 |
| Kur_D5 | 0.193536 | 0.179643 | 0.204018 | 0.163548 | 0.697491 | 0.389405 |
| Kur_A5 | 0.145777 | 0.130321 | 0.151545 | 0.177971 | 0.813513 | 0.236283 |
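The tabulated t-values can be related to the means and standard deviations through the two-sample t statistic. The sketch below uses the pooled (equal-variance) form, which is an assumption: this section does not state which variant was used. The group sizes of 82 normal and 81 diabetic datasets come from Section 2.1, and the inputs are the ApEn_D1 values from the table. The function name is hypothetical.

```python
import math

def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error

# ApEn_D1: normal group first (n = 82), diabetes group second (n = 81)
t_apen_d1 = pooled_t_statistic(0.878684, 0.09963, 82, 0.766539, 0.226049, 81)
```

Under these assumptions the result is close to the tabulated 4.106849 for ApEn_D1, differing only through rounding of the inputs. The sign of the statistic depends on the order of the two groups, and the table appears to list magnitudes only.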
Table 2
Results of classification by using various classifiers (features ranked using t-test method).

| Classifier | Features | TP | TN | FP | FN | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| DT | 8 | 75 | 76 | 6 | 6 | 92.59 | 92.68 | 92.64 |
| KNN | 5 | 74 | 76 | 6 | 7 | 91.36 | 92.68 | 92.02 |
| NBC | 13 | 24 | 78 | 4 | 57 | 29.63 | 95.12 | 62.58 |
| SVM Polynomial 1 | 4 | 57 | 78 | 4 | 24 | 70.37 | 95.12 | 82.82 |
| SVM Polynomial 2 | 6 | 67 | 72 | 10 | 14 | 82.72 | 87.80 | 85.28 |
| SVM Polynomial 3 | 6 | 75 | 67 | 15 | 6 | 92.59 | 81.71 | 87.12 |
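The percentage columns of Table 2 follow from the TP, TN, FP and FN counts through the usual definitions: sensitivity = TP/(TP + FN), specificity = TN/(TN + FP) and accuracy = (TP + TN)/(TP + TN + FP + FN). A short sketch, with a hypothetical function name, applied to the DT counts:

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) in percent."""
    sensitivity = 100.0 * tp / (tp + fn)      # diabetic files correctly flagged
    specificity = 100.0 * tn / (tn + fp)      # normal files correctly cleared
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

# DT counts from Table 2
sens, spec, acc = diagnostic_metrics(tp=75, tn=76, fp=6, fn=6)
```

Rounded to two decimals, these reproduce the 92.59, 92.68 and 92.64 listed for DT.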
Table 3
Studies conducted to discriminate normal and diabetic subjects using HRV signals.
Authors
Methods
Features
Classifier, number of features
Performance
Time domain
RR variations
Nil, one
Nil, four
Nil, Time
Nil, eight
domain:
seven
Freq domain:
three
Detrended fluctuation analysis
FFT and Autoregressive
spectral analysis
Time domain
Nonlinear:
twenty-two
Nil, All time
and
frequency
domain
features
Nil, One
Nil, Three
Nil, Three
Nil, Time
domain: nine
Freq domain:
eleven
Nil, MSE
Trunkvalterova et al.
[58]
Nolan et al. [38]
Nonlinear
Nonlinear
HOS
HOS
Nonlinear
Nonlinear
This work
DWT
Nil, Three
Perceptron-AdaBoost, five
GMM, eight
SVM
Least Squares-AdaBoost, four
Kruskal-Wallis statistical test, five
DT, eight
Sensitivity: 87.5% Specificity: 84.6%
Accuracy: 90.5% Sensitivity: 85.7% Specificity: 95.2%
Accuracy: 79.93%
Accuracy: 90.0% Sensitivity: 92.5% Specificity: 88.7%
Accuracy: 92.02% Sensitivity: 92.59% Specificity: 91.46%
Fig. 4. Plot of accuracy versus number of features for the various ranking methods.
Fig. 5. Plot of average accuracy (%), sensitivity (%) and specificity (%) versus different folds of ten-fold cross-validation for DT classifier.
5. Conclusion
Diabetes is identified as one of the most rapidly growing health concerns in rural and urban areas of developed and developing countries. Earlier intervention and continued treatment help to keep diabetes under control. In this work, we have described how diabetes is associated with cardiovascular autonomic neuropathy, which affects HRV. Hence, we can detect diabetes by carrying out HRV spectral analysis. We have presented an automated DM detection system that uses DWT features (of energy and entropy) extracted from the HRV signals. Using our presented method, we have obtained an accuracy, sensitivity and specificity of 92.02%, 92.59% and 91.46% respectively by using the DT classifier. The proposed method can be further extended to develop a CAD system which can assist clinicians in screening diabetic subjects.
References
[1] I.V. Aaron, E.M. Raelene, D.M. Braxton, R. Roy, Diabetic autonomic neuropathy, Diabetes Care 26 (2003) 1553–1579.
[2] U.R. Acharya, S. Vinitha Sree, C.A. Ang Peng, S.S. Jasjit, Use of principal component analysis for automatic classification of epileptic EEG activities in wavelet framework, Expert Syst. Appl. 39 (2012) 9072–9078.
[3] U.R. Acharya, O. Faust, S. Vinitha Sree, D.N. Ghista, S. Dua, P. Joseph, A.V.I. Thajudin, N. Janarthanan, T. Tamura, An integrated diabetic index using heart
[39] N.A. Obuchowski, Receiver operating characteristic curves and their use in radiology, Radiology 229 (2003) 3–8.
[40] J. Pan, W.J. Tompkins, A real time QRS detection algorithm, IEEE Trans. Biomed. Eng. 32 (3) (1985) 230–236.
[41] M.A. Pfeifer, D. Cook, J. Brodsky, D. Tice, A. Reenan, S. Swedine, J.B. Halter, D. Porte, Quantitative evaluation of cardiac parasympathetic activity in normal and diabetic man, Diabetes 31 (4) (1982) 339–345.
[42] R.B. Pachori, P. Avinash, K. Shashank, R. Sharma, U.R. Acharya, Application of empirical mode decomposition for analysis of normal and diabetic RR-interval signals, Expert Syst. Appl. (2015) (in press).
[43] S.M. Pincus, Approximate entropy as a measure of system complexity, Proc. Nat. Acad. Sci. 88 (1991) 2297–2301.
[44] S.M. Pincus, I.M. Gladstone, A.E. Richard, A regularity statistic for medical data analysis, J. Clin. Monit. (1991) 7.
[45] S.M. Pincus, D.L. Keefe, Quantification of hormone pulsatility via an approximate entropy algorithm, Am. J. Physiol. 262 (1992) E741–E754.
[46] Shi Ping, Hu Sijung, Z. Yisheng, A preliminary attempt to understand compatibility of photoplethysmographic pulse rate variability with electrocardiogramic heart rate variability, J. Med. Biol. Eng. 28 (2008) 173–180.
[47] W. Sarah, S. Richard, R. Gojka, G. Anders, K. Hilary, Global prevalence of diabetes estimates for the year 2000 and projections for 2030, Diabetes Care 27 (2004) 1047–1053.
[48] Sarika Tale, T.R. Sontakke, Time-frequency analysis of heart rate variability signal in prognosis of type 2 diabetic autonomic neuropathy, in: 2011 International Conference on Biomedical Engineering and Technology, vol. 11, 2011.
[49] E.B. Schroeder, L.E. Chambless, D. Liao, R.J. Prineas, J.W. Evans, W.D. Rosamond, G. Heiss, Diabetes, glucose, insulin, and heart rate variability: the atherosclerosis risk in communities (ARIC) study, Diabetes Care 28 (3) (2005) 668–674.
[50] A. Schumacher, Linear and nonlinear approaches to the analysis of RR interval variability, Biol. Res. Nursing 5 (2004) 211–221.
[51] B. Pomeranz, R.J.B. Macaulay, M.A. Caudill, I. Kutz, D. Adam, K.M. Kilborn, A.C. Barger, D.C. Shannon, R.J. Cohen, H. Benson, Assessment of autonomic function in humans by heart rate spectral analysis, Am. J. Physiol. 248 (1985) 151–153.
[52] J.P. Singh, M.G. Larson, C.J. O'Donnell, P.F. Wilson, H. Tsuji, D.M. Lloyd-Jones, D. Levy, Association of hyperglycemia with reduced heart rate variability (the Framingham heart study), Am. J. Cardiol. 86 (3) (2000) 309–312.
[53] J.A.K. Suykens, J. Vandewalle, Least square support vector machine classifiers, Neural Process. Lett. 9 (1999) 293–300.
[54] G. Swapna, U.R. Acharya, V.S. Sree, J.S. Suri, Automated diagnosis of diabetes using higher order spectra features extracted from heart rate signals, Intell. Data Anal. 17 (2) (2013) 309–326.
[55] M. Ratnakar, K.S. Sunil, J. Nitisha, Signal filtering using discrete wavelet transform, Int. J. Recent Trends Eng. (2009) 2.
[56] J.M. Roshan, U.R. Acharya, C.M. Lim, ECG beat classification using PCA, LDA, ICA, and discrete wavelet transform, Biomed. Signal Process. Control 8 (2013) 437–448.
[57] J.S. Richman, J.R. Moorman, Physiological time-series analysis using approximate entropy and sample entropy, Am. J. Physiol. Heart Circ. Physiol. 278 (2000) 2039–2049.
[58] Z. Trunkvalterova, M. Javorka, I. Tonhajzerova, J. Javorkova, Z. Lazarova, K. Javorka, M. Baumert, Reduced short-term complexity of heart rate and blood pressure dynamics in patients with diabetes mellitus type 1: multiscale entropy analysis, Physiol. Meas. 29 (7) (2008) 817–828.
[59] K. Watanabe, T. Miyamoto, Y. Tanaka, K. Fukuda, T. Moritani, Type 2 diabetes mellitus patients manifest characteristic spatial EMG potential distribution pattern during sustained isometric contraction, Diab. Res. Clin. Pract. 97 (3) (2012) 468–473.
[60] WHO Consultation: definition and diagnosis of diabetes mellitus and intermediate hyperglycemia, 2006.
[61] F. Wilcoxon, Individual comparisons by ranking methods, Biometric Bull. 1 (1945) 80–83.