
Arrhythmia Detection Using Features Extracted from Time Frequency Optimized Biorthogonal Wavelet Bases

Deepak Vijay^{a,∗}, Hitesh Suthar^{a,∗}

^a Department of Electrical Engineering, Institute of Infrastructure Technology Research and
Management, Ahmedabad, India

Abstract

An arrhythmia is an irregular heartbeat pattern in the human cardiovascular
system. This system weakens with age, which raises the risk of heart disorders.
An electrocardiograph is used to record electrocardiogram (ECG) signals from the
heart, and these informative signals are used to detect arrhythmias. Wavelet
bases have been found to be very effective and flexible atoms for the analysis
of biosignals. To obtain optimal wavelet bases, we have used an optimal 13/7
parametrized biorthogonal wavelet filter bank. In our work, we decompose the ECG
sequences into subbands using the optimal 13/7 filter bank and compute the Renyi
entropy, fuzzy entropy, sample entropy, norm and energy of all subbands. These
are used as discriminating features and fed to a quadratic support vector
machine (Q-SVM) for classification into the normal, atrial fibrillation, atrial
flutter and ventricular fibrillation classes. For validation, we have used a
10-fold cross-validation strategy. We have used ECG segments of two seconds and
five seconds. For the five-second data we achieved an accuracy, sensitivity and
positive predictive value (PPV) of 98.00%, 91.20% and 94.72%, respectively; for
the two-second data, the accuracy, sensitivity and PPV are 96.30%, 88.61% and
89.48%, respectively.

Keywords: Arrhythmia, atrial fibrillation, atrial flutter, biorthogonal wavelet
bases, electrocardiogram, joint duration-bandwidth localization,
parametrization, ventricular fibrillation.

1. Introduction

As life expectancy at birth increases, survival at older ages increases as well,
which results in a growing elderly population. It is estimated that life
expectancy worldwide will reach 70 to 80 years by 2050, according

∗ Corresponding author
Email addresses: deepak.vijay.14e@iitram.ac.in (Deepak Vijay),
hitesh.suthar.14e@iitram.ac.in (Hitesh Suthar)

Preprint submitted to Elsevier February 26, 2018


to the report published by the United Nations [1]. People around the world tend
to face health issues after 60 years of age. Projections of life expectancy
depend on progress in postponing mortality caused by the many diseases related
to old age. The diseases that most commonly develop in elderly populations are
cardiovascular disease, cancer, diabetes and respiratory disease. Arrhythmia is
one of the major diseases associated with the human cardiovascular system. With
aging, the cardiovascular system undergoes a decrease in the compliance of blood
vessels through arterial stiffening and thickening, a weakening of the left
ventricle, and an imbalance in diastolic filling. An arrhythmia is defined as an
abnormal heartbeat in which the heart may beat fast (tachycardia), slowly
(bradycardia) or irregularly (premature contraction). Atrial fibrillation
($\alpha_{fib}$), atrial flutter ($\alpha_{fl}$) and ventricular fibrillation
($\nu_{fib}$) are the types of arrhythmia that are usually suspected in elderly
people [2].
$\alpha_{fib}$ can lead to stroke, heart failure and other heart-related
disorders. In a healthy person the heart contracts and relaxes to produce a
regular heartbeat, i.e. the electrical impulse starts in the atria and travels
down to the ventricles, whereas in atrial fibrillation the atria beat
irregularly because the control of electrical activity becomes disorganized [3].
In electrocardiography signals, the rhythm of atrial fibrillation is fast, with
a heart rate of 155 to 225 beats per minute. The fast heart rate is due to the
rapid contraction of the ventricles, and the P-wave is absent in the ECG signal
[4]. Atrial flutter ($\alpha_{fl}$) is a heartbeat rhythm disorder in which the
atria beat very rapidly, although in a more organised manner than in atrial
fibrillation. In $\alpha_{fl}$ the rate is between 240 and 360 beats per minute
and the ECG shows a saw-tooth waveform known as the flutter wave. Ventricular
fibrillation ($\nu_{fib}$) arises from the rapid heartbeat known as ventricular
tachycardia (VT). This rapid heartbeat is caused by electrical impulses in the
ventricles.
The morphology of ECG signals contains a vast amount of information [3] about
the electrical impulses coming from the heart. ECG signals are used to detect
and diagnose cardiac health conditions. They are non-linear and complex in
nature, with amplitudes in the millivolt range, so they are very difficult to
inspect manually. To capture abnormalities, the recorded data should be of
24-hour duration. Because the recordings are so long, manual analysis of the
ECG signal can be time consuming [3, 5]. Hence Computer Aided Diagnosis (CAD) is
required for this analysis. A CAD system reduces the time needed to analyse the
ECG signals [6].
Several studies have addressed the classification of electrocardiograph
signals. Some of the recent works are listed in Table 6. Martis et al. [7]
performed automated diagnosis of three classes using a K-nearest neighbour
classifier, applying higher order spectra methods to 641 $\eta_{sr}$, 855
$\alpha_{fib}$ and 887 $\alpha_{fl}$ ECG sets. In another study, Martis et al.
[8] employed the discrete cosine transform combined with independent component
analysis (ICA) on the ECG sets. Wang et al. [9] used a short-duration
multi-fractal method with a fuzzy Kohonen network classifier to characterize
three classes: $\alpha_{fib}$, $\alpha_{fl}$ and VT.
Acharya et al. [10] proposed a CAD system to differentiate among four ECG
classes, using a decision tree classifier for the analysis of ECG beats. In
addition, Desai et al. [11] used a recurrence quantification method for the
analysis of multiclass tachycardia beats with an ensemble classifier. Fahim et
al. [12] used a data mining approach with expectation-maximization based
clustering on fifty-compressed ECG signals. The authors reduced the number of
features with a correlation-based feature subset method and classified with a
rule-based system.
In the recent work of Acharya et al. [13], a convolutional neural network (CNN)
was used to classify different classes of arrhythmia from ECG segments of
different durations without any feature selection. The CNN used in that work is
11 layers deep. They obtained an accuracy of 92.50% for two-second ECG segments
and 94.90% for five-second ECG segments. Although they achieved significant
results, a large dataset is required for training and the training takes more
time.
From [7–13], we observed that most authors have used three classes for the
diagnosis of arrhythmia. Our paper focuses on four-class ($\alpha_{fib}$,
$\alpha_{fl}$, $\nu_{fib}$ and $\eta_{sr}$) classification using a biorthogonal
wavelet filter bank followed by feature selection. We extract the following
features: norm ($N_1$), energy ($N_2$), sample entropy ($E_s$), fuzzy entropy
($E_f$) and Renyi entropy ($E_r$). We achieved a maximum accuracy of 98.00% for
the five-second dataset and 96.30% for the two-second dataset, which is better
than the recent work of Acharya et al. [13].
Biorthogonal wavelet filter banks (BWFBs) are computationally fast, which makes
them a major tool for the classification of electrocardiogram and
electroencephalogram signals. Various techniques are used to design wavelet
filter banks. Sharma et al. [14] used an eigenfilter-based technique for the
design of joint duration-bandwidth localized BWFBs. This method is numerically
efficient and optimizes localization in the duration and bandwidth domains
simultaneously. In addition, a parametrization technique can be used to design
joint duration-bandwidth localized discrete-time BWFBs, as illustrated by Sharma
et al. [15]. This technique is employed where the perfect reconstruction (PR)
and vanishing moment (VM) conditions need to be imposed structurally. It keeps
the number of parameters constant regardless of the filter length. The technique
has been found to perform efficiently in image coding applications [15].
In this paper, we use a joint duration-bandwidth localized (JDBWL)
discrete-time biorthogonal wavelet filter bank (FB), designed using the
parametrization technique, for the classification of arrhythmia. The technique
has the advantages of both linear phase and joint duration-bandwidth
localization. It uses an optimality criterion that takes into consideration the
JDBWL of all the basis sequences of the analysis and synthesis FBs. Initially,
parametrized linear-phase PR two-channel filter banks are designed with
prescribed vanishing moments (VMs), degrees of freedom and lengths; the
independent variables are then optimized to obtain wavelet FBs that are
optimized in duration and bandwidth.
The salient features of the parametrization technique used are as follows:

– We have obtained parametric expressions for the analysis and synthesis
filters of the 13/7 FB. Because the free parameters are unrestricted, the
parametrized filter coefficients can be optimized.
– For a predefined number of decomposition levels, the JDBWL of all
discrete-time wavelet bases is included in the optimization criterion.
– The proposed technique for designing optimized JDBWL FBs is found to perform
better for the detection of arrhythmia than other methods.

2. Dataset Used

The test dataset in our work is obtained from PhysioBank and includes the
following databases: MIT-BIH atrial fibrillation (afdb), MIT-BIH arrhythmia
(mitdb) and Creighton University ventricular tachyarrhythmia (cudb) (Table 1).
Only lead II ECG signals are used. The signals are segmented into two durations,
two seconds and five seconds, which yield 21,709 (Dataset B) and 8,683
(Dataset A) ECG segments, respectively (Table 2).

Databank                                                  Dataset
MIT-BIH arrhythmia (mitdb)                                $\alpha_{fib}$, $\alpha_{fl}$, $\eta_{sr}$
MIT-BIH atrial fibrillation (afdb)                        $\alpha_{fib}$, $\alpha_{fl}$
Creighton University ventricular tachyarrhythmia (cudb)   $\nu_{fib}$

Table 1: Databases used in the study

Groups            Number of samples        Number of samples
                  2 seconds (Dataset B)    5 seconds (Dataset A)
$\alpha_{fib}$    18,804                   7,521
$\alpha_{fl}$     1,840                    736
$\nu_{fib}$       162                      65
$\eta_{sr}$       902                      361
Total samples     21,709                   8,683

Table 2: ECG dataset overview (2 seconds and 5 seconds)

3. Methodology

[Figure 1 flow chart: ECG data collection and preprocessing → design of JDBWL
filter banks → wavelet decomposition → feature extraction → classification
using LS-SVM → classes separation and model validation]

Figure 1: Flow chart showing the methodology for automatic determination of different classes
of ECG signals

3.1. Pre-processing
The ECG signals from the MIT-BIH arrhythmia database have a sampling frequency
of 360 Hz, whereas the signals from the MIT-BIH atrial fibrillation and
Creighton University ventricular tachyarrhythmia databases have a sampling
frequency of 250 Hz. Since the sampling frequencies of the databases differ, we
have down-sampled the 360 Hz signals to 250 Hz.
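The rate conversion described above can be sketched in Python. This is a minimal illustration that assumes SciPy's polyphase resampler as the method; the paper does not state which resampling routine was used, and the function name `resample_360_to_250` is hypothetical. Since 250/360 = 25/36, the signal is upsampled by 25 and downsampled by 36.

```python
import numpy as np
from scipy.signal import resample_poly

def resample_360_to_250(ecg):
    """Resample a 360 Hz ECG record to 250 Hz (ratio 250/360 = 25/36).

    resample_poly applies an anti-aliasing FIR filter internally, so no
    separate low-pass stage is needed before the rate change.
    """
    ecg = np.asarray(ecg, dtype=float)
    return resample_poly(ecg, up=25, down=36)

# Example: a 2 s record at 360 Hz (720 samples) becomes 500 samples at 250 Hz.
t = np.arange(720) / 360.0
demo = np.sin(2 * np.pi * 5.0 * t)          # synthetic 5 Hz tone, not real ECG
demo_250 = resample_360_to_250(demo)
print(len(demo), len(demo_250))
```

After this step, all three databases share the 250 Hz rate, so a two-second segment always holds 500 samples and a five-second segment 1250 samples.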

[Figure 2 plots: four panels (Normal ECG, Atrial Fibrillation ECG, Atrial
Flutter ECG, Ventricular Fibrillation ECG); x-axis: Seconds (0 to 2); y-axis:
Normalized Amplitude]

Figure 2: Two seconds ECG signals for different groups

[Figure 3 plots: four panels (Normal ECG, Atrial Fibrillation ECG, Atrial
Flutter ECG, Ventricular Fibrillation ECG); x-axis: Seconds (0 to 5); y-axis:
Normalized Amplitude]

Figure 3: Five seconds ECG signals for different groups

3.2. Design of Biorthogonal Wavelet Filter Bank


The proposed method follows this basic procedure. Initially, the number of
independent parameters, the VMs and the lengths of the analysis filter (AF) and
synthesis filter (SF) are fixed. The analysis low-pass filter (ALF) is then
defined as a symmetric polynomial in the z-domain. This polynomial is
represented as the product of two factors: a binomial polynomial with roots at
z = −1 and a polynomial with free parameters. The latter can be regarded as the
free polynomial, since it contains the independent parameters. The synthesis
low-pass filter (SLF) is defined similarly: it is also a symmetric polynomial,
expressed as the product of a binomial part and a remainder polynomial. The
coefficients of the remainder polynomial are then obtained in terms of the
independent parameters. Using the half-band condition of PR, we derive linear
equations stating that the coefficients of the even powers of z must be zero,
except for the constant term. These equations are solved so that all filter
coefficients are expressed in terms of the independent parameters. Finally, the
independent variables are optimized to obtain FBs that yield duration-bandwidth
localized wavelet filter banks (WFBs), which are then used in the respective
applications [15].
Let the ALF and SLF be denoted by $G_0(z)$ and $B_0(z)$, and let the analysis
high-pass filter and synthesis high-pass filter be denoted by $G_1(z)$ and
$B_1(z)$, respectively. The product filter (PF) $Q(z)$ is defined as:

$$Q(z) = z^{l} G_0(z) B_0(z) \tag{1}$$

where $l = 2n + 1$, $n = 0, 1, \ldots, N$ (integer).
As a result, the perfect reconstruction condition is expressed as [16]

$$Q(z) + Q(-z) = 2 \tag{2}$$

The product filter $Q(z)$ is a symmetric polynomial. Its expansion coefficients
for the even powers $z^{2n}$ ($n$ integer) are zero, except for the constant
term ($z^{0}$), whose coefficient is 1. Such a filter is known as a half-band
polynomial filter, so the design reduces to the design of the half-band filter
$Q(z)$. Let us define the Lagrange half-band polynomial (LHBP) for a given $n$:

$$Q_{4n-2}(z) = z^{n} (1 + z^{-1})^{2n} S_{2n-2}(z) \tag{3}$$

where

$$S_{2n-2}(z) = \sum_{m=0}^{n-1} \binom{n+m+1}{n} (2 - z - z^{-1})^{n} \tag{4}$$

In the above equations, the order of the polynomial is $4n - 2$.
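The half-band property behind Eq. (2) can be checked numerically. The sketch below is only an illustration: it uses the well-known LeGall 5/3 low-pass pair as a stand-in, not the 13/7 pair designed in this paper, and the helper names `product_filter` and `is_half_band` are hypothetical. With the delay $z^{l}$ absorbed by centring the filter, Eq. (2) requires the centre tap of the product filter to equal 1 and every other tap at an even offset from the centre to vanish.

```python
import numpy as np

def product_filter(g0, b0):
    """Coefficients of G0(z) * B0(z), obtained by polynomial multiplication."""
    return np.convolve(g0, b0)

def is_half_band(q, tol=1e-12):
    """Check Q(z) + Q(-z) = 2 for an odd-length symmetric product filter.

    The centre tap must be 1 and all other taps at even offsets from the
    centre must be zero; taps at odd offsets cancel in Q(z) + Q(-z).
    """
    q = np.asarray(q, dtype=float)
    c = len(q) // 2
    even = [k for k in range(len(q)) if (k - c) % 2 == 0 and k != c]
    return abs(q[c] - 1.0) < tol and all(abs(q[k]) < tol for k in even)

# Stand-in example (assumption): LeGall 5/3 low-pass pair.
g0 = np.array([-1.0, 2.0, 6.0, 2.0, -1.0]) / 8.0
b0 = np.array([1.0, 2.0, 1.0]) / 2.0
q = product_filter(g0, b0)
print(q * 16)              # taps of the product filter, scaled by 16
print(is_half_band(q))
```

The same check could be applied to any candidate pair produced by the parametrization, once its coefficients are scaled to the normalization of Eq. (2).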


To introduce degrees of freedom, we define a modified LHBP with degrees of
freedom $2f$:

$$\tilde{Q}_{4n-2}(z) = z^{(n-f)} (1 + z^{-1})^{2(n-f)} A_{2f}(z) B_{2n-2}(z) \tag{5}$$

$$\tilde{Q}_{4n-2}(z) = z^{(n-f)} (1 + z^{-1})^{2(n-f)} \tilde{S}_{2n+2f-2}(z) \tag{6}$$

where $\tilde{S}_{2n+2f-2}(z) = A_{2f}(z) B_{2n-2}(z)$, with

$$A_{2f}(z) = 1 + \sum_{k=1}^{f} j_k (z^{k} + z^{-k}) \quad \text{and} \quad B_{2n-2}(z) = 1 + \sum_{k=1}^{n-1} l_k (z^{k} + z^{-k})$$
In our method we have used $n = 5$ and $f = 2$; therefore the modified LHBP can
be expressed as:

$$\tilde{Q}_{18}(z) = z^{3} (1 + z^{-1})^{6} \tilde{S}_{12}(z) \tag{7}$$

where $\tilde{Q}_{18}(z)$ is the modified LHBP of order 18 and

$$\tilde{S}_{12}(z) = A_4(z) B_8(z) \tag{8}$$

where $A_4(z) = 1 + \sum_{k=1}^{2} j_k (z^{k} + z^{-k})$ and
$B_8(z) = 1 + \sum_{k=1}^{4} l_k (z^{k} + z^{-k})$.
1. Construction of parametric FBs with two degrees of freedom
   Taking two degrees of freedom removes four VMs from the LHBP, so six VMs
   remain in the modified LHBP. Therefore, by assigning two and four VMs to the
   analysis and synthesis low-pass filters, a filter pair of lengths 13 and 7
   can be constructed:

$$\tilde{G}_0(x) = (1 + x)(x^{2} + \gamma x + \delta) \tag{9}$$

$$\tilde{B}_0(x) = (1 + x)^{2} (x^{4} + q_1 x^{3} + q_2 x^{2} + q_3 x + q_4) \tag{10}$$

   where $x = \frac{z + z^{-1}}{2}$ and $\gamma$ and $\delta$ are free parameters.

2. Duration-bandwidth localization and discrete wavelet bases
   A signal cannot be arbitrarily localized in time and frequency at the same
   time, so there is an uncertainty in analysing a signal in the time and
   frequency domains simultaneously. Ishii and Furukawa [17] gave a
   variance-based duration-bandwidth localization measure for sequences in
   $l^{2}(\mathbb{Z})$. Let $b(m)$ be a real-valued discrete-time sequence in
   $l^{2}(\mathbb{Z})$ with $\|b\|_2^2 = 1$, and let $B(\omega)$ be its
   discrete-time Fourier transform. The mean $m_0$ and the time variance
   $\varsigma_m^2$ of the sequence are expressed as:

$$m_0 = \sum_{m \in \mathbb{Z}} m \, |b(m)|^{2} \tag{11}$$

$$\varsigma_m^2 = \sum_{m \in \mathbb{Z}} (m - m_0)^{2} \, |b(m)|^{2} \tag{12}$$

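   The frequency mean $\omega_0$ and frequency variance $\varsigma_\omega^2$ of
   a low-pass sequence $b(m)$ are given by:

$$\omega_0 = 0, \qquad \varsigma_\omega^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} (\omega - \omega_0)^{2} \, |B(\omega)|^{2} \, d\omega \tag{13}$$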

   Let the product of the time and frequency variances of the sequence $b(m)$,
   the time-frequency product (TFP), be denoted by $\Psi_b$. The lower bound of
   the TFP is set by the uncertainty principle, which can be expressed as:

$$\Psi_b = \varsigma_m^2 \, \varsigma_\omega^2 \geq \frac{(1 - |B(\pi)|)^{2}}{4} \tag{14}$$
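   The frequency mean $\omega_0$ and frequency variance $\varsigma_\omega^2$ of
   a band-pass sequence $b(m)$ are as follows [18]:

$$\omega_0 = \frac{1}{\pi} \int_{0}^{\pi} \omega \, |B(\omega)|^{2} \, d\omega, \qquad \varsigma_\omega^2 = \frac{1}{\pi} \int_{0}^{\pi} (\omega - \omega_0)^{2} \, |B(\omega)|^{2} \, d\omega \tag{15}$$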

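   From equations (12) and (15), the lower bound on the TFP of a band-pass
   sequence follows from inequality (14) as [18]:

$$\Psi_b = \varsigma_m^2 \, \varsigma_\omega^2 \geq \frac{(1 - \eta)^{2}}{4} \tag{16}$$

   where $\eta = \frac{\omega_0}{\pi} |B(0)|^{2} + \left(1 - \frac{\omega_0}{\pi}\right) |B(\pi)|^{2}$.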

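   The frequency mean $\omega_0$ and frequency variance $\varsigma_\omega^2$ of
   a high-pass sequence $b(m)$ are as follows [18]:

$$\omega_0 = \pi, \qquad \varsigma_\omega^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} (\omega - \omega_0)^{2} \, |B(\omega)|^{2} \, d\omega \tag{17}$$

   Hence, using equation (17) and inequality (14), the TFP bound for a high-pass
   sequence is obtained as [18]:

$$\Psi_b = \varsigma_m^2 \, \varsigma_\omega^2 \geq \frac{(1 - |B(0)|)^{2}}{4} \tag{18}$$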
   Interestingly, as for continuous-time functions, the time-frequency product
   of low-pass, high-pass and band-pass sequences in $l^{2}(\mathbb{Z})$ is also
   lower bounded. This is due to the nulls of the spectrum of the sequence at
   $\omega = \pi$ and $\omega = 0$. The synthesis and analysis high-pass filters
   should have one zero at $\omega = 0$. For a band-pass discrete wavelet
   sequence, the generated iterated filters should have at least one zero at
   $\omega = \pi$ and another at $\omega = 0$. Hence, the TFP of all the
   sequences and filters is bounded by 0.25, that is:

$$\Psi_b = \varsigma_m^2 \, \varsigma_\omega^2 \geq \frac{1}{4} \tag{19}$$

[Figure 4 diagram: (A) analysis filter bank with input $Y(z)$; (B) synthesis
filter bank with output $\hat{Y}(z)$]

Figure 4: Tree Structured FBs
   For the two-channel biorthogonal filter bank, we use $i$ iterations to
   decompose a sequence into $i + 1$ sub-bands. For $i = 4$ iterations, Figure 4
   illustrates the tree structure of the analysis and synthesis FBs. The tree
   structure of the iterated FB can be redrawn as an equivalent
   parallel-structured filter bank (Figure 5) having $i + 1$ parallel branches
   (filters) using the noble identities [19]. In Figure 4 the lower-most branch
   is the high-pass filter, the upper-most branch is the low-pass filter and the
   other branches represent the band-pass filters. We minimize the TFP of all
   the filters (low-pass, high-pass and band-pass). The $i + 1$ analysis filters
   of the parallel-structured FB are expressed as:

$$B_1^{1}(z) = B_1(z) \tag{20}$$

$$B_1^{j}(z) = B_1(z^{2^{j-1}}) \prod_{k=0}^{j-2} B_0(z^{2^{k}}), \quad j = 2, 3, \ldots, i \tag{21}$$

$$B_0^{i}(z) = \prod_{k=0}^{i-1} B_0(z^{2^{k}}) \tag{22}$$

   and the $i + 1$ synthesis filters are expressed as:

$$G_1^{1}(z) = G_1(z) \tag{23}$$

$$G_1^{j}(z) = G_1(z^{2^{j-1}}) \prod_{k=0}^{j-2} G_0(z^{2^{k}}), \quad j = 2, 3, \ldots, i \tag{24}$$

$$G_0^{i}(z) = \prod_{k=0}^{i-1} G_0(z^{2^{k}}) \tag{25}$$

[Figure 5 diagram: (A) analysis filter bank: input $Y(z)$ feeds the parallel
filters $B_0^{4}(z)$, $B_1^{4}(z)$, $B_1^{3}(z)$, $B_1^{2}(z)$ and $B_1^{1}(z)$,
followed by downsampling by 16, 16, 8, 4 and 2; (B) synthesis filter bank:
upsampling by 16, 16, 8, 4 and 2 followed by $G_0^{4}(z)$, $G_1^{4}(z)$,
$G_1^{3}(z)$, $G_1^{2}(z)$ and $G_1^{1}(z)$, giving the output $\hat{Y}(z)$]

Figure 5: Parallel Structured FBs

   Let $b_1^{j}(m)$, $j = 1, \ldots, i$, and $b_0^{i}(m)$ denote the time-domain
   parallel filters of the parallel-structured analysis FB (AFB), which can be
   regarded as wavelet vectors or sequences. These vectors are known as the
   discrete-time analysis wavelet basis, denoted by $\Delta$. Similarly, we
   denote the parallel filters of the parallel-structured synthesis FB (SFB) by
   $g_1^{j}(m)$, $j = 1, \ldots, i$, and $g_0^{i}(m)$. These vectors are known as
   the discrete-time synthesis wavelet basis (WB), denoted by $\tilde{\Delta}$.
   Therefore, the uncertainty principle can be defined for the discrete-time
   analysis and synthesis bases. For a basis $\mu = \{b_1, b_2, b_3, \ldots, b_M\}$
   that contains $M$ discrete wavelet vectors, the TFP is defined as:

$$\Psi_\mu = \sum_{k=1}^{M} w_k \Psi_{b_k} \tag{26}$$

   where $w_k$ are weights such that $0 \leq w_k \leq 1$ and
   $\sum_{k=1}^{M} w_k = 1$, so $\Psi_\mu$ is the weighted average of the
   time-frequency products of the wavelet vectors. The uncertainty relation for
   $\mu$ is:

$$\Psi_\mu = \sum_{k=1}^{M} w_k \Psi_{b_k} \geq 0.25 \tag{27}$$

   Therefore, the TFP of the discrete-time analysis and synthesis WBs is bounded
   below by 0.25, the same bound as in continuous time.
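The parallel filters of Eqs. (20) to (22) and the time-frequency product of a low-pass sequence (Eqs. (11) to (13)) can be sketched as follows. This is a minimal illustration under stated assumptions: the short filters `b0` and `b1` are toy stand-ins rather than the optimized 13/7 pair, the function names are hypothetical, and the frequency integral is approximated by an average over a uniform grid. Only the low-pass case ($\omega_0 = 0$) is implemented; band-pass and high-pass sequences would need the frequency mean of Eq. (15) or Eq. (17).

```python
import numpy as np

def upsample(h, factor):
    """Insert factor - 1 zeros between taps, i.e. H(z) -> H(z**factor)."""
    h = np.asarray(h, dtype=float)
    out = np.zeros((len(h) - 1) * factor + 1)
    out[::factor] = h
    return out

def parallel_filters(lowpass, highpass, levels):
    """Equivalent parallel filters of an iterated two-channel bank.

    Returns [B1^1, B1^2, ..., B1^levels, B0^levels] as coefficient arrays.
    """
    filters = []
    cascade = np.array([1.0])                 # running product of B0(z^(2^k))
    for j in range(1, levels + 1):
        step = 2 ** (j - 1)
        filters.append(np.convolve(cascade, upsample(highpass, step)))
        cascade = np.convolve(cascade, upsample(lowpass, step))
    filters.append(cascade)
    return filters

def tfp_lowpass(b, n_grid=8192):
    """Time-frequency product of a low-pass sequence (omega_0 = 0)."""
    b = np.asarray(b, dtype=float)
    b = b / np.linalg.norm(b)                 # enforce unit energy
    m = np.arange(len(b))
    p = b ** 2
    m0 = np.sum(m * p)                        # time mean, Eq. (11)
    var_t = np.sum((m - m0) ** 2 * p)         # time variance, Eq. (12)
    w = -np.pi + 2 * np.pi * np.arange(n_grid) / n_grid
    B = np.exp(-1j * np.outer(w, m)) @ b      # DTFT sampled on the grid
    var_w = np.mean(w ** 2 * np.abs(B) ** 2)  # grid average ~ (1/2pi) integral
    return var_t * var_w

# Toy filters (assumption): zero at omega = pi for b0, zero at omega = 0 for b1.
b0 = np.array([1.0, 2.0, 1.0]) / 4.0
b1 = np.array([1.0, -2.0, 1.0]) / 4.0
bank = parallel_filters(b0, b1, levels=4)
print([len(f) for f in bank])
print(tfp_lowpass(b0), tfp_lowpass(bank[-1]))
```

Both low-pass sequences have a spectral null at $\omega = \pi$, so their products are expected to stay above the 0.25 bound of Eq. (19).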
3. Optimization method
   To obtain the JDBWL bases, the following objective function is minimized:

$$P = r \Psi_{\Delta} + (1 - r) \Psi_{\tilde{\Delta}}, \quad r \in [0, 1] \tag{28}$$

   Here $r$ is a trade-off factor that controls the localization of the analysis
   and synthesis bases. In our work, the trade-off factor is chosen as $r = 0$,
   $0.5$ and $1$ to obtain duration-bandwidth optimized wavelet bases.
   (a) Jointly optimal analysis and synthesis wavelet bases:
       For $r = 0.5$, the objective function minimizes the duration-bandwidth
       product of the analysis and synthesis WBs collectively:

$$P = \Psi_{\Delta + \tilde{\Delta}} = \frac{1}{2} \left[ \Psi_{\Delta} + \Psi_{\tilde{\Delta}} \right] \tag{29}$$

   (b) Unconstrained optimization problem:
       In the parametric construction of the FBs, we impose the conditions of
       perfect reconstruction and vanishing moments. Hence, the design of
       duration-bandwidth localized wavelet bases for each case is defined as:

$$(b_0^{*}(m), g_0^{*}(m)) = \underset{(\beta_i)}{\operatorname{argmin}} \; P(\beta_i) \tag{30}$$

   The steps of the optimization process are as follows:
   i. The numbers of free parameters, the VMs and lengths of the filter pair,
      and the number of decomposition levels are fixed.
   ii. The expressions for $G_0(z)$ and $B_0(z)$ are obtained using the
       parametrization technique.
   iii. The range of the independent parameters over which the filter banks are
        regular is found. The cascade algorithm converges in this region.
   iv. Initial values of the free parameters are selected from this range.
   v. The objective function $P$ is optimized using the fmincon solver of the
      MATLAB Optimization Toolbox.
   vi. The value of $P$ is inspected to check whether it is close to the lower
       bound 0.25; otherwise, go back to step iv and repeat the process until
       $P$ is close to 0.25.
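Steps iv to vi above can be sketched in Python. This is a toy illustration, not the authors' procedure: it swaps MATLAB's fmincon for SciPy's bounded L-BFGS-B solver, and it optimizes a hypothetical one-parameter low-pass family, $(1 + z^{-1})^{2}(1 + c z^{-1} + z^{-2})$, instead of the parametrized 13/7 pair. The objective is the time-frequency product of that single filter, which corresponds to taking only one term of Eq. (28).

```python
import numpy as np
from scipy.optimize import minimize

def tfp_lowpass(b, n_grid=4096):
    """Time-frequency product of a low-pass sequence (omega_0 = 0)."""
    b = np.asarray(b, dtype=float)
    b = b / np.linalg.norm(b)
    m = np.arange(len(b))
    p = b ** 2
    m0 = np.sum(m * p)
    var_t = np.sum((m - m0) ** 2 * p)
    w = -np.pi + 2 * np.pi * np.arange(n_grid) / n_grid
    B = np.exp(-1j * np.outer(w, m)) @ b
    var_w = np.mean(w ** 2 * np.abs(B) ** 2)
    return var_t * var_w

def toy_lowpass(c):
    """Hypothetical symmetric family with a double zero at omega = pi."""
    return np.convolve([1.0, 2.0, 1.0], [1.0, c, 1.0])

def objective(params):
    # Stand-in for P in Eq. (28), evaluated for a single basis only.
    return tfp_lowpass(toy_lowpass(params[0]))

x0 = np.array([0.5])                          # step iv: initial free parameter
result = minimize(objective, x0, method="L-BFGS-B", bounds=[(0.0, 4.0)])
print(result.x[0], result.fun)                # step vi: compare with 0.25
```

In the actual design, the objective would combine the products of all analysis and synthesis parallel filters, and the bounds would be the regularity range found in step iii.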

Index   f0                         h0
0        0.586549684753843          0.575285420160546
1        0.260676013170069          0.300071355040137
2       -0.0683046054662545        -0.0376427100802732
3       -0.00979969727517512       -0.0500713550401366
4        0.0238641104514918         –
5       -0.000876315894894261       –
6        0.00116565263784130        –

f0: Analysis Filter Coefficients; h0: Synthesis Filter Coefficients

Table 3: Filter coefficients of the Analysis and Synthesis Filter Bank.

3.3. LS-SVM Classifier


In the present work, we observed the maximum accuracy with least squares
support vector machines (LS-SVMs). SVMs, proposed by Vapnik [20], are supervised
machine learning algorithms for classification and pattern identification and
are widely used in data mining applications such as nonlinear estimation,
function estimation, density estimation and classification. The SVM approach
identifies the optimum hyperplane by maximizing the distance between the
classes. The SVM algorithm follows statistical learning theory and the
Vapnik-Chervonenkis dimension [21]. The major disadvantage of the SVM is its
high computational complexity. LS-SVMs are the least squares version of SVMs: a
set of related supervised learning methods that analyse data and recognise
patterns, used for classification and regression analysis. In this version, the
solution is found by solving a set of linear equations instead of the convex
quadratic programming (QP) problem of the classical SVM.
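The point that an LS-SVM is trained by solving a linear system can be illustrated with a small sketch. It follows the standard binary LS-SVM classifier formulation with an RBF kernel; the kernel choice, the values of `gamma` and `sigma`, the function names and the synthetic two-cluster data are all assumptions made for illustration. The paper's four-class problem would additionally need a multi-class scheme such as one-vs-one, which is not shown here.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM linear system for the bias b and multipliers alpha."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma     # regularized kernel block
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)             # one linear solve, no QP
    return sol[0], sol[1:]

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    """Sign of the kernel expansion sum_i alpha_i y_i K(x, x_i) + b."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Synthetic two-class data (assumption): two well separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.3, (20, 2)), rng.normal(2.0, 0.3, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])
b, alpha = lssvm_train(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
print(np.mean(pred == y))
```

The training cost is dominated by one dense solve of size $(n + 1) \times (n + 1)$, which is the trade made against the sparse solution of the classical SVM.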

4. Features used

4.1. Norm ($N$)
The $L_p$ norm of $y^{(\beta)}$ is defined as:

$$\|y^{(\beta)}\|_p = \|N_\beta^{p}\| = (N_\beta^{p})^{1/p} = \left( \frac{1}{M} \sum_{i=1}^{M} |y_i^{(\beta)}|^{p} \right)^{1/p} \tag{31}$$

where $p \geq 1$; the norm has the same dimensions as the corresponding signal
$y^{(\beta)}$. The norm combines two trends: a strong increase caused by the
power $p$ and a decrease caused by the power $1/p$. Thus the norm can be
generalised to represent the norms from the minimum to the maximum, which
correspond to the orders $p = -\infty$ and $p = \infty$, respectively. This
includes the absolute mean ($p = 1$) and the rms value ($p = 2$) as special
cases. The norm can also be obtained as the norm of the norms of individual
samples.
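As a small worked example of Eq. (31), the sketch below computes the normalized $L_p$ norm of a subband. The function name `lp_norm_feature` is hypothetical, and the choice of $p$ used in the paper is not stated, so the values of `p` here are only examples.

```python
import numpy as np

def lp_norm_feature(y, p=2.0):
    """Normalized L_p norm: ((1/M) * sum |y_i|^p) ** (1/p)."""
    y = np.asarray(y, dtype=float)
    return np.mean(np.abs(y) ** p) ** (1.0 / p)

subband = [3.0, -4.0]                       # toy coefficients, not ECG data
print(lp_norm_feature(subband, p=1))        # absolute mean
print(lp_norm_feature(subband, p=2))        # rms value
```

For this toy input the absolute mean is 3.5 and the rms value is the square root of 12.5, which shows how larger $p$ gives more weight to the larger coefficient.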

4.2. Sample Entropy ($E_s$)
Let $x = (x_1, x_2, \ldots, x_N)$ be a real-valued discrete-time signal of
length $N$. For each index $t$, a vector of $n$ consecutive values is defined as
$V_t^{n} = \{x_t, x_{t+1}, \ldots, x_{t+n-2}, x_{t+n-1}\}$, where
$t = 1, 2, \ldots, N - (n - 1)$ and $n$ is the embedding dimension, i.e. the
number of samples in each vector. The distance $L(V_{t_1}, V_{t_2})$ between two
vectors is the absolute value of the difference between $V_{t_1}$ and
$V_{t_2}$. Two vectors match if $L(V_{t_1}, V_{t_2})$ is less than a predefined
tolerance $s$, and $P^{n}(s)$ denotes the probability that two $n$-dimensional
vectors match [22]. The sample entropy is therefore defined as:

$$E_s(x, n, s) = -\ln\left( \frac{P^{n+1}(s)}{P^{n}(s)} \right)$$
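The definition above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions: the distance between two vectors is taken as the maximum absolute difference of their elements (a common convention, not spelled out in the text), the tolerance `s` is an absolute value, self-matches are excluded, and the same number of templates is used for both dimensions. The function name and the test signals are hypothetical.

```python
import numpy as np

def sample_entropy(x, n=2, s=0.2):
    """Sample entropy -ln(A / B) with embedding dimension n and tolerance s."""
    x = np.asarray(x, dtype=float)
    N = len(x)

    def count_matches(dim):
        # N - n templates for both dimensions, so the counts are comparable.
        V = np.array([x[t:t + dim] for t in range(N - n)])
        d = np.max(np.abs(V[:, None, :] - V[None, :, :]), axis=2)
        return np.sum(d < s) - len(V)         # drop the self-matches

    B = count_matches(n)
    A = count_matches(n + 1)
    if A == 0 or B == 0:
        return np.inf                         # no matches: entropy undefined
    return -np.log(A / B)

rng = np.random.default_rng(0)
noise = rng.standard_normal(300)              # irregular signal
sine = np.sin(2 * np.pi * np.arange(300) / 25.0)   # regular signal
print(sample_entropy(sine), sample_entropy(noise))
```

A regular signal such as the sine yields a lower value than white noise, which is the property that makes the measure useful as a discriminating feature.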

4.3. Fuzzy Entropy ($E_f$)

Let $x = (x_1, x_2, \ldots, x_N)$ be a time series with embedding dimension $n$
and tolerance $s$. The vectors
$V_t^{n} = (x_t, x_{t+1}, \ldots, x_{t+n-1}) - x0_t$ are constructed, where
$x0_t = \frac{1}{n} \sum_{j=0}^{n-1} x_{t+j}$. The successive distance (SD)
between $V_{t_1}^{n}$ and $V_{t_2}^{n}$ is

$$SD_{t_1 t_2} = SD[V_{t_1}^{n}, V_{t_2}^{n}] = \max\{ |V_{t_1+k}^{n} - V_{t_2+k}^{n}| : 0 \leq k \leq n - 1 \text{ and } t_1 \neq t_2 \}$$

For the fuzzy entropy of power $m$ and tolerance $s$, the degree of similarity
$C_{t_1 t_2}$ is obtained from the fuzzy function

$$C_{t_1 t_2} = \mu(SD_{t_1 t_2}, m, s) = \exp\left( -(SD_{t_1 t_2})^{m} / s \right)$$

The function $\psi^{n}$ is defined as [23]

$$\psi^{n}(x, m, s) = \frac{1}{N - n} \sum_{t_1 = 1}^{N - n} \frac{1}{N - n - 1} \sum_{t_2 = 1, t_2 \neq t_1}^{N - n} C_{t_1 t_2}$$

so the fuzzy entropy $E_f$ is:

$$E_f(x, n, m, s) = -\ln\left( \frac{\psi^{n+1}}{\psi^{n}} \right)$$
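A compact sketch of the fuzzy entropy follows. It is an illustration under stated assumptions: each vector has its own mean removed, the distance is the maximum absolute element-wise difference, the similarity uses a negative exponent, $\exp(-d^{m}/s)$, and the same $N - n$ vectors are used for both dimensions. The function name, default parameters and test signals are hypothetical.

```python
import numpy as np

def fuzzy_entropy(x, n=2, m=2, s=0.2):
    """Fuzzy entropy -ln(psi^(n+1) / psi^n) with exponential similarity."""
    x = np.asarray(x, dtype=float)
    N = len(x)

    def psi(dim):
        V = np.array([x[t:t + dim] for t in range(N - n)])
        V = V - V.mean(axis=1, keepdims=True)     # remove the local baseline
        d = np.max(np.abs(V[:, None, :] - V[None, :, :]), axis=2)
        sim = np.exp(-(d ** m) / s)               # degree of similarity
        np.fill_diagonal(sim, 0.0)                # exclude t1 == t2
        K = len(V)
        return sim.sum() / (K * (K - 1))

    return -np.log(psi(n + 1) / psi(n))

rng = np.random.default_rng(0)
noise = rng.standard_normal(300)
sine = np.sin(2 * np.pi * np.arange(300) / 25.0)
print(fuzzy_entropy(sine), fuzzy_entropy(noise))
```

Unlike the hard threshold of sample entropy, every pair of vectors contributes here with a graded weight, so the estimate changes smoothly with the tolerance.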

4.4. Renyi Entropy ($E_r$)

Let us assume that the events $x = (x_1, x_2, \ldots, x_N)$ have the
probabilities $(p_1, p_2, \ldots, p_N)$. Each event carries $I_k$ bits of
information, so the total amount of information is
$I(P) = \sum_{k=1}^{N} p_k I_k$. This expression is also known as Shannon's
entropy. For an arbitrary function $f$ with inverse $f^{-1}$, the total amount
of information becomes $I(P) = f^{-1}\left( \sum_{k=1}^{N} p_k f(I_k) \right)$.
This leads to:

$$I_\beta(P) = \frac{1}{1 - \beta} \log\left( \sum_{k=1}^{N} p_k^{\beta} \right)$$

where $\beta$ is non-negative and different from unity [24]. This yields a
parametric family of information measures known as Renyi entropies.
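The formula for $I_\beta(P)$ can be evaluated directly for a probability vector. The sketch below is illustrative: the function name is hypothetical, the natural logarithm is assumed, and how the probabilities are estimated from a subband is not specified in the text, so the inputs here are plain example distributions.

```python
import numpy as np

def renyi_entropy(p, beta=2.0):
    """Renyi entropy of order beta for a discrete probability vector p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    p = p / p.sum()                               # renormalize defensively
    if np.isclose(beta, 1.0):
        return -np.sum(p * np.log(p))             # Shannon limit as beta -> 1
    return np.log(np.sum(p ** beta)) / (1.0 - beta)

uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.7, 0.1, 0.1, 0.1]
print(renyi_entropy(uniform), renyi_entropy(peaked))
```

For the uniform distribution the result equals $\ln 4$ for every order $\beta$, and any less uniform distribution gives a smaller value.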

5. Results

The confusion matrices for the five-second and two-second datasets are shown in
Table 4 and Table 5, respectively. From Table 4, it is seen that 99.31% of the
five-second ECG segments of $\alpha_{fib}$ have been classified correctly,
whereas about 12% of the ECG segments of $\alpha_{fl}$ are wrongly classified as
$\alpha_{fib}$ and $\eta_{sr}$. Similarly, more than 12% of the $\nu_{fib}$
signals are wrongly classified as $\alpha_{fib}$.
From Table 5, it is observed that 97.98% of the ECG segments of $\alpha_{fib}$
have been classified correctly. However, more than 18% of the $\nu_{fib}$
signals are wrongly classified as $\alpha_{fib}$, $\alpha_{fl}$ and
$\eta_{sr}$.
From Tables 4 and 5, the overall classification is achieved with an accuracy,
sensitivity and PPV of 98.00%, 91.20% and 94.78% for dataset A and 96.30%,
88.61% and 89.48% for dataset B, respectively.
Figures 6 and 7 show the ROC curves for the classification. The area under the
curve (AUC) obtained is 0.99 and 0.98 for the five-second and two-second
datasets, respectively.
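The per-class figures can be recomputed from a confusion matrix as sketched below, using the counts of Table 4 (rows are the original classes, columns the predicted classes). The function name is hypothetical, and the sketch uses the usual definitions: sensitivity is the diagonal count divided by the row total, PPV is the diagonal count divided by the column total, and overall accuracy is the trace divided by the total count.

```python
import numpy as np

# Table 4 counts; class order: eta_sr, alpha_fib, alpha_fl, nu_fib.
cm = np.array([
    [335,   23,   3,  0],
    [ 12, 7469,  35,  5],
    [  8,   79, 649,  0],
    [  0,   10,   0, 55],
])

def per_class_metrics(cm):
    """Overall accuracy plus per-class sensitivity and PPV."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    sensitivity = tp / cm.sum(axis=1)     # correct / all originals of a class
    ppv = tp / cm.sum(axis=0)             # correct / all predictions of a class
    accuracy = tp.sum() / cm.sum()
    return accuracy, sensitivity, ppv

accuracy, sensitivity, ppv = per_class_metrics(cm)
print(round(100 * accuracy, 2))
print(np.round(100 * sensitivity, 2))
print(np.round(100 * ppv, 2))
```

The printed values can be compared with the Acc, Sen and PPV columns of Table 4; the row totals also reproduce the five-second class sizes of Table 2.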

Table 4: Confusion matrix for five seconds

Original/Predicted   $\eta_{sr}$   $\alpha_{fib}$   $\alpha_{fl}$   $\nu_{fib}$   Acc(%)   Sen(%)   PPV(%)
$\eta_{sr}$          335           23               3               0             98.00    92.79    94.37
$\alpha_{fib}$       12            7469             35              5             98.00    99.31    98.59
$\alpha_{fl}$        8             79               649             0             98.00    88.18    94.47
$\nu_{fib}$          0             10               0               55            98.00    84.61    91.67

Acc = Accuracy, PPV = Positive Predictive Value, Sen = Sensitivity

Table 5: Confusion matrix for two seconds

Original/Predicted   $\eta_{sr}$   $\alpha_{fib}$   $\alpha_{fl}$   $\nu_{fib}$   Acc(%)   Sen(%)   PPV(%)
$\eta_{sr}$          811           81               10              0             96.30    89.91    78.81
$\alpha_{fib}$       202           18417            176             9             96.30    97.98    98.03
$\alpha_{fl}$        16            265              1557            2             96.30    84.61    89.07
$\nu_{fib}$          6             23               5               127           96.30    81.93    92.02

[Figure 6 plot: ROC curve; x-axis: False positive rate; y-axis: True positive
rate; AUC = 0.99]

Figure 6: ROC curve for five seconds dataset

[Figure 7 plot: ROC curve; x-axis: False positive rate; y-axis: True positive
rate; AUC = 0.98]

Figure 7: ROC curve for two seconds dataset

6. Discussion

The first observation from the results is that the performance on the
five-second and two-second datasets is comparable. The five-second data
performed slightly better than the two-second data, as each five-second segment
contains three additional seconds of signal.
The number of $\nu_{fib}$ samples in this study is very small (161 and 65
ECG segments) compared with the other classes. This is the main reason for the
noticeably lower sensitivity and PPV for this class.
The accuracy obtained is comparable to that of the studies in Table 6. We
achieved accuracies of 98.00% and 96.30% for datasets A and B, respectively,
using a total of 8,683 ECG segments (361 $\eta_{sr}$, 7,521 $\alpha_{fib}$, 736
$\alpha_{fl}$ and 65 $\nu_{fib}$) for the five-second dataset and a total of
21,709 ECG segments (902 $\eta_{sr}$, 18,804 $\alpha_{fib}$, 1,840
$\alpha_{fl}$ and 162 $\nu_{fib}$) for the two-second dataset. We used norm
($N_1$), energy ($N_2$), Renyi entropy ($E_r$), fuzzy entropy ($E_f$) and
sample entropy ($E_s$) as discriminating features. With the 13/7 biorthogonal
wavelet filter bank, the performance of our method is robust.

Table 6: Performance comparison of our work with previous works in the field of detection of
arrhythmia

Authors, year                Databank             ECG classes                                               Classifier                     Performance

Three class
Wang et al., 2001 [9]        mitdb                $\alpha_{fib}$, $\nu_{fib}$, $\nu_{tac}$                  Fuzzy Kohonen network          $\alpha_{fib}$: A = 99.50%, SN = 98.30%, SP = 100.30%
                                                                                                                                           $\nu_{fib}$: A = 97.20%, SN = 98.30%, SP = 96.70%
                                                                                                                                           $\nu_{tac}$: A = 97.80%, SN = 95.00%, SP = 99.20%
Martis et al., 2013 [7]      afdb, mitdb          $\alpha_{fib}$, $\alpha_{fl}$, $\eta_{sr}$                K-nearest neighbor             A = 99.50%, SN = 100.00%, SP = 99.22%
Martis et al., 2014 [8]      afdb, mitdb          $\alpha_{fib}$, $\alpha_{fl}$, $\eta_{sr}$                KNN                            A = 99.45%, SN = 99.61%, SP = 100.00%

Four class
Fahim et al., 2011 [12]      mit-bih physiobank   $\alpha_{fib}$, premature atrial beat,                    Rule-based                     A = 97.00% (average)
                                                  premature ventricular contraction, $\nu_{fib}$
Acharya et al., 2016 [10]    afdb, cudb, mitdb    $\alpha_{fib}$, $\alpha_{fl}$, $\nu_{fib}$, $\eta_{sr}$   Decision tree                  A = 96.30%, SN = 99.30%, SP = 84.10%
Desai et al., 2016 [11]      afdb, cudb, mitdb    $\alpha_{fib}$, $\alpha_{fl}$, $\nu_{fib}$, $\eta_{sr}$   Rotation forest                A = 98.37%
Acharya et al., 2017 [13]    afdb, cudb, mitdb    $\alpha_{fib}$, $\alpha_{fl}$, $\nu_{fib}$, $\eta_{sr}$   Convolution neural network     Net B: A = 92.5%, SN = 98.09%, SP = 93.13%
                                                                                                                                           Net A: A = 94.90%, SN = 99.13%, SP = 81.44%
Present study                afdb, cudb, mitdb    $\alpha_{fib}$, $\alpha_{fl}$, $\nu_{fib}$, $\eta_{sr}$   LS-SVM                         Dataset A: A = 98.00%, SN = 91.20%, PPV = 94.72%
                                                                                                                                           Dataset B: A = 96.30%, SN = 88.61%, PPV = 89.48%

A = Accuracy, SN = Sensitivity, SP = Specificity, PPV = Positive predictive value,
$\alpha_{fib}$ = Atrial fibrillation, $\alpha_{fl}$ = Atrial flutter, $\nu_{fib}$ = Ventricular
fibrillation, $\nu_{tac}$ = Ventricular tachycardia, $\eta_{sr}$ = Normal sinus rhythm, afdb = MIT-BIH
atrial fibrillation, cudb = Creighton University ventricular tachyarrhythmia, mitdb = MIT-BIH arrhythmia

[Figure 8 box plots: panels (a) Norm, (b) Energy, (c) Fuzzy Entropy, (d) Renyi
Entropy and (e) Sample Entropy; each panel shows sub-bands SB 1 to SB 5 with
x-axis ticks 0 to 3]

Figure 8: Box plots for normal vs different types of arrhythmia for two seconds

20
[Box-plot panels, each with one plot per subband (SB 1 to SB 5) and horizontal-axis ticks 0 to 3: (f) Norm, (g) Energy, (h) Fuzzy Entropy, (i) Sample Entropy, (j) Renyi Entropy]

Figure 5: Box plots for normal vs different types of arrhythmia for five seconds
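
The sample entropy shown in panels (e) and (i) follows the definition of Richman and Moorman [22]. A minimal pure-Python sketch is given below. The embedding dimension m = 2 and the tolerance r = 0.2 times the standard deviation are common defaults assumed here, not values taken from this work, and `x` stands for the coefficients of one subband.

```python
import math


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy SampEn(m, r) of a sequence x.

    Assumptions (not from the paper): m = 2, r = r_factor * std(x),
    Chebyshev distance, and a match when the distance is <= r.
    """
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * std

    def count_matches(length):
        # The same n - m starting points are used for both template
        # lengths, and self-matches (i == j) are excluded.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                dist = max(abs(x[i + k] - x[j + k]) for k in range(length))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)       # matching template pairs of length m
    a = count_matches(m + 1)   # matching template pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined; reported here as infinity
    return -math.log(a / b)


# A perfectly periodic toy sequence is fully predictable.
periodic = [0, 1] * 20
print(sample_entropy(periodic))
```

Lower values indicate a more regular sequence, so a periodic input yields a value of zero while an irregular one yields a larger value.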

7. Conclusion

The diagnosis of arrhythmia is generally achieved through ECG signals. Because
the elderly population affected by arrhythmia is large, a computer-aided
diagnosis system is needed for accurate and automated detection. In our work,
we have used a 13/7 parametrized biorthogonal wavelet filter bank, followed by
feature extraction and feature ranking, to classify different classes of
arrhythmias ($\alpha_{fib}$, $\alpha_{fl}$, $\nu_{fib}$ and $\eta_{sr}$). We
have used 8,683 ECG segments of dataset A and 21,709 ECG segments of dataset B.
With our method, we obtained an accuracy, sensitivity and positive predictive
value (PPV) of 98.00%, 91.20% and 94.72%, respectively, for dataset A, and of
96.30%, 88.61% and 89.48%, respectively, for dataset B. The robustness of our
method could be improved by using more samples from the databases in every
class.
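
The three reported measures can be derived from a confusion matrix. The sketch below assumes a one-vs-rest treatment of each class and macro-averaging over classes; the averaging actually used in this work is not restated here, the function names are hypothetical, and the counts are made-up values rather than results of this work.

```python
def per_class_metrics(cm):
    """One-vs-rest accuracy, sensitivity and PPV for every class.

    cm[i][j] is the number of segments of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[row][c] for row in range(k)) - tp
        tn = total - tp - fn - fp
        metrics.append({
            "accuracy": (tp + tn) / total,
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "ppv": tp / (tp + fp) if tp + fp else 0.0,
        })
    return metrics


def macro_average(metrics, key):
    """Unweighted mean of one measure over all classes (an assumption)."""
    return sum(m[key] for m in metrics) / len(metrics)


# Hypothetical 4-class confusion matrix (illustrative counts only).
example_cm = [
    [50, 2, 1, 0],
    [3, 40, 2, 0],
    [1, 4, 30, 0],
    [0, 0, 1, 20],
]
example_metrics = per_class_metrics(example_cm)
for name in ("accuracy", "sensitivity", "ppv"):
    print(name, round(macro_average(example_metrics, name), 4))
```

With one-vs-rest counting, the true negatives of the other classes raise the per-class accuracy, which is one way an accuracy figure can sit well above the sensitivity and PPV.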

References

[1] United Nations, World population ageing 2015, New York: Department of Economic and Social Affairs, Population Division.
[2] P. Anversa, T. Palackal, E. H. Sonnenblick, G. Olivetti, L. G. Meggs, J. M. Capasso, Myocyte cell loss and myocyte cellular hyperplasia in the hypertrophied aging rat heart, Circulation Research 67 (4) (1990) 871–885.
[3] R. Acharya, S. M. Krishnan, J. A. Spaan, J. S. Suri, Advances in cardiac signal processing, Springer, 2007.
[4] K. Najarian, R. Splinter, Biomedical signal and image processing, CRC Press, 2012.
[5] A. L. Goldberger, Clinical Electrocardiography E-Book: A Simplified Approach, Elsevier Health Sciences, 2012.
[6] R. J. Martis, U. R. Acharya, H. Adeli, Current methods in electrocardiogram characterization, Computers in Biology and Medicine 48 (2014) 133–149.
[7] R. J. Martis, U. R. Acharya, H. Prasad, C. K. Chua, C. M. Lim, J. S. Suri, Application of higher order statistics for atrial arrhythmia classification, Biomedical Signal Processing and Control 8 (6) (2013) 888–900.
[8] R. J. Martis, U. R. Acharya, H. Adeli, H. Prasad, J. H. Tan, K. C. Chua, C. L. Too, S. W. J. Yeo, L. Tong, Computer aided diagnosis of atrial arrhythmia using dimensionality reduction methods on transform domain representation, Biomedical Signal Processing and Control 13 (2014) 295–305.
[9] Y. Wang, Y.-S. Zhu, N. V. Thakor, Y.-H. Xu, A short-time multifractal approach for arrhythmia detection based on fuzzy neural network, IEEE Transactions on Biomedical Engineering 48 (9) (2001) 989–995.
[10] U. R. Acharya, H. Fujita, M. Adam, O. S. Lih, T. J. Hong, V. K. Sudarshan, J. E. Koh, Automated characterization of arrhythmias using nonlinear features from tachycardia ECG beats, in: Systems, Man, and Cybernetics (SMC), 2016 IEEE International Conference on, IEEE, 2016, pp. 000533–000538.
[11] U. Desai, R. J. Martis, U. R. Acharya, C. G. Nayak, G. Seshikala, R. Shetty K, Diagnosis of multiclass tachycardia beats using recurrence quantification analysis and ensemble classifiers, Journal of Mechanics in Medicine and Biology 16 (01) (2016) 1640005.
[12] F. Sufi, I. Khalil, Diagnosis of cardiovascular abnormalities from compressed ECG: a data mining-based approach, IEEE Transactions on Information Technology in Biomedicine 15 (1) (2011) 33–39.
[13] U. R. Acharya, H. Fujita, O. S. Lih, Y. Hagiwara, J. H. Tan, M. Adam, Automated detection of arrhythmias using different intervals of tachycardia ECG segments with convolutional neural network, Information Sciences 405 (2017) 81–90.
[14] M. Sharma, V. M. Gadre, S. Porwal, An eigenfilter-based approach to the design of time-frequency localization optimized two-channel linear phase biorthogonal filter banks, Circuits, Systems, and Signal Processing 34 (3) (2015) 931–959.
[15] M. Sharma, P. Achuth, R. B. Pachori, V. M. Gadre, A parametrization technique to design joint time–frequency optimized discrete-time biorthogonal wavelet bases, Signal Processing 135 (2017) 107–120.
[16] G. Strang, T. Nguyen, Wavelets and filter banks, SIAM, 1996.
[17] R. Ishii, K. Furukawa, The uncertainty principle in discrete signals, IEEE Transactions on Circuits and Systems 33 (10) (1986) 1032–1034.
[18] M. Sharma, R. Kolte, P. Patwardhan, V. Gadre, Time-frequency localization optimized biorthogonal wavelets, in: Signal Processing and Communications (SPCOM), 2010 International Conference on, IEEE, 2010, pp. 1–5.
[19] M. Vetterli, J. Kovačević, V. K. Goyal, Foundations of signal processing, Cambridge University Press, 2014.
[20] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (3) (1995) 273–297.
[21] V. Learning, Guest editorial Vapnik-Chervonenkis (VC) learning theory and its applications, IEEE Transactions on Neural Networks 10 (1999) 985.
[22] J. S. Richman, J. R. Moorman, Physiological time-series analysis using approximate entropy and sample entropy, American Journal of Physiology-Heart and Circulatory Physiology 278 (6) (2000) H2039–H2049.
[23] W. Chen, Z. Wang, H. Xie, W. Yu, Characterization of surface EMG signal based on fuzzy entropy, IEEE Transactions on Neural Systems and Rehabilitation Engineering 15 (2) (2007) 266–272.
[24] J. C. Principe, Information theoretic learning: Renyi’s entropy and kernel perspectives, Springer Science & Business Media, 2010.
