A Novel Deep Arrhythmia-Diagnosis Network For Atrial Fibrillation Classification Using Electrocardiogram Signals

SPECIAL SECTION ON DEEP LEARNING
FOR COMPUTER-AIDED MEDICAL DIAGNOSIS
Received April 26, 2019, accepted May 11, 2019, date of publication May 24, 2019, date of current version June 21, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2918792
A Novel Deep Arrhythmia-Diagnosis Network

for Atrial Fibrillation Classification Using
Electrocardiogram Signals
HAO DANG 1,2 , MUYI SUN 1,2 , GUANHONG ZHANG1,2 , XINGQUN QI 1,2 ,
XIAOGUANG ZHOU1,2 , AND QING CHANG3,4

1 School of Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China
2 Engineering Research Center of Information Network, Ministry of Education, Beijing 100876, China
3 Shanghai General Practice Medical Education and Research Center, Shanghai 201800, China
4 Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai 201800, China
Corresponding authors: Xiaoguang Zhou (zxg@bupt.edu.cn) and Qing Chang (robie0510@hotmail.com)

This work was supported in part by the Open Foundation of State Key Laboratory of Networking and Switching Technology, Beijing
University of Posts and Telecommunications, under Grant SKLNST-2018-1-18, and in part by the Foundation of Jiading Health Science
under Grant 2018-KY-04.
ABSTRACT Atrial fibrillation (AF), a common abnormal heartbeat rhythm, is a life-threatening recurrent
disease that affects older adults. Automatic classification is one of the most valuable topics in medical
sciences and bioinformatics, especially the detection of atrial fibrillation. However, it is difficult to accurately
explain the local characteristics of electrocardiogram (ECG) signals by manual analysis, due to their small
amplitude and short duration, coupled with the complexity and non-linearity. Hence, in this paper, we propose
a novel deep arrhythmia-diagnosis method, named deep CNN-BLSTM network model, to automatically
detect the AF heartbeats using the ECG signals. The model mainly consists of four convolution layers: two
BLSTM layers and two fully connected layers. The datasets of RR intervals (called set A) and heartbeat
sequences (P-QRS-T waves, called set B) are fed into the above-mentioned model. Most importantly, our
proposed approach achieved favorable performances with an accuracy of 99.94% and 98.63% in the training
and validation set of set A, respectively. In the testing set (unseen data sets), we obtained an accuracy
of 96.59%, a sensitivity of 99.93%, and a specificity of 97.03%. To the best of our knowledge, the algorithm
we proposed has shown excellent results compared to many state-of-art researches, which provides a new
solution for the AF automatic detection.
INDEX TERMS Convolution neural network, deep learning, bi-directional long short-term memory, atrial
fibrillation.
I. INTRODUCTION its incidence increases with age [1], [2]. AF is characterized

According to the report of World Health Statistic 2017 from by rapid and irregular ventricular contractions [3], which is
the World Health Organization (WHO), non-communicable divided into paroxysmal atrial fibrillation, persistent atrial
diseases accounted for 70% of the total deaths, and fibrillation, and permanent atrial fibrillation [4].
the most primary causes of death were cardiovascular The abnormal heartbeat rhythm is related to the high risk
disease (CVD) [1]. The number of patients died of cardio- of stroke, heart disease, and cardiovascular disease [2], [5].
vascular disease accounted for more than 40% of deaths In addition, there is also a close relationship between AF
from residents, ranking first comparing to cancer and other and obesity [6], hypertension mitral or tricuspid valvular
diseases [1]. Thereinto, atrial fibrillation (AF) is the most disorders, obstructive sleep apnea [7], hyperthyroidism [8],
common clinically sustained cardiac arrhythmia disorder, and alcoholism [9]–[11], which can not only lower the quality
with a prevalence of approximately 1% in the population, and of life, but also increase the risk of mortality. Accordingly,
AF is the most common arrhythmia disease, and it is neces-
The associate editor coordinating the review of this manuscript and sary to automatically detect AF for early diagnosis, manage-
approving it for publication was Shuihua Wang. ment, and prevention of related complications [12].
2169-3536
2019 IEEE. Translations and content mining are permitted for academic research only.
VOLUME 7, 2019 Personal use is also permitted, but republication/redistribution requires IEEE permission. 75577
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
H. Dang et al.: Novel Deep Arrhythmia-Diagnosis Network for AF Classification Using ECG Signals
At present, the diagnosis of AF mainly depends on the A. MACHINE LEARNING

presence of some typical symptoms (such as breathing dif- Since the last decades, the pattern recognition methods based
ficulties, chest pain, abnormal pulse, and so on.) [12] and the on machine learning has been widely researched and applied
characteristics of electrocardiogram (ECG). However, it is for AF detection and classification [17]–[21]. For instance,
difficult to accurately detect AF in the early stage, mainly Liu et al. [17] proposed a normalized fuzzy entropy, a novel
because sometimes there are no obvious symptoms when entropy measure suitable to enhance the performance of
AF occurs [13], and well-trained professional physicians entropy-based AF detectors using a short-term RR interval.
are required to accurately determine the feature information In their work, the detector was trained using the MIT-BIH
of ECG [12], [14]. Therefore, it is important and valuable atrial fibrillation databases (MIT-BIH AF), and tested on
to develop a fast and accurate algorithm of AF automatic the MIT-BIH normal sinus rhythm (MIT-BIH NSR) and
detection. MIT-BIH arrhythmia databases (MITAB), with the high-
In recent years, deep convolutional neural networks est area under receiver (AUC) values of 92.72%, 95.27%,
(DCNN) are widely applied to analyze images, such an image and 96.76% for 12−, 30−, and 60-beat window lengths
classification, objection detection, and image segmentation. respectively. Lake and Moorman et al. [18] developed a
With the development of DCNN, deep networks naturally new method of entropy estimation in short time series and
provide a new research idea for different research directions implemented them to detect AF. The proposed algorithm
due to different network structures, depth, and width, which has achieved a high degree of accuracy in distinguish-
can integrate low/mid/high-level features [15] and classi- ing AF from normal sinus rhythm in 12-beat. An average
fiers in an end-to-end multilayer model. Simultaneously, the sensitivity of 91% and specificity of 98% were obtained.
‘‘levels’’ of features can be enriched by the number of stacked Liu et al. [19] used an SVM-based method to detect the
layers [16]. However, conventional ECG recognition sys- four arrhythmias included AF, which achieve a moderate
tems often rely on well-trained professional physicians and performance with an accuracy of 86.23% and 87.71% on
also require complex feature extraction processes, increasing the training and validation set of PhysioNet/Computing in
the computational time to classify arrhythmias. This makes Cardiology Challenge 2017. Zhao et al. [20] also developed
them less reliable in the clinic or practice, as their accu- a new entropy measure, named EntropyAF , to detect AF,
racy and efficiency often vary greatly for larger databases, which was trained using the MIT-BIH AF, and tested on
which also is a motivation of CNN-BLSTM for ECG the clinical wearable long-term AF recordings. EntropyAF
classification. acquired a performance with an accuracy of 87.10%, a sensi-
In our study, a reliable arrhythmia-diagnosis network based tivity of 92.77%, and a specificity of 85.17%. Mei et al. [21]
on deep learning for the automatic detection of AF is pro- presented a method based on heart rate variability and spectral
posed. The network includes three main stages of data features to detect AF from short single lead ECG recording.
processing, feature extraction, and classification. Initially, For two classes classification problem (Normal and AF),
the peaks detection and data segmentation are required to accuracy varies from 92.0% to 96.0% under different addi-
obtain the input data, which is fed into the second stage. tional noise levels. Excellent accuracy and very high speci-
Secondly, the automatic feature extraction is implemented by ficity of this model made it a competitive algorithm for
four convolution layers and two BLSTM blocks. Then the preliminary screening.
ECG signal episodes are classified by two fully-connected Otherwise, we review the special issue about AF detec-
layers and a SoftMax layer. Ultimately, we quantitatively tion from Physiological Measurement journal [22]–[34].
investigate the performance of the proposed network and The review mainly summarizes the three parts included the
the function of different tricks (such as batch normaliza- source of data, features selection, and popular algorithm.
tion, different epoch, and so on.) by experiments. In general, The data used in [22]–[34] mainly includes three aspects:
the proposed method does not need the traditional hand- common public databases (such as MITAB, MIT-BIH AF,
crafted feature extraction methods, and shows state-of-the-art wrist-worn devices, and the data from PhysioNet/CinC Chal-
results. It provides a new idea for the classification of lenge dataset and so on.). The four chief features, such as
one-dimensional signals. AF features, RR interval features, morphology features, and
features of similarity index between beats, are selected to
detect AF signals.
II. RELATED WORK In general, some machine learning classifiers included
In this section, we discuss in detail the approaches that support vector machining (SVM), AdaBoosted Decision
closely related to this work: machine learning and deep Tree (ADT), Artificial Neural Network (ANN), Gradient
learning approaches, which are widely used to detect AF. Boost Decision Tree (GBDT), Coefficient of Sample Entropy
However, due to the limitations of space and capacity, we can- (CosEn), and Linear Discriminant Analysis (LDA) are espe-
not provide a complete and comprehensive research report, cially popular in the above researches. Those methods can
although we have collected a large number of state-of-the-art be divided into five steps: feature perception, data pre-
researches. processing, features extraction, feature selection, signals
75578 VOLUME 7, 2019

prediction, and classification. The classification of the Xu et al. [40] proposed a new model that combines modified
ECG signal has witnessed a quantum leap in the performance frequency slice wavelet transform (MFSWT) and convolu-
on benchmark datasets. tional neural network (CNN) for automatic AF beat detection.
Nevertheless, with the development of machine learn- The method was used on the MIT-BIH AF and achieved
ing, conventional models have several insurmountable draw- a mean accuracy of 81.07% from 5-fold cross-validation.
backs. First, most of the reported approaches in the literatures The corresponding sensitivity and specificity performances
for classifying the ECG signals solely rely on the traditional were 74.96% and 86.41%. Andersen et al. [41] developed an
methods of computer vision with machine learning, which end-to-end model combining the CNN and recurrent neural
separate feature extraction from classifier design, then com- network (RNN) to classify the AF and normal sinus rhythm
bine them when applied. Second, handcrafted extract features (NSR). It showed an accuracy, a sensitivity, and a specificity
methods are widely used in the traditional machine learning of 97.8%, 98.98%, and 96.95%, validated through 5-fold
pipelines for the ECG signals classification. And it mainly cross-validation. And it also achieved the results of accuracy
depends on a large number of expert priori knowledge and (87.40%), sensitivity (98.96%), and specificity (86.04%) on
sufficient capacity of biomedical signal processing. On the unseen data sets.
basis of this, it is also necessary to design a good classifier However, it remains a challenging work to build an effec-
algorithm, but is difficult to achieve the optimal effect. So, it is tive learning idea for ECG signals, due to itself specificity
important to acquire representative, dominant, and critical features of small datasets and one-dimensional signal. Mean-
features for a particular ECG signal and simultaneously while, it is also very difficult to improve the detection speed
design excellent and reliable classifier. In addition, it is and accuracy.
time-consuming and strenuous to analyze the ECG signals, In our paper, we propose a novel deep arrhythmia-
due to the effect of computer hardware. Moreover, because diagnosis method, namely deep CNN-BLSTM network,
of the unique characteristics of ECG signals: small to automatically detect AF from ECG signals. In summary,
amplitude (mV) and small duration (sec), it also is a challenge the contributions of this work are as follows:
to analyze the ECG signals. Hence, the interpretation of these (1) We propose an end-to-end deep CNN-BLSTM network
long duration of signals may lead to inter and intra-observer for the detection of AF signals due to its features of low
variabilities [35]. dimensionality and small memory, which replaces the hand-
crafted features with CNN and BLSTM.
B. DEEP LEARNING (2) We design multi-scale signals included RR intervals
Recent years, deep learning regarded convolution neural net- (named set A) and Heartbeat sequences (P-QRS-T waves,
works (CNN) as the core has made a series of breakthroughs named set B.) as the inputs of the network. This design can
for image classification, objection detection, and semantic be leveraged to extract multi-scale features, and then improve
segmentation. The deep learning has become a dramatical the generalization of the network model.
method to solve the research problem of computer vision, due (3) We present a detailed experimental study of ECG signal
to the advances in the CNN model, GPU hardware and the classification with significantly improved performance.
feature of large-scale data. (4) Last, we show that our proposed CNN-BLSTM net-
Therefore, it appears natural to leverage convolu- work achieves state-of-the-art results on datasets and carries
tional layer to extract features of ECG signals and out the full comparisons with other methods.
solve the problems of machine learning mentioned above.
Acharya et al. [36] implemented a CNN model to detect III. DESCRIPTION AND PRE-PROCESSING OF DATASET
a normal and myocardial infarction(MI) ECG beats, which In this section, on one hand, we mainly introduce the database
achieved an average accuracy of 93.53% and 95.22% based used in this paper, which includes the basic information
ECG signals. Al Rahhal et al. [37] proposed deep learn- and the chief characteristics of the AF signal. On the other
ing approach for active classification of ECG signals and hand, the pre-processing of dataset, such as peaks detection,
obtained significant accuracy improvements with less expert segmentation, and normalization, is described in this work.
interaction and faster online retraining compared to the
state-of-the-art methods in 2016. Acharya et al. [38] also A. DESCRIPTION OF DATASET
presented a CNN technique to automatically detect the dif- In this paper, the ECG signals are acquired from the
ferent ECG segments included the normal, atrial fibrilla- MIT-BIH Atrial Fibrillation databases (MIT-BIH AF) [42]
tion, atrial flutter, and ventricular fibrillation. They have from physio net. The database includes 25 long-term ECG
acquired an accuracy of 94.90%, a sensitivity of 99.13%, and records (mostly paroxysmal atrial fibrillation). Of these,
a specificity of 81.44% for five seconds of ECG duration. 23 records contain two ECG signals (in the.dat file). Each sig-
Pourbabaee et al. [39] developed a deep learning machine to nal record lasts for 10 hours with a sampling rate of 250 and
screen and identify patients with paroxysmal atrial fibrillation a resolution of 12 bits. A typical normal ECG signal of
(PAF). The extensive simulations indicated that combining a heartbeat is shown in Fig.1, which includes P-wave,
the learned features with other classifiers will significantly Q-wave, R-wave, S-wave, QRS-complex, T-wave, U-wave,
improve the performance of the patient screening system. PR-interval, PR- segment, ST-segment, and QT-segment.
VOLUME 7, 2019 75579

FIGURE 3. The detection of R peak. The black dots represent the position
of R peak of the ECG signal. The RR intervals is an important feature for
the detection of AF.
B. PRE-PROCESSING OF DATASET
In our work, the ECG signals from the public database are
FIGURE 1. A typical normal ECG signal of a heartbeat (Source: Ref. [43]). pre-processed as follows:
A normal and integrated heartbeat is comprised of P-wave, PR-interval,
Q-wave, R- wave, QRS-complex, S- wave, T- wave, ST-segment, (1) Peak detection. This stage is to determine the positions
QT-segment, and U- wave. Each waveform corresponds to the of all interested peaks in the ECG signals. The detection of
physiological process of cardiac excitation. The total
duration of a heartbeat is about 0.8s.
R-peak is essential for set A. And the detection of P-wave,
QRS-complex, and T-wave contribute to the construction of
set B. Fig.3 gives the detection of R peak.
(2) The setting of two datasets. In our paper, we divide
the AF database into two data sets according to different
input signals. According to the RR interval, the ECG sig-
nals are extracted base on the detection of R-peak, which is
called set A. On the other hand, the P-wave, QRS-complex,
and T-wave detected in step (1) are divided into a heartbeat
sequences centered around R-peak, which are called set B.
FIGURE 2. AF signal segments of different samples. It is observed that Fig.4 shows the samples of set A and set B.
these segments represent AF signal feature though the time periods are (3) The segmentation of ECG signals. On the basis of
different. The characteristic information of AF contained in each signal is
different.
above, the integrated RR intervals of a dataset are divided
into a mass of overlapping sequences included 100 samples
(or 100 values, not 100 beats, it is 99 samples overlap).
Otherwise, the database mainly includes four kinds of heart- A RR interval of set A only contains 100 values, which will be
beat annotations: atrial fibrillation, atrial flutter, AV junc- input the model. Similarly, an integrated heartbeat sequence
tional rhythm, and all other rhythms. Fig.2 shows AF signal (shown in Fig. 1) is from P onset to T offset (or U onset). The
segments from different time periods, such as 1s and 3s. heartbeat sequences of set B is divided into a segment, which
Atrial fibrillation (AF), a type of abnormal heart rhythm, contains 500 values centered around R-peak, which also will
is characterized by a rapid and irregular rhythm caused a be input the model. According to the label information of AF
series of threats to the human body. Specifically, in four in the annotation file(.atr), all the input data segments are
aspects: labeled. As long as a RR interval (set A) or one heartbeat
(1) P waves disappear, which replace with F-wave of dif- (set B) is labeled as AF, the label of this data segment is
ferent sizes and shapes; considered to be AF, and the labels of all other data segments
(2) The F-wave frequency is 350 − 600 beats/min, and its are normal signals without AF features.
size, shape, and amplitude are different. The P-R interval is (4) Z-score normalization. In order to address the prob-
not easy to determine; lem of non-standard data and improve the comparability of
(3) RR interval is absolutely uneven, and ventricular rate data, the new set A and set B are normalized by z-score
generally increases, but usually <160 beats / min; normalization.
(4) There is a normal QRS-complex when there is no differ-
ence in terms of intraventricular conduction, and when intra- IV. NETWORK ARCHITECTURE
ventricular differential conduction occurs, the QRS-complex In this part, we describe the network structure model pro-
is broadly deformed. posed in this study, which mainly includes four convolu-
In this paper, we aim at researching the classification of tion layers, two BLSTM layers, two full connection layers,
AF rhythms. So, we only consider whether the signal seg- and other supporting computing operations (pooling layers,
mentation is the AF or not. ReLU activate, batch normalization, dropout, and so on.).
75580 VOLUME 7, 2019

it is applied to analyze ECG signals, and its main function is

to carry on the process of features learning.
2) POOLING LAYER
In essence, the pooling layer is to retain maximum spatial
information while reducing the number of feature maps by
operating corresponding mathematical calculation, such as
max pooling or average pooling.
3) ACTIVATION FUNCTION
Activation function is used to introduce nonlinearity in the
values of the incoming layer, so that the neural network can
solve more complex problems, but also enhance the feature
expression and learning ability of the neural network model.
In this paper, the ReLU is regarded as the activation function
in convolutional layers, while the Sigmoid and tanh are used
as the activation function in BLSTM blocks.
4) FULLY CONNECTED LAYER

It plays a role of a classifier in the CNN, mapping the dis-
tributed feature representation learned from the convolutional
layer.
FIGURE 4. Set A: RR intervals (left) and set B: Heartbeat sequences 5) SOFTMAX LAYER
(right). In the MIT-BIT AF database, each dataset has a duration
of 10 hours, which include a number of RR intervals. The integrated RR
SoftMax function maps multiple scalars to a probability dis-
intervals of a dataset are divided into a mass of overlapping sequences tribution with each value output in the range (0,1), which is
included 100 samples (or 100 values, not 100 beats, it is 99 samples
overlap). The RR intervals of set A only contain 100 values, which will be
often used in the last layer of the neural network as the output
input the model. Similarly, each dataset also includes a mass of layer for multi-classification.
heartbeats (An integrated heartbeat is shown in Fig. 1.). In this paper, the input ECG signal samples are first fed
An integrated heartbeat is from P onset to T offset (or U onset). The
heartbeat sequences of set B only contain 500 values of P, Q, R, S, T, into 3 convolution layers and 3 max-pooling layers with a
which also will be input the model. stride of 1 and 2 respectively. The main function is to extract
effective features from input samples.
We first review the network model of CNN and bi-directional B. BI-DIRECTIONAL LONG SHORT-TERM MEMORY
long-term memory network (BLSTM), which are closely NETWORK
related to the model structure proposed in this paper. Then, As an important branch of deep learning, Recurrent Neural
the most important is put forward the research model of this Networks (RNN) [45] has always been the research focus.
paper, and describe the structure, parameters, and mathemat- RNN introduces the concept of loop, which is mainly aimed
ical expression of the model in detail. at solving problems related to time series signals (such as
continuous sound signals or continuous handwritten words.).
A. CONVOLUTIONAL NEURAL NETWORK RNN takes advantage of itself internal memory to process
Since the CNN initially was proposed in the 1980s by input sequences in any duration, which makes it easier to
Fukushima [44], it has been spectacular advances in deep deal with such problems as unsegmented handwriting recog-
learning. Generally, the CNN network model is processing of nition or speech recognition.
feature extraction and learning, which mainly includes input The huge advantage of RNN is to learn the information
layer, convolution layer, polling layer, activation function characteristics of different time effectively. However, it has
layer, fully connected layer or fully convolution layer, and appeared that actual initial information lost over time, which
SoftMax layer. is called a long-term dependence problem. So, the Long-
short-term-memory [46], [47] (LSTM) network arises at the
1) CONVOLUTION LAYER historic moment, which can learn characteristics of the previ-
The essence of convolution layer is to perform corresponding ous stages from input signals. Although the LSTM memory
mathematical calculation on vectors, which is affected by the unit can deal with uncertain long time series data, the tra-
shape of its input as well as the size of the kernel, stride, ditional model of LSTM often ignored the information in
and padding. It is applicable to any input type of included the future stage, only processing the data along the direc-
an image, a text or a sound slip. The chief function is to tion of forward. The future input information can also pro-
extract feature map on the given input signal. In this paper, vide insight about hidden states that can help prediction.
VOLUME 7, 2019 75581

FIGURE 5. The Bi–directional long short-term memory (BLSTM) network. The BLSTM allows the output units yi to
automatically learn a representation that depends on both the previous and the future information.
And it is sensitive to the input values around time t.
In recent years, a novel method based on the bidirectional composite equation (4) to (9).
long short-term memory (BLSTM) be proposed, which is uti-
ft = σ (wf [ht−1 , xt ] + bf ) (4)
lized to extract the bidirectional information from the forward
model (operating from t0 , t1 , . . . , to tn .) and backward model it = σ (wi [ht−1 , xt ] + bi ) (5)
(operating from tn , tn−1 , . . . , to t0 .) at the same time. jt = tanh(wc [ht−1 , xt ] + bj ) (6)
BLSTM is a recurrent neural network that has an input ot = σ (wo [ht−1 , xt ] + bk ) (7)
layer, two hidden layers, and an output layer. The hid-
ct = ft ∗ct−1 + it ∗ jt (8)
den layers are a memory block included three gate mecha-
nisms called input gate, forget gate, and output gate. These ht = tanh(c)t ∗ kt (9)
gate mechanisms control the amount of information and where ft , it , ot , and ct respectively represent the forget gate,
the extraction of features. The input gate determines what input gate, output gate, and cell state in long short-term
new information will be stored in the cell state, the forget memory (LSTM) network (Fig.6), jt is a tanh layer created
gate determines what information will be retained or dis- a new candidate value vector, which is added to the state. The
carded from the cell state. After updating the cell state, cell state is processed through the tanh (to get a value between
the output gate completes the output of network model. −1 and 1) and multiplied by the output of the sigmoid gate.
Recently, BLSTM model has been widely researched in Then we obtain the final output. The σ and tanh are a usual
the continuous time series signals fields such as speech element-wise sigmoid or tanh activation function. The sym-
recognition. bols of ‘‘⊕’’ and ‘‘⊗’’ denote the element-wise concatenated
In our paper, an important BLSTM block is leveraged and multiplied.
in the network architecture. As illustrated in Fig.5, a stan-
dard BLSTM block includes an input layer, a forward layer, C. THE PROPOSED ARCHITECTURE
a backward layer, and an output layer, which realizes the In this study, a novel deep CNN-BLSTM network
function processing the data in both directions of two sep- (Fig.7) model included four convolution layers and two
arate hidden layers and then merging to the same output BLSTM blocks is proposed to classify the AF signals.
layer. The learning of both directions is able to obtain a Generally, the CNN model has been widely implemented
more accurate prediction result. In Fig.5, a BLSTM block to analyze two-dimensional images signals in many litera-
computes the forward hidden layer Fi , the backward hidden tures. However, in our paper, the CNN network model is
layer Bi , and the output layer yi by updating the following applied to process the one-dimensional ECG signals. The
equations (1)-(3): proposed architecture is shown in Fig.7 and detailed param-
eters are listed in Table 1. The Fig.8 shows the operation
Fi = f (w1 xi + w2 Fi−1 + bi ) (1)
details between two BLSTM blocks, two fully-connected
Bi = f (w3 xi + w5 Bi1 + bj ) (2) layers, and SoftMax layer. Detailly, the deep CNN-BLSTM
yi = g(w4 Fi + w6 Bi−1 + bk ) (3) network is composed of four convolutional layers, two
BLSTM blocks, two fully-connected layers, a SoftMax layer,
where, f is the cell state vectors update function in the hidden and corresponding activate function, batch normalization,
layers of the BLSTM block, which is detailly defined by global max-pooling, and dropout layer. Most importantly, the
75582 VOLUME 7, 2019

FIGURE 6. The LSTM network. The σ and tanh represent an element-wise sigmoid and tanh activation function. The symbols of ‘‘⊕’’ and ‘‘⊗’’ denote the
element-wise concatenation and multiplication.
FIGURE 7. The proposed novel deep CNN-BLSTM network in this paper. FM represents feature map; MP denotes max-pooling. The operations of four
convolutional layers and two BLSTM blocks show the processing of feature learning. The operation of concatenate is a layer that connects multiple inputs
to one output that implements the joint of input data.
network architecture includes two different input signals. analysis of the performance metrics and compare with impor-
Set A is the overlapping sequences of RR intervals and set tant references in recent years.
B is the Heartbeat sequences (P-QRS-T waves).
A. IMPLEMENTATION DETAILS
V. EXPERIMENTAL RESULTS In the standard deep CNN-BLSTM network, the first three
In order to fully analyze and verify the effectiveness of the convolution layers (from stage one of Conv to stage three
model proposed in this paper, we train and validate the two of Conv.) are followed by the ReLU activate function, which
data sets of set A and set B described in the III section respec- increases the nonlinear representation ability of network
tively according to the design of the model, and then fully model. The activate function is sigmoid and tanh in the
comparing the different results by adjusting key parameters. BLSTM blocks. Then the last convolutional layer is used
We first describe the main initialization parameters, opti- to adjust the output from BLSTM block, which is bene-
mization function in the model training stage. Furthermore, ficial to the operation of global max-pooling. Meanwhile,
three evaluation metrics including accuracy, sensitivity, and it also can adequately learn features from the signal. Global
specificity are calculated. Finally, we make a comprehensive max-pooling is implemented between the BLSTM block
VOLUME 7, 2019 75583

TABLE 1. The detailed parameters of network architecture.
and the first fully-connected layer to compress the features

of the output sequences from BLSTM block. The fully-
connected layers are followed by batch normalization, ReLU,
and dropout. The network model is carried out by optimizing
the cross-entropy function with Adam optimizer [48], which
utilizes the TensorFlow [49] on a Titian Xp GPU with batch FIGURE 8. The details of two BLSTM blocks.
size 1024 for 100 epochs.
The most important calculation of CNN model is gradient
The cross-entropy loss function of two class problem is
backpropagation and weight update. The weights and biases
obtained according to equation (12).
are updated according to equation (10), (11).
L yt , yp =−(yt ∗ log yp +(1−yt )log(1−yp )yt ∈{0, 1} (12)

∂c

rλ where, yt represents the ground truth of each sample, yp is a
wl = 1 − w−r (10)
n ∂w probability predicted by the training model.
∂c Otherwise, the two datasets are divided into three phases
bl = bl−1 − r (11)
∂w of training (80%), validation (10%), and testing (10%). The
training set is used to train the model and determine the model
where w, r, λ,n and c represent the weight, learning rate, parameters. The validation set is used to tune parameters and
regularization parameter, the number of training samples and determine the optimal model. And the test set, which is not
the cost function respectively. been used before, is utilized to evaluate the performance of
According to the earlier experiments, the initial learning the trained model. The detailed results will be shown in part
rate and decay rate are set to 0.001 and 0.9, respectively. The of section B.
cross-entropy function is leveraged to evaluate the loss of the In addition, a 10-fold cross-validation strategy is employed
model. During the stage of fully-connected layer, the dropout in our study. The set A and set B are randomly divided
is adopted to reduce overfitting and improve the generaliza- into 10 equal segments after the generation of signal seg-
tion. The parameter of dropout is set to 0.2 taking into account ments.9/10 parts are used to training and 1/10 parts are used
the data difference and the number of neurons. to validation of the proposed model. The approach is repeated
75584 VOLUME 7, 2019

10 times by iterating all the data. Then, the evaluation metrics TABLE 2. The overall classification results for the AF class.
of accuracy, sensitivity, and specificity are shown in each
step. Finally, the overall evaluation metrics is achieved by
averaging the results of the 10 steps.
B. PERFORMANCE EVALUATION
In order to evaluate the performance of the proposed network
architecture, we assume the class containing AF signals as
the positive class, and therefore the accuracy, sensitivity, and
specificity are computed, respectively.
TP + TN
Accuracy = (13)
TP + FP + TN + FN
TP
Sensitivity = (14)
TP + FN
TN
Specitifity = (15)
TN + FP
TP means true positive, which is the number of AF signals
classified correctly;
FP means false positive, which is the number of AF signals
classified wrongly;
TN means true negative, which is the number of signals
without AF classified correctly;
FN means false negative, which is the number of signals
without AF classified wrongly.
In the experiment, in addition to a standard network model
(shown in Table 1), we set up three comparison experiment
(named 1, 2, 3.) through parameter modulation, which is
FIGURE 9. Training and validation accuracy with 100 epochs in set B.
aimed at verifying the superiority of the model. The dif-
ference between experiment 1 and the standard model is
that epoch is decreased to 80. And the BLSTM block in
the model is replaced with the conventional LSTM block in
experiment 2. And in experiment 3, the BN layer between the
last fully connected layer and output layer is removed, while
the BN layer also is removed among the two full-connected
layers. Fig.9 and Fig.10 show the accuracy rate and the loss
of training and validation set with 100 epochs when the input
is set A.
As can be seen from the Fig.9 and Fig.10, the performance
of deep CNN-BLSTM network on the training set is slightly
better than the validation set. Meanwhile, the change of loss
FIGURE 10. Training and validation loss with 100 epochs in set A.
illustrates that the model converges to a steady value and there
is no over-fitting phenomenon. The maximum accuracy of
training set is 99.92%, and maximum accuracy is 98.94% in
the validation set. AF is closely related to the feature information of RR interval.
When the performance of the model is measured in terms So, the result of set A is better than set B. Therefore, we aban-
of time costs, the average epoch last 192 seconds using don the data of set B in the following experiment. Otherwise,
a Titian Xp GPU during the training stage. In the testing compared with experiment 1 (80 epochs), Fig.13 shows the
stage, we cost 54 seconds to 104,858 signal records. The accuracy with 100 and 80 epochs in the training and valida-
average signal cost is about 0.51ms, which leads to real-time tion set. It can be seen from the Fig.13 that the change trend is
speed theoretically. The average accuracy, sensitivity, and almost same, but the accuracy variation trend is more stable
sensitivity for the testing set are given in Table2. Meanwhile, than that of 80 epochs during the stage of BLSTM network
set B is also used as input for training. Fig.11 and Fig.12 show model with 100 epochs. Meanwhile, Fig. 14 shows that the
the accuracy is worse than the result of set A, although the loss BLSTM model has better performance compared with LSTM
is also convergent in the end. The accuracy is less than 85% network under the condition of consistent parameters, which
in the training set and validation set. The main reason is that fully reflects the superiority of BLSTM model.
VOLUME 7, 2019 75585

FIGURE 11. Training and validation accuracy with 100 epochs in set A. FIGURE 14. The accuracy of BLSTM and LSTM model.
FIGURE 12. Training and validation loss with100 epochs in set B.

FIGURE 15. The accuracy with BN and without BN.
C. COMPARISON AND DISCUSSION

Table 3 summarizes the various techniques used by the
researchers to automatically detect AF using ECG signals.
It mainly includes research literatures, databases, meth-
ods, and three evaluation metrics of accuracy, sensitivity,
and specificity. The chief methods contain morphological
analysis, machine learning, and popular deep learning in
recent years. These studies have achieved good results.
Zhou et al. [10] used a novel recursive method to detect
FIGURE 13. The accuracy with 80 and100 epochs.
AF episodes, which achieved a test performance of 96.89%,
98.25%, and 97.67% with respect to the sensitivity, speci-
ficity, and overall accuracy. Although the two performances
Otherwise, the chief function of Batch Normalization (BN) of specificity and accuracy are slightly higher than the result
is to make the input of each layer maintain the same distri- of our method, Zhou [10] method involves several basic
bution in the process of deep neural network training, so as operations of nonlinear filters, symbolic dynamics, and the
to eventually alleviate the gradient disappearance/explosion calculation of Shannon entropy, which will consume a lot of
phenomenon and accelerate the training speed of the model. time with increasing computation. Petrėnas et al. [50] utilized
By comparing the result of experiment 3 (Fig.15), it can ectopic beat filtering and bigeminal suppression for detec-
be concluded that the operation of BN layer has certain tion of subclinical AF episodes. Asgari et al. [12] leveraged
effect, but the difference is relatively weak. In our experi- the stationary wavelet transform and support vector machine
ment, the change trend with BN (blue solid) and that trend to detect AF episodes. Henzel et al. [51] demonstrated the
without BN (yellow solid) in the training set is almost same effectiveness of machine learning based on generalized linear
in Fig.15. And the average accuracy of standard network model evaluated with ROC. These three methods (Proposed
model and experiment 3 differed by 0.3% in the valida- in [12], [50] and [51].) are same with the idea proposed
tion set. The main reason is that the network model for in [10], which belong to machine learning methods. Their
ECG signal research is relatively simple compared with the comprehensive performances are worse than the method we
image classification model, and the main role of BN is rel- proposed. Moreover, the main problem of machine learning
atively weak. Finally, all the experimental results are shown is that handcrafted extract features methods rely on a large
in Table 2. number of expert priori knowledge and sufficient capacity of
75586 VOLUME 7, 2019

TABLE 3. Summary of selected studies conducted for the automated detection of af heartbeats.
biomedical signal processing, which requires a lot of time and on the training set and validation set in this paper. Since the
computation. conventional feature extraction method usually takes most of
In the MIT-BIH AF, the [17] method achieved the the time, it increases the computation compared with the deep
sensitivity of 96.58%,96.71%, and 98.46% using 12-beat, learning idea.
30-beat, and 60-beat RR segments respectively, the speci- In view of the deep learning approaches are widely
ficity of 83.31%, 87.52%, and 89.85% for the three- employed to deal with AF, Acharya et al. [38] proposed a
time window types, the accuracy of 88.96%, 91.43%, multi-layer DCNN to automatically detect the AF segments.
93.51%. And the [18] method based-COSEn also acquired They achieved the results of accuracy (92.50%, 94.90%), sen-
the results of sensitivity (91%) and specificity (98%) for sitivity (98.09%, 99.13%), and specificity (93.13%, 81.44%)
MIT-BIH AF, respectively. Compared with [17] and [18], for two and five seconds of ECG segments. Compar-
our study employs 100 samples (or 100 values, not 100- ing with the result of our study, the method mentioned
beat) from RR intervals as the input signal, shows the above is highlighted deficiency. For a detailed analysis,
higher performances in MIT-BIH AF than [17] and [18] Andersen et al. [41] showed a method that combining the
methods. CNN and LSTM to classify the AF and normal sinus rhythm
Otherwise, we also evaluated the method based on (NSR). It showed an accuracy, sensitivity, and specificity
SVM proposed in [19] with respect to the new data from the of 97.8%, 98.98%, and 96.95%, validated through 5-fold
PhysioNet/Computing in Cardiology Challenge 2017. It is a cross-validation. And it also achieved the results of accuracy
study of our team in Cardiology Challenge 2017. The results (87.40%), sensitivity (98.96%), and specificity (86.04%) on
showed the best accuracy is 86.23% and 87.71% on the train- unseen data sets. This paper is very similar to our study.
ing set and validation set with respect to all the feature sets, Compared with the method we proposed, there are mainly
which ranked the fourth in AF detection. However, the perfor- four differences: the structure of model, the input of signal,
mances are worse than the accuracy of 99.94% and 98.63% the validation method, and the performance results.
VOLUME 7, 2019 75587

First, the model proposed in our study includes four con- of hospital. So, genuine clinical contributions need to be
volutional layers, two BLSTM blocks, two fully connected strengthened in the future study.
layers, and other tricks (such as pooling layers, ReLU acti- Another limitation is that in order to reduce the complexity
vate, batch normalization, global max-pooling, dropout, and of training, our input signals are only two points of RR peaks
so on. It is shown in Fig. 7). The different from [41] is between all the RR intervals, which do not involve the other
that three convolution layers followed max-pooling layer. signal values between the RR intervals. Hence, the possibility
And global max-pooling is leveraged between the second of further enhancement of the method using all the points of
BLSTM blocks and the fully connected layers to compress RR intervals to be investigated.
the features from the BLSTM blocks. Otherwise, correspond-
ing tricks of batch normalization and dropout are used to VI. CONCLUSION AND FUTURE WORK
alleviate the gradient disappearance/explosion phenomenon Automatic classification is the most valuable topics in medi-
and overfitting of the model. By contrast, our method is cal and bioinformatics, especially the detection of AF. In this
more complicated and comprehensive. The four convolu- paper, we have proposed a novel deep CNN-BLSTM network
tional layers and two BLSTM blocks can adequately learn to detect the AF signal from ECG records. The model inte-
features from the signal. Second, in [41], the raw ECG record- grates the feature extraction methods of CNN and BLSTM,
ings from the databases are converted to RRI sequences. which is composed of four CNN layers, two BLSTM blocks
The RRI sequences are fed into the model. In our study, and two fully connected layers. These CNN layers and
we design multi-scale signals included RR intervals (named BLSTM layers are used to extract features of ECG signals.
set A) and Heartbeat sequences (named set B.) as the inputs Otherwise, two datasets of RR intervals (named set A) and
of network. This demonstrates the generalization of the Heartbeat sequences (named set B.) mentioned in section III
model. are trained respectively. The performance of the set A is
In addition, we obtain 1,038,946 records by segmenting the largely higher than the set B. According to the analysis,
ECG signals. Each record includes 100 RR intervals. Total the main reason is that the signal of AF is closely related
RR intervals is 103,894,600 records. This amount of data is to the information of RR intervals. Therefore, we abandoned
much larger than [41]. This is the main reason why I design the data of set B in the following work. Finally, we have
more complex model. Respecting to a large amount of data achieved the accuracy of 99.94%, 98.63%, 96.59% in train-
and assessing the uncertainty of the performance, 10-fold ing, validation and testing set. Also, we obtained sensitiv-
cross validation is leveraged in model validation. ity and specificity of 99.93%, and 97.03% in the testing
Most importantly, our proposed approach achieved favor- set.
able performances in set A with an accuracy of 99.94%, The experimental results have confirmed that the method
98.63% in training and validation set, respectively. In unseen is effective for AF detection, especially in terms of accuracy,
data sets, we obtained an accuracy of 96.59%, a sensitivity sensitivity, and sensitivity. In addition, different from the
of 99.93%, and a specificity of 97.03%, which is 9.19%, traditional feature learning methods used in the biomedical
0.97%, and 10.99% higher compared with [41], respec- signal processing, the proposed deep-CNN-BLSTM network
tively. This also proves that our method has better effect and neither need the professional knowledge in the biomedi-
generalization. cal field nor the handcrafted feature extraction methods in
In general, compared with the traditional morphology and conventional machining learning. Otherwise, the network
machine learning methods, our proposed method does not accomplishes AF detection with little computation overhead,
consume a large amount of time and energy to selecting and which leads to real-time speed theoretically.
extracting features, all available information in the training Our future research will involve the classification of multi-
dataset is fed into the deep CNN-BLSTM network. Therefore, ple arrhythmia signals and extend to other public databases
it can extract the implicit characteristic information, so that it to assess the accuracy of the method and the generaliza-
can acquire state-of-art performances in an unseen dataset. tion ability of the model, as well as to optimize the learn-
The average signal record cost is about 0.51ms, which leads ing ability of neural networks and enhance the clinical
to real-time speed theoretically. contributions.
Meanwhile, we innovate the input signals of model and
network structure compared with the other deep learning REFERENCES
methods, and achieving a commendable effect on the training [1] S. T. Mathew, J. Patel, and S. Joseph, ‘‘Atrial fibrillation: Mechanistic
set, validation set, and testing set. insights and treatment options,’’ Eur. J. Internal Med., vol. 20, no. 7,
A limitation of the present work is that the pro- pp. 672–681, Nov. 2009.
[2] C. R. C. Wyndham, ‘‘Atrial fibrillation: The most common arrhythmia,’’
posed network is not evaluated on another public database Texas Heart Inst. J., vol. 27, no. 3, pp. 257–267, Feb. 2000.
except MIT-AF. Otherwise, although we carry out training, [3] V. Markides and R. J. Schilling, ‘‘Atrial fibrillation: Classification, patho-
validation, and testing and achieve excellent performance physiology, mechanisms and drug treatment,’’ Heart, vol. 89, no. 8,
metrics on the public clinical database of MIT-AF, and it also pp. 939–943, Aug. 2003.
[4] K. Harris, D. Edwards, and J. Mant, ‘‘How can we best detect atrial
provides a new solution of AF detection. the model is not been fibrillation?’’ J. Roy. College Phys. Edinburgh, vol. 42, no. 18, pp. S5–S22,
employed in the genuine clinical diagnosis and application Jan. 2012.
75588 VOLUME 7, 2019

[5] E. M. Hylek, A. S. Go, Y. Chang, N. G. Jensvold, L. E. Henault, [26] M. Athif, P. C. Yasawardene, and C. Daluwatte, ‘‘Detecting atrial fibril-
J. V. Selby, and D. E. Singer, ‘‘Effect of intensity of oral anticoagulation on lation from short single lead ECGs using statistical and morphological
stroke severity and mortality in atrial fibrillation,’’ New England J. Med., features,’’ Physiol. Meas., vol. 39, no. 6, Jun. 2018, Art. no. 064002.
vol. 349, no. 11, pp. 1019–1026, Sep. 2003. [27] T. Teijeiro, C. A. García, D. Castro, and P. Félix, ‘‘Abductive reasoning as a
[6] Y. Miyasaka, M. E. Barnes, B. J. Gersh, S. S. Cha, K. R. Bailey, basis to reproduce expert criteria in ECG atrial fibrillation identification,’’
W. P. Abhayaratna, J. B. Seward, and T. S. M. Tsang, ‘‘Secular trends in Physiol. Meas., vol. 39, no. 8, Aug. 2018, Art. no. 084006.
incidence of atrial fibrillation in Olmsted County, Minnesota, 1980 to 2000, [28] S. Parvaneh, J. Rubin, A. Rahman, B. Conroy, and S. Babaeizadeh, ‘‘Ana-
and implications on the projections for future prevalence,’’ Circulation, lyzing single-lead short ECG recordings using dense convolutional neural
vol. 114, no. 2, pp. 119–125, Jul. 2006. networks and feature-based post-processing to detect atrial fibrillation,’’
[7] A. S. Gami, G. Pressman, S. M. Caples, R. Kanagala, J. J. Gard, Physiol. Meas., vol. 39, no. 8, Aug. 2018, Art. no. 084003.
D. E. Davison, J. F. Malouf, N. M. Ammash, P. A. Friedman, and [29] J. Harju, A. Tarniceriu, J. Tarniceriu, A. Vehkaoja, A. Yli-Hankala, and
V. K. Somers, ‘‘Association of atrial fibrillation and obstructive sleep I. Korhonen, ‘‘Monitoring of heart rate and inter-beat intervals with
apnea,’’ Circulation, vol. 110, no. 4, pp. 364–367, Nov. 2004. wrist plethysmography in patients with atrial fibrillation,’’ Physiol. Meas.,
[8] M. Beed, ‘‘Bennett’s cardiac arrhythmias, practical notes on interpretation vol. 39, no. 6, Jun. 2018, Art. no. 065007.
and treatment,’’ Brit. J. Anaesthesia, vol. 112, no. 4, p. 775, Apr. 2014. [30] L. M. Eerikäinen, A. G. Bonomi, F. Schipper, L. R. C. Dekker,
[9] K. Mukamal, J. Tolstrup, J. Friberg, G. Jensen, and M. Grønbæk, ‘‘Alcohol R. Vullings, H. M. D. Morree, and R. M. Aarts, ‘‘Comparison between
consumption and risk of atrial fibrillation in men and women: The Copen- electrocardiogram- and photoplethysmogram-derived features for atrial
hagen City heart study,’’ Circulation, vol. 112, no. 12, pp. 1736–1742, fibrillation detection in free-living conditions,’’ Physiol. Meas., vol. 39,
Sep. 2005. no. 8, Aug. 2018, Art. no. 084001.
[10] X. Zhou, H. Ding, B. Ung, E. Pickwell-MacPherson, and Y. Zhang, ‘‘Auto- [31] I. Christov, V. Krasteva, I. Simova, T. Neycheva, and R. Schmid, ‘‘Ranking
matic online detection of atrial fibrillation based on symbolic dynamics of the most reliable beat morphology and heart rate variability features
and Shannon entropy,’’ Biomed. Eng. Online, vol. 13, no. 1, pp. 13–18, for the detection of atrial fibrillation in short single-lead ECG,’’ Physiol.
Dec. 2014. Meas., vol. 39, no. 9, Sep. 2018, Art. no. 094005.
[11] U. Desai, R. J. Martis, U. R. Acharys, C. G. Nayak, G. Seshikala, and [32] M. Rizwan, B. M. Whitaker, and D. V. Anderson, ‘‘AF detection from ECG
S. K. Ranjan, ‘‘Diagnosis of multiclass tachycardia beats using recurrence recordings using feature selection, sparse coding, and ensemble learning,’’
quantification analysis and ensemble classifiers,’’ J. Mech. Med. Biol., Physiol. Meas., vol. 39, no. 12, Dec. 2018, Art. no. 124007.
vol. 16, no. 1, pp. 314–335, Feb. 2016. [33] V. D. A. Corino, R. Laureanti, L. Ferranti, G. Scarpini, F. Lombardi, and
[12] S. Asgari, A. Mehrnia, and M. Moussavi, ‘‘Automatic detection of L. T. Mainardi, ‘‘Detection of atrial fibrillation episodes using a wristband
atrial fibrillation using stationary wavelet transform and support vector device,’’ Physiol. Meas., vol. 38, no. 5, p. 787, Apr. 2017.
machine,’’ Comput. Biol. Med., vol. 60, pp. 132–142, May 2015. [34] T. Conroy, J. H. Guzman, B. Hall, G. Tsouri, and J.-P. Couderc, ‘‘Detec-
[13] J. R. Mehall, R. M. Kohut, Jr., E. W. Schneeberger, W. H. Merrill, and tion of atrial fibrillation using an earlobe photoplethysmographic sensor,’’
R. K. Wolf, ‘‘Absence of correlation between symptoms and rhythm in Physiol. Meas., vol. 38, no. 10, p. 1906, Sep. 2017.
‘symptomatic’ atrial fibrillation,’’ Ann. Thoracic Surg., vol. 83, no. 6, [35] R. J. Martis, U. R. Acharya, and H. Adeli, ‘‘Current methods in electro-
pp. 2118–2121, Jun. 2007. cardiogram characterization,’’ Comput. Biol. Med., vol. 48, pp. 133–149,
May 2014.
[14] J. Mant, D. A. Fitzmaurice, F. D. Hobbs, S. Jowett, E. T. Murray, R. Holder,
M. Davies, and G. Y. Lip, ‘‘Accuracy of diagnosing atrial fibrillation on [36] U. R. Acharya, H. Fujita, S. L. Oh, Y. Hagiwara, J. H. Tan, and
electrocardiogram by primary care practitioners and interpretative diag- M. Adam, ‘‘Application of deep convolutional neural network for auto-
nostic software: Analysis of data from screening for atrial fibrillation in mated detection of myocardial infarction using ECG signals,’’ Inf. Sci.,
the elderly (SAFE) trial,’’ Brit. Med. J. (Clin. Res. Ed.), vol. 335, no. 7616, vols. 415–416, pp. 190–198, Nov. 2017.
pp. 380–382, Aug. 2007. [37] M. M. Al Rahhal, Y. Bazi, H. AlHichri, N. Alajlan, F. Melgani, and
R. R. Yager, ‘‘Deep learning approach for active classification of electro-
[15] M. D. Zeiler and R. Fergus, ‘‘Visualizing and understanding convolutional
cardiogram signals,’’ Inf. Sci., vol. 345, pp. 340–354, Jun. 2016.
networks,’’ presented at the ECCV, 2014.
[38] U. R. Acharya, H. Fujita, S. L. Oh, Y. Hagiwara, J. H. Tan, and
[16] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image
M. Adam, ‘‘Automated detection of arrhythmias using different intervals of
recognition,’’ presented at the CVPR, 2015.
tachycardia ECG segments with convolutional neural network,’’ Inf. Sci.,
[17] C. Liu, O. Julien, R. Erik, Q. Li, L. Zhao, N. Shamim, and C. D. Clifford, vol. 405, no. 1, pp. 81–90, Sep. 2017.
‘‘A comparison of entropy approaches for AF discrimination,’’ Physiol.
[39] B. Pourbabaee, M. J. Roshtkhari, and K. Khorasani, ‘‘Deep convolutional
Meas., vol. 39, no. 7, pp. 18–36, Jul. 2018.
neural networks and learning ECG features for screening paroxysmal atrial
[18] D. E. Lake and J. R. Moorman, ‘‘Accurate estimation of entropy in very fibrillation patients,’’ IEEE Trans. Syst., Man, Cybern. Syst., vol. 48, no. 12,
short physiological time series: The problem of atrial fibrillation detection pp. 2095–2104, Dec. 2018.
in implanted ventricular devices,’’ Amer. J. Physiol. Heart Circulat. Phys- [40] X. Xu, S. Wei, C. Ma, K. Luo, L. Zhang, and C. Liu, ‘‘Atrial fibrillation
iol., vol. 300, no. 1, pp. H319–H325, Jan. 2011. beat identification using the combination of modified frequency slice
[19] N. Liu, M. Sun, L. Wang, W. Zhou, H. Dang, and X. Zhou, ‘‘A support wavelet transform and convolutional neural networks,’’ J. Healthcare Eng.,
vector machine approach for AF classification from a short single-lead vol. 2018, Jul. 2018, Art. no. 2102918.
ECG recording,’’ Physiol. Meas., vol. 39, no. 6, Jun. 2018, Art. no. 064004. [41] R. S. Andersen, A. Peimankar, and S. Puthusserypady, ‘‘A deep learning
[20] L. Zhao, C. Liu, S. Wei, Q. Shen, F. Zhou, and J. Li, ‘‘A new entropy- approach for real-time detection of atrial fibrillation,’’ Expert Syst. Appl.,
based atrial fibrillation detection method for scanning wearable ECG vol. 115, pp. 465–473, Jan. 2019.
recordings,’’ Entropy, vol. 20, no. 12, p. 904, Nov. 2018. [42] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff,
[21] Z. Mei, X. Gu, H. Chen, and W. Chen, ‘‘Automatic atrial fibrillation detec- P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C. K. Peng, and
tion based on heart rate variability and spectral features,’’ IEEE Access, H. E. Stanley, ‘‘PhysioBank, PhysioToolkit, and PhysioNet: Components
vol. 6, pp. 53566–53575, Sep. 2018. of a new research resource for complex physiologic signals,’’ Circulation,
[22] A. Sološenko, A. Petrènas, B. Paliakaité, L. Sörnmo, and V. Marozas, vol. 101, no. 23, pp. 215–220, Jun. 2000.
‘‘Detection of atrial fibrillation using a wrist-worn device,’’ Physiol. Meas., [43] C. Pater, ‘‘Methodological considerations in the design of trials for safety
vol. 40, no. 2, Feb. 2019, Art. no. 025003. assessment of new drugs and chemical entities,’’ Current Control Trials
[23] M. Shao, G. Bin, S. Wu, G. Bin, J. Huang, and Z. Zhou, ‘‘Detection of atrial Cardiovascular Med., vol. 6, no. 1, pp. 1–13, Feb. 2005.
fibrillation from ECG recordings using decision tree ensemble with multi- [44] K. Fukushima, ‘‘Neocognitron: A self-organizing neural network model
level features,’’ Physiol. Meas., vol. 39, no. 9, Sep. 2018, Art. no. 094008. for a mechanism of pattern recognition unaffected by shift in position,’’
[24] V. Gliner and Y. Yaniv, ‘‘An SVM approach for identifying atrial fibrilla- Biol. Cybern., vol. 36, no. 4, pp. 193–202, Apr. 1980.
tion,’’ Physiol. Meas., vol. 39, no. 9, Sep. 2018, Art. no. 094007. [45] J. J. Hopfield, ‘‘Neural networks and physical systems with emergent
[25] N. Sadr, M. Jayawardhana, T. T. Pham, R. Tang, A. T. Balaei, and collective computational abilities,’’ Proc. Nat. Acad. Sci. USA, vol. 79,
P. De Chazal, ‘‘A low-complexity algorithm for detection of atrial fib- no. 8, pp. 2554–2558, Apr. 1982.
rillation using an ECG,’’ Physiol. Meas., vol. 39, no. 9, Sep. 2018, [46] S. Hochreiter and J. Schmidhuber, ‘‘Long short-term memory,’’ Neural
Art. no. 064003. Comput., vol. 9, no. 8, pp. 1735–1780, Dec. 1997.
VOLUME 7, 2019 75589

[47] F. A. Gers, J. Schmidhuber, and F. Cummins, ‘‘Learning to forget: XINGQUN QI is currently pursuing the master’s
Continual prediction with LSTM,’’ Neural Comput., vol. 12, no. 10, degree in control science and engineering from
pp. 2451–2471, Oct. 2000. the School of Automation, Beijing University of
[48] A. Radford, L. Metz, and S. Chintala, ‘‘Unsupervised representation learn- Posts and Telecommunications, Beijing, China.
ing with deep convolutional generative adversarial networks,’’ presented at His research interests include the deep learning
the ICLR, 2016. and computer vision.
[49] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro,
G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow,
A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur,
J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah,
M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker,
V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Watten-
berg, M. Wicke, Y. Yu, and X. Zheng, ‘‘TensorFlow: Large-scale machine
learning on heterogeneous systems,’’ 2016, arXiv:1603.04467. [Online].
Available: https://arxiv.org/abs/1603.04467
[50] A. Petrenas, V. Marozas, and L. Sörnmo, ‘‘Low-complexity detection
XIAOGUANG ZHOU received the M.S. degree
of atrial fibrillation in continuous long-term monitoring,’’ Comput. Biol.
Med., vol. 65, pp. 184–191, Oct. 2015.
from the Department of Precision Instrument,
[51] N. Henzel, J. Wróbel, and K. Horoba, ‘‘Atrial fibrillation episodes detec- Tsinghua University, in 1984, and the Ph.D. degree
tion based on classification of heart rate derived features,’’ in Proc. 24th in engineering from the Tokyo University of Agri-
Int. Conf. IEEE Mixed Design Integr. Circuits Syst. MIXDES, Jun. 2017, culture and Technology, Japan. He was a Visiting
pp. 571–576. Professor with the Tokyo University of Agriculture
and Technology, from 2001 to 2002, and a JSPS
Researcher with Tokyo University, from 2013 to
HAO DANG received the M.S. degree in pattern 2014. He is currently a Professor and a Doc-
recognition and intelligent system from the Henan toral Supervisor with the School of Automation,
University of Technology, Zhengzhou, China, Beijing University of Posts and Telecommunications. He is also the Director
in 2016. He is currently pursuing the Ph.D. degree of the Engineering Research Center of Information Networks, Ministry
in control science and engineering with the School of Education. He is the author of over ten books, over 100 articles, and
of Automation, Beijing University of Posts and over 16 inventions. His research interests include control theory and its
Telecommunications, Beijing, China. His research application in engineering, deep learning, computer vision, the Internet
interests include the pattern recognition, intelli- of Things and automated logistics systems, and mechatronics technology.
gent systems, and machine learning. He is also a permanent member of the Chinese Association of Automa-
tion/Manufacturing Technology Committee and the China Institute of Com-
munications/Equipment Manufacturing Technical Committee.
MUYI SUN received the B.S. degree in automa-

tion from the Beijing University of Posts and
Telecommunications, Beijing, China, in 2015,
where he is currently pursuing the Ph.D. degree in
control science and engineering with the School QING CHANG received the M.D. degree from the
of Automation. His research interests include the Department of Clinical Medicine, Natong Med-
pattern recognition and intelligent systems. ical College, in 2000, and the Ph.D. degree in
molecular cell biology from The University of
Tokyo, in 2013. He was a Postdoctoral Fellow
with the Shanghai Jiao Tong University School of
Medicine, from 2013 to 2015. He is currently an
GUANHONG ZHANG received the B.S. degree Associate Professor and a Master Supervisor with
in automation from the Beijing University of Posts the Shanghai University of Medicine & Health
and Telecommunications, Beijing, China, in 2015, Sciences. He is also an Attending Doctor and a
where he is currently pursuing the master’s degree Research Fellow with the Shanghai General Practice Medical Education
in pattern recognition and intelligent system from and Research Center, Jiading District Central Hospital Affiliated Shanghai
the School of Automation. His research interests University of Medicine & Health Sciences. He has published more than
include the machine learning, computer vision, 20 articles. His research interests include the usage and challenge of inno-
and neural networks. vating technology in general practice medicine, such as sequencing, big data,
and AI.
75590 VOLUME 7, 2019

A Novel Deep Arrhythmia-Diagnosis Network For Atrial Fibrillation Classification Using Electrocardiogram Signals

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

A Novel Deep Arrhythmia-Diagnosis Network For Atrial Fibrillation Classification Using Electrocardiogram Signals

Uploaded by

Copyright:

Available Formats

SPECIAL SECTION ON DEEP LEARNING

FOR COMPUTER-AIDED MEDICAL DIAGNOSIS

A Novel Deep Arrhythmia-Diagnosis Network

XIAOGUANG ZHOU1,2 , AND QING CHANG3,4

Corresponding authors: Xiaoguang Zhou (zxg@bupt.edu.cn) and Qing Chang (robie0510@hotmail.com)

I. INTRODUCTION its incidence increases with age [1], [2]. AF is characterized

At present, the diagnosis of AF mainly depends on the A. MACHINE LEARNING

75578 VOLUME 7, 2019

VOLUME 7, 2019 75579

75580 VOLUME 7, 2019

it is applied to analyze ECG signals, and its main function is

4) FULLY CONNECTED LAYER

VOLUME 7, 2019 75581

75582 VOLUME 7, 2019

VOLUME 7, 2019 75583

TABLE 1. The detailed parameters of network architecture.

and the first fully-connected layer to compress the features

75584 VOLUME 7, 2019

VOLUME 7, 2019 75585

FIGURE 12. Training and validation loss with100 epochs in set B.

C. COMPARISON AND DISCUSSION

75586 VOLUME 7, 2019

VOLUME 7, 2019 75587

75588 VOLUME 7, 2019

VOLUME 7, 2019 75589

MUYI SUN received the B.S. degree in automa-

75590 VOLUME 7, 2019

You might also like