
2018 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI)

4-7 March 2018


Las Vegas, Nevada, USA

P-QRS-T Localization in ECG Using Deep Learning


Hedayat Abrishami1, Matthew Campbell2, Chia Han1, Richard Czosek3, and Xuefu Zhou1

This research was supported by Children's Cardiomyopathy Foundation Grant.
1 Hedayat Abrishami, Chia Han and Xuefu Zhou are with the Department of EECS, University of Cincinnati, 2600 Clifton Ave, Cincinnati, OH 45220, USA abrishht@mail.uc.edu han@ucmail.uc.edu zhoxu@ucmail.uc.edu
2 Matthew Campbell is with the Division of Cardiology at Children's Hospital of Philadelphia, PA 19104, USA campbellm5@email.chop.edu
3 Richard Czosek is with the Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, Cincinnati, OH 45229, USA Richard.czosek@cchmc.org
978-1-5386-2405-0/18/$31.00 ©2018 IEEE

Abstract: This paper describes work that uses the capabilities of deep neural networks to predict key wave locations in a cardiac complex on an electrocardiogram (ECG), as part of a challenge introduced by PhysioNet, a provider of ECG collections, on detecting critical waveforms that contain essential information in cardiology. The key waves include the P-wave, QRS-wave, and T-wave. Recent attempts to extract hierarchical features of cardiac complexes have been reported in the literature, but finding the accurate position of critical cardiac waves has been a challenge in ECG signal processing research. This study investigates multiple architectures and learning rates of deep neural networks and adopts a four-step procedure to find the one that best predicts the wave locations. An accuracy of 96.2% in the localization task has been achieved. The study consists of four parts: obtaining the cardiac complexes from the QT Database (QTDB); introducing and training multiple architectures, including fully-connected networks, a LeNet-style ConvNet with dropout, and a LeNet-style ConvNet without dropout; using an unseen test set to calculate the accuracy of the system with different tolerances in each wave interval; and comparing all these architectures to identify the most suitable one for this task.

I. INTRODUCTION

In recent years, with increasing computational power, several machine learning methodologies have been used for diagnosing heart diseases by classifying different heart conditions and abnormalities [1], [2]. These approaches are classifiers for heart conditions, and they do not locate the wave related to the associated symptoms. Finding the location of the major waves, regardless of their morphology, can augment the possibility of even more accurate medical diagnoses in automated ECG diagnostic systems, leading to practical, high-throughput, cost-effective population-based screening. This paper presents an investigation of a particular branch of machine learning, the deep learning (DL) methods, applied to extracting the features necessary for detecting the position of ECG key waves. DL methods are attractive since they are capable of learning features on their own by observing a large number of instances. The aim of this research work is to apply DL to ECG reads, to both detect and locate the major waves, namely the P-wave, the QRS-wave, and the T-wave, embedded in a cardiac complex. Past attempts have included manually designed features from the cardiac complexes for each type of diagnosis. Managing multiple versions of manually defined features would not be cost effective in practice. A reliable DL method would generate its own features, leading to one capable and easy-to-maintain system.

The task of locating the three major waves in cardiac complexes is one of the primary challenges introduced by PhysioNet [3], a provider of collections of recorded physiologic signals. PhysioNet provides a comprehensive dataset, called QTDB [3], for detecting waveforms in the ECG. For each interval of the P-wave, QRS-wave, and T-wave in this dataset, a point has been marked. Since the QTDB is an aggregation of other datasets, in a few cases the beginning and end of the wave are marked, thus establishing two points for one interval. Therefore, three points are given, and each point represents a location in a wave interval.

Fig. 1. One instance of a cardiac complex extracted from QTDB.

Recorded physiological signals are sampled at 250 Hz, and the QTDB includes 105 fifteen-minute two-channel ECG recordings, chosen to include a broad variety of QRS and ST-T morphologies. An automated system has annotated the waveforms, and experts made corrections when the automated system failed to perform annotation [3]. This dataset allows researchers to perform analysis on the entire ECG signal or beat-to-beat analysis. Fig. 1 shows an extracted cardiac complex from QTDB padded to 300 data points sampled at 250 Hz (1.2 seconds) and 3 marks indicating the locations of the P-wave, QRS-wave, and T-wave. In this work, the ECG is analyzed beat to beat within every cardiac complex, and our focus is not on extracting cardiac complexes but only on finding the
waves in previously extracted cardiac complexes.

II. RELATED WORKS

In the last four decades, scientists have developed multiple automated approaches to detect various waveforms in ECG signals. Pan and Tompkins [4] developed in 1985 one of the most famous derivative-based approaches, which detects only the QRS-wave. There are usually three main approaches to finding various waveforms in ECG: derivative-based [5], [6], wavelet-filter [7], [8], and amplitude-based methods. Machine learning methods have also been used in this realm of study. These include Neural Networks (NN) [9], Random Forest [10], Support Vector Machines (SVM) [10], AdaBoost, Naive Bayes, K-Nearest Neighbors (KNN), Hidden Markov Models (HMM), rule-based methods [11], linear discriminants [12], and logistic regression [13]. With the successful emergence of Deep Learning (DL) methods in various areas of signal processing and data analysis, [1] and [2] were recently the first related DL methods used to classify heartbeats into 5 categories. However, there is no study on finding and locating the waves in heartbeats, which is of high importance in cardiology communities since every cardiologist refers to them in their diagnoses.

DL has become more and more popular since the great result of Krizhevsky [14] in classifying the ImageNet [15] dataset in 2012 by using a large Convolutional Neural Network (ConvNet), an architecture introduced by Lecun [16]. DL has been used in image/video synthesis, object classification and localization, etc. Longpre [17] used ConvNets to detect facial key point locations (eyes, lips, eyebrows, etc.). Zeiler [18] used a ConvNet to classify and localize objects in the ImageNet dataset [15].

III. TECHNICAL APPROACH

A. Data Preparation

Our data preparation consists of three steps. The first step is extracting cardiac complexes from QTDB. The second step is padding the cardiac complexes. The third step is removing baseline wander.

In the first step, cardiac complexes are extracted from every lead of the fifteen-minute ECGs by their annotations. Since the QT dataset consists of different datasets, the annotations in some cases include the beginning and ending of the waves instead of the regular one point per wave interval. In those cases the mid-point between the beginning and ending of the wave is taken.

The second step gives all the cardiac complexes the same length of 300 data points. Since the QTDB is sampled at 250 Hz, the 300 points translate to 1.2 seconds. Extracted cardiac complex signals have different lengths, and fixed-length signals are needed to feed the network. Shorter signals are padded with repetitions of the signal's last value. The padding location is randomly chosen at the beginning, the end, or both sides of the cardiac complex. This is a very important step to prevent the networks from becoming biased toward a certain area for the wave intervals.

The third step removes the baseline drift, using the two-median-filter method of Dohare [19].

After these steps, the dataset is ready to be fed to the neural networks. Three locations in the P-wave, QRS-wave, and T-wave intervals are the network's output targets. These locations are not necessarily at the peaks of these waves but just inside the wave intervals.

B. Architectures

Three different deep neural network architectures have been implemented: a two-layer fully-connected neural network and two ConvNets.

The reason for implementing a fully-connected neural network is to establish a baseline for the convergence of our networks and to examine whether ConvNets can perform better than fully-connected networks, and whether hierarchical feature extraction is actually beneficial to the task. The inputs are the padded 300-data-point cardiac complexes. The next layer, the hidden layer, is fully-connected with 150 neurons followed by a ReLU activation. The last layer is the output layer with 3 neurons. Each of these 3 neurons predicts the location of one of the P-wave, QRS-wave, and T-wave. Root Mean Square Error (RMSE) is used as the loss function. The network is trained with the Adam optimizer [20] with three different learning rates, 1E-3, 1E-4, and 1E-5, annealed to 1E-7 through 900 epochs. The best result was achieved by the learning rate 1E-3, with a loss of 16.26 on the test set. This error means that, on average, the predicted output for every key wave is 0.065 second off from its actual occurrence. This experiment forms the baseline for the other experiments. Table I shows the architecture of the baseline network.

TABLE I
BASELINE NETWORK ARCHITECTURE.

Layer   Name     Size
0       input    300
1       dense    150
2       output   3

Next, two LeNet-style neural networks [16] are implemented, consisting of alternating convolutional layers and max-pooling layers, followed by fully connected hidden layers. One of the ConvNet architectures has dropout layers. A dropout layer, for every weight connection, based on a probability, applies or ignores the weight update. It is claimed that the dropout layer can break possible symmetry in the network [21].

Table II illustrates the architecture of the ConvNet. This neural network is six layers deep, consisting of two pairs of alternating convolutional and max-pooling layers. The input to this network is a 1-D cardiac complex signal with a size of 300 points sampled at 250 Hz. All the convolutional layers have (5 × 1) filters and all the max-pooling layers have (2 × 1) filters. For all the convolutional and max-pooling layers, the stride is (3 × 1). This is followed by two condensed, fully-connected layers. The first layer of the ConvNet includes 16 filters of size 5 × 1 (H1, H2, ..., H16). Each filter has a bias and five shared weights connected to its local receptive field.
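To make the first-layer description concrete, the following NumPy sketch applies 16 filters, each with five shared weights and one bias, to a 300-point complex with a stride of 3. It is only an illustration under stated assumptions: zero "same" padding is assumed so that the output length matches the 1×100×16 size listed for conv1 in Table II (the paper does not state the padding scheme), and the function name `conv1d_same`, the random weights, and the random input are hypothetical stand-ins rather than the authors' implementation.

```python
import numpy as np

def conv1d_same(signal, weights, biases, stride=3):
    """Strided 1-D convolution (cross-correlation) with zero 'same' padding.

    signal:  (length,) input samples
    weights: (n_filters, width) shared weights, one row per filter
    biases:  (n_filters,) one bias per filter
    Returns an array of shape (ceil(length / stride), n_filters).
    """
    n_filters, width = weights.shape
    out_len = -(-len(signal) // stride)  # ceiling division
    # Zero padding needed so the last window fits (assumed 'same' scheme).
    total_pad = max((out_len - 1) * stride + width - len(signal), 0)
    left = total_pad // 2
    padded = np.pad(signal, (left, total_pad - left))
    out = np.empty((out_len, n_filters))
    for k in range(out_len):
        window = padded[k * stride : k * stride + width]  # local receptive field
        out[k] = weights @ window + biases
    return out

rng = np.random.default_rng(0)
complex_300 = rng.standard_normal(300)         # stand-in for one padded cardiac complex
H = rng.standard_normal((16, 5)) * 0.1         # filters H1..H16, five shared weights each
b = np.zeros(16)                               # one bias per filter
feature_maps = conv1d_same(complex_300, H, b)  # shape (100, 16)
```

Under these assumptions the layer holds 16 × (5 + 1) = 96 trainable parameters, because every output position of a filter reuses the same five weights and bias.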

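The padding step of the data preparation (Section III-A) can be sketched in the same way. The paper states only that shorter complexes are padded with repetitions of the signal's last value and that the padding is placed at the beginning, the end, or both sides at random. Everything else below is an assumption made for illustration: the name `pad_complex`, the uniform choice among the three placements, the random split used for "both", the shifting of the annotation marks by the left padding, and the error raised for complexes longer than 300 samples.

```python
import numpy as np

TARGET_LEN = 300  # samples, i.e. 1.2 s at 250 Hz

def pad_complex(signal, marks, rng, target_len=TARGET_LEN):
    """Pad a 1-D cardiac complex to target_len by repeating its last value.

    The padding goes before, after, or on both sides of the complex,
    chosen at random. Returns the padded signal and the annotation marks
    shifted by the amount of left padding.
    """
    signal = np.asarray(signal, dtype=float)
    n_pad = target_len - len(signal)
    if n_pad < 0:
        # Handling of longer complexes is not described in the paper.
        raise ValueError("complex is longer than the target length")
    mode = rng.choice(["begin", "end", "both"])
    if mode == "begin":
        left = n_pad
    elif mode == "end":
        left = 0
    else:
        left = int(rng.integers(0, n_pad + 1))  # assumed random split
    right = n_pad - left
    fill = signal[-1]  # the signal's last value is used on both sides
    padded = np.concatenate([np.full(left, fill), signal, np.full(right, fill)])
    return padded, np.asarray(marks) + left
```

Shifting the marks together with the signal keeps the three target locations aligned with the padded waveform, which is what lets the random placement vary where the waves appear in the 300-point window.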
TABLE II
CONVOLUTIONAL NEURAL NETWORK WITHOUT DROPOUTS.

Layer   Name       Size
0       input      1×300×1
1       conv1      1×100×16
2       maxpool1   1×50×16
3       conv2      1×18×32
4       maxpool2   1×9×32
none    reshape    288
5       dense1     150
6       dense2     3

TABLE IV
RESULT FOR EVERY ARCHITECTURE AND RELATED LEARNING RATES.

Architecture      Learning Rate   Training Set Cost   Validation Set Cost   Test Set Cost
With Dropout      1e-3            18.09               18.19                 18.28
With Dropout      1e-4            14.46               14.30                 14.44
With Dropout      1e-5            44.82               44.84                 44.91
ConvNet           1e-3            4.71                5.36                  5.57
ConvNet           1e-4            6.73                6.85                  7.00
ConvNet           1e-5            33.90               33.68                 34.01
Fully Connected   1e-3            15.62               15.88                 18.84
Fully Connected   1e-4            31.37               31.29                 31.15
Fully Connected   1e-5            42.54               42.07                 42.63
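The grid search summarized in Table IV can be read programmatically. The short sketch below copies the table's costs into a dictionary and picks the learning rate with the lowest validation cost for each architecture. Selecting by validation cost is an assumption made here for illustration, since the paper does not state which column determined the highlighted results; the names `TABLE_IV`, `best_per_architecture`, and `best_overall` are hypothetical.

```python
# (architecture label, learning rate) -> (training, validation, test) cost,
# copied from Table IV.
TABLE_IV = {
    ("With Dropout", 1e-3): (18.09, 18.19, 18.28),
    ("With Dropout", 1e-4): (14.46, 14.30, 14.44),
    ("With Dropout", 1e-5): (44.82, 44.84, 44.91),
    ("ConvNet", 1e-3): (4.71, 5.36, 5.57),
    ("ConvNet", 1e-4): (6.73, 6.85, 7.00),
    ("ConvNet", 1e-5): (33.90, 33.68, 34.01),
    ("Fully Connected", 1e-3): (15.62, 15.88, 18.84),
    ("Fully Connected", 1e-4): (31.37, 31.29, 31.15),
    ("Fully Connected", 1e-5): (42.54, 42.07, 42.63),
}

def best_per_architecture(results):
    """Return {architecture: learning rate with the lowest validation cost}."""
    best = {}
    for (arch, lr), (_, val, _) in results.items():
        if arch not in best or val < results[(arch, best[arch])][1]:
            best[arch] = lr
    return best

def best_overall(results):
    """Return the (architecture, learning rate) pair with the lowest validation cost."""
    return min(results, key=lambda key: results[key][1])
```

With the values above, the overall minimum is the ConvNet without dropout at a learning rate of 1e-3, which agrees with the configuration the Results section reports as the best one.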
The second convolutional layer has 32 filters. The first fully-connected layer has 150 neurons. The output layer of size three produces the prediction of the three wave positions for the related input cardiac complex. 500 iterations were used with a learning rate of 1E-5 annealed to 1E-7 using the Adam optimizer [20].

The next ConvNet architecture for cardiac complexes is the same as the previous one with one difference: a dropout layer is added after each max-pooling layer, as illustrated in Table III. A dropout layer uses a probability to decide whether to update a weight connection. The probability given for this task was 0.2. This is to investigate further whether using such layers in a small ConvNet is beneficial. The rest of the hyperparameters are the same as for the previous ConvNet.

TABLE III
CONVOLUTIONAL NEURAL NETWORK WITH DROPOUTS.

Layer   Name       Size
0       input      1×300×1
1       conv1      1×100×16
2       maxpool1   1×50×16
3       dropout1   1×50×16
4       conv2      1×18×32
5       maxpool2   1×9×32
6       dropout2   1×9×32
7       reshape    288
8       dense1     150
9       dense2     3

C. Training Loss Function

In the experiment, the loss function requires measuring the distance between the wave locations predicted by the proposed neural network and the annotations marked in the QTDB on cardiac complexes. RMSE is employed as the loss function, as shown in Eq. 1. Here, $n = 3$, given that there are three dimensions to the output, $y_i$ is the predicted output of the $i$-th dimension, and $t_i$ is the expected output of the $i$-th dimension. Each one of the dimensions is a desired wave location. The neural network uses the RMSE loss function to train the weights with the backpropagation method [16]. Moreover, the RMSE function is used for the test set with an additional consideration that warrants further discussion in the next section.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - t_i\right)^2} \tag{1}$$

D. Test Loss Function

Considering the fact that the desired key waves have duration, the test loss function should tolerate a degree of displacement from the dataset annotation. Therefore, a loss function with variation tolerance is justified. This suggests a new layer that absorbs some variation in the prediction before sending the network output to the RMSE loss function. Eq. 2 shows the displacement absorption layer, as follows:

$$d_i = y_i - p_i$$

$$d_i' = \begin{cases} \mathrm{eps} & \text{if } d_i \ge \mathrm{eps} \\ -\mathrm{eps} & \text{if } d_i \le -\mathrm{eps} \\ d_i & \text{otherwise} \end{cases} \tag{2}$$

$$y_i' = y_i - d_i'$$

where $y_i$ is the neural network prediction for the $i$-th wave location, $p_i$ is the annotated mark in the QTDB, and $y_i'$ is the network's new predicted position. eps is a constant indicating the vicinity that the loss function can tolerate for every wave.

IV. RESULTS

A total of 133,524 complexes are extracted from QTDB and divided into three different sets with no sample repetition. The division is 60% of the data for training, 10% for validation, and 30% for testing. Bear in mind that these sets are patient independent, which means the ECG recordings are exclusive to each set. Table IV shows the entire grid search experiment. The best results for each architecture are highlighted. The best architecture in our experiment was found to be the ConvNet without dropout layers. This network is named ECGNet. As shown in Table IV, the values of RMSE for training, validation, and testing are 4.71, 5.36, and 5.57, respectively. The system is able to find the waves very well because the P-wave, QRS-wave, and T-wave have durations, and it is not just one point but any point within their interval that is acceptable for detecting wave locations. However, the results of only finding the peaks are provided. The loss of 5.57 RMSE, 0.028 second, for waves that usually have durations of more than ten data points at a sampling rate of 250 Hz, can be considered remarkable.

Measuring the RMSE in terms of accuracy percentage points is possible too. Considering there are 300 data points,

if every location of the waves is predicted randomly, then the average error would be 150 data points. Therefore, the accuracy percentage can be obtained by Eq. 3:

$$\mathrm{Accuracy} = \frac{150 - \mathrm{RMSE}_{\mathrm{Cost}}}{150} \tag{3}$$

The best result for the test set in terms of percentage is 96.2%, and it is obtained without considering that the waves have durations. Table V shows the results of ECGNet with three different vicinity tolerance ranges, eps (0, 5, and 10), using the added layer for the test loss function introduced in Eq. 2. When eps is zero, this added layer has no effect, and the output of the layer is the same as its input. This means that the exact QTDB annotation is used as the target for the output when the error is calculated. With a vicinity tolerance of 10 data points, the system's accuracy on a large variety of cardiac complexes is 99.62%, a phenomenal result considering this is the very first work to find these waves in cardiac complexes using a DL method.

TABLE V
ECGNET WITH VICINITY TOLERANCE.

Test set results for ECGNet
Vicinity   RMSE   Error percentage   Accuracy
0          5.57   3.8%               96.2%
5          2.33   1.6%               98.4%
10         1.61   1.1%               98.9%

Karimipour [5] has done extensive research on finding P-QRS-T waveforms using a combination of wavelet-based pre-processing, derivative-based wave detection, and a rule-based decision-maker method. Unlike Karimipour's work, the proposed method uses a simple median-filter pre-processing method, and the rest is training the network. Martinez [7] proposed a wavelet-based ECG delineator. Table VI compares the proposed ECGNet method with these works and shows the lower RMSE of ECGNet. However, in Karimipour's work, cardiac complexes are detected by the algorithm, whereas our work does not find cardiac complexes; its only focus is to examine the ability of DL to find different waveforms.

TABLE VI
COMPARISON OF ECGNET WITH OTHERS.

Method              QTDB          RMSE (ms)
ECGNet (Proposed)   30% of data   5.57
Karimipour [5]      40 records    10.80
Martinez [7]        105 records   11.175

The training was done on a CUDA-enabled GeForce GT 740M graphics card, and it took approximately one hour and twenty minutes to complete the 900 epochs.

V. CONCLUSION

To our knowledge, there has been no DL research to date that has focused on finding the location of major ECG waves, regardless of their morphology. The results in this work showed that DL is capable of extracting the features necessary for detecting the position of ECG key waves. The ability to delineate and localize key waves augments the possibility of making an impact on future research in cardiology. By combining the vital information of waveforms with other methods for recognizing symptoms, heart-related diseases can be diagnosed more accurately, and high-throughput, automated ECG diagnostic systems can be developed to serve the need of large population screening for disease prevention.

REFERENCES

[1] S. Kiranyaz, T. Ince, and M. Gabbouj. Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Transactions on Biomedical Engineering, 63(3):664–675, March 2016.
[2] S. Kiranyaz, T. Ince, and M. Gabbouj. Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Transactions on Biomedical Engineering, 63(3):664–675, March 2016.
[3] Goldberger et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation, 2000 (June 13).
[4] J. Pan and W. J. Tompkins. A real-time QRS detection algorithm. IEEE Transactions on Biomedical Engineering, BME-32(3):230–236, March 1985.
[5] Karimipour et al. Real-time electrocardiogram P-QRS-T detection-delineation algorithm based on quality-supported analysis of characteristic templates. Comput. Biol. Med., 52:153–165, September 2014.
[6] Homaeinezhad et al. Discrete wavelet-aided delineation of PCG signal events via analysis of an area curve length-based decision statistic. Cardiovascular Engineering (Dordrecht, Netherlands), 10, 2010.
[7] Martinez et al. A wavelet-based ECG delineator: evaluation on standard databases. IEEE Transactions on Biomedical Engineering, 51(4):570–581, April 2004.
[8] Tafreshi et al. Automated analysis of ECG waveforms with atypical QRS complex morphologies. Biomedical Signal Processing and Control, 10:41–49, 2014.
[9] Ouyang et al. Training a NN with ECG to diagnose the hypertrophic portions of HCM. In 1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98CH36227), volume 1, pages 306–309, May 1998.
[10] Q. A. Rahman, L. G. Tereshchenko, M. Kongkatong, T. Abraham, M. R. Abraham, and H. Shatkay. Identifying hypertrophic cardiomyopathy patients by classifying individual heartbeats from 12-lead ECG signals. In 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 224–229, Nov 2014.
[11] Kaiser et al. Automatic learning of rules. A practical example of using artificial intelligence to improve computer-based detection of myocardial infarction and left ventricular hypertrophy in the 12-lead ECG. Journal of Electrocardiology, 29 Suppl:17–20, 1996.
[12] P. de Chazal and R. B. Reilly. A patient-adapting heartbeat classifier using ECG morphology and heartbeat interval features. IEEE Transactions on Biomedical Engineering, 53(12):2535–2543, Dec 2006.
[13] Warner et al. Improved electrocardiographic detection of left ventricular hypertrophy. Journal of Electrocardiology, 35 Suppl:111–5, 2002.
[14] Krizhevsky et al. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 2012.
[15] Russakovsky et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vision, 115(3):211–252, December 2015.
[16] Yann Lecun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. In Proceedings of the IEEE, pages 2278–2324, 1998.
[17] Shayne Longpre and Ajay Sohmshetty. Facial keypoint detection. 2016.
[18] Zeiler et al. Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014, pages 818–833. Springer, 2014.
[19] Ashok Kumar Dohare, Vinod Kumar, and Ritesh Kumar. An efficient new method for the detection of QRS in electrocardiogram. Comput. Electr. Eng., 40(5):1717–1730, July 2014.
[20] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.
[21] Srivastava et al. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15(1):1929–1958, January 2014.

