
applied sciences

Article

Research on the Classification of ECG and PCG Signals Based on BiLSTM-GoogLeNet-DS

Jinghui Li 1,2, Li Ke 1,*, Qiang Du 1, Xiaodi Ding 1 and Xiangmin Chen 1

1 School of Electrical Engineering, Shenyang University of Technology, Shenyang 110870, China
2 College of Telecommunication and Electronic Engineering, Qiqihar University, Qiqihar 161006, China
* Correspondence: keli@sut.edu.cn; Tel.: +86-024-2549-9250
Abstract: Because a single cardiac function signal cannot reflect cardiac health in all respects, we propose a classification method using ECG and PCG signals based on BiLSTM-GoogLeNet-DS. The electrocardiogram (ECG) and phonocardiogram (PCG) signals used as research objects were collected synchronously. Firstly, the obtained ECG and PCG signals were filtered, and then the ECG and PCG signals were fused and classified by using a bi-directional long short-term memory network (BiLSTM). After that, time-frequency processing was performed on the filtered ECG and PCG signals to obtain the time-frequency diagram of each signal; each one-dimensional signal was changed into a two-dimensional image signal, and the size of each image signal was adjusted for input to the improved GoogLeNet network for classification, which yielded the two-channel classification results. Combined with the classification results of the above BiLSTM network, the three-channel classification results were obtained. The classification results of these three channels were used to make a decision via the fusion strategy of the improved D-S theory, which gave the final classification results. Taking 70% of the signals in the database as training data and 30% as test data, the obtained classification accuracy was 96.13%, the sensitivity was 98.48%, the specificity was 90.8%, and the F1 score was 97.24%. From the experimental results, the method proposed in this paper obtained high classification accuracy, and the classification effect was better than that of a single cardiac function signal, which makes up for the low accuracy of a single cardiac function signal for judging cardiac disease.

Keywords: ECG and PCG signals; bi-directional long short-term memory network (BiLSTM); GoogLeNet; D-S theory

Citation: Li, J.; Ke, L.; Du, Q.; Ding, X.; Chen, X. Research on the Classification of ECG and PCG Signals Based on BiLSTM-GoogLeNet-DS. Appl. Sci. 2022, 12, 11762. https://doi.org/10.3390/app122211762

Academic Editor: Byung-Gyu Kim

Received: 1 September 2022; Accepted: 14 November 2022; Published: 19 November 2022

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

The existing cardiovascular disease diagnosis and treatment techniques are limited, and the diagnosis and treatment processes are complex, which inconveniences a patient's disease examination. Clinically, a doctor's diagnosis of cardiovascular disease is mostly based on multi-modal detection data, such as PCG, ECG, cardiac color Doppler ultrasound, and so on. ECG and PCG signals play very important roles in the early prediction and diagnosis of heart disease. Although artificial auscultation is a convenient and low-cost cardiac diagnosis technology, it requires doctors to have very rich clinical experience, and diagnoses based on doctors' subjective judgments are prone to mistakes, especially for some complex heart diseases. An ECG examination is a common method used in modern cardiovascular diagnosis; when the patient has symptoms such as angina pectoris or myocardial infarction, the ECG signals will change significantly [1]. However, the examination of some diseases will still be less sensitive. When a patient asks the doctor for help, the doctor often makes a preliminary judgment about the heart disease according to the patient's condition; for more serious heart diseases, examinations with other instruments are needed. This process may allow the disease to develop, with the loss of the best opportunity for treatment. Moreover, in order to fully understand the health status of the patient's heart, currently only the detection method of an insertable catheter can be used in clinical practice, but this method is traumatic and risky to the patient, and requires the hospital to have advanced technical equipment.
Nowadays, most researchers carry out signal analyses and algorithm development for
cardiac function signals, such as an ECG or PCG signal, while there are few related studies
based on ECG and PCG signals. Although researchers have fused PCG, ECG, and cardiac impedance signals, most of them treat one kind of signal merely as a reference for another: a cardiac impedance signal is processed with the help of an ECG signal, the feature points of a PCG signal are extracted with the help of the ECG signal, or the ECG and PCG signals are simply fused while the heart disease diagnosis is still given by only a single cardiac function signal. For example, Rodion Stepanov et al. determined cardiac hemodynamic
parameters by using the combined features of ECG, PCG, and cardiac impedance signals.
The relevant parameters of cardiac impedance were obtained with the help of ECG and PCG
signals [2]. Jan Nedoma et al. compared heart rate monitoring results by synchronizing
the acquisition of the ballistic cardiography, electrocardiogram, and phonocardiogram
signals [3]. Mónica Martin et al. improved heart disease diagnosis accuracy by acquiring
ECG and PCG signals synchronously [4]. Han Li detected coronary artery disease by
a dual-input neural network using electrocardiogram and phonocardiogram signals [5].
Abdelakram Hafid et al. synchronously acquired ECG and cardiac impedance signals
with the least electrodes. The parameter calculation of a cardiac function signal was
completed by means of an ECG signal [6]. Chakir Fatima et al. used different classifiers to
classify the signals by the synchronously collected PCG and ECG signals and compared the
classification results with the PCG signals [7]. Huan Zhang proposed the detection method
of coronary artery disease, which made use of multi-modal features. The multi-modal
feature models were proposed using ECG and PCG signals. Finally, the classification
accuracies of these modes were compared [8].
Through the above-mentioned research, it can be found that the focus has gradually shifted from single cardiac function signals to multi-modal cardiac function signals. Research on multiple simultaneous decision-making methods shows the advantages of multi-modal signal research and makes up for the flaws of single-modal signal diagnosis.
Therefore, this paper integrates various technologies to realize the judgments of cardiac
health statuses via ECG and PCG signals. This paper proposes the classification method
of ECG and PCG signals based on BiLSTM-GoogLeNet-DS (combined with the fusion
technology for deep learning). This method can extract the features of long-distance
dependent information based on BiLSTM, which shortens the training time, learns the features better, and can produce higher accuracy for long-span time series data. The
GoogLeNet model has also achieved excellent results in the field of image recognition with
its multi-layer convolutional neural network structure and inception structure. This paper,
combined with the advantages of these two deep learning networks, uses the BiLSTM
network and GoogLeNet network to extract and classify the features of ECG and PCG
signals. The final classification results were obtained by using the fusion technology
twice with the help of the improved D-S theory. The experiments show that the proposed
algorithm improves the classification accuracy of ECG and PCG signals effectively, and is
more accurate than single-modal cardiac function signals.
The structure of this paper is organized as follows: the methodology of the proposed algorithm is presented in Section 2. Section 3 presents the related theory of the BiLSTM network and GoogLeNet model. Section 4 presents the implementation steps of the proposed algorithm. Section 5 presents the experiments and results: the experimental dataset and the parameter settings of the network model are presented, the comparative experiments are set up, the evaluation criteria of classification accuracy, sensitivity, specificity, precision, and F1 score are given, and the comparison results with other algorithms are provided. Finally, the conclusions are drawn in Section 6.

2. Methodology

This paper mainly selects the PCG and ECG signals collected synchronously as the research objects. PhysioNet published a cardiovascular disease dataset containing both PCG and ECG signals in 2016 [9]. Based on this dataset, this paper proposes the classification method based on BiLSTM-GoogLeNet-DS to predict and classify the signals in the database. Firstly, the ECG and PCG signals are preprocessed, respectively, mainly for filtering. Then the processed signals are sent to the BiLSTM network to perform network fusion and classify the signals. After that, the filtered PCG and ECG signals are, respectively, subjected to wavelet transform to obtain the time-frequency diagram of each signal. Each time-frequency diagram is resized and converted into the image size suitable for GoogLeNet network training, and then classified through the improved GoogLeNet network to obtain the classification results of the PCG and ECG signals, respectively. These two channels, combined with the classification results of the above BiLSTM network, give the three-channel classification results. The classification decision is made by the fusion strategy of the improved D-S theory. Calculating the probability values of ECG and PCG signals belonging to different categories leads to the basic probability assignment (BPA) function. The local recognition credibility of each feature is estimated by using the confusion matrix. The weighted correction is carried out when constructing the BPA of the classifier. Finally, the classification results are obtained. The following is the specific implementation process, as shown in Figure 1.
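Before turning to Figure 1, the wavelet time-frequency step can be made concrete. The sketch below is an illustration under our own assumptions (the Morlet wavelet, the parameter w0, and the scale grid are arbitrary choices, not the authors' exact implementation); it turns a one-dimensional signal into the kind of two-dimensional time-frequency image that is later resized for GoogLeNet:

```python
import numpy as np

def morlet_cwt(signal, scales, fs, w0=6.0):
    """Continuous wavelet transform with a complex Morlet wavelet.

    signal : 1-D array sampled at fs Hz
    scales : wavelet scales in seconds (larger scale = lower frequency)
    Returns |coefficients| of shape (len(scales), len(signal)),
    i.e. a time-frequency image.
    """
    n = len(signal)
    t = (np.arange(n) - n // 2) / fs          # time axis centred on zero
    image = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # scaled Morlet mother wavelet, L1-normalised by 1/s
        wavelet = np.exp(1j * w0 * t / s - (t / s) ** 2 / 2) / s
        image[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return image
```

Each row of the returned array corresponds to one scale; stacking the rows yields the image that replaces the one-dimensional signal at the GoogLeNet input.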

[Figure 1: the synchronously collected PCG and ECG signals are each pre-processed; one channel fuses them in the BiLSTM network, while wavelet transforms yield a time-frequency diagram per signal for two GoogLeNet channels; the three channel outputs are combined by D-S decision fusion.]

Figure 1. Classification method and implementation process of ECG and PCG signals.
3. Related Theory
3.1. BiLSTM

LSTM (long short-term memory network), which has the advantage of memorability, is an improved model of the recurrent neural network. It can efficiently learn the nonlinear characteristics of a time series. In the internal structure of LSTM, the activation method of the nodes in the hidden layer is very complex. By selectively forgetting information with relatively small effects and remembering useful information with relatively large time intervals, it solves the problems of gradient explosion and gradient disappearance in long time series.

The memory unit of LSTM consists of three gate structures: the input gate, forget gate, and output gate. The network structure is shown in Figure 2. These gates ensure that information can be selectively passed, and the state of the transmission unit can be obtained by some linear operations, so as to ensure the invariance of information during transmission. By memorizing information selectively, it ensures that most of the sample data are effectively used. Its deep features can be learned to diagnose diseases.
Figure 2. Network structure of LSTM.

In LSTM, the sample data go through the following three stages. Through the forget gate, it is decided which information to forget and which information to retain. Through the input gate, the cell state is updated and the updated information is determined. Through the output gate, the output information is obtained.
The current sample input x_t and the previous sequence output h_{t−1} jointly determine which information of the cell state C_{t−1} needs to be remembered:

f_t = σ(W_f · (h_{t−1}, x_t) + b_f)  (1)

In this formula, W_f is the weight, b_f is the offset, and σ is the sigmoid activation function. C_{t−1} represents the cell state, which saves the information of the previous memory moment. f_t represents the degree to which the forget gate forgets the cell state. The input gate updates the cell state and determines the new information to be added to the cell state:

i_t = σ(W_i · (h_{t−1}, x_t) + b_i)  (2)

C̃_t = tanh(W_c · (h_{t−1}, x_t) + b_c)  (3)

C_t = f_t · C_{t−1} + i_t · C̃_t  (4)

In these formulas, i_t represents the update degree of the input gate to the new information, and C̃_t is the candidate vector of the current new state information. When the cell state is updated, f_t · C_{t−1} represents the information to be forgotten, i_t · C̃_t represents the information to be retained, and C_t represents the current cell state. In this way, the information is stored in the memory unit, and finally the information is output at the output gate.
o_t = σ(W_o · (h_{t−1}, x_t) + b_o)  (5)

h_t = o_t · tanh(C_t)  (6)

In these formulas, o_t represents the output information of the output gate, and h_t is the output of the hidden layer. At the same time, the output h_t will also be fed into the next LSTM unit.

Sigmoid function:

σ(x) = (1 + e^{−x})^{−1}  (7)

Tanh function:

tanh(x) = (e^x − e^{−x}) / (e^x + e^{−x})  (8)
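As a compact restatement of Equations (1)–(6), the following sketch implements one LSTM step in plain numpy. This is a hypothetical minimal cell for illustration only; the weight shapes and the packing of the four transforms into one list are our own assumptions, not the paper's network configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W and b hold the parameters of the forget,
    input, candidate, and output transforms, each acting on the
    concatenation (h_prev, x_t)."""
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W[0] @ z + b[0])        # Eq. (1): forget gate
    i = sigmoid(W[1] @ z + b[1])        # Eq. (2): input gate
    c_tilde = np.tanh(W[2] @ z + b[2])  # Eq. (3): candidate state
    c = f * c_prev + i * c_tilde        # Eq. (4): cell-state update
    o = sigmoid(W[3] @ z + b[3])        # Eq. (5): output gate
    h = o * np.tanh(c)                  # Eq. (6): hidden output
    return h, c
```

Because o lies in (0, 1) and tanh(c) in (−1, 1), every component of the hidden output h stays inside (−1, 1), which matches Equations (5)–(8).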
Although the LSTM neural network has a memory function and can learn historical information, it can only learn forward information and cannot make effective use of backward information. The BiLSTM network model is an improvement of LSTM. It makes up for this shortcoming of LSTM, and its learning ability is further improved compared to LSTM. The BiLSTM structure is shown in Figure 3. The BiLSTM network has two transmission layers: a forward-transmission layer and a back-transmission layer. The forward-transmission layer trains the time series forward, the back-transmission layer trains the time series backward, and both the forward-transmission and back-transmission layers are connected to the output layer.
Figure 3. BiLSTM structure diagram.
The original signal directly inputs the information to the BiLSTM network layer through the input layer. The input sample signal obtains one value through the forward LSTM calculation and another value through the reverse LSTM calculation. The value transmitted to the hidden layer is determined by these two values. The formulas are as follows (the arrows denote the forward and backward directions):

→h_t = f(→w · x_t + →v · →h_{t−1} + →b)  (9)

←h_t = f(←w · x_t + ←v · ←h_{t−1} + ←b)  (10)

y_t = g(U · [→h_t; ←h_t] + c)  (11)

In the formulas, →h_t is the output of the forward LSTM layer and ←h_t is the output of the backward LSTM layer; the output state of BiLSTM is spliced according to →h_t and ←h_t; y_t is the output of the hidden layer after the BiLSTM superposition [10–13].
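Equations (9)–(11) amount to running one recurrent pass in each direction and splicing the two hidden states at every time step. The following is a minimal sketch under our own assumptions (the abstract step functions and the use of concatenation for the splice are illustrative, not the paper's exact configuration):

```python
import numpy as np

def bilstm_pass(xs, step_forward, step_backward, h0):
    """xs: list of input vectors. step_* : (x, h) -> h implements one
    recurrent update (Eqs. (9) and (10)). Returns, for every time step,
    the spliced [forward_h; backward_h] state used in Eq. (11)."""
    fwd, h = [], h0
    for x in xs:                          # forward-transmission layer
        h = step_forward(x, h)
        fwd.append(h)
    bwd, h = [], h0
    for x in reversed(xs):                # back-transmission layer
        h = step_backward(x, h)
        bwd.append(h)
    bwd.reverse()                         # realign with time order
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
```

With a toy update such as `step = lambda x, h: np.tanh(x + h)`, each returned vector contains the forward and backward views of the same time step, which is what lets BiLSTM exploit both past and future context.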
The original cardiac function signals are directly input into the deep learning network model without manual feature extraction. The deep information is learned through the neural network, and the final classification results are output, so as to realize the classification and diagnosis of cardiac function signals. Since the ECG and PCG signals are time series signals, the relationship between the front and back sequences of the signals should be fully considered. The BiLSTM network can take into account the input of forward and backward information. Compared with the LSTM network, the BiLSTM network can make more effective use of the information contained in cardiac function signals. Since the signal values of forward and reverse transmission of different data are different, the BiLSTM network can use bidirectional information to learn its intrinsic features, so as to improve the classification accuracy.
3.2. GoogLeNet Model

Based on the mechanism of convolutional neural networks, many researchers have designed convolutional neural network models for specific problems, such as GoogLeNet, AlexNet, VGGNet, ResNet, and so on. The GoogLeNet model proposed by the Google team is a relatively successful convolutional neural network model. The model has a total of 22 layers of network structure. In the model, in addition to the convolution layer, pooling layer, and full connection layer, there is also the inception structure proposed by the Google team. Through the convolutional kernels of different scales in the inception structure, different image features can be extracted. The GoogLeNet model has achieved excellent results in the field of image recognition with its multi-layer convolutional neural network structure and inception structure.

The core architecture of the GoogLeNet network model has nine inception modules. The initial inception v1 network consists of three convolutional layers of size 1 × 1, 3 × 3, and 5 × 5, and a 3 × 3 pooling layer (as shown in Figure 4). This structure processes the input images in parallel, and then splices each output result according to the channel, which is used to increase the width of the network and the adaptability of the network to scale. If the input image has more channels, there will be more convolutional kernel parameters, and the computation amount will increase accordingly, so it is necessary to reduce the dimension of the data. Since a 1 × 1 convolution does not change the height and width of the image, it can reduce the number of channels. Therefore, 1 × 1 convolutional kernels are added before the 3 × 3 and 5 × 5 convolutional layers and after the 3 × 3 max pooling layer, respectively, to reduce the feature map thickness, as shown in Figure 5.
[Figure 4: the previous layer feeds parallel 1 × 1, 3 × 3, and 5 × 5 convolutions and a 3 × 3 max pooling branch, whose outputs are joined by filter concatenation.]

Figure 4. Inception network module diagram.

[Figure 5: the same module with 1 × 1 convolutions inserted before the 3 × 3 and 5 × 5 convolutions and after the 3 × 3 max pooling, ahead of the filter concatenation.]

Figure 5. Inception network module diagram.
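The dimension-reduction role of the 1 × 1 convolutions in Figure 5 can be illustrated directly: a pointwise convolution leaves height and width untouched and only remixes channels. The shapes below are illustrative assumptions, loosely modelled on an inception stage, not values taken from the paper:

```python
import numpy as np

def conv1x1(x, w):
    """Pointwise (1 x 1) convolution.

    x : feature map of shape (C_in, H, W)
    w : kernel of shape (C_out, C_in)
    Output shape is (C_out, H, W): the spatial size is preserved while
    the channel count shrinks, so the 3x3/5x5 branches that follow
    operate on a thinner feature map and cost less computation."""
    return np.tensordot(w, x, axes=([1], [0]))
```

Every output pixel is just a linear mix of that pixel's input channels, which is why the operation cannot alter the image's height or width.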
As can be seen from Figure 5, this structure is different from the convolutional neural networks of the earlier sequential structure. The different sizes of the convolutional kernels make the expressions of the model features more diversified, which greatly improves the ability of model feature extraction. The GoogLeNet structure is shown in Figure 6. Due to the complex network structure of GoogLeNet, its structure is approximately represented by modules of different colors representing different network functions. The red module represents the pooling layer, the blue module represents the convolution layer and ReLU layer, the green module represents the splicing operation, the yellow module represents the Softmax activation function, and the white modules at both ends represent the input and classified output. The red box represents the different inception structures and the green box represents the auxiliary classification structure. From this figure, it can be seen that the GoogLeNet structure contains many similar inception modules. A complex GoogLeNet model is formed by connecting these modules end-to-end [14–18].

Figure 6. GoogLeNet structure.
4. Classification of ECG and PCG Signals Based on BiLSTM-GoogLeNet-DS

The ECG and PCG signal classification method based on BiLSTM-GoogLeNet-DS is proposed. Different fusion methods, combined with the network structure characteristics of the BiLSTM network and GoogLeNet network, are used to classify the ECG and PCG signals, which makes up for the inaccuracy of disease judgment based on a single cardiac function signal. Using the ECG and PCG signal classification, a better classification and diagnosis effect was obtained. The implementation steps of this algorithm are as follows:
Step 1: Obtain the ECG and PCG signals. This paper uses the PCG and ECG signals synchronously collected in the PhysioNet/CinC Challenge 2016 heart sound database as the research object. Because the signal acquisition time in the database is different, the length of the signal was unified in the experiment, and 10,000 sampling points of each signal were intercepted for the experiment. The sampling frequency of the signal is 2000 Hz. The database includes 113 normal signals and 292 abnormal signals (405 signals in total). Figure 7 shows the synchronously collected PCG and ECG signals.

Figure 7. PCG and ECG signals.
Appl. Sci. 2022, 12, 11762 8 of 18

Step 2: Preprocess the ECG and PCG signals. The main purpose is to filter the signals. The useful information of the ECG signal is concentrated below 20 Hz. If the ECG signal has strong power frequency interference, the signal will generally produce a peak at the power frequency of 50 or 60 Hz in the spectrum. The effective frequency of the PCG signal is concentrated below 300 Hz. In this paper, the ECG signal was mainly subjected to low-pass filtering with a cutoff frequency of 20 Hz; a high-pass filter with a cutoff frequency of 25 Hz and a low-pass filter with a cutoff frequency of 400 Hz were applied to the PCG signal. Since there is too much noise in the ECG signal, the denoising results of the ECG signal are given here. Figure 8 shows the comparison of the ECG signal before and after filtering.
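The filtering in Step 2 can be sketched as follows. This is an illustrative Python version, not the paper's MATLAB code: the Butterworth family, the fourth-order choice, and the zero-phase filtfilt application are assumptions; only the cutoff frequencies and the 2000 Hz sampling rate come from the text.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 2000  # sampling frequency of the database signals (Hz)

def butter_filter(x, cutoff, fs, btype, order=4):
    """Zero-phase Butterworth filter (the order is an assumption; the paper
    does not specify the filter family or order)."""
    b, a = butter(order, cutoff / (fs / 2), btype=btype)
    return filtfilt(b, a, x)  # forward-backward pass -> no phase distortion

# ECG: low-pass at 20 Hz; PCG: high-pass at 25 Hz, then low-pass at 400 Hz.
t = np.arange(10000) / FS
ecg = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)  # toy ECG with 50 Hz interference
ecg_filt = butter_filter(ecg, 20, FS, "low")

pcg = np.random.randn(10000)  # toy stand-in for a PCG recording
pcg_filt = butter_filter(butter_filter(pcg, 25, FS, "high"), 400, FS, "low")
```

With these cutoffs, the 50 Hz power-frequency component of the toy ECG is suppressed while the 5 Hz component passes almost unchanged.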

Figure 8. Comparison before and after filtering: (a) ECG original signal; (b) ECG signal after filtering.
Step 3: Use the BiLSTM network for training to create the network model. The filtered signals are trained and classified with a total of 405 samples. Moreover, 70% of the samples are used for training and 30% of the samples are used for the test. Then the samples are readjusted according to the labels. The abnormal samples and normal samples are placed separately in a centralized manner and then reintegrated into a new dataset. This setting is convenient for network training and results comparison during the test. The integrated ECG and PCG data are spliced and fused as the input of the BiLSTM network. The optimal network model is obtained through training. Then the classification results are obtained. The BiLSTM network architecture is shown in Figure 9.

Figure 9. BiLSTM network structure.
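A minimal sketch of the bi-directional pass that Step 3 relies on, in NumPy with random, untrained weights. The 50 hidden units follow Table 1; the input dimension, sequence length, and weight initialisation are illustrative assumptions, and the paper's actual model is a trained MATLAB BiLSTM layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(x, W, U, b):
    """Run one LSTM direction over x of shape (T, d_in); returns (T, h)."""
    T = x.shape[0]
    h_dim = U.shape[1]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    out = np.empty((T, h_dim))
    for t in range(T):
        z = W @ x[t] + U @ h + b            # gates stacked as [i, f, g, o]
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)          # cell-state update
        h = o * np.tanh(c)                  # hidden state
        out[t] = h
    return out

def bilstm(x, fwd, bwd):
    """Concatenate a forward pass with a time-reversed backward pass."""
    h_f = lstm_pass(x, *fwd)
    h_b = lstm_pass(x[::-1], *bwd)[::-1]
    return np.concatenate([h_f, h_b], axis=1)   # (T, 2*h)

rng = np.random.default_rng(0)
d_in, h = 2, 50              # fused [ECG, PCG] sample pairs; 50 hidden units (Table 1)

def init_params():
    return (rng.normal(size=(4 * h, d_in)) * 0.1,   # input weights
            rng.normal(size=(4 * h, h)) * 0.1,      # recurrent weights
            np.zeros(4 * h))                        # bias

x = rng.normal(size=(100, d_in))   # a short fused ECG/PCG sequence
features = bilstm(x, init_params(), init_params())
```

The concatenated output gives each time step both past and future context, which is what lets the network exploit long-distance dependencies in the fused signal.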
Step 4: Convert the filtered one-dimensional ECG and PCG signals into a two-dimensional image. The filtered signal is subjected to the continuous wavelet transform to obtain the wavelet coefficients. Then the scale sequence is converted into the frequency sequence. Finally, the wavelet time-frequency diagram of each signal is obtained by combining the time series. The size of the time-frequency diagram is adjusted to 224 × 224, as shown in Figure 10.

Figure 10. Time-frequency diagrams of signals: (a) the time-frequency diagram of the ECG signal; (b) the time-frequency diagram of the PCG signal.
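The scale-to-frequency conversion in Step 4 can be sketched as follows. The paper does not name the mother wavelet, so a complex Morlet (with the common centre parameter w0 = 6) is assumed here; only the 2000 Hz sampling rate comes from the text.

```python
import numpy as np

def morlet_scalogram(x, fs, freqs, w0=6.0):
    """Continuous wavelet transform magnitude with a complex Morlet wavelet.
    Each requested frequency f is mapped to a scale via the centre-frequency
    relation s = w0 / (2*pi*f) (in seconds) -- the step the paper describes
    as converting the scale sequence into a frequency sequence."""
    n = len(x)
    t = (np.arange(n) - n // 2) / fs            # wavelet support centred at 0
    coefs = np.empty((len(freqs), n))
    for k, f in enumerate(freqs):
        s = w0 / (2 * np.pi * f)
        psi = np.pi ** -0.25 * np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        psi /= np.sqrt(s * fs)                  # energy normalisation
        coefs[k] = np.abs(np.convolve(x, np.conj(psi)[::-1], mode="same"))
    return coefs                                # shape (len(freqs), n)

fs = 2000
t = np.arange(2000) / fs
sig = np.sin(2 * np.pi * 40 * t)                # toy stand-in for a PCG segment
freqs = np.arange(10, 101, 5.0)
scalogram = morlet_scalogram(sig, fs, freqs)
# The scalogram can then be rendered and resized to 224 x 224 for the CNN input.
```

For a pure 40 Hz tone, the scalogram ridge sits on the 40 Hz row, which is the property that makes the resulting image discriminative for the CNN.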
Step 5: Determine the GoogLeNet network structure and improve its structure. The two-dimensional properties of the time-frequency diagram of each signal obtained above are converted into a three-dimensional form 224 × 224 × 3, which is suitable for the input of the GoogLeNet network. Then 70% of the signals are used for training and 30% of the signals are used for the test. Then determine the GoogLeNet network structure, as shown in Figure 11. The original GoogLeNet network uses different inception modules to generate a total of 144 layers of the network structure. On the basis of retaining the original 140-layer structure, this paper improves the last four-layer structure: a new dropout layer, fully-connected layer, Softmax layer, and classification layer replace the original layers. This network structure is used to train the PCG and ECG signals, respectively. The optimal network model is obtained by continuously adjusting the network parameters. Then the classification results of the two-channel signals are obtained.

Figure 11. GoogLeNet network structure.

Step 6: The three-channel classification results obtained in the above steps are fused at the decision level using the improved D-S theory. According to the obtained classification results, the BPA function under various classification methods is obtained. Then the ECG and PCG signals are fused and determined according to the improved D-S theory. The following is the implementation process.
(1) Classification framework establishment
The above classification results are used as the evidence of ECG and PCG signal
classifications. The classification framework is established, as shown in the formula.

Θ = {ωnormal , ωabnormal } (12)

ωnormal represents the normal signal with a classification result, and ωabnormal represents
the abnormal signal with a classification result. The subset of the framework Θ contains
{{ωnormal }, {ωabnormal }, {ωnormal , ωabnormal }}.
(2) The BPA function establishment
The normal PCG signal probability value p(ω1n ) and the abnormal PCG signal proba-
bility value p(ω1p ) obtained in Step 5 are used as the BPA function m1 of the PCG signal.
The normal ECG signal probability value p(ω2n ) and the abnormal ECG signal probability
value p(ω2p ) are used as the BPA function m2 of the ECG signal. The normal ECG and PCG
signal probability value p(ω3n ) and the abnormal ECG and PCG signal probability value
p(ω3p ) obtained in Step 3 are used as the BPA function m3 of the ECG and PCG signals, as
shown in the formula.

m1 (ωnormal ) = p(ω1n )
m1 (ωabnormal ) = p(ω1p )
m2 (ωnormal ) = p(ω2n )
m2 (ωabnormal ) = p(ω2p )
m3 (ωnormal ) = p(ω3n )
m3 (ωabnormal ) = p(ω3p ) (13)
m1 ({ωnormal , ωabnormal }) = m2 ({ωnormal , ωabnormal }) = m3 ({ωnormal , ωabnormal }) = 0
According to the D-S theory, the probability function values m(ωnormal ) and m(ωabnormal )
of the evidence are obtained, namely, the basic credibility function, as shown in Formula (14).
m(ωnormal ) = m1 (ωnormal )m2 (ωnormal )m3 (ωnormal )/K
m(ωabnormal ) = m1 (ωabnormal )m2 (ωabnormal )m3 (ωabnormal )/K (14)

K is called the fusion index. The larger the value of K, the higher the degree of support that the evidence belongs to the same category.

K = m1 (ωnormal )m2 (ωnormal )m3 (ωnormal ) + m1 (ωabnormal )m2 (ωabnormal )m3 (ωabnormal ) (15)
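Formulas (14) and (15) amount to a normalised product of the three channel BPAs over {ωnormal, ωabnormal}. A minimal sketch; the BPA values below are illustrative, not taken from the experiments:

```python
def ds_fuse(m1, m2, m3):
    """Fuse three BPAs over {normal, abnormal} via Formulas (14) and (15).
    Each argument is a dict with keys 'normal' and 'abnormal'."""
    # Fusion index K, Formula (15): sum of the agreeing products.
    K = (m1["normal"] * m2["normal"] * m3["normal"]
         + m1["abnormal"] * m2["abnormal"] * m3["abnormal"])
    # Basic credibility, Formula (14): normalised products.
    return {
        "normal": m1["normal"] * m2["normal"] * m3["normal"] / K,
        "abnormal": m1["abnormal"] * m2["abnormal"] * m3["abnormal"] / K,
    }

# Illustrative channel outputs: PCG-GoogLeNet, ECG-GoogLeNet, fused BiLSTM.
m = ds_fuse({"normal": 0.30, "abnormal": 0.70},
            {"normal": 0.20, "abnormal": 0.80},
            {"normal": 0.25, "abnormal": 0.75})
```

When the three channels agree, the normalisation by K sharpens the fused credibility toward the agreed class.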

(3) Implementation of the improved D-S theory algorithm.


The basic principle of the algorithm is to use the method based on local credibility to
perform the weighted correction. The classification probability is corrected by the confusion
matrix generated previously. Then the BPA function of the classifier is constructed.
For the classification problem in this paper, it represents the tendency to judge as normal or abnormal. Pω (ωi |ω j ) represents the support degree of the target belonging to class ωi . Thus, Pω (ωi |ω j ) is called the local credibility of the classifier for class ωi . The BPA function m(ωi ) about the target belonging to ωi is:

m(ωi ) = Pi × Pω (ωi |ω j ) (16)

Pω (ωi |ω j ) = ωij / ∑_{j=1}^{k} ωij (17)

In the formula, Pi is the probability of belonging to class ωi given by the previous classification output results. Pω (ωi |ω j ) is the local credibility information. ωij represents the number of samples in which class ωi is judged to be class ω j . k is the category number of signals. The BPA functions are all obtained after the above-mentioned weighted-fusion correction [19–25].
Using the obtained BPA, information fusion is carried out according to
Formulas (14) and (15).
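The weighted correction of Formulas (16) and (17) can be sketched as follows. The confusion-matrix counts are made up, and taking the diagonal entry of each row as the local credibility of a correct judgment is one reading of Formula (17):

```python
import numpy as np

def weighted_bpa(p, conf_mat, i):
    """Formulas (16)-(17): weight the classifier's output probability p for
    class i by its local credibility derived from the confusion matrix.
    conf_mat[i, j] = number of class-i samples judged as class j."""
    local_cred = conf_mat[i, i] / conf_mat[i].sum()   # diagonal term of Formula (17)
    return p * local_cred                             # Formula (16)

conf = np.array([[80, 8],    # illustrative counts: rows = true class,
                 [5, 29]])   # columns = predicted class (0 = abnormal, 1 = normal)
m_abnormal = weighted_bpa(0.9, conf, 0)
m_normal = weighted_bpa(0.1, conf, 1)
```

A channel that often confuses a class therefore contributes a smaller BPA for that class before the decision-level fusion.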
Step 7: Obtain the final classification results according to the discriminant criterion by
using the previous fusion results.
The discrimination criterion is as follows:

xi = −1, m(ωnormal ) ≥ ε
xi = 1, m(ωnormal ) < ε (18)

where ε is the set threshold value, m(ωnormal ) is the fused credibility from Formula (14), xi is the final classification result, “−1” represents the normal sample, and “1” represents the abnormal sample.

5. Experiments and Results


5.1. Dataset
The MIT heart sounds database (MITHSDB) contributes to this study. This dataset comes from the PhysioNet/CinC Challenge 2016 heart sound classification competition on the PhysioNet website. PhysioNet acquired its datasets (training-a to training-f) from different research institutes. The training-a dataset is provided by MIT and contains 409 samples. Among these 409 samples, 405 samples include simultaneously recorded ECG and PCG signals.
There are 113 normal signals and 292 abnormal signals in the database. The signals are
recorded at the sampling rate of 44.1 kHz. They are resampled at 2000 Hz in the post-
processing. In the experiment, 70% of the signals in the database were used as the training
set, and 30% were used as the test set [26].
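The 70/30 split can be reproduced per class so that the 113 normal and 292 abnormal signals keep the same ratio in both sets. This is a sketch under assumptions: the paper does not state whether the split was stratified or which random seed was used.

```python
import numpy as np

rng = np.random.default_rng(42)  # seed chosen arbitrarily for reproducibility

def split_indices(n, train_frac=0.7):
    """Shuffle 0..n-1 and cut off the first 70% as the training indices."""
    idx = rng.permutation(n)
    cut = int(round(n * train_frac))
    return idx[:cut], idx[cut:]

normal_train, normal_test = split_indices(113)      # normal signals
abnormal_train, abnormal_test = split_indices(292)  # abnormal signals
```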
The experiment was performed on a computer with an Intel(R) Celeron(R) CPU N3450 processor at 1.10 GHz and 4.00 GB of RAM. The experimental software was MATLAB 2021a (MathWorks, Natick, MA, USA).

5.2. Parameter Settings of the Network Model


1. Parameter settings of the BiLSTM network model.
The main parameters in the model include the dimension of the input vector, the
number of hidden layers, the related parameters of training, and so on. The optimal
parameter settings of the BiLSTM network model are shown in Table 1.

2. Parameter settings of the GoogLeNet network model.


The main parameters in the model include the dimension of the input vector, the
related parameters of training, and so on. The optimal parameter settings of the GoogLeNet
network model are shown in Table 2.

Table 1. BiLSTM network model parameters.

Parameters Definition Optimal Value


sequenceInputLayer Input vector dimension 284
Number of hidden units Number of hidden nodes 50
Adam Optimizer Adam
MaxEpochs Training epochs 10
MiniBatchSize Number of samples used per batch 150
InitialLearnRate Learning rate 0.01

Table 2. GoogLeNet network model parameters.

Parameters Definition Optimal Value


imgsTrain Input vector dimension [224 224 3]
dropoutLayer Random drop rate 0.6
sgdm Optimizer sgdm
MaxEpochs Training epochs 20
MiniBatchSize Number of samples used per batch 15
InitialLearnRate Learning rate 0.001

5.3. Evaluation Metrics


To test the proposed algorithm, the classification accuracy, sensitivity, specificity, precision, and F1 score are used to measure the performance. Since the signals are divided into
normal and abnormal signals, the abnormal signals are regarded as positive and the normal
signals are regarded as negative. If the positive signal is classified correctly, it is regarded
as the true positive signal (TP), otherwise it is regarded as the false negative signal (FN);
if the negative signal is classified correctly, it is regarded as the true negative signal (TN),
otherwise it is regarded as the false positive signal (FP) [27,28].

Accuracy = (TP + TN)/(TP + FP + FN + TN) × 100% (19)

Sensitivity = Recall = TP/(TP + FN) × 100% (20)

Specificity = TN/(TN + FP) × 100% (21)

Precision = TP/(TP + FP) × 100% (22)

F1 Score = (2 × Recall × Precision)/(Recall + Precision) × 100% (23)
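Formulas (19)–(23) in code, evaluated on made-up confusion counts (the TP/FN/TN/FP values below are illustrative, not the paper's):

```python
def metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity, precision and F1, Formulas (19)-(23)."""
    acc = (tp + tn) / (tp + fp + fn + tn)
    sen = tp / (tp + fn)            # sensitivity == recall
    spe = tn / (tn + fp)
    pre = tp / (tp + fp)
    f1 = 2 * sen * pre / (sen + pre)
    return acc, sen, spe, pre, f1

# Toy counts: 100 abnormal (positive) and 100 normal (negative) samples.
acc, sen, spe, pre, f1 = metrics(tp=90, fn=10, tn=80, fp=20)
```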

5.4. Comparative Experimental Setup


In the experiment, the following classification models are set to evaluate the model
in this paper, and the classification performance of the BiLSTM-GoogLeNet-DS model is
verified by comparing the models:

(1) BiLSTM model.


The BiLSTM model is composed of two LSTMs in different directions, which can simultaneously capture the forward and backward timing information and solve the long-
distance dependency problem in the classification process. After the PCG and ECG signals
are filtered separately, the signal features are extracted by BiLSTM, and then input into
the fully-connected layer and the Softmax layer to realize the classification of the signal.
In order to verify the superiority of ECG and PCG signal classification, in addition to
inputting PCG and ECG signals into the BiLSTM network model to obtain their respective
classification results, the paper also splices and fuses the PCG and ECG signals into the
BiLSTM network model for classification. Since the classification effects of ECG and PCG
signals are better than that of single-modal cardiac function signals using the PCG or ECG
signal, this paper mainly focuses on the classification results of ECG and PCG signals using
the BiLSTM model for further processing.
(2) GoogLeNet model.
The GoogLeNet model consists of several inception modules. The GoogLeNet model
structure selected in this paper consists of nine layers of inception modules, namely
inception-3a, inception-3b, inception-4a, inception-4b, inception-4c, inception-4d, inception-
4e, inception-5a and inception-5b. Each inception module includes three convolutional
kernels of 1 × 1, 3 × 3, and 5 × 5. When the image data are input into the network,
7 × 7 convolution is performed first, then the ReLU layer, pooling layer, regularization,
and 3 × 3 convolution. After the ReLU layer, pooling layer, and regularization, the data are
input into each inception module in turn. The filtered PCG and ECG signals are subjected
to continuous wavelet transform to obtain their time-frequency diagrams. The sizes of the
diagrams are transformed to be suitable for the input into the GoogLeNet network. The
network is used to classify the PCG and ECG signals, respectively.
(3) BiLSTM-GoogLeNet-DS model.
The BiLSTM-GoogLeNet-DS model first splices and fuses ECG and PCG signals into
the BiLSTM network to obtain the fused classification results. Then the ECG and PCG
signals are, respectively, input into the GoogLeNet network for classification. Based on the
above classification results, a total of three channel classification results are obtained. The
improved D-S theory is used to fuse the above classification results. Then the classification
results are finally determined according to the discriminant criterion.

5.5. Experimental Results and Analysis


In order to verify the superiority of the proposed algorithm, the above network models are compared with the BiLSTM-GoogLeNet-DS model, mainly in terms of the evaluation indicators.
This paper improves the last four layers of the GoogLeNet network, and mainly
updates the random dropout layer, fully-connected layer, Softmax layer, and classification
layer. Using the improved D-S theory, the BPA function is weighted by the method based
on confusion matrix. The following are some experimental results. Figure 12 shows the
confusion matrix obtained by different models. The confusion matrix is mainly used to
weight the BPA function. It solves the problem that the BPA function of the D-S theory is
difficult to determine in decision fusion.
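The four replaced layers named above (dropout, fully-connected, Softmax, classification) can be sketched as a forward pass. This is a NumPy illustration with random weights, not the trained GoogLeNet head: the 0.6 drop rate follows Table 2, while the feature size and weight values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def improved_head(features, W, b, drop_rate=0.6, training=False):
    """Dropout -> fully-connected -> softmax -> argmax (classification layer)."""
    if training:  # inverted dropout, active only during training
        mask = rng.random(features.shape) >= drop_rate
        features = features * mask / (1.0 - drop_rate)
    logits = features @ W + b
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    probs = e / e.sum(axis=-1, keepdims=True)
    return probs, probs.argmax(axis=-1)

n_features, n_classes = 1024, 2   # pooled CNN features -> {normal, abnormal}
W = rng.normal(scale=0.01, size=(n_features, n_classes))
b = np.zeros(n_classes)
batch = rng.normal(size=(8, n_features))
probs, labels = improved_head(batch, W, b)
```

The per-class probabilities returned by the Softmax stage are exactly the quantities used later as the BPA inputs of the D-S fusion.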

Figure 12. Confusion matrix obtained by different models: (a) ECG-PCG-BiLSTM; (b) ECG-GoogLeNet; (c) PCG-GoogLeNet; (d) BiLSTM-GoogLeNet-DS.

In order to show the superiority of the proposed algorithm more intuitively and clearly, the BiLSTM-GoogLeNet-DS model is compared with the BiLSTM model and the GoogLeNet model, respectively. The comparison results are shown in Table 3. It can be seen from Table 3 that the classification method proposed in this paper, based on BiLSTM-GoogLeNet-DS, has the best classification effect. Although the classification method based on PCG-BiLSTM obtained 100% sensitivity, 100% precision, and a 100% F1 score, the specificity is only 0%. This means that the classification method cannot identify healthy samples. Although the classification method based on PCG-GoogLeNet obtained 100% sensitivity, the specificity is only 18.29%. This means that the classification effect of this method for health samples is also very unsatisfactory.

Table 3. Comparison of several algorithm models.

Classification Mode    Accuracy  Sensitivity  Specificity  Precision  F1 Score
PCG-BiLSTM             71.13%    100%         0%           100%       100%
ECG-BiLSTM             75%       91.58%       34.15%       77.41%     83.9%
PCG-ECG-BiLSTM         78.17%    95.05%       36.59%       78.69%     86.1%
PCG-GoogLeNet          76.41%    100%         18.29%       75.09%     85.78%
ECG-GoogLeNet          85.92%    99.51%       52.44%       83.75%     90.95%
BiLSTM-GoogLeNet-DS    96.13%    98.48%       90.8%        96.04%     97.24%

In order to express the comparison results in the table more clearly, this paper uses the following histograms for graphical displays. It can be clearly seen from Figure 13 that the classification accuracy of the algorithm proposed in this paper (based on the BiLSTM-GoogLeNet-DS model) is higher than that of other classification models. In Figure 14, the sensitivity, specificity, precision, and F1 score are compared. The best classification results are obtained by using the BiLSTM-GoogLeNet-DS model. The classification accuracy, sensitivity, specificity, precision, and F1 score are 96.13%, 98.48%, 90.8%, 96.04%, and 97.24%, respectively. From a comprehensive point of view, the proposed algorithm is better than the other algorithm models.

Figure 13. The comparison of accuracy.

Figure 14. The comparison of sensitivity, specificity, precision, F1 score: (a) the comparison of sensitivity; (b) the comparison of specificity; (c) the comparison of precision; (d) the comparison of F1 score.

This paper also compares the methods from other literature studies. The comparison
mainly focuses on the methods based on deep learning and the classification signals.
Moreover, the current research studies on ECG and PCG signals mainly focus on the
database provided in the PhysioNet/CINC challenge 2016 heart sound database, so the
comparative classification effect is more convincing. The comparison results are shown
in Table 4. The feature forms are very similar, mainly focusing on the time domain, frequency domain, and time-frequency domain. The classification process is performed by 1D or 2D convolutional layers, and so on. The researched signals are single-modal or combined ECG and PCG
signals. From the results, the classification effect of the method proposed in this paper
is better than the methods in the literature, which shows that the algorithm proposed in
this paper has certain advantages. The results also show that the classification method
based on ECG and PCG signals has a stronger discriminative ability than those based on
single-modal signals. Among the methods based on ECG and PCG signals, the classification
effect of the method proposed in this paper is the best. Although the classification effect of
the proposed method is close to that of Li Han’s [5] method, it also shows that the method
proposed in this paper has slight advantages.

Table 4. Comparison between this algorithm and other algorithms in the literature.

Literature          Classification Method       Feature Form                                 Results
Tschannen [29]      1-D CNN                     Time-frequency domain features, PCG signal   Acc = 87.0%, Sen = 90.8%, Spe = 83.2%
Rubin [30]          2-D CNN                     MFCC features, PCG signal                    Acc = 84.8%, Sen = 76.5%, Spe = 93.1%
Noman [31]          2-D CNN                     MFCC features, PCG signal                    Acc = 88.8%, Sen = 86.1%, Spe = 91.6%
Chakir Fatima [7]   Support vector machine      PCG signal, ECG signal                       Acc = 92.5%, Sen = 92.31%, Spe = 92.86%
Li Han [5]          Dual-Input Neural Network   PCG signal, ECG signal                       Acc = 95.6%, Sen = 98.5%, Spe = 89.2%
This paper          BiLSTM-GoogLeNet-DS         PCG signal, ECG signal                       Acc = 96.13%, Sen = 98.48%, Spe = 90.8%

6. Conclusions
In this paper, an ECG–PCG signal classification method based on BiLSTM-GoogLeNet-
DS is proposed. The method filters the obtained ECG and PCG signals, and uses the
BiLSTM network to fuse and classify the two cardiac function signals. Then, the time-
frequency processing of the filtered signals is performed. The time-frequency diagrams
of signals are input into the GoogLeNet network for classification. Combining the above
classification results provides a discriminative decision using the improved D-S theory to
obtain the final classification results.
The following conclusions can be drawn from the experiments:
(1) The proposed ECG and PCG signal classification method based on the BiLSTM-
GoogLeNet-DS model is superior to the classification methods of the BiLSTM model
and the GoogLeNet model. The classification method of ECG and PCG signals is
better than the classification methods of single-modal signals.

(2) This paper uses the deep learning-oriented hybrid fusion method. After the feature
layer fusion of signals, the improved D-S theory is used to fuse the decision layer. The
better classification results are obtained.
(3) The BiLSTM model is used for feature layer fusion, and the obtained classification
results are better than those of single-modal signals. This method can extract the
features of long-distance-dependent information according to the BiLSTM network,
which can reduce the training time and learn the features better. It can produce
higher classification accuracy for long-span time series data.
(4) The classification effect obtained by using the GoogLeNet model to classify cardiac
function signals is better than the classification method based on the BiLSTM model,
which highlights the advantages of GoogLeNet in classification and recognition.
(5) The classification results of GoogLeNet and BiLSTM are fused by the improved
D-S theory for the decision-making layer, which mainly calculates the BPA function
for the classification result of each channel. Using the local recognition credibility
information generated by the confusion matrix to weight the BPA function can improve
the final classification result.
(6) The best classification results are obtained by applying the ECG and PCG signal
classification method based on BiLSTM-GoogLeNet-DS with 96.13% classification
accuracy, 98.48% sensitivity, 90.8% specificity, 96.04% precision, and a 97.24% F1 score.
It can be concluded from the experimental results that the information fusion of ECG
and PCG signals improves the performance indicators and the overall classification effect.
Fusing additional modalities of cardiac function signals for disease diagnosis and
treatment will be the next research direction.
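The performance indicators quoted above follow the standard confusion-matrix definitions; a minimal sketch is given below, where the counts are illustrative only and not the counts behind the paper's reported figures.

```python
def metrics(tp, tn, fp, fn):
    # Standard binary-classification indicators from confusion-matrix counts
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)           # recall on the positive class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, precision, f1

# Illustrative counts only
acc, sens, spec, prec, f1 = metrics(tp=8, tn=6, fp=2, fn=4)
```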

Author Contributions: Conceptualization, J.L. and L.K.; methodology, J.L.; software, J.L.; valida-
tion, Q.D. and X.D.; formal analysis, X.C.; investigation, J.L.; resources, Q.D.; data curation, J.L.;
writing—original draft preparation, J.L.; writing—review and editing, L.K.; visualization, X.D.;
supervision, X.C.; project administration, J.L.; funding acquisition, L.K. and J.L. All authors have read
and agreed to the published version of the manuscript.
Funding: This research was funded by the National Nature Science Foundation of China, grant
number 52077143; Department of Education of Liaoning Province, grant number LZGD2020002,
LJKZ0131; The Fundamental Research Funds in Heilongjiang Provincial Universities of China, grant
number 135409424 and Heilongjiang Science Foundation Project, grant number ZD2019F004.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The database is supported by the PhysioNet/CinC Challenge 2016.
Acknowledgments: We acknowledge the database support from the PhysioNet/CinC Challenge 2016.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Mourad, N. ECG denoising algorithm based on group sparsity and singular spectrum analysis. Biomed. Signal Process. Control
2019, 50, 62–71. [CrossRef]
2. Rodion, S.; Sergey, P.; Peter, F.; Dumbler, A. Beat-to-beat cardiovascular hemodynamic parameters based on wavelet spectrogram
of impedance data. Biomed. Signal Process. Control. 2017, 36, 50–56.
3. Nedoma, J.; Fajkus, M.; Martinek, R.; Kepak, S.; Cubik, J.; Zabka, S.; Vasinek, V. Comparison of BCG, PCG and ECG Signals
in Application of Heart Rate Monitoring of the Human Body. In Proceedings of the 2017 40th International Conference on
Telecommunications and Signal Processing (TSP), Barcelona, Spain, 5–7 July 2017; pp. 420–424.
4. Monica, M.; Pedro, G.; Cristina, O.; Coimbra, M.; da Silva, H.P. Design and Evaluation of a Diaphragm for Electrocardiography in
Electronic Stethoscopes. IEEE Trans. Biomed. Eng. 2020, 67, 391–398.
5. Li, H.; Wang, X.; Liu, C.; Wang, Y.; Li, P.; Tang, H.; Yao, L.; Zhang, H. Dual-Input Neural Network Integrating Feature Extraction
and Deep Learning for Coronary Artery Disease Detection Using Electrocardiogram and Phonocardiogram. IEEE Access 2019,
7, 146457–146469. [CrossRef]
6. Abdelakram, H.; Sara, B.; Malika, K.; Attari, M.; Seoane, F. Simultaneous Recording of ICG and ECG Using Z-RPI Device with
Minimum Number of Electrodes. J. Sens. 2018, 2018, 3269534.
7. Fatima, C.; Abdelilah, J.; Chafik, N.; Hammouch, A. Recognition of cardiac abnormalities from synchronized ECG and PCG
Signals. Phys. Eng. Sci. Med. 2020, 43, 673–677.
8. Zhang, H.; Wang, X.; Liu, C.; Liu, Y.; Li, P.; Yao, L.; Li, H.; Wang, J.; Jiao, Y. Detection of coronary artery disease using multi-modal
feature fusion and hybrid feature selection. Physiol. Meas. 2020, 41, 115007. [CrossRef]
9. Heart Sound Database Web Site. Available online: https://www.physionet.org (accessed on 4 March 2016).
10. Cheng, J.; Zou, Q.; Zhao, Y. ECG signal classification based on deep CNN and BiLSTM. BMC Med. Inform. Decis. Mak. 2021,
21, 365. [CrossRef]
11. Wang, L.; Zhou, X. Detection of Congestive Heart Failure Based on LSTM-Based Deep Network via Short-Term RR Intervals.
Sensors 2019, 19, 1502. [CrossRef]
12. Xu, X.; Jeong, S.; Li, J. Interpretation of Electrocardiogram (ECG) Rhythm by Combined CNN and BiLSTM. IEEE Access 2020,
8, 125380–125388. [CrossRef]
13. Singh, R.; Rajpal, N.; Mehta, R. An Empiric Analysis of Wavelet-Based Feature Extraction on Deep Learning and Machine
Learning Algorithms for Arrhythmia Classification. Int. J. Interact. Multimedia Artif. Intell. 2021, 6, 25. [CrossRef]
14. Alkhodari, M.; Fraiwan, L. Convolutional and recurrent neural networks for the detection of valvular heart diseases in phonocar-
diogram recordings. Comput. Methods Programs Biomed. 2021, 200, 105940. [CrossRef] [PubMed]
15. AlDuwaile, D.; Islam, S. Using Convolutional Neural Network and a Single Heartbeat for ECG Biometric Recognition. Entropy
2021, 23, 733. [CrossRef]
16. Alquran, H.; Alqudah, A.M.; Abu-Qasmieh, I.; Al-Badarneh, A.; Almashaqbeh, S. ECG classification using higher order spectral
estimation and deep learning techniques. Neural Network World 2019, 4, 207–219. [CrossRef]
17. Byeon, Y.-H.; Pan, S.-B.; Kwak, K.-C. Intelligent Deep Models Based on Scalograms of Electrocardiogram Signals for Biometrics.
Sensors 2019, 19, 935. [CrossRef]
18. Kim, J.-H.; Seo, S.-Y.; Song, C.-G.; Kim, K.-S. Assessment of Electrocardiogram Rhythms by GoogLeNet Deep Neural Network
Architecture. J. Healthc. Eng. 2019, 2019, 2826901. [CrossRef]
19. Dempster, A.P. Upper and lower probabilities induced by a multivalued mapping. Ann. Math. Stat. 1967, 38, 325–339. [CrossRef]
20. Shafer, G.A. Mathematical Theory of Evidence; Princeton University Press: Princeton, NJ, USA, 1976.
21. Chen, N.; Sun, F.; Ding, L.; Wang, H. An adaptive PNN-DS approach to classification using multi-sensor information fusion.
Neural Comput. Appl. 2019, 31, 693–705. [CrossRef]
22. Chen, L.; Feng, Y.; Maram, M.A.; Wang, Y.; Wu, M.; Hirota, K.; Pedrycz, W. Multi-SVM based Dempster-Shafer theory for gesture
intention understanding using sparse coding feature. Appl. Soft Comput. J. 2019, 85, 105787. [CrossRef]
23. Rajib, G.; Pradeep, K.; Pratim, P. A Dempster–Shafer theory based classifier combination for online signature recognition
and verification systems. Int. J. Mach. Learn. Cybern. 2019, 10, 2467–2482.
24. Liu, Y.; Cheng, Y.; Zhang, Z.; Wu, J. Multi-information Fusion Fault Diagnosis Based on KNN and Improved Evidence Theory. J.
Vib. Eng. Technol. 2021, 10, 841–852. [CrossRef]
25. Xiaotao, C.; Lixin, J.; Ying, Y.; Ruiyang, H. Method of Network Representation Fusion Based on D-S Evidence Theory. Tien Tzu
Hsueh Pao/Acta Electron. Sin. 2020, 48, 854–860.
26. Liu, C.; Springer, D.; Li, Q.; Moody, B.; Juan, R.A.; Chorro, F.J.; Castells, F.; Roig, J.M.; Silva, I.; Johnson, A.E. An open access
database for the evaluation of heart sound algorithms. Physiol. Meas. 2016, 37, 2181–2213. [CrossRef] [PubMed]
27. Hyung, L.J.; Young, K.S.; Chun, O.P.; Gi, K.K.; Jin, S.D. Heart Sound Classification Using Multi Modal Data Representation and
Deep Learning. J. Med. Imaging Health Inform. 2020, 10, 537–543.
28. Li, J.; Ke, L.; Du, Q.; Chen, X.; Ding, X. ECG and PCG signals classification algorithm based on improved D-S evidence theory.
Biomed. Signal Process. Control 2021, 71, 103078. [CrossRef]
29. Tschannen, M.; Kramer, T.; Marti, G.; Heinzmann, M.; Wiatowski, T. Heart sound classification using deep structured features. In
Proceedings of the Computing in Cardiology Conference (CinC), Vancouver, BC, Canada, 11–14 September 2016; pp. 565–568.
30. Rubin, J.; Abreu, R.; Ganguli, A.; Nelaturi, S.; Matei, I.; Sricharan, K. Classifying heart sound recordings using deep convolutional
neural networks and mel-frequency cepstral coefficients. In Proceedings of the Computing in Cardiology Conference (CinC),
Vancouver, BC, Canada, 11–14 September 2016; pp. 813–816.
31. Noman, F.; Ting, C.-M.; Salleh, S.-H.; Ombao, H. Short-segment heart sound classification using an ensemble of deep convolutional
neural networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
Brighton, UK, 12–17 May 2019; pp. 1318–1322.
