
Automatic ECG-Based Emotion Recognition in Music Listening
Yu-Liang Hsu, Member, IEEE, Jeen-Shing Wang, Member, IEEE, Wei-Chun Chiang, and
Chien-Han Hung

Abstract—This paper presents an automatic ECG-based algorithm for human emotion recognition. First, we adopt a musical induction method to elicit participants' real emotional states and collect their ECG signals without any deliberate laboratory setting. We then develop an automatic ECG-based emotion recognition algorithm to recognize human emotions elicited by listening to music. Physiological features extracted from the time-domain, frequency-domain, and nonlinear analyses of ECG signals are used to find emotion-relevant features and to correlate them with emotional states. Subsequently, we develop a sequential forward floating selection-kernel-based class separability-based (SFFS-KBCS-based) feature selection algorithm and utilize the generalized discriminant analysis (GDA) to effectively select significant ECG features associated with emotions and to reduce the dimensions of the selected features, respectively. Positive/negative valence, high/low arousal, and four types of emotions (joy, tension, sadness, and peacefulness) are recognized using least squares support vector machine (LS-SVM) classifiers. The results show that the correct classification rates for the positive/negative valence, high/low arousal, and four-emotion classification tasks are 82.78%, 72.91%, and 61.52%, respectively.

Index Terms—Electrocardiogram, emotion recognition, music, machine learning

——————————  ——————————

1 INTRODUCTION

WITH rapid advancements in technology, advanced and user-friendly human-computer interaction (HCI) systems should become capable of considering human affective states in interactions to promote mutual sympathy between humans and machines. In human communication, the expression and understanding of emotions help to achieve mutual sympathy. To develop an emotion-based HCI, we need to equip machines with the ability to understand and identify human emotions without any input devices to translate users' intentions. However, emotion is a psychological and physiological expression associated with mood, temperament, personality, disposition, and motivation, and it is produced by cognitive processes, subjective feelings, motivational tendencies, behavioral reactions, and physiological arousal. Therefore, developing an effective emotion recognition system for identifying various emotions is an interesting and challenging topic.

Recently, many studies have focused on developing effective emotion recognition systems that identify implicit features of human communication, such as speech, facial expressions, gestures, or physiological measurements, in different experimental settings [1], [2], [3], [4], [5], [6], [7], [8]. However, the features of the abovementioned audio-visual emotion channels are not adequate for obtaining emotion classification results, because humans can disguise their emotions through social masking. For example, people may keep a "poker face," or they may not express their emotions via intuitive body language when they are angry. Similarly, using physiological measurements (biosignals), including electromyogram (EMG), electroencephalography (EEG), electrocardiography (ECG), galvanic skin resistance (GSR), skin temperature (ST), skin conductivity (SC), respiration (RSP), body expression, and blood oxygen saturation (OXY), for emotion classification has some limitations [4], [9], [10], [11], [12]. First, physiological patterns cannot be mapped uniquely onto specific emotional states because emotions can be influenced by time, context, space, and culture, while physiological patterns may differ from user to user and from situation to situation. Second, recorded biosignals usually include motion artifacts caused by electrode movement on the skin surface. Third, determining the "ground truth" of the biosignals is an arduous task because we can only observe signal flows or trends, whereas emotions are perceived or experienced intuitively. On the other hand, physiological measurements offer numerous benefits for researchers developing an emotion-based HCI. First, various biosignals reflecting users' affective states can be gathered continuously, and these can truly reflect human emotions in daily life through the autonomic nervous system (ANS). Second, physiological ANS activity can overcome possible artifacts of human social masking because it cannot be easily triggered by conscious control, and some ANS responses are not culturally specific. Therefore, many researchers have fused multiple physiological measurements to recognize human emotions effectively because these signals reflect the involuntary reactions of the human body [4], [10], [11], [13]. However, using too many biosignals to recognize human emotions is not suitable for practical applications, and it may hinder subjects during daily life activities [14]. Therefore, it is important to develop a reliable emotion recognition system for an emotion-based HCI that uses suitable physiological channels and shows acceptable recognition abilities and robustness against any artifacts caused by human movement or human social masking.

————————————————
• Yu-Liang Hsu is with the Department of Automatic Control Engineering, Feng Chia University, Taichung 40724, Taiwan. E-mail: hsuyl@fcu.edu.tw.
• Jeen-Shing Wang, Wei-Chun Chiang, and Chien-Han Hung are with the Department of Electrical Engineering, National Cheng Kung University, Tainan 701, Taiwan. E-mail: jeenshin@mail.ncku.edu.tw, outlawbo@hotmail.com, chienhan.hung@gmail.com.

In this paper, we use only one ECG channel to classify four types of emotions with an emotion recognition system. The novelty of this paper is the proposed automatic ECG-based emotion recognition algorithm. This algorithm consists of a sequential forward floating selection-kernel-based class separability-based (SFFS-KBCS-based) feature selection algorithm, the generalized discriminant analysis (GDA) feature reduction method, and least squares support vector machine (LS-SVM) classifiers for effective recognition of music-induced emotions by using the physiological changes of only one ECG channel. For this purpose, we design an accurate experiment for collecting participants' ECG signals while they listen to music and then develop an automatic emotion recognition algorithm that uses only ECG signals to detect R-waves, generate significant emotion-relevant ECG features, and recognize various human emotions effectively. After extracting ECG features from the time-domain, frequency-domain, and nonlinear analyses of ECG signals, we select appropriate features from a total of 34 normalized features by using the SFFS-based search strategy combined with the KBCS-based selection criterion. Subsequently, we utilize the GDA to effectively reduce the dimension of the significant ECG features associated with emotions. Finally, we use the LS-SVM classifiers to recognize arousal-valence emotion states.

This paper is organized as follows. Section 2 presents a brief overview of related research on automatic emotion recognition systems based on ECG signals recorded during music listening. The experimental setup and protocol are presented in Section 3. Section 4 introduces the proposed automatic ECG-based emotion recognition algorithm. Next, Sections 5 and 6 present the experimental results and discussion, respectively. Finally, conclusions are presented in the last section.

2 RELATED RESEARCH

2.1 Dimensional Emotion Models
It is difficult to judge or model human emotions because people express their emotions differently based on such factors as their cognitive processes and subjective feelings. Over the past several decades, many researchers have devoted themselves to developing diverse emotion models for modeling human emotions [15], [16]. Among various emotion models, the discrete and affective dimensional models are two common approaches to modeling emotions, and they are not exclusive of each other. In the discrete models, humans must choose from a prescribed list of word labels to label emotions in discrete categories indicating their current emotion, for example, joy, tension, sadness, anger, or fear [10], [15]. However, the stimuli may elicit blended emotions that cannot be adequately expressed in words because the meanings of the chosen words are culturally dependent [4]. Therefore, the discrete models require more than one word to describe mixed emotions. In the affective dimensional models, humans need to scale emotions in multiple dimensions for categorizing emotions. Recently, two common scales used in emotion classification are valence and arousal [4], [14], [17], [18]. All emotions can be mapped onto the valence and arousal axes in the two-dimensional (2D) emotion plane. For example, joy has positive valence and high arousal, whereas sadness has negative valence and low arousal.

2.2 ECG Signal and Emotion
From the literature review, we found that emotion is systematically elicited by subjective feelings, physiological arousal, motivational tendencies, cognitive processes, and behavioral reactions. From the viewpoint of physiological arousal, it is difficult to find an ANS differentiation of emotions because the ANS may be influenced easily by several factors such as attention, social interaction, appraisal, and orientation. However, recently reported studies have shown that ANS activity, comprising the sympathetic nervous system (SNS) and parasympathetic nervous system (PNS), is viewed as an important component of the emotion response [19]. In addition, heart rate (HR) and heart rate variability (HRV), i.e., the variation over time of the period between successive heartbeats, are the common ECG features extracted from ECG signals for emotion recognition [2], [6], [10], [13], [14], [20], [21]. Recent research studies have shown that music can actually produce specific physiological changes in HR and HRV that are associated with different emotions [4], [22], [23], [24], [25], [26].

2.3 Boundary Conditions for Finding the Relationship between Emotion and ECG Signals
This study focuses on the relationship between emotion and ECG signals. We summarize some possible factors that affect the emotion classification results obtained with ECG signals as follows:
1. Some participants may not have enough time to reach a neutral state or the emotional states elicited by the musical excerpts during the baseline and music listening stages.
2. The selected stimulus music may not have enough intensity to elicit emotions in the participants for the emotion classification tasks.
3. The difficulty of subject-independent classification lies in the intricate variety of nonemotional individual contexts among the participants, rather than in an individual ECG specificity in emotion [4].
4. The participants may not faithfully report their emotional state on the GEMS-9 questionnaire because of a lack of concentration during music listening or a misunderstanding of the meaning of the GEMS-9 items.
5. There is no one-to-one relation between emotion and physiological changes in ECG-based features: feeling changes may occur without concomitant autonomic changes in ECG-based features and vice versa [19].
The abovementioned factors directly or indirectly affect the determination of the ground truths for the classifiers, and they further deteriorate the classification rates and complicate the model selection of the classifiers.
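To make the dimensional model of Section 2.1 concrete, the short sketch below maps the four target emotions used in this study onto valence/arousal quadrant labels, which is also how the binary ground truths are formed later; it is an illustrative helper written by us, not part of the authors' algorithm.

```python
# Illustrative mapping of the four target emotions onto the 2D
# valence-arousal plane; quadrant assignments follow Section 2.1 and Fig. 1.
EMOTION_QUADRANTS = {
    "Joy":          {"valence": "positive", "arousal": "high"},
    "Tension":      {"valence": "negative", "arousal": "high"},
    "Sadness":      {"valence": "negative", "arousal": "low"},
    "Peacefulness": {"valence": "positive", "arousal": "low"},
}

def emotion_to_binary_labels(emotion):
    """Return (valence, arousal) labels for one of the four target emotions."""
    q = EMOTION_QUADRANTS[emotion]
    return q["valence"], q["arousal"]
```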

Fig. 1. Target emotions in the 2D emotion model comprising the valence and arousal axes: Joy, Tension, Sadness, and Peacefulness.

Fig. 2. Block diagram of the experimental procedures.

3 EXPERIMENTAL SETUP

3.1 Participants
A total of 61 healthy participants, selected from a course on the analysis of scientific evidence in healing music offered at National Cheng Kung University and aged between 17 and 36 years (20.4±2.4), participated in this study as partial fulfillment of the course requirement.

3.2 Materials and Setup
The experiment was performed in a laboratory environment with controlled temperature. ECG signals were recorded using a NeXus-10 instrument with Biotrace+ software on a dedicated recording PC (Intel® Core Processor i5-2400, 3.30 GHz). ECG signals (lead II) were recorded at a 2048 Hz sampling rate using three electrodes. Participants could watch their ECG signals on a 42-inch screen. Stimuli selected by the participants themselves (elective music) and by music experts (assigned music) were presented using a dedicated stimulus PC operated by the experimenter. The elective music was selected by the participants themselves so that it could truly induce the four types of emotions corresponding to the four quadrants of the 2D emotion model shown in Fig. 1 [16], [17]. The assigned music was selected by music experts according to specific musical features including tempo, mode, dynamics, timbre, rhythm, tonality, and harmonic progression [27]. For example, the songs chosen to represent joy are characterized by fast tempo, major mode, consonance, high loudness, high pitch, large pitch range, ascending melody, regular rhythm, many harmonics/bright timbre, and staccato articulation. In contrast, the songs chosen to represent sadness are characterized by slow tempo, minor mode, dissonance, soft loudness, low pitch, narrow pitch range, descending melody, firm rhythm, few harmonics/soft timbre, and legato articulation. In addition, the songs chosen to represent tension are characterized by dissonance, high loudness, rhythmic complexity, harmonic complexity, ascending melody, and increased note density. Finally, the songs chosen to represent peacefulness are characterized by slow tempo, consonance, soft loudness, narrow pitch range, regular rhythm, and legato articulation. For presenting the stimuli, Windows Media Player software and Yamaha Corporation speakers were used, and the music volume was set to the maximum volume of Windows Media Player and 75% of the computer speaker volume. However, each participant was asked before the experiment whether the volume was comfortable.

3.3 Experimental Protocol
Each participant signed a consent form and filled out a questionnaire before participating in this experiment. Next, the electrodes of the ECG recording instrument were placed on the participant, and the recorded ECG signals were checked by the experimenter to avoid interference resulting from incorrect placement of the electrodes or a defective ECG recording instrument. Then, the participants were given instructions to help them comprehend the experimental protocol and the meanings of the scale used for self-assessment. When the instructions were clear to the participants, the experiment was started. The experiment involved the following six stages:
1. Baseline stage (5 min): The experimenter instructed the participants to press the "Start Recording" button on the screen to start ECG signal recording. After ECG recording started, the experimenter played the music "Whale Music" to let each participant's emotion reach a neutral state (near the origin of the 2D emotion model in Fig. 1) during this stage.
2. Elective music listening stage (15 min): At the beginning of this music listening stage, the participants pressed the enter key on the keyboard and entered "1" to make a marker setting the start point of the ECG signal recording in this stage, and then the experimenter played the elective music that expresses only one target emotion. After listening to the elective music, the participants pressed the enter key on the keyboard and entered "2" to make a marker setting the end point of the ECG signal recording in this stage.
3. First self-assessment stage (3 min): The participants filled out the self-assessment within 3 min to reflect their current emotion after listening to the elective music.
4. Assigned music listening stage (15 min): At the beginning of this music listening stage, the participants pressed the enter key on the keyboard and entered "3" to make a marker setting the start point of the ECG signal recording in this stage, and then the experimenter played the assigned music with the same emotion as the elective music. After listening to the assigned music, the participants pressed the enter key on the keyboard and entered "4" to make a marker setting the end point of the ECG signal recording in this stage.
5. Second self-assessment stage (3 min): The participants filled out the self-assessment within 3 min to reflect their current emotion after listening to the assigned music.
6. Recovery stage (5 min): The participants pressed the enter key on the keyboard and entered "5" to make a marker setting the start point of the ECG signal recording in this stage, and the experimenter played the music "Yellow Fantasy" to let each participant's emotion recover to a neutral state during this stage.
The abovementioned markers were used to separate the ECG signal recording of each experimental stage in this experiment. Fig. 2 shows a block diagram of the experimental procedures.
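As a rough illustration of how these markers can be used, the sketch below cuts a continuous recording into the two music listening stages from marker annotations; the (sample index, label) marker format and the helper names are our assumptions for illustration, not the Biotrace+ export format.

```python
import numpy as np

FS = 2048  # sampling rate (Hz), Section 3.2

def extract_listening_stages(ecg, markers):
    """Cut the elective ("1"->"2") and assigned ("3"->"4") listening stages.

    ecg     : 1-D array containing the whole recording
    markers : list of (sample_index, label) pairs with labels "1".."5"
    """
    idx = {label: sample for sample, label in markers}
    elective = ecg[idx["1"]:idx["2"]]
    assigned = ecg[idx["3"]:idx["4"]]
    return elective, assigned

def last_five_minutes(segment, fs=FS):
    """Keep the last 5-min portion used for labeling (see Section 3.5)."""
    return segment[-5 * 60 * fs:]
```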

3.4 Participant Self-assessment
After listening to the music in the two music listening stages, the participants performed self-assessments to reflect their current emotion after listening to the elective and assigned music, respectively. In the self-assessment stages, the participants were asked to indicate how strongly they had experienced the corresponding feeling during stimulus presentation for each of the nine Geneva Emotional Music Scale (GEMS-9) emotion categories. Each of the nine emotion labels (Wonder, Transcendence, Power, Tenderness, Nostalgia, Peacefulness, Joy, Sadness, and Tension) was rated from 1 to 5 (1 = not at all, 5 = very much).

3.5 Dataset for Evaluation
The aims of the proposed automatic ECG-based emotion recognition algorithm are to discriminate positive/negative valence, high/low arousal, and the four target emotions (joy, tension, sadness, and peacefulness). Toward this end, the participants' ratings on the GEMS-9 during the experiment are used as the ground truths (the ideal outputs for the LS-SVM classifiers) for the classification tasks, and we set two criteria to identify the strongly elicited emotion patterns in music listening. These are described as follows.
1. We assume that the last 5-min ECG segment in the music listening stages reflects the feeling reported in the participants' ratings.
2. Regarding the GEMS-9, we only focus on the last four emotion ratings, namely Peacefulness, Joy, Sadness, and Tension, because this study aims to classify these four emotions and their categories in the valence-arousal space shown in Fig. 1. According to the ratings, we selected the last 5-min ECG segments of those participants from the self-collected database for whom the rating score of only one of the four emotions was four or five and this emotion was the same as that of the elicited music.
As mentioned earlier, a total of 395 strongly elicited emotion samples (1-min ECG segments) distributed over the four emotions (105 for Joy, 55 for Tension, 40 for Sadness, and 195 for Peacefulness) were collected from the 61 participants. The ECG segments distributed in positive/negative valence are 300 and 95, respectively. The ECG segments distributed in high/low arousal are 160 and 235, respectively. Fig. 3 shows ECG signals collected from different participants for the four emotions. The ECG database used in this paper is available and accessible through a web interface.¹

Fig. 3. ECG signals collected from different participants for the four emotions. (a) Joy. (b) Tension. (c) Sadness. (d) Peacefulness.

¹ http://nemedataset.ddns.net/index/

4 METHODOLOGY

Fig. 4 shows the block diagram of the proposed automatic ECG-based emotion recognition algorithm. We briefly describe each procedure of the proposed algorithm as follows: (1) Signal preprocessing: Remove baseline wander and reduce signal amplitude biases by using median filters and the Z-score normalization method, respectively. (2) R-wave detection: Use the R-waves detected by the QRS detection algorithm proposed by Pan and Tompkins [28] to derive the R-R time intervals (RR intervals) for further extraction of ECG-based features. (3) Windowing: Window the ECG recording so that emotions are classified on each 1-min epoch. (4) Noisy epoch rejection: Reject incorrect epochs that include body artifact noise or incomplete ECG recordings. (5) Feature extraction: Obtain a total of 34 features extracted from the time-domain, frequency-domain, and nonlinear analyses of the ECG signals. (6) Feature normalization: First, obtain the degrees of physiological change of the features between the conditions of the baseline stage and the music listening stage; subsequently, reduce the effect of the value ranges of these degrees of physiological change through the min-max normalization method. (7) Feature selection: Select the appropriate features out of the 34 normalized features by using the proposed SFFS-KBCS-based algorithm. (8) Feature reduction: Reduce the dimensions of the significant features selected in the previous step by using the GDA. (9) Classifier construction: Discriminate emotions by using the LS-SVMs with a one-against-one strategy based on the reduced features. We introduce each procedure of the proposed automatic ECG-based emotion recognition algorithm in detail in the following sections.
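A minimal sketch of steps (1) and (2) is given below, assuming the 2048 Hz sampling rate from Section 3.2, the 200-ms/600-ms median-filter widths described in Section 4.1, and an external Pan-Tompkins style QRS detector (not shown); it illustrates the described pipeline rather than reproducing the authors' code.

```python
import numpy as np
from scipy.signal import medfilt

FS = 2048  # sampling rate (Hz), Section 3.2

def preprocess_ecg(ecg, fs=FS):
    """Step (1): remove baseline wander with two median filters (200 ms,
    600 ms; Section 4.1), then apply Z-score normalization."""
    w1 = int(0.2 * fs) // 2 * 2 + 1        # ~200 ms window, forced odd
    w2 = int(0.6 * fs) // 2 * 2 + 1        # ~600 ms window, forced odd
    baseline = medfilt(medfilt(ecg, w1), w2)
    detrended = ecg - baseline
    return (detrended - detrended.mean()) / detrended.std()

def rr_intervals_ms(r_peak_samples, fs=FS):
    """Step (2): RR intervals (ms) from the R-peak sample indices produced
    by a Pan-Tompkins style QRS detector."""
    return np.diff(np.asarray(r_peak_samples)) / fs * 1000.0
```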

Fig. 4. Block diagram of the automatic ECG-based emotion recognition algorithm.

4.1 Signal Preprocessing
In the ECG signal preprocessing procedure, two interference signals are removed for the following reasons: (1) baseline wander caused by the subject's movement or breathing might mislead the ECG annotation and hinder accurate identification of the ECG features [29], and (2) ECG signal amplitude biases generated by individual and instrumental differences cause extreme amplitude scaling. The baseline wander removal and Z-score normalization procedures are used to remove the abovementioned interference signals, respectively. (1) Baseline wander removal: According to [30], a median filter of 200-ms width is used to remove the QRS complexes and P-waves of the original ECG signals. Subsequently, a median filter of 600-ms width is used to remove the T-waves of the resulting signal. The signal resulting from the second filter operation is the baseline wander of the original ECG signals. Then, the baseline wander signal is subtracted from the original ECG signals. (2) Z-score normalization: Because of instrumental and human differences, the signal amplitude biases of the ECG waveforms are inconsistent. Therefore, these differences in each waveform of each ECG signal are reduced by the Z-score normalization method [31]. This procedure results in a normalized ECG signal with zero mean and unit standard deviation.

4.2 R-wave Detection
The RR intervals acquired from the normalized ECG signals are crucial indices for extracting the ECG-based features. The more accurate the R-wave detection is, the more accurate the ECG-based features we can obtain. In this paper, the R-waves of the normalized ECG signals are detected by the QRS detection algorithm proposed by Pan and Tompkins [28], and the amplitude and occurrence time of the R-waves are obtained. Subsequently, the RR intervals are obtained by calculating the time difference between successive R-waves detected in the normalized ECG signals.

4.3 Windowing
According to [19], the most widely used duration of physiological variables is 1 min; therefore, we use 1 min as the window size for classifying the emotions during music listening. Hence, in this paper, the ECG signals and their R-waves are segmented into a series of successive 1-min windows.

4.4 Incorrect Epoch Rejection
Before extracting features from each ECG epoch (1-min window), we must ensure that the signal of each ECG epoch is a complete observation. The two types of incorrect epoch rejection conditions are: (1) incomplete/noisy ECG data caused by electrodes not being well attached to the subject's skin, and (2) sudden changes in the RR intervals caused by ECG data with electrode motion artifact noise or body movement noise that cannot be filtered. Accordingly, we reject incorrect epochs under the conditions of no change in the ECG data or a sudden change in the RR intervals. The incorrect epochs are excluded from the subsequent procedures.

4.5 Feature Extraction
In the feature extraction procedure, a total of 34 features listed in Table 1 are extracted from the time-domain, frequency-domain, and nonlinear analyses of the ECG signals in each 1-min epoch [20], [32], [33].

4.5.1 Time-domain Analysis
In the time-domain analysis, we calculated a total of 12 features. (1) HRV related parameters: the standard deviation of RR intervals (SDNN), root mean square of differences between adjacent RR intervals (RMSSD), number of successive RR intervals that differ by more than 50 ms (NN50), percentage of successive RR intervals that differ by more than 50 ms (pNN50), and standard deviation of differences between adjacent RR intervals (SDSD). The equations for calculating the time-domain HRV related parameters are as follows:

SDNN = \sqrt{\sum_{i=1}^{n} (R_i - \bar{R})^2 / n},   (1)

RMSSD = \sqrt{\sum_{i=2}^{n} (R_i - R_{i-1})^2 / (n-1)},   (2)

SDSD = \sqrt{\sum_{i=2}^{n} (R_i - R_{i-1} - \mathrm{RMSSD})^2 / (n-1)},   (3)

where R_i is the ith RR interval, \bar{R} is the average of the RR intervals, and n is the number of RR intervals.
(2) HR related parameter: the number of R-waves within one epoch divided by 1 min (BPM).
(3) RR interval related parameters: the median value of RR intervals (Median-RRI), interquartile range of RR intervals (IQR-RRI), mean absolute deviation of RR intervals (MAD-RRI), mean of the differences between adjacent RR intervals (Diff-RRI), coefficient of variation of RR intervals (CV-RRI), and difference between the maximum and the minimum RR interval (Range). The equations for calculating the RR interval related parameters are as follows:

MAD\_RRI = \frac{1}{n} \sum_{i=1}^{n} |R_i - \bar{R}|,   (4)

Diff\_RRI = \frac{1}{n-1} \sum_{i=2}^{n} |R_i - R_{i-1}|,   (5)

CV\_RRI = \sqrt{\sum_{i=1}^{n} (R_i - \bar{R})^2 / n} \; / \; \bar{R}.   (6)
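A small sketch of these time-domain parameters, assuming the RR intervals are given in milliseconds and that one epoch is 1 min long, is shown below; the definitions follow Eqs. (1)-(6) and the feature list above, and the SDSD line mirrors Eq. (3) as printed.

```python
import numpy as np

def time_domain_features(rr_ms):
    """Time-domain parameters of Section 4.5.1 from RR intervals in ms."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)
    rmssd = np.sqrt(np.mean(diff ** 2))                      # Eq. (2)
    return {
        "SDNN":   np.sqrt(np.mean((rr - rr.mean()) ** 2)),   # Eq. (1)
        "RMSSD":  rmssd,
        "SDSD":   np.sqrt(np.mean((diff - rmssd) ** 2)),     # Eq. (3) as printed
        "NN50":   int(np.sum(np.abs(diff) > 50.0)),
        "pNN50":  100.0 * np.mean(np.abs(diff) > 50.0),
        "BPM":    rr.size + 1,                                # R-waves per 1-min epoch
        "Median-RRI": np.median(rr),
        "IQR-RRI":    np.percentile(rr, 75) - np.percentile(rr, 25),
        "MAD-RRI":    np.mean(np.abs(rr - rr.mean())),        # Eq. (4)
        "Diff-RRI":   np.mean(np.abs(diff)),                  # Eq. (5)
        "CV-RRI":     np.sqrt(np.mean((rr - rr.mean()) ** 2)) / rr.mean(),  # Eq. (6)
        "Range":      rr.max() - rr.min(),
    }
```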


4.5.2 Frequency-domain Analysis
In the frequency-domain analysis, a total of 13 HRV related parameters are calculated in certain frequency bands. The RR interval signal needs to be resampled and interpolated to transform it into a regularly sampled signal and to prevent the generation of additional harmonic components [34]. After resampling and interpolation, the power spectral density (PSD) of the resampled RR intervals is calculated using the fast Fourier transform (FFT)-based method. The PSD analysis is used to calculate the power of specific frequency ranges and the peak frequencies for three frequency bands: the very-low-frequency range (VLF) (0.0033–0.04 Hz), low-frequency range (LF) (0.04–0.15 Hz), and high-frequency range (HF) (0.15–0.4 Hz). For the frequency-domain analysis, we calculated the following features: the power calculated in the VLF, LF, and HF bands (VLF, LF, and HF), total power in the full frequency range (TP), ratio of the power calculated within the LF band to that calculated within the HF band (LF/HF), LF power normalized to the sum of the LF and HF power (LFnorm), HF power normalized to the sum of the LF and HF power (HFnorm), VLF power expressed as a percentage of the total power (pVLF), LF power expressed as a percentage of the total power (pLF), HF power expressed as a percentage of the total power (pHF), frequency of the highest peak in the VLF band (VLFfr), frequency of the highest peak in the LF band (LFfr), and frequency of the highest peak in the HF band (HFfr). The equations for calculating the frequency-domain HRV related parameters are summarized as follows:

TP = VLF + LF + HF.   (7)

LFnorm = LF / (TP - VLF) = LF / (LF + HF).   (8)

HFnorm = HF / (TP - VLF) = HF / (LF + HF).   (9)

pVLF = (VLF / TP) \times 100.   (10)

pLF = (LF / TP) \times 100.   (11)

pHF = (HF / TP) \times 100.   (12)

4.5.3 Nonlinear Analysis
In the nonlinear analysis, a total of 9 features, including 2 ECG-derived respiration (EDR) related parameters, 3 Poincaré plot related parameters, 2 nonlinear dynamics related parameters, and 2 autocorrelation related parameters, are described as follows:
A. EDR related parameters
Before we extract the two EDR-related features (RSPrate and Coherence), we first obtain the EDR signal; the detailed procedure for extracting the EDR signal can be found in [35], [36]. Subsequently, we extract two features: the respiratory rate (RSPrate) and the coherence between the final EDR signal and the RR intervals (Coherence). To estimate the respiratory rate (RSPrate), the final EDR signal is first subtracted by its mean to remove the DC component. Then, the PSD analysis of the EDR signal is applied to obtain the respiratory rate as the frequency of the maximum peak in the low-frequency band (0.1–1 Hz) multiplied by 60 [35], [36]. In addition, according to [37], the PSD of the EDR signal in the low-frequency band, 0–0.4 Hz, is similar to the PSD of the RR intervals when humans have a positive-valence emotion. Therefore, we can calculate the coherence between the final EDR signal and the RR intervals in the low-frequency band (0–0.4 Hz), which is a function of the auto spectral densities of the final EDR signal (G_{xx}) and the RR intervals (G_{yy}) and the cross spectral density (G_{xy}) of the EDR signal and the RR intervals: Coherence = |G_{xy}|^2 / (G_{xx} \times G_{yy}).
B. Poincaré plot related parameters
The Poincaré plot analysis measures the quantitative beat-to-beat correlation between adjacent RR intervals. According to the ellipse fitting process introduced in [38], [39], SD1 represents the standard deviation of the instantaneous beat-to-beat RR interval variability, SD2 represents the standard deviation of the continuous long-term beat-to-beat RR interval variability, and SD12 is the ratio of SD1 to SD2.
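A compact sketch of these Poincaré descriptors is given below; it uses the common ellipse-fitting formulas for SD1/SD2 on successive RR-interval pairs, which is one standard realization of the procedure in [38], [39], not necessarily the authors' exact implementation.

```python
import numpy as np

def poincare_features(rr_ms):
    """SD1, SD2, and SD12 from a Poincaré plot of (RR_i, RR_{i+1}) pairs."""
    rr = np.asarray(rr_ms, dtype=float)
    x, y = rr[:-1], rr[1:]                       # adjacent RR-interval pairs
    sd1 = np.std((y - x) / np.sqrt(2), ddof=1)   # spread across the identity line
    sd2 = np.std((y + x) / np.sqrt(2), ddof=1)   # spread along the identity line
    return {"SD1": sd1, "SD2": sd2, "SD12": sd1 / sd2}
```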
represent m successive rr values, starting with the ith

Fig. 5. Autocorrelation of the QRS complexes. (ACFcoef is 0.8982 and its lag time is 0.69 s; ACFfreq is the reciprocal of the lag time, 1.4492 Hz.)

TABLE 1
ECG FEATURE SETS GENERATED IN THIS STUDY

Time-domain analysis
  HRV: SDNN, RMSSD, NN50, pNN50, SDSD
  HR: BPM
  RR interval: Median-RRI, IQR-RRI, MAD-RRI, Diff-RRI, CV-RRI, Range
Frequency-domain analysis
  HRV: VLF, LF, HF, TP, LF/HF, LFnorm, HFnorm, pVLF, pLF, pHF, VLFfr, LFfr, HFfr
Nonlinear analysis
  EDR: RSPrate, Coherence
  Poincaré plot: SD1, SD2, SD12
  Nonlinear dynamics: ApEn, SampleEn
  Autocorrelation: ACFcoef, ACFfreq

C. Nonlinear dynamics related parameters
In the nonlinear dynamics analysis, we extract two features, ApEn and SampleEn, by calculating the approximate entropy and sample entropy, respectively. The approximate entropy and sample entropy are similar methods that quantify the randomness or predictability of the RR interval dynamics. They are scale-invariant and model-independent; however, there are some computational differences. Both assign a nonnegative number to a series of RR intervals, with larger values indicating more complexity or irregularity in the data [32], [40].
1) ApEn: A parameter that quantifies the amount of regularity or predictability of the RR intervals. There are two user-specified parameters in the ApEn measure: a run length m and a tolerance window r (the m and r used in this study are 1 and 0.25 times the standard deviation of the RR intervals, respectively). Given N data points from a time series of RR intervals {rr(1), rr(2), …, rr(N)}, ApEn is computed as follows:
Step 1: Obtain a sequence of vectors rr(i) = [rr(i), rr(i+1), …, rr(i+m-1)], i = 1, 2, …, N-m+1. These vectors represent m successive rr values, starting with the ith point.
Step 2: Calculate the maximum absolute difference between the respective scalar components of rr(i) and rr(j) to obtain the distance between rr(i) and rr(j):

d[rr(i), rr(j)] = \max_{k=0,1,\ldots,m-1} |rr(i+k) - rr(j+k)|.   (13)

Step 3: Calculate the distance between a given vector rr(i) and the other vectors rr(j) for j = 1, 2, …, N-m+1. If d[rr(i), rr(j)] ≤ r is true, then set N^m(i) = N^m(i) + 1, where i = 1, 2, …, N-m+1. Then, within tolerance r, we obtain the C_r^m(i) values that measure the regularity of patterns similar to a given segment of length m for i = 1, 2, …, N-m+1:

C_r^m(i) = N^m(i) / (N - m + 1).   (14)

Step 4: Obtain the average frequency with which all m-point patterns in the sequence are close to each other, \phi^m(r), by computing the average of the natural logarithm of each C_r^m(i):

\phi^m(r) = \sum_{i=1}^{N-m+1} \ln C_r^m(i) / (N - m + 1).   (15)

Step 5: ApEn is obtained as follows:

ApEn(m, r) = \phi^m(r) - \phi^{m+1}(r).   (16)

2) SampleEn: A parameter that quantifies the amount of regularity or predictability of the RR intervals. The run length m and tolerance window r are set the same as for ApEn in this study. Given N data points from a time series of RR intervals {rr(1), rr(2), …, rr(N)}, SampleEn is calculated as follows:
Step 1 and Step 2: These two procedures are the same as those of ApEn.
Step 3: Calculate the distance between a given vector rr(i) and the other vectors rr(j) for j = 1, 2, …, N-m+1, where j ≠ i. If d[rr(i), rr(j)] ≤ r is true, then set N'^m(i) = N'^m(i) + 1, where i = 1, 2, …, N-m+1. Then, within tolerance r, we obtain the C'_r^m(i) values that measure the regularity of patterns similar to a given segment of length m for i = 1, 2, …, N-m+1:

C'_r^m(i) = N'^m(i) / (N - m + 1).   (17)

Step 4: Obtain the average frequency with which all m-point patterns in the sequence are close to each other, \phi'^m(r), by computing the average of each C'_r^m(i):

\phi'^m(r) = \sum_{i=1}^{N-m+1} C'_r^m(i) / (N - m + 1).   (18)

Step 5: SampleEn is obtained as follows:

SampleEn(m, r) = -\ln[\phi'^{m+1}(r) / \phi'^m(r)].   (19)

D. Autocorrelation related parameters
Nonlinear features, namely the maximum autocorrelation coefficient (ACFcoef) and the reciprocal of its lag time (ACFfreq), are obtained from the autocorrelation of the QRS complexes, which is used to quantify the similarity of the QRS complexes over time, as shown in Fig. 5.
selection or feature reduction can reduce the

4.6 Feature Normalization
To obtain the degrees of physiological change in the ECG features between the conditions of the baseline stage and the music listening stage, and to reduce the effect of the value ranges of these degrees of physiological change, we first calculate the degree of physiological change for each extracted feature of each subject as follows:

u_{ij} = (x_{ij} - b_{ij}) / b_{ij},   (20)

where x_{ij} represents the ith feature extracted from the jth subject during the music listening stage, b_{ij} represents the ith feature extracted from the jth subject during the baseline stage, and u_{ij} is the degree of physiological change in the feature. Then, the 34 degrees of physiological change of the ECG features obtained in (20) are mapped into the range [-1, 1] by the min-max normalization method.

4.7 Feature Selection
From the feature extraction and feature normalization procedures, the normalized feature vectors representing the degrees of physiological change between the conditions of the baseline stage and the music listening stage are high-dimensional vectors. Feature selection or feature reduction is an essential procedure for any ECG-related analysis, and it must be applied to reduce the dimensionality of the normalized feature vectors for constructing effective classifiers. In other words, feature selection or feature reduction can reduce the computational complexity and increase the classification accuracy.

In this paper, the proposed feature selection method combines a selection criterion with a search strategy to select the appropriate features out of the 34 normalized features. This study develops an SFFS-KBCS-based feature selection algorithm that uses the SFFS method as the search strategy and the KBCS method as the selection criterion. We describe these methods as follows.

4.7.1 KBCS-based Selection Criterion
The KBCS method utilized in this paper was originally developed by Wang [41]. Let (x, y) ∈ (R^d × Y) represent a sample, where R^d stands for a d-dimensional feature space, Y denotes the set of class labels, and the size of Y is the number of classes c. This method projects the samples onto a kernel space K. Let m_i^φ be the mean vector of the ith class in the kernel space K, n_i the number of samples in the ith class, and m^φ the mean vector of all classes in the kernel space K; S_B^φ denotes the between-class scatter matrix, S_W^φ the within-class scatter matrix, and S_T^φ the total scatter matrix in the kernel space K. Let φ(·) be a possibly nonlinear feature mapping from the feature space R^d to the kernel space K: φ: R^d → K, x → φ(x). K denotes a kernel matrix with {K}_{ij} = k(x_i, x_j), where k(x_i, x_j) is a kernel function, and K_{A,B} is a kernel matrix with the constraints x_i ∈ A and x_j ∈ B. The operator Sum(·) denotes the summation of all elements of a matrix, and trace(A) is the trace of a square matrix A. The relevant equations are as follows:

trace(S_B^φ) = trace( \sum_{i=1}^{c} n_i (m_i^φ - m^φ)(m_i^φ - m^φ)^T ) = \sum_{i=1}^{c} Sum(K_{D_i,D_i}) / n_i - Sum(K_{D,D}) / n,   (21)

trace(S_W^φ) = trace( \sum_{i=1}^{c} \sum_{j=1}^{n_i} (φ(x_{ij}) - m_i^φ)(φ(x_{ij}) - m_i^φ)^T ) = trace(K_{D,D}) - \sum_{i=1}^{c} Sum(K_{D_i,D_i}) / n_i,   (22)

trace(S_T^φ) = trace(S_B^φ) + trace(S_W^φ) = trace(K_{D,D}) - Sum(K_{D,D}) / n,   (23)

where x_{ij} represents the jth sample of the ith class. The class separability in the kernel space is represented as

J^φ = trace(S_B^φ) / trace(S_T^φ).   (24)

When using a normalized kernel or the Gaussian radial basis function (RBF) kernel (a stationary kernel), the class separability has a lower bound J_l^φ, which is proportional to the between-class scatter:

J_l^φ ∝ trace(S_B^φ).   (25)

In this study, we use the Gaussian RBF kernel k(x_i, x_j) = exp(-||x_i - x_j||^2 / 2σ^2), where the Gaussian width σ is the kernel parameter, and J_l^φ is used as the criterion function for feature selection [41].

4.7.2 SFFS-based Search Strategy
The sequential forward floating selection (SFFS) is a well-known suboptimal feature selection method that combines sequential forward selection (SFS) and sequential backward selection (SBS) to reduce the nesting effect. SFS is a bottom-up feature selection method, whereas SBS is a top-down feature selection method. Suppose that we want to choose a p-dimensional feature subset from the original feature set. SFS starts from an empty feature set and sequentially adds the one feature from the original feature set that yields the best selection criterion. Conversely, SBS starts from the original feature set and sequentially deletes the one feature whose removal yields the best selection criterion. The SFFS process consists of three steps: inclusion, conditional exclusion, and continuation of conditional exclusion [42].

First, suppose F_k = {f_i : 1 ≤ i ≤ k} is a selected feature set in which k features have already been selected from the original feature set Y = {y_i : 1 ≤ i ≤ n}, where n is the total number of features.
Step 1: [Inclusion] Select feature f_{k+1} by using the basic SFS method from the available set {Y - F_k} to form a feature set F_{k+1}; that is, f_{k+1} is the most significant feature in the available set {Y - F_k}, and it is added to F_k. Therefore, F_{k+1} = {F_k, f_{k+1}}.
Step 2: [Conditional exclusion] Find the least significant feature f_j in the feature set F_{k+1}. If f_{k+1} is the least significant feature in F_{k+1}, that is,

J_l^φ(F_{k+1} - f_{k+1}) = J_l^φ(F_k) ≥ J_l^φ(F_{k+1} - f_j), j = 1, 2, ..., k,   (26)

where J_l^φ(·) denotes the feature selection criterion function obtained from the KBCS, then the best feature combination of size k is {F_{k+1} - f_{k+1}}; set k = k + 1 and return to Step 1. If some f_j ≠ f_{k+1} is the least significant feature in F_{k+1}, that is,

J_l^φ(F_{k+1} - f_j) > J_l^φ(F_k), j = 1, 2, ..., k,   (27)

then exclude f_j from the feature set F_{k+1} to form a new feature set F'_k, that is,

F'_k = F_{k+1} - f_j.   (28)

Note that J_l^φ(F'_k) > J_l^φ(F_k). If k = 2, then set F_k = F'_k and J_l^φ(F_k) = J_l^φ(F'_k) and return to Step 1; otherwise, go to Step 3.
Step 3: [Continuation of conditional exclusion] Find the least significant feature f_s in the feature set F'_k. If J_l^φ(F'_k - f_s) ≤ J_l^φ(F_{k-1}), then set F_k = F'_k and J_l^φ(F_k) = J_l^φ(F'_k) and return to Step 1. If J_l^φ(F'_k - f_s) > J_l^φ(F_{k-1}), then exclude f_s from F'_k to form a newly reduced feature set F'_{k-1}, that is,

F'_{k-1} = F'_k - f_s,   (29)

and set k = k - 1. If k = 2, then set F_k = F'_k and J_l^φ(F_k) = J_l^φ(F'_k) and return to Step 1; otherwise, repeat Step 3.
The SFFS-based search strategy is initialized by setting k = 0 and F_0 = ∅, and the SFS method is used until a feature set of cardinality 2 is obtained (i.e., until the two most significant features are included). Then, the process continues with Step 1. Finally, we combine the KBCS-based selection criterion with the SFFS-based search strategy to form the proposed SFFS-KBCS-based feature selection algorithm. Fig. 6 shows the flowchart of the SFFS-KBCS-based feature selection algorithm.

Fig. 6. Flowchart of the SFFS-KBCS-based feature selection algorithm.
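A small sketch of the KBCS criterion in Eqs. (21) and (25) is shown below, assuming a Gaussian RBF kernel and the trace(S_B^φ) form of the criterion; it only illustrates how the kernel-matrix block sums enter the computation. In the SFFS search, this quantity would be evaluated on each candidate feature subset, i.e., on the corresponding columns of the normalized feature matrix.

```python
import numpy as np

def rbf_kernel_matrix(X, sigma):
    """Gaussian RBF kernel matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kbcs_criterion(X, y, sigma=1.0):
    """Eqs. (21)/(25): between-class scatter trace in kernel space,
    computed from block sums of the kernel matrix."""
    K = rbf_kernel_matrix(X, sigma)
    n = len(y)
    crit = -K.sum() / n                                   # - Sum(K_{D,D}) / n
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        crit += K[np.ix_(idx, idx)].sum() / idx.size      # + Sum(K_{D_i,D_i}) / n_i
    return crit
```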

4.8 Feature Reduction
To reduce the dimensions of the selected features obtained from the proposed SFFS-KBCS-based feature selection algorithm, the GDA is utilized in this study [43]. The GDA is a feature reduction approach for dealing with nonlinear discriminant problems by means of a kernel function operator. The GDA aims to exploit linear properties via a possibly nonlinear feature mapping function φ(·) and then find a projection matrix T that maximizes the ratio of the between-class scatter matrix S_B^φ to the within-class scatter matrix S_W^φ, both in the kernel space, in order to map the original selected feature set f_i ∈ R^p in a p-dimensional space onto a smaller feature set g_i ∈ R^q in a q-dimensional space. Let (f, u) ∈ (R^p × U) represent a sample, where R^p denotes a p-dimensional feature space, U denotes the set of class labels, and the size of U is the number of classes c. This method first projects the samples from the original selected feature space R^p onto a kernel space K via a nonlinear feature mapping function φ(·): φ: R^p → K, f → φ(f). We summarize the GDA method as follows:
Step 1: Compute both the within-class scatter matrix S_W^φ and the between-class scatter matrix S_B^φ in the kernel space by the following equations:

S_W^φ = \sum_{i=1}^{c} \sum_{j=1}^{n_i} (φ(f_{ij}) - m_i^φ)(φ(f_{ij}) - m_i^φ)^T,   (30)

S_B^φ = \sum_{i=1}^{c} n_i (m_i^φ - m^φ)(m_i^φ - m^φ)^T,   (31)

where f_{ij} is the jth sample of the ith class, m_i^φ is the mean vector of the ith class in the kernel space K, n_i is the number of samples in the ith class, and m^φ is the mean vector of all classes in the kernel space K.
Step 2: Find the transformation matrix T ∈ K by maximizing the following ratio in the kernel space K:

J(T) = (T^T S_B^φ T) / (T^T S_W^φ T),   (32)

where T^T S_B^φ T and T^T S_W^φ T are the new between-class scatter matrix and new within-class scatter matrix in the new feature space, respectively. In general, T = [t_1, t_2, …, t_q] is obtained by applying an eigenvalue (λ) decomposition of (S_W^φ)^{-1} S_B^φ via the following equation when the inverse of S_W^φ exists:

S_B^φ T = λ S_W^φ T.   (33)

Explicitly computing the mapping functions φ(f_i) and then performing this process is computationally complex. Thus, the feature data are implicitly embedded via dot products by using the kernel trick, in which a kernel function k(f_i, f_j) = ⟨φ(f_i), φ(f_j)⟩ replaces the dot product in the new feature space. In this study, we use the Gaussian RBF kernel function k(f_i, f_j) = exp(-||f_i - f_j||^2 / 2σ^2), where the Gaussian width σ is the kernel parameter. Subsequently, t_k is calculated using (34), which is a linear combination of all samples in K:

t_k = \sum_{i=1}^{c} \sum_{j=1}^{n_i} α_{ij} φ(f_{ij}),   (34)

where α_{ij} are real coefficients corresponding to φ(f_{ij}), and they are obtained by solving

λ = (α^T K D K α) / (α^T K K α),   (35)

where α^T = [α_1^T, α_2^T, ..., α_c^T] is a vector of coefficients with α_i^T = [α_{i1}, α_{i2}, …, α_{in_i}], and K is an N-by-N symmetric kernel matrix defined on the class elements as

K = (K_{xy})_{x=1,...,c; y=1,...,c},   (36)

where N is the number of all training samples and (K_{xy}) = (k(f_{xi}, f_{yj}))_{i=1,...,n_x; j=1,...,n_y}. The matrix D is a block diagonal matrix given as

D = (D_i)_{i=1,...,c},   (37)

where D_i ∈ R^{n_i × n_i} with all entries equal to 1/n_i. Finally, by solving the eigenvalue problem, we can obtain the eigenvectors α that define the projection matrix T ∈ K.
Step 3: Finally, a new feature vector g_i = [g_1, g_2, …, g_q]^T is obtained from the original selected feature f_i by the following equation:

g_k = t_k^T φ(f_i) = \sum_{l=1}^{c} \sum_{j=1}^{n_l} α_{lj} φ(f_{lj})^T φ(f_i) = \sum_{l=1}^{c} \sum_{j=1}^{n_l} α_{lj} k(f_{lj}, f_i).   (38)

After the significant feature set is determined through the feature reduction method, we integrate the SFFS-KBCS+GDA-based features as the input features for the following classifier.
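As a small illustration of Step 3, the sketch below projects a new (selected-feature) sample with Eq. (38), assuming the coefficient matrix alpha has already been obtained from the eigenproblem in Eq. (35); the function and variable names are ours, not the authors'.

```python
import numpy as np

def rbf_kernel_vector(F_train, f_new, sigma):
    """k(f_lj, f_new) for every training sample f_lj (Gaussian RBF kernel)."""
    d2 = np.sum((F_train - f_new) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def gda_project(F_train, alpha, f_new, sigma):
    """Eq. (38): g_k = sum over l, j of alpha_{lj} * k(f_lj, f_new).

    F_train : (N, p) selected training features, all classes stacked
    alpha   : (N, q) eigenvector coefficients, one column per kept direction
    """
    return alpha.T @ rbf_kernel_vector(F_train, f_new, sigma)
```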

4.9 Classifier Construction
An LS-SVM is a binary classifier that relies on a nonlinear mapping of the training set to a higher-dimensional space in which the transformed data are well separated by a separating hyperplane. Assume a training set of N data points, {x_i, y_i}_{i=1}^{N}, where x_i = [x_{i1}, …, x_{iq}] ∈ R^q is the ith input feature vector (q is the number of dimensions of the reduced features obtained by the GDA) and y_i ∈ {-1, 1} is the ith output label. The LS-SVMs perform classification using the following decision function:

y(x) = sign( \sum_{i=1}^{N} α_i y_i k(x, x_i) + b ),   (39)

where α_i are Lagrange multipliers (which can be either positive or negative) and b is a real constant. We utilize the Gaussian RBF function as the kernel function k in this study:

k(x_i, x_j) = exp(-||x_i - x_j||^2 / 2σ^2),   (40)

where the Gaussian width σ is the kernel parameter (σ > 0). The decision functions are constructed as follows:

y_i [w^T φ(x_i) + b] ≥ 1, i = 1, ..., N,   (41)

where φ(·) is a nonlinear mapping function that maps the input space onto a higher-dimensional space. However, φ(·) is not explicitly constructed, since a separating hyperplane may not exist in the higher-dimensional space. Therefore, slack variables ξ = (ξ_1, …, ξ_N) are introduced to handle this misclassification problem:

y_i [w^T φ(x_i) + b] ≥ 1 - ξ_i,  ξ_i ≥ 0,  i = 1, ..., N.   (42)

According to the structural risk minimization principle of statistical learning theory, the risk bound of the LS-SVM is minimized by solving the following optimization problem:

min_{w,b,e} J(w, b, e) = (1/2) w^T w + (γ/2) \sum_{i=1}^{N} e_i^2,
subject to: y_i [w^T φ(x_i) + b] = 1 - e_i, i = 1, ..., N.   (43)

The Lagrange method is used to solve this problem:

L(w, b, e; α) = J(w, b, e) - \sum_{i=1}^{N} α_i { y_i [w^T φ(x_i) + b] - 1 + e_i }.   (44)

The Karush-Kuhn-Tucker (KKT) conditions and Mercer's theorem are then applied to solve the resulting equations [44]. More detailed information on the LS-SVM can be found in [44], [45], and [46].

Because the LS-SVM is a binary classifier, a one-against-one strategy is utilized for multiclass classification [45]. In this study, we use the LS-SVMs with the one-against-one strategy to classify the participants' emotions. For classifying the target emotions in this study, the one-against-one method constructs m(m-1)/2 binary classifiers, and a max-wins voting strategy is utilized if m classes need to be classified. When the output of an LS-SVM classifier is one of its two classes, the vote count of that class is increased by one. Finally, the classification result is the class with the largest number of votes. If two classes have an identical number of votes, we simply select the one with the smaller label; that is, if Joy and Peacefulness, which are labeled as "1" and "4", respectively, have the same number of votes, we predict that the output is Joy. In the one-against-one method for multiclass support vector machines, if a given sample is classified into two classes with the same number of votes, it is assigned to one class randomly [46] or to the class with the smaller index [45]; our selection is based on the method proposed in [45]. In this study, the output of the classifier is represented as the label of the two types of valence (i.e., negative and positive valence are labeled as "1" and "2"), the two types of arousal (i.e., low and high arousal are labeled as "1" and "2"), and the four types of emotions (i.e., Joy, Tension, Sadness, and Peacefulness are labeled as "1", "2", "3", and "4", respectively).
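A minimal sketch of the max-wins voting described above is given below; binary_predict(a, b, x) stands for an already-trained pairwise LS-SVM (the training itself is omitted), and tie-breaking follows the smaller-label rule stated in the text.

```python
from itertools import combinations

CLASSES = [1, 2, 3, 4]   # Joy, Tension, Sadness, Peacefulness

def one_against_one_predict(binary_predict, x, classes=CLASSES):
    """Max-wins voting over m(m-1)/2 pairwise classifiers.

    binary_predict(a, b, x) must return either a or b for sample x.
    Ties are resolved in favor of the smaller label, as in Section 4.9.
    """
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[binary_predict(a, b, x)] += 1
    # max() scans labels in ascending order, so ties go to the smaller label.
    return max(sorted(classes), key=lambda c: votes[c])
```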
A total of 395 ECG segments were collected from the 61 participants (105 for Joy, 55 for Tension, 40 for Sadness, and 195 for Peacefulness). Of these segments, 300 and 95 fall into positive and negative valence, respectively, and 160 and 235 fall into high and low arousal, respectively. In addition, we compared the classification performances of the LS-SVM classifier obtained with the proposed feature reduction method, SFFS-KBCS+GDA, and with four alternative combinations, SFFS-KBCS+PCA, SFFS-KBCS+LDA, SFFS-KBCS+PCA+LDA, and SFFS-KBCS+PCA+GDA, once the optimal dimensions of each feature reduction scheme had been estimated. To investigate the robustness of the proposed algorithm, we evaluated the classification performances of the combinations of the feature selection method and the feature reduction methods by 2-fold, 10-fold, leave-one-subject-out (LOSO), and leave-one-out (LOO) cross-validation strategies. Four common measures, accuracy (Acc), specificity (Sp), sensitivity (Se), and correct classification rate (CCR), were used to evaluate the performance of the proposed classification scheme.
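As a minimal sketch of how these measures and validation strategies can be computed, the snippet below uses the usual confusion-matrix definitions of Acc, Sp, and Se, defines the CCR as the fraction of correctly classified segments, and builds LOSO folds from per-segment subject identifiers. The function names and the exact metric definitions are assumptions for illustration and may differ in detail from the implementation used in this work.

```python
import numpy as np

def binary_metrics(y_true, y_pred, positive=1):
    """Acc, Sp, and Se for a two-class task, from the usual confusion-matrix counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    tn = np.sum((y_pred != positive) & (y_true != positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    acc = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity (true positive rate)
    sp = tn / (tn + fp) if (tn + fp) else 0.0   # specificity (true negative rate)
    return acc, sp, se

def correct_classification_rate(y_true, y_pred):
    """CCR: proportion of samples assigned to the correct class (any number of classes)."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def leave_one_subject_out(subject_ids):
    """Yield (train_idx, test_idx) folds, holding out every segment of one subject per fold."""
    subject_ids = np.asarray(subject_ids)
    for s in np.unique(subject_ids):
        yield np.where(subject_ids != s)[0], np.where(subject_ids == s)[0]
```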

5.1 Valence Classification
To obtain an optimal feature subset selected by the SFFS-KBCS-based feature selection method for the PCA, LDA, GDA, PCA+LDA, and PCA+GDA feature reduction methods from the 34 features, we varied the number of features from 1 to 34. The LS-SVM classifiers for the aforementioned five feature reduction methods were verified by LOO cross-validation for each number of features. A comparison of the CCRs obtained with different numbers of features for the different feature reduction methods with the LS-SVM classifiers is shown in Fig. 7; the best CCR of 82.78% was achieved when 32 dimensions were selected by the SFFS-KBCS for the GDA method. The overall CCRs of SFFS-KBCS+PCA+LS-SVM, SFFS-KBCS+LDA+LS-SVM, SFFS-KBCS+GDA+LS-SVM, SFFS-KBCS+PCA+LDA+LS-SVM, and SFFS-KBCS+PCA+GDA+LS-SVM were 78.73%, 71.14%, 82.78%, 69.87%, and 79.49%, respectively. The classification performance comparisons of the proposed feature reduction methods with the LS-SVM classifiers by the 2-fold, 10-fold, LOSO, and LOO cross-validation strategies are summarized in Table 2. These results demonstrate that the proposed SFFS-KBCS+GDA+LS-SVM scheme outperforms the other classification schemes in classifying positive/negative valence emotions.

Fig. 7. CCRs versus number of features selected by SFFS-KBCS with the LS-SVM classifier for the different feature reduction methods, by LOO cross-validation, in the positive/negative valence classification task. (Green: PCA. Orange: LDA. Red: GDA. Brown: PCA+LDA. Pink: PCA+GDA.)
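The feature-count sweep behind Figs. 7 to 9 can be summarized in a few lines of Python. The sketch below assumes hypothetical helper callables for the SFFS-KBCS ranking, the dimensionality reduction step (e.g., GDA), and the LS-SVM training and prediction; it simply re-runs LOO cross-validation for each candidate number of features and keeps the feature count with the highest CCR.

```python
import numpy as np

def sweep_feature_count(X, y, select_features, reduce_dims, train, predict, ccr):
    """For k = 1..n_features selected features, estimate the LOO cross-validated CCR
    and return the best feature count (the sweep shown in Figs. 7-9)."""
    n_samples, n_features = X.shape            # here n_features would be the 34 ECG features
    best_k, best_ccr = None, -np.inf
    for k in range(1, n_features + 1):
        cols = select_features(X, y, k)        # e.g., SFFS-KBCS ranking (placeholder callable)
        preds = np.empty(n_samples, dtype=y.dtype)
        for i in range(n_samples):             # leave-one-out loop
            tr = np.delete(np.arange(n_samples), i)
            Xtr, Xte = reduce_dims(X[tr][:, cols], y[tr], X[[i]][:, cols])  # e.g., GDA projection
            preds[i] = predict(train(Xtr, y[tr]), Xte)[0]
        score = ccr(y, preds)
        if score > best_ccr:
            best_k, best_ccr = k, score
    return best_k, best_ccr
```

For simplicity, this sketch computes the feature ranking once on all data before the LOO loop; a stricter protocol would re-rank the features inside each fold.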
5.2 Arousal Classification
To obtain an optimal feature subset selected by the SFFS-KBCS-based feature selection method for the PCA, LDA, GDA, PCA+LDA, and PCA+GDA feature reduction methods from the 34 features, we again varied the number of features from 1 to 34. A comparison of the CCRs obtained with different numbers of features for the different feature reduction methods with the LS-SVM classifier, verified by LOO cross-validation, is shown in Fig. 8; the best CCR was 72.91% when 18 dimensions were selected by the SFFS-KBCS for the GDA method. The overall CCRs of SFFS-KBCS+PCA+LS-SVM, SFFS-KBCS+LDA+LS-SVM, SFFS-KBCS+GDA+LS-SVM, SFFS-KBCS+PCA+LDA+LS-SVM, and SFFS-KBCS+PCA+GDA+LS-SVM were 72.91%, 68.10%, 72.91%, 67.34%, and 72.41%, respectively. The classification performance comparisons of the proposed feature reduction methods with the LS-SVM classifiers by the 2-fold, 10-fold, LOSO, and LOO cross-validation strategies are summarized in Table 3. These results demonstrate that the SFFS-KBCS+GDA+LS-SVM classification scheme obtains the best classification performance in the high/low arousal classification task.

Fig. 8. CCRs versus number of features selected by SFFS-KBCS with the LS-SVM classifier for the different feature reduction methods, by LOO cross-validation, in the high/low arousal classification task. (Green: PCA. Orange: LDA. Red: GDA. Brown: PCA+LDA. Pink: PCA+GDA.)
5.3 Four Types of Emotion Classification
Similar to the aforementioned emotion classification tasks, we varied the number of features from 1 to 34 to obtain an optimal feature subset for the PCA, LDA, GDA, PCA+LDA, and PCA+GDA feature reduction methods from the 34 features. The LS-SVM classifiers for the aforementioned five feature reduction methods were also verified by LOO cross-validation for each number of features. A comparison of the CCRs obtained with different numbers of features for the different feature reduction methods with the LS-SVM classifiers is shown in Fig. 9; the best CCR was 61.52% when 24 dimensions were selected by the SFFS-KBCS for the GDA method. The average accuracies of SFFS-KBCS+PCA+LS-SVM, SFFS-KBCS+LDA+LS-SVM, SFFS-KBCS+GDA+LS-SVM, SFFS-KBCS+PCA+LDA+LS-SVM, and SFFS-KBCS+PCA+GDA+LS-SVM were 78.73%, 75.19%, 80.76%, 75.19%, and 80.63%, respectively. The overall CCRs of the aforementioned five classification schemes were 57.47%, 50.38%, 61.52%, 50.38%, and 61.27%, respectively. The classification performance comparisons of the proposed feature reduction methods with the LS-SVM classifiers by the four cross-validation strategies are summarized in Table 4. These results demonstrate the ability of the SFFS-KBCS+GDA+LS-SVM classification scheme to classify the four types of emotions effectively.

Fig. 9. CCRs versus number of features selected by SFFS-KBCS with the LS-SVM classifier for the different feature reduction methods, by LOO cross-validation, in the four-emotion classification task. (Green: PCA. Orange: LDA. Red: GDA. Brown: PCA+LDA. Pink: PCA+GDA.)


6 DISCUSSION

6.1 Comparisons of Proposed Method with Other Existing Approaches for the MAHNOB-HCI Database
In this section, we compare the classification performance of the proposed SFFS-KBCS+GDA+LS-SVM scheme with that of the existing ANOVA+SVM scheme presented in [8], using only the ECG signals and the peripheral biosignals (ECG, GSR, RSP, and ST) in the MAHNOB-HCI database, respectively. The MAHNOB-HCI database developed by Soleymani et al. is an open database for emotion analysis. In this database, the EEG signal, peripheral biosignals (ECG, GSR, RSP, and ST), face and body videos, eye gaze, and audio were collected from 27 subjects (11 males and 16 females, age: 26.06±4.39 years). The emotions of each subject were elicited through 20 emotional video clips, and the subjects also provided self-reports on five questions: emotional label, arousal, valence, dominance, and predictability. According to [8], the affective states are divided into three levels on the valence and arousal axes based on the subjects' emotional labels. The user-independent classification rate was validated by the LOSO cross-validation strategy. The performance comparisons of our proposed scheme using only ECG signals and the ANOVA+SVM scheme using the peripheral biosignals are summarized in Table 5. The results show that the classification performance of the proposed SFFS-KBCS+GDA+LS-SVM scheme is similar to that of the existing ANOVA+SVM scheme, even though the former used only the ECG signals from [8] instead of the four peripheral biosignals. This validates that the proposed scheme can serve as an effective tool for classifying positive/negative valence and high/low arousal elicited through video clips by using only ECG signals.
TABLE 2
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED FEATURE REDUCTION METHODS WITH LS-SVM CLASSIFIERS BY
CROSS-VALIDATION IN POSITIVE/NEGATIVE VALENCE CLASSIFICATION TASK

TABLE 3
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED FEATURE REDUCTION METHODS WITH LS-SVM CLASSIFIERS BY
CROSS-VALIDATION IN HIGH/LOW AROUSAL CLASSIFICATION TASK

TABLE 4
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED FEATURE REDUCTION METHODS WITH LS-SVM CLASSIFIERS BY
CROSS-VALIDATION IN FOUR EMOTIONS CLASSIFICATION TASK


TABLE 5
CCRS OBTAINED FROM DIFFERENT CLASSIFICATION SCHEMES USING DIFFERENT SIGNALS FOR MAHNOB-HCI DATABASE

TABLE 6
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED CLASSIFICATION SCHEME WITH SOME EXISTING SCHEMES FOR POSITIVE/NEGATIVE VALENCE CLASSIFICATION TASK

Author | Signals | No. of Subjects | Emotions | Induction Method | Classification Scheme | CCR
Proposed method | ECG | 61 | Positive/Negative Valence | Music | SFFS-KBCS+GDA+LS-SVM | 82.78%
Proposed method | ECG | 27 | Positive/Neutral/Negative Valence | Video Clips | SFFS-KBCS+GDA+LS-SVM | 44.10%
Koelstra et al. [5] | BVP, EOG, EMG, GSR, ST, RSP | 32 | Positive/Negative Valence | Music Video Clips | LDA+Gaussian naïve Bayes | 62.70%
Soleymani et al. [8] | ECG, GSR, RSP, SC | 27 | Positive/Neutral/Negative Valence | Video Clips | ANOVA+SVM | 45.50%

TABLE 7
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED CLASSIFICATION SCHEME WITH SOME EXISTING SCHEMES FOR HIGH/LOW AROUSAL CLASSIFICATION TASK

Author | Signals | No. of Subjects | Emotions | Induction Method | Classification Scheme | CCR
Proposed method | ECG | 61 | High/Low Arousal | Music | SFFS-KBCS+GDA+LS-SVM | 72.91%
Proposed method | ECG | 27 | Excited/Medium/Calm Arousal | Video Clips | SFFS-KBCS+GDA+LS-SVM | 49.20%
Koelstra et al. [5] | BVP, EOG, EMG, GSR, ST, RSP | 32 | High/Low Arousal | Music Video Clips | LDA+Gaussian naïve Bayes | 57.00%
Soleymani et al. [8] | ECG, GSR, RSP, SC | 27 | Excited/Medium/Calm Arousal | Video Clips | ANOVA+SVM | 46.20%

TABLE 8
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED CLASSIFICATION SCHEME WITH SOME EXISTING SCHEMES FOR MULTI-EMOTION CLASSIFICATION TASK

Author | Signals | No. of Subjects | Emotions | Induction Method | Classification Scheme | CCR
Proposed method | ECG | 61 | Joy, Tension, Sadness, Peacefulness | Music | SFFS-KBCS+GDA+LS-SVM | 61.52% (4 emotions)
Kim et al. [2] | ECG, ST, SC | 50 | Sad, Anger, Stress, Surprise | Multimodal (audio, visual, and cognitive stimuli) | SVM | 78.40% (3 emotions); 61.80% (4 emotions)
Rainville et al. [6] | ECG, RSP | 43 | Fear, Anger, Sadness, Happiness | Recall of personal emotional episodes | PCA+Heuristic decision tree | 65.30% (4 emotions)
Rigas et al. [7] | EMG, ECG, GSR, RSP | 9 | Happiness, Disgust, Fear | Picture viewing (IAPS) | K-nearest neighbors | 62.70% (3 emotions)
Kim and André [4] | ECG, SC, EMG, RSP | 3 | Joy, Anger, Sad, Pleasure | Music | pLDA+EMDC | 69.70% (4 emotions)
Wen et al. [10] | OXY, GSR, ECG | 101 | Amusement, Anger, Grief, Fear, Baseline | Video Clips | Random forests classifier | 74.00% (5 emotions)
Gu et al. [1] | ECG, BVP, GSR, EMG, RSP | 28 | Positive & High arousal, Negative & High arousal, Positive & Low arousal, Negative & Low arousal | Picture viewing (IAPS) | K-nearest neighbors | 50.30% (4 emotions)

6.2 Comparisons of Proposed Method with Other Existing Approaches Using Biosignals
This study aimed to develop an automatic ECG-based emotion recognition algorithm that can effectively classify positive/negative valence, high/low arousal, and four types of emotions (positive valence/high arousal (Joy), negative valence/high arousal (Tension), negative valence/low arousal (Sadness), and positive valence/low arousal (Peacefulness)) using only ECG signals recorded during music listening. In this section, we compare existing classification schemes with our proposed scheme for the positive/negative valence, high/low arousal, and multi-emotion classification tasks. The performance comparisons of our proposed scheme and the existing schemes for the valence, arousal, and multi-emotion classification tasks are summarized in Tables 6, 7, and 8, respectively.
We used the proposed SFFS-KBCS+GDA+LS-SVM scheme to classify positive/neutral/negative valence using the ECG data collected from [8]; as shown in Table 6, the classification performance of the proposed SFFS-KBCS+GDA+LS-SVM scheme is similar to that of the ANOVA+SVM scheme [8]. In addition, the CCR of the proposed SFFS-KBCS+GDA+LS-SVM scheme is approximately 20% higher than that of the LDA+Gaussian naïve Bayes scheme using the BVP, EOG, EMG, GSR, ST, and RSP biosignals [5]. Thus, the proposed SFFS-KBCS+GDA+LS-SVM scheme is appropriate for positive/negative valence classification using only ECG signals.
According to Table 7, the CCRs obtained using the proposed SFFS-KBCS+GDA+LS-SVM scheme for the high/low arousal classification task were 72.91% and 49.20% when the participants' emotions were elicited by music and video clips, respectively. In addition, the overall CCR of the proposed SFFS-KBCS+GDA+LS-SVM scheme was better than that of the ANOVA+SVM scheme [8] and the LDA+Gaussian naïve Bayes scheme [5] by more than 3.00% and 15.91%, respectively. Therefore, the results indicate that the proposed SFFS-KBCS+GDA+LS-SVM scheme is the best combination for the high/low arousal emotion classification task when using only ECG signals.

The performance comparisons of our proposed scheme and six existing schemes for the multi-emotion classification task are summarized in Table 8. The results show that the performance of the proposed SFFS-KBCS+GDA+LS-SVM scheme using only ECG signals is similar to that of other schemes using multiple biosignals. This validates that the proposed SFFS-KBCS+GDA+LS-SVM scheme can achieve performance similar to that of other existing schemes even when using fewer biosignals. Furthermore, we find that the CCRs of the proposed SFFS-KBCS+GDA+LS-SVM scheme deteriorated from 82.78% and 72.91% to 61.52% when the number of classified emotion categories was increased from two to four. In other words, the number of emotion categories is an influential factor that degrades the performance of the feature reduction method and the LS-SVM classifier when the proposed scheme uses only ECG signals.

7 CONCLUSION
In this paper, we present an automatic ECG-based emotion recognition algorithm consisting of the SFFS-KBCS+GDA feature reduction method and LS-SVM classifiers that uses only ECG signals to discriminate positive/negative valence, high/low arousal, and four types of emotions (joy, tension, sadness, and peacefulness) elicited by listening to music. A total of 34 features were extracted from the time-domain, frequency-domain, and nonlinear analyses of ECG signals to provide discriminative information for the emotion recognition tasks. Subsequently, the degrees of the physiological changes in the aforementioned ECG features between the baseline stage and the music listening stage were obtained for the proposed SFFS-KBCS+GDA+LS-SVM classification scheme. Overall CCRs of 82.78%, 72.91%, and 61.52% were obtained by the LOO cross-validation strategy for the valence, arousal, and four-emotion classification tasks, respectively, using the ECG features with the proposed SFFS-KBCS+GDA+LS-SVM classification scheme on a multi-subject database consisting of 395 strongly elicited affective samples from 61 participants. According to these experimental results, the effectiveness of the proposed SFFS-KBCS+GDA+LS-SVM scheme has been validated. Moreover, the CCRs are higher than or similar to those reported in the literature reviewed in this paper when considering the different induction methods, emotion types, and numbers of subjects. In conclusion, we believe that the proposed automatic ECG-based emotion recognition algorithm can be considered effective for ECG-based emotion recognition tasks.

ACKNOWLEDGMENTS
This work was supported by the Ministry of Science and Technology of the Republic of China, Taiwan, under Grants MOST 106-3011-E-006-002 and MOST 106-2221-E-035-004.

REFERENCES
[1] Y. Gu, S. L. Tan, K. J. Wong, M. H. R. Ho, and L. Qu, "A biometric signature based system for improved emotion recognition using physiological responses from multiple subjects," in Proc. 8th IEEE Int'l Conf. Industrial Informatics, pp. 61-66, 2010.
[2] K. H. Kim, S. W. Bang, and S. R. Kim, "Emotion recognition system using short-term monitoring of physiological signals," Medical & Biological Engineering & Computing, vol. 42, pp. 419-427, 2004.
[3] D. Kulić and A. Croft, "Affective state estimation for human-robot interaction," IEEE Trans. Robotics, vol. 23, no. 5, pp. 991-1000, 2007.
[4] J. Kim and E. André, "Emotion recognition based on physiological changes in music listening," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 12, pp. 2067-2083, 2008.
[5] S. Koelstra, C. Mühl, M. Soleymani, J. S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras, "DEAP: A database for emotion analysis using physiological signals," IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 18-31, 2012.
[6] P. Rainville, A. Bechara, N. Naqvi, and A. R. Damasio, "Basic emotions are associated with distinct patterns of cardiorespiratory activity," International Journal of Psychophysiology, vol. 61, pp. 5-18, 2006.
[7] G. Rigas, C. D. Katsis, G. Ganiatsas, and D. I. Fotiadis, "A user independent, biosignal based, emotion recognition method," in Proc. 11th Int'l Conf. User Modeling, pp. 314-318, 2007.
[8] M. Soleymani, J. Lichtenauer, T. Pun, and M. Pantic, "A multimodal database for affect recognition and implicit tagging," IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 42-55, 2012.
[9] A. Kleinsmith and N. Bianchi-Berthouze, "Affective body expression perception and recognition: A survey," IEEE Trans. Affective Computing, vol. 4, no. 1, pp. 15-33, 2013.
[10] W. Wen, G. Liu, N. Cheng, J. Wei, P. Shangguan, and W. Huang, "Emotion recognition based on multi-variant correlation of physiological signals," IEEE Trans. Affective Computing, vol. 5, no. 2, pp. 126-140, 2014.
[11] K. Wac and C. Tsiourti, "Ambulatory assessment of affect: Survey of sensor systems for monitoring of autonomic nervous systems activation in emotion," IEEE Trans. Affective Computing, vol. 5, no. 3, pp. 251-272, 2014.
[12] R. Jenke, A. Peer, and M. Buss, "Feature extraction and selection for emotion recognition from EEG," IEEE Trans. Affective Computing, vol. 5, no. 3, pp. 327-339, 2014.
[13] M. Kusserow, O. Amft, and G. Tröster, "Modeling arousal phases in daily living using wearable sensors," IEEE Trans. Affective Computing, vol. 4, no. 1, pp. 93-105, 2013.
[14] F. Agrafioti, D. Hatzinakos, and A. K. Anderson, "ECG pattern analysis for emotion detection," IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 102-115, 2012.
[15] P. Ekman, "An argument for basic emotions," Cognition and Emotion, vol. 6, pp. 169-200, 1992.
[16] J. A. Russell, "A circumplex model of affect," Journal of Personality and Social Psychology, vol. 39, no. 6, pp. 1161-1178, 1980.

[17] J. Posner, J. A. Russell, and B. S. Peterson, "The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology," Development and Psychopathology, vol. 17, no. 3, pp. 715-734, 2005.
[18] M. D. van der Zwaag, J. H. Janssen, and J. H. D. M. Westerink, "Directing physiology and mood through music: Validation of an affective music player," IEEE Trans. Affective Computing, vol. 4, no. 1, pp. 57-68, 2013.
[19] S. D. Kreibig, "Autonomic nervous system activity in emotion: A review," Biological Psychology, vol. 84, pp. 394-421, 2010.
[20] G. Valenza, A. Lanatà, and E. P. Scilingo, "The role of nonlinear dynamics in affective valence and arousal recognition," IEEE Trans. Affective Computing, vol. 3, no. 2, pp. 237-249, 2012.
[21] M. Nardelli, G. Valenza, A. Greco, A. Lanata, and E. P. Scilingo, "Recognizing emotions induced by affective sounds through heart rate variability," IEEE Trans. Affective Computing, vol. 6, no. 4, pp. 385-394, 2015.
[22] M. Orini, R. Bailón, R. Enk, S. Koelsch, L. Mainardi, and P. Laguna, "A method for continuously assessing the autonomic response to music-induced emotions through HRV analysis," Medical & Biological Engineering & Computing, vol. 48, pp. 423-433, 2010.
[23] C. L. Krumhansl, "An exploratory study of musical emotions and psychophysiology," Canadian Journal of Experimental Psychology, vol. 51, pp. 336-352, 1997.
[24] A. L. Roque, V. E. Valenti, H. L. Guida, M. F. Campos, A. Knap, L. C. M. Vanderlei, L. L. Ferreira, C. Ferreira, and L. C. de Abreu, "The effects of auditory stimulation with music on heart rate variability in healthy women," Clinics, vol. 68, no. 7, pp. 960-967, 2013.
[25] M. Naji, M. Firoozabadi, and P. Azadfallah, "Classification of music-induced emotions based on information fusion of forehead biosignals and electrocardiogram," Cognitive Computation, vol. 6, no. 2, pp. 241-252, 2014.
[26] F. M. Vanderlei, L. C. de Abreu, D. M. Carner, and V. E. Valenti, "Symbolic analysis of heart rate variability during exposure to musical auditory stimulation," Alternative Therapies in Health and Medicine, vol. 22, no. 2, pp. 24-31, 2016.
[27] A. Gabrielsson and P. N. Juslin, "Emotion expression in music," Handbook of Affective Sciences, R. J. Davidson, K. R. Scherer, and H. H. Goldsmith, eds., pp. 503-534, Oxford Univ. Press, 2003.
[28] J. Pan and W. J. Tompkins, "A real-time QRS detection algorithm," IEEE Trans. Biomedical Engineering, vol. 32, no. 3, pp. 230-236, 1985.
[29] M. Mneimneh, E. Yaz, M. Johnson, and R. Povinelli, "An adaptive Kalman filter for removing baseline wandering in ECG signals," in Proc. Computers in Cardiology, pp. 253-256, 2006.
[30] P. de Chazal, C. Heneghan, E. Sheridan, R. Reilly, P. Nolan, and M. O'Malley, "Automated processing of the single-lead electrocardiogram for the detection of obstructive sleep apnea," IEEE Trans. Biomedical Engineering, vol. 50, no. 6, pp. 686-696, 2003.
[31] J. S. Wang, W. C. Chiang, Y. L. Hsu, and Y. T. C. Yang, "ECG arrhythmia classification using a probabilistic neural network with a feature reduction method," Neurocomputing, vol. 116, pp. 38-45, 2013.
[32] U. R. Acharya, K. P. Joseph, N. Kannathal, C. M. Lim, and J. S. Suri, "Heart rate variability: A review," Medical and Biological Engineering and Computing, vol. 44, pp. 1031-1051, 2006.
[33] Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology, "Heart rate variability: Standards of measurement, physiological interpretation and clinical use," European Heart Journal, vol. 17, pp. 354-381, 1996.
[34] J. P. Niskanen, M. P. Tarvainen, P. O. Ranta-aho, and P. A. Karjalainen, "Software for advanced HRV analysis," Computer Methods and Programs in Biomedicine, vol. 76, pp. 73-81, 2004.
[35] S. B. Park, Y. S. Noh, S. J. Park, and H. R. Yoon, "An improved algorithm for respiration signal extraction from electrocardiogram measured by conductive textile electrodes using instantaneous frequency estimation," Medical & Biological Engineering & Computing, vol. 46, pp. 147-158, 2008.
[36] R. Bailón, L. Sörnmo, and P. Laguna, "A robust method for ECG-based estimation of the respiratory frequency during stress testing," IEEE Trans. Biomedical Engineering, vol. 53, no. 7, pp. 1273-1285, 2006.
[37] W. A. Tiller, R. McCraty, and M. Atkinson, "Cardiac coherence: A new, noninvasive measure of autonomic nervous system order," Alternative Therapies, vol. 2, no. 1, pp. 52-65, 1996.
[38] M. P. Tulppo, T. H. Mäkikallio, T. E. S. Takala, T. Seppänen, and H. V. Huikuri, "Quantitative beat-to-beat analysis of heart rate dynamics during exercise," American Journal of Physiology-Heart and Circulatory Physiology, vol. 271, pp. 244-252, 1996.
[39] G. D. Vito, S. D. R. Galloway, M. A. Nimmo, P. Maas, and J. J. V. McMurray, "Effects of central sympathetic inhibition on heart rate variability during steady-state exercise in healthy humans," Clinical Physiology and Functional Imaging, vol. 22, pp. 32-38, 2002.
[40] J. McNames and M. Aboy, "Reliability and accuracy of heart rate variability metrics versus ECG segment duration," Medical & Biological Engineering & Computing, vol. 44, pp. 747-756, 2006.
[41] L. Wang, "Feature selection with kernel class separability," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 9, pp. 1534-1546, 2008.
[42] P. Pudil, J. Novovičová, and J. Kittler, "Floating search methods in feature selection," Pattern Recognition Letters, vol. 15, pp. 1119-1125, 1994.
[43] G. Baudat and F. Anouar, "Generalized discriminant analysis using a kernel approach," Neural Computation, vol. 12, no. 10, pp. 2385-2404, 2000.
[44] J. A. K. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," Neural Processing Letters, vol. 9, pp. 293-300, 1999.
[45] C. W. Hsu and C. J. Lin, "A comparison of methods for multiclass support vector machines," IEEE Trans. Neural Networks, vol. 13, no. 2, pp. 415-425, 2002.
[46] B. Liu, Z. Hao, and E. C. C. Tsang, "Nesting one-against-one algorithm based on SVMs for pattern classification," IEEE Trans. Neural Networks, vol. 19, no. 12, pp. 2044-2052, 2008.

Yu-Liang Hsu (M'17) received the B.S. degree in Automatic Control Engineering from Feng Chia University, Taichung, Taiwan, in 2004, and the M.S. and Ph.D. degrees in Electrical Engineering from National Cheng Kung University, Tainan, Taiwan, in 2007 and 2011, respectively. He is currently an Assistant Professor in the Department of Automatic Control Engineering, Feng Chia University. His research interests include computational intelligence, biomedical engineering, nonlinear system identification, and wearable intelligent technology.

Jeen-Shing Wang (S'94-M'02) received the B.S. and M.S. degrees in Electrical Engineering from the University of Missouri, Columbia, in 1996 and 1997, respectively, and the Ph.D. degree from Purdue University, West Lafayette, IN, in 2001. He is currently a Distinguished Professor in the Department of Electrical Engineering, National Cheng Kung University, Taiwan. His research interests include computational intelligence, wearable system design, big data analysis, and optimization.

Wei-Chun Chiang received the B.S. and Ph.D. degrees in Electrical Engineering from National Cheng Kung University, Tainan, Taiwan, in 2007 and 2014, respectively. He is currently a Postdoctoral Fellow in the Department of Electrical Engineering, National Cheng Kung University. His research interests include computational intelligence and biomedical signal analysis.

Chien-Han Hung received the B.S. degree in Engineering Science from National Cheng Kung University in 2011, and the M.S. degree in Electrical Engineering from National Cheng Kung University, Tainan, Taiwan, in 2013. Her research interests include signal processing and affective computing.
