Early detection of a stress condition is beneficial to prevent long-term mental illnesses such as depression and anxiety. This paper introduces an accurate identification of stress/calm conditions from electrodermal activity (EDA) signals. The acquisition of EDA signals from a commercial wearable, as well as their storage and processing, is presented. Several time-domain, frequency-domain and morphological features are extracted over the skin conductance response of the EDA signals. Afterwards, classification is undertaken using several classical support vector machines (SVMs) and deep support vector machines (D-SVMs). In addition, several binary classifiers are also compared with SVMs in the stress/calm identification task. Moreover, a series of video clips evoking calm and stress conditions have been viewed by 147 volunteers in order to validate the classification results. The highest F1-scores obtained for SVMs and D-SVMs are 83% and 92%, respectively. These results demonstrate not only that classical SVMs are appropriate for classification of biomarker signals, but also that D-SVMs are very competitive in comparison to other classification techniques. In addition, the results have enabled drawing useful considerations for the future use of SVMs and D-SVMs in the specific case of stress/calm identification.
Keywords: Electrodermal activity; support vector machines; deep support vector machines; calm; stress.
2050031-1
2nd Reading
June 5, 2020 17:16 2050031
R. Sánchez-Reolid et al.
Int. J. Neur. Syst. Downloaded from www.worldscientific.com by UPPSALA UNIVERSITY on 06/09/20. Re-use and distribution is strictly not permitted, except for Open Access articles.

[…] creative, take the lead and effectively respond to those issues that require it. On the other hand, distress, or negative stress, tends to cause a state of mental fatigue that often leads to a variety of physical and mental disorders.4,5 Therefore, the development of early stress detection techniques seems necessary to prevent health problems related to distress.6–8

Nowadays, there is a great demand to develop and adapt new technologies to monitor and detect negative stress situations in daily life.9 Precisely, the Affective Computing field arises with this aim.10–12 It is a new area of computing research that is described as "computing which relates to, arises from, or deliberately influences emotions".10 In this sense, our research explores how to detect and evaluate the emotional state identified as the stress condition. This type of study usually analyzes various physiological signals that can be measured with noninvasive and nonintrusive devices, complemented by machine learning techniques.6,13,14 This approach is widely used in areas such as neurology to detect patterns of epilepsy attacks, hallucinations, mental distress, Alzheimer's disease and health wellness.15–18

Preliminary works have already demonstrated the feasibility of detecting stress from physiological measurements11,19,20 by analyzing the response of the peripheral nervous system.12,21–24 In this respect, recent advances in microelectronics allow the use of noninvasive, nonintrusive wearable devices for continuous monitoring of these physiological variables. These wearables are well valued as they are comfortable, lightweight, provide long battery life, allow wireless communication, and acquire the signals that will be analyzed later on.6,19,25

As in previous studies characterizing changes in emotional experiences,4,26–28 this work uses one of the most common physiological variables to determine the activation level, namely the electrodermal activity (EDA). EDA is a biomarker to quantify changes in the sympathetic nervous system by measuring the conductivity of the skin.20,27,29,30 These changes are caused by a change of activity in the sweat glands as a consequence of stimuli produced in the peripheral nervous system.6,29,31

Hence, this paper describes the use of support vector machines (SVMs) and deep support vector machines (D-SVMs) for the classification of both conditions. SVMs are powerful classifiers characterized by handling a great number of features with lower computational cost. Moreover, the SVM models already known can be included within the more modern approach of the D-SVMs.32–34 Our intention is to reuse our previous approaches in detecting the stress condition through SVMs22,23,35 and to implement D-SVMs and other binary classifiers36,37 for the sake of comparing the methods in this application area. In addition, an objective of this paper is to determine whether D-SVMs improve SVM-based models in discerning between calm and stress despite their increase in complexity and computational cost.

The remainder of this paper is structured as follows. Section 2 introduces a description of all materials and methods used to identify stress through EDA acquired from a commercial wearable. This section details the dataset obtained to carry out an experiment and how to process the data and obtain important features. It also introduces the architectures and configurations of the SVMs and D-SVMs employed. Afterwards, the results obtained for each of the tests carried out are shown in Sec. 3. Finally, a discussion is provided on the research in Sec. 4, ending up with the most relevant conclusions in Sec. 5.

2. Materials and Methods

2.1. Materials

2.1.1. Acquisition device

An important piece of this work consists in the acquisition, processing and procurement of a dataset to be used for accurate identification of the stress level. The Empatica E4 wristband38 is a wearable designed to measure and collect physiological signals like temperature, EDA, blood volume pressure and acceleration. This commercial device is used in clinical experiments and domestic environments for continuous monitoring of physiological variables. The Empatica E4 must be securely attached to the wrist so that the electrodes correctly touch the skin. Otherwise, when the device is not properly connected, it does not sample well and the captured data are not valid.

This paper has used EDA signals for the purpose of designing and comparing classical SVMs and D-SVMs for the identification of the stress condition. The EDA signals are obtained by measuring the potential when a small constant current is applied between two metallic electrodes (chromium-silver electrodes) located on the Empatica E4 wearable. Generally, the skin reacts under stress by
producing an increase of sweat. As a consequence, the conductivity of the skin grows. On the other hand, sweat production stops and skin conductivity decreases when subjected to a neutral or calm stimulus.

2.1.2. Dataset

[…] The scenes were shown randomly, and between clip and clip a distracting task was launched to eliminate the effect of the emotion previously evoked. Each clip had a duration of 47 s and the sampling frequency for the EDA signals was 4 Hz. While the participants received these stimuli, their physiological variables were acquired and saved.

A total of 147 people were recruited for experimentation. 68.4% of the participants were women aged 31.4 (8.03) and 31.6% were men aged 36.3 (4.99). The participants were all volunteers and were not rewarded for performing the experiment. The participants signed an agreement form informing them of the risks associated with carrying out the experiment. The participant could stop the experiment at any time if they felt uncomfortable. The experiment was designed following the Helsinki Declaration and it was approved by the Ethical Committee in Clinical Research at Universidad de Castilla-La Mancha according to the European and Spanish legislation.2,39

The experiment was conducted in a controlled environment. The experiment room was equipped with a comfortable seat and all video clips were displayed on a 27” monitor. Before starting the experiment, the wearable that monitored the physiological variables was placed on the participant. Then, the participant was left alone to perform the experiment.40

2.1.3. Performance metrics

In this binary classification problem, it has been decided to use a training method that establishes comparisons between two classes. Several metrics were used to calculate the performance of the models. Let us highlight that it is mandatory to evaluate if the models work in accordance with their design. Once the prediction was made on each type of classifier, four different types of answers were obtained: true positive (TP), false positive (FP), true negative (TN) and false negative (FN). TP and TN correspond to correct classifications, whereas FP and FN correspond to incorrect ones.

• The precision (P) is defined as the probability of making a correct positive classification. It is computed as the number of true positives divided by the total number of positive cases.

P = TP / (TP + FP). (1)

• The recall (R) is defined as the percentage of positive cases caught. Recall explains how sensitive the model is toward identifying the positive class. It is computed as the number of true positives divided by the sum of true positives and false negatives.

R = TP / (TP + FN). (2)

• The F1-score, also called F-measure, is a measure of a test's accuracy. It is defined as the harmonic mean between precision and recall. It is used as a statistical measure to rate performance.

F1-score = (2 × P × R) / (P + R) × 100. (3)

The F1-score was used as the basis for robustly estimating the performance of the implemented models.

2.2. Methods

2.2.1. Electrodermal activity processing

Once the signals had been obtained, a signal processing procedure was performed to calculate the fundamental features of the EDA signals. The data typically underwent several processing steps. The EDA signals had to be filtered in order to eliminate artefacts and noise recorded during the acquisition. For
this sake, a low-pass filter with a 4 Hz cut-off frequency (finite impulse response filter) and a Gaussian filter to smooth the signal were implemented to attenuate artefacts and noise. These steps were performed using LEDALAB,30 an open-source Matlab software for analysis of skin conductance data.

The EDA levels were not the same for all participants, mainly due to demographic information (e.g. […] signal.

SC = SCL + SCR = SCtonic + SCphasic, (4)
SC = SCtonic + Driverphasic ∗ IRF, (5)
SC = (Drivertonic + Driverphasic) ∗ IRF, (6)

where ∗ is the convolution operation and IRF is the impulse response function.

The convolution of SC data results in a conductive function that encompasses a tonic fraction, as shown in Eq. (5). If one of them can be estimated, the other is obtained implicitly. The tonic can be observed in absence of phasic activity and the phasic driver is obtained by subtracting the tonic driver from SC.29

SC / IRF = DriverSC, (7)
DriverSC = Drivertonic + Driverphasic, (8)
Driverphasic = SC / IRF − Drivertonic. (9)

The SCR is considered to be the effective signal for establishing an individual's response to a stimulus.42 Once the deconvolution process was completed, the SCR signals were ready for comparison among all the participants.

2.2.2. Feature extraction

Different features were selected in order to quantify the SCR signals (see Table 1). As shown in the table, several time-domain, frequency-domain and morphological metrics were computed on the SCR signal.

Table 1. Features obtained from phasic signals (SCR).

Analysis        Features
Temporal        M, SD, MA, MI, DR, D1, D2, FM, FD, SM, SSD
Morphological   AL, IN, AP, RM, IL, EL
Statistical     SK, KU, MO

The first and second derivative (D1 and D2) were also computed to see the tendencies in skin conductivity, in addition to their means (FM and FD) and their standard deviations (SM and SSD).23 Besides, several morphological features were chosen: arc length (AL), integral area (IN), normalized mean power (AP), perimeter and area ratio (IL), energy and perimeter ratio (EL), and, finally, three statistical parameters: skewness (SK), kurtosis (KU) and momentum (MO). Lastly, in relation to the frequency, the fast Fourier transform (FFT) through bandwidths F1 (0.1, 0.2), F2 (0.2, 0.3) and F3 (0.3, 0.4) was calculated.23,28

2.2.3. Dataset processing for stress detection

As explained above, the signals for each calm/stress state have a length of 47 s. The first 4 s and the last 3 s were eliminated to avoid possible artefacts due to the connection and other unwanted effects.

Next, two different studies were performed on the same dataset. Prior to launching stress detection, the first study consisted in dividing each of the SCR files into segments of different time intervals (splits between 1 and 40 s) from the beginning of the file (start of the stimulus), where no overlaps between the intervals were applied (see Fig. 1(a)). In the second study, overlapping was enabled between the adjacent time intervals (see Fig. 1(b)). In this case, for a split of n seconds, n − 1 seconds back will be taken. For example, if we work with 5-s segments, we take the first and the previous 4 s.

Two objectives were covered by using overlapping. The first objective was to obtain more […]
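The segmentation scheme just described can be sketched as follows, assuming the 4 Hz sampling rate and the n − 1 s overlap stated in the text; the function name `split_scr` and the synthetic signal are illustrative, not part of the original pipeline.

```python
import numpy as np

FS = 4  # Hz, EDA sampling rate of the Empatica E4 stream (from the text)

def split_scr(signal, split_s, overlap=False):
    """Cut an SCR trace into split_s-second segments.

    Without overlap, consecutive segments are disjoint; with overlap,
    each n-second segment starts 1 s after the previous one, so it
    shares n - 1 seconds with it (as in the second study).
    """
    win = split_s * FS
    step = FS if overlap else win  # advance 1 s or a full window
    return [signal[i:i + win]
            for i in range(0, len(signal) - win + 1, step)]

# 40 s of synthetic signal: 5-s splits give 8 disjoint or 36 overlapped segments
x = np.arange(40 * FS)
print(len(split_scr(x, 5)), len(split_scr(x, 5, overlap=True)))  # 8 36
```

With overlapping enabled, a 40-s file yields far more training segments than the disjoint splits, which is the first of the two objectives mentioned above.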
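The decomposition in Eqs. (4)–(9) can be illustrated numerically: if SC is the convolution of a driver with the IRF, dividing their spectra recovers the driver. The Bateman-style IRF shape, the impulse positions and all constants below are assumptions made for this sketch; Ledalab's actual decomposition is more elaborate.

```python
import numpy as np

fs = 4.0                                      # Hz, EDA sampling rate
t = np.arange(0, 40, 1 / fs)
irf = np.exp(-t / 2.0) - np.exp(-t / 0.75)    # assumed Bateman-like IRF
driver = np.zeros_like(t)
driver[[40, 90]] = 1.0                        # two hypothetical sudomotor bursts
sc = np.convolve(driver, irf)                 # SC = Driver * IRF, cf. Eq. (6)

# Driver_SC = SC / IRF in the frequency domain, cf. Eq. (7); padding to
# n >= len(sc) makes the circular division match the linear convolution.
n = 512
est = np.fft.irfft(np.fft.rfft(sc, n) / np.fft.rfft(irf, n), n)[: t.size]
assert np.abs(est - driver).max() < 1e-6      # the driver is recovered
```

Subtracting an estimated tonic driver from the recovered total driver then yields the phasic driver, as in Eq. (9).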
[…] weighted, using 10 neighbors and the Manhattan metric.

2.2.5. Classification with support vector machines (SVM)

SVMs were originally introduced by Vapnik in 1995 to solve a binary classification problem.44 Currently, […] dimensionality, it is necessary in some cases to make a transformation in the vector space to get a separation as optimal as possible. The concept of "optimal separation" is where the fundamental characteristic of SVM resides. This type of algorithm looks for the hyperplane that maximizes the distance (margin) to the points that are closest to it. That is why SVMs are also sometimes referred to as maximum margin classifiers. In this way, the vector points that are labeled in one category will be on one side of the hyperplane, and the cases that are in the other category will be on the other side. The search for the hyperplane of separation in the transformed spaces, normally of very high dimension, is based on the so-called kernel functions.44 From an algorithmic point of view, the geometric margin optimization problem represents a quadratic optimization problem with linear constraints that is solved by means of standard quadratic programming techniques.

[…]tions to this traditional architecture.33 In this paper, the focus is put on SVM-based fully-connected layers. This is how this kind of D-SVM concept arises. The idea emerges from the creation of a network of assembled SVMs (SVM-Ensemble).32 For this reason, the architecture of this model is a mixture between statistically inspired machine learning SVMs and a traditional artificial neural network configuration.

As shown in Fig. 2, a generic architecture can be exhibited for this type of model. This multi-layer architecture contains an Input Layer, a series of Hidden Layers and an Output Layer. Inside the k hidden layers there are m SVMs that deliver new features to
the next layer, ending up with the prediction in the output layer. In order to carry out this assembly, a series of operations must be carried out beforehand. The system first trains a set of separate SVM classifiers, getting the data randomly from the training dataset. Each SVM that composes the first D-SVM layer is trained in the standard way. The next SVM layers are trained with a combination of support vectors that belong to the previous layers. The process is repeated for each of the layers. Only those paths that produce the highest accuracy remain active. Finally, the output layer will provide better features for classification than if the SVMs worked separately.32,46

Once the assembly has been carried out, attention was paid to the flow of data belonging to the test and validation sets. We observed that the different features of the dataset are introduced randomly in the input layer. The output of this first layer generates a new dataset that will be used to train the next hidden layer. This new dataset is composed of the correct predictions of the previous model. Finally, when all features have passed through the hidden layers, they encounter the output layer. This output provides the final classification of our item.

A difference with traditional shallow learning algorithms is that better data are obtained for the final classification task with the increase of layers, according to the deep learning mechanism.

[…] support vector itself, and no amount of adjustment with C prevents over-fitting. The kernel is applied in each data instance to locate the original nonlinear observations in a higher-dimensional space where they become separable. It makes it possible to separate the different groups. Finally, degree establishes the polynomial order in the case of using a polynomial function. In most cases, for the same architecture, different configurations were obtained that performed well in our approach.44,47

Note that for the implementation of both SVMs and D-SVMs, the "scikit-learn",48 "Keras"49 and "TensorFlow"50 machine learning platforms were used under the Python programming language.

SVM Configurations. We started from a simple model. As shown in Table 2, different ranges were established for the values of the parameters. Our idea was to choose only two or three configurations that offered the best prediction capabilities according to the chosen metrics for each of the intervals.

Different tests were carried out with the values obtained in a first approximation. These ranges were oversized for all parameters. In general terms, the solutions that made the model converge were the simplest ones. The grid search method was used to tune the hyper-parameters (C, gamma and iterations) using different cross validations and negative squared root as a selection criterion.
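The tuning procedure just described can be sketched with scikit-learn's GridSearchCV. The grid values, the F1 scoring and the synthetic data below are illustrative stand-ins, not the ranges of Table 2.

```python
# Hedged sketch of the hyper-parameter grid search described above.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

param_grid = {
    "C": [0.1, 1, 10, 100],          # illustrative range, not Table 2
    "gamma": ["scale", 0.01, 0.1],
    "max_iter": [1_000, 10_000],     # "iterations" in the text
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="f1")
search.fit(X, y)
print(search.best_params_)
```

Each cross-validation fold scores every grid point, and the configuration with the best mean score is retained, mirroring the "two or three best configurations" selection described above.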
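The layered assembly described above can be sketched as a simplified stacking scheme: a first layer of SVMs trained on random subsets of the training data passes its decision values as new features to an output SVM. All names, numeric choices and the synthetic data are illustrative; this is not the authors' exact training procedure.

```python
# Minimal stacked-SVM sketch, assuming a simplified version of the assembly.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Input layer: m SVMs, each fit on a random half of the training set
layer = []
for _ in range(5):
    idx = rng.choice(len(X_tr), size=len(X_tr) // 2, replace=False)
    layer.append(SVC(kernel="rbf").fit(X_tr[idx], y_tr[idx]))

def layer_features(models, X):
    # Each SVM contributes its decision-function value as a new feature
    return np.column_stack([m.decision_function(X) for m in models])

# Output layer: one SVM trained on the features delivered by the first layer
out = SVC(kernel="rbf").fit(layer_features(layer, X_tr), y_tr)
acc = out.score(layer_features(layer, X_te), y_te)
print(f"stacked accuracy: {acc:.2f}")
```

A deeper D-SVM would insert further hidden layers of SVMs between the input layer and the output SVM, each consuming the features produced by the previous one.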
Table 3. Configuration of each layer in the D-SVMs.

Type      Input layer                     Hidden layers   Output layer
D-SVM1    SVM1, SVM2, SVM3, SVM4, SVM5    —               SVM3
D-SVM2    SVM1, SVM3, SVM4, SVM6          —               SVM6

[…] as more restrictive. Once the simulations had been carried out, the data obtained were used to carry out statistical analyses. Different ANOVA analyses were performed to establish differences between groups and their level of significance. The p-value is a statistic that establishes if there are significant differences between groups. For the two datasets (with/without overlapping), it was found that there […]
Table 7. Training time (mean and standard deviation) of the different classifiers.
[…] up quickly when it does not find combinations, as is the case here. On the other hand, the training time of the decision trees depends on the number of splits made on the data set. In our case, a fine configuration (100 splits) spends more time than a coarse one (4 splits). Moreover, the ensemble tree methods take more time than the methods based only on decision trees due to their more complex topology. The KNN-based methods take an intermediate time compared to the other classifiers. There are no significant differences for the intervals within the different configurations.

Focusing on SVMs, these need significantly more time than the other classical methods, except the SVM (linear) and SVM (RBF) configurations. This may be due to the learning methods used and the kernel functions associated with that process. By increasing the degree of the polynomial (SVM (linear), SVM (quadratic), SVM (cubic), SVM (polynomial, fourth degree) and SVM (polynomial, fifth degree)), the time needed to train grows. In the end, as told before, D-SVM-based methods take much longer to train than all the rest of the classifiers. D-SVMs take much longer due to two main reasons: the associated kernel function of the SVMs that compose them and the increase in the number of layers.

3.3. Stress detection for all classifiers

Table 8 shows the results of the simulations performed with all classifiers. Within decision trees, the configuration that works best for all signal intervals is the Tree (coarse). The best F1-score obtained is 65.32% and the area under the curve (AUC) is 0.59. For logistic regression, the best result is for the (20, 40] interval, with an F1-score of 73.00% and an AUC of 0.66. The best result of the ensemble methods for interval [1, 10] is Ensemble tree (RUS boosted), with an F1-score of 66.02% and an AUC of 0.59. In interval (10, 20] the best result is for Ensemble tree (bagged), with 72.32% for F1-score and 0.68 for AUC. Finally, Ensemble tree (boosted) shows the best result for the interval (20, 40], with 75.32% for F1-score and 0.60 for AUC. On the other hand, for linear discriminant the best result for F1-score is 43.65% with an AUC of 0.59. In the Naïve Bayes group, the best result is for the (10, 20] interval, with an F1-score of 43.50% and an AUC of 0.59. Lastly, the best performance in the KNN group is 82.69% of F1-score with an AUC of 0.81 for the (20, 40] time interval.

Using the values of the parameters obtained so far, several simulations were carried out using both SVMs and D-SVMs. Table 8 shows the results for the SVM and D-SVM models in terms of the F1-score and AUC metrics. The best results for the time interval [1, 10] s were 74% and 92% for SVM (polynomial, fifth degree) and D-SVM1, respectively. For the interval (10, 20], the best results were 78% for SVM (polynomial, fifth degree) and 89% for D-SVM1. Finally, for time interval (20, 40], the best results were 83% for SVM (polynomial, fifth degree) and 92% for D-SVM1. To sum up, distance-based classifiers (KNN and SVM) have a good performance in large time intervals. As the D-SVM models are fed by SVMs, their F1-score is the best in all intervals.

3.4. Influence of different groups of features in SVM and D-SVM

Refocusing on the intention of comparing SVMs and D-SVMs, another analysis carried out on the dataset was to verify how each type of parameter affected the metrics of the model. Different tests were carried out, in which the following results were obtained. These analyses consisted in training the same model, with the difference that a group of parameters was eliminated on each training set. Another of the analyses was to group the parameters in permutations of two into two categories. Table 9 shows the model that provided the best result measured as F1-score for each of the configurations.

3.5. Minimal time interval with/without overlapping for SVM and D-SVM

Thanks to the selection of the parameters that optimize each SVM, our workload was reduced substantially. It allowed us to determine with precision the minimum interval in seconds that allowed us to differentiate between a condition of calm and stress. Figures 3 and 4 show the F1-score obtained with each of the test sets for a cross validation CV = 10 on all SVMs and D-SVMs. As already mentioned, there are no significant differences between the results obtained with overlapping and without overlapping.
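The evaluation protocol above (F1-score under CV = 10) can be sketched as follows. Synthetic data stand in for the EDA dataset, which is not included here; the classifier mirrors the best-performing SVM (polynomial, fifth degree) named in the text.

```python
# Sketch of the 10-fold cross-validated F1 evaluation described above.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
scores = cross_val_score(SVC(kernel="poly", degree=5), X, y,
                         cv=10, scoring="f1")
print(f"mean F1: {scores.mean():.2%} (+/- {scores.std():.2%})")
```

Repeating this per time interval and per classifier configuration yields curves like those in Figs. 3 and 4.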
Table 8. Mean F1-score (mean and standard deviation) and AUC value of the different classifiers.

Classifier                         [1, 10] s                  (10, 20] s                 (20, 40] s
                                   F1-score      AUC          F1-score      AUC          F1-score      AUC
Tree (fine)                        42.76 (0.45)  0.57 (0.04)  43.10 (0.06)  0.66 (0.20)  60.32 (3.82)  0.59 (0.14)
Tree (medium)                      43.76 (1.35)  0.59 (0.12)  45.50 (0.67)  0.64 (0.20)  57.32 (1.43)  0.58 (0.14)
Tree (coarse)                      46.76 (1.15)  0.55 (0.03)  47.00 (0.67)  0.55 (0.12)  65.32 (0.28)  0.59 (0.08)
Logistic regression                59.62 (0.40)  0.65 (0.08)  70.00 (0.01)  0.68 (0.01)  73.00 (0.78)  0.66 (0.12)
Ensemble tree (boosted)            60.32 (3.82)  0.59 (0.14)  70.32 (0.37)  0.62 (0.11)  75.32 (0.30)  0.60 (0.01)
Ensemble tree (bagged)             61.20 (1.82)  0.59 (0.14)  72.32 (0.10)  0.68 (0.31)  71.32 (0.12)  0.60 (0.01)
Ensemble tree (RUS boosted)        66.02 (4.82)  0.59 (0.14)  70.32 (0.20)  0.68 (0.21)  70.85 (0.02)  0.62 (0.38)
Ensemble tree (subspace KNN)       64.17 (2.82)  0.59 (0.14)  70.32 (0.06)  0.69 (0.21)  70.65 (0.20)  0.61 (0.12)
Linear discriminant                26.03 (4.65)  0.56 (0.30)  43.65 (2.45)  0.59 (0.08)  40.32 (0.98)  0.53 (0.04)
Naïve Bayes (Gaussian)             36.80 (1.55)  0.56 (0.54)  43.50 (0.80)  0.59 (0.80)  40.32 (0.86)  0.56 (0.24)
Naïve Bayes                        42.76 (2.55)  0.57 (0.43)  43.50 (0.67)  0.59 (0.20)  40.32 (6.48)  0.53 (0.14)
KNN (fine)                         60.22 (0.23)  0.78 (0.00)  70.23 (0.15)  0.78 (0.09)  80.69 (0.25)  0.81 (0.04)
KNN (medium)                       61.62 (0.03)  0.75 (0.01)  71.30 (0.05)  0.81 (0.04)  82.69 (0.10)  0.81 (0.04)
KNN (coarse)                       61.20 (0.02)  0.75 (0.00)  72.00 (0.25)  0.76 (0.02)  81.90 (0.00)  0.81 (0.04)
KNN (cosine)                       65.36 (0.05)  0.78 (0.02)  70.30 (0.05)  0.82 (0.10)  80.69 (0.25)  0.81 (0.04)
KNN (weighted)                     60.64 (0.01)  0.79 (0.01)  71.30 (0.52)  0.82 (0.10)  80.69 (0.25)  0.81 (0.04)
SVM (linear)                       59.02 (0.14)  0.55 (0.80)  74.00 (0.05)  0.68 (0.01)  81.00 (0.78)  0.76 (0.32)
SVM (quadratic)                    55.43 (0.30)  0.59 (0.70)  75.38 (0.04)  0.72 (0.01)  82.01 (0.03)  0.78 (0.32)
SVM (cubic)                        64.30 (0.05)  0.69 (0.58)  76.00 (0.04)  0.76 (0.90)  81.10 (0.03)  0.86 (0.32)
SVM (polynomial, fourth degree)    65.00 (0.12)  0.80 (0.02)  74.03 (0.25)  0.80 (0.02)  81.03 (0.05)  0.82 (0.10)
SVM (polynomial, fifth degree)     74.00 (0.14)  0.78 (0.01)  78.10 (0.01)  0.81 (0.20)  83.00 (0.00)  0.80 (0.00)
SVM (RBF)                          68.09 (0.54)  0.78 (0.04)  76.31 (0.21)  0.78 (0.15)  80.87 (0.42)  0.80 (0.40)
D-SVM1                             92.01 (0.01)  0.80 (0.40)  89.10 (0.03)  0.78 (0.02)  92.01 (0.01)  0.80 (0.40)
D-SVM2                             84.31 (0.02)  0.76 (0.32)  84.12 (0.02)  0.77 (1.02)  84.12 (0.02)  0.77 (1.02)
D-SVM3                             72.01 (1.03)  0.74 (0.50)  78.50 (0.23)  0.75 (0.02)  79.00 (0.20)  0.76 (0.30)
In accordance with some related works, an F1-score threshold of 70% enabled differentiating between the two conditions (calm and stress),2,6,20,35 and the minimum interval for identifying a condition was 3 s for SVMs (see Fig. 3). When increasing the threshold to an outstanding 80%, only 4 s were necessary for D-SVMs (see Fig. 4) to differentiate between the conditions.

4. Discussion

Let us remind that this research is based on determining whether D-SVMs are useful to differentiate between calm/stress conditions in the same way as SVMs do.6,39 Currently, D-SVMs and SVMs, as well as other machine learning methods, have the potential to be used in applications requiring the detection of the emotional state of a person.

Previous studies have postulated that it is not possible to quickly determine stress from EDA signals because it is a slow physiological variable when compared with others (e.g. electroencephalographic signals and heart rate, among others).40,54 According to the literature, the time interval from the moment the stimulus occurs until the change in EDA appears is 3 or 4 s. Therefore, a challenge arises for shortening the minimum interval of time that a system requires
Table 9. Influence on F1-score (mean and standard deviation) of the parameter type for several SVMs and D-SVMs.
Fig. 3. F1-score variation for each SVM with optimal configuration at each time interval for CV = 10.
to classify the emotional condition. Hence, the results obtained in our study are comparable to and even improve the outcomes of more classical methods. In the literature related to the detection of stress with SVMs, it has been established that the accuracy range is between 75% and 90%.23,28,55,56

Moreover, the accuracy is higher for greater intervals of time. Most papers on the topic provide analyses that have to process data during 20 to 40 s as a rule. Although a paper has demonstrated that after 10 s the calm/stress condition of a person can be established with some precision,6 this interval has been shortened in our approach by using D-SVM configurations (down to 3 s). This is the reason why our focus has been different from other approaches. Instead of analyzing a signal of a certain time length, we were interested in knowing what is the minimum interval for differentiating between calm and stress
Fig. 4. F1-score variation for each D-SVM with optimal configuration at each time interval for CV = 10.
conditions. At the same time, we were interested in distress is produced (stressors). In this respect, the
understanding why this happens. In our study, the results achieved in this work have given an F1-score
minimum time interval using an SVM has been cal- of 83% for SVM and 92% for D-SVM. Note that the
culated to be 4 s. On the other hand, the minimum best results have always been obtained with the RBF
detection interval is 3 s for the D-SVM architecture. We believe that this 1-s decrease is due to the fact that this type of architecture is much faster at discovering the patterns in the parameters that compose the EDA signals.

We consider that each of the models described in this work performs well in detecting both calm and stress conditions, thanks to the great ease inherent to SVMs in handling a large number of features. On the contrary, it becomes more difficult to administer the amount of data generated in D-SVMs, since the new training datasets generated in the successive layers must be managed. If this is not considered seriously, the design of D-SVMs can introduce numerous errors that grow larger as the number of layers increases.46 In our case, these errors were avoided by generating all the datasets for each of the layers at the same time, always ensuring that they were randomly separated.

A thorough review of the literature on stress detection reveals that most works agree that stress is a very difficult subject and its measurement is not an easy task. There are many markers that can be used, for instance, EDA, blood volume pressure, accelerometers, electroencephalography, and so on. Many algorithms can be applied, and many forms of stress can be observed.11,21,22,57,58 Moreover, the results provided in all these works should be taken with caution due to the existence of many ways

kernel. In comparison with the results obtained in other related studies, it is possible to conclude that those obtained in our approach are comparable, and for most works slightly better. In other approaches, generally using more than one sensor, stress detection performance ranges between 80% and 95%. Our method uses solely skin conductance response features to provide a high performance comparable to other works.

On the other hand, we have to consider a number of constraints and limitations. First, the experiment was conducted in a controlled environment on middle-aged volunteers. For this reason, the results cannot be generalized beyond the age range of the participants (18 to 44 years). The second limitation is the quality of the data obtained. In acquisition systems based on physiological signals, it is common for artefacts to occur which damage or worsen the signal. In our case, we ran an experiment very similar to some of our previous ones, so the problems that normally appear in this type of acquisition system had already been solved.

5. Conclusions

This paper has introduced D-SVMs as novel methods for the detection of stress/calm conditions. Until now, this kind of classification through EDA signals has mostly been carried out by using SVMs, although several contributions have been made with
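The layer-wise dataset handling discussed above, namely drawing all per-layer training sets at once by a single random partition before any layer is trained, can be sketched as follows. This is only an illustrative sketch: the use of scikit-learn's SVC, the synthetic feature matrix, the split sizes and all variable names are assumptions for the demo, not the paper's exact implementation.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for a matrix of SCR features (rows = windows,
# columns = time/frequency/morphological features); sizes are arbitrary.
X = rng.normal(size=(600, 12))
y = (X[:, :4].sum(axis=1) > 0).astype(int)  # stand-in calm(0)/stress(1) labels

# Generate ALL per-layer datasets up front with one random split, so each
# layer trains on data never seen by the earlier layers.
X_a, X_b, y_a, y_b = train_test_split(X, y, test_size=0.5, random_state=0)
X_b, X_test, y_b, y_test = train_test_split(X_b, y_b, test_size=0.4,
                                            random_state=0)

# Layer 1: a classical SVM on the raw features of its own subset.
layer1 = SVC(kernel="rbf").fit(X_a, y_a)

# Layer 2: an SVM trained on its own held-out subset, augmented with
# the outputs (decision values) produced by layer 1.
z_b = layer1.decision_function(X_b).reshape(-1, 1)
layer2 = SVC(kernel="rbf").fit(np.hstack([X_b, z_b]), y_b)

# Evaluation on data unseen by either layer.
z_test = layer1.decision_function(X_test).reshape(-1, 1)
acc = layer2.score(np.hstack([X_test, z_test]), y_test)
print(f"stacked accuracy: {acc:.2f}")
```

Because the partition is performed once, adding further layers only means reserving more disjoint subsets at the start, which is the safeguard against the layer-compounding errors mentioned above.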
Ambient Intelligence — Software and Applications (Springer, 2012), pp. 137–144.
10. R. W. Picard, Affective Computing (MIT Press, 2000).
11. R. W. Picard, Automating the recognition of stress and emotion: From lab to real-world impact, IEEE Multimedia 23(3) (2016) 3–7.
12. A. Fernández-Sotos, A. Fernández-Caballero and J. M. Latorre, Elicitation of emotions through music: The influence of note value, in Artificial Computation in Biology and Medicine, eds. J. M. Ferrández, J. R. Álvarez Sánchez, F. de la Paz, F. J. Toledo-Moreo and H. Adeli (Springer, 2015), pp. 488–497.
13. M. Seeger, Gaussian processes for machine learning, Int. J. Neural Syst. 14(2) (2004) 69–106.
14. L. Tian and A. Noore, A novel approach for short-term load forecasting using support vector machines, Int. J. Neural Syst. 14(5) (2004) 329–335.
Ambient Intelligence and Smart Environments, eds. P. Novais and S. Konomi, Vol. 21 (IOS Press, 2016), pp. 416–425.
23. R. Zangróniz, A. Martínez-Rodrigo, J. Pastor, M. López and A. Fernández-Caballero, Electrodermal activity sensor for classification of calm/distress condition, Sensors 17(10) (2017) 2324.
24. A. Fernández-Sotos, A. Fernández-Caballero and J. M. Latorre, Influence of tempo and rhythmic unit in musical emotion regulation, Front. Comput. Neurosci. 10 (2016) 80.
25. J. Choi, B. Ahmed and R. Gutierrez-Osuna, Development and evaluation of an ambulatory stress monitor based on wearable sensors, IEEE Trans. Inf. Tech. Biomed. 16(2) (2011) 279–286.
26. A. Fernández-Caballero, A. Martínez-Rodrigo, J. M. Pastor, J. C. Castillo, E. Lozano-Monasor, M. T. López, R. Zangróniz, J. M.
in Understanding the Brain Function and Emotions, eds. J. Ferrández, J. Álvarez Sánchez, F. de la Paz, J. Toledo and H. Adeli (Springer, 2019), pp. 202–211.
36. S. Betti, R. M. Lova, E. Rovini, G. Acerbi, L. Santarelli, M. Cabiati, S. Del Ry and F. Cavallo, Evaluation of an integrated system of wearable physiological sensors for stress monitoring in working environments by using biological markers, IEEE Trans. Biomed. Eng. 65(8) (2017) 1748–1758.
37. H. F. Posada-Quintero and K. H. Chon, Innovations in electrodermal activity data collection and signal processing: A systematic review, Sensors 20(2) (2020) 479.
38. Empatica, E4 wristband from Empatica (2019), https://www.empatica.com/en-eu/research/e4/.
39. R. Sánchez-Reolid, A. S. García, M. A. Vicente-Querol, L. Fernández-Aguilar, M. T. López, A. Fernández-Caballero and P. González, Artificial neural networks to assess emotional states from brain-computer interface, Electronics 7(12) (2018) 384.
40. J. J. Braithwaite, D. G. Watson, R. Jones and M. Rowe, A guide for analysing electrodermal activity (EDA) & skin conductance responses (SCRs) for psychological experiments, Psychophysiology 49(1) (2013) 1017–1034.
41. M. Sahlgren and R. Cöster, Using bag-of-concepts to improve the performance of support vector machines in text categorization, in 20th Int. Conf. Computational Linguistics (ACM, 2004), p. 487.
42. W. Boucsein, D. C. Fowles, S. Grimnes, G. Ben-Shakhar, W. T. Roth, M. E. Dawson and D. L. Filion, Publication recommendations for electrodermal measurements, Psychophysiology 49(8) (2012) 1017–1034.
43. MathWorks, Classification Learner (2020), https://www.mathworks.com/help/stats/classificationlearner-app.html.
44. C. Cortes and V. Vapnik, Support-vector networks, Mach. Learn. 20(3) (1995) 273–297.
45. H. Drucker, C. J. Burges, L. Kaufman, A. J. Smola and V. Vapnik, Support vector regression machines, in Advances in Neural Information Processing Systems (ACM, 1997), pp. 155–161.
46. A. Abdullah, R. C. Veltkamp and M. A. Wiering, An ensemble of deep support vector machines for image categorization, in 2009 Int. Conf. Soft Computing and Pattern Recognition (IEEE, 2009), pp. 301–306.
47. C. Silva and B. Ribeiro, Towards expanding relevance vector machines to large scale datasets, Int. J. Neural Syst. 18(1) (2008) 45–58.
48. Scikit-learn, Scikit-learn: Machine learning in Python (2020).
49. Keras, Keras: The Python deep learning library (2020).
50. TensorFlow, An end-to-end open source machine learning platform (2020).
51. I. Ullah and A. Petrosino, About pyramid structure in convolutional neural networks, in 2016 Int. Joint Conf. Neural Networks (IEEE, 2016), pp. 1318–1324.
52. Y. Xin, S. Wang, L. Li, W. Zhang and Q. Huang, Reverse densely connected feature pyramid network for object detection, in Asian Conf. Computer Vision (Springer, 2018), pp. 530–545.
53. S. Singhania, N. Fernandez and S. Rao, 3HAN: A deep neural network for fake news detection, in Int. Conf. Neural Information Processing (Springer, 2017), pp. 572–581.
54. J. Hernandez, I. Riobo, A. Rozga, G. D. Abowd and R. W. Picard, Using electrodermal activity to recognize ease of engagement in children during social interactions, in 2014 ACM Int. Joint Conf. Pervasive and Ubiquitous Computing (ACM, 2014), pp. 307–317.
55. J. Zhai and A. Barreto, Stress detection in computer users based on digital signal processing of noninvasive physiological variables, in 2006 Int. Conf. IEEE Engineering in Medicine and Biology Society (IEEE, 2006), pp. 1355–1358.
56. F.-T. Sun, C. Kuo, H.-T. Cheng, S. Buthpitiya, P. Collins and M. Griss, Activity-aware mental stress detection using physiological sensors, in Int. Conf. Mobile Computing, Applications, and Services (Springer, 2010), pp. 282–301.
57. M. Salai, I. Vassányi and I. Kósa, Stress detection using low cost heart rate sensors, J. Health. Eng. 2016 (2016) 5136705.
58. H. Eisenbarth, L. J. Chang and T. D. Wager, Multivariate brain prediction of heart rate and skin conductance responses to social threat, J. Neurosci. 36(47) (2016) 11987–11998.
59. D. Belo, J. Rodrigues, J. R. Vaz, P. Pezarat-Correia and H. Gamboa, Biosignals learning and synthesis using deep neural networks, Biomed. Eng. Online 16(1) (2017) 115.
60. S. Alhagry, A. A. Fahmy and R. A. El-Khoribi, Emotion recognition based on EEG using LSTM recurrent neural network, Emotion 8(10) (2017) 355–358.
61. M. Ahmadlou and H. Adeli, Enhanced probabilistic neural network with local decision circles: A robust classifier, Integr. Comput.-Aided Eng. 17(3) (2010) 197–210.
62. M. H. Rafiei and H. Adeli, A new neural dynamic classification algorithm, IEEE Trans. Neural Netw. Learn. Syst. 28(12) (2017) 3074–3083.