Med Bio Eng Comput

DOI 10.1007/s11517-006-0107-4

TECHNICAL NOTE

Adaptive feature extraction for EEG signal classification


Shiliang Sun · Changshui Zhang

Received: 14 March 2006 / Accepted: 21 August 2006


© International Federation for Medical and Biological Engineering 2006

S. Sun (✉) · C. Zhang
State Key Laboratory of Intelligent Technology and Systems, Department of Automation, Tsinghua University, Beijing 100084, China
e-mails: shiliangsun@gmail.com; sunsl02@mails.tsinghua.edu.cn

C. Zhang
e-mail: zcs@mail.tsinghua.edu.cn

Abstract One challenge in current research on brain–computer interfaces (BCIs) is how to classify time-varying electroencephalographic (EEG) signals as accurately as possible. In this paper, we address this problem from the aspect of updating feature extractors and propose an adaptive feature extractor, namely adaptive common spatial patterns (ACSP). Through the weighted update of signal covariances, the most discriminative features related to the current brain states are extracted by the method of multi-class common spatial patterns (CSP). Pseudo-online simulations of EEG signal classification with a support vector machine (SVM) classifier for multi-class mental imagery tasks show the effectiveness of the proposed adaptive feature extractor.

Keywords Brain–computer interface (BCI) · Common spatial patterns (CSP) · EEG signal classification · Feature extraction

1 Introduction

The research of brain–computer interfaces (BCIs), which aim to provide their users with communication and control capabilities that do not depend on the brain's normal output channels of peripheral nerves and muscles, has attracted growing interest in recent years [9, 13–15]. Up to now, the study of BCI systems has mainly involved recording electroencephalographic (EEG) signals using surface electrodes, as this kind of recording is relatively convenient, harmless and inexpensive compared with other methods [2]. In this paper, we focus on the classification problem of EEG signals, a crucial component of general EEG-based BCIs. For an EEG-based BCI, adaptive learning algorithms are necessary in principle, because the recorded EEG signals usually change over time due to both biological and technical causes, such as subject attention, subject fatigue, disease progression, electrode impedances, amplifier noise, and environmental noise [13]. The high variability of EEG recordings makes it difficult to classify different EEG signals accurately and necessitates adaptive learning to boost the performance of existing BCIs.

With respect to adaptive learning for EEG signal classification, one can choose to update the classifiers or, alternatively, the feature extractors. However, up to now, there has not been much work addressing this problem. The adaptive update of a Bayesian statistical classifier with a Gaussian mixture model (GMM) has recently been studied in several papers [6, 7, 10, 11], but the performance is still very moderate. Wolpaw and McFarland [16] used the least-mean-square (LMS) algorithm to adaptively adjust weights for a two-dimensional movement control and found that people with severe motor disabilities could use scalp EEG signals to operate a robotic arm or a neuroprosthesis. In this paper, we propose to address the adaptive learning problem in EEG signal classification
via updating feature extractors. The basic feature extractor is named common spatial patterns (CSP), whose essence is to project EEG signals onto the most discriminative directions found after the simultaneous diagonalization of covariance matrices from different signal categories [8]. Because of the inherent variability of EEG patterns, the discriminative directions for classification tend to shift over time. To resolve this problem, the method of adaptive common spatial patterns (ACSP) is presented to improve the CSP method.

2 Adaptive feature extraction

2.1 Common spatial patterns and its extension to the multi-class paradigm

The original feature extractor of CSP can be seen as linear spatial filters that lead to signals which discriminate optimally between two conditions. It is based on a decomposition of raw multi-channel signals into spatial patterns that are extracted from the data of two populations of EEGs in a manner that maximizes their differences. These spatial patterns provide a weighting of the electrodes, which is derived directly from the data (for a detailed description and computational issues, please refer to [8]). Recently, some approaches have been presented to extend CSP to the multi-class paradigm, such as using CSP within the classifier, one versus the rest CSP (OVR), and approximate simultaneous diagonalization [3, 8]. In this paper, the idea of OVR is adopted to carry out feature extraction as suggested in [3]. To be exact, if three sets of EEG trials or segments corresponding to three different conditions A, B, and C are given (e.g., three classes of mental imagery tasks), we use each set to obtain covariance matrices $C_A$, $C_B$, and $C_C$, respectively. Then we can combine the samples belonging to conditions B and C to obtain a covariance matrix $C_{A\_Rest}$. Likewise, covariance matrices $C_{B\_Rest}$ and $C_{C\_Rest}$ can be obtained. Subsequently, we use each pair of covariance matrices (e.g., $C_A$ and $C_{A\_Rest}$) to carry out the standard CSP procedures. Thus the optimal spatial filters related to the corresponding conditions A, B, and C are retained. The final feature extractor is the combination of the three sets of spatial filters.

2.2 Adaptive common spatial patterns

Given K EEG trials or segments x(k) (k = 1,...,K), the covariance matrix can usually be estimated as

$$C(k) = \frac{1}{K}\sum_{k=1}^{K} x(k)\,x^{\top}(k), \tag{1}$$

where x(k) is an N × T EEG recording matrix, and N and T are, respectively, the number of recording electrodes and recording points. As we know, the CSP feature extractor adopts fixed covariances obtained from training sessions. But in the ACSP feature extractor, when encountering a new trial or EEG segment from test sessions, e.g., x(k), we update the corresponding covariance matrix as follows:

$$C(k) = \mu\,C(k-1) + (1-\mu)\,x(k)\,x^{\top}(k), \tag{2}$$

where μ ∈ [0, 1] is defined as the variability coefficient in our paper. This kind of adaptive strategy embodies the idea of a weighted average, i.e., the current covariance matrix is described as the weighted sum of the historical covariance and the covariance of the current recording segment. In general, the two covariance components on the right side of Eq. 2 are both necessary. C(k − 1) contains historical information and also benefits the robust computation of C(k). $x(k)\,x^{\top}(k)$ contains the newly added information and reflects the time-varying characteristic of EEG signals. Concretely, in the multi-class paradigm, if the new EEG segment x(k) belongs to condition A, we update the related variables as follows:

$$\begin{aligned}
C_A(k) &= \mu\,C_A(k-1) + (1-\mu)\,x(k)\,x^{\top}(k),\\
C_{B\_Rest}(k) &= \mu\,C_{B\_Rest}(k-1) + (1-\mu)\,x(k)\,x^{\top}(k),\\
C_{C\_Rest}(k) &= \mu\,C_{C\_Rest}(k-1) + (1-\mu)\,x(k)\,x^{\top}(k).
\end{aligned} \tag{3}$$

Then, we follow the standard procedures of CSP to derive projection directions.

The ACSP method is a superset of the basic CSP method because when the variability coefficient μ = 1, it degenerates to the CSP method. The selection of μ is very flexible and can reflect the differences between subjects. For subjects whose signals vary slowly, μ should take large values to retain more historical information, and vice versa. When the ACSP feature extractor is followed by a classification task in BCI applications, we employ the last EEG segment to update the feature extractor and then use the updated feature extractor to extract the features of the current segment. Based on the extracted features, the classifier trained on the previous training sessions gives the estimated label of the current segment. This process runs iteratively as new EEG segments are continually recorded and sent for classification.
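As a rough illustration of the update rules in Eqs. 2 and 3 and of the one-versus-rest CSP step, the following NumPy sketch may help. It is not the authors' implementation. The function names (`segment_cov`, `acsp_update`, `acsp_ovr_update`, `csp_filters`), the dictionary layout, and the random stand-in data are assumptions made for illustration. The CSP step uses the common route of whitening the composite covariance and then eigendecomposing one class covariance; the authoritative procedure is described in [8].

```python
import numpy as np


def segment_cov(x):
    """Covariance term x x^T for one N x T EEG segment (channels x samples)."""
    return x @ x.T


def acsp_update(c_prev, x, mu=0.95):
    """Eq. 2: weighted sum of the historical covariance and the new segment's term."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return mu * c_prev + (1.0 - mu) * segment_cov(x)


def acsp_ovr_update(cov, cov_rest, x, label, mu=0.95):
    """Eq. 3: a segment from class `label` updates that class's covariance and
    the 'rest' covariance of every other class. Both dicts are modified."""
    cov[label] = acsp_update(cov[label], x, mu)
    for other in cov_rest:
        if other != label:
            cov_rest[other] = acsp_update(cov_rest[other], x, mu)


def csp_filters(c_a, c_rest, n_filters=2):
    """Two-class CSP via simultaneous diagonalization of c_a and c_rest.

    Returns an (n_filters x N) array; rows are spatial filters ordered by
    decreasing variance share of condition A. Assumes c_a + c_rest is
    positive definite.
    """
    # Whitening transform P with P (c_a + c_rest) P^T = I.
    evals, evecs = np.linalg.eigh(c_a + c_rest)
    p = (evecs / np.sqrt(evals)).T
    # After whitening, both class covariances share the same eigenvectors.
    d, b = np.linalg.eigh(p @ c_a @ p.T)
    w = b.T @ p
    order = np.argsort(d)[::-1]
    return w[order[:n_filters]]


# Usage with random stand-in data (hypothetical sizes: 4 channels, 81 samples).
rng = np.random.default_rng(0)
classes = ["A", "B", "C"]
cov = {c: segment_cov(rng.standard_normal((4, 81))) for c in classes}
cov_rest = {c: sum(cov[o] for o in classes if o != c) / 2.0 for c in classes}

new_segment = rng.standard_normal((4, 81))
acsp_ovr_update(cov, cov_rest, new_segment, "A", mu=0.95)
# One set of filters per condition; the final extractor stacks all three sets.
filters = np.vstack([csp_filters(cov[c], cov_rest[c]) for c in classes])
```

Setting `mu=1.0` in this sketch leaves every covariance unchanged, which corresponds to the case where ACSP degenerates to the stationary CSP method.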


3 Experiments

The data set used in this paper contains EEG recordings from three normal subjects (denoted by S1, S2, S3, respectively) during mental imagery tasks, which are imagination of repetitive self-paced left-hand movements (class C1), imagination of repetitive self-paced right-hand movements (class C2) and generation of different words beginning with the same random letter (class C3). For every subject, there were three recording sessions acquired on the same day, each lasting about 4 min with breaks of 5–10 min in between [1]. Galán et al. [5] show that subjects S1, S2, and S3 represent three different levels of mental consistency, which are, respectively, consistent, scarcely consistent, and inconsistent. Hence the data set is representative, and in this paper it is employed to assess algorithms.

3.1 Signal preprocessing

As mental imagery tasks are mainly related to the activities of the brain's sensorimotor cortices, from the entire electrode cap we retain the central 15 electrodes covering this cortical region for signal analysis, which are F3, Fz, F4, Fc1, Fc2, C3, Cz, C4, Cp1, Cp2, Pz, P3, P4, Po3, Po4. Because the frequency content of spontaneous EEG recordings is normally below 50 Hz, we convert the original sampling rate to 128 Hz without fear of information loss. The signals are then re-referenced using the common average reference method [15]. Further, the continuous recording sessions are each partitioned into segments of 1 s with 0.5 s overlap, to provide an output every 0.5 s using the last second of data, as in [6]. As a result, an EEG segment contains 128 points, with a time length of 1 s. To emphasize the μ rhythm, which is highly discriminative in distinguishing different mental tasks, these segments are temporally filtered (forward and reverse filtering) with pass bands 8–13 Hz (for subjects S1 and S2) and 11–15 Hz (for subject S3). The filtering bands are determined by observing the first sessions of these three subjects, and the slight variation of filtering bands among subjects reflects individual differences. In addition, each EEG segment is normalized to have unit energy, and to avoid the edge effects of temporal filtering at the start and end of a segment, 81 points (with index from 20 to 100) are retained for analysis from the original 128 points.

3.2 Training the classifier

Throughout this paper, the support vector machine (SVM) classifier with a radial basis function (RBF) kernel is adopted to classify EEG segments [12]. The experimental paradigm for training a classifier is as follows. First we choose two recording sessions from the same subject, one serving as the training session and the other as the test session. Then we use the stationary CSP method to extract features from the training session, and use these features to train an SVM classifier. The number of sources (number of spatial patterns related to one condition, such as 'left-hand movement') varies from 2 to 6 in our experiments, for this is empirically enough to describe the sources of mental imagery tasks. Therefore, the maximum dimension of one EEG segment after feature extraction is 18 (6 dimensions for each kind of brain state). The optimal parameters, i.e., the penalty parameter for error terms in the SVM classifier, the variance parameter in the RBF kernel, and the number of sources related to each mental task, are selected through 20-fold cross validation [4] on each training set. Finally, the SVM is retrained using these optimal parameters and the whole training set.

3.3 Experimental results

To evaluate the ACSP feature extractor objectively, two other feature extractors are employed for performance comparisons. One is stationary CSP (SCSP), the standard CSP method, which does not update spatial filters at all on new test sessions. The spatial filters computed from the training session are used to implement feature extraction of the test session. The other feature extractor, which we name windowed CSP (WCSP), updates the signal covariance by adding the new EEG segment to, and removing the first segment from, the set of segments used to calculate the covariance matrix. Like ACSP, WCSP uses updated covariances to construct a new feature extractor and then extracts features of the current EEG segment. The variability coefficient μ in the ACSP feature extractor is set empirically to 0.95 in our experiments.

The classification rates on all the available recording sessions using the above three feature extractors are given in Table 1. Through a paired t test, no significant differences are found between feature extractors SCSP and WCSP (P value = 0.25), although the average accuracy of WCSP is slightly better than that of SCSP. However, with regard to SCSP and ACSP, 16 out of 18 classification results using ACSP are superior to those using SCSP. Through a paired t test, significant differences are found between feature extractors SCSP and ACSP (P value < 1 × 10^−3). From these results, we can draw the conclusion that the ACSP feature extractor is the
best among SCSP, WCSP, and ACSP, at least on the data set used. At the same time, this also demonstrates the necessity and applicability of our adaptive feature extractor.

Table 1 The classification accuracies (%) using different feature extractors

                                          Accuracy
Subject  Training session  Test session   SCSP    WCSP    ACSP
S1       1                 2              69.03   68.60   70.32
         1                 3              68.74   68.31   76.66
         2                 1              61.59   65.24   63.73
         2                 3              69.38   69.16   64.24
         3                 1              51.93   59.23   63.30
         3                 2              58.71   57.42   67.96
S2       1                 2              51.29   57.33   65.09
         1                 3              56.71   60.82   74.46
         2                 1              54.31   53.88   63.36
         2                 3              58.01   57.36   69.70
         3                 1              53.02   56.03   70.47
         3                 2              59.91   56.03   65.52
S3       1                 2              56.49   58.87   65.80
         1                 3              52.60   48.48   57.58
         2                 1              54.63   54.85   53.96
         2                 3              55.19   54.33   56.06
         3                 1              51.98   54.85   57.27
         3                 2              64.94   63.64   66.67
Average                                   58.25   59.14   65.12

4 Discussions and conclusions

In this paper, we propose the ACSP method for the feature extraction of EEG signals. Its efficacy and superiority over the SCSP and WCSP methods are validated through classification experiments on multiple recording sessions of three subjects. Theoretically, because EEG signals are time-varying, if we use SCSP to extract features on new EEG segments, the distribution of signal features will shift. In this sense, adaptive feature extraction has a sound basis. However, because WCSP treats all the EEG entries equally when computing the covariance matrix, it cannot capture the variability of EEG signals effectively. This is the reason why WCSP is inferior to ACSP, which weights the EEG entries appropriately.

It should be noted that although ACSP has generally obtained the best results, pitfalls also exist. From Table 1, we can see that there are several training and test session pairs where ACSP does not show improvements in classification. We provide an explanation for this phenomenon. Though the subjects are doing mental imagery tasks, there exist some irrelevant activities from the brain. These activities also change over time, and the present study does not take this into account. A complete adaptive algorithm should also address the adaptive behavior of the brain itself.

The computational cost of ACSP is low. Because the number of electrodes (data dimensions) in BCI utilities is usually small, such as 64, 128, and 256 (15 electrodes in our experiments), the simultaneous diagonalization of covariance matrices could almost be implemented in real time. This could completely meet the needs of online learning. Besides, as the subjects used in this article represent three different levels of mental consistency, which are respectively consistent, scarcely consistent, and inconsistent [5], we are confident that ACSP would function well for a wide range of users. We believe that ACSP would show more merits in future developments of BCI technology.

Acknowledgments The authors would like to thank IDIAP Research Institute of Switzerland for providing the analyzed data. The authors are also grateful to the anonymous editor and reviewers for giving valuable comments. This work was supported by the Chinese Natural Science Foundation (60475001) and Postdoctoral Science Foundation (2005038075).

References

1. Chiappa S, Millán JR (2005) Data set V <mental imagery, multi-class> [http://www.ida.first.fraunhofer.de/projects/bci/competition_iii/desc_V.html]. IDIAP Research Institute, Switzerland
2. Curran EA, Stokes MJ (2003) Learning to control brain activity: a review of the production and control of EEG components for driving brain–computer interface (BCI) systems. Brain Cogn 51:326–336
3. Dornhege G, Blankertz B, Curio G, Müller KR (2004) Boosting bit rates in noninvasive EEG single-trial classifications by feature combination and multiclass paradigms. IEEE Trans Biomed Eng 51:993–1002
4. Duda RO, Hart PE, Stork DG (2000) Pattern classification. Wiley, New York
5. Galán F, Oliva F, Guàrdia J (2005) BCI competition III, data set V: algorithm description [http://www.ida.first.fraunhofer.de/projects/bci/competition_iii/results/martigny/FerranGalan_desc.pdf]. Faculty of Psychology, University of Barcelona
6. Millán JR (2004) On the need for on-line learning in brain–computer interfaces. In: Proceedings of the international joint conference on neural networks, Budapest, Hungary
7. Millán JR, Renkens F, Mouriño J, Gerstner W (2004) Brain-actuated interaction. Artif Intell 159:241–259
8. Müller-Gerking J, Pfurtscheller G, Flyvbjerg H (1999) Designing optimal spatial filters for single-trial EEG classification in a movement task. Clin Neurophysiol 110:787–798
9. Nicolelis MAL (2001) Actions from thoughts. Nature 409:403–407
10. Sun S, Zhang C (2005) Learning on-line classification via decorrelated LMS algorithm: application to brain–computer interfaces. Lect Notes Comput Sci 3735:215–226
11. Sun S, Zhang C, Lu N (2005) On the on-line learning algorithms for EEG signal classification in brain computer interfaces. Lect Notes Comput Sci 3614:638–647
12. Vapnik V (2000) The nature of statistical learning theory. Springer, New York
13. Vaughan TM (2003) Guest editorial brain–computer interface technology: a review of the second international meeting. IEEE Trans Neural Syst Rehabil 11:94–109
14. Wolpaw JR, McFarland DJ, Neat GW, Forneris C (1991) An EEG-based brain–computer interface for cursor control. Electroencephalogr Clin Neurophysiol 78:252–259
15. Wolpaw JR, Birbaumer N, McFarland DJ, Pfurtscheller G, Vaughan TM (2002) Brain–computer interfaces for communication and control. Clin Neurophysiol 113:767–791
16. Wolpaw JR, McFarland DJ (2004) Control of a two-dimensional movement signal by a non-invasive brain–computer interface in humans. Proc Natl Acad Sci 101:17849–17854
