
2004 P300 Data Analysis Competition








-- Brendan Allison 8-29-04


What is a BCI?

A brain computer interface is a communication system in which users send

information using brain activity alone. All other interfaces, such as keyboards, mice,
or voice systems, require people to send information through peripheral nerves and
muscles. Disabled individuals may have difficulty using conventional interfaces.
Some people suffer from "locked in syndrome," meaning they are completely unable
to control any muscles. For such users, a BCI is the only hope for ever
communicating with loved ones, controlling even simple devices like televisions or
lamps, or otherwise expressing oneself.

BCIs can be divided into two general categories. Most BCIs are noninvasive BCIs,
meaning that no surgery is required. They instead utilize EEG activity measured at the
scalp with an electrode cap. This sort of technology has been in widespread use for
many decades. Electrode caps are safe, painless, and fairly easy to apply and remove.
Some BCIs, called invasive BCIs, instead record the brain's activity via sensors placed
on the surface of the brain or even inside it. Obviously, the neurosurgery required for
this implantation is a very serious procedure, and is only performed when medically
necessary. Invasive BCIs can ultimately provide more information about brain
activity than noninvasive BCIs, because the EEG signal recorded at the scalp is badly
smeared by the skull and other tissues. However, the surgery is very difficult and
expensive, and the jury is still out on the long term safety and effectiveness of
implanted systems.

Here are a few websites with more information about BCIs:

BrainLab: This page is a bit old, but has some good information. Dr. Allison's PhD
thesis can be found under "other writings," along with some slideshows providing a
nice introduction to BCIs and other files.

UCSD: This is the lab where Dr. Allison worked as a graduate student. This site
contains a slew of links as well as slideshows from Dr. Allison's class last summer
about BCIs. Note these slideshows are meant for university students, while the ones
on the BrainLab are meant for anyone.

Wadsworth Research Center: This site contains information about BCI2000, an

excellent program developed by the Wadsworth Research Center that is being used by
many research groups. It also contains some videos of BCIs in operation.

There are many more sites out there! Please see the links on the UCSD website, or
just search the web for "brain computer interface" or "brain machine interface."

What is a P300 BCI?

There are several different types of BCI systems, each defined by the type of brain
activity they utilize. They all have various advantages and drawbacks in terms of
speed, training time, ease of use, whether surgery is needed, and other factors. A P300
BCI uses a type of brainwave called the P300. P300 BCIs are relatively fast and easy
to use, require no training or surgery, and work with nearly all adults. The
main drawback of P300 BCIs is that they can only provide one binary signal - that is,
a YES or a NO. Other BCIs, such as mu BCIs, can provide more information.

Graphical representation of a P300 BCI. A user wears an electrode cap that
measures the brain's response to flashes. She pays attention to flashes of a certain
target letter (such as "K") while ignoring other flashes. The user's brain produces a
different response to the flashes that contain the target letter (red line) than to other
letters (blue line). By determining which flashes produced a red line, the BCI can
infer which letter the user is attending to. The more effectively a pattern recognition
system can discriminate the red line from the blue line, the better the BCI. This
difference may seem obvious in the graph above, but this graph was derived by
averaging together dozens of trials. It's much harder with fewer trials.

Competition objectives: Why have this competition?

One very important factor in determining the speed and accuracy of a P300 BCI (or
any BCI) is the pattern recognition approach applied to the EEG data. In the case of
P300 BCIs, a better pattern recognition system could recognize the P300 more
quickly and accurately. This is not the first effort to explore different approaches.
Farwell and Donchin (1988), Donchin et al. (2000), Bayliss' doctoral thesis (2001),
Meinicke et al. (2002), and Sajda et al. (2003) all compared different techniques.
These papers generally concluded that different approaches can yield significantly
different results. As there remain many approaches that have not been tested, it is
likely that this competition will further elucidate the best approaches. This
competition also utilizes a wider variety of P300 datasets, from different types of
P300 BCIs, with both healthy and disabled subjects. To date, all published P300 BCI
papers have used healthy subjects. There are currently no papers published regarding
P300 BCIs in locked in patients - the people who need these systems most.

Of course, it is impossible to ever determine the best pattern recognition approach

with finality. The old gunslingers in the American Wild West had a saying, "There's
always someone faster on the draw." Similarly, there will always be a better pattern
recognition algorithm out there. This fact should not be discouraging to those
interested in the competition. On the contrary, competitions such as these help
identify new directions for research, and spur increased attention to the field.

Comparison to Recent Competition

A BCI data analysis competition was held in 2003, organized by Dr. Benjamin
Blankertz. See his competition web page for more information.

The current competition differs from the 2003 competition in several ways:

1) All datasets in this competition involve P300 BCIs. The 2003 competition
contained only one dataset with P300 data.

2) Further, the datasets in this competition are from different types of P300 BCIs,
such as the classic "Donchin speller," Allison's "multiple flash approach," the
Wadsworth Research Center's "Sequential P3 BCI," and other systems. These BCIs
yield somewhat different P300s that may be best studied with different types of
pattern recognition approaches.

3) Because there will be several data sets from different labs, the number of channels
varies from 5 to 64. It is possible that different approaches will perform differently
depending on the number of channels available. Though probably insignificant, the
impedances, referencing, and sampling rates vary as well.

4) This competition will include P300 BCI data from locked in patients. No such data
were available at the 2003 competition.

5) The previous competition had the problem of having five different winners for the
same P300 dataset. This is because many of the competitors were so good, they
attained 100% accuracy with all of the data available! We are taking three steps to
prevent this. One is the aforementioned use of several different datasets, and thus a lot
more data. Some of the data, such as our ALS patient data, is much more challenging.
The second is that, in the previous competition, fifteen single trials were made
available. In this competition, differing numbers of single trials will be made
available, including in some cases only one single trial. It is highly unlikely that any
competitor will attain 100% accuracy across all datasets. Finally, the judging rules
below are different from the last competition and make a tie absolutely impossible.

6) As with any new competition, we expect to have some different competitors and
approaches in this competition.

This competition is not meant to compete with Dr. Blankertz's 2003 competition nor
his upcoming data analysis competition. On the contrary, both Dr. Blankertz and I feel
these two competitions complement each other well. Participants in this competition
are encouraged to contact Dr. Blankertz regarding his upcoming competition. For
more information, please see his web page for his 2003 competition.


Schedule

Sample data are currently available for download on our FTP server. Please
contact Dr. Allison for more information. More sample data will be placed online
before the competition begins. These dates are tentative.

SOMETIME: Competition begins! All data will be online.

SOMETIME AFTER THAT: Deadline for submission.

A LITTLE AFTER THAT: Announcement of the results on this web site.

Submission

• There is no prerequisite for entry. New people, ideas, and approaches are
always welcome!
• To enter the competition, please contact Dr. Allison, whose contact
information is below. You will receive information to access our FTP server.
By entering the competition, you agree to all rules on this web page.
• One researcher/research group may submit results to one or to several data
sets. Winners will be announced for each data set separately. However,
researchers must submit to at least three different data sets to compete for the
title of grandmaster. Please see the "judging rules" below to see how winners
will be determined.
• One researcher may NOT submit multiple results to one data set. She/he must
choose her/his favorite one. However ... (see next point)
• Multiple submissions to one data set from one research group are possible.
The sets of involved researchers do not have to be disjoint, but (1) the 'first
author' (contributor) should be distinct, and (2) approaches should be
substantially different.
• All submissions should be sent to Dr. Allison.
• We send a confirmation for each submission we get. If you do not receive a
confirmation within 3 days please resend your email and inform Dr. Allison.
• The submission must include the estimated labels, names and affiliations of all
involved researchers and a short note on the involved processing techniques.
• Anyone submitting results for a data set must agree to sign the "grandmaster" certificate
unless they are the overall winner (see Judging and Winning, below).
• SPECIAL EXCEPTION: The current rules may create an ethical quandary
for P300 data from ALS patients. What if a participant develops a
substantially better pattern recognition approach for ALS patients very soon
after the competition begins? It would be inappropriate to keep this
development secret for months until the submission deadline, as it could be of
immediate benefit. Therefore, for the ALS dataset only, the submission rules
differ from the above in one way. One researcher MAY submit different
results to the ALS data set, with the understanding that these results may be
field tested in an ALS patient. However, only the last submission will be
judged for the competition. All submissions prior to the final submission will
be ignored for purposes of judging. Furthermore, to prevent an unfair
advantage for that competitor, the results of the field test will not be shared
with anyone outside of this laboratory (except the patient and his/her
caretaker) until after the submission deadline. This policy may be frustrating
for competitors eager to hear advance results of their submission, but this is
the only way to preserve the integrity of this competition.

Data sets

Each data set (except the ALS data set) will contain data from at least three different
healthy subjects. Each subject's directory will contain two top level directories, named
"labelled" and "unlabelled." Each of these two directories will contain at least three
subdirectories, named 8, 3, and 1. Each of these numbered subdirectories will contain
at least 10 different runs. The numbers 8, 3, and 1 reflect the number of single trials
contained in each of their daughter subdirectories. Thus each subdirectory under the "1"
directory contains only one trial. Within each of the other two directories, the single
trials will all be from the same run, within a few minutes of each other. That is, the
"8" directories will contain groups of 8 single trials from the same run, though the "3"
and the "1" directories will each feature data from a different run.

Here is a graphical representation of this directory structure:

                     subject
                   /         \
           labelled           unlabelled
           /  |  \             /  |  \
          8   3   1           8   3   1
         /|\ /|\ /|\         /|\ /|\ /|\
        RUN NUMBER 1 2 3 ... etc

The "labelled" directory serves as a training set, and the "unlabelled" directory is the
test set. These directories will never feature the exact same data, since that would
eliminate any challenge in classifying the test set. The goal of each competitor is
to identify whether each of the subdirectories in the "unlabelled" directories contains a
P300 - that is, whether it reflects EEG evoked by a target flash or an ignored flash.
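
As a rough illustration of the layout described above, here is a small Python sketch that lists every unlabelled directory a competitor would have to classify. The subject directory names ("subject1" and so on) and the use of plain run numbers as directory names are assumptions made for this example; the actual names on the FTP server may differ.

```python
from itertools import product

def unlabelled_dirs(subjects, trial_counts=(8, 3, 1), runs_per_count=10):
    """Return relative paths of every unlabelled run directory to classify.

    The path pattern subject/unlabelled/<trial count>/<run> is an assumed
    naming scheme that mirrors the directory diagram, not a confirmed one.
    """
    paths = []
    for subject, count, run in product(subjects, trial_counts,
                                       range(1, runs_per_count + 1)):
        paths.append(f"{subject}/unlabelled/{count}/{run}")
    return paths

# Minimum case: 3 subjects, 3 trial-count directories, 10 runs each.
paths = unlabelled_dirs(["subject1", "subject2", "subject3"])
print(paths[0])    # subject1/unlabelled/8/1
print(len(paths))  # 90
```

With the minimum numbers given in the text, this yields 3 x 3 x 10 = 90 directories, one label per directory.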

In the case of datasets with more than one condition (sets 2, 3, 4, and 6), each
directory will also specify the condition. For example, data set 2 will specify whether
the data are from the single or multiple flash condition and the target probability.

Data set 1: 6 x 6 Donchin matrix

provided by the BrainLab at Georgia State University

Data set 2: 8 x 8 matrix with single vs. multiple flashes

provided by the Cognitive Neuroscience Laboratory at UC San Diego

Data set 3: 4 x 4 and 12 x 12 matrix

provided by the Cognitive Neuroscience Laboratory at UC San Diego

Data set 4: Sequential BCI for answering questions

provided by the BrainLab at Georgia State University

Data set 5: Sequential BCI for answering questions (ALS patients)

provided by the BrainLab at Georgia State University

Data set 6: 1 x 6 matrix with robot control icons

provided by the BrainLab at Georgia State University

Data set 7: 6 x 6 Donchin matrix under extreme conditions

provided by the BrainLab at Georgia State University

Judging and Winning

The following rules will apply to each data set.

Each submission will earn one point for each correctly identified numbered
subdirectory. For example, if data set 1 contained the minimum number of unlabelled
directories (3 per each of 3 subjects, each containing 10 runs), a maximum of 90
points could be earned. Most data sets will contain well above the minimum number
of unlabelled directories.

Please note that this differs somewhat from the 2003 competition. In it, each run
reflected a subject choosing several targets in sequence to spell a word. The data in
this competition instead reflect a subject choosing a single target. When data were
collected for data set 1, subjects spelled a three letter sequence, but these will be
broken up into individual letters for the competition.

In the event of a tie, points will be doubled for correct classification of all "1"
directories, since recognizing a single trial is a more challenging feat than multiple
trials. If a tie remains, points will be doubled for correct classification of any "3"
directory. In the very unlikely event a tie remains, all people in the tie will be given a
copious amount of new P3 data, and given one month for a special "tiebreaker"
competition. Since this will delay the determination of who is the winner, the website
shall reflect that this "tiebreaker" competition is pending until it is decided. If a tie
remains after the tiebreaker, the organizer shall go insane, thus ending the
competition.
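
The point and tie-break rules can be sketched in Python as follows. This is only one reading of the rules: it assumes the second tie-break is applied on top of the first (so "1" directories stay doubled when "3" directories are doubled), and the team names and counts are invented for illustration.

```python
def score_levels(correct):
    """correct maps a trial-count directory name ("8", "3", "1") to the number
    of correctly classified directories. Returns (base, tiebreak1, tiebreak2)."""
    base = correct["8"] + correct["3"] + correct["1"]
    tiebreak1 = base + correct["1"]        # "1" directories count double
    tiebreak2 = tiebreak1 + correct["3"]   # assumed: "3" directories then count double too
    return (base, tiebreak1, tiebreak2)

def rank(entries):
    """Sort competitor names best-first; tuples compare level by level,
    so a later level only matters when all earlier levels are tied."""
    return sorted(entries, key=lambda name: score_levels(entries[name]),
                  reverse=True)

# Hypothetical entries with the same base score of 75 points.
entries = {
    "team_a": {"8": 30, "3": 25, "1": 20},
    "team_b": {"8": 28, "3": 25, "1": 22},
}
print(score_levels(entries["team_a"]))  # (75, 95, 120)
print(rank(entries))                    # ['team_b', 'team_a']
```

If two entries still compare equal on all three levels, the special "tiebreaker" competition described above would apply.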

There will be one winner for each data set. Winners are encouraged to submit their
results to IEEE Transactions on Biomedical Engineering (TBME), IEEE Transactions
on Neural Systems and Rehabilitation Engineering (TNSRE), or a similar journal.
Winners will also be announced at the 2005 BCI conference and given a certificate
designed and signed by Drs. Allison and Moore.

Participants must submit entries to at least three datasets to be considered for the title
of overall winner. The scores for all the datasets that participant used will be
averaged. Whoever has the highest average score shall be the overall winner, or
"grandmaster." In addition to the fact that the grandmaster will very probably win at
least one of the datasets, the grandmaster gains two additional prizes. The first is a
crown designed by Dr. Allison. The grandmaster shall be crowned at the 2005 BCI
conference, if present. To add to the excitement, please note that Dr. Allison has no
relevant design experience and is a terrible artist. The second is a special certificate.
All participants except the grandmaster are required to sign this certificate
acknowledging the grandmaster has "bragging rights" until the next competition.
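
For illustration, the grandmaster rule could be computed along these lines. The sketch assumes each data set score has already been put on a common scale (here, fraction correct), which the rules above do not spell out; the participant names and scores are made up.

```python
def grandmaster(scores, min_datasets=3):
    """scores maps participant -> {data set number: score}. Returns the eligible
    participant with the highest mean score, or None if nobody is eligible."""
    averages = {
        name: sum(per_set.values()) / len(per_set)
        for name, per_set in scores.items()
        if len(per_set) >= min_datasets   # at least three data sets required
    }
    return max(averages, key=averages.get) if averages else None

# Hypothetical scores, expressed as fraction correct per data set.
scores = {
    "team_a": {1: 0.90, 2: 0.80, 5: 0.70},           # mean 0.80, eligible
    "team_b": {1: 0.95, 2: 0.99},                    # only two data sets
    "team_c": {1: 0.85, 3: 0.75, 4: 0.74, 6: 0.70},  # mean 0.76, eligible
}
print(grandmaster(scores))  # team_a
```

Note that team_b has the best individual scores in this toy example but is not considered, because it entered fewer than three data sets.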

Dissemination of Results

The results of this competition, including a list of winners and their pattern
recognition approaches, will be announced at the 2005 BCI conference in New York,
posted on this website, and submitted to IEEE TNSRE or TBME. Winning
participants are also encouraged to submit their results to either of these or another
relevant journal. The competition organizer is not affiliated with either of these IEEE
journals, and decisions regarding publication rest solely with their editorial boards.
However, IEEE TNSRE has been receptive to articles from the recent data analysis
competition.

Each participant agrees to deliver an extended description (one page in IEEE style) of
the algorithm used for the publication in case she/he is the winner for one of the
data sets. In this publication and any other dissemination of results, each participant
must reference the group that recorded the data and cite at least one of the papers
listed in the respective description for each data set.

If interested in the IEEE TNSRE publication from the 2003 competition organizers,
please see Sajda et al. (2003). An example of an IEEE publication from one of the
winners of that competition is Mensh et al. (2003).


People

This competition is being coordinated by Brendan Allison, Ph.D. My PhD thesis
involved P300 BCIs, but included a thorough review of all BCIs. I graduated from
UC San Diego in 2003, where I worked in the Cognitive Neuroscience Laboratory
under Dr. Jaime Pineda. After a 3 month internship at the Wadsworth Research
Center in New York, I now work at the BrainLab at Georgia State University. Our
lab director is Dr. Melody Moore.

Wadsworth BCI Dataset

NIPS*2001 BCI Classification Contest

Contact: Gerwin Schalk (518-486-2559)

Background and Rationale

Developing brain-computer interfaces¹ (“BCIs”) is a complex problem that
involves issues in a variety of disciplines. Because a BCI system consists of two adaptive
controllers, the user and the system, many factors determine the speed and accuracy of
communication. Consequently, the problem is significantly more complex than a simple
classification problem. It is conceivable, for example, that a simple system with well-
matched components (e.g., feature extraction, translation algorithm, and feedback) can
outperform a more sophisticated system with components that are not well matched to
each other. Furthermore, algorithms need to be tested in situations that are as realistic as
possible, since inferences most often depend on many assumptions. Any conclusions
drawn from offline analyses must ultimately be verified by online testing.

This comprehensive dataset represents a complete record of actual BCI
performance with 4 possible choices (i.e., target positions in a target task) in each trial.
We hope that participants in this contest will be able to achieve significantly better
classification accuracies than the ones achieved online (as reported below). If the most
successful result exceeds previous online performance, we propose to test the algorithm
online.
The Objective in the Contest

We collected data from three subjects (as described in more detail below): ten
daily sessions per subject and six runs per session (see Figure 1). The objective in this
contest is to use the “labeled” sessions (i.e., sessions 1-6) to train a classifier and to test
this classifier by predicting the correct class (i.e., the target position) for each trial in the
unlabeled sessions (i.e., sessions 7-10 for each subject).

¹ “A brain-computer interface is a communication system that does not depend on the brain’s normal output
pathways of peripheral nerves and muscles.” (Wolpaw et al., 2000).

Specifically, for each subject and each of the sessions 7-10, participants are asked
to submit a file named in the form ssnnnRES.mat (e.g., “AA007RES.mat”) that contains
the variables runnr, trialnr, and predtargetpos. runnr shall specify the run number in the
session, trialnr shall specify the trial number within the run, and predtargetpos shall
specify the predicted class of this trial (i.e., the predicted position of the target on the
screen). Each proposed classifier needs to be causal (i.e., can only use previous data to
make a prediction, that is, data from earlier sessions or from earlier trials in the current
session).
In order to be most practical in online operation, the algorithm should allow for
constant (1-dimensional) feedback during operation and it should require as little
previous data as possible. The classifier’s parameters could be estimated from the whole
training set (sessions 1-6), but better classification results might be achieved by
continuously updating classification parameters (in online operation, parameters are
updated after each trial).

We will then calculate, for each contest participant, the average classification
accuracy over all three subjects and four test sessions (sessions 7-10). We will do this by
comparing the predicted target position in each trial with the actual target position in the
trial during online operation. The participant with the highest average accuracy is the
winner of the contest.
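
The scoring described above might be sketched as follows. The text does not say whether the average is taken over sessions or over pooled trials; this sketch assumes an unweighted mean of per-session accuracies, and the labels in the example are invented.

```python
def accuracy(predicted, actual):
    """Fraction of trials whose predicted target position matches the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

def contest_score(results):
    """results maps (subject, session) -> (predicted, actual) label lists.
    Returns the unweighted mean of the per-session accuracies (an assumption)."""
    per_session = [accuracy(p, a) for p, a in results.values()]
    return sum(per_session) / len(per_session)

# Toy example with two sessions of four trials each.
results = {
    ("AA", 7): ([1, 2, 3, 4], [1, 2, 3, 1]),   # 3 of 4 correct
    ("AA", 8): ([2, 2, 4, 4], [2, 3, 4, 1]),   # 2 of 4 correct
}
print(contest_score(results))  # 0.625
```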

The Data Set

10 sessions per subject

Sessions 1-6: data labeled (i.e., training set)
Sessions 7-10: data unlabeled (i.e., test set)

Figure 1: This figure illustrates the labeled and unlabeled sessions in each subject’s data
set (i.e., training and test set, respectively). Each session is stored in a different Matlab
file (e.g., “AA001.mat”) and consists of six runs. Each run contains about 32 individual
trials.
Task Used Online

The subject sat in a reclining chair facing a video screen and was asked to remain
motionless during performance. Scalp electrodes recorded 64 channels of EEG
(Sharbrough et al., 1991; for channel assignment see Figure 2), each referred to an
electrode on the right ear (amplification 20,000; band-pass 0.1-60 Hz). All 64 channels
were digitized at 160 Hz and stored. A small subset of channels was used to control
cursor movement online as described below.

The subjects used mu or beta rhythm amplitude (i.e., frequencies between 8-12
Hz or 18-24 Hz, respectively) to control vertical cursor movement toward the vertical
position of a target located at the right edge of the video screen. Data were collected
from each subject for 10 sessions of 30 min each. Each session consisted of six runs,
separated by one-minute breaks, and each run consisted of about 32 individual trials.
Each trial began with a 1-s period during which the screen was blank. Then, the target
appeared at one of four possible positions on the right edge. One sec later, a cursor
appeared at the middle of the left edge of the screen and started traveling across the
screen from left to right at a constant speed. Its vertical position was controlled by the
subject's EEG (update rate: 10 times per sec) as described below. The subject’s goal was
to move the cursor to the height of the correct target. When the cursor reached the right
edge, the screen went blank. This event signaled the end of the trial. See Figure 4 for a
depiction of this paradigm.

Cursor movement was controlled as follows. Ten times/sec, the last 200 ms of
digitized EEG from 1-3 channels over sensorimotor cortex was re-referenced to a
common average reference or a Laplacian derivation (McFarland et al., 1997b) and then
submitted to frequency analysis by an autoregressive algorithm (McFarland et al., 1997a)
to determine amplitude (i.e., the square root of power) in a mu and/or beta rhythm
frequency band. The amplitudes for the 1-3 channels were combined to give a control
signal that was used as the independent variable in a linear equation that controlled
vertical cursor movement. Electrode position and center frequency remained constant for
a particular subject, but certain parameters were updated online after each trial (e.g.,
parameters that estimate the signal’s dynamics (i.e., the slope and the intercept of the
linear equation that translated rhythm amplitude into cursor movement)).
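
Two of the steps above, re-referencing to a common average and translating a control signal into cursor movement, can be sketched in a few lines of Python. This is a simplified illustration, not the online implementation: the weighted sum used to combine channels and the exact form `slope * control + intercept` are assumptions, and the autoregressive spectral estimate is omitted entirely.

```python
def common_average_reference(sample):
    """Subtract the mean across channels from every channel of one time sample."""
    mean = sum(sample) / len(sample)
    return [value - mean for value in sample]

def cursor_step(amplitudes, weights, slope, intercept):
    """Combine the band amplitudes of 1-3 channels into a control signal and
    map it to a vertical cursor movement with a linear equation.

    The weighted sum and the slope/intercept form are illustrative assumptions.
    """
    control = sum(w * a for w, a in zip(weights, amplitudes))
    return slope * control + intercept

print(common_average_reference([1.0, 2.0, 3.0]))        # [-1.0, 0.0, 1.0]
print(cursor_step([4.0, 6.0], [0.5, 0.5], 2.0, -10.0))  # 0.0
```

In the second call the control signal is 5.0, so a slope of 2.0 and an intercept of -10.0 give no net movement; larger amplitudes would move the cursor one way and smaller amplitudes the other.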

Figure 2: This diagram illustrates electrode designations (Sharbrough, 1991)
and channel assignment numbers as used in our experiments.

The Signal

For each session (e.g., AA001.mat), the EEG signal is stored in one big matrix
signal (total # samples x 64 channels). Other variables define the run number (run), trial
number within the run (trial), and sample number within the run (sample). Please refer to
Figure 3 for an illustration.

For each trial, events are coded by the variables TargetCode, ResultCode,
Feedback, and InterTrialInterval. TargetCode defines the position of the target on the
screen (i.e., 0 when no target was on the screen, and 1-4 when a target was on the screen
(TargetCode was 1 when the target was on the top)). ResultCode specified the target that
was actually selected in online operation (i.e., the class predicted online by the classifier).
(For the unlabeled sessions (sessions 7-10), TargetCode and ResultCode were set to –1,
whenever they otherwise would be > 0.) InterTrialInterval was 1 when the screen was
blank and Feedback was 1 when the user was using his/her EEG to control the cursor
(i.e., the time period used online for classification). Please refer to Figure 4 for an
overview of the time-course of the paradigm used in each trial and for the relevant
variables.

Thus, to get the indices of all samples in a trial specified by cur_trial where the
subject controlled the cursor (which can be used as row numbers in the variable signal),
one would use the following code:

    % rows of signal (all runs) where trial == cur_trial and the cursor was under user control
    trialidx = find((trial == cur_trial) & (Feedback == 1));

Other periods (e.g., the inter-trial-interval preceding each trial, or the period in
which the target was on the screen but the cursor was not) are included because they
could be used to calculate baseline parameters for each trial.
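
For readers working outside Matlab, the same selection can be sketched in plain Python. The per-sample vectors below are toy data, and the function names are made up for this example; note that Python indices are zero-based, whereas Matlab's find returns one-based indices.

```python
def feedback_indices(trial, feedback, cur_trial):
    """Zero-based row indices of samples in trial cur_trial with Feedback == 1."""
    return [i for i, (t, f) in enumerate(zip(trial, feedback))
            if t == cur_trial and f == 1]

def baseline_indices(trial, inter_trial_interval, cur_trial):
    """Zero-based row indices of the blank-screen period of trial cur_trial."""
    return [i for i, (t, b) in enumerate(zip(trial, inter_trial_interval))
            if t == cur_trial and b == 1]

# Toy per-sample vectors for two short trials (real trials span many more samples).
trial                = [1, 1, 1, 1, 2, 2, 2, 2]
feedback             = [0, 0, 1, 1, 0, 0, 1, 1]
inter_trial_interval = [1, 0, 0, 0, 1, 0, 0, 0]

print(feedback_indices(trial, feedback, 2))              # [6, 7]
print(baseline_indices(trial, inter_trial_interval, 1))  # [0]
```

The second helper picks out the inter-trial interval, which is one of the periods that could serve as a per-trial baseline.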

High-Level Organization of Data

run   trial   sample   signal (samples x channels)

1     1       1        s(1,1)  ...  s(1,64)
1     1       2
1     .       3
1     .       4
1     2       5
1     2       6
1     .       7
1     .       8
1     3       9
1     3       10
.     .       .
.     .       .
.     .       n
2     1       1
2     1       2
2     .       3
2     .       4
2     2       5
2     2       6
2     .       7
2     .       8
2     3       9
2     3       10
.     .       .
.     .       .
.     .       n        s(t,1)  ...  s(t,64)

t = total # samples in session

Figure 3: This figure illustrates the content of each Matlab file (e.g., “AA001.mat”).
Channel numbers (e.g., columns in the variable signal (i.e., a matrix of total # samples x
64 channels) correspond to channel numbers in Figure 2. See text for a description of the
vectors run, trial, and sample. Additional variables label samples within each trial (refer
to Figure 4 for details).

Variable Name        Stage:  1  2  3  4  5

TargetCode                   0  2  2  2  2
Feedback                     0  0  1  1  0
InterTrialInterval           1  0  0  0  0
ResultCode                   0  0  0  0  1

Figure 4: This figure illustrates the time course of variables that
encode the events within each trial. In stage 1, the screen is blank. In stage 2, a target
appears in one out of four locations on the right edge of the screen (TargetCode equals
one when the target was on the top). In stages 3 and 4, the user produces EEG to control
a cursor vertically so as to hit the target. In stage 5, the cursor either hit (TargetCode
equals ResultCode) or missed (TargetCode does not equal ResultCode) the target. The
next trial then starts again at stage 1 with a blank screen. In the unlabeled sessions,
TargetCode is –1 in stages 2-5 and ResultCode is –1 in stage 5 (instead of specifying the
actual target positions).
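
The hit/miss logic of stage 5 can be illustrated with a short Python sketch that reads one (TargetCode, ResultCode) pair per trial and computes the percentage of hits. The vectors are toy data with one sample per stage, and the function names are invented for this example.

```python
def trial_outcomes(trial, target_code, result_code):
    """Return {trial number: (target, result)} read from the first sample of
    each trial where ResultCode is positive (stage 5 in Figure 4)."""
    outcomes = {}
    for t, target, result in zip(trial, target_code, result_code):
        if result > 0 and t not in outcomes:
            outcomes[t] = (target, result)
    return outcomes

def online_accuracy(outcomes):
    """Percentage of trials in which the selected target equals the actual target.
    Returns 0.0 when no labeled outcome is available (e.g., unlabeled sessions)."""
    if not outcomes:
        return 0.0
    hits = sum(target == result for target, result in outcomes.values())
    return 100.0 * hits / len(outcomes)

# One sample per stage (1-5) for two toy trials from a single run.
trial       = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
target_code = [0, 2, 2, 2, 2, 0, 3, 3, 3, 3]
result_code = [0, 0, 0, 0, 1, 0, 0, 0, 0, 3]

outcomes = trial_outcomes(trial, target_code, result_code)
print(outcomes)                   # {1: (2, 1), 2: (3, 3)}
print(online_accuracy(outcomes))  # 50.0
```

The first toy trial reproduces the miss shown in Figure 4 (target 2, result 1). In an unlabeled session both codes are -1, so no outcome is collected and the function returns 0.0.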

Demonstration Code

• featuredemo.m
This program calculates and displays average spectra for each of the four target
locations for a specific subject, session number, and electrode position.

• demo.m
This program demonstrates a very simple classifier that uses a particular training
session and uses parameters learned from this session to classify a different session.
Specifically, it uses a training session to calculate the average band-power for a
particular electrode position for each of the four classes (i.e., target positions) and
then calculates three thresholds between those four means. Subsequently, it uses
these thresholds to classify each trial in a different session.

• suminfo.m
Lists the average accuracies that were achieved online (by using the linear equation as
described above) for a particular subject for each session and each run (for unlabeled
runs, it will display 0).
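
The idea behind demo.m, as described above, can be sketched in Python. This is not a port of demo.m: it assumes the thresholds are the midpoints between neighbouring class means, which the description does not specify, and it takes one precomputed band-power value per trial as input.

```python
def fit_thresholds(band_power, labels):
    """Return (class order, thresholds): classes sorted by mean band power and
    the midpoints between neighbouring class means (midpoints are an assumption)."""
    sums, counts = {}, {}
    for power, label in zip(band_power, labels):
        sums[label] = sums.get(label, 0.0) + power
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    order = sorted(means, key=means.get)              # classes by increasing mean
    sorted_means = [means[label] for label in order]
    thresholds = [(a + b) / 2 for a, b in zip(sorted_means, sorted_means[1:])]
    return order, thresholds

def classify(power, order, thresholds):
    """Assign the class whose interval between thresholds contains power."""
    index = sum(power > t for t in thresholds)        # number of thresholds crossed
    return order[index]

# Toy training session: two trials per target position.
train_power  = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
train_labels = [1, 1, 2, 2, 3, 3, 4, 4]

order, thresholds = fit_thresholds(train_power, train_labels)
print(thresholds)                                                  # [4.0, 8.0, 12.0]
print([classify(p, order, thresholds) for p in (2.5, 6.5, 14.0)])  # [1, 2, 4]
```

Four class means give three thresholds, and a test trial is assigned to the class whose band between thresholds its band power falls into.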

Online Accuracies

Matlab File   Session   1st run   2nd run   3rd run   4th run   5th run   6th run   AVG

AA001.mat     1         43.75     59.38     46.88     65.63     59.38     65.63     56.78
AA002.mat     2         65.63     81.25     84.38     59.38     59.38     65.63     69.28
AA003.mat     3         75.00     84.38     78.13     71.88     81.25     68.75     76.57
AA004.mat     4         68.75     68.75     87.50     78.13     78.13     71.88     75.52
AA005.mat     5         56.25     62.50     75.00     78.13     56.25     50.00     63.02
AA006.mat     6         53.13     65.63     78.13     81.25     81.25     56.25     69.27

BB001.mat     1         65.63     78.13     81.25     78.13     90.63     81.25     79.17
BB002.mat     2         68.75     59.38     62.50     62.50     71.88     78.13     67.19
BB003.mat     3         71.88     78.13     78.13     78.13     84.38     75.00     77.61
BB004.mat     4         40.63     53.13     62.50     65.63     75.00     68.75     60.94
BB005.mat     5         40.63     68.75     56.25     71.88     71.88     71.88     63.55
BB006.mat     6         78.13     71.88     68.75     62.50     71.88     65.63     69.80

CC001.mat     1         62.86     65.71     71.43     68.57     74.29     74.29     69.53
CC002.mat     2         62.86     65.71     71.43     68.57     74.29     74.29     69.53
CC003.mat     3         77.14     65.71     71.43     65.71     68.57     74.29     70.48
CC004.mat     4         74.29     80.00     77.14     74.29     62.86     68.57     72.86
CC005.mat     5         62.86     62.86     54.29     57.14     57.14     74.29     61.43
CC006.mat     6         74.29     85.71     82.86     77.14     74.29     77.14     78.57

Note: the original CC002 was corrupt; CC002.mat equals CC001.mat.

Table 1: This table illustrates the online accuracies (in percent) for the
labeled sessions (sessions 1-6). Note that accuracy would be 25% by chance.

References

McFarland, D.J., Lefkowicz, A.T. and Wolpaw, J.R. Design and operation of an EEG-based
brain-computer interface with digital signal processing technology. Behav. Res.
Methods Instrum. Comput., 1997a, 29: 337-345.

McFarland, D.J., McCane, L.M., David, S.V. and Wolpaw, J.R. Spatial filter selection for
EEG-based communication. Electroenceph. clin. Neurophysiol., 1997b, 103: 386-394.

Sharbrough, F., Chatrian, G.E., Lesser, R.P., Luders, H., Nuwer, M. and Picton, T.W.
American Electroencephalographic Society guidelines for standard electrode position
nomenclature. J. Clin. Neurophysiol., 1991, 8: 200-202.

Wolpaw, J.R., Birbaumer, N., Heetderks, W.J., McFarland, D.J., Peckham, P.H., Schalk,
G., Donchin, E., Quatrano, L.A., Robinson, C.J. and Vaughan, T.M. Brain-Computer
Interface Technology: A Review of the First International Meeting. IEEE Trans Rehab
Eng, 2000, 8(2): 164-173.

Download page

14 subjects recorded in the original study are available for download
here (total 3.4 Gb; thanks to Marc Macé, who helped retrieve the data).
Each archive file contains two consecutive days of recording for one
subject. Use WinZip under Windows and tar under Linux to uncompress
archive files.

Two extra subjects (1 male; 1 female) (not included in the original study)
are available here. Subject cma was used in the pilot study and was not
included because of her extensive training in doing the categorization
task. Subject rgr has one .CNT raw data file missing (file 6 of session 2)
and was not included in our study for this reason.

Channel location
Electrode locations and electrode names (as stored in the original .CNT
raw data file along with the 10-20 system correspondence) are available
as an Excel file here (a channel location file compatible with the
EEGLAB software is also available here).

Images presented during the experiment are available here for target
images and here for non-target images (you may not download or copy
these images; the Corel images on this site are for viewing only and
may not be downloaded or saved. They were purchased by the CERCO
CNRS laboratory to use for psychophysics research, and under the
terms of our licensing agreement, we cannot sell or give away these
images). All of these images are also available on a web site of the
University of California, Berkeley (enter an image number on this site to look up the
image presented in each trial).

Arnaud Delorme
Last modified: Wed Jul 14 15:04:53 PDT 2004
EEG data available for public download

EEG local links

• Rapid categorization of natural images

o Task and constraints
o Categorization of B&W and Color images in monkeys and humans
(behavior only)
o Categorization of familiar versus new images in humans (behavior and
EEG)
o Categorization versus detection of target in natural images (EEG)
o Spectral analysis using ICA in the categorization task (human EEG)
o Other relevant publications of the Thorpe group on categorization
• EEG changes accompanying voluntary regulation of the 12-Hz EEG activity
• EEG tools
o EEGLAB 4.x
o Function to read Neuroscan files under Matlab
o Free software overview for EEG/ERP under Matlab

Rapid categorization of natural images

Task and Constraints

• Visual Categorization of natural images by humans and monkeys (A)

• Animal / Non Animal (Target = Animal) or Food / Non Food (Food = target)
• Central brief presentation: 30ms - No time for eye movement
• Go / No-go Response (Target: release button (C) and touch screen (B) in under
1s; Non-target images: keep pressing button for 1s (C))
• Response to the first presentation of a new image - no learning

All the images were taken from a commercial CD-ROM (Corel CD) containing a large
variety of photographs (about 40,000). To minimize the effect of context for target
images, we randomly varied the number and type of animals and their size, view and
position. Below are some examples of images presented to the human and monkey

Images in the food task Images in the animal task

Categorization of B&W and Color images in monkeys and humans (behavior only)

Previously unseen B&W natural scenes were mixed with color ones. The faster the subject, the less
he or she relies on color cues (see figure below). Monkeys behave like fast human subjects (published in
Vision Research; see publication for more details).

Advantage in accuracy for colored images decreases for subjects with fast behavioral responses

Categorization of familiar versus new images in humans (behavior and EEG)

Human subjects had to categorize the same 200 images for 15 consecutive days. Then
these images were mixed with 1200 previously unseen ones to assess the difference
between familiar and new images. Surprisingly, for images categorized in under 250ms
in humans there was no significant difference between familiar and new images (below
are the reaction times of all human subjects; B is a zoom on the fastest reaction times in A).
Moreover, for humans, early ERPs for familiar and new images were hardly
distinguishable (see figure below: on the left, the grand average ERPs of 14 subjects,
and on the right, the differential ERPs for the two types of images). For more details, this
work has been published in Journal of Cognitive Neuroscience (see publication).

Categorization versus detection of target in natural images (EEG)


14 human subjects alternated between two tasks. In one task, the subjects had to
perform the standard animal categorization task (see above). In the other task, subjects
had to categorize a single image (that contained an animal for comparison with the
categorization task) among various non-target images (as in the categorization task,
target and non-target were equiprobable). As shown in the figure below, we first
observed a 30-40 ms delay for the categorization compared to the single image
detection task both in terms of differential ERPs (grand average target minus non-target)
and behavioral responses (in d' accuracy measure shifted by -60 ms so that they
would align with the ERPs).
We also observed similar regions of activity for the two differential ERPs. To interpret
this result, we hypothesized that both tasks recruited the same regions of activation but
that the top-down task presetting of these regions depended on the task, so that the unique
image detection task was faster than the categorization one. (This work is in the
process of being published.)

Spectral analysis using ICA in the categorization task (human EEG)

Application of ICA to two sessions (2 consecutive days) of the categorization task for
one subject. We found a good correspondence between the two sessions both for the
ICA components and for their synchronization. We also developed a new type of
representation of brain activity, the "brainmovie", to visualize the spectral correlation
(coherence) of many ICA components simultaneously. This work has been published in
Neurocomputing, see publications.
(Click to pop-up the brainmovie window)

Other relevant publications of the Thorpe group on categorization


Thorpe, S., Fize, D., Marlot, C. (1996) Nature 1996 Jun 6;381(6582):520-2. Pubmed link

Van Rullen, R., Thorpe S. (2000) J Cogn Neurosci 2001 May 15;13(4):454-61. Pubmed link

Rousselet, G., Fabre-Thorpe, M., Thorpe, S. (2002). Nat Neurosci 2002 Jul;5(7):629-30. Pubmed link

EEG changes accompanying voluntary regulation of the 12-Hz EEG activity (BCI)

Jonathan Wolpaw and his team at the Wadsworth Center are training subjects to control
the so-called mu rhythm at 12 Hz to move a cursor up and down on the screen. We
analysed some of their data and observed that 12 Hz regulation at a few electrode sites
is accompanied by large changes at other sites and in other frequency bands. The
figure below shows the behavior of 3 components (A, B, C) at different frequencies for
up-regulated and down-regulated trials. This work is in the process of being published
in IEEE Transactions on Rehabilitation Engineering. For more details, see publication.
EEG tools


Back in 2000, I wrote a graphical package under Matlab (EEGLAB 2.1) to reject
artifacts in EEG data automatically (or semi-automatically). It was designed to be user
friendly and fully scriptable (all the commands can be executed from Matlab scripts). It
was based on the former ICA toolbox package for Matlab and also provided some
functions to automatically visualize independent components. In 2002, Scott Makeig
and I decided to fuse the previous version of EEGLAB (2.1) with the ICA
toolbox. We added more data processing functions and extended the capacities of other
functions. We also focussed on making the functions more stable and user friendly.
EEGLAB 4.x is available HERE.

Function to import Neuroscan data files

These are functions I programmed to read (Neuroscan) EEG, AVG and DAT files under
Matlab (NeuroScan EEG file formats). Copy these functions into a directory and launch
Matlab in this directory. Type a function name without arguments to get help on how to
use it. Note that updated versions of these functions are distributed as part of the EEGLAB
toolbox. Click here to see details of the existing solutions to read Neuroscan CNT data

loadeeg.m : to load Neuroscan EEG file (optimized for speed)

loadavg.m : to load Neuroscan AVG files

loaddat.m : to load Neuroscan DAT text files

ldcnt.m: to load Neuroscan CNT files (by Andrew James, with additions by

Free software overview for EEG/ERP under Matlab

Here is an incomplete review of EEG tools (most of them free). I placed a special focus
on Matlab which is quite convenient to process EEG data.

MATLAB EEGLAB Toolbox for Electrophysiological Data Analysis

Our toolbox for ICA and spectral analysis for EEG under Matlab. Allows
scripting and most known spectral and single-trial operations.
FastICA code for Matlab
ICA decomposition. An alternative to the runica() function in the EEGLAB toolbox
(based on a different ICA algorithm). Intuitive description of ICA. Not dedicated to EEG.
ERPA visualizing tool under Linux
Tool to visualize ERPs under Linux. Convenient to determine the latencies and
amplitude of ERP peaks. However, except for this functionality, the software has few
capacities and it is not free (I have not tested the latest version though).
Magnetic/Electric Source Analysis, User Interface
I have not tested it. It runs on all platforms.
Stan - software for EEG/ERP processing
Some C-programs and Neuroscan processing functions under Linux/Windows
(average, artifacts, events...). Under Windows, a primitive graphic interface is also
available (I did not test it).
EEG (MEG) Source localization using Matlab. Can map dipole locations onto MRI data.
Cannot yet use the structural MRI properties for modeling though. Also, there is no
scripting language.
Alois' Matlab page
Detailed Matlab page for EEG (not ERP) file formats. Also on this page, the Adaptive
Autoregressive Modeling Matlab toolbox for online processing of EEG (I have not tested it)

EEG Toolbox
Tools for looking at ERPs (some functions of EEGLAB for reading data are also
included). The function for determining the latency of ERP peaks is worth trying. The
error handling is horrible though: the toolbox keeps on generating strange errors and
you don't really know why.

Matlab and EEG

Review of the tools available to process EEG data with Matlab.
Mathtools is a technical computing portal for all scientific and engineering needs. The
portal is free and contains over 20,000 useful links to technical computing, covering
C/C++, Java, Excel, Matlab, Fortran and others.

Data set: BCI-experiment

Data set provided by Department of Medical Informatics, Institute for
Biomedical Engineering, University of Technology Graz. (Gert
Correspondence to Alois Schlögl
This dataset was recorded from a normal subject (female, 25y) during a
feedback session. The subject sat in a relaxing chair with armrests. The
task was to control a feedback bar by means of imagined left or right hand
movements. The order of left and right cues was random.
The experiment consists of 7 runs with 40 trials each. All runs were
conducted on the same day with breaks of several minutes in between. Given
are 280 trials of 9s length. The first 2s were quiet. At t=2s an acoustic
stimulus indicated the beginning of the trial, the trigger channel (#4) went
from low to high, and a cross “+” was displayed for 1s. Then, at t=3s, an
arrow (left or right) was displayed as the cue. At the same time the subject
was asked to move a bar in the direction of the cue. The feedback was
based on AAR parameters of channels #1 (C3) and #3 (C4); the AAR
parameters were combined with a discriminant analysis into one output
parameter (similar to [1,2]). The recording was made using a G.tec
amplifier and Ag/AgCl electrodes. Three bipolar EEG channels
(anterior ‘+’, posterior ‘-’) were measured over C3, Cz and C4. The EEG
was sampled at 128Hz and filtered between 0.5 and 30Hz. The data
is not published yet; similar experiments are described in [1-4].
The trials for training and testing were randomly selected. This should
prevent any systematic effect due to the feedback.

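Figure 1: Electrode positions (left) and timing scheme (right).
(Left panel: bipolar channels 1, 2 and 3 over C3, Cz and C4, with a 5 cm distance marked.
Right panel: time axis from 0 to 9 sec, with the "Feedback period with Cue" marked.)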


Format of the data

The data is saved in a Matlab file format. The variable x_train
contains 3 EEG channels, 140 trials with 9 seconds each. The variable
y_train contains the class labels ‘1’, ‘2’ for left and right,
respectively. x_test contains another set of 140 trials. The cue was
presented from t = 3s to 9s. At the same time, the feedback was presented
to the subject. Within this period, it should be possible to distinguish the
two types of trials.
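
As an illustration of the numbers above, the following sketch cuts the cue period (t = 3s to 9s) out of an array shaped like x_train. It is only a sketch under stated assumptions: the array layout (samples, channels, trials) is assumed, not taken from this description, and a synthetic array of zeros stands in for the real data. The helper name cue_window is hypothetical.

```python
import numpy as np

FS = 128          # sampling rate in Hz (from the text)
TRIAL_SEC = 9     # trial length in seconds (from the text)

def cue_window(x, start_s=3.0, stop_s=9.0, fs=FS):
    """Return the samples between start_s and stop_s for every channel and trial.

    Assumes x is laid out as (samples, channels, trials); adjust if the
    real file uses a different order.
    """
    start = int(round(start_s * fs))
    stop = int(round(stop_s * fs))
    return x[start:stop, :, :]

# Synthetic stand-in for x_train: 9 s * 128 Hz = 1152 samples, 3 channels, 140 trials.
x_demo = np.zeros((FS * TRIAL_SEC, 3, 140))
window = cue_window(x_demo)
print(window.shape)  # (768, 3, 140)
```

With the assumed layout, the cue period covers 6 s * 128 Hz = 768 samples per channel and trial.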
Requirements and evaluation
The task is to provide an analysis system that can be used to control a
continuous feedback. For this reason, you should provide a continuous
value (<0 class “1”, >0 class “2”, 0 non-decisive) for each time point.
The magnitude of the value should reflect the confidence of the
classification; the sign indicates the class. Include a description of your
analysis system.
There is a close relationship between the error rate and the mutual
information [4]. We propose the mutual information because it also takes
into account the magnitude of the outputs. The criterion will be the ratio
between the maximum of the mutual information and the time delay
since the cue (t=3s). Only the period between t=4 and t=9s will be
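
The following sketch shows one way such a criterion could be computed from continuous classifier outputs. It is not the organizers' evaluation code. It assumes a Gaussian model in which the mutual information at each time point is 0.5 * log2(1 + SNR), with SNR taken as between-class variance over within-class variance, and it reads the criterion as the maximum mutual information in t = 4 to 9s divided by its delay since the cue. Both function names and the array layout (time, trials) are hypothetical.

```python
import numpy as np

def mutual_information(outputs, labels):
    """Per-time-point mutual information estimate in bits (Gaussian assumption).

    outputs: array (time, trials) of continuous classifier outputs.
    labels:  array (trials,) with class labels 1 and 2.
    """
    d1 = outputs[:, labels == 1]
    d2 = outputs[:, labels == 2]
    # Between-class variance of two equally weighted class means.
    between = 0.25 * (d1.mean(axis=1) - d2.mean(axis=1)) ** 2
    # Average within-class variance.
    within = 0.5 * (d1.var(axis=1) + d2.var(axis=1))
    return 0.5 * np.log2(1.0 + between / within)

def criterion(mi, fs=128, cue_s=3.0, start_s=4.0):
    """Maximum MI at or after start_s, divided by its delay since the cue."""
    t = np.arange(mi.size) / fs
    best = int(np.argmax(np.where(t >= start_s, mi, -np.inf)))
    return mi[best] / (t[best] - cue_s)

labels = np.array([1, 1, 2, 2])
outputs = np.array([[-2.0, 0.0, 0.0, 2.0]])   # one time point, four trials
print(mutual_information(outputs, labels))     # [0.5]
```

Because the estimate uses the spread of the outputs and not only their sign, two classifiers with the same error rate can still differ in this measure.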

[1] A. Schlögl, K. Lugger and G. Pfurtscheller (1997) Using Adaptive
Autoregressive Parameters for a Brain-Computer-Interface Experiment,
Proceedings of the 19th Annual International Conference of the IEEE Engineering
in Medicine and Biology Society, vol 19, pp. 1533-1535.

[2] C. Neuper, A. Schlögl, G. Pfurtscheller (1999) Enhancement of left-right
sensorimotor EEG differences during feedback-regulated motor imagery.
J Clin Neurophysiol. 16(4):373-82.

[3] G. Pfurtscheller, C. Neuper, A. Schlögl, K. Lugger (1998) Separability of EEG
signals recorded during right and left motor imagery using adaptive
autoregressive parameters. IEEE Trans Rehabil Eng. 6(3):316-25.

[4] A. Schlögl, C. Neuper, G. Pfurtscheller (2002) Estimating the mutual
information of an EEG-based Brain-Computer-Interface, Biomedizinische Technik
47(1-2): 3-8.
BCI Competition III

[ goals | news | data sets | schedule | submission | download | organizers | references ]

Goals of the organizers

The goal of the "BCI Competition III" is to validate signal processing and
classification methods for Brain-Computer Interfaces (BCIs). Compared to the past
BCI Competitions, new challenging problems are addressed that are highly relevant
for practical BCI systems, such as

• session-to-session transfer (data set I)

• small training sets, maybe to be solved by subject-to-subject transfer (data set
• non-stationarity problems (data set IIIb, data set IVc),
• multi-class problems (data set IIIa, data set V, data set II),
• classification of continuous EEG without trial structure (data set IVb, data set

This BCI Competition also includes, for the first time, ECoG data (data set I) and one
data set for which preprocessed features are provided (data set V) for competitors who
would like to focus on the classification task rather than dive into the depth of EEG
The organizers are aware that such a competition cannot
validate BCI systems as a whole. Nevertheless, we envision interesting
contributions that ultimately improve the full BCI.

Goals for the participants

For each data set specific goals are given in the respective description. Technically
speaking, each data set consists of single-trials of spontaneous brain activity, one part
labeled (training data) and another part unlabeled (test data), and a performance
measure. The goal is to infer labels (or their probabilities) for the test set from training
data that maximize the performance measure for the true (but to the competitors
unknown) test labels. Results will be announced at the Third International BCI
Meeting in Rensselaerville, June 14-19, and on this web site. For each data set, the
competition winner gets a chance to publish the algorithm in an article devoted to the
competition that will appear in IEEE Transactions on Neural Systems and
Rehabilitation Engineering.

[ top ]
Results of the BCI Competition III are available here.

BCI Competition III is closed for submissions.

Description of Data Set I in ASCII format corrected

In the description of Data Set I in ASCII format (on the download web page) rows
and columns were confused. The description is updated now.

Channel Labels in Preprocessed version of Data Set V in Matlab format


In the Matlab format of Data Set V, the field clab of the variable nfo holds the
channel labels. In the preprocessed version of this data set there are only 8 (spatially
filtered) channels of the original 32. Erroneously, in the file with the preprocessed
data nfo.clab contained all 32 channels, instead of the 8 channel subset. This is
corrected now.

Restriction of the test data in data set IIIb

Please see additional information on data set IIIb.

Description of data set IVc updated

In the description of data set IVc it was said that there are 280 test trials. This
information is wrong, the test set contains 420 trials. So the submission file to data set
IVc must contain 420 lines of classifier output. The description is corrected now.

Submissions to Data Set IIIa and IIIb

Due to the large size of files for submissions to data sets IIIa and IIIb, the files should
not be sent by email but put on the ftp server of TU-Graz: please log in by ftp to this server as user ftp with password ftp, go to the directory
/incoming/bci2005/submissions/IIIa/ resp.
/incoming/bci2005/submissions/IIIb/ and put your file there. Be sure to have
the transfer mode set to binary. If you have problems, please contact

Clarification of Rules for Data Set V

The description of data set V was updated in order to clarify the requirement 'The
algorithm should provide an output every 0.5 seconds using the last second of data.',
see description V.
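
The quoted requirement can be pictured as a sliding window. The sketch below is an illustration only: it assumes the 512Hz sampling rate listed for the raw signals of data set V, a (samples, channels) array layout, and a hypothetical helper name sliding_windows. Refer to description V for the binding rules.

```python
import numpy as np

def sliding_windows(signal, fs=512, window_s=1.0, step_s=0.5):
    """Yield (end_sample, window) pairs: one window every step_s seconds,
    each covering the last window_s seconds of data before end_sample."""
    win = int(round(window_s * fs))
    step = int(round(step_s * fs))
    for end in range(win, signal.shape[0] + 1, step):
        yield end, signal[end - win:end]

# Synthetic stand-in: 3 s of 32-channel data at 512 Hz.
demo = np.zeros((3 * 512, 32))
ends = [end for end, _ in sliding_windows(demo)]
print(ends)  # [512, 768, 1024, 1280, 1536]
```

The first output becomes available once one full second of data exists; later outputs follow every half second, and each uses only past samples.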
[ top ]

Data sets
Data set I: ‹motor imagery in ECoG recordings, session-to-session transfer›
(description I)
provided by Eberhard-Karls-Universität Tübingen, Germany, Dept. of Computer
Engineering and Dept. of Medical Psychology and Behavioral Neurobiology (Niels
Birbaumer), and
Max-Planck-Institute for Biological Cybernetics, Tübingen, Germany (Bernhard
Schölkopf), and
Universität Bonn, Germany, Dept. of Epileptology
cued motor imagery (left pinky, tongue) from one subject; training and test data are
ECoG recordings from two different sessions with about one week in between
[2 classes, 64 ECoG channels (0.016-300Hz), 1000Hz sampling rate, 278 training
and 100 test trials]

Data set II: ‹P300 speller paradigm› (description II)

provided by Wadsworth Center, NYS Department of Health (Jonathan R. Wolpaw,
Gerwin Schalk, Dean Krusienski)
the goal is to estimate which letter of a 6-by-6 matrix with successively
intensified rows resp. columns the subject was paying attention to; data from 2
[36 classes, 64 EEG channels (0.1-60Hz), 240Hz sampling rate, 85 training and
100 test trials, recorded with the BCI2000 system]
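
To illustrate the last decision step of such a speller, the sketch below picks the letter at the intersection of the best-scoring row and column. It assumes that some classifier has already produced one score per row and per column (for example, averaged over repeated intensifications); how those scores are obtained is the actual challenge and is not shown. The matrix layout and the function name decode_letter are hypothetical and may differ from the real speller.

```python
import numpy as np

# Hypothetical 6-by-6 layout, filled row by row; the real speller matrix may differ.
MATRIX = np.array(list("ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789_")).reshape(6, 6)

def decode_letter(row_scores, col_scores):
    """Return the symbol at the intersection of the highest-scoring row and column."""
    row = int(np.argmax(row_scores))
    col = int(np.argmax(col_scores))
    return str(MATRIX[row, col])

row_scores = [0.1, 0.9, 0.2, 0.0, 0.1, 0.3]   # second row scores highest
col_scores = [0.0, 0.2, 0.1, 0.8, 0.1, 0.0]   # fourth column scores highest
print(decode_letter(row_scores, col_scores))   # J
```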

Data sets IIIa: ‹motor imagery, multi-class› (description IIIa)

provided by the Laboratory of Brain-Computer Interfaces (BCI-Lab), Graz
University of Technology, (Gert Pfurtscheller, Alois Schlögl)
cued motor imagery with 4 classes (left hand, right hand, foot, tongue) from 3
subjects (ranging from quite good to fair performance); performance measure:
[4 classes, 60 EEG channels (1-50Hz), 250Hz sampling rate, 60 trials per class]

Data sets IIIb: ‹motor imagery with non-stationarity problem› (description IIIb,
additional information)
provided by TU-Graz (as above)
cued motor imagery with online feedback (non-stationary classifier) with 2 classes
(left hand, right hand) from 3 subjects; performance measure: mutual information
[2 classes, 2 bipolar EEG channels 0.5-30Hz, 125Hz sampling rate, 60 trials per

Data set IVa: ‹motor imagery, small training sets› (description IVa)
provided by the Berlin BCI group: Fraunhofer FIRST, Intelligent Data Analysis
Group (Klaus-Robert Müller, Benjamin Blankertz), and Campus Benjamin
Franklin of the Charité - University Medicine Berlin, Department of Neurology,
Neurophysics Group (Gabriel Curio)
cued motor imagery with 2 classes (right hand, foot) from 5 subjects; from 2
subjects most trials are labelled (resp. 80% and 60%), while from the other 3 less
and less training data are given (resp. 30%, 20% and 10%); the challenge is to
make a good classification even from little training data, thereby maybe using
information from other subjects with many labelled trials.
[2 classes, 118 EEG channels (0.05-200Hz), 1000Hz sampling rate, 280 trials per

Data set IVb: ‹motor imagery, uncued classifier application› (description IVb)
provided by the Berlin BCI group (see above)
training data is cued motor imagery with 2 classes (left hand, foot) from 1 subject,
while test data is continuous (i.e., non-epoched) EEG; the challenge is to provide
classifier outputs for each time point, although it is unknown to the competitors at
what time points mental states changed; performance measure: mutual information
with true labels (-1: left hand, 1: foot, 0: rest) averaged over all samples
[2 classes, 118 EEG channels (0.05-200Hz), 1000Hz sampling rate, 210 training
trials, 12 minutes of continuous EEG for testing]

Data set IVc: ‹motor imagery, time-invariance problem› (description IVc)

provided by the Berlin BCI group (see above)
cued motor imagery with 2 classes (left hand, foot) from 1 subject (training data is
the same as for data set IVb); test data was recorded 4 hours after the training data
and contains an additional class 'relax'; performance measure: mutual information
with true labels (-1: left hand, 1: foot, 0: relax) averaged over all trials
[2 classes, 118 EEG channels (0.05-200Hz), 1000Hz sampling rate, 210 training
trials, 420 test trials]

Data set V: ‹mental imagery, multi-class› (description V)

provided by IDIAP Research Institute (José del R. Millán)
cued mental imagery with 3 classes (left hand, right hand, word association) from
3 subjects; besides the raw signals also precomputed features are provided
[3 classes, 32 EEG channels (DC-256Hz), 512Hz sampling rate, continuous EEG
and precomputed features]

[ top ]

December 12th 2004: launching of the competition
May 22nd 2005, midnight CET to May 23rd: deadline for submissions
June 16th 2005 (approx.): announcement of the results on this web site

[ top ]

Submissions to a data set are to be sent to the responsible contact person as stated in
the data set description. The submission has to comprise the estimated labels, names
and affiliations of all involved researchers and a short note on the involved processing
techniques. We send confirmations for each submission we get. If you do not receive
a confirmation within 2 days please resend your email and inform other organizing
committee members.
One researcher may NOT submit multiple results to one data set. She/he has to decide
for her/his favorite one. However: From one research group multiple submissions to
one data set are possible. The sets of involved researchers do not have to be disjoint,
but (1) the 'first author' (contributor) should be distinct, and (2) approaches should be
substantially different.
For details on how to submit your results please refer to the description of the
respective data set. If questions remain unanswered send an email to the responsible
contact person for the specific data set which is indicated in the description.
Submissions are evaluated for each data set separately. There is no need to submit for
all data sets in order to participate.
Each participant agrees to deliver an extended description (1-2 pages) of the
algorithm used, for publication, by July 31st 2005 in case she/he is the winner for one of
the data sets.

[ top ]

Albany: Gerwin Schalk, Dean Krusienski, Jonathan R. Wolpaw

Berlin: Benjamin Blankertz, Guido Dornhege, Klaus-Robert Müller

Graz: Alois Schlögl, Bernhard Graimann, Gert Pfurtscheller

Martigny: Silvia Chiappa, José del R. Millán

Tübingen: Michael Schröder, Thilo Hinterberger, Thomas Navin Lal, Guido
Widman, Niels Birbaumer

[ top ]

References to papers about past BCI Competitions.

• Benjamin Blankertz, Klaus-Robert Müller, Gabriel Curio, Theresa M.
Vaughan, Gerwin Schalk, Jonathan R. Wolpaw, Alois Schlögl, Christa
Neuper, Gert Pfurtscheller, Thilo Hinterberger, Michael Schröder, and
Niels Birbaumer. The BCI competition 2003: Progress and perspectives in
detection and discrimination of EEG single trials. IEEE Trans. Biomed. Eng.,
51(6):1044-1051, 2004.
• The issue IEEE Trans. Biomed. Eng., 51(6) contains also articles of all
winning teams of the BCI Competition 2003.
• Paul Sajda, Adam Gerson, Klaus-Robert Müller, Benjamin Blankertz,
and Lucas Parra. A data analysis competition to evaluate machine learning
algorithms for use in brain-computer interfaces. IEEE Trans. Neural Sys.
Rehab. Eng., 11(2):184-185, 2003.

References to BCI Overview papers.

• Eleanor A. Curran and Maria J. Stokes. Learning to control brain activity:
A review of the production and control of EEG components for driving brain-
computer interface (BCI) systems. Brain Cogn., 51:326-336, 2003.
• Jonathan R. Wolpaw, Niels Birbaumer, Dennis J. McFarland, Gert
Pfurtscheller, and Theresa M. Vaughan. Brain-computer interfaces for
communication and control. Clin. Neurophysiol., 113:767-791, 2002.
• José del R. Millán. Brain-computer interfaces. In M.A. Arbib (ed.),
"Handbook of Brain Theory and Neural Networks, 2nd ed." Cambridge: MIT
Press, 2002.
• Andrea Kübler, Boris Kotchoubey, Jochen Kaiser, Jonathan Wolpaw,
and Niels Birbaumer. Brain-computer communication: Unlocking the locked
in. Psychol. Bull., 127(3):358-375, 2001.

References to BCI Special Issues.

• IEEE Trans. Biomed. Eng., 51(6), 2004.

• IEEE Trans. Neural Sys. Rehab. Eng., 11(2), 2003.
• IEEE Trans. Rehab. Eng., 8(2), 2000.

Links to General Interest BCI Sites.

• BCI Competition II
• BCI Competition I
• BCI-info International Platform for BCI research
• BCI2000 flexible BCI research and development platform
• BIOSIG toolbox for Matlab or Octave

[ top ]

Dr. Benjamin Blankertz
Fraunhofer FIRST (IDA)
Kekulé Str. 7, D-12489 Berlin, Germany
Tel: +49 30 6392 1875
NIPS 2001 - Brain Computer Interface Workshop

NIPS*2001 Brain Computer Interface Workshop
Positions Post Workshop Data Competition
EEG synchronized imagined movement task
(courtesy of Allen Osman, University of Pennsylvania).


Overview of Classification Task

Classes: 2 (Left and right imagined button presses)

Trials: 90 (45 left and 45 right imagined button presses)

Subjects: 9

Goal: Discriminate between left and right imagined button press.

Performance metric: For testing, event time stamps have been provided
without truth labels. Please report the labels determined by your
classification algorithm. We will compare your findings with the truth
labels. (see below)

EEG Experimental details

The task of the subject was to synchronize an indicated response with a
highly predictable timed cue. The subjects were well trained until their
responses were consistently within 100 ms of the synchronization signal.
The response was instructed to be either explicit or imagined, and either
left index finger, right index finger, or both index fingers. Each explicit
movement was recorded by a button push corresponding to the
appropriate hand(s).

Each trial began with a blank screen, which was designated as a time when
it was acceptable for the subject to blink. This blank screen lasted exactly
2 seconds, after which it was replaced by a fixation point on the screen
telling the subject that the trial had begun. The fixation point lasted for
500 milliseconds (ms). The fixation point was replaced by the letter 'E' or
'I', which instructed the subject to perform either explicit or imagined
movement. This letter remained on the screen for 250 ms and was then
replaced again by the fixation point. 1250 ms after the onset of the 'E' or
'I', the fixation point was replaced by the letter 'L', 'R', 'B', or 'N',
instructing the subject to act with the left index finger, the right index finger,
both index fingers, or not at all, respectively. This letter remained on the
screen for 250 ms and was then replaced by the fixation point. 1250
ms after the letter indicating which finger to use, an 'X' appeared for 50
ms; this was the synchronization cue and indicated it was time to make
the requested response. After the 'X' disappeared, the fixation point stayed
on the screen for 950 ms and then was replaced by the blank screen,
indicating the beginning of the next trial.

The eight trial types were randomly mixed within a 7 minute 12 second
block. Each block consisted of 72 trials. Therefore, nine trials of each type
were performed in each block.
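
The block duration can be checked against the trial timing given above. The sketch below adds up the segments of one trial, assuming that both 1250 ms delays are measured from letter onset to the onset of the next event (an interpretation, chosen because it reproduces the stated block length).

```python
# Durations in milliseconds, in the order they occur within one trial.
SEGMENTS_MS = {
    "blank screen": 2000,
    "fixation point": 500,
    "'E'/'I' onset to finger-letter onset": 1250,
    "finger-letter onset to 'X' onset": 1250,
    "'X' synchronization cue": 50,
    "fixation after 'X'": 950,
}
TRIALS_PER_BLOCK = 72
TRIAL_TYPES = 8

trial_ms = sum(SEGMENTS_MS.values())
block_s = TRIALS_PER_BLOCK * trial_ms / 1000
minutes, seconds = divmod(block_s, 60)

print(trial_ms)                         # 6000
print(int(minutes), int(seconds))       # 7 12
print(TRIALS_PER_BLOCK // TRIAL_TYPES)  # 9
```

Under this reading each trial lasts 6 seconds, so 72 trials give exactly the 7 minutes 12 seconds quoted for a block.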

Electro-physiological Measurements

EEG (electroencephalogram) was recorded from 59 electrodes placed at
sites corresponding to the International 10/20 System and referenced to
the left mastoid. All signals were sampled at 100 Hz.

Cartesian coordinates of sensors for one subject

Description of data

The supplied data files consist of 10 blocks of the synchronized
movement experiment recorded from each of 8 subjects with a sampling
rate of 100 Hz. While data is available for 8 classes of trials (explicit or
imagined for left / right / both / neither trials), for purposes of the
competition only event labels corresponding to imagined movement have
been provided. Half of the total number of these imagined movement
event labels have been provided for training. The other half have been
retained to test the performance of submitted classification algorithms.

Description of file format, These ASCII files contain pairs of time

stamps and corresponding events. The time stamp corresponds to the
sample number at which the letter 'L' or 'R' was displayed,
instructing the subject to imagine moving their left index finger or
right index finger. 1250 ms after the letter indicating which finger to
use, an 'X' appeared for 50 ms which was the synchronization cue
and indicated it was time to make the requested response. Event types
are coded as follows: 5 - 'L' displayed for imagined left button press, 6 -
'R' displayed for imagined right button press. This ASCII file is in the same format as and The time stamp gives the sample number at which the
event occurs. The event type is coded as 7 for all time stamps to facilitate
data analysis using selectevents.m, however half of the events actually
correspond to left imagined button presses and half correspond to right
imagined button presses. The order of left and right imagined button press
trials is random for all subjects. The actual event labels have not been
provided for the purpose of the competition. Please report your predicted
event labels.
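
A minimal sketch of reading such an event list is given below. It assumes one whitespace-separated pair per line, with the time stamp first and the event code second; this layout is an assumption, and the function name parse_events and the demo text are hypothetical. The sample of the 'X' cue is derived from the 1250 ms delay and the 100 Hz sampling rate stated above.

```python
FS = 100                 # sampling rate in Hz (from the text)
CUE_DELAY_MS = 1250      # delay from the 'L'/'R' letter to the 'X' cue (from the text)
EVENT_NAMES = {5: "left", 6: "right", 7: "unlabeled"}

def parse_events(text):
    """Parse 'time_stamp event_code' pairs into (letter_sample, cue_sample, name)."""
    events = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        sample = int(float(fields[0]))
        code = int(float(fields[1]))
        cue_sample = sample + CUE_DELAY_MS * FS // 1000
        events.append((sample, cue_sample, EVENT_NAMES.get(code, "unknown")))
    return events

demo_text = "1200 5\n1800 6\n2400 7\n"
print(parse_events(demo_text))
# [(1200, 1325, 'left'), (1800, 1925, 'right'), (2400, 2525, 'unlabeled')]
```

At 100 Hz, the 1250 ms delay corresponds to 125 samples, so the synchronization cue falls 125 samples after each time stamp.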

alldata.bin: contains the raw data and can be read using readdata.m. The
format of the data is an array of 59*N 'float32' numbers with N being the
number of samples. Use selectevents.m to read out data based on the time
stamp of events. See example.m for an example.
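
For readers not working in Matlab, the sketch below shows how a file with this layout might be read with NumPy. It is not a port of readdata.m or selectevents.m. It assumes native byte order and that the 59 channel values of each sample are stored together (sample after sample); if the real file is stored channel by channel, the reshape must be changed. The function names are hypothetical, and a small temporary file stands in for alldata.bin.

```python
import os
import tempfile

import numpy as np

N_CHANNELS = 59  # from the text

def read_alldata(path, n_channels=N_CHANNELS):
    """Read a flat float32 file and return an array of shape (N, n_channels)."""
    raw = np.fromfile(path, dtype=np.float32)
    if raw.size % n_channels != 0:
        raise ValueError("file size is not a multiple of the channel count")
    return raw.reshape(-1, n_channels)

def epoch(data, sample, n_samples):
    """Cut n_samples rows starting at the given time stamp (sample number)."""
    return data[sample:sample + n_samples]

# Demo on a synthetic file with 10 samples of 59 channels.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    np.arange(N_CHANNELS * 10, dtype=np.float32).tofile(path)
    data = read_alldata(path)

print(data.shape)               # (10, 59)
print(epoch(data, 2, 3).shape)  # (3, 59)
```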

The Beat Goes On: Rhythmic Modulation of Cortical Potentials by Imagined Tapping

author = {Osman, Allen and Albert, Robert},
booktitle = {Cognitive Neuroscience Annual Meeting},
year = 2001,
address = {New York},
month = {March}
