
Jody Culham

Brain and Mind Institute


Department of Psychology
Western University

http://www.fmri4newbies.com/

fMRI Analysis
with emphasis on the General Linear Model

Last Update: October 20, 2014


Last Course: Psychology 9223, F2014, Western University
Review From Last Class
Block Design: Short Equal Epochs

[Figure: raw time course and HRF-convolved time course; alternation every 4 sec (2 volumes); time axis in 2-s volumes]
• signal amplitude is weakened by the HRF because the signal doesn't have enough time to return to baseline
• not too far from the range of breathing frequency (every 4-10 sec) → could lead to respiratory artifacts
• if the design is a task manipulation, the subject is constantly changing tasks and gets confused
Block Design: Short Unequal Epochs

[Figure: raw time course and HRF-convolved time course; 4-sec stimuli (2 volumes) with 8-sec (4-volume) baseline; time axis in 2-s volumes]


• we’ve gained back most of the HRF-based amplitude loss but the other problems
still remain
• now we’re spending most of our time sampling the baseline
Block Design: Long Epochs
The other extreme…

[Figure: raw time course and HRF-convolved time course; alternation every 68 sec (34 volumes); time axis in 2-s volumes]


• more noise at low frequencies
• linear trend confound
• subject will get bored
• very few repetitions – hard to do eyeball test of significance
Find the “Sweet Spots”
Respiration
• every 4-10 sec (~0.1-0.3 Hz)
• moving chest distorts susceptibility

Cardiac Cycle
• every ~1 sec (0.9 Hz)
• pulsing motion, blood changes

Solutions
• gating
• avoiding paradigms at those frequencies

You want your paradigm frequency to be in a "sweet spot" away from the noise.
Block Design: Medium Epochs

[Figure: raw time course and HRF-convolved time course; time axis in 2-s volumes]
Every 16 sec (8 volumes)
• allows enough time for signal to oscillate fully
• not near artifact frequencies
• enough repetitions to see cycles by eye
• a reasonable time for subjects to keep doing the same thing
Block Design: Other Niceties

[Figure: time course truncated too soon; time axis in 2-s volumes]

• If you start and end with a baseline condition, you're less likely to lose information with linear trend removal, and you can use the last epoch in an event-related average
Block Design Sequences: Three Conditions
• Suppose you want to add a third condition to act as a
more neutral baseline
• For example, if you wanted to identify visual areas as
well as object-selective areas, you could include resting
fixation as the baseline.
• That would allow two subtractions
– scrambled − fixation → visual areas
– intact − scrambled → object-selective areas
• That would also help you discriminate differences in
activations from differences in deactivations

• Now the options increase.


• For simplicity, let’s keep the epoch duration at 16 sec.
Block Design: Repeating Sequence
• We could just order the epochs in a repeating sequence…

• Problem: There might be order effects


• Solution: Counterbalance with another order

• Problem: If you lose a run (e.g., to head motion), you lose counterbalancing
Block Design: Random Sequence
• We could make multiple runs with the order of conditions
randomized…

• Problem: Randomization can be flukey


• Problem: To avoid flukiness, you’d want to have different
randomization for different runs and different subjects, but
then you’re going to spend ages defining protocols for analysis
Block Design: Regular Baseline
• We could have a fixation baseline between all stimulus
conditions (either with regular or random order)

Benefit: With event-related averaging, this regular baseline design provides nice clear time courses, even for a block design.

Problem: You're spending half of your scan time collecting the condition you care the least about.
But I have 4 conditions to compare!
Here are a couple of options.
A. Orderly progression
Pro: Simple
Con: May be some confounds (e.g., linear
trend if you predict green&blue > pink&yellow)

B. Random order in each run
Pro: order effects should average out
Con: pain to make various protocols, no possibility to average all data into one time course, many frequencies involved
C. Kanwisher lab clustered design
• sets of four main condition epochs separated by baseline epochs
• each main condition appears at each location in sequence of four
• two counterbalanced orders (1st half of first order same as 2nd half of
second order and vice versa) – can even rearrange data from 2nd order to
allow averaging with 1st order

Pro: spends most of your n on key conditions, provides more repetitions
Con: not great for event-related averaging because orders are not balanced (e.g., in the top order, blue is preceded by the baseline 1X, by green 2X, by yellow 1X, and by pink 0X)

As you can imagine, the more conditions you try to shove into a run, the thornier the ordering issues become and the fewer n you have for each condition.
But I have 8 conditions to compare!
• Just don’t.

• In my experience, any block design experiment with more than four conditions becomes unmanageable and incomprehensible

• Event-related designs might still be an option… stay tuned…
From Design to Data
The GLM for Math Whizzes

Friston 2005, Ann. Rev. Psych.


Simple Two-Condition Paradigm

Visual Stimuli
Baseline (blank screen with fixation point)
TR = 2 s/volume
Duration = 8 min, 44 s = 524 s
#Volumes = 262
Let’s Start with One Voxel in Occipital Cortex
One Occipital Voxel’s Time Course

[Figure: raw activation plotted over time (2-s volumes)]


Voxel Compared to Protocol

[Figure: the voxel's raw time course overlaid with the stimulation protocol; time axis in 2-s volumes]
Y-Axis Converted to %BSC

[Figure: the same time course with the y-axis converted to % BOLD signal change; time axis in 2-s volumes]

% BOLD Signal Change for each time point:

Y%BSC = 100 × (Yraw − Ybaseline) / Ybaseline
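A minimal sketch of this conversion in Python/numpy; the time course and the choice of baseline volumes here are made up for illustration:

```python
import numpy as np

# hypothetical raw time course for one voxel (262 volumes)
rng = np.random.default_rng(0)
y_raw = rng.normal(1000.0, 5.0, size=262)

baseline_idx = np.arange(262) < 8          # assume the first 8 volumes are baseline
y_baseline = y_raw[baseline_idx].mean()    # mean raw signal during baseline

# percent BOLD signal change for each time point
y_bsc = 100 * (y_raw - y_baseline) / y_baseline
```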


Linear Correlation

Correlate the %BSC time course with a square-wave predictor (0 during baseline, 1 during stimulation).

Correlation between square-wave predictor and time course data:
r = 0.599
r² = 0.358
data points = 262
df = 260
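As a sketch, the same correlation in numpy, with a toy time course standing in for real voxel data (the block length and noise level are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_volumes = 262

# square-wave (boxcar) predictor: 0 during baseline, 1 during stimulation,
# alternating every 8 volumes (an assumed block length)
boxcar = (np.arange(n_volumes) // 8) % 2

# toy stand-in for a real voxel's %BSC time course: scaled boxcar plus noise
y_bsc = 3.0 * boxcar + rng.normal(0, 2, n_volumes)

r = np.corrcoef(boxcar, y_bsc)[0, 1]       # Pearson r
print(f"r = {r:.3f}, r^2 = {r**2:.3f}, df = {n_volumes - 2}")
```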


Correlation Between %BSC and SW Predictor
How Can We Do Better? Use HRF

We can convolve our square-wave predictor with an HRF model.

Note: we can choose which HRF model to use.
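A sketch of the convolution step, assuming one common double-gamma parameterization (peak near 6 s, undershoot near 16 s); the canonical HRF in your own package may use different parameters:

```python
import numpy as np
from scipy.stats import gamma

TR = 2.0                                   # seconds per volume
t = np.arange(0, 32, TR)                   # HRF support: 0 to 32 s

# double-gamma HRF: positive peak near 6 s minus an undershoot near 16 s
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6
hrf /= hrf.max()                           # scale peak to 1

boxcar = (np.arange(262) // 8) % 2         # square-wave predictor as before
predictor = np.convolve(boxcar, hrf)[:262]   # HRF-convolved predictor
```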
Correlation Between %BSC and HRF Predictor
Just a few more times
• There are 60,199 voxels in this data set
• So we just have to do this 60,198 more times…
Effect of Minimum Thresholds

[Figure: correlation maps at five minimum thresholds]
• r = .80: 64% of variance, p: ran out of digits
• r = .60: 36% of variance, p < 10⁻²⁶
• r = .40: 16% of variance, p < 10⁻¹⁰
• r = .12: 1% of variance, p < .05
• r = 0: 0% of variance, p ≤ 1

Maximum threshold (r ≥ .80, r ≤ −.80): cosmetic
Minimum threshold (r > .40, r < −.40): important!
– positive r: positively correlated with predictor (stimulus > baseline)
– negative r: negatively correlated with predictor (stimulus < baseline)
Effect of Maximum Thresholds

[Figure: maps with minimum threshold r > .40 (16% of variance, p < 10⁻¹⁰) shown with maximum thresholds of r ≥ .60, r ≥ .80, and r ≥ 1]
GLM: 1 predictor

• Why do we have only one predictor when there are two conditions -- Stimulus and Baseline?
• Why not add…
Analogy
• How many degrees of freedom are in this equation
with two variables?
• i.e., how many things can you change?
x+y=7
[Figure: predictor time course over 262 volumes]
We have 1 degree of freedom here
• Adjust the height of the predictor function to match the data

[Figure: predictor scaled by β = 1, 2, 4, 0, −1, −5; y-axis from −5 to 4]
The beta weight is NOT a correlation
• correlations measure goodness of fit regardless of scale
• beta weights are a measure of scale

[Figure: four example fits — small β with large r; small β with small r; large β with large r; large β with small r]
Brain Voyager’s Model
Brain Voyager’s Output

Our model had only 1 df. In a model for a single subject, total df = volumes − 1; the remaining df is noise (residuals).

Our model accounts for variance of 635 out of a total variance of 784.
R² = 0.9² = 81% of the variance (635/784 = 81%).
Brain Voyager’s Output

F test

F = MSsignal / MSnoise (where MS = SS/df)

F = 635/0.576 = 1102

Look up F of 1102 with df = 260 → p < .000001
Brain Voyager’s Output

se is an estimate of noise for our beta.
Remember our 1 df (the height of the predictor) -- this is it, our β.

t = signal/noise = β/se = 3.591/0.108 = 33.2

Look up t = 33.2 with 260 df → p < .000001
Comparison

[Figure: correlation map vs. GLM map]

• both maps set to p < .00001 with 260 df
• correlation yields an r map (thresholded at r > .27)
• GLM yields a t map (thresholded at t > 4.51)
The General Linear Model (GLM)
GLM definition from Huettel et al.:
• a class of statistical tests that assume that the
experimental data are composed of the linear
combination of different model factors, along with
uncorrelated noise

• Model
– statistical model
• Linear
– things add up sensibly (1+1 = 2)
• note that linearity refers to the predictors in the
model and not necessarily the BOLD signal
• General
– many simpler statistical procedures such as
correlations, t-tests and ANOVAs are subsumed by
the GLM
A More Complex Design
• Actually, we had more conditions
• There were multiple categories of visual stimuli

Houses
Faces
Objects
Bodies
Scrambled Images
Now we have 5 df
• Now we have 5 degrees of freedom

• Each predictor goes from 0 to 1
• We can estimate the amount of activation for each condition by looking at how much we have to scale the predictor to best fit the data

[Figure: five predictor time courses — Houses, Faces, Objects, Bodies, Scrambled Images]
Let’s look at another voxel (in PPA)
Our Second Voxel’s Data
Our Second Voxel’s Model

This voxel shows significantly higher activity (β) for Houses than baseline… but NOT significantly higher activity (β) for Faces than baseline.
But are Houses Sig > than other stims?

1 × βHouses
−1 × βFaces
0 × βObjects
0 × βBodies
0 × βScrambled Images

i.e., βHouses − βFaces

Contrast Vectors
• a vector is just a row (or column) of numbers
But are Houses > Faces?

Contrast vector: 1 −1 0 0 0

Houses − Faces:
βHouses − βFaces = 1.031 − 0.147 = 0.884
Is this Difference Significant?

se = noise estimate for the contrast

t = signal/noise = 0.884/0.109 = 8.075

Look up t (df = 260) = 8.075 → p < .000001
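As a sketch, here is how a contrast t statistic can be computed from an OLS fit (the helper name `contrast_t` is mine; packages like BV and SPM do this internally):

```python
import numpy as np

def contrast_t(X, y, c):
    """t statistic for contrast vector c in the OLS fit y = X @ beta + error."""
    beta = np.linalg.pinv(X) @ y                  # fitted betas
    resid = y - X @ beta                          # residuals
    df = X.shape[0] - X.shape[1]                  # residual degrees of freedom
    mse = resid @ resid / df                      # noise variance estimate
    # standard error of the contrast: sqrt(c' (X'X)^-1 c * MSE)
    se = np.sqrt(c @ np.linalg.inv(X.T @ X) @ c * mse)
    return (c @ beta) / se                        # t = signal / noise
```

With the five condition predictors plus a constant, c = [1, −1, 0, 0, 0, 0] would test Houses > Faces.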
Simple Example Experiment: LO Localizer
Lateral Occipital Complex
• responds when subject views objects

[Figure: paradigm timeline (unit: volumes) alternating Blank Screen, Intact Objects, and Scrambled Objects]

One volume (12 slices) every 2 seconds for 272 seconds (4 minutes, 32 seconds)

Condition changes every 16 seconds (8 volumes)

If you only pay attention to one slide
in this lecture, it should be the next
one!!!
Example: GLM with 2 predictors

× 1

= + +

× 2

fMRI Signal = Design Matrix x Betas + Residuals


“what we “how much of “what we
“our data” = CAN x it we CAN + CANNOT
explain” explain” explain”

Statistical significance is basically a ratio of


explained to unexplained variance
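A minimal numpy sketch of this fit; the two predictors here are toy stand-ins for real HRF-convolved regressors, and the true betas (2 and 0.5) echo the example on a later slide:

```python
import numpy as np

rng = np.random.default_rng(1)
n_vol = 272
block = np.arange(n_vol) // 8                 # 8-volume epochs

# toy stand-ins for HRF-convolved predictors: baseline, intact, scrambled cycle
intact = (block % 3 == 1).astype(float)
scrambled = (block % 3 == 2).astype(float)
X = np.column_stack([intact, scrambled, np.ones(n_vol)])   # design matrix

# simulate "our data" with known betas (2 and 0.5, plus a constant) and noise
y = X @ np.array([2.0, 0.5, 100.0]) + rng.normal(0, 1, n_vol)

beta_hat = np.linalg.pinv(X) @ y     # "how much of it we CAN explain"
residuals = y - X @ beta_hat         # "what we CANNOT explain"
```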
Implementation of GLM in SPM

Many thanks to Øystein Bech Gadmar for creating this figure in SPM

[Figure: SPM design matrix with time running downward; Intact and Scrambled predictors]
• SPM represents time as going down
• SPM represents predictors within the design matrix as grayscale plots (where black = low,
white = high) over time
• GLM includes a constant to take care of the average activation level throughout each run
– SPM shows this explicitly (BV may not)
We create a GLM with 2 predictors
when 1=2

= + +

when 2=0.5

fMRI Signal = Design Matrix x Betas + Residuals


“what we “how much of “what we
“our data” = CAN x it we CAN + CANNOT
explain” explain” explain”

Statistical significance is basically a ratio of


explained to unexplained variance
How to Reduce Noise
• If you can’t get rid of an artifact, you can include it as a
“predictor of no interest” to soak up variance

Example: Some people include predictors from the outcome of motion correction algorithms

Corollary: Never leave out predictors for conditions that will affect your data (e.g., error trials)

This works best when the motion is uncorrelated with your paradigm (predictors of interest)
Including First Derivative

• Some recommend including the first derivative of the HRF-convolved predictor
– can soak up some of the variance due to misestimations of the HRF
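A sketch of adding that derivative as a design-matrix column; the crude smoothing here is just a stand-in to keep the snippet self-contained (in practice you would differentiate the real HRF-convolved predictor):

```python
import numpy as np

# toy HRF-convolved predictor (stand-in for the real thing)
boxcar = ((np.arange(262) // 8) % 2).astype(float)
predictor = np.convolve(boxcar, np.ones(6) / 6)[:262]    # crude smoothing

d_predictor = np.gradient(predictor)     # first temporal derivative
# include the derivative as an extra column; it soaks up variance from
# small mis-estimations of HRF latency
X = np.column_stack([predictor, d_predictor, np.ones(262)])
```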
Now do you understand why we did temporal filtering?

[Figure: the same time course shown as raw data, high-pass filtered, low-pass filtered, and band-pass filtered]

Poldrack, Mumford & Nichols, 2011, fMRI Data Analysis

Common Predictors of No Interest
• People often include these to reduce residuals
– motion parameters
– signal from ventricles
– models for error trials
Contrasts:
Examples with Real Data
Sam’s Paradigm:
Localizer for Ventral-Stream Visual Areas

Fusiform Face Area


Contrasts in the GLM
• We can examine whether a single predictor is significant
(compared to the baseline)
[Figure: axial slice (z = −20), R/L marked]
• We can also examine whether a single predictor is
significantly greater than another predictor
Contrast Vectors

Houses Faces Objects Bodies Scram

Faces - Baseline 0 +1 0 0 0

Faces - Houses -1 +1 0 0 0

Faces - Objects 0 +1 -1 0 0

Faces - Bodies 0 +1 0 -1 0

Faces - Scrambled 0 +1 0 0 -1
Balanced Contrasts

[Figure: bar graph of betas by condition: β = 1, 2, 1, 1, 1]

Unbalanced contrast: −1 +1 −1 −1 −1 (weights sum to −3)
Contrast × β: −1 +2 −1 −1 −1 → Σ = −2

Balanced contrast: −1 +4 −1 −1 −1 (weights sum to 0)
Contrast × β: −1 +8 −1 −1 −1 → Σ = +4

If you do not balance the contrast, you are If you balance the contrast, you are comparing one
comparing one condition vs. the sum of all the others condition vs. the average of all the others
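The numbers on this slide can be checked with two dot products (values taken from the slide):

```python
import numpy as np

beta = np.array([1, 2, 1, 1, 1])               # betas from the slide
unbalanced = np.array([-1, +1, -1, -1, -1])    # weights sum to -3
balanced = np.array([-1, +4, -1, -1, -1])      # weights sum to 0

print(unbalanced @ beta)   # -2: condition vs. the SUM of the others
print(balanced @ beta)     # +4: condition vs. the AVERAGE of the others (x4)
```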
Problems with Bulk Contrasts

[Figure: two bar graphs of betas by condition]

Case 1 — β: 1 2 1 1 1
Balanced contrast (Faces vs. other): −1 +4 −1 −1 −1 (Σ = 0)
Contrast × β: −1 +8 −1 −1 −1 → Σ = +4

Case 2 — β: 2 2 2 2 0.5
Balanced contrast (Faces vs. other): −1 +4 −1 −1 −1 (Σ = 0)
Contrast × β: −2 +8 −2 −2 −0.5 → Σ = +1.5

• Bulk contrasts can be significant if only a subset of conditions differ
Conjunctions
(sometimes called Masking)

                       Houses  Faces  Objects  Bodies  Scram
Faces − Baseline          0     +1      0        0       0
AND Faces − Houses       −1     +1      0        0       0
AND Faces − Objects       0     +1     −1        0       0
AND Faces − Bodies        0     +1      0       −1       0
AND Faces − Scrambled     0     +1      0        0      −1

To describe this in text:
• [(Faces > Baseline) AND (Faces > Houses) AND (Faces > Objects) AND (Faces > Bodies) AND (Faces > Scrambled)]
Conjunction Example

[Figure: maps for Faces − Houses, Faces − Objects, Faces − Bodies, Faces − Scrambled, and Faces − Baseline; the superimposed maps; and the conjunction]
P Values for Conjunctions
• If the contrasts are independent:
• e.g., [(Faces > Houses) AND (Scrambled > Baseline)]
– pcombined = (psingle contrast)^(number of contrasts)
• e.g., pcombined = (0.05)² = 0.0025

• If the contrasts are non-independent:
• e.g., [(Faces > Houses) AND (Faces > Baseline)]
– pcombined is less straightforward to compute
Real Voxel: GLM
• Here’s the time course from a voxel in right FFA (defined by conjunction)
GLM Data, Model, and Residuals

dfpredictors = # of predictors
dfresidual = dftotal − dfpredictors
dftotal = # volumes − 1 (here 262 volumes/time points, so dftotal = 261)

GLM predictors account for (0.784)² = 61% of the variance
Real Voxel: Betas

t = β/se
e.g., tFace = βFace/seFace = 1.371/0.076 = 18.145

t(5,261) = 18.145 → p < .000001
Real Voxel: Contrasts

Σ[Contrast × β] = (0 × 0.964) + (1 × 1.371) + (0 × 0.979) + (0 × 1.000) + (−1 × 0.687)
= 1.371 − 0.687
= 0.684
Dealing with Faulty Assumptions
What’s this #*%&ing reviewer
complaining about?!
1. Correction for multiple comparisons
2. Correction for serial correlations
– only necessary for data from single subjects
– not necessary for group data
Types of Errors

                                 Is the region truly active?
                                 Yes              No
Does our stat test       Yes     HIT              Type I Error
indicate activity?       No      Type II Error    Correct Rejection

p value: probability of a Type I error
e.g., p < .05: "There is less than a 5% probability that a voxel our stats have declared as 'active' is in reality NOT active"
Slide modified from Duke course


Dead Salmon

poster at Human Brain Mapping conference, 2009

• 130,000 voxels
• no correction for
multiple
comparisons
Fishy Headlines
Mega-Multiple Comparisons Problem
Typical 3T Data Set

30 slices × 64 × 64 = 122,880 voxels of (3 mm)³

If we choose p < 0.05…
122,880 voxels × 0.05 ≈ 6,144 voxels significant due to chance alone

We can reduce this number by only examining voxels inside the brain:
~64,000 voxels of (3 mm)³ × 0.05 = 3,200 voxels significant by chance


Possible Solutions to Multiple
Comparisons Problem
• Bonferroni Correction
– small volume correction
• Cluster Correction
• False Discovery Rate
• Gaussian Random Field Theory
• Test-Retest Reliability
Bonferroni Correction
• divide desired p value by number of comparisons
Example:
desired p value: p < .05
number of voxels in brain: 64,000
required p value: p < .05 / 64,000 → p < .00000078

• Variant: small-volume correction
• only search within a limited space:
• brain
• cortical surface
• region of interest
• reduces the number of voxels and thus the severity of Bonferroni

• Drawback: overly conservative
• assumes that each voxel is independent of others
• not true – adjacent voxels are more likely to be sig in fMRI data
than non-adjacent voxels
Cluster Correction
• falsely activated voxels should be randomly dispersed
• set minimum cluster size (k) to be large enough to make it unlikely that
a cluster of that size would occur by chance
• some algorithms assume that data from adjacent voxels are
uncorrelated (not true)
• some algorithms (e.g., Brain Voyager) estimate and factor in spatial
smoothness of maps
• cluster threshold may differ for different contrasts

• Drawbacks:
• handicaps small regions (e.g., subcortical foci) more than large
regions
• researcher can test many combinations of p values and k values
and publish the one that looks the best
False Discovery Rate
• "controls the proportion of rejected hypotheses that are falsely rejected" (i.e., false positives: these are Type I errors)
• a standard p value (e.g., p < .01) means that a certain proportion of all voxels will be significant by chance (1%)
• FDR uses a q value (e.g., q < .01), meaning that a certain proportion of the "activated" (colored) voxels will be significant by chance (1%)

• Drawbacks
• very conservative when there is little activation; less conservative when
there is a lot of activation
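For concreteness, here are minimal numpy sketches of Bonferroni (from the earlier slide) and Benjamini-Hochberg FDR; real packages provide tested implementations of both:

```python
import numpy as np

def bonferroni_reject(p, alpha=0.05):
    """Reject where p is below alpha divided by the number of tests."""
    p = np.asarray(p)
    return p < alpha / p.size

def fdr_bh_reject(p, q=0.05):
    """Benjamini-Hochberg FDR: reject the k smallest p values, where k is
    the largest rank with p_(k) <= q * k / n."""
    p = np.asarray(p)
    n = p.size
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])    # largest passing rank (0-based)
        reject[order[: k + 1]] = True
    return reject
```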
Gaussian Random Field Theory
• Fundamental to SPM
• If data are very smooth, then the chance of noise points passing
threshold is reduced
• Can correct for the number of “resolvable elements” (“resels”) rather
than number of voxels

• Drawback: Requires smoothing

Slide modified from Duke course


Test-Retest Reliability
• Perform statistical tests on each half of the data
• The probability of a given voxel appearing in both purely by chance is
the square of the p value used in each half
e.g., .001 x .001 = .000001
• Alternatively, use the first half to select an ROI and the second half to
test your hypothesis

• Drawback: By splitting your data in half, you're reducing your statistical power to see effects
Sanity Checks: “Poor Man’s Bonferroni”
• For casual data exploration, not publication
• Jack up the threshold till you get rid of the schmutz (especially in air,
ventricles, white matter – may be real)
• If you have a comparison where one condition is expected to produce
much more activity than the other, turn on both tails of the comparison
• If two areas are symmetrically active, they’re less likely to be due to
chance (only works for bilateral areas)
• Jody’s rule of thumb: “If ya can’t trust the negatives, can ya trust the
positives?”
• Too subjective for serious use
Example: MT localizer data

[Figure: moving rings > stationary rings (orange); stationary rings > moving rings (blue)]
Have We Been So Obsessed with
Limiting Type I Error that Type II Error is
Out of Control?
                                 Is the region truly active?
                                 Yes              No
Does our stat test       Yes     HIT              Type I Error
indicate activity?       No      Type II Error    Correct Rejection

Slide modified from Duke course


Comparison of Methods

[Figure: simulated data thresholded three ways]
• uncorrected: high Type I error, low Type II error
• Bonferroni: low Type I error, high Type II error
• FDR: low Type I error, low Type II error

Poldrack, Mumford & Nichols, 2011, fMRI Data Analysis


Strategies for Exploration vs. Publication
• Deductive approach
– Have a specific hypothesis/contrast planned
– Run all your subjects
– Run the stats as planned
– Publish

• Inductive approach
– Run a few subjects to see if you’re on the right track
– Spend a lot of time exploring the pilot data for
interesting patterns
– “Find the story” in the data
– You may even change the experiment, run additional
subjects, or run a follow-up experiment to chase the
story

• While you need to use rigorous corrections for publication, do not be overly conservative when exploring pilot data or you might miss interesting trends
• Random effects analyses can be quite conservative so you may want to
do exploratory analyses with fixed effects (and then run more subjects if
needed so you can publish random effects)
What’s this #*%&ing reviewer
complaining about?!
1. Correction for multiple comparisons
2. Correction for serial correlations
– only necessary for data from single subjects
– not necessary for group data
• stay tuned to find out why: Group Data lecture
Correction for Temporal Correlations
When analyzing a single subject, degrees of freedom = number of volumes – 1
e.g., if our run has 200 volumes (400 s long if TR = 2), then df = 199

Statistical methods assume that each of our time points is independent.

In the case of fMRI, this assumption is false.

Even in a "screen saver scan", activation in a voxel at one time is correlated with its activation within ~6 sec

This artificially inflates your statistical significance.


Autocorrelation function
To calculate the magnitude of the original problem, we can compute the autocorrelation function on the residuals:
• For a voxel or ROI, correlate its time course with itself shifted in time (shift by 1 volume, shift by 2 volumes, …)
• Plot these correlations by the degree of shift
• If there's no autocorrelation, the function should drop from 1 to 0 abruptly (pink line)
• The points circled in yellow suggest there is some autocorrelation, especially at a shift of 1, called AR(1)
• BV can correct for the autocorrelation to yield revised (usually lower) p values

[Figure: autocorrelation function BEFORE and AFTER correction]
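A sketch of computing that autocorrelation function from a residual time course (the function and variable names are mine, not BV's):

```python
import numpy as np

def autocorrelation(residuals, max_lag=10):
    """Correlate a time course with itself shifted by 1, 2, ... volumes."""
    x = residuals - residuals.mean()
    acf = [1.0]                          # zero shift: r = 1 by definition
    for lag in range(1, max_lag + 1):
        acf.append(np.corrcoef(x[:-lag], x[lag:])[0, 1])
    return np.array(acf)

# acf[1] is the lag-1 term: a clearly nonzero value indicates AR(1) structure
```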
BV Preprocessing Options
Temporal Smoothing of Data
• We have the option in our software to temporally
smooth our data (i.e., remove high temporal
frequencies or “low-pass filter”)
• However, I recommended that you not use this option

• Now do you understand why?


To Localize or Not to Localise?
To Localize or Not to Localise?

Neuroimagers can't even agree how to SPELL localiser/localizer!
Methodological Fundamentalism
The latest review I received…
Approach #1:
Voxelwise Statistics
Run a statistical contrast for every voxel in your search volume.
Correct for multiple comparisons.
Find a bunch of blobs.
Voxelwise Approach: Example
• Malach et al., 1995, PNAS
• Question: Are there areas of the human brain that are more responsive to objects than scrambled objects?
• You will recognize this as what we now call an LO localizer, but Malach was the first to identify LO

[Figure: LO activation shown in red, behind MT+ activation in green. LO (red) responds more to objects, abstract sculptures, and faces than to textures, unlike visual cortex (blue), which responds well to all stimuli]
Approach #2:
Region of interest (ROI) analysis
• Identify a region of interest

[Figure: Functional ROI, Anatomical ROI, and Functional-Anatomical ROI; images from O'Reilly et al., 2012, SCAN]

• Perform statistical contrasts for the ROI data in an INDEPENDENT data set
– Because the runs that are used to generate the area are independent from those used to test the hypothesis, liberal statistical thresholds (e.g., p < .05) can be used
Localizer Scan
• A separate scan conducted to identify functional
regions of interest
Example of ROI Approach
Culham et al., 2003, Experimental Brain Research
Does the Lateral Occipital Complex compute object shape for grasping?

Step 1: Localize LOC

[Figure: localizer contrast — Intact Objects vs. Scrambled Objects]
Example of ROI Approach
Culham et al., 2003, Experimental Brain Research
Does the Lateral Occipital Complex compute object shape for grasping?

Step 2: Extract LOC data from experimental runs

[Figure: LOC time courses for Grasping and Reaching; NS, p = .35 and p = .31]
Example of ROI Approach
Very Simple Stats
• Extract the average peak % BOLD Signal Change (Left Hem. LOC) from each subject for each condition
• Then simply do a paired t-test to see whether the peaks are significantly different between conditions

Subject   Grasping   Reaching
   1        0.02       0.03
   2        0.19       0.08
   3        0.04       0.01
   4        0.10       0.32
   5        1.01      -0.27
   6        0.16       0.09
   7        0.19       0.12

• Instead of using % BOLD Signal Change, you can use beta weights
• You can also do a planned contrast in Brain Voyager using a module
called the ROI GLM
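With the table above, the paired t-test is essentially one line in scipy (a sketch; the numbers are the slide's left-hemisphere LOC peaks):

```python
from scipy import stats

# peak % BSC per subject (Left Hem. LOC), from the table above
grasping = [0.02, 0.19, 0.04, 0.10, 1.01, 0.16, 0.19]
reaching = [0.03, 0.08, 0.01, 0.32, -0.27, 0.09, 0.12]

result = stats.ttest_rel(grasping, reaching)   # paired t-test, df = 6
print(f"t(6) = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```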
Example: The Danger of ROI Approaches
• Example 1: LOC may be a heterogeneous area with subdivisions; ROI
analyses gloss over this
• Example 2: Some experiments miss important areas (e.g., Kanwisher et al., 1997 identified one important face processing area -- the fusiform face area, FFA -- but did not report a second area that is a very important part of the face processing network -- the occipital face area, OFA -- because it was less robust and consistent than the FFA).
Pros and Cons: Voxelwise Approach
Benefits
• Require no prior hypotheses about areas involved
• Include entire brain
• May identify subregions of known areas that are implicated in a
function
• Doesn’t require independent data set

Drawbacks
• Requires conservative corrections for multiple comparisons
• vulnerable to Type II errors
• Neglects individual differences in brain regions
• poor for some types of studies (e.g., topographic areas)
• Can lose spatial resolution with intersubject averaging
• Requires speculation about areas involved
Pros and Cons: ROI Approach
Benefits
• Extraction of ROI data can be subjected to simple stats
• Elimination of mega multiple comparisons problem greatly improves
statistical power (e.g., p < .05)
• Hypothesis-driven
• Useful when hypotheses are motivated by other techniques (e.g.,
electrophysiology) in specific brain regions
• ROI is not smeared due to intersubject averaging
• Important for discriminating abutting areas (e.g., V1/V2)
• Easy to analyze and interpret
• Can be useful for dissecting factorial design data in an unbiased manner

Drawbacks
• Neglects other areas that may play a fundamental role
• If multiple ROIs need to be considered, you can spend a lot of scan time
collecting localizer data (thus limiting the time available for experimental
runs)
• Works best for reliable and robust areas with unambiguous definitions
• Sometimes you can’t find an ROI in some subjects
• Selection of ROIs can be highly subjective and error-prone
A Proposed Resolution
• There is no reason not to do BOTH ROI analyses and
voxelwise analyses
– ROI analyses for well-defined key regions
– Voxelwise analyses to see if other regions are also involved
• Ideally, the conclusions will not differ
• If the conclusions do differ, there may be sensible reasons
– Effect in ROI but not voxelwise
• perhaps region is highly variable in stereotaxic location between subjects
• perhaps voxelwise approach is not powerful enough
– Effect in voxelwise but not ROI
• perhaps ROI is not homogenous or is context-specific
The War of Non-Independence
Finding the Obvious
A priori probability of getting a JQKA sequence = (1/13)⁴ = 1/28,561

A posteriori probability of getting a JQKA sequence = 1/1 = 100%

Non-independence error
• occurs when statistical tests performed are not independent
from the means used to select the brain region

Arguments from Vul & Kanwisher, book chapter in press


Non-independence Error
Egregious example
• Identify Area X with contrast of A > B
• Do post hoc stats showing that A is statistically higher than B
• Act surprised!!!
More subtle example of selection bias
• Identify Area X with contrast of A > B
• Do post hoc stats showing that A is statistically higher than C and C is
statistically greater than B

Arguments from Vul & Kanwisher, book chapter in press

Figure from Kriegeskorte et al., 2009, Nature Neuroscience
Double Dipping & How to Avoid It
• Kriegeskorte et al., 2009, Nature Neuroscience
• surveyed 134 papers in prestigious journals
• 42% showed at least one example of non-independence error
Correlations Between Individual Subjects’
Brain Activity and Behavioral Measures
Sample of Critiqued Papers:
Eisenberger, Lieberman & Williams, 2003, Science
• measured fMRI activity during social rejection
• correlated self-reported distress with brain activity
• found r = .88 in anterior cingulate cortex, an area implicated in physical pain
perception
• concluded “rejection hurts”

[Figure: anterior cingulate activation, social exclusion > inclusion]
“Voodoo Correlations”
The original title of the paper was not well-received by reviewers, so it was changed, even though some people still use the term "voodoo correlations" (Vul et al., 2009).

• reliability of personality and emotion measures: r ~ .7
• reliability of activation in a given voxel: r ~ .7
• highest expected behavior–fMRI correlation: ~.74
• so how can we have behavior–fMRI correlations of r ~ .9?!
“Voodoo Correlations”

"Notably, 53% of the surveyed studies selected voxels based on a correlation with the
behavioral individual-differences measure and then used those same data to compute a
correlation within that subset of voxels."

Vul et al., 2009, Perspectives on Psychological Science


Avoiding “Voodoo”
• Use independent means to select
region and then evaluate
correlation
• Do split-half reliability test
– WARNING: This is reassuring that the
result can be replicated in your sample
but does not demonstrate that result
generalizes to the population
Is the “voodoo” problem all that bad?

• High correlations can occur in legitimately analyzed data
• Did voxelwise analyses use appropriate correction for multiple comparisons?
– then the result is statistically significant regardless of the specific correlation
• Is additional data being used for
1. inference purposes?
– if they pretend to provide independent support, that’s bad
2. presentation purposes?
– alternative formats can be useful in demonstrating that data is clean (e.g., time
courses look sensible; correlations are not driven by outliers)
