Data Modeling: General Linear Model & Statistical Inference

Thomas Nichols, Ph.D.
Assistant Professor, Department of Biostatistics
http://www.sph.umich.edu/~nichols

Brain Function and fMRI
ISMRM Educational Course, July 11, 2002

Motivations
• Data Modeling
– Characterize Signal
– Characterize Noise

• Statistical Inference
– Detect signal
– Localization (Where’s the blob?)

Outline
• Data Modeling
– General Linear Model
– Linear Model Predictors
– Temporal Autocorrelation
– Random Effects Models

• Statistical Inference
– Statistic Images & Hypothesis Testing
– Multiple Testing Problem

Basic fMRI Example
• Data at one voxel
– Rest vs. passive word listening

• Is there an effect?

A Linear Model
• “Linear” in parameters β1 & β2

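Intensity = β1 x1 + β2 x2 + error ε

[Figure: the voxel's time course (Intensity vs. Time) shown as β1 times predictor x1, plus β2 times predictor x2, plus error ε]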

Linear model, in image form…

Y = β1 x1 + β2 x2 + ε

[Figure: the same model with Y, x1, x2, and ε drawn as image columns]

Linear model, in image form…
Estimated:

Y = β̂1 x1 + β̂2 x2 + ε̂

[Figure: the fitted model with Y, x1, x2, and ε̂ drawn as image columns]

… in image matrix form…

Y = X × β̂ + ε̂, where β̂ = (β̂1, β̂2)’

[Figure: Y and ε̂ drawn as image columns, X as an image matrix with columns x1 and x2]

… in matrix form.
Y = Xβ + ε

Dimensions: Y is N × 1, X is N × p, β is p × 1, ε is N × 1

N: Number of scans, p: Number of regressors
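
The matrix form can be tried out on toy data. The following is a minimal sketch in Python with NumPy, not the author's code: the boxcar regressor x1, the constant regressor x2, the "true" β values, and the Gaussian noise are all made-up assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 84                                    # number of scans (illustrative)
x1 = np.tile(np.r_[np.zeros(6), np.ones(6)], N // 12)   # hypothetical rest/task boxcar
x2 = np.ones(N)                           # constant (baseline) regressor
X = np.column_stack([x1, x2])             # N x p design matrix, here p = 2

beta_true = np.array([2.0, 100.0])        # made-up "true" parameters
Y = X @ beta_true + rng.normal(0.0, 1.0, N)   # Y = X beta + eps

# Ordinary least squares estimate of beta, and the residuals
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat
```

With this much data, beta_hat should land close to beta_true, and the residuals are orthogonal to every column of X.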


Linear Model Predictors
• Signal Predictors
– Block designs
– Event-related responses

• Nuisance Predictors
– Drift
– Regression parameters

Signal Predictors
• Linear Time-Invariant (LTI) system
• LTI specified solely by
– Stimulus function of experiment
– Hemodynamic Response Function (HRF)
• Response to instantaneous impulse

Convolution Examples

[Figure: two columns, Block Design (blocks) and Event-Related (events). In each, the Experimental Stimulus Function is convolved with the Hemodynamic Response Function to give the Predicted Response.]
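
The convolution step can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions: the TR, the stimulus timings, and the gamma-shaped HRF are made up, and the HRF here is not SPM's canonical HRF.

```python
import numpy as np

TR = 2.0                                  # seconds per scan (assumed)
n_scans = 60
t = np.arange(0, 30, TR)                  # HRF sampled over 30 s

# Illustrative gamma-shaped HRF peaking near 5 s (an assumption, not SPM's HRF)
hrf = t**5 * np.exp(-t)
hrf /= hrf.sum()                          # normalize to unit sum

# Stimulus functions: 1 when the stimulus is on, 0 otherwise
block = np.tile(np.r_[np.zeros(10), np.ones(10)], 3)    # 10 scans off / 10 scans on
events = np.zeros(n_scans)
events[[5, 20, 40]] = 1.0                 # three brief events

# Predicted response = stimulus function convolved with the HRF (LTI assumption)
pred_block = np.convolve(block, hrf)[:n_scans]
pred_events = np.convolve(events, hrf)[:n_scans]
```

Each predicted response would then serve as one column of the design matrix X.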

HRF Models
• Canonical HRF
– Most sensitive if it is correct
– If wrong, leads to bias and/or poor fit
• E.g. True response may be faster/slower
• E.g. True response may have smaller/bigger undershoot

[Figure: SPM’s HRF]

HRF Models
• Smooth Basis HRFs
– More flexible
– Less interpretable
• No one parameter explains the response
– Less sensitive relative to canonical (only if canonical is correct)

[Figures: Gamma Basis; Fourier Basis]

HRF Models
• Deconvolution
– Most flexible
• Allows any shape
• Even bizarre, nonsensical ones

– Least sensitive relative to canonical (again, if canonical is correct)

[Figure: Deconvolution Basis]

Drift Models
• Drift
– Slowly varying
– Nuisance variability

• Models
– Linear, quadratic
– Discrete Cosine Transform

[Figure: Discrete Cosine Transform Basis]
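
A discrete cosine drift basis can be built directly from its definition. The sketch below, in Python with NumPy, uses a hypothetical helper dct_basis and a made-up linear drift; the number of basis functions is an arbitrary choice, not a recommendation from the slides.

```python
import numpy as np

def dct_basis(n_scans, n_basis):
    """Columns are low-frequency DCT-II cosines over n_scans time points."""
    t = np.arange(n_scans)
    k = np.arange(1, n_basis + 1)         # skip k = 0 (the constant term)
    return np.cos(np.pi * np.outer(2 * t + 1, k) / (2 * n_scans))

n_scans = 100
D = dct_basis(n_scans, n_basis=4)         # n_scans x 4 drift regressors

# Toy data: a slow made-up drift; regressing on D removes most of it
drift = 0.05 * np.arange(n_scans)
drift = drift - drift.mean()
coef, *_ = np.linalg.lstsq(D, drift, rcond=None)
residual = drift - D @ coef
```

In a full model these columns would be appended to X as nuisance predictors alongside the signal predictors.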

General Linear Model Recap
• Fits data Y as linear combination of predictor columns of X
Y = Xβ + ε

• Very “General”
– Correlation, ANOVA, ANCOVA, …

• Only as good as your X matrix

Temporal Autocorrelation
• Standard statistical methods assume independent errors
– Error εi tells you nothing about εj, for i ≠ j

• fMRI errors not independent
– Autocorrelation due to
• Physiological effects
• Scanner instability

Temporal Autocorrelation In Brief
• Independence
• Precoloring
• Prewhitening

Autocorrelation: Independence Model
• Ignore autocorrelation
• Leads to
– Under-estimation of variance
– Over-estimation of significance
– Too many false positives

Autocorrelation: Precoloring
• Temporally blur, smooth your data
– This induces more dependence!
– But we know exactly the form of the dependence induced
– Assume that intrinsic autocorrelation is negligible relative to smoothing

• Then we know autocorrelation exactly
• Correct GLM inferences based on “known” autocorrelation

[Friston, et al., “To smooth or not to smooth…” NI 12:196-208 2000]

Autocorrelation: Prewhitening
• Statistically optimal solution
• If the true autocorrelation is known exactly, we can undo the dependence
– De-correlate your data, your model
– Then proceed as with independent data

• Problem is obtaining accurate estimates of autocorrelation
– Some sort of regularization is required
• Spatial smoothing of some sort

Autocorrelation Redux
              Advantage               Disadvantage                      Software
Indep.        Simple                  Inflated significance             All
Precoloring   Avoids autocorr. est.   Statistically inefficient         SPM99
Whitening     Statistically optimal   Requires precise autocorr. est.   FSL, SPM2

Autocorrelation: Models
• Autoregressive
– Error is fraction of previous error plus “new” error
– AR(1): εi = ρ εi-1 + ηi
• Software: fmristat, SPM99

• AR + White Noise or ARMA(1,1)
– AR plus an independent WN series
• Software: SPM2

• Arbitrary autocorrelation function
ρk = corr( εi, εi-k )
• Software: FSL’s FEAT
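
The AR(1) model and the idea behind prewhitening can be illustrated on simulated errors. This is a minimal sketch in Python with NumPy; the series length, the value of ρ, and the helper lag1_corr are assumptions for illustration, and it does not reproduce how fmristat, SPM, or FSL estimate autocorrelation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 5000, 0.4                        # series length and AR(1) coefficient (made up)

# Simulate AR(1) errors: eps[i] = rho * eps[i-1] + eta[i]
eta = rng.normal(size=n)
eps = np.zeros(n)
for i in range(1, n):
    eps[i] = rho * eps[i - 1] + eta[i]

def lag1_corr(x):
    """Sample correlation between x[i] and x[i-1]."""
    return np.corrcoef(x[1:], x[:-1])[0, 1]

rho_hat = lag1_corr(eps)                  # estimate of rho from the data

# Prewhitening: subtract the predictable part to recover nearly independent errors
whitened = eps[1:] - rho_hat * eps[:-1]
```

In a GLM the same whitening filter would be applied to both the data Y and the design matrix X before fitting.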

Statistic Images & Hypothesis Testing
• For each voxel
– Fit GLM, estimate betas
  Y = Xβ + ε
• Write b for estimate of β
– But usually not interested in all betas
• Recall β is a length-p vector

Building Statistic Images
Y = X × β + ε, with β = (β1, β2, …, β9)’

[Figure: the model in image form, with the predictor of interest marked among the columns of X]

Building Statistic Images
• Contrast
– A linear combination of parameters
– c’β
  c’ = 1 0 0 0 0 0 0 0
  b = b1 b2 b3 b4 b5 ....

  T = (contrast of estimated parameters) / √(variance estimate)
  T = c’b / √( s² c’(X’X)⁺c )
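
The T statistic for a contrast can be computed directly from its formula. The sketch below, in Python with NumPy, uses simulated data and a made-up two-column design; the contrast c picks out the first parameter, as in the slide.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 120
x1 = np.tile(np.r_[np.zeros(10), np.ones(10)], N // 20)   # hypothetical predictor of interest
X = np.column_stack([x1, np.ones(N)])                     # design matrix with a constant
Y = X @ np.array([1.5, 50.0]) + rng.normal(size=N)        # simulated data, made-up betas

c = np.array([1.0, 0.0])                  # contrast picking out the first parameter

pinv_XtX = np.linalg.pinv(X.T @ X)        # (X'X)^+
b = pinv_XtX @ X.T @ Y                    # parameter estimates
resid = Y - X @ b
dof = N - np.linalg.matrix_rank(X)        # residual degrees of freedom
s2 = resid @ resid / dof                  # variance estimate s^2

T = (c @ b) / np.sqrt(s2 * (c @ pinv_XtX @ c))
```

For this simple design, c’b is just the difference between the mean of the "on" scans and the mean of the "off" scans.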

Hypothesis Test
• So now we have a value T for our statistic
• How big is big?
– Is T=2 big? T=20?

Hypothesis Testing
• Assume Null Hypothesis of no signal
• Given that there is no signal, how likely is our measured T?
• P-value measures this
– Probability of obtaining T as large or larger
• α level
– Acceptable false positive rate

[Figure: null distribution with the observed T and the P-val marked]
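
To get a feel for "how big is big", a one-sided P-value can be approximated with a standard normal distribution. This is a rough sketch in plain Python and rests on an assumption: with many degrees of freedom the t distribution is close to normal. With few degrees of freedom the tails are heavier and this approximation gives P-values that are too small. The function name upper_tail_p is hypothetical.

```python
import math

def upper_tail_p(t_value):
    """One-sided P(Z >= t_value) for a standard normal Z (large-df approximation)."""
    return 0.5 * math.erfc(t_value / math.sqrt(2.0))

alpha = 0.05
p_2 = upper_tail_p(2.0)     # roughly 0.023
p_20 = upper_tail_p(20.0)   # vanishingly small
```

Under this approximation T=2 is significant at α = 0.05 for a single test, while T=20 is far beyond any conventional threshold.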

Random Effects Models
• GLM has only one source of randomness
  Y = Xβ + ε
– Residual error

• But people are another source of error
– Everyone activates somewhat differently…


Fixed vs. Random Effects
• Fixed Effects
– Intra-subject variation suggests all these subjects different from zero

• Random Effects
– Intersubject variation suggests population not very different from zero

[Figure: distribution of each subject’s effect, Subj. 1 to Subj. 6, plotted relative to 0]

Random Effects for fMRI
• Summary Statistic Approach
– Easy
• Create contrast images for each subject
• Analyze contrast images with one-sample t

– Limited
• Only allows one scan per subject
• Assumes balanced designs and homogeneous measurement error

• Full Mixed Effects Analysis
– Hard
• Requires iterative fitting
• REML to estimate inter- and intra-subject variance
– SPM2 & FSL implement this, very differently

– Very flexible
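
The summary statistic approach reduces, at each voxel, to a one-sample t test on one contrast value per subject. A minimal sketch in plain Python, with made-up contrast values for six hypothetical subjects:

```python
import math
import statistics

# Hypothetical contrast values at one voxel, one per subject (made-up numbers)
contrasts = [0.8, 1.1, -0.2, 0.5, 1.4, 0.3]

n = len(contrasts)
mean = statistics.mean(contrasts)
sd = statistics.stdev(contrasts)          # between-subject standard deviation
t_stat = mean / (sd / math.sqrt(n))       # one-sample t with n - 1 degrees of freedom
```

The variance in the denominator comes only from differences between subjects, which is what makes this a random effects inference.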


Random Effects for fMRI: Random vs. Fixed
• Fixed isn’t “wrong”, just usually isn’t of interest
• If it is sufficient to say “I can see this effect in this cohort” then fixed effects are OK
• If you need to say “If I were to sample a new cohort from the population I would get the same result” then random effects are needed

Multiple Testing Problem
• Inference on statistic images
– Fit GLM at each voxel
– Create statistic images of effect

• Which of 100,000 voxels are significant?
α=0.05 ⇒ 5,000 false positives!

[Figure: statistic image thresholded at t > 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, and 6.5]
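
The 5,000 figure is simply α times the number of voxels, and a quick simulation shows it. The sketch below, in plain Python, assumes that no voxel has any signal and that the tests are independent, so that the P-values are independent uniform draws; both are simplifying assumptions for illustration.

```python
import random

rng = random.Random(0)
n_voxels = 100_000
alpha = 0.05

# Under the null hypothesis, p-values are uniform on [0, 1] (no signal anywhere)
p_values = [rng.random() for _ in range(n_voxels)]

n_false_pos = sum(p <= alpha for p in p_values)   # voxels wrongly declared active
expected = alpha * n_voxels                       # 5,000 on average
```

Every one of the voxels counted in n_false_pos is a false positive, since the simulated data contain no signal at all.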

MCP Solutions: Measuring False Positives
• Familywise Error Rate (FWER)
– Familywise Error
• Existence of one or more false positives

– FWER is probability of familywise error

• False Discovery Rate (FDR)
– R voxels declared active, V falsely so
• Observed false discovery rate: V/R

– FDR = E(V/R)

FWER MCP Solutions
• Bonferroni
• Maximum Distribution Methods
– Random Field Theory
– Permutation
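
As a small worked example of the Bonferroni method, the per-voxel level is the desired FWER divided by the number of tests. The sketch below is plain Python; the voxel count is illustrative, and the last line assumes independent tests purely to show that the bound holds in that case.

```python
n_tests = 100_000          # number of voxels tested
alpha = 0.05               # desired familywise error rate

bonferroni_level = alpha / n_tests        # per-voxel significance level

# If the tests were independent (an assumption), the resulting FWER would be
fwer_if_independent = 1.0 - (1.0 - bonferroni_level) ** n_tests
```

Bonferroni guarantees FWER ≤ α under any dependence, but with spatially smooth images it tends to be conservative, which motivates the maximum distribution methods.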

FWER MCP Solutions: Controlling FWER w/ Max
• FWER & distribution of maximum
  FWER = P(FWE)
       = P(One or more voxels ≥ u | Ho)
       = P(Max voxel ≥ u | Ho)

• 100(1-α)%ile of max distribution controls FWER
  FWER = P(Max voxel ≥ uα | Ho) ≤ α

[Figure: distribution of the maximum with tail area α marked]

FWER MCP Solutions: Random Field Theory
• Euler Characteristic χu
– Topological Measure
• #blobs - #holes

– At high thresholds, just counts blobs
– FWER = P(Max voxel ≥ u | Ho)
       = P(One or more blobs | Ho)
       ≈ P(χu ≥ 1 | Ho)
       ≈ E(χu | Ho)

[Figure: Random Field, Threshold, and Suprathreshold Sets]

Controlling FWER: Permutation Test
• Parametric methods
– Assume distribution of max statistic under null hypothesis

• Nonparametric methods
– Use data to find distribution of max statistic under null hypothesis
– Any max statistic!

[Figures: Parametric Null Max Distribution and Nonparametric Null Max Distribution, each with the upper 5% marked]
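
A nonparametric null distribution of the maximum can be built by relabeling the data. The sketch below is plain Python under stated assumptions: a one-sample group design with one contrast value per subject per voxel, made-up null data, and random sign flipping as the relabeling scheme (valid if errors are symmetric about zero). The helper max_t is hypothetical.

```python
import math
import random
import statistics

rng = random.Random(3)
n_subj, n_vox, n_perm = 12, 40, 200

# Made-up null data: one contrast value per subject per voxel, no true signal
data = [[rng.gauss(0.0, 1.0) for _ in range(n_vox)] for _ in range(n_subj)]

def max_t(rows):
    """Largest one-sample t statistic over voxels."""
    best = -math.inf
    for v in range(n_vox):
        col = [row[v] for row in rows]
        t = statistics.mean(col) / (statistics.stdev(col) / math.sqrt(len(col)))
        best = max(best, t)
    return best

observed_max = max_t(data)

# Flip each subject's sign at random and record the max statistic each time
null_max = []
for _ in range(n_perm):
    signs = [rng.choice((-1.0, 1.0)) for _ in range(n_subj)]
    flipped = [[s * x for x in row] for s, row in zip(signs, data)]
    null_max.append(max_t(flipped))

null_max.sort()
u_alpha = null_max[math.ceil(0.95 * n_perm) - 1]   # 95th percentile of the max
significant = observed_max > u_alpha               # FWER-controlling decision
```

Any voxel whose statistic exceeds u_alpha would be declared significant with FWER controlled at about 5%. The same recipe works for other max statistics by changing max_t.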

Measuring False Positives: FWER vs FDR

[Figure: three images labeled Noise, Signal, and Signal+Noise]

Control of Per Comparison Rate at 10%
Percentage of Null Pixels that are False Positives:
11.3% 11.3% 12.5% 10.8% 11.5% 10.0% 10.7% 11.2% 10.2% 9.5%

Control of Familywise Error Rate at 10%
Occurrence of Familywise Error:
[Figure: realizations in which a familywise error occurs are marked “FWE”]

Control of False Discovery Rate at 10%
Percentage of Activated Pixels that are False Positives:
6.7% 10.4% 14.9% 9.3% 16.2% 13.8% 14.0% 10.5% 12.2% 8.7%

Controlling FDR: Benjamini & Hochberg
• Select desired limit q on E(FDR)
• Order p-values, p(1) ≤ p(2) ≤ ... ≤ p(V)
• Let r be largest i such that p(i) ≤ i/V × q
• Reject all hypotheses corresponding to p(1), ... , p(r).
[Figure: ordered p-values p(i) plotted against i/V, both axes from 0 to 1, with the line i/V × q]
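
The Benjamini & Hochberg steps translate almost line for line into code. The sketch below is plain Python; the function name benjamini_hochberg and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q):
    """Return indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    V = len(p_values)
    order = sorted(range(V), key=lambda i: p_values[i])   # indices sorted by p-value
    r = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / V * q:
            r = rank                                       # largest rank passing the bound
    return sorted(order[:r])

# Made-up p-values for illustration
p = [0.30, 0.001, 0.019, 0.45, 0.012, 0.014, 0.60, 0.74, 0.88, 0.95]
rejected = benjamini_hochberg(p, q=0.05)
```

Here the second-smallest p-value (0.012) misses its own bound of 2/10 × 0.05 = 0.010, yet it is still rejected because r is the largest rank that passes, which is what makes the procedure "step-up".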

Conclusions
• Analyzing fMRI Data
– Need linear regression basics
– Lots of disk space, and time
– Watch for MTP (no fishing!)

Thanks
• Slide help
– Stefan Kiebel, Rik Henson, JB Poline, Andrew Holmes
