# Data Modeling: General Linear Model & Statistical Inference

Thomas Nichols, Ph.D. Assistant Professor Department of Biostatistics http://www.sph.umich.edu/~nichols Brain Function and fMRI ISMRM Educational Course July 11, 2002


Motivations

• Data Modeling

– Characterize Signal
– Characterize Noise

• Statistical Inference

– Detect signal
– Localization (Where’s the blob?)

Outline

• Data Modeling

– General Linear Model
– Linear Model Predictors
– Temporal Autocorrelation
– Random Effects Models

• Statistical Inference

– Statistic Images & Hypothesis Testing
– Multiple Testing Problem

**Basic fMRI Example**

• Data at one voxel

– Rest vs. passive word listening

• Is there an effect?


A Linear Model

• “Linear” in parameters β1 & β2

$Y = \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

[Figure: intensity vs. time for the data, the predictors x1 and x2, and the error ε]

Linear model, in image form…

$Y = \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

[Figure: Y, x1, x2 and ε each drawn as an image]

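**Linear model, in image form… (estimated)**

$Y = \hat\beta_1 x_1 + \hat\beta_2 x_2 + \hat\varepsilon$

[Figure: the fitted model in image form, with estimates in place of the unknown parameters and error]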

… in image matrix form…

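$Y = X \times \hat\beta + \hat\varepsilon$, with $\hat\beta = (\hat\beta_1, \hat\beta_2)'$

[Figure: Y, the matrix X and the residuals drawn as images]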

… in matrix form.

Y = Xβ + ε

– Y: N × 1 data vector
– X: N × p matrix of predictors
– β: p × 1 parameter vector
– ε: N × 1 error vector

N: Number of scans, p: Number of regressors
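As a rough illustration of the matrix form, the sketch below builds a two-column X (a boxcar regressor and a constant) and fits it by ordinary least squares with NumPy. The number of scans, the block length, the "true" parameter values and the noise level are all assumptions chosen for the example, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100                                        # number of scans (assumed)
x1 = (np.arange(N) // 10 % 2).astype(float)    # boxcar: 10 scans rest, 10 scans task
x2 = np.ones(N)                                # constant (baseline) regressor
X = np.column_stack([x1, x2])                  # N x p matrix, here p = 2

beta_true = np.array([2.0, 100.0])             # assumed effect size and baseline
Y = X @ beta_true + rng.normal(0.0, 1.0, N)    # Y = X beta + eps

# Least-squares estimate of beta, and the residuals (the estimated error).
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat

print(beta_hat)
```

The residuals are orthogonal to every column of X, which is a quick sanity check on any least-squares fit.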


**Linear Model Predictors**

• Signal Predictors

– Block designs
– Event-related responses

• Nuisance Predictors

– Drift
– Regression parameters

Signal Predictors

• Linear Time-Invariant system
• LTI specified solely by

– Stimulus function of experiment
– Hemodynamic Response Function (HRF)
  • Response to instantaneous impulse

Convolution Examples

[Figure: for a block design (“Blocks”) and an event-related design (“Events”), the experimental stimulus function is convolved with the hemodynamic response function to give the predicted response]
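The convolution step can be sketched in a few lines. The HRF here is an assumed double-gamma shape (a gamma density peaking near 5 s minus a scaled one peaking near 15 s); the shape parameters, the TR of 2 s and the event times are illustrative choices, not taken from the slides or from any package.

```python
import math
import numpy as np

TR = 2.0                                     # assumed seconds per scan
t = np.arange(0.0, 32.0, TR)                 # HRF sampled over 32 s

def gamma_pdf(t, shape):
    """Gamma density with unit scale, evaluated at t >= 0."""
    return t ** (shape - 1) * np.exp(-t) / math.gamma(shape)

# Assumed double-gamma HRF: main response minus a smaller, later undershoot.
hrf = gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0
hrf /= hrf.sum()

N = 60
block = (np.arange(N) // 10 % 2).astype(float)   # 20 s off, 20 s on
events = np.zeros(N)
events[[5, 20, 32, 47]] = 1.0                    # brief events (hypothetical timing)

# LTI prediction: stimulus function convolved with the HRF, cut to N scans.
pred_block = np.convolve(block, hrf)[:N]
pred_events = np.convolve(events, hrf)[:N]
```

Either predicted response can then be used as a signal column of X.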


HRF Models

• Canonical HRF

– Most sensitive if it is correct
– If wrong, leads to bias and/or poor fit
  • E.g. True response may be faster/slower
  • E.g. True response may have smaller/bigger undershoot

[Figure: SPM’s HRF]

HRF Models

• Smooth Basis HRFs

– More flexible
– Less interpretable
  • No one parameter explains the response
– Less sensitive relative to canonical (only if canonical is correct)

[Figures: Gamma Basis; Fourier Basis]

HRF Models

• Deconvolution

– Most flexible
  • Allows any shape
  • Even bizarre, nonsensical ones
– Least sensitive relative to canonical (again, if canonical is correct)

[Figure: Deconvolution Basis]

Drift Models

• Drift

– Slowly varying
– Nuisance variability

• Models

– Linear, quadratic
– Discrete Cosine Transform

[Figure: Discrete Cosine Transform Basis]
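A Discrete Cosine Transform drift basis can be sketched as below. The function name `dct_drift_basis` and the choice of four basis functions over 100 scans are hypothetical; the columns use the standard DCT-II cosine form, leaving the constant term to a separate regressor.

```python
import numpy as np

def dct_drift_basis(n_scans, n_basis):
    """Columns are the lowest-frequency DCT-II cosines (constant term excluded)."""
    t = np.arange(n_scans)
    k = np.arange(1, n_basis + 1)
    # Column k completes k half-cycles over the run.
    return np.cos(np.pi * np.outer(2 * t + 1, k) / (2 * n_scans))

drift = dct_drift_basis(n_scans=100, n_basis=4)   # 100 x 4 nuisance regressors
```

Appending these columns to X lets the model absorb slowly varying nuisance variability; how many to include is a modelling choice.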

**General Linear Model Recap**

• Fits data Y as linear combination of predictor columns of X

Y = Xβ + ε

• Very “General”

– Correlation, ANOVA, ANCOVA, …

**• Only as good as your X matrix**

Temporal Autocorrelation

• Standard statistical methods assume independent errors

– Error εi tells you nothing about εj for i ≠ j

**• fMRI errors not independent**

– Autocorrelation due to
  • Physiological effects
  • Scanner instability

**Temporal Autocorrelation In Brief**

• Independence
• Precoloring
• Prewhitening

**Autocorrelation: Independence Model**

• Ignore autocorrelation
• Leads to

– Under-estimation of variance
– Over-estimation of significance
– Too many false positives
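A small simulation can illustrate the inflation. The sketch below generates pure AR(1) noise (no signal), fits a boxcar model as if errors were independent, and counts how often |T| exceeds the nominal 5% threshold. The AR coefficient of 0.5, the design and the number of runs are assumptions, and the critical value 1.984 is an approximate two-sided 5% point for a t distribution with 98 degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(1)
N, rho, n_sim = 100, 0.5, 2000              # assumed scans, AR(1) coefficient, runs
x = (np.arange(N) // 10 % 2).astype(float)  # boxcar regressor
X = np.column_stack([x, np.ones(N)])
pinvX = np.linalg.pinv(X)
c = np.array([1.0, 0.0])
var_factor = c @ np.linalg.inv(X.T @ X) @ c
t_crit = 1.984                              # approximate 5% two-sided point, 98 df

false_pos = 0
for _ in range(n_sim):
    eta = rng.normal(size=N)
    eps = np.empty(N)
    eps[0] = eta[0]
    for i in range(1, N):                   # AR(1) noise, no signal present
        eps[i] = rho * eps[i - 1] + eta[i]
    b = pinvX @ eps
    resid = eps - X @ b
    s2 = resid @ resid / (N - 2)            # variance estimate assuming independence
    T = (c @ b) / np.sqrt(s2 * var_factor)
    false_pos += int(abs(T) > t_crit)

rate = false_pos / n_sim
print(rate)                                 # nominal rate would be 0.05
```

With positively autocorrelated noise and a slowly varying regressor, the observed rate is well above the nominal 5%.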


Autocorrelation: Precoloring

• Temporally blur, smooth your data

– This induces more dependence!
– But we know exactly the form of the dependence induced
– Assume that intrinsic autocorrelation is negligible relative to smoothing

**• Then we know autocorrelation exactly**
**• Correct GLM inferences based on “known” autocorrelation**

[Friston, et al., “To smooth or not to smooth…” NI 12:196-208 2000]

Autocorrelation: Prewhitening

• Statistically optimal solution
• If the true autocorrelation is known exactly, the dependence can be undone

– De-correlate your data, your model
– Then proceed as with independent data

**• Problem is obtaining accurate estimates of autocorrelation**

– Some sort of regularization is required
  • Spatial smoothing of some sort
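To make "de-correlate your data, your model" concrete, here is a minimal sketch for the special case of AR(1) errors with the coefficient ρ treated as known. In practice ρ has to be estimated, which is the hard part noted above. The helper name `whiten_ar1` and all numerical values are assumptions for illustration.

```python
import numpy as np

def whiten_ar1(A, rho):
    """Apply the AR(1) whitening filter along the first axis (rho assumed known)."""
    A = np.asarray(A, dtype=float)
    W = np.empty_like(A)
    W[0] = np.sqrt(1.0 - rho ** 2) * A[0]   # rescale the first observation
    W[1:] = A[1:] - rho * A[:-1]            # remove the part predicted by the past
    return W

rng = np.random.default_rng(2)
N, rho = 200, 0.5
x = (np.arange(N) // 10 % 2).astype(float)
X = np.column_stack([x, np.ones(N)])

# Stationary AR(1) errors.
eta = rng.normal(size=N)
eps = np.empty(N)
eps[0] = eta[0] / np.sqrt(1.0 - rho ** 2)
for i in range(1, N):
    eps[i] = rho * eps[i - 1] + eta[i]
Y = X @ np.array([1.0, 10.0]) + eps

# Filter data and model with the same filter, then fit as if independent.
Yw, Xw = whiten_ar1(Y, rho), whiten_ar1(X, rho)
beta_hat, *_ = np.linalg.lstsq(Xw, Yw, rcond=None)
```

Applying the filter to the errors alone returns the independent innovations, which is why the usual independent-error inference applies after whitening.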


Autocorrelation Redux

| | Advantage | Disadvantage | Software |
|---|---|---|---|
| Indep. | Simple | Inflated significance | All |
| Precoloring | Avoids autocorr. est. | Statistically inefficient | SPM99 |
| Whitening | Statistically optimal | Requires precise autocorr. est. | FSL, SPM2 |

Autocorrelation: Models

• Autoregressive

– Error is fraction of previous error plus “new” error
– AR(1): $\varepsilon_i = \rho\,\varepsilon_{i-1} + \eta_i$
– Software: fmristat, SPM99

**• AR + White Noise or ARMA(1,1)**

– AR plus an independent WN series
– Software: SPM2

**• Arbitrary autocorrelation function**

– $\rho_k = \mathrm{corr}(\varepsilon_i, \varepsilon_{i-k})$
– Software: FSL’s FEAT
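The AR(1) recursion and the lag-k autocorrelation can be tried out directly. This sketch simulates an AR(1) series with an assumed ρ of 0.4 and computes a simple sample estimate of ρk; for an AR(1) process the theoretical value is ρ to the power k. The helper name `autocorr` is hypothetical.

```python
import numpy as np

def autocorr(e, k):
    """Sample estimate of rho_k = corr(e_i, e_{i-k}) for lag k >= 1."""
    e = e - e.mean()
    return float(e[k:] @ e[:-k] / (e @ e))

rng = np.random.default_rng(3)
N, rho = 5000, 0.4                  # assumed series length and AR coefficient
eta = rng.normal(size=N)            # the "new" error
eps = np.empty(N)
eps[0] = eta[0]
for i in range(1, N):
    eps[i] = rho * eps[i - 1] + eta[i]

print([round(autocorr(eps, k), 2) for k in (1, 2, 3)])
```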


**Statistic Images & Hypothesis Testing**

• For each voxel

– Fit GLM, estimate betas
  • Write b for estimate of β

Y = Xβ + ε

**– But usually not interested in all betas**
  • Recall β is a length-p vector

**Building Statistic Images**

[Figure: Y = X × β + ε in image form, with parameters β1 to β9; one column of X is marked as the predictor of interest]

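**Building Statistic Images**

• Contrast

– A linear combination of parameters
– c’β, e.g. c’ = [1 0 0 0 0 0 0 0] applied to the estimates b = (b1, b2, b3, b4, b5, …)’

• T statistic: contrast of estimated parameters divided by the square root of its variance estimate

$T = \dfrac{c'b}{\sqrt{s^2\, c'(X'X)^{+} c}}$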
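The contrast T statistic can be sketched as follows, using the pseudoinverse for (X’X)⁺. One assumption not stated on the slide: s² is taken to be the residual sum of squares divided by N minus the rank of X. The function name `t_contrast` and the simulated data are hypothetical.

```python
import numpy as np

def t_contrast(Y, X, c):
    """T = c'b / sqrt(s^2 c'(X'X)^+ c) for data Y, model X and contrast c."""
    XtX_pinv = np.linalg.pinv(X.T @ X)
    b = XtX_pinv @ X.T @ Y                       # parameter estimates
    resid = Y - X @ b
    dof = len(Y) - np.linalg.matrix_rank(X)      # assumed degrees of freedom
    s2 = resid @ resid / dof                     # residual variance estimate
    return float(c @ b / np.sqrt(s2 * (c @ XtX_pinv @ c)))

rng = np.random.default_rng(4)
N = 80
x1 = (np.arange(N) // 10 % 2).astype(float)
X = np.column_stack([x1, np.ones(N)])
Y = X @ np.array([1.5, 50.0]) + rng.normal(size=N)

T = t_contrast(Y, X, np.array([1.0, 0.0]))       # contrast picks out the first beta
print(T)
```

For this two-column model the result coincides with the pooled two-sample t statistic comparing task and rest scans, which is one way to see how "general" the GLM is.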


Hypothesis Test

• So now we have a value T for our statistic
• How big is big?

– Is T=2 big? T=20?

Hypothesis Testing

• Assume Null Hypothesis of no signal
• Given that there is no signal, how likely is our measured T?
• P-value measures this

– Probability of obtaining T as large or larger

• α level

– Acceptable false positive rate

[Figure: null distribution of T, with the P-value as the tail area beyond the observed T]
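One way to see what the P-value measures is to simulate the null distribution directly. The sketch below draws many pure-noise datasets, computes a one-sample T for each, and reports the fraction at least as large as an observed T. The sample size of 20, the number of null draws and the use of a one-sample T are assumptions for illustration; in practice the P-value usually comes from the t distribution.

```python
import numpy as np

rng = np.random.default_rng(5)
n, n_sim = 20, 20000                 # assumed sample size and number of null draws

# Null distribution of a one-sample T statistic: pure noise, no signal.
null = rng.normal(size=(n_sim, n))
T_null = null.mean(axis=1) / (null.std(axis=1, ddof=1) / np.sqrt(n))

def p_value(T_obs):
    """Fraction of null T values as large as or larger than the observed one."""
    return float(np.mean(T_null >= T_obs))

alpha = 0.05                         # acceptable false positive rate
for T_obs in (2.0, 20.0):
    p = p_value(T_obs)
    print(T_obs, p, p <= alpha)
```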


**Random Effects Models**

• GLM has only one source of randomness

Y = Xβ + ε

– Residual error

**• But people are another source of error**

– Everyone activates somewhat differently…

**Fixed vs. Random Effects**

• Fixed Effects

– Intra-subject variation suggests all these subjects different from zero

• Random Effects

– Intersubject variation suggests population not very different from zero

[Figure: distribution of each subject’s effect, Subj. 1 to Subj. 6, plotted relative to 0]

**Random Effects for fMRI**

• Summary Statistic Approach

– Easy
  • Create contrast images for each subject
  • Analyze contrast images with one-sample t
– Limited
  • Only allows one scan per subject
  • Assumes balanced designs and homogeneous measurement error

**• Full Mixed Effects Analysis**

– Hard
  • Requires iterative fitting
  • REML to estimate inter- and intra-subject variance
  • SPM2 & FSL implement this, very differently
– Very flexible
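The second step of the summary statistic approach is just a one-sample t test across subjects. The sketch below shows it at a single voxel; the six contrast values are made-up numbers for illustration only.

```python
import numpy as np

# Hypothetical per-subject contrast values at one voxel (one value per subject).
con = np.array([0.8, 1.4, -0.2, 1.1, 0.5, 0.9])

n = len(con)
# One-sample t: the denominator reflects between-subject variability.
T = con.mean() / (con.std(ddof=1) / np.sqrt(n))   # n - 1 degrees of freedom
print(round(float(T), 2))
```

Because the variance comes from the spread across subjects, the resulting inference refers to the population rather than to this particular cohort.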


**Random Effects for fMRI: Random vs. Fixed**

• Fixed isn’t “wrong”, just usually isn’t of interest
• If it is sufficient to say “I can see this effect in this cohort”, then fixed effects are OK
• If you need to say “If I were to sample a new cohort from the population I would get the same result”, then random effects are needed

**Multiple Testing Problem**

• Inference on statistic images

– Fit GLM at each voxel
– Create statistic images of effect

**• Which of 100,000 voxels are significant?**

– α=0.05 ⇒ 5,000 false positives!

[Figure: statistic image thresholded at t > 0.5, 1.5, 2.5, 3.5, 4.5, 5.5 and 6.5]
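The 5,000 figure is easy to check by simulation. Under the null hypothesis, P-values are uniformly distributed on [0, 1], so the sketch below draws 100,000 of them and counts how many fall at or below α = 0.05. The random seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
n_voxels, alpha = 100_000, 0.05

# Under the null hypothesis, P-values are uniform on [0, 1].
p_null = rng.uniform(size=n_voxels)
n_false_pos = int(np.sum(p_null <= alpha))
print(n_false_pos)          # close to alpha * n_voxels = 5,000
```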


**MCP Solutions: Measuring False Positives**

• Familywise Error Rate (FWER)

– Familywise Error

• Existence of one or more false positives

– FWER is probability of familywise error

**• False Discovery Rate (FDR)
**

– R voxels declared active, V falsely so

• Observed false discovery rate: V/R

– FDR = E(V/R)


**FWER MCP Solutions**

• Bonferroni
• Maximum Distribution Methods

– Random Field Theory
– Permutation
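As a quick check of the Bonferroni idea, the sketch below tests each of V voxels at level α/V and estimates the familywise error rate by simulation. It assumes independent null voxels, 1,000 voxels per image and 2,000 simulated images; these numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(7)
n_voxels, alpha, n_sim = 1000, 0.05, 2000

# Null P-values for n_sim images (voxels assumed independent).
p_null = rng.uniform(size=(n_sim, n_voxels))

# A familywise error occurs when any voxel in an image is declared significant.
fwe_uncorrected = np.mean((p_null <= alpha).any(axis=1))
fwe_bonferroni = np.mean((p_null <= alpha / n_voxels).any(axis=1))
print(fwe_uncorrected, fwe_bonferroni)
```

Without correction nearly every null image contains at least one false positive, while the Bonferroni threshold keeps the familywise rate near α.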



**FWER MCP Solutions: Controlling FWER w/ Max**

• FWER & distribution of maximum

FWER = P(FWE)
= P(One or more voxels ≥ u | Ho)
= P(Max voxel ≥ u | Ho)

**• 100(1-α)%ile of max distn controls FWER**

FWER = P(Max voxel ≥ uα | Ho) ≤ α

[Figure: null distribution of the maximum, with upper tail area α beyond the threshold uα]
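The link between the maximum distribution and FWER can be illustrated by Monte Carlo. The sketch assumes null statistic images made of independent standard-normal voxels (real images are spatially correlated), takes uα as the 100(1-α) percentile of the simulated maxima, and then checks the familywise error rate on fresh null images.

```python
import numpy as np

rng = np.random.default_rng(8)
n_voxels, alpha, n_sim = 1000, 0.05, 4000

# Maximum of each null statistic image (independent N(0, 1) voxels assumed).
max_null = rng.normal(size=(n_sim, n_voxels)).max(axis=1)

# u_alpha: the 100(1 - alpha) percentile of the null maximum distribution.
u_alpha = np.quantile(max_null, 1.0 - alpha)

# On fresh null images, how often does any voxel reach u_alpha?
fresh_max = rng.normal(size=(n_sim, n_voxels)).max(axis=1)
fwer = np.mean(fresh_max >= u_alpha)
print(u_alpha, fwer)
```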


**FWER MCP Solutions: Random Field Theory**

• Euler Characteristic χu

– Topological Measure
  • #blobs - #holes
– At high thresholds, just counts blobs
– FWER = P(Max voxel ≥ u | Ho)
= P(One or more blobs | Ho)
≈ P(χu ≥ 1 | Ho)
≈ E(χu | Ho)

[Figure: Random Field, Threshold, and the resulting Suprathreshold Sets]

**Controlling FWER: Permutation Test**

• Parametric methods

– Assume distribution of max statistic under null hypothesis

• Nonparametric methods

– Use data to find distribution of max statistic under null hypothesis
– Any max statistic!

[Figures: Parametric Null Max Distribution and Nonparametric Null Max Distribution, each with the upper 5% tail marked]
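A minimal sketch of the nonparametric idea for a one-sample (group-level) design follows. It uses random sign flipping of whole subjects as the relabelling scheme, which assumes errors are symmetric about zero under the null; the subject count, voxel count, planted signal and number of relabellings are all hypothetical.

```python
import numpy as np

def t_image(data):
    """One-sample t statistic at every voxel; data is subjects x voxels."""
    n = data.shape[0]
    return data.mean(axis=0) / (data.std(axis=0, ddof=1) / np.sqrt(n))

rng = np.random.default_rng(9)
n_subj, n_voxels, n_perm = 12, 500, 1000
data = rng.normal(size=(n_subj, n_voxels))
data[:, :10] += 2.5                        # hypothetical signal in the first 10 voxels

t_obs = t_image(data)

# Each relabelling flips the sign of whole subjects; keep the maximum statistic.
max_null = np.empty(n_perm)
for j in range(n_perm):
    signs = rng.choice([-1.0, 1.0], size=(n_subj, 1))
    max_null[j] = t_image(signs * data).max()

u = np.quantile(max_null, 0.95)            # FWER-corrected 5% threshold
detected = np.flatnonzero(t_obs >= u)
print(u, detected)
```

The same loop works for any maximum statistic, for example a maximum cluster size, by changing what is recorded for each relabelling.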



**Measuring False Positives: FWER vs FDR**

[Figure: three panels, Noise, Signal and Signal+Noise]

**Control of Per Comparison Rate at 10%**

Percentage of Null Pixels that are False Positives: 11.3%, 11.3%, 12.5%, 10.8%, 11.5%, 10.0%, 10.7%, 11.2%, 10.2%, 9.5%

**Control of Familywise Error Rate at 10%**

Occurrence of Familywise Error (marked “FWE” in the figure)

**Control of False Discovery Rate at 10%**

Percentage of Activated Pixels that are False Positives: 6.7%, 10.4%, 14.9%, 9.3%, 16.2%, 13.8%, 14.0%, 10.5%, 12.2%, 8.7%

**Controlling FDR: Benjamini & Hochberg**

• Select desired limit q on FDR
• Order p-values, p(1) ≤ p(2) ≤ ... ≤ p(V) (here V is the total number of p-values)
• Let r be largest i such that p(i) ≤ i/V × q
• Reject all hypotheses corresponding to p(1), ... , p(r).

[Figure: ordered p-values p(i) plotted against i/V, both axes from 0 to 1, with the line i/V × q]
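The Benjamini & Hochberg steps translate almost line for line into code. The function name `benjamini_hochberg` and the ten p-values below are hypothetical, chosen only to show the step-up rule in action.

```python
import numpy as np

def benjamini_hochberg(p, q):
    """Boolean mask of hypotheses rejected by the rule p(i) <= i/V * q."""
    p = np.asarray(p, dtype=float)
    V = p.size
    order = np.argsort(p)                          # indices that sort the p-values
    below = p[order] <= np.arange(1, V + 1) / V * q
    reject = np.zeros(V, dtype=bool)
    if below.any():
        r = np.flatnonzero(below).max()            # zero-based position of largest such i
        reject[order[: r + 1]] = True              # reject p(1), ..., p(r)
    return reject

# Hypothetical p-values for ten tests.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
reject = benjamini_hochberg(p, q=0.05)
print(reject)
```

Note that the rule looks for the largest i satisfying the inequality, so a p-value can be rejected even if it lies above its own line value, provided a larger p-value falls below the line.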

Conclusions

• Analyzing fMRI Data

– Need linear regression basics
– Lots of disk space, and time
– Watch for the multiple testing problem (no fishing!)

Thanks

• Slide help

– Stefan Kiebel, Rik Henson, JB Poline, Andrew Holmes