Multiple Hypothesis Testing
A Quick Review of Hypothesis Testing
1. Define the Null and Alternative Hypotheses
2. Construct the Test Statistic

T = (µ̂t − µ̂c) / (s √(1/nt + 1/nc))
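As a sketch, the statistic above can be computed directly: µ̂t and µ̂c are the sample means and s is the pooled estimate of the standard deviation. The treatment and control samples below are made up for illustration.

```python
import math
import statistics

def two_sample_t(treat, ctrl):
    """Two-sample t-statistic T = (mean_t - mean_c) / (s * sqrt(1/nt + 1/nc))."""
    nt, nc = len(treat), len(ctrl)
    # s^2 is the pooled estimate of the common variance
    s2 = ((nt - 1) * statistics.variance(treat)
          + (nc - 1) * statistics.variance(ctrl)) / (nt + nc - 2)
    return (statistics.mean(treat) - statistics.mean(ctrl)) / math.sqrt(s2 * (1 / nt + 1 / nc))

# Hypothetical treatment and control samples
treat = [5.1, 4.8, 6.0, 5.5, 5.9]
ctrl = [4.2, 4.9, 4.4, 5.0, 4.6]
T = two_sample_t(treat, ctrl)
```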
3. Compute the p-Value
• The p-value is the probability of observing a test statistic at least as extreme as the observed statistic, under the assumption that H0 is true.
• A small p-value provides evidence against H0.
• Suppose we compute T = 2.33 for our test of H0 : µt = µc.
• Under H0, T ∼ N(0, 1) for a two-sample t-statistic.
[Figure: the N(0, 1) probability density function, with the observed statistic T = 2.33 marked.]
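Since T ∼ N(0, 1) under H0, the two-sided p-value for T = 2.33 is the tail area beyond ±2.33, which can be computed with the standard library alone:

```python
import math

T = 2.33
# Two-sided p-value under T ~ N(0, 1):
# p = 2 * (1 - Phi(|T|)) = erfc(|T| / sqrt(2))
p_value = math.erfc(abs(T) / math.sqrt(2))
print(round(p_value, 4))  # about 0.02
```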
4. Decide Whether to Reject H0, Part 1
4. Decide Whether to Reject H0, Part 2
4. Decide Whether to Reject H0, Part 3
Multiple Testing
A Thought Experiment
Multiple Testing: Even XKCD Weighs In
https://xkcd.com/882/
The Challenge of Multiple Testing
The Family-Wise Error Rate
Challenges in Controlling the Family-Wise Error Rate
[Figure: the family-wise error rate as a function of the number of hypotheses tested, for α = 0.05, α = 0.01, and α = 0.001.]
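The growth shown in the figure can be reproduced under the (strong) simplifying assumption that the m tests are independent, in which case FWER = 1 − (1 − α)^m:

```python
def fwer(m, alpha):
    # Probability of at least one false rejection among m independent
    # level-alpha tests: 1 - Pr(no false rejections) = 1 - (1 - alpha)^m
    return 1 - (1 - alpha) ** m

for m in (1, 10, 100):
    print(m, round(fwer(m, 0.05), 3))
```

Even at α = 0.05, the FWER approaches 1 quickly as m grows, which is the motivation for the corrections that follow.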
The Bonferroni Correction

FWER = Pr(falsely reject at least one null hypothesis)
     = Pr(A1 ∪ A2 ∪ · · · ∪ Am)
     ≤ Σ_{j=1}^m Pr(Aj),

where Aj is the event that we falsely reject the jth null hypothesis. If we reject each null hypothesis only when its p-value falls below α/m, then Pr(Aj) ≤ α/m, and so FWER ≤ m · (α/m) = α.
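A minimal sketch of the correction, applied to the five fund-manager p-values that appear later in the deck: each p-value is compared against α/m rather than α.

```python
def bonferroni_reject(pvals, alpha=0.05):
    # Reject H0j only when p_j < alpha / m, so that FWER <= alpha
    m = len(pvals)
    return [p < alpha / m for p in pvals]

# p-values for the five fund managers
pvals = [0.006, 0.918, 0.012, 0.601, 0.756]
print(bonferroni_reject(pvals))  # only the first manager is rejected
```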
Fund Manager Data
Fund Manager Data with Bonferroni Correction
Holm’s Method for Controlling the FWER
Holm’s Method on the Fund Manager Data

Manager   Mean, x̄   s     t-statistic   p-value
One        3.0      7.4    2.86         0.006
Two       -0.1      6.9   -0.10         0.918
Three      2.8      7.5    2.62         0.012
Four       0.5      6.7    0.53         0.601
Five       0.3      6.8    0.31         0.756

• The ordered p-values are p(1) = 0.006, p(2) = 0.012, p(3) = 0.601, p(4) = 0.756, and p(5) = 0.918.
• The Holm procedure rejects the first two null hypotheses, because
  • p(1) = 0.006 < 0.05/(5 + 1 − 1) = 0.0100,
  • p(2) = 0.012 < 0.05/(5 + 1 − 2) = 0.0125,
  • p(3) = 0.601 > 0.05/(5 + 1 − 3) = 0.0167.
• Holm rejects H0 for the first and third managers, but Bonferroni only rejects H0 for the first manager.
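The step-down comparison above can be sketched as follows: the jth smallest p-value is compared against α/(m + 1 − j) in increasing order, stopping at the first failure.

```python
def holm_reject(pvals, alpha=0.05):
    """Holm's step-down procedure: compare the j-th smallest p-value
    against alpha / (m + 1 - j), stopping at the first non-rejection."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if pvals[i] < alpha / (m + 1 - rank):
            reject[i] = True
        else:
            break
    return reject

pvals = [0.006, 0.918, 0.012, 0.601, 0.756]  # managers One..Five
print(holm_reject(pvals))  # rejects managers One and Three
```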
A Comparison with m = 10 p-values
[Figure: ten p-values plotted on a log scale against their ordering.]
A More Extreme Example
[Figure: ten p-values plotted on a log scale against their ordering.]
Holm or Bonferroni?
Other Methods
The False Discovery Rate
Intuition Behind the False Discovery Rate

FDR = E(V/R) = E(number of false rejections / total number of rejections)
Benjamini-Hochberg Procedure to Control FDR
• Specify q, the level at which to control the FDR.
• Compute p-values p1, . . . , pm for the m null hypotheses H01, . . . , H0m.
• Order the p-values so that p(1) ≤ p(2) ≤ · · · ≤ p(m).
• Define L = max{j : p(j) < qj/m}.
• Reject all null hypotheses H0j for which pj ≤ p(L).
Then, FDR ≤ q.
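A sketch of the procedure applied to the fund-manager p-values, with q = 0.05: L is the largest index j with p(j) < qj/m, and every hypothesis with a p-value at most p(L) is rejected.

```python
def bh_reject(pvals, q=0.05):
    """Benjamini-Hochberg: find L = max{j : p_(j) < q*j/m} and
    reject every hypothesis whose p-value is at most p_(L)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    L = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] < q * rank / m:
            L = rank
    if L == 0:
        return [False] * m
    cutoff = pvals[order[L - 1]]
    return [p <= cutoff for p in pvals]

pvals = [0.006, 0.918, 0.012, 0.601, 0.756]
print(bh_reject(pvals))  # rejects managers One and Three
```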
A Comparison of FDR Versus FWER, Part 1
A Comparison of FDR Versus FWER, Part 2
Re-Sampling Approaches
• So far, we have assumed that we want to test some null hypothesis H0 with some test statistic T, and that we know (or can assume) the distribution of T under H0.
• This allows us to compute the p-value.
[Figure: the N(0, 1) probability density function, with the observed statistic T = 2.33 marked.]
A Re-Sampling Approach for a Two-Sample t-Test, Part 1
• Suppose we want to test H0 : E(X) = E(Y) versus Ha : E(X) ≠ E(Y), using nX independent observations from X and nY independent observations from Y.
• The two-sample t-statistic takes the form

  T = (µ̂X − µ̂Y) / (s √(1/nX + 1/nY)).
A Re-Sampling Approach for a Two-Sample t-Test, Part 2
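The standard re-sampling scheme for this test is a permutation test: pool the nX + nY observations, randomly reassign them to the two groups, recompute T on each reassignment, and take the p-value to be the fraction of permuted statistics at least as extreme as the observed one. A sketch (the sample data below are made up):

```python
import math
import random
import statistics

def two_sample_t(x, y):
    nx, ny = len(x), len(y)
    # pooled estimate of the common variance
    s2 = ((nx - 1) * statistics.variance(x)
          + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(s2 * (1 / nx + 1 / ny))

def permutation_pvalue(x, y, B=2000, seed=0):
    """Two-sided permutation p-value: the fraction of B random
    reassignments whose |T*| is at least the observed |T|."""
    rng = random.Random(seed)
    t_obs = abs(two_sample_t(x, y))
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(B):
        rng.shuffle(pooled)
        if abs(two_sample_t(pooled[:len(x)], pooled[len(x):])) >= t_obs:
            extreme += 1
    return extreme / B

# Hypothetical, clearly separated samples: the p-value should be tiny
x = [10.1, 11.3, 12.0, 10.8, 11.6]
y = [0.4, 1.2, 0.9, 1.8, 0.6]
print(permutation_pvalue(x, y))
```

Because it uses only the data's own shuffles, this approach needs no assumption about the null distribution of T, which is the point of the re-sampling slides that follow.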
Application to Gene Expression Data, Part 1
[Figure: histogram of re-sampled test statistics, with the observed statistic T = −2.0936 marked.]
Application to Gene Expression Data, Part 2
[Figure: histogram of re-sampled test statistics, with the observed statistic T = −0.5696 marked.]
More on Re-Sampling Approaches