
the excess discovery count and alpha-investing rules

**Dean P. Foster and Robert A. Stine**

Department of Statistics

**The Wharton School of the University of Pennsylvania**

Philadelphia, PA 19104-6340

April 7, 2005

Abstract

We propose an adaptive, sequential methodology for testing multiple hypotheses. Our methodology consists of a new criterion, the excess discovery count (EDC), and a new class of testing procedures that we call alpha-investing rules. The excess discovery count is the difference between the number of correctly rejected null hypotheses and a fraction of the total number of rejected hypotheses. EDC shares many properties with the false discovery rate (FDR), but is adapted to testing a sequence of hypotheses rather than a fixed set. Because EDC controls the count of incorrectly rejected hypotheses rather than a ratio, we are able to prove that a wide class of testing procedures that we call alpha-investing rules control EDC. Alpha-investing rules mimic alpha-spending rules used in sequential trials, but possess a key difference. When a test rejects a null hypothesis, alpha-investing rules earn additional probability toward testing subsequent hypotheses. Alpha-investing rules allow one to incorporate domain knowledge into the testing procedure and improve the power of the tests.

Key words and phrases: Bonferroni method, false discovery rate (FDR), family wide error rate (FWER), multiple comparison procedure.

All correspondence regarding this manuscript should be directed to Prof. Stine at the address shown with the title. He can be reached via e-mail at stine@wharton.upenn.edu.


EDC and Alpha-investing

1 Introduction

We propose an adaptive, sequential methodology for testing multiple hypotheses. Our approach works in the usual setting in which one has a batch of several hypotheses as well as cases in which the hypotheses arrive sequentially in a stream. Streams of hypothesis tests arise naturally in a variety of contemporary modeling applications, such as genomics and variable selection for large models. In contrast to the comparatively well-defined problems that spawned multiple comparison procedures such as Tukey's studentized range, these applications can involve thousands of tests. For example, microarrays lead one to compare a control group to a treatment group using measured differences on over 6,000 genes (Dudoit, Shaffer and Boldrick, 2003). In contrast, the example used by Tukey to motivate the problems of multiple comparisons compares the means of only 6 groups (Tukey, 1953, available in Braun (1994)). If one considers the possibility for interactions, then the number of tests is virtually infinite. Because our approach allows the testing to proceed sequentially, the choice of future hypotheses can depend upon the results of previous tests. Thus, having discovered differences in certain genes, an investigator could, for example, direct attention toward related genes identified by common transcription factor binding sites (Gupta and Ibrahim, 2005).

Our methodology has two key components, a criterion and a procedure. For multiple testing, we distinguish criteria that control the number of Type I errors from testing procedures. We call our new criterion the excess discovery count (EDC). EDC tracks the expected number of true rejections among the rejected hypotheses. To control EDC, a test procedure must guarantee that the expected count of true rejections exceeds a chosen fraction of the number of rejected hypotheses. For example, one might want to guarantee that at least 95% of the rejected hypotheses were rejected correctly. Although one can use EDC to control traditional tests, the advantage of this criterion is that it permits one to control adaptive testing procedures in which the choice of the next hypothesis to test depends on previous results.

The second component of our methodology is a class of adaptive testing procedures that we call alpha-investing rules. We show that testing procedures in this class control EDC. Alpha-investing rules allow one to test a possibly infinite stream of hypotheses, accommodate dependent tests, and incorporate domain knowledge. Alpha-investing rules mimic alpha-spending rules that are commonly used in clinical trials. Unlike alpha-spending rules, however, alpha-investing rules treat each test as an "investment." Each test has a cost, but can generate a profit in the form of an increase in the amount of Type I error available for subsequent tests.

The rest of this paper develops as follows. We first review several ideas from the literature on multiple comparisons, particularly those related to the family wide error rate and the false discovery rate. With these ideas in place, we define EDC in Section 3 and alpha-investing rules in Section 4. In Section 5, we show that alpha-investing rules control a generalized version of EDC. We give several examples of testing a sequence of hypotheses using alpha-investing rules in Section 6. We close in Section 7 with a brief summary discussion, and defer the single proof to the appendix.

2 Criteria and Procedures

We begin with a brief review of criteria and procedures used to test a collection of hypotheses. To set the stage for describing EDC, we review the two most important criteria commonly applied in testing multiple hypotheses: the family wide error rate and the false discovery rate. These criteria generalize the notion of the Type I error rate ($\alpha$-level) to tests of several hypotheses and are often confused with testing procedures. The false discovery rate is a criterion that one might design a testing procedure to satisfy, but is not itself a testing procedure. Just as there are many $\alpha$-level tests of a simple hypothesis, so too are there various multiple testing procedures. We confine our attention to two, the Bonferroni procedure and step-up/step-down tests. These procedures are most closely related to and suggestive of the alpha-investing rules developed in Section 4.

Suppose that we have a set of $m$ null hypotheses $\mathcal{H}(m) = \{H_1, H_2, \ldots, H_m\}$ that specify values for parameters $\theta = \{\theta_1, \theta_2, \ldots, \theta_m\}$. Each parameter $\theta_j$ can be scalar or vector-valued, and $\Theta$ denotes the space of parameter values. In the most familiar case, each null hypothesis specifies that a parameter is zero, $H_j : \theta_j = 0$. We describe the situation in which every hypothesis has this form and is true as the "null model."

We follow the standard notation for labeling the true and false rejections as shown in Table 1, which is taken from Benjamini and Hochberg (1995). Assume that $m_0$ of the null hypotheses in $\mathcal{H}(m)$ are true. The observable statistic $R(m)$ counts how many of these $m$ hypotheses are rejected. The unobservable random variable $V^\theta(m)$ denotes the number of false positives among the $m$ tests, those cases in which the testing procedure incorrectly rejects a true null hypothesis. Similarly, $S^\theta(m) = R(m) - V^\theta(m)$ counts the number of correctly rejected null hypotheses. We index these random variables with a superscript $\theta$ to distinguish them from a statistic such as $R(m)$; $V^\theta(m)$ and $S^\theta(m)$ are not observable without $\theta$. For a null model, $m_0 = m$, $V^\theta(m) = R(m)$ and $S^\theta(m) = 0$.

Table 1: Counts of the number of null hypotheses that are true and false, displayed as sums of unobserved random variables. The marginal random variable $R(m)$ that counts the total number rejected is observable, but internal counts such as $V^\theta(m)$ depend upon $\theta$.

| True state | Claim: accept $H_0$ | Claim: reject $H_0$ | Total |
|---|---|---|---|
| $H_0$ true | $U^\theta(m)$ | $V^\theta(m)$ | $m_0$ |
| $H_0$ false | $T^\theta(m)$ | $S^\theta(m)$ | $m - m_0$ |
| Total | $m - R(m)$ | $R(m)$ | $m$ |

A basic premise of multiple testing is to control the chance for any false rejection. The family wide error rate (FWER) is the probability of falsely rejecting any null hypothesis from $\mathcal{H}(m)$, regardless of the values of the underlying parameters,

$$\mathrm{FWER}(m) \equiv \sup_{\theta \in \Theta} P_\theta\bigl(V^\theta(m) \ge 1\bigr). \tag{1}$$

An important special case is control of FWER under the null model. We refer to this criterion as the size of a procedure,

$$\mathrm{Size}(m) = P_0\bigl(V^\theta(m) \ge 1\bigr), \tag{2}$$

where $P_0$ denotes the probability measure under the null model. All of the procedures that we describe control $\mathrm{Size}(m)$, but not all control the more general FWER.

The Bonferroni procedure is familiar and represents an important benchmark for comparison. Let $p_1, \ldots, p_m$ denote the p-values of tests of $H_1, \ldots, H_m$. Given a chosen level $0 < \alpha < 1$, the usual Bonferroni procedure rejects those $H_j$ for which $p_j \le \alpha/m$. Let the indicators $V_j^\theta \in \{0, 1\}$ track incorrect rejections; $V_j^\theta = 1$ if $H_j$ is incorrectly rejected and is zero otherwise. Then $V^\theta(m) = \sum_j V_j^\theta$ and the inequality

$$P_\theta\bigl(V^\theta(m) \ge 1\bigr) \le \sum_{j=1}^{m} P_\theta\bigl(V_j^\theta = 1\bigr) \le \alpha \tag{3}$$

shows that this procedure controls $\mathrm{FWER}(m) \le \alpha$. More generally, one need not distribute $\alpha$ equally over $\mathcal{H}(m)$; the procedure only requires that the sum of the $\alpha$-levels is not more than $\alpha$. For example, alpha-spending rules allocate $\alpha$ over a collection of hypotheses with a larger share given to hypotheses of greater interest. Although it controls FWER, the Bonferroni procedure is often criticized for having little power compared to other methods. Clearly, its power decreases as $m$ increases because the threshold $\alpha/m$ for detecting a significant effect decreases.

To obtain more power when some null hypotheses are false but still control FWER, Holm (1979) introduced the following so-called step-down testing procedure. Order the collection of $m$ hypotheses so that the p-values of the associated test statistics are sorted from smallest to largest (putting the most significant first),

$$p_{(1)} \le p_{(2)} \le \cdots \le p_{(m)}.$$

The test of $H_{(1)}$ has p-value $p_{(1)}$, the test of $H_{(2)}$ has p-value $p_{(2)}$ and so forth. Holm's procedure rejects those hypotheses $H_{(j)}$ for which $p_{(j)}$ is less than an increasing sequence of thresholds. The procedure first compares the smallest p-value to the Bonferroni threshold. If $p_{(1)} > \alpha/m$, the procedure stops and does not reject any hypothesis. Consequently, $\mathrm{Size}(m) \le \alpha$. If $p_{(1)} \le \alpha/m$, the procedure rejects $H_{(1)}$ and moves on to test $H_{(2)}$. Rather than compare $p_{(2)}$ to $\alpha/m$, however, Holm's procedure compares $p_{(2)}$ to a larger threshold, $\alpha/(m-1)$. In general, if we define $j_d = \min\{j : p_{(j)} > \alpha/(m-j+1)\}$, then Holm's step-down procedure rejects $H_{(1)}, \ldots, H_{(j_d-1)}$. Because of the nesting, this testing procedure is closed in the sense of Marcus, Peritz and Gabriel (1976) and hence controls $\mathrm{FWER}(m) \le \alpha$. Obviously, when compared to using the Bonferroni threshold for each p-value, Holm's method has larger power. The improvement is small, however, when $m$ is large because $\alpha/m$ is so close to $\alpha/(m-j)$ when testing the smallest p-values.
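The two FWER procedures described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper; the function names `bonferroni_reject` and `holm_step_down` and the example p-values are assumptions made here.

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Indices j whose p-value satisfies p_j <= alpha / m."""
    m = len(pvals)
    return [j for j, p in enumerate(pvals) if p <= alpha / m]


def holm_step_down(pvals, alpha=0.05):
    """Indices rejected by Holm's step-down procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda j: pvals[j])  # most significant first
    rejected = []
    for rank, j in enumerate(order, start=1):
        # The rank-th smallest p-value is compared with alpha / (m - rank + 1).
        if pvals[j] > alpha / (m - rank + 1):
            break  # first non-rejection: this rank plays the role of j_d
        rejected.append(j)
    return rejected


# Hypothetical p-values, chosen only to show the two rules disagreeing.
pvals = [0.001, 0.012, 0.02, 0.3, 0.8]
print(bonferroni_reject(pvals))
print(holm_step_down(pvals))
```

With these five p-values the second smallest, 0.012, misses the Bonferroni threshold 0.05/5 but passes Holm's second threshold 0.05/4, which is the modest gain in power the text describes.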

The false discovery rate (FDR) criterion controls the size of a testing procedure but introduces a different type of control if the null model is rejected. Benjamini and Hochberg (1995) define FDR as the expected proportion of false positives among rejected hypotheses,

$$\mathrm{FDR}(m) = E_\theta\!\left(\frac{V^\theta(m)}{R(m)} \,\Big|\, R(m) > 0\right) P\bigl(R(m) > 0\bigr). \tag{4}$$

For the null model, $R(m) = V^\theta(m)$ and $\mathrm{FDR}(m) = \mathrm{FWER}(m)$. Thus, test procedures that control $\mathrm{FDR}(m) \le \alpha$ have $\mathrm{Size}(m) \le \alpha$. Under the alternative, $\mathrm{FDR}(m)$ decreases as the number of false null hypotheses $m - m_0$ increases (Dudoit et al., 2003). As a result, $\mathrm{FDR}(m)$ becomes easier to control in the presence of non-zero effects, allowing more powerful procedures. Variations on FDR include pFDR (which drops the term $P(R > 0)$; Storey, 2002, 2003) and the local false discovery rate fdr(z) (which estimates the false discovery rate as a function of the size of the test statistic; Efron, 2005a,b). Closer to our work, Meinshausen and Rice (2004) and Meinshausen and Buehlmann (2004) consider estimates of $m - m_0$, the total number of false null hypotheses in $\mathcal{H}(m)$.

Benjamini and Hochberg (1995) show that the following so-called step-up testing procedure controls FDR. First, assume that the p-values are independent and define $j_u = \max\{j : p_{(j)} \le j\alpha/m\}$. Using the inequality of Simes (1986), they show that the testing procedure that rejects $H_{(1)}, \ldots, H_{(j_u)}$ controls $\mathrm{FDR}(m) \le \alpha$. This testing procedure thus controls $\mathrm{Size}(m) \le \alpha$, but does not control FWER for all $\theta$. A similar step-down procedure that rejects $H_{(1)}, \ldots, H_{(j_d-1)}$ for $j_d = \min\{j : p_{(j)} > j\alpha/m\}$ also has $\mathrm{FDR}(m) \le \alpha$. Although this step-down procedure has less power than its step-up cousin (because $j_d - 1 \le j_u$), it has more power than Holm's procedure. Holm's step-down procedure sets thresholds for the p-values to $\alpha/m, \alpha/(m-1), \alpha/(m-2), \ldots$ whereas a Simes-based step-down procedure uses the larger thresholds $\alpha/m, 2\alpha/m, 3\alpha/m, \ldots$. A cost of this greater power is a restriction to independent tests that Holm's procedure does not require. Subsequent papers (such as Benjamini and Yekutieli, 2001; Sarkar, 1998; Troendle, 1996) consider situations in which this type of step-up/step-down testing controls FDR under dependence, but the results obtain only for certain types of dependence.
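The difference between the step-up and Simes-based step-down rules is easy to see in code. The sketch below assumes the Simes thresholds $j\alpha/m$ for both rules, as in the list of thresholds quoted above; the function names and the p-values are hypothetical.

```python
def step_up_count(pvals, alpha=0.05):
    """j_u = max{j : p_(j) <= j * alpha / m}; 0 if no rank qualifies."""
    m = len(pvals)
    j_u = 0
    for j, p in enumerate(sorted(pvals), start=1):
        if p <= j * alpha / m:
            j_u = j  # keep scanning: a later rank may still qualify
    return j_u


def simes_step_down_count(pvals, alpha=0.05):
    """j_d - 1, where j_d = min{j : p_(j) > j * alpha / m}."""
    m = len(pvals)
    count = 0
    for j, p in enumerate(sorted(pvals), start=1):
        if p > j * alpha / m:
            break  # stop at the first p-value above its threshold
        count = j
    return count


# Hypothetical p-values: the second smallest misses its threshold,
# but the third smallest passes, so step-up rejects more.
pvals = [0.005, 0.029, 0.028, 0.6, 0.9]
print(simes_step_down_count(pvals), step_up_count(pvals))
```

Step-down halts at the first failure, whereas step-up looks for the last success, which is why the step-down count can never exceed the step-up count.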

3 The Excess Discovery Count (EDC)

The excess discovery count (EDC) is a new criterion for controlling a multiple testing procedure. Its form resembles that of FDR, and it too controls an unobservable random variable. EDC operates in the domain of counts, however, rather than ratios of counts, and EDC emphasizes the number of correct rejections $S^\theta(m)$ rather than the number of incorrect rejections $V^\theta(m)$. EDC is the expected difference between the number of correctly rejected null hypotheses $S^\theta(m)$ and a fraction $0 \le \gamma \le 1$ of the number of rejected hypotheses $R(m)$ (see Figure 1). For a procedure that tests $\mathcal{H}(m)$, we have

Definition 1. The excess discovery count criterion for testing a set of $m$ hypotheses is

$$\mathrm{EDC}_{\alpha,\gamma}(m) = E_\theta\bigl[S^\theta(m) - \gamma R(m)\bigr] + \alpha, \qquad 0 < \alpha, \gamma < 1. \tag{5}$$

Typical values for the two tuning parameters $\alpha$ and $\gamma$ are 0.05 and 0.95, respectively.

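Figure 1: EDC controls the gap between the number of true rejections $S^\theta$ and a fraction of the number of rejected null hypotheses. A strong signal implies most of the null hypotheses in $\mathcal{H}$ are false.

[Figure 1 plot: vertical axis "Count"; curves $E_\theta R$, $E_\theta S^\theta$ and $\gamma E_\theta R - \alpha$, with EDC marking the gap; horizontal axis $\theta$, running from "No signal" through "Moderate" to "Strong signal".]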

$\mathrm{FDR}(m)$ controls the expected proportion of false positives $V^\theta(m)/R(m)$ given that $R(m) > 0$. $\mathrm{EDC}_{\alpha,\gamma}(m)$ instead controls the expected difference in the counts $S^\theta(m) - \gamma R(m)$. Being a ratio, $0 \le \mathrm{FDR}(m) \le 1$ and hence resembles a conditional probability. In contrast $\mathrm{EDC}_{\alpha,\gamma}(m)$ need not be positive, let alone lie between 0 and 1.

We are most interested in procedures such as that suggested by Figure 1 for which EDC is positive. In this figure, the x-axis indicates the amount of signal in the sense of the proportion of null hypotheses in $\mathcal{H}$ that are false. "Strong signal" implies that many of the $m$ hypotheses are false, whereas "no signal" implies the null model. We will say that a multiple testing procedure "controls EDC" if $\mathrm{EDC}_{\alpha,\gamma}(m) \ge 0$. Control of EDC amounts to showing that the expected count of true rejections is at least $\gamma E_\theta R(m) - \alpha$. Under the null model, $S^\theta(m) = 0$ so that

$$\mathrm{EDC}_{\alpha,\gamma}(m) = \alpha - \gamma E_0 R(m) \le \alpha - \gamma\,\mathrm{Size}(m).$$

Thus, a procedure that controls $\mathrm{EDC}_{\alpha,\gamma}(m) \ge 0$ also controls $\mathrm{Size}(m) \le \alpha/\gamma$. One can also use EDC to control FWER. If $\gamma = 1$, control of EDC implies control of FWER because

$$\mathrm{EDC}_{\alpha,1}(m) \ge 0 \;\Rightarrow\; P_\theta\bigl(V^\theta(m) \ge 1\bigr) \le E_\theta V^\theta(m) \le \alpha.$$

This property suggests that one can think of $\alpha$ as controlling the FWER when $\gamma \approx 1$.
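The implication for $\gamma = 1$ can be spelled out in three short steps. The following sketch uses only the identity $S^\theta(m) = R(m) - V^\theta(m)$ and Markov's inequality for the nonnegative integer count $V^\theta(m)$; the notation follows the definitions above.

```latex
\begin{align*}
\mathrm{EDC}_{\alpha,1}(m)
  &= E_\theta\bigl[S^\theta(m) - R(m)\bigr] + \alpha
   = \alpha - E_\theta V^\theta(m)
  && \text{since } S^\theta(m) = R(m) - V^\theta(m), \\
\mathrm{EDC}_{\alpha,1}(m) \ge 0
  &\iff E_\theta V^\theta(m) \le \alpha, \\
P_\theta\bigl(V^\theta(m) \ge 1\bigr)
  &\le E_\theta V^\theta(m) \le \alpha
  && \text{by Markov's inequality.}
\end{align*}
```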

The second tuning parameter $\gamma$ more closely resembles FDR in the sense of controlling the procedure once it rejects the null model. Assuming that $E_\theta R(m) > 0$, control of $\mathrm{EDC}_{\alpha,\gamma}(m)$ implies

$$E_\theta\bigl[S^\theta(m) - \gamma R(m)\bigr] + \alpha \ge 0 \;\Rightarrow\; \frac{E_\theta V^\theta(m)}{E_\theta R(m)} \le (1 - \gamma) + \frac{\alpha}{E_\theta R(m)}.$$

When many hypotheses in $\mathcal{H}(m)$ are false and $R(m)$ is large, most of the control on the procedure comes from $\gamma$. Figure 2 shows EDC from this "FDR point of view" that emphasizes the ratio $E_\theta V^\theta(m) / E_\theta R(m)$ rather than counts. The FDR criterion on this scale is a horizontal line anchored at FWER that controls $E_\theta\bigl(V^\theta(m)/R(m)\bigr)$ rather than the ratio of expectations. A criterion that controls the ratio of expectations (rather than the expectation of the ratio) has been discussed in Benjamini and Hochberg (1995).

Figure 2: When viewed as controlling the proportion of false positives among rejected null hypotheses, EDC controls the gap between the ratio of expectations $E_\theta V^\theta / E_\theta R$ and a decreasing function of the number of rejected null hypotheses. A strong signal in the heuristic sense here implies most of the null hypotheses in $\mathcal{H}$ are false.

[Figure 2 plot: vertical axis "Proportion", with a FWER reference level marked; curves $(1-\gamma) + \alpha / E_\theta R$ and $E_\theta V^\theta / E_\theta R$; horizontal axis $\theta$, running from "No signal" through "Moderate" to "Strong signal".]

To supplement these sketches, we ran a small simulation. Figure 3 shows simulated values of FDR and EDC for testing a collection of $m = 200$ hypotheses using three procedures: a naive, fixed-level test that rejects $H_j$ if $p_j \le \alpha = 0.05$, the step-down Simes procedure, and the standard Bonferroni procedure. The tested hypotheses $H_j : \mu_j = 0$ specify the means of 200 normal populations. We set the values of the $\mu_j$ by sampling a spike-and-slab mixture. The mixture puts $100(1 - \pi_1)\%$ of its probability in a spike at zero; $\pi_1 = 0$ identifies the null model. The slab of this mixture (the signal) is a normal distribution, so that

$$\mu_j \sim \begin{cases} 0 & \text{w.p. } 1 - \pi_1, \\ N(0, \sigma^2) & \text{w.p. } \pi_1. \end{cases} \tag{6}$$

We set the variance of the signal component of the mixture to $\sigma^2 = 2 \log m$ so that the standard deviation of the non-zero $\mu_j$ matches the bound commonly used in hard thresholding. The test statistics are independent, normally distributed random variables $Z_j \overset{\text{iid}}{\sim} N(\mu_j, 1)$ for which the two-sided p-values are $p_j = 2(1 - \Phi(|Z_j|))$. Given these p-values, we computed FDR and $\mathrm{EDC}_{0.05,0.95}$ in a simulation with 10,000 trials. In the simulation, we varied the amount of signal by varying $\pi_1$ from 0 (the null model) to 1.

Figure 3: FDR (left) and EDC (right, with $\alpha = 0.05$ and $\gamma = 0.95$) control the size of test procedures ($\pi_1 = 0$) and the number rejected as the level of signal $\pi_1$ grows. The lines show FDR and EDC for the Bonferroni procedure, Simes-based step-down testing, and a naive procedure that rejects each hypothesis at level $\alpha = 0.05$.

[Figure 3 plots: left panel FDR, vertical axis from 0.01 to 0.08; right panel EDC, vertical axis from -2 to 6; both against $\pi_1$ from 0.1 to 1.]
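A reduced version of this simulation can be sketched with the Python standard library alone. The code below is an illustration under stated assumptions, not the authors' simulation: it covers only the naive and Bonferroni procedures, uses far fewer trials than the 10,000 reported, and the names `simulate` and `two_sided_pvalue` are invented here.

```python
import math
import random


def two_sided_pvalue(z):
    """Two-sided normal p-value, 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate(pi1, m=200, alpha=0.05, gamma=0.95, trials=500, seed=1):
    """Return {procedure: (FDR estimate, EDC estimate)} for signal level pi1."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * math.log(m))  # slab s.d., so sigma^2 = 2 log m
    cutoffs = {"naive": alpha, "bonferroni": alpha / m}
    fdr = dict.fromkeys(cutoffs, 0.0)
    edc = dict.fromkeys(cutoffs, 0.0)
    for _ in range(trials):
        # Spike-and-slab means: 0 w.p. 1 - pi1, N(0, sigma^2) w.p. pi1.
        is_null = [rng.random() >= pi1 for _ in range(m)]
        means = [0.0 if null else rng.gauss(0.0, sigma) for null in is_null]
        pvals = [two_sided_pvalue(rng.gauss(mu, 1.0)) for mu in means]
        for name, cut in cutoffs.items():
            r = sum(p <= cut for p in pvals)                               # R(m)
            v = sum(p <= cut and null for p, null in zip(pvals, is_null))  # V(m)
            s = r - v                                                      # S(m)
            fdr[name] += (v / r if r > 0 else 0.0) / trials
            edc[name] += (s - gamma * r + alpha) / trials
    return {name: (fdr[name], edc[name]) for name in cutoffs}


for pi1 in (0.0, 0.5):
    print(pi1, simulate(pi1))
```

Averaging $V/R$ with the convention $0/0 = 0$ estimates $E(V/R \mid R > 0)\,P(R > 0)$ directly, so no separate estimate of $P(R > 0)$ is needed.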

Qualitatively, FDR and EDC perform similarly. The shaded regions in Figure 3 indicate lack of control of the indicated criterion. Bonferroni and step-down testing control $\mathrm{FWER}(200) \le 0.05$ and $\mathrm{EDC}_{\alpha,\gamma}(200) \ge 0$. Simulated values of these criteria remain outside of the shaded regions for all values of $\pi_1$. On the other hand, the naive procedure that tests all 200 hypotheses at level 0.05 produces results that fall into the shaded region for many values of $\pi_1$. Both FDR and EDC show this procedure as switching from liberal (shaded region) to conservative at about the same level of signal, namely $0.6 < \pi_1 < 0.7$. Notice that FDR emphasizes, relatively speaking, differences among the procedures when the amount of signal is small; as $\pi_1$ nears 1, FDR falls to zero for all 3 procedures. Dudoit et al. (2003) discuss this aspect of FDR further. EDC preserves a more uniform scale for various amounts of signal. We note also that the Bonferroni procedure produces linear trends in EDC. The slope of the line seen in the right panel of Figure 3 depends upon the choice of $\gamma$ in $\mathrm{EDC}_{\alpha,\gamma}$. Conservative methods force $V^\theta(m)$ to be small regardless of the presence of signal so that $E_\theta\bigl[S^\theta(m) - \gamma R(m)\bigr] + \alpha$ grows roughly in proportion to $(1 - \gamma)\,\pi_1$.

4 Alpha-Investing Rules

Alpha-investing rules provide a framework for devising multiple testing procedures that control EDC in a dynamic setting that allows streams of hypotheses. Alpha-investing rules resemble alpha-spending rules such as those often used in sequential clinical trials. In a sequential trial, investigators routinely monitor the accumulating results for safety and efficacy. This monitoring leads to a sequence of tests of one (or several) null hypotheses as the data accumulate. Alpha-spending (or error-spending) rules control the level of such tests. Given an overall Type I error rate for the trial, such as $\alpha = 0.05$, alpha-spending rules allocate, or spend, $\alpha$ over a sequence of tests. As Tukey (1991) writes, "Once we have spent this error rate, it is gone." When repeatedly testing one null hypothesis $H_0$ in a clinical trial, spending rules guarantee that $P(\text{reject } H_0) \le \alpha$ when $H_0$ is true.

While similar in that they allocate Type I error over multiple tests, alpha-investing rules differ from alpha-spending rules in the following way. An alpha-investing rule earns additional probability toward subsequent Type I errors with each rejected hypothesis. Rather than treating each test as an expense that consumes its Type I error rate, an alpha-investing rule treats tests as investments, motivating our choice of name. In keeping with this analogy, we call the Type I error rate available to the rule its alpha-wealth. As with an alpha-spending rule, an alpha-investing rule can never spend more than its current alpha-wealth. Unlike an alpha-spending rule, however, an alpha-investing rule earns an increment in its alpha-wealth each time that it rejects a null hypothesis. For alpha-investing, Tukey's remark becomes "If we invest the error rate wisely, we'll earn more for further tests." A procedure that invests its alpha-wealth in testing hypotheses that are rejected accumulates additional wealth toward subsequent tests. The more hypotheses that are rejected, the more alpha-wealth it earns. If the test of $H_j$ is not significant, however, the rule loses the $\alpha$-level invested in this test and its alpha-wealth decreases. The more wealth a rule invests in testing hypotheses that are not rejected, the less alpha-wealth remains for subsequent tests.


More specifically, an alpha-investing rule is a function $I$ that determines the $\alpha$-level for testing the next hypothesis in a sequence of tests. We assume an exogenous system external to the investing rule determines the next hypothesis to test. (Though not part of the investing rule itself, this exogenous system can use the sequence of rejections $R_j$ to determine the next hypothesis to test.) An alpha-investing rule has two parameters: the initial alpha-wealth and the amount earned (called the pay-out) when a null hypothesis is rejected. Let $W(k) \ge 0$ denote the alpha-wealth accumulated by an investing rule after $k$ tests; $W(0)$ is the initial alpha-wealth. For example, one might conventionally set $W(0) = 0.05$ or $0.10$. At step $j$, an alpha-investing rule sets the level for testing $H_j$ to some value $\alpha_j$ up to its current wealth, $0 \le \alpha_j \le W(j-1)$. The level $\alpha_j$ for testing $H_j$ typically depends upon the sequence of prior outcomes $R_1, R_2, \ldots, R_{j-1}$, and so we write an alpha-investing rule in general as

$$\alpha_j = I_{W(0),\omega}(R_1, R_2, \ldots, R_{j-1}) = I_{W(0),\omega}(j). \tag{7}$$

The outcomes of the sequence of tests determine the alpha-wealth $W(j)$ available for testing $H_{j+1}$. Let $p_j$ denote the p-value of the test of $H_j$. If $p_j \le \alpha_j$, the test rejects $H_j$. In this case, the investing rule pays $\log\bigl(1/(1 - p_j)\bigr) \approx p_j$ from the invested $\alpha_j$ and earns a pay-out $\omega$ that is added to its alpha-wealth. If $p_j > \alpha_j$, the procedure does not reject $H_j$ and its alpha-wealth decreases by $-\log(1 - \alpha_j)$. The change in the alpha-wealth is thus

$$W(j) - W(j-1) = \begin{cases} \omega + \log(1 - p_j) & \text{if } p_j \le \alpha_j, \\ \log(1 - \alpha_j) & \text{if } p_j > \alpha_j. \end{cases} \tag{8}$$
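The wealth update in (8) translates directly into code. The sketch below is a hedged illustration rather than the authors' implementation: `run_investing`, the `rule(wealth, rejections)` callback signature and the `half_wealth` rule are hypothetical names, and the cap on the level (so that a failed test cannot drive the wealth below zero) is an assumption added here, not part of (8).

```python
import math


def wealth_change(p, level, omega):
    """W(j) - W(j-1) as in equation (8)."""
    if p <= level:
        return omega + math.log(1.0 - p)  # reject: earn omega, pay -log(1 - p)
    return math.log(1.0 - level)          # no rejection: pay -log(1 - level)


def run_investing(pvals, rule, w0=0.05, omega=0.05):
    """Apply an investing rule to a stream of p-values.

    rule(wealth, rejections) proposes the level for the next test.
    Returns the 0/1 rejection indicators and the wealth path W(0), W(1), ...
    """
    wealth, rejections, path = w0, [], [w0]
    for p in pvals:
        level = rule(wealth, rejections)
        # Assumption (not in the text): cap the level so that the cost of a
        # non-rejection, -log(1 - level), never exceeds the current wealth.
        level = max(0.0, min(level, 1.0 - math.exp(-wealth)))
        rejections.append(1 if p <= level else 0)
        wealth += wealth_change(p, level, omega)
        path.append(wealth)
    return rejections, path


def half_wealth(wealth, rejections):
    """Hypothetical rule: invest half of the current alpha-wealth."""
    return wealth / 2.0


rejections, path = run_investing([0.001, 0.2, 0.01, 0.5], half_wealth)
print(rejections)
print([round(w, 5) for w in path])
```

Passing the history of rejections to the rule mirrors the general form (7), in which the level may depend on $R_1, \ldots, R_{j-1}$; the simple rule used here ignores that history.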

The appearance of $\log(1 - \alpha)$ and $\log(1 - p)$ in (8) deserves some explanation. Consider the following "micro-investment" approach to testing a single null hypothesis $H_0$. Set the initial wealth $W(0) = \alpha$ and assume that the test of $H_0$ returns p-value $p_0$. Rather than use one test at level $\alpha$, a micro-investment approach uses a sequence of tests, each risking a small amount $\epsilon$ of the total alpha-wealth. First test $H_0$ at level $\epsilon$, rejecting $H_0$ if $p_0 \le \epsilon$. If $p_0 > \epsilon$, the investing rule pays $\epsilon$ for the first test, and then tests $H_0$ conditionally on $p_0 > \epsilon$ at level $\epsilon$. This second test rejects $H_0$ if $\epsilon < p_0 \le 2\epsilon - \epsilon^2$. If this second test does not reject $H_0$, the investing rule again pays $\epsilon$ and retests $H_0$, now conditionally on $p_0 > 2\epsilon - \epsilon^2$. This process continues until the investing rule either spends all of its alpha-wealth or rejects $H_0$ on the $k$th attempt because

$$1 - (1 - \epsilon)^{k-1} < p_0 \le 1 - (1 - \epsilon)^k.$$

If the procedure rejects $H_0$ after $k$ tests, then the total of the micro-payments made is

$$k\epsilon = \epsilon\,\frac{\log(1 - p_0)}{\log(1 - \epsilon)} \to -\log(1 - p_0) \quad \text{as } \epsilon \to 0.$$

The increments to the wealth defined in equation (8) essentially treat each test as a sequence of such micro-level tests.
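The limit of the micro-payments is easy to check numerically. In the sketch below, `eps` stands for the small amount risked at each micro-test and `micro_payments` is a hypothetical helper; both names are assumptions, as is the example p-value.

```python
import math


def micro_payments(p0, eps):
    """Total paid, k * eps, when H0 is rejected on the k-th micro-test."""
    # k is the smallest integer with p0 <= 1 - (1 - eps)**k.
    k = math.ceil(math.log(1.0 - p0) / math.log(1.0 - eps))
    return k * eps


p0 = 0.03
limit = -math.log(1.0 - p0)
for eps in (1e-2, 1e-3, 1e-4, 1e-5):
    print(eps, micro_payments(p0, eps), limit)
```

As `eps` shrinks, the total paid settles on $-\log(1 - p_0)$, which is slightly more than $p_0$ itself and matches the payment charged for a rejection in (8).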

In the next section, we show that alpha-investing rules that accumulate alpha-wealth in this way control EDC. The initial alpha-wealth $W(0)$ controls the chance for rejecting the null model. Under the null model when no hypothesis is rejected, an investing rule performs like an alpha-spending rule with level $W(0)$ and so $\mathrm{Size}(m) \le W(0)$. Results described in the next section permit one to make a correspondence between the parameters $W(0)$ and $\omega$ that characterize an alpha-investing rule and the parameters $\alpha$ and $\gamma$ that identify EDC. In particular, to control EDC, it will be shown most natural to associate $W(0)$ with $\alpha$ and $\omega$ with $\gamma$.

Whereas $W(0)$ controls the probability of rejecting the null model, the pay-out $\omega$ controls how the testing procedure performs once it has rejected the null model. The notion of compensation for rejecting a hypothesis captured in (8) allows one to build context-dependent information into the testing procedure. Suppose that the substantive context suggests that the first few hypotheses are most likely to be those that are rejected and that false hypotheses come in clusters. In this setting, one might consider using an alpha-investing rule like the following. Assume that the last rejected hypothesis is $H_{k^*}$. If false hypotheses are clustered, an alpha-investing rule should invest most of its wealth $W(k^*)$ available after rejecting $H_{k^*}$ in testing $H_{k^*+1}$. A rule that does this is

$$I_{W(0),\omega}(k) = \frac{6\,W(k^*)}{\pi^2}\,\frac{1}{(k - k^*)^2}, \qquad k = k^* + 1, \ldots, \min\{j : j > k^*, R_j = 1\}. \tag{9}$$

This rule invests $6/\pi^2 \approx 0.6$ of its wealth in testing $H_1$ or the null hypothesis $H_{k^*+1}$ that follows a rejected hypothesis. The $\alpha$-level falls off rapidly at the rate $1/k^2$ as more subsequent hypotheses are tested and not rejected. If the substantive insight is correct and the false null hypotheses are clustered, then tests of hypotheses like $H_1$ or $H_{k^*+1}$ represent "good investments." An example in Section 6 illustrates these ideas.
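The levels produced by rule (9) can be tabulated directly. The sketch below reads (9) as $6\,W(k^*)/\pi^2$ divided by the squared distance from the last rejection; `clustered_level` is a hypothetical name, and taking $k^* = 0$ before any rejection (so that the rule applies to $H_1$) is an assumption made for the example.

```python
import math


def clustered_level(k, k_star, wealth_at_k_star):
    """Level for testing H_k under rule (9), for k > k_star."""
    if k <= k_star:
        raise ValueError("k must exceed the index of the last rejection")
    return 6.0 * wealth_at_k_star / (math.pi ** 2 * (k - k_star) ** 2)


w = 0.05
levels = [clustered_level(k, 0, w) for k in range(1, 1001)]
print(levels[0] / w)    # share of the wealth placed on the first test
print(sum(levels) / w)  # share committed over the first 1000 tests
```

Because $\sum_{j \ge 1} 1/j^2 = \pi^2/6$, the levels add up to $W(k^*)$ in the limit, so the rule never schedules more in total level than the wealth it held after the last rejection.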

While it is relatively straightforward to devise investing rules, it may be difficult a priori to order the hypotheses in such a way that those most likely to be rejected come first. Such an ordering relies heavily on the structure of the specific testing situation. Another complication is the construction of tests that provide the p-values that determine the alpha-wealth of an investing rule according to (8). In order to show that a procedure controls EDC, we require a test of $H_j$ to have the property that

$$\forall\, \theta \in \Theta, \qquad E_\theta\bigl(V_j^\theta \mid R_{j-1}, R_{j-2}, \ldots, R_1\bigr) \le \alpha_j. \tag{10}$$

This condition amounts to requiring that, conditionally on having either accepted or rejected the prior $j - 1$ hypotheses, the test of $H_j$ is done at level no higher than the nominal choice $\alpha_j$. The tests need not be independent.

Remark. These procedures only require that the test of $H_j$ maintain the stated $\alpha$-level conditionally on the binary random variables $R_1, R_2, \ldots, R_{j-1}$. In particular, we note that the test is not conditioned on the test statistic (such as a z-score) or parameter estimate. Adaptive testing in a group sequential trial (e.g. Lehmacher and Wassmer, 1999) uses the information on the observed z-statistic at the first look. Tsiatis and Mehta (2003) show that using this information leads to a less powerful test compared to traditional group sequential tests that only look at acceptance at the first look.

5 Alpha-Investing Rules Control EDC

An important extension of EDC generalizes this criterion to an arbitrary number of hypotheses. This version of the criterion replaces the fixed count of hypotheses in the definition (5) of $\mathrm{EDC}_{\alpha,\gamma}(m)$ by an arbitrary stopping time.

Definition 2. The excess discovery count of a procedure for testing a stream of hypotheses $H_1, H_2, \ldots$ is

$$\mathrm{EDC}_{\alpha,\gamma} = \inf_{\theta \in \Theta}\, \inf_{M \in \mathcal{M}}\, E_\theta\bigl[S^\theta(M) - \gamma R(M)\bigr] + \alpha, \tag{11}$$

where $M \in \mathcal{M}$, the set of stopping times with finite expectation.
**

The condition on $M$ forces $S^\theta(M) \le R(M) \le M$ and so implies that both $E_\theta R(M)$ and $E_\theta S^\theta(M)$ are bounded. Because step-up testing halts after the last significant test (which is not a stopping time), this extension of EDC does not apply to such procedures. In what follows, we will concentrate then on step-down procedures.

We offer two observations on this generalized criterion. First, EDC drifts to $-\infty$ as the number of tests increases for any testing procedure that fixes the level of significance. To see that this is so, suppose a sequence of tests is made at level $\alpha$ (as in the naive procedure considered in the prior example). Under the null model, we expect $100\alpha\%$ of the hypotheses to be falsely rejected. Because all of the null hypotheses are true, $S^\theta(m) = 0$ and $\mathrm{EDC}_{\alpha,\gamma}(m) = \alpha - \gamma E_\theta R(m) = \alpha(1 - \gamma m) \to -\infty$ as $m \to \infty$. Hence $\mathrm{EDC}_{\alpha,\gamma} = -\infty$.
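A quick calculation shows how fast this drift sets in. The block below is worked arithmetic under the null model only, using the typical values $\alpha = 0.05$, $\gamma = 0.95$ and the $m = 200$ tests from the earlier simulation; the fact that $E_0 R(m) = \alpha m$ for a fixed-level procedure follows from each true null being rejected with probability $\alpha$.

```latex
\begin{align*}
E_0 R(m) &= \alpha m = 0.05 \times 200 = 10, \\
\mathrm{EDC}_{\alpha,\gamma}(m)
  &= \alpha - \gamma\,\alpha m
   = \alpha\,(1 - \gamma m) \\
  &= 0.05\,(1 - 0.95 \times 200)
   = 0.05 \times (-189)
   = -9.45 .
\end{align*}
```

Each additional test at the fixed level lowers the criterion by a further $\gamma\alpha = 0.0475$.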

Second, we observe that it is always possible to construct a test procedure for which $\mathrm{EDC}_{\alpha,\gamma} \ge 0$. The Bonferroni procedure offers a concrete example. Although the common application of the Bonferroni rule assigns equal $\alpha$-level to each test, this need not be the case. All that is necessary is that the sum of the $\alpha$-levels be less than $\alpha$. If one tests $H_j$ at level $\alpha_j$ and $\sum_j \alpha_j \le \alpha$, then $E_\theta V^\theta(m) \le \alpha$ for all $m$. Thus, $\mathrm{EDC}_{\alpha,\gamma}(m) \ge 0$ for all $\theta$ and $m$.
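One concrete way to spread a fixed budget over an unbounded stream is a geometric schedule. The sketch below is an illustration only: the halving schedule and the name `geometric_levels` are assumptions chosen for simplicity, and any sequence of levels with sum at most $\alpha$ would serve equally well.

```python
def geometric_levels(alpha, n):
    """Levels alpha / 2**j for j = 1..n; their sum is alpha * (1 - 2**-n)."""
    return [alpha / 2 ** j for j in range(1, n + 1)]


alpha = 0.05
levels = geometric_levels(alpha, 30)
total = sum(levels)
# By the union bound, the expected number of false rejections is at most
# the sum of the levels, which stays below alpha however long the stream runs.
print(total, total <= alpha)
```

The price of this guarantee is power: the levels shrink so quickly that hypotheses late in the stream are tested at vanishing levels, which is the limitation that earning wealth back through rejections is meant to relieve.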

The following theorem states that an alpha-investing rule $I_{W(0),\omega}$ with wealth determined by (8) controls EDC so long as the pay-out $\omega$ is not too large. The theorem follows by showing that a stochastic process related to the alpha-wealth sequence $W(0), W(1), \ldots$ is a sub-martingale. Because the proof of this result relies only on the optional stopping theorem for martingales, we do not require independent tests, though this is certainly the easiest context in which to show that the p-values are honest in the sense required for (10) to hold.

Theorem 1. An alpha-investing rule $I_{W(0),\omega}$ governed by (8) with initial alpha-wealth $W(0) \le \alpha$ and pay-out $\omega \le 1 - \gamma$ controls $\mathrm{EDC}_{\alpha,\gamma}$,

$$\mathrm{EDC}_{\alpha,\gamma} \ge 0. \tag{12}$$

A proof of the theorem is in the appendix.

6 Examples

The examples in this section illustrate alpha-investing rules and EDC. Our first two examples consider testing a large, but fixed, collection of $m$ hypotheses for which we observe independent p-values $p_1, p_2, \ldots, p_m$. The first describes an alpha-investing rule that mimics Simes-based step-down testing. The second shows how alpha-investing rules are able to leverage domain knowledge to form a more powerful multiple testing procedure. A third example describes alpha-investing when testing a stream of hypotheses using dependent test statistics.

6.1 Comparison to Step-Down Testing

We compare alpha-investing to the Simes-based step-down testing procedure described in Section 2. This procedure rejects $H_{(1)}, H_{(2)}, \ldots, H_{(j_d-1)}$, where $j_d = \min\{k : p_{(k)} > k\alpha/m\}$ identifies the first test that is not rejected. (Step-up testing does not provide a stopping time.) Assume that the step-down procedure controls $\mathrm{FDR}(m) \le \alpha$ and rejects a small number $k > 0$ of the $m$ hypotheses. It follows then that the p-values have the following structure:

$$p_{(1)} \le \alpha/m,\quad p_{(2)} \le 2\alpha/m,\quad \ldots,\quad p_{(k)} \le k\alpha/m, \quad \text{and}\quad p_{(k+1)} > (k+1)\alpha/m. \tag{13}$$

To reproduce this behavior with alpha-investing, consider the following approach. Set the initial alpha-wealth $W(0) = \alpha$ and $\omega = \alpha$. Define the alpha-investing rule to allocate its available alpha-wealth $W(j)$ equally over the hypotheses that have not been rejected, and begin by testing each hypothesis at the Bonferroni level $\alpha/m$. Because of the structure in the p-values (13), this first pass rejects at least one hypothesis, namely $H_{(1)}$. To keep the presentation simple, suppose that only one hypothesis has p-value less than $\alpha/m$. The procedure pays $-\log(1 - \alpha/m)$ for each test that does not reject, and earns $\alpha + \log(1 - p_{(1)})$ for rejecting $H_{(1)}$. Hence, after testing each hypothesis at level $\alpha/m$, its alpha-wealth is at least

$$W(m) = W(0) + \alpha + \log(1 - p_{(1)}) + (m - 1)\log(1 - \alpha/m) \ge 2\alpha + m \log(1 - \alpha/m) \ge \alpha - \alpha^2/m. \tag{14}$$

**After this rst pass through the hypotheses, its alpha-wealth is virtually un
hanged,
**

and it retains enough wealth to reje
t H(2) .
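As a numerical sanity check on this first-pass accounting, the following sketch adds up the wealth changes when exactly one of $m$ hypotheses is rejected at the Bonferroni level. It assumes $W(0) = \alpha$ and pay-out $\omega = \alpha$, a cost of $\log(1-\alpha/m)$ per non-rejection and a reward of $\alpha + \log(1-p_{(1)})$ for the rejection; the function name and the values $m = 200$, $\alpha = 0.05$ are illustrative choices.

```python
import math


def wealth_after_first_pass(m, alpha, p_min):
    """Alpha-wealth after testing m hypotheses at level alpha/m when exactly
    one hypothesis, with p-value p_min <= alpha/m, is rejected.

    Assumes W(0) = alpha and pay-out omega = alpha (illustrative sketch).
    """
    level = alpha / m
    if p_min > level:
        raise ValueError("p_min must not exceed the Bonferroni level")
    wealth = alpha                               # W(0)
    wealth += alpha + math.log(1.0 - p_min)      # reward for the one rejection
    wealth += (m - 1) * math.log(1.0 - level)    # cost of m - 1 non-rejections
    return wealth


m, alpha = 200, 0.05
worst_case = wealth_after_first_pass(m, alpha, p_min=alpha / m)
lower_bound = alpha - alpha**2 / m
print(worst_case, lower_bound)
```

Taking `p_min` equal to `alpha / m` gives the least favourable case; a smaller `p_min` leaves more wealth for the next pass.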

For the second pass through the remaining $m-1$ null hypotheses, the alpha-investing rule rejects any hypothesis for which $p_j \le 2\alpha/m$, as in the Simes procedure. Because these tests condition on $p_j > \alpha/m$, this round of testing requires that the alpha-investing rule test each of the remaining $m-1$ hypotheses at level

$$P_0\!\left(\frac{\alpha}{m} < p_j \le \frac{2\alpha}{m} \;\Big|\; p_j > \frac{\alpha}{m}\right) = \frac{\alpha}{m-\alpha}.$$

It possesses enough wealth for the second round to do this because, from (14) for $\alpha \le 1/2$,

$$W(m) \ge \alpha - \frac{\alpha^2}{m} \ge (m-1)\,\frac{\alpha}{m-\alpha}.$$
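The two quantities in this step are easy to check numerically. The sketch below assumes that the wealth requirement is read as "the sum of the $m-1$ second-pass levels, each $\alpha/(m-\alpha)$, does not exceed the lower bound $\alpha - \alpha^2/m$ from (14)"; that reading, and the function names, are assumptions made for illustration.

```python
def second_pass_level(m, alpha):
    """P_0(p <= 2*alpha/m | p > alpha/m) for a uniform p-value."""
    return (alpha / m) / (1.0 - alpha / m)   # algebraically alpha / (m - alpha)


def wealth_suffices(m, alpha):
    """Check alpha - alpha^2/m >= (m - 1) * alpha / (m - alpha)."""
    wealth_bound = alpha - alpha**2 / m
    needed = (m - 1) * second_pass_level(m, alpha)
    return wealth_bound >= needed


for m in (10, 200, 1000):
    print(m, second_pass_level(m, 0.05), wealth_suffices(m, 0.05))
```

Cross-multiplying shows the inequality is equivalent to $m(1-2\alpha) + \alpha^2 \ge 0$, which is why the condition $\alpha \le 1/2$ appears.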


As in the first round, this second pass again approximately conserves the alpha-wealth of the procedure. Thus, so long as $m$ is large and $k \ll m$ so that bounds similar to (14) hold, each pass through the hypotheses conserves enough alpha-wealth for the next round of tests. In this way, the investing rule gradually raises the threshold for rejecting a hypothesis as the number of rejected hypotheses increases.

The simulation summarized in the next section compares this alpha-investing rule to step-down testing. The alpha-investing rule generally does slightly better (rejects more false hypotheses) than step-down testing for two reasons. First, the lower bound (14) for the wealth $W(m)$, for example, assumes $p_{(1)} = \alpha/m$. In fact, we would expect $p_{(1)}$ to be closer to $\alpha/(2m)$, on average. Second, our description assumes that the p-values rejected by step-down testing are evenly distributed, with one between each threshold. Instead, it is likely that some passes of the investing rule will reject more than one hypothesis and thus have greater alpha-wealth for testing in the next round than suggested by these lower bounds.

6.2 Investing Rules that Leverage Domain Knowledge

The performance of an alpha-investing rule improves, in the sense of being more powerful, if the investigator "knows the science". If the investigator is able to order the hypotheses a priori so that those most likely to be rejected are tested first, then alpha-investing can reject considerably more hypotheses than step-down testing. The full benefit is only realized, however, when one exploits an aggressive investing rule. The prior investing rule assumes that the hypotheses are arranged in no particular order and spreads its alpha-wealth evenly over the remaining hypotheses.

Suppose that the test procedure rejects $H_k$ and is about to test $H_{k+1}$. Rather than spread its current alpha-wealth $W(k)$ evenly over the remaining hypotheses, a rule can invest more in testing the next hypothesis. For example, one can allocate $W(k)$ using a discrete probability mass function such as this version of the investing rule (9). If none of the remaining hypotheses are rejected, then the level for testing $H_j$ is

$$\alpha_j = \frac{W(k)}{h_{m-k,2}}\,\frac{1}{(j-k)^2}, \qquad j = k+1, \ldots, m, \tag{15}$$

where the normalizing constant $h_{q,2} = \sum_{i=1}^{q} 1/i^2$. If one of these tests rejects a hypothesis, the procedure reallocates its wealth so that all is spent by the time the procedure tests $H_m$. Mimicking the language of financial investing, we describe this type of alpha-investing rule as aggressive and the previous method as conservative.
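A minimal sketch of the aggressive allocation may help fix ideas. It computes the levels $\alpha_j = W(k) / \big(h_{m-k,2}\,(j-k)^2\big)$ for $j = k+1, \ldots, m$ with $h_{q,2} = \sum_{i=1}^{q} 1/i^2$; the function name `aggressive_levels` and the numbers in the example are hypothetical.

```python
def aggressive_levels(wealth, k, m):
    """Levels alpha_j, j = k+1..m, that spread `wealth` in proportion to
    1 / (j - k)^2, as in the aggressive rule (illustrative sketch).
    """
    q = m - k
    h = sum(1.0 / i**2 for i in range(1, q + 1))   # normalizing constant h_{q,2}
    return {j: wealth / (h * (j - k) ** 2) for j in range(k + 1, m + 1)}


# Example: a rejection has just occurred at k = 3 of m = 200 hypotheses.
levels = aggressive_levels(wealth=0.05, k=3, m=200)
print(levels[4], levels[5], sum(levels.values()))
```

The levels sum to the available wealth, and the test immediately after a rejection receives four times the level of the one that follows it, which is what makes the rule aggressive.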

The idea of using prior information is implicit in alpha-spending rules, but little FDR theory exists for such rules. Recently, Genovese and Wasserman (2004) use prior information on the hypotheses to obtain a weighted Benjamini-Hochberg (wBH) procedure. Following the ideas of the previous section, we can show that the wBH procedure satisfies EDC. The wBH procedure thus lies between these two procedures: it is not quite as aggressive as the rule of this section, but much more aggressive than the usual Simes method.

The simulation summarized in Figure 4 compares step-down testing to conservative and aggressive alpha-investing rules. For this simulation, we assume that the investigator tests the hypotheses in the order implied by $|\mu_j|$. The $m = 200$ hypotheses test means as defined in the simulation in Section 3 (see equation 6). We set the initial wealth $W(0) = 0.05$, $\alpha = 0.05$, $\gamma = 0.95$, and used step-down testing that controls $\mathrm{FDR}(200) < 0.05$. Figure 4 shows FDR and EDC. All three procedures control both FDR and EDC, as they should. FDR for step-down testing closely tracks the performance of the conservative alpha-investing rule. The particularly low FDR obtained by aggressive alpha-investing may appear surprising at first. The low error rate is another benefit of the side-information. Aggressive alpha-investing spends all of its alpha-wealth testing the initial hypotheses, which happen to be false, and runs out of wealth before encountering the hypotheses for which $\mu_j = 0$. This rule also has larger EDC.

Alpha-investing guarantees protection from too many false rejections, but how well does it find signal? Figure 5 compares the power of these alpha-investing rules to that of step-down testing. The plot shows the number of correct rejections $S^\theta(m)$ made by three different rules: aggressive alpha-investing that exploits domain knowledge using the rule (15), conservative alpha-investing (which assumes a random order), and step-down testing. The figure shows the average number of hypotheses rejected by each investing rule relative to the number rejected by step-down testing, on a percentage scale. For example, with a weak signal ($\pi_1 = 0.10$),

$$100 \times \frac{S^\theta(m;\ \text{aggressive investing})}{S^\theta(m;\ \text{step-down})} > 150\%.$$

In general, for weak signals, aggressive alpha-investing identifies about 30% more false hypotheses than step-down testing. The two become more similar as signal strength grows (in the form of more false null hypotheses). As discussed in the prior section, conservative alpha-investing rejects a few more hypotheses, about 5-10%, than Simes-based step-down testing.

Figure 4: Both alpha-investing rules (conservative and aggressive) control FDR (left) and EDC (right), as does step-down testing. Conservative alpha-investing assumes no domain knowledge, whereas aggressive alpha-investing uses domain knowledge, here the ordering of $\mu_i^2$. [Plots omitted: FDR (left panel) and EDC (right panel) versus $\pi_1$.]

6.3 Dependent Tests

The previous examples illustrate EDC and alpha-investing rules when testing a closed set of $m$ hypotheses using independent tests. For dependent tests, however, step-down testing does not guarantee control of FDR. In comparison, one can find alpha-investing rules that control EDC.

EDC itself makes no assumption of independence of the tests, but does require that the tests be conditionally correct in the sense of (10). When hypothesis tests are independent, it is simple to assure that each test indeed has level $\alpha_j$. One need only form each test as though only one hypothesis were being tested; the outcomes of the prior tests $R_1, R_2, \ldots, R_{j-1}$ do not affect its level. This condition is much more difficult to establish when the tests are dependent. Although EDC allows any sort of dependence, it may not be possible to construct tests that satisfy this condition without making assumptions on the form of the dependence.

Figure 5: Aggressive alpha-investing using (15) exploits domain knowledge to achieve higher power than Simes-based step-down testing. This plot shows the percentage of correctly rejected null hypotheses for each procedure, relative to step-down testing. Both alpha-investing rules have more power than step-down testing with the same size. [Plot omitted: % rejected relative to step-down versus $\pi_1$, for the aggressive, conservative, and step-down procedures.]

In some cases, however, known properties of multivariate distributions suggest a suitable test procedure. For example, suppose that the test statistics $Y = (Y_1, \ldots, Y_m)$ for $H(m)$ have a multivariate normal distribution with mean vector $\mu$ and covariance matrix $\Sigma$, $Y \sim N(\mu, \Sigma)$. In this case, Dykstra (1980) shows that

$$P(|Y_m| < c_m \mid |Y_1| \le c_1, \ldots, |Y_{m-1}| \le c_{m-1}) \ge P(|Y_m| \le c_m). \tag{16}$$

Thus, so long as no prior two-sided hypothesis has been rejected, an $\alpha$-level test of $H_m$ that ignores the prior outcomes, as though they were independent, has conditional level no larger than $\alpha$. The procedure is conservative. If, however, some prior test rejects a null hypothesis, these results no longer hold.
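A small Monte Carlo experiment illustrates the direction of inequality (16) in the simplest case of two standard normal statistics with correlation `rho`, both with mean zero. The setup (bivariate case, `rho = 0.8`, cutoff 1.96, fixed seed, and the function names) is an assumption chosen for illustration and is not taken from the paper.

```python
import math
import random


def conditional_acceptance(rho, c, n=100_000, seed=0):
    """Estimate P(|Y2| <= c given |Y1| <= c) for standard normals with
    correlation rho (illustrative simulation)."""
    rng = random.Random(seed)
    kept = accepted = 0
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = rho * y1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        if abs(y1) <= c:          # the prior two-sided test did not reject
            kept += 1
            if abs(y2) <= c:      # the current test does not reject either
                accepted += 1
    return accepted / kept


def marginal_acceptance(c):
    """Exact P(|Y| <= c) for a standard normal Y."""
    return math.erf(c / math.sqrt(2.0))


cond = conditional_acceptance(rho=0.8, c=1.96)
marg = marginal_acceptance(1.96)
print(f"conditional {cond:.4f} vs marginal {marg:.4f}")
```

The estimated conditional acceptance probability should exceed the marginal one, so a test that ignores the earlier non-rejection rejects less often than its nominal level.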

In this case, the simplest way to ensure the level of a test is to remove the effect of the rejected hypothesis. If $H_k$, say, has been rejected, then one can guarantee that (10) holds by constructing subsequent tests to be independent of $Y_k$ and any $Y_j$, $j < k$, which is correlated with $Y_k$. By removing the information from the rejected test, the acceptance region for subsequent two-sided tests is a symmetric convex set around the origin and inequalities such as (16) hold.

For example, consider a balanced two-way analysis of variance with $r$ row effects $\beta_{r,i}$ and $c$ column effects $\beta_{c,j}$ with $\sum_i \beta_{r,i} = \sum_j \beta_{c,j} = 0$. Write the vector of row effects as $\beta_r$ and the vector of column effects as $\beta_c$. For each cell of the design, we have $n$ independent normally distributed observations $Y_{ijk}$,

$$Y_{ijk} = \beta_0 + \beta_{r,i} + \beta_{c,j} + Z_{ijk}, \qquad Z_{ijk} \overset{iid}{\sim} N(0, \sigma^2), \quad k = 1, \ldots, n,$$

with known variance $\sigma^2$. Assume that the hypotheses to be tested have the form $H_j : \lambda_{r,j}'\beta_r = 0,\ \lambda_{c,j}'\beta_c = 0$. Standard results from linear models show that the usual tests of $H_j$ and $H_k$ are independent if $\lambda_{r,j}'\lambda_{r,k} = 0$ and $\lambda_{c,j}'\lambda_{c,k} = 0$. Suppose one begins with tests of the row effects ($\lambda_c = 0$). There are no constraints on the tests until rejecting a hypothesis, $H_k$ say. At this point, one can commence testing column effects, ignoring the prior results for the row effects because these are orthogonal. One can continue testing other hypotheses among the row effects so long as $\lambda_{r,j}$ is orthogonal to $\lambda_{r,k}$.

A similar procedure can be used in stepwise regression. Consider the familiar forward stepwise search, seeking predictors of the response $Y$ among $X_1, X_2, \ldots, X_m$ in a linear model

$$Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \cdots + \beta_m X_{m,i} + Z_i, \qquad Z_i \overset{iid}{\sim} N(0, \sigma^2).$$

Assume that all of the variables have mean zero and $\beta_0 = 0$. Under the normal linear model with known error variance, (16) implies that tests of $H_j : \beta_j = 0$ based on the familiar z-scores for the predictors $Z_j = (X_j'Y)/(X_j'X_j)$ satisfy (10) until some $H_k$ is rejected. For further tests, one can assure that (10) holds by sweeping $X_k$ and all predictors among $X_1, X_2, \ldots, X_{k-1}$ that are correlated with $X_k$ from the remaining predictors. In practice, most predictors are correlated with each other to some extent and this condition requires sweeping $X_1, X_2, \ldots, X_k$ from subsequent predictors. If we collect these $k$ predictors into an $n \times k$ matrix $X$, then the subsequent predictors would be $\tilde{X}_j = (I - X(X'X)^{-1}X')X_j$, $j = k+1, \ldots$. The resulting loss of variation in predictors suggests it would be prudent to at least partially "orthogonalize" the predictors prior to using this type of search.
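The sweeping step can be sketched without any matrix library. Instead of forming $(I - X(X'X)^{-1}X')X_j$ directly, the code below uses modified Gram-Schmidt, which yields the same residual when the swept columns are linearly independent; this substitution, the helper names, and the toy data are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def sweep(columns, x_new, tol=1e-12):
    """Residual of x_new after removing its projection on span(columns).

    Built with modified Gram-Schmidt rather than the explicit projection
    matrix (an illustrative substitute, not the paper's computation).
    """
    basis = []
    for col in columns:
        r = list(col)
        for b in basis:
            coef = dot(r, b) / dot(b, b)
            r = [ri - coef * bi for ri, bi in zip(r, b)]
        if dot(r, r) > tol:            # skip numerically dependent columns
            basis.append(r)
    resid = list(x_new)
    for b in basis:
        coef = dot(resid, b) / dot(b, b)
        resid = [ri - coef * bi for ri, bi in zip(resid, b)]
    return resid


# Toy mean-zero predictors: x1 and x2 were tested earlier, x3 comes next.
x1 = [1.0, -1.0, 1.0, -1.0]
x2 = [2.0, -1.0, 0.0, -1.0]
x3 = [2.0, 1.0, 0.0, -3.0]
x3_swept = sweep([x1, x2], x3)
print(x3_swept, dot(x3_swept, x1), dot(x3_swept, x2))
```

The swept predictor is orthogonal to every earlier predictor, and its squared length is smaller than that of the original column, which is the loss of variation mentioned above.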

7 Discussion

The combination of EDC with alpha-investing rules invites the use of adaptive strategies for testing multiple hypotheses. Rather than posit a fixed set of hypotheses in advance of analysis, one can offer a strategy for determining which hypotheses to test next after getting some preliminary results. We would expect good strategies to leverage domain knowledge and be specific to the particular method of analysis.

Part of our motivation for developing EDC and alpha-investing rules arose from our work using stepwise regression for data mining (Foster and Stine, 2004). In this application, we compared forward stepwise regression to tree-based classifiers for predicting the onset of personal bankruptcy. To make regression competitive, we expanded the stepwise search to include all possible interactions among more than 350 "base" predictors. This produced more than 67,000 possible predictors. Because so many of these predictors were interactions (more than 98%), it is not surprising that most of the predictors identified by the search were interactions. Furthermore, because of the wide scope of this search, the procedure lacked power to find subtle effects that, while small, improve the predictive power of a model. It became apparent to us that a hybrid search that only considered the interaction $X_j X_k$, say, after including both $X_j$ and $X_k$ as main effects might be very effective. At the time, however, we lacked a method for controlling the selection procedure when the scope of the search dynamically expands as in this situation. We expect to use alpha-investing heavily in this work in the future.

We speculate that the greatest reward from developing a specialized testing strategy will come from developing methods that select the next hypothesis rather than specific functions to determine how $\alpha$ is spent. The rule (15) invests most of the current wealth in testing hypotheses following a rejection. One can imagine quite a few other choices. Our work and that of others in information theory (Rissanen, 1983; Foster, Stine and Wyner, 2002), however, suggest that one can find universal alpha-investing rules. Given a procedure for ordering the hypotheses, a universal alpha-investing rule would lead to rejecting as many hypotheses as the best rule within some class. We would expect such a rule to spend its alpha-wealth a bit more slowly than the simple rule (15), but retain this general form.

Another area of application for alpha-investing is in group-sequential clinical trials. In other work (Foster and Stine, 2005) we address the concept of adaptive design with a modification for alpha-investing. We show that the complaints raised in Tsiatis and Mehta (2003) about the efficiency of such tests can be mitigated by proper alpha-investing. At the same time, we allow the researcher freedom to design rules that guide how to spend or invest their alpha-wealth.

Appendix: Proof of Theorem 1

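We prove Theorem 1 in this section. We begin by defining an empirical excess discovery count. Define the random variable

$$\mathrm{edc}_{\alpha,\gamma}(\theta; j) \equiv S^\theta(j) - \gamma R(j) + \alpha$$

so that

$$\mathrm{EDC}_{\alpha,\gamma} = \inf_{\theta \in \Theta}\ \inf_{M \in \mathcal{M}} E_\theta\big(\mathrm{edc}_{\alpha,\gamma}(\theta; M)\big).$$

Now define

$$A(j) \equiv \mathrm{edc}_{\alpha,\gamma}(\theta; j) - W(j).$$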

Our main lemma shows that $A(j)$ is a sub-martingale for alpha-investing rules with initial alpha-wealth $W(0) \le \alpha$ and pay-out $\omega \le 1-\gamma$. A sub-martingale is "increasing" in the sense that

$$E(A(j) \mid A(j-1), A(j-2), \ldots, A(1)) \ge A(j-1).$$

By definition $S^\theta(0) = R(0) = 0$ so that $\mathrm{edc}_{\alpha,\gamma}(\theta; 0) = \alpha$. So if $W(0) \le \alpha$, $\omega \le 1-\gamma$, and $A(j)$ is a sub-martingale, then the optional stopping theorem implies that for all finite stopping times $M$

$$E_\theta\big(\mathrm{edc}_{\alpha,\gamma}(\theta; M)\big) \ge E_\theta\big(\mathrm{edc}_{\alpha,\gamma}(\theta; M) - W(M)\big) \ge \alpha - W(0) \ge 0.$$

The first inequality follows because the alpha-wealth $W(j) \ge 0$ [a.s.], and the second inequality follows from the sub-martingale property. Since EDC for alpha-investing rules is the infimum over such expectations, all of which are non-negative, EDC itself is non-negative.

Thus to show Theorem 1 all we need is the following lemma:

Lemma 1 Let $V^\theta(m)$ and $R(m)$ denote the cumulative number of false rejections and the cumulative number of all rejections, respectively, when testing a sequence of null hypotheses $\{H_1, H_2, \ldots\}$ using an alpha-investing rule $I_{W(0),\omega}$ with initial alpha-wealth $W(0) \le \alpha$, pay-out $\omega \le 1-\gamma$, and cumulative alpha-wealth $W(m)$. Then the process

$$A(j) \equiv \mathrm{edc}_{\alpha,\gamma}(\theta; j) - W(j) = (1-\gamma)R(j) - V^\theta(j) + \alpha - W(j)$$

is a sub-martingale,

$$E(A(m) \mid A(m-1), \ldots, A(1)) \ge A(m-1). \tag{17}$$

Proof.

We begin with some notation for the increments that define the counts in Table 1. Write $V^\theta(m)$ and $R(m)$ as sums of indicators $V_j^\theta, R_j \in \{0, 1\}$,

$$V^\theta(m) = \sum_{j=1}^{m} V_j^\theta, \qquad R(m) = \sum_{j=1}^{m} R_j.$$


Similarly write the accumulated alpha-wealth $W(m)$ and $A(m)$ as sums of increments, $W(m) = \sum_{j=0}^{m} W_j$ and $A(m) = \sum_{j=0}^{m} A_j$. Let $\alpha_j$ denote the alpha level of the test of $H_j$ that satisfies the condition (10). The change in the alpha-wealth from testing $H_j$ can be written as:

$$W_j = R_j\,\omega + \log(1 - (p_j \wedge \alpha_j)),$$

where $\wedge$ is the minimum operator. Substituting this into $A_j$ we get

$$A_j = (1 - \gamma - \omega) R_j - V_j^\theta - \log(1 - (p_j \wedge \alpha_j)).$$

Since $R_j \ge 0$ and $1 - \gamma - \omega \ge 0$ by the conditions of the lemma, it follows that

$$A_j \ge -V_j^\theta - \log(1 - (p_j \wedge \alpha_j)). \tag{18}$$

If $\theta_j \notin H_j$, then $V_j^\theta = 0$ and $A_j \ge 0$ almost surely. So we only need to consider the case in which the null hypothesis $H_j$ is true. Abbreviate the conditional expectation

$$E_{j-1}(X) = E(X \mid A(1), A(2), \ldots, A(j-1)).$$

Then, when $H_j$ is true, $p_j \sim U[0,1]$ so that

$$
\begin{aligned}
E_{j-1}\big(-\log(1 - (p_j \wedge \alpha_j))\big)
  &= -\int_0^1 \log(1 - (p \wedge \alpha_j))\,dp \\
  &= -\int_0^{\alpha_j} \log(1-p)\,dp - \int_{\alpha_j}^{1} \log(1-\alpha_j)\,dp \\
  &= \alpha_j.
\end{aligned}
$$

Since $E_{j-1}(V_j^\theta) \le \alpha_j$ by the definition of this being an $\alpha_j$ level test, equation (18) implies $E_{j-1} A_j \ge 0$.
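The key integral in the proof, $E\big(-\log(1 - (p \wedge \alpha_j))\big) = \alpha_j$ for a uniform p-value, can be confirmed numerically. The sketch below uses a plain midpoint rule; the function name `expected_cost` and the grid size are illustrative choices, not part of the paper.

```python
import math


def expected_cost(alpha_j, n=100_000):
    """Midpoint-rule estimate of E[-log(1 - min(p, alpha_j))] for p ~ U[0, 1]."""
    total = 0.0
    for i in range(n):
        p = (i + 0.5) / n                       # midpoint of the i-th cell
        total += -math.log(1.0 - min(p, alpha_j))
    return total / n


for a in (0.01, 0.05, 0.25):
    print(a, expected_cost(a))
```

In words, the expected wealth spent on a test of a true null hypothesis equals its level $\alpha_j$, which offsets the bound $E_{j-1}(V_j^\theta) \le \alpha_j$ on the expected number of false rejections.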

References

Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statist. Soc., Ser. B, 57, 289-300.

Benjamini, Y. and Yekutieli, D. (2001) The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, 29, 1165-1188.

Braun, H. I. (ed.) (1994) The Collected Works of John W. Tukey: Multiple Comparisons, vol. VIII. New York: Chapman & Hall.

Dudoit, S., Shaffer, J. P. and Boldrick, J. C. (2003) Multiple hypothesis testing in microarray experiments. Statistical Science, 18, 71-103.

Dykstra, R. L. (1980) Product inequalities involving the multivariate normal distribution. Journal of the Amer. Statist. Assoc., 75, 646-650.

Efron, B. (2005a) Large scale simultaneous hypothesis testing: the choice of a null hypothesis. Journal of the Amer. Statist. Assoc., 100, 96-104.

Efron, B. (2005b) Selection and estimation for large-scale simultaneous inference. Tech. rep., Department of Statistics, Stanford University, http://wwwstat.stanford.edu/brad/papers/hivdata.

Foster, D. P. and Stine, R. A. (2004) Variable selection in data mining: Building a predictive model for bankruptcy. Journal of the Amer. Statist. Assoc., 99, 303-313.

Foster, D. P. and Stine, R. A. (2005) Theoretical foundations for adaptive testing using alpha-investing rules. Tech. rep., Statistics Department, University of Pennsylvania.

Foster, D. P., Stine, R. A. and Wyner, A. J. (2002) Universal codes for finite sequences of integers drawn from a monotone distribution. IEEE Trans. on Info. Theory, 48, 1713-1720.

Genovese, Christopher, K. R. and Wasserman, L. (2004) False discovery control with p-value weighting. In progress.

Gupta, M. and Ibrahim, J. G. (2005) Towards a complete picture of gene regulation: using Bayesian approaches to integrate genomic sequence and expression data. Tech. rep., University of North Carolina, Chapel Hill, NC.

Holm, S. (1979) A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65-70.

Lehmacher, W. and Wassmer, G. (1999) Adaptive sample size calculations in group sequential trials. Biometrics, 55, 1286-90.

Marcus, R., Peritz, E. and Gabriel, K. R. (1976) On closed testing procedures with special reference to ordered analysis of variance. Biometrika, 63, 655-660.

Meinshausen, N. and Buehlmann, P. (2004) Lower bounds for the number of false null hypotheses for multiple testing of associations under general dependence. Tech. Rep. 121, ETH Zurich, http://stat.ethz.ch/ nicolai/.

Meinshausen, N. and Rice, J. (2004) Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses. To appear, Annals of Statistics.

Rissanen, J. (1983) A universal prior for integers and estimation by minimum description length. Annals of Statistics, 11, 416-431.

Sarkar, S. K. (1998) Some probability inequalities for ordered MTP2 random variables: A proof of the Simes conjecture. Annals of Statistics, 26, 494-504.

Simes, R. J. (1986) An improved Bonferroni procedure for multiple tests of significance. Biometrika, 73, 751-754.

Storey, J. D. (2002) A direct approach to false discovery rates. Journal of the Royal Statist. Soc., Ser. B, 64, 479-498.

Storey, J. D. (2003) The positive false discovery rate: a Bayesian interpretation and the q-value. Annals of Statistics, 31, 2013-2035.

Troendle, J. F. (1996) A permutation step-up method of testing multiple outcomes. Biometrics, 52, 846-859.

Tsiatis, A. A. and Mehta, C. (2003) On the inefficiency of the adaptive design for monitoring clinical trials. Biometrika, 90, 367-378.

Tukey, J. W. (1953) The problem of multiple comparisons. Unpublished lecture notes.

Tukey, J. W. (1991) The philosophy of multiple comparisons. Statistical Science, 6, 100-116.
