
Stat Comput (2011) 21: 45–54

DOI 10.1007/s11222-009-9145-8

A permutation test for umbrella alternatives


Dario Basso · Luigi Salmaso

Received: 21 July 2008 / Accepted: 14 July 2009 / Published online: 27 August 2009
© Springer Science+Business Media, LLC 2009

Abstract There is a wide variety of stochastic ordering problems where K groups (typically ordered with respect to time) are observed along with a (continuous) response. The interest of the study may be on finding the change-point group, i.e. the group where an inversion of trend of the variable under study is observed. A change point is not merely a maximum (or a minimum) of the time-series function, but a further requirement is that the trend of the time-series is monotonically increasing before that point, and monotonically decreasing afterwards. A suitable solution can be provided within a conditional approach, i.e. by considering some suitable nonparametric combination of dependent tests for simple stochastic ordering problems. The proposed procedure is very flexible and can be extended to trend and/or repeated measure problems. Some comparisons through simulations and examples with the well known Mack & Wolfe test for umbrella alternative and with Page’s test for trend problems with correlated data are investigated.

Keywords Nonparametric combination · Mack and Wolfe’s test · Page’s test · Repeated measures · Trend analysis

D. Basso · L. Salmaso (✉)
Department of Management and Engineering, University of Padova, Str. lla S. Nicola 14, 36100 Vicenza, Italy
e-mail: salmaso@gest.unipd.it

D. Basso
e-mail: dario@stat.unipd.it

1 Introduction

In one-way ANOVA experiments, it is common that the response variable increases with an increase in the treatment level up to a point, then decreases with further increase in the treatment level. In the literature, this up-then-down pattern has been identified as umbrella ordering (see, e.g., Mack and Wolfe 1981). Umbrella orderings can be observed with many physical and biological phenomena in a wide variety of scientific research areas.

There has been considerable previous work on procedures designed to test homogeneity against umbrella ordering alternatives. Such testing procedures are generally based on ranks. An introductory review on such tests can be found in Wolfe (2006) and in Millen and Wolfe (2005). Mack and Wolfe (1981) proposed a test statistic based on the Jonckheere-Terpstra statistic.

Hettmansperger and Norton (1987) pointed out that no comparisons are made in the Mack-Wolfe test between samples preceding the known peak and those following it. Pan (1996) proposed to retrieve the information across the peak using a test statistic which is the maximum of the Jonckheere-Terpstra statistics. However, within this kind of alternatives, tests for umbrella alternatives with an unknown peak are much more practical. The most common approach to construct test statistics for this setting is to take the maximum of the test statistics for umbrella alternatives with known peaks (Mack and Wolfe 1981; Hettmansperger and Norton 1987; Chen and Wolfe 1990; Shi 1988; Hartlaub and Wolfe 1999). Magel and Qin (2003) proposed a test that extends the Chen and Wolfe (1990) test for umbrella alternatives with an unknown peak to use with ranked-set sample data which is essentially based on the procedure proposed by Hartlaub and Wolfe (1999). The proposed test, however, is not the best in situations where the first location shift or the last location shift is much higher than the others. If the first or last location shift is expected to be much higher than the remaining shifts, and the remaining location shifts are expected to be approximately equal, both

the Chen-Wolfe and Mack-Wolfe tests are recommended for use. It has also been shown (Kössler 2006) that generally the Hettmansperger-Norton-type test performs best, closely followed by the Chen-Wolfe-type test and the Shi-type test. Recently, Pan (2008) proposed a nonparametric distribution-free confidence procedure for umbrella orderings by constructing a random confidence subset of the ordered treatments such that it contains all the unknown peaks (optimal treatments) of an umbrella ordering with any pre-specified confidence level. In any case, it is well recognized in the literature that the Mack and Wolfe and the Chen and Wolfe type tests are the milestones for umbrella alternative problems.

There is very little literature concerning nonparametric permutation proposals for umbrella alternatives, apart from some hints given in Manly (1997) and the recent paper by Neuhäuser et al. (2003), where a modified Jonckheere-Terpstra test is presented in a suitable permutation version in order to obtain reliable results with small, sparse, unbalanced, and tied data.

In this paper, we introduce a permutation test for umbrella alternatives. Permutation tests do not require assumptions on the distribution of the data. Moreover, the distribution of the test statistic is exact, whereas the majority of existing tests for umbrella alternatives are exact only asymptotically. Therefore, permutation tests can be applied at any α-value, whereas the existing competitors require tabulated critical values for some α-levels that have been chosen by the related authors. The procedure we are introducing works even with very small sample sizes (say 2 replicates for each treatment) and/or in unbalanced cases. Thus, we recommend this procedure when small sample sizes are available or when data cannot be assumed to follow a specified distribution. Moreover, our method is very flexible, as it can be applied in a wide variety of situations (see Sects. 6 and 7).

The proposed procedure makes use of the nonparametric combination, introduced by Pesarin (2001). This methodology has the advantage of allowing the decomposition of complex hypotheses (such as the umbrella alternative) into simple “partial” hypotheses. Each partial hypothesis is then tested by a suitable “partial test statistic”, and the information related to the partial tests is then combined through the nonparametric combination, leading to a global test statistic for the complex problem.

The context is that of a one-way ANOVA experiment, where the levels of the experimental factor (time, increasing doses of drug) determine the treatments which identify the K groups.

Let $Y_{ik}$ be the observed response variable on the $i$th subject from group $k = 1, \dots, K$. We assume $Y_{ik}$ to follow the additive model:

\[ Y_{ik} = \mu + \delta_k + \varepsilon_{ik}, \qquad i = 1, \dots, n_k, \tag{1} \]

where $\mu$ is the population mean, $\delta_k$ is the treatment effect on the $k$th group (which may also be stochastic), $\varepsilon_{ik}$ are exchangeable errors with zero mean and finite variance $\sigma^2$ (typically i.i.d. random variables, independent of the $\delta_k$’s), and $n_k$ are fixed sample sizes. Let $F_k(y)$ be the cumulative distribution function of the response variable in group $k$. Then we wish to assess the null hypothesis of no treatment effect:

\[ H_0 : F_1(y) = F_2(y) = \cdots = F_K(y), \qquad \forall y \in \mathbb{R}, \]

against the umbrella alternative hypothesis:

\[ H_1 : F_1(y) \ge \cdots \ge F_{k-1}(y) \ge F_k(y) \le F_{k+1}(y) \le \cdots \le F_K(y), \]

for some $k \in \{1, \dots, K\}$, and with at least one strict inequality in a set of points of positive probability. That is, the interest of the study is on finding the change-point group $k$ (if it exists), i.e. the group where an inversion of trend of the variable under study is observed. A change point is not merely a maximum of the time-series function, but a further requirement is that the trend of the time-series is monotonically nondecreasing before group $k$ and monotonically nonincreasing afterwards. Thus there are two main aspects to consider: (i) is there any umbrella behaviour due to the experimental factor? (ii) If so, which one is the change-point group? A parametric solution to this problem is very difficult, especially when $K > 2$. These hypotheses define a problem of isotonic inference (see Hirotsu 1998).

A nonparametric rank solution of this kind of problem was given by Mack and Wolfe (1981): they proposed a test statistic for umbrella alternatives which is a weighted linear combination of standardized Mann-Whitney statistics and provided its null distribution in a wide variety of settings.

The proposal of this work is an approach conditional on the observed data. If we knew the peak group, say the $\hat{k}$th one, the problem of umbrella alternatives could be simplified into an intersection of alternative hypotheses:

\[ H_1 = H^{\nearrow}_{1\hat{k}} \cap H^{\searrow}_{1\hat{k}}, \]

where:

\[ H^{\nearrow}_{1\hat{k}} = F_1(y) \ge \cdots \ge F_{\hat{k}-1}(y) \ge F_{\hat{k}}(y), \quad \text{and} \]

\[ H^{\searrow}_{1\hat{k}} = F_{\hat{k}}(y) \le F_{\hat{k}+1}(y) \le \cdots \le F_K(y). \]

That is, if the peak group was known, the umbrella alternative could be written as the intersection of two simple stochastic ordering alternatives (an increasing one and a decreasing one). In order to introduce the permutation test for the umbrella alternative, we first need to introduce some suitable permutation tests for ordinary stochastic ordering problems (Sect. 2). This will require the nonparametric combination (NPC) methodology, which is a very useful tool when one needs to combine different pieces of information or aspects of the same problem. The NPC methodology is introduced in Sect. 3 and

it will be useful for defining the permutation test for simple stochastic ordering problems. Section 4 of this paper is dedicated to our proposal for umbrella problems. In Sect. 5 the proposed test is evaluated through a simulation study and compared with that of Mack and Wolfe. In Sect. 6 we show how the proposed procedure can also be applied to repeated measure and/or trend problems. Finally, in Sect. 7 a comparison with Page’s test for trend analysis with correlated data is investigated, and an application of the permutation test is discussed.

2 Simple stochastic ordering alternatives

Under the assumption of model (1), let us consider the simple stochastic ordering problem for the first $\hat{k}$ samples, to assess the null hypothesis $F_1(y) = F_2(y) = \cdots = F_{\hat{k}}(y)$ against the alternative hypothesis $F_1(y) \ge \cdots \ge F_{\hat{k}-1}(y) \ge F_{\hat{k}}(y)$, $y \in \mathbb{R}$. Note that under the null hypothesis the elements of the response are exchangeable (this fact will enable us to provide the null distribution of a proper test statistic). If $\hat{k} = 2$, then the stochastic ordering problem reduces to a two-sample problem with restricted alternative. If $\hat{k} > 2$, then let us consider the whole data set as split into two pooled pseudo-groups, where the first is obtained by pooling together data of the first $j$ groups (ordered with respect to the treatment levels), and the second by pooling together the remaining observations. In order to better understand the reason why we pool together the ordered groups, suppose $\hat{k} = 3$ and let us consider the following theorem:

Theorem 1 Let $X_1, X_2, X_3$ be mutually independent random variables which admit cumulative distribution functions $F_j(t)$, $t \in \mathbb{R}$, $j = 1, 2, 3$. Then, if $X_1 \overset{d}{\le} X_2 \overset{d}{\le} X_3$, we have:
(i) $X_1 \overset{d}{\le} X_2 \oplus X_3$ and
(ii) $X_1 \oplus X_2 \overset{d}{\le} X_3$,
where $W \oplus V$ indicates a mixture of random variables $W$ and $V$, i.e. $F_{W \oplus V}(t) = \omega_W F_W(t) + \omega_V F_V(t)$, $t \in \mathbb{R}$, $\omega_W, \omega_V \in [0, 1]$, $\omega_W + \omega_V = 1$.

Proof By definition, $X_1 \overset{d}{\le} X_2 \overset{d}{\le} X_3$ is equivalent to $F_1(t) \ge F_2(t) \ge F_3(t)$, $\forall t \in \mathbb{R}$. The random variable $X_1 \oplus X_2$ has cumulative distribution function equal to:

\[ F_{X_1 \oplus X_2}(t) = \omega_1 F_1(t) + \omega_2 F_2(t), \]

with $\omega_1, \omega_2 \in [0, 1]$, $\omega_1 + \omega_2 = 1$. Therefore, by hypothesis:

\[ F_{X_1 \oplus X_2}(t) = \omega_1 F_1(t) + \omega_2 F_2(t) \ge \omega_1 F_2(t) + \omega_2 F_2(t) = F_2(t), \]

so $X_1 \oplus X_2 \overset{d}{\le} X_2$ and, since $X_2 \overset{d}{\le} X_3$, we have proved (ii). In the same way, let $F_{X_2 \oplus X_3}(t) = \omega_2 F_{X_2}(t) + \omega_3 F_{X_3}(t)$ with $\omega_2, \omega_3 \in [0, 1]$, $\omega_2 + \omega_3 = 1$, then:

\[ F_{X_2 \oplus X_3}(t) = \omega_2 F_2(t) + \omega_3 F_3(t) \le \omega_2 F_2(t) + \omega_3 F_2(t) = F_2(t), \]

therefore $X_2 \oplus X_3 \overset{d}{\ge} X_2$ and, since $X_1 \overset{d}{\le} X_2$, this proves (i). □

Now, conditionally on the observed data, consider the pooled vector of observations $\mathbf{y}_1 \uplus \mathbf{y}_2 = [\mathbf{y}_1, \mathbf{y}_2]'$, where $\mathbf{y}_j$ is a vector of $n_j$ observations from $F_{Y_j}(y)$, $j = 1, 2$, and the symbol $\uplus$ denotes the pooling of two vectors. Then the random variable $Y_1 \oplus Y_2$ describing the generic observation of $\mathbf{y}_1 \uplus \mathbf{y}_2$ has (empirical) cumulative distribution function equal to:

\[
\begin{aligned}
\hat{F}_{Y_1 \oplus Y_2}(y) &= \frac{1}{n_1 + n_2} \sum_{j=1}^{2} \sum_{i=1}^{n_j} I(y_{ij} \le y) \\
&= \frac{n_1}{n_1 + n_2} \, \frac{\sum_{i=1}^{n_1} I(y_{i1} \le y)}{n_1} + \frac{n_2}{n_1 + n_2} \, \frac{\sum_{i=1}^{n_2} I(y_{i2} \le y)}{n_2} \\
&= \omega_1 \hat{F}_{Y_1}(y) + \omega_2 \hat{F}_{Y_2}(y),
\end{aligned}
\]

where $I(\cdot)$ is the indicator function. Therefore, conditionally, $Y_1 \oplus Y_2$ has a mixture distribution.

By extending this result to the $\hat{k}$ groups and by applying Theorem 1, we have that if $Y_1 \overset{d}{\le} Y_2 \overset{d}{\le} \cdots \overset{d}{\le} Y_{\hat{k}}$ holds, then:

\[ Y_{1 \oplus 2 \oplus \cdots \oplus j} \overset{d}{\le} Y_{j+1 \oplus j+2 \oplus \cdots \oplus \hat{k}} \qquad \forall j \in \{1, \dots, \hat{k} - 1\}. \]

In general, let $\mathbf{z}_{1(j)} = \mathbf{y}_1 \uplus \mathbf{y}_2 \uplus \cdots \uplus \mathbf{y}_j$ be the first (ordered) pseudo-group and let $\mathbf{z}_{2(j)} = \mathbf{y}_{j+1} \uplus \cdots \uplus \mathbf{y}_{\hat{k}}$ be the second (ordered) pseudo-group, $j = 1, \dots, \hat{k} - 1$. Let $Z_{1(j)}$ and $Z_{2(j)}$ be the random variables describing the generic observation of the pooled vectors $\mathbf{z}_{1(j)}$ and $\mathbf{z}_{2(j)}$, respectively. In the null hypothesis, data of every pair of pseudo-groups are exchangeable because the related variables satisfy the relationships $Z_{1(j)} \overset{d}{=} Z_{2(j)}$, $j = 1, \dots, \hat{k} - 1$. In the alternative, by Theorem 1, we have $Z_{1(j)} \overset{d}{\le} Z_{2(j)}$, which corresponds to the monotonic stochastic ordering (dominance) between any pair of pseudo-groups (i.e. for $j = 1, \dots, \hat{k} - 1$). This suggests that we express the hypotheses in the equivalent form:

\[ H_0 : \Bigl\{ \bigcap_{j=1}^{\hat{k}-1} \bigl( Z_{1(j)} \overset{d}{=} Z_{2(j)} \bigr) \Bigr\} \]

against

\[ H^{\nearrow}_{1\hat{k}} : \Bigl\{ \bigcup_{j=1}^{\hat{k}-1} \bigl( Z_{1(j)} \overset{d}{\le} Z_{2(j)} \bigr) \Bigr\}, \]

where a breakdown into a set of sub-hypotheses (or partial hypotheses) is emphasized.

Let us pay attention to the $j$th sub-hypothesis $H_{0(j)} : \{ Z_{1(j)} \overset{d}{=} Z_{2(j)} \}$ against $H^{\nearrow}_{1(j)} : \{ Z_{1(j)} \overset{d}{\le} Z_{2(j)} \}$. Note that the related sub-problem corresponds to a two-sample comparison for restricted alternatives, a problem which has an exact and unbiased permutation solution (for further details see Pesarin 2001). This solution is based on the test statistics (among others):

\[ T_j^{\nearrow} = \frac{\bar{Z}_{2(j)} - \bar{Z}_{1(j)}}{\sqrt{\frac{1}{n_{1(j)}} + \frac{1}{n_{2(j)}}}}, \qquad j = 1, \dots, \hat{k} - 1, \tag{2} \]

where $\bar{Z}_{2(j)}$ and $\bar{Z}_{1(j)}$ are the sample means of the second and the first pseudo-groups, respectively, and $n_{1(j)}$ and $n_{2(j)}$ are the lengths of $\mathbf{z}_{1(j)}$ and $\mathbf{z}_{2(j)}$. Note that the conditional variance of $\bar{Z}^{*}_{2(j)} - \bar{Z}^{*}_{1(j)}$ under $H_0$ is proportional to $n_{1(j)}^{-1} + n_{2(j)}^{-1}$.

Large values of the test statistics $T_j^{\nearrow}$ are significant against $H_{0(j)} : Z_{1(j)} \overset{d}{=} Z_{2(j)}$ in favor of the alternatives $H^{\nearrow}_{1(j)} : Z_{1(j)} \overset{d}{\le} Z_{2(j)}$. We can obtain a permutation test for each sub-problem $H_{0(j)}$ vs. $H^{\nearrow}_{1(j)}$ by the following algorithm:

– Let $\mathbf{y} = [\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_{\hat{k}}]'$ be the vector of the observed data in the $\hat{k}$ groups.
– For $j = 1, \dots, \hat{k} - 1$, repeat:
  1. Let $\mathbf{z}_{1(j)} = [\mathbf{y}_1, \dots, \mathbf{y}_j]'$ and $\mathbf{z}_{2(j)} = [\mathbf{y}_{j+1}, \dots, \mathbf{y}_{\hat{k}}]'$;
  2. Compute the observed values of the partial test statistics for the sub-problem $H_{0(j)}$ vs. $H^{\nearrow}_{1(j)}$ by computing:

\[ T_j^{\nearrow} = \frac{\bar{z}_{2(j)} - \bar{z}_{1(j)}}{\sqrt{\frac{1}{n_{1(j)}} + \frac{1}{n_{2(j)}}}}; \tag{3} \]

– Consider a large number $B$ of independent random permutations of the response $\mathbf{y}$, and let $\mathbf{y}^{*}_{b}$ be a random permutation of $\mathbf{y}$. At each step $b = 1, \dots, B$, repeat:
  1. Let $\mathbf{z}^{*}_{1(j)}$ be the vector with the first $n_{1(j)} = \sum_{\ell=1}^{j} n_{\ell}$ observations and $\mathbf{z}^{*}_{2(j)}$ be the vector of the last $n_{2(j)} = \sum_{\ell=j+1}^{\hat{k}} n_{\ell}$ observations of $\mathbf{y}^{*}_{b}$;
  2. Obtain the permutation null distribution of the test statistic by computing:

\[ {}^{b}T_j^{\nearrow *} = \frac{\bar{z}^{*}_{2(j)} - \bar{z}^{*}_{1(j)}}{\sqrt{\frac{1}{n_{1(j)}} + \frac{1}{n_{2(j)}}}}, \]

  where $\bar{z}^{*}_{2(j)}$ and $\bar{z}^{*}_{1(j)}$ are the means of $\mathbf{z}^{*}_{2(j)}$ and $\mathbf{z}^{*}_{1(j)}$, respectively, and $\hat{\sigma}^{*2}_{j}$ is the pooled estimate of the error variance. Thus, the set $\{ {}^{b}T_j^{\nearrow *}, b = 1, \dots, B \}$ is a random sample from the null permutation distribution of the test statistic $T_j^{\nearrow}$.
– Obtain the p-value of each sub-problem (partial p-value) by computing:

\[ p_j^{\nearrow} = \frac{\#[\, {}^{b}T_j^{\nearrow *} \ge T_j^{\nearrow} \,]}{B}. \]

The previous algorithm provides $\hat{k} - 1$ p-values related to the sub-hypothesis system $H_{0(j)}$ against $H^{\nearrow}_{1(j)}$. In order to combine the partial information into a global test we require the NPC methodology, which is introduced in Sect. 3. Obviously, if the alternative hypothesis is:

\[ H^{\searrow}_{1\hat{k}} : F_{\hat{k}}(y) \le F_{\hat{k}+1}(y) \le \cdots \le F_K(y), \]

the previous algorithm still applies by replacing the test statistic (2) with:

\[ T_j^{\searrow} = \frac{\bar{W}_{1(j)} - \bar{W}_{2(j)}}{\sqrt{\frac{1}{n_{1(j)}} + \frac{1}{n_{2(j)}}}}, \qquad j = \hat{k}, \dots, K - 1, \]

where $\bar{W}_{1(j)}$ is the mean of the pooled vector $\mathbf{w}_{1(j)} = [\mathbf{y}_{\hat{k}}, \mathbf{y}_{\hat{k}+1}, \dots, \mathbf{y}_j]$, and $\bar{W}_{2(j)}$ is the mean of the pooled vector $\mathbf{w}_{2(j)} = [\mathbf{y}_{j+1}, \dots, \mathbf{y}_K]$.

3 The nonparametric combination

There are some problems where the complexity requires a further approach. Consider, for instance, a multivariate problem where $q$ (possibly dependent) variables are considered, or a multi-aspect problem (such as the one of the previous section, where $\hat{k} - 1$ informative tests are available).

The difficulties arise because of the underlying dependence structure among variables (or aspects), which is generally unknown or too difficult to model and/or manipulate. Moreover, a global answer involving several dependent variables (aspects) is often required, so the question is how to combine the information related to the $q$ variables (aspects) into a unique “global” test.

To this end, consider the partial tests to assess $H_{0(j)}$ against $H^{\nearrow}_{1(j)}$ of the previous section. Define the alternative hypothesis for (increasing) stochastic ordering as:

\[ H^{\nearrow}_{1\hat{k}} : \Bigl\{ \bigcup_{j=1}^{\hat{k}-1} \bigl( Z_{1(j)} \overset{d}{\le} Z_{2(j)} \bigr) \Bigr\}. \]

The formulation of the alternative hypothesis suggests that the global null hypothesis should be rejected whenever one of the partial alternative

hypotheses $H^{\nearrow}_{1(j)}$, $j = 1, \dots, \hat{k} - 1$, is true. Therefore, let us consider a function:

\[ \psi(\lambda_1, \dots, \lambda_j, \dots, \lambda_q) : \mathbb{R}^q \to \mathbb{R}, \]

where $\lambda_j$ is a partial statistic for the $j$th partial problem (e.g. $\lambda_j$ is the p-value of the $j$th partial test). Then we define $\psi$ a combining function if it satisfies the following requirements:

1. $\psi$ must be continuous in all its $q$ arguments;
2. $\psi$ must be non-decreasing in its arguments. By this we mean that
   \[ \psi(\lambda_1, \dots, \lambda_j, \dots, \lambda_q) \ge \psi(\lambda_1, \dots, \lambda'_j, \dots, \lambda_q) \]
   if $\lambda_j$ is more significant against $H_{0(j)}$ than $\lambda'_j$.
3. $\psi$ must reach its supremum $\bar{\psi}$ (possibly not finite) when one of its arguments tends to the rejection of the related partial null hypothesis for whatever $\alpha > 0$. That is:
   \[ \psi(\lambda_1, \dots, \lambda_j, \dots, \lambda_q) \to \bar{\psi} \]
   if $\lambda_j$ is “extremely” significant against $H_{0(j)}$. The meaning of the word “extremely” will be clearer in what follows.

The $\lambda$’s in the definition of the combining function could be either test statistics or p-values. For instance, if the $\lambda$’s are test statistics which are significant for large values (as in the previous section), some suitable combining functions are the following:

– The direct combining function: $\psi = \sum_{j=1}^{q} T_j^{\nearrow}$
– The maxT combining function: $\psi = \max_j T_j^{\nearrow}$

Instead, if the combining function is based on the partial p-values (i.e. $\lambda_j = p_j^{\nearrow}$), the following combining functions can be appropriate:

– Fisher’s combining function: $\psi = -2 \sum_{j=1}^{q} \log(p_j^{\nearrow})$, $0 \le \psi < +\infty$;
– Tippett’s combining function: $\psi = 1 - \min_j p_j^{\nearrow}$, $0 \le \psi \le 1$;
– Liptak’s combining function: $\psi = \sum_{j=1}^{q} \Phi^{-1}(1 - p_j^{\nearrow})$, where $\Phi$ is the standard normal cumulative distribution function, $-\infty < \psi < +\infty$.

Note that the above combining functions all satisfy properties 1 to 3: let us consider, for instance, the direct combining function applied to the partial test statistics $T_j^{\nearrow}$ of the previous section. In this case, a global test statistic for the stochastic ordering problem is defined as:

\[ \psi^{\nearrow}_{\hat{k}} = \sum_{j=1}^{\hat{k}-1} T_j^{\nearrow}. \tag{4} \]

Clearly, (4) is continuous, it is nondecreasing and it reaches its supremum (not finite) whenever $\exists\, j : T_j^{\nearrow} \to +\infty$. Note that the last condition implies that the related partial p-value tends to zero, i.e. the $j$th partial test is extremely significant against the $j$th partial null hypothesis $H_{0(j)}$. Moreover, (4) is significant against $H_0$ for large values. Note also that the partial tests are dependent since their null distributions are obtained under the global null hypothesis $H_0 : F_1(y) = F_2(y) = \cdots = F_K(y)$ (i.e. the null distribution of each partial test is obtained from the same permutations of the whole vector of data).

Finally, in order to obtain a permutation test for the stochastic ordering problem:

– Obtain the observed value of the global test statistic by computing (4) from the observed partial tests $T_j^{\nearrow}$, $j = 1, \dots, \hat{k} - 1$;
– Obtain the null distribution of the global test statistic by computing:

\[ {}^{b}\psi^{\nearrow *}_{\hat{k}} = \sum_{j=1}^{\hat{k}-1} {}^{b}T_j^{\nearrow *}, \]

  where the ${}^{b}T_j^{\nearrow *}$ are the partial test statistics computed from $\mathbf{y}^{*}_{b}$;
– Obtain a global p-value to assess $H_0$ against $H^{\nearrow}_{1\hat{k}}$ as:

\[ p^{\nearrow G}_{\hat{k}} = \frac{\#[\, {}^{b}\psi^{\nearrow *}_{\hat{k}} \ge \psi^{\nearrow}_{\hat{k}} \,]}{B}. \]

If the global p-value is significant, then there is empirical evidence that at least one inequality in $H^{\nearrow}_{1\hat{k}} = F_1(y) \ge \cdots \ge F_{\hat{k}-1}(y) \ge F_{\hat{k}}(y)$ is satisfied.

As before, if the alternative hypothesis is $H^{\searrow}_{1\hat{k}} : F_{\hat{k}}(y) \le F_{\hat{k}+1}(y) \le \cdots \le F_K(y)$, then define the global test statistic as:

\[ \psi^{\searrow}_{\hat{k}} = \sum_{j=\hat{k}}^{K-1} T_j^{\searrow}. \]

For further reading on the nonparametric combination methodology see Pesarin (2001).

4 Permutation test for umbrella alternatives

If the peak group was known, then the umbrella alternative could be detected by combining two partial tests for simple stochastic ordering alternatives. However, it will generally be unknown. Nevertheless we can detect the peak group by repeating the procedure for known peak as if every group were the known peak group: that is, for each $k \in \{1, \dots, K\}$, let:

\[ \psi^{\nearrow}_{k} = \sum_{j=1}^{k-1} T_j^{\nearrow} \quad \text{and} \quad \psi^{\searrow}_{k} = \sum_{j=k}^{K-1} T_j^{\searrow} \]

be two partial tests to assess $H_{0k} : F_1(y) = F_2(y) = \cdots = F_K(y)$ against respectively $H^{\nearrow}_{1k}$ and $H^{\searrow}_{1k}$ by applying the direct nonparametric combination of the partial tests $T_j^{\nearrow}$’s and $T_j^{\searrow}$’s. Note that when $k = 1$ we actually test for decreasing ordering only, whereas when $k = K$ we only test for increasing ordering. Then:

– Obtain the partial p-values to assess $H_{0k}$ against $H^{\nearrow}_{1k}$ and $H^{\searrow}_{1k}$, respectively. Let $(p^{\nearrow G}_{k}, p^{\searrow G}_{k})$ be the pair of p-values from the observed data;
– Obtain the null distribution of the pair of p-values to assess $H_{0k}$ against respectively $H^{\nearrow}_{1k}$ and $H^{\searrow}_{1k}$. This will be indicated with the pair $({}^{b}p^{\nearrow G*}_{k}, {}^{b}p^{\searrow G*}_{k})$, $b = 1, \dots, B$. That is, $({}^{b}p^{\nearrow G*}_{k}, {}^{b}p^{\searrow G*}_{k})$ is obtained by applying the previous algorithm for simple stochastic ordering alternatives and by replacing $\mathbf{y}$ with $\mathbf{y}^{*}_{b}$;
– Obtain the observed value of the test statistic with Fisher’s NPC function:

\[ \Psi_k = -2 \log\bigl( p^{\nearrow G}_{k} \cdot p^{\searrow G}_{k} \bigr). \]

– Obtain the null distribution of $\Psi_k$ by computing:

\[ {}^{b}\Psi^{*}_{k} = -2 \log\bigl( {}^{b}p^{\nearrow G*}_{k} \cdot {}^{b}p^{\searrow G*}_{k} \bigr), \qquad b = 1, \dots, B. \]

– Obtain the p-value for umbrella alternative on group $k$ as:

\[ \pi_k = \frac{\#[\, {}^{b}\Psi^{*}_{k} \ge \Psi_k \,]}{B}. \]

Note that if $\pi_k$ is significant then there is evidence in the data of an umbrella alternative with peak group $k$. In order to evaluate if there is a significant presence of any umbrella alternative, we finally combine the p-values for umbrella alternative of each group. To do so:

– Obtain the null distribution of the p-value for umbrella alternative on group $k$ as:

\[ {}^{b}\pi^{*}_{k} = \frac{\#[\, \boldsymbol{\Psi}^{*}_{k} \ge {}^{b}\Psi^{*}_{k} \,]}{B}, \qquad b = 1, \dots, B, \]

  where $\boldsymbol{\Psi}^{*}_{k}$ is the vector with the permutation null distribution of $\Psi_k$.
– Apply Tippett’s combining function to the $\pi_k$’s, providing the observed value of the global test statistic for umbrella alternative in any group as:

\[ \Pi = \min(\pi_1, \pi_2, \dots, \pi_K). \]

  Note that small values of $\Pi$ are significant against the null hypothesis.
– Obtain the null distribution of $\Pi$ by computing:

\[ {}^{b}\Pi^{*} = \min({}^{b}\pi^{*}_{1}, {}^{b}\pi^{*}_{2}, \dots, {}^{b}\pi^{*}_{K}), \qquad b = 1, \dots, B. \]

– Obtain the global p-value as:

\[ \Pi^{G} = \frac{\#[\, {}^{b}\Pi^{*} \le \Pi \,]}{B}. \]

Note that the combining functions are applied simultaneously to each random permutation, providing the null distributions of partial and global tests as well. The NPC methodology applies three times in this testing procedure:

1. when obtaining simple stochastic ordering tests to assess $H^{\nearrow}_{1k}$ and $H^{\searrow}_{1k}$ for the $k$th group (“direct” combining function);
2. when combining two partial tests for simple stochastic ordering alternatives, providing a test for umbrella for each group as if it were the known peak group (“Fisher’s” combining function);
3. when combining the partial tests for umbrella on each group (“Tippett’s” combining function).

A significant global p-value $\Pi^{G}$ indicates that there is evidence in favor of an umbrella alternative. The peak group is then identified by looking at the partial p-values for umbrella alternatives $\{\pi_1, \pi_2, \dots, \pi_k, \dots, \pi_K\}$. The peak group (if any) is then the one with minimum p-value.

The proposed algorithm may still apply with different combining functions in the first two steps, but not in the third step. This is because Tippett’s $\psi$ is significant only when at least one of its arguments is significant. As regards our choices in the first two steps of the algorithm, the direct combining function in step 1 has been chosen for computational reasons, whereas Fisher’s combining function in step 2 has been applied because it is generally suitable when no specific knowledge of the sub-alternatives is expected.

5 A comparison with Mack & Wolfe test

In this section we show the performance of the permutation test for umbrella alternatives by providing results of some simulations under the null hypothesis and under some alternatives. The chosen settings are $K = 5$ groups with $n_j \equiv 3$ observations each ($j = 1, \dots, 5$). The simulated data have a standard normal distribution, possibly with some nonrandom location shifts in some groups (under the alternative hypothesis). The location shifts considered in each simulation are indicated by the symbols $\delta_k$ at the top of each table. Each simulation is based on 1000 independent Monte Carlo data generations. The simulation settings have been chosen in accordance with the example that appeared in Mack and Wolfe (1981). This example is about an intelligence score test: five male groups with three subjects each were evaluated through the Wechsler Adult Intelligence Scale (WAIS). The groups were identified by different classes of age. The authors conjectured that the intelligence score follows an
umbrella trend. Figure 1 shows the boxplot representation of the data, and the dotted line is the trend line connecting the group means. Mack and Wolfe’s test will be compared to our proposal throughout this section. For details on the test statistic and data of the example we refer to Mack and Wolfe (1981). The data of Example 1 in Mack & Wolfe were analyzed by both the permutation test and Mack & Wolfe’s test. Mack & Wolfe’s test gave significant results in favor of the umbrella alternative, providing an approximate p-value equal to 0.0328 with the third age-group as the estimated peak group. The results of the permutation test are shown in Table 1: the global p-value is equal to 0.014, indicating a strongly significant presence of an umbrella alternative. The peak group is then identified through the partial p-values: provided that the global test is significant, the peak group is the one with minimum partial p-value. In this example the third group ($\pi_3 = 0.00299$) is the peak group.

Fig. 1 Boxplot representation of the example from Mack and Wolfe (1981)

Table 1 Permutation Test results, WAIS score data

Age     15–19     20–34     35–54     55–69     >70
πk      0.05794   0.00599   0.00299   0.12687   0.94306
Π^G = 0.014

We have compared our permutation test with Mack & Wolfe’s test through a simulation study. Table 2 reports the results of a simulation under $H_0$. The top of the table refers to the permutation test: here the rejection rates of the null hypothesis of partial and global tests at different α-sizes are shown. Note how the rejection rates of the global test column (indicated by “G.T.”) are close to the nominal ones. Then, for each group, the rejection rates of the partial tests for peak-known umbrella are also shown. In order to account for multiplicity, the partial p-values are compared to the adjusted α-level through a Bonferroni correction (therefore the actual nominal level for the partial tests is α/5). Under the null hypothesis, the probability of observing a peak group should be uniformly distributed among the K groups, therefore the estimated probabilities of the event $\pi_k = \min\{\pi_1, \dots, \pi_K\}$ conditional on the rejection of the global null hypothesis are also shown. The bottom of the table refers to Mack & Wolfe’s test performance: the rejection rates of the test corresponding to the nominal α-sizes are reported in the “p” column. The Mack & Wolfe test provides a unique (global) p-value and the peak group is estimated through a max $Z_k$ combining function, where $Z_k$ is a two-sample Mann & Whitney statistic comparing the $k$th group and the remaining ones. The probability of observing a peak in group $j$, $j = 1, \dots, 5$, conditional on the rejection of the null hypothesis is also reported (denoted by “Pr{k̂ = j | p ≤ α}”).

Table 2 Rejection rates of permutation and Mack & Wolfe’s tests under H0

Group   1      2      3      4      5
δk      0      0      0      0      0

Permutation test
α       Partial tests                         G.T.
0.05    0.006  0.014  0.010  0.012  0.012    0.046
0.10    0.028  0.024  0.020  0.028  0.030    0.100
0.20    0.042  0.050  0.044  0.056  0.046    0.192
α       Pr{πk = minj πj | Π^G ≤ α}            Tot.
0.05    0.206  0.176  0.206  0.235  0.176    1
0.10    0.196  0.214  0.196  0.196  0.196    1
0.20    0.208  0.188  0.208  0.198  0.198    1

Mack & Wolfe’s test
α       Pr{k̂ = j | p ≤ α}                     p
0.05    0.214  0.214  0.243  0.214  0.114    0.049
0.10    0.209  0.191  0.255  0.200  0.145    0.090
0.20    0.192  0.206  0.224  0.229  0.150    0.207

Table 3 shows the behaviour of the test under an umbrella alternative with peak on the third group. As regards the permutation test, note that the rejection rates of the global test are far greater than the nominal levels. Moreover, the rejection rates of the partial tests (accounting for multiplicity) increase with the sizes of the $\delta_k$’s, and group 3 has been detected as the peak group about 35% of the times that the global null hypothesis has been rejected at all α levels (however, the $\delta_k$ sizes are modest compared to the variance of
Table 3 Rejection rates of permutation and Mack & Wolfe’s tests under umbrella alternative

Group   1      2      3      4      5
δk      0      0.9    1      0.9    0

Permutation test
α       Partial tests                         G.T.
0.05    0.002  0.116  0.128  0.094  0.004    0.306
0.10    0.006  0.166  0.216  0.144  0.008    0.446
0.20    0.014  0.264  0.290  0.226  0.030    0.622
α       Pr{πk = minj πj | Π^G ≤ α}            Tot.
0.05    0.007  0.366  0.366  0.248  0.013    1
0.10    0.018  0.332  0.386  0.247  0.018    1
0.20    0.023  0.354  0.354  0.241  0.029    1

Mack & Wolfe’s test
α       Pr{k̂ = j | p ≤ α}                     p
0.05    0.007  0.257  0.414  0.293  0.029    0.274
0.10    0.005  0.285  0.443  0.249  0.018    0.426
0.20    0.003  0.294  0.429  0.260  0.014    0.598

Table 4 Rejection rates of permutation and Mack & Wolfe’s tests under anti-umbrella alternative

Group   1      2      3      4      5
δk      1      0.5    0      0.5    1

Permutation test
α       Partial tests                         G.T.
0.05    0.010  0.000  0.000  0.000  0.015    0.022
0.10    0.020  0.002  0.000  0.000  0.018    0.040
0.20    0.027  0.004  0.000  0.000  0.028    0.068
α       Pr{πk = minj πj | Π^G ≤ α}            Tot.
0.05    0.455  0.000  0.000  0.000  0.545    1
0.10    0.500  0.050  0.000  0.000  0.450    1
0.20    0.441  0.088  0.000  0.000  0.471    1

Mack & Wolfe’s test
α       Pr{k̂ = j | p ≤ α}                     p
0.05    0.625  0.000  0.000  0.083  0.292    0.012
0.10    0.568  0.023  0.023  0.068  0.318    0.041
0.20    0.451  0.122  0.012  0.098  0.317    0.087
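
As a rough illustration of the design behind Tables 3 and 4 (five groups with three observations each), the three-stage direct/Fisher/Tippett procedure of Sect. 4 can be sketched in plain Python. This is only a sketch under stated assumptions, not the authors' implementation: every function name is hypothetical, the square-root scaling of the two-sample statistic follows one reading of (2), the observed arrangement is counted as one of the `n_perm` arrangements so that no p-value is exactly zero (the text divides by B random permutations instead), and the demo data are invented for illustration.

```python
import math
import random
from bisect import bisect_left, bisect_right


def split_groups(values, sizes):
    """Cut a flat list into consecutive groups with the given sizes."""
    groups, start = [], 0
    for n in sizes:
        groups.append(values[start:start + n])
        start += n
    return groups


def mean_difference_stat(low, high):
    """(mean(high) - mean(low)) / sqrt(1/n_low + 1/n_high), cf. statistic (2)."""
    diff = sum(high) / len(high) - sum(low) / len(low)
    return diff / math.sqrt(1.0 / len(low) + 1.0 / len(high))


def direct_stats(groups, k):
    """Directly combined (increasing, decreasing) statistics for candidate peak k (0-based)."""
    def pool(gs):
        return [v for g in gs for v in g]
    # Increasing part: groups 0..k split into [0..j-1] versus [j..k].
    up = sum(mean_difference_stat(pool(groups[:j]), pool(groups[j:k + 1]))
             for j in range(1, k + 1))
    # Decreasing part: groups k..K-1 split into [k..j-1] versus [j..K-1].
    down = sum(mean_difference_stat(pool(groups[j:]), pool(groups[k:j]))
               for j in range(k + 1, len(groups)))
    return up, down


def upper_tail(values):
    """For each entry, the fraction of entries that are >= it."""
    ordered = sorted(values)
    return [(len(values) - bisect_left(ordered, v)) / len(values) for v in values]


def lower_tail(values):
    """For each entry, the fraction of entries that are <= it."""
    ordered = sorted(values)
    return [bisect_right(ordered, v) / len(values) for v in values]


def umbrella_permutation_test(groups, n_perm=1000, seed=0):
    """Return (global p-value, partial p-values pi_k, 0-based index of the estimated peak)."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    # Assumption: arrangement 0 is the observed data, the others are random permutations.
    arrangements = [pooled] + [rng.sample(pooled, len(pooled)) for _ in range(n_perm - 1)]
    pi = []
    for k in range(len(groups)):
        stats = [direct_stats(split_groups(a, sizes), k) for a in arrangements]
        p_up = upper_tail([s[0] for s in stats])    # stage 1: direct combination
        p_down = upper_tail([s[1] for s in stats])
        fisher = [-2.0 * math.log(u * d) for u, d in zip(p_up, p_down)]  # stage 2
        pi.append(upper_tail(fisher))
    tippett = [min(column) for column in zip(*pi)]  # stage 3: minimum over candidate peaks
    partial = [p[0] for p in pi]
    peak = min(range(len(groups)), key=lambda k: partial[k])
    return lower_tail(tippett)[0], partial, peak


# Invented data: five groups of three with a sharp peak in the third group.
demo = [[0.0, 0.1, 0.2], [0.5, 0.6, 0.7], [10.0, 10.1, 10.2],
        [0.55, 0.65, 0.75], [0.05, 0.15, 0.25]]
global_p, partial_p, peak = umbrella_permutation_test(demo)
print(global_p, partial_p, peak)
```

A Monte Carlo study along the lines of the tables would wrap `umbrella_permutation_test` in a loop over simulated data sets and record how often the global p-value falls below α, and which group attains the minimum partial p-value when it does. Reusing the same arrangements for every stage mirrors the remark in Sect. 4 that the combining functions are applied simultaneously to each random permutation.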

data distribution, $\sigma^2 = 1$). The power of the Mack & Wolfe test is very close to that of the permutation test.

In Table 4 we have set the location shifts in order to simulate an anti-umbrella alternative. That is, data are not under the null hypothesis, but the true alternative hypothesis is not of umbrella kind. The trend is first decreasing then increasing, and the tests involved in the comparison should not recognize this kind of alternative, since they have been specifically designed for umbrella alternatives. Indeed, the rejection rates of the global tests are always lower than the related nominal levels.

The simulations we ran showed that the probability of having ties (i.e., more than one estimated peak group) is higher when the Mack & Wolfe test is applied.

6 Umbrella with repeated measures

Umbrella alternative and trend problems are often associated with repeated measure experiments, where a number $n$ of units is subjected to the same treatment (e.g. increasing doses of drugs) and the response is measured at several time points. The test proposed here can be extended to both trend and repeated measure problems. If trend analysis is considered, it is sufficient to consider the partial p-value $\pi_1$ or $\pi_K$ of the algorithm in Sect. 4, depending on whether the alternative hypothesis is decreasing trend ($\pi_1$) or increasing trend ($\pi_K$).

If, in addition, repeated measures are considered, the same algorithm applies by considering a restricted kind of permutations. The model for repeated measure response can be written as follows:

\[ Y_{ik} = \mu_i + \delta_{ik} + \varepsilon_{ik}, \qquad i = 1, \dots, n; \; k = 1, \dots, K. \tag{5} \]

This model accounts for different location parameters on each subject prior to the experiment ($\mu_i$, $i = 1, \dots, n$) and different effects of the treatment on each subject ($\delta_{ik}$, $k = 1, \dots, K$). Note that here the error components $\varepsilon_{ik}$ may be assumed to be exchangeable within the same subject or among subjects. Under the null hypothesis of no treatment effect (i.e. $H_0 : \delta_{ik} = 0$, $\forall i, k$), only the observations within each subject are exchangeable because of the location parameter $\mu_i$ in model (5). Since subjects are independent, the permutation strategy can be modified by considering $n$ independent permutations of each subject’s observations. Note that, by considering independent permutations within each subject, the entire permutation space consists of $(K!)^n$ points, whereas with the usual permutation strategy (of Sects. 2–4) the permutation space consists of $(nK)!$ points.

7 A comparison with Page’s test

Page (1963) proposed a distribution-free test for trend problems where repeated measures are also considered. For instance, consider the case when the observations of $n$ subjects corresponding to $K$ treatments are ranked according to

some criterion of measurement. Referring to the case where a continuous variable $Y$ is measured, let $\mu_j$, $j = 1, \dots, K$, be the mean of $Y$ in the $j$th treatment. Page's test is suitable for testing for the specific alternative $H_1: \mu_1 \ge \mu_2 \ge \cdots \ge \mu_K$ with at least one strict inequality against the usual one-way ANOVA null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_K$. It is also known as a trend test for repeated measures.

Table 5 Rejection rates of the null hypothesis of the permutation test accounting for trend alternatives/repeated measures, and Page's test rejection rates under H0

Corr. matrix = I
Test               Null distribution    α = 0.05   0.1     0.2
Permutation Test                        0.053      0.104   0.207
Page               Monte Carlo          0.038      0.064   0.122
                   Asympt.              0.048      0.083   0.142

Corr. matrix = Σ
Test               Null distribution    α = 0.05   0.1     0.2
Permutation Test                        0.056      0.102   0.198
Page               Monte Carlo          0.042      0.069   0.175
                   Asympt.              0.068      0.139   0.212

The test statistic, denoted by $L$, is a sort of correlation index between the expected ranking under the specified alternative and the sum of the subjects' ranks in each group. To perform Page's test, first rank each subject's observations. Let $R_{ij}$ be the rank of the $i$th subject associated to the $j$th level of the treatment, $R_{ij} = \#[Y_{ih} \le Y_{ij}]$, $1 \le h \le K$. Then $R_{\cdot j} = \sum_i R_{ij}$ is the sum of subject ranks corresponding to the $j$th treatment. If the specific alternative is $H_1: \mu_1 \ge \mu_2 \ge \cdots \ge \mu_K$, then the expected treatment ranking is given by $R_j^* = \{1, 2, \dots, K\}$. Then $L = \sum_j R_{\cdot j} R_j^*$ is Page's statistic. Large values of the test statistic are significant against the null hypothesis.
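The construction of Page's statistic described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' R implementation: the function names `within_subject_ranks` and `page_L` are hypothetical, and the data are the four subjects of Table 7. Ranks are computed as $R_{ij} = \#[Y_{ih} \le Y_{ij}]$, which assumes no ties within a subject. As coded here, with ranks in ascending order and weights $1, \dots, K$, large values of $L$ point towards an increasing trend, which is the direction considered in the simulations of this section.

```python
# Table 7 data: rows are subjects, columns are the hormone levels 0, 10, 100.
TABLE7 = [
    [0.0, 0.521138084, 1.546789352],
    [0.0, 0.495544338, 1.102776615],
    [0.0, 0.474216264, 0.173186268],
    [0.0, -0.318758763, -0.065501549],
]


def within_subject_ranks(row):
    """Rank one subject's observations: R_ij = #[Y_ih <= Y_ij] (assumes no ties)."""
    return [sum(1 for other in row if other <= value) for value in row]


def page_L(data):
    """Return the rank matrix, the rank sums R_.j and Page's L = sum_j j * R_.j."""
    ranks = [within_subject_ranks(row) for row in data]
    n_treatments = len(data[0])
    rank_sums = [sum(r[j] for r in ranks) for j in range(n_treatments)]
    L = sum((j + 1) * rank_sums[j] for j in range(n_treatments))
    return ranks, rank_sums, L


ranks, rank_sums, L = page_L(TABLE7)
print(ranks)      # within-subject ranks R_ij
print(rank_sums)  # rank sums R_.j for j = 1, ..., K
print(L)          # Page's statistic
```

For the Table 7 data the rank sums are 6, 8 and 10, so that $L = 1 \cdot 6 + 2 \cdot 8 + 3 \cdot 10 = 52$.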
Page (1963) provided tables of critical values of the $L$ statistic for various $n$ and $K$, corresponding to some α-values. For untabled probabilities, the author suggests considering a standardized version of the $L$ statistic, which is approximately distributed as a $\chi^2$ distribution with 1 d.f. For further details we refer to Page (1963).

Table 6 Rejection rates of the null hypothesis of the permutation test accounting for trend alternatives/repeated measures, and Page's test rejection rates under H1; δ = [0.5, 1, 1.5]

Corr. matrix = I
Test               Null distribution    α = 0.05   0.1     0.2
Permutation Test                        0.390      0.538   0.708
Page               Monte Carlo          0.214      0.376   0.544
                   Asympt.              0.342      0.470   0.634

Corr. matrix = Σ
Test               Null distribution    α = 0.05   0.1     0.2
Permutation Test                        0.425      0.564   0.775
Page               Monte Carlo          0.420      0.582   0.778
                   Asympt.              0.544      0.712   0.848

We have implemented some R functions performing the permutation test for umbrella alternatives and Page's test. The permutation test for umbrella alternatives can be easily modified in order to account for repeated measures (by only changing the permutation strategy) and/or trend analysis (by looking at the first/last partial p-value). Increasing trends are considered both in the simulation studies and in the example of this section, so the results of the permutation test actually refer to the partial p-value $\pi_K$ (i.e., the global test statistic is now $\pi_K$ and the third step of our procedure is not performed).
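A minimal Python sketch of the restricted permutation strategy follows, assuming Page's $L$ is used as the test statistic. This is a stand-in chosen for illustration: the partial test behind $\pi_K$ is defined in Sect. 4 and is not reproduced here, and the authors' code is in R. With $n = 4$ subjects and $K = 3$ treatments the within-subject permutation space has $(K!)^n = 1296$ points, so it can be enumerated exactly rather than sampled. The function names are hypothetical, the data are those of Table 7, and the normal approximation uses the standard null moments of $L$, namely $E[L] = nK(K+1)^2/4$ and $\mathrm{Var}[L] = nK^2(K+1)(K^2-1)/144$.

```python
import math
from itertools import permutations, product

# Table 7 data: rows are subjects, columns are the hormone levels 0, 10, 100.
TABLE7 = [
    [0.0, 0.521138084, 1.546789352],
    [0.0, 0.495544338, 1.102776615],
    [0.0, 0.474216264, 0.173186268],
    [0.0, -0.318758763, -0.065501549],
]


def page_L(data):
    """Page's L with within-subject ranks R_ij = #[Y_ih <= Y_ij] (assumes no ties)."""
    total = 0
    for row in data:
        ranks = [sum(1 for other in row if other <= value) for value in row]
        total += sum((j + 1) * r for j, r in enumerate(ranks))
    return total


def exact_within_subject_pvalue(data):
    """Enumerate all (K!)^n within-subject permutations; return P(L* >= L_obs)."""
    observed = page_L(data)
    per_subject = [list(permutations(row)) for row in data]
    n_points = 0
    n_extreme = 0
    for permuted in product(*per_subject):  # one permutation per subject
        n_points += 1
        if page_L(permuted) >= observed:
            n_extreme += 1
    return observed, n_points, n_extreme / n_points


def asymptotic_pvalue(L, n, K):
    """Upper-tail normal approximation based on the null mean and variance of L."""
    mean = n * K * (K + 1) ** 2 / 4
    var = n * K ** 2 * (K + 1) * (K ** 2 - 1) / 144
    z = (L - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))


observed, n_points, p_exact = exact_within_subject_pvalue(TABLE7)
p_asym = asymptotic_pvalue(observed, n=len(TABLE7), K=len(TABLE7[0]))
print(observed, n_points, round(p_exact, 4), round(p_asym, 4))
```

For the Table 7 data this gives $L = 52$, an exact upper-tail probability of $141/1296 \approx 0.1088$ and a normal approximation of about 0.0786, in agreement with the Exact and Asympt. values reported for Page's test in Table 8. With larger $n$ or $K$ full enumeration quickly becomes impractical, and a random sample of within-subject permutations would be drawn instead.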
As regards Page's test, the asymptotic p-value is based upon normal-deviate approximations, whereas the Monte Carlo p-value is obtained by applying the same permutation strategy as the permutation test accounting for repeated measures. We have considered two scenarios in the simulation study: the observations of each subject have been generated according to model (5), and the errors follow a multivariate normal distribution with correlation matrix $I$ (identity matrix, for uncorrelated data) and

$$\Sigma = \begin{bmatrix} 1 & 0.75 & 0.5 \\ 0.75 & 1 & 0.25 \\ 0.5 & 0.25 & 1 \end{bmatrix}$$

for correlated data. We have considered $K = 3$ groups and $n = 4$ observations as in the final example of this section. The location shifts in the power simulation study are equal to $\delta_{ik} = [0.5, 1, 1.5]$, $\forall i$.

Table 5 reports the rejection rates of the permutation test and Page's test when data are under $H_0$. Here the function performing the permutation test has been modified in order to account for trend alternatives and repeated measures. Note that the rejection rates of the permutation test are very close to the nominal α-levels both when data are uncorrelated and when they are correlated. Page's test seems to be conservative when data are uncorrelated.

Table 6 shows a simulation under $H_1$. Here we have considered an increasing trend as the alternative hypothesis, with correlated/uncorrelated data. There is a gain in power of Page's test when considering correlated data.

As a final example, we have considered a gene expression study where the interest is to assess whether the gene expression of an estrogen receptor (named ER-β) is influenced by the quantity of hormone (Testosterone) levels
in bovine fetus brain cell cultures. In particular, the researchers expected to find an increasing trend in the response, which should give some indications on the sexual differentiation of fetuses. Three concentration levels (0%, 10% and 100%) of Testosterone were given to four cell cultures and the gene expression of ER-β was measured. Data of Example 2 are shown in Table 7. The sampling group means are displayed in the last line of the table.

Table 7 Data of Example 2

             Hormones
Subject n°   0 (I)    10 (II)         100 (III)
1            0         0.521138084     1.546789352
2            0         0.495544338     1.102776615
3            0         0.474216264     0.173186268
4            0        −0.318758763    −0.065501549
ȳk           0         0.293034981     0.689312672

The data of Table 7 were analyzed with the R functions performing the permutation test accounting for trend alternatives/repeated measures and Page's test. The results are displayed in Table 8. Page's test gave a Monte Carlo p-value of approximately 11%, and we have also run the analysis with the StatXact software in order to obtain the exact p-value. The asymptotic p-value is not reliable here because of the small sample size. The permutation test, instead, gave a global p-value equal to $\pi_3 = 0.0159$. When accounting for trend alternatives, the permutation test works as if the peak were known.

Table 8 p-values of permutation test accounting for trend/repeated measures and Page's test. Data from Table 7

Test          Null distribution    p-value    Peak group
Permutation   –                    0.0159     III
Page          Exact (a)            0.1088     –
              Monte Carlo          0.1091     –
              Asympt.              0.0786     –

(a) result from StatXact 8

Acknowledgements This research has been supported by the Project Research CPDA088513 of University of Padova (Project Coordinator: Prof. Luigi Salmaso).

References

Chen, Y.I., Wolfe, D.A.: A study of distribution-free tests for umbrella alternatives. Biom. J. 32, 47–57 (1990)
Hartlaub, B.A., Wolfe, D.A.: Distribution-free ranked-set sample procedures for umbrella alternatives in the m-sample setting. Environ. Ecol. Stat. 6, 105–118 (1999)
Hettmansperger, T.P., Norton, R.M.: Tests for patterned alternatives in k-sample problems. J. Am. Stat. Assoc. 82, 292–299 (1987)
Hirotsu, C.: Isotonic inference. In: Encyclopedia of Biostatistics, pp. 2107–2115. Wiley, New York (1998)
Kössler, W.: Some c-sample rank tests of homogeneity against umbrella alternatives with unknown peak. J. Stat. Comput. Simul. 76, 57–74 (2006)
Mack, G.A., Wolfe, D.A.: K-sample rank tests for umbrella alternatives. J. Am. Stat. Assoc. 76(373), 175–181 (1981)
Magel, R.C., Qin, L.: A non-parametric test for umbrella alternatives based on ranked-set sampling. J. Appl. Stat. 30, 925–937 (2003)
Manly, B.F.J.: Randomization, Bootstrap and Monte Carlo Methods in Biology, 2nd edn. Chapman and Hall, London (1997)
Millen, B.A., Wolfe, D.A.: A class of nonparametric tests for umbrella alternatives. J. Stat. Res. 39, 7–24 (2005)
Neuhäuser, M., Leisler, B., Hothorn, L.A.: A trend test for the analysis of multiple paternity. J. Agric. Biol. Environ. Stat. 8, 29–35 (2003)
Page, E.B.: Ordered hypotheses for multiple treatments: a significance test for linear ranks. J. Am. Stat. Assoc. 58(301), 216–230 (1963)
Pan, G.: Distribution-free tests for umbrella alternatives. Commun. Stat. Theory Methods 25, 3185–3194 (1996)
Pan, G.: Distribution-free confidence procedure for umbrella orderings. Aust. N.Z.J. Stat. 38, 161–172 (2008)
Pesarin, F.: Multivariate Permutation Tests with Applications in Biostatistics. Wiley, Chichester (2001)
Shi, N.-Z.: Rank test statistics for umbrella alternatives. Commun. Stat. Theory Methods 17, 2059–2073 (1988)
Wolfe, D.A.: Nonparametric distribution-free procedures for order restricted alternatives. In: Ahsanullah, M., Raquab, M.Z. (eds.) Recent Developments in Order Random Variables. Nova Science, New York (2006)
