The Canadian Journal of Statistics / La revue canadienne de statistique
Vol. 36, No. 1, 2008, Pages 143-156

Nonparametric tests of hypotheses for umbrella alternatives
Mayer ALVO

Key words and phrases: Asymptotic power efficiency; distance; Kendall’s tau; nonparametric test; rank;
Spearman’s rho; umbrella alternatives; unknown peak.
MSC 2000: Primary 62G10; secondary 62G20.

Abstract: The author proposes a general method for constructing nonparametric tests of hypotheses for
umbrella alternatives. Such alternatives are relevant when the treatment effect changes in direction after
reaching a peak. The author’s class of tests is based on the ranks of the observations. His general approach
consists of defining two sets of rankings: the first is induced by the alternative and the other by the data
itself. His test statistic measures the distance between the two sets. The author determines the asymptotic
distribution for some special cases of distances under both the null and the alternative hypothesis when
the location of the peak is known or unknown. He shows the good power of his tests through a limited
simulation study.

Tests d’hypothèses non paramétriques pour des contre-hypothèses parapluies

Résumé : L’auteur propose une méthode générale de construction de tests non paramétriques pour des
contre-hypothèses parapluies. De telles hypothèses sont pertinentes lorsque l’effet de traitement change
de direction après avoir atteint un sommet. La classe de tests de l’auteur est fondée sur les rangs des
observations. Son approche générale fait intervenir deux ensembles de rangs : le premier découle de la
contre-hypothèse et le second émane des données elles-mêmes. La statistique de son test mesure la distance
entre les deux ensembles. L’auteur en détermine la loi asymptotique dans quelques cas spéciaux de distance
sous l’hypothèse nulle et sous la contre-hypothèse selon que la localisation du sommet est connue ou non.
Il illustre la bonne puissance de ses tests au moyen d’une petite étude de simulation.

1. INTRODUCTION
Let $X_{i(1)}, \ldots, X_{i(m_i)}$, $i = 1, \ldots, k$, be $k$ independent random samples with $X_{i(\ell)}$, $\ell = 1, \ldots, m_i$, having an absolutely continuous distribution function $F_i(x)$. In the parametric case,
we may have $F_i(x) = F(x - \theta_i)$, where $F$ has median zero. We shall be concerned with testing
the hypothesis of no treatment effect against the alternative that there is a monotone treatment
effect subject to a change in direction. Alternatives of this type arise whenever the treatment
effect changes in direction after reaching a peak. For example, the effectiveness of a drug may
change with time, the effectiveness in learning as a function of age may peak at a certain stage,
the reaction to increasing levels of a drug dosage may peak at a certain point and decrease
thereafter, etc. Formally, letting $F_p$ be the distribution where the turning point occurs, the hypothesis
and the alternative are, respectively,
$$\mathcal{H}_0 : F_1(x) = \cdots = F_k(x), \quad \text{for all } x, \tag{1}$$
$$\mathcal{H}_1 : F_1(x) \ge \cdots \ge F_{p-1}(x) \ge F_p(x) \le F_{p+1}(x) \le \cdots \le F_k(x), \tag{2}$$
with at least one strict inequality for some $x$. Equivalently, in the parametric case, the hypotheses
become
$$\mathcal{H}_0 : \theta_1 = \cdots = \theta_k, \tag{3}$$
$$\mathcal{H}_1 : \theta_1 \le \cdots \le \theta_{p-1} \le \theta_p \ge \theta_{p+1} \ge \cdots \ge \theta_k, \tag{4}$$
with at least one strict inequality in $\mathcal{H}_1$. The reader is referred to Barlow, Bartholomew, Bremner
& Brunk (1972) and to Robertson, Wright & Dykstra (1988) for more extensive literature on
the parametric case. We note that an umbrella alternative contains as special cases, the ordered
alternatives corresponding to p = k or p = 1. These special cases have been studied in their
own right by Terpstra (1952), Jonckheere (1954), Page (1963) and Alvo & Cabilio (1995) in the
context of block experiments.
In Section 2, we propose a general approach for constructing test statistics based on the ranks
of the observations for the situation when the location of the peak is known. Specific examples
using both the Spearman and Kendall distances are determined. We obtain the asymptotic null
and non-null distributions for these test statistics under the assumption that the minimum of the
sample sizes gets large. In Section 3, we consider the case where the location of the peak is
unknown. In Section 4, we report on the results of some simulation studies and note that the
test statistics perform well under a variety of different underlying distributions. We conclude in
Section 5 with some final remarks.

2. THE CONSTRUCTION OF THE TEST STATISTICS FOR KNOWN PEAK


The general strategy for developing tests of $\mathcal{H}_0$ vs $\mathcal{H}_1$ consists of first defining a standardized
statistic, say $V_p^*$, for the case where the peak is known to be at population $F_p$. For the case where
the location of the peak is unknown, the test statistic is based on $\max_p V_p^*$. Umbrella alternatives
were first considered by Mack & Wolfe (1981) and later by Simpson & Margolin (1986),
Hettmansperger & Norton (1987) and Shi (1988). For the case of the unknown peak, Chen &
Wolfe (1990) modified the Mack-Wolfe statistic. See also Chen (1991) and, more recently,
Millen & Wolfe (2005), who introduce modifications and exhibit a simulation study. Kossler
(2006) compares the test statistics based on the statistics due to Chen and Wolfe, Hettmansperger
and Norton, and Shi.
Hettmansperger & Norton (1987) considered the problem of testing (3) against an alternative
of the form
$$\mathcal{H}_1 : \theta_j = \theta_0 + \theta c_j, \quad \theta > 0, \quad j = 1, \ldots, k, \tag{5}$$
where the $\{c_j\}$ are a given set of constants that specify the pattern to be detected. Their test
statistic $V_p^*$ takes the form of a weighted sum $\sum_i a_i \bar{r}_i$, with $\sum_i a_i = 0$ and where $\bar{r}_i$ represents
the average of the ranks in the $i$th population. The weights are chosen to optimize the Pitman
efficacy and are functions of the $\{c_j\}$. The latter are rarely known in practice and the authors
recommend choosing them to be equally spaced. Mack & Wolfe (1981) proposed using
Mann-Whitney statistics on both sides of the peak to estimate the peak when its location is unknown.
Chen & Wolfe (1990), Chen (1991) and more recently Millen & Wolfe (2005) considered the
case of an unknown peak and based their statistics on Mann-Whitney scores as well. However,
as Hettmansperger & Norton (1987) have noted, “the lack of comparisons across peaks can result
in some loss of efficiency.” Shi (1988) proposed a test statistic similar to Hettmansperger and
Norton’s, but using a different weighting scheme. Kossler (2006) generalized the test statistics
due to Hettmansperger and Norton, Chen and Wolfe, and Shi by using score functions instead
of ranks when the location of the peak is unknown. He concluded from an extensive simulation
study that the Hettmansperger-Norton type test performed best overall, followed closely by the
Chen-Wolfe type test and the Shi-type test. He also considered the test due to Pan (1996).
In what follows, we first assume that there is a single peak and that its location is known.
We then propose an approach for deriving test statistics based on the notion of distance between
permutations. It is seen that the choice of Spearman distance leads naturally to test statistics of
the Hettmansperger-Norton type but without the need to specify an alternative of the form (5).
On the other hand, the choice of Kendall distance leads to Mann-Whitney-type statistics that
allow for comparisons across the peak. As noted above, we can then develop test statistics based
on these to deal with the case of an unknown peak.
We adopt the following notation. Let $n_i = m_1 + \cdots + m_i$, $i = 1, \ldots, k$, $n = n_k$, $\tilde{n} = n - m_p$,
$n_0 = 0$. Let $\mathcal{P} = \{\mu : [\mu(1), \ldots, \mu(n)]\}$ be the set of all permutations of the integers $1, \ldots, n$,
and let $d(\mu, \nu)$ be a distance function between permutations $\mu$ and $\nu$.
Motivated by Critchlow (1992), Alvo & Pan (1997) proposed a general approach to hypoth-
esis testing based on the ranks of the observations. It consists of defining a set of permutations
induced by the observations and a further set of extremal permutations “most in agreement” with
the alternative. The test statistic is then based on a measure of the distance between these two
sets. Specifically, we propose the following steps.
Step 1: Rank all the observations together so that the smallest gets rank 1, the next smallest
rank 2, etc. Let the $n$-dimensional vector
$$r = (r(1), \ldots, r(m_1) \mid r(m_1+1), \ldots, r(m_1+m_2) \mid \ldots, r(n))$$
represent the ranks of the $\{X_{i(\ell)}\}$, $i = 1, \ldots, k$, $\ell = 1, \ldots, m_i$, grouped by populations. In
view of the continuity assumption on the distributions, ties among the observations occur with
probability zero.
Step 2: Define $\{r\}$ to be the subclass of permutations “equivalent” to the observable permutation
$r$ in the sense that ranks occupied by identically distributed random variables are exchangeable.
This subclass consists of all the permutations of $r$ where the rankings within each population
are permuted among themselves only. The cardinality of $\{r\}$ is given by the product
$\prod_{h=1}^{k} m_h!$.
Step 3: Define $E$ to be an extremal subclass of $\mathcal{P}$ consisting of all permutations which are
“most in agreement with $\mathcal{H}_1$.” The extremal set $E$ does not necessarily correspond to the entire
critical region but rather consists of those permutations which provide the strongest evidence in
favour of the alternative. In the present context, permutations in $E$ are such that ranks occupied
by observations from $F_i$ are always less than those from $F_{i'}$ if $i < i' \le p$, whereas the reverse
is true if $p \le i < i'$. Moreover, ranks attributed to a distribution consist of consecutive integers.
The enumeration of the extremal set $E$ is a two-stage procedure. First, choose the relative
order of the $(p-1)$ populations $F_1, \ldots, F_{p-1}$ among $F_1, \ldots, F_{p-1}, F_{p+1}, \ldots, F_k$. This can be
done in $c = \binom{k-1}{p-1}$ ways. Then partition the integers $1, \ldots, n$ in accordance with the prescribed
ordering of the populations while taking into account corresponding sample sizes. The extremal
set $E$ is finally obtained by permuting the integers within each population. Population $F_p$ is
always allocated the last $m_p$ integers, namely $\tilde{n}+1, \ldots, n$. The cardinality of $E$ is therefore
equal to $c\,(\prod_{i=1}^{k} m_i!)$.
Step 4: Let $d(\mu, \nu)$ be a distance function between two permutations $\mu$, $\nu$ and define the
distance between the two sets $\{r\}$ and an extremal set $E$ by computing the sum of all pairwise
distances
$$d(\{r\}, E) = \sum_{\mu \in \{r\}} \sum_{\nu \in E} d(\mu, \nu). \tag{6}$$
Small values of $d(\{r\}, E)$ are inconsistent with the null hypothesis and consequently lead to
rejection of $\mathcal{H}_0$.
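Steps 1 to 4 can be sketched by brute force for very small samples. The following is only an illustrative sketch, not the author's implementation: the function names are hypothetical, the Spearman distance is taken in the common form of half the sum of squared rank differences (an assumption about normalization), and full enumeration is feasible only for tiny sample sizes.

```python
from itertools import combinations, permutations, product
from math import comb, factorial, prod


def pooled_ranks(samples):
    """Step 1: rank all observations together, grouped by population (no ties assumed)."""
    flat = sorted(x for s in samples for x in s)
    rank = {x: i + 1 for i, x in enumerate(flat)}
    return [tuple(rank[x] for x in s) for s in samples]


def within_group_permutations(groups):
    """Step 2: all vectors obtained by permuting values inside each population only."""
    for choice in product(*(permutations(g) for g in groups)):
        yield tuple(r for g in choice for r in g)


def extremal_set(sizes, p):
    """Step 3: permutations most in agreement with a peak at population p (1-based)."""
    k = len(sizes)
    left = list(range(p - 1))               # F_1..F_{p-1}, ranks increasing in i
    right = list(range(k - 1, p - 1, -1))   # F_k..F_{p+1}, F_k gets the smallest ranks
    for slots in combinations(range(k - 1), p - 1):
        order = [None] * (k - 1)
        for pop, s in zip(left, slots):
            order[s] = pop
        rest = iter(right)
        order = [next(rest) if o is None else o for o in order]
        order.append(p - 1)                 # the peak receives the last m_p integers
        blocks, start = {}, 1
        for pop in order:
            blocks[pop] = tuple(range(start, start + sizes[pop]))
            start += sizes[pop]
        yield from within_group_permutations([blocks[i] for i in range(k)])


def spearman(mu, nu):
    """Assumed form of the Spearman distance: half the sum of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(mu, nu)) / 2


def set_distance(samples, p, d=spearman):
    """Step 4: sum of all pairwise distances between {r} and E."""
    groups = pooled_ranks(samples)
    sizes = [len(g) for g in groups]
    E = list(extremal_set(sizes, p))
    return sum(d(mu, nu) for mu in within_group_permutations(groups) for nu in E)
```

With three populations of sizes 2, 1, 2 and the peak at $p = 2$, data that follow the umbrella pattern give a smaller set distance than data with a valley at the second population, in line with the rejection rule of Step 4.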
In what follows, we shall consider distances between permutations defined by Spearman and
by Kendall, respectively, as follows:
Spearman’s rho:

Kendall’s tau:
Both Spearman and Kendall distances are right invariant in the sense that they are invariant
with respect to a change in the relabelling of the objects ranked. Note the Kendall distance is
over $\ell, \ell'$ which satisfy the condition indicated.
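For orientation, the two distances are commonly written in the rank-based testing literature as follows. This is a sketch of the standard forms only; the normalization and index conventions used in this paper may differ, and the symbols $i$, $j$, $\sigma$ below are introduced here for illustration.

```latex
% Standard forms (assumed, not taken from the source text)
d_S(\mu,\nu) = \frac{1}{2}\sum_{i=1}^{n}\bigl(\mu(i)-\nu(i)\bigr)^{2},
\qquad
d_K(\mu,\nu) = \sum_{i<j}\Bigl\{1-\operatorname{sgn}\bigl(\mu(i)-\mu(j)\bigr)\,
                                 \operatorname{sgn}\bigl(\nu(i)-\nu(j)\bigr)\Bigr\}.

% Right invariance: for any permutation \sigma of the labels 1,...,n,
d(\mu\circ\sigma,\ \nu\circ\sigma) = d(\mu,\nu).
```

In the Kendall form, only the product-of-signs term depends on how the two permutations agree, which is why the later derivation can concentrate on that term.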

2.1. The test statistic corresponding to the Spearman distance.


In this section, we derive the test statistic corresponding to Spearman distance under the extremal
set $E$ when the location of the peak is known. Throughout, we shall assume that permutations
defined by the extremal sets are arranged in columns indexed by $1 \le i(\ell) \le n$ in such a way that
ranks are in increasing order for populations $F_i$, $i \le p$, and in decreasing order when $i \ge p$.
Suppose that $F_i$, $i < p$, is in relative position $j$ and hence populations $F_1, \ldots, F_{i-1}$ are
in relative positions chosen from among the first $j-1$ positions. Populations $F_{i+1}, \ldots, F_{p-1}$
are then assigned positions chosen from $j+1, \ldots, k-1$. This can happen with frequency
$\binom{j-1}{i-1}\binom{k-1-j}{p-1-i}$. The positions of the remaining populations are then automatically determined.
Together populations $F_1, \ldots, F_{i-1}$ and $F_{k+i-j+1}, \ldots, F_k$ are assigned the first $a_{ij}$ integers
where

Population $F_i$ is assigned integers $a_{ij}+1, \ldots, a_{ij}+m_i$ whose sum is equal to $\left(a_{ij} + \frac{m_i+1}{2}\right) m_i$.
The process of permuting ranks within $F_i$ implies that each entry will contribute the sum $(m_i-1)!$
times. Hence, for each entry taking into account the permutations, we have

Finally, summing over each j we have

On the other hand for the data vector we have for each entry in Fi

(J-Jmz!) (%- n-1 T)


Similarly for $i > p$, we may define

The calculation of (6) then yields

where
and $\bar{r}_i$ represents the average of the ranks for the $i$th population and

It is instructive to consider the special case $m_i = m$ where an equal number of observations
is taken from each population. In that case, $a_{ij} = b_{ij} = (j-1)m$ and

~ i= ,

i c-

~km

C
-1
+ (?)

j=k-i+l
k-i

On using the identity (12.16) in Feller (1968, p. 65)


-1-j
-p-1
(?)

+(?)
ifi <p,

ifi = p ,

i f i >p.

It follows that

2.2. The test statistic corresponding to the Kendall distance.


In this section, we derive the test statistic corresponding to the Kendall distance. Consider for
now the situation when there is only one observation per population. Fix $1 \le i_1 < p < i_2 \le k$.
Suppose that integer $j$ is assigned to $F_{i_1}$ and integer $j_2$ is assigned to $F_{i_2}$, with $j_2 > j$. Then the
frequency with which this can happen is given by

$$\binom{j-1}{i_1-1}\binom{j_2-j-1}{j_2+i_2-i_1-k-1}\binom{k-j_2-1}{i_2-p-1}, \qquad j_2 > j.$$

In fact, from the point of view of the $p-1$ populations $F_1, \ldots, F_{p-1}$, the number of ways
of choosing $i_1-1$ integers to be less than $j$ is $\binom{j-1}{i_1-1}$. If $q$ is the number of populations among
$F_{i_1+1}, \ldots, F_{p-1}$ which are assigned ranks greater than $j$ but less than $j_2$, then we must have

$$q + i_1 + (k - i_2) = j_2 - 1.$$
Their ranks are chosen from $j+1, \ldots, j_2-1$.
Summing over $j_2$ we obtain the total number of negatives,
$$H(i_1, i_2) = \sum_{j_2 > j} \binom{j-1}{i_1-1}\binom{j_2-j-1}{q}\binom{k-j_2-1}{i_2-p-1}. \tag{11}$$

Alternatively, we may sum first over $j$ in (11) to obtain

It follows that the sum over the signs is given by the positives less the negatives, $c - 2H(i_1, i_2)$.
Considering only the second term in (8) and letting $W(i_1, i_2)$ be

$$W(i_1, i_2) = \begin{cases} -c & \text{if } 1 \le i_1 < i_2 \le p, \\ c - 2H(i_1, i_2) & \text{if } 1 \le i_1 < p < i_2 \le k, \\ c & \text{if } p \le i_1 < i_2 \le k, \end{cases}$$
it follows that the Kendall test statistic becomes

$$\sum_{i_1 < i_2} W(i_1, i_2)\, \operatorname{sgn}\bigl(\pi(i_1) - \pi(i_2)\bigr).$$
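The counting argument behind $H(i_1, i_2)$ can be checked numerically for small $k$. The sketch below is an illustration under the single-observation setting described above: it enumerates the $c$ extremal orderings directly and compares the count of "negatives" with the triple binomial sum, summed over both $j$ and $j_2$ with $j_2 > j$. The helper names are hypothetical.

```python
from itertools import combinations
from math import comb


def binom(a, b):
    """Binomial coefficient that is zero outside 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0


def extremal_orderings(k, p):
    """Yield {population: rank} for each extremal ordering, one observation per population."""
    left = list(range(1, p))        # F_1..F_{p-1}: ranks increase with the index
    right = list(range(k, p, -1))   # F_k..F_{p+1}: F_k receives the smallest rank
    for slots in combinations(range(1, k), p - 1):
        rank = dict(zip(left, slots))
        free = [s for s in range(1, k) if s not in slots]
        rank.update(zip(right, free))
        rank[p] = k                 # the peak always receives the largest rank
        yield rank


def H_count(k, p, i1, i2):
    """Number of extremal orderings in which F_{i1} is ranked below F_{i2}."""
    return sum(1 for r in extremal_orderings(k, p) if r[i1] < r[i2])


def H_formula(k, p, i1, i2):
    """Triple binomial sum over j < j2, with q = j2 + i2 - i1 - k - 1."""
    total = 0
    for j in range(1, k):
        for j2 in range(j + 1, k):
            q = j2 + i2 - i1 - k - 1
            total += (binom(j - 1, i1 - 1)
                      * binom(j2 - j - 1, q)
                      * binom(k - j2 - 1, i2 - p - 1))
    return total


def W_across_peak(k, p, i1, i2):
    """Sum of sgn(rank(i1) - rank(i2)) over the extremal orderings."""
    return sum(1 if r[i1] > r[i2] else -1 for r in extremal_orderings(k, p))
```

For example, with $k = 5$ and $p = 3$ there are $c = 6$ orderings, and `H_count(5, 3, 1, 4)` and `H_formula(5, 3, 1, 4)` both count the five orderings in which $F_1$ falls below $F_4$.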

We now consider the more general case with unequal observations when the extremal set is $E$.
In that case, the extremal set is determined by first specifying the ordering of the populations and
then permuting within populations. It follows that the weight function will be a function only of
the indices $i_1, i_2$ since the sign of the difference $\pi(i_1(\ell)) - \pi(i_2(\ell'))$ is determined entirely by
the ordering of the populations. The data set on the other hand consists of permuting the
ranks within populations. This yields the double sum $\sum_{\ell} \sum_{\ell'} \operatorname{sgn}\{\pi(i_1(\ell)) - \pi(i_2(\ell'))\}$.
There is no contribution to the sum from permutations within each population. Set

$$U(i_1, i_2) = \sum_{\ell=1}^{m_{i_1}} \sum_{\ell'=1}^{m_{i_2}} \operatorname{sgn}\{r(i_1(\ell)) - r(i_2(\ell'))\}.$$

Hence the Kendall test statistic for unequal numbers of observations when the extremal set is $E$
is

$$D_p = \sum_{i_1 < i_2} W(i_1, i_2)\, U(i_1, i_2). \tag{13}$$

It is interesting to note that Mack & Wolfe (1981) as well as Millen & Wolfe (2005) define a test
statistic which combines the sums of Mann-Whitney statistics to the left and to the right of the
peak only. They do not include comparisons across the peak.
2.3. The asymptotic distribution of the test statistics under the null hypothesis.
In this section, we consider the asymptotic distribution of the Spearman and Kendall test statistics
under the null hypothesis when the location of the peak is known.

THEOREM 1. Assume that $\min m_i \to \infty$ with $m_i/n \to \lambda_i > 0$. Define
Then the test statistic


k
n+l
i=l

corresponding to the Spearman distance is asymptotically normal with mean 0 and variance $\sigma_p^2$.

Proof of Theorem 1. In the Spearman case, we make use of the representation of $S_p$ as a normalized
linear rank statistic

Since $m_i/n$ converges to a constant, it follows that as $\min(m_i) \to \infty$,

It can be shown that $\bar{v} = (n+1)/2$ and in the equal sample size case the variance is given by

u; =
12 i=l P 2k
k t l - i
k + 1- p
,,,,,)
2k

For the asymptotic distribution of the Kendall statistic (13), we consider a projection approach
onto the space of linear rank statistics. Recall the following result from Hájek &
Šidák (1967)

otherwise.

The following theorems imply that the Kendall and Spearman statistics are asymptotically
equivalent. The proofs are given in the Appendix.

THEOREM 2. The projection of $D_p$ in (13) onto the space of linear rank statistics is given by

$$\hat{D}_p = kS_p/n.$$

THEOREM 3. $\operatorname{var} D_p / \operatorname{var} \hat{D}_p \to 1$ as $\min m_i \to \infty$, with $m_i/n \to \lambda_i > 0$. Hence the Kendall
and Spearman test statistics are asymptotically equivalent.

The asymptotic distribution under the alternative hypothesis can be obtained in a similar way
as in Alvo & Pan (1997). Assume that $F(x)$ is a twice differentiable continuous distribution
function with density $f(x)$. Let $F^{-1}(u) = \inf\{x : F(x) \ge u\}$. It can be shown that the
asymptotic power efficiency is given by the expression

where
As in Alvo & Pan (1997), we may conclude that the asymptotic power efficiency for the statistics
based on both Spearman and Kendall is greater when the underlying distribution is logistic as
compared to the normal. As we shall see, this is supported by the limited simulation study
conducted.
Further insight into the proposed statistics may be obtained by recalling the asymptotic power
efficiency (APE) of test statistics having the general form

It can be shown (Hettmansperger & Norton 1987) that the APE is given by

where the $\{c_i\}$ describe the alternative defined in (5). In Table 1, values of APE/$I(f)$ are
displayed in the case of equal sample sizes and $k = 5$, $p = 2$ for various test statistics. The
weights $\{w_i\}$ are indicated in each case using results of Kossler (2006). It can be seen that the
statistic $S_p$ compares well in all cases.
TABLE 1: Values of APE/I(f) for k = 5, p = 2 and equal sample size. Weights {w_i}: S_p: (0.5, 1, 0.75, 0.5, 0.25);
Hettmansperger and Norton (HN): (1, 2, 1, 0, -1); Shi: (-2.4, 4.449, 0, 0.449, -2);
Mack and Wolfe (MW): (1, 2, 3, 2, 1).

{c_i}                   S_p     HN      Shi     MW

(0, 1, 0, -1, -2)       0.886   1.040   0.566   0.183
(0, 1, 0.5, 0, -0.5)    0.260   0.222   0.212   0.103
(0, 1, 1, 1, 1)         0.006   0.006   0.029   0.046
(0, 1, 1, 1, 0)         0.125   0.055   0.114   0.183
(0, 1, 1, 0, 0)         0.186   0.125   0.141   0.140
(0, 1, 0, 0, 0)         0.098   0.075   0.141   0.003
(0, 1, 0, 0, -1)        0.346   0.346   0.297   0.071
(0, 1, 0, -1, -1)       0.445   0.498   0.340   0.046
(0, 1, 1, 0, -1)        0.498   0.445   0.297   0.346
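The $S_p$ weights quoted in the caption of Table 1 can be reproduced by averaging, over the $c$ extremal orderings, the relative position of each population and dividing by $k$. The sketch below does this by enumeration and compares the result with a closed form, $i/p$ to the left of the peak and $(k+1-i)/(k+1-p)$ to the right. That closed form is our reading of the equal-sample-size case and is not stated in the surviving text; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def average_position_weights(k, p):
    """Average relative position of each population over extremal orderings, divided by k."""
    totals = [0] * (k + 1)                  # index 0 unused, populations are 1-based
    left = list(range(1, p))                # increasing towards the peak
    right = list(range(k, p, -1))           # F_k gets the smallest position
    for slots in combinations(range(1, k), p - 1):
        free = [s for s in range(1, k) if s not in slots]
        for pop, pos in list(zip(left, slots)) + list(zip(right, free)):
            totals[pop] += pos
        totals[p] += k                      # peak is always in the last position
    c = comb(k - 1, p - 1)
    return [Fraction(totals[i], c * k) for i in range(1, k + 1)]


def closed_form_weights(k, p):
    """Assumed closed form: i/p up to the peak, (k+1-i)/(k+1-p) after it."""
    return [Fraction(i, p) if i <= p else Fraction(k + 1 - i, k + 1 - p)
            for i in range(1, k + 1)]


if __name__ == "__main__":
    print([str(w) for w in average_position_weights(5, 2)])
```

For $k = 5$ and $p = 2$ the enumeration gives $(1/2, 1, 3/4, 1/2, 1/4)$, which matches the weights listed for $S_p$ in the caption.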

3. THE TEST STATISTICS WHEN THE PEAK IS UNKNOWN


In the case when the location of the peak is unknown, we may construct test statistics using the
approach of Hettmansperger & Norton (1987). Allowing p to vary, let

be the standardized statistics and let

$$S_{\max} = \max_p S_p^*, \qquad D_{\max} = \max_p D_p^*.$$

The test based on Spearman distance rejects $\mathcal{H}_0$ if $S_{\max}$ is large. Similarly, the test based on
Kendall distance rejects $\mathcal{H}_0$ if $D_{\max}$ is large. The asymptotic distribution of the respective test
statistics under the null hypothesis is given in the next theorem.
THEOREM 4. Let the vector $S = (S_1^*, \ldots, S_k^*)^T$ and let $\operatorname{cov}(S) = BB^T$. Under $\mathcal{H}_0$, if
$\min m_i \to \infty$ with $m_i/n \to \lambda_i > 0$, $i = 1, \ldots, k$, then $S$ has asymptotically the distribution
of $BZ$ where $Z$ is multivariate normal with mean 0 and covariance matrix $I$. Consequently,
$S_{\max}$ has asymptotically the distribution of $\max BZ$. Similarly, let $D = (D_1^*, \ldots, D_k^*)^T$. Then
$\operatorname{cov}(D)$ and $\operatorname{cov}(S)$ are asymptotically equivalent and $D_{\max}$ has asymptotically the distribution
of $\max BZ$.

Proof of Theorem 4. In the Spearman case, we note that the components of the vector are linear
combinations of the average ranks over each population. Let dl, . . . ,dk be a set of arbitrary
coefficients and consider the linear combination
k
cdpS;=Ed:
p= 1 i=l
k
(-T,- ,,,>,
-

where

The asymptotic normality of (14) follows as in the proof of Theorem 1. Moreover from Hájek &
Šidák (1967, p. 62), we calculate

where

Using arguments similar to those in the proofs of Theorems 2 and 3 in the Appendix, the
corresponding asymptotic result for the Kendall statistic follows from the fact that as $n \to \infty$,

$$\left|\operatorname{cov}(D_{p_1}^*, D_{p_2}^*) - \operatorname{cov}(S_{p_1}^*, S_{p_2}^*)\right| \to 0.$$
As noted by Hettmansperger & Norton (1987), the distribution of $\max BZ$ cannot be explicitly
determined although it can be easily simulated. The approximate p-value of the test is given
by
$$P(\max BZ \ge s_{\max}),$$
where $s_{\max}$ is the observed value of $S_{\max}$. In Table 2 below we indicate the asymptotic critical
values corresponding to different levels of significance and different values of $k$. The critical
values were very stable for $m \ge 3$. We report here the critical values for $m = 10$.

TABLE 2: Critical values.

k\%    1      1.5    2      2.5    3      3.5    4      4.5    5      10

3      2.67   2.54   2.44   2.36   2.29   2.23   2.18   2.14   2.09   1.80
4      2.77   2.63   2.53   2.45   2.39   2.33   2.28   2.23   2.19   1.89
5      2.82   2.67   2.59   2.51   2.44   2.38   2.33   2.29   2.24   1.95
6      2.86   2.73   2.63   2.55   2.48   2.42   2.37   2.32   2.28   1.98
7      2.89   2.74   2.65   2.57   2.50   2.44   2.39   2.24   2.30   2.00
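The simulation of $\max BZ$ can be sketched as follows. This is an illustration under assumptions that are ours, not the paper's: equal sample sizes, and each standardized statistic $S_p^*$ taken to be asymptotically a unit-norm contrast of independent standard normals with weights $i/p$ up to the peak and $(k+1-i)/(k+1-p)$ after it, so that the rows of $B$ are those normalized contrasts. Function names are hypothetical.

```python
import math
import random


def umbrella_weights(k, p):
    """Assumed equal-sample-size weights for a peak at population p (1-based)."""
    return [i / p if i <= p else (k + 1 - i) / (k + 1 - p) for i in range(1, k + 1)]


def unit_contrast(w):
    """Centre the weights and scale them to unit Euclidean norm."""
    mean = sum(w) / len(w)
    centred = [x - mean for x in w]
    norm = math.sqrt(sum(x * x for x in centred))
    return [x / norm for x in centred]


def simulate_max_quantile(k, level=0.05, reps=20000, seed=0):
    """Monte Carlo estimate of the upper `level` quantile of max BZ."""
    rng = random.Random(seed)
    B = [unit_contrast(umbrella_weights(k, p)) for p in range(1, k + 1)]
    maxima = []
    for _ in range(reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # Each row of B gives one standardized statistic; keep the largest.
        maxima.append(max(sum(b * x for b, x in zip(row, z)) for row in B))
    maxima.sort()
    return maxima[min(reps - 1, int((1 - level) * reps))]


if __name__ == "__main__":
    print(round(simulate_max_quantile(5), 2))
```

The estimate for $k = 5$ at the 5% level can be compared with the corresponding entry of Table 2. Any such value must lie between the one-sided normal critical value and the Bonferroni bound for five comparisons.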
Example. In order to illustrate the Spearman test, we consider an example in Mack & Wolfe
(1981) on the Wechsler adult intelligence scale scores on males by age groups. We reproduce the
data below.

                 Age Group
16-19    20-34    35-54    55-69    >70
 8.62     9.85     9.98     9.12    4.80
 9.94    10.43    10.69     9.89    9.18
10.06    11.31    11.40    10.57    9.27

Assuming that the location of the peak is unknown, we calculate that the test statistics due to
Spearman and to Kendall yield values of 2.316 (P-value ≈ 0.04) and 2.391 (P-value ≈ 0.035),
respectively. Here the approximate P-values are determined from Table 2, whereby we have used
a value of m = 10. More precise simulation results reveal that for m = 3, the critical values
corresponding to levels of 3.25% and 3% are 2.33 and 2.38 respectively. It can be seen that the
statistic based on the Kendall distance is slightly more sensitive for this data set.
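The first ingredients of both statistics, the pooled ranks and the pairwise sign sums $U(i_1, i_2)$, can be computed directly for these data. The sketch below is illustrative only and stops short of the standardized statistics, whose exact normalization is not reproduced here; the variable and function names are hypothetical.

```python
# Wechsler adult intelligence scale scores by age group (data from the example).
wais = {
    "16-19": [8.62, 9.94, 10.06],
    "20-34": [9.85, 10.43, 11.31],
    "35-54": [9.98, 10.69, 11.40],
    "55-69": [9.12, 9.89, 10.57],
    ">70": [4.80, 9.18, 9.27],
}


def pooled_ranks(groups):
    """Rank all observations together (no ties assumed), keeping the grouping."""
    pooled = sorted(x for g in groups.values() for x in g)
    rank_of = {x: r for r, x in enumerate(pooled, start=1)}
    return {name: [rank_of[x] for x in g] for name, g in groups.items()}


def sign(x):
    return (x > 0) - (x < 0)


def U(ranks_a, ranks_b):
    """Double sum of sgn(r_a - r_b) over all pairs from two populations."""
    return sum(sign(a - b) for a in ranks_a for b in ranks_b)


ranks = pooled_ranks(wais)
rank_sums = {name: sum(r) for name, r in ranks.items()}
mean_ranks = {name: s / len(ranks[name]) for name, s in rank_sums.items()}

if __name__ == "__main__":
    print(rank_sums)
    print(mean_ranks)
```

The rank sums are 20, 31, 37, 22 and 10, so the largest average rank occurs in the 35-54 group, consistent with an umbrella pattern peaking at the third population.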

4. SIMULATION STUDY
In this section, we report results on a limited simulation study only for the case when the lo-
cation of the peak is unknown. Four families of distributions were considered: normal, double
exponential, logistic and exponential. The normal has short to medium tails; the double
exponential has longer tails; the exponential is skewed. Let $k = 5$, $p = 3$ and let the distributions
$F_1, \ldots, F_5$ have location parameters equal to 0, 1/2, 1, 1/2, 0 respectively under the alternative
and variances all equal to 1. We considered only the cases where the sample sizes were equal.
For each value of the sample size, we used 10,000 repetitions. The simulations were done for
both the Spearman and Kendall statistics under the extremal set E.
First we note that the critical value of the test for k = 5 was 2.24. The Spearman and Kendall
statistics maintain the level of significance very close to the target 0.05 in repeated runs and for
various values of k and m as shown in Table 3. In Table 4 below, we display the simulated power
function for the tests based on Spearman and Kendall respectively as the sample size varies. The
true peak is located at p = 3.
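A simulation of this kind can be sketched as follows for normal data. The sketch rests on assumptions that are ours: the Spearman-type statistic for a peak at $p$ is taken to be the exactly standardized linear rank statistic with weights $i/p$ up to the peak and $(k+1-i)/(k+1-p)$ after it, sample sizes are equal, and the critical value 2.24 is the one quoted for $k = 5$. The function names are hypothetical and far fewer repetitions are used than in the study.

```python
import math
import random


def umbrella_weights(k, p):
    """Assumed equal-sample-size weights for a peak at population p (1-based)."""
    return [i / p if i <= p else (k + 1 - i) / (k + 1 - p) for i in range(1, k + 1)]


def max_spearman_statistic(samples):
    """Maximum over p of a standardized linear rank statistic (equal sample sizes)."""
    k = len(samples)
    m = len(samples[0])
    n = k * m
    pooled = sorted((x, g) for g, s in enumerate(samples) for x in s)
    rank_sums = [0.0] * k
    for rank, (_, g) in enumerate(pooled, start=1):
        rank_sums[g] += rank
    centred = [r - m * (n + 1) / 2 for r in rank_sums]
    best = -math.inf
    for p in range(1, k + 1):
        w = umbrella_weights(k, p)
        w_bar = sum(w) / k
        num = sum((wi - w_bar) * ci for wi, ci in zip(w, centred))
        # Exact null variance of a linear rank statistic with centred scores.
        var = n * (n + 1) / 12 * m * sum((wi - w_bar) ** 2 for wi in w)
        best = max(best, num / math.sqrt(var))
    return best


def rejection_rate(locations, m, crit, reps, rng):
    """Proportion of simulated normal data sets whose statistic exceeds crit."""
    hits = 0
    for _ in range(reps):
        samples = [[rng.gauss(mu, 1.0) for _ in range(m)] for mu in locations]
        hits += max_spearman_statistic(samples) > crit
    return hits / reps


if __name__ == "__main__":
    rng = random.Random(2008)
    print("level:", rejection_rate([0.0] * 5, 10, 2.24, 2000, rng))
    print("power:", rejection_rate([0.0, 0.5, 1.0, 0.5, 0.0], 10, 2.24, 2000, rng))
```

Under the null configuration the rejection rate estimates the significance level, and under the locations 0, 1/2, 1, 1/2, 0 it estimates the power; these can be set beside the normal columns of Tables 3 and 4.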

TABLE 3: Significance level for the case k = 5, p = 3. N: normal; D: double exponential; L: logistic;
E: exponential.

              Spearman                        Kendall
m      N      D      L      E         N      D      L      E

5      0.048  0.045  0.047  0.043     0.044  0.049  0.041  0.042
6      0.050  0.044  0.045  0.047     0.051  0.047  0.047  0.058
7      0.052  0.047  0.048  0.047     0.052  0.049  0.054  0.041
8      0.045  0.047  0.049  0.043     0.047  0.048  0.051  0.055
9      0.049  0.050  0.052  0.049     0.052  0.050  0.048  0.046
10     0.049  0.048  0.048  0.048     0.048  0.051  0.050  0.058
15     0.045  0.047  0.050  0.045     0.052  0.049  0.050  0.051

It can be seen that for sample sizes of 10 or less, the Kendall statistic performs somewhat bet-
ter than Spearman’s when the underlying distribution is either exponential or double exponential.
In other cases, the two statistics appear very comparable. Although not reported here, additional
simulations showed that both statistics performed well for various values of $k$, $m$ when the
location parameters under the alternative are not equally spaced. It was also noted that the power for
both statistics increases as the location of the true peak shifts to either end.

TABLE 4: Power function for the case k = 5, p = 3. N: normal; D: double exponential; L: logistic;
E: exponential.

              Spearman                        Kendall
m      N      D      L      E         N      D      L      E

5      0.344  0.448  0.386  0.558     0.352  0.459  0.383  0.602
6      0.408  0.539  0.463  0.656     0.421  0.555  0.462  0.700
7      0.478  0.619  0.519  0.744     0.481  0.627  0.536  0.763
8      0.533  0.677  0.578  0.800     0.539  0.691  0.592  0.823
9      0.589  0.737  0.635  0.844     0.598  0.740  0.640  0.867
10     0.648  0.778  0.696  0.883     0.649  0.793  0.696  0.900
15     0.833  0.929  0.870  0.979     0.831  0.932  0.877  0.981

5. CONCLUSION
An approach has been proposed for constructing nonparametric tests of hypotheses under um-
brella alternatives based on the notion of distance between permutations. The use of the Spearman
and Kendall distances leads to new test statistics when the location of the peak is known. This
is then used to derive test statistics for the case where the location of the peak is unknown. The
asymptotic properties of the statistics are derived and it is shown specifically that the statistics
are asymptotically equivalent as the minimum sample size gets large subject to a mild condition.
An application to intelligence scores illustrates the sensitivities of the test statistics. A limited
simulation study when the underlying distribution of the data follows a normal, logistic, double
exponential or exponential distribution shows that the test statistics have good power for moderate
sample sizes. The approach appears to have potential applications in the study of isotonic
regression whereby different orderings are specified under the alternative. It would be interesting
in future work to derive the test statistics corresponding to the Hamming and Spearman footrule
distance functions and to compare them with the ones obtained here. As well, the two group
problem considered by Pan & Wolfe (1996) would be of interest in light of the results obtained
here.

APPENDIX
Proof of Theorem 2. For given $i_1, i_2$, we will project $D_p$ onto the space of linear rank statistics.
From Hájek & Šidák (1967, Th. b, p. 59), the projection of $D_p$ onto the space of linear rank
statistics is given by
For i < p , interchanging the order of summation from (1 1)

Similarly, for i > p , from (12)

Hence

and

- ai-lI[i<pl- (A.3)
Consequently from (A.l), (A.2) and (A.3),

Since

the result follows.

Proof of Theorem 3. Define

21 iz

A direct calculation shows that $\sum_i m_i(u_{2i} - u_{1i}) = 0$. Moreover, from (A.2) and (A.3), it
follows that

Moreover, $2c(v_{ip} - \bar{v}) = (u_{2i} - u_{1i})$. Hence

i i

We now calculate var(Dp).Note that


whereas

Now, for $i_1 < i_2$, it follows from (A.4),

Then

as $n \to \infty$ and this completes the proof. □

ACKNOWLEDGEMENTS
The author would like to thank Vladislav Brion for some useful discussions and for his help in performing
the simulations in this article. The author is also grateful to two referees for their interest in this research
and for their very useful suggestions which helped to improve the presentation. Thanks as well to the Asso-
ciate Editor. This research was supported by a discovery grant from the Natural Sciences and Engineering
Research Council of Canada.

REFERENCES
M. Alvo & P. Cabilio (1995). Testing ordered alternatives in the presence of incomplete data. Journal of
the American Statistical Association, 90, 1015-1024.
M. Alvo & J. Pan (1997). A general theory of hypothesis testing based on rankings. Journal of Statistical
Planning and Inference, 61, 219-248.
R. E. Barlow, D. J. Bartholomew, J. M. Bremner & H. D. Brunk (1972). Statistical Inference under Order
Restrictions. Wiley, New York.
Y. I. Chen (1991). Notes on the Mack-Wolfe and Chen-Wolfe tests for umbrella alternatives. Biometrical
Journal, 33, 281-290.
Y. I. Chen & D. A. Wolfe (1990). A study of distribution-free tests for umbrella alternatives. Biometrical
Journal, 32, 47-57.
D. Critchlow (1992). On rank statistics: An approach via metrics on the permutation group. Journal of
Statistical Planning and Inference, 32, 325-346.
W. Feller (1968). An Introduction to Probability Theory and Its Applications, Volume I, 3rd edition. Wiley,
New York.
J. Hájek & Z. Šidák (1967). Theory of Rank Tests. Academic Press, New York.
T. P. Hettmansperger & R. M. Norton (1987). Tests for patterned alternatives in k-sample problems. Journal
of the American Statistical Association, 82, 292-299.
A. R. Jonckheere (1954). A distribution-free k-sample test against ordered alternatives. Biometrika, 41,
131-145.
W. Kossler (2006). Some c-sample rank tests of homogeneity against umbrella alternatives with unknown
peak. Journal of Statistical Computation and Simulation, 76, 57-74.
W. Kossler & H. Buning (2000). The asymptotic power and relative efficiency of some c-sample rank tests
of homogeneity against umbrella alternatives. Statistics, 34, 1-26.
G. A. Mack & D. A. Wolfe (1981). k-sample rank tests for umbrella alternatives. Journal of the American
Statistical Association, 76, 175-181.
B. A. Millen & D. A. Wolfe (2005). A class of nonparametric tests for umbrella alternatives. Journal of
Statistical Research, 39, 7-24.
E. B. Page (1963). Ordered hypothesis for multiple treatments: A significance test for linear ranks. Journal
of the American Statistical Association, 58, 216-230.
G. Pan (1996). Distribution-free tests for umbrella alternatives. Communication in Statistics, Theory and
Methods, 25, 3185-3194.
G. Pan & D. A. Wolfe (1996). Comparing groups with umbrella orderings. Journal of the American
Statistical Association, 91, 311-317.
T. Robertson, F. T. Wright & R. L. Dykstra (1988). Order Restricted Statistical Inference. Wiley, New York.
N. Z. Shi (1988). Rank test statistics for umbrella alternatives. Communication in Statistics, Theory and
Methods, 17, 2059-2073.
D. G. Simpson & B. H. Margolin (1986). Recursive nonparametric testing for dose-response relationships
subject to downturns at high doses. Biometrika, 73, 589-596.
T. J. Terpstra (1952). The asymptotic normality and consistency of Kendall’s test against trend, when ties
are present. Indagationes Mathematicae, 14, 327-333.

Received 14 December 2006
Accepted 24 September 2007

Mayer ALVO: malvo@uottawa.ca
Department of Mathematics and Statistics
University of Ottawa, Ottawa, Ontario
Canada K1N 6N5
