
Research Methods 2

Permutation test
for two independent samples

Source
Siegel, S. & Castellan, N. J. (1988). Nonparametric Statistics for the Behavioral
Sciences (2nd ed.). McGraw-Hill.

Research Methods II 1
Permutation test
for two independent samples

• A powerful nonparametric technique for testing the
difference between the means of two independent
samples.
• Useful when the sample sizes $m$ and $n$ are small.
• Uses numerical values and thus requires at least
interval-scale measurement.
• No special assumptions about the underlying
distributions.
• Determines the exact probability associated with the
observations under the assumption that $H_0$ is true.
Permutation test: Example
Consider the case of two independent samples.
Group $X$ includes five subjects; $m = 5$.
Group $Y$ includes four subjects; $n = 4$.

Scores for group $X$: 16 19 22 24 29
Scores for group $Y$: 0 11 12 20

With these scores we wish to test the null hypothesis of
no difference against the alternative that the mean of
the population from which group $X$ was drawn is higher.

Example
Under the null hypothesis, $H_0$, all nine
observations may be considered to come from the same
population.
If the null hypothesis is true, it is a matter of chance that
certain observations are labeled $X$ and others $Y$.
When $H_0$ is true, the labels could have been assigned to
the scores in any of $\binom{9}{5} = 126$ equally likely ways.

When $H_0$ is true, only once in 126 experiments would it
happen that the five largest of the $m + n = 9$ scores
would all acquire the label $X$.
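
The figure of 126 is the number of ways of choosing which five of the nine scores receive the label X. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and do not come from the source.

```python
from math import comb

m, n = 5, 4  # group sizes in the example: five X scores, four Y scores

# Number of distinct ways to give the label X to m of the m + n scores.
n_labelings = comb(m + n, m)

# Under H0 every labeling is equally likely, so the single most extreme
# labeling (the five largest scores all labeled X) has this probability.
p_most_extreme = 1 / n_labelings

print(n_labelings)               # 126
print(round(p_most_extreme, 3))  # 0.008
```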
Example
If such a result occurred in an experiment, we would
reject $H_0$ at the $p = 1/126 = .008$ level of significance.
The reasoning is that, if the two groups really were from the same
population, it is very unlikely that the most extreme of the 126
possible outcomes would occur in just the one experiment we conduct.

Because the likelihood that the observed event would occur
when $H_0$ is true is very small, we would reject $H_0$.

The rejection region
The permutation test uses the $\alpha \binom{m+n}{m}$ most extreme
outcomes to specify the rejection region.
If $\alpha$ is the significance level, then the region of rejection consists
of the $\alpha \binom{m+n}{m}$ most extreme among the $\binom{m+n}{m}$
possible occurrences.

These most extreme outcomes are those for which the
difference between the mean of the $X$'s and the mean of the $Y$'s
is the largest.
In the above example, if $\alpha = .05$, then the region of rejection
consists of the $.05 \times 126 = 6.3 \approx 6$ most extreme possible
outcomes in the specified direction.
Six most extreme outcomes in the predicted
direction ( )

Possible scores      Possible scores
for five X cases     for four Y cases     $\sum X - \sum Y$
29 24 22 20 19       16 12 11 0           114 - 39 = 75
29 24 22 20 16       19 12 11 0           111 - 42 = 69
29 24 22 19 16       20 12 11 0           110 - 43 = 67*
29 24 20 19 16       22 12 11 0           108 - 45 = 63
29 24 22 20 12       19 16 11 0           107 - 46 = 61
29 22 20 19 16       24 12 11 0           106 - 47 = 59
* The observed sample
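
The listing above can be checked by brute force. The sketch below enumerates all 126 labelings, orders them by the difference of sums (the quantity in the last column) and tests whether the observed difference falls among the six largest. It is an illustration under stated assumptions rather than the source's procedure: `alpha`, `k` and `critical` are names introduced here, and the rejection-region size is taken as the integer part of $\alpha$ times the number of labelings.

```python
from itertools import combinations
from math import comb, floor

x = [16, 19, 22, 24, 29]  # observed scores, group X
y = [0, 11, 12, 20]       # observed scores, group Y
alpha = 0.05              # one-tailed significance level

pooled = x + y
total = sum(pooled)
m = len(x)

# Sum(X) - Sum(Y) for every way of labeling m of the pooled scores as X,
# ordered from the largest difference to the smallest.
diffs = sorted(
    (2 * sum(c) - total for c in combinations(pooled, m)),
    reverse=True,
)

k = floor(alpha * comb(len(pooled), m))  # size of the rejection region
critical = diffs[k - 1]                  # smallest difference inside it
observed = sum(x) - sum(y)

print(k, diffs[:k])          # 6 [75, 69, 67, 63, 61, 59]
print(observed >= critical)  # True
```

Three labelings share the difference 59, so the sixth row is one of several with that value; this does not affect the position of the observed sample.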

Since our observed set of scores is in the region of rejection,
we may reject $H_0$ at the $\alpha = .05$ level.
The exact probability (one-tailed) of the occurrence of the
observed scores or a set more extreme when $H_0$ is true is
$p = 3/126 = .024$.
Six most extreme outcomes in either
direction ( )
If we had not predicted the direction of the difference, then a
two-tailed test of $H_0$ would have been appropriate.

Possible scores      Possible scores
for five X cases     for four Y cases     $|\sum X - \sum Y|$
29 24 22 20 19       16 12 11 0           114 - 39 = 75
29 24 22 20 16       19 12 11 0           111 - 42 = 69
29 24 22 19 16       20 12 11 0           110 - 43 = 67*

22 16 12 11 0        29 24 20 19          |61 - 92| = 31
20 16 12 11 0        29 24 22 19          |59 - 94| = 35
19 16 12 11 0        29 24 22 20          |58 - 95| = 37
* The observed sample
The exact probability (two-tailed) of the occurrence of the
observed scores or a set more extreme when $H_0$ is true is
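
Exact probabilities of this kind can be obtained by counting labelings. The sketch below is a hedged illustration, not the source's computation: the function name `permutation_p_values` is hypothetical, the one-tailed count assumes the predicted direction is "X larger", and the two-tailed count assumes "more extreme" means a larger $|\sum X - \sum Y|$, as in the table. Other two-tailed conventions, such as doubling the one-tailed value, would give a different number.

```python
from itertools import combinations


def permutation_p_values(x, y):
    """Exact p-values using Sum(X) - Sum(Y) as the test statistic.

    Returns (one_tailed, two_tailed). The one-tailed value counts
    labelings with a difference at least as large as the observed one;
    the two-tailed value counts labelings whose absolute difference is
    at least as large as the observed absolute difference.
    """
    pooled = list(x) + list(y)
    total = sum(pooled)
    observed = sum(x) - sum(y)
    diffs = [2 * sum(c) - total for c in combinations(pooled, len(x))]
    one_tailed = sum(d >= observed for d in diffs) / len(diffs)
    two_tailed = sum(abs(d) >= abs(observed) for d in diffs) / len(diffs)
    return one_tailed, two_tailed


p_one, p_two = permutation_p_values([16, 19, 22, 24, 29], [0, 11, 12, 20])
print(round(p_one, 3), round(p_two, 3))  # 0.024 0.024
```

Under this convention the two values coincide because no labeling in the opposite direction reaches an absolute difference of 67; the largest there is 37.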

The Wilcoxon – Mann Whitney test

The Wilcoxon-Mann Whitney test
for two independent samples

• One of the most powerful nonparametric tests.
• Used for testing whether two independent groups
have been drawn from the same population.
• At least ordinal measurement should have been
achieved for the variables being studied.
A very useful alternative to the parametric $t$ test when
• the researcher wishes to avoid the $t$ test's assumptions, or
• the measurement level is weaker than interval scaling.
Wilcoxon-Mann Whitney (W-MW) test
for two independent samples
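• Suppose that we have samples from two populations, $X$ and
$Y$. The null hypothesis is that $X$ and $Y$ have the same
distribution.
• For a one-tailed test, $H_0$ and $H_1$ can be stated in terms of
medians as follows:

$$H_0: \theta_x = \theta_y \qquad H_1: \theta_x > \theta_y$$

• Alternatively, if $X$ is one observation from population $X$, and
$Y$ is one observation from population $Y$, $H_0$ and $H_1$ can be
written as:

$$H_0: P[X > Y] = \tfrac{1}{2} \qquad H_1: P[X > Y] > \tfrac{1}{2}$$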

Small samples case
• To apply the W-MW test we must first combine the
observations in both groups and rank them in order of
increasing size.
• Algebraic size is used: the lowest ranks are assigned to the
largest negative values, if any.
For example, suppose we had an experimental group of three
cases and a control group of four cases ($m = 3$, $n = 4$).
Experimental scores ($X$): 9 11 15
Control scores ($Y$): 6 8 10 13
To find $W_x$, the sum of the ranks in group $X$, we first rank the scores.

Score:  6   8   9  10  11  13  15
Group:  Y   Y   X   Y   X   Y   X
Rank:   1   2   3   4   5   6   7

The ranks of the $X$ scores are 3, 5 and 7, so $W_x = 3 + 5 + 7 = 15$.
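
The ranking step can be sketched in a few lines of Python. This is a minimal illustration that assumes there are no tied scores, as in this example; the names `experimental`, `control` and `ranks` are introduced here for the sketch.

```python
experimental = [9, 11, 15]  # group X, m = 3
control = [6, 8, 10, 13]    # group Y, n = 4

# Pool the scores with their group labels and sort in increasing order.
pooled = sorted(
    [(s, "X") for s in experimental] + [(s, "Y") for s in control]
)

# With no tied scores, the rank is simply the 1-based position.
ranks = {"X": [], "Y": []}
for rank, (score, group) in enumerate(pooled, start=1):
    ranks[group].append(rank)

w_x = sum(ranks["X"])
w_y = sum(ranks["Y"])
print(ranks["X"], w_x)  # [3, 5, 7] 15
print(ranks["Y"], w_y)  # [1, 2, 4, 6] 13
```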
Small samples case
• When $m$ and $n$ are less than or equal to 10, the W-MW rank
sum statistics table can be used.
• This table can be used to determine the exact
probability associated with the occurrence, when $H_0$ is
true, of any $W_x$ as extreme as an observed value of $W_x$.
• To determine the probability under $H_0$ associated with
the data, the researcher needs to know $m$ (the size of
the smaller group), $n$ (the size of the larger group) and
$W_x$.
• In our example, $m = 3$, $n = 4$ and $W_x = 15$. The W-MW
table shows that for $m = 3$ and $n = 4$, the probability of observing
a value of $W_x \ge 15$ when $H_0$ is true is $.20$.
*For details see Siegel and Castellan (1988), pp. 130-132, 339-346.
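
For groups this small, the tabled probability can be reproduced by direct enumeration, since under $H_0$ every set of $m$ ranks is equally likely to belong to group X. The sketch below assumes the upper-tail probability $P(W_x \ge 15)$ is the one wanted; it is an illustration, not a replacement for the published table.

```python
from fractions import Fraction
from itertools import combinations

m, n, w_x = 3, 4, 15
all_ranks = range(1, m + n + 1)

# Rank sum of group X for every possible choice of m ranks out of m + n.
rank_sums = [sum(c) for c in combinations(all_ranks, m)]

# Share of those choices whose rank sum is at least the observed W_x.
p_upper = Fraction(sum(s >= w_x for s in rank_sums), len(rank_sums))

print(len(rank_sums), p_upper, float(p_upper))  # 35 1/5 0.2
```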
Large samples case
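• When $m > 10$ or $n > 10$ ( if ), the statistic
below can be used ($N = m + n$):

$$z = \frac{W_x \pm 0.5 - m(N+1)/2}{\sqrt{mn(N+1)/12}}$$

• The value $0.5$ is added if we wish to find probabilities in the left
tail, and subtracted if we wish to find probabilities in the right
tail.
• To capture the effects of tied ranks, the statistic can be
corrected as follows:

$$z = \frac{W_x \pm 0.5 - m(N+1)/2}{\sqrt{\dfrac{mn}{N(N-1)}\left[\dfrac{N^3 - N}{12} - \sum_{j=1}^{g}\dfrac{t_j^3 - t_j}{12}\right]}}$$

where $N = m + n$, $g$ is the number of groupings of different tied
ranks, and $t_j$ is the number of tied ranks in the $j$th grouping.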
Large samples example – Sleuth Case 1.1
Score  Motive  Rank      Score  Motive  Rank      Score  Motive  Rank
 5.0   Ex       1        17.2   Ex      17 t      20.5   Int     32
 5.4   Ex       2        17.2   Ex      17 t      20.6   Int     33
 6.1   Ex       3        17.2   Int     17 t      20.7   Ex      34
10.9   Ex       4        17.4   Ex      19        21.2   Ex      35
11.8   Ex       5        17.5   Ex      20.5 t    21.3   Int     36
12.0   Ex       7 t      17.5   Int     20.5 t    21.6   Int     37
12.0   Int      7 t      18.2   Int     22        22.1   Ex      38.5 t
12.0   Int      7 t      18.5   Ex      23        22.1   Int     38.5 t
12.3   Ex       9        18.7   Ex      24.5 t    22.2   Int     40
12.9   Int     10        18.7   Ex      24.5 t    22.6   Int     41
13.6   Int     11        19.1   Int     26        23.1   Int     42
14.8   Ex      12        19.2   Ex      27        24.0   Ex      43.5 t
15.0   Ex      13        19.3   Int     28        24.0   Int     43.5 t
16.6   Int     14        19.5   Ex      29        24.3   Int     45
16.8   Ex      15        19.8   Int     30        26.7   Int     46
                         20.3   Int     31        29.7   Int     47
t = tied score, assigned the average of the tied ranks
$W_x = 423.5$ (Ex)
$W_y = 704.5$ (Int)
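
The tied ranks and the two rank sums in the table can be reproduced from the raw scores. The sketch below is a minimal illustration: the lists `extrinsic` and `intrinsic` are transcribed from the table, and the helper names are introduced here rather than taken from the source. Each tied score receives the average of the positions it occupies.

```python
from collections import Counter

extrinsic = [5, 5.4, 6.1, 10.9, 11.8, 12, 12.3, 14.8, 15, 16.8, 17.2, 17.2,
             17.4, 17.5, 18.5, 18.7, 18.7, 19.2, 19.5, 20.7, 21.2, 22.1, 24]
intrinsic = [12, 12, 12.9, 13.6, 16.6, 17.2, 17.5, 18.2, 19.1, 19.3, 19.8,
             20.3, 20.5, 20.6, 21.3, 21.6, 22.1, 22.2, 22.6, 23.1, 24, 24.3,
             26.7, 29.7]

# How many times each distinct score occurs in the pooled sample.
counts = Counter(extrinsic + intrinsic)

# Midrank of a score = average of the 1-based positions it occupies.
midrank = {}
position = 1
for score in sorted(counts):
    t = counts[score]
    midrank[score] = position + (t - 1) / 2
    position += t

w_x = sum(midrank[s] for s in extrinsic)
w_y = sum(midrank[s] for s in intrinsic)

# Sizes of the groups of tied scores, needed for the tie correction.
tie_sizes = sorted((t for t in counts.values() if t > 1), reverse=True)

print(w_x, w_y)   # 423.5 704.5
print(tie_sizes)  # [3, 3, 2, 2, 2, 2]
```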
Large samples case
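• $m = 23$, $n = 24$ and $N = m + n = 47$;
$g = 6$ (the number of groupings of different tied
ranks), with $t_j = 3, 3, 2, 2, 2, 2$.

$$z = \frac{423.5 + 0.5 - 23(48)/2}{\sqrt{\dfrac{(23)(24)}{(47)(46)}\left[\dfrac{47^3 - 47}{12} - 6\right]}} = \frac{-128}{46.973} \approx -2.725$$

One-tailed $p \approx .003$, two-tailed $p \approx .006$. Reject $H_0$.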
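
The large-sample statistic with the tie correction can be evaluated from the summary values alone. The sketch below is an illustration under assumptions: the function name `wmw_z` is hypothetical, the continuity correction is chosen automatically from the side of the mean on which $W_x$ falls, and the normal tail area is computed with `math.erfc` rather than read from a table.

```python
from math import erfc, sqrt


def wmw_z(w_x, m, n, tie_sizes=()):
    """Normal approximation to W_x with continuity and tie corrections."""
    N = m + n
    mean = m * (N + 1) / 2
    tie_term = sum(t**3 - t for t in tie_sizes) / 12
    variance = m * n / (N * (N - 1)) * ((N**3 - N) / 12 - tie_term)
    # Add 0.5 when W_x lies below its mean (left tail), subtract otherwise.
    correction = 0.5 if w_x < mean else -0.5
    return (w_x + correction - mean) / sqrt(variance)


z = wmw_z(423.5, m=23, n=24, tie_sizes=[3, 3, 2, 2, 2, 2])
p_one = 0.5 * erfc(abs(z) / sqrt(2))  # normal tail area beyond |z|
p_two = 2 * p_one

print(round(z, 3), round(p_one, 3), round(p_two, 3))  # -2.725 0.003 0.006
```

With an empty `tie_sizes` the variance reduces to $mn(N+1)/12$, which is the uncorrected form of the statistic.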

