
# 5.4 Ansari-Bradley Test

Assumptions: Given two independent random samples $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$:

- The measurement scale is at least ordinal, so that the relative magnitude of each element in the combined sample can be determined.
- The variable of interest is continuous.
- The location-adjusted random variables $X$ and $Y$ are identically distributed except possibly for the respective values of some scale parameter ($\sigma_x$ and $\sigma_y$).
- Either both medians are known, or the medians are unknown but equal. If neither of these is a reasonable assumption, it is suggested to estimate the two population medians by the sample medians.

Hypotheses:

- (A) Two-sided: $H_0: \sigma_x = \sigma_y$ vs $H_1: \sigma_x \neq \sigma_y$.
- (B) One-sided: $H_0: \sigma_x = \sigma_y$ vs $H_1: \sigma_x > \sigma_y$.
- (C) One-sided: $H_0: \sigma_x = \sigma_y$ vs $H_1: \sigma_x < \sigma_y$.

Rationale:

- Combine the two location-adjusted samples and list the values in increasing order. Assume $n_1 \le n_2$, and let $n = n_1 + n_2$.
- Assign score $i$ to the $i$th and $(n + 1 - i)$th ordered values. In case of ties, assign average scores. Thus, scores increase from both ends towards the center of the ordered sample.
- If the distributions of $X$ and $Y$ differ only in scale, consider the following:
  - If $\sigma_x < \sigma_y$, then the $X$s will tend to get larger scores than the $Y$s. We expect the mean of the $X$ scores > the mean of the $Y$ scores.
  - If $\sigma_x > \sigma_y$, then the $X$s will tend to get smaller scores than the $Y$s. We expect the mean of the $X$ scores < the mean of the $Y$ scores.
  - If $\sigma_x = \sigma_y$, then the average score of the $X$s will tend to be close to the average score of the $Y$s.

Method: For a given $\alpha$:

- Assign the Ansari-Bradley scores to the combined location-adjusted data set.
- Let $W$ = sum of the $X$ scores (the Ansari-Bradley test statistic).
- Under $H_0$, the sampling distribution of $W$ is the randomization distribution obtained by calculating $W$ over all possible assignments of Ansari-Bradley scores to samples of sizes $n_1$ and $n_2$.
- Use the attached table for the upper-tail probabilities of the distribution of $W$ for $n_1 + n_2 \le 20$, or use the normal approximations for large samples.
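The scoring scheme and the statistic $W$ described above can be sketched in base R. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names `ab_scores` and `ab_statistic` are hypothetical, the location adjustment uses the sample medians (the fallback suggested in the assumptions), and the data are the `x` and `y` values from the R listing in Section 5.5.

```r
# Ansari-Bradley scores for a combined sample z (hypothetical helper).
# The i-th value from either end of the ordered sample gets score i;
# tied values share the average of the scores they span.
ab_scores <- function(z) {
  N <- length(z)
  r <- rank(z, ties.method = "first")      # positions 1..N in the ordered sample
  s <- as.numeric(pmin(r, N + 1 - r))      # count in from the nearer end
  ave(s, z)                                # average the scores within tie groups
}

# W = sum of the scores attached to the X sample (hypothetical helper).
ab_statistic <- function(x, y) {
  xa <- x - median(x)                      # assumed: adjust by sample medians
  ya <- y - median(y)
  s <- ab_scores(c(xa, ya))
  sum(s[seq_along(xa)])                    # the first length(x) scores belong to X
}

ab_scores(1:10)                            # scores rise from both ends to the center

x <- c(48.4, 81.0, 23.6, 62.1, 39.1, 25.5, 44.2, 43.3, 39.8, 61.3)
y <- c(0.0, 4.5, 5.6, 6.1, 22.6, 30.8, 13.4, 1.3, 45.0, 30.3)
W <- ab_statistic(x, y)
W
```

Base R also ships `ansari.test()` in the stats package; it works on the samples as supplied, so any location adjustment would have to be applied before calling it.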

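To find $w_U$, the upper critical value of $W$ for tail probability $c$, find the smallest $x$ in the table such that $P(W \ge x) \le c$.

To find $w_L$, the lower critical value of $W$ for tail probability $c$, first find the largest $x$ in the table such that $P(W \ge x) \ge 1 - c$. Then $w_L = x - 1$, so that $P(W \le w_L) = 1 - P(W \ge x) \le c$.

Decision Rule:

- (A) For the two-sided test $H_1: \sigma_x \neq \sigma_y$, use $c = \alpha/2$ to find $w_U$ and $w_L$. If $W \ge w_U$ or $W \le w_L$, Reject $H_0$. Otherwise, Fail to Reject $H_0$.
- (B) For the one-sided test $H_1: \sigma_x > \sigma_y$, use $c = \alpha$ to find $w_L$. If $W \le w_L$, Reject $H_0$. Otherwise, Fail to Reject $H_0$.
- (C) For the one-sided test $H_1: \sigma_x < \sigma_y$, use $c = \alpha$ to find $w_U$. If $W \ge w_U$, Reject $H_0$. Otherwise, Fail to Reject $H_0$.

Or, use the normal approximation for larger samples.

## Normal Approximations for Large Samples

In the formulas below, $T = W$ is the observed Ansari-Bradley statistic.

If $n_1 + n_2$ is even, calculate

$$T^* = \frac{T - n_1(n_1 + n_2 + 2)/4}{\sqrt{n_1 n_2 (n_1 + n_2 + 2)(n_1 + n_2 - 2)/[48(n_1 + n_2 - 1)]}}$$

If $n_1 + n_2$ is odd, calculate

$$T^* = \frac{T - n_1(n_1 + n_2 + 1)^2/[4(n_1 + n_2)]}{\sqrt{n_1 n_2 (n_1 + n_2 + 1)[3 + (n_1 + n_2)^2]/[48(n_1 + n_2)^2]}}$$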
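The two standardizations can be written as a single R function that switches on the parity of $n_1 + n_2$. This is a sketch: the function name `ab_normal_approx` is hypothetical, and the example values of `W` are chosen only to exercise the formulas.

```r
# Standardized Ansari-Bradley statistic T* (hypothetical helper name).
# W is the sum of the X scores; n1 and n2 are the two sample sizes.
ab_normal_approx <- function(W, n1, n2) {
  N <- n1 + n2
  if (N %% 2 == 0) {
    mu <- n1 * (N + 2) / 4                               # null mean, N even
    v  <- n1 * n2 * (N + 2) * (N - 2) / (48 * (N - 1))   # null variance, N even
  } else {
    mu <- n1 * (N + 1)^2 / (4 * N)                       # null mean, N odd
    v  <- n1 * n2 * (N + 1) * (3 + N^2) / (48 * N^2)     # null variance, N odd
  }
  (W - mu) / sqrt(v)
}

# With n1 = 3 and n2 = 7 the null mean is 3 * 12 / 4 = 9, so W = 9 gives T* = 0
ab_normal_approx(9, 3, 7)

# Approximate upper-tail p-value for an illustrative W = 13
pnorm(ab_normal_approx(13, 3, 7), lower.tail = FALSE)
```

With only ten observations the approximation is rough; for such small samples the exact table is the intended tool.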

Compare $T^*$ to the standard normal distribution to determine p-values for the test.

## An Example of the Distribution of the Ansari-Bradley Test Statistic ($n_1 = 3$, $n_2 = 7$)
ANSARI-BRADLEY TEST DISTRIBUTION FOR W WITH N1=3 N2=7

Each of the 120 rows is one assignment of three of the ten scores to the X sample (N1 = 3 and N2 = 7 in every row). Ranks lists the three scores and W is their sum.

| Obs | Ranks | W | Obs | Ranks | W | Obs | Ranks | W | Obs | Ranks | W |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 2 1 | 4 | 31 | 4 3 1 | 8 | 61 | 2 5 2 | 9 | 91 | 5 4 2 | 11 |
| 2 | 1 2 1 | 4 | 32 | 5 2 1 | 8 | 62 | 2 5 2 | 9 | 92 | 5 4 2 | 11 |
| 3 | 2 2 1 | 5 | 33 | 5 2 1 | 8 | 63 | 2 4 3 | 9 | 93 | 5 5 1 | 11 |
| 4 | 1 3 1 | 5 | 34 | 4 3 1 | 8 | 64 | 2 3 4 | 9 | 94 | 4 4 3 | 11 |
| 5 | 1 3 1 | 5 | 35 | 3 3 2 | 8 | 65 | 2 3 4 | 9 | 95 | 4 5 2 | 11 |
| 6 | 1 2 2 | 5 | 36 | 3 4 1 | 8 | 66 | 1 5 3 | 9 | 96 | 4 5 2 | 11 |
| 7 | 3 2 1 | 6 | 37 | 3 4 1 | 8 | 67 | 1 5 3 | 9 | 97 | 3 5 3 | 11 |
| 8 | 3 2 1 | 6 | 38 | 2 4 2 | 8 | 68 | 1 4 4 | 9 | 98 | 3 5 3 | 11 |
| 9 | 2 3 1 | 6 | 39 | 2 5 1 | 8 | 69 | 1 3 5 | 9 | 99 | 3 4 4 | 11 |
| 10 | 2 3 1 | 6 | 40 | 2 5 1 | 8 | 70 | 1 3 5 | 9 | 100 | 2 5 4 | 11 |
| 11 | 1 3 2 | 6 | 41 | 2 4 2 | 8 | 71 | 5 3 2 | 10 | 101 | 2 5 4 | 11 |
| 12 | 1 4 1 | 6 | 42 | 2 3 3 | 8 | 72 | 5 4 1 | 10 | 102 | 2 4 5 | 11 |
| 13 | 1 4 1 | 6 | 43 | 1 4 3 | 8 | 73 | 5 3 2 | 10 | 103 | 2 4 5 | 11 |
| 14 | 1 3 2 | 6 | 44 | 1 5 2 | 8 | 74 | 5 4 1 | 10 | 104 | 1 5 5 | 11 |
| 15 | 1 2 3 | 6 | 45 | 1 5 2 | 8 | 75 | 4 4 2 | 10 | 105 | 5 4 3 | 12 |
| 16 | 1 2 3 | 6 | 46 | 1 4 3 | 8 | 76 | 4 5 1 | 10 | 106 | 5 4 3 | 12 |
| 17 | 4 2 1 | 7 | 47 | 1 3 4 | 8 | 77 | 4 5 1 | 10 | 107 | 5 5 2 | 12 |
| 18 | 4 2 1 | 7 | 48 | 1 3 4 | 8 | 78 | 3 4 3 | 10 | 108 | 4 5 3 | 12 |
| 19 | 3 3 1 | 7 | 49 | 1 2 5 | 8 | 79 | 3 5 2 | 10 | 109 | 4 5 3 | 12 |
| 20 | 2 3 2 | 7 | 50 | 1 2 5 | 8 | 80 | 3 5 2 | 10 | 110 | 3 5 4 | 12 |
| 21 | 2 4 1 | 7 | 51 | 4 3 2 | 9 | 81 | 3 4 3 | 10 | 111 | 3 5 4 | 12 |
| 22 | 2 4 1 | 7 | 52 | 5 3 1 | 9 | 82 | 2 5 3 | 10 | 112 | 3 4 5 | 12 |
| 23 | 2 3 2 | 7 | 53 | 5 3 1 | 9 | 83 | 2 5 3 | 10 | 113 | 3 4 5 | 12 |
| 24 | 1 4 2 | 7 | 54 | 4 3 2 | 9 | 84 | 2 4 4 | 10 | 114 | 2 5 5 | 12 |
| 25 | 1 5 1 | 7 | 55 | 4 4 1 | 9 | 85 | 2 3 5 | 10 | 115 | 5 5 3 | 13 |
| 26 | 1 5 1 | 7 | 56 | 3 4 2 | 9 | 86 | 2 3 5 | 10 | 116 | 4 5 4 | 13 |
| 27 | 1 4 2 | 7 | 57 | 3 5 1 | 9 | 87 | 1 5 4 | 10 | 117 | 4 5 4 | 13 |
| 28 | 1 3 3 | 7 | 58 | 3 5 1 | 9 | 88 | 1 5 4 | 10 | 118 | 3 5 5 | 13 |
| 29 | 1 2 4 | 7 | 59 | 3 4 2 | 9 | 89 | 1 4 5 | 10 | 119 | 5 5 4 | 14 |
| 30 | 1 2 4 | 7 | 60 | 2 4 3 | 9 | 90 | 1 4 5 | 10 | 120 | 4 5 5 | 14 |

DISTRIBUTION FOR W WITH N1=3 N2=7

LEFT-TAIL PROBABILITIES $P(W \le w)$

| w | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| 4 | 2 | 1.67 | 2 | 1.67 |
| 5 | 4 | 3.33 | 6 | 5.00 |
| 6 | 10 | 8.33 | 16 | 13.33 |
| 7 | 14 | 11.67 | 30 | 25.00 |
| 8 | 20 | 16.67 | 50 | 41.67 |
| 9 | 20 | 16.67 | 70 | 58.33 |
| 10 | 20 | 16.67 | 90 | 75.00 |
| 11 | 14 | 11.67 | 104 | 86.67 |
| 12 | 10 | 8.33 | 114 | 95.00 |
| 13 | 4 | 3.33 | 118 | 98.33 |
| 14 | 2 | 1.67 | 120 | 100.00 |

RIGHT-TAIL PROBABILITIES $P(W \ge w)$

| w | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| 14 | 2 | 1.67 | 2 | 1.67 |
| 13 | 4 | 3.33 | 6 | 5.00 |
| 12 | 10 | 8.33 | 16 | 13.33 |
| 11 | 14 | 11.67 | 30 | 25.00 |
| 10 | 20 | 16.67 | 50 | 41.67 |
| 9 | 20 | 16.67 | 70 | 58.33 |
| 8 | 20 | 16.67 | 90 | 75.00 |
| 7 | 14 | 11.67 | 104 | 86.67 |
| 6 | 10 | 8.33 | 114 | 95.00 |
| 5 | 4 | 3.33 | 118 | 98.33 |
| 4 | 2 | 1.67 | 120 | 100.00 |
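The enumeration behind these tables, and the critical-value lookup rule from the Decision Rule, can be reproduced in a few lines of base R. This is a sketch under stated assumptions: it assumes no ties, so the scores are simply 1, 2, ..., 5, 5, ..., 2, 1, and the helper `crit_values` and its tolerance argument are hypothetical names introduced here.

```r
n1 <- 3; n2 <- 7
N <- n1 + n2
scores <- pmin(1:N, N:1)                   # 1 2 3 4 5 5 4 3 2 1

# W for every way of giving n1 of the N scores to the X sample
W_all <- combn(scores, n1, FUN = sum)      # choose(10, 3) = 120 values
freq <- table(W_all)
freq                                       # frequencies of W = 4, ..., 14

# Upper-tail probabilities P(W >= w), one per distinct value of W
w_vals <- as.integer(names(freq))
upper <- rev(cumsum(rev(as.vector(freq)))) / length(W_all)
names(upper) <- w_vals

# Critical values for tail probability p, following the lookup rule in the text.
# The small tolerance guards the comparisons against floating-point rounding.
crit_values <- function(upper, w_vals, p, tol = 1e-12) {
  wU <- min(w_vals[upper <= p + tol])      # smallest x with P(W >= x) <= p
  x  <- max(w_vals[upper >= 1 - p - tol])  # largest x with P(W >= x) >= 1 - p
  c(wL = x - 1, wU = wU)
}
crit_values(upper, w_vals, p = 0.05)
```

For `p = 0.05` the lookup returns $w_L = 5$ and $w_U = 13$, in agreement with the tables: $P(W \le 5) = P(W \ge 13) = 6/120 = 0.05$. If no tabulated value meets a condition, `min()` or `max()` is applied to an empty vector and R issues a warning, which signals that no critical value exists at that level.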


# 5.5 Permutation Test on Deviances

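Assumptions: The same as for the Siegel-Tukey and Ansari-Bradley tests.

Hypotheses:

- (A) Two-sided: $H_0: \sigma_x = \sigma_y$ vs $H_1: \sigma_x \neq \sigma_y$.
- (B) One-sided: $H_0: \sigma_x = \sigma_y$ vs $H_1: \sigma_x > \sigma_y$.
- (C) One-sided: $H_0: \sigma_x = \sigma_y$ vs $H_1: \sigma_x < \sigma_y$.

Rationale: The deviation of $x$ from its median is $\mathrm{dev}(x) = x - m_x$, and the deviation of $y$ from its median is $\mathrm{dev}(y) = y - m_y$. If $\sigma_x = \sigma_y$, then the average of the absolute deviations of the observed $x$ values about the median of $X$ should be close to the average of the absolute deviations of the observed $y$ values about the median of $Y$. That is:

$$\frac{1}{m}\sum_{i=1}^{m} |\mathrm{dev}(x_i)| \approx \frac{1}{n}\sum_{j=1}^{n} |\mathrm{dev}(y_j)|$$

Or, expressed as a ratio:

$$\frac{\frac{1}{m}\sum_{i=1}^{m} |\mathrm{dev}(x_i)|}{\frac{1}{n}\sum_{j=1}^{n} |\mathrm{dev}(y_j)|} \approx 1 \quad\text{or}\quad \frac{\frac{1}{m}\sum_{i=1}^{m} |x_i - m_x|}{\frac{1}{n}\sum_{j=1}^{n} |y_j - m_y|} \approx 1$$

However, writing $\sigma_1 = \sigma_x$ and $\sigma_2 = \sigma_y$: if $\sigma_1 > \sigma_2$, then $\sigma_1/\sigma_2 > 1$, or, if $\sigma_1 < \sigma_2$, then $\sigma_2/\sigma_1 > 1$.

Under $H_0: \sigma_1 = \sigma_2$, the group labels (say, A and B) associated with $X$ and $Y$ are arbitrary. That is, any assignment of $m$ As and $n$ Bs to the set of $m + n$ deviances is equally likely. Thus, we randomly permute the labels assigned to the deviances and calculate the ratio

$$RMD = \frac{\frac{1}{m}\sum_{i=1}^{m} |\mathrm{dev}(x_i)|}{\frac{1}{n}\sum_{j=1}^{n} |\mathrm{dev}(y_j)|} = \frac{\frac{1}{m}\sum_{i=1}^{m} |x_i - m_x|}{\frac{1}{n}\sum_{j=1}^{n} |y_j - m_y|}$$

(or its reciprocal) for each permutation. Then we compare the $RMD$ value calculated from the original data to the permutation distribution of $RMD$ values to determine the p-value.

Method: For a given $\alpha$:

- Calculate $RMD$ for the observed data. Call this $RMD_{obs}$.
- Use the Monte Carlo method to generate a large number of permutations that assign $m$ As and $n$ Bs to the combined set of $m + n$ deviances.
- For $H_1: \sigma_x > \sigma_y$, calculate for each permutation
$$RMD = \frac{\frac{1}{m}\sum_{i=1}^{m} |x_i - m_x|}{\frac{1}{n}\sum_{j=1}^{n} |y_j - m_y|}$$
- For $H_1: \sigma_x < \sigma_y$, calculate for each permutation
$$RMD = \frac{\frac{1}{n}\sum_{j=1}^{n} |y_j - m_y|}{\frac{1}{m}\sum_{i=1}^{m} |x_i - m_x|}$$
- For $H_1: \sigma_x \neq \sigma_y$, calculate for each permutation
$$RMD = \frac{\max(RMD_A, RMD_B)}{\min(RMD_A, RMD_B)}$$
where $RMD_A$ and $RMD_B$ are the values of $RMD$ from the two one-sided alternatives.
- Let $RMD_i$ be the value of $RMD$ for the $i$th permutation.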

- Find the proportion of permutations for which $RMD_i \ge RMD_{obs}$. This is the p-value for the permutation test.

Decision Rule: Reject $H_0$ if the p-value $\le \alpha$.

## R output for Permutation Test on Deviances

    > dev
         [,1]  [,2]  [,3]  [,4]  [,5]  [,6] [,7] [,8]  [,9] [,10]
    [1,] 4.65 37.25 20.15 18.35  4.65 18.25 0.45 0.45  3.95 17.55
    [2,] 9.75  5.25  4.15  3.65 12.85 21.05 3.65 8.45 35.25 20.55
    > devsum
    [1] 250.3
    > adevsum
    [1] 125.7
    > bdevsum
    [1] 124.6
    > trueRMD
    [1] 1.008828
    > tailA
    [1] 0.49342    # probability for H1: sigma1 > sigma2
    > tailB
    [1] 0.49022    # probability for H1: sigma1 < sigma2
    > tail2
    [1] 0.9897     # probability for H1: sigma1 not= sigma2

## R code for Permutation Test on Deviances

    # Monte Carlo Approach to Two-Sample Permutation Test on Deviances
    x <- c(48.4, 81.0, 23.6, 62.1, 39.1, 25.5, 44.2, 43.3, 39.8, 61.3)
    y <- c(0.0, 4.5, 5.6, 6.1, 22.6, 30.8, 13.4, 1.3, 45.0, 30.3)
    m <- length(x)
    n <- length(y)

    # Absolute deviations of each sample about its own median
    xdev <- abs(x - median(x))
    ydev <- abs(y - median(y))
    dev <- rbind(t(xdev), t(ydev))    # 2 x 10 matrix; relies on m == n here
    dev

    devsum <- sum(dev)
    adevsum <- sum(xdev)
    bdevsum <- devsum - adevsum
    devsum
    adevsum
    bdevsum

    # Observed ratio of mean absolute deviations
    trueRMD <- (adevsum/m) / (bdevsum/n)
    trueRMD

    iter <- 50000
    RMDAlist <- numeric(iter)
    RMDBlist <- numeric(iter)
    RMD2list <- numeric(iter)
    for (i in 1:iter) {
      s <- sample(dev, m)             # select a sample of size m (the "A" group)
      pasum <- sum(s)
      pbsum <- sum(dev) - sum(s)
      RMD1 <- (pasum/m) / (pbsum/n)
      RMD2 <- 1/RMD1
      RMDAlist[i] <- RMD1             # add permutation ratio to list
      RMDBlist[i] <- RMD2
      RMD2list[i] <- max(RMD1, RMD2)/min(RMD1, RMD2)
    }
    RMDAlist <- sort(RMDAlist)
    RMDBlist <- sort(RMDBlist)
    RMD2list <- sort(RMD2list)
    #RMD2list

    # Selected quantiles of the three permutation distributions
    quantileA <- quantile(RMDAlist, probs = c(.005, .01, .025, .05, .95, .975, .99, .995))
    quantileB <- quantile(RMDBlist, probs = c(.005, .01, .025, .05, .95, .975, .99, .995))
    quantile2 <- quantile(RMD2list, probs = c(.005, .01, .025, .05, .95, .975, .99, .995))

    # Tail proportions: each list is compared with trueRMD.
    # Note that, under the Method above, the observed value for
    # H1: sigma1 < sigma2 is 1/trueRMD, and for the two-sided test it is
    # max(trueRMD, 1/trueRMD) / min(trueRMD, 1/trueRMD).
    ntailA <- sum(RMDAlist >= trueRMD)
    tailA <- ntailA/iter
    tailA                             # probability for H1: sigma1 > sigma2
    ntailB <- sum(RMDBlist >= trueRMD)
    tailB <- ntailB/iter
    tailB                             # probability for H1: sigma1 < sigma2
    ntail2 <- sum(RMD2list >= trueRMD)
    tail2 <- ntail2/iter
    tail2                             # probability for H1: sigma1 not= sigma2

    plot(ecdf(RMD2list))

[Figure: plot of ecdf(RMD2list), the empirical CDF of the two-sided permutation RMD values. Vertical axis: Fn(x), from 0.0 to 1.0. Horizontal axis: x.]
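The Monte Carlo procedure can also be wrapped in a reusable function. The sketch below is one possible arrangement, not the listing's own method: the name `rmd_perm_test`, the default of 10000 iterations, and the fixed seed are all assumptions made for illustration. It follows the Method text in comparing each alternative with its own observed ratio ($RMD_{obs}$, its reciprocal, and the max/min ratio of the two).

```r
# Monte Carlo permutation test on deviances (hypothetical helper name).
# Returns the observed ratios and permutation p-values for the three alternatives.
rmd_perm_test <- function(x, y, iter = 10000, seed = 1) {
  set.seed(seed)                           # assumed seed, for reproducibility only
  m <- length(x); n <- length(y)
  dev <- c(abs(x - median(x)), abs(y - median(y)))   # combined deviances
  total <- sum(dev)

  # Ratio of mean absolute deviations when the A group's deviances sum to a_sum
  rmd <- function(a_sum) (a_sum / m) / ((total - a_sum) / n)

  obs_A <- rmd(sum(dev[1:m]))              # statistic for H1: sigma_x > sigma_y
  obs_B <- 1 / obs_A                       # statistic for H1: sigma_x < sigma_y
  obs_2 <- max(obs_A, obs_B) / min(obs_A, obs_B)     # two-sided statistic

  # Draw m deviances at random (without replacement) for the A group
  perm_A <- replicate(iter, rmd(sum(sample(dev, m))))
  perm_B <- 1 / perm_A
  perm_2 <- pmax(perm_A, perm_B) / pmin(perm_A, perm_B)

  list(obs = c(A = obs_A, B = obs_B, two_sided = obs_2),
       p   = c(A = mean(perm_A >= obs_A),
               B = mean(perm_B >= obs_B),
               two_sided = mean(perm_2 >= obs_2)))
}

x <- c(48.4, 81.0, 23.6, 62.1, 39.1, 25.5, 44.2, 43.3, 39.8, 61.3)
y <- c(0.0, 4.5, 5.6, 6.1, 22.6, 30.8, 13.4, 1.3, 45.0, 30.3)
res <- rmd_perm_test(x, y)
res$obs
res$p
```

Because the permutations are sampled rather than enumerated, the p-values vary slightly with the seed and the number of iterations. Note that `set.seed()` inside the function resets the global random number stream, which may not be wanted in a longer script.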
