Non Parametric Test

9-9 NONPARAMETRIC PROCEDURES
Most of the hypothesis-testing and confidence interval procedures discussed previously are based on the assumption that we are working with random samples from normal populations. Traditionally, we have called these procedures parametric methods because they are based on a particular parametric family of distributions, in this case, the normal. Alternately, sometimes we say that these procedures are not distribution-free because they depend on the assumption of normality. Fortunately, most of these procedures are relatively insensitive to moderate departures from normality. In general, the t- and F-tests and the t-confidence intervals will have actual levels of significance or confidence levels that differ from the nominal or advertised levels chosen by the experimenter, although the difference between the actual and advertised levels is usually fairly small when the underlying population is not too different from the normal.
In this section we describe procedures called nonparametric and distribution-free methods, and we usually make no assumptions about the distribution of the underlying population other than that it is continuous. These procedures have actual level of significance $\alpha$ or confidence level $100(1-\alpha)\%$ for many different types of distributions. These procedures have some appeal. One of their advantages is that the data need not be quantitative but can be categorical (such as yes or no, defective or nondefective) or rank data. Another advantage is that nonparametric procedures are usually very quick and easy to perform.
The procedures described in this chapter are alternatives to the parametric t- and F-procedures described earlier. Consequently, it is important to compare the performance of both parametric and nonparametric methods under the assumptions of both normal and nonnormal populations. In general, nonparametric procedures do not utilize all the information provided by the sample. As a result, a nonparametric procedure will be less efficient than the corresponding parametric procedure when the underlying population is normal. This loss of efficiency is reflected by a requirement of a larger sample size for the nonparametric procedure than would be required by the parametric procedure in order to achieve the same power. On the other hand, this loss of efficiency is usually not large, and often the difference in sample size is very small. When the underlying distributions are not close to normal, nonparametric methods may have much to offer. They often provide improvement over the normal-theory parametric methods. Generally, if both parametric and nonparametric methods are applicable to a particular problem, we should use the more efficient parametric procedure.
Another approach that can be used is to transform the original data, say, by taking logarithms, square roots, or a reciprocal, and then analyze the transformed data using a parametric technique. A normal probability plot often works well to see if the transformation has been successful. When this approach is successful, it is usually preferable to using a nonparametric technique. However, sometimes transformations are not satisfactory. That is, no transformation makes the sample observations look very close to a sample from a normal distribution. One situation where this happens is when the data are in the form of ranks. These situations frequently occur in practice. For instance, a panel of judges may be used to evaluate 10 different formulations of a soft-drink beverage for overall quality, with the "best" formulation assigned rank 1, the "next-best" formulation assigned rank 2, and so forth. It is unlikely that rank data satisfy the normality assumption. Transformations may not prove satisfactory either. Many nonparametric methods involve the analysis of ranks and consequently are directly suited to this type of problem.
9-9.1 The Sign Test
The sign test is used to test hypotheses about the median $\tilde{\mu}$ of a continuous distribution. The median of a distribution is a value of the random variable $X$ such that the probability is 0.5 that an observed value of $X$ is less than or equal to the median, and the probability is 0.5 that an observed value of $X$ is greater than or equal to the median. That is,
$$P(X \le \tilde{\mu}) = P(X \ge \tilde{\mu}) = 0.5$$
Since the normal distribution is symmetric, the mean of a normal distribution equals the median. Therefore, the sign test can be used to test hypotheses about the mean of a normal distribution. This is the same problem for which we previously used the t-test. We will briefly discuss the relative merits of the two procedures in Section 9-9.3. Note that, although the t-test was designed for samples from a normal distribution, the sign test is appropriate for samples from any continuous distribution. Thus, the sign test is a nonparametric procedure.
Suppose that the hypotheses are
$$H_0: \tilde{\mu} = \tilde{\mu}_0 \qquad H_1: \tilde{\mu} < \tilde{\mu}_0 \tag{9-51}$$
The test procedure is easy to describe. Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from the population of interest. Form the differences
$$X_i - \tilde{\mu}_0, \qquad i = 1, 2, \ldots, n \tag{9-52}$$
Now if the null hypothesis $H_0: \tilde{\mu} = \tilde{\mu}_0$ is true, any difference $X_i - \tilde{\mu}_0$ is equally likely to be positive or negative. An appropriate test statistic is the number of these differences that are positive, say, $R^+$. Therefore, to test the null hypothesis we are really testing that the number of plus signs is a value of a binomial random variable that has the parameter $p = 1/2$. A P-value for the observed number of plus signs $r^+$ can be calculated directly from the binomial distribution. For instance, in testing the hypotheses in Equation 9-51, we will reject $H_0$ in favor of $H_1$ only if the proportion of plus signs is sufficiently less than 1/2 (or equivalently, whenever the observed number of plus signs $r^+$ is too small). Thus, if the computed P-value
$$P = P\left(R^+ \le r^+ \text{ when } p = \tfrac{1}{2}\right)$$
is less than or equal to some preselected significance level $\alpha$, we will reject $H_0$ and conclude $H_1$ is true.
To test the other one-sided hypotheses
$$H_0: \tilde{\mu} = \tilde{\mu}_0 \qquad H_1: \tilde{\mu} > \tilde{\mu}_0 \tag{9-53}$$
we will reject $H_0$ in favor of $H_1$ only if the observed number of plus signs, say, $r^+$, is large or, equivalently, whenever the observed fraction of plus signs is significantly greater than 1/2. Thus, if the computed P-value
$$P = P\left(R^+ \ge r^+ \text{ when } p = \tfrac{1}{2}\right)$$
is less than $\alpha$, we will reject $H_0$ and conclude that $H_1$ is true.
The two-sided alternative may also be tested. If the hypotheses are
$$H_0: \tilde{\mu} = \tilde{\mu}_0 \qquad H_1: \tilde{\mu} \ne \tilde{\mu}_0 \tag{9-54}$$
we should reject $H_0: \tilde{\mu} = \tilde{\mu}_0$ if the proportion of plus signs is significantly different from (either less than or greater than) 1/2. This is equivalent to the observed number of plus signs $r^+$ being either sufficiently large or sufficiently small. Thus, if $r^+ < n/2$, the P-value is
$$P = 2P\left(R^+ \le r^+ \text{ when } p = \tfrac{1}{2}\right)$$
and if $r^+ > n/2$, the P-value is
$$P = 2P\left(R^+ \ge r^+ \text{ when } p = \tfrac{1}{2}\right)$$
If the P-value is less than some preselected level $\alpha$, we will reject $H_0$ and conclude that $H_1$ is true.
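The three P-value rules for the sign test can be collected in one small routine. The following is a minimal sketch, not a definitive implementation: the function names `binom_cdf` and `sign_test_p_value` and the `alternative` labels are illustrative choices, and returning a P-value of 1 when $r^+ = n/2$ (a case the rules above do not cover) is an assumption.

```python
from math import comb

def binom_cdf(k, n, p=0.5):
    """P(R <= k) for R ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def sign_test_p_value(r_plus, n, alternative="two-sided"):
    """P-value of the sign test for r_plus plus signs among n differences."""
    lower = binom_cdf(r_plus, n)            # P(R+ <= r+ when p = 1/2)
    upper = 1.0 - binom_cdf(r_plus - 1, n)  # P(R+ >= r+ when p = 1/2)
    if alternative == "less":       # H1: median < median_0
        return lower
    if alternative == "greater":    # H1: median > median_0
        return upper
    if alternative == "two-sided":  # H1: median != median_0
        # Doubling the smaller tail reproduces the r+ < n/2 and r+ > n/2 rules;
        # the cap at 1 handles r+ = n/2 (an assumption, not stated in the text).
        return min(1.0, 2.0 * min(lower, upper))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

# Hypothetical illustration: 2 plus signs out of 10 differences, H1: median < median_0
print(f"{sign_test_p_value(2, 10, 'less'):.4f}")  # 0.0547
```

Because $p = 1/2$ makes the binomial distribution symmetric, the smaller tail is the lower tail whenever $r^+ < n/2$ and the upper tail whenever $r^+ > n/2$, so taking the minimum matches the two cases given in the text.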
EXAMPLE 9-15 Propellant Shear Strength Sign Test
Montgomery, Peck, and Vining (2006) report on a study in which a rocket motor is formed by binding an igniter propellant and a sustainer propellant together inside a metal housing. The shear strength of the bond between the two propellant types is an important characteristic. The results of testing 20 randomly selected motors are shown in Table 9-5. We would like to test the hypothesis that the median shear strength is 2000 psi, using $\alpha = 0.05$.
This problem can be solved using the seven-step hypothesis-testing procedure:
1. Parameter of Interest: The parameter of interest is the median of the distribution of propellant shear strength.
2. Null hypothesis: $H_0: \tilde{\mu} = 2000$ psi
3. Alternative hypothesis: $H_1: \tilde{\mu} \ne 2000$ psi
4. Test statistic: The test statistic is the observed number of plus differences in Table 9-5, or $r^+ = 14$.
5. Reject $H_0$ if: We will reject $H_0$ if the P-value corresponding to $r^+ = 14$ is less than or equal to $\alpha = 0.05$.
Table 9-5 Propellant Shear Strength Data

Observation i   Shear Strength x_i   Difference x_i - 2000   Sign
      1              2158.70               +158.70            +
      2              1678.15               -321.85            -
      3              2316.00               +316.00            +
      4              2061.30                +61.30            +
      5              2207.50               +207.50            +
      6              1708.30               -291.70            -
      7              1784.70               -215.30            -
      8              2575.10               +575.10            +
      9              2357.90               +357.90            +
     10              2256.70               +256.70            +
     11              2165.20               +165.20            +
     12              2399.55               +399.55            +
     13              1779.80               -220.20            -
     14              2336.75               +336.75            +
     15              1765.30               -234.70            -
     16              2053.50                +53.50            +
     17              2414.40               +414.40            +
     18              2200.50               +200.50            +
     19              2654.20               +654.20            +
     20              1753.70               -246.30            -
6. Computations: Since $r^+ = 14$ is greater than $n/2 = 20/2 = 10$, we calculate the P-value from
$$P = 2P\left(R^+ \ge 14 \text{ when } p = \tfrac{1}{2}\right) = 2\sum_{r=14}^{20} \binom{20}{r} (0.5)^r (0.5)^{20-r} = 0.1153$$
7. Conclusions: Since $P = 0.1153$ is not less than $\alpha = 0.05$, we cannot reject the null hypothesis that the median shear strength is 2000 psi. Another way to say this is that the observed number of plus signs $r^+ = 14$ was not large or small enough to indicate that median shear strength is different from 2000 psi at the $\alpha = 0.05$ level of significance.
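The computations in Example 9-15 can be checked directly from the Table 9-5 data. The sketch below is illustrative only; the variable names are arbitrary, and the step that drops observations equal to the hypothesized median is included as a precaution even though none occur in these data.

```python
from math import comb

# Shear strengths (psi) from Table 9-5, observations 1 through 20
shear = [2158.70, 1678.15, 2316.00, 2061.30, 2207.50,
         1708.30, 1784.70, 2575.10, 2357.90, 2256.70,
         2165.20, 2399.55, 1779.80, 2336.75, 1765.30,
         2053.50, 2414.40, 2200.50, 2654.20, 1753.70]
median_0 = 2000.0

# Differences x_i - 2000; any exact ties with median_0 would be set aside
diffs = [x - median_0 for x in shear if x != median_0]
n = len(diffs)
r_plus = sum(d > 0 for d in diffs)

# r+ > n/2 here, so the two-sided P-value doubles the upper binomial tail
upper_tail = sum(comb(n, r) for r in range(r_plus, n + 1)) / 2**n
p_value = 2 * upper_tail

print(n, r_plus, round(p_value, 4))  # 20 14 0.1153
```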
It is also possible to construct a table of critical values for the sign test. This table is shown as Appendix Table VIII. The use of this table for the two-sided alternative hypothesis in Equation 9-54 is simple. As before, let $R^+$ denote the number of the differences $(X_i - \tilde{\mu}_0)$ that are positive and let $R^-$ denote the number of these differences that are negative. Let $R = \min(R^+, R^-)$. Appendix Table VIII presents critical values $r^*_\alpha$ for the sign test that ensure that $P(\text{type I error}) = P(\text{reject } H_0 \text{ when } H_0 \text{ is true}) = \alpha$ for $\alpha = 0.01$, $\alpha = 0.05$ and $\alpha = 0.10$. If the observed value of the test statistic $r \le r^*_\alpha$, the null hypothesis $H_0: \tilde{\mu} = \tilde{\mu}_0$ should be rejected.
To illustrate how this table is used, refer to the data in Table 9-5 that were used in Example 9-15. Now $r^+ = 14$ and $r^- = 6$; therefore, $r = \min(14, 6) = 6$. From Appendix Table VIII with $n = 20$ and $\alpha = 0.05$, we find that $r^*_{0.05} = 5$. Since $r = 6$ is not less than or equal to the critical value $r^*_{0.05} = 5$, we cannot reject the null hypothesis that the median shear strength is 2000 psi.
We can also use Appendix Table VIII for the sign test when a one-sided alternative hypothesis is appropriate. If the alternative is $H_1: \tilde{\mu} > \tilde{\mu}_0$, reject $H_0: \tilde{\mu} = \tilde{\mu}_0$ if $r^- \le r^*_\alpha$; if the alternative is $H_1: \tilde{\mu} < \tilde{\mu}_0$, reject $H_0: \tilde{\mu} = \tilde{\mu}_0$ if $r^+ \le r^*_\alpha$. The level of significance of a one-sided test is one-half the value for a two-sided test. Appendix Table VIII shows the one-sided significance levels in the column headings immediately below the two-sided levels.
Finally, note that when a test statistic has a discrete distribution, such as $R$ does in the sign test, it may be impossible to choose a critical value $r^*_\alpha$ that has a level of significance exactly equal to $\alpha$. The approach used in Appendix Table VIII is to choose $r^*_\alpha$ to yield an $\alpha$ that is as close to the advertised significance level $\alpha$ as possible.
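The effect of discreteness can be seen by computing the significance level actually attained by a candidate critical value. The sketch below assumes the two-sided level is $P(\min(R^+, R^-) \le r)$ under $H_0$ with $p = 1/2$; the function name `two_sided_level` is illustrative, and the code does not reproduce Appendix Table VIII itself.

```python
from math import comb

def two_sided_level(r_crit, n):
    """Attained level P(min(R+, R-) <= r_crit) when H0 is true (p = 1/2)."""
    if 2 * r_crit >= n:
        return 1.0  # min(R+, R-) can never exceed n/2
    lower = sum(comb(n, j) for j in range(r_crit + 1)) / 2**n  # P(R+ <= r_crit)
    return 2 * lower  # the two tails are disjoint and equal by symmetry

# Attained levels near the critical value quoted for n = 20
for r in (4, 5, 6):
    print(r, round(two_sided_level(r, 20), 4))
```

For $n = 20$ the attained level jumps from roughly 0.04 at $r = 5$ to roughly 0.12 at $r = 6$, so no critical value gives exactly 0.05.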
Ties in the Sign Test
Since the underlying population is assumed to be continuous, there is a zero probability that we will find a "tie," that is, a value of $X_i$ exactly equal to $\tilde{\mu}_0$. However, this may sometimes happen in practice because of the way the data are collected. When ties occur, they should be set aside and the sign test applied to the remaining data.
The Normal Approximation
When $p = 0.5$, the binomial distribution is well approximated by a normal distribution when $n$ is at least 10. Thus, since the mean of the binomial is $np$ and the variance is $np(1-p)$, the distribution of $R^+$ is approximately normal with mean $0.5n$ and variance $0.25n$ whenever $n$ is moderately large. Therefore, in these cases the null hypothesis $H_0: \tilde{\mu} = \tilde{\mu}_0$ can be tested using the statistic

Normal Approximation for Sign Test Statistic
$$Z_0 = \frac{R^+ - 0.5n}{0.5\sqrt{n}} \tag{9-55}$$

A P-value approach could be used for decision making. The fixed significance level approach could also be used. The null hypothesis would be rejected in favor of the two-sided alternative if the observed value of the test statistic satisfies $|z_0| > z_{\alpha/2}$, and the critical regions of the one-sided alternative would be chosen to reflect the sense of the alternative. (If the alternative is $H_1: \tilde{\mu} > \tilde{\mu}_0$, reject $H_0$ if $z_0 > z_\alpha$, for example.)
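As a rough illustration of Equation 9-55, the sketch below evaluates the statistic for the propellant data of Example 9-15 ($r^+ = 14$, $n = 20$). The helper names `phi` and `sign_test_z` are illustrative, the standard normal CDF is built from `math.erf`, and no continuity correction is applied because Equation 9-55 does not include one.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def sign_test_z(r_plus, n):
    """Equation 9-55: z0 = (r+ - 0.5 n) / (0.5 sqrt(n))."""
    return (r_plus - 0.5 * n) / (0.5 * sqrt(n))

z0 = sign_test_z(14, 20)
p_two_sided = 2.0 * (1.0 - phi(abs(z0)))  # two-sided P-value

print(round(z0, 3), round(p_two_sided, 3))
```

With a sample this small, the uncorrected approximation need not agree closely with the exact binomial P-value of 0.1153, although both lead to the same decision at $\alpha = 0.05$.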
Type II Error for the Sign Test
The sign test will control the probability of type I error at an advertised level $\alpha$ for testing the null hypothesis $H_0: \tilde{\mu} = \tilde{\mu}_0$ for any continuous distribution. As with any hypothesis-testing procedure, it is important to investigate the probability of a type II error, $\beta$. The test should be able to effectively detect departures from the null hypothesis, and a good measure of this effectiveness is the value of $\beta$ for departures that are important. A small value of $\beta$ implies an effective test procedure.
In determining $\beta$, it is important to realize not only that a particular value of $\tilde{\mu}$, say, $\tilde{\mu}_0 + \Delta$, must be used but also that the form of the underlying distribution will affect the calculations. To illustrate, suppose that the underlying distribution is normal with $\sigma = 1$ and we are testing the hypothesis $H_0: \tilde{\mu} = 2$ versus $H_1: \tilde{\mu} > 2$. (Since $\tilde{\mu} = \mu$ in the normal distribution, this is equivalent to testing that the mean equals 2.) Suppose that it is important to detect a departure from $\tilde{\mu} = 2$ to $\tilde{\mu} = 3$. The situation is illustrated graphically in Fig. 9-15(a). When the alternative hypothesis is true ($H_1: \tilde{\mu} = 3$), the probability that the random variable $X$ is less than or equal to the value 2 is
$$P(X \le 2) = P(Z \le -1) = \Phi(-1) = 0.1587$$
Suppose we have taken a random sample of size 12. At the $\alpha = 0.05$ level, Appendix Table VIII indicates that we would reject $H_0: \tilde{\mu} = 2$ if $r^- \le r^*_{0.05} = 2$. Therefore, $\beta$ is the probability that we do not reject $H_0: \tilde{\mu} = 2$ when in fact $\tilde{\mu} = 3$, or
$$\beta = 1 - \sum_{x=0}^{2} \binom{12}{x} (0.1587)^x (0.8413)^{12-x} = 0.2944$$
If the distribution of $X$ had been exponential rather than normal, the situation would be as shown in Fig. 9-15(b), and the probability that the random variable $X$ is less than or equal to the value $x = 2$ when $\tilde{\mu} = 3$ (note that when the median of an exponential distribution is 3, the mean is 4.33) is
$$P(X \le 2) = \int_0^2 \frac{1}{4.33}\, e^{-x/4.33}\, dx = 0.3699$$
In this case,
$$\beta = 1 - \sum_{x=0}^{2} \binom{12}{x} (0.3699)^x (0.6301)^{12-x} = 0.8794$$
Thus, $\beta$ for the sign test depends not only on the alternative value of $\tilde{\mu}$ but also on the area to the right of the value specified in the null hypothesis under the population probability distribution. This area is highly dependent on the shape of that particular probability distribution. In this example, $\beta$ is large, so the ability of the test to detect this departure from the null hypothesis with the current sample size is poor.

Figure 9-15 Calculation of $\beta$ for the sign test. (a) Normal distributions: under $H_0: \tilde{\mu} = 2$ and under $H_1: \tilde{\mu} = 3$, with area 0.1587 to the left of 2 under $H_1$. (b) Exponential distributions: under $H_0: \tilde{\mu} = 2$ (mean 2.89) and under $H_1: \tilde{\mu} = 3$ (mean 4.33), with area 0.3699 to the left of 2 under $H_1$.
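The two $\beta$ calculations can be reproduced numerically. The sketch below is a check under the stated setup ($n = 12$, rejection when $r^- \le 2$, alternative median 3); the function names are illustrative, and because it uses unrounded probabilities the results may differ from the text's values in the fourth decimal place.

```python
from math import comb, erf, exp, log, sqrt

def binom_cdf(k, n, p):
    """P(R <= k) for R ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def sign_test_beta(p_below, n, r_crit):
    """beta = P(do not reject H0) = 1 - P(R- <= r_crit), with R- ~ Binomial(n, p_below).

    p_below is P(X <= hypothesized median) under the alternative, i.e. the
    chance that a single observation produces a minus sign.
    """
    return 1.0 - binom_cdf(r_crit, n, p_below)

# Normal population, sigma = 1, true median 3: P(X <= 2) = Phi(-1)
p_normal = 0.5 * (1.0 + erf(-1.0 / sqrt(2.0)))

# Exponential population with median 3: mean = 3 / ln 2, P(X <= 2) = 1 - exp(-2 / mean)
mean_expon = 3.0 / log(2.0)
p_expon = 1.0 - exp(-2.0 / mean_expon)

beta_normal = sign_test_beta(p_normal, n=12, r_crit=2)
beta_expon = sign_test_beta(p_expon, n=12, r_crit=2)

print(round(p_normal, 4), round(beta_normal, 3))
print(round(mean_expon, 2), round(p_expon, 3), round(beta_expon, 3))
```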
9-9.2 The Wilcoxon Signed-Rank Test
The sign test makes use only of the plus and minus signs of the differences between the observations and the median $\tilde{\mu}_0$ (or the plus and minus signs of the differences between the observations in the paired case). It does not take into account the size or magnitude of these differences. Frank Wilcoxon devised a test procedure that uses both direction (sign) and magnitude. This procedure, now called the Wilcoxon signed-rank test, is discussed and illustrated in this section.
The Wilcoxon signed-rank test applies to the case of symmetric continuous distributions. Under these assumptions, the mean equals the median, and we can use this procedure to test the null hypothesis $\mu = \mu_0$.
The Test Procedure
We are interested in testing $H_0: \mu = \mu_0$ against the usual alternatives. Assume that $X_1, X_2, \ldots, X_n$ is a random sample from a continuous and symmetric distribution with mean (and median) $\mu$. Compute the differences $X_i - \mu_0$, $i = 1, 2, \ldots, n$. Rank the absolute differences $|X_i - \mu_0|$, $i = 1, 2, \ldots, n$ in ascending order, and then give the ranks the signs of their corresponding differences. Let $W^+$ be the sum of the positive ranks and $W^-$ be the absolute value of the sum of the negative ranks, and let $W = \min(W^+, W^-)$. Appendix Table IX contains critical values of $W$, say, $w^*_\alpha$. If the alternative hypothesis is $H_1: \mu \ne \mu_0$, then if the observed value of the statistic $w \le w^*_\alpha$, the null hypothesis $H_0: \mu = \mu_0$ is rejected. Appendix Table IX provides significance levels of $\alpha = 0.10$, $\alpha = 0.05$, $\alpha = 0.02$, $\alpha = 0.01$ for the two-sided test.
For one-sided tests, if the alternative is $H_1: \mu > \mu_0$, reject $H_0: \mu = \mu_0$ if $w^- \le w^*_\alpha$; and if the alternative is $H_1: \mu < \mu_0$, reject $H_0: \mu = \mu_0$ if $w^+ \le w^*_\alpha$. The significance levels for one-sided tests provided in Appendix Table IX are $\alpha = 0.05$, 0.025, 0.01, and 0.005.
EXAMPLE 9-16 Propellant Shear Strength Wilcoxon Signed-Rank Test
We will illustrate the Wilcoxon signed-rank test by applying it to the propellant shear strength data from Table 9-5. Assume that the underlying distribution is a continuous symmetric distribution. The seven-step procedure is applied as follows:
1. Parameter of Interest: The parameter of interest is the mean (or median) of the distribution of propellant shear strength.
2. Null hypothesis: $H_0: \mu = 2000$ psi
3. Alternative hypothesis: $H_1: \mu \ne 2000$ psi
4. Test statistic: The test statistic is $w = \min(w^+, w^-)$
5. Reject $H_0$ if: We will reject $H_0$ if $w \le w^*_{0.05} = 52$ from Appendix Table IX.
6. Computations: The signed ranks from Table 9-5 are shown in the following display:

Observation   Difference x_i - 2000   Signed Rank
    16               +53.50               +1
     4               +61.30               +2
     1              +158.70               +3
    11              +165.20               +4
    18              +200.50               +5
     5              +207.50               +6
     7              -215.30               -7
    13              -220.20               -8
    15              -234.70               -9
    20              -246.30              -10
    10              +256.70              +11
     6              -291.70              -12
     3              +316.00              +13
     2              -321.85              -14
    14              +336.75              +15
     9              +357.90              +16
    12              +399.55              +17
    17              +414.40              +18
     8              +575.10              +19
    19              +654.20              +20

The sum of the positive ranks is $w^+ = (1 + 2 + 3 + 4 + 5 + 6 + 11 + 13 + 15 + 16 + 17 + 18 + 19 + 20) = 150$, and the sum of the absolute values of the negative ranks is $w^- = (7 + 8 + 9 + 10 + 12 + 14) = 60$. Therefore,
$$w = \min(150, 60) = 60$$
7. Conclusions: Since $w = 60$ is not less than or equal to the critical value $w^*_{0.05} = 52$, we cannot reject the null hypothesis that the mean (or median, since the population is assumed to be symmetric) shear strength is 2000 psi.
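The rank sums in Example 9-16 can be verified from the raw data. The following sketch assumes, as is true for these 20 observations, that no difference is zero and no two absolute differences are equal, so plain ordinal ranks suffice; the variable names are illustrative.

```python
# Shear strengths (psi) from Table 9-5, observations 1 through 20
shear = [2158.70, 1678.15, 2316.00, 2061.30, 2207.50,
         1708.30, 1784.70, 2575.10, 2357.90, 2256.70,
         2165.20, 2399.55, 1779.80, 2336.75, 1765.30,
         2053.50, 2414.40, 2200.50, 2654.20, 1753.70]
mu_0 = 2000.0

diffs = [x - mu_0 for x in shear]

# Indices ordered by increasing |x_i - mu_0|; position in this order is the rank
order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))

w_plus = 0   # sum of ranks attached to positive differences
w_minus = 0  # sum of ranks attached to negative differences
for rank, i in enumerate(order, start=1):
    if diffs[i] > 0:
        w_plus += rank
    else:
        w_minus += rank

w = min(w_plus, w_minus)
print(w_plus, w_minus, w)  # 150 60 60
```

A useful sanity check is that $w^+ + w^-$ must equal $n(n+1)/2 = 210$ for $n = 20$.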
Ties in the Wilcoxon Signed-Rank Test
Because the underlying population is continuous, ties are theoretically impossible, although they will sometimes occur in practice. If several observations have the same absolute magnitude, they are assigned the average of the ranks that they would receive if they differed slightly from one another.
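The averaging rule for tied magnitudes can be sketched as follows. The function name `average_ranks` and the sample values are hypothetical, and ties are detected by exact equality, which is an assumption that suits recorded (rounded) data better than computed floating-point values.

```python
def average_ranks(values):
    """Ascending ranks 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the block while the next sorted value equals the current one
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2.0  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

# Hypothetical absolute differences with a three-way tie and a two-way tie
abs_diffs = [1.5, 0.7, 1.5, 2.0, 0.7, 0.7]
print(average_ranks(abs_diffs))  # [4.5, 2.0, 4.5, 6.0, 2.0, 2.0]
```

The three values of 0.7 would have held ranks 1, 2, and 3, so each receives 2.0; the two values of 1.5 would have held ranks 4 and 5, so each receives 4.5.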
Large Sample Approximation
If the sample size is moderately large, say, $n > 20$, it can be shown that $W^+$ (or $W^-$) has approximately a normal distribution with mean
$$\mu_{W^+} = \frac{n(n+1)}{4}$$
and variance
$$\sigma^2_{W^+} = \frac{n(n+1)(2n+1)}{24}$$
Therefore, a test of $H_0: \mu = \mu_0$ can be based on the statistic

Normal Approximation for Wilcoxon Signed-Rank Statistic
$$Z_0 = \frac{W^+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}} \tag{9-56}$$

An appropriate critical region for either the two-sided or one-sided alternative hypotheses can be chosen from a table of the standard normal distribution.
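Equation 9-56 is straightforward to evaluate. The sketch below uses the values from Example 9-16 ($w^+ = 150$, $n = 20$) purely as an illustration, even though $n = 20$ sits at the edge of the stated rule of thumb; the helper names `phi` and `wilcoxon_z` are illustrative.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def wilcoxon_z(w_plus, n):
    """Equation 9-56: standardize W+ by its null mean and variance."""
    mean = n * (n + 1) / 4.0
    variance = n * (n + 1) * (2 * n + 1) / 24.0
    return (w_plus - mean) / sqrt(variance)

z0 = wilcoxon_z(150, 20)                  # mean 105, variance 717.5
p_two_sided = 2.0 * (1.0 - phi(abs(z0)))  # two-sided P-value

print(round(z0, 2), round(p_two_sided, 3))
```

The resulting P-value exceeds 0.05, which agrees with the table-based conclusion of Example 9-16 that the null hypothesis cannot be rejected.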
9-9.3 Comparison to the t-Test
If the underlying population is normal, either the sign test or the t-test could be used to test a hypothesis about the population median. The t-test is known to have the smallest value of $\beta$ possible among all tests that have significance level $\alpha$ for the one-sided alternative and for tests with symmetric critical regions for the two-sided alternative, so it is superior to the sign test in the normal distribution case. When the population distribution is symmetric and nonnormal (but with finite mean), the t-test will have a smaller $\beta$ (or a higher power) than the sign test, unless the distribution has very heavy tails compared with the normal. Thus, the sign test is usually considered a test procedure for the median rather than as a serious competitor for the t-test. The Wilcoxon signed-rank test is preferable to the sign test and compares well with the t-test for symmetric distributions. It can be useful in situations where a transformation on the observations does not produce a distribution that is reasonably close to the normal.
EXERCISES FOR SECTION 9-9
9-110. Ten samples were taken from a plating bath used in an electronics manufacturing process, and the bath pH was determined. The sample pH values are 7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, and 7.42. Manufacturing engineering believes that pH has a median value of 7.0.
(a) Do the sample data indicate that this statement is correct? Use the sign test with $\alpha = 0.05$ to investigate this hypothesis. Find the P-value for this test.
(b) Use the normal approximation for the sign test to test $H_0: \tilde{\mu} = 7.0$ versus $H_1: \tilde{\mu} \ne 7.0$. What is the P-value for this test?