
Classification of Non-Parametric Tests

Non-Parametric Tests
• Two dependent samples: Wilcoxon Signed Rank Test, Sign Test
• Two independent samples: Mann-Whitney U-Test
• Three or more dependent samples: Friedman Test
• Three or more independent samples: Kruskal-Wallis H-Test
Why Use Nonparametric Tests?
Some Common Nonparametric Tests

Figure 16.1
Wilcoxon Signed Rank Test
The paired difference d is calculated for each pair, and the differences are then ranked by absolute value:

$d_i = x_i - y_i$

where x_i, y_i are the i-th pair of observations.

The Wilcoxon T statistic is the smaller of the sum of positive ranks, W+, and the sum of negative ranks, W-.
Wilcoxon Signed Rank Test
ASSUMPTIONS:

The paired data come from the same sample

The data are measured on an interval scale

Each pair is independent of the other pairs

FORMULAE USED:

The null and alternate hypotheses:


Wilcoxon Signed Rank Test
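For larger samples, a normal approximation is used, where n is the number of pairs:

Mean: $\mu_T = \frac{n(n+1)}{4}$

Standard deviation: $\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}$

Z-score: $z = \frac{T - \mu_T}{\sigma_T}$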
Wilcoxon Signed Rank Test In Excel

[Screenshot: sample data with the calculated differences]

1. The differences are first ranked by absolute value using the Excel RANK function.
2. R+ and R- are then calculated: if a difference is positive, its rank counts toward R+; otherwise it counts toward R-.
Wilcoxon Signed Rank Test In Excel

Sum of positive
and negative ranks
Why Use Nonparametric Tests?
Parametric Tests
• Parametric hypothesis tests require the estimation
of one or more unknown parameters (e.g.,
population mean or variance).
• Often, unrealistic assumptions are made about the
normality of the underlying population.
• Large sample sizes are often required to invoke the
Central Limit Theorem.
Why Use Nonparametric Tests?
Nonparametric Tests
• Nonparametric or distribution-free tests
- usually focus on the sign or rank of the data
rather than the exact numerical value.
- do not specify the shape of the parent
population.
- can often be used in smaller samples.
- can be used for ordinal data.
Why Use Nonparametric Tests?
Advantages and Disadvantages of
Nonparametric Tests

Table 16.1
Why Use Nonparametric Tests?
Some Common Nonparametric Tests

Figure 16.1
One-Sample Runs Test
Wald-Wolfowitz Runs Test
• The one-sample runs test (Wald-Wolfowitz test)
detects non-randomness.
• Ask – Is each observation in a sequence of binary
events independent of its predecessor?
• A nonrandom pattern suggests that the observations
are not independent.
• The hypotheses are
H0: Events follow a random pattern
H1: Events do not follow a random pattern

One-Sample Runs Test
Wald-Wolfowitz Runs Test
• To test the hypothesis, first count the number of
outcomes of each type.
n1 = number of outcomes of the first type
n2 = number of outcomes of the second type
n = total sample size = n1 + n2
• A run is a series of consecutive outcomes of the same
type, surrounded by a sequence of outcomes of the
other type.
One-Sample Runs Test
Wald-Wolfowitz Runs Test
• For example, consider the following series representing
44 defective (D) or acceptable (A) computer chips:
DAAAAAAADDDDAAAAAAAADDAAAAAAAADDDDAAAAAAAAAA
• The grouped sequences are:
D | AAAAAAA | DDDD | AAAAAAAA | DD | AAAAAAAA | DDDD | AAAAAAAAAA
• A run can be a single outcome if it is preceded and followed by outcomes of the other type.
One-Sample Runs Test
Wald-Wolfowitz Runs Test

• There are 8 runs (R = 8).


n1 = number of defective chips (D) = 11
n2 = number of acceptable chips (A) = 33
n = total sample size = n1 + n2 = 11 + 33 = 44
• The hypotheses are:
H0: Defects follow a random sequence
H1: Defects follow a nonrandom sequence
One-Sample Runs Test
Wald-Wolfowitz Runs Test
• When n1 > 10 and n2 > 10, the number of runs R may be assumed to be normally distributed with mean μ_R and standard deviation σ_R:

$\mu_R = \frac{2 n_1 n_2}{n} + 1$

$\sigma_R = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n)}{n^2 (n - 1)}}$
One-Sample Runs Test
Wald-Wolfowitz Runs Test
• The test statistic is:

$z = \frac{R - \mu_R}{\sigma_R}$

• For a given level of significance α, find the critical value $z_{\alpha/2}$ for a two-tailed test.
• Reject the hypothesis of a random pattern if $z < -z_{\alpha/2}$ or if $z > +z_{\alpha/2}$.
One-Sample Runs Test
Wald-Wolfowitz Runs Test
• Decision rule for large-sample runs tests:

Figure 16.2
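
The chip example can be checked numerically. The following is a minimal sketch, assuming the usual large-sample mean and standard deviation of R; the function name `runs_test` is illustrative, and the sequence is rebuilt from its run lengths rather than typed out.

```python
import math

def runs_test(sequence, first, second):
    """Large-sample Wald-Wolfowitz runs test for a sequence of two outcome types."""
    n1 = sequence.count(first)
    n2 = sequence.count(second)
    n = n1 + n2
    if n != len(sequence):
        raise ValueError("sequence must contain only the two outcome types")
    # A new run starts wherever an outcome differs from its predecessor.
    runs = 1 + sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    mean_r = 2 * n1 * n2 / n + 1
    sd_r = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))
    z = (runs - mean_r) / sd_r
    return n1, n2, runs, mean_r, sd_r, z

# The 44-chip series, rebuilt from its eight runs.
run_lengths = [("D", 1), ("A", 7), ("D", 4), ("A", 8),
               ("D", 2), ("A", 8), ("D", 4), ("A", 10)]
chips = "".join(symbol * length for symbol, length in run_lengths)

n1, n2, runs, mean_r, sd_r, z = runs_test(chips, "D", "A")
print(n1, n2, runs)          # 11 33 8
print(mean_r, round(z, 2))   # 17.5 -3.9
```

With z near -3.9, the statistic falls well below -1.96, so at α = 0.05 the hypothesis of a random pattern would be rejected: there are far fewer runs than randomness would suggest.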
Wilcoxon Signed-Rank Test
• The Wilcoxon signed-rank test compares a single
sample median with a benchmark using only ranks of
the data instead of the original observations.
• It is used to compare paired observations.
• Advantages are
- freedom from the normality assumption,
- robustness to outliers
- applicability to ordinal data.
• The population should be roughly symmetric.
Wilcoxon Signed-Rank Test
• To compare the sample median (M) with a
benchmark median (M0), the hypotheses are:
H0: M = M0
H1: M ≠ M0 (two-tailed; for a one-tailed test, H1 is M < M0 or M > M0)
• When evaluating the difference between paired observations, use the median
difference (Md) and zero as the benchmark.
Wilcoxon Signed-Rank Test
• Calculate the difference between the paired
observations.
• Rank the differences from smallest to largest by
absolute value.
• Add the ranks of the positive differences to obtain
the rank sum W.
Wilcoxon Signed-Rank Test
• For small samples, a special table is required to obtain critical values.
• For large samples (n > 20), the test statistic is approximately normal:

$z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$

• Use Excel or Appendix C to get a p-value.
• Reject H0 if p-value < α.
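
The ranking and rank-sum steps can be sketched in a few lines. This is only an illustration: the data are hypothetical, the helper names are made up, zero differences are dropped (a common convention, assumed here), and the sample is far too small for the normal approximation, so the z value only demonstrates the arithmetic.

```python
import math

def average_ranks(values):
    """Rank from smallest to largest; tied values share the average of their ranks."""
    ordered = sorted(values)
    # Ties occupy positions index+1 .. index+count, whose average is index + (count + 1) / 2.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def wilcoxon_signed_rank(x, y):
    """Return (n, W+, W-, z) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])   # rank by absolute value
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    return n, w_plus, w_minus, z

# Hypothetical paired measurements (not from the slides).
before = [10, 12, 9, 14, 11, 13]
after = [8, 13, 6, 10, 11, 9]

n, w_plus, w_minus, z = wilcoxon_signed_rank(before, after)
print(n, w_plus, w_minus, round(z, 3))
```

A quick sanity check is that W+ and W- together add up to n(n+1)/2, the sum of all ranks.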
Mann-Whitney Test
• The Mann-Whitney test is a nonparametric test that
compares two populations.
• It does not assume normality.
• It is a test for the equality of medians, assuming
- the populations differ only in centrality,
- equal variances
• The hypotheses are
H0: M1 = M2 (no difference in medians)
H1: M1 ≠ M2 (medians differ)
Mann-Whitney Test
Performing the Test
• Step 1: Sort the combined samples from lowest to
highest.
• Step 2: Assign a rank to each value.
If values are tied, the average of the ranks
is assigned to each.
• Step 3: The ranks are summed for each column
(e.g., T1, T2).
• Step 4: The sum of the ranks T1 + T2 must be
equal to n(n + 1)/2, where n = n1 + n2.
Mann-Whitney Test
Performing the Test
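• Step 5: Calculate the mean rank sums $\bar{T}_1 = T_1/n_1$ and $\bar{T}_2 = T_2/n_2$.
• Step 6: For large samples (n1 ≥ 10, n2 ≥ 10), use a z test:

$z = \frac{\bar{T}_1 - \bar{T}_2}{(n_1 + n_2)\sqrt{\frac{n_1 + n_2 + 1}{12 n_1 n_2}}}$

• Step 7: For a given α, reject H0 if $z < -z_{\alpha/2}$ or $z > +z_{\alpha/2}$.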
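
The seven steps can be traced in code. The sketch below uses hypothetical data and an illustrative function name, and it assumes the z statistic is the difference in mean rank sums divided by (n1 + n2)·sqrt((n1 + n2 + 1)/(12·n1·n2)). The samples are deliberately tiny, so the z value shows the arithmetic only.

```python
import math

def average_ranks(values):
    """Rank from smallest to largest; tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def mann_whitney_z(sample1, sample2):
    """Return (T1, T2, z) using ranks from the combined sample."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))   # steps 1 and 2
    t1, t2 = sum(ranks[:n1]), sum(ranks[n1:])              # step 3
    n = n1 + n2
    assert t1 + t2 == n * (n + 1) / 2                      # step 4 check
    mean1, mean2 = t1 / n1, t2 / n2                        # step 5
    z = (mean1 - mean2) / (n * math.sqrt((n + 1) / (12 * n1 * n2)))  # step 6
    return t1, t2, z

# Hypothetical samples (not from the slides).
group1 = [3, 5, 8]
group2 = [4, 6, 9, 10]

t1, t2, z = mann_whitney_z(group1, group2)
print(t1, t2, round(z, 4))
```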
Kruskal-Wallis Test
for Independent Samples
• The Kruskal-Wallis (K-W) test compares c
independent medians, assuming the populations
differ only in centrality.
• The K-W test is a generalization of the Mann-
Whitney test and is analogous to a one-factor
ANOVA (completely randomized model).
• Groups can be of different sizes if each group has 5
or more observations.
• Populations must be of similar shape but normality
is not a requirement.
Kruskal-Wallis Test
for Independent Samples

Performing the Test


• First, combine the samples and assign a rank to each observation in each group. For example:
• When a tie occurs, each observation is assigned the average of the ranks.

Table 16.7
Kruskal-Wallis Test
for Independent Samples
Performing the Test
• Next, arrange the data by groups and sum the ranks to obtain the Tj's.
• Remember, ΣTj = n(n+1)/2.

Table 16.8
Kruskal-Wallis Test
for Independent Samples
Performing the Test
• The hypotheses to be tested are:
H0: All c population medians are the same
H1: Not all the population medians are the same
• For a completely randomized design with c groups, the test statistic is

$H = \frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j} - 3(n+1)$

where n = n1 + n2 + … + nc
nj = number of observations in group j
Tj = sum of ranks for group j
Kruskal-Wallis Test
for Independent Samples
Performing the Test
• The H test statistic follows a chi-square distribution with ν = c - 1 degrees of freedom.
• This is a right-tailed test, so reject H0 if $H > \chi^2_\alpha$ or if p-value < α.
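
As a rough illustration of the K-W procedure, the sketch below pools three hypothetical groups, ranks them with averaged ties, and computes H = 12/(n(n+1)) · Σ Tj²/nj - 3(n+1). The function name and data are assumptions, and no tie correction is applied. The critical value 5.991 is the upper 5% point of chi-square with 2 degrees of freedom.

```python
def average_ranks(values):
    """Rank from smallest to largest; tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def kruskal_wallis_h(groups):
    """Return (rank sums Tj, H) for a list of independent samples."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)          # ranks come from the pooled sample
    n = len(pooled)
    rank_sums, start = [], 0
    for group in groups:
        rank_sums.append(sum(ranks[start:start + len(group)]))
        start += len(group)
    h = 12 / (n * (n + 1)) * sum(
        t ** 2 / len(g) for t, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    return rank_sums, h

# Hypothetical data: three groups of five observations each.
groups = [
    [27, 31, 35, 40, 33],
    [22, 25, 31, 28, 24],
    [38, 42, 45, 36, 41],
]

rank_sums, h = kruskal_wallis_h(groups)
critical = 5.991  # chi-square, 2 degrees of freedom, alpha = 0.05
print(rank_sums, round(h, 3), h > critical)
```

Here the rank sums add up to n(n+1)/2 = 120, and H exceeds the critical value, so H0 would be rejected for these made-up data.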
Friedman Test for Related Samples
• The Friedman test determines if c treatments have the
same central tendency (medians) when there is a
second factor with r levels and the populations are
assumed to be the same except for centrality.
• This test is analogous to a two-factor ANOVA without
replication (randomized block design) with one
observation per cell.
• The groups must be of the same size.
• Treatments should be randomly assigned within blocks.
• Data should be at least interval scale.
Friedman Test for Related Samples
• In addition to the c treatment levels that define the
columns, the Friedman test also specifies r block factor
levels to define each row of the observation matrix.
• The hypotheses to be tested are:
H0: All c populations have the same median
H1: Not all the populations have the same median
• Unlike the Kruskal-Wallis test, the Friedman ranks are
computed within each block rather than within a pooled
sample.
Friedman Test for Related Samples
Performing the Test
• First, assign a rank to each observation within each row.
For example, within each Trial:

• When a tie occurs, each observation is assigned the average of the ranks.
Friedman Test for Related Samples
Performing the Test
• Compute the test statistic:

$F = \frac{12}{rc(c+1)} \sum_{j=1}^{c} T_j^2 - 3r(c+1)$

where r = the number of blocks (rows)
c = the number of treatments (columns)
Tj = the sum of ranks for treatment j
Friedman Test for Related Samples
Performing the Test
• The Friedman test statistic F follows a chi-square distribution with ν = c - 1 degrees of freedom.
• Reject H0 if $F > \chi^2_\alpha$ or if p-value < α.
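
The within-block ranking is the main difference from Kruskal-Wallis, and a short sketch makes it concrete. The data matrix and function name below are hypothetical; the statistic is computed as F = 12/(rc(c+1)) · Σ Tj² - 3r(c+1), and 5.991 is the upper 5% point of chi-square with 2 degrees of freedom.

```python
def average_ranks(values):
    """Rank from smallest to largest; tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def friedman_statistic(matrix):
    """matrix[i][j] = observation for block i (row) under treatment j (column)."""
    r, c = len(matrix), len(matrix[0])
    ranked_rows = [average_ranks(row) for row in matrix]   # rank within each block
    rank_sums = [sum(row[j] for row in ranked_rows) for j in range(c)]
    f = 12 / (r * c * (c + 1)) * sum(t ** 2 for t in rank_sums) - 3 * r * (c + 1)
    return rank_sums, f

# Hypothetical data: 4 blocks (rows) by 3 treatments (columns).
data = [
    [10, 12, 15],
    [9, 14, 11],
    [8, 13, 16],
    [11, 10, 17],
]

rank_sums, f = friedman_statistic(data)
critical = 5.991  # chi-square, 2 degrees of freedom, alpha = 0.05
print(rank_sums, f, f > critical)
```

For these made-up numbers F is below the critical value, so H0 would not be rejected at α = 0.05.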
Spearman Rank Correlation Test
• Spearman’s rank correlation coefficient (Spearman’s
rho) is an overall nonparametric test that measures
the strength of the association (if any) between two
variables.
• This method does not assume interval
measurement.
• The sample rank correlation coefficient rs satisfies
-1 ≤ rs ≤ +1.
Spearman Rank Correlation Test
• The sign of rs indicates whether the relationship is
direct – ranks tend to vary in the same
direction, or
inverse – ranks tend to vary in opposite
directions
• The magnitude of rs indicates the degree of relationship.
If
rs is near 0 – there is little or no agreement
between rankings
rs is near +1 – there is strong direct agreement
rs is near -1 – there is strong inverse
agreement
Spearman Rank Correlation Test
Performing the Test
• First, rank each variable. For example,
• If more than one value is the same, assign the average of the ranks.

Table 16.11
Spearman Rank Correlation Test
Performing the Test
• The sums of ranks within each column must always
be n(n+1)/2.
• Next, compute the difference in ranks di for each
observation.
• The rank differences should sum to zero.
Spearman Rank Correlation Test
Performing the Test
• Calculate the sample rank correlation coefficient rs:

$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$

where di = difference in ranks for case i
n = sample size
• For a right-tailed test, the hypotheses to be
tested are
H0: True rank correlation is zero (ρs ≤ 0)
H1: True rank correlation is positive (ρs > 0)
Spearman Rank Correlation Test
Performing the Test
• If n is large (at least 20 observations), then the test statistic below may be assumed to follow the Student's t distribution with ν = n - 2 degrees of freedom:

$t = r_s \sqrt{\frac{n - 2}{1 - r_s^2}}$

• Reject H0 if t > tα or if p-value < α.
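
A small sketch of the rank-correlation arithmetic follows. The data and the function name are hypothetical. The coefficient uses the rank-difference form rs = 1 - 6Σdi²/(n(n²-1)), and the t statistic uses the widely used form t = rs·sqrt((n-2)/(1-rs²)), both stated here as assumptions. With only six cases the t value merely illustrates the calculation.

```python
import math

def average_ranks(values):
    """Rank from smallest to largest; tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def spearman_rs(x, y):
    """Return (rank differences, r_s, t) for paired observations x and y."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    d = [rx - ry for rx, ry in zip(rank_x, rank_y)]
    rs = 1 - 6 * sum(di ** 2 for di in d) / (n * (n ** 2 - 1))
    t = rs * math.sqrt((n - 2) / (1 - rs ** 2))
    return d, rs, t

# Hypothetical paired observations (not from the slides).
x = [86, 97, 99, 100, 101, 103]
y = [2, 20, 28, 27, 50, 29]

d, rs, t = spearman_rs(x, y)
print(sum(d), round(rs, 4), round(t, 3))
```

The rank differences sum to zero, which is a useful check before computing rs.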


Correlation versus Causation

• Caution: correlation does not prove causation.


• Correlations may prove to be “significant” even
when there is no causal relation between the two
variables.
• However, causation is not ruled out.
• Multiple causes may be present.
