Scanlan, EdD, RRT
Parametric statistics assume (1) that the distribution characteristics of a sample's population are known (e.g., the mean, standard deviation, normality) and (2) that the data being analyzed are at the interval or ratio level. Frequently, however, these assumptions cannot be met. Commonly, this occurs with nominal- or ordinal-level data (for which there are no measures of means or standard deviations). Alternatively, continuous data may be so severely skewed from normal that they cannot be analyzed using regular parametric methods. In these cases we cannot perform analyses based on means or standard deviations. Instead, we must use nonparametric methods. Unlike their parametric counterparts, nonparametric tests make no assumptions about the distribution of the data, nor do they rely on estimates of population parameters such as the mean to describe a variable's distribution. For this reason, nonparametric tests often are called 'distribution-free' or 'parameter-free' statistics.

Given that nonparametric methods make less stringent demands on the data, one might wonder why they are not used more often. There are several reasons. First, nonparametric statistics cannot provide definitive measures of actual differences between population samples. A nonparametric test may tell you that two interventions differ, but it cannot provide a confidence interval for the difference or even a simple mean difference between the two. Second, nonparametric procedures discard information. For example, if we convert severely skewed interval data into ranks, we discard the actual values and retain only their order. Because this information is discarded, nonparametric tests are less powerful (more prone to Type II errors) than parametric methods. This also means that nonparametric tests typically require comparatively larger sample sizes to demonstrate an effect when one is present.
Last, there are certain types of information that only parametric statistical tests can provide. A good example is independent variable interaction, as provided by factorial analysis of variance. There is simply no equivalent nonparametric method to analyze such interactions. For these reasons, you will see nonparametric analysis used primarily on an as-needed basis, either (1) to analyze nominal or ordinal data or (2) to substitute for parametric tests when their assumptions are grossly violated, e.g., when a distribution is severely skewed. Discussion here will be limited to the analysis of nominal or ordinal data.

Nominal (Categorical) Data Analysis

We previously have learned that the Pearson product-moment correlation coefficient (r) is commonly used to assess the relationship between two continuous variables. If instead the two variables are measured at the nominal level (categorical in nature), we assess their relationship by crosstabulating the data in a contingency table. A contingency table is a two-dimensional (rows x columns) table formed by 'cross-classifying' subjects or events on two categorical variables. One variable's categories define the rows while the other variable's categories define the columns. The intersection (crosstabulation) of each row and column forms a cell, which
displays the count (frequency) of cases classified as being in the applicable category of both variables. Below is a simple example of a hypothetical contingency table that crosstabulates patient gender against survival of chest trauma:

                  Outcome
            Survives    Dies    Total
  Male         34        16       50
  Female        7        43       50
  Total        41        59      100

Testing for Independence (Chi-square and Related Tests)

Based on simple probability, we can easily compute the expected values for each cell, i.e., the number of cases we would expect based on their total distribution in the sample were there no relationship between gender and outcome (the null hypothesis of independence). In our example, given that the sample contains exactly 50% males and 50% females, we would expect exactly half of those surviving (41) to be male, i.e., 41/2 = 20.5.* Similar expected values can be computed for all cells in the table. The large difference between the observed (O = 34) and expected (E = 20.5) cell counts for the Male/Survives cell suggests that being male is associated with a greater likelihood of survival.

To determine whether or not the row and column categories for the table as a whole are independent of each other, we compute the Chi-square statistic (χ²):

    χ² = ∑ [(O – E)² / E]

where O = the observed frequency and E = the expected frequency. As indicated in the formula, one first computes the difference between the observed and expected frequencies in each cell, squares this difference, and then divides the squared difference by that cell's expected frequency. These values are then summed over all cells (the ∑ symbol), yielding the value of chi-square (χ²). The greater the difference between the observed (O) and expected (E) cell counts, the less likely it is that the null hypothesis of independence holds true, i.e., the stronger the evidence that the two variables are related. In our example χ² = 30.14.

The resulting χ² statistic is then compared to a critical value that is based on the number of rows and columns and obtained from a Chi-square distribution table. If the computed χ² statistic is less than this critical value, then we must accept the null hypothesis and conclude that the variable categories are independent of each other, i.e., not associated. If, on the other hand, the computed χ² statistic exceeds the critical value, then we reject the null hypothesis and conclude that the variable categories are indeed related.

* The actual formula for computing the expected count (E) in any contingency table cell is: E = (row total x column total)/grand total. For the Male/Survives cell, E = (50 x 41)/100 = 20.5.
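The expected counts and chi-square statistic for the 2 x 2 gender/survival table can be reproduced with a short script. This is a plain-Python sketch added for illustration (it is not part of the original text); the same result is available from library routines such as scipy.stats.chi2_contingency.

```python
# Chi-square test of independence for the hypothetical 2 x 2 table
# crosstabulating gender against survival of chest trauma.

observed = [[34, 16],   # Male:   survives, dies
            [7, 43]]    # Female: survives, dies

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E = (row total x column total) / grand total, for every cell
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# chi-square = sum over all cells of (O - E)^2 / E
chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

print(expected[0][0])        # 20.5, the Male/Survives expected count
print(round(chi_square, 2))  # 30.14
```

Because the expected counts all exceed 5 here, the chi-square approximation is appropriate for this table.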
In our example, the critical value for χ² in this analysis is 3.84. Since our computed χ² of 30.14 clearly exceeds this critical value, we can conclude that the variable categories are indeed related, i.e., that gender is associated with survival after chest trauma (hypothetical example).

If the minimum expected count for any cell in a contingency table is less than 5, then the resulting χ² statistic may not be accurate. In this case, an alternative is needed. The alternative to the χ² test for these situations is Fisher's Exact Test. Most authors recommend using Fisher's Exact Test statistics instead of χ² whenever one or more of the expected counts in a table cell is less than 5 or when the row or column totals are very uneven.

It is important to note that both χ² and Fisher's Exact Test are nondirectional (symmetrical) tests, i.e., they make no assumptions as to directionality or cause and effect. If one is assessing the relationship between cause and effect, other nonparametric tests would need to be considered.

Testing for the Strength of Categorical Relationships

χ² and Fisher's Exact Test only test whether or not there is a relationship between categorical variables. To test the strength of such relationships we use correlation-like measures such as the contingency coefficient, the Phi coefficient, or Cramer's V. These coefficients can be thought of as Pearson product-moment correlations for categorical variables. However, unlike the Pearson r, which can assume negative values, these coefficients only range from 0 to +1 (you cannot have a 'negative' relationship between categorical variables).

The contingency coefficient (CC) is computed as follows:

    CC = √[χ² / (χ² + N)]

where χ² = the Chi-square value and N = the sample size. Unfortunately, the maximum value of the contingency coefficient varies with table size (being larger for larger tables), so it is difficult to compare the association among variables across different size tables using this coefficient. For this reason, an alternative is needed: the Phi coefficient or Cramer's V.

The Phi coefficient (φ) is a measure of nominal association applicable only to 2 x 2 contingency tables. It is calculated using the following formula:

    φ = √(χ² / N)

In our example, φ = √(30.14/100) = 0.55, a moderately strong association.

If we were conducting crosstabulation on contingency tables larger than 2 x 2, Cramer's V is the nominal association measure of choice. The formula for Cramer's V is:

    V = √[χ² / (N(k – 1))]

where N is the total number of cases and k is the lesser of the number of rows or columns. Because in 2 x 2 tables k = 2 and k – 1 = 1, Cramer's V equals Phi for 2 x 2 analyses.
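The three association measures can be sketched as small functions. This is an illustrative sketch (not from the original text), applied to the example's χ² of 30.14 with N = 100:

```python
from math import sqrt

def phi_coefficient(chi_square, n):
    """Phi for 2 x 2 tables: sqrt(chi-square / N)."""
    return sqrt(chi_square / n)

def contingency_coefficient(chi_square, n):
    """CC = sqrt(chi-square / (chi-square + N)); its maximum grows with table size."""
    return sqrt(chi_square / (chi_square + n))

def cramers_v(chi_square, n, k):
    """Cramer's V, where k is the lesser of the number of rows or columns."""
    return sqrt(chi_square / (n * (k - 1)))

chi_square, n = 30.14, 100
print(round(phi_coefficient(chi_square, n), 2))         # 0.55
print(round(contingency_coefficient(chi_square, n), 2))  # 0.48
print(round(cramers_v(chi_square, n, 2), 2))             # 0.55 (equals phi for 2 x 2)
```

Note how Cramer's V reduces to phi when k = 2, as the text states, since the k – 1 factor becomes 1.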
Ordinal (Ranked) Data Analysis

Testing for the Strength of Ordinal (Ranked) Relationships

As with continuous and nominal data, measures exist to quantify the strength of association between variables measured at the ordinal level. The two most common ordinal measures of association are Spearman's rho (ρ) and Kendall's rank order correlation coefficient, or Kendall's tau (τ). Both Spearman's rho and Kendall's tau require that the two variables, X and Y, are paired observations, with the variables measured at least at the ordinal level. Like the parametric Pearson product-moment correlation coefficient, both these measures can range between –1.0 and +1.0, with a positive correlation indicating that the ranks increase together, while a negative correlation indicates that as the rank of one variable increases, the rank of the other decreases.

Spearman's rho. If the data are at the interval or ratio level, nonparametric correlation tests like Spearman's rho and Kendall's tau simply replace these data with their ranks.* In practice, a simpler procedure is normally used to calculate ρ. The raw scores are converted to ranks, and the differences D between the ranks of each observation on the two variables are calculated. ρ is then computed as:

    ρ = 1 – [6∑D² / (N(N² – 1))]

where D = the difference between the ranks of corresponding values of X and Y, and N = the number of pairs of values.

As an example, suppose we rank a group of eight people by height and by weight (here person A is tallest and third-heaviest, and so on):

  Case            A  B  C  D  E  F  G  H
  Rank by Height  1  2  3  4  5  6  7  8
  Rank by Weight  3  4  1  2  5  7  8  6

The differences between the ranks for the 8 subjects (height rank – weight rank) are: –2, –2, 2, 2, 0, –1, –1, 2. Squaring and then summing these values:

    ∑D² = 4 + 4 + 4 + 4 + 0 + 1 + 1 + 4 = 22

Computing the denominator:

    N(N² – 1) = 8(64 – 1) = 504

And finally Spearman's ρ:

    ρ = 1 – [(6 × 22)/504] = 1 – (132/504) = 0.738

Like a Pearson r, a Spearman's rho (ρ) of 0.738 would be considered a moderately strong positive correlation, in this case indicating that as a person's height rank increases, so too does their weight rank.

Kendall's tau. An alternative measure used to test for the strength of a relationship between ordinal (ranked) variables is Kendall's rank order correlation coefficient, or Kendall's tau (τ). Kendall's tau can be computed using the following formula:

    τ = 2P / [½n(n – 1)] – 1

where P is the sum, over all the cases, of the number of cases ranked after the given item by both rankings, and n is the number of paired items. The main advantage of using Kendall's tau over Spearman's rho is that one can interpret its value as a direct measure of the probabilities of observing concordant and discordant pairs.

Using the same data as we employed to compute Spearman's rho, we note that the paired observations are already sorted in order of height, so we will compute P based on the weight data. In the Weight row of the table, the first entry, 3, has five higher ranks to the right of it, so its contribution to P is 5. Moving to the second entry, 4, we see that there are four higher ranks to the right of it, so its contribution to P is 4. Continuing this way, we find that

    P = 5 + 4 + 5 + 4 + 3 + 1 + 0 + 0 = 22, and thus 2P = 2 × 22 = 44

Computing the denominator:

    ½n(n – 1) = 4(8 – 1) = 4 × 7 = 28

And finally computing Kendall's tau:

    τ = (44/28) – 1 = 1.571 – 1 = 0.571

Again, we see a positive correlation between the height and weight ranks, albeit less strong than that revealed by Spearman's rho.

* Spearman's rho is simply a special case of the Pearson product-moment coefficient in which the data are converted to ranks before calculating the coefficient.
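The Spearman and Kendall computations for the height/weight example can be cross-checked with a short script. This is an illustrative plain-Python sketch assuming no tied ranks (scipy.stats.spearmanr and scipy.stats.kendalltau give the same answers on this data):

```python
# Spearman's rho and Kendall's tau for the eight-person
# height/weight ranking example.

height = [1, 2, 3, 4, 5, 6, 7, 8]
weight = [3, 4, 1, 2, 5, 7, 8, 6]
n = len(height)

# Spearman: rho = 1 - 6*sum(D^2) / (N(N^2 - 1))
sum_d2 = sum((h - w) ** 2 for h, w in zip(height, weight))
rho = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Kendall: classify every pair of cases as concordant or discordant
products = [(height[i] - height[j]) * (weight[i] - weight[j])
            for i in range(n) for j in range(i + 1, n)]
p = sum(1 for x in products if x > 0)   # concordant pairs (the text's P)
q = sum(1 for x in products if x < 0)   # discordant pairs
tau = (p - q) / (n * (n - 1) / 2)

print(sum_d2)         # 22
print(round(rho, 3))  # 0.738
print(p)              # 22
print(round(tau, 3))  # 0.571
```

Counting concordant minus discordant pairs over all ½n(n – 1) pairs is algebraically the same as the text's "higher ranks to the right" procedure when one variable is presorted.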
Testing for Group Differences on Ordinal (Ranked) Data

There are many times when researchers want to compare two or more groups on an outcome that is measured at the ordinal level (as opposed to interval or ratio data). Alternatively, interval or ratio-level measurements on groups may be so skewed as to make regular parametric analysis impossible. In these cases, comparable nonparametric approaches to traditional t-testing or analysis of variance (ANOVA) are needed.

Comparing Two Groups by Ranks – the Mann-Whitney U Test. The Mann-Whitney U test (also known as the Wilcoxon Rank Sum Test) is a nonparametric test used to determine whether two samples of ordinal/ranked data differ.* It is the nonparametric equivalent to conducting an independent t-test comparing two groups on a normally distributed continuous variable. The Mann-Whitney U test ranks all the cases for each of the two groups from the lowest to the highest value. Then a mean rank, sum of ranks, and 'U' score are computed for each group. Two U scores are computed: U1 and U2. U1 is defined as the number of times that a score from group 1 is lower in rank than a score from group 2; likewise, U2 is defined as the number of times that a score from group 2 is lower in rank than a score from group 1. U1 and U2 are computed as follows:

    U1 = n1n2 + [n1(n1 + 1)]/2 – R1
    U2 = n1n2 + [n2(n2 + 1)]/2 – R2

where:
    n1 = number of observations in group 1
    n2 = number of observations in group 2
    R1 = sum of ranks assigned to group 1
    R2 = sum of ranks assigned to group 2

* If the data are at the interval or ratio level, nonparametric tests like the Mann-Whitney U simply replace these data with their ranks. However, if the sample data are continuous and normally distributed, then nonparametric tests like the Mann-Whitney U test should not be employed, since they are less powerful than their parametric equivalents and thus more likely to miss a true difference between groups.
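The U formulas can be sketched in plain Python. The two samples below are hypothetical (made up purely for illustration), and the Z-score used later in the text to judge significance is computed with the standard large-sample normal approximation, which is an assumption on my part since the source does not show that formula explicitly:

```python
# Mann-Whitney U from the rank-sum formulas in the text.
from collections import defaultdict
from math import sqrt

def average_ranks(pooled):
    """Map each value to its rank (1 = lowest), averaging tied ranks."""
    positions = defaultdict(list)
    for i, v in enumerate(sorted(pooled), start=1):
        positions[v].append(i)
    return {v: sum(p) / len(p) for v, p in positions.items()}

group1 = [3, 4, 2, 6]      # hypothetical ordinal scores
group2 = [9, 7, 5, 10]
n1, n2 = len(group1), len(group2)

rank_of = average_ranks(group1 + group2)
r1 = sum(rank_of[v] for v in group1)   # sum of ranks, group 1
r2 = sum(rank_of[v] for v in group2)   # sum of ranks, group 2

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
u = min(u1, u2)                        # the Mann-Whitney U statistic
print(r1, r2, u1, u2, u)               # 11.0 25.0 15.0 1.0 1.0

# Assumed large-sample normal approximation (ties ignored); shown only
# for illustration here, since the text requires N > 20 to use it.
z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
```

A useful sanity check on any implementation is that U1 + U2 always equals n1 × n2.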
The following table summarizes the nonparametric equivalents to traditional t-testing or ANOVA:

  Purpose or Need                                    Parametric Approach         Nonparametric Approach
  To analyze differences between 2 independent       Independent t-test          Mann-Whitney U test (aka
  groups                                                                         Wilcoxon rank-sum test)
  To analyze differences between 2 related           Paired (dependent) t-test   Wilcoxon signed rank test
  groups (repeated measures)                                                     for paired data
  To analyze differences between 3 or more           One-way ANOVA (F-test)      Kruskal-Wallis ANOVA
  independent groups
  To analyze differences between 3 or more           Repeated measures ANOVA     Friedman two-way ANOVA
  related groups (repeated measures)

Adapted from: Dallal, G.E. The Little Handbook of Statistical Practice, available at: http://www.tufts.edu/~gdallal/LHSP.HTM
The Mann-Whitney U statistic is defined as the smaller of U1 or U2. The Wilcoxon W statistic (Wilcoxon rank-sum test) is simply the smaller of the two groups' sums of ranks. Since the sampling distributions for both the U and W statistics approach that of a normal curve (as long as N > 20), we can use a simple Z-score to judge the significance of group differences in ranks. If the rank distributions are identical to one another, then the Z-score will equal 0. Positive Z-scores indicate that the sums of the ranks of group 2 are greater than those of group 1, while negative Z-scores indicate the opposite, i.e., that the sums of the ranks of group 2 are less than those of group 1. At the normal confidence level of 0.05, any Z-score greater than ±1.96 indicates a statistically significant difference in the distribution of ranks.

Note that if the observations are paired instead of independent of each other (e.g., a pre/post measure conducted on the same subjects), then we use the Wilcoxon signed rank test for paired data (not to be confused with the Wilcoxon rank-sum test described above) instead of the Mann-Whitney U test.

Comparing More than Two Groups by Ranks – the Kruskal-Wallis Test. The Kruskal-Wallis test is a generalization of the Wilcoxon rank sum test, a nonparametric test used to determine whether more than two groups of ordinal/ranked data differ. It is the nonparametric equivalent to conducting a one-way ANOVA comparing multiple groups on a normally distributed continuous variable. The Kruskal-Wallis statistic, H, is computed as follows, with the result being compared to a critical value in the Chi-square distribution:

    H = [12 / (N(N + 1))] × ∑(Ri² / ni) – 3(N + 1)

where (summing over the k groups):
    k = number of samples (groups)
    ni = number of observations for the i-th sample or group
    N = total number of observations (sum of all the ni)
    Ri = sum of ranks for group i

As an example, consider the following comparison of four diet plans (labeled as plans A, B, C & D) enrolling a total of 19 patients. The observations represent kilograms of weight lost over a 3-month period.
The first step in conducting a Kruskal-Wallis analysis is to rank order ALL the observations from lowest (1) to highest (19) and then sum the ranks for each plan:

  Plan    Ranks                  Sum of Ranks
  A       17, 19, 14, 15         65
  B       10, 4.5, 6, 13, 8      41.5
  C       2, 4.5, 3, 7, 1        17.5
  D       11, 9, 12, 16, 18      66

Based on the sum of ranks for each group, we apply the computation formula for H:

    H = [12/(19 × 20)] × [(65²/4) + (41.5²/5) + (17.5²/5) + (66²/5)] – 3(19 + 1) = 73.678 – 60 = 13.678

Last, using a Chi-square table, we determine that the critical value for three degrees of freedom (degrees of freedom = # groups – 1) is 7.815. Since 13.678 is greater than this critical value, we reject the null hypothesis and conclude that the rankings of weight loss do differ among the four diet plans. Inspecting the sum of ranks suggests that plans A and B are the best (and nearly equivalent), whereas plan C ranks lowest in weight loss.

Note that if the observations are repeated more than once (e.g., pre-test, post-test, other follow-up), then we cannot use the Kruskal-Wallis test and instead must use a nonparametric alternative to the repeated-measures ANOVA, Friedman's two-way ANOVA.
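The H computation for the diet-plan example can be verified with a short script. This is an illustrative sketch working from the rank sums in the example; as in the text's hand computation, the correction for tied ranks is omitted:

```python
# Kruskal-Wallis H from the diet-plan rank sums:
# H = [12 / (N(N+1))] * sum(Ri^2 / ni) - 3(N + 1)

rank_sums = {'A': 65, 'B': 41.5, 'C': 17.5, 'D': 66}
group_sizes = {'A': 4, 'B': 5, 'C': 5, 'D': 5}
n_total = sum(group_sizes.values())   # 19 patients in all

h = (12 / (n_total * (n_total + 1))) * sum(
    rank_sums[g] ** 2 / group_sizes[g] for g in rank_sums
) - 3 * (n_total + 1)

print(round(h, 3))  # 13.678, which exceeds the df = 3 critical value of 7.815
```

With raw observations in hand, scipy.stats.kruskal performs the same test (including the tie correction) directly on the data.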
Reference Bibliography

Agresti, A. (1996). Introduction to categorical data analysis. New York: Wiley.
Altman, D.G. (1991). Comparing groups – categorical data (Chapter 10). In Practical statistics for medical research. Boca Raton, FL: Chapman & Hall.
Becker, L.A. (1999a). Crosstabs: Measures for nominal data. University of Colorado at Colorado Springs. http://web.uccs.edu/lbecker/SPSS/ctabs1.htm
Becker, L.A. (1999b). Crosstabs: Measures for ordinal data. University of Colorado at Colorado Springs. http://web.uccs.edu/lbecker/SPSS80/ctabs2.htm
Becker, L.A. (1999c). Nonparametric tests. University of Colorado at Colorado Springs. http://web.uccs.edu/lbecker/spss80/nonpar.htm
Connor-Linton, J. (2003). Chi-square tutorial. Georgetown University. http://www.georgetown.edu/faculty/ballc/webtools/web_chi_tut.html
Conover, W.J. (1999). Practical nonparametric statistics (3rd ed.). New York: Wiley.
Dallal, G.E. (2004a). Contingency tables. In The little handbook of statistical practice. http://www.tufts.edu/~gdallal/ctab.htm
Dallal, G.E. (2004b). Nonparametric statistics. In The little handbook of statistical practice. http://www.tufts.edu/~gdallal/npar.htm
Daniel, W.W. (1990). Applied nonparametric statistics (2nd ed.). Boston: PWS-Kent.
Daniel, W.W. (2005). The Chi-square distribution and the analysis of frequencies (Chapter 12). In Biostatistics: A foundation for analysis in the health sciences (8th ed.). New York: Wiley.
Daniel, W.W. (2005). Nonparametric and distribution-free statistics (Chapter 13). In Biostatistics: A foundation for analysis in the health sciences (8th ed.). New York: Wiley.
deRoche, J. (2004). Measures of association. Cape Breton University. http://anthrosoc.capebretonu.ca/Measures%20of%20association.htm
Deshpande, J.V., Gore, A.P., & Shanubhogue, A. (1995). Statistical analysis of non-normal data. New York: Wiley.
Field, A. (2000). Categorical data (Chapter 16). In Discovering statistics using SPSS. London: Sage Publications.
Friel, C.M. (2004a). Nonparametric correlation techniques. Sam Houston State University. http://www.shsu.edu/~icc_cmf/cj_685/mod9.doc
Friel, C.M. (2004b). Testing for differences between two groups: Nonparametric tests. Sam Houston State University. http://www.shsu.edu/~icc_cmf/cj_685/mod12.doc
Garson, G.D. (1998a). Chi-square significance tests. In Statnotes: Topics in multivariate analysis. http://www2.chass.ncsu.edu/garson/pa765/chisq.htm
Garson, G.D. (1998b). Fisher exact test of significance. In Statnotes: Topics in multivariate analysis. http://www2.chass.ncsu.edu/garson/pa765/fisher.htm
Garson, G.D. (1998c). Nominal association: Phi, contingency coefficient, Tschuprow's T, Cramer's V, lambda, uncertainty coefficient. In Statnotes: Topics in multivariate analysis. http://www2.chass.ncsu.edu/garson/pa765/assocnominal.htm
Garson, G.D. (1998d). Ordinal association: gamma, Kendall's tau-b and tau-c, Somers' d. In Statnotes: Topics in multivariate analysis. http://www2.chass.ncsu.edu/garson/pa765/assocordinal.htm
Garson, G.D. (1998e). Tests for two independent samples: Mann-Whitney U, Kolmogorov-Smirnov Z, & Moses extreme reactions tests. In Statnotes: Topics in multivariate analysis. http://www2.chass.ncsu.edu/garson/pa765/mann.htm
Gibbons, J.D. (1993). Nonparametric measures of association. Quantitative Applications in the Social Sciences series. Thousand Oaks, CA: Sage Publications.
Gibbons, J.D. (1993). Nonparametric statistics: An introduction. Quantitative Applications in the Social Sciences series. Thousand Oaks, CA: Sage Publications.
Gibbons, J.D., & Chakraborti, S. (1992). Nonparametric statistical inference (3rd ed.). New York: Marcel Dekker.
Hollander, M., & Wolfe, D.A. (1999). Nonparametric statistical methods (2nd ed.). New York: Wiley.
Indiana University (2001). Crosstabulation & chi square. http://www.indiana.edu/~educy520/sec5982/week_12/chi_sq_summary011020.pdf
Lehmkuhl, L.D. (1996). Nonparametric statistics: Methods for analyzing data not meeting assumptions required for the application of parametric tests. J Prosthetics Orthotics, 8, 105-113.
Noether, G.E. (1991). Introduction to statistics: The nonparametric way. New York: Springer-Verlag.
Norman, G.R., & Streiner, D.L. (2000). Test of significance for categorical frequency data (Chapter 20). In Biostatistics: The bare essentials (2nd ed.). Hamilton, Ontario: B.C. Decker.
Norman, G.R., & Streiner, D.L. (2000). Measures of association for categorical data (Chapter 21). In Biostatistics: The bare essentials (2nd ed.). Hamilton, Ontario: B.C. Decker.
Norman, G.R., & Streiner, D.L. (2000). Tests of significance for ranked data (Chapter 22). In Biostatistics: The bare essentials (2nd ed.). Hamilton, Ontario: B.C. Decker.
Norman, G.R., & Streiner, D.L. (2000). Measures of association for ranked data (Chapter 23). In Biostatistics: The bare essentials (2nd ed.). Hamilton, Ontario: B.C. Decker.
Pett, M.A. (1997). Nonparametric statistics in health care research: Statistics for small samples and unusual distributions. Thousand Oaks, CA: Sage Publications.
Reynolds, H.T. (1984). Analysis of nominal data (2nd ed.). Sage Series on Quantitative Applications in the Social Sciences. Newbury Park, CA: Sage.
Siegel, S., & Castellan, N.J. (1988). Nonparametric statistics for the behavioral sciences (2nd ed.). New York: McGraw-Hill.
StatSoft (2006). Nonparametric statistics. In Electronic statistics textbook. Tulsa, OK: StatSoft, Inc. http://www.statsoft.com/textbook/stnonpar.html
van Belle, G., Fisher, L.D., Heagerty, P.J., & Lumley, T. (2004). Categorical data: Contingency tables (Chapter 7). In Biostatistics: A methodology for the health sciences (2nd ed.). New York: Wiley.
van Belle, G., Fisher, L.D., Heagerty, P.J., & Lumley, T. (2004). Nonparametric, distribution-free and permutation models: Robust procedures (Chapter 8). In Biostatistics: A methodology for the health sciences (2nd ed.). New York: Wiley.
Weaver, B. (2005). Analysis of categorical data (Chapter 2). Northern Ontario School of Medicine. http://www.angelfire.com/wv/bwhomedir/notes/categorical.pdf
Weaver, B. (2005). Nonparametric tests (Chapter 3). Northern Ontario School of Medicine. http://www.angelfire.com/wv/bwhomedir/notes/nonpar.pdf
Williams, R. (2004). Categorical data analysis. Department of Sociology, University of Notre Dame. http://www.nd.edu/~rwilliam/stats1/x51.pdf