
quantpsy.org

Calculation for the Chi-Square Test: An interactive calculation tool for chi-square tests of
goodness of fit and independence
Kristopher J. Preacher (Vanderbilt University)
How to cite this page
This web utility may be cited in APA style in the following manner:
Preacher, K. J. (2001, April). Calculation for the chi-square test: An interactive calculation tool for
chi-square tests of goodness of fit and independence [Computer software]. Available from
http://quantpsy.org.
The purpose of this page
This web page is intended to provide a brief introduction to chi-square tests of independence and
goodness-of-fit. These tests are used to detect group differences using frequency (count) data.
This page also provides an interactive tool allowing researchers to conduct chi-square tests for
their own research. Any introductory applied statistics text should have a good description of these
chi-square tests, but following is a condensed introduction.
About the chi-square test of independence
Often a researcher wishes to see if the frequency of cases possessing some quality varies among
levels of a given factor or among combinations of levels of two or more factors. In such situations,
the appropriate test is the chi-square test of goodness of fit or the chi-square test of independence
for k groups.
© 2010-2020, Kristopher J. Preacher

How it's done
To conduct the chi-square test, the researcher enters observed frequencies corresponding to
combinations of levels of relevant factors (here, called "condition" and "group," but these are labels
of convenience). Sums of elements within rows and within columns are then computed (call these
marginal Ns). The chi-square test of independence is used to test the null hypothesis that the
frequency within cells is what would be expected, given these marginal Ns. The chi-square test of
goodness of fit is used to test the hypothesis that the total sample N is distributed evenly among
all levels of the relevant factor.
The expected value within each cell, if the null condition is true (i.e., if the factors have no
significant influence on observed frequencies in the population), is simply the product of the row
total and column total divided by the overall sample N for the test of independence, and N divided
by the number of levels of the single factor for the test of goodness of fit. If Oij is the observed
frequency and Eij the expected frequency for the cell corresponding to the ith condition and the jth
group, then chi-square is:

$$\chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

If there is only one factor of interest with (k > 1) levels, the same formula will work, with i or j
being set to 1. The test presented here can be used to test only 1- or 2-dimensional arrays. Arrays
of higher dimension are possible, and are based on the same principle and even use the same
formula, although they involve multiple nested summations.
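As a rough illustration of the computation described above, the following Python sketch derives the expected frequencies from the marginal Ns and sums the chi-square terms for a two-way table. The function name and the example counts are made up for illustration and are not part of this page's tool; the degrees of freedom use the standard (rows - 1) x (columns - 1) rule, which is assumed here rather than stated above.

```python
def chi_square_independence(observed):
    """Chi-square test of independence for a rectangular table of counts.

    observed: list of rows (conditions), each a list of counts (groups).
    Returns (chi_square, df, expected). Hypothetical helper for illustration.
    """
    row_totals = [sum(row) for row in observed]        # marginal Ns for rows
    col_totals = [sum(col) for col in zip(*observed)]  # marginal Ns for columns
    n = sum(row_totals)                                # overall sample N

    # Expected frequency = (row total * column total) / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (O - E)^2 / E over all cells
    chi_square = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Standard degrees of freedom for a two-way table (assumed convention)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df, expected


# Made-up 2 x 2 example: two conditions (rows) by two groups (columns)
table = [[10, 20], [30, 40]]
chi_square, df, expected = chi_square_independence(table)
print(chi_square, df, expected)
```

For this example the expected frequencies are 12, 18, 28, and 42, so every cell contributes a squared deviation of 4 divided by its expected frequency.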
How to use this page
Input observed frequencies into the white cells. I realize that not very many designs involve
exactly 10 conditions and 10 groups - if your design is smaller, then choose some subset of rows
and columns in which to enter your data. For example, if your design is (2 x 3), then you may
choose to enter your data in the 6 cells in the upper left portion of the data table, defined by the
first two Conditions and the first three Groups. You can choose any subset of rows and columns for
your data. You can also opt to leave cells corresponding to observed frequencies of zero blank.
Non-integer observed frequencies are allowed, although it is difficult to imagine how one would
obtain these in actual research.
If you are performing a test of goodness of fit, you may choose to enter your data in any single
column or row. However, observed zero frequencies need to be explicitly included (i.e., you'll need
to actually type "0" in those cells, otherwise it is assumed that those cells are not part of your
design). Once you have entered your data, click on the Calculate button and expect to see output
in the beige cells (they should be white if you are using older versions of Netscape). Do not panic if
you see scientific notation for your p-value - that simply means that p is really small.
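The difference between a blank cell and an explicit zero in a goodness-of-fit test can be sketched as follows. This is only an illustration under stated assumptions: blank cells are represented by `None`, the function name is hypothetical, expected frequencies are N divided by the number of levels, and the degrees of freedom are taken as the number of levels minus one.

```python
def goodness_of_fit_equal(cells):
    """Goodness-of-fit test against an even split of N across levels.

    cells: one row (or column) of the data table; None marks a blank cell
    that is not part of the design, while 0 is an observed zero frequency.
    Returns (chi_square, df). Hypothetical helper for illustration.
    """
    observed = [c for c in cells if c is not None]  # drop blank cells only
    k = len(observed)
    if k < 2:
        raise ValueError("need at least two levels")
    n = sum(observed)
    expected = n / k  # N divided by the number of levels
    chi_square = sum((o - expected) ** 2 / expected for o in observed)
    return chi_square, k - 1


# Made-up data: three levels, one of which has an observed frequency of zero
with_zero = [12, 0, 18, None, None, None, None, None, None, None]
# Same data with the zero left blank: the level silently drops out
zero_left_blank = [12, None, 18, None, None, None, None, None, None, None]

print(goodness_of_fit_equal(with_zero))
print(goodness_of_fit_equal(zero_left_blank))
```

With the zero typed in, the design has three levels and an expected frequency of 10 per level; with the cell left blank, it has two levels and an expected frequency of 15, which gives a very different statistic.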
This tool also yields a chi-square incorporating Yates' correction for continuity. This correction is
often employed to improve the accuracy of the null-condition sampling distribution of chi-square. It
probably should be used only for 1-df tests (i.e., goodness of fit tests or tests of independence with
2x2 contingency tables), so use at your own risk for tests with df>1.
Warnings
Use of the chi-square tests is inappropriate if any expected frequency is below 1 or if the expected
frequency is less than 5 in more than 20% of your cells. The status cell at the bottom of the table
will let you know if there is a problem. In the 2 x 2 case of the chi-square test of independence,
expected frequencies less than 5 are usually considered acceptable if Yates' correction is employed.
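The two warning rules above could be checked along the following lines. This is a minimal sketch, not the page's actual script: the function name and the wording of the problem messages are hypothetical (only "Status okay" appears on this page), and the 2 x 2 exception for Yates' correction is not modeled.

```python
def expected_frequency_status(expected_cells):
    """Check a flat list of expected frequencies against the two rules.

    Returns "Status okay" or a hypothetical problem message.
    """
    cells = list(expected_cells)

    # Rule 1: no expected frequency may be below 1
    if any(e < 1 for e in cells):
        return "Problem: an expected frequency is below 1"

    # Rule 2: no more than 20% of cells may have an expected frequency below 5
    # (small / len > 1/5, written with integers to avoid rounding issues)
    small = sum(1 for e in cells if e < 5)
    if small * 5 > len(cells):
        return "Problem: more than 20% of expected frequencies are below 5"

    return "Status okay"


print(expected_frequency_status([12, 18, 28, 42]))
print(expected_frequency_status([0.5, 10, 10, 10]))
print(expected_frequency_status([4, 10, 10, 10]))
```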
           Gp 1  Gp 2  Gp 3  Gp 4  Gp 5  Gp 6  Gp 7  Gp 8  Gp 9  Gp 10
Cond. 1:
Cond. 2:
Cond. 3:
Cond. 4:
Cond. 5:
Cond. 6:
Cond. 7:
Cond. 8:
Cond. 9:
Cond. 10:

Calculate   Reset all

Output:
Chi-square:
degrees of freedom:
p-value:
Yates' chi-square:
Yates' p-value:
Status: Status okay
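For readers who want to see how the p-value and Yates' fields might be obtained, here is a hedged sketch using only the Python standard library. The function names are hypothetical and this is not the page's own JavaScript. The p-value is the upper-tail area of the chi-square distribution, computed here from closed forms for 1 and 2 degrees of freedom plus a standard recurrence. The Yates' statistic uses the usual textbook form, in which 0.5 is subtracted from each absolute deviation before squaring; whether the tool applies it in exactly this way is an assumption.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-half)              # exact result for df = 2
    else:
        a = 0.5
        q = math.erfc(math.sqrt(half))   # exact result for df = 1
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1), with t = x / 2
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def yates_chi_square(observed, expected):
    """Chi-square with Yates' continuity correction (textbook form)."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


# Made-up 2 x 2 table, flattened; expected values follow from its marginal Ns
observed = [10, 20, 30, 40]
expected = [12.0, 18.0, 28.0, 42.0]
df = 1

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi_square_sf(chi_square, df)
yates = yates_chi_square(observed, expected)
yates_p = chi_square_sf(yates, df)
print(chi_square, p_value, yates, yates_p)
```

Because the correction shrinks each deviation, the Yates' chi-square is smaller than the uncorrected value and its p-value is larger.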

"Custom" expected frequencies


When using the chi-square goodness of fit test, sometimes it is useful to be able to specify your
own expected frequencies. If there is a theoretical reason for doing so, the following table will allow
you to enter your own Eij's. Non-integer expected frequencies are allowed. Use as many cells in
this table as necessary, making sure that (1) the marginal total is the same for both observed and
expected frequencies, (2) there are no expected frequencies less than 1, and (3) no more than
20% of your expected frequencies are less than 5. If a frequency is entered in an Observed cell,
then a frequency must also be entered in the corresponding Expected cell (and vice versa).
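The pairing and marginal-total requirements just listed can be sketched in Python as follows. The function name, the use of `None` for blank cells, and the example counts are assumptions made for illustration; the degrees of freedom are taken as the number of filled cells minus one.

```python
import math


def goodness_of_fit_custom(observed, expected):
    """Goodness-of-fit test with user-supplied expected frequencies.

    observed, expected: equal-length lists; None marks a blank cell.
    Returns (chi_square, df). Hypothetical helper for illustration.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected rows must have the same length")

    pairs = []
    for o, e in zip(observed, expected):
        # An Observed entry requires an Expected entry, and vice versa
        if (o is None) != (e is None):
            raise ValueError("each Observed cell needs a matching Expected cell")
        if o is not None:
            pairs.append((o, e))
    if len(pairs) < 2:
        raise ValueError("need at least two filled cells")

    # Requirement (1): the marginal totals must agree
    total_observed = sum(o for o, _ in pairs)
    total_expected = sum(e for _, e in pairs)
    if not math.isclose(total_observed, total_expected):
        raise ValueError("observed and expected totals must match")

    chi_square = sum((o - e) ** 2 / e for o, e in pairs)
    return chi_square, len(pairs) - 1


# Made-up example: N = 160 cases tested against a theoretical 9:3:3:1 ratio
observed = [85, 35, 28, 12, None, None, None, None, None, None]
expected = [90, 30, 30, 10, None, None, None, None, None, None]
chi_square, df = goodness_of_fit_custom(observed, expected)
print(chi_square, df)
```

Requirements (2) and (3), the limits on small expected frequencies, are left out of this sketch to keep it short.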

           Gp 1  Gp 2  Gp 3  Gp 4  Gp 5  Gp 6  Gp 7  Gp 8  Gp 9  Gp 10
Observed:
Expected:

Calculate   Reset all

Output:
Chi-square:
degrees of freedom:
p-value:
Yates' chi-square:
Yates' p-value:
Status: Status okay

Acknowledgments
Original version posted April, 2001. My thanks to Nancy Briggs and Rebecca White for scripting
help and to Derek Rucker, Geoffrey Leonardelli, and Tom Nygren for testing earlier versions of this
page. Free JavaScripts provided by The JavaScript Source and John C. Pezzullo.
