Professional Documents
Culture Documents
where r = number of rows; c = number of partitions and comparisons for any given table.
columns in the whole table; m = number of An example of such a different kind of scheme
columns in the partitioned table, m < c; w,y is illustrated in Figure 2 for a 4 X 4 table.
= observed frequency in cell ij;
Calculation of Chi-Square for Partitions
= Z Irwin (1949) has presented a general
»'=! formula for the (r — l)(c— 1) independent
2 X 2 tables partitioned from the whole table :
Wi. = Z «</
N
• * = Z
L
iw w»
^—r Z tv
*-** n-Ji = Z
T-3 = Z *—' nt*•
X (011022 + #22011 — #12021 — 021012)2 [3]
s=l /=! j=l »=1
where «„ = observed frequency in cell ij of
This procedure has greater generality than the 2 X 2 table (i, j = 1, 2) ; ef. = the sum
those reviewed by Castellan (1965) because it of the expected frequencies calculated from the
is not restricted to partitions of one degree of margins of the original table for cells il plus i2
freedom from r X c tables or to partitions of (i = 1,2); e.j = the sum of the expected fre-
more than one degree of freedom from 2X N quencies calculated from the margins of the
tables. The equation may be used to determine original table for cells Ij plus 2j (j = 1,2); and
the contribution of a portion of the table to
the overall chi-square, but with the require- E = Z 0i. = Z e.i
i=-l,2 J-=1,2
ment that only the number of rows or only the
number of columns has been reduced. This formula reduces to the usual expression
for a 2 X 2 table if the expected frequencies
Determination of Permissible Partitions are calculated from the margins of the 2 X 2
table itself; and not, as here, from the margins
Irwin (1949), Kimball (1954), Lancaster of the original table.
(1949, 1950), and Maxwell (1961) have dis- To illustrate the use of this formula, the
cussed a procedure for determining the permis- 4 X 4 table discussed previously (Figs. 1 and
sible 2 X 2 partitions. The scheme entails 2) is now presented with actual numbers and
beginning in one corner of the table and chi-square values calculated for each of the
systematically isolating and eliminating the partitions (Fig. 3). For both methods of parti-
single element appearing in that corner. For tioning, the sum of the Xi2 is equal to the
every row and column, this procedure results total X92.
in comparisons between the first cell and all Snedecor (1956) has shown that the formula
succeeding cells pooled, between the second
cell and all succeeding cells pooled, and so on, (Y
\&-t XjPj/) -
"II "12 "13 "14 "l. "II "12 "13 "14 "l. "12 "13 "14 V
n2| "22 "23 "24 "2. "21 "22 "23 "24 "2. "22 "23 "24 V
n n
"31 "32 "33 34 3. "31 "32 "33 "34 °3. ":
"32 "33 "34 3
"41 "42 "43 "44 "4. "41 "42 "43 "44 "4. "42 "43 "44 n4'.
a, "•2 "'3 "'4 N n., n. 2 11.3 n.4 N "•2 n.3 n.4 Nc'
2
x; x, x;
n,,,
"13 "14 "21 "22 "23 "24 "2. "31 "32 "33 "34 "3.
"23 "24 "2'.' n n
"31 "32 33 34 "3. "41 "42043044 "4.
"33 "34 "3." "41 "42 "43 "44 "4. n.|" ag" 1.3-04"
NR"
a,- n.2- n. 3-0.4-
"43 "44 "4" N
R'
n
-3 "•4 N C "
x; x; x;
Og - n« H
"22 "23 "24 "2'. "23 "24 "32 "33 "34 "3.' "33 "34 A
"32 "33 "34 "3.' "33 "34 "3." "42 "43 "44 V "43 "44 "4."
"42 "43 "44 V. "43 "44 "4" "•2" o.3" "4" NR"C' "•3" ".4- NR"C'
"•2' n.3- 0.4- "-3 "•4- NR'C"
1
NR'C'
x;
FIG. 1. Example of partitioning in a 4 X 4 table using the conventional method of successively
eliminating one column and one row.
can be used for the total chi-square oia.2 X N a 2 X 2 partition from a 2 X N table. If this
contingency table. An equivalent formula for new expression is converted from proportions
a 2 X N table using frequencies can be found to frequencies, the following formula results:
in Walker and Lev (1953). Cochran (1954)
further noted that when the denominator is ^2 _ v (/° ~ f^ rr-i
kept constant, the formula is equal to Xi2 for Fe
PARTITIONING CHI-SQUARE CONTINGENCY TABLES 255
where /<, = observed frequency of partitioned meet the criterion of either complete rows or
table; /« = expected frequency of partitioned columns, and the chi-square values for these
table calculated from marginal totals of parti- partitions can be obtained by either formula.
tioned table; and Fe = expected frequency of Equation 5 is also equivalent to Equation 2
partitioned table calculated from marginal (the proof requires only algebraic manipula-
totals of original table. tions) since for any partition which includes
It can be demonstrated that this formula is all the categories of one dimension, the sum of
applicable to 2 X 2 partitions, not only from the expected frequencies as determined by the
a 2 X N table, but from any r X c table, the original marginal totals is equal to the sum of
sole restriction being that only one dimension the observed frequencies. An alternative
has been reduced. That is, chi-square calcu- statement is that the sum of the expectations
lated by this formula equals chi-square is equal to the expectation of the sum; and,
calculated by Irwin's formula if one or more an even stronger statement is that the sum of
rows (or columns) have been sliced off, the original expectations is equal to the sum
provided that all columns (or rows) remain. of the observed in either each row or each
The first five partitions for the 4 X 4 table column (but not both). This relationship does
n
ll n |2 "13 "14 "I. n
l l "12 "13 "14 "I. " I I "12 n,,,
"21 "22 "23 "24 "2. "21 "22 "23 "24 "2. "21 °22 V
n
"31 "32 "33 184 "3. "31 °32 33 "34 "3. "31 "32 "3."
"41 "42 "43 "44 "4. "41 "42 "43 "44 "4. "41 "42 V
n., n.2 "•3 "•4 N n., n.2 n.3 n.4 N a. n.2 Ncn
x; x, x;
"13 n f4 •Y
"23 "24 trio"
n
"ll "12 !3 "14 "l. "31 "32 "33 n34
"21 "22 "23 "24 "2. "41 "42 "43 "44 "33 n34 "V
n,|" n.2" n.3» n.4» '43 "44 1114"
NRn
a
3 n.4 N Cm
x; x, x;
n
ll "12 "I." "31 "32 n3." "13 ",4 •Y "33 "34 m3,"
m
"2. "22 V "41 "42 V "23 "24 m2" "43 "44 4"
n.," "•2" N m.r m.2" N nj. ".4" m.j, m,4,, N
RnCn RmCn N
RnCm RmCm
Xi X, X,
F;o. ?. Example of a, partitioned 4 X 4 table ijsing a variant of the conventional method of partitioning,
256 JEAN L. BRESNAHAN AND MARTIN M. SHAPIRO
0.801901 2.992469
X,
1.836734 3.956341 0.000831
X
1.069129 0.033535 4.685159 I . 1871 14
FIG. 3a. Table from Figure 1, with numbers substituted for symbols.
not exist when both dimensions have been is easier to calculate than Irwin's equivalent
reduced. Nevertheless, it should be apparent formula because in a 2 X 2 table all the
that reducing only the rows or only the squared deviations are equal. Apart from con-
columns may not provide sufficient informa- venience, however, this formula has the
tion. It shall now be shown that these reduc- important advantage of not being restricted
tions are only first steps in a complete analysis. to 2 X 2 tables. This fact enables it to have an
application not directly available by Irwin's
Additional Applications and Advantages of the formula. If a researcher's interest lies in a
Alternative Equation particular configuration consisting of all the
The expression rows (or columns) of the original table, with
one or more of the columns (or rows) elimi-
Fe nated, then for this partition considered as a
PARTITIONING CHI-SQUARE CONTINGENCY TABLES 257
16.563214 1.237229
xr
2.559479 3,547708 0.477911
-1-
X|= 1-069129
"V = 16.563213
X1 0.033535
/V 5.873104
9
FIG. 4a. Table for six degrees of freedom remaining
after the first row has been removed from the 4 X 4
table of Figure 3, and the chi-square values for each of
the partitions.
X- 563213
en = expected frequency for cell ij calculated FIG. 4b. Table for three degrees of freedom remaining
from the margins of the original table after the first two rows have been removed from the
4 X 4 table from Figure 3, and the chi-square values
m for each of the partitions.
e<. = E «« FIG. 4. Examples of partitioning in which some of the
components have more than one degree of freedom.
i=l j-l
[7] (E
l
(i-l)(7JI-l
Let Oi. and o.j be the observed margin and (c) a further correction for having counted
totals in the partitioned table, and 0 be the the total observed deviation from expectation
total observed frequency in the partitioned twice. That is, it is a test of independence
table for which E is the total expected fre- corrected for lack of goodness-of-fit in the
quency ; then Equation 9 becomes margins.
3=1
TABLE 1 TABLE 3
Ps OT CC Ps OT CC
Or 19 80 75
236.0 560.0 646.0 1442.0
Sc 121 344 382
X22 = 21.583895, p < .001
Se 18 11 141
same as they would be if the small expectations
were pooled or discarded completely, but the
gories do follow a natural ordering, and pooling value of the statistic is now exact.2
may be a legitimate operation, as in goodness- It is interesting to note that although the
of-fit tests. There are also numerous occasions requirement of exhaustiveness has been re-
when an ordering seems apparent but is, in peatedly mentioned in the literature, it has
fact, imposed by the researcher and not the not been elaborated upon and appears to have
natural events. been treated more as a theoretical ideal than
The alternative procedure of discarding one to be sought in practice. Perhaps this
categories is patently an inefficient procedure cursory treatment of the problem in the past
since all data should be utilized. More im- indicated the lack of a satisfactory solution.
portant, the discarding of categories fails to
satisfy the requirement of exhaustiveness.
The errors caused by either the pooling of AN EXAMPLE
data or the use of nonexhaustive categories It is perhaps useful to present a real-life
can be eliminated by the technique suggested example to demonstrate the utility of these
in this paper. That is, if one or more rows techniques and the manner in which the general
and/or columns consist of cells which do not equation for chi-square facilitates the computa-
have desired minimum expectations, instead tion. The example to be used is Table 35 from
of discarding these rows and/or columns, or Hollingshead and Redlich (1958, p. 288). It
pooling them with adjacent entries, a chi- shall be assumed that the sample of 1,442
square value should be calculated for the table constituted a random sample from the popula-
remaining by Equation 10. This value can be tion of psychotic patients. Each subject was
regarded as the contribution of the table re- categorized according to the diagnostic groups
maining to the chi-square of the original table. 2
It should be noted that in some sense no computed
The degrees of freedom for this table are the value of chi-square is exact since the computational
formulas yield values which are only approximately
TABLE 2 distributed as chi-square. That is, chi-square is ap-
proached as the number of frequency entries approaches
Ps OT CC
infinity. In this sense, the components add up exactly
to an approximation.
Af 34.4 167.4 10.9 160.0
Al 154.7 15.0 9.8 91.0 TABLE 4
Or 12.7 94.7 72.2 174.0 Ps OT CC
TABLE S TABLE 7
Ps + OT CC Ps OT
HOLLINGSHEAD, A. B., & REDLiCH, F. C. Social class LANCASTER, H. O. The exact partition of X8 and its
and mental illness: A community study. New York: application to the problem of the pooling of small
Wiley, 1958. expectations. Biometrika, 1950, 37, 267-270.
IRWIN, J. 0. A note on the subdivision of X2 into com- LEWIS, B., & BURKE, C. J. The use and misuse of the
ponents. Biometrika, 1949, 36, 130-134. chi-square test. Psychological Bulletin, 1949, 46,
KASTENBATJM, M. A. A note on the additive partitioning 433-489.
of chi-square in contingency tables. Biometrics, 1960, MAXWELL, A. E. Analysing qualitative data. London:
16, 416-422. Methuen, 1961.
KIMBALL, A. W. Short-cut formulas for the exact SNEDECOR, G. W. Statistical methods. (5th ed.). Ames,
partition of X2 in contingency tables. Biometrics, 1954, Iowa: Iowa State College Press, 1956.
10, 452-458. WALKER, H. M., & LEV, J. Statistical inference. New
LANCASTER, H. 0. The derivation and partition of X2 York: Holt, 1953.
in certain discrete distributions. Biometrika, 1949,
36,117-129. (Received July 6, 1965)