13.4 Test of Independence: Contingency Tables: Objective
Objective:
We want to determine whether beer preference is independent of the
gender of the beer drinker.
We want to test
$H_0$: beer preference is independent of gender
vs.
$H_a$: beer preference is not independent of gender
with $\alpha = 0.05$.
We have the following data:

                            Beer Preference
Gender        Light              Regular            Dark               Total        Proportion
Male          20 ($f_{11}$)      40 ($f_{12}$)      20 ($f_{13}$)      80           $p_{r1} = 8/15$
Female        30 ($f_{21}$)      30 ($f_{22}$)      10 ($f_{23}$)      70           $p_{r2} = 7/15$
Total         50                 70                 30                 150 ($n$)
Proportion    $p_{c1} = 5/15$    $p_{c2} = 7/15$    $p_{c3} = 3/15$
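The marginal totals and proportions for the beer-preference data can be reproduced with a few lines of Python. This is only a sketch; the variable names (`observed`, `row_props`, and so on) are illustrative choices and do not come from the notes.

```python
from fractions import Fraction

# Observed counts f_ij: rows = gender (Male, Female),
# columns = beer preference (Light, Regular, Dark).
observed = [
    [20, 40, 20],  # Male
    [30, 30, 10],  # Female
]

row_totals = [sum(row) for row in observed]        # [80, 70]
col_totals = [sum(col) for col in zip(*observed)]  # [50, 70, 30]
n = sum(row_totals)                                # 150

# Marginal proportions p_ri and p_cj, kept exact with Fraction.
# Fraction reduces automatically, so 5/15 prints as 1/3 and 3/15 as 1/5.
row_props = [Fraction(t, n) for t in row_totals]
col_props = [Fraction(t, n) for t in col_totals]

print(row_totals, col_totals, n)
print(row_props, col_props)
```

Exact fractions are used here so that the proportions can be compared directly with the values 8/15, 7/15, 5/15, 7/15 and 3/15 in the table.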
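If $H_0$ is true, the expected number in cell $(i, j)$ is $e_{ij} = n\,p_{ri}\,p_{cj}$:
$e_{11} = n\,p_{r1}\,p_{c1} = 150 \cdot \frac{8}{15} \cdot \frac{5}{15}$,
$e_{12} = n\,p_{r1}\,p_{c2} = 150 \cdot \frac{8}{15} \cdot \frac{7}{15}$,
$e_{13} = n\,p_{r1}\,p_{c3} = 150 \cdot \frac{8}{15} \cdot \frac{3}{15}$,
$e_{21} = n\,p_{r2}\,p_{c1} = 150 \cdot \frac{7}{15} \cdot \frac{5}{15}$,
$e_{22} = n\,p_{r2}\,p_{c2} = 150 \cdot \frac{7}{15} \cdot \frac{7}{15}$,
$e_{23} = n\,p_{r2}\,p_{c3} = 150 \cdot \frac{7}{15} \cdot \frac{3}{15}$.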
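Expected numbers under $H_0$:

                            Beer Preference
Gender        Light                                   Regular                                 Dark                                 Proportion
Male          $e_{11} = n\,p_{r1}\,p_{c1} = 26.67$    $e_{12} = n\,p_{r1}\,p_{c2} = 37.33$    $e_{13} = n\,p_{r1}\,p_{c3} = 16$    $p_{r1} = 8/15$
Female        $e_{21} = n\,p_{r2}\,p_{c1} = 23.33$    $e_{22} = n\,p_{r2}\,p_{c2} = 32.67$    $e_{23} = n\,p_{r2}\,p_{c3} = 14$    $p_{r2} = 7/15$
Proportion    $p_{c1} = 5/15$                         $p_{c2} = 7/15$                         $p_{c3} = 3/15$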
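As a quick check of the expected numbers for the beer-preference data, the product $n\,p_{ri}\,p_{cj}$ can be evaluated for every cell. The sketch below assumes the marginal proportions 8/15, 7/15 (rows) and 5/15, 7/15, 3/15 (columns); the variable names are illustrative.

```python
# Sample size and marginal proportions for the beer-preference data.
n = 150
row_props = [8 / 15, 7 / 15]          # p_r1, p_r2  (Male, Female)
col_props = [5 / 15, 7 / 15, 3 / 15]  # p_c1, p_c2, p_c3  (Light, Regular, Dark)

# Expected count under H0: e_ij = n * p_ri * p_cj.
expected = [[n * pr * pc for pc in col_props] for pr in row_props]

for label, row in zip(["Male", "Female"], expected):
    print(label, [round(e, 2) for e in row])
```

Rounded to two decimals, the rows should agree with 26.67, 37.33, 16 and 23.33, 32.67, 14.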
Intuitively, if $H_0$ is true, the observed numbers $f_{ij}$ should be close to the
expected numbers (under $H_0$) $e_{ij}$, $i = 1, 2$; $j = 1, 2, 3$. Small differences
between them are therefore consistent with $H_0$, while large differences are evidence
against it. The following statistic can be used to measure the difference between the
observed numbers and the expected numbers:
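$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{3} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}
= \frac{(f_{11} - e_{11})^2}{e_{11}} + \frac{(f_{12} - e_{12})^2}{e_{12}} + \frac{(f_{13} - e_{13})^2}{e_{13}}
+ \frac{(f_{21} - e_{21})^2}{e_{21}} + \frac{(f_{22} - e_{22})^2}{e_{22}} + \frac{(f_{23} - e_{23})^2}{e_{23}}$$
$$= \frac{(20 - 26.67)^2}{26.67} + \frac{(40 - 37.33)^2}{37.33} + \frac{(20 - 16)^2}{16}
+ \frac{(30 - 23.33)^2}{23.33} + \frac{(30 - 32.67)^2}{32.67} + \frac{(10 - 14)^2}{14}
= 6.13$$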
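The arithmetic behind the value 6.13 can be verified directly. The sketch below uses the expected numbers rounded to two decimals, as in the notes; with unrounded expected numbers the statistic comes out slightly lower, at about 6.12.

```python
# Observed counts f_ij and expected counts e_ij (rounded to two decimals).
observed = [[20, 40, 20], [30, 30, 10]]
expected = [[26.67, 37.33, 16.0], [23.33, 32.67, 14.0]]

# Chi-square statistic: sum over all cells of (f_ij - e_ij)^2 / e_ij.
chi_square = sum(
    (f - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for f, e in zip(obs_row, exp_row)
)

print(round(chi_square, 2))
```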
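General Case:
Suppose there are two variables, a column variable (with $m$ categories)
and a row variable (with $p$ categories). We want to test the hypothesis
$H_0$: the column variable is independent of the row variable
vs.
$H_a$: the column variable is not independent of the row variable.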
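The observed numbers are

                               Column Variable ($m$ columns)
                          1           ...    j           ...    m           proportions
Row Variable       1      $f_{11}$    ...    $f_{1j}$    ...    $f_{1m}$    $p_{r1} = \frac{1}{n} \sum_{k=1}^{m} f_{1k}$
($p$ rows)         ...
                   i      $f_{i1}$    ...    $f_{ij}$    ...    $f_{im}$    $p_{ri} = \frac{1}{n} \sum_{k=1}^{m} f_{ik}$
                   ...
                   p      $f_{p1}$    ...    $f_{pj}$    ...    $f_{pm}$    $p_{rp} = \frac{1}{n} \sum_{k=1}^{m} f_{pk}$
proportions               $p_{c1}$    ...    $p_{cj}$    ...    $p_{cm}$

where $p_{cj} = \frac{1}{n} \sum_{k=1}^{p} f_{kj}$, $j = 1, \ldots, m$.

If $H_0$ is true, the expected numbers are

                               Column Variable ($m$ columns)
                          1                               ...    j                               ...    m                               proportions
Row Variable       1      $e_{11} = n\,p_{r1}\,p_{c1}$    ...    $e_{1j} = n\,p_{r1}\,p_{cj}$    ...    $e_{1m} = n\,p_{r1}\,p_{cm}$    $p_{r1}$
($p$ rows)         ...
                   i      $e_{i1} = n\,p_{ri}\,p_{c1}$    ...    $e_{ij} = n\,p_{ri}\,p_{cj}$    ...    $e_{im} = n\,p_{ri}\,p_{cm}$    $p_{ri}$
                   ...
                   p      $e_{p1} = n\,p_{rp}\,p_{c1}$    ...    $e_{pj} = n\,p_{rp}\,p_{cj}$    ...    $e_{pm} = n\,p_{rp}\,p_{cm}$    $p_{rp}$
proportions               $p_{c1}$                        ...    $p_{cj}$                        ...    $p_{cm}$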
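Note:
$$e_{ij} = n\,p_{ri}\,p_{cj}
= n \cdot \frac{\sum_{k=1}^{m} f_{ik}}{n} \cdot \frac{\sum_{k=1}^{p} f_{kj}}{n}
= \frac{\left( \sum_{k=1}^{m} f_{ik} \right) \left( \sum_{k=1}^{p} f_{kj} \right)}{n},$$
where $\sum_{k=1}^{m} f_{ik}$ is the total of row $i$ and $\sum_{k=1}^{p} f_{kj}$ is the
total of column $j$, and the sample size is $\sum_{i=1}^{p} \sum_{j=1}^{m} f_{ij} = n$.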
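The expected number in a cell can be computed either from the marginal proportions or as (row total) × (column total) / n, and the two routes give the same result. The sketch below checks this on the beer-preference data with exact fractions; `expected_counts` is a hypothetical helper name introduced only for illustration.

```python
from fractions import Fraction


def expected_counts(observed):
    """Return e_ij = (row total i) * (column total j) / n for a p x m table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[Fraction(r * c, n) for c in col_totals] for r in row_totals]


# Beer-preference data: row totals 80, 70; column totals 50, 70, 30; n = 150.
observed = [[20, 40, 20], [30, 30, 10]]
n = 150

via_totals = expected_counts(observed)
via_props = [
    [n * Fraction(r, n) * Fraction(c, n) for c in (50, 70, 30)]
    for r in (80, 70)
]

print(via_totals == via_props)
print([[float(e) for e in row] for row in via_totals])
```

Working from the totals avoids rounding the proportions first, which is convenient when the counts are large.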
Thus, the chi-square statistic used to reflect the difference between the
observed number and the expected number is
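$$\chi^2 = \sum_{i=1}^{p} \sum_{j=1}^{m} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}
= \frac{(f_{11} - e_{11})^2}{e_{11}} + \frac{(f_{12} - e_{12})^2}{e_{12}} + \cdots + \frac{(f_{1m} - e_{1m})^2}{e_{1m}}$$
$$+ \frac{(f_{21} - e_{21})^2}{e_{21}} + \frac{(f_{22} - e_{22})^2}{e_{22}} + \cdots + \frac{(f_{2m} - e_{2m})^2}{e_{2m}}
+ \cdots
+ \frac{(f_{p1} - e_{p1})^2}{e_{p1}} + \frac{(f_{p2} - e_{p2})^2}{e_{p2}} + \cdots + \frac{(f_{pm} - e_{pm})^2}{e_{pm}}.$$
How large must $\chi^2$ be to reject $H_0$?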
Chi-Square Test:
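Let
$$\chi^2 = \sum_{i=1}^{p} \sum_{j=1}^{m} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}.$$
As $e_{ij} \ge 5$ for all $i, j$, the test of $H_0$ vs. $H_a$ is to
reject $H_0$: $\chi^2 > \chi^2_{(p-1)(m-1),\,\alpha}$,
not reject $H_0$: $\chi^2 \le \chi^2_{(p-1)(m-1),\,\alpha}$.
In addition,
p-value $= P\left( \chi^2_{(p-1)(m-1)} > \chi^2 \right)$,
where $\chi^2_{(p-1)(m-1)}$ denotes a chi-square random variable with $(p-1)(m-1)$ degrees of freedom.
Note: when $H_0$ is true, $\chi^2$ is approximately distributed as $\chi^2_{(p-1)(m-1)}$.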
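The decision rule can be sketched in plain Python. The function names below (`chi2_sf`, `chi2_critical`, `independence_test`) are hypothetical. The tail probability uses the standard closed forms for a chi-square distribution with integer degrees of freedom, and the critical value is found by bisection. In practice a statistics library (for example `scipy.stats.chi2`) would normally supply both.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r x^(r - 1/2) / (1*3*...*(2r-1))
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        term = math.sqrt(x) if r == 1 else term * x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + 2 * phi * total


def chi2_critical(alpha, df):
    """Solve chi2_sf(c, df) = alpha by bisection (the tail is decreasing in c)."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def independence_test(observed, alpha=0.05):
    """Chi-square test of independence for a p x m table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    if min(min(row) for row in expected) < 5:
        raise ValueError("some e_ij < 5: the chi-square approximation may be poor")
    stat = sum((f - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for f, e in zip(obs_row, exp_row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    critical = chi2_critical(alpha, df)
    return {"chi2": stat, "df": df, "critical": critical,
            "p_value": chi2_sf(stat, df), "reject": stat > critical}


# Beer-preference data (Male, Female) x (Light, Regular, Dark).
result = independence_test([[20, 40, 20], [30, 30, 10]])
print(result)
```

Raising an error when some $e_{ij} < 5$ is a design choice of this sketch; the notes only state that the test applies when all expected numbers are at least 5.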
Example (continued)
Since $p = 2$, $m = 3$ and $\chi^2 = 6.13 > 5.99 = \chi^2_{2,\,0.05} = \chi^2_{(p-1)(m-1),\,\alpha}$, we reject $H_0$.
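Also, p-value $= P\left( \chi^2_{2} > 6.13 \right) \approx 0.047 < 0.05 = \alpha$, which leads to the same conclusion.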
Example:

              Favor    Not Favor    No Comment
Male          252      145          203
Female        148      105          147
Please test whether females and males differ in their opinions about the proposal, with
$\alpha = 0.05$.
[solution:]
The column totals are $252 + 148 = 400$, $145 + 105 = 250$, $203 + 147 = 350$, while the row
totals are $252 + 145 + 203 = 600$, $148 + 105 + 147 = 400$. In addition, the total number is
1000.
The table for the expected numbers $e_{ij}$ is

                Favor                                    Not Favor                                No Comment                               Row Total
Male            $\frac{600 \cdot 400}{1000} = 240$       $\frac{600 \cdot 250}{1000} = 150$       $\frac{600 \cdot 350}{1000} = 210$       600
Female          $\frac{400 \cdot 400}{1000} = 160$       $\frac{400 \cdot 250}{1000} = 100$       $\frac{400 \cdot 350}{1000} = 140$       400
Column Total    400                                      250                                      350                                      1000
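Thus,
$$\chi^2 = \sum_{i=1}^{p} \sum_{j=1}^{m} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}
= \sum_{i=1}^{2} \sum_{j=1}^{3} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$
$$= \frac{(252 - 240)^2}{240} + \frac{(145 - 150)^2}{150} + \frac{(203 - 210)^2}{210}
+ \frac{(148 - 160)^2}{160} + \frac{(105 - 100)^2}{100} + \frac{(147 - 140)^2}{140}
= 2.5.$$
Since $\chi^2 = 2.5 < 5.99 = \chi^2_{2,\,0.05} = \chi^2_{(2-1)(3-1),\,0.05} = \chi^2_{(p-1)(m-1),\,\alpha}$, we do not reject $H_0$.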
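The whole calculation for the opinion data can be checked end to end. This is a minimal sketch with illustrative variable names. It relies on the fact that, for 2 degrees of freedom, the chi-square upper-tail probability is $e^{-x/2}$, so the 0.05 critical value is $-2\ln(0.05) \approx 5.99$; that shortcut applies only to 2 degrees of freedom.

```python
import math

# Observed counts: columns are Favor, Not Favor, No Comment.
observed = {"Male": [252, 145, 203], "Female": [148, 105, 147]}

rows = list(observed.values())
row_totals = [sum(r) for r in rows]        # [600, 400]
col_totals = [sum(c) for c in zip(*rows)]  # [400, 250, 350]
n = sum(row_totals)                        # 1000

# Expected counts: (row total) * (column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum((f - e) ** 2 / e
                 for obs_row, exp_row in zip(rows, expected)
                 for f, e in zip(obs_row, exp_row))

# Degrees of freedom (2-1)(3-1) = 2, where P(X > x) = exp(-x/2).
critical = -2 * math.log(0.05)
p_value = math.exp(-chi_square / 2)

print(expected)
print(round(chi_square, 2), round(critical, 2), round(p_value, 3))
print("reject H0" if chi_square > critical else "do not reject H0")
```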
Online Exercise:
Exercise 13.4.1
Exercise 13.4.2