                 House Location
House Style    Urban   Rural   Total
Split-Level      63      49     112
Ranch            15      33      48
Total            78      82     160
Levels of variable 1
χ² Test of Independence
Hypotheses & Statistic
• 1. Hypotheses
– H0: Variables Are Independent
– Ha: Variables Are Related (Dependent)
• 2. Test Statistic
\[ \chi^2 = \sum_{\text{all cells}} \frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} \]
where n_ij is the observed count and E(n_ij) is the expected count
• Degrees of Freedom: (r - 1)(c - 1), where r is the number of rows and c is the number of columns
χ² Test of Independence
Expected Counts
• 1. Statistical Independence Means Joint
Probability Equals Product of Marginal
Probabilities
• 2. Compute Marginal Probabilities &
Multiply for Joint Probability
• 3. Expected Count Is Sample Size Times
Joint Probability
Expected Count Example
                  Location
House Style    Urban   Rural   Total
               Obs.    Obs.
Split-Level      63      49     112
Ranch            15      33      48
Total            78      82     160
Marginal probability (Split-Level) = 112/160
Marginal probability (Urban) = 78/160
Joint probability = (112/160)·(78/160)
Expected count = 160·(112/160)·(78/160) = 54.6
Expected Count Calculation
Expected count = (Row total)(Column total) / Sample size
                    House Location
                 Urban          Rural
House Style    Obs.  Exp.    Obs.  Exp.    Total
Split-Level     63   54.6     49   57.4     112
Ranch           15   23.4     33   24.6      48
Total           78   78       82   82       160
Split-Level, Urban: 112·78/160 = 54.6
Split-Level, Rural: 112·82/160 = 57.4
Ranch, Urban: 48·78/160 = 23.4
Ranch, Rural: 48·82/160 = 24.6
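The row-total-times-column-total rule can be applied to every cell at once. The following is a minimal Python sketch, not part of the original slides; the function name `expected_counts` and the variable names are illustrative assumptions. It uses the observed house counts from the table above.

```python
def expected_counts(observed):
    """Return expected counts under independence.

    Each cell is (row total * column total) / sample size.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts from the house example
# (rows: Split-Level, Ranch; columns: Urban, Rural).
house = [[63, 49], [15, 33]]
expected = expected_counts(house)
for row in expected:
    print([round(x, 1) for x in row])
```

The printed rows should match the Exp. columns of the table: 54.6 and 57.4 for Split-Level, 23.4 and 24.6 for Ranch.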
χ² Test of Independence
Example
• You’re a marketing research analyst. You ask a
random sample of 286 consumers if they
purchase Diet Pepsi or Diet Coke. At the .05
level, is there evidence of a relationship?
                 Diet Pepsi
Diet Coke      No     Yes    Total
No             84      32     116
Yes            48     122     170
Total         132     154     286
χ² Test of Independence
Solution
• H0: No Relationship
• Ha: Relationship
• α = .05
• df = (2 - 1)(2 - 1) = 1
• Critical Value(s): χ² = 3.841 (reject H0 if χ² > 3.841; upper-tail area α = .05)
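χ² Test of Independence
Solution
E(nij) ≥ 5 in all cells
                  Diet Pepsi
                No            Yes
Diet Coke    Obs.  Exp.    Obs.  Exp.    Total
No            84   53.5     32   62.5     116
Yes           48   78.5    122   91.5     170
Total        132  132      154  154       286
Diet Coke No, Diet Pepsi No: 116·132/286 = 53.5
Diet Coke No, Diet Pepsi Yes: 116·154/286 = 62.5
Diet Coke Yes, Diet Pepsi No: 170·132/286 = 78.5
Diet Coke Yes, Diet Pepsi Yes: 170·154/286 = 91.5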
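χ² Test of Independence
Solution
\[ \chi^2 = \sum_{\text{all cells}} \frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} \]
\[ = \frac{[n_{11} - E(n_{11})]^2}{E(n_{11})} + \frac{[n_{12} - E(n_{12})]^2}{E(n_{12})} + \cdots + \frac{[n_{22} - E(n_{22})]^2}{E(n_{22})} \]
\[ = \frac{(84 - 53.5)^2}{53.5} + \frac{(32 - 62.5)^2}{62.5} + \cdots + \frac{(122 - 91.5)^2}{91.5} = 54.29 \]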
χ² Test of Independence
Solution
• H0: No Relationship
• Ha: Relationship
• α = .05
• df = (2 - 1)(2 - 1) = 1
• Critical Value(s): χ² = 3.841 (reject H0 if χ² > 3.841)
Test Statistic: χ² = 54.29
Decision: Reject at α = .05
Conclusion: There is evidence of a relationship
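The whole test can be traced in a few lines of code. The sketch below is an illustration rather than the slides' own method: the function name `chi_square_independence` is an assumption, and the critical value 3.841 is taken from the slide instead of being computed. Because the code keeps the expected counts unrounded, the statistic comes out near 54.15, slightly below the 54.29 obtained from expected counts rounded to one decimal; the decision is the same.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n  # expected count
            statistic += (n_ij - e_ij) ** 2 / e_ij
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Observed counts from the Diet Coke / Diet Pepsi example
# (rows: Diet Coke No, Yes; columns: Diet Pepsi No, Yes).
diet = [[84, 32], [48, 122]]
statistic, df = chi_square_independence(diet)

CRITICAL_VALUE = 3.841  # from the slide: alpha = .05, df = 1
reject_h0 = statistic > CRITICAL_VALUE
print(round(statistic, 2), df, reject_h0)
```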
Siskel and Ebert
           |              Ebert              |
    Siskel |      Con       Mix       Pro    |    Total
-----------+---------------------------------+----------
       Con |       24         8        13    |       45
       Mix |        8        13        11    |       32
       Pro |       10         9        64    |       83
-----------+---------------------------------+----------
     Total |       42        30        88    |      160
Siskel and Ebert (observed counts, with expected counts beneath)
           |              Ebert              |
    Siskel |      Con       Mix       Pro    |    Total
-----------+---------------------------------+----------
       Con |       24         8        13    |       45
           |     11.8       8.4      24.8    |     45.0
-----------+---------------------------------+----------
       Mix |        8        13        11    |       32
           |      8.4       6.0      17.6    |     32.0
-----------+---------------------------------+----------
       Pro |       10         9        64    |       83
           |     21.8      15.6      45.6    |     83.0
-----------+---------------------------------+----------
     Total |       42        30        88    |      160
           |     42.0      30.0      88.0    |    160.0
\[ \chi^2 = \sum_i \sum_j \frac{\left( |O_{ij} - e_{ij}| - 0.5 \right)^2}{e_{ij}} \]
Measures of association
– Relative Risk
– Odds Ratio
– Absolute Risk
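Relative Risk
\[ \hat{\pi}_1 = \frac{n_{11}}{n_{1\cdot}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2\cdot}} \]
• Estimated Relative Risk:
\[ RR = \frac{\hat{\pi}_1}{\hat{\pi}_2} \]
• 95% Confidence Interval:
\[ \left( RR \cdot e^{-1.96\sqrt{v}},\; RR \cdot e^{1.96\sqrt{v}} \right) \]
\[ e \approx 2.71828 \qquad v = \frac{1 - \hat{\pi}_1}{n_{11}} + \frac{1 - \hat{\pi}_2}{n_{21}} \]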
Relative Risk
• Interpretation
– Conclude that the probability that the
outcome is present is higher (in the
population) for group 1 if the entire interval is
above 1
– Conclude that the probability that the
outcome is present is lower (in the
population) for group 1 if the entire interval is
below 1
– Do not conclude that the probability of the
outcome differs for the two groups if the
interval contains 1
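As an illustration of the relative risk interval, here is a minimal Python sketch. The function name `relative_risk_ci` is an assumption, and the counts (7 of 247 patients on TNF, 4 of 738 patients not on TNF) are borrowed from the Coccidioidomycosis absolute risk example elsewhere in these slides.

```python
import math


def relative_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Return (RR, lower, upper) using v = (1 - p1)/n11 + (1 - p2)/n21."""
    p1 = n11 / n1_total  # estimated outcome probability, group 1
    p2 = n21 / n2_total  # estimated outcome probability, group 2
    rr = p1 / p2
    v = (1 - p1) / n11 + (1 - p2) / n21
    factor = math.exp(z * math.sqrt(v))
    return rr, rr / factor, rr * factor


# Counts assumed from the TNF example: 7/247 on TNF, 4/738 not on TNF.
rr, rr_lower, rr_upper = relative_risk_ci(7, 247, 4, 738)
print(round(rr, 2), round(rr_lower, 2), round(rr_upper, 2))
```

With these counts the whole interval lies above 1, so under the interpretation rules above the outcome would be judged more likely for group 1.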
Example - Coccidioidomycosis and
TNF-antagonists
• Research Question: Risk of developing Coccidioidomycosis
associated with arthritis therapy?
• Groups: Patients receiving tumor necrosis factor (TNF)
versus Patients not receiving TNF (all patients arthritic)
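95% CI for the odds ratio:
\[ \left( OR \cdot e^{-1.96\sqrt{v}},\; OR \cdot e^{1.96\sqrt{v}} \right) \]
\[ e \approx 2.71828 \qquad v = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}} \]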
Odds Ratio
• Interpretation
– Conclude that the probability that the
outcome is present is higher (in the
population) for group 1 if the entire interval is
above 1
– Conclude that the probability that the
outcome is present is lower (in the
population) for group 1 if the entire interval is
below 1
– Do not conclude that the probability of the
outcome differs for the two groups if the
interval contains 1
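A minimal Python sketch of the odds ratio interval follows. It is an illustration only: the function name `odds_ratio_ci` is an assumption, the odds ratio is computed with the usual cross-product n11·n22 / (n12·n21), and the 2×2 counts are derived from the TNF example in these slides (7 of 247 on TNF and 4 of 738 not on TNF developed the outcome) because the NSAIDs counts are not given here.

```python
import math


def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Return (OR, lower, upper) using v = 1/n11 + 1/n12 + 1/n21 + 1/n22."""
    odds_ratio = (n11 * n22) / (n12 * n21)
    v = 1 / n11 + 1 / n12 + 1 / n21 + 1 / n22
    factor = math.exp(z * math.sqrt(v))
    return odds_ratio, odds_ratio / factor, odds_ratio * factor


# Assumed 2x2 layout: rows = on TNF / not on TNF,
# columns = outcome present / outcome absent.
or_hat, or_lower, or_upper = odds_ratio_ci(7, 240, 4, 734)
print(round(or_hat, 2), round(or_lower, 2), round(or_upper, 2))
```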
Example - NSAIDs and GBM
• Case-Control Study (Retrospective)
– Cases: 137 Self-Reporting Patients with Glioblastoma
Multiforme (GBM)
– Controls: 401 Population-Based Individuals matched to
cases with respect to demographic factors
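\[ AR = \hat{\pi}_1 - \hat{\pi}_2 \]
95% CI:
\[ AR \pm 1.96 \sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_{1\cdot}} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_{2\cdot}}} \]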
Absolute Risk
• Interpretation
– Conclude that the probability that the
outcome is present is higher (in the
population) for group 1 if the entire interval is
positive
– Conclude that the probability that the
outcome is present is lower (in the
population) for group 1 if the entire interval is
negative
– Do not conclude that the probability of the
outcome differs for the two groups if the
interval contains 0
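Example - Coccidioidomycosis and
TNF-antagonists
• Group 1: Patients on TNF
• Group 2: Patients not on TNF
\[ \hat{\pi}_1 = \frac{7}{247} = .0283 \qquad \hat{\pi}_2 = \frac{4}{738} = .0054 \]
\[ AR = \hat{\pi}_1 - \hat{\pi}_2 = .0283 - .0054 = .0229 \]
\[ 95\%\ CI: .0229 \pm 1.96 \sqrt{\frac{.0283(.9717)}{247} + \frac{.0054(.9946)}{738}} = .0229 \pm .0213 = (0.0016,\ 0.0442) \]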
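The arithmetic in this example can be checked with a short Python sketch. The function name `absolute_risk_ci` is an illustrative assumption; the counts are the ones given above. Since the code does not round intermediate values, the limits may differ from hand calculation in the fourth decimal place.

```python
import math


def absolute_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Return (AR, lower, upper) for the difference in outcome probabilities."""
    p1 = n11 / n1_total
    p2 = n21 / n2_total
    ar = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1_total + p2 * (1 - p2) / n2_total)
    return ar, ar - z * se, ar + z * se


# Group 1: 7 of 247 patients on TNF; group 2: 4 of 738 patients not on TNF.
ar, ar_lower, ar_upper = absolute_risk_ci(7, 247, 4, 738)
print(round(ar, 4), round(ar_lower, 4), round(ar_upper, 4))
```

The lower limit is positive, so the whole interval lies above 0.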
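\[ \hat{\gamma} = \frac{C - D}{C + D} \qquad -1 \le \hat{\gamma} \le 1 \]
• Kendall’s τb:
\[ \hat{\tau}_b = \frac{C - D}{0.5 \sqrt{\left( n^2 - \sum_i n_{i\cdot}^2 \right) \left( n^2 - \sum_j n_{\cdot j}^2 \right)}} \]
\[ \hat{\gamma} = \frac{C - D}{C + D} = \frac{83164 - 73496}{83164 + 73496} = 0.0617 \]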
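For readers who want to see where C and D come from, the sketch below counts concordant and discordant pairs in a two-way table whose rows and columns are both ordered, then computes gamma. It is a hedged illustration: the function names and the small 2×2 table are hypothetical, and only the totals C = 83164 and D = 73496 come from the slides.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordinal table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair cell (i, j) with every cell in a lower row.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:    # higher on both variables
                        concordant += table[i][j] * table[k][m]
                    elif m < j:  # higher on one, lower on the other
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant


def gamma(concordant, discordant):
    """Gamma = (C - D) / (C + D)."""
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical 2x2 table, used only to exercise the pair counting.
c_small, d_small = concordant_discordant([[2, 1], [1, 2]])
gamma_small = gamma(c_small, d_small)

# Pair counts from the slide.
gamma_slide = gamma(83164, 73496)
print(round(gamma_slide, 4))
```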