Test of Independence (Uji Kebebasan)

Contingency Tables
1. Explain the χ² Test of Independence
2. Measures of Association
Contingency Tables
• Tables representing all combinations of levels of explanatory and response variables
• Numbers in the table represent counts of the number of cases in each cell
• Row and column totals are called marginal counts
2x2 Tables
• Each variable has 2 levels
– Explanatory Variable – Groups (typically based on demographics, exposure)
– Response Variable – Outcome (typically presence or absence of a characteristic)
2x2 Tables - Notation
                 Outcome    Outcome    Group
                 Present    Absent     Total
Group 1            n11        n12       n1.
Group 2            n21        n22       n2.
Outcome Total      n.1        n.2       n..
χ² Test of Independence
• 1. Shows If a Relationship Exists Between 2 Qualitative Variables
– One Sample Is Drawn
– Does Not Show Causality
• 2. Assumptions
– Multinomial Experiment
– All Expected Counts ≥ 5
• 3. Uses Two-Way Contingency Table
Contingency Table
• 1. Shows # Observations From 1 Sample Jointly in 2 Qualitative Variables

                   House Location
House Style     Urban    Rural    Total
Split-Level       63       49      112
Ranch             15       33       48
Total             78       82      160

Rows (House Style) are the levels of variable 1; columns (House Location) are the levels of variable 2.
Hypotheses & Statistic
• 1. Hypotheses
– H0: Variables Are Independent
– Ha: Variables Are Related (Dependent)
• 2. Test Statistic
$\chi^2 = \sum_{\text{all cells}} \frac{\left[n_{ij} - E(n_{ij})\right]^2}{E(n_{ij})}$
where n_ij is the observed count and E(n_ij) is the expected count in cell (i, j)
• Degrees of Freedom: (r - 1)(c - 1), with r rows and c columns
χ² Test of Independence
Expected Counts
• 1. Statistical Independence Means Joint Probability Equals Product of Marginal Probabilities
• 2. Compute Marginal Probabilities & Multiply for Joint Probability
• 3. Expected Count Is Sample Size Times Joint Probability
Expected Count Example

                     Location
House Style     Urban    Rural    Total
Split-Level       63       49      112
Ranch             15       33       48
Total             78       82      160

• Marginal probability (Split-Level) = 112/160
• Marginal probability (Urban) = 78/160
• Joint probability = (112/160)(78/160)
• Expected count = 160 · (112/160)(78/160) = 54.6
Expected Count Calculation
Expected count = (Row total)(Column total) / Sample size

                       House Location
                   Urban           Rural
House Style     Obs.   Exp.    Obs.   Exp.    Total
Split-Level      63    54.6     49    57.4     112
Ranch            15    23.4     33    24.6      48
Total            78    78       82    82       160

Split-Level, Urban: 112·78/160 = 54.6     Split-Level, Rural: 112·82/160 = 57.4
Ranch, Urban: 48·78/160 = 23.4            Ranch, Rural: 48·82/160 = 24.6
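The expected-count rule, (row total)(column total) / sample size, can be sketched in a few lines of plain Python. The function name `expected_counts` is only an illustrative choice, not something from the slides. Applied to the house style by location counts, it should reproduce the Exp. columns of the table.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: Split-Level, Ranch. Columns: Urban, Rural (counts from the slides).
house = [[63, 49], [15, 33]]

for style, row in zip(["Split-Level", "Ranch"], expected_counts(house)):
    print(style, [round(e, 1) for e in row])
```

Each row of expected counts sums to the observed row total, and each column to the observed column total, which is a quick sanity check on any hand calculation.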
Example
• You’re a marketing research analyst. You ask a
random sample of 286 consumers if they
purchase Diet Pepsi or Diet Coke. At the .05
level, is there evidence of a relationship?
                Diet Pepsi
Diet Coke     No     Yes    Total
No            84      32     116
Yes           48     122     170
Total        132     154     286
Solution
• H0: No Relationship
• Ha: Relationship
• α = .05
• df = (2 - 1)(2 - 1) = 1
• Critical Value: 3.841 (reject H0 if χ² exceeds this value)
Solution
E(nij) ≥ 5 in all cells

                  Diet Pepsi
                No             Yes
Diet Coke    Obs.   Exp.    Obs.   Exp.    Total
No            84    53.5     32    62.5     116
Yes           48    78.5    122    91.5     170
Total        132   132      154   154       286

No, No: 116·132/286 = 53.5      No, Yes: 116·154/286 = 62.5
Yes, No: 170·132/286 = 78.5     Yes, Yes: 170·154/286 = 91.5
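Solution
$\chi^2 = \sum_{\text{all cells}} \frac{\left[n_{ij} - E(n_{ij})\right]^2}{E(n_{ij})} = \frac{\left[n_{11} - E(n_{11})\right]^2}{E(n_{11})} + \frac{\left[n_{12} - E(n_{12})\right]^2}{E(n_{12})} + \cdots + \frac{\left[n_{22} - E(n_{22})\right]^2}{E(n_{22})}$
$= \frac{(84 - 53.5)^2}{53.5} + \frac{(32 - 62.5)^2}{62.5} + \cdots + \frac{(122 - 91.5)^2}{91.5} = 54.29$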
Solution
• Test Statistic: χ² = 54.29
• Critical Value: 3.841 (α = .05, df = (2 - 1)(2 - 1) = 1)
• Decision: Reject H0 at α = .05, since 54.29 > 3.841
• Conclusion: There is evidence of a relationship
Siskel and Ebert
           |              Ebert              |
    Siskel |      Con       Mix       Pro    |   Total
-----------+---------------------------------+----------
       Con |       24         8        13    |      45
       Mix |        8        13        11    |      32
       Pro |       10         9        64    |      83
-----------+---------------------------------+----------
     Total |       42        30        88    |     160

Siskel and Ebert (observed counts, with expected counts on the line below)
           |              Ebert              |
    Siskel |      Con       Mix       Pro    |   Total
-----------+---------------------------------+----------
       Con |       24         8        13    |      45
           |     11.8       8.4      24.8    |    45.0
-----------+---------------------------------+----------
       Mix |        8        13        11    |      32
           |      8.4       6.0      17.6    |    32.0
-----------+---------------------------------+----------
       Pro |       10         9        64    |      83
           |     21.8      15.6      45.6    |    83.0
-----------+---------------------------------+----------
     Total |       42        30        88    |     160
           |     42.0      30.0      88.0    |   160.0

Pearson chi2(4) = 45.3569   p < 0.001
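The same statistic extends unchanged to the 3x3 Siskel and Ebert table. The sketch below recomputes it and also obtains a p-value without any statistics library, using the closed-form upper tail of the chi-square distribution that exists when df is even (here df = 4). Both function names are illustrative assumptions.

```python
import math


def pearson_chi2(observed):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def chi2_upper_tail_even_df(x, df):
    """P(chi-square_df > x); this closed form applies only to even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here needs a positive even df")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


# Rows: Siskel (Con, Mix, Pro). Columns: Ebert (Con, Mix, Pro).
reviews = [[24, 8, 13], [8, 13, 11], [10, 9, 64]]
stat, df = pearson_chi2(reviews)
print(f"Pearson chi2({df}) = {stat:.4f}")
print(f"p < 0.001: {chi2_upper_tail_even_df(stat, df) < 0.001}")
```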

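Yates’ Statistic
• Method of testing for association for 2x2 tables when the sample size is moderate (total observations between 6 and 25)
$\chi^2 = \sum_i \sum_j \frac{\left(\left|O_{ij} - e_{ij}\right| - 0.5\right)^2}{e_{ij}}$
where O_ij is the observed count and e_ij is the expected count in cell (i, j)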
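A minimal sketch of the continuity-corrected statistic, set beside the uncorrected one. The 2x2 counts below are hypothetical, invented only to give a total of 20 observations (inside the 6 to 25 range mentioned above); they do not come from the slides. The function name `chi_square_2x2` is likewise an assumption.

```python
def chi_square_2x2(table, yates=False):
    """Pearson chi-square for a 2x2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            exp = row_totals[i] * col_totals[j] / n
            gap = abs(table[i][j] - exp)
            if yates:
                gap -= 0.5  # continuity correction, applied as in the formula
            stat += gap ** 2 / exp
    return stat


# Hypothetical counts (NOT from the slides), n = 20.
small = [[7, 3], [2, 8]]
print(f"uncorrected: {chi_square_2x2(small):.4f}")
print(f"Yates:       {chi_square_2x2(small, yates=True):.4f}")
```

For these made-up counts the uncorrected statistic is above 3.841 while the corrected one is below it, which illustrates how the correction makes the test more conservative in small samples.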
Measures of association
– Relative Risk
– Odds Ratio
– Absolute Risk
Relative Risk
• Ratio of the probability that the outcome characteristic is present for one group, relative to the other
• Sample proportions with characteristic from groups 1 and 2:
$\hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}}$
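Relative Risk
• Estimated Relative Risk:
$RR = \frac{\hat{\pi}_1}{\hat{\pi}_2}$
• 95% Confidence Interval for Population Relative Risk:
$\left( RR \cdot e^{-1.96\sqrt{v}},\; RR \cdot e^{1.96\sqrt{v}} \right)$
$e \approx 2.71828 \qquad v = \frac{1 - \hat{\pi}_1}{n_{11}} + \frac{1 - \hat{\pi}_2}{n_{21}}$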
Relative Risk
• Interpretation
– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1
Example - Coccidioidomycosis and TNF-antagonists
• Research Question: Is there a risk of developing coccidioidomycosis associated with arthritis therapy?
• Groups: Patients receiving tumor necrosis factor α (TNF) versus patients not receiving TNF (all patients arthritic)

          COC    No COC    Total
TNF         7      240      247
Other       4      734      738
Total      11      974      985

Source: Bergstrom, et al (2004)

TNF-antagonists
• Group 1: Patients on TNF
• Group 2: Patients not on TNF
^ 7 ^ 4
1   .0283  2   .0054
247 738
^
1
.0283 1  .0283 1  .0054
RR  ^   5.24 v   .3874
 2 .0054 7 4
95%CI : (5.24e 1.96 .3874

, 5.24e1.96 .3874
)  (1.55 , 17.76)
Entire CI above 1  Conclude higher risk if on TNF
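The relative risk and its interval can be checked with a short function. The name `relative_risk_ci` and its argument order are illustrative assumptions. Because it carries full precision instead of the rounded proportions .0283 and .0054, it gives roughly 5.23 with an interval near (1.54, 17.71), slightly different from the slide's rounded figures but leading to the same conclusion.

```python
import math


def relative_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Estimated relative risk and its large-sample confidence interval."""
    p1 = n11 / n1_total                      # sample proportion, group 1
    p2 = n21 / n2_total                      # sample proportion, group 2
    rr = p1 / p2
    v = (1 - p1) / n11 + (1 - p2) / n21      # estimated variance of ln(RR)
    margin = z * math.sqrt(v)
    return rr, rr * math.exp(-margin), rr * math.exp(margin)


# TNF group: 7 cases of 247. Other group: 4 cases of 738.
rr, lower, upper = relative_risk_ci(7, 247, 4, 738)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("Entire interval above 1:", lower > 1)
```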

Odds Ratio
• Odds of an event is the probability it occurs divided by the probability it does not occur
• Odds ratio is the odds of the event for group 1 divided by the odds of the event for group 2
• Sample odds of the outcome for each group:
$odds_1 = \frac{n_{11}/n_{1.}}{n_{12}/n_{1.}} = \frac{n_{11}}{n_{12}} \qquad odds_2 = \frac{n_{21}}{n_{22}}$
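Odds Ratio
• Estimated Odds Ratio:
$OR = \frac{odds_1}{odds_2} = \frac{n_{11}/n_{12}}{n_{21}/n_{22}} = \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}}$
• 95% Confidence Interval for Population Odds Ratio:
$\left( OR \cdot e^{-1.96\sqrt{v}},\; OR \cdot e^{1.96\sqrt{v}} \right)$
$e \approx 2.71828 \qquad v = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}$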
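Odds Ratio
• Interpretation
– Conclude that the odds of the outcome are higher (in the population) for group 1 if the entire interval is above 1
– Conclude that the odds of the outcome are lower (in the population) for group 1 if the entire interval is below 1
– Do not conclude that the odds of the outcome differ for the two groups if the interval contains 1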
Example - NSAIDs and GBM
• Case-Control Study (Retrospective)
– Cases: 137 Self-Reporting Patients with Glioblastoma Multiforme (GBM)
– Controls: 401 Population-Based Individuals matched to cases with respect to demographic factors

                  GBM Present    GBM Absent    Total
NSAID User             32            138        170
NSAID Non-User        105            263        368
Total                 137            401        538

Source: Sivak-Sears, et al (2004)
Example - NSAIDs and GBM
$OR = \frac{32(263)}{138(105)} = \frac{8416}{14490} = 0.58$
$v = \frac{1}{32} + \frac{1}{138} + \frac{1}{105} + \frac{1}{263} = 0.0518$
$95\%\ CI: \left( 0.58\, e^{-1.96\sqrt{0.0518}},\; 0.58\, e^{1.96\sqrt{0.0518}} \right) = (0.37,\ 0.91)$
Interval is entirely below 1, NSAID use appears to be lower among cases than controls
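The odds ratio calculation follows the same pattern as the relative risk, with the variance term built from the four cell counts. A sketch under the assumption that the table is passed as four counts in the slides' n11, n12, n21, n22 order (the function name `odds_ratio_ci` is illustrative):

```python
import math


def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Estimated odds ratio and its large-sample confidence interval."""
    odds_ratio = (n11 * n22) / (n12 * n21)
    v = 1 / n11 + 1 / n12 + 1 / n21 + 1 / n22   # estimated variance of ln(OR)
    margin = z * math.sqrt(v)
    return odds_ratio, odds_ratio * math.exp(-margin), odds_ratio * math.exp(margin)


# NSAID users: 32 GBM present, 138 absent. Non-users: 105 present, 263 absent.
or_hat, lower, upper = odds_ratio_ci(32, 138, 105, 263)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("Entire interval below 1:", upper < 1)
```

All four counts must be nonzero for this interval to be defined, since each appears in a denominator.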
Absolute Risk
• Difference between the proportions with an outcome characteristic for the 2 groups
• Sample proportions with characteristic from groups 1 and 2:
$\hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}}$
Absolute Risk
• Estimated Absolute Risk:
$AR = \hat{\pi}_1 - \hat{\pi}_2$
• 95% Confidence Interval for Population Absolute Risk:
$AR \pm 1.96 \sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_{1.}} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_{2.}}}$
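Absolute Risk
• Interpretation
– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is positive
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is negative
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 0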
TNF-antagonists
• Group 1: Patients on TNF
• Group 2: Patients not on TNF
^ 7 ^ 4
1   .0283  2   .0054
247 738
^ ^
AR   1   2  .0283  .0054  .0229
.0283(.9717) .0054(.9946)
95%CI : .0229  1.96 
247 738
 .0229  .0213  (0.0016 , 0.0242)
Interval is entirely positive, TNF is

associated with higher risk
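A short sketch of the absolute risk interval for the same data, with the illustrative function name `absolute_risk_ci`. The limits are the estimate minus and plus the margin of error, so with a difference near .0229 and a margin near .0213 the interval runs from about 0.0016 to about 0.044.

```python
import math


def absolute_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Difference in sample proportions and its large-sample confidence interval."""
    p1 = n11 / n1_total
    p2 = n21 / n2_total
    ar = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1_total + p2 * (1 - p2) / n2_total)
    margin = z * se
    return ar, ar - margin, ar + margin


# TNF group: 7 cases of 247. Other group: 4 cases of 738.
ar, lower, upper = absolute_risk_ci(7, 247, 4, 738)
print(f"AR = {ar:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
print("Entire interval positive:", lower > 0)
```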
Ordinal Explanatory and Response
Variables
• Pearson’s Chi-square test can be used to test
associations among ordinal variables, but more
powerful methods exist
• When theories exist that the association is
directional (positive or negative), measures exist
to describe and test for these specific
alternatives from independence:
– Gamma
– Kendall’s τb
Concordant and Discordant Pairs
• Concordant Pairs - Pairs of individuals where one
individual scores “higher” on both ordered
variables than the other individual
• Discordant Pairs - Pairs of individuals where one
individual scores “higher” on one ordered
variable and the other individual scores “lower”
on the other
• C = # Concordant Pairs D = # Discordant Pairs
– Under Positive association, expect C > D
– Under Negative association, expect C < D
– Under No association, expect C ≈ D
Example - Alcohol Use and Sick Days
• Alcohol Risk (Without Risk, Hardly any Risk, Some to Considerable Risk)
• Sick Days (0, 1-6, ≥7)
• Concordant Pairs - Pairs of respondents where
one scores higher on both alcohol risk and sick
days than the other
• Discordant Pairs - Pairs of respondents where
one scores higher on alcohol risk and the other
scores higher on sick days
Source: Hermansson, et al (2003)
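Alcohol risk by sick days (observed counts):

                                   Sick Days
Alcohol Risk                    0     1-6     ≥7    Total
Without Risk                  347     113    145     605
Hardly any Risk               154      63     56     273
Some to Considerable Risk      52      25     34     111
Total                         553     201    235     989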
• Concordant Pairs: Each individual in a given cell is concordant with each individual in cells “Southeast” of theirs
• Discordant Pairs: Each individual in a given cell is discordant with each individual in cells “Southwest” of theirs
$C = 347(63 + 56 + 25 + 34) + 113(56 + 34) + 154(25 + 34) + 63(34) = 83164$
$D = 145(154 + 63 + 52 + 25) + 113(154 + 52) + 56(52 + 25) + 63(52) = 73496$
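The “Southeast” and “Southwest” counting rule can be sketched as a loop over cell pairs. The 3x3 layout below is an assumption: it is rebuilt from the counts that appear in the C and D expressions, with rows taken as alcohol risk and columns as sick days. Transposing the table would give the same C and D, so the orientation does not affect the result. The function name is illustrative.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):          # only rows below cell (i, j)
                for m in range(n_cols):
                    pairs = table[i][j] * table[k][m]
                    if m > j:                       # cell to the "Southeast"
                        concordant += pairs
                    elif m < j:                     # cell to the "Southwest"
                        discordant += pairs
    return concordant, discordant


# Assumed layout, reconstructed from the C and D expressions.
alcohol_by_sick_days = [
    [347, 113, 145],
    [154, 63, 56],
    [52, 25, 34],
]
c, d = concordant_discordant(alcohol_by_sick_days)
print("C =", c, " D =", d)
```

Pairs in the same row or the same column are tied on one variable and count toward neither C nor D.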
Measures of Association
• Goodman and Kruskal’s Gamma:
$\hat{\gamma} = \frac{C - D}{C + D} \qquad -1 \le \hat{\gamma} \le 1$
• Kendall’s τb:
$\hat{\tau}_b = \frac{C - D}{0.5 \sqrt{\left(n^2 - \sum n_{i.}^2\right)\left(n^2 - \sum n_{.j}^2\right)}}$
When there’s no association between the ordinal variables, the population-based values of these measures are 0.
Statistical software packages provide these tests.
$\hat{\gamma} = \frac{C - D}{C + D} = \frac{83164 - 73496}{83164 + 73496} = 0.0617$
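Given C and D, both measures are a few lines of arithmetic. The sketch below assumes the 3x3 alcohol risk by sick days counts implied by the C and D expressions, and takes C = 83164 and D = 73496 as given. The function name is illustrative. For these data gamma is about 0.062 and τb comes out smaller, about 0.035, because its denominator also accounts for tied pairs.

```python
import math


def gamma_and_tau_b(table, concordant, discordant):
    """Goodman and Kruskal's gamma and Kendall's tau-b from C, D and the margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    gamma = (concordant - discordant) / (concordant + discordant)
    denom = 0.5 * math.sqrt(
        (n ** 2 - sum(r ** 2 for r in row_totals))
        * (n ** 2 - sum(c ** 2 for c in col_totals))
    )
    tau_b = (concordant - discordant) / denom
    return gamma, tau_b


# Assumed layout, reconstructed from the counts in the C and D expressions.
table = [[347, 113, 145], [154, 63, 56], [52, 25, 34]]
gamma, tau_b = gamma_and_tau_b(table, 83164, 73496)
print(f"gamma = {gamma:.4f}, tau-b = {tau_b:.4f}")
```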
