
Outline

Chapter 2 Two-Way Contingency Tables

Categorical Data Analysis

陳俞成
Email:ycchen@mail.chna.edu.tw

2005.9.26


Chapter 2 Two-Way Contingency Tables


Probability Structure for Contingency Tables
Comparing Proportions in Two-By-Two Tables
The Odds Ratio


Two-Way Contingency Tables

- Association between two categorical variables
- Parameters that describe the association
  - differences and ratios of proportions
  - the odds ratio
- Inferential methods for those parameters
  - Large-sample significance tests for nominal variables
  - Large-sample significance tests for ordinal variables
  - Small-sample analyses


- Let X and Y denote two categorical variables, X having I levels and Y having J levels.
- A contingency table is shown below:

                 Y
  X      y1     y2     ...    yJ
  x1     n11    n12    ...    n1J
  x2     n21    n22    ...    n2J
  ...    ...    ...    ...    ...
  xI     nI1    nI2    ...    nIJ

- The cells of the table represent the IJ possible outcomes, with count nij in the cell in row i and column j.
- A contingency table that cross-classifies two variables is called a two-way table.
- A two-way table having I rows and J columns is called an I × J (read "I-by-J") table.
- A contingency table that cross-classifies three variables is called a three-way table.


- Joint, Marginal, and Conditional Probabilities (conditional probabilities given the row are in parentheses)

                 Column
  Row      1          2          Total
  1        π11        π12        π1+
           (π1|1)     (π2|1)     (1.0)
  2        π21        π22        π2+
           (π1|2)     (π2|2)     (1.0)
  Total    π+1        π+2        1.0


The joint distribution

- Let πij = P(X = i, Y = j) denote the probability that (X, Y) falls in the cell in row i and column j.
- The probabilities {πij} form the joint distribution of X and Y.
- $\sum_{i,j} \pi_{ij} = 1$


The marginal distribution

- The marginal distributions are the row and column totals of the joint probabilities.
  - {πi+} for the row variable (X)
  - {π+j} for the column variable (Y)
- The subscript "+" denotes the sum over the index it replaces. For instance, for a 2 × 2 table,

  π1+ = π11 + π12 and π+1 = π11 + π21


The conditional distribution

- Let the column variable, Y, be a response variable and the row variable, X, be an explanatory variable.
- The conditional distribution consists of the conditional probabilities for Y, given the level of X.
- $P(Y = j \mid X = i) = \pi_{j|i} = \pi_{ij} / \pi_{i+}$ for all i and j


- The sample joint, marginal, and conditional probabilities

                 Column
  Row      1          2          Total
  1        p11        p12        p1+
           (p1|1)     (p2|1)     (1.0)
  2        p21        p22        p2+
           (p1|2)     (p2|2)     (1.0)
  Total    p+1        p+2        1.0


- The cell counts are denoted by {nij}, with $n = \sum_{i,j} n_{ij}$ denoting the total sample size.

                 Column
  Row      1       2       Total
  1        n11     n12     n1+
  2        n21     n22     n2+
  Total    n+1     n+2     n

- $p_{ij} = n_{ij} / n$
- The marginal frequencies are the row totals {ni+} and the column totals {n+j}.

Belief in Afterlife Example

                  Belief in Afterlife
  Gender     Yes          No or Undecided    Total
  Females    n11 = 435    n12 = 147          n1+ = 582
  Males      n21 = 375    n22 = 134          n2+ = 509
  Total      n+1 = 810    n+2 = 281          n = 1091


Belief in Afterlife Example

- The joint probabilities:
  p11 = 435/1091 = .399, p12 = 147/1091,
  p21 = 375/1091, p22 = 134/1091


Belief in Afterlife Example

- The marginal probabilities:
  p1+ = 582/1091, p2+ = 509/1091
  for the gender variable, and
  p+1 = 810/1091, p+2 = 281/1091
  for the belief-in-afterlife variable


Belief in Afterlife Example

- The conditional probabilities for the belief-in-afterlife variable, given gender:
  p1|1 = 435/582 = .747, p2|1 = 147/582,
  given gender = females;
  p1|2 = 375/509, p2|2 = 134/509,
  given gender = males
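
The sample joint, marginal, and conditional probabilities for the Belief in Afterlife table can be reproduced with a short Python sketch. The counts come from the table; the variable names are illustrative and not part of the original slides.

```python
# Counts from the Belief in Afterlife table.
# Rows: Females, Males. Columns: Yes, No or Undecided.
counts = [[435, 147],
          [375, 134]]

n = sum(sum(row) for row in counts)               # total sample size n
row_totals = [sum(row) for row in counts]         # n_{i+}
col_totals = [sum(col) for col in zip(*counts)]   # n_{+j}

# Sample joint probabilities p_ij = n_ij / n
joint = [[nij / n for nij in row] for row in counts]

# Sample marginal probabilities p_{i+} and p_{+j}
row_marginal = [ni / n for ni in row_totals]
col_marginal = [nj / n for nj in col_totals]

# Sample conditional probabilities p_{j|i} = n_ij / n_{i+}
conditional = [[nij / ni for nij in row]
               for row, ni in zip(counts, row_totals)]

print(round(joint[0][0], 3))        # p_11
print(round(conditional[0][0], 3))  # p_{1|1}
```

Each row of `conditional` sums to 1, matching the (1.0) entries in the probability tables.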


Independence
- Two variables are said to be statistically independent if the conditional distributions of Y are identical at each level of X.
- That is, πj|i = π+j for all i and j.
- When both variables are response variables, one can describe their relationship using their joint distribution, the conditional distribution of Y given X, or the conditional distribution of X given Y.

Independence

- πij = πi+ π+j for i = 1, ..., I and j = 1, ..., J
- When X and Y are independent,

  πj|i = πij / πi+ = (πi+ π+j) / πi+ = π+j

  for i = 1, ..., I
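
As an illustration of the factorization πij = πi+ π+j, the following Python sketch checks whether a joint distribution satisfies it. The two joint tables are hypothetical numbers made up for this example, and `is_independent` is an illustrative helper name.

```python
def is_independent(joint, tol=1e-12):
    """Return True if pi_ij equals pi_{i+} * pi_{+j} in every cell (up to tol)."""
    row = [sum(r) for r in joint]            # pi_{i+}
    col = [sum(c) for c in zip(*joint)]      # pi_{+j}
    return all(abs(joint[i][j] - row[i] * col[j]) <= tol
               for i in range(len(row))
               for j in range(len(col)))

# Hypothetical joint distributions (not from the slides).
independent = [[0.12, 0.28],
               [0.18, 0.42]]   # row totals 0.4, 0.6; column totals 0.3, 0.7
dependent = [[0.30, 0.10],
             [0.10, 0.50]]

print(is_independent(independent))
print(is_independent(dependent))
```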


Sample distribution

- Let {pij} denote the sample joint distribution. The cell frequencies are denoted {nij}, and $n = \sum_i \sum_j n_{ij}$ is the total sample size.
- pij = nij / n for i = 1, ..., I and j = 1, ..., J
- pj|i = pij / pi+ = nij / ni+, where $n_{i+} = n p_{i+} = \sum_j n_{ij}$


Poisson Sampling

- The Poisson sampling model for a 2 × 2 table treats each of the four cell counts in the table as an independent Poisson variate with parameters {µij}.
- The joint probability mass function for potential outcomes {nij} is

  $\prod_i \prod_j \exp(-\mu_{ij}) \, \mu_{ij}^{n_{ij}} / n_{ij}!$
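
The Poisson joint probability mass function can be evaluated directly as a product over cells. This is a minimal Python sketch; the counts and means are hypothetical values chosen for illustration, and `poisson_table_pmf` is an illustrative name.

```python
import math

def poisson_table_pmf(counts, means):
    """Joint pmf of independent Poisson cells: prod exp(-mu_ij) mu_ij^n_ij / n_ij!."""
    prob = 1.0
    for row_n, row_mu in zip(counts, means):
        for n_ij, mu_ij in zip(row_n, row_mu):
            prob *= math.exp(-mu_ij) * mu_ij ** n_ij / math.factorial(n_ij)
    return prob

# Hypothetical 2 x 2 table of counts and Poisson means (not from the slides).
counts = [[2, 1], [0, 3]]
means = [[2.0, 1.0], [0.5, 3.0]]

print(poisson_table_pmf(counts, means))
```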


Poisson Sampling

H0: Alcohol use is independent of whether a traffic accident results in death.

                 Alcohol use
  Death    Yes     No      Total
  Yes      n11     n12     n1+
  No       n21     n22     n2+
  Total    n+1     n+2     n++ = n


Independent Binomial Sampling


- When the rows of a contingency table refer to different groups, the sample sizes for those groups are often fixed by the sampling design.
- n1j | n1+ ∼ bin(n1+, πj|1) and n2j | n2+ ∼ bin(n2+, πj|2) are independent.
- The joint probability mass function for potential outcomes {nij} is

  $\frac{\prod_i n_{i+}!}{\prod_i \prod_j n_{ij}!} \prod_i \prod_j \pi_{j|i}^{n_{ij}}$

Independent Binomial Sampling

H0: Catching a cold is independent of whether vitamins are taken.

                    Cold
  Treatment    Yes     No      Total
  Vitamin      n11     n12     n1+
  Placebo      n21     n22     n2+
  Total        n+1     n+2     n++ = n


Multinomial Sampling

- When the total sample size in the table is fixed but not the row or column totals, a multinomial sampling model applies.
- The joint probability mass function for potential outcomes {nij} is

  $[n! / (n_{11}! \cdots n_{IJ}!)] \prod_i \prod_j \pi_{ij}^{n_{ij}}$
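
The multinomial probability mass function can be computed in the same cell-by-cell way. The sketch below is illustrative only: the table and cell probabilities are hypothetical, and `multinomial_table_pmf` is not a name from the slides.

```python
import math

def multinomial_table_pmf(counts, probs):
    """Multinomial pmf: [n! / prod n_ij!] * prod pi_ij^n_ij over all cells."""
    cells = [n_ij for row in counts for n_ij in row]
    ps = [p_ij for row in probs for p_ij in row]
    coef = math.factorial(sum(cells))
    for n_ij in cells:
        coef //= math.factorial(n_ij)   # stays an integer at every step
    prob = float(coef)
    for n_ij, p_ij in zip(cells, ps):
        prob *= p_ij ** n_ij
    return prob

# Hypothetical example: n = 4 observations, one in each equally likely cell.
counts = [[1, 1], [1, 1]]
probs = [[0.25, 0.25], [0.25, 0.25]]

print(multinomial_table_pmf(counts, probs))   # 4! * 0.25**4
```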


Multinomial Sampling

H0: Belief in an afterlife is independent of gender.

                 Belief in afterlife
  Gender     Yes     No      Total
  Females    n11     n12     n1+
  Males      n21     n22     n2+
  Total      n+1     n+2     n++ = n


Difference of Proportions

- Let π1|1 = π1 and π1|2 = π2.
- −1 ≤ π1 − π2 ≤ 1
- When the response is independent of the group classification, π1 = π2, so π1 − π2 = 0.
- For sample proportions: p1 = n11 / n1+, p2 = n21 / n2+


Difference of Proportions

- H0: π1 = π2 vs. Ha: π1 ≠ π2 at significance level α = 0.05
- $z = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{p^*(1 - p^*)\left(\frac{1}{n_{1+}} + \frac{1}{n_{2+}}\right)}} \mathrel{\dot\sim} N(0, 1)$, where $p^* = \dfrac{n_{11} + n_{21}}{n_{1+} + n_{2+}}$ and π1 − π2 = 0 under H0
- Reject H0 if z > 1.96 (= zα/2) or z < −1.96 (= −zα/2)
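
A minimal Python sketch of this pooled two-proportion z test follows. It is applied, purely as an illustration, to the Belief in Afterlife counts given earlier (females 435 of 582, males 375 of 509); the function name `two_proportion_z` is an assumption of this sketch.

```python
import math

def two_proportion_z(n11, n1_plus, n21, n2_plus):
    """z statistic for H0: pi1 = pi2, using the pooled proportion p*."""
    p1 = n11 / n1_plus
    p2 = n21 / n2_plus
    p_star = (n11 + n21) / (n1_plus + n2_plus)
    se = math.sqrt(p_star * (1 - p_star) * (1 / n1_plus + 1 / n2_plus))
    return (p1 - p2) / se

# Belief in Afterlife counts: "Yes" responses among females and males.
z = two_proportion_z(435, 582, 375, 509)
reject = abs(z) > 1.96   # two-sided test at alpha = 0.05

print(round(z, 2), reject)
```

With these counts |z| is well below 1.96, so this sketch would not reject H0 at the 0.05 level.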


Difference of Proportions

- A large-sample 100(1 − α)% confidence interval for π1 − π2 is

  $(p_1 - p_2) \pm z_{\alpha/2} \, \hat\sigma(p_1 - p_2)$

  where $\hat\sigma(p_1 - p_2) = \sqrt{\dfrac{p_1(1 - p_1)}{n_{1+}} + \dfrac{p_2(1 - p_2)}{n_{2+}}}$

- zα/2 is the 100(1 − α/2)th percentile of N(0, 1)


Aspirin and Heart Attacks Example

- Double-blind experiment: neither the physicians nor the patients in the study know which drug is being taken.

                 Myocardial Infarction
  Group      Yes     No        Total
  Placebo    189     10845     11034
  Aspirin    104     10933     11037

  Source: N. Engl. J. Med., 318:262-264 (1988)


Aspirin and Heart Attacks Example

- H0: Taking aspirin is independent of having a myocardial infarction (MI)
- π1 = P(MI | placebo), π2 = P(MI | aspirin)
- H0: π1 = π2 or H0: π1 − π2 = 0


Aspirin and Heart Attacks Example

- p1 = 189/11034 = 0.0171, p2 = 104/11037 = 0.0094
- $\hat\sigma(p_1 - p_2) = \sqrt{\dfrac{(.0171)(.9829)}{11034} + \dfrac{(.0094)(.9906)}{11037}} = 0.0015$
- A 95% C.I. for π1 − π2 is 0.0077 ± 1.96(0.0015), or (0.005, 0.011)
- ∵ 0 ∉ (0.005, 0.011) ∴ reject H0: π1 − π2 = 0
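
The interval above can be checked numerically. This is a small Python sketch using the aspirin counts from the table; `diff_ci` is an illustrative helper name, and 1.96 is the z value for a 95% interval.

```python
import math

def diff_ci(n11, n1_plus, n21, n2_plus, z=1.96):
    """Large-sample confidence interval for pi1 - pi2 (unpooled standard error)."""
    p1 = n11 / n1_plus
    p2 = n21 / n2_plus
    se = math.sqrt(p1 * (1 - p1) / n1_plus + p2 * (1 - p2) / n2_plus)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Placebo: 189 MI out of 11034; Aspirin: 104 MI out of 11037.
lower, upper = diff_ci(189, 11034, 104, 11037)

print(round(lower, 3), round(upper, 3))
```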


Relative Risk
- In 2 × 2 tables, the relative risk is the ratio of the "success" probabilities for the two groups, π1 / π2.
- The sample relative risk is p1 / p2.

Relative Risk

- When p1 and p2 are both close to 0, the difference of proportions (p1 − p2) can be misleading; the relative risk is recommended instead.


Relative Risk

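- Suppose we compare the side effects of two drugs.
  In one case the proportions are p1 = 0.01, p2 = 0.001;
  in another case they are p1 = 0.41, p2 = 0.401.
  In terms of p1 − p2, the difference is 0.009 in both cases,
  but the relative risk is 0.01/0.001 = 10 in the first case
  and 0.41/0.401 = 1.02 in the second.
  The relative risk makes it clear that the first case deserves more attention.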
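
The comparison of the two measures in this example can be reproduced in a few lines of Python. The proportions are the ones stated above; the dictionary labels are illustrative.

```python
# (p1, p2) for the two cases in the drug side-effect example.
cases = {"first": (0.01, 0.001), "second": (0.41, 0.401)}

summary = {}
for name, (p1, p2) in cases.items():
    difference = round(p1 - p2, 3)      # difference of proportions
    relative_risk = round(p1 / p2, 2)   # ratio of proportions
    summary[name] = (difference, relative_risk)
    print(name, difference, relative_risk)
```

Both cases share the same difference, while the relative risks differ by roughly a factor of ten.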


Relative Risk

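- If π1 = π2, the relative risk equals 1; that is, the explanatory variable and the response variable are independent.
- Sometimes the ratio of the "failure" probabilities also provides useful information.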


Relative Risk for Aspirin Study

- H0: Taking aspirin is independent of having a myocardial infarction (MI)
- π1 = P(MI | placebo), π2 = P(MI | aspirin)
- H0: π1 = π2 or H0: π1 / π2 = 1
- p1 = 189/11034 = 0.0171, p2 = 104/11037 = 0.0094


Relative Risk for Aspirin Study

- The relative risk = p1 / p2 = 0.0171 / 0.0094 = 1.82
- A 95% C.I. for π1 / π2 is (1.43, 2.30)
- ∵ 1 ∉ (1.43, 2.30) ∴ reject H0: π1 / π2 = 1
- The C.I. for the relative risk indicates that the risk of MI is at least 43% higher for the placebo group.
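
The slides report the interval (1.43, 2.30) without stating how it was obtained. One common large-sample approach, assumed here and not confirmed by the source, is a Wald interval on the log scale with standard error sqrt(1/n11 − 1/n1+ + 1/n21 − 1/n2+). The Python sketch below applies it to the aspirin counts; `relative_risk_ci` is an illustrative name.

```python
import math

def relative_risk_ci(n11, n1_plus, n21, n2_plus, z=1.96):
    """Sample relative risk with a log-scale Wald interval (assumed method)."""
    rr = (n11 / n1_plus) / (n21 / n2_plus)
    # Assumed large-sample standard error of log(relative risk).
    se_log = math.sqrt(1 / n11 - 1 / n1_plus + 1 / n21 - 1 / n2_plus)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Placebo: 189 MI out of 11034; Aspirin: 104 MI out of 11037.
rr, rr_lower, rr_upper = relative_risk_ci(189, 11034, 104, 11037)

print(round(rr, 2), round(rr_lower, 2), round(rr_upper, 2))
```

Under this assumption the result agrees with the reported interval up to rounding in the last digit.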


- Suppose the probability of "success" is π1 in row 1 and π2 in row 2 of a 2 × 2 table.
- Within row 1, the odds of success are defined to be odds1 = π1 / (1 − π1).
- Within row 2, the odds of success equal odds2 = π2 / (1 − π2).
- odds = π / (1 − π) ⇒ π = odds / (odds + 1)


- The odds are nonnegative, with value greater than 1.0 when a success is more likely than a failure.
- When odds = 4.0, a success is four times as likely as a failure. The probability of success is π = 4 / (4 + 1) = 0.8, and the probability of failure is 0.2.
- When odds = 1/4, a failure is four times as likely as a success. The probability of success is π = (1/4) / ((1/4) + 1) = 0.2, and the probability of failure is 0.8.
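
The conversion between odds and probability can be written as two small Python functions. The function names are illustrative; the two checks reuse the odds values 4.0 and 1/4 from the examples above.

```python
def odds_from_prob(pi):
    """odds = pi / (1 - pi), defined for 0 <= pi < 1."""
    return pi / (1 - pi)

def prob_from_odds(odds):
    """pi = odds / (odds + 1), defined for odds >= 0."""
    return odds / (odds + 1)

print(prob_from_odds(4.0))    # success four times as likely as failure
print(prob_from_odds(0.25))   # failure four times as likely as success
```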

- When π1 = π2, the odds satisfy odds1 = odds2.
- The variables X and Y are then independent.


- The odds ratio is defined as

  $\theta = \dfrac{\text{odds}_1}{\text{odds}_2} = \dfrac{\pi_1 / (1 - \pi_1)}{\pi_2 / (1 - \pi_2)} = \dfrac{\pi_1 (1 - \pi_2)}{\pi_2 (1 - \pi_1)} = \dfrac{\pi_{11} \pi_{22}}{\pi_{12} \pi_{21}}$

- Whereas the relative risk is a ratio of two probabilities, the odds ratio θ is a ratio of two odds.


Properties of the Odds Ratio

- When θ = 1, X and Y are independent (i.e., π1 = π2).
- When 1 < θ < ∞, the odds of success are higher in row 1 than in row 2. This represents a positive association (i.e., π1 > π2).
- When 0 < θ < 1, a success is less likely in row 1 than in row 2. This represents a negative association (i.e., π1 < π2).

Properties of the Odds Ratio

- Values of θ farther from 1.0 in a given direction represent stronger levels of association.
- When the order of the rows is reversed or the order of the columns is reversed, the new value of θ is the inverse of the original value. The value of |log(θ)| remains the same (invariance property).


Properties of the Odds Ratio

- The odds ratio does not change value when the orientation of the table reverses so that the rows become the columns and the columns become the rows.
- When the counts in a row or in a column are multiplied by a nonzero constant, the value of the odds ratio does not change (it is independent of the sample size).


Properties of the Odds Ratio


- When both variables are responses, the odds ratio can be defined as

  $\theta = \dfrac{\pi_{11} / \pi_{12}}{\pi_{21} / \pi_{22}} = \dfrac{\pi_{11} \pi_{22}}{\pi_{12} \pi_{21}}$

- The odds ratio is also called the cross-product ratio.
- The sample odds ratio equals the ratio of the sample odds in the two rows,

  $\hat\theta = \dfrac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)} = \dfrac{n_{11} / n_{12}}{n_{21} / n_{22}} = \dfrac{n_{11} n_{22}}{n_{12} n_{21}}$

- For the standard sampling schemes, this is the ML estimator of the true odds ratio.

Odds Ratio for Aspirin Study

- H0: Taking aspirin is independent of having a myocardial infarction (MI)
- π1 = P(MI | placebo), π2 = P(MI | aspirin)
- H0: π1 = π2 or H0: θ = 1
- odds1 = n11 / n12 = 189/10845 = 0.0174 = 1.74/100,
  odds2 = n21 / n22 = 104/10933 = 0.0095 = 0.95/100


Odds Ratio for Aspirin Study

- The odds ratio = 0.0174 / 0.0095 = (189 × 10933) / (104 × 10845) = 1.832
- The estimated odds were 83% higher for the placebo group.
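
The sample odds and the odds ratio for the aspirin study can be verified with a short Python sketch. The counts are from the Myocardial Infarction table (placebo: 189 yes, 10845 no; aspirin: 104 yes, 10933 no); variable names are illustrative.

```python
# Aspirin study counts: rows are Placebo and Aspirin, columns are MI Yes and MI No.
n11, n12 = 189, 10845
n21, n22 = 104, 10933

odds1 = n11 / n12                # sample odds of MI in the placebo group
odds2 = n21 / n22                # sample odds of MI in the aspirin group
theta_hat = odds1 / odds2        # sample odds ratio
cross_product = (n11 * n22) / (n12 * n21)   # same quantity as a cross-product ratio

print(round(odds1, 4), round(odds2, 4), round(theta_hat, 3))
```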


Inference for Odds Ratios and Log Odds Ratios

- For small to moderate sample sizes, the sampling distribution of the odds ratio is highly skewed (to the right).
- The log odds ratio, log(θ), is symmetric about zero: reversing the rows or the columns changes only its sign.


Inference for Odds Ratios and Log Odds Ratios

- Independence corresponds to log(θ) = 0; that is, θ = 1.
- Doubling a log odds ratio corresponds to squaring an odds ratio.


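Inference for Odds Ratios and Log Odds Ratios

- $\log\hat\theta \mathrel{\dot\sim} N(\log\theta, \mathrm{var}(\log\hat\theta))$, or $\dfrac{\log\hat\theta - \log\theta}{\sqrt{\widehat{\mathrm{var}}(\log\hat\theta)}} \to N(0, 1)$
- An asymptotic standard error is denoted by ASE.
- $\mathrm{var}(\log\hat\theta) \doteq \left(\dfrac{1}{\pi_{11}} + \dfrac{1}{\pi_{12}} + \dfrac{1}{\pi_{21}} + \dfrac{1}{\pi_{22}}\right) \cdot \dfrac{1}{n}$, where $n = n_{++}$
- $\widehat{\mathrm{var}}(\log\hat\theta) = \dfrac{1}{n_{11}} + \dfrac{1}{n_{12}} + \dfrac{1}{n_{21}} + \dfrac{1}{n_{22}}$, since $p_{ij} = n_{ij}/n$
- $ASE(\log\hat\theta) = \sqrt{\dfrac{1}{n_{11}} + \dfrac{1}{n_{12}} + \dfrac{1}{n_{21}} + \dfrac{1}{n_{22}}}$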


Inference for Odds Ratios and Log Odds Ratios

- A large-sample confidence interval for log θ is

  $\log\hat\theta \pm z_{\alpha/2} \, ASE(\log\hat\theta)$

- A large-sample confidence interval for θ is

  $\left(\exp(\log\hat\theta - z_{\alpha/2} \, ASE(\log\hat\theta)),\ \exp(\log\hat\theta + z_{\alpha/2} \, ASE(\log\hat\theta))\right)$


Log Odds Ratio for Aspirin Study

- H0: Taking aspirin is independent of having a myocardial infarction (MI)
- π1 = P(MI | placebo), π2 = P(MI | aspirin)
- H0: π1 = π2 or H0: log θ = 0
- $\log\hat\theta = \log(1.832) = 0.605$


Log Odds Ratio for Aspirin Study

- $ASE(\log\hat\theta) = \sqrt{1/189 + 1/10933 + 1/10845 + 1/104} = 0.123$
- A 95% C.I. for log θ is (0.605 ± 1.96 × 0.123) = (0.365, 0.846)
  ∵ 0 ∉ (0.365, 0.846) ∴ reject H0: log θ = 0
- A 95% C.I. for θ is (exp(0.365), exp(0.846)) = (1.44, 2.33)
  ∵ 1 ∉ (1.44, 2.33) ∴ reject H0: θ = 1
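
The log odds ratio, its ASE, and both confidence intervals for the aspirin study can be reproduced with the Python sketch below. The counts are from the aspirin table; `log_odds_ratio_ci` is an illustrative name, and 1.96 is the z value for a 95% interval.

```python
import math

def log_odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Return log(theta_hat), its ASE, and the large-sample C.I. for log(theta)."""
    log_theta = math.log((n11 * n22) / (n12 * n21))
    ase = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    return log_theta, ase, (log_theta - z * ase, log_theta + z * ase)

# Placebo: 189 MI, 10845 no MI; Aspirin: 104 MI, 10933 no MI.
log_theta, ase, (log_lower, log_upper) = log_odds_ratio_ci(189, 10845, 104, 10933)

# Exponentiate the endpoints to obtain the interval for theta itself.
theta_lower, theta_upper = math.exp(log_lower), math.exp(log_upper)

print(round(log_theta, 3), round(ase, 3))
print(round(log_lower, 3), round(log_upper, 3))
print(round(theta_lower, 2), round(theta_upper, 2))
```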

Log Odds Ratio for Aspirin Study

- The interval indicates that the odds of MI are at least 44% higher for subjects taking placebo than for subjects taking aspirin.
- $\hat\theta = 1.83$ is not the midpoint of (1.44, 2.33), because the sampling distribution of $\hat\theta$ is skewed to the right.


- The sample odds ratio $\hat\theta$ equals 0 or ∞ if any nij = 0, and it is undefined if both entries in a row or column are zero.
- The amended estimator is $\tilde\theta = \dfrac{(n_{11} + 0.5)(n_{22} + 0.5)}{(n_{12} + 0.5)(n_{21} + 0.5)}$
- For the aspirin study, $\tilde\theta = \dfrac{(189.5)(10933.5)}{(10845.5)(104.5)} = 1.828 \approx \hat\theta = 1.832$, since no cell count is especially small.
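
A short Python sketch of the amended estimator follows. The first call uses the aspirin counts; the second uses a hypothetical table with a zero cell, made up here to show that the amended estimator stays finite where the ordinary sample odds ratio would not.

```python
def amended_odds_ratio(n11, n12, n21, n22):
    """Amended estimator: add 0.5 to every cell before forming the cross-product ratio."""
    return ((n11 + 0.5) * (n22 + 0.5)) / ((n12 + 0.5) * (n21 + 0.5))

# Aspirin study: no cell is small, so the amendment changes little.
theta_tilde = amended_odds_ratio(189, 10845, 104, 10933)

# Hypothetical table with n12 = 0 (not from the slides): n11*n22/(n12*n21) would divide by zero.
theta_tilde_zero_cell = amended_odds_ratio(5, 0, 3, 7)

print(round(theta_tilde, 3), round(theta_tilde_zero_cell, 3))
```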


Relationship Between Odds Ratio and Relative Risk

- $\hat\theta = 1.83$ does not mean p1 = 1.83 p2.
- $\hat\theta = 1.83$ means p1 / (1 − p1) = 1.83 p2 / (1 − p2).
- $\text{Odds ratio} = \dfrac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)} = \text{Relative risk} \times \dfrac{1 - p_2}{1 - p_1}$
- When the proportion of successes is close to zero for both groups,

  $\dfrac{1 - p_2}{1 - p_1} \approx 1.0 \ \Rightarrow\ \hat\theta \approx \dfrac{p_1}{p_2} = \text{relative risk}$
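
The identity linking the odds ratio and the relative risk can be checked numerically with the aspirin counts. This Python sketch is illustrative; the variable names are not from the slides.

```python
# Aspirin study: sample proportions of MI in the placebo and aspirin groups.
p1 = 189 / 11034
p2 = 104 / 11037

relative_risk = p1 / p2
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))
factor = (1 - p2) / (1 - p1)    # close to 1 because both proportions are near zero

print(round(relative_risk, 2), round(factor, 3), round(odds_ratio, 2))
```

Because `factor` is close to 1 here, the odds ratio and the relative risk nearly coincide.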


Smoking Status and Myocardial Infarction Study

- For some data sets, calculation of the relative risk is not possible, yet one can calculate the odds ratio and use it to approximate the relative risk.
- The first column of the table refers to 262 young and middle-aged women (age < 69) admitted to 30 coronary care units in northern Italy with acute MI during the period 1983-1988.

Smoking Status and Myocardial Infarction Study

- Each case was matched with two control patients admitted to the same hospitals with other acute disorders. The controls fall in the second column of the table.


Smoking Status and Myocardial Infarction Study

  Ever       Myocardial
  Smoker     Infarction    Controls    Total
  Yes        172           173         345
  No         90            346         436
  Total      262           519         781

  Source: J. Epidemiol. and Commun. Health, 43:214-217 (1989)

Smoking Status and Myocardial Infarction Study

- P(Y = Cases | X = Yes) and P(Y = Cases | X = No) cannot be estimated, since the proportion of cases, 262/(262 + 519) ≈ 1/3, is fixed by the sampling design and has no meaning.
- The test of H0: P(Y = Cases | X = Yes) = P(Y = Cases | X = No) cannot be carried out.
- The test of H0: P(X = Yes | Y = Cases) = P(X = Yes | Y = Controls) can be carried out.

Smoking Status and Myocardial Infarction Study

- Since n+1 and n+2 are fixed, n11 ∼ bin(n+1, π1) and n12 ∼ bin(n+2, π2) are treated as independent, where π1 = P(X = Yes | Y = Cases) and π2 = P(X = Yes | Y = Controls).
- H0: Smoking is independent of having a myocardial infarction
- H0: π1 = π2 or H0: θ = 1 or H0: log θ = 0
- p1 = n11 / n+1 = 172/262 = 0.656 and p2 = n12 / n+2 = 173/519 = 0.333

Smoking Status and Myocardial Infarction Study

- Under H0: π1 = π2 = π, with significance level α = 0.05
- $p^* = \dfrac{n_{11} + n_{12}}{n_{+1} + n_{+2}}$
- $z = \dfrac{p_1 - p_2}{\sqrt{p^*(1 - p^*)\left(\frac{1}{n_{+1}} + \frac{1}{n_{+2}}\right)}} \mathrel{\dot\sim} N(0, 1)$
- Reject H0: π1 = π2 = π if |z| > 1.96


Smoking Status and Myocardial Infarction Study

- Under H0: log θ = 0, with significance level α = 0.05
- $z = \dfrac{\log\hat\theta}{\sqrt{\sum_{i,j} \frac{1}{n_{ij}}}} \mathrel{\dot\sim} N(0, 1)$
- Reject H0: log θ = 0 if |z| > 1.96


Smoking Status and Myocardial Infarction Study

- odds1 = 0.656 / (1 − 0.656) and odds2 = 0.333 / (1 − 0.333)
- $\hat\theta = \dfrac{0.656 / (1 - 0.656)}{0.333 / (1 - 0.333)} = \dfrac{172 \times 346}{173 \times 90} = 3.82$
- odds1 = 3.82 × odds2
- If P(Y = Case | X) ≈ 0 for both levels of X, then $\hat\theta$ ≈ relative risk, P(Y = Case | X = Yes) / P(Y = Case | X = No).
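
The case-control calculations can be gathered in one Python sketch. The counts come from the smoking table (172 and 90 among cases, 173 and 346 among controls); the variable names are illustrative, and the two z statistics follow the formulas stated on the preceding slides.

```python
import math

# Smoking study counts. Columns: MI cases, controls. Rows: ever smoker Yes, No.
n11, n12 = 172, 173
n21, n22 = 90, 346
n_col1, n_col2 = n11 + n21, n12 + n22     # fixed column totals: 262 cases, 519 controls

p1 = n11 / n_col1                         # P(X = Yes | Y = Cases)
p2 = n12 / n_col2                         # P(X = Yes | Y = Controls)

# Pooled two-proportion z test of H0: pi1 = pi2.
p_star = (n11 + n12) / (n_col1 + n_col2)
z_prop = (p1 - p2) / math.sqrt(p_star * (1 - p_star) * (1 / n_col1 + 1 / n_col2))

# Sample odds ratio and z test of H0: log(theta) = 0.
theta_hat = (n11 * n22) / (n12 * n21)
z_log = math.log(theta_hat) / math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

print(round(p1, 3), round(p2, 3), round(theta_hat, 2))
print(abs(z_prop) > 1.96, abs(z_log) > 1.96)
```

Both statistics exceed 1.96 in absolute value, so each test in this sketch rejects H0 at the 0.05 level.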


Types of Observational Studies


- Prospective studies: We follow a sample of subjects over future years, observing the rates of the outcome of interest (such as MI) for exposed and unexposed groups.
  - In cohort studies, the subjects make their own choice about which group to join (e.g., whether to be exposed or unexposed), and we simply observe in future time who suffers MI.
  - In clinical trials, we randomly allocate subjects to the two groups of interest, again observing in future time who suffers MI.

Types of Observational Studies


- Retrospective studies: We collect a sample of subjects in which the marginal distribution of the outcome of interest (such as MI) is fixed by the sampling design, often with two controls for each case. The outcome measured for each subject is whether the subject was ever in the exposed or the unexposed group. Such a study, which uses a design to "look into the past," is called a case-control study.

Types of Observational Studies

- Cross-sectional design: We sample subjects and classify them simultaneously on the group classification and their current response.


Types of Observational Studies


- Case-control, cohort, and cross-sectional studies are called observational studies.
- A clinical trial is an experimental study, the investigator having control over which subjects enter each group.
- Clinical trials have fewer potential pitfalls, because of the use of randomization, but observational studies are often more practical for biomedical and social science research.

Two Web Sites

http://www.stat.ufl.edu/~aa/cda/cda.html
http://www.ats.ucla.edu/stat/examples/icda


Summary

- Association between two categorical variables
- Parameters that describe the association
  - differences and ratios of proportions
  - the odds ratio
- Inferential methods for those parameters
- Types of observational studies
