
chi-square

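Example: a gene is examined in 30 people with MS and 30 people without MS.
The gene is present in 20 of the 30 people with MS (20/30 = 0.67) and in 15 of the
30 people without MS (15/30 = 0.50).
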
$H_0: p_1 = p_2$

$H_1: p_1 \neq p_2$

Gene       MS     No MS    Total
+          20     15       35
-          10     15       25
Total      30     30       60

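Reject $H_0$ if
$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} > \chi^2_{\alpha} \quad \text{with } df = (r - 1)(c - 1)$$
where $O_i$ and $E_i$ are the observed and expected counts in cell $i$, $K$ is the
number of cells, and $r$ and $c$ are the numbers of rows and columns of the table.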

$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} = \frac{(20 - 17.5)^2}{17.5} + \frac{(10 - 12.5)^2}{12.5} + \frac{(15 - 17.5)^2}{17.5} + \frac{(15 - 12.5)^2}{12.5} = 1.71$$
with $df = (2 - 1)(2 - 1) = 1$
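Each expected count is (row total × column total) / grand total:
$$E_1 = \frac{30 \times 35}{60} = 17.5, \quad E_2 = \frac{30 \times 25}{60} = 12.5, \quad E_3 = \frac{30 \times 35}{60} = 17.5, \quad E_4 = \frac{30 \times 25}{60} = 12.5$$

If $\chi^2 > \chi^2_{0.05} = 3.84$ (df = 1), $H_0$ is rejected. Here $\chi^2 = 1.71 < 3.84$,
so $H_0$ is not rejected.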
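The chi-square example can be checked with a short script. This is a minimal sketch in plain Python; the function name `chi_square` and the column labels in the comments are illustrative assumptions, not part of the original notes.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    `table` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under H0: (row total * column total) / grand total
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: gene +, gene -; columns: MS, no MS (labels assumed for illustration)
observed = [[20, 15], [10, 15]]
chi2, df = chi_square(observed)
print(round(chi2, 2), df)  # 1.71 1

# Critical value for alpha = 0.05 and df = 1, as quoted in the text
reject_h0 = chi2 > 3.84
print(reject_h0)  # False
```

Because the statistic stays below 3.84, the script reaches the same decision as the hand calculation.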

Prospective study

               Exposed
Disease      E        not E
D            a        b
not D        c        d

$$\text{Incidence rate} = \frac{\text{Number of new cases of a disease per unit of time}}{\text{Total number at risk at the beginning of this time period}}$$

$$\text{Relative risk} = \frac{\text{Incidence of disease among people with the risk factor (exposed group)}}{\text{Incidence of disease among people without the risk factor (unexposed group)}}$$

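$$\text{Relative risk} = \frac{a / (a + c)}{b / (b + d)} = \frac{P_1}{P_2}$$

The hypotheses can be written in three equivalent forms:
$$H_0: \frac{P_1}{P_2} = 1 \iff H_0: RR = 1 \iff H_0: P_1 = P_2$$
$$H_1: \frac{P_1}{P_2} \neq 1 \iff H_1: RR \neq 1 \iff H_1: P_1 \neq P_2$$
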
Confidence interval for relative risk (RR)
The relative risk converges to normality faster on the log scale. Thus we first
construct a confidence interval for log(RR).
A 95% confidence interval for log(RR) is:
$$\log(RR) \pm Z_{\alpha/2} \sqrt{\frac{1 - p_1}{n_1 p_1} + \frac{1 - p_2}{n_2 p_2}}$$
where $n_1 = a + c$ and $n_2 = b + d$.
Exponentiating (taking antilogs of) its endpoints provides a confidence interval for
RR.
We do not reject the null hypothesis ($H_0$) if the confidence interval includes 1.
Example:
These data come from the National Pooling Project. For purposes of this example,
high blood pressure is defined as diastolic blood pressure ≥ 105 mm Hg and normal
is defined as BP < 78 mm Hg. The disease in question is a coronary event and the
time period is 10 years. Note that hypertension is currently defined as systolic blood
pressure > 140 mmHg and/or diastolic blood pressure > 90 mmHg.

Disease:              Exposed (BP)
Coronary Event     High      Normal
+                  90        70
-                  403       1201
Total              493       1271

$$RR = \frac{\text{Incidence of disease among people with high BP (exposed group)}}{\text{Incidence of disease among people with normal BP (unexposed group)}} = \frac{a / (a + c)}{b / (b + d)} = \frac{P_1}{P_2} = \frac{90 / 493}{70 / 1271} = 3.31$$
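$$\log(RR) \pm Z_{\alpha/2} \sqrt{\frac{1 - p_1}{n_1 p_1} + \frac{1 - p_2}{n_2 p_2}}$$
$$\log(3.31) \pm 1.960 \times 0.15 = 1.197 \pm 0.294$$
$$\log(RR) \in (0.903, 1.49)$$
Since $\log_e a = c \iff a = e^c$ (with $e = 2.718$):
$$RR \in (e^{0.903}, e^{1.49}) \approx (2.47, 4.44)$$
The interval does not include 1, so $H_0: RR = 1$ is rejected.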
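The relative-risk interval for the blood pressure example can be reproduced with a few lines of Python. This is a minimal sketch under the cell layout used above (a, c = exposed with and without disease; b, d = unexposed with and without disease); the function name `relative_risk_ci` is an illustrative assumption. Because it carries unrounded intermediate values, its endpoints can differ from the hand calculation in the last digit.

```python
import math


def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk and its confidence interval, built on the log scale.

    a, c: exposed with / without disease (n1 = a + c)
    b, d: unexposed with / without disease (n2 = b + d)
    z:    normal quantile, 1.96 for a 95% interval
    """
    n1, n2 = a + c, b + d
    p1, p2 = a / n1, b / n2
    rr = p1 / p2

    # Standard error of log(RR)
    se = math.sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))

    # Interval for log(RR), then exponentiate the endpoints
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, se, (lower, upper)


# Coronary events by blood pressure group (counts from the table)
rr, se, (lower, upper) = relative_risk_ci(a=90, b=70, c=403, d=1201)
print(f"RR = {rr:.2f}, SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")

# H0: RR = 1 is rejected when the interval excludes 1
print("reject H0:", not (lower <= 1 <= upper))
```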

Retrospective study

               Exposed
Disease      E        not E    Total
D            a        b        a+b
not D        c        d        c+d

$$P_1 = \frac{a}{a + b}, \qquad 1 - P_1 = \frac{b}{a + b}$$

$$P_2 = \frac{c}{c + d}, \qquad 1 - P_2 = \frac{d}{c + d}$$

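Odds Ratio (OR)

$$OR = \frac{P_1 / (1 - P_1)}{P_2 / (1 - P_2)} = \frac{\dfrac{a}{a+b} \Big/ \dfrac{b}{a+b}}{\dfrac{c}{c+d} \Big/ \dfrac{d}{c+d}} = \frac{a / b}{c / d} = \frac{ad}{bc}$$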

Confidence interval for odds ratio (OR)

The odds ratio converges to normality faster on the log scale. Thus we first
construct a confidence interval for log(OR).
A 95% confidence interval for log(OR) is:
$$\log(OR) \pm Z_{\alpha/2} \sqrt{\frac{1}{n_1 p_1 (1 - p_1)} + \frac{1}{n_2 p_2 (1 - p_2)}}$$
where $n_1 = a + b$ and $n_2 = c + d$.
Exponentiating (taking antilogs of) its endpoints provides a confidence interval for
OR.
We do not reject the null hypothesis ($H_0$) if the confidence interval includes 1.

Disease:           Exposed (Smokers)
Lung Cancer        +         -         Total
+                  475       7         482
-                  431       61        492

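For smokers: odds of lung cancer are 475/431 ≈ 1.10

For nonsmokers: odds of lung cancer are 7/61 ≈ 0.115

$$OR = \frac{ad}{bc} = \frac{475 \times 61}{7 \times 431} = 9.60$$
$$\log(9.60) \pm 1.96 \sqrt{\frac{1}{n_1 p_1 (1 - p_1)} + \frac{1}{n_2 p_2 (1 - p_2)}} = 2.262 \pm 0.793$$
$$\log(OR) \in (1.469, 3.055)$$
Since $\log_e a = c \iff a = e^c$ (with $e = 2.718$):
$$OR \in (e^{1.469}, e^{3.055}) \approx (4.3, 21.2)$$
The interval does not include 1, so $H_0: OR = 1$ is rejected.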
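The odds-ratio interval for the smoking example can be checked the same way. This is a minimal sketch, taking $n_1 = a + b$ and $n_2 = c + d$ (the row totals of the retrospective table) and $p_1 = a/n_1$, $p_2 = c/n_2$; the function name `odds_ratio_ci` is an illustrative assumption.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and its confidence interval, built on the log scale.

    a, b: diseased, exposed / unexposed (n1 = a + b)
    c, d: non-diseased, exposed / unexposed (n2 = c + d)
    z:    normal quantile, 1.96 for a 95% interval
    """
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))  # equals ad / bc

    # Standard error of log(OR); algebraically equal to sqrt(1/a + 1/b + 1/c + 1/d)
    se = math.sqrt(1 / (n1 * p1 * (1 - p1)) + 1 / (n2 * p2 * (1 - p2)))

    # Interval for log(OR), then exponentiate the endpoints
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, se, (lower, upper)


# Lung cancer by smoking status (counts from the table)
odds_ratio, se, (lower, upper) = odds_ratio_ci(a=475, b=7, c=431, d=61)
print(f"OR = {odds_ratio:.2f}, SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The standard error is dominated by the smallest cell (7 nonsmokers with lung cancer), which is why the interval is wide.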

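Hypotheses (OR):

$$H_0: \frac{P_1 / (1 - P_1)}{P_2 / (1 - P_2)} = 1 \iff H_0: OR = 1 \iff H_0: P_1 = P_2$$

$$H_1: \frac{P_1 / (1 - P_1)}{P_2 / (1 - P_2)} \neq 1 \iff H_1: OR \neq 1 \iff H_1: P_1 \neq P_2$$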
OR: (Delta method)
The delta method is a widely used procedure in statistics when an approximation is
needed for the variance of a function of a variable whose variance is known. In this
instance the variable with known variance is a proportion p, and the function is the
logit. The basic delta method formula is: $\operatorname{var}(y) \approx (dy/dx)^2 \operatorname{var}(x)$.
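
As a sketch of how this formula leads to the variance terms inside the confidence intervals, assume $\hat{p}$ is a sample proportion from $n$ independent observations, so that $\operatorname{var}(\hat{p}) = p(1-p)/n$. This binomial assumption is not stated in the notes. Applying the delta method to the logit and to the plain logarithm gives:

```latex
\begin{align*}
y = \log\frac{p}{1-p}
  &\;\Rightarrow\; \frac{dy}{dp} = \frac{1}{p} + \frac{1}{1-p} = \frac{1}{p(1-p)} \\
\operatorname{var}(y)
  &\approx \left(\frac{1}{p(1-p)}\right)^{2} \frac{p(1-p)}{n} = \frac{1}{n\,p\,(1-p)} \\[6pt]
y = \log p
  &\;\Rightarrow\; \frac{dy}{dp} = \frac{1}{p} \\
\operatorname{var}(y)
  &\approx \frac{1}{p^{2}} \cdot \frac{p(1-p)}{n} = \frac{1-p}{n\,p}
\end{align*}
```

Since log(OR) is the difference of two logits and log(RR) the difference of two log proportions, and the two groups are taken to be independent, the variances add. That gives $\frac{1}{n_1 p_1 (1 - p_1)} + \frac{1}{n_2 p_2 (1 - p_2)}$ for log(OR) and $\frac{1 - p_1}{n_1 p_1} + \frac{1 - p_2}{n_2 p_2}$ for log(RR).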
