• In the book A Civil Action, the people in the town feared that a
contaminated water supply caused cancer
• Translating question to statistical framework:
• 12 cases of childhood leukemia (<19 y.o.) diagnosed from 1970-1979
• Total of 12,000 childhood residents (<19 y.o.)
• National average is 5 cases per 100,000 person years
• Is the cancer risk in the town different from the national average?
Finding the Point Estimate for the town
• 10 years, 12,000 people; the study was for 10 × 12,000 = 120,000
person-years
(approximation, more later when we study Survival Analysis)
• Estimating incidence: 12 events / 120,000 person-years = 0.0001 events per person-year
• Rescale: 120,000 person-years = 1.2 × 100,000 person-years, and 12/1.2 = 10 events per 100,000 person-years
• What is the uncertainty around that estimate?
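The point-estimate arithmetic can be checked in a few lines of Python. This is a minimal sketch using the counts from the slides; the variable names are illustrative only.

```python
# Point estimate of the leukemia incidence rate (counts taken from the slides).
cases = 12            # childhood leukemia cases diagnosed 1970-1979
residents = 12_000    # childhood residents (<19 y.o.)
years = 10            # length of the study period

# Approximate person-time: every resident is assumed to contribute all 10 years.
person_years = residents * years

rate_per_person_year = cases / person_years
rate_per_100k = rate_per_person_year * 100_000

print(person_years)           # 120000
print(rate_per_person_year)   # events per person-year
print(rate_per_100k)          # events per 100,000 person-years (national average: 5)
```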
Exact Interval Estimation for Poisson
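• An exact 100% × (1 − α) confidence interval for the Poisson parameter λ is given by (μ1/T, μ2/T), where x is the observed number of events, T is the person-time of follow-up, and μ1, μ2 satisfy
$\Pr(X \ge x \mid \mu = \mu_1) = \sum_{k=x}^{\infty} \frac{e^{-\mu_1}\mu_1^{k}}{k!} = \frac{\alpha}{2}$
and
$\Pr(X \le x \mid \mu = \mu_2) = \sum_{k=0}^{x} \frac{e^{-\mu_2}\mu_2^{k}}{k!} = \frac{\alpha}{2}$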
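The two tail equations that define the exact interval can be solved numerically. The sketch below is one possible approach, using plain bisection and only the standard library; the function names and the search bracket are assumptions made for illustration, not part of the course material.

```python
import math

def poisson_cdf(x, mu):
    """Pr(X <= x) for X ~ Poisson(mu), summed term by term."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)          # Pr(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mu / k            # Pr(X = k) from Pr(X = k - 1)
        total += term
    return total

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(x, alpha=0.05):
    """Exact two-sided CI (mu1, mu2) for the Poisson mean, given observed count x."""
    bracket = x + 10 * math.sqrt(x + 1) + 10   # generous upper search limit (assumption)
    # mu2 solves Pr(X <= x | mu2) = alpha / 2
    mu2 = bisect(lambda m: poisson_cdf(x, m) - alpha / 2, 1e-12, bracket)
    if x == 0:
        return 0.0, mu2
    # mu1 solves Pr(X >= x | mu1) = 1 - Pr(X <= x - 1 | mu1) = alpha / 2
    mu1 = bisect(lambda m: 1 - poisson_cdf(x - 1, m) - alpha / 2, 1e-12, bracket)
    return mu1, mu2

mu1, mu2 = exact_poisson_ci(12)
print(round(mu1, 2), round(mu2, 2))   # compare with the tabulated 95% limits for x = 12
```

Dividing both limits by the person-time T converts the interval for μ into an interval for the rate λ.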
Example: Leukemia
• 12 cases in 1.2 × 100,000 person-years
• What is the 95% confidence interval for μ? For λ?
• 12 cases out of 1.2 × 100,000 person-years is a very small count, so use exact
methods.
• For μ, use Table 7, page 881. Look under the 95% column in the row where x = 12.
Piece of Table 7, page 881
Example: Leukemia - Confidence Intervals
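• The 95% CI for μ is (6.20, 20.96)
• For λ, convert from μ: λ = μ/T with T = 1.2 (in units of 100,000 person-years), so the 95% CI for λ is (6.20/1.2, 20.96/1.2) = (5.17, 17.47) cases per 100,000 person-years
• From Stata, with exposure entered as T = 120,000 person-years, the 95% CI for λ is about (0.0000517, 0.0001747) cases per person-year.
Equivalently, λ is between 5.17 and 17.47 cases per 100,000 person-years.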
One-Sample Inference for
Poisson Distribution
Section 7.10
Example (page 259) 7.57: Occupational
Health: Rubber workers
• Starting on January 1, 1964, 8418 white male rubber workers ages 40-84
were followed for 10 years for various mortality outcomes, and their
mortality was compared with US white male mortality rates in 1968. 4 deaths due
to Hodgkin’s disease were observed, compared with 3.3 deaths expected
from US mortality rates. Is this difference significant?
Example (page 259) 7.57: Occupational
Health: Rubber workers Continued
• Let
X = total number of deaths in the study population
Yi = 1 if individual i dies during the study period, 0 otherwise
pi = probability of death for the ith individual
Therefore Yi ~ Bernoulli(pi) and X = Y1 + ... + Yn
• Under H0: death rate of rubber workers = US general population, the expected number of events μ0 is
given by $\mu_0 = \sum_{i=1}^{n} p_i$
• If the disease is rare, then the number of events X is approximately Poisson(μ0) under H0, and we test H0: μ = μ0 vs. H1: μ ≠ μ0
So if we have…
• One variable of interest? _Yes_
• We have one-sample? _Yes_
• Assume that underlying distribution is normal or CLT holds? _No_
• Is underlying distribution binomial? _No_
• Is underlying distribution Poisson? _Yes_
• Then we have a one-sample Poisson Test…..
Mortality rates and the Poisson
• The Poisson distribution can be used as a model for the counts of
events, such as death, occurring rarely in a population.
• Useful when the probability of the event, such as mortality, is not
constant for everyone in the population being considered
• When the probability is not constant, then the binomial distribution
does not apply
One sample test: Poisson Distribution Critical
Value Method (as opposed to p-value method)
• Use the Poisson exact confidence interval method
• If X is a Poisson random variable with expected value
μ, then to test the hypothesis
H0: μ = μ0 versus H1: μ ≠ μ0
using a two-sided test of level α:
• Construct the two-sided 100% × (1 − α) confidence
interval for μ, based on the observed value x: (c1, c2)
• If μ0 < c1 or μ0 > c2 (outside the interval) then reject H0
• If c1 ≤ μ0 ≤ c2 (inside the confidence interval) then do
not reject H0
Example: Occupational Health Rubber workers –
Critical value Method
• μ0 = 3.3; x = 4
• Stata output for 4 deaths with an exposure of 8418 workers × 10 years = 84,180 person-years:
                                        -- Poisson Exact --
    Variable |   Exposure        Mean   Std. Err.   [95% Conf. Interval]
-------------+-----------------------------------------------------------
             |      84180    .0000475    .0000238    .0000129    .0001217
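Stata reports the interval on the rate scale (deaths per person-year), while the critical value method compares μ0 with an interval for the expected count μ. A small sketch of that rescaling, assuming the exposure of 84,180 person-years shown in the output; the variable names are illustrative.

```python
# Exact 95% limits for the rate, copied from the Stata output (deaths per person-year).
rate_lo, rate_hi = 0.0000129, 0.0001217
person_years = 84180      # exposure reported by Stata
mu0 = 3.3                 # expected deaths under H0

# Rescale to a confidence interval (c1, c2) for the expected number of deaths mu.
c1 = rate_lo * person_years
c2 = rate_hi * person_years

# Critical value method: reject H0 only if mu0 falls outside (c1, c2).
reject_h0 = mu0 < c1 or mu0 > c2
print(round(c1, 2), round(c2, 2), reject_h0)
```

Here μ0 = 3.3 lies inside the interval, so H0 is not rejected at the 5% level.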
Equivalently, (Rosner page 260 and 261)
Example: Occupational Health- p-value
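• Back to the rubber workers, with 4 deaths observed and 3.3 expected from the
US mortality rate.
• x = 4 ≥ μ0 = 3.3; therefore we use
$p = \min\left[2\left(1 - \sum_{k=0}^{x-1} \frac{e^{-\mu_0}\mu_0^{k}}{k!}\right),\ 1\right] = \min\left[2\left(1 - \sum_{k=0}^{3} \frac{e^{-3.3}\,3.3^{k}}{k!}\right),\ 1\right]$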
Example Occupational Health - p-value in Stata
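• Evaluate the four terms of the sum with poissonp(m,k) = Pr(X = k):
display poissonp(3.3,0)
display poissonp(3.3,1)
display poissonp(3.3,2)
display poissonp(3.3,3)
• These are 0.0369, 0.1217, 0.2008 and 0.2209, which sum to 0.5803;
thus p = min[2 × (1 − 0.5803), 1] = 0.839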
Example: Occupational Health-
Interpretation of p-value
• With a p-value of 0.839, there is no evidence that mortality from
Hodgkin’s disease among rubber workers differs from the US
mortality rate.
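The exact two-sided p-value for the Hodgkin's disease example can be reproduced outside Stata. The sketch below assumes the usual doubling rule: double the upper tail Pr(X ≥ x) when x ≥ μ0, and double the lower tail Pr(X ≤ x) when x < μ0. The lower-tail branch is the standard mirror image and is not exercised by this example. Function names are illustrative.

```python
import math

def poisson_cdf(x, mu):
    """Pr(X <= x) for X ~ Poisson(mu); returns 0 for x < 0."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)          # Pr(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mu / k            # Pr(X = k) from Pr(X = k - 1)
        total += term
    return total

def exact_poisson_p_value(x, mu0):
    """Two-sided exact p-value for H0: mu = mu0, given observed count x."""
    if x < mu0:
        tail = poisson_cdf(x, mu0)            # lower tail Pr(X <= x)
    else:
        tail = 1 - poisson_cdf(x - 1, mu0)    # upper tail Pr(X >= x)
    return min(2 * tail, 1.0)

p = exact_poisson_p_value(4, 3.3)
print(round(p, 3))   # 0.839
```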
Standardized mortality ratio (SMR)
• Another way to compare the mortality rate of a
sample with that of the population would be the
standardized mortality ratio (SMR).
• The standardized mortality ratio is 100% × (observed number of
deaths) / (expected number of deaths).
• The expected number of deaths assumes no difference
between the sample and the general population
• Standardized morbidity ratio is an alternate name for the
standardized mortality ratio when the conditions do
not result in death.
SMR interpretation
• Interpreted much like an odds ratio, with 100% as the null value
• SMR >100% implies increased risk in the sample
• SMR < 100% implies decreased risk in the sample
• And SMR = 100% implies neither an increased nor decreased risk in the
sample compared to the general population
Example: Occupational Health -SMR
• Recall we had 4 deaths observed in our sample, and expected
3.3 based on the US mortality rates for Hodgkin’s disease.
• SMR = 100% × (4/3.3) = 121%
• We can reframe our Poisson test in terms of the SMR:
• H0: SMR = 100%
• H1: SMR ≠ 100%
• Since we already performed the test, we can say the
SMR is not significantly different from 100% for
Hodgkin’s disease in our sample of rubber workers.
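The SMR arithmetic is a one-line calculation. A minimal sketch, with an illustrative function name:

```python
def smr_percent(observed, expected):
    """Standardized mortality ratio as a percentage: 100 * observed / expected."""
    return 100.0 * observed / expected

# Hodgkin's disease among the rubber workers: 4 observed, 3.3 expected.
print(round(smr_percent(4, 3.3)))   # 121
```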
One Sample Inference for the Poisson (large
sample)
• If the expected number of events (deaths) under the null hypothesis
is large enough, we can use a large-sample approximation to the exact test
• This approximation is useful only if μ0 > 10
• It uses a statistic that follows the chi-squared distribution
Large sample test for Poisson
• Compute x = number of observed events in the study sample
• Compute the test statistic: (See next slide)
$X^2 = \frac{(x-\mu_0)^2}{\mu_0} = \mu_0\left(\frac{\mathrm{SMR}}{100} - 1\right)^2 \sim \chi^2_1$ under H0
• For a two-sided test at level α, we reject H0 if $X^2 > \chi^2_{1,1-\alpha}$
• And fail to reject H0 if $X^2 \le \chi^2_{1,1-\alpha}$
• The exact p-value is given by $\Pr(\chi^2_1 > X^2)$
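The large-sample test can be sketched with the standard library alone, because a chi-squared variable with 1 degree of freedom is the square of a standard normal, so its upper tail is erfc(sqrt(value/2)). The function names below and the counts in the usage lines (x = 30, μ0 = 20) are hypothetical, chosen only to illustrate the calculation; 3.841 is the 95th percentile of the chi-squared distribution with 1 df.

```python
import math

def chi2_1_tail(value):
    """Pr(chi-squared with 1 df > value), using chi2_1 = Z**2 for standard normal Z."""
    return math.erfc(math.sqrt(value / 2))

def large_sample_poisson_test(x, mu0, critical_value=3.841):
    """Return (X^2, p-value, reject H0?) for H0: mu = mu0; intended for mu0 > 10."""
    stat = (x - mu0) ** 2 / mu0
    return stat, chi2_1_tail(stat), stat > critical_value

# Hypothetical counts, for illustration only.
stat, p, reject = large_sample_poisson_test(30, 20)
print(stat, round(p, 4), reject)
```

The same statistic can be written as μ0 × (SMR/100 − 1)², so the test can be run from the SMR and the expected count alone.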
Just a note:
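• Remember SMR = 100% × x/μ0, so $\frac{(x-\mu_0)^2}{\mu_0} = \mu_0\left(\frac{x}{\mu_0} - 1\right)^2 = \mu_0\left(\frac{\mathrm{SMR}}{100} - 1\right)^2$
• Remember if $Z \sim N(0,1)$ then $Z^2 \sim \chi^2_1$
• and therefore, for large μ0, $X^2 = \left(\frac{x-\mu_0}{\sqrt{\mu_0}}\right)^2 \sim \chi^2_1$ approximately under H0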
Large Sample Approximate CI
• The 100% × (1 − α) confidence interval for μ can be approximated by
$x \pm z_{1-\alpha/2}\sqrt{x}$
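A minimal sketch of the approximate interval, assuming a 95% level (z = 1.96). The function name is illustrative, and x = 21 is the bladder cancer count from the rubber workers example.

```python
import math

def approx_poisson_ci(x, z=1.96):
    """Large-sample CI for the Poisson mean: x +/- z * sqrt(x)."""
    half_width = z * math.sqrt(x)
    return x - half_width, x + half_width

lo, hi = approx_poisson_ci(21)
print(round(lo, 2), round(hi, 2))
```

For small counts the lower limit can fall below zero, which is one reason to prefer the exact interval when x is small.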
Example: Occupational Health- Rubber
workers and Bladder cancer
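• We observe 21 cases in our sample, and 18.1 deaths are expected
from US mortality rates. Are there significantly more deaths in
our sample?
• SMR = 100% × (21/18.1) = 116%
• x = 21; μ0 = 18.1
• Exact method (p-value): since x ≥ μ0,
$p = \min\left[2\left(1 - \sum_{k=0}^{20} \frac{e^{-18.1}\,18.1^{k}}{k!}\right),\ 1\right]$ = min[2 × (1 − poisson(18.1,20)), 1],
where poisson(m,k) is Stata's cumulative probability Pr(X ≤ k)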
Example Occupational Health – p-value
p = min[2 × (1 − poisson(18.1,20)), 1],
so we get p ≈ 0.55, well above 0.05.
Thus by the exact test, the mortality rate from bladder cancer in our sample
of rubber workers is not significantly different from the general population.
Example: Occupational Health-
Comparing Methods
• Using the approximate method, we get
$X^2 = \frac{(x-\mu_0)^2}{\mu_0} = \frac{(21-18.1)^2}{18.1} = \frac{2.9^2}{18.1} = 0.46464$
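To compare the two methods on the bladder cancer data, both p-values can be computed side by side. This is a sketch using only the standard library; the helper name is illustrative, and the chi-squared tail uses the identity Pr(χ²₁ > v) = erfc(sqrt(v/2)).

```python
import math

def poisson_cdf(x, mu):
    """Pr(X <= x) for X ~ Poisson(mu); returns 0 for x < 0."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)          # Pr(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mu / k            # Pr(X = k) from Pr(X = k - 1)
        total += term
    return total

x, mu0 = 21, 18.1                 # observed and expected bladder cancer deaths

# Exact method: x >= mu0, so double the upper tail Pr(X >= 21).
p_exact = min(2 * (1 - poisson_cdf(x - 1, mu0)), 1.0)

# Large-sample method: X^2 = (x - mu0)^2 / mu0, referred to chi-squared with 1 df.
stat = (x - mu0) ** 2 / mu0
p_approx = math.erfc(math.sqrt(stat / 2))

print(round(stat, 5))                      # 0.46464
print(p_exact > 0.05, p_approx > 0.05)     # neither method rejects H0 at the 5% level
```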
Summary
• Poisson
• Point Estimation
• Exact Interval Estimation
• One Sample Inference
• SMR