
PH 1700 Session 4b:
Poisson - Point and Interval Estimates and One Sample Inference
Rosner 6.9 & 7.10
Point Estimation and Exact
Interval Methods for the
Poisson
Chapter 6 Section 9
Summary
• Poisson
• Point Estimation
• Exact Interval Estimation
• One Sample Inference
• SMR
Estimation for Poisson
• The Poisson distribution is used to model the number of occurrences of an event in a given time period.
• It is often applied to events occurring among people followed over a given time.
• A common unit of follow-up time is the person-year: 1 person followed for 1 year.
• Example: a study in which 10 people are each followed for 2 years has a total of 20 person-years.
Point Estimate for Poisson
• Assume the number of events $X$ over $T$ person-years is Poisson distributed with parameter $\mu$. An unbiased estimator of $\mu$ is given by $\hat{\mu} = X$, where $X$ is the observed number of events over $T$ person-years (the entire study).
• If $\lambda$ is the incidence rate per person-year, $T$ = number of person-years of follow-up, and we assume a Poisson distribution, then $\mu = \lambda T$, and therefore $\lambda$ is estimated without bias by $\hat{\lambda} = X/T$.
Example: Woburn, MA, 1970s: Excess Risk of Childhood Leukemia

• In the book A Civil Action, the people of the town feared that a contaminated water supply was causing cancer.
• Translating the question into a statistical framework:
• 12 cases of childhood leukemia (<19 y.o.) diagnosed from 1970 to 1979
• Total of 12,000 childhood residents (<19 y.o.)
• National average is 5 cases per 100,000 person-years
• Is the leukemia risk in the town different from the national average?
Finding the Point Estimate for the town
• 10 years, 12,000 people; the study covered $T = 10 \times 12{,}000 = 120{,}000$ person-years (an approximation; more later when we study Survival Analysis)
• Estimating incidence: $\hat{\lambda} = 12/120{,}000 = 0.0001$ events per person-year
• Rescale: $T = 1.2$ 100,000-person-years and $\hat{\lambda} = 12/1.2 = 10$ events per 100,000 person-years
• What is the uncertainty around that estimate?
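The arithmetic on this slide can be sketched in a few lines of Python. The variable names are illustrative, and the person-year total rests on the slide's own approximation that every resident contributes the full 10 years.

```python
# Woburn example: crude incidence-rate point estimate, lambda_hat = X / T.
cases = 12            # childhood leukemia cases diagnosed 1970-1979
residents = 12_000    # childhood residents (<19 y.o.)
years = 10            # length of the observation window

# Approximation from the slide: each resident is followed for all 10 years.
person_years = residents * years

rate_per_py = cases / person_years        # events per person-year
rate_per_100k = rate_per_py * 100_000     # events per 100,000 person-years

print(f"T = {person_years} person-years")
print(f"rate = {rate_per_py:.4f} events per person-year")
print(f"rate = {rate_per_100k:.1f} events per 100,000 person-years")
```

Rescaling only changes the unit of $T$; the estimate itself is the same quantity expressed per 100,000 person-years so it can be compared with the national figure.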
Exact Interval Estimation for Poisson
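• An exact $100\% \times (1-\alpha)$ confidence interval for the Poisson parameter $\mu$ is given by $(\mu_1, \mu_2)$, where, for the observed count $x$, $\mu_1$ and $\mu_2$ satisfy

$$\Pr(X \ge x \mid \mu = \mu_1) = \sum_{k=x}^{\infty} \frac{e^{-\mu_1}\mu_1^k}{k!} = \frac{\alpha}{2}$$

and

$$\Pr(X \le x \mid \mu = \mu_2) = \sum_{k=0}^{x} \frac{e^{-\mu_2}\mu_2^k}{k!} = \frac{\alpha}{2}$$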
Example: Leukemia
• 12 cases in 1.2 100,000-person-years
• What is the 95% confidence interval for $\mu$? For $\lambda$?
• 12 cases in 1.2 100,000-person-years is a very small count, so use exact methods.
• For $\mu$, use Table 7, page 881. Look under the 95% column in the row where $x = 12$.
Piece of Table 7, page 881
Example: Leukemia - Confidence Intervals
• The 95% CI for $\mu$ is (6.20, 20.96)
• For $\lambda$, convert from $\mu$: $\lambda = \mu/T$, so the 95% CI for $\lambda$ is

$(6.20/1.2,\ 20.96/1.2) = (5.17,\ 17.47)$ cases per 100,000 person-years

• Since our 95% CI does not include the national rate of 5 per 100,000 person-years, 5 is not a plausible value for the town's rate. We can say the town's leukemia rate is significantly higher than the national rate.
• This can also be done with Stata
Example: Leukemia –
Using Stata for Poisson Distribution Estimation
• cii means 120000 12, poisson

-- Poisson Exact --
Variable | Exposure Mean Std. Err. [95% Conf. Interval]
| 120000 .0001 .0000289 .0000517 .0001747

• From Stata, the 95% CI for $\lambda$ is (0.0000517, 0.0001747) cases per person-year. Equivalently, the 95% CI for $\lambda$ is (5.17, 17.47) cases per 100,000 person-years.
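For readers without Table 7 or Stata at hand, the exact interval can also be found numerically. The sketch below is one possible approach, not the method used by Stata or the textbook: it solves the two tail equations for $\mu_1$ and $\mu_2$ by bisection. The helper names and the search ceiling are assumptions made for illustration.

```python
import math

def poisson_cdf(x, mu):
    """Pr(X <= x) for X ~ Poisson(mu); returns 0 for x < 0."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)          # Pr(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mu / k            # Pr(X = k) from Pr(X = k - 1)
        total += term
    return total

def find_root(f, lo, hi, tol=1e-10):
    """Bisection root of f on [lo, hi]; assumes f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(x, alpha=0.05):
    """Exact two-sided CI for the Poisson mean given an observed count x."""
    ceiling = 10 * x + 50         # generous upper search bound (assumption)
    if x == 0:
        lower = 0.0
    else:
        # Pr(X >= x | mu1) = alpha / 2
        lower = find_root(lambda m: 1 - poisson_cdf(x - 1, m) - alpha / 2,
                          0.0, ceiling)
    # Pr(X <= x | mu2) = alpha / 2
    upper = find_root(lambda m: poisson_cdf(x, m) - alpha / 2, 0.0, ceiling)
    return lower, upper

# Woburn: x = 12 cases over T = 1.2 units of 100,000 person-years.
mu_lower, mu_upper = exact_poisson_ci(12)
T = 1.2
print(f"95% CI for mu:     ({mu_lower:.2f}, {mu_upper:.2f})")
print(f"95% CI for lambda: ({mu_lower / T:.2f}, {mu_upper / T:.2f}) per 100,000 person-years")
```

Dividing both limits by $T$ converts the interval for the expected count into an interval for the rate, which is the same conversion done by hand on the slide.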
One-Sample Inference for
Poisson Distribution
Section 7.10

Example (page 259) 7.57: Occupational
Health: Rubber workers
• Starting on January 1, 1964, 8,418 white male rubber workers ages 40-84 were followed for 10 years for various mortality outcomes and compared with the US white male mortality rates in 1968. Four deaths due to Hodgkin’s disease were observed, compared with 3.3 deaths expected from US mortality rates. Is this difference significant?

Example (page 259) 7.57: Occupational
Health: Rubber workers Continued
• Let
$X$ = total number of deaths in the study population
$Y_i$ = 1 if individual $i$ dies during the study period, 0 otherwise
$p_i$ = probability of death for the $i$th individual
Therefore $Y_i \sim \text{Bernoulli}(p_i)$ and $X = \sum_{i=1}^{n} Y_i$

• Under $H_0$: death rate of rubber workers = death rate of the US general population, the expected number of events $\mu_0$ is given by

$$\mu_0 = \sum_{i=1}^{n} p_i$$

• If the disease is rare, then the observed number of events $X$ is approximately Poisson($\mu$), and we test $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$
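A short derivation may help connect the Bernoulli setup to the Poisson test. It is a sketch that assumes the $Y_i$ are independent, which the slide does not state explicitly; the $p_i$ here are the null-hypothesis (US population) death probabilities.

```latex
\begin{align*}
E[X] &= E\Big[\sum_{i=1}^{n} Y_i\Big] = \sum_{i=1}^{n} E[Y_i] = \sum_{i=1}^{n} p_i = \mu_0 \\
\operatorname{Var}(X) &= \sum_{i=1}^{n} p_i (1 - p_i) \approx \sum_{i=1}^{n} p_i = \mu_0
  \quad \text{when every } p_i \text{ is small}
\end{align*}
```

A mean and variance that roughly agree are what a Poisson count would show, which is consistent with using the Poisson as the reference distribution when the disease is rare.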

So if we have…
• One variable of interest? _Yes_
• One sample? _Yes_
• Is the underlying distribution normal, or does the CLT hold? _No_
• Is the underlying distribution binomial? _No_
• Is the underlying distribution Poisson? _Yes_
• Then we have a one-sample Poisson test.

Mortality rates and the Poisson
• The Poisson distribution can be used as a model for the counts of
events, such as death, occurring rarely in a population.
• Useful when the probability of the event, such as mortality, is not
constant for everyone in the population being considered
• When the probability is not constant, then the binomial distribution
does not apply

One sample test: Poisson Distribution Critical
Value Method (as opposed to p-value method)
• Use the Poisson exact confidence interval method
• If $X$ is a Poisson random variable with expected value $\mu$, then to test the hypothesis
$H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$
using a two-sided test of level $\alpha$:
• Construct the two-sided $100\% \times (1-\alpha)$ confidence interval for $\mu$, based on the observed value $x$: $(c_1, c_2)$
• If $\mu_0 < c_1$ or $\mu_0 > c_2$ (outside the interval), then reject $H_0$
• If $c_1 \le \mu_0 \le c_2$ (inside the confidence interval), then do not reject $H_0$
Example: Occupational Health Rubber workers –
Critical value Method
• $\mu_0 = 3.3$; $x = 4$; the exact 95% CI for $\mu$ is (1.09, 10.24)

• $1.09 < 3.3 < 10.24$, therefore we fail to reject $H_0$.

• The rate of Hodgkin’s disease among rubber workers is not significantly different from the national mortality rate for Hodgkin’s disease.
Example: Occupational Health Rubber workers
Confidence Interval for Poisson in Stata
. cii means 84180 4, poisson

-- Poisson Exact --
Variable | Exposure Mean Std. Err. [95% Conf. Interval]
-------------+---------------------------------------------------------------
| 84180 .0000475 .0000238 .0000129 .0001217

• Stata gives the CI in terms of $\lambda$, as a rate. To make it a CI for $\mu$, multiply by $T$:

• 0.0000129 × 84,180 person-years = 1.0859
• 0.0001217 × 84,180 person-years = 10.2447
• CI for $\mu$: (1.09, 10.24)
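The conversion and the critical-value decision rule can be checked together. This is a small sketch using the interval limits reported in the Stata output above; the variable names are illustrative.

```python
# Rubber workers, Hodgkin's disease: convert Stata's rate CI to a count CI
# and apply the critical-value (confidence-interval) decision rule.
person_years = 8418 * 10                 # 8,418 workers followed for 10 years
rate_ci = (0.0000129, 0.0001217)         # exact 95% CI for lambda, from the Stata output
mu0 = 3.3                                # deaths expected from US mortality rates

# mu = lambda * T, so multiply each limit by T.
c1, c2 = (limit * person_years for limit in rate_ci)

# Reject H0 only if mu0 falls outside (c1, c2).
reject_h0 = mu0 < c1 or mu0 > c2

print(f"95% CI for mu: ({c1:.2f}, {c2:.2f})")
print("reject H0" if reject_h0 else "fail to reject H0")
```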
Small Sample inference – p-value
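• Let $\mu$ be the expected value of a Poisson distribution. To test the hypothesis $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$,
• Compute $x$ = observed number of deaths in the study population
• Under $H_0$, the random variable $X$ follows a Poisson distribution with parameter $\mu_0$, and the exact two-sided p-value is given by

$$p = \begin{cases} \min\left(2 \sum_{k=0}^{x} \dfrac{e^{-\mu_0}\mu_0^k}{k!},\ 1\right) & \text{if } x < \mu_0 \\[2ex] \min\left(2\left[1 - \sum_{k=0}^{x-1} \dfrac{e^{-\mu_0}\mu_0^k}{k!}\right],\ 1\right) & \text{if } x \ge \mu_0 \end{cases}$$

• With $\dfrac{e^{-\mu_0}\mu_0^k}{k!}$ = display poissonp($\mu_0$, $k$)

• And $\sum_{k=0}^{x} \dfrac{e^{-\mu_0}\mu_0^k}{k!}$ = display poisson($\mu_0$, $x$)
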
Equivalently, (Rosner page 260 and 261)

Example: Occupational Health- p-value
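• Back to the rubber workers, with 4 deaths observed and 3.3 expected from the US mortality rate.

$x = 4 \ge \mu_0 = 3.3$; therefore we use

$$p = \min\left(2\left[1 - \sum_{k=0}^{3} \frac{e^{-3.3}\,3.3^k}{k!}\right],\ 1\right)$$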

Example Occupational Health - p-value in Stata
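display poisson(3.3,3) returns the cumulative sum directly, or the terms can be computed one at a time:

display poissonp(3.3,0)

display poissonp(3.3,1)

display poissonp(3.3,2)

display poissonp(3.3,3)

The four terms are approximately 0.0369, 0.1217, 0.2008, and 0.2209, which sum to 0.5803; thus $p = \min(2 \times (1 - 0.5803),\ 1) = 0.839$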
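The same p-value can be reproduced outside Stata. The following is a minimal Python sketch of the two-sided exact rule stated earlier; the function names are made up for this example and are not Stata's.

```python
import math

def poisson_pmf(k, mu):
    """Pr(X = k) for X ~ Poisson(mu); plays the role of Stata's poissonp(mu, k)."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def poisson_cdf(x, mu):
    """Pr(X <= x); plays the role of Stata's poisson(mu, x)."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

def exact_two_sided_p(x, mu0):
    """Exact two-sided p-value for H0: mu = mu0 given an observed count x."""
    if x < mu0:
        tail = poisson_cdf(x, mu0)            # Pr(X <= x)
    else:
        tail = 1 - poisson_cdf(x - 1, mu0)    # Pr(X >= x)
    return min(2 * tail, 1.0)

# Rubber workers: 4 Hodgkin's deaths observed, 3.3 expected.
cumulative = poisson_cdf(3, 3.3)
p_value = exact_two_sided_p(4, 3.3)
print(f"Pr(X <= 3) = {cumulative:.4f}")
print(f"p-value    = {p_value:.3f}")
```

Doubling the smaller tail and capping at 1 is what makes the test two-sided; the cap matters when the observed count sits near $\mu_0$.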

Example: Occupational Health-
Interpretation of p-value
• With a p-value of 0.839, there is no evidence that mortality from Hodgkin’s disease among rubber workers differs from the US mortality rate.

Standardized mortality ratio (SMR)
• Another way to compare the mortality rate of a sample with that of the population is the standardized mortality ratio (SMR).
• The standardized mortality ratio is 100% × observed number of deaths / expected number of deaths.
• The expected number of deaths assumes no difference between the sample and the general population.
• Standardized morbidity ratio is an alternate name for the SMR when the condition studied does not result in death.

SMR interpretation
• Similar to an Odds ratio
• SMR >100% implies increased risk in the sample
• SMR < 100% implies decreased risk in the sample
• And SMR = 100% implies neither an increased nor decreased risk in the
sample compared to the general population

Example: Occupational Health -SMR
• Recall we observed 4 deaths in our sample and expected 3.3 based on the US mortality rates for Hodgkin’s disease.
• SMR = 100% × (4/3.3) = 121%
• We can reframe our Poisson test in terms of the SMR:
• $H_0$: SMR = 100%
• $H_1$: SMR ≠ 100%
• Since we already performed the test, we can say the SMR for Hodgkin’s disease in our sample of rubber workers is not significantly different from 100%.
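As a quick check of the SMR arithmetic, here is a small Python sketch. The function name is illustrative, and the guard on the expected count is an added safety check rather than part of the definition.

```python
def smr_percent(observed, expected):
    """Standardized mortality ratio as a percentage: 100 * observed / expected."""
    if expected <= 0:
        raise ValueError("expected number of deaths must be positive")
    return 100 * observed / expected

# Hodgkin's disease among rubber workers: 4 observed, 3.3 expected.
hodgkins_smr = smr_percent(4, 3.3)
print(f"SMR = {hodgkins_smr:.0f}%")
```

An SMR above 100% on its own does not imply a significant excess; the Poisson test of $H_0$: SMR = 100% supplies that judgment.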

One Sample Inference for the Poisson (large
sample)
• If the expected number of events (deaths) under the null hypothesis, $\mu_0$, is large enough, we can use a large-sample approximation in place of the exact test
• This approximation is useful only if $\mu_0 > 10$
• It uses a test statistic that follows the chi-squared distribution

Large sample test for Poisson
• Compute $x$ = number of observed events in the study sample
• Compute the test statistic: (See next slide)

$$X^2 = \frac{(x - \mu_0)^2}{\mu_0} = \mu_0 \left(\frac{\text{SMR}}{100} - 1\right)^2 \sim \chi^2_1 \text{ under } H_0$$

• For a two-sided test at level $\alpha$, we reject $H_0$ if our statistic $X^2 > \chi^2_{1,1-\alpha}$
• And fail to reject $H_0$ if $X^2 < \chi^2_{1,1-\alpha}$
• The exact p-value is given by $\Pr(\chi^2_1 > X^2)$

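Just a note:
Remember (SMR = 100% × $x/\mu_0$), so $x = \mu_0 \times \text{SMR}/100$ and

$$\frac{(x - \mu_0)^2}{\mu_0} = \mu_0\left(\frac{\text{SMR}}{100} - 1\right)^2$$

Remember if $Z \sim N(0,1)$ then $Z^2 \sim \chi^2_1$, and therefore, for large $\mu_0$,

$$X^2 = \left(\frac{x - \mu_0}{\sqrt{\mu_0}}\right)^2 \sim \chi^2_1$$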

Large Sample Approximate CI
• The $100\% \times (1-\alpha)$ confidence interval for $\mu$ can be approximated by

$$x \pm z_{1-\alpha/2}\sqrt{x}$$
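To see the approximate interval in numbers, the sketch below applies the formula to x = 21, the bladder cancer count from the example that follows. The function name is an assumption for illustration; the normal quantile comes from Python's standard library.

```python
import math
from statistics import NormalDist

def approx_poisson_ci(x, alpha=0.05):
    """Large-sample CI for the Poisson mean: x +/- z_{1 - alpha/2} * sqrt(x)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    half_width = z * math.sqrt(x)
    return x - half_width, x + half_width

approx_lower, approx_upper = approx_poisson_ci(21)
print(f"approximate 95% CI for mu: ({approx_lower:.2f}, {approx_upper:.2f})")
```

The interval is symmetric around $x$ by construction, whereas the exact Poisson interval is not, which is one reason the approximation is reserved for larger counts.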

Example: Occupational Health- Rubber
workers and Bladder cancer
• We observe 21 deaths in our sample, and the expected number based on the US mortality rate is 18.1 deaths. Are there significantly more deaths in our sample?
• SMR = 100% × (21/18.1) = 116%
• $x = 21 \ge \mu_0 = 18.1$
• Exact method (p-value):

$$p = \min\left(2\left[1 - \sum_{k=0}^{20} \frac{e^{-18.1}\,18.1^k}{k!}\right],\ 1\right)$$

= min[2 × (1 − display poisson(18.1,20)), 1]

Example Occupational Health – p-value
$p$ = min[2 × (1 − display poisson(18.1,20)), 1],

so we get:

$p$ = min(2 × (1 − 0.72270), 1) = 0.5546

Thus, by the exact test, the mortality rate from bladder cancer in our sample of rubber workers is not significantly different from that of the general population.

Example: Occupational Health-
Comparing Methods
• Using the approximate method, we get

$$X^2 = \frac{(x - \mu_0)^2}{\mu_0} = \frac{(21 - 18.1)^2}{18.1} = \frac{2.9^2}{18.1} = 0.46464$$

• Critical value for the test: $\chi^2_{1,0.95}$
• invchi2(1,0.95) = 3.8414
• Since 0.46464 < 3.8414, we fail to reject $H_0$
• Or p-value for the test: chi2tail(1,0.46464) = 0.4955
• By the approximate method, there is no significant difference.
• Compare to exact: p-value = 0.5546
• In general, exact methods are strongly preferred for inference concerning the Poisson distribution
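The approximate test can be reproduced with the standard library alone. This sketch relies on the fact that, for 1 degree of freedom, $\Pr(\chi^2_1 > s) = \Pr(|Z| > \sqrt{s}) = \operatorname{erfc}(\sqrt{s/2})$; the function name is illustrative, and the critical value is simply copied from the slide.

```python
import math

def large_sample_poisson_test(x, mu0):
    """Chi-square (1 df) large-sample test of H0: mu = mu0; returns (statistic, p-value)."""
    stat = (x - mu0) ** 2 / mu0
    # Upper tail of chi-square with 1 df, via the standard normal tail.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

CRITICAL_95 = 3.8414   # invchi2(1, 0.95), as reported on the slide

# Bladder cancer among rubber workers: 21 observed, 18.1 expected.
stat, p_approx = large_sample_poisson_test(21, 18.1)
print(f"X^2 = {stat:.5f}, p = {p_approx:.4f}")
print("reject H0" if stat > CRITICAL_95 else "fail to reject H0")
```

The shortcut through `erfc` only works for 1 degree of freedom; a general chi-square tail would need a statistics library.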
Checklist for tests of hypothesis
• Identify the variable of interest
• Identify the parameter(s) of interest
• State the null and alternative hypotheses
• Identify the type I error level
• Identify the test statistic (you can use the flow chart in the back of the
textbook)
• Identify the distribution of the test statistic (a known probability
distribution)
• Determine the decision rule (do a graph!)
• Calculate the test statistic
• Report the test statistic, df, CI & p-value
• Make a decision
• Conclude and interpret
When do I use Poisson, Revisited
• Testing mortality/morbidity ratios or rare incidence rates
• Mortality rate varies across the sample
• Any other situation where the Poisson applies and we are testing the Poisson parameter (e.g., modeling counts)
• Use the Poisson test
• Exact method: small sample ($\mu_0 < 10$)
• Large sample approximation with chi-square (not as accurate) ($\mu_0 > 10$)
• However, the exact method is often preferred, even when the approximation is adequate.

Summary
• Poisson
• Point Estimation
• Exact Interval Estimation
• One Sample Inference
• SMR
