Sahadeb Sarkar
IIM Calcutta
1
Readings: Chapters 1-6 2
Terminology
Discrete data: data taking a countable number of possible values.
Discrete distributions
• Categorical data: Discrete data with finitely many possible values on a nominal/ordinal scale (e.g., the state a person lives in, the political party one might vote for, the blood type of a patient, grades in a course); modeled by Bernoulli or Multinomial distributions. Central tendency is given by the mode.
• Count data (non-negative integer valued): Records the frequency of an event (‘success’) and may not have an upper bound (e.g., Poisson, Negative Binomial distributions). It arises out of counting, not ranking.
• Categorical variable has a finite range and count variable
has possibly an infinite range (text, p. 3)
3
Data Types
• Dichotomous data: can take only two values
such as “Yes” and “No”
• Nonordered polytomous data: five different
detergents
• Ordered polytomous data: grades A, B, C, D; “old”, “middle-aged”, “young” employees
5
Derivation Tools in CDA, Text p.18
Delta Method:
If 𝜃̂ₙ →ᵈ N(θ, (1/n)Σ) and g(θ) is an m×1 differentiable function from Rᵏ to Rᵐ, then
g(𝜃̂ₙ) →ᵈ N( g(θ), (1/n)·DᵀΣD ), where D (k×m) = ∂g(θ)/∂θ.
Example (Exc): Zᵢ ~ Bin(1, p) iid, p̂ = Z̄ₙ, g(p) = ln(p/(1−p));
Z̄ₙ →ᵈ N(p, p(1−p)/n); k = 1, m = 1;
then ln(p̂/(1−p̂)) →ᵈ N( ln(p/(1−p)), (p(1−p)/n)·[1/(p(1−p))]² ) = N( ln(p/(1−p)), 1/(n·p(1−p)) )
6
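A quick Monte Carlo check of the delta-method example above — a sketch under assumed settings (p = 0.3, n = 300, 4000 replications; none of these values come from the slides): the empirical variance of the sample logit should be close to 1/(n·p(1−p)).

```python
import math
import random

# Monte Carlo check of the delta-method variance for the sample logit.
# Settings are illustrative, not from the slides.
random.seed(1)
p, n, reps = 0.3, 300, 4000

logits = []
for _ in range(reps):
    successes = sum(1 for _ in range(n) if random.random() < p)
    p_hat = successes / n
    if 0 < p_hat < 1:                      # logit undefined at 0 or 1
        logits.append(math.log(p_hat / (1 - p_hat)))

mean_logit = sum(logits) / len(logits)
emp_var = sum((x - mean_logit) ** 2 for x in logits) / len(logits)
theory_var = 1 / (n * p * (1 - p))         # delta-method asymptotic variance
print(round(emp_var, 5), round(theory_var, 5))   # the two should be close
```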
Derivation Tools in CDA, Text p.18
Delta Method:
If 𝜃̂ₙ →ᵈ N(θ, (1/n)Σ) and g(θ) is an m×1 differentiable function from Rᵏ to Rᵐ, then
g(𝜃̂ₙ) →ᵈ N( g(θ), (1/n)·DᵀΣD ), where D (k×m) = ∂g(θ)/∂θ.
Example 1.4 (p.19): scalar case, g(θ) = exp(θ), 𝜃̂ₙ →ᵈ N(θ, 2/n); then exp(𝜃̂ₙ) →ᵈ ??
7
Derivation Tools in CDA, Text p.18
Delta Method:
If 𝜃̂ₙ →ᵈ N(θ, (1/n)Σ) and g(θ) is an m×1 differentiable function from Rᵏ to Rᵐ, then
g(𝜃̂ₙ) →ᵈ N( g(θ), (1/n)·DᵀΣD ), where D (k×m) = ∂g(θ)/∂θ.
Example 1.4 (p.19): scalar case, g(θ) = exp(θ), 𝜃̂ₙ →ᵈ N(θ, 2/n);
k = 1, m = 1; then exp(𝜃̂ₙ) →ᵈ N( exp(θ), (2/n)·exp(2θ) )
8
Derivation Tools in CDA, Text p.15-19
Set Up: Suppose we have an i.i.d. sample X1, X2, …, Xn, n large, with E(Xᵢ) = µ, V(Xᵢ) = σ², and E(Xᵢ − µ)⁴ = µ₄ (fourth central moment notation); the underlying probability distribution may be complex and is usually unknown. Let X̄ₙ = sample mean; s²ₙ = sample variance with divisor n.
Convergence in Probability or (Weak) Law of Large Numbers: for every ε > 0, P(|X̄ₙ − µ| > ε) converges to 0 as n → ∞
(written as: X̄ₙ →ᵖ µ).
[X̄ₙ is said to be a consistent estimator of µ.]
9
Derivation Tools in CDA, Text p.15-19
CLT and Convergence in Distribution (or Law) (p.16):
Let Fₙ(x) denote the CDF of √n·(X̄ₙ − µ)/σ and F(x) = Φ(x) denote the CDF of N(0,1). Then for every continuity point x of F(·) = Φ(·), Fₙ(x) converges to F(x) as n → ∞ [written as: √n·(X̄ₙ − µ)/σ →ᵈ N(0, 1), as n → ∞]
Application: Asymptotic variance of X̄ₙ = σ²/n; estimated asymptotic variance of X̄ₙ = s²ₙ/n.
[What about the asymptotic distribution of s²ₙ as an estimator of σ²? For that we need Slutsky’s Theorem.]
10
Derivation Tools in CDA, Text p.15-19
Convergence in Probability: for every ε > 0, P(|Xₙ − X| > ε) converges to 0 as n → ∞; then Xₙ →ᵖ X.
Convergence in Distribution: Let Fₙ(x) denote the CDF of Xₙ and F(x) the CDF of X. If for every continuity point x of F(·), Fₙ(x) converges to F(x) as n → ∞, then Xₙ →ᵈ X.
12
Derivation Tools in CDA, Text p.18
Slutsky’s Theorem:
Suppose Xₙ →ᵈ X and Yₙ →ᵈ c, a constant. Then
1. Xₙ + Yₙ →ᵈ X + c
2. YₙXₙ →ᵈ cX
3. If c ≠ 0, Xₙ/Yₙ →ᵈ X/c
13
Derivation Tools in CDA, Text p.18
s²ₙ − σ² = (1/n)·Σᵢ(Zᵢ − Z̄ₙ)² − σ² = (1/n)·Σᵢ(Zᵢ − µ)² − σ² − (Z̄ₙ − µ)².
Let Zᵢ* = (Zᵢ − µ)²; E(Zᵢ*) = σ², V(Zᵢ*) = E(Zᵢ − µ)⁴ − σ⁴ = µ₄ − σ⁴.
√n(s²ₙ − σ²) = √n[(1/n)·Σᵢ(Zᵢ − µ)² − σ²] − √n(Z̄ₙ − µ)·(Z̄ₙ − µ) = Term 1 − Term 2 × Term 3.
Term 1 →ᵈ N(0, V((Zᵢ − µ)²)) by the CLT for the iid Zᵢ*;
Term 2 →ᵈ N(0, σ²) by the CLT; Term 3 →ᵖ 0 by the WLLN.
By Slutsky’s Theorem, √n(s²ₙ − σ²) →ᵈ N(0, V((Zᵢ − µ)²) = µ₄ − σ⁴).
Estimated asymptotic variance of s²ₙ = (µ̂₄ − σ̂⁴)/n = [(1/n)·Σᵢ(Zᵢ − Z̄ₙ)⁴ − (s²ₙ)²]/n
14
Inference for One-way Frequency
Table
15
Binomial Distribution
(leading to a One-Way Frequency Table)
Suppose Y is a random variable with 2 possible outcome categories c1, c2 with probabilities π1, π2 = 1 − π1.
Suppose there are n observations on Y; we can summarize the responses through the vector of observed frequencies (random variables) of data on Y: (X1, X2 = n − X1).
16
Example 1.1, p. 6, Text
18
Example 1.1 (Binary Case), p. 37, Text
• Test if the prevalence of Metabolic Syndrome is 40% in this study population.
Z = (π̂ − π0)/√(π0(1−π0)/n) = (48/93 − 0.4)/√(0.4×0.6/93) = 2.286;
P-value = 2·Φ(−2.286) = 0.0223
95% CI: π̂ ± Z_{α/2}·√(π̂(1−π̂)/n) = 48/93 ± 1.96·√[(48/93)·(1 − 48/93)/93] = [0.4146, 0.6177]
19
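The slide’s arithmetic can be reproduced directly; a minimal sketch using only the standard library (Φ computed via math.erf):

```python
import math

# Example 1.1: test H0: π = 0.40 with x = 48 of n = 93
x, n, pi0 = 48, 93, 0.40
pi_hat = x / n

z = (pi_hat - pi0) / math.sqrt(pi0 * (1 - pi0) / n)

# two-sided P-value; Φ(t) = (1 + erf(t/√2))/2
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# 95% Wald CI uses the estimated proportion in the standard error
se_hat = math.sqrt(pi_hat * (1 - pi_hat) / n)
ci = (pi_hat - 1.96 * se_hat, pi_hat + 1.96 * se_hat)
print(round(z, 3), round(p_value, 4), round(ci[0], 4), round(ci[1], 4))
# 2.286, 0.0223, 0.4146, 0.6177 — matching the slide
```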
Negative Binomial Distribution (p. 41)
• A sequence of independent Bernoulli trials, each having two potential outcomes, "success" and "failure". In each trial the probability of success is p and of failure is (1 − p). Observe this sequence until a predefined number r of failures has occurred. Then X = number of ‘successes’ observed will have the negative binomial distribution:
• Pr(X = k) = [Γ(r + k)/(Γ(r)·k!)]·pᵏ(1 − p)ʳ, where Γ(n) = ∫₀^∞ xⁿ⁻¹e⁻ˣ dx = (n − 1)! for integer n, and C(k + r − 1, k) = (−1)ᵏ·C(−r, k)
• Why called Negative Binomial?
• (1 − p)⁻ʳ = Σ_{k=0}^∞ C(k + r − 1, k)·pᵏ = Σ_{k=0}^∞ C(−r, k)·(−p)ᵏ
20
Negative Binomial Distribution (p. 41)
• A sequence of independent Bernoulli trials, each having two potential outcomes, "success" and "failure". In each trial the probability of success is p and of failure is (1 − p). Observe this sequence until a predefined number r of failures has occurred. Then X = number of ‘successes’ observed will have the negative binomial distribution:
• P(X = k) = [Γ(r + k)/(Γ(r)·k!)]·pᵏ(1 − p)ʳ ……… (1a)
• E(X) = r·p/(1 − p), V(X) = r·p/(1 − p)² > E(X) …….. (1b)
• Extension through reparameterization: α = 1/r (> 0), µ = r·p/(1 − p) in (1)
• Then P(X = k) = [Γ(1/α + k)/(Γ(1/α)·k!)]·[µ/(1/α + µ)]ᵏ·[(1/α)/(1/α + µ)]^{1/α} ……… (2a)
• E(X) = µ; V(X) = µ + αµ² ……………………(2b)
22
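A numeric sanity check of parameterization (2a)–(2b) — a sketch with illustrative values µ = 3, α = 0.5 (not from the text); log-gamma is used to avoid overflow in Γ(1/α + k):

```python
import math

# pmf (2a): P(X=k) = Γ(1/α+k)/(Γ(1/α)·k!) · (µ/(1/α+µ))^k · ((1/α)/(1/α+µ))^(1/α)
def nb_pmf(k, mu, alpha):
    r = 1.0 / alpha                         # r = 1/α
    log_p = (math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
             + k * math.log(mu / (r + mu)) + r * math.log(r / (r + mu)))
    return math.exp(log_p)

mu, alpha = 3.0, 0.5                        # illustrative values
ks = range(400)                             # truncates the infinite support
total = sum(nb_pmf(k, mu, alpha) for k in ks)
mean = sum(k * nb_pmf(k, mu, alpha) for k in ks)
var = sum(k * k * nb_pmf(k, mu, alpha) for k in ks) - mean ** 2
print(round(total, 6), round(mean, 4), round(var, 4))  # 1.0, µ = 3, µ+αµ² = 7.5
```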
Hypergeometric Distribution
(used in calculating P-value in Exact Tests)
23
Hypergeometric Distribution
(used to calculate ‘Population’ Total)
24
Hypergeometric Distribution
(used in acceptance sampling)
25
Multivariate Hypergeometric Distribution
(used in calculating P-value in Exact Tests)
26
Inference for Multinomial Case
27
Multinomial Distribution
(may lead to One-Way, Two-Way, … Frequency Tables)
Suppose Y is a categorical variable with k possible outcome categories c1, c2, …, ck with probabilities π1, π2, …, πk = 1 − π1 − … − π(k−1).
Suppose there are n observations on categorical Y; we can summarize the responses through the vector of observed frequencies (random variables) of data on Y: X = (X1, X2, …, Xk), where Xk = n − X1 − … − X(k−1).
28
Multinomial Distribution
(may lead to One-Way, Two-Way, … Frequency Tables)
X = (X1, X2, …, Xk) has a multinomial distribution with parameters n and (π1, π2, …, πk).
P(X1 = x1, …, Xk = xk) = [n!/(x1!·x2!·…·xk!)]·π1^{x1}·…·πk^{xk}
E(Xᵢ) = nπᵢ;
V(Xᵢ) = nπᵢ(1 − πᵢ);
Cov(Xᵢ, Xⱼ) = −nπᵢπⱼ, i ≠ j [Prove it, Exercise]
29
Multinomial Distribution
(may lead to One-Way, Two-Way, … Frequency Tables)
X = (X1, X2, …, Xk) has a multinomial distribution with parameters n and (π1, π2, …, πk).
P(X1 = x1, …, Xk = xk) = [n!/(x1!·x2!·…·xk!)]·π1^{x1}·…·πk^{xk}
E(Xᵢ) = nπᵢ; V(Xᵢ) = nπᵢ(1 − πᵢ); Cov(Xᵢ, Xⱼ) = −nπᵢπⱼ, i ≠ j
E(X) = µ = n·(π1, …, πk)ᵀ; V(X) = nΣ, where Σ has diagonal entries πᵢ(1 − πᵢ) and off-diagonal entries −πᵢπⱼ.
Thus, Σ = diag(π) − ππᵀ, i.e., the diagonal matrix with entries π1, …, πk minus the outer product of π = (π1, …, πk)ᵀ with itself.
30
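A small simulation illustrating Cov(Xᵢ, Xⱼ) = −nπᵢπⱼ (the [Prove it] exercise) — a sketch with illustrative parameters n = 50 and π = (0.2, 0.3, 0.5), not taken from the text:

```python
import random

random.seed(7)
n, reps = 50, 20000
pis = (0.2, 0.3, 0.5)                    # illustrative cell probabilities

def draw_counts():
    # one multinomial draw: classify n trials into the three cells
    counts = [0, 0, 0]
    for _ in range(n):
        u = random.random()
        counts[0 if u < 0.2 else (1 if u < 0.5 else 2)] += 1
    return counts

samples = [draw_counts() for _ in range(reps)]
m1 = sum(s[0] for s in samples) / reps
m2 = sum(s[1] for s in samples) / reps
emp_cov = sum(s[0] * s[1] for s in samples) / reps - m1 * m2
print(round(emp_cov, 3), -n * pis[0] * pis[1])   # both should be near -3.0
```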
Linking Poisson with Multinomial Dist,
(pp. 202-203, Text)
Result: Let Y1, …, Yk be independent Poi(λᵢ) r.v.’s. Then given (Y1 + … + Yk) = n, the conditional joint distribution of (Y1, …, Yk) is the multinomial distribution with parameters: n trials, and πᵢ = λᵢ/Σ_{j=1}^k λⱼ, i = 1, …, k.
31
Example 1.1, p. 6, Text
One-Way Frequency Table for Metabolic Syndrome Study
MS:       Present   Absent   Total
Counts:      48        45       93
34
Example: Pearson’s χ2 Test
When we test whether a die is fair, the null is a simple hypothesis. Suppose we roll the die 120 times and summarize the observed frequencies of the six faces. In this case, k = 6 and n = 120, and H0: πᵢ = 1/6 (= π0ᵢ), i = 1, 2, …, 6.
37
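A sketch of the Pearson chi-square computation for this setup; the counts below are hypothetical, for illustration only (the slide’s own frequency table is not reproduced here):

```python
# Pearson chi-square statistic for the fair-die hypothesis (k=6, n=120).
# The observed counts are hypothetical and must sum to n = 120.
observed = [18, 23, 16, 21, 19, 23]
expected = [120 / 6] * 6                      # n * π0i under H0

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 3))   # 2.0 here; compare with the χ²(5) critical value 11.07
```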
Example 2.2, p.4, p.38, Text
39
Testing Composite Hypothesis
in Inference for Count Data
40
Poisson Distribution Case
Suppose Y is a random variable taking integer values y = 0, 1, 2, …, with probability P(Y = y) = e^{−λ}·λ^y/y!
Suppose there are n observations on Y; we can summarize the observations through the vector of observed frequencies for the value-categories 0, 1, 2, …
42
MLE of = 9.1 or 3.6489?
L(θ) = (e^{−θ})^{n₀}·(e^{−θ}θ)^{n₁}·(e^{−θ}θ²/2)^{n₂}·(e^{−θ}θ³/6)^{n₃}·(e^{−θ}θ⁴/24)^{n₄}·(e^{−θ}θ⁵/120)^{n₅} × [1 − e^{−θ} − e^{−θ}θ − e^{−θ}θ²/2 − … − e^{−θ}θ⁵/120]^{n₆₊},
where n_y denotes the observed frequency of value y and n₆₊ that of the pooled category ‘6 or more’.
Maximize this function L(θ) w.r.t. θ.
Numerical maximization is needed.
Do it as an Exercise
43
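A sketch of the numerical maximization, using a plain grid search so nothing beyond the standard library is assumed; the category frequencies below are hypothetical, for illustration only:

```python
import math

freqs = {0: 4, 1: 9, 2: 12, 3: 8, 4: 5, 5: 2}   # hypothetical counts of Y = 0..5
n_six_plus = 3                                   # hypothetical count of Y >= 6

def log_lik(theta):
    # log of the grouped likelihood L(θ): Poisson terms for 0..5 plus the tail
    ll = sum(f * (-theta + y * math.log(theta) - math.log(math.factorial(y)))
             for y, f in freqs.items())
    tail = 1 - sum(math.exp(-theta) * theta ** y / math.factorial(y)
                   for y in range(6))
    return ll + n_six_plus * math.log(tail)

# grid search over θ in (0.1, 10); refine the grid further if needed
theta_hat = max((t / 1000 for t in range(100, 10000)), key=log_lik)
print(round(theta_hat, 3))
```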
Example 2.3, p.42, Text
46
Sampling Schemes
Leading to (2×2) Contingency Tables
47
Layout of the 2×2 table

                                 Column factor (‘Response’)
                                 Level 1    Level 2    Row Total
Row factor       Level 1          n11        n12       R1 = n1+ (= n10 = n1)
(‘Explanatory’)  Level 2          n21        n22       R2 = n2+ (= n20 = n2)
49
Poisson Sampling
• Poisson Sampling (named after the French mathematician Siméon Denis Poisson): here a fixed amount of time (or space, volume, money, etc.) is employed to collect a random sample from a single population, and each sampled member falls into one of the four cells in the 2×2 table.
• In the CVD Death Example 1 (next slide), researchers spent a certain amount of time sampling the health records of 3112 women, who were cross-classified as obese or non-obese against died of CVD or not. In this case, neither the marginal totals nor the sample size was known in advance.
50
Example-1*: Cardio-Vascular Deaths and Obesity
among women in American Samoa
51
American Samoa on World Map
52
Multinomial Sampling
• This is the same as the Poisson sampling scheme except that here the overall sample size is predetermined, rather than the amount of time for sampling (or space or volume or money, etc.).
• Suppose in the CVD Death Example 1 researchers had decided to sample the health records of exactly 3112 women and then for each woman note (i) whether obese or non-obese and (ii) whether she died of CVD or not. Then it would have been multinomial sampling.
53
Prospective Product Binomial Sampling
• Prospective Product Binomial Sampling (“cohort” study): first identify explanatory variable(s) thought to explain “causation”. The population is categorized according to the levels of the explanatory variable, and random samples are then selected from each explanatory group.
If separate lists of obese and non-obese American Samoan women had been available in Example 1, random samples of 2061 and 1051, respectively, could have been selected from the lists. The term Binomial refers to the dichotomy of the explanatory variable. The term Product refers to the fact that sampling is done from more than one population independently.
54
Example-1: Cardio-Vascular Deaths and Obesity
among women in American Samoa
55
Layout for Inference on the 2×2 table
(Sec 2.2, Text)

                                 Column factor (‘Response’)
                                 Level 1    Level 2    Row Total
Row factor       Level 1          n11        n12       R1 = n1+ (= n10 = n1)
(‘Explanatory’)  Level 2          n21        n22       R2 = n2+ (= n20 = n2)
57
Example-2*: Vitamin-C versus Placebo Experiment
59
Retrospective Product Binomial Sampling
60
Example 3*: Smoking versus Lung Cancer Outcome

              Cancer   Control   Total
SMOKER           83        72      155
NON-SMOKER        3        14       17
TOTAL            86        86      172
62
Why Retrospective Product Binomial
Sampling at all ?
• We cannot test for the equality of proportions along the
explanatory variable if the sampling scheme is
retrospective.
• We only get the odds ratio from a case-control study, which is an inferior measure of the strength of association compared to the relative risk.
• Why do retrospective sampling at all, then?
Compared to prospective cohort studies they tend to be less
costly and shorter in duration. Case-control studies are often
used in the study of rare diseases, or as a preliminary study
where little is known about the association between possible
risk factor and disease of interest.
63
Retrospective Product Binomial
Sampling (Continued)
1. If the probabilities of the “Yes” response (e.g., cancer)
are very small for a particular level (e.g., non-smoker) of
explanatory variable, it may need a huge sample size to
get any “Yes” response at all through prospective
sampling.
2. Retrospective sampling guarantees that we have at
least a reasonable number of “Yes” responses for each
level of explanatory variable.
3. Retrospective sampling may be accomplished without
having to follow the subjects throughout their lifetime
(in the smoking versus lung cancer study, Example 3).
64
Sampling scheme versus Hypotheses Testing
Sampling scheme | Marginal total fixed in advance | Usual hypothesis: Independence | Usual hypothesis: Homogeneity
Poisson         | None                            | YES |
Multinomial     | Grand total (sample size)       | YES |
Prospective     | Row (explanatory) totals        |     | YES
Retrospective   | Column (response) totals        |     | YES
65
Prospective: subjects are selected according to the levels of the explanatory variable (Explanatory Variable → Response Variable).
Retrospective: subjects are selected according to the levels of the Response variable.
66
Layout for Inference on the 2×2 table
(Sec 2.2, Text)

                                 Column factor (‘Response’)
                                 Level 1    Level 2    Row Total
Row factor       Level 1          n11        n12       R1 = n1+ (= n10 = n1)
(‘Explanatory’)  Level 2          n21        n22       R2 = n2+ (= n20 = n2)
68
Assumptions for Asymptotic Chi-square tests in
22 Contingency Tables
• We will assume that the frequencies of all the entries in
the 2x2 table are greater than 5.
• This ensures that the “asymptotic tests” performed on
the 2x2 tables are reasonably accurate. (“asymptotic”
means ‘appropriate in large samples’)
• If all the entries in the 2x2 table are not greater than 5,
one may try Fisher’s Exact test.
69
Example-1*: Cardio-Vascular Deaths and Obesity among
women in American Samoa
70
Example-1 Calculations
74
Example-2: Vitamin-C versus Placebo Experiment
75
Example-2: Vitamin-C versus Placebo Experiment
76
Example-2 Calculations
• Expected cell count for the PCS test under H0: p1+ = p2+ : n̂ᵢⱼ = nᵢ₊·(n₊ⱼ/n);
• Equivalent Z test for homogeneity of two binomial populations:
Z = (p̂1+ − p̂2+)/√[p̂·(1 − p̂)·(1/n1+ + 1/n2+)], where p̂ = n+1/n and 1 − p̂ = n+2/n;
• Note: the PCS test can be used to test homogeneity of binomial populations, as PCS = Z².
77
Example 3*: Smoking versus Lung Cancer Outcome
              Cancer   Control   Total
SMOKER           83        72      155
NON-SMOKER        3        14       17
TOTAL            86        86      172
78
Example-3 Calculations
• Expected cell count for the PCS test under H0: p1+ = p2+ : n̂ᵢⱼ = nᵢ₊·(n₊ⱼ/n);
• Equivalent Z test for homogeneity of two binomial populations:
Z = (p̂1+ − p̂2+)/√[p̂·(1 − p̂)·(1/n1+ + 1/n2+)], where p̂ = n+1/n and 1 − p̂ = n+2/n;
• Note: the PCS test can be used to test homogeneity of binomial populations, as PCS = Z².
81
Intentionally Kept Blank
82
Homogeneity versus Independence
Hypotheses
• Hypothesis of homogeneity (for prospective product
binomial sampling)
H0: π1 = π2
Not done in Retrospective Product Binomial Sampling
• Hypothesis of Independence
(At this stage qualitatively expressed)
Done only in Multinomial or Poisson Sampling
83
Homogeneity versus Independence
Hypotheses (contd.)
• The hypothesis of independence is used to
investigate an association between row and column
factors without specifying one of them as a
response. Although the hypotheses may be
expressed in terms of parameters, it is more
convenient to use the qualitative wording:
• H0: The row categorization is independent of the
column categorization
84
Sampling scheme versus Hypotheses Testing
Sampling scheme | Marginal total fixed in advance | Usual hypothesis: Independence | Usual hypothesis: Homogeneity
Poisson         | None                            | YES |
Multinomial     | Grand total (sample size)       | YES |
Prospective     | Row (explanatory) totals        |     | YES
Retrospective   | Column (response) totals        |     | YES
85
Assumptions for Pearson’s Chi-square tests
• If all the entries in the 2x2 table are not greater than
5, one may try Fisher’s Exact test.
86
Intentionally Kept Blank
87
Layout for Inference on the 2×2 table
(Sec 2.2, Text)
Column factor
(‘Response’)
Level 1 Level 2
Level 1 n11 n12 R1=
Row Fact n1+= n10=n1 Row
(‘Explanatory’) Total
Level 2 n21 n22 R2=
n2+= n20=n2
Measures of Association:
• (i) Difference Between Proportions
• (ii) Relative Risk (or Risk Ratio or Incidence Rate Ratio or ‘Probability Ratio’)
• (iii) Odds Ratio
[Example: expected number of trials up to and including the first failure = 1/(1 − π); expected number of successes before the first failure = π/(1 − π) = ‘odds for success’]
89
Is “Tutoring” Helpful in a Business Stat Course?
90
Motivating Use of Odds-Ratios
• High School A reduced the dropout rate from 10% to 5%, a dramatic 50% decrease!
• High School B increased the graduation rate from 90% to 95%, a modest 5.5% increase.
• The principal of School A was lauded by the NY Times for slashing the dropout rate by half. The other principal got a short mention in the local newspaper, even though they did the same thing.
• Log-odds ratios (base-10 logs here) put these on even terms:
School A: log((0.05/0.95)/(0.10/0.90)) = −0.32
School B: log((0.95/0.05)/(0.90/0.10)) = 0.32
• Log-odds-ratios consider a change from 10% to 5% equivalent to a change from 90% to 95%.
91
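The two schools’ numbers above can be reproduced in a couple of lines (base-10 logs, matching the slide’s values):

```python
import math

# base-10 log-odds ratio comparing a "new" proportion with an "old" one
def log10_odds_ratio(p_new, p_old):
    return math.log10((p_new / (1 - p_new)) / (p_old / (1 - p_old)))

school_a = log10_odds_ratio(0.05, 0.10)   # dropout rate 10% -> 5%
school_b = log10_odds_ratio(0.95, 0.90)   # graduation rate 90% -> 95%
print(round(school_a, 2), round(school_b, 2))   # -0.32 and 0.32
```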
‘Odds for success’ vs ‘probability of success’
in Geometric Distribution Setup (p. 41)
• A sequence of independent Bernoulli trials, each having two potential outcomes, "success" and "failure". In each trial the probability of success is π and of failure is (1 − π). Observe this sequence until the first failure has occurred. Then X = number of ‘successes’ observed will have the geometric (negative binomial with r = 1) distribution: Pr(X = k) = πᵏ(1 − π), k = 0, 1, 2, …
• ω = E(X) = π/(1 − π), V(X) = π/(1 − π)² > E(X).
• Thus, ω = π/(1 − π) = ‘odds for success’ is equal to the ‘average number’ of successes before the first failure in a ‘geometric experiment’ of trials.
• Compare it with π = ‘probability of success’ = the long-run ‘proportion’ of successes in an experiment of infinitely many trials.
92
Odds versus Probabilities (contd.)
Interpretation: An event with chance of
occurrence 0.95 means the event has odds of
0.95/0.05 =19 to 1 in favour of its occurrence
93
Relation between Probability, Odds & Logit
Probability   Odds    Log(Odds) = Logit
0             0        NC
0.1           0.11    −2.20
0.2           0.25    −1.39
0.3           0.43    −0.85
0.4           0.67    −0.41
0.5           1.00     0.00
0.6           1.50     0.41
0.7           2.33     0.85
0.8           4.00     1.39
0.9           9.00     2.20
1             NC       NC

Odds maps probability from [0,1] to [0,∞) asymmetrically, while Logit maps probability to (−∞, ∞) symmetrically. (NC = not computable; logits here use natural logs.)
94
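The table above can be regenerated directly (natural logs):

```python
import math

# probability -> odds -> logit, for p = 0.1, ..., 0.9
print(f"{'prob':>5} {'odds':>6} {'logit':>6}")
for i in range(1, 10):
    p = i / 10
    odds = p / (1 - p)
    logit = math.log(odds)
    print(f"{p:5.1f} {odds:6.2f} {logit:6.2f}")
```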
Example: NFL Football
TEAM “ODDS against” (Prob of Win)
San Francisco 49ers Even (1/2)
Denver Broncos 5 to 2 (2/7)
New York Giants 3 to 1 (1/4)
Cleveland Browns 9 to 2 (2/11)
Los Angeles Rams 5 to 1 (1/6)
Minnesota Vikings 6 to 1 (1/7)
Buffalo Bills 8 to 1 (1/9)
Pittsburgh Steelers 10 to 1 (1/11)
96
Advantage of Odds Ratio over
Risk Ratio or Difference of Proportions
1. The estimate of the Odds Ratio (OR) remains invariant over the sampling design (i.e., it works even in case of retrospective sampling), and it is given by OR̂ = (n11·n22)/(n12·n21), since
OR = [P(Y=1|X=1)/P(Y=0|X=1)] ÷ [P(Y=1|X=0)/P(Y=0|X=0)]
   = [P(Y=1,X=1)/P(Y=0,X=1)] ÷ [P(Y=1,X=0)/P(Y=0,X=0)]
   = [P(Y=1,X=1)·P(Y=0,X=0)] / [P(Y=0,X=1)·P(Y=1,X=0)]
   = [P(X=1|Y=1)/P(X=0|Y=1)] ÷ [P(X=1|Y=0)/P(X=0|Y=0)]
2. Comparison of odds extends nicely to regression analysis when the response (Y) is a categorical variable.
97
(iii) Odds and Odds Ratio
Odds of an outcome: let π be the population proportion of “YES” outcomes. Then the corresponding odds are given by
ω = π/(1 − π)
98
(iii) Odds, and Odds Ratio (contd)
• πᵢ = population proportion of “YES” responses for Group X = i. Then the odds of “YES” happening are given by: ωᵢ = πᵢ/(1 − πᵢ), 0 ≤ ωᵢ < ∞.
• The sample proportions of “YES” in Group i give the estimates: ω̂ᵢ = π̂ᵢ/(1 − π̂ᵢ), i = 1, 2
• Odds Ratio of “YES” response in Group 1 to that in Group 2:
φ = ω1/ω2 = [π1/(1 − π1)] × [(1 − π2)/π2]
99
Odds versus Probability
Given the probability of a “YES” outcome, the
corresponding odds is given by,
𝜔 = 𝜋/(1 − 𝜋)
Similarly, given the odds ω of a “YES” response, the
corresponding probability is given by
𝜋 = 𝜔/(1 + 𝜔)
100
Properties of Odds
ω = π/(1 − π);  π = ω/(1 + ω)
101
Intentionally Kept Blank
102
Difference Between Proportions or Relative Risk?
[ (π1 π2) or (π1/π2) ?]
• Two proportions π1 = 0.5 and π2 = 0.45 have the same difference as π1 = 0.1 and π2 = 0.05 (even though in the second case one is twice the other). In such cases Relative Risk is a better measure than the Difference Between Proportions.
• An alternative to comparing proportions (relative risk, i.e., π1/π2) is to compare the corresponding odds through the odds ratio φ = ω1/ω2, where ω1 = π1/(1 − π1) and ω2 = π2/(1 − π2).
103
Confidence Interval for π1 − π2
• Estimate of π1 − π2: π̂1 − π̂2 = n11/n1+ − n21/n2+
• V̂ar(π̂1 − π̂2) = π̂1(1 − π̂1)/n1+ + π̂2(1 − π̂2)/n2+
• s.e.(π̂1 − π̂2) = √[π̂1(1 − π̂1)/n1+ + π̂2(1 − π̂2)/n2+]
• 100(1 − α)% CI for π1 − π2:
(π̂1 − π̂2) ± Z_{α/2}·√[π̂1(1 − π̂1)/n1+ + π̂2(1 − π̂2)/n2+]
104
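This interval as a small helper; the counts in the usage line are hypothetical, for illustration:

```python
import math

def diff_ci(n11, n1p, n21, n2p, z=1.96):
    # Wald CI for π1 - π2 from 2x2 counts, as on the slide
    p1, p2 = n11 / n1p, n21 / n2p
    se = math.sqrt(p1 * (1 - p1) / n1p + p2 * (1 - p2) / n2p)
    d = p1 - p2
    return d - z * se, d + z * se

lo, hi = diff_ci(30, 100, 18, 90)   # hypothetical counts
print(round(lo, 4), round(hi, 4))
```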
Testing H0: π1 − π2 = 0
• Estimate of π1 − π2: (π̂1 − π̂2) = n11/n1+ − n21/n2+
• Pooled estimate: π̂ = (n11 + n21)/(n1+ + n2+)
• V̂ar(π̂1 − π̂2) = π̂(1 − π̂)·(1/n1+ + 1/n2+) under H0
• Test statistic Z = (π̂1 − π̂2)/√[π̂(1 − π̂)(1/n1+ + 1/n2+)] is asymptotically N(0,1) under H0, if n1+, n2+ are ‘large’
105
Relative Risk vs Odds Ratio
106
Relative Risk (RR) or Incidence Rate Ratio (IRR)
(Text, p.53)
107
Confidence Intervals for Relative Risk (RR)
(Text, p.54)
• Estimate of RR (= π1/π2): RR̂ = (n11/n1+)/(n21/n2+);
• Exercise: log_e(RR̂) ~ N( log_e(RR), (1 − π1)/(π1·n1+) + (1 − π2)/(π2·n2+) ) ≈ N( log_e(RR), (1 − π̂1)/n11 + (1 − π̂2)/n21 )
• V̂ar(log_e RR̂) = (1 − π̂1)/n11 + (1 − π̂2)/n21
• 100(1 − α)% CI for log_e(RR): log_e(RR̂) ∓ Z_{α/2}·√V̂ar(log_e RR̂)
• 100(1 − α)% CI for RR: RR̂·exp(−Z_{α/2}·√V̂ar(log_e RR̂)) to RR̂·exp(Z_{α/2}·√V̂ar(log_e RR̂))
109
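A sketch of this interval as a function; the counts in the usage line are hypothetical, for illustration:

```python
import math

def rr_ci(n11, n1p, n21, n2p, z=1.96):
    # CI for RR via the log scale; Var(log RR^) = (1-π^1)/n11 + (1-π^2)/n21
    p1, p2 = n11 / n1p, n21 / n2p
    rr = p1 / p2
    half = z * math.sqrt((1 - p1) / n11 + (1 - p2) / n21)
    return rr * math.exp(-half), rr, rr * math.exp(half)

lo, rr, hi = rr_ci(30, 100, 18, 90)   # hypothetical counts
print(round(lo, 3), round(rr, 3), round(hi, 3))
```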
Intentionally Kept Blank
110
Computation of odds ratio
(Example 3)

              Cancer   Control
Smoker           83        72
Non-Smoker        3        14

• ln(p̂1/(1 − p̂1)) ~ N( ln(p1/(1 − p1)), 1/[n1+·p1(1 − p1)] ) ≈ N( ln(p1/(1 − p1)), 1/[n1+·p̂1(1 − p̂1)] = 1/n11 + 1/n12 )
• ln(p̂2/(1 − p̂2)) ≈ N( ln(p2/(1 − p2)), 1/[n2+·p̂2(1 − p̂2)] = 1/n21 + 1/n22 )
• ln(OR̂) ~ N( ln(OR), 1/n11 + 1/n12 + 1/n21 + 1/n22 )
• OR̂ ~ N( OR, OR²·(1/n11 + 1/n12 + 1/n21 + 1/n22) ), by the DELTA method
• Under H0, ln(OR̂) ~ N( 0, [1/(p̂c(1 − p̂c))]·(1/n1+ + 1/n2+) ), where the estimated common p = p̂c = (n11 + n21)/n
113
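Plugging Example 3’s counts into these formulas:

```python
import math

# Example 3: odds ratio and the asymptotic s.e. of ln(OR^)
n11, n12, n21, n22 = 83, 72, 3, 14
or_hat = (n11 * n22) / (n12 * n21)
se_log_or = math.sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)
print(round(or_hat, 2), round(se_log_or, 3))   # 5.38 and 0.656
```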
Confidence Interval for Odds Ratio
(through that for log_e of the Odds Ratio, Text, p.52)
• Estimate of OR: OR̂ = (n11·n22)/(n21·n12)
• ln(OR̂) ~ N( ln(OR), 1/n11 + 1/n12 + 1/n21 + 1/n22 ), by the DELTA method
• Estimated “asymptotic” variance of log_e(OR̂): V̂ar(log_e OR̂) = 1/n11 + 1/n22 + 1/n21 + 1/n12
• 100(1 − α)% CI for OR: exp( ln(OR̂) ∓ Z_{α/2}·√V̂ar(log_e OR̂) ), i.e., [(n11·n22)/(n21·n12)]·exp( ∓Z_{α/2}·√(1/n11 + 1/n22 + 1/n21 + 1/n12) )
4. 95% interval for the odds ratio: exp(0.093) to exp(0.761), i.e., 1.10 to 2.14.
Conclusion: The odds of a cold for the placebo group are estimated to be 1.53 times the odds of a cold for the vitamin C group (approximate 95% CI: 1.10 to 2.14).
116
Test for Homogeneity
• Hypothesis of homogeneity
H0: π1 = π2
117
Testing Equality of Two Population
Odds
118
Odds Ratio (Contd.)
Interpretation:
If the odds ratio φ = ω1/ω2 equals 4, then ω1 = 4ω2. This means that the odds of a “yes” outcome in the first group are four times the odds of a “yes” outcome in the second group.
119
Example 3: Cancer vs Smoking
              Cancer   Control
Smoker           83        72
Non-Smoker        3        14

Calculate the odds ratio by dividing the product of the diagonal elements of the table by the product of the off-diagonal elements: OR̂ = (83×14)/(72×3) = 5.38.
The above result indicates that the odds of getting cancer for a smoker are estimated to be 5.38 times the odds of getting cancer for a non-smoker.
120
Sampling Distribution of ln(OR)
• OR = ω1/ω2; OR̂ = (n11·n22)/(n21·n12)
• ln(OR̂) = ln(p̂1/(1 − p̂1)) − ln(p̂2/(1 − p̂2))
• p̂1 ~ N( p1, p1(1 − p1)/n1+ )
• By the DELTA method, ln(p̂1/(1 − p̂1)) ~ N( ln(p1/(1 − p1)), [p1(1 − p1)/n1+]·[1/(p1(1 − p1))]² ) = N( ln(p1/(1 − p1)), 1/[n1+·p1(1 − p1)] ) ≈ N( ln(p1/(1 − p1)), 1/[n1+·p̂1(1 − p̂1)] = 1/n11 + 1/n12 )
• Similarly, ln(p̂2/(1 − p̂2)) ≈ N( ln(p2/(1 − p2)), 1/[n2+·p̂2(1 − p̂2)] = 1/n21 + 1/n22 )
• ln(OR̂) ~ N( ln(OR), 1/n11 + 1/n12 + 1/n21 + 1/n22 ), by the DELTA method
• OR̂ ~ N( OR, OR²·(1/n11 + 1/n12 + 1/n21 + 1/n22) ), by the DELTA method
• Under H0, ln(OR̂) ~ N( 0, [1/(p̂c(1 − p̂c))]·(1/n1+ + 1/n2+) ), where p̂c = (n11 + n21)/n
121
Example 1: Cardio-Vascular Deaths and Obesity
among women in American Samoa
124
Intentionally Kept Blank
125
Test for Marginal Homogeneity
(McNemar’s Test, Text, p.55-56)
Year 0: 9/285=0.03;
Year 1: 41/317=0.13
128
Derivation: McNemar’s Test
• H0: π1+ = π+1; define the statistic d̂ = p̂1+ − p̂+1 (= p̂+2 − p̂2+),
130
Cochran-Mantel-Haenszel Test for no row by
column association in any of the 22 Tables
(Text pp. 94-101, Agresti, p. 237)
131
Cochran-Mantel-Haenszel Test
(Text pp. 94-101, Agresti, p. 237)
m11⁽ʰ⁾ = n1+⁽ʰ⁾·n+1⁽ʰ⁾ / n⁽ʰ⁾
133
Cochran-Armitage Trend Test
(See Text, p.60-61, Agresti, p. 178)
Binary categorical (row) variable X, ordered (column) variable Y.
135
Layout for Inference on the 2×2 table
(Sec 2.2, Text)

                                 Column factor (‘Response’)
                                 Level 1    Level 2    Row Total
Row factor       Level 1          n11        n12       R1 = n1+ (= n10 = n1)
(‘Explanatory’)  Level 2          n21        n22       R2 = n2+ (= n20 = n2)

• If all the entries in the 2x2 table are not greater than 5, one may try Fisher’s Exact test.
137
Exact Test: Independence of Two Attributes
139
Exact (Conditional) Test: Independence of
Two Attributes
• To test if two qualitative characters (attributes) A and B are independent. Let P(A=Aᵢ, B=Bⱼ) = pᵢⱼ, i = 1,…,k, j = 1,…,l.
• To test H0: pᵢⱼ = pᵢ₀·p₀ⱼ, for all i, j.
• nᵢⱼ = observed frequency for cell AᵢBⱼ. The marginal frequencies of Aᵢ and Bⱼ are nᵢ₀ = Σ_{j=1}^l nᵢⱼ and n₀ⱼ = Σ_{i=1}^k nᵢⱼ.
• Under H0, the conditional distribution of {nᵢⱼ, all i, j} given the current sample marginals {nᵢ₀, n₀ⱼ, all i, j} has the (multivariate hypergeometric) pmf
[n!/(∏ᵢ∏ⱼ nᵢⱼ!)]·∏ᵢ pᵢ₀^{nᵢ₀}·∏ⱼ p₀ⱼ^{n₀ⱼ} ÷ { [n!/(∏ᵢ nᵢ₀!)]·∏ᵢ pᵢ₀^{nᵢ₀} × [n!/(∏ⱼ n₀ⱼ!)]·∏ⱼ p₀ⱼ^{n₀ⱼ} } = [∏ᵢ nᵢ₀!·∏ⱼ n₀ⱼ!] / [n!·∏ᵢ∏ⱼ nᵢⱼ!]
140
Multivariate Hypergeometric Distribution
(used in calculating P-value in Exact Tests)
141
Derivation: Conditional distribution of {nij} given
current sample marginals {𝑛𝑖0 , 𝑛0𝑗 }
• To test if two qualitative characters (attributes) A and B are independent. Let P(A=Aᵢ, B=Bⱼ) = pᵢⱼ, i = 1,…,k, j = 1,…,l.
• To test H0: pᵢⱼ = pᵢ₀·p₀ⱼ, for all i, j.
• nᵢⱼ = observed frequency for cell AᵢBⱼ. The marginal frequencies of Aᵢ and Bⱼ are nᵢ₀ = Σ_{j=1}^l nᵢⱼ and n₀ⱼ = Σ_{i=1}^k nᵢⱼ.
• Under H0, the pmf of {nᵢⱼ, all i, j} is
[n!/(∏ᵢ∏ⱼ nᵢⱼ!)]·∏ᵢ∏ⱼ pᵢⱼ^{nᵢⱼ} = [n!/(∏ᵢ∏ⱼ nᵢⱼ!)]·∏ᵢ pᵢ₀^{nᵢ₀}·∏ⱼ p₀ⱼ^{n₀ⱼ}
• Under H0, the unconditional pmf of {nᵢ₀, all i} = [n!/(∏ᵢ nᵢ₀!)]·∏ᵢ pᵢ₀^{nᵢ₀}
• Under H0, the unconditional pmf of {n₀ⱼ, all j} = [n!/(∏ⱼ n₀ⱼ!)]·∏ⱼ p₀ⱼ^{n₀ⱼ}
• Under H0, the conditional pmf of {nᵢⱼ, all i, j} given the current sample marginals {nᵢ₀, n₀ⱼ, all i, j} is
[n!/(∏ᵢ∏ⱼ nᵢⱼ!)]·∏ᵢ pᵢ₀^{nᵢ₀}·∏ⱼ p₀ⱼ^{n₀ⱼ} ÷ { [n!/(∏ᵢ nᵢ₀!)]·∏ᵢ pᵢ₀^{nᵢ₀} × [n!/(∏ⱼ n₀ⱼ!)]·∏ⱼ p₀ⱼ^{n₀ⱼ} } = [∏ᵢ nᵢ₀!·∏ⱼ n₀ⱼ!] / [n!·∏ᵢ∏ⱼ nᵢⱼ!]
142
Example 4: Exact Test of Indep. of Attributes
• Add up the probabilities, under H0, of the given table and of those indicating a more extreme association in the direction of Ha (and having the same marginals).
• Here n1+ = 15, n2+ = 14, n11 = 6, n21 = 11 and Ha: p1 < p2. The sample sizes are not large, hence asymptotic tests are not applicable; we need to use exact tests.
144
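A sketch of the exact one-sided P-value for these numbers, summing the conditional (hypergeometric) pmf over tables at least as extreme in the direction of Ha (n11 ≤ 6), with margins fixed:

```python
from math import comb

n1, n2 = 15, 14                  # row totals
x = 6 + 11                       # fixed first-column total, n11 + n21 = 17

def cond_pmf(x1):
    # hypergeometric pmf of n11 given the margins
    return comb(n1, x1) * comb(n2, x - x1) / comb(n1 + n2, x)

p_value = sum(cond_pmf(x1) for x1 in range(0, 7))   # tables with n11 <= 6
print(round(p_value, 4))
```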
Exact Test of Two Proportions (Homogeneity)
(GGD, Fundamentals, Vol 1)
145
Exact Test of Two Proportions
(Homogeneity of two Binomial Distributions )
• f(x1) = C(n1, x1)·p^{x1}·(1 − p)^{n1−x1}, under H0
• f(x2) = C(n2, x2)·p^{x2}·(1 − p)^{n2−x2}, under H0
• f(x) = C(n, x)·p^{x}·(1 − p)^{n−x}, under H0, where X = X1 + X2 and n = n1 + n2
• Conditional pmf of X1 for given X = x = x1 + x2, under H0, is
f(x1|x) = [C(n1, x1)·p^{x1}(1 − p)^{n1−x1} × C(n2, x − x1)·p^{x−x1}(1 − p)^{n2−x+x1}] ÷ [C(n, x)·p^{x}(1 − p)^{n−x}] = C(n1, x1)·C(n2, x − x1) / C(n1 + n2, x)
If the observed value of X1 is x10 and that of X is x0, then use the conditional pmf of X1, f(x1|x0), for testing H0.
146
Exact Test of Two Proportions
(Homogeneity of two Binomial Distributions )
149
Exact Test for Homogeneity of two Multinomial
Distributions (GGD, Fundamentals, Vol 1)
• Two multinomial populations with distributions Mult(n1, (p11,…,p1k)) and Mult(n2, (p21,…,p2k)). Random samples of sizes n1 and n2 are drawn independently from the two populations. Let X1 = (X11,…,X1k) and X2 = (X21,…,X2k) be the samples, with nᵢ = Σ_{j=1}^k Xᵢⱼ.
• Want to test H0: (p11,…,p1k) = (p21,…,p2k) [= (p1,…,pk), say, but unknown].
• Make use of the statistics X1, X2, but concentrate on samples for which the sum X = X1 + X2 is fixed at the observed sum (x1 + x2) = x.
150
Exact Test for Homogeneity of two Multinomial
Distributions
• f(x1) = [n1!/(x11!·x12!·…·x1k!)]·p1^{x11}·…·pk^{x1k}, under H0
• f(x2) = [n2!/(x21!·x22!·…·x2k!)]·p1^{x21}·…·pk^{x2k}, under H0
• f(x) = [n!/(x1!·x2!·…·xk!)]·p1^{x1}·…·pk^{xk}, under H0
• Conditional pmf of X1 for given X = x = x1 + x2, under H0, is
f(x1|x) = [n1!/(x11!·x12!·…·x1k!)]·[n2!/(x21!·x22!·…·x2k!)]·(x1!·x2!·…·xk!)/n!
If the observed value of X1 is x10 and that of X is x0, then use the conditional pmf of X1, f(x1|x0), for testing H0.
• Exercise: Find a good business example/application
151
Testing Homogeneity of Two Multinomial Pop.
(https://online.stat.psu.edu/stat415/lesson/17/17.1 ; Example 17-3)
               Level 1   ⋯   Level J   Total
Population 1     n11     ⋯     n1J     n1 = n1+
Population 2     n21     ⋯     n2J     n2 = n2+
Total            n+1     ⋯     n+J     n = n++
152
Testing Homogeneity of Two Multinomial Pop.
(https://online.stat.psu.edu/stat415/lesson/17/17.1 ; Example 17-3)
The head of a surgery department at a university medical center was concerned that
surgical residents in training applied unnecessary blood transfusions at a different
rate than the more experienced attending physicians. Therefore, he ordered a study
of the 49 Attending Physicians and 71 Residents in Training with privileges at the
hospital. For each of the 120 surgeons, the number of blood transfusions prescribed
unnecessarily in a one-year period was recorded. Based on the number recorded, a
surgeon was identified as either prescribing unnecessary blood transfusions
Frequently, Occasionally, Rarely, or Never.
155
Test of Poisson Mean
(GGD, Fundamentals, Vol 1)
• Poisson population for which the mean is λ. A random sample {X1, …, Xn} is drawn. Then Y = Σ_{j=1}^n Xⱼ ~ Poi(nλ).
• Want to test H0: λ = λ0 vs Ha: λ > λ0.
• Let Y = y0; exact P-value = P(Y ≥ y0) = Σ_{y=y0}^∞ e^{−nλ0}·(nλ0)^y/y!
• CI for λ: X̄ = (Σ_{j=1}^n Xⱼ)/n ~ N(λ, λ/n) ≈ N(λ, X̄/n)
• (X̄ − λ)/√(X̄/n) ~ N(0, 1), approximately
157
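A sketch of the exact upper-tail P-value; the inputs in the usage line (n = 10, λ0 = 2, y0 = 29) are hypothetical, for illustration:

```python
import math

def poisson_upper_p(n, lam0, y0):
    # P(Y >= y0) where Y ~ Poi(n·λ0) under H0, via 1 - P(Y <= y0 - 1)
    m = n * lam0
    return 1 - sum(math.exp(-m) * m ** y / math.factorial(y) for y in range(y0))

p = poisson_upper_p(10, 2.0, 29)   # hypothetical inputs
print(round(p, 4))
```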
Exact Test of Two Poisson Means
(GGD, Fundamentals, Vol 1)
• Two Poisson populations for which the means are λ1 and λ2. Random samples {X11, …, X1n1} and {X21, …, X2n2} of sizes n1 and n2 are drawn independently from the two populations. Then Yᵢ = Σ_{j=1}^{nᵢ} Xᵢⱼ ~ Poi(nᵢλᵢ).
• Suppose we want to test H0: λ1 = λ2 vs Ha: λ1 > λ2.
• The conditional pmf of Y1 given Y1 + Y2 = y0 is Bin(y0, n1/(n1 + n2)), under H0.
• P-value = Σ_{y1 ≥ y10} C(y0, y1)·[n1/(n1 + n2)]^{y1}·[n2/(n1 + n2)]^{y0−y1}
• Exercise: Collect data for two world cup (cricket) tournaments and test the equality of the two Poisson means.
158
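A sketch of the conditional binomial P-value; the inputs in the usage line (n1 = n2 = 5, y10 = 18, y0 = 30) are hypothetical, for illustration:

```python
from math import comb

def two_poisson_p(n1, n2, y10, y0):
    # P(Y1 >= y10 | Y1 + Y2 = y0), with Y1 ~ Bin(y0, n1/(n1+n2)) under H0
    q = n1 / (n1 + n2)
    return sum(comb(y0, y1) * q ** y1 * (1 - q) ** (y0 - y1)
               for y1 in range(y10, y0 + 1))

p = two_poisson_p(5, 5, 18, 30)    # hypothetical inputs
print(round(p, 4))
```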
Intentionally Kept Blank
159
Degrees of Freedom for Likelihood Ratio Tests
160
References
• Agresti, A. (2012). Categorical Data Analysis, Wiley Series in
Probability and Statistics.
• Bishop, Y., Fienberg, S. E. and Holland, P. W. (1975). Discrete
Multivariate Analysis, MIT Press, Cambridge.
• Christensen, R. (1990). Loglinear Models. Springer-Verlag,
New York.
• Ramsey, F. L. and Schafer, D. W. (1997). The Statistical Sleuth.
Duxbury Press, Belmont, California.
• Read, T. R. C. and Cressie, N. (1988). Goodness-of-Fit Statistics for Discrete Multivariate Data. Springer-Verlag, New York.
• Goon, Gupta, Dasgupta, Fundamentals of Statistics, Volume
One.
161
162
Bivariate Normal Distribution
• Two “related”, normally distributed variables X ~ N(µx, σx²) and Y ~ N(µy, σy²) with correlation coefficient ρ.
• Example: X = Adv Exp, Y = Sales.
• The outline of a bivariate normal histogram can be represented by the following (probability density) function:
f(x, y) = [1/(2π·σx·σy·√(1 − ρ²))]·exp{ −[1/(2(1 − ρ²))]·[ ((x − µx)/σx)² − 2ρ·((x − µx)/σx)·((y − µy)/σy) + ((y − µy)/σy)² ] }
163
Bivariate Normal Distribution
• Correlation coefficient ρ = 0
164
Bivariate Normal Distribution
• Correlation coefficient ρ = 0.8
165