
STATISTICAL INFERENCE LECTURE

- Dripto Bakshi
Inferential Statistics

• Inferential statistics: the part of statistics that allows researchers to generalize their findings beyond the data collected.

• Statistical inference: a procedure for making inferences or generalizations about a larger population from a sample of that population.
Lecture Parts

• Weak Law of Large Numbers & Central Limit Theorem

• Estimation Theory

Weak Law of Large Numbers & Central Limit Theorem

Pre-Poll / Election Survey
Markov's Inequality

• Consider any positive random variable X ∈ (0, ∞) with density $f_X$.

• $E(X) = \int_0^{+\infty} x\, f_X(x)\, dx \;\ge\; \int_a^{+\infty} x\, f_X(x)\, dx$, where a > 0

• $\int_a^{+\infty} x\, f_X(x)\, dx \;\ge\; \int_a^{+\infty} a\, f_X(x)\, dx \;=\; a \cdot P(X \ge a)$

• $P(X \ge a) \;\le\; \dfrac{E(X)}{a}$
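A quick numerical check makes the bound concrete. Below is a minimal sketch, assuming X follows an Exponential distribution with mean 2 (any positive random variable would do), comparing the empirical tail probability with the Markov bound:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example distribution: Exponential with mean 2 (any positive
# random variable works). Compare the empirical tail P(X >= a) with
# the Markov bound E(X)/a for a few thresholds a.
X = rng.exponential(scale=2.0, size=1_000_000)

for a in [1.0, 2.0, 5.0, 10.0]:
    tail = np.mean(X >= a)      # empirical P(X >= a)
    bound = X.mean() / a        # Markov bound E(X)/a
    print(f"a={a:5.1f}  P(X>=a)={tail:.4f}  E(X)/a={bound:.4f}")
```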
Chebyshev's Inequality

• Consider any random variable X ~ $F_X(\mu, \sigma^2)$.

• Define Q = $(X - \mu)^2$. Q is definitely a positive random variable.

• By Markov's inequality, for any $\varepsilon > 0$:

$P\big((X - \mu)^2 \ge \varepsilon^2\big) \;\le\; \dfrac{E\big((X - \mu)^2\big)}{\varepsilon^2} \;=\; \dfrac{\sigma^2}{\varepsilon^2}$

• Since $(X - \mu)^2 \ge \varepsilon^2$ exactly when $|X - \mu| \ge \varepsilon$:

$P(|X - \mu| \ge \varepsilon) \;\le\; \dfrac{\sigma^2}{\varepsilon^2}$
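As with Markov, the bound is easy to sanity-check by simulation. A minimal sketch, assuming X ~ N(0, 1) so that μ = 0 and σ² = 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example distribution: X ~ N(0, 1), so mu = 0 and sigma^2 = 1.
# Compare the empirical P(|X - mu| >= eps) with the bound sigma^2/eps^2.
X = rng.normal(0.0, 1.0, size=1_000_000)

for eps in [1.0, 2.0, 3.0]:
    tail = np.mean(np.abs(X) >= eps)
    print(f"eps={eps:.1f}  P(|X-mu|>=eps)={tail:.4f}  bound={1 / eps**2:.4f}")
```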
Weak Law of Large Numbers

• $X_1, X_2, X_3, \dots, X_n$ are i.i.d., where $X_j \sim F(\mu, \sigma^2)$ $\forall\; j = 1, 2, \dots, n$

• $M_n = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$

• What happens as $n \to \infty$?


Convergence in Probability

• $E(M_n) = \mu$ & $V(M_n) = \dfrac{\sigma^2}{n}$

• According to Chebyshev's inequality, for any finite $\varepsilon > 0$:

$P(|M_n - \mu| \ge \varepsilon) \;\le\; \dfrac{\sigma^2/n}{\varepsilon^2} \;=\; \dfrac{\sigma^2}{n\varepsilon^2}$

• Hence as $n \to \infty$, $P(|M_n - \mu| \ge \varepsilon) \to 0$ for any $\varepsilon > 0$:

$M_n$ converges in probability to $\mu$.
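The law is easy to watch in action. A minimal simulation sketch, assuming X_j ~ Uniform(0, 1) so that μ = 0.5:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example distribution: X_j ~ Uniform(0, 1), so mu = 0.5.
# The sample mean M_n settles ever closer to mu as n grows.
mu = 0.5
for n in [10, 100, 10_000, 1_000_000]:
    M_n = rng.uniform(0.0, 1.0, size=n).mean()
    print(f"n={n:>9,}  M_n={M_n:.5f}  |M_n - mu|={abs(M_n - mu):.5f}")
```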
Pre-Poll / Election Survey

The Methodology

• f: the proportion of people who respond "voting for Republicans"

• Define a random variable $X_j$ for any individual j, where

$X_j$ = 1 if "vote for Republican"
$X_j$ = 0 if "vote for Democrat"

• $M_n = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$

• $M_n$ denotes the fraction of people responding "voting for Republicans"

• Goal: 95% confidence of ≤ 1% error, i.e.

$P(|M_n - f| \ge 0.01) \;\le\; 0.05$
• $X_j \sim \mathrm{Ber}(f)$

• $E(X_j) = f$ & $V(X_j) = f(1-f)$

• $M_n = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$

• $E(M_n) = f$ & $V(M_n) = \dfrac{f(1-f)}{n}$
• Using Chebyshev's inequality:

$P(|M_n - f| \ge 0.01) \;\le\; \dfrac{V(M_n)}{(0.01)^2} \;=\; \dfrac{f(1-f)}{n(0.01)^2}$

• We want $\dfrac{f(1-f)}{n(0.01)^2} < 0.05$

• Choose the sample size (n) such that $\dfrac{f(1-f)}{n(0.01)^2} < 0.05$

• But we don't know f.

• In fact, we are trying to find / estimate f.


• f ∈ (0, 1) & $f(1-f) \le \tfrac{1}{4}$

• Therefore $\dfrac{f(1-f)}{n(0.01)^2} \;\le\; \dfrac{1}{4n(0.01)^2}$

• Now $\dfrac{1}{4n(0.01)^2} \le 0.05$ if $n \ge 50000$

• So the minimum sample size (n) = 50000 !!!!
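For reference, a sketch of the arithmetic behind that figure (the targets ε = 0.01 and probability 0.05 are taken from the slides, with the worst case f(1−f) = 1/4):

```python
import math

# Chebyshev-based sample size: we need 1 / (4 * n * eps^2) <= alpha,
# i.e. n >= 1 / (4 * alpha * eps^2). Values taken from the slides.
eps, alpha = 0.01, 0.05
n_chebyshev = math.ceil(1 / (4 * alpha * eps**2))
print(n_chebyshev)  # 50000
```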


Central Limit Theorem

• $X_1, X_2, X_3, \dots, X_n$ are i.i.d., where $X_j \sim F(\mu, \sigma^2)$ $\forall\; j = 1, 2, \dots, n$

• $E(X_j) = \mu$ & $V(X_j) = \sigma^2$ $\forall\; j = 1, 2, \dots, n$

• $S_n = X_1 + X_2 + X_3 + \dots + X_n$

• $E(S_n) = n\mu$ & $V(S_n) = n\sigma^2$

• Standardize: $Z_n = \dfrac{S_n - E(S_n)}{\sqrt{V(S_n)}} = \dfrac{S_n - n\mu}{\sigma\sqrt{n}} = \dfrac{M_n - \mu}{\sigma/\sqrt{n}}$
• $E(Z_n) = 0$ & $V(Z_n) = 1$

• Let Z be a standard Normal random variable, i.e. $Z \sim N(0, 1)$.

• The Central Limit Theorem states that:

For every c, $P(Z_n \le c) \to P(Z \le c)$ as n becomes large enough. (Convergence in distribution)

$P(Z \le c)$ is the standard normal CDF, $\Phi(c)$, available from the normal tables.
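The convergence can be seen numerically. A minimal sketch, assuming X_j ~ Uniform(0, 1) (so μ = 0.5, σ² = 1/12), comparing the simulated P(Z_n ≤ c) with Φ(c):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Assumed example distribution: X_j ~ Uniform(0, 1), mu = 0.5, sigma^2 = 1/12.
n, reps = 100, 100_000
mu, sigma = 0.5, np.sqrt(1.0 / 12.0)

# Standardize S_n over many replications and compare with the N(0, 1) CDF.
S_n = rng.uniform(0.0, 1.0, size=(reps, n)).sum(axis=1)
Z_n = (S_n - n * mu) / (sigma * np.sqrt(n))

for c in [-1.0, 0.0, 1.0, 1.96]:
    print(f"c={c:5.2f}  P(Z_n<=c)={np.mean(Z_n <= c):.4f}  Phi(c)={norm.cdf(c):.4f}")
```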
Election problem – (with CLT)

• $E(M_n) = f$ & $V(M_n) = \dfrac{f(1-f)}{n}$

• Now we want $P(|M_n - f| \ge 0.01) \le 0.05$

• $P\left(\left|\dfrac{M_n - f}{\sqrt{f(1-f)/n}}\right| \ge \dfrac{0.01}{\sqrt{f(1-f)/n}}\right) \le 0.05$

• By the CLT, the standardized $M_n$ is approximately standard normal, so:

$P\left(|Z| \ge \dfrac{0.01}{\sqrt{f(1-f)/n}}\right) \le 0.05$
• f ∈ (0, 1) & $f(1-f) \le \tfrac{1}{4}$ ⟹ $\sqrt{f(1-f)} \le \tfrac{1}{2}$

• $\dfrac{0.01}{\sqrt{f(1-f)/n}} \;\ge\; \dfrac{0.01\sqrt{n}}{1/2} \;=\; 0.02\sqrt{n}$

• $P\left(|Z| \ge \dfrac{0.01}{\sqrt{f(1-f)/n}}\right) \;\le\; P(|Z| \ge 0.02\sqrt{n})$
• If n = 10000, $P(|Z| \ge 0.02\sqrt{n}) = P(|Z| \ge 2) < 0.05$

• In fact, $P(|Z| \ge 0.02\sqrt{n}) = 0.05$ implies $0.02\sqrt{n} = 1.96$, i.e. n = 9604.

• And 9604 << 50000 !!!!!
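A sketch of the same computation using a normal quantile function (scipy's norm.ppf; the 1% error and 5% probability targets are from the slides):

```python
import math
from scipy.stats import norm

# CLT-based sample size: P(|Z| >= 0.02*sqrt(n)) = 0.05 means 0.02*sqrt(n)
# equals the upper 2.5% point of N(0, 1), which is about 1.96.
z = norm.ppf(1 - 0.05 / 2)
n_clt = math.ceil((z / 0.02) ** 2)
print(f"z = {z:.4f}, n = {n_clt}")  # z ~ 1.96, n = 9604
```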


ESTIMATION

Estimation – categories

• Point Estimation

• Interval Estimation
Point Estimation

• The Statistic

• The desirable properties of an estimator:

i. Unbiasedness
ii. Consistency
Interval Estimation

• Interval Estimation: an inferential statistical procedure used to estimate population parameters from sample data through the building of confidence intervals

• Confidence Intervals: a range of values computed from sample data that has a known probability of capturing some population parameter of interest
The Problem Statement

• Find the average height of Indian males.

• But India has 72 crore males! It's impossible to measure everyone.

• Remedy: collect a sample, say S = {$X_1, X_2, X_3, \dots, X_n$}

• Each sample observation is a realisation of a random variable. Why?

• Since the sample is drawn at random from the population, the $X_i$'s are i.i.d.


The Assumption

• Let us assume that the heights of Indian males are normally distributed with mean $\mu$ and variance $\sigma^2$.

• Our aim is to find (estimate) $\mu$ and $\sigma^2$.

• These are called "population parameters".


Sample Statistics

• Sample mean: $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$

• Sample variance: $s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$

• I would want to use the sample statistics to "estimate" the population parameters.

• Let's see how.


Point Estimation

• Just take the sample mean ($\bar{X}$) and sample variance ($s^2$) as proxies for the population mean ($\mu$) and population variance ($\sigma^2$), respectively.

• $\bar{X}$ and $s^2$ are called point estimators of $\mu$ and $\sigma^2$ respectively.

• But are they "good" proxies?


The "Good" Properties

The good / desirable properties of a point estimator are:

i. Unbiasedness:
• $E(\bar{X}) = \mu$
• $E(s^2) = \sigma^2$

ii. Consistency:
• $\lim_{n \to \infty} V(\bar{X}) = 0$
• $\lim_{n \to \infty} V(s^2) = 0$
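Unbiasedness is easy to demonstrate by simulation. A minimal sketch, assuming heights follow N(170, 7²) (hypothetical values), averaging the two estimators over many samples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed population: heights ~ N(mu=170, sigma=7); these values are
# hypothetical. Averaged over many samples, X_bar centres on mu and
# s^2 (with the 1/(n-1) divisor, ddof=1) centres on sigma^2.
mu, sigma, n, reps = 170.0, 7.0, 30, 100_000
samples = rng.normal(mu, sigma, size=(reps, n))

x_bar = samples.mean(axis=1)
s2 = samples.var(axis=1, ddof=1)
print(f"mean of X_bar: {x_bar.mean():.3f}  (mu      = {mu})")
print(f"mean of s^2 : {s2.mean():.3f}  (sigma^2 = {sigma**2})")
```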
Interval estimation of mean

• Sample observations S = {$X_1, X_2, X_3, \dots, X_n$}

• $X_i \sim N(\mu, \sigma^2)$

• $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$

• $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$

• $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
The Standard Normal Distribution
The Interval

• $P(|Z| < Z_{\alpha/2}) = 1 - \alpha$

• $P(-Z_{\alpha/2} < Z < Z_{\alpha/2}) = 1 - \alpha$

• $P\left(-Z_{\alpha/2} < \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} < Z_{\alpha/2}\right) = 1 - \alpha$

• $P\left(\mu \in \left(\bar{X} - \dfrac{\sigma}{\sqrt{n}} Z_{\alpha/2},\; \bar{X} + \dfrac{\sigma}{\sqrt{n}} Z_{\alpha/2}\right)\right) = 1 - \alpha$

• This is the interval within which $\mu$ lies with probability $1 - \alpha$.

• We can compute $\bar{X}$ from the sample. n is the sample size. $Z_{\alpha/2}$ can be computed from the standard normal table. So if we know $\sigma$, we are done !!
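Putting the formula to work: a minimal sketch, assuming a hypothetical sample of n = 50 heights from N(170, 7²) with σ = 7 treated as known:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Assumed data: n = 50 heights drawn from N(170, 7^2); sigma = 7 is treated
# as known. The interval is X_bar +/- Z_{alpha/2} * sigma / sqrt(n).
sigma, n, alpha = 7.0, 50, 0.05
sample = rng.normal(170.0, sigma, size=n)

x_bar = sample.mean()
z = norm.ppf(1 - alpha / 2)             # Z_{alpha/2} ~ 1.96 for alpha = 0.05
half = z * sigma / np.sqrt(n)
print(f"95% CI for mu: ({x_bar - half:.2f}, {x_bar + half:.2f})")
```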
We may not know σ !!
The t-statistic

• $\dfrac{(n-1)s^2}{\sigma^2} \sim \chi^2_{(n-1)}$

• Define a statistic:

$t \;=\; \dfrac{\bar{X} - \mu}{s/\sqrt{n}} \;=\; \dfrac{\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}}{s/\sigma} \;=\; \dfrac{Z}{\sqrt{\dfrac{(n-1)s^2}{\sigma^2} \cdot \dfrac{1}{n-1}}} \;=\; \dfrac{Z}{\sqrt{\chi^2_{(n-1)}/(n-1)}} \;\sim\; t_{n-1}$

i.e. the Student's t-distribution with n−1 degrees of freedom.


Student's t-distribution
The Interval

• $P(|t| < t^{(\alpha/2)}_{n-1}) = 1 - \alpha$

• $P(-t^{(\alpha/2)}_{n-1} < t < t^{(\alpha/2)}_{n-1}) = 1 - \alpha$

• $P\left(-t^{(\alpha/2)}_{n-1} < \dfrac{\bar{X} - \mu}{s/\sqrt{n}} < t^{(\alpha/2)}_{n-1}\right) = 1 - \alpha$

• $P\left(\mu \in \left(\bar{X} - \dfrac{s}{\sqrt{n}} t^{(\alpha/2)}_{n-1},\; \bar{X} + \dfrac{s}{\sqrt{n}} t^{(\alpha/2)}_{n-1}\right)\right) = 1 - \alpha$

• This is the interval within which $\mu$ lies with probability $1 - \alpha$.

• We can compute $\bar{X}$ from the sample. n is the sample size. $t^{(\alpha/2)}_{n-1}$ can be computed from the t-distribution table.
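The same hypothetical height sample, now with σ unknown (a sketch; only s and the t quantile change relative to the z-interval):

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(0)

# Assumed data: the same hypothetical N(170, 7^2) heights, but sigma is now
# treated as unknown, so s replaces it and the critical value comes from
# the t distribution with n-1 degrees of freedom.
n, alpha = 50, 0.05
sample = rng.normal(170.0, 7.0, size=n)

x_bar = sample.mean()
s = sample.std(ddof=1)
t_crit = t.ppf(1 - alpha / 2, df=n - 1)
half = t_crit * s / np.sqrt(n)
print(f"95% CI for mu: ({x_bar - half:.2f}, {x_bar + half:.2f})")
```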
Interval estimation of Variance

• $\dfrac{(n-1)s^2}{\sigma^2} \sim \chi^2_{(n-1)}$

• $P\left(\chi^2_{n-1,\,(1-\alpha/2)} < \dfrac{(n-1)s^2}{\sigma^2} < \chi^2_{n-1,\,(\alpha/2)}\right) = 1 - \alpha$

• $P\left(\sigma^2 \in \left(\dfrac{(n-1)s^2}{\chi^2_{n-1,\,(\alpha/2)}},\; \dfrac{(n-1)s^2}{\chi^2_{n-1,\,(1-\alpha/2)}}\right)\right) = 1 - \alpha$

• Given a sample we can compute $s^2$, and we know n (the sample size). We can compute $\chi^2_{n-1,\,(\alpha/2)}$ & $\chi^2_{n-1,\,(1-\alpha/2)}$ from the chi-square distribution table.

• Thus we have found the interval such that $\sigma^2$ lies in that interval with probability $1 - \alpha$.
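Completing the running example: a sketch of the variance interval on the same hypothetical sample (note the slides' χ² superscripts denote upper-tail points, so the α/2 point is scipy's ppf(1 − α/2)):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)

# Assumed data: the same hypothetical N(170, 7^2) heights. The upper-tail
# chi-square point chi^2_{n-1,(a)} corresponds to chi2.ppf(1 - a, n - 1).
n, alpha = 50, 0.05
sample = rng.normal(170.0, 7.0, size=n)

s2 = sample.var(ddof=1)
lower = (n - 1) * s2 / chi2.ppf(1 - alpha / 2, df=n - 1)
upper = (n - 1) * s2 / chi2.ppf(alpha / 2, df=n - 1)
print(f"95% CI for sigma^2: ({lower:.2f}, {upper:.2f})")
```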
