
MTL390: Statistical Methods

Lecture 1
Discrete Distributions

Bernoulli distribution: $\mathbb{P}(X = 0) = 1 - p$, $\mathbb{P}(X = 1) = p$, where $q = 1 - p$

We have $\mathbb{E}(X) = p$, $M(t) = \mathbb{E}(e^{tX}) = pe^{t} + q$, $\mathrm{var}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2 = p - p^2 = pq$

Finally, we have $\phi(t) = \mathbb{E}(e^{itX}) = pe^{it} + q$


Fisher information: $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial p}\right)^{2}\right]$

We have $f(x) = p^{x}(1-p)^{1-x} \Rightarrow \left[\dfrac{\partial \log f(x)}{\partial p}\right]^{2} = \dfrac{1}{q^{2}}\left(\dfrac{x^{2}}{p^{2}} + 1 - \dfrac{2x}{p}\right)$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial p}\right)^{2}\right] = \dfrac{1}{q^{2}}\left(\dfrac{p}{p^{2}} + 1 - \dfrac{2p}{p}\right) = \dfrac{1}{q^{2}}\left(\dfrac{1}{p} - 1\right) = \dfrac{1}{pq}$
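As a sanity check, the Fisher information can be estimated by averaging the squared score over simulated data. A minimal sketch, assuming numpy is available ($p = 0.3$ and the sample size are arbitrary illustrative choices):

```python
# Sketch: Monte Carlo estimate of the Bernoulli Fisher information.
import numpy as np

rng = np.random.default_rng(0)
p = 0.3
q = 1 - p
x = rng.binomial(1, p, size=1_000_000)   # Bernoulli(p) draws

# Score function: d/dp log f(x) = x/p - (1-x)/q
score = x / p - (1 - x) / q
print(np.mean(score**2))   # empirical E[score^2], ~4.76
print(1 / (p * q))         # theoretical 1/(pq) = 4.7619...
```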

Binomial Distribution
The probability mass function of the Binomial distribution is given by $f(x) = \binom{n}{x} p^{x}(1-p)^{n-x}$, $x = 0, 1, \dots, n$

A Binomial random variable can be viewed as a sum of $n$ independent Bernoulli($p$) random variables $X_1, \dots, X_n$

Hence, we have $\mathbb{E}(X) = p + p + \cdots$ ($n$ times) $= np$, $\mathrm{var}(X) = n\,\mathrm{var}(X_1) = npq$, $M(t) = (q + pe^{t})^{n}$ and $\phi(t) = (q + pe^{it})^{n}$
Now, we have $\left[\dfrac{\partial \log f(x)}{\partial p}\right]^{2} = \dfrac{1}{q^{2}}\left(\dfrac{x^{2}}{p^{2}} + n^{2} - \dfrac{2nx}{p}\right)$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial p}\right)^{2}\right] = \dfrac{1}{q^{2}}\left(\dfrac{npq + n^{2}p^{2}}{p^{2}} + n^{2} - \dfrac{2n \cdot np}{p}\right) = \dfrac{1}{q^{2}}\left(\dfrac{n(q + np)}{p} + n^{2} - 2n^{2}\right) = \dfrac{n}{pq^{2}}(q + np - np) = \dfrac{n}{pq}$

Result: If X and Y are independent and X~B(n,p) and Y~B(m,p), then X+Y~B(m+n,p)

Proof:

Method 1: Using MGF

We have, for the RV $X + Y$,

$M(t) = \mathbb{E}(e^{t(X+Y)}) = \mathbb{E}(e^{tX})\,\mathbb{E}(e^{tY}) = (q + pe^{t})^{n}(q + pe^{t})^{m} = (q + pe^{t})^{m+n}$

Hence, we have 𝑋 + 𝑌~𝐵(𝑚 + 𝑛, 𝑝)

Method 2:

We have $\mathbb{P}(X + Y = k) = \sum_{i=0}^{k} \mathbb{P}(X = i)\,\mathbb{P}(Y = k - i) = \sum_{i=0}^{k} \binom{n}{i} p^{i} q^{n-i} \binom{m}{k-i} p^{k-i} q^{m-k+i}$

$= \sum_{i=0}^{k} \binom{n}{i}\binom{m}{k-i}\, p^{k} q^{m+n-k}$


Combinatorial argument: the number of ways of selecting $k$ objects from a pile containing $m + n$ objects is the same as first dividing the pile into two groups of $n$ and $m$ objects, choosing $i = 0, 1, 2, \dots, k$ objects from the first group, and choosing the remaining $k - i$ objects from the other group.

Hence, we have $\sum_{i=0}^{k} \binom{n}{i}\binom{m}{k-i} = \binom{m+n}{k}$ (Vandermonde's identity)

Hence, we have $\mathbb{P}(X + Y = k) = \binom{m+n}{k} p^{k} q^{m+n-k}$

Hence, we have 𝑋 + 𝑌~𝐵(𝑚 + 𝑛, 𝑝)
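The additivity result can also be checked numerically: convolving the two pmfs must reproduce the Binomial$(m+n, p)$ pmf. A sketch assuming scipy is available ($n$, $m$, $p$ arbitrary):

```python
# Sketch: the pmf of X + Y computed by direct convolution should coincide
# with the Binomial(m+n, p) pmf.
import numpy as np
from scipy.stats import binom

n, m, p = 5, 7, 0.4
pmf_x = binom.pmf(np.arange(n + 1), n, p)
pmf_y = binom.pmf(np.arange(m + 1), m, p)
pmf_sum = np.convolve(pmf_x, pmf_y)   # pmf of X + Y on 0..m+n
print(np.allclose(pmf_sum, binom.pmf(np.arange(n + m + 1), n + m, p)))  # True
```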

Hypergeometric distribution:
It models the probability of m successes in M draws without replacement from a finite population
of size N in which n objects are associated with success

Denoted as 𝐻𝐺(𝑁, 𝑛, 𝑀, 𝑚)
We have $\mathbb{P}(X = m) = \dfrac{\binom{n}{m}\binom{N-n}{M-m}}{\binom{N}{M}}$

Hence, we have $\mathbb{E}(X) = M\,\dfrac{n}{N}$ and $\mathrm{var}(X) = n\,\dfrac{M}{N}\,\dfrac{N-M}{N}\,\dfrac{N-n}{N-1}$
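These moment formulas can be cross-checked against a library implementation. A sketch assuming scipy is installed; note scipy's parameter order is hypergeom(population, successes, draws), i.e. the notes' $(N, n, M)$, and the numbers below are arbitrary:

```python
# Sketch: checking the hypergeometric mean/variance formulas against scipy.
from scipy.stats import hypergeom

N, n, M = 50, 20, 10
rv = hypergeom(N, n, M)
print(rv.mean(), M * n / N)                                          # both 4.0
print(rv.var(), n * (M / N) * ((N - M) / N) * ((N - n) / (N - 1)))   # both ~1.959
```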

Poisson Distribution ($\mathrm{Poi}(\lambda)$)
We have $f(x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}$, $x = 0, 1, 2, \dots$

Hence, we have $\mathbb{E}(X) = \lambda$, $\mathrm{var}(X) = \lambda$, $M(t) = \exp(\lambda(e^{t} - 1))$, $\phi(t) = \exp(\lambda(e^{it} - 1))$

Fisher information: $\log f(x) = -\lambda + x\log\lambda - \log(x!) \Rightarrow \dfrac{\partial \log f(x)}{\partial \lambda} = -1 + \dfrac{x}{\lambda}$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial \lambda}\right)^{2}\right] = \mathbb{E}\left[1 + \dfrac{X^{2}}{\lambda^{2}} - \dfrac{2X}{\lambda}\right] = 1 + \dfrac{\lambda^{2} + \lambda}{\lambda^{2}} - \dfrac{2\lambda}{\lambda} = \dfrac{1}{\lambda}$

So, for the Poisson distribution, the Fisher information is $1/\lambda$

Theorem: If $X \sim \mathrm{Poi}(\lambda)$ and $Y \sim \mathrm{Poi}(\gamma)$ are independent, then $X + Y \sim \mathrm{Poi}(\lambda + \gamma)$

Proof:

We have $M_{X+Y}(t) = M_X(t)\,M_Y(t) = \exp(\lambda(e^{t} - 1)) \times \exp(\gamma(e^{t} - 1)) = \exp((\lambda + \gamma)(e^{t} - 1))$

Hence, we have 𝑋 + 𝑌~𝑃𝑜𝑖(𝜆 + 𝛾)
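A quick Monte Carlo check of the theorem, assuming numpy and scipy are available (the rates 2 and 3 are arbitrary):

```python
# Sketch: the empirical pmf of Poi(2) + Poi(3) samples should be close to
# the Poi(5) pmf.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
n = 500_000
s = rng.poisson(2.0, n) + rng.poisson(3.0, n)    # X + Y with X~Poi(2), Y~Poi(3)
k = np.arange(15)
emp = np.bincount(s, minlength=15)[:15] / n      # empirical pmf of X + Y
print(np.abs(emp - poisson.pmf(k, 5.0)).max())   # small (order 1e-3)
```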

Geometric Distribution:
We have $f(x) = pq^{x-1}$, $x = 1, 2, \dots$

Hence, we have $\mathbb{E}(X) = \dfrac{1}{p}$, $\mathrm{var}(X) = \dfrac{q}{p^{2}}$

The geometric distribution possesses the memoryless property, i.e. $\mathbb{P}(X > t + s \mid X > t) = \mathbb{P}(X > s)$

Result: If a discrete distribution on $\{1, 2, \dots\}$ has the memoryless property, then it must be geometric
Proof:

We have $\mathbb{P}(X > 2 \mid X > 1) = \mathbb{P}(X > 1) \Rightarrow \mathbb{P}(X > 2) = \mathbb{P}(X > 1)^{2}$

Let $\mathbb{P}(X > 1) = q$

Hence, we have $\mathbb{P}(X > 2) = q^{2}$, and $\mathbb{P}(X > 3) = \mathbb{P}(X > 3 \mid X > 2)\,\mathbb{P}(X > 2) = q \cdot q^{2} = q^{3}$

Hence, in general, we have $\mathbb{P}(X > n) = q^{n}$

Hence, we have $\mathbb{P}(X = n) = \mathbb{P}(X > n - 1) - \mathbb{P}(X > n) = q^{n-1} - q^{n} = q^{n-1}(1 - q)$

Hence, writing $p = 1 - q$, we have $X \sim \mathrm{Geo}(p)$

For the geometric distribution, we have $\sum_{i=0}^{\infty}(1 - F(i)) = \mathbb{E}(X)$
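The tail-sum identity is easy to verify numerically, since $\mathbb{P}(X > i) = q^{i}$ for $i \ge 0$. A sketch assuming numpy ($p = 0.25$ arbitrary):

```python
# Sketch: for Geo(p) on {1, 2, ...}, the tail sum of P(X > i) = q^i
# should equal E(X) = 1/p.
import numpy as np

p, q = 0.25, 0.75
i = np.arange(200)          # q^i is negligible beyond this range
print(np.sum(q**i), 1 / p)  # both ~4.0
```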

Continuous Distributions
Uniform Distribution ($U(a,b)$)
We have $f(x) = \dfrac{1}{b-a}$, $a \le x \le b$, and

$F(x) = \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & x \ge b \end{cases}$

Hence, we have $\mathbb{E}(X) = \dfrac{a+b}{2}$, $\mathrm{var}(X) = \dfrac{(b-a)^{2}}{12}$, $M(t) = \dfrac{e^{tb} - e^{ta}}{t(b-a)}$, $t \ne 0$
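The mgf formula can be compared against a Monte Carlo estimate of $\mathbb{E}(e^{tX})$. A sketch assuming numpy ($a$, $b$, $t$ arbitrary):

```python
# Sketch: Monte Carlo check of the U(a,b) mgf (e^{tb} - e^{ta}) / (t(b-a)).
import numpy as np

rng = np.random.default_rng(9)
a, b, t = 1.0, 3.0, 0.7
x = rng.uniform(a, b, 1_000_000)
print(np.exp(t * x).mean())                              # Monte Carlo E[e^{tX}]
print((np.exp(t * b) - np.exp(t * a)) / (t * (b - a)))   # closed form, ~4.39
```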

Exponential Distribution ($\mathrm{Exp}(\lambda)$)
We have $f(x) = \lambda e^{-\lambda x}$, $x > 0$, and $F(x) = 1 - e^{-\lambda x}$, $x > 0$

Hence, we have $\mathbb{E}(X) = \dfrac{1}{\lambda}$, $\mathrm{var}(X) = \dfrac{1}{\lambda^{2}}$, $M(t) = \dfrac{\lambda}{\lambda - t}$, $t < \lambda$

Fisher information: We have $\log f(x) = \log\lambda - \lambda x$

Hence, we have $\dfrac{\partial \log f(x)}{\partial \lambda} = \dfrac{1}{\lambda} - x \Rightarrow \left(\dfrac{\partial \log f(x)}{\partial \lambda}\right)^{2} = \dfrac{1}{\lambda^{2}} + x^{2} - \dfrac{2x}{\lambda}$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial \lambda}\right)^{2}\right] = \dfrac{1}{\lambda^{2}} + \left(\dfrac{1}{\lambda^{2}} + \dfrac{1}{\lambda^{2}}\right) - \dfrac{2}{\lambda^{2}} = \dfrac{1}{\lambda^{2}}$

The exponential distribution has the memoryless property

Proof:

We have $\mathbb{P}(X > t + s \mid X > t) = \dfrac{\mathbb{P}(X > t + s)}{\mathbb{P}(X > t)} = \dfrac{e^{-\lambda(t+s)}}{e^{-\lambda t}} = e^{-\lambda s} = \mathbb{P}(X > s)\ \blacksquare$
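The memoryless property can also be observed empirically by estimating the conditional probability from samples. A sketch assuming numpy ($\lambda$, $t$, $s$ arbitrary):

```python
# Sketch: estimate P(X > t+s | X > t) from exponential samples and compare
# with P(X > s) = exp(-lambda*s).
import numpy as np

rng = np.random.default_rng(2)
lam, t, s = 1.5, 0.8, 0.5
x = rng.exponential(1 / lam, 1_000_000)   # note: numpy takes scale = 1/lambda
cond = np.mean(x[x > t] > t + s)          # empirical P(X > t+s | X > t)
print(cond, np.exp(-lam * s))             # both ~0.472
```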

Gamma Distribution ($\Gamma(\lambda, \alpha)$)

The pdf is $f(x) = \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} e^{-\lambda x} x^{\alpha-1}$; $\alpha > 0$, $\lambda > 0$, $x > 0$

Hence, we have $\mathbb{E}(X) = \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_{0}^{\infty} e^{-\lambda x} x^{\alpha+1-1}\,dx = \dfrac{\lambda^{\alpha}}{\lambda^{\alpha+1}\,\Gamma(\alpha)}\,\Gamma(\alpha+1) = \dfrac{\alpha}{\lambda}$

Similarly, we have $\mathbb{E}(X^{2}) = \dfrac{\alpha(\alpha+1)}{\lambda^{2}}$

Hence, we have $\mathrm{var}(X) = \dfrac{\alpha}{\lambda^{2}}$

Finally, we have $M(t) = \mathbb{E}(e^{tX}) = \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_{0}^{\infty} e^{-\lambda x} x^{\alpha-1} e^{tx}\,dx = \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} \times \dfrac{\Gamma(\alpha)}{(\lambda - t)^{\alpha}} = \left(1 - \dfrac{t}{\lambda}\right)^{-\alpha}$, $t < \lambda$

Fisher information: $\log f(x) = \alpha\log\lambda - \log\Gamma(\alpha) - \lambda x + (\alpha - 1)\log x$

Hence, we have $\dfrac{\partial \log f(x)}{\partial \lambda} = \dfrac{\alpha}{\lambda} - x \Rightarrow \left(\dfrac{\partial \log f(x)}{\partial \lambda}\right)^{2} = \dfrac{\alpha^{2}}{\lambda^{2}} + x^{2} - \dfrac{2\alpha x}{\lambda}$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial \lambda}\right)^{2}\right] = \dfrac{\alpha^{2}}{\lambda^{2}} + \dfrac{\alpha(\alpha+1)}{\lambda^{2}} - \dfrac{2\alpha^{2}}{\lambda^{2}} = \dfrac{\alpha}{\lambda^{2}}$
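Since the score is $\alpha/\lambda - x$, its mean squared value should approach $\alpha/\lambda^{2}$; a Monte Carlo sketch assuming numpy ($\lambda$, $\alpha$ arbitrary):

```python
# Sketch: Monte Carlo estimate of the Gamma Fisher information in lambda.
import numpy as np

rng = np.random.default_rng(3)
lam, alpha = 2.0, 3.0
x = rng.gamma(shape=alpha, scale=1 / lam, size=1_000_000)
print(np.mean((alpha / lam - x) ** 2))   # ~0.75
print(alpha / lam**2)                    # 0.75
```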

If 𝑋~Γ(𝜆, 𝛼), 𝑌~Γ(𝜆, 𝛽), X, Y are independent, then 𝑋 + 𝑌~Γ(𝜆, 𝛼 + 𝛽)

Proof:

Let 𝑍 = 𝑋 + 𝑌
We have $f_{Z}(z) = \int_{0}^{z} \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} e^{-\lambda x} x^{\alpha-1}\, \dfrac{\lambda^{\beta}}{\Gamma(\beta)} e^{-\lambda(z-x)} (z-x)^{\beta-1}\,dx = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\, e^{-\lambda z} \int_{0}^{z} x^{\alpha-1}(z-x)^{\beta-1}\,dx$

Hence, we have $f_{Z}(z) = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\, e^{-\lambda z} z^{\alpha+\beta-2} \int_{0}^{z} \left(\dfrac{x}{z}\right)^{\alpha-1}\left(1 - \dfrac{x}{z}\right)^{\beta-1}\,dx$

Putting $\dfrac{x}{z} = y$, we get $f_{Z}(z) = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\, e^{-\lambda z} z^{\alpha+\beta-1} \int_{0}^{1} y^{\alpha-1}(1-y)^{\beta-1}\,dy = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\, e^{-\lambda z} z^{\alpha+\beta-1}\, B(\alpha, \beta)$

We have $B(\alpha, \beta) = \dfrac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$

Hence, we have $f_{Z}(z) = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha+\beta)}\, e^{-\lambda z} z^{\alpha+\beta-1}$

Hence, we have 𝑍~Γ(𝜆, 𝛼 + 𝛽)
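The convolution result can be checked with a Kolmogorov-Smirnov test against the claimed Gamma$(\lambda, \alpha+\beta)$ law. A sketch assuming scipy (parameters arbitrary):

```python
# Sketch: KS test of simulated X + Y against Gamma(lambda, alpha + beta).
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(4)
lam, alpha, beta = 2.0, 1.5, 2.5
z = rng.gamma(alpha, 1 / lam, 100_000) + rng.gamma(beta, 1 / lam, 100_000)
# args = (shape, loc, scale); a small statistic and a p-value that is not
# small are consistent with the theorem
print(kstest(z, 'gamma', args=(alpha + beta, 0, 1 / lam)))
```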

Normal Distribution ($N(\mu, \sigma^{2})$)

We have $f(x) = \dfrac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\dfrac{(x-\mu)^{2}}{2\sigma^{2}}\right)$, $M(t) = \exp\left(\mu t + \dfrac{\sigma^{2}t^{2}}{2}\right)$

If $X \sim N(\mu, \sigma^{2})$, then $aX + b \sim N(a\mu + b, a^{2}\sigma^{2})$

If $X \sim N(\mu, \sigma^{2})$ and $Y \sim N(\alpha, \gamma^{2})$ are independent, then $X + Y \sim N(\mu + \alpha, \sigma^{2} + \gamma^{2})$
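Both closure properties can be verified from sample moments. A sketch assuming numpy (the particular parameters are illustrative):

```python
# Sketch: sample moments of 3X + 4 and X + Y should match the stated normal laws.
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(1.0, 2.0, 1_000_000)    # X ~ N(mu=1, sigma^2=4)
y = rng.normal(-0.5, 1.5, 1_000_000)   # Y ~ N(alpha=-0.5, gamma^2=2.25)
print((3 * x + 4).mean(), (3 * x + 4).var())   # ~7 and ~36
print((x + y).mean(), (x + y).var())           # ~0.5 and ~6.25 = 4 + 2.25
```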

Tutorial 1
Q1: Solution: $Y = -\lambda \log(1 - X)$, where $X \sim U(0,1)$

Since $0 \le X \le 1$, we have $0 < Y < \infty$

Hence, we have $\mathbb{P}(Y \le y) = \mathbb{P}(-\lambda\log(1 - X) \le y) = \mathbb{P}\left(1 - X \ge \exp\left(-\dfrac{y}{\lambda}\right)\right) = \mathbb{P}\left(X \le 1 - \exp\left(-\dfrac{y}{\lambda}\right)\right) = 1 - \exp\left(-\dfrac{y}{\lambda}\right)$

Hence, we have $F(y) = 1 - \exp\left(-\dfrac{y}{\lambda}\right) \Rightarrow f(y) = \dfrac{1}{\lambda}\exp\left(-\dfrac{y}{\lambda}\right)$

Hence, we have $Y \sim \mathrm{Exp}\left(\dfrac{1}{\lambda}\right)$, i.e. exponential with rate $1/\lambda$ and mean $\lambda$
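This is the inverse-transform method in disguise; a simulation sketch assuming numpy ($\lambda = 2$ arbitrary) confirms the mean $\lambda$ and variance $\lambda^{2}$:

```python
# Sketch: Y = -lambda*log(1 - X) applied to X ~ U(0,1) should produce
# samples with mean lambda and variance lambda^2.
import numpy as np

rng = np.random.default_rng(6)
lam = 2.0
y = -lam * np.log(1 - rng.uniform(size=1_000_000))
print(y.mean(), y.var())   # ~2.0 and ~4.0
```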
Q2: Solution: We have $Y = F_X(X)$, where $F_X$ is continuous and strictly increasing

Hence, we have $F_Y(y) = \mathbb{P}(Y \le y) = \mathbb{P}(F_X(X) \le y) = \mathbb{P}\left(X \le F_X^{-1}(y)\right) = F_X\left(F_X^{-1}(y)\right) = y$, $0 \le y \le 1$

Hence, we have 𝑌~𝑈(0,1)
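This is the probability integral transform; a sketch assuming scipy, using a standard normal as an example of a continuous $X$:

```python
# Sketch: F_X(X) for continuous X should be Uniform(0,1).
import numpy as np
from scipy.stats import norm, kstest

rng = np.random.default_rng(7)
u = norm.cdf(rng.normal(size=100_000))   # F_X(X) with X ~ N(0,1)
print(kstest(u, 'uniform'))              # p-value not small: consistent with U(0,1)
```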

Q3: Solution: We have $Y_k = X_1 + X_2 + \cdots + X_k$

Here $\mathbb{E}(X_i) = p - q$

Hence, we have $\mathbb{E}(Y_k) = (p - q)k$

Q4: Solution: Let $N_t$ denote the number of coolers still working at time $t$. We have $\mathbb{P}(N_t = k) = \mathbb{P}(\text{exactly } n - k \text{ coolers have failed by time } t) = \binom{n}{n-k}\,\mathbb{P}(X \le t)^{n-k}\,\mathbb{P}(X > t)^{k} = \binom{n}{k}\left(1 - e^{-\lambda t}\right)^{n-k}\left(e^{-\lambda t}\right)^{k}$

Hence, we have $N_t \sim \mathrm{Bin}(n, e^{-\lambda t})$
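Under this reading ($N_t$ counts the coolers still working), a simulation sketch assuming numpy ($n$, $\lambda$, $t$ arbitrary) agrees with the Bin$(n, e^{-\lambda t})$ mean:

```python
# Sketch: with n iid Exp(lambda) lifetimes, the count still working at time t
# has mean n*exp(-lambda*t), matching Bin(n, exp(-lambda*t)).
import numpy as np

rng = np.random.default_rng(8)
n, lam, t = 10, 0.5, 1.0
lifetimes = rng.exponential(1 / lam, size=(200_000, n))
n_working = (lifetimes > t).sum(axis=1)
print(n_working.mean(), n * np.exp(-lam * t))   # both ~6.07
```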

Q5: Done

Q6: Done

Q7: Solution: For $0 < s \le r$ we have $|x|^{s} \le 1 + |x|^{r}$, so $\mathbb{E}(X^{s}) \le \mathbb{E}(|X|^{s}) \le \mathbb{E}(1 + |X|^{r}) < \infty$ (B)

Q8: Solution: Max number of viruses in 1st gen = 1

Max number of viruses in 2nd gen = 2

Max number of viruses in 3rd gen = 4

To infect at least 7 humans, at least 4 viruses are needed

Hence, we have $P = (0.4) \times (0.4)^{2} \times \left[0.8^{3} \times 0.2 \times 4 + 0.8^{4}\right]$

Q9: Easy
Q10: Solution: We have $f(x) = \dfrac{\lambda^{\alpha} x^{\alpha-1}}{\Gamma(\alpha)}\, e^{-\lambda x}$

For maxima/minima we have $\dfrac{df(x)}{dx} = 0 \Rightarrow -\lambda x^{\alpha-1} e^{-\lambda x} + (\alpha - 1)x^{\alpha-2} e^{-\lambda x} = 0 \Rightarrow x = 0 \text{ or } \lambda x = \alpha - 1 \Rightarrow x = \dfrac{\alpha-1}{\lambda}$

By the first derivative test, $x = \dfrac{\alpha-1}{\lambda}$ is the mode (for $\alpha > 1$).
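A numeric cross-check of the mode, assuming scipy ($\lambda$, $\alpha$ arbitrary): maximizing the pdf on a fine grid recovers $(\alpha-1)/\lambda$.

```python
# Sketch: grid-maximize the Gamma pdf and compare with (alpha - 1)/lambda.
import numpy as np
from scipy.stats import gamma

lam, alpha = 2.0, 3.5
xs = np.linspace(1e-6, 10, 200_001)
print(xs[np.argmax(gamma.pdf(xs, alpha, scale=1 / lam))])  # ~1.25
print((alpha - 1) / lam)                                   # 1.25
```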

Q11: Solution: The first point can be chosen anywhere on the circle (with probability 1)

The second point must then lie on the arc subtended by the inscribed equilateral triangle with a vertex at the first point

Hence, $p = \dfrac{1}{3}$
MTL390: Statistical Methods

Lecture 00

Marking Scheme and Quiz Details

Niladri Chatterjee
Dept. of Mathematics
Introductory Note

Marking Scheme:

Quiz: 40 marks, Minor: 20 marks, Major: 40 marks

Quiz:

• 6 quizzes, 10 marks each. The best four will be considered.


• Quizzes will be held during class hours.
• Solutions will be discussed/shared after the quiz.
Topics

Q1 Random variables and known distributions with functions and properties (Monday, 10-01-2022)

Q2 Functions of two random variables, sampling (Monday, 24-01-2022)

Q3 Sampling and Order Statistics (Monday, 07-02-2022)

MINOR + Semester Break

Q4 Estimation of parameters, CR inequality (Thursday, 03-03-2022)

Q5 Interval Estimation and Testing of Hypothesis (Wednesday, 16-03-2022)

Q6 Non-parametric tests (Thursday, 31-03-2022)
Course Plan

1. Quick Revision of Basic Probability Distributions with properties


Discrete: Bernoulli, Uniform, Binomial, Geometric, Poisson, Hypergeometric,
Negative Binomial
Continuous: Uniform, Exponential, Normal, Cauchy, Chi-sq, Gamma, Weibull

Properties: pmf, pdf, cdf, Expectation, Variance, MGF, Characteristic Function,


Fisher’s Information

To be followed by Quiz 1 on 10-01-2022


Course Plan

2. Advanced Distributions: Student’s T, F, Beta1, Beta2


To be viewed as functions of two random variables

Introduction to Sampling

To be followed by Quiz 2 on 24-01-2022


Course Plan

3. Sampling Distribution (contd) and Order Statistics


Apart from the mean and variance there are other important
parameters: median, range, percentiles

Order statistics is about their distribution

To be followed by Quiz 3 on 07-02-2022


Course Plan

4. Estimation of Parameters and Cramer-Rao Inequality

Parameter estimation is the most important task of statistical inference


Samples give only estimates of population parameters
What should be the properties of good estimators?
The Cramer-Rao inequality provides a lower bound on the variance of an unbiased estimator

To be followed by Quiz 4 on 03-03-2022


Course Plan

5. Interval Estimation and Testing of Hypothesis

Estimation can be of two types:

a) Point Estimate
b) Interval Estimate: here we try to find a confidence interval
for the predicted value of the parameters.

In ToH we check if the sample gives enough evidence in support
of a hypothesis that we make for the distribution parameters.

To be followed by Quiz 5 on 16-03-2022


Course Plan

6. Non-Parametric tests

Parametric estimation makes sense only if we have a
probabilistic model of the phenomenon. If no such model
exists and/or the values are non-numeric, we may have to
resort to non-parametric estimation

To be followed by Quiz 6 on 31-03-2022


Thank You
