
MTL390: Statistical Methods

Lecture 1
Discrete Distributions

Bernoulli distribution: $\mathbb{P}(X = 0) = 1 - p$, $\mathbb{P}(X = 1) = p$, where $q = 1 - p$

We have $\mathbb{E}(X) = p$, $M(t) = \mathbb{E}(e^{tX}) = pe^{t} + q$, $\mathrm{var}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2 = p - p^2 = pq$

Finally, we have $\phi(t) = \mathbb{E}(e^{itX}) = pe^{it} + q$


Fisher information: $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial p}\right)^{2}\right]$

We have $f(x) = p^{x}(1-p)^{1-x} \Rightarrow \left[\dfrac{\partial \log f(x)}{\partial p}\right]^{2} = \dfrac{1}{q^{2}}\left(\dfrac{x^{2}}{p^{2}} + 1 - \dfrac{2x}{p}\right)$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial p}\right)^{2}\right] = \dfrac{1}{q^{2}}\left(\dfrac{p}{p^{2}} + 1 - \dfrac{2p}{p}\right) = \dfrac{1}{q^{2}}\left(\dfrac{1}{p} - 1\right) = \dfrac{1}{pq}$
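As a sanity check, the Fisher information can be estimated by averaging the squared score over simulated data. A minimal sketch, assuming numpy is available ($p = 0.3$ and the sample size are arbitrary illustrative choices):

```python
# Sketch: Monte Carlo estimate of the Bernoulli Fisher information.
import numpy as np

rng = np.random.default_rng(0)
p = 0.3
q = 1 - p
x = rng.binomial(1, p, size=1_000_000)   # Bernoulli(p) draws

# Score function: d/dp log f(x) = x/p - (1-x)/q
score = x / p - (1 - x) / q
print(np.mean(score**2))   # empirical E[score^2], ~4.76
print(1 / (p * q))         # theoretical 1/(pq) = 4.7619...
```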

Binomial Distribution
The probability mass function of the Binomial distribution is given by $f(x) = \binom{n}{x} p^{x}(1-p)^{n-x}$, $x = 0, 1, \dots, n$

A Binomial random variable can be viewed as a sum of $n$ independent Bernoulli($p$) random variables $X_1, \dots, X_n$

Hence, we have $\mathbb{E}(X) = p + p + \cdots$ ($n$ times) $= np$, $\mathrm{var}(X) = n\,\mathrm{var}(X_1) = npq$, $M(t) = (q + pe^{t})^{n}$ and $\phi(t) = (q + pe^{it})^{n}$
Now, we have $\left[\dfrac{\partial \log f(x)}{\partial p}\right]^{2} = \dfrac{1}{q^{2}}\left(\dfrac{x^{2}}{p^{2}} + n^{2} - \dfrac{2nx}{p}\right)$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial p}\right)^{2}\right] = \dfrac{1}{q^{2}}\left(\dfrac{npq + n^{2}p^{2}}{p^{2}} + n^{2} - \dfrac{2n \cdot np}{p}\right) = \dfrac{1}{q^{2}}\left(\dfrac{n(q + np)}{p} + n^{2} - 2n^{2}\right) = \dfrac{n}{pq^{2}}(q + np - np) = \dfrac{n}{pq}$

Result: If X and Y are independent and X~B(n,p) and Y~B(m,p), then X+Y~B(m+n,p)

Proof:

Method 1: Using MGF

We have, for the RV $X + Y$,

$M(t) = \mathbb{E}(e^{t(X+Y)}) = \mathbb{E}(e^{tX})\,\mathbb{E}(e^{tY}) = (q + pe^{t})^{n}(q + pe^{t})^{m} = (q + pe^{t})^{m+n}$

Hence, we have 𝑋 + 𝑌~𝐵(𝑚 + 𝑛, 𝑝)

Method 2:

We have $\mathbb{P}(X + Y = k) = \sum_{i=0}^{k} \mathbb{P}(X = i)\,\mathbb{P}(Y = k - i) = \sum_{i=0}^{k} \binom{n}{i} p^{i} q^{n-i} \binom{m}{k-i} p^{k-i} q^{m-k+i}$

$= \sum_{i=0}^{k} \binom{n}{i}\binom{m}{k-i}\, p^{k} q^{m+n-k}$


Combinatorial argument: the number of ways of selecting $k$ objects from a pile containing $m + n$ objects is the same as first dividing the pile into two groups of $n$ and $m$ objects, choosing $i = 0, 1, 2, \dots, k$ objects from the first group, and choosing the remaining $k - i$ objects from the other group.

Hence, we have $\sum_{i=0}^{k} \binom{n}{i}\binom{m}{k-i} = \binom{m+n}{k}$ (Vandermonde's identity)

Hence, we have $\mathbb{P}(X + Y = k) = \binom{m+n}{k} p^{k} q^{m+n-k}$

Hence, we have 𝑋 + 𝑌~𝐵(𝑚 + 𝑛, 𝑝)
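The additivity result can also be checked numerically: convolving the two pmfs must reproduce the Binomial$(m+n, p)$ pmf. A sketch assuming scipy is available ($n$, $m$, $p$ arbitrary):

```python
# Sketch: the pmf of X + Y computed by direct convolution should coincide
# with the Binomial(m+n, p) pmf.
import numpy as np
from scipy.stats import binom

n, m, p = 5, 7, 0.4
pmf_x = binom.pmf(np.arange(n + 1), n, p)
pmf_y = binom.pmf(np.arange(m + 1), m, p)
pmf_sum = np.convolve(pmf_x, pmf_y)   # pmf of X + Y on 0..m+n
print(np.allclose(pmf_sum, binom.pmf(np.arange(n + m + 1), n + m, p)))  # True
```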

Hypergeometric distribution:
It models the probability of m successes in M draws without replacement from a finite population
of size N in which n objects are associated with success

Denoted as 𝐻𝐺(𝑁, 𝑛, 𝑀, 𝑚)
We have $\mathbb{P}(X = m) = \dfrac{\binom{n}{m}\binom{N-n}{M-m}}{\binom{N}{M}}$

Hence, we have $\mathbb{E}(X) = M\,\dfrac{n}{N}$ and $\mathrm{var}(X) = n\,\dfrac{M}{N}\,\dfrac{N-M}{N}\,\dfrac{N-n}{N-1}$
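These moment formulas can be cross-checked against a library implementation. A sketch assuming scipy is installed; note scipy's parameter order is hypergeom(population, successes, draws), i.e. the notes' $(N, n, M)$, and the numbers below are arbitrary:

```python
# Sketch: checking the hypergeometric mean/variance formulas against scipy.
from scipy.stats import hypergeom

N, n, M = 50, 20, 10
rv = hypergeom(N, n, M)
print(rv.mean(), M * n / N)                                          # both 4.0
print(rv.var(), n * (M / N) * ((N - M) / N) * ((N - n) / (N - 1)))   # both ~1.959
```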

Poisson Distribution ($\mathrm{Poi}(\lambda)$)
We have $f(x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}$, $x = 0, 1, 2, \dots$

Hence, we have $\mathbb{E}(X) = \lambda$, $\mathrm{var}(X) = \lambda$, $M(t) = \exp(\lambda(e^{t} - 1))$, $\phi(t) = \exp(\lambda(e^{it} - 1))$

Fisher information: $\log f(x) = -\lambda + x\log\lambda - \log(x!) \Rightarrow \dfrac{\partial \log f(x)}{\partial \lambda} = -1 + \dfrac{x}{\lambda}$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial \lambda}\right)^{2}\right] = \mathbb{E}\left[1 + \dfrac{X^{2}}{\lambda^{2}} - \dfrac{2X}{\lambda}\right] = 1 + \dfrac{\lambda^{2} + \lambda}{\lambda^{2}} - \dfrac{2\lambda}{\lambda} = \dfrac{1}{\lambda}$

So, for the Poisson distribution, the Fisher information is $1/\lambda$

Theorem: If $X \sim \mathrm{Poi}(\lambda)$ and $Y \sim \mathrm{Poi}(\gamma)$ are independent, then $X + Y \sim \mathrm{Poi}(\lambda + \gamma)$

Proof:

We have $M_{X+Y}(t) = M_X(t)\,M_Y(t) = \exp(\lambda(e^{t} - 1)) \times \exp(\gamma(e^{t} - 1)) = \exp((\lambda + \gamma)(e^{t} - 1))$

Hence, we have 𝑋 + 𝑌~𝑃𝑜𝑖(𝜆 + 𝛾)
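A quick Monte Carlo check of the theorem, assuming numpy and scipy are available (the rates 2 and 3 are arbitrary):

```python
# Sketch: the empirical pmf of Poi(2) + Poi(3) samples should be close to
# the Poi(5) pmf.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
n = 500_000
s = rng.poisson(2.0, n) + rng.poisson(3.0, n)    # X + Y with X~Poi(2), Y~Poi(3)
k = np.arange(15)
emp = np.bincount(s, minlength=15)[:15] / n      # empirical pmf of X + Y
print(np.abs(emp - poisson.pmf(k, 5.0)).max())   # small (order 1e-3)
```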

Geometric Distribution:
We have $f(x) = pq^{x-1}$, $x = 1, 2, \dots$

Hence, we have $\mathbb{E}(X) = \dfrac{1}{p}$, $\mathrm{var}(X) = \dfrac{q}{p^{2}}$

The geometric distribution possesses the memoryless property, i.e. $\mathbb{P}(X > t + s \mid X > t) = \mathbb{P}(X > s)$

Result: If a discrete distribution on $\{1, 2, \dots\}$ has the memoryless property, then it must be geometric
Proof:

We have $\mathbb{P}(X > 2 \mid X > 1) = \mathbb{P}(X > 1) \Rightarrow \mathbb{P}(X > 2) = \mathbb{P}(X > 1)^{2}$

Let $\mathbb{P}(X > 1) = q$

Hence, we have $\mathbb{P}(X > 2) = q^{2}$, and $\mathbb{P}(X > 3) = \mathbb{P}(X > 3 \mid X > 2)\,\mathbb{P}(X > 2) = q \cdot q^{2} = q^{3}$

Hence, in general, we have $\mathbb{P}(X > n) = q^{n}$

Hence, we have $\mathbb{P}(X = n) = \mathbb{P}(X > n - 1) - \mathbb{P}(X > n) = q^{n-1} - q^{n} = q^{n-1}(1 - q)$

Hence, writing $p = 1 - q$, we have $X \sim \mathrm{Geo}(p)$

For the geometric distribution, we have $\sum_{i=0}^{\infty}(1 - F(i)) = \mathbb{E}(X)$
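The tail-sum identity is easy to verify numerically, since $\mathbb{P}(X > i) = q^{i}$ for $i \ge 0$. A sketch assuming numpy ($p = 0.25$ arbitrary):

```python
# Sketch: for Geo(p) on {1, 2, ...}, the tail sum of P(X > i) = q^i
# should equal E(X) = 1/p.
import numpy as np

p, q = 0.25, 0.75
i = np.arange(200)          # q^i is negligible beyond this range
print(np.sum(q**i), 1 / p)  # both ~4.0
```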

Continuous Distributions
Uniform Distribution ($U(a,b)$)
We have $f(x) = \dfrac{1}{b-a}$, $a \le x \le b$, and

$F(x) = \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & x \ge b \end{cases}$

Hence, we have $\mathbb{E}(X) = \dfrac{a+b}{2}$, $\mathrm{var}(X) = \dfrac{(b-a)^{2}}{12}$, $M(t) = \dfrac{e^{tb} - e^{ta}}{t(b-a)}$, $t \ne 0$
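The mgf formula can be compared against a Monte Carlo estimate of $\mathbb{E}(e^{tX})$. A sketch assuming numpy ($a$, $b$, $t$ arbitrary):

```python
# Sketch: Monte Carlo check of the U(a,b) mgf (e^{tb} - e^{ta}) / (t(b-a)).
import numpy as np

rng = np.random.default_rng(9)
a, b, t = 1.0, 3.0, 0.7
x = rng.uniform(a, b, 1_000_000)
print(np.exp(t * x).mean())                              # Monte Carlo E[e^{tX}]
print((np.exp(t * b) - np.exp(t * a)) / (t * (b - a)))   # closed form, ~4.39
```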

Exponential Distribution ($\mathrm{Exp}(\lambda)$)
We have $f(x) = \lambda e^{-\lambda x}$, $x > 0$, and $F(x) = 1 - e^{-\lambda x}$, $x > 0$

Hence, we have $\mathbb{E}(X) = \dfrac{1}{\lambda}$, $\mathrm{var}(X) = \dfrac{1}{\lambda^{2}}$, $M(t) = \dfrac{\lambda}{\lambda - t}$, $t < \lambda$

Fisher information: We have $\log f(x) = \log\lambda - \lambda x$

Hence, we have $\dfrac{\partial \log f(x)}{\partial \lambda} = \dfrac{1}{\lambda} - x \Rightarrow \left(\dfrac{\partial \log f(x)}{\partial \lambda}\right)^{2} = \dfrac{1}{\lambda^{2}} + x^{2} - \dfrac{2x}{\lambda}$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial \lambda}\right)^{2}\right] = \dfrac{1}{\lambda^{2}} + \left(\dfrac{1}{\lambda^{2}} + \dfrac{1}{\lambda^{2}}\right) - \dfrac{2}{\lambda^{2}} = \dfrac{1}{\lambda^{2}}$

The exponential distribution has the memoryless property

Proof:

We have $\mathbb{P}(X > t + s \mid X > t) = \dfrac{\mathbb{P}(X > t + s)}{\mathbb{P}(X > t)} = \dfrac{e^{-\lambda(t+s)}}{e^{-\lambda t}} = e^{-\lambda s} = \mathbb{P}(X > s)\ \blacksquare$
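The memoryless property can also be observed empirically by estimating the conditional probability from samples. A sketch assuming numpy ($\lambda$, $t$, $s$ arbitrary):

```python
# Sketch: estimate P(X > t+s | X > t) from exponential samples and compare
# with P(X > s) = exp(-lambda*s).
import numpy as np

rng = np.random.default_rng(2)
lam, t, s = 1.5, 0.8, 0.5
x = rng.exponential(1 / lam, 1_000_000)   # note: numpy takes scale = 1/lambda
cond = np.mean(x[x > t] > t + s)          # empirical P(X > t+s | X > t)
print(cond, np.exp(-lam * s))             # both ~0.472
```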

Gamma Distribution ($\Gamma(\lambda, \alpha)$)

The pdf is $f(x) = \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} e^{-\lambda x} x^{\alpha-1}$; $\alpha > 0$, $\lambda > 0$, $x > 0$

Hence, we have $\mathbb{E}(X) = \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_{0}^{\infty} e^{-\lambda x} x^{\alpha+1-1}\,dx = \dfrac{\lambda^{\alpha}}{\lambda^{\alpha+1}\,\Gamma(\alpha)}\,\Gamma(\alpha+1) = \dfrac{\alpha}{\lambda}$

Similarly, we have $\mathbb{E}(X^{2}) = \dfrac{\alpha(\alpha+1)}{\lambda^{2}}$

Hence, we have $\mathrm{var}(X) = \dfrac{\alpha}{\lambda^{2}}$

Finally, we have $M(t) = \mathbb{E}(e^{tX}) = \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_{0}^{\infty} e^{-\lambda x} x^{\alpha-1} e^{tx}\,dx = \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} \times \dfrac{\Gamma(\alpha)}{(\lambda - t)^{\alpha}} = \left(1 - \dfrac{t}{\lambda}\right)^{-\alpha}$, $t < \lambda$

Fisher information: $\log f(x) = \alpha\log\lambda - \log\Gamma(\alpha) - \lambda x + (\alpha - 1)\log x$

Hence, we have $\dfrac{\partial \log f(x)}{\partial \lambda} = \dfrac{\alpha}{\lambda} - x \Rightarrow \left(\dfrac{\partial \log f(x)}{\partial \lambda}\right)^{2} = \dfrac{\alpha^{2}}{\lambda^{2}} + x^{2} - \dfrac{2\alpha x}{\lambda}$

Hence, we have $\mathbb{E}\left[\left(\dfrac{\partial \log f(X)}{\partial \lambda}\right)^{2}\right] = \dfrac{\alpha^{2}}{\lambda^{2}} + \dfrac{\alpha(\alpha+1)}{\lambda^{2}} - \dfrac{2\alpha^{2}}{\lambda^{2}} = \dfrac{\alpha}{\lambda^{2}}$
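Since the score is $\alpha/\lambda - x$, its mean squared value should approach $\alpha/\lambda^{2}$; a Monte Carlo sketch assuming numpy ($\lambda$, $\alpha$ arbitrary):

```python
# Sketch: Monte Carlo estimate of the Gamma Fisher information in lambda.
import numpy as np

rng = np.random.default_rng(3)
lam, alpha = 2.0, 3.0
x = rng.gamma(shape=alpha, scale=1 / lam, size=1_000_000)
print(np.mean((alpha / lam - x) ** 2))   # ~0.75
print(alpha / lam**2)                    # 0.75
```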

If 𝑋~Γ(𝜆, 𝛼), 𝑌~Γ(𝜆, 𝛽), X, Y are independent, then 𝑋 + 𝑌~Γ(𝜆, 𝛼 + 𝛽)

Proof:

Let 𝑍 = 𝑋 + 𝑌
We have $f_{Z}(z) = \int_{0}^{z} \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)} e^{-\lambda x} x^{\alpha-1}\, \dfrac{\lambda^{\beta}}{\Gamma(\beta)} e^{-\lambda(z-x)} (z-x)^{\beta-1}\,dx = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\, e^{-\lambda z} \int_{0}^{z} x^{\alpha-1}(z-x)^{\beta-1}\,dx$

Hence, we have $f_{Z}(z) = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\, e^{-\lambda z} z^{\alpha+\beta-2} \int_{0}^{z} \left(\dfrac{x}{z}\right)^{\alpha-1}\left(1 - \dfrac{x}{z}\right)^{\beta-1}\,dx$

Putting $\dfrac{x}{z} = y$, we get $f_{Z}(z) = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\, e^{-\lambda z} z^{\alpha+\beta-1} \int_{0}^{1} y^{\alpha-1}(1-y)^{\beta-1}\,dy = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\, e^{-\lambda z} z^{\alpha+\beta-1}\, B(\alpha, \beta)$

We have $B(\alpha, \beta) = \dfrac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$

Hence, we have $f_{Z}(z) = \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha+\beta)}\, e^{-\lambda z} z^{\alpha+\beta-1}$

Hence, we have 𝑍~Γ(𝜆, 𝛼 + 𝛽)
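The convolution result can be checked with a Kolmogorov-Smirnov test against the claimed Gamma$(\lambda, \alpha+\beta)$ law. A sketch assuming scipy (parameters arbitrary):

```python
# Sketch: KS test of simulated X + Y against Gamma(lambda, alpha + beta).
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(4)
lam, alpha, beta = 2.0, 1.5, 2.5
z = rng.gamma(alpha, 1 / lam, 100_000) + rng.gamma(beta, 1 / lam, 100_000)
# args = (shape, loc, scale); a small statistic and a p-value that is not
# small are consistent with the theorem
print(kstest(z, 'gamma', args=(alpha + beta, 0, 1 / lam)))
```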

Normal Distribution ($N(\mu, \sigma^{2})$)

We have $f(x) = \dfrac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\dfrac{(x-\mu)^{2}}{2\sigma^{2}}\right)$, $M(t) = \exp\left(\mu t + \dfrac{\sigma^{2}t^{2}}{2}\right)$

If $X \sim N(\mu, \sigma^{2})$, then $aX + b \sim N(a\mu + b, a^{2}\sigma^{2})$

If $X \sim N(\mu, \sigma^{2})$ and $Y \sim N(\alpha, \gamma^{2})$ are independent, then $X + Y \sim N(\mu + \alpha, \sigma^{2} + \gamma^{2})$
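Both closure properties can be verified from sample moments. A sketch assuming numpy (the particular parameters are illustrative):

```python
# Sketch: sample moments of 3X + 4 and X + Y should match the stated normal laws.
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(1.0, 2.0, 1_000_000)    # X ~ N(mu=1, sigma^2=4)
y = rng.normal(-0.5, 1.5, 1_000_000)   # Y ~ N(alpha=-0.5, gamma^2=2.25)
print((3 * x + 4).mean(), (3 * x + 4).var())   # ~7 and ~36
print((x + y).mean(), (x + y).var())           # ~0.5 and ~6.25 = 4 + 2.25
```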

Tutorial 1
Q1: Solution: $Y = -\lambda \log(1 - X)$, where $X \sim U(0,1)$

Since $0 \le X \le 1$, we have $0 < Y < \infty$

Hence, we have $\mathbb{P}(Y \le y) = \mathbb{P}(-\lambda\log(1 - X) \le y) = \mathbb{P}\left(1 - X \ge \exp\left(-\dfrac{y}{\lambda}\right)\right) = \mathbb{P}\left(X \le 1 - \exp\left(-\dfrac{y}{\lambda}\right)\right) = 1 - \exp\left(-\dfrac{y}{\lambda}\right)$

Hence, we have $F(y) = 1 - \exp\left(-\dfrac{y}{\lambda}\right) \Rightarrow f(y) = \dfrac{1}{\lambda}\exp\left(-\dfrac{y}{\lambda}\right)$

Hence, we have $Y \sim \mathrm{Exp}\left(\dfrac{1}{\lambda}\right)$, i.e. exponential with rate $1/\lambda$ and mean $\lambda$
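This is the inverse-transform method in disguise; a simulation sketch assuming numpy ($\lambda = 2$ arbitrary) confirms the mean $\lambda$ and variance $\lambda^{2}$:

```python
# Sketch: Y = -lambda*log(1 - X) applied to X ~ U(0,1) should produce
# samples with mean lambda and variance lambda^2.
import numpy as np

rng = np.random.default_rng(6)
lam = 2.0
y = -lam * np.log(1 - rng.uniform(size=1_000_000))
print(y.mean(), y.var())   # ~2.0 and ~4.0
```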
Q2: Solution: We have $Y = F_X(X)$, where $F_X$ is continuous and strictly increasing

Hence, we have $F_Y(y) = \mathbb{P}(Y \le y) = \mathbb{P}(F_X(X) \le y) = \mathbb{P}\left(X \le F_X^{-1}(y)\right) = F_X\left(F_X^{-1}(y)\right) = y$, $0 \le y \le 1$

Hence, we have 𝑌~𝑈(0,1)
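This is the probability integral transform; a sketch assuming scipy, using a standard normal as an example of a continuous $X$:

```python
# Sketch: F_X(X) for continuous X should be Uniform(0,1).
import numpy as np
from scipy.stats import norm, kstest

rng = np.random.default_rng(7)
u = norm.cdf(rng.normal(size=100_000))   # F_X(X) with X ~ N(0,1)
print(kstest(u, 'uniform'))              # p-value not small: consistent with U(0,1)
```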

Q3: Solution: We have $Y_k = X_1 + X_2 + \cdots + X_k$

Here $\mathbb{E}(X_i) = p - q$

Hence, we have $\mathbb{E}(Y_k) = (p - q)k$

Q4: Solution: Let $N_t$ denote the number of coolers still working at time $t$. We have $\mathbb{P}(N_t = k) = \mathbb{P}(\text{exactly } n - k \text{ coolers have failed by time } t) = \binom{n}{n-k}\,\mathbb{P}(X \le t)^{n-k}\,\mathbb{P}(X > t)^{k} = \binom{n}{k}\left(1 - e^{-\lambda t}\right)^{n-k}\left(e^{-\lambda t}\right)^{k}$

Hence, we have $N_t \sim \mathrm{Bin}(n, e^{-\lambda t})$
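Under this reading ($N_t$ counts the coolers still working), a simulation sketch assuming numpy ($n$, $\lambda$, $t$ arbitrary) agrees with the Bin$(n, e^{-\lambda t})$ mean:

```python
# Sketch: with n iid Exp(lambda) lifetimes, the count still working at time t
# has mean n*exp(-lambda*t), matching Bin(n, exp(-lambda*t)).
import numpy as np

rng = np.random.default_rng(8)
n, lam, t = 10, 0.5, 1.0
lifetimes = rng.exponential(1 / lam, size=(200_000, n))
n_working = (lifetimes > t).sum(axis=1)
print(n_working.mean(), n * np.exp(-lam * t))   # both ~6.07
```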

Q5: Done

Q6: Done

Q7: Solution: For $0 < s \le r$ we have $|x|^{s} \le 1 + |x|^{r}$, so $\mathbb{E}(X^{s}) \le \mathbb{E}(|X|^{s}) \le \mathbb{E}(1 + |X|^{r}) < \infty$ (B)

Q8: Solution: Max number of viruses in 1st gen = 1

Max number of viruses in 2nd gen = 2

Max number of viruses in 3rd gen = 4

To infect at least 7 humans, at least 4 viruses are needed

Hence, we have $P = (0.4) \times (0.4)^{2} \times \left[0.8^{3} \times 0.2 \times 4 + 0.8^{4}\right]$

Q9: Easy
Q10: Solution: We have $f(x) = \dfrac{\lambda^{\alpha} x^{\alpha-1}}{\Gamma(\alpha)}\, e^{-\lambda x}$

For maxima/minima we have $\dfrac{df(x)}{dx} = 0 \Rightarrow -\lambda x^{\alpha-1} e^{-\lambda x} + (\alpha - 1)x^{\alpha-2} e^{-\lambda x} = 0 \Rightarrow x = 0 \text{ or } \lambda x = \alpha - 1 \Rightarrow x = \dfrac{\alpha-1}{\lambda}$

By the first derivative test, $x = \dfrac{\alpha-1}{\lambda}$ is the mode (for $\alpha > 1$).
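A numeric cross-check of the mode, assuming scipy ($\lambda$, $\alpha$ arbitrary): maximizing the pdf on a fine grid recovers $(\alpha-1)/\lambda$.

```python
# Sketch: grid-maximize the Gamma pdf and compare with (alpha - 1)/lambda.
import numpy as np
from scipy.stats import gamma

lam, alpha = 2.0, 3.5
xs = np.linspace(1e-6, 10, 200_001)
print(xs[np.argmax(gamma.pdf(xs, alpha, scale=1 / lam))])  # ~1.25
print((alpha - 1) / lam)                                   # 1.25
```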

Q11: Solution: The first point can be chosen anywhere on the circle (with probability 1)

The second point must then lie on the arc subtended by the inscribed equilateral triangle with a vertex at the first point

Hence, $p = \dfrac{1}{3}$
MTL390: Statistical Methods

Lecture 00

Marking Scheme and Quiz Details

Niladri Chatterjee
Dept. of Mathematics
Introductory Note

Marking Scheme:

Quiz: 40 marks, Minor: 20 marks, Major: 40 marks

Quiz:

• 6 quizzes, 10 marks each. The best four will be considered.


• Quizzes will be held during class hours.
• Solutions will be discussed/shared after the quiz.
Topics

Q1 Random variables and known distributions with functions and properties (Monday, 10-01-2022)

Q2 Functions of two random variables, sampling (Monday, 24-01-2022)

Q3 Sampling and Order Statistics (Monday, 07-02-2022)

MINOR + Semester Break

Q4 Estimation of parameters, CR inequality (Thursday, 03-03-2022)

Q5 Interval Estimation and Testing of Hypothesis (Wednesday, 16-03-2022)

Q6 Non-parametric tests (Thursday, 31-03-2022)
Course Plan

1. Quick Revision of Basic Probability Distributions with properties


Discrete: Bernoulli, Uniform, Binomial, Geometric, Poisson, Hypergeometric,
Negative Binomial
Continuous: Uniform, Exponential, Normal, Cauchy, Chi-sq, Gamma, Weibull

Properties: pmf, pdf, cdf, Expectation, Variance, MGF, Characteristic Function,


Fisher’s Information

To be followed by Quiz 1 on 10-01-2022


Course Plan

2. Advanced Distributions: Student’s T, F, Beta1, Beta2


To be viewed as functions of two random variables

Introduction to Sampling

To be followed by Quiz 2 on 24-01-2022


Course Plan

3. Sampling Distribution (contd) and Order Statistics


Apart from the mean and variance there are other important
parameters: median, range, percentiles

Order statistics is about their distribution

To be followed by Quiz 3 on 07-02-2022


Course Plan

4. Estimation of Parameters and Cramer-Rao Inequality

Parameter estimation is the most important task of statistical inference


Samples give only estimates of population parameters
What should be the properties of good estimators?
The Cramer-Rao inequality provides a lower bound on the variance of an unbiased estimator

To be followed by Quiz 4 on 03-03-2022


Course Plan

5. Interval Estimation and Testing of Hypothesis

Estimation can be of two types:

a) Point Estimate
b) Interval Estimate: here we try to find a confidence interval
for the predicted value of the parameters.

In ToH we check if the sample gives enough evidence in support
of a hypothesis that we make for the distribution parameters.

To be followed by Quiz 5 on 16-03-2022


Course Plan

6. Non-Parametric tests

Parametric estimation makes sense only if we have a
probabilistic model of the phenomenon. If no such model
exists and/or the values are non-numeric, we may have to
resort to non-parametric estimation

To be followed by Quiz 6 on 31-03-2022


Thank You
