Aritrabha Majumdar
March 2024
2 Sample Mean
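$$\bar X = \frac{X_1 + \dots + X_n}{n}$$
Suppose $X_1, \dots, X_n$ are independent with $E[X_i] = \mu$ and $\mathrm{Var}[X_i] = \sigma^2$. Then $E[\bar X] = \mu$ and $\mathrm{Var}[\bar X] = \sigma^2/n$.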
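The two identities for the sample mean can be checked numerically. The following is only a sketch: the normal population and the values of `n`, `mu`, `sigma` and `reps` are illustrative assumptions, not taken from the notes.

```r
set.seed(1)                      # fixed seed so the run is reproducible
n     <- 10                      # sample size (illustrative choice)
mu    <- 2                       # assumed population mean
sigma <- 3                       # assumed population standard deviation
reps  <- 20000                   # number of simulated samples

# Each replicate draws n observations and returns their sample mean.
xbar <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))

mean(xbar)                       # should be close to mu
var(xbar)                        # should be close to sigma^2 / n
sigma^2 / n                      # theoretical variance of the sample mean
```

The simulated variance shrinks like $1/n$, which is what makes the sample mean concentrate around $\mu$ as the sample grows.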
3 Sample Variance
$$S_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)^2$$
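As a small worked example, the formula can be evaluated by hand and compared with R's built-in `var()`, which also divides by $n - 1$. The data vector below is made up for illustration.

```r
x <- c(2.1, 2.9, 2.4, 2.7, 2.2)            # small illustrative data set
n <- length(x)

xbar       <- sum(x) / n                   # sample mean
s2_manual  <- sum((x - xbar)^2) / (n - 1)  # sample variance from the definition
s2_builtin <- var(x)                       # base R sample variance

c(manual = s2_manual, builtin = s2_builtin)
```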
4 Observation
$$U_X = \frac{\bar X}{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar X)^2}}, \qquad U_Z = \frac{\bar Z}{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(Z_i - \bar Z)^2}}$$
5 Student’s t-Distribution
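Say,
$$\bar X = \frac{X_1 + \dots + X_n}{n} \quad \text{and} \quad S_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar X)^2.$$
Then, if $X_1, \dots, X_n$ are i.i.d. $N(\mu, \sigma^2)$,
$$\frac{\sqrt{n}\,(\bar X - \mu)}{S_n}$$
follows the $t_{n-1}$ distribution.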
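A simulation can make the $t_{n-1}$ claim concrete. This is a sketch under assumed values (`n = 5`, `mu = 1`, `sigma = 2`): it studentizes many normal samples and compares a few simulated quantiles with those of $t_{n-1}$ and of $N(0,1)$.

```r
set.seed(2)
n <- 5; mu <- 1; sigma <- 2      # illustrative parameter choices
reps <- 50000

# Studentized mean: sqrt(n) * (xbar - mu) / S_n, one value per simulated sample.
tstat <- replicate(reps, {
  x <- rnorm(n, mean = mu, sd = sigma)
  sqrt(n) * (mean(x) - mu) / sd(x)
})

probs <- c(0.05, 0.5, 0.95)
rbind(simulated = quantile(tstat, probs),
      t_dist    = qt(probs, df = n - 1),
      normal    = qnorm(probs))
```

For small $n$ the simulated tail quantiles should sit near the $t_{n-1}$ row and clearly outside the normal row, reflecting the extra variability from estimating $\sigma$ by $S_n$.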
6 Some Inequalities
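Markov's inequality. Let $X$ be a non-negative random variable with finite mean and variance. Then
$$P(X \ge c) \le \frac{E(X)}{c}, \qquad c > 0.$$
Indeed, $c\,\mathbf{1}\{X \ge c\} \le X$ because $X \ge 0$; taking expectations gives $c\,P(X \ge c) \le E(X)$
$$\implies P(X \ge c) \le \frac{E(X)}{c}.$$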
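The bound can be compared with observed tail frequencies. The sketch below assumes an Exponential(1) sample and a handful of thresholds; both are illustrative choices. Because the sample itself is a non-negative distribution, the empirical frequencies never exceed the sample mean divided by $c$.

```r
set.seed(3)
x  <- rexp(100000, rate = 1)     # non-negative sample with E(X) = 1
cs <- c(0.5, 1, 2, 4)            # thresholds c > 0 (illustrative)

empirical <- sapply(cs, function(c0) mean(x >= c0))  # observed P(X >= c)
bound     <- mean(x) / cs                            # Markov bound E(X) / c

data.frame(c = cs, empirical = empirical, markov_bound = bound)
```

The bound is loose for large $c$: the exponential tail decays much faster than $1/c$.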
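Chebyshev's inequality. Let $X$ have mean $\mu$ and finite variance $\sigma^2 > 0$. Then, for every $k > 0$,
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}.$$
Proof: We put $Y = \frac{(X-\mu)^2}{\sigma^2}$ and apply Markov's inequality:
$$P(Y \ge k^2) \le \frac{E(Y)}{k^2} = \frac{E|X - \mu|^2}{\sigma^2 k^2} = \frac{1}{k^2}.$$
Since $\{Y \ge k^2\} = \{|X - \mu| \ge k\sigma\}$, we are done. Note that the hypotheses of Markov's inequality still hold here: $Y$ is non-negative and has finite mean.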
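A quick numerical look at this bound, as a sketch: the Uniform(2, 3) population (mean 2.5, variance 1/12) and the values of `k` are assumptions chosen for illustration.

```r
set.seed(8)
x     <- runif(100000, 2, 3)     # Uniform(2, 3) sample
mu    <- 2.5                     # population mean of Uniform(2, 3)
sigma <- sqrt(1 / 12)            # population standard deviation
ks    <- c(1, 1.5, 2)            # multiples of sigma (illustrative)

empirical <- sapply(ks, function(k) mean(abs(x - mu) >= k * sigma))
bound     <- 1 / ks^2            # bound 1 / k^2 from the inequality

data.frame(k = ks, empirical = empirical, bound = bound)
```

For $k = 2$ the event is impossible here, since $2\sigma \approx 0.577$ exceeds the largest possible deviation 0.5, while the bound still allows 0.25. The inequality holds for every distribution with finite variance, so it is rarely tight for a particular one.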
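Let $\varepsilon > 0$ be given, and let $\bar X$ be the mean of $n$ independent observations with mean $\mu$ and variance $\sigma^2$. Then
$$P(|\bar X - \mu| > \varepsilon) = P\left(\frac{|\bar X - \mu|}{\sigma} > \frac{\varepsilon}{\sigma}\right) = P\left(|\bar X - \mu| > \frac{\varepsilon\sqrt{n}}{\sigma}\cdot\frac{\sigma}{\sqrt{n}}\right) \le \frac{\sigma^2}{\varepsilon^2}\cdot\frac{1}{n}.$$
The last step applies the previous inequality to $\bar X$, whose standard deviation is $\sigma/\sqrt{n}$, with $k = \varepsilon\sqrt{n}/\sigma$. The right-hand side tends to 0 as $n \to \infty$.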
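To see how the bound $\sigma^2/(n\varepsilon^2)$ compares with the actual probability, one can estimate $P(|\bar X - \mu| > \varepsilon)$ by simulation. This is a sketch assuming Uniform(2, 3) observations; `eps`, `reps` and the sample sizes are illustrative.

```r
set.seed(4)
mu     <- 2.5                    # mean of Uniform(2, 3)
sigma2 <- 1 / 12                 # variance of Uniform(2, 3)
eps    <- 0.05                   # tolerance epsilon (illustrative)
reps   <- 2000                   # simulated sample means per n
ns     <- c(10, 100, 1000)

# Fraction of simulated sample means that miss mu by more than eps.
empirical <- sapply(ns, function(n) {
  xbar <- replicate(reps, mean(runif(n, 2, 3)))
  mean(abs(xbar - mu) > eps)
})
bound <- pmin(1, sigma2 / (ns * eps^2))   # bound, capped at 1

data.frame(n = ns, empirical = empirical, bound = bound)
```

Both columns decrease with $n$, and the empirical frequency stays below the bound.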
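library(viridis)   # provides the viridis() colour palette used below

# Running mean of N i.i.d. Uniform(2, 3) draws: the n-th entry of the
# result is (y_1 + ... + y_n) / n.
# x is the rate for the Poisson alternative shown in the comment;
# it is unused by the uniform version.
runningmean = function(x,N)
{
# y = rpois(N,x)
y = runif(N,2,3)
c = cumsum(y)
n = 1:N
c/n
}
u = runningmean(1,1000)
v=1:1000; plot(u~v, type="l")
# overlay nine more independent running-mean paths in random colours
invisible(replicate(9, lines(runningmean(c(0,1), 1000)~v,
type="l", col = sample(viridis(10000),1))))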
[Figure: running means u plotted against v.]
par(mfrow=c(1,3))   # three panels side by side
# Panel 1: running means over N = 100 draws
u = runningmean(1, 100)
x=1:100; plot(u~x, type="l")
invisible(replicate(10,
lines(runningmean(1, 100)~x, type="l",
col = sample(viridis(15, option="A"), 1))
))
# Panel 2: N = 1000
u = runningmean(1, 1000)
x=1:1000; plot(u~x, type="l")
invisible(replicate(10, lines(runningmean(1, 1000)~x, type="l",
col = sample(viridis(15, option="B"), 1))
))
# Panel 3: N = 10000
u = runningmean(1, 10000)
x=1:10000; plot(u~x, type="l")
invisible(replicate(10, lines(runningmean(1, 10000)~x, type="l",
col = sample(viridis(15, option="C"), 1))
))
[Figure: running means u plotted against x in three panels, for N = 100, 1000 and 10000.]
then
$$P(A) = 1.$$
This is the same as saying
$$P\left(\bigcap_{\varepsilon > 0}\ \bigcup_{N=1}^{\infty}\ \bigcap_{n=N}^{\infty} \{|X_n - X| < \varepsilon\}\right) = 1.$$
8 Another Question
Does
$$\frac{\sqrt{n}\,(\bar X - \mu)}{\sigma} \to N(0, 1)$$
always occur?
binomialsim1 = rbinom(100,10,0.1)
# generates 100 Binomial (10,0.1) samples
binomialsim2 = rbinom(100,10,0.25)
# generates 100 Binomial (10,0.25) samples
binomialsim3 = rbinom(100,10,0.5)
# generates 100 Binomial (10,0.5) samples
par(mfrow=c(1,3))
hist(binomialsim1, main = "Binomial(10, 0.1)")
hist(binomialsim2, main = "Binomial(10, 0.25)")
hist(binomialsim3, main = "Binomial(10, 0.5)")
[Figure: histograms of binomialsim1, binomialsim2 and binomialsim3, titled Binomial(10, 0.1), Binomial(10, 0.25) and Binomial(10, 0.5); vertical axis: Frequency.]
binomialsim1 = rbinom(100,100,0.1)
# generates 100 Binomial (100,0.1) samples
binomialsim2 = rbinom(100,100,0.25)
# generates 100 Binomial (100,0.25) samples
binomialsim3 = rbinom(100,100,0.5)
# generates 100 Binomial (100,0.5) samples
par(mfrow=c(1,3))
hist(binomialsim1, main = "Binomial(100, 0.1)")
hist(binomialsim2, main = "Binomial(100, 0.25)")
hist(binomialsim3, main = "Binomial(100, 0.5)")
[Figure: histograms titled Binomial(100, 0.1), Binomial(100, 0.25) and Binomial(100, 0.5); vertical axis: Frequency.]
binomial0.1sim1 = rbinom(100,10,0.1)
binomial0.1sim2 = rbinom(100,100,0.1)
binomial0.1sim3 = rbinom(100,1000,0.1)
par(mfrow=c(1,3))
hist(binomial0.1sim1, main = "Binomial(10, 0.1)")
hist(binomial0.1sim2, main = "Binomial(100, 0.1)")
hist(binomial0.1sim3, main = "Binomial(1000, 0.1)")
[Figure: histograms titled Binomial(10, 0.1), Binomial(100, 0.1) and Binomial(1000, 0.1); vertical axis: Frequency.]
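Now, let $S_n = X_1 + \dots + X_n$ be the number of successes in $n$ Bernoulli($p$) trials, so that $\bar X = S_n/n$, $\mu = p$ and $\sigma^2 = p(1-p)$. Then
$$\frac{\sqrt{n}\,(\bar X - \mu)}{\sigma} = \frac{\sqrt{n}\left(\frac{S_n}{n} - p\right)}{\sqrt{p(1-p)}} = \frac{S_n - np}{\sqrt{np(1-p)}} \to N(0, 1).$$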
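The algebraic identity between the two standardized forms can be confirmed on a concrete sample. This is a sketch with assumed values `n = 100` and `p = 0.1`.

```r
set.seed(5)
n <- 100; p <- 0.1               # illustrative choices

x  <- rbinom(n, size = 1, prob = p)   # n Bernoulli(p) trials
Sn <- sum(x)                          # number of successes

lhs <- sqrt(n) * (mean(x) - p) / sqrt(p * (1 - p))   # standardized sample mean
rhs <- (Sn - n * p) / sqrt(n * p * (1 - p))          # standardized binomial count

all.equal(lhs, rhs)
```

This is why the simulations in these notes standardize `rbinom` counts directly as `(S - n*p)/sqrt(n*p*(1-p))` rather than working with the sample mean.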
# Standardise each count S as (S - n*p)/sqrt(n*p*(1-p)), with p = 0.1
# generates 100 Binomial (10,0.1) samples
binomial0.1sim1 = rbinom(100,10,0.1)
stdbinom1 = (binomial0.1sim1 - 10*0.1)/sqrt(10*0.1*0.9)
# generates 100 Binomial (100,0.1) samples
binomial0.1sim2 = rbinom(100,100,0.1)
stdbinom2 = (binomial0.1sim2 - 100*0.1)/sqrt(100*0.1*0.9)
# generates 100 Binomial (1000,0.1) samples
binomial0.1sim3 = rbinom(100,1000,0.1)
stdbinom3 = (binomial0.1sim3 - 1000*0.1)/sqrt(1000*0.1*0.9)
par(mfrow=c(1,3))
hist(stdbinom1)
hist(stdbinom2)
hist(stdbinom3)
[Figure: histograms of stdbinom1, stdbinom2 and stdbinom3; vertical axis: Frequency.]
par(mfrow=c(1,3))
qqnorm(stdbinom1)
qqline(stdbinom1)
qqnorm(stdbinom2)
qqline(stdbinom2)
qqnorm(stdbinom3)
qqline(stdbinom3)
[Figure: normal Q-Q plots of stdbinom1, stdbinom2 and stdbinom3; vertical axis: Sample Quantiles.]
par(mfrow=c(1,3))
x= rnorm(100)
boxplot(x,stdbinom1)
boxplot(x,stdbinom2)
boxplot(x,stdbinom3)
[Figure: three panels of paired boxplots, x (1) against stdbinom1, stdbinom2 and stdbinom3 (2).]
S1000std = (binomial0.1sim3-1000*0.1)/sqrt(1000*0.1*0.9)
par(mfrow=c(1,3))
qqnorm(S1000std)
qqline(S1000std)
boxplot(x,S1000std)
hist(S1000std,main="STD-1000")
[Figure: normal Q-Q plot of S1000std, boxplots of x (1) against S1000std (2), and histogram titled STD-1000.]
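Let $\{X_n\}_{n \ge 1}$ be a sequence of i.i.d. random variables with finite mean $\mu$ and variance $\mathrm{Var}(X) > 0$, and let $\bar X_n = (X_1 + \dots + X_n)/n$. For all $x \in \mathbb{R}$,
$$P\left(\frac{\sqrt{n}\,(\bar X_n - \mu)}{\sqrt{\mathrm{Var}(X)}} \le x\right) \to \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\, dy \qquad (n \to \infty).$$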
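The statement is about distribution functions, so a direct check is to compare the empirical CDF of the standardized mean with `pnorm` at a few points. The sketch below assumes Uniform(2, 3) observations with `n = 200`; these choices are illustrative.

```r
set.seed(6)
n    <- 200                      # observations per sample mean (illustrative)
reps <- 20000                    # number of simulated sample means
mu   <- 2.5                      # mean of Uniform(2, 3)
v    <- 1 / 12                   # variance of Uniform(2, 3)

# Standardized sample means: sqrt(n) * (xbar - mu) / sqrt(Var(X)).
z <- replicate(reps, sqrt(n) * (mean(runif(n, 2, 3)) - mu) / sqrt(v))

xs        <- c(-2, -1, 0, 1, 2)
empirical <- sapply(xs, function(x0) mean(z <= x0))  # P(Z_n <= x), estimated

data.frame(x = xs, empirical = empirical, normal_cdf = pnorm(xs))
```

The two columns should agree to about two decimal places. The remaining gap is mostly Monte Carlo error from using a finite `reps`.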
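Exercise: Let $X \sim$ Uniform(0, 1). Generate 100 samples of
$$\frac{10\left(\frac{S_{100}}{100} - 0.5\right)}{\sqrt{\frac{1}{12}}},$$
where $S_{100}$ is the sum of 100 independent draws of $X$.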
u1 <- replicate(100,mean(runif(100)))
u2 <- 10*(u1-0.5)/(sqrt(1/12))
par(mfrow=c(1,3))
hist(u2)
qqnorm(u2)
qqline(u2)
boxplot(rnorm(100),u2)
[Figure: histogram of u2, normal Q-Q plot of u2 (Sample Quantiles against Theoretical Quantiles), and boxplots of rnorm(100) (1) against u2 (2).]
library(moments)
c(skewness(x),skewness(u2))
c(kurtosis(x), kurtosis(u2))
u1 <- replicate(100,mean(rexp(100,10)))
u2 <- 10*(u1-0.1)/(sqrt(1/100))
par(mfrow=c(1,3))
hist(u2)
qqnorm(u2)
qqline(u2)
boxplot(rnorm(100),u2)
[Figure: histogram of u2, normal Q-Q plot of u2 (Sample Quantiles against Theoretical Quantiles), and boxplots of rnorm(100) (1) against u2 (2).]
library(moments)
c(skewness(x),skewness(u2))
c(kurtosis(x), kurtosis(u2))
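To interpret these numbers it helps to see what is being computed. The sketch below writes skewness and kurtosis from scratch in base R, assuming the moment-ratio (Pearson) definitions, under which a normal sample gives skewness near 0 and kurtosis near 3 (not excess kurtosis). The helper names `skew_manual` and `kurt_manual` are hypothetical and are not part of the moments package.

```r
# Hypothetical helpers: moment-ratio skewness and kurtosis in base R.
skew_manual <- function(x) {
  d <- x - mean(x)                 # deviations from the sample mean
  mean(d^3) / mean(d^2)^(3/2)      # third central moment over variance^(3/2)
}
kurt_manual <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2          # fourth central moment over variance^2
}

set.seed(7)
z <- rnorm(10000)                  # reference normal sample
c(skewness = skew_manual(z), kurtosis = kurt_manual(z))
```

A standardized mean whose skewness is close to 0 and whose kurtosis is close to 3 is consistent with approximate normality. The exponential example would be expected to show more residual skewness than the uniform one at the same sample size, since the exponential distribution itself is skewed.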