You are on page 1of 10

Statistics for Data Science - 2

Week 7 Graded assignment

Use the following values of standard normal distribution if needed.

FZ (0.71) = 0.76115, FZ (1.26) = 0.89617, FZ (1.58) = 0.94295, FZ (2.58) = 0.99506, FZ (1.96) =


0.975

1. Let X1 , X2 , X3 are three independent and identically distributed random variables with
mean µ and variance σ 2 . Given below are 3 different formulations of sample mean.
(Observe that E[A] = E[B] = E[C]).

X1 + X2 + X3
A=
3
B =0.1X1 + 0.3X2 + 0.6X3
C =0.2X1 + 0.3X2 + 0.5X3

Choose the correct option from the following:

(a) Var(A) = Var(B) = Var(C)


(b) Var(A) ≥ Var(B) ≥ Var(C)
(c) Var(A) ≤ Var(B) ≤ Var(C)
(d) Var(A) ≤ Var(C) ≤ Var(B)

Solution:
Let X1 , X2 , X3 ∼ i.i.d.X, where E[X] = µ, Var(X) = σ 2
 
X1 + X 2 + X3
Var(A) =Var
3
1
= (Var[X1 ] + Var[X2 ] + Var[X3 ])
9
1 σ2
= (3σ 2 ) =
9 3

Var(B) =Var (0.1X1 + 0.3X2 + 0.6X3 )


=0.01Var[X1 ] + 0.09Var[X2 ] + 0.36Var[X3 ]
=0.46(3σ 2 )
=1.38σ 2

1
Var(C) =Var (0.2X1 + 0.3X2 + 0.5X3 )
=0.04Var[X1 ] + 0.09Var[X2 ] + 0.25Var[X3 ]
=0.38(3σ 2 )
=1.14σ 2

Therefore, Var(B) ≥ Var(C) ≥ Var(A).

2. A random sample of size 25 is collected from a normal population with mean of 50 and
standard deviation of 5. Find the variance of the sample mean.
Solution:
We know that variance of the sample mean X is given by

σ2
Var[X] =
n
52
= =1
25

50
P
3. Let X1 , X2 , . . . , X50 ∼ i.i.d. Poisson(0.04) and let Y = Xi . Use Central Limit
i=1
theorem to find P (Y > 3). Enter the answer correct to 2 decimal places.
Solution:
Let X ∼ Poisson(0.04).
Consider the samples X1 , X2 , . . . , X50 from X.
E[X] = Var[X]
 50 =0.04  50 
P P
E[Y ] = E Xi = 50 × 0.04 = 2, Var[Y ] = Var Xi = 50 × 0.04 = 2
i=1 i=1

To find: P (Y > 3).


By CLT, we know that
Y − nµ
√ ∼ Normal(0, 1)
σ n
 
Y −2
⇒ √ ∼ Normal(0, 1)
2
Now,

P (Y > 3) = P (Y − 2 > 1)
 
Y −2 3−2
=P √ > √
2 2
= P (Z > 0.707)
= 1 − FZ (0.707) = 1 − 0.76 = 0.24

2
4. Let the moment generating function of a random variable X be given by
         
1 −2λ 1 3 −λ 3 2λ 7
MX (λ) = e + + e + e + eλ
4 40 10 40 20
Find the distribution of X.

X −2 −1 0 1 2
1 3 3 1 7
P (X = x) 4 40 10 40 20

(a)

X −2 −1 0 1 2
1 1 3 3 7
P (X = x) 4 40 10 40 20

(b)

X −2 −1 0 1 2
1 3 1 7 3
P (X = x) 4 10 40 20 40

(c)

X −2 −1 0 1 2
1 3 1 7 3
P (X = x) 4 40 40 20 10

(d)
Solution:
The MGF of a discrete random variable X with the PMF fX (x) = P (X = x), x ∈ TX
is given by
MX (λ) = E[eλX ]
X
= P (X = x)eλx
x∈TX

Now, MGF of a random variable X is given as


         
1 −2λ 1 3 −λ 3 2λ 7
MX (λ) = e + + e + e + eλ
4 40 10 40 20

Therefore, distribution of X is given by

X −2 −1 0 1 2
1 3 1 7 3
P (X = x) 4 10 40 20 40

3
5. A fair coin is tossed 1000 times. Use CLT to compute the probability that head appears
at most 520 times. Enter the answer correct to 3 decimal places.
Solution:
Define a random variable X such that
(
1 if head appears on tossing a fair coin
X=
0 otherwise

1
Therefore, E[X] = µ = and
2
1 1 1
Var(X) = σ 2 = . =
2 2 4

Let X1 , X2 , . . . , X1000 be outcomes on tossing the fair coin 1000 times.


Notice that X1 + X2 + . . . + X1000 will denote the number of times head appears in
1000 tosses.
Let S = X1 + X2 + . . . + X1000
To find: P (S ≤ 520)
By CLT, we know that
S − 1000µ
√ ∼ Normal(0, 1)
σ n
S − 500
⇒ √ ∼ Normal(0, 1)
5 10
Now,

P (S ≤ 520) = P (S − 500 ≤ 20)


 
S − 500 20
=P √ ≤ √
5 10 5 10
= P (Z ≤ 1.26)
= 0.896

6. A fair die is rolled 100 times. Let X denote the number of times six is obtained. Find
X 1
a bound for the probability that differs from by less than 0.1 using weak law of
100 6
large numbers.
5
(a) at least
36
31
(b) at least
36

4
5
(c) at most
36
31
(d) at most
36
Solution:
X denotes the number of times six is obtained on rolling a fair die 100 times.
Let X1 , X2 , . . . , X100 be 100 i.i.d. samples such that
(
1 if six appears on rolling a fair die
Xi =
0 otherwise
1
E[Xi ] = µ = and
6
5
Var(Xi ) = σ 2 =
36
Notice that X = X1 + X2 + X3 + . . . + X100
!
X 1
To find: Bound on P − < 0.1 .

100 6

By weak law of large numbers, we have


σ2
P (|X − µ| < δ) ≥ 1 − 2
! nδ
X 1 5
⇒P − < 0.1 ≥ 1 −

100 6 36 × 100 × 0.01
!
X 1 5 31
⇒P − < 0.1 ≥ 1 − =

100 6 36 36

7. Let X1 , X2 , . . . , X500 ∼ i.i.d Normal(0, 1). Evaluate P (X12 + X22 + . . . + X500


2
> 550)
using Central Limit theorem. Enter the answer correct to 2 decimal places.
Hint: (X12 + X22 + . . . + X500 2
) ∼ Gamma (250, 0.5) .
Solution:
Given X1 , . . . , X500 ∼ i.i.d. Normal(0, 1).  
2 1 1
We know that if X ∼ Normal(0, 1) =⇒ X ∼ Gamma ,
2 2
Also, Sum of n independent  Gamma(α,
 β) is Gamma(nα, β).
1 1
Therefore, Xi2 ∼ Gamma , , for all i.
2 2
and (X12 + X22 + . . . + X500
2
) ∼ Gamma (250, 0.5)

5
Let Y = Y1 + Y2 + . . . + Y500 , where Yi = Xi2 for all i : 1 → 500

0.5 0.5
E[Yi ] = = 1 and Var[Yi ] = = 2, for i : 1 → 500
0.5 0.25
250 250
E[Y ] = = 500 and Var[Y ] = = 1000
0.5 0.52

To find: P (Y > 550)


By CLT, we know that
Y − 500µ
√ ∼ Normal(0, 1)
σ n
Y − 500
⇒ √ ∼ Normal(0, 1)
10 10
Now,

P (Y > 550) = P (Y − 500 > 50)


 
Y − 550 5
=P √ >√
10 10 10
= P (Z > 1.58)
= 1 − FZ (1.58) = 1 − 0.94 = 0.06

Use the below information to answer questions 8 and 9.

Let X be a random variable having the gamma distribution with the parameters α = 2n
and β = 1.
Hint:
α α
• If X ∼ Gamma(α, β), E[X] = and Var[X] = 2
β β
• Sum of n independent Gamma(α, β) is Gamma(nα, β)

8. Use the Weak Law of Large number to find the value of n such that
!
X
P − 1 > 0.01 < 0.01

2n

(a) 505000
(b) 470000
(c) 498000
(d) 482000

6
Solution:
Given X ∼ Gamma(2n, 1)
Let X = X1 + X2 + X3 + . . . + X2n , where Xi ∼ Gamma(1, 1).

E[X] = µ = 1 and Var(X) = σ 2 = 1


1
E[X̄] = 1 and Var[X̄] =
2n
!
X
To find: The value of n such that P − 1 > 0.01 < 0.01.

2n

By weak law of large numbers, we have

σ2
P (|X − µ| > δ) ≤ 2

!
X 1
⇒P − 1 > 0.01 ≤

2n 2n × 0.012

1 1
Therefore, 2
< 0.01 =⇒ 2n > =⇒ n > 500000.
2n × 0.01 0.013

9. Use CLT to find the value of n such that


!
X
P − 1 > 0.01 < 0.01

2n

Hint: Use FZ (2.58) = 0.995, FZ (1.96) = 0.975 if needed.

(a) 34570
(b) 33500
(c) 32500
(d) 30000

Solution:
E[X1 + . . . + X2n ] = 2n and Var[X1 + . . . + X2n ] = 2n
!
X
To find: The value of n such that P − 1 > 0.01 < 0.01.

2n

By CLT, we know that


X − 2nµ
√ ∼ Normal(0, 1)
σ n

7
X − 2n
⇒ √ ∼ Normal(0, 1)
2n
Now,
!
X
P − 1 > 0.01 < 0.01

2n
!
X + . . . + X
1 n
=⇒ P − 1 > 0.01 < 0.01

2n
!
X + . . . + X − 2n
1 n

=⇒ P √ > 0.01 2n < 0.01

2n

=⇒ P (| Z |> 0.01 2n) < 0.01

=⇒ 2P (Z > 0.01 2n) < 0.01
√ 0.01
=⇒ 1 − FZ (0.01 2n) <
√ 2
=⇒ FZ (0.01 2n) > 0.995

=⇒ FZ (0.01 2n) > FZ (2.58)
=⇒ n > 33282

10. Let the time taken (in hours) for failure of an electric bulb follow the exponential distri-
bution with the parameter 0.05. Suppose that 100 such light bulbs say L1 , L2 , . . . , L100
are used in the following manner: For every i, as soon as the light Li fails, Li+1 be-
comes operative, where i : 1 → 99 (i.e. If L1 fails, L2 becomes operative, if L2 fails, L3
becomes operative, and so on). Let the total time of operation of 100 bulbs be denoted
by T. Using CLT, compute the probability that T exceeds 2500 hours.
(a) FZ (1.5)
(b) 1 − FZ (1.5)
(c) FZ (2.5)
(d) 1 − FZ (2.5)
Solution:
Given, time to failure (in hours) of an electric bulb has the exponential distribution
with the parameter λ = 0.05.
Since, the bulbs are used in such a way, that as soon as light L1 fails, L2 becomes
operative, L2 fails, L3 becomes operative, and so on.
We know that if X ∼ Gamma(α, β) with parameter α = 1, then X ∼ Exp(β).
Also, sum of n i.i.d. Exp(λ) is Gamma(n, λ).

8
Since each of the Li ’s are exponentially distributed with parameter = 0.05, therefore

L1 + . . . + L100 ∼ Gamma(nα, β) = Gamma(100, 0.05)

Let T = L1 + . . . + L100

1 1
E[Li ] = µ = = 20 and SD[Li ] = σ = = 20
0.05 0.05

To find: P (T ≥ 2500)
By CLT, we know that
T − 100µ
√ ∼ Normal(0, 1)
σ n
T − 2000
⇒ √ ∼ Normal(0, 1)
20 100
Now,

P (T ≥ 2500) = P (T − 2000 ≥ 500)


 
T − 2000 500
=P ≥
200 200
= P (Z ≥ 2.5)
= 1 − FZ (2.5)

11. Suppose speeds of vehicles on a particular road are normally distributed with mean 36
mph and standard deviation 2 mph. Find the probability that the mean speed X of
20 randomly selected vehicles is between 35 and 38 mph.
√ √
(a) FZ ( 5) − FZ (− 5)
√ √
(b) FZ ( 20) − FZ (− 20)
√ √
(c) FZ ( 38) − FZ (− 35)
√ √
(d) FZ ( 20) − FZ (− 5)

Solution:
Let X denote the speed of a vehicle on a particular road.
Given that X ∼ Normal(36, 22 ).
Therefore, µ = 36 and σ = 2
Select X1 , X2 , . . . X20 samples such that X1 , X2 , . . . X20 ∼ iid X

X1 + X2 + . . . + X20
Let X = and S = X1 + X2 + . . . + X20
20

9
To find: P (35 < X < 38) From CLT, we know that

X1 + X2 . . . + Xn − nE[X]
√ ∼ Normal(0, 1)

S − nµ
⇒ √ ∼ Normal(0, 1)

(S − 36(20))
⇒ √ ∼ Normal(0, 1)
(2 20)

Now,
S
P (35 < X < 38) = P (35 < < 38)
20
S
= P (−1 < − 36 < 2)
20
S − 36(20)
= P (−1 < < 2)
√ 20
− 20 S − 36(20) √
= P( < √ < 20)
2 2 20
√ S − 36(20) √
= P (− 5 < √ < 20)
2 20
√ √
= FZ ( 20) − FZ (− 5)

10

You might also like