
Chapter 5.

Estimation

Nguyễn Văn Hạnh


Department of Applied Mathematics
School of Applied Mathematics and Informatics

Second semester, 2022-2023

NV HANH Probability and statistics Second semester, 2022-2023 1 / 60


Contents

1 Random sample

2 Sampling distribution of random sample mean

3 Estimation
Point estimation
Maximum likelihood estimation methods
Method of moments

4 Confidence interval estimation


Confidence interval estimation of µ
Confidence interval estimation of σ 2
Confidence interval estimation of p
General problem



Random sample

Population and sample


Definition 5.1:
A population is the set of all individuals of interest.
In practice, we usually study a characteristic X of the individuals in a
population. X is also called the population.
The number of individuals N is called the population size.
A subset of n individuals taken from a population X is called a
sample of size n.
A sample of size n is a vector of n observations (x1, x2, ..., xn).


Random sample

Suppose that we study a population X having a probability
distribution f(x) (probability density function or probability mass
function).
Definition 5.2: A vector of n random variables (X1 , X2 , . . . , Xn ),
where Xi are independently and identically distributed with the
probability distribution f (x) is called a random sample of size n taken
from the population X .
The joint probability distribution (pdf or pmf) of the random sample
(X1 , X2 , . . . , Xn ) is

f (x1 , x2 , . . . , xn ) = f (x1 )f (x2 ) . . . f (xn )

and is called the likelihood function.


Example

Let X be the electricity bill (in thousands of dong) of a household in a
region of Vietnam (in June 2020). The population is the set of all
households in this region (about one million households).
Suppose that X follows a normal distribution with mean µ and
variance σ², with probability density function f(x; µ, σ²).
A random vector (X1, X2, ..., X50), where the Xi are i.i.d. random variables
having the same normal distribution f(x; µ, σ²), is called a random
sample of size 50 drawn from the population X.
Observing the electricity bills of 50 households from this region gives
the sample (x1, x2, ..., x50) = (255, 367, ..., 423); this sample is a
realization of the random sample (X1, X2, ..., X50).


Statistic

Definition 5.3: A statistic is a function f(X1, X2, ..., Xn) of the
random sample (X1, X2, ..., Xn).
Example: The random sample mean
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
is a statistic.
A statistic is a random variable and the distribution of a statistic is
called a sampling distribution.


Some important statistics

The random sample mean:
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
The non-adjusted random sample variance:
$$\hat{S}^2 = \frac{(X_1 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n}$$
The adjusted random sample variance:
$$S^2 = \frac{(X_1 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n-1}$$
The adjusted random sample standard deviation: $S = \sqrt{S^2}$
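The statistics defined here can be computed directly from a list of observations. The following is a minimal Python sketch; the function name `sample_statistics` and the four data values are illustrative assumptions, not part of the lecture.

```python
import math

def sample_statistics(xs):
    """Return (mean, non-adjusted variance, adjusted variance, adjusted std)."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations from the mean
    var_unadj = ss / n        # non-adjusted variance: divide by n
    var_adj = ss / (n - 1)    # adjusted variance: divide by n - 1
    return mean, var_unadj, var_adj, math.sqrt(var_adj)

# Hypothetical observations, chosen only for illustration.
mean, var_unadj, var_adj, std_adj = sample_statistics([2.0, 4.0, 4.0, 6.0])
print(mean, var_unadj, var_adj, std_adj)
```

The only difference between the two variances is the divisor, so the adjusted variance is always the non-adjusted one multiplied by n/(n − 1).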


Some important statistics

The Z-statistic:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
The T-statistic:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$



Sampling distribution of random sample mean

Sampling distribution of random sample mean

Consider a random sample (X1, X2, ..., Xn) taken from a population
X. Denote by µ = E(X) and σ² = V(X).
The random sample mean:
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}.$$
Theorem 5.1: For any distribution of X, we have $E(\bar{X}) = \mu$ and
$V(\bar{X}) = \sigma^2/n$.
Theorem 5.2: If X is normal, $X \sim N(\mu, \sigma^2)$, then $\bar{X}$ is also normal:
$\bar{X} \sim N(\mu, \sigma^2/n)$. So the Z-statistic is standard normal:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$


Sampling distribution of random sample mean

Consider a random sample (X1, X2, ..., Xn) taken from a population
X. Denote by µ = E(X) and σ² = V(X).
Theorem 5.3 (the central limit theorem):
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{n \to +\infty} N(0, 1)$$
When n is large enough, $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \approx N(0, 1)$, or equivalently
$\bar{X} \approx N(\mu, \sigma^2/n)$, for any distribution of X with finite variance σ².
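The approximation in the central limit theorem can be explored by simulation. The sketch below is only an illustration under stated assumptions: it draws samples of size 50 from an Exponential(1) population (so µ = σ = 1), uses a fixed seed, and checks how often the standardized mean falls inside ±1.645, which should happen about 90% of the time if the normal approximation is reasonable.

```python
import math
import random

def standardized_mean(n, rng):
    """Draw n Exponential(1) values and return Z = (mean - mu) / (sigma / sqrt(n))."""
    xs = [rng.expovariate(1.0) for _ in range(n)]
    mean = sum(xs) / n
    return (mean - 1.0) / (1.0 / math.sqrt(n))  # mu = sigma = 1 for Exponential(1)

rng = random.Random(0)  # fixed seed so the run is repeatable
zs = [standardized_mean(50, rng) for _ in range(5000)]
inside = sum(1 for z in zs if abs(z) <= 1.645) / len(zs)
print(round(inside, 3))  # expected to be close to 0.90
```

The population, sample size, and number of repetitions are arbitrary choices; a more skewed population or a smaller n would typically give a proportion further from 0.90.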


Sampling distribution of random sample mean

Consider a random sample (X1, X2, ..., Xn) taken from a population
X. Denote by µ = E(X) and σ² = V(X).
Theorem 5.4: If X is normal, $X \sim N(\mu, \sigma^2)$, then the T-statistic
follows Student's t distribution with n − 1 degrees of freedom:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}.$$
Theorem 5.5 (the central limit theorem + Slutsky's theorem): For any
distribution of X:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \xrightarrow{n \to +\infty} N(0, 1)$$



Estimation Point estimation

Introduction
Consider a random sample (X1 , X2 , . . . , Xn ) taken from a population
X.
Suppose that the population X follows a distribution F(x; θ) that
depends on an unknown parameter θ.
The parameter θ is unknown since we usually cannot observe the whole
population.
Problem of estimation: estimate the parameter θ
based on the random sample (X1, X2, ..., Xn).
Definition 5.4: A point estimator of θ is a statistic
θ̂ = h(X1 , X2 , . . . , Xn ).
A confidence interval estimation of θ with a confidence level 1 − α is
a random interval [θ̂1 ; θ̂2 ] = [h1 (X1 , X2 , . . . , Xn ); h2 (X1 , X2 , . . . , Xn )]
such that:
P(θ̂1 ≤ θ ≤ θ̂2 ) = 1 − α

Example

Let X be the electricity bill (in thousands of dong) of a household in a region of
Vietnam (in June 2020). The electricity bills of 200 households
from this region were observed, giving the following data:


Example

The histogram for these data is the following:


Example

The distribution of data can be approximated by a normal distribution:


Example
Modelling: We can suppose that the electricity bills of households in
this region follow a normal distribution with parameter θ = (µ, σ²)
and the probability density function:
$$f(x; \theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
The parameter µ is the population mean (the mean electricity bill of
all households) and the parameter σ² is the population variance.
A point estimator of µ is the random sample mean
$$\bar{X} = \frac{1}{n}(X_1 + X_2 + \dots + X_n)$$
For the given sample, the sample mean
$$\bar{x} = \frac{1}{n}(x_1 + x_2 + \dots + x_n) = 236.78$$
is called a point estimate of µ.
Estimation Maximum likelihood estimation methods

Maximum likelihood estimation (MLE)


Consider a random sample (X1, X2, ..., Xn) taken from a population
X that follows a distribution f(x; θ) (the pdf or pmf).
The likelihood function of θ is defined by:
$$L(\theta) = \prod_{i=1}^{n} f(X_i; \theta)$$
The likelihood function measures how plausible the observed sample is
under the model: for a discrete X it is the probability that the sample
was observed, and for a continuous X it is the joint density at the sample.
This value depends on the parameter θ of the model; different parameter
values give different likelihoods.
The idea of MLE is to find the parameter θ under which the model
fits the observed sample best, that is, the θ that maximizes
the likelihood function.
Maximum likelihood estimator: the value of θ that maximizes the
likelihood function L(θ) or, equivalently, log L(θ):
$$\hat{\theta} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} \log L(\theta)$$

Maximum likelihood estimation (MLE)


Procedure to find the maximum likelihood estimator:
Step 1: Write the pdf or pmf of X: f(x; θ).
Step 2: Compute the likelihood function of θ: $L(\theta) = \prod_{i=1}^{n} f(X_i; \theta)$
Step 3: Compute the log-likelihood function:
$$\log L(\theta) = \sum_{i=1}^{n} \log f(X_i; \theta)$$
Step 4: Solve the equation
$$\frac{\partial \log L(\theta)}{\partial \theta} = 0$$
and let $\hat{\theta}$ be the solution, then prove that
$$\left.\frac{\partial^2 \log L(\theta)}{\partial \theta^2}\right|_{\theta = \hat{\theta}} < 0.$$

Maximum likelihood estimation (MLE)


Example 5.1: Let X be the lifetime of a type of battery produced by a
factory and suppose that X follows an exponential distribution with
parameter λ > 0. Find the maximum likelihood estimator of λ.
Step 1: The probability density function (pdf) of X is
$$f(x; \lambda) = \lambda e^{-\lambda x}, \quad x > 0.$$
Step 2: The likelihood function of λ:
$$L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda X_i} = \lambda^n e^{-\lambda \sum_{i=1}^{n} X_i}$$
Step 3: The log-likelihood function:
$$\log L(\lambda) = n \log \lambda - \lambda \sum_{i=1}^{n} X_i$$


Maximum likelihood estimation (MLE)


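Step 4: Solve the equation
$$\frac{\partial \log L(\lambda)}{\partial \lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} X_i = 0$$
we obtain the solution
$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} X_i} = \frac{1}{\bar{X}}.$$
Since
$$\frac{\partial^2 \log L(\lambda)}{\partial \lambda^2} = -\frac{n}{\lambda^2} < 0, \quad \text{for all } \lambda > 0,$$
the maximum likelihood estimator of λ is
$$\hat{\lambda} = \frac{1}{\bar{X}}.$$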
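The closed-form result for the exponential case can be checked numerically. The sketch below is an illustration only: the lifetimes are hypothetical values, and the helper names `exp_log_likelihood` and `exp_mle` are assumptions. It evaluates the log-likelihood n log λ − λ ΣXi at the estimate and at a few nearby values of λ.

```python
import math

def exp_log_likelihood(lam, xs):
    """log L(lambda) = n * log(lambda) - lambda * sum(x_i) for exponential data."""
    return len(xs) * math.log(lam) - lam * sum(xs)

def exp_mle(xs):
    """Closed-form maximizer: n / sum(x_i), i.e. 1 / sample mean."""
    return len(xs) / sum(xs)

lifetimes = [1.2, 0.4, 2.5, 0.9, 3.0]  # hypothetical lifetimes, for illustration
lam_hat = exp_mle(lifetimes)

# The log-likelihood at lam_hat should be at least as large as at other values.
others = [lam_hat * f for f in (0.5, 0.9, 1.1, 2.0)]
is_max = all(
    exp_log_likelihood(lam_hat, lifetimes) >= exp_log_likelihood(lam, lifetimes)
    for lam in others
)
print(lam_hat, is_max)
```

Comparing against a handful of values is not a proof of maximality; the second-derivative argument is what establishes it.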


Maximum likelihood estimation (MLE) of normal distribution

Consider a random sample (X1, X2, ..., Xn) drawn from a normal
population X with mean µ and variance σ². Find the MLE
of θ = (µ, σ²).
The pdf of X is
$$f(x; \theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
The likelihood function is
$$L(\theta) = \prod_{i=1}^{n} f(X_i; \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(X_i - \mu)^2}{2\sigma^2}\right)$$
The log-likelihood function is
$$\log L(\theta) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2$$


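Maximum likelihood estimation (MLE) of normal distribution

Solve the following system of equations:
$$\frac{\partial \log L(\theta)}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \mu) = 0$$
$$\frac{\partial \log L(\theta)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (X_i - \mu)^2 = 0$$
Obtain the MLE of µ and σ² as follows:
$$\hat{\mu} = \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \quad \text{and} \quad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$$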


Maximum likelihood estimation (MLE) of proportion

Let p be the proportion of defective items in a production line. Find the
MLE of p.
Consider a random sample of n items and define random variables Xi
equal to 1 if the i-th item in the sample is defective and equal
to 0 otherwise. Then the random sample (X1, X2, ..., Xn) is drawn
from a Bernoulli population X with parameter p.
The pmf of X is
$$f(x; p) = p^x (1 - p)^{1 - x}, \quad x = 0, 1.$$
The likelihood function is
$$L(p) = \prod_{i=1}^{n} f(X_i; p) = \prod_{i=1}^{n} p^{X_i} (1 - p)^{1 - X_i} = p^{\sum_{i=1}^{n} X_i} (1 - p)^{n - \sum_{i=1}^{n} X_i}$$


Maximum likelihood estimation (MLE) of proportion

The log-likelihood function is
$$\log L(p) = \sum_{i=1}^{n} X_i \log p + \left(n - \sum_{i=1}^{n} X_i\right) \log(1 - p)$$
Solve the following equation:
$$\frac{\partial \log L(p)}{\partial p} = \frac{\sum_{i=1}^{n} X_i}{p} - \frac{n - \sum_{i=1}^{n} X_i}{1 - p} = 0$$
Obtain the MLE of p as follows:
$$\hat{p} = \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$



Estimation Method of moments

Method of moments
Definition 5.5: Let X be a random variable. The k-th moment of X is
$E[X^k]$, for $k \in \mathbb{N}^*$.
Definition 5.6: Let X be a random variable and (X1, X2, ..., Xn) be a
random sample drawn from the population X. The k-th sample
moment of X is
$$\frac{1}{n}(X_1^k + \dots + X_n^k) = \frac{1}{n} \sum_{i=1}^{n} X_i^k$$
Method of moments: Let X be a population with a probability
distribution f(x; θ), where θ is an unknown parameter in $\mathbb{R}^r$. The
estimator of θ by the method of moments is the solution of the
following system of equations:
$$E[X^k] = \frac{1}{n} \sum_{i=1}^{n} X_i^k, \quad k = 1, \dots, r.$$


Method of moments
Example 5.2: Let X be the lifetime of a type of battery produced by a
factory and suppose that X follows an exponential distribution with
parameter λ > 0. Find the estimator of λ by the method of moments.
The parameter λ is in $\mathbb{R}_+^*$, so the dimension r = 1.
The 1st moment of X is $E[X] = 1/\lambda$.
The 1st sample moment of X is
$$\frac{1}{n}(X_1 + \dots + X_n) = \bar{X}$$
We solve the equation:
$$E[X] = \frac{1}{n} \sum_{i=1}^{n} X_i \iff \frac{1}{\lambda} = \bar{X} \iff \lambda = \frac{1}{\bar{X}}.$$
The estimator of λ by the method of moments is
$$\hat{\lambda}_{MM} = \frac{1}{\bar{X}} = \hat{\lambda}_{MLE}.$$


Unbiased estimator

Definition 5.7: A point estimator $\hat{\theta}$ of θ is called unbiased if $E(\hat{\theta}) = \theta$.
Example: Consider a random sample (X1, X2, ..., Xn) drawn from a
population X with mean µ and variance σ².
We can prove that:
$$E(\bar{X}) = \mu \quad \text{and} \quad E(\hat{S}^2) = \frac{n-1}{n} \sigma^2.$$
Then $\bar{X}$ is an unbiased estimator of µ and $\hat{S}^2$ is a biased estimator of σ².
We adjust $\hat{S}^2$ to obtain an unbiased estimator of σ² as follows:
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
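The bias of the non-adjusted variance can be verified exactly on a small case by enumerating every possible sample. The following Python sketch assumes a hypothetical population taking the values 0, 1, 2 with equal probability and samples of size n = 2; these choices are purely illustrative.

```python
from itertools import product

values = [0, 1, 2]   # hypothetical population values, each with probability 1/3
n = 2                # sample size
mu = sum(values) / len(values)
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)  # population variance

def variances(sample):
    """Return (non-adjusted variance, adjusted variance) of one sample."""
    m = sum(sample) / len(sample)
    ss = sum((x - m) ** 2 for x in sample)
    return ss / len(sample), ss / (len(sample) - 1)

# All i.i.d. samples of size n are equally likely, so expectations are plain averages.
samples = list(product(values, repeat=n))
e_unadj = sum(variances(s)[0] for s in samples) / len(samples)
e_adj = sum(variances(s)[1] for s in samples) / len(samples)
print(sigma2, e_unadj, e_adj)
```

In this small case the average of the non-adjusted variance equals (n − 1)/n times σ², while the average of the adjusted variance equals σ² itself.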



Confidence interval estimation

Confidence interval estimation

A confidence interval estimation of θ with confidence level 1 − α is
a random interval $[\hat{\theta}_1; \hat{\theta}_2]$ such that $P(\hat{\theta}_1 \le \theta \le \hat{\theta}_2) = 1 - \alpha$.
Procedure for finding a confidence interval estimation:
Find a point estimator $\hat{\theta}$ of θ.
Use the sampling distribution of $\hat{\theta}$ or the central limit theorem,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \approx N(0; 1)$$
(where µ and σ are functions of θ), to find an interval $[\hat{\theta}_1; \hat{\theta}_2]$
such that $P(\hat{\theta}_1 \le \theta \le \hat{\theta}_2) = 1 - \alpha$.



Confidence interval estimation Confidence interval estimation of µ

Confidence interval estimation of µ


Problem 1: Consider a random sample (X1 , X2 , . . . , Xn ) taken from a
population X with a mean of µ = E (X ) and a variance of σ 2 = V (X ).
Find a 1 − α confidence interval estimation of µ.
Solution:
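Case 1: The population X is normal: $X \sim N(\mu, \sigma^2)$, where σ² is known.
A point estimator of µ is the random sample mean $\bar{X}$.
The sampling distribution of $\bar{X}$ is also normal: $\bar{X} \sim N(\mu, \sigma^2/n)$.
The statistic $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0; 1)$.
Let $Z_{\alpha/2}$ be the critical value of N(0; 1) at level 1 − α/2, meaning
that $P(Z < Z_{\alpha/2}) = 1 - \alpha/2$.
We have
$$P\left(-Z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le Z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$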

Confidence interval estimation of µ

Case 1: The population X is normal: $X \sim N(\mu, \sigma^2)$, where σ² is known.
We have
$$P\left(\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
Then a 1 − α confidence interval (CI) estimation of µ is:
$$\left[\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}};\ \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right] = \bar{X} \mp \varepsilon,$$
where $\varepsilon = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ is called the error of the CI.


Confidence interval estimation of µ


Case 2: The population X is normal: $X \sim N(\mu, \sigma^2)$, where σ² is unknown.
A point estimator of µ is the random sample mean $\bar{X}$.
We use the T-statistic $T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$, where $t_{n-1}$ is Student's t
distribution with n − 1 degrees of freedom.
Let $t_{n-1;\alpha/2}$ be the critical value of $t_{n-1}$ at level 1 − α/2, meaning
that $P(t_{n-1} < t_{n-1;\alpha/2}) = 1 - \alpha/2$.
We have
$$P\left(-t_{n-1;\alpha/2} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{n-1;\alpha/2}\right) = 1 - \alpha$$


Confidence interval estimation of µ


Case 2: The population X is normal: $X \sim N(\mu, \sigma^2)$, where σ² is unknown.
We have
$$P\left(\bar{X} - t_{n-1;\alpha/2} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1;\alpha/2} \frac{S}{\sqrt{n}}\right) = 1 - \alpha$$
Then a 1 − α confidence interval (CI) estimation of µ is:
$$\left[\bar{X} - t_{n-1;\alpha/2} \frac{S}{\sqrt{n}};\ \bar{X} + t_{n-1;\alpha/2} \frac{S}{\sqrt{n}}\right] = \bar{X} \mp \varepsilon,$$
where $\varepsilon = t_{n-1;\alpha/2} \frac{S}{\sqrt{n}}$ is called the error of the CI.



Confidence interval estimation of µ

Case 3: The population X is non-normal, n is large enough and the
population variance σ² is known.
The statistic $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \approx N(0; 1)$.
Then a 1 − α confidence interval (CI) estimation of µ is:
$$\left[\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}};\ \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right] = \bar{X} \mp \varepsilon,$$
where $\varepsilon = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ is called the error of the CI.


Confidence interval estimation of µ

Case 4: The population X is non-normal, n is large enough and the
population variance σ² is unknown.
The statistic $Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \approx N(0; 1)$.
Then a 1 − α confidence interval (CI) estimation of µ is:
$$\left[\bar{X} - Z_{\alpha/2} \frac{S}{\sqrt{n}};\ \bar{X} + Z_{\alpha/2} \frac{S}{\sqrt{n}}\right] = \bar{X} \mp \varepsilon,$$
where $\varepsilon = Z_{\alpha/2} \frac{S}{\sqrt{n}}$ is called the error of the CI.


Confidence interval estimation of µ

Example 5.3: Let X be the telephone bill (in USD) of a customer
in a city. Suppose that X follows a normal distribution N(µ, σ²). A sample
of 20 customers gave the following data:

31.3, 28.8, 30.8, 29.6, 32.5, 30.1, 28.6, 32.2, 30.8, 32.6,
31.8, 28.5, 29.9, 27.2, 36.0, 30.6, 29.2, 30.9, 31.0, 30.8

Find the point estimates of µ and σ² by the method of moments.
Find the point estimates of µ and σ² by the MLE method.
Find a 90% confidence interval estimate of µ.
Suppose that the standard deviation σ is known to equal 1.5. Find
a 90% confidence interval estimate of µ.


Confidence interval estimation of µ

Solution of Example 5.3: The point estimators of µ and σ² by the method of
moments:
The parameter θ = (µ, σ²) is in $\mathbb{R}^2$, so the dimension r = 2.
We have $E(X) = \mu$ and $E(X^2) = \mu^2 + \sigma^2$.
We solve the following system of equations:
$$E[X] = \frac{1}{n} \sum_{i=1}^{n} X_i \quad \text{and} \quad E[X^2] = \frac{1}{n} \sum_{i=1}^{n} X_i^2$$
The point estimators of µ and σ² by the method of moments are:
$$\hat{\mu}_{MM} = \bar{X} \quad \text{and} \quad \hat{\sigma}^2_{MM} = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \hat{S}^2.$$


Confidence interval estimation of µ


Solution of Example 5.3: The point estimators of µ and σ² by the method of
moments are:
$$\hat{\mu}_{MM} = \bar{X} \quad \text{and} \quad \hat{\sigma}^2_{MM} = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \hat{S}^2.$$
For the given sample, the point estimates of µ and σ² by the method of
moments are:
$$\hat{\mu}_{MM} = \bar{x} = \frac{1}{20}(31.3 + \dots + 30.8) = 30.66$$
and
$$\hat{\sigma}^2_{MM} = \hat{s}^2 = \frac{1}{20}(31.3^2 + \dots + 30.8^2 - 20 \cdot 30.66^2) = 3.4234$$

Confidence interval estimation of µ

Solution of Example 5.3:
The point estimators of µ and σ² by the maximum likelihood
estimation method are:
$$\hat{\mu}_{MLE} = \bar{X} \quad \text{and} \quad \hat{\sigma}^2_{MLE} = \hat{S}^2.$$
For the given sample, the point estimates of µ and σ² by the
maximum likelihood estimation method are:
$$\hat{\mu}_{MLE} = \bar{x} = \frac{1}{20}(31.3 + \dots + 30.8) = 30.66$$
and
$$\hat{\sigma}^2_{MLE} = \hat{s}^2 = \frac{1}{20}(31.3^2 + \dots + 30.8^2 - 20 \cdot 30.66^2) = 3.4234$$
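As a numerical check, the two point estimates can be recomputed from the 20 observations of Example 5.3. This is a minimal Python sketch; the variable names are illustrative assumptions.

```python
# Telephone bills (USD) from Example 5.3.
bills = [31.3, 28.8, 30.8, 29.6, 32.5, 30.1, 28.6, 32.2, 30.8, 32.6,
         31.8, 28.5, 29.9, 27.2, 36.0, 30.6, 29.2, 30.9, 31.0, 30.8]

n = len(bills)
mu_hat = sum(bills) / n                                   # sample mean
sigma2_hat = sum(x * x for x in bills) / n - mu_hat ** 2  # non-adjusted variance

print(round(mu_hat, 2), round(sigma2_hat, 4))
```

For a normal population the method of moments and the MLE method lead to the same pair of estimates, so one computation covers both.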


Confidence interval estimation of µ


Solution of Example 5.3: Find a 90% confidence interval estimate of µ.
Since $X \sim N(\mu; \sigma^2)$ where σ² is unknown, we use the following
statistic
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1},$$
so a 1 − α confidence interval (CI) estimation of µ is:
$$\left[\bar{X} - t_{n-1;\alpha/2} \frac{S}{\sqrt{n}};\ \bar{X} + t_{n-1;\alpha/2} \frac{S}{\sqrt{n}}\right]$$
For the given sample, we have
n = 20; $\bar{x} = 30.66$; $s^2 = \frac{n}{n-1} \hat{s}^2 = 3.604$; $s = \sqrt{3.604} = 1.9$;
1 − α = 90%, so $t_{n-1;\alpha/2} = t_{19;0.05} = 1.73$.
So the CI of µ is
$$30.66 \mp 1.73 \cdot \frac{1.9}{\sqrt{20}} = 30.66 \mp 0.735 = [29.925; 31.395]$$
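The t-based interval can be reproduced with a short Python sketch. The Python standard library has no Student t quantile function, so the critical value is passed in as an argument; here it is the tabulated value 1.73 for 19 degrees of freedom. The function name `t_interval` is an illustrative assumption.

```python
import math

def t_interval(xs, t_crit):
    """CI for the mean with unknown variance: mean -/+ t_crit * s / sqrt(n)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))  # adjusted std
    eps = t_crit * s / math.sqrt(n)                             # error of the CI
    return mean - eps, mean + eps

bills = [31.3, 28.8, 30.8, 29.6, 32.5, 30.1, 28.6, 32.2, 30.8, 32.6,
         31.8, 28.5, 29.9, 27.2, 36.0, 30.6, 29.2, 30.9, 31.0, 30.8]

low, high = t_interval(bills, 1.73)  # t_{19;0.05} from a t table
print(round(low, 2), round(high, 2))
```

Small differences from hand calculations in the third decimal place come from rounding s to 1.9 before multiplying.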

Confidence interval estimation of µ


Solution of Example 5.3: Find a 90% confidence interval estimate of µ when
σ = 1.5.
Since $X \sim N(\mu; \sigma^2)$ where σ is known to equal 1.5, we use the
following statistic
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0; 1),$$
so a 1 − α confidence interval (CI) estimation of µ is:
$$\left[\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}};\ \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right]$$
For the given sample, we have n = 20; $\bar{x} = 30.66$; 1 − α = 90%, so
$Z_{\alpha/2} = Z_{0.05} = 1.645$.
So the CI of µ is
$$30.66 \mp 1.645 \cdot \frac{1.5}{\sqrt{20}} = 30.66 \mp 0.55 = [30.11; 31.21]$$
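When σ is known, the interval only needs a standard normal quantile, which Python's `statistics.NormalDist` provides. The sketch below is illustrative; the function name `z_interval` is an assumption, and the inputs are the sample mean 30.66, σ = 1.5 and the sample size of 20 customers.

```python
import math
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence):
    """CI for the mean with known sigma: mean -/+ z * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value Z_{alpha/2}
    eps = z * sigma / math.sqrt(n)           # error of the CI
    return mean - eps, mean + eps

low, high = z_interval(30.66, 1.5, 20, 0.90)
print(round(low, 2), round(high, 2))
```

Because σ is fixed rather than estimated, this interval uses the normal quantile (about 1.645 at the 90% level) instead of the larger t quantile.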
Confidence interval estimation Confidence interval estimation of σ 2

Confidence interval estimation of σ 2


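Problem 2: Let (X1, X2, ..., Xn) be a random sample taken from a normal
population $X \sim N(\mu, \sigma^2)$. Find a 1 − α confidence interval estimate of σ².
The point estimator of σ² is S².
The sampling distribution of S² is the following
$$\frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1},$$
where $\chi^2_{n-1}$ is the Chi-squared distribution with n − 1 degrees of
freedom.
Let $\chi^2_{n-1;1-\alpha/2}$ and $\chi^2_{n-1;\alpha/2}$ be the critical values of the Chi-squared
distribution $\chi^2_{n-1}$ at levels α/2 and 1 − α/2, meaning that
$P(\chi^2_{n-1} < \chi^2_{n-1;1-\alpha/2}) = \alpha/2$ and $P(\chi^2_{n-1} < \chi^2_{n-1;\alpha/2}) = 1 - \alpha/2$.
We have
$$P\left(\chi^2_{n-1;1-\alpha/2} \le \frac{(n-1) S^2}{\sigma^2} \le \chi^2_{n-1;\alpha/2}\right) = 1 - \alpha$$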

Confidence interval estimation of σ 2


We have
$$P\left(\chi^2_{n-1;1-\alpha/2} \le \frac{(n-1) S^2}{\sigma^2} \le \chi^2_{n-1;\alpha/2}\right) = 1 - \alpha$$
Then
$$P\left(\frac{(n-1) S^2}{\chi^2_{n-1;\alpha/2}} \le \sigma^2 \le \frac{(n-1) S^2}{\chi^2_{n-1;1-\alpha/2}}\right) = 1 - \alpha$$
A 1 − α confidence interval estimate of σ² is
$$\left[\frac{(n-1) S^2}{\chi^2_{n-1;\alpha/2}};\ \frac{(n-1) S^2}{\chi^2_{n-1;1-\alpha/2}}\right]$$


Confidence interval estimation of σ 2


Example 5.4: Let X be the telephone bill (in USD) of a customer
in a city. Suppose that X follows a normal distribution N(µ, σ²). A sample
of 20 customers gave the following data:
31.3, 28.8, 30.8, 29.6, 32.5, 30.1, 28.6, 32.2, 30.8, 32.6,
31.8, 28.5, 29.9, 27.2, 36.0, 30.6, 29.2, 30.9, 31.0, 30.8
Find a 90% confidence interval estimate of σ².
A 1 − α confidence interval estimate of σ² is
$$\left[\frac{(n-1) S^2}{\chi^2_{n-1;\alpha/2}};\ \frac{(n-1) S^2}{\chi^2_{n-1;1-\alpha/2}}\right],$$
where n = 20; s² = 3.604; 1 − α = 0.9, so
$\chi^2_{n-1;1-\alpha/2} = \chi^2_{19;0.95} = 10.12$; $\chi^2_{n-1;\alpha/2} = \chi^2_{19;0.05} = 30.14$.
Then a 90% confidence interval estimate of σ² is
$$\left[\frac{19 \cdot 3.604}{30.14};\ \frac{19 \cdot 3.604}{10.12}\right] = [2.27; 6.77].$$
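The variance interval is a direct arithmetic consequence of the two Chi-squared critical values. Since the Python standard library has no Chi-squared quantile function, the sketch below takes the tabulated values 10.12 and 30.14 as inputs; the function name `variance_interval` is an illustrative assumption.

```python
def variance_interval(s2, n, chi2_lower, chi2_upper):
    """CI for sigma^2: [(n-1) s^2 / chi2_upper, (n-1) s^2 / chi2_lower].

    chi2_lower and chi2_upper are the lower-tail and upper-tail critical
    values of the Chi-squared distribution with n - 1 degrees of freedom.
    """
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Values from Example 5.4; the critical values come from a Chi-squared table.
low, high = variance_interval(3.604, 20, 10.12, 30.14)
print(round(low, 2), round(high, 2))
```

The larger critical value produces the lower endpoint, which is why the two quantiles appear in swapped order in the interval.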
Confidence interval estimation Confidence interval estimation of p

Confidence interval estimation of population proportion p


Problem 3: Let p be a population proportion, for example, the
proportion of defective items in a production line. Find a 1 − α confidence
interval estimate of p.
Consider a random sample of size n from the population.
A point estimator of p is $\hat{p}$, the sample proportion (example: the
proportion of defective items in a sample of n items).
By the following limit theorem
$$Z = \frac{\hat{p} - p}{\sqrt{\hat{p}(1 - \hat{p})/n}} \approx N(0, 1)$$
we obtain the following 1 − α confidence interval estimate of p:
$$\hat{p} \mp \varepsilon = \hat{p} \mp Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
where $\varepsilon = Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ is the error of the CI.

Confidence interval estimation of population proportion p

Example 5.5: Let p be the proportion of defective items in a production
line. A random sample of 120 items from the line was examined and
6 defective items were found. Find a 90% confidence interval estimate of p.
A 90% confidence interval estimate of p is
$$\hat{p} \mp \varepsilon = \hat{p} \mp Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
where n = 120; $\hat{p} = 6/120 = 0.05$; 1 − α = 0.9, so
$Z_{\alpha/2} = Z_{0.05} = 1.645$.
So the CI of p is
$$0.05 \mp 1.645 \sqrt{\frac{0.05 \cdot 0.95}{120}} = 5\% \mp 3.3\% = [1.7\%; 8.3\%].$$
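The proportion interval depends on the confidence level only through the normal critical value, so it is convenient to keep the level as a parameter. The Python sketch below is illustrative; the function name `proportion_interval` is an assumption. With 6 defective items out of 120, a 90% level uses a critical value of about 1.645 and a 95% level uses about 1.96.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Large-sample CI for a proportion: p_hat -/+ z * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    eps = z * math.sqrt(p_hat * (1 - p_hat) / n)        # error of the CI
    return p_hat - eps, p_hat + eps

low90, high90 = proportion_interval(6, 120, 0.90)
low95, high95 = proportion_interval(6, 120, 0.95)
print(round(low90, 3), round(high90, 3))
print(round(low95, 3), round(high95, 3))
```

The 95% interval is wider than the 90% interval, as expected when more confidence is required from the same sample.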



Confidence interval estimation General problem

Confidence interval estimation


General problem: Observe a population X with the pdf (or pmf) f(x; θ),
where θ is an unknown parameter to estimate. Find a 1 − α confidence
interval estimation of θ.
Consider a random sample (X1, X2, ..., Xn) taken from the population
X.
Find a point estimator $\hat{\theta}$ of θ.
Use the sampling distribution of $\hat{\theta}$ or a limit theorem, for example:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - g_1(\theta)}{g_2(\theta)/\sqrt{n}} \approx N(0; 1)$$
From the equation
$$P\left(-Z_{\alpha/2} \le \frac{\bar{X} - g_1(\theta)}{g_2(\theta)/\sqrt{n}} \le Z_{\alpha/2}\right) = 1 - \alpha$$
we find an interval $[\hat{\theta}_1; \hat{\theta}_2]$ such that $P(\hat{\theta}_1 \le \theta \le \hat{\theta}_2) = 1 - \alpha$.



Confidence interval estimation

Example 5.6: Let X be the lifetime (in years) of a mechanical part.
Suppose that X follows an exponential distribution with rate parameter
λ.
Construct a 1 − α confidence interval estimation of λ.
Given the following sample:

X              [0, 1]  (1, 2]  (2, 3]  (3, 4]  (4, 5]  (5, 6]  (6, 7]
No. of parts     20      12       8       3       3       2       2

Find a 90% confidence interval estimate of λ for this sample.


Confidence interval estimation


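Solution:
Since $X \sim E(\lambda)$, the pdf of X is $f(x; \lambda) = \lambda e^{-\lambda x}$, for x > 0, and
$\mu = E(X) = 1/\lambda$; $\sigma^2 = V(X) = 1/\lambda^2$, so $\sigma = 1/\lambda$.
By the central limit theorem:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - 1/\lambda}{(1/\lambda)/\sqrt{n}} = (\bar{X}\lambda - 1)\sqrt{n} \approx N(0; 1)$$
From the equation
$$P\left(-Z_{\alpha/2} \le (\bar{X}\lambda - 1)\sqrt{n} \le Z_{\alpha/2}\right) = 1 - \alpha$$
or
$$P\left(\frac{1 - Z_{\alpha/2}/\sqrt{n}}{\bar{X}} \le \lambda \le \frac{1 + Z_{\alpha/2}/\sqrt{n}}{\bar{X}}\right) = 1 - \alpha$$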

Confidence interval estimation

Solution:
Then a 1 − α confidence interval estimation of λ is
$$\left[\frac{1 - Z_{\alpha/2}/\sqrt{n}}{\bar{X}};\ \frac{1 + Z_{\alpha/2}/\sqrt{n}}{\bar{X}}\right]$$
For the given sample, we have n = 50; 1 − α = 0.9, so
$Z_{\alpha/2} = Z_{0.05} = 1.645$;
$\bar{x} = (20 \cdot 0.5 + 12 \cdot 1.5 + \dots + 2 \cdot 6.5)/50 = 1.92$.
So the 90% CI of λ is
$$\frac{1 \mp 1.645/\sqrt{50}}{1.92} = [0.4; 0.64]$$
We are 90% confident that the parameter λ is between 0.4 and 0.64.


Confidence interval estimation

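Solution: 2nd method
Using the following limit theorem:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\bar{X} - 1/\lambda}{S/\sqrt{n}} \approx N(0; 1)$$
From the equation
$$P\left(-Z_{\alpha/2} \le \frac{\bar{X} - 1/\lambda}{S/\sqrt{n}} \le Z_{\alpha/2}\right) = 1 - \alpha$$
or
$$P\left(\frac{1}{\bar{X} + Z_{\alpha/2} \frac{S}{\sqrt{n}}} \le \lambda \le \frac{1}{\bar{X} - Z_{\alpha/2} \frac{S}{\sqrt{n}}}\right) = 1 - \alpha$$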


Confidence interval estimation

Solution:
Then a 1 − α confidence interval estimation of λ is
$$\left[\frac{1}{\bar{X} + Z_{\alpha/2} \frac{S}{\sqrt{n}}};\ \frac{1}{\bar{X} - Z_{\alpha/2} \frac{S}{\sqrt{n}}}\right]$$
For the given sample, we have n = 50; $Z_{\alpha/2} = Z_{0.05} = 1.645$;
$\bar{x} = 1.92$; $s^2 = (20 \cdot 0.5^2 + \dots + 2 \cdot 6.5^2 - 50 \cdot 1.92^2)/49 = 2.861$;
$s = \sqrt{2.861} = 1.69$
So the 90% CI of λ is
$$\frac{1}{1.92 \pm 1.645 \cdot \frac{1.69}{\sqrt{50}}} = [0.43; 0.65]$$
We are 90% confident that the parameter λ is between 0.43 and 0.65.
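Both intervals for the exponential rate can be recomputed from the grouped data. The Python sketch below is illustrative and rests on the same assumption as the hand calculation: each part is assigned the midpoint of its class. The variable names are hypothetical.

```python
import math

# Grouped lifetimes from Example 5.6: class midpoints and class counts.
midpoints = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
counts = [20, 12, 8, 3, 3, 2, 2]
z = 1.645  # Z_{0.05}, critical value for a 90% interval

n = sum(counts)
mean = sum(c * m for c, m in zip(counts, midpoints)) / n
s2 = (sum(c * m * m for c, m in zip(counts, midpoints)) - n * mean ** 2) / (n - 1)
s = math.sqrt(s2)

# Method 1: uses sigma = 1/lambda inside the Z-statistic.
ci_clt = ((1 - z / math.sqrt(n)) / mean, (1 + z / math.sqrt(n)) / mean)

# Method 2: replaces sigma by the sample standard deviation S.
half = z * s / math.sqrt(n)
ci_plugin = (1 / (mean + half), 1 / (mean - half))

print([round(v, 2) for v in ci_clt], [round(v, 2) for v in ci_plugin])
```

The two intervals are close but not identical, because they approximate the standard deviation of the population in different ways.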


Confidence interval estimation

Example 5.7: Let X be the number of accidents per week in a small city.
Suppose that X follows a Poisson distribution with mean parameter
λ.
Find the point estimator of λ by the method of moments and by the
MLE method.
Construct a 1 − α confidence interval estimation of λ.
Given the following sample:

X              0    1    2    3    4
No. of weeks   7   15   10   12    6

Find a 90% confidence interval estimate of λ for this sample.


Confidence interval estimation

Solution: The point estimator of λ by the method of moments:
The parameter $\lambda \in \mathbb{R}_+^*$, so the dimension of the parameter space is
r = 1.
The first moment of X is $E(X) = \lambda$.
The first sample moment of X is $\frac{1}{n} \sum_{i=1}^{n} X_i = \bar{X}$.
We solve the following equation:
$$E(X) = \frac{1}{n} \sum_{i=1}^{n} X_i \quad \text{or} \quad \lambda = \bar{X}$$
So the point estimator of λ by the method of moments is $\hat{\lambda}_{MM} = \bar{X}$.


Confidence interval estimation


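Solution: The point estimator of λ by the MLE method:
The pmf of X is
$$f(x; \lambda) = e^{-\lambda} \frac{\lambda^x}{x!}, \quad x = 0, 1, 2, \dots$$
The likelihood function of λ is
$$L(\lambda) = \prod_{i=1}^{n} f(X_i; \lambda) = \prod_{i=1}^{n} e^{-\lambda} \frac{\lambda^{X_i}}{X_i!} = e^{-n\lambda} \frac{\lambda^{\sum_{i=1}^{n} X_i}}{\prod_{i=1}^{n} X_i!}$$
The log-likelihood function of λ is
$$\log L(\lambda) = -n\lambda + \sum_{i=1}^{n} X_i \log(\lambda) - \log\left(\prod_{i=1}^{n} X_i!\right)$$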


Confidence interval estimation

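Solution: The point estimator of λ by the MLE method:
Solve the following equation:
$$\frac{\partial \log L(\lambda)}{\partial \lambda} = -n + \frac{\sum_{i=1}^{n} X_i}{\lambda} = 0$$
we have $\lambda = \bar{X}$.
Since
$$\frac{\partial^2 \log L(\lambda)}{\partial \lambda^2} = -\frac{\sum_{i=1}^{n} X_i}{\lambda^2} < 0$$
the likelihood function attains its maximum at $\lambda = \bar{X}$. So
$\hat{\lambda}_{MLE} = \bar{X}$.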


Confidence interval estimation

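Solution: Construct a 1 − α confidence interval estimation of λ:
Since $X \sim P(\lambda)$, we have $\mu = E(X) = \lambda$; $\sigma^2 = V(X) = \lambda$; $\sigma = \sqrt{\lambda}$.
By the central limit theorem:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - \lambda}{\sqrt{\lambda}} \sqrt{n} \approx N(0; 1)$$
From the equation
$$P\left(\frac{|\bar{X} - \lambda|}{\sqrt{\lambda}} \sqrt{n} \le Z_{\alpha/2}\right) = 1 - \alpha$$
or
$$P\left(n(\bar{X} - \lambda)^2 \le \lambda Z_{\alpha/2}^2\right) = 1 - \alpha$$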


Confidence interval estimation

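Solution: Construct a 1 − α confidence interval estimation of λ:
or
$$P\left(n\lambda^2 - (2n\bar{X} + Z_{\alpha/2}^2)\lambda + n\bar{X}^2 \le 0\right) = 1 - \alpha$$
or
$$P\left(\lambda \in \frac{2n\bar{X} + Z_{\alpha/2}^2 \mp Z_{\alpha/2} \sqrt{4n\bar{X} + Z_{\alpha/2}^2}}{2n}\right) = 1 - \alpha$$
Then a 1 − α confidence interval estimation of λ is
$$\frac{2n\bar{X} + Z_{\alpha/2}^2 \mp Z_{\alpha/2} \sqrt{4n\bar{X} + Z_{\alpha/2}^2}}{2n}$$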


Confidence interval estimation

Solution:
For the given sample, we have n = 50; 1 − α = 0.9, so
$Z_{\alpha/2} = Z_{0.05} = 1.645$;
$\bar{x} = (7 \cdot 0 + 15 \cdot 1 + 10 \cdot 2 + 12 \cdot 3 + 6 \cdot 4)/50 = 1.9$.
So the 90% CI of λ is
$$\frac{2 \cdot 50 \cdot 1.9 + 1.645^2 \mp 1.645 \sqrt{4 \cdot 50 \cdot 1.9 + 1.645^2}}{2 \cdot 50} = [1.61; 2.25]$$
We are 90% confident that the parameter λ is between 1.61 and 2.25.
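The endpoints of this interval are the two roots of the quadratic nλ² − (2nX̄ + Z²)λ + nX̄² = 0, so they can be computed with the quadratic formula. The Python sketch below is illustrative; the function name `poisson_quadratic_interval` is an assumption, and the data are the weekly accident counts of Example 5.7.

```python
import math

def poisson_quadratic_interval(mean, n, z):
    """Roots of n*lam^2 - (2*n*mean + z^2)*lam + n*mean^2 = 0, smallest first."""
    b = 2 * n * mean + z * z
    root = z * math.sqrt(4 * n * mean + z * z)  # square root of the discriminant
    return (b - root) / (2 * n), (b + root) / (2 * n)

values = [0, 1, 2, 3, 4]       # accidents per week
counts = [7, 15, 10, 12, 6]    # number of weeks with that count
n = sum(counts)
mean = sum(c * v for c, v in zip(counts, values)) / n

low, high = poisson_quadratic_interval(mean, n, 1.645)
print(round(low, 2), round(high, 2))
```

The quadratic inequality holds between its two roots because the coefficient of λ² is positive, which is why the roots form the interval.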


Confidence interval estimation

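Solution: 2nd method.
By the following limit theorem:
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\bar{X} - \lambda}{S} \sqrt{n} \approx N(0; 1)$$
From the equation
$$P\left(\frac{|\bar{X} - \lambda|}{S} \sqrt{n} \le Z_{\alpha/2}\right) = 1 - \alpha$$
or
$$P\left(\bar{X} - Z_{\alpha/2} \frac{S}{\sqrt{n}} \le \lambda \le \bar{X} + Z_{\alpha/2} \frac{S}{\sqrt{n}}\right) = 1 - \alpha$$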


Confidence interval estimation

Solution:
Then a 1 − α confidence interval estimation of λ is
$$\bar{X} \mp Z_{\alpha/2} \frac{S}{\sqrt{n}}$$
For the given sample, we have n = 50; $Z_{\alpha/2} = Z_{0.05} = 1.645$;
$\bar{x} = 1.9$; $s^2 = (7 \cdot 0^2 + 15 \cdot 1^2 + 10 \cdot 2^2 + 12 \cdot 3^2 + 6 \cdot 4^2 - 50 \cdot 1.9^2)/49 = 1.602$;
$s = \sqrt{1.602} = 1.27$.
So the 90% CI of λ is
$$1.9 \mp 1.645 \cdot \frac{1.27}{\sqrt{50}} = [1.6; 2.2]$$
We are 90% confident that the parameter λ is between 1.6 and 2.2.
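The second method only needs the sample mean and the adjusted sample standard deviation of the frequency table. The Python sketch below is illustrative, with hypothetical variable names, and recomputes both quantities from the weekly accident counts of Example 5.7 before forming the interval.

```python
import math

values = [0, 1, 2, 3, 4]       # accidents per week
counts = [7, 15, 10, 12, 6]    # number of weeks with that count
z = 1.645                      # Z_{0.05}, critical value for a 90% interval

n = sum(counts)
mean = sum(c * v for c, v in zip(counts, values)) / n
# Adjusted sample variance from the frequency table (divides by n - 1).
s2 = (sum(c * v * v for c, v in zip(counts, values)) - n * mean ** 2) / (n - 1)
s = math.sqrt(s2)

eps = z * s / math.sqrt(n)     # error of the CI
low, high = mean - eps, mean + eps
print(round(s2, 3), round(low, 2), round(high, 2))
```

Unlike the quadratic-based interval, this one is symmetric around the sample mean, since it does not use the Poisson relation between the mean and the variance.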
