
Statistics

Statistics:
1. Model: X_i ~ N(µ, σ²), i = 1, 2, ..., n, iid.
2. Estimation: µ̂ = x̄, σ̂² = s²
3. Hypothesis tests: µ = µ₀, σ² = σ₀²

1 lecture 7
Estimation
Estimate
Definition:
A (point) estimate θ̂ of a parameter θ in the model is a
"guess" at what θ can be (based on the sample). The
corresponding random variable Θ̂ is called an estimator.

Population parameter   Estimate    Estimator
µ                      µ̂ = x̄      X̄
σ²                     σ̂² = s²    S²
θ                      θ̂          Θ̂
Estimation
Unbiased estimate

Definition:
An estimator Θ̂ is said to be unbiased if

E(Θ̂) = θ

We have
E(X̄) = µ     unbiased
E(S²) = σ²   unbiased
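Unbiasedness can be checked numerically: averaging x̄ and s² over many simulated samples should approach µ and σ². A minimal sketch in Python, assuming an illustrative normal population with µ = 5 and σ² = 4 (these values are not from the slides):

```python
import random
import statistics

random.seed(0)
mu, sigma = 5.0, 2.0      # assumed population mean and standard deviation
n, reps = 10, 20000       # sample size and number of simulated samples

mean_estimates, var_estimates = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    mean_estimates.append(statistics.mean(sample))     # x-bar
    var_estimates.append(statistics.variance(sample))  # s^2 (divides by n-1)

# Averaging the estimates approximates E(X-bar) and E(S^2)
print(round(statistics.mean(mean_estimates), 2))  # close to mu = 5
print(round(statistics.mean(var_estimates), 2))   # close to sigma^2 = 4
```

Note that `statistics.variance` divides by n−1; dividing by n instead would give a biased estimator, a point the MLE discussion below returns to.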
Confidence interval for the mean
Example:
In a sample of 20 chocolate bars the amount of calories
has been measured:
• the sample mean is 224 calories

How certain are we that the population mean µ is close
to 224?

The confidence interval (CI) for µ helps us here!

Confidence interval for µ
Known variance
Let x̄ be the average of a sample consisting of n observations
from a population with mean µ and variance σ² (known).
A (1−α)100% confidence interval for µ is given by

x̄ − z_{α/2} · σ/√n  <  µ  <  x̄ + z_{α/2} · σ/√n
From the standard normal distribution N(0,1)

• We are (1−α)100% confident that the unknown parameter µ lies in the CI.
• We are (1−α)100% confident that the error we make by using x̄ as an
estimate of µ does not exceed z_{α/2} · σ/√n (from which we can find n for a
given error tolerance).
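The known-variance interval is easy to sketch in Python using the chocolate-bar numbers (x̄ = 224, n = 20, and the σ = 10 assumed later in the problem slide); `statistics.NormalDist` supplies the N(0, 1) quantile:

```python
from math import sqrt
from statistics import NormalDist

xbar, sigma, n = 224.0, 10.0, 20   # sample mean, known sigma, sample size
alpha = 0.05                       # for a 95% confidence interval

z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2} from N(0, 1)
margin = z * sigma / sqrt(n)

lo, hi = xbar - margin, xbar + margin
print(f"95% CI: ({lo:.2f}, {hi:.2f})")    # roughly (219.62, 228.38)
```

Rerunning with `alpha = 0.10` gives a shorter interval, which answers the question of how the CI length reacts to the confidence level.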
Confidence interval for µ
Known variance

The confidence interval for known variance is a result of the
Central Limit Theorem. The underlying assumptions:
• sample size n > 30, or
• the corresponding random var. is (approx.) normally
distributed

(1 − α)100% confidence interval:
• typical values of α: α = 10%, α = 5%, α = 1%
• what happens to the length of the CI when n increases?

Confidence interval for µ
Known variance
Problem:
In a sample of 20 chocolate bars the amount of calories has
been measured:
• the sample mean is 224 calories
In addition we assume:
• the corresponding random variable is approx. normally
distributed
• the population standard deviation σ = 10
Calculate the 90% and 95% confidence intervals for µ

Which confidence interval is the longest?

Confidence interval for µ
Unknown variance
Let x̄ be the mean and s the sample standard deviation of a
sample of n observations from a normally distributed population
with mean µ and unknown variance.
A (1−α)100% confidence interval for µ is given by

x̄ − t_{α/2, n−1} · s/√n  <  µ  <  x̄ + t_{α/2, n−1} · s/√n

where t_{α/2, n−1} comes from the t distribution with n−1 degrees of freedom, t(n−1).

• The population need not be exactly normal, just approximately normally
distributed.
• For n > 30 the standard normal distribution can be used instead of the t
distribution.
• We are (1−α)100% confident that the unknown µ lies in the CI.
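With the chocolate-bar data (x̄ = 224, s = 10, n = 20) the unknown-variance interval can be sketched the same way; since the Python standard library has no t distribution, the quantile t_{0.025, 19} ≈ 2.093 is taken from a t table:

```python
from math import sqrt

xbar, s, n = 224.0, 10.0, 20   # sample mean, sample sd, sample size
t_quantile = 2.093             # t_{0.025, 19} from a t table (95% CI)

margin = t_quantile * s / sqrt(n)
lo, hi = xbar - margin, xbar + margin
print(f"95% CI: ({lo:.2f}, {hi:.2f})")   # roughly (219.32, 228.68)
```

The interval is slightly wider than the known-variance one, reflecting the extra uncertainty from estimating σ by s.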
Confidence interval for µ
Unknown variance
Problem:
In a sample of 20 chocolate bars the amount of calories
has been measured:
• the sample mean is 224 calories
• the sample standard deviation is 10

Calculate 90% and 95% confidence intervals for µ

How do the lengths of these confidence intervals
compare to those we obtained when the variance was
known?

Confidence interval for µ
Using the computer
MATLAB: If x contains the data we can obtain a (1−alpha)100%
confidence interval as follows:

mean(x) + [-1 1] * tinv(1-alpha/2,size(x,1)-1) * std(x)/sqrt(size(x,1))
where
• size(x,1) is the size n of the sample
• tinv(1-alpha/2,size(x,1)-1) is t_{α/2}(n−1)
• std(x) is s (the sample standard deviation)

R:
mean(x) + c(-1,1) * qt(1-alpha/2,length(x)-1) *
sd(x)/sqrt(length(x))
Confidence interval for σ²

Example:
In a sample of 20 chocolate bars the amount of
calories has been measured:
• sample standard deviation is 10

How certain are we that the population variance σ²
is close to 10²?

The confidence interval for σ² helps us answer this!

Confidence interval for σ²

Let s be the standard deviation of a sample consisting of n
observations from a normally distributed population with
variance σ².
A (1−α)100% confidence interval for σ² is given by

(n−1)s² / χ²_{α/2, n−1}  <  σ²  <  (n−1)s² / χ²_{1−α/2, n−1}

where the quantiles come from the χ² distribution with n−1 degrees of freedom.

We are (1−α)100% confident that the unknown parameter σ² lies in the CI.
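A numerical sketch with the chocolate-bar data (s = 10, n = 20) for a 90% interval; the quantiles χ²_{0.05, 19} ≈ 30.144 and χ²_{0.95, 19} ≈ 10.117 are taken from a χ² table, since the Python standard library has no χ² distribution:

```python
s, n = 10.0, 20          # sample standard deviation and sample size
chi2_upper = 30.144      # chi^2_{0.05, 19} from a table (for a 90% CI)
chi2_lower = 10.117      # chi^2_{0.95, 19} from a table

lo = (n - 1) * s**2 / chi2_upper
hi = (n - 1) * s**2 / chi2_lower
print(f"90% CI for sigma^2: ({lo:.1f}, {hi:.1f})")   # roughly (63.0, 187.8)
```

Note the interval is not symmetric around s² = 100, because the χ² distribution is skewed.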

Confidence interval for σ²

Problem:
In a sample of 20 chocolate bars the amount of
calories has been measured:
• sample standard deviation is 10

Find the 90% and 95% confidence intervals for σ2

Confidence interval for σ²
Using the computer
MATLAB: If x contains the data we can obtain a (1−alpha)100%
confidence interval for σ² as follows:

(size(x,1)-1)*std(x)^2 ./ chi2inv([1-alpha/2 alpha/2],size(x,1)-1)

R:
(length(x)-1)*sd(x)^2 / qchisq(c(1-alpha/2,alpha/2), length(x)-1)

Maximum Likelihood Estimation
The likelihood function
Assume that X1,...,Xn are random variables with joint
density/probability function

f(x1, x2, ..., xn; θ)

where θ is the parameter (vector) of the distribution.

Considering the above function as a function of θ given
the data x1,...,xn we obtain the likelihood function

L(θ; x1, x2, ..., xn) = f(x1, x2, ..., xn; θ)

Maximum Likelihood Estimation
The likelihood function
Reminder: If X1,...,Xn are independent random variables
with identical marginal probability/density function f(x; θ),
then the joint probability/density function is

f(x1, x2, ..., xn; θ) = f(x1; θ) f(x2; θ) ··· f(xn; θ)

Definition: Given independent observations x1,...,xn from
the probability/density function f(x; θ), the maximum
likelihood estimate (MLE) θ̂ is the value of θ which
maximizes the likelihood function

L(θ; x1, x2, ..., xn) = f(x1; θ) f(x2; θ) ··· f(xn; θ)
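The definition can be illustrated numerically: for a small normal sample with σ² held fixed, a grid search over µ shows the log-likelihood peaking at the sample mean. A minimal sketch with made-up observations:

```python
from math import log, pi

data = [4.2, 5.1, 4.8, 5.6, 4.9, 5.4]   # made-up observations
sigma2 = 1.0                            # variance held fixed for the search

def log_likelihood(mu):
    # log L(mu; x1..xn) for a normal model with known variance
    n = len(data)
    return (-n / 2 * log(2 * pi * sigma2)
            - sum((x - mu) ** 2 for x in data) / (2 * sigma2))

# Grid search over candidate values of mu
grid = [i / 1000 for i in range(3000, 7000)]   # mu in [3, 7)
mle = max(grid, key=log_likelihood)

print(mle)                      # equals mean(data) = 5.0 here
print(sum(data) / len(data))
```

The agreement with the sample mean is no accident; the next slide derives it analytically.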

Maximum Likelihood Estimation
Example
Assume that X1,...,Xn are a sample from a normal
population with mean µ and variance σ². Then the
marginal density for each random variable is

f(x; θ) = (1/√(2πσ²)) exp(−(x − µ)² / (2σ²))

Accordingly the joint density is

f(x1, x2, ..., xn; θ) = (1 / ((2π)^(n/2) (σ²)^(n/2))) exp(−(1/(2σ²)) Σᵢ (xᵢ − µ)²)

The logarithm of the likelihood function is

ln L(µ, σ²; x1, x2, ..., xn) = −(n/2) ln(2π) − (n/2) ln(σ²) − (1/(2σ²)) Σᵢ (xᵢ − µ)²
Maximum Likelihood Estimation
Example
We find the maximum likelihood estimates by maximizing
the log-likelihood:

∂/∂µ ln L(µ, σ²; x) = (1/σ²) Σᵢ (xᵢ − µ) = 0

which implies µ̂ = (1/n) Σᵢ xᵢ = x̄. For σ² we have

∂/∂σ² ln L(µ, σ²; x) = −n/(2σ²) + (1/(2(σ²)²)) Σᵢ (xᵢ − µ)² = 0

which implies σ̂² = (1/n) Σᵢ (xᵢ − x̄)²

Notice E[σ̂²] = ((n−1)/n) σ², i.e. the MLE is biased!
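The bias can be checked by simulation: averaging σ̂² over many samples approaches ((n−1)/n)σ², not σ². A minimal sketch with illustrative values n = 5 and σ² = 1 (not taken from the slides):

```python
import random

random.seed(1)
mu, sigma2, n, reps = 0.0, 1.0, 5, 50000   # assumed population and sample size

total = 0.0
for _ in range(reps):
    sample = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(sample) / n
    total += sum((x - xbar) ** 2 for x in sample) / n   # MLE sigma-hat^2

avg_mle = total / reps
print(round(avg_mle, 3))   # close to (n-1)/n * sigma2 = 0.8, not 1.0
```

Multiplying the MLE by n/(n−1) recovers the unbiased estimator s² from the start of the lecture.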
