Motivating example
• A producer of t-shirts for Chalmers wants us to provide them with the average
height of a male student.
• We do not have the resources to measure the height of every student.
• We choose 10 students at random, and get the following heights in cm:
182, 171, 177, 174, 186, 183, 193, 172, 180, 181
Note that an estimate based on this sample need not equal the true average height of
all students.
Disclaimer: I did not gather this data and I cannot guarantee its validity.
General framework of parameter estimation
• We want to study a numerical property possessed by members of a certain large
population, and it is impossible/impractical to gather data about the whole
population. (Note that the population may be hypothetical, e.g. the population of
all cellphone batteries - already produced batteries and batteries produced in the
future).
• The distribution of the property in the whole population is described by a
random variable X, whose characteristics/parameters, like the mean or variance,
we want to estimate/approximate.
• We choose a (relatively) small random sample of n members of the population.
We do it in such a way that the selection of one member does not influence the
selection of any other member.
• Before the actual choice, the property of the i-th member is described by a
random variable Xᵢ which has the same distribution as X, and the variables
X₁, X₂, …, Xₙ are independent.
Note that we used the term random sample to denote three different notions: the
randomly selected members of the population, the collection of random variables
associated with these members, and the collection of the observed values of the
variables. The interpretation of the term is usually clear from the context.
Random sample and statistics
A random sample of size n from the distribution of X is a collection of n independent
random variables
X₁, X₂, …, Xₙ,
each with the same distribution as X.
A statistic is a random variable whose value can be computed from the values of
the random sample X₁, X₂, …, Xₙ.
Estimators
Examples of parameters are the mean value µ, variance σ², standard deviation σ, and
the parameters λ and p of the Poisson, exponential, and binomial distributions, etc.
An estimator θ̂ of a parameter θ is called unbiased if
E[θ̂] = θ.
The fact that an estimator is unbiased tells us that it fluctuates around the right
value.
Sample mean
• The sample mean estimate (or the observed value of X̄) for the height of a
student from our first example is x̄ = 179.9 cm.
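The value x̄ = 179.9 cm can be checked with a few lines of Python. This is only a minimal sketch: it assumes the sample mean is the sum of the observations divided by n, and the names `heights_cm` and `x_bar` are illustrative choices, not notation from the slides.

```python
from statistics import mean

# Heights in cm of the 10 randomly chosen students from the motivating example.
heights_cm = [182, 171, 177, 174, 186, 183, 193, 172, 180, 181]

# Observed value of the sample mean: sum of the observations divided by n.
x_bar = mean(heights_cm)

print(f"n = {len(heights_cm)}, sample mean = {x_bar:.1f} cm")
```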
Even though we know that the sample mean is unbiased, this does not give us much
information about the accuracy of our estimates. It just tells us that we draw our
estimates from a distribution with the right mean value. A desirable property of an
estimator is that it has small variance for large sample sizes. Small variance implies
that our estimates will be precise with large probability.
Let X̄ be the sample mean based on a random sample of size n from a distribution
with mean µ and variance σ². Then,
$$\mathrm{Var}[\bar{X}] = \frac{\sigma^2}{n}.$$
Proof. Since variance is additive for independent random variables, we can write
$$\mathrm{Var}[\bar{X}] = \mathrm{Var}\Big[\sum_{i=1}^{n} X_i/n\Big] = \frac{1}{n^2}\,\mathrm{Var}\Big[\sum_{i=1}^{n} X_i\Big] = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}[X_i] = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n}.$$
The above implies that the larger the sample size, the larger the probability that our
estimates are close to the true mean µ.
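A small simulation can make the σ²/n behaviour visible. The sketch below assumes a normal population with µ = 180 and σ = 7 (illustrative values chosen here, not estimated from the data), and the function name `simulated_variance_of_sample_mean` is hypothetical. It draws many independent samples of size n and compares the empirical variance of their sample means with σ²/n.

```python
import random
from statistics import pvariance


def simulated_variance_of_sample_mean(n, mu=180.0, sigma=7.0,
                                      repetitions=5_000, seed=0):
    """Empirical variance of the sample mean over many independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    sample_means = [
        sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        for _ in range(repetitions)
    ]
    # Population-style variance of the simulated sample means.
    return pvariance(sample_means)


# Compare simulation and theory for a few sample sizes.
results = {n: simulated_variance_of_sample_mean(n) for n in (5, 20, 80)}
for n, simulated in results.items():
    print(f"n = {n:2d}: simulated {simulated:.3f}, theory {7.0 ** 2 / n:.3f}")
```

Each fourfold increase in n should cut the simulated variance to roughly a quarter, in line with the formula.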
Sample variance
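Proof. We have to prove that E[S²] = σ². Note that S² does not change if we add a
constant to the variable X. This implies that it is enough to consider the case
µ = E[X] = 0. Recall that E[XY] = E[X]E[Y] for independent random variables X
and Y, and hence
$$E[X_j \bar{X}] = \frac{1}{n}\,E\Big[X_j \sum_{i=1}^{n} X_i\Big] = \frac{1}{n}\sum_{i=1}^{n} E[X_j X_i] = \frac{1}{n}\,E[X_j^2] = \frac{1}{n}\,E[X^2] = \frac{\sigma^2}{n},$$
and
$$E[\bar{X}^2] = \frac{1}{n}\,E\Big[\sum_{i=1}^{n} X_i \bar{X}\Big] = \frac{1}{n}\sum_{i=1}^{n} E[X_i \bar{X}] = \frac{1}{n}\cdot n \cdot \frac{\sigma^2}{n} = \frac{\sigma^2}{n}.$$
For µ = 0, we have
$$\begin{aligned}
E[S^2] &= \frac{1}{n-1}\,E\Big[\sum_{i=1}^{n} (X_i - \bar{X})^2\Big] = \frac{1}{n-1}\,E\Big[\sum_{i=1}^{n} \big(X_i^2 - 2X_i\bar{X} + \bar{X}^2\big)\Big] \\
&= \frac{1}{n-1}\sum_{i=1}^{n} \big(E[X_i^2] - 2E[X_i\bar{X}] + E[\bar{X}^2]\big) = \frac{1}{n-1}\sum_{i=1}^{n} \Big(1 - \frac{1}{n}\Big)\sigma^2 = \sigma^2.
\end{aligned}$$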
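The unbiasedness of S² can also be illustrated numerically. The sketch below is a simulation under assumptions chosen here: a normal population with mean 0 (matching the reduction to µ = 0 in the proof) and σ = 7. The function name `average_variance_estimates` is hypothetical. It averages, over many samples, the sum of squared deviations divided by n − 1 and, for contrast, divided by n.

```python
import random


def average_variance_estimates(n=5, sigma=7.0, repetitions=20_000, seed=1):
    """Average the (n - 1)-divisor and n-divisor variance estimates over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(repetitions):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        x_bar = sum(xs) / n
        squared_deviations = sum((x - x_bar) ** 2 for x in xs)
        total_unbiased += squared_deviations / (n - 1)  # S^2 as used in the proof
        total_biased += squared_deviations / n          # divisor n, for comparison
    return total_unbiased / repetitions, total_biased / repetitions


unbiased_avg, biased_avg = average_variance_estimates()
print(f"true variance:        {7.0 ** 2:.2f}")
print(f"average with n - 1:   {unbiased_avg:.2f}")
print(f"average with n:       {biased_avg:.2f}")
```

With the divisor n, the average should sit near (1 − 1/n)σ², which is the factor that appears in the last line of the proof.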
It is usually more convenient to use the computational formula for the sample
variance given by
$$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}.$$
• The sample variance estimate (or the observed value of S²) for the height of a
student is
$$s^2 = \frac{182^2 + 171^2 + \ldots + 181^2 - 10 \cdot (179.9)^2}{9} = 45.43.$$
• The sample standard deviation estimate (or the observed value of S) for the
height of a student is
$$s = \sqrt{s^2} = 6.74 \text{ cm}.$$
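As a check on the arithmetic, the sketch below computes the sample variance of the ten heights in three ways: from the sum of squared deviations divided by n − 1, from the computational formula, and with `statistics.variance` from the Python standard library, which also uses the n − 1 divisor. The variable names are illustrative.

```python
from math import sqrt
from statistics import variance

# Heights in cm of the 10 randomly chosen students from the motivating example.
heights_cm = [182, 171, 177, 174, 186, 183, 193, 172, 180, 181]
n = len(heights_cm)
x_bar = sum(heights_cm) / n

# Sum of squared deviations from the sample mean, divided by n - 1.
s2_definition = sum((x - x_bar) ** 2 for x in heights_cm) / (n - 1)

# Computational formula: (sum of squares - n * x_bar^2) / (n - 1).
s2_computational = (sum(x * x for x in heights_cm) - n * x_bar ** 2) / (n - 1)

# Standard-library sample variance (n - 1 divisor).
s2_library = variance(heights_cm)

# Sample standard deviation.
s = sqrt(s2_definition)

print(f"s^2 = {s2_definition:.2f}, {s2_computational:.2f}, {s2_library:.2f}")
print(f"s   = {s:.2f} cm")
```

The computational formula subtracts two large, nearly equal numbers, so in floating-point arithmetic the deviation-based form is usually the safer choice for data with a large mean and a small spread.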
Interval estimation
• Point estimates give us a single number as an estimate of the parameter of
interest, but they carry no information about their own accuracy.
• One way to quantify accuracy is to construct intervals that should contain the
parameter of interest.
Confidence intervals
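A confidence interval for a parameter θ with confidence level 1 − α is an interval
[L, R], where L and R are statistics, such that
P(L ≤ θ ≤ R) = 1 − α.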
Confidence intervals for normal variables with known σ²
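Let X₁, X₂, …, Xₙ be a random sample from the normal distribution N(µ, σ²) with
known variance σ², let F_Z be the distribution function of Z ∼ N(0, 1), and let
z_{α/2} be the number such that
F_Z(−z_{α/2}) = α/2.
Then [L, R], where
$$L = \bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}, \qquad R = \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n},$$
is a confidence interval for the true mean µ with confidence level 1 − α, that is
P(L ≤ µ ≤ R) = 1 − α.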
Proof. If X ∼ N(µ, σ²), then by the property of the normal distribution
$$\bar{X} \sim N\Big(\mu, \frac{\sigma^2}{n}\Big), \quad \text{and hence} \quad \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
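The final step of the argument is not in the text above. One standard way to complete it is sketched below, assuming that z_{α/2} is defined by F_Z(−z_{α/2}) = α/2, that X̄ denotes the sample mean, and using the symmetry of the standard normal distribution, which gives F_Z(z_{α/2}) = 1 − α/2.

```latex
\begin{align*}
1 - \alpha
  &= F_Z(z_{\alpha/2}) - F_Z(-z_{\alpha/2})
   = P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) \\
  &= P\left(\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu
            \le \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right).
\end{align*}
```

The second line follows by multiplying the inequalities by σ/√n and solving for µ, so the two bounds on µ are the endpoints of the interval.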
Note that the assumption that we know the variance of the underlying distribution is
idealistic. However, the assumption that the distribution of the random sample is
(approximately) normal is very reasonable as we will see in the next lecture.
Let us assume that the height of a student is distributed like a normal variable with
standard deviation 7 cm. Let us construct a 95% confidence interval for the mean
height using the data from the first example.
Using a table for the standard normal distribution, we find that z_{0.025} = 1.96.
Hence, using the formula for the confidence interval with known σ², the observed
confidence interval with confidence level 95% is [l, r], where
$$l = \bar{x} - z_{0.025}\,\sigma/\sqrt{10} = 179.9 - 1.96 \cdot 7/3.162 = 175.56,$$
and
$$r = \bar{x} + z_{0.025}\,\sigma/\sqrt{10} = 179.9 + 1.96 \cdot 7/3.162 = 184.24.$$
It is very important to understand that it is not correct to say that with 95% probability
the true average height of a student is in the interval [175.56, 184.24]. It does not
make sense to talk about probabilities here since µ, 175.56, and 184.24 are fixed
numbers and not random variables.
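The interval endpoints can be recomputed in Python. This sketch takes σ = 7 cm as known, as assumed in the example, and replaces the table lookup with `NormalDist().inv_cdf` from the standard library (Python 3.8 or later). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

# Heights in cm of the 10 randomly chosen students from the motivating example.
heights_cm = [182, 171, 177, 174, 186, 183, 193, 172, 180, 181]

sigma = 7.0   # standard deviation assumed known in the example
alpha = 0.05  # 1 - alpha = 95% confidence level

n = len(heights_cm)
x_bar = mean(heights_cm)

# Quantile z with F_Z(-z) = alpha / 2, i.e. F_Z(z) = 1 - alpha / 2 (about 1.96).
z = NormalDist().inv_cdf(1 - alpha / 2)

# Half-width of the interval: z * sigma / sqrt(n).
half_width = z * sigma / sqrt(n)
l, r = x_bar - half_width, x_bar + half_width

print(f"z = {z:.3f}")
print(f"observed 95% confidence interval: [{l:.2f}, {r:.2f}]")
```

Because the quantile is computed to full precision here rather than rounded to 1.96, the endpoints can differ from a hand calculation in the third decimal place.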
Interpretation of confidence intervals
• The confidence level tells us about the accuracy of the whole procedure of
computing confidence intervals, not about a single observed interval.
• If we repeatedly construct 100(1 − α)% confidence intervals from
independently gathered data, the constructed intervals will contain the true
parameter in 100(1 − α)% of cases on average.
• This implies that in 100α% of cases on average, the constructed confidence
interval will not contain the true parameter.
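The repeated-sampling interpretation can be illustrated with a simulation. The sketch below assumes a normal population with µ = 180 and σ = 7 (illustrative values chosen here, with σ treated as known), and the function name `coverage` is hypothetical. It constructs many 95% intervals from independent samples and counts how often they contain µ.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(mu=180.0, sigma=7.0, n=10, alpha=0.05,
             repetitions=5_000, seed=2):
    """Fraction of simulated known-sigma confidence intervals that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(repetitions):
        # Draw a fresh independent sample and compute its sample mean.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Check whether this observed interval contains the true mean.
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / repetitions


observed_coverage = coverage()
print(f"fraction of intervals containing mu: {observed_coverage:.3f}")
```

The fraction should be close to 0.95, while any single interval from the loop either contains µ or does not.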