Statistical Estimation
Prof GRC Nair
Statistical Estimation
Samples are studied to infer the
likely characteristics of the
population.
Statistical Estimation is the process of
arriving at a reliable figure for the
population parameter from the study
of the sample statistic.
The sample statistic used for
estimating the population parameter is
called an estimator.
Types of Estimate
There are two types of estimates:
Point Estimate – Estimating a Single
probable value for the parameter
Interval Estimate – Estimating a
Range within which the parameter is
expected to be.
Point Estimator
A point estimator for the population mean is
the sample mean,
e.g. estimating the average MAT score of all MBA
students (μ) using the average mark of
students in one center (x̄).
The proportion of people in a city who have a car,
or who have invested in shares (p), is estimated
from a sample proportion (p̄).
Sample variance, std deviation, etc. are point
estimators for the respective population
parameters.
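
A minimal sketch of computing these point estimators, using Python's standard library. The marks and survey answers below are hypothetical illustration data, not figures from these slides.

```python
import statistics

# Hypothetical marks from one center (illustrative data, not from the slides).
marks = [62, 71, 55, 68, 74, 59, 66, 70]

# Hypothetical survey answers: True means the respondent owns a car.
owns_car = [True, False, True, True, False, False, True, False, True, True]

x_bar = statistics.mean(marks)          # point estimate of the population mean
s2 = statistics.variance(marks)         # sample variance (n - 1 divisor)
s = statistics.stdev(marks)             # sample standard deviation
p_bar = sum(owns_car) / len(owns_car)   # sample proportion

print(f"mean = {x_bar}, variance = {s2:.2f}, std dev = {s:.2f}, proportion = {p_bar}")
```

Each value is a single number, so on its own it says nothing about how far the true parameter might lie from it.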
Point estimates give a rough estimate
only.
They are not reliable
unless:
the sample size is very large
the sample is truly representative of
the population.
Interval Estimate
Gives a Range of values (confidence
interval) within which the parameter is
expected to be.
Has an associated confidence level.
It is the probability that the parameter
will be within this range.
The interval estimate of the population mean ‘μ’ is
obtained from the sample mean ‘x̄’ and σ, if known,
or the std deviation ‘s’ of the sample.
Large Sample
A large sample has more than 30 elements.
For estimating a parameter from a large
sample, always use the normal distribution
with std error σ_x̄ = σ / √n.
The confidence interval for the mean is given by
x̄ ± z σ_x̄.
x̄ + z σ_x̄ is called the upper confidence
limit and x̄ – z σ_x̄ is the lower confidence
limit.
z for 90%, 95% and 99% confidence levels
is 1.645, 1.96 and 2.58 respectively.
If σ is not known, use ‘s’ in its place.
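
The z values quoted above can be recovered from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `z_value` is a hypothetical name chosen for illustration.

```python
from statistics import NormalDist

def z_value(confidence):
    """Two-sided critical z for a confidence level such as 0.95."""
    # Half of the leftover probability sits in each tail.
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_value(level):.3f}")
```

To three decimals the 99% value is 2.576, which the slide rounds to 2.58.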
Small Sample
For a small sample (< 30 elements):
If the distribution of the population is normal
(or at least symmetrical and unimodal) and
σ is known, use the normal distribution
with σ_x̄ = σ / √n.
If σ is not known, take ‘s’ in its place
and use the ‘t’ distribution with d.f. = n – 1.
The confidence interval for the mean is then given
by x̄ ± t σ̂_x̄, where σ̂_x̄ = s / √n.
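
The large-sample and small-sample rules can be summarised as a small decision helper. This is only a sketch of the rules as stated on these slides. The function name and its arguments are hypothetical, and treating n = 30 as a small sample is an assumption, since the slides define only "> 30" and "< 30".

```python
def choose_distribution(n, sigma_known, population_normal=True):
    """Return 'normal' or 't' following the rules on these slides."""
    if n > 30:
        # Large sample: normal distribution, with s standing in for sigma if needed.
        return "normal"
    if not population_normal:
        # The slides do not cover small samples from non-normal populations.
        raise ValueError("small sample from a non-normal population is not covered")
    # Small sample (n = 30 is treated as small here, by assumption).
    return "normal" if sigma_known else "t"

print(choose_distribution(60, sigma_known=False))
print(choose_distribution(6, sigma_known=False))
```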
Example 1
A company producing contact lenses
introduces a new type, which it claims has a
longer life. A sample of 60 users reported
an average life of 4.6 years with std dev
= 1.29 years. Construct a 95% confidence
interval for the mean life of the new lenses.
This is a large sample, so use the normal
distribution: z = 1.96, x̄ = 4.6, s = 1.29.
Confidence interval = x̄ ± 1.96 s / √60
= 4.6 ± 0.326, i.e. 4.274 to 4.926
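
The arithmetic of Example 1 can be checked with a few lines of Python. This is a sketch using only the figures given in the example. The variable names are illustrative.

```python
import math

n, x_bar, s = 60, 4.6, 1.29   # figures from Example 1
z = 1.96                      # 95% confidence level

std_error = s / math.sqrt(n)  # s stands in for the unknown sigma
margin = z * std_error
lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.3f} to {upper:.3f}")   # 4.274 to 4.926
```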
Example 2
Construct a 95% confidence interval for the mean
life of the new lenses in the previous problem if
the sample size was 6 and the std deviation was
0.49 years.
Use the ‘t’ distribution, since 6 is a small sample and
σ is unknown.
x̄ = 4.6 and s = 0.49, so σ̂_x̄ = 0.49 / √6 = 0.2.
‘t’ value from tables for 95% with d.f. = 5 is 2.571
(the ‘t’ table gives the area of both tails together).
Confidence interval = 4.6 ± 2.571 × 0.2
= 4.0858 to 5.1142
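
A short sketch of the same calculation in Python, taking the t value 2.571 directly from the example, since the standard library has no t distribution. It also computes the margin that z = 1.96 would give, to show that the t interval is the wider one.

```python
import math

n, x_bar, s = 6, 4.6, 0.49   # figures from Example 2
t = 2.571                    # t-table value, 95% confidence, d.f. = n - 1 = 5

std_error = s / math.sqrt(n)
margin = t * std_error
lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.4f} to {upper:.4f}")

# For comparison: z = 1.96 would give a narrower interval than t does.
z_margin = 1.96 * std_error
print(f"t margin = {margin:.3f}, z margin = {z_margin:.3f}")
```

Because the code does not round the standard error to 0.2, its limits can differ from the slide's figures in the fourth decimal place.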
Interval Estimate for Proportion
For large samples, use the normal
distribution.
Take mean, μ_p̄ = p
Std error σ_p̄ = √(pq/n), where
q = 1 – p
Confidence interval = p̄ ± z σ_p̄
Example 3
Templeton surveyed 90 shoppers and found that 85% of
them purchase their products regularly. What
is the std error of the proportion? Construct
a 95% confidence interval for the true
proportion of people who purchase their
products regularly.
Ans:
p̄ = 0.85, so take p = 0.85, q = 0.15, n = 90
Using the normal distribution, take μ_p̄ = 0.85
Std error σ_p̄ = √(pq/n) = 0.0376
95% confidence interval = p̄ ± 1.96 σ_p̄
= 0.776 to 0.924
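
The figures in Example 3 can be reproduced as follows. This is a sketch that uses the sample proportion in place of the unknown population proportion, as the example does. The variable names are illustrative.

```python
import math

n, p_bar = 90, 0.85   # figures from Example 3
q_bar = 1 - p_bar
z = 1.96              # 95% confidence level

std_error = math.sqrt(p_bar * q_bar / n)
margin = z * std_error
lower, upper = p_bar - margin, p_bar + margin
print(f"std error = {std_error:.4f}, interval = {lower:.3f} to {upper:.3f}")
# std error = 0.0376, interval = 0.776 to 0.924
```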
Example 4
We have indications that the proportion
of drivers without a valid license is 0.3.
Find the sample size needed to
estimate the proportion within ± 0.02
with a confidence level of 90%.
Ans: p = 0.3, q = 0.7, z for 90% = 1.645
z σ_p̄ = 0.02
1.645 √(pq/n) = 0.02
n = pq (1.645 / 0.02)² = 1420.7, rounded up to 1421
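
Solving z √(pq/n) = E for n gives n = pq (z / E)², rounded up to the next whole number. The sketch below wraps this in a helper. The function name `sample_size_for_proportion` and its parameter names are hypothetical, chosen for illustration.

```python
import math

def sample_size_for_proportion(p, margin, z):
    """Smallest n for which z * sqrt(p * q / n) does not exceed the margin."""
    q = 1 - p
    # Solve z * sqrt(p * q / n) = margin for n, then round up to a whole unit.
    return math.ceil(p * q * (z / margin) ** 2)

n_needed = sample_size_for_proportion(p=0.3, margin=0.02, z=1.645)
print(n_needed)   # 1421
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.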
Example - HW
John Bull has just purchased a
computer program that claims to
pick stocks that will appreciate in
value in the next week with 85%
accuracy. On how many stocks should
John test this program so as to be
98% certain that the percentage of
stocks that actually go up in the next
week will be within ± 0.05 of the sample
proportion? Ans: 277