Estimation

Definitions

Confidence Interval
An interval estimate with a specific level of confidence.
Confidence Level
The percentage of the time the true parameter will lie in the interval estimate given.
Consistent Estimator
An estimator which gets closer to the value of the parameter as the sample size increases.
Degrees of Freedom
The number of data values which are allowed to vary once a statistic has been determined.
Estimator
A sample statistic which is used to estimate a population parameter. A good estimator is unbiased,
consistent, and relatively efficient.
Interval Estimate
A range of values used to estimate a parameter.
Maximum Error of the Estimate
The maximum difference between the point estimate and the actual parameter. The Maximum Error
of the Estimate is one-half the width of the confidence interval for means and proportions.
Point Estimate
A single value used to estimate a parameter.
Relatively Efficient Estimator
Of the estimators which could be used for a parameter, the one with the smallest variance.
T distribution
A distribution used when the population variance is unknown.
Unbiased Estimator
An estimator whose expected value is equal to the parameter being estimated.

Introduction to Estimation

One area of concern in inferential statistics is the estimation of the population parameter from the sample
statistic. It is important to realize the order here. The sample statistic is calculated from the sample data and
the population parameter is inferred (or estimated) from this sample statistic. Let me say that again: Statistics
are calculated, parameters are estimated.

We talked about problems of obtaining the value of the parameter earlier in the course when we talked about
sampling techniques.

Another area of inferential statistics is sample size determination. That is, how large of a sample should be
taken to make an accurate estimation. In these cases, the statistics can't be used since the sample hasn't been
taken yet.

Point Estimates

There are two types of estimates we will find: Point Estimates and Interval Estimates. The point estimate is
the single best value.
A good estimator must satisfy three conditions:

• Unbiased: The expected value of the estimator must be equal to the parameter being estimated
• Consistent: The value of the estimator approaches the value of the parameter as the sample size
increases
• Relatively Efficient: The estimator has the smallest variance of all estimators which could be used
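
The first two conditions can be seen in a small simulation. This is only a sketch: the population (normal, with mean 50 and standard deviation 10), the number of trials, and the function name are all made up for illustration. It draws many samples of a given size and reports the average and the spread of the resulting sample means.

```python
import random
import statistics

def spread_of_sample_means(n, trials=500, mu=50.0, sigma=10.0, seed=0):
    """Return (average, standard deviation) of `trials` sample means of size n.

    mu and sigma describe a hypothetical normal population.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.fmean(means), statistics.stdev(means)

for n in (10, 100, 1000):
    avg, sd = spread_of_sample_means(n)
    print(f"n={n:5d}  average of sample means={avg:.2f}  spread={sd:.3f}")
```

The average of the sample means should stay near 50 at every sample size (unbiased), while the spread should shrink as n grows (consistent).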

Confidence Intervals

The point estimate is going to be different from the population parameter because of sampling error,
and there is no way to know how close it is to the actual parameter. For this reason, statisticians like to give
an interval estimate, which is a range of values used to estimate the parameter.

A confidence interval is an interval estimate with a specific level of confidence. A level of confidence is the
probability that the interval estimate will contain the parameter. The level of confidence is 1 - alpha, so an
area of 1 - alpha lies within the confidence interval.

Maximum Error of the Estimate

The maximum error of the estimate is denoted by E and is one-half the width of the confidence interval. The
basic confidence interval for a symmetric distribution is set up as:

point estimate - E < population parameter < point estimate + E

This formula will work for means and proportions because they will use the Z or T distributions, which are
symmetric. Later, we will talk about variances, which don't use a symmetric distribution, and the formula
will be different.
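
As a quick illustration of the symmetric set-up, here is a minimal sketch. The function name and the numbers (a point estimate of 50 and a maximum error of 4) are hypothetical and not taken from the notes.

```python
def interval_from_point_estimate(point_estimate, max_error):
    """Return the (lower, upper) limits of a symmetric confidence interval."""
    if max_error < 0:
        raise ValueError("max_error must be non-negative")
    return point_estimate - max_error, point_estimate + max_error

# Hypothetical values, for illustration only.
lower, upper = interval_from_point_estimate(50.0, 4.0)
print(lower, upper)         # 46.0 54.0
print((upper - lower) / 2)  # 4.0, one-half the width is E again
```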

Area in Tails

Since the level of confidence is 1-alpha, the amount in the tails is alpha. There is a notation in statistics,
Z(area), which means the z-score which has the specified area in the right tail.

Examples:

• Z(0.05) = 1.645 (the Z-score which has 0.05 to the right, and 0.4500 between 0 and it)
• Z(0.10) = 1.282 (the Z-score which has 0.10 to the right, and 0.4000 between 0 and it)

As a shorthand notation, the () are usually dropped, and the probability written as a subscript. The Greek letter
alpha is used to represent the area in both tails for a confidence interval, and so alpha/2 will be the area in one
tail.
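
If you have Python handy instead of a table, the right-tail notation can be checked with the standard library's `statistics.NormalDist`. This is a sketch only, and the helper name `z_right_tail` is my own, not standard notation.

```python
from statistics import NormalDist

def z_right_tail(area):
    """z-score with the given area to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1.0 - area)

print(round(z_right_tail(0.05), 3))  # 1.645
print(round(z_right_tail(0.10), 3))  # 1.282

# For a confidence interval, alpha is split between the two tails.
alpha = 0.05
print(round(z_right_tail(alpha / 2), 3))  # 1.96
```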

Here are some common values

Confidence    Area between     Area in one       z-score
Level         0 and z-score    tail (alpha/2)
50%           0.2500           0.2500            0.674
80%           0.4000           0.1000            1.282
90%           0.4500           0.0500            1.645
95%           0.4750           0.0250            1.960
98%           0.4900           0.0100            2.326
99%           0.4950           0.0050            2.576

Notice in the above table that the area between 0 and the z-score is simply one-half of the confidence level.
So, if there is a confidence level which isn't given above, all you need to do to find it is divide the confidence
level by two, look up that area in the inside part of the Z-table, and read the z-score off the outside.
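
The same lookup can be sketched in Python. The function name `z_for_confidence` is an assumption for illustration. It takes the confidence level as a decimal, halves it to get the area between 0 and the z-score, and inverts the standard normal distribution.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """z-score whose area between 0 and z is one-half of the confidence level."""
    area_between_0_and_z = level / 2.0
    # The area to the left of 0 is 0.5, so the area to the left of z is:
    return NormalDist().inv_cdf(0.5 + area_between_0_and_z)

# Recompute the four columns for the common confidence levels.
for level in (0.50, 0.80, 0.90, 0.95, 0.98, 0.99):
    z = z_for_confidence(level)
    print(f"{level:.0%}  {level / 2:.4f}  {(1 - level) / 2:.4f}  {z:.3f}")

# A level which is not in the table, for example 85%:
print(round(z_for_confidence(0.85), 3))
```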

Also notice - if you look at the Student's t table, the top row is the level of confidence, and the bottom
row is the z-score. In fact, this is where I got the extra digit of accuracy from.

Estimating the Mean

You are estimating the population mean, mu, not the sample mean, x bar.

Population Standard Deviation Known

If the population standard deviation, sigma, is known, then the sample mean has a normal (Z) distribution,
provided the population is normal or the sample is large.

The maximum error of the estimate is given by E = Z(alpha/2) * sigma / sqrt(n), where n is the sample size.
The Z here is the z-score obtained from the normal table, or the bottom of the t-table as explained in the
introduction to estimation. The z-score is determined by the level of confidence, so you may get in the habit
of writing it next to the level of confidence.

Once you have computed E, I suggest you save it to the memory on your calculator. On the TI-82, a good
choice would be the letter E. The reason for this is that the limits for the confidence interval are now found
by subtracting the maximum error of the estimate from the sample mean and adding it to the sample mean.
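
For those working in Python rather than on a calculator, the whole calculation can be sketched as follows. It assumes the usual known-sigma formula E = Z(alpha/2) * sigma / sqrt(n). The function name and the sample values (x bar = 50, sigma = 10, n = 25) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, level=0.95):
    """Confidence interval for mu when the population sigma is known.

    Returns (lower limit, upper limit, maximum error of the estimate E).
    """
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Z(alpha/2)
    max_error = z * sigma / sqrt(n)              # E, computed once and reused
    return x_bar - max_error, x_bar + max_error, max_error

# Hypothetical sample, for illustration only.
lower, upper, E = mean_confidence_interval(x_bar=50.0, sigma=10.0, n=25, level=0.95)
print(f"E = {E:.2f}")                     # E = 3.92
print(f"{lower:.2f} < mu < {upper:.2f}")  # 46.08 < mu < 53.92
```

Storing E in a variable plays the same role as saving it to the calculator's memory: it is computed once and then both subtracted from and added to the sample mean.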
