
Estimation in Statistics

In statistics, estimation refers to the process by which one makes inferences about a population,
based on information obtained from a sample.

Estimation is a branch of statistics and signal processing that determines the values of
parameters from measured empirical data.  Its purpose is to approximate the true value of a
quantity that describes a population or function.  It is based on observations from samples,
which are subsets of the target population.  Several statistics are used to perform the task of
estimation.

UNBIASED ESTIMATION

The term estimation is used in statistics in a way very similar to its use in everyday language.
The contractor estimates the cost of building a house; the physician estimates a patient's length of
stay in a hospital; the aircraft pilot estimates the time of arrival; the surveyor estimates the
distance which he is about to measure; and the city planner estimates the population of the city.

In statistics the quantity to be estimated is one of the parameters of the probability model, or
some quantity whose value depends on the parameters. The available information consists of the
observed values of the random variables and certain known aspects of the experiment. The
estimate is computed from these values, using the assumptions of the model.

Statisticians use sample statistics to estimate population parameters. For example, sample means
are used to estimate population means; sample proportions, to estimate population proportions.

An estimate of a population parameter may be expressed in two ways:

• Point estimate. A point estimate of a population parameter is a single value of a statistic.
For example, the sample mean x̄ is a point estimate of the population mean μ. Similarly,
the sample proportion p is a point estimate of the population proportion P.
• Interval estimate. An interval estimate is defined by two numbers, between which a
population parameter is said to lie. For example, a < μ < b is an interval estimate of the
population mean μ. It indicates that the population mean is greater than a but less than b.
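
As a minimal sketch of the two point estimates named above, the following Python snippet computes a sample mean and a sample proportion. The data values and variable names are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical sample of 8 measurements (illustrative values only).
sample = [4.1, 5.0, 4.7, 5.3, 4.9, 5.6, 4.4, 5.2]

# Point estimate of the population mean: the sample mean.
x_bar = mean(sample)

# Hypothetical yes/no survey responses (1 = yes, 0 = no).
responses = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]

# Point estimate of the population proportion: the sample proportion.
p_hat = sum(responses) / len(responses)

print(f"sample mean = {x_bar:.2f}")
print(f"sample proportion = {p_hat:.2f}")
```

Each result is a single number, which is what makes it a point estimate rather than an interval estimate.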

Confidence Intervals

Statisticians use a confidence interval to express the precision and uncertainty associated with a
particular sampling method. A confidence interval consists of three parts.

• A confidence level.
• A statistic.
• A margin of error.

The confidence level describes the uncertainty of a sampling method. The statistic and the
margin of error define an interval estimate that describes the precision of the method. The
interval estimate of a confidence interval is defined by the sample statistic ± margin of error.

For example, suppose we compute an interval estimate of a population parameter. We might
describe this interval estimate as a 95% confidence interval. This means that if we used the same
sampling method to select different samples and compute different interval estimates, the true
population parameter would fall within a range defined by the sample statistic ± margin of error
95% of the time.

Confidence intervals are preferred to point estimates, because confidence intervals indicate (a)
the precision of the estimate and (b) the uncertainty of the estimate.

Confidence Level

The probability part of a confidence interval is called a confidence level. The confidence level
describes the likelihood that a particular sampling method will produce a confidence interval that
includes the true population parameter.

Here is how to interpret a confidence level. Suppose we collected all possible samples from a
given population, and computed confidence intervals for each sample. Some confidence intervals
would include the true population parameter; others would not. A 95% confidence level means
that 95% of the intervals contain the true population parameter; a 90% confidence level means
that 90% of the intervals contain the population parameter; and so on.
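
The interpretation above can be checked approximately by simulation. The sketch below is not part of the original text: it draws many random samples instead of all possible samples, and it assumes a normal population with a known standard deviation. The function name `coverage` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean

def coverage(true_mean=50.0, sigma=10.0, n=30, level=0.95, trials=2000, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    # Critical value for a two-sided interval, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Margin of error; sigma is treated as known in this sketch.
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials

print(coverage(level=0.95))  # expected to be near 0.95
print(coverage(level=0.90))  # expected to be near 0.90
```

The printed fractions should land close to the chosen confidence levels, with small deviations because only a finite number of samples is simulated.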

Margin of Error

In a confidence interval, the range of values above and below the sample statistic is called the
margin of error.

The margin of error is the level of precision you require. This is the plus or minus number that is
often reported with an estimated mean; it is the half-width of the confidence interval, the range in
which the true population mean is estimated to be. Note that the actual precision achieved after
you collect your data will be more or less than this target amount, because it will be based on the
population variance estimated from the data and not your expected variance.

For example, suppose the local newspaper conducts an election survey and reports that the
independent candidate will receive 30% of the vote. The newspaper states that the survey had a
5% margin of error and a confidence level of 95%. These findings result in the following
confidence interval: We are 95% confident that the independent candidate will receive between
25% and 35% of the vote.
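
The arithmetic behind the newspaper example can be sketched as follows. The first function simply applies estimate ± margin. The second uses the common normal-approximation formula for the margin of error of a proportion under simple random sampling, which is an assumption here; the source does not say how the newspaper computed its margin or how many people it surveyed, so the sample size of 323 is hypothetical.

```python
from statistics import NormalDist

def interval_from_margin(estimate, margin):
    """Return (lower, upper) for estimate ± margin."""
    return estimate - margin, estimate + margin

def margin_for_proportion(p_hat, n, level=0.95):
    """Normal-approximation margin of error for a sample proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * (p_hat * (1 - p_hat) / n) ** 0.5

lower, upper = interval_from_margin(0.30, 0.05)
print(f"{lower:.0%} to {upper:.0%}")  # 25% to 35%

# Hypothetical sample size for which the margin is roughly 5 points.
print(f"{margin_for_proportion(0.30, 323):.3f}")
```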

Population size

This is the total number of distinct individuals in your population. The sample size calculation
uses a finite population correction to account for sampling from populations that are small. If
your population is large but you don’t know how large, you can conservatively use 100,000. The
sample size doesn’t change much for populations larger than 100,000.

Population variance

This is calculated as:

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2,$$

where

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

and gives you an indication of how variable the population is. When performing significance
tests, the sample variance provides an estimate of the population variance for inclusion in the
formula.
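
A short Python sketch of the population variance formula above, compared with the sample variance that estimates it. The data set is hypothetical; `pvariance` and `variance` come from Python's standard `statistics` module.

```python
from statistics import mean, pvariance, variance

def population_variance(values):
    """sigma^2 = (1/N) * sum((x_i - mu)^2), dividing by N."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

# Hypothetical data set (illustrative values only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # divides by N
print(pvariance(data))            # standard-library equivalent
print(variance(data))             # sample variance, divides by N - 1
```

The sample variance divides by N - 1 rather than N, so it comes out slightly larger than the population formula applied to the same values.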

Sample size

This is the minimum sample size you need to estimate the true population mean with the required
margin of error and confidence level. People who choose not to respond cannot be included in your
sample, so if non-response is a possibility, your sample size will have to be increased
accordingly. In general, the higher the response rate the better the estimate, as non-response
will often lead to biases in your estimate.
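
The text does not show the sample size formula itself, so the sketch below assumes a common textbook form: n0 = (z·σ/E)² for a very large population, followed by the finite population correction n = n0 / (1 + (n0 − 1)/N). The function name, the `response_rate` adjustment, and the input values are hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, population_size,
                         level=0.95, response_rate=1.0):
    """Minimum sample size to estimate a mean within ±margin."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Sample size for an effectively infinite population.
    n0 = (z * sigma / margin) ** 2
    # Finite population correction shrinks n0 when the population is small.
    n = n0 / (1 + (n0 - 1) / population_size)
    # Inflate for expected non-response, then round up to a whole number.
    return math.ceil(n / response_rate)

for N in (1_000, 100_000, 10_000_000):
    print(N, required_sample_size(sigma=15, margin=2, population_size=N))
```

Under these assumptions the required sample size is noticeably smaller for a population of 1,000 and barely changes between 100,000 and 10,000,000, which matches the remark about large populations.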

Point Estimates and Confidence Intervals

You have seen that the sample mean x̄ is an unbiased estimate of the population mean μ. Another
way to say this is that x̄ is the best point estimate of the true value of μ. Some error is associated
with this estimate, however; the true population mean may be larger or smaller than the sample
mean. Instead of a point estimate, you might want to identify a range of possible values μ might
take, controlling the probability that μ is not lower than the lowest value in this range and not
higher than the highest value. Such a range is called a confidence interval.
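
As a rough illustration, the sketch below turns a sample into a confidence interval for the mean. It assumes a large-sample normal approximation with the sample standard deviation standing in for the unknown population value; for small samples a t-based critical value is usually more appropriate. The function name and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, level=0.95):
    """Large-sample (normal approximation) confidence interval for the mean."""
    n = len(sample)
    x_bar = mean(sample)
    # Standard error uses the sample standard deviation (n - 1 denominator).
    std_err = stdev(sample) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * std_err
    return x_bar - margin, x_bar + margin

# Hypothetical sample of 30 measurements (illustrative values only).
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1,
          12.0, 11.9, 12.2, 12.3, 11.8, 12.1, 12.0, 12.4, 11.9, 12.2,
          12.1, 12.0, 12.3, 11.8, 12.2, 12.1, 11.9, 12.0, 12.4, 12.1]

low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Lowering `level` to 0.90 gives a narrower interval around the same sample mean.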

Point Estimation vs. Interval Estimation

The two main types of estimators in statistics are point estimators and interval estimators. Point
estimation contrasts with interval estimation: the former produces a single value, while the latter
produces a range of values. A point estimator is a statistic used to estimate the value of an
unknown parameter of a population. It uses sample data when calculating a single statistic that
will be the best estimate of the unknown parameter of the population.

• Point estimation is the process of finding an approximate value of some parameter, such as
the mean (average), of a population from random samples of the population. The accuracy of
any particular approximation is not known precisely, though probabilistic statements
concerning the accuracy of such numbers as found over many experiments can be constructed.

It is desirable for a point estimate to be:

(1) Consistent. The larger the sample size, the more accurate the estimate.

(2) Unbiased. The expectation of the observed values of many samples (“average observation
value”) equals the corresponding population parameter. For example, the sample mean is an
unbiased estimator for the population mean.

(3) Most efficient or best unbiased. Of all consistent, unbiased estimates, this is the one
possessing the smallest variance (a measure of the amount of dispersion away from the estimate). In
other words, the estimator that varies least from sample to sample. This generally depends
on the particular distribution of the population. For example, the mean is more efficient than
the median (middle value) for the normal distribution but not for more “skewed”
(asymmetrical) distributions.

• On the other hand, interval estimation uses sample data to calculate the interval of the
possible values of an unknown parameter of a population. The interval is constructed so that
it contains the parameter with a specified probability, commonly 95% or higher; such an
interval is known as a confidence interval.
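
The three desirable properties of a point estimate listed above (consistent, unbiased, efficient) can be illustrated by simulation. This is a sketch under stated assumptions: the population is standard normal, and the helper `simulate`, the sample sizes, and the seed are hypothetical choices.

```python
import random
from statistics import mean, median, pvariance

def simulate(estimator, n, trials=2000, mu=0.0, sigma=1.0, seed=7):
    """Return (average, variance) of an estimator over repeated normal samples."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return mean(estimates), pvariance(estimates)

# Unbiased: the average of many sample means is close to mu = 0.
avg_mean, var_mean = simulate(mean, n=25)

# Efficient: for normal data the sample mean varies less than the median.
avg_median, var_median = simulate(median, n=25)

# Consistent: a larger sample gives a less variable estimate.
_, var_mean_large = simulate(mean, n=100)

print(f"mean:   average={avg_mean:+.3f}, variance={var_mean:.4f}")
print(f"median: average={avg_median:+.3f}, variance={var_median:.4f}")
print(f"mean with n=100: variance={var_mean_large:.4f}")
```

For a different population distribution the comparison between the mean and the median may come out differently, as the text notes.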

A confidence interval is an estimate of an interval in statistics that may contain a population
parameter. It is generally defined by its lower and upper bounds. The confidence interval is
used to indicate how reliable an estimate is, and it is calculated from the observed data. The
endpoints of the intervals are referred to as the upper and lower confidence limits.
