8 Estimation [Agresti & Finlay (1997), Johnson & Bhattacharyya (1992), Moore & McCabe (1998) and Weiss (1999)]

In this section we consider how to use sample data to estimate unknown population parameters. Statistical inference uses sample data to form two types of estimators of parameters. A point estimate consists of a single number, calculated from the data, that is the best single guess for the unknown parameter. An interval estimate consists of a range of numbers around the point estimate, within which the parameter is believed to fall.

8.1 Point estimation

The object of point estimation is to calculate, from the sample data, a single number that is likely to be close to the unknown value of the population parameter. The available information is assumed to be in the form of a random sample $X_1, X_2, \ldots, X_n$ of size $n$ taken from the population. The object is to formulate a statistic whose value, computed from the sample data, reflects the value of the population parameter as closely as possible.

DEFINITION 8.1. A point estimator of an unknown population parameter is a statistic that estimates the value of that parameter. A point estimate of a parameter is the value of a statistic that is used to estimate the parameter. (Agresti & Finlay, 1997 and Weiss, 1999)

For instance, to estimate a population mean $\mu$, perhaps the most intuitive point estimator is the sample mean

    \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}.

Once the observed values $x_1, x_2, \ldots, x_n$ of the random variables $X_i$ are available, we can actually calculate the observed value of the sample mean $\bar{x}$, which is called a point estimate of $\mu$.

A good point estimator of a parameter is one whose sampling distribution is centered around the parameter and has as small a standard error as possible. A point estimator is called unbiased if its sampling distribution centers around the parameter in the sense that the parameter is the mean of the distribution. For example, the mean of the sampling distribution of the sample mean $\bar{X}$ equals $\mu$; thus $\bar{X}$ is an unbiased estimator of the population mean $\mu$.

A second preferable property for an estimator is a small standard error. An estimator whose standard error is smaller than those of other potential estimators is said to be efficient. An efficient estimator is desirable because, on average, it falls closer than other estimators to the parameter. For example, it can be shown that for a normal distribution the sample mean is an efficient estimator, and hence has a smaller standard error than, e.g., the sample median.

8.1.1 Point estimators of the population mean and standard deviation

The sample mean $\bar{X}$ is the obvious point estimator of a population mean $\mu$. In fact, $\bar{X}$ is unbiased, and it is relatively efficient for most population distributions. It is the point estimator, denoted by $\hat{\mu}$, used in this text:

    \hat{\mu} = \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}.

Moreover, the sample standard deviation $s$ is the most popular point estimate of the population standard deviation $\sigma$. That is,

    \hat{\sigma} = s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}.
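To make these two point estimates concrete, here is a minimal Python sketch (the sample values are made up purely for illustration): the sample mean gives $\hat{\mu}$, and the sample standard deviation, with its $n-1$ divisor, gives $\hat{\sigma}$.

    import math

    sample = [4.2, 5.1, 4.8, 5.6, 4.9, 5.3]   # hypothetical observations x_1, ..., x_n
    n = len(sample)

    # Point estimate of the population mean: the sample mean
    mu_hat = sum(sample) / n

    # Point estimate of the population standard deviation: the sample
    # standard deviation s, computed with the n - 1 divisor
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in sample) / (n - 1))

    print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")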
8.2 Confidence interval

For point estimation, a single number lies in the forefront, even though a standard error is attached. Instead, it is often more desirable to produce an interval of values that is likely to contain the true value of the unknown parameter.

A confidence interval estimate of a parameter consists of an interval of numbers obtained from a point estimate of the parameter, together with a percentage that specifies how confident we are that the parameter lies in the interval. The confidence percentage is called the confidence level.

DEFINITION 8.2 (Confidence interval). A confidence interval for a parameter is a range of numbers within which the parameter is believed to fall. The probability that the confidence interval contains the parameter is called the confidence coefficient. This is a chosen number close to 1, such as 0.95 or 0.99. (Agresti & Finlay, 1997)

8.2.1 Confidence interval for $\mu$ when $\sigma$ is known

We first confine our attention to the construction of a confidence interval for a population mean $\mu$, assuming that the population variable $X$ is normally distributed and that its standard deviation $\sigma$ is known. Recall from Key Fact 7.1 that when the population is normally distributed, the distribution of $\bar{X}$ is also normal, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. The normal table shows that the probability is 0.95 that a normal random variable will lie within 1.96 standard deviations of its mean. For $\bar{X}$, we then have

    P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95.

Now the relation $\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{X}$ is equivalent to $\mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$, and $\bar{X} < \mu + 1.96\frac{\sigma}{\sqrt{n}}$ is equivalent to $\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu$. The probability statement can therefore be rewritten as

    P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95,

so the random interval $\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\; \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$ contains $\mu$ with probability 0.95, and its observed value is a 95% confidence interval for $\mu$.
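Continuing the earlier illustration, the following Python sketch computes the observed 95% confidence interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$; the value of $\sigma$ and the data are assumptions chosen only for the example, with $\sigma$ treated as known.

    import math

    sigma = 2.0                                # population standard deviation, assumed known
    sample = [4.2, 5.1, 4.8, 5.6, 4.9, 5.3]    # hypothetical observations
    n = len(sample)
    x_bar = sum(sample) / n                    # observed sample mean

    margin = 1.96 * sigma / math.sqrt(n)       # 1.96 leaves probability 0.025 in each tail
    lower, upper = x_bar - margin, x_bar + margin
    print(f"95% confidence interval for mu: ({lower:.3f}, {upper:.3f})")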

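As a final, purely illustrative check of the meaning of the confidence coefficient in Definition 8.2, the simulation below (with arbitrarily chosen values $\mu = 10$, $\sigma = 2$, $n = 25$) repeatedly draws normal samples and records how often the interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ actually contains $\mu$; the empirical coverage should be close to 0.95.

    import math
    import random

    random.seed(1)                     # reproducibility
    mu, sigma, n = 10.0, 2.0, 25       # "true" parameters and sample size (illustrative)
    num_samples = 10_000

    covered = 0
    for _ in range(num_samples):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        margin = 1.96 * sigma / math.sqrt(n)
        if x_bar - margin < mu < x_bar + margin:
            covered += 1

    print(f"Empirical coverage: {covered / num_samples:.3f}")   # expect roughly 0.95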