Point Estimation, Interval Estimation
Point estimation
Desirable properties of point estimators
Interval estimation
Confidence intervals
Estimator
Assume that we have a sample $(x_1, x_2, \ldots, x_n)$ from a given population. All parameters of the population are known except some parameter $\theta$. We want to determine the unknown parameter $\theta$ from the given observations. In other words, we want to determine a number, or a range of numbers, from the observations that can be taken as the value of $\theta$.

An estimator is a method of estimation. An estimate is the result of applying an estimator to a sample. Point estimation, as the name suggests, is the estimation of a population parameter with a single number.

The problem of statistics is not to find estimates but to find estimators. An estimator is not rejected because it gives a bad result for one sample. It is rejected when it gives bad results in the long run, i.e. for many, many samples. An estimator is accepted or rejected depending on its sampling properties: it is judged by the properties of the distribution of the estimates it gives rise to.
Properties of estimator
Since an estimator gives rise to an estimate that depends on the sample points $(x_1, x_2, \ldots, x_n)$, the estimate is a function of the sample points. The sample points are random variables, therefore the estimate is a random variable and has a probability distribution. We want an estimator to have several desirable properties, such as:
1. Consistency
2. Unbiasedness
3. Minimum variance
In general it is not possible for an estimator to have all these properties. Note that an estimator is a sample statistic, i.e. it is a function of the sample elements.
Here we used the fact that expectation and summation can change order (remember that expectation is integration for continuous random variables and summation for discrete random variables) and that the expectation of each sample point is equal to the population mean. Knowledge of the population distribution was not necessary to derive the unbiasedness of the sample mean. The result holds for samples taken from a population with any distribution for which the first moment exists.
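The derivation that the paragraph above comments on is not reproduced in the text. A standard version, assuming the sample points $x_1, \ldots, x_n$ are drawn from a population with mean $\mu$ (the symbols $\bar{x}$ and $\mu$ are notation chosen here, not taken from the source), would read:

```latex
E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
           = \frac{1}{n}\sum_{i=1}^{n} E(x_i)
           = \frac{1}{n}\, n\mu
           = \mu
```

The second equality is the exchange of expectation and summation, and the third uses $E(x_i) = \mu$ for every sample point.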
What is the bias of this estimator, the sample variance with divisor $n$, $t_n = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$? We could derive the distribution of $t_n$ and then use it to find the expectation value. If the population has a normal distribution, $t_n$ is a multiple of a $\chi^2$ random variable with $n-1$ degrees of freedom. Let us use a direct approach:
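\[
\begin{aligned}
E(t_n) &= \frac{1}{n}\sum_{i=1}^{n} E(x_i^2) - \frac{1}{n^2} E\Big(\big(\sum_{i=1}^{n} x_i\big)^2\Big)
        = E(x^2) - \frac{1}{n^2}\sum_{i=1,\,j=1}^{n} E(x_i x_j) \\
       &= E(x^2) - \frac{1}{n^2}\Big(\sum_{i=1}^{n} E(x_i^2) + \sum_{i \ne j} E(x_i x_j)\Big)
        = E(x^2) - \frac{1}{n^2}\big(n E(x^2) + n(n-1) E(x)^2\big) \\
       &= E(x^2) - \frac{1}{n} E(x^2) - \frac{n-1}{n} E(x)^2
        = \frac{n-1}{n}\big(E(x^2) - E(x)^2\big)
        = \frac{n-1}{n}\,\sigma^2
\end{aligned}
\]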
The sample variance with divisor $n$ is therefore not an unbiased estimator of the population variance. That is why, when the mean and variance are both unknown, the following equation is used for the sample variance:
\[
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
\]
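The bias of the divisor-$n$ estimator can also be checked by simulation. The following is a minimal sketch, assuming NumPy is available and a normal population; the sample size, the true variance and the number of repetitions are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable

n = 5                 # sample size (assumed for illustration)
sigma2 = 4.0          # true population variance (assumed)
n_samples = 200_000   # number of repeated samples

# Each row is one sample of size n from a normal population with variance sigma2.
samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(n_samples, n))

t_n = samples.var(axis=1, ddof=0)   # divisor n
s2 = samples.var(axis=1, ddof=1)    # divisor n - 1

mean_t_n = t_n.mean()   # long-run average of the divisor-n estimator
mean_s2 = s2.mean()     # long-run average of the divisor-(n-1) estimator

print("average of t_n:", mean_t_n, " theory:", (n - 1) / n * sigma2)
print("average of s^2:", mean_s2, " theory:", sigma2)
```

Averaged over many samples, the divisor-$n$ estimator settles near $\frac{n-1}{n}\sigma^2$, while the divisor-$(n-1)$ estimator settles near $\sigma^2$.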
The variance of the estimator $t_n$ is:
\[
V(t_n) = E\big[(t_n - E(t_n))^2\big]
\]
Exercise: What is the variance of the sample mean?

As we noted, if the estimator for $\theta$ is $t_n$ then the difference between them is the error of the estimation. The expectation value of this error is the bias. The expectation value of the square of this error is called the mean square error (m.s.e.):
\[
M = E\big[(t_n - \theta)^2\big]
\]
It can be expressed through the bias and the variance of the estimator:
\[
M(t_n) = E\big[(t_n - \theta)^2\big]
       = E\big[(t_n - E(t_n) + E(t_n) - \theta)^2\big]
       = E\big[(t_n - E(t_n))^2\big] + \big(E(t_n) - \theta\big)^2
       = V(t_n) + B^2(t_n)
\]
The cross term vanishes because $E(t_n - E(t_n)) = 0$. The m.s.e. is equal to the square of the estimator's bias plus the variance of the estimator. If the bias is 0 then the m.s.e. is equal to the variance. In estimation there is usually a trade-off between unbiasedness and minimum variance. In an ideal world we would like to have a minimum variance unbiased estimator. It is not always possible.
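The decomposition and the trade-off can be illustrated numerically. This is a sketch under stated assumptions: NumPy, a normal population, and the population variance playing the role of $\theta$; the two estimators compared are the divisor-$n$ and divisor-$(n-1)$ sample variances, and all numeric settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 10                # sample size (assumed)
theta = 1.0           # true parameter: here the population variance (assumed)
n_samples = 100_000

samples = rng.normal(0.0, np.sqrt(theta), size=(n_samples, n))

def mse_parts(estimates, true_value):
    """Return (m.s.e., variance, bias) of a set of estimates of true_value."""
    mse = np.mean((estimates - true_value) ** 2)
    variance = estimates.var()              # spread around the estimates' own mean
    bias = estimates.mean() - true_value
    return mse, variance, bias

mse_biased, var_biased, bias_biased = mse_parts(samples.var(axis=1, ddof=0), theta)
mse_unbiased, var_unbiased, bias_unbiased = mse_parts(samples.var(axis=1, ddof=1), theta)

print("divisor n    :", mse_biased, "=", var_biased, "+", bias_biased ** 2)
print("divisor n - 1:", mse_unbiased, "=", var_unbiased, "+", bias_unbiased ** 2)
```

For a normal population the biased estimator ends up with the smaller m.s.e. here, which is one concrete instance of the trade-off between unbiasedness and variance.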
Since the sample is from the population with density $f(x)$, the sample mean is the plug-in estimator for the population mean. Exercise: What is the plug-in estimator for the population variance? What is the plug-in estimator for the covariance? Hint: the population variance and covariance are calculated as:
\[
\sigma^2 = \int (x - \mu)^2 f(x)\,dx
\quad\text{and}\quad
\mathrm{cov}(X, Y) = \iint (x - \mu_x)(y - \mu_y)\, f(x, y)\,dx\,dy
\]
Replace the integration with summation and divide by the number of elements in the sample. Since the sample was drawn from the population with the given distribution, it is not necessary to multiply by $f(x)$.
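One possible reading of the hint, sketched in code: the integrals become averages over the sample and the population means are replaced by sample means. The function names and the data values below are hypothetical and chosen only for illustration; NumPy is assumed.

```python
import numpy as np

def plugin_variance(x):
    """Plug-in variance: average squared deviation from the sample mean."""
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean()) ** 2)

def plugin_covariance(x, y):
    """Plug-in covariance: average product of deviations from the sample means."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))

# Made-up paired observations.
x = [2.0, 4.0, 4.0, 5.0, 7.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0]

var_x = plugin_variance(x)
cov_xy = plugin_covariance(x, y)

print("plug-in variance of x:", var_x)
print("plug-in covariance of x and y:", cov_xy)
```

Both functions divide by the number of elements in the sample, so they agree with `np.var(x)` and `np.cov(x, y, bias=True)[0, 1]`, not with the divisor-$(n-1)$ versions.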
Least-squares estimator
Another well-known and popular estimator is the least-squares estimator. If we have a sample and we think (because of some prior knowledge) that all parameters of interest enter through the mean value of the population, then the least-squares method estimates $\theta$ by minimising the weighted sum of squared differences between the observations and the mean value:
\[
\sum_{i=1}^{n} w_i \big(x_i - \mu(\theta)\big)^2 \to \min
\]
Exercise: Verify that if the only unknown parameter is the mean of the population and all $w_i$ are equal to each other, then the least-squares estimator is the sample mean.
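A numerical check of the exercise (not a proof) can be sketched as follows. It assumes NumPy, made-up observations, and a crude grid search in place of a proper optimiser; the function name `weighted_sse` is hypothetical.

```python
import numpy as np

def weighted_sse(mu, x, w):
    """Weighted sum of squared differences between the observations and mu."""
    return np.sum(w * (x - mu) ** 2)

x = np.array([1.2, 0.7, 2.9, 1.8, 2.4])   # made-up observations
w = np.ones_like(x)                        # all weights equal

# Crude grid search over candidate values of the mean (illustration only).
grid = np.linspace(x.min(), x.max(), 22_001)
objective = np.array([weighted_sse(m, x, w) for m in grid])
mu_hat = grid[np.argmin(objective)]

# Setting the derivative of the objective to zero gives the weighted mean.
closed_form = np.sum(w * x) / np.sum(w)

print("grid minimiser :", mu_hat)
print("closed form    :", closed_form)
print("sample mean    :", x.mean())
```

With equal weights the minimiser found on the grid matches the sample mean up to the grid spacing. With unequal weights the same closed form gives the weighted mean.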
Interval estimation
Estimating the parameter is not sufficient. It is also necessary to analyse how confident we can be about a particular estimate. One way of doing this is to define confidence intervals. Having estimated $\theta$, we want to know whether the true parameter is close to our estimate. In other words, we want to find an interval that satisfies the following relation:
\[
P(G_L < \theta < G_U) \ge 1 - \alpha
\]
I.e. the probability that the true parameter is in the interval $(G_L, G_U)$ is at least $1-\alpha$. An actual realisation of this interval, $(g_L, g_U)$, is called a $100(1-\alpha)\%$ confidence interval. The limits of the interval are called the lower and upper confidence limits, and $1-\alpha$ is called the confidence level.

Example: If the population variance $\sigma^2$ is known and we estimate the population mean, then
\[
Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \quad \text{is normal } N(0, 1)
\]
We can find from the table that the probability that $Z$ is more than 1 is equal to 0.1587. The probability that $Z$ is less than $-1$ is again 0.1587. These values come from the tables of the standard normal distribution.
The confidence level that the true value is within 1 standard error (the standard deviation of the sampling distribution) of the sample mean is therefore 1 - 2 × 0.1587 = 0.6826. The probability that the true value is within 2 standard errors of the sample mean is 0.9545.

What we did here was to find the sampling distribution and use it to define confidence intervals. Here we used a two-sided symmetric interval. Intervals do not have to be two-sided or symmetric. Under some circumstances non-symmetric intervals might be better. For example, it might be better to diagnose a patient for a particular treatment than not. If the doctor makes an error and does not treat the patient, the patient might die. But if the doctor makes a mistake and starts treatment, he can stop and correct the mistake at some later time.
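The table look-ups above can be reproduced with the standard library. The sketch below uses `statistics.NormalDist` (Python 3.8 or later); the helper `mean_confidence_interval` and the numbers passed to it are hypothetical and only illustrate a two-sided symmetric interval for the mean when $\sigma$ is known.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal: mean 0, standard deviation 1

p_upper_tail = 1 - std_normal.cdf(1.0)                   # P(Z > 1)
within_1 = std_normal.cdf(1.0) - std_normal.cdf(-1.0)    # P(-1 < Z < 1)
within_2 = std_normal.cdf(2.0) - std_normal.cdf(-2.0)    # P(-2 < Z < 2)

def mean_confidence_interval(xbar, sigma, n, alpha):
    """Two-sided symmetric 100(1 - alpha)% interval for the mean, sigma known."""
    z = std_normal.inv_cdf(1 - alpha / 2)    # upper alpha/2 point of N(0, 1)
    half_width = z * sigma / n ** 0.5        # z times the standard error
    return xbar - half_width, xbar + half_width

# Made-up inputs: sample mean 10, known sigma 2, sample size 25, 95% level.
low, high = mean_confidence_interval(xbar=10.0, sigma=2.0, n=25, alpha=0.05)

print("P(Z > 1)      =", round(p_upper_tail, 4))
print("P(|Z| < 1)    =", round(within_1, 4))
print("P(|Z| < 2)    =", round(within_2, 4))
print("95% interval  =", (round(low, 3), round(high, 3)))
```

Computed without intermediate rounding, the one-standard-error probability is closer to 0.6827. The 0.6826 in the text comes from combining the rounded table values.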
If the population variance is unknown, it is replaced by its estimate from the sample:
\[
Z = \frac{\bar{x} - \mu}{s / \sqrt{n}}
\]
Here $s^2$ is the sample variance. Since $Z$ is the ratio of a standard normal random variable to the square root of a $\chi^2$ random variable with $n-1$ degrees of freedom divided by its degrees of freedom, $Z$ has Student's t distribution with $n-1$ degrees of freedom. In this case we can use the table of the t distribution to find confidence levels. It is not surprising that when we do not know the population variance, the confidence intervals for the same confidence levels become larger. That is the price we pay for what we do not know. If the number of degrees of freedom becomes large then the t distribution is approximated well by the normal distribution. For $n > 100$ we can use the normal distribution to find confidence levels and intervals.
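The price paid for not knowing the variance can be seen in a small coverage simulation. This is a sketch under stated assumptions: NumPy, a normal population, a sample size of 5, and the 97.5% points 1.960 (standard normal) and 2.776 (t with 4 degrees of freedom) entered by hand from standard tables.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 5                    # small sample size (assumed)
mu, sigma = 0.0, 1.0     # true population mean and standard deviation (assumed)
n_samples = 200_000

z_975 = 1.960   # 97.5% point of N(0, 1), from tables
t_975 = 2.776   # 97.5% point of t with n - 1 = 4 degrees of freedom, from tables

samples = rng.normal(mu, sigma, size=(n_samples, n))
xbar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)      # square root of the sample variance s^2
std_err = s / np.sqrt(n)             # estimated standard error of the mean

def coverage(quantile):
    """Fraction of samples whose interval xbar +/- quantile * s / sqrt(n) contains mu."""
    return np.mean(np.abs(xbar - mu) <= quantile * std_err)

coverage_normal = coverage(z_975)    # normal quantile used with estimated s
coverage_t = coverage(t_975)         # t quantile used with estimated s

print("coverage with normal quantile:", coverage_normal)
print("coverage with t quantile     :", coverage_t)
```

With so few observations the normal quantile gives intervals that contain the true mean noticeably less often than 95% of the time, while the wider t interval comes close to the nominal level.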