
Estimation

Introduction
• Estimation: the process of estimating population parameters from sample
observations by means of a statistic.
• E.g., the sample mean (X̄) is employed to estimate the population mean (µ).
• The sample standard deviation (s) is employed to estimate the population
standard deviation (σ).
Managerial Questions
• How can the national average salary for computer positions be
estimated using sample data? How much error is involved in
such estimation? How much confidence do we have in this
estimation?
• One survey reported that 37% of the workers felt that
companies are not supervising their work enough. This figure
came from a survey of 1,200 employees and is only a sample
statistic. Can we say from this survey that 37% of all
employees in the country feel this way? Why or why not?
Can we use the 37% as an estimate for the population
parameter? If we do, how much error is there; and how much
confidence do we have in the final results?
Types of Estimators
• Point Estimation:
– A single number

• Interval Estimation
– provides additional information about variability
Point Estimation
• A point estimator draws inferences about a population by
estimating the value of an unknown parameter using a single
value or point.
• We can estimate the population mean, µ, using the sample mean:
\[ \bar{x} = \frac{\sum x}{n} \]
Interval Estimation
• An interval estimator draws inferences about a population by
estimating the value of an unknown parameter using an
interval.

• That is, we say (with some percent certainty or confidence) that the
population parameter of interest is between some lower and upper bounds.
Example
• For example, suppose we want to estimate the mean summer
income of a class of business students. For n = 25 students,
X̄ is calculated to be 400 $/week.
– Point estimate: 400 $/week
– Interval estimate: 380 to 420 $/week
• An alternative statement is: The mean income is between 380 and
420 $/week with 95% confidence.
Confidence Interval

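[Figure: a confidence interval running from the lower confidence bound to the
upper confidence bound, with the point estimate between them; the distance
between the bounds is the width of the confidence interval.]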
Level of Confidence
• Probability that the unknown population parameter falls within
the interval
• Denoted (1 – α) = level of confidence
• α is the probability that the parameter is not within the interval.
• If we say that the population mean, µ, falls between a and b
with 95% confidence (i.e., α = 0.05), then
mathematically, P(a < µ < b) = 0.95
Graphical Representation

[Figure: an interval on the number line; µ may be here, or here, or possibly
even here.]
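Estimating µ from Large Samples When σ is Known
• The probability that the interval
\[ \left( \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) \]
contains the population mean µ is 1 – α.
• This is the interval estimator for µ with 100(1 – α)% confidence.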
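The claim that the interval contains µ with probability 1 – α can be checked by simulation. The sketch below is illustrative only: the population values (µ = 100, σ = 15), the sample size, the number of trials, and the function names are assumptions chosen for the demonstration, not values from these slides.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(sample_mean, sigma, n, confidence):
    """Interval sample_mean +/- z_{alpha/2} * sigma / sqrt(n) for a known sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def coverage(mu, sigma, n, confidence, trials=5000, seed=1):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(fmean(sample), sigma, n, confidence)
        if lower < mu < upper:
            hits += 1
    return hits / trials


# Hypothetical population: mu = 100, sigma = 15 (assumed for illustration).
rate = coverage(mu=100.0, sigma=15.0, n=40, confidence=0.95)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should land near 0.95, with some run-to-run variation if the seed is changed.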
Graphical Representation

[Figure: the interval estimator drawn around X̄, with its width marked.]
Commonly Used Confidence Levels
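The critical value Z that goes with a confidence level can be computed rather than looked up. This is a minimal sketch using Python's standard library; the function name `z_critical` and the particular levels listed are assumptions for illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


# Levels chosen for illustration; any value strictly between 0 and 1 works.
for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```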
Example
• A sample of 11 circuits from a large normal population has a
mean resistance of 2.20 ohms. We know from past testing that
the population standard deviation is 0.35 ohms. Determine a
95% confidence interval for the true mean resistance of the
population.
• Given: X ~ N(µ, 0.35²); n = 11, X̄ = 2.20 ohms
1 – α = 0.95 => α/2 = 0.025
Solution
• Solution:
\[ \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\left(\frac{0.35}{\sqrt{11}}\right) = 2.20 \pm 0.2068 \]
\[ 1.9932 \le \mu \le 2.4068 \]
• We are 95% confident that the true mean resistance is
between 1.9932 and 2.4068 ohms.
• Although the true mean may or may not be in this
interval, 95% of intervals formed in this manner will
contain the true mean.
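The circuit calculation can be reproduced in a few lines. This is a sketch, not a definitive implementation: the function name `z_confidence_interval` is hypothetical, and it uses the exact normal quantile rather than the rounded table value 1.96, so the last digits may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from the circuit example: mean 2.20 ohms, sigma 0.35 ohms, n = 11.
lower, upper = z_confidence_interval(2.20, 0.35, 11, 0.95)
print(f"{lower:.4f} <= mu <= {upper:.4f}")
```

The same function can be applied to the other known-σ examples by changing the four arguments.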
Example
• A survey was taken of U.S. companies that do business with
firms in India. One of the questions on the survey was:
Approximately how many years has your company been
trading with firms in India? A random sample of 44 responses
to this question yielded a mean of 10.455 years. Suppose the
population standard deviation for this question is 7.7 years.
Using this information, construct a 90% confidence interval
for the mean number of years that a company has been trading
in India for the population of U.S. companies trading with
firms in India. [Ans: (8.545, 12.365)]
Example
• A study is conducted in a company that employs 800
engineers. A random sample of 50 engineers reveals that the
average sample age is 34.3 years. Historically the population
standard deviation of the age of the company’s engineers is
approximately 8 years. Construct a 98% confidence interval to
estimate the average age of all the engineers in the company.
[Ans: (31.66, 36.94)]
Estimating µ from Small Samples
• When the sample size is 30 or less and σ has to be estimated by the
sample standard deviation S, the normal distribution is not the
appropriate sampling distribution.
• In that case, the t-distribution (or Student's t-distribution) is
more appropriate.
• Interval estimate:
\[ \bar{X} \pm t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} = \left( \bar{X} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \right) \]
The t Distribution
• Developed by the British statistician William Gosset
• A family of distributions: a unique distribution for each value
of its parameter, degrees of freedom (d.f.)
• Symmetric, unimodal, mean = 0, flatter than the standard normal (z) distribution
• t formula:
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]
Comparison of Selected t Distributions to the Standard Normal

[Figure: density curves of the standard normal and of t with d.f. = 25, 5,
and 1, plotted from -3 to 3.]
Table of Critical Values of t

df     t0.100   t0.050   t0.025   t0.010   t0.005
1      3.078    6.314    12.706   31.821   63.656
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032
...
23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787
...
29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750
...
40     1.303    1.684    2.021    2.423    2.704
60     1.296    1.671    2.000    2.390    2.660
120    1.289    1.658    1.980    2.358    2.617
∞      1.282    1.645    1.960    2.327    2.576

With df = 24 and α = 0.05, tα = 1.711.
Student’s t-distribution
• Note: t → Z as n increases.
• t-distributions are bell-shaped and symmetric, but have 'fatter' tails
than the normal.

[Figure: density curves of the standard normal (t with df = ∞), t (df = 13),
and t (df = 5).]
Degrees of Freedom (df)
• Number of observations that are free to vary after the sample statistic
has been calculated
• Example: the sum of 3 numbers is 6
– X1 = 1 (or any number)
– X2 = 2 (or any number)
– X3 = 3 (cannot vary)
– Sum = 6
• Degrees of freedom = n – 1 = 3 – 1 = 2
Example
• A random sample of n = 25 taken from a normal
population has X̄ = 50 and s = 8. Form a 95%
confidence interval for µ.
1 – α = 0.95 => α = 0.05, so α/2 = 0.025
df = n – 1 = 24, so t_{α/2, n–1} = t_{0.025, 24} = 2.064
[In the Student's t table, 2.064 is the value in the t0.025 column
(α/2 = 0.025) for df = 24.]
The confidence interval is
\[ \bar{X} \pm t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} = 50 \pm (2.0639)\frac{8}{\sqrt{25}} = [46.698,\ 53.302] \]
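The small-sample interval can be sketched in code as well. Python's standard library has no t quantile function, so this sketch looks the critical value up in a small dictionary copied from the t0.025 column of the table of critical values; the names `T_0025` and `t_confidence_interval_95` are hypothetical, and only the listed degrees of freedom are supported.

```python
from math import sqrt

# t0.025 critical values (two-sided 95% confidence), keyed by degrees of freedom.
# Only a few rows of the table are copied here.
T_0025 = {23: 2.069, 24: 2.064, 25: 2.060, 29: 2.045, 30: 2.042}


def t_confidence_interval_95(sample_mean, s, n):
    """95% interval sample_mean +/- t_{0.025, n-1} * s / sqrt(n)."""
    t = T_0025[n - 1]  # raises KeyError for df values not copied above
    margin = t * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from the example: n = 25, sample mean 50, s = 8.
lower, upper = t_confidence_interval_95(50, 8, 25)
print(f"[{lower:.3f}, {upper:.3f}]")
```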
Example
The owner of a large equipment rental company wants to make a
rather quick estimate of the average number of days a piece of
ditchdigging equipment is rented out per person per time. The
company has records of all rentals, but the amount of time required
to conduct an audit of all accounts would be prohibitive. The owner
decides to take a random sample of rental invoices. Fourteen
different rentals of ditch diggers are collected randomly from the
files, yielding the following data. She uses these data to construct a
99% confidence interval to estimate the average number of days
that a ditch digger is rented and assumes that the number of days
per rental is normally distributed in the population.
Data: 3 1 3 2 5 1 2 1 4 2 1 3 1 1
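One way to work this example is sketched below. The sample mean and standard deviation come straight from the data; the critical value t_{0.005, 13} = 3.012 is an assumption taken from a standard t table, since the df = 13 row is not among the rows of the table of critical values reproduced in these slides.

```python
from math import sqrt
from statistics import mean, stdev

# Rental durations in days, from the example.
days = [3, 1, 3, 2, 5, 1, 2, 1, 4, 2, 1, 3, 1, 1]

n = len(days)
x_bar = mean(days)   # sample mean
s = stdev(days)      # sample standard deviation (n - 1 divisor)

# Assumption: t_{0.005, 13} from a standard t table (99% confidence, df = 13).
T_CRIT = 3.012

margin = T_CRIT * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}")
print(f"99% interval: {lower:.2f} to {upper:.2f} days")
```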
Determination of Sample Size
• The required sample size needed to estimate a
population parameter to within a selected margin of
error (e) using a specified level of confidence (1 – α)
can be computed
• The margin of error is also called sampling error
– the amount of imprecision in the estimate of the
population parameter
– the amount added and subtracted to the point estimate to
form the confidence interval
Determining Sample Size for the Mean
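• The sampling error (margin of error) e is the amount added to and
subtracted from X̄ in the interval:
\[ \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \qquad e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
• Now solve for n to get
\[ n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}} \]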
Determining Sample Size for the Mean

• To determine the required sample size for the mean, you must know:
– The desired level of confidence (1 – α), which determines the
critical Z value
– The acceptable sampling error, e
– The standard deviation, σ
Example
• If σ = 45, what sample size is needed to
estimate the mean within ± 5 with 90%
confidence?
\[ n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}} = \frac{(1.645)^{2}(45)^{2}}{5^{2}} = 219.19 \]
• So the required sample size is n = 220 (always round up).
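The sample-size formula, including the round-up step, can be sketched as follows. The function name `required_sample_size` is hypothetical. Because the code uses the exact normal quantile instead of the rounded 1.645, the unrounded value differs slightly from 219.19, but the rounded-up answer is the same.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin_of_error, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin_of_error) ** 2)  # always round up


# Values from the example: sigma = 45, e = 5, 90% confidence.
n = required_sample_size(45, 5, 0.90)
print(n)
```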
Estimators and Their Properties
• Desirable properties of estimators include:
– Unbiasedness
– Efficiency
– Consistency
– Sufficiency
Unbiasedness
A statistic t is said to be an unbiased estimator of a parameter θ
if the expected value of t is θ:
\[ E(t) = \theta \]
Otherwise, the estimator is said to be biased. The bias of a statistic
in estimating θ is given as
\[ \text{Bias} = E(t) - \theta \]
Unbiased and Biased Estimators

[Figure: bias shown as the gap between where the estimator lands on average
and the target.]
• An unbiased estimator is on target on average.
• A biased estimator is off target on average.
Efficiency
An estimator is efficient if it has a relatively small variance (and
standard deviation).
• An efficient estimator is, on average, closer to the parameter being
estimated.
• An inefficient estimator is, on average, farther from the parameter being
estimated.
Consistency and Sufficiency
An estimator is said to be consistent if its probability of being close
to the parameter it estimates increases as the sample size increases.

[Figure: Consistency. Sampling distributions of an estimator for n = 10 and
n = 100.]

An estimator is said to be sufficient if it contains all the information
in the data about the parameter it estimates.
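Consistency can be illustrated with a small simulation of the sample mean at n = 10 and n = 100. This is a sketch under assumed values: the population (µ = 50, σ = 10), the closeness tolerance of 2, the trial count, and the function name `fraction_close` are all chosen for illustration.

```python
import random
from statistics import fmean


def fraction_close(n, mu=50.0, sigma=10.0, tol=2.0, trials=2000, seed=7):
    """Fraction of simulated sample means that fall within tol of mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    close = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(fmean(sample) - mu) <= tol:
            close += 1
    return close / trials


p10 = fraction_close(10)
p100 = fraction_close(100)
print(f"n = 10:  {p10:.3f} of sample means within 2 of mu")
print(f"n = 100: {p100:.3f} of sample means within 2 of mu")
```

The fraction for n = 100 should be clearly larger than for n = 10, which is the behavior the definition of consistency describes.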
Properties of the Sample Mean
For a normal population, both the sample mean and
sample median are unbiased estimators of the
population mean, but the sample mean is both more
efficient (because it has a smaller variance), and
sufficient. Every observation in the sample is used in
the calculation of the sample mean, but only the middle
value is used to find the sample median.
In general, the sample mean is the best estimator of the
population mean. The sample mean is the most
efficient unbiased estimator of the population mean. It
is also a consistent estimator.
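The comparison between the sample mean and the sample median for a normal population can be checked by simulation. The sketch below uses assumed values (a standard normal population, n = 25, 4000 trials) and a hypothetical function name; it estimates the average and the variance of each estimator across repeated samples.

```python
import random
from statistics import fmean, median, pvariance


def compare_mean_and_median(n=25, mu=0.0, sigma=1.0, trials=4000, seed=3):
    """Return (avg of means, avg of medians, var of means, var of medians)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(fmean(sample))
        medians.append(median(sample))
    return fmean(means), fmean(medians), pvariance(means), pvariance(medians)


avg_mean, avg_median, var_mean, var_median = compare_mean_and_median()
print(f"average of sample means:   {avg_mean:.4f}")
print(f"average of sample medians: {avg_median:.4f}")
print(f"variance of sample means:   {var_mean:.4f}")
print(f"variance of sample medians: {var_median:.4f}")
```

Both averages should sit near the true mean of 0 (unbiasedness), while the variance of the sample means should come out smaller than that of the sample medians (efficiency).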
