Interval Estimation

Data Science for Managerial
Decisions
Interval Estimation
Population Mean: σ Known
Population Mean: σ Unknown
Determining the Sample Size
Margin of Error and the Interval Estimate
A point estimator cannot be expected to provide the exact value of the population parameter.
An interval estimate can be computed by adding and subtracting a margin of error to the point
estimate.
Point Estimate +/- Margin of Error
The purpose of an interval estimate is to provide information about how close the point estimate is
to the value of the parameter.
The general form of an interval estimate of a population mean is
\[ \bar{x} \pm \text{Margin of Error} \]
Interval Estimate of a Population Mean:
σ Known
In order to develop an interval estimate of a population mean, the margin of error must be
computed using either:
▪ the population standard deviation σ, or
▪ the sample standard deviation s
▪ σ is rarely known exactly, but often a good estimate can be obtained based on historical data.
▪ We refer to such cases as the σ known case.
σ Known
◼ Interval Estimate of μ
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
where: x̄ is the sample mean
1 − α is the confidence coefficient
\(z_{\alpha/2}\) is the z value providing an area of α/2 in the upper tail of the standard normal probability distribution
σ is the population standard deviation
n is the sample size
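σ Known
[Figure: sampling distribution of x̄, centered at μ, with 1 − α of all x̄ values in the middle and an area of α/2 in each tail.]
[Figure: sampling distribution of x̄ (centered at μ) shown next to the sampling distribution of z (centered at 0); the middle 1 − α of all z values lies between \(-z_{\alpha/2}\) and \(+z_{\alpha/2}\).]
[Figure: sampling distribution of x̄ with several intervals of half-width \(z_{\alpha/2}\sigma_{\bar{x}}\) drawn around individual sample means; an interval built from a sample mean in one of the α/2 tails does not include μ, while the other intervals include μ.]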
σ Known
Values of \(z_{\alpha/2}\) for the Most Commonly Used Confidence Levels
Confidence Level    α      α/2     Table Look-up Area    \(z_{\alpha/2}\)
90%                 .10    .05     .9500                 1.645
95%                 .05    .025    .9750                 1.960
99%                 .01    .005    .9950                 2.576
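The table values can be reproduced numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is illustrative and not part of the original slides.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence_level
    # The "table look-up area" is the cumulative probability 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

Rounded to three decimals, the results agree with the 1.645, 1.960, and 2.576 listed in the table.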
σ Known
Because 90% of all the intervals constructed using \(\bar{x} \pm 1.645\sigma_{\bar{x}}\) will contain the population mean, we
say we are 90% confident that the interval \(\bar{x} \pm 1.645\sigma_{\bar{x}}\) includes the population mean μ.
We say that this interval has been established at the 90% confidence level.
The value .90 is referred to as the confidence coefficient.
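The statement that 90% of such intervals contain μ can be checked by simulation. The sketch below is only an illustration: the population values (μ = 50, σ = 10), the sample size, the number of trials, and the function name `coverage_rate` are all assumptions chosen for the demonstration, not figures from the slides.

```python
import random
from math import sqrt


def coverage_rate(mu, sigma, n, z, trials, seed=0):
    """Share of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population and average it.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials


# Hypothetical population: mu = 50, sigma = 10; samples of size 25.
rate = coverage_rate(mu=50.0, sigma=10.0, n=25, z=1.645, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

With enough trials the printed share should settle near .90, which is what the confidence coefficient describes.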
σ Known
Example: Lloyd’s Department Store
Each week Lloyd’s Department Store selects a random sample of 100 customers in order to learn about the
amount spent per shopping trip.
With x representing the amount spent per shopping trip, the sample mean x̄ provides a point estimate of μ, the
mean amount spent per shopping trip for the population of all Lloyd’s customers.
Lloyd’s has been using the weekly survey for several years. Based on the historical data, Lloyd’s now
assumes a known value of σ = $20 for the population standard deviation. The historical data also indicate
that the population follows a normal distribution.
During the most recent week, Lloyd’s surveyed 100 customers (n = 100) and obtained a sample mean of x̄ =
$82. The sample mean amount spent provides a point estimate of the population mean amount spent per
shopping trip, μ.
Determine the interval estimate of μ at the 95% confidence level.
σ Known
95% of the sample means that can be observed are within \(\pm 1.96\sigma_{\bar{x}}\) of the population mean μ.
The margin of error is:
\[ z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 1.96 \left( \frac{20}{\sqrt{100}} \right) = 3.92 \]
Thus, at 95% confidence, the margin of error is $3.92.
σ Known
Interval estimate of μ is:
$82 ± $3.92
or
$78.08 to $85.92
We are 95% confident that the interval contains the population mean.
σ Known
Confidence Level    Margin of Error    Interval Estimate
90%                 3.29               78.71 to 85.29
95%                 3.92               78.08 to 85.92
99%                 5.15               76.85 to 87.15
In order to have a higher degree of confidence, the margin of error and thus the width of the confidence interval must be larger.
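The Lloyd’s figures at the three confidence levels can be recomputed with a few lines of Python. This is a minimal sketch using the slide values (x̄ = 82, σ = 20, n = 100) and the tabulated z values; the function name `z_interval` is illustrative.

```python
from math import sqrt


def z_interval(x_bar, sigma, n, z):
    """Return (margin of error, lower limit, upper limit) for the sigma-known case."""
    margin = z * sigma / sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# z values taken from the table of commonly used confidence levels.
for level, z in (("90%", 1.645), ("95%", 1.960), ("99%", 2.576)):
    margin, low, high = z_interval(x_bar=82.0, sigma=20.0, n=100, z=z)
    print(f"{level}: margin {margin:.2f}, interval {low:.2f} to {high:.2f}")
```

Holding x̄, σ, and n fixed, only the z value changes between rows, which is why a higher confidence level gives a wider interval.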
σ Known
Adequate Sample Size
In most applications, a sample size of n ≥ 30 is adequate.
If the population distribution is highly skewed or contains outliers, a sample size of 50 or more is
recommended.
If the population is not normally distributed but is roughly symmetric, a sample size as small as 15
will suffice.
If the population is believed to be at least approximately normal, a sample size of less than 15 can
be used.
σ Unknown
If an estimate of the population standard deviation σ cannot be developed prior to sampling, we use
the sample standard deviation s to estimate σ.
This is the σ unknown case.
In this case, the interval estimate for μ is based on the t distribution.
t Distribution
William Gosset (13 June 1876 – 16 October 1937), writing under the name “Student”, is the founder of the t distribution.
Gosset was an Oxford graduate in mathematics and worked for the Guinness Brewery in Dublin.
He developed the t distribution while working on small-scale materials and temperature
experiments.
t Distribution
A specific t distribution depends on a parameter known as the degrees of freedom.
Degrees of freedom refer to the number of independent pieces of information that go into the
computation of s.
A t distribution with more degrees of freedom has less dispersion.
As the degrees of freedom increase, the difference between the t distribution and the standard
normal probability distribution becomes smaller and smaller.
t Distribution
[Figure: the standard normal distribution compared with t distributions with 10 and 20 degrees of freedom; horizontal axis labeled z, t and centered at 0.]
t Distribution
For more than 100 degrees of freedom, the standard normal z value provides a good approximation
to the t value.
The standard normal z values can be found in the infinite degrees (∞) row of the t distribution table.
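The convergence of t values toward the z value can be seen numerically. The sketch below avoids third-party libraries by integrating the t density with Simpson's rule and inverting the tail area by bisection; this is a rough numerical stand-in for a t table, not the method used in the slides, and all function names are illustrative.

```python
from math import exp, lgamma, log, pi


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))


def t_upper_tail(t, df, steps=2000):
    """Area to the right of t (t >= 0), using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the area from 0 to infinity is 0.5.
    return 0.5 - total * h / 3


def t_critical(upper_tail_area, df):
    """t value leaving the given area in the upper tail, found by bisection."""
    low, high = 0.0, 50.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_upper_tail(mid, df) > upper_tail_area:
            low = mid  # tail still too large, move right
        else:
            high = mid
    return (low + high) / 2


for df in (10, 20, 69, 100, 1000):
    print(f"df = {df:4d}: t.025 = {t_critical(0.025, df):.3f}")
```

The printed values shrink toward 1.960 as the degrees of freedom grow, consistent with the ∞ row of the t table holding the standard normal z values.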
σ Unknown
◼ Interval Estimate
\[ \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \]
where: 1 − α = the confidence coefficient
\(t_{\alpha/2}\) = the t value providing an area of α/2 in the upper tail of a t distribution with n − 1 degrees of freedom
s = the sample standard deviation
σ Unknown
Example: Credit Card Debt
To illustrate the interval estimation procedure for the σ unknown case, we will consider a study designed to
estimate the mean credit card debt for the population of U.S. households.
A sample of n = 70 households provided the credit card balances shown in Table 8.3. For this situation, no
previous estimate of the population standard deviation σ is available. Thus, the sample data must be used to
estimate both the population mean and the population standard deviation.
Using the data in Table 8.3, we compute the sample mean x̄ = $9312 and the sample standard deviation s =
$4007.
σ Unknown
➢ At 95% confidence, α = .05, and α/2 = .025.
➢ \(t_{.025}\) is based on n − 1 = 70 − 1 = 69 degrees of freedom.
➢ In the t distribution table we see that \(t_{.025}\) = 1.995.
σ Unknown
Interval Estimate
\[ \bar{x} \pm t_{.025} \frac{s}{\sqrt{n}} \]
\[ 9312 \pm 1.995 \frac{4007}{\sqrt{70}} = 9312 \pm 955 \]
Here 955 is the margin of error.
We are 95% confident that the mean credit card balance for the population
of all households is between $8,357 and $10,267.
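The credit card calculation can be retraced in code. This sketch takes the t value 1.995 directly from the slides rather than computing it, and the function name `t_interval` is illustrative.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_value):
    """Return (margin of error, lower limit, upper limit) for the sigma-unknown case."""
    margin = t_value * s / sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# Slide values: x_bar = 9312, s = 4007, n = 70, t.025 with 69 df = 1.995.
margin, low, high = t_interval(x_bar=9312.0, s=4007.0, n=70, t_value=1.995)
print(f"margin of error: {margin:.0f}")
print(f"interval: {low:.0f} to {high:.0f}")
```

The only structural difference from the σ known case is that s replaces σ and a t value replaces the z value.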
σ Unknown
In most applications, a sample size of n ≥ 30 is adequate when using the expression \(\bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}\) to
develop an interval estimate of a population mean.
If the population distribution is highly skewed or contains outliers, a sample size of 50 or more is
recommended.
If the population is not normally distributed but is roughly symmetric, a sample size as small as 15
will suffice.
If the population is believed to be at least approximately normal, a sample size of less than 15
can be used.
Summary of Interval Estimation Procedures
for a Population Mean
Can the population standard deviation σ be assumed known?
▪ Yes (σ Known Case): use
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
▪ No (σ Unknown Case): use the sample standard deviation s to estimate σ, and use
\[ \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \]
Sample Size for an Interval Estimate
of a Population Mean
Let E = the desired margin of error.
E is the amount added to and subtracted from the point estimate to obtain an interval estimate.
If a desired margin of error is selected prior to sampling, the sample size necessary to satisfy the
margin of error can be determined.
➢ Margin of Error
\[ E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
➢ Necessary Sample Size
\[ n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2} \]
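The sample size expression follows from the margin of error by a short rearrangement. Written out step by step, assuming E > 0:

```latex
\[
E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
\sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{E}
\quad\Longrightarrow\quad
n = \frac{(z_{\alpha/2})^2\,\sigma^2}{E^2}
\]
```

Because n must be a whole number, a fractional result is rounded up to the next integer so that the margin of error is no larger than E.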
The Necessary Sample Size equation requires a value for the population standard deviation σ.
If σ is unknown, a preliminary or planning value for σ can be used in the equation.
➢ Use the estimate of the population standard deviation computed in a previous study.
➢ Use a pilot study to select a preliminary sample and use the sample standard deviation from that
sample.
➢ Use judgment or a “best guess” for the value of σ.
Recall that Lloyd’s Department Store is planning to get an estimate of the mean amount spent per
shopping trip. Suppose that Lloyd’s management team wants an estimate of the population mean
such that there is a 0.95 probability that the sampling error is $2 or less.
How large a sample size is needed to meet the required precision?
\[ z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2 \]
At 95% confidence, \(z_{.025}\) = 1.96. Recall that σ = 20.
\[ n = \frac{(1.96)^2 (20)^2}{(2)^2} = 384.16, \text{ which is rounded up to } 385 \]
A sample of size 385 is needed to reach a desired precision of
± $2 at 95% confidence.
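The same calculation as a small helper, shown as a sketch with the Lloyd’s planning values (z = 1.96, σ = 20, E = 2); the function name `sample_size_for_mean` is illustrative.

```python
from math import ceil


def sample_size_for_mean(z, sigma, margin_of_error):
    """Smallest whole n with z * sigma / sqrt(n) <= margin_of_error."""
    # Round up: rounding down would leave the margin of error slightly too large.
    return ceil((z * sigma / margin_of_error) ** 2)


print(sample_size_for_mean(z=1.96, sigma=20.0, margin_of_error=2.0))
```

Since E enters the formula squared, halving the desired margin of error roughly quadruples the required sample size.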
Interval Estimate of a Population Proportion
The general form of an interval estimate of a population proportion is
\[ \bar{p} \pm \text{Margin of Error} \]
The sampling distribution of p̄ plays a key role in computing the margin of error for this interval
estimate.
The sampling distribution of p̄ can be approximated by a normal distribution whenever np > 5 and
n(1 – p) > 5.
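❑ Normal Approximation of Sampling Distribution of p̄
\[ \sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} \]
[Figure: sampling distribution of p̄, centered at p, with 1 − α of all p̄ values within \(z_{\alpha/2}\sigma_{\bar{p}}\) of p and an area of α/2 in each tail.]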
◼ Interval Estimate
\[ \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]
where: 1 − α is the confidence coefficient
p̄ is the sample proportion
Example: Political Science, Inc.
Political Science, Inc. (PSI) specializes in voter polls and surveys designed to keep
political office seekers informed of their position in a race. Using telephone surveys,
PSI interviewers ask registered voters who they would vote for if the election were
held that day.
In a current election campaign, PSI has just found that 220 registered voters, out of 500
contacted, favor a particular candidate. PSI wants to develop a 95% confidence interval
estimate for the proportion of the population of registered voters that favor the candidate.
\[ \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]
where: n = 500, p̄ = 220/500 = .44, \(z_{\alpha/2}\) = 1.96
\[ .44 \pm 1.96 \sqrt{\frac{.44(1-.44)}{500}} = .44 \pm .0435 \]
PSI is 95% confident that the proportion of all voters
that favor the candidate is between .3965 and .4835.
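The PSI interval can be recomputed as follows. This is a minimal sketch using the slide values (220 of 500 voters, z = 1.96); the function name `proportion_interval` is illustrative.

```python
from math import sqrt


def proportion_interval(successes, n, z):
    """Return (p_bar, margin of error, lower limit, upper limit)."""
    p_bar = successes / n
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar, margin, p_bar - margin, p_bar + margin


p_bar, margin, low, high = proportion_interval(successes=220, n=500, z=1.96)
print(f"p_bar = {p_bar:.2f}, margin = {margin:.4f}")
print(f"interval: {low:.4f} to {high:.4f}")
```

Note that the sample proportion p̄ stands in for the unknown p inside the square root, which is what makes the interval computable from the sample alone.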
Sample Size for an Interval Estimate of a Population Proportion
Margin of Error
\[ E = z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]
Solving for the necessary sample size, we get
\[ n = \frac{(z_{\alpha/2})^2\, \bar{p}(1-\bar{p})}{E^2} \]
However, p̄ will not be known until after we have selected the sample. We will use the planning value
p* for p̄.
Necessary Sample Size
\[ n = \frac{(z_{\alpha/2})^2\, p^*(1-p^*)}{E^2} \]
The planning value p* can be chosen in one of the following ways:
1. Use the sample proportion from a previous sample of the same or similar units.
2. Select a preliminary sample and use the sample proportion from this sample.
3. Use judgment or a “best guess” for the p* value.
4. Otherwise, use .50 as the p* value.
Suppose that PSI would like a .99 probability that the sample proportion is within ± .03
of the population proportion. How large a sample size is needed to meet the required
precision? (A previous sample of similar units yielded .44 for the sample proportion.)
\[ z_{\alpha/2} \sqrt{\frac{p^*(1-p^*)}{n}} = .03 \]
At 99% confidence, \(z_{.005}\) = 2.576. Recall that p* = .44.
\[ n = \frac{(z_{\alpha/2})^2\, p^*(1-p^*)}{E^2} = \frac{(2.576)^2 (.44)(.56)}{(.03)^2} \approx 1817 \]
A sample of size 1817 is needed to reach a desired
precision of ± .03 at 99% confidence.
Note: We used .44 as the best estimate of p in the preceding expression. If no
information is available about p, then .5 is often assumed because it provides
the highest possible sample size. If we had used p = .5, the recommended n would
have been 1844 (1843.27 rounded up).
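Both planning values can be compared directly. The sketch below applies the necessary sample size formula with the PSI figures (z = 2.576, E = .03) and rounds up to the next whole number; the function name `sample_size_for_proportion` is illustrative.

```python
from math import ceil


def sample_size_for_proportion(z, planning_value, margin_of_error):
    """Smallest whole n meeting the margin of error for planning value p*."""
    n_exact = z ** 2 * planning_value * (1 - planning_value) / margin_of_error ** 2
    return ceil(n_exact)  # round up so the margin of error is not exceeded


for p_star in (0.44, 0.50):
    n = sample_size_for_proportion(z=2.576, planning_value=p_star, margin_of_error=0.03)
    print(f"p* = {p_star:.2f}: n = {n}")
```

The product p*(1 − p*) is largest at p* = .5, so that planning value gives the most conservative (largest) sample size.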
Thank You !!!
