
2. Calculate the following quantities using Excel. (If you have Excel 2010 or later, we suggest using its new functions.)
a. P(−2.00 ≤ t10 ≤ 1.00), where t10 has a t distribution with 10 degrees of freedom.
b. P(−2.00 ≤ t100 ≤ 1.00), where t100 has a t distribution with 100 degrees of freedom. How do you explain the difference between this result and the one obtained in part a?
c. P(−2.00 ≤ Z ≤ 1.00), where Z is a standard normal random variable. Compare this result to the results obtained in parts a and b. How do you explain the differences in these probabilities?
d. Find the 68th percentile of the t distribution with 20 degrees of freedom.
e. Find the 68th percentile of the t distribution with 3 degrees of freedom. How do you explain the difference between this result and the result obtained in part d?

3. Calculate the following quantities using Excel. (If you have Excel 2010 or later, we suggest using its new functions.)
a. Find the value of x such that P(t10 > x) = 0.75, where t10 has a t distribution with 10 degrees of freedom.
b. Find the value of y such that P(t100 > y) = 0.75, where t100 has a t distribution with 100 degrees of freedom. How do you explain the difference between this result and the result obtained in part a?
c. Find the value of z such that P(Z > z) = 0.75, where Z is a standard normal random variable. Compare this result to the results obtained in parts a and b. How do you explain the differences in the values of x, y, and z?

8-3 CONFIDENCE INTERVAL FOR A MEAN


We now come to the main topic of this chapter: using properties of sampling distributions to
construct confidence intervals. We assume that data have been generated by some random
mechanism, either by observing a random sample from some population or by performing
a randomized experiment. The goal is to infer the values of one or more population parameters
such as the mean, the standard deviation, or a proportion from sample data. For each
such parameter, you use the data to calculate a point estimate, which can be considered a
best guess for the unknown parameter. You then calculate a confidence interval around the
point estimate to measure its accuracy.
We begin by deriving a confidence interval for a population mean μ, and we discuss its
interpretation. Although the particular details pertain to a specific parameter, the mean, the
same ideas carry over to other parameters as well, as will be described in later sections. As
usual, the sample mean $\bar{X}$ is used as the point estimate of μ.
To obtain a confidence interval for μ, you first specify a confidence level, usually
90%, 95%, or 99%. You then use the sampling distribution of the point estimate to determine
the multiple of the standard error (SE) to go out on either side of the point estimate
to achieve the given confidence level. If the confidence level is 95%, the value used
most frequently in applications, the multiple is approximately 2. More precisely, it is a
t-value. That is, a typical confidence interval for μ is of the form in Expression (8.4),
where $SE(\bar{X}) = s/\sqrt{n}$.

Confidence Interval for Population Mean


$$\bar{X} \pm t\text{-multiple} \times SE(\bar{X}) \qquad (8.4)$$

To obtain the correct t-multiple, let α be 1 minus the confidence level (expressed as
a decimal). For example, if the confidence level is 90%, then α = 0.10. Then the appropriate
t-multiple is the value that cuts off probability α/2 in each tail of the t distribution

342 Chapter 8 Confidence Interval Estimation

Copyright 201 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
with n − 1 degrees of freedom. For example, if n = 30 and the confidence level is 95%,
cell B25 of Figure 8.2 indicates that the correct t-value is 2.045. The corresponding 95%
confidence interval for μ is then
$$\bar{X} \pm 2.045\,(s/\sqrt{n})$$
If the confidence level is instead 90%, the appropriate t-value is 1.699 (change the probability
in cell B24 to 0.10 to see this), and the resulting 90% confidence interval is
$$\bar{X} \pm 1.699\,(s/\sqrt{n})$$
If the confidence level is 99%, the appropriate t-value is 2.756 (change the probability in
cell B24 to 0.01 to see this), and the resulting 99% confidence interval is
$$\bar{X} \pm 2.756\,(s/\sqrt{n})$$
Margin note: Confidence interval widths increase when you ask for higher confidence levels, but they tend to decrease when you use larger sample sizes.

Note that as the confidence level increases, the width of the confidence interval also
increases. Because narrow confidence intervals are desirable, this presents a trade-off. You
can either have less confidence and a narrow interval, or you can have more confidence and
a wide interval. However, you can also take a larger sample. As n increases, the standard
error $s/\sqrt{n}$ decreases, so the length of the confidence interval tends to decrease for any
confidence level. (Why won’t it decrease for sure? The larger sample might result in a larger
value of s that could offset the increase in n.)
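The trade-off can be checked with a little arithmetic. The following sketch uses the three t-multiples quoted in this section for n = 30, together with a purely hypothetical sample standard deviation (s = 10, which is an assumption and not a value from the text), to compare the half-widths of the three intervals. It then shows how the standard error shrinks as n grows if s happens to stay the same.

```python
import math

# t-multiples quoted in the text for n = 30 (29 degrees of freedom)
T_MULTIPLES = {0.90: 1.699, 0.95: 2.045, 0.99: 2.756}

def half_width(t_multiple, s, n):
    """Half-width of the interval: t-multiple times the standard error s/sqrt(n)."""
    return t_multiple * s / math.sqrt(n)

s = 10.0  # hypothetical sample standard deviation (an assumption, not from the text)
n = 30

half_widths = {level: half_width(t, s, n) for level, t in T_MULTIPLES.items()}
for level, hw in half_widths.items():
    print(f"{level:.0%} confidence: half-width = {hw:.3f}")

# Standard error for increasing n, holding s fixed.
# In practice s changes from sample to sample, so the decrease is not guaranteed.
standard_errors = [s / math.sqrt(size) for size in (30, 120, 480)]
print([round(se, 3) for se in standard_errors])
```

Quadrupling the sample size halves the standard error only when s is unchanged, which is why the text says the interval "tends to" get shorter.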
Example 8.1 illustrates confidence interval estimation for a population mean. It uses
the One-Sample procedure in StatTools to perform the calculations. However, by examining
the resulting Excel formulas, you can check that all it is really doing is (1) calculating
the sample mean, (2) calculating the standard error of the sample mean, $s/\sqrt{n}$, (3) finding
the appropriate t-multiple, and (4) combining these to form the confidence interval via
Expression (8.4).
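As a rough illustration of these four steps outside Excel, here is a minimal Python sketch. It is not the StatTools implementation. The function name `mean_confidence_interval` is hypothetical, the data are the 30 sample values listed in column B of Figure 8.5 later in this section, and the t-multiple 2.045 is the value the text gives for n = 30 at the 95% level (it is passed in rather than computed, because the Python standard library has no t distribution).

```python
import math
import statistics

def mean_confidence_interval(data, t_multiple):
    """Return (lower, upper) limits for the mean via Expression (8.4)."""
    n = len(data)
    sample_mean = statistics.mean(data)                      # step 1: sample mean
    standard_error = statistics.stdev(data) / math.sqrt(n)   # step 2: s / sqrt(n)
    margin = t_multiple * standard_error                     # step 3: apply the t-multiple
    return sample_mean - margin, sample_mean + margin        # step 4: combine

# The 30 sample values shown in column B of Figure 8.5 (rounded to two decimals)
sample = [
    78.70, 111.72, 93.13, 74.28, 75.31, 55.61, 83.45, 82.48, 74.98, 72.48,
    113.06, 114.42, 83.61, 110.32, 95.18, 111.87, 55.45, 118.47, 114.64, 77.27,
    76.21, 64.16, 95.07, 128.13, 100.43, 141.25, 146.15, 94.14, 109.04, 61.14,
]
T_95_DF29 = 2.045  # 95% t-multiple for 29 degrees of freedom, as quoted in the text

lower, upper = mean_confidence_interval(sample, T_95_DF29)
print(round(lower, 2), round(upper, 2))
```

The limits should agree with the Lower Limit and Upper Limit reported in Figure 8.5 (84.76 and 102.72) up to small rounding differences, since the listed data and the t-multiple are both rounded.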

EXAMPLE 8.1 CUSTOMER RESPONSE TO A NEW SANDWICH

A fast-food restaurant recently added a new sandwich to its menu. To estimate the popularity
of this sandwich, a random sample of 40 customers who ordered the sandwich
was surveyed. Each of these customers was asked to rate the sandwich on a scale
of 1 to 10, 10 being the best. The results of this survey appear in column B of Figure 8.4.
(See the file Satisfaction Ratings.xlsx.) The manager wants to estimate the mean satisfaction
rating over the entire population of customers by finding a 95% confidence interval.
How should she proceed?
Objective To use StatTools’s One-Sample procedure to obtain a 95% confidence interval
for the mean satisfaction rating of the new sandwich.

Solution
You need to use StatTools’s One-Sample procedure on the Satisfaction variable. To do so,
make sure a StatTools data set has been designated, select Confidence Interval from the
StatTools Statistical Inference dropdown list, and select the Mean/Std. Deviation option.
Then fill in the resulting dialog box as shown in Figure 8.3. In particular, select One-Sample
Analysis as the Analysis type. (Other types will be used later in the chapter.) You should
obtain the output shown in Figure 8.4. (Note: If you want to place the output next to the
data, as shown here, select Settings from the StatTools ribbon, and, in the Report group,
select either of the last two Placement options.)

Figure 8.3 Dialog Box for Confidence Interval for Mean

Figure 8.4 Analysis of New Sandwich Data

Row  A         B             D                             E
1    Customer  Satisfaction                                Satisfaction
2    1         7             Conf. Intervals (One-Sample)  Data Set #1
3    2         5             Sample Size                   40
4    3         5             Sample Mean                   6.250
5    4         6             Sample Std Dev                1.597
6    5         8             Confidence Level (Mean)       95.0%
7    6         7             Degrees of Freedom            39
8    7         6             Lower Limit                   5.739
9    8         7             Upper Limit                   6.761
10   9         10
11   10        7
(rows 12 to 38 not shown)
39   38        9
40   39        5
41   40        4

The principal results are that (1) the best guess for the population mean rating is 6.250,
the sample average in cell E4, and (2) a 95% confidence interval for the population mean
rating extends from 5.739 to 6.761, as seen in cells E8 and E9. The manager can be 95%
confident that the true mean rating over all customers who might try the sandwich is within
this confidence interval.
Margin note: To understand where these numbers come from, take a look at the formulas in column E.

The degrees of freedom for the t distribution is one less than the sample size, as shown
in cell E7. The formulas for the confidence interval limits, in cells E8 and E9, are equivalent
to the general formula in Expression (8.4), but they use special StatTools functions
to calculate the t-multiples. Note that StatTools doesn’t display the standard error of the
mean explicitly, but you can calculate it easily as the sample standard deviation in cell E5
divided by the square root of the sample size.
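To see where the limits in Figure 8.4 come from, the calculation can be retraced from the summary statistics alone. The sketch below is only a check on the arithmetic, not the StatTools formula. The t-multiple 2.0227 is an assumption supplied here: it is the value that cuts off probability 0.025 in each tail of the t distribution with 39 degrees of freedom (in Excel 2010 or later, `=T.INV.2T(0.05,39)` should return it), and it does not appear in the text.

```python
import math

# Summary statistics from column E of Figure 8.4
n = 40
sample_mean = 6.250
sample_std_dev = 1.597

# Assumed t-multiple for 95% confidence with 39 degrees of freedom (not shown in the text)
t_multiple = 2.0227

standard_error = sample_std_dev / math.sqrt(n)   # the SE that StatTools does not display
lower = sample_mean - t_multiple * standard_error
upper = sample_mean + t_multiple * standard_error

print(f"SE = {standard_error:.4f}, interval = ({lower:.3f}, {upper:.3f})")
```

Rounded to three decimals, the two limits match the values in cells E8 and E9.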
We stated previously that as the confidence level increases, the length of the confidence
interval increases. You can convince yourself of this by entering different confidence
levels such as 90% or 99% in cell E6. The lower and upper limits of the confidence interval
in cells E8 and E9 will change automatically, getting closer together for the 90% level and
farther apart for the 99% level. Just remember that you, the analyst, can choose the confidence
level, but 95% is the level most commonly chosen.
Before leaving this example, we discuss the assumptions that lead to the confidence
interval. First, you might question whether the sample is really a random sample—or
whether it matters. Perhaps the manager used some random mechanism to select the
customers to be surveyed. More likely, however, she simply surveyed 40 consecutive
customers who tried the sandwich on a given day. This is called a convenience sample and
is not really a random sample. However, unless there is some reason to believe that these
40 customers differ in some relevant aspect from the entire population of customers, it is
probably safe to treat them as a random sample.
A second assumption is that the population distribution is normal. We made this
assumption when we introduced the t distribution. Obviously, the population distribution
cannot be exactly normal because it is concentrated on the 10 possible satisfaction ratings,
and the normal distribution describes a continuum. However, this is probably not a
problem for two reasons. First, confidence intervals based on the t distribution are robust
to violations of normality. This means that the resulting confidence intervals are valid for
any populations that are approximately normal. Second, the normal population assumption
is less crucial for larger sample sizes because of the central limit theorem. A sample size of
40 should be large enough.
Finally, it is important to recognize what this confidence interval implies and what it
doesn’t imply. In the entire population of customers who ordered this sandwich, there is
a distribution of satisfaction ratings. Some fraction rate it as 1, some rate it as 2, and so
on. All we are trying to determine here is the average of all these ratings. Based on the
analysis, the manager can be 95% confident that this (still unknown) average is between
5.739 and 6.761. However, this confidence interval doesn’t tell her other characteristics of
the population of ratings that might be of interest, such as the proportion of customers who
rate the sandwich 6 or higher. It only provides information about the mean rating. Later in
this chapter, you will see how to find a confidence interval for a proportion, which allows
you to analyze another important characteristic of a population distribution. ■

In the sandwich example, we said that the manager can be 95% confident that the true
mean rating is between 5.739 and 6.761. What does this statement really mean? Contrary
to what you might expect, it does not mean that the true mean lies between 5.739 and
6.761 with probability 0.95. Either the true mean is inside this interval or it is not. The
true meaning of a 95% confidence interval is based on the procedure used to obtain it.
Specifically, if you use this procedure on a large number of random samples, all from
the same population, then approximately 95% of the resulting confidence intervals will
be “good” ones that include the true mean, and the other 5% will be “bad” ones that do
not include the true mean. Unfortunately, when you have only a single sample, as in the
sandwich example, you have no way of knowing whether your confidence interval is one
of the good ones or one of the bad ones, but you can be 95% confident that you obtained
one of the good intervals.

Margin note: This simulation is performed only to illustrate the true meaning of a “95% confidence interval.” In any real situation, you obtain only a single random sample and the corresponding confidence interval.

Because this is such an important concept, we illustrate it in Figure 8.5 with simulation.
(See the file Confidence Interval Simulation Finished.xlsx. There is no “unfinished” version
of this file.) The data in column B are generated randomly from a normal distribution
with the known values of μ and σ in cells B3 and B4. Next, StatTools’s One-Sample
Confidence Interval procedure is used to calculate a 95% confidence interval for the true
value of μ, exactly as in the sandwich example. However, because the true value of μ is
known, it is possible to record a 1 in cell H6 if the true mean is inside the interval and a 0
otherwise. The appropriate formula is
=IF(AND(B3>=D13,B3<=D14),1,0)
Finally, a data table can be used to replicate the simulated results 1000 times.[2]
Specifically, the formula in H11 is
=H6
Then to build the data table in the range G11:H1011, leave the row input cell box empty
and specify any blank cell as the column input cell. Finally, the AVERAGE function can be
used in cell H7 to find the fraction of 1s in the range H12:H1011.
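The same experiment can be sketched outside Excel. The following Python code is an illustration under stated assumptions, not a translation of the workbook: it draws 1000 normal samples of size 30 with the population mean and standard deviation from cells B3 and B4, builds each interval with the t-multiple 2.045 that the text gives for n = 30 at the 95% level, and records a 1 whenever the interval captures the true mean. The seed and all names are arbitrary choices made here.

```python
import math
import random
import statistics

MU, SIGMA = 100, 20    # known population mean and stdev (cells B3 and B4)
N = 30                 # sample size used in the simulation
T_MULTIPLE = 2.045     # 95% t-multiple for 29 degrees of freedom, as quoted in the text
REPLICATIONS = 1000

def interval_captures_mean(rng):
    """Draw one sample, build its 95% interval, return 1 if it contains MU, else 0."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    mean = statistics.mean(sample)
    margin = T_MULTIPLE * statistics.stdev(sample) / math.sqrt(N)
    return 1 if mean - margin <= MU <= mean + margin else 0

rng = random.Random(12345)  # arbitrary seed so the run is repeatable
captured = [interval_captures_mean(rng) for _ in range(REPLICATIONS)]
coverage = sum(captured) / REPLICATIONS   # plays the role of AVERAGE in cell H7
print(f"Fraction of intervals capturing the true mean: {coverage:.3f}")
```

The fraction printed should be near 0.95, although it will differ a little from run to run if the seed is changed, just as the workbook value changes on each recalculation.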

Figure 8.5 Simulation Demonstration of Confidence Intervals

Worksheet title (cell A1): Interpretation of a “95% confidence interval”
Worksheet note: This simulation uses a normal population for illustration. But you could generate the random sample from another distribution (e.g., triangular) to see if the confidence intervals are still valid, i.e., if the % in cell H7 is about 95%.

Row  A                 B              C                             D              G                                            H
3    Population mean   100
4    Population stdev  20
6                      Random sample                                Random sample  Mean captured?                               1
7                      78.70          Conf. Intervals (One-Sample)  Data Set #1    % of CI’s capturing mean                     95.1%
8                      111.72         Sample Size                   30
9                      93.13          Sample Mean                   93.74          Data table to replicate confidence interval
10                     74.28          Sample Std Dev                24.04          Replication                                  Mean captured?
11                     75.31          Confidence Level (Mean)       95.0%                                                       1
12                     55.61          Degrees of Freedom            29             1                                            1
13                     83.45          Lower Limit                   84.76          2                                            1
14                     82.48          Upper Limit                   102.72         3                                            1
15                     74.98                                                       4                                            1
16                     72.48          Graphical representation                     5                                            1
17                     113.06         Limit                         Height         6                                            1
18                     114.42         84.76                         1              7                                            1
19                     83.61          102.72                        1              8                                            0
20                     110.32                                                      9                                            1
21                     95.18          Mean                          Height         10                                           1
22                     111.87         100                           1              11                                           1
23                     55.45                                                       12                                           1
24                     118.47                                                      13                                           1
25                     114.64                                                      14                                           1
26                     77.27                                                       15                                           1
27                     76.21                                                       16                                           1
28                     64.16                                                       17                                           1
29                     95.07                                                       18                                           1
30                     128.13                                                      19                                           1
31                     100.43                                                      20                                           1
32                     141.25                                                      21                                           1
33                     146.15                                                      22                                           1
34                     94.14                                                       23                                           1
35                     109.04                                                      24                                           1
36                     61.14                                                       25                                           1
37                                                                                 26                                           1
38                                                                                 27                                           1

Chart (embedded below the graphical representation data): the confidence limits and the mean plotted at height 1; horizontal axis from 80.00 to 120.00; legend entries Confidence limits and Mean.

[2] Depending on the speed of your PC, it can take a few seconds to simulate 1000 samples of size 30 in this data
table. Therefore, it is a good idea to set the recalculation mode to “automatic except tables.” (You can find this
option under the Calculation Options dropdown menu on the Formulas ribbon.) That way, the data table recalculates
only if you explicitly tell it to (by pressing the F9 key).

You can see that 948 of the simulated confidence intervals (each based on a different
random sample of size 30) contain the true mean 100. In theory, 950 of the 1000 intervals
should cover the true mean, and this is almost exactly what occurred. Of course, in a particular
application you might unluckily obtain the fourth sample (in row 15). However,
without knowing that the true mean is 100, you would have no way of knowing that you
obtained a “bad” interval.
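How close to 950 should the count be? A short calculation, offered here as a supplementary sketch rather than something from the text, treats the 1000 intervals as independent trials that each capture the mean with probability 0.95, so the count follows a binomial distribution. That independence assumption fits the simulation because each replication uses a fresh random sample.

```python
import math

replications = 1000
coverage_probability = 0.95

# Mean and standard deviation of a binomial count (assumes independent replications)
expected_count = replications * coverage_probability
std_dev_count = math.sqrt(
    replications * coverage_probability * (1 - coverage_probability)
)

observed_count = 948  # the count reported in the text
z = (observed_count - expected_count) / std_dev_count

print(f"expected = {expected_count:.0f}, std dev = {std_dev_count:.2f}, z = {z:.2f}")
```

With a standard deviation of roughly 7, a count of 948 is well under one standard deviation from 950, so the simulated result is entirely in line with the theory.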
We also show this graphically in the file. (See Figure 8.5.) The small square in this graph
is positioned at the known mean and never changes. The blue line represents a particular
confidence interval. Put your cursor below this chart in, say, cell C35, and press the Delete
key. (This forces a recalculation without recalculating the whole data table.) The position
of the blue line will change. About 95% of the time, the blue line will straddle the small
square (that is, the confidence interval will include the true mean), but about 1 time out
of 20, it will not. This also illustrates the meaning of a “95% confidence interval.”

Fundamental Insight
True Meaning of a 95% Confidence Interval
Given the data in a particular sample, a 95% confidence interval for the mean will either
include the (unknown) population mean or it won’t. The true meaning of a 95% confidence
interval is that if the same procedure is used on many different random samples, about 95%
of the resulting confidence intervals will include the population mean, and only about 5%
won’t. Therefore, you can be 95% confident that any particular confidence interval you
happen to get is a “good” one.

PROBLEMS
Level A

4. A manufacturing company’s quality control personnel have recorded the proportion of defective items for each of 500 monthly shipments of one of the computer components that the company produces. The data are in the file P07_07.xlsx. The quality control department manager does not have sufficient time to review all of these data. Rather, she would like to examine the proportions of defective items for a sample of these shipments.
a. Use StatTools to generate a simple random sample of size 25.
b. Using the sample generated in part a, construct a 95% confidence interval for the mean proportion of defective items over all monthly shipments. Assume that the population consists of the proportion of defective items for each of the given 500 monthly shipments.
c. Interpret the 95% confidence interval constructed in part b.
d. Does the 95% confidence interval contain the actual population mean in this case? If not, explain why not. What proportion of many similarly constructed confidence intervals should include the true population mean?

5. The file P08_05.xlsx contains salary data on all NFL players in each of the years 2002 to 2009. Because this file contains all players for each of these years, you can calculate the population mean for each year if population is defined as all NFL players that year. However, proceed as in the previous chapter to select a random sample of size 50 from the 2009 population. Based on this random sample, calculate a 95% confidence interval for the mean NFL total salary in 2009. Does it contain the population mean? Repeat this procedure several times until you find a random sample where the population mean is not included in the confidence interval.

6. The file P08_06.xlsx contains data on repetitive task times for each of two workers. John has been doing this task for months, whereas Fred has just started. Each time listed is the time (in seconds) to perform a routine task on an assembly line. The times shown are in chronological order.
a. Find a 95% confidence interval for the mean time it takes John to perform the task. Do the same for Fred.
b. Do you believe both of the confidence intervals in part a are valid and/or useful? Why or why not? Which of the two workers would you rather have, assuming that task time is the only issue?

7. The manager of a local fast-food restaurant is interested in improving the service provided to customers who use the restaurant’s drive-up window. As a first step in this process, the manager asks an assistant to record the time (in seconds) it takes to serve a large number of customers at the final window in the facility’s drive-up system. The file P08_07.xlsx contains a
