
INSTRUCTOR'S SOLUTIONS MANUAL


to Accompany
James T. McClave
P. George Benson
and Terry Sincich's

STATISTICS FOR BUSINESS


AND ECONOMICS
Tenth Edition

Nancy S. Boudreau

Bowling Green State University

Upper Saddle River, New Jersey


Columbus, Ohio


Contents

Preface

Chapter 1    Statistics, Data, and Statistical Thinking
Chapter 2    Methods for Describing Sets of Data                                       5
             The Kentucky Milk Case                                                   46
Chapter 3    Probability                                                              55
Chapter 4    Random Variables and Probability Distributions                           82
             The Furniture Fire Case                                                 136
Chapter 5    Inferences Based on a Single Sample: Estimation with Confidence Intervals   137
Chapter 6    Inferences Based on a Single Sample: Tests of Hypothesis                161
Chapter 7    Inferences Based on Two Samples: Confidence Intervals and Tests of Hypotheses   201
             The Kentucky Milk Case Part II                                          243
Chapter 8    Design of Experiments and Analysis of Variance                          256
Chapter 9    Categorical Data Analysis                                               300
             Discrimination in the Work Place                                        328
Chapter 10   Simple Linear Regression                                                332
Chapter 11   Multiple Regression and Model Building                                  379
             The Condo Sales Case                                                    444
Chapter 12   Methods for Quality Improvement                                         448
Chapter 13   Time Series: Descriptive Analyses, Models, and Forecasting              476
             The Gasket Manufacturing Case                                           522
Chapter 14   Nonparametric Statistics                                                529


Preface
This solutions manual is designed to accompany the text, Statistics for Business and Economics, Tenth
Edition, by James T. McClave, P. George Benson, and Terry Sincich. It provides answers to most even-numbered exercises for each chapter in the text. Other methods of solution may also be appropriate; however,
the author has presented one that she believes to be most instructive to the beginning Statistics student.
This manual is provided to help instructors save time in preparing presentations of the solutions and to
possibly provide another point of view regarding their meaning.
Some of the exercises are subjective in nature. Subjective decisions regarding these exercises have been made
and are explained by the author. Solutions based on these decisions are presented; the solution to this type of
exercise is often most instructive. When an alternative interpretation of an exercise may occur, the author has
often addressed it and given justification for the approach taken.
I would like to thank Kelly Barber for creating the art work and for typing this work.

Nancy S. Boudreau
Bowling Green State University
Bowling Green, Ohio


Chapter 1
Statistics, Data, and Statistical Thinking

1.2

Descriptive statistics utilizes numerical and graphical methods to look for patterns, to
summarize, and to present the information in a set of data. Inferential statistics utilizes sample
data to make estimates, decisions, predictions, or other generalizations about a larger set of
data.

1.4

The first element of inferential statistics is the population of interest. The population is a set of
existing units. The second element is one or more variables that are to be investigated. A
variable is a characteristic or property of an individual population unit. The third element is
the sample. A sample is a subset of the units of a population. The fourth element is the
inference about the population based on information contained in the sample. A statistical
inference is an estimate, prediction, or generalization about a population based on information
contained in a sample. The fifth and final element of inferential statistics is the measure of
reliability for the inference. The reliability of an inference is how confident one is that the
inference is correct.

1.6

Quantitative data are measurements that are recorded on a meaningful numerical scale.
Qualitative data are measurements that are not numerical in nature; they can only be classified
into one of a group of categories.

1.8

A population is a set of existing units such as people, objects, transactions, or events. A


sample is a subset of the units of a population.

1.10

An inference without a measure of reliability is nothing more than a guess. A measure of


reliability separates statistical inference from fortune telling or guessing. Reliability gives a
measure of how confident one is that the inference is correct.

1.12

Statistical thinking involves applying rational thought processes to critically assess data and
inferences made from the data. It involves not taking all data and inferences presented at face
value, but rather making sure the inferences and data are valid.

1.14

a.

The two variables measured are type of credit card used and amount of purchase.
Type of credit card used is qualitative. It has no meaningful number associated with it,
only the name of the card used. Amount of purchase is quantitative. It has a meaningful
number associated with it.

b.

In Study 1, it says that all purchases were tracked. Thus, the data represent a population.

1.16

a. High school GPA is a number usually between 0.0 and 4.0. Therefore, it is quantitative.

b. Honors/awards would have responses that name things. Therefore, it would be qualitative.

c.

The scores on the SAT's are numbers between 200 and 800. Therefore, it is quantitative.

d.

Gender is either male or female. Therefore, it is qualitative.

e.

Parent's income is a number: $25,000, $45,000, etc. Therefore, it is quantitative.

f.

Age is a number: 17, 18, etc. Therefore, it is quantitative.

1.18

a.

1.

The variable of interest is the status of a company's e-commerce strategy.


Since a company either has an e-commerce strategy or not, the variable is
qualitative.

2.

The variable of interest is when the company will implement an e-commerce plan.
Since the time of implementation will be a date, this variable will be qualitative.

3.

The variable of interest is whether the company is delivering products over the
internet or not. Since the company is either delivering products or not, the variable
is qualitative.

4.

The variable of interest is the company's total revenue in the last fiscal year. Since
this is a meaningful number, this variable is quantitative.

b.

Since there are many more than 154 companies in the U.S., this represents a sample rather
than a population.

1.20

a.

The population of interest is the collection of computer security personnel at all U.S.
corporations and government agencies.

b.

Surveys were sent to computer security personnel at all U. S. corporations and


government agencies. However, in 2006, only 616 organizations responded to the
survey. There could be nonresponse bias. Often, only those subjects with strong
opinions will respond to a survey. Thus, the responses may not reflect what the
population as a whole thinks.

c.

The variable measured in the survey is whether or not there was unauthorized use of
computer systems at the firms during the year. Since the responses will be either Yes
or No, the variable is qualitative.

d.

If we assume that the responses were a random sample from the population, we could
infer that about 52% of all computer security personnel will admit to unauthorized use of
computer systems at their firms during the year.

1.22

a.

The data collection method used is a designed experiment.

b.

The experimental units in the study are the 50,000 smokers.

c.

The variable of interest is the age at which the scanning method first detects a tumor.
Since this is a meaningful number, this variable is quantitative.



d.

The population of interest is the set of all smokers in the U.S. The sample of interest is
the set of 50,000 smokers surveyed.

e.

The researchers want to compare the age at first detection for the 2 methods to see if one
is more sensitive than the other.

1.24

a.

The variable of interest to the researchers is the rating of highway bridges.

b.

Since the rating of a bridge can be categorized as one of three possible values, it is
qualitative.

c.

The data set analyzed is a population since all highway bridges in the U.S. were
categorized.

d.

The data were collected observationally. Each bridge was observed in its natural setting.

1.26

a.

The population of interest is the set of all New York accounting firms employing two or
more professionals. There are two variables of interest: Whether or not the firm uses
audit sampling methods, and if so, whether or not it uses random sampling. The sample
is the set of 163 firms whose responses were useable. The inference of interest to the
New York Society of CPAs is the proportion of all New York accounting firms
employing two or more professionals that use sampling methods in auditing their clients.

b.

The four responses that were unusable could have been returned blank or could have been
filled out incorrectly.

c.

Any time a survey is mailed it is questionable whether the returned questionnaires


represent a random sample. Often times, only those with very strong opinions return the
surveys. In such a case, the returned surveys would not be representative of the entire
population.

1.28

a.

The experimental units in this study are the 24 projects.

b.

The population from which the sample was selected is the set of all new software
development projects.

c.

The variable of interest in this project is the outcome of reusing previously developed
software for the new software development projects.

d.

In the sample, 9 of the 24 projects were judged failures. This is (9 / 24)*100% = 37.5%.
We could infer that approximately 37.5% of all projects would be judged failures.

1.30

a.

The process being studied is the process of filling beverage cans with softdrink at CCSB's
Wakefield plant.

b.

The variable of interest is the amount of carbon dioxide added to each can of beverage.

c.

The sampling plan was to monitor five filled cans every 15 minutes. The sample is the
total number of cans selected.


d.

The company's immediate interest is learning about the process of filling beverage cans
with softdrink at CCSB's Wakefield plant. To do this, they are measuring the amount of
carbon dioxide added to a can of beverage to make an inference about the process of
filling beverage cans. In particular, they might use the mean amount of carbon dioxide
added to the sampled cans of beverage to estimate the mean amount of carbon dioxide
added to all the cans on the process line.

e.

The technician would then be dealing with a population. The cans of beverage have
already been processed. He/she is now interested in the outputs.


Chapter 2
Methods for Describing Sets of Data

2.2

a. To find the frequency for each class, count the number of times each letter occurs. The frequencies for the three classes are:

   Class     Frequency
   X          8
   Y          9
   Z          3
   Total     20

b. The relative frequency for each class is found by dividing the frequency by the total sample size. The relative frequency for the class X is 8/20 = .40. The relative frequency for the class Y is 9/20 = .45. The relative frequency for the class Z is 3/20 = .15.
   Class     Frequency     Relative Frequency
   X          8             .40
   Y          9             .45
   Z          3             .15
   Total     20            1.00
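These frequencies and relative frequencies are easy to verify computationally. The following is a small illustrative Python sketch; the list of letters is a hypothetical stand-in consistent with the counts above, not the actual exercise data.

from collections import Counter

data = ["X"] * 8 + ["Y"] * 9 + ["Z"] * 3   # hypothetical sample of 20 class labels

counts = Counter(data)                      # frequency of each class
n = len(data)                               # total sample size
rel_freq = {cls: cnt / n for cls, cnt in counts.items()}

print(counts)      # Counter({'Y': 9, 'X': 8, 'Z': 3})
print(rel_freq)    # {'X': 0.4, 'Y': 0.45, 'Z': 0.15}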

c.

The frequency bar chart is:

d.

The pie chart for the frequency distribution is:


2.4

a.

The variable summarized in the table is Reason for requesting the installation of the
passenger-side on-off switch. The values this variable could assume are: Infant, Child,
Medical, Infant & Medical, Child & Medical, Infant & Child, and Infant & Child &
Medical. Since the responses name something, the variable is qualitative.

b. The relative frequencies are found by dividing the number of requests for each category by the total number of requests. For the category Infant, the relative frequency is 1,852/30,337 = .061. The rest of the relative frequencies are found in the table below:

   Reason                      Number of Requests    Relative Frequency
   Infant                            1,852           1,852/30,337  = .061
   Child                            17,148          17,148/30,337  = .565
   Medical                           8,377           8,377/30,337  = .276
   Infant & Medical                     44              44/30,337  = .0014
   Child & Medical                     903             903/30,337  = .030
   Infant & Child                    1,878           1,878/30,337  = .062
   Infant & Child & Medical            135             135/30,337  = .0045
   TOTAL                            30,337                            .9999

c.

Using MINITAB, a pie chart of the data is:

   Pie Chart of Reason: Child (17,148, 56.5%), Medical (8,377, 27.6%), Infant & Child (1,878, 6.2%), Infant (1,852, 6.1%), Child & Medical (903, 3.0%), Infant & Child & Medical (135, 0.4%), Infant & Medical (44, 0.1%)

d.

There are 4 categories where Medical is mentioned as a reason: Medical, Infant &
Medical, Child & Medical, and Infant & Child & Medical. The sum of the frequencies
for these 4 categories is 8,377 + 44 + 903 + 135 = 9,459. The proportion listing Medical
as one of the reasons is 9,459/30,337 = .312.


2.6

a. To find relative frequencies, we divide the frequencies of each category by the total number of incidents. The relative frequencies of the number of incidents for each of the cause categories are:

   Management System Cause Category    Number of Incidents    Relative Frequency
   Engineering & Design                        27             27/83 = .325
   Procedures & Practices                      24             24/83 = .289
   Management & Oversight                      22             22/83 = .265
   Training & Communication                    10             10/83 = .120
   TOTAL                                       83                       1

b.

The Pareto diagram is:



c.

The category with the highest relative frequency of incidents is Engineering and Design.
The category with the lowest relative frequency of incidents is Training and
Communication.

2.8

a.

The data collection method was a survey.

b.

Since the data were numbers (percentage of US labor and materials), the variable is
quantitative. Once the data were collected, they were grouped into 4 categories.


c.

Using MINITAB, a pie chart of the data is:


   Pie Chart of Made in USA: 100% (64, 60.4%), 75-99% (20, 18.9%), 50-74% (18, 17.0%), <50% (4, 3.8%)

About 60% of those surveyed believe that Made in USA means 100% US labor and
materials.
2.10

Using MINITAB, a bar chart of the frequency of occurrence of the industry types is:

   Chart of INDUSTRY (bar chart of the frequency of each of the 26 industry categories, from Aerospace/Defense through Utilities)


2.12

Using MINITAB, the side-by-side bar charts are:


   Chart of 1999, 2006 vs Use (side-by-side bar charts of the relative frequency of the responses Yes, No, and Don't know on unauthorized use of computer systems, for 1999 and 2006)

The relative frequency of unauthorized use of computer systems has decreased from
1999 to 2006.
2.14

a.

Using MINITAB, the side-by-side graphs are:


   Chart of Exposure, Opportunity, Content, Faculty vs Stars (frequency of each star rating for the four criteria)

From these graphs, one can see that very few of the top 30 MBA programs got 5-stars in
any criteria. In addition, about the same number of programs got 4 stars in each of the 4
criteria. The biggest difference in ratings among the 4 criteria was in the number of
programs receiving 3-stars. More programs received 3-stars in Course Content than in any
of the other criteria. Consequently, fewer programs received 2-stars in Course Content
than in any of the other criteria.
b.

Since this chart lists the rankings of only the top 30 MBA programs in the world, it is
reasonable that none of these best programs would be rated as 1-star on any criteria.


2.16

a. The original data set has 1 + 3 + 5 + 7 + 4 + 3 = 23 observations.

b. For the bottom row of the stem-and-leaf display:
   The stem is 0.
   The leaves are 0, 1, 2.
   The numbers in the original data set are 0, 1, and 2.

c. The dot plot corresponding to all the data points is:

2.18

2.20

a.

The measurement class that contains the highest proportion of respondents is none.
Sixty-one percent of the respondents said that their companies did not outsource any
computer security functions.

b.

From the graph, 6% of the respondents indicated that they outsourced between 20% and
40% of their computer security functions.

c.

The proportion of the 609 respondents who outsourced at least 40% of computer security
functions is .04 + .01 + .01 = .06.

d.

The number of the 609 respondents who outsourced less than 20% of computer security
functions is (.27 + .61)*609 = .88(609) = 536.


2.22

a.

Using MINITAB, the stem-and-leaf display of the data is:

Stem-and-Leaf Display: SCORE

Stem-and-leaf of SCORE   N = 169
Leaf Unit = 1.0

     1    6  2
     1    6
     2    7  2
     3    7  8
     4    8  4
    15    8  66677888899
    56    9  00001111111222222222233333333344444444444
  (100)   9  55555555555555555555556666666666666666666777777777777777777888888+
    13   10  0000000000000

b.

From the stem-and-leaf display, we see that there are only 4 observations with sanitation
scores less than the acceptable score of 86. The proportion of ships that have an accepted
sanitation standard would be (169 − 4) / 169 = .976.

c.

The sanitation score of 84 is in bold in the stem-and-leaf display in part a.

2.24

a.

Using MINITAB, the frequency histogram is:

   (Frequency histogram of Length)


b.

Using MINITAB, the frequency histogram is:


   (Frequency histogram of Weight)

c.

Using MINITAB, the frequency histogram is:

   (Frequency histogram of DDT)

2.26

Using MINITAB, the two dot plots are:


Dotplot for Arrive-Depart

Yes. Most of the numbers of items arriving at the work center per hour are in the 135 to 165
area. Most of the numbers of items departing the work center per hour are in the 110 to 140
area. Because the number of items arriving is larger than the number of items departing,
there will probably be some sort of bottleneck.


2.28

a.

Using MINITAB, the three frequency histograms are as follows (the same starting point and
class interval were used for each):
Histogram of C1   N = 25   (Tenth Performance)

Midpoint   Count
    4.00       0
    8.00       0
   12.00       1   *
   16.00       5   *****
   20.00      10   **********
   24.00       6   ******
   28.00       0
   32.00       2   **
   36.00       0
   40.00       1   *

Histogram of C2   N = 25   (Thirtieth Performance)

Midpoint   Count
    4.00       1   *
    8.00       9   *********
   12.00      12   ************
   16.00       2   **
   20.00       1   *

Histogram of C3   N = 25   (Fiftieth Performance)

Midpoint   Count
    4.00       3   ***
    8.00      15   ***************
   12.00       4   ****
   16.00       2   **
   20.00       1   *

b.

The histogram for the tenth performance shows a much greater spread of the observations
than the other two histograms. The thirtieth performance histogram shows a shift to the
left, implying shorter completion times than for the tenth performance. In addition, the
fiftieth performance histogram shows an additional shift to the left compared to that for the
thirtieth performance. However, the last shift is not as great as the first shift. This agrees
with statements made in the problem.


2.30

a.

A stem-and-leaf display is as follows, where the stems are the units place and the leaves are
the decimal places:
Stem  Leaves
  1   0 0 0 0 1 1 2 2 2 2 2 3 4 4 4 4 4 4 4 5 5 5 5 6 7 9
  2   1 1 4 4 6 7 9 9
  3   0 0 2 8 9 9
  4   1 1 1 2 5
  5   2 4
  6
  7   8
  8
  9
 10   1


b.

A little more than half (26/49 = .53) of all companies spent less than 2 months in
bankruptcy. Only two of the 49 companies spent more than 6 months in bankruptcy. It
appears then, in general, the length of time in bankruptcy for firms using "prepacks" is less
than that of firms not using "prepacks."

c.

A dot diagram will be used to compare the time in bankruptcy for the three types of
"prepack" firms:

d.

The circled times in part a correspond to companies that were reorganized through a
leverage buyout. There does not appear to be any pattern to these points. They appear to
be scattered about evenly throughout the distribution of all times.

2.32

Using MINITAB, the stem-and-leaf display for the data is:


Stem-and-leaf of Time   N = 25
Leaf Unit = 1.0

    3    3  239
    7    4  3499
   (7)   5  0011469
   11    6  34458
    6    7  13
    4    8  26
    2    9  5
    1   10  2

The numbers in bold represent delivery times associated with customers who subsequently
did not place additional orders with the firm. Since there were only 2 customers with
delivery times of 68 days or longer that placed additional orders, I would say the maximum
tolerable delivery time is about 65 to 67 days. Everyone with delivery times less than 67
days placed additional orders.


2.34

a. Σx = 3 + 8 + 4 + 5 + 3 + 4 + 6 = 33

b. Σx² = 3² + 8² + 4² + 5² + 3² + 4² + 6² = 175

c. Σ(x − 5)² = (3 − 5)² + (8 − 5)² + (4 − 5)² + (5 − 5)² + (3 − 5)² + (4 − 5)² + (6 − 5)² = 20

d. Σ(x − 2)² = (3 − 2)² + (8 − 2)² + (4 − 2)² + (5 − 2)² + (3 − 2)² + (4 − 2)² + (6 − 2)² = 71

e. (Σx)² = (3 + 8 + 4 + 5 + 3 + 4 + 6)² = 33² = 1089

2.36

a. Σx = 6 + 0 + (−2) + (−1) + 3 = 6

b. Σx² = 6² + 0² + (−2)² + (−1)² + 3² = 50

c. Σx² − (Σx)²/5 = 50 − 6²/5 = 50 − 7.2 = 42.8

2.38

a. x̄ = Σx/n = 85/10 = 8.5

b. x̄ = 400/16 = 25

c. x̄ = 35/45 = .78

d. x̄ = 242/18 = 13.44

2.40

The median is the middle number once the data have been arranged in order. If n is even, there
is not a single middle number. Thus, to compute the median, we take the average of the middle
two numbers. If n is odd, there is a single middle number. The median is this middle number.
A data set with five measurements arranged in order is 1, 3, 5, 6, 8. The median is the middle
number, which is 5.
A data set with six measurements arranged in order is 1, 3, 5, 5, 6, 8. The median is the average of the middle two numbers, which is (5 + 5)/2 = 10/2 = 5.
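The even/odd rule for the median can be checked with a short Python sketch (illustrative only), applied to the two example data sets above.

def median(values):
    # Sort the data, then take the middle value (odd n) or the
    # average of the two middle values (even n).
    v = sorted(values)
    n = len(v)
    mid = n // 2
    if n % 2 == 1:
        return v[mid]
    return (v[mid - 1] + v[mid]) / 2

print(median([1, 3, 5, 6, 8]))      # 5
print(median([1, 3, 5, 5, 6, 8]))   # 5.0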


2.42

a. Σx = 7 + ... + 4 = 15
   x̄ = Σx/n = 15/6 = 2.5
   Median = (3 + 3)/2 = 3 (mean of 3rd and 4th numbers, after ordering)
   Mode = 3

b. Σx = 2 + ... + 4 = 40
   x̄ = Σx/n = 40/13 = 3.08
   Median = 3 (7th number, after ordering)
   Mode = 3

c. Σx = 51 + ... + 37 = 496
   x̄ = Σx/n = 496/10 = 49.6
   Median = (48 + 50)/2 = 49 (mean of 5th and 6th numbers, after ordering)
   Mode = 50

2.44

a. The sample mean is:

   x̄ = Σx/n = (529 + 355 + 301 + ... + 63)/26 = 3757/26 = 144.5

   The sample median is found by finding the average of the 13th and 14th observations once the data are arranged in order. The 13th and 14th observations are 100 and 105. The average of these two numbers (median) is:

   median = (100 + 105)/2 = 205/2 = 102.5

   The mode is the observation appearing the most. For this data set, the mode is 70, which appears 3 times.

   Since the mean is larger than the median, the data are skewed to the right.

b. The sample mean is:

   x̄ = Σx/n = (11 + 9 + 6 + ... + 4)/26 = 136/26 = 5.23

   The sample median is found by finding the average of the 13th and 14th observations once the data are arranged in order. The 13th and 14th observations are 5 and 5. The average of these two numbers (median) is:

   median = (5 + 5)/2 = 10/2 = 5

   The mode is the observation appearing the most. For this data set, the mode is 6, which appears 6 times.

   Since the mean and median are about the same, the data are somewhat symmetric.
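For data sets like these, the three measures of center can be computed with Python's statistics module. The sketch below is illustrative; the list named data is a hypothetical placeholder, since the 26 sample values are not reproduced here.

from statistics import mean, median, mode

data = [11, 9, 6, 5, 5, 4]   # hypothetical placeholder; substitute the actual sample values

print(mean(data))     # sample mean
print(median(data))   # sample median
print(mode(data))     # most frequently occurring value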
2.46

a.

The sample mean is:

   x̄ = Σx/n = (1.72 + 2.50 + 2.16 + ... + 1.95)/20 = 37.62/20 = 1.881

The sample average surface roughness of the 20 observations is 1.881.


b.

The median is found as the average of the 10th and 11th observations, once the data have
been ordered. The ordered data are:
1.06 1.09 1.19 1.26 1.27 1.40 1.51 1.72 1.95 2.03 2.05 2.13 2.13 2.16 2.24 2.31 2.41 2.50 2.57 2.64

The 10th and 11th observations are 2.03 and 2.05. The median is:
   (2.03 + 2.05)/2 = 4.08/2 = 2.04

The middle surface roughness measurement is 2.04. Half of the sample measurements
were less than 2.04 and half were greater than 2.04.


c.

The data are somewhat skewed to the left. Thus, the median might be a better measure of
central tendency than the mean. The few small values in the data tend to make the mean
smaller than the median.

2.48

a.

Using MINITAB, the stem-and-leaf display is:


Stem-and-leaf of PAF   N = 17
Leaf Unit = 1.0

    6    0  000009
    8    1  25
   (2)   2  45
    7    3  13
    5    4  0
    4    5
    4    6  2
    3    7  057

b.

The median is the middle number once the data are arranged in order. The data arranged in
order are: 0, 0, 0, 0, 0, 9, 12, 15, 24, 25, 31, 33, 40, 62, 70, 75, 77.
The middle number or the median is 24.

c.

The mean of the data is x̄ = Σx/n = (77 + 33 + 75 + ... + 31)/17 = 473/17 = 27.82



d.

The number occurring most frequently is 0. The mode is 0.

e.

The mode corresponds to the smallest number. It does not seem to locate the center of the
distribution. Both the mean and the median are in the middle of the stem-and-leaf display.
Thus, it appears that both of them locate the center of the data.

2.50

a.

The sample mean length is:

   x̄ = Σx/n = (42.5 + 44.0 + 41.5 + ... + 36.0)/144 = 6165/144 = 42.81

The average length of the 144 fish is 42.81 cm.


The median is the average of the middle two observations once they have been ordered.
The 72nd and 73rd observations are 45 and 45. The average of these two observations is 45.
Half of the fish lengths are less than 45 cm and half are longer.
The mode is 46 cm. This observation occurred 12 times.
b.

The sample mean weight is:

   x̄ = Σx/n = (732 + 795 + 547 + ... + 1433)/144 = 151159/144 = 1049.72

The average weight of the 144 fish is 1049.72 grams.


The median is the average of the middle two observations once they have been ordered.
The 72nd and 73rd observations are 989 and 1011. The average of these two observations is
median = (989 + 1,011)/2 = 1000

Half of the fish weights are less than 1000 grams and half are heavier.
There are 2 modes, 886 and 1186. Each of these observations occurred 3 times.
c.

The sample mean DDT level is:

   x̄ = Σx/n = (10 + 16 + 23 + ... + 1.9)/144 = 3507.1/144 = 24.35

The average DDT level of the 144 fish is 24.35 parts per million.


The median is the average of the middle two observations once they have been ordered.
The 72nd and 73rd observations are 7.1 and 7.2. The average of these two observations is
median = (7.1 + 7.2)/2 = 7.15

Half of the fish DDT levels are less than 7.15 parts per million and half are greater.
The mode is 12. This observation occurred 8 times.


d.

From the graph in Exercise 2.24a, the data are skewed to the left. This corresponds to the
relationship between the mean and the median. For data skewed to the left, the mean is
less than the median. For the fish lengths, the mean is 42.81 and the median is 45.

e.

From the graph in Exercise 2.24b, the data are slightly skewed to the right. This
corresponds to the relationship between the mean and the median. For data skewed to the
right, the mean is more than the median. For the fish weights, the mean is 1049.72 and the
median is 1000.

f.

From the graph in Exercise 2.24c, the data are skewed to the right. This corresponds to the
relationship between the mean and the median. For data skewed to the right, the mean is
more than the median. For the fish DDT levels, the mean is 24.35 and the median is 7.15.

2.52

a.

Due to the "elite" superstars, the salary distribution is skewed to the right. Since this
implies that the median is less than the mean, the players' association would want to use the
median.

b.

The owners, by the logic of part a, would want to use the mean.

2.54

a.

The sample mean is:

   x̄ = Σx/n = (5 + 3 + 4 + ... + 3)/20 = 80/20 = 4

The sample median is found by finding the average of the 10th and 11th observations once
the data are arranged in order. The data arranged in order are:
1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9 13
The 10th and 11th observations are 3 and 4. The average of these two numbers (median) is:
median = (3 + 4)/2 = 7/2 = 3.5

The mode is the observation appearing the most. For this data set, the mode is 1, which
appears 5 times.


b.

Eliminating the largest number which is 13 results in the following:


The sample mean is:

   x̄ = Σx/n = (5 + 3 + 4 + ... + 3)/19 = 67/19 = 3.53

The sample median is found by finding the middle observation once the data are arranged
in order. The data arranged in order are:
1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9
The 10th observation is 3. The median is 3.
The mode is the observation appearing the most. For this data set, the mode is 1, which
appears 5 times.
By dropping the largest number, the mean is reduced from 4 to 3.53. The median is
reduced from 3.5 to 3. There is no effect on the mode.
c.

The data arranged in order are:


1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9 13
If we drop the lowest 2 and largest 2 observations we are left with:
1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7

The sample 10% trimmed mean is:

   x̄ = Σx/n = (1 + 1 + 1 + ... + 7)/16 = 56/16 = 3.5

The advantage of the trimmed mean over the regular mean is that very large and very small
numbers that could greatly affect the mean have been eliminated.
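A sketch of the same 10% trimming, in Python, using the 20 ordered observations listed above (illustrative only):

data = [1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, 9, 13]

def trimmed_mean(values, trim=0.10):
    # Drop the lowest and highest `trim` fraction of the sorted data,
    # then average what remains.
    v = sorted(values)
    k = int(len(v) * trim)          # number trimmed from EACH end (2 here)
    kept = v[k:len(v) - k]
    return sum(kept) / len(kept)

print(trimmed_mean(data))   # 3.5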


2.56

a. s² = [Σx² − (Σx)²/n] / (n − 1) = [84 − 20²/10] / (10 − 1) = 4.8889
   s = √4.8889 = 2.211

b. s² = [Σx² − (Σx)²/n] / (n − 1) = [380 − 100²/40] / (40 − 1) = 3.3333
   s = √3.3333 = 1.826

c. s² = [Σx² − (Σx)²/n] / (n − 1) = [18 − 17²/20] / (20 − 1) = .1868
   s = √.1868 = .432

2.58

a. Range = 42 − 37 = 5
   s² = [Σx² − (Σx)²/n] / (n − 1) = [7,935 − 199²/5] / (5 − 1) = 3.7
   s = √3.7 = 1.92

b. Range = 100 − 1 = 99
   s² = [Σx² − (Σx)²/n] / (n − 1) = [25,795 − 303²/9] / (9 − 1) = 1,949.25
   s = √1,949.25 = 44.15

c. Range = 100 − 2 = 98
   s² = [Σx² − (Σx)²/n] / (n − 1) = [20,033 − 295²/8] / (8 − 1) = 1,307.84
   s = √1,307.84 = 36.16
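The shortcut formula used in Exercises 2.56 and 2.58 can be verified numerically. The sketch below (illustrative) recomputes part a of Exercise 2.58 from its summary values n = 5, Σx = 199, and Σx² = 7,935.

import math

n, sum_x, sum_x2 = 5, 199, 7935            # summary statistics for Exercise 2.58a

s2 = (sum_x2 - sum_x**2 / n) / (n - 1)     # shortcut formula for the sample variance
s = math.sqrt(s2)

print(round(s2, 1), round(s, 2))           # 3.7 1.92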

2.60

This is one possibility for the two data sets.


Data Set 1: 1, 1, 2, 2, 3, 3, 4, 4, 5, 5
Data Set 2: 1, 1, 1, 1, 1, 5, 5, 5, 5, 5

x̄₁ = Σx/n = (1 + 1 + 2 + 2 + 3 + 3 + 4 + 4 + 5 + 5)/10 = 30/10 = 3
x̄₂ = Σx/n = (1 + 1 + 1 + 1 + 1 + 5 + 5 + 5 + 5 + 5)/10 = 30/10 = 3

Therefore, the two data sets have the same mean. The variances for the two data sets are:

s₁² = [Σx² − (Σx)²/n] / (n − 1) = [110 − 30²/10] / 9 = 20/9 = 2.2222
s₂² = [Σx² − (Σx)²/n] / (n − 1) = [130 − 30²/10] / 9 = 40/9 = 4.4444


The dot diagrams for the two data sets are shown below.

2.62

a. Range = 3 − 0 = 3
   s² = [Σx² − (Σx)²/n] / (n − 1) = [15 − 7²/5] / (5 − 1) = 1.3
   s = √1.3 = 1.1402

b. After adding 3 to each of the data points,
   Range = 6 − 3 = 3
   s² = [Σx² − (Σx)²/n] / (n − 1) = [102 − 22²/5] / (5 − 1) = 1.3
   s = √1.3 = 1.1402

c. After subtracting 4 from each of the data points,
   Range = 1 − (−4) = 3
   s² = [Σx² − (Σx)²/n] / (n − 1) = [39 − (−13)²/5] / (5 − 1) = 1.3
   s = √1.3 = 1.1402
d.

The range, variance, and standard deviation remain the same when any number is added to
or subtracted from each measurement in the data set.
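This invariance is easy to confirm numerically. In the sketch below (illustrative), the five values are a hypothetical sample consistent with the summary quantities used above, not necessarily the exercise data.

import statistics

x = [0, 1, 1, 2, 3]          # hypothetical sample with Σx = 7, Σx² = 15, range 3
y = [v + 3 for v in x]       # each point shifted up by 3
z = [v - 4 for v in x]       # each point shifted down by 4

for s in (x, y, z):
    rng = max(s) - min(s)
    print(rng, round(statistics.variance(s), 2), round(statistics.stdev(s), 4))
# all three lines print: 3 1.3 1.1402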

2.64

a.

The maximum age is 64. The minimum age is 39. The range is 64 − 39 = 25.

b.

The variance is:

   s² = [Σx² − (Σx)²/n] / (n − 1) = [125,764 − 2494²/50] / (50 − 1) = 27.822

c.

The standard deviation is:

   s = √s² = √27.822 = 5.275

d.


Since the standard deviation of the ages of the 50 most powerful women in Europe is 10
years and is greater than that in the U.S. (5.275 years), the age data for Europe is more
variable.


2.66

a.

The maximum weight is 1.1 carats. The minimum weight is .18 carats. The range is
1.1 − .18 = .92 carats.

b.

The variance is:

   s² = [Σx² − (Σx)²/n] / (n − 1) = [146.19 − 194.32²/308] / (308 − 1) = .0768 square carats
c.

The standard deviation is:


s = √s² = √.0768 = .2772 carats


d.

The standard deviation. This gives us an idea about how spread out the data are in the
same units as the original data.

2.68

a.

A worker's overall time to complete the operation under study is determined by adding the
subtask-time averages.
Worker A
   The average for subtask 1 is: x̄ = Σx/n = 211/7 = 30.14
   The average for subtask 2 is: x̄ = Σx/n = 21/7 = 3
   Worker A's overall time is 30.14 + 3 = 33.14.

Worker B
   The average for subtask 1 is: x̄ = Σx/n = 213/7 = 30.43
   The average for subtask 2 is: x̄ = Σx/n = 29/7 = 4.14
   Worker B's overall time is 30.43 + 4.14 = 34.57.

b.

Worker A
   s = √{[Σx² − (Σx)²/n] / (n − 1)} = √{[6455 − 211²/7] / (7 − 1)} = √15.8095 = 3.98

Worker B
   s = √{[6487 − 213²/7] / (7 − 1)} = √.9524 = .98

c.

The standard deviations represent the amount of variability in the time it takes the worker
to complete subtask 1.


d.

Worker A
   s = √{[67 − 21²/7] / (7 − 1)} = √.6667 = .82

Worker B
   s = √{[147 − 29²/7] / (7 − 1)} = √4.4762 = 2.12

e.

I would choose workers similar to worker B to perform subtask 1. Worker B has a slightly
higher average time on subtask 1 (A: x = 30.14, B: x = 30.43). But, Worker B has a
smaller variability in the time it takes to complete subtask 1 (part b). He or she is more
consistent in the time needed to complete the task.
I would choose workers similar to Worker A to perform subtask 2. Worker A has a smaller
average time on subtask 2 (A: x = 3, B: x = 4.14). Worker A also has a smaller
variability in the time needed to complete subtask 2 (part d).

2.70

Since no information is given about the data set, we can only use Chebyshev's Rule.

a. Nothing can be said about the percentage of measurements which will fall between x̄ − s and x̄ + s.

b. At least 3/4 or 75% of the measurements will fall between x̄ − 2s and x̄ + 2s.

c. At least 8/9 or 89% of the measurements will fall between x̄ − 3s and x̄ + 3s.

2.72

a. x̄ = Σx/n = 206/25 = 8.24

   s² = [Σx² − (Σx)²/n] / (n − 1) = [1778 − 206²/25] / (25 − 1) = 3.357

   s = √s² = 1.83

b.
   Interval                     Number of Measurements in Interval    Percentage
   x̄ ± s, or (6.41, 10.07)                    18                      18/25 = .72 or 72%
   x̄ ± 2s, or (4.58, 11.90)                   24                      24/25 = .96 or 96%
   x̄ ± 3s, or (2.75, 13.73)                   25                      25/25 = 1 or 100%

c.

The percentages in part b are in agreement with Chebyshev's Rule and agree fairly well
with the percentages given by the Empirical Rule.
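Counting the observations inside x̄ ± ks, as in part b, can be automated. The sketch below is illustrative; the list named data is a hypothetical placeholder for the 25 observations.

import statistics

data = [8, 9, 7, 10, 8]   # hypothetical placeholder; substitute the 25 sample values

xbar = statistics.mean(data)
s = statistics.stdev(data)

for k in (1, 2, 3):
    lo, hi = xbar - k * s, xbar + k * s
    inside = sum(lo <= v <= hi for v in data)
    print(k, round(inside / len(data), 2))   # proportion within k standard deviations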


d.

Range = 12 − 5 = 7
s ≈ range/4 = 7/4 = 1.75
The range approximation provides a satisfactory estimate of s = 1.83 from part a.

2.74

From Chebyshev's Theorem, we know that at least 3/4 or 75% of all observations will fall within
2 standard deviations of the mean. From Exercise 2.47, x̄ = .631. From Exercise 2.66,
s = .2772. This interval is:

   x̄ ± 2s ⇒ .631 ± 2(.2772) ⇒ .631 ± .5544 ⇒ (.0766, 1.1854)

2.76

a.

From the information given, we have x = 375 and s = 25. From Chebyshev's Rule, we
know that at least three-fourths of the measurements are within the interval:
x̄ ± 2s, or (325, 425)

Thus, at most one-fourth of the measurements exceed 425. In other words, more than 425
vehicles used the intersection on at most 25% of the days.
b.

According to the Empirical Rule, approximately 95% of the measurements are within the
interval:
x̄ ± 2s, or (325, 425)

This leaves approximately 5% of the measurements to lie outside the interval. Because of
the symmetry of a mound-shaped distribution, approximately 2.5% of these will lie below
325, and the remaining 2.5% will lie above 425. Thus, on approximately 2.5% of the days,
more than 425 vehicles used the intersection.
2.78

a.

Since the sample mean (18.2) is larger than the sample median (15), it indicates that the
distribution of years is skewed to the right. In addition, the maximum number of years is
50 and the minimum is 2. If the distribution were symmetric, the mean and median should
be about halfway between these two numbers. Halfway between the maximum and
minimum values is 26, which is much larger than either the mean or the median.

b.

The standard deviation can be estimated by the range divided by either 4 or 6. For this
distribution, the range is:
Range = Largest − smallest = 50 − 2 = 48.
Dividing the range by 4, we get an estimate of the standard deviation to be 48/4 = 12.
Dividing the range by 6, we get an estimate of the standard deviation to be 48/6 = 8.
Thus, the standard deviation should be somewhere between 8 and 12. For this problem, the
standard deviation is s = 10.64. This value falls in the estimated range of 8 to 12.

Methods for Describing Sets of Data

25

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

c.

First, we calculate the number of standard deviations from the mean the value of 40 years
is. To do this, we first subtract the mean and then divide by the value of the standard
deviation.
Number of standard deviations is (40 − x̄)/s = (40 − 18.2)/10.64 = 2.05 ≈ 2
Using Chebyshev's Rule, we know that at most 1/k² or 1/2² = 1/4 of the data will be more
than 2 standard deviations from the mean. Thus, this would indicate that at most 25% of
the Generation Xers responded with 40 years or more.
Next, we calculate the number of standard deviations from the mean the value of 8 years is.
Number of standard deviations is (8 − x̄)/s = (8 − 18.2)/10.64 = −.96 ≈ −1

Using Chebyshev's Rule, we get no information about the data within 1 standard deviation
of the mean. However, we know the median (15) is more than 8. By definition, 50% of
the data are larger than the median. Thus, at least 50% of the Generation Xers responded
with 8 years or more. No additional information can be obtained with the information
given.
2.80

a.

Using MINITAB, the frequency histogram for the time in bankruptcy is:


The Empirical Rule is not applicable because the data are not mound shaped.


b. Using MINITAB, the descriptive measures are:


Descriptive Statistics: Time in Bankrupt

Variable    N     Mean   Median   TrMean   StDev   SE Mean
Time in    49    2.549    1.700    2.333   1.828     0.261

Variable    Minimum   Maximum       Q1       Q3
Time in       1.000    10.100    1.350    3.500

From Chebyshev's Theorem, we know that at least 75% of the observations will fall within
2 standard deviations of the mean. This interval is:

   x̄ ± 2s ⇒ 2.549 ± 2(1.828) ⇒ 2.549 ± 3.656 ⇒ (−1.107, 6.205)

c. There are 47 of the 49 observations within this interval. The percentage would be
(47/49)*100% = 95.9%. This agrees with Chebyshev's Theorem (at least 75%). It also
agrees with the Empirical Rule (approximately 95%).
d. From the above interval we know that about 95% of all firms filing for prepackaged
bankruptcy will be in bankruptcy between 0 and 6.2 months. Thus, we would estimate that a
firm considering filing for bankruptcy will be in bankruptcy up to 6.2 months.
2.82


a.

Since it is given that the distribution is mound-shaped, we can use the Empirical Rule. We
know that 1.84% is 2 standard deviations below the mean. The Empirical Rule states that
approximately 95% of the observations will fall within 2 standard deviations of the mean and,
consequently, approximately 5% will lie outside that interval. Since a mound-shaped
distribution is symmetric, then approximately 2.5% of the day's production of batches will
fall below 1.84%.

b.

If the data are actually mound-shaped, it would be extremely unusual (less than 2.5%) to
observe a batch with 1.80% zinc phosphide if the true mean is 2.0%. Thus, if we did
observe 1.8%, we would conclude that the mean percent of zinc phosphide in today's
production is probably less than 2.0%.

2.84

a.

Since we do not have any idea of the shape of the distribution of SAT-Math score
changes, we must use Chebyshev's Theorem. We know that at least 8/9 of the
observations will fall within 3 standard deviations of the mean. This interval would be:

   x̄ ± 3s ⇒ 19 ± 3(65) ⇒ 19 ± 195 ⇒ (−176, 214)

Thus, for a randomly selected student, we could be pretty sure that this student's score
would be anywhere from 176 points below his/her previous SAT-Math score to 214 points
above his/her previous SAT-Math score.
b.

Since we do not have any idea of the shape of the distribution of SAT-Verbal score
changes, we must use Chebyshev's Theorem. We know that at least 8/9 of the
observations will fall within 3 standard deviations of the mean. This interval would be:

   x̄ ± 3s ⇒ 7 ± 3(49) ⇒ 7 ± 147 ⇒ (−140, 154)


Thus, for a randomly selected student, we could be pretty sure that this student's score
would be anywhere from 140 points below his/her previous SAT-Verbal score to 154
points above his/her previous SAT-Verbal score.


c.

A change of 140 points on the SAT-Math would be a little less than 2 standard deviations
from the mean. A change of 140 points on the SAT-Verbal would be a little less than 3
standard deviations from the mean. Since the 140 point change for the SAT-Math is not as
big a change as the 140 point on the SAT-Verbal, it would be most likely that the score was
a SAT-Math score.

2.86

a. z = (x − x̄)/s = (40 − 30)/5 = 2 (sample)   2 standard deviations above the mean.

b. z = (x − μ)/σ = (90 − 89)/2 = .5 (population)   .5 standard deviations above the mean.

c. z = (x − μ)/σ = (50 − 50)/5 = 0 (population)   0 standard deviations above the mean.

d. z = (x − x̄)/s = (20 − 30)/4 = −2.5 (sample)   2.5 standard deviations below the mean.
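All four parts use the same z-score formula; a small illustrative Python helper makes the sample/population distinction explicit.

def z_score(x, center, spread):
    # center = x-bar and spread = s for a sample; center = mu and spread = sigma for a population
    return (x - center) / spread

print(z_score(40, 30, 5))   #  2.0  (part a)
print(z_score(90, 89, 2))   #  0.5  (part b)
print(z_score(50, 50, 5))   #  0.0  (part c)
print(z_score(20, 30, 4))   # -2.5  (part d)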

2.88

The 50th percentile of a data set is the observation that has half of the observations less than it.
Another name for the 50th percentile is the median.

2.90

Since the element 40 has a z-score of −2 and 90 has a z-score of 3,

   −2 = (40 − μ)/σ   and   3 = (90 − μ)/σ

   −2σ = 40 − μ ⇒ μ = 40 + 2σ
   3σ = 90 − μ ⇒ μ + 3σ = 90

By substitution, 40 + 2σ + 3σ = 90 ⇒ 5σ = 50 ⇒ σ = 10
By substitution, μ = 40 + 2(10) = 60
Therefore, the population mean is 60 and the standard deviation is 10.
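The same two z-score equations can be solved as a 2-by-2 linear system. The NumPy sketch below is illustrative only.

import numpy as np

# From -2 = (40 - mu)/sigma and 3 = (90 - mu)/sigma:
#   mu - 2*sigma = 40
#   mu + 3*sigma = 90
A = np.array([[1.0, -2.0],
              [1.0,  3.0]])
b = np.array([40.0, 90.0])

mu, sigma = np.linalg.solve(A, b)
print(mu, sigma)   # 60.0 10.0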
2.92


The percentile ranking of the age of 25 years would be 100% − 73.5% = 26.5%.


2.94

a.

From Exercise 2.77, x̄ = 94.91 and s = 4.83. The z-score for an observation of 78 is:

   z = (x − x̄)/s = (78 − 94.91)/4.83 = −3.50

This z-score indicates that an observation of 78 is 3.5 standard deviations below the
mean. Very few observations will be lower than this one.
b.

The z-score for an observation of 98 is:


   z = (x − x̄)/s = (98 − 94.91)/4.83 = 0.63

This z-score indicates that an observation of 98 is .63 standard deviations above the
mean. This score is not an unusual observation in the data set.
2.96

a.

From the problem, μ = 2.7 and σ = .5.

   z = (x − μ)/σ ⇒ zσ = x − μ ⇒ x = μ + zσ

For z = 2.0, x = 2.7 + 2.0(.5) = 3.7
For z = −1.0, x = 2.7 − 1.0(.5) = 2.2
For z = .5, x = 2.7 + .5(.5) = 2.95
For z = −2.5, x = 2.7 − 2.5(.5) = 1.45
b.

For z = −1.6, x = 2.7 − 1.6(.5) = 1.9

c.

If we assume the distribution of GPAs is


approximately mound-shaped, we can use the
Empirical Rule.
From the Empirical Rule, we know that .025
or 2.5% of the students will have GPAs
above 3.7 (with z = 2). Thus, the GPA
corresponding to summa cum laude (top
2.5%) will be greater than 3.7 (z > 2).
We know that .16 or 16% of the students will have GPAs above 3.2 (z = 1). Thus, the
limit on GPAs for cum laude (top 16%) will be greater than 3.2 (z > 1).
We must assume the distribution is mound-shaped.

Methods for Describing Sets of Data

29

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

2.98

a.

Since the data are approximately mound-shaped, we can use the Empirical Rule.
On the blue exam, the mean is 53% and the standard deviation is 15%. We know that
approximately 68% of all students will score within 1 standard deviation of the mean.
This interval is:
x̄ ± s ⇒ 53 ± 15 ⇒ (38, 68)

About 95% of all students will score within 2 standard deviations of the mean. This
interval is:
x̄ ± 2s ⇒ 53 ± 2(15) ⇒ 53 ± 30 ⇒ (23, 83)

About 99.7% of all students will score within 3 standard deviations of the mean. This
interval is:
x̄ ± 3s ⇒ 53 ± 3(15) ⇒ 53 ± 45 ⇒ (8, 98)

b.

Since the data are approximately mound-shaped, we can use the Empirical Rule.
On the red exam, the mean is 39% and the standard deviation is 12%. We know that
approximately 68% of all students will score within 1 standard deviation of the mean.
This interval is:
x̄ ± s ⇒ 39 ± 12 ⇒ (27, 51)

About 95% of all students will score within 2 standard deviations of the mean. This
interval is:
x̄ ± 2s ⇒ 39 ± 2(12) ⇒ 39 ± 24 ⇒ (15, 63)

About 99.7% of all students will score within 3 standard deviations of the mean. This
interval is:

x̄ ± 3s ⇒ 39 ± 3(12) ⇒ 39 ± 36 ⇒ (3, 75)

c.


The student would have been more likely to have taken the red exam. For the blue exam,
we know that approximately 95% of all scores will be from 23% to 83%. The observed
20% score does not fall in this range. For the red exam, we know that approximately
95% of all scores will be from 15% to 63%. The observed 20% score does fall in this
range. Thus, it is more likely that the student would have taken the red exam.

2.100

The 25th percentile, or lower quartile, is the measurement that has 25% of the measurements
below it and 75% of the measurements above it. The 50th percentile, or median, is the
measurement that has 50% of the measurements below it and 50% of the measurements above it.
The 75th percentile, or upper quartile, is the measurement that has 75% of the measurements
below it and 25% of the measurements above it.


2.102

a.

Median is approximately 4.

b.

QL is approximately 3 (Lower Quartile)


QU is approximately 6 (Upper Quartile)


c.

IQR = QU − QL = 6 − 3 = 3

d.

The data set is skewed to the right since the right whisker is longer than the left, there is
one outlier, and there are two potential outliers.

e.

50% of the measurements are to the right of the median and 75% are to the left of the upper
quartile.

f.

There are two potential outliers, 12 and 13. There is one outlier, 16.

2.104

a. From the problem, x̄ = 52.33 and s = 9.22.


The highest salary is 75 (thousand).
The z-score is z = (x − x̄)/s = (75 − 52.33)/9.22 = 2.46

Therefore, the highest salary is 2.46 standard deviations above the mean.
The lowest salary is 35.0 (thousand).
The z-score is z = (x − x̄)/s = (35.0 − 52.33)/9.22 = −1.88

Therefore, the lowest salary is 1.88 standard deviations below the mean.
The mean salary offer is 52.33 (thousand).
The z-score is z = (x − x̄)/s = (52.33 − 52.33)/9.22 = 0

The z-score for the mean salary offer is 0 standard deviations from the mean.
No, the highest salary offer is not unusually high. For any distribution, at least 8/9 of the
salaries should have z-scores between −3 and 3. A z-score of 2.46 would not be that
unusual.


b.

Using MINITAB, the box plot is:

Since no salaries are outside the inner fences, none of them are potentially faulty observations.
2.106

Using MINITAB, the side-by-side box plots are:


   (Side-by-side box plots of AGE for each GROUP)

From the boxplots, there appears to be one outlier in the third group.
2.108

a.

First, we will compute the mean and standard deviation.


The sample mean is:

   x̄ = Σx/n = 393/75 = 5.24

The sample variance is:

   s² = [Σx² − (Σx)²/n] / (n − 1) = [5943 − 393²/75] / (75 − 1) = 52.482

The standard deviation is:


s = √s² = √52.482 = 7.244

Since this data set is highly skewed, we will use 2 standard deviations from the mean as
the cutoff for outliers. Z-scores with values greater than 2 in absolute value are
considered outliers. An observation with a z-score of 2 would have the value:
   z = (x − x̄)/s ⇒ 2 = (x − 5.24)/7.244 ⇒ 2(7.244) = x − 5.24 ⇒ 14.488 = x − 5.24 ⇒ x = 19.728

An observation with a z-score of -2 would have the value:


   z = (x − x̄)/s ⇒ −2 = (x − 5.24)/7.244 ⇒ −2(7.244) = x − 5.24 ⇒ −14.488 = x − 5.24 ⇒ x = −9.248

Thus any observation that is greater than 19.728 or less than −9.248 would be
considered an outlier. In this data set there would be 4 outliers: 21, 21, 25, 48.
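The outlier rule applied here (flag values more than 2 standard deviations from the mean) is easy to program. The sketch below is illustrative; the list named sample is a hypothetical subset that includes the four flagged values, since the full 75 observations are not listed.

def outliers(values, xbar, s, k=2):
    # Flag observations more than k standard deviations from the mean.
    return [v for v in values if abs((v - xbar) / s) > k]

sample = [1, 2, 3, 5, 21, 21, 25, 48]          # hypothetical subset of the data
print(outliers(sample, xbar=5.24, s=7.244))    # [21, 21, 25, 48]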
b.

Deleting these 4 outliers, we will recalculate the mean, median, variance, and standard
deviation. The median for the original data set is the middle number once they have been
arranged in order and is the 38th observation which is 3.
The new mean is:

   x̄ = Σx/n = 278/71 = 3.92

The new sample variance is:

   s² = [Σx² − (Σx)²/n] / (n − 1) = [2132 − 278²/71] / (71 − 1) = 14.907
The new standard deviation is:
s = √s² = √14.907 = 3.861

The new median is the 36th observation once the data have been arranged in order and is 3.
In the original data set, the mean is 5.24, the standard deviation is 7.244, and the median
is 3. In the revised data set, the mean is 3.92, the standard deviation is 3.861, and the
median is 3. The mean has been decreased, the standard deviation has been almost
halved, but the median stays the same.


2.110

For Perturbed Intrinsics, but no Perturbed Projections:

   x̄ = Σx/n = (1.0 + 1.3 + 3.0 + 1.5 + 1.3)/5 = 8.1/5 = 1.62

   s² = [Σx² − (Σx)²/n] / (n − 1) = [15.63 − 8.1²/5] / (5 − 1) = 2.508/4 = .627

   s = √s² = √.627 = .792

The z-score corresponding to a value of 4.5 is


   z = (x − x̄)/s = (4.5 − 1.62)/.792 = 3.63

Since this z-score is greater than 3, we would consider this an outlier for perturbed
intrinsics, but no perturbed projections.
For Perturbed Projections, but no Perturbed Intrinsics:

   x̄ = Σx/n = (22.9 + 21.0 + 34.4 + 29.8 + 17.7)/5 = 125.8/5 = 25.16

   s² = [Σx² − (Σx)²/n] / (n − 1) = [3350.1 − 125.8²/5] / (5 − 1) = 184.972/4 = 46.243

   s = √s² = √46.243 = 6.800

The z-score corresponding to a value of 4.5 is


   z = (x − x̄)/s = (4.5 − 25.16)/6.800 = −3.038

Since this z-score is less than -3, we would consider this an outlier for perturbed
projections, but no perturbed intrinsics.
Since the z-score corresponding to 4.5 for the perturbed projections, but no perturbed
intrinsics is smaller than that for perturbed intrinsics, but no perturbed projections, it is
more likely that the type of camera perturbation is perturbed projections, but no
perturbed intrinsics.


2.112

Using MINITAB, a scatterplot of the data is:


   (Scatterplot of Var2 vs Var1)

2.114

Using MINITAB, the scatterplot of the data is:

   (Scatterplot of Lawyers vs Offices)

As the number of offices increases, the number of lawyers also tends to increase.
2.116

a.

Using MINITAB, the scatterplot is:


   (Scatterplot of 30th-trial completion time vs 10th-trial completion time)

It appears that as the completion time for the 10th trial increases, the completion time for
the 30th trial decreases.


b.

Using MINITAB, the scatterplot is:


   (Scatterplot of 50th-trial completion time vs 10th-trial completion time)

It appears that as the completion time for the 10th trial increases, the completion time for
the 50th trial increases.
c.

Using MINITAB, the scatterplot is:


   (Scatterplot of 50th-trial completion time vs 30th-trial completion time)

It appears that as the completion time for the 30th trial increases, the completion time for
the 50th trial increases.


2.118

Using MINITAB, the scatterplot of the data is:


Scatterplot of Mass vs Time

There is evidence to indicate that the mass of the spill tends to diminish as time
increases. As time is getting larger, the mass is decreasing.
2.120

The mean is sensitive to extreme values in a data set. Therefore, the median is preferred to the
mean when a data set is skewed in one direction or the other.

2.122

a.

If we assume that the data are about mound-shaped, then any observation with a
z-score greater than 3 in absolute value would be considered an outlier. From Exercise
1.121, the z-score corresponding to 50 is 1, the z-score corresponding to 70 is 1, and the
z-score corresponding to 80 is 2. Since none of these z-scores is greater than 3 in absolute
value, none would be considered outliers.

b.

From Exercise 1.121, the z-score corresponding to 50 is 2, the z-score corresponding to


70 is 2, and the z-score corresponding to 80 is 4. Since the z-score corresponding to 80 is
greater than 3, 80 would be considered an outlier.

c.

From Exercise 1.121, the z-score corresponding to 50 is 1, the z-score corresponding to 70


is 3, and the z-score corresponding to 80 is 4. Since the z-scores corresponding to 70 and
80 are greater than or equal to 3, 70 and 80 would be considered outliers.

d.

From Exercise 1.121, the z-score corresponding to 50 is .1, the z-score corresponding to 70
is .3, and the z-score corresponding to 80 is .4. Since none of these z-scores is greater than
3 in absolute value, none would be considered outliers.


2.124

a. Σx = 4 + 6 + 6 + 5 + 6 + 7 = 34
   Σx² = 4² + 6² + 6² + 5² + 6² + 7² = 198
   x̄ = Σx/n = 34/6 = 5.67
   s² = [Σx² − (Σx)²/n] / (n − 1) = [198 − 34²/6] / (6 − 1) = 5.3333/5 = 1.0667
   s = √1.0667 = 1.03

b. Σx = (−1) + 4 + (−3) + 0 + (−3) + (−6) = −9
   Σx² = (−1)² + 4² + (−3)² + 0² + (−3)² + (−6)² = 71
   x̄ = Σx/n = −9/6 = −$1.5
   s² = [Σx² − (Σx)²/n] / (n − 1) = [71 − (−9)²/6] / (6 − 1) = 57.5/5 = 11.5 dollars squared
   s = √11.5 = $3.39

c. Σx = 3/5 + 4/5 + 2/5 + 1/5 + 1/16 = 2.0625
   Σx² = (3/5)² + (4/5)² + (2/5)² + (1/5)² + (1/16)² = 1.2039
   x̄ = Σx/n = 2.0625/5 = .4125%
   s² = [Σx² − (Σx)²/n] / (n − 1) = [1.2039 − 2.0625²/5] / (5 − 1) = .3531/4 = .0883% squared
   s = √.0883 = .30%

d. (a) Range = 7 − 4 = 3
   (b) Range = $4 − (−$6) = $10
   (c) Range = (4/5)% − (1/16)% = (64/80)% − (5/80)% = (59/80)% = .7375%

2.126

range/4 = 20/4 = 5
16
80
80
80

range/4 = 20/4 = 5

Chapter 2

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

2.128

Using MINITAB, a pie chart of the data is:


Pie Chart of defect
C ategory
false
true

true
9.8%

false
90.2%

A response of true means the software contained defective code. Thus, only 9.8% of the
modules contained defective software code.
2.130

The z-score would be:


   z = (x − x̄)/s = (408 − 603.7)/185.4 = −1.06

Since this value is not very big, this is not an unusual value to observe.
2.132

a.

The variable of interest is opinion of book reviews. The values could be "would not
recommend," "cautious or very little recommendation," "little or no preference,"
"favorable/recommended," and "outstanding/significant contribution." Since these
responses are not numerical, the variable is qualitative.

b.

Most of the books (63%) received a "favorable/recommended" review. About the same
percentage of books received the following reviews: "cautious or very little
recommendation" (10%), "little or no preference" (9%), and "outstanding/significant
contribution" (12%). Only 5% of the books received "would not recommend" reviews.

c.

If the top two categories are added together, the percent recommended is 75% (actually
slightly higher than 75%). This agrees with the study.

2.134

a.

To display the status, we use a pie chart. From the pie chart,
we see that 58% of the Beanie babies are retired and 42%
are current.


b.

Using Minitab, a histogram of the values is:

Most (40 of 50) Beanie babies have values less than $100. Of the remaining 10, 5 have
values between $100 and $300, 1 has a value between $300 and $500, 1 has a value
between $500 and $700, 2 have values between $700 and $900, and 1 has a value between
$1900 and $2100.
c.

A plot of the value versus the age of the Beanie Baby is as follows:

From the plot, it appears that as the age increases, the value tends to increase.
2.136

a.

Using MINITAB, the stem-and-leaf display is:


Stem-and-leaf of C1    N = 46
Leaf Unit = 0.10

   4    0  3444
 (25)   0  55555555566666667777788889
  16    1  000011222334
   4    1  77
   2    2
   2    2
   2    3
   2    3  9
   1    4
   1    4  7


b.

The leaves that represent those brands that carry the American Dental Association seal are
circled above.

c.

It appears that the brands approved by the ADA tend to have lower costs.
Thirteen of the twenty brands approved by the ADA, or (13/20) × 100% = 65%, are less
than the median cost.

2.138

a.

Using MINITAB, the summary statistics are:

Descriptive Statistics: Marketing, Engineering, Accounting, Total

Variable     N     Mean   Median   TrMean   StDev   SE Mean
Marketin    50    4.766    5.400    4.732   2.584     0.365
Engineer    50    5.044    4.500    4.798   3.835     0.542
Accounti    50    3.652    0.800    2.548   6.256     0.885
Total       50   13.462   13.750   13.043   6.820     0.965

Variable   Minimum   Maximum       Q1       Q3
Marketin     0.100    11.000    2.825    6.250
Engineer     0.400    14.400    1.775    7.225
Accounti     0.100    30.000    0.200    3.725
Total        1.800    36.200    8.075   16.600

b.

The z-scores corresponding to the maximum time guidelines developed for each
department and the total are as follows:

Marketing:    z = (x − x̄)/s = (6.5 − 4.77)/2.58 = .67

Engineering:  z = (x − x̄)/s = (7.0 − 5.04)/3.84 = .51

Accounting:   z = (x − x̄)/s = (8.5 − 3.65)/6.26 = .77

Total:        z = (x − x̄)/s = (17 − 13.46)/6.82 = .52

c.

To find the maximum processing time corresponding to a z-score of 3, we substitute the
values of z, x̄, and s into the z formula and solve for x:

z = (x − x̄)/s  ⇒  x − x̄ = zs  ⇒  x = x̄ + zs

Marketing:

x = 4.77 + 3(2.58) = 4.77 + 7.74 = 12.51


None of the orders exceed this time.

Engineering:

x = 5.04 + 3(3.84) = 5.04 + 11.52 = 16.56


None of the orders exceed this time.

These both agree with both the Empirical Rule and Chebyshev's Rule.


Accounting:

x = 3.65 + 3(6.26) = 3.65 + 18.78 = 22.43


One of the orders exceeds this time or 1/50 = .02.

Total:

x = 13.46 + 3(6.82) = 13.46 + 20.46 = 33.92


One of the orders exceeds this time or 1/50 = .02.

These both agree with Chebyshev's Rule but not the Empirical Rule. Both of these last two
distributions are skewed to the right.
d.

Marketing:

x = 4.77 + 2(2.58) = 4.77 + 5.16 = 9.93


Two of the orders exceed this time or 2/50 = .04.

Engineering:

x = 5.04 + 2(3.84) = 5.04 + 7.68 = 12.72


Two of the orders exceed this time or 2/50 = .04.

Accounting:

x = 3.65 + 2(6.26) = 3.65 + 12.52 = 16.17


Three of the orders exceed this time or 3/50 = .06.

Total:

x = 13.46 + 2(6.82) = 13.46 + 13.64 = 27.10


Two of the orders exceed this time or 2/50 = .04.

All of these agree with Chebyshev's Rule but not the Empirical Rule.
e.

No observations exceed the guideline of 3 standard deviations for both Marketing and
Engineering. One observation exceeds the guideline of 3 standard deviations for both
Accounting (#23, time = 30.0 days) and Total (#23, time = 36.2 days). Therefore, only
(1/10) × 100% = 10% of the "lost" quotes have times exceeding at least one of the 3 standard
deviation guidelines.
Two observations exceed the guideline of 2 standard deviations for both Marketing (#31,
time = 11.0 days and #48, time = 10.0 days) and Engineering (#4, time = 13.0 days and
#49, time = 14.4 days). Three observations exceed the guideline of 2 standard deviations
for Accounting (#20, time = 22.0 days; #23, time = 30.0 days; and #36, time = 18.2 days).
Two observations exceed the guideline of 2 standard deviations for Total (#20, time = 30.2
days and #23, time = 36.2 days). Therefore, (7/10) × 100% = 70% of the "lost" quotes
have times exceeding at least one of the 2 standard deviation guidelines.
We would recommend the 2 standard deviation guideline since it covers 70% of the lost
quotes, while having very few other quotes exceed the guidelines.
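As an optional aside, the guideline cutoffs in parts c and d can be recomputed from the summary statistics in part a; the sketch below only reproduces the x̄ + zs calculations, since the raw processing times are not listed here.

# Guideline cutoffs x-bar + 2s and x-bar + 3s from the reported means and standard deviations.
stats = {                      # (mean, standard deviation) by department
    "Marketing":   (4.77, 2.58),
    "Engineering": (5.04, 3.84),
    "Accounting":  (3.65, 6.26),
    "Total":       (13.46, 6.82),
}
for dept, (xbar, s) in stats.items():
    for z in (2, 3):
        print(dept, "x-bar +", z, "s =", round(xbar + z * s, 2))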

2.140

a.

First, construct a relative frequency distribution for the departments.


Class   Department        Frequency   Relative Frequency
  1     Production            13            .241
  2     Maintenance           31            .574
  3     Sales                  3            .056
  4     R&D                    2            .037
  5     Administration         5            .093
        TOTAL                 54           1.001


The Pareto diagram is:


From the diagram, it is evident that
the departments with the worst safety
record are Maintenance and Production.

b.

First, construct a relative frequency


distribution for the type of injury in the maintenance department.
Class   Injury           Frequency   Relative Frequency
  1     Burn                  6            .194
  2     Back strain           5            .161
  3     Eye damage            2            .065
  4     Cuts                 10            .323
  5     Broken arm            2            .065
  6     Broken leg            1            .032
  7     Concussion            3            .097
  8     Hearing loss          2            .065
        TOTAL                31           1.002

The Pareto diagram is:


From the Pareto diagram, it is
evident that cuts is the most
prevalent type of injury. Burns and
back strain are the next most
prevalent types of injuries.

2.142

a.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: MPG

Variable     N     Mean   Median   TrMean   StDev   SE Mean
MPG         36   40.056   40.000   40.063   2.177     0.363

Variable   Minimum   Maximum       Q1       Q3
MPG         35.000    45.000   39.000   41.000


The mean is 40.056 and the standard deviation is 2.177. Both of these measures are
measured in the same units as the original data, which are miles per gallon.
b.

Since the sample mean is a good estimate of the population mean, the manufacturer should
be satisfied. The sample mean is 40.056 which is greater than 40.

c.

The range of the data set is 45 − 35 = 10. Using Chebyshev's Rule, the range should cover
approximately 6 standard deviations. Thus, a good estimate of the standard deviation
would be 10/6 = 1.67. Using the Empirical Rule, the range should cover approximately 4
standard deviations. Thus, a good estimate of the standard deviation would be 10/4 = 2.5.
The given standard deviation is 2.177, which is between these two estimates. Thus, it is a
reasonable value.

d.

Using MINITAB, the frequency histogram is (the relative frequency histogram would have
the same shape):

(Frequency histogram of MPG, with MPG values 35 through 45 on the horizontal axis and frequency on the vertical axis)

Yes, the data appear to be mound-shaped.


e.

Because the data are mound-shaped, we can use the Empirical Rule. We would expect
approximately 68% of the data within the interval x̄ ± s, approximately 95% of the data
within the interval x̄ ± 2s, and approximately all of the data within the interval x̄ ± 3s.

f.

The interval x̄ ± s is 40.056 ± 2.177, or (37.879, 42.233). Twenty-seven of the
observations fall in this interval, or 27/36 = .75 or 75%. This number is a little larger than
68%.

The interval x̄ ± 2s is 40.056 ± 2(2.177), or (35.702, 44.410). Thirty-four of the
observations fall in this interval, or 34/36 = .94 or 94%. This number is very close to 95%.

The interval x̄ ± 3s is 40.056 ± 3(2.177), or (33.525, 46.587). Thirty-six of the
observations fall in this interval, or 36/36 = 1.00 or 100%. This number is the same as all
of the observations.
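As an optional sketch of the interval checks in part f: the endpoints below match those computed above from x̄ = 40.056 and s = 2.177, while the list of 36 MPG observations is left as a placeholder since the raw data are not reproduced here.

# Empirical Rule coverage check; `mpg` is a placeholder for the 36 observations.
xbar, s = 40.056, 2.177
mpg = []  # the 36 MPG values would go here

for k in (1, 2, 3):
    lo, hi = xbar - k * s, xbar + k * s
    inside = sum(lo <= x <= hi for x in mpg)
    print(k, round(lo, 3), round(hi, 3), inside)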


2.144

a.

Both the height and width of the bars (peanuts) change. Thus, some readers may tend to
equate the area of the peanuts with the frequency for each year.

b.

The frequency bar chart is:


The Kentucky Milk Case

(To accompany Chapters 1 and 2)

There are many things that could be included in a report about the possibility of collusion. I have
concentrated on the incumbency rates, bid levels and dispersion, and average winning bids. With the
data available, no comparison of market share can be made since there was so much missing data.
Actually, with the data available, the exact analysis cannot be made, since only the winning bid
information is provided. Thus, we have no idea what the losing bids were. I will present what I think is
a reasonable solution. This is by no means the only solution to the case. Many other presentations
could also be used.

Incumbency Rates
The incumbency rate is the percent of the school districts that are won by the same vendor who won the
previous year. A table containing the incumbency rates is included as well as a plot. Notice in the plot
that the incumbency rates in the Tri-county market are higher than those in the Surrounding market.
From 1985 through 1988, the incumbency rate for the Tri-county market was never lower than .923,
while in the same period in the Surrounding market, the incumbency rate was never higher than .730.
This implies the possibility of collusion in the Tri-county market.

                Surrounding Market                       Tri-county Market
        Number of     Same     Incumbency       Number of     Same     Incumbency
Year    Districts   Vendors       Rate          Districts   Vendors       Rate
1984        26         16         .615              10          8         .800
1985        27         19         .704              12         12        1.000
1986        32         19         .594              13         13        1.000
1987        37         27         .730              13         12         .923
1988        37         25         .676              13         13        1.000
1989        37         23         .622              13          9         .692
1990        34         24         .706              13         10         .769
1991         5          3         .600              13         11         .846


The plot of the incumbency rates is:

Bid Levels and Dispersion


Since we only have access to the winning bids in each of the school districts, we cannot make a true
analysis of the bid levels and dispersions. As a compromise, I have used the winning bids of the two
dairies in question, Trauth and Meyer. I have looked at only the winning bids of these two dairies in
both the Tri-county market and in the Surrounding market. If there was no collusion, then the winning
bids and the dispersions of the winning bids should be similar in the two markets for the two dairies. I
looked at the box plots of the winning bids of the two dairies in each market for each type of milk:
whole white, lowfat white and lowfat chocolate. I have included only a few of the box plots as
illustrations. Those included are for 1985 and 1986.


1985 Winning Bids:

OBS   MARKET   WINNER    WHOLE WHITE   LOWFAT WHITE   LOWFAT CHOCOLATE
  1    SUR     MEYER        0.1280        0.1250          0.1315
  2    SUR     TRAUTH       0.1200        0.1110          0.1090
  3    SUR     TRAUTH          .          0.1079          0.1079
  4    SUR     TRAUTH          .          0.1190          0.1210
  5    SUR     MEYER        0.1225        0.1130          0.1099
  6    SUR     TRAUTH       0.1230        0.1130          0.1120
  7    SUR     MEYER        0.1250        0.1145          0.1140
  8    TRI     TRAUTH       0.1440        0.1440             .
  9    TRI     TRAUTH       0.1450        0.1350             .
 10    TRI     MEYER        0.1410        0.1410          0.1410
 11    TRI     TRAUTH       0.1393        0.1393             .
 12    TRI     MEYER        0.1340        0.1340          0.1340
 13    TRI     MEYER        0.1445        0.1345          0.1395
 14    TRI     MEYER           .          0.1345             .
 15    TRI     TRAUTH       0.1449        0.1349          0.1399
 16    TRI     TRAUTH          .          0.1299          0.1299
 17    TRI     MEYER        0.1480        0.1480          0.1480
 18    TRI     TRAUTH       0.1310        0.1290             .
 19    TRI     MEYER           .          0.1380             .
 20    TRI     TRAUTH       0.1435        0.1335             .

Box Plots for Whole White Milk - 1985

(Side-by-side box plots of the whole white milk winning bids, WWBID, for the Surrounding and Tri-county markets)


Box Plots for Lowfat White Milk - 1985

(Side-by-side box plots of the lowfat white milk winning bids, LFWBID, for the Surrounding and Tri-county markets)

Box Plots for Lowfat Chocolate Milk - 1985

(Side-by-side box plots of the lowfat chocolate milk winning bids, LFCBID, for the Surrounding and Tri-county markets)


For each type of milk, the mean and median winning bids for the Tri-county market were higher than
the corresponding winning bids in the Surrounding market. Also, the dispersion, indicated by the width
of the boxes and the length of the whiskers, for the Surrounding market is larger than for the Tri-county
market in most cases. This is indicative of collusion in the Tri-county market. This same pattern also
existed in 1986.
1986 Winning Bids:

OBS   MARKET   WINNER    WHOLE WHITE   LOWFAT WHITE   LOWFAT CHOCOLATE
  1    SUR     TRAUTH       0.1195        0.1100          0.1085
  2    SUR     TRAUTH       0.1330        0.1240          0.1290
  3    SUR     TRAUTH       0.1140        0.1070          0.1050
  4    SUR     MEYER        0.1350        0.1250          0.1315
  5    SUR     TRAUTH       0.1224        0.1124          0.1110
  6    SUR     TRAUTH          .          0.1110          0.1110
  7    SUR     TRAUTH          .          0.1180          0.1200
  8    SUR     TRAUTH       0.1250        0.1125          0.1115
  9    TRI     TRAUTH       0.1475        0.1475             .
 10    TRI     TRAUTH       0.1469        0.1369             .
 11    TRI     MEYER        0.1440        0.1340          0.1395
 12    TRI     TRAUTH       0.1420        0.1420             .
 13    TRI     MEYER        0.1390        0.1390          0.1390
 14    TRI     MEYER        0.1470        0.1370          0.1420
 15    TRI     MEYER           .          0.1380             .
 16    TRI     TRAUTH       0.1474        0.1374          0.1424
 17    TRI     TRAUTH          .          0.1349          0.1349
 18    TRI     MEYER        0.1505        0.1505          0.1505
 19    TRI     TRAUTH       0.1360        0.1320             .
 20    TRI     MEYER           .          0.1430             .
 21    TRI     TRAUTH       0.1460        0.1360             .

Box Plots for Whole White Milk - 1986

(Side-by-side box plots of the whole white milk winning bids, WWBID, for the Surrounding and Tri-county markets)


Box Plots for Lowfat White Milk - 1986

(Side-by-side box plots of the lowfat white milk winning bids, LFWBID, for the Surrounding and Tri-county markets)

Box Plots for Lowfat Chocolate Milk - 1986

(Side-by-side box plots of the lowfat chocolate milk winning bids, LFCBID, for the Surrounding and Tri-county markets)


The same pattern that existed for 1985 and 1986 also existed in 1984, 1987, and 1988. From 1989 on,
the pattern no longer existed. Thus, from the plots, it appears that the two dairies were working
together from 1984 through 1988 in the Tri-county market.
I also plotted the mean winning bids for the two dairies in each of the two markets from 1984 through
1991 for each type of milk. In all three plots, the mean winning bid in 1983 was almost the same in the
two markets. Then, in 1984, the mean winning bid in the Tri-county market was higher than in the
Surrounding market for all three types of milk. This trend holds basically through 1988 (the lowfat
white milk mean winning bid for the Surrounding market was greater than the mean winning bid in the
Tri-county market in 1988). After 1988, the mean winning bids in the two markets are almost the same.
This points to collusion in the Tri-county market from 1984 through 1988.


The dispersion, measured using the standard deviation, of the winning bids for each of the three types
of milk was basically smaller in the Tri-county market than in the Surrounding market for the years
1985 through 1988. Again, after 1988 this pattern no longer existed. Again, this points to collusion
between the two dairies in the Tri-county market during the years 1984 through 1988.
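As an optional illustration of the comparison described above (not the analysis actually used), the 1985 whole white milk winning bids from the table can be summarized by market; missing bids are simply omitted.

# Mean and dispersion of the 1985 whole white milk winning bids by market.
from statistics import mean, stdev

wwbid_1985 = {
    "Surrounding": [0.1280, 0.1200, 0.1225, 0.1230, 0.1250],
    "Tri-county":  [0.1440, 0.1450, 0.1410, 0.1393, 0.1340,
                    0.1445, 0.1449, 0.1480, 0.1310, 0.1435],
}
for market, bids in wwbid_1985.items():
    print(market, round(mean(bids), 4), round(stdev(bids), 4))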



Probability

Chapter 3

3.2

a.

This is a Venn Diagram.

b.

If the sample points are equally likely, then


P(1) = P(2) = P(3) = ... = P(10) = 1/10

Therefore,

P(A) = P(4) + P(5) + P(6) = 1/10 + 1/10 + 1/10 = 3/10 = .3
P(B) = P(6) + P(7) = 1/10 + 1/10 = 2/10 = .2

c.

P(A) = P(4) + P(5) + P(6) = 1/20 + 1/20 + 3/20 = 5/20 = .25
P(B) = P(6) + P(7) = 3/20 + 3/20 = 6/20 = .3

3.4

a.

(9 choose 4) = 9!/[4!(9 − 4)!] = (9·8·7·6·5·4·3·2·1)/[(4·3·2·1)(5·4·3·2·1)] = 126

b.

(7 choose 2) = 7!/[2!(7 − 2)!] = (7·6·5·4·3·2·1)/[(2·1)(5·4·3·2·1)] = 21

c.

(4 choose 4) = 4!/[4!(4 − 4)!] = (4·3·2·1)/[(4·3·2·1)(1)] = 1

d.

(5 choose 0) = 5!/[0!(5 − 0)!] = (5·4·3·2·1)/[(1)(5·4·3·2·1)] = 1

e.

(6 choose 5) = 6!/[5!(6 − 5)!] = (6·5·4·3·2·1)/[(5·4·3·2·1)(1)] = 6
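As an optional check, the five combinations above can be verified with the Python standard library (a convenience, not the method shown in the text):

from math import comb
print(comb(9, 4), comb(7, 2), comb(4, 4), comb(5, 0), comb(6, 5))   # 126 21 1 1 6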


3.6

a.

The 36 sample points are:


1,1 1,2 1,3 1,4 1,5 1,6 2,1 2,2 2,3 2,4 2,5 2,6
3,1 3,2 3,3 3,4 3,5 3,6 4,1 4,2 4,3 4,4 4,5 4,6
5,1 5,2 5,3 5,4 5,5 5,6 6,1 6,2 6,3 6,4 6,5 6,6

b.

If the dice are fair, then each of the sample points is equally likely. Each would have a
probability of 1/36 of occurring.

c.

There is one sample point in A: 3,3. Thus, P(A) = 1/36.

There are 6 sample points in B: 1,6 2,5 3,4 4,3 5,2 and 6,1. Thus, P(B) = 6/36 = 1/6.

There are 18 sample points in C: 1,1 1,3 1,5 2,2 2,4 2,6 3,1 3,3 3,5 4,2 4,4
4,6 5,1 5,3 5,5 6,2 6,4 and 6,6. Thus, P(C) = 18/36 = 1/2.
3.8

Each student will obtain slightly different proportions. However, the proportions should be
close to P(A) = 1/10, P(B) = 6/10 and P(C) = 3/10.

3.10

Define the following event:


B: {Postal worker was assaulted on the job in the past year}
P(B) = 600/12,000 = .05

3.12

a.

The 5 sample points are:


Total population, Agricultural change, Presence of industry, Growth, and Population
concentration.

b.

The probabilities are best estimated with the sample proportions. Thus,
P(Total population) = .18
P(Agricultural change) = .05
P(Presence of industry) = .27
P(Growth) = .05
P(Population concentration) = .45

c.

Define the following event:


A: {Factor specified is population-related}
P(A) = P(Total population) + P(Growth) + P(Population concentration)
= .18 + .05 + .45 = .68.


3.14

a.

The sample points of this experiment correspond to each of the 8 possible types of
commodities. Suppose we introduce notation to make the listing of the sample points
easier.
A: {carload contains agricultural products}
CH: {carload contains chemicals}
CO: {carload contains coal}
F: {carload contains forest products}
MO: {carload contains metallic ores and minerals}
MV: {carload contains motor vehicles and equipment}
N: {carload contains nonmetallic minerals and products}
O: {carload contains other}

The eight sample points are: A CH CO F MO MV N O


b.

The probability of each sample point is found by dividing the number of carloads for each
sample point by the total number of carloads. The probabilities are:
P(A) = 41,690 / 335,770 = .124
P(CH) = 38,331 / 335,770 = .114
P(CO) = 124,595 / 335,770 = .371
P(F) = 21,929 / 335,770 = .065
P(MO) = 34,521 / 335,770 = .103
P(MV) = 22,906 / 335,770 = .068
P(N) = 37,416 / 335,770 = .111
P(O) = 14,382 / 335,770 = .043

c.

P(MV) = .068
P(nonagricultural products) = P(CH) + P(CO) + P(F) + P(MO) + P(MV) + P(N) + P(O)
= .114 + .371 + .065 + .103 + .068 + .111 + .043 = .875

d.

P(CH) + P(CO) = .114 + .371 = .485

e.

Since there were 335,770 carloads that week, the probability of selecting any one in
particular would be 1 / 335,770 = .00000298. Thus, the probability of selecting the
carload with the serial number 1003642 is .00000298.


3.16

a.

Since order does not matter, the number of different bets would be a combination of 8
things taken 2 at a time.
The number of ways would be

(8 choose 2) = 8!/[2!(8 − 2)!] = 40,320/[(2·1)(6·5·4·3·2·1)] = 40,320/1,440 = 28


b.

If all players are of equal ability, then each of the 28 sample points would be equally
likely. Each would have a probability of occurring of 1/28. There is only one sample
point with values 2 and 7. Thus, the probability of winning with a bet of 2-7 would by
1/28 or .0357.

3.18

a.

Let I = Infiniti 1435, TP = Toyota Prius, and C = Chevrolet Corvette. All possible
rankings are as follows, where the first dealer listed is ranked first, the second dealer
listed is ranked second, and the third dealer listed is ranked third:
I,TP,C    I,C,TP    C,I,TP    C,TP,I    TP,I,C    TP,C,I

b.

If each set of rankings is equally likely, then each has a probability of 1/6.
The probability that the Toyota Prius is ranked first = P(TP,I,C) + P(TP,C, I)
=1/6 + 1/6 = 2/6 = 1/3.
The probability that the Infinity 1435 is ranked third = P(C,TP,I) + P(TP,C, I)
=1/6 + 1/6 = 2/6 = 1/3.
The probability that the Toyota Prius is ranked first and the Chevrolet Corvette is ranked
second = P(TP,C, I) =1/6.

3.20

First, we need to compute the total number of ways we can select 2 bullets (pair) from 1,837
bullets. This is a combination of 1,837 things taken 2 at a time.
The number of pairs is:

(1,837 choose 2) = 1,837!/[2!(1,837 − 2)!] = (1,837 · 1,836)/(2 · 1) = 1,686,366
The probability of a false positive is the number of false positives divided by the number
of pairs and is:
P(false positive) = # false positives / # pairs = 693 / 1,686,366 = .0004

This probability is very small. There would be only about 4 false positives out of every
10,000. I would have confidence in the FBI's forensic evidence.


3.22

a.

P(Bc) = 1 − P(B) = 1 − .7 = .3

b.

P(Ac) = 1 − P(A) = 1 − .4 = .6

c.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = .4 + .7 − .3 = .8

3.24

The experiment consists of rolling a pair of fair dice. The sample points are:
1, 1   1, 2   1, 3   1, 4   1, 5   1, 6
2, 1   2, 2   2, 3   2, 4   2, 5   2, 6
3, 1   3, 2   3, 3   3, 4   3, 5   3, 6
4, 1   4, 2   4, 3   4, 4   4, 5   4, 6
5, 1   5, 2   5, 3   5, 4   5, 5   5, 6
6, 1   6, 2   6, 3   6, 4   6, 5   6, 6

Since each die is fair, each sample point is equally likely. The probability of each sample point
is 1/36.
a.

A: {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
B: {(1, 4), (2, 4), (3, 4), (4, 4), (5, 4), (6, 4), (4, 1), (4, 2), (4, 3), (4, 5), (4, 6)}
A ∩ B: {(3, 4), (4, 3)}
A ∪ B: {(1, 4), (2, 4), (3, 4), (4, 4), (5, 4), (6, 4), (4, 1), (4, 2), (4, 3), (4, 5),
(4, 6), (1, 6), (2, 5), (5, 2), (6, 1)}
Ac: {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 1), (2, 2), (2, 3), (2, 4), (2, 6), (3, 1),
(3, 2), (3, 3), (3, 5), (3, 6), (4, 1), (4, 2), (4, 4), (4, 5), (4, 6), (5, 1), (5, 3),
(5, 4), (5, 5), (5, 6), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}

b.

P(A) = 6(1/36) = 6/36 = 1/6
P(B) = 11(1/36) = 11/36
P(A ∩ B) = 2(1/36) = 2/36 = 1/18
P(A ∪ B) = 15(1/36) = 15/36 = 5/12
P(Ac) = 30(1/36) = 30/36 = 5/6

c.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 1/6 + 11/36 − 1/18 = (6 + 11 − 2)/36 = 15/36 = 5/12

d.

A and B are not mutually exclusive. To be mutually exclusive, P(A ∩ B) must be 0. Here,
P(A ∩ B) = 1/18.


3.26


a.

P(Ac) = P(E3) + P(E6) = .2 + .3 = .5

b.

P(Bc) = P(E1) + P(E7) = .10 + .06 = .16

c.

P(Ac ∩ B) = P(E3) + P(E6) = .2 + .3 = .5

d.

P(A ∪ B) = P(E1) + P(E2) + P(E3) + P(E4) + P(E5) + P(E6) + P(E7)


= .10 + .05 + .20 + .20 + .06 + .30 + .06 = .97

e.

P(A ∩ B) = P(E2) + P(E4) + P(E5) = .05 + .20 + .06 = .31

f.

P(Ac ∩ Bc) = P(E1) + P(E7) + P(E3) + P(E6) = .10 + .06 + .20 + .30 = .66

g.

No. A and B are mutually exclusive if P(A ∩ B) = 0. Here, P(A ∩ B) = .31.

3.28

a.

The outcome "On" and "High" is A D.

b.

The outcome "Low" or "Medium" is Dc.

3.30

Define the following events:


A: {problems with absenteeism}
T: {problems with turnover}
From the problem, P(A) = .55, P(T) = .41, and P(A ∩ T) = .22.
P(problems with either absenteeism or turnover) = P(A ∪ T) = P(A) + P(T) − P(A ∩ T)
= .55 + .41 − .22 = .74

3.32


a.

The event A B is the event the outcome is black and odd. The event is A B: {11, 13,
15, 17, 29, 31, 33, 35}

b.

The event A B is the event the outcome is black or odd or both. The event A B is {2,
4, 6, 8, 10, 11, 13, 15, 17, 20, 22, 24, 26, 28, 29, 31, 33, 35, 1, 3, 5, 7, 9, 19, 21, 23, 25,
27}


c.

Assuming all events are equally likely, each has a probability of 1/38.
P(A) = 18(1/38) = 18/38 = 9/19
P(B) = 18(1/38) = 18/38 = 9/19
P(A ∩ B) = 8(1/38) = 8/38 = 4/19
P(A ∪ B) = 28(1/38) = 28/38 = 14/19
P(C) = 18(1/38) = 18/38 = 9/19

d.

The event A ∩ B ∩ C is the event the outcome is odd and black and low.
The event A ∩ B ∩ C is {11, 13, 15, 17}.

e.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 9/19 + 9/19 − 4/19 = 14/19

f.

P(A ∩ B ∩ C) = 4(1/38) = 4/38 = 2/19

g.

The event A ∪ B ∪ C is the event the outcome is odd or black or low.
The event A ∪ B ∪ C is:

{1, 2, 3, ... , 29, 31, 33, 35}
or
{All sample points except 00, 0, 30, 32, 34, 36}

h.

P(A ∪ B ∪ C) = 32(1/38) = 32/38 = 16/19

3.34

a.

P ∩ S ∩ A
Products 6 and 7 are contained in this intersection.

b.

P(possess all the desired characteristics) = P(P ∩ S ∩ A) = P(6) + P(7) = 1/10 + 1/10 = 1/5

c.

A ∪ S
P(A ∪ S) = P(2) + P(3) + P(5) + P(6) + P(7) + P(8) + P(9) + P(10)
= 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 = 8/10 = 4/5


d.

P ∩ S
P(P ∩ S) = P(2) + P(6) + P(7) = 1/10 + 1/10 + 1/10 = 3/10

3.36

First, convert the percentages in the table to probabilities by dividing the percent by 100%.
a.

P(A) = .259 + .169 + .115 = .543


P(B) = .003
P(C) = .037 + .078 + .016 + .002 + .047 + .027 = .207
P(D) = .414

b.

P(A ∩ D) = .156 + .094 + .043 = .293

P(A ∪ D) = P(A) + P(D) − P(A ∩ D) = .543 + .414 − .293 = .664

c.

Ac: {The worker is under 40}


Bc: {The worker is 20 or older or is not part-time}
Dc: {The worker is not part-time}

d.

P(Ac) = 1 − P(A) = 1 − .543 = .457
P(Bc) = 1 − P(B) = 1 − .003 = .997
P(Dc) = 1 − P(D) = 1 − .414 = .586

3.38

Define the following events:


A: {Wheelchair user had an injurious fall}
B: {Wheelchair user had all five features installed in the home}
C: {Wheelchair user had no falls}
D: {Wheelchair user had none of the features installed in the home}

a.

P(A) = 48/306 = .157

b.

P(B) = 9/306 = .029

c.

P(C ∩ D) = 89/306 = .291

3.40

There are a total of 6 x 6 x 6 = 216 possible outcomes from throwing 3 fair dice. To help
demonstrate this, suppose the three dice are different colors red, blue and green. When we
roll these dice, we will record the outcome of the red die first, the blue die second, and the
green die third. Thus, there are 6 possible outcomes for the first position, 6 for the second, and
6 for the third. This leads to the 216 possible outcomes.


The Grand Duke argued that the chance of getting a sum of 9 and the chance of getting a sum
of 10 should be the same since the number of partitions for 9 and 10 are the same. These
partitions are:
  9      10
126     136
135     145
144     226
225     235
234     244
333     334

In each case, there are 6 partitions. However, if we take into account the three colors of the
dice, then there are various ways to get each partition. For instance, to get a partition of 126,
we could get 126, 162, 216, 261, 612, and 621 (again, think of the red die first, the blue die
second, and the green die third). However, to get a partition of 333, there is only 1 way. To
get a partition of 144, there are 3 ways: 144, 414, and 441. The numbers of ways to get each
of the above partitions are:
  9     # ways        10     # ways
126        6         136        6
135        6         145        6
144        3         226        3
225        3         235        6
234        6         244        3
333        1         334        3
Total     25         Total     27

Thus, there are a total of 25 ways to get a sum of 9 and 27 ways to get a sum of 10.
The chance of throwing a sum of 9 (25 chances out of 216 possibilities) is less than the
chance of throwing a 10 (27 chances out of 216 possibilities).
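As an optional check of the counting argument above, enumerating the 216 equally likely outcomes confirms 25 ways to total 9 and 27 ways to total 10:

# Enumerate all outcomes of three fair dice and count totals of 9 and 10.
from itertools import product
totals = [sum(roll) for roll in product(range(1, 7), repeat=3)]
print(totals.count(9), totals.count(10))                 # 25 27
print(totals.count(9) / 216, totals.count(10) / 216)     # about .116 versus .125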
3.42

a.

P(A ∩ B) = P(A | B)P(B) = .6(.2) = .12

b.

P(B | A) = P(A ∩ B)/P(A) = .12/.4 = .3

3.44

a.

Since A and B are mutually exclusive events, P(A ∪ B) = P(A) + P(B) = .30 + .55 = .85

b.

Since A and C are mutually exclusive events, P(A ∩ C) = 0

c.

P(A | B) = P(A ∩ B)/P(B) = 0/.55 = 0

d.

Since B and C are mutually exclusive events, P(B ∪ C) = P(B) + P(C) = .55 + .15 = .70

e.

No, B and C cannot be independent events because they are mutually exclusive events.


3.46

a.

If two fair coins are tossed, there are 4 possible outcomes or simple events. They are:
(1) HH

(2) HT

(3) TH

(4) TT

Event A contains the simple events (2), (3), and (4). Event B contains the simple events
(2) and (3).
A Venn diagram of this would be:

(Venn diagram: event B, containing sample points 2 and 3, lies inside event A, which also contains sample point 4)

Since the coins are fair, each of the sample points is equally likely. Each would have
probability 1/4.
b.

P(A) = 3(1/4) = 3/4 = .75
P(B) = 2(1/4) = 2/4 = 1/2 = .5
P(A ∩ B) = P(2) + P(3) = 1/4 + 1/4 = 2/4 = 1/2 = .5

c.

P(A | B) = P(A ∩ B)/P(B) = .5/.5 = 1

P(B | A) = P(A ∩ B)/P(A) = .5/.75 = .667


3.48

The 36 possible outcomes obtained when tossing two dice are listed below:
(1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
(2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
(3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
(4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
(5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
(6, 1) (6, 2) (6, 3) (6, 4) (6, 5) (6, 6)
A: {(1, 2), (1, 4), (1, 6), (2, 1), (2, 3), (2, 5), (3, 2), (3, 4), (3, 6), (4, 1), (4, 3),
(4, 5), (5, 2), (5, 4), (5, 6), (6, 1), (6, 3), (6, 5)}
B: {(3, 6), (4, 5), (5, 4), (5, 6), (6, 3), (6, 5), (6, 6)}
A ∩ B: {(3, 6), (4, 5), (5, 4), (5, 6), (6, 3), (6, 5)}

If A and B are independent, then P(A)P(B) = P(A ∩ B).

P(A) = 18/36 = 1/2     P(B) = 7/36     P(A ∩ B) = 6/36 = 1/6

P(A)P(B) = (1/2)(7/36) = 7/72 ≠ 1/6 = P(A ∩ B). Thus, A and B are not independent.

3.50

Define the following events:


S: {cause of fatal crash is speeding}
C: {cause of fatal crash is missing a curve}
From the problem, we know P(S) = .3 and P(S ∩ C) = .12.

P(C | S) = P(C ∩ S)/P(S) = .12/.3 = .4

3.52

Define the following events:


A: {Winner is from the American League}
B: {Winner is from the National League}
C: {Winner is from the Eastern Division}
D {Winner is from the Central Division}
E: {Winner is from the Western Division}

a.

P(C | A) = P(A ∩ C)/P(A) = (7/15)/(10/15) = 7/10 = .7

b.

P(B | D) = P(B ∩ D)/P(D) = (1/15)/(3/15) = 1/3 = .333

c.

P(D ∪ E | B) = P((D ∪ E) ∩ B)/P(B) = (2/15)/(5/15) = 2/5 = .4

3.54

Define the following events:


A: {electrical switch monitors quality of power}
B: {electrical switch not wired properly}
From the problem, P(A) = .90 and P(B | A) = .90.
P(A ∩ B) = P(B | A)P(A) = .90(.90) = .81.

3.56

Define the following events:

Ai : {ith CEO has bachelors degree}


a.

P(A1) = 8/40 = .20

b.

If the first 4 CEOs picked have just a bachelor's degree, then on the next pick there are only
4 such CEOs left to choose from. Similarly, after picking 4 CEOs, there are only 36 CEOs
left to choose from.

P(A5 | A1 ∩ A2 ∩ A3 ∩ A4) = 4/36 = .111

3.58

If A and B are independent, then P(A ∩ B) = P(A)P(B). For this Exercise,

P(A) = (1385 + 786)/3934 = 2171/3934 = .552,   P(B) = (1385 + 1175)/3934 = 2560/3934 = .651,   and

P(A ∩ B) = 1385/3934 = .352.

P(A)P(B) = .552(.651) = .359 ≠ .352 = P(A ∩ B). Thus, A and B are not independent.


3.60


The probability of a false positive is P(A | B).


3.62

First, define the following event:


A: {CVSA correctly determines the veracity of a suspect} P(A) = .98 (from claim)

a.

The event that the CVSA is correct for all four suspects is the event A ∩ A ∩ A ∩ A.
P(A ∩ A ∩ A ∩ A) = .98(.98)(.98)(.98) = .9224

b.

The event that the CVSA is incorrect for at least one of the four suspects is the event
(A ∩ A ∩ A ∩ A)c. P[(A ∩ A ∩ A ∩ A)c] = 1 − P(A ∩ A ∩ A ∩ A) = 1 − .9224 = .0776

3.64

Define the following events:


I: {Leak ignites immediately (jet fire)}
D: {Leak has delayed ignition (flash fire)}
From the problem, P(I) = .01 and P(D | Ic) = .01
The probability of a jet fire or a flash fire = P(I ∪ D) = P(I) + P(D) − P(I ∩ D)
= P(I) + P(D | Ic)P(Ic) − P(I ∩ D) = .01 + .01(1 − .01) − 0 = .01 + .0099 = .0199
A tree diagram of this problem is:

(Tree diagram: the first branch is I with probability .01; the second branch is Ic with probability .99, which splits into D with probability .01, giving P(Ic ∩ D) = .99(.01) = .0099, and Dc with probability .99, giving P(Ic ∩ Dc) = .99(.99) = .9801)

3.66

a.

Define the following events:


W: {Player wins the game Go}
F: {Player plays first (black stones)}

P(W F) = 319/577 = .553


b.

P(W FCA) = 34/34 = 1


P(W FCB) = 69/79 = .873
P(W FCC) = 66/118 = .559
P(W FBA) = 40/54 = .741
P(W FBB) = 52/95 = .547
P(W FBC) = 27/79 = .342
P(W FAA) = 15/28 = .536
P(W FAB) = 11/51 = .216
P(W FAC) = 3/39 = .077

c.

There are three combinations where the player with the black stones (first) is ranked
higher than the player with the white stones: CA, CB, and BA.
P(W FCA CB BA) = (34 + 69 + 40)/(34 + 79 + 54) = 143/167 = .856

d.

There are three combinations where the players are of the same level: CC, BB, and AA.
P(W FCC BB AA) = (66 + 52 + 15)/(118 + 95 + 28) = 133/241 = .552

3.68

a.

Suppose the elements of the population are:


1, 2, 3, 4, 5, 6, 7, 8, 9, and 10.
The possible samples of size 2 are:
(1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8) (1, 9) (1, 10)
(2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8) (2, 9) (2, 10)
(3, 4) (3, 5) (3, 6) (3, 7) (3, 8) (3, 9) (3, 10)
(4, 5) (4, 6) (4, 7) (4, 8) (4, 9) (4, 10)
(5, 6) (5, 7) (5, 8) (5, 9) (5, 10)
(6, 7) (6, 8) (6, 9) (6, 10)
(7, 8) (7, 9) (7, 10)
(8, 9) (8, 10)
(9, 10)
Since there are N = 10 elements in the population, the number of samples of size n = 2 is a
combination of 10 things taken 2 at a time or
(10 choose 2) = 10!/(2!8!) = (10·9·8·7·6·5·4·3·2·1)/[(2·1)(8·7·6·5·4·3·2·1)] = 45
Therefore, there are 45 different samples of size n = 2 that can be selected from a
population of N = 10.

b.

68

If random sampling is employed, every pair of elements has an equal probability of being
selected. Therefore, the probability of drawing a particular pair is 1/45.


c.

To draw a random sample of 2 elements from 10, we will number the elements from 0 to
9. Then, starting in an arbitrary position in Table I, Appendix B, we will select two
numbers by going either down a column or across a row. Suppose that we start in the
third position of column 6 and row 9. We will proceed down the column. The first
sample drawn will be 1 and 5. The second sample drawn will be 9 and 4. The 20 samples
selected are:
Sample Number   Items Selected      Sample Number   Items Selected
      1              1, 5                 11             0, 9
      2              9, 4                 12             1, 0
      3              4, 2                 13             3, 7
      4              9, 3                 14             3, 9
      5              8, 1                 15             0, 8
      6              5, 6                 16             3, 4
      7              1, 3                 17             0, 4
      8              0, 2                 18             9, 7
      9              4, 6                 19             8, 4
     10              8, 0                 20             0, 5

There are actually two pairs of samples that match: Samples 10 and 15, and samples 4
and 14. Given the low probability of each pair occurring, it is not that likely to have two
pairs of samples that match.
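As an optional aside, a random number generator can play the role of the random number table; the sketch below draws 20 samples of size n = 2 from the digits 0 through 9 (the seed is an arbitrary choice for reproducibility, not something specified in the exercise).

import random
random.seed(1)
samples = [tuple(sorted(random.sample(range(10), 2))) for _ in range(20)]
print(samples)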
3.70

First, number the elements of the population from 1 to 200,000. Starting in row 10, column 1,
of Table I of Appendix B and reading down, take the first ten 6-digit numbers. Eliminate any
duplicates, the number 000000, and all numbers greater than 200,000.
The 10 numbers selected for the random sample are:
094299
103656
071199
023682
010115
070569
024883
007425
053660
005820
Elements with the above numbers are selected for the sample.

3.72

To draw a random sample of 1,000 households from 534,322, we will number the households
from 1 to 534,322. Then, starting in an arbitrary position in Table I, Appendix B, we will select
6-digit numbers by proceeding down a column. We will continue selecting numbers until we have
1,000 different 6-digit numbers, eliminating 000000 and any numbers between 534,323 and
999,999.


3.74

a.

Give each stock in the NYSE-Composite Transactions table of the Wall Street Journal a
number (1 to m). Using Table I of Appendix B, pick a starting point and read down using
the same number of digits as in m until you have n different numbers between 1 and m,
inclusive.

3.76

a.

P(B1 ∩ A) = P(A | B1)P(B1) = .3(.75) = .225

b.

P(B2 ∩ A) = P(A | B2)P(B2) = .5(.25) = .125

c.

P(A) = P(B1 ∩ A) + P(B2 ∩ A) = .225 + .125 = .35

d.

P(B1 | A) = P(B1 ∩ A)/P(A) = .225/.35 = .643

e.

P(B2 | A) = P(B2 ∩ A)/P(A) = .125/.35 = .357

3.78

If A is independent of B1, B2, and B3, then P(A | B1) = P(A) = .4.

Then P(B1 | A) = P(A | B1)P(B1)/P(A) = .4(.2)/.4 = .2

3.80

a.

P(E1 | error) = P(E1 ∩ error)/P(error)
= P(error | E1)P(E1) / [P(error | E1)P(E1) + P(error | E2)P(E2) + P(error | E3)P(E3)]
= .01(.30) / [.01(.30) + .03(.20) + .02(.50)] = .003/(.003 + .006 + .01) = .003/.019 = .158

b.

P(E2 | error) = P(E2 ∩ error)/P(error)
= P(error | E2)P(E2) / [P(error | E1)P(E1) + P(error | E2)P(E2) + P(error | E3)P(E3)]
= .03(.20) / [.01(.30) + .03(.20) + .02(.50)] = .006/(.003 + .006 + .01) = .006/.019 = .316

c.

P(E3 | error) = P(E3 ∩ error)/P(error)
= P(error | E3)P(E3) / [P(error | E1)P(E1) + P(error | E2)P(E2) + P(error | E3)P(E3)]
= .02(.50) / [.01(.30) + .03(.20) + .02(.50)] = .01/(.003 + .006 + .01) = .01/.019 = .526

d.

If there was a serious error, the probability that the error was made by engineer 3 is .526.
This probability is higher than for any of the other engineers. Thus engineer #3 is most
likely responsible for the error.
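As an optional check of the Bayes' Rule calculations in parts a through c, using the prior probabilities P(Ei) and conditional error rates P(error | Ei) given in the exercise:

# Posterior probability of each engineer given that a serious error occurred.
priors = {"E1": 0.30, "E2": 0.20, "E3": 0.50}
error_rates = {"E1": 0.01, "E2": 0.03, "E3": 0.02}
p_error = sum(error_rates[e] * priors[e] for e in priors)              # .019
posteriors = {e: error_rates[e] * priors[e] / p_error for e in priors}
print(round(p_error, 3), {e: round(p, 3) for e, p in posteriors.items()})
# E1 -> .158, E2 -> .316, E3 -> .526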

3.82

Define the following events:


D: {Defect in steel casting}
H: {NDE detects Hit or defect in steel casting}
From the problem, P(H | D) = .97, P(H | Dc) = .005, and P(D) = .01.
P(H) = P(H | D)P(D) + P(H | Dc)P(Dc) = .97(.01) + .005(.99) = .0097 + .00495 = .01465
P(D | H) = P(D ∩ H)/P(H) = P(H | D)P(D)/P(H) = .97(.01)/.01465 = .0097/.01465 = .6621

3.84

Define the following events:


A: {Alarm A sounds alarm}
B: {Alarm B sounds alarm}
I: {Intruder}
From the problem:
P(A | I ) = .9
P(B | I ) = .95
P(A | Ic ) = .2
P(B | Ic ) = .1
P( I ) = .4
Since the two systems are operating independently of each other,
P(A ∩ B | I) = P(A | I)P(B | I) = .9(.95) = .855
P(A ∩ B ∩ I) = P(A ∩ B | I)P(I) = .855(.4) = .342
P(A ∩ B | Ic) = P(A | Ic)P(B | Ic) = .2(.1) = .02


P(A ∩ B ∩ Ic) = P(A ∩ B | Ic)P(Ic) = .02(.6) = .012

Thus, P(A ∩ B) = P(A ∩ B ∩ I) + P(A ∩ B ∩ Ic) = .342 + .012 = .354

Finally, P(I | A ∩ B) = P(A ∩ B ∩ I)/P(A ∩ B) = .342/.354 = .966
3.86

a.

The two probability rules for a sample space are that the probability for any sample point
is between 0 and 1 and that the sum of the probabilities of all the sample points is 1.
For this Exercise, all the probabilities of the sample points are between 0 and 1, and

Σ P(Si) = P(S1) + P(S2) + P(S3) + P(S4) = .2 + .1 + .3 + .4 = 1.0   (summing over i = 1 to 4)

b.

P( A) = P( S1 ) + P( S4 ) = .2 + .4 = .6

3.88

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = .7 + .5 − .4 = .8

3.90

a.

If the Dow Jones Industrial Average increases, a large New York bank would tend to
decrease the prime interest rate. Therefore, the two events are not mutually exclusive since
they could occur simultaneously.

b.

The next sale by a PC retailer could not be both a laptop and a desktop computer. Since
the two events cannot occur simultaneously, the events are mutually exclusive.

c.

Since both events cannot occur simultaneously, the events are mutually exclusive.

3.92

a.

Because events A and B are independent, we have:

P(A ∩ B) = P(A)P(B) = (.3)(.1) = .03

Thus, P(A ∩ B) ≠ 0, and the two events cannot be mutually exclusive.

b.

P(A | B) = P(A ∩ B)/P(B) = .03/.1 = .3

P(B | A) = P(A ∩ B)/P(A) = .03/.3 = .1

c.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = .3 + .1 − .03 = .37

3.94

Mutually exclusive events are also dependent events since the assumption that one event occurs
alters the probability of the occurrence of the other one. If we assume that one event has
occurred, it is impossible for the other one to occur simultaneously since they are mutually
exclusive. In other words, if A and B are mutually exclusive, P(A ∩ B) = 0. Then
P(A | B) = P(A ∩ B)/P(B) = 0/P(B) = 0. Since P(A) ≠ 0, P(A | B) ≠ P(A), so A and B are dependent.


3.96

Define the following events:


C: {Public school building has inadequate plumbing}
D: {Public school has plans for repairing building}
From the problem, we know P(C) = .25 and P(D|C) = .38.
P(C ∩ D) = P(D | C)P(C) = .38(.25) = .095

3.98

a.

The event {The manager was involved in the ISO 9000 registration} contains the sample
points {The manager was very involved}, {The manager had moderate involvement}, and
{The manager had minimal involvement}. Thus, P(A) is:
P(A) = 9/40 + 16/40 + 12/40 = 37/40 = .925

b.

The event {The length of time to achieve ISO 9000 registration was more than 2 years}
contains the sample points {The length of time to achieve ISO 9000 registration was
between 2.1 and 2.5 years} and {The length of time to achieve ISO 9000 registration was
greater than 2.5 years}. Thus, P(B) is:

P(B) = 2/40 + 3/40 = 5/40 = .125

c.

We cannot determine if events A and B are independent from the data given because there
is no way of finding the P(A B). In order to find P(A B), the 40 individuals would
have to be classified on both variables at the same time. In the data provided, the
individuals are first classified on the first variable and then classified on the second
variable.

3.100

a.

The experiment consists of selecting 159 employees and asking each to indicate how
strongly he/she agreed or disagreed with the statement "I believe that management is
committed to CQI." There are five sample points: "Strongly agree," "Agree," "Neither
agree nor disagree," "Disagree," and "Strongly disagree."

b.

Since we have frequencies for each of the sample points, good estimates of the
probabilities are the relative frequencies. To find the relative frequencies, divide all of the
frequencies by the sample size of 159. The estimates of the probabilities are:

Strongly Agree   Agree   Neither Agree Nor Disagree   Disagree   Strongly Disagree
     .189         .403             .258                 .113           .038

c.

The probability that an employee agrees or strongly agrees with the statement is
.189 + .403 = .592.


d.

The probability that an employee does not strongly agree with the statement is equal to
the sum of all the probabilities except that for "strongly agree" = .403 + .258 + .113 +
.038 = .812.

3.102

a.

There are a total of 9 × 2 = 18 sample points for this experiment. There are 9 sources of
CO poisoning, and each source of poisoning has 2 possible outcomes, fatal or nonfatal.
Suppose we introduce some notation to make it easier to write down the sample points.
Let FI = Fire, AU = Auto exhaust, FU = Furnace, K = Kerosene or spaceheater,
AP = Appliance, OG = Other gas-powered motors, FP = Fireplace, O = Other, and
U = Unknown. Also, let F = Fatal and N = Nonfatal. The 18 sample points are:
FI, F
FI, N

AU, F
AU, N

FU, F
FU, N

K, F
K, N

AP, F
AP, N

OG, F
OG, N

FP, F
FP, N

O, F
O, N

b.

The set of all sample points is called the sample space.

c.

The event A is made up of the following sample points: FI, F and FI, N

U, F
U, N

Then, P(A) = P(FI, F) + P(FI, N) = 63/981 + 53/981 = 116/981 = .118


d.

The event B is made up of the following sample points:


(FI, F); (AU, F); (FU, F); (K, F); (AP, F); (OG, F); (FP, F); (O, F); (U, F)
Then, P(B) = P(FI, F) + P(AU, F) + P(FU, F) + P(K, F) + P(AP, F)
+ P(OG, F) + P(FP, F) + P(O, F) + P(U, F)
= 63/981 + 60/981 + 18/981 + 9/981 + 9/981 + 3/981 + 0/981 + 3/981
+ 9/981
= 174/981 = .177

e.

The event C is made up of the following sample points: (AU, F) and (AU, N)
Then, P(C) = P(AU, F) + P(AU, N) = 60/981 + 178/981 = 238/981 = .243

f.

The event D is made up of the following sample point: AU, F


Then, P(D) = P(AU, F) = 60/981 = .061

g.

The event E is made up of the following sample point: FI, N


Then, P(E) = P(FI, N) = 53/981 = .054

3.104

Since there are 11 individuals who are willing to serve on the panel, the number of different
panels of 5 experts is a combination of 11 things taken 5 at a time or
(11 choose 5) = 11!/(5!6!) = (11·10·9·8·7·6·5·4·3·2·1)/[(5·4·3·2·1)(6·5·4·3·2·1)] = 462


3.106

The possible ways of ranking the blades are:


GSW   GWS   SGW   SWG   WGS   WSG

If the consumer had no preference but still ranked the blades, then the 6 possibilities are equally
likely. Therefore, each of the 6 possibilities has a probability of 1/6 of occurring.

a.

P(Ranks G first) = P(GSW) + P(GWS) = 1/6 + 1/6 = 2/6 = 1/3

b.

P(Ranks G last) = P(SWG) + P(WSG) = 1/6 + 1/6 = 2/6 = 1/3

c.

P(ranks G last and W second) = P(SWG) = 1/6

d.

P(WGS) = 1/6

3.108

a.

Consecutive tosses of a coin are independent events since what occurs one time would not
affect the next outcome.

b.

If the individuals are randomly selected, then what one individual says should not affect
what the next person says. They are independent events.

c.

The results in two consecutive at-bats are probably not independent. The player may have
faced the same pitcher both times which may affect the outcome.

d.

The amount of gain and loss for two different stocks bought and sold on the same day are
probably not independent. The market might be way up or down on a certain day so that
all stocks are affected.

e.

The amount of gain or loss for two different stocks that are bought and sold in different
time periods are independent. What happens to one stock should not affect what happens
to the other.

f.

The prices bid by two different development firms in response to the same building
construction proposal would probably not be independent. The same variables would be
present for both firms to consider in their bids (materials, labor, etc.).


3.110

a.

We will define the following events:


A:{The first activation device works properly; i.e., activates the sprinkler when it
should}
B:{The second activation device works properly}
From the statement of the problem, we know
P(A) = .91 and P(B) = .87
Furthermore, since the activation devices work independently, we conclude that
P(A ∩ B) = P(A)P(B) = (.91)(.87) = .7917
Now, if a fire starts near a sprinkler head, the sprinkler will be activated if either the first
activation device or the second activation device, or both, operates properly. Thus,
P(Sprinkler head will be activated) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= .91 + .87 − .7917 = .9883

b.

The event that the sprinkler head will not be activated is the complement of the event that
the sprinkler will be activated. Thus,
P(Sprinkler head will not be activated) = 1 − P(Sprinkler head will be activated)
= 1 − .9883 = .0117

c.

From part a, P(A ∩ B) = P(A)P(B) = .7917

d.

In terms of the events we have defined, we wish to determine


P(A ∩ Bc) = P(A)P(Bc) (by independence) = .91(1 − .87) = .91(.13) = .1183

3.112

Define the following events:


S: {System shuts down}
F1: {Hardware failure}
F2: {Software failure}
F3: {Power failure}
From the Exercise, we know:
P(F1) = .01, P(F2) = .05, and P(F3) = .02. Also, P(S|F1) = .73, P(S|F2) = .12, and P(S|F3) = .88.


The probability that the current shutdown is due to a hardware failure is:

P(F1 | S) = P(F1 ∩ S)/P(S) = P(S | F1)P(F1) / [P(S | F1)P(F1) + P(S | F2)P(F2) + P(S | F3)P(F3)]
= .73(.01) / [.73(.01) + .12(.05) + .88(.02)] = .0073/(.0073 + .006 + .0176) = .0073/.0309 = .2362

The probability that the current shutdown is due to a software failure is:

P(F2 | S) = P(F2 ∩ S)/P(S) = P(S | F2)P(F2) / [P(S | F1)P(F1) + P(S | F2)P(F2) + P(S | F3)P(F3)]
= .12(.05) / [.73(.01) + .12(.05) + .88(.02)] = .006/(.0073 + .006 + .0176) = .006/.0309 = .1942

The probability that the current shutdown is due to a power failure is:

P(F3 | S) = P(F3 ∩ S)/P(S) = P(S | F3)P(F3) / [P(S | F1)P(F1) + P(S | F2)P(F2) + P(S | F3)P(F3)]
= .88(.02) / [.73(.01) + .12(.05) + .88(.02)] = .0176/(.0073 + .006 + .0176) = .0176/.0309 = .5696

3.114

Define the following events:


C: {Committee judges joint acceptable}
I: {Inspector judges joint acceptable}
The sample points of this experiment are:
C ∩ I     C ∩ Ic     Cc ∩ I     Cc ∩ Ic
a.

The probability the inspector judges the joint to be acceptable is:


P(I) = P(C ∩ I) + P(Cc ∩ I) = 101/153 + 23/153 = 124/153 = .810

The probability the committee judges the joint to be acceptable is:

P(C) = P(C ∩ I) + P(C ∩ Ic) = 101/153 + 10/153 = 111/153 = .725


b.

The probability that both the committee and the inspector judge the joint to be acceptable
is:
P(C ∩ I) = 101/153 = .660

The probability that neither judges the joint to be acceptable is:

P(Cc ∩ Ic) = 19/153 = .124

c.

The probability the inspector and committee disagree is:


P(C ∩ Ic) + P(Cc ∩ I) = 10/153 + 23/153 = 33/153 = .216

The probability the inspector and committee agree is:

P(C ∩ I) + P(Cc ∩ Ic) = 101/153 + 19/153 = 120/153 = .784

3.116

a.

Define the following events:


A1: {Component 1 works properly}
A2: {Component 2 works properly}
B3: {Component 3 works properly}
B4: {Component 4 works properly}
A:  {Subsystem A works properly}
B:  {Subsystem B works properly}

The probability a component fails is .1, so the probability a component works properly is
1 − .1 = .9.
Subsystem A works properly if both components 1 and 2 work properly.
P(A) = P(A1 ∩ A2) = P(A1)P(A2) = .9(.9) = .81
(since the components operate independently)
Similarly, P(B) = P(B3 ∩ B4) = P(B3)P(B4) = .9(.9) = .81
The system operates properly if either subsystem A or B operates properly.

The probability the system operates properly is:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = P(A) + P(B) − P(A)P(B)
= .81 + .81 − .81(.81) = .9639


b.

The probability exactly one subsystem fails is:


P(A ∩ Bc) + P(Ac ∩ B) = P(A)P(Bc) + P(Ac)P(B)
= .81(1 − .81) + (1 − .81)(.81) = .1539 + .1539 = .3078

c.

The probability the system fails is the probability that both subsystems fail, or:
P(Ac ∩ Bc) = P(Ac)P(Bc) = (1 − .81)(1 − .81) = .0361

d.

The system operates correctly 99% of the time means it fails 1% of the time. The
probability one subsystem fails is .19. The probability that n independent subsystems all
fail is .19^n. Thus, we must find n such that
.19^n ≤ .01
Thus, n = 3.
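As an optional check of part d, a short loop finds the smallest n for which the probability that all n independent subsystems fail, .19^n, drops to .01 or below:

p_fail = 0.19
n = 1
while p_fail ** n > 0.01:
    n += 1
print(n, p_fail ** n)   # 3 0.006859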

3.118

Define the events:


A: {A bottle comes from machine A}
B: {A bottle comes from machine B}
R: {A bottle is rejected}.
Then the given probabilities are:
P(A) = .75,   P(B) = .25,   P(R | A) = 1/20,   P(R | B) = 1/30

The proportion of rejected bottles is:


P(R) = P(A ∩ R) + P(B ∩ R) = P(R | A)P(A) + P(R | B)P(B)
= (1/20)(.75) + (1/30)(.25) = .0375 + .0083 = .0458
The probability that a bottle comes from machine A, given that it is accepted is:
P(A | Rc) = P(A ∩ Rc)/P(Rc) = P(Rc | A)P(A)/[1 − P(R)] = (19/20)(.75)/(1 − .0458) = .7125/.9542 = .7467


3.120

There are a total of 6 × 6 = 36 outcomes when rolling 2 dice. If we let the first number in the
pair represent the outcome of die number 1 and the second number in the pair represent the
outcome of die number 2, then the possible outcomes are:

1,1   1,2   1,3   1,4   1,5   1,6
2,1   2,2   2,3   2,4   2,5   2,6
3,1   3,2   3,3   3,4   3,5   3,6
4,1   4,2   4,3   4,4   4,5   4,6
5,1   5,2   5,3   5,4   5,5   5,6
6,1   6,2   6,3   6,4   6,5   6,6

If both dice are fair, then each of these outcomes is equally likely and has a probability of
1/36.
a.

To win on the first roll, a player must roll a 7 or 11. There are 6 ways to roll a 7 and 2
ways to roll an 11. Thus the probability of winning on the first roll is:
P(7 or 11) = 8/36 = .2222

b.

To lose on the first roll, a player must roll a 2 or 3. There is 1 way to roll a 2 and 2 ways
to roll a 3. Thus the probability of losing on the first roll is:

P(2 or 3) = 3/36 = .0833

c.

If a player rolls a 4 on the first roll, the game will end on the next roll if the player rolls
the original roll again (player wins) or if the player rolls a seven (player loses). Now,
there are 3 ways of getting a 4 on the first roll: 1,3, 2,2, or 3,1.
If the first roll was 2,2, then the game would end on the next roll if the player threw a 2,2,
1,6, 2,5, 3,4, 4,3, 5,2, or 6,1 on the next roll. The probability of the game ending on
the next roll would be:
P(2,2 or 7 on second toss | 2,2 on first) = 7/36 = .1944

Now, suppose the first roll ended with a 1 and a 3. Since the dice are not marked, this
result could have happened two ways: 1, 3 or 3,1. Regardless of how the original 1 and 3
were obtained, the player would have 2 ways of winning on the next roll: 1,3 or 3,1. For
the game to end on the next roll, the player could throw 1,3, 3,1, 1,6, 2,5, 3,4, 4,3, 5,2,
or 6,1. The probability of the game ending on the next roll would be:
P(1,3 or 3,1 or 7 on second toss | 1 and 3 on first) = 8/36 = .2222

Since there were 3 ways to get a 4 on the first roll, and each were equally likely,
P(2,2) = 1/3 and P[1 and 3 (any order)] = 2/3.


The probability that the game ends on the second roll is

P(2,2 or 7 on second toss | 2,2 on first)P(2,2 on first)
+ P(1,3 or 3,1 or 7 on second toss | 1 and 3 on first)P(1 and 3 on first)
= (1/3)(.1944) + (2/3)(.2222) = .0648 + .1481 = .2129
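As an optional check of parts a and b, enumerating the 36 equally likely outcomes for two fair dice reproduces the first-roll probabilities:

from itertools import product
rolls = [a + b for a, b in product(range(1, 7), repeat=2)]
print(sum(r in (7, 11) for r in rolls) / 36)   # 8/36 = .2222
print(sum(r in (2, 3) for r in rolls) / 36)    # 3/36 = .0833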

3.122

Suppose we define the following event:


E: {Error produced when dividing}
From the problem, we know that P(E) = 1 / 9,000,000,000
The probability of no error produced when dividing is P(Ec) = 1 − P(E) = 1 − 1/9,000,000,000
= 8,999,999,999/9,000,000,000 = .999999999 ≈ 1.0000
Suppose we want to find the probability of no errors in 2 divisions (assuming each division is
independent):
P(Ec ∩ Ec) = .999999999(.999999999) ≈ 1.0000
Thus, in general, the probability of no errors in k divisions would be:
P(Ec ∩ Ec ∩ ⋯ ∩ Ec) = P(Ec)^k = [8,999,999,999/9,000,000,000]^k     (k intersections)
Suppose a user ran a program that performed 1 billion divisions. The probability of no errors
in these 1 billion divisions would be:
P(Ec)^1,000,000,000 = [8,999,999,999/9,000,000,000]^1,000,000,000 = .9048
Thus, the probability of at least 1 error in 1 billion divisions would be
1 − P(Ec)^1,000,000,000 = 1 − [8,999,999,999/9,000,000,000]^1,000,000,000 = 1 − .9048 = .0852
For a heavy MINITAB user, this flawed chip would be a problem because the above
probability is not that small.


Chapter 4

Random Variables and Probability Distributions

4.2
a.

The closing price of a particular stock on the New York Stock Exchange is discrete. It
can take on only a countable number of values.

b.

The number of shares of a particular stock that are traded on a particular day is discrete.
It can take on only a countable number of values.

c.

The quarterly earnings of a particular firm is discrete. It can take on only a countable
number of values.

d.

The percentage change in yearly earnings between 2005 and 2006 for a particular firm is
continuous. It can take on any value in an interval.

e.

The number of new products introduced per year by a firm is discrete. It can take on only
a countable number of values.

f.

The time until a pharmaceutical company gains approval from the U.S. Food and Drug
Administration to market a new drug is continuous. It can take on any value in an
interval of time.

4.4

The number of customers, x, waiting in line can take on the values 0, 1, 2, 3, …. Even though the list is never ending, we call this list countable. Thus, the random variable is discrete.

4.6

A banker might be interested in the number of new accounts opened in a month, or the number
of mortgages it currently has, both of which are discrete random variables.

4.8

The manager of a hotel might be concerned with the number of employees on duty at a specific
time, or the number of vacancies there are on a certain night.

4.10

A stockbroker might be interested in the length of time until the stockmarket is closed for the
day.

4.12

a.

The variable x can take on values 1, 3, 5, 7, and 9.

b.

The value of x that has the highest probability associated with it is 5. It has a probability
of .4.

82

Chapter 4

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

c.  Using MINITAB, the probability distribution of x as a graph is: (bar chart of p(x) versus x)

d.  P(x = 7) = .2

e.  P(x ≥ 5) = p(5) + p(7) + p(9) = .4 + .2 + .1 = .7

f.  P(x > 2) = p(3) + p(5) + p(7) + p(9) = .2 + .4 + .2 + .1 = .9

4.14

a.  This is not a valid distribution because Σp(x) = .9 ≠ 1.

b.  This is a valid distribution because 0 ≤ p(x) ≤ 1 for all values of x and Σp(x) = 1.

c.  This is not a valid distribution because p(4) = −.3 < 0.

d.  The sum of the probabilities over all possible values of the random variable is Σp(x) = 1.1 > 1, so this is not a valid probability distribution.

4.16

a.  μ = E(x) = Σxp(x) = 10(.05) + 20(.20) + 30(.30) + 40(.25) + 50(.10) + 60(.10)
       = .5 + 4 + 9 + 10 + 5 + 6 = 34.5

    σ² = E[(x − μ)²] = Σ(x − μ)²p(x)
       = (10 − 34.5)²(.05) + (20 − 34.5)²(.20) + (30 − 34.5)²(.30) + (40 − 34.5)²(.25) + (50 − 34.5)²(.10) + (60 − 34.5)²(.10)
       = 30.0125 + 42.05 + 6.075 + 7.5625 + 24.025 + 65.025 = 174.75

    σ = √174.75 = 13.219
b.

Random Variables and Probability Distributions

83

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

c.  μ ± 2σ → 34.5 ± 2(13.219) → 34.5 ± 26.438 → (8.062, 60.938)

    P(8.062 < x < 60.938) = p(10) + p(20) + p(30) + p(40) + p(50) + p(60)
    = .05 + .20 + .30 + .25 + .10 + .10 = 1.00
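For instructors who want to verify such computations numerically, a minimal Python sketch of the mean/variance arithmetic for this probability distribution is given below (it is a check only, using the pmf stated in Exercise 4.16):

    # Mean, variance, and the two-standard-deviation interval for a discrete pmf.
    x_vals = [10, 20, 30, 40, 50, 60]
    probs  = [.05, .20, .30, .25, .10, .10]

    mu = sum(x * p for x, p in zip(x_vals, probs))                 # 34.5
    var = sum((x - mu) ** 2 * p for x, p in zip(x_vals, probs))    # 174.75
    sigma = var ** 0.5                                             # 13.219

    lo, hi = mu - 2 * sigma, mu + 2 * sigma
    prob = sum(p for x, p in zip(x_vals, probs) if lo < x < hi)    # 1.00
    print(mu, var, round(sigma, 3), prob)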

4.18

a.

It would seem that the mean of both would be 1 since they both are symmetric
distributions centered at 1.

b.

P(x) seems more variable since there appears to be greater probability for the two extreme
values of 0 and 2 than there is in the distribution of y.

c.  For x:  μ = E(x) = Σxp(x) = 0(.3) + 1(.4) + 2(.3) = 0 + .4 + .6 = 1
            σ² = E[(x − μ)²] = Σ(x − μ)²p(x) = (0 − 1)²(.3) + (1 − 1)²(.4) + (2 − 1)²(.3) = .3 + 0 + .3 = .6

    For y:  μ = E(y) = Σyp(y) = 0(.1) + 1(.8) + 2(.1) = 0 + .8 + .2 = 1
            σ² = E[(y − μ)²] = Σ(y − μ)²p(y) = (0 − 1)²(.1) + (1 − 1)²(.8) + (2 − 1)²(.1) = .1 + 0 + .1 = .2

    The variance for x is larger than that for y.
4.20

a.

Yes. Relative frequencies are observed values from a sample. Relative frequencies are
commonly used to estimate unknown probabilities. In addition, relative frequencies have
the same properties as the probabilities in a probability distribution, namely
1. all relative frequencies are greater than or equal to zero
2. the sum of all the relative frequencies is 1

b.

Using MINITAB, the graph of the probability distribution is:



c.

Let x = age of employee. Then P(x > 30) = .13 + .15 + .12 = .40.
P(x > 40) = 0
P(x < 30) = .02 + .04 + .05 + .07 + .04 + .02 + .07 + .02 + .11 + .07 = .51

d.  P(x = 25 or x = 26) = .02 + .07 = .09
4.22

a.  The probability distribution for x is:

    Grill Display Combination    x     p(x)
    1-2-3                        6     35/124 = .282
    1-2-4                        7      8/124 = .065
    1-2-5                        8     42/124 = .339
    2-3-4                        9      4/124 = .032
    2-3-5                       10      1/124 = .008
    2-4-5                       11     34/124 = .274

b.  P(x ≥ 10) = p(10) + p(11) = .008 + .274 = .282

4.24

a.

First, we must find the probability distribution of x. Define the following events:
C: {Chicken is contaminated}
N: {Chicken is not contaminated}
If 3 slaughtered chickens are randomly selected, then the possible outcomes are:
CCC, CCN, CNC, NCC, CNN, NCN, NNC, and NNN

These outcomes are NOT equally likely, since P(C) = 1/100 = .01 and P(N) = 1 − P(C) = 1 − .01 = .99.

P(CCC) = P(C ∩ C ∩ C) = P(C)P(C)P(C) = .01(.01)(.01) = .000001
P(CCN) = P(CNC) = P(NCC) = P(C ∩ C ∩ N) = P(C)P(C)P(N) = .01(.01)(.99) = .000099
P(CNN) = P(NCN) = P(NNC) = P(C ∩ N ∩ N) = P(C)P(N)P(N) = .01(.99)(.99) = .009801
P(NNN) = P(N ∩ N ∩ N) = P(N)P(N)P(N) = .99(.99)(.99) = .970299
The variable x is defined as the number of contaminated chickens in the sample. The value of x for each of the outcomes is:

    Event   x    p(x)
    CCC     3    .000001
    CCN     2    .000099
    CNC     2    .000099
    NCC     2    .000099
    CNN     1    .009801
    NCN     1    .009801
    NNC     1    .009801
    NNN     0    .970299
The probability distribution of x is:

    x    p(x)
    3    .000001
    2    .000297
    1    .029403
    0    .970299

b.  Using MINITAB, the probability graph for x is: (bar chart of p(x) versus x)
c.  P(x ≤ 1) = P(x = 0) + P(x = 1) = .970299 + .029403 = .999702
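A short Python sketch (an illustration only, using the same p = .01 assumption) reproduces this probability distribution both by enumerating the eight outcomes and from the binomial formula:

    # Distribution of the number of contaminated chickens in a sample of 3.
    from itertools import product
    from math import comb

    p_c = 0.01                                   # P(contaminated)
    dist = {}
    for outcome in product("CN", repeat=3):      # CCC, CCN, ..., NNN
        prob = 1.0
        for s in outcome:
            prob *= p_c if s == "C" else 1 - p_c
        x = outcome.count("C")
        dist[x] = dist.get(x, 0.0) + prob

    for x in sorted(dist, reverse=True):
        print(x, round(dist[x], 6))

    # Same distribution from the binomial formula (n = 3, p = .01):
    for x in range(3, -1, -1):
        print(x, round(comb(3, x) * p_c**x * (1 - p_c)**(3 - x), 6))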
4.26


To find the probability distribution of x, we first list the possible values of x. For this exercise,
the possible values of x are 3, 1, and 5. Next, we list the number of cases, f(x), that result in
the particular values of x. To find the probability distribution of x, we divide the number of
cases for each value of x, f(x), by the total number of cases, 678. For x = 3, the probability is
p(3) = 68 / 678 = .100. For x = 1, the probability is p(1) = 71 / 678 = .105. For x = 5, the
probability is p(5) = 539 / 678 = .795. The probability distribution of x is:

    x       f(x)    p(x)
    3        68     .100
    1        71     .105
    5       539     .795
    Total   678    1.000

Using MINITAB, the graph of the probability distribution is:

4.28

a.  E(x) = Σxp(x)

    Firm A: E(x) = 0(.01) + 500(.01) + 1000(.01) + 1500(.02) + 2000(.35) + 2500(.30) + 3000(.25) + 3500(.02) + 4000(.01) + 4500(.01) + 5000(.01)
                 = 0 + 5 + 10 + 30 + 700 + 750 + 750 + 70 + 40 + 45 + 50 = 2450

    Firm B: E(x) = 0(.00) + 200(.01) + 700(.02) + 1200(.02) + 1700(.15) + 2200(.30) + 2700(.30) + 3200(.15) + 3700(.02) + 4200(.02) + 4700(.01)
                 = 0 + 2 + 14 + 24 + 255 + 660 + 810 + 480 + 74 + 84 + 47 = 2450

b.  σ² = Σ(x − μ)²p(x)

    Firm A: σ² = (0 − 2450)²(.01) + (500 − 2450)²(.01) + ⋯ + (5000 − 2450)²(.01)
               = 60,025 + 38,025 + 21,025 + 18,050 + 70,875 + 750 + 75,625 + 22,050 + 24,025 + 42,025 + 65,025
               = 437,500
            σ = √437,500 = 661.44

    Firm B: σ² = (0 − 2450)²(.00) + (200 − 2450)²(.01) + ⋯ + (4700 − 2450)²(.01)
               = 0 + 50,625 + 61,250 + 31,250 + 84,375 + 18,750 + 18,750 + 84,375 + 31,250 + 61,250 + 50,625
               = 492,500
            σ = √492,500 = 701.78

    Firm B faces greater risk of physical damage because it has a higher variance and standard deviation.


4.30

a.

If a large number of measurements are observed, then the relative frequencies should be
very good estimators of the probabilities.

b.  E(x) = Σxp(x) = 1(.01) + 2(.04) + 3(.04) + 4(.08) + 5(.10) + 6(.15) + 7(.25) + 8(.20) + 9(.08) + 10(.05)
         = .01 + .08 + .12 + .32 + .50 + .90 + 1.75 + 1.60 + .72 + .50 = 6.50

    The average number of checkout lanes per store is 6.5.

c.  σ² = Σ(x − μ)²p(x) = (1 − 6.5)²(.01) + (2 − 6.5)²(.04) + (3 − 6.5)²(.04) + (4 − 6.5)²(.08) + (5 − 6.5)²(.10) + (6 − 6.5)²(.15) + (7 − 6.5)²(.25) + (8 − 6.5)²(.20) + (9 − 6.5)²(.08) + (10 − 6.5)²(.05)
       = .3025 + .8100 + .4900 + .5000 + .2250 + .0375 + .0625 + .4500 + .5000 + .6125 = 3.99

    σ = √3.99 = 1.9975

d.  Chebyshev's Rule says that at least 0% of the observations should fall in the interval μ ± σ, and at least 75% of the observations should fall in the interval μ ± 2σ.

e.  μ ± σ → 6.5 ± 1.9975 → (4.5025, 8.4975)
    P(4.5025 ≤ x ≤ 8.4975) = .10 + .15 + .25 + .20 = .70. This is at least 0.

    μ ± 2σ → 6.5 ± 2(1.9975) → 6.5 ± 3.995 → (2.505, 10.495)
    P(2.505 ≤ x ≤ 10.495) = .04 + .08 + .10 + .15 + .25 + .20 + .08 + .05 = .95. This is at least .75, or 75%.

4.32

Let x = winnings in the Florida lottery. The probability distribution for x is:

    x             p(x)
    −$1           22,999,999/23,000,000
    $6,999,999    1/23,000,000

The expected net winnings would be:

    μ = E(x) = (−1)(22,999,999/23,000,000) + 6,999,999(1/23,000,000) = −$.70

The average winnings of all those who play the lottery is −$.70.


4.34

Each point in the system can have one of 2 status levels, free or obstacle. Define the
following events:
AF: {Point A is free}
BF: {Point B is free}
CF: {Point C is free}

AO: {Point A is obstacle}


BO: {Point B is obstacle}
CO: {Point C is obstacle}

Thus, the sample points for the space are:


AFBFCF, AFBFCO, AFBOCF, AFBOCO, AOBFCF, AOBFCO, AOBOCF, AOBOCO
Since it is stated that the probability of any point in the system having a free status is .5, the probability of any point having an obstacle status is also .5. Thus, the probability of each of the sample points above is P(AiBiCi) = .5(.5)(.5) = .125.

The values of Y, the number of free links in the system, for each sample point are listed below. A link is free only if both of its points are free: the link from A to B is free if A is free and B is free, and the link from B to C is free if B is free and C is free.

    Sample point   Probability   Y
    AFBFCF          .125         2
    AFBFCO          .125         1
    AFBOCF          .125         0
    AFBOCO          .125         0
    AOBFCF          .125         1
    AOBFCO          .125         0
    AOBOCF          .125         0
    AOBOCO          .125         0

The probability distribution for Y is:

    Y    Probability
    0    .625
    1    .250
    2    .125


4.36

a.

x is discrete. It can take on only six values.

b.

This is a binomial distribution.

c.  p(0) = [5!/(0!5!)](.7)^0(.3)^5 = (1)(.00243) = .00243
    p(1) = [5!/(1!4!)](.7)^1(.3)^4 = .02835
    p(2) = [5!/(2!3!)](.7)^2(.3)^3 = .1323
    p(3) = [5!/(3!2!)](.7)^3(.3)^2 = .3087
    p(4) = [5!/(4!1!)](.7)^4(.3)^1 = .36015
    p(5) = [5!/(5!0!)](.7)^5(.3)^0 = .16807

d.  μ = np = 5(.7) = 3.5
    σ = √(npq) = √(5(.7)(.3)) = √1.05 = 1.0247

e.  μ ± 2σ → 3.5 ± 2(1.0247) → (1.4506, 5.5494)
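These binomial quantities can be verified with a few lines of Python (standard library only); this is offered as a check, not as part of the printed solution:

    # Binomial pmf, mean, and standard deviation for n = 5, p = .7 (Exercise 4.36).
    from math import comb, sqrt

    n, p = 5, 0.7
    pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}
    for x, prob in pmf.items():
        print(x, round(prob, 5))          # .00243, .02835, .1323, .3087, .36015, .16807

    mu = n * p                            # 3.5
    sigma = sqrt(n * p * (1 - p))         # 1.0247
    print(mu, round(sigma, 4), (round(mu - 2*sigma, 4), round(mu + 2*sigma, 4)))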


4.38

a.  p(0) = [3!/(0!3!)](.3)^0(.7)^3 = (1)(.343) = .343
    p(1) = [3!/(1!2!)](.3)^1(.7)^2 = .441
    p(2) = [3!/(2!1!)](.3)^2(.7)^1 = .189
    p(3) = [3!/(3!0!)](.3)^3(.7)^0 = .027

    x    p(x)
    0    .343
    1    .441
    2    .189
    3    .027

4.40

a.  P(x = 2) = P(x ≤ 2) − P(x ≤ 1) = .167 − .046 = .121 (from Table II, Appendix B)

b.  P(x ≤ 5) = .034

c.  P(x > 1) = 1 − P(x ≤ 1) = 1 − .919 = .081

d.  P(x < 10) = P(x ≤ 9) = 0

e.  P(x ≥ 10) = 1 − P(x ≤ 9) = 1 − .002 = .998

f.  P(x = 2) = P(x ≤ 2) − P(x ≤ 1) = .206 − .069 = .137

4.42

a.

We will check the 5 characteristics of a binomial random variable.
1.  The experiment consists of n = 200 identical trials.
2.  There are only two possible outcomes on each trial. Let S = {young adult owns a mobile phone with internet access} and F = {young adult does not own a mobile phone with internet access}.
3.  The probability of success (S) is the same from trial to trial. For each trial, p = P(S) = .20 and q = 1 − p = 1 − .20 = .80.
4.  The trials are independent.
5.  The binomial random variable x is the number of young adults in 200 trials that own a mobile phone with internet access.

Thus, x is a binomial random variable.


b.

From the exercise, p = .20. For any young adult, the probability that they own a mobile
phone with internet access is .20.

c.  μ = E(x) = np = 200(.20) = 40. On the average, for every 200 young people surveyed, 40 will own mobile phones with internet access.


4.44

a.

We will check the 5 characteristics of a binomial random variable.


1. The experiment consists of n = 5 identical trials. We have to assume that the
number of bottled water brands is large.
2. There are only 2 possible outcomes for each trial. Let S = brand of bottled water
used tap water and F = brand of bottled water did not use tap water.
3. The probability of success (S) is the same from trial to trial. For each trial, p =
P(S) = .25 and q = 1 p = 1 - .25 = .75.
4. The trials are independent.
5. The binomial random variable x is the number of brands in the 5 trials that used tap
water.
If the total number of brands of bottled water is large, then the above
characteristics will be basically true. Thus, x is a binomial random variable.

b.  The formula for the probability distribution of x is p(x) = [5!/(x!(5 − x)!)](.25)^x(.75)^(5−x), for x = 0, 1, 2, 3, 4, 5.

c.  P(x = 2) = [5!/(2!3!)](.25)^2(.75)^3 = .2637

d.  P(x ≤ 1) = P(x = 0) + P(x = 1) = [5!/(0!5!)](.25)^0(.75)^5 + [5!/(1!4!)](.25)^1(.75)^4 = .2373 + .3955 = .6328

4.46

a.

In order for x to be a binomial random variable, the n trials must be identical. We can
assume that the process of selecting of a worker is identical from trial to trial. There are
two possible outcomes - a worker missed work due to a back injury or not. The
probability of success must be the same from trial to trial. We can assume that the
probability of missing work due to a back injury is constant. The trials must be
independent of each other. We can assume that the outcome of one trials will not affect
the outcome of any other. Thus, x is a binomial random variable.

b.

From the information given in the problem, the estimate of p is .40.

c.  The mean is μ = E(x) = np = 10(.40) = 4.
    The standard deviation is σ = √(np(1 − p)) = √(10(.40)(.60)) = √2.4 = 1.549.

d.  Using Table II, Appendix B, with n = 10 and p = .40,
    P(x = 1) = P(x ≤ 1) − P(x ≤ 0) = .046 − .006 = .040
    P(x > 1) = 1 − P(x ≤ 1) = 1 − .046 = .954


4.48

Let x = number of packets observed by a network sensor in 150 trials. Then x has an approximate binomial distribution with n = 150 and p = .001. The virus will be detected if at least 1 packet is observed.

    P(x ≥ 1) = 1 − P(x = 0) = 1 − [150!/(0!150!)](.001)^0(.999)^150 = 1 − .999^150 = 1 − .8606 = .1394

4.50

a.

We must assume that the trials are identical, the probability of success is constant from
trial to trial, and the trials are independent of each other.

b.  From the problem, we estimate p to be .20. Using Table II, Appendix B, with n = 25 and p = .20,
    P(x ≤ 10) = .994

c.  E(x) = np = 25(.20) = 5
    σ = √(np(1 − p)) = √(25(.20)(.80)) = √4 = 2

d.  μ ± 2σ → 5 ± 2(2) → 5 ± 4 → (1, 9)

e.  Using Table II, Appendix B, with n = 25 and p = .20,
    P(1 < x < 9) = P(x ≤ 8) − P(x ≤ 1) = .953 − .027 = .926


4.52

Assuming the supplier's claim is true,

    μ = np = 500(.001) = .5
    σ = √(npq) = √(500(.001)(.999)) = √.4995 = .707

If the supplier's claim is true, we would expect to find only .5 defective switches in a sample of size 500. Therefore, it is not likely we would find 4. Based on the sample, the guarantee is probably inaccurate.

Note: z = (4 − .5)/.707 = 4.95. This is an unusually large z-score.


4.54

a.  For this test, n = 20 and p = .10. Then x is a binomial random variable with n = 20 and p = .10. Using Table II, Appendix B, with n = 20 and p = .10,
    P(x ≤ 1) = .392

b.  For the experiment in part a, the level of confidence is 1 − P(x ≤ 1) = 1 − .392 = .608. Since this value is not close to 1, this would not be an acceptable level.

c.  Suppose we increased n from 20 to 25. Using Table II, Appendix B, with n = 25 and p = .10,
    P(x ≤ 1) = .271. This value is smaller than the value found in part a.
    Now, suppose we keep n = 20, but change K to 0 instead of 1. Using Table II, Appendix B, with n = 20 and p = .10,
    P(x ≤ 0) = .122. This value is, again, smaller than the value found in part a.
d.  Suppose we let K = 0. We need to find n such that the level of confidence is at least .95, which means that P(x = 0) ≤ .05.

    P(x = 0) = [n!/(0!n!)](.1)^0(.9)^(n−0) ≤ .05
    .9^n ≤ .05
    ln(.9^n) ≤ ln(.05)
    n ln(.9) ≤ ln(.05)
    n ≥ ln(.05)/ln(.9) = −2.99573/−.10536 = 28.4

    Thus, if K = 0, then we need a sample size of at least 29 to get a level of confidence of at least .95.

    Now, suppose K = 1. We need to find n such that the level of confidence is at least .95, which means that P(x ≤ 1) ≤ .05.

    P(x ≤ 1) = P(x = 0) + P(x = 1) = [n!/(0!n!)](.1)^0(.9)^(n−0) + [n!/(1!(n−1)!)](.1)^1(.9)^(n−1) ≤ .05
    .9^n + n(.1)(.9)^(n−1) ≤ .05
    .9^(n−1)(.9 + .1n) ≤ .05

    From here, we will use trial and error.

    n      .9^(n−1)(.9 + .1n)
    30     .9^29(.9 + .1(30)) = .1837
    40     .9^39(.9 + .1(40)) = .0805
    45     .9^44(.9 + .1(45)) = .0524
    46     .9^45(.9 + .1(46)) = .0480

    Thus, for K = 1, we would need a sample size of 46 to get a level of confidence of at least .95.
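The trial-and-error search above is easy to automate. A minimal Python sketch (assuming p = .10 and the same acceptance rule, K defects or fewer) is:

    # Smallest n for which P(x <= K) <= .05 when p = .10.
    from math import comb

    def prob_at_most(K, n, p=0.10):
        return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(K + 1))

    for K in (0, 1):
        n = 1
        while prob_at_most(K, n) > 0.05:
            n += 1
        print(K, n, round(prob_at_most(K, n), 4))   # K = 0 -> n = 29, K = 1 -> n = 46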
4.56

λ = 1.5. Using Table III of Appendix B:

a.  P(x ≤ 3) = .934

b.  P(x ≥ 3) = 1 − P(x ≤ 2) = 1 − .809 = .191

c.  P(x = 3) = P(x ≤ 3) − P(x ≤ 2) = .934 − .809 = .125

d.  P(x = 0) = .223

e.  P(x > 0) = 1 − P(x = 0) = 1 − .223 = .777

f.  P(x > 6) = 1 − P(x ≤ 6) = 1 − .999 = .001

4.58

a.

To graph the Poisson probability distribution with λ = 5, we need to calculate p(x) for x = 0 to 15. Using Table III, Appendix B,
p(0) = .007
p(1) = P(x ≤ 1) − P(x ≤ 0) = .040 − .007 = .033
p(2) = P(x ≤ 2) − P(x ≤ 1) = .125 − .040 = .085
p(3) = P(x ≤ 3) − P(x ≤ 2) = .265 − .125 = .140
p(4) = P(x ≤ 4) − P(x ≤ 3) = .440 − .265 = .175
p(5) = P(x ≤ 5) − P(x ≤ 4) = .616 − .440 = .176
p(6) = P(x ≤ 6) − P(x ≤ 5) = .762 − .616 = .146
p(7) = P(x ≤ 7) − P(x ≤ 6) = .867 − .762 = .105
p(8) = P(x ≤ 8) − P(x ≤ 7) = .932 − .867 = .065
p(9) = P(x ≤ 9) − P(x ≤ 8) = .968 − .932 = .036
p(10) = P(x ≤ 10) − P(x ≤ 9) = .986 − .968 = .018
p(11) = P(x ≤ 11) − P(x ≤ 10) = .995 − .986 = .009
p(12) = P(x ≤ 12) − P(x ≤ 11) = .998 − .995 = .003
p(13) = P(x ≤ 13) − P(x ≤ 12) = .999 − .998 = .001
p(14) = P(x ≤ 14) − P(x ≤ 13) = 1.000 − .999 = .001
p(15) = P(x ≤ 15) − P(x ≤ 14) = 1.000 − 1.000 = .000


The graph is shown at right:

b.  μ = λ = 5
    σ = √λ = √5 = 2.2361
    μ ± 2σ → 5 ± 2(2.2361) → 5 ± 4.4722 → (.5278, 9.4722)

c.  P(.5278 < x < 9.4722) = P(1 ≤ x ≤ 9) = P(x ≤ 9) − P(x = 0) = .968 − .007 = .961

4.60

a.  E(x) = μ = λ = 6
    σ = √λ = √6 = 2.449

b.  z = (x − μ)/σ = (1 − 6)/2.449 = −2.041

c.  Using Table III, Appendix B, with λ = 6,
    P(x ≤ 10) = .957
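Readers without Table III can reproduce these Poisson lookups in software. A short Python sketch (assuming SciPy is available) is:

    # Poisson cdf/pmf checks for Exercises 4.56-4.60.
    from scipy.stats import poisson

    print(round(poisson.cdf(3, 1.5), 3))        # P(x <= 3), lambda = 1.5 -> .934
    print(round(1 - poisson.cdf(2, 1.5), 3))    # P(x >= 3), lambda = 1.5 -> .191
    print(round(poisson.pmf(3, 5), 3))          # p(3) when lambda = 5    -> .140

    mu, sigma = 6, 6 ** 0.5
    print(round(poisson.cdf(10, mu), 3))        # P(x <= 10), lambda = 6  -> .957
    print(round((1 - mu) / sigma, 3))           # z-score for x = 1       -> -2.041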

4.62

a.  In the problem, it is stated that E(x) = .03. This is also the value of λ.
    σ² = λ = .03

b.  The experiment consists of counting the number of deaths or missing persons in a three-year interval. We must assume that the probability of a death or missing person in a three-year period is the same for any three-year period. We must also assume that the number of deaths or missing persons in any three-year period is independent of the number of deaths or missing persons in any other three-year period.

c.  P(x = 1) = λ^1 e^(−λ)/1! = (.03)^1 e^(−.03)/1! = .0291

    P(x = 0) = λ^0 e^(−λ)/0! = (.03)^0 e^(−.03)/0! = .9704

4.64

a.  Using Table III and λ = 6.2,
    P(x = 2) = P(x ≤ 2) − P(x ≤ 1) = .054 − .015 = .039
    P(x = 6) = P(x ≤ 6) − P(x ≤ 5) = .574 − .414 = .160
    P(x = 10) = P(x ≤ 10) − P(x ≤ 9) = .949 − .902 = .047

b.  The plot of the distribution is: (bar chart of p(x) versus x)

c.  μ = λ = 6.2, σ = √λ = √6.2 = 2.490
    μ ± σ → 6.2 ± 2.49 → (3.71, 8.69)
    μ ± 2σ → 6.2 ± 2(2.49) → 6.2 ± 4.98 → (1.22, 11.18)
    μ ± 3σ → 6.2 ± 3(2.49) → 6.2 ± 7.47 → (−1.27, 13.67)
    See the plot in part b.

d.  First, we need to find the mean number of customers per hour. If the mean number of customers per 10 minutes is 6.2, then the mean number of customers per hour is 6.2(6) = 37.2 = λ.
    μ = λ = 37.2 and σ = √λ = √37.2 = 6.099
    μ ± 3σ → 37.2 ± 3(6.099) → 37.2 ± 18.297 → (18.903, 55.498)
    Using Chebyshev's Rule, we know at least 8/9, or 88.9%, of the observations will fall within 3 standard deviations of the mean. The number 75 is well beyond the 3-standard-deviation limit. Thus, it would be very unlikely that more than 75 customers entered the store per hour on Saturdays.


4.66

Let x = number of minor flaws in one square foot of a door's surface. Then x has a Poisson distribution with λ = .5.

μ = λ = .5. Using Table III, Appendix B:

    P(fail inspection) = P(2 or more minor flaws in the square foot inspected) = P(x ≥ 2) = 1 − P(x ≤ 1) = 1 − .910 = .090
    P(pass inspection) = P(x < 2) = P(x ≤ 1) = .910

4.68

If it takes exactly 5 minutes to wash a car and there are 5 cars in line, it will take 5(5) = 25 minutes to wash these 5 cars. Thus, for anyone to be in line at closing time, more than 1 car must arrive in the final half hour. In addition, if on average 10 cars arrive per hour, then an average of 5 cars will arrive per half hour (30 minutes). If we let x = number of cars to arrive in a half hour, then x is a Poisson random variable with λ = 5.

    P(x > 1) = 1 − P(x ≤ 1) = 1 − .04 = .96 (Using Table III, Appendix B)

Since this probability is so large, it is very likely that someone will be in line at closing time.
4.70

From Exercise 4.69, f(x) = .04 (20 ≤ x ≤ 45) and 0 otherwise.

a.  P(20 ≤ x ≤ 30) = (30 − 20)(.04) = .4

b.  P(20 < x < 30) = (30 − 20)(.04) = .4

c.  P(x ≥ 30) = (45 − 30)(.04) = .6

d.  P(x ≥ 45) = (45 − 45)(.04) = 0

e.  P(x ≤ 40) = (40 − 20)(.04) = .8

f.  P(x < 40) = (40 − 20)(.04) = .8

g.  P(15 ≤ x ≤ 35) = (35 − 20)(.04) = .6

h.  P(21.5 ≤ x ≤ 31.5) = (31.5 − 21.5)(.04) = .4

4.72

From Exercise 4.71, f(x) = 1/4 (3 ≤ x ≤ 7) and 0 otherwise.

a.  P(x ≥ a) = .6 → (1/4)(7 − a) = .6 → 7 − a = 2.4 → a = 4.6


b.  P(x ≤ a) = .25 → (1/4)(a − 3) = .25 → a − 3 = 1 → a = 4

c.  P(x ≤ a) = 1 → (1/4)(a − 3) = 1 → a − 3 = 4 → a = 7
    For any value of a ≥ 7, P(x ≤ a) = 1. Thus, a ≥ 7.

d.  P(4 ≤ x ≤ a) = .5 → (1/4)(a − 4) = .5 → a − 4 = 2 → a = 6

4.74

μ = (c + d)/2 = 10 → c + d = 20 → c = 20 − d

σ = (d − c)/√12 = 1 → d − c = √12

Substituting, d − (20 − d) = √12 → 2d − 20 = √12 → 2d = 20 + √12 → d = (20 + √12)/2 = 11.732

Since c + d = 20, c + 11.732 = 20 → c = 8.268

f(x) = 1/(d − c) = 1/(11.732 − 8.268) = 1/3.464 = .289 (c ≤ x ≤ d)

Therefore, f(x) = .289 (8.268 ≤ x ≤ 11.732) and 0 otherwise. The graph of the probability distribution for x is a horizontal line of height .289 over the interval (8.268, 11.732).


4.76

a.  For this problem, c = 0 and d = 1.
    f(x) = 1/(d − c) = 1/(1 − 0) = 1 (0 ≤ x ≤ 1) and 0 otherwise.

    μ = (c + d)/2 = (0 + 1)/2 = .5
    σ² = (d − c)²/12 = (1 − 0)²/12 = 1/12 = .0833

b.  P(.2 < x < .4) = (.4 − .2)(1) = .2

c.  P(x > .995) = (1 − .995)(1) = .005. Since the probability of observing a trajectory greater than .995 is so small, we would not expect to see a trajectory exceeding .995.

4.78

a.  For layer 2, let x = amount of loss. Since the amount of loss is random between .01 and .05 million dollars, the uniform distribution for x is:
    f(x) = 1/(d − c) = 1/(.05 − .01) = 1/.04 = 25 (.01 ≤ x ≤ .05) and 0 otherwise.
    A graph of the distribution is a horizontal line of height 25 over the interval (.01, .05).

    μ = (c + d)/2 = (.01 + .05)/2 = .03
    σ = (d − c)/√12 = (.05 − .01)/√12 = .0115, σ² = (.0115)² = .00013

    The mean loss for layer 2 is .03 million dollars and the variance of the loss for layer 2 is .00013 million dollars squared.


b.  For layer 6, let x = amount of loss. Since the amount of loss is random between .50 and 1.00 million dollars, the uniform distribution for x is:
    f(x) = 1/(d − c) = 1/(1.00 − .50) = 1/.50 = 2 (.50 ≤ x ≤ 1.00) and 0 otherwise.
    A graph of the distribution is a horizontal line of height 2 over the interval (.50, 1.00).

    μ = (c + d)/2 = (.50 + 1.00)/2 = .75
    σ = (d − c)/√12 = (1.00 − .50)/√12 = .1443, σ² = (.1443)² = .0208

    The mean loss for layer 6 is .75 million dollars and the variance of the loss for layer 6 is .0208 million dollars squared.

c.  A loss of $10,000 corresponds to x = .01. P(x > .01) = 1.

    A loss of $25,000 corresponds to x = .025.
    P(x < .025) = (Base)(Height) = (x − c)[1/(d − c)] = (.025 − .01)[1/(.05 − .01)] = .015(25) = .375

d.  A loss of $750,000 corresponds to x = .75. A loss of $1,000,000 corresponds to x = 1.
    P(.75 < x < 1) = (Base)(Height) = (d − x)[1/(d − c)] = (1.00 − .75)[1/(1.00 − .50)] = .25(2) = .5

    A loss of $900,000 corresponds to x = .90.
    P(x > .9) = (Base)(Height) = (d − x)[1/(d − c)] = (1.00 − .90)[1/(1.00 − .50)] = .10(2) = .20
    P(x = .9) = 0
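The uniform-distribution quantities in these exercises can also be computed in software. A brief Python sketch (assuming SciPy's uniform distribution, which is parameterized by loc = c and scale = d − c) is:

    # Uniform-distribution means, standard deviations, and probabilities (Exercise 4.78).
    from scipy.stats import uniform

    layer2 = uniform(loc=0.01, scale=0.05 - 0.01)     # losses between .01 and .05
    print(layer2.mean(), round(layer2.std(), 4))      # .03 and about .0115
    print(layer2.cdf(0.025))                          # P(x < .025) = .375

    layer6 = uniform(loc=0.50, scale=1.00 - 0.50)     # losses between .50 and 1.00
    print(1 - layer6.cdf(0.90))                       # P(x > .90) = .20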

4.80

Let x = cycle availability, where x has a uniform distribution on the interval from 0 to 1.

Mean: μ = (c + d)/2 = (0 + 1)/2 = .5
Standard deviation: σ = (d − c)/√12 = (1 − 0)/√12 = .289

The 10th percentile is the value of x such that 10% of all observations are below it. Let K1 = 10th percentile.
    P(x ≤ K1) = (K1 − 0)(1) = K1 = .10

The lower quartile is the value of x such that 25% of all observations are below it. Let K2 = 25th percentile.
    P(x ≤ K2) = (K2 − 0)(1) = K2 = .25

The upper quartile is the value of x such that 75% of all observations are below it. Let K3 = 75th percentile.
    P(x ≤ K3) = (K3 − 0)(1) = K3 = .75

4.82

a.

b.  μ = (c + d)/2 = (0 + 1)/2 = .5
    σ = (d − c)/√12 = (1 − 0)/√12 = .289
    σ² = .289² = .083

c.  P(p > .95) = (1 − .95)(1) = .05
    P(p < .95) = (.95 − 0)(1) = .95

d.  The analyst should use a uniform probability distribution with c = .90 and d = .95.
    f(p) = 1/(d − c) = 1/(.95 − .90) = 1/.05 = 20 (.90 ≤ p ≤ .95) and 0 otherwise.

4.84

Table IV in the text gives the area between z = 0 and z = z0. In this exercise, the answers may thus be read directly from the table by looking up the appropriate z.

a.  P(0 < z < 2.0) = .4772

b.  P(0 < z < 3.0) = .4987

c.  P(0 < z < 1.5) = .4332

d.  P(0 < z < .80) = .2881

4.86

a.  P(−1 ≤ z ≤ 1) = A1 + A2 = .3413 + .3413 = .6826

b.  P(−2 ≤ z ≤ 2) = A1 + A2 = .4772 + .4772 = .9544

c.  P(−2.16 < z ≤ 0.55) = A1 + A2 = .4846 + .2088 = .6934


d.  P(−.42 < z < 1.96) = P(−.42 ≤ z ≤ 0) + P(0 ≤ z ≤ 1.96) = A1 + A2 = .1628 + .4750 = .6378

e.  P(z ≥ −2.33) = P(−2.33 ≤ z ≤ 0) + P(z ≥ 0) = A1 + A2 = .4901 + .5000 = .9901

f.  P(z < 2.33) = P(z ≤ 0) + P(0 ≤ z ≤ 2.33) = A1 + A2 = .5000 + .4901 = .9901

4.88

a.  P(z = 1) = 0, since a single point does not have an area.

b.  P(z ≤ 1) = P(z ≤ 0) + P(0 < z ≤ 1) = A1 + A2 = .5 + .3413 = .8413 (Table IV, Appendix B)

c.  P(z < 1) = P(z ≤ 1) = .8413 (Refer to part b.)

d.  P(z > 1) = 1 − P(z ≤ 1) = 1 − .8413 = .1587 (Refer to part b.)

4.90

Using Table IV, Appendix B:


a.  P(z ≥ z0) = .05
    A1 = .5 − .05 = .4500
    Looking up the area .4500 in Table IV gives z0 = 1.645.

b.  P(z ≥ z0) = .025
    A1 = .5 − .025 = .4750
    Looking up the area .4750 in Table IV gives z0 = 1.96.

c.  P(z ≤ z0) = .025
    A1 = .5 − .025 = .4750
    Looking up the area .4750 in Table IV gives z = 1.96. Since z0 is to the left of 0, z0 = −1.96.

d.  P(z ≥ z0) = .10
    A1 = .5 − .1 = .4000
    Looking up the area .4000 in Table IV gives z0 = 1.28.

e.  P(z > z0) = .10
    A1 = .5 − .1 = .4000
    z0 = 1.28 (same as in part d)

4.92

a.  z = 1

b.  z = −1

c.  z = 0

d.  z = −2.5

e.  z = 3

4.94
Using Table IV of Appendix B:
a.  To find the probability that x assumes a value more than 2 standard deviations from μ:
    P(x < μ − 2σ) + P(x > μ + 2σ) = P(z < −2) + P(z > 2) = 2P(z > 2) = 2(.5000 − .4772) = 2(.0228) = .0456

    To find the probability that x assumes a value more than 3 standard deviations from μ:
    P(x < μ − 3σ) + P(x > μ + 3σ) = P(z < −3) + P(z > 3) = 2P(z > 3) = 2(.5000 − .4987) = 2(.0013) = .0026

b.  To find the probability that x assumes a value within 1 standard deviation of its mean:
    P(μ − σ < x < μ + σ) = P(−1 < z < 1) = 2P(0 < z < 1) = 2(.3413) = .6826

    To find the probability that x assumes a value within 2 standard deviations of μ:
    P(μ − 2σ < x < μ + 2σ) = P(−2 < z < 2) = 2P(0 < z < 2) = 2(.4772) = .9544

c.  To find the value of x that represents the 80th percentile, we must first find the value of z that corresponds to the 80th percentile.
    P(z < z0) = .80. Thus, A1 + A2 = .80. Since A1 = .50, A2 = .80 − .50 = .30. Using the body of Table IV, z0 = .84.
    To find x, we substitute the values into the z-score formula:
    z = (x − μ)/σ → .84 = (x − 1000)/10 → x = .84(10) + 1000 = 1008.4

    To find the value of x that represents the 10th percentile, we must first find the value of z that corresponds to the 10th percentile.
    P(z < z0) = .10. Thus, A1 = .50 − .10 = .40. Using the body of Table IV, z0 = −1.28. To find x, we substitute the values into the z-score formula:
    z = (x − μ)/σ → −1.28 = (x − 1000)/10 → x = −1.28(10) + 1000 = 987.2

4.96

The random variable x has a normal distribution with μ = 50 and σ = 3.


a.  P(x ≤ x0) = .8413
    So, A1 + A2 = .8413. Since A1 = .5, A2 = .8413 − .5 = .3413.
    Look up the area .3413 in the body of Table IV, Appendix B; z0 = 1.0.
    To find x0, substitute the values into the z-score formula:
    z = (x0 − μ)/σ → 1.0 = (x0 − 50)/3 → x0 = 50 + 3(1.0) = 53

b.  P(x > x0) = .025
    So, A = .5000 − .025 = .4750
    Look up the area .4750 in the body of Table IV, Appendix B; z0 = 1.96.
    To find x0, substitute the values into the z-score formula:
    z = (x0 − μ)/σ → 1.96 = (x0 − 50)/3 → x0 = 50 + 3(1.96) = 55.88

c.  P(x > x0) = .95
    So, A1 + A2 = .95. Since A2 = .5, A1 = .95 − .5 = .4500.
    Look up the area .4500 in the body of Table IV, Appendix B (since it is exactly between two values, average the z-scores); z0 ≈ −1.645.
    To find x0, substitute into the z-score formula:
    z = (x0 − μ)/σ → −1.645 = (x0 − 50)/3 → x0 = 50 − 3(1.645) = 45.065

d.  P(41 ≤ x < x0) = .8630
    z = (41 − 50)/3 = −3
    A1 = P(41 ≤ x ≤ μ) = P(−3 ≤ z ≤ 0) = P(0 ≤ z ≤ 3) = .4987
    A1 + A2 = .8630. Since A1 = .4987, A2 = .8630 − .4987 = .3643. Look up .3643 in the body of Table IV, Appendix B; z0 = 1.1.
    To find x0, substitute into the z-score formula:
    z = (x0 − μ)/σ → 1.1 = (x0 − 50)/3 → x0 = 50 + 3(1.1) = 53.3

e.  P(x < x0) = .10
    So A = .5000 − .10 = .4000
    Look up area .4000 in the body of Table IV, Appendix B; z = 1.28. Since z0 is to the left of 0, z0 = −1.28.
    To find x0, substitute the values into the z-score formula:
    z = (x0 − μ)/σ → −1.28 = (x0 − 50)/3 → x0 = 50 − 1.28(3) = 46.16

f.  P(x > x0) = .01
    So A = .5000 − .01 = .4900
    Look up area .4900 in the body of Table IV, Appendix B; z0 = 2.33.
    To find x0, substitute the values into the z-score formula:
    z = (x0 − μ)/σ → 2.33 = (x0 − 50)/3 → x0 = 50 + 2.33(3) = 56.99
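The Table IV lookups used throughout these normal-distribution exercises can be reproduced with a normal cdf/quantile routine. A short Python sketch (assuming SciPy) is:

    # Normal quantiles replace Table IV lookups (Exercises 4.90-4.96, 4.106).
    from scipy.stats import norm

    print(round(norm.ppf(0.95), 3))          # z0 with P(z >= z0) = .05 -> 1.645
    print(round(norm.ppf(0.80), 2))          # 80th percentile z        -> 0.84

    mu, sigma = 50, 3
    print(round(norm.ppf(0.8413, mu, sigma), 2))        # x0 with P(x <= x0) = .8413 -> about 53
    print(round(norm.ppf(0.99, 850_000, 170_000), 0))   # R in Exercise 4.106 (the text's 1,246,100 uses the rounded z = 2.33)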

4.98

a.  Using Table IV, Appendix B,
    P(x > 0) = P(z > (0 − 5.26)/10) = P(z > −0.526) = .5 + P(−0.53 < z < 0) = .5 + .2019 = .7019

b.  P(5 < x < 15) = P((5 − 5.26)/10 < z < (15 − 5.26)/10) = P(−0.026 < z < 0.974)
    = P(−.03 < z < 0) + P(0 < z < .97) = .0120 + .3340 = .3460

c.  P(x < 1) = P(z < (1 − 5.26)/10) = P(z < −0.426) = .5 − P(−0.43 < z < 0) = .5 − .1664 = .3336

d.  P(x ≤ −25) = P(z ≤ (−25 − 5.26)/10) = P(z ≤ −3.026) = .5 − P(−3.03 ≤ z < 0) = .5 − .4988 = .0012

    Since the probability of seeing a win percentage of −25% or anything more unusual is so small (p = .0012), we would conclude that the average casino win percentage is not 5.26%.

4.100

Let x = driver's head injury rating. The random variable x has a normal distribution with μ = 605 and σ = 185. Using Table IV, Appendix B,

a.  P(500 < x < 700) = P((500 − 605)/185 < z < (700 − 605)/185) = P(−0.57 < z < 0.51)
    = P(−0.57 < z < 0) + P(0 < z < 0.51) = .2157 + .1950 = .4107

b.  P(400 < x < 500) = P((400 − 605)/185 < z < (500 − 605)/185) = P(−1.11 < z < −0.57)
    = P(−1.11 < z < 0) − P(−0.57 < z < 0) = .3665 − .2157 = .1508

c.  P(x < 850) = P(z < (850 − 605)/185) = P(z < 1.32) = .5 + P(0 < z < 1.32) = .5 + .4066 = .9066

d.  P(x > 1,000) = P(z > (1,000 − 605)/185) = P(z > 2.14) = .5 − P(0 < z < 2.14) = .5 − .4838 = .0162
4.102

a.  Let x = crop yield. The random variable x has a normal distribution with μ = 1,500 and σ = 250.
    P(x < 1,600) = P(z < (1,600 − 1,500)/250) = P(z < .4) = .5 + .1554 = .6554 (Using Table IV)

b.  Let x1 = crop yield in the first year and x2 = crop yield in the second year. If x1 and x2 are independent, then the probability that the farm will lose money for two straight years is:
    P(x1 < 1,600)·P(x2 < 1,600) = P(z1 < (1,600 − 1,500)/250)·P(z2 < (1,600 − 1,500)/250)
    = P(z1 < .4)·P(z2 < .4) = (.5 + .1554)(.5 + .1554) = .6554(.6554) = .4295 (Using Table IV)

c.  P(1,500 − 2σ ≤ x ≤ 1,500 + 2σ) = P(([1,500 − 2σ] − 1,500)/σ ≤ z ≤ ([1,500 + 2σ] − 1,500)/σ)
    = P(−2 ≤ z ≤ 2) = 2P(0 ≤ z ≤ 2) = 2(.4772) = .9544 (Using Table IV)

4.104

Let x = wage rate. The random variable x is normally distributed with μ = 16 and σ = 1.25. Using Table IV, Appendix B,

a.  P(x > 17.30) = P(z > (17.30 − 16)/1.25) = P(z > 1.04) = .5 − P(0 < z < 1.04) = .5 − .3508 = .1492

b.  P(x > 17.30) = P(z > (17.30 − 16)/1.25) = P(z > 1.04) = .5 − P(0 < z < 1.04) = .5 − .3508 = .1492

c.  P(x ≤ median) = P(x ≥ median) = .5
    Thus, the median = μ = 16.
    (Recall from Section 2.4 that in a symmetric distribution, the mean equals the median.)

4.106

a.  The contract will be profitable if total cost, x, is less than $1,000,000.
    P(x < 1,000,000) = P(z < (1,000,000 − 850,000)/170,000) = P(z < .88) = .5 + .3106 = .8106

b.  The contract will result in a loss if total cost, x, exceeds $1,000,000.
    P(x > 1,000,000) = 1 − P(x < 1,000,000) = 1 − .8106 = .1894

c.  P(x < R) = .99. Find R.
    P(x < R) = P(z < (R − 850,000)/170,000) = P(z < z0) = .99
    A1 = .99 − .5 = .4900
    Looking up the area .4900 in Table IV, z0 = 2.33.
    z0 = (R − 850,000)/170,000 → 2.33 = (R − 850,000)/170,000 → R = 2.33(170,000) + 850,000 = $1,246,100

4.108

a.  Let x = quantity injected per container. The random variable x has a normal distribution with μ = 10 and σ = .2.
    P(x < 10) = P(z < (10 − 10)/.2) = P(z < 0.0) = .5
    P(x ≥ 10) = P(z ≥ (10 − 10)/.2) = P(z ≥ 0.0) = .5

b.  Since the container needed to be reprocessed, it cost $10. Upon refilling, it contained 10.60 units at a cost of 10.60($20) = $212. Thus, the total cost for filling this container is $10 + $212 = $222. Since the container sells for $230, the profit is $230 − $222 = $8.

c.  Let x = quantity injected per container. The random variable x has a normal distribution with μ = 10.10 and σ = .2. The expected value of x is E(x) = μ = 10.10. The cost of a container with 10.10 units is 10.10($20) = $202. Thus, the expected profit would be the selling price minus the cost, or $230 − $202 = $28.

4.110

a.

If z is a standard normal random variable,


QL = zL is the value of the standard normal distribution which has 25% of the data to the
left and 75% to the right.
Find zL such that P(z < zL) = .25
A1 = .50 − .25 = .25.
Look up the area A1 = .25 in the body of Table IV of Appendix B; zL = −.67 (taking the closest value). If interpolation is used, −.675 would be obtained.


QU = zU is the value of the standard normal distribution which has 75% of the data to the
left and 25% to the right.
Find zU such that P(z < zU) = .75
A1 + A2 = P(z 0) + P(0 z zU)
= .5 + P(0 z zU)
= .75
Therefore, P(0 z zU) = .25.
Look up the area .25 in the body of Table IV of Appendix B; zU = .67 (taking the closest
value).
b.  Recall that the inner fences of a box plot are located 1.5(QU − QL) outside the hinges (QL and QU).
    Lower inner fence = QL − 1.5(QU − QL) = −.67 − 1.5(.67 − (−.67)) = −.67 − 1.5(1.34) = −2.68 (−2.70 if zL = −.675 and zU = +.675)
    Upper inner fence = QU + 1.5(QU − QL) = .67 + 1.5(.67 − (−.67)) = .67 + 1.5(1.34) = 2.68 (+2.70 if zL = −.675 and zU = +.675)

c.  Recall that the outer fences of a box plot are located 3(QU − QL) outside the hinges (QL and QU).
    Lower outer fence = QL − 3(QU − QL) = −.67 − 3(.67 − (−.67)) = −.67 − 3(1.34) = −4.69 (−4.725 if zL = −.675 and zU = +.675)
    Upper outer fence = QU + 3(QU − QL) = .67 + 3(.67 − (−.67)) = .67 + 3(1.34) = 4.69 (+4.725 if zL = −.675 and zU = +.675)


d.  P(z < −2.68) + P(z > 2.68) = 2P(z > 2.68) = 2(.5000 − .4963) = 2(.0037) = .0074 (Table IV, Appendix B)
    (or 2(.5000 − .4965) = .0070 if −2.70 and 2.70 are used)

    P(z < −4.69) + P(z > 4.69) = 2P(z > 4.69) ≈ 2(.5000 − .5000) ≈ 0

e.  In a normal probability distribution, the probability of an observation being beyond the inner fences is only .0074 and the probability of an observation being beyond the outer fences is approximately zero. Since the probability is so small, there should not be any observations beyond the inner and outer fences. Therefore, any such observations are probably outliers.

4.112

a.

IQR = QU − QL = 195 − 72 = 123

b.

IQR/s = 123/95 = 1.295

c.

Yes. Since IQR/s is approximately 1.3, this implies that the data are approximately normal.

4.114

a.

Using MINITAB, the stem-and-leaf display is:


Stem-and-leaf of X    N = 28
Leaf Unit = 0.10
(stem-and-leaf display)

Since the data do not form a mound-shape, it indicates that the data may not be normally
distributed.
b.

Using MINITAB, the descriptive statistics are:


Variable    N     Mean    Median   TrMean   StDev   SE Mean
X           28    5.511   6.100    5.519    2.765   0.5230

Variable    Minimum   Maximum   Q1      Q3
X           1.100     9.700     3.350   8.050

The standard deviation is 2.765.


c.

Using the printout from MINITAB in part b, QL = 3.35 and QU = 8.05. The IQR = QU − QL = 8.05 − 3.35 = 4.7. If the data are normally distributed, then IQR/s ≈ 1.3. For these data, IQR/s = 4.7/2.765 = 1.70. This is a fair amount larger than 1.3, which indicates that the data may not be normally distributed.

d.

Using MINITAB, the normal probability plot is:

The data at the extremes are not particularly on a straight line. This indicates that the data are
not normally distributed.

4.116

From the normal probability plot, it appears that the data may not be normal. The points with
small observed values and the points with large observed values do not fall on the straight line.
This implies that the data may not be from a normal distribution.

4.118

a.

We will look at the 4 methods for determining if the data are normal. First, we will look
at a histogram of the data. Using MINITAB, the histogram of the fish weights is:

From the histogram, the data appear to be fairly mound-shaped. This indicates that the
data may be normal.


Next, we look at the intervals x s, x 2 s, x 3s . If the proportions of observations


falling in each interval are approximately .68, .95, and 1.00, then the data are
approximately normal. Using MINITAB, the summary statistics are:
Descriptive Statistics: Weight
Variable    N     Mean     Median   TrMean   StDev   SE Mean
Weight      144   1049.7   1000.0   1039.4   376.5   31.4

Variable    Minimum   Maximum   Q1      Q3
Weight      173.0     2302.0    804.5   1263.3

x̄ ± s → 1049.7 ± 376.5 → (673.2, 1,426.2). 98 of the 144 values fall in this interval. The proportion is .68. This is exactly the .68 we would expect if the data were normal.

x̄ ± 2s → 1049.7 ± 2(376.5) → 1049.7 ± 753 → (296.7, 1,802.7). 140 of the 144 values fall in this interval. The proportion is .97. This is somewhat larger than the .95 we would expect if the data were normal.

x̄ ± 3s → 1049.7 ± 3(376.5) → 1049.7 ± 1,129.5 → (−79.8, 2,179.2). 143 of the 144 values fall in this interval. The proportion is .993. This is close to the 1.00 we would expect if the data were normal.

From this method, it appears that the data are normal.

Next, we look at the ratio of the IQR to s. IQR = QU − QL = 1263.3 − 804.5 = 458.8.
IQR/s = 458.8/376.5 = 1.22. This is close to the 1.3 we would expect if the data were normal. This method indicates the data are normal.
Finally, using MINITAB, the normal probability plot is:
Normal Probability Plot for Weight (ML Estimates − 95% CI): Mean = 1049.72, StDev = 375.236, AD* = 0.793

Since the data form a fairly straight line, the data appear to be normal.


From the 4 different methods, all indications are that the fish weight data are
approximately normal.
b.

We will look at the 4 methods for determining if the data are normal. First, we will look
at a histogram of the data. Using MINITAB, the histogram of the fish DDT levels is:


From the histogram, the data appear to be skewed to the right. This indicates that the data
may not be normal.
Next, we look at the intervals x s, x 2 s, x 3s . If the proportions of observations
falling in each interval are approximately .68, .95, and 1.00, then the data are
approximately normal. Using MINITAB, the summary statistics are:
Descriptive Statistics: DDT
Variable    N     Mean    Median   TrMean   StDev   SE Mean
DDT         144   24.35   7.15     10.38    98.38   8.20

Variable    Minimum   Maximum   Q1     Q3
DDT         0.11      1100.00   3.33   13.00

x̄ ± s → 24.35 ± 98.38 → (−74.03, 122.73). 138 of the 144 values fall in this interval. The proportion is .96. This is much greater than the .68 we would expect if the data were normal.

x̄ ± 2s → 24.35 ± 2(98.38) → 24.35 ± 196.76 → (−172.41, 221.11). 142 of the 144 values fall in this interval. The proportion is .986. This is much larger than the .95 we would expect if the data were normal.

x̄ ± 3s → 24.35 ± 3(98.38) → 24.35 ± 295.14 → (−270.79, 319.49). 142 of the 144 values fall in this interval. The proportion is .986. This is somewhat lower than the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

Next, we look at the ratio of the IQR to s. IQR = QU − QL = 13.00 − 3.33 = 9.67.


IQR/s = 9.67/98.38 = 0.098. This is much smaller than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.
Finally, using MINITAB, the normal probability plot is:
Normal Probability Plot for DDT (ML Estimates − 95% CI): Mean = 24.355, StDev = 98.0364, AD* = 38.58

Since the data do not form a straight line, the data are not normal.
From the 4 different methods, all indications are that the fish DDT level data are not normal.
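The four normality checks illustrated above can be scripted for any sample. The Python sketch below (assuming NumPy and SciPy; it uses simulated data rather than the actual fish measurements) shows the idea:

    # Empirical-rule proportions, IQR/s, and a normal probability plot correlation.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = rng.normal(loc=1050, scale=375, size=144)     # stand-in sample

    xbar, s = x.mean(), x.std(ddof=1)
    for k in (1, 2, 3):                               # compare with .68, .95, 1.00
        frac = np.mean((x > xbar - k * s) & (x < xbar + k * s))
        print(k, round(frac, 3))

    q1, q3 = np.percentile(x, [25, 75])
    print(round((q3 - q1) / s, 2))                    # IQR/s, compare with 1.3

    r = stats.probplot(x, dist="norm")[1][2]          # correlation from the normal probability plot
    print(round(r, 3))                                # near 1 for normal-looking data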
4.120

We will look at the 4 methods for determining if the data are normal. First, we will look at
a histogram of the data. Using MINITAB, the histogram of the sanitation scores is:
Histogram of SCORE



From the histogram, the data appear to be skewed to the left. This indicates that the data are
not normal.
Next, we look at the intervals x s, x 2 s, x 3s . If the proportions of observations falling
in each interval are approximately .68, .95, and 1.00, then the data are approximately normal.
Using MINITAB, the summary statistics are:
Descriptive Statistics: SCORE

Variable    N     Mean     StDev   Q1   Q3
SCORE       169   94.911   4.825   93   98

x̄ ± s → 94.911 ± 4.825 → (90.086, 99.736). 137 of the 169 values fall in this interval. The proportion is .81. This is much larger than the .68 we would expect if the data were normal.

x̄ ± 2s → 94.911 ± 2(4.825) → 94.911 ± 9.65 → (85.261, 104.561). 165 of the 169 values fall in this interval. The proportion is .98. This is somewhat larger than the .95 we would expect if the data were normal.

x̄ ± 3s → 94.911 ± 3(4.825) → 94.911 ± 14.475 → (80.436, 109.386). 166 of the 169 values fall in this interval. The proportion is .982. This is somewhat smaller than the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

Next, we look at the ratio of the IQR to s. IQR = QU − QL = 98 − 93 = 5.
IQR/s = 5/4.825 = 1.036. This is smaller than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.
Finally, using MINITAB, the normal probability plot is:
Probability Plot of SCORE (Normal − 95% CI): Mean = 94.91, StDev = 4.825, N = 169, AD = 7.216, P-Value < 0.005

Since the data do not form a straight line, the data are not normal.
From the 4 different methods, all indications are that the sanitation scores data are not normal.
4.122

We will look at the 4 methods for determining if the data are normal. First, we will look at
a histogram of the data. Using MINITAB, the histogram of the tensile strength values is:
Histogram of Strength

From the histogram, the data appear to be somewhat skewed to the left. This might indicate
that the data are not normal.
Next, we look at the intervals x s, x 2 s, x 3s . If the proportions of observations falling
in each interval are approximately .68, .95, and 1.00, then the data are approximately normal.
Using MINITAB, the summary statistics are:
Descriptive Statistics: Strength
Variable    N    N*   Mean     SE Mean   StDev   Minimum   Q1       Median   Q3       Maximum
Strength    11   0    342.13   2.38      7.91    328.20    334.70   343.60   347.80   356.30

x̄ ± s → 342.13 ± 7.91 → (334.22, 350.04). 8 of the 11 values fall in this interval. The proportion is .73. This is somewhat larger than the .68 we would expect if the data were normal.

x̄ ± 2s → 342.16 ± 2(7.91) → 342.16 ± 15.82 → (326.34, 357.98). All 11 of the 11 values fall in this interval. The proportion is 1.00. This is somewhat larger than the .95 we would expect if the data were normal.

x̄ ± 3s → 342.16 ± 3(7.91) → 342.16 ± 23.73 → (318.43, 365.89). Again, all 11 of the 11 values fall in this interval. The proportion is 1.00. This is equal to the 1.00 we would expect if the data were normal.


From this method, it appears that the data are quite normal.
Next, we look at the ratio of the IQR to s. IQR = QU − QL = 347.80 − 334.70 = 13.1.
IQR/s = 13.1/7.91 = 1.656. This is much larger than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.

Finally, using MINITAB, the normal probability plot is:


Probability Plot of Strength (Normal − 95% CI): Mean = 342.1, StDev = 7.907, N = 11, AD = 0.154, P-Value = 0.937

Since the data do form a fairly straight line, the data could be normal.
From the 4 different methods, three of the four indicate that the data probably are not from a
normal distribution.
4.124

a.  In order to approximate the binomial distribution with the normal distribution, the interval μ ± 3σ → np ± 3√(npq) should lie in the range 0 to n.
    When n = 25 and p = .4,
    np ± 3√(npq) → 25(.4) ± 3√(25(.4)(1 − .4)) → 10 ± 3√6 → 10 ± 7.3485 → (2.6515, 17.3485)
    Since the interval calculated does lie in the range 0 to 25, we can use the normal approximation.

b.  μ = np = 25(.4) = 10
    σ² = npq = 25(.4)(.6) = 6

c.  P(x ≥ 9) = 1 − P(x ≤ 8) = 1 − .274 = .726 (Table II, Appendix B)

d.  P(x ≥ 9) ≈ P(z ≥ ((9 − .5) − 10)/√6) = P(z ≥ −.61) = .5000 + .2291 = .7291 (Using Table IV in Appendix B)
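A short Python sketch (assuming SciPy) compares the exact binomial probability with the continuity-corrected normal approximation used here:

    # Normal approximation to the binomial with continuity correction (Exercise 4.124).
    from math import sqrt
    from scipy.stats import norm, binom

    n, p = 25, 0.4
    mu, sigma = n * p, sqrt(n * p * (1 - p))

    exact = 1 - binom.cdf(8, n, p)                    # P(x >= 9), about .726
    approx = 1 - norm.cdf((8.5 - mu) / sigma)         # continuity-corrected, about .73
    print(round(exact, 3), round(approx, 4))

    # Rule of thumb checked above: mu +/- 3*sigma should stay inside (0, n).
    print(mu - 3 * sigma > 0 and mu + 3 * sigma < n)  # True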


4.126

μ = np = 1000(.5) = 500, σ = √(npq) = √(1000(.5)(.5)) = 15.811

a.  Using the normal approximation,
    P(x > 500) ≈ P(z > ((500 + .5) − 500)/15.811) = P(z > .03) = .5 − .0120 = .4880 (from Table IV, Appendix B)

b.  P(490 ≤ x < 500) ≈ P(((490 − .5) − 500)/15.811 ≤ z < ((500 − .5) − 500)/15.811) = P(−.66 ≤ z < −.03)
    = .2454 − .0120 = .2334 (from Table IV, Appendix B)

c.  P(x > 550) ≈ P(z > ((550 + .5) − 500)/15.811) = P(z > 3.19) ≈ .5 − .5 = 0 (from Table IV, Appendix B)

4.128

a.  E(x) = μ = np = 350(.27) = 94.5

b.  σ = √σ² = √(npq) = √(350(.27)(.73)) = √68.985 = 8.306

c.  z = (99.5 − 94.5)/8.306 = 0.60

d.  To see if the normal approximation is appropriate, we use:
    μ ± 3σ → 94.5 ± 3(8.306) → 94.5 ± 24.918 → (69.582, 119.418)
    Since the interval lies in the range 0 to 350, the normal approximation is appropriate.
    P(x ≥ 100) ≈ P(z ≥ 0.60) = .5 − .2257 = .2743 (Using Table IV, Appendix B)
4.130

Let x = number of white-collar employees in good shape who will develop stress-related illnesses in a sample of 400. Then x is a binomial random variable with n = 400 and p = .10. To see if the normal approximation is appropriate for this problem:

    np ± 3√(npq) → 400(.1) ± 3√(400(.1)(.9)) → 40 ± 18 → (22, 58)

Since this interval is contained in the interval (0, n) = (0, 400), the normal approximation is appropriate.

    P(x > 60) ≈ P(z > ((60 + .5) − 40)/6) = P(z > 3.42) ≈ .5000 − .5000 = 0


4.132

a.  For n = 100 and p = .01:
    μ ± 3σ → np ± 3√(npq) → 100(.01) ± 3√(100(.01)(.99)) → 1 ± 3(.995) → 1 ± 2.985 → (−1.985, 3.985)
    Since the interval does not lie in the range 0 to 100, we cannot use the normal approximation to approximate the probabilities.

b.  For n = 100 and p = .5:
    μ ± 3σ → np ± 3√(npq) → 100(.5) ± 3√(100(.5)(.5)) → 50 ± 3(5) → 50 ± 15 → (35, 65)
    Since the interval lies in the range 0 to 100, we can use the normal approximation to approximate the probabilities.

c.  For n = 100 and p = .9:
    μ ± 3σ → np ± 3√(npq) → 100(.9) ± 3√(100(.9)(.1)) → 90 ± 3(3) → 90 ± 9 → (81, 99)
    Since the interval lies in the range 0 to 100, we can use the normal approximation to approximate the probabilities.

4.134

b.  Let v = number of credit card users out of 100 who carry Visa. Then v is a binomial random variable with n = 100 and pv = .539.
    E(v) = npv = 100(.539) = 53.9
    Let d = number of credit card users out of 100 who carry Discover. Then d is a binomial random variable with n = 100 and pd = .040.
    E(d) = npd = 100(.040) = 4.0

c.  To see if the normal approximation is valid, we use:
    μ ± 3σ → npv ± 3√(npvqv) → 100(.539) ± 3√(100(.539)(.461)) → 53.9 ± 3(4.985) → 53.9 ± 14.954 → (38.946, 68.854)
    Since the interval lies in the range 0 to 100, we can use the normal approximation to approximate the probability.
    P(v ≥ 50) ≈ P(z ≥ ((50 − .5) − 53.9)/4.985) = P(z ≥ −.88) = .5 + .3106 = .8106

    Let a = number of credit card users out of 100 who carry American Express. Then a is a binomial random variable with n = 100 and pa = .132. To see if the normal approximation is valid, we use:
    μ ± 3σ → npa ± 3√(npaqa) → 100(.132) ± 3√(100(.132)(.868)) → 13.2 ± 3(3.385) → 13.2 ± 10.155 → (3.045, 23.355)
    Since the interval lies in the range 0 to 100, we can use the normal approximation to approximate the probability.
    P(a ≥ 50) ≈ P(z ≥ ((50 − .5) − 13.2)/3.385) = P(z ≥ 10.72) ≈ .5 − .5 = 0

d.  In order for the normal approximation to be valid, μ ± 3σ must lie in the interval (0, n). This check was done in part c for both portions of the question. In both cases, the normal approximation was justified.

4.136

a.  If 80% of the passengers pass through without their luggage being inspected, then 20% will be detained for luggage inspection. The expected number of passengers detained will be:
    E(x) = np = 1,500(.2) = 300

b.  For n = 4,000, E(x) = np = 4,000(.2) = 800

c.  P(x > 600) ≈ P(z > ((600 + .5) − 800)/√(4000(.2)(.8))) = P(z > −7.89) = .5 + .5 = 1.0

4.140

E(x) = μ = Σxp(x) = 1(.2) + 2(.3) + 3(.2) + 4(.2) + 5(.1) = .2 + .6 + .6 + .8 + .5 = 2.7

E(x̄) = Σx̄p(x̄) = 1.0(.04) + 1.5(.12) + 2.0(.17) + 2.5(.20) + 3.0(.20) + 3.5(.14) + 4.0(.08) + 4.5(.04) + 5.0(.01)
     = .04 + .18 + .34 + .50 + .60 + .49 + .32 + .18 + .05 = 2.7

4.144

The sampling distribution is approximately normal only if the sample size is sufficiently large
or if the population being sampled from is normal.

4.146

a.  μx̄ = μ = 10, σx̄ = σ/√n = 3/√25 = 0.6

b.  μx̄ = μ = 100, σx̄ = σ/√n = 25/√25 = 5

c.  μx̄ = μ = 20, σx̄ = σ/√n = 40/√25 = 8

d.  μx̄ = μ = 10, σx̄ = σ/√n = 100/√25 = 20


4.148

4.150

a.  μx̄ = μ = 20, σx̄ = σ/√n = 16/√64 = 2

b.  By the Central Limit Theorem, the distribution of x̄ is approximately normal. In order for the Central Limit Theorem to apply, n must be sufficiently large. For this problem, n = 64 is sufficiently large.

c.  z = (x̄ − μx̄)/σx̄ = (15.5 − 20)/2 = −2.25

d.  z = (x̄ − μx̄)/σx̄ = (23 − 20)/2 = 1.50

4.150

For this population and sample size,

    E(x̄) = μ = 100, σx̄ = σ/√n = 10/√900 = 1/3

a.  Approximately 95% of the time, x̄ will be within two standard deviations of its mean, i.e., μ ± 2σx̄ → 100 ± 2(1/3) → (99.33, 100.67). Almost all of the time, the sample mean will be within three standard deviations of the mean, i.e., μ ± 3σx̄ → 100 ± 3(1/3) → 100 ± 1 → (99, 101).

b.  No more than three standard deviations, i.e., 3σx̄ = 3(1/3) = 1.

c.  No, the previous answer depended only on the standard deviation of the sampling distribution of the sample mean, not on the mean itself.

4.154

a.  μx̄ = μ = 98,500

b.  σx̄ = σ/√n = 30,000/√50 = 4,242.6407

c.  By the Central Limit Theorem, the sampling distribution of x̄ is approximately normal.

d.  z = (x̄ − μx̄)/σx̄ = (89,500 − 98,500)/4,242.6407 = −2.12

e.  P(x̄ > 89,500) = P(z > −2.12) = .5 + .4830 = .9830 (Using Table IV, Appendix B)
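A brief Python sketch (assuming SciPy) of the same sampling-distribution calculation, using the Central Limit Theorem result that x̄ is approximately normal with standard error σ/√n:

    # Sampling distribution of the sample mean (Exercise 4.154).
    from math import sqrt
    from scipy.stats import norm

    mu, sigma, n = 98_500, 30_000, 50
    se = sigma / sqrt(n)                        # 4,242.64
    z = (89_500 - mu) / se                      # -2.12
    print(round(se, 2), round(z, 2))
    print(round(1 - norm.cdf(89_500, mu, se), 4))   # P(x-bar > 89,500), about .9830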

4.156

a.

μx̄ = μ = 89.34; σx̄ = σ/√n = 7.74/√35 = 1.3083

b.

c.

d.

4.158

a.

P(x̄ > 88) = P(z > (88 − 89.34)/1.3083) = P(z > −1.02) = .5 + .3461 = .8461 (using Table IV, Appendix B)

P(x̄ < 87) = P(z < (87 − 89.34)/1.3083) = P(z < −1.79) = .5 − .4633 = .0367 (using Table IV, Appendix B)

Since the sample size is small, we also have to assume that the distribution from which the sample was drawn is normal. μx̄ = μ = 1.8, σx̄ = σ/√n = .5/√20 = .1118

    P(x̄ ≥ 1.85) = P(z ≥ (1.85 − 1.8)/.1118) = P(z ≥ 0.45) = .5 − .1736 = .3264 (using Table IV, Appendix B)

b.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Rough

Variable    N    N*   Mean    SE Mean   StDev   Minimum   Q1      Median   Q3      Maximum
Rough       20   0    1.881   0.117     0.524   1.060     1.303   2.040    2.293   2.640

From this output, the value of x̄ is 1.881.


c.  For x̄ = 1.881:
    P(x̄ ≥ 1.881) = P(z ≥ (1.881 − 1.8)/.1118) = P(z ≥ 0.72) = .5 − .2642 = .2358
    Since this probability is fairly high, observing a sample mean of x̄ = 1.881 is not unusual. The assumptions in part a appear to be valid.
4.160

If the observations are independent of each other, then

P(1, 1) = p(1)p(1) = .2(.2) = .04


P(1, 2) = p(1)p(2) = .2(.3) = .06
P(1, 3) = p(1)p(3) = .2(.2) = .04
etc.


a.  Possible Samples    x̄     p(x̄)        Possible Samples    x̄     p(x̄)
    1, 1              1.0     .04          3, 4              3.5     .04
    1, 2              1.5     .06          3, 5              4.0     .02
    1, 3              2.0     .04          4, 1              2.5     .04
    1, 4              2.5     .04          4, 2              3.0     .06
    1, 5              3.0     .02          4, 3              3.5     .04
    2, 1              1.5     .06          4, 4              4.0     .04
    2, 2              2.0     .09          4, 5              4.5     .02
    2, 3              2.5     .06          5, 1              3.0     .02
    2, 4              3.0     .06          5, 2              3.5     .03
    2, 5              3.5     .03          5, 3              4.0     .02
    3, 1              2.0     .04          5, 4              4.5     .02
    3, 2              2.5     .06          5, 5              5.0     .01
    3, 3              3.0     .04

    Summing the probabilities, the probability distribution of x̄ is:

    x̄       1    1.5     2    2.5     3    3.5     4    4.5     5
    p(x̄)  .04   .12   .17   .20   .20   .14   .08   .04   .01

b.


c.  P(x̄ ≥ 4.5) = .04 + .01 = .05

d.  No. The probability of observing x̄ = 4.5 or larger is small (.05).
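The sampling distribution of x̄ above can also be built by brute-force enumeration. This is an illustrative sketch (assuming Python), not part of the original solution.

    from itertools import product
    from collections import defaultdict

    p = {1: .2, 2: .3, 3: .2, 4: .2, 5: .1}          # population distribution of x

    dist = defaultdict(float)                        # sampling distribution of x-bar, n = 2
    for x1, x2 in product(p, repeat=2):
        dist[(x1 + x2) / 2] += p[x1] * p[x2]         # independence: p(x1, x2) = p(x1)p(x2)

    for xbar in sorted(dist):
        print(xbar, round(dist[xbar], 2))
    print(sum(xbar * pr for xbar, pr in dist.items()))          # E(x-bar) = 2.7
    print(sum(pr for xbar, pr in dist.items() if xbar >= 4.5))  # P(x-bar >= 4.5) = .05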


4.162

For n = 36, μx̄ = μ = 406 and σx̄ = σ/√n = 10.1/√36 = 1.6833. By the Central Limit
Theorem, the sampling distribution of x̄ is approximately normal (n is large).

    P(x̄ ≤ 400.8) = P(z ≤ (400.8 - 406)/1.6833) = P(z ≤ -3.09) = .5 - .4990 = .0010
    (using Table IV, Appendix B)

The first. If the true value of μ is 406, it would be extremely unlikely to observe an x̄ as small
as 400.8 or smaller (probability .0010). Thus, we would infer that the true value of μ is less
than 406.

4.164

4.166

a.

This experiment consists of 100 trials. Each trial results in one of two outcomes: chip is
defective or not defective. If the number of chips produced in one hour is much larger
than 100, then we can assume the probability of a defective chip is the same on each trial
and that the trials are independent. Thus, x is a binomial. If, however, the number of
chips produced in an hour is not much larger than 100, the trials would not be
independent. Then x would not be a binomial random variable.

b.

This experiment consists of two trials. Each trial results in one of two outcomes:
applicant qualified or not qualified. However, the trials are not independent. The
probability of selecting a qualified applicant on the first trial is 3 out of 5. The
probability of selecting a qualified applicant on the second trial depends on what
happened on the first trial. Thus, x is not a binomial random variable. It is a
hypergeometric random variable.

c.

The number of trials is not a specified number in this experiment, thus x is not a binomial
random variable. In this experiment, x is counting the number of calls received.

d.

The number of trials in this experiment is 1000. Each trial can result in one of two
outcomes: favor state income tax or not favor state income tax. Since 1000 is small
compared to the number of registered voters in Florida, the probability of selecting a
voter in favor of the state income tax is the same from trial to trial, and the trials are
independent of each other. Thus, x is a binomial random variable.

a.  μ = E(x) = Σx p(x) = 10(.2) + 12(.3) + 18(.1) + 20(.4) = 15.4

    σ² = E[(x - μ)²] = Σ(x - μ)² p(x)
       = (10 - 15.4)²(.2) + (12 - 15.4)²(.3) + (18 - 15.4)²(.1) + (20 - 15.4)²(.4) = 18.44

    σ = √18.44 = 4.294

b.  P(x < 15) = p(10) + p(12) = .2 + .3 = .5

c.  μ ± 2σ ⇒ 15.4 ± 2(4.294) ⇒ (6.812, 23.988)

d.  P(6.812 < x < 23.988) = .2 + .3 + .1 + .4 = 1.0


4.168

Using Table III, Appendix B,

a.  When λ = 2, p(3) = P(x ≤ 3) - P(x ≤ 2) = .857 - .677 = .180

b.  When λ = 1, p(4) = P(x ≤ 4) - P(x ≤ 3) = .996 - .981 = .015

c.  When λ = .5, p(2) = P(x ≤ 2) - P(x ≤ 1) = .986 - .910 = .076

4.170

a.  f(x) = 1/(d - c) = 1/(90 - 10) = 1/80   for 10 ≤ x ≤ 90
    f(x) = 0   otherwise

b.  μ = (c + d)/2 = (10 + 90)/2 = 50
    σ = (d - c)/√12 = (90 - 10)/√12 = 23.094011

c.  The interval μ ± 2σ ⇒ 50 ± 2(23.094) ⇒ 50 ± 46.188 ⇒ (3.812, 96.188) is indicated
    on the graph.

d.  P(x ≤ 60) = Base(height) = (60 - 10)(1/80) = 5/8 = .625

e.  P(x ≥ 90) = 0

f.  P(x ≤ 80) = Base(height) = (80 - 10)(1/80) = 7/8 = .875

g.  P(μ - σ ≤ x ≤ μ + σ) = P(50 - 23.094 ≤ x ≤ 50 + 23.094)
       = P(26.906 ≤ x ≤ 73.094)
       = Base(height)
       = (73.094 - 26.906)(1/80) = 46.188/80 = .577

h.  P(x > 75) = Base(height) = (90 - 75)(1/80) = 15/80 = .1875


4.172

a.  P(z ≤ z0) = .5080
    P(0 ≤ z ≤ z0) = .5080 - .5 = .0080
    Looking up the area .0080 in Table IV, z0 = .02

b.  P(z ≥ z0) = .5517
    P(z0 ≤ z ≤ 0) = .5517 - .5 = .0517
    Looking up the area .0517 in Table IV, z0 = -.13

c.  P(z ≥ z0) = .1492
    P(0 ≤ z ≤ z0) = .5 - .1492 = .3508
    Looking up the area .3508 in Table IV, z0 = 1.04

d.  P(z0 ≤ z ≤ .59) = .4773
    P(z0 ≤ z ≤ 0) + P(0 ≤ z ≤ .59) = .4773
    P(0 ≤ z ≤ .59) = .2224
    Thus, P(z0 ≤ z ≤ 0) = .4773 - .2224 = .2549
    Looking up the area .2549 in Table IV, z0 = -.69

4.174

μ = np = 100(.5) = 50, σ = √(npq) = √(100(.5)(.5)) = 5

a.  P(x ≤ 48) ≈ P(z ≤ ((48 + .5) - 50)/5) = P(z ≤ -.30) = .5 - .1179 = .3821

b.  P(50 ≤ x ≤ 65) ≈ P(((50 - .5) - 50)/5 ≤ z ≤ ((65 + .5) - 50)/5)
       = P(-.10 ≤ z ≤ 3.10) = .0398 + .5000 = .5398

c.  P(x ≥ 70) ≈ P(z ≥ ((70 - .5) - 50)/5) = P(z ≥ 3.90) = .5 - .5 = 0

d.  P(55 ≤ x ≤ 58) ≈ P(((55 - .5) - 50)/5 ≤ z ≤ ((58 + .5) - 50)/5)
       = P(.90 ≤ z ≤ 1.70)
       = P(0 ≤ z ≤ 1.70) - P(0 ≤ z ≤ .90)
       = .4554 - .3159 = .1395

e.  P(x = 62) ≈ P(((62 - .5) - 50)/5 ≤ z ≤ ((62 + .5) - 50)/5)
       = P(2.30 ≤ z ≤ 2.50)
       = P(0 ≤ z ≤ 2.50) - P(0 ≤ z ≤ 2.30)
       = .4938 - .4893 = .0045

f.  P(x ≤ 49 or x ≥ 72) ≈ P(z ≤ ((49 + .5) - 50)/5) + P(z ≥ ((72 - .5) - 50)/5)
       = P(z ≤ -.10) + P(z ≥ 4.30)
       = (.5 - .0398) + (.5 - .5) = .4602
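For reference, the continuity-corrected approximations above can be compared against exact binomial probabilities. This is a sketch assuming Python with scipy; the exact values differ slightly from the table-based answers.

    from scipy.stats import binom, norm

    n, p = 100, 0.5
    mu, sigma = n * p, (n * p * (1 - p)) ** 0.5      # 50 and 5

    # part a: P(x <= 48), exact vs. normal approximation
    print(binom.cdf(48, n, p), norm.cdf((48.5 - mu) / sigma))

    # part e: P(x = 62), exact vs. normal approximation
    approx = norm.cdf((62.5 - mu) / sigma) - norm.cdf((61.5 - mu) / sigma)
    print(binom.pmf(62, n, p), approx)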

4.176

a.  First we must compute μ and σ. The probability distribution for x is:

    x      1    2    3    4
    p(x)  .3   .2   .2   .3

    μ = E(x) = Σx p(x) = 1(.3) + 2(.2) + 3(.2) + 4(.3) = 2.5

    σ² = E[(x - μ)²] = Σ(x - μ)² p(x)
       = (1 - 2.5)²(.3) + (2 - 2.5)²(.2) + (3 - 2.5)²(.2) + (4 - 2.5)²(.3) = 1.45

    μx̄ = μ = 2.5,  σx̄ = σ/√n = √1.45/√40 = .1904

b.  By the Central Limit Theorem, the distribution of x̄ is approximately normal. The sample
    size, n = 40, is sufficiently large. Our answer does depend on n. If n is not sufficiently
    large, the Central Limit Theorem would not apply.

4.180

a.

    In order to be a binomial random variable, the five characteristics must hold.

    1.  For this problem, there are 5 items scanned. We will assume that these 5 trials are
        identical.
    2.  For each item scanned, there are 2 possible outcomes: priced incorrectly (S) or
        priced correctly (F).
    3.  The probability of being priced incorrectly remains constant from trial to trial. For
        this problem, we will assume that the probability of being priced incorrectly is
        P(S) = 1/30 for each trial.
    4.  We will assume that whether one item is priced incorrectly is independent of any
        other.
    5.  The random variable x is the number of items priced incorrectly in 5 trials.

    Thus, x is a binomial random variable.


b.

The estimate of p, the probability of an item being priced incorrectly is 1/30.

c.  P(x = 1) = (5!/(1!·4!))(1/30)¹(29/30)⁴ = 5(1/30)(29/30)⁴ = .1455

d.  P(x ≥ 1) = 1 - P(x = 0) = 1 - (5!/(0!·5!))(1/30)⁰(29/30)⁵ = 1 - .8441 = .1559

e.  Let x = number of items with incorrect prices in 10,000 trials. Thus, x is a binomial
    random variable with n = 10,000 and p = 1/30 = .033.

    μ ± 3σ ⇒ np ± 3√(npq) ⇒ 10,000(.033) ± 3√(10,000(.033)(.967))
    ⇒ 330 ± 3√319.11 ⇒ 330 ± 3(17.864) ⇒ 330 ± 53.591 ⇒ (276.409, 383.591)

    Since the interval lies in the range 0 to 10,000, we can use the normal approximation to
    approximate the probabilities.

    P(x ≥ 100) ≈ P(z ≥ ((100 - .5) - 330)/17.864) = P(z ≥ -12.90)
       = P(-12.90 ≤ z < 0) + .5 ≈ .5 + .5 = 1  (using Table IV, Appendix B)


f.  Let x = number of items with incorrect prices in 100 trials. Thus, x is a binomial random
    variable with n = 100 and p = 1/30 = .033.

    μ ± 3σ ⇒ np ± 3√(npq) ⇒ 100(.033) ± 3√(100(.033)(.967))
    ⇒ 3.3 ± 3√3.191 ⇒ 3.3 ± 3(1.786) ⇒ 3.3 ± 5.358 ⇒ (-2.058, 8.658)

    Since the interval does not lie in the range 0 to 100, the normal approximation will not be
    appropriate.
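Parts c and d can be checked directly with the binomial formula. A sketch assuming Python with scipy, with p = 1/30 as in the solution; not part of the original answer.

    from scipy.stats import binom

    n, p = 5, 1 / 30
    print(binom.pmf(1, n, p))        # P(x = 1), about .1455
    print(1 - binom.pmf(0, n, p))    # P(x >= 1), about .1559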
4.182

a.  Using Table IV, Appendix B, with μ = 8.72 and σ = 1.10,

    P(x < 6) = P(z < (6 - 8.72)/1.10) = P(z < -2.47) = .5 - .4932 = .0068

    Thus, approximately .68% of the games would result in fewer than 6 hits.

b.  The probability of observing fewer than 6 hits in a game is p = .0068. The probability of
    observing 0 hits would be even smaller. Thus, it would be extremely unusual to observe
    a no-hitter.

4.184

a.  Using Table III, Appendix B, with λ = 1, P(x = 3) = P(x ≤ 3) - P(x ≤ 2) = .981 - .920
    = .061

b.  P(x > 2) = 1 - P(x ≤ 2) = 1 - .920 = .080


4.186

a.

Let x = number of employees who have a drug problem in 1,000 trials. Then x is a
binomial random variable with n = 1,000 and p = .052.
E(x) = np = 1,000(.052) = 52

b.  Let x = number of employees who have an alcohol problem in 10 trials. Then x is a
    binomial random variable with n = 10 and p = .085.

    P(x ≥ 1) = 1 - P(x = 0) = 1 - (10!/(0!·10!))(.085)⁰(.915)¹⁰ = 1 - .4113 = .5887

    P(x = 2) = (10!/(2!·8!))(.085)²(.915)⁸ = 45(.085)²(.915)⁸ = .1597

c.


We had to assume that the probability of an employee having a substance abuse problem
was constant from trial to trial and that the trials were independent.


4.188

Let x = demand for white bread. Then x is a normal random variable with μ = 7,200 and
σ = 300.

a.  P(x ≤ x0) = .94. Find x0.

    P(x ≤ x0) = P(z ≤ (x0 - 7200)/300) = P(z ≤ z0) = .94

    A1 = .94 - .50 = .4400

    Using Table IV and area .4400, z0 = 1.555.

    z0 = (x0 - 7200)/300 ⇒ 1.555 = (x0 - 7200)/300 ⇒ x0 = 7666.5 ≈ 7667

b.  If the company produces 7,667 loaves, the company will be left with more than 500
    loaves if the demand is less than 7,667 - 500 = 7,167.

    P(x < 7167) = P(z < (7167 - 7200)/300) = P(z < -.11) = .5 - .0438 = .4562
    (from Table IV, Appendix B)

    Thus, on 45.62% of the days the company will be left with more than 500 loaves.
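The inverse-normal step in part a can be reproduced numerically. This is a sketch assuming Python with scipy; scipy's exact quantile (about 1.5548) matches the interpolated table value of 1.555.

    from scipy.stats import norm

    mu, sigma = 7200, 300
    z0 = norm.ppf(0.94)                    # about 1.5548
    x0 = mu + z0 * sigma                   # about 7,666.4, round up to 7,667 loaves
    print(z0, x0)
    print(norm.cdf((7167 - mu) / sigma))   # part b: P(x < 7,167), about .456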
4.190

Let x = number of inches a gouge is from one end of the spindle. Then x has a uniform
distribution with f(x) as follows:

    f(x) = 1/(d - c) = 1/(18 - 0) = 1/18   for 0 ≤ x ≤ 18
    f(x) = 0   otherwise

In order to get at least 14 consecutive inches without a gouge, the gouge must be within 4
inches of either end. Thus, we must find:

    P(x < 4) + P(x > 14) = (4 - 0)(1/18) + (18 - 14)(1/18) = 4/18 + 4/18 = 8/18 = .4444
4.192

a.  μx̄ = μ = 3.5, σx̄ = σ/√n = .5/√100 = .05

b.  P(3.40 < x̄ < 3.60) = P((3.40 - 3.5)/.05 < z < (3.60 - 3.5)/.05)
       = P(-2 < z < 2) = .4772 + .4772 = .9544
    (using Table IV, Appendix B)

c.  P(x̄ > 3.62) = P(z > (3.62 - 3.5)/.05) = P(z > 2.40) = .5 - .4918 = .0082
    (using Table IV, Appendix B)


d.  μx̄ = μ = 3.5, σx̄ = σ/√n = .5/√200 = .03536

    The mean of the sampling distribution of x̄ would stay the same, but the standard deviation
    would decrease.

    P(3.40 < x̄ < 3.60) = P((3.40 - 3.5)/.03536 < z < (3.60 - 3.5)/.03536)
       = P(-2.83 < z < 2.83) = .4977 + .4977 = .9954
    (using Table IV, Appendix B)

    This probability is larger than when the sample size was 100.

    P(x̄ > 3.62) = P(z > (3.62 - 3.5)/.03536) = P(z > 3.39) ≈ .5 - .5 = 0
    (using Table IV, Appendix B)

    This probability is smaller than when the sample size was 100.

4.194

a.  Let p1 = probability of an error = 1/100 = .01 and p2 = probability of an error resulting in
    a significant problem = 1/500 = .002.

    Let x = number of errors in 60,000 trials. Then E(x) = μ1 = np1 = 60,000(.01) = 600.

b.  Let y = number of significant errors in 60,000 trials. Then E(y) = μ2 = np2 = 60,000(.002)
    = 120.

    σ² = np2q2 = 60,000(.002)(.998) = 119.76
    σ = √119.76 = 10.94

    μ2 ± 3σ ⇒ 120 ± 3(10.94) ⇒ 120 ± 32.82 ⇒ (87.18, 152.82)

    Using Chebyshev's Rule, at least 88.9% of the observations will fall within 3 standard
    deviations of the mean. We would expect the number of significant errors to fall between
    87 and 153.

c.  We must assume that the trials are independent and that the probability of a significant
    error is constant from trial to trial.

4.196

a.  By the Central Limit Theorem, the sampling distribution of x̄ is approximately normal
    since n > 30, and

    μx̄ = μ = 840, σx̄ = σ/√n = 15/√50 = 2.1213

b.  P(x̄ ≤ 830) = P(z ≤ (830 - 840)/2.1213) = P(z ≤ -4.71) ≈ .5 - .5 = 0

c.  Since the probability of observing a mean of 830 or less is extremely small (≈0) if the true
    mean is 840, we would tend to believe that the mean is not 840, but something less.


d.  By the Central Limit Theorem, the sampling distribution of x̄ is approximately normal
    since n > 30, and

    μx̄ = μ = 840, σx̄ = σ/√n = 45/√50 = 6.3640

    P(x̄ ≤ 830) = P(z ≤ (830 - 840)/6.3640) = P(z ≤ -1.57) ≈ .5 - .4418 = .0582

4.198

Let x = length of time a bus is late. Then x is a uniform random variable with probability
distribution:

    f(x) = 1/20   for 0 ≤ x ≤ 20
    f(x) = 0   otherwise

a.  μ = (0 + 20)/2 = 10

b.  P(x ≥ 19) = (20 - 19)(1/20) = 1/20 = .05

c.  It would be doubtful that the director's claim is true, since the probability of the bus being
    more than 19 minutes late is so small.


The Furniture Fire Case


(To accompany Chapters 34)

Using the entire data set of 3,005 invoices as the population, the mean profit margin is 48.9% and the
standard deviation is 13.8291%. If a random sample is selected from this population, the sampling
distribution of the sample mean (x̄) is approximately normal with a mean of 48.901% and a standard
deviation of 13.8291%/√n by the Central Limit Theorem. If a random sample of 253 invoices is
selected, then the probability of obtaining a sample mean of 50.8% or higher is:

    P(x̄ ≥ 50.8) = P(z ≥ (50.8 - 48.901)/(13.8291/√253)) = P(z ≥ 2.18) = .5 - .4854 = .0146

Since the probability of obtaining a sample mean of 50.8% or higher from this population is extremely
small (.0146), we would conclude that there is evidence of fraud.

If we look at the two samples separately, the evidence becomes even more damning. For the sample of
134 invoices, the probability of obtaining a sample mean of 50.6% or higher is:

    P(x̄1 ≥ 50.6) = P(z ≥ (50.6 - 48.901)/(13.8291/√134)) = P(z ≥ 1.42) = .5 - .4222 = .0778

For the sample of 119 invoices, the probability of obtaining a sample mean of 51.0% or higher is:

    P(x̄2 ≥ 51.0) = P(z ≥ (51.0 - 48.901)/(13.8291/√119)) = P(z ≥ 1.66) = .5 - .4515 = .0485

The probability of observing one sample mean of 50.6% or higher AND a second sample mean of
51.0% or higher is:

    P(x̄1 ≥ 50.6, x̄2 ≥ 51.0) = .0778(.0485) = .0038

Again, since the probability of obtaining two sample means of 50.6% or higher and 51.0% or higher
from this population is extremely small (.0038), we would conclude that there is evidence of fraud.
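The three tail probabilities in this case can be reproduced numerically. This is a sketch (assuming Python with scipy), not part of the original case solution; exact values differ in the third decimal because the table rounds z to two decimals.

    from math import sqrt
    from scipy.stats import norm

    mu, sigma = 48.901, 13.8291          # population mean and sd of profit margin (%)

    def tail(xbar, n):
        """P(sample mean >= xbar) for a sample of size n, by the CLT."""
        return norm.sf((xbar - mu) / (sigma / sqrt(n)))

    p_all = tail(50.8, 253)              # about .0146
    p1, p2 = tail(50.6, 134), tail(51.0, 119)
    print(p_all, p1, p2, p1 * p2)        # joint probability about .0038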


Inferences Based on a Single Sample:


Estimation with Confidence Intervals

5.2

a.  zα/2 = 1.96; using Table IV, Appendix B, P(0 ≤ z ≤ 1.96) = .4750. Thus, α/2 =
    .5000 - .4750 = .025, α = 2(.025) = .05, and 1 - α = 1 - .05 = .95. The confidence level is
    100%(.95) = 95%.

b.  zα/2 = 1.645; using Table IV, Appendix B, P(0 ≤ z ≤ 1.645) = .45. Thus, α/2 = .50 - .45 =
    .05, α = 2(.05) = .1, and 1 - α = 1 - .1 = .90. The confidence level is 100%(.90) = 90%.

c.  zα/2 = 2.575; using Table IV, Appendix B, P(0 ≤ z ≤ 2.575) = .495. Thus, α/2 = .500 -
    .495 = .005, α = 2(.005) = .01, and 1 - α = 1 - .01 = .99. The confidence level is
    100%(.99) = 99%.

d.  zα/2 = 1.282; using Table IV, Appendix B, P(0 ≤ z ≤ 1.282) = .4. Thus, α/2 = .5 - .4 = .1,
    α = 2(.1) = .2, and 1 - α = 1 - .2 = .80. The confidence level is 100%(.80) = 80%.

e.  zα/2 = .99; using Table IV, Appendix B, P(0 ≤ z ≤ .99) = .3389. Thus, α/2 = .5000 - .3389
    = .1611, α = 2(.1611) = .3222, and 1 - α = 1 - .3222 = .6778. The confidence level is
    100%(.6778) = 67.78%.

5.4

a.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    x̄ ± z.025 s/√n ⇒ 25.9 ± 1.96(2.7/√90) ⇒ 25.9 ± .56 ⇒ (25.34, 26.46)

b.  For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix
    B, z.05 = 1.645. The confidence interval is:

    x̄ ± z.05 s/√n ⇒ 25.9 ± 1.645(2.7/√90) ⇒ 25.9 ± .47 ⇒ (25.43, 26.37)

c.  For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix
    B, z.005 = 2.58. The confidence interval is:

    x̄ ± z.005 s/√n ⇒ 25.9 ± 2.58(2.7/√90) ⇒ 25.9 ± .73 ⇒ (25.17, 26.63)

5.6

If we were to repeatedly draw samples from the population and form the interval x̄ ± 1.96σx̄
each time, approximately 95% of the intervals would contain μ. We have no way of knowing
whether our interval estimate is one of the 95% that contain μ or one of the 5% that do not.
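The three intervals in Exercise 5.4 follow the same pattern, so a small helper makes the arithmetic easy to verify. A sketch assuming Python with scipy; scipy's z-multipliers are slightly more precise than the rounded table values, so the endpoints can differ in the last decimal.

    from math import sqrt
    from scipy.stats import norm

    def z_interval(xbar, s, n, conf):
        """Large-sample confidence interval for the mean: x-bar +/- z * s / sqrt(n)."""
        z = norm.ppf(1 - (1 - conf) / 2)
        half = z * s / sqrt(n)
        return xbar - half, xbar + half

    for conf in (.95, .90, .99):
        print(conf, z_interval(25.9, 2.7, 90, conf))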


5.8

a.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    x̄ ± z.025 s/√n ⇒ 33.9 ± 1.96(3.3/√100) ⇒ 33.9 ± .647 ⇒ (33.253, 34.547)

b.  x̄ ± z.025 s/√n ⇒ 33.9 ± 1.96(3.3/√400) ⇒ 33.9 ± .323 ⇒ (33.577, 34.223)

c.  For part a, the width of the interval is 2(.647) = 1.294. For part b, the width of the
    interval is 2(.323) = .646. When the sample size is quadrupled, the width of the
    confidence interval is halved.

5.10

a.  A point estimate for the average number of latex gloves used per week by all healthcare
    workers with latex allergy is x̄ = 19.3.

b.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ 19.3 ± 1.96(11.9/√46) ⇒ 19.3 ± 3.44 ⇒ (15.86, 22.74)

c.  We are 95% confident that the true average number of latex gloves used per week by all
    healthcare workers with a latex allergy is between 15.86 and 22.74.

d.  The conditions required for the interval to be valid are:

    a.  The sample selected was randomly selected from the target population.
    b.  The sample size is sufficiently large, i.e., n > 30.

5.12

a.

The point estimate for the mean charitable commitment of tax-exempt organizations is
x = 74.9667.

b.

From the printout, the 95% confidence interval is (68.2371, 81.6962).

c.

The probability of estimating the true mean charitable commitment with a single number
is 0. By estimating the true mean charitable commitment with an interval, we can be
pretty confident that the true mean is in the interval.


5.14

Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: r

    Variable    N     Mean   Median   TrMean    StDev   SE Mean
    r          34   0.4224   0.4300   0.4310   0.1998    0.0343

    Variable   Minimum   Maximum       Q1       Q3
    r          -0.0800    0.7400   0.2925   0.6000

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ .4224 ± 1.96(.1998/√34) ⇒ .4224 ± .0672 ⇒ (.3552, .4895)

We are 95% confident that the mean value of r is between .3552 and .4895.
5.16

a.  Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: Rate

    Variable    N    Mean   Median   TrMean   StDev   SE Mean
    Rate       30   79.73    80.00    80.15    5.96      1.09

    Variable   Minimum   Maximum      Q1      Q3
    Rate         60.00     90.00   76.75   84.00

    For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B,
    z.05 = 1.645. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ 79.73 ± 1.645(5.96/√30) ⇒ 79.73 ± 1.79 ⇒ (77.94, 81.52)

b.

We are 90% confident that the mean participation rate for all companies that have 401(k)
plans is between 77.94% and 81.52%.

c.

We must assume that the sample size (n = 30) is sufficiently large so that the Central
Limit Theorem applies.

d.

Yes. Since 71% is not included in the 90% confidence interval, it can be concluded that this
company's participation rate is lower than the population mean.

e.  The center of the confidence interval is x̄. If 60% is changed to 80%, the value of x̄ will
    increase, thus indicating that the center point will be larger. The value of s² will decrease if
    60% is replaced by 80%, thus causing the width of the interval to decrease.


5.18

a.

Using MINITAB, I generated 30 random numbers using the uniform distribution from 1
to 308. The random numbers were:
9, 15, 19, 36, 46, 47, 63, 73, 90, 92, 108, 112, 117, 127, 144, 145, 150, 151, 172, 178, 218,
229, 230, 241, 242, 246, 252, 267, 274, 282
I numbered the 308 observations in the order that they appear in the file. Using the random
numbers generated above, I selected the 9th, 15th, 19th, etc. observations for the sample.
The selected sample is:
.31, .34, .34, .50, .52, .53, .64, .72, .70, .70, .75, .78, 1.00, 1.00, 1.03, 1.04, 1.07, 1.10, .21,
.24, .58, 1.01, .50, .57, .58, .61, .70, .81, .85, 1.00

b.

Using MINITAB, the descriptive statistics for the sample of 30 observations are:

    Descriptive Statistics: carats-samp

    Variable    N     Mean   Median   TrMean    StDev   SE Mean
    carats-s   30   0.6910   0.7000   0.6965   0.2620    0.0478

    Variable   Minimum   Maximum       Q1       Q3
    carats-s    0.2100    1.1000   0.5150   1.0000

    From above, x̄ = .6910 and s = .2620.


c.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    x̄ ± z.025 s/√n ⇒ .691 ± 1.96(.262/√30) ⇒ .691 ± .094 ⇒ (.597, .785)

d.  We are 95% confident that the mean number of carats is between .597 and .785.

e.  From Exercise 2.47, we computed the population mean to be .631. This mean does fall
    in the 95% confidence interval we computed in part d.

5.20

x̄ = 11,298/5,000 = 2.26

For confidence coefficient .95, α = .05 and α/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ 2.26 ± 1.96(1.5/√5,000) ⇒ 2.26 ± .04 ⇒ (2.22, 2.30)

We are 95% confident the mean number of roaches produced per roach per week is between
2.22 and 2.30.


5.22

a.  If x is normally distributed, the sampling distribution of x̄ is normal, regardless of the
    sample size.

b.  If nothing is known about the distribution of x, the sampling distribution of x̄ is
    approximately normal if n is sufficiently large. If n is not large, the distribution of x̄ is
    unknown if the distribution of x is not known.

5.24

a.  P(t ≥ t0) = .025 where df = 11
    t0 = 2.201

b.  P(t ≥ t0) = .01 where df = 9
    t0 = 2.821

c.  P(t ≤ t0) = .005 where df = 6
    Because of symmetry, the statement can be rewritten as
    P(t ≥ -t0) = .005 where df = 6
    t0 = -3.707

d.  P(t ≥ t0) = .05 where df = 18
    t0 = 1.734

For this sample,

    x̄ = Σx/n = 1567/16 = 97.9375

    s² = [Σx² - (Σx)²/n]/(n - 1) = [155,867 - 1567²/16]/(16 - 1) = 159.9292

    s = √159.9292 = 12.6463

a.  For confidence coefficient .80, α = 1 - .80 = .20 and α/2 = .20/2 = .10. From Table VI,
    Appendix B, with df = n - 1 = 16 - 1 = 15, t.10 = 1.341. The 80% confidence interval for
    μ is:

    x̄ ± t.10 s/√n ⇒ 97.94 ± 1.341(12.6463/√16) ⇒ 97.94 ± 4.240 ⇒ (93.700, 102.180)

b.  For confidence coefficient .95, α = 1 - .95 = .05 and α/2 = .05/2 = .025. From Table VI,
    Appendix B, with df = n - 1 = 16 - 1 = 15, t.025 = 2.131. The 95% confidence interval for
    μ is:

    x̄ ± t.025 s/√n ⇒ 97.94 ± 2.131(12.6463/√16) ⇒ 97.94 ± 6.737 ⇒ (91.203, 104.677)

    The 95% confidence interval for μ is wider than the 80% confidence interval for μ found
    in part a.


c.

For part a:
We are 80% confident that the true population mean lies in the interval 93.700 to
102.180.
For part b:
We are 95% confident that the true population mean lies in the interval 91.203 to
104.677.
The 95% confidence interval is wider than the 80% confidence interval because the more
confident you want to be that lies in an interval, the wider the range of possible values.
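A numerical check of Exercise 5.26 (a sketch assuming Python with scipy, not part of the original solution). The summary statistics are rebuilt from the sums given above, and scipy's t quantiles replace the Table VI values, so the endpoints may differ slightly.

    from math import sqrt
    from scipy.stats import t

    n, sum_x, sum_x2 = 16, 1567, 155867
    xbar = sum_x / n                                 # 97.9375
    s = sqrt((sum_x2 - sum_x**2 / n) / (n - 1))      # about 12.646

    for conf in (.80, .95):
        tcrit = t.ppf(1 - (1 - conf) / 2, df=n - 1)
        half = tcrit * s / sqrt(n)
        print(conf, (xbar - half, xbar + half))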

5.28

a.

Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: MTBE

    Variable    N   N*   Mean   SE Mean   StDev   Minimum     Q1   Median      Q3   Maximum
    MTBE       12    0   97.2      32.8   113.8      8.00   12.0     50.5   146.0     367.0

    A point estimate for the true mean MTBE level for all well sites located near the New
    Jersey gasoline service station is x̄ = 97.2.
b.

    For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table VI, Appendix
    B, with df = n - 1 = 12 - 1 = 11, t.005 = 3.106. The 99% confidence interval is:

    x̄ ± t.005 s/√n ⇒ 97.2 ± 3.106(113.8/√12) ⇒ 97.2 ± 102.04 ⇒ (-4.84, 199.24)

    We are 99% confident that the true mean MTBE level for all well sites located near the
    New Jersey gasoline service station is between -4.84 and 199.24.
c.

We must assume that the data were sampled from a normal distribution. We will use the
four methods to check for normality. First, we will look at a histogram of the data. Using
MINITAB, the histogram of the data is:
    [Histogram of MTBE: frequency versus MTBE level]


    From the histogram, the data do not appear to be mound-shaped. This indicates that the
    data may not be normal.

    Next, we look at the intervals x̄ ± s, x̄ ± 2s, x̄ ± 3s. If the proportions of observations
    falling in each interval are approximately .68, .95, and 1.00, then the data are
    approximately normal.

    x̄ ± s ⇒ 97.2 ± 113.8 ⇒ (-16.6, 211.0)  10 of the 12 values fall in this interval. The
    proportion is .83. This is not very close to the .68 we would expect if the data were
    normal.

    x̄ ± 2s ⇒ 97.2 ± 2(113.8) ⇒ 97.2 ± 227.6 ⇒ (-130.4, 324.8)  11 of the 12 values fall in
    this interval. The proportion is .92. This is somewhat smaller than the .95 we would
    expect if the data were normal.

    x̄ ± 3s ⇒ 97.2 ± 3(113.8) ⇒ 97.2 ± 341.4 ⇒ (-244.2, 438.6)  12 of the 12 values fall in
    this interval. The proportion is 1.00. This is exactly the 1.00 we would expect if the data
    were normal.

    From this method, it appears that the data may not be normal.

    Next, we look at the ratio of the IQR to s. IQR = QU - QL = 146.0 - 12.0 = 134.0.

    IQR/s = 134.0/113.8 = 1.18  This is somewhat smaller than the 1.3 we would expect if the
    data were normal. This method indicates the data may not be normal.

Finally, using MINITAB, the normal probability plot is:


    [Normal probability plot of MTBE (normal, 95% CI): Mean = 97.17, StDev = 113.8,
    N = 12, AD = 0.929, P-Value = 0.012]

Since the data do not form a fairly straight line, the data may not be normal.
    From above, all of the methods indicate the data may not be normal. It appears that the
    data probably are not normal.


5.30

We must assume that the distribution of the LOS's for all patients is normal.
a.

    For confidence coefficient .90, α = 1 - .90 = .10 and α/2 = .10/2 = .05. From Table VI,
    Appendix B, with df = n - 1 = 20 - 1 = 19, t.05 = 1.729. The 90% confidence interval is:

    x̄ ± t.05 s/√n ⇒ 3.8 ± 1.729(1.2/√20) ⇒ 3.8 ± .464 ⇒ (3.336, 4.264)

b.  We are 90% confident that the mean LOS is between 3.336 and 4.264 days.

c.  90% confidence means that if repeated samples of size n are selected from a population and
    90% confidence intervals are constructed, 90% of all intervals thus constructed will contain
    the population mean.

5.32

a.  The 95% confidence interval for the mean surface roughness of coated interior pipe is
    (1.63580, 2.12620).

b.  No. Since 2.5 does not fall in the 95% confidence interval, it would be very unlikely that
    the average surface roughness would be as high as 2.5 micrometers.

5.34

a.

The population is the set of all DOT permanent count stations in the state of Florida.

b.

Yes. There are several types of routes included in the sample. There are 3 recreational
areas, 7 rural areas, 5 small cities, and 5 urban areas.

c.

Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: 30th hour, 100th hour

    Variable    N   Mean   Median   TrMean   StDev   SE Mean
    30th hou   20   2206     2064     2165    1224       274
    100th ho   20   2096     1999     2048    1203       269

    Variable   Minimum   Maximum     Q1     Q3
    30th hou       252      4905   1429   3068
    100th ho       229      4815   1318   2877

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI,
    Appendix B, with df = n - 1 = 20 - 1 = 19, t.025 = 2.093. The 95% confidence interval is:

    x̄ ± t.025 s/√n ⇒ 2,206 ± 2.093(1,224/√20) ⇒ 2,206 ± 572.84 ⇒ (1,633.16, 2,778.84)

    We are 95% confident that the mean traffic count at the 30th highest hour is between
    1,633.16 and 2,778.84.
d.


We must assume that the distribution of the traffic counts at the 30th highest hour is
normal. From the stem-and-leaf display, the data look fairly mound-shaped. Thus, the
assumption of normality is probably met.


e.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI, Appendix
    B, with df = n - 1 = 20 - 1 = 19, t.025 = 2.093. The 95% confidence interval is:

    x̄ ± t.025 s/√n ⇒ 2,096 ± 2.093(1,203/√20) ⇒ 2,096 ± 563.01 ⇒ (1,532.99, 2,659.01)

    We are 95% confident that the mean traffic count at the 100th highest hour is between
    1,532.99 and 2,659.01.

    We must assume that the distribution of the traffic counts at the 100th highest hour is
    normal. From the stem-and-leaf display, the data look fairly mound-shaped. Thus, the
    assumption of normality is probably met.
f.

    If μ = 2,700, it is very possible that it is the mean count for the 30th highest hour. It falls
    in the 95% confidence interval for the mean count for the 30th highest hour. It is not very
    likely that the mean count for the 100th highest hour is 2,700. It does not fall in the 95%
    confidence interval for the mean count for the 100th highest hour. (See parts c and e
    above.)

5.36

By the Central Limit Theorem, the sampling distribution of p̂ is approximately normal with
mean μp̂ = p and standard deviation σp̂ = √(pq/n).

5.38

a.  The sample size is large enough if the interval p̂ ± 3σp̂ does not include 0 or 1.

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .88 ± 3√(.88(1 - .88)/121) ⇒ .88 ± .089 ⇒ (.791, .969)

    Since the interval lies within the interval (0, 1), the normal approximation will be
    adequate.

b.  For confidence coefficient .90, α = .10 and α/2 = .05. From Table IV, Appendix B,
    z.05 = 1.645. The 90% confidence interval is:

    p̂ ± z.05√(p̂q̂/n) ⇒ .88 ± 1.645√(.88(.12)/121) ⇒ .88 ± .049 ⇒ (.831, .929)

c.  We must assume that the sample is a random sample from the population of interest.
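The size check and the interval in Exercise 5.38 can be verified with a few lines. A sketch assuming Python with scipy; this is the simple large-sample (Wald) interval used throughout the chapter, coded directly rather than taken from a library routine.

    from math import sqrt
    from scipy.stats import norm

    p_hat, n = 0.88, 121
    se = sqrt(p_hat * (1 - p_hat) / n)

    # normality check: p-hat +/- 3 SE must stay inside (0, 1)
    print(p_hat - 3 * se, p_hat + 3 * se)      # about (.791, .969)

    z = norm.ppf(0.95)                         # 90% confidence -> z_.05, about 1.645
    print(p_hat - z * se, p_hat + z * se)      # about (.831, .929)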


5.40

a.  Of the 50 observations, 15 like the product ⇒ p̂ = 15/50 = .30.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .3 ± 3√(.3(.7)/50) ⇒ .3 ± .194 ⇒ (.106, .494)

    Since this interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For the confidence coefficient .80, α = .20 and α/2 = .10. From Table IV, Appendix B,
    z.10 = 1.28. The confidence interval is:

    p̂ ± z.10√(p̂q̂/n) ⇒ .3 ± 1.28√(.3(.7)/50) ⇒ .3 ± .083 ⇒ (.217, .383)

b.  We are 80% confident the proportion of all consumers who like the new snack food is
    between .217 and .383.

5.42

a.  The point estimate of p is p̂ = .11.

b.  To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .11 ± 3√(.11(.89)/150) ⇒ .11 ± .077 ⇒ (.033, .187)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .11 ± 1.96√(.11(.89)/150) ⇒ .11 ± .05 ⇒ (.06, .16)

c.  We are 95% confident that the true proportion of MSDS that are satisfactorily completed
    is between .06 and .16.

5.44

a.

    The point estimate of p is p̂ = x/n = 16/308 = .052.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .052 ± 3√(.052(.948)/308) ⇒ .052 ± .038 ⇒ (.014, .090)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix
    B, z.005 = 2.58. The confidence interval is:

    p̂ ± z.005√(p̂q̂/n) ⇒ .052 ± 2.58√(.052(.948)/308) ⇒ .052 ± .033 ⇒ (.019, .085)

    We are 99% confident that the true proportion of diamonds for sale that are classified as
    D color is between .019 and .085.

b.  The point estimate of p is p̂ = x/n = 81/308 = .263.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .263 ± 3√(.263(.737)/308) ⇒ .263 ± .075 ⇒ (.188, .338)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix
    B, z.005 = 2.58. The confidence interval is:

    p̂ ± z.005√(p̂q̂/n) ⇒ .263 ± 2.58√(.263(.737)/308) ⇒ .263 ± .065 ⇒ (.198, .328)

    We are 99% confident that the true proportion of diamonds for sale that are classified as
    VS1 clarity is between .198 and .328.
5.46

a.

The population is all senior human resource executives at U.S. companies.

b.

The population parameter of interest is p, the proportion of all senior human resource
executives at U.S. companies who believe that their hiring managers are interviewing too
many people to find qualified candidates for the job.

c.  The point estimate of p is p̂ = x/n = 211/502 = .42. To see if the sample size is
    sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .42 ± 3√(.42(.58)/502) ⇒ .42 ± .066 ⇒ (.354, .486)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.


d.  For confidence coefficient .98, α = .02 and α/2 = .02/2 = .01. From Table IV,
    Appendix B, z.01 = 2.33. The confidence interval is:

    p̂ ± z.01√(p̂q̂/n) ⇒ .42 ± 2.33√(.42(.58)/502) ⇒ .42 ± .051 ⇒ (.369, .471)

    We are 98% confident that the true proportion of all senior human resource executives at
    U.S. companies who believe that their hiring managers are interviewing too many people
    to find qualified candidates for the job is between .369 and .471.

5.48

e.

A 90% confidence interval would be narrower. If the interval was narrower, it would
contain fewer values, thus, we would be less confident.

a.  The point estimate of p is p̂ = x/n = 35/55 = .636.

b.  We must check to see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .636 ± 3√(.636(.364)/55) ⇒ .636 ± .195 ⇒ (.441, .831)

    Since the interval is wholly contained in the interval (0, 1), we may assume that the
    normal approximation is reasonable.

    For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV,
    Appendix B, z.005 = 2.575. The confidence interval is:

    p̂ ± z.005√(p̂q̂/n) ⇒ .636 ± 2.575√(.636(.364)/55) ⇒ .636 ± .167 ⇒ (.469, .803)

c.  We are 99% confident that the true proportion of fatal accidents involving children is
    between .469 and .803.

d.  The sample proportion of children killed by air bags who were not wearing seat belts or
    were improperly restrained is 24/35 = .686. This is a rather large proportion. Whether a
    child is killed by an airbag could be related to whether or not he/she was properly
    restrained. Thus, the number of children killed by air bags could possibly be reduced if
    the child were properly restrained.

5.50

The point estimate of p is p̂ = x/n = 36/83 = .434.

To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .434 ± 3√(.434(.566)/83) ⇒ .434 ± .163 ⇒ (.271, .597)

Since the interval is wholly contained in the interval (0, 1), we may conclude that the normal
approximation is reasonable.


For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .434 ± 1.96√(.434(.566)/83) ⇒ .434 ± .107 ⇒ (.327, .541)

We are 95% confident that the true proportion of healthcare workers with latex allergies who
actually suspect that they have the allergy is between .327 and .541.
5.52

To compute the necessary sample size, use

    n = (zα/2)²σ²/SE²   where α = 1 - .95 = .05 and α/2 = .05/2 = .025.

From Table IV, Appendix B, z.025 = 1.96. Thus,

    n = (1.96)²(7.2)/.3² = 307.328 ≈ 308

You would need to take 308 samples.
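The sample-size formula n = (zα/2)²σ²/SE² is easy to script. A sketch assuming Python with scipy and, as in the solution above, σ² = 7.2 and SE = .3; the answer is always rounded up.

    from math import ceil
    from scipy.stats import norm

    def n_for_mean(sigma2, se, conf):
        """Sample size needed to estimate a mean to within se with the given confidence."""
        z = norm.ppf(1 - (1 - conf) / 2)
        return ceil(z**2 * sigma2 / se**2)

    print(n_for_mean(7.2, 0.3, 0.95))      # 308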


5.54

a.  To compute the needed sample size, use:

    n = (zα/2)²pq/SE²   where z.025 = 1.96 from Table IV, Appendix B.

    Thus, n = (1.96)²(.2)(.8)/.08² = 96.04 ≈ 97

    You would need to take a sample of size 97.

b.  To compute the needed sample size, use:

    n = (zα/2)²pq/SE² = (1.96)²(.5)(.5)/.08² = 150.0625 ≈ 151

    You would need to take a sample of size 151.


5.56

a.

    For a width of 5 units, SE = 5/2 = 2.5.

    To compute the needed sample size, use

    n = (zα/2)²σ²/SE²   where α = 1 - .95 = .05 and α/2 = .025.


    From Table IV, Appendix B, z.025 = 1.96. Thus,

    n = (1.96)²(14)²/2.5² = 120.47 ≈ 121

You would need to take 121 samples at a cost of 121($10) = $1210.


Yes, you do have sufficient funds.
b.  For confidence coefficient .90, α = 1 - .90 = .10 and α/2 = .10/2 = .05. From Table IV,
    Appendix B, z.05 = 1.645.

    n = (1.645)²(14)²/2.5² = 84.86 ≈ 85

You would need to take 85 samples at a cost of 85($10) = $850.


You still have sufficient funds but have an increased risk of error.
5.58

The sample size will be larger than necessary for any p other than .5.

5.60

a.

The confidence level desired by the researchers is 90%.

b.

The sampling error desired by the researchers is SE = .05.

c.  For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV,
    Appendix B, z.05 = 1.645. From the problem, we will use p̂ = x/n = 64/106 = .604
    to estimate p. Thus,

    n = (zα/2)² p̂q̂ /(SE)² = 1.645²(.604)(.396)/.05² = 258.9 ≈ 259

    Thus, we would need a sample of size 259.


5.62

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. For this study,

    n = (zα/2)²σ²/SE² = 1.96²(5)²/1² = 96.04 ≈ 97

The sample size needed is 97.


5.64

For confidence coefficient .90, α = .10 and α/2 = .05. From Table IV, Appendix B,
z.05 = 1.645.

For a width of .06, SE = .06/2 = .03.

The sample size is n = (zα/2)²pq/SE² = (1.645)²(.17)(.83)/.03² = 424.2 ≈ 425

You would need to take n = 425 samples.

5.66

To compute the necessary sample size, use

    n = (zα/2)²σ²/SE²   where α = 1 - .90 = .10 and α/2 = .05.

From Table IV, Appendix B, z.05 = 1.645. Thus,

    n = (1.645)²(10)²/1² = 270.6 ≈ 271

5.68

a.

    To compute the needed sample size, use

    n = (zα/2)²σ²/SE²   where α = 1 - .90 = .10 and α/2 = .05.

    From Table IV, Appendix B, z.05 = 1.645. Thus,

    n = (1.645)²(2)²/.1² = 1,082.41 ≈ 1,083

b.

As the sample size decreases, the width of the confidence interval increases. Therefore, if
we sample 100 parts instead of 1,083, the confidence interval would be wider.

c.  To compute the maximum confidence level that could be attained meeting the
    management's specifications,

    n = (zα/2)²σ²/SE² ⇒ 100 = (zα/2)²(2)²/.1² ⇒ (zα/2)² = 100(.01)/4 = .25 ⇒ zα/2 = .5

    Using Table IV, Appendix B, P(0 ≤ z ≤ .5) = .1915. Thus, α/2 = .5000 - .1915 = .3085,
    α = 2(.3085) = .617, and 1 - α = 1 - .617 = .383.

    The maximum confidence level would be 38.3%.


5.70

    σx̄ = (σ/√n)√((N - n)/N)

a.  σx̄ = (200/√1000)√((2,500 - 1,000)/2,500) = 4.90

b.  σx̄ = (200/√1000)√((5,000 - 1,000)/5,000) = 5.66

c.  σx̄ = (200/√1000)√((10,000 - 1,000)/10,000) = 6.00

d.  σx̄ = (200/√1000)√((100,000 - 1,000)/100,000) = 6.293

5.72

a.  For n = 64, with the finite population correction factor:

    σx̄ = (s/√n)√((N - n)/N) = (24/√64)√((5,000 - 64)/5,000) = 3√.9872 = 2.9807

    Without the finite population correction factor:

    σx̄ = s/√n = 24/√64 = 3

    σx̄ without the finite population correction factor is slightly larger.

b.  For n = 400, with the finite population correction factor:

    σx̄ = (s/√n)√((N - n)/N) = (24/√400)√((5,000 - 400)/5,000) = 1.2√.92 = 1.1510

    Without the finite population correction factor:

    σx̄ = s/√n = 24/√400 = 1.2

c.  In part a, n is smaller relative to N than in part b. Therefore, the finite population
    correction factor did not make as much difference in the answer in part a as in part b.

5.74

An approximate 95% confidence interval for μ is:

    x̄ ± 2σx̄ ⇒ x̄ ± 2(s/√n)√((N - n)/N) ⇒ 422 ± 2(14/√40)√((375 - 40)/375)
    ⇒ 422 ± 4.184 ⇒ (417.816, 426.184)
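The finite-population interval above can be checked numerically. A sketch assuming Python; it uses the same "plus or minus 2 standard errors" convention as the solution rather than an exact z value.

    from math import sqrt

    xbar, s, n, N = 422, 14, 40, 375
    se = (s / sqrt(n)) * sqrt((N - n) / N)     # standard error with the fpc, about 2.09
    print(xbar - 2 * se, xbar + 2 * se)        # about (417.82, 426.18)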


5.76

a.  For N = 2,193, n = 223, x̄ = 116,754, and s = 39,185, the 95% confidence interval is:

    x̄ ± 2σx̄ ⇒ x̄ ± 2(s/√n)√((N - n)/N) ⇒ 116,754 ± 2(39,185/√223)√((2,193 - 223)/2,193)
    ⇒ 116,754 ± 4,974.06 ⇒ (111,779.94, 121,728.06)

b.  We are 95% confident that the mean salary of all vice presidents who subscribe to
    Quality Progress is between $111,779.94 and $121,728.06.

5.78

a.

The population of interest is the set of all households headed by women that have incomes
of $25,000 or more in the database.

b.

Yes. Since n/N = 1,333/25,000 = .053 exceeds .05, we need to apply the finite population
correction.

c.

    The standard error for p̂ should be:

    σp̂ = √(p̂(1 - p̂)/n)√((N - n)/N) = √(.708(1 - .708)/1,333)√((25,000 - 1,333)/25,000) = .012

d.  For confidence coefficient .90, α = 1 - .90 = .10 and α/2 = .10/2 = .05. From Table
    IV, Appendix B, z.05 = 1.645. The approximate 90% confidence interval is:

    p̂ ± 1.645σp̂ ⇒ .708 ± 1.645(.012) ⇒ (.688, .728)


5.80

For N = 1,500, n = 35, x̄ = 1, and s = 124, the 95% confidence interval is:

    x̄ ± 2σx̄ ⇒ x̄ ± 2(s/√n)√((N - n)/N) ⇒ 1 ± 2(124/√35)√((1,500 - 35)/1,500)
    ⇒ 1 ± 41.43 ⇒ (-40.43, 42.43)

We are 95% confident that the mean error of the new system is between -$40.43 and $42.43.

5.82

a.

For a small sample from a normal distribution with unknown standard deviation, we use the
t statistic. For confidence coefficient .95, = 1 .95 = .05 and /2 = .05/2 = .025.
From Table VI, Appendix B, with df = n 1 = 23 1 = 22, t.025 = 2.074.

b.

For a large sample from a distribution with an unknown standard deviation, we can estimate
the population standard deviation with s and use the z statistic. For confidence coefficient
.95, = 1 .95 = .05 and /2 = .05/2 = .025. From Table IV, Appendix B, z.025 =
1.96.

c.

For a small sample from a normal distribution with known standard deviation, we use the z
statistic. For confidence coefficient .95, = 1 .95 = .05 and /2 = .05/2 = .025.
From Table IV, Appendix B, z.025 = 1.96.


d.  For a large sample from a distribution about which nothing is known, we can estimate the
    population standard deviation with s and use the z statistic. For confidence coefficient .95,
    α = 1 - .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

e.  For a small sample from a distribution about which nothing is known, we can use neither z
    nor t.

5.84

a.

    Of the 400 observations, 227 had the characteristic ⇒ p̂ = 227/400 = .5675.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .5675 ± 3√(.5675(.4325)/400) ⇒ .5675 ± .0743
    ⇒ (.4932, .6418)

    Since the interval lies within the interval (0, 1), the normal approximation will be
    adequate.

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .5675 ± 1.96√(.5675(.4325)/400) ⇒ .5675 ± .0486 ⇒ (.5189, .6161)

b.  For this problem, SE = .02. For confidence coefficient .95, α = .05 and α/2 = .05/2 =
    .025. From Table IV, Appendix B, z.025 = 1.96. Thus,

    n = (zα/2)² p̂q̂ /SE² = (1.96)²(.5675)(.4325)/.02² = 2,357.2 ≈ 2,358

    Thus, the sample size was 2,358.


5.86

a.  The finite population correction factor is:

    √((N - n)/N) = √((100 - 20)/100) = .8944

b.  The finite population correction factor is:

    √((N - n)/N) = √((2,000 - 50)/2,000) = .9874

c.  The finite population correction factor is:

    √((N - n)/N) = √((1,500 - 300)/1,500) = .8944


5.88

a.  From the printout, the 90% confidence interval is (4.277, 6.184). We are 90%
    confident that the mean number of offices operated by all Florida law firms is
    between 4.277 and 6.184.

b.  From the histogram, it appears that the data probably are not from a normal distribution.
    The data appear to be skewed to the right.

c.  The interval constructed in part a depends on the assumption that the data came
    from a normal distribution. From part b, it appears that this assumption is not valid.
    Thus, the confidence interval is probably not valid.

5.90

a.  The point estimate of p is p̂ = x/n = 67/105 = .638.

b.  To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .638 ± 3√(.638(.362)/105) ⇒ .638 ± .141 ⇒ (.497, .779)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .638 ± 1.96√(.638(.362)/105) ⇒ .638 ± .092 ⇒ (.546, .730)

c.  We are 95% confident that the true proportion of on-the-job homicide cases that occurred
    at night is between .546 and .730.

5.92

a.

Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: NJValues

    Variable    N   N*   Mean   SE Mean   StDev   Minimum      Q1   Median      Q3   Maximum
    NJValues   20    0  440.4      67.8   303.0     159.0   212.3    297.5   660.5    1190.0

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI,
    Appendix B, with df = n - 1 = 20 - 1 = 19, t.025 = 2.093. The 95% confidence interval is:

    x̄ ± t.025 s/√n ⇒ 440.4 ± 2.093(303.0/√20) ⇒ 440.4 ± 141.81 ⇒ (298.59, 582.21)

b.  We are 95% confident that the true mean sales price is between $298,590 and $582,210.


c.

"95% confidence" means that in repeated sampling, 95% of all confidence intervals
constructed will contain the true mean sales price and 5% will not.

d.

Using MINITAB, a histogram of the data is:


    [Histogram of NJValues: frequency versus NJValues (sales price)]

Since the sample size is small (n = 20), we must assume that the distribution of sales
prices is normal. From the histogram, it does not appear that the data come from a normal
distribution. Thus, this confidence interval is probably not valid.
5.94

a.  For confidence coefficient .90, α = .10 and α/2 = .05. From Table IV, Appendix B,
    z.05 = 1.645. The 90% confidence interval is:

    x̄ ± z.05 s/√n ⇒ 12.2 ± 1.645(10/√100) ⇒ 12.2 ± 1.645 ⇒ (10.555, 13.845)

    We are 90% confident that the mean number of days of sick leave taken by all its
    employees is between 10.555 and 13.845.

b.  For confidence coefficient .99, α = .01 and α/2 = .005. From Table IV, Appendix B,
    z.005 = 2.58.

    The sample size is n = (zα/2)²σ²/SE² = (2.58)²(10)²/2² = 166.4 ≈ 167

    You would need to take n = 167 samples.


5.96

a.  For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix
    B, z.005 = 2.58. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ 1.13 ± 2.58(2.21/√72) ⇒ 1.13 ± .67 ⇒ (.46, 1.80)

    We are 99% confident that the mean number of pecks at the blue string is between .46
    and 1.80.

b.  Yes. The mean number of pecks at the white string is 7.5. This value does not fall in the
    99% confidence interval for the blue string found in part a. Thus, the chickens are more
    apt to peck at white string.

5.98

a.  First we must compute p̂:  p̂ = x/n = 124/159 = .78

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .78 ± 3√(.78(.22)/159) ⇒ .78 ± .099 ⇒ (.681, .879)

    Since this interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix
    B, z.05 = 1.645. The confidence interval is:

    p̂ ± z.05√(p̂q̂/n) ⇒ .78 ± 1.645√(.78(.22)/159) ⇒ .78 ± .054 ⇒ (.726, .834)

We are 90% confident that the true proportion of all truck drivers who suffer from sleep
apnea is between .726 and .834.

b.  Sleep researchers believe that 25% of the population suffer from obstructive sleep apnea.
    Since the 90% confidence interval for the proportion of truck drivers who suffer from
    sleep apnea does not contain .25, it appears that the true proportion of truck drivers who
    suffer from sleep apnea is larger than the proportion of the general population.

5.100

a.

The population of interest is the set of all debit cardholders in the U.S.

c.  Of the 1,252 observations, 180 had used the debit card to purchase a product or service on
    the Internet:

    p̂ = x/n = 180/1,252 = .144


    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .144 ± 3√(.144(.856)/1,252) ⇒ .144 ± .030 ⇒ (.114, .174)

Since this interval is wholly contained in the interval (0, 1), we may conclude that the
normal approximation is reasonable.
d.  For confidence coefficient .98, α = 1 - .98 = .02 and α/2 = .02/2 = .01. From Table IV,
    Appendix B, z.01 = 2.33. The confidence interval is:

    p̂ ± z.01√(p̂q̂/n) ⇒ .144 ± 2.33√(.144(.856)/1,252) ⇒ .144 ± .023 ⇒ (.121, .167)

    We are 98% confident that the proportion of debit cardholders who have used their card
    in making purchases over the Internet is between .121 and .167.

e.  Since we would have less confidence with a 90% confidence interval than with a 98%
    confidence interval, the 90% interval would be narrower.

5.102

a.  Of the 100 cancer patients, 7 were fired or laid off ⇒ p̂ = 7/100 = .07.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .07 ± 3√(.07(.93)/100) ⇒ .07 ± .077 ⇒ (-.007, .145)

    Since the interval does not lie within the interval (0, 1), the normal approximation will not
    be adequate. We will go ahead and construct the interval anyway.

    For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix
    B, z.05 = 1.645. The confidence interval is:

    p̂ ± z.05√(p̂q̂/n) ⇒ .07 ± 1.645√(.07(.93)/100) ⇒ .07 ± .042 ⇒ (.028, .112)

    Converting these to percentages, we get (2.8%, 11.2%).


b.

We are 90% confident that the percentage of all cancer patients who are fired or laid off
due to their illness is between 2.8% and 11.2%.

c.

Since the rate of being fired or laid off for all Americans is 1.3% and this value falls
outside the confidence interval in part b, there is evidence to indicate that employees with
cancer are fired or laid off at a rate that is greater than that of all Americans.


5.104

a.  p̂ = x/n = 9,296/10,000 = .9296

    The approximate 95% confidence interval is:

    p̂ ± 2√(p̂(1 - p̂)/n)√((N - n)/N)
    ⇒ .9296 ± 2√((.9296(.0704)/10,000)((500,000 - 10,000)/500,000))
    ⇒ .9296 ± 2√.000006413 ⇒ .9296 ± .0051 ⇒ (.9245, .9347)

b.  Only 10,000/500,000 × 100% = 2% of the subscribers returned the questionnaire. Often in
    mail surveys, those that respond are those with strong views. Thus, the 10,000 that
    responded may not be representative. I would question the estimate in part a.

5.106

a.  The point estimate for the fraction of the entire market who refuse to purchase bars is:

    p̂ = x/n = 23/244 = .094

b.  To see if the sample size is sufficient:

    p̂ ± 3√(p̂q̂/n) ⇒ .094 ± 3√(.094(.906)/244) ⇒ .094 ± .056 ⇒ (.038, .150)

    Since the interval above is contained in the interval (0, 1), the sample size is sufficiently
    large.

c.  For confidence coefficient .95, α = 1 - .95 = .05 and α/2 = .05/2 = .025. From Table IV,
    Appendix B, z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .094 ± 1.96√(.094(.906)/244) ⇒ .094 ± .037 ⇒ (.057, .131)

d.  The best estimate of the true fraction of the entire market who refuse to purchase bars six
    months after the poisoning is .094. We are 95% confident the true fraction of the entire
    market who refuse to purchase bars six months after the poisoning is between .057 and
    .131.


5.108

The bound is SE = .1. For confidence coefficient .99, α = 1 - .99 = .01 and α/2 = .01/2 = .005.
From Table IV, Appendix B, z.005 = 2.575.

We estimate p with p̂ from Exercise 5.48, which is p̂ = .636. Thus,

    n = (zα/2)² p̂q̂ /SE² = 2.575²(.636)(.364)/.1² = 153.5 ≈ 154

The necessary sample size would be 154.


5.110

Since the manufacturer wants to be reasonably certain the process is really out of control
before shutting down the process, we would want to use a high level of confidence for our
inference. We will form a 99% confidence interval for the mean breaking strength.
For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table VI, Appendix B,
with df = n - 1 = 9 - 1 = 8, t.005 = 3.355. The 99% confidence interval is:

    x̄ ± t.005 s/√n ⇒ 985.6 ± 3.355(22.9/√9) ⇒ 985.6 ± 25.61 ⇒ (959.99, 1,011.21)

We are 99% confident that the true mean breaking strength is between 959.99 and 1,011.21.
Since 1,000 is contained in this interval, it is not an unusual value for the true mean breaking
strength. Thus, we would recommend that the process is not out of control.


Inferences Based on a Single Sample:


Tests of Hypothesis

Chapter 6

6.2

The test statistic is used to decide whether or not to reject the null hypothesis in favor of the
alternative hypothesis.

6.4

A Type I error is rejecting the null hypothesis when it is true.

A Type II error is accepting the null hypothesis when it is false.

α = the probability of committing a Type I error.

β = the probability of committing a Type II error.
6.6

We can compute a measure of reliability for rejecting the null hypothesis when it is true. This
measure of reliability is the probability of rejecting the null hypothesis when it is true, which is
α. However, it is generally not possible to compute a measure of reliability for accepting the
null hypothesis when it is false. We would have to compute the probability of accepting the
null hypothesis when it is false, β, for every value of the parameter in the alternative
hypothesis.

6.8

Let p = proportion of U.S. companies that have formal, written travel and entertainment
policies for their employees. The null hypothesis would be:
H0: p = .80

6.10

Let μ = average Libor rate for 3-month loans. Since many Western banks think that the
reported average Libor rate (.054) is too high, they want to show that the average is less than
.054. The appropriate hypotheses would be:

    H0: μ = .054
    Ha: μ < .054

6.12

Let p = proportion of time the camera correctly detects liars. The null hypothesis would be:
H0: p = .75

6.14

a.

A Type I error would be concluding the proposed user is unauthorized when, in fact, the
proposed user is authorized.
A Type II error would be concluding the proposed user is authorized when, in fact, the
proposed user is unauthorized.
In this case, a more serious error would be a Type II error. One would not want to
conclude that the proposed user is authorized when he/she is not.

b.

The Type I error rate is 1%. This means that the probability of concluding the proposed
user is unauthorized when, in fact, the proposed user is authorized is .01.


The Type II error rate is .00025%. This means that the probability of concluding the
proposed user is authorized when, in fact, the proposed user is unauthorized is .0000025.
c.

The Type I error rate is .01%. This means that the probability of concluding the proposed
user is unauthorized when, in fact, the proposed user is authorized is .0001.
The Type II error rate is .005%. This means that the probability of concluding the
proposed user is authorized when, in fact, the proposed user is unauthorized is .00005.

6.16

a.

The null hypothesis is: Ho: There is no intrusion.

b.

The alternative hypothesis is: Ha: There is an intrusion.

c.

α = P(warning | no intrusion) = 1/1000 = .001

β = P(no warning | intrusion) = 500/1000 = .5

6.18

a.

The decision rule is to reject H0 if x̄ > 270. Recall that

z = (x̄ − μ0)/σx̄

Therefore, reject H0 if x̄ > 270 can be written reject H0 if

z > (270 − 255)/(63/√81) = 2.14

The decision rule in terms of z is to reject H0 if z > 2.14.


b.

P(z > 2.14) = .5 − P(0 < z < 2.14) = .5 − .4838 = .0162

6.20

a.

H0: μ = .36
Ha: μ < .36

The test statistic is z = (x̄ − μ0)/σx̄ = (.323 − .36)/√(.034/64) = −1.61

The rejection region requires α = .10 in the lower tail of the z-distribution. From Table IV, Appendix B, z.10 = 1.28. The rejection region is z < −1.28.


Since the observed value of the test statistic falls in the rejection region (z = −1.61 < −1.28), H0 is rejected. There is sufficient evidence to indicate the mean is less than .36 at α = .10.

b.

H0: μ = .36
Ha: μ ≠ .36

The test statistic is z = −1.61 (see part a).

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645 or z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = −1.61 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate the mean is different from .36 at α = .10.
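Readers who want to verify these decisions with software can use a short sketch like the one below (an illustration, not part of the text). It assumes, as the arithmetic above suggests, that .034 is the sample variance for the n = 64 observations.

# Sketch: large-sample z test of H0: mu = .36 from the summary statistics above.
from math import sqrt
from scipy import stats

xbar, s2, n, mu0, alpha = 0.323, 0.034, 64, 0.36, 0.10
z = (xbar - mu0) / sqrt(s2 / n)            # about -1.61

# Part a: lower-tailed test, Ha: mu < .36
print("one-tailed:", "reject H0" if z < stats.norm.ppf(alpha) else "do not reject H0")
# Part b: two-tailed test, Ha: mu != .36
print("two-tailed:", "reject H0" if abs(z) > stats.norm.ppf(1 - alpha / 2) else "do not reject H0")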
6.22

a.

To determine whether the mean July, 2006 dealer price of the Toyota Prius differs
from $25,000, we test:
H0: μ = 25,000
Ha: μ ≠ 25,000

b.

The sample mean is x̄ = Σxi/n = 4,076,271/160 = 25,476.69

The sample variance is:

s² = [Σxi² − (Σxi)²/n] / (n − 1) = [104,788,653,115 − (4,076,271)²/160] / (160 − 1) = 5,904,057.862

The sample standard deviation is s = √s² = √5,904,057.862 = 2,429.8267

c.

The test statistic is z = (x̄ − μ0)/σx̄ = (25,476.69 − 25,000)/(2,429.8267/√160) = 2.48

d.

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z < −1.96 or z > 1.96.

e.

Since the observed value of the test statistic falls in the rejection region (z = 2.48 > 1.96), H0 is rejected. There is sufficient evidence to indicate the mean July, 2006 dealer price of the Toyota Prius differs from $25,000 at α = .05.
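The hand arithmetic in part b is easy to mangle, so a small sketch (ours, not the authors') that recomputes x̄, s², s, and z directly from the two sums reported above may be helpful.

# Sketch: sample mean, variance, and z statistic from the summary sums in part b.
from math import sqrt

n, mu0 = 160, 25_000
sum_x = 4_076_271            # sum of the 160 dealer prices
sum_x2 = 104_788_653_115     # sum of the squared prices

xbar = sum_x / n                              # 25,476.69
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)      # about 5,904,058
s = sqrt(s2)                                  # about 2,429.83
z = (xbar - mu0) / (s / sqrt(n))              # about 2.48
print(round(xbar, 2), round(s, 4), round(z, 2))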


6.24

a.

A Type I error is rejecting H0 when H0 is true. In this case, we would conclude that the
mean number of carats per diamond is different from .6 when, in fact, it is equal to .6.
A Type II error is accepting H0 when H0 is false. In this case, we would conclude that the
mean number of carats per diamond is equal to .6 when, in fact, it is different from .6.

b.

From Exercise 5.18, the random sample of 30 diamonds yielded x̄ = .691 and s = .262.

Let μ = mean number of carats per diamond. To determine if the mean number of carats per diamond is different from .6, we test:

H0: μ = .6
Ha: μ ≠ .6

The test statistic is z = (x̄ − μ0)/σx̄ = (.691 − .6)/(.262/√30) = 1.90

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z > 1.96 or z < −1.96.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.90 ≯ 1.96), H0 is not rejected. There is insufficient evidence to indicate the mean number of carats per diamond is different from .6 carats at α = .05.

c.

When α is changed, H0, Ha, and the test statistic remain the same.

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645 or z < −1.645.

Since the observed value of the test statistic falls in the rejection region (z = 1.90 > 1.645), H0 is rejected. There is sufficient evidence to indicate the mean number of carats per diamond is different from .6 carats at α = .10.

d.

When the value of α changes, the decision can also change. Thus, it is very important to include the level of α used in all decisions.

6.26

Using MINITAB, the descriptive statistics are:


Descriptive Statistics: GASTURBINE

Variable      N  N*   Mean  SE Mean  StDev  Minimum    Q1  Median     Q3  Maximum
GASTURBINE   67   0  11066      195   1595     8714  9918   10656  11842    16243

To determine if the mean heat rate of gas turbines augmented with high pressure inlet
fogging exceeds 10,000 kJ/kWh, we test:
H0: μ = 10,000
Ha: μ > 10,000


The test statistic is z = (x̄ − μ0)/σx̄ = (11,066 − 10,000)/(1,595/√67) = 5.47

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 5.47 > 1.645), H0 is rejected. There is sufficient evidence to indicate the true mean heat rate of gas turbines augmented with high pressure inlet fogging exceeds 10,000 kJ/kWh at α = .05.
6.28

a.

Let μ = average full-service fee (in thousands of dollars) of U.S. funeral homes in 2006. To determine if the average full-service fee exceeds $6,500, we test:

H0: μ = 6.50
Ha: μ > 6.50

b.

Using MINITAB, the output is:

Descriptive Statistics: FUNERAL

Variable   N   Mean  Median  StDev  SE Mean
Fee       36  6.819   6.600  1.265    0.211

Variable  Minimum  Maximum     Q1     Q3
Fee         5.200   11.600  6.025  7.400

H0: μ = 6.50
Ha: μ > 6.50

The test statistic is z = (x̄ − μ0)/σx̄ = (6.819 − 6.50)/(1.265/√36) = 1.51

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.51 ≯ 1.645), H0 is not rejected. There is insufficient evidence to indicate the true mean full-service fee of U.S. funeral homes in 2006 exceeds $6,500 at α = .05.
c.

No. Since the sample size (n = 36) is greater than 30, the Central Limit Theorem applies.
The distribution of x is approximately normal regardless of the population distribution.


6.30

a.

To determine if the sample data refute the manufacturer's claim, we test:

H0: μ = 10
Ha: μ < 10
b.

A Type I error is concluding the mean number of solder joints inspected per second is less
than 10 when, in fact, it is 10 or more.
A Type II error is concluding the mean number of solder joints inspected per second is at
least 10 when, in fact, it is less than 10.

c.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: PCB

Variable    N   Mean  Median  TrMean  StDev  SE Mean
PCB        48  9.292   9.000   9.432  2.103    0.304

Variable  Minimum  Maximum     Q1      Q3
PCB         0.000   13.000  9.000  10.000

H0: μ = 10
Ha: μ < 10

The test statistic is z = (x̄ − μ0)/σx̄ = (9.292 − 10)/(2.103/√48) = −2.33

The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

Since the observed value of the test statistic falls in the rejection region (z = −2.33 < −1.645), H0 is rejected. There is sufficient evidence to indicate the mean number of inspections per second is less than 10 at α = .05.
6.32


We will reject H0 if the p-value < α.


a. .06 ≮ .05, do not reject H0.

b. .10 ≮ .05, do not reject H0.

c. .01 < .05, reject H0.

d. .001 < .05, reject H0.

e. .251 ≮ .05, do not reject H0.

f. .042 < .05, reject H0.


6.34

z = (x̄ − μ0)/σx̄ = (49.4 − 50)/(4.1/√100) = −1.46

p-value = P(z ≥ −1.46) = .5 + .4279 = .9279

There is no evidence to reject H0 for α ≤ .10.
6.36

First, find the value of the test statistic:


z = (x̄ − μ0)/σx̄ = (10.7 − 10)/(3.1/√50) = 1.60

p-value = P(z ≤ −1.60 or z ≥ 1.60) = 2P(z ≥ 1.60) = 2(.5 − .4452) = 2(.0548) = .1096
(using Table IV, Appendix B)

There is no evidence to reject H0 for α ≤ .10.
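As a software cross-check on these table look-ups, the sketch below (ours) computes the same observed significance levels with the standard normal cdf. It assumes, as the p-value directions above imply, that Exercise 6.34 uses an upper-tailed alternative and 6.36 a two-tailed one.

# Sketch: p-values for the z statistics in Exercises 6.34 and 6.36.
from math import sqrt
from scipy.stats import norm

z_634 = (49.4 - 50) / (4.1 / sqrt(100))
print(round(norm.sf(z_634), 4))              # P(Z >= z), about .93

z_636 = (10.7 - 10) / (3.1 / sqrt(50))
print(round(2 * norm.sf(abs(z_636)), 4))     # 2*P(Z >= |z|), about .11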
6.38

a.

The p-value reported by SAS is for a two-tailed test. Thus, P(z ≤ −1.63) + P(z ≥ 1.63) = .1032. For this one-tailed test, the p-value = P(z ≤ −1.63) = .1032/2 = .0516.

Since the p-value = .0516 > α = .05, H0 is not rejected. There is insufficient evidence to indicate μ < 75 at α = .05.

b.

For this one-tailed test, the p-value = P(z ≤ 1.63). Since P(z ≥ 1.63) = .1032/2 = .0516, P(z ≤ 1.63) = 1 − .0516 = .9484.

Since the p-value = .9484 > α = .10, H0 is not rejected. There is insufficient evidence to indicate μ < 75 at α = .10.

c.

For this one-tailed test, the p-value = P(z ≥ 1.63) = .1032/2 = .0516.

Since the p-value = .0516 < α = .10, H0 is rejected. There is sufficient evidence to indicate μ > 75 at α = .10.

d.

For this two-tailed test, the p-value = .1032.

Since the p-value = .1032 > α = .01, H0 is not rejected. There is insufficient evidence to indicate μ ≠ 75 at α = .01.

6.40

The p-value is p = 0.014. The probability of observing a test statistic of t = 2.48 or anything more unusual if μ = 25,000 is 0.014. Since p = 0.014 is so small, we would reject H0. There is sufficient evidence to indicate the mean price for hybrid Toyota Prius cars is different from $25,000 for any value of α > .014.

6.42

From the printout, the p-value = .000. Since the p-value = .000 < α = .01, H0 is rejected. There is sufficient evidence to indicate that the true population mean weight of plastic golf tees is different from .250 at α = .01.


6.44

a.

z = (x̄ − μ0)/σx̄ = (52.3 − 51)/(7.1/√50) = 1.29

The p-value is p = P(z ≤ −1.29) + P(z ≥ 1.29) = (.5 − .4015) + (.5 − .4015) = .1970.
(Using Table IV, Appendix B.)

b.

The p-value is p = P(z ≥ 1.29) = (.5 − .4015) = .0985. (Using Table IV, Appendix B.)

c.

z = (x̄ − μ0)/σx̄ = (52.3 − 51)/(10.4/√50) = 0.88

The p-value is p = P(z ≤ −0.88) + P(z ≥ 0.88) = (.5 − .3106) + (.5 − .3106) = .3788.
(Using Table IV, Appendix B.)

d.

In part a, in order to reject H0, α would have to be greater than .1970. In part b, in order to reject H0, α would have to be greater than .0985. In part c, in order to reject H0, α would have to be greater than .3788.

e.

For a two-tailed test, α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58.

z = (x̄ − μ0)/(s/√n) ⇒ 2.58 = (52.3 − 51)/(s/√50) ⇒ .3649s = 1.3 ⇒ s = 3.56

For a one-tailed test, α = .01. From Table IV, Appendix B, z.01 = 2.33.

z = (x̄ − μ0)/(s/√n) ⇒ 2.33 = (52.3 − 51)/(s/√50) ⇒ .3295s = 1.3 ⇒ s = 3.95

6.46

a.

z = (x̄ − μ0)/σx̄ = (10.2 − 0)/(31.3/√50) = 2.30

b.

For this two-sided test, the p-value = P(z ≤ −2.30) + P(z ≥ 2.30) = (.5 − .4893) + (.5 − .4893) = .0214. Since this value is so small, there is evidence to reject H0. There is sufficient evidence to indicate the mean level of feminization is different from 0% for any value of α > .0214.

c.

z = (x̄ − μ0)/σx̄ = (15.0 − 0)/(25.1/√50) = 4.23

For this two-sided test, the p-value = P(z ≤ −4.23) + P(z ≥ 4.23) ≈ (.5 − .5) + (.5 − .5) = 0. Since this value is so small, there is evidence to reject H0. There is sufficient evidence to indicate the mean level of feminization is different from 0% for any value of α > 0.0.


6.48

a.

P(t > 1.440) = .10


(Using Table VI, Appendix B, with df = 6)

b.

P(t < −1.782) = .05


(Using Table VI, Appendix B, with df = 12)

c.

P(t < −2.060) + P(t > 2.060) = .025 + .025 = .05


(Using Table VI, Appendix B, with df = 25)

d.

The probability of a Type I error is computed above for each of the parts.

6.50

a.

H0: μ = 6
Ha: μ < 6

The test statistic is t = (x̄ − μ0)/(s/√n) = (4.8 − 6)/(1.3/√5) = −2.064

The necessary assumption is that the population is normal.

The rejection region requires α = .05 in the lower tail of the t-distribution with df = n − 1 = 5 − 1 = 4. From Table VI, Appendix B, t.05 = 2.132. The rejection region is t < −2.132.

Since the observed value of the test statistic does not fall in the rejection region (t = −2.064 ≮ −2.132), H0 is not rejected. There is insufficient evidence to indicate the mean is less than 6 at α = .05.

b.

H0: μ = 6
Ha: μ ≠ 6

The test statistic is t = −2.064 (from a).

The assumption is the same as in a.

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = n − 1 = 5 − 1 = 4. From Table VI, Appendix B, t.025 = 2.776. The rejection region is t < −2.776 or t > 2.776.

Since the observed value of the test statistic does not fall in the rejection region (t = −2.064 ≮ −2.776), H0 is not rejected. There is insufficient evidence to indicate the mean is different from 6 at α = .05.


c.

For part a, the p-value = P(t ≤ −2.064).

From Table VI, with df = 4, .05 < P(t ≤ −2.064) < .10, or .05 < p-value < .10.

For part b, the p-value = P(t ≤ −2.064) + P(t ≥ 2.064).

From Table VI, with df = 4, 2(.05) < p-value < 2(.10), or .10 < p-value < .20.
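When software is available, the exact p-values bounded above by Table VI can be computed directly. The sketch below is an illustration under the same summary statistics (n = 5, x̄ = 4.8, s = 1.3); it is not part of the printed solution.

# Sketch: small-sample t test of H0: mu = 6 and its exact p-values.
from math import sqrt
from scipy.stats import t

n, xbar, s, mu0 = 5, 4.8, 1.3, 6.0
t_stat = (xbar - mu0) / (s / sqrt(n))   # about -2.064
df = n - 1

p_lower = t.cdf(t_stat, df)             # one-tailed p-value, Ha: mu < 6
p_two = 2 * t.sf(abs(t_stat), df)       # two-tailed p-value, Ha: mu != 6
print(round(t_stat, 3), round(p_lower, 4), round(p_two, 4))
# Both exact values fall inside the table bounds quoted above.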

6.52

a.

To determine if the true mean breaking strength of the new bonding adhesive is less
than 5.70 Mpa, we test:
H0: μ = 5.70
Ha: μ < 5.70

b.

The rejection region requires α = .01 in the lower tail of the t-distribution with df = n − 1 = 10 − 1 = 9. From Table VI, Appendix B, t.01 = 2.821. The rejection region is t < −2.821.

c.

The test statistic is t = (x̄ − μ0)/(s/√n) = (5.07 − 5.70)/(.46/√10) = −4.33

d.

Since the observed value of the test statistic falls in the rejection region (t = −4.33 < −2.821), H0 is rejected. There is sufficient evidence to indicate the true mean breaking strength of the new bonding adhesive is less than 5.70 Mpa at α = .01.

e.

We must assume that the sample was random and selected from a normal population.

6.54

Some preliminary calculations are:

x̄ = Σx/n = 736/7 = 105.14

s² = [Σx² − (Σx)²/n] / (n − 1) = [78,696 − (736)²/7] / (7 − 1) = 218.4762

s = √218.4762 = 14.7809

a.

To determine if the mean consumption rate of salad dressings in the Southeastern U.S. is
different than the mean national consumption rate, we test:
H0: μ = 100
Ha: μ ≠ 100

b.


Since the sample size is so small, we must assume that the population being sampled is
normal. In addition, we must assume that the sample is random.


c.

The test statistic is t = (x̄ − μ0)/(s/√n) = (105.14 − 100)/(14.7809/√7) = .92

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution. From Table VI, Appendix B, with df = n − 1 = 7 − 1 = 6, t.025 = 2.447. The rejection region is t > 2.447 or t < −2.447.

Since the value of the test statistic does not fall in the rejection region (t = .92 ≯ 2.447), H0 is not rejected. There is insufficient evidence to indicate the mean consumption rate of salad dressings in the Southeastern U.S. is different than the mean national consumption rate at α = .05.

d.

The observed significance level is p-value = P(t ≥ .92) + P(t ≤ −.92). Since we did not reject H0 in part c, we know that the p-value must be greater than .05. Using Table VI, Appendix B, with df = n − 1 = 7 − 1 = 6, p-value = P(t ≥ .92) + P(t ≤ −.92) > .1 + .1 = .2. Thus, with this table, we only know that the p-value is greater than .2.

6.56

a.

To determine if the mean repellency percentage of the new mosquito repellent is less than
95, we test:

H0: μ = 95
Ha: μ < 95

The test statistic is t = (x̄ − μ0)/(s/√n) = (83 − 95)/(15/√5) = −1.79

The rejection region requires α = .10 in the lower tail of the t-distribution. From Table VI, Appendix B, with df = n − 1 = 5 − 1 = 4, t.10 = 1.533. The rejection region is t < −1.533.

Since the observed value of the test statistic falls in the rejection region (t = −1.79 < −1.533), H0 is rejected. There is sufficient evidence to indicate that the true mean repellency percentage of the new mosquito repellent is less than 95 at α = .10.

b.

We must assume that the population of percent repellencies is normally distributed.

6.58

a.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Plants

Variable    N   Mean  Median  TrMean  StDev  SE Mean
Plants     20  4.000   3.500   3.667  3.061    0.684

Variable  Minimum  Maximum     Q1     Q3
Plants      1.000   13.000  1.250  5.000

Let μ = mean number of active nuclear power plants operating in all states. To determine if the mean number of active nuclear power plants operating in all states exceeds 3, we test:

H0: μ = 3
Ha: μ > 3


The test statistic is t = (x̄ − μ0)/(s/√n) = (4 − 3)/(3.061/√20) = 1.46

The rejection region requires α = .10 in the upper tail of the t-distribution with df = n − 1 = 20 − 1 = 19. From Table VI, Appendix B, t.10 = 1.328. The rejection region is t > 1.328.

Since the observed value of the test statistic falls in the rejection region (t = 1.46 > 1.328), H0 is rejected. There is sufficient evidence to indicate the mean number of active nuclear power plants operating in all states exceeds 3 at α = .10.
b.

We will look at the 4 methods for determining if the data are normal. First, we will look
at a histogram of the data. Using MINITAB, the histogram of the number of power plants
is:

[Histogram of the number of power plants (Frequency versus Plants) omitted.]

From the histogram, the data appear to be skewed to the right. This indicates that the data
may not be normal.
Next, we look at the intervals x̄ ± s, x̄ ± 2s, x̄ ± 3s. If the proportions of observations falling in each interval are approximately .68, .95, and 1.00, then the data are approximately normal.

x̄ ± s ⇒ 4 ± 3.061 ⇒ (.939, 7.061). 18 of the 20 values fall in this interval. The proportion is .90. This is much greater than the .68 we would expect if the data were normal.

x̄ ± 2s ⇒ 4 ± 2(3.061) ⇒ 4 ± 6.122 ⇒ (−2.122, 10.122). 19 of the 20 values fall in this interval. The proportion is .95. This is the same as the .95 we would expect if the data were normal.

x̄ ± 3s ⇒ 4 ± 3(3.061) ⇒ 4 ± 9.183 ⇒ (−5.183, 13.183). 20 of the 20 values fall in this interval. The proportion is 1.000. This is equal to the 1.00 we would expect if the data were normal.


From this method, it appears that the data are not normal.
Next, we look at the ratio of the IQR to s. IQR = QU − QL = 5.00 − 1.25 = 3.75.

IQR/s = 3.75/3.061 = 1.22. This is close to the 1.3 we would expect if the data were normal. This method indicates the data may be normal.
Finally, using MINITAB, the normal probability plot is:
[Normal probability plot for Plants (ML estimates − 95% CI; StDev = 2.98329; goodness-of-fit AD* = 1.298) omitted.]
Since the data do not form a straight line, the data are not normal.
From 3 of the 4 different methods, the indications are that the number of power plants data
are not normal.
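The interval counts and IQR/s ratio used above are easy to automate. The sketch below (ours) shows one way to run the same informal checks on a data vector; the list of values is only a stand-in, since the actual Plants data set is not reproduced here.

# Sketch: empirical-rule interval counts and IQR/s ratio as informal normality checks.
import statistics

data = [1, 1, 2, 3, 4, 5, 9, 13]   # hypothetical values for illustration only
n, xbar, s = len(data), statistics.mean(data), statistics.stdev(data)

for k in (1, 2, 3):
    lo, hi = xbar - k * s, xbar + k * s
    frac = sum(lo <= x <= hi for x in data) / n
    print(f"within {k} s: {frac:.2f}")        # compare with .68, .95, 1.00

q1, _, q3 = statistics.quantiles(data, n=4)   # quartiles
print("IQR/s =", round((q3 - q1) / s, 2))     # roughly 1.3 for normal data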
c.

The two largest values are 9 and 13. The two lowest values are 1 and 1. Using
MINITAB with the data deleted yields the descriptive statistics:

Descriptive Statistics: Plants2

Variable    N   Mean  Median  TrMean  StDev  SE Mean
Plants2    16  3.500   3.500   3.429  1.826    0.456

Variable  Minimum  Maximum     Q1     Q3
Plants2     1.000    7.000  2.000  5.000

To determine if the mean number of active nuclear power plants operating in all states
exceeds 3 (using the reduced data set), we test:
H0: μ = 3
Ha: μ > 3


The test statistic is t = (x̄ − μ0)/(s/√n) = (3.5 − 3)/(1.826/√16) = 1.10

The rejection region requires α = .10 in the upper tail of the t-distribution with df = n − 1 = 16 − 1 = 15. From Table VI, Appendix B, t.10 = 1.341. The rejection region is t > 1.341.

Since the observed value of the test statistic does not fall in the rejection region (t = 1.10 ≯ 1.341), H0 is not rejected. There is insufficient evidence to indicate the mean number of active nuclear power plants operating in all states exceeds 3 at α = .10.

By eliminating the top two and bottom two observations, we have changed the decision from rejecting H0 to not rejecting H0.
By eliminating the top two and bottom two observations, we have changed the decision
from rejecting H0 to not rejecting H0.
d.

It is very dangerous to eliminate data points to satisfy assumptions. The data may, in fact, not be normal. By eliminating data points, one has changed the kind of data that come from the parent population. Thus, incorrect decisions could be made.

6.60

Using MINITAB, the descriptive statistics for the 2 plants are:


Descriptive Statistics: AL1, AL2

Variable   N  N*     Mean  SE Mean    StDev  Minimum  Q1   Median  Q3  Maximum
AL1        2   0  0.00750  0.00250  0.00354  0.00500   *  0.00750   *  0.01000
AL2        2   0   0.0700   0.0200   0.0283   0.0500   *   0.0700   *   0.0900

To determine if plant 1 is violating the OSHA standard, we test:


H0: μ = .004
Ha: μ > .004

The test statistic is t = (x̄ − μ0)/(s/√n) = (.0075 − .004)/(.00354/√2) = 1.40

Since no α level was given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the t-distribution with df = n − 1 = 2 − 1 = 1. From Table VI, Appendix B, t.05 = 6.314. The rejection region is t > 6.314.

Since the observed value of the test statistic does not fall in the rejection region (t = 1.40 ≯ 6.314), H0 is not rejected. There is insufficient evidence to indicate the OSHA standard is violated by plant 1 at α = .05.
To determine if plant 2 is violating the OSHA standard, we test:
H0: μ = .004
Ha: μ > .004

The test statistic is t = (x̄ − μ0)/(s/√n) = (.07 − .004)/(.0283/√2) = 3.30


Since no α level was given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the t-distribution with df = n − 1 = 2 − 1 = 1. From Table VI, Appendix B, t.05 = 6.314. The rejection region is t > 6.314.

Since the observed value of the test statistic does not fall in the rejection region (t = 3.30 ≯ 6.314), H0 is not rejected. There is insufficient evidence to indicate the OSHA standard is violated by plant 2 at α = .05.
6.62

b.

First, check to see if n is large enough.


p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .70 ± 3√((.70)(.30)/100) ⇒ .70 ± .14 ⇒ (.56, .84)

Since the interval lies within the interval (0, 1), the normal approximation will be adequate.

H0: p = .70
Ha: p < .70

The test statistic is z = (p̂ − p0)/σp̂ = (p̂ − p0)/√(p0q0/n) = (.63 − .70)/√(.70(.30)/100) = −1.53

The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

Since the observed value of the test statistic does not fall in the rejection region (−1.53 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate that the proportion is less than .70 at α = .05.
c.

p-value = P(z ≤ −1.53) = .5 − .4370 = .0630

Since p is not less than α = .05, H0 is not rejected.
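The normal-approximation check and test statistic above can be scripted as well. The short sketch below (ours) mirrors the computation for part b with p0 = .70, p̂ = .63, and n = 100.

# Sketch: large-sample z test for a proportion, H0: p = .70 vs Ha: p < .70.
from math import sqrt
from scipy.stats import norm

p0, p_hat, n, alpha = 0.70, 0.63, 100, 0.05
se0 = sqrt(p0 * (1 - p0) / n)

# Normal-approximation check: p0 +/- 3*se0 should lie inside (0, 1).
print("approx ok:", 0 < p0 - 3 * se0 and p0 + 3 * se0 < 1)

z = (p_hat - p0) / se0                     # about -1.53
p_value = norm.cdf(z)                      # lower-tailed p-value, about .063
print(round(z, 2), round(p_value, 4),
      "reject H0" if z < norm.ppf(alpha) else "do not reject H0")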

6.64

a.

No. The p-value is the probability of observing your test statistic or anything more
unusual if H0 is true. For this problem, the p-value = .3300/2 = .1650.
Given the true value of the population proportion, p, is .5, the probability of observing a
test statistic of z = .44 or larger is .1650. Since the p-value is not small (p = .1650), there
is no evidence to reject H0. There is no evidence to indicate the population proportion is
greater than .5.

b.

If the alternative hypothesis were two-tailed, the p-value would be 2 times the p-value for a one-tailed test. For this problem, the p-value = .3300. The probability of observing your test statistic or anything more unusual if H0 is true is .3300.

There is no evidence to reject H0 for α ≤ .10. There is no evidence to indicate that p ≠ .5 for α ≤ .10.


6.66

a.

p̂ = x/n = 64/106 = .604

b.

H0: p = .70
Ha: p ≠ .70

c.

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.604 − .70)/√(.70(.30)/106) = −2.16

d.

The rejection region requires α/2 = .01/2 = .005 in each tail of the z-distribution. From Table IV, Appendix B, z.005 = 2.58. The rejection region is z > 2.58 or z < −2.58.

e.

Since the observed value of the test statistic does not fall in the rejection region (z = −2.16 ≮ −2.58), H0 is not rejected. There is insufficient evidence to indicate the true proportion of consumers who believe Made in the USA means 100% of labor and materials are from the United States is different from .70 at α = .01.

6.68

a.

The population parameter of interest is p = proportion of items that had the wrong price scanned at California Wal-Mart stores.

b.

To determine if the true proportion of items scanned at California Wal-Mart stores with the wrong price exceeds the 2% NIST standard, we test:

H0: p = .02
Ha: p > .02
c.

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.083 − .02)/√(.02(.98)/1000) = 14.23

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.
d.

Since the observed value of the test statistic falls in the rejection region
(z = 14.23 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the true
proportion of items scanned at California Wal-Mart stores with the wrong price exceeds
the 2% NIST standard at = .05. This means that the proportion of items with wrong
prices at California Wal-Mart stores is much higher than what is allowed.

e.

In order for the inference to be valid, the sampling distribution of p̂ must be approximately normal. We check this assumption:

p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .02 ± 3√(.02(.98)/1000) ⇒ .02 ± .013 ⇒ (.007, .033)

Since the above interval falls completely in the interval (0, 1), the normal distribution
will be adequate.


6.70

a.

Let p = proportion of vacation-home owners who are minorities in 2003.


p̂ = x/n = 46/416 = .111

To determine if the percentage of vacation-home owners in 2006 who are minorities is larger than 6%, we test:

H0: p = .06
Ha: p > .06

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.111 − .06)/√(.06(.94)/416) = 4.38

The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

Since the observed value of the test statistic falls in the rejection region (z = 4.38 > 2.33), H0 is rejected. There is sufficient evidence to indicate that the true percentage of vacation-home owners in 2006 who are minorities is larger than 6% at α = .01.
b.

Since the return rate of the questionnaire was so small compared to the number sent out, one should be very skeptical of the results. It would be fairly unusual that the sample of returned questionnaires would be representative of the entire population.

6.72

Let p = proportion of firms in violation of the new 4-day rule for reporting material changes.
p̂ = x/n = 23/462 = .050

To determine if the percentage of firms in violation of the new 4-day rule for reporting material changes is less than 10%, we test:

H0: p = .10
Ha: p < .10

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.050 − .10)/√(.10(.90)/462) = −3.58

The rejection region requires α = .01 in the lower tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z < −2.33.

Since the observed value of the test statistic falls in the rejection region (z = −3.58 < −2.33), H0 is rejected. There is sufficient evidence to indicate that the true percentage of firms in violation of the new 4-day rule for reporting material changes is less than 10% at α = .01.


6.74

Let p = proportion of patients taking the pill who reported an improved condition.
First we check to see if the normal approximation is adequate:
p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .5 ± 3√(.5(.5)/7000) ⇒ .5 ± .018 ⇒ (.482, .518)

Since the interval falls completely in the interval (0, 1), the normal distribution will be adequate.

To determine if there really is a placebo effect at the clinic, we test:

H0: p = .5
Ha: p > .5

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.7 − .5)/√(.5(.5)/7000) = 33.47

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 33.47 > 1.645), H0 is rejected. There is sufficient evidence to indicate that there really is a placebo effect at the clinic at α = .05.
6.76

a.

The power of a test increases when:

1. The distance between the null and alternative values of μ increases.
2. The value of α increases.
3. The sample size increases.

b.

The power of a test is equal to 1 − β. As β increases, the power decreases.


6.78

From Exercise 6.77 we want to test H0: μ = 500 against Ha: μ > 500 using α = .05, σ = 100, n = 25, and x̄0 = 532.9.

a.

β = P(x̄0 < 532.9 when μ = 575) = P(z < (532.9 − 575)/(100/√25)) = P(z < −2.11) = .5 − .4826 = .0174

b.

Power = 1 − β = 1 − .0174 = .9826

c.

In Exercise 6.77, β = .1949 and the power is .8051. The value of β has decreased in this exercise since μ = 575 is further from the hypothesized value than μ = 550. As a result, the power of the test in this exercise has increased (when β decreases, the power of the test increases).

6.80

a.

From Exercise 6.79, we want to test H0: μ = 75 against Ha: μ < 75 using α = .10, σ = 15, n = 49, and x̄0 = 72.257.

If μ = 74, β = P(x̄0 > 72.257 when μ = 74) = P(z > (72.257 − 74)/(15/√49)) = P(z > −.81) = .5 + .2910 = .7910

If μ = 72, β = P(x̄0 > 72.257 when μ = 72) = P(z > (72.257 − 72)/(15/√49)) = P(z > .12) = .5 − .0478 = .4522

If μ = 70, β = P(x̄0 > 72.257 when μ = 70) = .1469 (Refer to Exercise 6.79, part c.)

If μ = 68, β = P(x̄0 > 72.257 when μ = 68) = P(z > (72.257 − 68)/(15/√49)) = P(z > 1.99) = .5 − .4767 = .0233

If μ = 66, β = P(x̄0 > 72.257 when μ = 66) = P(z > (72.257 − 66)/(15/√49)) = P(z > 2.92) = .5 − .4982 = .0018

In summary,

μ:    74      72      70      68      66
β:  .7910   .4522   .1469   .0233   .0018
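Each β value above follows the same pattern: take the cutoff x̄0 that defines the rejection region, then compute the probability of landing on the "accept" side when the true mean is μa. The sketch below (an illustration using the Exercise 6.79/6.80 setup of σ = 15, n = 49, x̄0 = 72.257) automates that calculation.

# Sketch: beta and power for the lower-tailed test (reject when x-bar < 72.257).
from math import sqrt
from scipy.stats import norm

sigma, n, xbar0 = 15, 49, 72.257
se = sigma / sqrt(n)

for mu_a in (74, 72, 70, 68, 66):
    beta = norm.sf((xbar0 - mu_a) / se)   # P(x-bar > 72.257 | mu = mu_a)
    print(mu_a, round(beta, 4), round(1 - beta, 4))   # beta, power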


b.

c.

Looking at the graph, β is approximately .62 when μ = 73.

d.

Power = 1 − β. Therefore,

μ:      74      72      70      68      66
β:    .7910   .4522   .1469   .0233   .0018
Power: .2090  .5478   .8531   .9767   .9982

The power curve starts out close to 1 when μ = 66 and decreases as μ increases, while the β curve is close to 0 when μ = 66 and increases as μ increases.

e.

As the distance between the true mean μ and the null hypothesized mean μ0 increases, β decreases and the power increases. We can also see that as β increases, the power decreases.

6.82

a.

To determine if the mean size of California homes exceeds the national average, we test:
H0: μ = 2,230
Ha: μ > 2,230


The test statistic is z = (x̄ − μ0)/σx̄ = (2,347 − 2,230)/(257/√100) = 4.55

The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

Since the observed value of the test statistic falls in the rejection region (z = 4.55 > 2.33), H0 is rejected. There is sufficient evidence to indicate the mean size of California homes exceeds the national average at α = .01.
b.

To compute the power, we must first set up the rejection region in terms of x̄.

x̄0 = μ0 + zα σx̄ ≈ μ0 + 2.33(s/√n) = 2,230 + 2.33(257/√100) = 2,289.88

We would reject H0 if x̄ > 2,289.88.

The power of the test when μ = 2,330 would be:

Power = P(x̄ > 2,289.88 | μ = 2,330) = P(z > (2,289.88 − 2,330)/(257/√100)) = P(z > −1.56) = .5 + .4406 = .9406

c.

The power of the test when μ = 2,280 would be:

Power = P(x̄ > 2,289.88 | μ = 2,280) = P(z > (2,289.88 − 2,280)/(257/√100)) = P(z > 0.38) = .5 − .1480 = .3520

6.84

a.

To determine if the mean mpg for 2006 Honda Civic autos is greater than 38 mpg, we
test:
H0: μ = 38
Ha: μ > 38

b.

The test statistic is z = (x̄ − μ0)/σx̄ = (40.3 − 38)/(6.4/√36) = 2.16

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 2.16 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the mean mpg for 2006 Honda Civic autos is greater than 38 mpg at α = .05.

We must assume that the sample was a random sample.


c.

First find:

x̄0 = μ0 + zα σx̄ = μ0 + zα(σ/√n), where z.05 = 1.645 from Table IV, Appendix B.

Thus, x̄0 = 38 + 1.645(6.4/√36) = 39.75

For μ = 38.5:
Power = P(x̄ > 39.75 | μ = 38.5) = P(z > (39.75 − 38.5)/(6.4/√36)) = P(z > 1.17) = .5 − .3790 = .1210

For μ = 39:
Power = P(x̄ > 39.75 | μ = 39) = P(z > (39.75 − 39)/(6.4/√36)) = P(z > .70) = .5 − .2580 = .2420

For μ = 39.5:
Power = P(x̄ > 39.75 | μ = 39.5) = P(z > (39.75 − 39.5)/(6.4/√36)) = P(z > .23) = .5 − .0910 = .4090

For μ = 40:
Power = P(x̄ > 39.75 | μ = 40) = P(z > (39.75 − 40)/(6.4/√36)) = P(z > −.23) = .5 + .0910 = .5910

For μ = 40.5:
Power = P(x̄ > 39.75 | μ = 40.5) = P(z > (39.75 − 40.5)/(6.4/√36)) = P(z > −.70) = .5 + .2580 = .7580
d.

The plot is:

[Power curve (power versus μ for μ = 38.5 to 40.5) omitted.]

e.

From the plot, the power is approximately .5.

For μ = 39.75:
Power = P(x̄ > 39.75 | μ = 39.75) = P(z > (39.75 − 39.75)/(6.4/√36)) = P(z > 0) = .5

f.

From the plot, the power is approximately 1.

For μ = 43:
Power = P(x̄ > 39.75 | μ = 43) = P(z > (39.75 − 43)/(6.4/√36)) = P(z > −3.05) = .5 + .4989 = .9989

If the true value of μ is 43, the approximate probability that the test will fail to reject H0 is 1 − .9989 = .0011.
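The power values plotted above can be generated over any grid of alternative means. The following sketch (not from the text) computes the power curve for this test, using x̄0 = 39.75, σ = 6.4, and n = 36.

# Sketch: power of the upper-tailed test (reject when x-bar > 39.75) as mu varies.
from math import sqrt
from scipy.stats import norm

sigma, n, xbar0 = 6.4, 36, 39.75
se = sigma / sqrt(n)

for mu in (38.5, 39.0, 39.5, 39.75, 40.0, 40.5, 43.0):
    power = norm.sf((xbar0 - mu) / se)    # P(x-bar > 39.75 | true mean mu)
    print(mu, round(power, 4))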

6.86

Using Table VII, Appendix B:


a.

For n = 12, df = n − 1 = 12 − 1 = 11: P(χ² > χ²0) = .10 ⇒ χ²0 = 17.2750

b.

For n = 9, df = n − 1 = 9 − 1 = 8: P(χ² > χ²0) = .05 ⇒ χ²0 = 15.5073

c.

For n = 5, df = n − 1 = 5 − 1 = 4: P(χ² > χ²0) = .025 ⇒ χ²0 = 11.1433

6.88

a.

It would be necessary to assume that the population has a normal distribution.

b.

H0: σ² = 1
Ha: σ² > 1

The test statistic is χ² = (n − 1)s²/σ0² = 6(4.84)/1 = 29.04

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = n − 1 = 7 − 1 = 6. From Table VII, Appendix B, χ².05 = 12.5916. The rejection region is χ² > 12.5916.

Since the observed value of the test statistic falls in the rejection region (χ² = 29.04 > 12.5916), H0 is rejected. There is sufficient evidence to indicate that the variance is greater than 1 at α = .05.


c.

H0: σ² = 1
Ha: σ² ≠ 1

The test statistic is χ² = (n − 1)s²/σ0² = 6(4.84)/1 = 29.04

The rejection region requires α/2 = .025 in each tail of the χ² distribution with df = n − 1 = 7 − 1 = 6. From Table VII, Appendix B, χ².975 = 1.237347 and χ².025 = 14.4494. The rejection region is χ² < 1.237347 or χ² > 14.4494.

Since the observed value of the test statistic falls in the rejection region (χ² = 29.04 > 14.4494), H0 is rejected. There is sufficient evidence to indicate that the variance is not equal to 1 at α = .05.
6.90

Some preliminary calculations are:

s² = [Σx² − (Σx)²/n] / (n − 1) = [176 − (30)²/7] / (7 − 1) = 7.9048

To determine if σ² < 1, we test:

H0: σ² = 1
Ha: σ² < 1

The test statistic is χ² = (n − 1)s²/σ0² = (7 − 1)7.9048/1 = 47.43

The rejection region requires α = .05 in the lower tail of the χ² distribution with df = n − 1 = 7 − 1 = 6. From Table VII, Appendix B, χ².95 = 1.63539. The rejection region is χ² < 1.63539.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 47.43 ≮ 1.63539), H0 is not rejected. There is insufficient evidence to indicate the variance is less than 1.
6.92

a.

To determine if the breaking strength variance of the new adhesive is less than the
variance of the standard composite adhesive, 2 = .25, we test:
H0: σ² = .25
Ha: σ² < .25

b.

The rejection region requires α = .01 in the lower tail of the χ² distribution with df = n − 1 = 10 − 1 = 9. From Table VII, Appendix B, χ².99 = 2.087912. The rejection region is χ² < 2.087912.

c.

The test statistic is χ² = (n − 1)s²/σ0² = (10 − 1)(.46)²/.25 = 7.6176

d.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 7.6176 ≮ 2.087912), H0 is not rejected. There is insufficient evidence to indicate the breaking strength variance of the new adhesive is less than the variance of the standard composite adhesive, σ² = .25, at α = .01.

e.

We must assume that the distribution of the breaking strengths is approximately normal and that a random sample was selected from this population.

6.94
o2

To determine if the true standard deviation of the point-spread errors exceed 15 (variance
exceeds 225), we test:
H0: 2 = 225
Ha: 2 > 225
The test statistic is 2 =

(n 1) s 2

02

(240 1)13.32
= 187.896
225

The rejection region requires in the upper tail of the 2 distribution with df = n 1
= 240 1 = 239. The maximum value of df in Table VII is 100. Thus, we cannot find the
rejection region using Table VII. Using a statistical package, the p-value associated with
2 = 187.896 is .9938.
Since the p-value is so large, there is no evidence to reject H0. There is insufficient evidence to
indicate that the true standard deviation of the point-spread errors exceeds 15 for any
reasonable value of .
(Since the observed variance (or standard deviation) is less than the hypothesized value of the
variance (or standard deviation) under H0, there is no way H0 will be rejected for any
reasonable value of .)
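Because Table VII stops at 100 df, the p-value quoted above has to come from software. A minimal sketch of that computation (ours, using scipy) is shown below.

# Sketch: upper-tailed chi-square p-value for the point-spread test (df = 239).
from scipy.stats import chi2

n, s, sigma0_sq = 240, 13.3, 225
chi_sq = (n - 1) * s ** 2 / sigma0_sq     # about 187.9
p_value = chi2.sf(chi_sq, df=n - 1)       # P(chi-square > 187.9), about .99
print(round(chi_sq, 3), round(p_value, 4))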
6.96

Using MINITAB, the descriptive statistics are:


Descriptive Statistics: GASTURBINE

Variable      N  N*   Mean  SE Mean  StDev  Minimum    Q1  Median     Q3  Maximum
GASTURBINE   67   0  11066      195   1595     8714  9918   10656  11842    16243

To determine if the heat rates of the augmented gas turbine engine are more variable
than the heat rates of the standard gas turbine engine, we test:
H0: σ² = 1,500²
Ha: σ² > 1,500²

The test statistic is χ² = (n − 1)s²/σ0² = (67 − 1)1,595²/1,500² = 74.625

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = n − 1 = 67 − 1 = 66. From Table VII, Appendix B, χ².05 ≈ 85.95148. The rejection region is χ² > 85.95148.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 74.625 ≯ 85.95148), H0 is not rejected. There is insufficient evidence to indicate the heat rates of the augmented gas turbine engine are more variable than the heat rates of the standard gas turbine engine at α = .05.
6.98

For a large sample test of hypothesis about a population mean, no assumptions are necessary
because the Central Limit Theorem assures that the test statistic will be approximately
normally distributed. For a small sample test of hypothesis about a population mean, we must
assume that the population being sampled from is normal. The test statistic for the large
sample test is the z statistic, and the test statistic for the small sample test is the t statistic.

6.100

The elements of the test of hypothesis that should be specified prior to analyzing the data are: the null hypothesis, the alternative hypothesis, and the rejection region based on α.

6.102

α = P(Type I error) = P(rejecting H0 when it is true). Thus, if rejection of H0 would cause your firm to go out of business, you would want this probability, α, to be small.

6.104

a.

H0: μ = 8.3
Ha: μ ≠ 8.3

The test statistic is z = (x̄ − μ0)/σx̄ = (8.2 − 8.3)/(.79/√175) = −1.67

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z < −1.96 or z > 1.96.

Since the observed value of the test statistic does not fall in the rejection region (−1.67 ≮ −1.96), H0 is not rejected. There is insufficient evidence to indicate that the mean is different from 8.3 at α = .05.
b.

H0: μ = 8.4
Ha: μ ≠ 8.4

The test statistic is z = (x̄ − μ0)/σx̄ = (8.2 − 8.4)/(.79/√175) = −3.35

The rejection region is the same as part a: z < −1.96 or z > 1.96.


Since the observed value of the test statistic falls in the rejection region (−3.35 < −1.96), H0 is rejected. There is sufficient evidence to indicate that the mean is different from 8.4 at α = .05.
c.

H0: σ = 1    or    H0: σ² = 1
Ha: σ ≠ 1            Ha: σ² ≠ 1

The test statistic is χ² = (n − 1)s²/σ0² = (175 − 1)(.79)²/1 = 108.59

The rejection region requires α/2 = .05/2 = .025 in each tail of the χ² distribution with df = n − 1 = 175 − 1 = 174. From Table VII, Appendix B, χ².025 = 129.561 and χ².975 = 74.2219. The rejection region is χ² > 129.561 or χ² < 74.2219.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 108.59 ≯ 129.561 and χ² = 108.59 ≮ 74.2219), H0 is not rejected. There is insufficient evidence to indicate the variance differs from 1 at α = .05.
d.

In part a, the rejection region is z < −1.96 or z > 1.96. In terms of x̄, the rejection region would be:

z = (x̄ − μ0)/σx̄ ⇒ 1.96 = (x̄U − 8.3)/(.79/√175) ⇒ .117 = x̄U − 8.3 ⇒ x̄U = 8.417

z = (x̄ − μ0)/σx̄ ⇒ −1.96 = (x̄L − 8.3)/(.79/√175) ⇒ −.117 = x̄L − 8.3 ⇒ x̄L = 8.183

Based on x̄, the rejection region would be: Reject H0 if x̄ < 8.183 or x̄ > 8.417.

The power of the test is the probability the test statistic falls in the rejection region, given the alternative hypothesis is true. In this case, we will let μa = 8.5.

Power = P(x̄ < 8.183 | μa = 8.5) + P(x̄ > 8.417 | μa = 8.5)
      = P(z < (8.183 − 8.5)/(.79/√175)) + P(z > (8.417 − 8.5)/(.79/√175))
      = P(z < −5.31) + P(z > −1.39) = (.5 − .5) + (.5 + .4177) = .9177

(Using Table IV, Appendix B)
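Part d can be scripted the same way: convert the z rejection region into x̄ units and then evaluate both tails under the alternative mean. A sketch of that calculation (ours, with μa = 8.5) follows.

# Sketch: power of the two-tailed test of H0: mu = 8.3 (alpha = .05) when mu = 8.5.
from math import sqrt
from scipy.stats import norm

mu0, sigma, n, alpha, mu_a = 8.3, 0.79, 175, 0.05, 8.5
se = sigma / sqrt(n)
z_crit = norm.ppf(1 - alpha / 2)                            # 1.96

xbar_lo, xbar_hi = mu0 - z_crit * se, mu0 + z_crit * se     # about 8.183 and 8.417
power = norm.cdf((xbar_lo - mu_a) / se) + norm.sf((xbar_hi - mu_a) / se)
print(round(power, 4))                                      # about .92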


6.106

a.

The p-value = .1288 = P(t ≥ 1.174). Since the p-value is not very small, there is no evidence to reject H0 for α ≤ .10. There is no evidence to indicate the mean is greater than 10.

b.

We must assume that a random sample was selected from a population that is normally
distributed.

c.

For the alternative hypothesis Ha: μ ≠ 10, the p-value is 2 times the p-value for the one-tailed test. The p-value = 2(.1288) = .2576. There is no evidence to reject H0 for α ≤ .10. There is no evidence to indicate the mean is different from 10.

6.108

a.

If we wish to test the research hypothesis that the mean GHQ score for all unemployed
men exceeds 10, we test:
H0: μ = 10
Ha: μ > 10
This is a one-tailed test. We are only interested in rejecting H0 if the mean GHQ score for
all unemployed men is greater than 10.

b.

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

c.

The test statistic is z = (x̄ − μ0)/σx̄ = (10.94 − 10.0)/(5.10/√49) = 1.29

Since the observed value of the test statistic does not fall in the rejection region (z = 1.29 ≯ 1.645), H0 is not rejected. There is insufficient evidence to indicate the mean GHQ score for all unemployed men is greater than 10 at α = .05.
d.

The p-value is P(z ≥ 1.29) = .5 − .4015 = .0985. (Using Table IV, Appendix B)

The probability of observing our test statistic or anything more unusual, given H0 is true, is .0985. Since this value is not less than α = .05, we do not reject H0. There is insufficient evidence to indicate the mean GHQ score is greater than 10.

6.110

a.

The population parameter of interest is p = proportion of all television viewers with


access to cable-TV who agree with the statement Overall, I find the quality of news on
cable networks to be better than news on the ABC, CBS, and NBC networks.

b.

p̂ = x/n = 248/500 = .496

c.

To determine if the true proportion of TV-viewers who find cable news to be better quality than network news differs from .50, we test:

H0: p = .50
Ha: p ≠ .50


d.

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.496 − .50)/√(.50(.50)/500) = −0.18

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645 or z < −1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = −0.18 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate the true proportion of TV-viewers who find cable news to be better quality than network news differs from .50 at α = .10.
e.

In order for the inference to be valid, the sampling distribution of p̂ must be approximately normal. We check this assumption:

p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .5 ± 3√(.5(.5)/500) ⇒ .5 ± .067 ⇒ (.433, .567)

Since the interval falls completely in the interval (0, 1), the normal distribution will be
adequate.
6.112

a.

First, check to see if the normal approximation is adequate:


p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .25 ± 3√((.25)(.75)/159) ⇒ .25 ± .103 ⇒ (.147, .353)

Since the interval falls completely in the interval (0, 1), the normal distribution will be adequate.

p̂ = x/n = 124/159 = .786

To determine if the percentage of truckers who suffer from sleep apnea differs from 25%,
we test:
H0: p = .25
Ha: p ≠ .25

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.786 − .25)/√((.25)(.75)/159) = 15.61

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645 or z > 1.645.


Since the observed value of the test statistic falls in the rejection region (z = 15.61 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the percentage of truckers who suffer from sleep apnea differs from 25% at α = .10.
b.

The observed significance level is the p-value and is:


p-value = P(z ≤ −15.61) + P(z ≥ 15.61) ≈ (.5 − .5) + (.5 − .5) = 0

Since the p-value is so small, we would reject H0 for any reasonable value of α. There is sufficient evidence to indicate that the percentage of truckers who suffer from sleep apnea differs from 25%.

c.

The inference from a confidence interval and a test of hypothesis must agree because the same numbers are used in both if the same level of significance is used.

6.114

a.

Let p = proportion of shoppers using cents-off coupons. To determine if the proportion of shoppers using cents-off coupons exceeds .65, we test:

H0: p = .65
Ha: p > .65

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.77 − .65)/√(.65(.35)/1,000) = 7.96

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 7.96 > 1.645), H0 is rejected. There is sufficient evidence to indicate the proportion of shoppers using cents-off coupons exceeds .65 at α = .05.
b.

The sample size is large enough if the interval does not include 0 or 1.

p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .65 ± 3√(.65(.35)/1,000) ⇒ .65 ± .045 ⇒ (.605, .695)

Since the interval falls completely in the interval (0, 1), the normal distribution will be adequate.

c.


The p-value is p = P(z ≥ 7.96) = (.5 − .5) ≈ 0. (Using Table IV, Appendix B.) Since the p-value is smaller than α = .05, H0 is rejected. There is sufficient evidence to indicate the proportion of shoppers using cents-off coupons exceeds .65 at α = .05.


6.116

Using MINITAB, the descriptive statistics are:


Descriptive Statistics: Tunnel

Variable    N   Mean  Median  TrMean  StDev  SE Mean
Tunnel     10  989.8   970.5   987.9  160.7     50.8

Variable  Minimum  Maximum     Q1      Q3
Tunnel      735.0   1260.0  862.5  1096.8

To determine whether peak hour pricing succeeded in reducing the average number of vehicles
attempting to use the Lincoln Tunnel during the peak rush hour, we test:
H0: μ = 1,220
Ha: μ < 1,220

The test statistic is t = (x̄ − μ0)/(s/√n) = (989.8 − 1,220)/(160.7/√10) = −4.53

Since no α is given, we will use α = .05. The rejection region requires α = .05 in the lower tail of the t-distribution with df = n − 1 = 10 − 1 = 9. From Table VI, Appendix B, t.05 = 1.833. The rejection region is t < −1.833.

Since the observed value of the test statistic falls in the rejection region (t = −4.53 < −1.833), H0 is rejected. There is sufficient evidence to indicate that peak hour pricing succeeded in reducing the average number of vehicles attempting to use the Lincoln Tunnel during the peak rush hour at α = .05.
6.118

a.

To determine if the true mean number of pecks at the blue string is less than 7.5, we test:
H0: μ = 7.5
Ha: μ < 7.5

The test statistic is z = (x̄ − μ0)/σx̄ = (1.13 − 7.5)/(2.21/√72) = −24.46

The rejection region requires α = .01 in the lower tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z < −2.33.

Since the observed value of the test statistic falls in the rejection region (z = −24.46 < −2.33), H0 is rejected. There is sufficient evidence to indicate the true mean number of pecks at the blue string is less than 7.5 at α = .01.

b.

From Exercise 5.96, the 99% confidence interval is (.46, 1.80). Since the hypothesized
value of the mean ( = 7.5) does not fall in the confidence interval, it is not a likely
candidate for the true value of the mean. Thus, you would reject it. This agrees with the
conclusion in part a.


6.120

a.

p̂ = 24/40 = .6

To determine if the proportion of shoplifters turned over to police is greater than .5, we test:

H0: p = .5
Ha: p > .5

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.6 − .5)/√(.5(.5)/40) = 1.26

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.26 ≯ 1.645), H0 is not rejected. There is insufficient evidence to indicate the proportion of shoplifters turned over to police is greater than .5 at α = .05.
b.

To determine if the normal approximation is appropriate, we check:


p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .5 ± 3√((.5)(.5)/40) ⇒ .5 ± .237 ⇒ (.263, .737)

Since the interval falls completely in the interval (0, 1), the normal distribution will be
adequate.
c.

The observed significance level of the test is p-value = P(z ≥ 1.26) = .5 − .3962 = .1038. The probability of observing the value of our test statistic or anything more unusual if the true value of p is .5 is .1038. Since this p-value is so large, there is no evidence to reject H0. There is no evidence to indicate the true proportion of shoplifters turned over to police is greater than .5.

d.

Any value of α that is greater than the p-value would lead one to reject H0. Thus, for this problem, we would reject H0 for any value of α > .1038.

6.122

a.

To determine whether the mean profit change for restaurants with frequency programs is
greater than $1047.34, we test:
H0: μ = 1047.34
Ha: μ > 1047.34

b.

Some preliminary calculations are:


x̄ = Σx/n = 30,113.17/12 = 2,509.43

s² = [Σx² − (Σx)²/n] / (n − 1) = [126,379,568.8 − (30,113.17)²/12] / (12 − 1) = 4,619,331.955

s = √4,619,331.955 = 2,149.2631

The test statistic is t = (x̄ − μ0)/(s/√n) = (2,509.43 − 1,047.34)/(2,149.2631/√12) = 2.36

The rejection region requires α = .05 in the upper tail of the t-distribution with df = n − 1 = 12 − 1 = 11. From Table VI, Appendix B, t.05 = 1.796. The rejection region is t > 1.796.

Since the observed value of the test statistic falls in the rejection region (t = 2.36 > 1.796), H0 is rejected. There is sufficient evidence to indicate the mean profit change for restaurants with frequency programs is greater than $1047.34 for α = .05.
It appears that the frequency program would be profitable for the company if adopted
nationwide.
6.124

a.

A Type II error would be concluding the mean amount of PCB in the air is less than or
equal to 3 parts per million when, in fact, it is more than 3 parts per million.

b.

From Exercise 6.123, z = (x̄0 − μ0)/(σ/√n) ⇒ x̄0 = zα(σ/√n) + μ0 = 2.33(.5/√50) + 3 ⇒ x̄0 = 3.165

For μ = 3.1, β = P(x̄ ≤ 3.165 | μ = 3.1) = P(z ≤ (3.165 − 3.1)/(.5/√50)) = P(z ≤ .92) = .5 + .3212 = .8212
(from Table IV, Appendix B)


c.

Power = 1 − β = 1 − .8212 = .1788

d.

For μ = 3.2, β = P(x̄ ≤ 3.165 | μ = 3.2) = P(z ≤ (3.165 − 3.2)/(.5/√50)) = P(z ≤ −.49) = .5 − .1879 = .3121

Power = 1 − β = 1 − .3121 = .6879


As the plant's mean PCB departs further from 3, the power increases.


6.126

a.

Some preliminary calculations:


x̄ = Σx/n = 79.93/5 = 15.986

s² = [Σx² − (Σx)²/n] / (n − 1) = [1,277.7627 − (79.93)²/5] / (5 − 1) = .00043

s = √.00043 = .0207

To determine if the mean measurement differs from 16.01, we test:

H0: μ = 16.01
Ha: μ ≠ 16.01

The test statistic is t = (x̄ − μ0)/(s/√n) = (15.986 − 16.01)/(.0207/√5) = −2.59

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = n − 1 = 5 − 1 = 4. From Table VI, Appendix B, t.025 = 2.776. The rejection region is t < −2.776 or t > 2.776.

Since the observed value of the test statistic does not fall in the rejection region (t = −2.59 ≮ −2.776), H0 is not rejected. There is insufficient evidence to indicate the true mean measurement differs from 16.01 at α = .05.
b.

We must assume that the sample of measurements was randomly selected from a
population of measurements that is normally distributed.

c.

To determine if the standard deviation of the weight measurements is greater than .01, we
test:
H0: σ² = .01²
Ha: σ² > .01²

The test statistic is χ² = (n − 1)s²/σ0² = (5 − 1)(.0207)²/(.01)² = 16.0684

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = n − 1 = 5 − 1 = 4. From Table VII, Appendix B, χ².05 = 9.48773. The rejection region is χ² > 9.48773.

Since the observed value of the test statistic falls in the rejection region (χ² = 16.0684 > 9.48773), H0 is rejected. There is sufficient evidence to indicate the standard deviation of the weight measurements is greater than .01 at α = .05.


6.128

a.

Let pi = proportion of first round games won by the ith seed. To determine if the higher
seed has a better than 50-50 chance of winning a first-round game, we test:
H0: pi = .5
Ha: pi > .5   for i = 1, 2, 3, …, 8

The test statistic is zi = (p̂i − p0)/√(p0q0/n)

No value of α was given. We will use α = .05. The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

p̂i = xi/n. Thus, p̂1 = x1/n = 52/52 = 1.00, p̂2 = x2/n = 49/52 = .942, p̂3 = x3/n = 41/52 = .788, p̂4 = x4/n = 42/52 = .808, p̂5 = x5/n = 37/52 = .712, p̂6 = x6/n = 36/52 = .692, p̂7 = x7/n = 35/52 = .673, p̂8 = x8/n = 22/52 = .423

The corresponding test statistics are:

z1 = (1.00 − .5)/√(.5(.5)/52) = 7.21,   z2 = (.942 − .5)/√(.5(.5)/52) = 6.37,
z3 = (.788 − .5)/√(.5(.5)/52) = 4.15,   z4 = (.808 − .5)/√(.5(.5)/52) = 4.44,
z5 = (.712 − .5)/√(.5(.5)/52) = 3.06,   z6 = (.692 − .5)/√(.5(.5)/52) = 2.77,
z7 = (.673 − .5)/√(.5(.5)/52) = 2.50,   z8 = (.423 − .5)/√(.5(.5)/52) = −1.11

For games matching 1 and 16, since the observed value of the test statistic falls in the
rejection region (z1 = 7.21 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #1 seed has a better than 50-50 chance of winning a first-round game at α = .05.


For games matching 2 and 15, since the observed value of the test statistic falls in the
rejection region (z2 = 6.37 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #2 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 3 and 14, since the observed value of the test statistic falls in the
rejection region (z3 = 4.15 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #3 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 4 and 13, since the observed value of the test statistic falls in the
rejection region (z4 = 4.44 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #4 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 5 and 12, since the observed value of the test statistic falls in the
rejection region (z5 = 3.06 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #5 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 6 and 11, since the observed value of the test statistic falls in the
rejection region (z6 = 2.77 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #6 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 7 and 10, since the observed value of the test statistic falls in the
rejection region (z7 = 2.50 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #7 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 8 and 9, since the observed value of the test statistic does not fall in
the rejection region (z8 = 1.11 >/ 1.645), H0 is not rejected. There is insufficient
evidence to indicate the #8 seed has a better than 50-50 chance of winning a first-round
game at = .05.
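The eight z statistics can be reproduced with a short loop. This is an illustrative sketch only (scipy assumed available); the win counts are those used in part a.

```python
from math import sqrt
from scipy.stats import norm

n, p0 = 52, 0.5
wins = {1: 52, 2: 49, 3: 41, 4: 42, 5: 37, 6: 36, 7: 35, 8: 22}  # first-round wins by seed

se = sqrt(p0 * (1 - p0) / n)     # standard error under H0
z_crit = norm.ppf(0.95)          # upper-tail critical value, about 1.645

for seed, x in wins.items():
    z = (x / n - p0) / se
    verdict = "reject H0" if z > z_crit else "do not reject H0"
    print(f"seed {seed}: p-hat = {x / n:.3f}, z = {z:5.2f}  ->  {verdict}")
```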
b. Let μi = mean margin of victory. To determine if the mean margin of victory is greater than 10 points, we test:

H0: μi = 10
Ha: μi > 10   for i = 1, 2, 3, and 4

The test statistic is zi = (x̄i − μ0)/(si/√n)

No value of α was given. We will use α = .05. The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

The test statistics are:

z1 = (22.9 − 10)/(12.4/√52) = 7.50     z2 = (17.2 − 10)/(11.4/√52) = 4.55
z3 = (10.6 − 10)/(12.0/√52) = 0.36     z4 = (10.0 − 10)/(12.5/√52) = 0

For games matching 1 and 16 and games matching 2 and 15, the observed test statistics fall in the rejection region (z1 = 7.50 > 1.645 and z2 = 4.55 > 1.645), so H0 is rejected. There is sufficient evidence to indicate the #1 and #2 seeds win by more than 10 points in first-round games at α = .05.

For games matching 3 and 14 and games matching 4 and 13, the observed test statistics do not fall in the rejection region (z3 = 0.36 and z4 = 0 do not exceed 1.645), so H0 is not rejected. There is insufficient evidence to indicate that the #3 or #4 seed wins by more than 10 points in first-round games at α = .05.

c. Let μi = mean margin of victory. To determine if the mean margin of victory is less than 5 points, we test:

H0: μi = 5
Ha: μi < 5   for i = 5, 6, 7, and 8

The test statistic is zi = (x̄i − μ0)/(si/√n)

No value of α was given. We will use α = .05. The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

The test statistics are:

z5 = (5.3 − 5)/(10.4/√52) = 0.21      z6 = (4.3 − 5)/(10.7/√52) = −0.47
z7 = (3.2 − 5)/(10.5/√52) = −1.24     z8 = (−2.1 − 5)/(11.0/√52) = −4.65

For games matching 5 and 12, 6 and 11, and 7 and 10, the observed test statistics do not fall in the rejection region (z5 = 0.21, z6 = −0.47, and z7 = −1.24 are all greater than −1.645), so H0 is not rejected. There is insufficient evidence to indicate that the #5, #6, or #7 seed wins by less than 5 points in first-round games at α = .05.

For games matching 8 and 9, the observed value of the test statistic falls in the rejection region (z8 = −4.65 < −1.645), so H0 is rejected. There is sufficient evidence to indicate the #8 seed wins by less than 5 points in first-round games at α = .05.
d. To determine if the standard deviation of victory margin differs from 11, we test:

H0: σi² = 11² = 121
Ha: σi² ≠ 121

The test statistic is χi² = (n − 1)si²/σ0²

No α level was given, so we will use α = .05. The rejection region requires α/2 = .05/2 = .025 in each tail of the χ² distribution with df = n − 1 = 52 − 1 = 51. From Table VII, Appendix B, χ².025 = 71.4202 and χ².975 = 32.3574. The rejection region is χ² < 32.3574 or χ² > 71.4202.

The test statistics are:

χ1² = (52 − 1)(12.4)²/121 = 64.808     χ2² = (52 − 1)(11.4)²/121 = 54.777
χ3² = (52 − 1)(12.0)²/121 = 60.694     χ4² = (52 − 1)(12.5)²/121 = 65.857
χ5² = (52 − 1)(10.4)²/121 = 45.588     χ6² = (52 − 1)(10.7)²/121 = 48.256
χ7² = (52 − 1)(10.5)²/121 = 46.469     χ8² = (52 − 1)(11.0)²/121 = 51.000

For every matchup, from seeds 1 and 16 through seeds 8 and 9, the observed value of the test statistic falls between 32.3574 and 71.4202 and so does not fall in the rejection region. H0 is not rejected in each case. There is insufficient evidence to indicate that the standard deviation of victory margin differs from 11 for any of the eight pairings at α = .05.


e. Let μ = mean difference between game outcome and point spread. To determine if the point spread is a good predictor of the victory margin, we test:

H0: μ = 0
Ha: μ ≠ 0

The test statistic is z = (x̄ − 0)/(s/√n) = (.7 − 0)/(11.3/√360) = 1.18

Since no α was given, we will use α = .05. The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z > 1.96 or z < −1.96.

Since the observed value of the test statistic does not fall in the rejection region (−1.96 < z = 1.18 < 1.96), H0 is not rejected. There is insufficient evidence to indicate a difference between the game outcome and the point spread at α = .05. There is no evidence to indicate the point spread is not a good predictor of the victory margin.
6.130  Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Candy

Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
Candy      5   0  24.00     1.67   3.74    21.00  21.00   23.00  27.50    30.00

To give the benefit of the doubt to the students, we will use a small value of α. (We do not want to reject H0 when it is true and thereby favor the students.) Thus, we will use α = .001.

We must also assume that the sample comes from a normal distribution. To determine if the mean number of candies exceeds 15, we test:

H0: μ = 15
Ha: μ > 15

The test statistic is z = (x̄ − μo)/(s/√n) = (22 − 15)/(3/√5) = 5.22

The rejection region requires α = .001 in the upper tail of the z-distribution. From Table IV, Appendix B, z.001 = 3.08. The rejection region is z > 3.08.

Since the observed value of the test statistic falls in the rejection region (z = 5.22 > 3.08), H0 is rejected. There is sufficient evidence to indicate the mean number of candies exceeds 15 at α = .001.


Inferences Based on Two Samples:
Confidence Intervals and
Tests of Hypothesis

Chapter 7

7.2

a. μ(x̄1) = μ1 = 12;  σ(x̄1) = σ1/√n1 = 4/√64 = .5

b. μ(x̄2) = μ2 = 10;  σ(x̄2) = σ2/√n2 = 3/√64 = .375

c. μ(x̄1 − x̄2) = μ1 − μ2 = 12 − 10 = 2

d. σ(x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2) = √(4²/64 + 3²/64) = √(25/64) = .625

7.4  Since n1 ≥ 30 and n2 ≥ 30, the sampling distribution of x̄1 − x̄2 is approximately normal by the Central Limit Theorem.

Assumptions about the two populations:

1. Both sampled populations have relative frequency distributions that are approximately normal.
2. The population variances are equal.

Assumptions about the two samples:

The samples are randomly and independently selected from the populations.
7.6

a. sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(25 − 1)120 + (25 − 1)100]/(25 + 25 − 2) = 5280/48 = 110

b. sp² = [(20 − 1)12 + (10 − 1)20]/(20 + 10 − 2) = 408/28 = 14.5714

c. sp² = [(6 − 1).15 + (10 − 1).2]/(6 + 10 − 2) = 2.55/14 = .1821

d. sp² = [(16 − 1)3000 + (17 − 1)2500]/(16 + 17 − 2) = 85,000/31 = 2741.9355

e. sp² falls near the variance with the larger sample size.


7.8

a. σ(x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2) = √(9/100 + 16/100) = √.25 = .5

b. The sampling distribution of x̄1 − x̄2 is approximately normal by the Central Limit Theorem since n1 ≥ 30 and n2 ≥ 30, with mean μ(x̄1 − x̄2) = μ1 − μ2 = 10.

c. x̄1 − x̄2 = 15.5 − 26.6 = −11.1

Yes, it appears that x̄1 − x̄2 = −11.1 contradicts the null hypothesis H0: μ1 − μ2 = 10.

d. The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z < −1.96 or z > 1.96.

e. H0: μ1 − μ2 = 10
   Ha: μ1 − μ2 ≠ 10

The test statistic is z = [(x̄1 − x̄2) − 10]/√(σ1²/n1 + σ2²/n2) = [(15.5 − 26.6) − 10]/.5 = −42.2

The rejection region is z < −1.96 or z > 1.96. (Refer to part d.)

Since the observed value of the test statistic falls in the rejection region (z = −42.2 < −1.96), H0 is rejected. There is sufficient evidence to indicate the difference in the population means is not equal to 10 at α = .05.

f. The form of the confidence interval is:

(x̄1 − x̄2) ± zα/2 √(σ1²/n1 + σ2²/n2)

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The confidence interval is:

(15.5 − 26.6) ± 1.96√(9/100 + 16/100) ⇒ −11.1 ± .98 ⇒ (−12.08, −10.12)

We are 95% confident that the difference in the two means is between −12.08 and −10.12.

g. The confidence interval gives more information.
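A quick numerical check of the test statistic in part e and the 95% confidence interval in part f (an illustrative sketch only; scipy assumed available):

```python
from math import sqrt
from scipy.stats import norm

x1, x2 = 15.5, 26.6
var1, var2 = 9, 16          # population variances sigma1^2 and sigma2^2
n1 = n2 = 100

se = sqrt(var1 / n1 + var2 / n2)                  # 0.5

# Part e: two-tailed z-test of H0: mu1 - mu2 = 10
z = ((x1 - x2) - 10) / se                         # about -42.2
print(f"z = {z:.1f}")

# Part f: 95% confidence interval for mu1 - mu2
half_width = norm.ppf(0.975) * se                 # about 1.96 * 0.5
print(f"95% CI: ({x1 - x2 - half_width:.2f}, {x1 - x2 + half_width:.2f})")
```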


7.10  Some preliminary calculations:

x̄1 = Σx1/n1 = 654/15 = 43.6

s1² = [Σx1² − (Σx1)²/n1]/(n1 − 1) = [28,934 − 654²/15]/(15 − 1) = 419.6/14 = 29.9714

x̄2 = Σx2/n2 = 858/16 = 53.625

s2² = [Σx2² − (Σx2)²/n2]/(n2 − 1) = [46,450 − 858²/16]/(16 − 1) = 439.75/15 = 29.3167

sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(15 − 1)29.9714 + (16 − 1)29.3167]/(15 + 16 − 2) = 859.3501/29 = 29.6328

a. H0: μ2 − μ1 = 10
   Ha: μ2 − μ1 > 10

The test statistic is t = [(x̄2 − x̄1) − D0]/√[sp²(1/n1 + 1/n2)] = [(53.625 − 43.6) − 10]/√[29.6328(1/15 + 1/16)] = .025/1.9564 = .013

The rejection region requires α = .01 in the upper tail of the t-distribution with df = n1 + n2 − 2 = 15 + 16 − 2 = 29. From Table VI, Appendix B, t.01 = 2.462. The rejection region is t > 2.462.

Since the test statistic does not fall in the rejection region (t = .013 does not exceed 2.462), H0 is not rejected. There is insufficient evidence to conclude μ2 − μ1 > 10 at α = .01.

b. For confidence coefficient .98, α = .02 and α/2 = .01. From Table VI, Appendix B, with df = n1 + n2 − 2 = 15 + 16 − 2 = 29, t.01 = 2.462. The 98% confidence interval for (μ2 − μ1) is:

(x̄2 − x̄1) ± tα/2 √[sp²(1/n1 + 1/n2)] ⇒ (53.625 − 43.6) ± 2.462√[29.6328(1/15 + 1/16)]
⇒ 10.025 ± 4.817 ⇒ (5.208, 14.842)

We are 98% confident that the difference between the mean of population 2 and the mean of population 1 is between 5.208 and 14.842.
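The pooled-variance t procedures in parts a and b can be reproduced from the summary statistics. A minimal sketch (assumes scipy; small differences from the hand calculations are possible because of rounding):

```python
from math import sqrt
from scipy.stats import t

n1, xbar1, s1_sq = 15, 43.6, 29.9714
n2, xbar2, s2_sq = 16, 53.625, 29.3167
df = n1 + n2 - 2

sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df     # pooled variance, about 29.63
se = sqrt(sp_sq * (1 / n1 + 1 / n2))

# Part a: upper-tailed test of H0: mu2 - mu1 = 10
t_stat = ((xbar2 - xbar1) - 10) / se                   # about 0.013
t_crit = t.ppf(1 - 0.01, df)                           # about 2.462
print(f"t = {t_stat:.3f}, reject H0 if t > {t_crit:.3f}")

# Part b: 98% confidence interval for mu2 - mu1
half_width = t.ppf(1 - 0.01, df) * se
print(f"98% CI: ({(xbar2 - xbar1) - half_width:.3f}, {(xbar2 - xbar1) + half_width:.3f})")
```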


7.12

a. Let μ1 = mean carat size of diamonds certified by GIA and μ2 = mean carat size of diamonds certified by HRD. For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The 95% confidence interval is:

(x̄1 − x̄2) ± zα/2 √(s1²/n1 + s2²/n2) ⇒ (.6723 − .8129) ± 1.96√(.2456²/151 + .1831²/79)
⇒ −.1406 ± .0563 ⇒ (−.1969, −.0843)

b. We are 95% confident that the difference in mean carat size between diamonds certified by GIA and those certified by HRD is between −.1969 and −.0843.

c. Let μ3 = mean carat size of diamonds certified by IGI.

(x̄1 − x̄3) ± zα/2 √(s1²/n1 + s3²/n3) ⇒ (.6723 − .3665) ± 1.96√(.2456²/151 + .2163²/78)
⇒ .3058 ± .0620 ⇒ (.2438, .3678)

d. We are 95% confident that the difference in mean carat size between diamonds certified by GIA and those certified by IGI is between .2438 and .3678.

e. (x̄2 − x̄3) ± zα/2 √(s2²/n2 + s3²/n3) ⇒ (.8129 − .3665) ± 1.96√(.1831²/79 + .2163²/78)
⇒ .4464 ± .0627 ⇒ (.3837, .5091)

f. We are 95% confident that the difference in mean carat size between diamonds certified by HRD and those certified by IGI is between .3837 and .5091.

7.14

a. Let μ1 = mean score for males and μ2 = mean score for females. For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The 90% confidence interval is:

(x̄1 − x̄2) ± zα/2 √(s1²/n1 + s2²/n2) ⇒ (39.08 − 38.79) ± 1.645√(6.73²/127 + 6.94²/114)
⇒ 0.29 ± 1.452 ⇒ (−1.162, 1.742)

We are 90% confident that the difference in mean service-rating scores between males and females is between −1.162 and 1.742.

b. Because 0 falls in the 90% confidence interval, there is no evidence of a difference in the mean service-rating scores between males and females.


7.16

a. The descriptive statistics are:

Descriptive Statistics: US, Japan

Variable    N    Mean  Median  TrMean  StDev  SE Mean
US          5   6.562   6.870   6.562  1.217    0.544
Japan       5   3.118   3.220   3.118  1.227    0.549

Variable  Minimum  Maximum     Q1     Q3
US          4.770    8.000  5.415  7.555
Japan       1.920    4.910  1.970  4.215

sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(5 − 1)1.217² + (5 − 1)1.227²]/(5 + 5 − 2) = 1.4933

To determine if the mean annual percentage turnover for U.S. plants exceeds that for Japanese plants, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0

The test statistic is t = [(x̄1 − x̄2) − D0]/√[sp²(1/n1 + 1/n2)] = [(6.562 − 3.118) − 0]/√[1.4933(1/5 + 1/5)] = 4.456

The rejection region requires α = .05 in the upper tail of the t-distribution with df = n1 + n2 − 2 = 5 + 5 − 2 = 8. From Table VI, Appendix B, t.05 = 1.860. The rejection region is t > 1.860.

Since the observed value of the test statistic falls in the rejection region (t = 4.46 > 1.860), H0 is rejected. There is sufficient evidence to indicate the mean annual percentage turnover for U.S. plants exceeds that for Japanese plants at α = .05.

b. The p-value = P(t ≥ 4.456). Using Table VI, Appendix B, with df = n1 + n2 − 2 = 5 + 5 − 2 = 8, .001 < P(t ≥ 4.456) < .005. Since the p-value is so small, there is evidence to reject H0 for any α ≥ .005.

c. The necessary assumptions are:

1. Both sampled populations are approximately normal.
2. The population variances are equal.
3. The samples are randomly and independently selected.

There is no indication that the populations are not normal. Both sample variances are similar, so there is no evidence the population variances are unequal. There is no indication the assumptions are not valid.


7.18  Let μ1 = the mean relational intimacy score for participants in the CMC group and μ2 = the mean relational intimacy score for participants in the FTF group.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: CMC, FTF

Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
CMC       24   0  3.500    0.159  0.780    2.000  3.000   3.500  4.000    5.000
FTF       24   0  3.542    0.134  0.658    2.000  3.000   4.000  4.000    5.000

Some preliminary calculations are:

sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(24 − 1).780² + (24 − 1).658²]/(24 + 24 − 2) = 0.5207

To determine if the mean relational intimacy score for participants in the CMC group is lower than the mean relational intimacy score for participants in the FTF group, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 < 0

The test statistic is t = [(x̄1 − x̄2) − D0]/√[sp²(1/n1 + 1/n2)] = [(3.500 − 3.542) − 0]/√[.5207(1/24 + 1/24)] = −0.042/.20831 = −.20

The rejection region requires α = .10 in the lower tail of the t-distribution with df = n1 + n2 − 2 = 24 + 24 − 2 = 46. From Table VI, Appendix B, t.10 ≈ 1.303. The rejection region is t < −1.303.

Since the observed value of the test statistic does not fall in the rejection region (t = −.20 is not less than −1.303), H0 is not rejected. There is insufficient evidence to indicate that the mean relational intimacy score for participants in the CMC group is lower than the mean relational intimacy score for participants in the FTF group at α = .10.
7.20

a. The first population is the set of responses for all business students who have access to lecture notes and the second population is the set of responses for all business students not having access to lecture notes.

b. To determine if there is a difference in the mean response of the two groups, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

The test statistic is z = [(x̄1 − x̄2) − 0]/√(s1²/n1 + s2²/n2) = (8.48 − 7.80)/√(.94/86 + 2.99/35) = 2.19

The rejection region requires α/2 = .01/2 = .005 in each tail of the z-distribution. From Table IV, Appendix B, z.005 = 2.58. The rejection region is z < −2.58 or z > 2.58.

Since the observed value of the test statistic does not fall in the rejection region (z = 2.19 does not exceed 2.58), H0 is not rejected. There is insufficient evidence to indicate a difference in the mean response of the two groups at α = .01.

c. For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58. The confidence interval is:

(x̄1 − x̄2) ± z.005 √(s1²/n1 + s2²/n2) ⇒ (8.48 − 7.80) ± 2.58√(.94/86 + 2.99/35) ⇒ .68 ± .801 ⇒ (−.121, 1.481)

We are 99% confident that the difference in the mean response between the two groups is between −.121 and 1.481.

d. A 95% confidence interval would be narrower than the 99% confidence interval. The z value used in the 95% confidence interval is z.025 = 1.96, compared with the z value of z.005 = 2.58 used in the 99% confidence interval.

7.22

a.

The bacteria counts are probably normally distributed because each count is the median
of five measurements from the same specimen.

b. Let μ1 = mean of the bacteria count for the discharge and μ2 = mean of the bacteria count upstream. Since we want to test if the mean of the bacteria count for the discharge exceeds the mean of the count upstream, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0

c. Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Plant, Upstream

Variable    N    Mean  Median  TrMean  StDev  SE Mean
Plant       6   32.10   31.75   32.10   3.19     1.30
Upstream    6  29.617  30.000  29.617  2.355    0.961

Variable  Minimum  Maximum      Q1      Q3
Plant       28.20    36.20   29.40   35.23
Upstream   26.400   32.300  27.075  31.850

sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(6 − 1)3.19² + (6 − 1)2.355²]/(6 + 6 − 2) = 7.861

The test statistic is t = [(x̄1 − x̄2) − 0]/√[sp²(1/n1 + 1/n2)] = (32.10 − 29.617)/√[7.861(1/6 + 1/6)] = 1.53

No α level was given, so we will use α = .05. The rejection region requires α = .05 in the upper tail of the t-distribution with df = n1 + n2 − 2 = 6 + 6 − 2 = 10. From Table VI, Appendix B, t.05 = 1.812. The rejection region is t > 1.812.

Since the observed value of the test statistic does not fall in the rejection region (t = 1.53 does not exceed 1.812), H0 is not rejected. There is insufficient evidence to indicate the mean bacteria count for the discharge exceeds the mean of the count upstream at α = .05.

d. We must assume:

1. The counts per specimen for each location are normally distributed.
2. The variances of the two distributions are equal.
3. Independent and random samples were selected from each population.

7.24

a.

We cannot make inferences about the difference between the mean salaries of male
and female accounting/finance/banking professionals because no standard
deviations are provided.

b. To determine if the mean salary for males is significantly greater than that for females, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645.

To make things easier, we will assume that the standard deviations for the two groups are the same, σ1 = σ2 = σ.

The test statistic is

z = [(x̄1 − x̄2) − D0]/√(σ1²/n1 + σ2²/n2) = (69,484 − 52,012 − 0)/[σ√(1/1400 + 1/1400)] = 17,836/(.037796σ) = 471,896.2038/σ

In order to reject H0, this test statistic must fall in the rejection region, that is, exceed 1.645. Solving for σ:

z = 471,896.2038/σ > 1.645  ⇒  σ < 471,896.2038/1.645 = 286,866.99

Thus, to reject H0 the average of the two standard deviations has to be less than $286,866.99.

c. Yes. In fact, reasonable values for the standard deviation will be around $5,000, which is much smaller than the required $286,866.99.

d.

These data were collected from voluntary subjects who responded to a Web-based survey.
Thus, this is not a random sample, but a self-selected sample. Generally, subjects who
respond to surveys tend to have very strong opinions, which may not be the same as the
population in general. Thus, the results from this self-selected sample may not reflect the
results from the population in general.
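The back-solving in part b is easy to sanity-check numerically. A brief illustrative sketch (the $5,000 value is the rough figure mentioned in part c; everything else uses the numbers quoted in the solution):

```python
from math import sqrt

diff = 17836          # difference in sample mean salaries used in the solution, in dollars
n1 = n2 = 1400
z_crit = 1.645        # upper-tail critical value for alpha = .05

# Largest common sigma for which z = diff / (sigma * sqrt(1/n1 + 1/n2)) still exceeds 1.645
sigma_max = diff / (z_crit * sqrt(1 / n1 + 1 / n2))
print(f"H0 is rejected whenever the common sigma is below ${sigma_max:,.2f}")

# With a plausible salary standard deviation of about $5,000 the test is overwhelmingly significant
sigma = 5000
z = diff / (sigma * sqrt(1 / n1 + 1 / n2))
print(f"with sigma = ${sigma:,}: z = {z:.1f}")
```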

7.26

a.

Pair        1   2   3   4   5   6
Difference  3   2   2   4   0   1

d̄ = Σdi/nd = 12/6 = 2

sd² = [Σdi² − (Σdi)²/nd]/(nd − 1) = [34 − 12²/6]/(6 − 1) = 10/5 = 2

b. μd = μ1 − μ2

c. For confidence coefficient .95, α = .05 and α/2 = .025. From Table VI, Appendix B, with df = nd − 1 = 6 − 1 = 5, t.025 = 2.571. The confidence interval is:

d̄ ± tα/2 (sd/√nd) ⇒ 2 ± 2.571(√2/√6) ⇒ 2 ± 1.484 ⇒ (.516, 3.484)

d. H0: μd = 0
   Ha: μd ≠ 0

The test statistic is t = d̄/(sd/√nd) = 2/(√2/√6) = 3.46

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = nd − 1 = 6 − 1 = 5. From Table VI, Appendix B, t.025 = 2.571. The rejection region is t < −2.571 or t > 2.571.

Since the observed value of the test statistic falls in the rejection region (t = 3.46 > 2.571), H0 is rejected. There is sufficient evidence to indicate that the mean difference is different from 0 at α = .05.
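The confidence interval and test in parts c and d follow directly from the six differences. A short illustrative sketch (scipy assumed available):

```python
import numpy as np
from scipy import stats

d = np.array([3, 2, 2, 4, 0, 1])          # pair differences from part a
n = len(d)
d_bar, s_d = d.mean(), d.std(ddof=1)       # 2.0 and sqrt(2)

# Part c: 95% confidence interval for mu_d
t_crit = stats.t.ppf(0.975, df=n - 1)      # 2.571
half_width = t_crit * s_d / np.sqrt(n)
print(f"95% CI: ({d_bar - half_width:.3f}, {d_bar + half_width:.3f})")

# Part d: two-tailed test of H0: mu_d = 0 (a one-sample t-test on the differences)
t_stat, p_value = stats.ttest_1samp(d, popmean=0)
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
```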
7.28

a. H0: μ1 − μ2 = 0
   Ha: μ1 − μ2 < 0

The rejection region requires α = .10 in the lower tail of the z-distribution. From Table IV, Appendix B, z.10 = 1.28. The rejection region is z < −1.28.

b. H0: μ1 − μ2 = 0
   Ha: μ1 − μ2 < 0

The test statistic is z = (d̄ − 0)/√(sd²/nd) = (−3.5 − 0)/√(21/38) = −4.71

The rejection region is z < −1.28 (refer to part a).

Since the observed value of the test statistic falls in the rejection region (z = −4.71 < −1.28), H0 is rejected. There is sufficient evidence to indicate μ1 − μ2 < 0 at α = .10.

c. Since the sample size of the number of pairs is greater than 30, we do not need to assume that the population of differences is normal. The sampling distribution of d̄ is approximately normal by the Central Limit Theorem. We must assume that the differences are randomly selected.

d. For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The 90% confidence interval is:

d̄ ± z.05 √(sd²/nd) ⇒ −3.5 ± 1.645√(21/38) ⇒ −3.5 ± 1.223 ⇒ (−4.723, −2.277)

e. The confidence interval provides more information since it gives an interval of possible values for the difference between the population means.


7.30

a. Let μ1 = the mean salary of technology professionals in 2003 and μ2 = the mean salary of technology professionals in 2005. Let μd = μ1 − μ2.

To determine if the mean salary of technology professionals at all U.S. metropolitan areas has increased between 2003 and 2005, we test:

H0: μ1 − μ2 = 0    OR    H0: μd = 0
Ha: μ1 − μ2 < 0          Ha: μd < 0

b.

Metro Area          2003 Salary     2005 Salary     Difference
                    ($ thousands)   ($ thousands)   (2003 − 2005)
Silicon Valley           87.7            85.9             1.8
New York                 78.6            80.3            −1.7
Washington, D.C.         71.4            77.4            −6.0
Los Angeles              70.8            77.1            −6.3
Denver                   73.0            77.1            −4.1
Boston                   76.3            80.1            −3.8
Atlanta                  73.6            73.2             0.4
Chicago                  71.1            73.0            −1.9
Philadelphia             69.5            69.8            −0.3
San Diego                69.0            77.1            −8.1
Seattle                  71.0            66.9             4.1
Dallas-Ft. Worth         73.0            71.0             2.0
Detroit                  62.3            64.1            −1.8

c. d̄ = Σdi/nd = −25.7/13 = −1.977

sd² = [Σdi² − (Σdi)²/nd]/(nd − 1) = [206.59 − (−25.7)²/13]/(13 − 1) = 12.9819

sd = √12.9819 = 3.603

d. The test statistic is t = (d̄ − 0)/(sd/√nd) = (−1.977 − 0)/(3.603/√13) = −1.978

e. The rejection region requires α = .10 in the lower tail of the t-distribution with df = nd − 1 = 13 − 1 = 12. From Table VI, Appendix B, t.10 = 1.356. The rejection region is t < −1.356.

f. Since the observed value of the test statistic falls in the rejection region (t = −1.978 < −1.356), H0 is rejected. There is sufficient evidence to indicate the mean salary of technology professionals at all U.S. metropolitan areas has increased between 2003 and 2005 at α = .10.

g. In order for the inference to be valid, we must assume that the population of differences is normal and that we have a random sample.

[MINITAB histogram of the differences (Diff), with frequencies plotted over bins from −7.5 to 5.0]

The graph is fairly mound-shaped although it is somewhat skewed to the right. Since there are only 13 observations, this graph is close enough to being mound-shaped to indicate the normal assumption is reasonable.
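Because the raw salaries are listed in part b, the paired analysis can also be run directly. An illustrative sketch (scipy assumed; the one-sided p-value is obtained by halving the two-sided value so no particular SciPy version is required):

```python
import numpy as np
from scipy import stats

salary_2003 = np.array([87.7, 78.6, 71.4, 70.8, 73.0, 76.3, 73.6,
                        71.1, 69.5, 69.0, 71.0, 73.0, 62.3])
salary_2005 = np.array([85.9, 80.3, 77.4, 77.1, 77.1, 80.1, 73.2,
                        73.0, 69.8, 77.1, 66.9, 71.0, 64.1])

# Paired t-test of H0: mu_d = 0 against Ha: mu_d < 0, where d = 2003 - 2005
t_stat, p_two_sided = stats.ttest_rel(salary_2003, salary_2005)
p_one_sided = p_two_sided / 2 if t_stat < 0 else 1 - p_two_sided / 2
print(f"t = {t_stat:.3f}, one-sided p-value = {p_one_sided:.3f}")

# Critical-value check with alpha = .10 and df = n - 1 = 12
t_crit = stats.t.ppf(0.10, df=len(salary_2003) - 1)   # about -1.356
print(f"reject H0 if t < {t_crit:.3f}")
```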
7.32

a. The data should be analyzed as a paired difference experiment because each actor who won an Academy Award was paired with another actor with similar characteristics who did not win the award.

b. Let μ1 = mean life expectancy of Academy Award winners and μ2 = mean life expectancy of non-Academy Award winners. To compare the mean life expectancies of Academy Award winners and non-winners, we test:

H0: μ1 − μ2 = μd = 0
Ha: μd ≠ 0

c. Since the p-value was so small, there is sufficient evidence to indicate the mean life expectancies of the Academy Award winners and non-winners are different for any value of α > .003. Since the sample mean life expectancy of Academy Award winners is greater than that for non-winners, we can conclude that Academy Award winners have a longer mean life expectancy than non-winners.


7.34

a. Let μ1 = mean driver chest injury rating and μ2 = mean passenger chest injury rating. Because the data are paired, we are interested in μ1 − μ2 = μd, the difference in mean chest injury ratings between drivers and passengers.

b. The data were collected as matched pairs and thus must be analyzed as matched pairs. Two ratings are obtained for each car: the driver's chest injury rating and the passenger's chest injury rating.

c. Using MINITAB, the descriptive statistics are:

Descriptive Statistics: DrivChst, PassChst, diff

Variable    N    Mean  Median  TrMean  StDev  SE Mean
DrivChst   98  49.663  50.000  49.682  6.670    0.674
PassChst   98  50.224  50.500  50.148  7.107    0.718
diff       98  -0.561   0.000  -0.420  5.517    0.557

Variable  Minimum  Maximum      Q1      Q3
DrivChst   34.000   68.000  45.000  54.000
PassChst   35.000   69.000  45.000  55.000
diff      -15.000   13.000  -4.000   3.000

For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58. The 99% confidence interval is:

d̄ ± z.005 (sd/√nd) ⇒ −0.561 ± 2.58(5.517/√98) ⇒ −0.561 ± 1.438 ⇒ (−1.999, 0.877)

d. We are 99% confident that the difference between the mean chest injury ratings of drivers and front-seat passengers is between −1.999 and 0.877. Since 0 is in the confidence interval, there is no evidence that the true mean driver chest injury rating exceeds the true mean passenger chest injury rating.

e. Since the sample size is large, the sampling distribution of d̄ is approximately normal by the Central Limit Theorem. We must assume that the differences are randomly selected.

7.36

a. Let μC1 = mean relational intimacy score for the CMC group on the first meeting and μC3 = mean relational intimacy score for the CMC group on the third meeting. Let μCd = difference in mean relational intimacy score between the first and third meetings for the CMC group. To determine if the mean relational intimacy score will increase between the first and third meetings, we test:

H0: μCd = 0
Ha: μCd < 0

b. The researchers used the paired t-test because the same individuals participated in each of the three meeting sessions. Thus, the samples would not be independent.

c. Since the p-value is so small (p = .003), H0 would be rejected. There is sufficient evidence to indicate that the mean relational intimacy score for participants in the CMC group increased from the first to the third meeting for any value of α > .003.

d. Let μF1 = mean relational intimacy score for the FTF group on the first meeting and μF3 = mean relational intimacy score for the FTF group on the third meeting. Let μFd = difference in mean relational intimacy score between the first and third meetings for the FTF group. To determine if the mean relational intimacy score will change between the first and third meetings, we test:

H0: μFd = 0
Ha: μFd ≠ 0

e. Since the p-value is not small (p = .39), H0 would not be rejected. There is insufficient evidence to indicate that the mean relational intimacy score for participants in the FTF group changed from the first to the third meeting for any value of α < .39.

7.38

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Method1, Method2, Diff

Variable   N  N*   Mean  SE Mean  StDev  Minimum      Q1  Median      Q3  Maximum
Method1   10   0  13.39     4.18  13.22     1.00    1.30   10.35   24.63    34.40
Method2   10   0  13.10     3.96  12.51     1.40    1.78    9.50   25.05    30.70
Diff      10   0  0.290    0.553  1.750   -2.200  -0.875  -0.150   1.575    3.700

To determine if the mean transition error for method 1 differs from the mean transition error for method 2, we test:

H0: μ1 − μ2 = 0    OR    H0: μd = 0
Ha: μ1 − μ2 ≠ 0          Ha: μd ≠ 0

The test statistic is t = (d̄ − 0)/(sd/√nd) = (0.290 − 0)/(1.750/√10) = 0.52

The rejection region requires α/2 = .10/2 = .05 in each tail of the t-distribution with df = nd − 1 = 10 − 1 = 9. From Table VI, Appendix B, t.05 = 1.833. The rejection region is t < −1.833 or t > 1.833.

Since the observed value of the test statistic does not fall in the rejection region (t = 0.52 does not exceed 1.833), H0 is not rejected. There is insufficient evidence to indicate the mean transition error for method 1 differs from the mean transition error for method 2 at α = .10.
7.40  Using MINITAB, the descriptive statistics are:

Descriptive Statistics: HMETER, HSTATIC, Diff

Variable   N  N*       Mean   SE Mean     StDev   Minimum         Q1     Median        Q3   Maximum
HMETER    40   0     1.0405   0.00638    0.0403    0.9936     1.0047     1.0232    1.0883    1.1026
HSTATIC   40   0     1.0410   0.00649    0.0410    0.9930     1.0043     1.0237    1.0908    1.1052
Diff      40   0  -0.000523  0.000204  0.001291 -0.004480  -0.001078  -0.000165  0.000317  0.001580

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI, Appendix B, with df = nd − 1 = 40 − 1 = 39, t.025 ≈ 2.021. The 95% confidence interval is:

d̄ ± t.025 (sd/√nd) ⇒ −0.000523 ± 2.021(0.001291/√40) ⇒ −0.000523 ± 0.000413 ⇒ (−0.000936, −0.000110)

We are 95% confident that the true difference in mean density measurements between the two methods is between −0.000936 and −0.000110. Since the absolute value of this interval is completely less than the desired maximum difference of .002, the winery should choose the alternative method of measuring wine density.
7.42

a. The rejection region requires α = .01 in the lower tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z < −2.33.

b. The rejection region requires α = .025 in the lower tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z < −1.96.

c. The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

d. The rejection region requires α = .10 in the lower tail of the z-distribution. From Table IV, Appendix B, z.10 = 1.28. The rejection region is z < −1.28.

7.44  For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The 95% confidence interval for p1 − p2 is approximately:

a. (p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.65 − .58) ± 1.96√(.65(1 − .65)/400 + .58(1 − .58)/400)
⇒ .07 ± .067 ⇒ (.003, .137)

b. (p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.31 − .25) ± 1.96√(.31(1 − .31)/180 + .25(1 − .25)/250)
⇒ .06 ± .086 ⇒ (−.026, .146)

c. (p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.46 − .61) ± 1.96√(.46(1 − .46)/100 + .61(1 − .61)/120)
⇒ −.15 ± .131 ⇒ (−.281, −.019)
7.46  Some preliminary calculations:

p̂ = (n1p̂1 + n2p̂2)/(n1 + n2) = [55(.7) + 65(.6)]/(55 + 65) = 77.5/120 ≈ .65

q̂ = 1 − p̂ = 1 − .65 = .35

H0: p1 − p2 = 0
Ha: p1 − p2 > 0

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.7 − .6)/√[.65(.35)(1/55 + 1/65)] = .1/.08739 = 1.14

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.14 does not exceed 1.645), H0 is not rejected. There is insufficient evidence to indicate the proportion from population 1 is greater than that for population 2 at α = .05.
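The pooled two-proportion z-test above can be verified with a few lines of Python (an illustrative sketch; only the standard library and scipy are assumed):

```python
from math import sqrt
from scipy.stats import norm

n1, p1_hat = 55, 0.7
n2, p2_hat = 65, 0.6

# Pooled estimate of the common proportion under H0: p1 - p2 = 0
p_pool = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1_hat - p2_hat) / se                 # about 1.14
p_value = norm.sf(z)                       # upper-tail p-value
print(f"p-pooled = {p_pool:.3f}, z = {z:.2f}, p-value = {p_value:.3f}")
```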
7.48

a. Let p1 = proportion of men who prefer to keep track of appointments in their head and p2 = proportion of women who prefer to keep track of appointments in their head. To determine if the proportion of men who prefer to keep track of appointments in their head is greater than that of women, we test:

H0: p1 − p2 = 0
Ha: p1 − p2 > 0

b. p̂ = (n1p̂1 + n2p̂2)/(n1 + n2) = [500(.56) + 500(.46)]/(500 + 500) = .51 and q̂ = 1 − p̂ = 1 − .51 = .49

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.56 − .46)/√[.51(.49)(1/500 + 1/500)] = 3.16

c. The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

d. The p-value is p = P(z ≥ 3.16) ≈ .5 − .5 ≈ 0.

e. Since the observed value of the test statistic falls in the rejection region (z = 3.16 > 2.33), H0 is rejected. There is sufficient evidence to indicate the proportion of men who prefer to keep track of appointments in their head is greater than that of women at α = .01.

7.50

a. Let p1 = proportion of customers returning the printed survey and p2 = proportion of customers returning the electronic survey. Some preliminary calculations are:

p̂1 = x1/n1 = 261/631 = .414        p̂2 = x2/n2 = 155/414 = .374

For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The 90% confidence interval is:

(p̂1 − p̂2) ± z.05 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.414 − .374) ± 1.645√(.414(.586)/631 + .374(.626)/414)
⇒ .04 ± .051 ⇒ (−.011, .091)

We are 90% confident that the difference in the response rates for the two types of surveys is between −.011 and .091.

b. Since the value .05 falls in the 90% confidence interval, it is not an unusual value. Thus, there is no evidence that the difference in response rates is different from .05. The researchers would be able to make this inference.

7.52

a. Let p1 = proportion of managers and professionals who are male and p2 = proportion of part-time MBA students who are male. To see if the samples are sufficiently large:

p̂1 ± 3σ(p̂1) ⇒ p̂1 ± 3√(p̂1q̂1/n1) ⇒ .95 ± 3√(.95(.05)/162) ⇒ .95 ± .05 ⇒ (.90, 1.00)

p̂2 ± 3σ(p̂2) ⇒ p̂2 ± 3√(p̂2q̂2/n2) ⇒ .689 ± 3√(.689(.311)/109) ⇒ .689 ± .133 ⇒ (.556, .822)

Since both intervals are contained within the interval (0, 1), the normal approximation will be adequate.

First, we calculate the overall estimate of the common proportion under H0:

p̂ = (n1p̂1 + n2p̂2)/(n1 + n2) = [162(.95) + 109(.689)]/(162 + 109) = .845

To determine if the population of managers and professionals consists of more males than the part-time MBA population, we test:

H0: p1 = p2
Ha: p1 > p2

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.95 − .689)/√[.845(.155)(1/162 + 1/109)] = 5.82

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 5.82 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the population of managers and professionals consists of more males than the part-time MBA population at α = .05.

b. We had to assume:
1. Both samples were randomly selected.
2. Both sample sizes are sufficiently large.

c. First, we calculate the overall estimate of the common proportion under H0:

p̂ = (n1p̂1 + n2p̂2)/(n1 + n2) = [162(.912) + 109(.534)]/(162 + 109) = .760

To determine if the population of managers and professionals consists of more married individuals than the part-time MBA population, we test:

H0: p1 = p2
Ha: p1 > p2

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.912 − .534)/√[.760(.240)(1/162 + 1/109)] = 7.14

The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

Since the observed value of the test statistic falls in the rejection region (z = 7.14 > 2.33), H0 is rejected. There is sufficient evidence to indicate that the population of managers and professionals consists of more married individuals than the part-time MBA population at α = .01.

d. We had to assume:
1. Both samples were randomly selected.
2. Both sample sizes are sufficiently large.

7.54  Let p1 = accuracy rate for modules with correct code and p2 = accuracy rate for modules with defective code.

Some preliminary calculations are:

p̂1 = x1/n1 = 400/449 = .891        p̂2 = x2/n2 = 20/49 = .408

For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58. The 99% confidence interval is:

(p̂1 − p̂2) ± z.005 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.891 − .408) ± 2.58√(.891(.109)/449 + .408(.592)/49)
⇒ .483 ± .185 ⇒ (.298, .668)

We are 99% confident that the difference in accuracy rates between modules with correct code and modules with defective code is between .298 and .668.
7.56

a. Let p = proportion of all children who recognize Joe Camel.

p̂ = x/n = (15 + 46)/(28 + 55) = 61/83 = .735        q̂ = 1 − p̂ = 1 − .735 = .265

To see if the sample is sufficiently large:

p̂ ± 3σ(p̂) ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .735 ± 3√(.735(.265)/83) ⇒ .735 ± .145 ⇒ (.590, .880)

Since the interval lies within the interval (0, 1), the normal approximation will be adequate.

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The 95% confidence interval is:

p̂ ± z.025 √(p̂q̂/n) ⇒ .735 ± 1.96√(.735(.265)/83) ⇒ .735 ± .095 ⇒ (.640, .830)

We are 95% confident that the proportion of all children who recognize Joe Camel is between .640 and .830.

b. Let p1 = proportion of children under the age of 6 who recognize Joe Camel and p2 = proportion of children age 6 and over who recognize Joe Camel.

p̂1 = x1/n1 = 15/28 = .536        q̂1 = 1 − p̂1 = 1 − .536 = .464
p̂2 = x2/n2 = 46/55 = .836        q̂2 = 1 − p̂2 = 1 − .836 = .164

To see if the samples are sufficiently large:

p̂1 ± 3√(p̂1q̂1/n1) ⇒ .536 ± 3√(.536(.464)/28) ⇒ .536 ± .283 ⇒ (.253, .819)
p̂2 ± 3√(p̂2q̂2/n2) ⇒ .836 ± 3√(.836(.164)/55) ⇒ .836 ± .150 ⇒ (.686, .986)

Since both intervals lie within the interval (0, 1), the normal approximation will be adequate.

To determine if the recognition of Joe Camel increases with age, we test:

H0: p1 − p2 = 0
Ha: p1 − p2 < 0

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.536 − .836)/√[.735(.265)(1/28 + 1/55)] = −2.93

The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

Since the observed value of the test statistic falls in the rejection region (z = −2.93 < −1.645), H0 is rejected. There is sufficient evidence to indicate that the recognition of Joe Camel increases with age at α = .05.
7.58

a. For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.96)²(15² + 17²)/3.2² = 192.83 ≈ 193

b. If the range of each population is 60, we would estimate σ by: σ ≈ 60/4 = 15.

For confidence coefficient .99, α = 1 − .99 = .01 and α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (2.58)²(15² + 15²)/8² = 46.80 ≈ 47

c. For confidence coefficient .9, α = 1 − .9 = .1 and α/2 = .1/2 = .05. From Table IV, Appendix B, z.05 = 1.645. For a width of 1, the bound (margin of error) is .5.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.645)²(5.8 + 7.5)/.5² = 143.96 ≈ 144

7.60

First, find the sample sizes needed for width 5, or margin of error 2.5.

For confidence coefficient .9, α = 1 − .9 = .1 and α/2 = .1/2 = .05. From Table IV, Appendix B, z.05 = 1.645.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.645)²(10² + 10²)/2.5² = 86.59 ≈ 87

Thus, the necessary sample size from each population is 87. Therefore, sufficient funds have been allocated to meet the specifications since n1 = n2 = 100 are large enough samples.
7.62  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.96)²(3.189² + 2.355²)/1.5² = 26.8 ≈ 27

We would need to sample 27 specimens from each location.
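The same sample-size arithmetic can be wrapped in a small helper. A sketch under the usual equal-n assumption (the function name is ours, not the text's):

```python
from math import ceil
from scipy.stats import norm

def equal_n_for_mean_difference(conf_level, sigma1_sq, sigma2_sq, margin_of_error):
    """Equal sample size per group for estimating mu1 - mu2 to within the given margin of error."""
    z = norm.ppf(1 - (1 - conf_level) / 2)
    n = z**2 * (sigma1_sq + sigma2_sq) / margin_of_error**2
    return ceil(n)

# Exercise 7.62: 95% confidence, sigma estimates 3.189 and 2.355, margin of error 1.5
print(equal_n_for_mean_difference(0.95, 3.189**2, 2.355**2, 1.5))   # 27
```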


7.64  For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. Since no information is given about the values of p1 and p2, we will be conservative and use .5 for both. A width of .04 means the bound is .04/2 = .02.

n1 = n2 = (zα/2)²(p1q1 + p2q2)/ME² = (1.645)²(.5(.5) + .5(.5))/.02² = 3,382.5 ≈ 3,383

7.66

a.
= 3,382.5 3,383

For confidence coefficient .80, α = 1 − .80 = .20 and α/2 = .20/2 = .10. From Table IV, Appendix B, z.10 = 1.28. Since we have no prior information about the proportions, we use p1 = p2 = .5 to get a conservative estimate. For a width of .06, the margin of error is .03.

n1 = n2 = (zα/2)²(p1q1 + p2q2)/ME² = (1.28)²(.5(1 − .5) + .5(1 − .5))/.03² = 910.22 ≈ 911

b. For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. Using the formula for the sample size needed to estimate a single proportion,

n = (zα/2)² pq/ME² = (1.645)²(.5(1 − .5))/.02² = .6765/.0004 = 1691.27 ≈ 1692

No, the sample size from part a is not large enough.


7.68  For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.96)²(35² + 80²)/10² = 292.9 ≈ 293

7.70

a. With ν1 = 2 and ν2 = 30, P(F ≥ 5.39) = .01 (Table XI, Appendix B)

b. With ν1 = 24 and ν2 = 10, P(F ≥ 2.74) = .05 (Table IX, Appendix B)

Thus, P(F < 2.74) = 1 − P(F ≥ 2.74) = 1 − .05 = .95.

c. With ν1 = 7 and ν2 = 1, P(F ≥ 236.8) = .05 (Table VIII, Appendix B)

Thus, P(F < 236.8) = 1 − P(F ≥ 236.8) = 1 − .05 = .95.

d. With ν1 = 40 and ν2 = 40, P(F > 2.11) = .01 (Table XI, Appendix B)

7.72

To test H0: σ1² = σ2² against Ha: σ1² ≠ σ2², the rejection region is F > Fα/2 with ν1 = 10 and ν2 = 12.

a. α = .20, α/2 = .10: Reject H0 if F > F.10 = 2.19 (Table VIII, Appendix B)

b. α = .10, α/2 = .05: Reject H0 if F > F.05 = 2.75 (Table IX, Appendix B)

c. α = .05, α/2 = .025: Reject H0 if F > F.025 = 3.37 (Table X, Appendix B)

d. α = .02, α/2 = .01: Reject H0 if F > F.01 = 4.30 (Table XI, Appendix B)

7.74

a. To determine if a difference exists between the population variances, we test:

H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

The test statistic is F = s2²/s1² = 8.75/3.87 = 2.26

The rejection region requires α/2 = .10/2 = .05 in the upper tail of the F-distribution with ν1 = n2 − 1 = 27 − 1 = 26 and ν2 = n1 − 1 = 12 − 1 = 11. From Table IX, Appendix B, F.05 ≈ 2.60. The rejection region is F > 2.60.

Since the observed value of the test statistic does not fall in the rejection region (F = 2.26 does not exceed 2.60), H0 is not rejected. There is insufficient evidence to indicate a difference between the population variances.

b. The p-value is 2P(F ≥ 2.26). From Tables VIII and IX, with ν1 = 26 and ν2 = 11,

2(.05) < 2P(F ≥ 2.26) < 2(.10)  ⇒  .10 < p-value < .20

There is no evidence to reject H0 for α ≤ .10.
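The F statistic and its two-tailed p-value can be confirmed with scipy's F distribution (an illustrative sketch, not part of the text's solution):

```python
from scipy.stats import f

n1, s1_sq = 12, 3.87     # sample with the smaller variance
n2, s2_sq = 27, 8.75     # sample with the larger variance

F = s2_sq / s1_sq                                   # about 2.26, larger variance on top
df_num, df_den = n2 - 1, n1 - 1                     # 26 and 11

F_crit = f.ppf(1 - 0.10 / 2, df_num, df_den)        # upper-tail critical value for alpha = .10
p_value = 2 * f.sf(F, df_num, df_den)               # two-tailed p-value
print(f"F = {F:.2f}, critical value = {F_crit:.2f}, p-value = {p_value:.3f}")
```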

7.76  Let σ1² = variance of carat size for diamonds certified by GIA, σ2² = variance of carat size for diamonds certified by HRD, and σ3² = variance of carat size for diamonds certified by IGI.

a. To determine if the variation in carat size differs for diamonds certified by GIA and diamonds certified by HRD, we test:

H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

The test statistic is F = Larger sample variance/Smaller sample variance = s1²/s2² = .2456²/.1831² = 1.799

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν1 = n1 − 1 = 151 − 1 = 150 and ν2 = n2 − 1 = 79 − 1 = 78. From Table X, Appendix B, F.025 ≈ 1.43. The rejection region is F > 1.43.

Since the observed value of the test statistic falls in the rejection region (F = 1.799 > 1.43), H0 is rejected. There is sufficient evidence to indicate the variation in carat size differs for diamonds certified by GIA and those certified by HRD at α = .05.

b. To determine if the variation in carat size differs for diamonds certified by GIA and diamonds certified by IGI, we test:

H0: σ1² = σ3²
Ha: σ1² ≠ σ3²

The test statistic is F = Larger sample variance/Smaller sample variance = s1²/s3² = .2456²/.2163² = 1.289

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν1 = n1 − 1 = 151 − 1 = 150 and ν2 = n3 − 1 = 78 − 1 = 77. From Table X, Appendix B, F.025 ≈ 1.43. The rejection region is F > 1.43.

Since the observed value of the test statistic does not fall in the rejection region (F = 1.289 does not exceed 1.43), H0 is not rejected. There is insufficient evidence to indicate the variation in carat size differs for diamonds certified by GIA and those certified by IGI at α = .05.

c. To determine if the variation in carat size differs for diamonds certified by HRD and diamonds certified by IGI, we test:

H0: σ2² = σ3²
Ha: σ2² ≠ σ3²

The test statistic is F = Larger sample variance/Smaller sample variance = s3²/s2² = .2163²/.1831² = 1.396

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν1 = n3 − 1 = 78 − 1 = 77 and ν2 = n2 − 1 = 79 − 1 = 78. From Table X, Appendix B, F.025 ≈ 1.67. The rejection region is F > 1.67.

Since the observed value of the test statistic does not fall in the rejection region (F = 1.396 does not exceed 1.67), H0 is not rejected. There is insufficient evidence to indicate the variation in carat size differs for diamonds certified by HRD and those certified by IGI at α = .05.
d. We will look at four methods for determining if the data are normal. First, we look at histograms of the data.

[MINITAB histograms of carat size (percent vs. carat) for the GIA, HRD, and IGI samples]

From the histograms, none of the data appear to be mound-shaped. It appears that none of the data sets are normal.

Next, we look at the intervals x̄ ± s, x̄ ± 2s, and x̄ ± 3s. If the proportions of observations falling in each interval are approximately .68, .95, and 1.00, then the data are approximately normal. Using MINITAB, the summary statistics are:

Descriptive Statistics: GIA, IGI, HRD

Variable    N    Mean  Median  TrMean   StDev  SE Mean
GIA       151  0.6723  0.7000  0.6713  0.2456   0.0200
IGI        78  0.3665  0.2900  0.3406  0.2163   0.0245
HRD        79  0.8129  0.8100  0.8169  0.1831   0.0206

Variable  Minimum  Maximum      Q1      Q3
GIA        0.3000   1.1000  0.5000  0.9000
IGI        0.1800   1.0100  0.2100  0.4850
HRD        0.5000   1.0900  0.6500  1.0000

For GIA:

x̄ ± s ⇒ .6723 ± .2456 ⇒ (.4267, .9179). 84 of the 151 values fall in this interval, a proportion of .56. This is much smaller than the .68 we would expect if the data were normal.

x̄ ± 2s ⇒ .6723 ± .4912 ⇒ (.1811, 1.1635). 151 of the 151 values fall in this interval, a proportion of 1.00. This is much larger than the .95 we would expect if the data were normal.

x̄ ± 3s ⇒ .6723 ± .7368 ⇒ (−.0645, 1.4091). 151 of the 151 values fall in this interval, a proportion of 1.00. This is the same as the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

For IGI:

x̄ ± s ⇒ .3665 ± .2163 ⇒ (.1502, .5828). 69 of the 78 values fall in this interval, a proportion of .88. This is much larger than the .68 we would expect if the data were normal.

x̄ ± 2s ⇒ .3665 ± .4326 ⇒ (−.0661, .7991). 74 of the 78 values fall in this interval, a proportion of .95. This is the same as the .95 we would expect if the data were normal.

x̄ ± 3s ⇒ .3665 ± .6489 ⇒ (−.2824, 1.0154). 78 of the 78 values fall in this interval, a proportion of 1.00. This is the same as the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

For HRD:

x̄ ± s ⇒ .8129 ± .1831 ⇒ (.6298, .9960). 30 of the 79 values fall in this interval, a proportion of .38. This is much smaller than the .68 we would expect if the data were normal.

x̄ ± 2s ⇒ .8129 ± .3662 ⇒ (.4467, 1.1791). 79 of the 79 values fall in this interval, a proportion of 1.00. This is much larger than the .95 we would expect if the data were normal.

x̄ ± 3s ⇒ .8129 ± .5493 ⇒ (.2636, 1.3622). 79 of the 79 values fall in this interval, a proportion of 1.00. This is the same as the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

Next, we look at the ratio of the IQR to s.

For GIA: IQR = QU − QL = 1.1 − .3 = .8, so IQR/s = .8/.2456 = 3.26. This is much larger than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.

For IGI: IQR = QU − QL = 1.01 − .18 = .83, so IQR/s = .83/.2163 = 3.84. This is much larger than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.

For HRD: IQR = QU − QL = 1.09 − .5 = .59, so IQR/s = .59/.1831 = 3.22. This is much larger than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.


Finally, using MINITAB, normal probability plots were constructed for each certification body.

[MINITAB normal probability plot for GIA, with ML estimates Mean = 0.672252 and StDev = 0.244757 and goodness-of-fit statistic AD* = 3.332]

Since the data do not form a straight line, the GIA data are not normal.

[MINITAB normal probability plot for IGI, with ML estimates Mean = 0.366538 and StDev = 0.214863 and goodness-of-fit statistic AD* = 5.622]

Since the data do not form a straight line, the IGI data are not normal.

[MINITAB normal probability plot for HRD, with ML estimates Mean = 0.812911 and StDev = 0.181890 and goodness-of-fit statistic AD* = 3.539]

Since the data do not form a straight line, the HRD data are not normal.

From the four different methods, all indications are that the carat size data are not normal for any of the certification bodies.
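The interval-coverage and IQR/s checks used in part d are easy to script for any sample. A generic sketch (the simulated data array is a placeholder, not the diamond data; substitute the carat-size measurements to reproduce the checks above):

```python
import numpy as np

def normality_checks(x):
    """Empirical-rule coverage and IQR/s ratio used as informal normality checks."""
    x = np.asarray(x, dtype=float)
    mean, s = x.mean(), x.std(ddof=1)
    coverage = [np.mean(np.abs(x - mean) <= k * s) for k in (1, 2, 3)]   # compare to .68, .95, 1.00
    q1, q3 = np.percentile(x, [25, 75])
    ratio = (q3 - q1) / s                                                # roughly 1.3 for normal data
    return coverage, ratio

# Example with simulated data; replace with the actual measurements of interest
rng = np.random.default_rng(0)
coverage, ratio = normality_checks(rng.normal(size=200))
print(coverage, round(ratio, 2))
```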
7.78

a.

The amount of variability of GHQ scores tells us how similar or different the members of
the group are on GHQ scores. The larger the variability, the larger the differences are
among the members on the GHQ scores. The smaller the variability, the smaller the
differences are among the members on the GHQ scores.

b.

Let σ₁² = variance of the mental health scores of the employed and σ₂² = variance of the mental health scores of the unemployed. To determine if the variability in mental health scores differs for employed and unemployed workers, we test:

H0: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²

c.

The test statistic is F = Larger sample variance / Smaller sample variance = s₁²/s₂² = 5.10²/3.26² = 2.45

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν₁ = n₂ − 1 = 49 − 1 = 48 and ν₂ = n₁ − 1 = 142 − 1 = 141. From Table XI, Appendix B, F.025 ≈ 1.61. The rejection region is F > 1.61.

Since the observed value of the test statistic falls in the rejection region (F = 2.45 > 1.61), H0 is rejected. There is sufficient evidence to indicate that the variability in mental health scores differs for employed and unemployed workers at α = .05.
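A brief Python sketch of this variance-ratio test from the summary statistics follows. It is illustrative only; the use of scipy for the critical value is my choice, and the pairing of the larger standard deviation (5.10) with the sample of size 49 follows the degrees of freedom quoted above.

from scipy.stats import f

s_large, n_large = 5.10, 49     # sample with the larger variance
s_small, n_small = 3.26, 142    # sample with the smaller variance
F = s_large**2 / s_small**2                         # about 2.45
crit = f.ppf(1 - 0.05/2, n_large - 1, n_small - 1)  # upper .025 critical value, about 1.6
print(round(F, 2), round(crit, 2), F > crit)        # F exceeds the critical value, so reject H0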


d.

We must assume that the 2 populations of mental health scores are normally distributed. We must also assume that we selected 2 independent random samples.

7.80

Let σ₁² = variance of zinc measurements from the text-line, σ₂² = variance of zinc measurements from the witness-line, and σ₃² = variance of zinc measurements from the intersection. Using MINITAB, the descriptive statistics are:
Descriptive Statistics: Text-line, Witness-line, Intersection

Variable     N     Mean   Median   TrMean    StDev   SE Mean
Text-lin     3   0.3830   0.3740   0.3830   0.0531    0.0306
Witness-     6   0.3042   0.2955   0.3042   0.1015    0.0415
Intersec     5   0.3290   0.3190   0.3290   0.0443    0.0198

Variable   Minimum   Maximum       Q1       Q3
Text-lin    0.3350    0.4400   0.3350   0.4400
Witness-    0.1880    0.4390   0.2045   0.4075
Intersec    0.2850    0.3930   0.2900   0.3730

a.

To determine if the variation in the zinc measurements for the text-line and the
intersection differ, we test:
H0: σ₁² = σ₃²
Ha: σ₁² ≠ σ₃²
The test statistic is F = Larger sample variance / Smaller sample variance = s₁²/s₃² = .0531²/.0443² = 1.437

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν₁ = n₁ − 1 = 3 − 1 = 2 and ν₂ = n₃ − 1 = 5 − 1 = 4. From Table X, Appendix B, F.025 = 10.65. The rejection region is F > 10.65.

Since the observed value of the test statistic does not fall in the rejection region (F = 1.437 ≯ 10.65), H0 is not rejected. There is insufficient evidence to indicate the variation in the zinc measurements for the text-line and the intersection differ at α = .05.
b.

To determine if the variation in the zinc measurements for the witness-line and the
intersection differ, we test:
H0: σ₂² = σ₃²
Ha: σ₂² ≠ σ₃²
The test statistic is F = Larger sample variance / Smaller sample variance = s₂²/s₃² = .1015²/.0443² = 5.250

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν₁ = n₂ − 1 = 6 − 1 = 5 and ν₂ = n₃ − 1 = 5 − 1 = 4. From Table X, Appendix B, F.025 = 9.36. The rejection region is F > 9.36.

Since the observed value of the test statistic does not fall in the rejection region (F = 5.250 ≯ 9.36), H0 is not rejected. There is insufficient evidence to indicate the variation in the zinc measurements for the witness-line and the intersection differ at α = .05.


c.

There is no indication that the variances of the zinc measurements for the three locations differ.

d.

With only 3, 6, and 5 measurements, it is very difficult to check the assumptions.

7.82

Using MINITAB, some preliminary calculations are:


Descriptive Statistics: HEATRATE

Variable   ENGINE         N   N*    Mean   SE Mean   StDev   Minimum      Q1   Median
HEATRATE   Advanced      21    0    9764       139     639      9105    9252     9669
           Aeroderiv      7    0   12312      1002    2652      8714    9469    12414
           Traditional   39    0   11544       205    1279     10086   10592    11183

Variable   ENGINE            Q3   Maximum
HEATRATE   Advanced       10060     11588
           Aeroderiv      14628     16243
           Traditional    11964     14796

a.

To determine if the heat rate variances for traditional and aeroderivative augmented gas
turbines differ, we test:
H0: σ₂² = σ₃²
Ha: σ₂² ≠ σ₃²
The test statistic is F = Larger sample variance / Smaller sample variance = s₂²/s₃² = 2652²/1279² = 4.299

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with numerator df = ν₂ = n₂ − 1 = 7 − 1 = 6 and denominator df = ν₃ = n₃ − 1 = 39 − 1 = 38. From Table X, Appendix B, F.025 ≈ 2.74. The rejection region is F > 2.74.

Since the observed value of the test statistic falls in the rejection region (F = 4.299 > 2.74), H0 is rejected. There is sufficient evidence to indicate the heat rate variances for traditional and aeroderivative augmented gas turbines differ at α = .05.

Since the test in Exercise 7.23 a assumes that the population variances are the same, the validity of the test is suspect since we just found the variances are different.
b.

To determine if the heat rate variances for advanced and aeroderivative augmented gas
turbines differ, we test:
H0: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²


The test statistic is F = Larger sample variance / Smaller sample variance = s₂²/s₁² = 2652²/639² = 17.224
The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with numerator df = ν₁ = n₁ − 1 = 7 − 1 = 6 and denominator df = ν₂ = n₂ − 1 = 21 − 1 = 20. From Table X, Appendix B, F.025 = 3.13. The rejection region is F > 3.13.

Since the observed value of the test statistic falls in the rejection region (F = 17.224 > 3.13), H0 is rejected. There is sufficient evidence to indicate the heat rate variances for advanced and aeroderivative augmented gas turbines differ at α = .05.

Since the test in Exercise 7.23 b assumes that the population variances are the same, the validity of the test is suspect since we just found the variances are different.
7.84

a.

The 2 samples are randomly selected in an independent manner from the two populations. The sample sizes, n₁ and n₂, are large enough so that x̄₁ and x̄₂ each have approximately normal sampling distributions and so that s₁² and s₂² provide good approximations to σ₁² and σ₂². This will be true if n₁ ≥ 30 and n₂ ≥ 30.

b.

1. Both sampled populations have relative frequency distributions that are approximately normal.
2. The population variances are equal.
3. The samples are randomly and independently selected from the populations.

c.

1. The relative frequency distribution of the population of differences is normal.
2. The sample of differences is randomly selected from the population of differences.

d.

The two samples are independent random samples from binomial distributions. Both samples should be large enough so that the normal distribution provides an adequate approximation to the sampling distributions of p̂₁ and p̂₂.

e.

The two samples are independent random samples from populations which are normally
distributed.

7.86

a. H0: σ₁² = σ₂²
   Ha: σ₁² ≠ σ₂²

The test statistic is F = Larger sample variance / Smaller sample variance = s₂²/s₁² = 120.1/31.3 = 3.84

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with numerator df ν₁ = n₂ − 1 = 15 − 1 = 14 and denominator df ν₂ = n₁ − 1 = 20 − 1 = 19. From Table XI, Appendix B, F.025 ≈ 2.66. The rejection region is F > 2.66.

Since the observed value of the test statistic falls in the rejection region (F = 3.84 > 2.66), H0 is rejected. There is sufficient evidence to conclude σ₁² ≠ σ₂² at α = .05.


b. No, we should not use a small-sample t test to test H0: (μ₁ − μ₂) = 0 against Ha: (μ₁ − μ₂) ≠ 0 because the assumption of equal variances does not seem to hold, since we concluded σ₁² ≠ σ₂² in part a.

7.88

Some preliminary calculations are:

p̂₁ = x₁/n₁ = 110/200 = .55;  p̂₂ = x₂/n₂ = 130/200 = .65;  p̂ = (x₁ + x₂)/(n₁ + n₂) = (110 + 130)/(200 + 200) = 240/400 = .60

a. H0: (p₁ − p₂) = 0
   Ha: (p₁ − p₂) < 0
The test statistic is z = [(p̂₁ − p̂₂) − 0] / √[p̂q̂(1/n₁ + 1/n₂)] = (.55 − .65) / √[.6(1 − .6)(1/200 + 1/200)] = −.10/.049 = −2.04

The rejection region requires α = .10 in the lower tail of the z-distribution. From Table IV, Appendix B, z.10 = 1.28. The rejection region is z < −1.28.

Since the observed value of the test statistic falls in the rejection region (z = −2.04 < −1.28), H0 is rejected. There is sufficient evidence to conclude (p₁ − p₂) < 0 at α = .10.
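A small Python sketch of this large-sample test for p₁ − p₂ is given below; it is an illustration, not part of the original solution, and simply reuses the counts 110 of 200 and 130 of 200 quoted above.

from math import sqrt
from scipy.stats import norm

x1, n1, x2, n2 = 110, 200, 130, 200
p1_hat, p2_hat = x1/n1, x2/n2
p_pool = (x1 + x2) / (n1 + n2)                         # pooled proportion under H0
se = sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))
z = (p1_hat - p2_hat) / se                             # about -2.04
p_value = norm.cdf(z)                                  # lower-tailed alternative
print(round(z, 2), round(p_value, 4))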
b.

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The 95% confidence interval for (p₁ − p₂) is approximately:

(p̂₁ − p̂₂) ± z.025 √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂) ⇒ (.55 − .65) ± 1.96 √[.55(1 − .55)/200 + .65(1 − .65)/200] ⇒ −.10 ± .096 ⇒ (−.196, −.004)

c. From part b, z.025 = 1.96. Using the information from our samples, we can use p̂₁ = .55 and p̂₂ = .65. For a width of .01, the margin of error is ME = .005.

n₁ = n₂ = (z.025)²(p̂₁q̂₁ + p̂₂q̂₂)/(ME)² = (1.96)²[.55(1 − .55) + .65(1 − .65)]/.005² = 1.82476/.000025 = 72,990.4 ≈ 72,991


7.90

a.

Let p₁ = proportion of Opening Doors students enrolled full time and p₂ = proportion of traditional students enrolled full time.

The target parameter for this comparison is p₁ − p₂.

b.

Let μ₁ = mean GPA of Opening Doors students and μ₂ = mean GPA of traditional students.

The target parameter for this comparison is μ₁ − μ₂.

7.92

Using MINITAB, some preliminary calculations are:


Descriptive Statistics: Spillage

Variable   Cause        N   N*    Mean   SE Mean   StDev   Minimum      Q1   Median
Spillage   Collision   10    0    76.6      22.3    70.4      31.0    35.0     41.5
           Fire        12    0    70.9      17.5    60.7      26.0    32.3     49.0
           Grounding   14    0   47.79      7.61   28.47     21.00   30.25    37.50
           HullFail    12    0    54.4      16.3    56.4      24.0    29.3     31.5
           Unknown      2    0   26.00      1.00    1.41     25.00       *    26.00

Variable   Cause           Q3   Maximum
Spillage   Collision    102.0     257.0
           Fire          80.5     239.0
           Grounding    59.00    124.00
           HullFail      46.0     221.0
           Unknown          *     27.00

a.

Let μ₁ = mean spillage for accidents caused by collision and μ₂ = mean spillage for accidents caused by fire/explosion.

s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [(10 − 1)70.4² + (12 − 1)60.7²]/(10 + 12 − 2) = 4,256.7415

For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table VI, Appendix B, with df = n₁ + n₂ − 2 = 10 + 12 − 2 = 20, t.05 = 1.725. The confidence interval is:

(x̄₁ − x̄₂) ± t.05 √[s_p²(1/n₁ + 1/n₂)] ⇒ (76.6 − 70.9) ± 1.725 √[4,256.7415(1/10 + 1/12)] ⇒ 5.7 ± 48.19 ⇒ (−42.49, 53.89)
b.

Let μ₃ = mean spillage for accidents caused by grounding and μ₄ = mean spillage for accidents caused by hull failure.

s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [(14 − 1)28.47² + (12 − 1)56.4²]/(14 + 12 − 2) = 1,896.9830


To determine if the mean spillage amount for accidents caused by grounding is different from the mean spillage amount caused by hull failure, we test:

H0: μ₃ − μ₄ = 0
Ha: μ₃ − μ₄ ≠ 0

The test statistic is t = [(x̄₁ − x̄₂) − D₀] / √[s_p²(1/n₁ + 1/n₂)] = [(47.79 − 54.4) − 0] / √[1,896.983(1/14 + 1/12)] = −6.61/17.1342 = −.39

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = n₁ + n₂ − 2 = 14 + 12 − 2 = 24. From Table VI, Appendix B, t.025 = 2.064. The rejection region is t < −2.064 or t > 2.064.

Since the observed value of the test statistic does not fall in the rejection region (t = −.39 ≮ −2.064), H0 is not rejected. There is insufficient evidence to indicate the mean spillage amount for accidents caused by grounding is different from the mean spillage amount caused by hull failure at α = .05.
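A short Python sketch of this pooled two-sample t test computed from the summary statistics above is included for reference; the numbers come from the MINITAB output, and the use of scipy for the critical value is my own illustration.

from math import sqrt
from scipy.stats import t as t_dist

n1, xbar1, s1 = 14, 47.79, 28.47      # grounding
n2, xbar2, s2 = 12, 54.4, 56.4        # hull failure
sp2 = ((n1 - 1)*s1**2 + (n2 - 1)*s2**2) / (n1 + n2 - 2)   # pooled variance, about 1896.98
t_stat = (xbar1 - xbar2) / sqrt(sp2 * (1/n1 + 1/n2))       # about -0.39
crit = t_dist.ppf(1 - .05/2, n1 + n2 - 2)                  # about 2.064
print(round(t_stat, 2), round(crit, 3), abs(t_stat) > crit)   # do not reject H0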
c.

The necessary assumptions are:


We must assume that the distributions from which the samples were selected are
approximately normal, the samples are independent, and the variances of the two
populations are equal.
Below are the stem-and-leaf plots for each of the samples:
[Stem-and-leaf displays of Spillage by Cause, from MINITAB: Collision (N = 10, Leaf Unit = 10), Fire (N = 12, Leaf Unit = 10), Grounding (N = 14, Leaf Unit = 1.0), and Hull Failure (N = 12, Leaf Unit = 10).]

Based on the shapes of the stem-and-leaf plots, it does not appear that the data are
normally distributed.
Also, we know that if the data are normally distributed, then the Interquartile Range, IQR, divided by the standard deviation should be approximately 1.3. We will compute IQR/s for each of the samples:

Collision:      IQR/s = (102.0 − 35.0)/70.4 = .95
Fire:           IQR/s = (80.5 − 32.3)/60.7 = .79
Grounding:      IQR/s = (59.0 − 30.25)/28.47 = 1.01
Hull Failure:   IQR/s = (46.0 − 29.3)/56.4 = .29

Since all of these ratios are quite a bit smaller than 1.3, it indicates that none of the samples come from normal distributions.
Thus, it appears that the assumption of normal distributions is violated.
The sample standard deviations are:

Collision:      s = 70.4
Fire:           s = 60.7
Grounding:      s = 28.47
Hull Failure:   s = 56.4

Without doing formal tests, it appears that the variances of the groups Collision, Fire,
and Hull Failure are probably not significantly different. However, it appears that the
variance for the grounding group is smaller than the others.


d.

Let σ₁² = variance of spillage for accidents caused by collision and σ₂² = variance of spillage for accidents caused by grounding.

To determine if the variances of the amounts of spillage due to collision and grounding differ, we test:

H0: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²

The test statistic is F = Larger sample variance / Smaller sample variance = s₁²/s₂² = 70.4²/28.47² = 6.11

The rejection region requires α/2 = .02/2 = .01 in the upper tail of the F-distribution with numerator df = ν₁ = n₁ − 1 = 10 − 1 = 9 and denominator df = ν₂ = n₂ − 1 = 14 − 1 = 13. From Table XI, Appendix B, F.01 = 4.19. The rejection region is F > 4.19.

Since the observed value of the test statistic falls in the rejection region (F = 6.11 > 4.19), H0 is rejected. There is sufficient evidence to indicate the variances of the amounts of spillage due to collision and grounding differ at α = .02.
7.94

a.

Let μ₁ = mean rating of concern about product tampering for males and μ₂ = mean rating of concern about product tampering for females. To determine whether a difference exists in the mean level of concern about product tampering between males and females, we test:

H0: μ₁ − μ₂ = 0
Ha: μ₁ − μ₂ ≠ 0

b. The p-value = .008. Since the p-value is so small, there is evidence to reject H0. There is sufficient evidence to indicate a difference exists in the mean level of concern about product tampering between males and females for α > .008.

c.

We must assume the sample sizes were sufficiently large so that the Central Limit
Theorem applies. We must also assume that we selected two random and independent
samples from the two populations.

7.96  For confidence coefficient .95, α = .05 and α/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

n₁ = n₂ = (z.025)²(p̂₁q̂₁ + p̂₂q̂₂)/(ME)² = 1.96²[.395(.605) + .293(.707)]/.03² = 1904.26 ≈ 1905

7.98

a. Let p₁₉₉₉ = proportion of adult Americans who would vote for a woman president in 1999 and p₁₉₇₅ = proportion of adult Americans who would vote for a woman president in 1975.


b.

To see if the samples are sufficiently large:

p̂₁₉₉₉ ± 3σ_p̂ ⇒ p̂₁₉₉₉ ± 3√(p̂₁₉₉₉q̂₁₉₉₉/n₁₉₉₉) ⇒ .92 ± 3√(.92(.08)/2000) ⇒ .92 ± .02 ⇒ (.90, .94)

p̂₁₉₇₅ ± 3σ_p̂ ⇒ p̂₁₉₇₅ ± 3√(p̂₁₉₇₅q̂₁₉₇₅/n₁₉₇₅) ⇒ .73 ± 3√(.73(.27)/2000) ⇒ .73 ± .03 ⇒ (.70, .76)


Since both intervals are contained within the interval (0, 1), the normal approximation
will be adequate.
c.

For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The 90% confidence interval is:

(p̂₁₉₉₉ − p̂₁₉₇₅) ± z.05 √(p̂₁₉₉₉q̂₁₉₉₉/n₁₉₉₉ + p̂₁₉₇₅q̂₁₉₇₅/n₁₉₇₅) ⇒ (.92 − .73) ± 1.645 √[.92(.08)/2000 + .73(.27)/1500] ⇒ .19 ± .02 ⇒ (.17, .21)


We are 90% confident that the difference in the proportions of adult Americans who
would vote for a woman president between 1999 and 1975 is between .17 and .21.
d.

To see if the samples are sufficiently large:

p̂₁₉₉₉ ± 3σ_p̂ ⇒ p̂₁₉₉₉ ± 3√(p̂₁₉₉₉q̂₁₉₉₉/n₁₉₉₉) ⇒ .92 ± 3√(.92(.08)/20) ⇒ .92 ± .18 ⇒ (.74, 1.10)

p̂₁₉₇₅ ± 3σ_p̂ ⇒ p̂₁₉₇₅ ± 3√(p̂₁₉₇₅q̂₁₉₇₅/n₁₉₇₅) ⇒ .73 ± 3√(.73(.27)/50) ⇒ .73 ± .19 ⇒ (.54, .92)

Since the first interval is not contained within the interval (0, 1), the normal approximation will not be adequate.
7.100

a.

For each measure, let μ₁ = mean job satisfaction for day-shift nurses and μ₂ = mean job satisfaction for night-shift nurses. To determine whether a difference in job satisfaction exists between day-shift and night-shift nurses, we test:

H0: μ₁ − μ₂ = 0
Ha: μ₁ − μ₂ ≠ 0


b.

Hours of work: The p-value = .813. Since the p-value is so large, there is no evidence to reject H0. There is insufficient evidence to indicate a difference in mean job satisfaction exists between day-shift and night-shift nurses on hours of work for α ≤ .10.

Free time: The p-value = .047. Since the p-value is so small, there is evidence to reject H0. There is sufficient evidence to indicate a difference in mean job satisfaction exists between day-shift and night-shift nurses on free time for α > .047.

Breaks: The p-value = .0073. Since the p-value is so small, there is evidence to reject H0. There is sufficient evidence to indicate a difference in mean job satisfaction exists between day-shift and night-shift nurses on breaks for α > .0073.

c. We must make the following assumptions for each measure:

1. The job satisfaction scores for both day-shift and night-shift nurses are normally distributed.
2. The variances of job satisfaction scores for both day-shift and night-shift nurses are equal.
3. Random and independent samples were selected from both populations of job satisfaction scores.

7.102  For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. We estimate p₁ = p₂ = .5.

n₁ = n₂ = (z.05)²(p₁q₁ + p₂q₂)/(ME)² = (1.645)²[.5(.5) + .5(.5)]/.05² = 541.205 ≈ 542

7.104  Let p₁ = proportion of larvae that died in containers containing high carbon dioxide levels and p₂ = proportion of larvae that died in containers containing normal carbon dioxide levels. The parameter of interest for this problem is p₁ − p₂, or the difference in the death rates for the two groups.
Some preliminary calculations are:
p̂ = (x₁ + x₂)/(n₁ + n₂) = [.10(80) + .05(80)]/(80 + 80) = .075

q̂ = 1 − p̂ = 1 − .075 = .925

To determine if an increased level of carbon dioxide is effective in killing a higher percentage


of leaf-eating larvae, we test:
H0: p₁ − p₂ = 0
Ha: p₁ − p₂ > 0
The test statistic is z = [(p̂₁ − p̂₂) − 0] / √[p̂q̂(1/80 + 1/80)] = (.10 − .05) / √[.075(.925)(1/80 + 1/80)] = 1.201


The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.201 ≯ 2.33), H0 is not rejected. There is insufficient evidence to indicate that an increased level of carbon dioxide is effective in killing a higher percentage of leaf-eating larvae at α = .01.
7.106

a.

Let p1 = proportion of female students who switched due to loss of interest in SME and
p2 = proportion of male students who switched due to lack of interest in SME.
Some preliminary calculations are:
p̂₁ = x₁/n₁ = 74/172 = .43;  p̂₂ = x₂/n₂ = 72/163 = .44;  p̂ = (x₁ + x₂)/(n₁ + n₂) = (74 + 72)/(172 + 163) = .436

To determine if the proportion of female students who switch due to lack of interest in
SME differs from the proportion of males who switch due to a lack of interest, we test:
H0: p₁ − p₂ = 0
Ha: p₁ − p₂ ≠ 0
The test statistic is z = [(p̂₁ − p̂₂) − 0] / √[p̂q̂(1/n₁ + 1/n₂)] = (.43 − .44) / √[.436(.564)(1/172 + 1/163)] = −0.18

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645 or z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = −0.18 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate the proportion of female students who switch due to lack of interest in SME differs from the proportion of males who switch due to a lack of interest in SME at α = .10.
b.

Let p1 = proportion of female students who switched due to low grades in SME and
p2 = proportion of male students who switched due to low grades in SME.
Some preliminary calculations are:

p̂₁ = x₁/n₁ = 33/172 = .19;  p̂₂ = x₂/n₂ = 44/163 = .27

For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The confidence interval is:

(p̂₁ − p̂₂) ± z.05 √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂) ⇒ (.19 − .27) ± 1.645 √[.19(.81)/172 + .27(.73)/163] ⇒ −.08 ± .075 ⇒ (−.155, −.005)


We are 90% confident that the difference between the proportions of female and male switchers who lost confidence due to low grades in SME is between −.155 and −.005. Since the interval does not include 0, there is evidence to indicate the proportion of female switchers due to low grades is less than the proportion of male switchers due to low grades.
7.108  For confidence level .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The standard deviation can be estimated by dividing the range by 4: σ ≈ Range/4 = 4/4 = 1.

n₁ = n₂ = (z.025)²(σ₁² + σ₂²)/(ME)² = 1.96²(1² + 1²)/.2² = 192.08 ≈ 193

7.110

Some preliminary calculations are:

s₁² = [Σx₁² − (Σx₁)²/n₁]/(n₁ − 1) = (10,251 − 225²/5)/(5 − 1) = 126/4 = 31.5

s₂² = [Σx₂² − (Σx₂)²/n₂]/(n₂ − 1) = (10,351 − 227²/5)/(5 − 1) = 45.2/4 = 11.3
Let σ₁² = variance for instrument A and σ₂² = variance for instrument B. Since we wish to determine if there is a difference in the precision of the two instruments, we test:

H0: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²

The test statistic is F = Larger sample variance / Smaller sample variance = s₁²/s₂² = 31.5/11.3 = 2.79

The rejection region requires α/2 = .10/2 = .05 in the upper tail of the F-distribution with ν₁ = n₁ − 1 = 5 − 1 = 4 and ν₂ = n₂ − 1 = 5 − 1 = 4. From Table IX, Appendix B, F.05 = 6.39. The rejection region is F > 6.39.

Since the observed value of the test statistic does not fall in the rejection region (F = 2.79 ≯ 6.39), H0 is not rejected. There is insufficient evidence of a difference in the precision of the two instruments at α = .10.


7.112

a.

Let μ₁ = mean change in bond prices handled by underwriter 1 and μ₂ = mean change in bond prices handled by underwriter 2.

s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [(27 − 1).0098 + (23 − 1).002465]/(27 + 23 − 2) = .30903/48 = .006438

To determine if there is a difference in the mean change in bond prices handled by the 2 underwriters, we test:

H0: μ₁ − μ₂ = 0
Ha: μ₁ − μ₂ ≠ 0
The test statistic is t = [(x̄₁ − x̄₂) − D₀] / √[s_p²(1/n₁ + 1/n₂)] = [(−.0491 − (−.0307)) − 0] / √[.006438(1/27 + 1/23)] = −.81

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = n₁ + n₂ − 2 = 27 + 23 − 2 = 48. From Table VI, Appendix B, t.025 ≈ 1.96. The rejection region is t < −1.96 or t > 1.96.

Since the observed value of the test statistic does not fall in the rejection region (t = −.81 ≮ −1.96), H0 is not rejected. There is insufficient evidence to indicate there is a difference in the mean change in bond prices handled by the 2 underwriters at α = .05.
b.

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table VI, Appendix B, with df = 48, t.025 ≈ 1.96. The confidence interval is:

(x̄₁ − x̄₂) ± t.025 √[s_p²(1/n₁ + 1/n₂)] ⇒ (−.0491 − (−.0307)) ± 1.96 √[.006438(1/27 + 1/23)] ⇒ −.0184 ± .0446 ⇒ (−.063, .0262)

We are 95% confident the difference in the mean bond prices handled by underwriter 1 and underwriter 2 is somewhere between −.063 and .0262.
7.114

a.

To determine if the mean salary of all males with post-graduate degrees exceeds the mean
salary of all females with post-graduate degrees, we test:
H0: μM = μF
Ha: μM > μF

b.

The test statistic is z = [(x̄M − x̄F) − 0] / √(s²_x̄M + s²_x̄F) = (61,340 − 32,227) / √(2,185² + 932²) = 12.26


c.

The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

d.

Since the observed value of the test statistic falls in the rejection region (z = 12.26 > 2.33), H0 is rejected. There is sufficient evidence to indicate the mean salary of all males with post-graduate degrees exceeds the mean salary of all females with post-graduate degrees at α = .01.


The Kentucky Milk Case - Part II

(To accompany Chapters 5-7)

(1) Incumbency Rates
I have repeated the incumbency rates for the Tri-county market. If the "normal" incumbency
rate is .7 in competitive markets, then we would like to test to see if the incumbency rate in the
Tri-county market is larger than .7. We will run a test for each of the years from 1985 through
1988, and also for the four years combined.

              Tri-County Market
Year   Number of    Same      Incumbency
       Districts    Vendors   Rate
1984      10           8        .800
1985      12          12       1.000
1986      13          13       1.000
1987      13          12        .923
1988      13          13       1.000
1989      13           9        .692
1990      13          10        .769
1991      13          11        .846

1985
One of the assumptions necessary for this test is that the sample size is sufficiently large. In
order for the sample size to be sufficiently large, the interval p₀ ± 3σ_p̂ must not contain 0 or 1.
For this problem, the interval is:

p₀ ± 3σ_p̂ ⇒ .7 ± 3√(.7(.3)/12) ⇒ .7 ± .397 ⇒ (.303, 1.097)

Since 1 is included in the interval, the sample size is not sufficiently large. The following test
may not be valid.
To see if the incumbency rate in 1985 exceeds .7, we test:
H0: p = .7
Ha: p > .7

The test statistic is z = (p̂ − p₀) / √(p₀q₀/n) = (1 − .7) / √(.7(.3)/12) = 2.27

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 2.27 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the incumbency rate exceeds .7 in the Tri-county market at α = .05.
1986

In order for the sample size to be sufficiently large, the interval p₀ ± 3σ_p̂ must not contain 0 or 1.
For this problem, the interval is:

p₀ ± 3σ_p̂ ⇒ .7 ± 3√(.7(.3)/13) ⇒ .7 ± .381 ⇒ (.319, 1.081)

Since 1 is included in the interval, the sample size is not sufficiently large. The following test
may not be valid.
To see if the incumbency rate in 1986 exceeds .7, we test:
H0: p = .7
Ha: p > .7

The test statistic is z = (p̂ − p₀) / √(p₀q₀/n) = (1 − .7) / √(.7(.3)/13) = 2.36

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 2.36 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the incumbency rate exceeds .7 in the Tri-county market at α = .05.
1987

In order for the sample size to be sufficiently large, the interval p₀ ± 3σ_p̂ must not contain 0 or 1.
For this problem, the interval is:

p₀ ± 3σ_p̂ ⇒ .7 ± 3√(.7(.3)/13) ⇒ .7 ± .381 ⇒ (.319, 1.081)

Since 1 is included in the interval, the sample size is not sufficiently large. The following test
may not be valid.
To see if the incumbency rate in 1987 exceeds .7, we test:
H0: p = .7
Ha: p > .7
The test statistic is z = (p̂ − p₀) / √(p₀q₀/n) = (.923 − .7) / √(.7(.3)/13) = 1.75

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 1.75 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the incumbency rate exceeds .7 in the Tri-county market at α = .05.
1988

This test is the same as the test for 1986.


Combined 1985-1988

To see if the sample size is sufficiently large, the interval p₀ ± 3σ_p̂ must not contain 0 or 1.
For this problem, the interval is:

p₀ ± 3σ_p̂ ⇒ .7 ± 3√(.7(.3)/51) ⇒ .7 ± .193 ⇒ (.507, .893)

Since neither 0 nor 1 is included in the interval, the sample size is sufficiently large.
p̂ = 50/51 = .980

To see if the incumbency rate in 1985-1988 exceeds .7, we test:


H0: p = .7
Ha: p > .7

The test statistic is z = (p̂ − p₀) / √(p₀q₀/n) = (.980 − .7) / √(.7(.3)/51) = 4.36

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV,
Appendix B, z.05 = 1.645. The rejection region is z > 1.645.


Since the observed value of the test statistic falls in the rejection region (z = 4.36 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the incumbency rate exceeds .7 in the Tri-county market at α = .05.

Thus, there is evidence, based on the incumbency rates, that bid collusion is present in the Tri-county market.
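A compact Python sketch of the one-sample proportion test used for each of these years (including the p₀ ± 3σ adequacy check) is given below for reference. The function name is mine; the call shown uses the combined 1985-1988 figures from above.

from math import sqrt
from scipy.stats import norm

def incumbency_test(x, n, p0=0.7):
    se0 = sqrt(p0 * (1 - p0) / n)
    adequate = (p0 - 3*se0 > 0) and (p0 + 3*se0 < 1)   # sample-size adequacy check
    z = (x/n - p0) / se0                               # z statistic for H0: p = p0
    return adequate, z, 1 - norm.cdf(z)                # upper-tailed p-value for Ha: p > p0

print(incumbency_test(50, 51))   # combined 1985-1988: adequate sample, z about 4.36, p-value near 0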

(2) Bid Price Dispersion

Again, we can use only the data provided which are the winning bids in each of the school
districts in both markets. The sample sizes and the variances for each of the milk products for
each year and each market are provided in the table.
Whole White Milk

        Surrounding Market      Tri-County Market
YR        N        VAR            N        VAR
83       22     0.000212          8     0.000213
84       22     0.000188          9     0.000022
85       26     0.000174         10     0.000028
86       33     0.000120         10     0.000019
87       36     0.000105         12     0.000027
88       36     0.000128         12     0.000024
89       37     0.000056         12     0.000089
90       35     0.000063         12     0.000010
91        5     0.000042         13     0.000020

Lowfat White Milk

        Surrounding Market      Tri-County Market
YR        N        VAR            N        VAR
83       24     0.000279         10     0.000155
84       26     0.000216         12     0.000040
85       29     0.000210         13     0.000028
86       33     0.000139         13     0.000028
87       35     0.000152         13     0.000049
88       35     0.000165         13     0.000038
89       35     0.000043         13     0.000068
90       34     0.000091         13     0.000025
91        5     0.000051         12     0.000034


Lowfat Chocolate Milk

        Surrounding Market      Tri-County Market
YR        N        VAR            N        VAR
83       24     0.000287          5     0.000015
84       25     0.000234          6     0.000060
85       28     0.000248          6     0.000038
86       34     0.000163          6     0.000027
87       36     0.000163          7     0.000040
88       36     0.000184          9     0.000087
89       36     0.000060          9     0.000087
90       33     0.000098         10     0.000014
91        5     0.000098         11     0.000042

I will write out the first test and then summarize the others in a table. The first test will be for
the year 1983 and will compare the variances of the whole white milk.
To determine if the variances in the winning bid prices differ for the two markets, we test:

H0: σ₁²/σ₂² = 1
Ha: σ₁²/σ₂² ≠ 1

The test statistic is F = larger sample variance / smaller sample variance = s₁²/s₂² = .000213/.000212 = 1.005

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν₁ = n₂ − 1 = 8 − 1 = 7 and ν₂ = n₁ − 1 = 22 − 1 = 21. From Table IX, Appendix B, F.025 = 2.97. The rejection region is F > 2.97.

Since the observed value of the test statistic does not fall in the rejection region (F = 1.005 ≯ 2.97), H0 is not rejected. There is insufficient evidence to indicate that the variances of the winning bids are different for the two markets.
Whole White Milk
Year    ν₁, ν₂    F.025      F       Decision
1983     7, 21     2.97    1.005    Do not reject
1984    21, 8      4.00    8.545    Reject
1985    25, 9      3.61    6.214    Reject
1986    32, 9      3.56    6.316    Reject
1987    35, 11     2.96    3.889    Reject
1988    35, 11     2.96    5.333    Reject
1989    11, 36     2.51    1.589    Do not reject
1990    34, 11     3.12    6.300    Reject
1991     4, 12     4.12    2.100    Do not reject


In all cases where there was a significant difference in the variances of the winning bids
between the two markets, the variance in the Surrounding market was larger than the variance
in the Tri-county market. This implies that collusion might be present in the Tri-county market.
Lowfat White Milk
Year    ν₁, ν₂    F.025      F       Decision
1983    23, 9      3.62    1.800    Do not reject
1984    25, 11     3.17    5.400    Reject
1985    28, 12     3.02    7.500    Reject
1986    32, 12     2.96    4.964    Reject
1987    34, 12     2.96    3.102    Reject
1988    34, 12     2.96    4.342    Reject
1989    12, 34     2.41    1.581    Do not reject
1990    33, 12     2.96    3.640    Reject
1991     4, 11     4.28    1.500    Do not reject

Again, in all cases where there was a significant difference in the variances of the winning bids
between the two markets, the variance in the Surrounding market was larger than the variance
in the Tri-county market. This implies that collusion might be present in the Tri-county market.
Lowfat Chocolate Milk
Year    ν₁, ν₂    F.025       F       Decision
1983    23, 4      8.56    19.133    Reject
1984    24, 5      6.28     3.900    Do not reject
1985    27, 5      6.28     6.526    Reject
1986    33, 5      6.23     6.037    Do not reject
1987    35, 6      5.07     4.075    Do not reject
1988    35, 8      3.89    10.222    Reject
1989     8, 35     2.65     1.450    Do not reject
1990    32, 9      3.56     7.000    Reject
1991     4, 10     4.47     2.333    Do not reject

Again, in all cases where there was a significant difference in the variances of the winning bids
between the two markets, the variance in the Surrounding market was larger than the variance
in the Tri-county market. This implies that collusion might be present in the Tri-county market.
Based on the analysis of the three milk products, there appears to be collusion in the Tri-county
market.
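A short Python sketch of the variance-ratio comparison behind each row of the decision tables above is included here for reference; the function is my own illustration, and the example call reuses the whole white milk 1984 values from the dispersion table.

from scipy.stats import f

def compare_markets(var_sur, n_sur, var_tri, n_tri, alpha=0.05):
    # larger sample variance goes in the numerator, with matching degrees of freedom
    if var_sur >= var_tri:
        F, df1, df2 = var_sur / var_tri, n_sur - 1, n_tri - 1
    else:
        F, df1, df2 = var_tri / var_sur, n_tri - 1, n_sur - 1
    crit = f.ppf(1 - alpha/2, df1, df2)
    return F, crit, "Reject" if F > crit else "Do not reject"

# Whole white milk, 1984: Surrounding var .000188 (n = 22), Tri-county var .000022 (n = 9)
print(compare_markets(0.000188, 22, 0.000022, 9))   # F about 8.5, critical value about 4.0, so Reject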


(3) Average Winning Bid Price

I have provided the SAS output for computing the t-tests to compare the mean winning bid
prices between the two markets for each of the years and each of the milk products. I will
discuss the findings for each milk product separately. For t-tests, we must assume that the two
population variances are the same. If the population variances are not the same, there is an
approximate test that takes into consideration the different variances. The SAS printout
provided allows for the test of equal variances first. I used a p-value of .25 as the cutoff point.
If the p-value was less than or equal to .25 for the F-test, I assumed that the variances were
different and used the approximate test designated as UNEQUAL. If the p-value for the F-test
was greater than .25, I assumed that the population variances were the same and used the test
designated as EQUAL.
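The same decision rule can be sketched in Python; this is only an illustration of the procedure described above (preliminary F test, then pooled versus Satterthwaite t test using a .25 cutoff), and the array names sur_bids and tri_bids are placeholders for the raw winning-bid data.

import numpy as np
from scipy import stats

def market_t_test(sur_bids, tri_bids, cutoff=0.25):
    s1, s2 = np.var(sur_bids, ddof=1), np.var(tri_bids, ddof=1)
    F = max(s1, s2) / min(s1, s2)
    df1 = (len(sur_bids) if s1 >= s2 else len(tri_bids)) - 1
    df2 = (len(tri_bids) if s1 >= s2 else len(sur_bids)) - 1
    p_var = 2 * (1 - stats.f.cdf(F, df1, df2))          # two-tailed p-value of the F' test
    equal = p_var > cutoff                               # EQUAL if variances look similar
    t_stat, p_val = stats.ttest_ind(sur_bids, tri_bids, equal_var=equal)
    return ("EQUAL" if equal else "UNEQUAL"), t_stat, p_val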
Whole White Milk:
Variable: Whole White Milk - 1983
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      22   0.1318   0.01458844   0.00311027    Unequal        2.4045    12.4     0.0326
TRI       8   0.1173   0.01462038   0.00516909    Equal          2.4071    28.0     0.0229*
For H0: Variances are equal, F' = 1.00   DF = (7,21)   Prob>F' = 0.9116
************************************************************************

Variable: Whole White Milk - 1984
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      22   0.1309   0.01374189   0.00292978    Unequal       -2.3904    28.6     0.0236*
TRI       9   0.1389   0.00474871   0.00158290    Equal         -1.6825    29.0     0.1032
For H0: Variances are equal, F' = 8.37   DF = (21,8)   Prob>F' = 0.0044
************************************************************************

Variable: Whole White Milk - 1985
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      26   0.1279   0.01321810   0.00259228    Unequal       -4.3968    33.8     0.0001*
TRI      10   0.1415   0.00534266   0.00168950    Equal         -3.1348    34.0     0.0035
For H0: Variances are equal, F' = 6.12   DF = (25,9)   Prob>F' = 0.0077
************************************************************************

Variable: Whole White Milk - 1986
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      33   0.1253   0.01098665   0.00191253    Unequal       -8.1534    37.3     0.0001*
TRI      10   0.1446   0.00442846   0.00140040    Equal         -5.3943    41.0     0.0000
For H0: Variances are equal, F' = 6.15   DF = (32,9)   Prob>F' = 0.0070
************************************************************************

Variable: Whole White Milk - 1987
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1264   0.01026078   0.00171013    Unequal      -10.0785    37.5     0.0001*
TRI      12   0.1495   0.00527196   0.00152188    Equal         -7.4313    46.0     0.0000
For H0: Variances are equal, F' = 3.79   DF = (35,11)   Prob>F' = 0.0224
************************************************************************

Variable: Whole White Milk - 1988
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1277   0.01135449   0.00189242    Unequal       -9.9271    42.2     0.0001*
TRI      12   0.1513   0.00499090   0.00144075    Equal         -6.9441    46.0     0.0000
For H0: Variances are equal, F' = 5.18   DF = (35,11)   Prob>F' = 0.0060
************************************************************************

Variable: Whole White Milk - 1989
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      37   0.1299   0.00752173   0.00123657    Unequal       -0.4890    15.8     0.6316
TRI      12   0.1314   0.00944991   0.00272795    Equal         -0.5501    47.0     0.5849NS
For H0: Variances are equal, F' = 1.58   DF = (11,36)   Prob>F' = 0.2947
************************************************************************

Variable: Whole White Milk - 1990
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      35   0.1609   0.00794659   0.00134322    Unequal       -1.1177    43.7     0.2698NS
TRI      12   0.1628   0.00317904   0.00091771    Equal         -0.7673    45.0     0.4469
For H0: Variances are equal, F' = 6.25   DF = (34,11)   Prob>F' = 0.0026
************************************************************************

Variable: Whole White Milk - 1991
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR       5   0.1452   0.00652012   0.00291589    Unequal        1.2585     5.6     0.2585
TRI      13   0.1412   0.00458169   0.00127073    Equal          1.4813    16.0     0.1580NS
For H0: Variances are equal, F' = 2.03   DF = (4,12)   Prob>F' = 0.3095
************************************************************************

The mean winning bid prices were significantly different between the markets for all years except 1989, 1990, and 1991. In 1983, the mean winning bid for the Surrounding market was significantly larger than that for the Tri-county market. For the years 1984-1988, the mean winning bid price for the Tri-county market was significantly larger than that for the Surrounding market. This implies evidence of collusion for the years 1984-1988.
Lowfat White Milk:
Variable: Lowfat White Milk - 1983
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      24   0.1243   0.01672220   0.00341341    Unequal        2.5085    22.6     0.0198
TRI      10   0.1112   0.01246237   0.00394095    Equal          2.2214    32.0     0.0335*
For H0: Variances are equal, F' = 1.80   DF = (23,9)   Prob>F' = 0.3627
************************************************************************

Variable: Lowfat White Milk - 1984
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      26   0.1236   0.01469859   0.00288263    Unequal       -3.0061    36.0     0.0048*
TRI      12   0.1338   0.00635717   0.00183516    Equal         -2.3099    36.0     0.0267
For H0: Variances are equal, F' = 5.35   DF = (25,11)   Prob>F' = 0.0059
************************************************************************

Variable: Lowfat White Milk - 1985
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      29   0.1200   0.01452245   0.00269675    Unequal       -5.3857    39.2     0.0001*
TRI      13   0.1366   0.00537445   0.00149061    Equal         -3.9769    40.0     0.0003
For H0: Variances are equal, F' = 7.30   DF = (28,12)   Prob>F' = 0.0008
************************************************************************

Variable: Lowfat White Milk - 1986
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      33   0.1178   0.01180640   0.00205523    Unequal       -8.4010    43.0     0.0001*
TRI      13   0.1391   0.00533205   0.00147884    Equal         -6.2183    44.0     0.0000
For H0: Variances are equal, F' = 4.90   DF = (32,12)   Prob>F' = 0.0055
************************************************************************

Variable: Lowfat White Milk - 1987
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      35   0.1173   0.01235100   0.00208770    Unequal       -8.7991    37.8     0.0001*
TRI      13   0.1424   0.00701738   0.00194627    Equal         -6.8995    46.0     0.0000
For H0: Variances are equal, F' = 3.10   DF = (34,12)   Prob>F' = 0.0404
************************************************************************

Variable: Lowfat White Milk - 1988
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      35   0.1182   0.01285522   0.00217293    Unequal       -9.6219    42.7     0.0001*
TRI      13   0.1448   0.00618019   0.00171408    Equal         -7.1332    46.0     0.0000
For H0: Variances are equal, F' = 4.33   DF = (34,12)   Prob>F' = 0.0095
************************************************************************

Variable: Lowfat White Milk - 1989
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      35   0.1187   0.00655938   0.00110874    Unequal       -2.1005    17.9     0.0501
TRI      13   0.1240   0.00828350   0.00229743    Equal         -2.3400    46.0     0.0237*
For H0: Variances are equal, F' = 1.59   DF = (12,34)   Prob>F' = 0.2798
************************************************************************

Variable: Lowfat White Milk - 1990
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      34   0.1519   0.00954524   0.00163700    Unequal       -2.3772    39.8     0.0223*
TRI      13   0.1570   0.00508486   0.00141029    Equal         -1.8347    45.0     0.0732
For H0: Variances are equal, F' = 3.52   DF = (33,12)   Prob>F' = 0.0238
************************************************************************

Variable: Lowfat White Milk - 1991
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR       5   0.1364   0.00718485   0.00321316    Unequal        0.2745     6.3     0.7925
TRI      12   0.1354   0.00585768   0.00169097    Equal          0.3001    15.0     0.7682NS
For H0: Variances are equal, F' = 1.50   DF = (4,11)   Prob>F' = 0.5343
************************************************************************

The mean winning bid prices were significantly different between the markets for all years except 1991. In 1983, the mean winning bid for the Surrounding market was significantly larger than that for the Tri-county market. For the years 1984-1990, the mean winning bid price for the Tri-county market was significantly larger than that for the Surrounding market. This implies evidence of collusion for the years 1984-1990.

Lowfat Chocolate Milk:
Variable: Lowfat Chocolate Milk - 1983
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      24   0.1267   0.01696642   0.00346326    Unequal        5.3313    26.3     0.0001*
TRI       5   0.1060   0.00394740   0.00176533    Equal          2.6795    27.0     0.0124
For H0: Variances are equal, F' = 18.47   DF = (23,4)   Prob>F' = 0.0117
************************************************************************

Variable: Lowfat Chocolate Milk - 1984
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      25   0.1251   0.01530156   0.00306031    Unequal       -2.1693    15.7     0.0457*
TRI       6   0.1347   0.00778522   0.00317830    Equal         -1.4733    29.0     0.1514
For H0: Variances are equal, F' = 3.86   DF = (24,5)   Prob>F' = 0.1379
************************************************************************

Variable: Lowfat Chocolate Milk - 1985
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      28   0.1206   0.01575587   0.00297758    Unequal       -4.6215    20.9     0.0001*
TRI       6   0.1387   0.00621914   0.00253895    Equal         -2.7384    32.0     0.0100
For H0: Variances are equal, F' = 6.42   DF = (27,5)   Prob>F' = 0.0472
************************************************************************

Variable: Lowfat Chocolate Milk - 1986
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      34   0.1169   0.01279357   0.00219408    Unequal       -8.0140    18.2     0.0001*
TRI       6   0.1414   0.00521130   0.00212751    Equal         -4.5821    38.0     0.0000
For H0: Variances are equal, F' = 6.03   DF = (33,5)   Prob>F' = 0.0533
************************************************************************

Variable: Lowfat Chocolate Milk - 1987
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1184   0.01280507   0.00213418    Unequal       -7.8853    17.5     0.0001*
TRI       7   0.1436   0.00632926   0.00239224    Equal         -5.0675    41.0     0.0000
For H0: Variances are equal, F' = 4.09   DF = (35,6)   Prob>F' = 0.0832
************************************************************************

Variable: Lowfat Chocolate Milk - 1988
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1192   0.01359999   0.00226666    Unequal      -10.3636    40.6     0.0001*
TRI       9   0.1470   0.00425532   0.00141844    Equal         -5.9934    43.0     0.0000
For H0: Variances are equal, F' = 10.21   DF = (35,8)   Prob>F' = 0.0019
************************************************************************

Variable: Lowfat Chocolate Milk - 1989
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1200   0.00776605   0.00129434    Unequal       -1.7178    10.9     0.1140
TRI       9   0.1258   0.00932923   0.00310974    Equal         -1.9216    43.0     0.0613NS
For H0: Variances are equal, F' = 1.44   DF = (8,35)   Prob>F' = 0.4274
************************************************************************

Variable: Lowfat Chocolate Milk - 1990
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      33   0.1531   0.00993298   0.00172911    Unequal       -3.9472    38.3     0.0003*
TRI      10   0.1614   0.00383030   0.00121125    Equal         -2.5773    41.0     0.0137
For H0: Variances are equal, F' = 6.73   DF = (32,9)   Prob>F' = 0.0050
************************************************************************

Variable: Lowfat Chocolate Milk - 1991
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR       5   0.1402   0.00991020   0.00443197    Unequal       -0.4431     5.6     0.6743
TRI      11   0.1423   0.00650294   0.00196071    Equal         -0.5216    14.0     0.6101NS
For H0: Variances are equal, F' = 2.32   DF = (4,10)   Prob>F' = 0.2552

The mean winning bid prices were significantly different between the markets for all years except 1989 and 1991. In 1983, the mean winning bid for the Surrounding market was significantly larger than that for the Tri-county market. For the years 1984-1988 and 1990, the mean winning bid price for the Tri-county market was significantly larger than that for the Surrounding market. This implies evidence of collusion for the years 1984-1988.


Design of Experiments and Analysis of Variance

Chapter 8

8.2  The treatments are the combinations of levels of each of the two factors. There are 2 × 5 = 10 treatments. They are:
(A, 50), (A, 60), (A, 70), (A, 80), (A, 90)
(B, 50), (B, 60), (B, 70), (B, 80), (B, 90)

8.4

a. College GPA's are measured on college students. The experimental units are college students.

b.

Household income is measured on households. The experimental units are households.

c.

Gasoline mileage is measured on automobiles. The experimental units are the automobiles of a particular model.

d.

The experimental units are the sectors on a computer diskette.

e.

The experimental units are the states.

8.6

a. The response variable is the amount of the purchase.

b. There is one factor in this problem: type of credit card.


c. There are 4 treatments, corresponding to the 4 levels of the factor. The treatments are
VISA, MasterCard, American Express, and Discover.
d. The experimental units are the credit card holders.
8.8

a.

The response variable in this problem is the consumer's opinion on the value of the discount offer.

b.

There are two treatments in this problem: within-store price promotion and between-store price promotion.

c.

The experimental units are the consumers.

8.10

a. There are 2 factors in the problem: Type of yeast and Temperature. Type of yeast has 2 levels (Brewer's yeast and baker's yeast). Temperature has 4 levels (45°, 48°, 51°, and 54°C).

b.

The response variable is the autolysis yield.

c.

There are a total of 2 × 4 = 8 treatments in this experiment. The treatments are all the type of yeast-temperature combinations.

d.

This is a designed experiment.


8.12

a.

The response is the evaluation by the undergraduate student of the ethical behavior of the
salesperson.

b.

There are two factors: type of sales job at two levels (high tech. vs. low tech.) and sales task at two levels (new account development vs. account maintenance).

c.

The treatments are the 2 × 2 = 4 combinations of type of sales job and sales task.

d.

The experimental units are the college students.

8.14

a. From Table IX with ν₁ = 4 and ν₂ = 4, F.05 = 6.39.

b. From Table XI with ν₁ = 4 and ν₂ = 4, F.01 = 15.98.

c. From Table VIII with ν₁ = 30 and ν₂ = 40, F.10 = 1.54.

d. From Table X with ν₁ = 15 and ν₂ = 12, F.025 = 3.18.

8.16

a. In the second dot diagram (#2), the difference between the sample means is small relative to the variability within the sample observations. In the first dot diagram (#1), the values in each of the samples are grouped together with a range of 4, while in the second diagram (#2), the range of values is 8.

b. For diagram #1,

x̄₁ = Σx₁/n = (7 + 8 + 9 + 9 + 10 + 11)/6 = 54/6 = 9
x̄₂ = Σx₂/n = (12 + 13 + 14 + 14 + 15 + 16)/6 = 84/6 = 14

For diagram #2,

x̄₁ = Σx₁/n = (5 + 5 + 7 + 11 + 13 + 13)/6 = 54/6 = 9
x̄₂ = Σx₂/n = (10 + 10 + 12 + 16 + 18 + 18)/6 = 84/6 = 14

c. For diagram #1,

SST = Σ nᵢ(x̄ᵢ − x̄)² = 6(9 − 11.5)² + 6(14 − 11.5)² = 75, where x̄ = Σx/n = (54 + 84)/12 = 11.5

For diagram #2,

SST = Σ nᵢ(x̄ᵢ − x̄)² = 6(9 − 11.5)² + 6(14 − 11.5)² = 75


d. For diagram #1,

s₁² = [Σx₁² − (Σx₁)²/n₁]/(n₁ − 1) = (496 − 54²/6)/(6 − 1) = 2
s₂² = [Σx₂² − (Σx₂)²/n₂]/(n₂ − 1) = (1186 − 84²/6)/(6 − 1) = 2

SSE = (n₁ − 1)s₁² + (n₂ − 1)s₂² = (6 − 1)2 + (6 − 1)2 = 20

For diagram #2,

s₁² = [Σx₁² − (Σx₁)²/n₁]/(n₁ − 1) = (558 − 54²/6)/(6 − 1) = 14.4
s₂² = [Σx₂² − (Σx₂)²/n₂]/(n₂ − 1) = (1248 − 84²/6)/(6 − 1) = 14.4

SSE = (n₁ − 1)s₁² + (n₂ − 1)s₂² = (6 − 1)14.4 + (6 − 1)14.4 = 144

e. For diagram #1, SS(Total) = SST + SSE = 75 + 20 = 95

SST is [SST/SS(Total)] × 100% = (75/95) × 100% = 78.95% of SS(Total)

For diagram #2, SS(Total) = SST + SSE = 75 + 144 = 219

SST is [SST/SS(Total)] × 100% = (75/219) × 100% = 34.25% of SS(Total)

f. For diagram #1,

MST = SST/(k − 1) = 75/(2 − 1) = 75
MSE = SSE/(n − k) = 20/(12 − 2) = 2
F = MST/MSE = 75/2 = 37.5

For diagram #2,

MST = SST/(k − 1) = 75/(2 − 1) = 75
MSE = SSE/(n − k) = 144/(12 − 2) = 14.4
F = MST/MSE = 75/14.4 = 5.21


g. The rejection region for both diagrams requires α = .05 in the upper tail of the F-distribution with ν₁ = p − 1 = 2 − 1 = 1 and ν₂ = n − p = 12 − 2 = 10. From Table IX, Appendix B, F.05 = 4.96. The rejection region is F > 4.96.

For diagram #1, the observed value of the test statistic falls in the rejection region (F = 37.5 > 4.96). Thus, H0 is rejected. There is sufficient evidence to indicate the samples were drawn from populations with different means at α = .05.

For diagram #2, the observed value of the test statistic falls in the rejection region (F = 5.21 > 4.96). Thus, H0 is rejected. There is sufficient evidence to indicate the samples were drawn from populations with different means at α = .05.
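A short Python sketch of the diagram #1 computations (SST, SSE, MST, MSE, F) follows as a cross-check. It is not part of the original solution; the six values in the first sample are inferred from the printed sums (five are listed explicitly and the sixth must be 9 for the total of 54), so they should be treated as an assumption.

import numpy as np

sample1 = np.array([7, 8, 9, 9, 10, 11])
sample2 = np.array([12, 13, 14, 14, 15, 16])
grand = np.concatenate([sample1, sample2]).mean()                                        # 11.5
sst = len(sample1)*(sample1.mean() - grand)**2 + len(sample2)*(sample2.mean() - grand)**2  # 75
sse = (len(sample1)-1)*sample1.var(ddof=1) + (len(sample2)-1)*sample2.var(ddof=1)           # 20
mst, mse = sst/(2 - 1), sse/(len(sample1) + len(sample2) - 2)
print(sst, sse, mst/mse)    # 75.0  20.0  37.5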

h. We must assume both populations are normally distributed with common variances.

8.18  Refer to Exercise 8.16. The ANOVA tables are:


For diagram #1:

Source       df     SS     MS      F
Treatment     1     75     75     37.5
Error        10     20      2
Total        11     95

For diagram #2:

Source       df     SS     MS      F
Treatment     1     75     75     5.21
Error        10    144    14.4
Total        11    219

8.20

a.

df for Error is 41 − 6 = 35
SSE = SS(Total) − SST = 46.5 − 17.5 = 29.0

MST = SST/(k − 1) = 17.5/6 = 2.9167
MSE = SSE/(n − k) = 29.0/35 = .8286
F = MST/MSE = 2.9167/.8286 = 3.52

The ANOVA table is:


Source
Treatment
Error
Total

df
6
35
41

SS
17.5
29.0
46.5

MS
2.9167
.8286

Design of Experiments and Analysis of Variance

F
3.52

259

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com
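The arithmetic that fills in this ANOVA table can be sketched in a few lines of Python (illustrative only; the inputs are the SS(Total), SST, k, and n values given in the exercise):

ss_total, sst, k, n = 46.5, 17.5, 7, 42
sse = ss_total - sst          # 29.0
mst = sst / (k - 1)           # 2.9167
mse = sse / (n - k)           # 0.8286
F = mst / mse                 # about 3.52
print(round(sse, 1), round(mst, 4), round(mse, 4), round(F, 2))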

b.

The number of treatments is k. We know k − 1 = 6, so k = 7.

c.

To determine if there is a difference among the population means, we test:

H0: μ₁ = μ₂ = ⋯ = μ₇
Ha: At least one of the population means differs from the rest

The test statistic is F = 3.52.

The rejection region requires α = .10 in the upper tail of the F-distribution with numerator df = k − 1 = 6 and denominator df = n − k = 35. From Table VIII, Appendix B, F.10 ≈ 1.98. The rejection region is F > 1.98.

Since the observed value of the test statistic falls in the rejection region (F = 3.52 > 1.98), H0 is rejected. There is sufficient evidence to indicate a difference among the population means at α = .10.

d.

The observed significance level is P(F 3.52). With numerator df = 6 and denominator
df = 35, and Table XI, P(F 3.52) < .01.

	e.	H0: μ1 = μ2
		Ha: μ1 ≠ μ2

		The test statistic is t = (x̄1 − x̄2)/√[MSE(1/n1 + 1/n2)] = (3.7 − 4.1)/√[.8286(1/6 + 1/6)] = −.76

		The rejection region requires α/2 = .10/2 = .025 in each tail of the t-distribution with df = n − p = 35. From Table VI, Appendix B, t.05 ≈ 1.697. The rejection region is t < −1.697 or t > 1.697.

		Since the observed value of the test statistic does not fall in the rejection region (t = −.76 ≮ −1.697), H0 is not rejected. There is insufficient evidence to indicate that μ1 and μ2 differ at α = .10.
	f.	For confidence coefficient .90, α = .10 and α/2 = .05. From Table VI, Appendix B, with df = 35, t.05 ≈ 1.697. The confidence interval is:

		(x̄1 − x̄2) ± t.05 √[MSE(1/n1 + 1/n2)] ⇒ (3.7 − 4.1) ± 1.697√[.8286(1/6 + 1/6)] ⇒ −.4 ± .892 ⇒ (−1.292, .492)

	g.	The confidence interval is:

		x̄1 ± t.05 √(MSE/6) ⇒ 3.7 ± 1.697√(.8286/6) ⇒ 3.7 ± .631 ⇒ (3.069, 4.331)
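		A quick numerical check of the two intervals in parts f and g, as a sketch only (the table value 1.697 is replaced by the exact t quantile from SciPy):

		    # Sketch: confidence intervals from Exercise 8.20 f and g.
		    from math import sqrt
		    from scipy import stats

		    MSE, df = .8286, 35
		    t05 = stats.t.ppf(.95, df)               # about 1.69, close to the table value 1.697

		    # Part f: 90% CI for (mu1 - mu2) with n1 = n2 = 6
		    diff = 3.7 - 4.1
		    half_f = t05 * sqrt(MSE * (1/6 + 1/6))
		    print(diff - half_f, diff + half_f)      # about (-1.29, 0.49)

		    # Part g: 90% CI for mu1 with n1 = 6
		    half_g = t05 * sqrt(MSE / 6)
		    print(3.7 - half_g, 3.7 + half_g)        # about (3.07, 4.33)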

8.22

	a.	The experimental unit in the study is the college tennis coach. The dependent variable is the response to the statement "the Prospective Student-Athlete Form on the web site contributes very little to the recruiting process" on a scale from 1 to 7. There is one factor in the study and it is the NCAA division of the college tennis coach. There are 3 levels of this factor, and thus, there are 3 treatments: Division I, Division II, and Division III.

	b.	To determine if the mean responses of tennis coaches from the different divisions differ, we test:

		H0: μ1 = μ2 = μ3
		Ha: At least 1 μi differs

	c.	Since the observed p-value of the test (p < .003) is less than α = .05, H0 is rejected. There is sufficient evidence to indicate differences in mean response among coaches of the 3 divisions.

8.24	a.	A completely randomized design was used.

	b.	There are 4 treatments: 3 robots/colony, 6 robots/colony, 9 robots/colony, and 12 robots/colony.

	c.	To determine if there was a difference in the mean energy expended (per robot) among the 4 colony sizes, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least two means differ

	d.	Since the p-value (< .001) is less than α (.05), H0 is rejected. There is sufficient evidence to indicate a difference in mean energy expended per robot among the 4 colony sizes at α = .05.

8.26	a.	To determine if differences exist in the mean rates of return among the three types of fund groups, we test:

		H0: μ1 = μ2 = μ3
		Ha: At least two means differ

	b.	The rejection region requires α = .01 in the upper tail of the F-distribution with ν1 = k − 1 = 3 − 1 = 2 and ν2 = N − k = 90 − 3 = 87. From Table XI, Appendix B, F.01 ≈ 4.98. The rejection region is F > 4.98.

	c.	Since the observed value of the test statistic falls in the rejection region (F = 69.65 > 4.98), H0 is rejected. There is sufficient evidence to indicate differences exist in the mean rates of return among the three types of fund groups at α = .01.


8.28	a.	The response variable for this study is the safety rating of nuclear power plants.

	b.	There are three treatments in this study. The treatment groups are the scientists, the journalists, and the federal government policymakers.

	c.	To determine whether there are differences in the attitudes of scientists, journalists, and government officials regarding the safety of nuclear power plants, we test:

		H0: μ1 = μ2 = μ3
		Ha: At least two means differ

	d.	The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k − 1 = 3 − 1 = 2 and ν2 = n − k = 300 − 3 = 297. From Table IX, Appendix B, F.05 ≈ 3.00. The rejection region is F > 3.00.

		In order to reject H0, the test statistic F = MST/MSE must be greater than 3.00.

		MST > 3.00(MSE) = 3.00(2.355) = 7.065. Thus, MST must be greater than 7.065.

	e.	For MST = 11.280, F = MST/MSE = 11.28/2.355 = 4.79

	f.	With ν1 = k − 1 = 3 − 1 = 2 and ν2 = n − k = 300 − 3 = 297, P(F > 4.79) ≈ .01, using Table XI, Appendix B. The approximate p-value is .01.

8.30	a.	We will select size as the quantitative variable and color as the qualitative variable. To determine if the mean size of diamonds differs among the 6 colors, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6
		Ha: At least two means differ
	b.	Using MINITAB, the ANOVA table is:

		One-way ANOVA: Carats versus Color

		Analysis of Variance for Carats
		Source     DF        SS       MS      F      P
		Color       5    0.7963   0.1593   2.11  0.064
		Error     302   22.7907   0.0755
		Total     307   23.5869

		Level    N     Mean    StDev
		D       16   0.6381   0.3195
		E       44   0.6232   0.2677
		F       82   0.5929   0.2648
		G       65   0.5808   0.2792
		H       61   0.6734   0.2643
		I       40   0.7310   0.2918

		Pooled StDev = 0.2747

		(The output also displays individual 95% confidence intervals for each color mean, based on the pooled standard deviation.)

		The test statistic is F = 2.11 and the p-value is p = 0.064.

		Since the p-value (0.064) is less than α = .10, H0 is rejected. There is sufficient evidence to indicate the mean size of diamonds differs among the 6 colors at α = .10.
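		The p-value reported by MINITAB can be reproduced from the F statistic and its degrees of freedom; a minimal SciPy sketch (using the printed F = 2.11 with df 5 and 302):

		    # Sketch: p-value for the diamonds ANOVA from the printed F statistic.
		    from scipy import stats

		    F, df_num, df_den = 2.11, 5, 302
		    p_value = stats.f.sf(F, df_num, df_den)   # upper-tail area, about 0.06
		    print(round(p_value, 3))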
	c.	We will check the assumptions of normality and equal variances. Using MINITAB, stem-and-leaf displays of Carats (leaf unit = 0.010) were produced separately for each of the 6 colors: Color 1 (D, N = 16), Color 2 (E, N = 44), Color 3 (F, N = 82), Color 4 (G, N = 65), Color 5 (H, N = 61), and Color 6 (I, N = 40).

		The data for the 6 colors do not look particularly mound-shaped, so the assumption of normality is probably not valid. However, departures from this assumption often do not invalidate the ANOVA results.
		Using MINITAB, side-by-side box plots of Carats for the 6 colors (D through I) were also produced.

		The spreads of all the colors appear to be about the same, so the assumption of constant variance is probably valid.

8.32	a.	The df for Groups = ν1 = k − 1 = 3 − 1 = 2. The df for Error = ν2 = n − k = 71 − 3 = 68.

		The completed ANOVA table is:

		Source     df         SS        MS      F
		Groups      2     128.70     64.35   0.16
		Error      68  27,124.52    398.89

	b.	To determine if the total number of activities undertaken differed among the three groups of entrepreneurs, we test:

		H0: μ1 = μ2 = μ3
		Ha: At least one mean differs

		The test statistic is F = 0.16.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k − 1 = 3 − 1 = 2 and ν2 = n − k = 71 − 3 = 68. From Table IX, Appendix B, F.05 ≈ 3.15. The rejection region is F > 3.15.

		Since the observed value of the test statistic does not fall in the rejection region (F = 0.16 ≯ 3.15), H0 is not rejected. There is insufficient evidence to indicate that the total number of activities differed among the groups of entrepreneurs at α = .05.

	c.	The p-value of the test is P(F > 0.16). From Table VIII, Appendix B, with ν1 = 2 and ν2 = 68, P(F > 0.16) > .10.

	d.	No. Since our conclusion was that there was no evidence of a difference in the total number of activities among the groups, there would be no evidence to indicate a difference between two specific groups.

	e.	This study would be observational. The group that each entrepreneur fell into was observed, not controlled. Since no differences were found, the type of study does not have an impact on the conclusions.

8.34

	The experimentwise error rate is the probability of making a Type I error for at least one of all of the comparisons made. If the experimentwise error rate is α = .05, then each individual comparison is made at a value of α which is less than .05.

8.36

a.

From the diagram, the following pairs of treatments are significantly different because
they are not connected by a line: A and E, A and B, A and D, C and E, C and B, C and D,
and E and D. All other pairs of means are not significantly different because they are
connected by lines.

b.

From the diagram, the following pairs of treatments are significantly different because
they are not connected by a line: A and B, A and D, C and B, C and D, E and B, E and D,
and B and D. All other pairs of means are not significantly different because they are
connected by lines.


	c.	From the diagram, the following pairs of treatments are significantly different because they are not connected by a line: A and E, A and B, and A and D. All other pairs of means are not significantly different because they are connected by lines.

	d.	From the diagram, the following pairs of treatments are significantly different because they are not connected by a line: A and E, A and B, A and D, C and E, C and B, C and D, E and D, and B and D. All other pairs of means are not significantly different because they are connected by lines.

8.38	a.	The total number of comparisons conducted is k(k − 1)/2 = 4(4 − 1)/2 = 6.

	b.	The mean energy expended by robots in the 12-robot colony is significantly smaller than the mean energy expended by robots in any of the other size colonies. There is no difference in the mean energy expended by robots in the 3-robot colony, the 6-robot colony, and the 9-robot colony.

8.40	a.	There will be c = k(k − 1)/2 = 3(3 − 1)/2 = 3 pairwise comparisons.

	b.	Comparing the mean safety scores for government officials and journalists, the difference in mean safety scores is 4.2 − 3.7 = .5. The critical value for the Tukey comparison is .23. Since .5 > .23, we conclude that the mean safety score for government officials is higher than the mean safety score for journalists.

		Comparing the mean safety scores for government officials and scientists, the difference in mean safety scores is 4.2 − 4.1 = .1. Since .1 < .23, we conclude that there is no difference in mean safety scores between government officials and scientists.

		Comparing the mean safety scores for scientists and journalists, the difference in mean safety scores is 4.1 − 3.7 = .4. The critical value for the Tukey comparison is .23. Since .4 > .23, we conclude that the mean safety score for scientists is higher than the mean safety score for journalists.

		A display of these conclusions is:

		Journalists    Scientists    Gov. Officials
		    3.7            4.1             4.2

8.42	a.	The probability of declaring at least one pair of means different when they are not is α = .01.
	b.	There are a total of k(k − 1)/2 = 3(3 − 1)/2 = 3 pairwise comparisons. They are:

		Under $30 thousand to Between $30 and $60 thousand
		Under $30 thousand to Over $60 thousand
		Between $30 and $60 thousand to Over $60 thousand

	c.	Means for groups in homogeneous subsets are displayed in the table:

		Income Group         N    Subset 1    Subset 2
		Under $30,000      379       4.60
		$30,000-$60,000    392                   5.08
		Over $60,000       267                   5.15

	d.	Two of the comparisons in part b will yield confidence intervals that do not contain 0. They are:

		Under $30 thousand to Between $30 and $60 thousand
		Under $30 thousand to Over $60 thousand

8.44	From Exercise 8.30, we found that there were differences in the mean carats among the 6 levels of color.

	From Exercise 8.30, the mean carats for the 6 colors are:

	G        F        E        D        H        I
	0.5808   0.5929   0.6232   0.6381   0.6734   0.7310

	Using MINITAB, the Tukey confidence intervals are:

	Tukey's pairwise comparisons

	Family error rate = 0.100
	Individual error rate = 0.0101
	Critical value = 3.66

	Intervals for (column level mean) − (row level mean)

	          D          E          F          G          H
	E    -0.1926
	      0.2225

	F    -0.1491    -0.1026
	      0.2395     0.1631

	G    -0.1411    -0.0964    -0.1059
	      0.2558     0.1812     0.1302

	H    -0.2350    -0.1909    -0.2007    -0.2194
	      0.1644     0.0904     0.0397     0.0341

	I    -0.3032    -0.2631    -0.2752    -0.2931    -0.2022
	      0.1174     0.0475    -0.0010    -0.0074     0.0871

	There are only 2 intervals that do not contain 0:

	The confidence interval for the difference in mean carats between colors G and I is (−0.2931, −0.0074). The confidence interval for the difference in mean carats between colors F and I is (−0.2752, −0.0010). Since 0 is not contained in these confidence intervals, there is sufficient evidence of a difference in the mean number of carats between colors G and I and between colors F and I. No other differences exist.
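	If the raw carat measurements were available, the same style of Tukey comparison could be produced directly with statsmodels; a hedged sketch (the arrays below are placeholder data, not the diamond data set used in Exercise 8.30):

	    # Sketch: Tukey multiple comparisons of a response by group (statsmodels).
	    import numpy as np
	    from statsmodels.stats.multicomp import pairwise_tukeyhsd

	    # Placeholder data: a few hypothetical carat values for each of the 6 colors.
	    rng = np.random.default_rng(0)
	    colors = np.repeat(["D", "E", "F", "G", "H", "I"], 10)
	    carats = rng.normal(loc=0.63, scale=0.27, size=colors.size)

	    result = pairwise_tukeyhsd(endog=carats, groups=colors, alpha=0.10)
	    print(result.summary())   # pairwise differences with simultaneous intervals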
8.46	a.	There are 3 blocks used since Block df = b − 1 = 2, and 5 treatments since the treatment df = k − 1 = 4.

	b.	There were 15 observations since the Total df = n − 1 = 14.

	c.	H0: μ1 = μ2 = μ3 = μ4 = μ5
		Ha: At least two treatment means differ

	d.	The test statistic is F = MST/MSE = 9.109

	e.	The rejection region requires α = .01 in the upper tail of the F distribution with ν1 = k − 1 = 5 − 1 = 4 and ν2 = n − k − b + 1 = 15 − 5 − 3 + 1 = 8. From Table XI, Appendix B, F.01 = 7.01. The rejection region is F > 7.01.

	f.	Since the observed value of the test statistic falls in the rejection region (F = 9.109 > 7.01), H0 is rejected. There is sufficient evidence to indicate that at least two treatment means differ at α = .01.

	g.	The assumptions necessary to assure the validity of the test are as follows:
		1.	The probability distributions of observations corresponding to all the block-treatment combinations are normal.
		2.	The variances of all the probability distributions are equal.

8.48	a.	The ANOVA Table is as follows:

		Source       df      SS       MS         F
		Treatment     2    12.032    6.016    50.958
		Block         3    71.749   23.916   202.586
		Error         6      .708     .118
		Total        11    84.489

	b.	To determine if the treatment means differ, we test:

		H0: μA = μB = μC
		Ha: At least two treatment means differ

		The test statistic is F = MST/MSE = 50.958

		The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k − 1 = 3 − 1 = 2 and ν2 = n − k − b + 1 = 12 − 3 − 4 + 1 = 6. From Table IX, Appendix B, F.05 = 5.14. The rejection region is F > 5.14.

		Since the observed value of the test statistic falls in the rejection region (F = 50.958 > 5.14), H0 is rejected. There is sufficient evidence to indicate that the treatment means differ at α = .05.

	c.	To see if the blocking was effective, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least two block means differ

		The test statistic is F = MSB/MSE = 202.586

		The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = b − 1 = 4 − 1 = 3 and ν2 = n − k − b + 1 = 12 − 3 − 4 + 1 = 6. From Table IX, Appendix B, F.05 = 4.76. The rejection region is F > 4.76.

		Since the observed value of the test statistic falls in the rejection region (F = 202.586 > 4.76), H0 is rejected. There is sufficient evidence to indicate that blocking was effective in reducing the experimental error at α = .05.

	d.	From the printouts, we are given the differences in the sample means. The differences between Treatment B and both Treatments A and C are positive (1.125 and 2.450), so Treatment B has the largest sample mean. The difference between Treatments A and C is positive (1.325), so Treatment A has a larger sample mean than Treatment C. So Treatment B has the largest sample mean, Treatment A has the next largest sample mean, and Treatment C has the smallest sample mean.

		From the printout, all the means are significantly different from each other.

	e.	The assumptions necessary to assure the validity of the inferences above are:
		1.	The probability distributions of observations corresponding to all the block-treatment combinations are normal.
		2.	The variances of all the probability distributions are equal.
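	The mean squares, F ratios, and critical values quoted in Exercise 8.48 can be checked directly from the sums of squares and degrees of freedom in the table; a small Python sketch (values match the table up to rounding):

	    # Sketch: mean squares, F ratios, and F critical values for Exercise 8.48.
	    from scipy import stats

	    SS = {"Treatment": 12.032, "Block": 71.749, "Error": 0.708}
	    df = {"Treatment": 2, "Block": 3, "Error": 6}

	    MSE = SS["Error"] / df["Error"]                          # 0.118
	    for source in ("Treatment", "Block"):
	        MS = SS[source] / df[source]
	        F = MS / MSE                                         # about 51 and about 203
	        F_crit = stats.f.ppf(0.95, df[source], df["Error"])  # 5.14 and 4.76
	        print(source, round(MS, 3), round(F, 2), round(F_crit, 2))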

8.50	a.	This is a randomized block design. The blocks are the 12 plots of land. The treatments are the three methods used on the shrubs: fire, clipping, and control. The response variable is the mean number of flowers produced. The experimental units are the 36 shrubs.

	b.	The design can be diagrammed with each of the 12 plots of land as a block and the three treatments (fire, clipping, and control) applied to the shrubs within each plot.

	c.	To determine if there is a difference in the mean number of flowers produced among the three treatments, we test:

		H0: μ1 = μ2 = μ3
		Ha: The mean number of flowers produced differs for at least two of the methods.

		The test statistic is F = 5.42 and p = .009. We can reject the null hypothesis at any α > .009 level of significance. At least two of the methods differ with respect to mean number of flowers produced by pawpaws.

	d.	The means of Control and Clipping do not differ significantly. The means of Clipping and Burning do not differ significantly. The mean of treatment Burning exceeds that of the Control.

8.52	From the printout, the p-value for treatments or Decoy is p = .589. Since the p-value is not small, we cannot reject H0. There is insufficient evidence to indicate a difference in mean percentage of a goose flock to approach to within 46 meters of the pit blind among the three decoy types. This conclusion is valid for any reasonable value of α.

8.54	Using SAS, the ANOVA Table is:

	The ANOVA Procedure
	Dependent Variable: temp

	Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
	Model              11       18.53700000     1.68518182       0.52    0.8634
	Error              18       58.03800000     3.22433333
	Corrected Total    29       76.57500000

	R-Square    Coeff Var    Root MSE    temp Mean
	0.242076     1.885189    1.795643     95.25000

	Source      DF       Anova SS    Mean Square    F Value    Pr > F
	STUDENT      9    18.41500000     2.04611111       0.63    0.7537
	PLANT        2     0.12200000     0.06100000       0.02    0.9813

	To determine if there are differences among the mean temperatures among the three treatments, we test:

	H0: μ1 = μ2 = μ3
	Ha: At least two treatment means differ

	The test statistic is F = 0.02. The associated p-value is p = .9813. Since the p-value is very large, there is no evidence of a difference in mean temperature among the three treatments. Since there is no difference, we do not need to compare the means. It appears that the presence of plants or pictures of plants does not reduce stress.
8.56	a.	Some preliminary calculations are:

		CM = (Σy)²/n = 2.95²/20 = .435125

		SS(Total) = Σy² − CM = .4705 − .435125 = .035375

		SST = SS(DRUG) = T1²/b + T2²/b − CM = 1.62²/10 + 1.33²/10 − .435125 = .004205

		MST = SST/(k − 1) = .004205/(2 − 1) = .004205,  df = k − 1 = 1

		SSB = SS(DOG) = (B1² + B2² + ⋯ + B10²)/k − CM
		    = (.32² + .38² + .27² + .36² + .42² + .31² + .19² + .19² + .3² + .21²)/2 − .435125 = .028925

		MSB = SSB/(b − 1) = .028925/(10 − 1) = .003214,  df = b − 1 = 9

		SSE = SS(Total) − SST − SSB = .035375 − .004205 − .028925 = .002245

		MSE = SSE/(n − k − b + 1) = .002245/(20 − 2 − 10 + 1) = .0002494

		F = MST/MSE = .004205/.0002494 = 16.86

		F = MSB/MSE = .003214/.0002494 = 12.89

		To determine if there is a difference in mean pressure readings for the two treatments, we test:

		H0: μA = μB
		Ha: μA ≠ μB

		The test statistic is F = MST/MSE = 16.86

		The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k − 1 = 2 − 1 = 1 and ν2 = n − k − b + 1 = 20 − 2 − 10 + 1 = 9. From Table IX, Appendix B, F.05 = 5.12. The rejection region is F > 5.12.

		Since the observed value of the test statistic falls in the rejection region (F = 16.86 > 5.12), H0 is rejected. There is sufficient evidence to indicate a difference in mean pressure readings for the two drugs at α = .05.

	b.	Since there is expected to be much variation between the dogs, we use the dogs as blocks to eliminate this identified source of variation.

	c.	Dog    Drug A    Drug B    Difference (A − B)
		 1      .17       .15           .02
		 2      .20       .18           .02
		 3      .14       .13           .01
		 4      .18       .18           .00
		 5      .23       .19           .04
		 6      .19       .12           .07
		 7      .12       .07           .05
		 8      .10       .09           .01
		 9      .16       .14           .02
		10      .13       .08           .05

		Some preliminary calculations are:

		d̄ = Σdi/nd = .29/10 = .029

		sd² = [Σdi² − (Σdi)²/nd]/(nd − 1) = [.0129 − (.29)²/10]/(10 − 1) = .00449/9 = .0004989

		sd = √sd² = √.0004989 = .02234

		To determine if there is a difference in mean pressure readings for the two treatments, we test:

		H0: μA = μB
		Ha: μA ≠ μB

		The test statistic is t = (d̄ − 0)/(sd/√nd) = (.029 − 0)/(.02234/√10) = 4.105

		The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with df = nd − 1 = 10 − 1 = 9. From Table VI, Appendix B, t.025 = 2.262. The rejection region is t < −2.262 or t > 2.262.

		Since the observed value of the test statistic falls in the rejection region (t = 4.105 > 2.262), H0 is rejected. There is sufficient evidence to indicate a difference in the treatment means at α = .05.

	d.	In part a, F = 16.86; and in part c, t = 4.105. Note that t² = 4.105² = 16.85 ≈ F.

		In part a, F.05 = 5.12; and in part c, t.025 = 2.262. Note that t.025² = 2.262² = 5.12 = F.05.

	e.	p-value = P(F ≥ 16.86) with ν1 = 1 and ν2 = 9.

		Using Table XI, Appendix B, F.01 = 10.56, so P(F ≥ 16.86) < .01. Thus, the p-value is < .01.

		The probability of a test statistic this extreme if the treatment means are the same is less than .01. This is very significant. We would reject H0 in favor of Ha if α is larger than the p-value.
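		The equivalence noted in part d can be verified numerically from the data in part c; a minimal sketch using SciPy's paired t-test:

		    # Sketch: paired t-test for the dog blood-pressure data (Exercise 8.56 c/d).
		    from scipy import stats

		    drug_a = [.17, .20, .14, .18, .23, .19, .12, .10, .16, .13]
		    drug_b = [.15, .18, .13, .18, .19, .12, .07, .09, .14, .08]

		    t, p = stats.ttest_rel(drug_a, drug_b)
		    # t is about 4.105 and t**2 is about 16.9, the randomized block F
		    print(round(t, 3), round(t**2, 2))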

8.58	a.	There are two factors.

	b.	No, we cannot tell whether the factors are qualitative or quantitative.

	c.	Yes. There are four levels of factor A and three levels of factor B.

	d.	A treatment would consist of a combination of one level of factor A and one level of factor B. There are a total of 4 × 3 = 12 treatments.

	e.	One problem with only one replicate is that there are no degrees of freedom for error. This is overcome by having at least two replicates.

8.60	a.	Factor A has 3 + 1 = 4 levels and factor B has 1 + 1 = 2 levels.

	b.	There are a total of 23 + 1 = 24 observations and 4 × 2 = 8 treatments. Therefore, there were 24/8 = 3 observations for each treatment.

	c.	AB df = (a − 1)(b − 1) = (4 − 1)(2 − 1) = 3
		Error df = n − ab = 24 − 4(2) = 16

		MSA = SSA/(a − 1) ⇒ SSA = (a − 1)MSA = (4 − 1)(.75) = 2.25
		MSB = SSB/(b − 1) = .95/(2 − 1) = .95
		MSAB = SSAB/[(a − 1)(b − 1)] ⇒ SSAB = (a − 1)(b − 1)MSAB = (4 − 1)(2 − 1)(.30) = .9
		SSE = SS(Total) − SSA − SSB − SSAB = 6.5 − 2.25 − .95 − .9 = 2.4
		MSE = SSE/(n − ab) = 2.4/[24 − 4(2)] = .15

		SST = SSA + SSB + SSAB = 2.25 + .95 + .90 = 4.1
		Treatment df = ab − 1 = 4(2) − 1 = 7
		MST = SST/(ab − 1) = 4.1/7 = .5857        FT = MST/MSE = .5857/.15 = 3.90
		FA = MSA/MSE = .75/.15 = 5.00             FB = MSB/MSE = .95/.15 = 6.33
		FAB = MSAB/MSE = .30/.15 = 2.00

		The ANOVA table is:

		Source        df     SS      MS      F
		Treatments     7    4.1     .59   3.90
		  A            3   2.25     .75   5.00
		  B            1    .95     .95   6.33
		  AB           3    .90     .30   2.00
		Error         16   2.40     .15
		Total         23   6.50
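		A short sketch that rebuilds the Exercise 8.60 quantities from the given mean squares (MSA = .75, MSB = .95, MSAB = .30, SS(Total) = 6.5) and confirms the F ratios:

		    # Sketch: reconstruct the 8.60 ANOVA quantities from the given mean squares.
		    a, b, n = 4, 2, 24
		    MSA, MSB, MSAB, SS_total = .75, .95, .30, 6.5

		    SSA = (a - 1) * MSA                  # 2.25
		    SSB = (b - 1) * MSB                  # 0.95
		    SSAB = (a - 1) * (b - 1) * MSAB      # 0.90
		    SSE = SS_total - SSA - SSB - SSAB    # 2.40
		    MSE = SSE / (n - a * b)              # 0.15

		    for name, MS in (("A", MSA), ("B", MSB), ("AB", MSAB)):
		        print(name, round(MS / MSE, 2))  # F ratios 5.00, 6.33, 2.00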

	d.	To determine whether the treatment means differ, we test:

		H0: μ1 = μ2 = ⋯ = μ8
		Ha: At least two treatment means differ

		The test statistic is F = MST/MSE = 3.90

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = ab − 1 = 4(2) − 1 = 7 and ν2 = n − ab = 24 − 4(2) = 16. From Table VIII, Appendix B, F.10 = 2.13. The rejection region is F > 2.13.

		Since the observed value of the test statistic falls in the rejection region (F = 3.90 > 2.13), H0 is rejected. There is sufficient evidence to indicate the treatment means differ at α = .10.

	e.	To determine if the factors interact, we test:

		H0: Factors A and B do not interact to affect the response mean
		Ha: Factors A and B do interact to affect the response mean

		The test statistic is F = 2.00.

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = (a − 1)(b − 1) = (4 − 1)(2 − 1) = 3 and ν2 = n − ab = 24 − 4(2) = 16. From Table VIII, Appendix B, F.10 = 2.46. The rejection region is F > 2.46.

		Since the observed value of the test statistic does not fall in the rejection region (F = 2.00 ≯ 2.46), H0 is not rejected. There is insufficient evidence to indicate factors A and B interact at α = .10.

		To determine if the four means of factor A differ, we test:

		H0: There is no difference in the four means of factor A
		Ha: At least two of the factor A means differ

		The test statistic is F = 5.00.

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = a − 1 = 4 − 1 = 3 and ν2 = n − ab = 24 − 4(2) = 16. From Table VIII, Appendix B, F.10 = 2.46. The rejection region is F > 2.46.

		Since the observed value of the test statistic falls in the rejection region (F = 5.00 > 2.46), H0 is rejected. There is sufficient evidence to indicate at least two of the four means of factor A differ at α = .10.

		To determine if the 2 means of factor B differ, we test:

		H0: There is no difference in the two means of factor B
		Ha: At least two of the factor B means differ

		The test statistic is F = 6.33.

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = b − 1 = 2 − 1 = 1 and ν2 = n − ab = 24 − 4(2) = 16. From Table VIII, Appendix B, F.10 = 3.05. The rejection region is F > 3.05.

		Since the observed value of the test statistic falls in the rejection region (F = 6.33 > 3.05), H0 is rejected. There is sufficient evidence to indicate the two means of factor B differ at α = .10.

		All of the tests performed are warranted because interaction was not significant.
8.62	a.	The treatments are the combinations of the levels of factor A and the levels of factor B. There are 2 × 2 = 4 treatments. The treatment means are:

		x̄11 = (29.6 + 35.2)/2 = 32.4        x̄12 = (47.3 + 42.1)/2 = 44.7
		x̄21 = (12.9 + 17.6)/2 = 15.25       x̄22 = (28.4 + 22.7)/2 = 25.55

		The factors do not appear to interact; the lines joining the treatment means are almost parallel. The treatment means do appear to differ because the sample means range from 15.25 to 44.7.

	b.	CM = (Σx)²/n = 235.8²/8 = 6950.205

		SS(Total) = Σx² − CM = 7922.92 − 6950.205 = 972.715

		SSA = ΣAi²/(br) − CM = 154.2²/[2(2)] + 81.6²/[2(2)] − 6950.205 = 7609.05 − 6950.205 = 658.845

		SSB = ΣBj²/(ar) − CM = 95.3²/[2(2)] + 140.5²/[2(2)] − 6950.205 = 7205.585 − 6950.205 = 255.38

		SSAB = ΣABij²/r − SSA − SSB − CM
		     = (64.8² + 89.4² + 30.5² + 51.1²)/2 − 658.845 − 255.38 − 6950.205 = 7866.43 − 7864.43 = 2

		SSE = SS(Total) − SSA − SSB − SSAB = 972.715 − 658.845 − 255.38 − 2 = 56.49

		df:  A: a − 1 = 2 − 1 = 1
		     B: b − 1 = 2 − 1 = 1
		     AB: (a − 1)(b − 1) = (2 − 1)(2 − 1) = 1
		     Error: n − ab = 8 − 2(2) = 4
		     Total: n − 1 = 8 − 1 = 7

		MSA = SSA/(a − 1) = 658.845/1 = 658.845        MSB = SSB/(b − 1) = 255.38/1 = 255.38
		MSAB = SSAB/[(a − 1)(b − 1)] = 2/1 = 2         MSE = SSE/(n − ab) = 56.49/4 = 14.1225

		FA = MSA/MSE = 658.845/14.1225 = 46.65
		FB = MSB/MSE = 255.38/14.1225 = 18.08
		FAB = MSAB/MSE = 2/14.1225 = .14

		The ANOVA table is:

		Source    df        SS         MS       F
		A          1   658.845    658.845   46.65
		B          1   255.380    255.380   18.08
		AB         1     2.000      2.000     .14
		Error      4    56.490    14.1225
		Total      7   972.715

	c.	SST = SSA + SSB + SSAB = 658.845 + 255.380 + 2.000 = 916.225

		df = ab − 1 = 2(2) − 1 = 3

		MST = SST/(ab − 1) = 916.225/3 = 305.408        FT = MST/MSE = 305.408/14.1225 = 21.63

		To determine whether the treatment means differ, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least two of the treatment means differ

		The test statistic is F = 21.63.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = ab − 1 = 2(2) − 1 = 3 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 6.59. The rejection region is F > 6.59.

		Since the observed value of the test statistic falls in the rejection region (F = 21.63 > 6.59), H0 is rejected. There is sufficient evidence to indicate the treatment means differ at α = .05.

		This agrees with the conclusion in part a.
	d.	Since there are differences among the treatment means, we test for the presence of interaction:

		H0: Factors A and B do not interact to affect the response means
		Ha: Factors A and B do interact to affect the response means

		The test statistic is F = .14.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = (a − 1)(b − 1) = (2 − 1)(2 − 1) = 1 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 7.71. The rejection region is F > 7.71.

		Since the observed value of the test statistic does not fall in the rejection region (F = .14 ≯ 7.71), H0 is not rejected. There is insufficient evidence to indicate the factors interact at α = .05.

	e.	Since the interaction was not significant, we test for main effects.

		To determine whether the two means of factor A differ, we test:

		H0: μ1 = μ2
		Ha: μ1 ≠ μ2

		The test statistic is F = 46.65.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = a − 1 = 2 − 1 = 1 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 7.71. The rejection region is F > 7.71.

		Since the observed value of the test statistic falls in the rejection region (F = 46.65 > 7.71), H0 is rejected. There is sufficient evidence to indicate the two means of factor A differ at α = .05.

		To determine whether the two means of factor B differ, we test:

		H0: μ1 = μ2
		Ha: μ1 ≠ μ2

		The test statistic is F = 18.08.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = b − 1 = 2 − 1 = 1 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 7.71. The rejection region is F > 7.71.

		Since the observed value of the test statistic falls in the rejection region (F = 18.08 > 7.71), H0 is rejected. There is sufficient evidence to indicate the two means of factor B differ at α = .05.

	f.	The results of all the tests agree with those in part a.

	g.	Since no interaction is present, but the means of both factors A and B differ, we compare the two means of factor A and compare the two means of factor B. Since there are only two means to compare for each factor, the higher population mean corresponds to the higher sample mean.

		Factor A:  x̄1 = ΣA1/(br) = (29.6 + 35.2 + 47.3 + 42.1)/[2(2)] = 38.55
		           x̄2 = ΣA2/(br) = (12.9 + 17.6 + 28.4 + 22.7)/[2(2)] = 20.4

		The mean for level 1 of factor A is significantly higher than the mean for level 2.

		Factor B:  x̄1 = ΣB1/(ar) = (29.6 + 35.2 + 12.9 + 17.6)/[2(2)] = 23.825
		           x̄2 = ΣB2/(ar) = (47.3 + 42.1 + 28.4 + 22.7)/[2(2)] = 35.125

		The mean for level 2 of factor B is significantly higher than the mean for level 1.
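		Since the eight observations are listed in part a, the whole decomposition can be reproduced in a few lines; a sketch (two replicates per factor-level combination):

		    # Sketch: 2x2 factorial decomposition for Exercise 8.62 from the raw data.
		    import numpy as np

		    # cells[i][j] holds the r = 2 replicates for level i of A and level j of B
		    cells = np.array([[[29.6, 35.2], [47.3, 42.1]],
		                      [[12.9, 17.6], [28.4, 22.7]]])
		    a, b, r = 2, 2, 2
		    n = a * b * r

		    CM = cells.sum() ** 2 / n
		    SS_total = (cells ** 2).sum() - CM
		    SSA = (cells.sum(axis=(1, 2)) ** 2).sum() / (b * r) - CM
		    SSB = (cells.sum(axis=(0, 2)) ** 2).sum() / (a * r) - CM
		    SSAB = (cells.sum(axis=2) ** 2).sum() / r - SSA - SSB - CM
		    SSE = SS_total - SSA - SSB - SSAB

		    MSE = SSE / (n - a * b)
		    MSA = SSA / (a - 1)
		    MSB = SSB / (b - 1)
		    MSAB = SSAB / ((a - 1) * (b - 1))
		    # F values: about 46.65 (A), 18.08 (B), 0.14 (AB)
		    print(round(MSA / MSE, 2), round(MSB / MSE, 2), round(MSAB / MSE, 2))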
8.64	a.	There are a total of 2 × 4 = 8 treatments.

	b.	The interaction between temperature and type was significant. This means that the effect of type of yeast on the mean autolysis yield depends on the level of temperature.

	c.	To determine if the main effect of type of yeast is significant, we test:

		H0: μBa = μBr
		Ha: μBa ≠ μBr

		To determine if the main effect of temperature is significant, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least one mean differs

	d.	The tests for the main effects should not be run until after the test for interaction is conducted. If interaction is significant, then these interaction effects could cover up the main effects. Thus, the main effect tests would not be informative.

		If the test for interaction is not significant, then the main effect tests could be run.

	e.	Baker's yeast:

		The mean yield for temperature 54° is significantly lower than the mean yields for the other 3 temperatures. There is no difference in the mean yields for the temperatures 45°, 48°, and 51°.

		Brewer's yeast:

		The mean yield for temperature 54° is significantly lower than the mean yields for the other 3 temperatures. There is no difference in the mean yields for the temperatures 45°, 48°, and 51°.

8.66	a.	This is an observational experiment. The researcher recorded the number of users per hour for each of 24 hours per day, 7 days per week, for 7 weeks. The researcher did not manipulate the weeks or days or hours.

	b.	The two factors are (1) the day of the week with 7 levels and (2) the hour of the day with 24 levels.

	c.	In a factorial experiment, a is the number of levels of factor A and b is the number of levels of factor B. If we let factor A be the day of the week and factor B be the hour of the day, then a = 7 and b = 24.

	d.	To determine if the a × b = 7 × 24 = 168 treatment means differ, we test:

		H0: μ1 = μ2 = μ3 = ⋯ = μ168
		Ha: At least two means differ

		The test statistic is F = MST/MSE = 1143.99/45.65 = 25.06

		The rejection region requires α = .01 in the upper tail of the F distribution with ν1 = p − 1 = 168 − 1 = 167 and ν2 = n − p = 1172 − 168 = 1004. From Table XI, Appendix B, F.01 ≈ 1.00. The rejection region is F > 1.00.

		Since the observed value of the test statistic falls in the rejection region (F = 25.06 > 1.00), H0 is rejected. There is sufficient evidence to indicate a difference in mean usage among the day-hour combinations at α = .01.

	e.	The hypotheses used to test if an interaction effect exists are:

		H0: Days and hours do not interact to affect the mean usage
		Ha: Days and hours do interact to affect the mean usage

	f.	The test statistic is F = MSAB/MSE = 55.69/45.65 = 1.22

		The p-value is p = .0527. Since the p-value is not less than α = .01, H0 is not rejected. There is insufficient evidence to indicate days and hours interact to affect usage at α = .01.

	g.	To determine if the mean usage differs among the days of the week, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6 = μ7
		Ha: At least two means differ

		The test statistic is F = MSA/MSE = 3122.02/45.65 = 68.39

		The p-value is p = .0001. Since the p-value is less than α = .01, H0 is rejected. There is sufficient evidence to indicate the mean usage differs among the days of the week at α = .01.

		To determine if the mean usage differs among the hours of the day, we test:

		H0: μ1 = μ2 = μ3 = ⋯ = μ24
		Ha: At least two means differ

		The test statistic is F = MSB/MSE = 7157.82/45.65 = 156.80

		The p-value is p = .0001. Since the p-value is less than α = .01, H0 is rejected. There is sufficient evidence to indicate the mean usage differs among the hours of the day at α = .01.
8.68	a.	The degrees of freedom for Type of message retrieval system is a − 1 = 2 − 1 = 1. The degrees of freedom for Pricing option is b − 1 = 2 − 1 = 1. The degrees of freedom for the interaction of Type of message retrieval system and Pricing option is (a − 1)(b − 1) = (2 − 1)(2 − 1) = 1. The degrees of freedom for error is n − ab = 120 − 2(2) = 116.

		Source                                 df    SS    MS       F
		Type of message retrieval system        1     -     -   2.001
		Pricing option                          1     -     -   5.019
		Type of system × pricing option         1     -     -   4.986
		Error                                 116
		Total                                 119

	b.	To determine if "Type of system" and "Pricing option" interact to affect the mean willingness to buy, we test:

		H0: Type of system and Pricing option do not interact
		Ha: Type of system and Pricing option interact

	c.	The test statistic is F = MSAB/MSE = 4.986

		The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = (a − 1)(b − 1) = (2 − 1)(2 − 1) = 1 and ν2 = n − ab = 120 − 2(2) = 116. From Table IX, Appendix B, F.05 ≈ 3.92. The rejection region is F > 3.92.

		Since the observed value of the test statistic falls in the rejection region (F = 4.986 > 3.92), H0 is rejected. There is sufficient evidence to indicate "Type of system" and "Pricing option" interact to affect the mean willingness to buy at α = .05.

	d.	No. Since the test in part c indicated that interaction between "Type of system" and "Pricing option" is present, we should not test for the main effects. Instead, we should proceed directly to a multiple comparison procedure to compare selected treatment means. If interaction is present, it can cover up the main effects.

8.70	a.	The treatments are the 3 × 3 = 9 combinations of PES and Trust. The nine treatments are: (BC, Low), (PC, Low), (NA, Low), (BC, Med), (PC, Med), (NA, Med), (BC, High), (PC, High), and (NA, High).

	b.	df(PES) = 3 − 1 = 2;  df(Trust) = 3 − 1 = 2

		SSE = SSTot − SS(PES) − SS(Trust) − SS(PT)
		    = 161.1162 − 2.1774 − 7.6367 − 1.7380 = 149.5641

		MS(PES) = SS(PES)/df(PES) = 2.1774/2 = 1.0887
		MS(Trust) = SS(Trust)/df(Trust) = 7.6367/2 = 3.81835
		MS(PT) = SS(PT)/df(PT) = 1.7380/4 = 0.4345
		MSE = SSE/df(Error) = 149.5641/206 = 0.7260

		FPES = MS(PES)/MSE = 1.0887/0.7260 = 1.50
		FTrust = MS(Trust)/MSE = 3.81835/0.7260 = 5.26
		FPT = MS(PT)/MSE = 0.4345/0.7260 = 0.60

		The ANOVA table is:

		Source         df         SS         MS      F
		PES             2     2.1774     1.0887   1.50
		Trust           2     7.6367    3.81835   5.26
		PES × Trust     4     1.7380     0.4345   0.60
		Error         206   149.5641     0.7260
		Total         214   161.1162

To determine if PES and Trust interact, we test:


H0: PES and Trust do not interact to affect the mean tension
Ha: PES and Trust do interact to affect the mean tension
The test statistic is F = 0.60.

282

Chapter 8

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

The rejection region requires = .05 in the upper tail of the F-distribution with 1 =
(a 1)(b 1) = (3 1)(3 1) = 4 and 2 = n ab = 215 3(3) = 206. From Table IX,
Appendix B, F.05 2.37. The rejection region is F > 2.37.
Since the observed value of the test statistic does not fall in the rejection region (F = 0.60
>/ 2.37), H0 is not rejected. There is insufficient evidence to indicate that PES and Trust
interact at = .05.
d.

The plot of the treatment means is:


The mean tension scores for Low
Trust are relatively the same for each
level of PES. Similarly, the mean
tension scores for Medium Trust are
relatively the same for each level of
PES. However, the mean tension
scores for High Trust are not the
same for each level of PES. For both
PES levels BC and PC, as the level of
trust increases, the mean tension
scores decrease. However, for PES
level NA, as trust goes from low to medium, the mean tension decreases. As the trust
goes from medium to high, the mean tension increases. This indicates that interaction is
present which was also found in part d.

e.

8.72

Because the interaction of PES and Trust was found to be significant, the tests for the
main effects are irrelevant. If the factors interact, the interaction effect can cover up any
main effect differences. In addition, interaction implies that the effects of one factor on
the dependent variable are different at different levels of the second factor. Thus, there is
no one "main" effect of the factor.

8.72	Using MINITAB, the ANOVA results are:

	General Linear Model: Deviation versus Group, Trail

	Factor   Type    Levels   Values
	Group    fixed        4   F G M N
	Trail    fixed        2   C E

	Analysis of Variance for Deviation, using Adjusted SS for Tests

	Source         DF     Seq SS     Adj SS    Adj MS       F       P
	Group           3    16271.2    13000.6    4333.5    5.91   0.001
	Trail           1    46445.5    46445.5   46445.5   63.34   0.000
	Group*Trail     3     2245.2     2245.2     748.4    1.02   0.386
	Error         112    82131.7    82131.7     733.3
	Total         119   147093.6

	First, we must test for treatment effects.

	SST = SS(Group) + SS(Trail) + SS(Group×Trail) = 16,271.2 + 46,445.5 + 2,245.2 = 64,961.9. The df = 3 + 1 + 3 = 7.

	MST = SST/(ab − 1) = 64,961.9/[4(2) − 1] = 9,280.2714

	F = MST/MSE = 9,280.2714/733.3 = 12.66

	To determine if there are differences in mean ratings among the 8 treatments, we test:

	H0: All treatment means are the same
	Ha: At least two treatment means differ

	The test statistic is F = 12.66.

	Since no α was given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = ab − 1 = 4(2) − 1 = 7 and ν2 = n − ab = 120 − 4(2) = 112. From Table IX, Appendix B, F.05 ≈ 2.09. The rejection region is F > 2.09.

	Since the observed value of the test statistic falls in the rejection region (F = 12.66 > 2.09), H0 is rejected. There is sufficient evidence that differences exist among the treatment means at α = .05. Since differences exist, we now test for the interaction effect between Trail and Group.

	To determine if Trail and Group interact, we test:

	H0: Trail and Group do not interact
	Ha: Trail and Group do interact

	The test statistic is F = 1.02 and p = .386.

	Since the p-value is greater than α (p = .386 > .05), H0 is not rejected. There is insufficient evidence that Trail and Group interact at α = .05. Since the interaction does not exist, we test for the main effects of Trail and Group.

	To determine if there are differences in the mean rating between the two levels of Trail, we test:

	H0: μ1 = μ2
	Ha: μ1 ≠ μ2

	The test statistic is F = 63.34 and p = 0.000.

	Since the p-value is less than α (p = .000 < .05), H0 is rejected. There is sufficient evidence that the mean trail deviations differ between the fecal extract trail and the control trail at α = .05.

	To determine if there are differences in the mean rating among the four levels of Group, we test:

	H0: μ1 = μ2 = μ3 = μ4
	Ha: At least 2 means differ

	The test statistic is F = 5.91 and p = 0.001.

	Since the p-value is less than α (p = 0.001 < .05), H0 is rejected. There is sufficient evidence that the mean trail deviations differ among the four groups at α = .05.
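	The critical value and p-value used for the overall treatment test above can be obtained without the tables; a small SciPy sketch:

	    # Sketch: critical value and p-value for the overall treatment F test in 8.72.
	    from scipy import stats

	    F, df_num, df_den = 12.66, 7, 112
	    print(round(stats.f.ppf(0.95, df_num, df_den), 2))   # critical value, about 2.09
	    print(stats.f.sf(F, df_num, df_den))                 # p-value, far below .05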


8.74	There are 3 × 2 = 6 treatments. They are A1B1, A1B2, A2B1, A2B2, A3B1, and A3B2.

8.76	a.	SSE = SSTot − SST = 62.55 − 36.95 = 25.60

		df Treatment = p − 1 = 4 − 1 = 3
		df Error = n − p = 20 − 4 = 16
		df Total = n − 1 = 20 − 1 = 19

		MST = SST/df = 36.95/3 = 12.32
		MSE = SSE/df = 25.60/16 = 1.60
		F = MST/MSE = 12.32/1.60 = 7.70

		The ANOVA table:

		Source       df     SS      MS      F
		Treatment     3   36.95   12.32   7.70
		Error        16   25.60    1.60
		Total        19   62.55

	b.	To determine if there is a difference in the treatment means, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least two of the means differ

		where μi represents the mean for the ith treatment.

		The test statistic is F = MST/MSE = 7.70

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = (p − 1) = (4 − 1) = 3 and ν2 = (n − p) = (20 − 4) = 16. From Table VIII, Appendix B, F.10 = 2.46. The rejection region is F > 2.46.

		Since the observed value of the test statistic falls in the rejection region (F = 7.70 > 2.46), H0 is rejected. There is sufficient evidence to conclude that at least two of the means differ at α = .10.

	c.	x̄4 = Σy4/n4 = 57/5 = 11.4

		For confidence level .90, α = .10 and α/2 = .10/2 = .05. From Table VI, Appendix B, with df = 16, t.05 = 1.746. The confidence interval is:

		x̄4 ± t.05 √(MSE/n4) ⇒ 11.4 ± 1.746√(1.6/5) ⇒ 11.4 ± .99 ⇒ (10.41, 12.39)
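		A numerical check of the part c interval, with the exact t quantile in place of the table value 1.746:

		    # Sketch: 90% confidence interval for the treatment-4 mean in Exercise 8.76 c.
		    from math import sqrt
		    from scipy import stats

		    xbar4, MSE, n4, df_error = 11.4, 1.60, 5, 16
		    t05 = stats.t.ppf(0.95, df_error)          # about 1.746
		    half = t05 * sqrt(MSE / n4)                # about 0.99
		    print(xbar4 - half, xbar4 + half)          # about (10.41, 12.39)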

8.78	a.	df(AB) = (a − 1)(b − 1) = 3(5) = 15
		df(Error) = n − ab = 48 − 4(6) = 24
		SSAB = MSAB × df(AB) = 3.1(15) = 46.5
		SS(Total) = SSA + SSB + SSAB + SSE = 2.6 + 9.2 + 46.5 + 18.7 = 77

		MSA = SSA/(a − 1) = 2.6/3 = .8667        MSB = SSB/(b − 1) = 9.2/5 = 1.84
		MSE = SSE/(n − ab) = 18.7/24 = .7792

		FA = MSA/MSE = .8667/.7792 = 1.11        FB = MSB/MSE = 1.84/.7792 = 2.36
		FAB = MSAB/MSE = 3.1/.7792 = 3.98

		Source     df     SS      MS      F
		A           3    2.6   .8667   1.11
		B           5    9.2    1.84   2.36
		AB         15   46.5     3.1   3.98
		Error      24   18.7   .7792
		Total      47   77.0

	b.	Factor A has a = 3 + 1 = 4 levels and factor B has b = 5 + 1 = 6 levels. The number of treatments is ab = 4(6) = 24. The total number of observations is n = 47 + 1 = 48. Thus, two replicates were performed.

	c.	SST = SSA + SSB + SSAB = 2.6 + 9.2 + 46.5 = 58.3

		MST = SST/(ab − 1) = 58.3/[4(6) − 1] = 2.5347

		F = MST/MSE = 2.5347/.7792 = 3.25

		To determine whether the treatment means differ, we test:

		H0: μ1 = μ2 = ⋯ = μ24
		Ha: At least one treatment mean is different

		The test statistic is F = MST/MSE = 3.25

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = ab − 1 = 4(6) − 1 = 23 and ν2 = n − ab = 48 − 4(6) = 24. From Table IX, Appendix B, F.05 ≈ 2.03. The rejection region is F > 2.03.

		Since the observed value of the test statistic falls in the rejection region (F = 3.25 > 2.03), H0 is rejected. There is sufficient evidence to indicate the treatment means differ at α = .05.

	d.	Since there are differences among the treatment means, we test for the presence of interaction:

		H0: Factor A and factor B do not interact to affect the response mean
		Ha: Factor A and factor B do interact to affect the response mean

		The test statistic is F = MSAB/MSE = 3.98

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = (a − 1)(b − 1) = (4 − 1)(6 − 1) = 15 and ν2 = n − ab = 48 − 4(6) = 24. From Table IX, Appendix B, F.05 = 2.11. The rejection region is F > 2.11.

		Since the observed value of the test statistic falls in the rejection region (F = 3.98 > 2.11), H0 is rejected. There is sufficient evidence to indicate factors A and B interact to affect the response means at α = .05.

		Since the interaction is significant, no further tests are warranted. Multiple comparisons need to be performed.
8.80	a.	This is a two-factor factorial design. It is also a completely randomized design.

	b.	The two factors are "involvement in topic" and "question wording." Both are qualitative variables because neither is measured on a numerical scale.

	c.	There are two levels of "involvement in topic": high and low. There are two levels of "question wording": positive and negative.

	d.	There are 2 × 2 = 4 treatments. They are:

		(high, positive), (high, negative), (low, positive), and (low, negative)

	e.	The experiment's dependent variable is the level of agreement.

8.82	a.	To determine if the mean vacancy rates of the eight office-property submarkets in Atlanta differ, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6 = μ7 = μ8
		Ha: At least two means differ

	b.	If quarterly data were used for nine years, there are 4 × 9 = 36 observations per submarket. Since there are 8 submarkets, the total sample size is 8 × 36 = 288. Since no value of α is given, we will use α = .05.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k − 1 = 8 − 1 = 7 and ν2 = n − k = 288 − 8 = 280. From Table X, Appendix B, F.05 ≈ 2.01. The rejection region is F > 2.01.

		Since the observed value of the test statistic falls in the rejection region (F = 17.54 > 2.01), H0 is rejected. There is sufficient evidence to indicate the mean vacancy rates of the eight office-property submarkets in Atlanta differ at α = .05.

	c.	With ν1 = k − 1 = 8 − 1 = 7 and ν2 = n − k = 288 − 8 = 280, P(F > 17.54) < .01, using Table XI, Appendix B. Thus, the p-value is less than .01.

	d.	We must assume that all eight samples are randomly drawn from normal populations, the eight population variances are the same, and the samples are independent.

	e.	The mean vacancy rate for the South submarket is significantly larger than the mean vacancy rates for all other submarkets. The mean vacancy rate of the Downtown submarket is significantly larger than the mean vacancy rates for all other submarkets except the South. The mean vacancy rate of the North Lake submarket is significantly larger than the mean vacancy rates for all other submarkets except the South and Downtown. The mean vacancy rate of the Midtown submarket is significantly larger than the mean vacancy rates for all other submarkets except the South, Downtown, and North Lake. There are no other significant differences.

8.84	a.	The response is the weight of a brochure. There is one factor and it is carton. The treatments are the five different cartons, while the experimental units are the brochures.

	b.	CM = (Σy)²/n = .75005²/40 = .01406437506

		SS(Total) = Σy² − CM = .014066537 − .01406437506 = .00000216264

		SST = ΣTi²/8 − CM
		    = (.14767² + .15028² + .14962² + .15217² + .15031²)/8 − .01406437506
		    = .01406568209 − .01406437506 = .00000130703

		SSE = SS(Total) − SST = .00000216264 − .00000130703 = .00000085561

		MST = SST/(k − 1) = .00000130703/(5 − 1) = .000000326756
		MSE = SSE/(n − k) = .00000085561/(40 − 5) = .000000024446
		F = MST/MSE = .000000326756/.000000024446 = 13.37

		Source        df              SS               MS       F
		Treatments     4   .00000130703   .000000326756   13.37
		Error         35   .00000085561   .000000024446
		Total         39   .00000216264

		To determine whether there are differences in mean weight per brochure among the five cartons, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5
		Ha: At least two treatment means differ

		The test statistic is F = 13.37.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k − 1 = 5 − 1 = 4 and ν2 = n − k = 40 − 5 = 35. From Table IX, Appendix B, F.05 ≈ 2.53. The rejection region is F > 2.53.

		Since the observed value of the test statistic falls in the rejection region (F = 13.37 > 2.53), H0 is rejected. There is sufficient evidence to indicate a difference in mean weight per brochure among the five cartons at α = .05.
c.

We must assume that the distributions of weights for the brochures in the five cartons are
normal, that the variances of the weights for the brochures in the five cartons are equal,
and that random and independent samples were selected from each of the cartons.

	d.	Using MINITAB, the results of Tukey's multiple comparison procedure are:

		Level      N       Mean      StDev
		Carton1    8   0.018459   0.000105
		Carton2    8   0.018785   0.000101
		Carton3    8   0.018703   0.000109
		Carton4    8   0.019021   0.000232
		Carton5    8   0.018789   0.000188

		Pooled StDev = 0.000156

		Tukey 95% Simultaneous Confidence Intervals
		All Pairwise Comparisons
		Individual confidence level = 99.32%

		Carton1 subtracted from:
		             Lower       Center       Upper
		Carton2   0.0001013    0.0003262   0.0005512
		Carton3   0.0000188    0.0002437   0.0004687
		Carton4   0.0003375    0.0005625   0.0007875
		Carton5   0.0001050    0.0003300   0.0005550

		Carton2 subtracted from:
		              Lower        Center       Upper
		Carton3   -0.0003075    -0.0000825   0.0001425
		Carton4    0.0000113     0.0002363   0.0004612
		Carton5   -0.0002212     0.0000037   0.0002287

		Carton3 subtracted from:
		              Lower        Center       Upper
		Carton4    0.0000938     0.0003187   0.0005437
		Carton5   -0.0001387     0.0000862   0.0003112

		Carton4 subtracted from:
		              Lower        Center        Upper
		Carton5   -0.0004575    -0.0002325   -0.0000075

		The means arranged in order are:

		Carton 1    Carton 3    Carton 2    Carton 5    Carton 4
		.018459     .018703     .018785     .018789     .019021

		The interpretation of the Tukey results is:

		The mean weight for carton 4 is significantly higher than the mean weights of all the other cartons. The mean weights of cartons 5, 4, and 3 are significantly higher than the mean weight of carton 1.

	e.	Since there are differences among the cartons, management should sample from many cartons.

8.86	a.	This is a randomized block design.


Response:
Factor:
Factor type:
Treatments:
Experimental units:

290

the length of time required for a cut to stop bleeding


drug
qualitative
drugs A, B, and C
subjects

Chapter 8

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

b.

Using MINITAB, the results are:


General Linear Model: Y versus Drug, Person
Factor
Drug
Person

Type Levels Values


fixed
3 A B C
fixed
5 1 2 3 4 5

Analysis of Variance for Y, using Adjusted SS for Tests


Source
Drug
Person
Error
Total

DF
2
4
8
14

Seq SS
156.4
7645.8
160.1
7962.3

Adj SS
156.4
7645.8
160.1

Adj MS
78.2
1911.5
20.0

F
3.91
95.51

P
0.066
0.000

		Tukey 90.0% Simultaneous Confidence Intervals
		Response Variable Y
		All Pairwise Comparisons among Levels of Drug

		Drug = A subtracted from:
		Drug    Lower    Center   Upper
		B      -11.56    -4.820   1.922
		C       -3.72     3.020   9.762

		Drug = B subtracted from:
		Drug    Lower    Center   Upper
		C       1.098     7.840   14.58
		Let μ1, μ2, and μ3 represent the mean clotting times for the three drugs.

		H0: μ1 = μ2 = μ3
		Ha: At least two means differ

		The test statistic is F = MS(Drug)/MSE = 3.91

		The p-value is p = 0.066. Since the observed level of significance is less than α = .10, H0 is rejected. There is sufficient evidence to indicate differences in the mean clotting times among the three drugs at α = .10.

	c.	The observed level of significance is given as 0.066.

	d.	To determine if there is a significant difference in the mean response over blocks, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5
		Ha: At least two block means differ

		The test statistic is F = MS(Person)/MSE = 95.51

		The p-value is p = 0.000. Since the observed level of significance is less than α = .10, H0 is rejected. There is sufficient evidence to indicate differences in the mean clotting times among the five people at α = .10.

	e.	The confidence interval to compare drugs A and B is (−11.56, 1.922). Since 0 is in the interval, there is no evidence of a difference in mean clotting times between drugs A and B.

		The confidence interval to compare drugs A and C is (−3.72, 9.762). Since 0 is in the interval, there is no evidence of a difference in mean clotting times between drugs A and C.

		The confidence interval to compare drugs B and C is (1.098, 14.58). Since 0 is not in the interval, there is evidence of a difference in mean clotting times between drugs B and C. Since the numbers are positive, the mean clotting time for drug C is greater than that for drug B.

		In summary, the mean clotting time for drug C is greater than that for drug B. No other differences exist.

8.88	a.	MSA = SSA/dfA = 243.2/1 = 243.2        MSB = SSB/dfB = 57.8/1 = 57.8

		SSAB = SSTot − SSA − SSB − SSE = 976.3 − 243.2 − 57.8 − 670.8 = 4.5

		MSAB = SSAB/dfAB = 4.5/1 = 4.5         MSE = SSE/dfE = 670.8/77 = 8.712

		FA = MSA/MSE = 243.2/8.712 = 27.92
		FB = MSB/MSE = 57.8/8.712 = 6.63
		FAB = MSAB/MSE = 4.5/8.712 = 0.52

		The ANOVA table is:

		Source                    df      SS      MS       F
		Recent Performance (A)     1   243.2   243.2   27.92
		Risk Attitude (B)          1    57.8    57.8    6.63
		AB                         1     4.5     4.5    0.52
		Error                     77   670.8   8.712
		Total                     80   976.3

To determine if factors A and B interact, we test:

H0: Factors A and B do not interact to affect the mean decision


Ha: Factors A and B do interact to affect the mean decision
The test statistic is F = 0.52.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 =
(a − 1)(b − 1) = (2 − 1)(2 − 1) = 1 and ν2 = n − ab = 81 − 2(2) = 77. From Table IX,
Appendix B, F.05 ≈ 4.00. The rejection region is F > 4.00.

Since the observed value of the test statistic does not fall in the rejection region (F = .52
≯ 4.00), H0 is not rejected. There is insufficient evidence to indicate that factors A and B
interact at α = .05.
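Instead of the Table IX approximation, the exact F critical values can be obtained numerically.
This short Python sketch is an added illustration (it assumes SciPy and is not part of the
original solution); the exact values are close to the table approximations used here.

    # Sketch: exact F critical values for the tests in exercise 8.88 (assumes SciPy)
    from scipy.stats import f

    f05 = f.ppf(0.95, dfn=1, dfd=77)   # upper-tail .05 critical value
    f01 = f.ppf(0.99, dfn=1, dfd=77)   # upper-tail .01 critical value (used in part d)
    print(round(f05, 2), round(f01, 2))  # roughly 3.97 and 6.98, near the table values 4.00 and 7.08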
c.

Since the interaction is not significant, the main effect tests are meaningful.

To determine if an individual's risk attitude affects his or her budgetary decisions, we test:

H0: No difference exists between the risk attitude means
Ha: The risk attitude means differ

The test statistic is F = 6.63.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = b − 1
= 2 − 1 = 1 and ν2 = n − ab = 81 − 2(2) = 77. From Table IX, Appendix B, F.05 ≈ 4.00.
The rejection region is F > 4.00.

Since the observed value of the test statistic falls in the rejection region (F = 6.63 > 4.00),
H0 is rejected. There is sufficient evidence to indicate an individual's risk attitude affects
his or her budgetary decisions at α = .05.
d.

To determine if recent performance affects budgeting decisions, we test:

H0: No difference exists between the recent performance means
Ha: The recent performance means differ

The test statistic is F = 27.92.

The rejection region requires α = .01 in the upper tail of the F-distribution with ν1 = a − 1
= 2 − 1 = 1 and ν2 = n − ab = 81 − 2(2) = 77. From Table XI, Appendix B, F.01 ≈ 7.08.
The rejection region is F > 7.08.

Since the observed value of the test statistic falls in the rejection region (F = 27.92 >
7.08), H0 is rejected. There is sufficient evidence to indicate that recent performance
affects budgetary decisions at α = .01.


8.90

Let factor A be second plastic and factor B be metal density. Some preliminary calculations
are:

CM = (Σy)²/n = (5.56)²/8 = 3.8642

SS(Total) = Σy² − CM = 9.1646 − 3.8642 = 5.3004

SSA = ΣAi²/br − CM = (.92² + 4.64²)/[2(2)] − 3.8642 = 5.594 − 3.8642 = 1.7298

SSB = ΣBj²/ar − CM = (.57² + 4.99²)/[2(2)] − 3.8642 = 6.30625 − 3.8642 = 2.44205

SSAB = ΣABij²/r − SSA − SSB − CM
     = (.06² + .86² + .51² + 4.13²)/2 − 1.7298 − 2.44205 − 3.8642
     = 9.0301 − 8.03605 = .99405

SSE = SS(Total) − SSA − SSB − SSAB = 5.3004 − 1.7298 − 2.44205 − .99405 = .1345

MSA = SSA/(a − 1) = 1.7298/(2 − 1) = 1.7298
MSB = SSB/(b − 1) = 2.44205/(2 − 1) = 2.44205
MSAB = SSAB/[(a − 1)(b − 1)] = .99405/[(1)(1)] = .99405
MSE = SSE/(n − ab) = .1345/[8 − 2(2)] = .033625

F(A) = MSA/MSE = 1.7298/.033625 = 51.44
F(B) = MSB/MSE = 2.44205/.033625 = 72.63
F(AB) = MSAB/MSE = .99405/.033625 = 29.56

Source   df        SS        MS       F
A         1   1.72980   1.72980   51.44
B         1   2.44205   2.44205   72.63
AB        1    .99405    .99405   29.56
Error     4    .13450   .033625
Total     7   5.30040

SST = SSA + SSB + SSAB = 1.7298 + 2.44205 + .99405 = 5.1659

MST = SST/(ab − 1) = 5.1659/[2(2) − 1] = 1.7220

F(T) = MST/MSE = 1.7220/.033625 = 51.21

To determine whether differences exist among the treatment means, we test:

H0: μ1 = μ2 = μ3 = μ4
Ha: At least two treatment means differ

The test statistic is F = 51.21.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = ab − 1 =
2(2) − 1 = 3 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 6.59. The
rejection region is F > 6.59.

Since the observed value of the test statistic falls in the rejection region (F = 51.21 > 6.59), H0
is rejected. There is sufficient evidence to indicate differences in mean radiation among the
four treatments at α = .05.
Since there are differences among the treatment means, we next test to see if the two factors
interact.

H0: Second plastic and metal density do not interact
Ha: Second plastic and metal density do interact

The test statistic is F = MSAB/MSE = 29.56.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = (a − 1)(b − 1) = 1
and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 7.71. The rejection region
is F > 7.71.

Since the observed value of the test statistic falls in the rejection region (F = 29.56 > 7.71), H0
is rejected. There is sufficient evidence to indicate second plastic and metal density interact at
α = .05.

Since interaction is present, no tests for main effects are necessary. Since we want to find the
preferred method to protect patients, we will compare all four treatment means. There are four
treatments, so c = p(p − 1)/2 = 4(4 − 1)/2 = 6. For α* = α/c = .05/6 = .0083 and α*/2 = .0083/2 =
.0042 ≈ .005 and df = n − ab = 4, t.005 = 4.604 from Table VI, Appendix B.


We now form confidence intervals for the differences between each pair of means using the
formula:

(x̄i − x̄j) ± t.005 s √(1/ni + 1/nj)   where s = √MSE = √.033625 = .1834

Pair          Confidence Interval
μ11 − μ12    (.03 − .43) ± 4.604(.1834)√(1/2 + 1/2) ⇒ −.40 ± .844 ⇒ (−1.244, .444)
μ11 − μ21    (.03 − .255) ± .844 ⇒ −.225 ± .844 ⇒ (−1.069, .619)
μ11 − μ22    (.03 − 2.065) ± .844 ⇒ −2.035 ± .844 ⇒ (−2.879, −1.191)
μ12 − μ21    (.43 − .255) ± .844 ⇒ .175 ± .844 ⇒ (−.669, 1.019)
μ12 − μ22    (.43 − 2.065) ± .844 ⇒ −1.635 ± .844 ⇒ (−2.479, −.791)
μ21 − μ22    (.255 − 2.065) ± .844 ⇒ −1.81 ± .844 ⇒ (−2.654, −.966)

The means that differ are μ11 and μ22, μ12 and μ22, and μ21 and μ22. No other means are
significantly different. Since we are looking for the treatment that gives the best protection
(allows the smallest amount of radiation), we would pick any treatment except treatment 22. Thus, use
second plastic present and heavy alloy, second plastic present and light alloy, or second plastic
not present and heavy alloy. Pick the one of these three that is the cheapest or the most
convenient.
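The Bonferroni margin of error used in the table above can be reproduced numerically. The
Python sketch below is an added illustration (it assumes SciPy is installed and is not part of
the original solution); it follows the same rounding of α*/2 to .005 as the text.

    # Sketch: Bonferroni-adjusted t value and margin for exercise 8.90 (assumes SciPy)
    from math import sqrt
    from scipy.stats import t

    mse, df_error, reps = 0.033625, 4, 2
    c = 4 * (4 - 1) // 2                  # 6 pairwise comparisons among the 4 treatments
    alpha_star = 0.05 / c                 # about .0083; the text halves and rounds this to .005
    t_crit = t.ppf(1 - 0.005, df_error)   # reproduces the Table VI value t.005 = 4.604
    margin = t_crit * sqrt(mse) * sqrt(1 / reps + 1 / reps)
    print(round(alpha_star, 4), round(t_crit, 3), round(margin, 3))   # 0.0083, 4.604, 0.844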
8.92

a.

There are a total of a × b = 3 × 3 = 9 treatments in this study.

b.

Using MINITAB, the ANOVA results are:


General Linear Model: Y versus Display, Price

Factor   Type   Levels  Values
Display  fixed       3  1 2 3
Price    fixed       3  1 2 3

Analysis of Variance for Y, using Adjusted SS for Tests

Source         DF   Seq SS   Adj SS   Adj MS        F      P
Display         2  1691393  1691393   845696  1709.37  0.000
Price           2  3089054  3089054  1544527  3121.89  0.000
Display*Price   4   510705   510705   127676   258.07  0.000
Error          18     8905     8905      495
Total          26  5300057

To get the SS for Treatments, we must add the SS for Display, the SS for Price, and the SS for
Interaction. Thus, SST = 1,691,393 + 3,089,054 + 510,705 = 5,291,152. The df = 2 + 2 +
4 = 8.

MST = SST/(ab − 1) = 5,291,152/[3(3) − 1] = 661,394

F = MST/MSE = 661,394/495 = 1336.15


To determine whether the treatment means differ, we test:

H0: μ1 = μ2 = ⋯ = μ9
Ha: At least two treatment means differ

The test statistic is F = MST/MSE = 1336.15.

The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 =
ab − 1 = 3(3) − 1 = 8 and ν2 = n − ab = 27 − 3(3) = 18. From Table VIII, Appendix B,
F.10 = 2.04. The rejection region is F > 2.04.

Since the observed value of the test statistic falls in the rejection region (F = 1336.15 >
2.04), H0 is rejected. There is sufficient evidence to indicate the treatment means differ at
α = .10.
c.

Since there are differences among the treatment means, we next test for the presence of
interaction.

H0: Factors A and B do not interact to affect the response means
Ha: Factors A and B do interact to affect the response means

The test statistic is F = MSAB/MSE = 258.07.

The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 =
(a − 1)(b − 1) = (3 − 1)(3 − 1) = 4 and ν2 = n − ab = 27 − 3(3) = 18. From Table VIII,
Appendix B, F.10 = 2.29. The rejection region is F > 2.29.

Since the observed value of the test statistic falls in the rejection region (F = 258.07 >
2.29), H0 is rejected. There is sufficient evidence to indicate the two factors interact at
α = .10.
d.

The main effect tests are not warranted since interaction is present in part c.

e.

The nine treatment means need to be compared.

f.

From the graph, if the like letters are connected, the lines are not parallel. This implies
interaction is present. This agrees with the results of part c.


8.94

a.

This is a completely randomized design with a complete four-factor factorial design.

b.

There are a total of 2 × 2 × 2 × 2 = 16 treatments.

c.

Using SAS, the output is:


Analysis of Variance Procedure

Dependent Variable: Y

                                  Sum of        Mean
Source                  DF       Squares      Square   F Value   Pr > F
Model                   15     546745.50    36449.70      5.11   0.0012
Error                   16     114062.00     7128.88
Corrected Total         31     660807.50

R-Square        C.V.      Root MSE      Y Mean
0.827390    41.46478        84.433      203.63

Source                  DF      Anova SS   Mean Square   F Value   Pr > F
SPEED                    1      56784.50      56784.50      7.97   0.0123
FEED                     1      21218.00      21218.00      2.98   0.1037
SPEED*FEED               1      55444.50      55444.50      7.78   0.0131
COLLET                   1     165025.13     165025.13     23.15   0.0002
SPEED*COLLET             1      44253.13      44253.13      6.21   0.0241
FEED*COLLET              1     142311.13     142311.13     19.96   0.0004
SPEED*FEED*COLLET        1      54946.13      54946.13      7.71   0.0135
WEAR                     1        378.13        378.13      0.05   0.8208
SPEED*WEAR               1       1540.13       1540.13      0.22   0.6483
FEED*WEAR                1        946.13        946.13      0.13   0.7204
SPEED*FEED*WEAR          1        528.13        528.13      0.07   0.7890
COLLET*WEAR              1       1682.00       1682.00      0.24   0.6337
SPEED*COLLET*WEAR        1        512.00        512.00      0.07   0.7921
FEED*COLLET*WEAR         1         72.00         72.00      0.01   0.9212
SPEE*FEED*COLLE*WEAR     1       1104.50       1104.50      0.15   0.6991
d.

To determine if the interaction terms are significant, we must add together the sums of
squares for all interaction terms as well as their degrees of freedom.

SS(Interaction) = 55,444.50 + 44,253.13 + 142,311.13 + 54,946.13 + 1,540.13 + 946.13
                + 528.13 + 1,682.00 + 512.00 + 72.00 + 1,104.50 = 303,339.78

df(Interaction) = 11

MS(Interaction) = SS(Interaction)/df(Interaction) = 303,339.78/11 = 27,576.34364

F(Interaction) = MS(Interaction)/MSE = 27,576.34364/7,128.88 = 3.87


To determine if interaction effects are present, we test:

H0: No interaction effects exist
Ha: Interaction effects exist

The test statistic is F = 3.87.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = 11
and ν2 = 16. From Table IX, Appendix B, F.05 ≈ 2.49. The rejection region is F > 2.49.

Since the observed value of the test statistic falls in the rejection region (F = 3.87 > 2.49),
H0 is rejected. There is sufficient evidence to indicate that interaction effects exist at
α = .05.

Since the sums of squares for a balanced factorial design are independent of each other,
we can look at the SAS output to determine which of the interaction effects are
significant. The three-way interaction between speed, feed, and collet is significant
(p = .0135). There are three two-way interactions with p-values less than .05. However,
all of these two-way interaction terms are imbedded in the significant three-way
interaction term.
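The pooled interaction F statistic and its exact critical value can also be computed directly.
The Python sketch below is an added illustration (it assumes SciPy and is not part of the
original solution); the exact critical value is close to the Table IX approximation of 2.49.

    # Sketch: pooled interaction test of exercise 8.94d (assumes SciPy)
    from scipy.stats import f

    F_interaction = 27576.34364 / 7128.88      # MS(Interaction)/MSE
    crit = f.ppf(0.95, dfn=11, dfd=16)         # exact F.05 with 11 and 16 df
    print(round(F_interaction, 2), round(crit, 2))  # 3.87 and approximately 2.46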
e.

Yes. Since the significant interaction terms do not include wear, it is necessary to
perform the main effect test for wear. All other main effects are contained in a significant
interaction term.

To determine if the mean finish measurements differ for the different levels of wear, we
test:

H0: The mean finish measurements for the two levels of wear are the same
Ha: The mean finish measurements for the two levels of wear are different

The test statistic is F = 0.05.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = 1 and
ν2 = 16. From Table IX, Appendix B, F.05 = 4.49. The rejection region is F > 4.49.

Since the observed value of the test statistic does not fall in the rejection region (F = .05
≯ 4.49), H0 is not rejected. There is insufficient evidence to indicate that the mean finish
measurements differ for the different levels of wear at α = .05.
f.

We must assume that:

i.   The populations sampled from are normal.
ii.  The population variances are the same.
iii. The samples are random and independent.


Categorical Data Analysis

Chapter 9

9.2

The characteristics of the multinomial experiment are:

1.  The experiment consists of n identical trials.
2.  There are k possible outcomes to each trial.
3.  The probabilities of the k outcomes, denoted p1, p2, ..., pk, remain the same from trial to
    trial, where p1 + p2 + ⋯ + pk = 1.
4.  The trials are independent.
5.  The random variables of interest are the counts n1, n2, ..., nk in each of the k cells.

The characteristics of the binomial experiment are the same as those for the multinomial with k = 2.
9.4

The hypotheses of interest are:

H0: p1 = .25, p2 = .25, p3 = .50
Ha: At least one of the probabilities differs from its hypothesized value

E(n1) = np1,0 = 320(.25) = 80
E(n2) = np2,0 = 320(.25) = 80
E(n3) = np3,0 = 320(.50) = 160

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (78 − 80)²/80 + (60 − 80)²/80 + (182 − 160)²/160 = 8.075

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = k − 1
= 3 − 1 = 2. From Table VII, Appendix B, χ².05 = 5.99147. The rejection region is
χ² > 5.99147.

Since the observed value of the test statistic falls in the rejection region (χ² = 8.075 > 5.99147),
H0 is rejected. There is sufficient evidence to indicate that at least one of the probabilities
differs from its hypothesized value at α = .05.
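The same goodness-of-fit statistic can be reproduced in a few lines. The Python sketch below
is an added illustration (it assumes SciPy and is not part of the original solution); the
observed and expected counts are the ones listed above.

    # Sketch: chi-square goodness-of-fit test for exercise 9.4 (assumes SciPy)
    from scipy.stats import chisquare

    observed = [78, 60, 182]
    expected = [80, 80, 160]          # 320 x (.25, .25, .50)
    stat, p = chisquare(f_obs=observed, f_exp=expected)
    print(round(stat, 3), round(p, 3))   # 8.075 and a p-value near .018, so H0 is rejected at alpha = .05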
9.6

a.

The qualitative variable of interest is the location of professional sports stadiums and
ballparks. There are 3 levels or categories of this variable: downtown, central city, and
suburban.

b.

Let p1 = proportion of major sports facilities located in downtown areas, p2 = proportion
of major sports facilities located in central city areas, and p3 = proportion of major sports
facilities located in suburban areas in 1997.


To determine if the proportions of major sports facilities in downtown, central city, and
suburban areas in 1997 are different than in 1985, we test:

H0: p1 = .40, p2 = .30, p3 = .30
Ha: At least one of the proportions differs from its hypothesized value

c.

E(n1) = np1,0 = 113(.40) = 45.2; E(n2) = np2,0 = 113(.30) = 33.9;
E(n3) = np3,0 = 113(.30) = 33.9

d.

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (58 − 45.2)²/45.2 + (26 − 33.9)²/33.9 + (29 − 33.9)²/33.9 = 6.174

e.

The degrees of freedom for the test statistic is k − 1 = 3 − 1 = 2. The p-value is
p = P(χ² ≥ 6.174).

Using Table VII, Appendix B, with df = 2, .025 > P(χ² ≥ 6.174) > .01. Thus,
.01 < p < .025.

Since the p-value is smaller than α = .05, H0 is rejected. There is sufficient evidence to
indicate the proportions of major sports facilities in downtown, central city, and suburban
areas in 1997 are different than in 1985.
9.8

a.

The categorical variable is the rating of the student exposure to social and
environmental issues. It has 5 levels: 1-star, 2-stars, 3-stars, 4-stars, and 5-stars.

b.

If there were no difference in the category proportions, then each proportion should be pi
= 1/5 = .20. There were a total of n = 30 business schools sampled. The expected
number in each category would be:

E(n1) = E(n2) = E(n3) = E(n4) = E(n5) = n(pi,0) = 30(.20) = 6

c.

To determine if there are differences in the star rating category proportions of all MBA
programs, we test:

H0: p1 = p2 = p3 = p4 = p5 = .20
Ha: At least one pi differs from its hypothesized value

d.

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (2 − 6)²/6 + (9 − 6)²/6 + (14 − 6)²/6 + (5 − 6)²/6 + (0 − 6)²/6 = 21

e.

The rejection region requires α = .05 in the upper tail of the χ² distribution with
df = k − 1 = 5 − 1 = 4. From Table VII, Appendix B, χ².05 = 9.48773. The rejection
region is χ² > 9.48773.


f.

Since the observed value of the test statistic falls in the rejection region
(χ² = 21 > 9.48773), H0 is rejected. There is sufficient evidence to indicate differences in
the star rating category proportions of all MBA programs at α = .05.

g.

Some preliminary calculations are:

p̂3 = x3/n = 14/30 = .467

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV,
Appendix B, z.025 = 1.96. The 95% confidence interval is:

p̂3 ± z.025 √(p̂3q̂3/n) ⇒ .467 ± 1.96 √(.467(.533)/30) ⇒ .467 ± .179 ⇒ (.288, .646)

We are 95% confident that the proportion of all MBA programs that are ranked in the
3-star category is between .288 and .646.
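The large-sample interval above is easy to verify numerically. The Python sketch below is an
added illustration (it assumes SciPy for the normal quantile, which could simply be replaced by
1.96); small differences from the hand calculation are due to rounding of p̂ and q̂.

    # Sketch: 95% large-sample CI for the 3-star proportion in exercise 9.8g
    from math import sqrt
    from scipy.stats import norm

    x, n = 14, 30
    p_hat = x / n                                  # about .467
    z = norm.ppf(0.975)                            # 1.96
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    print(round(p_hat - half_width, 3), round(p_hat + half_width, 3))  # roughly (.288, .645)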
9.10

a.

Some preliminary calculations are:

E(n1) = np1,0 = 1000(.50) = 500
E(n2) = np2,0 = 1000(.22) = 220
E(n3) = np3,0 = 1000(.11) = 110
E(n4) = np4,0 = 1000(.17) = 170

To determine if the percentages disagree with the percentages reported by
Nielson/NetRatings, we test:

H0: p1 = .50, p2 = .22, p3 = .11, and p4 = .17
Ha: At least one pi differs from its hypothesized value

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni)
   = (487 − 500)²/500 + (245 − 220)²/220 + (121 − 110)²/110 + (147 − 170)²/170 = 7.391

The rejection region requires α = .05 in the upper tail of the χ² distribution with
df = k − 1 = 4 − 1 = 3. From Table VII, Appendix B, χ².05 = 7.81473. The rejection
region is χ² > 7.81473.

Since the observed value of the test statistic does not fall in the rejection region
(χ² = 7.391 ≯ 7.81473), H0 is not rejected. There is insufficient evidence to indicate
the percentages disagree with the percentages reported by Nielson/NetRatings at
α = .05.


b.

Some preliminary calculations are:

p̂1 = x1/n = 487/1000 = .487

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV,
Appendix B, z.025 = 1.96. The 95% confidence interval is:

p̂1 ± z.025 √(p̂1q̂1/n) ⇒ .487 ± 1.96 √(.487(.513)/1000) ⇒ .487 ± .031 ⇒ (.456, .518)

We are 95% confident that the percentage of all Internet searches that use the
Google Search Engine is between 45.6% and 51.8%.
9.12

Some preliminary calculations are:

E(n1) = np1,0 = 2,023(.45) = 910.35
E(n2) = np2,0 = 2,023(.35) = 708.05
E(n3) = np3,0 = 2,023(.15) = 303.45
E(n4) = np4,0 = 2,023(.05) = 101.15

To determine if the percentages of all adults falling into the four response categories
changed after the Enron scandal, we test:

H0: p1 = .45, p2 = .35, p3 = .15, and p4 = .05
Ha: At least one pi differs from its hypothesized value

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni)
   = (1,173 − 910.35)²/910.35 + (587 − 708.05)²/708.05 + (182 − 303.45)²/303.45
     + (81 − 101.15)²/101.15
   = 149.096

The rejection region requires α = .01 in the upper tail of the χ² distribution with
df = k − 1 = 4 − 1 = 3. From Table VII, Appendix B, χ².01 = 11.3449. The rejection region is
χ² > 11.3449.

Since the observed value of the test statistic falls in the rejection region
(χ² = 149.096 > 11.3449), H0 is rejected. There is sufficient evidence to indicate the
percentages of all adults falling into the four response categories changed after the Enron
scandal at α = .01.
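The Table VII critical value and the corresponding p-value can also be obtained numerically.
The Python sketch below is an added illustration (it assumes SciPy and is not part of the
original solution).

    # Sketch: chi-square critical value and p-value for exercise 9.12 (assumes SciPy)
    from scipy.stats import chi2

    crit_01 = chi2.ppf(0.99, df=3)    # 11.3449, the Table VII value used above
    p_value = chi2.sf(149.096, df=3)  # upper-tail area of the observed statistic
    print(round(crit_01, 4), p_value)  # 11.3449 and an extremely small p-value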


9.14

a.

Some preliminary calculations, E(ni) = npi,0, are:

E(n1) = 700(.09) = 63      E(n2) = 700(.02) = 14
E(n3) = 700(.02) = 14      E(n4) = 700(.04) = 28
E(n5) = 700(.12) = 84      E(n6) = 700(.02) = 14
E(n7) = 700(.03) = 21      E(n8) = 700(.02) = 14
E(n9) = 700(.09) = 63      E(n10) = 700(.01) = 7
E(n11) = 700(.01) = 7      E(n12) = 700(.04) = 28
E(n13) = 700(.02) = 14     E(n14) = 700(.06) = 42
E(n15) = 700(.08) = 56     E(n16) = 700(.02) = 14
E(n17) = 700(.01) = 7      E(n18) = 700(.06) = 42
E(n19) = 700(.04) = 28     E(n20) = 700(.06) = 42
E(n21) = 700(.04) = 28     E(n22) = 700(.02) = 14
E(n23) = 700(.02) = 14     E(n24) = 700(.01) = 7
E(n25) = 700(.02) = 14     E(n26) = 700(.01) = 7
E(n27) = 700(.02) = 14

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (39 − 63)²/63 + (18 − 14)²/14 + (30 − 14)²/14 + ⋯ + (34 − 14)²/14
   = 360.48

To determine if ScrabbleExpress presents the player with unfair word selection
opportunities that are different from the Scrabble board game, we test:

H0: The letter proportions in ScrabbleExpress are the same as in the Scrabble board game
Ha: The letter proportions in ScrabbleExpress differ from those in the Scrabble board game

The test statistic is χ² = 360.48.

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
k − 1 = 27 − 1 = 26. From Table VII, Appendix B, χ².05 = 38.8852. The rejection region
is χ² > 38.8852.

Since the observed value of the test statistic falls in the rejection region (χ² = 360.48 >
38.8852), H0 is rejected. There is sufficient evidence to indicate that ScrabbleExpress
presents the player with unfair word selection opportunities that are different from the
Scrabble board game at α = .05.
b.

The relative frequency of vowels for the board game is P(A) + P(E) + P(I) + P(O) +
P(U) = .09 + .12 + .09 + .08 + .04 = .42.

p̂v = (39 + 31 + 25 + 20 + 21)/700 = 136/700 = .194

For confidence level .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. The 95% confidence interval is:

p̂v ± z.025 √(p̂v(1 − p̂v)/n) ⇒ .194 ± 1.96 √(.194(.806)/700) ⇒ .194 ± .029 ⇒ (.165, .223)

We are 95% confident that the true proportion of vowels in the ScrabbleExpress game is
between .165 and .223. The true proportion from the board game is .42, which is much
greater than the values in the interval.
9.16

a.

df = (r − 1)(c − 1) = (5 − 1)(5 − 1) = 16. From Table VII, Appendix B, χ².05 = 26.2962.
The rejection region is χ² > 26.2962.

b.

df = (r − 1)(c − 1) = (3 − 1)(6 − 1) = 10. From Table VII, Appendix B, χ².10 = 15.9871.
The rejection region is χ² > 15.9871.

c.

df = (r − 1)(c − 1) = (2 − 1)(3 − 1) = 2. From Table VII, Appendix B, χ².01 = 9.21034. The
rejection region is χ² > 9.21034.

9.18

a.

To convert the frequencies to percentages, divide the numbers in each column by the
column total and multiply by 100. Also, divide the row totals by the overall total and
multiply by 100. The column totals are 25, 64, and 78, while the row totals are 96 and
71. The overall sample size is 167. The table of percentages is:

                            Column
         1                2                 3                 Totals
Row 1    9/25 = 36%       34/64 = 53.1%     53/78 = 67.9%     96/167 = 57.5%
Row 2    16/25 = 64%      30/64 = 46.9%     25/78 = 32.1%     71/167 = 42.5%

b.

Using MINITAB, the graph is:

[Bar chart of the Row 1 percentage for each column (36%, 53.1%, 67.9%), with the overall
Row 1 percentage of 57.5% shown for reference.]


c.

If the rows and columns are independent, the row percentages in each column would be
close to the row total percentages. This pattern is not evident in the plot, implying the
rows and columns are not independent.

9.20

a-b. To convert the frequencies to percentages, divide the numbers in each column by the
     column total and multiply by 100. Also, divide the row totals by the overall total and
     multiply by 100.

              B1                 B2                 B3                 Totals
     A1       40/134 = 29.9%     72/163 = 44.2%     42/142 = 29.6%     154/439 = 35.1%
     A2       63/134 = 47.0%     53/163 = 32.5%     70/142 = 49.3%     186/439 = 42.4%
     A3       31/134 = 23.1%     38/163 = 23.3%     30/142 = 21.1%      99/439 = 22.6%

c.

Using MINITAB, the graph is:

[Bar chart of the A1 row percentage for each level of B (29.9%, 44.2%, 29.6%), with the
overall A1 percentage of 35.1% shown for reference.]

The graph supports the conclusion that the rows and columns are not independent. If they
were, then the height of all the bars would be essentially the same.
9.22

a.

The contingency table would be:

                      Itemize Deductions
Tax-motivation        Yes       No        Total
Yes                   691       381       1,072
No                    794       899       1,693
Total                 1,485     1,280     2,765

b.

E11 = R1C1/n = 1,072(1,485)/2,765 = 575.7
E12 = R1C2/n = 1,072(1,280)/2,765 = 496.3
E21 = R2C1/n = 1,693(1,485)/2,765 = 909.3
E22 = R2C2/n = 1,693(1,280)/2,765 = 783.7

c.

The test statistic is:

χ² = Σ [nij − Eij]²/Eij
   = (691 − 575.7)²/575.7 + (381 − 496.3)²/496.3 + (794 − 909.3)²/909.3 + (899 − 783.7)²/783.7
   = 81.46

d.

To determine if tax motivation and itemizing deductions are related for charitable givers, we
test:

H0: Tax motivation and itemizing deductions are independent
Ha: Tax motivation and itemizing deductions are dependent

The test statistic is χ² = 81.46.

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (2 − 1)(2 − 1) = 1. From Table VII, Appendix B, χ².05 = 3.84146. The
rejection region is χ² > 3.84146.

Since the observed value of the test statistic falls in the rejection region (χ² = 81.46 >
3.84146), H0 is rejected. There is sufficient evidence to indicate that tax motivation and
itemizing deductions are related for charitable givers at α = .05.
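The entire test of independence can be reproduced with a contingency-table routine. The Python
sketch below is an added illustration (it assumes SciPy and NumPy are available and is not part
of the original solution); correction=False matches the hand calculation, whereas the default
Yates continuity correction for 2×2 tables would give a slightly smaller statistic.

    # Sketch: chi-square test of independence for exercise 9.22 (assumes SciPy/NumPy)
    import numpy as np
    from scipy.stats import chi2_contingency

    table = np.array([[691, 381],
                      [794, 899]])
    stat, p, df, expected = chi2_contingency(table, correction=False)
    print(round(stat, 2), df)      # approximately 81.46 with 1 df; p is essentially 0
    print(np.round(expected, 1))   # matches the Eij values computed above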
e.

To compute the bar graph, we first convert frequencies to percentages by dividing the
numbers in each column by the column total and multiplying by 100%. Also, divide the
row totals by the overall total and multiply by 100%.

                      Itemize Deductions
Tax-motivation        Yes                     No                      Total
Yes                   691/1,485 = 46.5%       381/1,280 = 29.8%       1,072/2,765 = 38.8%
No                    794/1,485 = 53.5%       899/1,280 = 70.2%       1,693/2,765 = 61.2%
Total                 1,485                   1,280                   2,765


Using MINITAB, the bar graph is:

[Bar chart of the percentage of tax-motivated givers among those who itemize (46.5%) and
those who do not itemize (29.8%), with the overall percentage of 38.8% shown for reference.]

9.24

a.

Some preliminary calculations are:

p̂C1 = xC1/n1 = 175/6,222 = .028
p̂C2 = xC2/n2 = 236/4,692 = .050
p̂C3 = xC3/n3 = 319/7,140 = .045
p̂C4 = xC4/n4 = 231/6,120 = .038
p̂C5 = xC5/n5 = 480/10,353 = .046
p̂C6 = xC6/n6 = 187/4,794 = .039

The proportions range from .028 to .050. Since .050 is about twice as big as .028, there
may be evidence to conclude some of the proportions are different.
b.

Some preliminary calculations are:

E11 = R1C1/n = 6,222(37,693)/39,321 = 5,964.39     E12 = R1C2/n = 6,222(1,628)/39,321 = 257.61
E21 = R2C1/n = 4,692(37,693)/39,321 = 4,497.74     E22 = R2C2/n = 4,692(1,628)/39,321 = 194.26
E31 = R3C1/n = 7,140(37,693)/39,321 = 6,844.38     E32 = R3C2/n = 7,140(1,628)/39,321 = 295.62
E41 = R4C1/n = 6,120(37,693)/39,321 = 5,866.61     E42 = R4C2/n = 6,120(1,628)/39,321 = 253.39
E51 = R5C1/n = 10,353(37,693)/39,321 = 9,924.36    E52 = R5C2/n = 10,353(1,628)/39,321 = 428.64
E61 = R6C1/n = 4,794(37,693)/39,321 = 4,595.51     E62 = R6C2/n = 4,794(1,628)/39,321 = 198.49

To determine if the proportions of censored measurements differ for the six tractor
lines, we test:

H0: Tractor line and censored measurements are independent
Ha: Tractor line and censored measurements are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij = (6047 − 5964.39)²/5964.39 + (175 − 257.61)²/257.61
   + (4456 − 4497.74)²/4497.74 + ⋯ + (187 − 198.49)²/198.49 = 48.0978

The rejection region requires α = .01 in the upper tail of the χ² distribution with
df = (r − 1)(c − 1) = (6 − 1)(2 − 1) = 5. From Table VII, Appendix B, χ².01 = 15.0863.
The rejection region is χ² > 15.0863.

Since the observed value of the test statistic falls in the rejection region
(χ² = 48.0978 > 15.0863), H0 is rejected. There is sufficient evidence to indicate that
the proportions of censored measurements differ for the six tractor lines at α = .01.

c.

Even though there are differences in the proportions of censored data among the six tractor
lines, these proportions range from .028 to .050. In practice, there is very little difference
between .028 and .050.

9.26

Some preliminary calculations are:

E11 = R1C1/n = 95(118)/262 = 42.8     E12 = R1C2/n = 95(144)/262 = 52.2
E21 = R2C1/n = 69(118)/262 = 31.1     E22 = R2C2/n = 69(144)/262 = 37.9
E31 = R3C1/n = 42(118)/262 = 18.9     E32 = R3C2/n = 42(144)/262 = 23.1
E41 = R4C1/n = 56(118)/262 = 25.2     E42 = R4C2/n = 56(144)/262 = 30.8


To determine whether a pig farmer's education level has an impact on the size of the pig farm,
we test:

H0: Pig farmer's education level and size of pig farm are independent
Ha: Pig farmer's education level and size of pig farm are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (42 − 42.8)²/42.8 + (53 − 52.2)²/52.2 + (27 − 31.1)²/31.1 + (42 − 37.9)²/37.9
     + (22 − 18.9)²/18.9 + (20 − 23.1)²/23.1 + (27 − 25.2)²/25.2 + (29 − 30.8)²/30.8
   = 2.17

The rejection region requires α = .05 in the upper tail of the χ² distribution with df
= (r − 1)(c − 1) = (4 − 1)(2 − 1) = 3. From Table VII, Appendix B, χ².05 = 7.81473. The
rejection region is χ² > 7.81473.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 2.17 ≯
7.81473), H0 is not rejected. There is insufficient evidence to indicate that a pig farmer's
education level has an impact on the size of the pig farm at α = .05.
To compute the bar graph, we first convert frequencies to percentages by dividing the numbers
in each row by the row total and multiplying by 100%. Also, divide the column totals by the
overall total and multiply by 100%.

                              Education Level
Farm Size           No college             College               Total
<1,000 pigs         42/95 = 44.2%          53/95 = 55.8%          95
1,000-2,000 pigs    27/69 = 39.1%          42/69 = 60.9%          69
2,000-5,000 pigs    22/42 = 52.4%          20/42 = 47.6%          42
>5,000 pigs         27/56 = 48.2%          29/56 = 51.8%          56
Total               118/262 = 45.0%        144/262 = 55.0%        262


Using MINITAB, the bar graph is:

[Bar chart of the percentage of no-college owners for each farm size category (44.2%, 39.1%,
52.4%, 48.2%), with the overall percentage of 45.0% shown for reference.]

9.28

a.

Some preliminary calculations are:

E11 = R1C1/n = 53(35)/70 = 26.5     E12 = R1C2/n = 53(35)/70 = 26.5
E21 = R2C1/n = 17(35)/70 = 8.5      E22 = R2C2/n = 17(35)/70 = 8.5

To determine if the severity of the ethical issue influenced whether the issue was
identified or not by the auditors, we test:

H0: Severity of ethical issue and identification are independent
Ha: Severity of ethical issue and identification are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (27 − 26.5)²/26.5 + (26 − 26.5)²/26.5 + (8 − 8.5)²/8.5 + (9 − 8.5)²/8.5 = .078

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (2 − 1)(2 − 1) = 1. From Table VII, Appendix B, χ².05 = 3.84146. The
rejection region is χ² > 3.84146.

Since the observed value of the test statistic does not fall in the rejection region (χ² = .078
≯ 3.84146), H0 is not rejected. There is insufficient evidence to indicate that the severity
of the ethical issue influenced whether the issue was identified or not by the auditors at
α = .05.

b.

No. If there were 0 in the bottom cell of the column, then the expected count for that cell
will be less than 5. One of the assumptions necessary for the test statistic to have a χ²
distribution will not hold.


c.

Suppose we change the numbers in the table to be as follows:

                                  Severity of Ethical Issue
                                  Moderate      Severe
Ethical Issue Identified          32            21
Ethical Issue Not Identified      3             14

Since the row and column totals are the same, the expected cell counts are the same as
above.

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (32 − 26.5)²/26.5 + (21 − 26.5)²/26.5 + (3 − 8.5)²/8.5 + (14 − 8.5)²/8.5 = 9.401

Now the test statistic would fall in the rejection region.


9.30

a.

The contingency table is:

                  Flight Response
Altitude          Low       High      Totals
< 300             85        105       190
300-600           77        121       198
600+              17        59        76
Totals            179       285       464

b.

Some preliminary calculations are:

E11 = R1C1/n = 190(179)/464 = 73.297     E12 = R1C2/n = 190(285)/464 = 116.703
E21 = R2C1/n = 198(179)/464 = 76.384     E22 = R2C2/n = 198(285)/464 = 121.616
E31 = R3C1/n = 76(179)/464 = 29.319      E32 = R3C2/n = 76(285)/464 = 46.681

To determine if flight response of the geese depends on the altitude of the helicopter,
we test:

H0: Flight response and Altitude of helicopter are independent
Ha: Flight response and Altitude of helicopter are dependent


The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (85 − 73.297)²/73.297 + (105 − 116.703)²/116.703 + (77 − 76.384)²/76.384
     + (121 − 121.616)²/121.616 + (17 − 29.319)²/29.319 + (59 − 46.681)²/46.681
   = 11.477

The rejection region requires α = .01 in the upper tail of the χ² distribution with
df = (r − 1)(c − 1) = (3 − 1)(2 − 1) = 2. From Table VII, Appendix B, χ².01 = 9.21034.
The rejection region is χ² > 9.21034.

Since the observed value of the test statistic falls in the rejection region
(χ² = 11.477 > 9.21034), H0 is rejected. There is sufficient evidence to indicate that
the flight response of the geese depends on the altitude of the helicopter at α = .01.
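The expected counts Eij = (row total)(column total)/n and the chi-square statistic can be built
directly from the observed table. The Python sketch below is an added illustration (it assumes
NumPy is available and is not part of the original solution).

    # Sketch: expected counts and chi-square statistic for the altitude table in 9.30b
    import numpy as np

    observed = np.array([[85, 105],
                         [77, 121],
                         [17, 59]])
    row_totals = observed.sum(axis=1)
    col_totals = observed.sum(axis=0)
    expected = np.outer(row_totals, col_totals) / observed.sum()
    chi_sq = ((observed - expected) ** 2 / expected).sum()
    print(np.round(expected, 3))   # 73.297, 116.703, 76.384, 121.616, 29.319, 46.681
    print(round(chi_sq, 3))        # approximately 11.477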
c.

The contingency table is:

                      Flight Response
Lateral Distance      Low       High      Totals
< 1000                37        243       280
1000-2000             68        37        105
2000-3000             44        4         48
3000+                 30        1         31
Totals                179       285       464

d.

Some preliminary calculations are:

E11 = R1C1/n = 280(179)/464 = 108.017     E12 = R1C2/n = 280(285)/464 = 171.983
E21 = R2C1/n = 105(179)/464 = 40.506      E22 = R2C2/n = 105(285)/464 = 64.494
E31 = R3C1/n = 48(179)/464 = 18.517       E32 = R3C2/n = 48(285)/464 = 29.483
E41 = R4C1/n = 31(179)/464 = 11.959       E42 = R4C2/n = 31(285)/464 = 19.041


To determine if flight response of the geese depends on the lateral distance of the
helicopter, we test:

H0: Flight response and Lateral distance of the helicopter are independent
Ha: Flight response and Lateral distance of the helicopter are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (37 − 108.017)²/108.017 + (243 − 171.983)²/171.983 + (68 − 40.506)²/40.506
     + (37 − 64.494)²/64.494 + (44 − 18.517)²/18.517 + (4 − 29.483)²/29.483
     + (30 − 11.959)²/11.959 + (1 − 19.041)²/19.041
   = 207.814

The rejection region requires α = .01 in the upper tail of the χ² distribution with
df = (r − 1)(c − 1) = (4 − 1)(2 − 1) = 3. From Table VII, Appendix B, χ².01 = 11.3449.
The rejection region is χ² > 11.3449.

Since the observed value of the test statistic falls in the rejection region
(χ² = 207.814 > 11.3449), H0 is rejected. There is sufficient evidence to indicate that
the flight response of the geese depends on the lateral distance of the helicopter at α = .01.
e.

Using SAS, the contingency table for altitude by response with the column percents is:
Table of ALTGRP by RESPONSE
ALTGRP

RESPONSE

Frequency|
Percent |
Row Pct |
Col Pct |LOW
|HIGH
| Total
---------+--------+--------+
<300
|
85 |
105 |
190
| 18.32 | 22.63 | 40.95
| 44.74 | 55.26 |
| 47.49 | 36.84 |
---------+--------+--------+
300-600 |
77 |
121 |
198
| 16.59 | 26.08 | 42.67
| 38.89 | 61.11 |
| 43.02 | 42.46 |
---------+--------+--------+
600+
|
17 |
59 |
76
|
3.66 | 12.72 | 16.38
| 22.37 | 77.63 |
|
9.50 | 20.70 |
---------+--------+--------+
Total
179
285
464
38.58
61.42
100.00


Statistics for Table of ALTGRP by RESPONSE

Statistic                      DF       Value      Prob
-------------------------------------------------------
Chi-Square                      2     11.4770    0.0032
Likelihood Ratio Chi-Square     2     12.1040    0.0024
Mantel-Haenszel Chi-Square      1     10.2104    0.0014
Phi Coefficient                        0.1573
Contingency Coefficient                0.1554
Cramer's V                             0.1573

Sample Size = 464

From the row percents, it appears that the lower the plane, the lower the response.
For altitude <300 m, 55.26% of the geese had a high response. For altitude 300-600 m,
61.11% of the geese had a high response. For altitude 600+ m, 77.63% of the
geese had a high response. Thus, instead of setting a minimum altitude for the
planes, we need to set a maximum altitude. For this data, the lowest response is at
an altitude of < 300 meters.
Using SAS, the contingency table for lateral distance by response with the column
percents is:
The FREQ Procedure
Table of LATGRP by RESPONSE
LATGRP

RESPONSE

Frequency |
Percent
|
Row Pct
|
Col Pct
|LOW
|HIGH
| Total
----------+--------+--------+
<1000
|
37 |
242 |
279
|
7.99 | 52.27 | 60.26
| 13.26 | 86.74 |
| 20.67 | 85.21 |
----------+--------+--------+
1000-2000 |
68 |
37 |
105
| 14.69 |
7.99 | 22.68
| 64.76 | 35.24 |
| 37.99 | 13.03 |
----------+--------+--------+
2000-3000 |
44 |
4 |
48
|
9.50 |
0.86 | 10.37
| 91.67 |
8.33 |
| 24.58 |
1.41 |
----------+--------+--------+
3000+
|
30 |
1 |
31
|
6.48 |
0.22 |
6.70
| 96.77 |
3.23 |
| 16.76 |
0.35 |
----------+--------+--------+
Total
179
284
463
38.66
61.34
100.00
Frequency Missing = 1
Statistics for Table of LATGRP by RESPONSE

Statistic                      DF        Value      Prob
--------------------------------------------------------
Chi-Square                      3     207.0800    <.0001
Likelihood Ratio Chi-Square     3     226.8291    <.0001
Mantel-Haenszel Chi-Square      1     189.2843    <.0001
Phi Coefficient                         0.6688
Contingency Coefficient                 0.5559
Cramer's V                              0.6688

Effective Sample Size = 463
Frequency Missing = 1


From the row percents, it appears that the greater the lateral distance, the lower the
response. For a lateral distance of 3000+m only 3.23% of the geese had a high
response. Thus, the further away the plane is laterally, the lower the response. For
this data, the lowest response is when the plane is further than 3000 meters.
Thus the recommendation would be a maximum height of 300 m and a minimum
lateral distance of 3000 m.
9.32

a.

Some preliminary calculations are:

E11 = 50(50)/250 = 10      E12 = 50(90)/250 = 18      E13 = 50(110)/250 = 22
E21 = 100(50)/250 = 20     E22 = 100(90)/250 = 36     E23 = 100(110)/250 = 44
E31 = 100(50)/250 = 20     E32 = 100(90)/250 = 36     E33 = 100(110)/250 = 44

To determine if the rows and columns are dependent, we test:

H0: Rows and columns are independent
Ha: Rows and columns are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij = (20 − 10)²/10 + ⋯ + (30 − 44)²/44 = 54.14

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (3 − 1)(3 − 1) = 4. From Table VII, Appendix B, χ².05 = 9.48773. The
rejection region is χ² > 9.48773.

Since the observed value of the test statistic falls in the rejection region (χ² = 54.14 >
9.48773), H0 is rejected. There is sufficient evidence to indicate a dependence between
rows and columns at α = .05.


b.

No, the analysis remains identical.

c.

Yes, the assumptions on the sampling differ.


d.

The percentages are in the table below.

                            Column
         1                2                 3                  Totals
Row 1    20/50 = 40%      20/90 = 22.2%     10/110 = 9.1%      50/250 = 20%
Row 2    10/50 = 20%      20/90 = 22.2%     70/110 = 63.6%     100/250 = 40%
Row 3    20/50 = 40%      50/90 = 55.6%     30/110 = 27.3%     100/250 = 40%

e.

Using MINITAB, the bar graph is:

[Bar chart of the Row 1 percentage for each column (40%, 22.2%, 9.1%), with the overall
Row 1 percentage of 20% shown for reference.]

The graph supports the decision in part a. In part a, we rejected the null hypothesis and
concluded that the rows and columns were dependent. If they were independent, then we
would expect the three bars to be the same height. In this graph, they are not the same
height.
9.34

a.

If Bon Appetit readers do not have a preference for their least favorite vegetable, then the
values of p1, p2, p3, and p4 should all be the same. Since there are four categories, then p1
= p2 = p3 = p4 = .25.

b.

To determine if the Bon Appetit readers have a preference for at least one of the
vegetables as least favorite, we test:
H0: p1 = p2 = p3 = p4 = .25
Ha: At least one pi ≠ .25


c.

Some preliminary calculations:

n = Σni = 46 + 76 + 44 + 34 = 200

E(ni) = npi,0 = 200(.25) = 50, i = 1, 2, 3, or 4

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (46 − 50)²/50 + (76 − 50)²/50 + (44 − 50)²/50 + (34 − 50)²/50 = 19.68

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = k − 1
= 4 − 1 = 3. From Table VII, Appendix B, χ².05 = 7.81473. The rejection region is
χ² > 7.81473.

Since the observed value of the test statistic falls in the rejection region
(χ² = 19.68 > 7.81473), H0 is rejected. There is sufficient evidence to indicate the Bon
Appetit readers have a preference for at least one of the vegetables as least favorite at
α = .05.
d.

We must assume that:

1. The sample is random.
2. The sample size is sufficiently large (every cell has an expected count of at least 5).

9.36

a.

Some preliminary calculations are:

E11 = R1C1/n = 242(473)/549 = 208.499     E12 = R1C2/n = 242(76)/549 = 33.501
E21 = R2C1/n = 212(473)/549 = 182.652     E22 = R2C2/n = 212(76)/549 = 29.348
E31 = R3C1/n = 95(473)/549 = 81.849       E32 = R3C2/n = 95(76)/549 = 13.151

To determine if the likelihood for stress is dependent on an employee's fitness level, we
test:

H0: Stress and Fitness level are independent
Ha: Stress and Fitness level are dependent


The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (204 − 208.499)²/208.499 + (38 − 33.501)²/33.501 + (184 − 182.652)²/182.652
     + (28 − 29.348)²/29.348 + (85 − 81.849)²/81.849 + (10 − 13.151)²/13.151
   = 1.648

Since no α level was given, we will use α = .05. The rejection region requires
α = .05 in the upper tail of the χ² distribution with df = (r − 1)(c − 1) = (3 − 1)(2 − 1) = 2.
From Table VII, Appendix B, χ².05 = 5.99147. The rejection region is χ² > 5.99147.

Since the observed value of the test statistic does not fall in the rejection region
(χ² = 1.648 ≯ 5.99147), H0 is not rejected. There is insufficient evidence to indicate
that the likelihood for stress is dependent on an employee's fitness level at α = .05.

b.

A Type I error is rejecting H0 when H0 is true. In this case, it would be concluding that
Stress and Fitness level are dependent when, in fact, they are independent.

A Type II error is accepting H0 when H0 is false. In this case, it would be concluding
that Stress and Fitness level are independent when, in fact, they are dependent.

c.

To convert frequencies to percentages, divide the numbers in each row by the row total
and multiply by 100. Also, divide the column totals by the overall total and multiply by
100.

                          Stress Level
Fitness Level     No Stress               Stress
Poor              204/242 = 84.3%         38/242 = 15.7%
Average           184/212 = 86.8%         28/212 = 13.2%
Good              85/95 = 89.5%           10/95 = 10.5%
Total             473/549 = 86.2%         76/549 = 13.8%


Using MINITAB, the bar chart is:

[Bar chart of the percentage with stress for each fitness level (Poor 15.7%, Average 13.2%,
Good 10.5%), with the overall percentage of 13.8% shown for reference.]

9.38

a.

E(n1) = np1,0 = 370(.30) = 111
E(n2) = np2,0 = 370(.20) = 74
E(n3) = np3,0 = 370(.20) = 74
E(n4) = np4,0 = 370(.10) = 37
E(n5) = np5,0 = 370(.10) = 37
E(n6) = np6,0 = 370(.10) = 37

b.

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni)
   = (84 − 111)²/111 + (79 − 74)²/74 + (75 − 74)²/74 + (49 − 37)²/37
     + (36 − 37)²/37 + (47 − 37)²/37
   = 13.541

c.

To determine if the true percentages of the colors produced differ from the manufacturer's
stated percentages, we test:

H0: p1 = .30, p2 = .20, p3 = .20, p4 = .10, p5 = .10, p6 = .10
Ha: At least one pi does not equal its hypothesized value

The test statistic is χ² = 13.541.


The rejection region requires α = .05 in the upper tail of the χ² distribution with df = k − 1
= 6 − 1 = 5. From Table VII, Appendix B, χ².05 = 11.0705. The rejection region is
χ² > 11.0705.

Since the observed value of the test statistic falls in the rejection region
(χ² = 13.541 > 11.0705), H0 is rejected. There is sufficient evidence to indicate the true
percentages of the colors produced differ from the manufacturer's stated percentages at
α = .05.
9.40

a.

The expected cell counts are:

E11 = R1C1/n = 20(11)/31 = 7.097      E12 = R1C2/n = 20(20)/31 = 12.903
E21 = R2C1/n = 11(11)/31 = 3.903      E22 = R2C2/n = 11(20)/31 = 7.097

b.

One of the assumptions for the chi-square test is that the sample size, n, is large enough
so that, for every cell, the expected cell count, Eij, will be equal to 5 or more. For cell (2, 1),
the expected cell count is only 3.903.

c.

To determine if inside ownership and size are independent, we test:

H0: Inside ownership and size are independent
Ha: Inside ownership and size are dependent

The p-value is .0043. Since the p-value is so small, H0 is rejected. There is sufficient
evidence to indicate that inside ownership and size are dependent for α > .0043.

d.

First, we find the percentages by dividing each cell count by the column total and
multiplying by 100. The row totals are divided by the total sample size. The percentages
are found in the table:
                              Size
Insider Ownership     Small               Large               Totals
Low                   3/11 = 27.3%        17/20 = 85%         20/31 = 64.5%
High                  8/11 = 72.7%        3/20 = 15%          11/31 = 35.5%


Using MINITAB, the bar chart is:

[Bar chart of the percentage of low insider ownership for small firms (27.3%) and large
firms (85%), with the overall percentage of 64.5% shown for reference.]

Since the bars are not the same height, there is evidence that insider ownership and size
are dependent. This is what we found in part c.
9.42

Some preliminary calculations are:

E11 = R1C1/n = 100(171)/500 = 34.2     E12 = R1C2/n = 100(207)/500 = 41.4
E13 = R1C3/n = 100(80)/500 = 16.0      E14 = R1C4/n = 100(42)/500 = 8.4
E21 = R2C1/n = 175(171)/500 = 59.9     E22 = R2C2/n = 175(207)/500 = 72.5
E23 = R2C3/n = 175(80)/500 = 28.0      E24 = R2C4/n = 175(42)/500 = 14.7
E31 = R3C1/n = 145(171)/500 = 49.6     E32 = R3C2/n = 145(207)/500 = 60.0
E33 = R3C3/n = 145(80)/500 = 23.2      E34 = R3C4/n = 145(42)/500 = 12.2
E41 = R4C1/n = 80(171)/500 = 27.4      E42 = R4C2/n = 80(207)/500 = 33.1
E43 = R4C3/n = 80(80)/500 = 12.8       E44 = R4C4/n = 80(42)/500 = 6.7


To determine if there is a dependence between a son's choice of occupation and his
father's occupation, we test:

H0: Son's choice of occupation and his father's occupation are independent
Ha: Son's choice of occupation and his father's occupation are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (55 − 34.2)²/34.2 + (38 − 41.4)²/41.4 + (7 − 16.0)²/16.0 + (0 − 8.4)²/8.4 + (79 − 59.9)²/59.9
     + (71 − 72.5)²/72.5 + (25 − 28)²/28 + (0 − 14.7)²/14.7 + (22 − 49.6)²/49.6 + (75 − 60)²/60
     + (38 − 23.2)²/23.2 + (10 − 12.2)²/12.2 + (15 − 27.4)²/27.4 + (23 − 33.1)²/33.1
     + (10 − 12.8)²/12.8 + (32 − 6.7)²/6.7
   = 181.32

The rejection region requires α = .05 in the upper tail of the χ² distribution with df
= (r − 1)(c − 1) = (4 − 1)(4 − 1) = 9. From Table VII, Appendix B, χ².05 = 16.9190. The
rejection region is χ² > 16.9190.

Since the observed value of the test statistic falls in the rejection region (χ² = 181.32 >
16.9190), H0 is rejected. There is sufficient evidence to indicate a dependence between a son's
choice of occupation and his father's occupation at α = .05.
9.44

a.

Some preliminary calculations are:

E11 = R1C1/n = 57(52)/86 = 34.465     E12 = R1C2/n = 57(34)/86 = 22.535
E21 = R2C1/n = 29(52)/86 = 17.535     E22 = R2C2/n = 29(34)/86 = 11.465

To determine if manufacturing firms were more likely to be involved with TQM than
service firms, we test:

H0: Type of firm and TQM are independent
Ha: Type of firm and TQM are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (34 − 34.465)²/34.465 + (23 − 22.535)²/22.535 + (18 − 17.535)²/17.535 + (11 − 11.465)²/11.465
   = .047

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (2 − 1)(2 − 1) = 1. From Table VII, Appendix B, χ².05 = 3.84146. The
rejection region is χ² > 3.84146.


Since the observed value of the test statistic does not fall in the rejection region (χ² = .047
≯ 3.84146), H0 is not rejected. There is insufficient evidence to indicate that the type of
firm and TQM are dependent at α = .05. There is no evidence to indicate that
manufacturing firms are more likely to be involved with TQM than service firms.

b.

The p-value is P(χ² > .047). From Table VII, Appendix B, with df = 1, .10 < P(χ² > .047)
< .90.
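If an exact p-value is wanted rather than the Table VII bounds, it can be computed directly.
The Python sketch below is an added illustration (it assumes SciPy and is not part of the
original solution).

    # Sketch: exact p-value for the exercise 9.44 test statistic (assumes SciPy)
    from scipy.stats import chi2

    p_value = chi2.sf(0.047, df=1)
    print(round(p_value, 3))   # about 0.83, consistent with .10 < p < .90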

c.

We must assume:

1. The n observed counts are a random sample from the population of interest. We
   may then consider this to be a multinomial experiment with r × c = 2 × 2 = 4
   possible outcomes.
2. The sample size, n, will be large enough so that, for every cell, the expected cell
   count, E(nij), will be equal to 5 or more.

9.46

a.

Some preliminary calculations are:

E(n1) = np1,0 = 85(.26) = 22.1
E(n2) = np2,0 = 85(.30) = 25.5
E(n3) = np3,0 = 85(.11) = 9.35
E(n4) = np4,0 = 85(.14) = 11.9
E(n5) = np5,0 = 85(.19) = 16.15

To determine if the probabilities differ from the hypothesized values, we test:

H0: p1 = .26, p2 = .30, p3 = .11, p4 = .14, p5 = .19
Ha: At least one of the probabilities differs from its hypothesized value

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni)
   = (32 − 22.1)²/22.1 + (26 − 25.5)²/25.5 + (15 − 9.35)²/9.35 + (6 − 11.9)²/11.9 + (6 − 16.15)²/16.15
   = 17.16

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = k − 1
= 5 − 1 = 4. From Table VII, Appendix B, χ².05 = 9.48773. The rejection region is
χ² > 9.48773.

Since the observed value of the test statistic falls in the rejection region (χ² = 17.16 >
9.48773), reject H0. There is sufficient evidence to indicate the probabilities differ from
their hypothesized values at α = .05.


b.

p̂1 = n1/n = 32/85 = .37647

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV,
Appendix B, z.025 = 1.96. The 95% confidence interval is:

p̂1 ± z.025 √(p̂1(1 − p̂1)/n) ⇒ .376 ± 1.96 √(.37647(1 − .37647)/85) ⇒ .376 ± .103
⇒ (.273, .479)

c.

The interval tells us that between 27.3% and 47.9% of the Avonex MS patients are
exacerbation-free during a two-year period. Since this interval is completely above the
percentage of placebo patients (26%), it seems that the Avonex patients are more likely to
have no exacerbations than placebo patients.

9.48

a.

Some preliminary calculations are:

The contingency table is:

Shift     Defectives     Non-Defectives     Total
1         25             175                200
2         35             165                200
3         80             120                200
Total     140            460                600

E11 = E21 = E31 = 200(140)/600 = 46.667
E12 = E22 = E32 = 200(460)/600 = 153.333

To determine if quality of the filters is related to shift, we test:

H0: Quality of filters and shift are independent
Ha: Quality of filters and shift are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (25 − 46.667)²/46.667 + (175 − 153.333)²/153.333 + (35 − 46.667)²/46.667
     + (165 − 153.333)²/153.333 + (80 − 46.667)²/46.667 + (120 − 153.333)²/153.333
   = 47.98


The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (3 − 1)(2 − 1) = 2. From Table VII, Appendix B, χ².05 = 5.99147. The
rejection region is χ² > 5.99147.

Since the observed value of the test statistic falls in the rejection region (χ² = 47.98 >
5.99147), H0 is rejected. There is sufficient evidence to indicate quality of filters and
shift are related at α = .05.

b.

The form of the confidence interval for p is:

p̂1 ± zα/2 √(p̂1q̂1/n)   where p̂1 = 25/200 = .125

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV,
Appendix B, z.025 = 1.96. The 95% confidence interval is:

.125 ± 1.96 √(.125(.875)/200) ⇒ .125 ± .046 ⇒ (.079, .171)

9.50

Using SAS, the output is:

The FREQ Procedure
Table of CANDIDATE by TIME

Frequency / Col Pct

CANDIDATE       1        2        3        4        5        6     Total
SMITH         208      208      451      392      351      410      2020
            52.53    55.32    55.34    55.92    56.16    55.33
COPPIN         55       51      109       98       88      104       505
            13.89    13.56    13.37    13.98    14.08    14.04
MONTES        133      117      255      211      186      227      1129
            33.59    31.12    31.29    30.10    29.76    30.63
Total         396      376      815      701      625      741      3654

Statistics for Table of CANDIDATE by TIME

Statistic                      DF       Value      Prob
-------------------------------------------------------
Chi-Square                     10      2.2839    0.9937
Likelihood Ratio Chi-Square    10      2.2722    0.9938
Mantel-Haenszel Chi-Square      1      0.9851    0.3209
Phi Coefficient                        0.0250
Contingency Coefficient                0.0250
Cramer's V                             0.0177

Sample Size = 3654

To determine if candidates received votes independent of time period, we test:

H0: Voting and Time period are independent
Ha: Voting and Time period are dependent

The test statistic is χ² = 2.2839.


Since no value of α was given, we will use α = .05. The rejection region requires α = .05 in
the upper tail of the χ² distribution with df = (r − 1)(c − 1) = (3 − 1)(6 − 1) = 10. From Table
VII, Appendix B, χ².05 = 18.3070. The rejection region is χ² > 18.3070.

Since the observed value of the test statistic does not fall in the rejection region
(χ² = 2.2839 ≯ 18.3070), H0 is not rejected. There is insufficient evidence to indicate Voting
and Time period are dependent at α = .05. Thus, we can conclude that voting and time period
are independent. This means that regardless of time period, the percentage of votes received by
each candidate is the same. In the table created by SAS, the bottom number in each cell is the
column percent. This is the percent of votes received by the candidate in each time period. An
inspection of these percents indicates that candidate Smith received approximately 55.3% of
the votes in each time period, candidate Coppin received approximately 13.8% of the votes, and
candidate Montes received approximately 30.9% of the votes. All of this indicates that the
election was rigged.


Discrimination in the Work Place

(To accompany Chapters 8 and 9)

Part I:

If we assume that those selected for termination were randomly selected from all workers, then the chi-square test for independence is appropriate. Using SAS, the output is:
TABLE OF RACE BY DECISION
RACE

DECISION

Frequency|
Percent |
Row Pct |
Col Pct |RETAINED|LAIDOFF | Total
---------+--------+--------+
WHITE
|
1051 |
31 |
1082
| 86.50 |
2.55 | 89.05
| 97.13 |
2.87 |
| 90.29 | 60.78 |
---------+--------+--------+
BLACK
|
113 |
20 |
133
|
9.30 |
1.65 | 10.95
| 84.96 | 15.04 |
|
9.71 | 39.22 |
---------+--------+--------+
Total
1164
51
1215
95.80
4.20
100.00
STATISTICS FOR TABLE OF RACE BY DECISION

Statistic                      DF      Value       Prob
-------------------------------------------------------
Chi-Square                      1     43.641      0.001
Likelihood Ratio Chi-Square     1     29.260      0.001
Continuity Adj. Chi-Square      1     40.666      0.001
Mantel-Haenszel Chi-Square      1     43.605      0.001
Fisher's Exact Test (Left)                        1.000
                    (Right)                    6.43E-08
                    (2-Tail)                   6.43E-08
Phi Coefficient                        0.190
Contingency Coefficient                0.186
Cramer's V                             0.190

Sample Size = 1215


To determine if the variables Race and Decision are related, we test:

H0: Race and Decision are independent
Ha: Race and Decision are dependent

The test statistic is χ² = 43.641.

The p-value is p = .001. Since the p-value is so small, there is evidence to reject H0. There is sufficient evidence to indicate that Race and Decision are related. From the table, only 2.9% of whites were terminated, while 15.0% of blacks were terminated. There is a significant difference between these percentages, which supports the plaintiff's position. However, this is all based on the assumption that those selected to be laid off were randomly selected. If the company made its decision based on performance, as it claims, then those selected to be terminated were not randomly selected and the test of hypothesis is invalid.
Part II:

If the workers to be terminated were truly selected at random, then the chi-square test for independence is appropriate. Using SAS, the output is:

TABLE OF STATUS BY AGE1

STATUS        AGE1
Frequency
Percent
Row Pct
Col Pct      UNDER 40      40 +     Total
ACTIVE             18        13        31
                32.73     23.64     56.36
                58.06     41.94
                72.00     43.33
TERMINATED          7        17        24
                12.73     30.91     43.64
                29.17     70.83
                28.00     56.67
Total              25        30        55
                45.45     54.55    100.00


STATISTICS FOR TABLE OF STATUS BY AGE1

Statistic                      DF      Value      Prob
Chi-Square                      1      4.556     0.033
Likelihood Ratio Chi-Square     1      4.651     0.031
Continuity Adj. Chi-Square      1      3.465     0.063
Mantel-Haenszel Chi-Square      1      4.473     0.034
Fisher's Exact Test (Left)             0.993
                    (Right)            0.031
                    (2-Tail)           0.055
Phi Coefficient                        0.288
Contingency Coefficient                0.277
Cramer's V                             0.288

Sample Size = 55

To determine if the variables Status and Age are related, we test:

H0: Age and Status are independent
Ha: Age and Status are dependent

The test statistic is χ² = 4.556.

The p-value is p = .033. Since the p-value is small, there is evidence to reject H0. There is sufficient evidence to indicate that Age and Status are related. From the table, 56.7% of those aged 40 and over were terminated, while only 28.0% of those aged under 40 were terminated. This significant difference in percentages supports the plaintiff's position.
We can also look at some other revealing statistics. If we compare the mean wages of those terminated against those who remained active, there is a significant difference: the mean wage of those terminated is significantly higher than the mean wage of those who remained active. Also, the mean age of those who remained active (33.0) is significantly less than the mean age of those who were terminated (44.08), and the mean wage of those under 40 ($26,452.20) was significantly less than the mean wage of those 40 or over ($39,044.17). All of this implies that those who were terminated were the older workers with the higher salaries. It appears that the company wanted not only to reduce the work force, but also to reduce its mean expenses for those remaining on the workforce.

I can find nothing to support the defendant's position.
TTEST PROCEDURE

Variable: WAGES

STATUS       N       Mean      Std Dev   Std Error   Variances        T      DF   Prob>|T|
ACTIVE      31   28772.26    6302.5283   1131.9675   Unequal    -6.8124    52.9     0.0001
TERMINATED  24   39195.42    5042.9673   1029.3914   Equal      -6.6214    53.0     0.0000*


For H0: Variances are equal, F' = 1.56   DF = (30,23)   Prob>F' = 0.2738
************************************************************************

Variable: AGE

STATUS       N      Mean   Std Dev   Std Error   Variances        T      DF   Prob>|T|
ACTIVE      31   33.0000    8.0000      1.4368   Unequal    -5.7661    53.0    0.0001*
TERMINATED  24   44.0833    6.2549      1.2768   Equal      -5.5886    53.0    0.0000

For H0: Variances are equal, F' = 1.64   DF = (30,23)   Prob>F' = 0.2273
************************************************************************

Variable: WAGES

AGE1         N         Mean     Std Dev   Std Error   Variances         T      DF   Prob>|T|
UNDER 40    25   26452.2000   4739.5548    947.9110   Unequal    -10.1970    49.3     0.0001
40 +        30   39044.1667   4334.8764    791.4365   Equal      -10.2814    53.0     0.0000*

For H0: Variances are equal, F' = 1.20   DF = (24,29)   Prob>F' = 0.6409
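The two-sample t-tests summarized in the SAS output above can be reproduced from the summary statistics alone. A minimal sketch, not part of the original solution, assuming Python with scipy:

    from scipy.stats import ttest_ind_from_stats

    # Summary statistics for WAGES by STATUS, taken from the TTEST output above.
    t_stat, p_value = ttest_ind_from_stats(
        mean1=28772.26, std1=6302.5283, nobs1=31,   # ACTIVE
        mean2=39195.42, std2=5042.9673, nobs2=24,   # TERMINATED
        equal_var=True)
    print(round(t_stat, 4), p_value)
    # t is about -6.62 with a two-tailed p-value well below .0001, matching the
    # equal-variance line of the SAS output: terminated workers had significantly
    # higher mean wages than those who remained active.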


Simple Linear Regression

Chapter 10

10.2

For all problems below, we use:

    Slope = "rise"/"run" = (y2 − y1)/(x2 − x1)

a.  Slope = (5 − 1)/(5 − 1) = 1 = β₁

    If y = β₀ + β₁x, then β₀ = y − β₁x.

    Since a given point is (1, 1) and β₁ = 1, the y-intercept is β₀ = 1 − 1(1) = 0.

b.  Slope = (0 − 3)/(3 − 0) = −1 = β₁

    If y = β₀ + β₁x, then β₀ = y − β₁x.

    Since (0, 3) is given, the y-intercept is β₀ = 3 − (−1)(0) = 3.

c.  Slope = (2 − 1)/(4 − (−1)) = 1/5 = .2 = β₁

    If y = β₀ + β₁x, then β₀ = y − β₁x.

    Since a given point is (−1, 1) and β₁ = 1/5, the y-intercept is β₀ = 1 − .2(−1) = 1.2.

d.  Slope = (−6 − 3)/(2 − (−6)) = −9/8 = −1.125 = β₁

    If y = β₀ + β₁x, then β₀ = y − β₁x.

    Since a given point is (−6, 3) and β₁ = −9/8, the y-intercept is β₀ = 3 − (−1.125)(−6) = −3.75.
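These slope and intercept calculations can be checked numerically. A minimal sketch, not part of the original solution, assuming Python is available:

    def line_through(p1, p2):
        """Slope and y-intercept of the straight line through two points."""
        (x1, y1), (x2, y2) = p1, p2
        slope = (y2 - y1) / (x2 - x1)     # "rise" over "run"
        intercept = y1 - slope * x1       # beta0 = y - beta1*x at a known point
        return slope, intercept

    print(line_through((1, 1), (5, 5)))      # part a: (1.0, 0.0)
    print(line_through((0, 3), (3, 0)))      # part b: (-1.0, 3.0)
    print(line_through((-1, 1), (4, 2)))     # part c: (0.2, 1.2)
    print(line_through((-6, 3), (2, -6)))    # part d: (-1.125, -3.75)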
10.4

a.  The equation for a straight line (deterministic) is y = β₀ + β₁x.

    If the line passes through (1, 1), then 1 = β₀ + β₁(1), i.e., 1 = β₀ + β₁.
    Likewise, through (5, 5): 5 = β₀ + β₁(5).

    Solving these two equations,

        1 = β₀ + β₁
     −(5 = β₀ + 5β₁)
        −4 = −4β₁  ⇒  β₁ = 1

    Substituting β₁ = 1 into the first equation, we get 1 = β₀ + 1 ⇒ β₀ = 0.

    The equation is y = 0 + 1x, or y = x.

b.  The equation for a straight line is y = β₀ + β₁x. If the line passes through (0, 3), then
    3 = β₀ + β₁(0), which implies β₀ = 3. Likewise, through the point (3, 0), 0 = β₀ + 3β₁,
    or β₀ = −3β₁. Substituting β₀ = 3, we get 3 = −3β₁, or β₁ = −1. Therefore, the line passing
    through (0, 3) and (3, 0) is y = 3 − x.

c.  The equation for a straight line is y = β₀ + β₁x. If the line passes through (−1, 1), then
    1 = β₀ + β₁(−1). Likewise, through the point (4, 2), 2 = β₀ + β₁(4). Solving these
    two equations,

        2 = β₀ + 4β₁
     −(1 = β₀ − β₁)
         1 = 5β₁  ⇒  β₁ = 1/5

    Solving for β₀: 1 = β₀ + (1/5)(−1), or β₀ = 1 + 1/5 = 6/5.

    The equation, with β₀ = 6/5 and β₁ = 1/5, is y = 6/5 + (1/5)x.

d.  The equation for a straight line is y = β₀ + β₁x. If the line passes through (−6, −3), then
    −3 = β₀ − 6β₁. Likewise, through the point (2, 6), 6 = β₀ + 2β₁. Solving these equations
    simultaneously,

         6 = β₀ + 2β₁
     −(−3 = β₀ − 6β₁)
         9 = 8β₁  ⇒  β₁ = 9/8

    Solving for β₀: 6 = β₀ + 2(9/8) ⇒ β₀ = 6 − 18/8 = 30/8.

    Therefore, y = 30/8 + (9/8)x.

10.6

a.  y = 4 + x. The slope is β₁ = 1. The intercept is β₀ = 4.

b.  y = 5 − 2x. The slope is β₁ = −2. The intercept is β₀ = 5.

c.  y = −4 + 3x. The slope is β₁ = 3. The intercept is β₀ = −4.

d.  y = −2x. The slope is β₁ = −2. The intercept is β₀ = 0.

e.  y = x. The slope is β₁ = 1. The intercept is β₀ = 0.

f.  y = .5 + 1.5x. The slope is β₁ = 1.5. The intercept is β₀ = .5.

10.8

The "line of means" is the deterministic component in a probabilistic model.

10.10

a.
        xi      yi      xi²      xi·yi
         7       2      49        14
         4       4      16        16
         6       2      36        12
         2       5       4        10
         1       7       1         7
         1       6       1         6
         3       5       9        15

    Totals:  Σxi = 24,  Σyi = 31,  Σxi² = 116,  Σxi yi = 80

b.  SSxy = Σxi yi − (Σxi)(Σyi)/n = 80 − (24)(31)/7 = 80 − 106.2857143 = −26.2857143

c.  SSxx = Σxi² − (Σxi)²/n = 116 − (24)²/7 = 116 − 82.28571429 = 33.71428571

d.  β̂₁ = SSxy/SSxx = −26.2857143/33.71428571 = −.779661017 ≈ −.7797

e.  x̄ = 24/7 = 3.428571429,  ȳ = 31/7 = 4.428571429

f.  β̂₀ = ȳ − β̂₁x̄ = 4.428571429 − (−.779661017)(3.428571429)
        = 4.428571429 + 2.673123487 = 7.101694916 ≈ 7.102

g.  The least squares line is ŷ = β̂₀ + β̂₁x = 7.102 − .7797x.
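The computing formulas for the least squares estimates can be verified numerically. A minimal sketch, not part of the original solution, assuming Python with numpy and using the data listed in part a:

    import numpy as np

    x = np.array([7, 4, 6, 2, 1, 1, 3], dtype=float)
    y = np.array([2, 4, 2, 5, 7, 6, 5], dtype=float)
    n = len(x)

    ss_xy = (x * y).sum() - x.sum() * y.sum() / n    # SSxy
    ss_xx = (x ** 2).sum() - x.sum() ** 2 / n        # SSxx
    b1 = ss_xy / ss_xx                               # estimated slope
    b0 = y.mean() - b1 * x.mean()                    # estimated intercept

    print(round(b1, 4), round(b0, 3))   # -0.7797 and 7.102, matching the hand calculation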


10.12

a.

b.  Choose y = 1 + x since it best describes the relation of x and y.

c.
        x       y     ŷ = 1 + x    y − ŷ     ŷ = 3 − x    y − ŷ
        .5      2        1.5         .5         2.5        −.5
       1.0      1        2.0       −1.0         2.0       −1.0
       1.5      3        2.5         .5         1.5        1.5
                      Sum of errors = 0      Sum of errors = 0

d.  SSE = Σ(y − ŷ)²

    SSE for 1st model, ŷ = 1 + x:  SSE = (.5)² + (−1)² + (.5)² = 1.5
    SSE for 2nd model, ŷ = 3 − x:  SSE = (−.5)² + (−1)² + (1.5)² = 3.5

    The best fitting straight line is the one that has the smallest SSE. The model
    ŷ = 1 + x has the smaller SSE, and therefore it verifies the visual check in part a.

e.  Some preliminary calculations are:

    Σx = 3,  Σy = 6,  Σxy = 6.5,  Σx² = 3.5

    SSxy = Σxy − (Σx)(Σy)/n = 6.5 − (3)(6)/3 = .5
    SSxx = Σx² − (Σx)²/n = 3.5 − (3)²/3 = .5

    β̂₁ = SSxy/SSxx = .5/.5 = 1
    x̄ = Σx/n = 3/3 = 1,  ȳ = Σy/n = 6/3 = 2
    β̂₀ = ȳ − β̂₁x̄ = 2 − 1(1) = 1

    ŷ = β̂₀ + β̂₁x = 1 + x

    The least squares line is the same as the second line given.
10.14

a.  The straight-line model would be: y = β₀ + β₁x + ε

b.  The least squares line is: ŷ = −2,298.4 + 11,598.9x

c.  Since the range of observed values for the number of carats (x) does not include 0, the
    y-intercept has no meaning.

d.  The slope of the line is β̂₁. In terms of this problem, β̂₁ is the estimated change in the mean
    asking price for each additional carat. This interpretation is meaningful for values of x
    within the observed range. The observed range of x is .18 to 1.10.

e.  ŷ = −2,298.4 + 11,598.9(.52) = 3,733.028. The predicted asking price for a .52 carat
    diamond is $3,733.028.

10.16

a.  Some preliminary calculations are:

    Σx = 62,  Σy = 97.8,  Σx² = 720.52,  Σy² = 1,710.2,  Σxy = 1,087.78

    x̄ = Σx/n = 62/6 = 10.33333333,  ȳ = Σy/n = 97.8/6 = 16.3

    SSxy = Σxy − (Σx)(Σy)/n = 1,087.78 − 62(97.8)/6 = 1,087.78 − 1,010.6 = 77.18

    SSxx = Σx² − (Σx)²/n = 720.52 − (62)²/6 = 720.52 − 640.667 = 79.8533333

    β̂₁ = SSxy/SSxx = 77.18/79.8533333 = 0.966521957 ≈ 0.9665

    β̂₀ = ȳ − β̂₁x̄ = 16.3 − 0.966521957(10.33333333) = 6.312606448 ≈ 6.3126

    ŷ = 6.3126 + .9665x

b.  Since x = 0 is not in the observed range of the mean pore diameters, the y-intercept has no
    meaning.

c.  For each unit increase in mean pore diameter, the mean value of porosity is estimated to
    increase by .9665.

d.  For x = 10, ŷ = 6.3126 + .9665(10) = 15.9776.

10.18

a.  Some preliminary calculations are:

    Σx = 6,167,  Σy = 135.8,  Σx² = 1,641,115,  Σxy = 34,764.5,  n = 24

    SSxy = Σxy − (Σx)(Σy)/n = 34,764.5 − (6,167)(135.8)/24 = −130.44167

    SSxx = Σx² − (Σx)²/n = 1,641,115 − (6,167)²/24 = 56,452.95833

    β̂₁ = SSxy/SSxx = −130.44167/56,452.958 = −.002310625 ≈ −.0023

    β̂₀ = ȳ − β̂₁x̄ = 135.8/24 − (−.002310625)(6,167/24) = 6.252067683 ≈ 6.25

    The least squares line is ŷ = 6.25 − .0023x.

b.  β̂₀ = 6.25. Since x = 0 is not in the observed range, β̂₀ has no interpretation other than
    being the y-intercept.

    β̂₁ = −.0023. For each additional increase of 1 part per million of pectin, the mean
    sweetness index is estimated to decrease by .0023.

c.  ŷ = 6.25 − .0023(300) = 5.56

10.20

a.

A proposed model is E(y) = o + 1x.

b.

Some preliminary calculations are:

x = 1, 292.7
x 2 = 88,668.43


y = 3,781.1

xy = 218, 291.63

y 2 = 651,612.45


x=

x = 1, 292.7 = 58.75909091

y=

22

SS xy = xy

y = 3,781.1 = 171.8681818
22

( x )( y ) = 218, 291.63 1, 292.7(3,781.1)

n
22
= 218, 291.63 222,173.9986 = 3,882.3686

( x)

(1, 292.7) 2
n
22
= 88,668.43 75,957.87682 = 12,710.55318

SSxx = x

1 =

SSxy
SSxx

= 88,668.43

3,882.3686
= 0.305444503 0.305
12,710.55318

o = y 1 x = 171.8681818 (0.305444503)(58.75909091)
= 189.8158231 189.816
The fitted regression line is: y = 189.816 0.305 x
c.

Using MINITAB, a graph of the fitted regression line is:


[Fitted line plot: FCAT-Math = 189.8 − 0.3054 Percent;  S = 5.36572,  R-Sq = 67.3%,  R-Sq(adj) = 65.7%]

From the fitted regression line, the relationship between the two variables is
negative.


d.

o = 189.816 . Since 0 is not in the range of observed values of the variable %


Below Poverty, the y-intercept has no meaning.

1 = 0.305 .

e.

For each unit change in % Below Poverty, the mean value of


FCAT-Math is estimated to decrease by 0.305.

A proposed model is E(y) = o + 1x.


Some preliminary calculations are:

x = 1, 292.7

y = 3,764.2

x 2 = 88,668.43
x=

y 2 = 645, 221.16

x = 1, 292.7 = 58.75909091
n

22

SSxy = xy

xy = 217,738.81

y=

y = 3,764.2 = 171.1
n

22

( x )( y ) = 217,738.81 1, 292.7(3,764.2)

n
= 217,738.81 221,180.97 = 3, 442.16

( x)

22

(1, 292.7) 2
n
22
= 88,668.43 75,957.87682 = 12,710.55318

SS xx = x

1 =

SSxy
SSxx

= 88,668.43

3, 442.16
= 0.270811187 0.271
12,710.55318

o = y 1 x = 171.1 (0.270811187)(58.75909091) = 187.0126192 187.013


The fitted regression line is: y = 187.013 0.271x


Using MINITAB, a graph of the fitted regression line is:


[Fitted line plot: FCAT-Read = 187.0 − 0.2708 Percent;  S = 3.42319,  R-Sq = 79.9%,  R-Sq(adj) = 78.9%]

From the fitted regression line, the relationship between the two variables is
negative.

10.22

o = 187.013 .

Since 0 is not in the range of observed values of the variable %


Below Poverty, the y-intercept has no meaning.

1 = 0.271 .

For each unit change in % Below Poverty, the mean value of


FCAT-Reading is estimated to decrease by .271.

a.

We will select Average Salary as the dependent variable and Mean GMAT as the
independent variable.

b.

Some preliminary calculations are:

x = 6,944

y = 1,080, 288

x 2 = 4,824,680

y 2 = 118,151,669, 430

x=

x = 6,944 = 694.4
n

10

SSxy = xy

y=

y = 1,080, 288 = 108,028.8


n

10

( x )( y ) = 751,698, 490 6,944(1,080, 288)

n
= 751,698, 490 75,015,987.2 = 1,546,502.8


xy = 751,698, 490

10


( x)

(6,944) 2
n
10
= 4,824,680 4,821,913.6 = 2,766.4

SSxx = x

1 =

SSxy
SSxx

= 4,824,680

1,546,502.8
= 559.0307981 559.031
2,766.4

o = y 1 x = 108,028.8 (559.0307981)(694.4) = 280,162.1862 280,162.186


The fitted regression line is: y = 280,162.186 + 559.031x

o = 280,162.186 .

Since 0 is not in the range of observed values of the variable


Mean GMAT, the y-intercept has no meaning.

1 = 0.271 . For each additional point increase in the mean GMAT score, the mean
value of Average Salary is estimated to increase by $559.031.

10.24

The graph in b would have the smallest s2 because the width of the data points is the smallest.

10.26

a.  SSE = SSyy − β̂₁SSxy = 95 − .75(50) = 57.5

    s² = SSE/(n − 2) = 57.5/(20 − 2) = 3.19444

b.  SSyy = Σy² − (Σy)²/n = 860 − (50)²/40 = 797.5

    SSE = SSyy − β̂₁SSxy = 797.5 − .2(2,700) = 257.5

    s² = SSE/(n − 2) = 257.5/(40 − 2) = 6.776315789 ≈ 6.7763

c.  SSyy = Σ(y − ȳ)² = 58

    β̂₁ = SSxy/SSxx = 91/170 = .535294117

    SSE = SSyy − β̂₁SSxy = 58 − .535294117(91) = 9.2882353 ≈ 9.288

    s² = SSE/(n − 2) = 9.2882353/(10 − 2) = 1.161029413 ≈ 1.1610
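The pattern in all three parts is the same: SSE = SSyy − β̂₁SSxy and s² = SSE/(n − 2). A minimal sketch, not part of the original solution, assuming Python:

    def sse_and_s2(ss_yy, ss_xy, b1, n):
        """SSE = SSyy - b1*SSxy and s^2 = SSE/(n - 2)."""
        sse = ss_yy - b1 * ss_xy
        return sse, sse / (n - 2)

    print(sse_and_s2(95.0, 50.0, 0.75, 20))       # part a: (57.5, 3.1944...)
    print(sse_and_s2(797.5, 2700.0, 0.2, 40))     # part b: (257.5, 6.7763...)
    print(sse_and_s2(58.0, 91.0, 91 / 170, 10))   # part c: (about 9.288, about 1.161)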

10.28

a.

From the printout, SSE = 382,178,624, s2 = MSE = 1,248,950, and s = 1,117.56.

b.

s = 1,117.56. We would expect approximately 95% of the observed values of y to


fall within 2s or 2(1,117.56) = 2,235.12 of their least squares predicted values.


10.30

a.

From part a of Exercise 10.17, SSxy = 20.00833333,

y = 239 , y 2 = 10, 255 ,

and 1 = 35.91623038 .

( y)

(239) 2
n
6
= 10, 255 9520.166667 = 734.8333333

SS yy = y

= 10, 255

SSE = SS yy 1SS xy = 734.833333 35.91623068(20.00833333) = 16.2094179


s 2 = MSE =

10.32

SSE 16.2094179
=
= 4.052354475 and s = 4.052354475 = 2.013
n2
62

b.

s = 2.013. We would expect approximately 95% of the observed values of y (Drug


release rate) to fall within 2s or 2(2.013) = 4.026 units of their least squares predicted
values.

a.

Using MINITAB, the scattergram of the data is:

b.

x = 44.71
y = 131,670
y = 1,514,402,100

xy

= 493,117.7

= 167.4615

x=

x = 44.71 = 3.7258333
n

SSxy =

12

xy

y=

y = 131, 670
n

12

= 10,972.5

( x )( y ) = 493,117.7 44.71(131, 670)

n
= 493,117.7 490,580.475 = 2,537.225

( x)

12

44.712
n
12
= 167.4615 166.5820083 = .8794917

SSxx =

1 =


SSxy
SS xx

= 167.4615

2, 537.225
= 2884.876571 2884.877
.8794917


0 = y 1 x = 10,972.5 2884.876571(3.7258333) = 10,972.5 10,748.56929


= 233.93071 233.931
The fitted regression line is = 233.931 + 2884.877x

c.

( y)

131, 6702
n
12
= 1,514,402,100 1,444,749,075 = 69,653,025

SSyy =

= 1,514,402,1000

SSE = SSyy 1 SSxy = 69,653,025 2,884.876571(2,537.225)


= 69,653,025 - 7,319,580.958 = 62,333,444.04
s2 =

SSE 62, 333, 444.04


=
= 6,233,344.404
n2
12 2

s=

s 2 = 6, 233, 344.404 = 2,496.6667

We would expect to see most of the hospital charges to fall within 2s or 2($2,496.6667) =
$4,993.3333 of the least squares line.
d.

For x = 4, y = 223.931 + 2,884.877(4) = 11,763.439


y 2s 11,763.439 4,993.3333 (6,770.106, 16,756.772)

e.

10.34

Only one state (California) had an average hospital charge more than 2 standard errors
from the least squares line. Thus, 11 out of 12 or 11/12 or .917 of the states had average
hospital charges within 2 standard errors of the least squares line.

Some preliminary calculations for Brand A are:

x = 750

SSxy = xy

x y = 2, 022 750(44.8) = 218

SSxx = x 2
SS yy = y

= 40, 500

xy = 2, 022 y = 44.8

( x)

= 168.70

15

( y)

= 40, 500

7502
= 3, 000
15

= 168.70

44.82
= 34.89733333
15

218
= 0.0726666667 0.0727
SSxx 3, 000
44.8
750
0 = y 1 x =
(0.0726666667)
= 6.62
15
15

1 =

SSxy


The least squares prediction equation for Brand A is: y = 6.62 0.0727 x
Some preliminary calculations for Brand B are:

x = 750

SSxy = xy

x y = 2, 622 750(58.9) = 323

SSxx = x 2

SS yy = y

= 40, 500

xy = 2, 622 y = 58.9

( x)

= 270.89

15

( y)

= 40, 500

7502
= 3, 000
15

= 270.89

58.92
= 39.60933333
15

323
1 =
=
= 0.1076666667 0.1077
SSxx 3, 000
58.9
750
0 = y 1 x =
(0.1076666667)
= 9.31
15
15
SSxy

The least squares prediction equation for Brand B is: y = 9.31 0.1077 x
For Brand A,
SSE = SS yy 1SS xy = 34.89733333 ( 0.072666667)(218) = 19.0560
s 2 = MSE =

SSE 19.0560
=
= 1.4658 and s = 1.4658 = 1.211
n 2 15 2

For Brand B,
SSE = SS yy 1SS xy = 39.60933333 (0.107666667)(323) = 4.833
s 2 = MSE =

SSE 4.833
=
= 0.37177 and s = 0.37177 = .61
n 2 15 2

For Brand A, y = 6.62 .0727x. For x = 70, y = 6.62 .0727(70) = 1.531


2s = 2(1.211) = 2.422
Therefore, y 2s 1.531 2.422 (.891, 3.593)
For Brand B, y = 9.31 .1077x. For x = 70, y = 9.31 .1077(70) = 1.7
2s = 2(.61) = 1.22
Therefore, y 2s 1.771 1.22 (.551, 2.991)
More confident with Brand B since there is less variation (s is smaller).


10.36

a.

b.

Some preliminary calculations are:

= 21

SSxy =

x = 91 xy = 86
y = 21
x y = 86 21(21) = 86 63 = 23
xy
2

SSxx =

SSyy =

( x)

= 89

( y)

= 91

21
= 91 63 = 28
7

= 89

212
= 26
7

23
= .821428571 .821
28
SS xx
21
21
0 = y 1 x = .821428571 = 3 2.4642857 = .535714285 .536
7
7

1 =

SS xy

The fitted line is y = .536 + .821x.


c.
d.

See the plot in part a.


To test whether x contributes significant information for predicting y, we test:
H0: 1 = 0
Ha: 1 0

e.

The test statistic is t =

1 0
s

where s =
1

s
SSxx

SSE = SSyy 1 SSxy = 26 .821428571(23) = 7.107142857


SSE
7.107142857
= 1.421428571
s2 =
s = 1.42143 = 1.1922
=
72
n2
1.1922
.82143 0
s =
= .2253
t=
= 3.646
1
.2253
28
The degrees of freedom for this t is df = n 2 = 7 2 = 5.


f.  The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution. From
    Table VI, Appendix B, t.025 = 2.571 with df = n − 2 = 7 − 2 = 5. The rejection region is
    t > 2.571 or t < −2.571.

    Since the observed value of the test statistic falls in the rejection region (t = 3.646 >
    2.571), H0 is rejected. There is sufficient evidence to indicate that x contributes
    information for the prediction of y at α = .05.
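The test of H0: β₁ = 0 can be assembled directly from the sums of squares. A minimal sketch, not part of the original solution, assuming Python with scipy and using the quantities from this exercise:

    import math
    from scipy.stats import t as t_dist

    n, ss_xx, ss_xy, ss_yy = 7, 28.0, 23.0, 26.0
    b1 = ss_xy / ss_xx                       # about .8214
    sse = ss_yy - b1 * ss_xy                 # about 7.107
    s = math.sqrt(sse / (n - 2))             # about 1.1922
    se_b1 = s / math.sqrt(ss_xx)             # about .2253
    t_stat = b1 / se_b1                      # about 3.65
    t_crit = t_dist.ppf(0.975, df=n - 2)     # 2.571 for df = 5
    print(round(t_stat, 3), round(t_crit, 3))
    # |t| = 3.646 exceeds 2.571, so H0 is rejected, as in part f.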

10.38

Some preliminary calculations are:

= 21

SSxy =

x = 91 xy = 65
y = 19
x y = 65 21(19) = 65 66.5 = -1.5
xy
2

SSxx =

x2

SSyy =

( x)

= 65

( y)

= 91

212
= 91 73.5 = 17.5
6

= 65

192
= 65 60.166667 = 4.8333333
6

1.5
SS xy
1 =
=
= .085714285 .0857
17.5
SS xx
SSE = SSyy 1 SSxy = 4.8333333 (.085714285)(1.5) = 4.704761903
SSE
4.704761903
s2 =
s = 1.76190476 = 1.0845
=
= 1.176190476
62
n2

To determine whether a straight line is useful for characterizing the relationship between
x and y, we test:

H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

.08571 0
= .33
1.0845

17.5
The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df = n 2
= 6 - 2 = 4. From Table VI, Appendix B, t.025 = 2.776. The rejection region is
t > 2.776 or t < 2.776.
Since the observed value of the test statistic does not fall in the rejection region (t = .33 </
2.776), H0 is not rejected. There is insufficient evidence to indicate that a straight line is
useful for characterizing the relationship between x and y at = .05.


10.40

a.

To determine if the average state SAT score in 2005 has a positive relationship with
the average state SAT score in 1990, we test:

H0: 1 = 0
Ha: 1 > 0
b.

From the printout in Exercise 10.15, the p-value is p = 0.000. This is the p-value for a 2tailed test. The p-value for this one-tailed test is 0.000/2 = 0.000. Since the p-value is
less than = .05, H0 is rejected. There is sufficient evidence to indicate the average state
SAT score in 2005 has a positive relationship with the average state SAT score in 1990 at
= .05.

c.

For confidence coefficient .95, = .05 and /2 = .05/2 = .025. From Table VI, Appendix
B, with df = n 2 = 51 2 = 49, t.025 2.011. The 95% confidence interval is:

1 t.025 s 1.073 2.011(.056) 1.073 .113 (.960, 1.186)


1

We are 95% confident that for each additional point in the 1990 average state SAT
score, the increase in the 2005 average state SAT score is between .960 and 1.186.
10.42

From Exercise 10.18, SSxy = 130.44167, 1 = -0.002310625, and SSxx = 56,452.95833.

y = 135.8

y = 769.72
( y ) = 769.72 135.8

SS yy = y

24

= 1.3183333

SSE = SS yy 1SS xy = 1.3183333 ( 0.002310625)(130.44167) = 1.016931516


SSE 1.016931516
=
= 0.046224159 and s = 0.046224159 = 0.214998
n2
24 2
MSE
0.214998
s =
=
= 0.0009049
1
SSxx
56, 452.95833

s 2 = MSE =

For confidence coefficient .95, = .05 and /2 = .05/2 = .025. From Table VI, Appendix B,
with df = n 2 = 24 2 = 22, t.025 = 2.074. The confidence interval is:

1 t.025 s 0.0023 2.074(0.0009049)


1

0.0023 0.0019 (0.0042, 0.0004)

We are 95% confident that for each additional point increase in the amount of soluble
pectin, the mean sweetness index will decrease by between .0004 and .0042 points.
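A confidence interval for the slope has the form β̂₁ ± t(α/2)·s/√SSxx. A minimal sketch, not part of the original solution, assuming Python with scipy and using the quantities computed above:

    import math
    from scipy.stats import t as t_dist

    b1, s, ss_xx, n = -0.002310625, 0.214998, 56452.95833, 24
    se_b1 = s / math.sqrt(ss_xx)               # about 0.0009049
    t_crit = t_dist.ppf(0.975, df=n - 2)       # 2.074 for df = 22
    lower, upper = b1 - t_crit * se_b1, b1 + t_crit * se_b1
    print(round(lower, 4), round(upper, 4))    # about (-0.0042, -0.0004), matching the interval above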


10.44

a.

From Exercise 10.23, SSxy = -787.51087, SSxx = 6,906.6087,

y = 60.1 ,

= 262.271 , and 1 = 0.114022801 .

( y)

(60.1) 2
23
n
= 262.271 157.043913 = 105.227087

SS yy = y

= 262.271

SSE = SS yy 1SS xy = 105.227087 ( 0.114022801)( 787.51087) = 15.43289179


s 2 = MSE =

s =
1

SSE 15.43289179
=
= 0.734899609 and s = 0.734899609 = 0.8573
n2
23 2

MSE

SS xx

0.734899609
6,906.6087

= 0.010315

To determine if the mass of the spill tends to diminish linearly as time increases, we test:
H0: 1 = 0
Ha: 1 < 0
The test statistic is t =

1 0
s

0.114022801
= 11.05
0.010315

The rejection region requires = .05 in the lower tail of the t-distribution with
df = n 2 = 23 2 = 21. From Table VI, Appendix B, t.05 = 1.721. The rejection
region is t < 1.721.
Since the observed value of the test statistic falls in the rejection region
(t = 11.05 < 1.721), H0 is rejected. There is sufficient evidence to indicate the mass
of the spill tends to diminish linearly as time increases at = .05.
b.

For confidence coefficient .95, = .05 and /2 = .05/2 = .025. From Table VI,
Appendix B, with df = n 2 = 23 2 = 21, t.025 = 2.080. The 95% confidence interval
is:

1 t.025 s 0.1140 2.080(0.010315) 0.1140 0.02146


1

(0.13546, -0.09254)
We are 95% confident that for each additional minute of elapsed time, the decrease
in spill mass is between 0.13546 and 0.09254.


10.46

a.

Using MINITAB, the scattergram is:

It appears from the plot that as the percentage of the population that is minority increases,
the number of people per branch bank tends to increase.
b.

The value of 1 will be positive. As one variable increases, the other tends to increase.

c.

x = 363.8

y
x=

y = 56,560

xy = 1,075,763

= 9,020.86

= 158,763,894

x = 363.8 = 17.32380952
n

SSxy =

21

xy

y=

x = 56, 560 = 2,693.33333


n

21

( x )( y ) = 1, 075, 763 363.8(56, 560)

n
21
= 1,075,763 979,834.6667 = 95,928.3333

( x)

363.82
n
21
= 9,020.86 6,302.401905 = 2,718.458095

SSxx =

1 =

SS xy

SS xx

= 9,020.86

95, 928.3333
= 35.28777342 35.288
2, 718.458095

( y)

56, 5602
n
21
= 158,763,894 - 152,334,933.3 = 6,428,960.7

SSyy =

= 158,863,894

SSE = SSyy 1 SSxy = 6,428,960.7 35.28777342(95,928.3333)


= 6,428,960.7 3,385,097.29 = 3,043,863.41
s2 =

SSE 3, 043,863.41
=
= 160,203.3374
n2
21 2


s=

2
s = 160, 203.3374 = 400.2541

To determine if the data support the charge made against the New Jersey banking
community, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

35.288 0
400.2541

= 4.597

2, 718.458095
The rejection region requires /2 =.01/2 = .005 in each tail of the t-distribution with
df = n 2 = 21 2 = 19. From Table VI, Appendix B, t.005 = 2.861. The rejection region
is t < 2.861 or t > 2.861.
Since the observed value of the test statistic falls in the rejection region (t = 4.597 >
2.861), H0 is rejected. There is sufficient evidence to support the charge made against the
New Jersey banking community at = .01.
10.48

a.

b.

Using MINITAB, the regression analysis is:


Regression Analysis: Index versus Interactions
The regression equation is
Index = 44.1 + 0.237 Interactions
Predictor
Constant
Interact
S = 19.40

Coef
44.130
0.2366

SE Coef
9.362
0.1865

R-Sq = 8.6%

T
4.71
1.27

P
0.000
0.222

R-Sq(adj) = 3.3%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
17
18

SS
606.0
6400.6
7006.6

MS
606.0
376.5

F
1.61

P
0.222

From the printout, the least squares line is y = 44.13 + .2366x.


c.

From the printout, s = 19.40


The standard deviation s represents the spread of the manager success index about the
least squares line. Approximately 95% of the manager success indexes should lie within
2s = 2(19.40) = 38.8 of the least squares line.

d.

Refer to the scattergram in part a. The number of interactions with outsiders might
contribute some information in the prediction of managerial success, but it does not look
like a very strong relationship.

e.

To determine if the number of interactions contributes information for the prediction of


managerial success, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 1.27

The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df =
n 2 = 19 2 = 17. From Table VI, Appendix B, t.025 = 2.110. The rejection region is
t > 2.110 or t < 2.110.
Since the observed value of the test statistic does not fall in the rejection region (t = 1.27
>/ 2.110), H0 is not rejected. There is insufficient evidence to indicate the number of
interactions contributes information for the prediction of managerial success at = .05.
f.

For confidence coefficient .95, = 1 .95 = .05 and /2 = .05/2 = .025. From Table VI,
Appendix B, with df = 17, t.025 = 2.110. The 95% confidence interval is:

1 t.025 s .2366 2.110(.1865) .2366 .3935 (.1569, .6301)


1

We are 95% confident the change in the mean manager success index for each additional
interaction with outsiders is between .1569 and .6301.
10.50

a.

Using MINITAB, the regression analysis is:


Regression Analysis: Risk versus Credit
The regression equation is
Risk = 56.2 - 0.400 Credit
Predictor
Constant
Credit

Coef
56.215
-0.39961

S = 12.6777

SE Coef
6.033
0.09152

R-Sq = 33.4%

T
9.32
-4.37

P
0.000
0.000

R-Sq(adj) = 31.7%

Analysis of Variance
Source
Regression
Residual Error
Total


DF
1
38
39

SS
3064.4
6107.5
9171.9

MS
3064.4
160.7

F
19.07

P
0.000


To determine if country credit risk contributes information for the prediction of market
volatility, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 4.37 (from printout).

The p-value is .000. Since the p-value is so small, there is strong evidence to indicate
that country credit risk contributes information for the prediction of market volatility at
> .000.
b.

Using MINITAB, a scattergram of the data with the fitted regression line is:

[Regression plot: Risk = 56.22 − .3996 Credit;  S = 12.6777,  R-Sq = 33.4%,  R-Sq(adj) = 31.7%;  x-axis: Credit, y-axis: Risk]

From the plot, there appears to be several outliers. Observations 1, 19, 34, and 36 have
arrows pointing at them.


c.

Eliminating those four data points and using MINITAB, the regression analysis is as
follows:
The regression equation is
Risk = 48.9 - 0.316 Credit
Predictor
Constant
Credit

Coef
48.891
-0.31599

s = 7.46401

Stdev
3.991
0.05883

R-sq = 45.9%

t-ratio
12.25
-5.37

p
0.000
0.000

R-sq(adj) = 44.3%

Analysis of Variance
SOURCE
Regression
Error
Total
Unusual
Obs.
4
25
27

DF
1
34
35

SS
1607.4
1894.2
3501.6

Observations
C2
C1
35.1
63.70
25.3
23.30
55.6
46.40

MS
1607.4
55.7

Fit Stdev.Fit
37.80
2.13
40.90
2.62
31.32
1.35

F
28.85

Residual
25.90
-17.60
15.08

p
0.000

St.Resid
3.62R
-2.52R
2.05R

R denotes an obs. with a large st. resid.

After eliminating the four data points, the regression analysis is very similar. The fitted
regression line is:

y = 48.891 .31599x
To determine if country credit risk contributes information for the prediction of market
volatility, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 5.37 (from printout).

The p-value is .000. Since the p-value is so small, there is strong evidence to indicate that
country credit risk contributes information for the prediction of market volatility at
> .000.
The standard error for the analysis when the four data points have been removed (s = 7.464)
is much smaller than the standard error with all the data points (s = 12.6777).


10.52

10.54

a.

r = 1 implies x and y are perfectly, positively related.

b.

r = 1 implies x and y are perfectly, negatively related.

c.

r = 0 implies x and y are not related.

d.

r = .90 implies x and y are positively related. Since r is close to 1, the strength of the
relationship is very high.

e.

r = .10 implies x and y are positively related. Since r is close to 0, the relationship is
fairly weak.

f.

r = .88 implies x and y are negatively related. Since r is close to 1, the relationship is
fairly strong.

a.

Some preliminary calculations are:

x =0
y = 12
SSxy =

x = 10 xy = 20
y = 70
x y = 20 0(12) = 20
xy
2

SSxx =

SSyy =

r=

( x)

( y)

SS xy

SS xxSS yy

= 10

0
= 10
5

= 70

122
= 41.2
5

20
10(41.2)

= .9853

r2 = .98532 = .9709
Since r = .9853, there is a very strong positive linear relationship between x and y.
Since r2 = .9709, 97.09% of the total sample variability around the sample mean response
is explained by the linear relationship between x and y.


b.

Some preliminary calculations are:

x =0
y = 16
SSxy =

x = 10
xy = 15
y = 74
x y = 15 0(16) = 15
xy
2

SSxx =

SSyy =

r=

( x)

( y)

02
= 10
5

= 74

162
= 22.8
5

SS xy

= 10

15

10(22.8)
SS xxSS yy
2
2
r = (.9934) = .9868

= .9934

Since r = .9934, there is a very strong negative linear relationship between x and y.
Since r2 = .9868, 98.68% of the total sample variability around the sample mean response
is explained by the linear relationship between x and y.
c.

Some preliminary calculations are:

x = 18
y = 14
SSxy =

x = 52 xy = 36
y = 32
x y = 36 18(14) = 0
xy
2

SSxx =

SSyy =


( x)

( y)

= 52

182
= 5.71428571
7

= 32

142
=4
7


SS xy

r=

5.71428571(4)

SS xxSS yy

=0

r2 = 02 = 0
Since r = 0, this implies that x and y are not linearly related.
Since r2 = 0, 0% of the total sample variability around the sample mean response is
explained by the linear relationship between x and y.

d.

Some preliminary calculations are:

x = 15
y =4
SSxy =

x = 71 xy = 12
y =6
x y = 12 15(4) = 0
xy
2

SSxx =

x2

SSyy =

r=

( x)

( y)

SS xy

SS xxSS yy
2
2
r =0 =0

= 71

152
= 26
5

=6

42
= 2.8
5

0
26(2.8)

=0

Since r = 0, this implies that x and y are not linearly related.


Since r2 = 0, 0% of the total sample variability around the sample mean response is
explained by the linear relationship between x and y.
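Each part uses r = SSxy/√(SSxx·SSyy) and the coefficient of determination r². A minimal sketch, not part of the original solution, assuming Python with numpy:

    import numpy as np

    def corr_from_sums(ss_xy, ss_xx, ss_yy):
        """r = SSxy / sqrt(SSxx * SSyy); r**2 is the coefficient of determination."""
        r = ss_xy / np.sqrt(ss_xx * ss_yy)
        return r, r ** 2

    print(corr_from_sums(20.0, 10.0, 41.2))    # part a: r about  .9853, r^2 about .9709
    print(corr_from_sums(-15.0, 10.0, 22.8))   # part b: r about -.9934, r^2 about .9868
    print(corr_from_sums(0.0, 26.0, 2.8))      # part d: r = 0, r^2 = 0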


10.56

10.58

10.60

a.

From the printout, r2 = R-Sq = 89.3%. 89.3% of the total sample variability around the
sample mean asking price is explained by the linear relationship between asking
price
and number of carats for diamond.

b.

r = r 2 = .893 = .945. The value of r has the same sign as 1 , which is positive. Since
r is very close to 1, there is a strong positive linear relationship between
asking price
and number of carats for diamond.

a.

Since r = .43, there is a fairly weak positive linear relationship between total time
allotted to sports and audience rating.

b.

r2 = .432 = .1849. Since r2 = .1849, 18.49% of the total sample variability around the
sample mean audience rating is explained by the linear relationship between audience
rating and total time allocated to sports.

a.

Using MINITAB, a scattergram of the data is:


[Scatterplot of NetWorth vs Age;  x-axis: Age, y-axis: NetWorth]

There appears to be a slight increase in the Net Worth as age increases, but the
relationship is fairly weak.
b.

Some preliminary calculations are:

x = 859

y = 303.8

x 2 = 53,567

y 2 = 8, 202.28

SS xy = xy

xy = 17,841.6

( x )( y ) = 17,841.6 859(303.8)

15
n
= 17,841.6 17,397.61333 = 443.98667


( x)

(859) 2
15
n
= 53,567 49,192.06667 = 4,374.93333

SS xx = x

= 53,567

( y)

(303.8) 2
15
n
= 8, 202.28 6,152.962667 = 2,049.317333

SS yy = y

1 =

r=

SSxy
SS xx

= 8, 202.28

443.98667
= 0.101484213 0.1015
4,374.93333

SSxy
SSxx SS yy

443.98667
= .1483
4,374.93333 2,049.317333

Since r is positive, there is a very weak positive linear relationship between a


persons net worth and his/her age.
c.

If r had a negative sign, the interpretation would be:


Since r is negative, there is a very weak negative linear relationship between a
persons net worth and his/her age.

10.62

From Exercises 10.23 and 10.44, SSxy = -787.51087, SSxx = 6,906.6087, and
SSyy = 105.227087.

r=

SSxy
SSxx SS yy

787.51087
= .924
6,906.6087 105.227087

There is a very strong negative linear relationship between mass of spill and elapsed
time of the spill.

r 2 = .9242 = .854 Approximately 85.4% of the variability in the mass of the spill
around the sample mean is explained by the linear relationship between mass of the spill
and elapsed time of the spill.


10.64

a.

Using MINITAB, the scattergram is:

[Scattergram;  x-axis: Digest, y-axis: WeightChg]

b.

Some preliminary calculations are:

x = 1, 266.5
y = 1, 075.5

xy = 4,103.25 y = 46

= 57, 390.75

x y = 4,103.25 1, 266.5(46) = 2, 716.130952

SSxy = xy
SSxx = x 2
SS yy = y

1 =
r=

SSxy
SSxx

( x)

42

= 57, 390.75

( y)

= 1, 075.5

(1, 266.5) 2
= 19,199.74405
42

462
= 1, 025.119048
42

n
2, 716.130952
=
= 0.141467039
19,199.74405

SSxy
SSxx SS yy

2, 716.130952
19,199.74405 1, 025.119048

= .6122

There is a moderate positive linear relationship between digestion efficiency and


weight change.
c.

To determine whether weight change is correlated with digestion, we test:


H0: = 0
Ha: 0
The test statistic is t =


r
1 r
n2
2

.6122
1 .61222
42 2

= 4.90


The rejection region requires /2 = .01/2 = .005 in each tail of the t-distribution with
df = n 2 = 42 2 = 40. From Table VI, Appendix B, t.005 = 2.704. The rejection
region is t > 2.704 or t < 2.704.
Since the observed value of the test statistic falls in the rejection region (t = 4.90 >
2.704), H0 is rejected. There is sufficient evidence to indicate weight change and
digestion are correlated at = .01.
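The test statistic for H0: ρ = 0 is t = r√((n − 2)/(1 − r²)). A minimal sketch, not part of the original solution, assuming Python with scipy and the values from part c:

    import math
    from scipy.stats import t as t_dist

    r, n = 0.6122, 42
    t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))   # about 4.90
    t_crit = t_dist.ppf(0.995, df=n - 2)             # 2.704 for a two-tailed test at alpha = .01
    print(round(t_stat, 2), round(t_crit, 3))
    # t = 4.90 exceeds 2.704, so H0: rho = 0 is rejected, as above.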
d.

After deleting the data corresponding to duck chow, the preliminary calculations are:

x = 701.50
SS xy = xy

SS xx = x 2
SS yy = y

= 21, 069

xy = 99.5 y = 18 y

= 404.00

x y = 99.5 701.50(18) = 482.1363636


n

( x)

33

= 21, 069

( y)

= 404

(701.50) 2
= 6,156.81061
33

(18) 2
= 394.1818182
33

n
482.1363636
1 =
=
= 0.078309435
SSxx 6,156.81061
SSxy

r=

SSxy
SSxx SS yy

482.1363636

= .3095

6,156.81061 394.1818182

There is a rather weak positive linear relationship between digestion efficiency and
weight change.
To determine whether weight change is correlated with digestion, we test:
H0: = 0
Ha: 0
The test statistic is t =

.3095

= 1.81
1 r2
1 .30952
n2
33 2
The rejection region requires /2 = .01/2 = .005 in each tail of the t-distribution with
df = n 2 = 33 2 = 31. From Table VI, Appendix B, t.005 = 2.750. The rejection
region is t > 2.750 or t < 2.750.

Since the observed value of the test statistic does not fall in the rejection region
(t = 1.81 >/ 2.750), H0 is not rejected. There is insufficient evidence to indicate weight
change and digestion are correlated at = .01.


e.

Using MINITAB, the scattergram is:

[Scattergram;  x-axis: Fiber, y-axis: Digest]

Some preliminary calculations are:

x = 943.5 x
y = 57, 390.75

= 24, 533.25

xy = 21, 405.5 y = 1, 266.5

SSxy = xy
SSxx = x 2
SS yy = y

1 =
r=

SSxy
SSxx

x y = 21, 405.5 943.5(1, 266.5) = 7, 045.51786


n

( x)

42

( y)

= 24, 533.25

(943.5) 2
= 3, 338.19643
42

= 57, 390.75

1, 266.52
= 19,199.74405
42

n
7, 045.51786
=
= 2.110576177
3, 338.19643

SSxy
SSxx SS yy

7, 045.51786
3, 338.19643 19,199.74405

= .8801

There is a fairly strong negative linear relationship between digestion efficiency and
acid-detergent fiber.


To determine whether acid-detergent fiber is correlated with digestion, we test:


H0: = 0
Ha: 0
The test statistic is t =

r
1 r2
n2

.8801
1 (.8801) 2
42 2

= 11.72

The rejection region requires /2 = .01/2 = .005 in each tail of the t-distribution with
df = n 2 = 42 2 = 40. From Table VI, Appendix B, t.005 = 2.704. The rejection
region is t > 2.704 or t < 2.704.
Since the observed value of the test statistic falls in the rejection region (t = 11.72 <
2.704), H0 is rejected. There is sufficient evidence to indicate acid-detergent fiber and
digestion are correlated at = .01.
After deleting the data corresponding to duck chow, the preliminary calculations are:

x = 877 x
y = 21, 069

xy = 17, 274 y = 701.50

= 24, 036.5

x y = 17, 274 877(701.50) = 1, 368.89394

SSxy = xy

SSxx = x 2
SS yy = y

( x)

33

= 24, 036.5

( y)

= 21, 069

(877) 2
= 729.56061
33

(701.50) 2
= 6,156.81061
33

n
1, 368.89394
1 =
=
= 1.876326547
SSxx
729.56061
SSxy

r=

SSxy
SSxx SS yy

1, 368.89394

= .6459

729.56061 6,156.81061

There is a moderate negative linear relationship between digestion efficiency and


acid-detergent fiber.
To determine whether acid-detergent fiber is correlated with digestion, we test:
H0: = 0
Ha: 0


The test statistic is t =

r
1 r2
n2

.6459
1 ( .6459) 2
33 2

= 4.71

The rejection region requires /2 = .01/2 = .005 in each tail of the t-distribution with
df = n 2 = 33 2 = 31. From Table VI, Appendix B, t.005 = 2.750. The rejection
region is t > 2.750 or t < 2.750.
Since the observed value of the test statistic falls in the rejection region (t = 4.71 <
2.750), H0 is rejected. There is sufficient evidence to indicate acid-detergent fiber and
digestion are correlated at = .01.
10.66

a.

b.

Some preliminary calculations are:

= 28

SSxy =

x = 224 xy = 254 y = 37 y
x y = 254 28(37) = 106
xy
2

SSxx =

x2

SSyy =

( x)

= 307

( y)

= 224

282
= 112
7

= 307

37 2
= 111.4285714
7

106
= .946428571
SS xx 112
37
28
.946428571 = 1.5
0 = y 1 x =
7
7

1 =

SS xy

The least squares line is y = 1.5 + .946x.


c.

SSE = SSyy 1 SSxy = 111.4285714 (.946428571)(106) = 11.1071429


SSE 11.1071429
=
= 2.22143
s2 =
n2
72


d.

The form of the confidence interval is:


1 ( xp x )
y t/2s
+
SSxx
n

where s =

2
s =

For xp = 3, y = 1.5 + .946(3) = 4.338 and x =

2.22143 = 1.4904

28
=4
7

For confidence coefficient .90, = 1 .90 = .10 and /2 = .10/2 = .05. From Table VI,
Appendix B, t.05 = 2.015 with df = n 2 = 7 2 = 5.
The 90% confidence interval is:
1 (3 4)
+
4.338 1.170 (3.168, 5.508)
7
112
2

4.338 2.015(1.4904)
e.

The form of the prediction interval is:


1 ( xp x )
y t/2s 1 + +
SSxx
n

The 90% prediction interval is:


1 (3 4)
+
4.338 3.223 (1.115, 7.561)
7
112
2

4.338 2.015(1.4904) 1 +
f.

The 95% prediction interval for y is wider than the 95% confidence interval for the mean
value of y when xp = 3.
The error of predicting a particular value of y will be larger than the error of estimating
the mean value of y for a particular x value. This is true since the error in estimating the
mean value of y for a given x value is the distance between the least squares line and the
true line of means, while the error in predicting some future value of y is the sum of two
errors: the error of estimating the mean of y plus the random error that is a component of
the value of y to be predicted.
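The two intervals differ only by the extra "1 +" under the square root. A minimal sketch, not part of the original solution, assuming Python with scipy and using the quantities from parts d and e:

    import math
    from scipy.stats import t as t_dist

    y_hat, s, n, x_bar, ss_xx, xp = 4.338, 1.4904, 7, 4.0, 112.0, 3.0
    t_crit = t_dist.ppf(0.95, df=n - 2)    # about 2.015 for a 90% interval

    half_ci = t_crit * s * math.sqrt(1 / n + (xp - x_bar) ** 2 / ss_xx)      # mean of y at xp
    half_pi = t_crit * s * math.sqrt(1 + 1 / n + (xp - x_bar) ** 2 / ss_xx)  # a single new y at xp
    print(round(half_ci, 3), round(half_pi, 3))
    # about 1.170 and 3.223: the prediction interval is always wider than the
    # confidence interval for the mean, as discussed in part f.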

10.68

a.

The form of the confidence interval is:


s
y = 22 = 2.2
y t/2
where y =
n
10
n

s2 =

( y)

n 1

(22) 2
10 = 3.7333 and s = 1.9322
10 1

82

For confidence coefficient .95, = 1 .95 = .05 and /2 = .05/2 = .025. From Table VI,
Appendix B, t.025 = 2.262 with df = n 1 = 10 1 = 9. The 95% confidence interval is:
2.2 2.262


1.9322
10

2.2 1.382 (.818, 3.582)


b.

c.

The confidence intervals computed in Exercise 10.63 are much narrower than that found
in part a. Thus, x appears to contribute information about the mean value of y.

d.

From Exercise 12.63, 1 = .843, s = .8619, SSxx = 38.9, and n = 10.


H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

1 0
s

.843 0
= 6.10
.8619

SSxx

38.9

The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df =
n 2 = 10 2 = 8. From Table VI, Appendix B, t.025 = 2.306. The rejection region is t >
2.306 or t < 2.306.
Since the observed value of the test statistic falls in the rejection region (t = 6.10 > 2.306),
H0 is rejected. There is sufficient evidence to indicate the straight-line model contributes
information for the prediction of y at = .05.
10.70

10.72

a.

The 95% confidence interval for E(y) when y = .52 is (3,598.1, 3,868.1). We are
95% confident that the mean asking price for a diamond weighing .52 carats is
between $3,598.10 and $3,868.10.

b.

The 95% prediction interval for y when y = .52 is (1529.8, 5,936.3). We are 95%
confident that the actual asking price for a diamond weighing .52 carats is between
$1,529.80 and $5,936.30.

Answers may vary. One possible answer is:


The 90% confidence interval for x = 220.00 is (5.64898, 5.83848). We are 90% confident that
the mean sweetness index of all orange juice samples will be between 5.64898 and 5.83848
parts per million when the pectin value is 220.00.


10.74

a.

Using MINITAB, the results of the regression analysis are:


Regression Analysis: Managers versus UnitsSold
The regression equation is
Managers = 5.33 + 0.586 UnitsSold
Predictor
Constant
UnitsSol

Coef
5.325
0.58610

S = 2.566

SE Coef
1.180
0.03818

R-Sq = 92.9%

T
4.51
15.35

P
0.000
0.000

R-Sq(adj) = 92.5%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
18
19

SS
1552.0
118.6
1670.5

MS
1552.0
6.6

F
235.63

P
0.000

To determine the usefulness of the model, we test:


H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 15.35 (from printout).

The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df =
n 2 = 20 2 = 18. From Table VI, Appendix B, t.025 = 2.101. The rejection region is
t > 2.101 or t < 2.101.
Since the observed value of the test statistic falls in the rejection region (t = 15.35
> 2.101), H0 is rejected. There is sufficient evidence to indicate the model is useful at
= .05. Therefore, the monthly sales is useful in predicting the number of managers at
= .05.
b.

For confidence coefficient .90, = 1 .90 = .10 and /2 = .10/2 = .05. From Table VI,
Appendix B, t.05 = 1.734 with df = 18.
For xp = 39, x =

x = 540
n

20

= 27, and y = 5.325 + .5861(39) = 28.1829.

The form of the prediction interval is:


2
1 (39 27) 2
1 ( xp x )
+
28.183 1.734(2.5664) 1 +
y t/2s 1 + +
20
4, 518
n
SSxx

28.183 4.629 (23.554, 32.812)


c.


We are 90% confident the actual number of managers needed when 39 units are sold is
between 23.55 and 32.81.


10.76

a.

From Exercise 10.34, SSxx = 3000 and x = 50.


Also, for Brand A, s = 1.211; for Brand B, s = .610.
For Brand A, y = 6.62 .0727(45) = 3.349, while for Brand B, y = 9.31 .1077(45)
= 4.464.
The degrees of freedom for both brands is n 2 = 15 2 = 13. For confidence coefficient
.90, (i.e., for all parts of this question), = .10 and /2 = .05. From Table VI, Appendix
B, with df = 13, t.05 = 1.771.
The form of both confidence intervals is y t/2s

2
1 ( xp x )
+
n
SSxx

For Brand A, we obtain:


1 (45 50)
+
3.349 .587 (2.762, 3.936)
15
3000
2

3.349 1.771(1.211)
For Brand B, we obtain:

1 (45 50)
+
4.464 .296 (4.168, 4.760)
15
3000
2

4.464 1.771(.610)

The first interval is wider, caused by the larger value of s.

b.

2
1 ( xp x )
The form of both prediction intervals is y t/2s 1 + +
n
SSxx

For Brand A, we obtain:


1 (45 - 50)
3.349 1.771(1.211) 1 + +
15
3000

3.349 2.224 (1.125, 5.573)

For Brand B, we obtain:


1 (45 - 50)
4.464 1.771(.610) 1 + +
15
3000

4.464 1.120 (3.344, 5.584)

Again, the first interval is wider, caused by the larger value of s. Each of these intervals
is wider than its counterpart from part a, since, for the same x, a prediction interval for an
individual y is always wider than a confidence interval for the mean of y. This is due to
an individual observation having a greater variance than the variance of the mean of a set
of observations.


c.

To obtain a confidence interval for the life of a brand A cutting tool that is operated at
100 meters per minute, we use:
2
1 ( xp x )
y t/2s 1 + +
n
SSxx

For x = 100, y = 6.62 .0727(100) = .65.


The degrees of freedom are n 2 = 15 2 = 13. For confidence coefficient .95, = .05
and /2 = .025. From Table VI, Appendix B, with df = 13, t.025 = 2.160.
Here, we obtain:
.65 2.160(1.211) 1 +

(100 50) 2
1
+
.65 3.606 (4.256, 2.956)
15
3000

The additional assumption would be that the straight line model fits the data well for the
x's actually observed all the way up to the value under consideration, 100. Clearly from
the estimated value of .65, this is not true (usually, negative "useful lives" are not
found).
10.78

a.

b.

One possible line is y = x.


x

y - y

1
3
5

1
3
5

1
3
5

0
0
0
0

For this example

( y y ) = 0

A second possible line is y = 3.


y - y

1
3
5

1
3
5

3
3
3

2
0
2
0

For this example

( y y ) = 0


c.

Some preliminary calculations are:

x = 9 x = 35 xy = 35
y = 9 y = 35
x y = 35 9(9) = 8
SSxy = xy
n
3
( x ) = 35 9 = 8
SSxx = x
n
3
( y ) = 35 9 = 8
SSyy = y
3
n
2

2
i

1 =

SS xy
SS xx

8
=1
8

9 9
0 = 1 x = 1 = 0
3

The least squares line is y = 0 + 1x = x.


d.

For y = x, SSE = SSyy 1 SSxy = 8 1(8) = 0


For y = 3, SSE = ( yi yi ) 2 = (1 3)2 + (3 3)2 + (5 3)2 = 8
The least squares line has the smallest SSE of all possible lines.

10.80

a.

The variables x and y do appear to be related. It appears when x increases, y tends to


increase.
b.

r = r 2 = .612 = .7823
The correlation between concentration and exhaustion index is .7823. This relationship is
positive since r > 0. The relationship is fairly strong. No, this does not mean that
concentration causes emotional exhaustion. They are just related.


c.

To determine if the straight-line relationship is useful, we test:


H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 6.03

The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df =
n 2 = 25 2 = 23. From Table VI, Appendix B, t.025 = 2.069. The rejection region is
t > 2.069 or t < 2.069.
Since the observed value of the test statistic falls in the rejection region (t = 6.03 > 2.069),
H0 is rejected. There is sufficient evidence to indicate the model is useful for predicting
burnout at = .05.
d.

r2 = .612
61.2% of the sample variation of exhaustion index is explained by the linear relationship
between the exhaustion index and concentration.

e.

For confidence level .95, = .05 and /2 = .05/2 = .025. From Table VI, Appendix B,
with df = n 2 = 25 2 = 23, t.025 = 2.069. The 95% confidence interval is:

1 t.025 s 8.865 2.069(1.471) 8.865 3.043 (5.822, 11.908)


1

We are 95% confident that the change in mean exhaustion index for each unit
change in concentration is between 5.822 and 11.908.
f.

For confidence coefficient .95, = 1 .95 and /2 = .05/2 = .025. From Table VI,
Appendix B, t.025 = 2.069 with df = 23. The confidence interval is:
2
1 ( xp x )
y t/2s
where y = 29.497 + 8.865(80) = 679.703
+
n
SSxx

1 (80 68.56)
+
679.703 80.054
25
14, 026.16
(599.678, 759.757)
2

679.703 2.069(174.2074)

We are 95% confident that the interval from 599.648 to 759.757 encloses the mean
exhaustion level for all professionals who have 80% of their social contacts within their
work groups.


10.82

a.

x = 590,124
x = 27,727,637,890
xy = 1,396,503,941
y = 30,537.4
y = 73,506,140.4
( x )( y ) = 1,396,503,941 590,124(30, 537.4) = 10,284,507
SSxy = xy
13
13
( x ) = 27,727,637,890 590,124 = 939,458,250
SSxx = x
13
13
2

10, 284, 507


= .010947274 .0109
939, 458, 250
SS xx
.010947274(590,124)
30, 537.4
1 = y 1 x =
= 1852.088523 1852.089

13
13

1 =

SS xy

The least squares line is y = 1852.089 + .0109x.


b.

The plot of the data is:

c.

Based on the graph, it does not appear that the line fits the data very well. The points do
not lie very close to the line.

d.

Some preliminary calculations are:


SS yy = y

( y)

= 73, 506,140.4

(30, 537.4) 2
= 1, 772,848.19
13

SSE = SS yy 1SS xy = 1, 772,848.19 (0.010947274)(10, 284, 507) = 1, 660, 260.874

SSE 1, 660, 260.874


=
= 150, 932.8067
n2
13 2
and s = 150, 932.8067 = 388.501
s 2 = MSE =


For confidence level .95, = .05 and /2 = .05/2 = .025. From Table VI, Appendix B,
with df = n 2 = 13 2 = 11, t.025 = 2.201. The 95% confidence interval is:

1 t.025 s .0109 2.201


1

10.84

10.86

388.501
939, 458, 250

.0109 .0279 (0.0170, 0.0388)

e.

Since 0 is contained in the 95% confidence interval, there is no evidence to indicate that
there is a linear relationship between buying income and retail sales.

a.

r = .14. Because this value is close to 0, there is a very weak positive linear relationship
between math confidence and computer interest for boys.

b.

r = .33. Because this value is fairly close to 0, there is a weak positive linear relationship
between math confidence and computer interest for girls.

a.

1 = .020. For each additional 1% increase in leaves infected, the mean log of the
average number of infections per leaf is estimated to increase by .02.

b.

r2 = .816. 81.6% of the total sample variability around the sample mean log of the
average number of infections per leaf is explained by the linear relationship between the
log of the average number of infections per leaf and the percentage of leaves infected.

c.

s = .288. We would expect most of the observed values of the log of the average number
of infections per leaf to fall within 2s or 2(.288) or .576 units of their predicted values.

d.

r = .816 = .903. Because this number is close to 1, there is a fairly strong positive
linear relationship between the log of the average number of infections per leaf and the
percentage of leaves infected.

e.

To determine if there is a linear relationship between the log of the average number of
infections per leaf and the percentage of leaves infected, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t = r/√[(1 − r²)/(n − 2)] = .903/√[(1 − .816)/(100 − 2)] = 20.83

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with
df = n − 2 = 100 − 2 = 98. From Table VI, Appendix B, t.025 ≈ 1.99. The rejection region
is t < −1.99 or t > 1.99.

Since the observed value of the test statistic falls in the rejection region (t = 20.83 > 1.99),
H0 is rejected. There is sufficient evidence to indicate that there is a linear relationship
between the log of the average number of infections per leaf and the percentage of leaves
infected at α = .05.
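As a numerical check, the t statistic for testing a zero slope from a sample correlation can
be computed directly; a short Python sketch (values taken from this exercise):

    import math

    r = 0.903          # sample correlation (square root of r-squared = .816)
    n = 100            # sample size

    t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))   # test statistic for H0: beta1 = 0
    print(round(t_stat, 2))                          # about 20.8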


f.  For xp = 80%, ŷ = −.939 + .020(80) = .661. The antilog (base 10) of .661 is 4.58. Thus,
    when the percentage of leaves infected is 80%, the average number of infections per leaf
    is predicted to be 4.58.

10.88

a.  A straight-line model relating an NFL team's current value to its operating income is:
    y = β0 + β1x + ε

b.  Some preliminary calculations are:

    Σx = 1,037.6    Σy = 26,207    Σx² = 38,996.28    Σy² = 22,024,389    Σxy = 879,473.1

    x̄ = Σx/32 = 1,037.6/32 = 32.425        ȳ = Σy/32 = 26,207/32 = 818.96875

    SSxy = Σxy − (Σx)(Σy)/n = 879,473.1 − 1,037.6(26,207)/32 = 879,473.1 − 849,761.975 = 29,711.125

    SSxx = Σx² − (Σx)²/n = 38,996.28 − (1,037.6)²/32 = 38,996.28 − 33,644.18 = 5,352.1

    β̂1 = SSxy/SSxx = 29,711.125/5,352.1 = 5.551302293 ≈ 5.551

    β̂0 = ȳ − β̂1x̄ = 818.96875 − (5.551302293)(32.425) = 638.9677731 ≈ 638.968

    The fitted regression line is: ŷ = 638.968 + 5.551x

c.  β̂1 = 5.551. When operating income increases by 1 million dollars, the mean current
    value is estimated to increase by 5.551 million dollars. This is meaningful for values of
    operating income between 7.8 and 54.3 million dollars.

    β̂0 = 638.968. This has no meaning since x = 0 is not in the observed range.


d.  Some additional calculations are:

    SSyy = Σy² − (Σy)²/n = 22,024,389 − (26,207)²/32 = 22,024,389 − 21,462,714.03 = 561,674.97

    SSE = SSyy − β̂1SSxy = 561,674.97 − 5.551302293(29,711.125) = 396,739.5337

    s² = MSE = SSE/(n − 2) = 396,739.5337/(32 − 2) = 13,224.65112  and  s = √13,224.65112 = 114.9985

    To determine if a linear relationship exists between current value and operating
    income, we test:

    H0: β1 = 0
    Ha: β1 ≠ 0

    The test statistic is t = (β̂1 − 0)/(s/√SSxx) = (5.551 − 0)/(114.9985/√5,352.1) = 3.53

    No α was given, so we will use α = .05. The rejection region requires α/2 = .05/2 =
    .025 in each tail of the t-distribution with df = n − 2 = 32 − 2 = 30. From Table VI,
    Appendix B, t.025 = 2.042. The rejection region is t > 2.042 or t < −2.042.

    Since the observed value of the test statistic falls in the rejection region (t = 3.53 >
    2.042), H0 is rejected. There is sufficient evidence to indicate a significant linear
    relationship between current value and operating income at α = .05.

    r² = (SSyy − SSE)/SSyy = (561,674.97 − 396,739.5337)/561,674.97 = .29365 ≈ .294

    29.4% of the sample variation around the sample mean current value is explained by
    the linear relationship between current value and operating income.

    There is a significant linear relationship between current value and operating income.
    However, the relationship is not particularly strong.
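    For readers who want to verify these numbers, the following Python sketch (values copied
    from parts b and d; it is not part of the original solution) reproduces the t statistic and r²:

        import math

        n = 32
        b1 = 5.551302293
        SSxx, SSxy = 5_352.1, 29_711.125
        SSyy = 561_674.97

        SSE = SSyy - b1 * SSxy                 # about 396,739.5
        s = math.sqrt(SSE / (n - 2))           # root MSE, about 114.999

        t_stat = b1 / (s / math.sqrt(SSxx))    # about 3.53
        r_sq = (SSyy - SSE) / SSyy             # about .294

        print(round(t_stat, 2), round(r_sq, 3))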
10.90

a.

Using MINITAB, the regression analysis is:


Regression Analysis: BTU versus Area

The regression equation is
BTU = - 99045 + 103 Area

Predictor      Coef   SE Coef      T      P
Constant     -99045    261618  -0.38  0.709
Area         102.81     15.86   6.48  0.000

S = 628185    R-Sq = 67.8%    R-Sq(adj) = 66.1%

Analysis of Variance
Source          DF           SS           MS      F      P
Regression       1  1.65850E+13  1.65850E+13  42.03  0.000
Residual Error  20  7.89232E+12  3.94616E+11
Total           21  2.44773E+13

Predicted Values for New Observations
New Obs     Fit  SE Fit           95.0% CI            95.0% PI
1        723467  165874  (377459, 1069475)  (-631816, 2078750)

Values of Predictors for New Observations
New Obs  Area
1        8000

β̂0 (INTERCEP) = −99045      β̂1 (AREA) = 102.81
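An equivalent fit can be produced outside MINITAB; the sketch below uses Python's
statsmodels on a hypothetical data set with columns "Area" and "BTU" (the file name and
column names are assumptions, since the raw data live in the text's data file and are not
reproduced here).

    import pandas as pd
    import statsmodels.api as sm

    # df is assumed to hold the exercise's data with columns "Area" and "BTU"
    df = pd.read_csv("boilers.csv")          # hypothetical file name

    X = sm.add_constant(df["Area"])          # adds the intercept column
    model = sm.OLS(df["BTU"], X).fit()
    print(model.summary())                   # coefficients, t tests, R-squared, ANOVA F

    new = sm.add_constant(pd.DataFrame({"Area": [8000]}), has_constant="add")
    pred = model.get_prediction(new)
    print(pred.summary_frame(alpha=0.05))    # fit, 95% CI for the mean, 95% PI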
b.

To determine if energy consumption is positively linearly related to the shell area, we
test:

H0: β1 = 0
Ha: β1 > 0

The test statistic is t = 6.48 (from the printout).

The rejection region requires α = .10 in the upper tail of the t-distribution with df = n − 2
= 22 − 2 = 20. From Table VI, Appendix B, t.10 = 1.325. The rejection region is
t > 1.325.

Since the observed value of the test statistic falls in the rejection region (t = 6.48 > 1.325),
H0 is rejected. There is sufficient evidence to indicate that energy consumption is
positively linearly related to the shell area at α = .10.

c.  Since this is a one-tailed test but the output reports the p-value for a two-tailed test, the
    observed significance level is:

    ½(Prob > |T|) ≈ ½(.000) = .000

    This is the probability of observing our value of t (6.481) or anything larger if β1 = 0.


Since this probability is so small, there is strong evidence to reject H0.
d.

r2 = R-Square = .678
67.8% of the total sample variability in energy consumption around its mean is explained
by the linear relationship between energy consumption and shell area.

e.

From the printout, for xp = 8000, ŷ = 723,467.

The 95% prediction interval is (−631,816, 2,078,750).
This interval is very wide and includes negative BTUs, so it is not very useful.


10.92    Some preliminary calculations are:

    Σx = 4305    Σy = 201,558    Σxy = 76,652,695    Σx² = 1,652,025    Σy² = 3,571,211,200

a.  β̂1 = Σxy/Σx² = 76,652,695/1,652,025 = 46.39923427 ≈ 46.3992

    The least squares line is ŷ = 46.3992x.

b.  SSxy = Σxy − (Σx)(Σy)/n = 76,652,695 − 4305(201,558)/15 = 18,805,549

    SSxx = Σx² − (Σx)²/n = 1,652,025 − (4305)²/15 = 416,490

    β̂1 = SSxy/SSxx = 18,805,549/416,490 = 45.15246224 ≈ 45.1525

    β̂0 = ȳ − β̂1x̄ = 201,558/15 − 45.15246224(4305/15) = 478.4433

    The least squares line is ŷ = 478.4433 + 45.1525x.
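    A quick numerical check of both fits, computed from the summary sums above (a sketch in
    Python; the variable names are ours):

        n = 15
        sum_x, sum_y = 4305, 201_558
        sum_x2, sum_xy = 1_652_025, 76_652_695

        # (a) no-intercept fit: slope = sum(xy) / sum(x^2)
        b1_origin = sum_xy / sum_x2                # about 46.3992

        # (b) fit with an intercept
        SSxy = sum_xy - sum_x * sum_y / n          # about 18,805,549
        SSxx = sum_x2 - sum_x ** 2 / n             # about 416,490
        b1 = SSxy / SSxx                           # about 45.1525
        b0 = sum_y / n - b1 * sum_x / n            # about 478.44

        print(b1_origin, b1, b0)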


c.


Because x = 0 is not in the observed range, we are trying to represent the data on the
observed interval with the best fitting line. We are not concerned with whether the line
goes through (0, 0) or not.


d.  Some preliminary calculations are:

    SSyy = Σy² − (Σy)²/n = 3,571,211,200 − (201,558)²/15 = 862,836,042

    SSE = SSyy − β̂1SSxy = 862,836,042 − 45.15246224(18,805,549) = 13,719,200.88

    s² = SSE/(n − 2) = 13,719,200.88/(15 − 2) = 1,055,323.145     s = 1027.2892

    H0: β0 = 0
    Ha: β0 ≠ 0

    The test statistic is t = (β̂0 − 0)/[s√(1/n + x̄²/SSxx)] = 478.443/[1027.2892√(1/15 + 287²/416,490)] = .906

    The rejection region requires α/2 = .10/2 = .05 in each tail of the t-distribution with
    df = n − 2 = 15 − 2 = 13. From Table VI, Appendix B, t.05 = 1.771. The rejection region
    is t < −1.771 or t > 1.771.

    Since the observed value of the test statistic does not fall in the rejection region (t = .906
    ≯ 1.771), H0 is not rejected. There is insufficient evidence to indicate β0 is different
    from 0 at α = .10. Thus, β0 should not be included in the model.
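    The intercept test statistic can be reproduced with a few lines of Python (a sketch using the
    quantities computed above; the standard-error formula for β̂0 is the usual one for simple
    linear regression):

        import math

        n = 15
        b0 = 478.4433
        s = 1027.2892
        x_bar = 4305 / 15                 # 287
        SSxx = 416_490

        se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / SSxx)   # standard error of the intercept
        t_stat = b0 / se_b0                                # about .906
        print(round(t_stat, 3))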
10.94

Answers may vary. Possible answer:


The scaffold-drop survey provides the most accurate estimate of spall rate in a given wall
segment. However, the drop areas were not selected at random from the entire complex; rather,
drops were made at areas with high spall concentrations. Therefore, if the photo spall rates
could be shown to be related to drop spall rates, then the 83 photo spall rates could be used to
predict what the drop spall rates would be.
a.

Construct a scattergram for the data.

The scattergram shows a positive relationship between the photo spall rate (x) and the
drop spall rate (y).


b.

Find the prediction equation for drop spall rate. The MINITAB output shows the results
of the analysis.
The regression equation is
drop = 2.55 + 2.76 photo

Predictor     Coef   StDev      T      P
Constant     2.548   1.637   1.56  0.154
photo       2.7599  0.2180  12.66  0.000

S = 4.164    R-Sq = 94.7%    R-Sq(adj) = 94.1%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       1  2777.5  2777.5  160.23  0.000
Residual Error   9   156.0    17.3
Total           10  2933.5

Unusual Observations
Obs  photo   drop    Fit  StDev Fit  Residual  St Resid
 11   11.8  43.00  35.11       1.97      7.89     2.15R

R denotes an observation with a large standardized residual


y = 2.55 + 2.76x
c.

Conduct a formal statistical hypothesis test to determine if the photo spall rates contribute
information for the prediction of drop spall rates.

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 12.66, with p-value < .0001.

Reject H0 for any level of significance α ≥ .0001. There is sufficient evidence to indicate that
photo spall rates contribute information for the prediction of drop spall rates for any α ≥ .0001.

d.


One could now use the 83 photo spall rates to predict values for 83 drop spall rates.
Then use this information to estimate the true spall rate at a given wall segment and
estimate the total spall damage.


Multiple Regression and
Model Building

Chapter 11

11.2

a.

β̂0 = 506.346, β̂1 = −941.900, β̂2 = −429.060

b.  ŷ = 506.346 − 941.900x1 − 429.060x2

c.  SSE = 151,016, MSE = 8883, s = 94.251

    We expect about 95% of the y-values to fall within 2s or 2(94.251) or 188.502


units of the fitted regression equation.
d.

H0: 1 = 0
Ha: 1 0

The test statistic is t =

1 0
s

941.900
= 3.42
275.08

The rejection region requires /2 = .05/2 = .025 in each tail of the t distribution with
df = n (k + 1) = 20 - (2 + 1) = 17. From Table VI, Appendix B, t.025 = 2.110. The
rejection region is t < 2.110 or t > 2.110.
Since the observed value of the test statistic falls in the rejection region (t = 3.42 <
2.110), H0 is rejected. There is sufficient evidence to indicate 1 0 at = .05.
e.

For confidence coefficient .95, = .05 and /2 = .025. From Table VI, Appendix
B, with df = n (k + 1) = 20 (2 + 1) = 17, t.025 = 2.110. The 95% confidence
interval is:

2 t.025 s 429.060 2.110(379.83) 429.060 801.441


2

(1230.501, 372.381)
f.

R2 = R-Sq = 45.9% . 45.9% of the total sample variation of the y values is explained
by the model containing x1 and x2.
R2a = R-Sq(adj) = 39.6%. 39.6% of the total sample variation of the y values is
explained by the model containing x1 and x2, adjusted for the sample size and the
number of parameters in the model.


g.  To determine if at least one of the independent variables is significant in predicting y,
    we test:

    H0: β1 = β2 = 0
    Ha: At least one βi ≠ 0

    From the printout, the test statistic is F = 7.22.

    Since no α level was given, we will choose α = .05. The rejection region requires
    α = .05 in the upper tail of the F-distribution with ν1 = k = 2 and ν2 = n − (k + 1)
    = 20 − (2 + 1) = 17. From Table IX, Appendix B, F.05 = 3.59. The rejection region is
    F > 3.59.

    Since the observed value of the test statistic falls in the rejection region
    (F = 7.22 > 3.59), H0 is rejected. There is sufficient evidence to indicate at least
    one of the variables, x1 or x2, is significant in predicting y at α = .05.

h.  The observed significance level of the test is p-value = 0.005. Since the
    p-value is so small, we will reject H0 for most reasonable values of α. There is
    sufficient evidence to indicate at least one of the variables, x1 or x2, is significant in
    predicting y for any α greater than 0.005.

11.4

a.  We are given β̂1 = 3.1, s_β̂1 = 2.3, and n = 25.

    H0: β1 = 0
    Ha: β1 > 0

    The test statistic is t = (β̂1 − 0)/s_β̂1 = 3.1/2.3 = 1.35

    The rejection region requires α = .05 in the upper tail of the t distribution with df =
    n − (k + 1) = 25 − (2 + 1) = 22. From Table VI, Appendix B, t.05 = 1.717. The
    rejection region is t > 1.717.

    Since the observed value of the test statistic does not fall in the rejection region (t =
    1.35 ≯ 1.717), H0 is not rejected. There is insufficient evidence to indicate β1 > 0 at
    α = .05.
b.  We are given β̂2 = .92, s_β̂2 = .27, and n = 25.

    H0: β2 = 0
    Ha: β2 ≠ 0

    The test statistic is t = (β̂2 − 0)/s_β̂2 = .92/.27 = 3.41

    The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with
    df = n − (k + 1) = 25 − (2 + 1) = 22. From Table VI, Appendix B, t.025 = 2.074. The
    rejection region is t < −2.074 or t > 2.074.

    Since the observed value of the test statistic falls in the rejection region (t = 3.41 >
    2.074), reject H0. There is sufficient evidence to indicate β2 ≠ 0 at α = .05.
c.  For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table
    VI, Appendix B, with df = n − (k + 1) = 25 − (2 + 1) = 22, t.05 = 1.717. The
    confidence interval is:

    β̂1 ± t.05 s_β̂1 ⟹ 3.1 ± 1.717(2.3) ⟹ 3.1 ± 3.949 ⟹ (−.849, 7.049)

    We are 90% confident that β1 falls between −.849 and 7.049.

d.  For confidence coefficient .99, α = 1 − .99 = .01 and α/2 = .01/2 = .005. From Table
    VI, Appendix B, with df = n − (k + 1) = 25 − (2 + 1) = 22, t.005 = 2.819. The
    confidence interval is:

    β̂2 ± t.005 s_β̂2 ⟹ .92 ± 2.819(.27) ⟹ .92 ± .761 ⟹ (.159, 1.681)

    We are 99% confident that β2 falls between .159 and 1.681.


11.6

a.  For x2 = 1 and x3 = 3,
    E(y) = 1 + 2x1 + 1 − 3(3) = 2x1 − 7

    The graph is the line E(y) = 2x1 − 7 (not reproduced here).

b.  For x2 = −1 and x3 = 1,
    E(y) = 1 + 2x1 + (−1) − 3(1) = 2x1 − 3

    The graph is the line E(y) = 2x1 − 3 (not reproduced here).

c.

They are parallel, each with a slope of 2. They have different y-intercepts.

d.

The relationship will be parallel lines.

11.8

No. There may be other independent variables that are important that have not been
included in the model, while there may also be some variables included in the model which
are not important. The only conclusion is that at least one of the independent variables is a
good predictor of y.

11.10

a.  The first-order model is: E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

b.  R² = .58. 58% of the total sample variation of the levels of trust is explained by the
    model containing the 5 independent variables.

c.  F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.58/5) / {(1 − .58)/[66 − (5 + 1)]} = 16.57

d.  The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = k
    = 5 and ν2 = n − (k + 1) = 66 − (5 + 1) = 60. From Table VIII, Appendix B, F.10 = 1.95.
    The rejection region is F > 1.95.

    Since the observed value of the test statistic falls in the rejection region
    (F = 16.57 > 1.95), H0 is rejected. There is sufficient evidence to indicate that at
    least one of the 5 independent variables is useful in the prediction of level of trust at
    α = .10.
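    This global F statistic (and an exact p-value, rather than a table lookup) can be computed
    with a small Python helper; a sketch using scipy, with the numbers from this exercise:

        from scipy import stats

        def global_f(r_sq, n, k):
            """F statistic and p-value for H0: beta_1 = ... = beta_k = 0, given R-squared."""
            f_stat = (r_sq / k) / ((1 - r_sq) / (n - (k + 1)))
            p_value = stats.f.sf(f_stat, k, n - (k + 1))   # upper-tail area
            return f_stat, p_value

        print(global_f(0.58, n=66, k=5))    # about 16.6, with a p-value far below .10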

11.12

a.  The least squares prediction equation is:

    ŷ = 3.70 + .34x1 + .49x2 + .72x3 + 1.14x4 + 1.51x5 + .26x6 − .14x7 − .10x8 − .10x9


b.  β̂0 = 3.70. This is the estimate of the y-intercept. It has no other meaning because the
    point with all independent variables equal to 0 is not in the observed range.

    β̂1 = 0.34. For each additional walk, the mean number of runs scored is estimated
    to increase by .34, holding all other variables constant.

    β̂2 = 0.49. For each additional single, the mean number of runs scored is estimated to
    increase by .49, holding all other variables constant.

    β̂3 = 0.72. For each additional double, the mean number of runs scored is
    estimated to increase by .72, holding all other variables constant.

    β̂4 = 1.14. For each additional triple, the mean number of runs scored is estimated
    to increase by 1.14, holding all other variables constant.

    β̂5 = 1.51. For each additional home run, the mean number of runs scored is
    estimated to increase by 1.51, holding all other variables constant.

    β̂6 = 0.26. For each additional stolen base, the mean number of runs scored is
    estimated to increase by .26, holding all other variables constant.

    β̂7 = −0.14. For each additional time a runner is caught stealing, the mean number
    of runs scored is estimated to decrease by .14, holding all other variables constant.

    β̂8 = −0.10. For each additional strikeout, the mean number of runs scored is
    estimated to decrease by .10, holding all other variables constant.

    β̂9 = −0.10. For each additional out, the mean number of runs scored is estimated
    to decrease by .10, holding all other variables constant.
c.  H0: β7 = 0
    Ha: β7 < 0

    The test statistic is t = (β̂7 − 0)/s_β̂7 = (−.14 − 0)/.14 = −1.00

    The rejection region requires α = .05 in the lower tail of the t-distribution with df
    = n − (k + 1) = 234 − (9 + 1) = 224. From Table VI, Appendix B, t.05 = 1.645. The
    rejection region is t < −1.645.

    Since the observed value of the test statistic does not fall in the rejection region
    (t = −1.00 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate
    that the mean number of runs decreases as the number of runners caught stealing
    increases, holding all other variables constant, at α = .05.


d.  For confidence level .95, α = .05 and α/2 = .05/2 = .025. From Table VI, Appendix
    B, with df = 224, t.025 ≈ 1.96. The 95% confidence interval is:

    β̂5 ± t.025 s_β̂5 ⟹ 1.51 ± 1.96(.05) ⟹ 1.51 ± 0.098 ⟹ (1.412, 1.608)

    We are 95% confident that the mean number of runs will increase by anywhere from
    1.412 to 1.608 for each additional home run, holding all other variables constant.
11.14

a.  R² = .31. 31% of the total sample variation of the natural log of the level of CO2
    emissions in 1996 is explained by the model containing the 7 independent variables.

b.  The test statistic is F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.31/7) / {(1 − .31)/[66 − (7 + 1)]} = 3.72

    The rejection region requires α = .01 in the upper tail of the F-distribution with ν1 = k
    = 7 and ν2 = n − (k + 1) = 66 − (7 + 1) = 58. From Table XI, Appendix B, F.01 = 2.95.
    The rejection region is F > 2.95.

    Since the observed value of the test statistic falls in the rejection region
    (F = 3.72 > 2.95), H0 is rejected. There is sufficient evidence to indicate that at
    least one of the 7 independent variables is useful in the prediction of the natural log
    of the level of CO2 emissions in 1996 at α = .01.

c.  To determine if foreign investments in 1980 is a useful predictor of CO2 emissions in
    1996, we test:

    H0: β1 = 0
    Ha: β1 ≠ 0

d.  The test statistic is t = 2.52 and the p-value is p < 0.05. Since the observed p-value is
    less than α (p < .05), H0 is rejected. There is sufficient evidence to indicate foreign
    investments in 1980 is a useful predictor of CO2 emissions in 1996 at α = .05.

11.16

a.  From MINITAB, the output is:

    Regression Analysis: DDT versus Mile, Length, Weight

    The regression equation is
    DDT = - 108 + 0.0851 Mile + 3.77 Length - 0.0494 Weight

    Predictor      Coef   SE Coef      T      P
    Constant    -108.07     62.70  -1.72  0.087
    Mile        0.08509   0.08221   1.03  0.302
    Length        3.771     1.619   2.33  0.021
    Weight     -0.04941   0.02926  -1.69  0.094

    S = 97.48    R-Sq = 3.9%    R-Sq(adj) = 1.8%

    Analysis of Variance
    Source           DF       SS     MS     F      P
    Regression        3    53794  17931  1.89  0.135
    Residual Error  140  1330210   9501
    Total           143  1384003

    The least squares prediction equation is:

    ŷ = −108.07 + 0.08509x1 + 3.771x2 − 0.04941x3


b.

s = 97.48. We would expect about 95% of the observed values of DDT level to fall
within 2s or 2(97.48) = 194.96 units of their least squares predicted values.

c.

To determine if at least one of the variables is useful in predicting the DDT level, we
test:

H0: β1 = β2 = β3 = 0
Ha: At least one βi ≠ 0

The test statistic is F = 1.89 and the p-value is p = .135. Since the p-value is not less
than α = .05 (p = .135 ≮ .05), H0 is not rejected. There is insufficient evidence to
indicate at least one of the variables is useful in predicting the DDT level at α = .05.
d.

To determine if DDT level increases as length increases, we test:


H0: β2 = 0
Ha: β2 > 0

The test statistic is t = 2.33.

The p-value is p = .021/2 = .0105. Since the p-value is less than α (p = .0105 < .05),
H0 is rejected. There is sufficient evidence to indicate that DDT level increases as
length increases, holding the other variables constant, at α = .05.

The observed significance level is p = .0105.
e.

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI,
Appendix B, with df = n − (k + 1) = 144 − 4 = 140, t.025 ≈ 1.96. The 95% confidence
interval is:

β̂3 ± t.025 s_β̂3 ⟹ −0.04941 ± 1.96(0.02926) ⟹ −0.04941 ± 0.05735 ⟹ (−0.10676, 0.00794)

We are 95% confident that the mean DDT level will change by anywhere from −0.10676 to
0.00794 for each additional unit increase in weight, holding length and mile constant.
Since 0 is in the interval, there is no evidence that weight and DDT level are linearly
related.


11.18

a.

From MINITAB, the output is:


Regression Analysis: WeightChg versus Digest, Fiber

The regression equation is
WeightChg = 12.2 - 0.0265 Digest - 0.458 Fiber

Predictor      Coef  SE Coef      T      P
Constant     12.180    4.402   2.77  0.009
Digest     -0.02654  0.05349  -0.50  0.623
Fiber       -0.4578   0.1283  -3.57  0.001

S = 3.519    R-Sq = 52.9%    R-Sq(adj) = 50.5%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       2   542.03  271.02  21.88  0.000
Residual Error  39   483.08   12.39
Total           41  1025.12

ŷ = 12.2 − .0265x1 − .458x2


b.  β̂0 = 12.2 = the estimate of the y-intercept

    β̂1 = −.0265. We estimate that the mean weight change will decrease by .0265% for
    each additional 1% increase in digestion efficiency, with acid-detergent fibre held
    constant.

    β̂2 = −.458. We estimate that the mean weight change will decrease by .458% for
    each additional 1% increase in acid-detergent fibre, with digestion efficiency held
    constant.
c.

To determine if digestion efficiency is a useful predictor of weight change, we test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = −.50. The p-value is p = .623. Since the p-value is greater than
α (p = .623 > .01), H0 is not rejected. There is insufficient evidence to indicate that
digestion efficiency is a useful linear predictor of weight change at α = .01.

d.

For confidence coefficient .99, α = 1 − .99 = .01 and α/2 = .01/2 = .005. From Table
VI, Appendix B, with df = n − (k + 1) = 42 − (2 + 1) = 39, t.005 ≈ 2.704. The 99%
confidence interval is:

β̂2 ± t.005 s_β̂2 ⟹ −.4578 ± 2.704(.1283) ⟹ −.4578 ± .3469 ⟹ (−.8047, −.1109)

We are 99% confident that the change in mean weight change for each unit change in
acid-detergent fiber, holding digestion efficiency constant, is between −.8047% and
−.1109%.


e.

R² = R-Sq = 52.9%. 52.9% of the total sample variance of the weight changes is
explained by the model containing the 2 independent variables, digestion efficiency and
acid-detergent fiber.

Ra² = R-Sq(adj) = 50.5%. 50.5% of the total sample variance of the weight changes is
explained by the model containing the 2 independent variables, digestion efficiency
and acid-detergent fiber, adjusting for the sample size and the number of parameters in
the model.

f.

To determine if at least one of the variables is useful in predicting weight change, we
test:

H0: β1 = β2 = 0
Ha: At least one βi ≠ 0

The test statistic is F = 21.88 and the p-value is p = .000. Since the p-value is less
than α = .05 (p = .000 < .05), H0 is rejected. There is sufficient evidence to indicate at
least one of the variables is useful in predicting weight change at α = .05.

11.20

a.

The least squares prediction equation is:

ŷ = 4.30 − .002x1 + .336x2 + .384x3 + .067x4 − .143x5 + .081x6 + .134x7

b.

To determine if the model is adequate, we test:

H0: β1 = β2 = β3 = β4 = β5 = β6 = β7 = 0
Ha: At least one βi ≠ 0, i = 1, 2, 3, ..., 7

The test statistic is F = 111.1 (from the table).

Since no α was given, we will use α = .05. The rejection region requires α = .05 in
the upper tail of the F-distribution with ν1 = k = 7 and ν2 = n − (k + 1) = 268 − (7 + 1)
= 260. From Table IX, Appendix B, F.05 ≈ 2.01. The rejection region is F > 2.01.

Since the observed value of the test statistic falls in the rejection region (F = 111.1 >
2.01), H0 is rejected. There is sufficient evidence to indicate that the model is
adequate for predicting the logarithm of the audit fees at α = .05.

c.

β̂3 = .384. For each additional subsidiary of the auditee, the mean of the logarithm of
the audit fee is estimated to increase by .384 units.


d.  To determine if β4 > 0, we test:

    H0: β4 = 0
    Ha: β4 > 0

    The test statistic is t = 1.76 (from the table).

    The p-value for the test is .079. Since the p-value is not less than α (p = .079 ≮ α =
    .05), H0 is not rejected. There is insufficient evidence to indicate that β4 > 0, holding
    all the other variables constant, at α = .05.

e.  To determine if β1 < 0, we test:

    H0: β1 = 0
    Ha: β1 < 0

    The test statistic is t = 0.049 (from the table).

    The p-value for the test is .961. Since the p-value is not less than α (p = .961 ≮ α =
    .05), H0 is not rejected. There is insufficient evidence to indicate that β1 < 0, holding
    all the other variables constant, at α = .05. There is insufficient evidence to indicate
    that the new auditors charge less than incumbent auditors.

11.22    To determine if the model is useful, we test:

    H0: β1 = β2 = ⋯ = β18 = 0
    Ha: At least one βi ≠ 0, i = 1, 2, ..., 18

    The test statistic is F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.95/18) / {(1 − .95)/[20 − (18 + 1)]} = 1.06

    The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k = 18
    and ν2 = n − (k + 1) = 20 − (18 + 1) = 1. From Table IX, Appendix B, F.05 ≈ 245.9. The
    rejection region is F > 245.9.

    Since the observed value of the test statistic does not fall in the rejection region (F = 1.06
    ≯ 245.9), H0 is not rejected. There is insufficient evidence to indicate the model is adequate
    at α = .05.

    Note: Although R² is large, there are so many variables in the model that ν2 is small.


11.24

a.

From MINITAB, the output is:


Regression Analysis: Labor versus Pounds, Units, Weight

The regression equation is
Labor = 132 + 2.73 Pounds + 0.0472 Units - 2.59 Weight

Predictor     Coef  SE Coef      T      P
Constant    131.92    25.69   5.13  0.000
Pounds       2.726    2.275   1.20  0.248
Units      0.04722  0.09335   0.51  0.620
Weight     -2.5874   0.6428  -4.03  0.001

S = 9.810    R-Sq = 77.0%    R-Sq(adj) = 72.7%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       3  5158.3  1719.4  17.87  0.000
Residual Error  16  1539.9    96.2
Total           19  6698.2

Source   DF  Seq SS
Pounds    1  3400.6
Units     1   198.4
Weight    1  1559.3

The least squares equation is:

ŷ = 131.92 + 2.726x1 + .0472x2 − 2.587x3
b.

To test the usefulness of the model, we test:

H0: β1 = β2 = β3 = 0
Ha: At least one βi ≠ 0, for i = 1, 2, 3

The test statistic is F = MSR/MSE = 1719.4/96.2 = 17.87

The rejection region requires α = .01 in the upper tail of the F-distribution with
ν1 = k = 3 and ν2 = n − (k + 1) = 20 − (3 + 1) = 16. From Table XI, Appendix B,
F.01 = 5.29. The rejection region is F > 5.29.

Since the observed value of the test statistic falls in the rejection region (F = 17.87
> 5.29), H0 is rejected. There is sufficient evidence to indicate a relationship exists
between hours of labor and at least one of the independent variables at α = .01.
c.

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is t = .51. The p-value = .620. We reject H0 if the p-value < α. Since
.620 > .05, do not reject H0. There is insufficient evidence to indicate a relationship
exists between hours of labor and percentage of units shipped by truck, all other
variables held constant, at α = .05.


d.

R2 is printed as R-Sq. R2 = .770. We conclude that 77% of the sample variation of the
labor hours is explained by the regression model, including the independent variables
pounds shipped, percentage of units shipped by truck, and weight.

e.

If the average number of pounds per shipment increases from 20 to 21, the estimated
change in the mean number of hours of labor is −2.587. Thus, it will cost $7.50(2.587) =
$19.4025 less, if the variables x1 and x2 are held constant.

f.

Since s = Standard Error = 9.81, we can estimate approximately with 2s precision or


2(9.81) or 19.62 hours.

g.

No. Regression analysis only determines if variables are related. It cannot be used to
determine cause and effect.

11.26

From the printout, the 90% prediction interval is (−151.996, 175.4874). We are 90%
confident that the actual DDT level for a fish caught 100 miles upstream that is 40
centimeters long and weighs 800 grams will fall between −151.996 and 175.4874. Since the
DDT level cannot be negative, the interval would be between 0 and 175.4874.

11.28

a.

From MINITAB, the output is:


Regression Analysis: Precip versus Altitude, Latit, Coast

The regression equation is
Precip = - 102 + 0.00409 Altitude + 3.45 Latit - 0.143 Coast

Predictor      Coef   SE Coef      T      P
Constant    -102.36     29.21  -3.50  0.002
Altitude   0.004091  0.001218   3.36  0.002
Latit        3.4511    0.7949   4.34  0.000
Coast      -0.14286   0.03634  -3.93  0.001

S = 11.10    R-Sq = 60.0%    R-Sq(adj) = 55.4%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       3  4809.4  1603.1  13.02  0.000
Residual Error  26  3202.3   123.2
Total           29  8011.7

Source    DF  Seq SS
Altitude   1   730.7
Latit      1  2175.3
Coast      1  1903.4

Predicted Values for New Observations
New Obs    Fit  SE Fit        95.0% CI       95.0% PI
1        29.25    5.60  (17.75, 40.76)  (3.71, 54.80)

Values of Predictors for New Observations
New Obs  Altitude  Latit  Coast
1            6360   36.6    145

The fitted regression line is:

ŷ = −102.36 + 0.00409x1 + 3.4511x2 − 0.1429x3


b.  To determine if the first-order model is useful for predicting annual precipitation,
    we test:

    H0: β1 = β2 = β3 = 0
    Ha: At least one βi ≠ 0, i = 1, 2, 3

    The test statistic is F = 13.02 and the p-value is p = 0.000. Since the p-value is less
    than α = .05, H0 is rejected. There is sufficient evidence to indicate that the model is
    useful for predicting annual precipitation at α = .05.

c.  The prediction interval is (3.71, 54.80).

    With 95% confidence, we can conclude that the annual precipitation for an individual
    meteorological station with characteristics x1 = 6360 feet, x2 = 36.6, x3 = 145 miles
    will fall between 3.71 inches and 54.80 inches.
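    The "Fit" value in the printout can be checked by plugging the new observation into the
    fitted equation; a short Python sketch with the coefficients taken from the output above:

        b0, b1, b2, b3 = -102.36, 0.004091, 3.4511, -0.14286
        altitude, latitude, coast = 6360, 36.6, 145

        y_hat = b0 + b1 * altitude + b2 * latitude + b3 * coast
        print(round(y_hat, 2))    # about 29.25, matching the printout's Fit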

11.30    The first-order model is:

    E(y) = β0 + β1x1 + β2x2 + β3x5

    We want to find a 95% prediction interval for the actual voltage when the volume fraction
    of the disperse phase is at the high level (x1 = 80), the salinity is at the low level (x2 = 1),
    and the amount of surfactant is at the low level (x5 = 2).

    Using MINITAB, the output is:

    The regression equation is
    y = 0.933 - 0.0243 x1 + 0.142 x2 + 0.385 x5

    Predictor       Coef     StDev      T      P
    Constant      0.9326    0.2482   3.76  0.002
    x1         -0.024272  0.004900  -4.95  0.000
    x2           0.14206   0.07573   1.88  0.080
    x5           0.38457   0.09801   3.92  0.001

    S = 0.4796    R-Sq = 66.6%    R-Sq(adj) = 59.9%

    Analysis of Variance
    Source          DF       SS      MS     F      P
    Regression       3   6.8701  2.2900  9.95  0.001
    Residual Error  15   3.4509  0.2301
    Total           18  10.3210

    Source  DF  Seq SS
    x1       1  1.4016
    x2       1  1.9263
    x5       1  3.5422

    Unusual Observations
    Obs    x1      y    Fit  StDev Fit  Residual  St Resid
         40.0  3.200  2.068      0.239     1.132     2.72R

    R denotes an observation with a large standardized residual

    Predicted Values
       Fit  StDev Fit         95.0% CI          95.0% PI
    -0.098      0.232  (-0.592, 0.396)  (-1.233, 1.038)

    The 95% prediction interval is (−1.233, 1.038). We are 95% confident that the actual
    voltage is between −1.233 and 1.038 kw/cm when the volume fraction of the disperse phase
    is at the high level (x1 = 80), the salinity is at the low level (x2 = 1), and the amount of
    surfactant is at the low level (x5 = 2).
11.32

a.  E(y) = β0 + β1x1 + β2x2 + β3x1x2

b.  E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3 + β6x2x3

11.34

a.  R² = 1 − SSE/SSyy = 1 − 21/479 = .956

    95.6% of the total variability of the y values is explained by this model.

b.  To test the utility of the model, we test:

    H0: β1 = β2 = β3 = 0
    Ha: At least one βi ≠ 0, i = 1, 2, 3

    The test statistic is F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.956/3) / {(1 − .956)/[32 − (3 + 1)]} = 202.8

    The rejection region requires α = .05 in the upper tail of the F distribution, with ν1 = k
    = 3 and ν2 = n − (k + 1) = 32 − (3 + 1) = 28. From Table IX, Appendix B, F.05 = 2.95.
    The rejection region is F > 2.95.

    Since the observed value of the test statistic falls in the rejection region (F = 202.8 >
    2.95), H0 is rejected. There is sufficient evidence that the model is adequate for
    predicting y at α = .05.


c.  The relationship between y and x1 depends on the level of x2.

d.  To determine if x1 and x2 interact, we test:

    H0: β3 = 0
    Ha: β3 ≠ 0

    The test statistic is t = (β̂3 − 0)/s_β̂3 = 2.5.

    The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with
    df = n − (k + 1) = 32 − (3 + 1) = 28. From Table VI, Appendix B, t.025 = 2.048. The
    rejection region is t < −2.048 or t > 2.048.

    Since the observed value of the test statistic falls in the rejection region (t = 2.5 >
    2.048), H0 is rejected. There is sufficient evidence to indicate that x1 and x2 interact at
    α = .05.
11.36

a.  To determine if the overall model is useful for predicting y, we test:

    H0: β1 = β2 = β3 = 0
    Ha: At least one βi is not 0

    The test statistic is F = 226.35 and the p-value is p < .001. Since the p-value is less
    than α (p < .001 < .05), H0 is rejected. There is sufficient evidence to indicate the
    overall model is useful for predicting y, willingness of the consumer to shop at a
    retailer's store in the future, at α = .05.

b.  To determine if consumer satisfaction and retailer interest interact to affect
    willingness to shop at the retailer's shop in the future, we test:

    H0: β3 = 0
    Ha: β3 ≠ 0

    The test statistic is t = −3.09 and the p-value is p < .01. Since the p-value is less
    than α (p < .01 < .05), H0 is rejected. There is sufficient evidence to indicate
    consumer satisfaction and retailer interest interact to affect willingness to shop at the
    retailer's shop in the future at α = .05.
c.  When x2 = 1,

    ŷ = β̂0 + .426x1 + .044x2 − .157x1x2
      = β̂0 + .426x1 + .044(1) − .157x1(1)
      = β̂0 + .044 + (.426 − .157)x1
      = β̂0 + .044 + .269x1

    Since no value is given for β̂0, we will use β̂0 = 1 for graphing purposes. Using
    MINITAB, a graph might look like:

    [Scatterplot of YHAT vs X1 when X2 = 1]

d.

    When x2 = 7,

    ŷ = β̂0 + .426x1 + .044x2 − .157x1x2
      = β̂0 + .426x1 + .044(7) − .157x1(7)
      = β̂0 + .308 + (.426 − 1.099)x1
      = β̂0 + .308 − .673x1

    Since no value is given for β̂0, we will again use β̂0 = 1 for graphing purposes.
    Using MINITAB, a graph might look like:

    [Scatterplot of YHAT vs X1 when X2 = 7]

e.

    Using MINITAB, both plots on the same graph would be:

    [Scatterplot of YHAT vs X1 with the lines for x2 = 1 and x2 = 7 on the same axes]

Since the lines are not parallel, it indicates that interaction is present.
11.38

a.  The hypothesized regression model including the interaction between x1 and x2
    would be:

    E(y) = β0 + β1x1 + β2x2 + β3x1x2

b.  If x1 and x2 interact to affect y, then the effect of x1 on y depends on the level of x2.
    Also, the effect of x2 on y depends on the level of x1.

c.  Since the p-value is not small (p = .25), H0 is not rejected. There is insufficient
    evidence to indicate x1 and x2 interact to affect y.

d.  β1 corresponds to x1, the number ahead in line. If the negative feeling score gets
    larger as the number of people ahead increases, then β1 is positive. β2 corresponds to
    x2, the number behind in line. If the negative feeling score gets lower as the number
    of people behind increases, then β2 is negative.

11.40

a.  If client credibility and linguistic delivery style interact, then the effect of client
    credibility on the likelihood value depends on the level of linguistic delivery style.

b.  To determine the overall model adequacy, we test:

    H0: β1 = β2 = β3 = 0
    Ha: At least one βi ≠ 0

c.  The test statistic is F = 55.35 and the p-value is p < 0.0005.

    Since the p-value is so small (p < 0.0005), H0 is rejected for any reasonable value of
    α. There is sufficient evidence to indicate that the model is adequate for any α > 0.0005.

d.  To determine if client credibility and linguistic delivery style interact, we test:

    H0: β3 = 0
    Ha: β3 ≠ 0

e.  The test statistic is t = 4.008 and the p-value is p < 0.005.

    Since the p-value is so small (p < 0.005), H0 is rejected. There is sufficient evidence
    to indicate that client credibility and linguistic delivery style interact for any α > 0.005.

f.  When x1 = 22, the least squares line is:

    ŷ = 15.865 + 0.037(22) − 0.678x2 + 0.036x2(22) = 16.679 + 0.114x2

    The estimated slope of the Likelihood-Linguistic delivery style line when client
    credibility is 22 is 0.114. When client credibility is equal to 22, for each additional
    point increase in linguistic delivery style, the mean likelihood is estimated to increase
    by 0.114.

g.  When x1 = 46, the least squares line is:

    ŷ = 15.865 + 0.037(46) − 0.678x2 + 0.036x2(46) = 17.567 + 0.978x2

    The estimated slope of the Likelihood-Linguistic delivery style line when client
    credibility is 46 is 0.978. When client credibility is equal to 46, for each additional
    point increase in linguistic delivery style, the mean likelihood is estimated to increase
    by 0.978.
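    Parts f and g amount to evaluating the interaction model at fixed client-credibility values;
    the small Python sketch below (coefficients from the fitted equation above) returns the
    intercept and slope of the likelihood vs. delivery-style line for any x1:

        def likelihood_line(x1):
            """Intercept and slope in x2 of y-hat = 15.865 + 0.037*x1 - 0.678*x2 + 0.036*x1*x2."""
            intercept = 15.865 + 0.037 * x1
            slope = -0.678 + 0.036 * x1      # the interaction makes the slope depend on x1
            return intercept, slope

        print(likelihood_line(22))   # about (16.679, 0.114)
        print(likelihood_line(46))   # about (17.567, 0.978)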


11.42

a.  E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

b.  H0: β4 = 0

c.  t = 4.408, p-value = .001

    Since the p-value is so small, there is strong evidence to reject H0. There is sufficient
    evidence to indicate that the strength of the client-therapist relationship contributes
    information for the prediction of a client's reaction for any α > .001.

d.  Answers may vary.

e.  R² = .2946. 29.46% of the variability in the client's reaction scores can be explained
    by this model.

11.44

a.

β̂1 = .02. The mean level of support for a military response is estimated to increase
by .02 for each day increase in level of TV news exposure, all other
variables held constant.

b.  To determine if an increase in TV news exposure is associated with an increase in
    support for military resolution, we test:

    H0: β1 = 0
    Ha: β1 > 0

    The p-value is p = .03/2 = .015. Since the p-value is less than α (p = .015 < .05), H0 is
    rejected. There is sufficient evidence to indicate that an increase in TV news
    exposure is associated with an increase in support for military resolution, all other
    variables held constant, at α = .05.

c.  To determine if the relationship between support for military resolution and gender
    depends on political knowledge, we test:

    H0: β8 = 0
    Ha: β8 ≠ 0

    The p-value is p = .02. Since the p-value is less than α (p = .02 < .05), H0 is rejected.
    There is sufficient evidence to indicate that the relationship between support for a
    military resolution and gender depends on political knowledge, all other variables
    held constant, at α = .05.

d.  To determine if the relationship between support for military resolution and race
    depends on political knowledge, we test:

    H0: β9 = 0
    Ha: β9 ≠ 0

    The p-value is p = .08. Since the p-value is not less than α (p = .08 ≮ .05), H0 is not
    rejected. There is insufficient evidence to indicate that the relationship between
    support for a military resolution and race depends on political knowledge, all other
    variables held constant, at α = .05.
e.  R² = .194. 19.4% of the variation in the support for military resolution is
    explained by the model containing the seven independent variables
    and the two interaction terms.

f.  H0: β1 = β2 = β3 = β4 = β5 = β6 = β7 = β8 = β9 = 0
    Ha: At least one βi ≠ 0, i = 1, 2, 3, ..., 9

    The test statistic is F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.194/9) / {(1 − .194)/[1763 − (9 + 1)]} = 46.88

    The rejection region requires α = .05 in the upper tail of the F distribution with ν1 =
    k = 9 and ν2 = n − (k + 1) = 1763 − (9 + 1) = 1753. From Table IX, Appendix B, F.05
    ≈ 1.88. The rejection region is F > 1.88.

    Since the observed value of the test statistic falls in the rejection region (F = 46.88 >
    1.88), H0 is rejected. There is sufficient evidence to indicate that the model is useful
    at α = .05.
11.46

a.  H0: β2 = 0
    Ha: β2 ≠ 0

    The test statistic is t = (β̂2 − 0)/s_β̂2 = (.47 − 0)/.15 = 3.133

    The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with
    df = n − (k + 1) = 25 − (2 + 1) = 22. From Table VI, Appendix B, t.025 = 2.074. The
    rejection region is t < −2.074 or t > 2.074.

    Since the observed value of the test statistic falls in the rejection region (t = 3.133 >
    2.074), H0 is rejected. There is sufficient evidence to indicate the quadratic term
    should be included in the model at α = .05.

b.  H0: β2 = 0
    Ha: β2 > 0

    The test statistic is the same as in part a, t = 3.133.

    The rejection region requires α = .05 in the upper tail of the t distribution with df =
    22. From Table VI, Appendix B, t.05 = 1.717. The rejection region is t > 1.717.

    Since the observed value of the test statistic falls in the rejection region (t = 3.133 >
    1.717), H0 is rejected. There is sufficient evidence to indicate the quadratic curve
    opens upward at α = .05.
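    The same t statistic, with exact one- and two-sided p-values instead of table lookups, can
    be obtained in Python; a sketch with the numbers from this exercise:

        from scipy import stats

        b2, se_b2 = 0.47, 0.15
        df = 25 - (2 + 1)                    # n - (k + 1) = 22

        t_stat = b2 / se_b2                  # about 3.13
        p_two_sided = 2 * stats.t.sf(t_stat, df)
        p_upper = stats.t.sf(t_stat, df)     # for Ha: beta2 > 0

        print(round(t_stat, 3), round(p_two_sided, 4), round(p_upper, 4))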


11.48

a.  (Graph not reproduced.)

b.  It moves the graph to the right (−2x) or to the left (+2x) compared to the graph of
    y = 1 + x².

c.  It controls whether the graph opens up (+x²) or down (−x²). It also controls how steep
    the curvature is, i.e., the larger the absolute value of the coefficient of x², the
    narrower the curve is.

11.50

a.

β̂0 has no meaning because x = 0 would not be in the observed range of values. In
this case, x is the year, with values between 1984 and 1999.

b.  β̂1 = 321.67. Since the quadratic effect is included in the model, the linear term is
    just a location parameter and has no meaning.

c.  β̂2 = .0794. Since the value of β̂2 is positive, the curvature is upward.

d.

Since no data have been collected past 1999, we have no idea if the relationship
between the two variables from 1984 to 1999 will remain the same until 2021.


11.52

a.

Using MINITAB, a sketch of the least squares prediction equation is:

[Scatterplot of ŷ versus Dose (0 to 800) showing the fitted quadratic curve]

b.  For x = 500, ŷ = 10.25 + .0053(500) − .0000266(500²) = 10.25 + 2.65 − 6.65 = 6.25

c.  For x = 0, ŷ = 10.25 + .0053(0) − .0000266(0²) = 10.25

d.  For x = 100, ŷ = 10.25 + .0053(100) − .0000266(100²) = 10.25 + .53 − .266 = 10.514

    This value is slightly larger than that for the control group (10.25).

    For x = 200, ŷ = 10.25 + .0053(200) − .0000266(200²) = 10.25 + 1.06 − 1.064 = 10.246

    This value is slightly smaller than that for the control group (10.25). So, the largest
    value of x which yields an estimated weight change that is closest to, but just less than,
    the estimated weight change for the control group is x = 200.
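    Parts b through d simply evaluate the fitted quadratic at different doses; a short Python
    sketch (coefficients from the fitted model quoted above):

        def weight_change(dose):
            """Fitted quadratic: y-hat = 10.25 + .0053*dose - .0000266*dose**2."""
            return 10.25 + 0.0053 * dose - 0.0000266 * dose ** 2

        for dose in (0, 100, 200, 500):
            print(dose, round(weight_change(dose), 3))   # 10.25, 10.514, 10.246, 6.25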

11.54

a.

A first-order model is:

E(y) = β0 + β1x

b.  A second-order model is:

    E(y) = β0 + β1x + β2x²

c.

Using MINITAB, a scattergram of these data is:

[Scatterplot of International versus Domestic gross revenues]

From the plot, it appears that the first-order model might fit the data better. There
does not appear to be much of a curve to the relationship.
d.

Using MINITAB, the output is:


Regression Analysis: International versus Domestic, Dsq

The regression equation is
International = 203 - 0.58 Domestic + 0.00364 Dsq

Predictor      Coef   SE Coef      T      P
Constant      202.9     245.0   0.83  0.424
Domestic     -0.581     1.510  -0.38  0.707
Dsq        0.003638  0.002085   1.74  0.107

S = 142.696    R-Sq = 78.8%    R-Sq(adj) = 75.2%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       2   906515  453258  22.26  0.000
Residual Error  12   244345   20362
Total           14  1150860

Source    DF  Seq SS
Domestic   1  844526
Dsq        1   61990

To investigate the usefulness of the model, we test:

H0: β1 = β2 = 0
Ha: At least one βi ≠ 0, i = 1, 2

The test statistic is F = 22.26.

The p-value is p = 0.000. Since the p-value is so small, we reject H0. There is
sufficient evidence to indicate the model is useful for predicting foreign gross
revenue.

To determine if a curvilinear relationship exists between foreign and domestic gross
revenues, we test:

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is t = 1.74.

The p-value is p = 0.107. Since the p-value is greater than α = .05
(p = 0.107 > α = .05), H0 is not rejected. There is insufficient evidence to indicate
that a curvilinear relationship exists between foreign and domestic gross revenues at
α = .05.
e.  From the analysis in part d, the first-order model better explains the variation in
    foreign gross revenues. In part d, we concluded that the second-order term did not
    improve the model.

11.56

a.  (Graph not reproduced.)

b.  It moves the graph to the right (−2x) or to the left (+2x) compared to the graph of
    y = 1 + x².

c.  It controls whether the graph opens up (+x²) or down (−x²). It also controls how steep
    the curvature is, i.e., the larger the absolute value of the coefficient of x², the
    narrower the curve is.


11.58

a.

A scatterplot of the data is:

[Scatterplot of demand (y, roughly 3500 to 10500) versus day (x, 0 to 40)]

b.

From the plot, it looks like a second-order model would fit the data better than a
first-order model. There is little evidence that a third-order model would fit the data
better than a second-order model.

c.

Using MINITAB, the output for fitting a first-order model is:


The regression equation is
Y = 2752 + 122 X

Predictor     Coef  Stdev  t-ratio      p
Constant    2752.4  613.5     4.49  0.000
X           122.34  26.08     4.69  0.000

s = 1904    R-sq = 36.7%    R-sq(adj) = 35.0%

Analysis of Variance
SOURCE        DF         SS        MS      F      p
Regression     1   79775688  79775688  22.01  0.000
Error         38  137726224   3624374
Total         39  217501920

Unusual Observations
Obs.     X      Y    Fit  Stdev.Fit  Residual  St.Resid
 27   27.0   2007   6056        345     -4049    -2.16R
 40   40.0  11520   7646        591      3874     2.14R

R denotes an obs. with a large st. resid.


To see if there is a significant linear relationship between day and demand, we test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 4.69.

The p-value for the test is p = 0.000. Since the p-value is less than α = .05, H0 is
rejected. There is sufficient evidence to indicate that there is a linear relationship
between day and demand at α = .05.
d.

Using MINITAB, the output for fitting a second-order model is:


The regression equation is
Y = 5120 - 216 X + 8.25 XSQ

Predictor     Coef  Stdev  t-ratio      p
Constant    5120.2  816.9     6.27  0.000
X          -215.92  91.89    -2.35  0.024
XSQ          8.250  2.173     3.80  0.001

s = 1637    R-sq = 54.4%    R-sq(adj) = 52.0%

Analysis of Variance
SOURCE        DF         SS        MS      F      p
Regression     2  118377056  59188528  22.09  0.000
Error         37   99124856   2679050
Total         39  217501920

SOURCE  DF    SEQ SS
X        1  79775688
XSQ      1  38601372

Unusual Observations
Obs.     X     Y   Fit  Stdev.Fit  Residual  St.Resid
 27   27.0  2007  5305        357     -3298    -2.06R

R denotes an obs. with a large st. resid.

To see if there is a significant quadratic relationship between day and demand, we
test:

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is t = 3.80.

The p-value for the test is p = 0.001. Since the p-value is less than α = .05, H0 is
rejected. There is sufficient evidence to indicate that there is a quadratic relationship
between day and demand at α = .05.


e.  Since the quadratic term is significant in the second-order model in part d, the
    second-order model is better.

11.60

    The model is E(y) = β0 + β1x1 + β2x2

    where

    x1 = 1 if the variable is at level 2, 0 otherwise
    x2 = 1 if the variable is at level 3, 0 otherwise

    β0 = mean value of y when the qualitative variable is at level 1.
    β1 = difference in mean value of y between level 2 and level 1 of the qualitative variable.
    β2 = difference in mean value of y between level 3 and level 1 of the qualitative variable.
11.62

a.  The least squares prediction equation is:

    ŷ = 80 + 16.8x1 + 40.4x2

b.  β̂1 estimates the difference in the mean value of the dependent variable between level
    2 and level 1 of the independent variable.

    β̂2 estimates the difference in the mean value of the dependent variable between level
    3 and level 1 of the independent variable.
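    One way to see what the dummy coding does is to evaluate the prediction equation at each
    level; a small Python sketch (equation from part a):

        def y_hat(x1, x2):
            """Dummy-coded fit: x1 = 1 for level 2, x2 = 1 for level 3, both 0 for level 1."""
            return 80 + 16.8 * x1 + 40.4 * x2

        means = {
            "level 1": y_hat(0, 0),   # 80.0  (the intercept)
            "level 2": y_hat(1, 0),   # 96.8  (80 + 16.8)
            "level 3": y_hat(0, 1),   # 120.4 (80 + 40.4)
        }
        print(means)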
c.  The hypothesis H0: β1 = β2 = 0 is the same as H0: μ1 = μ2 = μ3.

    The hypothesis Ha: At least one of the parameters β1 and β2 differs from 0 is the same
    as Ha: At least one mean (μ1, μ2, or μ3) is different.

d.  The test statistic is F = MSR/MSE = 2059.5/83.3 = 24.72

    Since no α was given, we will use α = .05. The rejection region requires α = .05 in
    the upper tail of the test statistic with numerator df = k = 2 and denominator df = n −
    (k + 1) = 15 − (2 + 1) = 12. From Table IX, Appendix B, F.05 = 3.89. The rejection
    region is F > 3.89.

    Since the observed value of the test statistic falls in the rejection region (F = 24.72 >
    3.89), H0 is rejected. There is sufficient evidence to indicate at least one of the means
    is different at α = .05.
11.64

a.  A confidence interval for the difference of two population means could be used.
    Since both sample sizes are over 30, the large-sample confidence interval is used
    (with independent samples).

b.  Let x1 = 1 if public college, 0 otherwise

    The model is E(y) = β0 + β1x1


c.  β1 is the difference between the two population means. A point estimate for β1 is β̂1.
    A confidence interval for β1 could be used to estimate the difference in the two
    population means.

11.66

a.  Let x1 = 1 if no, 0 if yes

    The model would be E(y) = β0 + β1x1

    In this model, β0 is the mean job preference for those who responded "yes" to the
    question "Flextime of the position applied for" and β1 is the difference in the mean job
    preference between those who responded "no" to the question and those who answered
    "yes" to the question.

b.  Let x1 = 1 if referral, 0 if not
        x2 = 1 if on-premise, 0 if not

    The model would be E(y) = β0 + β1x1 + β2x2

    In this model, β0 is the mean job preference for those who responded "none" to the
    level of day care support required, β1 is the difference in the mean job preference
    between those who responded "referral" and those who responded "none", and β2 is the
    difference in the mean job preference between those who responded "on-premise" and
    those who responded "none".

c.  Let x1 = 1 if counseling, 0 if not
        x2 = 1 if active search, 0 if not

    The model would be E(y) = β0 + β1x1 + β2x2

    In this model, β0 is the mean job preference for those who responded "none" to
    spousal transfer support required, β1 is the difference in the mean job preference
    between those who responded "counseling" and those who responded "none", and β2 is
    the difference in the mean job preference between those who responded "active
    search" and those who responded "none".

d.  Let x1 = 1 if not married, 0 if married

    The model would be E(y) = β0 + β1x1

    In this model, β0 is the mean job preference for those who responded "married" to
    marital status and β1 is the difference in the mean job preference between those who
    responded "not married" and those who answered "married".


e.  Let x1 = 1 if female, 0 if male

    The model would be E(y) = β0 + β1x1

    In this model, β0 is the mean job preference for males and β1 is the difference in the
    mean job preference between females and males.

11.68

a.  β̂4 = .296. The difference in the mean value of DTVA between when the operating
    earnings are negative and lower than last year and when the operating earnings are
    not negative and lower than last year is estimated to be .296, holding all other
    variables constant.

b.  To determine if the mean DTVA for firms with negative earnings and earnings lower
    than last year exceeds the mean DTVA of other firms, we test:

    H0: β4 = 0
    Ha: β4 > 0

    The p-value for this test is p = .001/2 = .0005. Since the p-value is so small, we
    would reject H0 for α = .05. There is sufficient evidence to indicate the mean DTVA
    for firms with negative earnings and earnings lower than last year exceeds the mean
    DTVA of other firms at α = .05.

c.  Ra² = .280. 28% of the variability in the DTVA scores is explained by the model
    containing the 5 independent variables, adjusted for the number of variables in the
    model and the sample size.

11.70

a.

To determine if there is a difference in the mean monthly rate of return for T-Bills
between an expansive Fed monetary policy and a restrictive Fed monetary policy, we
test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 8.14.

Since neither n nor α is given, we cannot determine the exact rejection region. However,
we can assume that n is greater than 2 since the data used are from 1972 to 1997.
With α = .05, the critical value of t for the rejection region will be smaller than 4.303.
Thus, with α = .05, t = 8.14 will fall in the rejection region. There is sufficient
evidence to indicate a difference in the mean monthly rate of return for T-Bills
between an expansive Fed monetary policy and a restrictive Fed monetary policy at
α = .05.

However, the value of R² is .1818. The model used is explaining only 18.18% of the
variability in the monthly rate of return. This is not a particularly large value.


To determine if there is a difference in the mean monthly rate of return for Equity
REIT between an expansive Fed monetary policy and a restrictive Fed monetary
policy, we test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 3.46.

Since neither n nor α is given, we cannot determine the exact rejection region. However,
we can assume that n is greater than 4 since the data used are from 1972 to 1997.
With α = .05, the critical value of t for the rejection region will be smaller than 3.182.
Thus, with α = .05, t = 3.46 will fall in the rejection region. There is sufficient
evidence to indicate a difference in the mean monthly rate of return for Equity REIT
between an expansive Fed monetary policy and a restrictive Fed monetary policy at
α = .05.

However, the value of R² is .0387. The model used is explaining only 3.87% of the
variability in the monthly rate of return. This is a very small value.

b.  For the first model, β1 is the difference in the mean monthly rate of return for T-Bills
    between an expansive Fed monetary policy and a restrictive Fed monetary policy.
    For the second model, β1 is the difference in the mean monthly rate of return for
    Equity REIT between an expansive Fed monetary policy and a restrictive Fed
    monetary policy.

c.  The least squares prediction equation for the equity REIT index is:

    ŷ = 0.01863 − 0.01582x

    When the Federal Reserve's monetary policy is restrictive, x = 1. The predicted mean
    monthly rate of return for the equity REIT index is

    ŷ = 0.01863 − 0.01582(1) = .00281

    When the Federal Reserve's monetary policy is expansive, x = 0. The predicted mean
    monthly rate of return for the equity REIT index is

    ŷ = 0.01863 − 0.01582(0) = .01863.
11.72

a.  The first-order model is E(y) = β0 + β1x1

b.  The new model is E(y) = β0 + β1x1 + β2x2 + β3x3

    where x2 = 1 if level 2, 0 otherwise
          x3 = 1 if level 3, 0 otherwise


c.

To allow for interactions, the model is:


E(y) = 0 + 1x1 + 2x2 + 3x3 + 4x1x2 + 5x1x3

11.74

11.76

d.

The response lines will be parallel if 4 = 5 = 0

e.

There will be one response line if 2 = 3 = 4 = 5 = 0

a.

When x2 = x3 = 0, E(y) = 0 + 1x1


When x2 = 1 and x3 = 0, E(y) = 0 + 1x1 + 2
When x2 = 0 and x3 = 1, E(y) = 0 + 1x1 + 3

b.

For level 1, y = 44.8 + 2.2x1


For level 2, y = 44.8 + 2.2x1 + 9.4
= 54.2 + 2.2x1
For level 3, y = 44.8 + 2.2x1 + 15.6
= 60.4 + 2.2x1

The model is E(y) = 0 + 1x1 + 2 x12 + 3x2 + 4x3 + 5x4


where x1 is the quantitative variable and
1 if level 2 of qualitative variable
x2 =
0 otherwise
1 if level 3 of qualitative variable
x3 =
0 otherwise
1 if level 4 of qualitative variable
x4 =
0 otherwise
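The dummy-variable coding used in models like the one in Exercise 11.76 can also be built programmatically. The sketch below is only an illustration (the data frame, column names, and values are hypothetical, not from the exercise); it uses pandas to create the indicator columns for a four-level qualitative factor together with a quantitative x1 and its square.

    import pandas as pd

    # Hypothetical data: x1 is quantitative, "level" is a 4-level qualitative factor
    df = pd.DataFrame({
        "x1":    [2.0, 3.5, 1.2, 4.8, 2.9, 3.3],
        "level": ["1", "2", "3", "4", "2", "3"],
    })

    # Level 1 is the base level, so indicators are created only for levels 2-4
    dummies = pd.get_dummies(df["level"], prefix="lvl", drop_first=True).astype(int)

    # Columns of the design matrix for E(y) = b0 + b1*x1 + b2*x1^2 + b3*x2 + b4*x3 + b5*x4
    X = pd.concat([df[["x1"]], df["x1"] ** 2, dummies], axis=1)
    X.columns = ["x1", "x1_sq", "x2", "x3", "x4"]
    print(X)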


11.78

a.

E(y) = β0 + β1x1 + β2x2 + β3x1x2

where x2 = 1 if diet is duck chow, 0 otherwise
b.

Using MINITAB, the printout is:


The regression equation is
WtChg = -2.21 + 0.0783 x1 + 10.4 x2 - 0.095 x1x2

Predictor    Coef       StDev       T       P
Constant    -2.210      1.250     -1.77   0.085
x1           0.07831    0.04947    1.58   0.122
x2          10.354      8.538      1.21   0.233
x1x2        -0.0948     0.1418    -0.67   0.508

S = 3.882   R-Sq = 44.1%   R-Sq(adj) = 39.7%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       3    452.54  150.85  10.01  0.000
Residual Error  38    572.58   15.07
Total           41   1025.12

Source  DF  Seq SS
x1       1  384.24
x2       1   61.57
x1x2     1    6.73

Unusual Observations
Obs    x1     WtChg     Fit   StDev Fit  Residual  St Resid
12    30.0   -8.500   0.139     0.802     -8.639    -2.27R
37    42.5    8.000   7.445     2.990      0.555     0.22 X
40    75.0    8.500   6.910     2.077      1.590     0.48 X

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.

The fitted equation is ŷ = -2.21 + .0783x1 + 10.4x2 - .095x1x2


c.	For diet = plants, x2 = 0

	ŷ = -2.21 + .0783x1 + 10.4(0) - .095x1(0) = -2.21 + .0783x1

	The slope is .0783. For each unit increase in digestion efficiency, the mean weight change is estimated to increase by .0783 for goslings fed plants.

d.	For diet = duck chow, x2 = 1

	ŷ = -2.21 + .0783x1 + 10.4(1) - .095x1(1) = 8.19 - .0167x1

	The slope is -.0167. For each unit increase in digestion efficiency, the mean weight change is estimated to decrease by .0167 for goslings fed duck chow.

e.	To determine if the slopes associated with the two diets differ, we test:

	H0: β3 = 0
	Ha: β3 ≠ 0

	From MINITAB, the test statistic is t = -.67 with p-value = .508.

	Since α = .05 is less than the p-value, we fail to reject H0. There is insufficient evidence to conclude that the slopes associated with the two diets are significantly different at α = .05.
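As a rough illustration (not the authors' method), the same interaction model could be fit in Python with statsmodels. The file name and column names below are hypothetical, since the gosling data set is not reproduced in this manual.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical file and columns: wtchg = weight change, x1 = digestion efficiency,
    # x2 = 1 if the diet is duck chow, 0 otherwise
    geese = pd.read_csv("goslings.csv")
    fit = smf.ols("wtchg ~ x1 + x2 + x1:x2", data=geese).fit()

    print(fit.summary())           # coefficient table, R-Sq, overall F test
    print(fit.pvalues["x1:x2"])    # p-value for the interaction test H0: beta3 = 0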
11.80

a.

Let x2 = 1 if intervention group, 0 if otherwise

The first-order model would be:

E(y) = β0 + β1x1 + β2x2

b.

For the control group, x2 = 0. The first-order model is:

E(y) = β0 + β1x1 + β2(0) = β0 + β1x1

For the intervention group, x2 = 1. The first-order model is:

E(y) = β0 + β1x1 + β2(1) = β0 + β1x1 + β2 = (β0 + β2) + β1x1

In both models, the slope of the line is β1.


c.

If pretest score and group interact, the first-order model would be:
E(y) = β0 + β1x1 + β2x2 + β3x1x2


d.

For the control group, x2 = 0. The first-order model including the interaction is:
E(y) = β0 + β1x1 + β2(0) + β3x1(0) = β0 + β1x1

For the intervention group, x2 = 1. The first-order model including the interaction is:
E(y) = β0 + β1x1 + β2(1) + β3x1(1) = β0 + β1x1 + β2 + β3x1 = (β0 + β2) + (β1 + β3)x1

The slope of the model for the control group is β1. The slope of the model for the intervention group is β1 + β3.
11.82

a.

The first-order model is:


E(y) = β0 + β1x1 + β2x2

b.

For the high-tech firms, x2 = 1. The model for the high-tech firm is:
E(y) = β0 + β1x1 + β2(1) = β0 + β2 + β1x1

The slope of the line would be β1.

c.	The new model would include the interaction term:

	E(y) = β0 + β1x1 + β2x2 + β3x1x2

d.	For the high-tech firms, x2 = 1. The model for the high-tech firm is:
	E(y) = β0 + β1x1 + β2(1) + β3x1(1) = β0 + β2 + (β1 + β3)x1

	The slope of the line would be β1 + β3.


11.84

By adding variables to the model, SSE will decrease or stay the same. Thus, SSE_C ≤ SSE_R. The only circumstance under which we will reject H0 is if SSE_C is much smaller than SSE_R. If SSE_C is much smaller than SSE_R, F will be large. Thus, the test is only one-tailed.

11.86

a.

Ha: At least one βi ≠ 0, i = 3, 4, 5

b.

The reduced model would be E(y) = β0 + β1x1 + β2x2

c.

The numerator df = k - g = 5 - 2 = 3 and the denominator df = n - (k + 1) = 30 - (5 + 1) = 24.


d.

H0: β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 3, 4, 5

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}
  = [(1250.2 - 1125.2)/(5 - 2)] / {1125.2/[30 - (5 + 1)]}
  = 41.6667/46.8833 = .89

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k - g = 5 - 2 = 3 and denominator df = n - (k + 1) = 30 - (5 + 1) = 24. From Table IX, Appendix B, F.05 = 3.01. The rejection region is F > 3.01.

Since the observed value of the test statistic does not fall in the rejection region (F = .89 ≯ 3.01), H0 is not rejected. There is insufficient evidence to indicate the second-order terms are useful at α = .05.
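For readers who want to check this arithmetic, a minimal sketch of the partial (nested-model) F test is shown below. The SSE values and sample size are those given in the exercise; scipy is assumed to be available only for the table lookup.

    from scipy import stats

    sse_r, sse_c = 1250.2, 1125.2    # reduced- and complete-model SSEs
    n, k, g = 30, 5, 2               # sample size, complete-model terms, reduced-model terms

    f_stat = ((sse_r - sse_c) / (k - g)) / (sse_c / (n - (k + 1)))
    f_crit = stats.f.ppf(0.95, dfn=k - g, dfd=n - (k + 1))

    print(round(f_stat, 2), round(f_crit, 2))   # about 0.89 and 3.01, so H0 is not rejected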
11.88

a.

Let variables x1 through x4 be the Demographic variables, variables x5 through x11 be the Diagnostic variables, variables x12 through x15 be the Treatment variables, and variables x16 through x21 be the Community variables. The complete model is:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x7 + β8x8 + β9x9 + β10x10 + β11x11 + β12x12 + β13x13 + β14x14 + β15x15 + β16x16 + β17x17 + β18x18 + β19x19 + β20x20 + β21x21

b.

To determine if the 7 Diagnostic variables contribute information for the prediction


of y, we test:
H0: β5 = β6 = ... = β11 = 0

c.

The reduced model would be:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β12x12 + β13x13 + β14x14 + β15x15 + β16x16 + β17x17 + β18x18 + β19x19 + β20x20 + β21x21

d.	Since the p-value is so small (p < .0001), H0 is rejected. There is sufficient evidence to indicate at least one of the seven diagnostic variables contributes information for the prediction of y.

11.90

a.

The complete second-order model is:

E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x1x2 + β5x1²x2

where x1 = age
      x2 = 1 if current, 0 otherwise


b.	To determine if the quadratic terms are important, we test:

	H0: β2 = β5 = 0

c.	To determine if the interaction terms are important, we test:

	H0: β4 = β5 = 0

d.

From MINITAB, the outputs from fitting the three models are:

Regression Analysis: Value versus Age, AgeSq, Status, AgeSt, AgeSqSt

The regression equation is
Value = 83 - 5.7 Age + 0.236 AgeSq - 62 Status + 5.4 AgeSt - 0.234 AgeSqSt

Predictor    Coef       SE Coef      T       P
Constant     83.4       316.3       0.26   0.793
Age          -5.74       18.68     -0.31   0.760
AgeSq         0.2361      0.2549    0.93   0.359
Status      -62.1       354.8      -0.18   0.862
AgeSt         5.36       24.81      0.22   0.830
AgeSqSt      -0.2337      0.4080   -0.57   0.570

S = 286.8   R-Sq = 24.7%   R-Sq(adj) = 16.1%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       5   1186549   237310   2.89  0.024
Residual Error  44   3618994    82250
Total           49   4805542

Source   DF  Seq SS
Age       1  865746
AgeSq     1  138871
Status    1   77594
AgeSt     1   77342
AgeSqSt   1   26996

Regression Analysis: Value versus Age, Status, AgeSt

The regression equation is
Value = -176 + 11.2 Age + 196 Status - 11.4 AgeSt

Predictor    Coef      SE Coef      T       P
Constant   -176.1      145.0      -1.21   0.231
Age          11.166      3.902     2.86   0.006
Status      196.5      178.9       1.10   0.278
AgeSt       -11.432      6.763    -1.69   0.098

S = 283.2   R-Sq = 23.2%   R-Sq(adj) = 18.2%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       3   1116017   372006   4.64  0.006
Residual Error  46   3689526    80207
Total           49   4805543

Source   DF  Seq SS
Age       1  865746
Status    1   21097
AgeSt     1  229174

Regression Analysis: Value versus Age, AgeSq, Status

The regression equation is
Value = 166 - 8.8 Age + 0.253 AgeSq - 106 Status

Predictor    Coef      SE Coef      T       P
Constant    165.8      182.7       0.91   0.369
Age          -8.81      10.89     -0.81   0.423
AgeSq         0.2535     0.1632    1.55   0.127
Status     -105.6      107.9      -0.98   0.333

S = 284.5   R-Sq = 22.5%   R-Sq(adj) = 17.5%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       3   1082210   360737   4.46  0.008
Residual Error  46   3723332    80942
Total           49   4805542

Source   DF  Seq SS
Age       1  865746
AgeSq     1  138871
Status    1   77594

Test for part b:

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}
  = [(3,689,526 - 3,618,994)/2] / 82,250 = .429

Since no α is given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = 2 numerator degrees of freedom and ν2 = 44 denominator degrees of freedom. From Table IX, Appendix B, F.05 ≈ 3.23. The rejection region is F > 3.23.

Since the observed value of the test statistic does not fall in the rejection region (F = .429 ≯ 3.23), H0 is not rejected. There is insufficient evidence to indicate the quadratic terms are important for predicting market value at α = .05.

Test for part c:

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}
  = [(3,723,332 - 3,618,994)/(5 - 3)] / 82,250 = .634

The rejection region is the same as in the previous test. Reject H0 if F > 3.23.

Since the observed value of the test statistic does not fall in the rejection region (F = .634 ≯ 3.23), H0 is not rejected. There is insufficient evidence to indicate the interaction terms are important for predicting market value at α = .05.
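A sketch of how the complete and reduced models above could be fit and compared in Python is shown below. This is only an illustration of the technique, not the authors' procedure; the file name and column names are assumed, since the market-value data are not reprinted here.

    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    homes = pd.read_csv("market_value.csv")   # assumed columns: Value, Age, Status

    complete = smf.ols(
        "Value ~ Age + I(Age**2) + Status + Age:Status + I(Age**2):Status",
        data=homes).fit()
    no_quadratic = smf.ols("Value ~ Age + Status + Age:Status", data=homes).fit()

    # Partial F test for H0: beta2 = beta5 = 0 (the quadratic terms)
    print(anova_lm(no_quadratic, complete))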


11.92

a.

The reduced model for testing if the mean posttest scores differ for the intervention
and control groups would be:
E(y) = β0 + β1x1

b.	The reported p-value is .03. Since the p-value is so small, H0 is rejected. There is evidence to indicate that the mean posttest sun safety knowledge scores differ for the intervention and control groups for α > .03.

c.	The reported p-value is .033. Since the p-value is so small, H0 is rejected. There is evidence to indicate that the mean posttest sun safety comprehension scores differ for the intervention and control groups for α > .033.

d.	The reported p-value is .322. Since the p-value is not small, H0 is not rejected. There is no evidence to indicate that the mean posttest sun safety application scores differ for the intervention and control groups for α < .322.

11.94

a.

To determine whether the rate of increase of emotional distress with experience is different for the two groups, we test:

H0: β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 4, 5

b.

To determine whether there are differences in mean emotional distress levels that are attributable to exposure group, we test:

H0: β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 3, 4, 5

c.

To determine whether there are differences in mean emotional distress levels that are attributable to exposure group, we test:

H0: β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 3, 4, 5

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]} = [(795.23 - 783.9)/(5 - 2)] / {783.9/[200 - (5 + 1)]} = .93

The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k - g = 5 - 2 = 3 and ν2 = n - (k + 1) = 200 - (5 + 1) = 194. From Table IX, Appendix B, F.05 ≈ 2.60. The rejection region is F > 2.60.

Since the observed value of the test statistic does not fall in the rejection region (F = .93 ≯ 2.60), H0 is not rejected. There is insufficient evidence to indicate that there are differences in mean emotional distress levels that are attributable to exposure group at α = .05.


11.96

a.

The best one-variable predictor of y is the one whose t statistic has the largest absolute
value. The t statistics for each of the variables are:

Independent Variable    t = β̂i / s_β̂i
x1                      t = 1.6/.42 = 3.81
x2                      t = .9/.01 = 90
x3                      t = 3.4/1.14 = 2.98
x4                      t = 2.5/2.06 = 1.21
x5                      t = 4.4/.73 = 6.03
x6                      t = .3/.35 = .86

The variable x2 is the best one-variable predictor of y. The absolute value of the corresponding t score is 90. This is larger than any of the others.
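A quick sketch of this arithmetic (the coefficient estimates and standard errors are the ones listed in the table above):

    # Coefficient estimates and standard errors from the table above
    estimates = {"x1": (1.6, .42), "x2": (.9, .01), "x3": (3.4, 1.14),
                 "x4": (2.5, 2.06), "x5": (4.4, .73), "x6": (.3, .35)}

    t_ratios = {name: b / s for name, (b, s) in estimates.items()}
    best = max(t_ratios, key=lambda name: abs(t_ratios[name]))
    print(t_ratios)   # x2 has |t| = 90, the largest
    print(best)       # 'x2'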

b.	Yes. In the stepwise procedure, the first variable entered is the one which has the largest absolute value of t, provided the absolute value of the t falls in the rejection region.

c.	Once x2 is entered, the next variable that is entered is the one that, in conjunction with x2, has the largest absolute t value associated with it.

11.98

a.

In step 1, all one-variable models are fit. Thus, there are a total of 11 models fit.

b.

In step 2, all two-variable models are fit, where 1 of the variables is the best one
selected in step 1. Thus, a total of 10 two-variable models are fit.

c.

In the 11th step, only one model is fit: the model containing all the independent variables.

d.

The model would be:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β7x7 + β9x9 + β10x10 + β11x11
e.

67.7% of the total sample variability of overall satisfaction is explained by the model containing the independent variables safety on bus, seat availability, dependability, travel time, convenience of route, safety at bus stops, hours of service, and frequency of service.

f.

Using stepwise regression does not guarantee that the best model will be found.
There may be better combinations of the independent variables that are never found,
because of the order in which the independent variables are entered into the model.


11.100 a.

The plot of the residuals reveals a nonrandom pattern. The residuals exhibit a curved
shape. Such a pattern usually indicates that curvature needs to be added to the model.

b.

The plot of the residuals reveals a nonrandom pattern. The plot of the residuals versus the predicted values shows a pattern where the range in values of the residuals increases as ŷ increases. This indicates that the variance of the random error, ε, becomes larger as the estimate of E(y) increases in value. Since E(y) depends on the x-values in the model, this implies that the variance of ε is not constant for all settings of the x's.

c.

This plot reveals an outlier, since all or almost all of the residuals should fall within 3
standard deviations of their mean of 0.

d.

This frequency distribution of the residuals is skewed to the right. This may be due to
outliers or could indicate the need for a transformation of the dependent variable.

11.102	a.	Since all the pairwise correlations are .45 or less in absolute value, there is little evidence of extreme multicollinearity.

	b.	No. The overall model test is significant (p < .001). This implies that at least one variable contributes to the prediction of the urban/rural rating. Looking at the individual t-tests, there are several that are significant, namely x1, x3, and x5. There is no evidence that multicollinearity is present.

11.104 First, we need to compute the value of the residual:

Residual = y - ŷ = 87 - 29.63 = 57.37


We are given that the standard deviation is s = 24.68. Thus, an observation with a
residual of 57.37 is 57.37 / 24.68 = 2.32 standard deviations from the fitted regression
line. Since this is less than 3 standard deviations from the regression line, this point is
not considered an outlier.
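A sketch of this check, using the values from the exercise:

    y_obs, y_hat, s = 87, 29.63, 24.68

    residual = y_obs - y_hat                # 57.37
    num_std_devs = residual / s             # about 2.32 standard deviations from the fitted line
    print(round(num_std_devs, 2), abs(num_std_devs) > 3)   # 2.32, False: not an outlier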


11.106 a.

From MINITAB, the output is:


Regression Analysis: Food versus Income, Size

The regression equation is
Food = 2.79 - 0.00016 Income + 0.383 Size

Predictor     Coef        SE Coef       T       P
Constant     2.7944      0.4363        6.40   0.000
Income      -0.000164    0.006564     -0.02   0.980
Size         0.38348     0.07189       5.33   0.000

S = 0.7188   R-Sq = 55.8%   R-Sq(adj) = 52.0%

Analysis of Variance
Source          DF      SS        MS       F      P
Regression       2   15.0027    7.5013   14.52  0.000
Residual Error  23   11.8839    0.5167
Total           25   26.8865

Source   DF  Seq SS
Income    1   0.2989
Size      1  14.7037

Correlations: Income, Size

Pearson correlation of Income and Size = -0.137
P-Value = 0.506

No; income and household size do not seem to be highly correlated. The correlation coefficient between income and household size is -.137.
b.

Using MINITAB, the residual plots are:


[Residual plots (response is Food): histogram of the residuals, residuals versus the fitted values, residuals versus Income, and residuals versus Size.]

Yes; the residuals versus income and the residuals versus household size exhibit a curved shape. Such a pattern could indicate that a second-order model may be more appropriate.
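These diagnostics can be reproduced with any regression package. A minimal statsmodels/matplotlib sketch is given below; the file name and column names are assumptions, since the food-consumption data are not reprinted here.

    import pandas as pd
    import matplotlib.pyplot as plt
    import statsmodels.formula.api as smf

    food = pd.read_csv("food.csv")                      # assumed columns: Food, Income, Size
    fit = smf.ols("Food ~ Income + Size", data=food).fit()

    fig, axes = plt.subplots(2, 2, figsize=(8, 6))
    axes[0, 0].hist(fit.resid)                          # histogram of the residuals
    axes[0, 1].scatter(fit.fittedvalues, fit.resid)     # residuals vs fitted values
    axes[1, 0].scatter(food["Income"], fit.resid)       # residuals vs Income
    axes[1, 1].scatter(food["Size"], fit.resid)         # residuals vs Size
    plt.tight_layout()
    plt.show()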


c.

No; the residuals versus the predicted values reveal varying spreads for different values of ŷ. This implies that the variance of ε is not constant for all settings of the x's.

d.

Yes; The outlier shows up in several plots and is the 26th household (Food consumption
= $7500, income = $7300 and household size = 5).

e.

No; The frequency distribution of the residuals shows that the outlier skews the
frequency distribution to the right.

11.108 Using MINITAB, the residual plots are:

[Residual plots for DDT: normal probability plot of the standardized residuals, standardized residuals versus the fitted values, histogram of the standardized residuals, standardized residuals versus the order of the data, and standardized residuals versus WEIGHT, LENGTH, and MILE.]

From the normal probability plot, the points do not fall on a straight line, indicating the
residuals are not normal. The histogram of the residuals indicates the residuals are
skewed to the right, which also indicates that the residuals are not normal. The plot of
the residuals versus ŷ indicates that there is at least one outlier and the variance is
not constant. One observation has a standardized residual of more than 10 and several
others have standardized residuals greater than 3. This is also evident in the plots of the
residuals versus each of the independent variables. Since the assumptions of normality
and constant variance appear to be violated, we could consider transforming the data.
We should also check the outlying observations to see if there are any errors connected
with these observations.
11.110 a.

To determine if at least one of the parameters is not zero, we test:


H0: β1 = β2 = β3 = β4 = 0
Ha: At least one βi ≠ 0

The test statistic is
F = (R²/k) / {(1 - R²)/[n - (k + 1)]} = (.83/4) / {(1 - .83)/[25 - (4 + 1)]} = 24.41

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k = 4 and denominator df = n - (k + 1) = 25 - (4 + 1) = 20. From Table IX, Appendix B, F.05 = 2.87. The rejection region is F > 2.87.

Since the observed value of the test statistic falls in the rejection region (F = 24.41 > 2.87), H0 is rejected. There is sufficient evidence to indicate at least one of the parameters is nonzero at α = .05.
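A sketch of this computation, using the values from the exercise (scipy is assumed only for the table lookup):

    from scipy import stats

    r_sq, k, n = .83, 4, 25
    f_stat = (r_sq / k) / ((1 - r_sq) / (n - (k + 1)))
    f_crit = stats.f.ppf(0.95, dfn=k, dfd=n - (k + 1))
    print(round(f_stat, 2), round(f_crit, 2))   # about 24.41 and 2.87, so H0 is rejected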
b.

H0: β1 = 0
Ha: β1 < 0

The test statistic is t = (β̂1 - 0)/s_β̂1 = (-2.43 - 0)/1.21 = -2.01

The rejection region requires α = .05 in the lower tail of the t distribution with df = n - (k + 1) = 25 - (4 + 1) = 20. From Table VI, Appendix B, t.05 = 1.725. The rejection region is t < -1.725.

Since the observed value of the test statistic falls in the rejection region (t = -2.01 < -1.725), H0 is rejected. There is sufficient evidence to indicate β1 is less than 0 at α = .05.
c.

H0: β2 = 0
Ha: β2 > 0

The test statistic is t = (β̂2 - 0)/s_β̂2 = (.05 - 0)/.16 = .31

The rejection region requires α = .05 in the upper tail of the t distribution. From part b above, the rejection region is t > 1.725.

Since the observed value of the test statistic does not fall in the rejection region (t = .31 ≯ 1.725), H0 is not rejected. There is insufficient evidence to indicate β2 is greater than 0 at α = .05.
d.

H0: β3 = 0
Ha: β3 ≠ 0

The test statistic is t = (β̂3 - 0)/s_β̂3 = (.62 - 0)/.26 = 2.38

The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with df = 20. From Table VI, Appendix B, t.025 = 2.086. The rejection region is t < -2.086 or t > 2.086.

Since the observed value of the test statistic falls in the rejection region (t = 2.38 > 2.086), H0 is rejected. There is sufficient evidence to indicate β3 is different from 0 at α = .05.


11.112 The error of prediction is smallest when the values of x1, x2, and x3 are equal to their sample
means. The further x1, x2, and x3 are from their means, the larger the error. When x1 = 60,
x2 = .4, and x3 = 900, the observed values are outside the observed ranges of the x values.
When x1 = 30, x2 = .6, and x3 = 1300, the observed values are within the observed ranges
and consequently the x values are closer to their means. Thus, when x1 = 30, x2 = .6, and
x3 = 1300, the error of prediction is smaller.
11.114 From the plot of the residuals for the straight line model, there appears to be a mound shape
which implies the quadratic model should be used.
11.116	a.	Ha: At least one of β4 and β5 ≠ 0

	b.	The regression model
		E(y) = β0 + β1x1 + β2x2 + β3x2² + β4x1x2 + β5x1x2²
		is fit to the 35 data points, yielding a sum of squares for error, denoted SSE_C. The regression model
		E(y) = β0 + β1x1 + β2x2 + β3x2²
		is also fit to the data and its sum of squares for error is obtained, denoted SSE_R. Then the test statistic is:

		F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}

		where k = 5, g = 3, and n = 35.


c.

The numerator degrees of freedom is k - g = 5 - 3 = 2, and the denominator degrees of freedom is n - (k + 1) = 35 - (5 + 1) = 29.

d.

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = 2 and denominator df = 29. From Table IX, Appendix B, F.05 = 3.33. The rejection region is F > 3.33.

11.118 a.

E(y) = β0 + β1x1 + β2x2 + β3x3

where x2 = 1 if level 2, 0 otherwise
      x3 = 1 if level 3, 0 otherwise

b.

E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x3 + β5x1x2 + β6x1x3 + β7x1²x2 + β8x1²x3


where x1, x2, and x3 are as in part a.


11.120	a.	E(y) = β0 + β1x1 + β2x2

	b.	E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x2² + β5x1x2

11.122	a.	1.	The "Quantitative GMAT score" is measured on a numerical scale, so it is a quantitative variable.
		2.	The "Verbal GMAT score" is measured on a numerical scale, so it is a quantitative variable.
		3.	The "Undergraduate GPA" is measured on a numerical scale, so it is a quantitative variable.
		4.	The "First-year graduate GPA" is measured on a numerical scale, so it is a quantitative variable.
		5.	The "Student cohort" has 3 categories, so it is a qualitative variable. Note that the numerical scale is meaningless in this situation. (It is possible to consider this as a quantitative variable. However, for this problem we will consider it as qualitative.)

	b.	The quantitative variables GMAT score, verbal GMAT score, undergraduate GPA, and first-year graduate GPA should all be positively correlated to final GPA.

	c.	x5 = 1 if student entered doctoral program in year 3, 0 otherwise
		x6 = 1 if student entered doctoral program in year 5, 0 otherwise

d.

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6

e.

β0 = the y-intercept for students entering in year 1.

β1 = the final GPA will increase by β1 for each additional increase of one unit of GMAT score, holding the remaining variables constant.
β2 = the final GPA will increase by β2 for each additional increase of one unit of verbal GMAT score, holding the remaining variables constant.
β3 = the final GPA will increase by β3 for each additional increase of one undergraduate GPA point, holding the remaining variables constant.
β4 = the final GPA will increase by β4 for each additional increase of one first-year graduate GPA point, holding the remaining variables constant.
β5 = difference in mean final GPA between student cohort year 2 and year 1.
β6 = difference in mean final GPA between student cohort year 3 and year 1.
f.

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x1x5 + β8x1x6 + β9x2x5 + β10x2x6 + β11x3x5 + β12x3x6 + β13x4x5 + β14x4x6


g.

For the year 1 cohort, x5 = x6 = 0. The model is:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5(0) + β6(0) + β7x1(0) + β8x1(0) + β9x2(0) + β10x2(0) + β11x3(0) + β12x3(0) + β13x4(0) + β14x4(0)
     = β0 + β1x1 + β2x2 + β3x3 + β4x4

The slopes for the four variables are β1, β2, β3, and β4, respectively.

11.124 a.

The hypothesized model is:


E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

β0 = y-intercept. It has no interpretation in this model.
β1 = difference in the mean salaries between males and females, all other variables held constant.
β2 = difference in the mean salaries between whites and nonwhites, all other variables held constant.
β3 = change in the mean salary for each additional year of education, all other variables held constant.
β4 = change in the mean salary for each additional year of tenure with firm, all other variables held constant.
β5 = change in the mean salary for each additional hour worked per week, all other variables held constant.
b.

The least squares equation is:

ŷ = 15.491 + 12.774x1 + .713x2 + 1.519x3 + .32x4 + .205x5

β̂0 = estimate of the y-intercept. It has no interpretation in this model.
β̂1: We estimate the difference in the mean salaries between males and females to be $12.774, all other variables held constant.
β̂2: We estimate the difference in the mean salaries between whites and nonwhites to be $.713, all other variables held constant.
β̂3: We estimate the change in the mean salary for each additional year of education to be $1.519, all other variables held constant.
β̂4: We estimate the change in the mean salary for each additional year of tenure with firm to be $.320, all other variables held constant.
β̂5: We estimate the change in the mean salary for each additional hour worked per week to be $.205, all other variables held constant.


c.

R² = .240. 24% of the total variability of salaries is explained by the model containing gender, race, educational level, tenure with firm, and number of hours worked per week.

To determine if the model is useful for predicting annual salary, we test:

H0: β1 = β2 = β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0

The test statistic is F = (R²/k) / {(1 - R²)/[n - (k + 1)]} = (.24/5) / {(1 - .24)/[191 - (5 + 1)]} = 11.68

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k = 5 and denominator df = n - (k + 1) = 191 - (5 + 1) = 185. From Table IX, Appendix B, F.05 ≈ 2.21. The rejection region is F > 2.21.

Since the observed value of the test statistic falls in the rejection region (F = 11.68 > 2.21), H0 is rejected. There is sufficient evidence to indicate the model containing gender, race, educational level, tenure with firm, and number of hours worked per week is useful for predicting annual salary for α = .05.
d.

To determine if male managers are paid more than female managers, we test:

H0: β1 = 0
Ha: β1 > 0

The p-value given for the test is < .05/2 = .025. Since the p-value is less than α = .05, there is evidence to reject H0. There is evidence to indicate male managers are paid more than female managers, holding all other variables constant, for α > .025.

e.	The salary paid an individual depends on many factors other than gender. Thus, in order to adjust for other factors influencing salary, we include them in the model.

11.126	a.	The main effects model would be: E(y) = β0 + β1x1 + β8x8

	b.	β̂1 = -.28. The mean value for the relative error of the effort estimate for developers is estimated to be .28 units below that of project leaders, holding previous accuracy constant.

		β̂8 = .27. The mean value for the relative error of the effort estimate if previous accuracy is more than 20% is estimated to be .27 units above that if previous accuracy is less than 20%, holding company role of estimator constant.

	c.	One possible reason for the sign of β̂1 being opposite from what is expected could be that company role of estimator and previous accuracy could be correlated.


11.128 a.

R² = .45. 45% of the total variability of the suicide rates is explained by the model containing unemployment rate, percentage of females in the work force, divorce rate, logarithm of GNP, and annual percent change in GNP.

To determine if the model is useful for predicting suicide rate, we test:

H0: β1 = β2 = β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0

The test statistic is F = (R²/k) / {(1 - R²)/[n - (k + 1)]} = (.45/5) / {(1 - .45)/[45 - (5 + 1)]} = 6.38

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k = 5 and denominator df = n - (k + 1) = 45 - (5 + 1) = 39. From Table IX, Appendix B, F.05 ≈ 2.45. The rejection region is F > 2.45.

Since the observed value of the test statistic falls in the rejection region (F = 6.38 > 2.45), H0 is rejected. There is sufficient evidence to indicate the model containing unemployment rate, percentage of females in the work force, divorce rate, logarithm of GNP and annual percent change in GNP is useful for predicting suicide rate for α = .05.
b.

β̂0 = .002 = estimate of the y-intercept. It has no interpretation in this model.

β̂1: We estimate the change in suicide rate for each unit change in unemployment rate to be .0204, all other variables held constant.
β̂2: We estimate the change in suicide rate for each unit change in percentage of females in the work force to be .0231, all other variables held constant.
β̂3: We estimate the change in suicide rate for each unit change in divorce rate to be .0765, all other variables held constant.
β̂4: We estimate the change in suicide rate for each unit change in logarithm of GNP to be .2760, all other variables held constant.
β̂5: We estimate the change in suicide rate for each unit change in annual percent change in GNP to be .0018, all other variables held constant.

The p-values for unemployment rate and percentage of females in the work force are less than .05. This indicates that both are important in predicting suicide rate. The p-values for divorce rate, logarithm of GNP, and annual percent change in GNP are all greater than .10. This indicates that none of these variables are important in predicting suicide rate. We must view these conclusions with caution. Some of these independent variables may be highly correlated with each other. If so, some of the variables declared nonsignificant may be significant if the other variables are removed from the model.


c.

To determine if unemployment rate is a useful predictor of the suicide rate, we test:


H0: β1 = 0
Ha: β1 ≠ 0

The p-value = .002. Since this p-value is less than α = .05, there is evidence to reject H0. There is sufficient evidence to indicate unemployment rate is a useful predictor of the suicide rate for α = .05.

d.

Curvature: It may be possible that the relationship between the suicide rate and some of the independent variables is not linear, but curved. Thus, some of the variables that do not appear to be useful predictors may, in fact, be useful predictors if the second-order term were added to the model.
Interaction: Again, it may be possible that the effect of some independent variables
on the suicide rate is different for different levels of other independent variables. This
possibility should be explored before throwing out certain independent variables.
Multicollinearity: Some of these independent variables may be highly correlated with
each other. If so, some of the variables declared nonsignificant may be significant if
other variables are removed from the model.

11.130 CEO income (x1) and stock percentage (x2) are said to interact if the effect of one variable,
say CEO income, on the dependent variable profit (y) depends on the level of the second
variable, stock percentage.
11.132 a.

The SAS output is:


DEP VARIABLE: Y

ANALYSIS OF VARIANCE
SOURCE     DF    SUM OF SQUARES    MEAN SQUARE    F VALUE    PROB>F
MODEL       3     25784705.01      8594901.67     241.758    0.0001
ERROR      16       568826.19        35551.63709
C TOTAL    19     26353531.20

ROOT MSE   188.5514    R-SQUARE   0.9784
DEP MEAN   3014.2      ADJ R-SQ   0.9744
C.V.       6.255438

PARAMETER ESTIMATES
VARIABLE   DF    PARAMETER ESTIMATE    STANDARD ERROR    T FOR H0: PARAMETER=0    PROB > |T|
INTERCEP    1       1333.17830           290.99944              4.581              0.0003
X1          1         -0.15122302          0.37864583           -0.399              0.6949
X2          1         -2.62532461          5.34596285           -0.491              0.6300
X1X2        1          0.05195415          0.006863831           7.569              0.0001

The fitted model is ŷ = 1333.18 - .151x1 - 2.625x2 + .052x1x2


b.

To determine if the overall model is useful, we test:

H0: β1 = β2 = β3 = 0
Ha: At least one βi ≠ 0, i = 1, 2, 3

The test statistic is F = MSR/MSE = 8,594,901.67/35,551.637 = 241.758

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k = 3 and denominator df = n - (k + 1) = 20 - (3 + 1) = 16. From Table IX, Appendix B, F.05 = 3.24. The rejection region is F > 3.24.

Since the observed value of the test statistic falls in the rejection region (F = 241.758 > 3.24), H0 is rejected. There is sufficient evidence to indicate the model is useful at α = .05.
c.

To determine if the interaction is present, we test:

H0: β3 = 0
Ha: β3 ≠ 0

The test statistic is t = (β̂3 - 0)/s_β̂3 = 7.569.

The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with df = n - (k + 1) = 20 - (3 + 1) = 16. From Table VI, Appendix B, t.025 = 2.120. The rejection region is t < -2.120 or t > 2.120.

Since the observed value of the test statistic falls in the rejection region (t = 7.569 > 2.120), H0 is rejected. There is sufficient evidence to indicate the interaction between advertising expenditure and shelf space is present at α = .05.


d.

Advertising expenditure and shelf space are said to interact if the effect of advertising expenditure on sales is different at different levels of shelf space.

e.

If a first-order model was used, the effect of advertising expenditure on sales would
be the same regardless of the amount of shelf space. If interaction really exists, the
effect of advertising expenditure on sales would depend on which level of shelf space
was present.


11.134 a.

There is a curvilinear trend.


b.

From MINITAB, the output is:


The regression equation is y = 42.2 - 0.0114 x + 0.000001 xsq

Predictor     Coef          StDev          T       P
Constant     42.247         5.712         7.40   0.000
x            -0.011404      0.005053     -2.26   0.037
xsq           0.00000061    0.00000037    1.66   0.115

S = 21.81   R-Sq = 34.9%   R-Sq(adj) = 27.2%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       2    4325.4   2162.7   4.55  0.026
Residual Error  17    8085.5    475.6
Total           19   12410.9

Source  DF  Seq SS
x        1  3013.3
xsq      1  1312.1

Unusual Observations
Obs     x        y       Fit    StDev Fit  Residual  St Resid
16     9150    4.60    -11.21     16.24      15.81     1.09 X
17    15022    2.20      8.09     21.40      -5.89    -1.41 X

X denotes an observation whose X value gives it large influence.

The fitted model is ŷ = 42.2 - .0114x + .00000061x²


c.

To determine if a curvilinear relationship exists, we test:


H0: β2 = 0
Ha: β2 ≠ 0

From MINITAB, the test statistic is t = 1.66 with p-value = .115. Since the p-value is greater than α = .05, do not reject H0. There is insufficient evidence to indicate that a curvilinear relationship exists between dissolved phosphorus percentage and soil loss at α = .05.
11.136 a.

The first order model for this problem is:


E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4

b.

Using MINITAB, the printout is:


Regression Analysis

The regression equation is
y = 28.9 - 0.000000 x1 + 0.844 x2 - 0.360 x3 - 0.300 x4

Predictor      Coef           StDev          T       P
Constant      28.87          12.67          2.28   0.034
x1            -0.00000011     0.00000028   -0.38   0.708
x2             0.8440         0.2326        3.63   0.002
x3            -0.3600         0.1316       -2.74   0.013
x4            -0.3003         0.1834       -1.64   0.117

S = 5.989   R-Sq = 51.2%   R-Sq(adj) = 41.5%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       4   753.76  188.44   5.25  0.005
Residual Error  20   717.40   35.87
Total           24  1471.17

Source  DF  Seq SS
x1       1  129.96
x2       1  355.43
x3       1  172.19
x4       1   96.17

Unusual Observations
Obs      x1         y      Fit   StDev Fit  Residual  St Resid
 4   11940345    32.60   17.25     3.40       15.35     3.11R
12    4905123    27.00   16.17     4.36       10.83     2.63R

R denotes an observation with a large standardized residual

The least squares prediction line is ŷ = 28.9 - .00000011x1 + .844x2 - .360x3 - .300x4.

To determine if the model is useful for predicting percentage of problem mortgages, we test:

H0: β1 = β2 = β3 = β4 = 0
Ha: At least one of the coefficients is nonzero

The test statistic is F = MS(Model)/MSE = 5.25

The p-value is p = .005. Since the p-value is less than α = .05 (p = .005 < .05), H0 is rejected. There is sufficient evidence to indicate the model is useful in predicting percentage of problem mortgages at α = .05.
c.

β̂0 = 28.9. This is merely the y-intercept. It has no other meaning in this problem.

β̂1 = -0.00000011. For each unit increase in total mortgage loans, the mean percentage of problem mortgages is estimated to decrease by 0.00000011, holding percentage of invested assets, percentage of commercial mortgages, and percentage of residential mortgages constant.

β̂2 = 0.844. For each unit increase in percentage of invested assets, the mean percentage of problem mortgages is estimated to increase by 0.844, holding total mortgage loans, percentage of commercial mortgages, and percentage of residential mortgages constant.

β̂3 = -0.360. For each unit increase in percentage of commercial mortgages, the mean percentage of problem mortgages is estimated to decrease by 0.360, holding total mortgage loans, percentage of invested assets, and percentage of residential mortgages constant.

β̂4 = -0.300. For each unit increase in percentage of residential mortgages, the mean percentage of problem mortgages is estimated to decrease by 0.300, holding total mortgage loans, percentage of invested assets, and percentage of commercial mortgages constant.


d.

Using MINITAB, the scattergrams are:

From the scattergrams, it appears that possibly x2 and x4 might warrant inclusion in the model as second-order terms.


e.

Using MINITAB, the printout is:


Regression Analysis

The regression equation is
y = 56.2 - 0.000000 x1 - 1.82 x2 - 0.449 x3 + 0.223 x4 + 0.0771 x2sq - 0.0189 x4sq

Predictor      Coef           StDev          T       P
Constant      56.17          13.81          4.07   0.001
x1            -0.00000008     0.00000025   -0.31   0.760
x2            -1.8177         0.9935       -1.83   0.084
x3            -0.4494         0.1127       -3.99   0.001
x4             0.2227         0.6079        0.37   0.718
x2sq           0.07707        0.02665       2.89   0.010
x4sq          -0.01887        0.02334      -0.81   0.429

S = 4.956   R-Sq = 69.9%   R-Sq(adj) = 59.9%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       6   1029.03  171.51   6.98  0.001
Residual Error  18    442.13   24.56
Total           24   1471.17

Source  DF  Seq SS
x1       1  129.96
x2       1  355.43
x3       1  172.19
x4       1   96.17
x2sq     1  259.22
x4sq     1   16.05

Unusual Observations
Obs      x1          y       Fit     StDev Fit  Residual  St Resid
 4   11940345    32.600   26.777      4.038       5.823     2.03R
10    5328142     7.500   16.105      2.599      -8.605    -2.04R
12    4905123    27.000   16.559      3.607      10.441     3.07R
20    2978628     3.200   11.759      2.679      -8.559    -2.05R

R denotes an observation with a large standardized residual

The least squares prediction equation is

ŷ = 56.2 - .00000008x1 - 1.82x2 - .449x3 + .223x4 + .0771x2² - .0189x4²

To determine if the model is useful for predicting percentage of problem mortgages, we test:

H0: β1 = β2 = β3 = β4 = β5 = β6 = 0
Ha: At least one of the coefficients is nonzero

The test statistic is F = MS(Model)/MSE = 6.98

The p-value is p = .001. Since the p-value is less than α = .05 (p = .001 < .05), H0 is rejected. There is sufficient evidence to indicate the model is useful in predicting percentage of problem mortgages at α = .05.
f.

To determine if one or more of the second-order terms of our model contribute information for the prediction of the percentage of problem mortgages, we test:

H0: β5 = β6 = 0
Ha: At least one of the coefficients is nonzero

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]} = [(717.40 - 442.13)/(6 - 4)] / {442.13/[25 - (6 + 1)]} = 5.60

The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = (k - g) = (6 - 4) = 2 and ν2 = n - (k + 1) = 25 - (6 + 1) = 18. From Table IX, Appendix B, F.05 = 3.55. The rejection region is F > 3.55.

Since the observed value of the test statistic falls in the rejection region (F = 5.60 > 3.55), H0 is rejected. There is sufficient evidence to indicate one or more of the second-order terms of our model contribute information for the prediction of the percentage of problem mortgages at α = .05.
11.138 a.

Using SAS, the output for fitting the model is:


DEP VARIABLE: Y

ANALYSIS OF VARIANCE
SOURCE     DF    SUM OF SQUARES    MEAN SQUARE    F VALUE    PROB>F
MODEL       3      2396.36410       798.78803      99.394    0.0001
ERROR      16       128.58590         8.03662
C TOTAL    19      2524.95000

ROOT MSE   2.83489     R-SQUARE   0.9491
DEP MEAN   23.05000    ADJ R-SQ   0.9395
C.V.       12.29889

PARAMETER ESTIMATES
VARIABLE    DF    PARAMETER ESTIMATE    STANDARD ERROR    T FOR H0: PARAMETER=0    PROB > |T|
INTERCEP     1       -11.768830           3.05032146            -3.858               0.0014
X1           1        10.293782           1.43788129             7.159               0.0001
X1SQ         1        -0.417991           0.16132974            -2.591               0.0197
X2           1        13.244076           1.50325080             8.810               0.0001

The fitted model is: ŷ = -11.8 + 10.3x1 - .418x1² + 13.2x2


b.

To determine if the second-order term is necessary, we test:

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is t = -2.591.

The p-value is p = .0197. Since the p-value is less than α (p = .0197 < .05), H0 is rejected. There is sufficient evidence to conclude that the second-order term in the model proposed by the operations manager is necessary at α = .05.
c.

The reduced model E(y) = β0 + β3x2 was fit to the data. The SAS output is:

DEP VARIABLE: Y

ANALYSIS OF VARIANCE
SOURCE     DF    SUM OF SQUARES    MEAN SQUARE    F VALUE    PROB>F
MODEL       1       1.25000000      1.25000000      0.009    0.9258
ERROR      18    2523.70000       140.20556
C TOTAL    19    2524.95000

ROOT MSE   11.84084    R-SQUARE    0.0005
DEP MEAN   23.05       ADJ R-SQ   -0.0550
C.V.       51.37025

PARAMETER ESTIMATES
VARIABLE    DF    PARAMETER ESTIMATE    STANDARD ERROR    T FOR H0: PARAMETER=0    PROB > |T|
INTERCEP     1       23.30000000          3.74440323             6.223               0.0001
X2           1       -0.50000000          5.29538583            -0.094               0.9258

The fitted model is ŷ = 23.3 - .5x2.

The hypotheses are:
H0: β1 = β2 = 0
Ha: At least one βi ≠ 0, i = 1, 2

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}
  = [(2523.7 - 128.586)/(3 - 1)] / {128.586/[20 - (3 + 1)]}
  = 1197.557/8.036625 = 149.01

The rejection region requires α = .10 in the upper tail of the F distribution with numerator df = k - g = 3 - 1 = 2 and denominator df = n - (k + 1) = 20 - (3 + 1) = 16. From Table VIII, Appendix B, F.10 = 2.67. The rejection region is F > 2.67.

Since the observed value of the test statistic falls in the rejection region (F = 149.01 > 2.67), H0 is rejected. There is sufficient evidence to indicate the age of the machine contributes information to the model at α = .10.

After adjusting for machine type, there is evidence that down time is related to age.
11.140 a.

For a sunny weekday, x1 = 0 and x2 = 1:

x3 = 70:  ŷ = 250 - 700(0) + 100(1) + 5(70) + 15(0)(70) = 700
x3 = 80:  ŷ = 250 - 700(0) + 100(1) + 5(80) + 15(0)(80) = 750
x3 = 90:  ŷ = 800
x3 = 100: ŷ = 850

For a sunny weekend, x1 = 1 and x2 = 1:

x3 = 70:  ŷ = 250 - 700(1) + 100(1) + 5(70) + 15(1)(70) = 1050
x3 = 80:  ŷ = 250 - 700(1) + 100(1) + 5(80) + 15(1)(80) = 1250
x3 = 90:  ŷ = 1450
x3 = 100: ŷ = 1650

For both sunny weekdays and sunny weekend days, as the predicted high temperature increases, so does the predicted day's attendance. However, the predicted day's attendance on sunny weekend days increases at a faster rate than on sunny weekdays. Also, the predicted day's attendance is higher on sunny weekend days than on sunny weekdays.
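A sketch of these calculations, using the fitted interaction model from the exercise:

    def attendance(x1, x2, x3):
        # Fitted model: y-hat = 250 - 700*x1 + 100*x2 + 5*x3 + 15*x1*x3
        return 250 - 700 * x1 + 100 * x2 + 5 * x3 + 15 * x1 * x3

    for temp in (70, 80, 90, 100):
        weekday = attendance(x1=0, x2=1, x3=temp)   # sunny weekday
        weekend = attendance(x1=1, x2=1, x3=temp)   # sunny weekend day
        print(temp, weekday, weekend)               # e.g. 70 -> 700 and 1050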
b.

To determine if the interaction term is a useful addition to the model, we test:

H0: β4 = 0
Ha: β4 ≠ 0

The test statistic is t = β̂4/s_β̂4 = 15/3 = 5

The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with df = n - (k + 1) = 30 - (4 + 1) = 25. From Table VI, Appendix B, t.025 = 2.06. The rejection region is t < -2.06 or t > 2.06.

Since the observed value of the test statistic falls in the rejection region (t = 5 > 2.06), H0 is rejected. There is sufficient evidence to indicate the interaction term is a useful addition to the model at α = .05.
c.

For x1 = 0, x2 = 1, and x3 = 95,

ŷ = 250 - 700(0) + 100(1) + 5(95) + 15(0)(95) = 825

d.

The width of the interval in Exercise 11.139e is 1245 - 645 = 600, while the width is 850 - 800 = 50 for the model containing the interaction term. The smaller the width of the interval, the smaller the variance. This implies that the interaction term is quite useful in predicting daily attendance. It has reduced the unexplained error.


e.	Because an interaction term including x1 is in the model, the coefficient corresponding to x1 must be interpreted with caution. For all observed values of x3 (temperature), the interaction term value is greater than 700.

11.142	a.
From MINITAB, the output is:
Regression Analysis: y versus x1, x2, x1sq, x2sq, x1x2

The regression equation is
y = -9.92 + 0.167 x1 + 0.138 x2 - 0.00111 x1sq - 0.000843 x2sq + 0.000241 x1x2

Predictor     Coef          SE Coef        T       P
Constant     -9.917        1.354        -7.32   0.000
x1            0.16681      0.02124       7.85   0.000
x2            0.13760      0.02673       5.15   0.000
x1sq         -0.0011082    0.0001173    -9.45   0.000
x2sq         -0.0008433    0.0001594    -5.29   0.000
x1x2          0.0002411    0.0001440     1.67   0.103

S = 0.1871   R-Sq = 93.7%   R-Sq(adj) = 92.7%

Analysis of Variance
Source          DF       SS       MS       F      P
Regression       5   17.5827   3.5165   100.41  0.000
Residual Error  34    1.1908   0.0350
Total           39   18.7735

Source  DF  Seq SS
x1       1  5.2549
x2       1  7.5311
x1sq     1  3.6434
x2sq     1  1.0552
x1x2     1  0.0982

The least squares prediction equation is:

ŷ = -9.917 + .167x1 + .138x2 - .00111x1² - .000843x2² + .000241x1x2

b.

The standard deviation for the first-order model is s = .4023. The standard deviation
for the second-order model is s = .1871.
The relative precision for the first-order model is 2(.4023) = .8046. The relative
precision for the second-order model is 2(.1871) = .3742.

c.

To determine if the model is useful, we test:

H0: β1 = β2 = β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 1, 2, ..., 5

The test statistic is F = MSR/MSE = 3.5165/.0350 = 100.41

The p-value is .0000. Since the p-value is less than α = .05, H0 is rejected. There is sufficient evidence to indicate the model is useful for predicting GPA at α = .05.


d.

To determine if the interaction term is important, we test:

H0: β5 = 0
Ha: β5 ≠ 0

The test statistic is t = 1.67.

The p-value is .103. Since the p-value is not less than α = .10, H0 is not rejected. There is insufficient evidence to indicate the interaction term is important for predicting GPA at α = .10.
e.

From MINITAB, the plots are:

[Residual plots (response is y): residuals versus x1 and residuals versus x2 for the second-order model.]

The residual plots of the residuals against x1 and against x2 for the second-order model indicate there is no mound or bowl shape in either graph. This implies that second-order is the highest order necessary. We have eliminated the mound shape from the plots of the residuals against x1 and the residuals against x2 for the first-order model. From the plots and the results of the tests in 11.145, it appears the second-order model is preferable for predicting GPA.
f.

To see if the second-order terms are useful, we test:

H0: β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 3, 4, 5

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]} = [(5.9876 - 1.1908)/3] / .0350 = 45.68

Since no α is given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k - g = 5 - 2 = 3 and ν2 = n - (k + 1) = 40 - (5 + 1) = 34. From Table IX, Appendix B, F.05 ≈ 2.92. The rejection region is F > 2.92.

Since the observed value of the test statistic falls in the rejection region (F = 45.68 > 2.92), H0 is rejected. There is sufficient evidence that at least one second-order term is useful at α = .05.


11.144 a.

The model is E(y) = β0 + β1x1


A sketch of the response curve might be:

b.

The model is E(y) = β0 + β1x1 + β2x2 + β3x3

where x2 = 1 if brand 2, 0 otherwise
      x3 = 1 if brand 3, 0 otherwise

A sketch of the response curve might be:

c.

The model is E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3


A sketch of the response curve might be:


The Condo Sales Case


(To accompany Chapters 10–11)

Several models were fit to obtain the final model. I first fit a model with only the main effects for
Floor, Distance, View, Endunit, and Furnish. Of these, only Furnish, adjusted for the other variables,
was not significant. See the output below.
The regression equation is
Price = 184 - 3.81 Floor + 1.74 Distance + 40.3 View - 32.7 Endunit + 4.28 Furnish

Predictor    Coef       Stdev      t-ratio      p
Constant    183.570     5.221       35.16     0.000
Floor        -3.8076    0.7482      -5.09     0.000
Distance      1.7414    0.3750       4.64     0.000
View         40.325     3.456       11.67     0.000
Endunit     -32.716     9.581       -3.41     0.001
Furnish       4.279     3.602        1.19     0.236

s = 24.39   R-sq = 49.4%   R-sq(adj) = 48.2%

Analysis of Variance
SOURCE       DF      SS      MS      F       p
Regression    5   118091   23618   39.69   0.000
Error       203   120802     595
Total       208   238893

SOURCE    DF  SEQ SS
Floor      1   14149
Distance   1   21208
View       1   75065
Endunit    1    6829
Furnish    1     840

I then added Floor² and Distance² to the model with all main effects. For this model, all of the main effects, including Furnish, were significant along with both squared terms. The output follows.
The regression equation is
Price = 220 - 13.3 Floor - 7.01 Distance + 38.9 View - 22.0 Endunit + 7.31 Furnish + 1.05 FlSq + 0.572 DiSq

Predictor    Coef       Stdev      t-ratio      p
Constant    220.258     8.178       26.93     0.000
Floor       -13.296     3.253       -4.09     0.000
Distance     -7.007     1.614       -4.34     0.000
View         38.927     3.202       12.16     0.000
Endunit     -21.967     9.086       -2.42     0.017
Furnish       7.308     3.419        2.14     0.034
FlSq          1.0512    0.3492       3.01     0.003
DiSq          0.5719    0.1033       5.54     0.000

s = 22.49   R-sq = 57.4%   R-sq(adj) = 56.0%

Analysis of Variance
SOURCE       DF      SS      MS      F       p
Regression    7   137234   19605   38.76   0.000
Error       201   101659     506
Total       208   238893

SOURCE    DF  SEQ SS
Floor      1   14149
Distance   1   21208
View       1   75065
Endunit    1    6829
Furnish    1     840
FlSq       1    3640
DiSq       1   15503

I then did a stepwise regression, forcing all the main effects and the two squared terms into the model,
to see if any two-way interaction terms could be added to the model. From this, only the interaction
between Floor and View was significant. The output from the final model is:
The regression equation is
Price = 206 - 9.93 Floor - 7.02 Distance + 66.0 View - 22.5 Endunit + 6.48 Furnish + 1.02 FlSq + 0.577 DiSq - 6.04 FV

Predictor    Coef        Stdev       t-ratio      p
Constant    206.123      8.379        24.60     0.000
Floor        -9.927      3.186        -3.12     0.002
Distance     -7.020      1.539        -4.56     0.000
View         65.952      6.619         9.96     0.000
Endunit     -22.451      8.662        -2.59     0.010
Furnish       6.485      3.265         1.99     0.048
FlSq          1.0207     0.3330        3.07     0.002
DiSq          0.57720    0.09848       5.86     0.000
FV           -6.037      1.312        -4.60     0.000

s = 21.44   R-sq = 61.5%   R-sq(adj) = 60.0%

Analysis of Variance
SOURCE       DF      SS      MS      F       p
Regression    8   146965   18371   39.97   0.000
Error       200    91928     460
Total       208   238893

SOURCE    DF  SEQ SS
Floor      1   14149
Distance   1   21208
View       1   75065
Endunit    1    6829
Furnish    1     840
FlSq       1    3640
DiSq       1   15503
FV         1    9731


This final model is fairly good. The R-squared value is .615. Thus, 61.5% of the variation in prices can be explained by the model that includes the following variables: Floor and Floor-squared, Distance and Distance-squared, View, Endunit, Furnish, and the interaction of Floor and View. The residual plots are as follows:

From the residual plots, it appears that the data are normally distributed, but there may be a couple of outliers. This is evident by the two points whose standardized residuals are less than -3. Also, it appears that there is constant variance. Thus, the model looks to be fairly good. It would be better if the R-squared value was higher, however.

The final model is:

Price = 206 - 9.93 Floor - 7.02 Distance + 66.0 View - 22.5 Endunit + 6.48 Furnish + 1.02 FlSq + 0.577 DiSq - 6.04 FV
I have included graphs to indicate how each variable affects the price. These graphs reflect the
relationship between Price and a selected variable, holding the other variables constant.
The first graph is a graph of Price by Floor for each level of View, since Floor and View interact. Both
lines are curved to reflect the quadratic relationship between Floor and Price. For the Non-ocean view,
the price is fairly constant. There is a slight decrease in price as the Floor increases until Floor 5, and
then a slight increase as the floor increases. For the Ocean view, the price decreases at a decreasing rate
as the Floor increases.
The second graph is a graph of the Price by Distance. Again, the quadratic relationship is reflected by
the curved line. As the distance increases, the price decreases until a distance of 6 is reached. Then the
price begins to increase again as the distance increases.


The third graph is a graph of the Price by View, for each Floor. Again, we must look at the relationship
between Price and View at each Floor because of the significant interaction. For all Floors, the price of
the Ocean View is higher than the price of the Non-ocean View. However, the difference in the two
views depends on the floor.
The fourth graph is a graph of the Price by Endunit. From the graph, the price of the end units is less
than the price of the other units.
The last graph is a graph of the Price by Furnish. From the graph, the price of the furnished units is
higher than the price of the non-furnished units.


Methods for Quality Improvement

Chapter 12

12.2

If rational subgrouping is not used, it is possible that a change in the process mean will go
undetected. In rational subgrouping, samples are selected so that a change in the process mean
occurs between samples, not within samples.

12.4

An x̄-chart is used to monitor the process mean.

12.6

The variation of a process must be stable. If it were not, the control limits of the x̄-chart would
be meaningless since they are a function of the process variation.

12.8

a.

According to Rule 4 (fourteen points in a row alternating up and down), the process is out of
control. Therefore, it is affected by both common and special causes of variation. An in-control
process is affected by only common causes. Rule 4 says that if we observe 14
points in a row alternating up and down, that is an indication of the presence of special
causes of variation in addition to common causes. Points 2 through 16 alternate up and
down.

b.

The extended x̄-chart is:

[x̄-chart of the sample means for samples 1 through 30, with centerline x̿, zones A, B, and C, and the UCL and LCL]

The additional points suggest that the process is out of control. Rule 1 (One point
beyond Zone A), Rule 5 (2 out of 3 points in a row in Zone A or beyond), and Rule 6
(4 out of 5 points in a row in Zone B or beyond) indicate the process is out of control.
12.10

a.

x̿ = (x̄1 + x̄2 + ... + x̄25)/k = 2008.8/25 = 80.352

R̄ = (R1 + R2 + ... + R25)/k = 198.7/25 = 7.948

b.

Centerline = x̿ = 80.352

From Table XII, Appendix B, with n = 5, A2 = .577.

Upper control limit = x̿ + A2R̄ = 80.352 + .577(7.948) = 84.938
Lower control limit = x̿ - A2R̄ = 80.352 - .577(7.948) = 75.766

c. and d.

Upper A-B boundary = x̿ + (2/3)(A2R̄) = 80.352 + (2/3)(.577)(7.948) = 83.409
Lower A-B boundary = x̿ - (2/3)(A2R̄) = 80.352 - (2/3)(.577)(7.948) = 77.295
Upper B-C boundary = x̿ + (1/3)(A2R̄) = 80.352 + (1/3)(.577)(7.948) = 81.881
Lower B-C boundary = x̿ - (1/3)(A2R̄) = 80.352 - (1/3)(.577)(7.948) = 78.823

The x̄-chart is:

Rule 1: One point beyond Zone A: Point 10 is beyond Zone A. This indicates the process is out of control.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are no groups of three consecutive points that have two or more in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

Rule 1 indicates the process is out of control.
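A minimal sketch (not part of the solution) of the x̄-chart limit calculations above, with A2 = .577 for subgroups of size n = 5 from Table XII:

```python
# x-bar chart centerline, control limits, and zone boundaries from summary totals.
def xbar_chart_limits(xbar_sum, r_sum, k, A2):
    x_dbar = xbar_sum / k          # grand mean (centerline)
    r_bar = r_sum / k              # mean range
    width = A2 * r_bar             # distance from centerline to a control limit
    return {
        "centerline": x_dbar,
        "UCL": x_dbar + width,
        "LCL": x_dbar - width,
        "upper_AB": x_dbar + 2 * width / 3,
        "lower_AB": x_dbar - 2 * width / 3,
        "upper_BC": x_dbar + width / 3,
        "lower_BC": x_dbar - width / 3,
    }

# Reproduces the numbers above: centerline 80.352, UCL 84.938, LCL 75.766, etc.
print(xbar_chart_limits(xbar_sum=2008.8, r_sum=198.7, k=25, A2=0.577))
```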


12.12

a.

From Table XII, Appendix B, with n = 4, A2 = .729.

x̿ = .6733 and R̄ = .335

Upper control limit = x̿ + A2R̄ = .6733 + .729(.335) = .9175
Lower control limit = x̿ - A2R̄ = .6733 - .729(.335) = .4291

b.

Upper A-B boundary = x̿ + (2/3)(A2R̄) = .6733 + (2/3)(.729)(.335) = .8361
Lower A-B boundary = x̿ - (2/3)(A2R̄) = .6733 - (2/3)(.729)(.335) = .5105
Upper B-C boundary = x̿ + (1/3)(A2R̄) = .6733 + (1/3)(.729)(.335) = .7547
Lower B-C boundary = x̿ - (1/3)(A2R̄) = .6733 - (1/3)(.729)(.335) = .5919

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: There are nine points (Points 9 through 17) in a row in Zone C (on one side of the centerline) or beyond. This indicates that the process is out of control.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are no groups of three consecutive points that have two or more in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

Rule 2 indicates the process is out of control.


c.


These control limits should not be used to monitor future output because the process is
out of control. One or more special causes of variation are affecting the process mean.
These should be identified and eliminated in order to bring the process into control.


12.14

a.

The process of interest is the production of bolts used in military aircraft.

b.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Length by Hour


Variable: Length

Hour   N    Mean  Median  TrMean  StDev  SE Mean  Minimum  Maximum      Q1      Q3
   1   4  36.973  36.965  36.973  0.098    0.049   36.880   37.080  36.885  37.067
   2   4  36.957  36.970  36.957  0.079    0.040   36.850   37.040  36.878  37.025
   3   4  37.067  37.060  37.067  0.081    0.040   36.990   37.160  36.995  37.147
   4   4  37.065  37.040  37.065  0.096    0.048   36.980   37.200  36.990  37.165
   5   4  36.948  36.940  36.948  0.121    0.061   36.810   37.100  36.835  37.068
   6   4  36.998  36.985  36.998  0.101    0.051   36.890   37.130  36.908  37.100
   7   4  37.000  36.995  37.000  0.054    0.027   36.940   37.070  36.953  37.053
   8   4  37.005  36.995  37.005  0.087    0.044   36.910   37.120  36.927  37.093
   9   4  37.027  37.020  37.027  0.111    0.055   36.900   37.170  36.927  37.135
  10   4  36.970  36.950  36.970  0.106    0.053   36.870   37.110  36.880  37.080
  11   4  37.020  37.050  37.020  0.098    0.049   36.880   37.100  36.918  37.093
  12   4  36.983  36.985  36.983  0.066    0.033   36.900   37.060  36.920  37.043
  13   4  37.070  37.075  37.070  0.132    0.066   36.910   37.220  36.940  37.195
  14   4  37.073  37.075  37.073  0.025    0.013   37.040   37.100  37.048  37.095
  15   4  36.993  37.020  36.993  0.069    0.035   36.890   37.040  36.920  37.038
  16   4  36.955  36.965  36.955  0.040    0.020   36.900   36.990  36.913  36.988
  17   4  37.038  37.035  37.038  0.097    0.049   36.940   37.140  36.948  37.130
  18   4  37.010  37.010  37.010  0.085    0.043   36.910   37.110  36.927  37.093
  19   4  36.955  36.965  36.955  0.058    0.029   36.880   37.010  36.895  37.005
  20   4  37.035  37.045  37.035  0.109    0.055   36.900   37.150  36.925  37.135
  21   4  36.995  36.985  36.995  0.044    0.022   36.960   37.050  36.960  37.040
  22   4  37.023  37.020  37.023  0.096    0.048   36.930   37.120  36.935  37.113
  23   4  37.003  37.010  37.003  0.039    0.019   36.950   37.040  36.963  37.035
  24   4  36.995  37.005  36.995  0.071    0.036   36.900   37.070  36.923  37.058
  25   4  37.010  37.020  37.010  0.083    0.041   36.900   37.100  36.927  37.083

For each sample, we compute R = range = largest measurement - smallest measurement.


The results are listed in the table:


Sample No.    R    Sample No.    R
    1        .20       14       .06
    2        .19       15       .15
    3        .17       16       .09
    4        .22       17       .20
    5        .29       18       .20
    6        .24       19       .13
    7        .13       20       .25
    8        .21       21       .09
    9        .27       22       .19
   10        .24       23       .09
   11        .22       24       .17
   12        .16       25       .20
   13        .31

x̿ = (x̄1 + x̄2 + ... + x̄25)/k = 925.1650/25 = 37.0066

R̄ = (R1 + R2 + ... + R25)/k = 4.67/25 = .1868

Centerline = x̿ = 37.007

From Table XII, Appendix B, with n = 4, A2 = .729.

Upper control limit = x̿ + A2R̄ = 37.007 + .729(.1868) = 37.143
Lower control limit = x̿ - A2R̄ = 37.007 - .729(.1868) = 36.871

Upper A-B boundary = x̿ + (2/3)(A2R̄) = 37.007 + (2/3)(.729)(.1868) = 37.098
Lower A-B boundary = x̿ - (2/3)(A2R̄) = 37.007 - (2/3)(.729)(.1868) = 36.916
Upper B-C boundary = x̿ + (1/3)(A2R̄) = 37.007 + (1/3)(.729)(.1868) = 37.052
Lower B-C boundary = x̿ - (1/3)(A2R̄) = 37.007 - (1/3)(.729)(.1868) = 36.962

The x̄-chart is:

c.

To determine if the process is in or out of control, we check the six rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are no groups of three consecutive points that have two or more in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

The process appears to be in control. No special causes of variation appear to be present.
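A minimal sketch (not from the text) of two of the pattern-detection rules applied to the hourly means above, using the control limits UCL = 37.143 and LCL = 36.871 computed in part b:

```python
# Rule 1 (a point beyond Zone A) and Rule 3 (six points in a row steadily increasing or decreasing).
def rule1_violations(points, ucl, lcl):
    """Indices (1-based) of points beyond the control limits."""
    return [i for i, x in enumerate(points, start=1) if x > ucl or x < lcl]

def rule3_violation(points, run_length=6):
    """True if some run of run_length points is steadily increasing or decreasing."""
    for start in range(len(points) - run_length + 1):
        window = points[start:start + run_length]
        diffs = [b - a for a, b in zip(window, window[1:])]
        if all(d > 0 for d in diffs) or all(d < 0 for d in diffs):
            return True
    return False

means = [36.973, 36.957, 37.067, 37.065, 36.948, 36.998, 37.000, 37.005, 37.027, 36.970,
         37.020, 36.983, 37.070, 37.073, 36.993, 36.955, 37.038, 37.010, 36.955, 37.035,
         36.995, 37.023, 37.003, 36.995, 37.010]

print(rule1_violations(means, ucl=37.143, lcl=36.871))   # [] -- no points beyond Zone A
print(rule3_violation(means))                            # False -- no run of six
```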

d.

An example of a special cause of variation would be if the machine used to produce the
bolts slipped out of alignment and started producing bolts of a different length. An
example of common cause variation would be the grade of the raw material used to make
the bolts.

e.

Since the process appears to be in control, it is appropriate to use these limits to monitor
future process output.

12.16

a.

x̿ = (x̄1 + x̄2 + ... + x̄16)/k = 868.18/16 = 54.26125

R̄ = (R1 + R2 + ... + R16)/k = 44.1/16 = 2.75625

Centerline = x̿ = 54.26125

From Table XII, Appendix B, with n = 5, A2 = .577.

Upper control limit = x̿ + A2R̄ = 54.26125 + .577(2.75625) = 55.8516
Lower control limit = x̿ - A2R̄ = 54.26125 - .577(2.75625) = 52.6709

Upper A-B boundary = x̿ + (2/3)(A2R̄) = 54.26125 + (2/3)(.577)(2.75625) = 55.3215
Lower A-B boundary = x̿ - (2/3)(A2R̄) = 54.26125 - (2/3)(.577)(2.75625) = 53.2010
Upper B-C boundary = x̿ + (1/3)(A2R̄) = 54.26125 + (1/3)(.577)(2.75625) = 54.7914
Lower B-C boundary = x̿ - (1/3)(A2R̄) = 54.26125 - (1/3)(.577)(2.75625) = 53.7311

The x̄-chart is:

b.

To determine if the process is in or out of control, we check the six rules:

Rule 1: One point beyond Zone A: One point is beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are two sets of three consecutive points (data points 3, 4, and 5 and data points 4, 5, and 6) that have two points in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

Special causes of variation appear to be present. The process appears to be out of control.
Rules 1 and 5 indicate the process is out of control.
c.


Since the process is out of control, these control limits should not be used to monitor
future process outputs.


12.18

The R-chart is designed to monitor the variation of the process.

12.20

Using Table XII, Appendix B:

a.  With n = 4, D3 = 0.000 and D4 = 2.282.

b.  With n = 12, D3 = 0.283 and D4 = 1.717.

c.  With n = 24, D3 = 0.451 and D4 = 1.548.

12.22

a.

From Exercise 12.11, the R values are:


Sample No.    R    Sample No.    R
    1        1.8       11       3.2
    2        2.8       12       0.9
    3        3.8       13       2.6
    4        2.5       14       4.0
    5        3.7       15       2.2
    6        5.0       16       4.3
    7        5.5       17       3.6
    8        3.5       18       2.5
    9        2.5       19       2.2
   10        4.1       20       5.5

R̄ = (R1 + R2 + ... + R20)/k = 66.2/20 = 3.31

Centerline = R̄ = 3.31

From Table XII, Appendix B, with n = 4, D4 = 2.282, and D3 = 0.

Upper control limit = R̄D4 = 3.31(2.282) = 7.553

Since D3 = 0, the lower control limit is negative and is not included on the chart.

b.

From Table XII, Appendix B, with n = 4, d2 = 2.059, and d3 = .880.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = 3.31 + 2(.880)(3.31/2.059) = 6.139
Lower A-B boundary = R̄ - 2d3(R̄/d2) = 3.31 - 2(.880)(3.31/2.059) = 0.481
Upper B-C boundary = R̄ + d3(R̄/d2) = 3.31 + (.880)(3.31/2.059) = 4.725
Lower B-C boundary = R̄ - d3(R̄/d2) = 3.31 - (.880)(3.31/2.059) = 1.895

c.

The R-chart is:

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control.
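A minimal sketch (not from the text) of the R-chart calculations in Exercise 12.22, using the constants d2 = 2.059, d3 = .880, D3 = 0, and D4 = 2.282 for subgroups of size n = 4:

```python
def r_chart_limits(r_bar, d2, d3, D3, D4):
    """Centerline, control limits, and zone boundaries of an R-chart."""
    sigma_r = d3 * r_bar / d2          # estimated standard deviation of R
    return {
        "centerline": r_bar,
        "UCL": D4 * r_bar,
        "LCL": max(D3 * r_bar, 0.0),   # reported as "not included" when it would be negative
        "upper_AB": r_bar + 2 * sigma_r,
        "lower_AB": r_bar - 2 * sigma_r,
        "upper_BC": r_bar + sigma_r,
        "lower_BC": r_bar - sigma_r,
    }

# Reproduces the values above: UCL = 7.553, A-B boundaries 6.139 / 0.481, B-C boundaries 4.725 / 1.895.
print(r_chart_limits(r_bar=3.31, d2=2.059, d3=0.880, D3=0.0, D4=2.282))
```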


12.24

a.

From Table XII, Appendix B, with n = 4, D3 = 0, and D4 = 2.282.


R̄ = .335

Upper control limit = R̄D4 = .335(2.282) = .7645

Since D3 = 0, the lower control limit is negative and is not included on the chart.

b.

To determine if special causes of variation are present, we need to complete the R-chart.
From Table XII, Appendix B, with n = 4, d2 = 2.059, and d3 = .880.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = .335 + 2(.880)(.335/2.059) = .6213
Lower A-B boundary = R̄ - 2d3(R̄/d2) = .335 - 2(.880)(.335/2.059) = .0486
Upper B-C boundary = R̄ + d3(R̄/d2) = .335 + (.880)(.335/2.059) = .4782
Lower B-C boundary = R̄ - d3(R̄/d2) = .335 - (.880)(.335/2.059) = .1918

The R-chart is:

[R-chart with centerline R̄ = .335, UCL = .7645, and zone boundaries .6213, .4782, .1918, and .0486]

To determine if the process is in control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: There are not nine points in a row in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

It appears that the process is in control.


c.

Yes. This process appears to be in control. Therefore, these control limits could be used
to monitor future output.

d.

Of the 30 R values plotted, there are only 6 different values. Most of the R values take on
one of three values. This indicates that the data must be discrete (take on a countable
number of values), or that the path widths are multiples of each other.


12.26

a.

R̄ = (R1 + R2 + ... + R20)/k = (4 + 6 + ... + 15)/20 = 176/20 = 8.8

Centerline = R̄ = 8.8

From Table XII, Appendix B, with n = 5, D4 = 2.114 and D3 = 0.

Upper control limit = R̄D4 = 8.8(2.114) = 18.603

Since D3 = 0, the lower control limit is negative and is not included on the chart.

From Table XII, Appendix B, with n = 5, d2 = 2.326 and d3 = .864.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = 8.8 + 2(.864)(8.8/2.326) = 15.338
Lower A-B boundary = R̄ - 2d3(R̄/d2) = 8.8 - 2(.864)(8.8/2.326) = 2.262
Upper B-C boundary = R̄ + d3(R̄/d2) = 8.8 + (.864)(8.8/2.326) = 12.069
Lower B-C boundary = R̄ - d3(R̄/d2) = 8.8 - (.864)(8.8/2.326) = 5.531

The R-chart is:

b.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control since none of the out-of-control signals are observed.
No special causes of variation appear to be present.

c.

Since the process appears to be in control, the control limits of the R-chart could be used
to monitor future replacement cycle times.

d.

From part b, we decided that the process was in control. However, there does appear to
be a pattern emerging in the R-chart. As the sample number increases, the value of R is
tending to increase. If this process was monitored for a longer period of time, the R-chart
might indicate that the process was out of control.

12.28

a.

R̄ = (R1 + R2 + ... + R16)/k = (.4 + 1.4 + ... + 2.6)/16 = 44.1/16 = 2.756

Centerline = R̄ = 2.756

From Table XII, Appendix B, with n = 5, D4 = 2.114 and D3 = 0.

Upper control limit = R̄D4 = 2.756(2.114) = 5.826

Since D3 = 0, the lower control limit is negative and is not included on the chart.

From Table XII, Appendix B, with n = 5, d2 = 2.326 and d3 = .864.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = 2.756 + 2(.864)(2.756/2.326) = 4.803
Lower A-B boundary = R̄ - 2d3(R̄/d2) = 2.756 - 2(.864)(2.756/2.326) = .709
Upper B-C boundary = R̄ + d3(R̄/d2) = 2.756 + (.864)(2.756/2.326) = 3.780
Lower B-C boundary = R̄ - d3(R̄/d2) = 2.756 - (.864)(2.756/2.326) = 1.732

The R-chart is:

b.

The R-chart is designed to monitor the process variation.

c.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control. None of the out-of-control signals are present.
There is no indication that special causes of variation are present.
12.30

The p-chart is designed to monitor the proportion of defective units produced by a process.

12.32

a.

To compute the proportion of defectives in each sample, divide the number of defectives
by the number in the sample, 200:

p̂ = No. of defectives / No. in sample

The sample proportions are listed in the table:


Sample No.    p̂     Sample No.    p̂
     1      .080         14      .060
     2      .070         15      .070
     3      .045         16      .055
     4      .055         17      .040
     5      .075         18      .035
     6      .040         19      .060
     7      .060         20      .075
     8      .080         21      .045
     9      .085         22      .080
    10      .065         23      .065
    11      .075         24      .055
    12      .050         25      .050
    13      .045

b.

To get the total number of defectives, sum the number of defectives for all 25 samples.
The sum is 303. To get the total number of units sampled, multiply the sample size by the
number of samples: 200(25) = 5000.
p̄ = Total defective in all samples / Total units sampled = 303/5000 = .0606

Centerline = p̄ = .0606

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .0606 + 3√(.0606(.9394)/200) = .1112
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .0606 - 3√(.0606(.9394)/200) = .0100

c.

Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .0606 + 2√(.0606(.9394)/200) = .0943
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .0606 - 2√(.0606(.9394)/200) = .0269
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .0606 + √(.0606(.9394)/200) = .0775
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .0606 - √(.0606(.9394)/200) = .0437

d.

The p-chart is:

e.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control. There do not appear to be any special causes of
variation.
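A minimal sketch (not part of the solution) of the p-chart limits in Exercise 12.32, where 303 defectives were found in 25 samples of n = 200 units each:

```python
from math import sqrt

def p_chart_limits(total_defective, n, k):
    """Centerline, control limits, and zone boundaries for a p-chart."""
    p_bar = total_defective / (n * k)
    se = sqrt(p_bar * (1 - p_bar) / n)
    return {
        "centerline": p_bar,
        "UCL": p_bar + 3 * se,
        "LCL": max(p_bar - 3 * se, 0.0),
        "upper_AB": p_bar + 2 * se,
        "lower_AB": p_bar - 2 * se,
        "upper_BC": p_bar + se,
        "lower_BC": p_bar - se,
    }

# Reproduces the numbers above: centerline .0606, UCL .1112, LCL .0100, etc.
print(p_chart_limits(total_defective=303, n=200, k=25))
```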
12.34

a.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .01)
= 891
.01

The minimum sample size is 892.


b.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .05)
= 171
.05

The minimum sample size is 172.


c.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .10)
= 81
.10

The minimum sample size is 82.


d.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .20)
= 36
.20

The minimum sample size is 37.
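A minimal sketch (not from the text) of the minimum-sample-size rule n > 9(1 - p0)/p0 used in Exercises 12.34 and 12.36 so that a p-chart's lower control limit is not negative:

```python
from math import floor

def min_sample_size(p0):
    """Smallest integer n strictly greater than 9(1 - p0)/p0."""
    return floor(9 * (1 - p0) / p0) + 1

for p0 in (0.01, 0.05, 0.10, 0.20):
    print(p0, min_sample_size(p0))   # 892, 172, 82, 37, matching parts a-d above
```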


12.36

a.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .07)
= 119.6 120
.07

The minimum sample size is 120.


b.

To compute the proportion of defectives in each sample, divide the number of defectives
by the number in the sample, 120:

p̂ = No. of defectives / No. in sample

The sample proportions are listed in the table:


Sample No.    p̂     Sample No.    p̂
     1      .092         11      .083
     2      .042         12      .100
     3      .033         13      .067
     4      .067         14      .050
     5      .083         15      .083
     6      .108         16      .042
     7      .075         17      .083
     8      .067         18      .083
     9      .083         19      .025
    10      .092         20      .067

To get the total number of defectives, sum the number of defectives for all 20 samples.
The sum is 171. To get the total number of units sampled, multiply the sample size by the
number of samples: 120(20) = 2400.


p̄ = Total defective in all samples / Total units sampled = 171/2400 = .071

Centerline = p̄ = .071

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .071 + 3√(.071(.929)/120) = .141
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .071 - 3√(.071(.929)/120) = .001
Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .071 + 2√(.071(.929)/120) = .118
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .071 - 2√(.071(.929)/120) = .024
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .071 + √(.071(.929)/120) = .094
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .071 - √(.071(.929)/120) = .048

The p-chart is:

c.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control.


d.

Since the process is in control, it is appropriate to use the control limits to monitor future
process output.

e.

No. The number of defectives recorded was per day, not per hour. Therefore, the p-chart
is not capable of signaling hour-to-hour changes in p.

12.38

a.

To compute the proportion of defectives in each sample, divide the number of defectives
by the number in the sample, 200:

p̂ = No. of defectives / No. in sample

The sample proportions are listed in the table:

Sample No.    p̂     Sample No.    p̂
     1      .065         16      .015
     2      .025         17      .005
     3      .010         18      .010
     4      .015         19      .015
     5      .010         20      .005
     6      .015         21      .045
     7      .005         22      .025
     8      .010         23      .010
     9      .005         24      .005
    10      .005         25      .015
    11      .055         26      .010
    12      .030         27      .020
    13      .010         28      .010
    14      .015         29      .005
    15      .005         30      .005

To get the total number of defectives, sum the number of defectives for all 30 samples.
The sum is 96. To get the total number of units sampled, multiply the sample size by the
number of samples: 200(30) = 6000.
p̄ = Total defective in all samples / Total units sampled = 96/6000 = .016

The centerline is p̄ = .016

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .016 + 3√(.016(1 - .016)/200) = .0426
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .016 - 3√(.016(1 - .016)/200) = -.0106
Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .016 + 2√(.016(1 - .016)/200) = .0337
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .016 - 2√(.016(1 - .016)/200) = -.0017
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .016 + √(.016(1 - .016)/200) = .0249
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .016 - √(.016(1 - .016)/200) = .0071

The p-chart is:

b.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: There are 3 points beyond Zone A (Points 1, 11, and 21).
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process does not appear to be in control. Rule 1 indicates that the process is out of
control.
12.40

Specification spread is the difference between the upper specification limit and the lower
specification limit. The specification spread is determined by customers, management, and
product designers. Process spread is the spread of the actual output and is a function of the
standard deviation of the data.

12.42

There are two reasons why CP should not be used in isolation. First, CP is a statistic and is
subject to sampling error. The sample standard deviation is used to estimate the population
standard deviation which is used to calculate the process spread. Thus, the estimate of the
process spread can vary from sample to sample. Second, CP does not reflect the shape of the
output distribution. Distributions with different shapes can have the same CP value.


12.44

The specification spread is the difference between the upper specification limit and the lower
specification limit.

a.  Specification spread = USL - LSL = 19.65 - 12.45 = 7.20

b.  Specification spread = USL - LSL = .0010 - .0008 = .0002

c.  Specification spread = USL - LSL = 1.43 - 1.27 = 0.16

d.  Specification spread = USL - LSL = 490 - 486 = 4

12.46

CP = Specification spread / Process spread = (USL - LSL)/6σ

a.  CP = (USL - LSL)/6s = (1.0065 - 1.0035)/(6(.0005)) = .003/.003 = 1

b.  CP = (USL - LSL)/6s = (22 - 21)/(6(.2)) = 1/1.2 = .8333

c.  CP = (USL - LSL)/6s = (875 - 870)/(6(.75)) = 5/4.5 = 1.111
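A minimal sketch (not from the text) of the capability index used in Exercise 12.46, CP = (USL - LSL)/(6s), where s estimates the process standard deviation:

```python
def cp_index(usl, lsl, s):
    """Capability index; values below 1 indicate the process is not capable."""
    return (usl - lsl) / (6 * s)

print(cp_index(1.0065, 1.0035, 0.0005))  # 1.0
print(cp_index(22, 21, 0.2))             # 0.8333...
print(cp_index(875, 870, 0.75))          # 1.1111...
```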

12.48

a.

If the output distribution is normal with a mean of 1,000 and a standard deviation of 100,
then the proportion of the output that is unacceptable is:

P(x < 980) + P(x > 1,020)
  = P(z < (980 - 1,000)/100) + P(z > (1,020 - 1,000)/100)
  = P(z < -.2) + P(z > .2) = (.5 - .0793) + (.5 - .0793) = .8414
(using Table IV, Appendix B)

The percentage of unacceptable output is 84.14%.

b.

CP = (USL - LSL)/6σ = (1,020 - 980)/(6(100)) = 40/600 = .067

Since the value of CP is less than 1, the process is not capable.
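A minimal sketch (not from the text) of the normal-curve calculation in Exercise 12.48a, using the standard normal CDF rather than Table IV:

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

p_unacceptable = normal_cdf(980, 1000, 100) + (1 - normal_cdf(1020, 1000, 100))
print(p_unacceptable)   # about .8415; the table-based answer above is .8414
```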


12.50

a.

A capability diagram is:


LSL = 35 is off the chart.

b.

Fifty-two of the observations are above the upper specification limit. Thus, the
percentage is (52/100) × 100% = 52%.

c.

From the sample, x̄ = 37.007 and s = .083.

CP = (USL - LSL)/6s = (37 - 35)/(6(.083)) = 2/.498 = 4.016

d.

Since the CP value is greater than 1, the process is capable.

12.52

The quality of a good or service is indicated by the extent to which it satisfies the needs and
preferences of its users. Its eight dimensions are: performance, features, reliability,
conformance, durability, serviceability, aesthetics, and other perceptions that influence
judgments of quality.

12.54

A process is a series of actions or operations that transform inputs to outputs. A process


produces output over time. Organizational process: Manufacturing a product. Personnel
Process: Balancing a checkbook.

12.56

The six major sources of process variation are: people, machines, materials, methods,
measurements, and environment.

12.62

Common causes of variation are the methods, materials, equipment, personnel, and
environment that make up a process and the inputs required by the process. That is, common
causes are attributable to the design of the process. Special causes of variation are events or
actions that are not part of the process design. Typically, they are transient, fleeting events that
affect only local areas or operations within the process for a brief period of time. Occasionally,
however, such events may have a persistent or recurrent effect on the process.

12.64

If a process is capable, then it is necessarily in control. If a process is in control, then the


control chart should be used to monitor the process.


12.66

The probability of observing a value of x̄ more than 3 standard deviations from its mean is:

P(x̄ > μ + 3σx̄) + P(x̄ < μ - 3σx̄) = P(z > 3) + P(z < -3)
  = (.5000 - .4987) + (.5000 - .4987) = .0026

If we want to find the number of standard deviations from the mean the control limits should be
set so the probability of the chart falsely indicating the presence of a special cause of variation
is .10, we must find the z-score such that:

P(z > z0) + P(z < -z0) = .1000  or  P(z > z0) = .0500

Using Table IV, Appendix B, z0 = 1.645. Thus the control limits should be set 1.645 standard
deviations above and below the mean.

12.68

a.

The centerline = x̄ = Σx/n = 150.58/20 = 7.529

The time series plot is:

b.

The variation pattern that best describes the pattern in this time series is the level shift.
Points 1 through 10 all have fairly low values, while points 11 through 20 all have fairly
high values.

12.70

a.

Yes. The minimum sample size necessary so the lower control limit is not negative is:

n > 9(1 - p0)/p0

From the data, p0 ≈ .06.

Thus, n > 9(1 - .06)/.06 = 141. Our sample size was 200.


b.

To compute the proportion of defectives in each sample, divide the number of defectives
by the number in the sample, 200:

p̂ = No. of defectives / No. in sample

The sample proportions are listed in the table:


Sample No.    p̂     Sample No.    p̂
     1      .02          12      .10
     2      .03          13      .10
     3      .055         14      .085
     4      .06          15      .065
     5      .025         16      .05
     6      .05          17      .055
     7      .04          18      .035
     8      .08          19      .03
     9      .085         20      .04
    10      .10          21      .045
    11      .14

To get the total number of defectives, sum the number of defectives for all 21 samples.
The sum is 258. To get the total number of units sampled, multiply the sample size by the
number of samples: 200(21) = 4200.
p̄ = No. of defectives / No. in sample = 258/4200 = .0614

Centerline = p̄ = .0614

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .0614 + 3√(.0614(.9386)/200) = .1123
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .0614 - 3√(.0614(.9386)/200) = .0105
Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .0614 + 2√(.0614(.9386)/200) = .0953
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .0614 - 2√(.0614(.9386)/200) = .0275
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .0614 + √(.0614(.9386)/200) = .0784
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .0614 - √(.0614(.9386)/200) = .0444

The p-chart is:

c.

To determine if the control limits should be used to monitor future process output, we
need to check the four rules.

Rule 1: One point beyond Zone A: The 11th point is beyond Zone A. This indicates the process is out of control.
Rule 2: Nine points in a row in Zone C or beyond: There are not nine points in a row in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

Rule 1 indicates the process is out of control. These control limits should not be used to
monitor future process output.
12.72

a.

In order for the x̄-chart to be meaningful, we must assume the variation in the process is
constant (i.e., stable).

For each sample, we compute x̄ = Σx/n and R = range = largest measurement - smallest
measurement. The results are listed in the table:
Sample No.      x̄       R    Sample No.      x̄       R
     1      32.325    11.6        13      31.050    13.3
     2      30.825    12.4        14      34.400     9.6
     3      30.450     7.8        15      31.350     7.3
     4      34.525    10.2        16      28.150     8.6
     5      31.725     9.1        17      30.950     7.6
     6      33.850    10.4        18      32.225     5.6
     7      32.100    10.1        19      29.050    10.0
     8      28.250     6.8        20      31.400     8.7
     9      32.375     8.7        21      30.350     8.9
    10      30.125     6.3        22      34.175    10.5
    11      32.200     7.1        23      33.275    13.0
    12      29.150     9.3        24      30.950     8.9

x̿ = (x̄1 + x̄2 + ... + x̄24)/k = 755.225/24 = 31.4677

R̄ = (R1 + R2 + ... + R24)/k = 221.8/24 = 9.242

Centerline = x̿ = 31.468

From Table XII, Appendix B, with n = 4, A2 = .729.

Upper control limit = x̿ + A2R̄ = 31.468 + .729(9.242) = 38.205
Lower control limit = x̿ - A2R̄ = 31.468 - .729(9.242) = 24.731

Upper A-B boundary = x̿ + (2/3)(A2R̄) = 31.468 + (2/3)(.729)(9.242) = 35.960
Lower A-B boundary = x̿ - (2/3)(A2R̄) = 31.468 - (2/3)(.729)(9.242) = 26.976
Upper B-C boundary = x̿ + (1/3)(A2R̄) = 31.468 + (1/3)(.729)(9.242) = 33.714
Lower B-C boundary = x̿ - (1/3)(A2R̄) = 31.468 - (1/3)(.729)(9.242) = 29.222

The x̄-chart is:

b.

To determine if the process is in or out of control, we check the six rules.

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are no groups of three consecutive points that have two or more in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

The process appears to be in control. There are no indications that special causes of
variation are affecting the process.

c.

Since the process appears to be in control, these limits should be used to monitor future
process output.

12.74

a.

A capability analysis diagram is:

b.

For an upper specification limit of 5, there are 27 observations above this limit. Thus,
(27/100) × 100% = 27% of the observations are unacceptable. It does not appear that the
process is capable.

c.

From Exercise 12.73, the process appears to be in control. Thus, it is appropriate to
estimate CP.

From the sample, x̄ = 3.867 and s = 2.190.

CP = (USL - LSL)/6s = (5 - 0)/(6(2.19)) = 5/13.14 = .381

Since the CP value is less than 1, the process is not capable.

d.

There is no lower specification limit because management has no time limit below which
is unacceptable. The variable being measured is time customers wait in line. The actual
lower limit would be 0.

12.76

a.

To get the total number of defectives, sum the number of defectives for all 36 samples.
The sum is 279. To get the total number of units sampled, multiply the sample size by the
number of samples: 160(36) = 5760.
p̄ = Total defective in all samples / Total units sampled = 279/5760 = .048

The centerline is p̄ = .048

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .048 + 3√(.048(1 - .048)/160) = .099
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .048 - 3√(.048(1 - .048)/160) = -.003
Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .048 + 2√(.048(1 - .048)/160) = .082
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .048 - 2√(.048(1 - .048)/160) = .014
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .048 + √(.048(1 - .048)/160) = .065
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .048 - √(.048(1 - .048)/160) = .031
The p-chart is:

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control. Thus, there is no indication that special causes of
variation are present.


c.

The Pareto diagram is:

Most of the defects are due to microcracks. Thus, "microcracks" are the "vital few." The
other types of defectives are broken strands, gaps between layers, and internal voids.
These are the "trivial many."


Time Series: Descriptive Analyses,
Models, and Forecasting

Chapter 13

13.2

a.

The simple composite index is calculated as follows:


First, sum the observations for all the series of interest at each time period. Select the
base time period. Divide each sum by the sum in the base time period and multiply by
100.

b.

To calculate a weighted composite index, we follow the following steps:


First, multiply the observations in each time series by its appropriate weight. Then sum
the weighted observations across all times series for each time period. Select the base
time period. Divide each weighted sum by the weighted sum in the base time period and
multiply by 100.

c.

The steps necessary to compute a Laspeyres index are:

1. Collect data for each of k price series.
2. Select a base time period and collect purchase quantity information for each of the k series at the base time period.
3. Using the purchase quantity values at the base period as weights, multiply each value in the kth series by its corresponding weight.
4. Sum the products for each time period.
5. Divide each sum by the sum corresponding to the base period and multiply by 100.

d.

The steps necessary to compute a Paasche index are:

1. Collect data for each of k price series.
2. Select a base period.
3. Collect purchase quantity information for each series at each time period.
4. For each time period, multiply the value in each price series by its corresponding purchase quantity for that time period. Sum the products for each time period.
5. To find the value of the Paasche index at a particular time period, multiply the purchase quantity values (weights) for that time period by the corresponding price values of the base time period. Sum the results for the base period. The Paasche index is then found by dividing the sum found in (4) by the sum found in (5).

13.4

a.

The simple index for the quarter 4 price of product A, using quarter 1 as the base
period, is (4.25/3.25) × 100 = 130.77.

b.

The simple index for the quarter 2 price of product B, using quarter 1 as the base
period, is (1.25/1.75) × 100 = 71.43.

c.

To find the simple composite index, we must first sum the prices for all three products
over the base period and the quarter for which we want to compute the simple composite
index. The sum for quarter 1 is 3.25 + 1.75 + 8.00 = 13.00. The sum for quarter 4 is 4.25
+ 1.00 + 10.50 = 15.75. The simple composite index for quarter 4 using quarter 1 as the
base period is (15.75/13.00) × 100 = 121.15.

d.

The sum of all the products for quarter 2 is 3.50 + 1.25 + 9.35 = 14.10. The simple
composite index for quarter 4 using quarter 2 as the base period is (15.75/14.10) × 100 =
111.70.
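A minimal sketch (not part of the solution) of the simple and simple composite index calculations in Exercise 13.4, using only the quarter 1 and quarter 4 prices given there:

```python
price_q1 = {"A": 3.25, "B": 1.75, "C": 8.00}   # quarter 1 (base period) prices
price_q4 = {"A": 4.25, "B": 1.00, "C": 10.50}  # quarter 4 prices

def simple_index(value, base_value):
    return 100 * value / base_value

def simple_composite_index(values, base_values):
    return 100 * sum(values) / sum(base_values)

print(simple_index(price_q4["A"], price_q1["A"]))                     # 130.77 (part a)
print(simple_composite_index(price_q4.values(), price_q1.values()))   # 121.15 (part c)
```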

13.6

a.

To find the simple index, divide each value by the value for the base year and multiply by
100. The index numbers are:

Year    Simple Index (Base Year = 1975)       Simple Index (Base Year = 1980)
1975    (13,719/13,719) × 100 = 100.00        (13,719/21,023) × 100 =  65.26
1980    (21,023/13,719) × 100 = 153.24        (21,023/21,023) × 100 = 100.00
1985    (27,735/13,719) × 100 = 202.16        (27,735/21,023) × 100 = 131.93
1990    (35,353/13,719) × 100 = 257.69        (35,353/21,023) × 100 = 168.16
1995    (40,611/13,719) × 100 = 296.02        (40,611/21,023) × 100 = 193.17
2000    (50,890/13,719) × 100 = 370.95        (50,890/21,023) × 100 = 242.07

b.

The index value for 1990 is 257.69 when the base is 1975. Thus, the median annual
family income for 1990 increased by 257.69 - 100 = 157.69% over the median annual
family income in 1975.

The index value for 1990 is 168.16 when the base is 1980. Thus, the median annual
family income for 1990 increased by 168.16 - 100 = 68.16% over the median annual
family income in 1980.

13.8

a.

To compute the simple index, divide each housing start value by the 2001, Quarter 1
value, 274 and then multiply by 100.
Year    Quarter    Simple Index
2001       1       (274/274) × 100 = 100.00
           2       (374/274) × 100 = 136.50
           3       (341/274) × 100 = 124.45
           4       (285/274) × 100 = 104.01
2002       1       (293/274) × 100 = 106.93
           2       (386/274) × 100 = 140.88
           3       (361/274) × 100 = 131.75
           4       (319/274) × 100 = 116.42
2003       1       (304/274) × 100 = 110.95
           2       (406/274) × 100 = 148.18
           3       (412/274) × 100 = 150.36
           4       (377/274) × 100 = 137.59
2004       1       (345/274) × 100 = 125.91
           2       (456/274) × 100 = 166.42
           3       (440/274) × 100 = 160.58
           4       (370/274) × 100 = 135.04
2005       1       (369/274) × 100 = 134.67
           2       (485/274) × 100 = 177.01
           3       (471/274) × 100 = 171.90
           4       (392/274) × 100 = 143.07

b.

The value of the index for Quarter 2, 2004 is 166.42. Thus, the housing starts in Quarter
2, 2004 increased by 166.42 - 100 = 66.42% over the housing starts in the base quarter,
Quarter 1, 2001.

c.

The value of the index for Quarter 4, 2005 is 143.07. Thus, the housing starts in Quarter
4, 2005 increased by 143.07 - 100 = 43.07% over the housing starts in the base quarter,
Quarter 1, 2001.

d.

The number of housing starts for Quarter 1, 2003 is 304 thousand. The number of
housing starts for Quarter 4, 2005 is 392 thousand. Using Quarter 1, 2003 as the base, the
index for Quarter 4, 2005 is (392/304) × 100 = 128.95. Thus, the number of housing
starts in Quarter 4, 2005 increased by 128.95 - 100 = 28.95% over the housing starts in
Quarter 1, 2003.

13.10

a.

To compute the simple index for the agricultural data, divide each farm value by the
1980 value 3,364 and then multiply by 100. To compute the simple index for the
nonagricultural data, divide each nonfarm value by the 1980 value 95,938 and then
multiply by 100. The two indices are:

Year    Farm Index                         Nonfarm Index
1980    (3,364/3,364) × 100 = 100.00       (95,938/95,938) × 100 = 100.00
1985    (3,179/3,364) × 100 =  94.50       (103,971/95,938) × 100 = 108.37
1990    (3,223/3,364) × 100 =  95.81       (115,570/95,938) × 100 = 120.46
1995    (3,440/3,364) × 100 = 102.26       (121,460/95,938) × 100 = 126.60
2000    (2,464/3,364) × 100 =  73.25       (134,427/95,938) × 100 = 140.12
2003    (2,275/3,364) × 100 =  67.63       (135,461/95,938) × 100 = 141.20

b.

The nonfarm segment has shown the greater percentage change in employment over the
time period. The nonfarm employment in 2003 was 41.20% greater than in 1980. The
farm employment in 2003 was 32.37% lower than in 1980.

c.

To compute the simple composite index, first sum the two values (farm and nonfarm) for
every time period. Then divide the sum by the sum in 1980, 99,302, and then multiply by
100. The simple composite index is:

Year       Sum       Simple Composite Index
1980      99,302     (99,302/99,302) × 100 = 100.00
1985     107,150     (107,150/99,302) × 100 = 107.90
1990     118,793     (118,793/99,302) × 100 = 119.63
1995     124,900     (124,900/99,302) × 100 = 125.78
2000     136,891     (136,891/99,302) × 100 = 137.85
2003     137,736     (137,736/99,302) × 100 = 138.70

d.

The simple composite index value for 2003 is 138.70. The composite employment is
38.70% higher in 2003 than in 1980.

13.12

a.

To find the Laspeyres index, we multiply the durable goods by 10.9, the nondurable goods
by 14.02, and the services by 42.6. The three products are then summed. The index is
found by dividing the weighted sum at each time period by the weighted sum of 1970,
17,108.86, and then multiplying by 100. The Laspeyres index and the simple composite
index for 1970 (computed in Exercise 13.11) are:

Year    Simple Composite Index (1970)    Weighted Sum    Laspeyres Index
1960             51.43                     8,409.95           49.16
1965             68.77                    11,442.51           66.88
1970            100.00                    17,108.86          100.00
1975            158.52                    27,509.89          160.79
1980            270.39                    48,215.53          281.82
1985            412.59                    76,167.86          445.20
1990            581.78                   110,254.64          644.43
1995            768.60                   150,193.08          877.87
2000          1,033.83                   202,856.51        1,185.68
2004          1,272.99                   251,152.45        1,467.97

b.

The plot of the two indices is:

[Time series plot of the simple composite index (base 1970) and the Laspeyres index, 1960-2004]

The two indices are very similar from 1960 to approximately 1980. After 1980, the
difference between the two indices becomes larger, with the Laspeyres index increasing
faster than the simple composite index.
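A minimal sketch (not part of the solution) of turning the weighted sums above into the Laspeyres index with 1970 as the base period:

```python
weighted_sums = [8409.95, 11442.51, 17108.86, 27509.89, 48215.53,
                 76167.86, 110254.64, 150193.08, 202856.51, 251152.45]
base = weighted_sums[2]                      # 1970 is the base period
laspeyres = [100 * s / base for s in weighted_sums]
print(laspeyres)   # matches the Laspeyres column above (49.16, 66.88, 100, ..., 1467.97)
```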


13.14

a.

To get the simple composite price index, sum the prices for the three metals for each
month, divide by 2,090.35 (the sum of the prices for the base period January), and
multiply by 100. To get the simple composite quantity index, sum the quantities for the
three metals for each month, divide by 8,793.40 (the sum of the quantities for the base
period January), and multiply by 100. The indices are:

Month    Price Total    Price Index    Quantity Total    Quantity Index
Jan       2,090.35        100.00          8,793.40          100.00
Feb       2,495.72        119.39          8,531.70           97.02
Mar       2,536.85        121.36          9,406.50          106.97
Apr       2,409.55        115.27          9,047.10          102.89
May       2,550.70        122.02          9,303.20          105.80
Jun       2,603.20        124.53          9,152.10          104.08
Jul       2,719.30        130.09          9,301.80          105.78
Aug       2,998.52        143.45          9,457.90          107.56
Sep       2,978.98        142.51          9,382.90          106.70
Oct       2,997.82        143.41          9,698.20          110.29
Nov       3,038.80        145.37          9,127.00          103.79
Dec       3,018.57        144.41          8,807.90          100.16

b.

To compute the Laspeyres index, multiply the price for each month by the quantity for
each of the metals for January, sum the products for the three metals, divide by
1,768,700.64 (the sum for the base period January), and multiply by 100. The Laspeyres
index is:

Month        Total          Laspeyres Index
Jan       1,768,700.64          100.00
Feb       2,077,067.24          117.43
Mar       2,345,138.00          132.59
Apr       2,114,563.64          119.55
May       1,760,956.32           99.56
Jun       1,746,326.88           98.74
Jul       2,117,568.80          119.72
Aug       2,377,017.20          134.39
Sep       2,100,958.72          118.79
Oct       2,276,109.40          128.69
Nov       2,366,980.72          133.83
Dec       2,155,654.92          121.88

c.

The plots of the simple composite price index, the simple composite quantity index, and the
Laspeyres index are:

[Time series plot of the simple composite price index, the simple composite quantity index, and the Laspeyres index, January through December]

The quantity index appears to be fairly stable while the price index steadily
increases. The Laspeyres index is rather unstable, as it varies much more than the
other two indices.
d.

The following steps are used to compute the Paasche index:


1. First, multiply the price by the production for copper, steel, and lead for each month. The numerator of the index is the sum of these three quantities at each month.
2. Next, multiply the production values of copper by 1,133, the production of steel by 187.75, and the production of lead by 769.6. The denominator is the sum of these three quantities at each month.
3. The values of the Paasche index are the ratios of these two values at each month times 100.

The Paasche index is:


Month    Paasche Numerator    Paasche Denominator    Paasche Index
Jan        1,768,700.64          1,768,700.64            100.00
Feb        2,013,192.24          1,714,396.58            117.43
Mar        2,500,128.80          1,884,813.60            132.65
Apr        2,180,640.81          1,823,938.71            119.56
May        1,858,912.26          1,867,861.77             99.52
Jun        1,822,735.92          1,844,379.26             98.83
Jul        2,230,984.40          1,864,385.48            119.66
Aug        2,549,791.96          1,898,332.74            134.32
Sep        2,244,369.96          1,888,977.74            118.81
Oct        2,504,067.86          1,946,822.77            128.62
Nov        2,450,159.20          1,831,683.15            133.77
Dec        2,175,046.70          1,781,166.44            122.11

e.

The plot of the Laspeyres index and the Paasche index is:

[Time series plot of the Laspeyres and Paasche indices, January through December]

The two indices are almost identical.

f.

The values of the Laspeyres index for September and December are 118.79 and 121.88. The
values of the Paasche index for September and December are 118.81 and 122.11. These
values are almost identical. Both the Laspeyres and Paasche indices are so close to being
the same, neither is superior to the other.

13.16

a.

The exponentially smoothed employment for the first period is equal to the employment
for that period. For the rest of the time periods, the exponentially smoothed employment
values are found by multiplying .5 times the employment value of that time period and
adding to that (1 - .5) times the value of the exponentially smoothed employment figure
of the previous time period.

The exponentially smoothed employment value for time period 2 is .5(281) +
(1 - .5)(280) = 280.5. The rest of the values are shown in the table.

Month    t     Yt    Exponentially Smoothed Series (w = .5)
Jan.     1    280         280.0
Feb.     2    281         280.5
Mar.     3    250         265.3
Apr.     4    246         255.6
May      5    239         247.3
June     6    218         232.7
July     7    218         225.3
Aug.     8    210         217.7
Sept.    9    205         211.3
Oct.    10    206         208.7
Nov.    11    200         204.3
Dec.    12    200         202.2

b.

The graph of the time series and the exponentially smoothed series is:

[Plot of the employment series Yt and the exponentially smoothed series (w = .5) against time period]
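A minimal sketch (not from the text) of the exponential smoothing used in Exercise 13.16, Et = wYt + (1 - w)Et-1 with E1 = Y1 and w = .5:

```python
def exponential_smooth(y, w):
    smoothed = [y[0]]
    for value in y[1:]:
        smoothed.append(w * value + (1 - w) * smoothed[-1])
    return smoothed

employment = [280, 281, 250, 246, 239, 218, 218, 210, 205, 206, 200, 200]
for t, e in enumerate(exponential_smooth(employment, w=0.5), start=1):
    print(t, e)   # 280, 280.5, 265.25, 255.625, ...; the table reports these to one decimal
```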

13.18

a.

The exponentially smoothed fish catch for Chile for the first period is equal to the fish
catch for that period. For the rest of the time periods, the exponentially smoothed fish
catch values are found by multiplying .5 times the fish catch of that time period and
adding to that (1 - .5) times the value of the exponentially smoothed fish catch figure of
the previous time period. The exponentially smoothed fish catch for Chile for the time
period 1995 is .5(7,590.5) + (1 - .5)(5,195.4) = 6,392.95. The rest of the values are
shown in the table.

Similarly, the exponentially smoothed fish catch for Brazil for the first period is equal to
the fish catch for that period. For the rest of the time periods, the exponentially smoothed
fish catch values are found by multiplying .5 times the fish catch of that time period and
adding to that (1 - .5) times the value of the exponentially smoothed fish catch figure of
the previous time period. The exponentially smoothed fish catch for Brazil for time
period 1995 is .5(800.0) + (1 - .5)(802.9) = 801.45. The rest of the values are shown in
the table.

Time Series: Descriptive Analyses, Models, and Forecasting

483

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

Year    Chile Catch    Chile Smoothed (w = .5)    Brazil Catch    Brazil Smoothed (w = .5)
1990      5,195.4            5,195.40                 802.9             802.90
1995      7,590.5            6,392.95                 800.0             801.45
1998      3,265.3            4,829.13                 706.8             754.13
1999      5,050.2            4,939.66                 703.9             729.01
2000      4,300.0            4,619.83                 766.8             747.91
2001      3,797.1            4,208.47                 806.7             777.30
2002      4,271.5            4,239.98                 822.1             799.70

b.

The plot of the two time series and the two exponentially smoothed series is:

[Plot of the Chile and Brazil fish catch series and their exponentially smoothed series (w = .5), 1990-2002]

Both the time series and the exponentially smoothed series for the fish catch in Brazil are
fairly stable over time. There is a decrease and then increase for both series in Brazil.
Both the time series and exponentially smoothed series for the fish catch in Chile show a
decrease over time. The exponentially smoothed series is more stable than the actual time
series.

13.20

a.

The exponentially smoothed expenditure for the first time period is equal to the
expenditure for that period. For the rest of the time periods, the exponentially smoothed
expenditures are found by multiplying the expenditure for the time period by w = .2 and
adding to that (1 - .2) times the exponentially smoothed value above it. The
exponentially smoothed value for the year 1991 is .2(548.9) + (1 - .2)(590.1) = 581.86.
The rest of the values appear in the table. The process is repeated with w = .8.

Year    Expenditures    w = .2 Smoothed Value    w = .8 Smoothed Value
1990        590.1             590.10                   590.10
1991        548.9             581.86                   557.14
1992        581.1             581.71                   576.31
1993        607.6             586.89                   601.34
1994        643.2             598.15                   634.83
1995        654.6             609.44                   650.65
1996        687.1             624.97                   679.81
1997        727.4             645.46                   717.88
1998        779.3             672.23                   767.02
1999        831.6             704.10                   818.68
2000        853.4             733.96                   846.46
2001        872.0             761.57                   866.89
2002        890.9             787.43                   886.10
2003        912.3             812.41                   907.06
2004        925.6             835.05                   921.89
2005        931.5             854.34                   929.58

b.

The plot of the two series is:

[Plot of expenditures and the two exponentially smoothed series (w = .2 and w = .8), 1990-2005]

The trend in personal consumption expenditure on transportation increased at a
faster rate in the 1990s than in the 2000s. In the 2000s, the consumption
expenditure is increasing but at a slower rate.

13.22

a.

The exponentially smoothed Stock Index for the first time period is equal to the Stock
Index for that time period. For the rest of the time periods, the exponentially smoothed
stock price is found by multiplying w = .3 times the stock price for that time period and
adding to that (1 - .3) times the value of the exponentially smoothed stock price for the
previous time period. The exponentially smoothed stock price for the second time
period is .3(1372.7) + (1 - .3)(1286.4) = 1312.29. The rest of the values are shown in the
table.

Year    Quarter    S&P 500    Smoothed (w = .3)    Smoothed (w = .7)
1999       1        1286.4         1286.4               1286.4
           2        1372.7         1312.3               1346.8
           3        1282.7         1303.4               1301.9
           4        1469.2         1353.1               1419.0
2000       1        1498.6         1396.8               1474.7
           2        1454.6         1414.1               1460.6
           3        1436.5         1420.8               1443.7
           4        1320.3         1390.7               1357.3
2001       1        1160.3         1321.6               1219.4
           2        1224.4         1292.4               1222.9
           3        1040.9         1217.0               1095.5
           4        1148.1         1196.3               1132.3
2002       1        1147.4         1181.6               1142.9
           2         989.8         1124.1               1035.7
           3         815.3         1031.4                881.4
           4         879.8          986.0                880.3
2003       1         848.2          944.6                857.8
           2         974.5          953.6                939.5
           3         996.0          966.3                979.0
           4        1111.9         1010.0               1072.0
2004       1        1126.2         1044.9               1110.0
           2        1140.8         1073.6               1131.5
           3        1114.6         1085.9               1119.7
           4        1211.9         1123.7               1184.2
2005       1        1180.6         1140.8               1181.7
           2        1191.3         1155.9               1188.4
           3        1228.8         1177.8               1216.7
           4        1248.3         1198.9               1238.8
2006       1        1294.9         1227.7               1278.1
           2        1270.2         1240.5               1272.6
           3        1335.8         1269.1               1316.8


The plot of the original series and the exponentially smoothed series with w = .3 is:

[Plot omitted: quarterly S&P 500 index, Q1 1999 - Q3 2006, with the exponentially smoothed series (w = .3).]

b.

The same procedure is followed for w = .7. The exponentially smoothed Stock Index for the first time period is equal to the Stock Index for that time period. For the rest of the time periods, the exponentially smoothed stock price is found by multiplying w = .7 times the stock price for that time period and adding to that (1 - .7) times the value of the exponentially smoothed stock price for the previous time period. The exponentially smoothed stock price for the second time period is .7(1372.7) + (1 - .7)(1286.4) = 1346.8. The rest of the values are shown in the table in part a.
The plot of the original series and the exponentially smoothed series with w = .7 is:

[Plot omitted: quarterly S&P 500 index, Q1 1999 - Q3 2006, with the exponentially smoothed series (w = .7).]

c.

The exponentially smoothed series with w = .3 better describes the trends in the series.
The exponentially smoothed series with w = .7 is almost exactly like the original series.


13.24

a.

The missing trend value for quarter 3 is:


T3 = v(E3 - E2) + (1 - v)T2 = .6(3.78 - 3.50) + (1 - .6)(.25) = .268 ≈ .27

b.

The missing smoothed value for quarter 4 is:


E4 = wY4 + (1 - w)(E3 + T3) = .2(4.25) + (1 - .2)(3.78 + .27) = 4.09

c.

The forecast for quarter 5 is:


FQ5 = Ft+1 = Et + Tt = 4.09 + .29 = 4.38.
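The following Python sketch (not part of the original solution) carries out the Holt-Winters updating equations used in parts a-c with the values of this exercise (w = .2, v = .6); small differences in the last digit can arise from the rounding used in the hand computation.

def holt_winters_update(y_t, e_prev, t_prev, w, v):
    """One Holt-Winters step: returns (E_t, T_t)."""
    e_t = w * y_t + (1 - w) * (e_prev + t_prev)
    t_t = v * (e_t - e_prev) + (1 - v) * t_prev
    return e_t, t_t

w, v = .2, .6
E2, T2, E3 = 3.50, .25, 3.78                        # given smoothed values
T3 = v * (E3 - E2) + (1 - v) * T2                   # part a: ≈ .27
E4, T4 = holt_winters_update(4.25, E3, T3, w, v)    # part b: E4 ≈ 4.09
forecast_q5 = E4 + T4                               # part c: ≈ 4.38
print(round(T3, 3), round(E4, 2), round(forecast_q5, 2))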

13.26

a.

To compute the exponentially smoothed values, we follow these steps:


E1 = Y1 = 345
E2 = wY2 + (1 - w)E1 = .6(456) + (1 - .6)(345) = 411.60
E3 = wY3 + (1 - w)E2 = .6(440) + (1 - .6)(411.60) = 428.64
The rest of the values are computed in a similar manner and are listed in the table:
Year    Quarter    Housing Starts    Exponentially Smoothed (w = .6)
2004    1          345               345.00
        2          456               411.60
        3          440               428.64
        4          370               393.46
2005    1          369               378.78
        2          485               442.51
        3          471               459.61
        4          392               419.04

b.

Using MINITAB, the plot is:


[Plot omitted: quarterly housing starts, Q1 2004 - Q4 2005, with the exponentially smoothed series (w = .6).]

c. To forecast using exponentially smoothed values, we use the following:


F2006,1 = Ft+1 = Et = 419.04
F2006,2 = Ft+2 = Ft+1 = 419.04
F2006,3 = Ft+3 = Ft+1 = 419.04
F2006,4 = Ft+4 = Ft+1 = 419.04


13.28

a.

Using the information from Exercise 13.21, the forecast using the exponentially
smoothed values with w = .9 is:
F2006 = Ft+2 = Ft+1 = Et = 1815.3

b.

We first compute the Holt-Winters values for years 1974-2004.


With w = .3 and v = .8,
E2 = Y2 = 1171
T2 = Y2 - Y1 = 1171 - 926 = 245
E3 = wY3 + (1 - w)(E2 + T2) = .3(1663) + (1 - .3)(1171 + 245) = 1490.1
T3 = v(E3 - E2) + (1 - v)T2 = .8(1490.1 - 1171) + (1 - .8)(245) = 304.28
The rest of the Ets and Tts appear in the table:

Year
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004

t
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31

Imports
926
1,171
1,663
2,058
1,892
1,866
1,414
1,067
633
540
553
479
771
876
987
1,232
1,282
1,233
1,247
1,339
1,307
1,303
1,258
1,378
1,522
1,543
1,664
1,770
1,490
1,671
1,833

Et
w = .3
v = .8

Tt
w = .3
v = .8

1171.00
1490.10
1873.47
2136.31
2253.87
2107.47
1734.46
1182.96
637.02
235.48
8.40
50.00
283.65
622.67
1020.93
1365.36
1571.76
1639.14
1619.79
1529.25
1411.34
1289.30
1232.36
1270.65
1364.08
1508.72
1679.04
1736.09
1771.26
1820.42

245.00
304.28
367.55
283.79
150.80
86.96
315.80
504.36
537.62
428.76
267.41
20.21
182.88
307.79
380.16
351.58
235.43
100.99
4.72
71.48
108.63
119.36
69.42
16.75
78.09
131.33
162.52
78.14
43.77
48.08


To forecast using the Holt-Winters Model:


For w = .3 and v = .8,
F2006 = Ft+2 = Et + 2Tt = 1,820.42 + 2(48.08) = 1,916.58
c.

The error forecast for the exponentially smoothed series is


Yt+2 - Ft+2 = 2,100 - 1,815.3 = 284.7
The error forecast for the Holt-Winters series is
Yt+2 - Ft+2 = 2,100 - 1,916.58 = 183.42
The error for the Holt-Winters forecast is smaller than the error for the exponentially
smoothed forecast.

13.30

a.

We first compute the Holt-Winters values for the years 2003-2005.


With w = .3 and v = .5,
E2 = Y2 = 974.5
E3 = wY3 + (1 - w)(E2 + T2) = .3(996.0) + (1 - .3)(974.5 + 126.3) = 1,069.36
T2 = Y2 - Y1 = 974.5 - 848.2 = 126.3
T3 = v(E3 - E2) + (1 - v)T2 = .5(1,069.36 - 974.5) + (1 - .5)(126.3) = 110.58
The rest of the Ets and Tts appear in the table that follows.

Year
2003

2004

2005

2006


Quarter
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3

S&P
500
848.2
974.5
996.0
1111.9
1126.2
1140.8
1114.6
1211.9
1180.6
1191.3
1228.8
1248.3
1294.9
1270.2
1335.8

Et
w = .3
v = .5

Tt
w = .3
v = .5

Et
w = .7
v = .5

Tt
w = .7
v = .5

974.5
1069.36
1159.53
1219.79
1252.32
1250.50
1258.03
1246.99
1232.52
1227.45
1229.96

126.30
110.58
100.37
80.32
56.42
27.30
17.42
3.19
-5.64
-5.35
-1.42

974.5
1027.44
1113.45
1148.72
1161.64
1139.88
1192.62
1193.28
1196.53
1221.92
1245.60

126.30
89.62
87.81
61.54
37.23
7.74
30.24
15.45
9.35
17.37
20.52


To forecast using the Holt-Winters Model with w = .3 and v = .5:


F2006,1 = Ft+1 = Et + Tt = 1,229.96 + (-1.42) = 1,228.54
F2006,2 = Ft+2 = Et + 2Tt = 1,229.96 + 2(-1.42) = 1,227.12
F2006,3 = Ft+3 = Et + 3Tt = 1,229.96 + 3(-1.42) = 1,225.70

With w = .7 and v = .5,
E2 = Y2 = 974.5
E3 = wY3 + (1 - w)(E2 + T2) = .7(996.0) + (1 - .7)(974.5 + 126.3) = 1,027.44
T2 = Y2 - Y1 = 974.5 - 848.2 = 126.3
T3 = v(E3 - E2) + (1 - v)T2 = .5(1,027.44 - 974.5) + (1 - .5)(126.3) = 89.62
The rest of the Ets and Tts appear in the table above.

To forecast using the Holt-Winters Model with w = .7 and v = .5:
F2006,1 = Ft+1 = Et + Tt = 1,245.60 + 20.52 = 1,266.12
F2006,2 = Ft+2 = Et + 2Tt = 1,245.60 + 2(20.52) = 1,286.64
F2006,3 = Ft+3 = Et + 3Tt = 1,245.60 + 3(20.52) = 1,307.16
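The following Python sketch (not part of the original solution) runs the full Holt-Winters recursion on the 2003-2005 quarterly S&P 500 values from the table and produces the k-step-ahead forecasts Ft+k = Et + kTt; results agree with the hand computations above up to rounding. The function name holt_winters is ours.

def holt_winters(y, w, v):
    """Return lists (E, T); the first entries correspond to t = 2."""
    e = [y[1]]                    # E_2 = Y_2
    t = [y[1] - y[0]]             # T_2 = Y_2 - Y_1
    for yt in y[2:]:
        e_new = w * yt + (1 - w) * (e[-1] + t[-1])
        t_new = v * (e_new - e[-1]) + (1 - v) * t[-1]
        e.append(e_new)
        t.append(t_new)
    return e, t

sp500 = [848.2, 974.5, 996.0, 1111.9, 1126.2, 1140.8, 1114.6, 1211.9,
         1180.6, 1191.3, 1228.8, 1248.3]          # 2003 Q1 - 2005 Q4

for w, v in ((.3, .5), (.7, .5)):
    e, t = holt_winters(sp500, w, v)
    forecasts = [e[-1] + k * t[-1] for k in (1, 2, 3)]   # 2006 Q1 - Q3
    print(w, v, [round(f, 2) for f in forecasts])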
13.32

a.

From Exercise 13.25a, the forecasts for 2003-2005 using w = .3 are:


F2003 = 199.48
F2004 = 199.48
F2005 = 199.48
The errors are the differences between the actual values and the predicted values.
Thus, the errors are:
Y2003 - F2003 = 195 - 199.48 = -4.48
Y2004 - F2004 = 197 - 199.48 = -2.48
Y2005 - F2005 = 195 - 199.48 = -4.48

b.

From Exercise 13.25a, the forecasts for 2003-2005 using w = .7 are:


F2003 = 199.74
F2004 = 199.74
F2005 = 199.74
The errors are:
Y2003 - F2003 = 195 - 199.74 = -4.74
Y2004 - F2004 = 197 - 199.74 = -2.74
Y2005 - F2005 = 195 - 199.74 = -4.74


c.

For the exponentially smoothed forecasts with w = .3,

MAD = Σ|Yt - Ft| / m = (|195 - 199.48| + |197 - 199.48| + |195 - 199.48|) / 3 = 11.44 / 3 = 3.81

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|195 - 199.48|/195 + |197 - 199.48|/197 + |195 - 199.48|/195) / 3] × 100 = (.0585 / 3) × 100 = 1.9512

RMSE = √[Σ(Yt - Ft)² / m] = √[((195 - 199.48)² + (197 - 199.48)² + (195 - 199.48)²) / 3] = √(46.2912 / 3) = 3.928
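The following Python sketch (not part of the original solution) computes the three error measures used in parts c and d; the function names are ours.

from math import sqrt

def mad(y, f):
    return sum(abs(yt - ft) for yt, ft in zip(y, f)) / len(y)

def mape(y, f):
    return 100 * sum(abs(yt - ft) / yt for yt, ft in zip(y, f)) / len(y)

def rmse(y, f):
    return sqrt(sum((yt - ft) ** 2 for yt, ft in zip(y, f)) / len(y))

actual   = [195, 197, 195]
forecast = [199.48] * 3
print(round(mad(actual, forecast), 2),
      round(mape(actual, forecast), 2),
      round(rmse(actual, forecast), 3))   # ≈ 3.81, 1.95, 3.928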

d.

For the exponentially smoothed forecasts with w = .7,

MAD = Σ|Yt - Ft| / m = (|195 - 199.74| + |197 - 199.74| + |195 - 199.74|) / 3 = 12.22 / 3 = 4.07

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|195 - 199.74|/195 + |197 - 199.74|/197 + |195 - 199.74|/195) / 3] × 100 = (.0625 / 3) × 100 = 2.0841

RMSE = √[Σ(Yt - Ft)² / m] = √[((195 - 199.74)² + (197 - 199.74)² + (195 - 199.74)²) / 3] = √(52.4428 / 3) = 4.181


13.34

a.

From Exercise 13.29a, the forecasts for the 3 quarters of 2006 using w = .7 are:
F2006,1 = 1,238.8
F2006,2 = 1,238.8
F2006,3 = 1,238.8
For the exponentially smoothed forecasts with w = .7:

MAD = Σ|Yt - Ft| / m = (|1294.9 - 1238.8| + |1270.2 - 1238.8| + |1335.8 - 1238.8|) / 3 = 184.5 / 3 = 61.5

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|1294.9 - 1238.8|/1294.9 + |1270.2 - 1238.8|/1270.2 + |1335.8 - 1238.8|/1335.8) / 3] × 100 = (.1407 / 3) × 100 = 4.689

RMSE = √[Σ(Yt - Ft)² / m] = √[((1294.9 - 1238.8)² + (1270.2 - 1238.8)² + (1335.8 - 1238.8)²) / 3] = √(13,542.17 / 3) = 67.187

b.

From Exercise 13.29b, the forecasts for the 3 quarters of 2006 using w = .3 are:
F2006,1 = 1,198.9
F2006,2 = 1,198.9
F2006,3 = 1,198.9
For the exponentially smoothed forecasts with w = .3:

MAD = Σ|Yt - Ft| / m = (|1294.9 - 1198.9| + |1270.2 - 1198.9| + |1335.8 - 1198.9|) / 3 = 304.2 / 3 = 101.4

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|1294.9 - 1198.9|/1294.9 + |1270.2 - 1198.9|/1270.2 + |1335.8 - 1198.9|/1335.8) / 3] × 100 = (.2328 / 3) × 100 = 7.759

RMSE = √[Σ(Yt - Ft)² / m] = √[((1294.9 - 1198.9)² + (1270.2 - 1198.9)² + (1335.8 - 1198.9)²) / 3] = √(33,041.3 / 3) = 104.946

c.

For all three measures of error, the values for the exponentially smoothed series with w = .7 are smaller than those for the exponentially smoothed series with w = .3. Thus, the more accurate forecasts are those from the exponentially smoothed series with w = .7.

13.36

a.

From Exercise 13.31, the actual data and the forecasts using the exponential
smoothing and the Holt-Winters forecasts are:

Year    Month    Gold     Exponential    Holt-Winters
                 Price    Forecast       Forecast
                          w = .5         w = .5, v = .5
2005    Jan      424.2    433.47         454.09
        Feb      423.4    433.47         466.55
        Mar      434.2    433.47         479.01
        Apr      428.9    433.47         491.47
        May      421.9    433.47         503.93
        Jun      430.7    433.47         516.39
        Jul      424.5    433.47         528.85
        Aug      437.9    433.47         541.31
        Sep      456.0    433.47         553.77
        Oct      469.9    433.47         566.23
        Nov      476.7    433.47         578.69
        Dec      509.8    433.47         591.15

For the exponential smoothing forecasts with w = .5:


MAD = Σ|Yt - Ft| / m = (|424.2 - 433.47| + |423.4 - 433.47| + … + |509.8 - 433.47|) / 12 = 230.9 / 12 = 19.242

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|424.2 - 433.47|/424.2 + |423.4 - 433.47|/423.4 + … + |509.8 - 433.47|/509.8) / 12] × 100 = (.4904 / 12) × 100 = 4.087

RMSE = √[Σ(Yt - Ft)² / m] = √[((424.2 - 433.47)² + (423.4 - 433.47)² + … + (509.8 - 433.47)²) / 12] = √(9,980.2268 / 12) = 28.839

For the Holt-Winters forecasts with w = .5 and v = .5:


MAD = Σ|Yt - Ft| / m = (|424.2 - 454.09| + |423.4 - 466.55| + … + |509.8 - 591.15|) / 12 = 933.34 / 12 = 77.778

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|424.2 - 454.09|/424.2 + |423.4 - 466.55|/423.4 + … + |509.8 - 591.15|/509.8) / 12] × 100 = (2.0897 / 12) × 100 = 17.415

RMSE = √[Σ(Yt - Ft)² / m] = √[((424.2 - 454.09)² + (423.4 - 466.55)² + … + (509.8 - 591.15)²) / 12] = √(80,190.7476 / 12) = 81.747

For all three measures of forecast errors, the exponential smoothing forecasts had
smaller errors. Thus, the exponential smoothing forecasts are better.


b.

From Exercise 13.31, the actual data and the forecasts using the exponential
smoothing one-step-ahead and the Holt-Winters one-step-ahead forecasts are:

Year    Month    Gold     Exponential    Holt-Winters
                 Price    Forecast       Forecast
                          w = .5         w = .5, v = .5
2005    Jan      424.2    433.47         454.09
        Feb      423.4    428.83         444.12
        Mar      434.2    426.12         433.57
        Apr      428.9    430.16         433.84
        May      421.9    429.53         430.10
        Jun      430.7    425.71         422.67
        Jul      424.5    428.21         425.37
        Aug      437.9    426.35         423.40
        Sep      456.0    432.13         432.74
        Oct      469.9    444.06         452.27
        Nov      476.7    456.98         473.40
        Dec      509.8    466.84         488.19

For the exponential smoothing one-step-ahead forecasts with w = .5:


MAD = Σ|Yt - Ft| / m = (|424.2 - 433.47| + |423.4 - 428.83| + … + |509.8 - 466.84|) / 12 = 164.32 / 12 = 13.693

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|424.2 - 433.47|/424.2 + |423.4 - 428.83|/423.4 + … + |509.8 - 466.84|/509.8) / 12] × 100 = (.3540 / 12) × 100 = 2.950

RMSE = √[Σ(Yt - Ft)² / m] = √[((424.2 - 433.47)² + (423.4 - 428.83)² + … + (509.8 - 466.84)²) / 12] = √(3,884.9754 / 12) = 17.993


For the Holt-Winters one-step-ahead forecasts with w = .5 and v = .5:


MAD = Σ|Yt - Ft| / m = (|424.2 - 454.09| + |423.4 - 444.12| + … + |509.8 - 488.19|) / 12 = 153.58 / 12 = 12.798

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|424.2 - 454.09|/424.2 + |423.4 - 444.12|/423.4 + … + |509.8 - 488.19|/509.8) / 12] × 100 = (.3434 / 12) × 100 = 2.862

RMSE = √[Σ(Yt - Ft)² / m] = √[((424.2 - 454.09)² + (423.4 - 444.12)² + … + (509.8 - 488.19)²) / 12] = √(3,019.9854 / 12) = 15.864

For all three measures of forecast errors, the Holt-Winters forecasts have smaller errors.
Thus, the Holt-Winters forecasts are better.
13.38

a.

Using MINITAB, the output is:


Regression Analysis: Price versus t
The regression equation is
Price = 24.7 + 0.0910 t
Predictor
Constant
t

Coef
24.6975
0.09103

S = 1.497

SE Coef
0.7851
0.08119

R-Sq = 8.2%

T
31.46
1.12

P
0.000
0.281

R-Sq(adj) = 1.7%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
14
15

SS
2.817
31.379
34.197

MS
2.817
2.241

F
1.26

P
0.281

Predicted Values for New Observations


New Obs
1

Fit
26.245

SE Fit
0.785

95.0% CI
24.561, 27.929)

95.0% PI
22.619, 29.871)

Values of Predictors for New Observations


New Obs
1

t
17.0


Predicted Values for New Observations


New Obs
2

Fit
26.336

SE Fit
0.857

95.0% CI
24.497, 28.175)

95.0% PI
22.636, 30.036)

Values of Predictors for New Observations
New Obs      t
      2   18.0

b.

The estimates of the parameters in the model, E(Yt) = β0 + β1t, are:

β̂0 = 24.6975   The price is estimated to be 24.6975 cents/pound for t = 0, or for 1991.

β̂1 = .09103    The price is estimated to increase by .091 cents/pound for each additional year.

c.

The forecast for 2007 is:

Using t = 17, Ŷ2007 = 24.6975 + .09103(17) = 26.2450

The forecast for 2008 is:

Using t = 18, Ŷ2008 = 24.6975 + .09103(18) = 26.3360

Yes, these agree with the predicted values on the printout.
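The following Python sketch (not part of the original solution) simply evaluates the fitted trend line at t = 17 and t = 18; the coefficients are taken from the MINITAB printout, and the underlying annual price data are not reproduced here.

b0, b1 = 24.6975, 0.09103        # least squares estimates from the printout

def trend_forecast(t):
    return b0 + b1 * t

print(round(trend_forecast(17), 4))   # 2007 forecast, ≈ 26.2450
print(round(trend_forecast(18), 4))   # 2008 forecast, ≈ 26.3360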

d.

From the printout, the 95% forecast intervals are:


2007 (22.619, 29.871)
2008 (22.636, 30.036)
We are 95% confident that the actual price in 2007 will be between 22.619 and 29.871.
We are 95% confident that the actual price in 2008 will be between 22.636 and 30.036.

e.

No, we would not recommend that this model be used to forecast annual price. If we were to test if there is a significant linear relationship between time and annual price (H0: β1 = 0 vs Ha: β1 ≠ 0), the test statistic would be t = 1.12 and the p-value would be p = .281. Thus, we would conclude there is insufficient evidence to indicate a linear relationship exists between time and annual price. (Do not reject H0.)

13.40

The major advantage of regression forecasts over the exponentially smoothed forecasts is that prediction intervals can be formed using the regression forecasts and not using the exponentially smoothed forecasts.


13.42

a.

Using MINITAB, the results are:


Regression Analysis: Price versus Time
The regression equation is
Price = 4.76 + 0.309 Time
Predictor
Constant
Time

Coef
4.7608
0.30857

S = 0.769971

SE Coef
0.4184
0.04601

R-Sq = 77.6%

T
11.38
6.71

P
0.000
0.000

R-Sq(adj) = 75.8%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
13
14

SS
26.661
7.707
34.368

MS
26.661
0.593

F
44.97

P
0.000

Unusual Observations
Obs
15

Time
15.0

Price
10.740

Fit
9.389

SE Fit
0.379

Residual
1.351

St Resid
2.01R

R denotes an observation with a large standardized residual.


Predicted Values for New Observations
New
Obs
1

Fit
9.698

SE Fit
0.418

95% CI
(8.794, 10.602)

95% PI
(7.805, 11.591)

Values of Predictors for New Observations


New
Obs
1

Time
16.0

Predicted Values for New Observations


New
Obs
1

Fit
10.006

SE Fit
0.459

95% CI
(9.014, 10.999)

95% PI
(8.069, 11.943)

Values of Predictors for New Observations


New
Obs
1

Time
17.0

From the printout:

β̂0 = 4.7608.  The price of gas is estimated to be 4.7608 dollars per 1,000 cubic feet in 1989.

β̂1 = .30857.  For each additional year, the price of gas is estimated to increase by .30857 dollars per 1,000 cubic feet.


b.

To determine the model fit, we test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 6.71 (from the printout). The p-value is p = 0.000. Since the p-value is so small, H0 is rejected for any reasonable value of α. There is sufficient evidence that the model has an adequate fit.

c.

The 95% prediction interval for 2005 is (7.805, 11.591). We are 95% confident that the
actual annual price of natural gas in 2005 is between 7.805 and 11.591 dollars per 1,000
cubic feet.
The 95% prediction interval for 2006 is (8.069, 11.943). We are 95% confident that
the actual annual price of natural gas in 2006 is between 8.069 and 11.943 dollars per
1,000 cubic feet.

d.

There are basically two problems with using simple linear regression for predicting time series data. First, we must predict values of the time series for values of time outside the observed range. We observe data for time periods 1, 2, …, t and use the regression model to predict values of the time series for t + 1, t + 2, …. The second problem is that simple linear regression does not allow for any cyclical effects such as seasonal trends.

13.44

a.

The regression model is: E(Yt) = β0 + β1t + β2Q1 + β3Q2 + β4Q3
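The following Python sketch (not part of the original solution) shows one way to build the dummy-variable design matrix for this model and fit it by least squares. The 20 quarterly sales values of the exercise are not listed in this manual, so the data line is left as a commented placeholder; we also assume the series starts in quarter 1.

import numpy as np

def quarterly_design(n_quarters):
    """Columns: intercept, t, Q1, Q2, Q3 (quarter 4 is the baseline)."""
    t = np.arange(1, n_quarters + 1)
    q = (t - 1) % 4 + 1                      # quarter number 1-4
    return np.column_stack([np.ones(n_quarters), t,
                            (q == 1).astype(float),
                            (q == 2).astype(float),
                            (q == 3).astype(float)])

# Example usage (with the actual data this reproduces the printout below):
# sales = np.array([...])                    # the 20 quarterly sales index values
# X = quarterly_design(len(sales))
# beta, *_ = np.linalg.lstsq(X, sales, rcond=None)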

b.

Using MINITAB, the output is:


Regression Analysis: Sales versus t, Q1, Q2, Q3
The regression equation is
Sales = 120 + 16.5 t + 262 Q1 + 223 Q2 + 106 Q3
Predictor
Constant
t
Q1
Q2
Q3

Coef
119.85
16.512
262.34
222.83
105.51

S = 26.00

SE Coef
16.95
1.028
16.73
16.57
16.48

R-Sq = 96.9%

T
7.07
16.07
15.68
13.45
6.40

P
0.000
0.000
0.000
0.000
0.000

R-Sq(adj) = 96.1%

Analysis of Variance
Source
Regression
Residual Error
Total
Source
t
Q1
Q2
Q3


DF
1
1
1
1

DF
4
15
19

SS
318560
10139
328700

MS
79640
676

F
117.82

P
0.000

Seq SS
114343
81883
94610
27724


Predicted Values for New Observations


New Obs
1

Fit
728.95

SE Fit
16.95

95.0% CI
692.82, 765.08)

95.0% PI
662.80, 795.10)

95.0% PI
639.80, 772.10)

95.0% PI
539.00, 671.30)

95.0% PI
450.00, 582.30)

Values of Predictors for New Observations


New Obs
1

t
21.0

Q1
1.00

Q2
0.000000

Q3
0.000000

Predicted Values for New Observations


New Obs
1

Fit
705.95

SE Fit
16.95

95.0% CI
669.82, 742.08)

Values of Predictors for New Observations


New Obs
1

t
22.0

Q1
0.000000

Q2
1.00

Q3
0.000000

Predicted Values for New Observations


New Obs
1

Fit
605.15

SE Fit
16.95

95.0% CI
569.02, 641.28)

Values of Predictors for New Observations


New Obs
1

t
23.0

Q1
0.000000

Q2
0.000000

Q3
1.00

Predicted Values for New Observations


New Obs
1

Fit
516.15

SE Fit
16.95

95.0% CI
480.02, 552.28)

Values of Predictors for New Observations


New Obs
1

t
24.0

Q1
0.000000

Q2
0.000000

Q3
0.000000

The least squares equation is:


Ŷt = 119.85 + 16.512t + 262.34Q1 + 222.83Q2 + 105.51Q3

β̂1 = 16.512   For every increase in time period (1 quarter), the mean sales index increases by an estimated 16.512.
β̂2 = 262.34   The difference in mean sales index between the first and fourth quarters is estimated to be 262.34.
β̂3 = 222.83   The difference in the mean sales index between the second and fourth quarters is estimated to be 222.83.
β̂4 = 105.51   The difference in the mean sales index between the third and fourth quarters is estimated to be 105.51.

To determine if the model is useful, we test:


H0: β1 = β2 = β3 = β4 = 0
Ha: At least one βi ≠ 0, i = 1, 2, 3, 4

The test statistic is F = 117.82


Since no α is given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F-distribution with numerator df = k = 4 and denominator df = n - (k + 1) = 20 - (4 + 1) = 15. From Table IX, Appendix B, F.05 = 3.06. The rejection region is F > 3.06.

Since the observed value of the test statistic falls in the rejection region (F = 117.82 > 3.06), H0 is rejected. There is sufficient evidence to indicate the model is useful at α = .05.
c.

The assumption of independent error terms is in doubt.

d.

The forecasts and the 95% prediction intervals are found at the bottom of the printout and
are:

Year    Quarter    Forecast    95% Lower Limit    95% Upper Limit
2007    I          728.95      662.8              795.1
        II         705.95      639.8              772.1
        III        605.15      539.0              671.3
        IV         516.15      450.0              582.3

13.46

a.

d = 3.9 indicates the residuals are very strongly negatively autocorrelated.

b.

d = .2 indicates the residuals are very strongly positively autocorrelated.

c.

d = 1.99 indicates the residuals are probably uncorrelated.
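The following Python sketch (not part of the original solution) computes the Durbin-Watson statistic d from a list of residuals; the illustrative residuals in the example are made up only to show that d is near 0 for positively autocorrelated residuals and near 4 for negatively autocorrelated residuals.

def durbin_watson(residuals):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

print(round(durbin_watson([1.0, 0.9, 0.8, 0.7, 0.6]), 3))     # near 0
print(round(durbin_watson([1.0, -1.0, 1.0, -1.0, 1.0]), 3))   # near 4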

13.48

a.

To determine if the overall model contributes information for the prediction of monthly
passenger car and light truck sales, we test:
H0: β1 = β2 = β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0
The test statistic is F = (R²/k) / [(1 - R²)/(n - (k + 1))] = (.856/5) / [(1 - .856)/(144 - (5 + 1))] = 164.067

The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k = 5 and ν2 = n - (k + 1) = 144 - (5 + 1) = 138. From Table IX, Appendix B, F.05 ≈ 2.29. The rejection region is F > 2.29.

Since the observed value of the test statistic falls in the rejection region (F = 164.067 > 2.29), H0 is rejected. There is sufficient evidence to indicate the overall model contributes information for the prediction of monthly passenger car and light truck sales at α = .05.
b.

To determine if positive autocorrelation is present, we test:


H0: No first-order autocorrelation
Ha: Positive first-order autocorrelation of residuals


The test statistic is d = 1.01.

For α = .05, the rejection region is d < dL,α = dL,.05 ≈ 1.57. The value dL,.05 is found in Table XIII, Appendix B, with k = 5, n = 144, and α = .05.

Since the observed value of the test statistic falls in the rejection region (d = 1.01 < 1.57), H0 is rejected. There is sufficient evidence to indicate the time series residuals are positively autocorrelated at α = .05.

13.50

c.

One of the requirements for the validity of the test in part b is that the error terms are
independent. Since H0 was rejected in part a, there is evidence that positive
autocorrelation exists. Since the error terms are not independent, the test in part b
may not be valid.

a.

There is a tendency for the residuals to have long positive runs and negative runs.
Residuals 1 through 6 are positive, while residuals 7 through 25 are negative. Residuals
26 through 35 are positive. This indicates the error terms are correlated.

b.

From the printout, the Durbin-Watson d is d = .0627.


To determine if the time series residuals are autocorrelated, we test:
H0: No first-order autocorrelation of residuals
Ha: Positive or negative first-order autocorrelation of residuals
The test statistic is d = .0627.

For α = .10, the rejection region is d < dL,α/2 = dL,.05 = 1.40 or (4 - d) < dL,.05 = 1.40. The value dL,.05 is found in Table XIII, Appendix B, with k = 1, n = 35, and α = .10.

Since the observed value of the test statistic falls in the rejection region (d = .0627 < 1.40), H0 is rejected. There is sufficient evidence to indicate the time series residuals are autocorrelated at α = .10.

c.

We must assume the residuals are normally distributed.


13.52

a.

Using MINITAB, the plot of the residuals against t is:


[Scatterplot of residuals (RESI1) versus Time omitted.]

There is not a random scattering of the residuals. The first 5 residuals are positive, the
next 6 are negative, the next one is positive, the next one is negative and the last 2 are
positive. This does not appear to be a random scattering. The plot suggests the
possibility of autocorrelation.
b.

Using MINITAB, the output is:


Regression Analysis: Price versus Time
The regression equation is
Price = 4.76 + 0.309 Time
Predictor
Constant
Time

Coef
4.7608
0.30857

S = 0.769971

SE Coef
0.4184
0.04601

R-Sq = 77.6%

T
11.38
6.71

P
0.000
0.000

R-Sq(adj) = 75.8%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
13
14

SS
26.661
7.707
34.368

MS
26.661
0.593

F
44.97

P
0.000

Unusual Observations
Obs
15

Time
15.0

Price
10.740

Fit
9.389

SE Fit
0.379

Residual
1.351

St Resid
2.01R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 1.39909


To determine if positive autocorrelation is present, we test:


H0: No first-order autocorrelation
Ha: Positive first-order autocorrelation of residuals
The test statistic is d = 1.399.

For α = .05, the rejection region is d < dL,α = dL,.05 = 1.08. The value dL,.05 is found in Table XIII, Appendix B, with k = 1, n = 15, and α = .05.

Since the observed value of the test statistic does not fall in the rejection region (d = 1.399 is not less than 1.08), H0 is not rejected. From Table XII, Appendix B, dU,α = 1.36 with k = 1, n = 15 and α = .05. Since the observed value of the test statistic falls above the upper limit (d = 1.399 > 1.36), there is insufficient evidence to indicate the time series residuals are positively autocorrelated at α = .05.

13.54

c.

Since the error terms do not appear to be dependent, the validity of the test for the model
adequacy appears to be fine.

a.

Using MINITAB, the plot of the residuals against t is:


[Scatterplot of residuals (RESI1) versus t omitted.]

Since there appear to be groups of consecutive positive and groups of consecutive


negative residuals, the data appear to be autocorrelated.


b.

Using MINITAB, the output is:


Regression Analysis: Policies versus t
The regression equation is
Policies = 385 - 0.363 t
Predictor
Constant
t

Coef
385.326
-0.3632

S = 15.0555

SE Coef
5.280
0.2632

R-Sq = 5.6%

T
72.98
-1.38

P
0.000
0.177

R-Sq(adj) = 2.7%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
32
33

SS
431.6
7253.3
7685.0

MS
431.6
226.7

F
1.90

P
0.177

Unusual Observations
Obs
1

t
1.0

Policies
355.00

Fit
384.96

SE Fit
5.05

Residual
-29.96

St Resid
-2.11R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 0.424942

To determine if positive autocorrelation is present, we test:


H0: No first-order autocorrelation
Ha: Positive first-order autocorrelation of residuals
The test statistic is d = 0.42.

For α = .05, the rejection region is d < dL,α = dL,.05 = 1.39. The value dL,.05 is found in Table XIII, Appendix B, with k = 1, n = 34, and α = .05.

Since the observed value of the test statistic falls in the rejection region (d = .42 < 1.39), H0 is rejected. There is sufficient evidence to indicate the time series residuals are positively autocorrelated at α = .05.
c.


Since the error terms do not appear to be independent, the validity of the test for model
adequacy is in question.


13.56

a.

The exponentially smoothed price for the first time period is equal to the price for that period. For the rest of the time periods, the exponentially smoothed prices are found by multiplying the price for that time period by w = .5 and adding to that (1 - .5) times the exponentially smoothed price for the preceding time period. The exponentially smoothed values for each of the price series appear in the table:

Year    Cold Finished    Exp. Smoothed    Hot Rolled    Exp. Smoothed    Galvanized    Exp. Smoothed
        Price            Value (w = .5)   Price         Value (w = .5)   Price         Value (w = .5)
1995    25.70            25.70            25.32         25.32            34.47         34.47
2000    23.08            24.39            15.67         20.50            21.38         27.93
2001    22.76            23.58            11.71         16.10            16.41         22.17
2002    23.26            23.42            16.46         16.28            22.00         22.08
2003    25.15            24.28            14.80         15.54            20.08         21.08
2004    38.67            31.48            30.84         23.19            36.69         28.89

b.

The plot of the three price series and the exponentially smoothed series are:
[Plots omitted: Cold Finished, Hot Rolled, and Galvanized price series, 1995-2004, each with its exponentially smoothed series (w = .5).]

c.

The exponential smoothing forecasts for 2005 are:


Cold Finished: F2005 = E2004 = 31.48
Hot Rolled:
F2005 = E2004 = 23.19
Galvanized:
F2005 = E2004 = 28.89
One of the main drawbacks of this kind of forecast is the inability to forecast
future values using prediction intervals.


13.58

a.

To compute the Laspeyres index, multiply the price for each year by the quantity for each
of the items for 1990, sum the products for the four items, divide by 14.05 (the sum for
the base period 1990), and multiply by 100. The Laspeyres index is:

Year    Spaghetti    Ground Beef    Eggs    Potatoes    Total    Laspeyres Index
1990    0.85         1.63           1.00    0.32        14.05    100.00
1995    0.88         1.40           1.16    0.38        13.72    97.65
2000    0.88         1.63           0.96    0.35        14.37    102.28
2004    0.95         2.14           0.98    0.51        18.68    132.95
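The following Python sketch (not part of the original solution) shows the form of the Laspeyres index computed in part a. The base-period quantities used to weight the prices come from the exercise's data set and are not reproduced here, so q0 is only a placeholder argument.

def laspeyres(prices_t, prices_0, q0):
    """100 * sum(p_t * q_0) / sum(p_0 * q_0)."""
    num = sum(p * q for p, q in zip(prices_t, q0))
    den = sum(p * q for p, q in zip(prices_0, q0))
    return 100 * num / den

# Example usage with the 1990 and 2004 prices from the table and the base-period
# quantities q0 (not listed in this manual):
p1990 = [0.85, 1.63, 1.00, 0.32]
p2004 = [0.95, 2.14, 0.98, 0.51]
# q0 = [...]
# print(round(laspeyres(p2004, p1990, q0), 2))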

b.

From 1990 to 2004, the basket of foods increased by 132.95 - 100 = 32.95%.

13.60

a.

We first calculate the exponentially smoothed values for 1980-2006.


E1 = Y1 = 56.50
E2 = .8Y2 + (1 - .8)E1 = .8(27.0) + .2(56.50) = 32.90
E3 = .8Y3 + (1 - .8)E2 = .8(38.75) + .2(32.90) = 37.58
The rest of the values appear in the table.
Year    Closing Price    Exponentially Smoothed Value (w = .8)
1980    56.50            56.50
1981    27.00            32.90
1982    38.75            37.58
1983    45.25            43.72
1984    41.75            42.14
1985    68.37            63.12
1986    45.62            49.12
1987    48.02            48.24
1988    48.01            48.06
1989    64.03            60.84
1990    45.00            48.17
1991    68.07            64.09
1992    30.03            36.84
1993    29.05            30.61
1994    32.05            31.76
1995    41.05            39.19
1996    50.75            48.44
1997    65.50            62.09
1998    49.00            51.62
1999    36.31            39.37
2000    48.44            46.63
2001    55.75            53.93
2002    40.00            42.79
2003    46.60            45.84
2004    46.65            46.49
2005    39.43            40.84
2006    43.80            43.21


The forecasts for 2007 and 2008 are:


F2007 = Ft+1 = Et = 43.21
F2008 = Ft+2 = Et = 43.21
The expected gain is F2008 - Y2006 = 43.21 - 43.80 = -.59. Since this number is negative, it is actually a loss.
b.

We first calculate the Holt-Winters values for 1980-2006.


For w = .8 and v = .5,
E2 = Y2 = 27.00
T2 = Y2 - Y1 = 27.00 - 56.50 = -29.50
E3 = .8Y3 + (1 - .8)(E2 + T2) = .8(38.75) + .2(27.00 + (-29.50)) = 30.50
T3 = .5(E3 - E2) + (1 - .5)T2 = .5(30.50 - 27.00) + .5(-29.50) = -13.00
The rest of the values appear in the table.
Year

1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006


Closing Price

56.50
27.00
38.75
45.25
41.75
68.37
45.62
48.02
48.01
64.03
45.00
68.07
30.03
29.05
32.05
41.05
50.75
65.50
49.00
36.31
48.44
55.75
40.00
46.60
46.65
39.43
43.80

Holt-Winters
w = .8
v = .5
Et
Tt

27.00 29.5
30.50 13.00
39.70 1.90
40.96 0.32
62.82 10.77
51.22 0.42
48.58 1.53
47.82 1.14
60.56
5.80
49.27 2.74
63.76
5.87
37.95 9.97
28.84 9.54
29.50 4.44
37.85
1.96
48.56
6.33
63.38 10.58
53.99
0.59
39.96 6.72
45.40 0.64
53.55
3.76
43.46 3.17
45.34 0.65
46.26
0.14
40.82 2.65
42.67 0.40


The forecasts for 2007 and 2008 are:


F2007 = Ft+1 = Et + Tt = 42.67 + (-.40) = 42.27
F2008 = Ft+2 = Et + 2Tt = 42.67 + 2(-.40) = 41.87

The expected gain is F2008 - Y2006 = 41.87 - 43.80 = -1.93. Since this number is negative, it is actually a loss.
13.62

a.

To compute the simple index for the IRA series, divide each IRA value by the 1990
value, 140, and then multiply by 100. To compute the simple index for the 401(k) series,
divide each 401(k) value by the 1990 value, 35, and then multiply by 100. The values for
the indices are in the table:

Year    IRA     IRA Simple Index    401(k)    401(k) Simple Index
1990    140     100.00              35        100.00
1994    350     250.00              184       525.71
1995    476     340.00              266       760.00
1996    598     427.14              346       988.57
1997    767     547.86              466       1331.43
1998    960     685.71              616       1760.00
1999    1234    881.43              810       2314.29
2000    1232    880.00              815       2328.57
2001    1161    829.29              794       2268.57
2002    1034    738.57              706       2017.14
2003    1307    933.57              919       2625.71
2004    1490    1064.29             1086      3102.86
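The following Python sketch (not part of the original solution) reproduces the simple index numbers in the table, using 1990 as the base period.

ira  = [140, 350, 476, 598, 767, 960, 1234, 1232, 1161, 1034, 1307, 1490]
k401 = [35, 184, 266, 346, 466, 616, 810, 815, 794, 706, 919, 1086]

def simple_index(series):
    base = series[0]
    return [round(100 * v / base, 2) for v in series]

print(simple_index(ira))    # 100.00, 250.00, 340.00, ...
print(simple_index(k401))   # 100.00, 525.71, 760.00, ...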

b.

The time series plot is:


[Plot omitted: IRA and 401(k) simple index series, 1990-2004; y-axis: Index, x-axis: Year.]


c.

Both the IRA and 401(k) funds have increased since 1990. However, the 401(k) fund has increased at a higher rate than has the IRA fund.

13.64

a.

Using MINITAB, the results from fitting the model E(Yt) = β0 + β1t are:
Regression Analysis: GDP versus t
The regression equation is
GDP = 9595 + 79.5 t
Predictor
Constant
t

Coef
9594.96
79.537

S = 97.4825

SE Coef
45.28
3.780

R-Sq = 96.1%

T
211.89
21.04

P
0.000
0.000

R-Sq(adj) = 95.9%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
18
19

SS
4206863
171051
4377914

MS
4206863
9503

F
442.70

P
0.000

Unusual Observations
Obs
1

t
1.0

GDP
9876.0

Fit
9674.5

SE Fit
42.0

Residual
201.5

St Resid
2.29R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 0.236602
Predicted Values for New Observations
New
Obs
1

Fit
11265.2

SE Fit
45.3

95% CI
(11170.1, 11360.4)

95% PI
(11039.4, 11491.1)

Values of Predictors for New Observations


New
Obs
1

t
21.0

Predicted Values for New Observations


New
Obs
1

Fit
11344.8

SE Fit
48.6

95% CI
(11242.6, 11446.9)

95% PI
(11115.9, 11573.6)

Values of Predictors for New Observations


New
Obs
1


t
22.0


Predicted Values for New Observations


New
Obs
1

Fit
11424.3

SE Fit
52.0

95% CI
(11315.0, 11533.6)

95% PI
(11192.2, 11656.5)

Values of Predictors for New Observations


New
Obs
1

t
23.0

Predicted Values for New Observations


New
Obs
1

Fit
11503.8

SE Fit
55.5

95% CI
(11387.3, 11620.4)

95% PI
(11268.2, 11739.5)X

X denotes a point that is an outlier in the predictors.


Values of Predictors for New Observations
New
Obs
1

t
24.0

The fitted regression line is: Ŷt = 9,594.96 + 79.537t


From the printout, the 2006 quarterly GDP forecasts are:

Year    Quarter    Forecast    95% Lower Limit    95% Upper Limit
2006    Q1         11,265.2    11,039.4           11,491.1
        Q2         11,344.8    11,115.9           11,573.6
        Q3         11,424.3    11,192.2           11,656.5
        Q4         11,503.8    11,268.2           11,739.5

b.

The following model is fit: E(Yt) = β0 + β1t + β2Q1 + β3Q2 + β4Q3

where Q1 = 1 if quarter 1, 0 otherwise
      Q2 = 1 if quarter 2, 0 otherwise
      Q3 = 1 if quarter 3, 0 otherwise

The MINITAB printout is:


Regression Analysis: GDP versus t, Q1, Q2, Q3
The regression equation is
GDP = 9573 + 79.8 t + 29.4 Q1 + 21.1 Q2 + 25.8 Q3
Predictor
Constant
t
Q1
Q2
Q3

Coef
9572.60
79.850
29.35
21.10
25.85

S = 105.993

SE Coef
69.10
4.190
68.20
67.56
67.17

R-Sq = 96.2%

T
138.53
19.06
0.43
0.31
0.38

P
0.000
0.000
0.673
0.759
0.706

R-Sq(adj) = 95.1%


Analysis of Variance
Source
Regression
Residual Error
Total
Source
t
Q1
Q2
Q3

DF
1
1
1
1

DF
4
15
19

SS
4209395
168519
4377914

MS
1052349
11235

F
93.67

P
0.000

Seq SS
4206863
656
212
1664

Unusual Observations
Obs
1

t
1.0

GDP
9876.0

Fit
9681.8

SE Fit
58.1

Residual
194.2

St Resid
2.19R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 0.238059
Predicted Values for New Observations
New
Obs
1

Fit
11278.8

SE Fit
69.1

95% CI
(11131.5, 11426.1)

95% PI
(11009.1, 11548.5)

Values of Predictors for New Observations


New
Obs
1

t
21.0

Q1
1.00

Q2
0.000000

Q3
0.000000

Predicted Values for New Observations


New
Obs
1

Fit
11350.4

SE Fit
69.1

95% CI
(11203.1, 11497.7)

95% PI
(11080.7, 11620.1)

Values of Predictors for New Observations


New
Obs
1

t
22.0

Q1
0.000000

Q2
1.00

Q3
0.000000

Predicted Values for New Observations


New
Obs
1

Fit
11435.0

SE Fit
69.1

95% CI
(11287.7, 11582.3)

95% PI
(11165.3, 11704.7)

Values of Predictors for New Observations


New
Obs
1

t
23.0

Q1
0.000000

Q2
0.000000

Q3
1.00

Predicted Values for New Observations


New
Obs
Fit SE Fit
95% CI
95% PI
1 11489.0
69.1 (11341.7, 11636.3) (11219.3, 11758.7)
Values of Predictors for New Observations
New
Obs
1


t
24.0

Q1
0.000000

Q2
0.000000

Q3
0.000000


The fitted regression line is:

Ŷt = 9,572.6 + 79.85t + 29.35Q1 + 21.10Q2 + 25.85Q3

To determine whether the data indicate a significant seasonal component, we test:

H0: β2 = β3 = β4 = 0
Ha: At least one βi ≠ 0, i = 2, 3, 4

The test statistic is

F = [(SSER - SSEC)/(k - g)] / [SSEC/(n - (k + 1))] = [(171,051 - 168,519)/(4 - 1)] / [168,519/(20 - (4 + 1))] = 844/11,234.6 = 0.075

Since no α is given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k - g = 4 - 1 = 3 and ν2 = n - (k + 1) = 20 - (4 + 1) = 15. From Table IX, Appendix B, F.05 = 3.29. The rejection region is F > 3.29.

Since the observed value of the test statistic does not fall in the rejection region (F = .075 is not greater than 3.29), H0 is not rejected. There is insufficient evidence to indicate a seasonal component at α = .05. This supports the assertion that the data have been seasonally adjusted.
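The following Python sketch (not part of the original solution) computes the nested-model F statistic above from the two error sums of squares; the function name nested_f is ours.

def nested_f(sse_reduced, sse_complete, k, g, n):
    """F = [(SSE_R - SSE_C)/(k - g)] / [SSE_C/(n - (k + 1))]."""
    num = (sse_reduced - sse_complete) / (k - g)
    den = sse_complete / (n - (k + 1))
    return num / den

print(round(nested_f(171051, 168519, k=4, g=1, n=20), 3))   # ≈ 0.075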
c.

From the printout, the 2006 quarterly forecasts are:

Year    Quarter    Forecast    95% Lower Limit    95% Upper Limit
2006    Q1         11,278.8    11,009.1           11,548.5
        Q2         11,350.4    11,080.7           11,620.1
        Q3         11,435.0    11,165.3           11,704.7
        Q4         11,489.0    11,219.3           11,758.7

d.

To determine if the time series residuals are autocorrelated, we test:


H0: No first-order autocorrelation of residuals
Ha: Positive or negative first-order autocorrelation of residuals

The test statistic is d = 0.24.

For α = .10, the rejection region is d < dL,α/2 = dL,.05 = .90 or (4 - d) < dL,.05 = .90. The value of dL,.05 is found in Table XIII, Appendix B, with k = 4 and n = 20.

Since the observed value of the test statistic falls in the rejection region (d = 0.24 < .90), H0 is rejected. There is sufficient evidence to indicate the time series residuals are autocorrelated at α = .10.


13.66

a.

Using MINITAB, the results from fitting the model E(Yt) = β0 + β1t are:
Regression Analysis: Revolving versus t
The regression equation is
Revolving = - 84.5 + 33.8 t
Predictor
Constant
t

Coef
-84.54
33.768

S = 56.7803

SE Coef
23.41
1.575

R-Sq = 95.2%

T
-3.61
21.44

P
0.001
0.000

R-Sq(adj) = 95.0%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
23
24

SS
1482334
74152
1556486

MS
1482334
3224

F
459.78

P
0.000

Unusual Observations
Obs
1

t
1.0

Revolving
55.0

Fit
-50.8

SE Fit
22.0

Residual
105.8

St Resid
2.02R

R denotes an observation with a large standardized residual.


Predicted Values for New Observations
New
Obs
1

Fit
827.2

SE Fit
24.8

95% CI
(775.9, 878.5)

95% PI
(699.0, 955.4)

Values of Predictors for New Observations


New
Obs
1

t
27.0

Predicted Values for New Observations


New
Obs
1

Fit
861.0

SE Fit
26.2

95% CI
(806.7, 915.2)

95% PI
(731.6, 990.3)

Values of Predictors for New Observations


New
Obs
1

t
28.0

The fitted regression line is: Ŷt = -84.54 + 33.768t


For the years 2006 and 2007, t = 27 and 28. From the printout, the predicted values
and 95% prediction intervals for 2006 and 2007 are:

Year    Forecast    95% Lower Limit    95% Upper Limit
2006    827.2       699.0              955.4
2007    861.0       731.6              990.3

b.

To compute the Holt-Winters values for the years 1980-2004:


With w = .7 and v = .7,
E2 = Y2 = 61
T2 = Y2 - Y1 = 61 - 55 = 6
E3 = wY3 + (1 - w)(E2 + T2) = .7(66) + (1 - .7)(61 + 6) = 66.3
T3 = v(E3 - E2) + (1 - v)T2 = .7(66.3 - 61) + (1 - .7)(6) = 5.51
The rest of the values appear in the table:

Year
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004

Revolving
55
61
66
79
100
122
136
153
174
198
239
245
257
288
338
443
499
530
579
608
678
722
738
759
794

Holt-Winters
w = .7
v = .7
Et
Tt

61.00
66.30
76.84
95.76
118.91
137.17
153.98
173.24
196.19
232.66
250.91
261.89
284.49
327.99
419.44
497.62
543.45
584.91
614.75
669.40
720.81
748.01
765.97
792.44


6.00
5.51
9.03
15.95
20.99
19.08
17.49
18.73
21.69
32.04
22.38
14.40
20.14
36.49
74.97
77.22
55.24
45.59
34.57
48.62
50.57
34.22
22.83
25.38


Using the Holt-Winters series, the forecasts for 2006 and 2007 are:
F2006 = Ft+2 = Et + 2Tt = 792.44 + 2(25.38) = 843.20
F2007 = Ft+3 = Et + 3Tt = 792.44 + 3(25.38) = 868.58
These values are very similar to forecasts found using regression.
13.68

a.

From Example 13.4, the exponentially smoothed value for September 2005 is
80.333. The forecasts for October through December 2005 are:
F2005,Oct = Ft+1 = Et = 80.333
F2005,Nov = Ft+2 = Ft+1 = 80.333
F2005,Dec = Ft+3 = Ft+1 = 80.333
The forecast errors are the differences between the actual values and the forecasted
values. The forecast errors are:
Year         Yt+i     Ft+i      Difference
2005, Oct    81.88    80.333    1.55
2005, Nov    88.90    80.333    8.57
2005, Dec    82.20    80.333    1.87

b.

Using MINITAB, the results of fitting the model are:


Regression Analysis: IBM versus Time
The regression equation is
IBM = 95.8 - 0.740 Time
Predictor
Constant
Time

Coef
95.777
-0.7401

S = 5.79351

SE Coef
2.622
0.2088

R-Sq = 39.8%

T
36.53
-3.54

P
0.000
0.002

R-Sq(adj) = 36.6%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
19
20

SS
421.71
637.73
1059.44

MS
421.71
33.56

F
12.56

P
0.002

Unusual Observations
Obs
12

Time
12.0

IBM
98.58

Fit
86.90

SE Fit
1.28

Residual
11.68

St Resid
2.07R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 0.688518


Predicted Values for New Observations


New
Obs
1

Fit
79.50

SE Fit
2.62

95% CI
(74.01, 84.98)

95% PI
(66.19, 92.81)

Values of Predictors for New Observations


New
Obs
1

Time
22.0

Predicted Values for New Observations


New
Obs
1

Fit
78.76

SE Fit
2.81

95% CI
(72.88, 84.63)

95% PI
(65.28, 92.23)

Values of Predictors for New Observations


New
Obs
1

Time
23.0

Predicted Values for New Observations


New
Obs
1

Fit
78.02

SE Fit
2.99

95% CI
(71.75, 84.28)

95% PI
(64.37, 91.67)

Values of Predictors for New Observations


New
Obs
1

Time
24.0

The least squares fitted model is: Ŷt = 95.777 - .7401t

β̂0 = 95.777   The estimated stock price for IBM in December 2003 is 95.777.

β̂1 = -.7401   The estimated decrease in the value of the stock for IBM for each additional month is .7401.

c.

The approximate precision is ±2s or ±2(5.79) or ±11.58.

d.

The forecasts and prediction intervals are found at the bottom of the printout in
part b.

Year         Forecast    95% Lower Limit    95% Upper Limit
2005, Oct    79.50       66.19              92.81
2005, Nov    78.76       65.28              92.23
2005, Dec    78.02       64.37              91.67

The precision for October is approximately ±(92.81 - 66.19)/2 = ±13.31.

The precision for November is approximately ±(92.23 - 65.28)/2 = ±13.48.

The precision for December is approximately ±(91.67 - 64.37)/2 = ±13.65.

All of these are close to the ±11.58 from part c.


e.

The MAD, MAPE, and RMSE for the smoothed series are:

MAD = Σ|Yt - Ft| / m = (|81.88 - 80.33| + |88.90 - 80.33| + |82.20 - 80.33|) / 3 = 11.98 / 3 = 3.994

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|81.88 - 80.33|/81.88 + |88.90 - 80.33|/88.90 + |82.20 - 80.33|/82.20) / 3] × 100 = (.1380 / 3) × 100 = 4.599

RMSE = √[Σ(Yt - Ft)² / m] = √[((81.88 - 80.33)² + (88.90 - 80.33)² + (82.20 - 80.33)²) / 3] = √(79.2724 / 3) = 5.140

The MAD, MAPE, and RMSE for the regression model are:

MAD = Σ|Yt - Ft| / m = (|81.88 - 79.50| + |88.90 - 78.76| + |82.20 - 78.02|) / 3 = 16.70 / 3 = 5.567

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|81.88 - 79.50|/81.88 + |88.90 - 78.76|/88.90 + |82.20 - 78.02|/82.20) / 3] × 100 = (.1940 / 3) × 100 = 6.466

RMSE = √[Σ(Yt - Ft)² / m] = √[((81.88 - 79.50)² + (88.90 - 78.76)² + (82.20 - 78.02)²) / 3] = √(125.9564 / 3) = 6.480

The values of MAD, MAPE, and RMSE for the exponentially smoothed model are all
smaller than their corresponding values for the regression model.
f.

We have to assume that the error terms are independent.

g.

To determine if positive autocorrelation is present, we test:


H0: No first-order autocorrelation of residuals
Ha: Positive first-order autocorrelation of residuals
The test statistic is d = 0.69.
The rejection region is d < dL,α = dL,.05 = 1.22. The value of dL,.05 is found in Table XIII, Appendix B, with k = 1 and n = 21.

Since the observed value of the test statistic falls in the rejection region (d = .69 < 1.22), H0 is rejected. There is sufficient evidence to indicate the time series residuals are positively autocorrelated at α = .05. Since there is evidence of positive autocorrelation, the validity of the regression model is questioned.


The Gasket Manufacturing Case


(To accompany Chapters 1213)

For this study, I constructed an R-chart and an x̄-chart for both the original data (5.1) and for the new data (5.2).

First, we will analyze the data set 5.1 (that collected under the discretion of the operator). We must compute the mean and range for each sample. The range R = largest measurement - smallest measurement. The results are listed in the table:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

0.0440
0.0438
0.0453
0.0451
0.0459
0.0449
0.0472
0.0457
0.0464
0.0451
0.0456
0.0448
0.0459
0.0456
0.0472
0.0462
0.0427
0.0431
0.0425
0.0429
0.0443
0.0443
0.0429
0.0448

Samples
0.0446
0.0425
0.0428
0.0441
0.0466
0.0471
0.0477
0.0459
0.0457
0.0447
0.0455
0.0423
0.0468
0.0471
0.0465
0.0463
0.0437
0.0448
0.0442
0.0447
0.0441
0.0423
0.0427
0.0451

0.0437
0.0443
0.0433
0.0434
0.0476
0.0451
0.0452
0.0472
0.0447
0.0457
0.0445
0.0442
0.0452
0.0450
0.0461
0.0471
0.0445
0.0429
0.0432
0.0450
0.0450
0.0447
0.0464
0.0428

x
0.0441
0.0435
0.0438
0.0442
0.0467
0.0457
0.0467
0.0463
0.0456
0.0452
0.0452
0.0438
0.0460
0.0459
0.0466
0.0465
0.0436
0.0436
0.0433
0.0442
0.0445
0.0438
0.0440
0.0442

Range
0.0009
0.0018
0.0025
0.0017
0.0017
0.0022
0.0025
0.0015
0.0017
0.0010
0.0011
0.0025
0.0016
0.0021
0.0011
0.0009
0.0018
0.0019
0.0017
0.0021
0.0009
0.0024
0.0037
0.0023

x̿ = (x̄1 + x̄2 + … + x̄24)/n = 1.0770/24 = .0449

R̄ = (R1 + R2 + … + R24)/n = .0436/24 = .0018

We now construct an R chart. From Table XVII, Appendix B, with n = 3, D3 = .000 and
D4 = 2.574.


R̄ = .0018

Upper control limit = R̄D4 = .0018(2.574) = .0046

Since D3 = 0, the lower control limit is negative and is not included on the chart.

From Table XVII, Appendix B, with n = 3, d2 = 1.693 and d3 = .888.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = .0018 + 2(.888)(.0018/1.693) = .0037
Lower A-B boundary = R̄ - 2d3(R̄/d2) = .0018 - 2(.888)(.0018/1.693) = -.0001 ≈ 0
Upper B-C boundary = R̄ + d3(R̄/d2) = .0018 + (.888)(.0018/1.693) = .0027
Lower B-C boundary = R̄ - d3(R̄/d2) = .0018 - (.888)(.0018/1.693) = .0009

The R-chart is:

To determine if the process is in control, we check the four rules.


Rule 1: One point beyond Zone A: There are no points beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points are in Zone
C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
The process appears to be in control. No rule is violated.
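The following Python sketch (not part of the original solution) computes the R-chart and x̄-chart limits used in this analysis from the summary values R̄ = .0018 and x̿ = .0449 and the control-chart constants for samples of size n = 3.

d2, d3, D4, A2 = 1.693, 0.888, 2.574, 1.023   # control-chart constants for n = 3
r_bar, x_bar_bar = 0.0018, 0.0449             # summary values for data set 5.1

# R-chart: upper control limit and zone boundaries (sigma_R estimated by d3*R_bar/d2)
sigma_r = d3 * r_bar / d2
print(round(D4 * r_bar, 4))                                       # UCL ≈ .0046
print(round(r_bar + 2 * sigma_r, 4), round(r_bar + sigma_r, 4))   # A-B ≈ .0037, B-C ≈ .0027

# x-bar chart: upper and lower control limits
print(round(x_bar_bar + A2 * r_bar, 4), round(x_bar_bar - A2 * r_bar, 4))   # ≈ .0467, .0431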
Next, we construct the x̄-chart.


Centerline = x̿ = .0449

From Table XVII, Appendix B, with n = 3, A2 = 1.023

Upper control limit = x̿ + A2R̄ = .0449 + 1.023(.0018) = .0467
Lower control limit = x̿ - A2R̄ = .0449 - 1.023(.0018) = .0431

Upper A-B boundary = x̿ + (2/3)(A2R̄) = .0449 + (2/3)(1.023)(.0018) = .0461
Lower A-B boundary = x̿ - (2/3)(A2R̄) = .0449 - (2/3)(1.023)(.0018) = .0437
Upper B-C boundary = x̿ + (1/3)(A2R̄) = .0449 + (1/3)(1.023)(.0018) = .0455
Lower B-C boundary = x̿ - (1/3)(A2R̄) = .0449 - (1/3)(1.023)(.0018) = .0443

The x̄-chart is:

To determine if the process is in or out of control, we check the six rules:


Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are six groups of at least three points in Zone A or beyond (points 5-7, points 6-8, points 7-9, points 14-16, points 17-19, and points 18-20).
Rule 6: Four out of five points in a row in Zone B or beyond: There are six groups of points that satisfy this rule (points 5-9, points 6-10, points 17-21, points 18-22, points 19-23, and points 20-24).


The process appears to be out of control. Rules 5 and 6 indicate that the process is out of control.
Since the process is out of control, a capability analysis is not appropriate. However, I will include a
dot diagram which indicates that many of the actual observations are outside of the specification limits.
The dot plot is:
[Dot plot omitted: the 72 individual thickness measurements, plotted on a scale from 0.0430 to 0.0480.]

The specification limits are .043 to .047. There are 11 points below .043 and 8 above .047. Thus, 19
out of the 72 points or .264 of the points are outside of the specification limits.
This indicates that the present system, when the operator is allowed to adjust the system at his/her
discretion, is not capable of reaching the needs of the customers.
Next, we analyze the second set of data, 5.2.
First, we must compute the mean and range for each sample. The range R = largest measurement - smallest measurement. The results are listed in the table:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

0.0445
0.0435
0.0438
0.0449
0.0433
0.0455
0.0455
0.0445
0.0443
0.0449
0.0465
0.0461
0.0443
0.0456
0.0447
0.0454
0.0445
0.0438
0.0453
0.0455
0.0440
0.0444
0.0445
0.0450

Samples
0.0455
0.0453
0.0459
0.0449
0.0461
0.0454
0.0458
0.0451
0.0450
0.0448
0.0449
0.0439
0.0434
0.0459
0.0442
0.0445
0.0471
0.0445
0.0444
0.0435
0.0438
0.0450
0.0447
0.0463


0.0457
0.0450
0.0428
0.0467
0.0451
0.0461
0.0445
0.0436
0.0441
0.0467
0.0448
0.0452
0.0454
0.0452
0.0457
0.0451
0.0465
0.0472
0.0451
0.0443
0.0444
0.0467
0.0461
0.0456

x
0.0452
0.0446
0.0442
0.0455
0.0448
0.0457
0.0453
0.0444
0.0445
0.0455
0.0454
0.0451
0.0444
0.0456
0.0449
0.0450
0.0460
0.0452
0.0449
0.0444
0.0441
0.0454
0.0451
0.0456

Range
0.0012
0.0018
0.0031
0.0018
0.0028
0.0007
0.0013
0.0015
0.0009
0.0019
0.0017
0.0022
0.0020
0.0007
0.0015
0.0009
0.0026
0.0034
0.0009
0.0020
0.0006
0.0023
0.0016
0.0013


x̿ = (x̄1 + x̄2 + … + x̄24)/n = 1.0808/24 = .0450

R̄ = (R1 + R2 + … + R24)/n = .0407/24 = .0017

First, we construct an R chart. From Table XVII, Appendix B, with n = 3, D3 = .000 and D4 = 2.574.

R̄ = .0017

Upper control limit = R̄D4 = .0017(2.574) = .0044

Since D3 = 0, the lower control limit is negative and is not included on the chart.

From Table XVII, Appendix B, with n = 3, d2 = 1.693 and d3 = .888.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = .0017 + 2(.888)(.0017/1.693) = .0035
Lower A-B boundary = R̄ - 2d3(R̄/d2) = .0017 - 2(.888)(.0017/1.693) = -.0001 ≈ 0
Upper B-C boundary = R̄ + d3(R̄/d2) = .0017 + (.888)(.0017/1.693) = .0026
Lower B-C boundary = R̄ - d3(R̄/d2) = .0017 - (.888)(.0017/1.693) = .0008
The R-chart is:

To determine if the process is in control, we check the four rules.


Rule 1: One point beyond Zone A: There are no points beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points are in
Zone C (on one side of the centerline) or beyond.


Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
The process appears to be in control. No rule is violated.
Next, we construct the x̄-chart.
Centerline = x̿ = .0450

From Table XVII, Appendix B, with n = 3, A2 = 1.023

Upper control limit = x̿ + A2R̄ = .0450 + 1.023(.0017) = .0467
Lower control limit = x̿ - A2R̄ = .0450 - 1.023(.0017) = .0433

Upper A-B boundary = x̿ + (2/3)(A2R̄) = .0450 + (2/3)(1.023)(.0017) = .0462
Lower A-B boundary = x̿ - (2/3)(A2R̄) = .0450 - (2/3)(1.023)(.0017) = .0438
Upper B-C boundary = x̿ + (1/3)(A2R̄) = .0450 + (1/3)(1.023)(.0017) = .0456
Lower B-C boundary = x̿ - (1/3)(A2R̄) = .0450 - (1/3)(1.023)(.0017) = .0444

The x̄-chart is:

To determine if the process is in or out of control, we check the six rules:


Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points are in
Zone C (on one side of the centerline) or beyond.


Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: This pattern does not exist.
Rule 6: Four out of five points in a row in Zone B or beyond: This pattern does not exist.

The process appears to be in control. No rules are violated. Since the process is in control, we will
perform a capability analysis to see if the process can meet the customer's demand. I will include a dot
diagram which indicates that many of the actual observations are outside of the specification limits.
The dot plot is:
[MINITAB dot plot of the 72 gasket thickness measurements, scaled from about 0.0432 to 0.0472]

The specification limits are .043 to .047. There is one point below .043 and two points above .047.
Thus, 3 out of the 72 points or .042 of the points are outside of the specification limits. This indicates
that the present system, when the operator does not adjust the system at his/her discretion, might be able
to meet the needs of the customers.
We will also compute the capability index. The capability index is defined as the ratio of the
specification limits to 6 standard deviations or:

Cp = (upper specification limit - lower specification limit) / (6σ)

Since σ is not known, we will estimate it with s. In this case, s = .00095. The capability index is:

Cp = (.047 - .043) / [6(.00095)] = .702

Since the capability index is less than 1, it indicates that the process is not capable of meeting the
customer's needs. Even though this process (operator does not make adjustments) is in control, it is not
capable of meeting the needs of the customers.
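A quick numerical check of the capability index, a minimal sketch assuming the sample standard deviation s = .00095 reported above is used in place of σ:

# Sketch: process capability index Cp, with s substituted for the unknown sigma.
usl, lsl = 0.047, 0.043   # specification limits for gasket thickness
s = 0.00095               # sample standard deviation of the 72 measurements (from the text)

cp = (usl - lsl) / (6 * s)
print(round(cp, 3))        # about 0.702; Cp < 1 suggests the process cannot meet the spec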
In conclusion, it appears that the engineers are correct: the present equipment is not capable of
producing gasket material within the necessary limits.


Nonparametric Statistics

Chapter 14

14.2

a.

Since the normal distribution is symmetric, the probability that a randomly selected
observation exceeds the mean of a normal distribution is .5.

b.

By the definition of "median," the probability that a randomly selected observation


exceeds the median of a normal distribution is .5.

c.

If the distribution is not normal, the probability that a randomly selected observation
exceeds the mean depends on the distribution. With the information given, the
probability cannot be determined.

d.

By definition of "median," the probability that a randomly selected observation exceeds


the median of a non-normal distribution is .5.

14.4

a.

H0: η = 9
Ha: η > 9
The test statistic is S = {Number of observations greater than 9} = 7.
The p-value = P(x ≥ 7) where x is a binomial random variable with n = 10 and p = .5.
From Table II,
p-value = P(x ≥ 7) = 1 - P(x ≤ 6) = 1 - .828 = .172
Since the p-value = .172 > α = .05, H0 is not rejected. There is insufficient evidence to
indicate the median is greater than 9 at α = .05.
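The binomial p-values used in the sign-test exercises can be reproduced directly. A minimal sketch for parts a and b, assuming SciPy is available:

from scipy.stats import binom

# Sign test for H0: eta = 9 with n = 10 usable observations.
n, p = 10, 0.5
S = 7                                          # number of observations greater than 9

p_value_one_sided = binom.sf(S - 1, n, p)      # P(x >= 7) = 1 - P(x <= 6) = .172 (part a)
p_value_two_sided = 2 * binom.sf(S - 1, n, p)  # two-sided version = .344 (part b)
print(round(p_value_one_sided, 3), round(p_value_two_sided, 3))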

b.

H0: η = 9
Ha: η ≠ 9
S1 = {Number of observations less than 9} = 3 and
S2 = {Number of observations greater than 9} = 7
The test statistic is S = larger of S1 and S2 = 7.
The p-value = 2P(x ≥ 7) where x is a binomial random variable with n = 10 and p = .5.
From Table II,
p-value = 2P(x ≥ 7) = 2(1 - P(x ≤ 6)) = 2(1 - .828) = .344
Since the p-value = .344 > α = .05, H0 is not rejected. There is insufficient evidence to
indicate the median is different than 9 at α = .05.


c.

H0: η = 20
Ha: η < 20
The test statistic is S = {Number of observations less than 20} = 9.
The p-value = P(x ≥ 9) where x is a binomial random variable with n = 10 and p = .5.
From Table II,
p-value = P(x ≥ 9) = 1 - P(x ≤ 8) = 1 - .989 = .011
Since the p-value = .011 < α = .05, H0 is rejected. There is sufficient evidence to indicate
the median is less than 20 at α = .05.

d.

H0: η = 20
Ha: η ≠ 20
S1 = {Number of observations less than 20} = 9 and
S2 = {Number of observations greater than 20} = 1
The test statistic is S = larger of S1 and S2 = 9.
The p-value = 2P(x ≥ 9) where x is a binomial random variable with n = 10 and p = .5.
From Table II,
p-value = 2P(x ≥ 9) = 2(1 - P(x ≤ 8)) = 2(1 - .989) = .022
Since the p-value = .022 < α = .05, H0 is rejected. There is sufficient evidence to indicate
the median is different than 20 at α = .05.

e.

For all parts, μ = np = 10(.5) = 5 and σ = √(npq) = √(10(.5)(.5)) = 1.581.

For part a, P(x ≥ 7) ≈ P(z ≥ ((7 - .5) - 5)/1.581) = P(z ≥ .95) = .5 - .3289 = .1711

This is close to the probability .172 in part a. The conclusion is the same.

For part b, 2P(x ≥ 7) ≈ 2P(z ≥ ((7 - .5) - 5)/1.581) = 2P(z ≥ .95) = 2(.5 - .3289) = .3422

This is close to the probability .344 in part b. The conclusion is the same.

For part c, P(x ≥ 9) ≈ P(z ≥ ((9 - .5) - 5)/1.581) = P(z ≥ 2.21) = .5 - .4864 = .0136

This is close to the probability .011 in part c. The conclusion is the same.


For part d, 2P(x ≥ 9) ≈ 2P(z ≥ ((9 - .5) - 5)/1.581) = 2P(z ≥ 2.21) = 2(.5 - .4864) = .0272

This is close to the probability .022 in part d. The conclusion is the same.
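The normal approximation in part e can be checked the same way; again a minimal sketch, assuming SciPy:

from math import sqrt
from scipy.stats import norm

# Normal approximation to the sign test (part e), with a continuity correction of .5.
n, p = 10, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))   # 5 and about 1.581

z_a = ((7 - 0.5) - mu) / sigma             # about .95 for parts a and b
z_c = ((9 - 0.5) - mu) / sigma             # about 2.21 for parts c and d
print(round(norm.sf(z_a), 4), round(2 * norm.sf(z_a), 4))   # roughly .1711 and .3422
print(round(norm.sf(z_c), 4), round(2 * norm.sf(z_c), 4))   # roughly .0136 and .0272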

f.

We must assume only that the sample is selected randomly from a continuous probability
distribution.

14.6

a.

To determine if the median amount of caffeine in Breakfast Blend coffee exceeds


300 milligrams, we test:
H0: η = 300
Ha: η > 300

b.

S=4

c.

Using Table II, Appendix B, with n = 6 and p = .5,

P(x ≥ 4) = 1 - P(x ≤ 3) = 1 - .656 = .344
d.

Since the probability in part c is greater than α = .05, H0 is not rejected. There is
insufficient evidence to indicate the median amount of caffeine in Breakfast Blend coffee
exceeds 300 milligrams at α = .05.

14.8

a.

To determine if cohesiveness will deteriorate after storage, we test:
H0: η = 0
Ha: η > 0

b.

The test statistic is S = {number of measurements greater than 0} = 13.


The p-value = P(x ≥ 13) where x is a binomial random variable with n = 20 and p = .5.
From Table II,
p-value = P(x ≥ 13) = 1 - P(x ≤ 12) = 1 - .868 = .132

c.

Since the p-value = .132 > α = .05, H0 is not rejected. There is insufficient evidence
to indicate cohesiveness will deteriorate after storage at α = .05.

14.10

a.

I would recommend the sign test because five of the sample measurements are of similar
magnitude, but the 6th is about three times as large as the others. It would be very
unlikely to observe this sample if the population were normal.

b.

To determine if the airline is meeting the requirement, we test:


H0: η = 30
Ha: η < 30


c.

The test statistic is S = number of measurements less than 30 = 5.


H0 will be rejected if the p-value < α = .01.

d.

The test statistic is S = 5.


The p-value = P(x ≥ 5) where x is a binomial random variable with n = 6 and p = .5.
From Table II,
p-value = P(x ≥ 5) = 1 - P(x ≤ 4) = 1 - .891 = .109
Since the p-value = .109 is not less than α = .01, H0 is not rejected. There is insufficient
evidence to indicate the airline is meeting the maintenance requirement at α = .01.

14.12

To determine if the median surface roughness of coated interior pipe differs from 2
micrometers, we test:
H0: η = 2
Ha: η ≠ 2
S1 = {Number of measurements < 2} = 9.
S2 = {Number of measurements > 2} = 11.
The test statistic is S = larger of S1 and S2 = 11.
The p-value = 2P(x ≥ 11) where x is a binomial random variable with n = 20 and p = .5.
From Table II, Appendix B,
p-value = 2P(x ≥ 11) = 2(1 - P(x ≤ 10)) = 2(1 - .588) = .824
Since the p-value = .824 is not less than α = .05, H0 is not rejected. There is insufficient evidence to
indicate the median surface roughness of coated interior pipe differs from 2 micrometers
at α = .05.

14.14

To determine if the distribution of A is shifted to the left of distribution B, we test:


H0: The two sampled populations have identical distributions
Ha: The probability distribution for population A is shifted to the left of population B.

The test statistic is

z = [T1 - n1(n1 + n2 + 1)/2] / √[n1n2(n1 + n2 + 1)/12]
  = [173 - 15(15 + 15 + 1)/2] / √[15(15)(15 + 15 + 1)/12] = -2.47

The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV,
z.05 = 1.645. The rejection region is z < -1.645.

Since the observed value of the test statistic falls in the rejection region (z = -2.47 < -1.645),
H0 is rejected. There is sufficient evidence to indicate the distribution of A is shifted to the left
of distribution B.
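The large-sample rank sum calculation can be verified from T1 and the sample sizes alone; a minimal sketch in Python:

from math import sqrt

# Large-sample Wilcoxon rank sum z statistic (Exercise 14.14).
T1, n1, n2 = 173, 15, 15
mean_T1 = n1 * (n1 + n2 + 1) / 2             # 232.5
sd_T1 = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # about 24.1
z = (T1 - mean_T1) / sd_T1
print(round(z, 2))   # about -2.47; compare with -1.645 for the lower-tailed test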


14.16
Sample from
Population 1
15
10
12
16
13
8

Rank
13
8.5
10.5
14
12
4.5

T1 = 62.5
a.

Sample from
Population 2
5
12
9
9
8
4
5
10

Rank
2.5
10.5
6.5
6.5
4.5
1
2.5
8.5
T2 = 42.5

H0: The two sampled populations have identical probability distributions


Ha: The probability distribution for population 1 is shifted to the left or to the right
of that for 2
The test statistic is T1 = 62.5 since sample A has the smallest number of measurements.
The null hypothesis will be rejected if T1 TL or T1 TU where TL and TU correspond to
= .05 (two-tailed), n1 = 6 and n2 = 8. From Table XV, Appendix B, TL = 29 and TU = 61.
Reject H0 if T1 29 or T1 61.
Since T1 = 62.5 61, we reject H0 and conclude there is sufficient evidence to indicate
population 1 is shifted to the left or right of population 2 at = .05.

b.

H0: The two sampled populations have identical probability distributions


Ha: The probability distribution for population 1 is shifted to the right of population 2
The test statistic remains T1 = 62.5.
The null hypothesis will be rejected if T1 TU where TU corresponds to = .05 (onetailed), n1 = 6 and n2 = 8. From Table XV, Appendix B, TU = 58.
Reject H0 if T1 58.
Since T1 = 62.5 58, we reject H0 and conclude there is sufficient evidence to indicate
population 1 is shifted to the right of population 2 at = .05.


14.18

a.

Some preliminary calculations:


Private Sector
2.58
5.05
0.05
2.10
4.30
2.25
2.50
1.94
2.33

b.

Rank
10
13
1
5
12
6
8
4
7
T1 = 66

Public Sector
5.40
2.55
9.00
10.55
1.02
5.11
12.42
1.67
3.33

Rank
15
9
16
17
2
14
18
3
11
T2 = 105

To determine if the distribution for public sector organizations is located to the right of
the distribution for private sector firms, we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution of the public sector is located to the right of that
for the private sector
The test statistic is T2 = 105.
The null hypothesis will be rejected if T2 TU where TU corresponds to = .05 (onetailed), and n1 = n2 = 9. From Table XV, Appendix B, TU = 105.
Reject H0 if T2 105.
Since T2 = 105 105, H0 is rejected. There is sufficient evidence to indicate that the
distribution in the public sector organization is located to the right of the distribution for
the private sector firms at = .05.

c.

The null hypothesis will be rejected if T2 TU where TU corresponds to = .05 (onetailed), and n1 = n2 = 9. From Table XV, Appendix B, TU = 105. Since T1 = 105, we
would reject H0. Thus, the p-value is less than or equal to = .05.

d.

The assumptions necessary for the test are:


1. The two samples are random and independent.
2. The two probability distributions from which the samples were drawn are continuous.

14.20

a.
American Purchasing
Managers
Sample 1
Rank
50
20.5
10
4.5
35
15.5
30
13.5
20
10.5
15
7.5
8
3
40
17.5
80
26.5
75
25
19
9
11
6
5
1.5
25
12
30
13.5
T1 = 186

b.

Mexican Purchasing
Managers
Sample 2
Rank
10
4.5
90
29
65
24
50
20.5
20
10.5
15
7.5
60
23
80
26.5
85
28
35
15.5
5
1.5
55
22
40
17.5
45
19
95
30
T2 = 279

To determine whether American and Mexican purchasing managers perceive the given
ethical situation differently, we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution of the American managers is shifted to the right or left
of the probability distribution of the Mexican managers.

The test statistic is

z = [T1 - n1(n1 + n2 + 1)/2] / √[n1n2(n1 + n2 + 1)/12]
  = [186 - 15(15 + 15 + 1)/2] / √[15(15)(15 + 15 + 1)/12] = -1.929

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From
Table IV, Appendix B, z.025 = 1.96. The rejection region is z < -1.96 or z > 1.96.

Since the observed value of the test statistic does not fall in the rejection region
(z = -1.929 is not less than -1.96), H0 is not rejected. There is insufficient evidence to indicate
American and Mexican purchasing managers perceive the given ethical situation
differently at α = .05.
c.

In order to use the t-test, we need to assume that the two populations being sampled from
are normal and that the variances of the two populations are equal. To check these
assumptions, we will use stem-and-leaf plots and dot plots.


The stem-and-leaf plots are:


Stem-and-leaf of Ethics
Leaf Unit = 1.0
2
6
(2)
7
4
3
2
2
1

0
1
2
3
4
5
6
7
8

0
1
2
3
4
5
6
7
8
9

= 15

Managers = 2

= 15

58
0159
05
005
0
0
5
0

Stem-and-leaf of Ethics
Leaf Unit = 1.0
1
3
4
5
7
(2)
6
4
4
2

Managers = 1

5
05
0
5
05
05
05
05
05

Neither of these two stem-and-leaf plots look mound-shaped. The assumption that the
populations are normal may not be valid.
The dot plots are:
Managers
1
.... . :

. :

. .

. .

+---------+---------+---------+---------+---------+-------Ethics
Managers
2

. .

. .

. .

. .

. .

. .

+---------+---------+---------+---------+---------+-------Ethics
0

20

40

60

80

100

The spread of the two data sets look approximately equal. The assumption that the
variances of the two populations are the same appears to be valid.


14.22

a.

Using MINITAB, histograms of the two data sets are:

Histogram of HEATRATE
9000 10000 11000 12000 13000 14000 15000 16000

Aeroderiv

20

Traditional

Frequency

15

10

9000 10000 11000 12000 13000 14000 15000 16000

HEATRATE
Panel variable: ENGINE

From the histograms, the data for each group do not look like they are mound-shaped. The variance of the aeroderivative engines is greater than that of the
traditional engines. Thus, the assumptions of normal distributions and equal
variances necessary for the t-test are probably not met.

14.24

b.

The p-value = .3431. Since this p-value is not small, H0 is not rejected. There is no
evidence to indicate that the heat rate distribution of the traditional turbine engines is
shifted to the right or left of that for the aeroderivative turbine engines.

a.

We first rank all the data:


Firms with
Successful MIS (1)
Score
Rank
Score
52
5
90
70
15
75
40
1.5
80
80
19
95
82
21
90
65
12.5
86
59
9
95
60
10.5
93

T1 = 290.5


Rank
25.5
17
19
29.5
25.5
23
29.5
28

Firms with
Unsuccessful MIS (2)
Score
Rank Score
Rank
60
10.5
65
12.5
50
4
55
7
55
7
70
15
70
15
90
25.5
41
3
85
22
40
1.5
80
19
55
7
90
25.5

T2 = 174.5


To determine whether the distribution of quality scores for the successfully implemented
systems differs from that for the unsuccessfully implemented systems, we test:
H0: The two sampled distributions are identical
Ha: The probability distribution for the successful MIS is shifted to the right or left of
that for the unsuccessful MIS

The test statistic is

z = [T1 - n1(n1 + n2 + 1)/2] / √[n1n2(n1 + n2 + 1)/12]
  = [290.5 - 16(16 + 14 + 1)/2] / √[16(14)(16 + 14 + 1)/12] = 1.767

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From
Table IV, Appendix B, z.025 = 1.96. The rejection region is z < -1.96 or z > 1.96.
Since the observed value of the test statistic does not fall in the rejection region
(z = 1.767 is not greater than 1.96), H0 is not rejected. There is insufficient evidence to indicate the
distribution of quality scores for the successfully implemented systems differs from that
for the unsuccessfully implemented systems at α = .05.
b.

We could use the two-sample t-test if:


1. Both populations are normal.
2. The variances of the two populations are the same.

14.26

a.

The test statistic is T or T+, the smaller of the two.


The rejection region is T 152, from Table XVI, Appendix B, with n = 30, = .10, and
two-tailed.

b.

The test statistic is T.


The rejection region is T 60, from Table XVI, Appendix B, with n = 20, = .05, and
one-tailed.

c.

The test statistic is T+.


The rejection region is T+ 0, from Table XVI, Appendix B, with n = 8, = .005, and
one-tailed.

14.28

a.

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table
IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

b.

The large-sample test statistic is

z = [T+ - n(n + 1)/4] / √[n(n + 1)(2n + 1)/24] = [273 - 25(26)/4] / √[25(26)(51)/24] = 2.97

Since the observed value of the test statistic falls in the rejection region (z = 2.97 >
1.645), H0 is rejected. There is sufficient evidence to indicate that the responses for A
tend to be larger than those for B at α = .05.
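A quick numerical check of the large-sample signed rank statistic, a minimal sketch assuming only T+ and n are known:

from math import sqrt

# Large-sample Wilcoxon signed rank z statistic (Exercise 14.28).
T_plus, n = 273, 25
mean_T = n * (n + 1) / 4                       # 162.5
sd_T = sqrt(n * (n + 1) * (2 * n + 1) / 24)    # about 37.17
z = (T_plus - mean_T) / sd_T
print(round(z, 2))   # about 2.97, which exceeds 1.645, so H0 is rejected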


c.

p-value = P(z 2.97) = .5 P(0 < z < 2.97)


= .5 .4985
= .0015 (from Table IV, Appendix B)
Thus, we can reject H0 for any preselected greater than .0015.

14.30

a.

To determine if the chest injury ratings of drivers and front-seat passengers differ,
we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution of drivers is shifted to the right or left of that for
front-seat passengers

b.

Using MINITAB, the results are:


Wilcoxon Signed Rank Test: Diff
Test of median = 0.000000 versus median not = 0.000000

Diff

N
18

N for
Test
16

Wilcoxon
Statistic
23.0

P
0.021

Estimated
Median
-4.000

From the printout, the test statistic is T+ = 23.

c.

The rejection region is T+ To where To corresponds to = .01 (two-tailed) and n = 16.


From Table XVI, Appendix B, To = 19. The rejection region is T+ 19.

d.

Since the observed value of the test statistic does not fall in the rejection region
(T+ = 23 / 19), H0 is not rejected. There is insufficient evidence to indicate the chest
injury ratings of drivers and front-seat passengers differ at = .01.
From the printout, the p-value is p = .021.

14.32

Some preliminary calculations:


Theme

Tourism
Physical
Transportation
People
History
Climate
Forestry
Agriculture
Fishing
Energy
Mining
Manufacturing


High School
Teachers
10
2
7
1
2
6
5
7
9
2
10
12

Geography
Alumni
2
1
3
6
5
4
8
10
7
8
11
12

Difference
Rank of Absolute
T-A
Differences
8
11
1
1.5
4
8
9
5
6
3
2
3.5
6
3
6
3
2
3.5
10
6
1.5
1
0
(eliminated)
Positive rank sum T+ = 27.5


To determine if the distributions of theme rankings for the two groups differ, we test:
H0: The probability distributions for the two populations are identical
Ha: The probability distribution of the high school teachers is shifted to the right or left
of the probability distribution of the geography alumni
The test statistic is T+ = 27.5.
Reject H0 if T+ T0 where T0 is based on = .05 and n = 11 (two-tailed):
Reject H0 if T+ 11 (from Table XVI, Appendix B)
Since the observed value of the test statistic does not fall in the rejection region (T+ = 27.5 /
11), H0 is not rejected. There is insufficient evidence to indicate that the distributions of these
rankings for the two groups differ at = .05. Practically, this means that the thematic content
of a new atlas could be based on the views of either educators or geography alumni.
14.34

Some preliminary calculations are:

Employee
1
2
3
4
5
6
7
8
9
10

Before
Flextime
54
25
80
76
63
82
94
72
33
90

After
Flextime
68
42
80
91
70
88
90
81
39
93

Difference
(B A)
4
17
0
15
7
6
4
9
6
3

Difference
7
9
(Eliminated)
8
5
3.5
2
6
3.5
1
T+ = 2

To determine if the pilot flextime program is a success, we test:


H0: The two probability distributions are identical
Ha: The probability distribution before is shifted to the left of that after
The test statistic is T+ = 2.
The rejection region is T+ 8, from Table XVI, Appendix B, with n = 9 and = .05.
Since the observed value of the test statistic falls in the rejection region (T+ = 2 8), H0 is
rejected. There is sufficient evidence to indicate the pilot flextime program has been a success
at = .05.


14.36

Some preliminary calculations are:

Science
0
4
3
1
3
2
4
2
3
4

Math
2
3
0
1
1
3
0
1
1
1

Rank of
Difference
Absolute
ScienceDifference
Math
2
5
1
2
3
7.5
0
eliminate
2
5
1
2
4
9
1
2
2
5
3
7.5
Negative rank sum T_ = 7
Positive rank sum T+ = 38

To determine if there are differences in the levels of family involvement between math
and science homework, we test:
H0: The distributions of the science and math levels of family involvement are the
same
Ha: The distributions of the science and math levels of family involvement differ
The test statistic is T_ = 7.
The rejection region is T_ To where To corresponds to = .05 (two-tailed) and n = 9.
From Table XVI, Appendix B, To = 6. The rejection region is T_ 6.
Since the observed value of the test statistic does not fall in the rejection region
(T_ = 7 / 6), H0 is not rejected. There is insufficient evidence to indicate there are
differences in the levels of family involvement between math and science homework at
= .05.
14.38

a.

The hypotheses are:


H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location

b.

The test statistic is:


H = [12/(n(n + 1))] Σ(Rj²/nj) - 3(n + 1)
  = [12/(45(46))](230²/15 + 440²/15 + 365²/15) - 3(46)
  = 146.754 - 138 = 8.754

The rejection region requires α = .05 in the upper tail of the χ² distribution with
df = p - 1 = 3 - 1 = 2. From Table VII, Appendix B, χ².05 = 5.99147. The rejection
region is H > 5.99147.

Since the observed value of the test statistic falls in the rejection region (H = 8.754 >
5.99147), H0 is rejected. There is sufficient evidence to indicate that the probability
distributions of at least two of the populations A, B, and C differ in location at α = .05.
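The H statistic can be reproduced from the rank sums; a minimal sketch in Python:

# Kruskal-Wallis H statistic from the rank sums (Exercise 14.38).
rank_sums = [230, 440, 365]
n_j = [15, 15, 15]
n = sum(n_j)

H = 12 / (n * (n + 1)) * sum(R**2 / nj for R, nj in zip(rank_sums, n_j)) - 3 * (n + 1)
print(round(H, 3))   # about 8.754, which exceeds the chi-square critical value 5.99147 (2 df, alpha = .05)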
c.

d.

14.40

a.

The approximate p-value is P(2 8.754). From Table VII, Appendix B, with df = 2,
.01 P(2 8.754) .025.
RB 440
R A = 230
=
= 29.333
= 15.333
RB =
15 15
15 15
RC 365
n + 1 45 + 1
=
= 24.333
=
= 23
R =
RC =
15 15
2
2
12
H=
n j ( R j R )2
n(n + 1)
12
=
15(15.333 23) 2 + 15(29.333 23) 2 + 15(24.333 23) 2 = 8.754

45(46)
In order to compare the three population means using parametric techniques, we must
assume that all populations being sampled from are normal and all population variances
are the same. It is quite possible that these two conditions are not met with this data.
RA =

b.

Since we want to compare 3 groups, we will use the Kruskal-Wallis test.

c.

The test statistic is


H=

R 2j
53352 3937 2 37692
12
12

+
=
+
+
3(
1)
n

n
n(n + 1)
161(161 + 1) 67
57
37
j

3(161 + 1)

= 11.201

14.42

d.

Since the p-value is so small (p = .0037), H0 will be rejected. There is sufficient


evidence to indicate DEF distributions differ for the 3 tax litigation forums for > .0037.

a.

To determine if the distributions of office rental growth rates differ among the four
market cycle phases, we test:
H0: The four probability distributions are identical
Ha: At least two of the growth rate distributions differ


b.
Phase I
2.7
1.0
1.1
3.4
4.2
3.5

14.44

The ranks of the measurements are:


Rank
9
4.5
6
10
12
11
R1 = 52.5

Phase II
10.5
11.5
9.4
12.2
8.6
10.9

Rank
20
23
19
24
18
21
R2 = 125

Phase III
6.1
1.2
11.4
4.4
6.2
7.6

Rank
14
7
22
13
15.5
17
R3 = 88.5

Phase IV
1.0
6.2
10.8
2.0
1.1
2.3

Rank
4.5
15.5
1
8
3
2
R4 = 34

c.

The rank sums appear in the table above. The test statistic is:
R 2j
52.52 1252 88.52 342
12
12

+
=
+
+
+
3(
1)
H=
n

3(24 + 1)
n
n( n + 1)
24(24 + 1) 6
6
6
6
j
= 16.23

d.

The rejection region requires = .05 in the upper tail of the 2 distribution with df =
2
p 1 = 4 1 = 3. From Table VII, Appendix B, .05
= 7.81473. The rejection region is
H > 7.81473.

e.

Since the observed value of the test statistic falls in the rejection region
(H = 16.23 > 7.81473), H0 is rejected. There is sufficient evidence to indicate the
distributions of office rental growth rates differ among the four market cycle phases at
= .05.

Some preliminary calculations are:


Aromatics
1.06
0.79
0.82
0.89
1.05
0.95
0.65
1.15
1.12

Ranks
26
19
20
22
25
24
18
29
27.5

R1 = 210.5


Chloroalkanes
1.58
1.45
0.57
1.16
1.12
0.91
0.83
0.43

Ranks
32
31
15
30
27.5
23
21
9.5

R2 = 189

Esters
0.29
0.06
0.44
0.61
0.55
0.43
0.51
0.10
0.34
0.53
0.06
0.09
0.17
0.60
0.17

Ranks
7
1.5
11
17
14
9.5
12
4
8
13
1.5
3
5.5
16
5.5
R3 = 128.5


To determine if the sorption rate distributions differ among the three solvents, we test:
H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location

The test statistic is


R 2j
210.52 1892 128.52
12
12
3(n + 1) =
+
+
H=

3(32 + 1)

n( n + 1)
nj
32(32 + 1) 9
8
15
= 20.197

The rejection region requires = .01 in the upper tail of the 2 distribution with df = p 1 = 3
2
1 = 2. From Table VII, Appendix B, .01
= 9.21034. The rejection region is
H > 9.21034.
Since the observed value of the test statistic falls in the rejection region (H = 20.197 >
9.21034), H0 is rejected. There is sufficient evidence to indicate the sorption rate distributions
differ among the three solvents at = .01.
14.46

a.

The F-test would be appropriate if:


1. All p populations sampled from are normal.
2. The variances of the p populations are equal.
3. The p samples are independent.

b.

The variances for the three populations are probably not the same and the populations are
probably not normal.

c.

To determine whether the salary distributions differ among the three cities, we test:
H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location

Some preliminary calculations are:


1
Atlanta
34,600
84,900
61,700
38,900
77,200
83,600
59,800


Rank
1
19
11
3
17
18
10
R1 = 79

2
Los Angeles
42,400
135,000
63,000
43,700
69,400
97,000
49,500

Rank
4
21
12
5
13
20
7
R2 = 82

3
Washington, D.C.
38,000
76,900
48,000
72,600
73,200
51,800
55,000

Rank
2
16
6
14
15
8
9
R3 = 70


The test statistic is H =


=

2
12
Rj
3(n + 1)

n( n + 1)
nj

12 79 2 82 2 70 2
+
+

3(22) = 66.2894 66 = .2894


21(22) 7
7
7

The rejection region requires = .05 in the upper tail of the 2 distribution with df = p
2
1 = 3 1 = 2. From Table VII, Appendix B, .05
= 5.99147. The rejection region is
H > 5.99147.
Since the observed value of the test statistic does not fall in the rejection region (H =
.2894 >/ 5.99147), H0 is not rejected. There is insufficient evidence to indicate the salary
distributions differ among the three cities at = .05.
We must assume we have independent random samples, sample sizes greater than or
equal to 5 from each population, and that all populations are continuous.
14.48

a.

The hypotheses are:


H0: The probability distributions for three treatments are identical
Ha: At least two of the probability distributions differ in location

b.

The rejection region requires = .10 in the upper tail of the 2 distribution with df =
2
p 1 = 3 1 = 2. From Table VII, Appendix B, .10
= 4.60517. The rejection region is
Fr > 4.60517.

c.

Some preliminary calculations are:


Block
1
2
3
4
5
6
7

9
13
11
10
9
14
10

Rank

1
2
1
1
2
2
1
RA = 10

B
11
13
12
15
8
12
12

Rank
2
2
2.5
2
1
1
2
RB = 12.5

C
18
13
12
16
10
16
15

Rank
3
2
2.5
3
3
3
3
RC = 19.5

The test statistic is

Fr = [12/(bp(p + 1))] ΣRj² - 3b(p + 1) = [12/(7(3)(4))](10² + 12.5² + 19.5²) - 3(7)(4)
   = 90.9286 - 84 = 6.9286

Since the observed value of the test statistic falls in the rejection region (Fr = 6.9286
> 4.60517), H0 is rejected. There is sufficient evidence to indicate that the effectiveness of the
three different treatments differs at α = .10.
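The Fr statistic can be checked from the treatment rank sums; a minimal sketch:

# Friedman Fr statistic from the treatment rank sums (Exercise 14.48).
rank_sums = [10, 12.5, 19.5]
b, p = 7, 3          # 7 blocks, 3 treatments

Fr = 12 / (b * p * (p + 1)) * sum(R**2 for R in rank_sums) - 3 * b * (p + 1)
print(round(Fr, 4))   # about 6.9286, which exceeds the chi-square critical value 4.60517 (2 df, alpha = .10)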


14.50

a.

The Friedman test statistic is Fr =


=

14.52

12
R 2j 3b( p + 1)
bp ( p + 1)

12
(27 2 + 252 + 182 + 112 + 92 ) 3(6)(5 + 1) = 17.333
6(5)(5 + 1)

b.

The rejection region requires = .05 in the upper tail of the 2 distribution with df =
2
p 1 = 5 1 = 4. From Table VII, Appendix B, .05
= 9.48773. The rejection region is
Fr > 9.48773.

c.

Since the observed value of the test statistic falls in the rejection region
(Fr = 17.333 > 9.48773), H0 is rejected. There is sufficient evidence to indicate there is
a difference in the levels of farm production among the five conditions at = .05.

a.

To determine if the distributions of rotary oil rigs differ among the three states, we test:
H0: The probability distributions of the rotary oil rigs for the 3 states are the same
Ha: At least two of the probability distributions of rotary oil rigs differ in location

b.

The ranked data are:


Month/Year
Nov. 2000
Oct. 2001
Nov. 2001

c.

Utah
2
2
2
R2 = 6

Alaska
1
1
1
R3 = 3

The test statistic is


Fr =


California
3
3
3
R1 = 9

12
12
92 + 62 + 32 3(3)(3 + 1) = 6
R 2j 3b( p + 1) =

3(3)(3 + 1)
bp ( p + 1)

d.

The rejection region requires = .05 in the upper tail of the 2 distribution with df =
2
p 1 = 3 1 = 2. From Table VII, Appendix B, .05
= 5.99147. The rejection region is
H > 5.99147.

e.

Since the observed value of the test statistic falls in the rejection region (H = 6 > 5.99147),
H0 is rejected. There is sufficient evidence to indicate the distributions of rotary oil rigs
differ among the three states at = .05.


14.54

Some preliminary calculations are:

Location
Anguilla
Antigua
Dominica
Guyana
Jamaica
St. Lucia
Suriname

Temephos Rank
4.6
5
9.2
5
7.8
5
1.7
2
3.4
3
6.7
4
1.4
1
R1 = 13

Malsathion
Rank
1.2
1
2.9
3
1.4
1
1.9
4
3.7
4
2.7
1.5
1.9
3
R2 = 15

Fenitrothion
Rank
1.5
2.5
2.0
1.5
2.4
2
2.2
5
2.0
2
2.7
1.5
2.0
4
R3 = 18.5

Fenthion Rank
1.8
4
7.0
4
4.2
4
1.5
1
1.5
1
4.8
3
2.1
5
R4 = 22

Chlorpyrifos Rank
1.5
2.5
2.0
1.5
4.1
3
1.8
3
7.1
5
8.7
5
1.7
2
R5 = 22

To determine if the resistance ratio distributions of the 5 insecticides differ, we test:


H0: The distributions of the 5 insecticide ratios are the same
Ha: At least two of the distributions of insecticide ratios differ
12
R 2j 3b( p + 1)

bp ( p + 1)
12
(252 + 17.52 + 18.52 + 222 + 222 ) 3(7)(5 + 1) = 2.086
=
7(5)(5 + 1)

The test statistic is Fr =

Since no α was given, we will use α = .05. The rejection region requires α = .05 in the upper
2
tail of the 2 distribution with df = p 1 = 5 1 = 4. From Table VII, Appendix B, .05
=
9.48773. The rejection region is Fr > 9.48773.
Since the observed value of the test statistic does not fall in the rejection region
(Fr = 2.086 >/ 9.48773), H0 is not rejected. There is insufficient evidence to indicate that the
resistance ratio distributions of the 5 insecticides differ at = .05.
14.56

Some preliminary calculations are:

Week
1
2
3
4
5
6
7
8
9

Monday
5
5
2.5
2
5
4
5
4
1
R1 = 33.5

Tuesday
1
4
2.5
1
1
2
3.5
2
2
R2 = 19

Wednesday
4
3
5
3.5
2
3
1.5
1
5
R3 = 28

Thursday
2
1
1
5
3
1
3.5
3
3
R1 = 22.5

Friday
3
2
4
3.5
4
5
1.5
5
4
R2 = 32

To determine if the distributions of days of the weeks differ, we test:


H0: The probability distributions of the 5 days of the week are the same
Ha: At least two of the probability distributions of the 5 days of the week differ in
location


The test statistic is


12
R 2j 3b( p + 1)

bp ( p + 1)
12
33.52 + 192 + 282 + 22.52 + 322 3(9)(5 + 1) = 6.778
=
9(5)(5 + 1)

Fr =

Since no α was given we will use α = .05. The rejection region requires α = .05 in the upper
2
tail of the 2 distribution with df = p 1 = 5 1 = 4. From Table VII, Appendix B, .05
=
9.48773. The rejection region is H > 9.48773.
Since the observed value of the test statistic does not fall in the rejection region
(H = 6.778 >/ 9.48773), H0 is not rejected. There is insufficient evidence to indicate the
distributions of the absentee rate for the days of the weeks differ at = .05.
14.58

14.60

a.

From Table XVII with n = 10, rs,/2 = rs,.025 = .648. The rejection region is rs > .648 or
rs < .648.

b.

From Table XVII with n = 20, rs, = rs,.025 = .450. The rejection region is rs > .450.

c.

From Table XVII with n = 30, rs, = rs,.01 = .432. The rejection region is rs < .432.

a.

H0: ρs = 0
Ha: ρs ≠ 0

b.

The test statistic is rs =

x
0
3
0
4
3
0
4


Rank, u
3
5.5
3
1
5.5
3
7
u = 28

SSuv =

uv

SSuu =

SSvv =

SSuv
SSuuSSvv
y
0
2
2
0
3
1
2

Rank, v
1.5
5
5
1.5
7
3
5
v = 28

( u )( v ) = 131 28(28)
n

(u )

( v)

= 137.5

(20) 2
7

= 137.5

(20) 2
7

u2
9
30.25
9
1
30.25
9
49
u 2 = 137.5

v2
2.25
25
25
2.25
49
9
25
v 2 = 137.5

uv
45
27.5
15
1.5
38.5
9
35
uv = 131

= 19


rs =

19
= .745
25.5(25.5)
Reject H0 if rs < rs,/2 or rs > rs,/2 where /2 = .025 and n = 7:
Reject H0 if rs < .786 or rs > .786 (from Table XVII, Appendix B).

Since the observed value of the test statistic does not fall in the rejection region, (rs = .745
>/ .786), do not reject H0. There is insufficient evidence to indicate x and y are correlated
at = .05.

14.62

c.

The p-value is P(rs .745) + P(rs .745). For n = 7, rs = .745 is above rs,.025 where /2 =
.025 and below rs,.05 where /2 = .05. Therefore, 2(.025) = .05 < p-value < 2(.05) = .10.

d.

The assumptions of the test are that the samples are randomly selected and the probability
distributions of the two variables are continuous.

a.

Some preliminary calculations are:


Brand    Expert 1    Expert 2    Difference di    di²
A           6           5             1            1
B           5           6            -1            1
C           1           2            -1            1
D           3           1             2            4
E           2           4            -2            4
F           4           3             1            1
                                           Σdi² = 12

rs = 1 - 6Σdi²/[n(n² - 1)] = 1 - 6(12)/[6(6² - 1)] = 1 - .343 = .657
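The shortcut formula lends itself to a quick numerical check; a minimal sketch using the two experts' rankings above:

# Spearman rank correlation via the shortcut formula (Exercise 14.62a).
expert1 = [6, 5, 1, 3, 2, 4]   # brand rankings by expert 1 (brands A through F)
expert2 = [5, 6, 2, 1, 4, 3]   # brand rankings by expert 2
n = len(expert1)

d_sq = sum((u - v) ** 2 for u, v in zip(expert1, expert2))   # sum of squared rank differences = 12
r_s = 1 - 6 * d_sq / (n * (n ** 2 - 1))
print(round(r_s, 3))   # 0.657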

b.

To determine if there is a positive correlation in the rankings of the two experts, we test:
H0: ρs = 0
Ha: ρs > 0
The test statistic is rs = .657.
Reject H0 if rs > rs, where = .05 and n = 6. From Table XVII, Appendix B,
rs,.01 = .829. Reject H0 if rs > .829.
Since the observed value of the test statistic does not fall in the rejection region
(rs = .657 >/ .829), H0 is not rejected. There is insufficient evidence to indicate a
positive correlation in the rankings of the two experts at = .05.


14.64

a.

Some preliminary calculations are:


x
u
y
v
5.2
1
220
4.5
5.5
7
227
7.5
6.0
23.5
259
15.5
5.9
20.5
210
1
5.8
16
224
6
6.0
23.5
215
3
5.8
16
231
9
5.6
10
268
19
5.6
10
239
11
5.9
20.5
212
2
5.4
5
410
24
5.6
10
256
14
5.8
16
306
22
5.5
7
259
15.5
5.3
3
284
21
5.3
3
383
23
5.7
12.5
271
20
5.5
7
264
18
5.7
12.5
227
7.5
5.3
3
263
17
5.9
20.5
232
10
5.8
16
220
4.5
5.8
16
246
13
5.9
20.5
241
12
u =300
v = 300
SSuv =

uv

SSuu =

SSvv =

rs =

u-sq
1
49
552.25
420.25
256
552.25
256
100
100
420.25
25
100
256
49
9
9
156.25
49
156.25
9
420.25
256
256
420.25
2
u =4878

( u )( v ) = 3197.5 300(300)
n

(u )

(v)

SSuv
SSuuSSvv

24

= 4878

v-sq
20.25
56.25
240.25
1
36
9
81
361
121
4
576
196
484
240.25
441
529
400
324
56.25
289
100
20.25
169
144
2
v =4898.5

uv
4.5
52.5
364.25
20.5
96
70.5
144
190
110
41
120
140
352
108.5
63
69
250
126
93.75
51
205
72
208
246
uv =3197.5

= 552.5

3002
= 1128
24

= 4898.5
552.5

1128(1148.5)

3002
= 1148.5
24
= .4854

Since the magnitude of the correlation coefficient is not particularly large, there is a fairly
weak negative relationship between sweetness index and pectin.


b.

To determine if there is a negative association between the sweetness index and the
amount of pectin, we test:
H0: s = 0
Ha: s < 0
The test statistic is rs = .4854
Reject H0 if rs < rs, where = .01 and n = 24.
Reject H0 if rs < .485 (from Table XVII, Appendix B)
Since the observed value of the test statistic falls in the rejection region
(rs = .4854 < .485), H0 is rejected. There is sufficient evidence to indicate there is a
negative association between the sweetness index and the amount of pectin at = .01.

14.66

a.

Some preliminary calculations are:


Parent
643
381
342
251
216
208
192
141
131
128
124

Rank, u
11
10
9
8
7
6
5
4
3
2
1

rs = 1

Subsid
2,617
1,724
1,867
1,238
890
681
1,534
899
492
579
672

6 di2
n( n 1)
2

=1

Rank, v
11
9
10
7
5
4
8
6
1
2
3

Difference di
0
1
-1
1
2
2
-3
-2
2
0
-2

di2
0
1
1
1
4
4
9
4
4
0
4
2
di = 32

6(32)
= 1 .145 = .855
11(112 1)

Since this correlation coefficient is fairly close to 1, it indicates that there is a


relatively strong positive relationship between the number of parent companies and
the number of subsidiaries.
To determine if the number of parent companies is positively related to the number of
subsidiaries, we test:
H0: s = 0
Ha: s > 0
The test statistic is rs = .855.


From Table XVI, Appendix B, rs,.05 = .523, with n = 11. The rejection region is
rs > .523.
Since the observed value of the test statistic falls in the rejection region
(rs = .855 > .523), H0 is rejected. There is sufficient evidence to indicate that the
number of parent companies is positively related to the number of subsidiaries at
= .05.
b.

We must assume:
1. The sample is randomly selected.
2. The probability distributions of both of the variables are continuous.
The actual number of companies and subsidiaries are not continuous. However,
since the numbers of companies/subsidiaries are very large, this assumption is
basically met. From the information given, we cannot tell whether the sample was
random or not.

14.68

b.

Some preliminary calculations:

Involvement

1
2
3
4
5
6
7
8
9
10
11

rs = 1

6 d i2
n(n 1)
2

ui

vi

Differences
di = ui vi

8
6
10
2
5
9
1
4
7
11
3

9
7
10
1
5
8
2
4
6
11
3

1
1
0
1
0
1
1
0
1
0
0

=1

d i2

di2

1
1
0
1
0
1
1
0
1
0
1
=6

6(6)
= .972
11(112 1)

To determine if a positive relationship exists between participation rates and cost savings
rates, we test:
H0: s = 0
Ha: s > 0
The test statistic is rs = .972.
From Table XVII, Appendix B, rs,.01 = .736, with n = 11. The rejection region is
rs > .736.


Since the observed value of the test statistic falls in the rejection region (rs = .972 >
.736), H0 is rejected. There is sufficient evidence to indicate that a positive relationship
exists between participation rates and cost savings rates at = .01.
c.

In order for the above test to be valid, we must assume:


1.
2.

The sample is randomly selected.


The probability distributions of both of the variables are continuous.

In order to use the Pearson correlation coefficient, we must assume that both populations
are normally distributed. It is very unlikely that the data are normally distributed.
14.70

The appropriate test for this completely randomized design is the Kruskal-Wallis H-test. Some
preliminary calculations are:
Sample 1
18
32
43
15
63

Rank
4.5
6
9
3
12

Sample 2
12
33
10
34
18

Rank Sample 3
12
87
7
53
1
65
8
50
4.5
64
77
R2 = 22.5

R1 = 34.5

Rank

16
11
14
10
13
15
R3 = 79

To determine whether at least two of the populations differ in location, we test:


H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location
2

Rj
12
The test statistic is H =
3( n + 1)

n( n + 1)
nj
=

(34.5) 2 (22.5) 2 (79) 2


12
+
+

3(16 + 1)
16(16 + 1) 5
5
6

= 60.859 51 = 9.859
The rejection region requires = .05 in the upper tail of the 2 distribution with df = p 1 = 3
2
1 = 2. From Table VII, Appendix B, .05
= 5.99147. The rejection region is H > 5.99147.
Since the observed value of the test statistic falls in the rejection region (H = 9.859 > 5.99147),
reject H0. There is sufficient evidence to indicate a difference in location for at least two of the
three probability distributions at = .05.


14.72

The appropriate test for two independent samples is the Wilcoxon rank sum test. Some
preliminary calculations are:
Sample 1
1.2
1.9
.7
2.5
1.0
1.8
1.1

Rank
4
8.5
1
10
2
7
3
T1 = 35.5

Sample 2
1.5
1.3
2.9
1.9
2.7
3.5

Rank
6
5
12
8.5
11
13

T2 = 55.5

To determine if there is a difference between the locations of the probability distributions,


we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution for population 1 is shifted to the left or right of that for 2
The test statistic is T2 = 55.5.
Reject H0 if T2 TL or T2 TU where = .05 (two-tailed), n1 = 7 and n2 = 6:
Reject H0 if T2 28 or T2 56 (from Table XV, Appendix B).
Since T2 = 55.5 / 28 and T2 = 55.5 / 56, do not reject H0. There is insufficient evidence to
indicate a difference between the locations of the probability distributions for the sampled
populations at = .05.
14.74

a.

To determine whether the median biting rate is higher in bright, sunny weather, we test:
H0: η = 5
Ha: η > 5

b.

The test statistic is z = [(S - .5) - .5n] / (.5√n) = [(95 - .5) - .5(122)] / (.5√122) = 6.07
(where S = number of observations greater than 5)

The p-value is p = P(z ≥ 6.07). From Table IV, Appendix B, p = P(z ≥ 6.07) ≈ 0.0000.

c.

Since the observed p-value is less than α (p = 0.0000 < .01), H0 is rejected. There is
sufficient evidence to indicate that the median biting rate in bright, sunny weather is
greater than 5 at α = .01.
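The large-sample sign test statistic can be verified directly; a minimal sketch, assuming SciPy for the normal tail probability:

from math import sqrt
from scipy.stats import norm

# Large-sample sign test for H0: eta = 5 vs Ha: eta > 5 (Exercise 14.74).
n, S = 122, 95                      # S = number of biting rates greater than 5
z = ((S - 0.5) - 0.5 * n) / (0.5 * sqrt(n))
print(round(z, 2), norm.sf(z))      # z is about 6.07, so the p-value is essentially 0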


14.76

Some preliminary calculations are:


Difference (Highway 1 - Highway 2)    Rank of Absolute Difference
-25                                   5
  4                                   1
-23                                   4
-16                                   2.5
-16                                   2.5
                                      Positive rank sum T+ = 1

To determine if the heavily patrolled highway tends to have fewer speeders per 100 cars
than the occasionally patrolled highway, we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution for highway 1 is shifted to the left of that for
highway 2
The test statistic is T+ = 1.
The rejection region is T+ ≤ 1 from Table XVI, Appendix B, with n = 5 and α = .05.
Since the observed value of the test statistic falls in the rejection region (T+ = 1 ≤ 1), H0 is
rejected. There is sufficient evidence to indicate the probability distribution for highway
1 is shifted to the left of that for highway 2 at α = .05.
b.

Some preliminary calculations are:


Day    Difference (Highway 1 - Highway 2)
1      -25
2        4
3      -23
4      -16
5      -16

d̄ = Σdi/n = -76/5 = -15.2

sd² = [Σdi² - (Σdi)²/n] / (n - 1) = [1682 - (-76)²/5] / (5 - 1) = 131.7

sd = √131.7 = 11.4761

To determine if the mean number of speeders per 100 cars differ for the two highways,
we test:
H0: μ1 = μ2
Ha: μ1 ≠ μ2
The test statistic is t = (d̄ - 0)/(sd/√n) = -15.2/(11.4761/√5) = -2.96

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df =
n - 1 = 5 - 1 = 4. From Table VI, Appendix B, t.025 = 2.776. The rejection region is t >
2.776 or t < -2.776.
Since the observed value of the test statistic falls in the rejection region (t = -2.96
< -2.776), H0 is rejected. There is sufficient evidence to indicate the mean number of
speeders per 100 cars differs for the two highways at α = .05.
We must assume that the population of differences is normally distributed and that a
random sample of differences was selected.
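The paired t statistic can be reproduced from the five differences; a minimal sketch. The printed table lost the minus signs, so, from T+ = 1 in part a, the sketch assumes only the difference of 4 is positive:

from math import sqrt

# Paired t test on the Highway 1 minus Highway 2 differences (Exercise 14.76b).
d = [-25, 4, -23, -16, -16]        # assumed signs; only the smallest absolute difference is positive
n = len(d)
d_bar = sum(d) / n                                       # -15.2
s_d = sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))   # about 11.476
t = d_bar / (s_d / sqrt(n))
print(round(t, 2))   # about -2.96, beyond the critical value -2.776 with 4 df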
14.78

a.

Since only 70 of the 80 customers responded to the question, only the 70 will be
included.
To determine if the median amount spent on hamburgers at lunch at McDonald's is less
than $2.25, we test:
H0: = 2.25
Ha: < 2.25
S = number of measurements less than 2.25 = 20.
The test statistic is z =

( S .5) .5n
.5 n

(20 .5) .5(70)


.5 70

= 3.71

No was given in the exercise. We will use = .05. The rejection region requires
= .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645.
The rejection region is z > 1.645.
Since the observed value of the test statistic does not fall in the rejection region (z = 3.71
>/ 1.645), H0 is not rejected. There is insufficient evidence to indicate that the median
amount spent on hamburgers at lunch at McDonald's is less than $2.25 at = .05.


b.

No. The survey was done in Boston only. The eating habits of those living in Boston are
probably not representative of all Americans.

c.

We must assume that the sample is randomly selected from a continuous probability
distribution.


14.80

Some preliminary calculations:


1

Urban
4.3
5.2
6.2
5.6
3.8
5.8
4.7

2
3
Rank Suburban
Rank Rural
Rank
4.5
5.9
14
5.1
9
10.5
6.7
17
4.8
7
15.5
7.6
19
3.9
2
12
4.9
8
6.2
15.5
1
5.2
10.5
4.2
3
13
6.8
18
4.3
4.5
6
R1 = 62.5
R2 = 86.5
R3 = 41

To determine if there is a difference in the level of property taxes among the three types of
school districts, we test:
H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location
2

The test statistic is H =

Rj
12
3( n + 1)

n( n + 1)
nj

62.52 86.52 412


12
+
+

3(20) = 65.8498 60
19(19 + 1) 7
6
6
= 5.8498
=

The rejection region requires = .05 in the upper tail of the 2 distribution with df = p 1 =
2
= 5.99147. The rejection region is H > 5.99147.
3 1 = 2. From Table VII, Appendix B, .05
Since the observed value of the test statistic does not fall in the rejection region (H = 5.8498 >/
5.99147), H0 is not rejected. There is insufficient evidence to indicate that there is a difference
in the level of property taxes among the three types of school districts at = .05.
14.82

a. Some preliminary calculations are:


Truck Static Weight of
Truck (ui)
1
3
2
4
3
10
4
1
5
6
6
8
7
2
8
5
9
7
10
9
55


Weigh-in-Motion
Prior (vi)
3
4
9
1.5
6
8
1.5
5
7
10
55

Weigh-in-Motion
After (wi)
3
4
10
2
6
8
1
5
7
9
55

uivi

9
16
90
1.5
36
64
3
25
49
90
383.5

uiwi

9
16
100
2
36
64
2
25
49
81
384


ui vi = 383.5 55(55)

SSuv =

ui vI

SSuw =

u i wi

SSuu =

ui2

SSvv =

SSww =

rs1 =
rs2 =

vi2

n
( ui wi )

( ui )
n

= 385

SSuu SSvv

= 384.5

SSuw
SSuu SSww

= 385

81
82.5(82)

= 81

55(55)
= 81.5
10

552
= 81.5
10

( wi )

SSuv

= 384

( vi )

wi2

10

552
= 82
10
552
= 82.5
10

= .9848

81.5
= .9879
82.5(82.5)

The correlation coefficient for x and y1 is rs1 = .9848.


Since rs1 > 0, the relationship between static weight and weigh-in-motion prior to
adjustment is positive. Because the value is close to 1, the relationship is very strong. It
is larger than r1 = .965 found in Exercise 10.89.
The correlation coefficient for x and y2 is rs2 = .9879.
Since rs2 > 0, the relationship between static weight and weigh-in-motion after the
adjustment is positive. Because the value is close to 1, the relationship is very strong. It
is smaller than r2 = .996 found in Exercise 10.89.
b.

In order for rs to be exactly 1, the rankings for the static weight and the weigh-in-motion
must be the same for each truck.
In order for rs to be exactly 0, the rankings for one of the variables (static weight) must be
equal to 11 minus ranking of the other variable (weigh-in-motion) for each truck.

14.84

a.

To determine if the median level differs from the target, we test:


H0: = .75
Ha: .75

b.

S1 = number of observations less than .75 and S2 = number of observations greater than
.75.
The test statistic is S = larger of S1 and S2.
The p-value = 2P(x S) where x is a binomial random variable with n = 25 and p = .5. If
the p-value is less than = .10, reject H0.


c.

A Type I error would be concluding the median level is not .75 when it is. If a Type I
error were committed, the supervisor would correct the fluoridation process when it was
not necessary. A Type II error would be concluding the median level is .75 when it is
not. If a Type II error were committed, the supervisor would not correct the fluoridation
process when it was necessary.

d.

S1 = number of observations less than .75 = 7 and S2 = number of observations greater


than .75 = 18.
The test statistic is S = larger of S1 and S2 = 18.
The p-value = 2P(x 18) where x is a binomial random variable with n = 25 and p = .5.
From Table II,
p-value = 2P(x 18) = 2(1 P(x 17)) = 2(1 .978)
= 2(.022) = .044
Since the p-value = .044 < = .10, H0 is rejected. There is sufficient evidence to indicate
the median level of fluoridation differs from the target of .75 at = .10.

e.

A distribution heavily skewed to the right might look something like the following:

One assumption necessary for the t-test is that the distribution from which the sample is
drawn is normal. A distribution which is heavily skewed in one direction is not normal.
Thus, the sign test would be preferred.
14.86

Some preliminary calculations are:


Hours

Rank

1
2
3
4
5
6
7
8

1
2
3
4
5
6
7
8


Fraction
Defective
.02
.05
.03
.08
.06
.09
.11
.10

Rank

1
3
2
5
4
6
8
7

di

0
1
1
1
1
0
1
1

d i2

di2

0
1
1
1
1
0
1
1
=6


To determine if the fraction defective increases as the day progresses, we test:


H0: s = 0
Ha: s > 0
The test statistic is rs = 1

6 di2
n(n 1)
2

=1

6(6)
= 1 .071 = .929
8(82 1)

Reject H0 if rs > rs, where = .05 and n = 8:


Reject H0 if rs > .643 (from Table XVII, Appendix B).
Since rs = .929 > .643, reject H0. There is sufficient evidence to indicate that the fraction
defective increases as the day progresses at = .05.
14.88

a.

The design utilized was a completely randomized design.

b.

Some preliminary calculations are:


Site 1
34.3
35.5
32.1
28.3
40.5
36.2
43.5
34.7
38.0
35.1

Rank
6
11
3
1
19
12
23
8
15
9
R1 = 107

Site 2
39.3
45.5
50.2
72.1
48.6
42.2
103.5
47.9
41.2
44.0

Rank
17
25
28
29
27
21
30
26
20
24
R2 = 247

Site 3
34.5
29.3
37.2
33.2
32.6
38.3
43.3
36.7
40.0
35.2

Rank
7
2
14
5
4
16
22
13
18
10
R3 = 111

To determine if the probability distributions for the three sites differ, we test:
H0: The three sampled population probability distributions are identical
Ha: At least two of the three sampled population probability distributions differ in
location
2

Rj
12
The test statistic is H =
3( n + 1) 3(n + 1)

n( n + 1)
nj
=


12 107 2 247 2 1112


+
+

3(31) = 109.3923 93
30(31) 10
10
10
= 16.3923


The rejection region requires = .05 in the upper tail of the 2 distribution with df =
2
= 5.99147. The rejection region is
p 1 = 3 1 = 2. From Table VII, Appendix B, .05
H > 5.99147.
Since the observed value of the test statistic falls in the rejection region (H = 16.3923 >
5.99147), H0 is rejected. There is sufficient evidence to indicate the probability
distributions for at least two of the three sites differ at = .05.
c.

Since H0 was rejected, we need to compare all pairs of sites.


Some preliminary calculations are:
Site 1
34.3
35.5
32.1
28.3
40.5
36.2
43.5
34.7
38.0
35.1

Site 2
39.3
45.5
50.2
72.1
48.6
42.2
103.5
47.9
41.2
44.0

Rank
3
6
2
1
10
7
13
4
8
5
T1 = 59
Rank
9
15
18
19
17
12
20
16
11
14
T2 = 151

Site 2
39.3
45.5
50.2
72.1
48.6
42.2
103.5
47.9
41.2
44.0

Rank
9
15
18
19
17
12
20
16
11
14
T2 = 151
Site 3
34.5
29.3
37.2
33.2
32.6
38.3
43.3
36.7
40.0
35.2

Site 1
34.3
35.5
32.1
28.3
40.5
36.2
43.5
34.7
38.0
35.1

Rank
6
11
3
1
18
12
20
8
15
9
T1 = 103

Site 3
34.3
29.3
37.2
33.2
32.6
38.3
43.3
36.7
40.0
35.2

Rank
7
2
14
5
4
16
19
13
17
10
T3 = 107

Rank
4
1
7
3
2
8
13
6
10
5
T3 = 59

For each pair, we test:


H0: The two sampled population probability distributions are identical
Ha: The probability distribution for one site is shifted to the right or left of the
other.
The rejection region for each pair is T 79 or T 131 from Table XV, Appendix B,
with n1 = n2 = 10 and = .05.


For sites 1 and 2:


The test statistic is T1 = 59.
Since the observed value of the test statistic falls in the rejection region,
(TA = 59 79), H0 is rejected. There is sufficient evidence to indicate the
probability distribution for site 1 is shifted to the left of that for site 2 at = .05.
For sites 1 and 3:
The test statistic is T1 = 103.
Since the observed value of the test statistic does not fall in the rejection region
(T1 = 103 </ 79 and 103 >/ 131), H0 is not rejected. There is insufficient evidence
to indicate the probability distribution for site 1 is shifted to the right or left of that
for site 3 at = .05.
For sites 2 and 3:
The test statistic is T2 = 151.
Since the observed value of the test statistic falls in the rejection region
(T2 = 151 131), H0 is rejected. There is sufficient evidence to indicate the
probability distribution for site 2 is shifted to the right of that for site 3 at = .05.
Thus, the income for those at site 2 is significantly higher than at the other two sites.
d.

The necessary assumptions are:


1. The three samples are random and independent.
2. There are five or more measurements in each sample.
3. The three probability distributions from which the samples are drawn are continuous.

For parametric tests, the assumptions are:

1. The three populations are normal.
2. The samples are random and independent.
3. The three population variances are equal.


14.90

Using MINITAB, the results of the Wilcoxon Rank Sum Test (Mann-Whitney Test) for
each of the Variables are:
Mann-Whitney Test and CI: CREATIVE-S, CREATIVE-NS
CREATIVE-S
CREATIVE-NS

N
47
67

Median
5.0000
4.0000

Point estimate for ETA1-ETA2 is 1.0000


95.0 Percent CI for ETA1-ETA2 is (0.9999,1.0000)
W = 3734.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0000
The test is significant at 0.0000 (adjusted for ties)

Mann-Whitney Test and CI: INFO-S, INFO-NS


INFO-S
INFO-NS

N
47
67

Median
5.000
5.000

Point estimate for ETA1-ETA2 is 0.000


95.0 Percent CI for ETA1-ETA2 is (-0.000,1.000)
W = 2888.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.2856
The test is significant at 0.2743 (adjusted for ties)

Mann-Whitney Test and CI: DECPERS-S, DECPERS-NS


DECPERS-S
DECPERS-NS

N
47
67

Median
3.000
2.000

Point estimate for ETA1-ETA2 is -0.000


95.0 Percent CI for ETA1-ETA2 is (-0.000,1.000)
W = 2963.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.1337
The test is significant at 0.1228 (adjusted for ties)

Mann-Whitney Test and CI: SKILLS-S, SKILLS-NS


SKILLS-S
SKILLS-NS

N
47
67

Median
6.0000
5.0000

Point estimate for ETA1-ETA2 is 1.0000


95.0 Percent CI for ETA1-ETA2 is (0.9999,1.9999)
W = 3498.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0000
The test is significant at 0.0000 (adjusted for ties)


Mann-Whitney Test and CI: TASKID-S, TASKID-NS


N
47
67

TASKID-S
TASKID-NS

Median
5.000
4.000

Point estimate for ETA1-ETA2 is 1.000


95.0 Percent CI for ETA1-ETA2 is (-0.000,1.000)
W = 3028.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0614
The test is significant at 0.0566 (adjusted for ties)

Mann-Whitney Test and CI: AGE-S, AGE-NS


AGE-S
AGE-NS

N
47
67

Median
47.000
45.000

Point estimate for ETA1-ETA2 is 1.000


95.0 Percent CI for ETA1-ETA2 is (-1.000,4.001)
W = 2891.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.2779
The test is significant at 0.2771 (adjusted for ties)

Mann-Whitney Test and CI: EDYRS-S, EDYRS-NS


EDYRS-S
EDYRS-NS

N
47
67

Median
13.000
13.000

Point estimate for ETA1-ETA2 is -0.000


95.0 Percent CI for ETA1-ETA2 is (0.000,-0.000)
W = 2664.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.8268
The test is significant at 0.8191 (adjusted for ties)

A summary of the tests above and the t-tests from Chapter 7 are listed in the table:
Variable
CREATIVE
INFO
DECPERS
SKILLS
TASKID
AGE
EDYRS

Wilcoxon
Test Statistic, T2
3734.5
2888.5
2963.5
3498.5
3028.0
2891.5
2664.0

p-value
0.000
0.274
0.123
0.000
0.057
0.277
0.819

t
8.847
1.503
1.506
4.766
1.738
0.742
-0.623

p-value
0.000
0.136
0.135
0.000
0.087
0.460
0.534

The p-values for the Wilcoxon Rank Sum Tests and the t-tests are similar and the
decisions are the same.
Since the sample sizes are large (n = 47 and n = 67), the Central Limit Theorem applies.
Thus, the t-tests (or z-tests) are valid. One assumption for the Wilcoxon Rank Sum test
is that the distributions are continuous. Obviously, this is not true. There are many ties
in the data, so the Wilcoxon Rank Sum tests may not be valid.
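For readers without MINITAB, the same rank sum comparisons can be run in Python; a minimal sketch, where creative_s and creative_ns are hypothetical lists holding the CREATIVE scores for the 47 successful and 67 non-successful respondents (placeholder values shown, not the actual data):

from scipy.stats import mannwhitneyu

# Sketch: two-sided Wilcoxon rank sum (Mann-Whitney) test, the Python analogue of the
# MINITAB output above. Replace the placeholder lists with the real CREATIVE scores.
creative_s = [5, 6, 4, 5, 7]
creative_ns = [4, 3, 5, 4, 2, 4]

stat, p_value = mannwhitneyu(creative_s, creative_ns, alternative="two-sided")
print(stat, p_value)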
