
INSTRUCTOR'S SOLUTIONS MANUAL


to Accompany
James T. McClave
P. George Benson
and Terry Sincich's

STATISTICS FOR BUSINESS


AND ECONOMICS
Tenth Edition

Nancy S. Boudreau

Bowling Green State University

Upper Saddle River, New Jersey


Columbus, Ohio


Contents

Preface

Chapter 1    Statistics, Data, and Statistical Thinking
Chapter 2    Methods for Describing Sets of Data                                       5
             The Kentucky Milk Case                                                   46
Chapter 3    Probability                                                              55
Chapter 4    Random Variables and Probability Distributions                           82
             The Furniture Fire Case                                                 136
Chapter 5    Inferences Based on a Single Sample: Estimation with Confidence Intervals   137
Chapter 6    Inferences Based on a Single Sample: Tests of Hypothesis                161
Chapter 7    Inferences Based on Two Samples: Confidence Intervals and Tests of Hypotheses   201
             The Kentucky Milk Case Part II                                          243
Chapter 8    Design of Experiments and Analysis of Variance                          256
Chapter 9    Categorical Data Analysis                                               300
             Discrimination in the Work Place                                        328
Chapter 10   Simple Linear Regression                                                332
Chapter 11   Multiple Regression and Model Building                                  379
             The Condo Sales Case                                                    444
Chapter 12   Methods for Quality Improvement                                         448
Chapter 13   Time Series: Descriptive Analyses, Models, and Forecasting              476
             The Gasket Manufacturing Case                                           522
Chapter 14   Nonparametric Statistics                                                529


Preface
This solutions manual is designed to accompany the text, Statistics for Business and Economics, Tenth
Edition, by James T. McClave, P. George Benson, and Terry Sincich. It provides answers to most even-numbered exercises for each chapter in the text. Other methods of solution may also be appropriate; however,
the author has presented one that she believes to be most instructive to the beginning Statistics student.
This manual is provided to help instructors save time in preparing presentations of the solutions and to
possibly provide another point of view regarding their meaning.
Some of the exercises are subjective in nature. Subjective decisions regarding these exercises have been made
and are explained by the author. Solutions based on these decisions are presented; the solution to this type of
exercise is often most instructive. When an alternative interpretation of an exercise may occur, the author has
often addressed it and given justification for the approach taken.
I would like to thank Kelly Barber for creating the art work and for typing this work.

Nancy S. Boudreau
Bowling Green State University
Bowling Green, Ohio


Chapter 1
Statistics, Data, and Statistical Thinking

1.2

Descriptive statistics utilizes numerical and graphical methods to look for patterns, to
summarize, and to present the information in a set of data. Inferential statistics utilizes sample
data to make estimates, decisions, predictions, or other generalizations about a larger set of
data.

1.4

The first element of inferential statistics is the population of interest. The population is a set of
existing units. The second element is one or more variables that are to be investigated. A
variable is a characteristic or property of an individual population unit. The third element is
the sample. A sample is a subset of the units of a population. The fourth element is the
inference about the population based on information contained in the sample. A statistical
inference is an estimate, prediction, or generalization about a population based on information
contained in a sample. The fifth and final element of inferential statistics is the measure of
reliability for the inference. The reliability of an inference is how confident one is that the
inference is correct.

1.6

Quantitative data are measurements that are recorded on a meaningful numerical scale.
Qualitative data are measurements that are not numerical in nature; they can only be classified
into one of a group of categories.

1.8

A population is a set of existing units such as people, objects, transactions, or events. A


sample is a subset of the units of a population.

1.10

An inference without a measure of reliability is nothing more than a guess. A measure of


reliability separates statistical inference from fortune telling or guessing. Reliability gives a
measure of how confident one is that the inference is correct.

1.12

Statistical thinking involves applying rational thought processes to critically assess data and
inferences made from the data. It involves not taking all data and inferences presented at face
value, but rather making sure the inferences and data are valid.

1.14

a.

The two variables measured are type of credit card used and amount of purchase.
Type of credit card used is qualitative. It has no meaningful number associated with it,
only the name of the card used. Amount of purchase is quantitative. It has a meaningful
number associated with it.

b.

In Study 1, it says that all purchases were tracked. Thus, the data represent a population.

1.16

a. High school GPA is a number usually between 0.0 and 4.0. Therefore, it is quantitative.

b. Honors/awards would have responses that name things. Therefore, it would be qualitative.

c.

The scores on the SAT's are numbers between 200 and 800. Therefore, it is quantitative.

d.

Gender is either male or female. Therefore, it is qualitative.

e.

Parent's income is a number: $25,000, $45,000, etc. Therefore, it is quantitative.

f.

Age is a number: 17, 18, etc. Therefore, it is quantitative.

1.18

a.

1.

The variable of interest is the status of a company's e-commerce strategy.


Since a company either has an e-commerce strategy or not, the variable is
qualitative.

2.

The variable of interest is when the company will implement an e-commerce plan.
Since the time of implementation will be a date, this variable will be qualitative.

3.

The variable of interest is whether the company is delivering products over the
internet or not. Since the company is either delivering products or not, the variable
is qualitative.

4.

The variable of interest is the company's total revenue in the last fiscal year. Since
this is a meaningful number, this variable is quantitative.

b.

Since there are many more than 154 companies in the U.S., this represents a sample rather
than a population.

1.20

a.

The population of interest is the collection of computer security personnel at all U.S.
corporations and government agencies.

b.

Surveys were sent to computer security personnel at all U. S. corporations and


government agencies. However, in 2006, only 616 organizations responded to the
survey. There could be nonresponse bias. Often, only those subjects with strong
opinions will respond to a survey. Thus, the responses may not reflect what the
population as a whole thinks.

c.

The variable measured in the survey is whether or not there was unauthorized use of
computer systems at the firms during the year. Since the responses will be either Yes
or No, the variable is qualitative.

d.

If we assume that the responses were a random sample from the population, we could
infer that about 52% of all computer security personnel will admit to unauthorized use of
computer systems at their firms during the year.

1.22

a.

The data collection method used is a designed experiment.

b.

The experimental units in the study are the 50,000 smokers.

c.

The variable of interest is the age at which the scanning method first detects a tumor.
Since this is a meaningful number, this variable is quantitative.



d.

The population of interest is the set of all smokers in the U.S. The sample of interest is
the set of 50,000 smokers surveyed.

e.

The researchers want to compare the age at first detection for the 2 methods to see if one
is more sensitive than the other.

1.24

a.

The variable of interest to the researchers is the rating of highway bridges.

b.

Since the rating of a bridge can be categorized as one of three possible values, it is
qualitative.

c.

The data set analyzed is a population since all highway bridges in the U.S. were
categorized.

d.

The data were collected observationally. Each bridge was observed in its natural setting.

1.26

a.

The population of interest is the set of all New York accounting firms employing two or
more professionals. There are two variables of interest: Whether or not the firm uses
audit sampling methods, and if so, whether or not it uses random sampling. The sample
is the set of 163 firms whose responses were useable. The inference of interest to the
New York Society of CPAs is the proportion of all New York accounting firms
employing two or more professionals that use sampling methods in auditing their clients.

b.

The four responses that were unusable could have been returned blank or could have been
filled out incorrectly.

c.

Any time a survey is mailed it is questionable whether the returned questionnaires


represent a random sample. Often times, only those with very strong opinions return the
surveys. In such a case, the returned surveys would not be representative of the entire
population.

1.28

a.

The experimental units in this study are the 24 projects.

b.

The population from which the sample was selected is the set of all new software
development projects.

c.

The variable of interest in this project is the outcome of reusing previously developed
software for the new software development projects.

d.

In the sample, 9 of the 24 projects were judged failures. This is (9 / 24)*100% = 37.5%.
We could infer that approximately 37.5% of all projects would be judged failures.

1.30

a.

The process being studied is the process of filling beverage cans with softdrink at CCSB's
Wakefield plant.

b.

The variable of interest is the amount of carbon dioxide added to each can of beverage.

c.

The sampling plan was to monitor five filled cans every 15 minutes. The sample is the
total number of cans selected.


d.

The company's immediate interest is learning about the process of filling beverage cans
with softdrink at CCSB's Wakefield plant. To do this, they are measuring the amount of
carbon dioxide added to a can of beverage to make an inference about the process of
filling beverage cans. In particular, they might use the mean amount of carbon dioxide
added to the sampled cans of beverage to estimate the mean amount of carbon dioxide
added to all the cans on the process line.

e.

The technician would then be dealing with a population. The cans of beverage have
already been processed. He/she is now interested in the outputs.


Chapter 2
Methods for Describing Sets of Data

2.2

a. To find the frequency for each class, count the number of times each letter occurs. The frequencies for the three classes are:

   Class     Frequency
   X          8
   Y          9
   Z          3
   Total     20

b. The relative frequency for each class is found by dividing the frequency by the total sample size. The relative frequency for the class X is 8/20 = .40. The relative frequency for the class Y is 9/20 = .45. The relative frequency for the class Z is 3/20 = .15.
   Class     Frequency     Relative Frequency
   X          8             .40
   Y          9             .45
   Z          3             .15
   Total     20            1.00
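These frequencies and relative frequencies are easy to verify computationally. The following is a small illustrative Python sketch; the list of letters is a hypothetical stand-in consistent with the counts above, not the actual exercise data.

from collections import Counter

data = ["X"] * 8 + ["Y"] * 9 + ["Z"] * 3   # hypothetical sample of 20 class labels

counts = Counter(data)                      # frequency of each class
n = len(data)                               # total sample size
rel_freq = {cls: cnt / n for cls, cnt in counts.items()}

print(counts)      # Counter({'Y': 9, 'X': 8, 'Z': 3})
print(rel_freq)    # {'X': 0.4, 'Y': 0.45, 'Z': 0.15}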

c.

The frequency bar chart is:

d.

The pie chart for the frequency distribution is:


2.4

a.

The variable summarized in the table is Reason for requesting the installation of the
passenger-side on-off switch. The values this variable could assume are: Infant, Child,
Medical, Infant & Medical, Child & Medical, Infant & Child, and Infant & Child &
Medical. Since the responses name something, the variable is qualitative.

b. The relative frequencies are found by dividing the number of requests for each category by the total number of requests. For the category Infant, the relative frequency is 1,852/30,337 = .061. The rest of the relative frequencies are found in the table below:

   Reason                      Number of Requests    Relative Frequency
   Infant                            1,852           1,852/30,337  = .061
   Child                            17,148          17,148/30,337  = .565
   Medical                           8,377           8,377/30,337  = .276
   Infant & Medical                     44              44/30,337  = .0014
   Child & Medical                     903             903/30,337  = .030
   Infant & Child                    1,878           1,878/30,337  = .062
   Infant & Child & Medical            135             135/30,337  = .0045
   TOTAL                            30,337                            .9999

c.

Using MINITAB, a pie chart of the data is:

   Pie Chart of Reason: Child (17,148, 56.5%), Medical (8,377, 27.6%), Infant & Child (1,878, 6.2%), Infant (1,852, 6.1%), Child & Medical (903, 3.0%), Infant & Child & Medical (135, 0.4%), Infant & Medical (44, 0.1%)

d.

There are 4 categories where Medical is mentioned as a reason: Medical, Infant &
Medical, Child & Medical, and Infant & Child & Medical. The sum of the frequencies
for these 4 categories is 8,377 + 44 + 903 + 135 = 9,459. The proportion listing Medical
as one of the reasons is 9,459/30,337 = .312.


2.6

a. To find relative frequencies, we divide the frequencies of each category by the total number of incidents. The relative frequencies of the number of incidents for each of the cause categories are:

   Management System Cause Category    Number of Incidents    Relative Frequency
   Engineering & Design                        27             27/83 = .325
   Procedures & Practices                      24             24/83 = .289
   Management & Oversight                      22             22/83 = .265
   Training & Communication                    10             10/83 = .120
   TOTAL                                       83                       1

b.

The Pareto diagram is:



c.

The category with the highest relative frequency of incidents is Engineering and Design.
The category with the lowest relative frequency of incidents is Training and
Communication.

2.8

a.

The data collection method was a survey.

b.

Since the data were numbers (percentage of US labor and materials), the variable is
quantitative. Once the data were collected, they were grouped into 4 categories.


c.

Using MINITAB, a pie chart of the data is:


   Pie Chart of Made in USA: 100% (64, 60.4%), 75-99% (20, 18.9%), 50-74% (18, 17.0%), <50% (4, 3.8%)

About 60% of those surveyed believe that Made in USA means 100% US labor and
materials.
2.10

Using MINITAB, a bar chart of the frequency of occurrence of the industry types is:

   Chart of INDUSTRY (bar chart of the frequency of each of the 26 industry categories, from Aerospace/Defense through Utilities)


2.12

Using MINITAB, the side-by-side bar charts are:


   Chart of 1999, 2006 vs Use (side-by-side bar charts of the relative frequency of the responses Yes, No, and Don't know on unauthorized use of computer systems, for 1999 and 2006)

The relative frequency of unauthorized use of computer systems has decreased from
1999 to 2006.
2.14

a.

Using MINITAB, the side-by-side graphs are:


   Chart of Exposure, Opportunity, Content, Faculty vs Stars (frequency of each star rating for the four criteria)

From these graphs, one can see that very few of the top 30 MBA programs got 5-stars in
any criteria. In addition, about the same number of programs got 4 stars in each of the 4
criteria. The biggest difference in ratings among the 4 criteria was in the number of
programs receiving 3-stars. More programs received 3-stars in Course Content than in any
of the other criteria. Consequently, fewer programs received 2-stars in Course Content
than in any of the other criteria.
b.

Since this chart lists the rankings of only the top 30 MBA programs in the world, it is
reasonable that none of these best programs would be rated as 1-star on any criteria.


2.16

a. The original data set has 1 + 3 + 5 + 7 + 4 + 3 = 23 observations.

b. For the bottom row of the stem-and-leaf display:
   The stem is 0.
   The leaves are 0, 1, 2.
   The numbers in the original data set are 0, 1, and 2.

c. The dot plot corresponding to all the data points is:

2.18

2.20

a.

The measurement class that contains the highest proportion of respondents is none.
Sixty-one percent of the respondents said that their companies did not outsource any
computer security functions.

b.

From the graph, 6% of the respondents indicated that they outsourced between 20% and
40% of their computer security functions.

c.

The proportion of the 609 respondents who outsourced at least 40% of computer security
functions is .04 + .01 + .01 = .06.

d.

The number of the 609 respondents who outsourced less than 20% of computer security
functions is (.27 + .61)*609 = .88(609) = 536.


2.22

a.

Using MINITAB, the stem-and-leaf display of the data is:

Stem-and-Leaf Display: SCORE

Stem-and-leaf of SCORE   N = 169
Leaf Unit = 1.0

     1    6  2
     1    6
     2    7  2
     3    7  8
     4    8  4
    15    8  66677888899
    56    9  00001111111222222222233333333344444444444
  (100)   9  55555555555555555555556666666666666666666777777777777777777888888+
    13   10  0000000000000

b.

From the stem-and-leaf display, we see that there are only 4 observations with sanitation
scores less than the acceptable score of 86. The proportion of ships that have an accepted
sanitation standard would be (169 − 4) / 169 = .976.

c.

The sanitation score of 84 is in bold in the stem-and-leaf display in part a.

2.24

a.

Using MINITAB, the frequency histogram is:

   (Frequency histogram of Length)


b.

Using MINITAB, the frequency histogram is:


   (Frequency histogram of Weight)

c.

Using MINITAB, the frequency histogram is:

   (Frequency histogram of DDT)

2.26

Using MINITAB, the two dot plots are:


Dotplot for Arrive-Depart

Yes. Most of the numbers of items arriving at the work center per hour are in the 135 to 165
area. Most of the numbers of items departing the work center per hour are in the 110 to 140
area. Because the number of items arriving is larger than the number of items departing,
there will probably be some sort of bottleneck.


2.28

a.

Using MINITAB, the three frequency histograms are as follows (the same starting point and
class interval were used for each):
Histogram of C1   N = 25   (Tenth Performance)

Midpoint   Count
    4.00       0
    8.00       0
   12.00       1   *
   16.00       5   *****
   20.00      10   **********
   24.00       6   ******
   28.00       0
   32.00       2   **
   36.00       0
   40.00       1   *

Histogram of C2   N = 25   (Thirtieth Performance)

Midpoint   Count
    4.00       1   *
    8.00       9   *********
   12.00      12   ************
   16.00       2   **
   20.00       1   *

Histogram of C3   N = 25   (Fiftieth Performance)

Midpoint   Count
    4.00       3   ***
    8.00      15   ***************
   12.00       4   ****
   16.00       2   **
   20.00       1   *

b.

The histogram for the tenth performance shows a much greater spread of the observations
than the other two histograms. The thirtieth performance histogram shows a shift to the
left, implying shorter completion times than for the tenth performance. In addition, the
fiftieth performance histogram shows an additional shift to the left compared to that for the
thirtieth performance. However, the last shift is not as great as the first shift. This agrees
with statements made in the problem.


2.30

a.

A stem-and-leaf display is as follows, where the stems are the units place and the leaves are
the decimal places:
Stem  Leaves
  1   0 0 0 0 1 1 2 2 2 2 2 3 4 4 4 4 4 4 4 5 5 5 5 6 7 9
  2   1 1 4 4 6 7 9 9
  3   0 0 2 8 9 9
  4   1 1 1 2 5
  5   2 4
  6
  7   8
  8
  9
 10   1


b.

A little more than half (26/49 = .53) of all companies spent less than 2 months in
bankruptcy. Only two of the 49 companies spent more than 6 months in bankruptcy. It
appears then, in general, the length of time in bankruptcy for firms using "prepacks" is less
than that of firms not using "prepacks."

c.

A dot diagram will be used to compare the time in bankruptcy for the three types of
"prepack" firms:

d.

The circled times in part a correspond to companies that were reorganized through a
leverage buyout. There does not appear to be any pattern to these points. They appear to
be scattered about evenly throughout the distribution of all times.

2.32

Using MINITAB, the stem-and-leaf display for the data is:


Stem-and-leaf of Time   N = 25
Leaf Unit = 1.0

    3    3  239
    7    4  3499
   (7)   5  0011469
   11    6  34458
    6    7  13
    4    8  26
    2    9  5
    1   10  2

The numbers in bold represent delivery times associated with customers who subsequently
did not place additional orders with the firm. Since there were only 2 customers with
delivery times of 68 days or longer that placed additional orders, I would say the maximum
tolerable delivery time is about 65 to 67 days. Everyone with delivery times less than 67
days placed additional orders.


2.34

a. Σx = 3 + 8 + 4 + 5 + 3 + 4 + 6 = 33

b. Σx² = 3² + 8² + 4² + 5² + 3² + 4² + 6² = 175

c. Σ(x − 5)² = (3 − 5)² + (8 − 5)² + (4 − 5)² + (5 − 5)² + (3 − 5)² + (4 − 5)² + (6 − 5)² = 20

d. Σ(x − 2)² = (3 − 2)² + (8 − 2)² + (4 − 2)² + (5 − 2)² + (3 − 2)² + (4 − 2)² + (6 − 2)² = 71

e. (Σx)² = (3 + 8 + 4 + 5 + 3 + 4 + 6)² = 33² = 1089

2.36

a. Σx = 6 + 0 + (−2) + (−1) + 3 = 6

b. Σx² = 6² + 0² + (−2)² + (−1)² + 3² = 50

c. Σx² − (Σx)²/5 = 50 − 6²/5 = 50 − 7.2 = 42.8

2.38

a. x̄ = Σx/n = 85/10 = 8.5

b. x̄ = 400/16 = 25

c. x̄ = 35/45 = .78

d. x̄ = 242/18 = 13.44

2.40

The median is the middle number once the data have been arranged in order. If n is even, there
is not a single middle number. Thus, to compute the median, we take the average of the middle
two numbers. If n is odd, there is a single middle number. The median is this middle number.
A data set with five measurements arranged in order is 1, 3, 5, 6, 8. The median is the middle
number, which is 5.
A data set with six measurements arranged in order is 1, 3, 5, 5, 6, 8. The median is the average of the middle two numbers, which is (5 + 5)/2 = 10/2 = 5.
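The even/odd rule for the median can be checked with a short Python sketch (illustrative only), applied to the two example data sets above.

def median(values):
    # Sort the data, then take the middle value (odd n) or the
    # average of the two middle values (even n).
    v = sorted(values)
    n = len(v)
    mid = n // 2
    if n % 2 == 1:
        return v[mid]
    return (v[mid - 1] + v[mid]) / 2

print(median([1, 3, 5, 6, 8]))      # 5
print(median([1, 3, 5, 5, 6, 8]))   # 5.0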


2.42

a. Σx = 7 + ... + 4 = 15
   x̄ = Σx/n = 15/6 = 2.5
   Median = (3 + 3)/2 = 3 (mean of 3rd and 4th numbers, after ordering)
   Mode = 3

b. Σx = 2 + ... + 4 = 40
   x̄ = Σx/n = 40/13 = 3.08
   Median = 3 (7th number, after ordering)
   Mode = 3

c. Σx = 51 + ... + 37 = 496
   x̄ = Σx/n = 496/10 = 49.6
   Median = (48 + 50)/2 = 49 (mean of 5th and 6th numbers, after ordering)
   Mode = 50

2.44

a. The sample mean is:

   x̄ = Σx/n = (529 + 355 + 301 + ... + 63)/26 = 3757/26 = 144.5

   The sample median is found by finding the average of the 13th and 14th observations once the data are arranged in order. The 13th and 14th observations are 100 and 105. The average of these two numbers (median) is:

   median = (100 + 105)/2 = 205/2 = 102.5

   The mode is the observation appearing the most. For this data set, the mode is 70, which appears 3 times.

   Since the mean is larger than the median, the data are skewed to the right.

b. The sample mean is:

   x̄ = Σx/n = (11 + 9 + 6 + ... + 4)/26 = 136/26 = 5.23

   The sample median is found by finding the average of the 13th and 14th observations once the data are arranged in order. The 13th and 14th observations are 5 and 5. The average of these two numbers (median) is:

   median = (5 + 5)/2 = 10/2 = 5

   The mode is the observation appearing the most. For this data set, the mode is 6, which appears 6 times.

   Since the mean and median are about the same, the data are somewhat symmetric.
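For data sets like these, the three measures of center can be computed with Python's statistics module. The sketch below is illustrative; the list named data is a hypothetical placeholder, since the 26 sample values are not reproduced here.

from statistics import mean, median, mode

data = [11, 9, 6, 5, 5, 4]   # hypothetical placeholder; substitute the actual sample values

print(mean(data))     # sample mean
print(median(data))   # sample median
print(mode(data))     # most frequently occurring value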
2.46

a.

The sample mean is:

   x̄ = Σx/n = (1.72 + 2.50 + 2.16 + ... + 1.95)/20 = 37.62/20 = 1.881

The sample average surface roughness of the 20 observations is 1.881.


b.

The median is found as the average of the 10th and 11th observations, once the data have
been ordered. The ordered data are:
1.06 1.09 1.19 1.26 1.27 1.40 1.51 1.72 1.95 2.03 2.05 2.13 2.13 2.16 2.24 2.31 2.41 2.50 2.57 2.64

The 10th and 11th observations are 2.03 and 2.05. The median is:
   (2.03 + 2.05)/2 = 4.08/2 = 2.04

The middle surface roughness measurement is 2.04. Half of the sample measurements
were less than 2.04 and half were greater than 2.04.


c.

The data are somewhat skewed to the left. Thus, the median might be a better measure of
central tendency than the mean. The few small values in the data tend to make the mean
smaller than the median.

2.48

a.

Using MINITAB, the stem-and-leaf display is:


Stem-and-leaf of PAF   N = 17
Leaf Unit = 1.0

    6    0  000009
    8    1  25
   (2)   2  45
    7    3  13
    5    4  0
    4    5
    4    6  2
    3    7  057

b.

The median is the middle number once the data are arranged in order. The data arranged in
order are: 0, 0, 0, 0, 0, 9, 12, 15, 24, 25, 31, 33, 40, 62, 70, 75, 77.
The middle number or the median is 24.

c.

The mean of the data is x̄ = Σx/n = (77 + 33 + 75 + ... + 31)/17 = 473/17 = 27.82



d.

The number occurring most frequently is 0. The mode is 0.

e.

The mode corresponds to the smallest number. It does not seem to locate the center of the
distribution. Both the mean and the median are in the middle of the stem-and-leaf display.
Thus, it appears that both of them locate the center of the data.

2.50

a.

The sample mean length is:

   x̄ = Σx/n = (42.5 + 44.0 + 41.5 + ... + 36.0)/144 = 6165/144 = 42.81

The average length of the 144 fish is 42.81 cm.


The median is the average of the middle two observations once they have been ordered.
The 72nd and 73rd observations are 45 and 45. The average of these two observations is 45.
Half of the fish lengths are less than 45 cm and half are longer.
The mode is 46 cm. This observation occurred 12 times.
b.

The sample mean weight is:

   x̄ = Σx/n = (732 + 795 + 547 + ... + 1433)/144 = 151159/144 = 1049.72

The average weight of the 144 fish is 1049.72 grams.


The median is the average of the middle two observations once they have been ordered.
The 72nd and 73rd observations are 989 and 1011. The average of these two observations is
median = (989 + 1,011)/2 = 1000

Half of the fish weights are less than 1000 grams and half are heavier.
There are 2 modes, 886 and 1186. Each of these observations occurred 3 times.
c.

The sample mean DDT level is:

   x̄ = Σx/n = (10 + 16 + 23 + ... + 1.9)/144 = 3507.1/144 = 24.35

The average DDT level of the 144 fish is 24.35 parts per million.


The median is the average of the middle two observations once they have been ordered.
The 72nd and 73rd observations are 7.1 and 7.2. The average of these two observations is
median = (7.1 + 7.2)/2 = 7.15

Half of the fish DDT levels are less than 7.15 parts per million and half are greater.
The mode is 12. This observation occurred 8 times.


d.

From the graph in Exercise 2.24a, the data are skewed to the left. This corresponds to the
relationship between the mean and the median. For data skewed to the left, the mean is
less than the median. For the fish lengths, the mean is 42.81 and the median is 45.

e.

From the graph in Exercise 2.24b, the data are slightly skewed to the right. This
corresponds to the relationship between the mean and the median. For data skewed to the
right, the mean is more than the median. For the fish weights, the mean is 1049.72 and the
median is 1000.

f.

From the graph in Exercise 2.24c, the data are skewed to the right. This corresponds to the
relationship between the mean and the median. For data skewed to the right, the mean is
more than the median. For the fish DDT levels, the mean is 24.35 and the median is 7.15.

2.52

a.

Due to the "elite" superstars, the salary distribution is skewed to the right. Since this
implies that the median is less than the mean, the players' association would want to use the
median.

b.

The owners, by the logic of part a, would want to use the mean.

2.54

a.

The sample mean is:

   x̄ = Σx/n = (5 + 3 + 4 + ... + 3)/20 = 80/20 = 4

The sample median is found by finding the average of the 10th and 11th observations once
the data are arranged in order. The data arranged in order are:
1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9 13
The 10th and 11th observations are 3 and 4. The average of these two numbers (median) is:
median = (3 + 4)/2 = 7/2 = 3.5

The mode is the observation appearing the most. For this data set, the mode is 1, which
appears 5 times.


b.

Eliminating the largest number which is 13 results in the following:


The sample mean is:

   x̄ = Σx/n = (5 + 3 + 4 + ... + 3)/19 = 67/19 = 3.53

The sample median is found by finding the middle observation once the data are arranged
in order. The data arranged in order are:
1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9
The 10th observation is 3. The median is 3.
The mode is the observation appearing the most. For this data set, the mode is 1, which
appears 5 times.
By dropping the largest number, the mean is reduced from 4 to 3.53. The median is
reduced from 3.5 to 3. There is no effect on the mode.
c.

The data arranged in order are:


1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9 13
If we drop the lowest 2 and largest 2 observations we are left with:
1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7

The sample 10% trimmed mean is:

   x̄ = Σx/n = (1 + 1 + 1 + ... + 7)/16 = 56/16 = 3.5

The advantage of the trimmed mean over the regular mean is that very large and very small
numbers that could greatly affect the mean have been eliminated.
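A sketch of the same 10% trimming, in Python, using the 20 ordered observations listed above (illustrative only):

data = [1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, 9, 13]

def trimmed_mean(values, trim=0.10):
    # Drop the lowest and highest `trim` fraction of the sorted data,
    # then average what remains.
    v = sorted(values)
    k = int(len(v) * trim)          # number trimmed from EACH end (2 here)
    kept = v[k:len(v) - k]
    return sum(kept) / len(kept)

print(trimmed_mean(data))   # 3.5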


2.56

a. s² = [Σx² − (Σx)²/n] / (n − 1) = [84 − 20²/10] / (10 − 1) = 4.8889
   s = √4.8889 = 2.211

b. s² = [Σx² − (Σx)²/n] / (n − 1) = [380 − 100²/40] / (40 − 1) = 3.3333
   s = √3.3333 = 1.826

c. s² = [Σx² − (Σx)²/n] / (n − 1) = [18 − 17²/20] / (20 − 1) = .1868
   s = √.1868 = .432

2.58

a. Range = 42 − 37 = 5
   s² = [Σx² − (Σx)²/n] / (n − 1) = [7,935 − 199²/5] / (5 − 1) = 3.7
   s = √3.7 = 1.92

b. Range = 100 − 1 = 99
   s² = [Σx² − (Σx)²/n] / (n − 1) = [25,795 − 303²/9] / (9 − 1) = 1,949.25
   s = √1,949.25 = 44.15

c. Range = 100 − 2 = 98
   s² = [Σx² − (Σx)²/n] / (n − 1) = [20,033 − 295²/8] / (8 − 1) = 1,307.84
   s = √1,307.84 = 36.16
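The shortcut formula used in Exercises 2.56 and 2.58 can be verified numerically. The sketch below (illustrative) recomputes part a of Exercise 2.58 from its summary values n = 5, Σx = 199, and Σx² = 7,935.

import math

n, sum_x, sum_x2 = 5, 199, 7935            # summary statistics for Exercise 2.58a

s2 = (sum_x2 - sum_x**2 / n) / (n - 1)     # shortcut formula for the sample variance
s = math.sqrt(s2)

print(round(s2, 1), round(s, 2))           # 3.7 1.92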

2.60

This is one possibility for the two data sets.


Data Set 1: 1, 1, 2, 2, 3, 3, 4, 4, 5, 5
Data Set 2: 1, 1, 1, 1, 1, 5, 5, 5, 5, 5

x̄₁ = Σx/n = (1 + 1 + 2 + 2 + 3 + 3 + 4 + 4 + 5 + 5)/10 = 30/10 = 3
x̄₂ = Σx/n = (1 + 1 + 1 + 1 + 1 + 5 + 5 + 5 + 5 + 5)/10 = 30/10 = 3

Therefore, the two data sets have the same mean. The variances for the two data sets are:

s₁² = [Σx² − (Σx)²/n] / (n − 1) = [110 − 30²/10] / 9 = 20/9 = 2.2222
s₂² = [Σx² − (Σx)²/n] / (n − 1) = [130 − 30²/10] / 9 = 40/9 = 4.4444


The dot diagrams for the two data sets are shown below.

2.62

a. Range = 3 − 0 = 3
   s² = [Σx² − (Σx)²/n] / (n − 1) = [15 − 7²/5] / (5 − 1) = 1.3
   s = √1.3 = 1.1402

b. After adding 3 to each of the data points,
   Range = 6 − 3 = 3
   s² = [Σx² − (Σx)²/n] / (n − 1) = [102 − 22²/5] / (5 − 1) = 1.3
   s = √1.3 = 1.1402

c. After subtracting 4 from each of the data points,
   Range = 1 − (−4) = 3
   s² = [Σx² − (Σx)²/n] / (n − 1) = [39 − (−13)²/5] / (5 − 1) = 1.3
   s = √1.3 = 1.1402
d.

The range, variance, and standard deviation remain the same when any number is added to
or subtracted from each measurement in the data set.
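This invariance is easy to confirm numerically. In the sketch below (illustrative), the five values are a hypothetical sample consistent with the summary quantities used above, not necessarily the exercise data.

import statistics

x = [0, 1, 1, 2, 3]          # hypothetical sample with Σx = 7, Σx² = 15, range 3
y = [v + 3 for v in x]       # each point shifted up by 3
z = [v - 4 for v in x]       # each point shifted down by 4

for s in (x, y, z):
    rng = max(s) - min(s)
    print(rng, round(statistics.variance(s), 2), round(statistics.stdev(s), 4))
# all three lines print: 3 1.3 1.1402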

2.64

a.

The maximum age is 64. The minimum age is 39. The range is 64 − 39 = 25.

b.

The variance is:

   s² = [Σx² − (Σx)²/n] / (n − 1) = [125,764 − 2494²/50] / (50 − 1) = 27.822

c.

The standard deviation is:

   s = √s² = √27.822 = 5.275

d.


Since the standard deviation of the ages of the 50 most powerful women in Europe is 10
years and is greater than that in the U.S. (5.275 years), the age data for Europe is more
variable.


2.66

a.

The maximum weight is 1.1 carats. The minimum weight is .18 carats. The range is
1.1 − .18 = .92 carats.

b.

The variance is:

   s² = [Σx² − (Σx)²/n] / (n − 1) = [146.19 − 194.32²/308] / (308 − 1) = .0768 square carats
c.

The standard deviation is:


s = √s² = √.0768 = .2772 carats


d.

The standard deviation. This gives us an idea about how spread out the data are in the
same units as the original data.

2.68

a.

A worker's overall time to complete the operation under study is determined by adding the
subtask-time averages.
Worker A
   The average for subtask 1 is: x̄ = Σx/n = 211/7 = 30.14
   The average for subtask 2 is: x̄ = Σx/n = 21/7 = 3
   Worker A's overall time is 30.14 + 3 = 33.14.

Worker B
   The average for subtask 1 is: x̄ = Σx/n = 213/7 = 30.43
   The average for subtask 2 is: x̄ = Σx/n = 29/7 = 4.14
   Worker B's overall time is 30.43 + 4.14 = 34.57.

b.

Worker A
   s = √{[Σx² − (Σx)²/n] / (n − 1)} = √{[6455 − 211²/7] / (7 − 1)} = √15.8095 = 3.98

Worker B
   s = √{[6487 − 213²/7] / (7 − 1)} = √.9524 = .98

c.

The standard deviations represent the amount of variability in the time it takes the worker
to complete subtask 1.


d.

Worker A
   s = √{[67 − 21²/7] / (7 − 1)} = √.6667 = .82

Worker B
   s = √{[147 − 29²/7] / (7 − 1)} = √4.4762 = 2.12

e.

I would choose workers similar to worker B to perform subtask 1. Worker B has a slightly
higher average time on subtask 1 (A: x = 30.14, B: x = 30.43). But, Worker B has a
smaller variability in the time it takes to complete subtask 1 (part b). He or she is more
consistent in the time needed to complete the task.
I would choose workers similar to Worker A to perform subtask 2. Worker A has a smaller
average time on subtask 2 (A: x = 3, B: x = 4.14). Worker A also has a smaller
variability in the time needed to complete subtask 2 (part d).

2.70

Since no information is given about the data set, we can only use Chebyshev's Rule.

a. Nothing can be said about the percentage of measurements which will fall between x̄ − s and x̄ + s.

b. At least 3/4 or 75% of the measurements will fall between x̄ − 2s and x̄ + 2s.

c. At least 8/9 or 89% of the measurements will fall between x̄ − 3s and x̄ + 3s.

2.72

a. x̄ = Σx/n = 206/25 = 8.24

   s² = [Σx² − (Σx)²/n] / (n − 1) = [1778 − 206²/25] / (25 − 1) = 3.357

   s = √s² = 1.83

b.
   Interval                     Number of Measurements in Interval    Percentage
   x̄ ± s, or (6.41, 10.07)                    18                      18/25 = .72 or 72%
   x̄ ± 2s, or (4.58, 11.90)                   24                      24/25 = .96 or 96%
   x̄ ± 3s, or (2.75, 13.73)                   25                      25/25 = 1 or 100%

c.

The percentages in part b are in agreement with Chebyshev's Rule and agree fairly well
with the percentages given by the Empirical Rule.
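Counting the observations inside x̄ ± ks, as in part b, can be automated. The sketch below is illustrative; the list named data is a hypothetical placeholder for the 25 observations.

import statistics

data = [8, 9, 7, 10, 8]   # hypothetical placeholder; substitute the 25 sample values

xbar = statistics.mean(data)
s = statistics.stdev(data)

for k in (1, 2, 3):
    lo, hi = xbar - k * s, xbar + k * s
    inside = sum(lo <= v <= hi for v in data)
    print(k, round(inside / len(data), 2))   # proportion within k standard deviations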


d.

Range = 12 − 5 = 7
s ≈ range/4 = 7/4 = 1.75
The range approximation provides a satisfactory estimate of s = 1.83 from part a.

2.74

From Chebyshev's Theorem, we know that at least 3/4 or 75% of all observations will fall within
2 standard deviations of the mean. From Exercise 2.47, x̄ = .631. From Exercise 2.66,
s = .2772. This interval is:

   x̄ ± 2s ⇒ .631 ± 2(.2772) ⇒ .631 ± .5544 ⇒ (.0766, 1.1854)

2.76

a.

From the information given, we have x = 375 and s = 25. From Chebyshev's Rule, we
know that at least three-fourths of the measurements are within the interval:
x̄ ± 2s, or (325, 425)

Thus, at most one-fourth of the measurements exceed 425. In other words, more than 425
vehicles used the intersection on at most 25% of the days.
b.

According to the Empirical Rule, approximately 95% of the measurements are within the
interval:
x̄ ± 2s, or (325, 425)

This leaves approximately 5% of the measurements to lie outside the interval. Because of
the symmetry of a mound-shaped distribution, approximately 2.5% of these will lie below
325, and the remaining 2.5% will lie above 425. Thus, on approximately 2.5% of the days,
more than 425 vehicles used the intersection.
2.78

a.

Since the sample mean (18.2) is larger than the sample median (15), it indicates that the
distribution of years is skewed to the right. In addition, the maximum number of years is
50 and the minimum is 2. If the distribution were symmetric, the mean and median should
be about halfway between these two numbers. Halfway between the maximum and
minimum values is 26, which is much larger than either the mean or the median.

b.

The standard deviation can be estimated by the range divided by either 4 or 6. For this
distribution, the range is:
Range = Largest − smallest = 50 − 2 = 48.
Dividing the range by 4, we get an estimate of the standard deviation to be 48/4 = 12.
Dividing the range by 6, we get an estimate of the standard deviation to be 48/6 = 8.
Thus, the standard deviation should be somewhere between 8 and 12. For this problem, the
standard deviation is s = 10.64. This value falls in the estimated range of 8 to 12.

Methods for Describing Sets of Data

25

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

c.

First, we calculate the number of standard deviations from the mean the value of 40 years
is. To do this, we first subtract the mean and then divide by the value of the standard
deviation.
Number of standard deviations is (40 − x̄)/s = (40 − 18.2)/10.64 = 2.05 ≈ 2
Using Chebyshev's Rule, we know that at most 1/k² or 1/2² = 1/4 of the data will be more
than 2 standard deviations from the mean. Thus, this would indicate that at most 25% of
the Generation Xers responded with 40 years or more.
Next, we calculate the number of standard deviations from the mean the value of 8 years is.
Number of standard deviations is (8 − x̄)/s = (8 − 18.2)/10.64 = −.96 ≈ −1

Using Chebyshev's Rule, we get no information about the data within 1 standard deviation
of the mean. However, we know the median (15) is more than 8. By definition, 50% of
the data are larger than the median. Thus, at least 50% of the Generation Xers responded
with 8 years or more. No additional information can be obtained with the information
given.
2.80

a.

Using MINITAB, the frequency histogram for the time in bankruptcy is:


The Empirical Rule is not applicable because the data are not mound shaped.


b. Using MINITAB, the descriptive measures are:


Descriptive Statistics: Time in Bankrupt

Variable    N     Mean   Median   TrMean   StDev   SE Mean
Time in    49    2.549    1.700    2.333   1.828     0.261

Variable    Minimum   Maximum       Q1       Q3
Time in       1.000    10.100    1.350    3.500

From Chebyshev's Theorem, we know that at least 75% of the observations will fall within
2 standard deviations of the mean. This interval is:

   x̄ ± 2s ⇒ 2.549 ± 2(1.828) ⇒ 2.549 ± 3.656 ⇒ (−1.107, 6.205)

c. There are 47 of the 49 observations within this interval. The percentage would be
(47/49)*100% = 95.9%. This agrees with Chebyshev's Theorem (at least 75%). It also
agrees with the Empirical Rule (approximately 95%).
d. From the above interval we know that about 95% of all firms filing for prepackaged
bankruptcy will be in bankruptcy between 0 and 6.2 months. Thus, we would estimate that a
firm considering filing for bankruptcy will be in bankruptcy up to 6.2 months.
2.82


a.

Since it is given that the distribution is mound-shaped, we can use the Empirical Rule. We
know that 1.84% is 2 standard deviations below the mean. The Empirical Rule states that
approximately 95% of the observations will fall within 2 standard deviations of the mean and,
consequently, approximately 5% will lie outside that interval. Since a mound-shaped
distribution is symmetric, then approximately 2.5% of the day's production of batches will
fall below 1.84%.

b.

If the data are actually mound-shaped, it would be extremely unusual (less than 2.5%) to
observe a batch with 1.80% zinc phosphide if the true mean is 2.0%. Thus, if we did
observe 1.8%, we would conclude that the mean percent of zinc phosphide in today's
production is probably less than 2.0%.

2.84

a.

Since we do not have any idea of the shape of the distribution of SAT-Math score
changes, we must use Chebyshev's Theorem. We know that at least 8/9 of the
observations will fall within 3 standard deviations of the mean. This interval would be:

   x̄ ± 3s ⇒ 19 ± 3(65) ⇒ 19 ± 195 ⇒ (−176, 214)

Thus, for a randomly selected student, we could be pretty sure that this student's score
would be anywhere from 176 points below his/her previous SAT-Math score to 214 points
above his/her previous SAT-Math score.
b.

Since we do not have any idea of the shape of the distribution of SAT-Verbal score
changes, we must use Chebyshev's Theorem. We know that at least 8/9 of the
observations will fall within 3 standard deviations of the mean. This interval would be:

   x̄ ± 3s ⇒ 7 ± 3(49) ⇒ 7 ± 147 ⇒ (−140, 154)


Thus, for a randomly selected student, we could be pretty sure that this student's score
would be anywhere from 140 points below his/her previous SAT-Verbal score to 154
points above his/her previous SAT-Verbal score.


c.

A change of 140 points on the SAT-Math would be a little less than 2 standard deviations
from the mean. A change of 140 points on the SAT-Verbal would be a little less than 3
standard deviations from the mean. Since the 140 point change for the SAT-Math is not as
big a change as the 140 point on the SAT-Verbal, it would be most likely that the score was
a SAT-Math score.

2.86

a. z = (x − x̄)/s = (40 − 30)/5 = 2 (sample)   2 standard deviations above the mean.

b. z = (x − μ)/σ = (90 − 89)/2 = .5 (population)   .5 standard deviations above the mean.

c. z = (x − μ)/σ = (50 − 50)/5 = 0 (population)   0 standard deviations above the mean.

d. z = (x − x̄)/s = (20 − 30)/4 = −2.5 (sample)   2.5 standard deviations below the mean.
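All four parts use the same z-score formula; a small illustrative Python helper makes the sample/population distinction explicit.

def z_score(x, center, spread):
    # center = x-bar and spread = s for a sample; center = mu and spread = sigma for a population
    return (x - center) / spread

print(z_score(40, 30, 5))   #  2.0  (part a)
print(z_score(90, 89, 2))   #  0.5  (part b)
print(z_score(50, 50, 5))   #  0.0  (part c)
print(z_score(20, 30, 4))   # -2.5  (part d)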

2.88

The 50th percentile of a data set is the observation that has half of the observations less than it.
Another name for the 50th percentile is the median.

2.90

Since the element 40 has a z-score of −2 and 90 has a z-score of 3,

   −2 = (40 − μ)/σ   and   3 = (90 − μ)/σ

   −2σ = 40 − μ ⇒ μ = 40 + 2σ
   3σ = 90 − μ ⇒ μ + 3σ = 90

By substitution, 40 + 2σ + 3σ = 90 ⇒ 5σ = 50 ⇒ σ = 10
By substitution, μ = 40 + 2(10) = 60
Therefore, the population mean is 60 and the standard deviation is 10.
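The same two z-score equations can be solved as a 2-by-2 linear system. The NumPy sketch below is illustrative only.

import numpy as np

# From -2 = (40 - mu)/sigma and 3 = (90 - mu)/sigma:
#   mu - 2*sigma = 40
#   mu + 3*sigma = 90
A = np.array([[1.0, -2.0],
              [1.0,  3.0]])
b = np.array([40.0, 90.0])

mu, sigma = np.linalg.solve(A, b)
print(mu, sigma)   # 60.0 10.0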
2.92


The percentile ranking of the age of 25 years would be 100% − 73.5% = 26.5%.


2.94

a.

From Exercise 2.77, x̄ = 94.91 and s = 4.83. The z-score for an observation of 78 is:

   z = (x − x̄)/s = (78 − 94.91)/4.83 = −3.50

This z-score indicates that an observation of 78 is 3.5 standard deviations below the
mean. Very few observations will be lower than this one.
b.

The z-score for an observation of 98 is:


   z = (x − x̄)/s = (98 − 94.91)/4.83 = 0.63

This z-score indicates that an observation of 98 is .63 standard deviations above the
mean. This score is not an unusual observation in the data set.
2.96

a.

From the problem, μ = 2.7 and σ = .5.

   z = (x − μ)/σ ⇒ zσ = x − μ ⇒ x = μ + zσ

For z = 2.0, x = 2.7 + 2.0(.5) = 3.7
For z = −1.0, x = 2.7 − 1.0(.5) = 2.2
For z = .5, x = 2.7 + .5(.5) = 2.95
For z = −2.5, x = 2.7 − 2.5(.5) = 1.45
b.

For z = −1.6, x = 2.7 − 1.6(.5) = 1.9

c.

If we assume the distribution of GPAs is


approximately mound-shaped, we can use the
Empirical Rule.
From the Empirical Rule, we know that .025
or 2.5% of the students will have GPAs
above 3.7 (with z = 2). Thus, the GPA
corresponding to summa cum laude (top
2.5%) will be greater than 3.7 (z > 2).
We know that .16 or 16% of the students will have GPAs above 3.2 (z = 1). Thus, the
limit on GPAs for cum laude (top 16%) will be greater than 3.2 (z > 1).
We must assume the distribution is mound-shaped.

Methods for Describing Sets of Data

29

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

2.98

a.

Since the data are approximately mound-shaped, we can use the Empirical Rule.
On the blue exam, the mean is 53% and the standard deviation is 15%. We know that
approximately 68% of all students will score within 1 standard deviation of the mean.
This interval is:
x̄ ± s ⇒ 53 ± 15 ⇒ (38, 68)

About 95% of all students will score within 2 standard deviations of the mean. This
interval is:
x̄ ± 2s ⇒ 53 ± 2(15) ⇒ 53 ± 30 ⇒ (23, 83)

About 99.7% of all students will score within 3 standard deviations of the mean. This
interval is:
x̄ ± 3s ⇒ 53 ± 3(15) ⇒ 53 ± 45 ⇒ (8, 98)

b.

Since the data are approximately mound-shaped, we can use the Empirical Rule.
On the red exam, the mean is 39% and the standard deviation is 12%. We know that
approximately 68% of all students will score within 1 standard deviation of the mean.
This interval is:
x̄ ± s ⇒ 39 ± 12 ⇒ (27, 51)

About 95% of all students will score within 2 standard deviations of the mean. This
interval is:
x̄ ± 2s ⇒ 39 ± 2(12) ⇒ 39 ± 24 ⇒ (15, 63)

About 99.7% of all students will score within 3 standard deviations of the mean. This
interval is:

x̄ ± 3s ⇒ 39 ± 3(12) ⇒ 39 ± 36 ⇒ (3, 75)

c.


The student would have been more likely to have taken the red exam. For the blue exam,
we know that approximately 95% of all scores will be from 23% to 83%. The observed
20% score does not fall in this range. For the red exam, we know that approximately
95% of all scores will be from 15% to 63%. The observed 20% score does fall in this
range. Thus, it is more likely that the student would have taken the red exam.

2.100

The 25th percentile, or lower quartile, is the measurement that has 25% of the measurements
below it and 75% of the measurements above it. The 50th percentile, or median, is the
measurement that has 50% of the measurements below it and 50% of the measurements above it.
The 75th percentile, or upper quartile, is the measurement that has 75% of the measurements
below it and 25% of the measurements above it.


2.102

a.

Median is approximately 4.

b.

QL is approximately 3 (Lower Quartile)


QU is approximately 6 (Upper Quartile)


c.

IQR = QU − QL = 6 − 3 = 3

d.

The data set is skewed to the right since the right whisker is longer than the left, there is
one outlier, and there are two potential outliers.

e.

50% of the measurements are to the right of the median and 75% are to the left of the upper
quartile.

f.

There are two potential outliers, 12 and 13. There is one outlier, 16.

2.104

a. From the problem, x̄ = 52.33 and s = 9.22.


The highest salary is 75 (thousand).
The z-score is z = (x − x̄)/s = (75 − 52.33)/9.22 = 2.46

Therefore, the highest salary is 2.46 standard deviations above the mean.
The lowest salary is 35.0 (thousand).
The z-score is z = (x − x̄)/s = (35.0 − 52.33)/9.22 = −1.88

Therefore, the lowest salary is 1.88 standard deviations below the mean.
The mean salary offer is 52.33 (thousand).
The z-score is z = (x − x̄)/s = (52.33 − 52.33)/9.22 = 0

The z-score for the mean salary offer is 0 standard deviations from the mean.
No, the highest salary offer is not unusually high. For any distribution, at least 8/9 of the
salaries should have z-scores between −3 and 3. A z-score of 2.46 would not be that
unusual.


b.

Using MINITAB, the box plot is:

Since no salaries are outside the inner fences, none of them are potentially faulty observations.
2.106

Using MINITAB, the side-by-side box plots are:


   (Side-by-side box plots of AGE for each GROUP)

From the boxplots, there appears to be one outlier in the third group.
2.108

a.

First, we will compute the mean and standard deviation.


The sample mean is:

   x̄ = Σx/n = 393/75 = 5.24

The sample variance is:

   s² = [Σx² − (Σx)²/n] / (n − 1) = [5943 − 393²/75] / (75 − 1) = 52.482

The standard deviation is:


s = √s² = √52.482 = 7.244

Since this data set is highly skewed, we will use 2 standard deviations from the mean as
the cutoff for outliers. Z-scores with values greater than 2 in absolute value are
considered outliers. An observation with a z-score of 2 would have the value:
   z = (x − x̄)/s ⇒ 2 = (x − 5.24)/7.244 ⇒ 2(7.244) = x − 5.24 ⇒ 14.488 = x − 5.24 ⇒ x = 19.728

An observation with a z-score of -2 would have the value:


   z = (x − x̄)/s ⇒ −2 = (x − 5.24)/7.244 ⇒ −2(7.244) = x − 5.24 ⇒ −14.488 = x − 5.24 ⇒ x = −9.248

Thus any observation that is greater than 19.728 or less than −9.248 would be
considered an outlier. In this data set there would be 4 outliers: 21, 21, 25, 48.
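The outlier rule applied here (flag values more than 2 standard deviations from the mean) is easy to program. The sketch below is illustrative; the list named sample is a hypothetical subset that includes the four flagged values, since the full 75 observations are not listed.

def outliers(values, xbar, s, k=2):
    # Flag observations more than k standard deviations from the mean.
    return [v for v in values if abs((v - xbar) / s) > k]

sample = [1, 2, 3, 5, 21, 21, 25, 48]          # hypothetical subset of the data
print(outliers(sample, xbar=5.24, s=7.244))    # [21, 21, 25, 48]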
b.

Deleting these 4 outliers, we will recalculate the mean, median, variance, and standard
deviation. The median for the original data set is the middle number once they have been
arranged in order and is the 38th observation which is 3.
The new mean is:

   x̄ = Σx/n = 278/71 = 3.92

The new sample variance is:

   s² = [Σx² − (Σx)²/n] / (n − 1) = [2132 − 278²/71] / (71 − 1) = 14.907
The new standard deviation is:
s = √s² = √14.907 = 3.861

The new median is the 36th observation once the data have been arranged in order and is 3.
In the original data set, the mean is 5.24, the standard deviation is 7.244, and the median
is 3. In the revised data set, the mean is 3.92, the standard deviation is 3.861, and the
median is 3. The mean has been decreased, the standard deviation has been almost
halved, but the median stays the same.


2.110

For Perturbed Intrinsics, but no Perturbed Projections:

   x̄ = Σx/n = (1.0 + 1.3 + 3.0 + 1.5 + 1.3)/5 = 8.1/5 = 1.62

   s² = [Σx² − (Σx)²/n] / (n − 1) = [15.63 − 8.1²/5] / (5 − 1) = 2.508/4 = .627

   s = √s² = √.627 = .792

The z-score corresponding to a value of 4.5 is


   z = (x − x̄)/s = (4.5 − 1.62)/.792 = 3.63

Since this z-score is greater than 3, we would consider this an outlier for perturbed
intrinsics, but no perturbed projections.
For Perturbed Projections, but no Perturbed Intrinsics:

   x̄ = Σx/n = (22.9 + 21.0 + 34.4 + 29.8 + 17.7)/5 = 125.8/5 = 25.16

   s² = [Σx² − (Σx)²/n] / (n − 1) = [3350.1 − 125.8²/5] / (5 − 1) = 184.972/4 = 46.243

   s = √s² = √46.243 = 6.800

The z-score corresponding to a value of 4.5 is


   z = (x − x̄)/s = (4.5 − 25.16)/6.800 = −3.038

Since this z-score is less than -3, we would consider this an outlier for perturbed
projections, but no perturbed intrinsics.
Since the z-score corresponding to 4.5 for the perturbed projections, but no perturbed
intrinsics is smaller than that for perturbed intrinsics, but no perturbed projections, it is
more likely that the type of camera perturbation is perturbed projections, but no
perturbed intrinsics.


2.112

Using MINITAB, a scatterplot of the data is:


   (Scatterplot of Var2 vs Var1)

2.114

Using MINITAB, the scatterplot of the data is:

   (Scatterplot of Lawyers vs Offices)

As the number of offices increases, the number of lawyers also tends to increase.
2.116

a.

Using MINITAB, the scatterplot is:


   (Scatterplot of 30th-trial completion time vs 10th-trial completion time)

It appears that as the completion time for the 10th trial increases, the completion time for
the 30th trial decreases.


b.

Using MINITAB, the scatterplot is:


   (Scatterplot of 50th-trial completion time vs 10th-trial completion time)

It appears that as the completion time for the 10th trial increases, the completion time for
the 50th trial increases.
c.

Using MINITAB, the scatterplot is:


   (Scatterplot of 50th-trial completion time vs 30th-trial completion time)

It appears that as the completion time for the 30th trial increases, the completion time for
the 50th trial increases.


2.118

Using MINITAB, the scatterplot of the data is:


Scatterplot of Mass vs Time

There is evidence to indicate that the mass of the spill tends to diminish as time
increases. As time is getting larger, the mass is decreasing.
2.120

The mean is sensitive to extreme values in a data set. Therefore, the median is preferred to the
mean when a data set is skewed in one direction or the other.

2.122

a.

If we assume that the data are about mound-shaped, then any observation with a
z-score greater than 3 in absolute value would be considered an outlier. From Exercise
1.121, the z-score corresponding to 50 is 1, the z-score corresponding to 70 is 1, and the
z-score corresponding to 80 is 2. Since none of these z-scores is greater than 3 in absolute
value, none would be considered outliers.

b.

From Exercise 1.121, the z-score corresponding to 50 is 2, the z-score corresponding to


70 is 2, and the z-score corresponding to 80 is 4. Since the z-score corresponding to 80 is
greater than 3, 80 would be considered an outlier.

c.

From Exercise 1.121, the z-score corresponding to 50 is 1, the z-score corresponding to 70


is 3, and the z-score corresponding to 80 is 4. Since the z-scores corresponding to 70 and
80 are greater than or equal to 3, 70 and 80 would be considered outliers.

d.

From Exercise 1.121, the z-score corresponding to 50 is .1, the z-score corresponding to 70
is .3, and the z-score corresponding to 80 is .4. Since none of these z-scores is greater than
3 in absolute value, none would be considered outliers.


2.124

a. Σx = 4 + 6 + 6 + 5 + 6 + 7 = 34
   Σx² = 4² + 6² + 6² + 5² + 6² + 7² = 198
   x̄ = Σx/n = 34/6 = 5.67
   s² = [Σx² − (Σx)²/n] / (n − 1) = [198 − 34²/6] / (6 − 1) = 5.3333/5 = 1.0667
   s = √1.0667 = 1.03

b. Σx = (−1) + 4 + (−3) + 0 + (−3) + (−6) = −9
   Σx² = (−1)² + 4² + (−3)² + 0² + (−3)² + (−6)² = 71
   x̄ = Σx/n = −9/6 = −$1.5
   s² = [Σx² − (Σx)²/n] / (n − 1) = [71 − (−9)²/6] / (6 − 1) = 57.5/5 = 11.5 dollars squared
   s = √11.5 = $3.39

c. Σx = 3/5 + 4/5 + 2/5 + 1/5 + 1/16 = 2.0625
   Σx² = (3/5)² + (4/5)² + (2/5)² + (1/5)² + (1/16)² = 1.2039
   x̄ = Σx/n = 2.0625/5 = .4125%
   s² = [Σx² − (Σx)²/n] / (n − 1) = [1.2039 − 2.0625²/5] / (5 − 1) = .3531/4 = .0883% squared
   s = √.0883 = .30%

d. (a) Range = 7 − 4 = 3
   (b) Range = $4 − (−$6) = $10
   (c) Range = (4/5)% − (1/16)% = (64/80)% − (5/80)% = (59/80)% = .7375%

2.126

range/4 = 20/4 = 5
16
80
80
80

range/4 = 20/4 = 5

Chapter 2

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

2.128

Using MINITAB, a pie chart of the data is:


Pie Chart of defect
C ategory
false
true

true
9.8%

false
90.2%

A response of true means the software contained defective code. Thus, only 9.8% of the
modules contained defective software code.
2.130

The z-score would be:


   z = (x − x̄)/s = (408 − 603.7)/185.4 = −1.06

Since this value is not very big, this is not an unusual value to observe.
2.132

a.

The variable of interest is opinion of book reviews. The values could be "would not
recommend," "cautious or very little recommendation," "little or no preference,"
"favorable/recommended," and "outstanding/significant contribution." Since these
responses are not numerical, the variable is qualitative.

b.

Most of the books (63%) received a "favorable/recommended" review. About the same
percentage of books received the following reviews: "cautious or very little
recommendation" (10%), "little or no preference" (9%), and "outstanding/significant
contribution" (12%). Only 5% of the books received "would not recommend" reviews.

c.

If the top two categories are added together, the percent recommended is 75% (actually
slightly higher than 75%). This agrees with the study.

2.134

a.

To display the status, we use a pie chart. From the pie chart,
we see that 58% of the Beanie babies are retired and 42%
are current.


b.

Using Minitab, a histogram of the values is:

Most (40 of 50) Beanie babies have values less than $100. Of the remaining 10, 5 have
values between $100 and $300, 1 has a value between $300 and $500, 1 has a value
between $500 and $700, 2 have values between $700 and $900, and 1 has a value between
$1900 and $2100.
c.

A plot of the value versus the age of the Beanie Baby is as follows:

From the plot, it appears that as the age increases, the value tends to increase.
2.136

a.

Using MINITAB, the stem-and-leaf display is:


Stem-and-leaf of C1    N = 46
Leaf Unit = 0.10

   4    0  3444
 (25)   0  55555555566666667777788889
  16    1  000011222334
   4    1  77
   2    2
   2    2
   2    3
   2    3  9
   1    4
   1    4  7


b.

The leaves that represent those brands that carry the American Dental Association seal are
circled above.

c.

It appears that the brands approved by the ADA tend to have lower costs.
Thirteen of the twenty brands approved by the ADA, or (13/20) × 100% = 65%, are less
than the median cost.

2.138

a.

Using MINITAB, the summary statistics are:

Descriptive Statistics: Marketing, Engineering, Accounting, Total

Variable     N     Mean   Median   TrMean   StDev   SE Mean
Marketin    50    4.766    5.400    4.732   2.584     0.365
Engineer    50    5.044    4.500    4.798   3.835     0.542
Accounti    50    3.652    0.800    2.548   6.256     0.885
Total       50   13.462   13.750   13.043   6.820     0.965

Variable   Minimum   Maximum       Q1       Q3
Marketin     0.100    11.000    2.825    6.250
Engineer     0.400    14.400    1.775    7.225
Accounti     0.100    30.000    0.200    3.725
Total        1.800    36.200    8.075   16.600

b.

The z-scores corresponding to the maximum time guidelines developed for each
department and the total are as follows:

Marketing:    z = (x − x̄)/s = (6.5 − 4.77)/2.58 = .67

Engineering:  z = (x − x̄)/s = (7.0 − 5.04)/3.84 = .51

Accounting:   z = (x − x̄)/s = (8.5 − 3.65)/6.26 = .77

Total:        z = (x − x̄)/s = (17 − 13.46)/6.82 = .52

c.

To find the maximum processing time corresponding to a z-score of 3, we substitute the
values of z, x̄, and s into the z formula and solve for x:

z = (x − x̄)/s  ⇒  x − x̄ = zs  ⇒  x = x̄ + zs

Marketing:

x = 4.77 + 3(2.58) = 4.77 + 7.74 = 12.51


None of the orders exceed this time.

Engineering:

x = 5.04 + 3(3.84) = 5.04 + 11.52 = 16.56


None of the orders exceed this time.

These both agree with both the Empirical Rule and Chebyshev's Rule.


Accounting:

x = 3.65 + 3(6.26) = 3.65 + 18.78 = 22.43


One of the orders exceeds this time or 1/50 = .02.

Total:

x = 13.46 + 3(6.82) = 13.46 + 20.46 = 33.92


One of the orders exceeds this time or 1/50 = .02.

These both agree with Chebyshev's Rule but not the Empirical Rule. Both of these last two
distributions are skewed to the right.
d.

Marketing:

x = 4.77 + 2(2.58) = 4.77 + 5.16 = 9.93


Two of the orders exceed this time or 2/50 = .04.

Engineering:

x = 5.04 + 2(3.84) = 5.04 + 7.68 = 12.72


Two of the orders exceed this time or 2/50 = .04.

Accounting:

x = 3.65 + 2(6.26) = 3.65 + 12.52 = 16.17


Three of the orders exceed this time or 3/50 = .06.

Total:

x = 13.46 + 2(6.82) = 13.46 + 13.64 = 27.10


Two of the orders exceed this time or 2/50 = .04.

All of these agree with Chebyshev's Rule but not the Empirical Rule.
e.

No observations exceed the guideline of 3 standard deviations for both Marketing and
Engineering. One observation exceeds the guideline of 3 standard deviations for both
Accounting (#23, time = 30.0 days) and Total (#23, time = 36.2 days). Therefore, only
(1/10) × 100% = 10% of the "lost" quotes have times exceeding at least one of the 3 standard
deviation guidelines.
Two observations exceed the guideline of 2 standard deviations for both Marketing (#31,
time = 11.0 days and #48, time = 10.0 days) and Engineering (#4, time = 13.0 days and
#49, time = 14.4 days). Three observations exceed the guideline of 2 standard deviations
for Accounting (#20, time = 22.0 days; #23, time = 30.0 days; and #36, time = 18.2 days).
Two observations exceed the guideline of 2 standard deviations for Total (#20, time = 30.2
days and #23, time = 36.2 days). Therefore, (7/10) × 100% = 70% of the "lost" quotes
have times exceeding at least one of the 2 standard deviation guidelines.
We would recommend the 2 standard deviation guideline since it covers 70% of the lost
quotes, while having very few other quotes exceed the guidelines.
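As an optional aside, the guideline cutoffs in parts c and d can be recomputed from the summary statistics in part a; the sketch below only reproduces the x̄ + zs calculations, since the raw processing times are not listed here.

# Guideline cutoffs x-bar + 2s and x-bar + 3s from the reported means and standard deviations.
stats = {                      # (mean, standard deviation) by department
    "Marketing":   (4.77, 2.58),
    "Engineering": (5.04, 3.84),
    "Accounting":  (3.65, 6.26),
    "Total":       (13.46, 6.82),
}
for dept, (xbar, s) in stats.items():
    for z in (2, 3):
        print(dept, "x-bar +", z, "s =", round(xbar + z * s, 2))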

2.140

a.

First, construct a relative frequency distribution for the departments.


Class   Department        Frequency   Relative Frequency
  1     Production            13            .241
  2     Maintenance           31            .574
  3     Sales                  3            .056
  4     R&D                    2            .037
  5     Administration         5            .093
        TOTAL                 54           1.001


The Pareto diagram is:


From the diagram, it is evident that
the departments with the worst safety
record are Maintenance and Production.

b.

First, construct a relative frequency


distribution for the type of injury in the maintenance department.
Class   Injury           Frequency   Relative Frequency
  1     Burn                  6            .194
  2     Back strain           5            .161
  3     Eye damage            2            .065
  4     Cuts                 10            .323
  5     Broken arm            2            .065
  6     Broken leg            1            .032
  7     Concussion            3            .097
  8     Hearing loss          2            .065
        TOTAL                31           1.002

The Pareto diagram is:


From the Pareto diagram, it is
evident that cuts is the most
prevalent type of injury. Burns and
back strain are the next most
prevalent types of injuries.

2.142

a.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: MPG

Variable     N     Mean   Median   TrMean   StDev   SE Mean
MPG         36   40.056   40.000   40.063   2.177     0.363

Variable   Minimum   Maximum       Q1       Q3
MPG         35.000    45.000   39.000   41.000


The mean is 40.056 and the standard deviation is 2.177. Both of these measures are
measured in the same units as the original data, which are miles per gallon.
b.

Since the sample mean is a good estimate of the population mean, the manufacturer should
be satisfied. The sample mean is 40.056 which is greater than 40.

c.

The range of the data set is 45 − 35 = 10. Using Chebyshev's Rule, the range should cover
approximately 6 standard deviations. Thus, a good estimate of the standard deviation
would be 10/6 = 1.67. Using the Empirical Rule, the range should cover approximately 4
standard deviations. Thus, a good estimate of the standard deviation would be 10/4 = 2.5.
The given standard deviation is 2.177, which is between these two estimates. Thus, it is a
reasonable value.

d.

Using MINITAB, the frequency histogram is (the relative frequency histogram would have
the same shape):

(Frequency histogram of MPG, with MPG values 35 through 45 on the horizontal axis and frequency on the vertical axis)

Yes, the data appear to be mound-shaped.


e.

Because the data are mound-shaped, we can use the Empirical Rule. We would expect
approximately 68% of the data within the interval x̄ ± s, approximately 95% of the data
within the interval x̄ ± 2s, and approximately all of the data within the interval x̄ ± 3s.

f.

The interval x̄ ± s is 40.056 ± 2.177, or (37.879, 42.233). Twenty-seven of the
observations fall in this interval, or 27/36 = .75 or 75%. This number is a little larger than
68%.

The interval x̄ ± 2s is 40.056 ± 2(2.177), or (35.702, 44.410). Thirty-four of the
observations fall in this interval, or 34/36 = .94 or 94%. This number is very close to 95%.

The interval x̄ ± 3s is 40.056 ± 3(2.177), or (33.525, 46.587). Thirty-six of the
observations fall in this interval, or 36/36 = 1.00 or 100%. This number is the same as all
of the observations.
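As an optional sketch of the interval checks in part f: the endpoints below match those computed above from x̄ = 40.056 and s = 2.177, while the list of 36 MPG observations is left as a placeholder since the raw data are not reproduced here.

# Empirical Rule coverage check; `mpg` is a placeholder for the 36 observations.
xbar, s = 40.056, 2.177
mpg = []  # the 36 MPG values would go here

for k in (1, 2, 3):
    lo, hi = xbar - k * s, xbar + k * s
    inside = sum(lo <= x <= hi for x in mpg)
    print(k, round(lo, 3), round(hi, 3), inside)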


2.144

a.

Both the height and width of the bars (peanuts) change. Thus, some readers may tend to
equate the area of the peanuts with the frequency for each year.

b.

The frequency bar chart is:


The Kentucky Milk Case

(To accompany Chapters 1 and 2)

There are many things that could be included in a report about the possibility of collusion. I have
concentrated on the incumbency rates, bid levels and dispersion, and average winning bids. With the
data available, no comparison of market share can be made since there was so much missing data.
Actually, with the data available, the exact analysis cannot be made, since only the winning bid
information is provided. Thus, we have no idea what the losing bids were. I will present what I think is
a reasonable solution. This is by no means the only solution to the case. Many other presentations
could also be used.

Incumbency Rates
The incumbency rate is the percent of the school districts that are won by the same vendor who won the
previous year. A table containing the incumbency rates is included as well as a plot. Notice in the plot
that the incumbency rates in the Tri-county market are higher than those in the Surrounding market.
From 1985 through 1988, the incumbency rate for the Tri-county market was never lower than .923,
while in the same period in the Surrounding market, the incumbency rate was never higher than .730.
This implies the possibility of collusion in the Tri-county market.

                Surrounding Market                       Tri-county Market
        Number of     Same     Incumbency       Number of     Same     Incumbency
Year    Districts   Vendors       Rate          Districts   Vendors       Rate
1984        26         16         .615              10          8         .800
1985        27         19         .704              12         12        1.000
1986        32         19         .594              13         13        1.000
1987        37         27         .730              13         12         .923
1988        37         25         .676              13         13        1.000
1989        37         23         .622              13          9         .692
1990        34         24         .706              13         10         .769
1991         5          3         .600              13         11         .846


The plot of the incumbency rates is:

Bid Levels and Dispersion


Since we only have access to the winning bids in each of the school districts, we cannot make a true
analysis of the bid levels and dispersions. As a compromise, I have used the winning bids of the two
dairies in question, Trauth and Meyer. I have looked at only the winning bids of these two dairies in
both the Tri-county market and in the Surrounding market. If there was no collusion, then the winning
bids and the dispersions of the winning bids should be similar in the two markets for the two dairies. I
looked at the box plots of the winning bids of the two dairies in each market for each type of milk:
whole white, lowfat white and lowfat chocolate. I have included only a few of the box plots as
illustrations. Those included are for 1985 and 1986.


1985 Winning Bids:

OBS   MARKET   WINNER    WHOLE WHITE   LOWFAT WHITE   LOWFAT CHOCOLATE
  1    SUR     MEYER        0.1280        0.1250          0.1315
  2    SUR     TRAUTH       0.1200        0.1110          0.1090
  3    SUR     TRAUTH          .          0.1079          0.1079
  4    SUR     TRAUTH          .          0.1190          0.1210
  5    SUR     MEYER        0.1225        0.1130          0.1099
  6    SUR     TRAUTH       0.1230        0.1130          0.1120
  7    SUR     MEYER        0.1250        0.1145          0.1140
  8    TRI     TRAUTH       0.1440        0.1440             .
  9    TRI     TRAUTH       0.1450        0.1350             .
 10    TRI     MEYER        0.1410        0.1410          0.1410
 11    TRI     TRAUTH       0.1393        0.1393             .
 12    TRI     MEYER        0.1340        0.1340          0.1340
 13    TRI     MEYER        0.1445        0.1345          0.1395
 14    TRI     MEYER           .          0.1345             .
 15    TRI     TRAUTH       0.1449        0.1349          0.1399
 16    TRI     TRAUTH          .          0.1299          0.1299
 17    TRI     MEYER        0.1480        0.1480          0.1480
 18    TRI     TRAUTH       0.1310        0.1290             .
 19    TRI     MEYER           .          0.1380             .
 20    TRI     TRAUTH       0.1435        0.1335             .

Box Plots for Whole White Milk - 1985

(Side-by-side box plots of the whole white milk winning bids, WWBID, for the Surrounding and Tri-county markets)


Box Plots for Lowfat White Milk - 1985

(Side-by-side box plots of the lowfat white milk winning bids, LFWBID, for the Surrounding and Tri-county markets)

Box Plots for Lowfat Chocolate Milk - 1985

(Side-by-side box plots of the lowfat chocolate milk winning bids, LFCBID, for the Surrounding and Tri-county markets)


For each type of milk, the mean and median winning bids for the Tri-county market were higher than
the corresponding winning bids in the Surrounding market. Also, the dispersion, indicated by the width
of the boxes and the length of the whiskers, for the Surrounding market is larger than for the Tri-county
market in most cases. This is indicative of collusion in the Tri-county market. This same pattern also
existed in 1986.
1986 Winning Bids:

OBS   MARKET   WINNER    WHOLE WHITE   LOWFAT WHITE   LOWFAT CHOCOLATE
  1    SUR     TRAUTH       0.1195        0.1100          0.1085
  2    SUR     TRAUTH       0.1330        0.1240          0.1290
  3    SUR     TRAUTH       0.1140        0.1070          0.1050
  4    SUR     MEYER        0.1350        0.1250          0.1315
  5    SUR     TRAUTH       0.1224        0.1124          0.1110
  6    SUR     TRAUTH          .          0.1110          0.1110
  7    SUR     TRAUTH          .          0.1180          0.1200
  8    SUR     TRAUTH       0.1250        0.1125          0.1115
  9    TRI     TRAUTH       0.1475        0.1475             .
 10    TRI     TRAUTH       0.1469        0.1369             .
 11    TRI     MEYER        0.1440        0.1340          0.1395
 12    TRI     TRAUTH       0.1420        0.1420             .
 13    TRI     MEYER        0.1390        0.1390          0.1390
 14    TRI     MEYER        0.1470        0.1370          0.1420
 15    TRI     MEYER           .          0.1380             .
 16    TRI     TRAUTH       0.1474        0.1374          0.1424
 17    TRI     TRAUTH          .          0.1349          0.1349
 18    TRI     MEYER        0.1505        0.1505          0.1505
 19    TRI     TRAUTH       0.1360        0.1320             .
 20    TRI     MEYER           .          0.1430             .
 21    TRI     TRAUTH       0.1460        0.1360             .

Box Plots for Whole White Milk - 1986

(Side-by-side box plots of the whole white milk winning bids, WWBID, for the Surrounding and Tri-county markets)


Box Plots for Lowfat White Milk - 1986

(Side-by-side box plots of the lowfat white milk winning bids, LFWBID, for the Surrounding and Tri-county markets)

Box Plots for Lowfat Chocolate Milk - 1986

(Side-by-side box plots of the lowfat chocolate milk winning bids, LFCBID, for the Surrounding and Tri-county markets)


The same pattern that existed for 1985 and 1986 also existed in 1984, 1987, and 1988. From 1989 on,
the pattern no longer existed. Thus, from the plots, it appears that the two dairies were working
together from 1984 through 1988 in the Tri-county market.
I also plotted the mean winning bids for the two dairies in each of the two markets from 1984 through
1991 for each type of milk. In all three plots, the mean winning bid in 1983 was almost the same in the
two markets. Then, in 1984, the mean winning bid in the Tri-county market was higher than in the
Surrounding market for all three types of milk. This trend holds basically through 1988 (the lowfat
white milk mean winning bid for the Surrounding market was greater than the mean winning bid in the
Tri-county market in 1988). After 1988, the mean winning bids in the two markets are almost the same.
This points to collusion in the Tri-county market from 1984 through 1988.


The dispersion, measured using the standard deviation, of the winning bids for each of the three types
of milk was basically smaller in the Tri-county market than in the Surrounding market for the years
1985 through 1988. Again, after 1988 this pattern no longer existed. Again, this points to collusion
between the two dairies in the Tri-county market during the years 1984 through 1988.
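As an optional illustration of the comparison described above (not the analysis actually used), the 1985 whole white milk winning bids from the table can be summarized by market; missing bids are simply omitted.

# Mean and dispersion of the 1985 whole white milk winning bids by market.
from statistics import mean, stdev

wwbid_1985 = {
    "Surrounding": [0.1280, 0.1200, 0.1225, 0.1230, 0.1250],
    "Tri-county":  [0.1440, 0.1450, 0.1410, 0.1393, 0.1340,
                    0.1445, 0.1449, 0.1480, 0.1310, 0.1435],
}
for market, bids in wwbid_1985.items():
    print(market, round(mean(bids), 4), round(stdev(bids), 4))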



Probability

Chapter 3

3.2

a.

This is a Venn Diagram.

b.

If the sample points are equally likely, then


P(1) = P(2) = P(3) = ... = P(10) = 1/10

Therefore,

P(A) = P(4) + P(5) + P(6) = 1/10 + 1/10 + 1/10 = 3/10 = .3
P(B) = P(6) + P(7) = 1/10 + 1/10 = 2/10 = .2

c.

P(A) = P(4) + P(5) + P(6) = 1/20 + 1/20 + 3/20 = 5/20 = .25
P(B) = P(6) + P(7) = 3/20 + 3/20 = 6/20 = .3

3.4

a.

(9 choose 4) = 9!/[4!(9 − 4)!] = (9·8·7·6·5·4·3·2·1)/[(4·3·2·1)(5·4·3·2·1)] = 126

b.

(7 choose 2) = 7!/[2!(7 − 2)!] = (7·6·5·4·3·2·1)/[(2·1)(5·4·3·2·1)] = 21

c.

(4 choose 4) = 4!/[4!(4 − 4)!] = (4·3·2·1)/[(4·3·2·1)(1)] = 1

d.

(5 choose 0) = 5!/[0!(5 − 0)!] = (5·4·3·2·1)/[(1)(5·4·3·2·1)] = 1

e.

(6 choose 5) = 6!/[5!(6 − 5)!] = (6·5·4·3·2·1)/[(5·4·3·2·1)(1)] = 6
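As an optional check, the five combinations above can be verified with the Python standard library (a convenience, not the method shown in the text):

from math import comb
print(comb(9, 4), comb(7, 2), comb(4, 4), comb(5, 0), comb(6, 5))   # 126 21 1 1 6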


3.6

a.

The 36 sample points are:


1,1 1,2 1,3 1,4 1,5 1,6 2,1 2,2 2,3 2,4 2,5 2,6
3,1 3,2 3,3 3,4 3,5 3,6 4,1 4,2 4,3 4,4 4,5 4,6
5,1 5,2 5,3 5,4 5,5 5,6 6,1 6,2 6,3 6,4 6,5 6,6

b.

If the dice are fair, then each of the sample points is equally likely. Each would have a
probability of 1/36 of occurring.

c.

There is one sample point in A: 3,3. Thus, P(A) = 1/36.

There are 6 sample points in B: 1,6 2,5 3,4 4,3 5,2 and 6,1. Thus, P(B) = 6/36 = 1/6.

There are 18 sample points in C: 1,1 1,3 1,5 2,2 2,4 2,6 3,1 3,3 3,5 4,2 4,4
4,6 5,1 5,3 5,5 6,2 6,4 and 6,6. Thus, P(C) = 18/36 = 1/2.
3.8

Each student will obtain slightly different proportions. However, the proportions should be
close to P(A) = 1/10, P(B) = 6/10 and P(C) = 3/10.

3.10

Define the following event:


B: {Postal worker was assaulted on the job in the past year}
P(B) = 600/12,000 = .05

3.12

a.

The 5 sample points are:


Total population, Agricultural change, Presence of industry, Growth, and Population
concentration.

b.

The probabilities are best estimated with the sample proportions. Thus,
P(Total population) = .18
P(Agricultural change) = .05
P(Presence of industry) = .27
P(Growth) = .05
P(Population concentration) = .45

c.

Define the following event:


A: {Factor specified is population-related}
P(A) = P(Total population) + P(Growth) + P(Population concentration)
= .18 + .05 + .45 = .68.


3.14

a.

The sample points of this experiment correspond to each of the 8 possible types of
commodities. Suppose we introduce notation to make the listing of the sample points
easier.
A: {carload contains agricultural products}
CH: {carload contains chemicals}
CO: {carload contains coal}
F: {carload contains forest products}
MO: {carload contains metallic ores and minerals}
MV: {carload contains motor vehicles and equipment}
N: {carload contains nonmetallic minerals and products}
O: {carload contains other}

The eight sample points are: A CH CO F MO MV N O


b.

The probability of each sample point is found by dividing the number of carloads for each
sample point by the total number of carloads. The probabilities are:
P(A) = 41,690 / 335,770 = .124
P(CH) = 38,331 / 335,770 = .114
P(CO) = 124,595 / 335,770 = .371
P(F) = 21,929 / 335,770 = .065
P(MO) = 34,521 / 335,770 = .103
P(MV) = 22,906 / 335,770 = .068
P(N) = 37,416 / 335,770 = .111
P(O) = 14,382 / 335,770 = .043

c.

P(MV) = .068
P(nonagricultural products) = P(CH) + P(CO) + P(F) + P(MO) + P(MV) + P(N) + P(O)
= .114 + .371 + .065 + .103 + .068 + .111 + .043 = .875

d.

P(CH) + P(CO) = .114 + .371 = .485

e.

Since there were 335,770 carloads that week, the probability of selecting any one in
particular would be 1 / 335,770 = .00000298. Thus, the probability of selecting the
carload with the serial number 1003642 is .00000298.


3.16

a.

Since order does not matter, the number of different bets would be a combination of 8
things taken 2 at a time.
The number of ways would be

(8 choose 2) = 8!/[2!(8 − 2)!] = 40,320/[(2·1)(6·5·4·3·2·1)] = 40,320/1,440 = 28


b.

If all players are of equal ability, then each of the 28 sample points would be equally
likely. Each would have a probability of occurring of 1/28. There is only one sample
point with values 2 and 7. Thus, the probability of winning with a bet of 2-7 would by
1/28 or .0357.

3.18

a.

Let I = Infiniti 1435, TP = Toyota Prius, and C = Chevrolet Corvette. All possible
rankings are as follows, where the first dealer listed is ranked first, the second dealer
listed is ranked second, and the third dealer listed is ranked third:
I,TP,C    I,C,TP    C,I,TP    C,TP,I    TP,I,C    TP,C,I

b.

If each set of rankings is equally likely, then each has a probability of 1/6.
The probability that the Toyota Prius is ranked first = P(TP,I,C) + P(TP,C, I)
=1/6 + 1/6 = 2/6 = 1/3.
The probability that the Infinity 1435 is ranked third = P(C,TP,I) + P(TP,C, I)
=1/6 + 1/6 = 2/6 = 1/3.
The probability that the Toyota Prius is ranked first and the Chevrolet Corvette is ranked
second = P(TP,C, I) =1/6.

3.20

First, we need to compute the total number of ways we can select 2 bullets (pair) from 1,837
bullets. This is a combination of 1,837 things taken 2 at a time.
The number of pairs is:

(1,837 choose 2) = 1,837!/[2!(1,837 − 2)!] = (1,837 · 1,836)/(2 · 1) = 1,686,366
The probability of a false positive is the number of false positives divided by the number
of pairs and is:
P(false positive) = # false positives / # pairs = 693 / 1,686,366 = .0004

This probability is very small. There would be only about 4 false positives out of every
10,000. I would have confidence in the FBI's forensic evidence.


3.22

a.

P(Bc) = 1 − P(B) = 1 − .7 = .3

b.

P(Ac) = 1 − P(A) = 1 − .4 = .6

c.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = .4 + .7 − .3 = .8

3.24

The experiment consists of rolling a pair of fair dice. The sample points are:
1, 1   1, 2   1, 3   1, 4   1, 5   1, 6
2, 1   2, 2   2, 3   2, 4   2, 5   2, 6
3, 1   3, 2   3, 3   3, 4   3, 5   3, 6
4, 1   4, 2   4, 3   4, 4   4, 5   4, 6
5, 1   5, 2   5, 3   5, 4   5, 5   5, 6
6, 1   6, 2   6, 3   6, 4   6, 5   6, 6

Since each die is fair, each sample point is equally likely. The probability of each sample point
is 1/36.
a.

A: {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
B: {(1, 4), (2, 4), (3, 4), (4, 4), (5, 4), (6, 4), (4, 1), (4, 2), (4, 3), (4, 5), (4, 6)}
A ∩ B: {(3, 4), (4, 3)}
A ∪ B: {(1, 4), (2, 4), (3, 4), (4, 4), (5, 4), (6, 4), (4, 1), (4, 2), (4, 3), (4, 5),
(4, 6), (1, 6), (2, 5), (5, 2), (6, 1)}
Ac: {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 1), (2, 2), (2, 3), (2, 4), (2, 6), (3, 1),
(3, 2), (3, 3), (3, 5), (3, 6), (4, 1), (4, 2), (4, 4), (4, 5), (4, 6), (5, 1), (5, 3),
(5, 4), (5, 5), (5, 6), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}

b.

P(A) = 6(1/36) = 6/36 = 1/6
P(B) = 11(1/36) = 11/36
P(A ∩ B) = 2(1/36) = 2/36 = 1/18
P(A ∪ B) = 15(1/36) = 15/36 = 5/12
P(Ac) = 30(1/36) = 30/36 = 5/6

c.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 1/6 + 11/36 − 1/18 = (6 + 11 − 2)/36 = 15/36 = 5/12

d.

A and B are not mutually exclusive. To be mutually exclusive, P(A ∩ B) must be 0. Here,
P(A ∩ B) = 1/18.


3.26


a.

P(Ac) = P(E3) + P(E6) = .2 + .3 = .5

b.

P(Bc) = P(E1) + P(E7) = .10 + .06 = .16

c.

P(Ac ∩ B) = P(E3) + P(E6) = .2 + .3 = .5

d.

P(A ∪ B) = P(E1) + P(E2) + P(E3) + P(E4) + P(E5) + P(E6) + P(E7)


= .10 + .05 + .20 + .20 + .06 + .30 + .06 = .97

e.

P(A ∩ B) = P(E2) + P(E4) + P(E5) = .05 + .20 + .06 = .31

f.

P(Ac ∩ Bc) = P(E1) + P(E7) + P(E3) + P(E6) = .10 + .06 + .20 + .30 = .66

g.

No. A and B are mutually exclusive if P(A ∩ B) = 0. Here, P(A ∩ B) = .31.

3.28

a.

The outcome "On" and "High" is A D.

b.

The outcome "Low" or "Medium" is Dc.

3.30

Define the following events:


A: {problems with absenteeism}
T: {problems with turnover}
From the problem, P(A) = .55, P(T) = .41, and P(A ∩ T) = .22.
P(problems with either absenteeism or turnover) = P(A ∪ T) = P(A) + P(T) − P(A ∩ T)
= .55 + .41 − .22 = .74

3.32


a.

The event A B is the event the outcome is black and odd. The event is A B: {11, 13,
15, 17, 29, 31, 33, 35}

b.

The event A B is the event the outcome is black or odd or both. The event A B is {2,
4, 6, 8, 10, 11, 13, 15, 17, 20, 22, 24, 26, 28, 29, 31, 33, 35, 1, 3, 5, 7, 9, 19, 21, 23, 25,
27}


c.

Assuming all events are equally likely, each has a probability of 1/38.
P(A) = 18(1/38) = 18/38 = 9/19
P(B) = 18(1/38) = 18/38 = 9/19
P(A ∩ B) = 8(1/38) = 8/38 = 4/19
P(A ∪ B) = 28(1/38) = 28/38 = 14/19
P(C) = 18(1/38) = 18/38 = 9/19

d.

The event A ∩ B ∩ C is the event the outcome is odd and black and low.
The event A ∩ B ∩ C is {11, 13, 15, 17}.

e.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 9/19 + 9/19 − 4/19 = 14/19

f.

P(A ∩ B ∩ C) = 4(1/38) = 4/38 = 2/19

g.

The event A ∪ B ∪ C is the event the outcome is odd or black or low.
The event A ∪ B ∪ C is:

{1, 2, 3, ... , 29, 31, 33, 35}
or
{All sample points except 00, 0, 30, 32, 34, 36}

h.

P(A ∪ B ∪ C) = 32(1/38) = 32/38 = 16/19

3.34

a.

P ∩ S ∩ A
Products 6 and 7 are contained in this intersection.

b.

P(possess all the desired characteristics) = P(P ∩ S ∩ A) = P(6) + P(7) = 1/10 + 1/10 = 1/5

c.

A ∪ S
P(A ∪ S) = P(2) + P(3) + P(5) + P(6) + P(7) + P(8) + P(9) + P(10)
= 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 = 8/10 = 4/5


d.

P ∩ S
P(P ∩ S) = P(2) + P(6) + P(7) = 1/10 + 1/10 + 1/10 = 3/10

3.36

First, convert the percentages in the table to probabilities by dividing the percent by 100%.
a.

P(A) = .259 + .169 + .115 = .543


P(B) = .003
P(C) = .037 + .078 + .016 + .002 + .047 + .027 = .207
P(D) = .414

b.

P(A ∩ D) = .156 + .094 + .043 = .293

P(A ∪ D) = P(A) + P(D) − P(A ∩ D) = .543 + .414 − .293 = .664

c.

Ac: {The worker is under 40}


Bc: {The worker is 20 or older or is not part-time}
Dc: {The worker is not part-time}

d.

P(Ac) = 1 − P(A) = 1 − .543 = .457
P(Bc) = 1 − P(B) = 1 − .003 = .997
P(Dc) = 1 − P(D) = 1 − .414 = .586

3.38

Define the following events:


A: {Wheelchair user had an injurious fall}
B: {Wheelchair user had all five features installed in the home}
C: {Wheelchair user had no falls}
D: {Wheelchair user had none of the features installed in the home}

a.

P(A) = 48/306 = .157

b.

P(B) = 9/306 = .029

c.

P(C ∩ D) = 89/306 = .291

3.40

There are a total of 6 x 6 x 6 = 216 possible outcomes from throwing 3 fair dice. To help
demonstrate this, suppose the three dice are different colors red, blue and green. When we
roll these dice, we will record the outcome of the red die first, the blue die second, and the
green die third. Thus, there are 6 possible outcomes for the first position, 6 for the second, and
6 for the third. This leads to the 216 possible outcomes.


The Grand Duke argued that the chance of getting a sum of 9 and the chance of getting a sum
of 10 should be the same since the number of partitions for 9 and 10 are the same. These
partitions are:
  9      10
126     136
135     145
144     226
225     235
234     244
333     334

In each case, there are 6 partitions. However, if we take into account the three colors of the
dice, then there are various ways to get each partition. For instance, to get a partition of 126,
we could get 126, 162, 216, 261, 612, and 621 (again, think of the red die first, the blue die
second, and the green die third). However, to get a partition of 333, there is only 1 way. To
get a partition of 144, there are 3 ways: 144, 414, and 441. The numbers of ways to get each
of the above partitions are:
  9     # ways        10     # ways
126        6         136        6
135        6         145        6
144        3         226        3
225        3         235        6
234        6         244        3
333        1         334        3
Total     25         Total     27

Thus, there are a total of 25 ways to get a sum of 9 and 27 ways to get a sum of 10.
The chance of throwing a sum of 9 (25 chances out of 216 possibilities) is less than the
chance of throwing a 10 (27 chances out of 216 possibilities).
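As an optional check of the counting argument above, enumerating the 216 equally likely outcomes confirms 25 ways to total 9 and 27 ways to total 10:

# Enumerate all outcomes of three fair dice and count totals of 9 and 10.
from itertools import product
totals = [sum(roll) for roll in product(range(1, 7), repeat=3)]
print(totals.count(9), totals.count(10))                 # 25 27
print(totals.count(9) / 216, totals.count(10) / 216)     # about .116 versus .125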
3.42

a.

P(A ∩ B) = P(A | B)P(B) = .6(.2) = .12

b.

P(B | A) = P(A ∩ B)/P(A) = .12/.4 = .3

3.44

a.

Since A and B are mutually exclusive events, P(A ∪ B) = P(A) + P(B) = .30 + .55 = .85

b.

Since A and C are mutually exclusive events, P(A ∩ C) = 0

c.

P(A | B) = P(A ∩ B)/P(B) = 0/.55 = 0

d.

Since B and C are mutually exclusive events, P(B ∪ C) = P(B) + P(C) = .55 + .15 = .70

e.

No, B and C cannot be independent events because they are mutually exclusive events.


3.46

a.

If two fair coins are tossed, there are 4 possible outcomes or simple events. They are:
(1) HH

(2) HT

(3) TH

(4) TT

Event A contains the simple events (2), (3), and (4). Event B contains the simple events
(2) and (3).
A Venn diagram of this would be:

(Venn diagram: event B, containing sample points 2 and 3, lies inside event A, which also contains sample point 4)

Since the coins are fair, each of the sample points is equally likely. Each would have
probability 1/4.
b.

P(A) = 3(1/4) = 3/4 = .75
P(B) = 2(1/4) = 2/4 = 1/2 = .5
P(A ∩ B) = P(2) + P(3) = 1/4 + 1/4 = 2/4 = 1/2 = .5

c.

P(A | B) = P(A ∩ B)/P(B) = .5/.5 = 1

P(B | A) = P(A ∩ B)/P(A) = .5/.75 = .667


3.48

The 36 possible outcomes obtained when tossing two dice are listed below:
(1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
(2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
(3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
(4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
(5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
(6, 1) (6, 2) (6, 3) (6, 4) (6, 5) (6, 6)
A: {(1, 2), (1, 4), (1, 6), (2, 1), (2, 3), (2, 5), (3, 2), (3, 4), (3, 6), (4, 1), (4, 3),
(4, 5), (5, 2), (5, 4), (5, 6), (6, 1), (6, 3), (6, 5)}
B: {(3, 6), (4, 5), (5, 4), (5, 6), (6, 3), (6, 5), (6, 6)}
A ∩ B: {(3, 6), (4, 5), (5, 4), (5, 6), (6, 3), (6, 5)}

If A and B are independent, then P(A)P(B) = P(A ∩ B).

P(A) = 18/36 = 1/2     P(B) = 7/36     P(A ∩ B) = 6/36 = 1/6

P(A)P(B) = (1/2)(7/36) = 7/72 ≠ 1/6 = P(A ∩ B). Thus, A and B are not independent.

3.50

Define the following events:


S: {cause of fatal crash is speeding}
C: {cause of fatal crash is missing a curve}
From the problem, we know P(S) = .3 and P(S ∩ C) = .12.

P(C | S) = P(C ∩ S)/P(S) = .12/.3 = .4

3.52

Define the following events:


A: {Winner is from the American League}
B: {Winner is from the National League}
C: {Winner is from the Eastern Division}
D {Winner is from the Central Division}
E: {Winner is from the Western Division}

a.

P(C | A) = P(A ∩ C)/P(A) = (7/15)/(10/15) = 7/10 = .7

b.

P(B | D) = P(B ∩ D)/P(D) = (1/15)/(3/15) = 1/3 = .333

c.

P(D ∪ E | B) = P((D ∪ E) ∩ B)/P(B) = (2/15)/(5/15) = 2/5 = .4

3.54

Define the following events:


A: {electrical switch monitors quality of power}
B: {electrical switch not wired properly}
From the problem, P(A) = .90 and P(B | A) = .90.
P(A ∩ B) = P(B | A)P(A) = .90(.90) = .81.

3.56

Define the following events:

Ai : {ith CEO has bachelors degree}


a.

P(A1) = 8/40 = .20

b.

If the first 4 CEOs picked have just a bachelor's degree, then on the next pick there are only
4 such CEOs left to choose from. Similarly, after picking 4 CEOs, there are only 36 CEOs
left to choose from.

P(A5 | A1 ∩ A2 ∩ A3 ∩ A4) = 4/36 = .111

3.58

If A and B are independent, then P(A ∩ B) = P(A)P(B). For this Exercise,

P(A) = (1385 + 786)/3934 = 2171/3934 = .552,   P(B) = (1385 + 1175)/3934 = 2560/3934 = .651,   and

P(A ∩ B) = 1385/3934 = .352.

P(A)P(B) = .552(.651) = .359 ≠ .352 = P(A ∩ B). Thus, A and B are not independent.


3.60


The probability of a false positive is P(A | B).


3.62

First, define the following event:


A: {CVSA correctly determines the veracity of a suspect} P(A) = .98 (from claim)

a.

The event that the CVSA is correct for all four suspects is the event A ∩ A ∩ A ∩ A.
P(A ∩ A ∩ A ∩ A) = .98(.98)(.98)(.98) = .9224

b.

The event that the CVSA is incorrect for at least one of the four suspects is the event
(A ∩ A ∩ A ∩ A)c. P[(A ∩ A ∩ A ∩ A)c] = 1 − P(A ∩ A ∩ A ∩ A) = 1 − .9224 = .0776

3.64

Define the following events:


I: {Leak ignites immediately (jet fire)}
D: {Leak has delayed ignition (flash fire)}
From the problem, P(I) = .01 and P(D | Ic) = .01
The probability of a jet fire or a flash fire = P(I ∪ D) = P(I) + P(D) − P(I ∩ D)
= P(I) + P(D | Ic)P(Ic) − P(I ∩ D) = .01 + .01(1 − .01) − 0 = .01 + .0099 = .0199
A tree diagram of this problem is:

(Tree diagram: the first branch is I with probability .01; the second branch is Ic with probability .99, which splits into D with probability .01, giving P(Ic ∩ D) = .99(.01) = .0099, and Dc with probability .99, giving P(Ic ∩ Dc) = .99(.99) = .9801)

3.66

a.

Define the following events:


W: {Player wins the game Go}
F: {Player plays first (black stones)}

P(W F) = 319/577 = .553


b.

P(W FCA) = 34/34 = 1


P(W FCB) = 69/79 = .873
P(W FCC) = 66/118 = .559
P(W FBA) = 40/54 = .741
P(W FBB) = 52/95 = .547
P(W FBC) = 27/79 = .342
P(W FAA) = 15/28 = .536
P(W FAB) = 11/51 = .216
P(W FAC) = 3/39 = .077

c.

There are three combinations where the player with the black stones (first) is ranked
higher than the player with the white stones: CA, CB, and BA.
P(W FCA CB BA) = (34 + 69 + 40)/(34 + 79 + 54) = 143/167 = .856

d.

There are three combinations where the players are of the same level: CC, BB, and AA.
P(W FCC BB AA) = (66 + 52 + 15)/(118 + 95 + 28) = 133/241 = .552

3.68

a.

Suppose the elements of the population are:


1, 2, 3, 4, 5, 6, 7, 8, 9, and 10.
The possible samples of size 2 are:
(1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8) (1, 9) (1, 10)
(2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8) (2, 9) (2, 10)
(3, 4) (3, 5) (3, 6) (3, 7) (3, 8) (3, 9) (3, 10)
(4, 5) (4, 6) (4, 7) (4, 8) (4, 9) (4, 10)
(5, 6) (5, 7) (5, 8) (5, 9) (5, 10)
(6, 7) (6, 8) (6, 9) (6, 10)
(7, 8) (7, 9) (7, 10)
(8, 9) (8, 10)
(9, 10)
Since there are N = 10 elements in the population, the number of samples of size n = 2 is a
combination of 10 things taken 2 at a time or
(10 choose 2) = 10!/(2!8!) = (10·9·8·7·6·5·4·3·2·1)/[(2·1)(8·7·6·5·4·3·2·1)] = 45
Therefore, there are 45 different samples of size n = 2 that can be selected from a
population of N = 10.

b.

68

If random sampling is employed, every pair of elements has an equal probability of being
selected. Therefore, the probability of drawing a particular pair is 1/45.


c.

To draw a random sample of 2 elements from 10, we will number the elements from 0 to
9. Then, starting in an arbitrary position in Table I, Appendix B, we will select two
numbers by going either down a column or across a row. Suppose that we start in the
third position of column 6 and row 9. We will proceed down the column. The first
sample drawn will be 1 and 5. The second sample drawn will be 9 and 4. The 20 samples
selected are:
Sample Number   Items Selected      Sample Number   Items Selected
      1              1, 5                 11             0, 9
      2              9, 4                 12             1, 0
      3              4, 2                 13             3, 7
      4              9, 3                 14             3, 9
      5              8, 1                 15             0, 8
      6              5, 6                 16             3, 4
      7              1, 3                 17             0, 4
      8              0, 2                 18             9, 7
      9              4, 6                 19             8, 4
     10              8, 0                 20             0, 5

There are actually two pairs of samples that match: Samples 10 and 15, and samples 4
and 14. Given the low probability of each pair occurring, it is not that likely to have two
pairs of samples that match.
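As an optional aside, a random number generator can play the role of the random number table; the sketch below draws 20 samples of size n = 2 from the digits 0 through 9 (the seed is an arbitrary choice for reproducibility, not something specified in the exercise).

import random
random.seed(1)
samples = [tuple(sorted(random.sample(range(10), 2))) for _ in range(20)]
print(samples)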
3.70

First, number the elements of the population from 1 to 200,000. Starting in row 10, column 1,
of Table I of Appendix B and reading down, take the first ten 6-digit numbers. Eliminate any
duplicates, the number 000000, and all numbers greater than 200,000.
The 10 numbers selected for the random sample are:
094299
103656
071199
023682
010115
070569
024883
007425
053660
005820
Elements with the above numbers are selected for the sample.

3.72

To draw a random sample of 1,000 households from 534,322, we will number the households
from 1 to 534,322. Then, starting in an arbitrary position in Table I, Appendix B, we will select
6-digit numbers by proceeding down a column. We will continue selecting numbers until we have
1,000 different 6-digit numbers, eliminating 000000 and any numbers between 534,323 and
999,999.


3.74

a.

Give each stock in the NYSE-Composite Transactions table of the Wall Street Journal a
number (1 to m). Using Table I of Appendix B, pick a starting point and read down using
the same number of digits as in m until you have n different numbers between 1 and m,
inclusive.

3.76

a.

P(B1 ∩ A) = P(A | B1)P(B1) = .3(.75) = .225

b.

P(B2 ∩ A) = P(A | B2)P(B2) = .5(.25) = .125

c.

P(A) = P(B1 ∩ A) + P(B2 ∩ A) = .225 + .125 = .35

d.

P(B1 | A) = P(B1 ∩ A)/P(A) = .225/.35 = .643

e.

P(B2 | A) = P(B2 ∩ A)/P(A) = .125/.35 = .357

3.78

If A is independent of B1, B2, and B3, then P(A | B1) = P(A) = .4.

Then P(B1 | A) = P(A | B1)P(B1)/P(A) = .4(.2)/.4 = .2

3.80

a.

P(E1 | error) = P(E1 ∩ error)/P(error)
= P(error | E1)P(E1) / [P(error | E1)P(E1) + P(error | E2)P(E2) + P(error | E3)P(E3)]
= .01(.30) / [.01(.30) + .03(.20) + .02(.50)] = .003/(.003 + .006 + .01) = .003/.019 = .158

b.

P(E2 | error) = P(E2 ∩ error)/P(error)
= P(error | E2)P(E2) / [P(error | E1)P(E1) + P(error | E2)P(E2) + P(error | E3)P(E3)]
= .03(.20) / [.01(.30) + .03(.20) + .02(.50)] = .006/(.003 + .006 + .01) = .006/.019 = .316

c.

P(E3 | error) = P(E3 ∩ error)/P(error)
= P(error | E3)P(E3) / [P(error | E1)P(E1) + P(error | E2)P(E2) + P(error | E3)P(E3)]
= .02(.50) / [.01(.30) + .03(.20) + .02(.50)] = .01/(.003 + .006 + .01) = .01/.019 = .526

d.

If there was a serious error, the probability that the error was made by engineer 3 is .526.
This probability is higher than for any of the other engineers. Thus engineer #3 is most
likely responsible for the error.
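As an optional check of the Bayes' Rule calculations in parts a through c, using the prior probabilities P(Ei) and conditional error rates P(error | Ei) given in the exercise:

# Posterior probability of each engineer given that a serious error occurred.
priors = {"E1": 0.30, "E2": 0.20, "E3": 0.50}
error_rates = {"E1": 0.01, "E2": 0.03, "E3": 0.02}
p_error = sum(error_rates[e] * priors[e] for e in priors)              # .019
posteriors = {e: error_rates[e] * priors[e] / p_error for e in priors}
print(round(p_error, 3), {e: round(p, 3) for e, p in posteriors.items()})
# E1 -> .158, E2 -> .316, E3 -> .526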

3.82

Define the following events:


D: {Defect in steel casting}
H: {NDE detects Hit or defect in steel casting}
From the problem, P(H | D) = .97, P(H | Dc) = .005, and P(D) = .01.
P(H) = P(H | D)P(D) + P(H | Dc)P(Dc) = .97(.01) + .005(.99) = .0097 + .00495 = .01465
P(D | H) = P(D ∩ H)/P(H) = P(H | D)P(D)/P(H) = .97(.01)/.01465 = .0097/.01465 = .6621

3.84

Define the following events:


A: {Alarm A sounds alarm}
B: {Alarm B sounds alarm}
I: {Intruder}
From the problem:
P(A | I ) = .9
P(B | I ) = .95
P(A | Ic ) = .2
P(B | Ic ) = .1
P( I ) = .4
Since the two systems are operating independently of each other,
P(A ∩ B | I) = P(A | I)P(B | I) = .9(.95) = .855
P(A ∩ B ∩ I) = P(A ∩ B | I)P(I) = .855(.4) = .342
P(A ∩ B | Ic) = P(A | Ic)P(B | Ic) = .2(.1) = .02


P(A ∩ B ∩ Ic) = P(A ∩ B | Ic)P(Ic) = .02(.6) = .012

Thus, P(A ∩ B) = P(A ∩ B ∩ I) + P(A ∩ B ∩ Ic) = .342 + .012 = .354

Finally, P(I | A ∩ B) = P(A ∩ B ∩ I)/P(A ∩ B) = .342/.354 = .966
3.86

a.

The two probability rules for a sample space are that the probability for any sample point
is between 0 and 1 and that the sum of the probabilities of all the sample points is 1.
For this Exercise, all the probabilities of the sample points are between 0 and 1, and

Σ P(Si) = P(S1) + P(S2) + P(S3) + P(S4) = .2 + .1 + .3 + .4 = 1.0   (summing over i = 1 to 4)

b.

P( A) = P( S1 ) + P( S4 ) = .2 + .4 = .6

3.88

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = .7 + .5 − .4 = .8

3.90

a.

If the Dow Jones Industrial Average increases, a large New York bank would tend to
decrease the prime interest rate. Therefore, the two events are not mutually exclusive since
they could occur simultaneously.

b.

The next sale by a PC retailer could not be both a laptop and a desktop computer. Since
the two events cannot occur simultaneously, the events are mutually exclusive.

c.

Since both events cannot occur simultaneously, the events are mutually exclusive.

3.92

a.

Because events A and B are independent, we have:

P(A ∩ B) = P(A)P(B) = (.3)(.1) = .03

Thus, P(A ∩ B) ≠ 0, and the two events cannot be mutually exclusive.

b.

P(A | B) = P(A ∩ B)/P(B) = .03/.1 = .3

P(B | A) = P(A ∩ B)/P(A) = .03/.3 = .1

c.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = .3 + .1 − .03 = .37

3.94

Mutually exclusive events are also dependent events since the assumption that one event occurs
alters the probability of the occurrence of the other one. If we assume that one event has
occurred, it is impossible for the other one to occur simultaneously since they are mutually
exclusive. In other words, if A and B are mutually exclusive, P(A ∩ B) = 0. Then
P(A | B) = P(A ∩ B)/P(B) = 0/P(B) = 0. Since P(A) ≠ 0, P(A | B) ≠ P(A), so A and B are dependent.


3.96

Define the following events:


C: {Public school building has inadequate plumbing}
D: {Public school has plans for repairing building}
From the problem, we know P(C) = .25 and P(D|C) = .38.
P(C ∩ D) = P(D | C)P(C) = .38(.25) = .095

3.98

a.

The event {The manager was involved in the ISO 9000 registration} contains the sample
points {The manager was very involved}, {The manager had moderate involvement}, and
{The manager had minimal involvement}. Thus, P(A) is:
P(A) = 9/40 + 16/40 + 12/40 = 37/40 = .925

b.

The event {The length of time to achieve ISO 9000 registration was more than 2 years}
contains the sample points {The length of time to achieve ISO 9000 registration was
between 2.1 and 2.5 years} and {The length of time to achieve ISO 9000 registration was
greater than 2.5 years}. Thus, P(B) is:

P(B) = 2/40 + 3/40 = 5/40 = .125

c.

We cannot determine if events A and B are independent from the data given because there
is no way of finding the P(A B). In order to find P(A B), the 40 individuals would
have to be classified on both variables at the same time. In the data provided, the
individuals are first classified on the first variable and then classified on the second
variable.

3.100

a.

The experiment consists of selecting 159 employees and asking each to indicate how
strongly he/she agreed or disagreed with the statement "I believe that management is
committed to CQI." There are five sample points: "Strongly agree," "Agree," "Neither
agree nor disagree," "Disagree," and "Strongly disagree."

b.

Since we have frequencies for each of the sample points, good estimates of the
probabilities are the relative frequencies. To find the relative frequencies, divide all of the
frequencies by the sample size of 159. The estimates of the probabilities are:

Strongly Agree   Agree   Neither Agree Nor Disagree   Disagree   Strongly Disagree
     .189         .403             .258                 .113           .038

c.

The probability that an employee agrees or strongly agrees with the statement is
.189 + .403 = .592.


d.

The probability that an employee does not strongly agree with the statement is equal to
the sum of all the probabilities except that for "strongly agree" = .403 + .258 + .113 +
.038 = .812.

3.102

a.

There are a total of 9 × 2 = 18 sample points for this experiment. There are 9 sources of
CO poisoning, and each source of poisoning has 2 possible outcomes, fatal or nonfatal.
Suppose we introduce some notation to make it easier to write down the sample points.
Let FI = Fire, AU = Auto exhaust, FU = Furnace, K = Kerosene or spaceheater,
AP = Appliance, OG = Other gas-powered motors, FP = Fireplace, O = Other, and
U = Unknown. Also, let F = Fatal and N = Nonfatal. The 18 sample points are:
FI, F
FI, N

AU, F
AU, N

FU, F
FU, N

K, F
K, N

AP, F
AP, N

OG, F
OG, N

FP, F
FP, N

O, F
O, N

b.

The set of all sample points is called the sample space.

c.

The event A is made up of the following sample points: FI, F and FI, N

U, F
U, N

Then, P(A) = P(FI, F) + P(FI, N) = 63/981 + 53/981 = 116/981 = .118


d.

The event B is made up of the following sample points:


(FI, F); (AU, F); (FU, F); (K, F); (AP, F); (OG, F); (FP, F); (O, F); (U, F)
Then, P(B) = P(FI, F) + P(AU, F) + P(FU, F) + P(K, F) + P(AP, F)
+ P(OG, F) + P(FP, F) + P(O, F) + P(U, F)
= 63/981 + 60/981 + 18/981 + 9/981 + 9/981 + 3/981 + 0/981 + 3/981
+ 9/981
= 174/981 = .177

e.

The event C is made up of the following sample points: (AU, F) and (AU, N)
Then, P(C) = P(AU, F) + P(AU, N) = 60/981 + 178/981 = 238/981 = .243

f.

The event D is made up of the following sample point: AU, F


Then, P(D) = P(AU, F) = 60/981 = .061

g.

The event E is made up of the following sample point: FI, N


Then, P(E) = P(FI, N) = 53/981 = .054

3.104

Since there are 11 individuals who are willing to serve on the panel, the number of different
panels of 5 experts is a combination of 11 things taken 5 at a time or
(11 choose 5) = 11!/(5!6!) = (11·10·9·8·7·6·5·4·3·2·1)/[(5·4·3·2·1)(6·5·4·3·2·1)] = 462


3.106

The possible ways of ranking the blades are:


GSW   GWS   SGW   SWG   WGS   WSG

If the consumer had no preference but still ranked the blades, then the 6 possibilities are equally
likely. Therefore, each of the 6 possibilities has a probability of 1/6 of occurring.

a.

P(Ranks G first) = P(GSW) + P(GWS) = 1/6 + 1/6 = 2/6 = 1/3

b.

P(Ranks G last) = P(SWG) + P(WSG) = 1/6 + 1/6 = 2/6 = 1/3

c.

P(ranks G last and W second) = P(SWG) = 1/6

d.

P(WGS) = 1/6

3.108

a.

Consecutive tosses of a coin are independent events since what occurs one time would not
affect the next outcome.

b.

If the individuals are randomly selected, then what one individual says should not affect
what the next person says. They are independent events.

c.

The results in two consecutive at-bats are probably not independent. The player may have
faced the same pitcher both times which may affect the outcome.

d.

The amount of gain and loss for two different stocks bought and sold on the same day are
probably not independent. The market might be way up or down on a certain day so that
all stocks are affected.

e.

The amount of gain or loss for two different stocks that are bought and sold in different
time periods are independent. What happens to one stock should not affect what happens
to the other.

f.

The prices bid by two different development firms in response to the same building
construction proposal would probably not be independent. The same variables would be
present for both firms to consider in their bids (materials, labor, etc.).


3.110

a.

We will define the following events:


A:{The first activation device works properly; i.e., activates the sprinkler when it
should}
B:{The second activation device works properly}
From the statement of the problem, we know
P(A) = .91 and P(B) = .87
Furthermore, since the activation devices work independently, we conclude that
P(A ∩ B) = P(A)P(B) = (.91)(.87) = .7917
Now, if a fire starts near a sprinkler head, the sprinkler will be activated if either the first
activation device or the second activation device, or both, operates properly. Thus,
P(Sprinkler head will be activated) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= .91 + .87 − .7917 = .9883

b.

The event that the sprinkler head will not be activated is the complement of the event that
the sprinkler will be activated. Thus,
P(Sprinkler head will not be activated) = 1 − P(Sprinkler head will be activated)
= 1 − .9883 = .0117

c.

From part a, P(A ∩ B) = P(A)P(B) = .7917

d.

In terms of the events we have defined, we wish to determine


P(A ∩ Bc) = P(A)P(Bc) (by independence) = .91(1 − .87) = .91(.13) = .1183

3.112

Define the following events:


S: {System shuts down}
F1: {Hardware failure}
F2: {Software failure}
F3: {Power failure}
From the Exercise, we know:
P(F1) = .01, P(F2) = .05, and P(F3) = .02. Also, P(S|F1) = .73, P(S|F2) = .12, and P(S|F3) = .88.


The probability that the current shutdown is due to a hardware failure is:

P(F1 | S) = P(F1 ∩ S)/P(S) = P(S | F1)P(F1) / [P(S | F1)P(F1) + P(S | F2)P(F2) + P(S | F3)P(F3)]
= .73(.01) / [.73(.01) + .12(.05) + .88(.02)] = .0073/(.0073 + .006 + .0176) = .0073/.0309 = .2362

The probability that the current shutdown is due to a software failure is:

P(F2 | S) = P(F2 ∩ S)/P(S) = P(S | F2)P(F2) / [P(S | F1)P(F1) + P(S | F2)P(F2) + P(S | F3)P(F3)]
= .12(.05) / [.73(.01) + .12(.05) + .88(.02)] = .006/(.0073 + .006 + .0176) = .006/.0309 = .1942

The probability that the current shutdown is due to a power failure is:

P(F3 | S) = P(F3 ∩ S)/P(S) = P(S | F3)P(F3) / [P(S | F1)P(F1) + P(S | F2)P(F2) + P(S | F3)P(F3)]
= .88(.02) / [.73(.01) + .12(.05) + .88(.02)] = .0176/(.0073 + .006 + .0176) = .0176/.0309 = .5696

3.114

Define the following events:


C: {Committee judges joint acceptable}
I: {Inspector judges joint acceptable}
The sample points of this experiment are:
C ∩ I     C ∩ Ic     Cc ∩ I     Cc ∩ Ic
a.

The probability the inspector judges the joint to be acceptable is:


P(I) = P(C ∩ I) + P(Cc ∩ I) = 101/153 + 23/153 = 124/153 = .810

The probability the committee judges the joint to be acceptable is:

P(C) = P(C ∩ I) + P(C ∩ Ic) = 101/153 + 10/153 = 111/153 = .725


b.

The probability that both the committee and the inspector judge the joint to be acceptable
is:
P(C ∩ I) = 101/153 = .660

The probability that neither judges the joint to be acceptable is:

P(Cc ∩ Ic) = 19/153 = .124

c.

The probability the inspector and committee disagree is:


P(C ∩ Ic) + P(Cc ∩ I) = 10/153 + 23/153 = 33/153 = .216

The probability the inspector and committee agree is:

P(C ∩ I) + P(Cc ∩ Ic) = 101/153 + 19/153 = 120/153 = .784

3.116

a.

Define the following events:


A1: {Component 1 works properly}
A2: {Component 2 works properly}
B3: {Component 3 works properly}
B4: {Component 4 works properly}
A:  {Subsystem A works properly}
B:  {Subsystem B works properly}

The probability a component fails is .1, so the probability a component works properly is
1 − .1 = .9.
Subsystem A works properly if both components 1 and 2 work properly.
P(A) = P(A1 ∩ A2) = P(A1)P(A2) = .9(.9) = .81
(since the components operate independently)
Similarly, P(B) = P(B3 ∩ B4) = P(B3)P(B4) = .9(.9) = .81
The system operates properly if either subsystem A or B operates properly.

The probability the system operates properly is:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = P(A) + P(B) − P(A)P(B)
= .81 + .81 − .81(.81) = .9639


b.

The probability exactly one subsystem fails is:


P(A ∩ Bc) + P(Ac ∩ B) = P(A)P(Bc) + P(Ac)P(B)
= .81(1 − .81) + (1 − .81)(.81) = .1539 + .1539 = .3078

c.

The probability the system fails is the probability that both subsystems fail, or:
P(Ac ∩ Bc) = P(Ac)P(Bc) = (1 − .81)(1 − .81) = .0361

d.

The system operates correctly 99% of the time means it fails 1% of the time. The
probability one subsystem fails is .19. The probability that n independent subsystems all
fail is .19^n. Thus, we must find n such that
.19^n ≤ .01
Thus, n = 3.
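As an optional check of part d, a short loop finds the smallest n for which the probability that all n independent subsystems fail, .19^n, drops to .01 or below:

p_fail = 0.19
n = 1
while p_fail ** n > 0.01:
    n += 1
print(n, p_fail ** n)   # 3 0.006859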

3.118

Define the events:


A: {A bottle comes from machine A}
B: {A bottle comes from machine B}
R: {A bottle is rejected}.
Then the given probabilities are:
P(A) = .75,   P(B) = .25,   P(R | A) = 1/20,   P(R | B) = 1/30

The proportion of rejected bottles is:


P(R) = P(A ∩ R) + P(B ∩ R) = P(R | A)P(A) + P(R | B)P(B)
= (1/20)(.75) + (1/30)(.25) = .0375 + .0083 = .0458
The probability that a bottle comes from machine A, given that it is accepted is:
P(A | Rc) = P(A ∩ Rc)/P(Rc) = P(Rc | A)P(A)/[1 − P(R)] = (19/20)(.75)/(1 − .0458) = .7125/.9542 = .7467


3.120

There are a total of 6 × 6 = 36 outcomes when rolling 2 dice. If we let the first number in the
pair represent the outcome of die number 1 and the second number in the pair represent the
outcome of die number 2, then the possible outcomes are:

1,1   1,2   1,3   1,4   1,5   1,6
2,1   2,2   2,3   2,4   2,5   2,6
3,1   3,2   3,3   3,4   3,5   3,6
4,1   4,2   4,3   4,4   4,5   4,6
5,1   5,2   5,3   5,4   5,5   5,6
6,1   6,2   6,3   6,4   6,5   6,6

If both dice are fair, then each of these outcomes is equally likely and has a probability of
1/36.
a.

To win on the first roll, a player must roll a 7 or 11. There are 6 ways to roll a 7 and 2
ways to roll an 11. Thus the probability of winning on the first roll is:
P(7 or 11) = 8/36 = .2222

b.

To lose on the first roll, a player must roll a 2 or 3. There is 1 way to roll a 2 and 2 ways
to roll a 3. Thus the probability of losing on the first roll is:

P(2 or 3) = 3/36 = .0833

c.

If a player rolls a 4 on the first roll, the game will end on the next roll if the player rolls
the original roll again (player wins) or if the player rolls a seven (player loses). Now,
there are 3 ways of getting a 4 on the first roll: 1,3, 2,2, or 3,1.
If the first roll was 2,2, then the game would end on the next roll if the player threw a 2,2,
1,6, 2,5, 3,4, 4,3, 5,2, or 6,1 on the next roll. The probability of the game ending on
the next roll would be:
P(2,2 or 7 on second toss | 2,2 on first) = 7/36 = .1944

Now, suppose the first roll ended with a 1 and a 3. Since the dice are not marked, this
result could have happened two ways: 1, 3 or 3,1. Regardless of how the original 1 and 3
were obtained, the player would have 2 ways of winning on the next roll: 1,3 or 3,1. For
the game to end on the next roll, the player could throw 1,3, 3,1, 1,6, 2,5, 3,4, 4,3, 5,2,
or 6,1. The probability of the game ending on the next roll would be:
P(1,3 or 3,1 or 7 on second toss | 1 and 3 on first) = 8/36 = .2222

Since there were 3 ways to get a 4 on the first roll, and each were equally likely,
P(2,2) = 1/3 and P[1 and 3 (any order)] = 2/3.


The probability that the game ends on the second roll is

P(2,2 or 7 on second toss | 2,2 on first)P(2,2 on first)
+ P(1,3 or 3,1 or 7 on second toss | 1 and 3 on first)P(1 and 3 on first)
= (1/3)(.1944) + (2/3)(.2222) = .0648 + .1481 = .2129
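As an optional check of parts a and b, enumerating the 36 equally likely outcomes for two fair dice reproduces the first-roll probabilities:

from itertools import product
rolls = [a + b for a, b in product(range(1, 7), repeat=2)]
print(sum(r in (7, 11) for r in rolls) / 36)   # 8/36 = .2222
print(sum(r in (2, 3) for r in rolls) / 36)    # 3/36 = .0833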

3.122

Suppose we define the following event:


E: {Error produced when dividing}
From the problem, we know that P(E) = 1 / 9,000,000,000
The probability of no error produced when dividing is P(Ec) = 1 − P(E) = 1 − 1/9,000,000,000
= 8,999,999,999/9,000,000,000 = .999999999 ≈ 1.0000
Suppose we want to find the probability of no errors in 2 divisions (assuming each division is
independent):
P(Ec ∩ Ec) = .999999999(.999999999) ≈ 1.0000
Thus, in general, the probability of no errors in k divisions would be:
P(Ec ∩ Ec ∩ ⋯ ∩ Ec) = P(Ec)^k = [8,999,999,999/9,000,000,000]^k     (k intersections)
Suppose a user ran a program that performed 1 billion divisions. The probability of no errors
in these 1 billion divisions would be:
P(Ec)^1,000,000,000 = [8,999,999,999/9,000,000,000]^1,000,000,000 = .9048
Thus, the probability of at least 1 error in 1 billion divisions would be
1 − P(Ec)^1,000,000,000 = 1 − [8,999,999,999/9,000,000,000]^1,000,000,000 = 1 − .9048 = .0852
For a heavy MINITAB user, this flawed chip would be a problem because the above
probability is not that small.


Chapter 4

Random Variables and Probability Distributions

4.2
a.

The closing price of a particular stock on the New York Stock Exchange is discrete. It
can take on only a countable number of values.

b.

The number of shares of a particular stock that are traded on a particular day is discrete.
It can take on only a countable number of values.

c.

The quarterly earnings of a particular firm is discrete. It can take on only a countable
number of values.

d.

The percentage change in yearly earnings between 2005 and 2006 for a particular firm is
continuous. It can take on any value in an interval.

e.

The number of new products introduced per year by a firm is discrete. It can take on only
a countable number of values.

f.

The time until a pharmaceutical company gains approval from the U.S. Food and Drug
Administration to market a new drug is continuous. It can take on any value in an
interval of time.

4.4

The number of customers, x, waiting in line can take on the values 0, 1, 2, 3, …. Even though the list is never ending, we call this list countable. Thus, the random variable is discrete.

4.6

A banker might be interested in the number of new accounts opened in a month, or the number
of mortgages it currently has, both of which are discrete random variables.

4.8

The manager of a hotel might be concerned with the number of employees on duty at a specific
time, or the number of vacancies there are on a certain night.

4.10

A stockbroker might be interested in the length of time until the stockmarket is closed for the
day.

4.12

a.

The variable x can take on values 1, 3, 5, 7, and 9.

b.

The value of x that has the highest probability associated with it is 5. It has a probability
of .4.

82

Chapter 4

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

c.  Using MINITAB, the probability distribution of x as a graph is: (bar chart of p(x) versus x)

d.  P(x = 7) = .2

e.  P(x ≥ 5) = p(5) + p(7) + p(9) = .4 + .2 + .1 = .7

f.  P(x > 2) = p(3) + p(5) + p(7) + p(9) = .2 + .4 + .2 + .1 = .9

4.14

a.  This is not a valid distribution because Σp(x) = .9 ≠ 1.

b.  This is a valid distribution because 0 ≤ p(x) ≤ 1 for all values of x and Σp(x) = 1.

c.  This is not a valid distribution because p(4) = −.3 < 0.

d.  The sum of the probabilities over all possible values of the random variable is Σp(x) = 1.1 > 1, so this is not a valid probability distribution.

4.16

a.  μ = E(x) = Σxp(x) = 10(.05) + 20(.20) + 30(.30) + 40(.25) + 50(.10) + 60(.10)
       = .5 + 4 + 9 + 10 + 5 + 6 = 34.5

    σ² = E[(x − μ)²] = Σ(x − μ)²p(x)
       = (10 − 34.5)²(.05) + (20 − 34.5)²(.20) + (30 − 34.5)²(.30) + (40 − 34.5)²(.25) + (50 − 34.5)²(.10) + (60 − 34.5)²(.10)
       = 30.0125 + 42.05 + 6.075 + 7.5625 + 24.025 + 65.025 = 174.75

    σ = √174.75 = 13.219
b.

Random Variables and Probability Distributions

83

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

c.  μ ± 2σ → 34.5 ± 2(13.219) → 34.5 ± 26.438 → (8.062, 60.938)

    P(8.062 < x < 60.938) = p(10) + p(20) + p(30) + p(40) + p(50) + p(60)
    = .05 + .20 + .30 + .25 + .10 + .10 = 1.00
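For instructors who want to verify such computations numerically, a minimal Python sketch of the mean/variance arithmetic for this probability distribution is given below (it is a check only, using the pmf stated in Exercise 4.16):

    # Mean, variance, and the two-standard-deviation interval for a discrete pmf.
    x_vals = [10, 20, 30, 40, 50, 60]
    probs  = [.05, .20, .30, .25, .10, .10]

    mu = sum(x * p for x, p in zip(x_vals, probs))                 # 34.5
    var = sum((x - mu) ** 2 * p for x, p in zip(x_vals, probs))    # 174.75
    sigma = var ** 0.5                                             # 13.219

    lo, hi = mu - 2 * sigma, mu + 2 * sigma
    prob = sum(p for x, p in zip(x_vals, probs) if lo < x < hi)    # 1.00
    print(mu, var, round(sigma, 3), prob)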

4.18

a.

It would seem that the mean of both would be 1 since they both are symmetric
distributions centered at 1.

b.

P(x) seems more variable since there appears to be greater probability for the two extreme
values of 0 and 2 than there is in the distribution of y.

c.  For x:  μ = E(x) = Σxp(x) = 0(.3) + 1(.4) + 2(.3) = 0 + .4 + .6 = 1
            σ² = E[(x − μ)²] = Σ(x − μ)²p(x) = (0 − 1)²(.3) + (1 − 1)²(.4) + (2 − 1)²(.3) = .3 + 0 + .3 = .6

    For y:  μ = E(y) = Σyp(y) = 0(.1) + 1(.8) + 2(.1) = 0 + .8 + .2 = 1
            σ² = E[(y − μ)²] = Σ(y − μ)²p(y) = (0 − 1)²(.1) + (1 − 1)²(.8) + (2 − 1)²(.1) = .1 + 0 + .1 = .2

    The variance for x is larger than that for y.
4.20

a.

Yes. Relative frequencies are observed values from a sample. Relative frequencies are
commonly used to estimate unknown probabilities. In addition, relative frequencies have
the same properties as the probabilities in a probability distribution, namely
1. all relative frequencies are greater than or equal to zero
2. the sum of all the relative frequencies is 1

b.

Using MINITAB, the graph of the probability distribution is:



c.

Let x = age of employee. Then P(x > 30) = .13 + .15 + .12 = .40.
P(x > 40) = 0
P(x < 30) = .02 + .04 + .05 + .07 + .04 + .02 + .07 + .02 + .11 + .07 = .51

d.  P(x = 25 or x = 26) = .02 + .07 = .09
4.22

a.  The probability distribution for x is:

    Grill Display Combination    x     p(x)
    1-2-3                        6     35/124 = .282
    1-2-4                        7      8/124 = .065
    1-2-5                        8     42/124 = .339
    2-3-4                        9      4/124 = .032
    2-3-5                       10      1/124 = .008
    2-4-5                       11     34/124 = .274

b.  P(x ≥ 10) = p(10) + p(11) = .008 + .274 = .282

4.24

a.

First, we must find the probability distribution of x. Define the following events:
C: {Chicken is contaminated}
N: {Chicken is not contaminated}
If 3 slaughtered chickens are randomly selected, then the possible outcomes are:
CCC, CCN, CNC, NCC, CNN, NCN, NNC, and NNN

These outcomes are NOT equally likely, since P(C) = 1/100 = .01 and P(N) = 1 − P(C) = 1 − .01 = .99.

P(CCC) = P(C ∩ C ∩ C) = P(C)P(C)P(C) = .01(.01)(.01) = .000001
P(CCN) = P(CNC) = P(NCC) = P(C ∩ C ∩ N) = P(C)P(C)P(N) = .01(.01)(.99) = .000099
P(CNN) = P(NCN) = P(NNC) = P(C ∩ N ∩ N) = P(C)P(N)P(N) = .01(.99)(.99) = .009801
P(NNN) = P(N ∩ N ∩ N) = P(N)P(N)P(N) = .99(.99)(.99) = .970299
The variable x is defined as the number of contaminated chickens in the sample. The value of x for each of the outcomes is:

    Event   x    p(x)
    CCC     3    .000001
    CCN     2    .000099
    CNC     2    .000099
    NCC     2    .000099
    CNN     1    .009801
    NCN     1    .009801
    NNC     1    .009801
    NNN     0    .970299
The probability distribution of x is:

    x    p(x)
    3    .000001
    2    .000297
    1    .029403
    0    .970299

b.  Using MINITAB, the probability graph for x is: (bar chart of p(x) versus x)
c.  P(x ≤ 1) = P(x = 0) + P(x = 1) = .970299 + .029403 = .999702
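A short Python sketch (an illustration only, using the same p = .01 assumption) reproduces this probability distribution both by enumerating the eight outcomes and from the binomial formula:

    # Distribution of the number of contaminated chickens in a sample of 3.
    from itertools import product
    from math import comb

    p_c = 0.01                                   # P(contaminated)
    dist = {}
    for outcome in product("CN", repeat=3):      # CCC, CCN, ..., NNN
        prob = 1.0
        for s in outcome:
            prob *= p_c if s == "C" else 1 - p_c
        x = outcome.count("C")
        dist[x] = dist.get(x, 0.0) + prob

    for x in sorted(dist, reverse=True):
        print(x, round(dist[x], 6))

    # Same distribution from the binomial formula (n = 3, p = .01):
    for x in range(3, -1, -1):
        print(x, round(comb(3, x) * p_c**x * (1 - p_c)**(3 - x), 6))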
4.26


To find the probability distribution of x, we first list the possible values of x. For this exercise,
the possible values of x are 3, 1, and 5. Next, we list the number of cases, f(x), that result in
the particular values of x. To find the probability distribution of x, we divide the number of
cases for each value of x, f(x), by the total number of cases, 678. For x = 3, the probability is
p(3) = 68 / 678 = .100. For x = 1, the probability is p(1) = 71 / 678 = .105. For x = 5, the
probability is p(5) = 539 / 678 = .795. The probability distribution of x is:

    x       f(x)    p(x)
    3        68     .100
    1        71     .105
    5       539     .795
    Total   678    1.000

Using MINITAB, the graph of the probability distribution is:

4.28

a.  E(x) = Σxp(x)

    Firm A: E(x) = 0(.01) + 500(.01) + 1000(.01) + 1500(.02) + 2000(.35) + 2500(.30) + 3000(.25) + 3500(.02) + 4000(.01) + 4500(.01) + 5000(.01)
                 = 0 + 5 + 10 + 30 + 700 + 750 + 750 + 70 + 40 + 45 + 50 = 2450

    Firm B: E(x) = 0(.00) + 200(.01) + 700(.02) + 1200(.02) + 1700(.15) + 2200(.30) + 2700(.30) + 3200(.15) + 3700(.02) + 4200(.02) + 4700(.01)
                 = 0 + 2 + 14 + 24 + 255 + 660 + 810 + 480 + 74 + 84 + 47 = 2450

b.  σ² = Σ(x − μ)²p(x)

    Firm A: σ² = (0 − 2450)²(.01) + (500 − 2450)²(.01) + ⋯ + (5000 − 2450)²(.01)
               = 60,025 + 38,025 + 21,025 + 18,050 + 70,875 + 750 + 75,625 + 22,050 + 24,025 + 42,025 + 65,025
               = 437,500
            σ = √437,500 = 661.44

    Firm B: σ² = (0 − 2450)²(.00) + (200 − 2450)²(.01) + ⋯ + (4700 − 2450)²(.01)
               = 0 + 50,625 + 61,250 + 31,250 + 84,375 + 18,750 + 18,750 + 84,375 + 31,250 + 61,250 + 50,625
               = 492,500
            σ = √492,500 = 701.78

    Firm B faces greater risk of physical damage because it has a higher variance and standard deviation.


4.30

a.

If a large number of measurements are observed, then the relative frequencies should be
very good estimators of the probabilities.

b.  E(x) = Σxp(x) = 1(.01) + 2(.04) + 3(.04) + 4(.08) + 5(.10) + 6(.15) + 7(.25) + 8(.20) + 9(.08) + 10(.05)
         = .01 + .08 + .12 + .32 + .50 + .90 + 1.75 + 1.60 + .72 + .50 = 6.50

    The average number of checkout lanes per store is 6.5.

c.  σ² = Σ(x − μ)²p(x) = (1 − 6.5)²(.01) + (2 − 6.5)²(.04) + (3 − 6.5)²(.04) + (4 − 6.5)²(.08) + (5 − 6.5)²(.10) + (6 − 6.5)²(.15) + (7 − 6.5)²(.25) + (8 − 6.5)²(.20) + (9 − 6.5)²(.08) + (10 − 6.5)²(.05)
       = .3025 + .8100 + .4900 + .5000 + .2250 + .0375 + .0625 + .4500 + .5000 + .6125 = 3.99

    σ = √3.99 = 1.9975

d.  Chebyshev's Rule says that at least 0% of the observations should fall in the interval μ ± σ, and at least 75% of the observations should fall in the interval μ ± 2σ.

e.  μ ± σ → 6.5 ± 1.9975 → (4.5025, 8.4975)
    P(4.5025 ≤ x ≤ 8.4975) = .10 + .15 + .25 + .20 = .70. This is at least 0.

    μ ± 2σ → 6.5 ± 2(1.9975) → 6.5 ± 3.995 → (2.505, 10.495)
    P(2.505 ≤ x ≤ 10.495) = .04 + .08 + .10 + .15 + .25 + .20 + .08 + .05 = .95. This is at least .75, or 75%.

4.32

Let x = winnings in the Florida lottery. The probability distribution for x is:

    x             p(x)
    −$1           22,999,999/23,000,000
    $6,999,999    1/23,000,000

The expected net winnings would be:

    μ = E(x) = (−1)(22,999,999/23,000,000) + 6,999,999(1/23,000,000) = −$.70

The average winnings of all those who play the lottery is −$.70.


4.34

Each point in the system can have one of 2 status levels, free or obstacle. Define the
following events:
AF: {Point A is free}
BF: {Point B is free}
CF: {Point C is free}

AO: {Point A is obstacle}


BO: {Point B is obstacle}
CO: {Point C is obstacle}

Thus, the sample points for the space are:


AFBFCF, AFBFCO, AFBOCF, AFBOCO, AOBFCF, AOBFCO, AOBOCF, AOBOCO
Since it is stated that the probability of any point in the system having a free status is .5, the probability of any point having an obstacle status is also .5. Thus, the probability of each of the sample points above is P(AiBiCi) = .5(.5)(.5) = .125.

The values of Y, the number of free links in the system, for each sample point are listed below. A link is free only if both of its points are free: the link from A to B is free if A is free and B is free, and the link from B to C is free if B is free and C is free.

    Sample point   Probability   Y
    AFBFCF          .125         2
    AFBFCO          .125         1
    AFBOCF          .125         0
    AFBOCO          .125         0
    AOBFCF          .125         1
    AOBFCO          .125         0
    AOBOCF          .125         0
    AOBOCO          .125         0

The probability distribution for Y is:

    Y    Probability
    0    .625
    1    .250
    2    .125


4.36

a.

x is discrete. It can take on only six values.

b.

This is a binomial distribution.

c.  p(0) = [5!/(0!5!)](.7)^0(.3)^5 = (1)(.00243) = .00243
    p(1) = [5!/(1!4!)](.7)^1(.3)^4 = .02835
    p(2) = [5!/(2!3!)](.7)^2(.3)^3 = .1323
    p(3) = [5!/(3!2!)](.7)^3(.3)^2 = .3087
    p(4) = [5!/(4!1!)](.7)^4(.3)^1 = .36015
    p(5) = [5!/(5!0!)](.7)^5(.3)^0 = .16807

d.  μ = np = 5(.7) = 3.5
    σ = √(npq) = √(5(.7)(.3)) = √1.05 = 1.0247

e.  μ ± 2σ → 3.5 ± 2(1.0247) → (1.4506, 5.5494)
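These binomial quantities can be verified with a few lines of Python (standard library only); this is offered as a check, not as part of the printed solution:

    # Binomial pmf, mean, and standard deviation for n = 5, p = .7 (Exercise 4.36).
    from math import comb, sqrt

    n, p = 5, 0.7
    pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}
    for x, prob in pmf.items():
        print(x, round(prob, 5))          # .00243, .02835, .1323, .3087, .36015, .16807

    mu = n * p                            # 3.5
    sigma = sqrt(n * p * (1 - p))         # 1.0247
    print(mu, round(sigma, 4), (round(mu - 2*sigma, 4), round(mu + 2*sigma, 4)))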


4.38

a.  p(0) = [3!/(0!3!)](.3)^0(.7)^3 = (1)(.343) = .343
    p(1) = [3!/(1!2!)](.3)^1(.7)^2 = .441
    p(2) = [3!/(2!1!)](.3)^2(.7)^1 = .189
    p(3) = [3!/(3!0!)](.3)^3(.7)^0 = .027

    x    p(x)
    0    .343
    1    .441
    2    .189
    3    .027

4.40

a.  P(x = 2) = P(x ≤ 2) − P(x ≤ 1) = .167 − .046 = .121 (from Table II, Appendix B)

b.  P(x ≤ 5) = .034

c.  P(x > 1) = 1 − P(x ≤ 1) = 1 − .919 = .081

d.  P(x < 10) = P(x ≤ 9) = 0

e.  P(x ≥ 10) = 1 − P(x ≤ 9) = 1 − .002 = .998

f.  P(x = 2) = P(x ≤ 2) − P(x ≤ 1) = .206 − .069 = .137

4.42

a.

We will check the 5 characteristics of a binomial random variable.
1.  The experiment consists of n = 200 identical trials.
2.  There are only two possible outcomes on each trial. Let S = {young adult owns a mobile phone with internet access} and F = {young adult does not own a mobile phone with internet access}.
3.  The probability of success (S) is the same from trial to trial. For each trial, p = P(S) = .20 and q = 1 − p = 1 − .20 = .80.
4.  The trials are independent.
5.  The binomial random variable x is the number of young adults in 200 trials that own a mobile phone with internet access.

Thus, x is a binomial random variable.


b.

From the exercise, p = .20. For any young adult, the probability that they own a mobile
phone with internet access is .20.

c.  μ = E(x) = np = 200(.20) = 40. On the average, for every 200 young people surveyed, 40 will own mobile phones with internet access.


4.44

a.

We will check the 5 characteristics of a binomial random variable.


1. The experiment consists of n = 5 identical trials. We have to assume that the
number of bottled water brands is large.
2. There are only 2 possible outcomes for each trial. Let S = brand of bottled water
used tap water and F = brand of bottled water did not use tap water.
3. The probability of success (S) is the same from trial to trial. For each trial, p =
P(S) = .25 and q = 1 p = 1 - .25 = .75.
4. The trials are independent.
5. The binomial random variable x is the number of brands in the 5 trials that used tap
water.
If the total number of brands of bottled water is large, then the above
characteristics will be basically true. Thus, x is a binomial random variable.

b.  The formula for the probability distribution of x is p(x) = [5!/(x!(5 − x)!)](.25)^x(.75)^(5−x), for x = 0, 1, 2, 3, 4, 5.

c.  P(x = 2) = [5!/(2!3!)](.25)^2(.75)^3 = .2637

d.  P(x ≤ 1) = P(x = 0) + P(x = 1) = [5!/(0!5!)](.25)^0(.75)^5 + [5!/(1!4!)](.25)^1(.75)^4 = .2373 + .3955 = .6328

4.46

a.

In order for x to be a binomial random variable, the n trials must be identical. We can
assume that the process of selecting of a worker is identical from trial to trial. There are
two possible outcomes - a worker missed work due to a back injury or not. The
probability of success must be the same from trial to trial. We can assume that the
probability of missing work due to a back injury is constant. The trials must be
independent of each other. We can assume that the outcome of one trials will not affect
the outcome of any other. Thus, x is a binomial random variable.

b.

From the information given in the problem, the estimate of p is .40.

c.  The mean is μ = E(x) = np = 10(.40) = 4.
    The standard deviation is σ = √(np(1 − p)) = √(10(.40)(.60)) = √2.4 = 1.549.

d.  Using Table II, Appendix B, with n = 10 and p = .40,
    P(x = 1) = P(x ≤ 1) − P(x ≤ 0) = .046 − .006 = .040
    P(x > 1) = 1 − P(x ≤ 1) = 1 − .046 = .954


4.48

Let x = number of packets observed by a network sensor in 150 trials. Then x has an approximate binomial distribution with n = 150 and p = .001. The virus will be detected if at least 1 packet is observed.

    P(x ≥ 1) = 1 − P(x = 0) = 1 − [150!/(0!150!)](.001)^0(.999)^150 = 1 − .999^150 = 1 − .8606 = .1394

4.50

a.

We must assume that the trials are identical, the probability of success is constant from
trial to trial, and the trials are independent of each other.

b.  From the problem, we estimate p to be .20. Using Table II, Appendix B, with n = 25 and p = .20,
    P(x ≤ 10) = .994

c.  E(x) = np = 25(.20) = 5
    σ = √(np(1 − p)) = √(25(.20)(.80)) = √4 = 2

d.  μ ± 2σ → 5 ± 2(2) → 5 ± 4 → (1, 9)

e.  Using Table II, Appendix B, with n = 25 and p = .20,
    P(1 < x < 9) = P(x ≤ 8) − P(x ≤ 1) = .953 − .027 = .926


4.52

Assuming the supplier's claim is true,

    μ = np = 500(.001) = .5
    σ = √(npq) = √(500(.001)(.999)) = √.4995 = .707

If the supplier's claim is true, we would expect to find only .5 defective switches in a sample of size 500. Therefore, it is not likely we would find 4. Based on the sample, the guarantee is probably inaccurate.

Note: z = (4 − .5)/.707 = 4.95. This is an unusually large z-score.


4.54

a.  For this test, n = 20 and p = .10. Then x is a binomial random variable with n = 20 and p = .10. Using Table II, Appendix B, with n = 20 and p = .10,
    P(x ≤ 1) = .392

b.  For the experiment in part a, the level of confidence is 1 − P(x ≤ 1) = 1 − .392 = .608. Since this value is not close to 1, this would not be an acceptable level.

c.  Suppose we increased n from 20 to 25. Using Table II, Appendix B, with n = 25 and p = .10,
    P(x ≤ 1) = .271. This value is smaller than the value found in part a.
    Now, suppose we keep n = 20, but change K to 0 instead of 1. Using Table II, Appendix B, with n = 20 and p = .10,
    P(x ≤ 0) = .122. This value is, again, smaller than the value found in part a.
d.  Suppose we let K = 0. We need to find n such that the level of confidence is at least .95, which means that P(x = 0) ≤ .05.

    P(x = 0) = [n!/(0!n!)](.1)^0(.9)^(n−0) ≤ .05
    .9^n ≤ .05
    ln(.9^n) ≤ ln(.05)
    n ln(.9) ≤ ln(.05)
    n ≥ ln(.05)/ln(.9) = −2.99573/−.10536 = 28.4

    Thus, if K = 0, then we need a sample size of at least 29 to get a level of confidence of at least .95.

    Now, suppose K = 1. We need to find n such that the level of confidence is at least .95, which means that P(x ≤ 1) ≤ .05.

    P(x ≤ 1) = P(x = 0) + P(x = 1) = [n!/(0!n!)](.1)^0(.9)^(n−0) + [n!/(1!(n−1)!)](.1)^1(.9)^(n−1) ≤ .05
    .9^n + n(.1)(.9)^(n−1) ≤ .05
    .9^(n−1)(.9 + .1n) ≤ .05

    From here, we will use trial and error.

    n      .9^(n−1)(.9 + .1n)
    30     .9^29(.9 + .1(30)) = .1837
    40     .9^39(.9 + .1(40)) = .0805
    45     .9^44(.9 + .1(45)) = .0524
    46     .9^45(.9 + .1(46)) = .0480

    Thus, for K = 1, we would need a sample size of 46 to get a level of confidence of at least .95.
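The trial-and-error search above is easy to automate. A minimal Python sketch (assuming p = .10 and the same acceptance rule, K defects or fewer) is:

    # Smallest n for which P(x <= K) <= .05 when p = .10.
    from math import comb

    def prob_at_most(K, n, p=0.10):
        return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(K + 1))

    for K in (0, 1):
        n = 1
        while prob_at_most(K, n) > 0.05:
            n += 1
        print(K, n, round(prob_at_most(K, n), 4))   # K = 0 -> n = 29, K = 1 -> n = 46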
4.56

λ = 1.5. Using Table III of Appendix B:

a.  P(x ≤ 3) = .934

b.  P(x ≥ 3) = 1 − P(x ≤ 2) = 1 − .809 = .191

c.  P(x = 3) = P(x ≤ 3) − P(x ≤ 2) = .934 − .809 = .125

d.  P(x = 0) = .223

e.  P(x > 0) = 1 − P(x = 0) = 1 − .223 = .777

f.  P(x > 6) = 1 − P(x ≤ 6) = 1 − .999 = .001

4.58

a.

To graph the Poisson probability distribution with λ = 5, we need to calculate p(x) for x = 0 to 15. Using Table III, Appendix B,
p(0) = .007
p(1) = P(x ≤ 1) − P(x ≤ 0) = .040 − .007 = .033
p(2) = P(x ≤ 2) − P(x ≤ 1) = .125 − .040 = .085
p(3) = P(x ≤ 3) − P(x ≤ 2) = .265 − .125 = .140
p(4) = P(x ≤ 4) − P(x ≤ 3) = .440 − .265 = .175
p(5) = P(x ≤ 5) − P(x ≤ 4) = .616 − .440 = .176
p(6) = P(x ≤ 6) − P(x ≤ 5) = .762 − .616 = .146
p(7) = P(x ≤ 7) − P(x ≤ 6) = .867 − .762 = .105
p(8) = P(x ≤ 8) − P(x ≤ 7) = .932 − .867 = .065
p(9) = P(x ≤ 9) − P(x ≤ 8) = .968 − .932 = .036
p(10) = P(x ≤ 10) − P(x ≤ 9) = .986 − .968 = .018
p(11) = P(x ≤ 11) − P(x ≤ 10) = .995 − .986 = .009
p(12) = P(x ≤ 12) − P(x ≤ 11) = .998 − .995 = .003
p(13) = P(x ≤ 13) − P(x ≤ 12) = .999 − .998 = .001
p(14) = P(x ≤ 14) − P(x ≤ 13) = 1.000 − .999 = .001
p(15) = P(x ≤ 15) − P(x ≤ 14) = 1.000 − 1.000 = .000


The graph is shown at right:

b.  μ = λ = 5
    σ = √λ = √5 = 2.2361
    μ ± 2σ → 5 ± 2(2.2361) → 5 ± 4.4722 → (.5278, 9.4722)

c.  P(.5278 < x < 9.4722) = P(1 ≤ x ≤ 9) = P(x ≤ 9) − P(x = 0) = .968 − .007 = .961

4.60

a.  E(x) = μ = λ = 6
    σ = √λ = √6 = 2.449

b.  z = (x − μ)/σ = (1 − 6)/2.449 = −2.041

c.  Using Table III, Appendix B, with λ = 6,
    P(x ≤ 10) = .957
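Readers without Table III can reproduce these Poisson lookups in software. A short Python sketch (assuming SciPy is available) is:

    # Poisson cdf/pmf checks for Exercises 4.56-4.60.
    from scipy.stats import poisson

    print(round(poisson.cdf(3, 1.5), 3))        # P(x <= 3), lambda = 1.5 -> .934
    print(round(1 - poisson.cdf(2, 1.5), 3))    # P(x >= 3), lambda = 1.5 -> .191
    print(round(poisson.pmf(3, 5), 3))          # p(3) when lambda = 5    -> .140

    mu, sigma = 6, 6 ** 0.5
    print(round(poisson.cdf(10, mu), 3))        # P(x <= 10), lambda = 6  -> .957
    print(round((1 - mu) / sigma, 3))           # z-score for x = 1       -> -2.041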

4.62

a.  In the problem, it is stated that E(x) = .03. This is also the value of λ.
    σ² = λ = .03

b.  The experiment consists of counting the number of deaths or missing persons in a three-year interval. We must assume that the probability of a death or missing person in a three-year period is the same for any three-year period. We must also assume that the number of deaths or missing persons in any three-year period is independent of the number of deaths or missing persons in any other three-year period.

c.  P(x = 1) = λ^1 e^(−λ)/1! = (.03)^1 e^(−.03)/1! = .0291

    P(x = 0) = λ^0 e^(−λ)/0! = (.03)^0 e^(−.03)/0! = .9704

4.64

a.  Using Table III and λ = 6.2,
    P(x = 2) = P(x ≤ 2) − P(x ≤ 1) = .054 − .015 = .039
    P(x = 6) = P(x ≤ 6) − P(x ≤ 5) = .574 − .414 = .160
    P(x = 10) = P(x ≤ 10) − P(x ≤ 9) = .949 − .902 = .047

b.  The plot of the distribution is: (bar chart of p(x) versus x)

c.  μ = λ = 6.2, σ = √λ = √6.2 = 2.490
    μ ± σ → 6.2 ± 2.49 → (3.71, 8.69)
    μ ± 2σ → 6.2 ± 2(2.49) → 6.2 ± 4.98 → (1.22, 11.18)
    μ ± 3σ → 6.2 ± 3(2.49) → 6.2 ± 7.47 → (−1.27, 13.67)
    See the plot in part b.

d.  First, we need to find the mean number of customers per hour. If the mean number of customers per 10 minutes is 6.2, then the mean number of customers per hour is 6.2(6) = 37.2 = λ.
    μ = λ = 37.2 and σ = √λ = √37.2 = 6.099
    μ ± 3σ → 37.2 ± 3(6.099) → 37.2 ± 18.297 → (18.903, 55.498)
    Using Chebyshev's Rule, we know at least 8/9, or 88.9%, of the observations will fall within 3 standard deviations of the mean. The number 75 is well beyond the 3-standard-deviation limit. Thus, it would be very unlikely that more than 75 customers entered the store per hour on Saturdays.


4.66

Let x = number of minor flaws in one square foot of a door's surface. Then x has a Poisson distribution with λ = .5.

μ = λ = .5. Using Table III, Appendix B:

    P(fail inspection) = P(2 or more minor flaws in the square foot inspected) = P(x ≥ 2) = 1 − P(x ≤ 1) = 1 − .910 = .090
    P(pass inspection) = P(x < 2) = P(x ≤ 1) = .910

4.68

If it takes exactly 5 minutes to wash a car and there are 5 cars in line, it will take 5(5) = 25 minutes to wash these 5 cars. Thus, for anyone to be in line at closing time, more than 1 car must arrive in the final half hour. In addition, if on average 10 cars arrive per hour, then an average of 5 cars will arrive per half hour (30 minutes). If we let x = number of cars to arrive in a half hour, then x is a Poisson random variable with λ = 5.

    P(x > 1) = 1 − P(x ≤ 1) = 1 − .04 = .96 (Using Table III, Appendix B)

Since this probability is so large, it is very likely that someone will be in line at closing time.
4.70

From Exercise 4.69, f(x) = .04 (20 ≤ x ≤ 45) and 0 otherwise.

a.  P(20 ≤ x ≤ 30) = (30 − 20)(.04) = .4

b.  P(20 < x < 30) = (30 − 20)(.04) = .4

c.  P(x ≥ 30) = (45 − 30)(.04) = .6

d.  P(x ≥ 45) = (45 − 45)(.04) = 0

e.  P(x ≤ 40) = (40 − 20)(.04) = .8

f.  P(x < 40) = (40 − 20)(.04) = .8

g.  P(15 ≤ x ≤ 35) = (35 − 20)(.04) = .6

h.  P(21.5 ≤ x ≤ 31.5) = (31.5 − 21.5)(.04) = .4

4.72

From Exercise 4.71, f(x) = 1/4 (3 ≤ x ≤ 7) and 0 otherwise.

a.  P(x ≥ a) = .6 → (1/4)(7 − a) = .6 → 7 − a = 2.4 → a = 4.6


b.  P(x ≤ a) = .25 → (1/4)(a − 3) = .25 → a − 3 = 1 → a = 4

c.  P(x ≤ a) = 1 → (1/4)(a − 3) = 1 → a − 3 = 4 → a = 7
    For any value of a ≥ 7, P(x ≤ a) = 1. Thus, a ≥ 7.

d.  P(4 ≤ x ≤ a) = .5 → (1/4)(a − 4) = .5 → a − 4 = 2 → a = 6

4.74

μ = (c + d)/2 = 10 → c + d = 20 → c = 20 − d

σ = (d − c)/√12 = 1 → d − c = √12

Substituting, d − (20 − d) = √12 → 2d − 20 = √12 → 2d = 20 + √12 → d = (20 + √12)/2 = 11.732

Since c + d = 20, c + 11.732 = 20 → c = 8.268

f(x) = 1/(d − c) = 1/(11.732 − 8.268) = 1/3.464 = .289 (c ≤ x ≤ d)

Therefore, f(x) = .289 (8.268 ≤ x ≤ 11.732) and 0 otherwise. The graph of the probability distribution for x is a horizontal line of height .289 over the interval (8.268, 11.732).


4.76

a.  For this problem, c = 0 and d = 1.
    f(x) = 1/(d − c) = 1/(1 − 0) = 1 (0 ≤ x ≤ 1) and 0 otherwise.

    μ = (c + d)/2 = (0 + 1)/2 = .5
    σ² = (d − c)²/12 = (1 − 0)²/12 = 1/12 = .0833

b.  P(.2 < x < .4) = (.4 − .2)(1) = .2

c.  P(x > .995) = (1 − .995)(1) = .005. Since the probability of observing a trajectory greater than .995 is so small, we would not expect to see a trajectory exceeding .995.

4.78

a.  For layer 2, let x = amount of loss. Since the amount of loss is random between .01 and .05 million dollars, the uniform distribution for x is:
    f(x) = 1/(d − c) = 1/(.05 − .01) = 1/.04 = 25 (.01 ≤ x ≤ .05) and 0 otherwise.
    A graph of the distribution is a horizontal line of height 25 over the interval (.01, .05).

    μ = (c + d)/2 = (.01 + .05)/2 = .03
    σ = (d − c)/√12 = (.05 − .01)/√12 = .0115, σ² = (.0115)² = .00013

    The mean loss for layer 2 is .03 million dollars and the variance of the loss for layer 2 is .00013 million dollars squared.


b.  For layer 6, let x = amount of loss. Since the amount of loss is random between .50 and 1.00 million dollars, the uniform distribution for x is:
    f(x) = 1/(d − c) = 1/(1.00 − .50) = 1/.50 = 2 (.50 ≤ x ≤ 1.00) and 0 otherwise.
    A graph of the distribution is a horizontal line of height 2 over the interval (.50, 1.00).

    μ = (c + d)/2 = (.50 + 1.00)/2 = .75
    σ = (d − c)/√12 = (1.00 − .50)/√12 = .1443, σ² = (.1443)² = .0208

    The mean loss for layer 6 is .75 million dollars and the variance of the loss for layer 6 is .0208 million dollars squared.

c.  A loss of $10,000 corresponds to x = .01. P(x > .01) = 1.

    A loss of $25,000 corresponds to x = .025.
    P(x < .025) = (Base)(Height) = (x − c)[1/(d − c)] = (.025 − .01)[1/(.05 − .01)] = .015(25) = .375

d.  A loss of $750,000 corresponds to x = .75. A loss of $1,000,000 corresponds to x = 1.
    P(.75 < x < 1) = (Base)(Height) = (d − x)[1/(d − c)] = (1.00 − .75)[1/(1.00 − .50)] = .25(2) = .5

    A loss of $900,000 corresponds to x = .90.
    P(x > .9) = (Base)(Height) = (d − x)[1/(d − c)] = (1.00 − .90)[1/(1.00 − .50)] = .10(2) = .20
    P(x = .9) = 0
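The uniform-distribution quantities in these exercises can also be computed in software. A brief Python sketch (assuming SciPy's uniform distribution, which is parameterized by loc = c and scale = d − c) is:

    # Uniform-distribution means, standard deviations, and probabilities (Exercise 4.78).
    from scipy.stats import uniform

    layer2 = uniform(loc=0.01, scale=0.05 - 0.01)     # losses between .01 and .05
    print(layer2.mean(), round(layer2.std(), 4))      # .03 and about .0115
    print(layer2.cdf(0.025))                          # P(x < .025) = .375

    layer6 = uniform(loc=0.50, scale=1.00 - 0.50)     # losses between .50 and 1.00
    print(1 - layer6.cdf(0.90))                       # P(x > .90) = .20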

4.80

Let x = cycle availability, where x has a uniform distribution on the interval from 0 to 1.

Mean: μ = (c + d)/2 = (0 + 1)/2 = .5
Standard deviation: σ = (d − c)/√12 = (1 − 0)/√12 = .289

The 10th percentile is the value of x such that 10% of all observations are below it. Let K1 = 10th percentile.
    P(x ≤ K1) = (K1 − 0)(1) = K1 = .10

The lower quartile is the value of x such that 25% of all observations are below it. Let K2 = 25th percentile.
    P(x ≤ K2) = (K2 − 0)(1) = K2 = .25

The upper quartile is the value of x such that 75% of all observations are below it. Let K3 = 75th percentile.
    P(x ≤ K3) = (K3 − 0)(1) = K3 = .75

4.82

a.

b.  μ = (c + d)/2 = (0 + 1)/2 = .5
    σ = (d − c)/√12 = (1 − 0)/√12 = .289
    σ² = .289² = .083

c.  P(p > .95) = (1 − .95)(1) = .05
    P(p < .95) = (.95 − 0)(1) = .95

d.  The analyst should use a uniform probability distribution with c = .90 and d = .95.
    f(p) = 1/(d − c) = 1/(.95 − .90) = 1/.05 = 20 (.90 ≤ p ≤ .95) and 0 otherwise.

4.84

Table IV in the text gives the area between z = 0 and z = z0. In this exercise, the answers may thus be read directly from the table by looking up the appropriate z.

a.  P(0 < z < 2.0) = .4772

b.  P(0 < z < 3.0) = .4987

c.  P(0 < z < 1.5) = .4332

d.  P(0 < z < .80) = .2881

4.86

a.  P(−1 ≤ z ≤ 1) = A1 + A2 = .3413 + .3413 = .6826

b.  P(−2 ≤ z ≤ 2) = A1 + A2 = .4772 + .4772 = .9544

c.  P(−2.16 < z ≤ 0.55) = A1 + A2 = .4846 + .2088 = .6934


d.  P(−.42 < z < 1.96) = P(−.42 ≤ z ≤ 0) + P(0 ≤ z ≤ 1.96) = A1 + A2 = .1628 + .4750 = .6378

e.  P(z ≥ −2.33) = P(−2.33 ≤ z ≤ 0) + P(z ≥ 0) = A1 + A2 = .4901 + .5000 = .9901

f.  P(z < 2.33) = P(z ≤ 0) + P(0 ≤ z ≤ 2.33) = A1 + A2 = .5000 + .4901 = .9901

4.88

a.  P(z = 1) = 0, since a single point does not have an area.

b.  P(z ≤ 1) = P(z ≤ 0) + P(0 < z ≤ 1) = A1 + A2 = .5 + .3413 = .8413 (Table IV, Appendix B)

c.  P(z < 1) = P(z ≤ 1) = .8413 (Refer to part b.)

d.  P(z > 1) = 1 − P(z ≤ 1) = 1 − .8413 = .1587 (Refer to part b.)

4.90

Using Table IV, Appendix B:


a.  P(z ≥ z0) = .05
    A1 = .5 − .05 = .4500
    Looking up the area .4500 in Table IV gives z0 = 1.645.

b.  P(z ≥ z0) = .025
    A1 = .5 − .025 = .4750
    Looking up the area .4750 in Table IV gives z0 = 1.96.

c.  P(z ≤ z0) = .025
    A1 = .5 − .025 = .4750
    Looking up the area .4750 in Table IV gives z = 1.96. Since z0 is to the left of 0, z0 = −1.96.

d.  P(z ≥ z0) = .10
    A1 = .5 − .1 = .4000
    Looking up the area .4000 in Table IV gives z0 = 1.28.

e.  P(z > z0) = .10
    A1 = .5 − .1 = .4000
    z0 = 1.28 (same as in part d)

4.92

a.  z = 1

b.  z = −1

c.  z = 0

d.  z = −2.5

e.  z = 3

4.94
Using Table IV of Appendix B:
a.  To find the probability that x assumes a value more than 2 standard deviations from μ:
    P(x < μ − 2σ) + P(x > μ + 2σ) = P(z < −2) + P(z > 2) = 2P(z > 2) = 2(.5000 − .4772) = 2(.0228) = .0456

    To find the probability that x assumes a value more than 3 standard deviations from μ:
    P(x < μ − 3σ) + P(x > μ + 3σ) = P(z < −3) + P(z > 3) = 2P(z > 3) = 2(.5000 − .4987) = 2(.0013) = .0026

b.  To find the probability that x assumes a value within 1 standard deviation of its mean:
    P(μ − σ < x < μ + σ) = P(−1 < z < 1) = 2P(0 < z < 1) = 2(.3413) = .6826

    To find the probability that x assumes a value within 2 standard deviations of μ:
    P(μ − 2σ < x < μ + 2σ) = P(−2 < z < 2) = 2P(0 < z < 2) = 2(.4772) = .9544

c.  To find the value of x that represents the 80th percentile, we must first find the value of z that corresponds to the 80th percentile.
    P(z < z0) = .80. Thus, A1 + A2 = .80. Since A1 = .50, A2 = .80 − .50 = .30. Using the body of Table IV, z0 = .84.
    To find x, we substitute the values into the z-score formula:
    z = (x − μ)/σ → .84 = (x − 1000)/10 → x = .84(10) + 1000 = 1008.4

    To find the value of x that represents the 10th percentile, we must first find the value of z that corresponds to the 10th percentile.
    P(z < z0) = .10. Thus, A1 = .50 − .10 = .40. Using the body of Table IV, z0 = −1.28. To find x, we substitute the values into the z-score formula:
    z = (x − μ)/σ → −1.28 = (x − 1000)/10 → x = −1.28(10) + 1000 = 987.2

4.96

The random variable x has a normal distribution with μ = 50 and σ = 3.


a.  P(x ≤ x0) = .8413
    So, A1 + A2 = .8413. Since A1 = .5, A2 = .8413 − .5 = .3413.
    Look up the area .3413 in the body of Table IV, Appendix B; z0 = 1.0.
    To find x0, substitute the values into the z-score formula:
    z = (x0 − μ)/σ → 1.0 = (x0 − 50)/3 → x0 = 50 + 3(1.0) = 53

b.  P(x > x0) = .025
    So, A = .5000 − .025 = .4750
    Look up the area .4750 in the body of Table IV, Appendix B; z0 = 1.96.
    To find x0, substitute the values into the z-score formula:
    z = (x0 − μ)/σ → 1.96 = (x0 − 50)/3 → x0 = 50 + 3(1.96) = 55.88

c.  P(x > x0) = .95
    So, A1 + A2 = .95. Since A2 = .5, A1 = .95 − .5 = .4500.
    Look up the area .4500 in the body of Table IV, Appendix B (since it is exactly between two values, average the z-scores); z0 ≈ −1.645.
    To find x0, substitute into the z-score formula:
    z = (x0 − μ)/σ → −1.645 = (x0 − 50)/3 → x0 = 50 − 3(1.645) = 45.065

d.  P(41 ≤ x < x0) = .8630
    z = (41 − 50)/3 = −3
    A1 = P(41 ≤ x ≤ μ) = P(−3 ≤ z ≤ 0) = P(0 ≤ z ≤ 3) = .4987
    A1 + A2 = .8630. Since A1 = .4987, A2 = .8630 − .4987 = .3643. Look up .3643 in the body of Table IV, Appendix B; z0 = 1.1.
    To find x0, substitute into the z-score formula:
    z = (x0 − μ)/σ → 1.1 = (x0 − 50)/3 → x0 = 50 + 3(1.1) = 53.3

e.  P(x < x0) = .10
    So A = .5000 − .10 = .4000
    Look up area .4000 in the body of Table IV, Appendix B; z = 1.28. Since z0 is to the left of 0, z0 = −1.28.
    To find x0, substitute the values into the z-score formula:
    z = (x0 − μ)/σ → −1.28 = (x0 − 50)/3 → x0 = 50 − 1.28(3) = 46.16

f.  P(x > x0) = .01
    So A = .5000 − .01 = .4900
    Look up area .4900 in the body of Table IV, Appendix B; z0 = 2.33.
    To find x0, substitute the values into the z-score formula:
    z = (x0 − μ)/σ → 2.33 = (x0 − 50)/3 → x0 = 50 + 2.33(3) = 56.99
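The Table IV lookups used throughout these normal-distribution exercises can be reproduced with a normal cdf/quantile routine. A short Python sketch (assuming SciPy) is:

    # Normal quantiles replace Table IV lookups (Exercises 4.90-4.96, 4.106).
    from scipy.stats import norm

    print(round(norm.ppf(0.95), 3))          # z0 with P(z >= z0) = .05 -> 1.645
    print(round(norm.ppf(0.80), 2))          # 80th percentile z        -> 0.84

    mu, sigma = 50, 3
    print(round(norm.ppf(0.8413, mu, sigma), 2))        # x0 with P(x <= x0) = .8413 -> about 53
    print(round(norm.ppf(0.99, 850_000, 170_000), 0))   # R in Exercise 4.106 (the text's 1,246,100 uses the rounded z = 2.33)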

4.98

a.  Using Table IV, Appendix B,
    P(x > 0) = P(z > (0 − 5.26)/10) = P(z > −0.526) = .5 + P(−0.53 < z < 0) = .5 + .2019 = .7019

b.  P(5 < x < 15) = P((5 − 5.26)/10 < z < (15 − 5.26)/10) = P(−0.026 < z < 0.974)
    = P(−.03 < z < 0) + P(0 < z < .97) = .0120 + .3340 = .3460

c.  P(x < 1) = P(z < (1 − 5.26)/10) = P(z < −0.426) = .5 − P(−0.43 < z < 0) = .5 − .1664 = .3336

d.  P(x ≤ −25) = P(z ≤ (−25 − 5.26)/10) = P(z ≤ −3.026) = .5 − P(−3.03 ≤ z < 0) = .5 − .4988 = .0012

    Since the probability of seeing a win percentage of −25% or anything more unusual is so small (p = .0012), we would conclude that the average casino win percentage is not 5.26%.

4.100

Let x = driver's head injury rating. The random variable x has a normal distribution with μ = 605 and σ = 185. Using Table IV, Appendix B,

a.  P(500 < x < 700) = P((500 − 605)/185 < z < (700 − 605)/185) = P(−0.57 < z < 0.51)
    = P(−0.57 < z < 0) + P(0 < z < 0.51) = .2157 + .1950 = .4107

b.  P(400 < x < 500) = P((400 − 605)/185 < z < (500 − 605)/185) = P(−1.11 < z < −0.57)
    = P(−1.11 < z < 0) − P(−0.57 < z < 0) = .3665 − .2157 = .1508

c.  P(x < 850) = P(z < (850 − 605)/185) = P(z < 1.32) = .5 + P(0 < z < 1.32) = .5 + .4066 = .9066

d.  P(x > 1,000) = P(z > (1,000 − 605)/185) = P(z > 2.14) = .5 − P(0 < z < 2.14) = .5 − .4838 = .0162
4.102

a.  Let x = crop yield. The random variable x has a normal distribution with μ = 1,500 and σ = 250.
    P(x < 1,600) = P(z < (1,600 − 1,500)/250) = P(z < .4) = .5 + .1554 = .6554 (Using Table IV)

b.  Let x1 = crop yield in the first year and x2 = crop yield in the second year. If x1 and x2 are independent, then the probability that the farm will lose money for two straight years is:
    P(x1 < 1,600)·P(x2 < 1,600) = P(z1 < (1,600 − 1,500)/250)·P(z2 < (1,600 − 1,500)/250)
    = P(z1 < .4)·P(z2 < .4) = (.5 + .1554)(.5 + .1554) = .6554(.6554) = .4295 (Using Table IV)

c.  P(1,500 − 2σ ≤ x ≤ 1,500 + 2σ) = P(([1,500 − 2σ] − 1,500)/σ ≤ z ≤ ([1,500 + 2σ] − 1,500)/σ)
    = P(−2 ≤ z ≤ 2) = 2P(0 ≤ z ≤ 2) = 2(.4772) = .9544 (Using Table IV)

4.104

Let x = wage rate. The random variable x is normally distributed with μ = 16 and σ = 1.25. Using Table IV, Appendix B,

a.  P(x > 17.30) = P(z > (17.30 − 16)/1.25) = P(z > 1.04) = .5 − P(0 < z < 1.04) = .5 − .3508 = .1492

b.  P(x > 17.30) = P(z > (17.30 − 16)/1.25) = P(z > 1.04) = .5 − P(0 < z < 1.04) = .5 − .3508 = .1492

c.  P(x ≤ median) = P(x ≥ median) = .5
    Thus, the median = μ = 16.
    (Recall from Section 2.4 that in a symmetric distribution, the mean equals the median.)

4.106

a.  The contract will be profitable if total cost, x, is less than $1,000,000.
    P(x < 1,000,000) = P(z < (1,000,000 − 850,000)/170,000) = P(z < .88) = .5 + .3106 = .8106

b.  The contract will result in a loss if total cost, x, exceeds $1,000,000.
    P(x > 1,000,000) = 1 − P(x < 1,000,000) = 1 − .8106 = .1894

c.  P(x < R) = .99. Find R.
    P(x < R) = P(z < (R − 850,000)/170,000) = P(z < z0) = .99
    A1 = .99 − .5 = .4900
    Looking up the area .4900 in Table IV, z0 = 2.33.
    z0 = (R − 850,000)/170,000 → 2.33 = (R − 850,000)/170,000 → R = 2.33(170,000) + 850,000 = $1,246,100

4.108

a.  Let x = quantity injected per container. The random variable x has a normal distribution with μ = 10 and σ = .2.
    P(x < 10) = P(z < (10 − 10)/.2) = P(z < 0.0) = .5
    P(x ≥ 10) = P(z ≥ (10 − 10)/.2) = P(z ≥ 0.0) = .5

b.  Since the container needed to be reprocessed, it cost $10. Upon refilling, it contained 10.60 units at a cost of 10.60($20) = $212. Thus, the total cost for filling this container is $10 + $212 = $222. Since the container sells for $230, the profit is $230 − $222 = $8.

c.  Let x = quantity injected per container. The random variable x has a normal distribution with μ = 10.10 and σ = .2. The expected value of x is E(x) = μ = 10.10. The cost of a container with 10.10 units is 10.10($20) = $202. Thus, the expected profit would be the selling price minus the cost, or $230 − $202 = $28.

4.110

a.

If z is a standard normal random variable,


QL = zL is the value of the standard normal distribution which has 25% of the data to the
left and 75% to the right.
Find zL such that P(z < zL) = .25
A1 = .50 − .25 = .25.
Look up the area A1 = .25 in the body of Table IV of Appendix B; zL = −.67 (taking the closest value). If interpolation is used, −.675 would be obtained.


QU = zU is the value of the standard normal distribution which has 75% of the data to the
left and 25% to the right.
Find zU such that P(z < zU) = .75
A1 + A2 = P(z 0) + P(0 z zU)
= .5 + P(0 z zU)
= .75
Therefore, P(0 z zU) = .25.
Look up the area .25 in the body of Table IV of Appendix B; zU = .67 (taking the closest
value).
b.  Recall that the inner fences of a box plot are located 1.5(QU − QL) outside the hinges (QL and QU).
    Lower inner fence = QL − 1.5(QU − QL) = −.67 − 1.5(.67 − (−.67)) = −.67 − 1.5(1.34) = −2.68 (−2.70 if zL = −.675 and zU = +.675)
    Upper inner fence = QU + 1.5(QU − QL) = .67 + 1.5(.67 − (−.67)) = .67 + 1.5(1.34) = 2.68 (+2.70 if zL = −.675 and zU = +.675)

c.  Recall that the outer fences of a box plot are located 3(QU − QL) outside the hinges (QL and QU).
    Lower outer fence = QL − 3(QU − QL) = −.67 − 3(.67 − (−.67)) = −.67 − 3(1.34) = −4.69 (−4.725 if zL = −.675 and zU = +.675)
    Upper outer fence = QU + 3(QU − QL) = .67 + 3(.67 − (−.67)) = .67 + 3(1.34) = 4.69 (+4.725 if zL = −.675 and zU = +.675)


d.  P(z < −2.68) + P(z > 2.68) = 2P(z > 2.68) = 2(.5000 − .4963) = 2(.0037) = .0074 (Table IV, Appendix B)
    (or 2(.5000 − .4965) = .0070 if −2.70 and 2.70 are used)

    P(z < −4.69) + P(z > 4.69) = 2P(z > 4.69) ≈ 2(.5000 − .5000) ≈ 0

e.  In a normal probability distribution, the probability of an observation being beyond the inner fences is only .0074 and the probability of an observation being beyond the outer fences is approximately zero. Since the probability is so small, there should not be any observations beyond the inner and outer fences. Therefore, any such observations are probably outliers.

4.112

a.

IQR = QU − QL = 195 − 72 = 123

b.

IQR/s = 123/95 = 1.295

c.

Yes. Since IQR/s is approximately 1.3, this implies that the data are approximately normal.

4.114

a.

Using MINITAB, the stem-and-leaf display is:


Stem-and-leaf of X    N = 28
Leaf Unit = 0.10
(stem-and-leaf display)

Since the data do not form a mound-shape, it indicates that the data may not be normally
distributed.
b.

Using MINITAB, the descriptive statistics are:


Variable    N     Mean    Median   TrMean   StDev   SE Mean
X           28    5.511   6.100    5.519    2.765   0.5230

Variable    Minimum   Maximum   Q1      Q3
X           1.100     9.700     3.350   8.050

The standard deviation is 2.765.


c.

Using the printout from MINITAB in part b, QL = 3.35 and QU = 8.05. The IQR = QU − QL = 8.05 − 3.35 = 4.7. If the data are normally distributed, then IQR/s ≈ 1.3. For these data, IQR/s = 4.7/2.765 = 1.70. This is a fair amount larger than 1.3, which indicates that the data may not be normally distributed.

d.

Using MINITAB, the normal probability plot is:

The data at the extremes are not particularly on a straight line. This indicates that the data are
not normally distributed.

4.116

From the normal probability plot, it appears that the data may not be normal. The points with
small observed values and the points with large observed values do not fall on the straight line.
This implies that the data may not be from a normal distribution.

4.118

a.

We will look at the 4 methods for determining if the data are normal. First, we will look
at a histogram of the data. Using MINITAB, the histogram of the fish weights is:

From the histogram, the data appear to be fairly mound-shaped. This indicates that the
data may be normal.


Next, we look at the intervals x s, x 2 s, x 3s . If the proportions of observations


falling in each interval are approximately .68, .95, and 1.00, then the data are
approximately normal. Using MINITAB, the summary statistics are:
Descriptive Statistics: Weight
Variable    N     Mean     Median   TrMean   StDev   SE Mean
Weight      144   1049.7   1000.0   1039.4   376.5   31.4

Variable    Minimum   Maximum   Q1      Q3
Weight      173.0     2302.0    804.5   1263.3

x̄ ± s → 1049.7 ± 376.5 → (673.2, 1,426.2). 98 of the 144 values fall in this interval. The proportion is .68. This is exactly the .68 we would expect if the data were normal.

x̄ ± 2s → 1049.7 ± 2(376.5) → 1049.7 ± 753 → (296.7, 1,802.7). 140 of the 144 values fall in this interval. The proportion is .97. This is somewhat larger than the .95 we would expect if the data were normal.

x̄ ± 3s → 1049.7 ± 3(376.5) → 1049.7 ± 1,129.5 → (−79.8, 2,179.2). 143 of the 144 values fall in this interval. The proportion is .993. This is close to the 1.00 we would expect if the data were normal.

From this method, it appears that the data are normal.

Next, we look at the ratio of the IQR to s. IQR = QU − QL = 1263.3 − 804.5 = 458.8.
IQR/s = 458.8/376.5 = 1.22. This is close to the 1.3 we would expect if the data were normal. This method indicates the data are normal.
Finally, using MINITAB, the normal probability plot is:
Normal Probability Plot for Weight (ML Estimates − 95% CI): Mean = 1049.72, StDev = 375.236, AD* = 0.793

Since the data form a fairly straight line, the data appear to be normal.


From the 4 different methods, all indications are that the fish weight data are
approximately normal.
b.

We will look at the 4 methods for determining if the data are normal. First, we will look
at a histogram of the data. Using MINITAB, the histogram of the fish DDT levels is:


From the histogram, the data appear to be skewed to the right. This indicates that the data
may not be normal.
Next, we look at the intervals x s, x 2 s, x 3s . If the proportions of observations
falling in each interval are approximately .68, .95, and 1.00, then the data are
approximately normal. Using MINITAB, the summary statistics are:
Descriptive Statistics: DDT
Variable    N     Mean    Median   TrMean   StDev   SE Mean
DDT         144   24.35   7.15     10.38    98.38   8.20

Variable    Minimum   Maximum   Q1     Q3
DDT         0.11      1100.00   3.33   13.00

x̄ ± s → 24.35 ± 98.38 → (−74.03, 122.73). 138 of the 144 values fall in this interval. The proportion is .96. This is much greater than the .68 we would expect if the data were normal.

x̄ ± 2s → 24.35 ± 2(98.38) → 24.35 ± 196.76 → (−172.41, 221.11). 142 of the 144 values fall in this interval. The proportion is .986. This is much larger than the .95 we would expect if the data were normal.

x̄ ± 3s → 24.35 ± 3(98.38) → 24.35 ± 295.14 → (−270.79, 319.49). 142 of the 144 values fall in this interval. The proportion is .986. This is somewhat lower than the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

Next, we look at the ratio of the IQR to s. IQR = QU − QL = 13.00 − 3.33 = 9.67.


IQR/s = 9.67/98.38 = 0.098. This is much smaller than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.
Finally, using MINITAB, the normal probability plot is:
Normal Probability Plot for DDT (ML Estimates − 95% CI): Mean = 24.355, StDev = 98.0364, AD* = 38.58

Since the data do not form a straight line, the data are not normal.
From the 4 different methods, all indications are that the fish DDT level data are not normal.
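The four normality checks illustrated above can be scripted for any sample. The Python sketch below (assuming NumPy and SciPy; it uses simulated data rather than the actual fish measurements) shows the idea:

    # Empirical-rule proportions, IQR/s, and a normal probability plot correlation.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = rng.normal(loc=1050, scale=375, size=144)     # stand-in sample

    xbar, s = x.mean(), x.std(ddof=1)
    for k in (1, 2, 3):                               # compare with .68, .95, 1.00
        frac = np.mean((x > xbar - k * s) & (x < xbar + k * s))
        print(k, round(frac, 3))

    q1, q3 = np.percentile(x, [25, 75])
    print(round((q3 - q1) / s, 2))                    # IQR/s, compare with 1.3

    r = stats.probplot(x, dist="norm")[1][2]          # correlation from the normal probability plot
    print(round(r, 3))                                # near 1 for normal-looking data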
4.120

We will look at the 4 methods for determining if the data are normal. First, we will look at
a histogram of the data. Using MINITAB, the histogram of the sanitation scores is:
Histogram of SCORE



From the histogram, the data appear to be skewed to the left. This indicates that the data are
not normal.
Next, we look at the intervals x s, x 2 s, x 3s . If the proportions of observations falling
in each interval are approximately .68, .95, and 1.00, then the data are approximately normal.
Using MINITAB, the summary statistics are:
Descriptive Statistics: SCORE

Variable    N     Mean     StDev   Q1   Q3
SCORE       169   94.911   4.825   93   98

x̄ ± s → 94.911 ± 4.825 → (90.086, 99.736). 137 of the 169 values fall in this interval. The proportion is .81. This is much larger than the .68 we would expect if the data were normal.

x̄ ± 2s → 94.911 ± 2(4.825) → 94.911 ± 9.65 → (85.261, 104.561). 165 of the 169 values fall in this interval. The proportion is .98. This is somewhat larger than the .95 we would expect if the data were normal.

x̄ ± 3s → 94.911 ± 3(4.825) → 94.911 ± 14.475 → (80.436, 109.386). 166 of the 169 values fall in this interval. The proportion is .982. This is somewhat smaller than the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

Next, we look at the ratio of the IQR to s. IQR = QU − QL = 98 − 93 = 5.
IQR/s = 5/4.825 = 1.036. This is smaller than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.
Finally, using MINITAB, the normal probability plot is:
Probability Plot of SCORE (Normal − 95% CI): Mean = 94.91, StDev = 4.825, N = 169, AD = 7.216, P-Value < 0.005

Since the data do not form a straight line, the data are not normal.
From the 4 different methods, all indications are that the sanitation scores data are not normal.
4.122

We will look at the 4 methods for determining if the data are normal. First, we will look at
a histogram of the data. Using MINITAB, the histogram of the tensile strength values is:
Histogram of Strength

From the histogram, the data appear to be somewhat skewed to the left. This might indicate
that the data are not normal.
Next, we look at the intervals x s, x 2 s, x 3s . If the proportions of observations falling
in each interval are approximately .68, .95, and 1.00, then the data are approximately normal.
Using MINITAB, the summary statistics are:
Descriptive Statistics: Strength
Variable    N    N*   Mean     SE Mean   StDev   Minimum   Q1       Median   Q3       Maximum
Strength    11   0    342.13   2.38      7.91    328.20    334.70   343.60   347.80   356.30

x̄ ± s → 342.13 ± 7.91 → (334.22, 350.04). 8 of the 11 values fall in this interval. The proportion is .73. This is somewhat larger than the .68 we would expect if the data were normal.

x̄ ± 2s → 342.16 ± 2(7.91) → 342.16 ± 15.82 → (326.34, 357.98). All 11 of the 11 values fall in this interval. The proportion is 1.00. This is somewhat larger than the .95 we would expect if the data were normal.

x̄ ± 3s → 342.16 ± 3(7.91) → 342.16 ± 23.73 → (318.43, 365.89). Again, all 11 of the 11 values fall in this interval. The proportion is 1.00. This is equal to the 1.00 we would expect if the data were normal.


From this method, it appears that the data are quite normal.
Next, we look at the ratio of the IQR to s. IQR = QU − QL = 347.80 − 334.70 = 13.1.
IQR/s = 13.1/7.91 = 1.656. This is much larger than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.

Finally, using MINITAB, the normal probability plot is:


Probability Plot of Strength (Normal − 95% CI): Mean = 342.1, StDev = 7.907, N = 11, AD = 0.154, P-Value = 0.937

Since the data do form a fairly straight line, the data could be normal.
From the 4 different methods, three of the four indicate that the data probably are not from a
normal distribution.
4.124

a.  In order to approximate the binomial distribution with the normal distribution, the interval μ ± 3σ → np ± 3√(npq) should lie in the range 0 to n.
    When n = 25 and p = .4,
    np ± 3√(npq) → 25(.4) ± 3√(25(.4)(1 − .4)) → 10 ± 3√6 → 10 ± 7.3485 → (2.6515, 17.3485)
    Since the interval calculated does lie in the range 0 to 25, we can use the normal approximation.

b.  μ = np = 25(.4) = 10
    σ² = npq = 25(.4)(.6) = 6

c.  P(x ≥ 9) = 1 − P(x ≤ 8) = 1 − .274 = .726 (Table II, Appendix B)

d.  P(x ≥ 9) ≈ P(z ≥ ((9 − .5) − 10)/√6) = P(z ≥ −.61) = .5000 + .2291 = .7291 (Using Table IV in Appendix B)
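A short Python sketch (assuming SciPy) compares the exact binomial probability with the continuity-corrected normal approximation used here:

    # Normal approximation to the binomial with continuity correction (Exercise 4.124).
    from math import sqrt
    from scipy.stats import norm, binom

    n, p = 25, 0.4
    mu, sigma = n * p, sqrt(n * p * (1 - p))

    exact = 1 - binom.cdf(8, n, p)                    # P(x >= 9), about .726
    approx = 1 - norm.cdf((8.5 - mu) / sigma)         # continuity-corrected, about .73
    print(round(exact, 3), round(approx, 4))

    # Rule of thumb checked above: mu +/- 3*sigma should stay inside (0, n).
    print(mu - 3 * sigma > 0 and mu + 3 * sigma < n)  # True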


4.126

μ = np = 1000(.5) = 500, σ = √(npq) = √(1000(.5)(.5)) = 15.811

a.  Using the normal approximation,
    P(x > 500) ≈ P(z > ((500 + .5) − 500)/15.811) = P(z > .03) = .5 − .0120 = .4880 (from Table IV, Appendix B)

b.  P(490 ≤ x < 500) ≈ P(((490 − .5) − 500)/15.811 ≤ z < ((500 − .5) − 500)/15.811) = P(−.66 ≤ z < −.03)
    = .2454 − .0120 = .2334 (from Table IV, Appendix B)

c.  P(x > 550) ≈ P(z > ((550 + .5) − 500)/15.811) = P(z > 3.19) ≈ .5 − .5 = 0 (from Table IV, Appendix B)

4.128

a.  E(x) = μ = np = 350(.27) = 94.5

b.  σ = √σ² = √(npq) = √(350(.27)(.73)) = √68.985 = 8.306

c.  z = (99.5 − 94.5)/8.306 = 0.60

d.  To see if the normal approximation is appropriate, we use:
    μ ± 3σ → 94.5 ± 3(8.306) → 94.5 ± 24.918 → (69.582, 119.418)
    Since the interval lies in the range 0 to 350, the normal approximation is appropriate.
    P(x ≥ 100) ≈ P(z ≥ 0.60) = .5 − .2257 = .2743 (Using Table IV, Appendix B)
4.130

Let x = number of white-collar employees in good shape who will develop stress-related illnesses in a sample of 400. Then x is a binomial random variable with n = 400 and p = .10. To see if the normal approximation is appropriate for this problem:

    np ± 3√(npq) → 400(.1) ± 3√(400(.1)(.9)) → 40 ± 18 → (22, 58)

Since this interval is contained in the interval (0, n) = (0, 400), the normal approximation is appropriate.

    P(x > 60) ≈ P(z > ((60 + .5) − 40)/6) = P(z > 3.42) ≈ .5000 − .5000 = 0


4.132

a.  For n = 100 and p = .01:
    μ ± 3σ → np ± 3√(npq) → 100(.01) ± 3√(100(.01)(.99)) → 1 ± 3(.995) → 1 ± 2.985 → (−1.985, 3.985)
    Since the interval does not lie in the range 0 to 100, we cannot use the normal approximation to approximate the probabilities.

b.  For n = 100 and p = .5:
    μ ± 3σ → np ± 3√(npq) → 100(.5) ± 3√(100(.5)(.5)) → 50 ± 3(5) → 50 ± 15 → (35, 65)
    Since the interval lies in the range 0 to 100, we can use the normal approximation to approximate the probabilities.

c.  For n = 100 and p = .9:
    μ ± 3σ → np ± 3√(npq) → 100(.9) ± 3√(100(.9)(.1)) → 90 ± 3(3) → 90 ± 9 → (81, 99)
    Since the interval lies in the range 0 to 100, we can use the normal approximation to approximate the probabilities.

4.134

b.  Let v = number of credit card users out of 100 who carry Visa. Then v is a binomial random variable with n = 100 and pv = .539.
    E(v) = npv = 100(.539) = 53.9
    Let d = number of credit card users out of 100 who carry Discover. Then d is a binomial random variable with n = 100 and pd = .040.
    E(d) = npd = 100(.040) = 4.0

c.  To see if the normal approximation is valid, we use:
    μ ± 3σ → npv ± 3√(npvqv) → 100(.539) ± 3√(100(.539)(.461)) → 53.9 ± 3(4.985) → 53.9 ± 14.954 → (38.946, 68.854)
    Since the interval lies in the range 0 to 100, we can use the normal approximation to approximate the probability.
    P(v ≥ 50) ≈ P(z ≥ ((50 − .5) − 53.9)/4.985) = P(z ≥ −.88) = .5 + .3106 = .8106

    Let a = number of credit card users out of 100 who carry American Express. Then a is a binomial random variable with n = 100 and pa = .132. To see if the normal approximation is valid, we use:
    μ ± 3σ → npa ± 3√(npaqa) → 100(.132) ± 3√(100(.132)(.868)) → 13.2 ± 3(3.385) → 13.2 ± 10.155 → (3.045, 23.355)
    Since the interval lies in the range 0 to 100, we can use the normal approximation to approximate the probability.
    P(a ≥ 50) ≈ P(z ≥ ((50 − .5) − 13.2)/3.385) = P(z ≥ 10.72) ≈ .5 − .5 = 0

d.  In order for the normal approximation to be valid, μ ± 3σ must lie in the interval (0, n). This check was done in part c for both portions of the question. In both cases, the normal approximation was justified.

4.136

a.  If 80% of the passengers pass through without their luggage being inspected, then 20% will be detained for luggage inspection. The expected number of passengers detained will be:
    E(x) = np = 1,500(.2) = 300

b.  For n = 4,000, E(x) = np = 4,000(.2) = 800

c.  P(x > 600) ≈ P(z > ((600 + .5) − 800)/√(4000(.2)(.8))) = P(z > −7.89) = .5 + .5 = 1.0

4.140

E(x) = μ = Σxp(x) = 1(.2) + 2(.3) + 3(.2) + 4(.2) + 5(.1) = .2 + .6 + .6 + .8 + .5 = 2.7

E(x̄) = Σx̄p(x̄) = 1.0(.04) + 1.5(.12) + 2.0(.17) + 2.5(.20) + 3.0(.20) + 3.5(.14) + 4.0(.08) + 4.5(.04) + 5.0(.01)
     = .04 + .18 + .34 + .50 + .60 + .49 + .32 + .18 + .05 = 2.7

4.144

The sampling distribution is approximately normal only if the sample size is sufficiently large
or if the population being sampled from is normal.

4.146

a.  μx̄ = μ = 10, σx̄ = σ/√n = 3/√25 = 0.6

b.  μx̄ = μ = 100, σx̄ = σ/√n = 25/√25 = 5

c.  μx̄ = μ = 20, σx̄ = σ/√n = 40/√25 = 8

d.  μx̄ = μ = 10, σx̄ = σ/√n = 100/√25 = 20


4.148

4.150

a.  μx̄ = μ = 20, σx̄ = σ/√n = 16/√64 = 2

b.  By the Central Limit Theorem, the distribution of x̄ is approximately normal. In order for the Central Limit Theorem to apply, n must be sufficiently large. For this problem, n = 64 is sufficiently large.

c.  z = (x̄ − μx̄)/σx̄ = (15.5 − 20)/2 = −2.25

d.  z = (x̄ − μx̄)/σx̄ = (23 − 20)/2 = 1.50

4.150

For this population and sample size,

    E(x̄) = μ = 100, σx̄ = σ/√n = 10/√900 = 1/3

a.  Approximately 95% of the time, x̄ will be within two standard deviations of its mean, i.e., μ ± 2σx̄ → 100 ± 2(1/3) → (99.33, 100.67). Almost all of the time, the sample mean will be within three standard deviations of the mean, i.e., μ ± 3σx̄ → 100 ± 3(1/3) → 100 ± 1 → (99, 101).

b.  No more than three standard deviations, i.e., 3σx̄ = 3(1/3) = 1.

c.  No, the previous answer depended only on the standard deviation of the sampling distribution of the sample mean, not on the mean itself.

4.154

a.  μx̄ = μ = 98,500

b.  σx̄ = σ/√n = 30,000/√50 = 4,242.6407

c.  By the Central Limit Theorem, the sampling distribution of x̄ is approximately normal.

d.  z = (x̄ − μx̄)/σx̄ = (89,500 − 98,500)/4,242.6407 = −2.12

e.  P(x̄ > 89,500) = P(z > −2.12) = .5 + .4830 = .9830 (Using Table IV, Appendix B)
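A brief Python sketch (assuming SciPy) of the same sampling-distribution calculation, using the Central Limit Theorem result that x̄ is approximately normal with standard error σ/√n:

    # Sampling distribution of the sample mean (Exercise 4.154).
    from math import sqrt
    from scipy.stats import norm

    mu, sigma, n = 98_500, 30_000, 50
    se = sigma / sqrt(n)                        # 4,242.64
    z = (89_500 - mu) / se                      # -2.12
    print(round(se, 2), round(z, 2))
    print(round(1 - norm.cdf(89_500, mu, se), 4))   # P(x-bar > 89,500), about .9830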

4.156

a.

μx̄ = μ = 89.34; σx̄ = σ/√n = 7.74/√35 = 1.3083

b.

c.

d.

4.158

a.

P(x̄ > 88) = P(z > (88 − 89.34)/1.3083) = P(z > −1.02) = .5 + .3461 = .8461 (using Table IV, Appendix B)

P(x̄ < 87) = P(z < (87 − 89.34)/1.3083) = P(z < −1.79) = .5 − .4633 = .0367 (using Table IV, Appendix B)

Since the sample size is small, we also have to assume that the distribution from which the sample was drawn is normal. μx̄ = μ = 1.8, σx̄ = σ/√n = .5/√20 = .1118

    P(x̄ ≥ 1.85) = P(z ≥ (1.85 − 1.8)/.1118) = P(z ≥ 0.45) = .5 − .1736 = .3264 (using Table IV, Appendix B)

b.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Rough

Variable    N    N*   Mean    SE Mean   StDev   Minimum   Q1      Median   Q3      Maximum
Rough       20   0    1.881   0.117     0.524   1.060     1.303   2.040    2.293   2.640

From this output, the value of x̄ is 1.881.


c.  For x̄ = 1.881:
    P(x̄ ≥ 1.881) = P(z ≥ (1.881 − 1.8)/.1118) = P(z ≥ 0.72) = .5 − .2642 = .2358
    Since this probability is fairly high, observing a sample mean of x̄ = 1.881 is not unusual. The assumptions in part a appear to be valid.
4.160

If the observations are independent of each other, then

P(1, 1) = p(1)p(1) = .2(.2) = .04


P(1, 2) = p(1)p(2) = .2(.3) = .06
P(1, 3) = p(1)p(3) = .2(.2) = .04
etc.


a.  Possible Samples    x̄     p(x̄)        Possible Samples    x̄     p(x̄)
    1, 1              1.0     .04          3, 4              3.5     .04
    1, 2              1.5     .06          3, 5              4.0     .02
    1, 3              2.0     .04          4, 1              2.5     .04
    1, 4              2.5     .04          4, 2              3.0     .06
    1, 5              3.0     .02          4, 3              3.5     .04
    2, 1              1.5     .06          4, 4              4.0     .04
    2, 2              2.0     .09          4, 5              4.5     .02
    2, 3              2.5     .06          5, 1              3.0     .02
    2, 4              3.0     .06          5, 2              3.5     .03
    2, 5              3.5     .03          5, 3              4.0     .02
    3, 1              2.0     .04          5, 4              4.5     .02
    3, 2              2.5     .06          5, 5              5.0     .01
    3, 3              3.0     .04

    Summing the probabilities, the probability distribution of x̄ is:

    x̄       1    1.5     2    2.5     3    3.5     4    4.5     5
    p(x̄)  .04   .12   .17   .20   .20   .14   .08   .04   .01

b.


c.  P(x̄ ≥ 4.5) = .04 + .01 = .05

d.  No. The probability of observing x̄ = 4.5 or larger is small (.05).
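The sampling distribution of x̄ above can also be built by brute-force enumeration. This is an illustrative sketch (assuming Python), not part of the original solution.

    from itertools import product
    from collections import defaultdict

    p = {1: .2, 2: .3, 3: .2, 4: .2, 5: .1}          # population distribution of x

    dist = defaultdict(float)                        # sampling distribution of x-bar, n = 2
    for x1, x2 in product(p, repeat=2):
        dist[(x1 + x2) / 2] += p[x1] * p[x2]         # independence: p(x1, x2) = p(x1)p(x2)

    for xbar in sorted(dist):
        print(xbar, round(dist[xbar], 2))
    print(sum(xbar * pr for xbar, pr in dist.items()))          # E(x-bar) = 2.7
    print(sum(pr for xbar, pr in dist.items() if xbar >= 4.5))  # P(x-bar >= 4.5) = .05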


4.162

For n = 36, μx̄ = μ = 406 and σx̄ = σ/√n = 10.1/√36 = 1.6833. By the Central Limit
Theorem, the sampling distribution of x̄ is approximately normal (n is large).

    P(x̄ ≤ 400.8) = P(z ≤ (400.8 - 406)/1.6833) = P(z ≤ -3.09) = .5 - .4990 = .0010
    (using Table IV, Appendix B)

The first. If the true value of μ is 406, it would be extremely unlikely to observe an x̄ as small
as 400.8 or smaller (probability .0010). Thus, we would infer that the true value of μ is less
than 406.

4.164

4.166

a.

This experiment consists of 100 trials. Each trial results in one of two outcomes: chip is
defective or not defective. If the number of chips produced in one hour is much larger
than 100, then we can assume the probability of a defective chip is the same on each trial
and that the trials are independent. Thus, x is a binomial. If, however, the number of
chips produced in an hour is not much larger than 100, the trials would not be
independent. Then x would not be a binomial random variable.

b.

This experiment consists of two trials. Each trial results in one of two outcomes:
applicant qualified or not qualified. However, the trials are not independent. The
probability of selecting a qualified applicant on the first trial is 3 out of 5. The
probability of selecting a qualified applicant on the second trial depends on what
happened on the first trial. Thus, x is not a binomial random variable. It is a
hypergeometric random variable.

c.

The number of trials is not a specified number in this experiment, thus x is not a binomial
random variable. In this experiment, x is counting the number of calls received.

d.

The number of trials in this experiment is 1000. Each trial can result in one of two
outcomes: favor state income tax or not favor state income tax. Since 1000 is small
compared to the number of registered voters in Florida, the probability of selecting a
voter in favor of the state income tax is the same from trial to trial, and the trials are
independent of each other. Thus, x is a binomial random variable.

a.  μ = E(x) = Σx p(x) = 10(.2) + 12(.3) + 18(.1) + 20(.4) = 15.4

    σ² = E[(x - μ)²] = Σ(x - μ)² p(x)
       = (10 - 15.4)²(.2) + (12 - 15.4)²(.3) + (18 - 15.4)²(.1) + (20 - 15.4)²(.4) = 18.44

    σ = √18.44 = 4.294

b.  P(x < 15) = p(10) + p(12) = .2 + .3 = .5

c.  μ ± 2σ ⇒ 15.4 ± 2(4.294) ⇒ (6.812, 23.988)

d.  P(6.812 < x < 23.988) = .2 + .3 + .1 + .4 = 1.0


4.168

Using Table III, Appendix B,

a.  When λ = 2, p(3) = P(x ≤ 3) - P(x ≤ 2) = .857 - .677 = .180

b.  When λ = 1, p(4) = P(x ≤ 4) - P(x ≤ 3) = .996 - .981 = .015

c.  When λ = .5, p(2) = P(x ≤ 2) - P(x ≤ 1) = .986 - .910 = .076

4.170

a.  f(x) = 1/(d - c) = 1/(90 - 10) = 1/80   for 10 ≤ x ≤ 90
    f(x) = 0   otherwise

b.  μ = (c + d)/2 = (10 + 90)/2 = 50
    σ = (d - c)/√12 = (90 - 10)/√12 = 23.094011

c.  The interval μ ± 2σ ⇒ 50 ± 2(23.094) ⇒ 50 ± 46.188 ⇒ (3.812, 96.188) is indicated
    on the graph.

d.  P(x ≤ 60) = Base(height) = (60 - 10)(1/80) = 5/8 = .625

e.  P(x ≥ 90) = 0

f.  P(x ≤ 80) = Base(height) = (80 - 10)(1/80) = 7/8 = .875

g.  P(μ - σ ≤ x ≤ μ + σ) = P(50 - 23.094 ≤ x ≤ 50 + 23.094)
       = P(26.906 ≤ x ≤ 73.094)
       = Base(height)
       = (73.094 - 26.906)(1/80) = 46.188/80 = .577

h.  P(x > 75) = Base(height) = (90 - 75)(1/80) = 15/80 = .1875


4.172

a.  P(z ≤ z0) = .5080
    P(0 ≤ z ≤ z0) = .5080 - .5 = .0080
    Looking up the area .0080 in Table IV, z0 = .02

b.  P(z ≥ z0) = .5517
    P(z0 ≤ z ≤ 0) = .5517 - .5 = .0517
    Looking up the area .0517 in Table IV, z0 = -.13

c.  P(z ≥ z0) = .1492
    P(0 ≤ z ≤ z0) = .5 - .1492 = .3508
    Looking up the area .3508 in Table IV, z0 = 1.04

d.  P(z0 ≤ z ≤ .59) = .4773
    P(z0 ≤ z ≤ 0) + P(0 ≤ z ≤ .59) = .4773
    P(0 ≤ z ≤ .59) = .2224
    Thus, P(z0 ≤ z ≤ 0) = .4773 - .2224 = .2549
    Looking up the area .2549 in Table IV, z0 = -.69

4.174

μ = np = 100(.5) = 50, σ = √(npq) = √(100(.5)(.5)) = 5

a.  P(x ≤ 48) ≈ P(z ≤ ((48 + .5) - 50)/5) = P(z ≤ -.30) = .5 - .1179 = .3821

b.  P(50 ≤ x ≤ 65) ≈ P(((50 - .5) - 50)/5 ≤ z ≤ ((65 + .5) - 50)/5)
       = P(-.10 ≤ z ≤ 3.10) = .0398 + .5000 = .5398

c.  P(x ≥ 70) ≈ P(z ≥ ((70 - .5) - 50)/5) = P(z ≥ 3.90) = .5 - .5 = 0

d.  P(55 ≤ x ≤ 58) ≈ P(((55 - .5) - 50)/5 ≤ z ≤ ((58 + .5) - 50)/5)
       = P(.90 ≤ z ≤ 1.70)
       = P(0 ≤ z ≤ 1.70) - P(0 ≤ z ≤ .90)
       = .4554 - .3159 = .1395

e.  P(x = 62) ≈ P(((62 - .5) - 50)/5 ≤ z ≤ ((62 + .5) - 50)/5)
       = P(2.30 ≤ z ≤ 2.50)
       = P(0 ≤ z ≤ 2.50) - P(0 ≤ z ≤ 2.30)
       = .4938 - .4893 = .0045

f.  P(x ≤ 49 or x ≥ 72) ≈ P(z ≤ ((49 + .5) - 50)/5) + P(z ≥ ((72 - .5) - 50)/5)
       = P(z ≤ -.10) + P(z ≥ 4.30)
       = (.5 - .0398) + (.5 - .5) = .4602
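For reference, the continuity-corrected approximations above can be compared against exact binomial probabilities. This is a sketch assuming Python with scipy; the exact values differ slightly from the table-based answers.

    from scipy.stats import binom, norm

    n, p = 100, 0.5
    mu, sigma = n * p, (n * p * (1 - p)) ** 0.5      # 50 and 5

    # part a: P(x <= 48), exact vs. normal approximation
    print(binom.cdf(48, n, p), norm.cdf((48.5 - mu) / sigma))

    # part e: P(x = 62), exact vs. normal approximation
    approx = norm.cdf((62.5 - mu) / sigma) - norm.cdf((61.5 - mu) / sigma)
    print(binom.pmf(62, n, p), approx)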

4.176

a.  First we must compute μ and σ. The probability distribution for x is:

    x      1    2    3    4
    p(x)  .3   .2   .2   .3

    μ = E(x) = Σx p(x) = 1(.3) + 2(.2) + 3(.2) + 4(.3) = 2.5

    σ² = E[(x - μ)²] = Σ(x - μ)² p(x)
       = (1 - 2.5)²(.3) + (2 - 2.5)²(.2) + (3 - 2.5)²(.2) + (4 - 2.5)²(.3) = 1.45

    μx̄ = μ = 2.5,  σx̄ = σ/√n = √1.45/√40 = .1904

b.  By the Central Limit Theorem, the distribution of x̄ is approximately normal. The sample
    size, n = 40, is sufficiently large. Our answer does depend on n. If n is not sufficiently
    large, the Central Limit Theorem would not apply.

4.180

a.

    In order to be a binomial random variable, the five characteristics must hold.

    1.  For this problem, there are 5 items scanned. We will assume that these 5 trials are
        identical.
    2.  For each item scanned, there are 2 possible outcomes: priced incorrectly (S) or
        priced correctly (F).
    3.  The probability of being priced incorrectly remains constant from trial to trial. For
        this problem, we will assume that the probability of being priced incorrectly is
        P(S) = 1/30 for each trial.
    4.  We will assume that whether one item is priced incorrectly is independent of any
        other.
    5.  The random variable x is the number of items priced incorrectly in 5 trials.

    Thus, x is a binomial random variable.


b.

The estimate of p, the probability of an item being priced incorrectly is 1/30.

c.  P(x = 1) = (5!/(1!·4!))(1/30)¹(29/30)⁴ = 5(1/30)(29/30)⁴ = .1455

d.  P(x ≥ 1) = 1 - P(x = 0) = 1 - (5!/(0!·5!))(1/30)⁰(29/30)⁵ = 1 - .8441 = .1559

e.  Let x = number of items with incorrect prices in 10,000 trials. Thus, x is a binomial
    random variable with n = 10,000 and p = 1/30 = .033.

    μ ± 3σ ⇒ np ± 3√(npq) ⇒ 10,000(.033) ± 3√(10,000(.033)(.967))
    ⇒ 330 ± 3√319.11 ⇒ 330 ± 3(17.864) ⇒ 330 ± 53.591 ⇒ (276.409, 383.591)

    Since the interval lies in the range 0 to 10,000, we can use the normal approximation to
    approximate the probabilities.

    P(x ≥ 100) ≈ P(z ≥ ((100 - .5) - 330)/17.864) = P(z ≥ -12.90)
       = P(-12.90 ≤ z < 0) + .5 ≈ .5 + .5 = 1  (using Table IV, Appendix B)


f.  Let x = number of items with incorrect prices in 100 trials. Thus, x is a binomial random
    variable with n = 100 and p = 1/30 = .033.

    μ ± 3σ ⇒ np ± 3√(npq) ⇒ 100(.033) ± 3√(100(.033)(.967))
    ⇒ 3.3 ± 3√3.191 ⇒ 3.3 ± 3(1.786) ⇒ 3.3 ± 5.358 ⇒ (-2.058, 8.658)

    Since the interval does not lie in the range 0 to 100, the normal approximation will not be
    appropriate.
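Parts c and d can be checked directly with the binomial formula. A sketch assuming Python with scipy, with p = 1/30 as in the solution; not part of the original answer.

    from scipy.stats import binom

    n, p = 5, 1 / 30
    print(binom.pmf(1, n, p))        # P(x = 1), about .1455
    print(1 - binom.pmf(0, n, p))    # P(x >= 1), about .1559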
4.182

a.  Using Table IV, Appendix B, with μ = 8.72 and σ = 1.10,

    P(x < 6) = P(z < (6 - 8.72)/1.10) = P(z < -2.47) = .5 - .4932 = .0068

    Thus, approximately .68% of the games would result in fewer than 6 hits.

b.  The probability of observing fewer than 6 hits in a game is p = .0068. The probability of
    observing 0 hits would be even smaller. Thus, it would be extremely unusual to observe
    a no-hitter.

4.184

a.  Using Table III, Appendix B, with λ = 1, P(x = 3) = P(x ≤ 3) - P(x ≤ 2) = .981 - .920
    = .061

b.  P(x > 2) = 1 - P(x ≤ 2) = 1 - .920 = .080


4.186

a.

Let x = number of employees who have a drug problem in 1,000 trials. Then x is a
binomial random variable with n = 1,000 and p = .052.
E(x) = np = 1,000(.052) = 52

b.  Let x = number of employees who have an alcohol problem in 10 trials. Then x is a
    binomial random variable with n = 10 and p = .085.

    P(x ≥ 1) = 1 - P(x = 0) = 1 - (10!/(0!·10!))(.085)⁰(.915)¹⁰ = 1 - .4113 = .5887

    P(x = 2) = (10!/(2!·8!))(.085)²(.915)⁸ = 45(.085)²(.915)⁸ = .1597

c.


We had to assume that the probability of an employee having a substance abuse problem
was constant from trial to trial and that the trials were independent.


4.188

Let x = demand for white bread. Then x is a normal random variable with μ = 7,200 and
σ = 300.

a.  P(x ≤ x0) = .94. Find x0.

    P(x ≤ x0) = P(z ≤ (x0 - 7200)/300) = P(z ≤ z0) = .94

    A1 = .94 - .50 = .4400

    Using Table IV and area .4400, z0 = 1.555.

    z0 = (x0 - 7200)/300 ⇒ 1.555 = (x0 - 7200)/300 ⇒ x0 = 7666.5 ≈ 7667

b.  If the company produces 7,667 loaves, the company will be left with more than 500
    loaves if the demand is less than 7,667 - 500 = 7,167.

    P(x < 7167) = P(z < (7167 - 7200)/300) = P(z < -.11) = .5 - .0438 = .4562
    (from Table IV, Appendix B)

    Thus, on 45.62% of the days the company will be left with more than 500 loaves.
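The inverse-normal step in part a can be reproduced numerically. This is a sketch assuming Python with scipy; scipy's exact quantile (about 1.5548) matches the interpolated table value of 1.555.

    from scipy.stats import norm

    mu, sigma = 7200, 300
    z0 = norm.ppf(0.94)                    # about 1.5548
    x0 = mu + z0 * sigma                   # about 7,666.4, round up to 7,667 loaves
    print(z0, x0)
    print(norm.cdf((7167 - mu) / sigma))   # part b: P(x < 7,167), about .456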
4.190

Let x = number of inches a gouge is from one end of the spindle. Then x has a uniform
distribution with f(x) as follows:

    f(x) = 1/(d - c) = 1/(18 - 0) = 1/18   for 0 ≤ x ≤ 18
    f(x) = 0   otherwise

In order to get at least 14 consecutive inches without a gouge, the gouge must be within 4
inches of either end. Thus, we must find:

    P(x < 4) + P(x > 14) = (4 - 0)(1/18) + (18 - 14)(1/18) = 4/18 + 4/18 = 8/18 = .4444
4.192

a.  μx̄ = μ = 3.5, σx̄ = σ/√n = .5/√100 = .05

b.  P(3.40 < x̄ < 3.60) = P((3.40 - 3.5)/.05 < z < (3.60 - 3.5)/.05)
       = P(-2 < z < 2) = .4772 + .4772 = .9544
    (using Table IV, Appendix B)

c.  P(x̄ > 3.62) = P(z > (3.62 - 3.5)/.05) = P(z > 2.40) = .5 - .4918 = .0082
    (using Table IV, Appendix B)


d.  μx̄ = μ = 3.5, σx̄ = σ/√n = .5/√200 = .03536

    The mean of the sampling distribution of x̄ would stay the same, but the standard deviation
    would decrease.

    P(3.40 < x̄ < 3.60) = P((3.40 - 3.5)/.03536 < z < (3.60 - 3.5)/.03536)
       = P(-2.83 < z < 2.83) = .4977 + .4977 = .9954
    (using Table IV, Appendix B)

    This probability is larger than when the sample size was 100.

    P(x̄ > 3.62) = P(z > (3.62 - 3.5)/.03536) = P(z > 3.39) ≈ .5 - .5 = 0
    (using Table IV, Appendix B)

    This probability is smaller than when the sample size was 100.

4.194

a.  Let p1 = probability of an error = 1/100 = .01 and p2 = probability of an error resulting in
    a significant problem = 1/500 = .002.

    Let x = number of errors in 60,000 trials. Then E(x) = μ1 = np1 = 60,000(.01) = 600.

b.  Let y = number of significant errors in 60,000 trials. Then E(y) = μ2 = np2 = 60,000(.002)
    = 120.

    σ² = np2q2 = 60,000(.002)(.998) = 119.76
    σ = √119.76 = 10.94

    μ2 ± 3σ ⇒ 120 ± 3(10.94) ⇒ 120 ± 32.82 ⇒ (87.18, 152.82)

    Using Chebyshev's Rule, at least 88.9% of the observations will fall within 3 standard
    deviations of the mean. We would expect the number of significant errors to fall between
    87 and 153.

c.  We must assume that the trials are independent and that the probability of a significant
    error is constant from trial to trial.

4.196

a.  By the Central Limit Theorem, the sampling distribution of x̄ is approximately normal
    since n > 30, and

    μx̄ = μ = 840, σx̄ = σ/√n = 15/√50 = 2.1213

b.  P(x̄ ≤ 830) = P(z ≤ (830 - 840)/2.1213) = P(z ≤ -4.71) ≈ .5 - .5 = 0

c.  Since the probability of observing a mean of 830 or less is extremely small (≈0) if the true
    mean is 840, we would tend to believe that the mean is not 840, but something less.


d.  By the Central Limit Theorem, the sampling distribution of x̄ is approximately normal
    since n > 30, and

    μx̄ = μ = 840, σx̄ = σ/√n = 45/√50 = 6.3640

    P(x̄ ≤ 830) = P(z ≤ (830 - 840)/6.3640) = P(z ≤ -1.57) ≈ .5 - .4418 = .0582

4.198

Let x = length of time a bus is late. Then x is a uniform random variable with probability
distribution:

    f(x) = 1/20   for 0 ≤ x ≤ 20
    f(x) = 0   otherwise

a.  μ = (0 + 20)/2 = 10

b.  P(x ≥ 19) = (20 - 19)(1/20) = 1/20 = .05

c.  It would be doubtful that the director's claim is true, since the probability of the bus being
    more than 19 minutes late is so small.


The Furniture Fire Case


(To accompany Chapters 34)

Using the entire data set of 3,005 invoices as the population, the mean profit margin is 48.9% and the
standard deviation is 13.8291%. If a random sample is selected from this population, the sampling
distribution of the sample mean (x̄) is approximately normal with a mean of 48.901% and a standard
deviation of 13.8291%/√n by the Central Limit Theorem. If a random sample of 253 invoices is
selected, then the probability of obtaining a sample mean of 50.8% or higher is:

    P(x̄ ≥ 50.8) = P(z ≥ (50.8 - 48.901)/(13.8291/√253)) = P(z ≥ 2.18) = .5 - .4854 = .0146

Since the probability of obtaining a sample mean of 50.8% or higher from this population is extremely
small (.0146), we would conclude that there is evidence of fraud.

If we look at the two samples separately, the evidence becomes even more damning. For the sample of
134 invoices, the probability of obtaining a sample mean of 50.6% or higher is:

    P(x̄1 ≥ 50.6) = P(z ≥ (50.6 - 48.901)/(13.8291/√134)) = P(z ≥ 1.42) = .5 - .4222 = .0778

For the sample of 119 invoices, the probability of obtaining a sample mean of 51.0% or higher is:

    P(x̄2 ≥ 51.0) = P(z ≥ (51.0 - 48.901)/(13.8291/√119)) = P(z ≥ 1.66) = .5 - .4515 = .0485

The probability of observing one sample mean of 50.6% or higher AND a second sample mean of
51.0% or higher is:

    P(x̄1 ≥ 50.6, x̄2 ≥ 51.0) = .0778(.0485) = .0038

Again, since the probability of obtaining two sample means of 50.6% or higher and 51.0% or higher
from this population is extremely small (.0038), we would conclude that there is evidence of fraud.
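The three tail probabilities in this case can be reproduced numerically. This is a sketch (assuming Python with scipy), not part of the original case solution; exact values differ in the third decimal because the table rounds z to two decimals.

    from math import sqrt
    from scipy.stats import norm

    mu, sigma = 48.901, 13.8291          # population mean and sd of profit margin (%)

    def tail(xbar, n):
        """P(sample mean >= xbar) for a sample of size n, by the CLT."""
        return norm.sf((xbar - mu) / (sigma / sqrt(n)))

    p_all = tail(50.8, 253)              # about .0146
    p1, p2 = tail(50.6, 134), tail(51.0, 119)
    print(p_all, p1, p2, p1 * p2)        # joint probability about .0038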


Inferences Based on a Single Sample:


Estimation with Confidence Intervals

5.2

a.  zα/2 = 1.96; using Table IV, Appendix B, P(0 ≤ z ≤ 1.96) = .4750. Thus, α/2 =
    .5000 - .4750 = .025, α = 2(.025) = .05, and 1 - α = 1 - .05 = .95. The confidence level is
    100%(.95) = 95%.

b.  zα/2 = 1.645; using Table IV, Appendix B, P(0 ≤ z ≤ 1.645) = .45. Thus, α/2 = .50 - .45 =
    .05, α = 2(.05) = .1, and 1 - α = 1 - .1 = .90. The confidence level is 100%(.90) = 90%.

c.  zα/2 = 2.575; using Table IV, Appendix B, P(0 ≤ z ≤ 2.575) = .495. Thus, α/2 = .500 -
    .495 = .005, α = 2(.005) = .01, and 1 - α = 1 - .01 = .99. The confidence level is
    100%(.99) = 99%.

d.  zα/2 = 1.282; using Table IV, Appendix B, P(0 ≤ z ≤ 1.282) = .4. Thus, α/2 = .5 - .4 = .1,
    α = 2(.1) = .2, and 1 - α = 1 - .2 = .80. The confidence level is 100%(.80) = 80%.

e.  zα/2 = .99; using Table IV, Appendix B, P(0 ≤ z ≤ .99) = .3389. Thus, α/2 = .5000 - .3389
    = .1611, α = 2(.1611) = .3222, and 1 - α = 1 - .3222 = .6778. The confidence level is
    100%(.6778) = 67.78%.

5.4

a.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    x̄ ± z.025 s/√n ⇒ 25.9 ± 1.96(2.7/√90) ⇒ 25.9 ± .56 ⇒ (25.34, 26.46)

b.  For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix
    B, z.05 = 1.645. The confidence interval is:

    x̄ ± z.05 s/√n ⇒ 25.9 ± 1.645(2.7/√90) ⇒ 25.9 ± .47 ⇒ (25.43, 26.37)

c.  For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix
    B, z.005 = 2.58. The confidence interval is:

    x̄ ± z.005 s/√n ⇒ 25.9 ± 2.58(2.7/√90) ⇒ 25.9 ± .73 ⇒ (25.17, 26.63)

5.6

If we were to repeatedly draw samples from the population and form the interval x̄ ± 1.96σx̄
each time, approximately 95% of the intervals would contain μ. We have no way of knowing
whether our interval estimate is one of the 95% that contain μ or one of the 5% that do not.
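The three intervals in Exercise 5.4 follow the same pattern, so a small helper makes the arithmetic easy to verify. A sketch assuming Python with scipy; scipy's z-multipliers are slightly more precise than the rounded table values, so the endpoints can differ in the last decimal.

    from math import sqrt
    from scipy.stats import norm

    def z_interval(xbar, s, n, conf):
        """Large-sample confidence interval for the mean: x-bar +/- z * s / sqrt(n)."""
        z = norm.ppf(1 - (1 - conf) / 2)
        half = z * s / sqrt(n)
        return xbar - half, xbar + half

    for conf in (.95, .90, .99):
        print(conf, z_interval(25.9, 2.7, 90, conf))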


5.8

a.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    x̄ ± z.025 s/√n ⇒ 33.9 ± 1.96(3.3/√100) ⇒ 33.9 ± .647 ⇒ (33.253, 34.547)

b.  x̄ ± z.025 s/√n ⇒ 33.9 ± 1.96(3.3/√400) ⇒ 33.9 ± .323 ⇒ (33.577, 34.223)

c.  For part a, the width of the interval is 2(.647) = 1.294. For part b, the width of the
    interval is 2(.323) = .646. When the sample size is quadrupled, the width of the
    confidence interval is halved.

5.10

a.  A point estimate for the average number of latex gloves used per week by all healthcare
    workers with latex allergy is x̄ = 19.3.

b.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ 19.3 ± 1.96(11.9/√46) ⇒ 19.3 ± 3.44 ⇒ (15.86, 22.74)

c.  We are 95% confident that the true average number of latex gloves used per week by all
    healthcare workers with a latex allergy is between 15.86 and 22.74.

d.  The conditions required for the interval to be valid are:

    a.  The sample selected was randomly selected from the target population.
    b.  The sample size is sufficiently large, i.e., n > 30.

5.12

a.

The point estimate for the mean charitable commitment of tax-exempt organizations is
x = 74.9667.

b.

From the printout, the 95% confidence interval is (68.2371, 81.6962).

c.

The probability of estimating the true mean charitable commitment with a single number
is 0. By estimating the true mean charitable commitment with an interval, we can be
pretty confident that the true mean is in the interval.


5.14

Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: r

    Variable    N     Mean   Median   TrMean    StDev   SE Mean
    r          34   0.4224   0.4300   0.4310   0.1998    0.0343

    Variable   Minimum   Maximum       Q1       Q3
    r          -0.0800    0.7400   0.2925   0.6000

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ .4224 ± 1.96(.1998/√34) ⇒ .4224 ± .0672 ⇒ (.3552, .4895)

We are 95% confident that the mean value of r is between .3552 and .4895.
5.16

a.  Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: Rate

    Variable    N    Mean   Median   TrMean   StDev   SE Mean
    Rate       30   79.73    80.00    80.15    5.96      1.09

    Variable   Minimum   Maximum      Q1      Q3
    Rate         60.00     90.00   76.75   84.00

    For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B,
    z.05 = 1.645. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ 79.73 ± 1.645(5.96/√30) ⇒ 79.73 ± 1.79 ⇒ (77.94, 81.52)

b.

We are 90% confident that the mean participation rate for all companies that have 401(k)
plans is between 77.94% and 81.52%.

c.

We must assume that the sample size (n = 30) is sufficiently large so that the Central
Limit Theorem applies.

d.

Yes. Since 71% is not included in the 90% confidence interval, it can be concluded that this
company's participation rate is lower than the population mean.

e.  The center of the confidence interval is x̄. If 60% is changed to 80%, the value of x̄ will
    increase, thus indicating that the center point will be larger. The value of s² will decrease if
    60% is replaced by 80%, thus causing the width of the interval to decrease.


5.18

a.

Using MINITAB, I generated 30 random numbers using the uniform distribution from 1
to 308. The random numbers were:
9, 15, 19, 36, 46, 47, 63, 73, 90, 92, 108, 112, 117, 127, 144, 145, 150, 151, 172, 178, 218,
229, 230, 241, 242, 246, 252, 267, 274, 282
I numbered the 308 observations in the order that they appear in the file. Using the random
numbers generated above, I selected the 9th, 15th, 19th, etc. observations for the sample.
The selected sample is:
.31, .34, .34, .50, .52, .53, .64, .72, .70, .70, .75, .78, 1.00, 1.00, 1.03, 1.04, 1.07, 1.10, .21,
.24, .58, 1.01, .50, .57, .58, .61, .70, .81, .85, 1.00

b.

Using MINITAB, the descriptive statistics for the sample of 30 observations are:

    Descriptive Statistics: carats-samp

    Variable    N     Mean   Median   TrMean    StDev   SE Mean
    carats-s   30   0.6910   0.7000   0.6965   0.2620    0.0478

    Variable   Minimum   Maximum       Q1       Q3
    carats-s    0.2100    1.1000   0.5150   1.0000

    From above, x̄ = .6910 and s = .2620.


c.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    x̄ ± z.025 s/√n ⇒ .691 ± 1.96(.262/√30) ⇒ .691 ± .094 ⇒ (.597, .785)

d.  We are 95% confident that the mean number of carats is between .597 and .785.

e.  From Exercise 2.47, we computed the population mean to be .631. This mean does fall
    in the 95% confidence interval we computed in part d.

5.20

x̄ = 11,298/5,000 = 2.26

For confidence coefficient .95, α = .05 and α/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ 2.26 ± 1.96(1.5/√5,000) ⇒ 2.26 ± .04 ⇒ (2.22, 2.30)

We are 95% confident the mean number of roaches produced per roach per week is between
2.22 and 2.30.


5.22

a.  If x is normally distributed, the sampling distribution of x̄ is normal, regardless of the
    sample size.

b.  If nothing is known about the distribution of x, the sampling distribution of x̄ is
    approximately normal if n is sufficiently large. If n is not large, the distribution of x̄ is
    unknown if the distribution of x is not known.

5.24

a.  P(t ≥ t0) = .025 where df = 11
    t0 = 2.201

b.  P(t ≥ t0) = .01 where df = 9
    t0 = 2.821

c.  P(t ≤ t0) = .005 where df = 6
    Because of symmetry, the statement can be rewritten as
    P(t ≥ -t0) = .005 where df = 6
    t0 = -3.707

d.  P(t ≥ t0) = .05 where df = 18
    t0 = 1.734

For this sample,

    x̄ = Σx/n = 1567/16 = 97.9375

    s² = [Σx² - (Σx)²/n]/(n - 1) = [155,867 - 1567²/16]/(16 - 1) = 159.9292

    s = √159.9292 = 12.6463

a.  For confidence coefficient .80, α = 1 - .80 = .20 and α/2 = .20/2 = .10. From Table VI,
    Appendix B, with df = n - 1 = 16 - 1 = 15, t.10 = 1.341. The 80% confidence interval for
    μ is:

    x̄ ± t.10 s/√n ⇒ 97.94 ± 1.341(12.6463/√16) ⇒ 97.94 ± 4.240 ⇒ (93.700, 102.180)

b.  For confidence coefficient .95, α = 1 - .95 = .05 and α/2 = .05/2 = .025. From Table VI,
    Appendix B, with df = n - 1 = 16 - 1 = 15, t.025 = 2.131. The 95% confidence interval for
    μ is:

    x̄ ± t.025 s/√n ⇒ 97.94 ± 2.131(12.6463/√16) ⇒ 97.94 ± 6.737 ⇒ (91.203, 104.677)

    The 95% confidence interval for μ is wider than the 80% confidence interval for μ found
    in part a.


c.

For part a:
We are 80% confident that the true population mean lies in the interval 93.700 to
102.180.
For part b:
We are 95% confident that the true population mean lies in the interval 91.203 to
104.677.
The 95% confidence interval is wider than the 80% confidence interval because the more
confident you want to be that lies in an interval, the wider the range of possible values.
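A numerical check of Exercise 5.26 (a sketch assuming Python with scipy, not part of the original solution). The summary statistics are rebuilt from the sums given above, and scipy's t quantiles replace the Table VI values, so the endpoints may differ slightly.

    from math import sqrt
    from scipy.stats import t

    n, sum_x, sum_x2 = 16, 1567, 155867
    xbar = sum_x / n                                 # 97.9375
    s = sqrt((sum_x2 - sum_x**2 / n) / (n - 1))      # about 12.646

    for conf in (.80, .95):
        tcrit = t.ppf(1 - (1 - conf) / 2, df=n - 1)
        half = tcrit * s / sqrt(n)
        print(conf, (xbar - half, xbar + half))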

5.28

a.

Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: MTBE

    Variable    N   N*   Mean   SE Mean   StDev   Minimum     Q1   Median      Q3   Maximum
    MTBE       12    0   97.2      32.8   113.8      8.00   12.0     50.5   146.0     367.0

    A point estimate for the true mean MTBE level for all well sites located near the New
    Jersey gasoline service station is x̄ = 97.2.
b.

    For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table VI, Appendix
    B, with df = n - 1 = 12 - 1 = 11, t.005 = 3.106. The 99% confidence interval is:

    x̄ ± t.005 s/√n ⇒ 97.2 ± 3.106(113.8/√12) ⇒ 97.2 ± 102.04 ⇒ (-4.84, 199.24)

    We are 99% confident that the true mean MTBE level for all well sites located near the
    New Jersey gasoline service station is between -4.84 and 199.24.
c.

We must assume that the data were sampled from a normal distribution. We will use the
four methods to check for normality. First, we will look at a histogram of the data. Using
MINITAB, the histogram of the data is:
    [Histogram of MTBE: frequency versus MTBE level]


    From the histogram, the data do not appear to be mound-shaped. This indicates that the
    data may not be normal.

    Next, we look at the intervals x̄ ± s, x̄ ± 2s, x̄ ± 3s. If the proportions of observations
    falling in each interval are approximately .68, .95, and 1.00, then the data are
    approximately normal.

    x̄ ± s ⇒ 97.2 ± 113.8 ⇒ (-16.6, 211.0)  10 of the 12 values fall in this interval. The
    proportion is .83. This is not very close to the .68 we would expect if the data were
    normal.

    x̄ ± 2s ⇒ 97.2 ± 2(113.8) ⇒ 97.2 ± 227.6 ⇒ (-130.4, 324.8)  11 of the 12 values fall in
    this interval. The proportion is .92. This is somewhat smaller than the .95 we would
    expect if the data were normal.

    x̄ ± 3s ⇒ 97.2 ± 3(113.8) ⇒ 97.2 ± 341.4 ⇒ (-244.2, 438.6)  12 of the 12 values fall in
    this interval. The proportion is 1.00. This is exactly the 1.00 we would expect if the data
    were normal.

    From this method, it appears that the data may not be normal.

    Next, we look at the ratio of the IQR to s. IQR = QU - QL = 146.0 - 12.0 = 134.0.

    IQR/s = 134.0/113.8 = 1.18  This is somewhat smaller than the 1.3 we would expect if the
    data were normal. This method indicates the data may not be normal.

Finally, using MINITAB, the normal probability plot is:


    [Normal probability plot of MTBE (normal, 95% CI): Mean = 97.17, StDev = 113.8,
    N = 12, AD = 0.929, P-Value = 0.012]

Since the data do not form a fairly straight line, the data may not be normal.
    From above, all of the methods indicate the data may not be normal. It appears that the
    data probably are not normal.


5.30

We must assume that the distribution of the LOS's for all patients is normal.
a.

    For confidence coefficient .90, α = 1 - .90 = .10 and α/2 = .10/2 = .05. From Table VI,
    Appendix B, with df = n - 1 = 20 - 1 = 19, t.05 = 1.729. The 90% confidence interval is:

    x̄ ± t.05 s/√n ⇒ 3.8 ± 1.729(1.2/√20) ⇒ 3.8 ± .464 ⇒ (3.336, 4.264)

b.  We are 90% confident that the mean LOS is between 3.336 and 4.264 days.

c.  90% confidence means that if repeated samples of size n are selected from a population and
    90% confidence intervals are constructed, 90% of all intervals thus constructed will contain
    the population mean.

5.32

a.  The 95% confidence interval for the mean surface roughness of coated interior pipe is
    (1.63580, 2.12620).

b.  No. Since 2.5 does not fall in the 95% confidence interval, it would be very unlikely that
    the average surface roughness would be as high as 2.5 micrometers.

5.34

a.

The population is the set of all DOT permanent count stations in the state of Florida.

b.

Yes. There are several types of routes included in the sample. There are 3 recreational
areas, 7 rural areas, 5 small cities, and 5 urban areas.

c.

Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: 30th hour, 100th hour

    Variable    N   Mean   Median   TrMean   StDev   SE Mean
    30th hou   20   2206     2064     2165    1224       274
    100th ho   20   2096     1999     2048    1203       269

    Variable   Minimum   Maximum     Q1     Q3
    30th hou       252      4905   1429   3068
    100th ho       229      4815   1318   2877

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI,
    Appendix B, with df = n - 1 = 20 - 1 = 19, t.025 = 2.093. The 95% confidence interval is:

    x̄ ± t.025 s/√n ⇒ 2,206 ± 2.093(1,224/√20) ⇒ 2,206 ± 572.84 ⇒ (1,633.16, 2,778.84)

    We are 95% confident that the mean traffic count at the 30th highest hour is between
    1,633.16 and 2,778.84.
d.


We must assume that the distribution of the traffic counts at the 30th highest hour is
normal. From the stem-and-leaf display, the data look fairly mound-shaped. Thus, the
assumption of normality is probably met.


e.  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI, Appendix
    B, with df = n - 1 = 20 - 1 = 19, t.025 = 2.093. The 95% confidence interval is:

    x̄ ± t.025 s/√n ⇒ 2,096 ± 2.093(1,203/√20) ⇒ 2,096 ± 563.01 ⇒ (1,532.99, 2,659.01)

    We are 95% confident that the mean traffic count at the 100th highest hour is between
    1,532.99 and 2,659.01.

    We must assume that the distribution of the traffic counts at the 100th highest hour is
    normal. From the stem-and-leaf display, the data look fairly mound-shaped. Thus, the
    assumption of normality is probably met.
f.

    If μ = 2,700, it is very possible that it is the mean count for the 30th highest hour. It falls
    in the 95% confidence interval for the mean count for the 30th highest hour. It is not very
    likely that the mean count for the 100th highest hour is 2,700. It does not fall in the 95%
    confidence interval for the mean count for the 100th highest hour. (See parts c and e
    above.)

5.36

By the Central Limit Theorem, the sampling distribution of p̂ is approximately normal with
mean μp̂ = p and standard deviation σp̂ = √(pq/n).

5.38

a.  The sample size is large enough if the interval p̂ ± 3σp̂ does not include 0 or 1.

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .88 ± 3√(.88(1 - .88)/121) ⇒ .88 ± .089 ⇒ (.791, .969)

    Since the interval lies within the interval (0, 1), the normal approximation will be
    adequate.

b.  For confidence coefficient .90, α = .10 and α/2 = .05. From Table IV, Appendix B,
    z.05 = 1.645. The 90% confidence interval is:

    p̂ ± z.05√(p̂q̂/n) ⇒ .88 ± 1.645√(.88(.12)/121) ⇒ .88 ± .049 ⇒ (.831, .929)

c.  We must assume that the sample is a random sample from the population of interest.
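The size check and the interval in Exercise 5.38 can be verified with a few lines. A sketch assuming Python with scipy; this is the simple large-sample (Wald) interval used throughout the chapter, coded directly rather than taken from a library routine.

    from math import sqrt
    from scipy.stats import norm

    p_hat, n = 0.88, 121
    se = sqrt(p_hat * (1 - p_hat) / n)

    # normality check: p-hat +/- 3 SE must stay inside (0, 1)
    print(p_hat - 3 * se, p_hat + 3 * se)      # about (.791, .969)

    z = norm.ppf(0.95)                         # 90% confidence -> z_.05, about 1.645
    print(p_hat - z * se, p_hat + z * se)      # about (.831, .929)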


5.40

a.  Of the 50 observations, 15 like the product ⇒ p̂ = 15/50 = .30.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .3 ± 3√(.3(.7)/50) ⇒ .3 ± .194 ⇒ (.106, .494)

    Since this interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For the confidence coefficient .80, α = .20 and α/2 = .10. From Table IV, Appendix B,
    z.10 = 1.28. The confidence interval is:

    p̂ ± z.10√(p̂q̂/n) ⇒ .3 ± 1.28√(.3(.7)/50) ⇒ .3 ± .083 ⇒ (.217, .383)

b.  We are 80% confident the proportion of all consumers who like the new snack food is
    between .217 and .383.

5.42

a.  The point estimate of p is p̂ = .11.

b.  To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .11 ± 3√(.11(.89)/150) ⇒ .11 ± .077 ⇒ (.033, .187)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .11 ± 1.96√(.11(.89)/150) ⇒ .11 ± .05 ⇒ (.06, .16)

c.  We are 95% confident that the true proportion of MSDS that are satisfactorily completed
    is between .06 and .16.

5.44

a.

    The point estimate of p is p̂ = x/n = 16/308 = .052.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .052 ± 3√(.052(.948)/308) ⇒ .052 ± .038 ⇒ (.014, .090)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix
    B, z.005 = 2.58. The confidence interval is:

    p̂ ± z.005√(p̂q̂/n) ⇒ .052 ± 2.58√(.052(.948)/308) ⇒ .052 ± .033 ⇒ (.019, .085)

    We are 99% confident that the true proportion of diamonds for sale that are classified as
    D color is between .019 and .085.

b.  The point estimate of p is p̂ = x/n = 81/308 = .263.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .263 ± 3√(.263(.737)/308) ⇒ .263 ± .075 ⇒ (.188, .338)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix
    B, z.005 = 2.58. The confidence interval is:

    p̂ ± z.005√(p̂q̂/n) ⇒ .263 ± 2.58√(.263(.737)/308) ⇒ .263 ± .065 ⇒ (.198, .328)

    We are 99% confident that the true proportion of diamonds for sale that are classified as
    VS1 clarity is between .198 and .328.
5.46

a.

The population is all senior human resource executives at U.S. companies.

b.

The population parameter of interest is p, the proportion of all senior human resource
executives at U.S. companies who believe that their hiring managers are interviewing too
many people to find qualified candidates for the job.

c.  The point estimate of p is p̂ = x/n = 211/502 = .42. To see if the sample size is
    sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .42 ± 3√(.42(.58)/502) ⇒ .42 ± .066 ⇒ (.354, .486)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.


d.  For confidence coefficient .98, α = .02 and α/2 = .02/2 = .01. From Table IV,
    Appendix B, z.01 = 2.33. The confidence interval is:

    p̂ ± z.01√(p̂q̂/n) ⇒ .42 ± 2.33√(.42(.58)/502) ⇒ .42 ± .051 ⇒ (.369, .471)

    We are 98% confident that the true proportion of all senior human resource executives at
    U.S. companies who believe that their hiring managers are interviewing too many people
    to find qualified candidates for the job is between .369 and .471.

5.48

e.

A 90% confidence interval would be narrower. If the interval was narrower, it would
contain fewer values, thus, we would be less confident.

a.  The point estimate of p is p̂ = x/n = 35/55 = .636.

b.  We must check to see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .636 ± 3√(.636(.364)/55) ⇒ .636 ± .195 ⇒ (.441, .831)

    Since the interval is wholly contained in the interval (0, 1), we may assume that the
    normal approximation is reasonable.

    For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV,
    Appendix B, z.005 = 2.575. The confidence interval is:

    p̂ ± z.005√(p̂q̂/n) ⇒ .636 ± 2.575√(.636(.364)/55) ⇒ .636 ± .167 ⇒ (.469, .803)

c.  We are 99% confident that the true proportion of fatal accidents involving children is
    between .469 and .803.

d.  The sample proportion of children killed by air bags who were not wearing seat belts or
    were improperly restrained is 24/35 = .686. This is a rather large proportion. Whether a
    child is killed by an airbag could be related to whether or not he/she was properly
    restrained. Thus, the number of children killed by air bags could possibly be reduced if
    the child were properly restrained.

5.50

The point estimate of p is p̂ = x/n = 36/83 = .434.

To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .434 ± 3√(.434(.566)/83) ⇒ .434 ± .163 ⇒ (.271, .597)

Since the interval is wholly contained in the interval (0, 1), we may conclude that the normal
approximation is reasonable.


For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .434 ± 1.96√(.434(.566)/83) ⇒ .434 ± .107 ⇒ (.327, .541)

We are 95% confident that the true proportion of healthcare workers with latex allergies who
actually suspect that they have the allergy is between .327 and .541.
5.52

To compute the necessary sample size, use

    n = (zα/2)²σ²/SE²   where α = 1 - .95 = .05 and α/2 = .05/2 = .025.

From Table IV, Appendix B, z.025 = 1.96. Thus,

    n = (1.96)²(7.2)/.3² = 307.328 ≈ 308

You would need to take 308 samples.
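The sample-size formula n = (zα/2)²σ²/SE² is easy to script. A sketch assuming Python with scipy and, as in the solution above, σ² = 7.2 and SE = .3; the answer is always rounded up.

    from math import ceil
    from scipy.stats import norm

    def n_for_mean(sigma2, se, conf):
        """Sample size needed to estimate a mean to within se with the given confidence."""
        z = norm.ppf(1 - (1 - conf) / 2)
        return ceil(z**2 * sigma2 / se**2)

    print(n_for_mean(7.2, 0.3, 0.95))      # 308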


5.54

a.  To compute the needed sample size, use:

    n = (zα/2)²pq/SE²   where z.025 = 1.96 from Table IV, Appendix B.

    Thus, n = (1.96)²(.2)(.8)/.08² = 96.04 ≈ 97

    You would need to take a sample of size 97.

b.  To compute the needed sample size, use:

    n = (zα/2)²pq/SE² = (1.96)²(.5)(.5)/.08² = 150.0625 ≈ 151

    You would need to take a sample of size 151.


5.56

a.

    For a width of 5 units, SE = 5/2 = 2.5.

    To compute the needed sample size, use

    n = (zα/2)²σ²/SE²   where α = 1 - .95 = .05 and α/2 = .025.


    From Table IV, Appendix B, z.025 = 1.96. Thus,

    n = (1.96)²(14)²/2.5² = 120.47 ≈ 121

You would need to take 121 samples at a cost of 121($10) = $1210.


Yes, you do have sufficient funds.
b.  For confidence coefficient .90, α = 1 - .90 = .10 and α/2 = .10/2 = .05. From Table IV,
    Appendix B, z.05 = 1.645.

    n = (1.645)²(14)²/2.5² = 84.86 ≈ 85

You would need to take 85 samples at a cost of 85($10) = $850.


You still have sufficient funds but have an increased risk of error.
5.58

The sample size will be larger than necessary for any p other than .5.

5.60

a.

The confidence level desired by the researchers is 90%.

b.

The sampling error desired by the researchers is SE = .05.

c.  For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV,
    Appendix B, z.05 = 1.645. From the problem, we will use p̂ = x/n = 64/106 = .604
    to estimate p. Thus,

    n = (zα/2)² p̂q̂ /(SE)² = 1.645²(.604)(.396)/.05² = 258.9 ≈ 259

    Thus, we would need a sample of size 259.


5.62

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. For this study,

    n = (zα/2)²σ²/SE² = 1.96²(5)²/1² = 96.04 ≈ 97

The sample size needed is 97.


5.64

For confidence coefficient .90, α = .10 and α/2 = .05. From Table IV, Appendix B,
z.05 = 1.645.

For a width of .06, SE = .06/2 = .03.

The sample size is n = (zα/2)²pq/SE² = (1.645)²(.17)(.83)/.03² = 424.2 ≈ 425

You would need to take n = 425 samples.

5.66

To compute the necessary sample size, use

    n = (zα/2)²σ²/SE²   where α = 1 - .90 = .10 and α/2 = .05.

From Table IV, Appendix B, z.05 = 1.645. Thus,

    n = (1.645)²(10)²/1² = 270.6 ≈ 271

5.68

a.

    To compute the needed sample size, use

    n = (zα/2)²σ²/SE²   where α = 1 - .90 = .10 and α/2 = .05.

    From Table IV, Appendix B, z.05 = 1.645. Thus,

    n = (1.645)²(2)²/.1² = 1,082.41 ≈ 1,083

b.

As the sample size decreases, the width of the confidence interval increases. Therefore, if
we sample 100 parts instead of 1,083, the confidence interval would be wider.

c.  To compute the maximum confidence level that could be attained meeting the
    management's specifications,

    n = (zα/2)²σ²/SE² ⇒ 100 = (zα/2)²(2)²/.1² ⇒ (zα/2)² = 100(.01)/4 = .25 ⇒ zα/2 = .5

    Using Table IV, Appendix B, P(0 ≤ z ≤ .5) = .1915. Thus, α/2 = .5000 - .1915 = .3085,
    α = 2(.3085) = .617, and 1 - α = 1 - .617 = .383.

    The maximum confidence level would be 38.3%.


5.70

    σx̄ = (σ/√n)√((N - n)/N)

a.  σx̄ = (200/√1000)√((2,500 - 1,000)/2,500) = 4.90

b.  σx̄ = (200/√1000)√((5,000 - 1,000)/5,000) = 5.66

c.  σx̄ = (200/√1000)√((10,000 - 1,000)/10,000) = 6.00

d.  σx̄ = (200/√1000)√((100,000 - 1,000)/100,000) = 6.293

5.72

a.  For n = 64, with the finite population correction factor:

    σx̄ = (s/√n)√((N - n)/N) = (24/√64)√((5,000 - 64)/5,000) = 3√.9872 = 2.9807

    Without the finite population correction factor:

    σx̄ = s/√n = 24/√64 = 3

    σx̄ without the finite population correction factor is slightly larger.

b.  For n = 400, with the finite population correction factor:

    σx̄ = (s/√n)√((N - n)/N) = (24/√400)√((5,000 - 400)/5,000) = 1.2√.92 = 1.1510

    Without the finite population correction factor:

    σx̄ = s/√n = 24/√400 = 1.2

c.  In part a, n is smaller relative to N than in part b. Therefore, the finite population
    correction factor did not make as much difference in the answer in part a as in part b.

5.74

An approximate 95% confidence interval for μ is:

    x̄ ± 2σx̄ ⇒ x̄ ± 2(s/√n)√((N - n)/N) ⇒ 422 ± 2(14/√40)√((375 - 40)/375)
    ⇒ 422 ± 4.184 ⇒ (417.816, 426.184)
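The finite-population interval above can be checked numerically. A sketch assuming Python; it uses the same "plus or minus 2 standard errors" convention as the solution rather than an exact z value.

    from math import sqrt

    xbar, s, n, N = 422, 14, 40, 375
    se = (s / sqrt(n)) * sqrt((N - n) / N)     # standard error with the fpc, about 2.09
    print(xbar - 2 * se, xbar + 2 * se)        # about (417.82, 426.18)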


5.76

a.  For N = 2,193, n = 223, x̄ = 116,754, and s = 39,185, the 95% confidence interval is:

    x̄ ± 2σx̄ ⇒ x̄ ± 2(s/√n)√((N - n)/N) ⇒ 116,754 ± 2(39,185/√223)√((2,193 - 223)/2,193)
    ⇒ 116,754 ± 4,974.06 ⇒ (111,779.94, 121,728.06)

b.  We are 95% confident that the mean salary of all vice presidents who subscribe to
    Quality Progress is between $111,779.94 and $121,728.06.

5.78

a.

The population of interest is the set of all households headed by women that have incomes
of $25,000 or more in the database.

b.

Yes. Since n/N = 1,333/25,000 = .053 exceeds .05, we need to apply the finite population
correction.

c.

    The standard error for p̂ should be:

    σp̂ = √(p̂(1 - p̂)/n)√((N - n)/N) = √(.708(1 - .708)/1,333)√((25,000 - 1,333)/25,000) = .012

d.  For confidence coefficient .90, α = 1 - .90 = .10 and α/2 = .10/2 = .05. From Table
    IV, Appendix B, z.05 = 1.645. The approximate 90% confidence interval is:

    p̂ ± 1.645σp̂ ⇒ .708 ± 1.645(.012) ⇒ (.688, .728)


5.80

For N = 1,500, n = 35, x̄ = 1, and s = 124, the 95% confidence interval is:

    x̄ ± 2σx̄ ⇒ x̄ ± 2(s/√n)√((N - n)/N) ⇒ 1 ± 2(124/√35)√((1,500 - 35)/1,500)
    ⇒ 1 ± 41.43 ⇒ (-40.43, 42.43)

We are 95% confident that the mean error of the new system is between -$40.43 and $42.43.

5.82

a.

For a small sample from a normal distribution with unknown standard deviation, we use the
t statistic. For confidence coefficient .95, = 1 .95 = .05 and /2 = .05/2 = .025.
From Table VI, Appendix B, with df = n 1 = 23 1 = 22, t.025 = 2.074.

b.

For a large sample from a distribution with an unknown standard deviation, we can estimate
the population standard deviation with s and use the z statistic. For confidence coefficient
.95, = 1 .95 = .05 and /2 = .05/2 = .025. From Table IV, Appendix B, z.025 =
1.96.

c.

For a small sample from a normal distribution with known standard deviation, we use the z
statistic. For confidence coefficient .95, = 1 .95 = .05 and /2 = .05/2 = .025.
From Table IV, Appendix B, z.025 = 1.96.


d.  For a large sample from a distribution about which nothing is known, we can estimate the
    population standard deviation with s and use the z statistic. For confidence coefficient .95,
    α = 1 - .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

e.  For a small sample from a distribution about which nothing is known, we can use neither z
    nor t.

5.84

a.

    Of the 400 observations, 227 had the characteristic ⇒ p̂ = 227/400 = .5675.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .5675 ± 3√(.5675(.4325)/400) ⇒ .5675 ± .0743
    ⇒ (.4932, .6418)

    Since the interval lies within the interval (0, 1), the normal approximation will be
    adequate.

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .5675 ± 1.96√(.5675(.4325)/400) ⇒ .5675 ± .0486 ⇒ (.5189, .6161)

b.  For this problem, SE = .02. For confidence coefficient .95, α = .05 and α/2 = .05/2 =
    .025. From Table IV, Appendix B, z.025 = 1.96. Thus,

    n = (zα/2)² p̂q̂ /SE² = (1.96)²(.5675)(.4325)/.02² = 2,357.2 ≈ 2,358

    Thus, the sample size was 2,358.


5.86

a.  The finite population correction factor is:

    √((N - n)/N) = √((100 - 20)/100) = .8944

b.  The finite population correction factor is:

    √((N - n)/N) = √((2,000 - 50)/2,000) = .9874

c.  The finite population correction factor is:

    √((N - n)/N) = √((1,500 - 300)/1,500) = .8944


5.88

a.  From the printout, the 90% confidence interval is (4.277, 6.184). We are 90%
    confident that the mean number of offices operated by all Florida law firms is
    between 4.277 and 6.184.

b.  From the histogram, it appears that the data probably are not from a normal distribution.
    The data appear to be skewed to the right.

c.  The interval constructed in part a depends on the assumption that the data came
    from a normal distribution. From part b, it appears that this assumption is not valid.
    Thus, the confidence interval is probably not valid.

5.90

a.  The point estimate of p is p̂ = x/n = 67/105 = .638.

b.  To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .638 ± 3√(.638(.362)/105) ⇒ .638 ± .141 ⇒ (.497, .779)

    Since the interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix
    B, z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .638 ± 1.96√(.638(.362)/105) ⇒ .638 ± .092 ⇒ (.546, .730)

c.  We are 95% confident that the true proportion of on-the-job homicide cases that occurred
    at night is between .546 and .730.

5.92

a.

Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: NJValues

    Variable    N   N*   Mean   SE Mean   StDev   Minimum      Q1   Median      Q3   Maximum
    NJValues   20    0  440.4      67.8   303.0     159.0   212.3    297.5   660.5    1190.0

    For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI,
    Appendix B, with df = n - 1 = 20 - 1 = 19, t.025 = 2.093. The 95% confidence interval is:

    x̄ ± t.025 s/√n ⇒ 440.4 ± 2.093(303.0/√20) ⇒ 440.4 ± 141.81 ⇒ (298.59, 582.21)

b.  We are 95% confident that the true mean sales price is between $298,590 and $582,210.


c.

"95% confidence" means that in repeated sampling, 95% of all confidence intervals
constructed will contain the true mean sales price and 5% will not.

d.

Using MINITAB, a histogram of the data is:


    [Histogram of NJValues: frequency versus NJValues (sales price)]

Since the sample size is small (n = 20), we must assume that the distribution of sales
prices is normal. From the histogram, it does not appear that the data come from a normal
distribution. Thus, this confidence interval is probably not valid.
5.94

a.  For confidence coefficient .90, α = .10 and α/2 = .05. From Table IV, Appendix B,
    z.05 = 1.645. The 90% confidence interval is:

    x̄ ± z.05 s/√n ⇒ 12.2 ± 1.645(10/√100) ⇒ 12.2 ± 1.645 ⇒ (10.555, 13.845)

    We are 90% confident that the mean number of days of sick leave taken by all its
    employees is between 10.555 and 13.845.

b.  For confidence coefficient .99, α = .01 and α/2 = .005. From Table IV, Appendix B,
    z.005 = 2.58.

    The sample size is n = (zα/2)²σ²/SE² = (2.58)²(10)²/2² = 166.4 ≈ 167

    You would need to take n = 167 samples.


5.96

a.  For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix
    B, z.005 = 2.58. The confidence interval is:

    x̄ ± zα/2 s/√n ⇒ 1.13 ± 2.58(2.21/√72) ⇒ 1.13 ± .67 ⇒ (.46, 1.80)

    We are 99% confident that the mean number of pecks at the blue string is between .46
    and 1.80.

b.  Yes. The mean number of pecks at the white string is 7.5. This value does not fall in the
    99% confidence interval for the blue string found in part a. Thus, the chickens are more
    apt to peck at white string.

5.98

a.  First we must compute p̂:  p̂ = x/n = 124/159 = .78

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .78 ± 3√(.78(.22)/159) ⇒ .78 ± .099 ⇒ (.681, .879)

    Since this interval is wholly contained in the interval (0, 1), we may conclude that the
    normal approximation is reasonable.

    For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix
    B, z.05 = 1.645. The confidence interval is:

    p̂ ± z.05√(p̂q̂/n) ⇒ .78 ± 1.645√(.78(.22)/159) ⇒ .78 ± .054 ⇒ (.726, .834)

We are 90% confident that the true proportion of all truck drivers who suffer from sleep
apnea is between .726 and .834.

b.  Sleep researchers believe that 25% of the population suffer from obstructive sleep apnea.
    Since the 90% confidence interval for the proportion of truck drivers who suffer from
    sleep apnea does not contain .25, it appears that the true proportion of truck drivers who
    suffer from sleep apnea is larger than the proportion of the general population.

5.100

a.

The population of interest is the set of all debit cardholders in the U.S.

c.  Of the 1,252 observations, 180 had used the debit card to purchase a product or service on
    the Internet:

    p̂ = x/n = 180/1,252 = .144


    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .144 ± 3√(.144(.856)/1,252) ⇒ .144 ± .030 ⇒ (.114, .174)

Since this interval is wholly contained in the interval (0, 1), we may conclude that the
normal approximation is reasonable.
d.  For confidence coefficient .98, α = 1 - .98 = .02 and α/2 = .02/2 = .01. From Table IV,
    Appendix B, z.01 = 2.33. The confidence interval is:

    p̂ ± z.01√(p̂q̂/n) ⇒ .144 ± 2.33√(.144(.856)/1,252) ⇒ .144 ± .023 ⇒ (.121, .167)

    We are 98% confident that the proportion of debit cardholders who have used their card
    in making purchases over the Internet is between .121 and .167.

e.  Since we would have less confidence with a 90% confidence interval than with a 98%
    confidence interval, the 90% interval would be narrower.

5.102

a.  Of the 100 cancer patients, 7 were fired or laid off ⇒ p̂ = 7/100 = .07.

    To see if the sample size is sufficiently large:

    p̂ ± 3σp̂ ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .07 ± 3√(.07(.93)/100) ⇒ .07 ± .077 ⇒ (-.007, .145)

    Since the interval does not lie within the interval (0, 1), the normal approximation will not
    be adequate. We will go ahead and construct the interval anyway.

    For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix
    B, z.05 = 1.645. The confidence interval is:

    p̂ ± z.05√(p̂q̂/n) ⇒ .07 ± 1.645√(.07(.93)/100) ⇒ .07 ± .042 ⇒ (.028, .112)

    Converting these to percentages, we get (2.8%, 11.2%).


b.

We are 90% confident that the percentage of all cancer patients who are fired or laid off
due to their illness is between 2.8% and 11.2%.

c.

Since the rate of being fired or laid off for all Americans is 1.3% and this value falls
outside the confidence interval in part b, there is evidence to indicate that employees with
cancer are fired or laid off at a rate that is greater than that of all Americans.


5.104

a.  p̂ = x/n = 9,296/10,000 = .9296

    The approximate 95% confidence interval is:

    p̂ ± 2√(p̂(1 - p̂)/n)√((N - n)/N)
    ⇒ .9296 ± 2√((.9296(.0704)/10,000)((500,000 - 10,000)/500,000))
    ⇒ .9296 ± 2√.000006413 ⇒ .9296 ± .0051 ⇒ (.9245, .9347)

b.  Only 10,000/500,000 × 100% = 2% of the subscribers returned the questionnaire. Often in
    mail surveys, those that respond are those with strong views. Thus, the 10,000 that
    responded may not be representative. I would question the estimate in part a.

5.106

a.  The point estimate for the fraction of the entire market who refuse to purchase bars is:

    p̂ = x/n = 23/244 = .094

b.  To see if the sample size is sufficient:

    p̂ ± 3√(p̂q̂/n) ⇒ .094 ± 3√(.094(.906)/244) ⇒ .094 ± .056 ⇒ (.038, .150)

    Since the interval above is contained in the interval (0, 1), the sample size is sufficiently
    large.

c.  For confidence coefficient .95, α = 1 - .95 = .05 and α/2 = .05/2 = .025. From Table IV,
    Appendix B, z.025 = 1.96. The confidence interval is:

    p̂ ± z.025√(p̂q̂/n) ⇒ .094 ± 1.96√(.094(.906)/244) ⇒ .094 ± .037 ⇒ (.057, .131)

d.  The best estimate of the true fraction of the entire market who refuse to purchase bars six
    months after the poisoning is .094. We are 95% confident the true fraction of the entire
    market who refuse to purchase bars six months after the poisoning is between .057 and
    .131.


5.108

The bound is SE = .1. For confidence coefficient .99, α = 1 - .99 = .01 and α/2 = .01/2 = .005.
From Table IV, Appendix B, z.005 = 2.575.

We estimate p with p̂ from Exercise 5.48, which is p̂ = .636. Thus,

    n = (zα/2)² p̂q̂ /SE² = 2.575²(.636)(.364)/.1² = 153.5 ≈ 154

The necessary sample size would be 154.


5.110

Since the manufacturer wants to be reasonably certain the process is really out of control
before shutting down the process, we would want to use a high level of confidence for our
inference. We will form a 99% confidence interval for the mean breaking strength.
For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table VI, Appendix B,
with df = n - 1 = 9 - 1 = 8, t.005 = 3.355. The 99% confidence interval is:

    x̄ ± t.005 s/√n ⇒ 985.6 ± 3.355(22.9/√9) ⇒ 985.6 ± 25.61 ⇒ (959.99, 1,011.21)

We are 99% confident that the true mean breaking strength is between 959.99 and 1,011.21.
Since 1,000 is contained in this interval, it is not an unusual value for the true mean breaking
strength. Thus, we would recommend that the process is not out of control.


Inferences Based on a Single Sample:


Tests of Hypothesis

Chapter 6

6.2

The test statistic is used to decide whether or not to reject the null hypothesis in favor of the
alternative hypothesis.

6.4

A Type I error is rejecting the null hypothesis when it is true.

A Type II error is accepting the null hypothesis when it is false.

α = the probability of committing a Type I error.

β = the probability of committing a Type II error.
6.6

We can compute a measure of reliability for rejecting the null hypothesis when it is true. This
measure of reliability is the probability of rejecting the null hypothesis when it is true, which is
α. However, it is generally not possible to compute a measure of reliability for accepting the
null hypothesis when it is false. We would have to compute the probability of accepting the
null hypothesis when it is false, β, for every value of the parameter in the alternative
hypothesis.

6.8

Let p = proportion of U.S. companies that have formal, written travel and entertainment
policies for their employees. The null hypothesis would be:
H0: p = .80

6.10

Let μ = average Libor rate for 3-month loans. Since many Western banks think that the
reported average Libor rate (.054) is too high, they want to show that the average is less than
.054. The appropriate hypotheses would be:

    H0: μ = .054
    Ha: μ < .054

6.12

Let p = proportion of time the camera correctly detects liars. The null hypothesis would be:
H0: p = .75

6.14

a.

A Type I error would be concluding the proposed user is unauthorized when, in fact, the
proposed user is authorized.
A Type II error would be concluding the proposed user is authorized when, in fact, the
proposed user is unauthorized.
In this case, a more serious error would be a Type II error. One would not want to
conclude that the proposed user is authorized when he/she is not.

b.

The Type I error rate is 1%. This means that the probability of concluding the proposed
user is unauthorized when, in fact, the proposed user is authorized is .01.


The Type II error rate is .00025%. This means that the probability of concluding the
proposed user is authorized when, in fact, the proposed user is unauthorized is .0000025.
c.

The Type I error rate is .01%. This means that the probability of concluding the proposed
user is unauthorized when, in fact, the proposed user is authorized is .0001.
The Type II error rate is .005%. This means that the probability of concluding the
proposed user is authorized when, in fact, the proposed user is unauthorized is .00005.

6.16

a.

The null hypothesis is: Ho: There is no intrusion.

b.

The alternative hypothesis is: Ha: There is an intrusion.

c.

α = P(warning | no intrusion) = 1/1000 = .001

β = P(no warning | intrusion) = 500/1000 = .5

6.18

a.

The decision rule is to reject H0 if x̄ > 270. Recall that

z = (x̄ − μ0)/σx̄

Therefore, reject H0 if x̄ > 270 can be written reject H0 if

z > (270 − 255)/(63/√81) = 2.14

The decision rule in terms of z is to reject H0 if z > 2.14.


b.

P(z > 2.14) = .5 − P(0 < z < 2.14) = .5 − .4838 = .0162

6.20

a.

H0: μ = .36
Ha: μ < .36

The test statistic is z = (x̄ − μ0)/σx̄ = (.323 − .36)/√(.034/64) = −1.61

The rejection region requires α = .10 in the lower tail of the z-distribution. From Table IV, Appendix B, z.10 = 1.28. The rejection region is z < −1.28.


Since the observed value of the test statistic falls in the rejection region (z = −1.61 < −1.28), H0 is rejected. There is sufficient evidence to indicate the mean is less than .36 at α = .10.

b.

H0: μ = .36
Ha: μ ≠ .36

The test statistic is z = −1.61 (see part a).

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645 or z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = −1.61 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate the mean is different from .36 at α = .10.
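Readers who want to verify these decisions with software can use a short sketch like the one below (an illustration, not part of the text). It assumes, as the arithmetic above suggests, that .034 is the sample variance for the n = 64 observations.

# Sketch: large-sample z test of H0: mu = .36 from the summary statistics above.
from math import sqrt
from scipy import stats

xbar, s2, n, mu0, alpha = 0.323, 0.034, 64, 0.36, 0.10
z = (xbar - mu0) / sqrt(s2 / n)            # about -1.61

# Part a: lower-tailed test, Ha: mu < .36
print("one-tailed:", "reject H0" if z < stats.norm.ppf(alpha) else "do not reject H0")
# Part b: two-tailed test, Ha: mu != .36
print("two-tailed:", "reject H0" if abs(z) > stats.norm.ppf(1 - alpha / 2) else "do not reject H0")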
6.22

a.

To determine whether the mean July, 2006 dealer price of the Toyota Prius differs
from $25,000, we test:
H0: μ = 25,000
Ha: μ ≠ 25,000

b.

The sample mean is x̄ = Σxi/n = 4,076,271/160 = 25,476.69

The sample variance is:

s² = [Σxi² − (Σxi)²/n] / (n − 1) = [104,788,653,115 − (4,076,271)²/160] / (160 − 1) = 5,904,057.862

The sample standard deviation is s = √s² = √5,904,057.862 = 2,429.8267

c.

The test statistic is z = (x̄ − μ0)/σx̄ = (25,476.69 − 25,000)/(2,429.8267/√160) = 2.48

d.

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z < −1.96 or z > 1.96.

e.

Since the observed value of the test statistic falls in the rejection region (z = 2.48 > 1.96), H0 is rejected. There is sufficient evidence to indicate the mean July, 2006 dealer price of the Toyota Prius differs from $25,000 at α = .05.
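The hand arithmetic in part b is easy to mangle, so a small sketch (ours, not the authors') that recomputes x̄, s², s, and z directly from the two sums reported above may be helpful.

# Sketch: sample mean, variance, and z statistic from the summary sums in part b.
from math import sqrt

n, mu0 = 160, 25_000
sum_x = 4_076_271            # sum of the 160 dealer prices
sum_x2 = 104_788_653_115     # sum of the squared prices

xbar = sum_x / n                              # 25,476.69
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)      # about 5,904,058
s = sqrt(s2)                                  # about 2,429.83
z = (xbar - mu0) / (s / sqrt(n))              # about 2.48
print(round(xbar, 2), round(s, 4), round(z, 2))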


6.24

a.

A Type I error is rejecting H0 when H0 is true. In this case, we would conclude that the
mean number of carats per diamond is different from .6 when, in fact, it is equal to .6.
A Type II error is accepting H0 when H0 is false. In this case, we would conclude that the
mean number of carats per diamond is equal to .6 when, in fact, it is different from .6.

b.

From Exercise 5.18, the random sample of 30 diamonds yielded x̄ = .691 and s = .262.

Let μ = mean number of carats per diamond. To determine if the mean number of carats per diamond is different from .6, we test:

H0: μ = .6
Ha: μ ≠ .6

The test statistic is z = (x̄ − μ0)/σx̄ = (.691 − .6)/(.262/√30) = 1.90

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z > 1.96 or z < −1.96.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.90 ≯ 1.96), H0 is not rejected. There is insufficient evidence to indicate the mean number of carats per diamond is different from .6 carats at α = .05.

c.

When α is changed, H0, Ha, and the test statistic remain the same.

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645 or z < −1.645.

Since the observed value of the test statistic falls in the rejection region (z = 1.90 > 1.645), H0 is rejected. There is sufficient evidence to indicate the mean number of carats per diamond is different from .6 carats at α = .10.

d.

When the value of α changes, the decision can also change. Thus, it is very important to include the level of α used in all decisions.

6.26

Using MINITAB, the descriptive statistics are:


Descriptive Statistics: GASTURBINE

Variable      N  N*   Mean  SE Mean  StDev  Minimum    Q1  Median     Q3  Maximum
GASTURBINE   67   0  11066      195   1595     8714  9918   10656  11842    16243

To determine if the mean heat rate of gas turbines augmented with high pressure inlet
fogging exceeds 10,000 kJ/kWh, we test:
H0: μ = 10,000
Ha: μ > 10,000


The test statistic is z = (x̄ − μ0)/σx̄ = (11,066 − 10,000)/(1,595/√67) = 5.47

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 5.47 > 1.645), H0 is rejected. There is sufficient evidence to indicate the true mean heat rate of gas turbines augmented with high pressure inlet fogging exceeds 10,000 kJ/kWh at α = .05.
6.28

a.

Let μ = average full-service fee (in thousands of dollars) of U.S. funeral homes in 2006. To determine if the average full-service fee exceeds $6,500, we test:

H0: μ = 6.50
Ha: μ > 6.50

b.

Using MINITAB, the output is:

Descriptive Statistics: FUNERAL

Variable   N   Mean  Median  StDev  SE Mean
Fee       36  6.819   6.600  1.265    0.211

Variable  Minimum  Maximum     Q1     Q3
Fee         5.200   11.600  6.025  7.400

H0: μ = 6.50
Ha: μ > 6.50

The test statistic is z = (x̄ − μ0)/σx̄ = (6.819 − 6.50)/(1.265/√36) = 1.51

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.51 ≯ 1.645), H0 is not rejected. There is insufficient evidence to indicate the true mean full-service fee of U.S. funeral homes in 2006 exceeds $6,500 at α = .05.
c.

No. Since the sample size (n = 36) is greater than 30, the Central Limit Theorem applies.
The distribution of x is approximately normal regardless of the population distribution.


6.30

a.

To determine if the sample data refute the manufacturer's claim, we test:

H0: μ = 10
Ha: μ < 10
b.

A Type I error is concluding the mean number of solder joints inspected per second is less
than 10 when, in fact, it is 10 or more.
A Type II error is concluding the mean number of solder joints inspected per second is at
least 10 when, in fact, it is less than 10.

c.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: PCB

Variable    N   Mean  Median  TrMean  StDev  SE Mean
PCB        48  9.292   9.000   9.432  2.103    0.304

Variable  Minimum  Maximum     Q1      Q3
PCB         0.000   13.000  9.000  10.000

H0: μ = 10
Ha: μ < 10

The test statistic is z = (x̄ − μ0)/σx̄ = (9.292 − 10)/(2.103/√48) = −2.33

The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

Since the observed value of the test statistic falls in the rejection region (z = −2.33 < −1.645), H0 is rejected. There is sufficient evidence to indicate the mean number of inspections per second is less than 10 at α = .05.
6.32


We will reject H0 if the p-value < α.


a. .06 ≮ .05, do not reject H0.

b. .10 ≮ .05, do not reject H0.

c. .01 < .05, reject H0.

d. .001 < .05, reject H0.

e. .251 ≮ .05, do not reject H0.

f. .042 < .05, reject H0.


6.34

z = (x̄ − μ0)/σx̄ = (49.4 − 50)/(4.1/√100) = −1.46

p-value = P(z ≥ −1.46) = .5 + .4279 = .9279

There is no evidence to reject H0 for α ≤ .10.
6.36

First, find the value of the test statistic:


z = (x̄ − μ0)/σx̄ = (10.7 − 10)/(3.1/√50) = 1.60

p-value = P(z ≤ −1.60 or z ≥ 1.60) = 2P(z ≥ 1.60) = 2(.5 − .4452) = 2(.0548) = .1096
(using Table IV, Appendix B)

There is no evidence to reject H0 for α ≤ .10.
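As a software cross-check on these table look-ups, the sketch below (ours) computes the same observed significance levels with the standard normal cdf. It assumes, as the p-value directions above imply, that Exercise 6.34 uses an upper-tailed alternative and 6.36 a two-tailed one.

# Sketch: p-values for the z statistics in Exercises 6.34 and 6.36.
from math import sqrt
from scipy.stats import norm

z_634 = (49.4 - 50) / (4.1 / sqrt(100))
print(round(norm.sf(z_634), 4))              # P(Z >= z), about .93

z_636 = (10.7 - 10) / (3.1 / sqrt(50))
print(round(2 * norm.sf(abs(z_636)), 4))     # 2*P(Z >= |z|), about .11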
6.38

a.

The p-value reported by SAS is for a two-tailed test. Thus, P(z ≤ −1.63) + P(z ≥ 1.63) = .1032. For this one-tailed test, the p-value = P(z ≤ −1.63) = .1032/2 = .0516.

Since the p-value = .0516 > α = .05, H0 is not rejected. There is insufficient evidence to indicate μ < 75 at α = .05.

b.

For this one-tailed test, the p-value = P(z ≤ 1.63). Since P(z ≥ 1.63) = .1032/2 = .0516, P(z ≤ 1.63) = 1 − .0516 = .9484.

Since the p-value = .9484 > α = .10, H0 is not rejected. There is insufficient evidence to indicate μ < 75 at α = .10.

c.

For this one-tailed test, the p-value = P(z ≥ 1.63) = .1032/2 = .0516.

Since the p-value = .0516 < α = .10, H0 is rejected. There is sufficient evidence to indicate μ > 75 at α = .10.

d.

For this two-tailed test, the p-value = .1032.

Since the p-value = .1032 > α = .01, H0 is not rejected. There is insufficient evidence to indicate μ ≠ 75 at α = .01.

6.40

The p-value is p = 0.014. The probability of observing a test statistic of t = 2.48 or anything more unusual if μ = 25,000 is 0.014. Since p = 0.014 is so small, we would reject H0. There is sufficient evidence to indicate the mean price for hybrid Toyota Prius cars is different from $25,000 for any value of α > .014.

6.42

From the printout, the p-value = .000. Since the p-value = .000 < α = .01, H0 is rejected. There is sufficient evidence to indicate that the true population mean weight of plastic golf tees is different from .250 at α = .01.


6.44

a.

z = (x̄ − μ0)/σx̄ = (52.3 − 51)/(7.1/√50) = 1.29

The p-value is p = P(z ≤ −1.29) + P(z ≥ 1.29) = (.5 − .4015) + (.5 − .4015) = .1970.
(Using Table IV, Appendix B.)

b.

The p-value is p = P(z ≥ 1.29) = (.5 − .4015) = .0985. (Using Table IV, Appendix B.)

c.

z = (x̄ − μ0)/σx̄ = (52.3 − 51)/(10.4/√50) = 0.88

The p-value is p = P(z ≤ −0.88) + P(z ≥ 0.88) = (.5 − .3106) + (.5 − .3106) = .3788.
(Using Table IV, Appendix B.)

d.

In part a, in order to reject H0, α would have to be greater than .1970. In part b, in order to reject H0, α would have to be greater than .0985. In part c, in order to reject H0, α would have to be greater than .3788.

e.

For a two-tailed test, α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58.

z = (x̄ − μ0)/(s/√n) ⇒ 2.58 = (52.3 − 51)/(s/√50) ⇒ .3649s = 1.3 ⇒ s = 3.56

For a one-tailed test, α = .01. From Table IV, Appendix B, z.01 = 2.33.

z = (x̄ − μ0)/(s/√n) ⇒ 2.33 = (52.3 − 51)/(s/√50) ⇒ .3295s = 1.3 ⇒ s = 3.95

6.46

a.

z = (x̄ − μ0)/σx̄ = (10.2 − 0)/(31.3/√50) = 2.30

b.

For this two-sided test, the p-value = P(z ≤ −2.30) + P(z ≥ 2.30) = (.5 − .4893) + (.5 − .4893) = .0214. Since this value is so small, there is evidence to reject H0. There is sufficient evidence to indicate the mean level of feminization is different from 0% for any value of α > .0214.

c.

z = (x̄ − μ0)/σx̄ = (15.0 − 0)/(25.1/√50) = 4.23

For this two-sided test, the p-value = P(z ≤ −4.23) + P(z ≥ 4.23) ≈ (.5 − .5) + (.5 − .5) = 0. Since this value is so small, there is evidence to reject H0. There is sufficient evidence to indicate the mean level of feminization is different from 0% for any value of α > 0.0.


6.48

a.

P(t > 1.440) = .10


(Using Table VI, Appendix B, with df = 6)

b.

P(t < −1.782) = .05


(Using Table VI, Appendix B, with df = 12)

c.

P(t < −2.060) + P(t > 2.060) = .025 + .025 = .05


(Using Table VI, Appendix B, with df = 25)

d.

The probability of a Type I error is computed above for each of the parts.

6.50

a.

H0: μ = 6
Ha: μ < 6

The test statistic is t = (x̄ − μ0)/(s/√n) = (4.8 − 6)/(1.3/√5) = −2.064

The necessary assumption is that the population is normal.

The rejection region requires α = .05 in the lower tail of the t-distribution with df = n − 1 = 5 − 1 = 4. From Table VI, Appendix B, t.05 = 2.132. The rejection region is t < −2.132.

Since the observed value of the test statistic does not fall in the rejection region (t = −2.064 ≮ −2.132), H0 is not rejected. There is insufficient evidence to indicate the mean is less than 6 at α = .05.

b.

H0: μ = 6
Ha: μ ≠ 6

The test statistic is t = −2.064 (from a).

The assumption is the same as in a.

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = n − 1 = 5 − 1 = 4. From Table VI, Appendix B, t.025 = 2.776. The rejection region is t < −2.776 or t > 2.776.

Since the observed value of the test statistic does not fall in the rejection region (t = −2.064 ≮ −2.776), H0 is not rejected. There is insufficient evidence to indicate the mean is different from 6 at α = .05.


c.

For part a, the p-value = P(t ≤ −2.064).

From Table VI, with df = 4, .05 < P(t ≤ −2.064) < .10, or .05 < p-value < .10.

For part b, the p-value = P(t ≤ −2.064) + P(t ≥ 2.064).

From Table VI, with df = 4, 2(.05) < p-value < 2(.10), or .10 < p-value < .20.
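When software is available, the exact p-values bounded above by Table VI can be computed directly. The sketch below is an illustration under the same summary statistics (n = 5, x̄ = 4.8, s = 1.3); it is not part of the printed solution.

# Sketch: small-sample t test of H0: mu = 6 and its exact p-values.
from math import sqrt
from scipy.stats import t

n, xbar, s, mu0 = 5, 4.8, 1.3, 6.0
t_stat = (xbar - mu0) / (s / sqrt(n))   # about -2.064
df = n - 1

p_lower = t.cdf(t_stat, df)             # one-tailed p-value, Ha: mu < 6
p_two = 2 * t.sf(abs(t_stat), df)       # two-tailed p-value, Ha: mu != 6
print(round(t_stat, 3), round(p_lower, 4), round(p_two, 4))
# Both exact values fall inside the table bounds quoted above.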

6.52

a.

To determine if the true mean breaking strength of the new bonding adhesive is less
than 5.70 Mpa, we test:
H0: μ = 5.70
Ha: μ < 5.70

b.

The rejection region requires α = .01 in the lower tail of the t-distribution with df = n − 1 = 10 − 1 = 9. From Table VI, Appendix B, t.01 = 2.821. The rejection region is t < −2.821.

c.

The test statistic is t = (x̄ − μ0)/(s/√n) = (5.07 − 5.70)/(.46/√10) = −4.33

d.

Since the observed value of the test statistic falls in the rejection region (t = −4.33 < −2.821), H0 is rejected. There is sufficient evidence to indicate the true mean breaking strength of the new bonding adhesive is less than 5.70 Mpa at α = .01.

e.

We must assume that the sample was random and selected from a normal population.

6.54

Some preliminary calculations are:

x̄ = Σx/n = 736/7 = 105.14

s² = [Σx² − (Σx)²/n] / (n − 1) = [78,696 − (736)²/7] / (7 − 1) = 218.4762

s = √218.4762 = 14.7809

a.

To determine if the mean consumption rate of salad dressings in the Southeastern U.S. is
different than the mean national consumption rate, we test:
H0: μ = 100
Ha: μ ≠ 100

b.


Since the sample size is so small, we must assume that the population being sampled is
normal. In addition, we must assume that the sample is random.


c.

The test statistic is t = (x̄ − μ0)/(s/√n) = (105.14 − 100)/(14.7809/√7) = .92

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution. From Table VI, Appendix B, with df = n − 1 = 7 − 1 = 6, t.025 = 2.447. The rejection region is t > 2.447 or t < −2.447.

Since the value of the test statistic does not fall in the rejection region (t = .92 ≯ 2.447), H0 is not rejected. There is insufficient evidence to indicate the mean consumption rate of salad dressings in the Southeastern U.S. is different than the mean national consumption rate at α = .05.

d.

The observed significance level is p-value = P(t ≥ .92) + P(t ≤ −.92). Since we did not reject H0 in part c, we know that the p-value must be greater than .05. Using Table VI, Appendix B, with df = n − 1 = 7 − 1 = 6, p-value = P(t ≥ .92) + P(t ≤ −.92) > .1 + .1 = .2. Thus, with this table, we only know that the p-value is greater than .2.

6.56

a.

To determine if the mean repellency percentage of the new mosquito repellent is less than
95, we test:

H0: μ = 95
Ha: μ < 95

The test statistic is t = (x̄ − μ0)/(s/√n) = (83 − 95)/(15/√5) = −1.79

The rejection region requires α = .10 in the lower tail of the t-distribution. From Table VI, Appendix B, with df = n − 1 = 5 − 1 = 4, t.10 = 1.533. The rejection region is t < −1.533.

Since the observed value of the test statistic falls in the rejection region (t = −1.79 < −1.533), H0 is rejected. There is sufficient evidence to indicate that the true mean repellency percentage of the new mosquito repellent is less than 95 at α = .10.

b.

We must assume that the population of percent repellencies is normally distributed.

6.58

a.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Plants

Variable    N   Mean  Median  TrMean  StDev  SE Mean
Plants     20  4.000   3.500   3.667  3.061    0.684

Variable  Minimum  Maximum     Q1     Q3
Plants      1.000   13.000  1.250  5.000

Let μ = mean number of active nuclear power plants operating in all states. To determine if the mean number of active nuclear power plants operating in all states exceeds 3, we test:

H0: μ = 3
Ha: μ > 3


The test statistic is t = (x̄ − μ0)/(s/√n) = (4 − 3)/(3.061/√20) = 1.46

The rejection region requires α = .10 in the upper tail of the t-distribution with df = n − 1 = 20 − 1 = 19. From Table VI, Appendix B, t.10 = 1.328. The rejection region is t > 1.328.

Since the observed value of the test statistic falls in the rejection region (t = 1.46 > 1.328), H0 is rejected. There is sufficient evidence to indicate the mean number of active nuclear power plants operating in all states exceeds 3 at α = .10.
b.

We will look at the 4 methods for determining if the data are normal. First, we will look
at a histogram of the data. Using MINITAB, the histogram of the number of power plants
is:

[Histogram of the number of power plants (Frequency versus Plants) omitted.]

From the histogram, the data appear to be skewed to the right. This indicates that the data
may not be normal.
Next, we look at the intervals x̄ ± s, x̄ ± 2s, x̄ ± 3s. If the proportions of observations falling in each interval are approximately .68, .95, and 1.00, then the data are approximately normal.

x̄ ± s ⇒ 4 ± 3.061 ⇒ (.939, 7.061). 18 of the 20 values fall in this interval. The proportion is .90. This is much greater than the .68 we would expect if the data were normal.

x̄ ± 2s ⇒ 4 ± 2(3.061) ⇒ 4 ± 6.122 ⇒ (−2.122, 10.122). 19 of the 20 values fall in this interval. The proportion is .95. This is the same as the .95 we would expect if the data were normal.

x̄ ± 3s ⇒ 4 ± 3(3.061) ⇒ 4 ± 9.183 ⇒ (−5.183, 13.183). 20 of the 20 values fall in this interval. The proportion is 1.000. This is equal to the 1.00 we would expect if the data were normal.


From this method, it appears that the data are not normal.
Next, we look at the ratio of the IQR to s. IQR = QU − QL = 5.00 − 1.25 = 3.75.

IQR/s = 3.75/3.061 = 1.22. This is close to the 1.3 we would expect if the data were normal. This method indicates the data may be normal.
Finally, using MINITAB, the normal probability plot is:
[Normal probability plot for Plants (ML estimates − 95% CI; StDev = 2.98329; goodness-of-fit AD* = 1.298) omitted.]
Since the data do not form a straight line, the data are not normal.
From 3 of the 4 different methods, the indications are that the number of power plants data
are not normal.
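The interval counts and IQR/s ratio used above are easy to automate. The sketch below (ours) shows one way to run the same informal checks on a data vector; the list of values is only a stand-in, since the actual Plants data set is not reproduced here.

# Sketch: empirical-rule interval counts and IQR/s ratio as informal normality checks.
import statistics

data = [1, 1, 2, 3, 4, 5, 9, 13]   # hypothetical values for illustration only
n, xbar, s = len(data), statistics.mean(data), statistics.stdev(data)

for k in (1, 2, 3):
    lo, hi = xbar - k * s, xbar + k * s
    frac = sum(lo <= x <= hi for x in data) / n
    print(f"within {k} s: {frac:.2f}")        # compare with .68, .95, 1.00

q1, _, q3 = statistics.quantiles(data, n=4)   # quartiles
print("IQR/s =", round((q3 - q1) / s, 2))     # roughly 1.3 for normal data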
c.

The two largest values are 9 and 13. The two lowest values are 1 and 1. Using
MINITAB with the data deleted yields the descriptive statistics:

Descriptive Statistics: Plants2

Variable    N   Mean  Median  TrMean  StDev  SE Mean
Plants2    16  3.500   3.500   3.429  1.826    0.456

Variable  Minimum  Maximum     Q1     Q3
Plants2     1.000    7.000  2.000  5.000

To determine if the mean number of active nuclear power plants operating in all states
exceeds 3 (using the reduced data set), we test:
H0: μ = 3
Ha: μ > 3


The test statistic is t = (x̄ − μ0)/(s/√n) = (3.5 − 3)/(1.826/√16) = 1.10

The rejection region requires α = .10 in the upper tail of the t-distribution with df = n − 1 = 16 − 1 = 15. From Table VI, Appendix B, t.10 = 1.341. The rejection region is t > 1.341.

Since the observed value of the test statistic does not fall in the rejection region (t = 1.10 ≯ 1.341), H0 is not rejected. There is insufficient evidence to indicate the mean number of active nuclear power plants operating in all states exceeds 3 at α = .10.

By eliminating the top two and bottom two observations, we have changed the decision from rejecting H0 to not rejecting H0.
By eliminating the top two and bottom two observations, we have changed the decision
from rejecting H0 to not rejecting H0.
d.

It is very dangerous to eliminate data points to satisfy assumptions. The data may, in fact, not be normal. By eliminating data points, one has changed the kind of data that come from the parent population. Thus, incorrect decisions could be made.

6.60

Using MINITAB, the descriptive statistics for the 2 plants are:


Descriptive Statistics: AL1, AL2

Variable   N  N*     Mean  SE Mean    StDev  Minimum  Q1   Median  Q3  Maximum
AL1        2   0  0.00750  0.00250  0.00354  0.00500   *  0.00750   *  0.01000
AL2        2   0   0.0700   0.0200   0.0283   0.0500   *   0.0700   *   0.0900

To determine if plant 1 is violating the OSHA standard, we test:


H0: μ = .004
Ha: μ > .004

The test statistic is t = (x̄ − μ0)/(s/√n) = (.0075 − .004)/(.00354/√2) = 1.40

Since no α level was given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the t-distribution with df = n − 1 = 2 − 1 = 1. From Table VI, Appendix B, t.05 = 6.314. The rejection region is t > 6.314.

Since the observed value of the test statistic does not fall in the rejection region (t = 1.40 ≯ 6.314), H0 is not rejected. There is insufficient evidence to indicate the OSHA standard is violated by plant 1 at α = .05.
To determine if plant 2 is violating the OSHA standard, we test:
H0: μ = .004
Ha: μ > .004

The test statistic is t = (x̄ − μ0)/(s/√n) = (.07 − .004)/(.0283/√2) = 3.30


Since no α level was given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the t-distribution with df = n − 1 = 2 − 1 = 1. From Table VI, Appendix B, t.05 = 6.314. The rejection region is t > 6.314.

Since the observed value of the test statistic does not fall in the rejection region (t = 3.30 ≯ 6.314), H0 is not rejected. There is insufficient evidence to indicate the OSHA standard is violated by plant 2 at α = .05.
6.62

b.

First, check to see if n is large enough.


p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .70 ± 3√((.70)(.30)/100) ⇒ .70 ± .14 ⇒ (.56, .84)

Since the interval lies within the interval (0, 1), the normal approximation will be adequate.

H0: p = .70
Ha: p < .70

The test statistic is z = (p̂ − p0)/σp̂ = (p̂ − p0)/√(p0q0/n) = (.63 − .70)/√(.70(.30)/100) = −1.53

The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

Since the observed value of the test statistic does not fall in the rejection region (−1.53 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate that the proportion is less than .70 at α = .05.
c.

p-value = P(z ≤ −1.53) = .5 − .4370 = .0630

Since p is not less than α = .05, H0 is not rejected.
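The normal-approximation check and test statistic above can be scripted as well. The short sketch below (ours) mirrors the computation for part b with p0 = .70, p̂ = .63, and n = 100.

# Sketch: large-sample z test for a proportion, H0: p = .70 vs Ha: p < .70.
from math import sqrt
from scipy.stats import norm

p0, p_hat, n, alpha = 0.70, 0.63, 100, 0.05
se0 = sqrt(p0 * (1 - p0) / n)

# Normal-approximation check: p0 +/- 3*se0 should lie inside (0, 1).
print("approx ok:", 0 < p0 - 3 * se0 and p0 + 3 * se0 < 1)

z = (p_hat - p0) / se0                     # about -1.53
p_value = norm.cdf(z)                      # lower-tailed p-value, about .063
print(round(z, 2), round(p_value, 4),
      "reject H0" if z < norm.ppf(alpha) else "do not reject H0")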

6.64

a.

No. The p-value is the probability of observing your test statistic or anything more
unusual if H0 is true. For this problem, the p-value = .3300/2 = .1650.
Given the true value of the population proportion, p, is .5, the probability of observing a
test statistic of z = .44 or larger is .1650. Since the p-value is not small (p = .1650), there
is no evidence to reject H0. There is no evidence to indicate the population proportion is
greater than .5.

b.

If the alternative hypothesis were two-tailed, the p-value would be 2 times the p-value for a one-tailed test. For this problem, the p-value = .3300. The probability of observing your test statistic or anything more unusual if H0 is true is .3300.

There is no evidence to reject H0 for α ≤ .10. There is no evidence to indicate that p ≠ .5 for α ≤ .10.


6.66

a.

p̂ = x/n = 64/106 = .604

b.

H0: p = .70
Ha: p ≠ .70

c.

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.604 − .70)/√(.70(.30)/106) = −2.16

d.

The rejection region requires α/2 = .01/2 = .005 in each tail of the z-distribution. From Table IV, Appendix B, z.005 = 2.58. The rejection region is z > 2.58 or z < −2.58.

e.

Since the observed value of the test statistic does not fall in the rejection region (z = −2.16 ≮ −2.58), H0 is not rejected. There is insufficient evidence to indicate the true proportion of consumers who believe Made in the USA means 100% of labor and materials are from the United States is different from .70 at α = .01.

6.68

a.

The population parameter of interest is p = proportion of items that had the wrong price scanned at California Wal-Mart stores.

b.

To determine if the true proportion of items scanned at California Wal-Mart stores with the wrong price exceeds the 2% NIST standard, we test:

H0: p = .02
Ha: p > .02
c.

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.083 − .02)/√(.02(.98)/1000) = 14.23

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.
d.

Since the observed value of the test statistic falls in the rejection region
(z = 14.23 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the true
proportion of items scanned at California Wal-Mart stores with the wrong price exceeds
the 2% NIST standard at = .05. This means that the proportion of items with wrong
prices at California Wal-Mart stores is much higher than what is allowed.

e.

In order for the inference to be valid, the sampling distribution of p̂ must be approximately normal. We check this assumption:

p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .02 ± 3√(.02(.98)/1000) ⇒ .02 ± .013 ⇒ (.007, .033)

Since the above interval falls completely in the interval (0, 1), the normal distribution
will be adequate.


6.70

a.

Let p = proportion of vacation-home owners who are minorities in 2003.


p̂ = x/n = 46/416 = .111

To determine if the percentage of vacation-home owners in 2006 who are minorities is larger than 6%, we test:

H0: p = .06
Ha: p > .06

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.111 − .06)/√(.06(.94)/416) = 4.38

The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

Since the observed value of the test statistic falls in the rejection region (z = 4.38 > 2.33), H0 is rejected. There is sufficient evidence to indicate that the true percentage of vacation-home owners in 2006 who are minorities is larger than 6% at α = .01.
b.

Since the return rate of the questionnaire was so small compared to the number sent out, one should be very skeptical of the results. It would be fairly unusual that the sample of returned questionnaires would be representative of the entire population.

6.72

Let p = proportion of firms in violation of the new 4-day rule for reporting material changes.
p̂ = x/n = 23/462 = .050

To determine if the percentage of firms in violation of the new 4-day rule for reporting material changes is less than 10%, we test:

H0: p = .10
Ha: p < .10

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.050 − .10)/√(.10(.90)/462) = −3.58

The rejection region requires α = .01 in the lower tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z < −2.33.

Since the observed value of the test statistic falls in the rejection region (z = −3.58 < −2.33), H0 is rejected. There is sufficient evidence to indicate that the true percentage of firms in violation of the new 4-day rule for reporting material changes is less than 10% at α = .01.


6.74

Let p = proportion of patients taking the pill who reported an improved condition.
First we check to see if the normal approximation is adequate:
p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .5 ± 3√(.5(.5)/7000) ⇒ .5 ± .018 ⇒ (.482, .518)

Since the interval falls completely in the interval (0, 1), the normal distribution will be adequate.

To determine if there really is a placebo effect at the clinic, we test:

H0: p = .5
Ha: p > .5

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.7 − .5)/√(.5(.5)/7000) = 33.47

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 33.47 > 1.645), H0 is rejected. There is sufficient evidence to indicate that there really is a placebo effect at the clinic at α = .05.
6.76

a.

The power of a test increases when:

1. The distance between the null and alternative values of μ increases.
2. The value of α increases.
3. The sample size increases.

b.

The power of a test is equal to 1 − β. As β increases, the power decreases.


6.78

From Exercise 6.77 we want to test H0: μ = 500 against Ha: μ > 500 using α = .05, σ = 100, n = 25, and x̄0 = 532.9.

a.

β = P(x̄0 < 532.9 when μ = 575) = P(z < (532.9 − 575)/(100/√25)) = P(z < −2.11) = .5 − .4826 = .0174

b.

Power = 1 − β = 1 − .0174 = .9826

c.

In Exercise 6.77, β = .1949 and the power is .8051. The value of β has decreased in this exercise since μ = 575 is further from the hypothesized value than μ = 550. As a result, the power of the test in this exercise has increased (when β decreases, the power of the test increases).

6.80

a.

From Exercise 6.79, we want to test H0: μ = 75 against Ha: μ < 75 using α = .10, σ = 15, n = 49, and x̄0 = 72.257.

If μ = 74, β = P(x̄0 > 72.257 when μ = 74) = P(z > (72.257 − 74)/(15/√49)) = P(z > −.81) = .5 + .2910 = .7910

If μ = 72, β = P(x̄0 > 72.257 when μ = 72) = P(z > (72.257 − 72)/(15/√49)) = P(z > .12) = .5 − .0478 = .4522

If μ = 70, β = P(x̄0 > 72.257 when μ = 70) = .1469 (Refer to Exercise 6.79, part c.)

If μ = 68, β = P(x̄0 > 72.257 when μ = 68) = P(z > (72.257 − 68)/(15/√49)) = P(z > 1.99) = .5 − .4767 = .0233

If μ = 66, β = P(x̄0 > 72.257 when μ = 66) = P(z > (72.257 − 66)/(15/√49)) = P(z > 2.92) = .5 − .4982 = .0018

In summary,

μ:    74      72      70      68      66
β:  .7910   .4522   .1469   .0233   .0018
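Each β value above follows the same pattern: take the cutoff x̄0 that defines the rejection region, then compute the probability of landing on the "accept" side when the true mean is μa. The sketch below (an illustration using the Exercise 6.79/6.80 setup of σ = 15, n = 49, x̄0 = 72.257) automates that calculation.

# Sketch: beta and power for the lower-tailed test (reject when x-bar < 72.257).
from math import sqrt
from scipy.stats import norm

sigma, n, xbar0 = 15, 49, 72.257
se = sigma / sqrt(n)

for mu_a in (74, 72, 70, 68, 66):
    beta = norm.sf((xbar0 - mu_a) / se)   # P(x-bar > 72.257 | mu = mu_a)
    print(mu_a, round(beta, 4), round(1 - beta, 4))   # beta, power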


b.

c.

Looking at the graph, β is approximately .62 when μ = 73.

d.

Power = 1 − β. Therefore,

μ:      74      72      70      68      66
β:    .7910   .4522   .1469   .0233   .0018
Power: .2090  .5478   .8531   .9767   .9982

The power curve starts out close to 1 when μ = 66 and decreases as μ increases, while the β curve is close to 0 when μ = 66 and increases as μ increases.

e.

As the distance between the true mean μ and the null hypothesized mean μ0 increases, β decreases and the power increases. We can also see that as β increases, the power decreases.

6.82

a.

To determine if the mean size of California homes exceeds the national average, we test:
H0: μ = 2,230
Ha: μ > 2,230


The test statistic is z = (x̄ − μ0)/σx̄ = (2,347 − 2,230)/(257/√100) = 4.55

The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

Since the observed value of the test statistic falls in the rejection region (z = 4.55 > 2.33), H0 is rejected. There is sufficient evidence to indicate the mean size of California homes exceeds the national average at α = .01.
b.

To compute the power, we must first set up the rejection region in terms of x̄.

x̄0 = μ0 + zα σx̄ ≈ μ0 + 2.33(s/√n) = 2,230 + 2.33(257/√100) = 2,289.88

We would reject H0 if x̄ > 2,289.88.

The power of the test when μ = 2,330 would be:

Power = P(x̄ > 2,289.88 | μ = 2,330) = P(z > (2,289.88 − 2,330)/(257/√100)) = P(z > −1.56) = .5 + .4406 = .9406

c.

The power of the test when μ = 2,280 would be:

Power = P(x̄ > 2,289.88 | μ = 2,280) = P(z > (2,289.88 − 2,280)/(257/√100)) = P(z > 0.38) = .5 − .1480 = .3520

6.84

a.

To determine if the mean mpg for 2006 Honda Civic autos is greater than 38 mpg, we
test:
H0: μ = 38
Ha: μ > 38

b.

The test statistic is z = (x̄ − μ0)/σx̄ = (40.3 − 38)/(6.4/√36) = 2.16

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 2.16 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the mean mpg for 2006 Honda Civic autos is greater than 38 mpg at α = .05.

We must assume that the sample was a random sample.


c.

First find:

x̄0 = μ0 + zα σx̄ = μ0 + zα(σ/√n), where z.05 = 1.645 from Table IV, Appendix B.

Thus, x̄0 = 38 + 1.645(6.4/√36) = 39.75

For μ = 38.5:
Power = P(x̄ > 39.75 | μ = 38.5) = P(z > (39.75 − 38.5)/(6.4/√36)) = P(z > 1.17) = .5 − .3790 = .1210

For μ = 39:
Power = P(x̄ > 39.75 | μ = 39) = P(z > (39.75 − 39)/(6.4/√36)) = P(z > .70) = .5 − .2580 = .2420

For μ = 39.5:
Power = P(x̄ > 39.75 | μ = 39.5) = P(z > (39.75 − 39.5)/(6.4/√36)) = P(z > .23) = .5 − .0910 = .4090

For μ = 40:
Power = P(x̄ > 39.75 | μ = 40) = P(z > (39.75 − 40)/(6.4/√36)) = P(z > −.23) = .5 + .0910 = .5910

For μ = 40.5:
Power = P(x̄ > 39.75 | μ = 40.5) = P(z > (39.75 − 40.5)/(6.4/√36)) = P(z > −.70) = .5 + .2580 = .7580
d.

The plot is:

[Power curve (power versus μ for μ = 38.5 to 40.5) omitted.]

e.

From the plot, the power is approximately .5.

For μ = 39.75:
Power = P(x̄ > 39.75 | μ = 39.75) = P(z > (39.75 − 39.75)/(6.4/√36)) = P(z > 0) = .5

f.

From the plot, the power is approximately 1.

For μ = 43:
Power = P(x̄ > 39.75 | μ = 43) = P(z > (39.75 − 43)/(6.4/√36)) = P(z > −3.05) = .5 + .4989 = .9989

If the true value of μ is 43, the approximate probability that the test will fail to reject H0 is 1 − .9989 = .0011.
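The power values plotted above can be generated over any grid of alternative means. The following sketch (not from the text) computes the power curve for this test, using x̄0 = 39.75, σ = 6.4, and n = 36.

# Sketch: power of the upper-tailed test (reject when x-bar > 39.75) as mu varies.
from math import sqrt
from scipy.stats import norm

sigma, n, xbar0 = 6.4, 36, 39.75
se = sigma / sqrt(n)

for mu in (38.5, 39.0, 39.5, 39.75, 40.0, 40.5, 43.0):
    power = norm.sf((xbar0 - mu) / se)    # P(x-bar > 39.75 | true mean mu)
    print(mu, round(power, 4))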

6.86

Using Table VII, Appendix B:


a.

For n = 12, df = n − 1 = 12 − 1 = 11: P(χ² > χ²0) = .10 ⇒ χ²0 = 17.2750

b.

For n = 9, df = n − 1 = 9 − 1 = 8: P(χ² > χ²0) = .05 ⇒ χ²0 = 15.5073

c.

For n = 5, df = n − 1 = 5 − 1 = 4: P(χ² > χ²0) = .025 ⇒ χ²0 = 11.1433

6.88

a.

It would be necessary to assume that the population has a normal distribution.

b.

H0: σ² = 1
Ha: σ² > 1

The test statistic is χ² = (n − 1)s²/σ0² = 6(4.84)/1 = 29.04

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = n − 1 = 7 − 1 = 6. From Table VII, Appendix B, χ².05 = 12.5916. The rejection region is χ² > 12.5916.

Since the observed value of the test statistic falls in the rejection region (χ² = 29.04 > 12.5916), H0 is rejected. There is sufficient evidence to indicate that the variance is greater than 1 at α = .05.


c.

H0: σ² = 1
Ha: σ² ≠ 1

The test statistic is χ² = (n − 1)s²/σ0² = 6(4.84)/1 = 29.04

The rejection region requires α/2 = .025 in each tail of the χ² distribution with df = n − 1 = 7 − 1 = 6. From Table VII, Appendix B, χ².975 = 1.237347 and χ².025 = 14.4494. The rejection region is χ² < 1.237347 or χ² > 14.4494.

Since the observed value of the test statistic falls in the rejection region (χ² = 29.04 > 14.4494), H0 is rejected. There is sufficient evidence to indicate that the variance is not equal to 1 at α = .05.
6.90

Some preliminary calculations are:

s² = [Σx² − (Σx)²/n] / (n − 1) = [176 − (30)²/7] / (7 − 1) = 7.9048

To determine if σ² < 1, we test:

H0: σ² = 1
Ha: σ² < 1

The test statistic is χ² = (n − 1)s²/σ0² = (7 − 1)7.9048/1 = 47.43

The rejection region requires α = .05 in the lower tail of the χ² distribution with df = n − 1 = 7 − 1 = 6. From Table VII, Appendix B, χ².95 = 1.63539. The rejection region is χ² < 1.63539.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 47.43 ≮ 1.63539), H0 is not rejected. There is insufficient evidence to indicate the variance is less than 1.
6.92

a.

To determine if the breaking strength variance of the new adhesive is less than the
variance of the standard composite adhesive, 2 = .25, we test:
H0: σ² = .25
Ha: σ² < .25

b.

The rejection region requires α = .01 in the lower tail of the χ² distribution with df = n − 1 = 10 − 1 = 9. From Table VII, Appendix B, χ².99 = 2.087912. The rejection region is χ² < 2.087912.

c.

The test statistic is χ² = (n − 1)s²/σ0² = (10 − 1)(.46)²/.25 = 7.6176

d.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 7.6176 ≮ 2.087912), H0 is not rejected. There is insufficient evidence to indicate the breaking strength variance of the new adhesive is less than the variance of the standard composite adhesive, σ² = .25, at α = .01.

e.

We must assume that the distribution of the breaking strengths is approximately normal and that a random sample was selected from this population.

6.94
o2

To determine if the true standard deviation of the point-spread errors exceed 15 (variance
exceeds 225), we test:
H0: 2 = 225
Ha: 2 > 225
The test statistic is 2 =

(n 1) s 2

02

(240 1)13.32
= 187.896
225

The rejection region requires in the upper tail of the 2 distribution with df = n 1
= 240 1 = 239. The maximum value of df in Table VII is 100. Thus, we cannot find the
rejection region using Table VII. Using a statistical package, the p-value associated with
2 = 187.896 is .9938.
Since the p-value is so large, there is no evidence to reject H0. There is insufficient evidence to
indicate that the true standard deviation of the point-spread errors exceeds 15 for any
reasonable value of .
(Since the observed variance (or standard deviation) is less than the hypothesized value of the
variance (or standard deviation) under H0, there is no way H0 will be rejected for any
reasonable value of .)
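Because Table VII stops at 100 df, the p-value quoted above has to come from software. A minimal sketch of that computation (ours, using scipy) is shown below.

# Sketch: upper-tailed chi-square p-value for the point-spread test (df = 239).
from scipy.stats import chi2

n, s, sigma0_sq = 240, 13.3, 225
chi_sq = (n - 1) * s ** 2 / sigma0_sq     # about 187.9
p_value = chi2.sf(chi_sq, df=n - 1)       # P(chi-square > 187.9), about .99
print(round(chi_sq, 3), round(p_value, 4))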
6.96

Using MINITAB, the descriptive statistics are:


Descriptive Statistics: GASTURBINE

Variable      N  N*   Mean  SE Mean  StDev  Minimum    Q1  Median     Q3  Maximum
GASTURBINE   67   0  11066      195   1595     8714  9918   10656  11842    16243

To determine if the heat rates of the augmented gas turbine engine are more variable
than the heat rates of the standard gas turbine engine, we test:
H0: σ² = 1,500²
Ha: σ² > 1,500²

The test statistic is χ² = (n − 1)s²/σ0² = (67 − 1)1,595²/1,500² = 74.625

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = n − 1 = 67 − 1 = 66. From Table VII, Appendix B, χ².05 ≈ 85.95148. The rejection region is χ² > 85.95148.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 74.625 ≯ 85.95148), H0 is not rejected. There is insufficient evidence to indicate the heat rates of the augmented gas turbine engine are more variable than the heat rates of the standard gas turbine engine at α = .05.
6.98

For a large sample test of hypothesis about a population mean, no assumptions are necessary
because the Central Limit Theorem assures that the test statistic will be approximately
normally distributed. For a small sample test of hypothesis about a population mean, we must
assume that the population being sampled from is normal. The test statistic for the large
sample test is the z statistic, and the test statistic for the small sample test is the t statistic.

6.100

The elements of the test of hypothesis that should be specified prior to analyzing the data are: the null hypothesis, the alternative hypothesis, and the rejection region based on α.

6.102

α = P(Type I error) = P(rejecting H0 when it is true). Thus, if rejection of H0 would cause your firm to go out of business, you would want this probability, α, to be small.

6.104

a.

H0: μ = 8.3
Ha: μ ≠ 8.3

The test statistic is z = (x̄ − μ0)/σx̄ = (8.2 − 8.3)/(.79/√175) = −1.67

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z < −1.96 or z > 1.96.

Since the observed value of the test statistic does not fall in the rejection region (−1.67 ≮ −1.96), H0 is not rejected. There is insufficient evidence to indicate that the mean is different from 8.3 at α = .05.
b.

H0: μ = 8.4
Ha: μ ≠ 8.4

The test statistic is z = (x̄ − μ0)/σx̄ = (8.2 − 8.4)/(.79/√175) = −3.35

The rejection region is the same as part a: z < −1.96 or z > 1.96.


Since the observed value of the test statistic falls in the rejection region (−3.35 < −1.96), H0 is rejected. There is sufficient evidence to indicate that the mean is different from 8.4 at α = .05.
c.

H0: σ = 1    or    H0: σ² = 1
Ha: σ ≠ 1            Ha: σ² ≠ 1

The test statistic is χ² = (n − 1)s²/σ0² = (175 − 1)(.79)²/1 = 108.59

The rejection region requires α/2 = .05/2 = .025 in each tail of the χ² distribution with df = n − 1 = 175 − 1 = 174. From Table VII, Appendix B, χ².025 = 129.561 and χ².975 = 74.2219. The rejection region is χ² > 129.561 or χ² < 74.2219.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 108.59 ≯ 129.561 and χ² = 108.59 ≮ 74.2219), H0 is not rejected. There is insufficient evidence to indicate the variance differs from 1 at α = .05.
d.

In part a, the rejection region is z < −1.96 or z > 1.96. In terms of x̄, the rejection region would be:

z = (x̄ − μ0)/σx̄ ⇒ 1.96 = (x̄U − 8.3)/(.79/√175) ⇒ .117 = x̄U − 8.3 ⇒ x̄U = 8.417

z = (x̄ − μ0)/σx̄ ⇒ −1.96 = (x̄L − 8.3)/(.79/√175) ⇒ −.117 = x̄L − 8.3 ⇒ x̄L = 8.183

Based on x̄, the rejection region would be: Reject H0 if x̄ < 8.183 or x̄ > 8.417.

The power of the test is the probability the test statistic falls in the rejection region, given the alternative hypothesis is true. In this case, we will let μa = 8.5.

Power = P(x̄ < 8.183 | μa = 8.5) + P(x̄ > 8.417 | μa = 8.5)
      = P(z < (8.183 − 8.5)/(.79/√175)) + P(z > (8.417 − 8.5)/(.79/√175))
      = P(z < −5.31) + P(z > −1.39) = (.5 − .5) + (.5 + .4177) = .9177

(Using Table IV, Appendix B)
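Part d can be scripted the same way: convert the z rejection region into x̄ units and then evaluate both tails under the alternative mean. A sketch of that calculation (ours, with μa = 8.5) follows.

# Sketch: power of the two-tailed test of H0: mu = 8.3 (alpha = .05) when mu = 8.5.
from math import sqrt
from scipy.stats import norm

mu0, sigma, n, alpha, mu_a = 8.3, 0.79, 175, 0.05, 8.5
se = sigma / sqrt(n)
z_crit = norm.ppf(1 - alpha / 2)                            # 1.96

xbar_lo, xbar_hi = mu0 - z_crit * se, mu0 + z_crit * se     # about 8.183 and 8.417
power = norm.cdf((xbar_lo - mu_a) / se) + norm.sf((xbar_hi - mu_a) / se)
print(round(power, 4))                                      # about .92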


6.106

a.

The p-value = .1288 = P(t ≥ 1.174). Since the p-value is not very small, there is no evidence to reject H0 for α ≤ .10. There is no evidence to indicate the mean is greater than 10.

b.

We must assume that a random sample was selected from a population that is normally
distributed.

c.

For the alternative hypothesis Ha: μ ≠ 10, the p-value is 2 times the p-value for the one-tailed test. The p-value = 2(.1288) = .2576. There is no evidence to reject H0 for α ≤ .10. There is no evidence to indicate the mean is different from 10.

6.108

a.

If we wish to test the research hypothesis that the mean GHQ score for all unemployed
men exceeds 10, we test:
H0: μ = 10
Ha: μ > 10
This is a one-tailed test. We are only interested in rejecting H0 if the mean GHQ score for
all unemployed men is greater than 10.

b.

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

c.

The test statistic is z = (x̄ − μ0)/σx̄ = (10.94 − 10.0)/(5.10/√49) = 1.29

Since the observed value of the test statistic does not fall in the rejection region (z = 1.29 ≯ 1.645), H0 is not rejected. There is insufficient evidence to indicate the mean GHQ score for all unemployed men is greater than 10 at α = .05.
d.

The p-value is P(z ≥ 1.29) = .5 − .4015 = .0985. (Using Table IV, Appendix B)

The probability of observing our test statistic or anything more unusual, given H0 is true, is .0985. Since this value is not less than α = .05, we do not reject H0. There is insufficient evidence to indicate the mean GHQ score is greater than 10.

6.110

a.

The population parameter of interest is p = proportion of all television viewers with


access to cable-TV who agree with the statement Overall, I find the quality of news on
cable networks to be better than news on the ABC, CBS, and NBC networks.

b.

p̂ = x/n = 248/500 = .496

c.

To determine if the true proportion of TV-viewers who find cable news to be better quality than network news differs from .50, we test:

H0: p = .50
Ha: p ≠ .50


d.

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.496 − .50)/√(.50(.50)/500) = −0.18

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645 or z < −1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = −0.18 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate the true proportion of TV-viewers who find cable news to be better quality than network news differs from .50 at α = .10.
e.

In order for the inference to be valid, the sampling distribution of p̂ must be approximately normal. We check this assumption:

p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .5 ± 3√(.5(.5)/500) ⇒ .5 ± .067 ⇒ (.433, .567)

Since the interval falls completely in the interval (0, 1), the normal distribution will be
adequate.
6.112

a.

First, check to see if the normal approximation is adequate:


p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .25 ± 3√((.25)(.75)/159) ⇒ .25 ± .103 ⇒ (.147, .353)

Since the interval falls completely in the interval (0, 1), the normal distribution will be adequate.

p̂ = x/n = 124/159 = .786

To determine if the percentage of truckers who suffer from sleep apnea differs from 25%,
we test:
H0: p = .25
Ha: p ≠ .25

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.786 − .25)/√((.25)(.75)/159) = 15.61

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645 or z > 1.645.


Since the observed value of the test statistic falls in the rejection region (z = 15.61 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the percentage of truckers who suffer from sleep apnea differs from 25% at α = .10.
b.

The observed significance level is the p-value and is:


p-value = P(z ≤ −15.61) + P(z ≥ 15.61) ≈ (.5 − .5) + (.5 − .5) = 0

Since the p-value is so small, we would reject H0 for any reasonable value of α. There is sufficient evidence to indicate that the percentage of truckers who suffer from sleep apnea differs from 25%.

c.

The inference from a confidence interval and a test of hypothesis must agree because the same numbers are used in both if the same level of significance is used.

6.114

a.

Let p = proportion of shoppers using cents-off coupons. To determine if the proportion of shoppers using cents-off coupons exceeds .65, we test:

H0: p = .65
Ha: p > .65

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.77 − .65)/√(.65(.35)/1,000) = 7.96

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 7.96 > 1.645), H0 is rejected. There is sufficient evidence to indicate the proportion of shoppers using cents-off coupons exceeds .65 at α = .05.
b.

The sample size is large enough if the interval does not include 0 or 1.

p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .65 ± 3√(.65(.35)/1,000) ⇒ .65 ± .045 ⇒ (.605, .695)

Since the interval falls completely in the interval (0, 1), the normal distribution will be adequate.

c.


The p-value is p = P(z ≥ 7.96) = (.5 − .5) ≈ 0. (Using Table IV, Appendix B.) Since the p-value is smaller than α = .05, H0 is rejected. There is sufficient evidence to indicate the proportion of shoppers using cents-off coupons exceeds .65 at α = .05.


6.116

Using MINITAB, the descriptive statistics are:


Descriptive Statistics: Tunnel

Variable    N   Mean  Median  TrMean  StDev  SE Mean
Tunnel     10  989.8   970.5   987.9  160.7     50.8

Variable  Minimum  Maximum     Q1      Q3
Tunnel      735.0   1260.0  862.5  1096.8

To determine whether peak hour pricing succeeded in reducing the average number of vehicles
attempting to use the Lincoln Tunnel during the peak rush hour, we test:
H0: μ = 1,220
Ha: μ < 1,220

The test statistic is t = (x̄ − μ0)/(s/√n) = (989.8 − 1,220)/(160.7/√10) = −4.53

Since no α is given, we will use α = .05. The rejection region requires α = .05 in the lower tail of the t-distribution with df = n − 1 = 10 − 1 = 9. From Table VI, Appendix B, t.05 = 1.833. The rejection region is t < −1.833.

Since the observed value of the test statistic falls in the rejection region (t = −4.53 < −1.833), H0 is rejected. There is sufficient evidence to indicate that peak hour pricing succeeded in reducing the average number of vehicles attempting to use the Lincoln Tunnel during the peak rush hour at α = .05.
6.118

a.

To determine if the true mean number of pecks at the blue string is less than 7.5, we test:
H0: μ = 7.5
Ha: μ < 7.5

The test statistic is z = (x̄ − μ0)/σx̄ = (1.13 − 7.5)/(2.21/√72) = −24.46

The rejection region requires α = .01 in the lower tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z < −2.33.

Since the observed value of the test statistic falls in the rejection region (z = −24.46 < −2.33), H0 is rejected. There is sufficient evidence to indicate the true mean number of pecks at the blue string is less than 7.5 at α = .01.

b.

From Exercise 5.96, the 99% confidence interval is (.46, 1.80). Since the hypothesized
value of the mean ( = 7.5) does not fall in the confidence interval, it is not a likely
candidate for the true value of the mean. Thus, you would reject it. This agrees with the
conclusion in part a.


6.120

a.

p̂ = 24/40 = .6

To determine if the proportion of shoplifters turned over to police is greater than .5, we test:

H0: p = .5
Ha: p > .5

The test statistic is z = (p̂ − p0)/√(p0q0/n) = (.6 − .5)/√(.5(.5)/40) = 1.26

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.26 ≯ 1.645), H0 is not rejected. There is insufficient evidence to indicate the proportion of shoplifters turned over to police is greater than .5 at α = .05.
b.

To determine if the normal approximation is appropriate, we check:


p0 ± 3σp̂ ⇒ p0 ± 3√(p0q0/n) ⇒ .5 ± 3√((.5)(.5)/40) ⇒ .5 ± .237 ⇒ (.263, .737)

Since the interval falls completely in the interval (0, 1), the normal distribution will be
adequate.
c.

The observed significance level of the test is p-value = P(z ≥ 1.26) = .5 − .3962 = .1038. The probability of observing the value of our test statistic or anything more unusual if the true value of p is .5 is .1038. Since this p-value is so large, there is no evidence to reject H0. There is no evidence to indicate the true proportion of shoplifters turned over to police is greater than .5.

d.

Any value of α that is greater than the p-value would lead one to reject H0. Thus, for this problem, we would reject H0 for any value of α > .1038.

6.122

a.

To determine whether the mean profit change for restaurants with frequency programs is
greater than $1047.34, we test:
H0: μ = 1047.34
Ha: μ > 1047.34

b.

Some preliminary calculations are:


x̄ = Σx/n = 30,113.17/12 = 2,509.43

s² = [Σx² − (Σx)²/n] / (n − 1) = [126,379,568.8 − (30,113.17)²/12] / (12 − 1) = 4,619,331.955

s = √4,619,331.955 = 2,149.2631

The test statistic is t = (x̄ − μ0)/(s/√n) = (2,509.43 − 1,047.34)/(2,149.2631/√12) = 2.36

The rejection region requires α = .05 in the upper tail of the t-distribution with df = n − 1 = 12 − 1 = 11. From Table VI, Appendix B, t.05 = 1.796. The rejection region is t > 1.796.

Since the observed value of the test statistic falls in the rejection region (t = 2.36 > 1.796), H0 is rejected. There is sufficient evidence to indicate the mean profit change for restaurants with frequency programs is greater than $1047.34 for α = .05.
It appears that the frequency program would be profitable for the company if adopted
nationwide.
6.124

a.

A Type II error would be concluding the mean amount of PCB in the air is less than or
equal to 3 parts per million when, in fact, it is more than 3 parts per million.

b.

From Exercise 6.123, z = (x̄0 − μ0)/(σ/√n) ⇒ x̄0 = zα(σ/√n) + μ0 = 2.33(.5/√50) + 3 ⇒ x̄0 = 3.165

For μ = 3.1, β = P(x̄ ≤ 3.165 | μ = 3.1) = P(z ≤ (3.165 − 3.1)/(.5/√50)) = P(z ≤ .92) = .5 + .3212 = .8212
(from Table IV, Appendix B)


c.

Power = 1 − β = 1 − .8212 = .1788

d.

For μ = 3.2, β = P(x̄ ≤ 3.165 | μ = 3.2) = P(z ≤ (3.165 − 3.2)/(.5/√50)) = P(z ≤ −.49) = .5 − .1879 = .3121

Power = 1 − β = 1 − .3121 = .6879


As the plant's mean PCB departs further from 3, the power increases.


6.126

a.

Some preliminary calculations:


x̄ = Σx/n = 79.93/5 = 15.986

s² = [Σx² − (Σx)²/n] / (n − 1) = [1,277.7627 − (79.93)²/5] / (5 − 1) = .00043

s = √.00043 = .0207

To determine if the mean measurement differs from 16.01, we test:

H0: μ = 16.01
Ha: μ ≠ 16.01

The test statistic is t = (x̄ − μ0)/(s/√n) = (15.986 − 16.01)/(.0207/√5) = −2.59

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = n − 1 = 5 − 1 = 4. From Table VI, Appendix B, t.025 = 2.776. The rejection region is t < −2.776 or t > 2.776.

Since the observed value of the test statistic does not fall in the rejection region (t = −2.59 ≮ −2.776), H0 is not rejected. There is insufficient evidence to indicate the true mean measurement differs from 16.01 at α = .05.
b.

We must assume that the sample of measurements was randomly selected from a
population of measurements that is normally distributed.

c.

To determine if the standard deviation of the weight measurements is greater than .01, we
test:
H0: σ² = .01²
Ha: σ² > .01²

The test statistic is χ² = (n − 1)s²/σ0² = (5 − 1)(.0207)²/(.01)² = 16.0684

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = n − 1 = 5 − 1 = 4. From Table VII, Appendix B, χ².05 = 9.48773. The rejection region is χ² > 9.48773.

Since the observed value of the test statistic falls in the rejection region (χ² = 16.0684 > 9.48773), H0 is rejected. There is sufficient evidence to indicate the standard deviation of the weight measurements is greater than .01 at α = .05.


6.128

a.

Let pi = proportion of first round games won by the ith seed. To determine if the higher
seed has a better than 50-50 chance of winning a first-round game, we test:
H0: pi = .5
Ha: pi > .5   for i = 1, 2, 3, …, 8

The test statistic is zi = (p̂i − p0)/√(p0q0/n)

No value of α was given. We will use α = .05. The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

p̂i = xi/n. Thus, p̂1 = x1/n = 52/52 = 1.00, p̂2 = x2/n = 49/52 = .942, p̂3 = x3/n = 41/52 = .788, p̂4 = x4/n = 42/52 = .808, p̂5 = x5/n = 37/52 = .712, p̂6 = x6/n = 36/52 = .692, p̂7 = x7/n = 35/52 = .673, p̂8 = x8/n = 22/52 = .423

The corresponding test statistics are:

z1 = (1.00 − .5)/√(.5(.5)/52) = 7.21,   z2 = (.942 − .5)/√(.5(.5)/52) = 6.37,
z3 = (.788 − .5)/√(.5(.5)/52) = 4.15,   z4 = (.808 − .5)/√(.5(.5)/52) = 4.44,
z5 = (.712 − .5)/√(.5(.5)/52) = 3.06,   z6 = (.692 − .5)/√(.5(.5)/52) = 2.77,
z7 = (.673 − .5)/√(.5(.5)/52) = 2.50,   z8 = (.423 − .5)/√(.5(.5)/52) = −1.11

For games matching 1 and 16, since the observed value of the test statistic falls in the
rejection region (z1 = 7.21 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #1 seed has a better than 50-50 chance of winning a first-round game at α = .05.


For games matching 2 and 15, since the observed value of the test statistic falls in the
rejection region (z2 = 6.37 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #2 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 3 and 14, since the observed value of the test statistic falls in the
rejection region (z3 = 4.15 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #3 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 4 and 13, since the observed value of the test statistic falls in the
rejection region (z4 = 4.44 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #4 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 5 and 12, since the observed value of the test statistic falls in the
rejection region (z5 = 3.06 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #5 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 6 and 11, since the observed value of the test statistic falls in the
rejection region (z6 = 2.77 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #6 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 7 and 10, since the observed value of the test statistic falls in the
rejection region (z7 = 2.50 > 1.645), H0 is rejected. There is sufficient evidence to
indicate the #7 seed has a better than 50-50 chance of winning a first-round game at
= .05.
For games matching 8 and 9, since the observed value of the test statistic does not fall in
the rejection region (z8 = 1.11 >/ 1.645), H0 is not rejected. There is insufficient
evidence to indicate the #8 seed has a better than 50-50 chance of winning a first-round
game at = .05.
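The eight z statistics can be reproduced with a short loop. This is an illustrative sketch only (scipy assumed available); the win counts are those used in part a.

```python
from math import sqrt
from scipy.stats import norm

n, p0 = 52, 0.5
wins = {1: 52, 2: 49, 3: 41, 4: 42, 5: 37, 6: 36, 7: 35, 8: 22}  # first-round wins by seed

se = sqrt(p0 * (1 - p0) / n)     # standard error under H0
z_crit = norm.ppf(0.95)          # upper-tail critical value, about 1.645

for seed, x in wins.items():
    z = (x / n - p0) / se
    verdict = "reject H0" if z > z_crit else "do not reject H0"
    print(f"seed {seed}: p-hat = {x / n:.3f}, z = {z:5.2f}  ->  {verdict}")
```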
b. Let μi = mean margin of victory. To determine if the mean margin of victory is greater than 10 points, we test:

H0: μi = 10
Ha: μi > 10   for i = 1, 2, 3, and 4

The test statistic is zi = (x̄i − μ0)/(si/√n)

No value of α was given. We will use α = .05. The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

The test statistics are:

z1 = (22.9 − 10)/(12.4/√52) = 7.50     z2 = (17.2 − 10)/(11.4/√52) = 4.55
z3 = (10.6 − 10)/(12.0/√52) = 0.36     z4 = (10.0 − 10)/(12.5/√52) = 0

For games matching 1 and 16 and games matching 2 and 15, the observed test statistics fall in the rejection region (z1 = 7.50 > 1.645 and z2 = 4.55 > 1.645), so H0 is rejected. There is sufficient evidence to indicate the #1 and #2 seeds win by more than 10 points in first-round games at α = .05.

For games matching 3 and 14 and games matching 4 and 13, the observed test statistics do not fall in the rejection region (z3 = 0.36 and z4 = 0 do not exceed 1.645), so H0 is not rejected. There is insufficient evidence to indicate that the #3 or #4 seed wins by more than 10 points in first-round games at α = .05.

c. Let μi = mean margin of victory. To determine if the mean margin of victory is less than 5 points, we test:

H0: μi = 5
Ha: μi < 5   for i = 5, 6, 7, and 8

The test statistic is zi = (x̄i − μ0)/(si/√n)

No value of α was given. We will use α = .05. The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

The test statistics are:

z5 = (5.3 − 5)/(10.4/√52) = 0.21      z6 = (4.3 − 5)/(10.7/√52) = −0.47
z7 = (3.2 − 5)/(10.5/√52) = −1.24     z8 = (−2.1 − 5)/(11.0/√52) = −4.65

For games matching 5 and 12, 6 and 11, and 7 and 10, the observed test statistics do not fall in the rejection region (z5 = 0.21, z6 = −0.47, and z7 = −1.24 are all greater than −1.645), so H0 is not rejected. There is insufficient evidence to indicate that the #5, #6, or #7 seed wins by less than 5 points in first-round games at α = .05.

For games matching 8 and 9, the observed value of the test statistic falls in the rejection region (z8 = −4.65 < −1.645), so H0 is rejected. There is sufficient evidence to indicate the #8 seed wins by less than 5 points in first-round games at α = .05.
d. To determine if the standard deviation of victory margin differs from 11, we test:

H0: σi² = 11² = 121
Ha: σi² ≠ 121

The test statistic is χi² = (n − 1)si²/σ0²

No α level was given, so we will use α = .05. The rejection region requires α/2 = .05/2 = .025 in each tail of the χ² distribution with df = n − 1 = 52 − 1 = 51. From Table VII, Appendix B, χ².025 = 71.4202 and χ².975 = 32.3574. The rejection region is χ² < 32.3574 or χ² > 71.4202.

The test statistics are:

χ1² = (52 − 1)(12.4)²/121 = 64.808     χ2² = (52 − 1)(11.4)²/121 = 54.777
χ3² = (52 − 1)(12.0)²/121 = 60.694     χ4² = (52 − 1)(12.5)²/121 = 65.857
χ5² = (52 − 1)(10.4)²/121 = 45.588     χ6² = (52 − 1)(10.7)²/121 = 48.256
χ7² = (52 − 1)(10.5)²/121 = 46.469     χ8² = (52 − 1)(11.0)²/121 = 51.000

For every matchup, from seeds 1 and 16 through seeds 8 and 9, the observed value of the test statistic falls between 32.3574 and 71.4202 and so does not fall in the rejection region. H0 is not rejected in each case. There is insufficient evidence to indicate that the standard deviation of victory margin differs from 11 for any of the eight pairings at α = .05.


e. Let μ = mean difference between game outcome and point spread. To determine if the point spread is a good predictor of the victory margin, we test:

H0: μ = 0
Ha: μ ≠ 0

The test statistic is z = (x̄ − 0)/(s/√n) = (.7 − 0)/(11.3/√360) = 1.18

Since no α was given, we will use α = .05. The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z > 1.96 or z < −1.96.

Since the observed value of the test statistic does not fall in the rejection region (−1.96 < z = 1.18 < 1.96), H0 is not rejected. There is insufficient evidence to indicate a difference between the game outcome and the point spread at α = .05. There is no evidence to indicate the point spread is not a good predictor of the victory margin.
6.130  Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Candy

Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
Candy      5   0  24.00     1.67   3.74    21.00  21.00   23.00  27.50    30.00

To give the benefit of the doubt to the students, we will use a small value of α. (We do not want to reject H0 when it is true and thereby favor the students.) Thus, we will use α = .001.

We must also assume that the sample comes from a normal distribution. To determine if the mean number of candies exceeds 15, we test:

H0: μ = 15
Ha: μ > 15

The test statistic is z = (x̄ − μo)/(s/√n) = (22 − 15)/(3/√5) = 5.22

The rejection region requires α = .001 in the upper tail of the z-distribution. From Table IV, Appendix B, z.001 = 3.08. The rejection region is z > 3.08.

Since the observed value of the test statistic falls in the rejection region (z = 5.22 > 3.08), H0 is rejected. There is sufficient evidence to indicate the mean number of candies exceeds 15 at α = .001.


Inferences Based on Two Samples:
Confidence Intervals and
Tests of Hypothesis

Chapter 7

7.2

a. μ(x̄1) = μ1 = 12;  σ(x̄1) = σ1/√n1 = 4/√64 = .5

b. μ(x̄2) = μ2 = 10;  σ(x̄2) = σ2/√n2 = 3/√64 = .375

c. μ(x̄1 − x̄2) = μ1 − μ2 = 12 − 10 = 2

d. σ(x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2) = √(4²/64 + 3²/64) = √(25/64) = .625

7.4  Since n1 ≥ 30 and n2 ≥ 30, the sampling distribution of x̄1 − x̄2 is approximately normal by the Central Limit Theorem.

Assumptions about the two populations:

1. Both sampled populations have relative frequency distributions that are approximately normal.
2. The population variances are equal.

Assumptions about the two samples:

The samples are randomly and independently selected from the populations.
7.6

a. sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(25 − 1)120 + (25 − 1)100]/(25 + 25 − 2) = 5280/48 = 110

b. sp² = [(20 − 1)12 + (10 − 1)20]/(20 + 10 − 2) = 408/28 = 14.5714

c. sp² = [(6 − 1).15 + (10 − 1).2]/(6 + 10 − 2) = 2.55/14 = .1821

d. sp² = [(16 − 1)3000 + (17 − 1)2500]/(16 + 17 − 2) = 85,000/31 = 2741.9355

e. sp² falls near the variance with the larger sample size.


7.8

a. σ(x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2) = √(9/100 + 16/100) = √.25 = .5

b. The sampling distribution of x̄1 − x̄2 is approximately normal by the Central Limit Theorem since n1 ≥ 30 and n2 ≥ 30, with mean μ(x̄1 − x̄2) = μ1 − μ2 = 10.

c. x̄1 − x̄2 = 15.5 − 26.6 = −11.1

Yes, it appears that x̄1 − x̄2 = −11.1 contradicts the null hypothesis H0: μ1 − μ2 = 10.

d. The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z < −1.96 or z > 1.96.

e. H0: μ1 − μ2 = 10
   Ha: μ1 − μ2 ≠ 10

The test statistic is z = [(x̄1 − x̄2) − 10]/√(σ1²/n1 + σ2²/n2) = [(15.5 − 26.6) − 10]/.5 = −42.2

The rejection region is z < −1.96 or z > 1.96. (Refer to part d.)

Since the observed value of the test statistic falls in the rejection region (z = −42.2 < −1.96), H0 is rejected. There is sufficient evidence to indicate the difference in the population means is not equal to 10 at α = .05.

f. The form of the confidence interval is:

(x̄1 − x̄2) ± zα/2 √(σ1²/n1 + σ2²/n2)

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The confidence interval is:

(15.5 − 26.6) ± 1.96√(9/100 + 16/100) ⇒ −11.1 ± .98 ⇒ (−12.08, −10.12)

We are 95% confident that the difference in the two means is between −12.08 and −10.12.

g. The confidence interval gives more information.
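A quick numerical check of the test statistic in part e and the 95% confidence interval in part f (an illustrative sketch only; scipy assumed available):

```python
from math import sqrt
from scipy.stats import norm

x1, x2 = 15.5, 26.6
var1, var2 = 9, 16          # population variances sigma1^2 and sigma2^2
n1 = n2 = 100

se = sqrt(var1 / n1 + var2 / n2)                  # 0.5

# Part e: two-tailed z-test of H0: mu1 - mu2 = 10
z = ((x1 - x2) - 10) / se                         # about -42.2
print(f"z = {z:.1f}")

# Part f: 95% confidence interval for mu1 - mu2
half_width = norm.ppf(0.975) * se                 # about 1.96 * 0.5
print(f"95% CI: ({x1 - x2 - half_width:.2f}, {x1 - x2 + half_width:.2f})")
```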


7.10  Some preliminary calculations:

x̄1 = Σx1/n1 = 654/15 = 43.6

s1² = [Σx1² − (Σx1)²/n1]/(n1 − 1) = [28,934 − 654²/15]/(15 − 1) = 419.6/14 = 29.9714

x̄2 = Σx2/n2 = 858/16 = 53.625

s2² = [Σx2² − (Σx2)²/n2]/(n2 − 1) = [46,450 − 858²/16]/(16 − 1) = 439.75/15 = 29.3167

sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(15 − 1)29.9714 + (16 − 1)29.3167]/(15 + 16 − 2) = 859.3501/29 = 29.6328

a. H0: μ2 − μ1 = 10
   Ha: μ2 − μ1 > 10

The test statistic is t = [(x̄2 − x̄1) − D0]/√[sp²(1/n1 + 1/n2)] = [(53.625 − 43.6) − 10]/√[29.6328(1/15 + 1/16)] = .025/1.9564 = .013

The rejection region requires α = .01 in the upper tail of the t-distribution with df = n1 + n2 − 2 = 15 + 16 − 2 = 29. From Table VI, Appendix B, t.01 = 2.462. The rejection region is t > 2.462.

Since the test statistic does not fall in the rejection region (t = .013 does not exceed 2.462), H0 is not rejected. There is insufficient evidence to conclude μ2 − μ1 > 10 at α = .01.

b. For confidence coefficient .98, α = .02 and α/2 = .01. From Table VI, Appendix B, with df = n1 + n2 − 2 = 15 + 16 − 2 = 29, t.01 = 2.462. The 98% confidence interval for (μ2 − μ1) is:

(x̄2 − x̄1) ± tα/2 √[sp²(1/n1 + 1/n2)] ⇒ (53.625 − 43.6) ± 2.462√[29.6328(1/15 + 1/16)]
⇒ 10.025 ± 4.817 ⇒ (5.208, 14.842)

We are 98% confident that the difference between the mean of population 2 and the mean of population 1 is between 5.208 and 14.842.
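The pooled-variance t procedures in parts a and b can be reproduced from the summary statistics. A minimal sketch (assumes scipy; small differences from the hand calculations are possible because of rounding):

```python
from math import sqrt
from scipy.stats import t

n1, xbar1, s1_sq = 15, 43.6, 29.9714
n2, xbar2, s2_sq = 16, 53.625, 29.3167
df = n1 + n2 - 2

sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df     # pooled variance, about 29.63
se = sqrt(sp_sq * (1 / n1 + 1 / n2))

# Part a: upper-tailed test of H0: mu2 - mu1 = 10
t_stat = ((xbar2 - xbar1) - 10) / se                   # about 0.013
t_crit = t.ppf(1 - 0.01, df)                           # about 2.462
print(f"t = {t_stat:.3f}, reject H0 if t > {t_crit:.3f}")

# Part b: 98% confidence interval for mu2 - mu1
half_width = t.ppf(1 - 0.01, df) * se
print(f"98% CI: ({(xbar2 - xbar1) - half_width:.3f}, {(xbar2 - xbar1) + half_width:.3f})")
```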


7.12

a. Let μ1 = mean carat size of diamonds certified by GIA and μ2 = mean carat size of diamonds certified by HRD. For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The 95% confidence interval is:

(x̄1 − x̄2) ± zα/2 √(s1²/n1 + s2²/n2) ⇒ (.6723 − .8129) ± 1.96√(.2456²/151 + .1831²/79)
⇒ −.1406 ± .0563 ⇒ (−.1969, −.0843)

b. We are 95% confident that the difference in mean carat size between diamonds certified by GIA and those certified by HRD is between −.1969 and −.0843.

c. Let μ3 = mean carat size of diamonds certified by IGI.

(x̄1 − x̄3) ± zα/2 √(s1²/n1 + s3²/n3) ⇒ (.6723 − .3665) ± 1.96√(.2456²/151 + .2163²/78)
⇒ .3058 ± .0620 ⇒ (.2438, .3678)

d. We are 95% confident that the difference in mean carat size between diamonds certified by GIA and those certified by IGI is between .2438 and .3678.

e. (x̄2 − x̄3) ± zα/2 √(s2²/n2 + s3²/n3) ⇒ (.8129 − .3665) ± 1.96√(.1831²/79 + .2163²/78)
⇒ .4464 ± .0627 ⇒ (.3837, .5091)

f. We are 95% confident that the difference in mean carat size between diamonds certified by HRD and those certified by IGI is between .3837 and .5091.

7.14

a. Let μ1 = mean score for males and μ2 = mean score for females. For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The 90% confidence interval is:

(x̄1 − x̄2) ± zα/2 √(s1²/n1 + s2²/n2) ⇒ (39.08 − 38.79) ± 1.645√(6.73²/127 + 6.94²/114)
⇒ 0.29 ± 1.452 ⇒ (−1.162, 1.742)

We are 90% confident that the difference in mean service-rating scores between males and females is between −1.162 and 1.742.

b. Because 0 falls in the 90% confidence interval, there is no evidence of a difference in the mean service-rating scores between males and females.


7.16

a. The descriptive statistics are:

Descriptive Statistics: US, Japan

Variable    N    Mean  Median  TrMean  StDev  SE Mean
US          5   6.562   6.870   6.562  1.217    0.544
Japan       5   3.118   3.220   3.118  1.227    0.549

Variable  Minimum  Maximum     Q1     Q3
US          4.770    8.000  5.415  7.555
Japan       1.920    4.910  1.970  4.215

sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(5 − 1)1.217² + (5 − 1)1.227²]/(5 + 5 − 2) = 1.4933

To determine if the mean annual percentage turnover for U.S. plants exceeds that for Japanese plants, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0

The test statistic is t = [(x̄1 − x̄2) − D0]/√[sp²(1/n1 + 1/n2)] = [(6.562 − 3.118) − 0]/√[1.4933(1/5 + 1/5)] = 4.456

The rejection region requires α = .05 in the upper tail of the t-distribution with df = n1 + n2 − 2 = 5 + 5 − 2 = 8. From Table VI, Appendix B, t.05 = 1.860. The rejection region is t > 1.860.

Since the observed value of the test statistic falls in the rejection region (t = 4.46 > 1.860), H0 is rejected. There is sufficient evidence to indicate the mean annual percentage turnover for U.S. plants exceeds that for Japanese plants at α = .05.

b. The p-value = P(t ≥ 4.456). Using Table VI, Appendix B, with df = n1 + n2 − 2 = 5 + 5 − 2 = 8, .001 < P(t ≥ 4.456) < .005. Since the p-value is so small, there is evidence to reject H0 for any α ≥ .005.

c. The necessary assumptions are:

1. Both sampled populations are approximately normal.
2. The population variances are equal.
3. The samples are randomly and independently selected.

There is no indication that the populations are not normal. Both sample variances are similar, so there is no evidence the population variances are unequal. There is no indication the assumptions are not valid.


7.18  Let μ1 = the mean relational intimacy score for participants in the CMC group and μ2 = the mean relational intimacy score for participants in the FTF group.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: CMC, FTF

Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
CMC       24   0  3.500    0.159  0.780    2.000  3.000   3.500  4.000    5.000
FTF       24   0  3.542    0.134  0.658    2.000  3.000   4.000  4.000    5.000

Some preliminary calculations are:

sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(24 − 1).780² + (24 − 1).658²]/(24 + 24 − 2) = 0.5207

To determine if the mean relational intimacy score for participants in the CMC group is lower than the mean relational intimacy score for participants in the FTF group, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 < 0

The test statistic is t = [(x̄1 − x̄2) − D0]/√[sp²(1/n1 + 1/n2)] = [(3.500 − 3.542) − 0]/√[.5207(1/24 + 1/24)] = −0.042/.20831 = −.20

The rejection region requires α = .10 in the lower tail of the t-distribution with df = n1 + n2 − 2 = 24 + 24 − 2 = 46. From Table VI, Appendix B, t.10 ≈ 1.303. The rejection region is t < −1.303.

Since the observed value of the test statistic does not fall in the rejection region (t = −.20 is not less than −1.303), H0 is not rejected. There is insufficient evidence to indicate that the mean relational intimacy score for participants in the CMC group is lower than the mean relational intimacy score for participants in the FTF group at α = .10.
7.20

a. The first population is the set of responses for all business students who have access to lecture notes and the second population is the set of responses for all business students not having access to lecture notes.

b. To determine if there is a difference in the mean response of the two groups, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

The test statistic is z = [(x̄1 − x̄2) − 0]/√(s1²/n1 + s2²/n2) = (8.48 − 7.80)/√(.94/86 + 2.99/35) = 2.19

The rejection region requires α/2 = .01/2 = .005 in each tail of the z-distribution. From Table IV, Appendix B, z.005 = 2.58. The rejection region is z < −2.58 or z > 2.58.

Since the observed value of the test statistic does not fall in the rejection region (z = 2.19 does not exceed 2.58), H0 is not rejected. There is insufficient evidence to indicate a difference in the mean response of the two groups at α = .01.

c. For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58. The confidence interval is:

(x̄1 − x̄2) ± z.005 √(s1²/n1 + s2²/n2) ⇒ (8.48 − 7.80) ± 2.58√(.94/86 + 2.99/35) ⇒ .68 ± .801 ⇒ (−.121, 1.481)

We are 99% confident that the difference in the mean response between the two groups is between −.121 and 1.481.

d. A 95% confidence interval would be narrower than the 99% confidence interval. The z value used in the 95% confidence interval is z.025 = 1.96, compared with the z value of z.005 = 2.58 used in the 99% confidence interval.

7.22

a.

The bacteria counts are probably normally distributed because each count is the median
of five measurements from the same specimen.

b. Let μ1 = mean of the bacteria count for the discharge and μ2 = mean of the bacteria count upstream. Since we want to test if the mean of the bacteria count for the discharge exceeds the mean of the count upstream, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0

c. Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Plant, Upstream

Variable    N    Mean  Median  TrMean  StDev  SE Mean
Plant       6   32.10   31.75   32.10   3.19     1.30
Upstream    6  29.617  30.000  29.617  2.355    0.961

Variable  Minimum  Maximum      Q1      Q3
Plant       28.20    36.20   29.40   35.23
Upstream   26.400   32.300  27.075  31.850

sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(6 − 1)3.19² + (6 − 1)2.355²]/(6 + 6 − 2) = 7.861

The test statistic is t = [(x̄1 − x̄2) − 0]/√[sp²(1/n1 + 1/n2)] = (32.10 − 29.617)/√[7.861(1/6 + 1/6)] = 1.53

No α level was given, so we will use α = .05. The rejection region requires α = .05 in the upper tail of the t-distribution with df = n1 + n2 − 2 = 6 + 6 − 2 = 10. From Table VI, Appendix B, t.05 = 1.812. The rejection region is t > 1.812.

Since the observed value of the test statistic does not fall in the rejection region (t = 1.53 does not exceed 1.812), H0 is not rejected. There is insufficient evidence to indicate the mean bacteria count for the discharge exceeds the mean of the count upstream at α = .05.

d. We must assume:

1. The counts per specimen for each location are normally distributed.
2. The variances of the two distributions are equal.
3. Independent and random samples were selected from each population.

7.24

a.

We cannot make inferences about the difference between the mean salaries of male
and female accounting/finance/banking professionals because no standard
deviations are provided.

b. To determine if the mean salary for males is significantly greater than that for females, we test:

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 > 0

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645.

To make things easier, we will assume that the standard deviations for the two groups are the same, σ1 = σ2 = σ.

The test statistic is

z = [(x̄1 − x̄2) − D0]/√(σ1²/n1 + σ2²/n2) = (69,484 − 52,012 − 0)/[σ√(1/1400 + 1/1400)] = 17,836/(.037796σ) = 471,896.2038/σ

In order to reject H0, this test statistic must fall in the rejection region, that is, exceed 1.645. Solving for σ:

z = 471,896.2038/σ > 1.645  ⇒  σ < 471,896.2038/1.645 = 286,866.99

Thus, to reject H0 the average of the two standard deviations has to be less than $286,866.99.

c. Yes. In fact, reasonable values for the standard deviation will be around $5,000, which is much smaller than the required $286,866.99.

d.

These data were collected from voluntary subjects who responded to a Web-based survey.
Thus, this is not a random sample, but a self-selected sample. Generally, subjects who
respond to surveys tend to have very strong opinions, which may not be the same as the
population in general. Thus, the results from this self-selected sample may not reflect the
results from the population in general.
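The back-solving in part b is easy to sanity-check numerically. A brief illustrative sketch (the $5,000 value is the rough figure mentioned in part c; everything else uses the numbers quoted in the solution):

```python
from math import sqrt

diff = 17836          # difference in sample mean salaries used in the solution, in dollars
n1 = n2 = 1400
z_crit = 1.645        # upper-tail critical value for alpha = .05

# Largest common sigma for which z = diff / (sigma * sqrt(1/n1 + 1/n2)) still exceeds 1.645
sigma_max = diff / (z_crit * sqrt(1 / n1 + 1 / n2))
print(f"H0 is rejected whenever the common sigma is below ${sigma_max:,.2f}")

# With a plausible salary standard deviation of about $5,000 the test is overwhelmingly significant
sigma = 5000
z = diff / (sigma * sqrt(1 / n1 + 1 / n2))
print(f"with sigma = ${sigma:,}: z = {z:.1f}")
```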

7.26

a.

Pair        1   2   3   4   5   6
Difference  3   2   2   4   0   1

d̄ = Σdi/nd = 12/6 = 2

sd² = [Σdi² − (Σdi)²/nd]/(nd − 1) = [34 − 12²/6]/(6 − 1) = 10/5 = 2

b. μd = μ1 − μ2

c. For confidence coefficient .95, α = .05 and α/2 = .025. From Table VI, Appendix B, with df = nd − 1 = 6 − 1 = 5, t.025 = 2.571. The confidence interval is:

d̄ ± tα/2 (sd/√nd) ⇒ 2 ± 2.571(√2/√6) ⇒ 2 ± 1.484 ⇒ (.516, 3.484)

d. H0: μd = 0
   Ha: μd ≠ 0

The test statistic is t = d̄/(sd/√nd) = 2/(√2/√6) = 3.46

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = nd − 1 = 6 − 1 = 5. From Table VI, Appendix B, t.025 = 2.571. The rejection region is t < −2.571 or t > 2.571.

Since the observed value of the test statistic falls in the rejection region (t = 3.46 > 2.571), H0 is rejected. There is sufficient evidence to indicate that the mean difference is different from 0 at α = .05.
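The confidence interval and test in parts c and d follow directly from the six differences. A short illustrative sketch (scipy assumed available):

```python
import numpy as np
from scipy import stats

d = np.array([3, 2, 2, 4, 0, 1])          # pair differences from part a
n = len(d)
d_bar, s_d = d.mean(), d.std(ddof=1)       # 2.0 and sqrt(2)

# Part c: 95% confidence interval for mu_d
t_crit = stats.t.ppf(0.975, df=n - 1)      # 2.571
half_width = t_crit * s_d / np.sqrt(n)
print(f"95% CI: ({d_bar - half_width:.3f}, {d_bar + half_width:.3f})")

# Part d: two-tailed test of H0: mu_d = 0 (a one-sample t-test on the differences)
t_stat, p_value = stats.ttest_1samp(d, popmean=0)
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
```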
7.28

a. H0: μ1 − μ2 = 0
   Ha: μ1 − μ2 < 0

The rejection region requires α = .10 in the lower tail of the z-distribution. From Table IV, Appendix B, z.10 = 1.28. The rejection region is z < −1.28.

b. H0: μ1 − μ2 = 0
   Ha: μ1 − μ2 < 0

The test statistic is z = (d̄ − 0)/√(sd²/nd) = (−3.5 − 0)/√(21/38) = −4.71

The rejection region is z < −1.28 (refer to part a).

Since the observed value of the test statistic falls in the rejection region (z = −4.71 < −1.28), H0 is rejected. There is sufficient evidence to indicate μ1 − μ2 < 0 at α = .10.

c. Since the sample size of the number of pairs is greater than 30, we do not need to assume that the population of differences is normal. The sampling distribution of d̄ is approximately normal by the Central Limit Theorem. We must assume that the differences are randomly selected.

d. For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The 90% confidence interval is:

d̄ ± z.05 √(sd²/nd) ⇒ −3.5 ± 1.645√(21/38) ⇒ −3.5 ± 1.223 ⇒ (−4.723, −2.277)

e. The confidence interval provides more information since it gives an interval of possible values for the difference between the population means.


7.30

a. Let μ1 = the mean salary of technology professionals in 2003 and μ2 = the mean salary of technology professionals in 2005. Let μd = μ1 − μ2.

To determine if the mean salary of technology professionals at all U.S. metropolitan areas has increased between 2003 and 2005, we test:

H0: μ1 − μ2 = 0    OR    H0: μd = 0
Ha: μ1 − μ2 < 0          Ha: μd < 0

b.

Metro Area          2003 Salary     2005 Salary     Difference
                    ($ thousands)   ($ thousands)   (2003 − 2005)
Silicon Valley           87.7            85.9             1.8
New York                 78.6            80.3            −1.7
Washington, D.C.         71.4            77.4            −6.0
Los Angeles              70.8            77.1            −6.3
Denver                   73.0            77.1            −4.1
Boston                   76.3            80.1            −3.8
Atlanta                  73.6            73.2             0.4
Chicago                  71.1            73.0            −1.9
Philadelphia             69.5            69.8            −0.3
San Diego                69.0            77.1            −8.1
Seattle                  71.0            66.9             4.1
Dallas-Ft. Worth         73.0            71.0             2.0
Detroit                  62.3            64.1            −1.8

c. d̄ = Σdi/nd = −25.7/13 = −1.977

sd² = [Σdi² − (Σdi)²/nd]/(nd − 1) = [206.59 − (−25.7)²/13]/(13 − 1) = 12.9819

sd = √12.9819 = 3.603

d. The test statistic is t = (d̄ − 0)/(sd/√nd) = (−1.977 − 0)/(3.603/√13) = −1.978

e. The rejection region requires α = .10 in the lower tail of the t-distribution with df = nd − 1 = 13 − 1 = 12. From Table VI, Appendix B, t.10 = 1.356. The rejection region is t < −1.356.

f. Since the observed value of the test statistic falls in the rejection region (t = −1.978 < −1.356), H0 is rejected. There is sufficient evidence to indicate the mean salary of technology professionals at all U.S. metropolitan areas has increased between 2003 and 2005 at α = .10.

g. In order for the inference to be valid, we must assume that the population of differences is normal and that we have a random sample.

[MINITAB histogram of the differences (Diff), with frequencies plotted over bins from −7.5 to 5.0]

The graph is fairly mound-shaped although it is somewhat skewed to the right. Since there are only 13 observations, this graph is close enough to being mound-shaped to indicate the normal assumption is reasonable.
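Because the raw salaries are listed in part b, the paired analysis can also be run directly. An illustrative sketch (scipy assumed; the one-sided p-value is obtained by halving the two-sided value so no particular SciPy version is required):

```python
import numpy as np
from scipy import stats

salary_2003 = np.array([87.7, 78.6, 71.4, 70.8, 73.0, 76.3, 73.6,
                        71.1, 69.5, 69.0, 71.0, 73.0, 62.3])
salary_2005 = np.array([85.9, 80.3, 77.4, 77.1, 77.1, 80.1, 73.2,
                        73.0, 69.8, 77.1, 66.9, 71.0, 64.1])

# Paired t-test of H0: mu_d = 0 against Ha: mu_d < 0, where d = 2003 - 2005
t_stat, p_two_sided = stats.ttest_rel(salary_2003, salary_2005)
p_one_sided = p_two_sided / 2 if t_stat < 0 else 1 - p_two_sided / 2
print(f"t = {t_stat:.3f}, one-sided p-value = {p_one_sided:.3f}")

# Critical-value check with alpha = .10 and df = n - 1 = 12
t_crit = stats.t.ppf(0.10, df=len(salary_2003) - 1)   # about -1.356
print(f"reject H0 if t < {t_crit:.3f}")
```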
7.32

a. The data should be analyzed as a paired difference experiment because each actor who won an Academy Award was paired with another actor with similar characteristics who did not win the award.

b. Let μ1 = mean life expectancy of Academy Award winners and μ2 = mean life expectancy of non-Academy Award winners. To compare the mean life expectancies of Academy Award winners and non-winners, we test:

H0: μ1 − μ2 = μd = 0
Ha: μd ≠ 0

c. Since the p-value was so small, there is sufficient evidence to indicate the mean life expectancies of the Academy Award winners and non-winners are different for any value of α > .003. Since the sample mean life expectancy of Academy Award winners is greater than that for non-winners, we can conclude that Academy Award winners have a longer mean life expectancy than non-winners.


7.34

a. Let μ1 = mean driver chest injury rating and μ2 = mean passenger chest injury rating. Because the data are paired, we are interested in μ1 − μ2 = μd, the difference in mean chest injury ratings between drivers and passengers.

b. The data were collected as matched pairs and thus must be analyzed as matched pairs. Two ratings are obtained for each car: the driver's chest injury rating and the passenger's chest injury rating.

c. Using MINITAB, the descriptive statistics are:

Descriptive Statistics: DrivChst, PassChst, diff

Variable    N    Mean  Median  TrMean  StDev  SE Mean
DrivChst   98  49.663  50.000  49.682  6.670    0.674
PassChst   98  50.224  50.500  50.148  7.107    0.718
diff       98  -0.561   0.000  -0.420  5.517    0.557

Variable  Minimum  Maximum      Q1      Q3
DrivChst   34.000   68.000  45.000  54.000
PassChst   35.000   69.000  45.000  55.000
diff      -15.000   13.000  -4.000   3.000

For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58. The 99% confidence interval is:

d̄ ± z.005 (sd/√nd) ⇒ −0.561 ± 2.58(5.517/√98) ⇒ −0.561 ± 1.438 ⇒ (−1.999, 0.877)

d. We are 99% confident that the difference between the mean chest injury ratings of drivers and front-seat passengers is between −1.999 and 0.877. Since 0 is in the confidence interval, there is no evidence that the true mean driver chest injury rating exceeds the true mean passenger chest injury rating.

e. Since the sample size is large, the sampling distribution of d̄ is approximately normal by the Central Limit Theorem. We must assume that the differences are randomly selected.

7.36

a. Let μC1 = mean relational intimacy score for the CMC group on the first meeting and μC3 = mean relational intimacy score for the CMC group on the third meeting. Let μCd = difference in mean relational intimacy score between the first and third meetings for the CMC group. To determine if the mean relational intimacy score will increase between the first and third meetings, we test:

H0: μCd = 0
Ha: μCd < 0

b. The researchers used the paired t-test because the same individuals participated in each of the three meeting sessions. Thus, the samples would not be independent.

c. Since the p-value is so small (p = .003), H0 would be rejected. There is sufficient evidence to indicate that the mean relational intimacy score for participants in the CMC group increased from the first to the third meeting for any value of α > .003.

d. Let μF1 = mean relational intimacy score for the FTF group on the first meeting and μF3 = mean relational intimacy score for the FTF group on the third meeting. Let μFd = difference in mean relational intimacy score between the first and third meetings for the FTF group. To determine if the mean relational intimacy score will change between the first and third meetings, we test:

H0: μFd = 0
Ha: μFd ≠ 0

e. Since the p-value is not small (p = .39), H0 would not be rejected. There is insufficient evidence to indicate that the mean relational intimacy score for participants in the FTF group changed from the first to the third meeting for any value of α < .39.

7.38

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Method1, Method2, Diff

Variable   N  N*   Mean  SE Mean  StDev  Minimum      Q1  Median      Q3  Maximum
Method1   10   0  13.39     4.18  13.22     1.00    1.30   10.35   24.63    34.40
Method2   10   0  13.10     3.96  12.51     1.40    1.78    9.50   25.05    30.70
Diff      10   0  0.290    0.553  1.750   -2.200  -0.875  -0.150   1.575    3.700

To determine if the mean transition error for method 1 differs from the mean transition error for method 2, we test:

H0: μ1 − μ2 = 0    OR    H0: μd = 0
Ha: μ1 − μ2 ≠ 0          Ha: μd ≠ 0

The test statistic is t = (d̄ − 0)/(sd/√nd) = (0.290 − 0)/(1.750/√10) = 0.52

The rejection region requires α/2 = .10/2 = .05 in each tail of the t-distribution with df = nd − 1 = 10 − 1 = 9. From Table VI, Appendix B, t.05 = 1.833. The rejection region is t < −1.833 or t > 1.833.

Since the observed value of the test statistic does not fall in the rejection region (t = 0.52 does not exceed 1.833), H0 is not rejected. There is insufficient evidence to indicate the mean transition error for method 1 differs from the mean transition error for method 2 at α = .10.
7.40  Using MINITAB, the descriptive statistics are:

Descriptive Statistics: HMETER, HSTATIC, Diff

Variable   N  N*       Mean   SE Mean     StDev   Minimum         Q1     Median        Q3   Maximum
HMETER    40   0     1.0405   0.00638    0.0403    0.9936     1.0047     1.0232    1.0883    1.1026
HSTATIC   40   0     1.0410   0.00649    0.0410    0.9930     1.0043     1.0237    1.0908    1.1052
Diff      40   0  -0.000523  0.000204  0.001291 -0.004480  -0.001078  -0.000165  0.000317  0.001580

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI, Appendix B, with df = nd − 1 = 40 − 1 = 39, t.025 ≈ 2.021. The 95% confidence interval is:

d̄ ± t.025 (sd/√nd) ⇒ −0.000523 ± 2.021(0.001291/√40) ⇒ −0.000523 ± 0.000413 ⇒ (−0.000936, −0.000110)

We are 95% confident that the true difference in mean density measurements between the two methods is between −0.000936 and −0.000110. Since the absolute value of this interval is completely less than the desired maximum difference of .002, the winery should choose the alternative method of measuring wine density.
7.42

a. The rejection region requires α = .01 in the lower tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z < −2.33.

b. The rejection region requires α = .025 in the lower tail of the z-distribution. From Table IV, Appendix B, z.025 = 1.96. The rejection region is z < −1.96.

c. The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

d. The rejection region requires α = .10 in the lower tail of the z-distribution. From Table IV, Appendix B, z.10 = 1.28. The rejection region is z < −1.28.

7.44  For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The 95% confidence interval for p1 − p2 is approximately:

a. (p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.65 − .58) ± 1.96√(.65(1 − .65)/400 + .58(1 − .58)/400)
⇒ .07 ± .067 ⇒ (.003, .137)

b. (p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.31 − .25) ± 1.96√(.31(1 − .31)/180 + .25(1 − .25)/250)
⇒ .06 ± .086 ⇒ (−.026, .146)

c. (p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.46 − .61) ± 1.96√(.46(1 − .46)/100 + .61(1 − .61)/120)
⇒ −.15 ± .131 ⇒ (−.281, −.019)
7.46  Some preliminary calculations:

p̂ = (n1p̂1 + n2p̂2)/(n1 + n2) = [55(.7) + 65(.6)]/(55 + 65) = 77.5/120 ≈ .65

q̂ = 1 − p̂ = 1 − .65 = .35

H0: p1 − p2 = 0
Ha: p1 − p2 > 0

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.7 − .6)/√[.65(.35)(1/55 + 1/65)] = .1/.08739 = 1.14

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.14 does not exceed 1.645), H0 is not rejected. There is insufficient evidence to indicate the proportion from population 1 is greater than that for population 2 at α = .05.
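The pooled two-proportion z-test above can be verified with a few lines of Python (an illustrative sketch; only the standard library and scipy are assumed):

```python
from math import sqrt
from scipy.stats import norm

n1, p1_hat = 55, 0.7
n2, p2_hat = 65, 0.6

# Pooled estimate of the common proportion under H0: p1 - p2 = 0
p_pool = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1_hat - p2_hat) / se                 # about 1.14
p_value = norm.sf(z)                       # upper-tail p-value
print(f"p-pooled = {p_pool:.3f}, z = {z:.2f}, p-value = {p_value:.3f}")
```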
7.48

a. Let p1 = proportion of men who prefer to keep track of appointments in their head and p2 = proportion of women who prefer to keep track of appointments in their head. To determine if the proportion of men who prefer to keep track of appointments in their head is greater than that of women, we test:

H0: p1 − p2 = 0
Ha: p1 − p2 > 0

b. p̂ = (n1p̂1 + n2p̂2)/(n1 + n2) = [500(.56) + 500(.46)]/(500 + 500) = .51 and q̂ = 1 − p̂ = 1 − .51 = .49

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.56 − .46)/√[.51(.49)(1/500 + 1/500)] = 3.16

c. The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

d. The p-value is p = P(z ≥ 3.16) ≈ .5 − .5 ≈ 0.

e. Since the observed value of the test statistic falls in the rejection region (z = 3.16 > 2.33), H0 is rejected. There is sufficient evidence to indicate the proportion of men who prefer to keep track of appointments in their head is greater than that of women at α = .01.

7.50

a. Let p1 = proportion of customers returning the printed survey and p2 = proportion of customers returning the electronic survey. Some preliminary calculations are:

p̂1 = x1/n1 = 261/631 = .414        p̂2 = x2/n2 = 155/414 = .374

For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The 90% confidence interval is:

(p̂1 − p̂2) ± z.05 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.414 − .374) ± 1.645√(.414(.586)/631 + .374(.626)/414)
⇒ .04 ± .051 ⇒ (−.011, .091)

We are 90% confident that the difference in the response rates for the two types of surveys is between −.011 and .091.

b. Since the value .05 falls in the 90% confidence interval, it is not an unusual value. Thus, there is no evidence that the difference in response rates is different from .05. The researchers would be able to make this inference.

7.52

a. Let p1 = proportion of managers and professionals who are male and p2 = proportion of part-time MBA students who are male. To see if the samples are sufficiently large:

p̂1 ± 3σ(p̂1) ⇒ p̂1 ± 3√(p̂1q̂1/n1) ⇒ .95 ± 3√(.95(.05)/162) ⇒ .95 ± .05 ⇒ (.90, 1.00)

p̂2 ± 3σ(p̂2) ⇒ p̂2 ± 3√(p̂2q̂2/n2) ⇒ .689 ± 3√(.689(.311)/109) ⇒ .689 ± .133 ⇒ (.556, .822)

Since both intervals are contained within the interval (0, 1), the normal approximation will be adequate.

First, we calculate the overall estimate of the common proportion under H0:

p̂ = (n1p̂1 + n2p̂2)/(n1 + n2) = [162(.95) + 109(.689)]/(162 + 109) = .845

To determine if the population of managers and professionals consists of more males than the part-time MBA population, we test:

H0: p1 = p2
Ha: p1 > p2

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.95 − .689)/√[.845(.155)(1/162 + 1/109)] = 5.82

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 5.82 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the population of managers and professionals consists of more males than the part-time MBA population at α = .05.

b. We had to assume:
1. Both samples were randomly selected.
2. Both sample sizes are sufficiently large.

c. First, we calculate the overall estimate of the common proportion under H0:

p̂ = (n1p̂1 + n2p̂2)/(n1 + n2) = [162(.912) + 109(.534)]/(162 + 109) = .760

To determine if the population of managers and professionals consists of more married individuals than the part-time MBA population, we test:

H0: p1 = p2
Ha: p1 > p2

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.912 − .534)/√[.760(.240)(1/162 + 1/109)] = 7.14

The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

Since the observed value of the test statistic falls in the rejection region (z = 7.14 > 2.33), H0 is rejected. There is sufficient evidence to indicate that the population of managers and professionals consists of more married individuals than the part-time MBA population at α = .01.

d. We had to assume:
1. Both samples were randomly selected.
2. Both sample sizes are sufficiently large.

7.54  Let p1 = accuracy rate for modules with correct code and p2 = accuracy rate for modules with defective code.

Some preliminary calculations are:

p̂1 = x1/n1 = 400/449 = .891        p̂2 = x2/n2 = 20/49 = .408

For confidence coefficient .99, α = .01 and α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58. The 99% confidence interval is:

(p̂1 − p̂2) ± z.005 √(p̂1q̂1/n1 + p̂2q̂2/n2) ⇒ (.891 − .408) ± 2.58√(.891(.109)/449 + .408(.592)/49)
⇒ .483 ± .185 ⇒ (.298, .668)

We are 99% confident that the difference in accuracy rates between modules with correct code and modules with defective code is between .298 and .668.
7.56

a. Let p = proportion of all children who recognize Joe Camel.

p̂ = x/n = (15 + 46)/(28 + 55) = 61/83 = .735        q̂ = 1 − p̂ = 1 − .735 = .265

To see if the sample is sufficiently large:

p̂ ± 3σ(p̂) ⇒ p̂ ± 3√(p̂q̂/n) ⇒ .735 ± 3√(.735(.265)/83) ⇒ .735 ± .145 ⇒ (.590, .880)

Since the interval lies within the interval (0, 1), the normal approximation will be adequate.

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The 95% confidence interval is:

p̂ ± z.025 √(p̂q̂/n) ⇒ .735 ± 1.96√(.735(.265)/83) ⇒ .735 ± .095 ⇒ (.640, .830)

We are 95% confident that the proportion of all children who recognize Joe Camel is between .640 and .830.

b. Let p1 = proportion of children under the age of 6 who recognize Joe Camel and p2 = proportion of children age 6 and over who recognize Joe Camel.

p̂1 = x1/n1 = 15/28 = .536        q̂1 = 1 − p̂1 = 1 − .536 = .464
p̂2 = x2/n2 = 46/55 = .836        q̂2 = 1 − p̂2 = 1 − .836 = .164

To see if the samples are sufficiently large:

p̂1 ± 3√(p̂1q̂1/n1) ⇒ .536 ± 3√(.536(.464)/28) ⇒ .536 ± .283 ⇒ (.253, .819)
p̂2 ± 3√(p̂2q̂2/n2) ⇒ .836 ± 3√(.836(.164)/55) ⇒ .836 ± .150 ⇒ (.686, .986)

Since both intervals lie within the interval (0, 1), the normal approximation will be adequate.

To determine if the recognition of Joe Camel increases with age, we test:

H0: p1 − p2 = 0
Ha: p1 − p2 < 0

The test statistic is z = [(p̂1 − p̂2) − 0]/√[p̂q̂(1/n1 + 1/n2)] = (.536 − .836)/√[.735(.265)(1/28 + 1/55)] = −2.93

The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645.

Since the observed value of the test statistic falls in the rejection region (z = −2.93 < −1.645), H0 is rejected. There is sufficient evidence to indicate that the recognition of Joe Camel increases with age at α = .05.
7.58

a. For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.96)²(15² + 17²)/3.2² = 192.83 ≈ 193

b. If the range of each population is 60, we would estimate σ by: σ ≈ 60/4 = 15.

For confidence coefficient .99, α = 1 − .99 = .01 and α/2 = .01/2 = .005. From Table IV, Appendix B, z.005 = 2.58.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (2.58)²(15² + 15²)/8² = 46.80 ≈ 47

c. For confidence coefficient .9, α = 1 − .9 = .1 and α/2 = .1/2 = .05. From Table IV, Appendix B, z.05 = 1.645. For a width of 1, the bound (margin of error) is .5.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.645)²(5.8 + 7.5)/.5² = 143.96 ≈ 144

7.60

First, find the sample sizes needed for width 5, or margin of error 2.5.

For confidence coefficient .9, α = 1 − .9 = .1 and α/2 = .1/2 = .05. From Table IV, Appendix B, z.05 = 1.645.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.645)²(10² + 10²)/2.5² = 86.59 ≈ 87

Thus, the necessary sample size from each population is 87. Therefore, sufficient funds have been allocated to meet the specifications since n1 = n2 = 100 are large enough samples.
7.62  For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.96)²(3.189² + 2.355²)/1.5² = 26.8 ≈ 27

We would need to sample 27 specimens from each location.
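The same sample-size arithmetic can be wrapped in a small helper. A sketch under the usual equal-n assumption (the function name is ours, not the text's):

```python
from math import ceil
from scipy.stats import norm

def equal_n_for_mean_difference(conf_level, sigma1_sq, sigma2_sq, margin_of_error):
    """Equal sample size per group for estimating mu1 - mu2 to within the given margin of error."""
    z = norm.ppf(1 - (1 - conf_level) / 2)
    n = z**2 * (sigma1_sq + sigma2_sq) / margin_of_error**2
    return ceil(n)

# Exercise 7.62: 95% confidence, sigma estimates 3.189 and 2.355, margin of error 1.5
print(equal_n_for_mean_difference(0.95, 3.189**2, 2.355**2, 1.5))   # 27
```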


7.64  For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. Since no information is given about the values of p1 and p2, we will be conservative and use .5 for both. A width of .04 means the bound is .04/2 = .02.

n1 = n2 = (zα/2)²(p1q1 + p2q2)/ME² = (1.645)²(.5(.5) + .5(.5))/.02² = 3,382.5 ≈ 3,383

7.66

a.
= 3,382.5 3,383

For confidence coefficient .80, α = 1 − .80 = .20 and α/2 = .20/2 = .10. From Table IV, Appendix B, z.10 = 1.28. Since we have no prior information about the proportions, we use p1 = p2 = .5 to get a conservative estimate. For a width of .06, the margin of error is .03.

n1 = n2 = (zα/2)²(p1q1 + p2q2)/ME² = (1.28)²(.5(1 − .5) + .5(1 − .5))/.03² = 910.22 ≈ 911

b. For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. Using the formula for the sample size needed to estimate a single proportion,

n = (zα/2)² pq/ME² = (1.645)²(.5(1 − .5))/.02² = .6765/.0004 = 1691.27 ≈ 1692

No, the sample size from part a is not large enough.


7.68  For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

n1 = n2 = (zα/2)²(σ1² + σ2²)/ME² = (1.96)²(35² + 80²)/10² = 292.9 ≈ 293

7.70

a. With ν1 = 2 and ν2 = 30, P(F ≥ 5.39) = .01 (Table XI, Appendix B)

b. With ν1 = 24 and ν2 = 10, P(F ≥ 2.74) = .05 (Table IX, Appendix B)

Thus, P(F < 2.74) = 1 − P(F ≥ 2.74) = 1 − .05 = .95.

c. With ν1 = 7 and ν2 = 1, P(F ≥ 236.8) = .05 (Table VIII, Appendix B)

Thus, P(F < 236.8) = 1 − P(F ≥ 236.8) = 1 − .05 = .95.

d. With ν1 = 40 and ν2 = 40, P(F > 2.11) = .01 (Table XI, Appendix B)

7.72

To test H0: σ1² = σ2² against Ha: σ1² ≠ σ2², the rejection region is F > Fα/2 with ν1 = 10 and ν2 = 12.

a. α = .20, α/2 = .10: Reject H0 if F > F.10 = 2.19 (Table VIII, Appendix B)

b. α = .10, α/2 = .05: Reject H0 if F > F.05 = 2.75 (Table IX, Appendix B)

c. α = .05, α/2 = .025: Reject H0 if F > F.025 = 3.37 (Table X, Appendix B)

d. α = .02, α/2 = .01: Reject H0 if F > F.01 = 4.30 (Table XI, Appendix B)

7.74

a. To determine if a difference exists between the population variances, we test:

H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

The test statistic is F = s2²/s1² = 8.75/3.87 = 2.26

The rejection region requires α/2 = .10/2 = .05 in the upper tail of the F-distribution with ν1 = n2 − 1 = 27 − 1 = 26 and ν2 = n1 − 1 = 12 − 1 = 11. From Table IX, Appendix B, F.05 ≈ 2.60. The rejection region is F > 2.60.

Since the observed value of the test statistic does not fall in the rejection region (F = 2.26 does not exceed 2.60), H0 is not rejected. There is insufficient evidence to indicate a difference between the population variances.

b. The p-value is 2P(F ≥ 2.26). From Tables VIII and IX, with ν1 = 26 and ν2 = 11,

2(.05) < 2P(F ≥ 2.26) < 2(.10)  ⇒  .10 < p-value < .20

There is no evidence to reject H0 for α ≤ .10.
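The F statistic and its two-tailed p-value can be confirmed with scipy's F distribution (an illustrative sketch, not part of the text's solution):

```python
from scipy.stats import f

n1, s1_sq = 12, 3.87     # sample with the smaller variance
n2, s2_sq = 27, 8.75     # sample with the larger variance

F = s2_sq / s1_sq                                   # about 2.26, larger variance on top
df_num, df_den = n2 - 1, n1 - 1                     # 26 and 11

F_crit = f.ppf(1 - 0.10 / 2, df_num, df_den)        # upper-tail critical value for alpha = .10
p_value = 2 * f.sf(F, df_num, df_den)               # two-tailed p-value
print(f"F = {F:.2f}, critical value = {F_crit:.2f}, p-value = {p_value:.3f}")
```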

7.76  Let σ1² = variance of carat size for diamonds certified by GIA, σ2² = variance of carat size for diamonds certified by HRD, and σ3² = variance of carat size for diamonds certified by IGI.

a. To determine if the variation in carat size differs for diamonds certified by GIA and diamonds certified by HRD, we test:

H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

The test statistic is F = Larger sample variance/Smaller sample variance = s1²/s2² = .2456²/.1831² = 1.799

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν1 = n1 − 1 = 151 − 1 = 150 and ν2 = n2 − 1 = 79 − 1 = 78. From Table X, Appendix B, F.025 ≈ 1.43. The rejection region is F > 1.43.

Since the observed value of the test statistic falls in the rejection region (F = 1.799 > 1.43), H0 is rejected. There is sufficient evidence to indicate the variation in carat size differs for diamonds certified by GIA and those certified by HRD at α = .05.

b. To determine if the variation in carat size differs for diamonds certified by GIA and diamonds certified by IGI, we test:

H0: σ1² = σ3²
Ha: σ1² ≠ σ3²

The test statistic is F = Larger sample variance/Smaller sample variance = s1²/s3² = .2456²/.2163² = 1.289

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν1 = n1 − 1 = 151 − 1 = 150 and ν2 = n3 − 1 = 78 − 1 = 77. From Table X, Appendix B, F.025 ≈ 1.43. The rejection region is F > 1.43.

Since the observed value of the test statistic does not fall in the rejection region (F = 1.289 does not exceed 1.43), H0 is not rejected. There is insufficient evidence to indicate the variation in carat size differs for diamonds certified by GIA and those certified by IGI at α = .05.

c. To determine if the variation in carat size differs for diamonds certified by HRD and diamonds certified by IGI, we test:

H0: σ2² = σ3²
Ha: σ2² ≠ σ3²

The test statistic is F = Larger sample variance/Smaller sample variance = s3²/s2² = .2163²/.1831² = 1.396

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν1 = n3 − 1 = 78 − 1 = 77 and ν2 = n2 − 1 = 79 − 1 = 78. From Table X, Appendix B, F.025 ≈ 1.67. The rejection region is F > 1.67.

Since the observed value of the test statistic does not fall in the rejection region (F = 1.396 does not exceed 1.67), H0 is not rejected. There is insufficient evidence to indicate the variation in carat size differs for diamonds certified by HRD and those certified by IGI at α = .05.
d. We will look at four methods for determining if the data are normal. First, we look at histograms of the data.

[MINITAB histograms of carat size (percent vs. carat) for the GIA, HRD, and IGI samples]

From the histograms, none of the data appear to be mound-shaped. It appears that none of the data sets are normal.

Next, we look at the intervals x̄ ± s, x̄ ± 2s, and x̄ ± 3s. If the proportions of observations falling in each interval are approximately .68, .95, and 1.00, then the data are approximately normal. Using MINITAB, the summary statistics are:

Descriptive Statistics: GIA, IGI, HRD

Variable    N    Mean  Median  TrMean   StDev  SE Mean
GIA       151  0.6723  0.7000  0.6713  0.2456   0.0200
IGI        78  0.3665  0.2900  0.3406  0.2163   0.0245
HRD        79  0.8129  0.8100  0.8169  0.1831   0.0206

Variable  Minimum  Maximum      Q1      Q3
GIA        0.3000   1.1000  0.5000  0.9000
IGI        0.1800   1.0100  0.2100  0.4850
HRD        0.5000   1.0900  0.6500  1.0000

For GIA:

x̄ ± s ⇒ .6723 ± .2456 ⇒ (.4267, .9179). 84 of the 151 values fall in this interval, a proportion of .56. This is much smaller than the .68 we would expect if the data were normal.

x̄ ± 2s ⇒ .6723 ± .4912 ⇒ (.1811, 1.1635). 151 of the 151 values fall in this interval, a proportion of 1.00. This is much larger than the .95 we would expect if the data were normal.

x̄ ± 3s ⇒ .6723 ± .7368 ⇒ (−.0645, 1.4091). 151 of the 151 values fall in this interval, a proportion of 1.00. This is the same as the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

For IGI:

x̄ ± s ⇒ .3665 ± .2163 ⇒ (.1502, .5828). 69 of the 78 values fall in this interval, a proportion of .88. This is much larger than the .68 we would expect if the data were normal.

x̄ ± 2s ⇒ .3665 ± .4326 ⇒ (−.0661, .7991). 74 of the 78 values fall in this interval, a proportion of .95. This is the same as the .95 we would expect if the data were normal.

x̄ ± 3s ⇒ .3665 ± .6489 ⇒ (−.2824, 1.0154). 78 of the 78 values fall in this interval, a proportion of 1.00. This is the same as the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

For HRD:

x̄ ± s ⇒ .8129 ± .1831 ⇒ (.6298, .9960). 30 of the 79 values fall in this interval, a proportion of .38. This is much smaller than the .68 we would expect if the data were normal.

x̄ ± 2s ⇒ .8129 ± .3662 ⇒ (.4467, 1.1791). 79 of the 79 values fall in this interval, a proportion of 1.00. This is much larger than the .95 we would expect if the data were normal.

x̄ ± 3s ⇒ .8129 ± .5493 ⇒ (.2636, 1.3622). 79 of the 79 values fall in this interval, a proportion of 1.00. This is the same as the 1.00 we would expect if the data were normal.

From this method, it appears that the data are not normal.

Next, we look at the ratio of the IQR to s.

For GIA: IQR = QU − QL = 1.1 − .3 = .8, so IQR/s = .8/.2456 = 3.26. This is much larger than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.

For IGI: IQR = QU − QL = 1.01 − .18 = .83, so IQR/s = .83/.2163 = 3.84. This is much larger than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.

For HRD: IQR = QU − QL = 1.09 − .5 = .59, so IQR/s = .59/.1831 = 3.22. This is much larger than the 1.3 we would expect if the data were normal. This method indicates the data are not normal.


Finally, using MINITAB, normal probability plots were constructed for each certification body.

[MINITAB normal probability plot for GIA, with ML estimates Mean = 0.672252 and StDev = 0.244757 and goodness-of-fit statistic AD* = 3.332]

Since the data do not form a straight line, the GIA data are not normal.

[MINITAB normal probability plot for IGI, with ML estimates Mean = 0.366538 and StDev = 0.214863 and goodness-of-fit statistic AD* = 5.622]

Since the data do not form a straight line, the IGI data are not normal.

[MINITAB normal probability plot for HRD, with ML estimates Mean = 0.812911 and StDev = 0.181890 and goodness-of-fit statistic AD* = 3.539]

Since the data do not form a straight line, the HRD data are not normal.

From the four different methods, all indications are that the carat size data are not normal for any of the certification bodies.
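The interval-coverage and IQR/s checks used in part d are easy to script for any sample. A generic sketch (the simulated data array is a placeholder, not the diamond data; substitute the carat-size measurements to reproduce the checks above):

```python
import numpy as np

def normality_checks(x):
    """Empirical-rule coverage and IQR/s ratio used as informal normality checks."""
    x = np.asarray(x, dtype=float)
    mean, s = x.mean(), x.std(ddof=1)
    coverage = [np.mean(np.abs(x - mean) <= k * s) for k in (1, 2, 3)]   # compare to .68, .95, 1.00
    q1, q3 = np.percentile(x, [25, 75])
    ratio = (q3 - q1) / s                                                # roughly 1.3 for normal data
    return coverage, ratio

# Example with simulated data; replace with the actual measurements of interest
rng = np.random.default_rng(0)
coverage, ratio = normality_checks(rng.normal(size=200))
print(coverage, round(ratio, 2))
```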
7.78

a.

The amount of variability of GHQ scores tells us how similar or different the members of
the group are on GHQ scores. The larger the variability, the larger the differences are
among the members on the GHQ scores. The smaller the variability, the smaller the
differences are among the members on the GHQ scores.

b.

Let σ₁² = variance of the mental health scores of the employed and σ₂² = variance of the mental health scores of the unemployed. To determine if the variability in mental health scores differs for employed and unemployed workers, we test:

H0: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²

c.

The test statistic is F = Larger sample variance / Smaller sample variance = s₁²/s₂² = 5.10²/3.26² = 2.45

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν₁ = n₂ − 1 = 49 − 1 = 48 and ν₂ = n₁ − 1 = 142 − 1 = 141. From Table XI, Appendix B, F.025 ≈ 1.61. The rejection region is F > 1.61.

Since the observed value of the test statistic falls in the rejection region (F = 2.45 > 1.61), H0 is rejected. There is sufficient evidence to indicate that the variability in mental health scores differs for employed and unemployed workers at α = .05.
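A brief Python sketch of this variance-ratio test from the summary statistics follows. It is illustrative only; the use of scipy for the critical value is my choice, and the pairing of the larger standard deviation (5.10) with the sample of size 49 follows the degrees of freedom quoted above.

from scipy.stats import f

s_large, n_large = 5.10, 49     # sample with the larger variance
s_small, n_small = 3.26, 142    # sample with the smaller variance
F = s_large**2 / s_small**2                         # about 2.45
crit = f.ppf(1 - 0.05/2, n_large - 1, n_small - 1)  # upper .025 critical value, about 1.6
print(round(F, 2), round(crit, 2), F > crit)        # F exceeds the critical value, so reject H0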


d.

We must assume that the 2 populations of mental health scores are normally distributed. We must also assume that we selected 2 independent random samples.

7.80

Let σ₁² = variance of zinc measurements from the text-line, σ₂² = variance of zinc measurements from the witness-line, and σ₃² = variance of zinc measurements from the intersection. Using MINITAB, the descriptive statistics are:
Descriptive Statistics: Text-line, Witness-line, Intersection

Variable     N     Mean   Median   TrMean    StDev   SE Mean
Text-lin     3   0.3830   0.3740   0.3830   0.0531    0.0306
Witness-     6   0.3042   0.2955   0.3042   0.1015    0.0415
Intersec     5   0.3290   0.3190   0.3290   0.0443    0.0198

Variable   Minimum   Maximum       Q1       Q3
Text-lin    0.3350    0.4400   0.3350   0.4400
Witness-    0.1880    0.4390   0.2045   0.4075
Intersec    0.2850    0.3930   0.2900   0.3730

a.

To determine if the variation in the zinc measurements for the text-line and the
intersection differ, we test:
H0: σ₁² = σ₃²
Ha: σ₁² ≠ σ₃²
The test statistic is F = Larger sample variance / Smaller sample variance = s₁²/s₃² = .0531²/.0443² = 1.437

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν₁ = n₁ − 1 = 3 − 1 = 2 and ν₂ = n₃ − 1 = 5 − 1 = 4. From Table X, Appendix B, F.025 = 10.65. The rejection region is F > 10.65.

Since the observed value of the test statistic does not fall in the rejection region (F = 1.437 ≯ 10.65), H0 is not rejected. There is insufficient evidence to indicate the variation in the zinc measurements for the text-line and the intersection differ at α = .05.
b.

To determine if the variation in the zinc measurements for the witness-line and the
intersection differ, we test:
H0: σ₂² = σ₃²
Ha: σ₂² ≠ σ₃²
The test statistic is F = Larger sample variance / Smaller sample variance = s₂²/s₃² = .1015²/.0443² = 5.250

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν₁ = n₂ − 1 = 6 − 1 = 5 and ν₂ = n₃ − 1 = 5 − 1 = 4. From Table X, Appendix B, F.025 = 9.36. The rejection region is F > 9.36.

Since the observed value of the test statistic does not fall in the rejection region (F = 5.250 ≯ 9.36), H0 is not rejected. There is insufficient evidence to indicate the variation in the zinc measurements for the witness-line and the intersection differ at α = .05.


c.

There is no indication that the variances of the zinc measurements for the three locations differ.

d.

With only 3, 6, and 5 measurements, it is very difficult to check the assumptions.

7.82

Using MINITAB, some preliminary calculations are:


Descriptive Statistics: HEATRATE

Variable   ENGINE         N   N*    Mean   SE Mean   StDev   Minimum      Q1   Median
HEATRATE   Advanced      21    0    9764       139     639      9105    9252     9669
           Aeroderiv      7    0   12312      1002    2652      8714    9469    12414
           Traditional   39    0   11544       205    1279     10086   10592    11183

Variable   ENGINE            Q3   Maximum
HEATRATE   Advanced       10060     11588
           Aeroderiv      14628     16243
           Traditional    11964     14796

a.

To determine if the heat rate variances for traditional and aeroderivative augmented gas
turbines differ, we test:
H0: σ₂² = σ₃²
Ha: σ₂² ≠ σ₃²
The test statistic is F = Larger sample variance / Smaller sample variance = s₂²/s₃² = 2652²/1279² = 4.299

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with numerator df = ν₂ = n₂ − 1 = 7 − 1 = 6 and denominator df = ν₃ = n₃ − 1 = 39 − 1 = 38. From Table X, Appendix B, F.025 ≈ 2.74. The rejection region is F > 2.74.

Since the observed value of the test statistic falls in the rejection region (F = 4.299 > 2.74), H0 is rejected. There is sufficient evidence to indicate the heat rate variances for traditional and aeroderivative augmented gas turbines differ at α = .05.

Since the test in Exercise 7.23 a assumes that the population variances are the same, the validity of the test is suspect since we just found the variances are different.
b.

To determine if the heat rate variances for advanced and aeroderivative augmented gas
turbines differ, we test:
H0: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²


The test statistic is F = Larger sample variance / Smaller sample variance = s₂²/s₁² = 2652²/639² = 17.224
The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with numerator df = ν₁ = n₁ − 1 = 7 − 1 = 6 and denominator df = ν₂ = n₂ − 1 = 21 − 1 = 20. From Table X, Appendix B, F.025 = 3.13. The rejection region is F > 3.13.

Since the observed value of the test statistic falls in the rejection region (F = 17.224 > 3.13), H0 is rejected. There is sufficient evidence to indicate the heat rate variances for advanced and aeroderivative augmented gas turbines differ at α = .05.

Since the test in Exercise 7.23 b assumes that the population variances are the same, the validity of the test is suspect since we just found the variances are different.
7.84

a.

The 2 samples are randomly selected in an independent manner from the two populations. The sample sizes, n₁ and n₂, are large enough so that x̄₁ and x̄₂ each have approximately normal sampling distributions and so that s₁² and s₂² provide good approximations to σ₁² and σ₂². This will be true if n₁ ≥ 30 and n₂ ≥ 30.

b.

1. Both sampled populations have relative frequency distributions that are approximately normal.
2. The population variances are equal.
3. The samples are randomly and independently selected from the populations.

c.

1. The relative frequency distribution of the population of differences is normal.
2. The sample of differences is randomly selected from the population of differences.

d.

The two samples are independent random samples from binomial distributions. Both samples should be large enough so that the normal distribution provides an adequate approximation to the sampling distributions of p̂₁ and p̂₂.

e.

The two samples are independent random samples from populations which are normally
distributed.

7.86

a. H0: σ₁² = σ₂²
   Ha: σ₁² ≠ σ₂²

The test statistic is F = Larger sample variance / Smaller sample variance = s₂²/s₁² = 120.1/31.3 = 3.84

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with numerator df ν₁ = n₂ − 1 = 15 − 1 = 14 and denominator df ν₂ = n₁ − 1 = 20 − 1 = 19. From Table XI, Appendix B, F.025 ≈ 2.66. The rejection region is F > 2.66.

Since the observed value of the test statistic falls in the rejection region (F = 3.84 > 2.66), H0 is rejected. There is sufficient evidence to conclude σ₁² ≠ σ₂² at α = .05.


b. No, we should not use a small-sample t test to test H0: (μ₁ − μ₂) = 0 against Ha: (μ₁ − μ₂) ≠ 0 because the assumption of equal variances does not seem to hold, since we concluded σ₁² ≠ σ₂² in part a.

7.88

Some preliminary calculations are:

p̂₁ = x₁/n₁ = 110/200 = .55;  p̂₂ = x₂/n₂ = 130/200 = .65;  p̂ = (x₁ + x₂)/(n₁ + n₂) = (110 + 130)/(200 + 200) = 240/400 = .60

a. H0: (p₁ − p₂) = 0
   Ha: (p₁ − p₂) < 0
The test statistic is z = [(p̂₁ − p̂₂) − 0] / √[p̂q̂(1/n₁ + 1/n₂)] = (.55 − .65) / √[.6(1 − .6)(1/200 + 1/200)] = −.10/.049 = −2.04

The rejection region requires α = .10 in the lower tail of the z-distribution. From Table IV, Appendix B, z.10 = 1.28. The rejection region is z < −1.28.

Since the observed value of the test statistic falls in the rejection region (z = −2.04 < −1.28), H0 is rejected. There is sufficient evidence to conclude (p₁ − p₂) < 0 at α = .10.
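A small Python sketch of this large-sample test for p₁ − p₂ is given below; it is an illustration, not part of the original solution, and simply reuses the counts 110 of 200 and 130 of 200 quoted above.

from math import sqrt
from scipy.stats import norm

x1, n1, x2, n2 = 110, 200, 130, 200
p1_hat, p2_hat = x1/n1, x2/n2
p_pool = (x1 + x2) / (n1 + n2)                         # pooled proportion under H0
se = sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))
z = (p1_hat - p2_hat) / se                             # about -2.04
p_value = norm.cdf(z)                                  # lower-tailed alternative
print(round(z, 2), round(p_value, 4))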
b.

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The 95% confidence interval for (p₁ − p₂) is approximately:

(p̂₁ − p̂₂) ± z.025 √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂) ⇒ (.55 − .65) ± 1.96 √[.55(1 − .55)/200 + .65(1 − .65)/200] ⇒ −.10 ± .096 ⇒ (−.196, −.004)

c. From part b, z.025 = 1.96. Using the information from our samples, we can use p̂₁ = .55 and p̂₂ = .65. For a width of .01, the margin of error is ME = .005.

n₁ = n₂ = (z.025)²(p̂₁q̂₁ + p̂₂q̂₂)/(ME)² = (1.96)²[.55(1 − .55) + .65(1 − .65)]/.005² = 1.82476/.000025 = 72,990.4 ≈ 72,991


7.90

a.

Let p₁ = proportion of Opening Doors students enrolled full time and p₂ = proportion of traditional students enrolled full time.

The target parameter for this comparison is p₁ − p₂.

b.

Let μ₁ = mean GPA of Opening Doors students and μ₂ = mean GPA of traditional students.

The target parameter for this comparison is μ₁ − μ₂.

7.92

Using MINITAB, some preliminary calculations are:


Descriptive Statistics: Spillage

Variable   Cause        N   N*    Mean   SE Mean   StDev   Minimum      Q1   Median
Spillage   Collision   10    0    76.6      22.3    70.4      31.0    35.0     41.5
           Fire        12    0    70.9      17.5    60.7      26.0    32.3     49.0
           Grounding   14    0   47.79      7.61   28.47     21.00   30.25    37.50
           HullFail    12    0    54.4      16.3    56.4      24.0    29.3     31.5
           Unknown      2    0   26.00      1.00    1.41     25.00       *    26.00

Variable   Cause           Q3   Maximum
Spillage   Collision    102.0     257.0
           Fire          80.5     239.0
           Grounding    59.00    124.00
           HullFail      46.0     221.0
           Unknown          *     27.00

a.

Let μ₁ = mean spillage for accidents caused by collision and μ₂ = mean spillage for accidents caused by fire/explosion.

s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [(10 − 1)70.4² + (12 − 1)60.7²]/(10 + 12 − 2) = 4,256.7415

For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table VI, Appendix B, with df = n₁ + n₂ − 2 = 10 + 12 − 2 = 20, t.05 = 1.725. The confidence interval is:

(x̄₁ − x̄₂) ± t.05 √[s_p²(1/n₁ + 1/n₂)] ⇒ (76.6 − 70.9) ± 1.725 √[4,256.7415(1/10 + 1/12)] ⇒ 5.7 ± 48.19 ⇒ (−42.49, 53.89)
b.

Let μ₃ = mean spillage for accidents caused by grounding and μ₄ = mean spillage for accidents caused by hull failure.

s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [(14 − 1)28.47² + (12 − 1)56.4²]/(14 + 12 − 2) = 1,896.9830


To determine if the mean spillage amount for accidents caused by grounding is different from the mean spillage amount caused by hull failure, we test:

H0: μ₃ − μ₄ = 0
Ha: μ₃ − μ₄ ≠ 0

The test statistic is t = [(x̄₁ − x̄₂) − D₀] / √[s_p²(1/n₁ + 1/n₂)] = [(47.79 − 54.4) − 0] / √[1,896.983(1/14 + 1/12)] = −6.61/17.1342 = −.39

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = n₁ + n₂ − 2 = 14 + 12 − 2 = 24. From Table VI, Appendix B, t.025 = 2.064. The rejection region is t < −2.064 or t > 2.064.

Since the observed value of the test statistic does not fall in the rejection region (t = −.39 ≮ −2.064), H0 is not rejected. There is insufficient evidence to indicate the mean spillage amount for accidents caused by grounding is different from the mean spillage amount caused by hull failure at α = .05.
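A short Python sketch of this pooled two-sample t test computed from the summary statistics above is included for reference; the numbers come from the MINITAB output, and the use of scipy for the critical value is my own illustration.

from math import sqrt
from scipy.stats import t as t_dist

n1, xbar1, s1 = 14, 47.79, 28.47      # grounding
n2, xbar2, s2 = 12, 54.4, 56.4        # hull failure
sp2 = ((n1 - 1)*s1**2 + (n2 - 1)*s2**2) / (n1 + n2 - 2)   # pooled variance, about 1896.98
t_stat = (xbar1 - xbar2) / sqrt(sp2 * (1/n1 + 1/n2))       # about -0.39
crit = t_dist.ppf(1 - .05/2, n1 + n2 - 2)                  # about 2.064
print(round(t_stat, 2), round(crit, 3), abs(t_stat) > crit)   # do not reject H0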
c.

The necessary assumptions are:


We must assume that the distributions from which the samples were selected are
approximately normal, the samples are independent, and the variances of the two
populations are equal.
Below are the stem-and-leaf plots for each of the samples:
[Stem-and-leaf displays of Spillage by Cause, from MINITAB: Collision (N = 10, Leaf Unit = 10), Fire (N = 12, Leaf Unit = 10), Grounding (N = 14, Leaf Unit = 1.0), and Hull Failure (N = 12, Leaf Unit = 10).]

Based on the shapes of the stem-and-leaf plots, it does not appear that the data are
normally distributed.
Also, we know that if the data are normally distributed, then the Interquartile Range, IQR, divided by the standard deviation should be approximately 1.3. We will compute IQR/s for each of the samples:

Collision:      IQR/s = (102.0 − 35.0)/70.4 = .95
Fire:           IQR/s = (80.5 − 32.3)/60.7 = .79
Grounding:      IQR/s = (59.0 − 30.25)/28.47 = 1.01
Hull Failure:   IQR/s = (46.0 − 29.3)/56.4 = .29

Since all of these ratios are quite a bit smaller than 1.3, it indicates that none of the samples come from normal distributions.
Thus, it appears that the assumption of normal distributions is violated.
The sample standard deviations are:

Collision:      s = 70.4
Fire:           s = 60.7
Grounding:      s = 28.47
Hull Failure:   s = 56.4

Without doing formal tests, it appears that the variances of the groups Collision, Fire,
and Hull Failure are probably not significantly different. However, it appears that the
variance for the grounding group is smaller than the others.


d.

Let σ₁² = variance of spillage for accidents caused by collision and σ₂² = variance of spillage for accidents caused by grounding.

To determine if the variances of the amounts of spillage due to collision and grounding differ, we test:

H0: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²

The test statistic is F = Larger sample variance / Smaller sample variance = s₁²/s₂² = 70.4²/28.47² = 6.11

The rejection region requires α/2 = .02/2 = .01 in the upper tail of the F-distribution with numerator df = ν₁ = n₁ − 1 = 10 − 1 = 9 and denominator df = ν₂ = n₂ − 1 = 14 − 1 = 13. From Table XI, Appendix B, F.01 = 4.19. The rejection region is F > 4.19.

Since the observed value of the test statistic falls in the rejection region (F = 6.11 > 4.19), H0 is rejected. There is sufficient evidence to indicate the variances of the amounts of spillage due to collision and grounding differ at α = .02.
7.94

a.

Let μ₁ = mean rating of concern about product tampering for males and μ₂ = mean rating of concern about product tampering for females. To determine whether a difference exists in the mean level of concern about product tampering between males and females, we test:

H0: μ₁ − μ₂ = 0
Ha: μ₁ − μ₂ ≠ 0

b. The p-value = .008. Since the p-value is so small, there is evidence to reject H0. There is sufficient evidence to indicate a difference exists in the mean level of concern about product tampering between males and females for α > .008.

c.

We must assume the sample sizes were sufficiently large so that the Central Limit
Theorem applies. We must also assume that we selected two random and independent
samples from the two populations.

7.96  For confidence coefficient .95, α = .05 and α/2 = .025. From Table IV, Appendix B, z.025 = 1.96.

n₁ = n₂ = (z.025)²(p̂₁q̂₁ + p̂₂q̂₂)/(ME)² = 1.96²[.395(.605) + .293(.707)]/.03² = 1904.26 ≈ 1905

7.98

a. Let p₁₉₉₉ = proportion of adult Americans who would vote for a woman president in 1999 and p₁₉₇₅ = proportion of adult Americans who would vote for a woman president in 1975.


b.

To see if the samples are sufficiently large:

p̂₁₉₉₉ ± 3σ_p̂ ⇒ p̂₁₉₉₉ ± 3√(p̂₁₉₉₉q̂₁₉₉₉/n₁₉₉₉) ⇒ .92 ± 3√(.92(.08)/2000) ⇒ .92 ± .02 ⇒ (.90, .94)

p̂₁₉₇₅ ± 3σ_p̂ ⇒ p̂₁₉₇₅ ± 3√(p̂₁₉₇₅q̂₁₉₇₅/n₁₉₇₅) ⇒ .73 ± 3√(.73(.27)/2000) ⇒ .73 ± .03 ⇒ (.70, .76)


Since both intervals are contained within the interval (0, 1), the normal approximation
will be adequate.
c.

For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The 90% confidence interval is:

(p̂₁₉₉₉ − p̂₁₉₇₅) ± z.05 √(p̂₁₉₉₉q̂₁₉₉₉/n₁₉₉₉ + p̂₁₉₇₅q̂₁₉₇₅/n₁₉₇₅) ⇒ (.92 − .73) ± 1.645 √[.92(.08)/2000 + .73(.27)/1500] ⇒ .19 ± .02 ⇒ (.17, .21)


We are 90% confident that the difference in the proportions of adult Americans who
would vote for a woman president between 1999 and 1975 is between .17 and .21.
d.

To see if the samples are sufficiently large:

p̂₁₉₉₉ ± 3σ_p̂ ⇒ p̂₁₉₉₉ ± 3√(p̂₁₉₉₉q̂₁₉₉₉/n₁₉₉₉) ⇒ .92 ± 3√(.92(.08)/20) ⇒ .92 ± .18 ⇒ (.74, 1.10)

p̂₁₉₇₅ ± 3σ_p̂ ⇒ p̂₁₉₇₅ ± 3√(p̂₁₉₇₅q̂₁₉₇₅/n₁₉₇₅) ⇒ .73 ± 3√(.73(.27)/50) ⇒ .73 ± .19 ⇒ (.54, .92)

Since the first interval is not contained within the interval (0, 1), the normal approximation will not be adequate.
7.100

a.

For each measure, let μ₁ = mean job satisfaction for day-shift nurses and μ₂ = mean job satisfaction for night-shift nurses. To determine whether a difference in job satisfaction exists between day-shift and night-shift nurses, we test:

H0: μ₁ − μ₂ = 0
Ha: μ₁ − μ₂ ≠ 0


b.

Hours of work: The p-value = .813. Since the p-value is so large, there is no evidence to reject H0. There is insufficient evidence to indicate a difference in mean job satisfaction exists between day-shift and night-shift nurses on hours of work for α ≤ .10.

Free time: The p-value = .047. Since the p-value is so small, there is evidence to reject H0. There is sufficient evidence to indicate a difference in mean job satisfaction exists between day-shift and night-shift nurses on free time for α > .047.

Breaks: The p-value = .0073. Since the p-value is so small, there is evidence to reject H0. There is sufficient evidence to indicate a difference in mean job satisfaction exists between day-shift and night-shift nurses on breaks for α > .0073.

c. We must make the following assumptions for each measure:

1. The job satisfaction scores for both day-shift and night-shift nurses are normally distributed.
2. The variances of job satisfaction scores for both day-shift and night-shift nurses are equal.
3. Random and independent samples were selected from both populations of job satisfaction scores.

7.102  For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. We estimate p₁ = p₂ = .5.

n₁ = n₂ = (z.05)²(p₁q₁ + p₂q₂)/(ME)² = (1.645)²[.5(.5) + .5(.5)]/.05² = 541.205 ≈ 542

7.104  Let p₁ = proportion of larvae that died in containers containing high carbon dioxide levels and p₂ = proportion of larvae that died in containers containing normal carbon dioxide levels. The parameter of interest for this problem is p₁ − p₂, or the difference in the death rates for the two groups.
Some preliminary calculations are:
p̂ = (x₁ + x₂)/(n₁ + n₂) = [.10(80) + .05(80)]/(80 + 80) = .075

q̂ = 1 − p̂ = 1 − .075 = .925

To determine if an increased level of carbon dioxide is effective in killing a higher percentage


of leaf-eating larvae, we test:
H0: p₁ − p₂ = 0
Ha: p₁ − p₂ > 0
The test statistic is z = [(p̂₁ − p̂₂) − 0] / √[p̂q̂(1/80 + 1/80)] = (.10 − .05) / √[.075(.925)(1/80 + 1/80)] = 1.201


The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

Since the observed value of the test statistic does not fall in the rejection region (z = 1.201 ≯ 2.33), H0 is not rejected. There is insufficient evidence to indicate that an increased level of carbon dioxide is effective in killing a higher percentage of leaf-eating larvae at α = .01.
7.106

a.

Let p1 = proportion of female students who switched due to loss of interest in SME and
p2 = proportion of male students who switched due to lack of interest in SME.
Some preliminary calculations are:
p̂₁ = x₁/n₁ = 74/172 = .43;  p̂₂ = x₂/n₂ = 72/163 = .44;  p̂ = (x₁ + x₂)/(n₁ + n₂) = (74 + 72)/(172 + 163) = .436

To determine if the proportion of female students who switch due to lack of interest in
SME differs from the proportion of males who switch due to a lack of interest, we test:
H0: p₁ − p₂ = 0
Ha: p₁ − p₂ ≠ 0
The test statistic is z = [(p̂₁ − p̂₂) − 0] / √[p̂q̂(1/n₁ + 1/n₂)] = (.43 − .44) / √[.436(.564)(1/172 + 1/163)] = −0.18

The rejection region requires α/2 = .10/2 = .05 in each tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z < −1.645 or z > 1.645.

Since the observed value of the test statistic does not fall in the rejection region (z = −0.18 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate the proportion of female students who switch due to lack of interest in SME differs from the proportion of males who switch due to a lack of interest in SME at α = .10.
b.

Let p1 = proportion of female students who switched due to low grades in SME and
p2 = proportion of male students who switched due to low grades in SME.
Some preliminary calculations are:

p̂₁ = x₁/n₁ = 33/172 = .19;  p̂₂ = x₂/n₂ = 44/163 = .27

For confidence coefficient .90, α = .10 and α/2 = .10/2 = .05. From Table IV, Appendix B, z.05 = 1.645. The confidence interval is:

(p̂₁ − p̂₂) ± z.05 √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂) ⇒ (.19 − .27) ± 1.645 √[.19(.81)/172 + .27(.73)/163] ⇒ −.08 ± .075 ⇒ (−.155, −.005)


We are 90% confident that the difference between the proportions of female and male switchers who lost confidence due to low grades in SME is between −.155 and −.005. Since the interval does not include 0, there is evidence to indicate the proportion of female switchers due to low grades is less than the proportion of male switchers due to low grades.
7.108  For confidence level .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B, z.025 = 1.96. The standard deviation can be estimated by dividing the range by 4: σ ≈ Range/4 = 4/4 = 1.

n₁ = n₂ = (z.025)²(σ₁² + σ₂²)/(ME)² = 1.96²(1² + 1²)/.2² = 192.08 ≈ 193

7.110

Some preliminary calculations are:

s₁² = [Σx₁² − (Σx₁)²/n₁]/(n₁ − 1) = (10,251 − 225²/5)/(5 − 1) = 126/4 = 31.5

s₂² = [Σx₂² − (Σx₂)²/n₂]/(n₂ − 1) = (10,351 − 227²/5)/(5 − 1) = 45.2/4 = 11.3
Let σ₁² = variance for instrument A and σ₂² = variance for instrument B. Since we wish to determine if there is a difference in the precision of the two instruments, we test:

H0: σ₁² = σ₂²
Ha: σ₁² ≠ σ₂²

The test statistic is F = Larger sample variance / Smaller sample variance = s₁²/s₂² = 31.5/11.3 = 2.79

The rejection region requires α/2 = .10/2 = .05 in the upper tail of the F-distribution with ν₁ = n₁ − 1 = 5 − 1 = 4 and ν₂ = n₂ − 1 = 5 − 1 = 4. From Table IX, Appendix B, F.05 = 6.39. The rejection region is F > 6.39.

Since the observed value of the test statistic does not fall in the rejection region (F = 2.79 ≯ 6.39), H0 is not rejected. There is insufficient evidence of a difference in the precision of the two instruments at α = .10.


7.112

a.

Let μ₁ = mean change in bond prices handled by underwriter 1 and μ₂ = mean change in bond prices handled by underwriter 2.

s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [(27 − 1).0098 + (23 − 1).002465]/(27 + 23 − 2) = .30903/48 = .006438

To determine if there is a difference in the mean change in bond prices handled by the 2 underwriters, we test:

H0: μ₁ − μ₂ = 0
Ha: μ₁ − μ₂ ≠ 0
The test statistic is t = [(x̄₁ − x̄₂) − D₀] / √[s_p²(1/n₁ + 1/n₂)] = [(−.0491 − (−.0307)) − 0] / √[.006438(1/27 + 1/23)] = −.81

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df = n₁ + n₂ − 2 = 27 + 23 − 2 = 48. From Table VI, Appendix B, t.025 ≈ 1.96. The rejection region is t < −1.96 or t > 1.96.

Since the observed value of the test statistic does not fall in the rejection region (t = −.81 ≮ −1.96), H0 is not rejected. There is insufficient evidence to indicate there is a difference in the mean change in bond prices handled by the 2 underwriters at α = .05.
b.

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table VI, Appendix B, with df = 48, t.025 ≈ 1.96. The confidence interval is:

(x̄₁ − x̄₂) ± t.025 √[s_p²(1/n₁ + 1/n₂)] ⇒ (−.0491 − (−.0307)) ± 1.96 √[.006438(1/27 + 1/23)] ⇒ −.0184 ± .0446 ⇒ (−.063, .0262)

We are 95% confident the difference in the mean bond prices handled by underwriter 1 and underwriter 2 is somewhere between −.063 and .0262.
7.114

a.

To determine if the mean salary of all males with post-graduate degrees exceeds the mean
salary of all females with post-graduate degrees, we test:
H0: μM = μF
Ha: μM > μF

b.

The test statistic is z = [(x̄M − x̄F) − 0] / √(s²_x̄M + s²_x̄F) = (61,340 − 32,227) / √(2,185² + 932²) = 12.26


c.

The rejection region requires α = .01 in the upper tail of the z-distribution. From Table IV, Appendix B, z.01 = 2.33. The rejection region is z > 2.33.

d.

Since the observed value of the test statistic falls in the rejection region (z = 12.26 > 2.33), H0 is rejected. There is sufficient evidence to indicate the mean salary of all males with post-graduate degrees exceeds the mean salary of all females with post-graduate degrees at α = .01.


The Kentucky Milk Case - Part II

(To accompany Chapters 5-7)

(1) Incumbency Rates
I have repeated the incumbency rates for the Tri-county market. If the "normal" incumbency
rate is .7 in competitive markets, then we would like to test to see if the incumbency rate in the
Tri-county market is larger than .7. We will run a test for each of the years from 1985 through
1988, and also for the four years combined.

              Tri-County Market
Year   Number of    Same      Incumbency
       Districts    Vendors   Rate
1984      10           8        .800
1985      12          12       1.000
1986      13          13       1.000
1987      13          12        .923
1988      13          13       1.000
1989      13           9        .692
1990      13          10        .769
1991      13          11        .846

1985
One of the assumptions necessary for this test is that the sample size is sufficiently large. In
order for the sample size to be sufficiently large, the interval p₀ ± 3σ_p̂ must not contain 0 or 1.
For this problem, the interval is:

p₀ ± 3σ_p̂ ⇒ .7 ± 3√(.7(.3)/12) ⇒ .7 ± .397 ⇒ (.303, 1.097)

Since 1 is included in the interval, the sample size is not sufficiently large. The following test
may not be valid.
To see if the incumbency rate in 1985 exceeds .7, we test:
H0: p = .7
Ha: p > .7

The test statistic is z = (p̂ − p₀) / √(p₀q₀/n) = (1 − .7) / √(.7(.3)/12) = 2.27

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 2.27 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the incumbency rate exceeds .7 in the Tri-county market at α = .05.
1986

In order for the sample size to be sufficiently large, the interval p₀ ± 3σ_p̂ must not contain 0 or 1.
For this problem, the interval is:

p₀ ± 3σ_p̂ ⇒ .7 ± 3√(.7(.3)/13) ⇒ .7 ± .381 ⇒ (.319, 1.081)

Since 1 is included in the interval, the sample size is not sufficiently large. The following test
may not be valid.
To see if the incumbency rate in 1986 exceeds .7, we test:
H0: p = .7
Ha: p > .7

The test statistic is z = (p̂ − p₀) / √(p₀q₀/n) = (1 − .7) / √(.7(.3)/13) = 2.36

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 2.36 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the incumbency rate exceeds .7 in the Tri-county market at α = .05.
1987

In order for the sample size to be sufficiently large, the interval p₀ ± 3σ_p̂ must not contain 0 or 1.
For this problem, the interval is:

p₀ ± 3σ_p̂ ⇒ .7 ± 3√(.7(.3)/13) ⇒ .7 ± .381 ⇒ (.319, 1.081)

Since 1 is included in the interval, the sample size is not sufficiently large. The following test
may not be valid.
To see if the incumbency rate in 1987 exceeds .7, we test:
H0: p = .7
Ha: p > .7
The test statistic is z = (p̂ − p₀) / √(p₀q₀/n) = (.923 − .7) / √(.7(.3)/13) = 1.75

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

Since the observed value of the test statistic falls in the rejection region (z = 1.75 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the incumbency rate exceeds .7 in the Tri-county market at α = .05.
1988

This test is the same as the test for 1986.


Combined 1985-1988

To see if the sample size is sufficiently large, the interval p₀ ± 3σ_p̂ must not contain 0 or 1.
For this problem, the interval is:

p₀ ± 3σ_p̂ ⇒ .7 ± 3√(.7(.3)/51) ⇒ .7 ± .193 ⇒ (.507, .893)

Since neither 0 nor 1 is included in the interval, the sample size is sufficiently large.
p̂ = 50/51 = .980

To see if the incumbency rate in 1985-1988 exceeds .7, we test:


H0: p = .7
Ha: p > .7

The test statistic is z = (p̂ − p₀) / √(p₀q₀/n) = (.980 − .7) / √(.7(.3)/51) = 4.36

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table IV,
Appendix B, z.05 = 1.645. The rejection region is z > 1.645.


Since the observed value of the test statistic falls in the rejection region (z = 4.36 > 1.645), H0 is rejected. There is sufficient evidence to indicate that the incumbency rate exceeds .7 in the Tri-county market at α = .05.

Thus, there is evidence, based on the incumbency rates, that bid collusion is present in the Tri-county market.
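A compact Python sketch of the one-sample proportion test used for each of these years (including the p₀ ± 3σ adequacy check) is given below for reference. The function name is mine; the call shown uses the combined 1985-1988 figures from above.

from math import sqrt
from scipy.stats import norm

def incumbency_test(x, n, p0=0.7):
    se0 = sqrt(p0 * (1 - p0) / n)
    adequate = (p0 - 3*se0 > 0) and (p0 + 3*se0 < 1)   # sample-size adequacy check
    z = (x/n - p0) / se0                               # z statistic for H0: p = p0
    return adequate, z, 1 - norm.cdf(z)                # upper-tailed p-value for Ha: p > p0

print(incumbency_test(50, 51))   # combined 1985-1988: adequate sample, z about 4.36, p-value near 0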

(2) Bid Price Dispersion

Again, we can use only the data provided which are the winning bids in each of the school
districts in both markets. The sample sizes and the variances for each of the milk products for
each year and each market are provided in the table.
Whole White Milk

        Surrounding Market      Tri-County Market
YR        N        VAR            N        VAR
83       22     0.000212          8     0.000213
84       22     0.000188          9     0.000022
85       26     0.000174         10     0.000028
86       33     0.000120         10     0.000019
87       36     0.000105         12     0.000027
88       36     0.000128         12     0.000024
89       37     0.000056         12     0.000089
90       35     0.000063         12     0.000010
91        5     0.000042         13     0.000020

Lowfat White Milk

        Surrounding Market      Tri-County Market
YR        N        VAR            N        VAR
83       24     0.000279         10     0.000155
84       26     0.000216         12     0.000040
85       29     0.000210         13     0.000028
86       33     0.000139         13     0.000028
87       35     0.000152         13     0.000049
88       35     0.000165         13     0.000038
89       35     0.000043         13     0.000068
90       34     0.000091         13     0.000025
91        5     0.000051         12     0.000034


Lowfat Chocolate Milk

        Surrounding Market      Tri-County Market
YR        N        VAR            N        VAR
83       24     0.000287          5     0.000015
84       25     0.000234          6     0.000060
85       28     0.000248          6     0.000038
86       34     0.000163          6     0.000027
87       36     0.000163          7     0.000040
88       36     0.000184          9     0.000087
89       36     0.000060          9     0.000087
90       33     0.000098         10     0.000014
91        5     0.000098         11     0.000042

I will write out the first test and then summarize the others in a table. The first test will be for
the year 1983 and will compare the variances of the whole white milk.
To determine if the variances in the winning bid prices differ for the two markets, we test:

H0: σ₁²/σ₂² = 1
Ha: σ₁²/σ₂² ≠ 1

The test statistic is F = larger sample variance / smaller sample variance = s₁²/s₂² = .000213/.000212 = 1.005

The rejection region requires α/2 = .05/2 = .025 in the upper tail of the F-distribution with ν₁ = n₂ − 1 = 8 − 1 = 7 and ν₂ = n₁ − 1 = 22 − 1 = 21. From Table IX, Appendix B, F.025 = 2.97. The rejection region is F > 2.97.

Since the observed value of the test statistic does not fall in the rejection region (F = 1.005 ≯ 2.97), H0 is not rejected. There is insufficient evidence to indicate that the variances of the winning bids are different for the two markets.
Whole White Milk
Year    ν₁, ν₂    F.025      F       Decision
1983     7, 21     2.97    1.005    Do not reject
1984    21, 8      4.00    8.545    Reject
1985    25, 9      3.61    6.214    Reject
1986    32, 9      3.56    6.316    Reject
1987    35, 11     2.96    3.889    Reject
1988    35, 11     2.96    5.333    Reject
1989    11, 36     2.51    1.589    Do not reject
1990    34, 11     3.12    6.300    Reject
1991     4, 12     4.12    2.100    Do not reject


In all cases where there was a significant difference in the variances of the winning bids
between the two markets, the variance in the Surrounding market was larger than the variance
in the Tri-county market. This implies that collusion might be present in the Tri-county market.
Lowfat White Milk
Year    ν₁, ν₂    F.025      F       Decision
1983    23, 9      3.62    1.800    Do not reject
1984    25, 11     3.17    5.400    Reject
1985    28, 12     3.02    7.500    Reject
1986    32, 12     2.96    4.964    Reject
1987    34, 12     2.96    3.102    Reject
1988    34, 12     2.96    4.342    Reject
1989    12, 34     2.41    1.581    Do not reject
1990    33, 12     2.96    3.640    Reject
1991     4, 11     4.28    1.500    Do not reject

Again, in all cases where there was a significant difference in the variances of the winning bids
between the two markets, the variance in the Surrounding market was larger than the variance
in the Tri-county market. This implies that collusion might be present in the Tri-county market.
Lowfat Chocolate Milk
Year    ν₁, ν₂    F.025       F       Decision
1983    23, 4      8.56    19.133    Reject
1984    24, 5      6.28     3.900    Do not reject
1985    27, 5      6.28     6.526    Reject
1986    33, 5      6.23     6.037    Do not reject
1987    35, 6      5.07     4.075    Do not reject
1988    35, 8      3.89    10.222    Reject
1989     8, 35     2.65     1.450    Do not reject
1990    32, 9      3.56     7.000    Reject
1991     4, 10     4.47     2.333    Do not reject

Again, in all cases where there was a significant difference in the variances of the winning bids
between the two markets, the variance in the Surrounding market was larger than the variance
in the Tri-county market. This implies that collusion might be present in the Tri-county market.
Based on the analysis of the three milk products, there appears to be collusion in the Tri-county
market.
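A short Python sketch of the variance-ratio comparison behind each row of the decision tables above is included here for reference; the function is my own illustration, and the example call reuses the whole white milk 1984 values from the dispersion table.

from scipy.stats import f

def compare_markets(var_sur, n_sur, var_tri, n_tri, alpha=0.05):
    # larger sample variance goes in the numerator, with matching degrees of freedom
    if var_sur >= var_tri:
        F, df1, df2 = var_sur / var_tri, n_sur - 1, n_tri - 1
    else:
        F, df1, df2 = var_tri / var_sur, n_tri - 1, n_sur - 1
    crit = f.ppf(1 - alpha/2, df1, df2)
    return F, crit, "Reject" if F > crit else "Do not reject"

# Whole white milk, 1984: Surrounding var .000188 (n = 22), Tri-county var .000022 (n = 9)
print(compare_markets(0.000188, 22, 0.000022, 9))   # F about 8.5, critical value about 4.0, so Reject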


(3) Average Winning Bid Price

I have provided the SAS output for computing the t-tests to compare the mean winning bid
prices between the two markets for each of the years and each of the milk products. I will
discuss the findings for each milk product separately. For t-tests, we must assume that the two
population variances are the same. If the population variances are not the same, there is an
approximate test that takes into consideration the different variances. The SAS printout
provided allows for the test of equal variances first. I used a p-value of .25 as the cutoff point.
If the p-value was less than or equal to .25 for the F-test, I assumed that the variances were
different and used the approximate test designated as UNEQUAL. If the p-value for the F-test
was greater than .25, I assumed that the population variances were the same and used the test
designated as EQUAL.
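The same decision rule can be sketched in Python; this is only an illustration of the procedure described above (preliminary F test, then pooled versus Satterthwaite t test using a .25 cutoff), and the array names sur_bids and tri_bids are placeholders for the raw winning-bid data.

import numpy as np
from scipy import stats

def market_t_test(sur_bids, tri_bids, cutoff=0.25):
    s1, s2 = np.var(sur_bids, ddof=1), np.var(tri_bids, ddof=1)
    F = max(s1, s2) / min(s1, s2)
    df1 = (len(sur_bids) if s1 >= s2 else len(tri_bids)) - 1
    df2 = (len(tri_bids) if s1 >= s2 else len(sur_bids)) - 1
    p_var = 2 * (1 - stats.f.cdf(F, df1, df2))          # two-tailed p-value of the F' test
    equal = p_var > cutoff                               # EQUAL if variances look similar
    t_stat, p_val = stats.ttest_ind(sur_bids, tri_bids, equal_var=equal)
    return ("EQUAL" if equal else "UNEQUAL"), t_stat, p_val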
Whole White Milk:
Variable: Whole White Milk - 1983
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      22   0.1318   0.01458844   0.00311027    Unequal        2.4045    12.4     0.0326
TRI       8   0.1173   0.01462038   0.00516909    Equal          2.4071    28.0     0.0229*
For H0: Variances are equal, F' = 1.00   DF = (7,21)   Prob>F' = 0.9116
************************************************************************

Variable: Whole White Milk - 1984
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      22   0.1309   0.01374189   0.00292978    Unequal       -2.3904    28.6     0.0236*
TRI       9   0.1389   0.00474871   0.00158290    Equal         -1.6825    29.0     0.1032
For H0: Variances are equal, F' = 8.37   DF = (21,8)   Prob>F' = 0.0044
************************************************************************

Variable: Whole White Milk - 1985
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      26   0.1279   0.01321810   0.00259228    Unequal       -4.3968    33.8     0.0001*
TRI      10   0.1415   0.00534266   0.00168950    Equal         -3.1348    34.0     0.0035
For H0: Variances are equal, F' = 6.12   DF = (25,9)   Prob>F' = 0.0077
************************************************************************

Variable: Whole White Milk - 1986
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      33   0.1253   0.01098665   0.00191253    Unequal       -8.1534    37.3     0.0001*
TRI      10   0.1446   0.00442846   0.00140040    Equal         -5.3943    41.0     0.0000
For H0: Variances are equal, F' = 6.15   DF = (32,9)   Prob>F' = 0.0070
************************************************************************

Variable: Whole White Milk - 1987
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1264   0.01026078   0.00171013    Unequal      -10.0785    37.5     0.0001*
TRI      12   0.1495   0.00527196   0.00152188    Equal         -7.4313    46.0     0.0000
For H0: Variances are equal, F' = 3.79   DF = (35,11)   Prob>F' = 0.0224
************************************************************************

Variable: Whole White Milk - 1988
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1277   0.01135449   0.00189242    Unequal       -9.9271    42.2     0.0001*
TRI      12   0.1513   0.00499090   0.00144075    Equal         -6.9441    46.0     0.0000
For H0: Variances are equal, F' = 5.18   DF = (35,11)   Prob>F' = 0.0060
************************************************************************

Variable: Whole White Milk - 1989
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      37   0.1299   0.00752173   0.00123657    Unequal       -0.4890    15.8     0.6316
TRI      12   0.1314   0.00944991   0.00272795    Equal         -0.5501    47.0     0.5849NS
For H0: Variances are equal, F' = 1.58   DF = (11,36)   Prob>F' = 0.2947
************************************************************************

Variable: Whole White Milk - 1990
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      35   0.1609   0.00794659   0.00134322    Unequal       -1.1177    43.7     0.2698NS
TRI      12   0.1628   0.00317904   0.00091771    Equal         -0.7673    45.0     0.4469
For H0: Variances are equal, F' = 6.25   DF = (34,11)   Prob>F' = 0.0026
************************************************************************

Variable: Whole White Milk - 1991
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR       5   0.1452   0.00652012   0.00291589    Unequal        1.2585     5.6     0.2585
TRI      13   0.1412   0.00458169   0.00127073    Equal          1.4813    16.0     0.1580NS
For H0: Variances are equal, F' = 2.03   DF = (4,12)   Prob>F' = 0.3095
************************************************************************

The mean winning bid prices were significantly different between the markets for all years except 1989, 1990, and 1991. In 1983, the mean winning bid for the Surrounding market was significantly larger than that for the Tri-county market. For the years 1984-1988, the mean winning bid price for the Tri-county market was significantly larger than that for the Surrounding market. This implies evidence of collusion for the years 1984-1988.
Lowfat White Milk:
Variable: Lowfat White Milk - 1983
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      24   0.1243   0.01672220   0.00341341    Unequal        2.5085    22.6     0.0198
TRI      10   0.1112   0.01246237   0.00394095    Equal          2.2214    32.0     0.0335*
For H0: Variances are equal, F' = 1.80   DF = (23,9)   Prob>F' = 0.3627
************************************************************************

Variable: Lowfat White Milk - 1984
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      26   0.1236   0.01469859   0.00288263    Unequal       -3.0061    36.0     0.0048*
TRI      12   0.1338   0.00635717   0.00183516    Equal         -2.3099    36.0     0.0267
For H0: Variances are equal, F' = 5.35   DF = (25,11)   Prob>F' = 0.0059
************************************************************************

Variable: Lowfat White Milk - 1985
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      29   0.1200   0.01452245   0.00269675    Unequal       -5.3857    39.2     0.0001*
TRI      13   0.1366   0.00537445   0.00149061    Equal         -3.9769    40.0     0.0003
For H0: Variances are equal, F' = 7.30   DF = (28,12)   Prob>F' = 0.0008
************************************************************************

Variable: Lowfat White Milk - 1986
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      33   0.1178   0.01180640   0.00205523    Unequal       -8.4010    43.0     0.0001*
TRI      13   0.1391   0.00533205   0.00147884    Equal         -6.2183    44.0     0.0000
For H0: Variances are equal, F' = 4.90   DF = (32,12)   Prob>F' = 0.0055
************************************************************************

Variable: Lowfat White Milk - 1987
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      35   0.1173   0.01235100   0.00208770    Unequal       -8.7991    37.8     0.0001*
TRI      13   0.1424   0.00701738   0.00194627    Equal         -6.8995    46.0     0.0000
For H0: Variances are equal, F' = 3.10   DF = (34,12)   Prob>F' = 0.0404
************************************************************************

Variable: Lowfat White Milk - 1988
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      35   0.1182   0.01285522   0.00217293    Unequal       -9.6219    42.7     0.0001*
TRI      13   0.1448   0.00618019   0.00171408    Equal         -7.1332    46.0     0.0000
For H0: Variances are equal, F' = 4.33   DF = (34,12)   Prob>F' = 0.0095
************************************************************************

Variable: Lowfat White Milk - 1989
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      35   0.1187   0.00655938   0.00110874    Unequal       -2.1005    17.9     0.0501
TRI      13   0.1240   0.00828350   0.00229743    Equal         -2.3400    46.0     0.0237*
For H0: Variances are equal, F' = 1.59   DF = (12,34)   Prob>F' = 0.2798
************************************************************************

Variable: Lowfat White Milk - 1990
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      34   0.1519   0.00954524   0.00163700    Unequal       -2.3772    39.8     0.0223*
TRI      13   0.1570   0.00508486   0.00141029    Equal         -1.8347    45.0     0.0732
For H0: Variances are equal, F' = 3.52   DF = (33,12)   Prob>F' = 0.0238
************************************************************************

Variable: Lowfat White Milk - 1991
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR       5   0.1364   0.00718485   0.00321316    Unequal        0.2745     6.3     0.7925
TRI      12   0.1354   0.00585768   0.00169097    Equal          0.3001    15.0     0.7682NS
For H0: Variances are equal, F' = 1.50   DF = (4,11)   Prob>F' = 0.5343
************************************************************************

The mean winning bid prices were significantly different between the markets for all years except 1991. In 1983, the mean winning bid for the Surrounding market was significantly larger than that for the Tri-county market. For the years 1984-1990, the mean winning bid price for the Tri-county market was significantly larger than that for the Surrounding market. This implies evidence of collusion for the years 1984-1990.

Lowfat Chocolate Milk:
Variable: Lowfat Chocolate Milk - 1983
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      24   0.1267   0.01696642   0.00346326    Unequal        5.3313    26.3     0.0001*
TRI       5   0.1060   0.00394740   0.00176533    Equal          2.6795    27.0     0.0124
For H0: Variances are equal, F' = 18.47   DF = (23,4)   Prob>F' = 0.0117
************************************************************************

Variable: Lowfat Chocolate Milk - 1984
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      25   0.1251   0.01530156   0.00306031    Unequal       -2.1693    15.7     0.0457*
TRI       6   0.1347   0.00778522   0.00317830    Equal         -1.4733    29.0     0.1514
For H0: Variances are equal, F' = 3.86   DF = (24,5)   Prob>F' = 0.1379
************************************************************************

Variable: Lowfat Chocolate Milk - 1985
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      28   0.1206   0.01575587   0.00297758    Unequal       -4.6215    20.9     0.0001*
TRI       6   0.1387   0.00621914   0.00253895    Equal         -2.7384    32.0     0.0100
For H0: Variances are equal, F' = 6.42   DF = (27,5)   Prob>F' = 0.0472
************************************************************************

Variable: Lowfat Chocolate Milk - 1986
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      34   0.1169   0.01279357   0.00219408    Unequal       -8.0140    18.2     0.0001*
TRI       6   0.1414   0.00521130   0.00212751    Equal         -4.5821    38.0     0.0000
For H0: Variances are equal, F' = 6.03   DF = (33,5)   Prob>F' = 0.0533
************************************************************************

Variable: Lowfat Chocolate Milk - 1987
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1184   0.01280507   0.00213418    Unequal       -7.8853    17.5     0.0001*
TRI       7   0.1436   0.00632926   0.00239224    Equal         -5.0675    41.0     0.0000
For H0: Variances are equal, F' = 4.09   DF = (35,6)   Prob>F' = 0.0832
************************************************************************

Variable: Lowfat Chocolate Milk - 1988
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1192   0.01359999   0.00226666    Unequal      -10.3636    40.6     0.0001*
TRI       9   0.1470   0.00425532   0.00141844    Equal         -5.9934    43.0     0.0000
For H0: Variances are equal, F' = 10.21   DF = (35,8)   Prob>F' = 0.0019
************************************************************************

Variable: Lowfat Chocolate Milk - 1989
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      36   0.1200   0.00776605   0.00129434    Unequal       -1.7178    10.9     0.1140
TRI       9   0.1258   0.00932923   0.00310974    Equal         -1.9216    43.0     0.0613NS
For H0: Variances are equal, F' = 1.44   DF = (8,35)   Prob>F' = 0.4274
************************************************************************

Variable: Lowfat Chocolate Milk - 1990
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR      33   0.1531   0.00993298   0.00172911    Unequal       -3.9472    38.3     0.0003*
TRI      10   0.1614   0.00383030   0.00121125    Equal         -2.5773    41.0     0.0137
For H0: Variances are equal, F' = 6.73   DF = (32,9)   Prob>F' = 0.0050
************************************************************************

Variable: Lowfat Chocolate Milk - 1991
MARKET    N     Mean      Std Dev      Std Error    Variances          T      DF   Prob>|T|
SUR       5   0.1402   0.00991020   0.00443197    Unequal       -0.4431     5.6     0.6743
TRI      11   0.1423   0.00650294   0.00196071    Equal         -0.5216    14.0     0.6101NS
For H0: Variances are equal, F' = 2.32   DF = (4,10)   Prob>F' = 0.2552

The mean winning bid prices were significantly different between the markets for all years except 1989 and 1991. In 1983, the mean winning bid for the Surrounding market was significantly larger than that for the Tri-county market. For the years 1984-1988 and 1990, the mean winning bid price for the Tri-county market was significantly larger than that for the Surrounding market. This implies evidence of collusion for the years 1984-1988.


Design of Experiments and Analysis of Variance

Chapter 8

8.2  The treatments are the combinations of levels of each of the two factors. There are 2 × 5 = 10 treatments. They are:
(A, 50), (A, 60), (A, 70), (A, 80), (A, 90)
(B, 50), (B, 60), (B, 70), (B, 80), (B, 90)

8.4

a. College GPA's are measured on college students. The experimental units are college students.

b.

Household income is measured on households. The experimental units are households.

c.

Gasoline mileage is measured on automobiles. The experimental units are the automobiles of a particular model.

d.

The experimental units are the sectors on a computer diskette.

e.

The experimental units are the states.

8.6

a. The response variable is the amount of the purchase.

b. There is one factor in this problem: type of credit card.


c. There are 4 treatments, corresponding to the 4 levels of the factor. The treatments are
VISA, MasterCard, American Express, and Discover.
d. The experimental units are the credit card holders.
8.8

a.

The response variable in this problem is the consumer's opinion on the value of the discount offer.

b.

There are two treatments in this problem: within-store price promotion and between-store price promotion.

c.

The experimental units are the consumers.

8.10

a. There are 2 factors in the problem: Type of yeast and Temperature. Type of yeast has 2 levels (Brewer's yeast and baker's yeast). Temperature has 4 levels (45°, 48°, 51°, and 54°C).

b.

The response variable is the autolysis yield.

c.

There are a total of 2 × 4 = 8 treatments in this experiment. The treatments are all the type of yeast-temperature combinations.

d.

This is a designed experiment.


8.12

a.

The response is the evaluation by the undergraduate student of the ethical behavior of the
salesperson.

b.

There are two factors: type of sales job at two levels (high tech. vs. low tech.) and sales task at two levels (new account development vs. account maintenance).

c.

The treatments are the 2 × 2 = 4 combinations of type of sales job and sales task.

d.

The experimental units are the college students.

8.14

a. From Table IX with ν₁ = 4 and ν₂ = 4, F.05 = 6.39.

b. From Table XI with ν₁ = 4 and ν₂ = 4, F.01 = 15.98.

c. From Table VIII with ν₁ = 30 and ν₂ = 40, F.10 = 1.54.

d. From Table X with ν₁ = 15 and ν₂ = 12, F.025 = 3.18.

8.16

a. In the second dot diagram (#2), the difference between the sample means is small relative to the variability within the sample observations. In the first dot diagram (#1), the values in each of the samples are grouped together with a range of 4, while in the second diagram (#2), the range of values is 8.

b. For diagram #1,

x̄₁ = Σx₁/n = (7 + 8 + 9 + 9 + 10 + 11)/6 = 54/6 = 9
x̄₂ = Σx₂/n = (12 + 13 + 14 + 14 + 15 + 16)/6 = 84/6 = 14

For diagram #2,

x̄₁ = Σx₁/n = (5 + 5 + 7 + 11 + 13 + 13)/6 = 54/6 = 9
x̄₂ = Σx₂/n = (10 + 10 + 12 + 16 + 18 + 18)/6 = 84/6 = 14

c. For diagram #1,

SST = Σ nᵢ(x̄ᵢ − x̄)² = 6(9 − 11.5)² + 6(14 − 11.5)² = 75, where x̄ = Σx/n = (54 + 84)/12 = 11.5

For diagram #2,

SST = Σ nᵢ(x̄ᵢ − x̄)² = 6(9 − 11.5)² + 6(14 − 11.5)² = 75


d. For diagram #1,

s₁² = [Σx₁² − (Σx₁)²/n₁]/(n₁ − 1) = (496 − 54²/6)/(6 − 1) = 2
s₂² = [Σx₂² − (Σx₂)²/n₂]/(n₂ − 1) = (1186 − 84²/6)/(6 − 1) = 2

SSE = (n₁ − 1)s₁² + (n₂ − 1)s₂² = (6 − 1)2 + (6 − 1)2 = 20

For diagram #2,

s₁² = [Σx₁² − (Σx₁)²/n₁]/(n₁ − 1) = (558 − 54²/6)/(6 − 1) = 14.4
s₂² = [Σx₂² − (Σx₂)²/n₂]/(n₂ − 1) = (1248 − 84²/6)/(6 − 1) = 14.4

SSE = (n₁ − 1)s₁² + (n₂ − 1)s₂² = (6 − 1)14.4 + (6 − 1)14.4 = 144

e. For diagram #1, SS(Total) = SST + SSE = 75 + 20 = 95

SST is [SST/SS(Total)] × 100% = (75/95) × 100% = 78.95% of SS(Total)

For diagram #2, SS(Total) = SST + SSE = 75 + 144 = 219

SST is [SST/SS(Total)] × 100% = (75/219) × 100% = 34.25% of SS(Total)

f. For diagram #1,

MST = SST/(k − 1) = 75/(2 − 1) = 75
MSE = SSE/(n − k) = 20/(12 − 2) = 2
F = MST/MSE = 75/2 = 37.5

For diagram #2,

MST = SST/(k − 1) = 75/(2 − 1) = 75
MSE = SSE/(n − k) = 144/(12 − 2) = 14.4
F = MST/MSE = 75/14.4 = 5.21


g. The rejection region for both diagrams requires α = .05 in the upper tail of the F-distribution with ν₁ = p − 1 = 2 − 1 = 1 and ν₂ = n − p = 12 − 2 = 10. From Table IX, Appendix B, F.05 = 4.96. The rejection region is F > 4.96.

For diagram #1, the observed value of the test statistic falls in the rejection region (F = 37.5 > 4.96). Thus, H0 is rejected. There is sufficient evidence to indicate the samples were drawn from populations with different means at α = .05.

For diagram #2, the observed value of the test statistic falls in the rejection region (F = 5.21 > 4.96). Thus, H0 is rejected. There is sufficient evidence to indicate the samples were drawn from populations with different means at α = .05.
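A short Python sketch of the diagram #1 computations (SST, SSE, MST, MSE, F) follows as a cross-check. It is not part of the original solution; the six values in the first sample are inferred from the printed sums (five are listed explicitly and the sixth must be 9 for the total of 54), so they should be treated as an assumption.

import numpy as np

sample1 = np.array([7, 8, 9, 9, 10, 11])
sample2 = np.array([12, 13, 14, 14, 15, 16])
grand = np.concatenate([sample1, sample2]).mean()                                        # 11.5
sst = len(sample1)*(sample1.mean() - grand)**2 + len(sample2)*(sample2.mean() - grand)**2  # 75
sse = (len(sample1)-1)*sample1.var(ddof=1) + (len(sample2)-1)*sample2.var(ddof=1)           # 20
mst, mse = sst/(2 - 1), sse/(len(sample1) + len(sample2) - 2)
print(sst, sse, mst/mse)    # 75.0  20.0  37.5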

h. We must assume both populations are normally distributed with common variances.

8.18  Refer to Exercise 8.16. The ANOVA tables are:


For diagram #1:

Source       df     SS     MS      F
Treatment     1     75     75     37.5
Error        10     20      2
Total        11     95

For diagram #2:

Source       df     SS     MS      F
Treatment     1     75     75     5.21
Error        10    144    14.4
Total        11    219

8.20

a.

df for Error is 41 − 6 = 35
SSE = SS(Total) − SST = 46.5 − 17.5 = 29.0

MST = SST/(k − 1) = 17.5/6 = 2.9167
MSE = SSE/(n − k) = 29.0/35 = .8286
F = MST/MSE = 2.9167/.8286 = 3.52

The ANOVA table is:


Source
Treatment
Error
Total

df
6
35
41

SS
17.5
29.0
46.5

MS
2.9167
.8286

Design of Experiments and Analysis of Variance

F
3.52

259

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com
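The arithmetic that fills in this ANOVA table can be sketched in a few lines of Python (illustrative only; the inputs are the SS(Total), SST, k, and n values given in the exercise):

ss_total, sst, k, n = 46.5, 17.5, 7, 42
sse = ss_total - sst          # 29.0
mst = sst / (k - 1)           # 2.9167
mse = sse / (n - k)           # 0.8286
F = mst / mse                 # about 3.52
print(round(sse, 1), round(mst, 4), round(mse, 4), round(F, 2))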

b.

The number of treatments is k. We know k − 1 = 6, so k = 7.

c.

To determine if there is a difference among the population means, we test:

H0: μ₁ = μ₂ = ⋯ = μ₇
Ha: At least one of the population means differs from the rest

The test statistic is F = 3.52.

The rejection region requires α = .10 in the upper tail of the F-distribution with numerator df = k − 1 = 6 and denominator df = n − k = 35. From Table VIII, Appendix B, F.10 ≈ 1.98. The rejection region is F > 1.98.

Since the observed value of the test statistic falls in the rejection region (F = 3.52 > 1.98), H0 is rejected. There is sufficient evidence to indicate a difference among the population means at α = .10.

d.

The observed significance level is P(F 3.52). With numerator df = 6 and denominator
df = 35, and Table XI, P(F 3.52) < .01.

	e.	H0: μ1 = μ2
		Ha: μ1 ≠ μ2

		The test statistic is t = (x̄1 − x̄2)/√[MSE(1/n1 + 1/n2)] = (3.7 − 4.1)/√[.8286(1/6 + 1/6)] = −.76

		The rejection region requires α/2 = .10/2 = .025 in each tail of the t-distribution with df = n − p = 35. From Table VI, Appendix B, t.05 ≈ 1.697. The rejection region is t < −1.697 or t > 1.697.

		Since the observed value of the test statistic does not fall in the rejection region (t = −.76 ≮ −1.697), H0 is not rejected. There is insufficient evidence to indicate that μ1 and μ2 differ at α = .10.
	f.	For confidence coefficient .90, α = .10 and α/2 = .05. From Table VI, Appendix B, with df = 35, t.05 ≈ 1.697. The confidence interval is:

		(x̄1 − x̄2) ± t.05 √[MSE(1/n1 + 1/n2)] ⇒ (3.7 − 4.1) ± 1.697√[.8286(1/6 + 1/6)] ⇒ −.4 ± .892 ⇒ (−1.292, .492)

	g.	The confidence interval is:

		x̄1 ± t.05 √(MSE/6) ⇒ 3.7 ± 1.697√(.8286/6) ⇒ 3.7 ± .631 ⇒ (3.069, 4.331)
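		A quick numerical check of the two intervals in parts f and g, as a sketch only (the table value 1.697 is replaced by the exact t quantile from SciPy):

		    # Sketch: confidence intervals from Exercise 8.20 f and g.
		    from math import sqrt
		    from scipy import stats

		    MSE, df = .8286, 35
		    t05 = stats.t.ppf(.95, df)               # about 1.69, close to the table value 1.697

		    # Part f: 90% CI for (mu1 - mu2) with n1 = n2 = 6
		    diff = 3.7 - 4.1
		    half_f = t05 * sqrt(MSE * (1/6 + 1/6))
		    print(diff - half_f, diff + half_f)      # about (-1.29, 0.49)

		    # Part g: 90% CI for mu1 with n1 = 6
		    half_g = t05 * sqrt(MSE / 6)
		    print(3.7 - half_g, 3.7 + half_g)        # about (3.07, 4.33)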

8.22

	a.	The experimental unit in the study is the college tennis coach. The dependent variable is the response to the statement "the Prospective Student-Athlete Form on the web site contributes very little to the recruiting process" on a scale from 1 to 7. There is one factor in the study and it is the NCAA division of the college tennis coach. There are 3 levels of this factor, and thus, there are 3 treatments: Division I, Division II, and Division III.

	b.	To determine if the mean responses of tennis coaches from the different divisions differ, we test:

		H0: μ1 = μ2 = μ3
		Ha: At least 1 μi differs

	c.	Since the observed p-value of the test (p < .003) is less than α = .05, H0 is rejected. There is sufficient evidence to indicate differences in mean response among coaches of the 3 divisions.

8.24	a.	A completely randomized design was used.

	b.	There are 4 treatments: 3 robots/colony, 6 robots/colony, 9 robots/colony, and 12 robots/colony.

	c.	To determine if there was a difference in the mean energy expended (per robot) among the 4 colony sizes, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least two means differ

	d.	Since the p-value (< .001) is less than α (.05), H0 is rejected. There is sufficient evidence to indicate a difference in mean energy expended per robot among the 4 colony sizes at α = .05.

8.26	a.	To determine if differences exist in the mean rates of return among the three types of fund groups, we test:

		H0: μ1 = μ2 = μ3
		Ha: At least two means differ

	b.	The rejection region requires α = .01 in the upper tail of the F-distribution with ν1 = k − 1 = 3 − 1 = 2 and ν2 = N − k = 90 − 3 = 87. From Table XI, Appendix B, F.01 ≈ 4.98. The rejection region is F > 4.98.

	c.	Since the observed value of the test statistic falls in the rejection region (F = 69.65 > 4.98), H0 is rejected. There is sufficient evidence to indicate differences exist in the mean rates of return among the three types of fund groups at α = .01.


8.28	a.	The response variable for this study is the safety rating of nuclear power plants.

	b.	There are three treatments in this study. The treatment groups are the scientists, the journalists, and the federal government policymakers.

	c.	To determine whether there are differences in the attitudes of scientists, journalists, and government officials regarding the safety of nuclear power plants, we test:

		H0: μ1 = μ2 = μ3
		Ha: At least two means differ

	d.	The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k − 1 = 3 − 1 = 2 and ν2 = n − k = 300 − 3 = 297. From Table IX, Appendix B, F.05 ≈ 3.00. The rejection region is F > 3.00.

		In order to reject H0, the test statistic F = MST/MSE must be greater than 3.00.

		MST > 3.00(MSE) = 3.00(2.355) = 7.065. Thus, MST must be greater than 7.065.

	e.	For MST = 11.280, F = MST/MSE = 11.28/2.355 = 4.79

	f.	With ν1 = k − 1 = 3 − 1 = 2 and ν2 = n − k = 300 − 3 = 297, P(F > 4.79) ≈ .01, using Table XI, Appendix B. The approximate p-value is .01.

8.30	a.	We will select size as the quantitative variable and color as the qualitative variable. To determine if the mean size of diamonds differs among the 6 colors, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6
		Ha: At least two means differ
	b.	Using MINITAB, the ANOVA table is:

		One-way ANOVA: Carats versus Color

		Analysis of Variance for Carats
		Source     DF        SS       MS      F      P
		Color       5    0.7963   0.1593   2.11  0.064
		Error     302   22.7907   0.0755
		Total     307   23.5869

		Level    N     Mean    StDev
		D       16   0.6381   0.3195
		E       44   0.6232   0.2677
		F       82   0.5929   0.2648
		G       65   0.5808   0.2792
		H       61   0.6734   0.2643
		I       40   0.7310   0.2918

		Pooled StDev = 0.2747

		(The output also displays individual 95% confidence intervals for each color mean, based on the pooled standard deviation.)

		The test statistic is F = 2.11 and the p-value is p = 0.064.

		Since the p-value (0.064) is less than α = .10, H0 is rejected. There is sufficient evidence to indicate the mean size of diamonds differs among the 6 colors at α = .10.
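		The p-value reported by MINITAB can be reproduced from the F statistic and its degrees of freedom; a minimal SciPy sketch (using the printed F = 2.11 with df 5 and 302):

		    # Sketch: p-value for the diamonds ANOVA from the printed F statistic.
		    from scipy import stats

		    F, df_num, df_den = 2.11, 5, 302
		    p_value = stats.f.sf(F, df_num, df_den)   # upper-tail area, about 0.06
		    print(round(p_value, 3))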
	c.	We will check the assumptions of normality and equal variances. Using MINITAB, stem-and-leaf displays of Carats (leaf unit = 0.010) were produced separately for each of the 6 colors: Color 1 (D, N = 16), Color 2 (E, N = 44), Color 3 (F, N = 82), Color 4 (G, N = 65), Color 5 (H, N = 61), and Color 6 (I, N = 40).

		The data for the 6 colors do not look particularly mound-shaped, so the assumption of normality is probably not valid. However, departures from this assumption often do not invalidate the ANOVA results.
		Using MINITAB, side-by-side box plots of Carats for the 6 colors (D through I) were also produced.

		The spreads of all the colors appear to be about the same, so the assumption of constant variance is probably valid.

8.32	a.	The df for Groups = ν1 = k − 1 = 3 − 1 = 2. The df for Error = ν2 = n − k = 71 − 3 = 68.

		The completed ANOVA table is:

		Source     df         SS        MS      F
		Groups      2     128.70     64.35   0.16
		Error      68  27,124.52    398.89

	b.	To determine if the total number of activities undertaken differed among the three groups of entrepreneurs, we test:

		H0: μ1 = μ2 = μ3
		Ha: At least one mean differs

		The test statistic is F = 0.16.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k − 1 = 3 − 1 = 2 and ν2 = n − k = 71 − 3 = 68. From Table IX, Appendix B, F.05 ≈ 3.15. The rejection region is F > 3.15.

		Since the observed value of the test statistic does not fall in the rejection region (F = 0.16 ≯ 3.15), H0 is not rejected. There is insufficient evidence to indicate that the total number of activities differed among the groups of entrepreneurs at α = .05.

	c.	The p-value of the test is P(F > 0.16). From Table VIII, Appendix B, with ν1 = 2 and ν2 = 68, P(F > 0.16) > .10.

	d.	No. Since our conclusion was that there was no evidence of a difference in the total number of activities among the groups, there would be no evidence to indicate a difference between two specific groups.

	e.	This study would be observational. The group that each entrepreneur fell into was observed, not controlled. Since no differences were found, the type of study does not have an impact on the conclusions.

8.34

	The experimentwise error rate is the probability of making a Type I error for at least one of all of the comparisons made. If the experimentwise error rate is α = .05, then each individual comparison is made at a value of α which is less than .05.

8.36

a.

From the diagram, the following pairs of treatments are significantly different because
they are not connected by a line: A and E, A and B, A and D, C and E, C and B, C and D,
and E and D. All other pairs of means are not significantly different because they are
connected by lines.

b.

From the diagram, the following pairs of treatments are significantly different because
they are not connected by a line: A and B, A and D, C and B, C and D, E and B, E and D,
and B and D. All other pairs of means are not significantly different because they are
connected by lines.


	c.	From the diagram, the following pairs of treatments are significantly different because they are not connected by a line: A and E, A and B, and A and D. All other pairs of means are not significantly different because they are connected by lines.

	d.	From the diagram, the following pairs of treatments are significantly different because they are not connected by a line: A and E, A and B, A and D, C and E, C and B, C and D, E and D, and B and D. All other pairs of means are not significantly different because they are connected by lines.

8.38	a.	The total number of comparisons conducted is k(k − 1)/2 = 4(4 − 1)/2 = 6.

	b.	The mean energy expended by robots in the 12-robot colony is significantly smaller than the mean energy expended by robots in any of the other size colonies. There is no difference in the mean energy expended by robots in the 3-robot colony, the 6-robot colony, and the 9-robot colony.

8.40	a.	There will be c = k(k − 1)/2 = 3(3 − 1)/2 = 3 pairwise comparisons.

	b.	Comparing the mean safety scores for government officials and journalists, the difference in mean safety scores is 4.2 − 3.7 = .5. The critical value for the Tukey comparison is .23. Since .5 > .23, we conclude that the mean safety score for government officials is higher than the mean safety score for journalists.

		Comparing the mean safety scores for government officials and scientists, the difference in mean safety scores is 4.2 − 4.1 = .1. Since .1 < .23, we conclude that there is no difference in mean safety scores between government officials and scientists.

		Comparing the mean safety scores for scientists and journalists, the difference in mean safety scores is 4.1 − 3.7 = .4. The critical value for the Tukey comparison is .23. Since .4 > .23, we conclude that the mean safety score for scientists is higher than the mean safety score for journalists.

		A display of these conclusions is:

		Journalists    Scientists    Gov. Officials
		    3.7            4.1             4.2

8.42	a.	The probability of declaring at least one pair of means different when they are not is α = .01.
	b.	There are a total of k(k − 1)/2 = 3(3 − 1)/2 = 3 pairwise comparisons. They are:

		Under $30 thousand to Between $30 and $60 thousand
		Under $30 thousand to Over $60 thousand
		Between $30 and $60 thousand to Over $60 thousand

	c.	Means for groups in homogeneous subsets are displayed in the table:

		Income Group         N    Subset 1    Subset 2
		Under $30,000      379       4.60
		$30,000-$60,000    392                   5.08
		Over $60,000       267                   5.15

	d.	Two of the comparisons in part b will yield confidence intervals that do not contain 0. They are:

		Under $30 thousand to Between $30 and $60 thousand
		Under $30 thousand to Over $60 thousand

8.44	From Exercise 8.30, we found that there were differences in the mean carats among the 6 levels of color.

	From Exercise 8.30, the mean carats for the 6 colors are:

	G        F        E        D        H        I
	0.5808   0.5929   0.6232   0.6381   0.6734   0.7310

	Using MINITAB, the Tukey confidence intervals are:

	Tukey's pairwise comparisons

	Family error rate = 0.100
	Individual error rate = 0.0101
	Critical value = 3.66

	Intervals for (column level mean) − (row level mean)

	          D          E          F          G          H
	E    -0.1926
	      0.2225

	F    -0.1491    -0.1026
	      0.2395     0.1631

	G    -0.1411    -0.0964    -0.1059
	      0.2558     0.1812     0.1302

	H    -0.2350    -0.1909    -0.2007    -0.2194
	      0.1644     0.0904     0.0397     0.0341

	I    -0.3032    -0.2631    -0.2752    -0.2931    -0.2022
	      0.1174     0.0475    -0.0010    -0.0074     0.0871

	There are only 2 intervals that do not contain 0:

	The confidence interval for the difference in mean carats between colors G and I is (−0.2931, −0.0074). The confidence interval for the difference in mean carats between colors F and I is (−0.2752, −0.0010). Since 0 is not contained in these confidence intervals, there is sufficient evidence of a difference in the mean number of carats between colors G and I and between colors F and I. No other differences exist.
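	If the raw carat measurements were available, the same style of Tukey comparison could be produced directly with statsmodels; a hedged sketch (the arrays below are placeholder data, not the diamond data set used in Exercise 8.30):

	    # Sketch: Tukey multiple comparisons of a response by group (statsmodels).
	    import numpy as np
	    from statsmodels.stats.multicomp import pairwise_tukeyhsd

	    # Placeholder data: a few hypothetical carat values for each of the 6 colors.
	    rng = np.random.default_rng(0)
	    colors = np.repeat(["D", "E", "F", "G", "H", "I"], 10)
	    carats = rng.normal(loc=0.63, scale=0.27, size=colors.size)

	    result = pairwise_tukeyhsd(endog=carats, groups=colors, alpha=0.10)
	    print(result.summary())   # pairwise differences with simultaneous intervals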
8.46	a.	There are 3 blocks used since Block df = b − 1 = 2, and 5 treatments since the treatment df = k − 1 = 4.

	b.	There were 15 observations since the Total df = n − 1 = 14.

	c.	H0: μ1 = μ2 = μ3 = μ4 = μ5
		Ha: At least two treatment means differ

	d.	The test statistic is F = MST/MSE = 9.109

	e.	The rejection region requires α = .01 in the upper tail of the F distribution with ν1 = k − 1 = 5 − 1 = 4 and ν2 = n − k − b + 1 = 15 − 5 − 3 + 1 = 8. From Table XI, Appendix B, F.01 = 7.01. The rejection region is F > 7.01.

	f.	Since the observed value of the test statistic falls in the rejection region (F = 9.109 > 7.01), H0 is rejected. There is sufficient evidence to indicate that at least two treatment means differ at α = .01.

	g.	The assumptions necessary to assure the validity of the test are as follows:
		1.	The probability distributions of observations corresponding to all the block-treatment combinations are normal.
		2.	The variances of all the probability distributions are equal.

8.48	a.	The ANOVA Table is as follows:

		Source       df      SS       MS         F
		Treatment     2    12.032    6.016    50.958
		Block         3    71.749   23.916   202.586
		Error         6      .708     .118
		Total        11    84.489

	b.	To determine if the treatment means differ, we test:

		H0: μA = μB = μC
		Ha: At least two treatment means differ

		The test statistic is F = MST/MSE = 50.958

		The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k − 1 = 3 − 1 = 2 and ν2 = n − k − b + 1 = 12 − 3 − 4 + 1 = 6. From Table IX, Appendix B, F.05 = 5.14. The rejection region is F > 5.14.

		Since the observed value of the test statistic falls in the rejection region (F = 50.958 > 5.14), H0 is rejected. There is sufficient evidence to indicate that the treatment means differ at α = .05.

	c.	To see if the blocking was effective, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least two block means differ

		The test statistic is F = MSB/MSE = 202.586

		The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = b − 1 = 4 − 1 = 3 and ν2 = n − k − b + 1 = 12 − 3 − 4 + 1 = 6. From Table IX, Appendix B, F.05 = 4.76. The rejection region is F > 4.76.

		Since the observed value of the test statistic falls in the rejection region (F = 202.586 > 4.76), H0 is rejected. There is sufficient evidence to indicate that blocking was effective in reducing the experimental error at α = .05.

	d.	From the printouts, we are given the differences in the sample means. The differences between Treatment B and both Treatments A and C are positive (1.125 and 2.450), so Treatment B has the largest sample mean. The difference between Treatments A and C is positive (1.325), so Treatment A has a larger sample mean than Treatment C. So Treatment B has the largest sample mean, Treatment A has the next largest sample mean, and Treatment C has the smallest sample mean.

		From the printout, all the means are significantly different from each other.

	e.	The assumptions necessary to assure the validity of the inferences above are:
		1.	The probability distributions of observations corresponding to all the block-treatment combinations are normal.
		2.	The variances of all the probability distributions are equal.
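	The mean squares, F ratios, and critical values quoted in Exercise 8.48 can be checked directly from the sums of squares and degrees of freedom in the table; a small Python sketch (values match the table up to rounding):

	    # Sketch: mean squares, F ratios, and F critical values for Exercise 8.48.
	    from scipy import stats

	    SS = {"Treatment": 12.032, "Block": 71.749, "Error": 0.708}
	    df = {"Treatment": 2, "Block": 3, "Error": 6}

	    MSE = SS["Error"] / df["Error"]                          # 0.118
	    for source in ("Treatment", "Block"):
	        MS = SS[source] / df[source]
	        F = MS / MSE                                         # about 51 and about 203
	        F_crit = stats.f.ppf(0.95, df[source], df["Error"])  # 5.14 and 4.76
	        print(source, round(MS, 3), round(F, 2), round(F_crit, 2))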

8.50	a.	This is a randomized block design. The blocks are the 12 plots of land. The treatments are the three methods used on the shrubs: fire, clipping, and control. The response variable is the mean number of flowers produced. The experimental units are the 36 shrubs.

	b.	The design can be diagrammed with each of the 12 plots of land as a block and the three treatments (fire, clipping, and control) applied to the shrubs within each plot.

	c.	To determine if there is a difference in the mean number of flowers produced among the three treatments, we test:

		H0: μ1 = μ2 = μ3
		Ha: The mean number of flowers produced differs for at least two of the methods.

		The test statistic is F = 5.42 and p = .009. We can reject the null hypothesis at any α > .009 level of significance. At least two of the methods differ with respect to mean number of flowers produced by pawpaws.

	d.	The means of Control and Clipping do not differ significantly. The means of Clipping and Burning do not differ significantly. The mean of treatment Burning exceeds that of the Control.

8.52	From the printout, the p-value for treatments or Decoy is p = .589. Since the p-value is not small, we cannot reject H0. There is insufficient evidence to indicate a difference in mean percentage of a goose flock to approach to within 46 meters of the pit blind among the three decoy types. This conclusion is valid for any reasonable value of α.

8.54	Using SAS, the ANOVA Table is:

	The ANOVA Procedure
	Dependent Variable: temp

	Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
	Model              11       18.53700000     1.68518182       0.52    0.8634
	Error              18       58.03800000     3.22433333
	Corrected Total    29       76.57500000

	R-Square    Coeff Var    Root MSE    temp Mean
	0.242076     1.885189    1.795643     95.25000

	Source      DF       Anova SS    Mean Square    F Value    Pr > F
	STUDENT      9    18.41500000     2.04611111       0.63    0.7537
	PLANT        2     0.12200000     0.06100000       0.02    0.9813

	To determine if there are differences among the mean temperatures among the three treatments, we test:

	H0: μ1 = μ2 = μ3
	Ha: At least two treatment means differ

	The test statistic is F = 0.02. The associated p-value is p = .9813. Since the p-value is very large, there is no evidence of a difference in mean temperature among the three treatments. Since there is no difference, we do not need to compare the means. It appears that the presence of plants or pictures of plants does not reduce stress.
8.56	a.	Some preliminary calculations are:

		CM = (Σy)²/n = 2.95²/20 = .435125

		SS(Total) = Σy² − CM = .4705 − .435125 = .035375

		SST = SS(DRUG) = T1²/b + T2²/b − CM = 1.62²/10 + 1.33²/10 − .435125 = .004205

		MST = SST/(k − 1) = .004205/(2 − 1) = .004205,  df = k − 1 = 1

		SSB = SS(DOG) = (B1² + B2² + ⋯ + B10²)/k − CM
		    = (.32² + .38² + .27² + .36² + .42² + .31² + .19² + .19² + .3² + .21²)/2 − .435125 = .028925

		MSB = SSB/(b − 1) = .028925/(10 − 1) = .003214,  df = b − 1 = 9

		SSE = SS(Total) − SST − SSB = .035375 − .004205 − .028925 = .002245

		MSE = SSE/(n − k − b + 1) = .002245/(20 − 2 − 10 + 1) = .0002494

		F = MST/MSE = .004205/.0002494 = 16.86

		F = MSB/MSE = .003214/.0002494 = 12.89

		To determine if there is a difference in mean pressure readings for the two treatments, we test:

		H0: μA = μB
		Ha: μA ≠ μB

		The test statistic is F = MST/MSE = 16.86

		The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k − 1 = 2 − 1 = 1 and ν2 = n − k − b + 1 = 20 − 2 − 10 + 1 = 9. From Table IX, Appendix B, F.05 = 5.12. The rejection region is F > 5.12.

		Since the observed value of the test statistic falls in the rejection region (F = 16.86 > 5.12), H0 is rejected. There is sufficient evidence to indicate a difference in mean pressure readings for the two drugs at α = .05.

	b.	Since there is expected to be much variation between the dogs, we use the dogs as blocks to eliminate this identified source of variation.

	c.	Dog    Drug A    Drug B    Difference (A − B)
		 1      .17       .15           .02
		 2      .20       .18           .02
		 3      .14       .13           .01
		 4      .18       .18           .00
		 5      .23       .19           .04
		 6      .19       .12           .07
		 7      .12       .07           .05
		 8      .10       .09           .01
		 9      .16       .14           .02
		10      .13       .08           .05

		Some preliminary calculations are:

		d̄ = Σdi/nd = .29/10 = .029

		sd² = [Σdi² − (Σdi)²/nd]/(nd − 1) = [.0129 − (.29)²/10]/(10 − 1) = .00449/9 = .0004989

		sd = √sd² = √.0004989 = .02234

		To determine if there is a difference in mean pressure readings for the two treatments, we test:

		H0: μA = μB
		Ha: μA ≠ μB

		The test statistic is t = (d̄ − 0)/(sd/√nd) = (.029 − 0)/(.02234/√10) = 4.105

		The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with df = nd − 1 = 10 − 1 = 9. From Table VI, Appendix B, t.025 = 2.262. The rejection region is t < −2.262 or t > 2.262.

		Since the observed value of the test statistic falls in the rejection region (t = 4.105 > 2.262), H0 is rejected. There is sufficient evidence to indicate a difference in the treatment means at α = .05.

	d.	In part a, F = 16.86; and in part c, t = 4.105. Note that t² = 4.105² = 16.85 ≈ F.

		In part a, F.05 = 5.12; and in part c, t.025 = 2.262. Note that t.025² = 2.262² = 5.12 = F.05.

	e.	p-value = P(F ≥ 16.86) with ν1 = 1 and ν2 = 9.

		Using Table XI, Appendix B, F.01 = 10.56, so P(F ≥ 16.86) < .01. Thus, the p-value is < .01.

		The probability of a test statistic this extreme if the treatment means are the same is less than .01. This is very significant. We would reject H0 in favor of Ha if α is larger than the p-value.
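		The equivalence noted in part d can be verified numerically from the data in part c; a minimal sketch using SciPy's paired t-test:

		    # Sketch: paired t-test for the dog blood-pressure data (Exercise 8.56 c/d).
		    from scipy import stats

		    drug_a = [.17, .20, .14, .18, .23, .19, .12, .10, .16, .13]
		    drug_b = [.15, .18, .13, .18, .19, .12, .07, .09, .14, .08]

		    t, p = stats.ttest_rel(drug_a, drug_b)
		    # t is about 4.105 and t**2 is about 16.9, the randomized block F
		    print(round(t, 3), round(t**2, 2))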

8.58	a.	There are two factors.

	b.	No, we cannot tell whether the factors are qualitative or quantitative.

	c.	Yes. There are four levels of factor A and three levels of factor B.

	d.	A treatment would consist of a combination of one level of factor A and one level of factor B. There are a total of 4 × 3 = 12 treatments.

	e.	One problem with only one replicate is that there are no degrees of freedom for error. This is overcome by having at least two replicates.

8.60	a.	Factor A has 3 + 1 = 4 levels and factor B has 1 + 1 = 2 levels.

	b.	There are a total of 23 + 1 = 24 observations and 4 × 2 = 8 treatments. Therefore, there were 24/8 = 3 observations for each treatment.

	c.	AB df = (a − 1)(b − 1) = (4 − 1)(2 − 1) = 3
		Error df = n − ab = 24 − 4(2) = 16

		MSA = SSA/(a − 1) ⇒ SSA = (a − 1)MSA = (4 − 1)(.75) = 2.25
		MSB = SSB/(b − 1) = .95/(2 − 1) = .95
		MSAB = SSAB/[(a − 1)(b − 1)] ⇒ SSAB = (a − 1)(b − 1)MSAB = (4 − 1)(2 − 1)(.30) = .9
		SSE = SS(Total) − SSA − SSB − SSAB = 6.5 − 2.25 − .95 − .9 = 2.4
		MSE = SSE/(n − ab) = 2.4/[24 − 4(2)] = .15

		SST = SSA + SSB + SSAB = 2.25 + .95 + .90 = 4.1
		Treatment df = ab − 1 = 4(2) − 1 = 7
		MST = SST/(ab − 1) = 4.1/7 = .5857        FT = MST/MSE = .5857/.15 = 3.90
		FA = MSA/MSE = .75/.15 = 5.00             FB = MSB/MSE = .95/.15 = 6.33
		FAB = MSAB/MSE = .30/.15 = 2.00

		The ANOVA table is:

		Source        df     SS      MS      F
		Treatments     7    4.1     .59   3.90
		  A            3   2.25     .75   5.00
		  B            1    .95     .95   6.33
		  AB           3    .90     .30   2.00
		Error         16   2.40     .15
		Total         23   6.50
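		A short sketch that rebuilds the Exercise 8.60 quantities from the given mean squares (MSA = .75, MSB = .95, MSAB = .30, SS(Total) = 6.5) and confirms the F ratios:

		    # Sketch: reconstruct the 8.60 ANOVA quantities from the given mean squares.
		    a, b, n = 4, 2, 24
		    MSA, MSB, MSAB, SS_total = .75, .95, .30, 6.5

		    SSA = (a - 1) * MSA                  # 2.25
		    SSB = (b - 1) * MSB                  # 0.95
		    SSAB = (a - 1) * (b - 1) * MSAB      # 0.90
		    SSE = SS_total - SSA - SSB - SSAB    # 2.40
		    MSE = SSE / (n - a * b)              # 0.15

		    for name, MS in (("A", MSA), ("B", MSB), ("AB", MSAB)):
		        print(name, round(MS / MSE, 2))  # F ratios 5.00, 6.33, 2.00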

	d.	To determine whether the treatment means differ, we test:

		H0: μ1 = μ2 = ⋯ = μ8
		Ha: At least two treatment means differ

		The test statistic is F = MST/MSE = 3.90

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = ab − 1 = 4(2) − 1 = 7 and ν2 = n − ab = 24 − 4(2) = 16. From Table VIII, Appendix B, F.10 = 2.13. The rejection region is F > 2.13.

		Since the observed value of the test statistic falls in the rejection region (F = 3.90 > 2.13), H0 is rejected. There is sufficient evidence to indicate the treatment means differ at α = .10.

	e.	To determine if the factors interact, we test:

		H0: Factors A and B do not interact to affect the response mean
		Ha: Factors A and B do interact to affect the response mean

		The test statistic is F = 2.00.

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = (a − 1)(b − 1) = (4 − 1)(2 − 1) = 3 and ν2 = n − ab = 24 − 4(2) = 16. From Table VIII, Appendix B, F.10 = 2.46. The rejection region is F > 2.46.

		Since the observed value of the test statistic does not fall in the rejection region (F = 2.00 ≯ 2.46), H0 is not rejected. There is insufficient evidence to indicate factors A and B interact at α = .10.

		To determine if the four means of factor A differ, we test:

		H0: There is no difference in the four means of factor A
		Ha: At least two of the factor A means differ

		The test statistic is F = 5.00.

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = a − 1 = 4 − 1 = 3 and ν2 = n − ab = 24 − 4(2) = 16. From Table VIII, Appendix B, F.10 = 2.46. The rejection region is F > 2.46.

		Since the observed value of the test statistic falls in the rejection region (F = 5.00 > 2.46), H0 is rejected. There is sufficient evidence to indicate at least two of the four means of factor A differ at α = .10.

		To determine if the 2 means of factor B differ, we test:

		H0: There is no difference in the two means of factor B
		Ha: At least two of the factor B means differ

		The test statistic is F = 6.33.

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = b − 1 = 2 − 1 = 1 and ν2 = n − ab = 24 − 4(2) = 16. From Table VIII, Appendix B, F.10 = 3.05. The rejection region is F > 3.05.

		Since the observed value of the test statistic falls in the rejection region (F = 6.33 > 3.05), H0 is rejected. There is sufficient evidence to indicate the two means of factor B differ at α = .10.

		All of the tests performed are warranted because interaction was not significant.
8.62	a.	The treatments are the combinations of the levels of factor A and the levels of factor B. There are 2 × 2 = 4 treatments. The treatment means are:

		x̄11 = (29.6 + 35.2)/2 = 32.4        x̄12 = (47.3 + 42.1)/2 = 44.7
		x̄21 = (12.9 + 17.6)/2 = 15.25       x̄22 = (28.4 + 22.7)/2 = 25.55

		The factors do not appear to interact; the lines joining the treatment means are almost parallel. The treatment means do appear to differ because the sample means range from 15.25 to 44.7.

	b.	CM = (Σx)²/n = 235.8²/8 = 6950.205

		SS(Total) = Σx² − CM = 7922.92 − 6950.205 = 972.715

		SSA = ΣAi²/(br) − CM = 154.2²/[2(2)] + 81.6²/[2(2)] − 6950.205 = 7609.05 − 6950.205 = 658.845

		SSB = ΣBj²/(ar) − CM = 95.3²/[2(2)] + 140.5²/[2(2)] − 6950.205 = 7205.585 − 6950.205 = 255.38

		SSAB = ΣABij²/r − SSA − SSB − CM
		     = (64.8² + 89.4² + 30.5² + 51.1²)/2 − 658.845 − 255.38 − 6950.205 = 7866.43 − 7864.43 = 2

		SSE = SS(Total) − SSA − SSB − SSAB = 972.715 − 658.845 − 255.38 − 2 = 56.49

		df:  A: a − 1 = 2 − 1 = 1
		     B: b − 1 = 2 − 1 = 1
		     AB: (a − 1)(b − 1) = (2 − 1)(2 − 1) = 1
		     Error: n − ab = 8 − 2(2) = 4
		     Total: n − 1 = 8 − 1 = 7

		MSA = SSA/(a − 1) = 658.845/1 = 658.845        MSB = SSB/(b − 1) = 255.38/1 = 255.38
		MSAB = SSAB/[(a − 1)(b − 1)] = 2/1 = 2         MSE = SSE/(n − ab) = 56.49/4 = 14.1225

		FA = MSA/MSE = 658.845/14.1225 = 46.65
		FB = MSB/MSE = 255.38/14.1225 = 18.08
		FAB = MSAB/MSE = 2/14.1225 = .14

		The ANOVA table is:

		Source    df        SS         MS       F
		A          1   658.845    658.845   46.65
		B          1   255.380    255.380   18.08
		AB         1     2.000      2.000     .14
		Error      4    56.490    14.1225
		Total      7   972.715

	c.	SST = SSA + SSB + SSAB = 658.845 + 255.380 + 2.000 = 916.225

		df = ab − 1 = 2(2) − 1 = 3

		MST = SST/(ab − 1) = 916.225/3 = 305.408        FT = MST/MSE = 305.408/14.1225 = 21.63

		To determine whether the treatment means differ, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least two of the treatment means differ

		The test statistic is F = 21.63.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = ab − 1 = 2(2) − 1 = 3 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 6.59. The rejection region is F > 6.59.

		Since the observed value of the test statistic falls in the rejection region (F = 21.63 > 6.59), H0 is rejected. There is sufficient evidence to indicate the treatment means differ at α = .05.

		This agrees with the conclusion in part a.
	d.	Since there are differences among the treatment means, we test for the presence of interaction:

		H0: Factors A and B do not interact to affect the response means
		Ha: Factors A and B do interact to affect the response means

		The test statistic is F = .14.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = (a − 1)(b − 1) = (2 − 1)(2 − 1) = 1 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 7.71. The rejection region is F > 7.71.

		Since the observed value of the test statistic does not fall in the rejection region (F = .14 ≯ 7.71), H0 is not rejected. There is insufficient evidence to indicate the factors interact at α = .05.

	e.	Since the interaction was not significant, we test for main effects.

		To determine whether the two means of factor A differ, we test:

		H0: μ1 = μ2
		Ha: μ1 ≠ μ2

		The test statistic is F = 46.65.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = a − 1 = 2 − 1 = 1 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 7.71. The rejection region is F > 7.71.

		Since the observed value of the test statistic falls in the rejection region (F = 46.65 > 7.71), H0 is rejected. There is sufficient evidence to indicate the two means of factor A differ at α = .05.

		To determine whether the two means of factor B differ, we test:

		H0: μ1 = μ2
		Ha: μ1 ≠ μ2

		The test statistic is F = 18.08.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = b − 1 = 2 − 1 = 1 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 7.71. The rejection region is F > 7.71.

		Since the observed value of the test statistic falls in the rejection region (F = 18.08 > 7.71), H0 is rejected. There is sufficient evidence to indicate the two means of factor B differ at α = .05.

	f.	The results of all the tests agree with those in part a.

	g.	Since no interaction is present, but the means of both factors A and B differ, we compare the two means of factor A and compare the two means of factor B. Since there are only two means to compare for each factor, the higher population mean corresponds to the higher sample mean.

		Factor A:  x̄1 = ΣA1/(br) = (29.6 + 35.2 + 47.3 + 42.1)/[2(2)] = 38.55
		           x̄2 = ΣA2/(br) = (12.9 + 17.6 + 28.4 + 22.7)/[2(2)] = 20.4

		The mean for level 1 of factor A is significantly higher than the mean for level 2.

		Factor B:  x̄1 = ΣB1/(ar) = (29.6 + 35.2 + 12.9 + 17.6)/[2(2)] = 23.825
		           x̄2 = ΣB2/(ar) = (47.3 + 42.1 + 28.4 + 22.7)/[2(2)] = 35.125

		The mean for level 2 of factor B is significantly higher than the mean for level 1.
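		Since the eight observations are listed in part a, the whole decomposition can be reproduced in a few lines; a sketch (two replicates per factor-level combination):

		    # Sketch: 2x2 factorial decomposition for Exercise 8.62 from the raw data.
		    import numpy as np

		    # cells[i][j] holds the r = 2 replicates for level i of A and level j of B
		    cells = np.array([[[29.6, 35.2], [47.3, 42.1]],
		                      [[12.9, 17.6], [28.4, 22.7]]])
		    a, b, r = 2, 2, 2
		    n = a * b * r

		    CM = cells.sum() ** 2 / n
		    SS_total = (cells ** 2).sum() - CM
		    SSA = (cells.sum(axis=(1, 2)) ** 2).sum() / (b * r) - CM
		    SSB = (cells.sum(axis=(0, 2)) ** 2).sum() / (a * r) - CM
		    SSAB = (cells.sum(axis=2) ** 2).sum() / r - SSA - SSB - CM
		    SSE = SS_total - SSA - SSB - SSAB

		    MSE = SSE / (n - a * b)
		    MSA = SSA / (a - 1)
		    MSB = SSB / (b - 1)
		    MSAB = SSAB / ((a - 1) * (b - 1))
		    # F values: about 46.65 (A), 18.08 (B), 0.14 (AB)
		    print(round(MSA / MSE, 2), round(MSB / MSE, 2), round(MSAB / MSE, 2))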
8.64	a.	There are a total of 2 × 4 = 8 treatments.

	b.	The interaction between temperature and type was significant. This means that the effect of type of yeast on the mean autolysis yield depends on the level of temperature.

	c.	To determine if the main effect of type of yeast is significant, we test:

		H0: μBa = μBr
		Ha: μBa ≠ μBr

		To determine if the main effect of temperature is significant, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least one mean differs

	d.	The tests for the main effects should not be run until after the test for interaction is conducted. If interaction is significant, then these interaction effects could cover up the main effects. Thus, the main effect tests would not be informative.

		If the test for interaction is not significant, then the main effect tests could be run.

	e.	Baker's yeast:

		The mean yield for temperature 54° is significantly lower than the mean yields for the other 3 temperatures. There is no difference in the mean yields for the temperatures 45°, 48°, and 51°.

		Brewer's yeast:

		The mean yield for temperature 54° is significantly lower than the mean yields for the other 3 temperatures. There is no difference in the mean yields for the temperatures 45°, 48°, and 51°.

8.66	a.	This is an observational experiment. The researcher recorded the number of users per hour for each of 24 hours per day, 7 days per week, for 7 weeks. The researcher did not manipulate the weeks or days or hours.

	b.	The two factors are (1) the day of the week with 7 levels and (2) the hour of the day with 24 levels.

	c.	In a factorial experiment, a is the number of levels of factor A and b is the number of levels of factor B. If we let factor A be the day of the week and factor B be the hour of the day, then a = 7 and b = 24.

	d.	To determine if the a × b = 7 × 24 = 168 treatment means differ, we test:

		H0: μ1 = μ2 = μ3 = ⋯ = μ168
		Ha: At least two means differ

		The test statistic is F = MST/MSE = 1143.99/45.65 = 25.06

		The rejection region requires α = .01 in the upper tail of the F distribution with ν1 = p − 1 = 168 − 1 = 167 and ν2 = n − p = 1172 − 168 = 1004. From Table XI, Appendix B, F.01 ≈ 1.00. The rejection region is F > 1.00.

		Since the observed value of the test statistic falls in the rejection region (F = 25.06 > 1.00), H0 is rejected. There is sufficient evidence to indicate a difference in mean usage among the day-hour combinations at α = .01.

	e.	The hypotheses used to test if an interaction effect exists are:

		H0: Days and hours do not interact to affect the mean usage
		Ha: Days and hours do interact to affect the mean usage

	f.	The test statistic is F = MSAB/MSE = 55.69/45.65 = 1.22

		The p-value is p = .0527. Since the p-value is not less than α = .01, H0 is not rejected. There is insufficient evidence to indicate days and hours interact to affect usage at α = .01.

	g.	To determine if the mean usage differs among the days of the week, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6 = μ7
		Ha: At least two means differ

		The test statistic is F = MSA/MSE = 3122.02/45.65 = 68.39

		The p-value is p = .0001. Since the p-value is less than α = .01, H0 is rejected. There is sufficient evidence to indicate the mean usage differs among the days of the week at α = .01.

		To determine if the mean usage differs among the hours of the day, we test:

		H0: μ1 = μ2 = μ3 = ⋯ = μ24
		Ha: At least two means differ

		The test statistic is F = MSB/MSE = 7157.82/45.65 = 156.80

		The p-value is p = .0001. Since the p-value is less than α = .01, H0 is rejected. There is sufficient evidence to indicate the mean usage differs among the hours of the day at α = .01.
8.68	a.	The degrees of freedom for Type of message retrieval system is a − 1 = 2 − 1 = 1. The degrees of freedom for Pricing option is b − 1 = 2 − 1 = 1. The degrees of freedom for the interaction of Type of message retrieval system and Pricing option is (a − 1)(b − 1) = (2 − 1)(2 − 1) = 1. The degrees of freedom for error is n − ab = 120 − 2(2) = 116.

		Source                                 df    SS    MS       F
		Type of message retrieval system        1     -     -   2.001
		Pricing option                          1     -     -   5.019
		Type of system × pricing option         1     -     -   4.986
		Error                                 116
		Total                                 119

	b.	To determine if "Type of system" and "Pricing option" interact to affect the mean willingness to buy, we test:

		H0: Type of system and Pricing option do not interact
		Ha: Type of system and Pricing option interact

	c.	The test statistic is F = MSAB/MSE = 4.986

		The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = (a − 1)(b − 1) = (2 − 1)(2 − 1) = 1 and ν2 = n − ab = 120 − 2(2) = 116. From Table IX, Appendix B, F.05 ≈ 3.92. The rejection region is F > 3.92.

		Since the observed value of the test statistic falls in the rejection region (F = 4.986 > 3.92), H0 is rejected. There is sufficient evidence to indicate "Type of system" and "Pricing option" interact to affect the mean willingness to buy at α = .05.

	d.	No. Since the test in part c indicated that interaction between "Type of system" and "Pricing option" is present, we should not test for the main effects. Instead, we should proceed directly to a multiple comparison procedure to compare selected treatment means. If interaction is present, it can cover up the main effects.

8.70	a.	The treatments are the 3 × 3 = 9 combinations of PES and Trust. The nine treatments are: (BC, Low), (PC, Low), (NA, Low), (BC, Med), (PC, Med), (NA, Med), (BC, High), (PC, High), and (NA, High).

	b.	df(PES) = 3 − 1 = 2;  df(Trust) = 3 − 1 = 2

		SSE = SSTot − SS(PES) − SS(Trust) − SS(PT)
		    = 161.1162 − 2.1774 − 7.6367 − 1.7380 = 149.5641

		MS(PES) = SS(PES)/df(PES) = 2.1774/2 = 1.0887
		MS(Trust) = SS(Trust)/df(Trust) = 7.6367/2 = 3.81835
		MS(PT) = SS(PT)/df(PT) = 1.7380/4 = 0.4345
		MSE = SSE/df(Error) = 149.5641/206 = 0.7260

		FPES = MS(PES)/MSE = 1.0887/0.7260 = 1.50
		FTrust = MS(Trust)/MSE = 3.81835/0.7260 = 5.26
		FPT = MS(PT)/MSE = 0.4345/0.7260 = 0.60

		The ANOVA table is:

		Source         df         SS         MS      F
		PES             2     2.1774     1.0887   1.50
		Trust           2     7.6367    3.81835   5.26
		PES × Trust     4     1.7380     0.4345   0.60
		Error         206   149.5641     0.7260
		Total         214   161.1162

To determine if PES and Trust interact, we test:


H0: PES and Trust do not interact to affect the mean tension
Ha: PES and Trust do interact to affect the mean tension
The test statistic is F = 0.60.

282

Chapter 8

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

The rejection region requires = .05 in the upper tail of the F-distribution with 1 =
(a 1)(b 1) = (3 1)(3 1) = 4 and 2 = n ab = 215 3(3) = 206. From Table IX,
Appendix B, F.05 2.37. The rejection region is F > 2.37.
Since the observed value of the test statistic does not fall in the rejection region (F = 0.60
>/ 2.37), H0 is not rejected. There is insufficient evidence to indicate that PES and Trust
interact at = .05.
d.

The plot of the treatment means is:


The mean tension scores for Low
Trust are relatively the same for each
level of PES. Similarly, the mean
tension scores for Medium Trust are
relatively the same for each level of
PES. However, the mean tension
scores for High Trust are not the
same for each level of PES. For both
PES levels BC and PC, as the level of
trust increases, the mean tension
scores decrease. However, for PES
level NA, as trust goes from low to medium, the mean tension decreases. As the trust
goes from medium to high, the mean tension increases. This indicates that interaction is
present which was also found in part d.

e.

8.72

Because the interaction of PES and Trust was found to be significant, the tests for the
main effects are irrelevant. If the factors interact, the interaction effect can cover up any
main effect differences. In addition, interaction implies that the effects of one factor on
the dependent variable are different at different levels of the second factor. Thus, there is
no one "main" effect of the factor.

8.72	Using MINITAB, the ANOVA results are:

	General Linear Model: Deviation versus Group, Trail

	Factor   Type    Levels   Values
	Group    fixed        4   F G M N
	Trail    fixed        2   C E

	Analysis of Variance for Deviation, using Adjusted SS for Tests

	Source         DF     Seq SS     Adj SS    Adj MS       F       P
	Group           3    16271.2    13000.6    4333.5    5.91   0.001
	Trail           1    46445.5    46445.5   46445.5   63.34   0.000
	Group*Trail     3     2245.2     2245.2     748.4    1.02   0.386
	Error         112    82131.7    82131.7     733.3
	Total         119   147093.6

	First, we must test for treatment effects.

	SST = SS(Group) + SS(Trail) + SS(Group×Trail) = 16,271.2 + 46,445.5 + 2,245.2 = 64,961.9. The df = 3 + 1 + 3 = 7.

	MST = SST/(ab − 1) = 64,961.9/[4(2) − 1] = 9,280.2714

	F = MST/MSE = 9,280.2714/733.3 = 12.66

	To determine if there are differences in mean ratings among the 8 treatments, we test:

	H0: All treatment means are the same
	Ha: At least two treatment means differ

	The test statistic is F = 12.66.

	Since no α was given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = ab − 1 = 4(2) − 1 = 7 and ν2 = n − ab = 120 − 4(2) = 112. From Table IX, Appendix B, F.05 ≈ 2.09. The rejection region is F > 2.09.

	Since the observed value of the test statistic falls in the rejection region (F = 12.66 > 2.09), H0 is rejected. There is sufficient evidence that differences exist among the treatment means at α = .05. Since differences exist, we now test for the interaction effect between Trail and Group.

	To determine if Trail and Group interact, we test:

	H0: Trail and Group do not interact
	Ha: Trail and Group do interact

	The test statistic is F = 1.02 and p = .386.

	Since the p-value is greater than α (p = .386 > .05), H0 is not rejected. There is insufficient evidence that Trail and Group interact at α = .05. Since the interaction does not exist, we test for the main effects of Trail and Group.

	To determine if there are differences in the mean rating between the two levels of Trail, we test:

	H0: μ1 = μ2
	Ha: μ1 ≠ μ2

	The test statistic is F = 63.34 and p = 0.000.

	Since the p-value is less than α (p = .000 < .05), H0 is rejected. There is sufficient evidence that the mean trail deviations differ between the fecal extract trail and the control trail at α = .05.

	To determine if there are differences in the mean rating among the four levels of Group, we test:

	H0: μ1 = μ2 = μ3 = μ4
	Ha: At least 2 means differ

	The test statistic is F = 5.91 and p = 0.001.

	Since the p-value is less than α (p = 0.001 < .05), H0 is rejected. There is sufficient evidence that the mean trail deviations differ among the four groups at α = .05.
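	The critical value and p-value used for the overall treatment test above can be obtained without the tables; a small SciPy sketch:

	    # Sketch: critical value and p-value for the overall treatment F test in 8.72.
	    from scipy import stats

	    F, df_num, df_den = 12.66, 7, 112
	    print(round(stats.f.ppf(0.95, df_num, df_den), 2))   # critical value, about 2.09
	    print(stats.f.sf(F, df_num, df_den))                 # p-value, far below .05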


8.74	There are 3 × 2 = 6 treatments. They are A1B1, A1B2, A2B1, A2B2, A3B1, and A3B2.

8.76	a.	SSE = SSTot − SST = 62.55 − 36.95 = 25.60

		df Treatment = p − 1 = 4 − 1 = 3
		df Error = n − p = 20 − 4 = 16
		df Total = n − 1 = 20 − 1 = 19

		MST = SST/df = 36.95/3 = 12.32
		MSE = SSE/df = 25.60/16 = 1.60
		F = MST/MSE = 12.32/1.60 = 7.70

		The ANOVA table:

		Source       df     SS      MS      F
		Treatment     3   36.95   12.32   7.70
		Error        16   25.60    1.60
		Total        19   62.55

	b.	To determine if there is a difference in the treatment means, we test:

		H0: μ1 = μ2 = μ3 = μ4
		Ha: At least two of the means differ

		where μi represents the mean for the ith treatment.

		The test statistic is F = MST/MSE = 7.70

		The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = (p − 1) = (4 − 1) = 3 and ν2 = (n − p) = (20 − 4) = 16. From Table VIII, Appendix B, F.10 = 2.46. The rejection region is F > 2.46.

		Since the observed value of the test statistic falls in the rejection region (F = 7.70 > 2.46), H0 is rejected. There is sufficient evidence to conclude that at least two of the means differ at α = .10.

	c.	x̄4 = Σy4/n4 = 57/5 = 11.4

		For confidence level .90, α = .10 and α/2 = .10/2 = .05. From Table VI, Appendix B, with df = 16, t.05 = 1.746. The confidence interval is:

		x̄4 ± t.05 √(MSE/n4) ⇒ 11.4 ± 1.746√(1.6/5) ⇒ 11.4 ± .99 ⇒ (10.41, 12.39)
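		A numerical check of the part c interval, with the exact t quantile in place of the table value 1.746:

		    # Sketch: 90% confidence interval for the treatment-4 mean in Exercise 8.76 c.
		    from math import sqrt
		    from scipy import stats

		    xbar4, MSE, n4, df_error = 11.4, 1.60, 5, 16
		    t05 = stats.t.ppf(0.95, df_error)          # about 1.746
		    half = t05 * sqrt(MSE / n4)                # about 0.99
		    print(xbar4 - half, xbar4 + half)          # about (10.41, 12.39)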

8.78	a.	df(AB) = (a − 1)(b − 1) = 3(5) = 15
		df(Error) = n − ab = 48 − 4(6) = 24
		SSAB = MSAB × df(AB) = 3.1(15) = 46.5
		SS(Total) = SSA + SSB + SSAB + SSE = 2.6 + 9.2 + 46.5 + 18.7 = 77

		MSA = SSA/(a − 1) = 2.6/3 = .8667        MSB = SSB/(b − 1) = 9.2/5 = 1.84
		MSE = SSE/(n − ab) = 18.7/24 = .7792

		FA = MSA/MSE = .8667/.7792 = 1.11        FB = MSB/MSE = 1.84/.7792 = 2.36
		FAB = MSAB/MSE = 3.1/.7792 = 3.98

		Source     df     SS      MS      F
		A           3    2.6   .8667   1.11
		B           5    9.2    1.84   2.36
		AB         15   46.5     3.1   3.98
		Error      24   18.7   .7792
		Total      47   77.0

	b.	Factor A has a = 3 + 1 = 4 levels and factor B has b = 5 + 1 = 6 levels. The number of treatments is ab = 4(6) = 24. The total number of observations is n = 47 + 1 = 48. Thus, two replicates were performed.

	c.	SST = SSA + SSB + SSAB = 2.6 + 9.2 + 46.5 = 58.3

		MST = SST/(ab − 1) = 58.3/[4(6) − 1] = 2.5347

		F = MST/MSE = 2.5347/.7792 = 3.25

		To determine whether the treatment means differ, we test:

		H0: μ1 = μ2 = ⋯ = μ24
		Ha: At least one treatment mean is different

		The test statistic is F = MST/MSE = 3.25

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = ab − 1 = 4(6) − 1 = 23 and ν2 = n − ab = 48 − 4(6) = 24. From Table IX, Appendix B, F.05 ≈ 2.03. The rejection region is F > 2.03.

		Since the observed value of the test statistic falls in the rejection region (F = 3.25 > 2.03), H0 is rejected. There is sufficient evidence to indicate the treatment means differ at α = .05.

	d.	Since there are differences among the treatment means, we test for the presence of interaction:

		H0: Factor A and factor B do not interact to affect the response mean
		Ha: Factor A and factor B do interact to affect the response mean

		The test statistic is F = MSAB/MSE = 3.98

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = (a − 1)(b − 1) = (4 − 1)(6 − 1) = 15 and ν2 = n − ab = 48 − 4(6) = 24. From Table IX, Appendix B, F.05 = 2.11. The rejection region is F > 2.11.

		Since the observed value of the test statistic falls in the rejection region (F = 3.98 > 2.11), H0 is rejected. There is sufficient evidence to indicate factors A and B interact to affect the response means at α = .05.

		Since the interaction is significant, no further tests are warranted. Multiple comparisons need to be performed.
8.80	a.	This is a two-factor factorial design. It is also a completely randomized design.

	b.	The two factors are "involvement in topic" and "question wording." Both are qualitative variables because neither is measured on a numerical scale.

	c.	There are two levels of "involvement in topic": high and low. There are two levels of "question wording": positive and negative.

	d.	There are 2 × 2 = 4 treatments. They are:

		(high, positive), (high, negative), (low, positive), and (low, negative)

	e.	The experiment's dependent variable is the level of agreement.

8.82	a.	To determine if the mean vacancy rates of the eight office-property submarkets in Atlanta differ, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6 = μ7 = μ8
		Ha: At least two means differ

	b.	If quarterly data were used for nine years, there are 4 × 9 = 36 observations per submarket. Since there are 8 submarkets, the total sample size is 8 × 36 = 288. Since no value of α is given, we will use α = .05.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k − 1 = 8 − 1 = 7 and ν2 = n − k = 288 − 8 = 280. From Table X, Appendix B, F.05 ≈ 2.01. The rejection region is F > 2.01.

		Since the observed value of the test statistic falls in the rejection region (F = 17.54 > 2.01), H0 is rejected. There is sufficient evidence to indicate the mean vacancy rates of the eight office-property submarkets in Atlanta differ at α = .05.

	c.	With ν1 = k − 1 = 8 − 1 = 7 and ν2 = n − k = 288 − 8 = 280, P(F > 17.54) < .01, using Table XI, Appendix B. Thus, the p-value is less than .01.

	d.	We must assume that all eight samples are randomly drawn from normal populations, the eight population variances are the same, and the samples are independent.

	e.	The mean vacancy rate for the South submarket is significantly larger than the mean vacancy rates for all other submarkets. The mean vacancy rate of the Downtown submarket is significantly larger than the mean vacancy rates for all other submarkets except the South. The mean vacancy rate of the North Lake submarket is significantly larger than the mean vacancy rates for all other submarkets except the South and Downtown. The mean vacancy rate of the Midtown submarket is significantly larger than the mean vacancy rates for all other submarkets except the South, Downtown, and North Lake. There are no other significant differences.

8.84	a.	The response is the weight of a brochure. There is one factor and it is carton. The treatments are the five different cartons, while the experimental units are the brochures.

	b.	CM = (Σy)²/n = .75005²/40 = .01406437506

		SS(Total) = Σy² − CM = .014066537 − .01406437506 = .00000216264

		SST = ΣTi²/8 − CM
		    = (.14767² + .15028² + .14962² + .15217² + .15031²)/8 − .01406437506
		    = .01406568209 − .01406437506 = .00000130703

		SSE = SS(Total) − SST = .00000216264 − .00000130703 = .00000085561

		MST = SST/(k − 1) = .00000130703/(5 − 1) = .000000326756
		MSE = SSE/(n − k) = .00000085561/(40 − 5) = .000000024446
		F = MST/MSE = .000000326756/.000000024446 = 13.37

		Source        df              SS               MS       F
		Treatments     4   .00000130703   .000000326756   13.37
		Error         35   .00000085561   .000000024446
		Total         39   .00000216264

		To determine whether there are differences in mean weight per brochure among the five cartons, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5
		Ha: At least two treatment means differ

		The test statistic is F = 13.37.

		The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k − 1 = 5 − 1 = 4 and ν2 = n − k = 40 − 5 = 35. From Table IX, Appendix B, F.05 ≈ 2.53. The rejection region is F > 2.53.

		Since the observed value of the test statistic falls in the rejection region (F = 13.37 > 2.53), H0 is rejected. There is sufficient evidence to indicate a difference in mean weight per brochure among the five cartons at α = .05.
c.

We must assume that the distributions of weights for the brochures in the five cartons are
normal, that the variances of the weights for the brochures in the five cartons are equal,
and that random and independent samples were selected from each of the cartons.

	d.	Using MINITAB, the results of Tukey's multiple comparison procedure are:

		Level      N       Mean      StDev
		Carton1    8   0.018459   0.000105
		Carton2    8   0.018785   0.000101
		Carton3    8   0.018703   0.000109
		Carton4    8   0.019021   0.000232
		Carton5    8   0.018789   0.000188

		Pooled StDev = 0.000156

		Tukey 95% Simultaneous Confidence Intervals
		All Pairwise Comparisons
		Individual confidence level = 99.32%

		Carton1 subtracted from:
		             Lower       Center       Upper
		Carton2   0.0001013    0.0003262   0.0005512
		Carton3   0.0000188    0.0002437   0.0004687
		Carton4   0.0003375    0.0005625   0.0007875
		Carton5   0.0001050    0.0003300   0.0005550

		Carton2 subtracted from:
		              Lower        Center       Upper
		Carton3   -0.0003075    -0.0000825   0.0001425
		Carton4    0.0000113     0.0002363   0.0004612
		Carton5   -0.0002212     0.0000037   0.0002287

		Carton3 subtracted from:
		              Lower        Center       Upper
		Carton4    0.0000938     0.0003187   0.0005437
		Carton5   -0.0001387     0.0000862   0.0003112

		Carton4 subtracted from:
		              Lower        Center        Upper
		Carton5   -0.0004575    -0.0002325   -0.0000075

		The means arranged in order are:

		Carton 1    Carton 3    Carton 2    Carton 5    Carton 4
		.018459     .018703     .018785     .018789     .019021

		The interpretation of the Tukey results is:

		The mean weight for carton 4 is significantly higher than the mean weights of all the other cartons. The mean weights of cartons 5, 4, and 3 are significantly higher than the mean weight of carton 1.

	e.	Since there are differences among the cartons, management should sample from many cartons.

8.86	a.	This is a randomized block design.


Response:
Factor:
Factor type:
Treatments:
Experimental units:

290

the length of time required for a cut to stop bleeding


drug
qualitative
drugs A, B, and C
subjects

Chapter 8

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

b.

Using MINITAB, the results are:


General Linear Model: Y versus Drug, Person
Factor
Drug
Person

Type Levels Values


fixed
3 A B C
fixed
5 1 2 3 4 5

Analysis of Variance for Y, using Adjusted SS for Tests


Source
Drug
Person
Error
Total

DF
2
4
8
14

Seq SS
156.4
7645.8
160.1
7962.3

Adj SS
156.4
7645.8
160.1

Adj MS
78.2
1911.5
20.0

F
3.91
95.51

P
0.066
0.000

		Tukey 90.0% Simultaneous Confidence Intervals
		Response Variable Y
		All Pairwise Comparisons among Levels of Drug

		Drug = A subtracted from:
		Drug    Lower    Center   Upper
		B      -11.56    -4.820   1.922
		C       -3.72     3.020   9.762

		Drug = B subtracted from:
		Drug    Lower    Center   Upper
		C       1.098     7.840   14.58
		Let μ1, μ2, and μ3 represent the mean clotting times for the three drugs.

		H0: μ1 = μ2 = μ3
		Ha: At least two means differ

		The test statistic is F = MS(Drug)/MSE = 3.91

		The p-value is p = 0.066. Since the observed level of significance is less than α = .10, H0 is rejected. There is sufficient evidence to indicate differences in the mean clotting times among the three drugs at α = .10.

	c.	The observed level of significance is given as 0.066.

	d.	To determine if there is a significant difference in the mean response over blocks, we test:

		H0: μ1 = μ2 = μ3 = μ4 = μ5
		Ha: At least two block means differ

		The test statistic is F = MS(Person)/MSE = 95.51

		The p-value is p = 0.000. Since the observed level of significance is less than α = .10, H0 is rejected. There is sufficient evidence to indicate differences in the mean clotting times among the five people at α = .10.

	e.	The confidence interval to compare drugs A and B is (−11.56, 1.922). Since 0 is in the interval, there is no evidence of a difference in mean clotting times between drugs A and B.

		The confidence interval to compare drugs A and C is (−3.72, 9.762). Since 0 is in the interval, there is no evidence of a difference in mean clotting times between drugs A and C.

		The confidence interval to compare drugs B and C is (1.098, 14.58). Since 0 is not in the interval, there is evidence of a difference in mean clotting times between drugs B and C. Since the numbers are positive, the mean clotting time for drug C is greater than that for drug B.

		In summary, the mean clotting time for drug C is greater than that for drug B. No other differences exist.

8.88	a.	MSA = SSA/dfA = 243.2/1 = 243.2        MSB = SSB/dfB = 57.8/1 = 57.8

		SSAB = SSTot − SSA − SSB − SSE = 976.3 − 243.2 − 57.8 − 670.8 = 4.5

		MSAB = SSAB/dfAB = 4.5/1 = 4.5         MSE = SSE/dfE = 670.8/77 = 8.712

		FA = MSA/MSE = 243.2/8.712 = 27.92
		FB = MSB/MSE = 57.8/8.712 = 6.63
		FAB = MSAB/MSE = 4.5/8.712 = 0.52

		The ANOVA table is:

		Source                    df      SS      MS       F
		Recent Performance (A)     1   243.2   243.2   27.92
		Risk Attitude (B)          1    57.8    57.8    6.63
		AB                         1     4.5     4.5    0.52
		Error                     77   670.8   8.712
		Total                     80   976.3

To determine if factors A and B interact, we test:

H0: Factors A and B do not interact to affect the mean decision


Ha: Factors A and B do interact to affect the mean decision
The test statistic is F = 0.52.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 =
(a − 1)(b − 1) = (2 − 1)(2 − 1) = 1 and ν2 = n − ab = 81 − 2(2) = 77. From Table IX,
Appendix B, F.05 ≈ 4.00. The rejection region is F > 4.00.

Since the observed value of the test statistic does not fall in the rejection region (F = .52
≯ 4.00), H0 is not rejected. There is insufficient evidence to indicate that factors A and B
interact at α = .05.
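Instead of the Table IX approximation, the exact F critical values can be obtained numerically.
This short Python sketch is an added illustration (it assumes SciPy and is not part of the
original solution); the exact values are close to the table approximations used here.

    # Sketch: exact F critical values for the tests in exercise 8.88 (assumes SciPy)
    from scipy.stats import f

    f05 = f.ppf(0.95, dfn=1, dfd=77)   # upper-tail .05 critical value
    f01 = f.ppf(0.99, dfn=1, dfd=77)   # upper-tail .01 critical value (used in part d)
    print(round(f05, 2), round(f01, 2))  # roughly 3.97 and 6.98, near the table values 4.00 and 7.08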
c.

Since the interaction is not significant, the main effect tests are meaningful.

To determine if an individual's risk attitude affects his or her budgetary decisions, we test:

H0: No difference exists between the risk attitude means
Ha: The risk attitude means differ

The test statistic is F = 6.63.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = b − 1
= 2 − 1 = 1 and ν2 = n − ab = 81 − 2(2) = 77. From Table IX, Appendix B, F.05 ≈ 4.00.
The rejection region is F > 4.00.

Since the observed value of the test statistic falls in the rejection region (F = 6.63 > 4.00),
H0 is rejected. There is sufficient evidence to indicate an individual's risk attitude affects
his or her budgetary decisions at α = .05.
d.

To determine if recent performance affects budgeting decisions, we test:

H0: No difference exists between the recent performance means
Ha: The recent performance means differ

The test statistic is F = 27.92.

The rejection region requires α = .01 in the upper tail of the F-distribution with ν1 = a − 1
= 2 − 1 = 1 and ν2 = n − ab = 81 − 2(2) = 77. From Table XI, Appendix B, F.01 ≈ 7.08.
The rejection region is F > 7.08.

Since the observed value of the test statistic falls in the rejection region (F = 27.92 >
7.08), H0 is rejected. There is sufficient evidence to indicate that recent performance
affects budgetary decisions at α = .01.


8.90

Let factor A be second plastic and factor B be metal density. Some preliminary calculations
are:

CM = (Σy)²/n = (5.56)²/8 = 3.8642

SS(Total) = Σy² − CM = 9.1646 − 3.8642 = 5.3004

SSA = ΣAi²/br − CM = (.92² + 4.64²)/[2(2)] − 3.8642 = 5.594 − 3.8642 = 1.7298

SSB = ΣBj²/ar − CM = (.57² + 4.99²)/[2(2)] − 3.8642 = 6.30625 − 3.8642 = 2.44205

SSAB = ΣABij²/r − SSA − SSB − CM
     = (.06² + .86² + .51² + 4.13²)/2 − 1.7298 − 2.44205 − 3.8642
     = 9.0301 − 8.03605 = .99405

SSE = SS(Total) − SSA − SSB − SSAB = 5.3004 − 1.7298 − 2.44205 − .99405 = .1345

MSA = SSA/(a − 1) = 1.7298/(2 − 1) = 1.7298
MSB = SSB/(b − 1) = 2.44205/(2 − 1) = 2.44205
MSAB = SSAB/[(a − 1)(b − 1)] = .99405/[(1)(1)] = .99405
MSE = SSE/(n − ab) = .1345/[8 − 2(2)] = .033625

F(A) = MSA/MSE = 1.7298/.033625 = 51.44
F(B) = MSB/MSE = 2.44205/.033625 = 72.63
F(AB) = MSAB/MSE = .99405/.033625 = 29.56

Source   df        SS        MS       F
A         1   1.72980   1.72980   51.44
B         1   2.44205   2.44205   72.63
AB        1    .99405    .99405   29.56
Error     4    .13450   .033625
Total     7   5.30040

SST = SSA + SSB + SSAB = 1.7298 + 2.44205 + .99405 = 5.1659

MST = SST/(ab − 1) = 5.1659/[2(2) − 1] = 1.7220

F(T) = MST/MSE = 1.7220/.033625 = 51.21

To determine whether differences exist among the treatment means, we test:

H0: μ1 = μ2 = μ3 = μ4
Ha: At least two treatment means differ

The test statistic is F = 51.21.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = ab − 1 =
2(2) − 1 = 3 and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 6.59. The
rejection region is F > 6.59.

Since the observed value of the test statistic falls in the rejection region (F = 51.21 > 6.59), H0
is rejected. There is sufficient evidence to indicate differences in mean radiation among the
four treatments at α = .05.
Since there are differences among the treatment means, we next test to see if the two factors
interact.

H0: Second plastic and metal density do not interact
Ha: Second plastic and metal density do interact

The test statistic is F = MSAB/MSE = 29.56.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = (a − 1)(b − 1) = 1
and ν2 = n − ab = 8 − 2(2) = 4. From Table IX, Appendix B, F.05 = 7.71. The rejection region
is F > 7.71.

Since the observed value of the test statistic falls in the rejection region (F = 29.56 > 7.71), H0
is rejected. There is sufficient evidence to indicate second plastic and metal density interact at
α = .05.

Since interaction is present, no tests for main effects are necessary. Since we want to find the
preferred method to protect patients, we will compare all four treatment means. There are four
treatments, so c = p(p − 1)/2 = 4(4 − 1)/2 = 6. For α* = α/c = .05/6 = .0083 and α*/2 = .0083/2 =
.0042 ≈ .005 and df = n − ab = 4, t.005 = 4.604 from Table VI, Appendix B.


We now form confidence intervals for the differences between each pair of means using the
formula:

(x̄i − x̄j) ± t.005 s √(1/ni + 1/nj)   where s = √MSE = √.033625 = .1834

Pair          Confidence Interval
μ11 − μ12    (.03 − .43) ± 4.604(.1834)√(1/2 + 1/2) ⇒ −.40 ± .844 ⇒ (−1.244, .444)
μ11 − μ21    (.03 − .255) ± .844 ⇒ −.225 ± .844 ⇒ (−1.069, .619)
μ11 − μ22    (.03 − 2.065) ± .844 ⇒ −2.035 ± .844 ⇒ (−2.879, −1.191)
μ12 − μ21    (.43 − .255) ± .844 ⇒ .175 ± .844 ⇒ (−.669, 1.019)
μ12 − μ22    (.43 − 2.065) ± .844 ⇒ −1.635 ± .844 ⇒ (−2.479, −.791)
μ21 − μ22    (.255 − 2.065) ± .844 ⇒ −1.81 ± .844 ⇒ (−2.654, −.966)

The means that differ are μ11 and μ22, μ12 and μ22, and μ21 and μ22. No other means are
significantly different. Since we are looking for the treatment that gives the best protection
(allows the smallest amount of radiation), we would pick any treatment except treatment 22. Thus, use
second plastic present and heavy alloy, second plastic present and light alloy, or second plastic
not present and heavy alloy. Pick the one of these three that is the cheapest or the most
convenient.
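The Bonferroni margin of error used in the table above can be reproduced numerically. The
Python sketch below is an added illustration (it assumes SciPy is installed and is not part of
the original solution); it follows the same rounding of α*/2 to .005 as the text.

    # Sketch: Bonferroni-adjusted t value and margin for exercise 8.90 (assumes SciPy)
    from math import sqrt
    from scipy.stats import t

    mse, df_error, reps = 0.033625, 4, 2
    c = 4 * (4 - 1) // 2                  # 6 pairwise comparisons among the 4 treatments
    alpha_star = 0.05 / c                 # about .0083; the text halves and rounds this to .005
    t_crit = t.ppf(1 - 0.005, df_error)   # reproduces the Table VI value t.005 = 4.604
    margin = t_crit * sqrt(mse) * sqrt(1 / reps + 1 / reps)
    print(round(alpha_star, 4), round(t_crit, 3), round(margin, 3))   # 0.0083, 4.604, 0.844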
8.92

a.

There are a total of a × b = 3 × 3 = 9 treatments in this study.

b.

Using MINITAB, the ANOVA results are:


General Linear Model: Y versus Display, Price

Factor   Type   Levels  Values
Display  fixed       3  1 2 3
Price    fixed       3  1 2 3

Analysis of Variance for Y, using Adjusted SS for Tests

Source         DF   Seq SS   Adj SS   Adj MS        F      P
Display         2  1691393  1691393   845696  1709.37  0.000
Price           2  3089054  3089054  1544527  3121.89  0.000
Display*Price   4   510705   510705   127676   258.07  0.000
Error          18     8905     8905      495
Total          26  5300057

To get the SS for Treatments, we must add the SS for Display, the SS for Price, and the SS for
Interaction. Thus, SST = 1,691,393 + 3,089,054 + 510,705 = 5,291,152. The df = 2 + 2 +
4 = 8.

MST = SST/(ab − 1) = 5,291,152/[3(3) − 1] = 661,394

F = MST/MSE = 661,394/495 = 1336.15


To determine whether the treatment means differ, we test:

H0: μ1 = μ2 = ⋯ = μ9
Ha: At least two treatment means differ

The test statistic is F = MST/MSE = 1336.15.

The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 =
ab − 1 = 3(3) − 1 = 8 and ν2 = n − ab = 27 − 3(3) = 18. From Table VIII, Appendix B,
F.10 = 2.04. The rejection region is F > 2.04.

Since the observed value of the test statistic falls in the rejection region (F = 1336.15 >
2.04), H0 is rejected. There is sufficient evidence to indicate the treatment means differ at
α = .10.
c.

Since there are differences among the treatment means, we next test for the presence of
interaction.

H0: Factors A and B do not interact to affect the response means
Ha: Factors A and B do interact to affect the response means

The test statistic is F = MSAB/MSE = 258.07.

The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 =
(a − 1)(b − 1) = (3 − 1)(3 − 1) = 4 and ν2 = n − ab = 27 − 3(3) = 18. From Table VIII,
Appendix B, F.10 = 2.29. The rejection region is F > 2.29.

Since the observed value of the test statistic falls in the rejection region (F = 258.07 >
2.29), H0 is rejected. There is sufficient evidence to indicate the two factors interact at
α = .10.
d.

The main effect tests are not warranted since interaction is present in part c.

e.

The nine treatment means need to be compared.

f.

From the graph, if the like letters are connected, the lines are not parallel. This implies
interaction is present. This agrees with the results of part c.


8.94

a.

This is a completely randomized design with a complete four-factor factorial design.

b.

There are a total of 2 × 2 × 2 × 2 = 16 treatments.

c.

Using SAS, the output is:


Analysis of Variance Procedure

Dependent Variable: Y

                                  Sum of        Mean
Source                  DF       Squares      Square   F Value   Pr > F
Model                   15     546745.50    36449.70      5.11   0.0012
Error                   16     114062.00     7128.88
Corrected Total         31     660807.50

R-Square        C.V.      Root MSE      Y Mean
0.827390    41.46478        84.433      203.63

Source                  DF      Anova SS   Mean Square   F Value   Pr > F
SPEED                    1      56784.50      56784.50      7.97   0.0123
FEED                     1      21218.00      21218.00      2.98   0.1037
SPEED*FEED               1      55444.50      55444.50      7.78   0.0131
COLLET                   1     165025.13     165025.13     23.15   0.0002
SPEED*COLLET             1      44253.13      44253.13      6.21   0.0241
FEED*COLLET              1     142311.13     142311.13     19.96   0.0004
SPEED*FEED*COLLET        1      54946.13      54946.13      7.71   0.0135
WEAR                     1        378.13        378.13      0.05   0.8208
SPEED*WEAR               1       1540.13       1540.13      0.22   0.6483
FEED*WEAR                1        946.13        946.13      0.13   0.7204
SPEED*FEED*WEAR          1        528.13        528.13      0.07   0.7890
COLLET*WEAR              1       1682.00       1682.00      0.24   0.6337
SPEED*COLLET*WEAR        1        512.00        512.00      0.07   0.7921
FEED*COLLET*WEAR         1         72.00         72.00      0.01   0.9212
SPEE*FEED*COLLE*WEAR     1       1104.50       1104.50      0.15   0.6991
d.

To determine if the interaction terms are significant, we must add together the sums of
squares for all interaction terms as well as their degrees of freedom.

SS(Interaction) = 55,444.50 + 44,253.13 + 142,311.13 + 54,946.13 + 1,540.13 + 946.13
                + 528.13 + 1,682.00 + 512.00 + 72.00 + 1,104.50 = 303,339.78

df(Interaction) = 11

MS(Interaction) = SS(Interaction)/df(Interaction) = 303,339.78/11 = 27,576.34364

F(Interaction) = MS(Interaction)/MSE = 27,576.34364/7,128.88 = 3.87


To determine if interaction effects are present, we test:

H0: No interaction effects exist
Ha: Interaction effects exist

The test statistic is F = 3.87.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = 11
and ν2 = 16. From Table IX, Appendix B, F.05 ≈ 2.49. The rejection region is F > 2.49.

Since the observed value of the test statistic falls in the rejection region (F = 3.87 > 2.49),
H0 is rejected. There is sufficient evidence to indicate that interaction effects exist at
α = .05.

Since the sums of squares for a balanced factorial design are independent of each other,
we can look at the SAS output to determine which of the interaction effects are
significant. The three-way interaction between speed, feed, and collet is significant
(p = .0135). There are three two-way interactions with p-values less than .05. However,
all of these two-way interaction terms are imbedded in the significant three-way
interaction term.
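The pooled interaction F statistic and its exact critical value can also be computed directly.
The Python sketch below is an added illustration (it assumes SciPy and is not part of the
original solution); the exact critical value is close to the Table IX approximation of 2.49.

    # Sketch: pooled interaction test of exercise 8.94d (assumes SciPy)
    from scipy.stats import f

    F_interaction = 27576.34364 / 7128.88      # MS(Interaction)/MSE
    crit = f.ppf(0.95, dfn=11, dfd=16)         # exact F.05 with 11 and 16 df
    print(round(F_interaction, 2), round(crit, 2))  # 3.87 and approximately 2.46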
e.

Yes. Since the significant interaction terms do not include wear, it is necessary to
perform the main effect test for wear. All other main effects are contained in a significant
interaction term.

To determine if the mean finish measurements differ for the different levels of wear, we
test:

H0: The mean finish measurements for the two levels of wear are the same
Ha: The mean finish measurements for the two levels of wear are different

The test statistic is F = 0.05.

The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = 1 and
ν2 = 16. From Table IX, Appendix B, F.05 = 4.49. The rejection region is F > 4.49.

Since the observed value of the test statistic does not fall in the rejection region (F = .05
≯ 4.49), H0 is not rejected. There is insufficient evidence to indicate that the mean finish
measurements differ for the different levels of wear at α = .05.
f.

We must assume that:

i.   The populations sampled from are normal.
ii.  The population variances are the same.
iii. The samples are random and independent.


Categorical Data Analysis

Chapter 9

9.2

The characteristics of the multinomial experiment are:

1.  The experiment consists of n identical trials.
2.  There are k possible outcomes to each trial.
3.  The probabilities of the k outcomes, denoted p1, p2, ..., pk, remain the same from trial to
    trial, where p1 + p2 + ⋯ + pk = 1.
4.  The trials are independent.
5.  The random variables of interest are the counts n1, n2, ..., nk in each of the k cells.

The characteristics of the binomial experiment are the same as those for the multinomial with k = 2.
9.4

The hypotheses of interest are:

H0: p1 = .25, p2 = .25, p3 = .50
Ha: At least one of the probabilities differs from its hypothesized value

E(n1) = np1,0 = 320(.25) = 80
E(n2) = np2,0 = 320(.25) = 80
E(n3) = np3,0 = 320(.50) = 160

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (78 − 80)²/80 + (60 − 80)²/80 + (182 − 160)²/160 = 8.075

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = k − 1
= 3 − 1 = 2. From Table VII, Appendix B, χ².05 = 5.99147. The rejection region is
χ² > 5.99147.

Since the observed value of the test statistic falls in the rejection region (χ² = 8.075 > 5.99147),
H0 is rejected. There is sufficient evidence to indicate that at least one of the probabilities
differs from its hypothesized value at α = .05.
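The same goodness-of-fit statistic can be reproduced in a few lines. The Python sketch below
is an added illustration (it assumes SciPy and is not part of the original solution); the
observed and expected counts are the ones listed above.

    # Sketch: chi-square goodness-of-fit test for exercise 9.4 (assumes SciPy)
    from scipy.stats import chisquare

    observed = [78, 60, 182]
    expected = [80, 80, 160]          # 320 x (.25, .25, .50)
    stat, p = chisquare(f_obs=observed, f_exp=expected)
    print(round(stat, 3), round(p, 3))   # 8.075 and a p-value near .018, so H0 is rejected at alpha = .05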
9.6

a.

The qualitative variable of interest is the location of professional sports stadiums and
ballparks. There are 3 levels or categories of this variable: downtown, central city, and
suburban.

b.

Let p1 = proportion of major sports facilities located in downtown areas, p2 = proportion
of major sports facilities located in central city areas, and p3 = proportion of major sports
facilities located in suburban areas in 1997.


To determine if the proportions of major sports facilities in downtown, central city, and
suburban areas in 1997 are different than in 1985, we test:

H0: p1 = .40, p2 = .30, p3 = .30
Ha: At least one of the proportions differs from its hypothesized value

c.

E(n1) = np1,0 = 113(.40) = 45.2; E(n2) = np2,0 = 113(.30) = 33.9;
E(n3) = np3,0 = 113(.30) = 33.9

d.

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (58 − 45.2)²/45.2 + (26 − 33.9)²/33.9 + (29 − 33.9)²/33.9 = 6.174

e.

The degrees of freedom for the test statistic is k − 1 = 3 − 1 = 2. The p-value is
p = P(χ² ≥ 6.174).

Using Table VII, Appendix B, with df = 2, .025 > P(χ² ≥ 6.174) > .01. Thus,
.01 < p < .025.

Since the p-value is smaller than α = .05, H0 is rejected. There is sufficient evidence to
indicate the proportions of major sports facilities in downtown, central city, and suburban
areas in 1997 are different than in 1985.
9.8

a.

The categorical variable is the rating of the student exposure to social and
environmental issues. It has 5 levels: 1-star, 2-stars, 3-stars, 4-stars, and 5-stars.

b.

If there were no difference in the category proportions, then each proportion should be pi
= 1/5 = .20. There were a total of n = 30 business schools sampled. The expected
number in each category would be:

E(n1) = E(n2) = E(n3) = E(n4) = E(n5) = n(pi,0) = 30(.20) = 6

c.

To determine if there are differences in the star rating category proportions of all MBA
programs, we test:

H0: p1 = p2 = p3 = p4 = p5 = .20
Ha: At least one pi differs from its hypothesized value

d.

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (2 − 6)²/6 + (9 − 6)²/6 + (14 − 6)²/6 + (5 − 6)²/6 + (0 − 6)²/6 = 21

e.

The rejection region requires α = .05 in the upper tail of the χ² distribution with
df = k − 1 = 5 − 1 = 4. From Table VII, Appendix B, χ².05 = 9.48773. The rejection
region is χ² > 9.48773.


f.

Since the observed value of the test statistic falls in the rejection region
(χ² = 21 > 9.48773), H0 is rejected. There is sufficient evidence to indicate differences in
the star rating category proportions of all MBA programs at α = .05.

g.

Some preliminary calculations are:

p̂3 = x3/n = 14/30 = .467

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV,
Appendix B, z.025 = 1.96. The 95% confidence interval is:

p̂3 ± z.025 √(p̂3q̂3/n) ⇒ .467 ± 1.96 √(.467(.533)/30) ⇒ .467 ± .179 ⇒ (.288, .646)

We are 95% confident that the proportion of all MBA programs that are ranked in the
3-star category is between .288 and .646.
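The large-sample interval above is easy to verify numerically. The Python sketch below is an
added illustration (it assumes SciPy for the normal quantile, which could simply be replaced by
1.96); small differences from the hand calculation are due to rounding of p̂ and q̂.

    # Sketch: 95% large-sample CI for the 3-star proportion in exercise 9.8g
    from math import sqrt
    from scipy.stats import norm

    x, n = 14, 30
    p_hat = x / n                                  # about .467
    z = norm.ppf(0.975)                            # 1.96
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    print(round(p_hat - half_width, 3), round(p_hat + half_width, 3))  # roughly (.288, .645)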
9.10

a.

Some preliminary calculations are:

E(n1) = np1,0 = 1000(.50) = 500
E(n2) = np2,0 = 1000(.22) = 220
E(n3) = np3,0 = 1000(.11) = 110
E(n4) = np4,0 = 1000(.17) = 170

To determine if the percentages disagree with the percentages reported by
Nielson/NetRatings, we test:

H0: p1 = .50, p2 = .22, p3 = .11, and p4 = .17
Ha: At least one pi differs from its hypothesized value

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni)
   = (487 − 500)²/500 + (245 − 220)²/220 + (121 − 110)²/110 + (147 − 170)²/170 = 7.391

The rejection region requires α = .05 in the upper tail of the χ² distribution with
df = k − 1 = 4 − 1 = 3. From Table VII, Appendix B, χ².05 = 7.81473. The rejection
region is χ² > 7.81473.

Since the observed value of the test statistic does not fall in the rejection region
(χ² = 7.391 ≯ 7.81473), H0 is not rejected. There is insufficient evidence to indicate
the percentages disagree with the percentages reported by Nielson/NetRatings at
α = .05.


b.

Some preliminary calculations are:

p̂1 = x1/n = 487/1000 = .487

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table IV,
Appendix B, z.025 = 1.96. The 95% confidence interval is:

p̂1 ± z.025 √(p̂1q̂1/n) ⇒ .487 ± 1.96 √(.487(.513)/1000) ⇒ .487 ± .031 ⇒ (.456, .518)

We are 95% confident that the percentage of all Internet searches that use the
Google Search Engine is between 45.6% and 51.8%.
9.12

Some preliminary calculations are:

E(n1) = np1,0 = 2,023(.45) = 910.35
E(n2) = np2,0 = 2,023(.35) = 708.05
E(n3) = np3,0 = 2,023(.15) = 303.45
E(n4) = np4,0 = 2,023(.05) = 101.15

To determine if the percentages of all adults falling into the four response categories
changed after the Enron scandal, we test:

H0: p1 = .45, p2 = .35, p3 = .15, and p4 = .05
Ha: At least one pi differs from its hypothesized value

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni)
   = (1,173 − 910.35)²/910.35 + (587 − 708.05)²/708.05 + (182 − 303.45)²/303.45
     + (81 − 101.15)²/101.15
   = 149.096

The rejection region requires α = .01 in the upper tail of the χ² distribution with
df = k − 1 = 4 − 1 = 3. From Table VII, Appendix B, χ².01 = 11.3449. The rejection region is
χ² > 11.3449.

Since the observed value of the test statistic falls in the rejection region
(χ² = 149.096 > 11.3449), H0 is rejected. There is sufficient evidence to indicate the
percentages of all adults falling into the four response categories changed after the Enron
scandal at α = .01.
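The Table VII critical value and the corresponding p-value can also be obtained numerically.
The Python sketch below is an added illustration (it assumes SciPy and is not part of the
original solution).

    # Sketch: chi-square critical value and p-value for exercise 9.12 (assumes SciPy)
    from scipy.stats import chi2

    crit_01 = chi2.ppf(0.99, df=3)    # 11.3449, the Table VII value used above
    p_value = chi2.sf(149.096, df=3)  # upper-tail area of the observed statistic
    print(round(crit_01, 4), p_value)  # 11.3449 and an extremely small p-value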


9.14

a.

Some preliminary calculations, E(ni) = npi,0, are:

E(n1) = 700(.09) = 63      E(n2) = 700(.02) = 14
E(n3) = 700(.02) = 14      E(n4) = 700(.04) = 28
E(n5) = 700(.12) = 84      E(n6) = 700(.02) = 14
E(n7) = 700(.03) = 21      E(n8) = 700(.02) = 14
E(n9) = 700(.09) = 63      E(n10) = 700(.01) = 7
E(n11) = 700(.01) = 7      E(n12) = 700(.04) = 28
E(n13) = 700(.02) = 14     E(n14) = 700(.06) = 42
E(n15) = 700(.08) = 56     E(n16) = 700(.02) = 14
E(n17) = 700(.01) = 7      E(n18) = 700(.06) = 42
E(n19) = 700(.04) = 28     E(n20) = 700(.06) = 42
E(n21) = 700(.04) = 28     E(n22) = 700(.02) = 14
E(n23) = 700(.02) = 14     E(n24) = 700(.01) = 7
E(n25) = 700(.02) = 14     E(n26) = 700(.01) = 7
E(n27) = 700(.02) = 14

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (39 − 63)²/63 + (18 − 14)²/14 + (30 − 14)²/14 + ⋯ + (34 − 14)²/14
   = 360.48

To determine if ScrabbleExpress presents the player with unfair word selection
opportunities that are different from the Scrabble board game, we test:

H0: The letter proportions in ScrabbleExpress are the same as in the Scrabble board game
Ha: The letter proportions in ScrabbleExpress differ from those in the Scrabble board game

The test statistic is χ² = 360.48.

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
k − 1 = 27 − 1 = 26. From Table VII, Appendix B, χ².05 = 38.8852. The rejection region
is χ² > 38.8852.

Since the observed value of the test statistic falls in the rejection region (χ² = 360.48 >
38.8852), H0 is rejected. There is sufficient evidence to indicate that ScrabbleExpress
presents the player with unfair word selection opportunities that are different from the
Scrabble board game at α = .05.
b.

The relative frequency of vowels for the board game is P(A) + P(E) + P(I) + P(O) +
P(U) = .09 + .12 + .09 + .08 + .04 = .42.

p̂v = (39 + 31 + 25 + 20 + 21)/700 = 136/700 = .194

For confidence level .95, α = .05 and α/2 = .05/2 = .025. From Table IV, Appendix B,
z.025 = 1.96. The 95% confidence interval is:

p̂v ± z.025 √(p̂v(1 − p̂v)/n) ⇒ .194 ± 1.96 √(.194(.806)/700) ⇒ .194 ± .029 ⇒ (.165, .223)

We are 95% confident that the true proportion of vowels in the ScrabbleExpress game is
between .165 and .223. The true proportion from the board game is .42, which is much
greater than the values in the interval.
9.16

a.

df = (r − 1)(c − 1) = (5 − 1)(5 − 1) = 16. From Table VII, Appendix B, χ².05 = 26.2962.
The rejection region is χ² > 26.2962.

b.

df = (r − 1)(c − 1) = (3 − 1)(6 − 1) = 10. From Table VII, Appendix B, χ².10 = 15.9871.
The rejection region is χ² > 15.9871.

c.

df = (r − 1)(c − 1) = (2 − 1)(3 − 1) = 2. From Table VII, Appendix B, χ².01 = 9.21034. The
rejection region is χ² > 9.21034.

9.18

a.

To convert the frequencies to percentages, divide the numbers in each column by the
column total and multiply by 100. Also, divide the row totals by the overall total and
multiply by 100. The column totals are 25, 64, and 78, while the row totals are 96 and
71. The overall sample size is 167. The table of percentages is:

                            Column
         1                2                 3                 Totals
Row 1    9/25 = 36%       34/64 = 53.1%     53/78 = 67.9%     96/167 = 57.5%
Row 2    16/25 = 64%      30/64 = 46.9%     25/78 = 32.1%     71/167 = 42.5%

b.

Using MINITAB, the graph is:

[Bar chart of the Row 1 percentage for each column (36%, 53.1%, 67.9%), with the overall
Row 1 percentage of 57.5% shown for reference.]


c.

If the rows and columns are independent, the row percentages in each column would be
close to the row total percentages. This pattern is not evident in the plot, implying the
rows and columns are not independent.

9.20

a-b. To convert the frequencies to percentages, divide the numbers in each column by the
     column total and multiply by 100. Also, divide the row totals by the overall total and
     multiply by 100.

              B1                 B2                 B3                 Totals
     A1       40/134 = 29.9%     72/163 = 44.2%     42/142 = 29.6%     154/439 = 35.1%
     A2       63/134 = 47.0%     53/163 = 32.5%     70/142 = 49.3%     186/439 = 42.4%
     A3       31/134 = 23.1%     38/163 = 23.3%     30/142 = 21.1%      99/439 = 22.6%

c.

Using MINITAB, the graph is:

[Bar chart of the A1 row percentage for each level of B (29.9%, 44.2%, 29.6%), with the
overall A1 percentage of 35.1% shown for reference.]

The graph supports the conclusion that the rows and columns are not independent. If they
were, then the height of all the bars would be essentially the same.
9.22

a.

The contingency table would be:

                      Itemize Deductions
Tax-motivation        Yes       No        Total
Yes                   691       381       1,072
No                    794       899       1,693
Total                 1,485     1,280     2,765

b.

E11 = R1C1/n = 1,072(1,485)/2,765 = 575.7
E12 = R1C2/n = 1,072(1,280)/2,765 = 496.3
E21 = R2C1/n = 1,693(1,485)/2,765 = 909.3
E22 = R2C2/n = 1,693(1,280)/2,765 = 783.7

c.

The test statistic is:

χ² = Σ [nij − Eij]²/Eij
   = (691 − 575.7)²/575.7 + (381 − 496.3)²/496.3 + (794 − 909.3)²/909.3 + (899 − 783.7)²/783.7
   = 81.46

d.

To determine if tax motivation and itemizing deductions are related for charitable givers, we
test:

H0: Tax motivation and itemizing deductions are independent
Ha: Tax motivation and itemizing deductions are dependent

The test statistic is χ² = 81.46.

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (2 − 1)(2 − 1) = 1. From Table VII, Appendix B, χ².05 = 3.84146. The
rejection region is χ² > 3.84146.

Since the observed value of the test statistic falls in the rejection region (χ² = 81.46 >
3.84146), H0 is rejected. There is sufficient evidence to indicate that tax motivation and
itemizing deductions are related for charitable givers at α = .05.
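The entire test of independence can be reproduced with a contingency-table routine. The Python
sketch below is an added illustration (it assumes SciPy and NumPy are available and is not part
of the original solution); correction=False matches the hand calculation, whereas the default
Yates continuity correction for 2×2 tables would give a slightly smaller statistic.

    # Sketch: chi-square test of independence for exercise 9.22 (assumes SciPy/NumPy)
    import numpy as np
    from scipy.stats import chi2_contingency

    table = np.array([[691, 381],
                      [794, 899]])
    stat, p, df, expected = chi2_contingency(table, correction=False)
    print(round(stat, 2), df)      # approximately 81.46 with 1 df; p is essentially 0
    print(np.round(expected, 1))   # matches the Eij values computed above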
e.

To compute the bar graph, we first convert frequencies to percentages by dividing the
numbers in each column by the column total and multiplying by 100%. Also, divide the
row totals by the overall total and multiply by 100%.

                      Itemize Deductions
Tax-motivation        Yes                     No                      Total
Yes                   691/1,485 = 46.5%       381/1,280 = 29.8%       1,072/2,765 = 38.8%
No                    794/1,485 = 53.5%       899/1,280 = 70.2%       1,693/2,765 = 61.2%
Total                 1,485                   1,280                   2,765


Using MINITAB, the bar graph is:

[Bar chart of the percentage of tax-motivated givers among those who itemize (46.5%) and
those who do not itemize (29.8%), with the overall percentage of 38.8% shown for reference.]

9.24

a.

Some preliminary calculations are:

p̂C1 = xC1/n1 = 175/6,222 = .028
p̂C2 = xC2/n2 = 236/4,692 = .050
p̂C3 = xC3/n3 = 319/7,140 = .045
p̂C4 = xC4/n4 = 231/6,120 = .038
p̂C5 = xC5/n5 = 480/10,353 = .046
p̂C6 = xC6/n6 = 187/4,794 = .039

The proportions range from .028 to .050. Since .050 is about twice as big as .028, there
may be evidence to conclude some of the proportions are different.
b.

Some preliminary calculations are:

E11 = R1C1/n = 6,222(37,693)/39,321 = 5,964.39     E12 = R1C2/n = 6,222(1,628)/39,321 = 257.61
E21 = R2C1/n = 4,692(37,693)/39,321 = 4,497.74     E22 = R2C2/n = 4,692(1,628)/39,321 = 194.26
E31 = R3C1/n = 7,140(37,693)/39,321 = 6,844.38     E32 = R3C2/n = 7,140(1,628)/39,321 = 295.62
E41 = R4C1/n = 6,120(37,693)/39,321 = 5,866.61     E42 = R4C2/n = 6,120(1,628)/39,321 = 253.39
E51 = R5C1/n = 10,353(37,693)/39,321 = 9,924.36    E52 = R5C2/n = 10,353(1,628)/39,321 = 428.64
E61 = R6C1/n = 4,794(37,693)/39,321 = 4,595.51     E62 = R6C2/n = 4,794(1,628)/39,321 = 198.49

To determine if the proportions of censored measurements differ for the six tractor
lines, we test:

H0: Tractor line and censored measurements are independent
Ha: Tractor line and censored measurements are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij = (6047 − 5964.39)²/5964.39 + (175 − 257.61)²/257.61
   + (4456 − 4497.74)²/4497.74 + ⋯ + (187 − 198.49)²/198.49 = 48.0978

The rejection region requires α = .01 in the upper tail of the χ² distribution with
df = (r − 1)(c − 1) = (6 − 1)(2 − 1) = 5. From Table VII, Appendix B, χ².01 = 15.0863.
The rejection region is χ² > 15.0863.

Since the observed value of the test statistic falls in the rejection region
(χ² = 48.0978 > 15.0863), H0 is rejected. There is sufficient evidence to indicate that
the proportions of censored measurements differ for the six tractor lines at α = .01.

c.

Even though there are differences in the proportions of censored data among the six tractor
lines, these proportions range from .028 to .050. In practice, there is very little difference
between .028 and .050.

9.26

Some preliminary calculations are:

E11 = R1C1/n = 95(118)/262 = 42.8     E12 = R1C2/n = 95(144)/262 = 52.2
E21 = R2C1/n = 69(118)/262 = 31.1     E22 = R2C2/n = 69(144)/262 = 37.9
E31 = R3C1/n = 42(118)/262 = 18.9     E32 = R3C2/n = 42(144)/262 = 23.1
E41 = R4C1/n = 56(118)/262 = 25.2     E42 = R4C2/n = 56(144)/262 = 30.8


To determine whether a pig farmer's education level has an impact on the size of the pig farm,
we test:

H0: Pig farmer's education level and size of pig farm are independent
Ha: Pig farmer's education level and size of pig farm are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (42 − 42.8)²/42.8 + (53 − 52.2)²/52.2 + (27 − 31.1)²/31.1 + (42 − 37.9)²/37.9
     + (22 − 18.9)²/18.9 + (20 − 23.1)²/23.1 + (27 − 25.2)²/25.2 + (29 − 30.8)²/30.8
   = 2.17

The rejection region requires α = .05 in the upper tail of the χ² distribution with df
= (r − 1)(c − 1) = (4 − 1)(2 − 1) = 3. From Table VII, Appendix B, χ².05 = 7.81473. The
rejection region is χ² > 7.81473.

Since the observed value of the test statistic does not fall in the rejection region (χ² = 2.17 ≯
7.81473), H0 is not rejected. There is insufficient evidence to indicate that a pig farmer's
education level has an impact on the size of the pig farm at α = .05.
To compute the bar graph, we first convert frequencies to percentages by dividing the numbers
in each row by the row total and multiplying by 100%. Also, divide the column totals by the
overall total and multiply by 100%.

                              Education Level
Farm Size           No college             College               Total
<1,000 pigs         42/95 = 44.2%          53/95 = 55.8%          95
1,000-2,000 pigs    27/69 = 39.1%          42/69 = 60.9%          69
2,000-5,000 pigs    22/42 = 52.4%          20/42 = 47.6%          42
>5,000 pigs         27/56 = 48.2%          29/56 = 51.8%          56
Total               118/262 = 45.0%        144/262 = 55.0%        262


Using MINITAB, the bar graph is:

[Bar chart of the percentage of no-college owners for each farm size category (44.2%, 39.1%,
52.4%, 48.2%), with the overall percentage of 45.0% shown for reference.]

9.28

a.

Some preliminary calculations are:

E11 = R1C1/n = 53(35)/70 = 26.5     E12 = R1C2/n = 53(35)/70 = 26.5
E21 = R2C1/n = 17(35)/70 = 8.5      E22 = R2C2/n = 17(35)/70 = 8.5

To determine if the severity of the ethical issue influenced whether the issue was
identified or not by the auditors, we test:

H0: Severity of ethical issue and identification are independent
Ha: Severity of ethical issue and identification are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (27 − 26.5)²/26.5 + (26 − 26.5)²/26.5 + (8 − 8.5)²/8.5 + (9 − 8.5)²/8.5 = .078

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (2 − 1)(2 − 1) = 1. From Table VII, Appendix B, χ².05 = 3.84146. The
rejection region is χ² > 3.84146.

Since the observed value of the test statistic does not fall in the rejection region (χ² = .078
≯ 3.84146), H0 is not rejected. There is insufficient evidence to indicate that the severity
of the ethical issue influenced whether the issue was identified or not by the auditors at
α = .05.

b.

No. If there were 0 in the bottom cell of the column, then the expected count for that cell
will be less than 5. One of the assumptions necessary for the test statistic to have a χ²
distribution will not hold.


c.

Suppose we change the numbers in the table to be as follows:

                                  Severity of Ethical Issue
                                  Moderate      Severe
Ethical Issue Identified          32            21
Ethical Issue Not Identified      3             14

Since the row and column totals are the same, the expected cell counts are the same as
above.

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (32 − 26.5)²/26.5 + (21 − 26.5)²/26.5 + (3 − 8.5)²/8.5 + (14 − 8.5)²/8.5 = 9.401

Now the test statistic would fall in the rejection region.


9.30

a.

The contingency table is:

                  Flight Response
Altitude          Low       High      Totals
< 300             85        105       190
300-600           77        121       198
600+              17        59        76
Totals            179       285       464

b.

Some preliminary calculations are:

E11 = R1C1/n = 190(179)/464 = 73.297     E12 = R1C2/n = 190(285)/464 = 116.703
E21 = R2C1/n = 198(179)/464 = 76.384     E22 = R2C2/n = 198(285)/464 = 121.616
E31 = R3C1/n = 76(179)/464 = 29.319      E32 = R3C2/n = 76(285)/464 = 46.681

To determine if flight response of the geese depends on the altitude of the helicopter,
we test:

H0: Flight response and Altitude of helicopter are independent
Ha: Flight response and Altitude of helicopter are dependent


The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (85 − 73.297)²/73.297 + (105 − 116.703)²/116.703 + (77 − 76.384)²/76.384
     + (121 − 121.616)²/121.616 + (17 − 29.319)²/29.319 + (59 − 46.681)²/46.681
   = 11.477

The rejection region requires α = .01 in the upper tail of the χ² distribution with
df = (r − 1)(c − 1) = (3 − 1)(2 − 1) = 2. From Table VII, Appendix B, χ².01 = 9.21034.
The rejection region is χ² > 9.21034.

Since the observed value of the test statistic falls in the rejection region
(χ² = 11.477 > 9.21034), H0 is rejected. There is sufficient evidence to indicate that
the flight response of the geese depends on the altitude of the helicopter at α = .01.
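The expected counts Eij = (row total)(column total)/n and the chi-square statistic can be built
directly from the observed table. The Python sketch below is an added illustration (it assumes
NumPy is available and is not part of the original solution).

    # Sketch: expected counts and chi-square statistic for the altitude table in 9.30b
    import numpy as np

    observed = np.array([[85, 105],
                         [77, 121],
                         [17, 59]])
    row_totals = observed.sum(axis=1)
    col_totals = observed.sum(axis=0)
    expected = np.outer(row_totals, col_totals) / observed.sum()
    chi_sq = ((observed - expected) ** 2 / expected).sum()
    print(np.round(expected, 3))   # 73.297, 116.703, 76.384, 121.616, 29.319, 46.681
    print(round(chi_sq, 3))        # approximately 11.477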
c.

The contingency table is:

                      Flight Response
Lateral Distance      Low       High      Totals
< 1000                37        243       280
1000-2000             68        37        105
2000-3000             44        4         48
3000+                 30        1         31
Totals                179       285       464

d.

Some preliminary calculations are:

E11 = R1C1/n = 280(179)/464 = 108.017     E12 = R1C2/n = 280(285)/464 = 171.983
E21 = R2C1/n = 105(179)/464 = 40.506      E22 = R2C2/n = 105(285)/464 = 64.494
E31 = R3C1/n = 48(179)/464 = 18.517       E32 = R3C2/n = 48(285)/464 = 29.483
E41 = R4C1/n = 31(179)/464 = 11.959       E42 = R4C2/n = 31(285)/464 = 19.041


To determine if flight response of the geese depends on the lateral distance of the
helicopter, we test:

H0: Flight response and Lateral distance of the helicopter are independent
Ha: Flight response and Lateral distance of the helicopter are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (37 − 108.017)²/108.017 + (243 − 171.983)²/171.983 + (68 − 40.506)²/40.506
     + (37 − 64.494)²/64.494 + (44 − 18.517)²/18.517 + (4 − 29.483)²/29.483
     + (30 − 11.959)²/11.959 + (1 − 19.041)²/19.041
   = 207.814

The rejection region requires α = .01 in the upper tail of the χ² distribution with
df = (r − 1)(c − 1) = (4 − 1)(2 − 1) = 3. From Table VII, Appendix B, χ².01 = 11.3449.
The rejection region is χ² > 11.3449.

Since the observed value of the test statistic falls in the rejection region
(χ² = 207.814 > 11.3449), H0 is rejected. There is sufficient evidence to indicate that
the flight response of the geese depends on the lateral distance of the helicopter at α = .01.
e.

Using SAS, the contingency table for altitude by response with the column percents is:
Table of ALTGRP by RESPONSE
ALTGRP

RESPONSE

Frequency|
Percent |
Row Pct |
Col Pct |LOW
|HIGH
| Total
---------+--------+--------+
<300
|
85 |
105 |
190
| 18.32 | 22.63 | 40.95
| 44.74 | 55.26 |
| 47.49 | 36.84 |
---------+--------+--------+
300-600 |
77 |
121 |
198
| 16.59 | 26.08 | 42.67
| 38.89 | 61.11 |
| 43.02 | 42.46 |
---------+--------+--------+
600+
|
17 |
59 |
76
|
3.66 | 12.72 | 16.38
| 22.37 | 77.63 |
|
9.50 | 20.70 |
---------+--------+--------+
Total
179
285
464
38.58
61.42
100.00


Statistics for Table of ALTGRP by RESPONSE

Statistic                      DF       Value      Prob
-------------------------------------------------------
Chi-Square                      2     11.4770    0.0032
Likelihood Ratio Chi-Square     2     12.1040    0.0024
Mantel-Haenszel Chi-Square      1     10.2104    0.0014
Phi Coefficient                        0.1573
Contingency Coefficient                0.1554
Cramer's V                             0.1573

Sample Size = 464

From the row percents, it appears that the lower the plane, the lower the response.
For altitude <300 m, 55.26% of the geese had a high response. For altitude 300-600 m,
61.11% of the geese had a high response. For altitude 600+ m, 77.63% of the
geese had a high response. Thus, instead of setting a minimum altitude for the
planes, we need to set a maximum altitude. For this data, the lowest response is at
an altitude of < 300 meters.
Using SAS, the contingency table for lateral distance by response with the column
percents is:
The FREQ Procedure
Table of LATGRP by RESPONSE
LATGRP

RESPONSE

Frequency |
Percent
|
Row Pct
|
Col Pct
|LOW
|HIGH
| Total
----------+--------+--------+
<1000
|
37 |
242 |
279
|
7.99 | 52.27 | 60.26
| 13.26 | 86.74 |
| 20.67 | 85.21 |
----------+--------+--------+
1000-2000 |
68 |
37 |
105
| 14.69 |
7.99 | 22.68
| 64.76 | 35.24 |
| 37.99 | 13.03 |
----------+--------+--------+
2000-3000 |
44 |
4 |
48
|
9.50 |
0.86 | 10.37
| 91.67 |
8.33 |
| 24.58 |
1.41 |
----------+--------+--------+
3000+
|
30 |
1 |
31
|
6.48 |
0.22 |
6.70
| 96.77 |
3.23 |
| 16.76 |
0.35 |
----------+--------+--------+
Total
179
284
463
38.66
61.34
100.00
Frequency Missing = 1
Statistics for Table of LATGRP by RESPONSE

Statistic                      DF        Value      Prob
--------------------------------------------------------
Chi-Square                      3     207.0800    <.0001
Likelihood Ratio Chi-Square     3     226.8291    <.0001
Mantel-Haenszel Chi-Square      1     189.2843    <.0001
Phi Coefficient                         0.6688
Contingency Coefficient                 0.5559
Cramer's V                              0.6688

Effective Sample Size = 463
Frequency Missing = 1


From the row percents, it appears that the greater the lateral distance, the lower the
response. For a lateral distance of 3000+m only 3.23% of the geese had a high
response. Thus, the further away the plane is laterally, the lower the response. For
this data, the lowest response is when the plane is further than 3000 meters.
Thus the recommendation would be a maximum height of 300 m and a minimum
lateral distance of 3000 m.
9.32

a.

Some preliminary calculations are:

E11 = 50(50)/250 = 10      E12 = 50(90)/250 = 18      E13 = 50(110)/250 = 22
E21 = 100(50)/250 = 20     E22 = 100(90)/250 = 36     E23 = 100(110)/250 = 44
E31 = 100(50)/250 = 20     E32 = 100(90)/250 = 36     E33 = 100(110)/250 = 44

To determine if the rows and columns are dependent, we test:

H0: Rows and columns are independent
Ha: Rows and columns are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij = (20 − 10)²/10 + ⋯ + (30 − 44)²/44 = 54.14

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (3 − 1)(3 − 1) = 4. From Table VII, Appendix B, χ².05 = 9.48773. The
rejection region is χ² > 9.48773.

Since the observed value of the test statistic falls in the rejection region (χ² = 54.14 >
9.48773), H0 is rejected. There is sufficient evidence to indicate a dependence between
rows and columns at α = .05.


b.

No, the analysis remains identical.

c.

Yes, the assumptions on the sampling differ.


d.

The percentages are in the table below.

                            Column
         1                2                 3                  Totals
Row 1    20/50 = 40%      20/90 = 22.2%     10/110 = 9.1%      50/250 = 20%
Row 2    10/50 = 20%      20/90 = 22.2%     70/110 = 63.6%     100/250 = 40%
Row 3    20/50 = 40%      50/90 = 55.6%     30/110 = 27.3%     100/250 = 40%

e.

Using MINITAB, the bar graph is:

[Bar chart of the Row 1 percentage for each column (40%, 22.2%, 9.1%), with the overall
Row 1 percentage of 20% shown for reference.]

The graph supports the decision in part a. In part a, we rejected the null hypothesis and
concluded that the rows and columns were dependent. If they were independent, then we
would expect the three bars to be the same height. In this graph, they are not the same
height.
9.34

a.

If Bon Appetit readers do not have a preference for their least favorite vegetable, then the
values of p1, p2, p3, and p4 should all be the same. Since there are four categories, then p1
= p2 = p3 = p4 = .25.

b.

To determine if the Bon Appetit readers have a preference for at least one of the
vegetables as least favorite, we test:
H0: p1 = p2 = p3 = p4 = .25
Ha: At least one pi ≠ .25


c.

Some preliminary calculations:

n = Σni = 46 + 76 + 44 + 34 = 200

E(ni) = npi,0 = 200(.25) = 50, i = 1, 2, 3, or 4

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni) = (46 − 50)²/50 + (76 − 50)²/50 + (44 − 50)²/50 + (34 − 50)²/50 = 19.68

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = k − 1
= 4 − 1 = 3. From Table VII, Appendix B, χ².05 = 7.81473. The rejection region is
χ² > 7.81473.

Since the observed value of the test statistic falls in the rejection region
(χ² = 19.68 > 7.81473), H0 is rejected. There is sufficient evidence to indicate the Bon
Appetit readers have a preference for at least one of the vegetables as least favorite at
α = .05.
d.

We must assume that:

1. The sample is random.
2. The sample size is sufficiently large (every cell has an expected count of at least 5).

9.36

a.

Some preliminary calculations are:

E11 = R1C1/n = 242(473)/549 = 208.499     E12 = R1C2/n = 242(76)/549 = 33.501
E21 = R2C1/n = 212(473)/549 = 182.652     E22 = R2C2/n = 212(76)/549 = 29.348
E31 = R3C1/n = 95(473)/549 = 81.849       E32 = R3C2/n = 95(76)/549 = 13.151

To determine if the likelihood for stress is dependent on an employee's fitness level, we
test:

H0: Stress and Fitness level are independent
Ha: Stress and Fitness level are dependent


The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (204 − 208.499)²/208.499 + (38 − 33.501)²/33.501 + (184 − 182.652)²/182.652
     + (28 − 29.348)²/29.348 + (85 − 81.849)²/81.849 + (10 − 13.151)²/13.151
   = 1.648

Since no α level was given, we will use α = .05. The rejection region requires
α = .05 in the upper tail of the χ² distribution with df = (r − 1)(c − 1) = (3 − 1)(2 − 1) = 2.
From Table VII, Appendix B, χ².05 = 5.99147. The rejection region is χ² > 5.99147.

Since the observed value of the test statistic does not fall in the rejection region
(χ² = 1.648 ≯ 5.99147), H0 is not rejected. There is insufficient evidence to indicate
that the likelihood for stress is dependent on an employee's fitness level at α = .05.

b.

A Type I error is rejecting H0 when H0 is true. In this case, it would be concluding that
Stress and Fitness level are dependent when, in fact, they are independent.

A Type II error is accepting H0 when H0 is false. In this case, it would be concluding
that Stress and Fitness level are independent when, in fact, they are dependent.

c.

To convert frequencies to percentages, divide the numbers in each row by the row total
and multiply by 100. Also, divide the column totals by the overall total and multiply by
100.

                          Stress Level
Fitness Level     No Stress               Stress
Poor              204/242 = 84.3%         38/242 = 15.7%
Average           184/212 = 86.8%         28/212 = 13.2%
Good              85/95 = 89.5%           10/95 = 10.5%
Total             473/549 = 86.2%         76/549 = 13.8%


Using MINITAB, the bar chart is:

[Bar chart of the percentage with stress for each fitness level (Poor 15.7%, Average 13.2%,
Good 10.5%), with the overall percentage of 13.8% shown for reference.]

9.38

a.

E(n1) = np1,0 = 370(.30) = 111
E(n2) = np2,0 = 370(.20) = 74
E(n3) = np3,0 = 370(.20) = 74
E(n4) = np4,0 = 370(.10) = 37
E(n5) = np5,0 = 370(.10) = 37
E(n6) = np6,0 = 370(.10) = 37

b.

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni)
   = (84 − 111)²/111 + (79 − 74)²/74 + (75 − 74)²/74 + (49 − 37)²/37
     + (36 − 37)²/37 + (47 − 37)²/37
   = 13.541

c.

To determine if the true percentages of the colors produced differ from the manufacturer's
stated percentages, we test:

H0: p1 = .30, p2 = .20, p3 = .20, p4 = .10, p5 = .10, p6 = .10
Ha: At least one pi does not equal its hypothesized value

The test statistic is χ² = 13.541.


The rejection region requires α = .05 in the upper tail of the χ² distribution with df = k − 1
= 6 − 1 = 5. From Table VII, Appendix B, χ².05 = 11.0705. The rejection region is
χ² > 11.0705.

Since the observed value of the test statistic falls in the rejection region
(χ² = 13.541 > 11.0705), H0 is rejected. There is sufficient evidence to indicate the true
percentages of the colors produced differ from the manufacturer's stated percentages at
α = .05.
9.40

a.

The expected cell counts are:

E11 = R1C1/n = 20(11)/31 = 7.097      E12 = R1C2/n = 20(20)/31 = 12.903
E21 = R2C1/n = 11(11)/31 = 3.903      E22 = R2C2/n = 11(20)/31 = 7.097

b.

One of the assumptions for the chi-square test is that the sample size, n, is large enough
so that, for every cell, the expected cell count, Eij, will be equal to 5 or more. For cell (2, 1),
the expected cell count is only 3.903.

c.

To determine if inside ownership and size are independent, we test:

H0: Inside ownership and size are independent
Ha: Inside ownership and size are dependent

The p-value is .0043. Since the p-value is so small, H0 is rejected. There is sufficient
evidence to indicate that inside ownership and size are dependent for α > .0043.

d.

First, we find the percentages by dividing each cell count by the column total and
multiplying by 100. The row totals are divided by the total sample size. The percentages
are found in the table:
                              Size
Insider Ownership     Small               Large               Totals
Low                   3/11 = 27.3%        17/20 = 85%         20/31 = 64.5%
High                  8/11 = 72.7%        3/20 = 15%          11/31 = 35.5%


Using MINITAB, the bar chart is:

[Bar chart of the percentage of low insider ownership for small firms (27.3%) and large
firms (85%), with the overall percentage of 64.5% shown for reference.]

Since the bars are not the same height, there is evidence that insider ownership and size
are dependent. This is what we found in part c.
9.42

Some preliminary calculations are:

E11 = R1C1/n = 100(171)/500 = 34.2     E12 = R1C2/n = 100(207)/500 = 41.4
E13 = R1C3/n = 100(80)/500 = 16.0      E14 = R1C4/n = 100(42)/500 = 8.4
E21 = R2C1/n = 175(171)/500 = 59.9     E22 = R2C2/n = 175(207)/500 = 72.5
E23 = R2C3/n = 175(80)/500 = 28.0      E24 = R2C4/n = 175(42)/500 = 14.7
E31 = R3C1/n = 145(171)/500 = 49.6     E32 = R3C2/n = 145(207)/500 = 60.0
E33 = R3C3/n = 145(80)/500 = 23.2      E34 = R3C4/n = 145(42)/500 = 12.2
E41 = R4C1/n = 80(171)/500 = 27.4      E42 = R4C2/n = 80(207)/500 = 33.1
E43 = R4C3/n = 80(80)/500 = 12.8       E44 = R4C4/n = 80(42)/500 = 6.7


To determine if there is a dependence between a son's choice of occupation and his
father's occupation, we test:

H0: Son's choice of occupation and his father's occupation are independent
Ha: Son's choice of occupation and his father's occupation are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (55 − 34.2)²/34.2 + (38 − 41.4)²/41.4 + (7 − 16.0)²/16.0 + (0 − 8.4)²/8.4 + (79 − 59.9)²/59.9
     + (71 − 72.5)²/72.5 + (25 − 28)²/28 + (0 − 14.7)²/14.7 + (22 − 49.6)²/49.6 + (75 − 60)²/60
     + (38 − 23.2)²/23.2 + (10 − 12.2)²/12.2 + (15 − 27.4)²/27.4 + (23 − 33.1)²/33.1
     + (10 − 12.8)²/12.8 + (32 − 6.7)²/6.7
   = 181.32

The rejection region requires α = .05 in the upper tail of the χ² distribution with df
= (r − 1)(c − 1) = (4 − 1)(4 − 1) = 9. From Table VII, Appendix B, χ².05 = 16.9190. The
rejection region is χ² > 16.9190.

Since the observed value of the test statistic falls in the rejection region (χ² = 181.32 >
16.9190), H0 is rejected. There is sufficient evidence to indicate a dependence between a son's
choice of occupation and his father's occupation at α = .05.
9.44

a.

Some preliminary calculations are:

E11 = R1C1/n = 57(52)/86 = 34.465     E12 = R1C2/n = 57(34)/86 = 22.535
E21 = R2C1/n = 29(52)/86 = 17.535     E22 = R2C2/n = 29(34)/86 = 11.465

To determine if manufacturing firms were more likely to be involved with TQM than
service firms, we test:

H0: Type of firm and TQM are independent
Ha: Type of firm and TQM are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (34 − 34.465)²/34.465 + (23 − 22.535)²/22.535 + (18 − 17.535)²/17.535 + (11 − 11.465)²/11.465
   = .047

The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (2 − 1)(2 − 1) = 1. From Table VII, Appendix B, χ².05 = 3.84146. The
rejection region is χ² > 3.84146.


Since the observed value of the test statistic does not fall in the rejection region (χ² = .047
≯ 3.84146), H0 is not rejected. There is insufficient evidence to indicate that the type of
firm and TQM are dependent at α = .05. There is no evidence to indicate that
manufacturing firms are more likely to be involved with TQM than service firms.

b.

The p-value is P(χ² > .047). From Table VII, Appendix B, with df = 1, .10 < P(χ² > .047)
< .90.
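If an exact p-value is wanted rather than the Table VII bounds, it can be computed directly.
The Python sketch below is an added illustration (it assumes SciPy and is not part of the
original solution).

    # Sketch: exact p-value for the exercise 9.44 test statistic (assumes SciPy)
    from scipy.stats import chi2

    p_value = chi2.sf(0.047, df=1)
    print(round(p_value, 3))   # about 0.83, consistent with .10 < p < .90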

c.

We must assume:

1. The n observed counts are a random sample from the population of interest. We
   may then consider this to be a multinomial experiment with r × c = 2 × 2 = 4
   possible outcomes.
2. The sample size, n, will be large enough so that, for every cell, the expected cell
   count, E(nij), will be equal to 5 or more.

9.46

a.

Some preliminary calculations are:

E(n1) = np1,0 = 85(.26) = 22.1
E(n2) = np2,0 = 85(.30) = 25.5
E(n3) = np3,0 = 85(.11) = 9.35
E(n4) = np4,0 = 85(.14) = 11.9
E(n5) = np5,0 = 85(.19) = 16.15

To determine if the probabilities differ from the hypothesized values, we test:

H0: p1 = .26, p2 = .30, p3 = .11, p4 = .14, p5 = .19
Ha: At least one of the probabilities differs from its hypothesized value

The test statistic is

χ² = Σ [ni − E(ni)]²/E(ni)
   = (32 − 22.1)²/22.1 + (26 − 25.5)²/25.5 + (15 − 9.35)²/9.35 + (6 − 11.9)²/11.9 + (6 − 16.15)²/16.15
   = 17.16

The rejection region requires α = .05 in the upper tail of the χ² distribution with df = k − 1
= 5 − 1 = 4. From Table VII, Appendix B, χ².05 = 9.48773. The rejection region is
χ² > 9.48773.

Since the observed value of the test statistic falls in the rejection region (χ² = 17.16 >
9.48773), reject H0. There is sufficient evidence to indicate the probabilities differ from
their hypothesized values at α = .05.


b.

p̂1 = n1/n = 32/85 = .37647

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV,
Appendix B, z.025 = 1.96. The 95% confidence interval is:

p̂1 ± z.025 √(p̂1(1 − p̂1)/n) ⇒ .376 ± 1.96 √(.37647(1 − .37647)/85) ⇒ .376 ± .103
⇒ (.273, .479)

c.

The interval tells us that between 27.3% and 47.9% of the Avonex MS patients are
exacerbation-free during a two-year period. Since this interval is completely above the
percentage of placebo patients (26%), it seems that the Avonex patients are more likely to
have no exacerbations than placebo patients.

9.48

a.

Some preliminary calculations are:

The contingency table is:

Shift     Defectives     Non-Defectives     Total
1         25             175                200
2         35             165                200
3         80             120                200
Total     140            460                600

E11 = E21 = E31 = 200(140)/600 = 46.667
E12 = E22 = E32 = 200(460)/600 = 153.333

To determine if quality of the filters is related to shift, we test:

H0: Quality of filters and shift are independent
Ha: Quality of filters and shift are dependent

The test statistic is

χ² = Σ [nij − Eij]²/Eij
   = (25 − 46.667)²/46.667 + (175 − 153.333)²/153.333 + (35 − 46.667)²/46.667
     + (165 − 153.333)²/153.333 + (80 − 46.667)²/46.667 + (120 − 153.333)²/153.333
   = 47.98


The rejection region requires α = .05 in the upper tail of the χ² distribution with df =
(r − 1)(c − 1) = (3 − 1)(2 − 1) = 2. From Table VII, Appendix B, χ².05 = 5.99147. The
rejection region is χ² > 5.99147.

Since the observed value of the test statistic falls in the rejection region (χ² = 47.98 >
5.99147), H0 is rejected. There is sufficient evidence to indicate quality of filters and
shift are related at α = .05.

b.

The form of the confidence interval for p is:

p̂1 ± zα/2 √(p̂1q̂1/n)   where p̂1 = 25/200 = .125

For confidence coefficient .95, α = 1 − .95 = .05 and α/2 = .05/2 = .025. From Table IV,
Appendix B, z.025 = 1.96. The 95% confidence interval is:

.125 ± 1.96 √(.125(.875)/200) ⇒ .125 ± .046 ⇒ (.079, .171)

9.50

Using SAS, the output is:

The FREQ Procedure
Table of CANDIDATE by TIME

Frequency / Col Pct

CANDIDATE       1        2        3        4        5        6     Total
SMITH         208      208      451      392      351      410      2020
            52.53    55.32    55.34    55.92    56.16    55.33
COPPIN         55       51      109       98       88      104       505
            13.89    13.56    13.37    13.98    14.08    14.04
MONTES        133      117      255      211      186      227      1129
            33.59    31.12    31.29    30.10    29.76    30.63
Total         396      376      815      701      625      741      3654

Statistics for Table of CANDIDATE by TIME

Statistic                      DF       Value      Prob
-------------------------------------------------------
Chi-Square                     10      2.2839    0.9937
Likelihood Ratio Chi-Square    10      2.2722    0.9938
Mantel-Haenszel Chi-Square      1      0.9851    0.3209
Phi Coefficient                        0.0250
Contingency Coefficient                0.0250
Cramer's V                             0.0177

Sample Size = 3654

To determine if candidates received votes independent of time period, we test:

H0: Voting and Time period are independent
Ha: Voting and Time period are dependent

The test statistic is χ² = 2.2839.


Since no value of α was given, we will use α = .05. The rejection region requires α = .05 in
the upper tail of the χ² distribution with df = (r − 1)(c − 1) = (3 − 1)(6 − 1) = 10. From Table
VII, Appendix B, χ².05 = 18.3070. The rejection region is χ² > 18.3070.

Since the observed value of the test statistic does not fall in the rejection region
(χ² = 2.2839 ≯ 18.3070), H0 is not rejected. There is insufficient evidence to indicate Voting
and Time period are dependent at α = .05. Thus, we can conclude that voting and time period
are independent. This means that regardless of time period, the percentage of votes received by
each candidate is the same. In the table created by SAS, the bottom number in each cell is the
column percent. This is the percent of votes received by the candidate in each time period. An
inspection of these percents indicates that candidate Smith received approximately 55.3% of
the votes in each time period, candidate Coppin received approximately 13.8% of the votes, and
candidate Montes received approximately 30.9% of the votes. All of this indicates that the
election was rigged.


Discrimination in the Work Place

(To accompany Chapters 8 and 9)

Part I:

If we assume that those selected for termination were randomly selected from all workers, then the chi-square test for independence is appropriate. Using SAS, the output is:
TABLE OF RACE BY DECISION
RACE

DECISION

Frequency|
Percent |
Row Pct |
Col Pct |RETAINED|LAIDOFF | Total
---------+--------+--------+
WHITE
|
1051 |
31 |
1082
| 86.50 |
2.55 | 89.05
| 97.13 |
2.87 |
| 90.29 | 60.78 |
---------+--------+--------+
BLACK
|
113 |
20 |
133
|
9.30 |
1.65 | 10.95
| 84.96 | 15.04 |
|
9.71 | 39.22 |
---------+--------+--------+
Total
1164
51
1215
95.80
4.20
100.00
STATISTICS FOR TABLE OF RACE BY DECISION

Statistic                      DF      Value       Prob
-------------------------------------------------------
Chi-Square                      1     43.641      0.001
Likelihood Ratio Chi-Square     1     29.260      0.001
Continuity Adj. Chi-Square      1     40.666      0.001
Mantel-Haenszel Chi-Square      1     43.605      0.001
Fisher's Exact Test (Left)                        1.000
                    (Right)                    6.43E-08
                    (2-Tail)                   6.43E-08
Phi Coefficient                        0.190
Contingency Coefficient                0.186
Cramer's V                             0.190

Sample Size = 1215


To determine if the variables Race and Decision are related, we test:

H0: Race and Decision are independent
Ha: Race and Decision are dependent

The test statistic is χ² = 43.641.

The p-value is p = .001. Since the p-value is so small, there is evidence to reject H0. There is sufficient evidence to indicate that Race and Decision are related. From the table, only 2.9% of whites were terminated, while 15.0% of blacks were terminated. There is a significant difference between these percentages, which supports the plaintiff's position. However, this is all based on the assumption that those selected to be laid off were randomly selected. If the company made its decision based on performance, as it claims, then those selected to be terminated were not randomly selected and the test of hypothesis is invalid.
Part II:

If the workers to be terminated were truly selected at random, then the chi-square test for independence is appropriate. Using SAS, the output is:

TABLE OF STATUS BY AGE1

STATUS        AGE1
Frequency
Percent
Row Pct
Col Pct      UNDER 40      40 +     Total
ACTIVE             18        13        31
                32.73     23.64     56.36
                58.06     41.94
                72.00     43.33
TERMINATED          7        17        24
                12.73     30.91     43.64
                29.17     70.83
                28.00     56.67
Total              25        30        55
                45.45     54.55    100.00


STATISTICS FOR TABLE OF STATUS BY AGE1

Statistic                      DF      Value      Prob
Chi-Square                      1      4.556     0.033
Likelihood Ratio Chi-Square     1      4.651     0.031
Continuity Adj. Chi-Square      1      3.465     0.063
Mantel-Haenszel Chi-Square      1      4.473     0.034
Fisher's Exact Test (Left)             0.993
                    (Right)            0.031
                    (2-Tail)           0.055
Phi Coefficient                        0.288
Contingency Coefficient                0.277
Cramer's V                             0.288

Sample Size = 55

To determine if the variables Status and Age are related, we test:

H0: Age and Status are independent
Ha: Age and Status are dependent

The test statistic is χ² = 4.556.

The p-value is p = .033. Since the p-value is small, there is evidence to reject H0. There is sufficient evidence to indicate that Age and Status are related. From the table, 56.7% of those aged 40 and over were terminated, while only 28.0% of those aged under 40 were terminated. This significant difference in percentages supports the plaintiff's position.
We can also look at some other revealing statistics. If we compare the mean wages of those terminated against those who remained active, there is a significant difference: the mean wage of those terminated is significantly higher than the mean wage of those who remained active. Also, the mean age of those who remained active (33.0) is significantly less than the mean age of those who were terminated (44.08), and the mean wage of those under 40 ($26,452.20) was significantly less than the mean wage of those 40 or over ($39,044.17). All of this implies that those who were terminated were the older workers with the higher salaries. It appears that the company wanted not only to reduce the work force, but also to reduce its mean expenses for those remaining on the workforce.

I can find nothing to support the defendant's position.
TTEST PROCEDURE

Variable: WAGES

STATUS       N       Mean      Std Dev   Std Error   Variances        T      DF   Prob>|T|
ACTIVE      31   28772.26    6302.5283   1131.9675   Unequal    -6.8124    52.9     0.0001
TERMINATED  24   39195.42    5042.9673   1029.3914   Equal      -6.6214    53.0     0.0000*


For H0: Variances are equal, F' = 1.56   DF = (30,23)   Prob>F' = 0.2738
************************************************************************

Variable: AGE

STATUS       N      Mean   Std Dev   Std Error   Variances        T      DF   Prob>|T|
ACTIVE      31   33.0000    8.0000      1.4368   Unequal    -5.7661    53.0    0.0001*
TERMINATED  24   44.0833    6.2549      1.2768   Equal      -5.5886    53.0    0.0000

For H0: Variances are equal, F' = 1.64   DF = (30,23)   Prob>F' = 0.2273
************************************************************************

Variable: WAGES

AGE1         N         Mean     Std Dev   Std Error   Variances         T      DF   Prob>|T|
UNDER 40    25   26452.2000   4739.5548    947.9110   Unequal    -10.1970    49.3     0.0001
40 +        30   39044.1667   4334.8764    791.4365   Equal      -10.2814    53.0     0.0000*

For H0: Variances are equal, F' = 1.20   DF = (24,29)   Prob>F' = 0.6409
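The two-sample t-tests summarized in the SAS output above can be reproduced from the summary statistics alone. A minimal sketch, not part of the original solution, assuming Python with scipy:

    from scipy.stats import ttest_ind_from_stats

    # Summary statistics for WAGES by STATUS, taken from the TTEST output above.
    t_stat, p_value = ttest_ind_from_stats(
        mean1=28772.26, std1=6302.5283, nobs1=31,   # ACTIVE
        mean2=39195.42, std2=5042.9673, nobs2=24,   # TERMINATED
        equal_var=True)
    print(round(t_stat, 4), p_value)
    # t is about -6.62 with a two-tailed p-value well below .0001, matching the
    # equal-variance line of the SAS output: terminated workers had significantly
    # higher mean wages than those who remained active.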


Simple Linear Regression

Chapter 10

10.2

For all problems below, we use:

    Slope = "rise"/"run" = (y2 − y1)/(x2 − x1)

a.  Slope = (5 − 1)/(5 − 1) = 1 = β₁

    If y = β₀ + β₁x, then β₀ = y − β₁x.

    Since a given point is (1, 1) and β₁ = 1, the y-intercept is β₀ = 1 − 1(1) = 0.

b.  Slope = (0 − 3)/(3 − 0) = −1 = β₁

    If y = β₀ + β₁x, then β₀ = y − β₁x.

    Since (0, 3) is given, the y-intercept is β₀ = 3 − (−1)(0) = 3.

c.  Slope = (2 − 1)/(4 − (−1)) = 1/5 = .2 = β₁

    If y = β₀ + β₁x, then β₀ = y − β₁x.

    Since a given point is (−1, 1) and β₁ = 1/5, the y-intercept is β₀ = 1 − .2(−1) = 1.2.

d.  Slope = (−6 − 3)/(2 − (−6)) = −9/8 = −1.125 = β₁

    If y = β₀ + β₁x, then β₀ = y − β₁x.

    Since a given point is (−6, 3) and β₁ = −9/8, the y-intercept is β₀ = 3 − (−1.125)(−6) = −3.75.
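These slope and intercept calculations can be checked numerically. A minimal sketch, not part of the original solution, assuming Python is available:

    def line_through(p1, p2):
        """Slope and y-intercept of the straight line through two points."""
        (x1, y1), (x2, y2) = p1, p2
        slope = (y2 - y1) / (x2 - x1)     # "rise" over "run"
        intercept = y1 - slope * x1       # beta0 = y - beta1*x at a known point
        return slope, intercept

    print(line_through((1, 1), (5, 5)))      # part a: (1.0, 0.0)
    print(line_through((0, 3), (3, 0)))      # part b: (-1.0, 3.0)
    print(line_through((-1, 1), (4, 2)))     # part c: (0.2, 1.2)
    print(line_through((-6, 3), (2, -6)))    # part d: (-1.125, -3.75)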
10.4

a.  The equation for a straight line (deterministic) is y = β₀ + β₁x.

    If the line passes through (1, 1), then 1 = β₀ + β₁(1), i.e., 1 = β₀ + β₁.
    Likewise, through (5, 5): 5 = β₀ + β₁(5).

    Solving these two equations,

        1 = β₀ + β₁
     −(5 = β₀ + 5β₁)
        −4 = −4β₁  ⇒  β₁ = 1

    Substituting β₁ = 1 into the first equation, we get 1 = β₀ + 1 ⇒ β₀ = 0.

    The equation is y = 0 + 1x, or y = x.

b.  The equation for a straight line is y = β₀ + β₁x. If the line passes through (0, 3), then
    3 = β₀ + β₁(0), which implies β₀ = 3. Likewise, through the point (3, 0), 0 = β₀ + 3β₁,
    or β₀ = −3β₁. Substituting β₀ = 3, we get 3 = −3β₁, or β₁ = −1. Therefore, the line passing
    through (0, 3) and (3, 0) is y = 3 − x.

c.  The equation for a straight line is y = β₀ + β₁x. If the line passes through (−1, 1), then
    1 = β₀ + β₁(−1). Likewise, through the point (4, 2), 2 = β₀ + β₁(4). Solving these
    two equations,

        2 = β₀ + 4β₁
     −(1 = β₀ − β₁)
         1 = 5β₁  ⇒  β₁ = 1/5

    Solving for β₀: 1 = β₀ + (1/5)(−1), or β₀ = 1 + 1/5 = 6/5.

    The equation, with β₀ = 6/5 and β₁ = 1/5, is y = 6/5 + (1/5)x.

d.  The equation for a straight line is y = β₀ + β₁x. If the line passes through (−6, −3), then
    −3 = β₀ − 6β₁. Likewise, through the point (2, 6), 6 = β₀ + 2β₁. Solving these equations
    simultaneously,

         6 = β₀ + 2β₁
     −(−3 = β₀ − 6β₁)
         9 = 8β₁  ⇒  β₁ = 9/8

    Solving for β₀: 6 = β₀ + 2(9/8) ⇒ β₀ = 6 − 18/8 = 30/8.

    Therefore, y = 30/8 + (9/8)x.

10.6

a.  y = 4 + x. The slope is β₁ = 1. The intercept is β₀ = 4.

b.  y = 5 − 2x. The slope is β₁ = −2. The intercept is β₀ = 5.

c.  y = −4 + 3x. The slope is β₁ = 3. The intercept is β₀ = −4.

d.  y = −2x. The slope is β₁ = −2. The intercept is β₀ = 0.

e.  y = x. The slope is β₁ = 1. The intercept is β₀ = 0.

f.  y = .5 + 1.5x. The slope is β₁ = 1.5. The intercept is β₀ = .5.

10.8

The "line of means" is the deterministic component in a probabilistic model.

10.10

a.
        xi      yi      xi²      xi·yi
         7       2      49        14
         4       4      16        16
         6       2      36        12
         2       5       4        10
         1       7       1         7
         1       6       1         6
         3       5       9        15

    Totals:  Σxi = 24,  Σyi = 31,  Σxi² = 116,  Σxi yi = 80

b.  SSxy = Σxi yi − (Σxi)(Σyi)/n = 80 − (24)(31)/7 = 80 − 106.2857143 = −26.2857143

c.  SSxx = Σxi² − (Σxi)²/n = 116 − (24)²/7 = 116 − 82.28571429 = 33.71428571

d.  β̂₁ = SSxy/SSxx = −26.2857143/33.71428571 = −.779661017 ≈ −.7797

e.  x̄ = 24/7 = 3.428571429,  ȳ = 31/7 = 4.428571429

f.  β̂₀ = ȳ − β̂₁x̄ = 4.428571429 − (−.779661017)(3.428571429)
        = 4.428571429 + 2.673123487 = 7.101694916 ≈ 7.102

g.  The least squares line is ŷ = β̂₀ + β̂₁x = 7.102 − .7797x.
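The computing formulas for the least squares estimates can be verified numerically. A minimal sketch, not part of the original solution, assuming Python with numpy and using the data listed in part a:

    import numpy as np

    x = np.array([7, 4, 6, 2, 1, 1, 3], dtype=float)
    y = np.array([2, 4, 2, 5, 7, 6, 5], dtype=float)
    n = len(x)

    ss_xy = (x * y).sum() - x.sum() * y.sum() / n    # SSxy
    ss_xx = (x ** 2).sum() - x.sum() ** 2 / n        # SSxx
    b1 = ss_xy / ss_xx                               # estimated slope
    b0 = y.mean() - b1 * x.mean()                    # estimated intercept

    print(round(b1, 4), round(b0, 3))   # -0.7797 and 7.102, matching the hand calculation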


10.12

a.

b.  Choose y = 1 + x since it best describes the relation of x and y.

c.
        x       y     ŷ = 1 + x    y − ŷ     ŷ = 3 − x    y − ŷ
        .5      2        1.5         .5         2.5        −.5
       1.0      1        2.0       −1.0         2.0       −1.0
       1.5      3        2.5         .5         1.5        1.5
                      Sum of errors = 0      Sum of errors = 0

d.  SSE = Σ(y − ŷ)²

    SSE for 1st model, ŷ = 1 + x:  SSE = (.5)² + (−1)² + (.5)² = 1.5
    SSE for 2nd model, ŷ = 3 − x:  SSE = (−.5)² + (−1)² + (1.5)² = 3.5

    The best fitting straight line is the one that has the smallest SSE. The model
    ŷ = 1 + x has the smaller SSE, and therefore it verifies the visual check in part a.

e.  Some preliminary calculations are:

    Σx = 3,  Σy = 6,  Σxy = 6.5,  Σx² = 3.5

    SSxy = Σxy − (Σx)(Σy)/n = 6.5 − (3)(6)/3 = .5
    SSxx = Σx² − (Σx)²/n = 3.5 − (3)²/3 = .5

    β̂₁ = SSxy/SSxx = .5/.5 = 1
    x̄ = Σx/n = 3/3 = 1,  ȳ = Σy/n = 6/3 = 2
    β̂₀ = ȳ − β̂₁x̄ = 2 − 1(1) = 1

    ŷ = β̂₀ + β̂₁x = 1 + x

    The least squares line is the same as the second line given.
10.14

a.  The straight-line model would be: y = β₀ + β₁x + ε

b.  The least squares line is: ŷ = −2,298.4 + 11,598.9x

c.  Since the range of observed values for the number of carats (x) does not include 0, the
    y-intercept has no meaning.

d.  The slope of the line is β̂₁. In terms of this problem, β̂₁ is the estimated change in the mean
    asking price for each additional carat. This interpretation is meaningful for values of x
    within the observed range. The observed range of x is .18 to 1.10.

e.  ŷ = −2,298.4 + 11,598.9(.52) = 3,733.028. The predicted asking price for a .52 carat
    diamond is $3,733.028.

10.16

a.  Some preliminary calculations are:

    Σx = 62,  Σy = 97.8,  Σx² = 720.52,  Σy² = 1,710.2,  Σxy = 1,087.78

    x̄ = Σx/n = 62/6 = 10.33333333,  ȳ = Σy/n = 97.8/6 = 16.3

    SSxy = Σxy − (Σx)(Σy)/n = 1,087.78 − 62(97.8)/6 = 1,087.78 − 1,010.6 = 77.18

    SSxx = Σx² − (Σx)²/n = 720.52 − (62)²/6 = 720.52 − 640.667 = 79.8533333

    β̂₁ = SSxy/SSxx = 77.18/79.8533333 = 0.966521957 ≈ 0.9665

    β̂₀ = ȳ − β̂₁x̄ = 16.3 − 0.966521957(10.33333333) = 6.312606448 ≈ 6.3126

    ŷ = 6.3126 + .9665x

b.  Since x = 0 is not in the observed range of the mean pore diameters, the y-intercept has no
    meaning.

c.  For each unit increase in mean pore diameter, the mean value of porosity is estimated to
    increase by .9665.

d.  For x = 10, ŷ = 6.3126 + .9665(10) = 15.9776.

10.18

a.  Some preliminary calculations are:

    Σx = 6,167,  Σy = 135.8,  Σx² = 1,641,115,  Σxy = 34,764.5,  n = 24

    SSxy = Σxy − (Σx)(Σy)/n = 34,764.5 − (6,167)(135.8)/24 = −130.44167

    SSxx = Σx² − (Σx)²/n = 1,641,115 − (6,167)²/24 = 56,452.95833

    β̂₁ = SSxy/SSxx = −130.44167/56,452.958 = −.002310625 ≈ −.0023

    β̂₀ = ȳ − β̂₁x̄ = 135.8/24 − (−.002310625)(6,167/24) = 6.252067683 ≈ 6.25

    The least squares line is ŷ = 6.25 − .0023x.

b.  β̂₀ = 6.25. Since x = 0 is not in the observed range, β̂₀ has no interpretation other than
    being the y-intercept.

    β̂₁ = −.0023. For each additional increase of 1 part per million of pectin, the mean
    sweetness index is estimated to decrease by .0023.

c.  ŷ = 6.25 − .0023(300) = 5.56

10.20

a.

A proposed model is E(y) = o + 1x.

b.

Some preliminary calculations are:

x = 1, 292.7
x 2 = 88,668.43


y = 3,781.1

xy = 218, 291.63

y 2 = 651,612.45


x=

x = 1, 292.7 = 58.75909091

y=

22

SS xy = xy

y = 3,781.1 = 171.8681818
22

( x )( y ) = 218, 291.63 1, 292.7(3,781.1)

n
22
= 218, 291.63 222,173.9986 = 3,882.3686

( x)

(1, 292.7) 2
n
22
= 88,668.43 75,957.87682 = 12,710.55318

SSxx = x

1 =

SSxy
SSxx

= 88,668.43

3,882.3686
= 0.305444503 0.305
12,710.55318

o = y 1 x = 171.8681818 (0.305444503)(58.75909091)
= 189.8158231 189.816
The fitted regression line is: y = 189.816 0.305 x
c.

Using MINITAB, a graph of the fitted regression line is:


[Fitted line plot: FCAT-Math = 189.8 − 0.3054 Percent;  S = 5.36572,  R-Sq = 67.3%,  R-Sq(adj) = 65.7%]

From the fitted regression line, the relationship between the two variables is
negative.


d.

o = 189.816 . Since 0 is not in the range of observed values of the variable %


Below Poverty, the y-intercept has no meaning.

1 = 0.305 .

e.

For each unit change in % Below Poverty, the mean value of


FCAT-Math is estimated to decrease by 0.305.

A proposed model is E(y) = o + 1x.


Some preliminary calculations are:

x = 1, 292.7

y = 3,764.2

x 2 = 88,668.43
x=

y 2 = 645, 221.16

x = 1, 292.7 = 58.75909091
n

22

SSxy = xy

xy = 217,738.81

y=

y = 3,764.2 = 171.1
n

22

( x )( y ) = 217,738.81 1, 292.7(3,764.2)

n
= 217,738.81 221,180.97 = 3, 442.16

( x)

22

(1, 292.7) 2
n
22
= 88,668.43 75,957.87682 = 12,710.55318

SS xx = x

1 =

SSxy
SSxx

= 88,668.43

3, 442.16
= 0.270811187 0.271
12,710.55318

o = y 1 x = 171.1 (0.270811187)(58.75909091) = 187.0126192 187.013


The fitted regression line is: y = 187.013 0.271x


Using MINITAB, a graph of the fitted regression line is:


[Fitted line plot: FCAT-Read = 187.0 − 0.2708 Percent;  S = 3.42319,  R-Sq = 79.9%,  R-Sq(adj) = 78.9%]

From the fitted regression line, the relationship between the two variables is
negative.

10.22

o = 187.013 .

Since 0 is not in the range of observed values of the variable %


Below Poverty, the y-intercept has no meaning.

1 = 0.271 .

For each unit change in % Below Poverty, the mean value of


FCAT-Reading is estimated to decrease by .271.

a.

We will select Average Salary as the dependent variable and Mean GMAT as the
independent variable.

b.

Some preliminary calculations are:

x = 6,944

y = 1,080, 288

x 2 = 4,824,680

y 2 = 118,151,669, 430

x=

x = 6,944 = 694.4
n

10

SSxy = xy

y=

y = 1,080, 288 = 108,028.8


n

10

( x )( y ) = 751,698, 490 6,944(1,080, 288)

n
= 751,698, 490 75,015,987.2 = 1,546,502.8


xy = 751,698, 490

10


( x)

(6,944) 2
n
10
= 4,824,680 4,821,913.6 = 2,766.4

SSxx = x

1 =

SSxy
SSxx

= 4,824,680

1,546,502.8
= 559.0307981 559.031
2,766.4

o = y 1 x = 108,028.8 (559.0307981)(694.4) = 280,162.1862 280,162.186


The fitted regression line is: y = 280,162.186 + 559.031x

o = 280,162.186 .

Since 0 is not in the range of observed values of the variable


Mean GMAT, the y-intercept has no meaning.

1 = 0.271 . For each additional point increase in the mean GMAT score, the mean
value of Average Salary is estimated to increase by $559.031.

10.24

The graph in b would have the smallest s2 because the width of the data points is the smallest.

10.26

a.  SSE = SSyy − β̂₁SSxy = 95 − .75(50) = 57.5

    s² = SSE/(n − 2) = 57.5/(20 − 2) = 3.19444

b.  SSyy = Σy² − (Σy)²/n = 860 − (50)²/40 = 797.5

    SSE = SSyy − β̂₁SSxy = 797.5 − .2(2,700) = 257.5

    s² = SSE/(n − 2) = 257.5/(40 − 2) = 6.776315789 ≈ 6.7763

c.  SSyy = Σ(y − ȳ)² = 58

    β̂₁ = SSxy/SSxx = 91/170 = .535294117

    SSE = SSyy − β̂₁SSxy = 58 − .535294117(91) = 9.2882353 ≈ 9.288

    s² = SSE/(n − 2) = 9.2882353/(10 − 2) = 1.161029413 ≈ 1.1610
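The pattern in all three parts is the same: SSE = SSyy − β̂₁SSxy and s² = SSE/(n − 2). A minimal sketch, not part of the original solution, assuming Python:

    def sse_and_s2(ss_yy, ss_xy, b1, n):
        """SSE = SSyy - b1*SSxy and s^2 = SSE/(n - 2)."""
        sse = ss_yy - b1 * ss_xy
        return sse, sse / (n - 2)

    print(sse_and_s2(95.0, 50.0, 0.75, 20))       # part a: (57.5, 3.1944...)
    print(sse_and_s2(797.5, 2700.0, 0.2, 40))     # part b: (257.5, 6.7763...)
    print(sse_and_s2(58.0, 91.0, 91 / 170, 10))   # part c: (about 9.288, about 1.161)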

10.28

a.

From the printout, SSE = 382,178,624, s2 = MSE = 1,248,950, and s = 1,117.56.

b.

s = 1,117.56. We would expect approximately 95% of the observed values of y to


fall within 2s or 2(1,117.56) = 2,235.12 of their least squares predicted values.


10.30

a.

From part a of Exercise 10.17, SSxy = 20.00833333,

y = 239 , y 2 = 10, 255 ,

and 1 = 35.91623038 .

( y)

(239) 2
n
6
= 10, 255 9520.166667 = 734.8333333

SS yy = y

= 10, 255

SSE = SS yy 1SS xy = 734.833333 35.91623068(20.00833333) = 16.2094179


s 2 = MSE =

10.32

SSE 16.2094179
=
= 4.052354475 and s = 4.052354475 = 2.013
n2
62

b.

s = 2.013. We would expect approximately 95% of the observed values of y (Drug


release rate) to fall within 2s or 2(2.013) = 4.026 units of their least squares predicted
values.

a.

Using MINITAB, the scattergram of the data is:

b.

x = 44.71
y = 131,670
y = 1,514,402,100

xy

= 493,117.7

= 167.4615

x=

x = 44.71 = 3.7258333
n

SSxy =

12

xy

y=

y = 131, 670
n

12

= 10,972.5

( x )( y ) = 493,117.7 44.71(131, 670)

n
= 493,117.7 490,580.475 = 2,537.225

( x)

12

44.712
n
12
= 167.4615 166.5820083 = .8794917

SSxx =

1 =


SSxy
SS xx

= 167.4615

2, 537.225
= 2884.876571 2884.877
.8794917


0 = y 1 x = 10,972.5 2884.876571(3.7258333) = 10,972.5 10,748.56929


= 233.93071 233.931
The fitted regression line is = 233.931 + 2884.877x

c.

( y)

131, 6702
n
12
= 1,514,402,100 1,444,749,075 = 69,653,025

SSyy =

= 1,514,402,1000

SSE = SSyy 1 SSxy = 69,653,025 2,884.876571(2,537.225)


= 69,653,025 - 7,319,580.958 = 62,333,444.04
s2 =

SSE 62, 333, 444.04


=
= 6,233,344.404
n2
12 2

s=

s 2 = 6, 233, 344.404 = 2,496.6667

We would expect to see most of the hospital charges to fall within 2s or 2($2,496.6667) =
$4,993.3333 of the least squares line.
d.

For x = 4, y = 223.931 + 2,884.877(4) = 11,763.439


y 2s 11,763.439 4,993.3333 (6,770.106, 16,756.772)

e.

10.34

Only one state (California) had an average hospital charge more than 2 standard errors
from the least squares line. Thus, 11 out of 12 or 11/12 or .917 of the states had average
hospital charges within 2 standard errors of the least squares line.

Some preliminary calculations for Brand A are:

x = 750

SSxy = xy

x y = 2, 022 750(44.8) = 218

SSxx = x 2
SS yy = y

= 40, 500

xy = 2, 022 y = 44.8

( x)

= 168.70

15

( y)

= 40, 500

7502
= 3, 000
15

= 168.70

44.82
= 34.89733333
15

218
= 0.0726666667 0.0727
SSxx 3, 000
44.8
750
0 = y 1 x =
(0.0726666667)
= 6.62
15
15

1 =

SSxy


The least squares prediction equation for Brand A is: y = 6.62 0.0727 x
Some preliminary calculations for Brand B are:

x = 750

SSxy = xy

x y = 2, 622 750(58.9) = 323

SSxx = x 2

SS yy = y

= 40, 500

xy = 2, 622 y = 58.9

( x)

= 270.89

15

( y)

= 40, 500

7502
= 3, 000
15

= 270.89

58.92
= 39.60933333
15

323
1 =
=
= 0.1076666667 0.1077
SSxx 3, 000
58.9
750
0 = y 1 x =
(0.1076666667)
= 9.31
15
15
SSxy

The least squares prediction equation for Brand B is: y = 9.31 0.1077 x
For Brand A,
SSE = SS yy 1SS xy = 34.89733333 ( 0.072666667)(218) = 19.0560
s 2 = MSE =

SSE 19.0560
=
= 1.4658 and s = 1.4658 = 1.211
n 2 15 2

For Brand B,
SSE = SS yy 1SS xy = 39.60933333 (0.107666667)(323) = 4.833
s 2 = MSE =

SSE 4.833
=
= 0.37177 and s = 0.37177 = .61
n 2 15 2

For Brand A, y = 6.62 .0727x. For x = 70, y = 6.62 .0727(70) = 1.531


2s = 2(1.211) = 2.422
Therefore, y 2s 1.531 2.422 (.891, 3.593)
For Brand B, y = 9.31 .1077x. For x = 70, y = 9.31 .1077(70) = 1.7
2s = 2(.61) = 1.22
Therefore, y 2s 1.771 1.22 (.551, 2.991)
More confident with Brand B since there is less variation (s is smaller).


10.36

a.

b.

Some preliminary calculations are:

= 21

SSxy =

x = 91 xy = 86
y = 21
x y = 86 21(21) = 86 63 = 23
xy
2

SSxx =

SSyy =

( x)

= 89

( y)

= 91

21
= 91 63 = 28
7

= 89

212
= 26
7

23
= .821428571 .821
28
SS xx
21
21
0 = y 1 x = .821428571 = 3 2.4642857 = .535714285 .536
7
7

1 =

SS xy

The fitted line is y = .536 + .821x.


c.
d.

See the plot in part a.


To test whether x contributes significant information for predicting y, we test:
H0: 1 = 0
Ha: 1 0

e.

The test statistic is t =

1 0
s

where s =
1

s
SSxx

SSE = SSyy 1 SSxy = 26 .821428571(23) = 7.107142857


SSE
7.107142857
= 1.421428571
s2 =
s = 1.42143 = 1.1922
=
72
n2
1.1922
.82143 0
s =
= .2253
t=
= 3.646
1
.2253
28
The degrees of freedom for this t is df = n 2 = 7 2 = 5.


f.  The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution. From
    Table VI, Appendix B, t.025 = 2.571 with df = n − 2 = 7 − 2 = 5. The rejection region is
    t > 2.571 or t < −2.571.

    Since the observed value of the test statistic falls in the rejection region (t = 3.646 >
    2.571), H0 is rejected. There is sufficient evidence to indicate that x contributes
    information for the prediction of y at α = .05.
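The test of H0: β₁ = 0 can be assembled directly from the sums of squares. A minimal sketch, not part of the original solution, assuming Python with scipy and using the quantities from this exercise:

    import math
    from scipy.stats import t as t_dist

    n, ss_xx, ss_xy, ss_yy = 7, 28.0, 23.0, 26.0
    b1 = ss_xy / ss_xx                       # about .8214
    sse = ss_yy - b1 * ss_xy                 # about 7.107
    s = math.sqrt(sse / (n - 2))             # about 1.1922
    se_b1 = s / math.sqrt(ss_xx)             # about .2253
    t_stat = b1 / se_b1                      # about 3.65
    t_crit = t_dist.ppf(0.975, df=n - 2)     # 2.571 for df = 5
    print(round(t_stat, 3), round(t_crit, 3))
    # |t| = 3.646 exceeds 2.571, so H0 is rejected, as in part f.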

10.38

Some preliminary calculations are:

= 21

SSxy =

x = 91 xy = 65
y = 19
x y = 65 21(19) = 65 66.5 = -1.5
xy
2

SSxx =

x2

SSyy =

( x)

= 65

( y)

= 91

212
= 91 73.5 = 17.5
6

= 65

192
= 65 60.166667 = 4.8333333
6

1.5
SS xy
1 =
=
= .085714285 .0857
17.5
SS xx
SSE = SSyy 1 SSxy = 4.8333333 (.085714285)(1.5) = 4.704761903
SSE
4.704761903
s2 =
s = 1.76190476 = 1.0845
=
= 1.176190476
62
n2

To determine whether a straight line is useful for characterizing the relationship between
x and y, we test:

H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

.08571 0
= .33
1.0845

17.5
The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df = n 2
= 6 - 2 = 4. From Table VI, Appendix B, t.025 = 2.776. The rejection region is
t > 2.776 or t < 2.776.
Since the observed value of the test statistic does not fall in the rejection region (t = .33 </
2.776), H0 is not rejected. There is insufficient evidence to indicate that a straight line is
useful for characterizing the relationship between x and y at = .05.


10.40

a.

To determine if the average state SAT score in 2005 has a positive relationship with
the average state SAT score in 1990, we test:

H0: 1 = 0
Ha: 1 > 0
b.

From the printout in Exercise 10.15, the p-value is p = 0.000. This is the p-value for a 2tailed test. The p-value for this one-tailed test is 0.000/2 = 0.000. Since the p-value is
less than = .05, H0 is rejected. There is sufficient evidence to indicate the average state
SAT score in 2005 has a positive relationship with the average state SAT score in 1990 at
= .05.

c.

For confidence coefficient .95, = .05 and /2 = .05/2 = .025. From Table VI, Appendix
B, with df = n 2 = 51 2 = 49, t.025 2.011. The 95% confidence interval is:

1 t.025 s 1.073 2.011(.056) 1.073 .113 (.960, 1.186)


1

We are 95% confident that for each additional point in the 1990 average state SAT
score, the increase in the 2005 average state SAT score is between .960 and 1.186.
10.42

From Exercise 10.18, SSxy = 130.44167, 1 = -0.002310625, and SSxx = 56,452.95833.

y = 135.8

y = 769.72
( y ) = 769.72 135.8

SS yy = y

24

= 1.3183333

SSE = SS yy 1SS xy = 1.3183333 ( 0.002310625)(130.44167) = 1.016931516


SSE 1.016931516
=
= 0.046224159 and s = 0.046224159 = 0.214998
n2
24 2
MSE
0.214998
s =
=
= 0.0009049
1
SSxx
56, 452.95833

s 2 = MSE =

For confidence coefficient .95, = .05 and /2 = .05/2 = .025. From Table VI, Appendix B,
with df = n 2 = 24 2 = 22, t.025 = 2.074. The confidence interval is:

1 t.025 s 0.0023 2.074(0.0009049)


1

0.0023 0.0019 (0.0042, 0.0004)

We are 95% confident that for each additional point increase in the amount of soluble
pectin, the mean sweetness index will decrease by between .0004 and .0042 points.
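A confidence interval for the slope has the form β̂₁ ± t(α/2)·s/√SSxx. A minimal sketch, not part of the original solution, assuming Python with scipy and using the quantities computed above:

    import math
    from scipy.stats import t as t_dist

    b1, s, ss_xx, n = -0.002310625, 0.214998, 56452.95833, 24
    se_b1 = s / math.sqrt(ss_xx)               # about 0.0009049
    t_crit = t_dist.ppf(0.975, df=n - 2)       # 2.074 for df = 22
    lower, upper = b1 - t_crit * se_b1, b1 + t_crit * se_b1
    print(round(lower, 4), round(upper, 4))    # about (-0.0042, -0.0004), matching the interval above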


10.44

a.

From Exercise 10.23, SSxy = -787.51087, SSxx = 6,906.6087,

y = 60.1 ,

= 262.271 , and 1 = 0.114022801 .

( y)

(60.1) 2
23
n
= 262.271 157.043913 = 105.227087

SS yy = y

= 262.271

SSE = SS yy 1SS xy = 105.227087 ( 0.114022801)( 787.51087) = 15.43289179


s 2 = MSE =

s =
1

SSE 15.43289179
=
= 0.734899609 and s = 0.734899609 = 0.8573
n2
23 2

MSE

SS xx

0.734899609
6,906.6087

= 0.010315

To determine if the mass of the spill tends to diminish linearly as time increases, we test:
H0: 1 = 0
Ha: 1 < 0
The test statistic is t =

1 0
s

0.114022801
= 11.05
0.010315

The rejection region requires = .05 in the lower tail of the t-distribution with
df = n 2 = 23 2 = 21. From Table VI, Appendix B, t.05 = 1.721. The rejection
region is t < 1.721.
Since the observed value of the test statistic falls in the rejection region
(t = 11.05 < 1.721), H0 is rejected. There is sufficient evidence to indicate the mass
of the spill tends to diminish linearly as time increases at = .05.
b.

For confidence coefficient .95, = .05 and /2 = .05/2 = .025. From Table VI,
Appendix B, with df = n 2 = 23 2 = 21, t.025 = 2.080. The 95% confidence interval
is:

1 t.025 s 0.1140 2.080(0.010315) 0.1140 0.02146


1

(0.13546, -0.09254)
We are 95% confident that for each additional minute of elapsed time, the decrease
in spill mass is between 0.13546 and 0.09254.


10.46

a.

Using MINITAB, the scattergram is:

It appears from the plot that as the percentage of the population that is minority increases,
the number of people per branch bank tends to increase.
b.

The value of 1 will be positive. As one variable increases, the other tends to increase.

c.

x = 363.8

y
x=

y = 56,560

xy = 1,075,763

= 9,020.86

= 158,763,894

x = 363.8 = 17.32380952
n

SSxy =

21

xy

y=

x = 56, 560 = 2,693.33333


n

21

( x )( y ) = 1, 075, 763 363.8(56, 560)

n
21
= 1,075,763 979,834.6667 = 95,928.3333

( x)

363.82
n
21
= 9,020.86 6,302.401905 = 2,718.458095

SSxx =

1 =

SS xy

SS xx

= 9,020.86

95, 928.3333
= 35.28777342 35.288
2, 718.458095

( y)

56, 5602
n
21
= 158,763,894 - 152,334,933.3 = 6,428,960.7

SSyy =

= 158,863,894

SSE = SSyy 1 SSxy = 6,428,960.7 35.28777342(95,928.3333)


= 6,428,960.7 3,385,097.29 = 3,043,863.41
s2 =

SSE 3, 043,863.41
=
= 160,203.3374
n2
21 2


s=

2
s = 160, 203.3374 = 400.2541

To determine if the data support the charge made against the New Jersey banking
community, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

35.288 0
400.2541

= 4.597

2, 718.458095
The rejection region requires /2 =.01/2 = .005 in each tail of the t-distribution with
df = n 2 = 21 2 = 19. From Table VI, Appendix B, t.005 = 2.861. The rejection region
is t < 2.861 or t > 2.861.
Since the observed value of the test statistic falls in the rejection region (t = 4.597 >
2.861), H0 is rejected. There is sufficient evidence to support the charge made against the
New Jersey banking community at = .01.
10.48

a.

b.

Using MINITAB, the regression analysis is:


Regression Analysis: Index versus Interactions
The regression equation is
Index = 44.1 + 0.237 Interactions
Predictor
Constant
Interact
S = 19.40

Coef
44.130
0.2366

SE Coef
9.362
0.1865

R-Sq = 8.6%

T
4.71
1.27

P
0.000
0.222

R-Sq(adj) = 3.3%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
17
18

SS
606.0
6400.6
7006.6

MS
606.0
376.5

F
1.61

P
0.222

From the printout, the least squares line is y = 44.13 + .2366x.


c.

From the printout, s = 19.40


The standard deviation s represents the spread of the manager success index about the
least squares line. Approximately 95% of the manager success indexes should lie within
2s = 2(19.40) = 38.8 of the least squares line.

d.

Refer to the scattergram in part a. The number of interactions with outsiders might
contribute some information in the prediction of managerial success, but it does not look
like a very strong relationship.

e.

To determine if the number of interactions contributes information for the prediction of


managerial success, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 1.27

The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df =
n 2 = 19 2 = 17. From Table VI, Appendix B, t.025 = 2.110. The rejection region is
t > 2.110 or t < 2.110.
Since the observed value of the test statistic does not fall in the rejection region (t = 1.27
>/ 2.110), H0 is not rejected. There is insufficient evidence to indicate the number of
interactions contributes information for the prediction of managerial success at = .05.
f.

For confidence coefficient .95, = 1 .95 = .05 and /2 = .05/2 = .025. From Table VI,
Appendix B, with df = 17, t.025 = 2.110. The 95% confidence interval is:

1 t.025 s .2366 2.110(.1865) .2366 .3935 (.1569, .6301)


1

We are 95% confident the change in the mean manager success index for each additional
interaction with outsiders is between .1569 and .6301.
10.50

a.

Using MINITAB, the regression analysis is:


Regression Analysis: Risk versus Credit
The regression equation is
Risk = 56.2 - 0.400 Credit
Predictor
Constant
Credit

Coef
56.215
-0.39961

S = 12.6777

SE Coef
6.033
0.09152

R-Sq = 33.4%

T
9.32
-4.37

P
0.000
0.000

R-Sq(adj) = 31.7%

Analysis of Variance
Source
Regression
Residual Error
Total


DF
1
38
39

SS
3064.4
6107.5
9171.9

MS
3064.4
160.7

F
19.07

P
0.000


To determine if country credit risk contributes information for the prediction of market
volatility, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 4.37 (from printout).

The p-value is .000. Since the p-value is so small, there is strong evidence to indicate
that country credit risk contributes information for the prediction of market volatility at
> .000.
b.

Using MINITAB, a scattergram of the data with the fitted regression line is:

[Regression plot: Risk = 56.22 − .3996 Credit;  S = 12.6777,  R-Sq = 33.4%,  R-Sq(adj) = 31.7%;  x-axis: Credit, y-axis: Risk]

From the plot, there appears to be several outliers. Observations 1, 19, 34, and 36 have
arrows pointing at them.


c.

Eliminating those four data points and using MINITAB, the regression analysis is as
follows:
The regression equation is
Risk = 48.9 - 0.316 Credit
Predictor
Constant
Credit

Coef
48.891
-0.31599

s = 7.46401

Stdev
3.991
0.05883

R-sq = 45.9%

t-ratio
12.25
-5.37

p
0.000
0.000

R-sq(adj) = 44.3%

Analysis of Variance
SOURCE
Regression
Error
Total
Unusual
Obs.
4
25
27

DF
1
34
35

SS
1607.4
1894.2
3501.6

Observations
C2
C1
35.1
63.70
25.3
23.30
55.6
46.40

MS
1607.4
55.7

Fit Stdev.Fit
37.80
2.13
40.90
2.62
31.32
1.35

F
28.85

Residual
25.90
-17.60
15.08

p
0.000

St.Resid
3.62R
-2.52R
2.05R

R denotes an obs. with a large st. resid.

After eliminating the four data points, the regression analysis is very similar. The fitted
regression line is:

y = 48.891 .31599x
To determine if country credit risk contributes information for the prediction of market
volatility, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 5.37 (from printout).

The p-value is .000. Since the p-value is so small, there is strong evidence to indicate that
country credit risk contributes information for the prediction of market volatility at
> .000.
The standard error for the analysis when the four data points have been removed (s = 7.464)
is much smaller than the standard error with all the data points (s = 12.6777).


10.52

10.54

a.

r = 1 implies x and y are perfectly, positively related.

b.

r = 1 implies x and y are perfectly, negatively related.

c.

r = 0 implies x and y are not related.

d.

r = .90 implies x and y are positively related. Since r is close to 1, the strength of the
relationship is very high.

e.

r = .10 implies x and y are positively related. Since r is close to 0, the relationship is
fairly weak.

f.

r = .88 implies x and y are negatively related. Since r is close to 1, the relationship is
fairly strong.

a.

Some preliminary calculations are:

x =0
y = 12
SSxy =

x = 10 xy = 20
y = 70
x y = 20 0(12) = 20
xy
2

SSxx =

SSyy =

r=

( x)

( y)

SS xy

SS xxSS yy

= 10

0
= 10
5

= 70

122
= 41.2
5

20
10(41.2)

= .9853

r2 = .98532 = .9709
Since r = .9853, there is a very strong positive linear relationship between x and y.
Since r2 = .9709, 97.09% of the total sample variability around the sample mean response
is explained by the linear relationship between x and y.


b.

Some preliminary calculations are:

x =0
y = 16
SSxy =

x = 10
xy = 15
y = 74
x y = 15 0(16) = 15
xy
2

SSxx =

SSyy =

r=

( x)

( y)

02
= 10
5

= 74

162
= 22.8
5

SS xy

= 10

15

10(22.8)
SS xxSS yy
2
2
r = (.9934) = .9868

= .9934

Since r = .9934, there is a very strong negative linear relationship between x and y.
Since r2 = .9868, 98.68% of the total sample variability around the sample mean response
is explained by the linear relationship between x and y.
c.

Some preliminary calculations are:

x = 18
y = 14
SSxy =

x = 52 xy = 36
y = 32
x y = 36 18(14) = 0
xy
2

SSxx =

SSyy =


( x)

( y)

= 52

182
= 5.71428571
7

= 32

142
=4
7


SS xy

r=

5.71428571(4)

SS xxSS yy

=0

r2 = 02 = 0
Since r = 0, this implies that x and y are not linearly related.
Since r2 = 0, 0% of the total sample variability around the sample mean response is
explained by the linear relationship between x and y.

d.

Some preliminary calculations are:

x = 15
y =4
SSxy =

x = 71 xy = 12
y =6
x y = 12 15(4) = 0
xy
2

SSxx =

x2

SSyy =

r=

( x)

( y)

SS xy

SS xxSS yy
2
2
r =0 =0

= 71

152
= 26
5

=6

42
= 2.8
5

0
26(2.8)

=0

Since r = 0, this implies that x and y are not linearly related.


Since r2 = 0, 0% of the total sample variability around the sample mean response is
explained by the linear relationship between x and y.
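Each part uses r = SSxy/√(SSxx·SSyy) and the coefficient of determination r². A minimal sketch, not part of the original solution, assuming Python with numpy:

    import numpy as np

    def corr_from_sums(ss_xy, ss_xx, ss_yy):
        """r = SSxy / sqrt(SSxx * SSyy); r**2 is the coefficient of determination."""
        r = ss_xy / np.sqrt(ss_xx * ss_yy)
        return r, r ** 2

    print(corr_from_sums(20.0, 10.0, 41.2))    # part a: r about  .9853, r^2 about .9709
    print(corr_from_sums(-15.0, 10.0, 22.8))   # part b: r about -.9934, r^2 about .9868
    print(corr_from_sums(0.0, 26.0, 2.8))      # part d: r = 0, r^2 = 0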


10.56

10.58

10.60

a.

From the printout, r2 = R-Sq = 89.3%. 89.3% of the total sample variability around the
sample mean asking price is explained by the linear relationship between asking
price
and number of carats for diamond.

b.

r = r 2 = .893 = .945. The value of r has the same sign as 1 , which is positive. Since
r is very close to 1, there is a strong positive linear relationship between
asking price
and number of carats for diamond.

a.

Since r = .43, there is a fairly weak positive linear relationship between total time
allotted to sports and audience rating.

b.

r2 = .432 = .1849. Since r2 = .1849, 18.49% of the total sample variability around the
sample mean audience rating is explained by the linear relationship between audience
rating and total time allocated to sports.

a.

Using MINITAB, a scattergram of the data is:


[Scatterplot of NetWorth vs Age;  x-axis: Age, y-axis: NetWorth]

There appears to be a slight increase in the Net Worth as age increases, but the
relationship is fairly weak.
b.

Some preliminary calculations are:

x = 859

y = 303.8

x 2 = 53,567

y 2 = 8, 202.28

SS xy = xy

xy = 17,841.6

( x )( y ) = 17,841.6 859(303.8)

15
n
= 17,841.6 17,397.61333 = 443.98667


( x)

(859) 2
15
n
= 53,567 49,192.06667 = 4,374.93333

SS xx = x

= 53,567

( y)

(303.8) 2
15
n
= 8, 202.28 6,152.962667 = 2,049.317333

SS yy = y

1 =

r=

SSxy
SS xx

= 8, 202.28

443.98667
= 0.101484213 0.1015
4,374.93333

SSxy
SSxx SS yy

443.98667
= .1483
4,374.93333 2,049.317333

Since r is positive, there is a very weak positive linear relationship between a


persons net worth and his/her age.
c.

If r had a negative sign, the interpretation would be:


Since r is negative, there is a very weak negative linear relationship between a
persons net worth and his/her age.

10.62

From Exercises 10.23 and 10.44, SSxy = -787.51087, SSxx = 6,906.6087, and
SSyy = 105.227087.

r=

SSxy
SSxx SS yy

787.51087
= .924
6,906.6087 105.227087

There is a very strong negative linear relationship between mass of spill and elapsed
time of the spill.

r 2 = .9242 = .854 Approximately 85.4% of the variability in the mass of the spill
around the sample mean is explained by the linear relationship between mass of the spill
and elapsed time of the spill.


10.64

a.

Using MINITAB, the scattergram is:

[Scattergram;  x-axis: Digest, y-axis: WeightChg]

b.

Some preliminary calculations are:

x = 1, 266.5
y = 1, 075.5

xy = 4,103.25 y = 46

= 57, 390.75

x y = 4,103.25 1, 266.5(46) = 2, 716.130952

SSxy = xy
SSxx = x 2
SS yy = y

1 =
r=

SSxy
SSxx

( x)

42

= 57, 390.75

( y)

= 1, 075.5

(1, 266.5) 2
= 19,199.74405
42

462
= 1, 025.119048
42

n
2, 716.130952
=
= 0.141467039
19,199.74405

SSxy
SSxx SS yy

2, 716.130952
19,199.74405 1, 025.119048

= .6122

There is a moderate positive linear relationship between digestion efficiency and


weight change.
c.

To determine whether weight change is correlated with digestion, we test:


H0: = 0
Ha: 0
The test statistic is t =


r
1 r
n2
2

.6122
1 .61222
42 2

= 4.90


The rejection region requires /2 = .01/2 = .005 in each tail of the t-distribution with
df = n 2 = 42 2 = 40. From Table VI, Appendix B, t.005 = 2.704. The rejection
region is t > 2.704 or t < 2.704.
Since the observed value of the test statistic falls in the rejection region (t = 4.90 >
2.704), H0 is rejected. There is sufficient evidence to indicate weight change and
digestion are correlated at = .01.
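The test statistic for H0: ρ = 0 is t = r√((n − 2)/(1 − r²)). A minimal sketch, not part of the original solution, assuming Python with scipy and the values from part c:

    import math
    from scipy.stats import t as t_dist

    r, n = 0.6122, 42
    t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))   # about 4.90
    t_crit = t_dist.ppf(0.995, df=n - 2)             # 2.704 for a two-tailed test at alpha = .01
    print(round(t_stat, 2), round(t_crit, 3))
    # t = 4.90 exceeds 2.704, so H0: rho = 0 is rejected, as above.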
d.

After deleting the data corresponding to duck chow, the preliminary calculations are:

x = 701.50
SS xy = xy

SS xx = x 2
SS yy = y

= 21, 069

xy = 99.5 y = 18 y

= 404.00

x y = 99.5 701.50(18) = 482.1363636


n

( x)

33

= 21, 069

( y)

= 404

(701.50) 2
= 6,156.81061
33

(18) 2
= 394.1818182
33

n
482.1363636
1 =
=
= 0.078309435
SSxx 6,156.81061
SSxy

r=

SSxy
SSxx SS yy

482.1363636

= .3095

6,156.81061 394.1818182

There is a rather weak positive linear relationship between digestion efficiency and
weight change.
To determine whether weight change is correlated with digestion, we test:
H0: = 0
Ha: 0
The test statistic is t =

.3095

= 1.81
1 r2
1 .30952
n2
33 2
The rejection region requires /2 = .01/2 = .005 in each tail of the t-distribution with
df = n 2 = 33 2 = 31. From Table VI, Appendix B, t.005 = 2.750. The rejection
region is t > 2.750 or t < 2.750.

Since the observed value of the test statistic does not fall in the rejection region
(t = 1.81 >/ 2.750), H0 is not rejected. There is insufficient evidence to indicate weight
change and digestion are correlated at = .01.


e.

Using MINITAB, the scattergram is:

[Scattergram;  x-axis: Fiber, y-axis: Digest]

Some preliminary calculations are:

x = 943.5 x
y = 57, 390.75

= 24, 533.25

xy = 21, 405.5 y = 1, 266.5

SSxy = xy
SSxx = x 2
SS yy = y

1 =
r=

SSxy
SSxx

x y = 21, 405.5 943.5(1, 266.5) = 7, 045.51786


n

( x)

42

( y)

= 24, 533.25

(943.5) 2
= 3, 338.19643
42

= 57, 390.75

1, 266.52
= 19,199.74405
42

n
7, 045.51786
=
= 2.110576177
3, 338.19643

SSxy
SSxx SS yy

7, 045.51786
3, 338.19643 19,199.74405

= .8801

There is a fairly strong negative linear relationship between digestion efficiency and
acid-detergent fiber.


To determine whether acid-detergent fiber is correlated with digestion, we test:


H0: = 0
Ha: 0
The test statistic is t =

r
1 r2
n2

.8801
1 (.8801) 2
42 2

= 11.72

The rejection region requires /2 = .01/2 = .005 in each tail of the t-distribution with
df = n 2 = 42 2 = 40. From Table VI, Appendix B, t.005 = 2.704. The rejection
region is t > 2.704 or t < 2.704.
Since the observed value of the test statistic falls in the rejection region (t = 11.72 <
2.704), H0 is rejected. There is sufficient evidence to indicate acid-detergent fiber and
digestion are correlated at = .01.
After deleting the data corresponding to duck chow, the preliminary calculations are:

x = 877 x
y = 21, 069

xy = 17, 274 y = 701.50

= 24, 036.5

x y = 17, 274 877(701.50) = 1, 368.89394

SSxy = xy

SSxx = x 2
SS yy = y

( x)

33

= 24, 036.5

( y)

= 21, 069

(877) 2
= 729.56061
33

(701.50) 2
= 6,156.81061
33

n
1, 368.89394
1 =
=
= 1.876326547
SSxx
729.56061
SSxy

r=

SSxy
SSxx SS yy

1, 368.89394

= .6459

729.56061 6,156.81061

There is a moderate negative linear relationship between digestion efficiency and


acid-detergent fiber.
To determine whether acid-detergent fiber is correlated with digestion, we test:
H0: = 0
Ha: 0


The test statistic is t =

r
1 r2
n2

.6459
1 ( .6459) 2
33 2

= 4.71

The rejection region requires /2 = .01/2 = .005 in each tail of the t-distribution with
df = n 2 = 33 2 = 31. From Table VI, Appendix B, t.005 = 2.750. The rejection
region is t > 2.750 or t < 2.750.
Since the observed value of the test statistic falls in the rejection region (t = 4.71 <
2.750), H0 is rejected. There is sufficient evidence to indicate acid-detergent fiber and
digestion are correlated at = .01.
10.66

a.

b.

Some preliminary calculations are:

= 28

SSxy =

x = 224 xy = 254 y = 37 y
x y = 254 28(37) = 106
xy
2

SSxx =

x2

SSyy =

( x)

= 307

( y)

= 224

282
= 112
7

= 307

37 2
= 111.4285714
7

106
= .946428571
SS xx 112
37
28
.946428571 = 1.5
0 = y 1 x =
7
7

1 =

SS xy

The least squares line is y = 1.5 + .946x.


c.

SSE = SSyy 1 SSxy = 111.4285714 (.946428571)(106) = 11.1071429


SSE 11.1071429
=
= 2.22143
s2 =
n2
72


d.

The form of the confidence interval is:


1 ( xp x )
y t/2s
+
SSxx
n

where s =

2
s =

For xp = 3, y = 1.5 + .946(3) = 4.338 and x =

2.22143 = 1.4904

28
=4
7

For confidence coefficient .90, = 1 .90 = .10 and /2 = .10/2 = .05. From Table VI,
Appendix B, t.05 = 2.015 with df = n 2 = 7 2 = 5.
The 90% confidence interval is:
1 (3 4)
+
4.338 1.170 (3.168, 5.508)
7
112
2

4.338 2.015(1.4904)
e.

The form of the prediction interval is:


1 ( xp x )
y t/2s 1 + +
SSxx
n

The 90% prediction interval is:


1 (3 4)
+
4.338 3.223 (1.115, 7.561)
7
112
2

4.338 2.015(1.4904) 1 +
f.

The 95% prediction interval for y is wider than the 95% confidence interval for the mean
value of y when xp = 3.
The error of predicting a particular value of y will be larger than the error of estimating
the mean value of y for a particular x value. This is true since the error in estimating the
mean value of y for a given x value is the distance between the least squares line and the
true line of means, while the error in predicting some future value of y is the sum of two
errors: the error of estimating the mean of y plus the random error that is a component of
the value of y to be predicted.
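The two intervals differ only by the extra "1 +" under the square root. A minimal sketch, not part of the original solution, assuming Python with scipy and using the quantities from parts d and e:

    import math
    from scipy.stats import t as t_dist

    y_hat, s, n, x_bar, ss_xx, xp = 4.338, 1.4904, 7, 4.0, 112.0, 3.0
    t_crit = t_dist.ppf(0.95, df=n - 2)    # about 2.015 for a 90% interval

    half_ci = t_crit * s * math.sqrt(1 / n + (xp - x_bar) ** 2 / ss_xx)      # mean of y at xp
    half_pi = t_crit * s * math.sqrt(1 + 1 / n + (xp - x_bar) ** 2 / ss_xx)  # a single new y at xp
    print(round(half_ci, 3), round(half_pi, 3))
    # about 1.170 and 3.223: the prediction interval is always wider than the
    # confidence interval for the mean, as discussed in part f.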

10.68

a.

The form of the confidence interval is:


s
y = 22 = 2.2
y t/2
where y =
n
10
n

s2 =

( y)

n 1

(22) 2
10 = 3.7333 and s = 1.9322
10 1

82

For confidence coefficient .95, = 1 .95 = .05 and /2 = .05/2 = .025. From Table VI,
Appendix B, t.025 = 2.262 with df = n 1 = 10 1 = 9. The 95% confidence interval is:
2.2 2.262


1.9322
10

2.2 1.382 (.818, 3.582)


b.

c.

The confidence intervals computed in Exercise 10.63 are much narrower than that found
in part a. Thus, x appears to contribute information about the mean value of y.

d.

From Exercise 12.63, 1 = .843, s = .8619, SSxx = 38.9, and n = 10.


H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

1 0
s

.843 0
= 6.10
.8619

SSxx

38.9

The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df =
n 2 = 10 2 = 8. From Table VI, Appendix B, t.025 = 2.306. The rejection region is t >
2.306 or t < 2.306.
Since the observed value of the test statistic falls in the rejection region (t = 6.10 > 2.306),
H0 is rejected. There is sufficient evidence to indicate the straight-line model contributes
information for the prediction of y at = .05.
10.70

10.72

a.

The 95% confidence interval for E(y) when y = .52 is (3,598.1, 3,868.1). We are
95% confident that the mean asking price for a diamond weighing .52 carats is
between $3,598.10 and $3,868.10.

b.

The 95% prediction interval for y when y = .52 is (1529.8, 5,936.3). We are 95%
confident that the actual asking price for a diamond weighing .52 carats is between
$1,529.80 and $5,936.30.

Answers may vary. One possible answer is:


The 90% confidence interval for x = 220.00 is (5.64898, 5.83848). We are 90% confident that
the mean sweetness index of all orange juice samples will be between 5.64898 and 5.83848
parts per million when the pectin value is 220.00.


10.74

a.

Using MINITAB, the results of the regression analysis are:


Regression Analysis: Managers versus UnitsSold
The regression equation is
Managers = 5.33 + 0.586 UnitsSold
Predictor
Constant
UnitsSol

Coef
5.325
0.58610

S = 2.566

SE Coef
1.180
0.03818

R-Sq = 92.9%

T
4.51
15.35

P
0.000
0.000

R-Sq(adj) = 92.5%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
18
19

SS
1552.0
118.6
1670.5

MS
1552.0
6.6

F
235.63

P
0.000

To determine the usefulness of the model, we test:


H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 15.35 (from printout).

The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df =
n 2 = 20 2 = 18. From Table VI, Appendix B, t.025 = 2.101. The rejection region is
t > 2.101 or t < 2.101.
Since the observed value of the test statistic falls in the rejection region (t = 15.35
> 2.101), H0 is rejected. There is sufficient evidence to indicate the model is useful at
= .05. Therefore, the monthly sales is useful in predicting the number of managers at
= .05.
b.

For confidence coefficient .90, = 1 .90 = .10 and /2 = .10/2 = .05. From Table VI,
Appendix B, t.05 = 1.734 with df = 18.
For xp = 39, x =

x = 540
n

20

= 27, and y = 5.325 + .5861(39) = 28.1829.

The form of the prediction interval is:


2
1 (39 27) 2
1 ( xp x )
+
28.183 1.734(2.5664) 1 +
y t/2s 1 + +
20
4, 518
n
SSxx

28.183 4.629 (23.554, 32.812)


c.


We are 90% confident the actual number of managers needed when 39 units are sold is
between 23.55 and 32.81.


10.76

a.

From Exercise 10.34, SSxx = 3000 and x = 50.


Also, for Brand A, s = 1.211; for Brand B, s = .610.
For Brand A, y = 6.62 .0727(45) = 3.349, while for Brand B, y = 9.31 .1077(45)
= 4.464.
The degrees of freedom for both brands is n 2 = 15 2 = 13. For confidence coefficient
.90, (i.e., for all parts of this question), = .10 and /2 = .05. From Table VI, Appendix
B, with df = 13, t.05 = 1.771.
The form of both confidence intervals is y t/2s

2
1 ( xp x )
+
n
SSxx

For Brand A, we obtain:


1 (45 50)
+
3.349 .587 (2.762, 3.936)
15
3000
2

3.349 1.771(1.211)
For Brand B, we obtain:

1 (45 50)
+
4.464 .296 (4.168, 4.760)
15
3000
2

4.464 1.771(.610)

The first interval is wider, caused by the larger value of s.

b.

2
1 ( xp x )
The form of both prediction intervals is y t/2s 1 + +
n
SSxx

For Brand A, we obtain:


1 (45 - 50)
3.349 1.771(1.211) 1 + +
15
3000

3.349 2.224 (1.125, 5.573)

For Brand B, we obtain:


1 (45 - 50)
4.464 1.771(.610) 1 + +
15
3000

4.464 1.120 (3.344, 5.584)

Again, the first interval is wider, caused by the larger value of s. Each of these intervals
is wider than its counterpart from part a, since, for the same x, a prediction interval for an
individual y is always wider than a confidence interval for the mean of y. This is due to
an individual observation having a greater variance than the variance of the mean of a set
of observations.


c.

To obtain a confidence interval for the life of a brand A cutting tool that is operated at
100 meters per minute, we use:
2
1 ( xp x )
y t/2s 1 + +
n
SSxx

For x = 100, y = 6.62 .0727(100) = .65.


The degrees of freedom are n 2 = 15 2 = 13. For confidence coefficient .95, = .05
and /2 = .025. From Table VI, Appendix B, with df = 13, t.025 = 2.160.
Here, we obtain:
.65 2.160(1.211) 1 +

(100 50) 2
1
+
.65 3.606 (4.256, 2.956)
15
3000

The additional assumption would be that the straight line model fits the data well for the
x's actually observed all the way up to the value under consideration, 100. Clearly from
the estimated value of .65, this is not true (usually, negative "useful lives" are not
found).
10.78

a.

b.

One possible line is y = x.


x

y - y

1
3
5

1
3
5

1
3
5

0
0
0
0

For this example

( y y ) = 0

A second possible line is y = 3.


y - y

1
3
5

1
3
5

3
3
3

2
0
2
0

For this example

( y y ) = 0


c.

Some preliminary calculations are:

x = 9 x = 35 xy = 35
y = 9 y = 35
x y = 35 9(9) = 8
SSxy = xy
n
3
( x ) = 35 9 = 8
SSxx = x
n
3
( y ) = 35 9 = 8
SSyy = y
3
n
2

2
i

1 =

SS xy
SS xx

8
=1
8

9 9
0 = 1 x = 1 = 0
3

The least squares line is y = 0 + 1x = x.


d.

For y = x, SSE = SSyy 1 SSxy = 8 1(8) = 0


For y = 3, SSE = ( yi yi ) 2 = (1 3)2 + (3 3)2 + (5 3)2 = 8
The least squares line has the smallest SSE of all possible lines.

10.80

a.

The variables x and y do appear to be related. It appears when x increases, y tends to


increase.
b.

r = r 2 = .612 = .7823
The correlation between concentration and exhaustion index is .7823. This relationship is
positive since r > 0. The relationship is fairly strong. No, this does not mean that
concentration causes emotional exhaustion. They are just related.


c.

To determine if the straight-line relationship is useful, we test:


H0: 1 = 0
Ha: 1 0
The test statistic is t =

1 0
s

= 6.03

The rejection region requires /2 = .05/2 = .025 in each tail of the t-distribution with df =
n 2 = 25 2 = 23. From Table VI, Appendix B, t.025 = 2.069. The rejection region is
t > 2.069 or t < 2.069.
Since the observed value of the test statistic falls in the rejection region (t = 6.03 > 2.069),
H0 is rejected. There is sufficient evidence to indicate the model is useful for predicting
burnout at = .05.
d.

r2 = .612
61.2% of the sample variation of exhaustion index is explained by the linear relationship
between the exhaustion index and concentration.

e.

For confidence level .95, = .05 and /2 = .05/2 = .025. From Table VI, Appendix B,
with df = n 2 = 25 2 = 23, t.025 = 2.069. The 95% confidence interval is:

1 t.025 s 8.865 2.069(1.471) 8.865 3.043 (5.822, 11.908)


1

We are 95% confident that the change in mean exhaustion index for each unit
change in concentration is between 5.822 and 11.908.
f.

For confidence coefficient .95, = 1 .95 and /2 = .05/2 = .025. From Table VI,
Appendix B, t.025 = 2.069 with df = 23. The confidence interval is:
2
1 ( xp x )
y t/2s
where y = 29.497 + 8.865(80) = 679.703
+
n
SSxx

1 (80 68.56)
+
679.703 80.054
25
14, 026.16
(599.678, 759.757)
2

679.703 2.069(174.2074)

We are 95% confident that the interval from 599.648 to 759.757 encloses the mean
exhaustion level for all professionals who have 80% of their social contacts within their
work groups.


10.82

a.

x = 590,124
x = 27,727,637,890
xy = 1,396,503,941
y = 30,537.4
y = 73,506,140.4
( x )( y ) = 1,396,503,941 590,124(30, 537.4) = 10,284,507
SSxy = xy
13
13
( x ) = 27,727,637,890 590,124 = 939,458,250
SSxx = x
13
13
2

10, 284, 507


= .010947274 .0109
939, 458, 250
SS xx
.010947274(590,124)
30, 537.4
1 = y 1 x =
= 1852.088523 1852.089

13
13

1 =

SS xy

The least squares line is y = 1852.089 + .0109x.


b.

The plot of the data is:

c.

Based on the graph, it does not appear that the line fits the data very well. The points do
not lie very close to the line.

d.

Some preliminary calculations are:


SS yy = y

( y)

= 73, 506,140.4

(30, 537.4) 2
= 1, 772,848.19
13

SSE = SS yy 1SS xy = 1, 772,848.19 (0.010947274)(10, 284, 507) = 1, 660, 260.874

SSE 1, 660, 260.874


=
= 150, 932.8067
n2
13 2
and s = 150, 932.8067 = 388.501
s 2 = MSE =


For confidence level .95, = .05 and /2 = .05/2 = .025. From Table VI, Appendix B,
with df = n 2 = 13 2 = 11, t.025 = 2.201. The 95% confidence interval is:

1 t.025 s .0109 2.201


1

10.84

10.86

388.501
939, 458, 250

.0109 .0279 (0.0170, 0.0388)

e.

Since 0 is contained in the 95% confidence interval, there is no evidence to indicate that
there is a linear relationship between buying income and retail sales.

a.

r = .14. Because this value is close to 0, there is a very weak positive linear relationship
between math confidence and computer interest for boys.

b.

r = .33. Because this value is fairly close to 0, there is a weak positive linear relationship
between math confidence and computer interest for girls.

a.

1 = .020. For each additional 1% increase in leaves infected, the mean log of the
average number of infections per leaf is estimated to increase by .02.

b.

r2 = .816. 81.6% of the total sample variability around the sample mean log of the
average number of infections per leaf is explained by the linear relationship between the
log of the average number of infections per leaf and the percentage of leaves infected.

c.

s = .288. We would expect most of the observed values of the log of the average number
of infections per leaf to fall within 2s or 2(.288) or .576 units of their predicted values.

d.

r = .816 = .903. Because this number is close to 1, there is a fairly strong positive
linear relationship between the log of the average number of infections per leaf and the
percentage of leaves infected.

e.

To determine if there is a linear relationship between the log of the average number of
infections per leaf and the percentage of leaves infected, we test:
H0: 1 = 0
Ha: 1 0
The test statistic is t = r/√[(1 − r²)/(n − 2)] = .903/√[(1 − .816)/(100 − 2)] = 20.83

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with
df = n − 2 = 100 − 2 = 98. From Table VI, Appendix B, t.025 ≈ 1.99. The rejection region
is t < −1.99 or t > 1.99.

Since the observed value of the test statistic falls in the rejection region (t = 20.83 > 1.99),
H0 is rejected. There is sufficient evidence to indicate that there is a linear relationship
between the log of the average number of infections per leaf and the percentage of leaves
infected at α = .05.
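As a numerical check, the t statistic for testing a zero slope from a sample correlation can
be computed directly; a short Python sketch (values taken from this exercise):

    import math

    r = 0.903          # sample correlation (square root of r-squared = .816)
    n = 100            # sample size

    t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))   # test statistic for H0: beta1 = 0
    print(round(t_stat, 2))                          # about 20.8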


f.  For xp = 80%, ŷ = −.939 + .020(80) = .661. The antilog (base 10) of .661 is 4.58. Thus,
    when the percentage of leaves infected is 80%, the average number of infections per leaf
    is predicted to be 4.58.

10.88

a.  A straight-line model relating an NFL team's current value to its operating income is:
    y = β0 + β1x + ε

b.  Some preliminary calculations are:

    Σx = 1,037.6    Σy = 26,207    Σx² = 38,996.28    Σy² = 22,024,389    Σxy = 879,473.1

    x̄ = Σx/32 = 1,037.6/32 = 32.425        ȳ = Σy/32 = 26,207/32 = 818.96875

    SSxy = Σxy − (Σx)(Σy)/n = 879,473.1 − 1,037.6(26,207)/32 = 879,473.1 − 849,761.975 = 29,711.125

    SSxx = Σx² − (Σx)²/n = 38,996.28 − (1,037.6)²/32 = 38,996.28 − 33,644.18 = 5,352.1

    β̂1 = SSxy/SSxx = 29,711.125/5,352.1 = 5.551302293 ≈ 5.551

    β̂0 = ȳ − β̂1x̄ = 818.96875 − (5.551302293)(32.425) = 638.9677731 ≈ 638.968

    The fitted regression line is: ŷ = 638.968 + 5.551x

c.  β̂1 = 5.551. When operating income increases by 1 million dollars, the mean current
    value is estimated to increase by 5.551 million dollars. This is meaningful for values of
    operating income between 7.8 and 54.3 million dollars.

    β̂0 = 638.968. This has no meaning since x = 0 is not in the observed range.


d.  Some additional calculations are:

    SSyy = Σy² − (Σy)²/n = 22,024,389 − (26,207)²/32 = 22,024,389 − 21,462,714.03 = 561,674.97

    SSE = SSyy − β̂1SSxy = 561,674.97 − 5.551302293(29,711.125) = 396,739.5337

    s² = MSE = SSE/(n − 2) = 396,739.5337/(32 − 2) = 13,224.65112  and  s = √13,224.65112 = 114.9985

    To determine if a linear relationship exists between current value and operating
    income, we test:

    H0: β1 = 0
    Ha: β1 ≠ 0

    The test statistic is t = (β̂1 − 0)/(s/√SSxx) = (5.551 − 0)/(114.9985/√5,352.1) = 3.53

    No α was given, so we will use α = .05. The rejection region requires α/2 = .05/2 =
    .025 in each tail of the t-distribution with df = n − 2 = 32 − 2 = 30. From Table VI,
    Appendix B, t.025 = 2.042. The rejection region is t > 2.042 or t < −2.042.

    Since the observed value of the test statistic falls in the rejection region (t = 3.53 >
    2.042), H0 is rejected. There is sufficient evidence to indicate a significant linear
    relationship between current value and operating income at α = .05.

    r² = (SSyy − SSE)/SSyy = (561,674.97 − 396,739.5337)/561,674.97 = .29365 ≈ .294

    29.4% of the sample variation around the sample mean current value is explained by
    the linear relationship between current value and operating income.

    There is a significant linear relationship between current value and operating income.
    However, the relationship is not particularly strong.
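    For readers who want to verify these numbers, the following Python sketch (values copied
    from parts b and d; it is not part of the original solution) reproduces the t statistic and r²:

        import math

        n = 32
        b1 = 5.551302293
        SSxx, SSxy = 5_352.1, 29_711.125
        SSyy = 561_674.97

        SSE = SSyy - b1 * SSxy                 # about 396,739.5
        s = math.sqrt(SSE / (n - 2))           # root MSE, about 114.999

        t_stat = b1 / (s / math.sqrt(SSxx))    # about 3.53
        r_sq = (SSyy - SSE) / SSyy             # about .294

        print(round(t_stat, 2), round(r_sq, 3))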
10.90

a.

Using MINITAB, the regression analysis is:


Regression Analysis: BTU versus Area

The regression equation is
BTU = - 99045 + 103 Area

Predictor      Coef   SE Coef      T      P
Constant     -99045    261618  -0.38  0.709
Area         102.81     15.86   6.48  0.000

S = 628185    R-Sq = 67.8%    R-Sq(adj) = 66.1%

Analysis of Variance
Source          DF           SS           MS      F      P
Regression       1  1.65850E+13  1.65850E+13  42.03  0.000
Residual Error  20  7.89232E+12  3.94616E+11
Total           21  2.44773E+13

Predicted Values for New Observations
New Obs     Fit  SE Fit           95.0% CI            95.0% PI
1        723467  165874  (377459, 1069475)  (-631816, 2078750)

Values of Predictors for New Observations
New Obs  Area
1        8000

β̂0 (INTERCEP) = −99045      β̂1 (AREA) = 102.81
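An equivalent fit can be produced outside MINITAB; the sketch below uses Python's
statsmodels on a hypothetical data set with columns "Area" and "BTU" (the file name and
column names are assumptions, since the raw data live in the text's data file and are not
reproduced here).

    import pandas as pd
    import statsmodels.api as sm

    # df is assumed to hold the exercise's data with columns "Area" and "BTU"
    df = pd.read_csv("boilers.csv")          # hypothetical file name

    X = sm.add_constant(df["Area"])          # adds the intercept column
    model = sm.OLS(df["BTU"], X).fit()
    print(model.summary())                   # coefficients, t tests, R-squared, ANOVA F

    new = sm.add_constant(pd.DataFrame({"Area": [8000]}), has_constant="add")
    pred = model.get_prediction(new)
    print(pred.summary_frame(alpha=0.05))    # fit, 95% CI for the mean, 95% PI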
b.

To determine if energy consumption is positively linearly related to the shell area, we
test:

H0: β1 = 0
Ha: β1 > 0

The test statistic is t = 6.48 (from the printout).

The rejection region requires α = .10 in the upper tail of the t-distribution with df = n − 2
= 22 − 2 = 20. From Table VI, Appendix B, t.10 = 1.325. The rejection region is
t > 1.325.

Since the observed value of the test statistic falls in the rejection region (t = 6.48 > 1.325),
H0 is rejected. There is sufficient evidence to indicate that energy consumption is
positively linearly related to the shell area at α = .10.

c.  Since this is a one-tailed test but the output reports the p-value for a two-tailed test, the
    observed significance level is:

    ½(Prob > |T|) ≈ ½(.000) = .000

    This is the probability of observing our value of t (6.481) or anything larger if β1 = 0.


Since this probability is so small, there is strong evidence to reject H0.
d.

r2 = R-Square = .678
67.8% of the total sample variability in energy consumption around its mean is explained
by the linear relationship between energy consumption and shell area.

e.

From the printout, for xp = 8000, ŷ = 723,467.

The 95% prediction interval is (−631,816, 2,078,750).
This interval is very wide and includes negative BTUs, so it is not very useful.


10.92    Some preliminary calculations are:

    Σx = 4305    Σy = 201,558    Σxy = 76,652,695    Σx² = 1,652,025    Σy² = 3,571,211,200

a.  β̂1 = Σxy/Σx² = 76,652,695/1,652,025 = 46.39923427 ≈ 46.3992

    The least squares line is ŷ = 46.3992x.

b.  SSxy = Σxy − (Σx)(Σy)/n = 76,652,695 − 4305(201,558)/15 = 18,805,549

    SSxx = Σx² − (Σx)²/n = 1,652,025 − (4305)²/15 = 416,490

    β̂1 = SSxy/SSxx = 18,805,549/416,490 = 45.15246224 ≈ 45.1525

    β̂0 = ȳ − β̂1x̄ = 201,558/15 − 45.15246224(4305/15) = 478.4433

    The least squares line is ŷ = 478.4433 + 45.1525x.
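    A quick numerical check of both fits, computed from the summary sums above (a sketch in
    Python; the variable names are ours):

        n = 15
        sum_x, sum_y = 4305, 201_558
        sum_x2, sum_xy = 1_652_025, 76_652_695

        # (a) no-intercept fit: slope = sum(xy) / sum(x^2)
        b1_origin = sum_xy / sum_x2                # about 46.3992

        # (b) fit with an intercept
        SSxy = sum_xy - sum_x * sum_y / n          # about 18,805,549
        SSxx = sum_x2 - sum_x ** 2 / n             # about 416,490
        b1 = SSxy / SSxx                           # about 45.1525
        b0 = sum_y / n - b1 * sum_x / n            # about 478.44

        print(b1_origin, b1, b0)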


c.


Because x = 0 is not in the observed range, we are trying to represent the data on the
observed interval with the best fitting line. We are not concerned with whether the line
goes through (0, 0) or not.


d.  Some preliminary calculations are:

    SSyy = Σy² − (Σy)²/n = 3,571,211,200 − (201,558)²/15 = 862,836,042

    SSE = SSyy − β̂1SSxy = 862,836,042 − 45.15246224(18,805,549) = 13,719,200.88

    s² = SSE/(n − 2) = 13,719,200.88/(15 − 2) = 1,055,323.145     s = 1027.2892

    H0: β0 = 0
    Ha: β0 ≠ 0

    The test statistic is t = (β̂0 − 0)/[s√(1/n + x̄²/SSxx)] = 478.443/[1027.2892√(1/15 + 287²/416,490)] = .906

    The rejection region requires α/2 = .10/2 = .05 in each tail of the t-distribution with
    df = n − 2 = 15 − 2 = 13. From Table VI, Appendix B, t.05 = 1.771. The rejection region
    is t < −1.771 or t > 1.771.

    Since the observed value of the test statistic does not fall in the rejection region (t = .906
    ≯ 1.771), H0 is not rejected. There is insufficient evidence to indicate β0 is different
    from 0 at α = .10. Thus, β0 should not be included in the model.
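    The intercept test statistic can be reproduced with a few lines of Python (a sketch using the
    quantities computed above; the standard-error formula for β̂0 is the usual one for simple
    linear regression):

        import math

        n = 15
        b0 = 478.4433
        s = 1027.2892
        x_bar = 4305 / 15                 # 287
        SSxx = 416_490

        se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / SSxx)   # standard error of the intercept
        t_stat = b0 / se_b0                                # about .906
        print(round(t_stat, 3))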
10.94

Answers may vary. Possible answer:


The scaffold-drop survey provides the most accurate estimate of spall rate in a given wall
segment. However, the drop areas were not selected at random from the entire complex; rather,
drops were made at areas with high spall concentrations. Therefore, if the photo spall rates
could be shown to be related to drop spall rates, then the 83 photo spall rates could be used to
predict what the drop spall rates would be.
a.

Construct a scattergram for the data.

The scattergram shows a positive relationship between the photo spall rate (x) and the
drop spall rate (y).


b.

Find the prediction equation for drop spall rate. The MINITAB output shows the results
of the analysis.
The regression equation is
drop = 2.55 + 2.76 photo

Predictor     Coef   StDev      T      P
Constant     2.548   1.637   1.56  0.154
photo       2.7599  0.2180  12.66  0.000

S = 4.164    R-Sq = 94.7%    R-Sq(adj) = 94.1%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       1  2777.5  2777.5  160.23  0.000
Residual Error   9   156.0    17.3
Total           10  2933.5

Unusual Observations
Obs  photo   drop    Fit  StDev Fit  Residual  St Resid
 11   11.8  43.00  35.11       1.97      7.89     2.15R

R denotes an observation with a large standardized residual


y = 2.55 + 2.76x
c.

Conduct a formal statistical hypothesis test to determine if the photo spall rates contribute
information for the prediction of drop spall rates.

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 12.66, with p-value < .0001.

Reject H0 for any level of significance α ≥ .0001. There is sufficient evidence to indicate that
photo spall rates contribute information for the prediction of drop spall rates for any α ≥ .0001.

d.


One could now use the 83 photo spall rates to predict values for 83 drop spall rates.
Then use this information to estimate the true spall rate at a given wall segment and
estimate the total spall damage.


Multiple Regression and
Model Building

Chapter 11

11.2

a.

β̂0 = 506.346, β̂1 = −941.900, β̂2 = −429.060

b.  ŷ = 506.346 − 941.900x1 − 429.060x2

c.  SSE = 151,016, MSE = 8883, s = 94.251

    We expect about 95% of the y-values to fall within 2s or 2(94.251) or 188.502


units of the fitted regression equation.
d.

H0: 1 = 0
Ha: 1 0

The test statistic is t =

1 0
s

941.900
= 3.42
275.08

The rejection region requires /2 = .05/2 = .025 in each tail of the t distribution with
df = n (k + 1) = 20 - (2 + 1) = 17. From Table VI, Appendix B, t.025 = 2.110. The
rejection region is t < 2.110 or t > 2.110.
Since the observed value of the test statistic falls in the rejection region (t = 3.42 <
2.110), H0 is rejected. There is sufficient evidence to indicate 1 0 at = .05.
e.

For confidence coefficient .95, = .05 and /2 = .025. From Table VI, Appendix
B, with df = n (k + 1) = 20 (2 + 1) = 17, t.025 = 2.110. The 95% confidence
interval is:

2 t.025 s 429.060 2.110(379.83) 429.060 801.441


2

(1230.501, 372.381)
f.

R2 = R-Sq = 45.9% . 45.9% of the total sample variation of the y values is explained
by the model containing x1 and x2.
R2a = R-Sq(adj) = 39.6%. 39.6% of the total sample variation of the y values is
explained by the model containing x1 and x2, adjusted for the sample size and the
number of parameters in the model.


g.  To determine if at least one of the independent variables is significant in predicting y,
    we test:

    H0: β1 = β2 = 0
    Ha: At least one βi ≠ 0

    From the printout, the test statistic is F = 7.22.

    Since no α level was given, we will choose α = .05. The rejection region requires
    α = .05 in the upper tail of the F-distribution with ν1 = k = 2 and ν2 = n − (k + 1)
    = 20 − (2 + 1) = 17. From Table IX, Appendix B, F.05 = 3.59. The rejection region is
    F > 3.59.

    Since the observed value of the test statistic falls in the rejection region
    (F = 7.22 > 3.59), H0 is rejected. There is sufficient evidence to indicate at least
    one of the variables, x1 or x2, is significant in predicting y at α = .05.

h.  The observed significance level of the test is p-value = 0.005. Since the
    p-value is so small, we will reject H0 for most reasonable values of α. There is
    sufficient evidence to indicate at least one of the variables, x1 or x2, is significant in
    predicting y for any α greater than 0.005.

11.4

a.  We are given β̂1 = 3.1, s_β̂1 = 2.3, and n = 25.

    H0: β1 = 0
    Ha: β1 > 0

    The test statistic is t = (β̂1 − 0)/s_β̂1 = 3.1/2.3 = 1.35

    The rejection region requires α = .05 in the upper tail of the t distribution with df =
    n − (k + 1) = 25 − (2 + 1) = 22. From Table VI, Appendix B, t.05 = 1.717. The
    rejection region is t > 1.717.

    Since the observed value of the test statistic does not fall in the rejection region (t =
    1.35 ≯ 1.717), H0 is not rejected. There is insufficient evidence to indicate β1 > 0 at
    α = .05.
b.  We are given β̂2 = .92, s_β̂2 = .27, and n = 25.

    H0: β2 = 0
    Ha: β2 ≠ 0

    The test statistic is t = (β̂2 − 0)/s_β̂2 = .92/.27 = 3.41

    The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with
    df = n − (k + 1) = 25 − (2 + 1) = 22. From Table VI, Appendix B, t.025 = 2.074. The
    rejection region is t < −2.074 or t > 2.074.

    Since the observed value of the test statistic falls in the rejection region (t = 3.41 >
    2.074), reject H0. There is sufficient evidence to indicate β2 ≠ 0 at α = .05.
c.  For confidence coefficient .90, α = 1 − .90 = .10 and α/2 = .10/2 = .05. From Table
    VI, Appendix B, with df = n − (k + 1) = 25 − (2 + 1) = 22, t.05 = 1.717. The
    confidence interval is:

    β̂1 ± t.05 s_β̂1 ⟹ 3.1 ± 1.717(2.3) ⟹ 3.1 ± 3.949 ⟹ (−.849, 7.049)

    We are 90% confident that β1 falls between −.849 and 7.049.

d.  For confidence coefficient .99, α = 1 − .99 = .01 and α/2 = .01/2 = .005. From Table
    VI, Appendix B, with df = n − (k + 1) = 25 − (2 + 1) = 22, t.005 = 2.819. The
    confidence interval is:

    β̂2 ± t.005 s_β̂2 ⟹ .92 ± 2.819(.27) ⟹ .92 ± .761 ⟹ (.159, 1.681)

    We are 99% confident that β2 falls between .159 and 1.681.


11.6

a.  For x2 = 1 and x3 = 3,
    E(y) = 1 + 2x1 + 1 − 3(3) = 2x1 − 7

    The graph is the line E(y) = 2x1 − 7 (not reproduced here).

b.  For x2 = −1 and x3 = 1,
    E(y) = 1 + 2x1 + (−1) − 3(1) = 2x1 − 3

    The graph is the line E(y) = 2x1 − 3 (not reproduced here).

c.

They are parallel, each with a slope of 2. They have different y-intercepts.

d.

The relationship will be parallel lines.

11.8

No. There may be other independent variables that are important that have not been
included in the model, while there may also be some variables included in the model which
are not important. The only conclusion is that at least one of the independent variables is a
good predictor of y.

11.10

a.  The first-order model is: E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

b.  R² = .58. 58% of the total sample variation of the levels of trust is explained by the
    model containing the 5 independent variables.

c.  F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.58/5) / {(1 − .58)/[66 − (5 + 1)]} = 16.57

d.  The rejection region requires α = .10 in the upper tail of the F-distribution with ν1 = k
    = 5 and ν2 = n − (k + 1) = 66 − (5 + 1) = 60. From Table VIII, Appendix B, F.10 = 1.95.
    The rejection region is F > 1.95.

    Since the observed value of the test statistic falls in the rejection region
    (F = 16.57 > 1.95), H0 is rejected. There is sufficient evidence to indicate that at
    least one of the 5 independent variables is useful in the prediction of level of trust at
    α = .10.
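    This global F statistic (and an exact p-value, rather than a table lookup) can be computed
    with a small Python helper; a sketch using scipy, with the numbers from this exercise:

        from scipy import stats

        def global_f(r_sq, n, k):
            """F statistic and p-value for H0: beta_1 = ... = beta_k = 0, given R-squared."""
            f_stat = (r_sq / k) / ((1 - r_sq) / (n - (k + 1)))
            p_value = stats.f.sf(f_stat, k, n - (k + 1))   # upper-tail area
            return f_stat, p_value

        print(global_f(0.58, n=66, k=5))    # about 16.6, with a p-value far below .10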

11.12

a.  The least squares prediction equation is:

    ŷ = 3.70 + .34x1 + .49x2 + .72x3 + 1.14x4 + 1.51x5 + .26x6 − .14x7 − .10x8 − .10x9


b.  β̂0 = 3.70. This is the estimate of the y-intercept. It has no other meaning because the
    point with all independent variables equal to 0 is not in the observed range.

    β̂1 = 0.34. For each additional walk, the mean number of runs scored is estimated
    to increase by .34, holding all other variables constant.

    β̂2 = 0.49. For each additional single, the mean number of runs scored is estimated to
    increase by .49, holding all other variables constant.

    β̂3 = 0.72. For each additional double, the mean number of runs scored is
    estimated to increase by .72, holding all other variables constant.

    β̂4 = 1.14. For each additional triple, the mean number of runs scored is estimated
    to increase by 1.14, holding all other variables constant.

    β̂5 = 1.51. For each additional home run, the mean number of runs scored is
    estimated to increase by 1.51, holding all other variables constant.

    β̂6 = 0.26. For each additional stolen base, the mean number of runs scored is
    estimated to increase by .26, holding all other variables constant.

    β̂7 = −0.14. For each additional time a runner is caught stealing, the mean number
    of runs scored is estimated to decrease by .14, holding all other variables constant.

    β̂8 = −0.10. For each additional strikeout, the mean number of runs scored is
    estimated to decrease by .10, holding all other variables constant.

    β̂9 = −0.10. For each additional out, the mean number of runs scored is estimated
    to decrease by .10, holding all other variables constant.
c.  H0: β7 = 0
    Ha: β7 < 0

    The test statistic is t = (β̂7 − 0)/s_β̂7 = (−.14 − 0)/.14 = −1.00

    The rejection region requires α = .05 in the lower tail of the t-distribution with df
    = n − (k + 1) = 234 − (9 + 1) = 224. From Table VI, Appendix B, t.05 = 1.645. The
    rejection region is t < −1.645.

    Since the observed value of the test statistic does not fall in the rejection region
    (t = −1.00 ≮ −1.645), H0 is not rejected. There is insufficient evidence to indicate
    that the mean number of runs decreases as the number of runners caught stealing
    increases, holding all other variables constant, at α = .05.


d.  For confidence level .95, α = .05 and α/2 = .05/2 = .025. From Table VI, Appendix
    B, with df = 224, t.025 ≈ 1.96. The 95% confidence interval is:

    β̂5 ± t.025 s_β̂5 ⟹ 1.51 ± 1.96(.05) ⟹ 1.51 ± 0.098 ⟹ (1.412, 1.608)

    We are 95% confident that the mean number of runs will increase by anywhere from
    1.412 to 1.608 for each additional home run, holding all other variables constant.
11.14

a.  R² = .31. 31% of the total sample variation of the natural log of the level of CO2
    emissions in 1996 is explained by the model containing the 7 independent variables.

b.  The test statistic is F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.31/7) / {(1 − .31)/[66 − (7 + 1)]} = 3.72

    The rejection region requires α = .01 in the upper tail of the F-distribution with ν1 = k
    = 7 and ν2 = n − (k + 1) = 66 − (7 + 1) = 58. From Table XI, Appendix B, F.01 = 2.95.
    The rejection region is F > 2.95.

    Since the observed value of the test statistic falls in the rejection region
    (F = 3.72 > 2.95), H0 is rejected. There is sufficient evidence to indicate that at
    least one of the 7 independent variables is useful in the prediction of the natural log
    of the level of CO2 emissions in 1996 at α = .01.

c.  To determine if foreign investments in 1980 is a useful predictor of CO2 emissions in
    1996, we test:

    H0: β1 = 0
    Ha: β1 ≠ 0

d.  The test statistic is t = 2.52 and the p-value is p < 0.05. Since the observed p-value is
    less than α (p < .05), H0 is rejected. There is sufficient evidence to indicate foreign
    investments in 1980 is a useful predictor of CO2 emissions in 1996 at α = .05.

11.16

a.  From MINITAB, the output is:

    Regression Analysis: DDT versus Mile, Length, Weight

    The regression equation is
    DDT = - 108 + 0.0851 Mile + 3.77 Length - 0.0494 Weight

    Predictor      Coef   SE Coef      T      P
    Constant    -108.07     62.70  -1.72  0.087
    Mile        0.08509   0.08221   1.03  0.302
    Length        3.771     1.619   2.33  0.021
    Weight     -0.04941   0.02926  -1.69  0.094

    S = 97.48    R-Sq = 3.9%    R-Sq(adj) = 1.8%

    Analysis of Variance
    Source           DF       SS     MS     F      P
    Regression        3    53794  17931  1.89  0.135
    Residual Error  140  1330210   9501
    Total           143  1384003

    The least squares prediction equation is:

    ŷ = −108.07 + 0.08509x1 + 3.771x2 − 0.04941x3


b.

s = 97.48. We would expect about 95% of the observed values of DDT level to fall
within 2s or 2(97.48) = 194.96 units of their least squares predicted values.

c.

To determine if at least one of the variables is useful in predicting the DDT level, we
test:

H0: β1 = β2 = β3 = 0
Ha: At least one βi ≠ 0

The test statistic is F = 1.89 and the p-value is p = .135. Since the p-value is not less
than α = .05 (p = .135 ≮ .05), H0 is not rejected. There is insufficient evidence to
indicate at least one of the variables is useful in predicting the DDT level at α = .05.
d.

To determine if DDT level increases as length increases, we test:


H0: β2 = 0
Ha: β2 > 0

The test statistic is t = 2.33.

The p-value is p = .021/2 = .0105. Since the p-value is less than α (p = .0105 < .05),
H0 is rejected. There is sufficient evidence to indicate that DDT level increases as
length increases, holding the other variables constant, at α = .05.

The observed significance level is p = .0105.
e.

For confidence coefficient .95, α = .05 and α/2 = .05/2 = .025. From Table VI,
Appendix B, with df = n − (k + 1) = 144 − 4 = 140, t.025 ≈ 1.96. The 95% confidence
interval is:

β̂3 ± t.025 s_β̂3 ⟹ −0.04941 ± 1.96(0.02926) ⟹ −0.04941 ± 0.05735 ⟹ (−0.10676, 0.00794)

We are 95% confident that the mean DDT level will change by anywhere from −0.10676 to
0.00794 for each additional unit increase in weight, holding length and mile constant.
Since 0 is in the interval, there is no evidence that weight and DDT level are linearly
related.


11.18

a.

From MINITAB, the output is:


Regression Analysis: WeightChg versus Digest, Fiber

The regression equation is
WeightChg = 12.2 - 0.0265 Digest - 0.458 Fiber

Predictor      Coef  SE Coef      T      P
Constant     12.180    4.402   2.77  0.009
Digest     -0.02654  0.05349  -0.50  0.623
Fiber       -0.4578   0.1283  -3.57  0.001

S = 3.519    R-Sq = 52.9%    R-Sq(adj) = 50.5%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       2   542.03  271.02  21.88  0.000
Residual Error  39   483.08   12.39
Total           41  1025.12

ŷ = 12.2 − .0265x1 − .458x2


b.  β̂0 = 12.2 = the estimate of the y-intercept

    β̂1 = −.0265. We estimate that the mean weight change will decrease by .0265% for
    each additional 1% increase in digestion efficiency, with acid-detergent fibre held
    constant.

    β̂2 = −.458. We estimate that the mean weight change will decrease by .458% for
    each additional 1% increase in acid-detergent fibre, with digestion efficiency held
    constant.
c.

To determine if digestion efficiency is a useful predictor of weight change, we test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = −.50. The p-value is p = .623. Since the p-value is greater than
α (p = .623 > .01), H0 is not rejected. There is insufficient evidence to indicate that
digestion efficiency is a useful linear predictor of weight change at α = .01.

d.

For confidence coefficient .99, α = 1 − .99 = .01 and α/2 = .01/2 = .005. From Table
VI, Appendix B, with df = n − (k + 1) = 42 − (2 + 1) = 39, t.005 ≈ 2.704. The 99%
confidence interval is:

β̂2 ± t.005 s_β̂2 ⟹ −.4578 ± 2.704(.1283) ⟹ −.4578 ± .3469 ⟹ (−.8047, −.1109)

We are 99% confident that the change in mean weight change for each unit change in
acid-detergent fiber, holding digestion efficiency constant, is between −.8047% and
−.1109%.


e.

R² = R-Sq = 52.9%. 52.9% of the total sample variance of the weight changes is
explained by the model containing the 2 independent variables, digestion efficiency and
acid-detergent fiber.

Ra² = R-Sq(adj) = 50.5%. 50.5% of the total sample variance of the weight changes is
explained by the model containing the 2 independent variables, digestion efficiency
and acid-detergent fiber, adjusting for the sample size and the number of parameters in
the model.

f.

To determine if at least one of the variables is useful in predicting weight change, we
test:

H0: β1 = β2 = 0
Ha: At least one βi ≠ 0

The test statistic is F = 21.88 and the p-value is p = .000. Since the p-value is less
than α = .05 (p = .000 < .05), H0 is rejected. There is sufficient evidence to indicate at
least one of the variables is useful in predicting weight change at α = .05.

11.20

a.

The least squares prediction equation is:

ŷ = 4.30 − .002x1 + .336x2 + .384x3 + .067x4 − .143x5 + .081x6 + .134x7

b.

To determine if the model is adequate, we test:

H0: β1 = β2 = β3 = β4 = β5 = β6 = β7 = 0
Ha: At least one βi ≠ 0, i = 1, 2, 3, ..., 7

The test statistic is F = 111.1 (from the table).

Since no α was given, we will use α = .05. The rejection region requires α = .05 in
the upper tail of the F-distribution with ν1 = k = 7 and ν2 = n − (k + 1) = 268 − (7 + 1)
= 260. From Table IX, Appendix B, F.05 ≈ 2.01. The rejection region is F > 2.01.

Since the observed value of the test statistic falls in the rejection region (F = 111.1 >
2.01), H0 is rejected. There is sufficient evidence to indicate that the model is
adequate for predicting the logarithm of the audit fees at α = .05.

c.

β̂3 = .384. For each additional subsidiary of the auditee, the mean of the logarithm of
the audit fee is estimated to increase by .384 units.


d.  To determine if β4 > 0, we test:

    H0: β4 = 0
    Ha: β4 > 0

    The test statistic is t = 1.76 (from the table).

    The p-value for the test is .079. Since the p-value is not less than α (p = .079 ≮ α =
    .05), H0 is not rejected. There is insufficient evidence to indicate that β4 > 0, holding
    all the other variables constant, at α = .05.

e.  To determine if β1 < 0, we test:

    H0: β1 = 0
    Ha: β1 < 0

    The test statistic is t = 0.049 (from the table).

    The p-value for the test is .961. Since the p-value is not less than α (p = .961 ≮ α =
    .05), H0 is not rejected. There is insufficient evidence to indicate that β1 < 0, holding
    all the other variables constant, at α = .05. There is insufficient evidence to indicate
    that the new auditors charge less than incumbent auditors.

11.22    To determine if the model is useful, we test:

    H0: β1 = β2 = ⋯ = β18 = 0
    Ha: At least one βi ≠ 0, i = 1, 2, ..., 18

    The test statistic is F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.95/18) / {(1 − .95)/[20 − (18 + 1)]} = 1.06

    The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k = 18
    and ν2 = n − (k + 1) = 20 − (18 + 1) = 1. From Table IX, Appendix B, F.05 ≈ 245.9. The
    rejection region is F > 245.9.

    Since the observed value of the test statistic does not fall in the rejection region (F = 1.06
    ≯ 245.9), H0 is not rejected. There is insufficient evidence to indicate the model is adequate
    at α = .05.

    Note: Although R² is large, there are so many variables in the model that ν2 is small.


11.24

a.

From MINITAB, the output is:


Regression Analysis: Labor versus Pounds, Units, Weight

The regression equation is
Labor = 132 + 2.73 Pounds + 0.0472 Units - 2.59 Weight

Predictor     Coef  SE Coef      T      P
Constant    131.92    25.69   5.13  0.000
Pounds       2.726    2.275   1.20  0.248
Units      0.04722  0.09335   0.51  0.620
Weight     -2.5874   0.6428  -4.03  0.001

S = 9.810    R-Sq = 77.0%    R-Sq(adj) = 72.7%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       3  5158.3  1719.4  17.87  0.000
Residual Error  16  1539.9    96.2
Total           19  6698.2

Source   DF  Seq SS
Pounds    1  3400.6
Units     1   198.4
Weight    1  1559.3

The least squares equation is:

ŷ = 131.92 + 2.726x1 + .0472x2 − 2.587x3
b.

To test the usefulness of the model, we test:

H0: β1 = β2 = β3 = 0
Ha: At least one βi ≠ 0, for i = 1, 2, 3

The test statistic is F = MSR/MSE = 1719.4/96.2 = 17.87

The rejection region requires α = .01 in the upper tail of the F-distribution with
ν1 = k = 3 and ν2 = n − (k + 1) = 20 − (3 + 1) = 16. From Table XI, Appendix B,
F.01 = 5.29. The rejection region is F > 5.29.

Since the observed value of the test statistic falls in the rejection region (F = 17.87
> 5.29), H0 is rejected. There is sufficient evidence to indicate a relationship exists
between hours of labor and at least one of the independent variables at α = .01.
c.

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is t = .51. The p-value = .620. We reject H0 if the p-value < α. Since
.620 > .05, do not reject H0. There is insufficient evidence to indicate a relationship
exists between hours of labor and percentage of units shipped by truck, all other
variables held constant, at α = .05.


d.

R2 is printed as R-Sq. R2 = .770. We conclude that 77% of the sample variation of the
labor hours is explained by the regression model, including the independent variables
pounds shipped, percentage of units shipped by truck, and weight.

e.

If the average number of pounds per shipment increases from 20 to 21, the estimated
change in the mean number of hours of labor is −2.587. Thus, it will cost $7.50(2.587) =
$19.4025 less, if the variables x1 and x2 are held constant.

f.

Since s = Standard Error = 9.81, we can estimate approximately with 2s precision or


2(9.81) or 19.62 hours.

g.

No. Regression analysis only determines if variables are related. It cannot be used to
determine cause and effect.

11.26

From the printout, the 90% prediction interval is (−151.996, 175.4874). We are 90%
confident that the actual DDT level for a fish caught 100 miles upstream that is 40
centimeters long and weighs 800 grams will fall between −151.996 and 175.4874. Since the
DDT level cannot be negative, the interval would be between 0 and 175.4874.

11.28

a.

From MINITAB, the output is:


Regression Analysis: Precip versus Altitude, Latit, Coast

The regression equation is
Precip = - 102 + 0.00409 Altitude + 3.45 Latit - 0.143 Coast

Predictor      Coef   SE Coef      T      P
Constant    -102.36     29.21  -3.50  0.002
Altitude   0.004091  0.001218   3.36  0.002
Latit        3.4511    0.7949   4.34  0.000
Coast      -0.14286   0.03634  -3.93  0.001

S = 11.10    R-Sq = 60.0%    R-Sq(adj) = 55.4%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       3  4809.4  1603.1  13.02  0.000
Residual Error  26  3202.3   123.2
Total           29  8011.7

Source    DF  Seq SS
Altitude   1   730.7
Latit      1  2175.3
Coast      1  1903.4

Predicted Values for New Observations
New Obs    Fit  SE Fit        95.0% CI       95.0% PI
1        29.25    5.60  (17.75, 40.76)  (3.71, 54.80)

Values of Predictors for New Observations
New Obs  Altitude  Latit  Coast
1            6360   36.6    145

The fitted regression line is:

ŷ = −102.36 + 0.00409x1 + 3.4511x2 − 0.1429x3


b.  To determine if the first-order model is useful for predicting annual precipitation,
    we test:

    H0: β1 = β2 = β3 = 0
    Ha: At least one βi ≠ 0, i = 1, 2, 3

    The test statistic is F = 13.02 and the p-value is p = 0.000. Since the p-value is less
    than α = .05, H0 is rejected. There is sufficient evidence to indicate that the model is
    useful for predicting annual precipitation at α = .05.

c.  The prediction interval is (3.71, 54.80).

    With 95% confidence, we can conclude that the annual precipitation for an individual
    meteorological station with characteristics x1 = 6360 feet, x2 = 36.6, x3 = 145 miles
    will fall between 3.71 inches and 54.80 inches.
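    The "Fit" value in the printout can be checked by plugging the new observation into the
    fitted equation; a short Python sketch with the coefficients taken from the output above:

        b0, b1, b2, b3 = -102.36, 0.004091, 3.4511, -0.14286
        altitude, latitude, coast = 6360, 36.6, 145

        y_hat = b0 + b1 * altitude + b2 * latitude + b3 * coast
        print(round(y_hat, 2))    # about 29.25, matching the printout's Fit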

11.30    The first-order model is:

    E(y) = β0 + β1x1 + β2x2 + β3x5

    We want to find a 95% prediction interval for the actual voltage when the volume fraction
    of the disperse phase is at the high level (x1 = 80), the salinity is at the low level (x2 = 1),
    and the amount of surfactant is at the low level (x5 = 2).

    Using MINITAB, the output is:

    The regression equation is
    y = 0.933 - 0.0243 x1 + 0.142 x2 + 0.385 x5

    Predictor       Coef     StDev      T      P
    Constant      0.9326    0.2482   3.76  0.002
    x1         -0.024272  0.004900  -4.95  0.000
    x2           0.14206   0.07573   1.88  0.080
    x5           0.38457   0.09801   3.92  0.001

    S = 0.4796    R-Sq = 66.6%    R-Sq(adj) = 59.9%

    Analysis of Variance
    Source          DF       SS      MS     F      P
    Regression       3   6.8701  2.2900  9.95  0.001
    Residual Error  15   3.4509  0.2301
    Total           18  10.3210

    Source  DF  Seq SS
    x1       1  1.4016
    x2       1  1.9263
    x5       1  3.5422

    Unusual Observations
    Obs    x1      y    Fit  StDev Fit  Residual  St Resid
         40.0  3.200  2.068      0.239     1.132     2.72R

    R denotes an observation with a large standardized residual

    Predicted Values
       Fit  StDev Fit         95.0% CI          95.0% PI
    -0.098      0.232  (-0.592, 0.396)  (-1.233, 1.038)

    The 95% prediction interval is (−1.233, 1.038). We are 95% confident that the actual
    voltage is between −1.233 and 1.038 kw/cm when the volume fraction of the disperse phase
    is at the high level (x1 = 80), the salinity is at the low level (x2 = 1), and the amount of
    surfactant is at the low level (x5 = 2).
11.32

a.  E(y) = β0 + β1x1 + β2x2 + β3x1x2

b.  E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3 + β6x2x3

11.34

a.  R² = 1 − SSE/SSyy = 1 − 21/479 = .956

    95.6% of the total variability of the y values is explained by this model.

b.  To test the utility of the model, we test:

    H0: β1 = β2 = β3 = 0
    Ha: At least one βi ≠ 0, i = 1, 2, 3

    The test statistic is F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.956/3) / {(1 − .956)/[32 − (3 + 1)]} = 202.8

    The rejection region requires α = .05 in the upper tail of the F distribution, with ν1 = k
    = 3 and ν2 = n − (k + 1) = 32 − (3 + 1) = 28. From Table IX, Appendix B, F.05 = 2.95.
    The rejection region is F > 2.95.

    Since the observed value of the test statistic falls in the rejection region (F = 202.8 >
    2.95), H0 is rejected. There is sufficient evidence that the model is adequate for
    predicting y at α = .05.


c.  The relationship between y and x1 depends on the level of x2.

d.  To determine if x1 and x2 interact, we test:

    H0: β3 = 0
    Ha: β3 ≠ 0

    The test statistic is t = (β̂3 − 0)/s_β̂3 = 2.5.

    The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with
    df = n − (k + 1) = 32 − (3 + 1) = 28. From Table VI, Appendix B, t.025 = 2.048. The
    rejection region is t < −2.048 or t > 2.048.

    Since the observed value of the test statistic falls in the rejection region (t = 2.5 >
    2.048), H0 is rejected. There is sufficient evidence to indicate that x1 and x2 interact at
    α = .05.
11.36

a.  To determine if the overall model is useful for predicting y, we test:

    H0: β1 = β2 = β3 = 0
    Ha: At least one βi is not 0

    The test statistic is F = 226.35 and the p-value is p < .001. Since the p-value is less
    than α (p < .001 < .05), H0 is rejected. There is sufficient evidence to indicate the
    overall model is useful for predicting y, willingness of the consumer to shop at a
    retailer's store in the future, at α = .05.

b.  To determine if consumer satisfaction and retailer interest interact to affect
    willingness to shop at the retailer's shop in the future, we test:

    H0: β3 = 0
    Ha: β3 ≠ 0

    The test statistic is t = −3.09 and the p-value is p < .01. Since the p-value is less
    than α (p < .01 < .05), H0 is rejected. There is sufficient evidence to indicate
    consumer satisfaction and retailer interest interact to affect willingness to shop at the
    retailer's shop in the future at α = .05.
c.  When x2 = 1,

    ŷ = β̂0 + .426x1 + .044x2 − .157x1x2
      = β̂0 + .426x1 + .044(1) − .157x1(1)
      = β̂0 + .044 + (.426 − .157)x1
      = β̂0 + .044 + .269x1

    Since no value is given for β̂0, we will use β̂0 = 1 for graphing purposes. Using
    MINITAB, a graph might look like:

    [Scatterplot of YHAT vs X1 when X2 = 1]

d.

    When x2 = 7,

    ŷ = β̂0 + .426x1 + .044x2 − .157x1x2
      = β̂0 + .426x1 + .044(7) − .157x1(7)
      = β̂0 + .308 + (.426 − 1.099)x1
      = β̂0 + .308 − .673x1

    Since no value is given for β̂0, we will again use β̂0 = 1 for graphing purposes.
    Using MINITAB, a graph might look like:

    [Scatterplot of YHAT vs X1 when X2 = 7]

e.

    Using MINITAB, both plots on the same graph would be:

    [Scatterplot of YHAT vs X1 with the lines for x2 = 1 and x2 = 7 on the same axes]

Since the lines are not parallel, it indicates that interaction is present.
11.38

a.  The hypothesized regression model including the interaction between x1 and x2
    would be:

    E(y) = β0 + β1x1 + β2x2 + β3x1x2

b.  If x1 and x2 interact to affect y, then the effect of x1 on y depends on the level of x2.
    Also, the effect of x2 on y depends on the level of x1.

c.  Since the p-value is not small (p = .25), H0 is not rejected. There is insufficient
    evidence to indicate x1 and x2 interact to affect y.

d.  β1 corresponds to x1, the number ahead in line. If the negative feeling score gets
    larger as the number of people ahead increases, then β1 is positive. β2 corresponds to
    x2, the number behind in line. If the negative feeling score gets lower as the number
    of people behind increases, then β2 is negative.

11.40

a.  If client credibility and linguistic delivery style interact, then the effect of client
    credibility on the likelihood value depends on the level of linguistic delivery style.

b.  To determine the overall model adequacy, we test:

    H0: β1 = β2 = β3 = 0
    Ha: At least one βi ≠ 0

c.  The test statistic is F = 55.35 and the p-value is p < 0.0005.

    Since the p-value is so small (p < 0.0005), H0 is rejected for any reasonable value of
    α. There is sufficient evidence to indicate that the model is adequate for any α > 0.0005.

d.  To determine if client credibility and linguistic delivery style interact, we test:

    H0: β3 = 0
    Ha: β3 ≠ 0

e.  The test statistic is t = 4.008 and the p-value is p < 0.005.

    Since the p-value is so small (p < 0.005), H0 is rejected. There is sufficient evidence
    to indicate that client credibility and linguistic delivery style interact for any α > 0.005.

f.  When x1 = 22, the least squares line is:

    ŷ = 15.865 + 0.037(22) − 0.678x2 + 0.036x2(22) = 16.679 + 0.114x2

    The estimated slope of the Likelihood-Linguistic delivery style line when client
    credibility is 22 is 0.114. When client credibility is equal to 22, for each additional
    point increase in linguistic delivery style, the mean likelihood is estimated to increase
    by 0.114.

g.  When x1 = 46, the least squares line is:

    ŷ = 15.865 + 0.037(46) − 0.678x2 + 0.036x2(46) = 17.567 + 0.978x2

    The estimated slope of the Likelihood-Linguistic delivery style line when client
    credibility is 46 is 0.978. When client credibility is equal to 46, for each additional
    point increase in linguistic delivery style, the mean likelihood is estimated to increase
    by 0.978.
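    Parts f and g amount to evaluating the interaction model at fixed client-credibility values;
    the small Python sketch below (coefficients from the fitted equation above) returns the
    intercept and slope of the likelihood vs. delivery-style line for any x1:

        def likelihood_line(x1):
            """Intercept and slope in x2 of y-hat = 15.865 + 0.037*x1 - 0.678*x2 + 0.036*x1*x2."""
            intercept = 15.865 + 0.037 * x1
            slope = -0.678 + 0.036 * x1      # the interaction makes the slope depend on x1
            return intercept, slope

        print(likelihood_line(22))   # about (16.679, 0.114)
        print(likelihood_line(46))   # about (17.567, 0.978)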


11.42

a.  E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

b.  H0: β4 = 0

c.  t = 4.408, p-value = .001

    Since the p-value is so small, there is strong evidence to reject H0. There is sufficient
    evidence to indicate that the strength of the client-therapist relationship contributes
    information for the prediction of a client's reaction for any α > .001.

d.  Answers may vary.

e.  R² = .2946. 29.46% of the variability in the client's reaction scores can be explained
    by this model.

11.44

a.

β̂1 = .02. The mean level of support for a military response is estimated to increase
by .02 for each day increase in level of TV news exposure, all other
variables held constant.

b.  To determine if an increase in TV news exposure is associated with an increase in
    support for military resolution, we test:

    H0: β1 = 0
    Ha: β1 > 0

    The p-value is p = .03/2 = .015. Since the p-value is less than α (p = .015 < .05), H0 is
    rejected. There is sufficient evidence to indicate that an increase in TV news
    exposure is associated with an increase in support for military resolution, all other
    variables held constant, at α = .05.

c.  To determine if the relationship between support for military resolution and gender
    depends on political knowledge, we test:

    H0: β8 = 0
    Ha: β8 ≠ 0

    The p-value is p = .02. Since the p-value is less than α (p = .02 < .05), H0 is rejected.
    There is sufficient evidence to indicate that the relationship between support for a
    military resolution and gender depends on political knowledge, all other variables
    held constant, at α = .05.

d.  To determine if the relationship between support for military resolution and race
    depends on political knowledge, we test:

    H0: β9 = 0
    Ha: β9 ≠ 0

    The p-value is p = .08. Since the p-value is not less than α (p = .08 ≮ .05), H0 is not
    rejected. There is insufficient evidence to indicate that the relationship between
    support for a military resolution and race depends on political knowledge, all other
    variables held constant, at α = .05.
e.  R² = .194. 19.4% of the variation in the support for military resolution is
    explained by the model containing the seven independent variables
    and the two interaction terms.

f.  H0: β1 = β2 = β3 = β4 = β5 = β6 = β7 = β8 = β9 = 0
    Ha: At least one βi ≠ 0, i = 1, 2, 3, ..., 9

    The test statistic is F = (R²/k) / {(1 − R²)/[n − (k + 1)]} = (.194/9) / {(1 − .194)/[1763 − (9 + 1)]} = 46.88

    The rejection region requires α = .05 in the upper tail of the F distribution with ν1 =
    k = 9 and ν2 = n − (k + 1) = 1763 − (9 + 1) = 1753. From Table IX, Appendix B, F.05
    ≈ 1.88. The rejection region is F > 1.88.

    Since the observed value of the test statistic falls in the rejection region (F = 46.88 >
    1.88), H0 is rejected. There is sufficient evidence to indicate that the model is useful
    at α = .05.
11.46

a.  H0: β2 = 0
    Ha: β2 ≠ 0

    The test statistic is t = (β̂2 − 0)/s_β̂2 = (.47 − 0)/.15 = 3.133

    The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with
    df = n − (k + 1) = 25 − (2 + 1) = 22. From Table VI, Appendix B, t.025 = 2.074. The
    rejection region is t < −2.074 or t > 2.074.

    Since the observed value of the test statistic falls in the rejection region (t = 3.133 >
    2.074), H0 is rejected. There is sufficient evidence to indicate the quadratic term
    should be included in the model at α = .05.

b.  H0: β2 = 0
    Ha: β2 > 0

    The test statistic is the same as in part a, t = 3.133.

    The rejection region requires α = .05 in the upper tail of the t distribution with df =
    22. From Table VI, Appendix B, t.05 = 1.717. The rejection region is t > 1.717.

    Since the observed value of the test statistic falls in the rejection region (t = 3.133 >
    1.717), H0 is rejected. There is sufficient evidence to indicate the quadratic curve
    opens upward at α = .05.
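    The same t statistic, with exact one- and two-sided p-values instead of table lookups, can
    be obtained in Python; a sketch with the numbers from this exercise:

        from scipy import stats

        b2, se_b2 = 0.47, 0.15
        df = 25 - (2 + 1)                    # n - (k + 1) = 22

        t_stat = b2 / se_b2                  # about 3.13
        p_two_sided = 2 * stats.t.sf(t_stat, df)
        p_upper = stats.t.sf(t_stat, df)     # for Ha: beta2 > 0

        print(round(t_stat, 3), round(p_two_sided, 4), round(p_upper, 4))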


11.48

a.  (Graph not reproduced.)

b.  It moves the graph to the right (−2x) or to the left (+2x) compared to the graph of
    y = 1 + x².

c.  It controls whether the graph opens up (+x²) or down (−x²). It also controls how steep
    the curvature is, i.e., the larger the absolute value of the coefficient of x², the
    narrower the curve is.

11.50

a.

β̂0 has no meaning because x = 0 would not be in the observed range of values. In
this case, x is the year, with values between 1984 and 1999.

b.  β̂1 = 321.67. Since the quadratic effect is included in the model, the linear term is
    just a location parameter and has no meaning.

c.  β̂2 = .0794. Since the value of β̂2 is positive, the curvature is upward.

d.

Since no data have been collected past 1999, we have no idea if the relationship
between the two variables from 1984 to 1999 will remain the same until 2021.


11.52

a.

Using MINITAB, a sketch of the least squares prediction equation is:

[Scatterplot of ŷ versus Dose (0 to 800) showing the fitted quadratic curve]

b.  For x = 500, ŷ = 10.25 + .0053(500) − .0000266(500²) = 10.25 + 2.65 − 6.65 = 6.25

c.  For x = 0, ŷ = 10.25 + .0053(0) − .0000266(0²) = 10.25

d.  For x = 100, ŷ = 10.25 + .0053(100) − .0000266(100²) = 10.25 + .53 − .266 = 10.514

    This value is slightly larger than that for the control group (10.25).

    For x = 200, ŷ = 10.25 + .0053(200) − .0000266(200²) = 10.25 + 1.06 − 1.064 = 10.246

    This value is slightly smaller than that for the control group (10.25). So, the largest
    value of x which yields an estimated weight change that is closest to, but just less than,
    the estimated weight change for the control group is x = 200.
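    Parts b through d simply evaluate the fitted quadratic at different doses; a short Python
    sketch (coefficients from the fitted model quoted above):

        def weight_change(dose):
            """Fitted quadratic: y-hat = 10.25 + .0053*dose - .0000266*dose**2."""
            return 10.25 + 0.0053 * dose - 0.0000266 * dose ** 2

        for dose in (0, 100, 200, 500):
            print(dose, round(weight_change(dose), 3))   # 10.25, 10.514, 10.246, 6.25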

11.54

a.

A first-order model is:

E(y) = β0 + β1x

b.  A second-order model is:

    E(y) = β0 + β1x + β2x²

c.

Using MINITAB, a scattergram of these data is:

[Scatterplot of International versus Domestic gross revenues]

From the plot, it appears that the first-order model might fit the data better. There
does not appear to be much of a curve to the relationship.
d.

Using MINITAB, the output is:


Regression Analysis: International versus Domestic, Dsq

The regression equation is
International = 203 - 0.58 Domestic + 0.00364 Dsq

Predictor      Coef   SE Coef      T      P
Constant      202.9     245.0   0.83  0.424
Domestic     -0.581     1.510  -0.38  0.707
Dsq        0.003638  0.002085   1.74  0.107

S = 142.696    R-Sq = 78.8%    R-Sq(adj) = 75.2%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       2   906515  453258  22.26  0.000
Residual Error  12   244345   20362
Total           14  1150860

Source    DF  Seq SS
Domestic   1  844526
Dsq        1   61990

To investigate the usefulness of the model, we test:

H0: β1 = β2 = 0
Ha: At least one βi ≠ 0, i = 1, 2

The test statistic is F = 22.26.

The p-value is p = 0.000. Since the p-value is so small, we reject H0. There is
sufficient evidence to indicate the model is useful for predicting foreign gross
revenue.

To determine if a curvilinear relationship exists between foreign and domestic gross
revenues, we test:

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is t = 1.74.

The p-value is p = 0.107. Since the p-value is greater than α = .05
(p = 0.107 > α = .05), H0 is not rejected. There is insufficient evidence to indicate
that a curvilinear relationship exists between foreign and domestic gross revenues at
α = .05.
e.  From the analysis in part d, the first-order model better explains the variation in
    foreign gross revenues. In part d, we concluded that the second-order term did not
    improve the model.

11.56

a.  (Graph not reproduced.)

b.  It moves the graph to the right (−2x) or to the left (+2x) compared to the graph of
    y = 1 + x².

c.  It controls whether the graph opens up (+x²) or down (−x²). It also controls how steep
    the curvature is, i.e., the larger the absolute value of the coefficient of x², the
    narrower the curve is.


11.58

a.

A scatterplot of the data is:

[Scatterplot of demand (y, roughly 3500 to 10500) versus day (x, 0 to 40)]

b.

From the plot, it looks like a second-order model would fit the data better than a
first-order model. There is little evidence that a third-order model would fit the data
better than a second-order model.

c.

Using MINITAB, the output for fitting a first-order model is:


The regression equation is
Y = 2752 + 122 X

Predictor     Coef  Stdev  t-ratio      p
Constant    2752.4  613.5     4.49  0.000
X           122.34  26.08     4.69  0.000

s = 1904    R-sq = 36.7%    R-sq(adj) = 35.0%

Analysis of Variance
SOURCE        DF         SS        MS      F      p
Regression     1   79775688  79775688  22.01  0.000
Error         38  137726224   3624374
Total         39  217501920

Unusual Observations
Obs.     X      Y    Fit  Stdev.Fit  Residual  St.Resid
 27   27.0   2007   6056        345     -4049    -2.16R
 40   40.0  11520   7646        591      3874     2.14R

R denotes an obs. with a large st. resid.


To see if there is a significant linear relationship between day and demand, we test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 4.69.

The p-value for the test is p = 0.000. Since the p-value is less than α = .05, H0 is
rejected. There is sufficient evidence to indicate that there is a linear relationship
between day and demand at α = .05.
d.

Using MINITAB, the output for fitting a second-order model is:


The regression equation is
Y = 5120 - 216 X + 8.25 XSQ

Predictor     Coef  Stdev  t-ratio      p
Constant    5120.2  816.9     6.27  0.000
X          -215.92  91.89    -2.35  0.024
XSQ          8.250  2.173     3.80  0.001

s = 1637    R-sq = 54.4%    R-sq(adj) = 52.0%

Analysis of Variance
SOURCE        DF         SS        MS      F      p
Regression     2  118377056  59188528  22.09  0.000
Error         37   99124856   2679050
Total         39  217501920

SOURCE  DF    SEQ SS
X        1  79775688
XSQ      1  38601372

Unusual Observations
Obs.     X     Y   Fit  Stdev.Fit  Residual  St.Resid
 27   27.0  2007  5305        357     -3298    -2.06R

R denotes an obs. with a large st. resid.

To see if there is a significant quadratic relationship between day and demand, we
test:

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is t = 3.80.

The p-value for the test is p = 0.001. Since the p-value is less than α = .05, H0 is
rejected. There is sufficient evidence to indicate that there is a quadratic relationship
between day and demand at α = .05.


e.  Since the quadratic term is significant in the second-order model in part d, the
    second-order model is better.

11.60

    The model is E(y) = β0 + β1x1 + β2x2

    where

    x1 = 1 if the variable is at level 2, 0 otherwise
    x2 = 1 if the variable is at level 3, 0 otherwise

    β0 = mean value of y when the qualitative variable is at level 1.
    β1 = difference in mean value of y between level 2 and level 1 of the qualitative variable.
    β2 = difference in mean value of y between level 3 and level 1 of the qualitative variable.
11.62

a.  The least squares prediction equation is:

    ŷ = 80 + 16.8x1 + 40.4x2

b.  β̂1 estimates the difference in the mean value of the dependent variable between level
    2 and level 1 of the independent variable.

    β̂2 estimates the difference in the mean value of the dependent variable between level
    3 and level 1 of the independent variable.
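    One way to see what the dummy coding does is to evaluate the prediction equation at each
    level; a small Python sketch (equation from part a):

        def y_hat(x1, x2):
            """Dummy-coded fit: x1 = 1 for level 2, x2 = 1 for level 3, both 0 for level 1."""
            return 80 + 16.8 * x1 + 40.4 * x2

        means = {
            "level 1": y_hat(0, 0),   # 80.0  (the intercept)
            "level 2": y_hat(1, 0),   # 96.8  (80 + 16.8)
            "level 3": y_hat(0, 1),   # 120.4 (80 + 40.4)
        }
        print(means)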
c.  The hypothesis H0: β1 = β2 = 0 is the same as H0: μ1 = μ2 = μ3.

    The hypothesis Ha: At least one of the parameters β1 and β2 differs from 0 is the same
    as Ha: At least one mean (μ1, μ2, or μ3) is different.

d.  The test statistic is F = MSR/MSE = 2059.5/83.3 = 24.72

    Since no α was given, we will use α = .05. The rejection region requires α = .05 in
    the upper tail of the test statistic with numerator df = k = 2 and denominator df = n −
    (k + 1) = 15 − (2 + 1) = 12. From Table IX, Appendix B, F.05 = 3.89. The rejection
    region is F > 3.89.

    Since the observed value of the test statistic falls in the rejection region (F = 24.72 >
    3.89), H0 is rejected. There is sufficient evidence to indicate at least one of the means
    is different at α = .05.
11.64

a.  A confidence interval for the difference of two population means could be used.
    Since both sample sizes are over 30, the large-sample confidence interval is used
    (with independent samples).

b.  Let x1 = 1 if public college, 0 otherwise

    The model is E(y) = β0 + β1x1


c.  β1 is the difference between the two population means. A point estimate for β1 is β̂1.
    A confidence interval for β1 could be used to estimate the difference in the two
    population means.

11.66

a.  Let x1 = 1 if no, 0 if yes

    The model would be E(y) = β0 + β1x1

    In this model, β0 is the mean job preference for those who responded "yes" to the
    question "Flextime of the position applied for" and β1 is the difference in the mean job
    preference between those who responded "no" to the question and those who answered
    "yes" to the question.

b.  Let x1 = 1 if referral, 0 if not
        x2 = 1 if on-premise, 0 if not

    The model would be E(y) = β0 + β1x1 + β2x2

    In this model, β0 is the mean job preference for those who responded "none" to the
    level of day care support required, β1 is the difference in the mean job preference
    between those who responded "referral" and those who responded "none", and β2 is the
    difference in the mean job preference between those who responded "on-premise" and
    those who responded "none".

c.  Let x1 = 1 if counseling, 0 if not
        x2 = 1 if active search, 0 if not

    The model would be E(y) = β0 + β1x1 + β2x2

    In this model, β0 is the mean job preference for those who responded "none" to
    spousal transfer support required, β1 is the difference in the mean job preference
    between those who responded "counseling" and those who responded "none", and β2 is
    the difference in the mean job preference between those who responded "active
    search" and those who responded "none".

d.  Let x1 = 1 if not married, 0 if married

    The model would be E(y) = β0 + β1x1

    In this model, β0 is the mean job preference for those who responded "married" to
    marital status and β1 is the difference in the mean job preference between those who
    responded "not married" and those who answered "married".


e.  Let x1 = 1 if female, 0 if male

    The model would be E(y) = β0 + β1x1

    In this model, β0 is the mean job preference for males and β1 is the difference in the
    mean job preference between females and males.

11.68

a.  β̂4 = .296. The difference in the mean value of DTVA between when the operating
    earnings are negative and lower than last year and when the operating earnings are
    not negative and lower than last year is estimated to be .296, holding all other
    variables constant.

b.  To determine if the mean DTVA for firms with negative earnings and earnings lower
    than last year exceeds the mean DTVA of other firms, we test:

    H0: β4 = 0
    Ha: β4 > 0

    The p-value for this test is p = .001/2 = .0005. Since the p-value is so small, we
    would reject H0 for α = .05. There is sufficient evidence to indicate the mean DTVA
    for firms with negative earnings and earnings lower than last year exceeds the mean
    DTVA of other firms at α = .05.

c.  Ra² = .280. 28% of the variability in the DTVA scores is explained by the model
    containing the 5 independent variables, adjusted for the number of variables in the
    model and the sample size.

11.70

a.

To determine if there is a difference in the mean monthly rate of return for T-Bills
between an expansive Fed monetary policy and a restrictive Fed monetary policy, we
test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 8.14.

Since neither n nor α is given, we cannot determine the exact rejection region. However,
we can assume that n is greater than 2 since the data used are from 1972 to 1997.
With α = .05, the critical value of t for the rejection region will be smaller than 4.303.
Thus, with α = .05, t = 8.14 will fall in the rejection region. There is sufficient
evidence to indicate a difference in the mean monthly rate of return for T-Bills
between an expansive Fed monetary policy and a restrictive Fed monetary policy at
α = .05.

However, the value of R² is .1818. The model used is explaining only 18.18% of the
variability in the monthly rate of return. This is not a particularly large value.


To determine if there is a difference in the mean monthly rate of return for Equity
REIT between an expansive Fed monetary policy and a restrictive Fed monetary
policy, we test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 3.46.

Since neither n nor α is given, we cannot determine the exact rejection region. However,
we can assume that n is greater than 4 since the data used are from 1972 to 1997.
With α = .05, the critical value of t for the rejection region will be smaller than 3.182.
Thus, with α = .05, t = 3.46 will fall in the rejection region. There is sufficient
evidence to indicate a difference in the mean monthly rate of return for Equity REIT
between an expansive Fed monetary policy and a restrictive Fed monetary policy at
α = .05.

However, the value of R² is .0387. The model used is explaining only 3.87% of the
variability in the monthly rate of return. This is a very small value.

b.  For the first model, β1 is the difference in the mean monthly rate of return for T-Bills
    between an expansive Fed monetary policy and a restrictive Fed monetary policy.
    For the second model, β1 is the difference in the mean monthly rate of return for
    Equity REIT between an expansive Fed monetary policy and a restrictive Fed
    monetary policy.

c.  The least squares prediction equation for the equity REIT index is:

    ŷ = 0.01863 − 0.01582x

    When the Federal Reserve's monetary policy is restrictive, x = 1. The predicted mean
    monthly rate of return for the equity REIT index is

    ŷ = 0.01863 − 0.01582(1) = .00281

    When the Federal Reserve's monetary policy is expansive, x = 0. The predicted mean
    monthly rate of return for the equity REIT index is

    ŷ = 0.01863 − 0.01582(0) = .01863.
11.72

a.  The first-order model is E(y) = β0 + β1x1

b.  The new model is E(y) = β0 + β1x1 + β2x2 + β3x3

    where x2 = 1 if level 2, 0 otherwise
          x3 = 1 if level 3, 0 otherwise


c.

To allow for interactions, the model is:


E(y) = 0 + 1x1 + 2x2 + 3x3 + 4x1x2 + 5x1x3

11.74

11.76

d.

The response lines will be parallel if 4 = 5 = 0

e.

There will be one response line if 2 = 3 = 4 = 5 = 0

a.

When x2 = x3 = 0, E(y) = 0 + 1x1


When x2 = 1 and x3 = 0, E(y) = 0 + 1x1 + 2
When x2 = 0 and x3 = 1, E(y) = 0 + 1x1 + 3

b.

For level 1, y = 44.8 + 2.2x1


For level 2, y = 44.8 + 2.2x1 + 9.4
= 54.2 + 2.2x1
For level 3, y = 44.8 + 2.2x1 + 15.6
= 60.4 + 2.2x1

The model is E(y) = 0 + 1x1 + 2 x12 + 3x2 + 4x3 + 5x4


where x1 is the quantitative variable and
1 if level 2 of qualitative variable
x2 =
0 otherwise
1 if level 3 of qualitative variable
x3 =
0 otherwise
1 if level 4 of qualitative variable
x4 =
0 otherwise
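The dummy-variable coding used in models like the one in Exercise 11.76 can also be built programmatically. The sketch below is only an illustration (the data frame, column names, and values are hypothetical, not from the exercise); it uses pandas to create the indicator columns for a four-level qualitative factor together with a quantitative x1 and its square.

    import pandas as pd

    # Hypothetical data: x1 is quantitative, "level" is a 4-level qualitative factor
    df = pd.DataFrame({
        "x1":    [2.0, 3.5, 1.2, 4.8, 2.9, 3.3],
        "level": ["1", "2", "3", "4", "2", "3"],
    })

    # Level 1 is the base level, so indicators are created only for levels 2-4
    dummies = pd.get_dummies(df["level"], prefix="lvl", drop_first=True).astype(int)

    # Columns of the design matrix for E(y) = b0 + b1*x1 + b2*x1^2 + b3*x2 + b4*x3 + b5*x4
    X = pd.concat([df[["x1"]], df["x1"] ** 2, dummies], axis=1)
    X.columns = ["x1", "x1_sq", "x2", "x3", "x4"]
    print(X)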


11.78

a.

E(y) = β0 + β1x1 + β2x2 + β3x1x2

where x2 = 1 if diet is duck chow, 0 otherwise
b.

Using MINITAB, the printout is:


The regression equation is
WtChg = -2.21 + 0.0783 x1 + 10.4 x2 - 0.095 x1x2

Predictor    Coef       StDev       T       P
Constant    -2.210      1.250     -1.77   0.085
x1           0.07831    0.04947    1.58   0.122
x2          10.354      8.538      1.21   0.233
x1x2        -0.0948     0.1418    -0.67   0.508

S = 3.882   R-Sq = 44.1%   R-Sq(adj) = 39.7%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       3    452.54  150.85  10.01  0.000
Residual Error  38    572.58   15.07
Total           41   1025.12

Source  DF  Seq SS
x1       1  384.24
x2       1   61.57
x1x2     1    6.73

Unusual Observations
Obs    x1     WtChg     Fit   StDev Fit  Residual  St Resid
12    30.0   -8.500   0.139     0.802     -8.639    -2.27R
37    42.5    8.000   7.445     2.990      0.555     0.22 X
40    75.0    8.500   6.910     2.077      1.590     0.48 X

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.

The fitted equation is ŷ = -2.21 + .0783x1 + 10.4x2 - .095x1x2


c.	For diet = plants, x2 = 0

	ŷ = -2.21 + .0783x1 + 10.4(0) - .095x1(0) = -2.21 + .0783x1

	The slope is .0783. For each unit increase in digestion efficiency, the mean weight change is estimated to increase by .0783 for goslings fed plants.

d.	For diet = duck chow, x2 = 1

	ŷ = -2.21 + .0783x1 + 10.4(1) - .095x1(1) = 8.19 - .0167x1

	The slope is -.0167. For each unit increase in digestion efficiency, the mean weight change is estimated to decrease by .0167 for goslings fed duck chow.

e.	To determine if the slopes associated with the two diets differ, we test:

	H0: β3 = 0
	Ha: β3 ≠ 0

	From MINITAB, the test statistic is t = -.67 with p-value = .508.

	Since α = .05 is less than the p-value, we fail to reject H0. There is insufficient evidence to conclude that the slopes associated with the two diets are significantly different at α = .05.
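As a rough illustration (not the authors' method), the same interaction model could be fit in Python with statsmodels. The file name and column names below are hypothetical, since the gosling data set is not reproduced in this manual.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical file and columns: wtchg = weight change, x1 = digestion efficiency,
    # x2 = 1 if the diet is duck chow, 0 otherwise
    geese = pd.read_csv("goslings.csv")
    fit = smf.ols("wtchg ~ x1 + x2 + x1:x2", data=geese).fit()

    print(fit.summary())           # coefficient table, R-Sq, overall F test
    print(fit.pvalues["x1:x2"])    # p-value for the interaction test H0: beta3 = 0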
11.80

a.

Let x2 = 1 if intervention group, 0 if otherwise

The first-order model would be:

E(y) = β0 + β1x1 + β2x2

b.

For the control group, x2 = 0. The first-order model is:

E(y) = β0 + β1x1 + β2(0) = β0 + β1x1

For the intervention group, x2 = 1. The first-order model is:

E(y) = β0 + β1x1 + β2(1) = β0 + β1x1 + β2 = (β0 + β2) + β1x1

In both models, the slope of the line is β1.


c.

If pretest score and group interact, the first-order model would be:
E(y) = β0 + β1x1 + β2x2 + β3x1x2


d.

For the control group, x2 = 0. The first-order model including the interaction is:
E(y) = β0 + β1x1 + β2(0) + β3x1(0) = β0 + β1x1

For the intervention group, x2 = 1. The first-order model including the interaction is:
E(y) = β0 + β1x1 + β2(1) + β3x1(1) = β0 + β1x1 + β2 + β3x1 = (β0 + β2) + (β1 + β3)x1

The slope of the model for the control group is β1. The slope of the model for the intervention group is β1 + β3.
11.82

a.

The first-order model is:


E(y) = β0 + β1x1 + β2x2

b.

For the high-tech firms, x2 = 1. The model for the high-tech firm is:
E(y) = β0 + β1x1 + β2(1) = β0 + β2 + β1x1

The slope of the line would be β1.

c.	The new model would include the interaction term:

	E(y) = β0 + β1x1 + β2x2 + β3x1x2

d.	For the high-tech firms, x2 = 1. The model for the high-tech firm is:
	E(y) = β0 + β1x1 + β2(1) + β3x1(1) = β0 + β2 + (β1 + β3)x1

	The slope of the line would be β1 + β3.


11.84

By adding variables to the model, SSE will decrease or stay the same. Thus, SSE_C ≤ SSE_R. The only circumstance under which we will reject H0 is if SSE_C is much smaller than SSE_R. If SSE_C is much smaller than SSE_R, F will be large. Thus, the test is only one-tailed.

11.86

a.

Ha: At least one βi ≠ 0, i = 3, 4, 5

b.

The reduced model would be E(y) = β0 + β1x1 + β2x2

c.

The numerator df = k - g = 5 - 2 = 3 and the denominator df = n - (k + 1) = 30 - (5 + 1) = 24.


d.

H0: β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 3, 4, 5

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}
  = [(1250.2 - 1125.2)/(5 - 2)] / {1125.2/[30 - (5 + 1)]}
  = 41.6667/46.8833 = .89

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k - g = 5 - 2 = 3 and denominator df = n - (k + 1) = 30 - (5 + 1) = 24. From Table IX, Appendix B, F.05 = 3.01. The rejection region is F > 3.01.

Since the observed value of the test statistic does not fall in the rejection region (F = .89 ≯ 3.01), H0 is not rejected. There is insufficient evidence to indicate the second-order terms are useful at α = .05.
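For readers who want to check this arithmetic, a minimal sketch of the partial (nested-model) F test is shown below. The SSE values and sample size are those given in the exercise; scipy is assumed to be available only for the table lookup.

    from scipy import stats

    sse_r, sse_c = 1250.2, 1125.2    # reduced- and complete-model SSEs
    n, k, g = 30, 5, 2               # sample size, complete-model terms, reduced-model terms

    f_stat = ((sse_r - sse_c) / (k - g)) / (sse_c / (n - (k + 1)))
    f_crit = stats.f.ppf(0.95, dfn=k - g, dfd=n - (k + 1))

    print(round(f_stat, 2), round(f_crit, 2))   # about 0.89 and 3.01, so H0 is not rejected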
11.88

a.

Let variables x1 through x4 be the Demographic variables, variables x5 through x11 be the Diagnostic variables, variables x12 through x15 be the Treatment variables, and variables x16 through x21 be the Community variables. The complete model is:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x7 + β8x8 + β9x9 + β10x10 + β11x11 + β12x12 + β13x13 + β14x14 + β15x15 + β16x16 + β17x17 + β18x18 + β19x19 + β20x20 + β21x21

b.

To determine if the 7 Diagnostic variables contribute information for the prediction


of y, we test:
H0: β5 = β6 = ... = β11 = 0

c.

The reduced model would be:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β12x12 + β13x13 + β14x14 + β15x15 + β16x16 + β17x17 + β18x18 + β19x19 + β20x20 + β21x21

d.	Since the p-value is so small (p < .0001), H0 is rejected. There is sufficient evidence to indicate at least one of the seven diagnostic variables contributes information for the prediction of y.

11.90

a.

The complete second-order model is:

E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x1x2 + β5x1²x2

where x1 = age
      x2 = 1 if current, 0 otherwise


b.	To determine if the quadratic terms are important, we test:

	H0: β2 = β5 = 0

c.	To determine if the interaction terms are important, we test:

	H0: β4 = β5 = 0

d.

From MINITAB, the outputs from fitting the three models are:

Regression Analysis: Value versus Age, AgeSq, Status, AgeSt, AgeSqSt

The regression equation is
Value = 83 - 5.7 Age + 0.236 AgeSq - 62 Status + 5.4 AgeSt - 0.234 AgeSqSt

Predictor    Coef       SE Coef      T       P
Constant     83.4       316.3       0.26   0.793
Age          -5.74       18.68     -0.31   0.760
AgeSq         0.2361      0.2549    0.93   0.359
Status      -62.1       354.8      -0.18   0.862
AgeSt         5.36       24.81      0.22   0.830
AgeSqSt      -0.2337      0.4080   -0.57   0.570

S = 286.8   R-Sq = 24.7%   R-Sq(adj) = 16.1%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       5   1186549   237310   2.89  0.024
Residual Error  44   3618994    82250
Total           49   4805542

Source   DF  Seq SS
Age       1  865746
AgeSq     1  138871
Status    1   77594
AgeSt     1   77342
AgeSqSt   1   26996

Regression Analysis: Value versus Age, Status, AgeSt

The regression equation is
Value = -176 + 11.2 Age + 196 Status - 11.4 AgeSt

Predictor    Coef      SE Coef      T       P
Constant   -176.1      145.0      -1.21   0.231
Age          11.166      3.902     2.86   0.006
Status      196.5      178.9       1.10   0.278
AgeSt       -11.432      6.763    -1.69   0.098

S = 283.2   R-Sq = 23.2%   R-Sq(adj) = 18.2%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       3   1116017   372006   4.64  0.006
Residual Error  46   3689526    80207
Total           49   4805543

Source   DF  Seq SS
Age       1  865746
Status    1   21097
AgeSt     1  229174

Regression Analysis: Value versus Age, AgeSq, Status

The regression equation is
Value = 166 - 8.8 Age + 0.253 AgeSq - 106 Status

Predictor    Coef      SE Coef      T       P
Constant    165.8      182.7       0.91   0.369
Age          -8.81      10.89     -0.81   0.423
AgeSq         0.2535     0.1632    1.55   0.127
Status     -105.6      107.9      -0.98   0.333

S = 284.5   R-Sq = 22.5%   R-Sq(adj) = 17.5%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       3   1082210   360737   4.46  0.008
Residual Error  46   3723332    80942
Total           49   4805542

Source   DF  Seq SS
Age       1  865746
AgeSq     1  138871
Status    1   77594

Test for part b:

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}
  = [(3,689,526 - 3,618,994)/2] / 82,250 = .429

Since no α is given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = 2 numerator degrees of freedom and ν2 = 44 denominator degrees of freedom. From Table IX, Appendix B, F.05 ≈ 3.23. The rejection region is F > 3.23.

Since the observed value of the test statistic does not fall in the rejection region (F = .429 ≯ 3.23), H0 is not rejected. There is insufficient evidence to indicate the quadratic terms are important for predicting market value at α = .05.

Test for part c:

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}
  = [(3,723,332 - 3,618,994)/(5 - 3)] / 82,250 = .634

The rejection region is the same as in the previous test. Reject H0 if F > 3.23.

Since the observed value of the test statistic does not fall in the rejection region (F = .634 ≯ 3.23), H0 is not rejected. There is insufficient evidence to indicate the interaction terms are important for predicting market value at α = .05.
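A sketch of how the complete and reduced models above could be fit and compared in Python is shown below. This is only an illustration of the technique, not the authors' procedure; the file name and column names are assumed, since the market-value data are not reprinted here.

    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    homes = pd.read_csv("market_value.csv")   # assumed columns: Value, Age, Status

    complete = smf.ols(
        "Value ~ Age + I(Age**2) + Status + Age:Status + I(Age**2):Status",
        data=homes).fit()
    no_quadratic = smf.ols("Value ~ Age + Status + Age:Status", data=homes).fit()

    # Partial F test for H0: beta2 = beta5 = 0 (the quadratic terms)
    print(anova_lm(no_quadratic, complete))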


11.92

a.

The reduced model for testing if the mean posttest scores differ for the intervention
and control groups would be:
E(y) = β0 + β1x1

b.	The reported p-value is .03. Since the p-value is so small, H0 is rejected. There is evidence to indicate that the mean posttest sun safety knowledge scores differ for the intervention and control groups for α > .03.

c.	The reported p-value is .033. Since the p-value is so small, H0 is rejected. There is evidence to indicate that the mean posttest sun safety comprehension scores differ for the intervention and control groups for α > .033.

d.	The reported p-value is .322. Since the p-value is not small, H0 is not rejected. There is no evidence to indicate that the mean posttest sun safety application scores differ for the intervention and control groups for α < .322.

11.94

a.

To determine whether the rate of increase of emotional distress with experience is different for the two groups, we test:

H0: β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 4, 5

b.

To determine whether there are differences in mean emotional distress levels that are attributable to exposure group, we test:

H0: β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 3, 4, 5

c.

To determine whether there are differences in mean emotional distress levels that are attributable to exposure group, we test:

H0: β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 3, 4, 5

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]} = [(795.23 - 783.9)/(5 - 2)] / {783.9/[200 - (5 + 1)]} = .93

The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k - g = 5 - 2 = 3 and ν2 = n - (k + 1) = 200 - (5 + 1) = 194. From Table IX, Appendix B, F.05 ≈ 2.60. The rejection region is F > 2.60.

Since the observed value of the test statistic does not fall in the rejection region (F = .93 ≯ 2.60), H0 is not rejected. There is insufficient evidence to indicate that there are differences in mean emotional distress levels that are attributable to exposure group at α = .05.


11.96

a.

The best one-variable predictor of y is the one whose t statistic has the largest absolute
value. The t statistics for each of the variables are:

Independent Variable    t = β̂i / s_β̂i
x1                      t = 1.6/.42 = 3.81
x2                      t = .9/.01 = 90
x3                      t = 3.4/1.14 = 2.98
x4                      t = 2.5/2.06 = 1.21
x5                      t = 4.4/.73 = 6.03
x6                      t = .3/.35 = .86

The variable x2 is the best one-variable predictor of y. The absolute value of the corresponding t score is 90. This is larger than any of the others.
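A quick sketch of this arithmetic (the coefficient estimates and standard errors are the ones listed in the table above):

    # Coefficient estimates and standard errors from the table above
    estimates = {"x1": (1.6, .42), "x2": (.9, .01), "x3": (3.4, 1.14),
                 "x4": (2.5, 2.06), "x5": (4.4, .73), "x6": (.3, .35)}

    t_ratios = {name: b / s for name, (b, s) in estimates.items()}
    best = max(t_ratios, key=lambda name: abs(t_ratios[name]))
    print(t_ratios)   # x2 has |t| = 90, the largest
    print(best)       # 'x2'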

b.	Yes. In the stepwise procedure, the first variable entered is the one which has the largest absolute value of t, provided the absolute value of the t falls in the rejection region.

c.	Once x2 is entered, the next variable that is entered is the one that, in conjunction with x2, has the largest absolute t value associated with it.

11.98

a.

In step 1, all one-variable models are fit. Thus, there are a total of 11 models fit.

b.

In step 2, all two-variable models are fit, where 1 of the variables is the best one
selected in step 1. Thus, a total of 10 two-variable models are fit.

c.

In the 11th step, only one model is fit: the model containing all the independent variables.

d.

The model would be:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β7x7 + β9x9 + β10x10 + β11x11
e.

67.7% of the total sample variability of overall satisfaction is explained by the model containing the independent variables safety on bus, seat availability, dependability, travel time, convenience of route, safety at bus stops, hours of service, and frequency of service.

f.

Using stepwise regression does not guarantee that the best model will be found.
There may be better combinations of the independent variables that are never found,
because of the order in which the independent variables are entered into the model.


11.100 a.

The plot of the residuals reveals a nonrandom pattern. The residuals exhibit a curved
shape. Such a pattern usually indicates that curvature needs to be added to the model.

b.

The plot of the residuals reveals a nonrandom pattern. The plot of the residuals versus the predicted values shows a pattern where the range in values of the residuals increases as ŷ increases. This indicates that the variance of the random error, ε, becomes larger as the estimate of E(y) increases in value. Since E(y) depends on the x-values in the model, this implies that the variance of ε is not constant for all settings of the x's.

c.

This plot reveals an outlier, since all or almost all of the residuals should fall within 3
standard deviations of their mean of 0.

d.

This frequency distribution of the residuals is skewed to the right. This may be due to
outliers or could indicate the need for a transformation of the dependent variable.

11.102	a.	Since all the pairwise correlations are .45 or less in absolute value, there is little evidence of extreme multicollinearity.

	b.	No. The overall model test is significant (p < .001). This implies that at least one variable contributes to the prediction of the urban/rural rating. Looking at the individual t-tests, there are several that are significant, namely x1, x3, and x5. There is no evidence that multicollinearity is present.

11.104 First, we need to compute the value of the residual:

Residual = y - ŷ = 87 - 29.63 = 57.37


We are given that the standard deviation is s = 24.68. Thus, an observation with a
residual of 57.37 is 57.37 / 24.68 = 2.32 standard deviations from the fitted regression
line. Since this is less than 3 standard deviations from the regression line, this point is
not considered an outlier.
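A sketch of this check, using the values from the exercise:

    y_obs, y_hat, s = 87, 29.63, 24.68

    residual = y_obs - y_hat                # 57.37
    num_std_devs = residual / s             # about 2.32 standard deviations from the fitted line
    print(round(num_std_devs, 2), abs(num_std_devs) > 3)   # 2.32, False: not an outlier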


11.106 a.

From MINITAB, the output is:


Regression Analysis: Food versus Income, Size

The regression equation is
Food = 2.79 - 0.00016 Income + 0.383 Size

Predictor     Coef        SE Coef       T       P
Constant     2.7944      0.4363        6.40   0.000
Income      -0.000164    0.006564     -0.02   0.980
Size         0.38348     0.07189       5.33   0.000

S = 0.7188   R-Sq = 55.8%   R-Sq(adj) = 52.0%

Analysis of Variance
Source          DF      SS        MS       F      P
Regression       2   15.0027    7.5013   14.52  0.000
Residual Error  23   11.8839    0.5167
Total           25   26.8865

Source   DF  Seq SS
Income    1   0.2989
Size      1  14.7037

Correlations: Income, Size

Pearson correlation of Income and Size = -0.137
P-Value = 0.506

No; income and household size do not seem to be highly correlated. The correlation coefficient between income and household size is -.137.
b.

Using MINITAB, the residual plots are:


[Residual plots (response is Food): histogram of the residuals, residuals versus the fitted values, residuals versus Income, and residuals versus Size.]

Yes; the residuals versus income and the residuals versus household size exhibit a curved shape. Such a pattern could indicate that a second-order model may be more appropriate.
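These diagnostics can be reproduced with any regression package. A minimal statsmodels/matplotlib sketch is given below; the file name and column names are assumptions, since the food-consumption data are not reprinted here.

    import pandas as pd
    import matplotlib.pyplot as plt
    import statsmodels.formula.api as smf

    food = pd.read_csv("food.csv")                      # assumed columns: Food, Income, Size
    fit = smf.ols("Food ~ Income + Size", data=food).fit()

    fig, axes = plt.subplots(2, 2, figsize=(8, 6))
    axes[0, 0].hist(fit.resid)                          # histogram of the residuals
    axes[0, 1].scatter(fit.fittedvalues, fit.resid)     # residuals vs fitted values
    axes[1, 0].scatter(food["Income"], fit.resid)       # residuals vs Income
    axes[1, 1].scatter(food["Size"], fit.resid)         # residuals vs Size
    plt.tight_layout()
    plt.show()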


c.

No; the residuals versus the predicted values reveal varying spreads for different values of ŷ. This implies that the variance of ε is not constant for all settings of the x's.

d.

Yes; The outlier shows up in several plots and is the 26th household (Food consumption
= $7500, income = $7300 and household size = 5).

e.

No; The frequency distribution of the residuals shows that the outlier skews the
frequency distribution to the right.

11.108 Using MINITAB, the residual plots are:

[Residual plots for DDT: normal probability plot of the standardized residuals, standardized residuals versus the fitted values, histogram of the standardized residuals, standardized residuals versus the order of the data, and standardized residuals versus WEIGHT, LENGTH, and MILE.]

From the normal probability plot, the points do not fall on a straight line, indicating the
residuals are not normal. The histogram of the residuals indicates the residuals are
skewed to the right, which also indicates that the residuals are not normal. The plot of
the residuals versus ŷ indicates that there is at least one outlier and the variance is
not constant. One observation has a standardized residual of more than 10 and several
others have standardized residuals greater than 3. This is also evident in the plots of the
residuals versus each of the independent variables. Since the assumptions of normality
and constant variance appear to be violated, we could consider transforming the data.
We should also check the outlying observations to see if there are any errors connected
with these observations.
11.110 a.

To determine if at least one of the parameters is not zero, we test:


H0: β1 = β2 = β3 = β4 = 0
Ha: At least one βi ≠ 0

The test statistic is
F = (R²/k) / {(1 - R²)/[n - (k + 1)]} = (.83/4) / {(1 - .83)/[25 - (4 + 1)]} = 24.41

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k = 4 and denominator df = n - (k + 1) = 25 - (4 + 1) = 20. From Table IX, Appendix B, F.05 = 2.87. The rejection region is F > 2.87.

Since the observed value of the test statistic falls in the rejection region (F = 24.41 > 2.87), H0 is rejected. There is sufficient evidence to indicate at least one of the parameters is nonzero at α = .05.
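A sketch of this computation, using the values from the exercise (scipy is assumed only for the table lookup):

    from scipy import stats

    r_sq, k, n = .83, 4, 25
    f_stat = (r_sq / k) / ((1 - r_sq) / (n - (k + 1)))
    f_crit = stats.f.ppf(0.95, dfn=k, dfd=n - (k + 1))
    print(round(f_stat, 2), round(f_crit, 2))   # about 24.41 and 2.87, so H0 is rejected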
b.

H0: β1 = 0
Ha: β1 < 0

The test statistic is t = (β̂1 - 0)/s_β̂1 = (-2.43 - 0)/1.21 = -2.01

The rejection region requires α = .05 in the lower tail of the t distribution with df = n - (k + 1) = 25 - (4 + 1) = 20. From Table VI, Appendix B, t.05 = 1.725. The rejection region is t < -1.725.

Since the observed value of the test statistic falls in the rejection region (t = -2.01 < -1.725), H0 is rejected. There is sufficient evidence to indicate β1 is less than 0 at α = .05.
c.

H0: β2 = 0
Ha: β2 > 0

The test statistic is t = (β̂2 - 0)/s_β̂2 = (.05 - 0)/.16 = .31

The rejection region requires α = .05 in the upper tail of the t distribution. From part b above, the rejection region is t > 1.725.

Since the observed value of the test statistic does not fall in the rejection region (t = .31 ≯ 1.725), H0 is not rejected. There is insufficient evidence to indicate β2 is greater than 0 at α = .05.
d.

H0: β3 = 0
Ha: β3 ≠ 0

The test statistic is t = (β̂3 - 0)/s_β̂3 = (.62 - 0)/.26 = 2.38

The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with df = 20. From Table VI, Appendix B, t.025 = 2.086. The rejection region is t < -2.086 or t > 2.086.

Since the observed value of the test statistic falls in the rejection region (t = 2.38 > 2.086), H0 is rejected. There is sufficient evidence to indicate β3 is different from 0 at α = .05.


11.112 The error of prediction is smallest when the values of x1, x2, and x3 are equal to their sample
means. The further x1, x2, and x3 are from their means, the larger the error. When x1 = 60,
x2 = .4, and x3 = 900, the observed values are outside the observed ranges of the x values.
When x1 = 30, x2 = .6, and x3 = 1300, the observed values are within the observed ranges
and consequently the x values are closer to their means. Thus, when x1 = 30, x2 = .6, and
x3 = 1300, the error of prediction is smaller.
11.114 From the plot of the residuals for the straight line model, there appears to be a mound shape
which implies the quadratic model should be used.
11.116	a.	Ha: At least one of β4 and β5 ≠ 0

	b.	The regression model
		E(y) = β0 + β1x1 + β2x2 + β3x2² + β4x1x2 + β5x1x2²
		is fit to the 35 data points, yielding a sum of squares for error, denoted SSE_C. The regression model
		E(y) = β0 + β1x1 + β2x2 + β3x2²
		is also fit to the data and its sum of squares for error is obtained, denoted SSE_R. Then the test statistic is:

		F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}

		where k = 5, g = 3, and n = 35.


c.

The numerator degrees of freedom is k - g = 5 - 3 = 2, and the denominator degrees of freedom is n - (k + 1) = 35 - (5 + 1) = 29.

d.

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = 2 and denominator df = 29. From Table IX, Appendix B, F.05 = 3.33. The rejection region is F > 3.33.

11.118 a.

E(y) = β0 + β1x1 + β2x2 + β3x3

where x2 = 1 if level 2, 0 otherwise
      x3 = 1 if level 3, 0 otherwise

b.

E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x3 + β5x1x2 + β6x1x3 + β7x1²x2 + β8x1²x3


where x1, x2, and x3 are as in part a.


11.120	a.	E(y) = β0 + β1x1 + β2x2

	b.	E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x2² + β5x1x2

11.122	a.	1.	The "Quantitative GMAT score" is measured on a numerical scale, so it is a quantitative variable.
		2.	The "Verbal GMAT score" is measured on a numerical scale, so it is a quantitative variable.
		3.	The "Undergraduate GPA" is measured on a numerical scale, so it is a quantitative variable.
		4.	The "First-year graduate GPA" is measured on a numerical scale, so it is a quantitative variable.
		5.	The "Student cohort" has 3 categories, so it is a qualitative variable. Note that the numerical scale is meaningless in this situation. (It is possible to consider this as a quantitative variable. However, for this problem we will consider it as qualitative.)

	b.	The quantitative variables GMAT score, verbal GMAT score, undergraduate GPA, and first-year graduate GPA should all be positively correlated to final GPA.

	c.	x5 = 1 if student entered doctoral program in year 3, 0 otherwise
		x6 = 1 if student entered doctoral program in year 5, 0 otherwise

d.

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6

e.

β0 = the y-intercept for students entering in year 1.

β1 = the final GPA will increase by β1 for each additional increase of one unit of GMAT score, holding the remaining variables constant.
β2 = the final GPA will increase by β2 for each additional increase of one unit of verbal GMAT score, holding the remaining variables constant.
β3 = the final GPA will increase by β3 for each additional increase of one undergraduate GPA point, holding the remaining variables constant.
β4 = the final GPA will increase by β4 for each additional increase of one first-year graduate GPA point, holding the remaining variables constant.
β5 = difference in mean final GPA between student cohort year 2 and year 1.
β6 = difference in mean final GPA between student cohort year 3 and year 1.
f.

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x1x5 + β8x1x6 + β9x2x5 + β10x2x6 + β11x3x5 + β12x3x6 + β13x4x5 + β14x4x6


g.

For the year 1 cohort, x5 = x6 = 0. The model is:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5(0) + β6(0) + β7x1(0) + β8x1(0) + β9x2(0) + β10x2(0) + β11x3(0) + β12x3(0) + β13x4(0) + β14x4(0)
     = β0 + β1x1 + β2x2 + β3x3 + β4x4

The slopes for the four variables are β1, β2, β3, and β4, respectively.

11.124 a.

The hypothesized model is:


E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

β0 = y-intercept. It has no interpretation in this model.
β1 = difference in the mean salaries between males and females, all other variables held constant.
β2 = difference in the mean salaries between whites and nonwhites, all other variables held constant.
β3 = change in the mean salary for each additional year of education, all other variables held constant.
β4 = change in the mean salary for each additional year of tenure with firm, all other variables held constant.
β5 = change in the mean salary for each additional hour worked per week, all other variables held constant.
b.

The least squares equation is:

ŷ = 15.491 + 12.774x1 + .713x2 + 1.519x3 + .32x4 + .205x5

β̂0 = estimate of the y-intercept. It has no interpretation in this model.
β̂1: We estimate the difference in the mean salaries between males and females to be $12.774, all other variables held constant.
β̂2: We estimate the difference in the mean salaries between whites and nonwhites to be $.713, all other variables held constant.
β̂3: We estimate the change in the mean salary for each additional year of education to be $1.519, all other variables held constant.
β̂4: We estimate the change in the mean salary for each additional year of tenure with firm to be $.320, all other variables held constant.
β̂5: We estimate the change in the mean salary for each additional hour worked per week to be $.205, all other variables held constant.


c.

R² = .240. 24% of the total variability of salaries is explained by the model containing gender, race, educational level, tenure with firm, and number of hours worked per week.

To determine if the model is useful for predicting annual salary, we test:

H0: β1 = β2 = β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0

The test statistic is F = (R²/k) / {(1 - R²)/[n - (k + 1)]} = (.24/5) / {(1 - .24)/[191 - (5 + 1)]} = 11.68

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k = 5 and denominator df = n - (k + 1) = 191 - (5 + 1) = 185. From Table IX, Appendix B, F.05 ≈ 2.21. The rejection region is F > 2.21.

Since the observed value of the test statistic falls in the rejection region (F = 11.68 > 2.21), H0 is rejected. There is sufficient evidence to indicate the model containing gender, race, educational level, tenure with firm, and number of hours worked per week is useful for predicting annual salary for α = .05.
d.

To determine if male managers are paid more than female managers, we test:

H0: β1 = 0
Ha: β1 > 0

The p-value given for the test is < .05/2 = .025. Since the p-value is less than α = .05, there is evidence to reject H0. There is evidence to indicate male managers are paid more than female managers, holding all other variables constant, for α > .025.

e.	The salary paid an individual depends on many factors other than gender. Thus, in order to adjust for other factors influencing salary, we include them in the model.

11.126	a.	The main effects model would be: E(y) = β0 + β1x1 + β8x8

	b.	β̂1 = -.28. The mean value for the relative error of the effort estimate for developers is estimated to be .28 units below that of project leaders, holding previous accuracy constant.

		β̂8 = .27. The mean value for the relative error of the effort estimate if previous accuracy is more than 20% is estimated to be .27 units above that if previous accuracy is less than 20%, holding company role of estimator constant.

	c.	One possible reason for the sign of β̂1 being opposite from what is expected could be that company role of estimator and previous accuracy could be correlated.


11.128 a.

R² = .45. 45% of the total variability of the suicide rates is explained by the model containing unemployment rate, percentage of females in the work force, divorce rate, logarithm of GNP, and annual percent change in GNP.

To determine if the model is useful for predicting suicide rate, we test:

H0: β1 = β2 = β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0

The test statistic is F = (R²/k) / {(1 - R²)/[n - (k + 1)]} = (.45/5) / {(1 - .45)/[45 - (5 + 1)]} = 6.38

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k = 5 and denominator df = n - (k + 1) = 45 - (5 + 1) = 39. From Table IX, Appendix B, F.05 ≈ 2.45. The rejection region is F > 2.45.

Since the observed value of the test statistic falls in the rejection region (F = 6.38 > 2.45), H0 is rejected. There is sufficient evidence to indicate the model containing unemployment rate, percentage of females in the work force, divorce rate, logarithm of GNP and annual percent change in GNP is useful for predicting suicide rate for α = .05.
b.

β̂0 = .002 = estimate of the y-intercept. It has no interpretation in this model.

β̂1: We estimate the change in suicide rate for each unit change in unemployment rate to be .0204, all other variables held constant.
β̂2: We estimate the change in suicide rate for each unit change in percentage of females in the work force to be .0231, all other variables held constant.
β̂3: We estimate the change in suicide rate for each unit change in divorce rate to be .0765, all other variables held constant.
β̂4: We estimate the change in suicide rate for each unit change in logarithm of GNP to be .2760, all other variables held constant.
β̂5: We estimate the change in suicide rate for each unit change in annual percent change in GNP to be .0018, all other variables held constant.

The p-values for unemployment rate and percentage of females in the work force are less than .05. This indicates that both are important in predicting suicide rate. The p-values for divorce rate, logarithm of GNP, and annual percent change in GNP are all greater than .10. This indicates that none of these variables are important in predicting suicide rate. We must view these conclusions with caution. Some of these independent variables may be highly correlated with each other. If so, some of the variables declared nonsignificant may be significant if the other variables are removed from the model.


c.

To determine if unemployment rate is a useful predictor of the suicide rate, we test:


H0: β1 = 0
Ha: β1 ≠ 0

The p-value = .002. Since this p-value is less than α = .05, there is evidence to reject H0. There is sufficient evidence to indicate unemployment rate is a useful predictor of the suicide rate for α = .05.

d.

Curvature: It may be possible that the relationship between the suicide rate and some of the independent variables is not linear, but curved. Thus, some of the variables that do not appear to be useful predictors may, in fact, be useful predictors if the second-order term were added to the model.
Interaction: Again, it may be possible that the effect of some independent variables
on the suicide rate is different for different levels of other independent variables. This
possibility should be explored before throwing out certain independent variables.
Multicollinearity: Some of these independent variables may be highly correlated with
each other. If so, some of the variables declared nonsignificant may be significant if
other variables are removed from the model.

11.130 CEO income (x1) and stock percentage (x2) are said to interact if the effect of one variable,
say CEO income, on the dependent variable profit (y) depends on the level of the second
variable, stock percentage.
11.132 a.

The SAS output is:


DEP VARIABLE: Y

ANALYSIS OF VARIANCE
SOURCE     DF    SUM OF SQUARES    MEAN SQUARE    F VALUE    PROB>F
MODEL       3     25784705.01      8594901.67     241.758    0.0001
ERROR      16       568826.19        35551.63709
C TOTAL    19     26353531.20

ROOT MSE   188.5514    R-SQUARE   0.9784
DEP MEAN   3014.2      ADJ R-SQ   0.9744
C.V.       6.255438

PARAMETER ESTIMATES
VARIABLE   DF    PARAMETER ESTIMATE    STANDARD ERROR    T FOR H0: PARAMETER=0    PROB > |T|
INTERCEP    1       1333.17830           290.99944              4.581              0.0003
X1          1         -0.15122302          0.37864583           -0.399              0.6949
X2          1         -2.62532461          5.34596285           -0.491              0.6300
X1X2        1          0.05195415          0.006863831           7.569              0.0001

The fitted model is ŷ = 1333.18 - .151x1 - 2.625x2 + .052x1x2


b.

To determine if the overall model is useful, we test:

H0: β1 = β2 = β3 = 0
Ha: At least one βi ≠ 0, i = 1, 2, 3

The test statistic is F = MSR/MSE = 8,594,901.67/35,551.637 = 241.758

The rejection region requires α = .05 in the upper tail of the F distribution with numerator df = k = 3 and denominator df = n - (k + 1) = 20 - (3 + 1) = 16. From Table IX, Appendix B, F.05 = 3.24. The rejection region is F > 3.24.

Since the observed value of the test statistic falls in the rejection region (F = 241.758 > 3.24), H0 is rejected. There is sufficient evidence to indicate the model is useful at α = .05.
c.

To determine if the interaction is present, we test:

H0: β3 = 0
Ha: β3 ≠ 0

The test statistic is t = (β̂3 - 0)/s_β̂3 = 7.569.

The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with df = n - (k + 1) = 20 - (3 + 1) = 16. From Table VI, Appendix B, t.025 = 2.120. The rejection region is t < -2.120 or t > 2.120.

Since the observed value of the test statistic falls in the rejection region (t = 7.569 > 2.120), H0 is rejected. There is sufficient evidence to indicate the interaction between advertising expenditure and shelf space is present at α = .05.


d.

Advertising expenditure and shelf space are said to interact if the effect of advertising expenditure on sales is different at different levels of shelf space.

e.

If a first-order model was used, the effect of advertising expenditure on sales would
be the same regardless of the amount of shelf space. If interaction really exists, the
effect of advertising expenditure on sales would depend on which level of shelf space
was present.


11.134 a.

There is a curvilinear trend.


b.

From MINITAB, the output is:


The regression equation is y = 42.2 - 0.0114 x + 0.000001 xsq

Predictor     Coef          StDev          T       P
Constant     42.247         5.712         7.40   0.000
x            -0.011404      0.005053     -2.26   0.037
xsq           0.00000061    0.00000037    1.66   0.115

S = 21.81   R-Sq = 34.9%   R-Sq(adj) = 27.2%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       2    4325.4   2162.7   4.55  0.026
Residual Error  17    8085.5    475.6
Total           19   12410.9

Source  DF  Seq SS
x        1  3013.3
xsq      1  1312.1

Unusual Observations
Obs     x        y       Fit    StDev Fit  Residual  St Resid
16     9150    4.60    -11.21     16.24      15.81     1.09 X
17    15022    2.20      8.09     21.40      -5.89    -1.41 X

X denotes an observation whose X value gives it large influence.

The fitted model is ŷ = 42.2 - .0114x + .00000061x²


c.

To determine if a curvilinear relationship exists, we test:


H0: β2 = 0
Ha: β2 ≠ 0

From MINITAB, the test statistic is t = 1.66 with p-value = .115. Since the p-value is greater than α = .05, do not reject H0. There is insufficient evidence to indicate that a curvilinear relationship exists between dissolved phosphorus percentage and soil loss at α = .05.
11.136 a.

The first order model for this problem is:


E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4

b.

Using MINITAB, the printout is:


Regression Analysis

The regression equation is
y = 28.9 - 0.000000 x1 + 0.844 x2 - 0.360 x3 - 0.300 x4

Predictor      Coef           StDev          T       P
Constant      28.87          12.67          2.28   0.034
x1            -0.00000011     0.00000028   -0.38   0.708
x2             0.8440         0.2326        3.63   0.002
x3            -0.3600         0.1316       -2.74   0.013
x4            -0.3003         0.1834       -1.64   0.117

S = 5.989   R-Sq = 51.2%   R-Sq(adj) = 41.5%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       4   753.76  188.44   5.25  0.005
Residual Error  20   717.40   35.87
Total           24  1471.17

Source  DF  Seq SS
x1       1  129.96
x2       1  355.43
x3       1  172.19
x4       1   96.17

Unusual Observations
Obs      x1         y      Fit   StDev Fit  Residual  St Resid
 4   11940345    32.60   17.25     3.40       15.35     3.11R
12    4905123    27.00   16.17     4.36       10.83     2.63R

R denotes an observation with a large standardized residual

The least squares prediction line is ŷ = 28.9 - .00000011x1 + .844x2 - .360x3 - .300x4.

To determine if the model is useful for predicting percentage of problem mortgages, we test:

H0: β1 = β2 = β3 = β4 = 0
Ha: At least one of the coefficients is nonzero

The test statistic is F = MS(Model)/MSE = 5.25

The p-value is p = .005. Since the p-value is less than α = .05 (p = .005 < .05), H0 is rejected. There is sufficient evidence to indicate the model is useful in predicting percentage of problem mortgages at α = .05.
c.

β̂0 = 28.9. This is merely the y-intercept. It has no other meaning in this problem.

β̂1 = -0.00000011. For each unit increase in total mortgage loans, the mean percentage of problem mortgages is estimated to decrease by 0.00000011, holding percentage of invested assets, percentage of commercial mortgages, and percentage of residential mortgages constant.

β̂2 = 0.844. For each unit increase in percentage of invested assets, the mean percentage of problem mortgages is estimated to increase by 0.844, holding total mortgage loans, percentage of commercial mortgages, and percentage of residential mortgages constant.

β̂3 = -0.360. For each unit increase in percentage of commercial mortgages, the mean percentage of problem mortgages is estimated to decrease by 0.360, holding total mortgage loans, percentage of invested assets, and percentage of residential mortgages constant.

β̂4 = -0.300. For each unit increase in percentage of residential mortgages, the mean percentage of problem mortgages is estimated to decrease by 0.300, holding total mortgage loans, percentage of invested assets, and percentage of commercial mortgages constant.


d.

Using MINITAB, the scattergrams are:

From the scattergrams, it appears that possibly x2 and x4 might warrant inclusion in the model as second-order terms.


e.

Using MINITAB, the printout is:


Regression Analysis

The regression equation is
y = 56.2 - 0.000000 x1 - 1.82 x2 - 0.449 x3 + 0.223 x4 + 0.0771 x2sq - 0.0189 x4sq

Predictor      Coef           StDev          T       P
Constant      56.17          13.81          4.07   0.001
x1            -0.00000008     0.00000025   -0.31   0.760
x2            -1.8177         0.9935       -1.83   0.084
x3            -0.4494         0.1127       -3.99   0.001
x4             0.2227         0.6079        0.37   0.718
x2sq           0.07707        0.02665       2.89   0.010
x4sq          -0.01887        0.02334      -0.81   0.429

S = 4.956   R-Sq = 69.9%   R-Sq(adj) = 59.9%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       6   1029.03  171.51   6.98  0.001
Residual Error  18    442.13   24.56
Total           24   1471.17

Source  DF  Seq SS
x1       1  129.96
x2       1  355.43
x3       1  172.19
x4       1   96.17
x2sq     1  259.22
x4sq     1   16.05

Unusual Observations
Obs      x1          y       Fit     StDev Fit  Residual  St Resid
 4   11940345    32.600   26.777      4.038       5.823     2.03R
10    5328142     7.500   16.105      2.599      -8.605    -2.04R
12    4905123    27.000   16.559      3.607      10.441     3.07R
20    2978628     3.200   11.759      2.679      -8.559    -2.05R

R denotes an observation with a large standardized residual

The least squares prediction equation is

ŷ = 56.2 - .00000008x1 - 1.82x2 - .449x3 + .223x4 + .0771x2² - .0189x4²

To determine if the model is useful for predicting percentage of problem mortgages, we test:

H0: β1 = β2 = β3 = β4 = β5 = β6 = 0
Ha: At least one of the coefficients is nonzero

The test statistic is F = MS(Model)/MSE = 6.98

The p-value is p = .001. Since the p-value is less than α = .05 (p = .001 < .05), H0 is rejected. There is sufficient evidence to indicate the model is useful in predicting percentage of problem mortgages at α = .05.
f.

To determine if one or more of the second-order terms of our model contribute information for the prediction of the percentage of problem mortgages, we test:

H0: β5 = β6 = 0
Ha: At least one of the coefficients is nonzero

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]} = [(717.40 - 442.13)/(6 - 4)] / {442.13/[25 - (6 + 1)]} = 5.60

The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = (k - g) = (6 - 4) = 2 and ν2 = n - (k + 1) = 25 - (6 + 1) = 18. From Table IX, Appendix B, F.05 = 3.55. The rejection region is F > 3.55.

Since the observed value of the test statistic falls in the rejection region (F = 5.60 > 3.55), H0 is rejected. There is sufficient evidence to indicate one or more of the second-order terms of our model contribute information for the prediction of the percentage of problem mortgages at α = .05.
11.138 a.

Using SAS, the output for fitting the model is:


DEP VARIABLE: Y

ANALYSIS OF VARIANCE
SOURCE     DF    SUM OF SQUARES    MEAN SQUARE    F VALUE    PROB>F
MODEL       3      2396.36410       798.78803      99.394    0.0001
ERROR      16       128.58590         8.03662
C TOTAL    19      2524.95000

ROOT MSE   2.83489     R-SQUARE   0.9491
DEP MEAN   23.05000    ADJ R-SQ   0.9395
C.V.       12.29889

PARAMETER ESTIMATES
VARIABLE    DF    PARAMETER ESTIMATE    STANDARD ERROR    T FOR H0: PARAMETER=0    PROB > |T|
INTERCEP     1       -11.768830           3.05032146            -3.858               0.0014
X1           1        10.293782           1.43788129             7.159               0.0001
X1SQ         1        -0.417991           0.16132974            -2.591               0.0197
X2           1        13.244076           1.50325080             8.810               0.0001

The fitted model is: ŷ = -11.8 + 10.3x1 - .418x1² + 13.2x2


b.

To determine if the second-order term is necessary, we test:

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is t = -2.591.

The p-value is p = .0197. Since the p-value is less than α (p = .0197 < .05), H0 is rejected. There is sufficient evidence to conclude that the second-order term in the model proposed by the operations manager is necessary at α = .05.
c.

The reduced model E(y) = β0 + β3x2 was fit to the data. The SAS output is:

DEP VARIABLE: Y

ANALYSIS OF VARIANCE
SOURCE     DF    SUM OF SQUARES    MEAN SQUARE    F VALUE    PROB>F
MODEL       1       1.25000000      1.25000000      0.009    0.9258
ERROR      18    2523.70000       140.20556
C TOTAL    19    2524.95000

ROOT MSE   11.84084    R-SQUARE    0.0005
DEP MEAN   23.05       ADJ R-SQ   -0.0550
C.V.       51.37025

PARAMETER ESTIMATES
VARIABLE    DF    PARAMETER ESTIMATE    STANDARD ERROR    T FOR H0: PARAMETER=0    PROB > |T|
INTERCEP     1       23.30000000          3.74440323             6.223               0.0001
X2           1       -0.50000000          5.29538583            -0.094               0.9258

The fitted model is ŷ = 23.3 - .5x2.

The hypotheses are:
H0: β1 = β2 = 0
Ha: At least one βi ≠ 0, i = 1, 2

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]}
  = [(2523.7 - 128.586)/(3 - 1)] / {128.586/[20 - (3 + 1)]}
  = 1197.557/8.036625 = 149.01

The rejection region requires α = .10 in the upper tail of the F distribution with numerator df = k - g = 3 - 1 = 2 and denominator df = n - (k + 1) = 20 - (3 + 1) = 16. From Table VIII, Appendix B, F.10 = 2.67. The rejection region is F > 2.67.

Since the observed value of the test statistic falls in the rejection region (F = 149.01 > 2.67), H0 is rejected. There is sufficient evidence to indicate the age of the machine contributes information to the model at α = .10.

After adjusting for machine type, there is evidence that down time is related to age.
11.140 a.

For a sunny weekday, x1 = 0 and x2 = 1:

x3 = 70:  ŷ = 250 - 700(0) + 100(1) + 5(70) + 15(0)(70) = 700
x3 = 80:  ŷ = 250 - 700(0) + 100(1) + 5(80) + 15(0)(80) = 750
x3 = 90:  ŷ = 800
x3 = 100: ŷ = 850

For a sunny weekend, x1 = 1 and x2 = 1:

x3 = 70:  ŷ = 250 - 700(1) + 100(1) + 5(70) + 15(1)(70) = 1050
x3 = 80:  ŷ = 250 - 700(1) + 100(1) + 5(80) + 15(1)(80) = 1250
x3 = 90:  ŷ = 1450
x3 = 100: ŷ = 1650

For both sunny weekdays and sunny weekend days, as the predicted high temperature increases, so does the predicted day's attendance. However, the predicted day's attendance on sunny weekend days increases at a faster rate than on sunny weekdays. Also, the predicted day's attendance is higher on sunny weekend days than on sunny weekdays.
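A sketch of these calculations, using the fitted interaction model from the exercise:

    def attendance(x1, x2, x3):
        # Fitted model: y-hat = 250 - 700*x1 + 100*x2 + 5*x3 + 15*x1*x3
        return 250 - 700 * x1 + 100 * x2 + 5 * x3 + 15 * x1 * x3

    for temp in (70, 80, 90, 100):
        weekday = attendance(x1=0, x2=1, x3=temp)   # sunny weekday
        weekend = attendance(x1=1, x2=1, x3=temp)   # sunny weekend day
        print(temp, weekday, weekend)               # e.g. 70 -> 700 and 1050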
b.

To determine if the interaction term is a useful addition to the model, we test:

H0: β4 = 0
Ha: β4 ≠ 0

The test statistic is t = β̂4/s_β̂4 = 15/3 = 5

The rejection region requires α/2 = .05/2 = .025 in each tail of the t distribution with df = n - (k + 1) = 30 - (4 + 1) = 25. From Table VI, Appendix B, t.025 = 2.06. The rejection region is t < -2.06 or t > 2.06.

Since the observed value of the test statistic falls in the rejection region (t = 5 > 2.06), H0 is rejected. There is sufficient evidence to indicate the interaction term is a useful addition to the model at α = .05.
c.

For x1 = 0, x2 = 1, and x3 = 95,

ŷ = 250 - 700(0) + 100(1) + 5(95) + 15(0)(95) = 825

d.

The width of the interval in Exercise 11.139e is 1245 - 645 = 600, while the width is 850 - 800 = 50 for the model containing the interaction term. The smaller the width of the interval, the smaller the variance. This implies that the interaction term is quite useful in predicting daily attendance. It has reduced the unexplained error.


e.	Because an interaction term including x1 is in the model, the coefficient corresponding to x1 must be interpreted with caution. For all observed values of x3 (temperature), the interaction term value is greater than 700.

11.142	a.
From MINITAB, the output is:
Regression Analysis: y versus x1, x2, x1sq, x2sq, x1x2

The regression equation is
y = -9.92 + 0.167 x1 + 0.138 x2 - 0.00111 x1sq - 0.000843 x2sq + 0.000241 x1x2

Predictor     Coef          SE Coef        T       P
Constant     -9.917        1.354        -7.32   0.000
x1            0.16681      0.02124       7.85   0.000
x2            0.13760      0.02673       5.15   0.000
x1sq         -0.0011082    0.0001173    -9.45   0.000
x2sq         -0.0008433    0.0001594    -5.29   0.000
x1x2          0.0002411    0.0001440     1.67   0.103

S = 0.1871   R-Sq = 93.7%   R-Sq(adj) = 92.7%

Analysis of Variance
Source          DF       SS       MS       F      P
Regression       5   17.5827   3.5165   100.41  0.000
Residual Error  34    1.1908   0.0350
Total           39   18.7735

Source  DF  Seq SS
x1       1  5.2549
x2       1  7.5311
x1sq     1  3.6434
x2sq     1  1.0552
x1x2     1  0.0982

The least squares prediction equation is:

ŷ = -9.917 + .167x1 + .138x2 - .00111x1² - .000843x2² + .000241x1x2

b.

The standard deviation for the first-order model is s = .4023. The standard deviation
for the second-order model is s = .1871.
The relative precision for the first-order model is 2(.4023) = .8046. The relative
precision for the second-order model is 2(.1871) = .3742.

c.

To determine if the model is useful, we test:

H0: β1 = β2 = β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 1, 2, ..., 5

The test statistic is F = MSR/MSE = 3.5165/.0350 = 100.41

The p-value is .0000. Since the p-value is less than α = .05, H0 is rejected. There is sufficient evidence to indicate the model is useful for predicting GPA at α = .05.


d.

To determine if the interaction term is important, we test:

H0: β5 = 0
Ha: β5 ≠ 0

The test statistic is t = 1.67.

The p-value is .103. Since the p-value is not less than α = .10, H0 is not rejected. There is insufficient evidence to indicate the interaction term is important for predicting GPA at α = .10.
e.

From MINITAB, the plots are:

[Residual plots (response is y): residuals versus x1 and residuals versus x2 for the second-order model.]

The residual plots of the residuals against x1 and against x2 for the second-order model indicate there is no mound or bowl shape in either graph. This implies that second-order is the highest order necessary. We have eliminated the mound shape from the plots of the residuals against x1 and the residuals against x2 for the first-order model. From the plots and the results of the tests in 11.145, it appears the second-order model is preferable for predicting GPA.
f.

To see if the second-order terms are useful, we test:

H0: β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0, i = 3, 4, 5

The test statistic is
F = [(SSE_R - SSE_C)/(k - g)] / {SSE_C/[n - (k + 1)]} = [(5.9876 - 1.1908)/3] / .0350 = 45.68

Since no α is given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k - g = 5 - 2 = 3 and ν2 = n - (k + 1) = 40 - (5 + 1) = 34. From Table IX, Appendix B, F.05 ≈ 2.92. The rejection region is F > 2.92.

Since the observed value of the test statistic falls in the rejection region (F = 45.68 > 2.92), H0 is rejected. There is sufficient evidence that at least one second-order term is useful at α = .05.


11.144 a.

The model is E(y) = β0 + β1x1


A sketch of the response curve might be:

b.

The model is E(y) = β0 + β1x1 + β2x2 + β3x3

where x2 = 1 if brand 2, 0 otherwise
      x3 = 1 if brand 3, 0 otherwise

A sketch of the response curve might be:

c.

The model is E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3


A sketch of the response curve might be:


The Condo Sales Case


(To accompany Chapters 10–11)

Several models were fit to obtain the final model. I first fit a model with only the main effects for
Floor, Distance, View, Endunit, and Furnish. Of these, only Furnish, adjusted for the other variables,
was not significant. See the output below.
The regression equation is
Price = 184 - 3.81 Floor + 1.74 Distance + 40.3 View - 32.7 Endunit + 4.28 Furnish

Predictor    Coef       Stdev      t-ratio      p
Constant    183.570     5.221       35.16     0.000
Floor        -3.8076    0.7482      -5.09     0.000
Distance      1.7414    0.3750       4.64     0.000
View         40.325     3.456       11.67     0.000
Endunit     -32.716     9.581       -3.41     0.001
Furnish       4.279     3.602        1.19     0.236

s = 24.39   R-sq = 49.4%   R-sq(adj) = 48.2%

Analysis of Variance
SOURCE       DF      SS      MS      F       p
Regression    5   118091   23618   39.69   0.000
Error       203   120802     595
Total       208   238893

SOURCE    DF  SEQ SS
Floor      1   14149
Distance   1   21208
View       1   75065
Endunit    1    6829
Furnish    1     840

I then added Floor² and Distance² to the model with all main effects. For this model, all of the main effects, including Furnish, were significant along with both squared terms. The output follows.
The regression equation is
Price = 220 - 13.3 Floor - 7.01 Distance + 38.9 View - 22.0 Endunit + 7.31 Furnish + 1.05 FlSq + 0.572 DiSq

Predictor    Coef       Stdev      t-ratio      p
Constant    220.258     8.178       26.93     0.000
Floor       -13.296     3.253       -4.09     0.000
Distance     -7.007     1.614       -4.34     0.000
View         38.927     3.202       12.16     0.000
Endunit     -21.967     9.086       -2.42     0.017
Furnish       7.308     3.419        2.14     0.034
FlSq          1.0512    0.3492       3.01     0.003
DiSq          0.5719    0.1033       5.54     0.000

s = 22.49   R-sq = 57.4%   R-sq(adj) = 56.0%

Analysis of Variance
SOURCE       DF      SS      MS      F       p
Regression    7   137234   19605   38.76   0.000
Error       201   101659     506
Total       208   238893

SOURCE    DF  SEQ SS
Floor      1   14149
Distance   1   21208
View       1   75065
Endunit    1    6829
Furnish    1     840
FlSq       1    3640
DiSq       1   15503

I then did a stepwise regression, forcing all the main effects and the two squared terms into the model,
to see if any two-way interaction terms could be added to the model. From this, only the interaction
between Floor and View was significant. The output from the final model is:
The regression equation is
Price = 206 - 9.93 Floor - 7.02 Distance + 66.0 View - 22.5 Endunit + 6.48 Furnish + 1.02 FlSq + 0.577 DiSq - 6.04 FV

Predictor    Coef        Stdev       t-ratio      p
Constant    206.123      8.379        24.60     0.000
Floor        -9.927      3.186        -3.12     0.002
Distance     -7.020      1.539        -4.56     0.000
View         65.952      6.619         9.96     0.000
Endunit     -22.451      8.662        -2.59     0.010
Furnish       6.485      3.265         1.99     0.048
FlSq          1.0207     0.3330        3.07     0.002
DiSq          0.57720    0.09848       5.86     0.000
FV           -6.037      1.312        -4.60     0.000

s = 21.44   R-sq = 61.5%   R-sq(adj) = 60.0%

Analysis of Variance
SOURCE       DF      SS      MS      F       p
Regression    8   146965   18371   39.97   0.000
Error       200    91928     460
Total       208   238893

SOURCE    DF  SEQ SS
Floor      1   14149
Distance   1   21208
View       1   75065
Endunit    1    6829
Furnish    1     840
FlSq       1    3640
DiSq       1   15503
FV         1    9731


This final model is fairly good. The R-squared value is .615. Thus, 61.5% of the variation in prices can be explained by the model that includes the following variables: Floor and Floor-squared, Distance and Distance-squared, View, Endunit, Furnish, and the interaction of Floor and View. The residual plots are as follows:

From the residual plots, it appears that the data are normally distributed, but there may be a couple of outliers. This is evident by the two points whose standardized residuals are less than -3. Also, it appears that there is constant variance. Thus, the model looks to be fairly good. It would be better if the R-squared value was higher, however.

The final model is:

Price = 206 - 9.93 Floor - 7.02 Distance + 66.0 View - 22.5 Endunit + 6.48 Furnish + 1.02 FlSq + 0.577 DiSq - 6.04 FV
I have included graphs to indicate how each variable affects the price. These graphs reflect the
relationship between Price and a selected variable, holding the other variables constant.
The first graph is a graph of Price by Floor for each level of View, since Floor and View interact. Both
lines are curved to reflect the quadratic relationship between Floor and Price. For the Non-ocean view,
the price is fairly constant. There is a slight decrease in price as the Floor increases until Floor 5, and
then a slight increase as the floor increases. For the Ocean view, the price decreases at a decreasing rate
as the Floor increases.
The second graph is a graph of the Price by Distance. Again, the quadratic relationship is reflected by
the curved line. As the distance increases, the price decreases until a distance of 6 is reached. Then the
price begins to increase again as the distance increases.


The third graph is a graph of the Price by View, for each Floor. Again, we must look at the relationship
between Price and View at each Floor because of the significant interaction. For all Floors, the price of
the Ocean View is higher than the price of the Non-ocean View. However, the difference in the two
views depends on the floor.
The fourth graph is a graph of the Price by Endunit. From the graph, the price of the end units is less
than the price of the other units.
The last graph is a graph of the Price by Furnish. From the graph, the price of the furnished units is
higher than the price of the non-furnished units.


Methods for Quality Improvement

Chapter 12

12.2

If rational subgrouping is not used, it is possible that a change in the process mean will go
undetected. In rational subgrouping, samples are selected so that a change in the process mean
occurs between samples, not within samples.

12.4

An x̄-chart is used to monitor the process mean.

12.6

The variation of a process must be stable. If it were not, the control limits of the x̄-chart would
be meaningless since they are a function of the process variation.

12.8

a.

According to Rule 4 (fourteen points in a row alternating up and down), the process is out of
control. Therefore, it is affected by both common and special causes of variation. An in-control
process is affected by only common causes. Rule 4 says that if we observe 14
points in a row alternating up and down, that is an indication of the presence of special
causes of variation in addition to common causes. Points 2 through 16 alternate up and
down.

b.

The extended x̄-chart is:

[x̄-chart of the sample means for samples 1 through 30, with centerline x̿, zones A, B, and C, and the UCL and LCL]

The additional points suggest that the process is out of control. Rule 1 (One point
beyond Zone A), Rule 5 (2 out of 3 points in a row in Zone A or beyond), and Rule 6
(4 out of 5 points in a row in Zone B or beyond) indicate the process is out of control.
12.10

a.

x̿ = (x̄1 + x̄2 + ... + x̄25)/k = 2008.8/25 = 80.352

R̄ = (R1 + R2 + ... + R25)/k = 198.7/25 = 7.948

b.

Centerline = x̿ = 80.352

From Table XII, Appendix B, with n = 5, A2 = .577.

Upper control limit = x̿ + A2R̄ = 80.352 + .577(7.948) = 84.938
Lower control limit = x̿ - A2R̄ = 80.352 - .577(7.948) = 75.766

c. and d.

Upper A-B boundary = x̿ + (2/3)(A2R̄) = 80.352 + (2/3)(.577)(7.948) = 83.409
Lower A-B boundary = x̿ - (2/3)(A2R̄) = 80.352 - (2/3)(.577)(7.948) = 77.295
Upper B-C boundary = x̿ + (1/3)(A2R̄) = 80.352 + (1/3)(.577)(7.948) = 81.881
Lower B-C boundary = x̿ - (1/3)(A2R̄) = 80.352 - (1/3)(.577)(7.948) = 78.823

The x̄-chart is:

Rule 1: One point beyond Zone A: Point 10 is beyond Zone A. This indicates the process is out of control.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are no groups of three consecutive points that have two or more in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

Rule 1 indicates the process is out of control.
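A minimal sketch (not part of the solution) of the x̄-chart limit calculations above, with A2 = .577 for subgroups of size n = 5 from Table XII:

```python
# x-bar chart centerline, control limits, and zone boundaries from summary totals.
def xbar_chart_limits(xbar_sum, r_sum, k, A2):
    x_dbar = xbar_sum / k          # grand mean (centerline)
    r_bar = r_sum / k              # mean range
    width = A2 * r_bar             # distance from centerline to a control limit
    return {
        "centerline": x_dbar,
        "UCL": x_dbar + width,
        "LCL": x_dbar - width,
        "upper_AB": x_dbar + 2 * width / 3,
        "lower_AB": x_dbar - 2 * width / 3,
        "upper_BC": x_dbar + width / 3,
        "lower_BC": x_dbar - width / 3,
    }

# Reproduces the numbers above: centerline 80.352, UCL 84.938, LCL 75.766, etc.
print(xbar_chart_limits(xbar_sum=2008.8, r_sum=198.7, k=25, A2=0.577))
```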


12.12

a.

From Table XII, Appendix B, with n = 4, A2 = .729.

x̿ = .6733 and R̄ = .335

Upper control limit = x̿ + A2R̄ = .6733 + .729(.335) = .9175
Lower control limit = x̿ - A2R̄ = .6733 - .729(.335) = .4291

b.

Upper A-B boundary = x̿ + (2/3)(A2R̄) = .6733 + (2/3)(.729)(.335) = .8361
Lower A-B boundary = x̿ - (2/3)(A2R̄) = .6733 - (2/3)(.729)(.335) = .5105
Upper B-C boundary = x̿ + (1/3)(A2R̄) = .6733 + (1/3)(.729)(.335) = .7547
Lower B-C boundary = x̿ - (1/3)(A2R̄) = .6733 - (1/3)(.729)(.335) = .5919

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: There are nine points (Points 9 through 17) in a row in Zone C (on one side of the centerline) or beyond. This indicates that the process is out of control.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are no groups of three consecutive points that have two or more in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

Rule 2 indicates the process is out of control.


c.


These control limits should not be used to monitor future output because the process is
out of control. One or more special causes of variation are affecting the process mean.
These should be identified and eliminated in order to bring the process into control.


12.14

a.

The process of interest is the production of bolts used in military aircraft.

b.

Using MINITAB, the descriptive statistics are:

Descriptive Statistics: Length by Hour


Variable: Length

Hour   N    Mean  Median  TrMean  StDev  SE Mean  Minimum  Maximum      Q1      Q3
   1   4  36.973  36.965  36.973  0.098    0.049   36.880   37.080  36.885  37.067
   2   4  36.957  36.970  36.957  0.079    0.040   36.850   37.040  36.878  37.025
   3   4  37.067  37.060  37.067  0.081    0.040   36.990   37.160  36.995  37.147
   4   4  37.065  37.040  37.065  0.096    0.048   36.980   37.200  36.990  37.165
   5   4  36.948  36.940  36.948  0.121    0.061   36.810   37.100  36.835  37.068
   6   4  36.998  36.985  36.998  0.101    0.051   36.890   37.130  36.908  37.100
   7   4  37.000  36.995  37.000  0.054    0.027   36.940   37.070  36.953  37.053
   8   4  37.005  36.995  37.005  0.087    0.044   36.910   37.120  36.927  37.093
   9   4  37.027  37.020  37.027  0.111    0.055   36.900   37.170  36.927  37.135
  10   4  36.970  36.950  36.970  0.106    0.053   36.870   37.110  36.880  37.080
  11   4  37.020  37.050  37.020  0.098    0.049   36.880   37.100  36.918  37.093
  12   4  36.983  36.985  36.983  0.066    0.033   36.900   37.060  36.920  37.043
  13   4  37.070  37.075  37.070  0.132    0.066   36.910   37.220  36.940  37.195
  14   4  37.073  37.075  37.073  0.025    0.013   37.040   37.100  37.048  37.095
  15   4  36.993  37.020  36.993  0.069    0.035   36.890   37.040  36.920  37.038
  16   4  36.955  36.965  36.955  0.040    0.020   36.900   36.990  36.913  36.988
  17   4  37.038  37.035  37.038  0.097    0.049   36.940   37.140  36.948  37.130
  18   4  37.010  37.010  37.010  0.085    0.043   36.910   37.110  36.927  37.093
  19   4  36.955  36.965  36.955  0.058    0.029   36.880   37.010  36.895  37.005
  20   4  37.035  37.045  37.035  0.109    0.055   36.900   37.150  36.925  37.135
  21   4  36.995  36.985  36.995  0.044    0.022   36.960   37.050  36.960  37.040
  22   4  37.023  37.020  37.023  0.096    0.048   36.930   37.120  36.935  37.113
  23   4  37.003  37.010  37.003  0.039    0.019   36.950   37.040  36.963  37.035
  24   4  36.995  37.005  36.995  0.071    0.036   36.900   37.070  36.923  37.058
  25   4  37.010  37.020  37.010  0.083    0.041   36.900   37.100  36.927  37.083

For each sample, we compute R = range = largest measurement - smallest measurement.


The results are listed in the table:


Sample No.    R    Sample No.    R
    1        .20       14       .06
    2        .19       15       .15
    3        .17       16       .09
    4        .22       17       .20
    5        .29       18       .20
    6        .24       19       .13
    7        .13       20       .25
    8        .21       21       .09
    9        .27       22       .19
   10        .24       23       .09
   11        .22       24       .17
   12        .16       25       .20
   13        .31

x̿ = (x̄1 + x̄2 + ... + x̄25)/k = 925.1650/25 = 37.0066

R̄ = (R1 + R2 + ... + R25)/k = 4.67/25 = .1868

Centerline = x̿ = 37.007

From Table XII, Appendix B, with n = 4, A2 = .729.

Upper control limit = x̿ + A2R̄ = 37.007 + .729(.1868) = 37.143
Lower control limit = x̿ - A2R̄ = 37.007 - .729(.1868) = 36.871

Upper A-B boundary = x̿ + (2/3)(A2R̄) = 37.007 + (2/3)(.729)(.1868) = 37.098
Lower A-B boundary = x̿ - (2/3)(A2R̄) = 37.007 - (2/3)(.729)(.1868) = 36.916
Upper B-C boundary = x̿ + (1/3)(A2R̄) = 37.007 + (1/3)(.729)(.1868) = 37.052
Lower B-C boundary = x̿ - (1/3)(A2R̄) = 37.007 - (1/3)(.729)(.1868) = 36.962

The x̄-chart is:

c.

To determine if the process is in or out of control, we check the six rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are no groups of three consecutive points that have two or more in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

The process appears to be in control. No special causes of variation appear to be present.
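A minimal sketch (not from the text) of two of the pattern-detection rules applied to the hourly means above, using the control limits UCL = 37.143 and LCL = 36.871 computed in part b:

```python
# Rule 1 (a point beyond Zone A) and Rule 3 (six points in a row steadily increasing or decreasing).
def rule1_violations(points, ucl, lcl):
    """Indices (1-based) of points beyond the control limits."""
    return [i for i, x in enumerate(points, start=1) if x > ucl or x < lcl]

def rule3_violation(points, run_length=6):
    """True if some run of run_length points is steadily increasing or decreasing."""
    for start in range(len(points) - run_length + 1):
        window = points[start:start + run_length]
        diffs = [b - a for a, b in zip(window, window[1:])]
        if all(d > 0 for d in diffs) or all(d < 0 for d in diffs):
            return True
    return False

means = [36.973, 36.957, 37.067, 37.065, 36.948, 36.998, 37.000, 37.005, 37.027, 36.970,
         37.020, 36.983, 37.070, 37.073, 36.993, 36.955, 37.038, 37.010, 36.955, 37.035,
         36.995, 37.023, 37.003, 36.995, 37.010]

print(rule1_violations(means, ucl=37.143, lcl=36.871))   # [] -- no points beyond Zone A
print(rule3_violation(means))                            # False -- no run of six
```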

d.

An example of a special cause of variation would be if the machine used to produce the
bolts slipped out of alignment and started producing bolts of a different length. An
example of common cause variation would be the grade of the raw material used to make
the bolts.

e.

Since the process appears to be in control, it is appropriate to use these limits to monitor
future process output.

12.16

a.

x̿ = (x̄1 + x̄2 + ... + x̄16)/k = 868.18/16 = 54.26125

R̄ = (R1 + R2 + ... + R16)/k = 44.1/16 = 2.75625

Centerline = x̿ = 54.26125

From Table XII, Appendix B, with n = 5, A2 = .577.

Upper control limit = x̿ + A2R̄ = 54.26125 + .577(2.75625) = 55.8516
Lower control limit = x̿ - A2R̄ = 54.26125 - .577(2.75625) = 52.6709

Upper A-B boundary = x̿ + (2/3)(A2R̄) = 54.26125 + (2/3)(.577)(2.75625) = 55.3215
Lower A-B boundary = x̿ - (2/3)(A2R̄) = 54.26125 - (2/3)(.577)(2.75625) = 53.2010
Upper B-C boundary = x̿ + (1/3)(A2R̄) = 54.26125 + (1/3)(.577)(2.75625) = 54.7914
Lower B-C boundary = x̿ - (1/3)(A2R̄) = 54.26125 - (1/3)(.577)(2.75625) = 53.7311

The x̄-chart is:

b.

To determine if the process is in or out of control, we check the six rules:

Rule 1: One point beyond Zone A: One point is beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are two sets of three consecutive points (data points 3, 4, and 5 and data points 4, 5, and 6) that have two points in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

Special causes of variation appear to be present. The process appears to be out of control.
Rules 1 and 5 indicate the process is out of control.
c.


Since the process is out of control, these control limits should not be used to monitor
future process outputs.


12.18

The R-chart is designed to monitor the variation of the process.

12.20

Using Table XII, Appendix B:

a.  With n = 4, D3 = 0.000 and D4 = 2.282.

b.  With n = 12, D3 = 0.283 and D4 = 1.717.

c.  With n = 24, D3 = 0.451 and D4 = 1.548.

12.22

a.

From Exercise 12.11, the R values are:


Sample No.    R    Sample No.    R
    1        1.8       11       3.2
    2        2.8       12       0.9
    3        3.8       13       2.6
    4        2.5       14       4.0
    5        3.7       15       2.2
    6        5.0       16       4.3
    7        5.5       17       3.6
    8        3.5       18       2.5
    9        2.5       19       2.2
   10        4.1       20       5.5

R̄ = (R1 + R2 + ... + R20)/k = 66.2/20 = 3.31

Centerline = R̄ = 3.31

From Table XII, Appendix B, with n = 4, D4 = 2.282, and D3 = 0.

Upper control limit = R̄D4 = 3.31(2.282) = 7.553

Since D3 = 0, the lower control limit is negative and is not included on the chart.

b.

From Table XII, Appendix B, with n = 4, d2 = 2.059, and d3 = .880.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = 3.31 + 2(.880)(3.31/2.059) = 6.139
Lower A-B boundary = R̄ - 2d3(R̄/d2) = 3.31 - 2(.880)(3.31/2.059) = 0.481
Upper B-C boundary = R̄ + d3(R̄/d2) = 3.31 + (.880)(3.31/2.059) = 4.725
Lower B-C boundary = R̄ - d3(R̄/d2) = 3.31 - (.880)(3.31/2.059) = 1.895

c.

The R-chart is:

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control.
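A minimal sketch (not from the text) of the R-chart calculations in Exercise 12.22, using the constants d2 = 2.059, d3 = .880, D3 = 0, and D4 = 2.282 for subgroups of size n = 4:

```python
def r_chart_limits(r_bar, d2, d3, D3, D4):
    """Centerline, control limits, and zone boundaries of an R-chart."""
    sigma_r = d3 * r_bar / d2          # estimated standard deviation of R
    return {
        "centerline": r_bar,
        "UCL": D4 * r_bar,
        "LCL": max(D3 * r_bar, 0.0),   # reported as "not included" when it would be negative
        "upper_AB": r_bar + 2 * sigma_r,
        "lower_AB": r_bar - 2 * sigma_r,
        "upper_BC": r_bar + sigma_r,
        "lower_BC": r_bar - sigma_r,
    }

# Reproduces the values above: UCL = 7.553, A-B boundaries 6.139 / 0.481, B-C boundaries 4.725 / 1.895.
print(r_chart_limits(r_bar=3.31, d2=2.059, d3=0.880, D3=0.0, D4=2.282))
```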


12.24

a.

From Table XII, Appendix B, with n = 4, D3 = 0, and D4 = 2.282.


R̄ = .335

Upper control limit = R̄D4 = .335(2.282) = .7645

Since D3 = 0, the lower control limit is negative and is not included on the chart.

b.

To determine if special causes of variation are present, we need to complete the R-chart.
From Table XII, Appendix B, with n = 4, d2 = 2.059, and d3 = .880.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = .335 + 2(.880)(.335/2.059) = .6213
Lower A-B boundary = R̄ - 2d3(R̄/d2) = .335 - 2(.880)(.335/2.059) = .0486
Upper B-C boundary = R̄ + d3(R̄/d2) = .335 + (.880)(.335/2.059) = .4782
Lower B-C boundary = R̄ - d3(R̄/d2) = .335 - (.880)(.335/2.059) = .1918

The R-chart is:

[R-chart with centerline R̄ = .335, UCL = .7645, and zone boundaries .6213, .4782, .1918, and .0486]

To determine if the process is in control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: There are not nine points in a row in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

It appears that the process is in control.


c.

Yes. This process appears to be in control. Therefore, these control limits could be used
to monitor future output.

d.

Of the 30 R values plotted, there are only 6 different values. Most of the R values take on
one of three values. This indicates that the data must be discrete (take on a countable
number of values), or that the path widths are multiples of each other.


12.26

a.

R̄ = (R1 + R2 + ... + R20)/k = (4 + 6 + ... + 15)/20 = 176/20 = 8.8

Centerline = R̄ = 8.8

From Table XII, Appendix B, with n = 5, D4 = 2.114 and D3 = 0.

Upper control limit = R̄D4 = 8.8(2.114) = 18.603

Since D3 = 0, the lower control limit is negative and is not included on the chart.

From Table XII, Appendix B, with n = 5, d2 = 2.326 and d3 = .864.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = 8.8 + 2(.864)(8.8/2.326) = 15.338
Lower A-B boundary = R̄ - 2d3(R̄/d2) = 8.8 - 2(.864)(8.8/2.326) = 2.262
Upper B-C boundary = R̄ + d3(R̄/d2) = 8.8 + (.864)(8.8/2.326) = 12.069
Lower B-C boundary = R̄ - d3(R̄/d2) = 8.8 - (.864)(8.8/2.326) = 5.531

The R-chart is:

b.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control since none of the out-of-control signals are observed.
No special causes of variation appear to be present.

c.

Since the process appears to be in control, the control limits of the R-chart could be used
to monitor future replacement cycle times.

d.

From part b, we decided that the process was in control. However, there does appear to
be a pattern emerging in the R-chart. As the sample number increases, the value of R is
tending to increase. If this process was monitored for a longer period of time, the R-chart
might indicate that the process was out of control.

12.28

a.

R̄ = (R1 + R2 + ... + R16)/k = (.4 + 1.4 + ... + 2.6)/16 = 44.1/16 = 2.756

Centerline = R̄ = 2.756

From Table XII, Appendix B, with n = 5, D4 = 2.114 and D3 = 0.

Upper control limit = R̄D4 = 2.756(2.114) = 5.826

Since D3 = 0, the lower control limit is negative and is not included on the chart.

From Table XII, Appendix B, with n = 5, d2 = 2.326 and d3 = .864.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = 2.756 + 2(.864)(2.756/2.326) = 4.803
Lower A-B boundary = R̄ - 2d3(R̄/d2) = 2.756 - 2(.864)(2.756/2.326) = .709
Upper B-C boundary = R̄ + d3(R̄/d2) = 2.756 + (.864)(2.756/2.326) = 3.780
Lower B-C boundary = R̄ - d3(R̄/d2) = 2.756 - (.864)(2.756/2.326) = 1.732

The R-chart is:

b.

The R-chart is designed to monitor the process variation.

c.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control. None of the out-of-control signals are present.
There is no indication that special causes of variation are present.
12.30

The p-chart is designed to monitor the proportion of defective units produced by a process.

12.32

a.

To compute the proportion of defectives in each sample, divide the number of defectives
by the number in the sample, 200:

p̂ = No. of defectives / No. in sample

The sample proportions are listed in the table:


Sample No.    p̂     Sample No.    p̂
     1      .080         14      .060
     2      .070         15      .070
     3      .045         16      .055
     4      .055         17      .040
     5      .075         18      .035
     6      .040         19      .060
     7      .060         20      .075
     8      .080         21      .045
     9      .085         22      .080
    10      .065         23      .065
    11      .075         24      .055
    12      .050         25      .050
    13      .045

b.

To get the total number of defectives, sum the number of defectives for all 25 samples.
The sum is 303. To get the total number of units sampled, multiply the sample size by the
number of samples: 200(25) = 5000.
p̄ = Total defective in all samples / Total units sampled = 303/5000 = .0606

Centerline = p̄ = .0606

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .0606 + 3√(.0606(.9394)/200) = .1112
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .0606 - 3√(.0606(.9394)/200) = .0100

c.

Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .0606 + 2√(.0606(.9394)/200) = .0943
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .0606 - 2√(.0606(.9394)/200) = .0269
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .0606 + √(.0606(.9394)/200) = .0775
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .0606 - √(.0606(.9394)/200) = .0437

d.

The p-chart is:

e.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control. There do not appear to be any special causes of
variation.
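A minimal sketch (not part of the solution) of the p-chart limits in Exercise 12.32, where 303 defectives were found in 25 samples of n = 200 units each:

```python
from math import sqrt

def p_chart_limits(total_defective, n, k):
    """Centerline, control limits, and zone boundaries for a p-chart."""
    p_bar = total_defective / (n * k)
    se = sqrt(p_bar * (1 - p_bar) / n)
    return {
        "centerline": p_bar,
        "UCL": p_bar + 3 * se,
        "LCL": max(p_bar - 3 * se, 0.0),
        "upper_AB": p_bar + 2 * se,
        "lower_AB": p_bar - 2 * se,
        "upper_BC": p_bar + se,
        "lower_BC": p_bar - se,
    }

# Reproduces the numbers above: centerline .0606, UCL .1112, LCL .0100, etc.
print(p_chart_limits(total_defective=303, n=200, k=25))
```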
12.34

a.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .01)
= 891
.01

The minimum sample size is 892.


b.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .05)
= 171
.05

The minimum sample size is 172.


c.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .10)
= 81
.10

The minimum sample size is 82.


d.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .20)
= 36
.20

The minimum sample size is 37.
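A minimal sketch (not from the text) of the minimum-sample-size rule n > 9(1 - p0)/p0 used in Exercises 12.34 and 12.36 so that a p-chart's lower control limit is not negative:

```python
from math import floor

def min_sample_size(p0):
    """Smallest integer n strictly greater than 9(1 - p0)/p0."""
    return floor(9 * (1 - p0) / p0) + 1

for p0 in (0.01, 0.05, 0.10, 0.20):
    print(p0, min_sample_size(p0))   # 892, 172, 82, 37, matching parts a-d above
```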


12.36

a.

The sample size is determined by the following:


n>

9 (1 p 0 )
p0

9(1 .07)
= 119.6 120
.07

The minimum sample size is 120.


b.

To compute the proportion of defectives in each sample, divide the number of defectives
by the number in the sample, 120:

p̂ = No. of defectives / No. in sample

The sample proportions are listed in the table:


Sample No.    p̂     Sample No.    p̂
     1      .092         11      .083
     2      .042         12      .100
     3      .033         13      .067
     4      .067         14      .050
     5      .083         15      .083
     6      .108         16      .042
     7      .075         17      .083
     8      .067         18      .083
     9      .083         19      .025
    10      .092         20      .067

To get the total number of defectives, sum the number of defectives for all 20 samples.
The sum is 171. To get the total number of units sampled, multiply the sample size by the
number of samples: 120(20) = 2400.


p̄ = Total defective in all samples / Total units sampled = 171/2400 = .071

Centerline = p̄ = .071

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .071 + 3√(.071(.929)/120) = .141
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .071 - 3√(.071(.929)/120) = .001
Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .071 + 2√(.071(.929)/120) = .118
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .071 - 2√(.071(.929)/120) = .024
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .071 + √(.071(.929)/120) = .094
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .071 - √(.071(.929)/120) = .048

The p-chart is:

c.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control.


d.

Since the process is in control, it is appropriate to use the control limits to monitor future
process output.

e.

No. The number of defectives recorded was per day, not per hour. Therefore, the p-chart
is not capable of signaling hour-to-hour changes in p.

12.38

a.

To compute the proportion of defectives in each sample, divide the number of defectives
by the number in the sample, 200:

p̂ = No. of defectives / No. in sample

The sample proportions are listed in the table:

Sample No.    p̂     Sample No.    p̂
     1      .065         16      .015
     2      .025         17      .005
     3      .010         18      .010
     4      .015         19      .015
     5      .010         20      .005
     6      .015         21      .045
     7      .005         22      .025
     8      .010         23      .010
     9      .005         24      .005
    10      .005         25      .015
    11      .055         26      .010
    12      .030         27      .020
    13      .010         28      .010
    14      .015         29      .005
    15      .005         30      .005

To get the total number of defectives, sum the number of defectives for all 30 samples.
The sum is 96. To get the total number of units sampled, multiply the sample size by the
number of samples: 200(30) = 6000.
p̄ = Total defective in all samples / Total units sampled = 96/6000 = .016

The centerline is p̄ = .016

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .016 + 3√(.016(1 - .016)/200) = .0426
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .016 - 3√(.016(1 - .016)/200) = -.0106
Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .016 + 2√(.016(1 - .016)/200) = .0337
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .016 - 2√(.016(1 - .016)/200) = -.0017
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .016 + √(.016(1 - .016)/200) = .0249
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .016 - √(.016(1 - .016)/200) = .0071

The p-chart is:

b.

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: There are 3 points beyond Zone A (Points 1, 11, and 21).
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process does not appear to be in control. Rule 1 indicates that the process is out of
control.
12.40

Specification spread is the difference between the upper specification limit and the lower
specification limit. The specification spread is determined by customers, management, and
product designers. Process spread is the spread of the actual output and is a function of the
standard deviation of the data.

12.42

There are two reasons why CP should not be used in isolation. First, CP is a statistic and is
subject to sampling error. The sample standard deviation is used to estimate the population
standard deviation which is used to calculate the process spread. Thus, the estimate of the
process spread can vary from sample to sample. Second, CP does not reflect the shape of the
output distribution. Distributions with different shapes can have the same CP value.


12.44

The specification spread is the difference between the upper specification limit and the lower
specification limit.

a.  Specification spread = USL - LSL = 19.65 - 12.45 = 7.20

b.  Specification spread = USL - LSL = .0010 - .0008 = .0002

c.  Specification spread = USL - LSL = 1.43 - 1.27 = 0.16

d.  Specification spread = USL - LSL = 490 - 486 = 4

12.46

CP = Specification spread / Process spread = (USL - LSL)/6σ

a.  CP = (USL - LSL)/6s = (1.0065 - 1.0035)/(6(.0005)) = .003/.003 = 1

b.  CP = (USL - LSL)/6s = (22 - 21)/(6(.2)) = 1/1.2 = .8333

c.  CP = (USL - LSL)/6s = (875 - 870)/(6(.75)) = 5/4.5 = 1.111
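A minimal sketch (not from the text) of the capability index used in Exercise 12.46, CP = (USL - LSL)/(6s), where s estimates the process standard deviation:

```python
def cp_index(usl, lsl, s):
    """Capability index; values below 1 indicate the process is not capable."""
    return (usl - lsl) / (6 * s)

print(cp_index(1.0065, 1.0035, 0.0005))  # 1.0
print(cp_index(22, 21, 0.2))             # 0.8333...
print(cp_index(875, 870, 0.75))          # 1.1111...
```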

12.48

a.

If the output distribution is normal with a mean of 1,000 and a standard deviation of 100,
then the proportion of the output that is unacceptable is:

P(x < 980) + P(x > 1,020)
  = P(z < (980 - 1,000)/100) + P(z > (1,020 - 1,000)/100)
  = P(z < -.2) + P(z > .2) = (.5 - .0793) + (.5 - .0793) = .8414
(using Table IV, Appendix B)

The percentage of unacceptable output is 84.14%.

b.

CP = (USL - LSL)/6σ = (1,020 - 980)/(6(100)) = 40/600 = .067

Since the value of CP is less than 1, the process is not capable.
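A minimal sketch (not from the text) of the normal-curve calculation in Exercise 12.48a, using the standard normal CDF rather than Table IV:

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

p_unacceptable = normal_cdf(980, 1000, 100) + (1 - normal_cdf(1020, 1000, 100))
print(p_unacceptable)   # about .8415; the table-based answer above is .8414
```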


12.50

a.

A capability diagram is:


LSL = 35 is off the chart.

b.

Fifty-two of the observations are above the upper specification limit. Thus, the
percentage is (52/100) × 100% = 52%.

c.

From the sample, x̄ = 37.007 and s = .083.

CP = (USL - LSL)/6s = (37 - 35)/(6(.083)) = 2/.498 = 4.016

d.

Since the CP value is greater than 1, the process is capable.

12.52

The quality of a good or service is indicated by the extent to which it satisfies the needs and
preferences of its users. Its eight dimensions are: performance, features, reliability,
conformance, durability, serviceability, aesthetics, and other perceptions that influence
judgments of quality.

12.54

A process is a series of actions or operations that transform inputs to outputs. A process


produces output over time. Organizational process: Manufacturing a product. Personnel
Process: Balancing a checkbook.

12.56

The six major sources of process variation are: people, machines, materials, methods,
measurements, and environment.

12.62

Common causes of variation are the methods, materials, equipment, personnel, and
environment that make up a process and the inputs required by the process. That is, common
causes are attributable to the design of the process. Special causes of variation are events or
actions that are not part of the process design. Typically, they are transient, fleeting events that
affect only local areas or operations within the process for a brief period of time. Occasionally,
however, such events may have a persistent or recurrent effect on the process.

12.64

If a process is capable, then it is necessarily in control. If a process is in control, then the


control chart should be used to monitor the process.


12.66

The probability of observing a value of x̄ more than 3 standard deviations from its mean is:

P(x̄ > μ + 3σx̄) + P(x̄ < μ - 3σx̄) = P(z > 3) + P(z < -3)
  = (.5000 - .4987) + (.5000 - .4987) = .0026

If we want to find the number of standard deviations from the mean the control limits should be
set so the probability of the chart falsely indicating the presence of a special cause of variation
is .10, we must find the z-score such that:

P(z > z0) + P(z < -z0) = .1000  or  P(z > z0) = .0500

Using Table IV, Appendix B, z0 = 1.645. Thus the control limits should be set 1.645 standard
deviations above and below the mean.

12.68

a.

The centerline = x̄ = Σx/n = 150.58/20 = 7.529

The time series plot is:

b.

The variation pattern that best describes the pattern in this time series is the level shift.
Points 1 through 10 all have fairly low values, while points 11 through 20 all have fairly
high values.

12.70

a.

Yes. The minimum sample size necessary so the lower control limit is not negative is:

n > 9(1 - p0)/p0

From the data, p0 ≈ .06.

Thus, n > 9(1 - .06)/.06 = 141. Our sample size was 200.


b.

To compute the proportion of defectives in each sample, divide the number of defectives
by the number in the sample, 200:

p̂ = No. of defectives / No. in sample

The sample proportions are listed in the table:


Sample No.    p̂     Sample No.    p̂
     1      .02          12      .10
     2      .03          13      .10
     3      .055         14      .085
     4      .06          15      .065
     5      .025         16      .05
     6      .05          17      .055
     7      .04          18      .035
     8      .08          19      .03
     9      .085         20      .04
    10      .10          21      .045
    11      .14

To get the total number of defectives, sum the number of defectives for all 21 samples.
The sum is 258. To get the total number of units sampled, multiply the sample size by the
number of samples: 200(21) = 4200.
p̄ = No. of defectives / No. in sample = 258/4200 = .0614

Centerline = p̄ = .0614

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .0614 + 3√(.0614(.9386)/200) = .1123
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .0614 - 3√(.0614(.9386)/200) = .0105
Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .0614 + 2√(.0614(.9386)/200) = .0953
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .0614 - 2√(.0614(.9386)/200) = .0275
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .0614 + √(.0614(.9386)/200) = .0784
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .0614 - √(.0614(.9386)/200) = .0444

The p-chart is:

c.

To determine if the control limits should be used to monitor future process output, we
need to check the four rules.

Rule 1: One point beyond Zone A: The 11th point is beyond Zone A. This indicates the process is out of control.
Rule 2: Nine points in a row in Zone C or beyond: There are not nine points in a row in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

Rule 1 indicates the process is out of control. These control limits should not be used to
monitor future process output.
12.72

a.

In order for the x̄-chart to be meaningful, we must assume the variation in the process is
constant (i.e., stable).

For each sample, we compute x̄ = Σx/n and R = range = largest measurement - smallest
measurement. The results are listed in the table:
Sample No.      x̄       R    Sample No.      x̄       R
     1      32.325    11.6        13      31.050    13.3
     2      30.825    12.4        14      34.400     9.6
     3      30.450     7.8        15      31.350     7.3
     4      34.525    10.2        16      28.150     8.6
     5      31.725     9.1        17      30.950     7.6
     6      33.850    10.4        18      32.225     5.6
     7      32.100    10.1        19      29.050    10.0
     8      28.250     6.8        20      31.400     8.7
     9      32.375     8.7        21      30.350     8.9
    10      30.125     6.3        22      34.175    10.5
    11      32.200     7.1        23      33.275    13.0
    12      29.150     9.3        24      30.950     8.9

x̿ = (x̄1 + x̄2 + ... + x̄24)/k = 755.225/24 = 31.4677

R̄ = (R1 + R2 + ... + R24)/k = 221.8/24 = 9.242

Centerline = x̿ = 31.468

From Table XII, Appendix B, with n = 4, A2 = .729.

Upper control limit = x̿ + A2R̄ = 31.468 + .729(9.242) = 38.205
Lower control limit = x̿ - A2R̄ = 31.468 - .729(9.242) = 24.731

Upper A-B boundary = x̿ + (2/3)(A2R̄) = 31.468 + (2/3)(.729)(9.242) = 35.960
Lower A-B boundary = x̿ - (2/3)(A2R̄) = 31.468 - (2/3)(.729)(9.242) = 26.976
Upper B-C boundary = x̿ + (1/3)(A2R̄) = 31.468 + (1/3)(.729)(9.242) = 33.714
Lower B-C boundary = x̿ - (1/3)(A2R̄) = 31.468 - (1/3)(.729)(9.242) = 29.222

The x̄-chart is:

b.

To determine if the process is in or out of control, we check the six rules.

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: No sequence of six points steadily increases or decreases.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are no groups of three consecutive points that have two or more in Zone A or beyond.
Rule 6: Four out of five points in a row in Zone B or beyond: No sequence of five points has four or more in Zone B or beyond.

The process appears to be in control. There are no indications that special causes of
variation are affecting the process.

c.

Since the process appears to be in control, these limits should be used to monitor future
process output.

12.74

a.

A capability analysis diagram is:

b.

For an upper specification limit of 5, there are 27 observations above this limit. Thus,
(27/100) × 100% = 27% of the observations are unacceptable. It does not appear that the
process is capable.

c.

From Exercise 12.73, the process appears to be in control. Thus, it is appropriate to
estimate CP.

From the sample, x̄ = 3.867 and s = 2.190.

CP = (USL - LSL)/6s = (5 - 0)/(6(2.19)) = 5/13.14 = .381

Since the CP value is less than 1, the process is not capable.

d.

There is no lower specification limit because management has no time limit below which
is unacceptable. The variable being measured is time customers wait in line. The actual
lower limit would be 0.

12.76

a.

To get the total number of defectives, sum the number of defectives for all 36 samples.
The sum is 279. To get the total number of units sampled, multiply the sample size by the
number of samples: 160(36) = 5760.
p̄ = Total defective in all samples / Total units sampled = 279/5760 = .048

The centerline is p̄ = .048

Upper control limit = p̄ + 3√(p̄(1 - p̄)/n) = .048 + 3√(.048(1 - .048)/160) = .099
Lower control limit = p̄ - 3√(p̄(1 - p̄)/n) = .048 - 3√(.048(1 - .048)/160) = -.003
Upper A-B boundary = p̄ + 2√(p̄(1 - p̄)/n) = .048 + 2√(.048(1 - .048)/160) = .082
Lower A-B boundary = p̄ - 2√(p̄(1 - p̄)/n) = .048 - 2√(.048(1 - .048)/160) = .014
Upper B-C boundary = p̄ + √(p̄(1 - p̄)/n) = .048 + √(.048(1 - .048)/160) = .065
Lower B-C boundary = p̄ - √(p̄(1 - p̄)/n) = .048 - √(.048(1 - .048)/160) = .031
The p-chart is:

To determine if the process is in or out of control, we check the four rules:

Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.

The process appears to be in control. Thus, there is no indication that special causes of
variation are present.


c.

The Pareto diagram is:

Most of the defects are due to microcracks. Thus, "microcracks" are the "vital few." The
other types of defectives are broken strands, gaps between layers, and internal voids.
These are the "trivial many."


Time Series: Descriptive Analyses,
Models, and Forecasting

Chapter 13

13.2

a.

The simple composite index is calculated as follows:


First, sum the observations for all the series of interest at each time period. Select the
base time period. Divide each sum by the sum in the base time period and multiply by
100.

b.

To calculate a weighted composite index, we follow the following steps:


First, multiply the observations in each time series by its appropriate weight. Then sum
the weighted observations across all times series for each time period. Select the base
time period. Divide each weighted sum by the weighted sum in the base time period and
multiply by 100.

c.

The steps necessary to compute a Laspeyres index are:

1. Collect data for each of k price series.
2. Select a base time period and collect purchase quantity information for each of the k series at the base time period.
3. Using the purchase quantity values at the base period as weights, multiply each value in the kth series by its corresponding weight.
4. Sum the products for each time period.
5. Divide each sum by the sum corresponding to the base period and multiply by 100.

d.

The steps necessary to compute a Paasche index are:

1. Collect data for each of k price series.
2. Select a base period.
3. Collect purchase quantity information for each series at each time period.
4. For each time period, multiply the value in each price series by its corresponding purchase quantity for that time period. Sum the products for each time period.
5. To find the value of the Paasche index at a particular time period, multiply the purchase quantity values (weights) for that time period by the corresponding price values of the base time period. Sum the results for the base period. The Paasche index is then found by dividing the sum found in (4) by the sum found in (5).

13.4

a.

The simple index for the quarter 4 price of product A, using quarter 1 as the base
period, is (4.25/3.25) × 100 = 130.77.

b.

The simple index for the quarter 2 price of product B, using quarter 1 as the base
period, is (1.25/1.75) × 100 = 71.43.

c.

To find the simple composite index, we must first sum the prices for all three products
over the base period and the quarter for which we want to compute the simple composite
index. The sum for quarter 1 is 3.25 + 1.75 + 8.00 = 13.00. The sum for quarter 4 is 4.25
+ 1.00 + 10.50 = 15.75. The simple composite index for quarter 4 using quarter 1 as the
base period is (15.75/13.00) × 100 = 121.15.

d.

The sum of all the products for quarter 2 is 3.50 + 1.25 + 9.35 = 14.10. The simple
composite index for quarter 4 using quarter 2 as the base period is (15.75/14.10) × 100 =
111.70.
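A minimal sketch (not part of the solution) of the simple and simple composite index calculations in Exercise 13.4, using only the quarter 1 and quarter 4 prices given there:

```python
price_q1 = {"A": 3.25, "B": 1.75, "C": 8.00}   # quarter 1 (base period) prices
price_q4 = {"A": 4.25, "B": 1.00, "C": 10.50}  # quarter 4 prices

def simple_index(value, base_value):
    return 100 * value / base_value

def simple_composite_index(values, base_values):
    return 100 * sum(values) / sum(base_values)

print(simple_index(price_q4["A"], price_q1["A"]))                     # 130.77 (part a)
print(simple_composite_index(price_q4.values(), price_q1.values()))   # 121.15 (part c)
```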

13.6

a.

To find the simple index, divide each value by the value for the base year and multiply by
100. The index numbers are:

Year    Simple Index (Base Year = 1975)       Simple Index (Base Year = 1980)
1975    (13,719/13,719) × 100 = 100.00        (13,719/21,023) × 100 =  65.26
1980    (21,023/13,719) × 100 = 153.24        (21,023/21,023) × 100 = 100.00
1985    (27,735/13,719) × 100 = 202.16        (27,735/21,023) × 100 = 131.93
1990    (35,353/13,719) × 100 = 257.69        (35,353/21,023) × 100 = 168.16
1995    (40,611/13,719) × 100 = 296.02        (40,611/21,023) × 100 = 193.17
2000    (50,890/13,719) × 100 = 370.95        (50,890/21,023) × 100 = 242.07

b.

The index value for 1990 is 257.69 when the base is 1975. Thus, the median annual
family income for 1990 increased by 257.69 - 100 = 157.69% over the median annual
family income in 1975.

The index value for 1990 is 168.16 when the base is 1980. Thus, the median annual
family income for 1990 increased by 168.16 - 100 = 68.16% over the median annual
family income in 1980.

13.8

a.

To compute the simple index, divide each housing start value by the 2001, Quarter 1
value, 274 and then multiply by 100.
Year    Quarter    Simple Index
2001       1       (274/274) × 100 = 100.00
           2       (374/274) × 100 = 136.50
           3       (341/274) × 100 = 124.45
           4       (285/274) × 100 = 104.01
2002       1       (293/274) × 100 = 106.93
           2       (386/274) × 100 = 140.88
           3       (361/274) × 100 = 131.75
           4       (319/274) × 100 = 116.42
2003       1       (304/274) × 100 = 110.95
           2       (406/274) × 100 = 148.18
           3       (412/274) × 100 = 150.36
           4       (377/274) × 100 = 137.59
2004       1       (345/274) × 100 = 125.91
           2       (456/274) × 100 = 166.42
           3       (440/274) × 100 = 160.58
           4       (370/274) × 100 = 135.04
2005       1       (369/274) × 100 = 134.67
           2       (485/274) × 100 = 177.01
           3       (471/274) × 100 = 171.90
           4       (392/274) × 100 = 143.07

b.

The value of the index for Quarter 2, 2004 is 166.42. Thus, the housing starts in Quarter
2, 2004 increased by 166.42 - 100 = 66.42% over the housing starts in the base quarter,
Quarter 1, 2001.

c.

The value of the index for Quarter 4, 2005 is 143.07. Thus, the housing starts in Quarter
4, 2005 increased by 143.07 - 100 = 43.07% over the housing starts in the base quarter,
Quarter 1, 2001.

d.

The number of housing starts for Quarter 1, 2003 is 304 thousand. The number of
housing starts for Quarter 4, 2005 is 392 thousand. Using Quarter 1, 2003 as the base, the
index for Quarter 4, 2005 is (392/304) × 100 = 128.95. Thus, the number of housing
starts in Quarter 4, 2005 increased by 128.95 - 100 = 28.95% over the housing starts in
Quarter 1, 2003.

13.10

a.

To compute the simple index for the agricultural data, divide each farm value by the
1980 value 3,364 and then multiply by 100. To compute the simple index for the
nonagricultural data, divide each nonfarm value by the 1980 value 95,938 and then
multiply by 100. The two indices are:

Year    Farm Index                         Nonfarm Index
1980    (3,364/3,364) × 100 = 100.00       (95,938/95,938) × 100 = 100.00
1985    (3,179/3,364) × 100 =  94.50       (103,971/95,938) × 100 = 108.37
1990    (3,223/3,364) × 100 =  95.81       (115,570/95,938) × 100 = 120.46
1995    (3,440/3,364) × 100 = 102.26       (121,460/95,938) × 100 = 126.60
2000    (2,464/3,364) × 100 =  73.25       (134,427/95,938) × 100 = 140.12
2003    (2,275/3,364) × 100 =  67.63       (135,461/95,938) × 100 = 141.20

b.

The nonfarm segment has shown the greater percentage change in employment over the
time period. The nonfarm employment in 2003 was 41.20% greater than in 1980. The
farm employment in 2003 was 32.37% lower than in 1980.

c.

To compute the simple composite index, first sum the two values (farm and nonfarm) for
every time period. Then divide the sum by the sum in 1980, 99,302, and then multiply by
100. The simple composite index is:

Year       Sum       Simple Composite Index
1980      99,302     (99,302/99,302) × 100 = 100.00
1985     107,150     (107,150/99,302) × 100 = 107.90
1990     118,793     (118,793/99,302) × 100 = 119.63
1995     124,900     (124,900/99,302) × 100 = 125.78
2000     136,891     (136,891/99,302) × 100 = 137.85
2003     137,736     (137,736/99,302) × 100 = 138.70

d.

The simple composite index value for 2003 is 138.70. The composite employment is
38.70% higher in 2003 than in 1980.

13.12

a.

To find the Laspeyres index, we multiply the durable goods by 10.9, the nondurable goods
by 14.02, and the services by 42.6. The three products are then summed. The index is
found by dividing the weighted sum at each time period by the weighted sum of 1970,
17,108.86, and then multiplying by 100. The Laspeyres index and the simple composite
index for 1970 (computed in Exercise 13.11) are:

Year    Simple Composite Index (1970)    Weighted Sum    Laspeyres Index
1960             51.43                     8,409.95           49.16
1965             68.77                    11,442.51           66.88
1970            100.00                    17,108.86          100.00
1975            158.52                    27,509.89          160.79
1980            270.39                    48,215.53          281.82
1985            412.59                    76,167.86          445.20
1990            581.78                   110,254.64          644.43
1995            768.60                   150,193.08          877.87
2000          1,033.83                   202,856.51        1,185.68
2004          1,272.99                   251,152.45        1,467.97

b.

The plot of the two indices is:

[Time series plot of the simple composite index (base 1970) and the Laspeyres index, 1960-2004]

The two indices are very similar from 1960 to approximately 1980. After 1980, the
difference between the two indices becomes larger, with the Laspeyres index increasing
faster than the simple composite index.
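A minimal sketch (not part of the solution) of turning the weighted sums above into the Laspeyres index with 1970 as the base period:

```python
weighted_sums = [8409.95, 11442.51, 17108.86, 27509.89, 48215.53,
                 76167.86, 110254.64, 150193.08, 202856.51, 251152.45]
base = weighted_sums[2]                      # 1970 is the base period
laspeyres = [100 * s / base for s in weighted_sums]
print(laspeyres)   # matches the Laspeyres column above (49.16, 66.88, 100, ..., 1467.97)
```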


13.14

a.

To get the simple composite price index, sum the prices for the three metals for each
month, divide by 2,090.35 (the sum of the prices for the base period January), and
multiply by 100. To get the simple composite quantity index, sum the quantities for the
three metals for each month, divide by 8,793.40 (the sum of the quantities for the base
period January), and multiply by 100. The indices are:

Month    Price Total    Price Index    Quantity Total    Quantity Index
Jan       2,090.35        100.00          8,793.40          100.00
Feb       2,495.72        119.39          8,531.70           97.02
Mar       2,536.85        121.36          9,406.50          106.97
Apr       2,409.55        115.27          9,047.10          102.89
May       2,550.70        122.02          9,303.20          105.80
Jun       2,603.20        124.53          9,152.10          104.08
Jul       2,719.30        130.09          9,301.80          105.78
Aug       2,998.52        143.45          9,457.90          107.56
Sep       2,978.98        142.51          9,382.90          106.70
Oct       2,997.82        143.41          9,698.20          110.29
Nov       3,038.80        145.37          9,127.00          103.79
Dec       3,018.57        144.41          8,807.90          100.16

b.

To compute the Laspeyres index, multiply the price for each month by the quantity for
each of the metals for January, sum the products for the three metals, divide by
1,768,700.64 (the sum for the base period January), and multiply by 100. The Laspeyres
index is:

Month        Total          Laspeyres Index
Jan       1,768,700.64          100.00
Feb       2,077,067.24          117.43
Mar       2,345,138.00          132.59
Apr       2,114,563.64          119.55
May       1,760,956.32           99.56
Jun       1,746,326.88           98.74
Jul       2,117,568.80          119.72
Aug       2,377,017.20          134.39
Sep       2,100,958.72          118.79
Oct       2,276,109.40          128.69
Nov       2,366,980.72          133.83
Dec       2,155,654.92          121.88

c.

The plots of the simple composite price index, the simple composite quantity index, and the
Laspeyres index are:

[Time series plot of the simple composite price index, the simple composite quantity index, and the Laspeyres index, January through December]

The quantity index appears to be fairly stable while the price index steadily
increases. The Laspeyres index is rather unstable, as it varies much more than the
other two indices.
d.

The following steps are used to compute the Paasche index:


1. First, multiply the price by the production for copper, steel, and lead for each month. The numerator of the index is the sum of these three quantities at each month.
2. Next, multiply the production values of copper by 1,133, the production of steel by 187.75, and the production of lead by 769.6. The denominator is the sum of these three quantities at each month.
3. The values of the Paasche index are the ratios of these two values at each month times 100.

The Paasche index is:


Month    Paasche Numerator    Paasche Denominator    Paasche Index
Jan        1,768,700.64          1,768,700.64            100.00
Feb        2,013,192.24          1,714,396.58            117.43
Mar        2,500,128.80          1,884,813.60            132.65
Apr        2,180,640.81          1,823,938.71            119.56
May        1,858,912.26          1,867,861.77             99.52
Jun        1,822,735.92          1,844,379.26             98.83
Jul        2,230,984.40          1,864,385.48            119.66
Aug        2,549,791.96          1,898,332.74            134.32
Sep        2,244,369.96          1,888,977.74            118.81
Oct        2,504,067.86          1,946,822.77            128.62
Nov        2,450,159.20          1,831,683.15            133.77
Dec        2,175,046.70          1,781,166.44            122.11

e.

The plot of the Laspeyres index and the Paasche index is:

[Time series plot of the Laspeyres and Paasche indices, January through December]

The two indices are almost identical.

f.

The values of the Laspeyres index for September and December are 118.79 and 121.88. The
values of the Paasche index for September and December are 118.81 and 122.11. These
values are almost identical. Both the Laspeyres and Paasche indices are so close to being
the same, neither is superior to the other.

13.16

a.

The exponentially smoothed employment for the first period is equal to the employment
for that period. For the rest of the time periods, the exponentially smoothed employment
values are found by multiplying .5 times the employment value of that time period and
adding to that (1 - .5) times the value of the exponentially smoothed employment figure
of the previous time period.

The exponentially smoothed employment value for time period 2 is .5(281) +
(1 - .5)(280) = 280.5. The rest of the values are shown in the table.

Month    t     Yt    Exponentially Smoothed Series (w = .5)
Jan.     1    280         280.0
Feb.     2    281         280.5
Mar.     3    250         265.3
Apr.     4    246         255.6
May      5    239         247.3
June     6    218         232.7
July     7    218         225.3
Aug.     8    210         217.7
Sept.    9    205         211.3
Oct.    10    206         208.7
Nov.    11    200         204.3
Dec.    12    200         202.2

b.

The graph of the time series and the exponentially smoothed series is:

[Plot of the employment series Yt and the exponentially smoothed series (w = .5) against time period]
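A minimal sketch (not from the text) of the exponential smoothing used in Exercise 13.16, Et = wYt + (1 - w)Et-1 with E1 = Y1 and w = .5:

```python
def exponential_smooth(y, w):
    smoothed = [y[0]]
    for value in y[1:]:
        smoothed.append(w * value + (1 - w) * smoothed[-1])
    return smoothed

employment = [280, 281, 250, 246, 239, 218, 218, 210, 205, 206, 200, 200]
for t, e in enumerate(exponential_smooth(employment, w=0.5), start=1):
    print(t, e)   # 280, 280.5, 265.25, 255.625, ...; the table reports these to one decimal
```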

13.18

a.

The exponentially smoothed fish catch for Chile for the first period is equal to the fish
catch for that period. For the rest of the time periods, the exponentially smoothed fish
catch values are found by multiplying .5 times the fish catch of that time period and
adding to that (1 - .5) times the value of the exponentially smoothed fish catch figure of
the previous time period. The exponentially smoothed fish catch for Chile for the time
period 1995 is .5(7,590.5) + (1 - .5)(5,195.4) = 6,392.95. The rest of the values are
shown in the table.

Similarly, the exponentially smoothed fish catch for Brazil for the first period is equal to
the fish catch for that period. For the rest of the time periods, the exponentially smoothed
fish catch values are found by multiplying .5 times the fish catch of that time period and
adding to that (1 - .5) times the value of the exponentially smoothed fish catch figure of
the previous time period. The exponentially smoothed fish catch for Brazil for time
period 1995 is .5(800.0) + (1 - .5)(802.9) = 801.45. The rest of the values are shown in
the table.

Time Series: Descriptive Analyses, Models, and Forecasting

483

To download more slides, ebook, solutions and test bank, visit http://downloadslide.blogspot.com

Year    Chile Catch    Chile Smoothed (w = .5)    Brazil Catch    Brazil Smoothed (w = .5)
1990      5,195.4            5,195.40                 802.9             802.90
1995      7,590.5            6,392.95                 800.0             801.45
1998      3,265.3            4,829.13                 706.8             754.13
1999      5,050.2            4,939.66                 703.9             729.01
2000      4,300.0            4,619.83                 766.8             747.91
2001      3,797.1            4,208.47                 806.7             777.30
2002      4,271.5            4,239.98                 822.1             799.70

b.

The plot of the two time series and the two exponentially smoothed series is:

[Plot of the Chile and Brazil fish catch series and their exponentially smoothed series (w = .5), 1990-2002]

Both the time series and the exponentially smoothed series for the fish catch in Brazil are
fairly stable over time. There is a decrease and then increase for both series in Brazil.
Both the time series and exponentially smoothed series for the fish catch in Chile show a
decrease over time. The exponentially smoothed series is more stable than the actual time
series.

13.20

a.

The exponentially smoothed expenditure for the first time period is equal to the
expenditure for that period. For the rest of the time periods, the exponentially smoothed
expenditures are found by multiplying the expenditure for the time period by w = .2 and
adding to that (1 - .2) times the exponentially smoothed value above it. The
exponentially smoothed value for the year 1991 is .2(548.9) + (1 - .2)(590.1) = 581.86.
The rest of the values appear in the table. The process is repeated with w = .8.

Year    Expenditures    w = .2 Smoothed Value    w = .8 Smoothed Value
1990        590.1             590.10                   590.10
1991        548.9             581.86                   557.14
1992        581.1             581.71                   576.31
1993        607.6             586.89                   601.34
1994        643.2             598.15                   634.83
1995        654.6             609.44                   650.65
1996        687.1             624.97                   679.81
1997        727.4             645.46                   717.88
1998        779.3             672.23                   767.02
1999        831.6             704.10                   818.68
2000        853.4             733.96                   846.46
2001        872.0             761.57                   866.89
2002        890.9             787.43                   886.10
2003        912.3             812.41                   907.06
2004        925.6             835.05                   921.89
2005        931.5             854.34                   929.58

b.

The plot of the two series is:

[Plot of expenditures and the two exponentially smoothed series (w = .2 and w = .8), 1990-2005]

The trend in personal consumption expenditure on transportation increased at a
faster rate in the 1990s than in the 2000s. In the 2000s, the consumption
expenditure is increasing but at a slower rate.

13.22

a.

The exponentially smoothed Stock Index for the first time period is equal to the Stock
Index for that time period. For the rest of the time periods, the exponentially smoothed
stock price is found by multiplying w = .3 times the stock price for that time period and
adding to that (1 - .3) times the value of the exponentially smoothed stock price for the
previous time period. The exponentially smoothed stock price for the second time
period is .3(1372.7) + (1 - .3)(1286.4) = 1312.29. The rest of the values are shown in the
table.

Year    Quarter    S&P 500    Smoothed (w = .3)    Smoothed (w = .7)
1999       1        1286.4         1286.4               1286.4
           2        1372.7         1312.3               1346.8
           3        1282.7         1303.4               1301.9
           4        1469.2         1353.1               1419.0
2000       1        1498.6         1396.8               1474.7
           2        1454.6         1414.1               1460.6
           3        1436.5         1420.8               1443.7
           4        1320.3         1390.7               1357.3
2001       1        1160.3         1321.6               1219.4
           2        1224.4         1292.4               1222.9
           3        1040.9         1217.0               1095.5
           4        1148.1         1196.3               1132.3
2002       1        1147.4         1181.6               1142.9
           2         989.8         1124.1               1035.7
           3         815.3         1031.4                881.4
           4         879.8          986.0                880.3
2003       1         848.2          944.6                857.8
           2         974.5          953.6                939.5
           3         996.0          966.3                979.0
           4        1111.9         1010.0               1072.0
2004       1        1126.2         1044.9               1110.0
           2        1140.8         1073.6               1131.5
           3        1114.6         1085.9               1119.7
           4        1211.9         1123.7               1184.2
2005       1        1180.6         1140.8               1181.7
           2        1191.3         1155.9               1188.4
           3        1228.8         1177.8               1216.7
           4        1248.3         1198.9               1238.8
2006       1        1294.9         1227.7               1278.1
           2        1270.2         1240.5               1272.6
           3        1335.8         1269.1               1316.8


The plot of the original series and the exponentially smoothed series with w = .3 is:

[Plot omitted: quarterly S&P 500 index, Q1 1999 - Q3 2006, with the exponentially smoothed series (w = .3).]

b.

The same procedure is followed for w = .7. The exponentially smoothed Stock Index for the first time period is equal to the Stock Index for that time period. For the rest of the time periods, the exponentially smoothed stock price is found by multiplying w = .7 times the stock price for that time period and adding to that (1 - .7) times the value of the exponentially smoothed stock price for the previous time period. The exponentially smoothed stock price for the second time period is .7(1372.7) + (1 - .7)(1286.4) = 1346.8. The rest of the values are shown in the table in part a.
The plot of the original series and the exponentially smoothed series with w = .7 is:

[Plot omitted: quarterly S&P 500 index, Q1 1999 - Q3 2006, with the exponentially smoothed series (w = .7).]

c.

The exponentially smoothed series with w = .3 better describes the trends in the series.
The exponentially smoothed series with w = .7 is almost exactly like the original series.


13.24

a.

The missing trend value for quarter 3 is:


T3 = v(E3 - E2) + (1 - v)T2 = .6(3.78 - 3.50) + (1 - .6)(.25) = .268 ≈ .27

b.

The missing smoothed value for quarter 4 is:


E4 = wY4 + (1 - w)(E3 + T3) = .2(4.25) + (1 - .2)(3.78 + .27) = 4.09

c.

The forecast for quarter 5 is:


FQ5 = Ft+1 = Et + Tt = 4.09 + .29 = 4.38.
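The following Python sketch (not part of the original solution) carries out the Holt-Winters updating equations used in parts a-c with the values of this exercise (w = .2, v = .6); small differences in the last digit can arise from the rounding used in the hand computation.

def holt_winters_update(y_t, e_prev, t_prev, w, v):
    """One Holt-Winters step: returns (E_t, T_t)."""
    e_t = w * y_t + (1 - w) * (e_prev + t_prev)
    t_t = v * (e_t - e_prev) + (1 - v) * t_prev
    return e_t, t_t

w, v = .2, .6
E2, T2, E3 = 3.50, .25, 3.78                        # given smoothed values
T3 = v * (E3 - E2) + (1 - v) * T2                   # part a: ≈ .27
E4, T4 = holt_winters_update(4.25, E3, T3, w, v)    # part b: E4 ≈ 4.09
forecast_q5 = E4 + T4                               # part c: ≈ 4.38
print(round(T3, 3), round(E4, 2), round(forecast_q5, 2))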

13.26

a.

To compute the exponentially smoothed values, we follow these steps:


E1 = Y1 = 345
E2 = wY2 + (1 - w)E1 = .6(456) + (1 - .6)(345) = 411.60
E3 = wY3 + (1 - w)E2 = .6(440) + (1 - .6)(411.60) = 428.64
The rest of the values are computed in a similar manner and are listed in the table:
Year    Quarter    Housing Starts    Exponentially Smoothed (w = .6)
2004    1          345               345.00
        2          456               411.60
        3          440               428.64
        4          370               393.46
2005    1          369               378.78
        2          485               442.51
        3          471               459.61
        4          392               419.04

b.

Using MINITAB, the plot is:


[Plot omitted: quarterly housing starts, Q1 2004 - Q4 2005, with the exponentially smoothed series (w = .6).]

c. To forecast using exponentially smoothed values, we use the following:


F2006,1 = Ft+1 = Et = 419.04
F2006,2 = Ft+2 = Ft+1 = 419.04
F2006,3 = Ft+3 = Ft+1 = 419.04
F2006,4 = Ft+4 = Ft+1 = 419.04


13.28

a.

Using the information from Exercise 13.21, the forecast using the exponentially
smoothed values with w = .9 is:
F2006 = Ft+2 = Ft+1 = Et = 1815.3

b.

We first compute the Holt-Winters values for years 1974-2004.


With w = .3 and v = .8,
E2 = Y2 = 1171
T2 = Y2 - Y1 = 1171 - 926 = 245
E3 = wY3 + (1 - w)(E2 + T2) = .3(1663) + (1 - .3)(1171 + 245) = 1490.1
T3 = v(E3 - E2) + (1 - v)T2 = .8(1490.1 - 1171) + (1 - .8)(245) = 304.28
The rest of the Ets and Tts appear in the table:

Year
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004

t
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31

Imports
926
1,171
1,663
2,058
1,892
1,866
1,414
1,067
633
540
553
479
771
876
987
1,232
1,282
1,233
1,247
1,339
1,307
1,303
1,258
1,378
1,522
1,543
1,664
1,770
1,490
1,671
1,833

Et
w = .3
v = .8

Tt
w = .3
v = .8

1171.00
1490.10
1873.47
2136.31
2253.87
2107.47
1734.46
1182.96
637.02
235.48
8.40
50.00
283.65
622.67
1020.93
1365.36
1571.76
1639.14
1619.79
1529.25
1411.34
1289.30
1232.36
1270.65
1364.08
1508.72
1679.04
1736.09
1771.26
1820.42

245.00
304.28
367.55
283.79
150.80
86.96
315.80
504.36
537.62
428.76
267.41
20.21
182.88
307.79
380.16
351.58
235.43
100.99
4.72
71.48
108.63
119.36
69.42
16.75
78.09
131.33
162.52
78.14
43.77
48.08


To forecast using the Holt-Winters Model:


For w = .3 and v = .8,
F2006 = Ft+2 = Et + 2Tt = 1,820.42 + 2(48.08) = 1,916.58
c.

The error forecast for the exponentially smoothed series is


Yt+2 - Ft+2 = 2,100 - 1,815.3 = 284.7
The error forecast for the Holt-Winters series is
Yt+2 - Ft+2 = 2,100 - 1,916.58 = 183.42
The error for the Holt-Winters forecast is smaller than the error for the exponentially
smoothed forecast.

13.30

a.

We first compute the Holt-Winters values for the years 2003-2005.


With w = .3 and v = .5,
E2 = Y2 = 974.5
E3 = wY3 + (1 - w)(E2 + T2) = .3(996.0) + (1 - .3)(974.5 + 126.3) = 1,069.36
T2 = Y2 - Y1 = 974.5 - 848.2 = 126.3
T3 = v(E3 - E2) + (1 - v)T2 = .5(1,069.36 - 974.5) + (1 - .5)(126.3) = 110.58
The rest of the Ets and Tts appear in the table that follows.

Year
2003

2004

2005

2006


Quarter
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3

S&P
500
848.2
974.5
996.0
1111.9
1126.2
1140.8
1114.6
1211.9
1180.6
1191.3
1228.8
1248.3
1294.9
1270.2
1335.8

Et
w = .3
v = .5

Tt
w = .3
v = .5

Et
w = .7
v = .5

Tt
w = .7
v = .5

974.5
1069.36
1159.53
1219.79
1252.32
1250.50
1258.03
1246.99
1232.52
1227.45
1229.96

126.30
110.58
100.37
80.32
56.42
27.30
17.42
3.19
-5.64
-5.35
-1.42

974.5
1027.44
1113.45
1148.72
1161.64
1139.88
1192.62
1193.28
1196.53
1221.92
1245.60

126.30
89.62
87.81
61.54
37.23
7.74
30.24
15.45
9.35
17.37
20.52


To forecast using the Holt-Winters Model with w = .3 and v = .5:


F2006,1 = Ft+1 = Et + Tt = 1,229.96 + (-1.42) = 1,228.54
F2006,2 = Ft+2 = Et + 2Tt = 1,229.96 + 2(-1.42) = 1,227.12
F2006,3 = Ft+3 = Et + 3Tt = 1,229.96 + 3(-1.42) = 1,225.70

With w = .7 and v = .5,
E2 = Y2 = 974.5
E3 = wY3 + (1 - w)(E2 + T2) = .7(996.0) + (1 - .7)(974.5 + 126.3) = 1,027.44
T2 = Y2 - Y1 = 974.5 - 848.2 = 126.3
T3 = v(E3 - E2) + (1 - v)T2 = .5(1,027.44 - 974.5) + (1 - .5)(126.3) = 89.62
The rest of the Ets and Tts appear in the table above.

To forecast using the Holt-Winters Model with w = .7 and v = .5:
F2006,1 = Ft+1 = Et + Tt = 1,245.60 + 20.52 = 1,266.12
F2006,2 = Ft+2 = Et + 2Tt = 1,245.60 + 2(20.52) = 1,286.64
F2006,3 = Ft+3 = Et + 3Tt = 1,245.60 + 3(20.52) = 1,307.16
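The following Python sketch (not part of the original solution) runs the full Holt-Winters recursion on the 2003-2005 quarterly S&P 500 values from the table and produces the k-step-ahead forecasts Ft+k = Et + kTt; results agree with the hand computations above up to rounding. The function name holt_winters is ours.

def holt_winters(y, w, v):
    """Return lists (E, T); the first entries correspond to t = 2."""
    e = [y[1]]                    # E_2 = Y_2
    t = [y[1] - y[0]]             # T_2 = Y_2 - Y_1
    for yt in y[2:]:
        e_new = w * yt + (1 - w) * (e[-1] + t[-1])
        t_new = v * (e_new - e[-1]) + (1 - v) * t[-1]
        e.append(e_new)
        t.append(t_new)
    return e, t

sp500 = [848.2, 974.5, 996.0, 1111.9, 1126.2, 1140.8, 1114.6, 1211.9,
         1180.6, 1191.3, 1228.8, 1248.3]          # 2003 Q1 - 2005 Q4

for w, v in ((.3, .5), (.7, .5)):
    e, t = holt_winters(sp500, w, v)
    forecasts = [e[-1] + k * t[-1] for k in (1, 2, 3)]   # 2006 Q1 - Q3
    print(w, v, [round(f, 2) for f in forecasts])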
13.32

a.

From Exercise 13.25a, the forecasts for 2003-2005 using w = .3 are:


F2003 = 199.48
F2004 = 199.48
F2005 = 199.48
The errors are the differences between the actual values and the predicted values.
Thus, the errors are:
Y2003 - F2003 = 195 - 199.48 = -4.48
Y2004 - F2004 = 197 - 199.48 = -2.48
Y2005 - F2005 = 195 - 199.48 = -4.48

b.

From Exercise 13.25a, the forecasts for 2003-2005 using w = .7 are:


F2003 = 199.74
F2004 = 199.74
F2005 = 199.74
The errors are:
Y2003 - F2003 = 195 - 199.74 = -4.74
Y2004 - F2004 = 197 - 199.74 = -2.74
Y2005 - F2005 = 195 - 199.74 = -4.74


c.

For the exponentially smoothed forecasts with w = .3,

MAD = Σ|Yt - Ft| / m = (|195 - 199.48| + |197 - 199.48| + |195 - 199.48|) / 3 = 11.44 / 3 = 3.81

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|195 - 199.48|/195 + |197 - 199.48|/197 + |195 - 199.48|/195) / 3] × 100 = (.0585 / 3) × 100 = 1.9512

RMSE = √[Σ(Yt - Ft)² / m] = √[((195 - 199.48)² + (197 - 199.48)² + (195 - 199.48)²) / 3] = √(46.2912 / 3) = 3.928
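The following Python sketch (not part of the original solution) computes the three error measures used in parts c and d; the function names are ours.

from math import sqrt

def mad(y, f):
    return sum(abs(yt - ft) for yt, ft in zip(y, f)) / len(y)

def mape(y, f):
    return 100 * sum(abs(yt - ft) / yt for yt, ft in zip(y, f)) / len(y)

def rmse(y, f):
    return sqrt(sum((yt - ft) ** 2 for yt, ft in zip(y, f)) / len(y))

actual   = [195, 197, 195]
forecast = [199.48] * 3
print(round(mad(actual, forecast), 2),
      round(mape(actual, forecast), 2),
      round(rmse(actual, forecast), 3))   # ≈ 3.81, 1.95, 3.928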

d.

For the exponentially smoothed forecasts with w = .7,

MAD = Σ|Yt - Ft| / m = (|195 - 199.74| + |197 - 199.74| + |195 - 199.74|) / 3 = 12.22 / 3 = 4.07

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|195 - 199.74|/195 + |197 - 199.74|/197 + |195 - 199.74|/195) / 3] × 100 = (.0625 / 3) × 100 = 2.0841

RMSE = √[Σ(Yt - Ft)² / m] = √[((195 - 199.74)² + (197 - 199.74)² + (195 - 199.74)²) / 3] = √(52.4428 / 3) = 4.181


13.34

a.

From Exercise 13.29a, the forecasts for the 3 quarters of 2006 using w = .7 are:
F2006,1 = 1,238.8
F2006,2 = 1,238.8
F2006,3 = 1,238.8
For the exponentially smoothed forecasts with w = .7:

MAD = Σ|Yt - Ft| / m = (|1294.9 - 1238.8| + |1270.2 - 1238.8| + |1335.8 - 1238.8|) / 3 = 184.5 / 3 = 61.5

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|1294.9 - 1238.8|/1294.9 + |1270.2 - 1238.8|/1270.2 + |1335.8 - 1238.8|/1335.8) / 3] × 100 = (.1407 / 3) × 100 = 4.689

RMSE = √[Σ(Yt - Ft)² / m] = √[((1294.9 - 1238.8)² + (1270.2 - 1238.8)² + (1335.8 - 1238.8)²) / 3] = √(13,542.17 / 3) = 67.187

b.

From Exercise 13.29b, the forecasts for the 3 quarters of 2006 using w = .3 are:
F2006,1 = 1,198.9
F2006,2 = 1,198.9
F2006,3 = 1,198.9
For the exponentially smoothed forecasts with w = .3:

MAD = Σ|Yt - Ft| / m = (|1294.9 - 1198.9| + |1270.2 - 1198.9| + |1335.8 - 1198.9|) / 3 = 304.2 / 3 = 101.4

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|1294.9 - 1198.9|/1294.9 + |1270.2 - 1198.9|/1270.2 + |1335.8 - 1198.9|/1335.8) / 3] × 100 = (.2328 / 3) × 100 = 7.759

RMSE = √[Σ(Yt - Ft)² / m] = √[((1294.9 - 1198.9)² + (1270.2 - 1198.9)² + (1335.8 - 1198.9)²) / 3] = √(33,041.3 / 3) = 104.946

c.

For all three measures of error, the values for the exponentially smoothed series with w = .7 are smaller than those for the exponentially smoothed series with w = .3. Thus, the more accurate forecasts are those from the exponentially smoothed series with w = .7.

13.36

a.

From Exercise 13.31, the actual data and the forecasts using the exponential
smoothing and the Holt-Winters forecasts are:

Year    Month    Gold     Exponential    Holt-Winters
                 Price    Forecast       Forecast
                          w = .5         w = .5, v = .5
2005    Jan      424.2    433.47         454.09
        Feb      423.4    433.47         466.55
        Mar      434.2    433.47         479.01
        Apr      428.9    433.47         491.47
        May      421.9    433.47         503.93
        Jun      430.7    433.47         516.39
        Jul      424.5    433.47         528.85
        Aug      437.9    433.47         541.31
        Sep      456.0    433.47         553.77
        Oct      469.9    433.47         566.23
        Nov      476.7    433.47         578.69
        Dec      509.8    433.47         591.15

For the exponential smoothing forecasts with w = .5:


MAD = Σ|Yt - Ft| / m = (|424.2 - 433.47| + |423.4 - 433.47| + … + |509.8 - 433.47|) / 12 = 230.9 / 12 = 19.242

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|424.2 - 433.47|/424.2 + |423.4 - 433.47|/423.4 + … + |509.8 - 433.47|/509.8) / 12] × 100 = (.4904 / 12) × 100 = 4.087

RMSE = √[Σ(Yt - Ft)² / m] = √[((424.2 - 433.47)² + (423.4 - 433.47)² + … + (509.8 - 433.47)²) / 12] = √(9,980.2268 / 12) = 28.839

For the Holt-Winters forecasts with w = .5 and v = .5:


MAD = Σ|Yt - Ft| / m = (|424.2 - 454.09| + |423.4 - 466.55| + … + |509.8 - 591.15|) / 12 = 933.34 / 12 = 77.778

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|424.2 - 454.09|/424.2 + |423.4 - 466.55|/423.4 + … + |509.8 - 591.15|/509.8) / 12] × 100 = (2.0897 / 12) × 100 = 17.415

RMSE = √[Σ(Yt - Ft)² / m] = √[((424.2 - 454.09)² + (423.4 - 466.55)² + … + (509.8 - 591.15)²) / 12] = √(80,190.7476 / 12) = 81.747

For all three measures of forecast errors, the exponential smoothing forecasts had
smaller errors. Thus, the exponential smoothing forecasts are better.


b.

From Exercise 13.31, the actual data and the forecasts using the exponential
smoothing one-step-ahead and the Holt-Winters one-step-ahead forecasts are:

Year    Month    Gold     Exponential    Holt-Winters
                 Price    Forecast       Forecast
                          w = .5         w = .5, v = .5
2005    Jan      424.2    433.47         454.09
        Feb      423.4    428.83         444.12
        Mar      434.2    426.12         433.57
        Apr      428.9    430.16         433.84
        May      421.9    429.53         430.10
        Jun      430.7    425.71         422.67
        Jul      424.5    428.21         425.37
        Aug      437.9    426.35         423.40
        Sep      456.0    432.13         432.74
        Oct      469.9    444.06         452.27
        Nov      476.7    456.98         473.40
        Dec      509.8    466.84         488.19

For the exponential smoothing one-step-ahead forecasts with w = .5:


MAD = Σ|Yt - Ft| / m = (|424.2 - 433.47| + |423.4 - 428.83| + … + |509.8 - 466.84|) / 12 = 164.32 / 12 = 13.693

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|424.2 - 433.47|/424.2 + |423.4 - 428.83|/423.4 + … + |509.8 - 466.84|/509.8) / 12] × 100 = (.3540 / 12) × 100 = 2.950

RMSE = √[Σ(Yt - Ft)² / m] = √[((424.2 - 433.47)² + (423.4 - 428.83)² + … + (509.8 - 466.84)²) / 12] = √(3,884.9754 / 12) = 17.993


For the Holt-Winters one-step-ahead forecasts with w = .5 and v = .5:


MAD = Σ|Yt - Ft| / m = (|424.2 - 454.09| + |423.4 - 444.12| + … + |509.8 - 488.19|) / 12 = 153.58 / 12 = 12.798

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|424.2 - 454.09|/424.2 + |423.4 - 444.12|/423.4 + … + |509.8 - 488.19|/509.8) / 12] × 100 = (.3434 / 12) × 100 = 2.862

RMSE = √[Σ(Yt - Ft)² / m] = √[((424.2 - 454.09)² + (423.4 - 444.12)² + … + (509.8 - 488.19)²) / 12] = √(3,019.9854 / 12) = 15.864

For all three measures of forecast errors, the Holt-Winters forecasts have smaller errors.
Thus, the Holt-Winters forecasts are better.
13.38

a.

Using MINITAB, the output is:


Regression Analysis: Price versus t
The regression equation is
Price = 24.7 + 0.0910 t
Predictor
Constant
t

Coef
24.6975
0.09103

S = 1.497

SE Coef
0.7851
0.08119

R-Sq = 8.2%

T
31.46
1.12

P
0.000
0.281

R-Sq(adj) = 1.7%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
14
15

SS
2.817
31.379
34.197

MS
2.817
2.241

F
1.26

P
0.281

Predicted Values for New Observations


New Obs
1

Fit
26.245

SE Fit
0.785

95.0% CI
24.561, 27.929)

95.0% PI
22.619, 29.871)

Values of Predictors for New Observations


New Obs
1

t
17.0


Predicted Values for New Observations


New Obs
2

Fit
26.336

SE Fit
0.857

95.0% CI
24.497, 28.175)

95.0% PI
22.636, 30.036)

Values of Predictors for New Observations
New Obs      t
      2   18.0

b.

The estimates of the parameters in the model, E(Yt) = β0 + β1t, are:

β̂0 = 24.6975   The price is estimated to be 24.6975 cents/pound for t = 0, or for 1991.

β̂1 = .09103    The price is estimated to increase by .091 cents/pound for each additional year.

c.

The forecast for 2007 is:

Using t = 17, Ŷ2007 = 24.6975 + .09103(17) = 26.2450

The forecast for 2008 is:

Using t = 18, Ŷ2008 = 24.6975 + .09103(18) = 26.3360

Yes, these agree with the predicted values on the printout.
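The following Python sketch (not part of the original solution) simply evaluates the fitted trend line at t = 17 and t = 18; the coefficients are taken from the MINITAB printout, and the underlying annual price data are not reproduced here.

b0, b1 = 24.6975, 0.09103        # least squares estimates from the printout

def trend_forecast(t):
    return b0 + b1 * t

print(round(trend_forecast(17), 4))   # 2007 forecast, ≈ 26.2450
print(round(trend_forecast(18), 4))   # 2008 forecast, ≈ 26.3360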

d.

From the printout, the 95% forecast intervals are:


2007 (22.619, 29.871)
2008 (22.636, 30.036)
We are 95% confident that the actual price in 2007 will be between 22.619 and 29.871.
We are 95% confident that the actual price in 2008 will be between 22.636 and 30.036.

e.

No, we would not recommend that this model be used to forecast annual price. If we were to test if there is a significant linear relationship between time and annual price (H0: β1 = 0 vs Ha: β1 ≠ 0), the test statistic would be t = 1.12 and the p-value would be p = .281. Thus, we would conclude there is insufficient evidence to indicate a linear relationship exists between time and annual price. (Do not reject H0.)

13.40

The major advantage of regression forecasts over the exponentially smoothed forecasts is that prediction intervals can be formed using the regression forecasts and not using the exponentially smoothed forecasts.


13.42

a.

Using MINITAB, the results are:


Regression Analysis: Price versus Time
The regression equation is
Price = 4.76 + 0.309 Time
Predictor
Constant
Time

Coef
4.7608
0.30857

S = 0.769971

SE Coef
0.4184
0.04601

R-Sq = 77.6%

T
11.38
6.71

P
0.000
0.000

R-Sq(adj) = 75.8%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
13
14

SS
26.661
7.707
34.368

MS
26.661
0.593

F
44.97

P
0.000

Unusual Observations
Obs
15

Time
15.0

Price
10.740

Fit
9.389

SE Fit
0.379

Residual
1.351

St Resid
2.01R

R denotes an observation with a large standardized residual.


Predicted Values for New Observations
New
Obs
1

Fit
9.698

SE Fit
0.418

95% CI
(8.794, 10.602)

95% PI
(7.805, 11.591)

Values of Predictors for New Observations


New
Obs
1

Time
16.0

Predicted Values for New Observations


New
Obs
1

Fit
10.006

SE Fit
0.459

95% CI
(9.014, 10.999)

95% PI
(8.069, 11.943)

Values of Predictors for New Observations


New
Obs
1

Time
17.0

From the printout:

β̂0 = 4.7608.  The price of gas is estimated to be 4.7608 dollars per 1,000 cubic feet in 1989.

β̂1 = .30857.  For each additional year, the price of gas is estimated to increase by .30857 dollars per 1,000 cubic feet.


b.

To determine the model fit, we test:

H0: β1 = 0
Ha: β1 ≠ 0

The test statistic is t = 6.71 (from the printout). The p-value is p = 0.000. Since the p-value is so small, H0 is rejected for any reasonable value of α. There is sufficient evidence that the model has an adequate fit.

c.

The 95% prediction interval for 2005 is (7.805, 11.591). We are 95% confident that the
actual annual price of natural gas in 2005 is between 7.805 and 11.591 dollars per 1,000
cubic feet.
The 95% prediction interval for 2006 is (8.069, 11.943). We are 95% confident that
the actual annual price of natural gas in 2006 is between 8.069 and 11.943 dollars per
1,000 cubic feet.

d.

There are basically two problems with using simple linear regression for predicting time series data. First, we must predict values of the time series for values of time outside the observed range. We observe data for time periods 1, 2, …, t and use the regression model to predict values of the time series for t + 1, t + 2, …. The second problem is that simple linear regression does not allow for any cyclical effects such as seasonal trends.

13.44

a.

The regression model is: E(Yt) = β0 + β1t + β2Q1 + β3Q2 + β4Q3
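The following Python sketch (not part of the original solution) shows one way to build the dummy-variable design matrix for this model and fit it by least squares. The 20 quarterly sales values of the exercise are not listed in this manual, so the data line is left as a commented placeholder; we also assume the series starts in quarter 1.

import numpy as np

def quarterly_design(n_quarters):
    """Columns: intercept, t, Q1, Q2, Q3 (quarter 4 is the baseline)."""
    t = np.arange(1, n_quarters + 1)
    q = (t - 1) % 4 + 1                      # quarter number 1-4
    return np.column_stack([np.ones(n_quarters), t,
                            (q == 1).astype(float),
                            (q == 2).astype(float),
                            (q == 3).astype(float)])

# Example usage (with the actual data this reproduces the printout below):
# sales = np.array([...])                    # the 20 quarterly sales index values
# X = quarterly_design(len(sales))
# beta, *_ = np.linalg.lstsq(X, sales, rcond=None)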

b.

Using MINITAB, the output is:


Regression Analysis: Sales versus t, Q1, Q2, Q3
The regression equation is
Sales = 120 + 16.5 t + 262 Q1 + 223 Q2 + 106 Q3
Predictor
Constant
t
Q1
Q2
Q3

Coef
119.85
16.512
262.34
222.83
105.51

S = 26.00

SE Coef
16.95
1.028
16.73
16.57
16.48

R-Sq = 96.9%

T
7.07
16.07
15.68
13.45
6.40

P
0.000
0.000
0.000
0.000
0.000

R-Sq(adj) = 96.1%

Analysis of Variance
Source
Regression
Residual Error
Total
Source
t
Q1
Q2
Q3


DF
1
1
1
1

DF
4
15
19

SS
318560
10139
328700

MS
79640
676

F
117.82

P
0.000

Seq SS
114343
81883
94610
27724


Predicted Values for New Observations


New Obs
1

Fit
728.95

SE Fit
16.95

95.0% CI
692.82, 765.08)

95.0% PI
662.80, 795.10)

95.0% PI
639.80, 772.10)

95.0% PI
539.00, 671.30)

95.0% PI
450.00, 582.30)

Values of Predictors for New Observations


New Obs
1

t
21.0

Q1
1.00

Q2
0.000000

Q3
0.000000

Predicted Values for New Observations


New Obs
1

Fit
705.95

SE Fit
16.95

95.0% CI
669.82, 742.08)

Values of Predictors for New Observations


New Obs
1

t
22.0

Q1
0.000000

Q2
1.00

Q3
0.000000

Predicted Values for New Observations


New Obs
1

Fit
605.15

SE Fit
16.95

95.0% CI
569.02, 641.28)

Values of Predictors for New Observations


New Obs
1

t
23.0

Q1
0.000000

Q2
0.000000

Q3
1.00

Predicted Values for New Observations


New Obs
1

Fit
516.15

SE Fit
16.95

95.0% CI
480.02, 552.28)

Values of Predictors for New Observations


New Obs
1

t
24.0

Q1
0.000000

Q2
0.000000

Q3
0.000000

The least squares equation is:


Ŷt = 119.85 + 16.512t + 262.34Q1 + 222.83Q2 + 105.51Q3

β̂1 = 16.512   For every increase in time period (1 quarter), the mean sales index increases by an estimated 16.512.
β̂2 = 262.34   The difference in mean sales index between the first and fourth quarters is estimated to be 262.34.
β̂3 = 222.83   The difference in the mean sales index between the second and fourth quarters is estimated to be 222.83.
β̂4 = 105.51   The difference in the mean sales index between the third and fourth quarters is estimated to be 105.51.

To determine if the model is useful, we test:


H0: β1 = β2 = β3 = β4 = 0
Ha: At least one βi ≠ 0, i = 1, 2, 3, 4

The test statistic is F = 117.82


Since no α is given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F-distribution with numerator df = k = 4 and denominator df = n - (k + 1) = 20 - (4 + 1) = 15. From Table IX, Appendix B, F.05 = 3.06. The rejection region is F > 3.06.

Since the observed value of the test statistic falls in the rejection region (F = 117.82 > 3.06), H0 is rejected. There is sufficient evidence to indicate the model is useful at α = .05.
c.

The assumption of independent error terms is in doubt.

d.

The forecasts and the 95% prediction intervals are found at the bottom of the printout and
are:

Year    Quarter    Forecast    95% Lower Limit    95% Upper Limit
2007    I          728.95      662.8              795.1
        II         705.95      639.8              772.1
        III        605.15      539.0              671.3
        IV         516.15      450.0              582.3

13.46

a.

d = 3.9 indicates the residuals are very strongly negatively autocorrelated.

b.

d = .2 indicates the residuals are very strongly positively autocorrelated.

c.

d = 1.99 indicates the residuals are probably uncorrelated.
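The following Python sketch (not part of the original solution) computes the Durbin-Watson statistic d from a list of residuals; the illustrative residuals in the example are made up only to show that d is near 0 for positively autocorrelated residuals and near 4 for negatively autocorrelated residuals.

def durbin_watson(residuals):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

print(round(durbin_watson([1.0, 0.9, 0.8, 0.7, 0.6]), 3))     # near 0
print(round(durbin_watson([1.0, -1.0, 1.0, -1.0, 1.0]), 3))   # near 4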

13.48

a.

To determine if the overall model contributes information for the prediction of monthly
passenger car and light truck sales, we test:
H0: β1 = β2 = β3 = β4 = β5 = 0
Ha: At least one βi ≠ 0
The test statistic is F = (R²/k) / [(1 - R²)/(n - (k + 1))] = (.856/5) / [(1 - .856)/(144 - (5 + 1))] = 164.067

The rejection region requires α = .05 in the upper tail of the F distribution with ν1 = k = 5 and ν2 = n - (k + 1) = 144 - (5 + 1) = 138. From Table IX, Appendix B, F.05 ≈ 2.29. The rejection region is F > 2.29.

Since the observed value of the test statistic falls in the rejection region (F = 164.067 > 2.29), H0 is rejected. There is sufficient evidence to indicate the overall model contributes information for the prediction of monthly passenger car and light truck sales at α = .05.
b.

To determine if positive autocorrelation is present, we test:


H0: No first-order autocorrelation
Ha: Positive first-order autocorrelation of residuals


The test statistic is d = 1.01.

For α = .05, the rejection region is d < dL,α = dL,.05 ≈ 1.57. The value dL,.05 is found in Table XIII, Appendix B, with k = 5, n = 144, and α = .05.

Since the observed value of the test statistic falls in the rejection region (d = 1.01 < 1.57), H0 is rejected. There is sufficient evidence to indicate the time series residuals are positively autocorrelated at α = .05.

13.50

c.

One of the requirements for the validity of the test in part b is that the error terms are
independent. Since H0 was rejected in part a, there is evidence that positive
autocorrelation exists. Since the error terms are not independent, the test in part b
may not be valid.

a.

There is a tendency for the residuals to have long positive runs and negative runs.
Residuals 1 through 6 are positive, while residuals 7 through 25 are negative. Residuals
26 through 35 are positive. This indicates the error terms are correlated.

b.

From the printout, the Durbin-Watson d is d = .0627.


To determine if the time series residuals are autocorrelated, we test:
H0: No first-order autocorrelation of residuals
Ha: Positive or negative first-order autocorrelation of residuals
The test statistic is d = .0627.

For α = .10, the rejection region is d < dL,α/2 = dL,.05 = 1.40 or (4 - d) < dL,.05 = 1.40. The value dL,.05 is found in Table XIII, Appendix B, with k = 1, n = 35, and α = .10.

Since the observed value of the test statistic falls in the rejection region (d = .0627 < 1.40), H0 is rejected. There is sufficient evidence to indicate the time series residuals are autocorrelated at α = .10.

c.

We must assume the residuals are normally distributed.


13.52

a.

Using MINITAB, the plot of the residuals against t is:


[Scatterplot of residuals (RESI1) versus Time omitted.]

There is not a random scattering of the residuals. The first 5 residuals are positive, the
next 6 are negative, the next one is positive, the next one is negative and the last 2 are
positive. This does not appear to be a random scattering. The plot suggests the
possibility of autocorrelation.
b.

Using MINITAB, the output is:


Regression Analysis: Price versus Time
The regression equation is
Price = 4.76 + 0.309 Time
Predictor
Constant
Time

Coef
4.7608
0.30857

S = 0.769971

SE Coef
0.4184
0.04601

R-Sq = 77.6%

T
11.38
6.71

P
0.000
0.000

R-Sq(adj) = 75.8%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
13
14

SS
26.661
7.707
34.368

MS
26.661
0.593

F
44.97

P
0.000

Unusual Observations
Obs
15

Time
15.0

Price
10.740

Fit
9.389

SE Fit
0.379

Residual
1.351

St Resid
2.01R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 1.39909


To determine if positive autocorrelation is present, we test:


H0: No first-order autocorrelation
Ha: Positive first-order autocorrelation of residuals
The test statistic is d = 1.399.

For α = .05, the rejection region is d < dL,α = dL,.05 = 1.08. The value dL,.05 is found in Table XIII, Appendix B, with k = 1, n = 15, and α = .05.

Since the observed value of the test statistic does not fall in the rejection region (d = 1.399 is not less than 1.08), H0 is not rejected. From Table XII, Appendix B, dU,α = 1.36 with k = 1, n = 15 and α = .05. Since the observed value of the test statistic falls above the upper limit (d = 1.399 > 1.36), there is insufficient evidence to indicate the time series residuals are positively autocorrelated at α = .05.

13.54

c.

Since the error terms do not appear to be dependent, the validity of the test for the model
adequacy appears to be fine.

a.

Using MINITAB, the plot of the residuals against t is:


[Scatterplot of residuals (RESI1) versus t omitted.]

Since there appear to be groups of consecutive positive and groups of consecutive


negative residuals, the data appear to be autocorrelated.


b.

Using MINITAB, the output is:


Regression Analysis: Policies versus t
The regression equation is
Policies = 385 - 0.363 t
Predictor
Constant
t

Coef
385.326
-0.3632

S = 15.0555

SE Coef
5.280
0.2632

R-Sq = 5.6%

T
72.98
-1.38

P
0.000
0.177

R-Sq(adj) = 2.7%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
32
33

SS
431.6
7253.3
7685.0

MS
431.6
226.7

F
1.90

P
0.177

Unusual Observations
Obs
1

t
1.0

Policies
355.00

Fit
384.96

SE Fit
5.05

Residual
-29.96

St Resid
-2.11R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 0.424942

To determine if positive autocorrelation is present, we test:


H0: No first-order autocorrelation
Ha: Positive first-order autocorrelation of residuals
The test statistic is d = 0.42.

For α = .05, the rejection region is d < dL,α = dL,.05 = 1.39. The value dL,.05 is found in Table XIII, Appendix B, with k = 1, n = 34, and α = .05.

Since the observed value of the test statistic falls in the rejection region (d = .42 < 1.39), H0 is rejected. There is sufficient evidence to indicate the time series residuals are positively autocorrelated at α = .05.
c.


Since the error terms do not appear to be independent, the validity of the test for model
adequacy is in question.


13.56

a.

The exponentially smoothed price for the first time period is equal to the price for that period. For the rest of the time periods, the exponentially smoothed prices are found by multiplying the price for that time period by w = .5 and adding to that (1 - .5) times the exponentially smoothed price for the preceding time period. The exponentially smoothed values for each of the price series appear in the table:

Year    Cold Finished    Exp. Smoothed    Hot Rolled    Exp. Smoothed    Galvanized    Exp. Smoothed
        Price            Value (w = .5)   Price         Value (w = .5)   Price         Value (w = .5)
1995    25.70            25.70            25.32         25.32            34.47         34.47
2000    23.08            24.39            15.67         20.50            21.38         27.93
2001    22.76            23.58            11.71         16.10            16.41         22.17
2002    23.26            23.42            16.46         16.28            22.00         22.08
2003    25.15            24.28            14.80         15.54            20.08         21.08
2004    38.67            31.48            30.84         23.19            36.69         28.89

b.

The plot of the three price series and the exponentially smoothed series are:
[Plots omitted: Cold Finished, Hot Rolled, and Galvanized price series, 1995-2004, each with its exponentially smoothed series (w = .5).]

c.

The exponential smoothing forecasts for 2005 are:


Cold Finished: F2005 = E2004 = 31.48
Hot Rolled:
F2005 = E2004 = 23.19
Galvanized:
F2005 = E2004 = 28.89
One of the main drawbacks of this kind of forecast is the inability to forecast
future values using prediction intervals.


13.58

a.

To compute the Laspeyres index, multiply the price for each year by the quantity for each
of the items for 1990, sum the products for the four items, divide by 14.05 (the sum for
the base period 1990), and multiply by 100. The Laspeyres index is:

Year    Spaghetti    Ground Beef    Eggs    Potatoes    Total    Laspeyres Index
1990    0.85         1.63           1.00    0.32        14.05    100.00
1995    0.88         1.40           1.16    0.38        13.72    97.65
2000    0.88         1.63           0.96    0.35        14.37    102.28
2004    0.95         2.14           0.98    0.51        18.68    132.95
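The following Python sketch (not part of the original solution) shows the form of the Laspeyres index computed in part a. The base-period quantities used to weight the prices come from the exercise's data set and are not reproduced here, so q0 is only a placeholder argument.

def laspeyres(prices_t, prices_0, q0):
    """100 * sum(p_t * q_0) / sum(p_0 * q_0)."""
    num = sum(p * q for p, q in zip(prices_t, q0))
    den = sum(p * q for p, q in zip(prices_0, q0))
    return 100 * num / den

# Example usage with the 1990 and 2004 prices from the table and the base-period
# quantities q0 (not listed in this manual):
p1990 = [0.85, 1.63, 1.00, 0.32]
p2004 = [0.95, 2.14, 0.98, 0.51]
# q0 = [...]
# print(round(laspeyres(p2004, p1990, q0), 2))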

b.

From 1990 to 2004, the basket of foods increased by 132.95 - 100 = 32.95%.

13.60

a.

We first calculate the exponentially smoothed values for 1980-2006.


E1 = Y1 = 56.50
E2 = .8Y2 + (1 - .8)E1 = .8(27.0) + .2(56.50) = 32.90
E3 = .8Y3 + (1 - .8)E2 = .8(38.75) + .2(32.90) = 37.58
The rest of the values appear in the table.
Year    Closing Price    Exponentially Smoothed Value (w = .8)
1980    56.50            56.50
1981    27.00            32.90
1982    38.75            37.58
1983    45.25            43.72
1984    41.75            42.14
1985    68.37            63.12
1986    45.62            49.12
1987    48.02            48.24
1988    48.01            48.06
1989    64.03            60.84
1990    45.00            48.17
1991    68.07            64.09
1992    30.03            36.84
1993    29.05            30.61
1994    32.05            31.76
1995    41.05            39.19
1996    50.75            48.44
1997    65.50            62.09
1998    49.00            51.62
1999    36.31            39.37
2000    48.44            46.63
2001    55.75            53.93
2002    40.00            42.79
2003    46.60            45.84
2004    46.65            46.49
2005    39.43            40.84
2006    43.80            43.21


The forecasts for 2007 and 2008 are:


F2007 = Ft+1 = Et = 43.21
F2008 = Ft+2 = Et = 43.21
The expected gain is F2008 - Y2006 = 43.21 - 43.80 = -.59. Since this number is negative, it is actually a loss.
b.

We first calculate the Holt-Winters values for 1980-2006.


For w = .8 and v = .5,
E2 = Y2 = 27.00
T2 = Y2 - Y1 = 27.00 - 56.50 = -29.50
E3 = .8Y3 + (1 - .8)(E2 + T2) = .8(38.75) + .2(27.00 + (-29.50)) = 30.50
T3 = .5(E3 - E2) + (1 - .5)T2 = .5(30.50 - 27.00) + .5(-29.50) = -13.00
The rest of the values appear in the table.
Year

1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006


Closing Price

56.50
27.00
38.75
45.25
41.75
68.37
45.62
48.02
48.01
64.03
45.00
68.07
30.03
29.05
32.05
41.05
50.75
65.50
49.00
36.31
48.44
55.75
40.00
46.60
46.65
39.43
43.80

Holt-Winters
w = .8
v = .5
Et
Tt

27.00 29.5
30.50 13.00
39.70 1.90
40.96 0.32
62.82 10.77
51.22 0.42
48.58 1.53
47.82 1.14
60.56
5.80
49.27 2.74
63.76
5.87
37.95 9.97
28.84 9.54
29.50 4.44
37.85
1.96
48.56
6.33
63.38 10.58
53.99
0.59
39.96 6.72
45.40 0.64
53.55
3.76
43.46 3.17
45.34 0.65
46.26
0.14
40.82 2.65
42.67 0.40


The forecasts for 2007 and 2008 are:


F2007 = Ft+1 = Et + Tt = 42.67 + (-.40) = 42.27
F2008 = Ft+2 = Et + 2Tt = 42.67 + 2(-.40) = 41.87

The expected gain is F2008 - Y2006 = 41.87 - 43.80 = -1.93. Since this number is negative, it is actually a loss.
13.62

a.

To compute the simple index for the IRA series, divide each IRA value by the 1990
value, 140, and then multiply by 100. To compute the simple index for the 401(k) series,
divide each 401(k) value by the 1990 value, 35, and then multiply by 100. The values for
the indices are in the table:

Year    IRA     IRA Simple Index    401(k)    401(k) Simple Index
1990    140     100.00              35        100.00
1994    350     250.00              184       525.71
1995    476     340.00              266       760.00
1996    598     427.14              346       988.57
1997    767     547.86              466       1331.43
1998    960     685.71              616       1760.00
1999    1234    881.43              810       2314.29
2000    1232    880.00              815       2328.57
2001    1161    829.29              794       2268.57
2002    1034    738.57              706       2017.14
2003    1307    933.57              919       2625.71
2004    1490    1064.29             1086      3102.86
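The following Python sketch (not part of the original solution) reproduces the simple index numbers in the table, using 1990 as the base period.

ira  = [140, 350, 476, 598, 767, 960, 1234, 1232, 1161, 1034, 1307, 1490]
k401 = [35, 184, 266, 346, 466, 616, 810, 815, 794, 706, 919, 1086]

def simple_index(series):
    base = series[0]
    return [round(100 * v / base, 2) for v in series]

print(simple_index(ira))    # 100.00, 250.00, 340.00, ...
print(simple_index(k401))   # 100.00, 525.71, 760.00, ...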

b.

The time series plot is:


[Plot omitted: IRA and 401(k) simple index series, 1990-2004; y-axis: Index, x-axis: Year.]


c.

Both the IRA and 401(k) funds have increased since 1990. However, the 401(k) fund has increased at a higher rate than has the IRA fund.

13.64

a.

Using MINITAB, the results from fitting the model E(Yt) = β0 + β1t are:
Regression Analysis: GDP versus t
The regression equation is
GDP = 9595 + 79.5 t
Predictor
Constant
t

Coef
9594.96
79.537

S = 97.4825

SE Coef
45.28
3.780

R-Sq = 96.1%

T
211.89
21.04

P
0.000
0.000

R-Sq(adj) = 95.9%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
18
19

SS
4206863
171051
4377914

MS
4206863
9503

F
442.70

P
0.000

Unusual Observations
Obs
1

t
1.0

GDP
9876.0

Fit
9674.5

SE Fit
42.0

Residual
201.5

St Resid
2.29R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 0.236602
Predicted Values for New Observations
New
Obs
1

Fit
11265.2

SE Fit
45.3

95% CI
(11170.1, 11360.4)

95% PI
(11039.4, 11491.1)

Values of Predictors for New Observations


New
Obs
1

t
21.0

Predicted Values for New Observations


New
Obs
1

Fit
11344.8

SE Fit
48.6

95% CI
(11242.6, 11446.9)

95% PI
(11115.9, 11573.6)

Values of Predictors for New Observations


New
Obs
1


t
22.0


Predicted Values for New Observations


New
Obs
1

Fit
11424.3

SE Fit
52.0

95% CI
(11315.0, 11533.6)

95% PI
(11192.2, 11656.5)

Values of Predictors for New Observations


New
Obs
1

t
23.0

Predicted Values for New Observations


New
Obs
1

Fit
11503.8

SE Fit
55.5

95% CI
(11387.3, 11620.4)

95% PI
(11268.2, 11739.5)X

X denotes a point that is an outlier in the predictors.


Values of Predictors for New Observations
New
Obs
1

t
24.0

The fitted regression line is: Ŷt = 9,594.96 + 79.537t


From the printout, the 2006 quarterly GDP forecasts are:

Year    Quarter    Forecast    95% Lower Limit    95% Upper Limit
2006    Q1         11,265.2    11,039.4           11,491.1
        Q2         11,344.8    11,115.9           11,573.6
        Q3         11,424.3    11,192.2           11,656.5
        Q4         11,503.8    11,268.2           11,739.5

b.

The following model is fit: E(Yt) = β0 + β1t + β2Q1 + β3Q2 + β4Q3

where Q1 = 1 if quarter 1, 0 otherwise
      Q2 = 1 if quarter 2, 0 otherwise
      Q3 = 1 if quarter 3, 0 otherwise

The MINITAB printout is:


Regression Analysis: GDP versus t, Q1, Q2, Q3
The regression equation is
GDP = 9573 + 79.8 t + 29.4 Q1 + 21.1 Q2 + 25.8 Q3
Predictor
Constant
t
Q1
Q2
Q3

Coef
9572.60
79.850
29.35
21.10
25.85

S = 105.993

SE Coef
69.10
4.190
68.20
67.56
67.17

R-Sq = 96.2%

T
138.53
19.06
0.43
0.31
0.38

P
0.000
0.000
0.673
0.759
0.706

R-Sq(adj) = 95.1%


Analysis of Variance
Source
Regression
Residual Error
Total
Source
t
Q1
Q2
Q3

DF
1
1
1
1

DF
4
15
19

SS
4209395
168519
4377914

MS
1052349
11235

F
93.67

P
0.000

Seq SS
4206863
656
212
1664

Unusual Observations
Obs
1

t
1.0

GDP
9876.0

Fit
9681.8

SE Fit
58.1

Residual
194.2

St Resid
2.19R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 0.238059
Predicted Values for New Observations
New
Obs
1

Fit
11278.8

SE Fit
69.1

95% CI
(11131.5, 11426.1)

95% PI
(11009.1, 11548.5)

Values of Predictors for New Observations


New
Obs
1

t
21.0

Q1
1.00

Q2
0.000000

Q3
0.000000

Predicted Values for New Observations


New
Obs
1

Fit
11350.4

SE Fit
69.1

95% CI
(11203.1, 11497.7)

95% PI
(11080.7, 11620.1)

Values of Predictors for New Observations


New
Obs
1

t
22.0

Q1
0.000000

Q2
1.00

Q3
0.000000

Predicted Values for New Observations


New
Obs
1

Fit
11435.0

SE Fit
69.1

95% CI
(11287.7, 11582.3)

95% PI
(11165.3, 11704.7)

Values of Predictors for New Observations


New
Obs
1

t
23.0

Q1
0.000000

Q2
0.000000

Q3
1.00

Predicted Values for New Observations


New
Obs
Fit SE Fit
95% CI
95% PI
1 11489.0
69.1 (11341.7, 11636.3) (11219.3, 11758.7)
Values of Predictors for New Observations
New
Obs
1


t
24.0

Q1
0.000000

Q2
0.000000

Q3
0.000000


The fitted regression line is:

Ŷt = 9,572.6 + 79.85t + 29.35Q1 + 21.10Q2 + 25.85Q3

To determine whether the data indicate a significant seasonal component, we test:

H0: β2 = β3 = β4 = 0
Ha: At least one βi ≠ 0, i = 2, 3, 4

The test statistic is

F = [(SSER - SSEC)/(k - g)] / [SSEC/(n - (k + 1))] = [(171,051 - 168,519)/(4 - 1)] / [168,519/(20 - (4 + 1))] = 844/11,234.6 = 0.075

Since no α is given, we will use α = .05. The rejection region requires α = .05 in the upper tail of the F-distribution with ν1 = k - g = 4 - 1 = 3 and ν2 = n - (k + 1) = 20 - (4 + 1) = 15. From Table IX, Appendix B, F.05 = 3.29. The rejection region is F > 3.29.

Since the observed value of the test statistic does not fall in the rejection region (F = .075 is not greater than 3.29), H0 is not rejected. There is insufficient evidence to indicate a seasonal component at α = .05. This supports the assertion that the data have been seasonally adjusted.
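The following Python sketch (not part of the original solution) computes the nested-model F statistic above from the two error sums of squares; the function name nested_f is ours.

def nested_f(sse_reduced, sse_complete, k, g, n):
    """F = [(SSE_R - SSE_C)/(k - g)] / [SSE_C/(n - (k + 1))]."""
    num = (sse_reduced - sse_complete) / (k - g)
    den = sse_complete / (n - (k + 1))
    return num / den

print(round(nested_f(171051, 168519, k=4, g=1, n=20), 3))   # ≈ 0.075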
c.

From the printout, the 2006 quarterly forecasts are:

Year    Quarter    Forecast    95% Lower Limit    95% Upper Limit
2006    Q1         11,278.8    11,009.1           11,548.5
        Q2         11,350.4    11,080.7           11,620.1
        Q3         11,435.0    11,165.3           11,704.7
        Q4         11,489.0    11,219.3           11,758.7

d.

To determine if the time series residuals are autocorrelated, we test:


H0: No first-order autocorrelation of residuals
Ha: Positive or negative first-order autocorrelation of residuals

The test statistic is d = 0.24.

For α = .10, the rejection region is d < dL,α/2 = dL,.05 = .90 or (4 - d) < dL,.05 = .90. The value of dL,.05 is found in Table XIII, Appendix B, with k = 4 and n = 20.

Since the observed value of the test statistic falls in the rejection region (d = 0.24 < .90), H0 is rejected. There is sufficient evidence to indicate the time series residuals are autocorrelated at α = .10.


13.66

a.

Using MINITAB, the results from fitting the model E(Yt) = β0 + β1t are:
Regression Analysis: Revolving versus t
The regression equation is
Revolving = - 84.5 + 33.8 t
Predictor
Constant
t

Coef
-84.54
33.768

S = 56.7803

SE Coef
23.41
1.575

R-Sq = 95.2%

T
-3.61
21.44

P
0.001
0.000

R-Sq(adj) = 95.0%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
23
24

SS
1482334
74152
1556486

MS
1482334
3224

F
459.78

P
0.000

Unusual Observations
Obs
1

t
1.0

Revolving
55.0

Fit
-50.8

SE Fit
22.0

Residual
105.8

St Resid
2.02R

R denotes an observation with a large standardized residual.


Predicted Values for New Observations
New
Obs
1

Fit
827.2

SE Fit
24.8

95% CI
(775.9, 878.5)

95% PI
(699.0, 955.4)

Values of Predictors for New Observations


New
Obs
1

t
27.0

Predicted Values for New Observations


New
Obs
1

Fit
861.0

SE Fit
26.2

95% CI
(806.7, 915.2)

95% PI
(731.6, 990.3)

Values of Predictors for New Observations


New
Obs
1

t
28.0

The fitted regression line is: Ŷt = -84.54 + 33.768t


For the years 2006 and 2007, t = 27 and 28. From the printout, the predicted values
and 95% prediction intervals for 2006 and 2007 are:

Year    Forecast    95% Lower Limit    95% Upper Limit
2006    827.2       699.0              955.4
2007    861.0       731.6              990.3

b.

To compute the Holt-Winters values for the years 1980-2004:


With w = .7 and v = .7,
E2 = Y2 = 61
T2 = Y2 - Y1 = 61 - 55 = 6
E3 = wY3 + (1 - w)(E2 + T2) = .7(66) + (1 - .7)(61 + 6) = 66.3
T3 = v(E3 - E2) + (1 - v)T2 = .7(66.3 - 61) + (1 - .7)(6) = 5.51
The rest of the values appear in the table:

Year
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004

Revolving
55
61
66
79
100
122
136
153
174
198
239
245
257
288
338
443
499
530
579
608
678
722
738
759
794

Holt-Winters
w = .7
v = .7
Et
Tt

61.00
66.30
76.84
95.76
118.91
137.17
153.98
173.24
196.19
232.66
250.91
261.89
284.49
327.99
419.44
497.62
543.45
584.91
614.75
669.40
720.81
748.01
765.97
792.44


6.00
5.51
9.03
15.95
20.99
19.08
17.49
18.73
21.69
32.04
22.38
14.40
20.14
36.49
74.97
77.22
55.24
45.59
34.57
48.62
50.57
34.22
22.83
25.38


Using the Holt-Winters series, the forecasts for 2006 and 2007 are:
F2006 = Ft+2 = Et + 2Tt = 792.44 + 2(25.38) = 843.20
F2007 = Ft+3 = Et + 3Tt = 792.44 + 3(25.38) = 868.58
These values are very similar to forecasts found using regression.
13.68

a.

From Example 13.4, the exponentially smoothed value for September 2005 is
80.333. The forecasts for October through December 2005 are:
F2005,Oct = Ft+1 = Et = 80.333
F2005,Nov = Ft+2 = Ft+1 = 80.333
F2005,Dec = Ft+3 = Ft+1 = 80.333
The forecast errors are the differences between the actual values and the forecasted
values. The forecast errors are:
Year         Yt+i     Ft+i      Difference
2005, Oct    81.88    80.333    1.55
2005, Nov    88.90    80.333    8.57
2005, Dec    82.20    80.333    1.87

b.

Using MINITAB, the results of fitting the model are:


Regression Analysis: IBM versus Time
The regression equation is
IBM = 95.8 - 0.740 Time
Predictor
Constant
Time

Coef
95.777
-0.7401

S = 5.79351

SE Coef
2.622
0.2088

R-Sq = 39.8%

T
36.53
-3.54

P
0.000
0.002

R-Sq(adj) = 36.6%

Analysis of Variance
Source
Regression
Residual Error
Total

DF
1
19
20

SS
421.71
637.73
1059.44

MS
421.71
33.56

F
12.56

P
0.002

Unusual Observations
Obs
12

Time
12.0

IBM
98.58

Fit
86.90

SE Fit
1.28

Residual
11.68

St Resid
2.07R

R denotes an observation with a large standardized residual.


Durbin-Watson statistic = 0.688518


Predicted Values for New Observations


New
Obs
1

Fit
79.50

SE Fit
2.62

95% CI
(74.01, 84.98)

95% PI
(66.19, 92.81)

Values of Predictors for New Observations


New
Obs
1

Time
22.0

Predicted Values for New Observations


New
Obs
1

Fit
78.76

SE Fit
2.81

95% CI
(72.88, 84.63)

95% PI
(65.28, 92.23)

Values of Predictors for New Observations


New
Obs
1

Time
23.0

Predicted Values for New Observations


New
Obs
1

Fit
78.02

SE Fit
2.99

95% CI
(71.75, 84.28)

95% PI
(64.37, 91.67)

Values of Predictors for New Observations


New
Obs
1

Time
24.0

The least squares fitted model is: Ŷt = 95.777 - .7401t

β̂0 = 95.777   The estimated stock price for IBM in December 2003 is 95.777.

β̂1 = -.7401   The estimated decrease in the value of the stock for IBM for each additional month is .7401.

c.

The approximate precision is ±2s or ±2(5.79) or ±11.58.

d.

The forecasts and prediction intervals are found at the bottom of the printout in
part b.

Year         Forecast    95% Lower Limit    95% Upper Limit
2005, Oct    79.50       66.19              92.81
2005, Nov    78.76       65.28              92.23
2005, Dec    78.02       64.37              91.67

The precision for October is approximately ±(92.81 - 66.19)/2 = ±13.31.

The precision for November is approximately ±(92.23 - 65.28)/2 = ±13.48.

The precision for December is approximately ±(91.67 - 64.37)/2 = ±13.65.

All of these are close to the ±11.58 from part c.


e.

The MAD, MAPE, and RMSE for the smoothed series are:

MAD = Σ|Yt - Ft| / m = (|81.88 - 80.33| + |88.90 - 80.33| + |82.20 - 80.33|) / 3 = 11.98 / 3 = 3.994

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|81.88 - 80.33|/81.88 + |88.90 - 80.33|/88.90 + |82.20 - 80.33|/82.20) / 3] × 100 = (.1380 / 3) × 100 = 4.599

RMSE = √[Σ(Yt - Ft)² / m] = √[((81.88 - 80.33)² + (88.90 - 80.33)² + (82.20 - 80.33)²) / 3] = √(79.2724 / 3) = 5.140

The MAD, MAPE, and RMSE for the regression model are:

MAD = Σ|Yt - Ft| / m = (|81.88 - 79.50| + |88.90 - 78.76| + |82.20 - 78.02|) / 3 = 16.70 / 3 = 5.567

MAPE = [Σ(|Yt - Ft| / Yt) / m] × 100 = [(|81.88 - 79.50|/81.88 + |88.90 - 78.76|/88.90 + |82.20 - 78.02|/82.20) / 3] × 100 = (.1940 / 3) × 100 = 6.466

RMSE = √[Σ(Yt - Ft)² / m] = √[((81.88 - 79.50)² + (88.90 - 78.76)² + (82.20 - 78.02)²) / 3] = √(125.9564 / 3) = 6.480

The values of MAD, MAPE, and RMSE for the exponentially smoothed model are all
smaller than their corresponding values for the regression model.
f.

We have to assume that the error terms are independent.

g.

To determine if positive autocorrelation is present, we test:


H0: No first-order autocorrelation of residuals
Ha: Positive first-order autocorrelation of residuals
The test statistic is d = 0.69.
The rejection region is d < dL,α = dL,.05 = 1.22. The value of dL,.05 is found in Table XIII, Appendix B, with k = 1 and n = 21.

Since the observed value of the test statistic falls in the rejection region (d = .69 < 1.22), H0 is rejected. There is sufficient evidence to indicate the time series residuals are positively autocorrelated at α = .05. Since there is evidence of positive autocorrelation, the validity of the regression model is questioned.


The Gasket Manufacturing Case


(To accompany Chapters 1213)

For this study, I constructed an R-chart and an x̄-chart for both the original data (5.1) and for the new data (5.2).

First, we will analyze the data set 5.1 (that collected under the discretion of the operator). We must compute the mean and range for each sample. The range R = largest measurement - smallest measurement. The results are listed in the table:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

0.0440
0.0438
0.0453
0.0451
0.0459
0.0449
0.0472
0.0457
0.0464
0.0451
0.0456
0.0448
0.0459
0.0456
0.0472
0.0462
0.0427
0.0431
0.0425
0.0429
0.0443
0.0443
0.0429
0.0448

Samples
0.0446
0.0425
0.0428
0.0441
0.0466
0.0471
0.0477
0.0459
0.0457
0.0447
0.0455
0.0423
0.0468
0.0471
0.0465
0.0463
0.0437
0.0448
0.0442
0.0447
0.0441
0.0423
0.0427
0.0451

0.0437
0.0443
0.0433
0.0434
0.0476
0.0451
0.0452
0.0472
0.0447
0.0457
0.0445
0.0442
0.0452
0.0450
0.0461
0.0471
0.0445
0.0429
0.0432
0.0450
0.0450
0.0447
0.0464
0.0428

x
0.0441
0.0435
0.0438
0.0442
0.0467
0.0457
0.0467
0.0463
0.0456
0.0452
0.0452
0.0438
0.0460
0.0459
0.0466
0.0465
0.0436
0.0436
0.0433
0.0442
0.0445
0.0438
0.0440
0.0442

Range
0.0009
0.0018
0.0025
0.0017
0.0017
0.0022
0.0025
0.0015
0.0017
0.0010
0.0011
0.0025
0.0016
0.0021
0.0011
0.0009
0.0018
0.0019
0.0017
0.0021
0.0009
0.0024
0.0037
0.0023

x̿ = (x̄1 + x̄2 + … + x̄24)/n = 1.0770/24 = .0449

R̄ = (R1 + R2 + … + R24)/n = .0436/24 = .0018

We now construct an R chart. From Table XVII, Appendix B, with n = 3, D3 = .000 and
D4 = 2.574.


R̄ = .0018

Upper control limit = R̄D4 = .0018(2.574) = .0046

Since D3 = 0, the lower control limit is negative and is not included on the chart.

From Table XVII, Appendix B, with n = 3, d2 = 1.693 and d3 = .888.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = .0018 + 2(.888)(.0018/1.693) = .0037
Lower A-B boundary = R̄ - 2d3(R̄/d2) = .0018 - 2(.888)(.0018/1.693) = -.0001 ≈ 0
Upper B-C boundary = R̄ + d3(R̄/d2) = .0018 + (.888)(.0018/1.693) = .0027
Lower B-C boundary = R̄ - d3(R̄/d2) = .0018 - (.888)(.0018/1.693) = .0009

The R-chart is:

To determine if the process is in control, we check the four rules.


Rule 1: One point beyond Zone A: There are no points beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points are in Zone
C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
The process appears to be in control. No rule is violated.
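The following Python sketch (not part of the original solution) computes the R-chart and x̄-chart limits used in this analysis from the summary values R̄ = .0018 and x̿ = .0449 and the control-chart constants for samples of size n = 3.

d2, d3, D4, A2 = 1.693, 0.888, 2.574, 1.023   # control-chart constants for n = 3
r_bar, x_bar_bar = 0.0018, 0.0449             # summary values for data set 5.1

# R-chart: upper control limit and zone boundaries (sigma_R estimated by d3*R_bar/d2)
sigma_r = d3 * r_bar / d2
print(round(D4 * r_bar, 4))                                       # UCL ≈ .0046
print(round(r_bar + 2 * sigma_r, 4), round(r_bar + sigma_r, 4))   # A-B ≈ .0037, B-C ≈ .0027

# x-bar chart: upper and lower control limits
print(round(x_bar_bar + A2 * r_bar, 4), round(x_bar_bar - A2 * r_bar, 4))   # ≈ .0467, .0431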
Next, we construct the x̄-chart.


Centerline = x̿ = .0449

From Table XVII, Appendix B, with n = 3, A2 = 1.023

Upper control limit = x̿ + A2R̄ = .0449 + 1.023(.0018) = .0467
Lower control limit = x̿ - A2R̄ = .0449 - 1.023(.0018) = .0431

Upper A-B boundary = x̿ + (2/3)(A2R̄) = .0449 + (2/3)(1.023)(.0018) = .0461
Lower A-B boundary = x̿ - (2/3)(A2R̄) = .0449 - (2/3)(1.023)(.0018) = .0437
Upper B-C boundary = x̿ + (1/3)(A2R̄) = .0449 + (1/3)(1.023)(.0018) = .0455
Lower B-C boundary = x̿ - (1/3)(A2R̄) = .0449 - (1/3)(1.023)(.0018) = .0443

The x̄-chart is:

To determine if the process is in or out of control, we check the six rules:


Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points is in Zone C (on one side of the centerline) or beyond.
Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: There are six groups of at least three points in Zone A or beyond (points 5-7, points 6-8, points 7-9, points 14-16, points 17-19, and points 18-20).
Rule 6: Four out of five points in a row in Zone B or beyond: There are six groups of points that satisfy this rule (points 5-9, points 6-10, points 17-21, points 18-22, points 19-23, and points 20-24).


The process appears to be out of control. Rules 5 and 6 indicate that the process is out of control.
Since the process is out of control, a capability analysis is not appropriate. However, I will include a
dot diagram which indicates that many of the actual observations are outside of the specification limits.
The dot plot is:
[Dot plot omitted: the 72 individual thickness measurements, plotted on a scale from 0.0430 to 0.0480.]

The specification limits are .043 to .047. There are 11 points below .043 and 8 above .047. Thus, 19
out of the 72 points or .264 of the points are outside of the specification limits.
This indicates that the present system, when the operator is allowed to adjust the system at his/her
discretion, is not capable of reaching the needs of the customers.
Next, we analyze the second set of data, 5.2.
First, we must compute the mean and range for each sample. The range R = largest measurement - smallest measurement. The results are listed in the table:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

0.0445
0.0435
0.0438
0.0449
0.0433
0.0455
0.0455
0.0445
0.0443
0.0449
0.0465
0.0461
0.0443
0.0456
0.0447
0.0454
0.0445
0.0438
0.0453
0.0455
0.0440
0.0444
0.0445
0.0450

Samples
0.0455
0.0453
0.0459
0.0449
0.0461
0.0454
0.0458
0.0451
0.0450
0.0448
0.0449
0.0439
0.0434
0.0459
0.0442
0.0445
0.0471
0.0445
0.0444
0.0435
0.0438
0.0450
0.0447
0.0463


0.0457
0.0450
0.0428
0.0467
0.0451
0.0461
0.0445
0.0436
0.0441
0.0467
0.0448
0.0452
0.0454
0.0452
0.0457
0.0451
0.0465
0.0472
0.0451
0.0443
0.0444
0.0467
0.0461
0.0456

x
0.0452
0.0446
0.0442
0.0455
0.0448
0.0457
0.0453
0.0444
0.0445
0.0455
0.0454
0.0451
0.0444
0.0456
0.0449
0.0450
0.0460
0.0452
0.0449
0.0444
0.0441
0.0454
0.0451
0.0456

Range
0.0012
0.0018
0.0031
0.0018
0.0028
0.0007
0.0013
0.0015
0.0009
0.0019
0.0017
0.0022
0.0020
0.0007
0.0015
0.0009
0.0026
0.0034
0.0009
0.0020
0.0006
0.0023
0.0016
0.0013


x̿ = (x̄1 + x̄2 + … + x̄24)/n = 1.0808/24 = .0450

R̄ = (R1 + R2 + … + R24)/n = .0407/24 = .0017

First, we construct an R chart. From Table XVII, Appendix B, with n = 3, D3 = .000 and D4 = 2.574.

R̄ = .0017

Upper control limit = R̄D4 = .0017(2.574) = .0044

Since D3 = 0, the lower control limit is negative and is not included on the chart.

From Table XVII, Appendix B, with n = 3, d2 = 1.693 and d3 = .888.

Upper A-B boundary = R̄ + 2d3(R̄/d2) = .0017 + 2(.888)(.0017/1.693) = .0035
Lower A-B boundary = R̄ - 2d3(R̄/d2) = .0017 - 2(.888)(.0017/1.693) = -.0001 ≈ 0
Upper B-C boundary = R̄ + d3(R̄/d2) = .0017 + (.888)(.0017/1.693) = .0026
Lower B-C boundary = R̄ - d3(R̄/d2) = .0017 - (.888)(.0017/1.693) = .0008
The R-chart is:

To determine if the process is in control, we check the four rules.


Rule 1: One point beyond Zone A: There are no points beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points are in
Zone C (on one side of the centerline) or beyond.


Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
The process appears to be in control. No rule is violated.
Next, we construct the x̄-chart.
Centerline = x̿ = .0450

From Table XVII, Appendix B, with n = 3, A2 = 1.023

Upper control limit = x̿ + A2R̄ = .0450 + 1.023(.0017) = .0467
Lower control limit = x̿ - A2R̄ = .0450 - 1.023(.0017) = .0433

Upper A-B boundary = x̿ + (2/3)(A2R̄) = .0450 + (2/3)(1.023)(.0017) = .0462
Lower A-B boundary = x̿ - (2/3)(A2R̄) = .0450 - (2/3)(1.023)(.0017) = .0438
Upper B-C boundary = x̿ + (1/3)(A2R̄) = .0450 + (1/3)(1.023)(.0017) = .0456
Lower B-C boundary = x̿ - (1/3)(A2R̄) = .0450 - (1/3)(1.023)(.0017) = .0444

The x̄-chart is:

To determine if the process is in or out of control, we check the six rules:


Rule 1: One point beyond Zone A: No points are beyond Zone A.
Rule 2: Nine points in a row in Zone C or beyond: No sequence of nine points are in
Zone C (on one side of the centerline) or beyond.


Rule 3: Six points in a row steadily increasing or decreasing: This pattern is not present.
Rule 4: Fourteen points in a row alternating up and down: This pattern does not exist.
Rule 5: Two out of three points in Zone A or beyond: This pattern does not exist.
Rule 6: Four out of five points in a row in Zone B or beyond: This pattern does not exist.

The process appears to be in control. No rules are violated. Since the process is in control, we will
perform a capability analysis to see if the process can meet the customer's demand. I will include a dot
diagram which indicates that many of the actual observations are outside of the specification limits.
The dot plot is:
[MINITAB dot plot of the 72 gasket thickness measurements, scaled from about 0.0432 to 0.0472]

The specification limits are .043 to .047. There is one point below .043 and two points above .047.
Thus, 3 out of the 72 points or .042 of the points are outside of the specification limits. This indicates
that the present system, when the operator does not adjust the system at his/her discretion, might be able
to meet the needs of the customers.
We will also compute the capability index. The capability index is defined as the ratio of the
specification limits to 6 standard deviations or:

Cp = (upper specification limit - lower specification limit) / (6σ)

Since σ is not known, we will estimate it with s. In this case, s = .00095. The capability index is:

Cp = (.047 - .043) / [6(.00095)] = .702

Since the capability index is less than 1, it indicates that the process is not capable of meeting the
customer's needs. Even though this process (operator does not make adjustments) is in control, it is not
capable of meeting the needs of the customers.
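A quick numerical check of the capability index, a minimal sketch assuming the sample standard deviation s = .00095 reported above is used in place of σ:

# Sketch: process capability index Cp, with s substituted for the unknown sigma.
usl, lsl = 0.047, 0.043   # specification limits for gasket thickness
s = 0.00095               # sample standard deviation of the 72 measurements (from the text)

cp = (usl - lsl) / (6 * s)
print(round(cp, 3))        # about 0.702; Cp < 1 suggests the process cannot meet the spec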
In conclusion, it appears that the engineers are correct: the present equipment is not capable of
producing gasket material within the necessary limits.


Nonparametric Statistics

Chapter 14

14.2

a.

Since the normal distribution is symmetric, the probability that a randomly selected
observation exceeds the mean of a normal distribution is .5.

b.

By the definition of "median," the probability that a randomly selected observation


exceeds the median of a normal distribution is .5.

c.

If the distribution is not normal, the probability that a randomly selected observation
exceeds the mean depends on the distribution. With the information given, the
probability cannot be determined.

d.

By definition of "median," the probability that a randomly selected observation exceeds


the median of a non-normal distribution is .5.

14.4

a.

H0: η = 9
Ha: η > 9
The test statistic is S = {Number of observations greater than 9} = 7.
The p-value = P(x ≥ 7) where x is a binomial random variable with n = 10 and p = .5.
From Table II,
p-value = P(x ≥ 7) = 1 - P(x ≤ 6) = 1 - .828 = .172
Since the p-value = .172 > α = .05, H0 is not rejected. There is insufficient evidence to
indicate the median is greater than 9 at α = .05.
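The binomial p-values used in the sign-test exercises can be reproduced directly. A minimal sketch for parts a and b, assuming SciPy is available:

from scipy.stats import binom

# Sign test for H0: eta = 9 with n = 10 usable observations.
n, p = 10, 0.5
S = 7                                          # number of observations greater than 9

p_value_one_sided = binom.sf(S - 1, n, p)      # P(x >= 7) = 1 - P(x <= 6) = .172 (part a)
p_value_two_sided = 2 * binom.sf(S - 1, n, p)  # two-sided version = .344 (part b)
print(round(p_value_one_sided, 3), round(p_value_two_sided, 3))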

b.

H0: η = 9
Ha: η ≠ 9
S1 = {Number of observations less than 9} = 3 and
S2 = {Number of observations greater than 9} = 7
The test statistic is S = larger of S1 and S2 = 7.
The p-value = 2P(x ≥ 7) where x is a binomial random variable with n = 10 and p = .5.
From Table II,
p-value = 2P(x ≥ 7) = 2(1 - P(x ≤ 6)) = 2(1 - .828) = .344
Since the p-value = .344 > α = .05, H0 is not rejected. There is insufficient evidence to
indicate the median is different than 9 at α = .05.


c.

H0: η = 20
Ha: η < 20
The test statistic is S = {Number of observations less than 20} = 9.
The p-value = P(x ≥ 9) where x is a binomial random variable with n = 10 and p = .5.
From Table II,
p-value = P(x ≥ 9) = 1 - P(x ≤ 8) = 1 - .989 = .011
Since the p-value = .011 < α = .05, H0 is rejected. There is sufficient evidence to indicate
the median is less than 20 at α = .05.

d.

H0: η = 20
Ha: η ≠ 20
S1 = {Number of observations less than 20} = 9 and
S2 = {Number of observations greater than 20} = 1
The test statistic is S = larger of S1 and S2 = 9.
The p-value = 2P(x ≥ 9) where x is a binomial random variable with n = 10 and p = .5.
From Table II,
p-value = 2P(x ≥ 9) = 2(1 - P(x ≤ 8)) = 2(1 - .989) = .022
Since the p-value = .022 < α = .05, H0 is rejected. There is sufficient evidence to indicate
the median is different than 20 at α = .05.

e.

For all parts, μ = np = 10(.5) = 5 and σ = √(npq) = √(10(.5)(.5)) = 1.581.

For part a, P(x ≥ 7) ≈ P(z ≥ ((7 - .5) - 5)/1.581) = P(z ≥ .95) = .5 - .3289 = .1711

This is close to the probability .172 in part a. The conclusion is the same.

For part b, 2P(x ≥ 7) ≈ 2P(z ≥ ((7 - .5) - 5)/1.581) = 2P(z ≥ .95) = 2(.5 - .3289) = .3422

This is close to the probability .344 in part b. The conclusion is the same.

For part c, P(x ≥ 9) ≈ P(z ≥ ((9 - .5) - 5)/1.581) = P(z ≥ 2.21) = .5 - .4864 = .0136

This is close to the probability .011 in part c. The conclusion is the same.


For part d, 2P(x ≥ 9) ≈ 2P(z ≥ ((9 - .5) - 5)/1.581) = 2P(z ≥ 2.21) = 2(.5 - .4864) = .0272

This is close to the probability .022 in part d. The conclusion is the same.
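The normal approximation in part e can be checked the same way; again a minimal sketch, assuming SciPy:

from math import sqrt
from scipy.stats import norm

# Normal approximation to the sign test (part e), with a continuity correction of .5.
n, p = 10, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))   # 5 and about 1.581

z_a = ((7 - 0.5) - mu) / sigma             # about .95 for parts a and b
z_c = ((9 - 0.5) - mu) / sigma             # about 2.21 for parts c and d
print(round(norm.sf(z_a), 4), round(2 * norm.sf(z_a), 4))   # roughly .1711 and .3422
print(round(norm.sf(z_c), 4), round(2 * norm.sf(z_c), 4))   # roughly .0136 and .0272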

f.

We must assume only that the sample is selected randomly from a continuous probability
distribution.

14.6

a.

To determine if the median amount of caffeine in Breakfast Blend coffee exceeds


300 milligrams, we test:
H0: η = 300
Ha: η > 300

b.

S=4

c.

Using Table II, Appendix B, with n = 6 and p = .5,

P(x ≥ 4) = 1 - P(x ≤ 3) = 1 - .656 = .344
d.

Since the probability in part c is greater than α = .05, H0 is not rejected. There is
insufficient evidence to indicate the median amount of caffeine in Breakfast Blend coffee
exceeds 300 milligrams at α = .05.

14.8

a.

To determine if cohesiveness will deteriorate after storage, we test:
H0: η = 0
Ha: η > 0

b.

The test statistic is S = {number of measurements greater than 0} = 13.


The p-value = P(x ≥ 13) where x is a binomial random variable with n = 20 and p = .5.
From Table II,
p-value = P(x ≥ 13) = 1 - P(x ≤ 12) = 1 - .868 = .132

c.

Since the p-value = .132 > α = .05, H0 is not rejected. There is insufficient evidence
to indicate cohesiveness will deteriorate after storage at α = .05.

14.10

a.

I would recommend the sign test because five of the sample measurements are of similar
magnitude, but the 6th is about three times as large as the others. It would be very
unlikely to observe this sample if the population were normal.

b.

To determine if the airline is meeting the requirement, we test:


H0: η = 30
Ha: η < 30


c.

The test statistic is S = number of measurements less than 30 = 5.


H0 will be rejected if the p-value < α = .01.

d.

The test statistic is S = 5.


The p-value = P(x ≥ 5) where x is a binomial random variable with n = 6 and p = .5.
From Table II,
p-value = P(x ≥ 5) = 1 - P(x ≤ 4) = 1 - .891 = .109
Since the p-value = .109 is not less than α = .01, H0 is not rejected. There is insufficient
evidence to indicate the airline is meeting the maintenance requirement at α = .01.

14.12

To determine if the median surface roughness of coated interior pipe differs from 2
micrometers, we test:
H0: η = 2
Ha: η ≠ 2
S1 = {Number of measurements < 2} = 9.
S2 = {Number of measurements > 2} = 11.
The test statistic is S = larger of S1 and S2 = 11.
The p-value = 2P(x ≥ 11) where x is a binomial random variable with n = 20 and p = .5.
From Table II, Appendix B,
p-value = 2P(x ≥ 11) = 2(1 - P(x ≤ 10)) = 2(1 - .588) = .824
Since the p-value = .824 is not less than α = .05, H0 is not rejected. There is insufficient evidence to
indicate the median surface roughness of coated interior pipe differs from 2 micrometers
at α = .05.

14.14

To determine if the distribution of A is shifted to the left of distribution B, we test:


H0: The two sampled populations have identical distributions
Ha: The probability distribution for population A is shifted to the left of population B.

The test statistic is

z = [T1 - n1(n1 + n2 + 1)/2] / √[n1n2(n1 + n2 + 1)/12]
  = [173 - 15(15 + 15 + 1)/2] / √[15(15)(15 + 15 + 1)/12] = -2.47

The rejection region requires α = .05 in the lower tail of the z-distribution. From Table IV,
z.05 = 1.645. The rejection region is z < -1.645.

Since the observed value of the test statistic falls in the rejection region (z = -2.47 < -1.645),
H0 is rejected. There is sufficient evidence to indicate the distribution of A is shifted to the left
of distribution B.
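The large-sample rank sum calculation can be verified from T1 and the sample sizes alone; a minimal sketch in Python:

from math import sqrt

# Large-sample Wilcoxon rank sum z statistic (Exercise 14.14).
T1, n1, n2 = 173, 15, 15
mean_T1 = n1 * (n1 + n2 + 1) / 2             # 232.5
sd_T1 = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # about 24.1
z = (T1 - mean_T1) / sd_T1
print(round(z, 2))   # about -2.47; compare with -1.645 for the lower-tailed test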


14.16
Sample from
Population 1
15
10
12
16
13
8

Rank
13
8.5
10.5
14
12
4.5

T1 = 62.5
a.

Sample from
Population 2
5
12
9
9
8
4
5
10

Rank
2.5
10.5
6.5
6.5
4.5
1
2.5
8.5
T2 = 42.5

H0: The two sampled populations have identical probability distributions


Ha: The probability distribution for population 1 is shifted to the left or to the right
of that for 2
The test statistic is T1 = 62.5 since sample A has the smallest number of measurements.
The null hypothesis will be rejected if T1 TL or T1 TU where TL and TU correspond to
= .05 (two-tailed), n1 = 6 and n2 = 8. From Table XV, Appendix B, TL = 29 and TU = 61.
Reject H0 if T1 29 or T1 61.
Since T1 = 62.5 61, we reject H0 and conclude there is sufficient evidence to indicate
population 1 is shifted to the left or right of population 2 at = .05.

b.

H0: The two sampled populations have identical probability distributions


Ha: The probability distribution for population 1 is shifted to the right of population 2
The test statistic remains T1 = 62.5.
The null hypothesis will be rejected if T1 TU where TU corresponds to = .05 (onetailed), n1 = 6 and n2 = 8. From Table XV, Appendix B, TU = 58.
Reject H0 if T1 58.
Since T1 = 62.5 58, we reject H0 and conclude there is sufficient evidence to indicate
population 1 is shifted to the right of population 2 at = .05.


14.18

a.

Some preliminary calculations:


Private Sector
2.58
5.05
0.05
2.10
4.30
2.25
2.50
1.94
2.33

b.

Rank
10
13
1
5
12
6
8
4
7
T1 = 66

Public Sector
5.40
2.55
9.00
10.55
1.02
5.11
12.42
1.67
3.33

Rank
15
9
16
17
2
14
18
3
11
T2 = 105

To determine if the distribution for public sector organizations is located to the right of
the distribution for private sector firms, we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution of the public sector is located to the right of that
for the private sector
The test statistic is T2 = 105.
The null hypothesis will be rejected if T2 TU where TU corresponds to = .05 (onetailed), and n1 = n2 = 9. From Table XV, Appendix B, TU = 105.
Reject H0 if T2 105.
Since T2 = 105 105, H0 is rejected. There is sufficient evidence to indicate that the
distribution in the public sector organization is located to the right of the distribution for
the private sector firms at = .05.

c.

The null hypothesis will be rejected if T2 TU where TU corresponds to = .05 (onetailed), and n1 = n2 = 9. From Table XV, Appendix B, TU = 105. Since T1 = 105, we
would reject H0. Thus, the p-value is less than or equal to = .05.

d.

The assumptions necessary for the test are:


1. The two samples are random and independent.
2. The two probability distributions from which the samples were drawn are continuous.

14.20

a.
American Purchasing
Managers
Sample 1
Rank
50
20.5
10
4.5
35
15.5
30
13.5
20
10.5
15
7.5
8
3
40
17.5
80
26.5
75
25
19
9
11
6
5
1.5
25
12
30
13.5
T1 = 186

b.

Mexican Purchasing
Managers
Sample 2
Rank
10
4.5
90
29
65
24
50
20.5
20
10.5
15
7.5
60
23
80
26.5
85
28
35
15.5
5
1.5
55
22
40
17.5
45
19
95
30
T2 = 279

To determine whether American and Mexican purchasing managers perceive the given
ethical situation differently, we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution of the American managers is shifted to the right or left
of the probability distribution of the Mexican managers.

The test statistic is

z = [T1 - n1(n1 + n2 + 1)/2] / √[n1n2(n1 + n2 + 1)/12]
  = [186 - 15(15 + 15 + 1)/2] / √[15(15)(15 + 15 + 1)/12] = -1.929

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From
Table IV, Appendix B, z.025 = 1.96. The rejection region is z < -1.96 or z > 1.96.

Since the observed value of the test statistic does not fall in the rejection region
(z = -1.929 is not less than -1.96), H0 is not rejected. There is insufficient evidence to indicate
American and Mexican purchasing managers perceive the given ethical situation
differently at α = .05.
c.

In order to use the t-test, we need to assume that the two populations being sampled from
are normal and that the variances of the two populations are equal. To check these
assumptions, we will use stem-and-leaf plots and dot plots.


The stem-and-leaf plots are:


Stem-and-leaf of Ethics
Leaf Unit = 1.0
2
6
(2)
7
4
3
2
2
1

0
1
2
3
4
5
6
7
8

0
1
2
3
4
5
6
7
8
9

= 15

Managers = 2

= 15

58
0159
05
005
0
0
5
0

Stem-and-leaf of Ethics
Leaf Unit = 1.0
1
3
4
5
7
(2)
6
4
4
2

Managers = 1

5
05
0
5
05
05
05
05
05

Neither of these two stem-and-leaf plots look mound-shaped. The assumption that the
populations are normal may not be valid.
The dot plots are:
Managers
1
.... . :

. :

. .

. .

+---------+---------+---------+---------+---------+-------Ethics
Managers
2

. .

. .

. .

. .

. .

. .

+---------+---------+---------+---------+---------+-------Ethics
0

20

40

60

80

100

The spread of the two data sets look approximately equal. The assumption that the
variances of the two populations are the same appears to be valid.


14.22

a.

Using MINITAB, histograms of the two data sets are:

Histogram of HEATRATE
9000 10000 11000 12000 13000 14000 15000 16000

Aeroderiv

20

Traditional

Frequency

15

10

9000 10000 11000 12000 13000 14000 15000 16000

HEATRATE
Panel variable: ENGINE

From the histograms, the data for each group do not look like they are mound-shaped. The variance of the aeroderivative engines is greater than that of the
traditional engines. Thus, the assumptions of normal distributions and equal
variances necessary for the t-test are probably not met.

14.24

b.

The p-value = .3431. Since this p-value is not small, H0 is not rejected. There is no
evidence to indicate that the heat rate distribution of the traditional turbine engines is
shifted to the right or left of that for the aeroderivative turbine engines.

a.

We first rank all the data:


Firms with
Successful MIS (1)
Score
Rank
Score
52
5
90
70
15
75
40
1.5
80
80
19
95
82
21
90
65
12.5
86
59
9
95
60
10.5
93

T1 = 290.5


Rank
25.5
17
19
29.5
25.5
23
29.5
28

Firms with
Unsuccessful MIS (2)
Score
Rank Score
Rank
60
10.5
65
12.5
50
4
55
7
55
7
70
15
70
15
90
25.5
41
3
85
22
40
1.5
80
19
55
7
90
25.5

T2 = 174.5


To determine whether the distribution of quality scores for the successfully implemented
systems differs from that for the unsuccessfully implemented systems, we test:
H0: The two sampled distributions are identical
Ha: The probability distribution for the successful MIS is shifted to the right or left of
that for the unsuccessful MIS

The test statistic is

z = [T1 - n1(n1 + n2 + 1)/2] / √[n1n2(n1 + n2 + 1)/12]
  = [290.5 - 16(16 + 14 + 1)/2] / √[16(14)(16 + 14 + 1)/12] = 1.767

The rejection region requires α/2 = .05/2 = .025 in each tail of the z-distribution. From
Table IV, Appendix B, z.025 = 1.96. The rejection region is z < -1.96 or z > 1.96.
Since the observed value of the test statistic does not fall in the rejection region
(z = 1.767 is not greater than 1.96), H0 is not rejected. There is insufficient evidence to indicate the
distribution of quality scores for the successfully implemented systems differs from that
for the unsuccessfully implemented systems at α = .05.
b.

We could use the two-sample t-test if:


1. Both populations are normal.
2. The variances of the two populations are the same.

14.26

a.

The test statistic is T or T+, the smaller of the two.


The rejection region is T 152, from Table XVI, Appendix B, with n = 30, = .10, and
two-tailed.

b.

The test statistic is T.


The rejection region is T 60, from Table XVI, Appendix B, with n = 20, = .05, and
one-tailed.

c.

The test statistic is T+.


The rejection region is T+ 0, from Table XVI, Appendix B, with n = 8, = .005, and
one-tailed.

14.28

a.

The rejection region requires α = .05 in the upper tail of the z-distribution. From Table
IV, Appendix B, z.05 = 1.645. The rejection region is z > 1.645.

b.

The large-sample test statistic is

z = [T+ - n(n + 1)/4] / √[n(n + 1)(2n + 1)/24] = [273 - 25(26)/4] / √[25(26)(51)/24] = 2.97

Since the observed value of the test statistic falls in the rejection region (z = 2.97 >
1.645), H0 is rejected. There is sufficient evidence to indicate that the responses for A
tend to be larger than those for B at α = .05.
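A quick numerical check of the large-sample signed rank statistic, a minimal sketch assuming only T+ and n are known:

from math import sqrt

# Large-sample Wilcoxon signed rank z statistic (Exercise 14.28).
T_plus, n = 273, 25
mean_T = n * (n + 1) / 4                       # 162.5
sd_T = sqrt(n * (n + 1) * (2 * n + 1) / 24)    # about 37.17
z = (T_plus - mean_T) / sd_T
print(round(z, 2))   # about 2.97, which exceeds 1.645, so H0 is rejected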


c.

p-value = P(z 2.97) = .5 P(0 < z < 2.97)


= .5 .4985
= .0015 (from Table IV, Appendix B)
Thus, we can reject H0 for any preselected greater than .0015.

14.30

a.

To determine if the chest injury ratings of drivers and front-seat passengers differ,
we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution of drivers is shifted to the right or left of that for
front-seat passengers

b.

Using MINITAB, the results are:


Wilcoxon Signed Rank Test: Diff
Test of median = 0.000000 versus median not = 0.000000

Diff

N
18

N for
Test
16

Wilcoxon
Statistic
23.0

P
0.021

Estimated
Median
-4.000

From the printout, the test statistic is T+ = 23.

c.

The rejection region is T+ To where To corresponds to = .01 (two-tailed) and n = 16.


From Table XVI, Appendix B, To = 19. The rejection region is T+ 19.

d.

Since the observed value of the test statistic does not fall in the rejection region
(T+ = 23 / 19), H0 is not rejected. There is insufficient evidence to indicate the chest
injury ratings of drivers and front-seat passengers differ at = .01.
From the printout, the p-value is p = .021.

14.32

Some preliminary calculations:


Theme

Tourism
Physical
Transportation
People
History
Climate
Forestry
Agriculture
Fishing
Energy
Mining
Manufacturing


High School
Teachers
10
2
7
1
2
6
5
7
9
2
10
12

Geography
Alumni
2
1
3
6
5
4
8
10
7
8
11
12

Difference
Rank of Absolute
T-A
Differences
8
11
1
1.5
4
8
9
5
6
3
2
3.5
6
3
6
3
2
3.5
10
6
1.5
1
0
(eliminated)
Positive rank sum T+ = 27.5


To determine if the distributions of theme rankings for the two groups differ, we test:
H0: The probability distributions for the two populations are identical
Ha: The probability distribution of the high school teachers is shifted to the right or left
of the probability distribution of the geography alumni
The test statistic is T+ = 27.5.
Reject H0 if T+ T0 where T0 is based on = .05 and n = 11 (two-tailed):
Reject H0 if T+ 11 (from Table XVI, Appendix B)
Since the observed value of the test statistic does not fall in the rejection region (T+ = 27.5 /
11), H0 is not rejected. There is insufficient evidence to indicate that the distributions of these
rankings for the two groups differ at = .05. Practically, this means that the thematic content
of a new atlas could be based on the views of either educators or geography alumni.
14.34

Some preliminary calculations are:

Employee
1
2
3
4
5
6
7
8
9
10

Before
Flextime
54
25
80
76
63
82
94
72
33
90

After
Flextime
68
42
80
91
70
88
90
81
39
93

Difference
(B A)
4
17
0
15
7
6
4
9
6
3

Difference
7
9
(Eliminated)
8
5
3.5
2
6
3.5
1
T+ = 2

To determine if the pilot flextime program is a success, we test:


H0: The two probability distributions are identical
Ha: The probability distribution before is shifted to the left of that after
The test statistic is T+ = 2.
The rejection region is T+ 8, from Table XVI, Appendix B, with n = 9 and = .05.
Since the observed value of the test statistic falls in the rejection region (T+ = 2 8), H0 is
rejected. There is sufficient evidence to indicate the pilot flextime program has been a success
at = .05.


14.36

Some preliminary calculations are:

Science
0
4
3
1
3
2
4
2
3
4

Math
2
3
0
1
1
3
0
1
1
1

Rank of
Difference
Absolute
ScienceDifference
Math
2
5
1
2
3
7.5
0
eliminate
2
5
1
2
4
9
1
2
2
5
3
7.5
Negative rank sum T_ = 7
Positive rank sum T+ = 38

To determine if there are differences in the levels of family involvement between math
and science homework, we test:
H0: The distributions of the science and math levels of family involvement are the
same
Ha: The distributions of the science and math levels of family involvement differ
The test statistic is T_ = 7.
The rejection region is T_ To where To corresponds to = .05 (two-tailed) and n = 9.
From Table XVI, Appendix B, To = 6. The rejection region is T_ 6.
Since the observed value of the test statistic does not fall in the rejection region
(T_ = 7 / 6), H0 is not rejected. There is insufficient evidence to indicate there are
differences in the levels of family involvement between math and science homework at
= .05.
14.38

a.

The hypotheses are:


H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location

b.

The test statistic is:


H = [12/(n(n + 1))] Σ(Rj²/nj) - 3(n + 1)
  = [12/(45(46))](230²/15 + 440²/15 + 365²/15) - 3(46)
  = 146.754 - 138 = 8.754

The rejection region requires α = .05 in the upper tail of the χ² distribution with
df = p - 1 = 3 - 1 = 2. From Table VII, Appendix B, χ².05 = 5.99147. The rejection
region is H > 5.99147.

Since the observed value of the test statistic falls in the rejection region (H = 8.754 >
5.99147), H0 is rejected. There is sufficient evidence to indicate that the probability
distributions of at least two of the populations A, B, and C differ in location at α = .05.
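The H statistic can be reproduced from the rank sums; a minimal sketch in Python:

# Kruskal-Wallis H statistic from the rank sums (Exercise 14.38).
rank_sums = [230, 440, 365]
n_j = [15, 15, 15]
n = sum(n_j)

H = 12 / (n * (n + 1)) * sum(R**2 / nj for R, nj in zip(rank_sums, n_j)) - 3 * (n + 1)
print(round(H, 3))   # about 8.754, which exceeds the chi-square critical value 5.99147 (2 df, alpha = .05)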
c.

d.

14.40

a.

The approximate p-value is P(2 8.754). From Table VII, Appendix B, with df = 2,
.01 P(2 8.754) .025.
RB 440
R A = 230
=
= 29.333
= 15.333
RB =
15 15
15 15
RC 365
n + 1 45 + 1
=
= 24.333
=
= 23
R =
RC =
15 15
2
2
12
H=
n j ( R j R )2
n(n + 1)
12
=
15(15.333 23) 2 + 15(29.333 23) 2 + 15(24.333 23) 2 = 8.754

45(46)
In order to compare the three population means using parametric techniques, we must
assume that all populations being sampled from are normal and all population variances
are the same. It is quite possible that these two conditions are not met with this data.
RA =

b.

Since we want to compare 3 groups, we will use the Kruskal-Wallis test.

c.

The test statistic is


H=

R 2j
53352 3937 2 37692
12
12

+
=
+
+
3(
1)
n

n
n(n + 1)
161(161 + 1) 67
57
37
j

3(161 + 1)

= 11.201

14.42

d.

Since the p-value is so small (p = .0037), H0 will be rejected. There is sufficient


evidence to indicate DEF distributions differ for the 3 tax litigation forums for > .0037.

a.

To determine if the distributions of office rental growth rates differ among the four
market cycle phases, we test:
H0: The four probability distributions are identical
Ha: At least two of the growth rate distributions differ


b.
Phase I
2.7
1.0
1.1
3.4
4.2
3.5

14.44

The ranks of the measurements are:


Rank
9
4.5
6
10
12
11
R1 = 52.5

Phase II
10.5
11.5
9.4
12.2
8.6
10.9

Rank
20
23
19
24
18
21
R2 = 125

Phase III
6.1
1.2
11.4
4.4
6.2
7.6

Rank
14
7
22
13
15.5
17
R3 = 88.5

Phase IV
1.0
6.2
10.8
2.0
1.1
2.3

Rank
4.5
15.5
1
8
3
2
R4 = 34

c.

The rank sums appear in the table above. The test statistic is:
R 2j
52.52 1252 88.52 342
12
12

+
=
+
+
+
3(
1)
H=
n

3(24 + 1)
n
n( n + 1)
24(24 + 1) 6
6
6
6
j
= 16.23

d.

The rejection region requires = .05 in the upper tail of the 2 distribution with df =
2
p 1 = 4 1 = 3. From Table VII, Appendix B, .05
= 7.81473. The rejection region is
H > 7.81473.

e.

Since the observed value of the test statistic falls in the rejection region
(H = 16.23 > 7.81473), H0 is rejected. There is sufficient evidence to indicate the
distributions of office rental growth rates differ among the four market cycle phases at
= .05.

Some preliminary calculations are:


Aromatics
1.06
0.79
0.82
0.89
1.05
0.95
0.65
1.15
1.12

Ranks
26
19
20
22
25
24
18
29
27.5

R1 = 210.5


Chloroalkanes
1.58
1.45
0.57
1.16
1.12
0.91
0.83
0.43

Ranks
32
31
15
30
27.5
23
21
9.5

R2 = 189

Esters
0.29
0.06
0.44
0.61
0.55
0.43
0.51
0.10
0.34
0.53
0.06
0.09
0.17
0.60
0.17

Ranks
7
1.5
11
17
14
9.5
12
4
8
13
1.5
3
5.5
16
5.5
R3 = 128.5


To determine if the sorption rate distributions differ among the three solvents, we test:
H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location

The test statistic is


R 2j
210.52 1892 128.52
12
12
3(n + 1) =
+
+
H=

3(32 + 1)

n( n + 1)
nj
32(32 + 1) 9
8
15
= 20.197

The rejection region requires = .01 in the upper tail of the 2 distribution with df = p 1 = 3
2
1 = 2. From Table VII, Appendix B, .01
= 9.21034. The rejection region is
H > 9.21034.
Since the observed value of the test statistic falls in the rejection region (H = 20.197 >
9.21034), H0 is rejected. There is sufficient evidence to indicate the sorption rate distributions
differ among the three solvents at = .01.
14.46

a.

The F-test would be appropriate if:


1. All p populations sampled from are normal.
2. The variances of the p populations are equal.
3. The p samples are independent.

b.

The variances for the three populations are probably not the same and the populations are
probably not normal.

c.

To determine whether the salary distributions differ among the three cities, we test:
H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location

Some preliminary calculations are:


1
Atlanta
34,600
84,900
61,700
38,900
77,200
83,600
59,800


Rank
1
19
11
3
17
18
10
R1 = 79

2
Los Angeles
42,400
135,000
63,000
43,700
69,400
97,000
49,500

Rank
4
21
12
5
13
20
7
R2 = 82

3
Washington, D.C.
38,000
76,900
48,000
72,600
73,200
51,800
55,000

Rank
2
16
6
14
15
8
9
R3 = 70


The test statistic is H =


=

2
12
Rj
3(n + 1)

n( n + 1)
nj

12 79 2 82 2 70 2
+
+

3(22) = 66.2894 66 = .2894


21(22) 7
7
7

The rejection region requires = .05 in the upper tail of the 2 distribution with df = p
2
1 = 3 1 = 2. From Table VII, Appendix B, .05
= 5.99147. The rejection region is
H > 5.99147.
Since the observed value of the test statistic does not fall in the rejection region (H =
.2894 >/ 5.99147), H0 is not rejected. There is insufficient evidence to indicate the salary
distributions differ among the three cities at = .05.
We must assume we have independent random samples, sample sizes greater than or
equal to 5 from each population, and that all populations are continuous.
14.48

a.

The hypotheses are:


H0: The probability distributions for three treatments are identical
Ha: At least two of the probability distributions differ in location

b.

The rejection region requires = .10 in the upper tail of the 2 distribution with df =
2
p 1 = 3 1 = 2. From Table VII, Appendix B, .10
= 4.60517. The rejection region is
Fr > 4.60517.

c.

Some preliminary calculations are:


Block
1
2
3
4
5
6
7

9
13
11
10
9
14
10

Rank

1
2
1
1
2
2
1
RA = 10

B
11
13
12
15
8
12
12

Rank
2
2
2.5
2
1
1
2
RB = 12.5

C
18
13
12
16
10
16
15

Rank
3
2
2.5
3
3
3
3
RC = 19.5

The test statistic is

Fr = [12/(bp(p + 1))] ΣRj² - 3b(p + 1) = [12/(7(3)(4))](10² + 12.5² + 19.5²) - 3(7)(4)
   = 90.9286 - 84 = 6.9286

Since the observed value of the test statistic falls in the rejection region (Fr = 6.9286
> 4.60517), H0 is rejected. There is sufficient evidence to indicate that the effectiveness of the
three different treatments differs at α = .10.
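The Fr statistic can be checked from the treatment rank sums; a minimal sketch:

# Friedman Fr statistic from the treatment rank sums (Exercise 14.48).
rank_sums = [10, 12.5, 19.5]
b, p = 7, 3          # 7 blocks, 3 treatments

Fr = 12 / (b * p * (p + 1)) * sum(R**2 for R in rank_sums) - 3 * b * (p + 1)
print(round(Fr, 4))   # about 6.9286, which exceeds the chi-square critical value 4.60517 (2 df, alpha = .10)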


14.50

a.

The Friedman test statistic is Fr =


=

14.52

12
R 2j 3b( p + 1)
bp ( p + 1)

12
(27 2 + 252 + 182 + 112 + 92 ) 3(6)(5 + 1) = 17.333
6(5)(5 + 1)

b.

The rejection region requires = .05 in the upper tail of the 2 distribution with df =
2
p 1 = 5 1 = 4. From Table VII, Appendix B, .05
= 9.48773. The rejection region is
Fr > 9.48773.

c.

Since the observed value of the test statistic falls in the rejection region
(Fr = 17.333 > 9.48773), H0 is rejected. There is sufficient evidence to indicate there is
a difference in the levels of farm production among the five conditions at = .05.

a.

To determine if the distributions of rotary oil rigs differ among the three states, we test:
H0: The probability distributions of the rotary oil rigs for the 3 states are the same
Ha: At least two of the probability distributions of rotary oil rigs differ in location

b.

The ranked data are:


Month/Year
Nov. 2000
Oct. 2001
Nov. 2001

c.

Utah
2
2
2
R2 = 6

Alaska
1
1
1
R3 = 3

The test statistic is


Fr =


California
3
3
3
R1 = 9

12
12
92 + 62 + 32 3(3)(3 + 1) = 6
R 2j 3b( p + 1) =

3(3)(3 + 1)
bp ( p + 1)

d.

The rejection region requires = .05 in the upper tail of the 2 distribution with df =
2
p 1 = 3 1 = 2. From Table VII, Appendix B, .05
= 5.99147. The rejection region is
H > 5.99147.

e.

Since the observed value of the test statistic falls in the rejection region (H = 6 > 5.99147),
H0 is rejected. There is sufficient evidence to indicate the distributions of rotary oil rigs
differ among the three states at = .05.


14.54

Some preliminary calculations are:

Location
Anguilla
Antigua
Dominica
Guyana
Jamaica
St. Lucia
Suriname

Temephos Rank
4.6
5
9.2
5
7.8
5
1.7
2
3.4
3
6.7
4
1.4
1
R1 = 13

Malsathion
Rank
1.2
1
2.9
3
1.4
1
1.9
4
3.7
4
2.7
1.5
1.9
3
R2 = 15

Fenitrothion
Rank
1.5
2.5
2.0
1.5
2.4
2
2.2
5
2.0
2
2.7
1.5
2.0
4
R3 = 18.5

Fenthion Rank
1.8
4
7.0
4
4.2
4
1.5
1
1.5
1
4.8
3
2.1
5
R4 = 22

Chlorpyrifos Rank
1.5
2.5
2.0
1.5
4.1
3
1.8
3
7.1
5
8.7
5
1.7
2
R5 = 22

To determine if the resistance ratio distributions of the 5 insecticides differ, we test:


H0: The distributions of the 5 insecticide ratios are the same
Ha: At least two of the distributions of insecticide ratios differ
12
R 2j 3b( p + 1)

bp ( p + 1)
12
(252 + 17.52 + 18.52 + 222 + 222 ) 3(7)(5 + 1) = 2.086
=
7(5)(5 + 1)

The test statistic is Fr =

Since no α was given, we will use α = .05. The rejection region requires α = .05 in the upper
2
tail of the 2 distribution with df = p 1 = 5 1 = 4. From Table VII, Appendix B, .05
=
9.48773. The rejection region is Fr > 9.48773.
Since the observed value of the test statistic does not fall in the rejection region
(Fr = 2.086 >/ 9.48773), H0 is not rejected. There is insufficient evidence to indicate that the
resistance ratio distributions of the 5 insecticides differ at = .05.
14.56

Some preliminary calculations are:

Week
1
2
3
4
5
6
7
8
9

Monday
5
5
2.5
2
5
4
5
4
1
R1 = 33.5

Tuesday
1
4
2.5
1
1
2
3.5
2
2
R2 = 19

Wednesday
4
3
5
3.5
2
3
1.5
1
5
R3 = 28

Thursday
2
1
1
5
3
1
3.5
3
3
R1 = 22.5

Friday
3
2
4
3.5
4
5
1.5
5
4
R2 = 32

To determine if the distributions of days of the weeks differ, we test:


H0: The probability distributions of the 5 days of the week are the same
Ha: At least two of the probability distributions of the 5 days of the week differ in
location


The test statistic is


12
R 2j 3b( p + 1)

bp ( p + 1)
12
33.52 + 192 + 282 + 22.52 + 322 3(9)(5 + 1) = 6.778
=
9(5)(5 + 1)

Fr =

Since no α was given we will use α = .05. The rejection region requires α = .05 in the upper
2
tail of the 2 distribution with df = p 1 = 5 1 = 4. From Table VII, Appendix B, .05
=
9.48773. The rejection region is H > 9.48773.
Since the observed value of the test statistic does not fall in the rejection region
(H = 6.778 >/ 9.48773), H0 is not rejected. There is insufficient evidence to indicate the
distributions of the absentee rate for the days of the weeks differ at = .05.
14.58

14.60

a.

From Table XVII with n = 10, rs,/2 = rs,.025 = .648. The rejection region is rs > .648 or
rs < .648.

b.

From Table XVII with n = 20, rs, = rs,.025 = .450. The rejection region is rs > .450.

c.

From Table XVII with n = 30, rs, = rs,.01 = .432. The rejection region is rs < .432.

a.

H0: ρs = 0
Ha: ρs ≠ 0

b.

The test statistic is rs =

x
0
3
0
4
3
0
4


Rank, u
3
5.5
3
1
5.5
3
7
u = 28

SSuv =

uv

SSuu =

SSvv =

SSuv
SSuuSSvv
y
0
2
2
0
3
1
2

Rank, v
1.5
5
5
1.5
7
3
5
v = 28

( u )( v ) = 131 28(28)
n

(u )

( v)

= 137.5

(20) 2
7

= 137.5

(20) 2
7

u2
9
30.25
9
1
30.25
9
49
u 2 = 137.5

v2
2.25
25
25
2.25
49
9
25
v 2 = 137.5

uv
45
27.5
15
1.5
38.5
9
35
uv = 131

= 19


rs =

19
= .745
25.5(25.5)
Reject H0 if rs < rs,/2 or rs > rs,/2 where /2 = .025 and n = 7:
Reject H0 if rs < .786 or rs > .786 (from Table XVII, Appendix B).

Since the observed value of the test statistic does not fall in the rejection region, (rs = .745
>/ .786), do not reject H0. There is insufficient evidence to indicate x and y are correlated
at = .05.

14.62

c.

The p-value is P(rs .745) + P(rs .745). For n = 7, rs = .745 is above rs,.025 where /2 =
.025 and below rs,.05 where /2 = .05. Therefore, 2(.025) = .05 < p-value < 2(.05) = .10.

d.

The assumptions of the test are that the samples are randomly selected and the probability
distributions of the two variables are continuous.

a.

Some preliminary calculations are:


Brand    Expert 1    Expert 2    Difference di    di²
A           6           5             1            1
B           5           6            -1            1
C           1           2            -1            1
D           3           1             2            4
E           2           4            -2            4
F           4           3             1            1
                                           Σdi² = 12

rs = 1 - 6Σdi²/[n(n² - 1)] = 1 - 6(12)/[6(6² - 1)] = 1 - .343 = .657
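The shortcut formula lends itself to a quick numerical check; a minimal sketch using the two experts' rankings above:

# Spearman rank correlation via the shortcut formula (Exercise 14.62a).
expert1 = [6, 5, 1, 3, 2, 4]   # brand rankings by expert 1 (brands A through F)
expert2 = [5, 6, 2, 1, 4, 3]   # brand rankings by expert 2
n = len(expert1)

d_sq = sum((u - v) ** 2 for u, v in zip(expert1, expert2))   # sum of squared rank differences = 12
r_s = 1 - 6 * d_sq / (n * (n ** 2 - 1))
print(round(r_s, 3))   # 0.657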

b.

To determine if there is a positive correlation in the rankings of the two experts, we test:
H0: ρs = 0
Ha: ρs > 0
The test statistic is rs = .657.
Reject H0 if rs > rs, where = .05 and n = 6. From Table XVII, Appendix B,
rs,.01 = .829. Reject H0 if rs > .829.
Since the observed value of the test statistic does not fall in the rejection region
(rs = .657 >/ .829), H0 is not rejected. There is insufficient evidence to indicate a
positive correlation in the rankings of the two experts at = .05.


14.64

a.

Some preliminary calculations are:


x
u
y
v
5.2
1
220
4.5
5.5
7
227
7.5
6.0
23.5
259
15.5
5.9
20.5
210
1
5.8
16
224
6
6.0
23.5
215
3
5.8
16
231
9
5.6
10
268
19
5.6
10
239
11
5.9
20.5
212
2
5.4
5
410
24
5.6
10
256
14
5.8
16
306
22
5.5
7
259
15.5
5.3
3
284
21
5.3
3
383
23
5.7
12.5
271
20
5.5
7
264
18
5.7
12.5
227
7.5
5.3
3
263
17
5.9
20.5
232
10
5.8
16
220
4.5
5.8
16
246
13
5.9
20.5
241
12
u =300
v = 300
SSuv =

uv

SSuu =

SSvv =

rs =

u-sq
1
49
552.25
420.25
256
552.25
256
100
100
420.25
25
100
256
49
9
9
156.25
49
156.25
9
420.25
256
256
420.25
2
u =4878

( u )( v ) = 3197.5 300(300)
n

(u )

(v)

SSuv
SSuuSSvv

24

= 4878

v-sq
20.25
56.25
240.25
1
36
9
81
361
121
4
576
196
484
240.25
441
529
400
324
56.25
289
100
20.25
169
144
2
v =4898.5

uv
4.5
52.5
364.25
20.5
96
70.5
144
190
110
41
120
140
352
108.5
63
69
250
126
93.75
51
205
72
208
246
uv =3197.5

= 552.5

3002
= 1128
24

= 4898.5
552.5

1128(1148.5)

3002
= 1148.5
24
= .4854

Since the magnitude of the correlation coefficient is not particularly large, there is a fairly
weak negative relationship between sweetness index and pectin.


b.

To determine if there is a negative association between the sweetness index and the
amount of pectin, we test:
H0: s = 0
Ha: s < 0
The test statistic is rs = .4854
Reject H0 if rs < rs, where = .01 and n = 24.
Reject H0 if rs < .485 (from Table XVII, Appendix B)
Since the observed value of the test statistic falls in the rejection region
(rs = .4854 < .485), H0 is rejected. There is sufficient evidence to indicate there is a
negative association between the sweetness index and the amount of pectin at = .01.

14.66

a.

Some preliminary calculations are:


Parent
643
381
342
251
216
208
192
141
131
128
124

Rank, u
11
10
9
8
7
6
5
4
3
2
1

rs = 1

Subsid
2,617
1,724
1,867
1,238
890
681
1,534
899
492
579
672

6 di2
n( n 1)
2

=1

Rank, v
11
9
10
7
5
4
8
6
1
2
3

Difference di
0
1
-1
1
2
2
-3
-2
2
0
-2

di2
0
1
1
1
4
4
9
4
4
0
4
2
di = 32

6(32)
= 1 .145 = .855
11(112 1)

Since this correlation coefficient is fairly close to 1, it indicates that there is a


relatively strong positive relationship between the number of parent companies and
the number of subsidiaries.
To determine if the number of parent companies is positively related to the number of
subsidiaries, we test:
H0: s = 0
Ha: s > 0
The test statistic is rs = .855.


From Table XVI, Appendix B, rs,.05 = .523, with n = 11. The rejection region is
rs > .523.
Since the observed value of the test statistic falls in the rejection region
(rs = .855 > .523), H0 is rejected. There is sufficient evidence to indicate that the
number of parent companies is positively related to the number of subsidiaries at
= .05.
b.

We must assume:
1. The sample is randomly selected.
2. The probability distributions of both of the variables are continuous.
The actual number of companies and subsidiaries are not continuous. However,
since the numbers of companies/subsidiaries are very large, this assumption is
basically met. From the information given, we cannot tell whether the sample was
random or not.

14.68

b.

Some preliminary calculations:

Involvement

1
2
3
4
5
6
7
8
9
10
11

rs = 1

6 d i2
n(n 1)
2

ui

vi

Differences
di = ui vi

8
6
10
2
5
9
1
4
7
11
3

9
7
10
1
5
8
2
4
6
11
3

1
1
0
1
0
1
1
0
1
0
0

=1

d i2

di2

1
1
0
1
0
1
1
0
1
0
1
=6

6(6)
= .972
11(112 1)

To determine if a positive relationship exists between participation rates and cost savings
rates, we test:
H0: s = 0
Ha: s > 0
The test statistic is rs = .972.
From Table XVII, Appendix B, rs,.01 = .736, with n = 11. The rejection region is
rs > .736.


Since the observed value of the test statistic falls in the rejection region (rs = .972 >
.736), H0 is rejected. There is sufficient evidence to indicate that a positive relationship
exists between participation rates and cost savings rates at = .01.
c.

In order for the above test to be valid, we must assume:


1.
2.

The sample is randomly selected.


The probability distributions of both of the variables are continuous.

In order to use the Pearson correlation coefficient, we must assume that both populations
are normally distributed. It is very unlikely that the data are normally distributed.
14.70

The appropriate test for this completely randomized design is the Kruskal-Wallis H-test. Some
preliminary calculations are:
Sample 1
18
32
43
15
63

Rank
4.5
6
9
3
12

Sample 2
12
33
10
34
18

Rank Sample 3
12
87
7
53
1
65
8
50
4.5
64
77
R2 = 22.5

R1 = 34.5

Rank

16
11
14
10
13
15
R3 = 79

To determine whether at least two of the populations differ in location, we test:


H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location
2

Rj
12
The test statistic is H =
3( n + 1)

n( n + 1)
nj
=

(34.5) 2 (22.5) 2 (79) 2


12
+
+

3(16 + 1)
16(16 + 1) 5
5
6

= 60.859 51 = 9.859
The rejection region requires = .05 in the upper tail of the 2 distribution with df = p 1 = 3
2
1 = 2. From Table VII, Appendix B, .05
= 5.99147. The rejection region is H > 5.99147.
Since the observed value of the test statistic falls in the rejection region (H = 9.859 > 5.99147),
reject H0. There is sufficient evidence to indicate a difference in location for at least two of the
three probability distributions at = .05.


14.72

The appropriate test for two independent samples is the Wilcoxon rank sum test. Some
preliminary calculations are:
Sample 1
1.2
1.9
.7
2.5
1.0
1.8
1.1

Rank
4
8.5
1
10
2
7
3
T1 = 35.5

Sample 2
1.5
1.3
2.9
1.9
2.7
3.5

Rank
6
5
12
8.5
11
13

T2 = 55.5

To determine if there is a difference between the locations of the probability distributions,


we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution for population 1 is shifted to the left or right of that for 2
The test statistic is T2 = 55.5.
Reject H0 if T2 TL or T2 TU where = .05 (two-tailed), n1 = 7 and n2 = 6:
Reject H0 if T2 28 or T2 56 (from Table XV, Appendix B).
Since T2 = 55.5 / 28 and T2 = 55.5 / 56, do not reject H0. There is insufficient evidence to
indicate a difference between the locations of the probability distributions for the sampled
populations at = .05.
14.74

a.

To determine whether the median biting rate is higher in bright, sunny weather, we test:
H0: η = 5
Ha: η > 5

b.

The test statistic is z = [(S - .5) - .5n] / (.5√n) = [(95 - .5) - .5(122)] / (.5√122) = 6.07
(where S = number of observations greater than 5)

The p-value is p = P(z ≥ 6.07). From Table IV, Appendix B, p = P(z ≥ 6.07) ≈ 0.0000.

c.

Since the observed p-value is less than α (p = 0.0000 < .01), H0 is rejected. There is
sufficient evidence to indicate that the median biting rate in bright, sunny weather is
greater than 5 at α = .01.
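The large-sample sign test statistic can be verified directly; a minimal sketch, assuming SciPy for the normal tail probability:

from math import sqrt
from scipy.stats import norm

# Large-sample sign test for H0: eta = 5 vs Ha: eta > 5 (Exercise 14.74).
n, S = 122, 95                      # S = number of biting rates greater than 5
z = ((S - 0.5) - 0.5 * n) / (0.5 * sqrt(n))
print(round(z, 2), norm.sf(z))      # z is about 6.07, so the p-value is essentially 0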


14.76

Some preliminary calculations are:


Difference (Highway 1 - Highway 2)    Rank of Absolute Difference
-25                                   5
  4                                   1
-23                                   4
-16                                   2.5
-16                                   2.5
                                      Positive rank sum T+ = 1

To determine if the heavily patrolled highway tends to have fewer speeders per 100 cars
than the occasionally patrolled highway, we test:
H0: The two sampled populations have identical probability distributions
Ha: The probability distribution for highway 1 is shifted to the left of that for
highway 2
The test statistic is T+ = 1.
The rejection region is T+ ≤ 1 from Table XVI, Appendix B, with n = 5 and α = .05.
Since the observed value of the test statistic falls in the rejection region (T+ = 1 ≤ 1), H0 is
rejected. There is sufficient evidence to indicate the probability distribution for highway
1 is shifted to the left of that for highway 2 at α = .05.
b.

Some preliminary calculations are:


Day    Difference (Highway 1 - Highway 2)
1      -25
2        4
3      -23
4      -16
5      -16

d̄ = Σdi/n = -76/5 = -15.2

sd² = [Σdi² - (Σdi)²/n] / (n - 1) = [1682 - (-76)²/5] / (5 - 1) = 131.7

sd = √131.7 = 11.4761

To determine if the mean number of speeders per 100 cars differ for the two highways,
we test:
H0: μ1 = μ2
Ha: μ1 ≠ μ2
The test statistic is t = (d̄ - 0)/(sd/√n) = -15.2/(11.4761/√5) = -2.96

The rejection region requires α/2 = .05/2 = .025 in each tail of the t-distribution with df =
n - 1 = 5 - 1 = 4. From Table VI, Appendix B, t.025 = 2.776. The rejection region is t >
2.776 or t < -2.776.
Since the observed value of the test statistic falls in the rejection region (t = -2.96
< -2.776), H0 is rejected. There is sufficient evidence to indicate the mean number of
speeders per 100 cars differs for the two highways at α = .05.
We must assume that the population of differences is normally distributed and that a
random sample of differences was selected.
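The paired t statistic can be reproduced from the five differences; a minimal sketch. The printed table lost the minus signs, so, from T+ = 1 in part a, the sketch assumes only the difference of 4 is positive:

from math import sqrt

# Paired t test on the Highway 1 minus Highway 2 differences (Exercise 14.76b).
d = [-25, 4, -23, -16, -16]        # assumed signs; only the smallest absolute difference is positive
n = len(d)
d_bar = sum(d) / n                                       # -15.2
s_d = sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))   # about 11.476
t = d_bar / (s_d / sqrt(n))
print(round(t, 2))   # about -2.96, beyond the critical value -2.776 with 4 df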
14.78

a.

Since only 70 of the 80 customers responded to the question, only the 70 will be
included.
To determine if the median amount spent on hamburgers at lunch at McDonald's is less
than $2.25, we test:
H0: = 2.25
Ha: < 2.25
S = number of measurements less than 2.25 = 20.
The test statistic is z =

( S .5) .5n
.5 n

(20 .5) .5(70)


.5 70

= 3.71

No was given in the exercise. We will use = .05. The rejection region requires
= .05 in the lower tail of the z-distribution. From Table IV, Appendix B, z.05 = 1.645.
The rejection region is z > 1.645.
Since the observed value of the test statistic does not fall in the rejection region (z = 3.71
>/ 1.645), H0 is not rejected. There is insufficient evidence to indicate that the median
amount spent on hamburgers at lunch at McDonald's is less than $2.25 at = .05.


b.

No. The survey was done in Boston only. The eating habits of those living in Boston are
probably not representative of all Americans.

c.

We must assume that the sample is randomly selected from a continuous probability
distribution.


14.80

Some preliminary calculations:


1

Urban
4.3
5.2
6.2
5.6
3.8
5.8
4.7

2
3
Rank Suburban
Rank Rural
Rank
4.5
5.9
14
5.1
9
10.5
6.7
17
4.8
7
15.5
7.6
19
3.9
2
12
4.9
8
6.2
15.5
1
5.2
10.5
4.2
3
13
6.8
18
4.3
4.5
6
R1 = 62.5
R2 = 86.5
R3 = 41

To determine if there is a difference in the level of property taxes among the three types of
school districts, we test:
H0: The three probability distributions are identical
Ha: At least two of the three probability distributions differ in location
2

The test statistic is H =

Rj
12
3( n + 1)

n( n + 1)
nj

62.52 86.52 412


12
+
+

3(20) = 65.8498 60
19(19 + 1) 7
6
6
= 5.8498
=

The rejection region requires = .05 in the upper tail of the 2 distribution with df = p 1 =
2
= 5.99147. The rejection region is H > 5.99147.
3 1 = 2. From Table VII, Appendix B, .05
Since the observed value of the test statistic does not fall in the rejection region (H = 5.8498 >/
5.99147), H0 is not rejected. There is insufficient evidence to indicate that there is a difference
in the level of property taxes among the three types of school districts at = .05.
14.82

a. Some preliminary calculations are:


Truck Static Weight of
Truck (ui)
1
3
2
4
3
10
4
1
5
6
6
8
7
2
8
5
9
7
10
9
55


Weigh-in-Motion
Prior (vi)
3
4
9
1.5
6
8
1.5
5
7
10
55

Weigh-in-Motion
After (wi)
3
4
10
2
6
8
1
5
7
9
55

uivi

9
16
90
1.5
36
64
3
25
49
90
383.5

uiwi

9
16
100
2
36
64
2
25
49
81
384


ui vi = 383.5 55(55)

SSuv =

ui vI

SSuw =

u i wi

SSuu =

ui2

SSvv =

SSww =

rs1 =
rs2 =

vi2

n
( ui wi )

( ui )
n

= 385

SSuu SSvv

= 384.5

SSuw
SSuu SSww

= 385

81
82.5(82)

= 81

55(55)
= 81.5
10

552
= 81.5
10

( wi )

SSuv

= 384

( vi )

wi2

10

552
= 82
10
552
= 82.5
10

= .9848

81.5
= .9879
82.5(82.5)

The correlation coefficient for x and y1 is rs1 = .9848.


Since rs1 > 0, the relationship between static weight and weigh-in-motion prior to
adjustment is positive. Because the value is close to 1, the relationship is very strong. It
is larger than r1 = .965 found in Exercise 10.89.
The correlation coefficient for x and y2 is rs2 = .9879.
Since rs2 > 0, the relationship between static weight and weigh-in-motion after the
adjustment is positive. Because the value is close to 1, the relationship is very strong. It
is smaller than r2 = .996 found in Exercise 10.89.
b.

In order for rs to be exactly 1, the rankings for the static weight and the weigh-in-motion
must be the same for each truck.
In order for rs to be exactly 0, the rankings for one of the variables (static weight) must be
equal to 11 minus ranking of the other variable (weigh-in-motion) for each truck.

14.84

a.

To determine if the median level differs from the target, we test:


H0: = .75
Ha: .75

b.

S1 = number of observations less than .75 and S2 = number of observations greater than
.75.
The test statistic is S = larger of S1 and S2.
The p-value = 2P(x S) where x is a binomial random variable with n = 25 and p = .5. If
the p-value is less than = .10, reject H0.


c.

A Type I error would be concluding the median level is not .75 when it is. If a Type I
error were committed, the supervisor would correct the fluoridation process when it was
not necessary. A Type II error would be concluding the median level is .75 when it is
not. If a Type II error were committed, the supervisor would not correct the fluoridation
process when it was necessary.

d.

S1 = number of observations less than .75 = 7 and S2 = number of observations greater


than .75 = 18.
The test statistic is S = larger of S1 and S2 = 18.
The p-value = 2P(x 18) where x is a binomial random variable with n = 25 and p = .5.
From Table II,
p-value = 2P(x 18) = 2(1 P(x 17)) = 2(1 .978)
= 2(.022) = .044
Since the p-value = .044 < = .10, H0 is rejected. There is sufficient evidence to indicate
the median level of fluoridation differs from the target of .75 at = .10.

e.

A distribution heavily skewed to the right might look something like the following:

One assumption necessary for the t-test is that the distribution from which the sample is
drawn is normal. A distribution which is heavily skewed in one direction is not normal.
Thus, the sign test would be preferred.
14.86

Some preliminary calculations are:


Hours

Rank

1
2
3
4
5
6
7
8

1
2
3
4
5
6
7
8


Fraction
Defective
.02
.05
.03
.08
.06
.09
.11
.10

Rank

1
3
2
5
4
6
8
7

di

0
1
1
1
1
0
1
1

d i2

di2

0
1
1
1
1
0
1
1
=6


To determine if the fraction defective increases as the day progresses, we test:


H0: s = 0
Ha: s > 0
The test statistic is rs = 1

6 di2
n(n 1)
2

=1

6(6)
= 1 .071 = .929
8(82 1)

Reject H0 if rs > rs, where = .05 and n = 8:


Reject H0 if rs > .643 (from Table XVII, Appendix B).
Since rs = .929 > .643, reject H0. There is sufficient evidence to indicate that the fraction
defective increases as the day progresses at = .05.
14.88

a.

The design utilized was a completely randomized design.

b.

Some preliminary calculations are:


Site 1
34.3
35.5
32.1
28.3
40.5
36.2
43.5
34.7
38.0
35.1

Rank
6
11
3
1
19
12
23
8
15
9
R1 = 107

Site 2
39.3
45.5
50.2
72.1
48.6
42.2
103.5
47.9
41.2
44.0

Rank
17
25
28
29
27
21
30
26
20
24
R2 = 247

Site 3
34.5
29.3
37.2
33.2
32.6
38.3
43.3
36.7
40.0
35.2

Rank
7
2
14
5
4
16
22
13
18
10
R3 = 111

To determine if the probability distributions for the three sites differ, we test:
H0: The three sampled population probability distributions are identical
Ha: At least two of the three sampled population probability distributions differ in
location
2

Rj
12
The test statistic is H =
3( n + 1) 3(n + 1)

n( n + 1)
nj
=


12 107 2 247 2 1112


+
+

3(31) = 109.3923 93
30(31) 10
10
10
= 16.3923


The rejection region requires = .05 in the upper tail of the 2 distribution with df =
2
= 5.99147. The rejection region is
p 1 = 3 1 = 2. From Table VII, Appendix B, .05
H > 5.99147.
Since the observed value of the test statistic falls in the rejection region (H = 16.3923 >
5.99147), H0 is rejected. There is sufficient evidence to indicate the probability
distributions for at least two of the three sites differ at = .05.
c.

Since H0 was rejected, we need to compare all pairs of sites.


Some preliminary calculations are:
Site 1
34.3
35.5
32.1
28.3
40.5
36.2
43.5
34.7
38.0
35.1

Site 2
39.3
45.5
50.2
72.1
48.6
42.2
103.5
47.9
41.2
44.0

Rank
3
6
2
1
10
7
13
4
8
5
T1 = 59
Rank
9
15
18
19
17
12
20
16
11
14
T2 = 151

Site 2
39.3
45.5
50.2
72.1
48.6
42.2
103.5
47.9
41.2
44.0

Rank
9
15
18
19
17
12
20
16
11
14
T2 = 151
Site 3
34.5
29.3
37.2
33.2
32.6
38.3
43.3
36.7
40.0
35.2

Site 1
34.3
35.5
32.1
28.3
40.5
36.2
43.5
34.7
38.0
35.1

Rank
6
11
3
1
18
12
20
8
15
9
T1 = 103

Site 3
34.3
29.3
37.2
33.2
32.6
38.3
43.3
36.7
40.0
35.2

Rank
7
2
14
5
4
16
19
13
17
10
T3 = 107

Rank
4
1
7
3
2
8
13
6
10
5
T3 = 59

For each pair, we test:


H0: The two sampled population probability distributions are identical
Ha: The probability distribution for one site is shifted to the right or left of the
other.
The rejection region for each pair is T 79 or T 131 from Table XV, Appendix B,
with n1 = n2 = 10 and = .05.


For sites 1 and 2:


The test statistic is T1 = 59.
Since the observed value of the test statistic falls in the rejection region,
(TA = 59 79), H0 is rejected. There is sufficient evidence to indicate the
probability distribution for site 1 is shifted to the left of that for site 2 at = .05.
For sites 1 and 3:
The test statistic is T1 = 103.
Since the observed value of the test statistic does not fall in the rejection region
(T1 = 103 </ 79 and 103 >/ 131), H0 is not rejected. There is insufficient evidence
to indicate the probability distribution for site 1 is shifted to the right or left of that
for site 3 at = .05.
For sites 2 and 3:
The test statistic is T2 = 151.
Since the observed value of the test statistic falls in the rejection region
(T2 = 151 131), H0 is rejected. There is sufficient evidence to indicate the
probability distribution for site 2 is shifted to the right of that for site 3 at = .05.
Thus, the income for those at site 2 is significantly higher than at the other two sites.
d.

The necessary assumptions are:


1. The three samples are random and independent.
2. There are five or more measurements in each sample.
3. The three probability distributions from which the samples are drawn are continuous.

For parametric tests, the assumptions are:

1. The three populations are normal.
2. The samples are random and independent.
3. The three population variances are equal.


14.90

Using MINITAB, the results of the Wilcoxon Rank Sum Test (Mann-Whitney Test) for
each of the Variables are:
Mann-Whitney Test and CI: CREATIVE-S, CREATIVE-NS
CREATIVE-S
CREATIVE-NS

N
47
67

Median
5.0000
4.0000

Point estimate for ETA1-ETA2 is 1.0000


95.0 Percent CI for ETA1-ETA2 is (0.9999,1.0000)
W = 3734.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0000
The test is significant at 0.0000 (adjusted for ties)

Mann-Whitney Test and CI: INFO-S, INFO-NS


INFO-S
INFO-NS

N
47
67

Median
5.000
5.000

Point estimate for ETA1-ETA2 is 0.000


95.0 Percent CI for ETA1-ETA2 is (-0.000,1.000)
W = 2888.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.2856
The test is significant at 0.2743 (adjusted for ties)

Mann-Whitney Test and CI: DECPERS-S, DECPERS-NS


DECPERS-S
DECPERS-NS

N
47
67

Median
3.000
2.000

Point estimate for ETA1-ETA2 is -0.000


95.0 Percent CI for ETA1-ETA2 is (-0.000,1.000)
W = 2963.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.1337
The test is significant at 0.1228 (adjusted for ties)

Mann-Whitney Test and CI: SKILLS-S, SKILLS-NS


SKILLS-S
SKILLS-NS

N
47
67

Median
6.0000
5.0000

Point estimate for ETA1-ETA2 is 1.0000


95.0 Percent CI for ETA1-ETA2 is (0.9999,1.9999)
W = 3498.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0000
The test is significant at 0.0000 (adjusted for ties)


Mann-Whitney Test and CI: TASKID-S, TASKID-NS


N
47
67

TASKID-S
TASKID-NS

Median
5.000
4.000

Point estimate for ETA1-ETA2 is 1.000


95.0 Percent CI for ETA1-ETA2 is (-0.000,1.000)
W = 3028.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0614
The test is significant at 0.0566 (adjusted for ties)

Mann-Whitney Test and CI: AGE-S, AGE-NS


AGE-S
AGE-NS

N
47
67

Median
47.000
45.000

Point estimate for ETA1-ETA2 is 1.000


95.0 Percent CI for ETA1-ETA2 is (-1.000,4.001)
W = 2891.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.2779
The test is significant at 0.2771 (adjusted for ties)

Mann-Whitney Test and CI: EDYRS-S, EDYRS-NS


EDYRS-S
EDYRS-NS

N
47
67

Median
13.000
13.000

Point estimate for ETA1-ETA2 is -0.000


95.0 Percent CI for ETA1-ETA2 is (0.000,-0.000)
W = 2664.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.8268
The test is significant at 0.8191 (adjusted for ties)

A summary of the tests above and the t-tests from Chapter 7 are listed in the table:
Variable
CREATIVE
INFO
DECPERS
SKILLS
TASKID
AGE
EDYRS

Wilcoxon
Test Statistic, T2
3734.5
2888.5
2963.5
3498.5
3028.0
2891.5
2664.0

p-value
0.000
0.274
0.123
0.000
0.057
0.277
0.819

t
8.847
1.503
1.506
4.766
1.738
0.742
-0.623

p-value
0.000
0.136
0.135
0.000
0.087
0.460
0.534

The p-values for the Wilcoxon Rank Sum Tests and the t-tests are similar and the
decisions are the same.
Since the sample sizes are large (n = 47 and n = 67), the Central Limit Theorem applies.
Thus, the t-tests (or z-tests) are valid. One assumption for the Wilcoxon Rank Sum test
is that the distributions are continuous. Obviously, this is not true. There are many ties
in the data, so the Wilcoxon Rank Sum tests may not be valid.
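For readers without MINITAB, the same rank sum comparisons can be run in Python; a minimal sketch, where creative_s and creative_ns are hypothetical lists holding the CREATIVE scores for the 47 successful and 67 non-successful respondents (placeholder values shown, not the actual data):

from scipy.stats import mannwhitneyu

# Sketch: two-sided Wilcoxon rank sum (Mann-Whitney) test, the Python analogue of the
# MINITAB output above. Replace the placeholder lists with the real CREATIVE scores.
creative_s = [5, 6, 4, 5, 7]
creative_ns = [4, 3, 5, 4, 2, 4]

stat, p_value = mannwhitneyu(creative_s, creative_ns, alternative="two-sided")
print(stat, p_value)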
