Learning Objectives
1. Learn how the general linear model can be used to model problems involving curvilinear
relationships.
2. Understand the concept of interaction and how it can be accounted for in the general linear model.
3. Understand how an F test can be used to determine when to add or delete one or more variables.
4. Develop an appreciation for the complexities involved in solving larger regression analysis
problems.
5. Understand how variable selection procedures can be used to choose a set of independent variables
for an estimated regression equation.
6. Learn how analysis of variance and experimental design problems can be analyzed using a
regression model.
7. Know how the Durbin-Watson test can be used to test for autocorrelation.
16 - 1
© 2010 Cengage Learning. All Rights Reserved.
May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Chapter 16
Solutions:
Analysis of Variance
SOURCE DF SS MS F p
Regression 1 362.13 362.13 6.85 0.059
Residual Error 4 211.37 52.84
Total 5 573.50
b. Since the p-value corresponding to F = 6.85 is .059 > α = .05, the relationship is not significant.
c.
Regression Analysis: Model Building
Analysis of Variance
SOURCE DF SS MS F p
Regression 2 541.85 270.92 25.68 0.013
Residual Error 3 31.65 10.55
Total 5 573.50
e. Since the p-value corresponding to F = 25.68 is .013 < α = .05, the relationship is significant.
Analysis of Variance
SOURCE DF SS MS F p
Regression 1 59.39 59.39 4.76 0.117
Residual Error 3 37.41 12.47
Total 4 96.80
The high p-value (.117) indicates a weak relationship; note that 61.4% of the variability in y has
been explained by x.
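As a quick check of the figures quoted above, the F statistic and the percentage of variability explained can be recomputed from the sums of squares in the ANOVA table. The Python sketch below uses the values from that table; the function name `anova_summary` is a hypothetical helper introduced here for illustration and is not part of Minitab.

```python
def anova_summary(ss_regression, df_regression, ss_error, df_error):
    """Return (MSR, MSE, F, R-squared) from ANOVA sums of squares."""
    msr = ss_regression / df_regression
    mse = ss_error / df_error
    f_stat = msr / mse
    # R-squared is the share of the total sum of squares explained by the model
    r_squared = ss_regression / (ss_regression + ss_error)
    return msr, mse, f_stat, r_squared

# Values taken from the ANOVA table above (Regression and Residual Error rows)
msr, mse, f_stat, r_sq = anova_summary(59.39, 1, 37.41, 3)
print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}, R-sq = {r_sq:.1%}")
```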
Analysis of Variance
SOURCE DF SS MS F p
Regression 2 93.529 46.765 28.60 0.034
Residual Error 2 3.271 1.635
Total 4 96.800
At the .05 level of significance, the relationship is significant; the fit is excellent.
Analysis of Variance
SOURCE DF SS MS F p
Regression 1 18.461 18.461 4.37 0.075
Residual Error 7 29.539 4.220
Total 8 48.000
c. The following standardized residual plot indicates that the constant variance assumption
is not satisfied.
d. The logarithmic transformation does not appear to eliminate the wedge-shaped pattern in the
above residual plot. The reciprocal transformation does, however, remove the wedge-shaped
pattern. Neither transformation provides a good fit. The Minitab output for the reciprocal
transformation and the corresponding standardized residual plot are shown below.
Analysis of Variance
SOURCE DF SS MS F p
Regression 1 0.010501 0.010501 4.19 0.080
Residual Error 7 0.017563 0.002509
Total 8 0.028064
Analysis of Variance
SOURCE DF SS MS F p
Regression 1 33223 33223 31.86 0.005
Residual Error 4 4172 1043
Total 5 37395
Analysis of Variance
SOURCE DF SS MS F p
Regression 2 36643 18322 73.15 0.003
Residual Error 3 751 250
Total 5 37395
b. Since the linear relationship was significant (Exercise 4), this relationship must be significant.
Note also that since the p-value of .003 < α = .05, we can reject H0.
c. The fitted value is 1302.01, with a standard deviation of 9.93. The 95% confidence interval is
1270.41 to 1333.61; the 95% prediction interval is 1242.55 to 1361.47.
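The two intervals above can be reproduced approximately from the reported fit and its standard deviation. The sketch below is a minimal illustration, assuming the tabled critical value t(.025) = 3.182 for 3 degrees of freedom and taking MSE from the Residual Error row of the ANOVA table (751/3); the helper name `intervals` is hypothetical. Because the inputs are rounded, the prediction interval agrees with the reported one only to within a few hundredths.

```python
import math

def intervals(fit, se_fit, mse, t_crit):
    """Confidence interval for the mean and prediction interval for a new value."""
    ci_margin = t_crit * se_fit
    # The prediction standard error adds the error variance to the variance of the fit
    se_pred = math.sqrt(mse + se_fit ** 2)
    pi_margin = t_crit * se_pred
    return (fit - ci_margin, fit + ci_margin), (fit - pi_margin, fit + pi_margin)

T_CRIT = 3.182      # assumed table value: t(.025) with 3 degrees of freedom
MSE = 751 / 3       # Residual Error SS / DF from the ANOVA table
ci, pi = intervals(1302.01, 9.93, MSE, T_CRIT)
print("95%% confidence interval: %.2f to %.2f" % ci)
print("95%% prediction interval: %.2f to %.2f" % pi)
```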
A simple linear regression model does not appear to be appropriate. There appears to be a
curvilinear relationship between the two variables.
Analysis of Variance
Source DF SS MS F P
Regression 1 29187 29187 16.99 0.001
Residual Error 15 25769 1718
Total 16 54956
Unusual Observations
Obs Miles Riders Fit SE Fit Residual St Resid
8 51.0 231.0 101.0 14.4 130.0 3.34R
There is an unusual trend in the points. There is also some indication that the variance may not be
constant.
Analysis of Variance
Source DF SS MS F P
Regression 1 8.1376 8.1376 29.33 0.000
Residual Error 15 4.1612 0.2774
Total 16 12.2988
Unusual Observations
Obs Miles lnRiders Fit SE Fit Residual St Resid
8 51.0 5.442 4.406 0.183 1.037 2.10R
The standardized residual plot indicates that the transformation has eliminated the problem
identified in the residual plot constructed in part (b).
Analysis of Variance
Source DF SS MS F P
Regression 1 0.0063645 0.0063645 11.72 0.004
Unusual Observations
Obs Miles 1/Riders Fit SE Fit Residual St Resid
17 9.0 0.12500 0.05666 0.00857 0.06834 3.15R
e. The standardized residual plot corresponding to the reciprocal transformation indicates an unusual
pattern that is not evident in the standardized residual plot for the logarithmic transformation. The
estimated regression equation for the logarithmic transformation also provides a better fit. We
recommend using the logarithmic transformation.
A simple linear regression model does not appear to be appropriate. There appears to be a
curvilinear relationship between the two variables.
Analysis of Variance
Source DF SS MS F P
Regression 2 12643604 6321802 14.15 0.001
Residual Error 12 5359686 446641
Total 14 18003290
Analysis of Variance
Source DF SS MS F P
Regression 1 3.6595 3.6595 45.55 0.000
Residual Error 13 1.0444 0.0803
Total 14 4.7038
d. The model in part (c) is preferred because it provides a better fit.
9. a.
b.
Note the line drawn through the data. This line indicates a possible curvilinear relationship between
these two variables.
c. In the Minitab output that follows, IndexSq denotes the square of the Cost-of-Living Index.
Analysis of Variance
Source DF SS MS F P
Regression 3 609.96 203.32 27.70 0.000
Residual Error 46 337.67 7.34
Total 49 947.62
At the .05 level of significance there is overall significance. And, each of the three independent
variables (Cost-of-Living Index, IndexSq, and Income) is significant.
The primary concern of using this estimate is that the estimated regression equation was developed for
metropolitan areas with a population of 1,000,000 or more. But, the population for Tucson is so close to
1,000,000 that the estimated regression equation should still provide a good estimate. Note to Instructor: the
actual value of the percentage of the workforce in creative fields for Tucson reported by Kiplinger was
31.1%.
b.
F = 440/1.8 = 244.44
d.
Using Excel or Minitab, the p-value corresponding to F = 15.28 is .000.
Analysis of Variance
Source DF SS MS F P
Regression 1 4.6036 4.6036 17.66 0.000
Residual Error 28 7.2998 0.2607
Total 29 11.9035
Analysis of Variance
Source DF SS MS F P
Regression 3 7.5795 2.5265 15.19 0.000
Residual Error 26 4.3240 0.1663
Total 29 11.9035
c. SSE(reduced) = 7.2998
SSE(full) = 4.3240
MSE(full) = .1663
The p-value associated with F = 8.95 (2 degrees of freedom numerator and 26 denominator)
is .001. With a p-value < α =.05, the addition of the two independent variables is statistically
significant.
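The F statistic for adding variables compares the drop in SSE with the full model's MSE. The sketch below recomputes F = 8.95 from the three values listed above and approximates its p-value. Both function names are hypothetical helpers; the p-value formula is the closed form that holds only when the numerator has exactly 2 degrees of freedom, so other cases would need a statistics table or library.

```python
def partial_f(sse_reduced, sse_full, num_added, mse_full):
    """F statistic for testing whether the added variables improve the model."""
    return ((sse_reduced - sse_full) / num_added) / mse_full

def p_value_two_numerator_df(f_stat, df_denominator):
    """Upper-tail area of the F distribution when the numerator has exactly 2 df.

    For 2 numerator degrees of freedom the tail area has the closed form
    (1 + 2F/d)^(-d/2), where d is the denominator degrees of freedom.
    """
    return (1 + 2 * f_stat / df_denominator) ** (-df_denominator / 2)

# SSE(reduced), SSE(full), number of added variables, MSE(full) from above
f_stat = partial_f(7.2998, 4.3240, 2, 0.1663)
p_value = p_value_two_numerator_df(f_stat, 26)
print(f"F = {f_stat:.2f}, p-value = {p_value:.3f}")
```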
Analysis of Variance
Source DF SS MS F P
Regression 1 1350901 1350901 9.67 0.004
Residual Error 28 3909645 139630
Total 29 5260546
Analysis of Variance
Source DF SS MS F P
Regression 3 3430493 1143498 16.25 0.000
c. SSE(reduced) = 3,909,645
SSE(full) = 1,830,053
MSE(full) = 70,387
F = [(3,909,645 - 1,830,053)/2] / 70,387 = 14.77
The p-value associated with F = 14.77 (2 degrees of freedom numerator and 26 denominator)
is .000. With a p-value < α = .05, the addition of the two independent variables is statistically
significant.
Analysis of Variance
Source DF SS MS F P
Regression 1 2990221 2990221 36.88 0.000
Residual Error 28 2270325 81083
Total 29 5260546
Because the equation developed in part (b) provides a better fit, it is preferred over the equation
developed in part (d).
14. a. The Minitab output is shown below:
Analysis of Variance
Source DF SS MS F P
Regression 2 3379.6 1689.8 35.41 0.000
Residual Error 17 811.3 47.7
Total 19 4190.9
Source DF Seq SS
Age 1 1772.0
Pressure 1 1607.7
Unusual Observations
Obs Age Risk Fit SE Fit Residual St Resid
17 66.0 8.00 25.05 1.67 -17.05 -2.54R
Analysis of Variance
Source DF SS MS F P
Regression 4 3672.11 918.03 26.54 0.000
Residual Error 15 518.84 34.59
Total 19 4190.95
Source DF Seq SS
Age 1 1771.98
Pressure 1 1607.66
Smoker 1 281.10
AgePress 1 11.37
Unusual Observations
Obs Age Risk Fit SE Fit Residual St Resid
17 66.0 8.00 20.91 2.01 -12.91 -2.34R
c.
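F = [(811.3 - 518.84)/2] / 34.59 = 4.23
The p-value associated with F = 4.23 (2 numerator and 15 denominator DF) is .035. With a
p-value < α = .05, the addition of the two independent variables is statistically significant.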
Analysis of Variance
Source DF SS MS F P
Regression 1 6.4044 6.4044 29.41 0.000
Residual Error 48 10.4512 0.2177
Total 49 16.8556
Analysis of Variance
Source DF SS MS F P
Regression 3 13.1137 4.3712 53.74 0.000
Residual Error 46 3.7419 0.0813
Total 49 16.8556
c. SSE(reduced) = 10.4512
SSE(full) = 3.7419
MSE(full) = .0813
The p-value associated with F = 41.26 (2 degrees of freedom numerator and 46 denominator)
is .000. With a p-value < α =.05, the addition of the two independent variables is statistically
significant.
The independent variable most correlated with Weeks is Age. The Minitab output corresponding to
using Age as the independent variable is shown below:
Analysis of Variance
Source DF SS MS F P
Regression 1 9161.4 9161.4 24.01 0.000
Residual Error 48 18316.1 381.6
Total 49 27477.5
Step 1 2 3 4
Constant -8.86002 -9.09741 -0.10922 -0.06890
Sales -17.4
T-Value -2.79
P-Value 0.008
The results suggest a model using four independent variables: Age, Manager, Head, and Sales. The
corresponding Minitab output is shown below:
Analysis of Variance
Source DF SS MS F P
Regression 4 15216.0 3804.0 13.96 0.000
Residual Error 45 12261.5 272.5
Total 49 27477.5
c. The results using Minitab’s Forward Selection procedure are the same as the results using Minitab’s
Stepwise procedure in part (b).
d. The results using Minitab’s Backward Elimination procedure are shown below:
Step 1 2 3 4
Constant 22.85070 13.62308 13.06817 -0.06890
Educ -0.61
T-Value -0.66
P-Value 0.516
These results also suggest using the model with four independent variables: Age, Head, Manager, and
Sales.
Vars  R-Sq  R-Sq(adj)  Mallows C-p  S  Age  Educ  Married  Head  Tenure  Manager  Sales
The results suggest a model using five independent variables: Age, Married, Head, Manager, and Sales.
The corresponding Minitab output is shown below:
Analysis of Variance
Source DF SS MS F P
Regression 5 15959.5 3191.9 12.19 0.000
Residual Error 44 11518.0 261.8
Total 49 27477.5
17. The output obtained using Minitab’s Best Subset Regression is shown below:
Vars  R-Sq  R-Sq(adj)  Mallows C-p  S  Drive Average  Greens in Reg.  Putting Avg.  Sand Saves  DriveGreens
1 38.7 36.5 28.3 0.51060 X
1 33.0 30.7 33.3 0.53350 X
2 58.3 55.2 12.9 0.42897 X X
2 53.9 50.5 16.8 0.45059 X X
3 63.7 59.5 10.2 0.40781 X X X
3 60.3 55.7 13.2 0.42659 X X X
4 72.0 67.5 4.8 0.36514 X X X X
4 64.7 59.0 11.3 0.41015 X X X X
5 72.9 67.2 6.0 0.36672 X X X X X
The Best Subset Regression output indicates that a model using four independent variables, Drive
Average, Greens in Reg., Putting Average, and DriveGreens, may be a good choice. The Minitab
output for this model is shown below:
Analysis of Variance
Source DF SS MS F P
Regression 4 8.5703 2.1426 16.07 0.000
Residual Error 25 3.3332 0.1333
Total 29 11.9035
18. a. Because the independent variable most highly correlated with RPG is OBP, it
will provide the best one-variable estimated regression equation. The Minitab
output using OBP to predict RPG is shown below:
Analysis of Variance
Source DF SS MS F P
Regression 1 72.108 72.108 78.85 0.000
b. The output using Minitab’s Stepwise Regression procedure using Alpha-to-Enter = 0.05 and
Alpha-to-Remove = 0.05 is shown below:
Step 1 2 3
Constant -4.0491 -1.5951 -0.9808
HR 0.071 0.069
T-Value 4.16 5.06
P-Value 0.001 0.000
AVG -12.6
T-Value -3.23
P-Value 0.005
Using less stringent (larger) values for Alpha-to-Enter and Alpha-to-Remove will provide a model with
additional independent variables. For example, the output using Minitab’s Stepwise Regression
procedure using Alpha-to-Enter = 0.10 and Alpha-to-Remove = 0.10 is shown below:
Step 1 2 3 4 5
Constant -4.0491 -1.5951 -0.9808 -0.6161 -0.9088
3B 0.182 0.244
T-Value 1.88 2.99
P-Value 0.079 0.010
BB -0.0223
T-Value -2.92
P-Value 0.011
The following output using Minitab’s Best Subset procedure also confirms that a variety of models
will provide a good fit.
Vars  R-Sq  R-Sq(adj)  Mallows C-p  S  H  2B  3B  HR  RBI  BB  SO  SB  CS  OBP  SLG  AVG
1 81.4 80.4 66.6 0.95631 X
1 78.9 77.7 77.9 1.0192 X
2 90.8 89.7 27.0 0.69299 X X
2 88.4 87.0 37.8 0.77872 X X
3 94.4 93.4 12.8 0.55552 X X X
3 94.4 93.3 13.0 0.55820 X X X
4 95.8 94.6 8.8 0.50014 X X X X
4 95.5 94.3 10.0 0.51589 X X X X
5 97.2 96.2 4.5 0.42096 X X X X X
5 97.2 96.2 4.6 0.42336 X X X X X
6 97.6 96.6 4.5 0.40042 X X X X X X
6 97.5 96.4 5.1 0.41198 X X X X X X
7 98.2 97.2 3.9 0.36245 X X X X X X X
7 98.2 97.1 4.1 0.36664 X X X X X X X
8 98.3 97.1 5.3 0.36471 X X X X X X X X
8 98.3 97.0 5.7 0.37269 X X X X X X X X
9 98.4 97.0 7.1 0.37506 X X X X X X X X X
9 98.4 96.9 7.3 0.38077 X X X X X X X X X
10 98.4 96.7 9.0 0.39477 X X X X X X X X X X
10 98.4 96.7 9.0 0.39496 X X X X X X X X X X
11 98.4 96.3 11.0 0.41758 X X X X X X X X X X X
11 98.4 96.2 11.0 0.41848 X X X X X X X X X X X
12 98.4 95.7 13.0 0.44629 X X X X X X X X X X X X
It would be hard to argue that there is one best model given these results. The five-variable
model identified using Minitab’s Stepwise Regression procedure with Alpha-to-Enter =
0.10 and Alpha-to-Remove = 0.10 seems like a reasonable choice. The Minitab regression output
corresponding to this model is shown below:
Analysis of Variance
Source DF SS MS F P
Regression 5 86.088 17.218 97.16 0.000
Residual Error 14 2.481 0.177
Total 19 88.569
The standardized residual plot does not indicate any reason to question the usual assumptions
regarding the error term. Thus, the estimated regression equation using OBP, HR, AVG, 3B, and
BB appears to be a good choice.
19. See the solution to Exercise 14 in this chapter. The Minitab output using the best subsets
regression procedure is shown below:
Response is Risk
Vars  R-Sq  R-Sq(adj)  C-p  S  Age  Pressure  Smoker  AgePress
This output suggests that the model involving Age, Pressure, and Smoker is the preferred model;
the Minitab output for this model is shown below:
Source DF SS MS F P
Regression 3 3660.7 1220.2 36.82 0.000
Residual Error 16 530.2 33.1
Total 19 4190.9
Source DF Seq SS
Age 1 1772.0
Pressure 1 1607.7
Smoker 1 281.1
Unusual Observations
Obs Age Risk Fit SE Fit Residual St Resid
17 66.0 8.00 21.11 1.94 -13.11 -2.42R
x1 x2 x3 Treatment
0 0 0 A
1 0 0 B
0 1 0 C
0 0 1 D
E(y) = β0 + β1 x1 + β2 x2 + β3 x3
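The coding in the table above can be sketched in a few lines of Python: treatment A is the baseline (all dummy variables 0), and each other treatment switches on exactly one dummy variable. The function names and the coefficient values below are hypothetical and chosen only to illustrate how E(y) changes with the treatment.

```python
def dummy_code(treatment, levels=("A", "B", "C", "D")):
    """Return (x1, x2, x3): all zeros for the baseline level, else a single 1."""
    if treatment not in levels:
        raise ValueError(f"unknown treatment: {treatment!r}")
    return tuple(int(treatment == level) for level in levels[1:])

def expected_value(treatment, betas):
    """E(y) = b0 + b1*x1 + b2*x2 + b3*x3 for the given treatment."""
    b0, *slopes = betas
    return b0 + sum(b * x for b, x in zip(slopes, dummy_code(treatment)))

# Hypothetical coefficients (b0, b1, b2, b3), for illustration only
betas = (10.0, 2.0, -1.5, 4.0)
for treatment in "ABCD":
    print(treatment, dummy_code(treatment), expected_value(treatment, betas))
```

Under this coding the intercept is the mean response for treatment A, and each remaining coefficient is the difference between one treatment's mean and the mean for A.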
x1 x2 Treatment
0 0 1
1 0 2
0 1 3
22. Factor A
Factor B
x2 x3 Level
0 0 1
1 0 2
0 1 3
D1 D2 Mfg.
0 0 1
1 0 2
0 1 3
E(y) = β0 + β1 D1 + β2 D2
b. The Minitab output is shown below:
Analysis of Variance
SOURCE DF SS MS F p
Regression 2 104.000 52.000 10.64 0.004
Residual Error 9 44.000 4.889
Total 11 148.000
c. H0: β1 = β2 = 0
d. The p-value of .004 is less than α = .05; therefore, we can reject H0 and conclude that the
mean time to mix a batch of material is not the same for each manufacturer.
D1 D2 D3 Paint
0 0 0 1
1 0 0 2
0 1 0 3
0 0 1 4
Analysis of Variance
SOURCE DF SS MS F p
Regression 3 330.00 110.00 2.54 0.093
Residual Error 16 692.00 43.25
Total 19 1022.00
H0: β1 = β2 = β3 = 0
The p-value of .093 is greater than α = .05; therefore, at the 5% level of significance we cannot
reject H0.
b. Note: Estimating the mean drying time for paint 2 using the estimated regression equation developed
in part (a) may not be the best approach because at the 5% level of significance, we cannot reject
H0. But, if we want to use the output, we would proceed as follows.
D1 = 1 D2 = 0 D3 = 0
X2 X3 Car
0 0 1
1 0 2
0 1 3
The complete data set and the Minitab output are shown below:
Y X1 X2 X3
50 0 0 0
55 0 1 0
63 0 0 1
42 1 0 0
44 1 1 0
46 1 0 1
Analysis of Variance
SOURCE DF SS MS F p
Regression 3 289.00 96.33 9.17 0.100
Residual Error 2 21.00 10.50
Total 5 310.00
To test for any significant difference between the two analyzers we must test H0: β1 = 0. Since the
p-value corresponding to t = -4.54 is .045 < α = .05, we reject H0: β1 = 0; the time to do a tuneup
is not the same for the two analyzers.
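Because the complete data set is listed above, the ANOVA quantities and the t statistic for the analyzer variable can be checked with a small least squares computation. The sketch below solves the normal equations in plain Python rather than calling Minitab; the `invert` helper and all variable names are illustrative choices, not part of the original solution.

```python
import math

# Data from the table above: Y = tune-up time, X1 = analyzer, X2/X3 = car dummies
Y = [50, 55, 63, 42, 44, 46]
X1 = [0, 0, 0, 1, 1, 1]
X2 = [0, 1, 0, 0, 1, 0]
X3 = [0, 0, 1, 0, 0, 1]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

rows = [[1, a, b, c] for a, b, c in zip(X1, X2, X3)]  # design matrix with intercept
k = len(rows[0])
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * y for r, y in zip(rows, Y)) for i in range(k)]
xtx_inv = invert(xtx)
coef = [sum(xtx_inv[i][j] * xty[j] for j in range(k)) for i in range(k)]

fitted = [sum(b * x for b, x in zip(coef, r)) for r in rows]
y_bar = sum(Y) / len(Y)
sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))
sst = sum((y - y_bar) ** 2 for y in Y)
ssr = sst - sse
mse = sse / (len(Y) - k)
f_stat = (ssr / (k - 1)) / mse
# Standard error of a coefficient: sqrt(MSE * diagonal element of (X'X)^-1)
t_x1 = coef[1] / math.sqrt(mse * xtx_inv[1][1])

print("coefficients:", [round(b, 2) for b in coef])
print(f"SSR = {ssr:.2f}, SSE = {sse:.2f}, F = {f_stat:.2f}, t(X1) = {t_x1:.2f}")
```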
Advertisement
DesignB DesignC Design
0 0 A
1 0 B
0 1 C
The complete data set and the Minitab output are shown below:
Analysis of Variance
Source DF SS MS F P
Regression 5 448.00 89.60 5.60 0.029
Residual Error 6 96.00 16.00
Total 11 544.00
Overall the model is significant because the p-value corresponding to F = 5.60 (.029) is less than α = .05.
Individually, none of the variables are significant using α = .05.
Analysis of Variance
Source DF SS MS F P
Regression 1 294.00 294.00 11.76 0.006
Residual Error 10 250.00 25.00
Total 11 544.00
Thus, DesignB is significant using α = .05. However, the model involving just the interaction
between Large and DesignB also provides some interesting results:
Analysis of Variance
Source DF SS MS F P
Regression 1 345.60 345.60 17.42 0.002
Residual Error 10 198.40 19.84
Total 11 544.00
Here we see that the interaction term is significant. Thus, one might consider that the differences
are due to both design and a large advertisement. But, it is hard to reach any definite conclusions
given the size of the data set. A larger sample size is really needed to make any stronger
conclusions about the relationships among the variables.
Analysis of Variance
Source DF SS MS F P
Regression 1 107.70 107.70 295.39 0.000
Residual Error 18 6.56 0.36
Total 19 114.26
Unusual Observations
b. The Durbin-Watson statistic is .798118. At the .05 level of significance, dL = 1.20 and dU =1.41.
Because d < dL, there is significant positive autocorrelation.
28. From Minitab, d = 1.60. At the .05 level of significance, dL = 1.04 and dU = 1.77. Since dL ≤ d ≤
dU, the test is inconclusive.
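The decision rule used in these two solutions can be written as a short function. The sketch below also includes the usual formula for the Durbin-Watson statistic computed from residuals in time order; the function names are hypothetical, and the residuals in the last line are made-up values that only show how d is computed.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def positive_autocorrelation_test(d, d_lower, d_upper):
    """One-sided Durbin-Watson decision for positive autocorrelation."""
    if d < d_lower:
        return "significant positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"

# Statistics and critical values quoted in the two solutions above
print(positive_autocorrelation_test(0.798118, 1.20, 1.41))
print(positive_autocorrelation_test(1.60, 1.04, 1.77))

# Hypothetical residuals, only to show how d is computed
print(round(durbin_watson([1.0, 0.5, -0.5, -1.0, 0.0, 1.0]), 3))
```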
The curvature in the scatter diagram indicates that a simple linear regression model may not be
appropriate.
Analysis of Variance
Source DF SS MS F P
Regression 2 260.92 130.46 8.36 0.014
Residual Error 7 109.18 15.60
Total 9 370.10
c. The Minitab output for the transformed nonlinear model is shown below:
Analysis of Variance
Source DF SS MS F P
Regression 1 0.0072470 0.0072470 9.17 0.016
Residual Error 8 0.0063193 0.0007899
Total 9 0.0135662
The estimated regression equation developed in part (b) provides a much better fit.
30. a.
Analysis of Variance
Source DF SS MS F P
Regression 2 3161747 1580874 26.82 0.000
Residual Error 16 943263 58954
Total 18 4105011
The results obtained support the conclusion that there is a curvilinear relationship between weight
and price.
Analysis of Variance
Source DF SS MS F P
Regression 2 2944410 1472205 20.30 0.000
Residual Error 16 1160601 72538
Total 18 4105011
Type of bike appears to be a significant factor in predicting price. But, the estimated regression
equation developed in part (b) appears to provide a slightly better fit.
d. A portion of the Minitab output follows. In this output WxF denotes the interaction between the
weight of the bike and the dummy variable Type_Fitness and WxC denotes the interaction
between the weight of the bike and the dummy variable Type_Comfort.
Analysis of Variance
Source DF SS MS F P
Regression 5 3450170 690034 13.70 0.000
Residual Error 13 654841 50372
Total 18 4105011
By taking into account the type of bike, the weight, and the interaction between these two factors,
this estimated regression equation provides an excellent fit.
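The five predictors in this model (matching the 5 regression degrees of freedom) can be built from a bike's weight and type as sketched below. The variable names Type_Fitness, Type_Comfort, WxF, and WxC follow the output described above; the baseline label "Road" and the sample bikes are assumptions made only for illustration.

```python
def design_row(weight, bike_type):
    """Build the predictors for one bike: weight, two type dummies, two interactions.

    Assumption: any type other than "Fitness" or "Comfort" is the baseline
    category, for which both dummy variables are 0.
    """
    type_fitness = int(bike_type == "Fitness")
    type_comfort = int(bike_type == "Comfort")
    return {
        "Weight": weight,
        "Type_Fitness": type_fitness,
        "Type_Comfort": type_comfort,
        "WxF": weight * type_fitness,   # interaction: weight x Type_Fitness
        "WxC": weight * type_comfort,   # interaction: weight x Type_Comfort
    }

# Hypothetical bikes, for illustration only
for weight, bike_type in [(20, "Road"), (25, "Fitness"), (30, "Comfort")]:
    print(design_row(weight, bike_type))
```

With the interaction terms included, each bike type gets its own slope for weight as well as its own intercept.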
Source DF SS MS F P
Regression 4 2587.7 646.9 5.42 0.002
Residual Error 35 4176.3 119.3
Total 39 6764.0
b. The low value of the adjusted coefficient of determination (31.2%) does not indicate a good fit.
The scatter diagram suggests a curvilinear relationship between these two variables.
d. The output from Minitab’s best subsets procedure is shown below, where FinishedSq is the square
of Finished.
Response is Delay
Vars  R-Sq  R-Sq(adj)  Mallows C-p  S  Industry  Public  Quality  Finished  FinishedSq
1 15.9 13.7 34.0 12.234 X
1 9.9 7.6 39.0 12.662 X
2 37.8 34.4 17.8 10.667 X X
2 26.9 22.9 26.9 11.561 X X
3 51.1 47.1 8.7 9.5813 X X X
3 42.2 37.3 16.1 10.425 X X X
4 59.1 54.4 4.1 8.8960 X X X X
4 51.6 46.1 10.3 9.6712 X X X X
5 59.1 53.1 6.0 9.0149 X X X X X
The estimated regression equation using Industry, Quality, Finished, and FinishedSq has an
adjusted coefficient of determination of 54.4%.
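The R-Sq(adj) and C-p columns of the best subsets output can be recovered, up to rounding, from the R-Sq column alone. The sketch below assumes n = 40 observations (Total DF = 39) and treats the five-variable model as the full model; both function names are hypothetical helpers. Since the R-Sq values in the table are rounded to one decimal, the results match the table only approximately.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R-sq for a model with p independent variables and n observations."""
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

def mallows_cp(r_squared, r_squared_full, n, p, k_full):
    """Mallows C-p computed from R-sq values.

    Uses C-p = SSE_p / MSE_full - n + 2(p + 1); dividing SSE_p and SSE_full
    by the same total sum of squares leaves a ratio of (1 - R-sq) terms.
    """
    sse_ratio = (1 - r_squared) / (1 - r_squared_full)
    return sse_ratio * (n - k_full - 1) - n + 2 * (p + 1)

N, K_FULL, R2_FULL = 40, 5, 0.591   # full model: all five candidate variables

adj_four = adjusted_r_squared(0.591, N, 4)          # best four-variable model
cp_three = mallows_cp(0.511, R2_FULL, N, 3, K_FULL)  # best three-variable model
print(f"R-Sq(adj), four variables: {adj_four:.1%}")
print(f"C-p, three variables: {cp_three:.1f}")
```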
32. The computer output is shown below:
Analysis of Variance
SOURCE DF SS MS F p
Regression 1 1076.1 1076.1 7.19 0.011
Residual Error 38 5687.9 149.7
Total 39 6764.0
At the .05 level of significance, dL = 1.44 and dU = 1.54. Since d = 1.55 > dU, there is no
significant positive autocorrelation.
Analysis of Variance
SOURCE DF SS MS F p
Regression 2 1818.6 909.3 6.80 0.003
Residual Error 37 4945.4 133.7
Total 39 6764.0
b. The residual plot as a function of the order in which the data are presented is shown below:
c. At the .05 level of significance, dL = 1.39 and dU = 1.60. Since dL ≤ d ≤ dU, the test is inconclusive.
D1 D2 Type
0 0 Non
1 0 Light
0 1 Heavy
Analysis of Variance
SOURCE DF SS MS F p
Regression 2 9.333 4.667 4.00 0.034
Residual Error 21 24.500 1.167
Total 23 33.833
Since the p-value = .034 is less than α = .05, there are significant differences between comfort
levels for the three types of browsers.
35. Let Mid-size = 1 if a mid-size car, 0 otherwise; Luxury = 1 if a luxury car, 0 otherwise; and Sports
= 1 if a sports car, 0 otherwise.
Analysis of Variance
Source DF SS MS F P
Regression 3 501.67 167.22 5.36 0.004
Residual Error 36 1123.70 31.21
Total 39 1625.37