
Chapter 15

Multiple Regression

Learning Objectives

1. Understand how multiple regression analysis can be used to develop relationships involving one
dependent variable and several independent variables.

2. Be able to interpret the coefficients in a multiple regression analysis.

3. Know the assumptions necessary to conduct statistical tests involving the hypothesized regression
model.

4. Understand the role of computer packages in performing multiple regression analysis.

5. Be able to interpret and use computer output to develop the estimated regression equation.

6. Be able to determine how good a fit is provided by the estimated regression equation.

7. Be able to test for the significance of the regression equation.

8. Understand how multicollinearity affects multiple regression analysis.

9. Know how residual analysis can be used to make a judgement as to the appropriateness of the model,
identify outliers, and determine which observations are influential.

10. Understand how logistic regression is used for regression analyses involving a binary dependent
variable.

© 2017 Cengage Learning. All Rights Reserved.
May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Solutions:

1. a. b1 = .5906 is an estimate of the change in y corresponding to a 1 unit change in x1 when x2 is held
constant.

b2 = .4980 is an estimate of the change in y corresponding to a 1 unit change in x2 when x1 is held
constant.

b. ŷ = 29.1270 + .5906(180) + .4980(310) = 289.82
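The point estimate in part (b) is a direct evaluation of the estimated regression equation. As a minimal sketch (the helper name `predict` is hypothetical and not part of the original solution), the arithmetic can be checked as follows:

```python
def predict(intercept, coefficients, values):
    """Evaluate y-hat = b0 + b1*x1 + ... + bp*xp for one set of x values."""
    if len(coefficients) != len(values):
        raise ValueError("need exactly one x value per coefficient")
    return intercept + sum(b * x for b, x in zip(coefficients, values))


# Exercise 1(b): b0 = 29.1270, b1 = .5906, b2 = .4980 with x1 = 180, x2 = 310
y_hat = predict(29.1270, [0.5906, 0.4980], [180, 310])
print(f"{y_hat:.4f}")
```

The same helper works for any estimated regression equation in this chapter; only the coefficient list and the x values change.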

2. a. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 10021.2 10021.2 15.53 0.004
X1 1 10021.2 10021.2 15.53 0.004
Error 8 5161.7 645.2
Lack-of-Fit 7 5157.2 736.7 163.72 0.060
Pure Error 1 4.5 4.5
Total 9 15182.9

Model Summary

S R-sq R-sq(adj) R-sq(pred)


25.4009 66.00% 61.75% 49.59%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 45.1 25.4 1.77 0.114
X1 1.944 0.493 3.94 0.004 1.00

Regression Equation

Y = 45.1 + 1.944 X1

The estimated regression equation is ŷ = 45.1 + 1.944x1

An estimate of y when x1 = 45 is ŷ = 45.1 + 1.944(45) = 132.58

b. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 3363.4 3363.4 2.28 0.170
X2 1 3363.4 3363.4 2.28 0.170
Error 8 11819.5 1477.4
Lack-of-Fit 6 11010.5 1835.1 4.54 0.192
Pure Error 2 809.0 404.5
Total 9 15182.9


Model Summary

S R-sq R-sq(adj) R-sq(pred)


38.4374 22.15% 12.42% 0.00%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 85.2 38.4 2.22 0.057
X2 4.32 2.86 1.51 0.170 1.00

Regression Equation

Y = 85.2 + 4.32 X2

The estimated regression equation is

ŷ = 85.2 + 4.32x2

An estimate of y when x2 = 15 is

ŷ = 85.2 + 4.32(15) = 150

c. Partial Minitab output is shown below:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 14052 7026.1 43.50 0.000
X1 1 10689 10688.7 66.17 0.000
X2 1 4031 4030.9 24.95 0.002
Error 7 1131 161.5
Total 9 15183

Model Summary

S R-sq R-sq(adj) R-sq(pred)


12.7096 92.55% 90.42% 87.95%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -18.4 18.0 -1.02 0.341
X1 2.010 0.247 8.13 0.000 1.00
X2 4.738 0.948 5.00 0.002 1.00

Regression Equation

Y = -18.4 + 2.010 X1 + 4.738 X2

The estimated regression equation is

ŷ = -18.4 + 2.01x1 + 4.738x2


An estimate of y when x1 = 45 and x2 = 15 is

ŷ = -18.4 + 2.01(45) + 4.738(15) = 143.12

3. a. b1 = 3.8 is an estimate of the change in y corresponding to a 1 unit change in x1 when x2, x3, and x4
are held constant.

b2 = -2.3 is an estimate of the change in y corresponding to a 1 unit change in x2 when x1, x3, and x4
are held constant.

b3 = 7.6 is an estimate of the change in y corresponding to a 1 unit change in x3 when x1, x2, and x4
are held constant.

b4 = 2.7 is an estimate of the change in y corresponding to a 1 unit change in x4 when x1, x2, and x3
are held constant.

b. ŷ = 17.6 + 3.8(10) – 2.3(5) + 7.6(1) + 2.7(2) = 57.1

4. a. ŷ = 25 + 10(15) + 8(10) = 255; sales estimate: $255,000

b. Sales can be expected to increase by $10 for every dollar increase in inventory investment when
advertising expenditure is held constant. Sales can be expected to increase by $8 for every dollar
increase in advertising expenditure when inventory investment is held constant.
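The "held constant" interpretation in part (b) can be illustrated numerically. The sketch below assumes the estimated equation used in part (a), ŷ = 25 + 10x1 + 8x2; the function and variable names are hypothetical.

```python
def estimated_sales(inventory, advertising):
    """Estimated sales from the exercise 4 equation: y-hat = 25 + 10*x1 + 8*x2."""
    return 25 + 10 * inventory + 8 * advertising


base = estimated_sales(15, 10)

# Raise one variable by one unit while the other is held constant.
change_from_inventory = estimated_sales(16, 10) - base
change_from_advertising = estimated_sales(15, 11) - base

print(base, change_from_inventory, change_from_advertising)
```

Each difference equals the coefficient of the variable that was changed, which is exactly what the interpretation of b1 and b2 states.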

5. a. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 16.640 16.640 11.27 0.015
Televison_Advertising_($1000s) 1 16.640 16.640 11.27 0.015
Error 6 8.860 1.477
Lack-of-Fit 4 6.360 1.590 1.27 0.485
Pure Error 2 2.500 1.250
Total 7 25.500

Model Summary

S R-sq R-sq(adj) R-sq(pred)


1.21518 65.26% 59.46% 28.39%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 88.64 1.58 56.02 0.000
Televison_Advertising_($1000s) 1.604 0.478 3.36 0.015 1.00

Regression Equation

Weekly Gross_Revenue_($1000s) = 88.64 + 1.604 Televison_Advertising_($1000s)


b. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 23.435 11.7177 28.38 0.002
Televison_Advertising_($1000s) 1 23.425 23.4247 56.73 0.001
Newspaper_Advertising_($1000s) 1 6.795 6.7953 16.46 0.010
Error 5 2.065 0.4129
Total 7 25.500

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.642587 91.90% 88.66% 68.19%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 83.23 1.57 52.88 0.000
Televison_Advertising_($1000s) 2.290 0.304 7.53 0.001 1.45
Newspaper_Advertising_($1000s) 1.301 0.321 4.06 0.010 1.45

Regression Equation

Weekly Gross_Revenue_($1000s) = 83.23 + 2.290 Televison_Advertising_($1000s) +


1.301 Newspaper_Advertising_($1000s)

c. No, it is 1.60 in part (a) and 2.29 above. In part (b) it represents the marginal change in revenue due
to an increase in television advertising with newspaper advertising held constant.

d. Revenue = 83.2 + 2.290(3.5) + 1.301(1.8) = 93.5568 ($1000s), or $93,556.80

6. a. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 4814.3 4814.3 19.11 0.001
Yds/Att 1 4814.3 4814.3 19.11 0.001
Error 14 3527.4 252.0
Lack-of-Fit 13 3037.6 233.7 0.48 0.829
Pure Error 1 489.8 489.8
Total 15 8341.7

Model Summary

S R-sq R-sq(adj) R-sq(pred)


15.8732 57.71% 54.69% 44.88%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -58.8 26.2 -2.25 0.041
Yds/Att 16.39 3.75 4.37 0.001 1.00


Regression Equation

Win% = -58.8 + 16.39 Yds/Att

Fits and Diagnostics for Unusual Observations

Obs Win% Fit Resid Std Resid
14 81.30 47.77 33.53 2.19 R

R Large residual

b. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 3653 3652.8 10.91 0.005
Int/Att 1 3653 3652.8 10.91 0.005
Error 14 4689 334.9
Lack-of-Fit 11 3536 321.4 0.84 0.644
Pure Error 3 1153 384.4
Total 15 8342

Model Summary

S R-sq R-sq(adj) R-sq(pred)


18.3008 43.79% 39.77% 26.48%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 97.5 13.9 7.04 0.000
Int/Att -1600 485 -3.30 0.005 1.00

Regression Equation

Win% = 97.5 - 1600 Int/Att

Fits and Diagnostics for Unusual Observations

Obs Win% Fit Resid Std Resid


8 12.50 55.93 -43.43 -2.45 R

R Large residual

c. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 6277 3138.5 19.76 0.000
Yds/Att 1 2624 2624.2 16.52 0.001
Int/Att 1 1463 1462.8 9.21 0.010
Error 13 2065 158.8
Total 15 8342


Model Summary

S R-sq R-sq(adj) R-sq(pred)


12.6024 75.25% 71.44% 60.51%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -5.8 27.1 -0.21 0.835
Yds/Att 12.95 3.19 4.06 0.001 1.15
Int/Att -1084 357 -3.03 0.010 1.15

Regression Equation

Win% = -5.8 + 12.95 Yds/Att - 1084 Int/Att

Fits and Diagnostics for Unusual Observations

Obs Win% Fit Resid Std Resid


8 12.50 38.57 -26.07 -2.28 R

R Large residual

d. The predicted value of Win% for the Kansas City Chiefs is

Win% = - 5.8 + 12.95(6.2) – 1084(.036) = 35.47%

With 7 wins and 9 losses, the Kansas City Chiefs won 43.75% of the games they played. The
predicted value is somewhat lower than the actual value.
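The comparison in part (d) amounts to computing a residual (actual minus predicted). A minimal sketch, using the rounded coefficients from part (c); the function name is hypothetical:

```python
def predicted_win_pct(yds_per_att, int_per_att):
    """Win% from the part (c) equation with the rounded Minitab coefficients."""
    return -5.8 + 12.95 * yds_per_att - 1084 * int_per_att


predicted = predicted_win_pct(6.2, 0.036)  # Kansas City Chiefs inputs from part (d)
actual = 100 * 7 / 16                      # 7 wins in 16 games
residual = actual - predicted              # positive: the model under-predicts

print(f"predicted = {predicted:.2f}%, actual = {actual:.2f}%, residual = {residual:.2f}")
```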

7. a. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 66.34 66.343 9.87 0.014
Contrast Ratio 1 66.34 66.343 9.87 0.014
Error 8 53.76 6.720
Lack-of-Fit 7 41.26 5.894 0.47 0.811
Pure Error 1 12.50 12.500
Total 9 120.10

Model Summary

S R-sq R-sq(adj) R-sq(pred)


2.59221 55.24% 49.65% 37.16%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 69.89 3.85 18.17 0.000
Contrast Ratio 0.1699 0.0541 3.14 0.014 1.00

Regression Equation

Overall Rating = 69.89 + 0.1699 Contrast Ratio


b. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 100.51 50.255 17.96 0.002
Contrast Ratio 1 24.32 24.323 8.69 0.021
Resolution 1 34.17 34.168 12.21 0.010
Error 7 19.59 2.798
Total 9 120.10

Model Summary

S R-sq R-sq(adj) R-sq(pred)


1.67285 83.69% 79.03% 64.56%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 43.10 8.06 5.35 0.001
Contrast Ratio 0.1134 0.0385 2.95 0.021 1.21
Resolution 0.382 0.109 3.49 0.010 1.21

Regression Equation

Overall Rating = 43.10 + 0.1134 Contrast Ratio + 0.382 Resolution

c. ŷ = 43.10 + .1134(85) + .382(74) = 81.007 or 81

8. a. Partial Minitab output follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 60.202 60.2022 17.21 0.001
Shore Excursions 1 60.202 60.2022 17.21 0.001
Error 18 62.963 3.4980
Lack-of-Fit 17 62.463 3.6743 7.35 0.283
Pure Error 1 0.500 0.5000
Total 19 123.166

Model Summary

S R-sq R-sq(adj) R-sq(pred)


1.87028 48.88% 46.04% 39.19%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 69.30 4.80 14.44 0.000
Shore Excursions 0.2348 0.0566 4.15 0.001 1.00

Regression Equation

Overall = 69.30 + 0.2348 Shore Excursions


b. Partial Minitab output follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 90.95 45.477 24.00 0.000
Shore Excursions 1 69.05 69.053 36.44 0.000
Food/Dining 1 30.75 30.752 16.23 0.001
Error 17 32.21 1.895
Total 19 123.17

Model Summary

S R-sq R-sq(adj) R-sq(pred)


1.37650 73.85% 70.77% 62.81%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 45.18 6.95 6.50 0.000
Shore Excursions 0.2529 0.0419 6.04 0.000 1.01
Food/Dining 0.2482 0.0616 4.03 0.001 1.01

Regression Equation

Overall = 45.18 + 0.2529 Shore Excursions + 0.2482 Food/Dining

c. ŷ = 45.18 + .2529(Shore Excursions) + .2482(Food/Dining) = 45.18 + .2529(80) + .2482(90) = 87.75

Thus, an estimate of the overall score is approximately 88.

9. a. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 6241.07 6241.07 455.79 0.000
Club Head Speed 1 6241.07 6241.07 455.79 0.000
Error 188 2574.25 13.69
Lack-of-Fit 177 2521.50 14.25 2.97 0.024
Pure Error 11 52.75 4.80
Total 189 8815.32

Model Summary

S R-sq R-sq(adj) R-sq(pred)


3.70038 70.80% 70.64% 70.04%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 124.72 7.41 16.83 0.000
Club Head Speed 1.3943 0.0653 21.35 0.000 1.00


Regression Equation

Total Distance = 124.72 + 1.3943 Club Head Speed

b. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 6591.4 6591.42 557.21 0.000
Ball Speed 1 6591.4 6591.42 557.21 0.000
Error 188 2223.9 11.83
Lack-of-Fit 180 2106.8 11.70 0.80 0.728
Pure Error 8 117.1 14.64
Total 189 8815.3

Model Summary

S R-sq R-sq(adj) R-sq(pred)


3.43937 74.77% 74.64% 74.13%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 117.14 7.02 16.68 0.000
Ball Speed 0.9876 0.0418 23.61 0.000 1.00

Regression Equation

Total Distance = 117.14 + 0.9876 Ball Speed

c. The following scatter diagram illustrates the relationship between the two variables.

[Scatter diagram: Ball Speed (vertical axis, 150.00 to 190.00) versus Club Head Speed (horizontal axis, 100.00 to 130.00)]

The scatter diagram shows a very strong linear relationship between the two variables. In fact, for
these data the coefficient of determination is approximately .99. As a result, using both variables in
the same model is not recommended: once the linear effect of one variable is accounted for,
the other variable will be of little additional value. This situation, referred to as multicollinearity, is
discussed later in the chapter in the section on testing for significance.
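One way to quantify this is the variance inflation factor reported in the VIF column of the Minitab output. The sketch below assumes the standard definition VIF = 1/(1 - R²), where R² is the coefficient of determination from regressing one independent variable on the others. With only two independent variables, that R² is the squared correlation between them. The value .99 is the approximate figure quoted above, and `vif` is a hypothetical helper name.

```python
def vif(r_squared_j):
    """Variance inflation factor for a predictor whose regression on the
    other predictors has coefficient of determination r_squared_j."""
    if not 0 <= r_squared_j < 1:
        raise ValueError("r_squared_j must be in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


# Club Head Speed vs. Ball Speed: coefficient of determination of about .99
print(f"{vif(0.99):.1f}")

# A predictor unrelated to the others has VIF = 1, as in the one-variable models.
print(f"{vif(0.0):.2f}")
```

A VIF near 100 is far above the values of 1.00 to 1.45 that appear elsewhere in these solutions, which is consistent with the advice not to use both variables in the same model.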


d. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 7297.8 3648.89 449.64 0.000
Ball Speed 1 7297.7 7297.74 899.27 0.000
Launch Angle 1 706.4 706.36 87.04 0.000
Error 187 1517.5 8.12
Total 189 8815.3

Model Summary

S R-sq R-sq(adj) R-sq(pred)


2.84872 82.79% 82.60% 82.16%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 81.60 6.95 11.74 0.000
Ball Speed 1.0927 0.0364 29.99 0.000 1.11
Launch Angle 1.646 0.176 9.33 0.000 1.11

Regression Equation

Total Distance = 81.60 + 1.0927 Ball Speed + 1.646 Launch Angle

e. ŷ = predicted Total Distance = 81.6 + 1.0927(170) + 1.646(11) = 285.465 yards

10. a. Partial Minitab output follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 0.047263 0.047263 13.01 0.002
SO/IP 1 0.047263 0.047263 13.01 0.002
Error 18 0.065392 0.003633
Lack-of-Fit 16 0.063942 0.003996 5.51 0.164
Pure Error 2 0.001450 0.000725
Total 19 0.112655

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.0602733 41.95% 38.73% 29.38%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 0.6758 0.0631 10.71 0.000
SO/IP -0.2838 0.0787 -3.61 0.002 1.00

Regression Equation

R/IP = 0.6758 - 0.2838 SO/IP


b. Partial Minitab output follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 0.02887 0.028874 6.20 0.023
HR/IP 1 0.02887 0.028874 6.20 0.023
Error 18 0.08378 0.004655
Lack-of-Fit 6 0.01258 0.002097 0.35 0.894
Pure Error 12 0.07120 0.005933
Total 19 0.11265

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.0682239 25.63% 21.50% 14.87%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 0.3081 0.0604 5.10 0.000
HR/IP 1.347 0.541 2.49 0.023 1.00

Regression Equation

R/IP = 0.3081 + 1.347 HR/IP

c. Partial Minitab output follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 0.06348 0.031738 10.97 0.001
SO/IP 1 0.03460 0.034603 11.96 0.003
HR/IP 1 0.01621 0.016214 5.60 0.030
Error 17 0.04918 0.002893
Total 19 0.11266

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.0537850 56.35% 51.21% 41.25%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 0.5365 0.0814 6.59 0.000
SO/IP -0.2483 0.0718 -3.46 0.003 1.05
HR/IP 1.032 0.436 2.37 0.030 1.05

Regression Equation

R/IP = 0.5365 - 0.2483 SO/IP + 1.032 HR/IP


d. Using the estimated regression equation in part (c) we obtain

R/IP = 0.5365 - 0.2483 SO/IP + 1.032 HR/IP = 0.5365 - 0.2483(.91) + 1.032(.16) = .4757

The predicted value for R/IP was less than the actual value.

e. This suggestion does not make sense. If a pitcher gives up more runs per inning pitched, this pitcher’s
earned run average also has to increase. For these data the sample correlation coefficient between
ERA and R/IP is .964. The following partial Minitab output shows the results for part (c) using ERA
as the dependent variable.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 5.174 2.5870 14.17 0.000
SO/IP 1 1.905 1.9052 10.44 0.005
HR/IP 1 2.190 2.1901 12.00 0.003
Error 17 3.103 0.1825
Total 19 8.276

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.427204 62.51% 58.10% 50.29%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 3.878 0.647 6.00 0.000
SO/IP -1.843 0.570 -3.23 0.005 1.05
HR/IP 11.99 3.46 3.46 0.003 1.05

Regression Equation

ERA = 3.878 - 1.843 SO/IP + 11.99 HR/IP

11. a. SSE = SST - SSR = 6,724.125 - 6,216.375 = 507.75

b. R² = SSR/SST = 6,216.375/6,724.125 = .924

c. Ra² = 1 - (1 - R²)(n - 1)/(n - p - 1) = 1 - (1 - .924)(10 - 1)/(10 - 2 - 1) = .902

d. The estimated regression equation provided an excellent fit.
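The three quantities in this exercise follow from SST and SSR alone. A minimal sketch, assuming n = 10 and p = 2 as in part (c); the function names are hypothetical:

```python
def r_squared(ssr, sst):
    """Multiple coefficient of determination: R^2 = SSR / SST."""
    return ssr / sst


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


sst, ssr = 6724.125, 6216.375
sse = sst - ssr

# The solution rounds R^2 to three decimals before adjusting, so do the same.
r2 = round(r_squared(ssr, sst), 3)
ra2 = adjusted_r_squared(r2, n=10, p=2)

print(sse, r2, round(ra2, 3))
```

Carrying the unrounded R² into the adjustment gives about .903 rather than .902; the difference comes only from rounding R² first.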

12. a. R² = SSR/SST = 14,052.2/15,182.9 = .926

b. Ra² = 1 - (1 - R²)(n - 1)/(n - p - 1) = 1 - (1 - .926)(10 - 1)/(10 - 2 - 1) = .905


c. Yes; after adjusting for the number of independent variables in the model, we see that 90.5% of the
variability in y has been accounted for.

13. a. R² = SSR/SST = 1760/1805 = .975

b. Ra² = 1 - (1 - R²)(n - 1)/(n - p - 1) = 1 - (1 - .975)(30 - 1)/(30 - 4 - 1) = .971

c. The estimated regression equation provided an excellent fit.

14. a. R² = SSR/SST = 12,000/16,000 = .75

b. Ra² = 1 - (1 - R²)(n - 1)/(n - p - 1) = 1 - .25(9/7) = .68

c. The adjusted coefficient of determination shows that 68% of the variability has been explained by
the two independent variables; thus, we conclude that the model does not explain a large amount of
variability.

15. a. R² = SSR/SST = 23.435/25.5 = .919

Ra² = 1 - (1 - R²)(n - 1)/(n - p - 1) = 1 - (1 - .919)(8 - 1)/(8 - 2 - 1) = .887

b. Multiple regression analysis is preferred since both R² and Ra² show an increased percentage of the
variability of y explained when both independent variables are used.

16. a. r² = .577. Thus, the average number of passing yards per attempt is able to explain 57.7% of the
variability in the percentage of games won. Considering the nature of the data and all the other
factors that might be related to the number of games won, this is not too bad a fit.

b. The value of the coefficient of determination increased to R² = .752, and the adjusted coefficient of
determination is Ra² = .714. Thus, using both independent variables provides a much better fit.

17. a. A portion of the Minitab output from part (d) of exercise 9 follows:

Model Summary

S R-sq R-sq(adj) R-sq(pred)


2.84872 82.79% 82.60% 82.16%

The value of R-sq = 82.79% and the value of R-sq(adj) = 82.60% indicate that the estimated
regression equation provided a very good fit.

b. A portion of the Minitab output from part (b) of exercise 9 follows:

Model Summary

S R-sq R-sq(adj) R-sq(pred)


3.43937 74.77% 74.64% 74.13%


The value of R-sq = 74.77% indicates that using just ball speed can account for 74.77% of the
variability in total distance. The addition of launch angle increases the percentage to almost 83%.
Therefore, the estimated regression equation using both ball speed and launch angle will provide
better predictions.

18. a. A portion of the Minitab output follows:

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.0537850 56.35% 51.21% 41.25%

The Minitab output in part (c) of exercise 10 shows that R-sq = .5635 and R-sq(adj) = .5121.

b. The fit is not great, but considering the nature of the data being able to explain slightly more than
50% of the variability in the number of runs given up per inning pitched using just two independent
variables is not too bad.

c. Partial Minitab output using ERA as the dependent variable follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 5.174 2.5870 14.17 0.000
SO/IP 1 1.905 1.9052 10.44 0.005
HR/IP 1 2.190 2.1901 12.00 0.003
Error 17 3.103 0.1825
Total 19 8.276

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.427204 62.51% 58.10% 50.29%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 3.878 0.647 6.00 0.000
SO/IP -1.843 0.570 -3.23 0.005 1.05
HR/IP 11.99 3.46 3.46 0.003 1.05

Regression Equation

ERA = 3.878 - 1.843 SO/IP + 11.99 HR/IP

The Minitab output shows that R-sq = .6251 and R-sq(adj) = .5810

Approximately 60% of the variability in the ERA can be explained by the linear effect of HR/IP and
SO/IP. This is not too bad considering the complexity of predicting pitching performance.

19. a. MSR = SSR/p = 6,216.375/2 = 3,108.188

MSE = SSE/(n - p - 1) = 507.75/(10 - 2 - 1) = 72.536


b. F = MSR/MSE = 3,108.188/72.536 = 42.85

Using F table (2 degrees of freedom numerator and 7 denominator), p-value is less than .01

Actual p-value = .0001

Because p-value ≤ α = .05, the overall model is significant.

c. t = .5906/.0813 = 7.26

Using t table (7 degrees of freedom), area in tail is less than .005; p-value is less than .01

Actual p-value = .0002

Because p-value ≤ α, β1 is significant.

d. t = .4980/.0567 = 8.78

Using t table (7 degrees of freedom), area in tail is less than .005; p-value is less than .01

Actual p-value = .0001

Because p-value ≤ α, β2 is significant.
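The test statistics in this exercise are simple ratios. A minimal sketch that recomputes them from the sums of squares, coefficients, and standard errors quoted above (the variable names are hypothetical):

```python
ssr, sse = 6216.375, 507.75
n, p = 10, 2

msr = ssr / p              # mean square due to regression
mse = sse / (n - p - 1)    # mean square error, n - p - 1 = 7 degrees of freedom
f_stat = msr / mse         # overall F test statistic

# t statistic for each coefficient: b_i divided by its standard error
t_b1 = 0.5906 / 0.0813
t_b2 = 0.4980 / 0.0567

print(f"MSR = {msr:.3f}, MSE = {mse:.3f}, F = {f_stat:.2f}")
print(f"t(b1) = {t_b1:.2f}, t(b2) = {t_b2:.2f}")
```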

20. A portion of the Minitab output follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 14052 7026.1 43.50 0.000
X1 1 10689 10688.7 66.17 0.000
X2 1 4031 4030.9 24.95 0.002
Error 7 1131 161.5
Total 9 15183

Model Summary

S R-sq R-sq(adj) R-sq(pred)


12.7096 92.55% 90.42% 87.95%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -18.4 18.0 -1.02 0.341
X1 2.010 0.247 8.13 0.000 1.00
X2 4.738 0.948 5.00 0.002 1.00

Regression Equation

Y = -18.4 + 2.010 X1 + 4.738 X2

a. Since the p-value corresponding to F = 43.50 is .000 < α = .05, we reject H0: β1 = β2 = 0; there is a
significant relationship.


b. Since the p-value corresponding to t = 8.13 is .000 < α = .05, we reject H0: β1 = 0; β1 is significant.

c. Since the p-value corresponding to t = 5.00 is .002 < α = .05, we reject H0: β2 = 0; β2 is significant.

21. a. In the two independent variable case the coefficient of x1 represents the expected change in y
corresponding to a one unit increase in x1 when x2 is held constant. In the single independent variable
case the coefficient of x1 represents the expected change in y corresponding to a one unit increase in
x1.

b. Yes. If x1 and x2 are correlated one would expect a change in x1 to be accompanied by a change in
x2.

22. a. SSE = SST - SSR = 16000 - 12000 = 4000

s² = SSE/(n - p - 1) = 4000/7 = 571.43

MSR = SSR/p = 12000/2 = 6000

b. F = MSR/MSE = 6000/571.43 = 10.50

Using F table (2 degrees of freedom numerator and 7 denominator), p-value is less than .01

Actual p-value = .008

Because p-value ≤ α, we reject H0. There is a significant relationship among the variables.
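The "actual p-value" can be checked without a table in this particular case. When the numerator has exactly 2 degrees of freedom, the upper-tail area of the F distribution has the closed form P(F > f) = (1 + 2f/d)^(-d/2), where d is the denominator degrees of freedom. The sketch below relies on that special case only; it does not apply to other numerator degrees of freedom. The values n = 10 and p = 2 are inferred from the 7 error degrees of freedom used above, and the function name is hypothetical.

```python
def f_upper_tail_two_num_df(f_value, denom_df):
    """P(F > f_value) for an F distribution with 2 numerator and
    denom_df denominator degrees of freedom (closed form for this case only)."""
    return (1 + 2 * f_value / denom_df) ** (-denom_df / 2)


sst, ssr = 16000, 12000
n, p = 10, 2

sse = sst - ssr
mse = sse / (n - p - 1)
msr = ssr / p
f_stat = msr / mse
p_value = f_upper_tail_two_num_df(f_stat, n - p - 1)

print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")
```

Applied to exercise 19 (F = 42.85 with 7 denominator degrees of freedom), the same function gives a value that rounds to .0001, in agreement with the p-value reported there.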

23. a. F = 28.38

Using F table (2 degrees of freedom numerator and 5 denominator), p-value is less than .01

Actual p-value = .002

Because p-value ≤ α, there is a significant relationship.

b. t = 7.53

Using t table (5 degrees of freedom), area in tail is less than .005; p-value is less than .01

Actual p-value = .001

Because p-value ≤ α, β1 is significant and x1 should not be dropped from the model.

c. t = 4.06

Actual p-value = .010

Because p-value ≤ α, β2 is significant and x2 should not be dropped from the model.


24. a. Partial Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 6179 3089.6 13.18 0.000
OffPassYds/G 1 6079 6079.5 25.94 0.000
DefYds/G 1 1713 1712.6 7.31 0.011
Error 29 6797 234.4
Total 31 12976

Model Summary

S R-sq R-sq(adj) R-sq(pred)


15.3096 47.62% 44.01% 33.45%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 60.5 28.4 2.14 0.041
OffPassYds/G 0.3186 0.0626 5.09 0.000 1.21
DefYds/G -0.2413 0.0893 -2.70 0.011 1.21

Regression Equation

Win% = 60.5 + 0.3186 OffPassYds/G - 0.2413 DefYds/G

ŷ = 60.5 + 0.3186 OffPassYds/G ˗ 0.2413 DefYds/G

b. Because the p-value for the F test = .000 < α = .05, there is a significant relationship.

c. For OffPassYds/G: Because the p-value = .000 < α = .05, OffPassYds/G is significant.

For DefYds/G: Because the p-value = .011 < α = .05, DefYds/G is significant.

25. a. Partial Minitab output follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 3 92.352 30.784 15.98 0.000
Itineraries/Schedule 1 1.398 1.398 0.73 0.407
Shore Excursions 1 61.261 61.261 31.81 0.000
Food/Dining 1 30.539 30.539 15.86 0.001
Error 16 30.813 1.926
Total 19 123.166

Model Summary

S R-sq R-sq(adj) R-sq(pred)


1.38775 74.98% 70.29% 58.09%


Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 35.6 13.2 2.69 0.016
Itineraries/Schedule 0.110 0.130 0.85 0.407 1.05
Shore Excursions 0.2445 0.0434 5.64 0.000 1.07
Food/Dining 0.2474 0.0621 3.98 0.001 1.01

Regression Equation

Overall = 35.6 + 0.110 Itineraries/Schedule + 0.2445 Shore Excursions


+ 0.2474 Food/Dining

b. Because the p-value corresponding to F = 15.98, 0.000, is less than .05, the level of significance,
overall there is a significant relationship.

c. Because the p-value for Itineraries/Schedule (.407) is greater than the level of significance (.05),
Itineraries/Schedule is not significant. Shore Excursions (p-value = .000) and Food/Dining (p-value
= .001) are both significant because the p-value for each of these independent variables is less than
the level of significance (.05).

d. After removing Itineraries/Schedule from the model, we obtained the following Minitab output.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 90.95 45.477 24.00 0.000
Shore Excursions 1 69.05 69.053 36.44 0.000
Food/Dining 1 30.75 30.752 16.23 0.001
Error 17 32.21 1.895
Total 19 123.17

Model Summary

S R-sq R-sq(adj) R-sq(pred)


1.37650 73.85% 70.77% 62.81%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 45.18 6.95 6.50 0.000
Shore Excursions 0.2529 0.0419 6.04 0.000 1.01
Food/Dining 0.2482 0.0616 4.03 0.001 1.01

Regression Equation

Overall = 45.18 + 0.2529 Shore Excursions + 0.2482 Food/Dining

With Itineraries/Schedule in the model, the R-sq was .7498, while the R-sq after Itineraries/Schedule
was removed from the model was .7385. Removing Itineraries/Schedule from the model resulted in
almost no loss in the model’s ability to explain variability in the Overall Score.
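The adjusted coefficient of determination makes the same point from a different angle. The sketch below recomputes R-sq(adj) for both models from the reported R-sq values, assuming n = 20 (from the 19 total degrees of freedom); the function name is hypothetical.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


n = 20  # Total DF = 19 in the Minitab output

with_itineraries = adjusted_r_squared(0.7498, n, p=3)
without_itineraries = adjusted_r_squared(0.7385, n, p=2)

print(f"{100 * with_itineraries:.2f}% vs {100 * without_itineraries:.2f}%")
```

R-sq falls slightly when Itineraries/Schedule is removed, but R-sq(adj) rises, because the small loss in explained variability is more than offset by using one fewer independent variable.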


26. The partial Minitab output from part (c) of exercise 10 follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 0.06348 0.031738 10.97 0.001
SO/IP 1 0.03460 0.034603 11.96 0.003
HR/IP 1 0.01621 0.016214 5.60 0.030
Error 17 0.04918 0.002893
Total 19 0.11266

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.0537850 56.35% 51.21% 41.25%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 0.5365 0.0814 6.59 0.000
SO/IP -0.2483 0.0718 -3.46 0.003 1.05
HR/IP 1.032 0.436 2.37 0.030 1.05

Regression Equation

R/IP = 0.5365 - 0.2483 SO/IP + 1.032 HR/IP

a. The p-value associated with F = 10.97 is .001. Because the p-value < .05, there is a significant
overall relationship.

b. For SO/IP, the p-value associated with t = -3.46 is .003. Because the p-value < .05, SO/IP is
significant. For HR/IP, the p-value associated with t = 2.37 is .030. Because the p-value < .05, HR/IP
is also significant.

27. a. ŷ = 29.1270 + .5906(180) + .4980(310) = 289.8150

b. The point estimate for an individual value is ŷ = 289.8150, the same as the point estimate of the
mean value.

28. Partial Minitab output follows:

Regression Equation

Y = -18.4 + 2.010 X1 + 4.738 X2

Variable Setting
X1 45
X2 15

Fit SE Fit 95% CI 95% PI


143.157 4.64909 (132.164, 154.151) (111.156, 175.158)


a. The 95% confidence interval is 132.164 to 154.151.

b. The 95% prediction interval is 111.156 to 175.158.
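Both intervals can be rebuilt from the Fit and SE Fit values together with s = 12.7096 from the Model Summary in exercise 2(c). The sketch below assumes the usual forms: Fit ± t(SE Fit) for the mean and Fit ± t·sqrt(s² + SE Fit²) for an individual value. The critical value t = 2.3646 (7 degrees of freedom, .025 in the upper tail) is hardcoded from a t table rather than computed, and the variable names are hypothetical.

```python
import math

fit, se_fit = 143.157, 4.64909   # from the Minitab prediction output
s = 12.7096                      # standard error of the estimate (Model Summary)
t_crit = 2.3646                  # t table value: 7 df, upper-tail area .025

ci_margin = t_crit * se_fit
pi_margin = t_crit * math.sqrt(s**2 + se_fit**2)

ci = (fit - ci_margin, fit + ci_margin)   # interval for the mean value of y
pi = (fit - pi_margin, fit + pi_margin)   # interval for an individual value of y

print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI: ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval is wider because it adds the variability of an individual observation (s²) to the uncertainty in the estimated mean.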

29. a. ŷ = 83.2 + 2.29(3.5) + 1.30(1.8) = 93.555 or $93,555

Note: In Exercise 5b, the Minitab output also shows that b0 = 83.23, b1 = 2.290, and b2 = 1.301;
hence, ŷ = 83.23 + 2.290x1 + 1.301x2. Using this estimated regression equation, we obtain

ŷ = 83.23 + 2.29(3.5) + 1.301(1.8) = 93.5868 or $93,586.80

The difference ($93,586.80 - $93,555 = $31.80) is simply due to the fact that additional significant
digits are used in the computations. From a practical point of view, however, the difference is not
enough to be concerned about. In practice, a computer software package is always used to perform
the computations and this will not be an issue.
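The size of the rounding effect is easy to verify. A minimal sketch comparing the two sets of coefficients quoted above (the variable names are hypothetical):

```python
tv, newspaper = 3.5, 1.8  # advertising in $1000s

# Coefficients rounded as in the first computation
rounded_estimate = 83.2 + 2.29 * tv + 1.30 * newspaper

# Coefficients with the extra digits shown in the Minitab output
minitab_estimate = 83.23 + 2.290 * tv + 1.301 * newspaper

# Revenue is in $1000s, so multiply by 1000 to express the gap in dollars.
difference_dollars = (minitab_estimate - rounded_estimate) * 1000

print(f"{rounded_estimate:.4f} {minitab_estimate:.4f} {difference_dollars:.2f}")
```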

b. Partial Minitab output follows:

Regression Equation

Weekly Gross_Revenue_($1000s) = 83.23


+ 2.290 Televison_Advertising_($1000s)
+ 1.301 Newspaper_Advertising_($1000s)

Variable Setting
Televison_Advertising_($1000s) 3.5
Newspaper_Advertising_($1000s) 1.8

Fit SE Fit 95% CI 95% PI


93.5875 0.290886 (92.8398, 94.3353) (91.7743, 95.4007)

Confidence interval estimate: 92.8398 to 94.3353 or $92,839.80 to $94,335.30

c. From the partial Minitab output provided in part (b)

Prediction interval estimate: 91.7743 to 95.4007 or $91,774.30 to $95,400.70

30. a. Partial Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 2 6179 47.62% 6179 3089.6 13.18 0.000
OffPassYds/G 1 4466 34.42% 6079 6079.5 25.94 0.000
DefYds/G 1 1713 13.20% 1713 1712.6 7.31 0.011
Error 29 6797 52.38% 6797 234.4
Total 31 12976 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


15.3096 47.62% 44.01% 8636.13 33.45%


Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant 60.5 28.4 ( 2.5, 118.5) 2.14 0.041
OffPassYds/G 0.3186 0.0626 ( 0.1907, 0.4466) 5.09 0.000 1.21
DefYds/G -0.2413 0.0893 (-0.4239, -0.0587) -2.70 0.011 1.21

Regression Equation

Win% = 60.5 + 0.3186 OffPassYds/G - 0.2413 DefYds/G

The estimated regression equation is

ŷ = 60.5 + 0.3186 OffPassYds/G - 0.2413 DefYds/G

For OffPassYds/G = 225 and DefYds/G = 300, the predicted value of the percentage of games won
is

ŷ = 60.5 + 0.3186(225) - 0.2413(300) = 59.795

b. Partial Minitab output follows

Regression Equation

Win% = 60.5 + 0.3186 OffPassYds/G - 0.2413 DefYds/G

Variable Setting
OffPassYds/G 225
DefYds/G 300

Fit SE Fit 95% CI 95% PI


59.8270 4.88676 (49.8325, 69.8216) (26.9589, 92.6952)

The 95% prediction interval is 26.9589 to 92.6952.
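
The relationship between the confidence interval and the wider prediction interval can be sketched from the values Minitab reports. The sketch below assumes t.025 = 2.045 for n - p - 1 = 29 degrees of freedom (a t-table value, not part of the output shown) and uses S from the Model Summary in part (a); the variable names are illustrative.

```python
import math

fit = 59.8270      # Fit from the Minitab output
se_fit = 4.88676   # SE Fit from the Minitab output
s = 15.3096        # S from the Model Summary in part (a)
t_crit = 2.045     # assumed t-table value: t.025 with 29 degrees of freedom

# Confidence interval for the mean value: fit +/- t * SE Fit
ci_margin = t_crit * se_fit
ci = (fit - ci_margin, fit + ci_margin)

# Prediction interval for an individual value: the error variance s^2 is added
pi_margin = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
pi = (fit - pi_margin, fit + pi_margin)

print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI: ({pi[0]:.2f}, {pi[1]:.2f})")
```

The results agree with the Minitab intervals to about two decimal places; the small remaining differences come from the rounded t value.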

31. a. Partial Minitab output follows.

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 2 3.9954 68.27% 3.9954 1.9977 11.84 0.002
Trade Price 1 0.2442 4.17% 0.9665 0.9665 5.73 0.036
Speed of Execution 1 3.7512 64.10% 3.7512 3.7512 22.22 0.001
Error 11 1.8567 31.73% 1.8567 0.1688
Total 13 5.8521 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


0.410845 68.27% 62.50% 3.45220 41.01%


Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant -0.783 0.942 (-2.857, 1.290) -0.83 0.423
Trade Price 0.558 0.233 ( 0.045, 1.071) 2.39 0.036 1.07
Speed of Execution 0.734 0.156 ( 0.391, 1.077) 4.71 0.001 1.07

Regression Equation

Satisfaction Electronic Trades = -0.783 + 0.558 Trade Price
+ 0.734 Speed of Execution

ŷ = -0.783 + 0.558 Trade Price + 0.734 Speed of Execution

b. Satisfaction Electronic Trades = - 0.783 + 0.558(3) + 0.734(3) = 3.093

c./d. A portion of the Minitab output follows.

Regression Equation

Satisfaction Electronic Trades = -0.783 + 0.558 Trade Price + 0.734 Speed of Execution

Variable Setting
Trade Price 3
Speed of Execution 3

Fit SE Fit 95% CI 95% PI


3.09292 0.111486 (2.84754, 3.33830) (2.15596, 4.02989)
Predicted Values for New Observations

For part (c) the 95% confidence interval is 2.84754 to 3.33830

For part (d) the 95% prediction interval is 2.15596 to 4.02989; but, because the highest possible
rating is 4, the upper end of the prediction interval is treated as 4.

32. a. E(y) = β0 + β1x1 + β2x2 where

x2 = 0 if level 1 and 1 if level 2

b. E(y) = β0 + β1x1 + β2(0) = β0 + β1x1

c. E(y) = β0 + β1x1 + β2(1) = β0 + β1x1 + β2

d. β2 = E(y | level 2) - E(y | level 1)

β1 is the change in E(y) for a 1 unit change in x1 holding x2 constant.


33. a. two

b. E(y) = β0 + β1x1 + β2x2 + β3x3 where

x2 x3 Level
0 0 1
1 0 2
0 1 3

c. E(y | level 1) = β0 + β1x1 + β2(0) + β3(0) = β0 + β1x1

E(y | level 2) = β0 + β1x1 + β2(1) + β3(0) = β0 + β1x1 + β2

E(y | level 3) = β0 + β1x1 + β2(0) + β3(1) = β0 + β1x1 + β3

β2 = E(y | level 2) - E(y | level 1)

β3 = E(y | level 3) - E(y | level 1)

β1 is the change in E(y) for a 1 unit change in x1 holding x2 and x3 constant.
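
The dummy-variable coding in Exercise 33 can be illustrated with a short Python sketch. The coefficient values below are hypothetical and chosen only for illustration, and the function name `expected_y` is an assumption; the point is that each dummy coefficient equals the difference in E(y) between its level and level 1.

```python
def expected_y(x1, level, beta):
    """E(y) for a three-level qualitative variable coded with two dummy variables."""
    b0, b1, b2, b3 = beta
    x2 = 1 if level == 2 else 0   # dummy variable for level 2
    x3 = 1 if level == 3 else 0   # dummy variable for level 3
    return b0 + b1 * x1 + b2 * x2 + b3 * x3

# Hypothetical coefficients (beta0, beta1, beta2, beta3), for illustration only.
beta = (10.0, 2.0, 5.0, -3.0)
x1 = 4.0

base = expected_y(x1, 1, beta)
diff_level2 = expected_y(x1, 2, beta) - base   # recovers beta2
diff_level3 = expected_y(x1, 3, beta) - base   # recovers beta3
slope = expected_y(x1 + 1, 1, beta) - base     # recovers beta1

print(diff_level2, diff_level3, slope)
```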

34. a. $15,300

b. Estimate of sales = 10.1 - 4.2(2) + 6.8(8) + 15.3(0) = 56.1 or $56,100

c. Estimate of sales = 10.1 - 4.2(1) + 6.8(3) + 15.3(1) = 41.6 or $41,600

35. a. Let Type = 0 if a mechanical repair


Type = 1 if an electrical repair

Partial Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 1 0.9127 8.71% 0.9127 0.9127 0.76 0.408
Type 1 0.9127 8.71% 0.9127 0.9127 0.76 0.408
Error 8 9.5633 91.29% 9.5633 1.1954
Total 9 10.4760 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


1.09335 8.71% 0.00% 15.5648 0.00%

Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant 3.450 0.547 ( 2.189, 4.711) 6.31 0.000
Type 0.617 0.706 (-1.011, 2.244) 0.87 0.408 1.00

Regression Equation

Repair Time_(hours) = 3.450 + 0.617 Type


b. The estimated regression equation did not provide a good fit. In fact, the p-value of .408 shows that
the relationship is not significant for any reasonable value of α.

c. Let Person = 0 if Bob Jones performed the service


Person = 1 if Dave Newton performed the service

Partial Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 1 6.400 61.09% 6.400 6.4000 12.56 0.008
Person 1 6.400 61.09% 6.400 6.4000 12.56 0.008
Error 8 4.076 38.91% 4.076 0.5095
Total 9 10.476 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


0.713793 61.09% 56.23% 6.36875 39.21%

Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant 4.620 0.319 ( 3.884, 5.356) 14.47 0.000
Person -1.600 0.451 (-2.641, -0.559) -3.54 0.008 1.00

Regression Equation

Repair Time_(hours) = 4.620 - 1.600 Person



d. We see that 61.09% of the variability in repair time has been explained by the repair person that
performed the service; an acceptable, but not good, fit.

36. a. The Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 3 9.4305 90.02% 9.43049 3.14350 18.04 0.002
Months 1 5.5960 53.42% 2.11783 2.11783 12.15 0.013
Type 1 3.4049 32.50% 2.30138 2.30138 13.21 0.011
Person 1 0.4296 4.10% 0.42957 0.42957 2.47 0.167
Error 6 1.0455 9.98% 1.04551 0.17425
Lack-of-Fit 5 1.0455 9.98% 1.04551 0.20910 * *
Pure Error 1 0.0000 0.00% 0.00000 0.00000
Total 9 10.4760 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


0.417434 90.02% 85.03% 3.38309 67.71%


Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant 1.860 0.729 ( 0.077, 3.643) 2.55 0.043
Months 0.2914 0.0836 (0.0869, 0.4960) 3.49 0.013 2.43
Type 1.102 0.303 ( 0.360, 1.845) 3.63 0.011 1.27
Person -0.609 0.388 (-1.558, 0.340) -1.57 0.167 2.16

Regression Equation

Repair Time_(hours) = 1.860 + 0.2914 Months + 1.102 Type - 0.609 Person

b. Since the p-value corresponding to F = 18.04 is .002 < α = .05, the overall model is statistically
significant.

c. The p-value corresponding to t = -1.57 is .167 > α = .05; thus, the addition of Person is not
statistically significant. Person is highly correlated with Months (the sample correlation coefficient is
-.691); thus, once the effect of Months has been accounted for, Person will not add much to the
model.

37. a. A portion of the Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 1 91.29 34.42% 91.29 91.290 9.97 0.005
Price ($) 1 91.29 34.42% 91.29 91.290 9.97 0.005
Error 19 173.95 65.58% 173.95 9.155
Lack-of-Fit 10 96.45 36.36% 96.45 9.645 1.12 0.437
Pure Error 9 77.50 29.22% 77.50 8.611
Total 20 265.24 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


3.02575 34.42% 30.97% 214.718 19.05%

Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant 69.28 3.40 (62.16, 76.39) 20.37 0.000
Price ($) 0.559 0.177 (0.188, 0.929) 3.16 0.005 1.00

Regression Equation

Score = 69.28 + 0.559 Price ($)

b. Because the p-value = .005 < α = .05, there is a significant relationship.

c. Let Type_Italian = 1 if the restaurant is an Italian restaurant; 0 otherwise


d. A portion of the Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 2 139.577 52.62% 139.577 69.7886 10.00 0.001
Price ($) 1 91.290 34.42% 96.075 96.0754 13.76 0.002
Type_Italian 1 48.287 18.21% 48.287 48.2869 6.92 0.017
Error 18 125.661 47.38% 125.661 6.9812
Lack-of-Fit 13 122.161 46.06% 122.161 9.3970 13.42 0.005
Pure Error 5 3.500 1.32% 3.500 0.7000
Total 20 265.238 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


2.64219 52.62% 47.36% 187.424 29.34%

Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant 67.40 3.05 (60.99, 73.82) 22.07 0.000
Price ($) 0.573 0.155 (0.249, 0.898) 3.71 0.002 1.00
Type_Italian 3.04 1.16 ( 0.61, 5.47) 2.63 0.017 1.00

Regression Equation

Score = 67.40 + 0.573 Price ($) + 3.04 Type_Italian

e. For the Type_Italian dummy variable, the p-value = .017 < α = .05; thus, type of restaurant is a
significant factor in overall customer satisfaction.

f. The estimated regression equation computed in part (d) is ŷ = 67.4 + .573(Price) + 3.04(Type_Italian).

For a seafood/steakhouse Type_Italian = 0 and the estimated score is ŷ = 67.4 + .573(20) + 3.04(0)
= 78.86

For an Italian restaurant Type_Italian = 1 and the estimated score is ŷ = 67.4 + .573(20) + 3.04(1) =
81.90

Thus, the satisfaction score increases by 3.04 points.

38. a. Partial Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 3 3660.7 87.35% 3660.7 1220.25 36.82 0.000
Age 1 1772.0 42.28% 1394.8 1394.84 42.09 0.000
Pressure 1 1607.7 38.36% 1027.4 1027.35 31.00 0.000
Smokers 1 281.1 6.71% 281.1 281.10 8.48 0.010
Error 16 530.2 12.65% 530.2 33.14
Total 19 4190.9 100.00%


Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


5.75657 87.35% 84.98% 799.476 80.92%

Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant -91.8 15.2 (-124.0, -59.5) -6.03 0.000
Age 1.077 0.166 ( 0.725, 1.429) 6.49 0.000 1.46
Pressure 0.2518 0.0452 (0.1559, 0.3477) 5.57 0.000 1.25
Smokers 8.74 3.00 ( 2.38, 15.10) 2.91 0.010 1.36

Regression Equation

Risk = -91.8 + 1.077 Age + 0.2518 Pressure + 8.74 Smokers

b. Since the p-value corresponding to t = 2.91 is .010 < α = .05, smoking is a significant factor.

c. Partial Minitab output follows

Regression Equation

Risk = -91.8 + 1.077 Age + 0.2518 Pressure + 8.74 Smokers

Variable Setting
Age 68
Pressure 175
Smokers 1

Fit SE Fit 95% CI 95% PI


34.2661 1.99785 (30.0309, 38.5014) (21.3487, 47.1836)

The point estimate is 34.2661; the 95% prediction interval is 21.3487 to 47.1836. Thus, the
probability of a stroke (.213487 to .471836 at the 95% confidence level) appears to be quite high.
The physician would probably recommend that Art quit smoking and begin some type of treatment
designed to reduce his blood pressure.

39. a. Partial Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 1 67.60 84.50% 67.60 67.600 16.35 0.027
x 1 67.60 84.50% 67.60 67.600 16.35 0.027
Error 3 12.40 15.50% 12.40 4.133
Total 4 80.00 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


2.03306 84.50% 79.33% 23.8635 70.17%


Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant 0.20 2.13 (-6.59, 6.99) 0.09 0.931
x 2.600 0.643 (0.554, 4.646) 4.04 0.027 1.00

Regression Equation

y = 0.20 + 2.600 x

b. Using Minitab we obtained the following values:

xi   yi   ŷi     Standardized Residual
1    3    2.8     .16
2    7    5.4     .94
3    5    8.0   -1.65
4    11   10.6    .24
5    14   13.2    .62

The point (3, 5) does not appear to follow the trend of remaining data; however, the value of the
standardized residual for this point, -1.65, is not large enough for us to conclude that (3, 5) is an
outlier.

A plot of the standardized residuals versus ŷ also shows that the largest standardized residual
corresponds with the point for which ŷ = 8, which is the point (3, 5). Again, the value of the
standardized residual for this point, -1.65, is not large enough for us to conclude that (3, 5) is an
outlier.


c. Using Minitab, we obtained the following values:

xi   yi   Studentized Deleted Residual
1    3      .13
2    7      .91
3    5    -4.42
4    11     .19
5    14     .54

t.025 = 4.303 (n - p - 2 = 5 - 1 - 2 = 2 degrees of freedom)

Since the studentized deleted residual for (3, 5) is -4.42 < -4.303, we conclude that the third
observation is an outlier.
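
The standardized and studentized deleted residuals reported for this exercise can be reproduced from the five data points. The following is a minimal sketch using the standard simple linear regression formulas rather than Minitab itself; the variable names are illustrative, and the critical value 4.303 is the one quoted above.

```python
import math

x = [1, 2, 3, 4, 5]
y = [3, 7, 5, 11, 14]
n, p = len(x), 1          # p = number of independent variables

# Least squares fit of y = b0 + b1*x
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - p - 1))

# Leverage of each observation in simple linear regression
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Standardized residual: residual divided by its estimated standard deviation
std_resid = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

# Studentized deleted residual: error variance re-estimated without observation i
del_resid = [
    e * math.sqrt((n - p - 2) / (sse * (1 - h) - e ** 2))
    for e, h in zip(residuals, leverage)
]

t_crit = 4.303   # t.025 with n - p - 2 = 2 degrees of freedom, as quoted above
outliers = [i + 1 for i, t in enumerate(del_resid) if abs(t) > t_crit]

for row in zip(x, y, std_resid, del_resid):
    print("%d %2d %6.2f %6.2f" % row)
print("outliers:", outliers)
```

Deleting an observation before estimating the error variance is what lets the third point stand out: its own large residual no longer inflates the estimate of s.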

40. a. Partial Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 1 1934.42 98.76% 1934.42 1934.42 238.03 0.001
x 1 1934.42 98.76% 1934.42 1934.42 238.03 0.001
Error 3 24.38 1.24% 24.38 8.13
Total 4 1958.80 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


2.85073 98.76% 98.34% 243.374 87.58%

Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant -53.28 5.79 (-71.69, -34.87) -9.21 0.003
x 3.110 0.202 ( 2.468, 3.752) 15.43 0.001 1.00

Regression Equation

y = -53.28 + 3.110 x

b. Using Minitab, we obtained the following values:

xi   yi   Studentized Deleted Residual
22   12   -1.94
24   21    -.12
26   31    1.79
28   35     .40
40   70   -1.90

t.025 = 4.303 (n - p - 2 = 5 - 1 - 2 = 2 degrees of freedom)


Since none of the studentized deleted residuals are less than -4.303 or greater than 4.303, none of the
observations can be classified as an outlier.

c. Using Minitab we obtained the following values:

xi yi hi
22 12 .38
24 21 .28
26 31 .22
28 35 .20
40 70 .92

The critical value is $\frac{3(p + 1)}{n} = \frac{3(1 + 1)}{5} = 1.2$

Since none of the values exceed 1.2, we conclude that there are no influential observations in the
data.

d. Using Minitab we obtained the following values:

xi yi Di
22 12 .60
24 21 .00
26 31 .26
28 35 .03
40 70 11.09

Since D5 = 11.09 > 1 (rule of thumb critical value), we conclude that the fifth observation is
influential.
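
The leverage values in part (c) and the Cook's distance values in part (d) can be recomputed from the data. This is a sketch based on the standard formulas for simple linear regression, not a reproduction of Minitab's internals; the variable names are illustrative.

```python
x = [22, 24, 26, 28, 40]
y = [12, 21, 31, 35, 70]
n, p = len(x), 1          # p = number of independent variables

# Least squares fit of y = b0 + b1*x
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
mse = sum(e ** 2 for e in residuals) / (n - p - 1)

# Leverage: h_i = 1/n + (x_i - x_bar)^2 / sum((x - x_bar)^2)
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Cook's distance: D_i = e_i^2 * h_i / ((p + 1) * s^2 * (1 - h_i)^2)
cooks_d = [
    e ** 2 * h / ((p + 1) * mse * (1 - h) ** 2)
    for e, h in zip(residuals, leverage)
]

leverage_cutoff = 3 * (p + 1) / n
high_leverage = [i + 1 for i, h in enumerate(leverage) if h > leverage_cutoff]
influential = [i + 1 for i, d in enumerate(cooks_d) if d > 1]

print([round(h, 2) for h in leverage])
print([round(d, 2) for d in cooks_d])
print(high_leverage, influential)
```

Cook's distance combines the size of the residual with the leverage, which is why the fifth observation (x = 40, h = .92) is flagged even though its residual is modest.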

41. a. The Minitab output appears in the solution to part (b) of Exercise 5; the estimated regression
equation is:

Revenue = 83.2 + 2.29 TVAdv + 1.301 NewsAdv

b. Using Minitab we obtained the following values:

ŷi      Standardized Residual
96.63   -1.62
90.41   -1.08
94.34    1.22
92.21    -.37
94.39    1.10
94.24    -.40
94.42   -1.12
93.35    1.08

With the relatively few observations, it is difficult to determine if any of the assumptions regarding
the error term have been violated. For instance, an argument could be made that there does not
appear to be any pattern in the plot; alternatively an argument could be made that there is a
curvilinear pattern in the plot.


c. The values of the standardized residuals are greater than -2 and less than +2; thus, using this test, there
are no outliers. As a further check for outliers, we used Minitab to compute the following
studentized deleted residuals:

Observation   Studentized Deleted Residual
1             -2.11
2             -1.10
3              1.31
4              -.33
5              1.13
6              -.36
7             -1.16
8              1.10

t.025 = 2.776 (n - p - 2 = 8 - 2 - 2 = 4 degrees of freedom)

Since none of the studentized deleted residuals is less than -2.776 or greater than 2.776, we conclude
that there are no outliers in the data.

d. Using Minitab we obtained the following values:

Observation hi Di
1 .63 1.52
2 .65 .70
3 .30 .22
4 .23 .01
5 .26 .14
6 .14 .01
7 .66 .81
8 .13 .06

The critical value for leverage is $\frac{3(p + 1)}{n} = \frac{3(2 + 1)}{8} = 1.125$

Since none of the values exceed 1.125, we conclude that there are no influential observations.
However, using Cook’s distance measure, we see that D1 > 1 (rule of thumb critical value); thus, we
conclude the first observation is influential. Final Conclusion: observation 1 is an influential
observation.
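
The two screening rules used in part (d) can be applied mechanically to the tabulated values. The sketch below simply encodes the hi and Di columns from the table above; the dictionary layout and variable names are illustrative assumptions.

```python
n, p = 8, 2   # observations and independent variables in this exercise

# h_i and D_i as tabulated above, keyed by observation number
leverage = {1: .63, 2: .65, 3: .30, 4: .23, 5: .26, 6: .14, 7: .66, 8: .13}
cooks_d = {1: 1.52, 2: .70, 3: .22, 4: .01, 5: .14, 6: .01, 7: .81, 8: .06}

leverage_cutoff = 3 * (p + 1) / n   # 3(p + 1)/n
high_leverage = [obs for obs, h in leverage.items() if h > leverage_cutoff]
influential = [obs for obs, d in cooks_d.items() if d > 1]   # rule of thumb: D_i > 1

print(leverage_cutoff, high_leverage, influential)
```

Because a leverage value can never exceed 1, a cutoff of 1.125 cannot be exceeded with so few observations, so Cook's distance is the more informative check here.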

42. a. Partial Minitab output follows:

Analysis of Variance

Source DF Seq SS Contribution Adj SS Adj MS F-Value P-Value


Regression 2 915.66 91.94% 915.66 457.828 74.12 0.000
Price ($1000s) 1 406.39 40.80% 46.22 46.222 7.48 0.017
Horsepower 1 509.27 51.13% 509.27 509.266 82.45 0.000
Error 13 80.30 8.06% 80.30 6.177
Total 15 995.95 100.00%

Model Summary

S R-sq R-sq(adj) PRESS R-sq(pred)


2.48532 91.94% 90.70% 142.697 85.67%


Coefficients

Term Coef SE Coef 95% CI T-Value P-Value VIF


Constant 71.33 2.25 ( 66.47, 76.18) 31.73 0.000
Price ($1000s) 0.1072 0.0392 ( 0.0225, 0.1918) 2.74 0.017 1.30
Horsepower 0.08450 0.00931 (0.06439, 0.10460) 9.08 0.000 1.30

Regression Equation

Speed at 1/4 mile (mph) = 71.33 + 0.1072 Price ($1000s) + 0.08450 Horsepower

Fits and Diagnostics for All Observations

Speed at
1/4 mile
Obs (mph) Fit SE Fit 95% CI Resid Std Resid Del Resid HI
1 90.70 90.49 0.89 ( 88.56, 92.41) 0.21 0.09 0.09 0.128399
2 108.00 105.88 2.01 (101.55, 110.22) 2.12 1.45 1.52 0.652221
3 93.20 91.68 0.88 ( 89.78, 93.59) 1.52 0.65 0.64 0.126054
4 103.20 99.76 1.14 ( 97.30, 102.23) 3.44 1.56 1.66 0.210277
5 102.10 105.85 0.94 (103.81, 107.89) -3.75 -1.63 -1.76 0.144325
6 116.20 116.83 1.70 (113.16, 120.50) -0.63 -0.35 -0.33 0.467435
7 91.70 92.83 0.89 ( 90.90, 94.76) -1.13 -0.49 -0.47 0.129294
8 89.70 90.63 0.87 ( 88.75, 92.51) -0.93 -0.40 -0.39 0.122422
9 93.00 94.32 0.78 ( 92.63, 96.00) -1.32 -0.56 -0.54 0.098752
10 92.30 91.54 0.94 ( 89.52, 93.56) 0.76 0.33 0.32 0.141539
11 99.00 103.46 0.80 (101.73, 105.19) -4.46 -1.90 -2.14 0.104138
12 84.60 87.11 1.06 ( 84.81, 89.41) -2.51 -1.12 -1.13 0.183349
13 103.20 100.08 1.05 ( 97.80, 102.35) 3.12 1.39 1.44 0.180096
14 93.20 93.20 0.87 ( 91.31, 95.08) 0.00 0.00 0.00 0.123771
15 105.00 102.76 0.86 (100.91, 104.62) 2.24 0.96 0.96 0.119325
16 97.00 95.68 0.65 ( 94.27, 97.08) 1.32 0.55 0.54 0.068602

Obs Cook’s D DFITS


1 0.00 0.03363
2 1.31 2.07558 X
3 0.02 0.24241
4 0.21 0.85470
5 0.15 -0.72273
6 0.03 -0.31262
7 0.01 -0.18152
8 0.01 -0.14467
9 0.01 -0.17970
10 0.01 0.12868
11 0.14 -0.73025
12 0.09 -0.53557
13 0.14 0.67723
14 0.00 0.00071
15 0.04 0.35229
16 0.01 0.14555

X Unusual X

b. The standardized residual plot follows. There appears to be a very unusual trend in the standardized
residuals.


[Figure: standardized residuals (vertical axis) plotted against fitted values (horizontal axis, 90 to 120)]

c. The Minitab output shown in part (a) did not identify any observations with a large standardized
residual; thus, there do not appear to be any outliers in the data.

d. The Minitab output shown in part (a) identifies observation 2 as an influential observation.

43. a. The Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 229.676 114.838 969.26 0.000
Greens in Reg. 1 151.431 151.431 1278.11 0.000
Putting Avg. 1 123.274 123.274 1040.46 0.000
Error 131 15.521 0.118
Lack-of-Fit 130 15.483 0.119 3.10 0.429
Pure Error 1 0.038 0.038
Total 133 245.197

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.344209 93.67% 93.57% 93.26%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 57.148 0.989 57.76 0.000
Greens in Reg. -23.106 0.646 -35.75 0.000 1.04
Putting Avg. 1.0320 0.0320 32.26 0.000 1.04

Regression Equation

Scoring Avg. = 57.148 - 23.106 Greens in Reg. + 1.0320 Putting Avg.


Fits and Diagnostics for Unusual Observations

Scoring
Obs Avg. Fit Resid Std Resid
11 73.8000 72.8612 0.9388 2.74 R
25 72.1250 72.9445 -0.8195 -2.41 R
34 74.7500 73.7888 0.9612 2.81 R
36 75.5000 75.2114 0.2886 0.88 X
56 73.2500 73.8035 -0.5535 -1.67 X
59 74.8330 74.4215 0.4115 1.24 X
62 74.1670 73.4551 0.7119 2.10 R
64 75.7500 76.1522 -0.4022 -1.22 X
102 74.0000 75.1150 -1.1150 -3.33 R
122 73.6250 72.2792 1.3458 4.01 R
129 73.2780 72.5712 0.7068 2.09 R

R Large residual
X Unusual X

b. The standardized residual plot follows:

The standardized residual plot does not support the assumption about ε. There are several unusual
observations and the variance of the residuals appears to be increasing for larger values of ŷ.

c. The Minitab output in part (a) identified seven outliers: observations 11, 25, 34, 62, 102, 122, and
129.
• Observations 25 and 102 correspond to Charley Hull and P.K. Kongkraphan, respectively; their
scoring averages were much lower than other golfers with similar percentage of time hitting
the green in regulation and average number of putts taken on greens hit in regulation.


• Observations 11, 34, 62, 122, and 129 correspond to Ashleigh Simon, Dori Carter, Karin Sjodin,
Sophia Popov, and Tiffany Joh, respectively; their scoring averages were each much higher than
other golfers with similar percentage of time hitting the green in regulation and average number
of putts taken on greens hit in regulation.

d. The Minitab output in part (a) identified four influential observations: observations 36, 56, 59, and
64. Observation 36 corresponds to Garrett Phillips, observation 56 corresponds to Jing Yan,
Observation 59 corresponds to Ju Young Park, and observation 64 corresponds to Karlin Beck.

44. a. $E(y) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$

b. It is an estimate of the probability that a customer that does not have a Simmons credit card will
make a purchase.

c. A portion of the Minitab binary logistic regression output follows:

Coefficients

Term Coef SE Coef VIF


Constant -0.944 0.315
Card 1.025 0.423 1.00

Odds Ratios for Continuous Predictors

Odds Ratio 95% CI


Card 2.7857 (1.2147, 6.3886)

Regression Equation

P(1) = exp(Y')/(1 + exp(Y'))

Y' = -0.944 + 1.025 Card

Thus, the estimated logit is ĝ(x) = -0.944 + 1.025x

d. For customers that do not have a Simmons credit card (x = 0)

ĝ(0) = -0.945 + 1.025(0) = -0.945

and

$\hat{y} = \frac{e^{\hat{g}(0)}}{1 + e^{\hat{g}(0)}} = \frac{e^{-0.945}}{1 + e^{-0.945}} = \frac{0.38868}{1 + 0.38868} = 0.2799$

For customers that have a Simmons credit card (x = 1)

ĝ(1) = -0.945 + 1.025(1) = 0.0800

and


$\hat{y} = \frac{e^{\hat{g}(1)}}{1 + e^{\hat{g}(1)}} = \frac{e^{0.08}}{1 + e^{0.08}} = \frac{1.0833}{1 + 1.0833} = 0.52$

e. Using the Minitab output shown in part (c), the estimated odds ratio is 2.7857. We can conclude that
the estimated odds of making a purchase for customers who have a Simmons credit card are 2.7857
times greater than the estimated odds of making a purchase for customers that do not have a
Simmons credit card.
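
The link between the estimated probabilities in part (d) and the odds ratio in part (e) can be sketched in Python. The coefficients are the rounded values from the Minitab output in part (c); the function and variable names are illustrative assumptions, and small differences from the printed figures reflect rounding of the coefficients.

```python
import math

def logistic(g):
    """Convert a logit g into a probability."""
    return math.exp(g) / (1 + math.exp(g))

b0, b1 = -0.944, 1.025   # rounded coefficients from the Minitab output in part (c)

p_no_card = logistic(b0)        # x = 0
p_card = logistic(b0 + b1)      # x = 1

# Odds = P(purchase) / P(no purchase) for each group
odds_no_card = p_no_card / (1 - p_no_card)
odds_card = p_card / (1 - p_card)
odds_ratio = odds_card / odds_no_card

print(f"P(purchase | no card) = {p_no_card:.3f}")
print(f"P(purchase | card)    = {p_card:.3f}")
print(f"odds ratio            = {odds_ratio:.4f}")
print(f"exp(b1)               = {math.exp(b1):.4f}")
```

The odds ratio computed from the two probabilities coincides with exp(b1), which is how the value reported by Minitab can be read directly from the Card coefficient.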

45. a. $\text{odds} = \frac{.3148}{1 - .3148} = .4594$

b. $\text{odds}_1 = \frac{.5796}{1 - .5796} = 1.3787$

odds0 = .4594 (from part (a))

$\text{odds ratio} = \frac{\text{odds}_1}{\text{odds}_0} = \frac{1.3787}{.4594} = 3.00$

c. The odds ratio for x2 computed holding annual spending constant at $4000 is also 3.00. This shows
that the odds ratio for x2 is independent of the value of x1.
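
The independence noted in part (c) follows from the form of the logit. As a sketch, assume a two-variable logit g(x1, x2) = β0 + β1x1 + β2x2 with x2 coded 0 or 1; the symbols are generic and no fitted values are assumed here.

```latex
\text{odds}(x_1, x_2) = \frac{P(y = 1)}{1 - P(y = 1)} = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}

\frac{\text{odds}(x_1, 1)}{\text{odds}(x_1, 0)}
  = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2}}{e^{\beta_0 + \beta_1 x_1}}
  = e^{\beta_2}
```

The x1 terms cancel, so the ratio is the same at every level of annual spending.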

46. a. $E(y) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$

b. A portion of the Minitab binary logistic regression output follows:

Coefficients

Term Coef SE Coef VIF


Constant -2.633 0.799
Balance 0.2202 0.0900 1.00

Odds Ratios for Continuous Predictors

Odds Ratio 95% CI


Balance 1.2463 (1.0447, 1.4868)

Regression Equation

P(1) = exp(Y')/(1 + exp(Y'))

Y' = -2.633 + 0.2202 Balance

Thus, the estimated logistic regression equation is

$E(y) = \frac{e^{-2.633 + 0.2202x}}{1 + e^{-2.633 + 0.2202x}}$


c. A portion of the Minitab binary logistic regression output follows:

Deviance Table

Source DF Adj Dev Adj Mean Chi-Square P-Value


Regression 1 9.460 9.460 9.46 0.002
Balance 1 9.460 9.460 9.46 0.002
Error 48 51.626 1.076
Total 49 61.086

Significant result: the p-value corresponding to the χ² test statistic is 0.002.

d. For an average monthly balance of $1000, x = 10

$E(y) = \frac{e^{-2.633 + 0.2202x}}{1 + e^{-2.633 + 0.2202x}} = \frac{e^{-2.633 + 0.2202(10)}}{1 + e^{-2.633 + 0.2202(10)}} = \frac{e^{-0.431}}{1 + e^{-0.431}} = \frac{0.6499}{1.6499} = 0.3939$

Thus, an estimate of the probability that customers with an average monthly balance of $1000 will
sign up for direct payroll deposit is 0.3939.

e. Repeating the calculations in part (d) using various values for x, a value of x = 12 or an average
monthly balance of approximately $1200 is required to achieve this level of probability.
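
The repeated calculation described in part (e) can be sketched as a simple search. The target probability of .50 used below is an assumption, since the solution text does not restate the level asked for in the exercise; the function name `prob_signup` is also illustrative.

```python
import math

def prob_signup(x, b0=-2.633, b1=0.2202):
    """Estimated probability from the logistic regression equation; x is the balance in $100s."""
    g = b0 + b1 * x
    return math.exp(g) / (1 + math.exp(g))

target = 0.50   # assumed target probability; not restated in the solution text

# Smallest whole number of $100s at which the estimated probability reaches the target
x_needed = next(x for x in range(1, 101) if prob_signup(x) >= target)

for x in (10, 11, 12):
    print(x, round(prob_signup(x), 4))
print("x needed:", x_needed)
```

Under that assumption the search stops at x = 12, consistent with the balance of approximately $1200 given in part (e).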

f. Using the Minitab output shown in part (b), the estimated odds ratio is 1.2463. Because values of x
are measured in hundreds of dollars, the estimated odds of signing up for payroll direct deposit for
customers that have an average monthly balance of $600 is 1.2463 times greater than the estimated
odds of signing up for payroll direct deposit for customers that have an average monthly balance of
$500. Moreover, this interpretation is true for any one hundred dollar increment in the average
monthly balance.

47. a. $E(y) = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}$

b. For a given GPA, it is an estimate of the probability that a student who did not attend the orientation
program will return to Lakeland for the sophomore year.

c. A portion of the Minitab binary logistic regression output follows:

Coefficients

Term Coef SE Coef VIF


Constant -6.89 1.75
GPA 2.539 0.673 1.01
Program 1.561 0.563 1.01

Odds Ratios for Continuous Predictors

Odds Ratio 95% CI


GPA 12.6644 (3.3872, 47.3515)
Program 4.7624 (1.5794, 14.3607)

Regression Equation

P(1) = exp(Y')/(1 + exp(Y'))


Y' = -6.89 + 2.539 GPA + 1.561 Program

Thus, the estimated logit is ĝ(x1, x2) = -6.89 + 2.539x1 + 1.561x2

d. A portion of the Minitab binary logistic regression output follows:

Deviance Table

Source DF Adj Dev Adj Mean Chi-Square P-Value


Regression 2 47.869 23.9347 47.87 0.000
GPA 1 20.966 20.9663 20.97 0.000
Program 1 7.862 7.8616 7.86 0.005
Error 97 80.338 0.8282
Total 99 128.207

Significant result: the p-value corresponding to the χ² test statistic is 0.000.

e. From the portion of the Minitab binary logistic regression output shown in the solution to part (d),
both variables are significant at α = .01: the p-value for x1 is 0.000 and the p-value for x2 is 0.005.

f. For x1 =2.5 and x2 = 0

ĝ (2.5, 0) = -6.89 + 2.539(2.5) + 1.561(0) = -0.5425

and

$\hat{y} = \frac{e^{\hat{g}(2.5, 0)}}{1 + e^{\hat{g}(2.5, 0)}} = \frac{e^{-0.5425}}{1 + e^{-0.5425}} = \frac{0.5813}{1 + 0.5813} = 0.3676$

For x1 =2.5 and x2 = 1

ĝ (2.5, 1) = -6.89 + 2.539(2.5) + 1.561(1) = 1.0185

and

$\hat{y} = \frac{e^{\hat{g}(2.5, 1)}}{1 + e^{\hat{g}(2.5, 1)}} = \frac{e^{1.0185}}{1 + e^{1.0185}} = \frac{2.769}{1 + 2.769} = 0.7347$

g. From the Minitab output in part (c) we see that the estimated odds ratio is 4.7624 for the orientation
program. This means that the odds of students who attended the orientation program continuing are
4.7624 times greater than for students who did not attend the program.

h. We recommend making the orientation program required. From part (g), we see that the odds of
continuing are much higher for students who have attended the orientation program.

48. a. $E(y) = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}$


b. A portion of the Minitab binary logistic regression output follows:

Coefficients

Term Coef SE Coef VIF


Constant -39.5 12.5
Wet 3.37 1.26 1.03
Noise 1.816 0.831 1.03

Odds Ratios for Continuous Predictors

Odds Ratio 95% CI


Wet 29.2095 (2.4521, 347.9413)
Noise 6.1489 (1.2059, 31.3540)

Regression Equation

P(1) = exp(Y')/(1 + exp(Y'))

Y' = -39.5 + 3.37 Wet + 1.816 Noise

Thus, the estimated logit is ĝ(x) = -39.5 + 3.37Wet + 1.816Noise

c. For tires that have a Wet performance rating of 8 and a Noise performance rating of 8

ĝ(x) = -39.5 + 3.37Wet + 1.816Noise

ĝ(x) = -39.5 + 3.37(8) + 1.816(8) = 1.988

$\hat{y} = \frac{e^{1.988}}{1 + e^{1.988}} = \frac{7.30092}{1 + 7.30092} = 0.8795$

The probability that a customer will probably or definitely purchase a particular tire again with these
performance characteristics is .8795.

d. For tires that have a Wet performance rating of 7 and a Noise performance rating of 7

ĝ(x) = -39.5 + 3.37Wet + 1.816Noise

ĝ(x) = -39.5 + 3.37(7) + 1.816(7) = -3.198

$\hat{y} = \frac{e^{-3.198}}{1 + e^{-3.198}} = \frac{.04084}{1 + .04084} = 0.0392$

The probability that a customer will probably or definitely purchase a particular tire again with these
performance characteristics is .0392.

e. Wet and Noise performance ratings of 7 are both considered Excellent performance ratings using the
Tire Rack performance scale. Nonetheless, the probability that the customer will repurchase a tire
with these characteristics is very low. But, a one point increase in both ratings increases the
probability to .8795. So, achieving the highest possible levels of performance is essential if the
manufacturer wants to have the greatest chance of having an existing customer buy their tire again.
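
The sensitivity described in part (e) can be seen by evaluating the estimated logit at both pairs of ratings. This is a minimal sketch using the coefficients from the Minitab output in part (b); the function name `prob_repurchase` and its parameter names are illustrative assumptions.

```python
import math

def prob_repurchase(wet, noise, b0=-39.5, b_wet=3.37, b_noise=1.816):
    """Estimated probability of repurchase from the estimated logit in part (b)."""
    g = b0 + b_wet * wet + b_noise * noise
    return math.exp(g) / (1 + math.exp(g))

p_77 = prob_repurchase(7, 7)   # ratings used in part (d)
p_88 = prob_repurchase(8, 8)   # ratings used in part (c)

print(f"Wet = 7, Noise = 7: {p_77:.4f}")
print(f"Wet = 8, Noise = 8: {p_88:.4f}")
print(f"increase in probability: {p_88 - p_77:.4f}")
```

The logit moves from -3.198 to 1.988 with the one point increase in both ratings, crossing zero, which is where the logistic curve is steepest.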


49. a. The expected increase in final college grade point average corresponding to a one point increase in
high school grade point average is .0235 when SAT mathematics score does not change. Similarly,
the expected increase in final college grade point average corresponding to a one point increase in
the SAT mathematics score is .00486 when the high school grade point average does not change.

b. ŷ = -1.41 + .0235(84) + .00486(540) = 3.19

50. a. Job satisfaction can be expected to decrease by 8.69 units with a one unit increase in length of
service if the wage rate does not change. A dollar increase in the wage rate is associated with a 13.5
point increase in the job satisfaction score when the length of service does not change.

b. ŷ = 14.4 - 8.69(4) + 13.5(13) = 155.14

51. a. The computer output with the missing values filled in is as follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 1612 806 71.820 0.000
x1 1 146.366 146.366 13.042 0.004
x2 1 289.047 289.047 25.756 0.000
Error 12 134.67 11.223
Total 14 1746.67

Model Summary

S R-sq R-sq(adj) R-sq(pred)


3.35 92.30% 91.02% 85.12%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 8.103 2.667 3.04 0.010
x1 7.602 2.105 3.61 0.004 1.62
x2 3.111 0.613 5.08 0.000 1.62

Regression Equation

y = 8.103 + 7.602 X1 + 3.111 X2

b. F.05 = 3.89

F = 71.82 > F.05; significant relationship

Actual p-value = .000

Because p-value ≤ α = .05, the overall relationship is significant
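As a rough check on the overall F test, the statistic can be rebuilt from the sums of squares in the ANOVA table. The function name `f_statistic` is an illustrative assumption; the critical value 3.89 is taken from the solution rather than computed.

```python
def f_statistic(ssr, sse, p, n):
    """Overall F = MSR / MSE for a model with p independent variables and n observations."""
    msr = ssr / p            # mean square due to regression
    mse = sse / (n - p - 1)  # mean square error
    return msr / mse


# Problem 51: SSR = 1612, SSE = 134.67, p = 2, n = 15 (Total DF = 14).
f = f_statistic(ssr=1612, sse=134.67, p=2, n=15)
F_CRIT_05 = 3.89  # table value quoted in part (b)

print(round(f, 2), f > F_CRIT_05)
```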

c. Using t table (12 degrees of freedom), area in tail corresponding to t = 3.61 is less than .005; p-value
is less than .01

Actual p-value = .004

Because p-value ≤ α, reject H0: β1 = 0


Using t table (12 degrees of freedom), area in tail corresponding to t = 5.08 is less than .005; p-value
is less than .01

Actual p-value = .000

Because p-value ≤ α, reject H0: β2 = 0
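The two t values used in part (c) are each coefficient divided by its standard error. A minimal sketch, with `t_statistic` as a hypothetical helper name and the inputs read from the Coefficients table:

```python
def t_statistic(coef, se_coef):
    """t = b_i / s_bi for testing H0: beta_i = 0."""
    return coef / se_coef


# Problem 51 Coefficients table: (Coef, SE Coef) for x1 and x2.
t1 = t_statistic(7.602, 2.105)
t2 = t_statistic(3.111, 0.613)

print(round(t1, 2), round(t2, 2))
```

With 12 degrees of freedom, both values are then compared against the t table as described in the solution.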

d. See computer output.

e. Ra² = 1 - (1 - .9230)(14/12) = .9102
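The adjusted coefficient of determination in part (e) follows Ra² = 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below reproduces the calculation; the function name `adjusted_r2` is an assumption made for illustration.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Problem 51(e): R-sq = .9230, n = 15, p = 2, so the ratio is 14/12.
ra2 = adjusted_r2(0.9230, n=15, p=2)
print(round(ra2, 4))
```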

52. a. The computer output with the missing values filled in is as follows

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 1.76209 0.88105 52.29 0.000
X1 1 0.12389 0.12389 7.35 0.030
X2 1 0.34308 0.34308 20.36 0.003
Error 7 0.11794 0.01685
Total 9 1.88003

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.1298 93.73% 91.93% 74.53%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -1.41 0.4848 -2.91 0.023
x1 0.0235 0.0087 2.71 0.030 1.54
x2 0.00486 0.0011 4.51 0.003 1.54

Regression Equation

y = -1.41 + 0.0235 X1 + 0.00486 X2

b. F.05 = 4.74

F = 52.29 > F.05; significant relationship

Actual p-value = .000

Because p-value ≤ α = .05, the overall relationship is significant

c. For β1: p-value = .030; reject H0: β1 = 0

For β2: p-value = .003; reject H0: β2 = 0

d. R² = SSR/SST = .9373


Ra² = 1 - (1 - .9373)(9/7) = .9193

good fit

53. a. The computer output with the missing values filled in is as follows

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 648.82 324.41 22.79 0.003
X1 1 444.58 444.58 31.23 0.003
X2 1 598.57 598.57 42.05 0.001
Error 5 71.18 14.24
Total 7 720.00

Model Summary

S R-sq R-sq(adj) R-sq(pred)


3.773 90.11% 86.16% 69.93%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 14.41 8.191 1.76 0.139
x1 -8.69 1.555 -5.59 0.003 1.95
x2 13.52 2.085 6.48 0.001 1.95

Regression Equation

y = 14.41 – 8.69 X1 + 13.52 X2

b. F.05 = 5.79

F = 22.79 > F.05; significant relationship.

Actual p-value = .003

Because p-value ≤ α = .05, the overall relationship is significant.

c. R² = SSR/SST = .9011

Ra² = 1 - (1 - .9011)(7/5) = .8616

good fit

d. For β1: t = -5.59; p-value = .003; reject H0: β1 = 0

For β2: t = 6.48; p-value = .001; reject H0: β2 = 0

54. a. A portion of the Minitab output follows:

Analysis of Variance


Source DF Adj SS Adj MS F-Value P-Value


Regression 1 60.787 60.7866 85.93 0.000
Steering 1 60.787 60.7866 85.93 0.000
Error 16 11.318 0.7074
Lack-of-Fit 11 10.092 0.9174 3.74 0.078
Pure Error 5 1.227 0.2453
Total 17 72.105

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.841071 84.30% 83.32% 81.08%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -7.52 1.47 -5.13 0.000
Steering 1.815 0.196 9.27 0.000 1.00

Regression Equation

Buy Again = -7.52 + 1.815 Steering

Because the p-value = .000 < α = .05, there is a significant relationship.

b. The estimated regression equation provides a good fit; 84.3% of the variability in the Buy Again
rating is explained by the linear effect of the Steering rating.

c. A portion of the Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 67.185 33.5924 102.41 0.000
Steering 1 1.888 1.8880 5.76 0.030
Tread Wear 1 6.398 6.3982 19.51 0.001
Error 15 4.920 0.3280
Total 17 72.105

Model Summary

S R-sq R-sq(adj) R-sq(pred)


0.572723 93.18% 92.27% 90.18%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -5.39 1.11 -4.86 0.000
Steering 0.690 0.288 2.40 0.030 4.65
Tread Wear 0.911 0.206 4.42 0.001 4.65

Regression Equation

Buy Again = -5.39 + 0.690 Steering + 0.911 Tread Wear


d. For the Tread Wear independent variable, the p-value = .001 < α = .05; thus, the addition of
Tread Wear is significant.
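The same conclusion can be cross-checked from the two ANOVA tables by comparing the error sums of squares of the one-variable and two-variable models. This partial F calculation is not shown in the solution; it is offered here only as a sketch, and `partial_f` is a hypothetical helper name.

```python
def partial_f(sse_reduced, sse_full, q, df_error_full):
    """F statistic for adding q variables to a reduced model."""
    mse_full = sse_full / df_error_full
    return ((sse_reduced - sse_full) / q) / mse_full


# Problem 54: SSE = 11.318 with Steering only; SSE = 4.920 (15 error DF)
# after Tread Wear is added.
f_add = partial_f(sse_reduced=11.318, sse_full=4.920, q=1, df_error_full=15)
print(round(f_add, 2))
```

The result agrees with the F-Value reported on the Tread Wear row of the second ANOVA table, which for a single added variable is the square of its t value up to rounding.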

55. a. A portion of the Excel Regression tool output follows.

Regression Statistics
Multiple R 0.8013
R Square 0.6421
Adjusted R Square 0.6409
Standard Error 3.4123
Observations 309

ANOVA
df SS MS F Significance F
Regression 1 6413.2883 6413.2883 550.8029 1.79552E-70
Residual 307 3574.5628 11.6435
Total 308 9987.8511

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 41.0534 0.5166 79.4748 8.1E-207 40.0370 42.0699
Displacement -3.7232 0.1586 -23.4692 1.8E-70 -4.0354 -3.4110

Because the p-value corresponding to F = 550.8029 is .0000 < α = .05, there is a significant
relationship.

b. A portion of the Excel Regression tool output follows.

Regression Statistics
Multiple R 0.8276
R Square 0.6849
Adjusted R Square 0.6829
Standard Error 3.2068
Observations 309

ANOVA
df SS MS F Significance F
Regression 2 6841.0876 3420.5438 332.6232 1.79466E-77
Residual 306 3146.7635 10.2835
Total 308 9987.8511


Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 40.5946 0.4906 82.7379 1.8E-211 39.6291 41.5600
Displacement -3.1944 0.1701 -18.7745 7.43E-53 -3.5292 -2.8596
FuelPremium -2.7230 0.4222 -6.4498 4.37E-10 -3.5537 -1.8922

c. For FuelPremium, the p-value corresponding to t = -6.4498 is .000 < α = .05, so the
addition of the FuelPremium dummy variable is significant.

d. A portion of the Excel Regression tool output follows.

Regression Statistics
Multiple R 0.8554
R Square 0.7317
Adjusted R Square 0.7282
Standard Error 2.9688
Observations 309

ANOVA
df SS MS F Significance F
Regression 4 7308.5436 1827.1359 207.3108 1.54798E-85
Residual 304 2679.3075 8.8135
Total 308 9987.8511

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 37.9626 0.7892 48.1055 3.5E-144 36.4097 39.5155
Displacement -3.2418 0.1941 -16.7007 6.97E-45 -3.6238 -2.8599
FuelPremium -2.1352 0.4519 -4.7253 3.52E-06 -3.0243 -1.2460
FrontWheel 3.0747 0.5394 5.7005 2.83E-08 2.0133 4.1360
RearWheel 3.3114 0.5413 6.1174 2.92E-09 2.2462 4.3765

e. Since the p-value corresponding to F = 207.3108 is .0000 < α = .05, there is a significant overall
relationship. Because the p-value for each independent variable is also < α = .05, each of the
independent variables is significant.


56. a. Type of Fund is a categorical variable with three levels. Let FundDE = 1 for a domestic equity fund
and FundIE = 1 for an international fund. The Excel output follows:

Regression Statistics
Multiple R 0.7838
R Square 0.6144
Adjusted R Square 0.5960
Standard Error 5.5978
Observations 45

ANOVA
Df SS MS F Significance F
Regression 2 2096.8489 1048.4245 33.4584 2.03818E-09
Residual 42 1316.0771 31.3352
Total 44 3412.9260

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 4.9090 1.7702 2.7732 0.0082 1.3366 8.4814
FundDE 10.4658 2.0722 5.0505 9.033E-06 6.2839 14.6477
FundIE 21.6823 2.6553 8.1658 3.288E-10 16.3237 27.0408

ŷ = 4.9090 + 10.4658 FundDE + 21.6823 FundIE

Since the p-value corresponding to F = 33.4584 is .0000 < α = .05, there is a significant relationship.

b. R Square = .6144. A reasonably good fit using only Type of Fund.
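The R Square and Multiple R entries in the Regression Statistics block can be recovered from the ANOVA sums of squares. A minimal sketch, using the values from part (a); the variable names are illustrative only.

```python
import math

# Problem 56(a) ANOVA table.
ssr = 2096.8489  # regression sum of squares
sst = 3412.9260  # total sum of squares

r_square = ssr / sst               # coefficient of determination
multiple_r = math.sqrt(r_square)   # multiple correlation coefficient

print(round(r_square, 4), round(multiple_r, 4))
```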

c. The Excel output follows:

Regression Statistics
Multiple R 0.8135
R Square 0.6617
Adjusted R Square 0.6279
Standard Error 5.3726
Observations 45

ANOVA
Df SS MS F Significance F
Regression 4 2258.3432 564.5858 19.5598 5.48647E-09
Residual 40 1154.5827 28.8646
Total 44 3412.9260


Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 1.1899 2.3781 0.5004 0.6196 -3.6164 5.9961
FundDE 6.8969 2.7651 2.4942 0.0169 1.3083 12.4854
FundIE 17.6800 3.3161 5.3315 4.096E-06 10.9778 24.3821
Net Asset Value ($) 0.0265 0.0670 0.3950 0.6950 -0.1089 0.1619
Expense Ratio (%) 6.4564 2.7593 2.3399 0.0244 0.8798 12.0331

Since the p-value corresponding to F = 19.5598 is .0000 < α = .05, there is a significant relationship.

For Net Asset Value ($), the p-value corresponding to t = .3950 is .6950 > α = .05; Net Asset Value
($) is not significant and can be deleted from the model.

d. Morningstar Rank is a categorical variable. The data set contains only funds with four ranks (2-Star
through 5-Star), so three dummy variables are needed. Let 3StarRank = 1 for a 3-Star rank,
4StarRank = 1 for a 4-Star rank, and 5StarRank = 1 for a 5-Star rank. The Excel output follows:

Regression Statistics
Multiple R 0.8501
R Square 0.7227
Adjusted R Square 0.6789
Standard Error 4.9904
Observations 45

ANOVA
Df SS MS F Significance F
Regression 6 2466.5721 411.0954 16.5072 2.96759E-09
Residual 38 946.3539 24.9040
Total 44 3412.9260

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept -4.6074 3.2909 -1.4000 0.1696 -11.2694 2.0547
FundDE 8.1713 2.2754 3.5912 0.0009 3.5650 12.7776
FundIE 19.5194 2.7795 7.0227 2.292E-08 13.8926 25.1461
Expense Ratio (%) 5.5197 2.5862 2.1343 0.0393 0.2843 10.7552
3StarRank 5.9237 2.8250 2.0969 0.0427 0.2048 11.6426
4StarRank 8.2367 2.8474 2.8927 0.0063 2.4725 14.0009
5StarRank 6.6241 3.1425 2.1079 0.0417 0.2624 12.9858

ŷ = -4.6074 + 8.1713 FundDE + 19.5194 FundIE + 5.5197 Expense Ratio (%) + 5.9237 3StarRank +
8.2367 4StarRank + 6.6241 5StarRank

At the .05 level of significance, all the independent variables are significant.


e. ŷ = -4.6074 + 8.1713(1) + 19.5194(0) + 5.5197(1.05) + 5.9237(1) + 8.2367(0) + 6.6241(0) = 15.28%
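The dummy-variable coding from part (d) and the prediction in part (e) can be sketched together as follows. The labels "DE" and "IE", the key "ExpenseRatio", and the function names are assumptions made for this illustration; any other fund type or a 2-Star rank falls into the baseline category, where all of the corresponding dummies are 0.

```python
# Coefficients from the estimated regression equation in part (d).
COEF = {
    "Intercept": -4.6074,
    "FundDE": 8.1713,
    "FundIE": 19.5194,
    "ExpenseRatio": 5.5197,
    "3StarRank": 5.9237,
    "4StarRank": 8.2367,
    "5StarRank": 6.6241,
}


def encode(fund_type, stars, expense_ratio):
    """Turn one fund's attributes into the 0/1 dummies used by the model."""
    return {
        "Intercept": 1,
        "FundDE": int(fund_type == "DE"),
        "FundIE": int(fund_type == "IE"),
        "ExpenseRatio": expense_ratio,
        "3StarRank": int(stars == 3),
        "4StarRank": int(stars == 4),
        "5StarRank": int(stars == 5),
    }


def predict(features):
    """Sum coefficient * value over every term in the equation."""
    return sum(COEF[name] * value for name, value in features.items())


# Part (e): domestic equity fund, 3-Star rank, expense ratio of 1.05%.
y_hat = predict(encode("DE", 3, 1.05))
print(round(y_hat, 2))
```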

57. a. A portion of the Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 14048.6 14048.6 15.34 0.001
Hourly ($1000s) 1 14048.6 14048.6 15.34 0.001
Error 28 25645.2 915.9
Lack-of-Fit 25 25630.7 1025.2 212.12 0.000
Pure Error 3 14.5 4.8
Total 29 39693.9

Model Summary

S R-sq R-sq(adj) R-sq(pred)


30.2639 35.39% 33.09% 24.60%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 40.3 15.7 2.58 0.016
Hourly ($1000s) 1.195 0.305 3.92 0.001 1.00

Regression Equation

Salaried ($1000s) = 40.3 + 1.195 Hourly ($1000s)

b. Because the p-value = .001 < α = .05, there is a significant relationship.

c. A portion of the Minitab output follows:

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 3 22820.3 7606.8 11.72 0.000
Hourly ($1000s) 1 14595.9 14595.9 22.49 0.000
Size-Midsize 1 41.8 41.8 0.06 0.802
Size-Small 1 7050.1 7050.1 10.86 0.003
Error 26 16873.6 649.0
Total 29 39693.9

Model Summary

S R-sq R-sq(adj) R-sq(pred)


25.4752 57.49% 52.59% 44.68%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 27.0 14.0 1.93 0.065
Hourly ($1000s) 1.224 0.258 4.74 0.000 1.01
Size-Midsize -3.2 12.6 -0.25 0.802 1.18
Size-Small 34.4 10.4 3.30 0.003 1.17


Regression Equation

Salaried ($1000s) = 27.0 + 1.224 Hourly ($1000s) - 3.2 Size-Midsize + 34.4 Size-Small

e. Hourly ($1000s): Significant because the p-value = .000 < α = .05

Size-Midsize: Not significant because the p-value = .802 > α = .05

Size-Small: Significant because the p-value = .003 < α = .05
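The variable-by-variable decisions above amount to comparing each p-value with α. A small sketch of that screening step, with the p-values copied from the Coefficients table; the dictionary and list names are illustrative only.

```python
ALPHA = 0.05

# p-values from the three-variable Minitab output.
p_values = {
    "Hourly ($1000s)": 0.000,
    "Size-Midsize": 0.802,
    "Size-Small": 0.003,
}

# Keep the terms whose individual t test is significant at ALPHA.
significant = [name for name, p in p_values.items() if p < ALPHA]
not_significant = [name for name, p in p_values.items() if p >= ALPHA]

print(significant)
print(not_significant)
```

Note that dropping Size-Midsize while keeping Size-Small means midsize companies are treated the same as the baseline size category in the reduced model.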

f. A portion of the Minitab output using Hourly ($1000s) and Size-Small as the independent variables
follows.

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 2 22778 11389.2 18.18 0.000
Hourly ($1000s) 1 14584 14583.8 23.28 0.000
Size-Small 1 8730 8729.8 13.93 0.001
Error 27 16915 626.5
Total 29 39694

Model Summary

S R-sq R-sq(adj) R-sq(pred)


25.0299 57.39% 54.23% 46.53%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 26.3 13.5 1.95 0.062
Hourly ($1000s) 1.218 0.252 4.82 0.000 1.00
Size-Small 35.41 9.49 3.73 0.001 1.00

Regression Equation

Salaried ($1000s) = 26.3 + 1.218 Hourly ($1000s) + 35.41 Size-Small
