
Spreadsheet Modeling and Decision Analysis: A Practical Introduction to Business Analytics, 8th Edition
Ragsdale Solutions Manual
Chapter 9 - Regression Analysis : S-1
————————————————————————————————————————————

Chapter 9
Regression Analysis
1. a. Y = 250 + 3 X
b. Functional. For a given value of X there is one unique value of Y.

2. The model with the highest R2 might actually "overfit" the data and not provide accurate predictions. The
R2 statistic can be inflated (or made arbitrarily large) by including superfluous independent variables in the
model. If this happens the predictive ability of the model will actually be degraded since the model is
biased toward sample specific anomalies in the data that may not be characteristic of the underlying
population from which the sample was drawn.
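
A small sketch of this inflation effect, using randomly generated (hypothetical) data rather than any data set from the text: the intercept-only model has R2 = 0, so any positive R2 obtained from a regressor that is pure noise is inflation. The adjusted R2 formula used here is the standard one for n observations and k independent variables.

```python
import random

def simple_r2(x, y):
    """R2 of the least-squares line y = b0 + b1*x (the squared correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def adjusted_r2(r2, n, k):
    """Adjusted R2 for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

rng = random.Random(0)                          # fixed seed; data are made up
y = [rng.gauss(50, 10) for _ in range(15)]      # hypothetical dependent variable
noise = [rng.gauss(0, 1) for _ in range(15)]    # regressor unrelated to y

r2_noise = simple_r2(noise, y)
adj_noise = adjusted_r2(r2_noise, n=len(y), k=1)
print(f"R2 = {r2_noise:.4f}, adjusted R2 = {adj_noise:.4f}")
```

R2 cannot fall below the intercept-only value of 0 even though the regressor is superfluous, while the adjusted R2 is never larger than R2 and penalizes the extra variable.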

3. The optimal value of b0 is Ȳ and the optimal value of b1 is 0. If there is no relationship between X and Y
the best (ESS minimizing) estimate of Y is its average value. In this case, ESS=TSS, RSS=0, and R2 = 0.
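
The claim can be checked numerically. The sketch below uses a small hypothetical sample (not from the text) and shows that, with b1 = 0, the error sum of squares is smallest when b0 equals the sample mean, in which case ESS = TSS and R2 = 0.

```python
def ess_constant(y, b0):
    """Error sum of squares when every prediction is the constant b0 (b1 = 0)."""
    return sum((v - b0) ** 2 for v in y)

y = [12.0, 15.0, 9.0, 20.0, 14.0]        # hypothetical sample
y_bar = sum(y) / len(y)                  # sample mean

tss = sum((v - y_bar) ** 2 for v in y)   # total sum of squares
ess = ess_constant(y, y_bar)             # ESS at b0 = mean
r2 = 1 - ess / tss

print(y_bar, ess, tss, r2)
# Moving b0 away from the mean in either direction increases ESS.
print(ess_constant(y, y_bar + 0.5), ess_constant(y, y_bar - 0.5))
```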

4. The solution would be unbounded. For virtually any regression problem, the sum of the estimation errors
can be made to approach −∞ by selecting a regression function such that the estimated values are far greater
than the actual values. Even if we place a lower bound of zero on the sum of the estimation errors, a
regression function with a sum of estimation errors equal to zero will not necessarily fit the data well.
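
Both points can be illustrated with a minimal sketch on hypothetical data (the four points below are made up and lie exactly on Y = 1 + 2X). Raising the intercept drives the raw sum of errors down without limit, and a flat line through the mean of Y has errors that sum to zero even though it ignores X entirely.

```python
def sum_errors(x, y, b0, b1):
    """Sum of raw estimation errors, sum(Y_i - Yhat_i), for the line b0 + b1*x."""
    return sum(yi - (b0 + b1 * xi) for xi, yi in zip(x, y))

x = [1.0, 2.0, 3.0, 4.0]   # hypothetical data
y = [3.0, 5.0, 7.0, 9.0]   # exactly y = 1 + 2x

# Larger intercepts push the sum of errors toward minus infinity.
for b0 in (1.0, 100.0, 10_000.0):
    print(b0, sum_errors(x, y, b0, 2.0))

# A poorly fitting line can still have errors that sum to zero.
y_bar = sum(y) / len(y)
flat_sum = sum_errors(x, y, y_bar, 0.0)
print(flat_sum)
```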

5. You should collect data so that the average value of the X1 observations is equal to X1h.

6. a. See file: Prb9_6.xlsx


There is a reasonably linear relationship between the variables -- except at the upper end of the X-
axis. If possible we should investigate this anomaly in the data to make sure this is not due to a data
entry error.
b. The estimated regression equation is: Ŷ = −7.84 + 0.8121 X1
c. R2=0.849. Approximately 84.9% of the total variation in Y (long-term total debt) around its mean
can be accounted for using X1 (long-term assets).
d. Approximate 95% LCL = 32.776 - 2×4.2835 = 24.2 (million)
Approximate 95% UCL = 32.776 + 2×4.2835 = 41.3 (million)
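
The approximate limits in part d follow the rule "prediction ± 2 standard errors". A minimal sketch of that arithmetic, where the function name `approx_interval` is a hypothetical helper and the inputs are the values shown in part d:

```python
def approx_interval(y_hat, std_err, multiplier=2.0):
    """Approximate interval: point prediction plus/minus multiplier * standard error."""
    return y_hat - multiplier * std_err, y_hat + multiplier * std_err

lcl, ucl = approx_interval(32.776, 4.2835)
print(f"{lcl:.1f} to {ucl:.1f} (million)")
```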

7. a. See file: Prb9_7.xlsx


There is a reasonably linear relationship between the variables.
b. The estimated regression equation is: Ŷ = −6169.94 + 195.698 X1
c. R2=0.922. Approximately 92% of the total variation in Y (Charitable Contributions) can be
accounted for using X1 (AGI).
d. A confidence interval for the expected level of charitable contributions for a given level of AGI can be
constructed. Charitable contributions in excess of the upper limit of the confidence interval would be
suspect.

8. See file: Prb9_8.xlsx


a. Using mileage (X1) as the independent variable produces the highest R2 value (0.844).
b. Ŷ = −206653 − 155.647 X1 + 118.78 X2
Adding X2 to the analysis does not really help if X1 is already included in the model. These two
variables are likely to be highly correlated.
c. Ŷ = 28793.9 − 177.476 X1 + 2886.84 X3
Adding X3 to the analysis does seem to help if X1 is already included in the model. The adjusted R2
value is 0.876.

d. A t-top adds approximately $2886 to the re-sale value of the car.


e. Ŷ = −299860 − 144.782 X1 + 164.765 X2 + 3001.5 X3
f. The regression function in c has the largest adjusted R2.

9. See file: Prb9_9.xlsx


a. The relation between mileage and price seems to be fairly linear while the relationship between model
year and price appears to be quadratic.
b. Ŷ = 1.54E+8 − 107.267 X1 − 156354 X2 + 2440.542 X3 + 39.570 X4
c. No, and the adjusted R2 is also better than any of the models investigated in problem 8.

10. See file: Prb9_10.xlsx


a. Strong linear relations are suggested by each plot.
b. TV advertising (X3) has the single strongest R2 at 0.6745.
c. Using all three variables produces the highest adjusted R2 at 0.996.
d. Ŷ = −43.096 + 0.0598 X1 + 0.1159 X2 + 0.0582 X3
e. Spend $3679 on Print, $4920 on Web, and $900 on TV. Total spend = $9499.
f. The spend on the Web is at its maximum and TV is at its minimum. This combination of spend levels
is not similar to anything in our sample so we might be extrapolating too much from our data. A more
reliable approach might be a solution that minimizes the deviation from the average spend in each
media that produces the desired # of applications.

11. See file: Prb9_11.xlsx


a. Each plot suggests a linear relationship.
b. Annual total inpatient days (X3).
c. X3 & X2
d. X3 , X2 & X1
e. X3 , X2 & X1
f. Ŷ = 22.33+16.02X3-7.67X2+9.84X1
g. The 12th and 26th observations are more than 2 standard errors above their expected values; the 34th
observation is more than 2 standard errors below its expected value.

12. See file: Prb9_12.xlsx


a. Yes
b. Ŷ = 5.471 + 0.1777 X1
c. The R2 statistic indicates that approximately 96.6% of the total variation in the % of O-ring
expansion is accounted for by temperature.
d. 10.62%
e. No. While the % of O-ring expansion estimated in part d is well above the 5% minimum requirement,
this estimate is based on a rather large extrapolation. The lowest actual launch temperature for
which data is available is 55 degrees. Thus, we cannot be at all confident in the estimate at 29
degrees.
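
Parts d and e can be sketched together: evaluate the fitted line at 29 degrees, then flag the estimate because it lies below the lowest observed temperature. The function names are hypothetical; the coefficients and the 55-degree lower bound come from parts b and e. Only the lower end of the data range is checked, since the highest observed temperature is not stated here.

```python
B0, B1 = 5.471, 0.1777        # coefficients from part b
MIN_OBSERVED_TEMP = 55        # lowest launch temperature in the data (part e)

def predict_expansion(temp_f):
    """Estimated % of O-ring expansion at the given temperature."""
    return B0 + B1 * temp_f

def is_extrapolation(temp_f, lowest=MIN_OBSERVED_TEMP):
    """True when temp_f is below the lowest temperature in the sample."""
    return temp_f < lowest

estimate = predict_expansion(29)
print(f"{estimate:.2f}% (extrapolated: {is_extrapolation(29)})")
```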

13. See file: Prb9_13.xlsx


a. Both independent variables exhibit a quadratic relationship with the dependent variable.
b. Ŷ = −3.0093 + 1.153 X1 + 1.621 X2
R2 = 0.817. Approximately 81.7% of the total variation in the returns on the stocks is accounted for
using this particular model.
c. Ŷ = −53.54 − 8.127 X1 + 152.7395 X2 + 0.3707 X3 − 51.834 X4
R2 = 0.985. Approximately 98.5% of the total variation in the returns on the stocks is accounted for
using this model.
d. The second model has the highest adjusted R2.

14. a. See file Prb9_14.xlsx


b. A third order polynomial seems to fit reasonably well.
c. Ŷ = 216.44 − 109.37 X1 + 25.969 X1² − 1.7699 X1³
d. Approximately 98.4% (R2 = 0.9837) of the total variation in the dependent variable (breaking point)
around its mean is accounted for by the model.
e. About 5.773 ounces of glue.

15. a. See file Prb9_15.xlsx


b. R2 = 0.778. Approximately 77.8% of the total variation in the number of mortgage applicants is
accounted for using this model.
c. Approximate Lower confidence limit = ~596.0 (Exact = 527.2)
Approximate Upper confidence limit = ~2320.2 (Exact = 2389.0)
d. R2 = 0.972. Approximately 97.2% of the total variation in the number of mortgage applicants is
accounted for using this model.
e. Approximate Lower confidence limit = 780.8
Approximate Upper confidence limit = 1413.2
f. The quadratic model appears to be much more accurate.

16. a. See file: Prb9_16.xlsx, quadratic relationship.


b. See file: Prb9_16.xlsx, quadratic relationship.
c. Ŷ = 30.337 + 6.622 X1 − 0.3576 X1² + 7.040 X2 − 0.4889 X2²
d. 81.917

17. a. See file: Prb9_17.xlsx


b. Adjusted R2 = 0.895; but not a good fit
c. Adjusted R2 = 0.892; but not a good fit
d. Adjusted R2 = 0.961; and a pretty good fit

18. a. Ŷ = 31.212 + 0.3656 X1


b. See file: Prb9_18.xlsx. This model does not account for much of the systematic variation in the data.
c. R2 = 0.0502. Approximately 5% of the total variation in the peak demand for power around its mean is
accounted for by the model in part a.
d. Ŷ = −9.345 + 2.746 X1 + 11.115 X2 + 12.77 X3 + 11.20 X4 + 13.373 X5 + 8.706 X6 + 0.707 X7
e. See file: Prb9_18.xlsx. This model fits the data reasonably well.
f. R2 = 0.774. Approximately 77.4% of the total variation in the peak demand for power around its mean
is accounted for by the model in part d.
g. Ŷ = −9.345 + 11.20 (1) + 0.707 (94) = 68.28 MwH
h. Approximate Lower confidence limit: 68.28 – 2 × 3.075 = 62.13 MwH
Approximate Upper confidence limit: 68.28 + 2 × 3.075 = 74.43 MwH
Duque can be approximately 95% confident that the actual peak demand for power on a 94 degree
Wednesday in July will fall within the interval calculated above. Duque can use this information to
ensure it has enough generating capacity available to meet peak demand.
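
The prediction in part g sets one indicator variable to 1 and all other indicators to 0. A sketch of that evaluation follows; it assumes, as part g implies, that X4 is the Wednesday indicator and X7 is temperature, and that every variable not supplied is zero. The dictionary layout and the `predict` helper are hypothetical. With the rounded coefficients printed in part d the result is about 68.31, slightly different from the 68.28 reported above, presumably because the workbook carries unrounded coefficients.

```python
# Coefficients as printed in part d (rounded).
COEFFS = {
    "X1": 2.746, "X2": 11.115, "X3": 12.77, "X4": 11.20,
    "X5": 13.373, "X6": 8.706, "X7": 0.707,
}
INTERCEPT = -9.345

def predict(values):
    """Evaluate the fitted equation; variables missing from `values` count as 0."""
    return INTERCEPT + sum(COEFFS[name] * v for name, v in values.items())

# ASSUMPTION: X4 = 1 marks Wednesday and X7 = 94 is the temperature (per part g).
peak = predict({"X4": 1, "X7": 94})
print(f"{peak:.2f} MwH")
```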

19. a. See file: Prb9_19.xlsx


Estimated Price = -403.74+181.88B+436.2C+311.8D+139.47T
b. Approximately 76.9% of the total variation in selling price around its mean is accounted for by the
regression model.
c. 1355 ± 2 × 181.3. We are approximately 95% confident that prices of 18 inch diameter casserole pans
will fall within the price range of $992 to $1717.
d. A categorical variable indicating the condition of each piece might help explain the variations in price.

20. a. See file: Prb9_20.xlsx. Age and Mileage both have fairly strong negative linear relations with price.
The geographic variables all have fairly weak linear relations with price, with the possible exception of
the indicator variable for the Mid-Atlantic region.
b. The highest R2 value is obtained using all of the variables. R2 = 0.722. Estimated Price = 20339.05 −
1070.75 Age − 0.0645 Mileage − 1073.97 West + 2156.02 South + 3318.418 Northeast − 1746.53 Mid-Atlantic
+ 2153.14 Southwest
c. Adjusted R2 = 0.672. Estimated Price = 19346.321 − 1051.9795 Age − 0.0685759 Mileage + 3237.1774
South + 4354.1414 Northeast + 3229.119 Southwest
d. See file: Prb9_20.xlsx.
e. The condition of the car, whether or not it is a convertible, etc.

21. a. See file: Prb9_21.xlsx


b. The highest R2 value is obtained using all of the variables. R2 = 0.814. Estimated Viscosity = −98800.84 −
5.44 X1 + 1580.13 X2 − 87.97 X3 + 0.14 X4 − 0.65 X5 − 19.83 X6 − 1419.47 X7
c. The highest adjusted R2 value is obtained using variables 1, 2, 3, 5, & 6. R2 = 0.781.
Estimated Viscosity = −83441 − 4.67 X1 + 1351.86 X2 − 88.02 X3 − 0.477 X5 − 24.332 X6
d. About 7346 pounds of EO.

22. See file Prb9_22.xlsx

a. Plot 2 suggests a strong linear relationship.
Plot 3 suggests a moderate linear relationship.
Plots 1, 4 & 5 suggest no/weak linear relationships.

b. X2 and X5 produce the highest adjusted R2 value.


Ŷ = 43164.8 + 941.352 X2 − 4593 X5
Adjusted R2 = 82.99%
c. See file Prb9_22.xlsx
d. Units of Work, and the binary variables for cities 1, 2, and 4.
Ŷ = 19586.18 + 924.60 X1 + 20202.04 X2 + 25295.1 X3 + 5871.19 X4
Adjusted R2 = 84.46%
e. The model in part d is better because it has a higher adjusted R2.

23. a. See file: Prb9_23.xlsx, linear relations


b. Years of service. R2 = 0.737.
c. Years of service & Certifications. Adjusted R2 = 0.8557.
d. Use the model with all 3 variables. Adjusted R2 = 0.9005.
e. Ŷ = 32.921 + 1.0578 X1 + 0.3252 X2 + 1.2992 X3
f. $46,780 ± 1.5 × $1,726, or $44,191 to $49,369

24. a. See file: Prb9_24.xlsx. There is a fairly strong linear relation between average outside temperature and
average heating cost. There is also a fairly strong linear relation between square footage and average
heating cost. There is somewhat of a linear relation between age of furnace and average heating cost.
There is not much of a relation between the amount of attic insulation and average heating cost.
b. Square footage has the strongest linear relation with average heating cost.
c. Attic insulation and square footage.
d. Attic insulation, age of the furnace, and square footage.
e. Ŷ = −29.218 − 1.178 X1 − 6.895 X2 + 3.213 X3 + 0.149 X4
f. Ŷ = -152.037 - 4.8923×4 + 3.698×5 + 0.1815×2500 = 300.801
Approximate Lower confidence limit = 300.801 – 2 × 34.985 = 230.84
Approximate Upper confidence limit = 300.801 + 2 × 34.985 = 370.78
We can be approximately 95% confident that the actual average heating cost for a home with 4 inches
of insulation, a 5-year old furnace, and 2500 square feet will be between about $230 and $370.

25. a. See file: Prb9_25.xlsx


b0 = 6.030
b1 = 0.170
b. It places less emphasis on outliers
c. There is no easy goodness of fit measure.

26. a. See file: Prb9_26.xlsx


b0 = 4.591
b1 = 0.191

b. The estimation error for each observation is kept to a minimum level.


c. There is no easy goodness of fit measure. Undue emphasis may be given to outliers.

Case 9-1: Diamonds Are Forever


See file: Case9_1.xlsx

1. There doesn’t appear to be much of a linear relationship between price and color or price and clarity. There
appears to be a positive correlation between price and carats.
2. The model with X3 & X2 as independent variables has the highest adjusted R2 value (69.4%).
3. Ŷ = 42.05 − 165.35 X2 + 1688.0 X3, R2 = 72.3%, adjusted R2 = 69.4%.
4. See file: Case9_1.xlsx. The eighth set of earrings appears to be the most underpriced value.
5. The model with X3 & X2 as independent variables has the highest adjusted R2 value (71.4%).
6. Ŷ = 14.807 − 2.65 X2 + 28.13 X3, R2 = 74.1%, adjusted R2 = 71.4%.
7. See file: Case9_1.xlsx. The fourth & fifth sets of earrings appear to be the most underpriced values.
8. Step-wise regression leads to the selection of a model with X3 & X6 with an adjusted R2 of 73.3%.
9. Ŷ = −25.33 + 15.03 X1 + 63.8 X3 − 1.16 X4 − 12.68 X5, R2 = 81.6%, adjusted R2 = 77.3%.
10. See file: Case9_1.xlsx. The eighth set of earrings appears to be the most underpriced value. (In this final
analysis, it is interesting to note that the earrings selected as the best value in part d now appear to be far less
of a great value. It is also instructive to note that step-wise regression is a heuristic that does not
necessarily lead to the best multiple regression model.)
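
The R2 and adjusted R2 pairs reported in this case are linked by the standard adjusted R2 formula. The sketch below reproduces them under one loudly stated assumption: the sample size n = 22 is not given above and is inferred here only because it makes the reported pairs consistent with each other. Treat it as a consistency check, not as a statement about the case data.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

N_OBS = 22   # ASSUMPTION: inferred from the reported statistics, not stated in the text

# (reported R2, number of independent variables) for answers 3, 6, and 9
for r2, k in ((0.723, 2), (0.741, 2), (0.816, 4)):
    print(f"R2 = {r2:.3f}, k = {k}: adjusted R2 = {adjusted_r2(r2, N_OBS, k):.3f}")
```

Because the penalty factor (n − 1)/(n − k − 1) grows with k, the four-variable model in answer 9 needs a noticeably higher R2 to come out ahead on adjusted R2.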

Case 9-2: Fiasco In Florida


See file: Case9_2.xlsx

1. 1725 votes
2. See file: Case9_2.xlsx. Palm Beach county appears to be an outlier.
3. Buchanan votes = 109.7673 + 0.002541 × Gore Votes
4. R2 = 0.630. Approximately 63% of the total variation in the votes received by Buchanan in each county is
accounted for by the number of votes received by Gore in the same county.
5. Buchanan votes = 109.7673 + 0.002541 × 268945 = 793.15
Lower confidence limit = 793.15 – 2.576 × 137.32 = 439.32
Upper confidence limit = 793.15 + 2.576 × 137.32 = 1146.87
With approximately 99% confidence, we would expect Buchanan to receive between 439 and 1147 votes in
Palm Beach county if Gore received 268945.
6. See file: Case9_2.xlsx. Palm Beach county appears to be an outlier.
7. Buchanan votes = 66.09 + 0.003478 × Bush Votes
8. R2 = 0.753. Approximately 75.3% of the total variation in the votes received by Buchanan in each county
is accounted for by the number of votes received by Bush in the same county.
9. Buchanan votes = 66.09 + 0.003478 × 152846 = 597.71
Approximate Lower confidence limit = 597.71 – 2.576 × 112.18 = 308.73
Approximate Upper confidence limit = 597.71 + 2.576 × 112.18 = 886.68
With approximately 99% confidence, we would expect Buchanan to receive between 309 and 887 votes in
Palm Beach county if Bush received 152846.
10. The results suggest that something quite unexpected happened with the votes in Palm Beach county. We
cannot really say much more than that, but it does appear that whatever happened may have cost Al Gore
the election. This analysis assumes that the voting patterns observed in other counties of Florida are
representative of voting patterns in Palm Beach county – which may or may not be the case.
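
The multiplier 2.576 used in answers 5 and 9 is the two-sided standard normal value for 99% confidence. A sketch using Python's standard library to recover it and rebuild the interval from answer 9; the helper names are hypothetical, and small differences in the last digit relative to the figures above come from rounding the multiplier.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Two-sided standard normal multiplier for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def interval(y_hat, std_err, confidence):
    """Approximate interval: y_hat plus/minus z * standard error."""
    z = z_multiplier(confidence)
    return y_hat - z * std_err, y_hat + z * std_err

z99 = z_multiplier(0.99)
bush_lcl, bush_ucl = interval(597.71, 112.18, 0.99)   # values from answer 9
print(f"z = {z99:.3f}, interval = {bush_lcl:.0f} to {bush_ucl:.0f} votes")
```

The same helper with a confidence of 0.95 gives a multiplier of about 1.96, which is the value the "± 2 standard errors" rule used elsewhere in this chapter rounds up.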

Case 9-3: The Georgia Public Service Commission


1. See file: Case9_3.xlsx

2. See file: Case9_3.xlsx



3. Ŷ = 33.3205 + 15.0159 X1

4. R2 = 0.8735. Approximately 87.4% of the total variation in line maintenance expense is accounted for
using this model.

5. Ŷ = 33.3205 + 15.0159 (75) = 1,159.51

6. An approximate 95% confidence interval for a new line maintenance expense at this number of customers
is given by:
Approximate 95% Lower Confidence Limit = 1,159.51 - 2×187.713 = $784.09 (in 000s)
Approximate 95% Upper Confidence Limit = 1,159.51 + 2×187.713 = $1,534.95 (in 000s)
Thus, it would not be unexpected for a company with 75,000 customers to show a line maintenance charge
of $1,500,000.

7. See file: Case9_3.xlsx. There seems to be some systematic variation that is not being accounted for by
the linear model.

8. Ŷ = 707.47 − 7.392 X1 + 0.1543 (X1)²

9. R2 = 0.9416. Approximately 94.2% of the total variation in line maintenance expense is accounted for
using this model.

10. Adjusted R2 = 0.9286. This is larger than the R2 and adjusted-R2 statistics from the linear model -- implying that
the addition of the quadratic term in the model served a useful purpose.

11. Ŷ = 707.47 − 7.392(75) + 0.1543(75)² = 1,021.024

12. See file: Case9_3.xlsx. The quadratic model seems to fit the data much better.

13. An approximate 95% confidence interval for a new line maintenance expense at this number of customers
is given by:
Approximate 95% Lower Confidence Limit = 1,021.024 - 2×134.407 = $752.209 (in 000s)
Approximate 95% Upper Confidence Limit = 1,021.024 + 2×134.407 = $1,289.838 (in 000s)
Thus, it would appear unusual for a company with 75,000 customers to show a line maintenance charge of
$1,500,000 and Nolan might wish to investigate the cause of this further.
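
The contrast between answers 6 and 13 can be sketched by evaluating both fitted models at 75 (thousand customers) and testing whether the reported $1,500 (thousand) charge falls inside each approximate 95% interval. The function names are hypothetical; the coefficients and standard errors are those shown in the answers above.

```python
def linear_model(x):
    """Fitted linear model from answer 3 (x in 000s of customers)."""
    return 33.3205 + 15.0159 * x

def quadratic_model(x):
    """Fitted quadratic model from answer 8 (x in 000s of customers)."""
    return 707.47 - 7.392 * x + 0.1543 * x ** 2

def approx_95_interval(y_hat, std_err):
    """Approximate 95% interval: prediction plus/minus 2 standard errors."""
    return y_hat - 2 * std_err, y_hat + 2 * std_err

def is_within(value, bounds):
    low, high = bounds
    return low <= value <= high

CUSTOMERS = 75       # in 000s
REPORTED = 1500.0    # line maintenance charge, in $000s

lin = approx_95_interval(linear_model(CUSTOMERS), 187.713)
quad = approx_95_interval(quadratic_model(CUSTOMERS), 134.407)
print("linear model:   ", is_within(REPORTED, lin))
print("quadratic model:", is_within(REPORTED, quad))
```

The charge sits inside the linear model's interval but outside the quadratic model's, which is why the choice of model matters for the conclusion.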

14. The quadratic model appears to be best.
