
# 4/20/2007

24 Answers

**Mix and Match**

1. f 2. e 3. c 4. g 5. h 6. j 7. a 8. d 9. b 10. i

True/False

11. True.
12. False. The value of s_e typically decreases, but it does not have to. R² must increase.
13. False. It's called a marginal slope because it includes the effects of other explanatory variables.
14. True.
15. False. It might be smaller, but it does not have to be smaller. It depends on the size and sign of any indirect effects.
16. False. The marginal and partial slopes need not even have the same sign, much less both be close to zero.
17. True.
18. True.
19. False. We should only conclude that at least some deviation from this hypothesis occurs. It may not be the case that both are different from zero. Perhaps only one of them differs from zero.
20. True.
21. False. Its primary use is locating the effects of leveraged outliers.
22. True.

Think About It

23. Most likely we have some collinearity. Busy areas attract a lot of fast food outlets because sales are high (positive correlation). Among densely populated areas, however, the number of competitors reduces sales of a store (negative partial slope). You'd like to have the densely populated area to yourself. The more competitors that are around, the lower your sales for a given population density.

24. The two explanatory variables, test score and education, are evidently redundant. Once you know one, the other adds little value. Both are positively correlated (evidently), so either has a positive correlation with performance. But once you know the educational background, the score on the qualifying test adds little additional value.

25.
a) Estimated Salary = b0 + 5 Age + 2 Test Score
b) The indirect effect is 2 years/point * 5 $M/year = 10 $M/point, larger than the direct effect.
c) The marginal effect is the direct plus indirect effect, or 2 + 10 = 12 $M/point.
d) You're not going to be much older, so we need the partial effect. Raising the test score by 5 points nets $10,000 annually. It's probably worth it if you're going to stay with the company long enough to earn it back.

26.
a) No, not without the intercept.
b) Positive. The marginal slope is -0.1 + 0.7 * 0.2 = 0.04.
c) A young person with lots of money to spend.

27.
a) The correlation of something with itself is 1.
b) You cannot, not without knowing the variance of x1.
c) The partial and marginal slopes will be the same because the two explanatory variables are evidently uncorrelated. There can be no indirect effect.

28.
a) Yes. R² is at least as large as 0.7408² ≈ 0.549 > 0.54.
b) The same as the correlation, 0.7408. The correlations become covariances when standardized, so we have the covariances and variances.
c) They differ because the two explanatory variables are correlated.

29.
a) The fitted values are
87 + 0.3 * 250 + 1.5 * 100 = 312, or $312,000 revenue per month
87 + 0.3 * 200 + 1.5 * 75 = 259.5, or $259,500 revenue per month
Expand to the second location.
b) The intercept, $87,000, resembles a fixed cost. The intercept estimates fixed revenue that is present regardless of the distance to the destination or the population. Perhaps it's money earned from air freight or other services provided by the airline. Without a confidence interval, we cannot be sure if the value is really far from zero. It might be a large extrapolation.
c) Among comparably populated cities, flights to those that are 100 miles farther away produce 0.3 × 100 = 30 thousand dollars ($30,000) more revenue per month, on average.

d) If we compare revenue from flights to cities that are equally distant from the hub, average monthly revenue to larger cities is higher by about $1.5 per person.

30.
a) The estimated margin from the location near the office complex is
Est Margin = 54 - 0.0073 * 2250 + 0.0216 * 400 = 46.215%
whereas at the more isolated complex the margin is
Est Margin = 54 - 0.0073 * 300 + 0.0216 * 50 = 52.89%
Choose the more isolated site.
b) The intercept, 54% operating margin, is a baseline value added to the estimated margin for every hotel. Without seeing the scale of the other variables, we cannot tell if the intercept is an extrapolation if interpreted as the predicted value for a hotel in a very isolated location with no competitors or offices.
c) The negative slope indicates that on average, sites with more competing rooms have lower operating margins (at a slope of about 0.0073% per additional competing room).
d) The slope for office shows that sites near offices earn higher margins, on average, with about a 0.02 gain in margin per additional 1000 square feet.

31.
a) Yes, because the overall F-statistic is F = (0.74/0.26) * ((37-1-2)/2) ≈ 48.4 >> 4.
b) The filled in table is

| Term | Estimate | SE | t-statistic | p-value |
|---|---|---|---|---|
| Intercept | 87.3543 | 55.0459 | 1.5869 | ≈ 0.10 |
| Distance | 0.3428 | 0.0925 | 3.7060 | < 0.01 |
| Population | 1.4789 | 0.2515 | 5.8803 | < 0.01 |

c) Yes, the t-statistic for Distance is larger than 2 in absolute size.
d) Based on the fit of this model, the confidence interval for 10 times the slope for population is 14.789 - 2 * 2.515 = 9.759 to 14.789 + 2 * 2.515 = 19.819 thousand dollars. The relevant se rounds to 2.5, so we should keep 1 decimal place and give the interval as 9.7 to 19.8 thousand dollars ($9,700 to $19,800).

32.
a) Yes. The overall F-statistic is (0.45/0.55) * ((100-1-2)/2) ≈ 39.7 >> 4.
b) The completed table of output is

| Term | Estimate | SE | t-statistic | p-value |
|---|---|---|---|---|
| Intercept | 53.9826 | 5.1777 | 10.4259 | << .001 |
| Rooms | -0.0073 | 0.0013 | -5.6154 | << .001 |
| Office | 0.0216 | 0.0176 | 1.2273 | ≈ 0.2 |

c) No, the p-value is larger than 0.05 and the t-statistic is less than 2.
d) Yes, because the absolute value of the t-statistic is larger than 2.

33.
b) The standard deviation of the residuals around the fit is $32,700. Given that the conditions of the model check out, we ought to be able to predict monthly revenue to within about $65,000 with 95% confidence.

34.
b) We could predict to within about 2 times s_e, or 16.8% with 95% confidence.
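The F-statistics in Exercises 31(a) and 32(a) and the t-statistics in the completed tables follow directly from the summary numbers. A minimal Python sketch of that arithmetic is below; the function names `overall_f` and `t_statistic` are illustrative and not part of the text, and the inputs are the values quoted in the answers.

```python
def overall_f(r_squared: float, n: int, k: int) -> float:
    """Overall F-statistic for a regression with k explanatory variables
    fit to n cases: F = (R^2 / (1 - R^2)) * ((n - 1 - k) / k)."""
    return (r_squared / (1.0 - r_squared)) * ((n - 1 - k) / k)


def t_statistic(estimate: float, std_error: float) -> float:
    """t-statistic for testing that a coefficient equals zero."""
    return estimate / std_error


# Airline model (Exercise 31): R^2 = 0.74, n = 37, two explanatory variables.
airline_f = overall_f(0.74, 37, 2)

# Hotel model (Exercise 32): R^2 = 0.45, n = 100, two explanatory variables.
hotel_f = overall_f(0.45, 100, 2)

# Slope for Distance in the airline table: estimate 0.3428, SE 0.0925.
distance_t = t_statistic(0.3428, 0.0925)

print(round(airline_f, 1), round(hotel_f, 1), round(distance_t, 2))
```

Both F-statistics are far above the rough cutoff of 4 used in the answers, and the t-statistic for Distance exceeds 2.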

You Do It

35. Diamonds
a) The plots show the discrete properties of the data: we only have several fixed lengths and widths. The plots look straight enough (particularly that for width). The two x's are not very correlated.

[Figure: scatterplots of Price ($) on Length (Inch), Price ($) on Width (mm), and Length (Inch) on Width (mm)]

b) The largest correlation (0.95) is between price and width. Width is very highly related to price. Evidently width tells you more about how much gold than the length.

| | Price ($) | Length (Inch) | Width (mm) |
|---|---|---|---|
| Price ($) | 1.0000 | 0.1998 | 0.9544 |
| Length (Inch) | 0.1998 | 1.0000 | 0.0355 |
| Width (mm) | 0.9544 | 0.0355 | 1.0000 |

c) The fit of this model has R² = 0.94 and s_e = $57 with these coefficients:

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | -405.635 | 62.11863 | -6.53 | <.0001 |
| Length (Inch) | 8.8838083 | 2.654034 | 3.35 | 0.0026 |
| Width (mm) | 222.48894 | 11.64679 | 19.10 | <.0001 |

d) First, the overall fit of the model is not straight enough; there's a trend in the residuals. Second, the model is missing an obviously important variable: the amount of gold in the chain.

[Figure: Price ($) residuals plotted on predicted Price ($)]

e) We formed the "volume" of the chain as the length (in mm) times the width². This in a way gets at the amount of gold in the chain, though not perfectly. The model looks much straighter with a much smaller s_e near $17. The residuals have some pattern left, but they are much smaller. There's still a problem in the residuals, but there's not the clear trend as before, and now we can identify some outliers (a bargain and an expensive chain) that were hidden. Our proxy for gold isn't perfect for the heavier chains, such as the point highlighted in the figures.

[Figure: Price ($) residuals plotted on predicted Price ($) for the model with volume]

f) Here's the fit for the improved model, with R² = 0.994674 and s_e = 17.0672. With the added volume, the other two explanatory variables, particularly the length, lose importance.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 55.118884 | 34.43198 | 1.60 | 0.1225 |
| Length (inch) | 0.0451975 | 0.971144 | 0.05 | 0.9633 |
| Width (mm) | -30.59663 | 16.27885 | -1.88 | 0.0724 |
| Volume (cu mm) | 0.0930388 | 0.005845 | 15.92 | <.0001 |

36. Convenience shopping
a) The scatterplots appear straight enough. The only problems appear to be a scattering of outliers.

[Figure: scatterplots of Sales (Dollars) on Volume (Gallons), Sales (Dollars) on Car Washes, and Volume (Gallons) on Car Washes]

b) The largest correlation is between volume of gas and sales. Car washes are slightly correlated with both of these, but not very much. Seems as though sales at the car wash are not very predictive of either gas volume or sales at the store.

| | Sales (Dollars) | Volume (Gallons) | Car Washes |
|---|---|---|---|
| Sales (Dollars) | 1.0000 | 0.6496 | 0.1700 |
| Volume (Gallons) | 0.6496 | 1.0000 | 0.1242 |
| Car Washes | 0.1700 | 0.1242 | 1.0000 |

c) The fitted model has R² = 0.430022 and s_e = 245.6717.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 1112.1759 | 77.8611 | 14.28 | <.0001 |
| Volume (Gallons) | 0.3150315 | 0.022442 | 14.04 | <.0001 |
| Car Washes | 0.2326914 | 0.1166 | 2.00 | 0.0469 |

d) The outliers are scattered and not very serious with so much data. The residuals are nearly normal.

[Figure: Sales (Dollars) residuals on predicted Sales (Dollars), with histogram and normal quantile plot of the residuals]

e) The slope for car washes indicates that among stations with comparable levels of gasoline sales, those that sell more car washes generate higher sales in the connected convenience store. The size of the effect is small, however, with added sales amounting to between nothing and $0.47 in added daily sales (on average) per additional wash. To get the interval, the calculations are 0.2326914 - 2 * 0.1166 ≈ -0.0005 to 0.2326914 + 2 * 0.1166 = 0.4659, rounded to 2 decimals. The lower endpoint is basically zero. The reported p-value is slightly less than 0.05 because the precise cutoff with this number of cases is 1.96 rather than our approximate 2. After rounding, however, the difference does not matter.

37. Download
a) The file sizes increased steadily over the day, meaning that these two explanatory variables are closely associated. The scatterplots of transfer time on file size and time of day seem reasonably linear, though there may be some bending in the plot of transfer time on the time of day.

[Figure: scatterplots of Transfer Time (sec) on File Size (MB) and on Time]

[Figure: scatterplot of File Size (MB) on Time]

b) The marginal and partial slopes for the file size will be very different. The file size and time of day are virtually redundant, so the indirect effect of file size will be very large. We will not easily be able to separate their influence from one another.

c) The multiple regression has R² = 0.624569 and s_e = 6.283617.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 7.1388209 | 2.885703 | 2.47 | 0.0156 |
| File Size (MB) | 0.3237435 | 0.179818 | 1.80 | 0.0757 |
| Time (hours since 8 am) | -0.185726 | 3.16189 | -0.06 | 0.9533 |

d) Somewhat, but not completely. The residual plot suggests slightly more variation for larger file sizes. There is also a slight negative dependence over time, with the residuals oscillating back and forth from positive to negative. The effect is fairly subtle and is also evident in a time plot of the residuals. Again, the effect is not too strong (albeit significant by the Durbin-Watson test, D = 2.67). The residuals appear nearly normal with no evidence of bending patterns.

[Figure: Transfer Time (sec) residuals on predicted Transfer Time (sec), with histogram and normal quantile plot of the residuals]

e) No. The overall F-statistic, approximately F = (0.624/(1-0.624)) * (77/2) ≈ 64, is very significant (being much larger than 4). On the other hand, the t-statistics as seen in the tabular summary are both less than 2. Thus, we can reject H0: β1 = β2 = 0, but cannot reject either H0: β1 = 0 or H0: β2 = 0. The outcomes of these tests are "weird".

f) The key difference is the increase in the se of the slope. The confidence interval for the partial slope for file size from the multiple regression is 0.3237435 - 2 * 0.179818 to 0.3237435 + 2 * 0.179818, or about -0.04 to 0.68 seconds per MB, a huge range that includes zero. The interval for the marginal slope is 0.3133 - 2 * 0.0275 to 0.3133 + 2 * 0.0275, or about 0.2583 to 0.3683 seconds per MB. The estimates (slopes) are about the same, but the range in the multiple regression is much larger.

g) The direct effect of file size (from the multiple regression) is 0.32 sec/MB. The indirect effect (from the simple regressions) is (0.0562 hours since 8am/MB) * (-0.186 sec/hour after 8am) = -0.0104532 sec/MB, which is very small. The path diagram only tells you about the difference between the indirect and direct effect (slope in the simple and multiple regression), not the change in the standard errors.

38. Home prices
a) Some of the homes are large and expensive, making these leveraged outliers. One particularly large home has 7 baths (bet they have someone else do the cleaning). The relationships appear linear. The two explanatory variables are related, as you would expect.

[Figure: scatterplots of Price ($M) on Sq Feet, Price ($M) on Num Bath Rms, and Sq Feet on Num Bath Rms]

b) The fitted model has R² = 0.533512 and s_e = 81.03068.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 107.41869 | 19.59055 | 5.48 | <.0001 |
| Sq Feet | 45.16066 | 5.78193 | 7.81 | <.0001 |
| Num Bath Rms | 14.793861 | 11.74715 | 1.26 | 0.2099 |

c) There's no sign of the usual changing variation. This looks to meet the usual assumptions. The residuals are nearly normal. The concern remains the presence of the leveraged outlier.

[Figure: Price ($M) residuals on predicted Price ($M), with histogram and normal quantile plot of the residuals]

d) Yes. The overall F-statistic is F = (0.5335/(1-0.5335)) * (150-1-2)/2 ≈ 84, which is much larger than the 4 needed to assure statistical significance.

e) The confidence interval for the marginal slope is 82.3267 - 2 * 9.4291 = 63.4685 to 82.3267 + 2 * 9.4291 = 101.1849, or about 63 to 101 thousand dollars per bathroom. For the partial slope, the CI is 14.7939 - 2 * 11.7472 = -8.7005 to 14.7939 + 2 * 11.7472 = 38.2883, or about -9 to 38 thousand dollars per bathroom. The range of the intervals is comparable, but the estimates are rather different. The estimates change because of the correlation between the two explanatory variables (evident in "a"), which implies a large indirect effect.

f) She's unlikely to recover the value of the conversion from the sale price. The value of converting space (the partial slope, since the conversion to a bathroom does not increase the size of the home) is from -9 to 38 thousand dollars, and her cost of 40 thousand lies outside this range. Don't do it (unless she just wants another bathroom).

39. Production costs
a) The scatterplots are OK: roughly linear with a few troublesome outliers. These jobs feature expensive material costs, but relatively typical labor hours and average costs.

[Figure: scatterplots of Average Cost ($/unit) on Material Cost ($/unit) and on Labor Hours (Hrs/unit)]

[Figure: scatterplot of Material Cost ($/unit) on Labor Hours (Hrs/unit)]

b) The estimated multiple regression has R² = 0.336022 and s_e = 7.337964.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 19.873795 | 2.084669 | 9.53 | <.0001 |
| Material Cost ($/unit) | 2.2842944 | 0.444853 | 5.13 | <.0001 |
| Labor Hours (Hrs/unit) | 34.357028 | 4.490999 | 7.65 | <.0001 |

c) Yes, the indicated model meets the usual conditions: the residual plot looks fine and the residuals are nearly normal.

[Figure: Average Cost ($/unit) residuals on predicted Average Cost ($/unit), with histogram and normal quantile plot of the residuals]

d) Yes. For example, the estimated slope for labor hours is far from zero (more than 7.6 se's from zero).

e) The confidence interval for labor is 34.357028 - 2 * 4.490999 to 34.357028 + 2 * 4.490999, or 25 to 43 $/Hour. Evidently, the cost of additional labor on these jobs runs about 25 to 43 $ per hour, on average. There's little indirect effect because there's little correlation between labor and material costs. It's not as if we're pulling in expensive labor for valuable materials.

f) To within about $14.70 per unit. That's a fairly wide margin considering that some of the less expensive orders cost only $30 per unit. The prediction could be off on such orders by 50%.
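The interval in Exercise 39(e), like most intervals in these answers, uses the rough rule "estimate plus or minus 2 standard errors". A small Python sketch of that rule follows; the helper name `approx_ci` is hypothetical, and the multiplier 2 is the approximation used in the text rather than an exact t quantile.

```python
def approx_ci(estimate: float, std_error: float, multiplier: float = 2.0):
    """Approximate 95% confidence interval: estimate +/- multiplier * SE."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


# Slope for Labor Hours in Exercise 39: estimate 34.357028, SE 4.490999.
labor_low, labor_high = approx_ci(34.357028, 4.490999)

# Prediction accuracy in 39(f): roughly 2 * s_e with s_e = 7.337964.
prediction_margin = 2 * 7.337964

print(round(labor_low), round(labor_high), round(prediction_margin, 2))
```

Rounding the endpoints to whole dollars gives the 25 to 43 $/hour range quoted in the answer, and the prediction margin rounds to about $14.70 per unit.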

40. Leases
a) Other than the outliers (which are rather expensive for their size and age, marked here with x's further below), the plots look reasonably linear, though the association is not very strong. The two explanatory variables appear unrelated, so there will be similar marginal and partial slopes.

[Figure: scatterplots of Cost per Sq Foot on 1/Sq Feet, Cost per Sq Foot on Age, and 1/Sq Feet on Age]

b) The fitted model has R² = 0.329793 and s_e = 1.438612.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 15.466548 | 0.177344 | 87.21 | <.0001 |
| 1/Sq Feet | 3263.0632 | 538.5394 | 6.06 | <.0001 |
| Age | 0.0352693 | 0.004673 | 7.55 | <.0001 |

c) See part "f", but as far as the data go… Yes. The model is close to meeting the conditions of the MRM, but the outliers either indicate curvature or perhaps a change in the variation for some group of leases. Even so, the residuals are nearly normal, and we should proceed.

[Figure: Cost per Sq Foot residuals on predicted Cost per Sq Foot, with histogram and normal quantile plot of the residuals]

d) Yes. F = (0.3298/(1-0.3298)) * (223-1-2)/2 ≈ 54, which is much larger than 4, and thus statistically significant.

e) Among leases for the same amount of office space, those in older buildings appear slightly more expensive. The average cost of a lease in a 5 year old building is about 3 to 5 cents more per square foot than comparable space in a 4 year old building. Details for the confidence interval: 0.035269 - 2 * 0.004673 to 0.035269 + 2 * 0.004673 ≈ 0.026 to 0.045.

f) This model does not address the location of the buildings. This lurking variable could have a considerable impact on the slopes in this model. Perhaps that's why the older buildings cost more: it's not the age of the buildings, it's the location, and the older buildings are in a nice part of town.

41. R&D expenses
a) The scatterplots (all on log scales) show strongly linear trends, both between y and the explanatory variables and between the explanatory variables themselves.

[Figure: scatterplots of Log R&D Expense on Log Assets, Log R&D Expense on Log Net Sales, and Log Assets on Log Net Sales]

b) The fitted model has R² = 0.80991 and s_e = 0.869808.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | -1.203173 | 0.089859 | -13.39 | <.0001 |
| Log Assets | 0.5831633 | 0.052146 | 11.18 | <.0001 |
| Log Net Sales | 0.2284876 | 0.053194 | 4.30 | <.0001 |

c) The residuals are skewed, even on the log scale. The range below zero is more extreme than the range above. That is, the variation of negative residuals is larger than the variation of positive residuals. As a result, the data are not nearly normal.

The model would not be suitable for prediction (i.e., 95% prediction intervals would not have the right coverage). The CLT suggests inferences about slopes are OK, but not for predicting individual companies.

[Figure: Log R&D Expense residuals on predicted Log R&D Expense, with histogram and normal quantile plot of the residuals]

d) Yes, because the t-statistic (4.3) indicates that this slope is significantly different from zero. Hence, the addition of this explanatory variable significantly increases R².

e) The confidence interval for the partial elasticity of R&D expenses with respect to net sales is 0.2284876 - 2 * 0.053194 = 0.1220996 to 0.2284876 + 2 * 0.053194 = 0.3348756, or about (to presentation precision) 0.12 to 0.33. Among companies of equal assets, R&D spending averages between 0.12 and 0.33 percent higher among those with 1% higher net sales.

f) Yes, it's considerably smaller. The marginal elasticity is 0.79 ± 0.04, so the confidence intervals for the estimates do not even overlap. The simple explanation for the difference is that the partial elasticity estimates the effect of percentage differences in net sales among companies with equal assets. The marginal elasticity includes the indirect effect: the marginal elasticity includes the benefit of having more assets (which itself has positive partial elasticity).

42. Cars
a) The calibration and residual plots show a small amount of curvature (the fit underpredicts the price of the small cars) as well as large changes in the variation.

[Figure: Base Price MSRP actual on predicted, and Base Price MSRP residuals on predicted; P < .0001, RSq = 0.67, RMSE = 9898.8]

b) Not entirely, but it's a big step forward. For the model with logs, the fit appears more linear with more similar (though perhaps still changing) variation. A big improvement, and good enough to continue.

[Figure: Log 10 Price actual on predicted, and Log 10 Price residuals on predicted; P < .0001, RSq = 0.77, RMSE = 0.1026]

c) It might be zero, but it does not have to be exactly zero. Zero lies inside the confidence interval.

d) The confidence interval for the marginal elasticity for weight is 1.432 - 2 * 0.133 to 1.432 + 2 * 0.133, or about 1.17 to 1.70. The confidence interval for the partial elasticity is -0.0177 - 2 * 0.1063 to -0.0177 + 2 * 0.1063, or about -0.23 to 0.19. The estimated marginal and partial elasticities have similar standard errors.

e) Yes, weight has an effect, but only indirectly; the partial elasticity is near zero (not significantly different from zero). The indirect effect of log10 weight on log10 price is almost the same as the marginal slope. The indirect effect for log10 weight is 1.0378 (the slope in the regression of log10 HP on log10 weight) times the direct effect for log10 HP, 1.3964: 1.0378 * 1.3964 ≈ 1.4492. Hence, there's nothing left for the direct effect. All of the effect of this variable comes from its indirect effect via changes in HP.

f) Yes. If all we know is that one car is heavier than another, the heavier car is likely to cost more (on average). If the cars have the same HP, however, we'd expect the two cars to be comparably priced, on average.

43. OECD
a) The scatterplots show a very strong association between y and the second predictor. This second variable appears more associated with the GDP, as well as having a more linear relation. The scatterplots seem reasonably linear.

[Figure: scatterplots of GDP (per cap) on Trade Bal (%GDP) and on Muni Waste (kg/person)]

[Figure: scatterplot of Trade Bal (%GDP) on Muni Waste (kg/person)]

b) The two x's are correlated (r ≈ 0.3). The slope for the trade balance will change because of the presence of indirect effects.

c) The estimated model has R² = 0.772618 and s_e = 6934.623.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | -4622.225 | 4796.003 | -0.96 | 0.3440 |
| Trade Bal (%GDP) | 959.60593 | 232.7805 | 4.12 | 0.0003 |
| Muni Waste (kg/person) | 62.184369 | 9.153925 | 6.79 | <.0001 |

d) Yes. For example, the residuals have similar variances (left) and are nearly normal (right). Of course, with only 29 cases, we cannot be very sure and we may have missed a subtle problem.

[Figure: GDP (per cap) residuals on predicted GDP (per cap), with histogram and normal quantile plot of the residuals]

e) The direct path from trade balance to y has coefficient 960 and the path from waste to y has coefficient 62. The path from trade balance to muni waste has slope from the fit
Estimated Muni Waste (kg/person) = 503.93174 + 7.7335205 Trade Bal (%GDP)
Similarly the path from municipal waste to trade balance has slope
Estimated Trade Bal (%GDP) = -4.990754 + 0.0119591 Muni Waste (kg/person)
The indirect effect for trade balance is thus 7.7335205 * 62.184369 ≈ 481. As a check, the sum of the direct and indirect effects is 960 + 481 = 1441, which is the marginal slope for the trade balance. Because the indirect effect is positive, the marginal slope is larger than the partial slope. On average, countries with larger exports have more consumption (producing more trash), and this consumption contributes to GDP.

f) The 95% confidence interval for the slope for municipal waste is 62.1843 - 2 * 9.1539 = $43.8765 to 62.1843 + 2 * 9.1539 = $80.4921 more GDP per kilogram of waste. The se rounds to 9, so the interval would be rounded to $44 to $80. The interval does not include zero, so that β2 is not zero. The model is not causal. This does not mean countries should produce more waste. Rather, it means that at a given trade balance, countries with more waste per person have larger GDP per person.

44. Hiring
a) The scatterplots seem reasonably linear, though the association is weak in each case. The association between the two explanatory variables is particularly weak. This plot may have two clusters of employees.

[Figure: scatterplots of Log Profit on Log Accounts, Log Profit on Log Commission, and Log Accounts on Log Commission]

b) Because the association between the two explanatory variables is weak, the marginal and partial elasticities should be similar.

c) The estimated model has RSquare = 0.279374, Root Mean Square Error = 0.671333, and Observations (or Sum Wgts) = 464.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 8.3716563 | 0.117483 | 71.26 | <.0001 |
| Log Accounts | 0.1995083 | 0.029552 | 6.75 | <.0001 |
| Log Commission | 0.1325818 | 0.016318 | 8.12 | <.0001 |

d) The residuals show little pattern, though negative residuals seem more dispersed (more variable) than positive residuals. The residuals are a bit skewed, but the deviations are only in the lower extreme. As a whole, the residuals are nearly normal.

[Figure: Log Profit residuals on predicted Log Profit, with histogram and normal quantile plot of the residuals]

e) The confidence interval for the partial elasticity is 0.1995083 - 2 * 0.029552 to 0.1995083 + 2 * 0.029552, or about (to presentation precision) 0.14 to 0.26. The marginal elasticity, 0.29, is larger than this interval. Looks like there was more of an indirect effect than we anticipated.

f) The path diagram shows that the partial elasticity for the number of accounts is 0.20 and the partial elasticity for early commission is 0.13. The indirect effect for the log of the number of the accounts is 0.1325 * 0.6855 (from the regression of log commission on log accounts) = 0.0908 ≈ 0.09. Notice that this checks (up to rounding errors): the sum of direct and indirect effects is the marginal elasticity given in the text, 0.09 + 0.20 = 0.29.

g) To answer this question requires that you believe the MRM and treat these effects as causal. Because there could be other factors at work, that's wishful thinking. If you do choose to believe the model, then go with the program that concentrates on developing accounts. The partial elasticity of the number of accounts is larger than the partial elasticity of the early commissions, so put the effort here.

45. Promotion
a) The scatterplots are vaguely linear, with weak associations between the two predictors and the response. The largest correlation is between the two explanatory variables, so marginal and partial slopes will likely differ.

[Figure: scatterplots of Market Share on Detail Voice, Market Share on Sample Voice, and Sample Voice on Detail Voice]

b) The estimated model has R² = 0.280169, s_e = 0.006605, and n = 39.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 0.2127433 | 0.004656 | 45.69 | <.0001 |
| Detail Voice | 0.0191598 | 0.065153 | 0.29 | 0.7704 |
| Sample Voice | 0.0216912 | 0.008333 | 2.60 | 0.0133 |

c) The residuals look fine, though rather variable (i.e., the model does not explain much variation). The DW does not find a pattern over time (D = 2.07). The residuals are also nearly normal.

[Figure: Market Share residuals on predicted Market Share and on Row Number]

[Figure: histogram and normal quantile plot of the Market Share residuals]

d) Yes. F = (0.28/(1-0.28)) * (39-1-2)/2 ≈ 7 > 4, so the effect is statistically significant.

e) No. The partial effect for detailing is not significantly different from zero, but this does not mean detailing has no effect. The model is not causal. It only means that, at a given level of sample share, periods with a higher share of detailing have not shown gains in market share. Since detailing and sampling tend to come together, it is hard to separate the two.

f) No. The partial slope for detailing is not significantly different from zero (i.e., zero is in the 95% confidence interval), as in the statement of the question in part "e". Perhaps the best advice would be to do some experiments.

46. Apple
a) All three variables are correlated with each other. The correlations are modest in size, but reasonably linear, with common outlying events (such as October 1987).

[Figure: scatterplots of Apple Return on Market Return, Apple Return on IBM Return, and Market Return on IBM Return]

b) The estimated model has R² = 0.216589, s_e = 0.13255, and n = 300.

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|
| Intercept | 0.0048214 | 0.007868 | 0.61 | 0.5405 |
| Market Return | 1.3168817 | 0.204542 | 6.44 | <.0001 |
| IBM Return | 0.2275089 | 0.110773 | 2.05 | 0.0409 |

c) The residuals, and the model, appear fine. There is little dependence over time (DW = 1.94), and the scatterplot of residuals on the fitted values looks about as good as they come. Even the outlier period (October 1987, marked as ×) is on target. The residuals are also nearly normal.

[Figure: Apple Return residuals on predicted Apple Return, with histogram and normal quantile plot of the residuals]

d) The confidence interval for the market effect is 1.3168817 - 2 * 0.204542 = 0.9077977 to 1.3168817 + 2 * 0.204542 = 1.7259657, or 0.91 to 1.73. This range is so wide as to allow the possibility of the marginal and partial slopes being the same. That is, the partial slope is smaller, but not by much considering the sampling variation.

e) The confidence interval for the estimate of IBM returns is 0.2275089 - 2 * 0.110773 = 0.0059629 to 0.2275089 + 2 * 0.110773 = 0.4490549, or 0.01 to 0.45 (presentation precision). The interpretation is that, among months with equal returns on the market, in months in which IBM returned 1% more, Apple's return was also higher on average, by an amount from near zero to about 0.5%.

f) Yes, the improvement is statistically significant because the confidence interval for the slope of IBM returns does not include zero (just barely). As to a better trading strategy, perhaps not unless we can anticipate movements in IBM. Related ideas known as "pairs trading" rely on correlations between the movements of two stocks to identify opportunities to buy one and sell the other.
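Several answers in this chapter (Exercises 42(e), 43(e), and 44(f)) use the path-diagram identity that the marginal slope equals the direct (partial) effect plus the indirect effect, where the indirect effect is the slope of the other explanatory variable on x times that variable's partial slope. A minimal Python sketch of the check is below; the function name `decompose_marginal` is hypothetical, and the inputs are the numbers quoted in those answers.

```python
def decompose_marginal(direct: float, slope_other_on_x: float,
                       partial_other: float):
    """Return (indirect, marginal) for a two-predictor regression.

    direct           -- partial slope of x in the multiple regression
    slope_other_on_x -- slope from regressing the other predictor on x
    partial_other    -- partial slope of the other predictor
    """
    indirect = slope_other_on_x * partial_other
    return indirect, direct + indirect


# Exercise 43(e): trade balance, with municipal waste as the other predictor.
oecd_indirect, oecd_marginal = decompose_marginal(960.0, 7.7335205, 62.184369)

# Exercise 42(e): log10 weight, with log10 HP as the other predictor.
cars_indirect, cars_marginal = decompose_marginal(-0.0177, 1.0378, 1.3964)

print(round(oecd_indirect), round(oecd_marginal))
print(round(cars_indirect, 4), round(cars_marginal, 3))
```

In the OECD case the indirect effect is positive, so the marginal slope exceeds the partial slope; in the cars case nearly all of the marginal elasticity for weight comes from the indirect path through horsepower.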

4M Leasing

a) Without an estimated value for the residual price, the manufacturer may not be able to cover costs when the cars are returned. Perhaps it should have charged more for mileage if this factor has a large effect on resale value.

b) We need multiple regression because it is likely that the two factors are related, namely, that older cars have been driven further. If we use marginal estimates of these effects, we’ll in effect double count for the age of the car when we estimate the impact of mileage on the residual value. That might lead us to charge more than we need to cover our costs. That’s OK (from the manufacturer’s point of view), but we might be losing profitable sales due to charging too much.

c) Most of the curvature we have seen in previous examples with cars (see Chapter 20) comes from combining very different models; for example, there’s more variation in attributes among very expensive cars than among cheaper cars. Also, the nonlinear patterns that come as cars lose value (for example, you cannot lose $10,000 for many years and stay positive) become more evident as cars get much older.

d) The plots appear straight enough, and we can see the collinearity between the two proposed explanatory variables. A few outliers appear in the plots, but none of these seem extreme.

[Figure: scatterplot matrix of Price, Age, and Mileage]

e)

| R2 | se | n |
| --- | --- | --- |
| 0.510372 | 3178.879 | 218 |

| Term | Estimate | Std Error | t Ratio | Prob>\|t\| |
| --- | --- | --- | --- | --- |
| Intercept | 40323.937 | 721.8478 | 55.86 | <.0001 |
| Age | -1853.803 | 288.8791 | -6.42 | <.0001 |
| Mileage | -0.124023 | 0.02375 | -5.22 | <.0001 |

f) The residuals have similar variances and are nearly normal. The diagonal stripes come from rounding of the prices. There’s one unusually expensive car among these ($13,800, in row 17), but otherwise nothing stands out as particularly troublesome.

[Figure: Price Residual versus Price Predicted, with histogram (Count) and normal quantile plot of the residuals]

g) For the effect of age on residual value, the 95% confidence interval is -1853.803 - 2 * 288.8791 = -2431.5612 to -1853.803 + 2 * 288.8791 = -1276.0448, which rounds to a drop in resale value of about $1,280 to $2,430 per year. For mileage, the interval is -0.124023 - 2 * 0.02375 = -0.171523 to -0.124023 + 2 * 0.02375 = -0.076523, which rounds to $0.077 to $0.172 per mile.

h) To cover the loss in value of the car over the term of the lease, I recommend that we structure the lease for a 3-series BMW to cost $2,400 per year with an additional $0.18 per mile. These estimates on average will cover the costs due to aging with 95% confidence. [You might also suggest a lease that allows 10,000 miles, with say $0.17 per additional mile, pushing that 10000 * 0.172 = 1,720 into the set annual price. This would make the total cost per year 2400 + 1720 = $4,120, or about $350 per month. In fact, at the time of this writing, you could lease a BMW 325i for 36 months at $420 per month, plus $3,400 down and $0.20 per mile over 30,000 miles.]

i) This analysis ignores the fact that these cars cost different amounts at the time of purchase. They did not all start from the same initial cost. We have not observed the actual loss in value; we’ve only seen how time (and mileage) has affected their value. Also, we have not identified other differences among these cars, such as special options that might increase the value of the car further. Other factors, such as a blow to the reputation of BMW, would also make the estimates from this model inaccurate.
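The intervals in part g and the lease arithmetic in part h of the leasing answer can be reproduced in a few lines. This is a minimal sketch, again assuming the "estimate plus or minus 2 standard errors" rule; the helper `approx_ci` and all variable names are hypothetical, the slopes and standard errors come from the regression output in part e, and the 10,000-mile allowance is the figure suggested in the bracketed note.

```python
# Approximate 95% intervals for the Age and Mileage slopes, then the
# lease arithmetic from the bracketed suggestion in part h.

def approx_ci(estimate, std_error, multiplier=2.0):
    """Return (lower, upper) for estimate +/- multiplier * std_error."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width

age_lo, age_hi = approx_ci(-1853.803, 288.8791)    # dollars per year of age
mile_lo, mile_hi = approx_ci(-0.124023, 0.02375)   # dollars per mile driven

# A negative slope is a loss in resale value, so the largest plausible
# loss corresponds to the lower endpoint of each interval.
print(f"Age: loss of ${-age_hi:,.0f} to ${-age_lo:,.0f} per year")
print(f"Mileage: loss of ${-mile_hi:.3f} to ${-mile_lo:.3f} per mile")

# Lease sketch: fold an assumed 10,000-mile allowance into the annual price,
# charging the worst-case mileage loss (rounded to $0.172 per mile).
annual_base = 2400          # dollars per year, from part h
worst_case_per_mile = 0.172
included_miles = 10_000     # assumed allowance from the bracketed note

mileage_charge = included_miles * worst_case_per_mile
annual_total = annual_base + mileage_charge
monthly = annual_total / 12
print(f"Mileage folded into price: ${mileage_charge:,.0f}")
print(f"Annual total: ${annual_total:,.0f}, monthly: ${monthly:,.2f}")
```

Carried through exactly, 2400 + 1720 comes to 4,120 per year, or a little over $343 per month, which is in line with the rough monthly figure quoted in part h.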