14.2 Insert x values into the least squares equation and solve for ŷ.
b0 = meaningless
b1 = 5.6128 implies that we estimate that mean sales price increases by $5,612.80 for
each increase of 100 square feet in house size, when the niceness rating stays constant.
b2 = 3.8344 implies that we estimate that mean sales price increases by $3,834.40 for
each increase in niceness rating of 1, when the square footage remains constant.
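The two slope interpretations above can be checked with a little arithmetic. The sketch below assumes, as the dollar figures in the answer imply, that sales price is measured in thousands of dollars and house size in hundreds of square feet. The function name is illustrative only.

```python
# Slope estimates quoted in the answer above.
B1_SIZE = 5.6128      # per 100 square feet, in thousands of dollars (assumed units)
B2_NICENESS = 3.8344  # per 1-point niceness rating, in thousands of dollars (assumed units)


def change_in_mean_price(delta_size_hundreds, delta_rating):
    """Estimated change in mean sales price, in dollars.

    Each slope applies while the other variable is held constant,
    so in this additive model the two effects simply add.
    """
    change_thousands = B1_SIZE * delta_size_hundreds + B2_NICENESS * delta_rating
    return change_thousands * 1000


print(round(change_in_mean_price(1, 0), 2))  # 100 more square feet, same rating
print(round(change_in_mean_price(0, 1), 2))  # one more rating point, same size
```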
c. Therefore, actual hours were 17207.31 – 15896.25 = 1311.06 hours greater than predicted.
14.6 (Variance and standard deviation of the populations of potential error term values)
14.7 a. The proportion of the total variation in the observed values of the dependent variable
explained by the multiple regression model.
b. The adjusted R-squared differs from R-squared by taking into consideration the number
of independent variables in the model.
14.8 The overall F-test is used to determine whether at least one independent variable is significant.
14.9 (1)
Chapter 14 - Multiple Regression
(3)
(4) F(model)
(5) Based on 2 and 7 degrees of freedom, F.05 = 4.74. Since F(model) = 350.87 > 4.74, we
reject H0: β1 = β2 = 0 by setting α = .05.
(6) Based on 2 and 7 degrees of freedom, F.01 = 9.55. Since F(model) = 350.87 > 9.55, we
reject H0: β1 = β2 = 0 by setting α = .01.
(7) p-value = 0.00 (which means less than .001). Since this p-value is less than α = .10, .05,
.01, and .001, we have extremely strong evidence that H0: β1 = β2 = 0 is false. That is,
we have extremely strong evidence that at least one of x1 and x2 is significantly related to
y.
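The critical values and p-value in 14.9 can be reproduced without tables. The sketch below relies on a special case: when the numerator has exactly 2 degrees of freedom, the upper-tail area of the F distribution has the closed form (1 + 2f/d)^(-d/2), where d is the denominator degrees of freedom. This shortcut does not apply to other numerator degrees of freedom, and the function names are illustrative.

```python
def f_upper_tail_2(f_value, denom_df):
    """P(F > f_value) for an F distribution with 2 and denom_df degrees of freedom."""
    return (1.0 + 2.0 * f_value / denom_df) ** (-denom_df / 2.0)


def f_critical_2(alpha, denom_df):
    """Point with upper-tail area alpha, for 2 and denom_df degrees of freedom.

    Obtained by solving f_upper_tail_2(f, denom_df) = alpha for f.
    """
    return (denom_df / 2.0) * (alpha ** (-2.0 / denom_df) - 1.0)


f_model = 350.87  # F(model) reported in 14.9
for alpha in (0.05, 0.01):
    critical = f_critical_2(alpha, 7)
    print(f"alpha = {alpha}: F_alpha = {critical:.2f}, reject H0: {f_model > critical}")

print(f"p-value = {f_upper_tail_2(f_model, 7):.2e}")
```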
14.10 (1)
(3)
(4) F(model)
(5) Based on 3 and 26 degrees of freedom, F.05 = 2.98. Since F(model) = 72.80 > 2.98, we
reject H0: β1 = β2 = β3 = 0 by setting α = .05.
(6) Based on 3 and 26 degrees of freedom, F.01 = 4.64. Since F(model) = 72.80 > 4.64, we
reject H0: β1 = β2 = β3 = 0 by setting α = .01.
(7) p-value is less than .001. Since this p-value is less than α = .10, .05, .01, and .001, we
have extremely strong evidence that H0: β1 = β2 = β3 = 0 is false. That is, we have
extremely strong evidence that at least one of x1, x2, x3 is significantly related to y.
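s² = SSE / [n – (k + 1)] = 1,798,712.2179 / 12 = 149,892.68483
s = √149,892.68483 = 387.15977
unexplained = 1,798,712.2179
explained = 462,327,889.39
(3) R² = explained / total = 462,327,889.39 / 464,126,601.61 = .9961
(4) F = (explained / k) / (unexplained / [n – (k + 1)]) = (462,327,889.39 / 3) / (1,798,712.2179 / 12) = 1028.131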
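The quantities above all follow from the explained and unexplained variation. The sketch below recomputes them. The values k = 3 and n – (k + 1) = 12 are assumptions inferred from the reported results (they are the values that reproduce s² and F), not figures stated in the answer.

```python
import math

explained = 462_327_889.39    # explained variation
unexplained = 1_798_712.2179  # unexplained variation (SSE)
k = 3                         # assumed number of independent variables
error_df = 12                 # assumed n - (k + 1)

s_squared = unexplained / error_df        # mean square error
s = math.sqrt(s_squared)                  # standard error
total = explained + unexplained           # total variation
r_squared = explained / total             # multiple coefficient of determination
f_model = (explained / k) / s_squared     # overall F statistic

print(f"s^2 = {s_squared:.5f}")
print(f"s   = {s:.5f}")
print(f"R^2 = {r_squared:.4f}")
print(f"F   = {f_model:.3f}")
```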
14.12 significantly related to y with strong (a) or very strong (b) evidence
= [29.347 ± 2.365(4.891)]
= [17.780, 40.914]
14.18 The midpoint of both the confidence interval and the prediction interval is the value of y-hat
given by the least squares model.
Distance value = (1.57 / 3.242)² = 0.2345
99% confidence interval:
= [172.28 ± 3.499(1.57)]
= [172.28 ± 5.49]
= [166.79, 177.77]
99% prediction interval: [ŷ ± t.005 s √(1 + distance value)]
= [172.28 ± 3.499(3.242)√(1 + 0.2345)]
= [172.28 ± 12.60]
= [159.68, 184.88]
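Both intervals share the midpoint 172.28 and differ only in the standard error applied to the t point. The sketch below recomputes them, reading 3.242 as the standard error s and 1.57 as the standard error of the estimate ŷ. That reading is an assumption based on how the numbers are used above, and the helper name is illustrative.

```python
import math


def interval(center, t_point, std_error):
    """Return (lower, upper) for center +/- t_point * std_error."""
    half_width = t_point * std_error
    return center - half_width, center + half_width


y_hat = 172.28   # point estimate / point prediction
t_005 = 3.499    # t point used in the answer
s = 3.242        # standard error (assumed meaning of 3.242)
s_y_hat = 1.57   # standard error of the estimate (assumed meaning of 1.57)

distance_value = (s_y_hat / s) ** 2
conf_int = interval(y_hat, t_005, s_y_hat)
pred_int = interval(y_hat, t_005, s * math.sqrt(1 + distance_value))

print(f"distance value = {distance_value:.4f}")
print("99% CI:", [round(x, 2) for x in conf_int])
print("99% PI:", [round(x, 2) for x in pred_int])
```

The prediction interval is wider because it adds the variability of an individual error term to the uncertainty in the estimated mean.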
14.20 a. 95% PI
890,255 bottles
791,876($3.70) = $2,929,941.20
b. 99% PI; t.005 = 2.779
[8.41065 ± 2.779(.235)√(1 + .04)] = [7.74465, 9.07665]
14.21 y = 17207.31 is above the upper limit of the interval [14906.2, 16886.3]; this y-value is
unusually high.
14.23 A dummy variable can be defined for each value a categorical variable can take on, but one
value serves as the baseline. You therefore use m – 1 dummy variables to model a categorical
variable that can take on m values.
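As an illustration of the m – 1 rule, the sketch below encodes a categorical variable with one level chosen as the baseline. The function name and the sample data are hypothetical and are not taken from any exercise.

```python
def dummy_code(values, baseline):
    """Encode a categorical variable with m levels as m - 1 dummy (0/1) columns.

    The baseline level gets no column; it is represented by all zeros.
    """
    levels = sorted(set(values))
    if baseline not in levels:
        raise ValueError("baseline must be one of the observed levels")
    kept = [level for level in levels if level != baseline]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


# Hypothetical data: a categorical variable with m = 3 levels.
campaigns = ["A", "B", "C", "B", "A"]
kept, rows = dummy_code(campaigns, baseline="A")

print(kept)  # the m - 1 levels that receive a dummy variable
for value, row in zip(campaigns, rows):
    print(value, row)
```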
14.24 The difference in the average of the dependent variable when the dummy variable equals 1 and
when the dummy variable equals 0.
b. equals the difference between the mean innovation adoption times of stock
companies and mutual companies.
95% CI for : [4.9770, 11.1339]; 95% confident that for any size of insurance firm,
the mean speed at which an insurance innovation is adopted is between 4.9770 and
11.1339 months faster if the firm is a stock company rather than a mutual company.
d. No interaction
14.26 a. The pool coefficient is $25,862.30. Since the cost of the pool is $35,000 you expect to
recoup $25,862.3 / $35,000 = 74%.
b. There is not an interaction between pool and any other independent variable.
14.27 a. =
=
=
b. F = 184.57, p-value < .001:
c.
e.
p-value
e.
14.28 a. The point estimate of the effect on the mean of campaign B compared to campaign A is
b4 = 0.2695.
d. is significant at alpha = 0.1 and alpha = 0.05 because p-value = 0.0179. Thus there is
strong evidence that is greater than 0.
95% confidence interval= [0.0320,0.3081], since it doesn't overlap 0, we are at least 95%
confident that is greater than 0.
14.30 Complete:
Reduced:
14.31 (k – g) denotes the number of regression parameters we have set equal to zero in .
[n – (k + 1)] denotes the denominator degrees of freedom.
14.32 To test in Model 2, we note that Model 2 is the complete model, and therefore
k = 5. If is true, Model 2 becomes Model 1, which is the reduced model, and therefore k – g =
2. It follows that .
Based on 2 and 24 degrees of freedom . Since F = 19.703 is greater than , we reject at . We
have seen in Exercise 12.45 that . Therefore, if , it follows that and . That is, is equivalent
to .
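The partial F statistic described in 14.31 and 14.32 compares the unexplained variation of the reduced and complete models. The sketch below uses the standard form F = [(SSE_R – SSE_C) / (k – g)] / [SSE_C / (n – (k + 1))]. The two SSE values are made up for illustration and are not the exercise's data. Only k = 5, k – g = 2, and the 24 denominator degrees of freedom (which imply n = 30) come from the answer.

```python
def partial_f(sse_reduced, sse_complete, n, k, g):
    """Partial F statistic for testing that k - g parameters equal zero.

    k is the number of independent variables in the complete model,
    g the number in the reduced model, and n the number of observations.
    """
    numerator_df = k - g
    denominator_df = n - (k + 1)
    f_stat = ((sse_reduced - sse_complete) / numerator_df) / (sse_complete / denominator_df)
    return f_stat, numerator_df, denominator_df


# Hypothetical SSE values; degrees of freedom match 14.32 (2 and 24).
f_stat, df1, df2 = partial_f(sse_reduced=150.0, sse_complete=100.0, n=30, k=5, g=3)
print(f"F = {f_stat:.3f} on {df1} and {df2} degrees of freedom")
```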
14.35 When plotting residuals versus predicted values of y you look for the residuals to fall in a
horizontal band to indicate constant variance. When plotting residuals versus the individual
x’s you look for the residuals to fall in a random pattern indicating a linear model is sufficient.
When doing a normal probability plot of the residuals you look for a straight line pattern to
conclude that the errors follow a normal distribution. Finally you plot the residuals versus
time (or observation order) and look for a random pattern to conclude that the error terms are
independent.
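The inputs to these plots are easy to compute by hand. The sketch below builds residuals and the points of a normal probability plot using only the standard library. The plotting position (i – 0.5)/n is one common convention and is an assumption here, since textbooks differ on this choice. The data are hypothetical.

```python
from statistics import NormalDist


def residuals(observed, predicted):
    """Residual = observed y minus predicted y, in observation order."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


def normal_plot_points(resid):
    """Pair each ordered residual with a standard normal score.

    A roughly straight-line pattern in these points supports the
    normality assumption.
    """
    n = len(resid)
    ordered = sorted(resid)
    std_normal = NormalDist()
    scores = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(scores, ordered))


# Hypothetical observed and predicted values.
observed = [10.2, 11.9, 14.1, 15.8, 18.3]
predicted = [10.0, 12.0, 14.0, 16.0, 18.0]

resid = residuals(observed, predicted)
for score, value in normal_plot_points(resid):
    print(f"normal score {score:7.3f}   residual {value:6.2f}")
```

Plotting resid against predicted, against each x, and against observation order gives the other three checks described above.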
[Figure: Residuals plotted against Observation (order); vertical axis: Residual (gridlines = std. error)]
14.39
14.40 The estimates of the coefficients indicate that at a specified square footage, adding rooms
increases selling price while adding bedrooms reduces selling price. Thus building both a
family room and a living room (while maintaining square footage) should increase sales price.
In addition, adding a bedroom at the cost of another room will tend to decrease selling price.
14.41 A promotion on a weekend nets, on average, 4745 – 4690 = 55 in attendance, while a
promotion on a day game nets, on average, 4745 + 5059 = 9804. Thus promotions on day games
gain more attendance on average.
14.42 a. Residual plots show all regression assumptions are being met.
b. Normal probability plot addresses whether or not error terms follow a normal
distribution. Since this plot is approximately a straight line we conclude the errors follow
a normal distribution.
Internet Exercise:
14.44 a. All the scatterplots below show linear relationships. Interpretations will vary.
b.
Analysis of Variance
Source            DF        SS        MS        F        P
Regression         4   2163.98    541.00    64.20    0.000
Residual Error    88    741.59      8.43
Total             92   2905.57
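The mean squares and F statistic in this table follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them and also derives R-squared and adjusted R-squared, which the output above does not list. Those two values are computed here for illustration from the standard formulas.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 2163.98, 4
ss_error, df_error = 741.59, 88
ss_total, df_total = 2905.57, 92

ms_regression = ss_regression / df_regression   # mean square for regression
ms_error = ss_error / df_error                  # mean square error
f_model = ms_regression / ms_error              # overall F statistic

r_squared = ss_regression / ss_total
# Adjusted R-squared penalizes for the number of independent variables.
adj_r_squared = 1 - ms_error / (ss_total / df_total)

print(f"MS regression = {ms_regression:.2f}")
print(f"MS error      = {ms_error:.2f}")
print(f"F             = {f_model:.2f}")
print(f"R^2           = {r_squared:.4f}")
print(f"adjusted R^2  = {adj_r_squared:.4f}")
```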
b0 = 28.1, which is the estimated average city mpg when all independent variables equal 0. It has
no practical meaning.
b1 = -0.0115, which means that as weight goes up by 1, average city mpg goes down by 0.0115, holding the other variables constant.
b2 = 0.275, which means that as wheel base goes up by 1, average city mpg goes up by 0.275, holding the other variables constant.
b3 = 0.646, which means that as displacement goes up by 1, average city mpg goes up by 0.646, holding the other variables constant.
b4 = -1.55, which means the average city mpg for domestics is 1.55 less than the
average city mpg for imports, holding the other variables constant.
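To see the interpretations in action, the sketch below plugs values into the fitted equation built from the coefficients listed above. The input values and their units (pounds, inches, liters) are hypothetical, and domestic is assumed to be coded 1 for domestic and 0 for import, as the interpretation of b4 suggests.

```python
def predicted_city_mpg(weight, wheel_base, displacement, domestic):
    """Point prediction of city mpg from the estimated coefficients.

    domestic is a dummy variable: 1 for a domestic car, 0 for an import.
    """
    return (28.1
            - 0.0115 * weight
            + 0.275 * wheel_base
            + 0.646 * displacement
            - 1.55 * domestic)


# Hypothetical car; the same measurements are used for both predictions.
car = {"weight": 3000, "wheel_base": 100, "displacement": 2.5}

import_mpg = predicted_city_mpg(domestic=0, **car)
domestic_mpg = predicted_city_mpg(domestic=1, **car)

print(f"import:   {import_mpg:.3f}")
print(f"domestic: {domestic_mpg:.3f}")
print(f"difference (domestic - import): {domestic_mpg - import_mpg:.2f}")
```

With every other variable held fixed, the two predictions differ by exactly b4.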