Chapter 11
REGRESSION ANALYSIS I
SIMPLE LINEAR REGRESSION
11.1 The points on the line y = 3 + 2x for x = 1 and x = 4 are (1, 5) and (4, 11),
respectively. The intercept is 3 and the slope is 2.
11.2 (a) If x = 41, then y = 12(41) − 75 = 417.
(b) We must determine the smallest x value such that y = 12x − 75 > 0.
Solving this inequality yields x > 6.25. So, we must sell at least 7
batteries in a month in order to make a profit.
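The break-even arithmetic in (b) can be checked with a short script (a sketch; the profit model y = 12x − 75 and the helper name profit are taken from, and named for, this exercise):

```python
import math

def profit(x):
    """Monthly profit from the exercise's model y = 12x - 75."""
    return 12 * x - 75

print(profit(41))                      # 417, as in part (a)
# smallest integer x with 12x - 75 > 0, i.e. x > 6.25
break_even = math.floor(75 / 12) + 1
print(break_even)                      # 7, as in part (b)
```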
11.3 (a) Predictor variable x = duration of training
Response variable y = performance in skilled job
(b) Predictor variable x = average number of cigarettes smoked daily
Response variable y = CO level in blood
(c) Predictor variable x = Humidity level in environment
Response variable y = Growth rate of fungus
(d) Predictor variable x = Expenditures in promoting product
Response variable y = Amount of product sales
11.4 The model is Y = β₀ + β₁x + e, where E(e) = 0 and sd(e) = σ, so
β₀ = 4, β₁ = 3, and σ = 5.
11.5 The model is Y = β₀ + β₁x + e, where E(e) = 0 and sd(e) = σ, so
β₀ = 6, β₁ = 3, and σ = 3.
11.6 Y = β₀ + β₁x + e = 1 + 3x + e, where E(e) = 0 and sd(e) = σ = 2.
(a) At x = 4, E(Y) = 1 + 3(4) = 13 and sd(Y) = sd(e) = 2.
(b) At x = 2, E(Y) = 1 + 3(2) = 7 and sd(Y) = sd(e) = 2.
11.7 Y = β₀ + β₁x + e = 2 − 3x + e, where E(e) = 0 and sd(e) = σ = 4.
(a) At x = 1, E(Y) = 2 − 3(1) = −1 and sd(Y) = sd(e) = 4.
(b) At x = 2, E(Y) = 2 − 3(2) = −4 and sd(Y) = sd(e) = 4.
11.8 The straight line for the means of the model Y = 3 + 4x + e is y = 3 + 4x.
[Graph of the line omitted]
11.9 The straight line for the means of the model Y = 7 + 2x + e is y = 7 + 2x.
[Graph of the line omitted]
11.10 (a) At x = 3, E(Y) = β₀ + β₁(3) = 2 + 1(3) = 5.
At x = 6, E(Y) = β₀ + β₁(6) = 2 + 1(6) = 8.
(b) No, only the mean is larger. By chance the error e at x = 6, which has
standard deviation 3, could be quite negative and/or the error at x = 3 very
large.
11.11 (a) At x = 4, E(Y) = β₀ + β₁(4) = 4 + 3(4) = 16.
At x = 5, E(Y) = β₀ + β₁(5) = 4 + 3(5) = 19.
(b) No, only the mean is larger. By chance the error e at x = 5, which has
standard deviation 4, could be quite negative and/or the error at x = 4 very
large.
11.12 (a) [Scatter diagram of y versus x omitted]
(b) The computations needed to calculate x̄, ȳ, Sxx, Sxy, and Syy are provided
in the following table:

   x     y    x − x̄   y − ȳ   (x − x̄)(y − ȳ)   (x − x̄)²   (y − ȳ)²
   2     1     −2      −4           8              4          16
   5     6      1       1           1              1           1
   6    10      2       5          10              4          25
   3     3     −1      −2           2              1           4
   4     5      0       0           0              0           0
Total   20    25*       0           0             21          10          46
(* column totals: Σx = 20, Σy = 25)

So, we have
x̄ = 20/5 = 4    Sxx = Σ(x − x̄)² = 10
ȳ = 25/5 = 5    Sxy = Σ(x − x̄)(y − ȳ) = 21
                Syy = Σ(y − ȳ)² = 46
(c) β̂₁ = Sxy/Sxx = 21/10 = 2.1,  β̂₀ = ȳ − β̂₁x̄ = 5 − 2.1(4) = −3.4
(d) The fitted line is ŷ = −3.4 + 2.1x, which is graphed with the scatter diagram in (a).
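The computations in (b) and (c) follow a fixed recipe, so they can be reproduced mechanically. A minimal sketch (the helper name least_squares is ours, not the text's):

```python
def least_squares(xs, ys):
    """Return (b0, b1, sxx, sxy, syy) for the least squares line yhat = b0 + b1*x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    syy = sum((y - ybar) ** 2 for y in ys)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1, sxx, sxy, syy

# data of Exercise 11.12
b0, b1, sxx, sxy, syy = least_squares([2, 5, 6, 3, 4], [1, 6, 10, 3, 5])
print(sxx, sxy, syy)   # 10.0 21.0 46.0
print(b1, round(b0, 1))  # 2.1 -3.4
```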
11.13 (a) [Scatter diagram of y versus x omitted]
(b) The computations needed to calculate x̄, ȳ, Sxx, Sxy, and Syy are provided
in the following table:

   x     y    x − x̄   y − ȳ   (x − x̄)(y − ȳ)   (x − x̄)²   (y − ȳ)²
   1     8     −2       4          −8              4          16
   2     4     −1       0           0              1           0
   3     5      0       1           0              0           1
   3     3      0      −1           0              0           1
   4     3      1      −1          −1              1           1
   5     1      2      −3          −6              4           9
Total   18    24        0           0            −15          10          28

So, we have
x̄ = 18/6 = 3    Sxx = Σ(x − x̄)² = 10
ȳ = 24/6 = 4    Sxy = Σ(x − x̄)(y − ȳ) = −15
                Syy = Σ(y − ȳ)² = 28
(c) β̂₁ = Sxy/Sxx = −15/10 = −1.5,  β̂₀ = ȳ − β̂₁x̄ = 4 − (−1.5)(3) = 8.5
(d) The fitted line is ŷ = 8.5 − 1.5x, which is graphed in part (a).
11.14 (a) The residuals and their sum are calculated in the following table:

   x     y    ŷ = −3.4 + 2.1x   ê = y − ŷ   (y − ŷ)²
   2     1         0.8             0.2        0.04
   5     6         7.1            −1.1        1.21
   6    10         9.2             0.8        0.64
   3     3         2.9             0.1        0.01
   4     5         5               0          0
Total                              0          1.9

So, Σ(y − ŷ) = 0.
(b) SSE = sum of squared residuals = 1.9. Also,
SSE = Syy − Sxy²/Sxx = 46 − (21)²/10 = 1.9
(Check this using the calculations in Exercise 11.12.)
(c) s² = SSE/(n − 2) = 1.9/(5 − 2) = 0.6333
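The residual computations above can be reproduced directly from the fitted line ŷ = −3.4 + 2.1x of Exercise 11.12 (a sketch):

```python
xs = [2, 5, 6, 3, 4]
ys = [1, 6, 10, 3, 5]
fitted = [-3.4 + 2.1 * x for x in xs]              # yhat at each x
residuals = [y - f for y, f in zip(ys, fitted)]    # e = y - yhat
sse = sum(e * e for e in residuals)
s2 = sse / (len(xs) - 2)                           # SSE / (n - 2)
print(round(sse, 4), round(s2, 4))   # 1.9 0.6333
```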
11.15 (a) The residuals and their sum are calculated in the following table:

   x     y    ŷ = 8.5 − 1.5x   ê = y − ŷ   (y − ŷ)²
   1     8         7              1          1
   2     4         5.5           −1.5        2.25
   3     5         4              1          1
   3     3         4             −1          1
   4     3         2.5            0.5        0.25
   5     1         1              0          0
Total                             0          5.50

So, Σ(y − ŷ) = 0.
(b) SSE = sum of squared residuals = 5.50. Also,
SSE = Syy − Sxy²/Sxx = 28 − (−15)²/10 = 5.50
(b) β̂₁ = Sxy/Sxx = −16/10 = −1.6,  β̂₀ = ȳ − β̂₁x̄ = 5 − (−1.6)(2) = 8.2
(c) The fitted line is ŷ = 8.2 − 1.6x.
11.17 (a) The computations needed to calculate x̄, ȳ, Sxx, Sxy, and Syy are provided
in the following table:

   x     y    x − x̄   y − ȳ   (x − x̄)(y − ȳ)   (x − x̄)²   (y − ȳ)²
   1     4     −3      −2           6              9           4
   2     3     −2      −3           6              4           9
   4     6      0       0           0              0           0
   6     8      2       2           4              4           4
   7     9      3       3           9              9           9
Total   20    30        0           0             25          26          26

So, we have
x̄ = 20/5 = 4    Sxx = Σ(x − x̄)² = 26
ȳ = 30/5 = 6    Sxy = Σ(x − x̄)(y − ȳ) = 25
                Syy = Σ(y − ȳ)² = 26
(b) β̂₁ = Sxy/Sxx = 25/26 = 0.96,  β̂₀ = ȳ − β̂₁x̄ = 6 − (0.96)(4) = 2.16
(c) The fitted line is ŷ = 2.16 + 0.96x.
11.18 (a) β̂₁ = Sxy/Sxx = 6191.04/994.038 = 6.228,
β̂₀ = ȳ − β̂₁x̄ = 89.20 − (6.228)(38.046) = −147.750
The fitted line is ŷ = −147.750 + 6.228x.
(b) SSE = Syy − Sxy²/Sxx = 76,293.2 − (6191.04)²/994.038 = 37,734.34
(c) s² = SSE/(n − 2) = 37,734.34/18 = 2096.352
11.19 (a) β̂₁ = Sxy/Sxx = 290.10/358.55 = 0.8091,
β̂₀ = ȳ − β̂₁x̄ = 5.30 − (0.8091)(3.15) = 2.751
The fitted line is ŷ = 2.751 + 0.8091x.
(b) SSE = Syy − Sxy²/Sxx = 618.2 − (290.10)²/358.55 = 383.482
(c) s² = SSE/(n − 2) = 383.482/18 = 21.305
11.20 We first calculate the means and sums of squares and products:
x̄ = 889/7 = 127,  ȳ = 520/7 = 74.286
Σx² = 113,237,  Σy² = 39,328,  Σxy = 66,392
Thus,
Sxx = Σx² − (Σx)²/7 = 113,237 − (889)²/7 = 334
Sxy = Σxy − (Σx)(Σy)/7 = 66,392 − (520)(889)/7 = 352
Syy = Σy² − (Σy)²/7 = 39,328 − (520)²/7 = 699.43
(a) β̂₁ = Sxy/Sxx = 352/334 = 1.054,
β̂₀ = ȳ − β̂₁x̄ = 74.286 − (352/334)(127) = −59.56
The fitted line is ŷ = −59.56 + 1.054x.
(b) SSE = Syy − Sxy²/Sxx = 699.43 − (352)²/334 = 328.46
(c) s² = SSE/(n − 2) = 328.46/5 = 65.69
11.21 We first calculate the means and sums of squares and products, with the roles
of x and y reversed:
x̄ = 520/7 = 74.286,  ȳ = 889/7 = 127
Σx² = 39,328,  Σy² = 113,237,  Σxy = 66,392
Thus,
Sxx = Σx² − (Σx)²/7 = 39,328 − (520)²/7 = 699.43
Sxy = Σxy − (Σx)(Σy)/7 = 66,392 − (520)(889)/7 = 352
Syy = Σy² − (Σy)²/7 = 113,237 − (889)²/7 = 334
(a) β̂₁ = Sxy/Sxx = 352/699.43 = 0.5033,
β̂₀ = ȳ − β̂₁x̄ = 127 − (0.5033)(74.286) = 89.61
The fitted line is ŷ = 89.61 + 0.5033x.
(b) SSE = Syy − Sxy²/Sxx = 334 − (352)²/699.43 = 156.85
(c) s² = SSE/(n − 2) = 156.85/5 = 31.37
(d) No, the two should be inverses of each other (the graphs of which are
reflections over the line y = x). Hence, they will not have the same
equation unless they are both identical to y = x.
11.22 Since β̂₁ = Sxy/Sxx and SSE = Syy − Sxy²/Sxx, we have
(a) Syy − β̂₁Sxy = Syy − (Sxy/Sxx)Sxy = Syy − Sxy²/Sxx = SSE
(b) Syy − β̂₁²Sxx = Syy − (Sxy/Sxx)²Sxx = Syy − Sxy²/Sxx = SSE
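A quick numerical check of both identities, using the sums from Exercise 11.12:

```python
sxx, sxy, syy = 10, 21, 46        # from Exercise 11.12
b1 = sxy / sxx
sse = syy - sxy ** 2 / sxx        # SSE by the usual formula
# identity (a): SSE = Syy - b1*Sxy; identity (b): SSE = Syy - b1^2*Sxx
print(abs(sse - (syy - b1 * sxy)) < 1e-12)       # True
print(abs(sse - (syy - b1 ** 2 * sxx)) < 1e-12)  # True
```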
11.23 At x = x̄,  ŷ = β̂₀ + β̂₁x̄ = (ȳ − β̂₁x̄) + β̂₁x̄ = ȳ.
11.24 (a) At x = xᵢ,  ŷᵢ = β̂₀ + β̂₁xᵢ = (ȳ − β̂₁x̄) + β̂₁xᵢ = ȳ + β̂₁(xᵢ − x̄).
(b) The residual at xᵢ is
êᵢ = yᵢ − ŷᵢ = (yᵢ − ȳ) − β̂₁(xᵢ − x̄)
Summing the êᵢ, we obtain
Σêᵢ = Σ(yᵢ − ȳ) − β̂₁Σ(xᵢ − x̄) = 0 − 0 = 0
because the sums of deviations Σ(yᵢ − ȳ) and Σ(xᵢ − x̄) are zero.
(c) êᵢ² = [(yᵢ − ȳ) − β̂₁(xᵢ − x̄)]²
        = (yᵢ − ȳ)² − 2β̂₁(xᵢ − x̄)(yᵢ − ȳ) + β̂₁²(xᵢ − x̄)²
Summing, we obtain
SSE = Σêᵢ² = Σ(yᵢ − ȳ)² − 2β̂₁Σ(xᵢ − x̄)(yᵢ − ȳ) + β̂₁²Σ(xᵢ − x̄)²
    = Syy − 2β̂₁Sxy + β̂₁²Sxx
    = Syy − 2(Sxy/Sxx)Sxy + (Sxy/Sxx)²Sxx = Syy − Sxy²/Sxx
11.25 (a) The computations needed to calculate x̄, ȳ, Sxx, Sxy, and Syy are provided
in the following table:

   x     y    x − x̄   y − ȳ   (x − x̄)(y − ȳ)   (x − x̄)²   (y − ȳ)²
   1     5     −2      −6          12              4          36
   2    11     −1       0           0              1           0
   3     9      0      −2           0              0           4
   4    14      1       3           3              1           9
   5    16      2       5          10              4          25
Total   15    55        0           0              25         10          74

So, we have
x̄ = 15/5 = 3    Sxx = Σ(x − x̄)² = 10
ȳ = 55/5 = 11   Sxy = Σ(x − x̄)(y − ȳ) = 25
                Syy = Σ(y − ȳ)² = 74
β̂₁ = Sxy/Sxx = 25/10 = 2.5,  β̂₀ = ȳ − β̂₁x̄ = 11 − (2.5)(3) = 3.5
SSE = Syy − Sxy²/Sxx = 74 − (25)²/10 = 11.5
s² = SSE/(n − 2) = 11.5/3 = 3.833
(b) We test the hypotheses
H₀: β₁ = 0 versus H₁: β₁ ≠ 0
Since H₁ is two-sided and α = 0.05, the rejection region is
R: |T| ≥ t₀.₀₂₅ = 3.182 (for d.f. = 3). The value of the observed t is
t = (β̂₁ − 0)/(s/√Sxx) = 2.5/√(3.833/10) = 4.038,
which lies in R. Hence, H₀ is rejected at α = 0.05.
(c) The expected value is estimated by 3.5 + 2.5(3) = 11. Since
s√(1/n + (3 − x̄)²/Sxx) = √3.833 · √(1/5 + (3 − 3)²/10) = 0.8756
and the upper 0.05 point of the t distribution with d.f. = 3 is 2.353, the 90%
confidence interval for the expected y value is given by
11 ± 2.353(0.8756), or (8.9397, 13.0603).
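The test statistic and interval in (b) and (c) can be verified numerically (a sketch; the t quantiles 3.182 and 2.353 are taken from tables, as in the text):

```python
import math

sxx, n = 10, 5
b1, s2 = 2.5, 3.833          # slope estimate and s^2 from part (a)
t = b1 / math.sqrt(s2 / sxx)  # test statistic for H0: beta1 = 0
print(round(t, 3))            # 4.038

# 90% CI for the mean response at x = 3 (= xbar), t_{0.05, 3 d.f.} = 2.353
se_mean = math.sqrt(s2 * (1 / n + (3 - 3) ** 2 / sxx))
lo, hi = 11 - 2.353 * se_mean, 11 + 2.353 * se_mean
print(round(lo, 2), round(hi, 2))   # 8.94 13.06
```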
11.26 A 90% confidence interval for β₀ (in Exercise 11.25) is given by
β̂₀ ± t₀.₀₅ · s√(1/n + x̄²/Sxx), with t₀.₀₅ = 2.353 for d.f. = 3.
11.28 (a) β̂₁ = Sxy/Sxx = 2.8/10 = 0.28,  β̂₀ = ȳ − β̂₁x̄ = 2.4 − (0.28)(2) = 1.84
SSE = Syy − Sxy²/Sxx = 0.82 − (2.8)²/10 = 0.036
s² = SSE/(n − 2) = 0.036/3 = 0.012
(b) We test the hypotheses
H₀: β₁ = 1 versus H₁: β₁ ≠ 1
Since H₁ is two-sided and α = 0.05, the rejection region is
R: |T| ≥ t₀.₀₂₅ = 3.182 (for d.f. = 3). The value of the observed t is
t = (β̂₁ − 1)/(s/√Sxx) = (0.28 − 1)/√(0.012/10) = −20.785,
which lies in R. Hence, H₀ is rejected at α = 0.05.
(c) The expected value is estimated by 1.84 + 0.28(3.5) = 2.82. Since
s√(1/n + (3.5 − x̄)²/Sxx) = √0.012 · √(1/5 + (3.5 − 2)²/10) = 0.0714,
the 95% confidence interval for the expected y value is given by
2.82 ± 3.182(0.0714), or (2.593, 3.047).
(d) A 90% confidence interval for β₀ is given by
β̂₀ ± t₀.₀₅ · s√(1/n + x̄²/Sxx).
Thus, we have
SSE = Syy − Sxy²/Sxx = 26.766
s² = SSE/(n − 2) = 26.766/5 = 5.3532
Therefore, s/√Sxx = 2.314/√1644.3 = 0.0571. Also, β̂₁ = Sxy/Sxx = 0.8694.
So, a 95% confidence interval for the expected y value at x = 290 is given by
293.71 ± 2.571(2.314)√(1/7 + (290 − 295.7)²/1644.3) = 293.71 ± 2.40,
or (291.31, 296.11).
(b) A 95% prediction interval (for x = 290) is given by
293.71 ± 2.571(2.314)√(1 + 1/7 + (290 − 295.7)²/1644.3) = 293.71 ± 6.415,
or (287.30, 300.13).
11.31 (a) The expected value of HDI corresponding to x = 22 is estimated as
β̂₀ + β̂₁(22) = 0.8773, with estimated standard error 0.0298. Since
t₀.₀₂₅ = 2.160 for d.f. = 13, the 95% confidence interval is
0.8773 ± 2.160(0.0298) = 0.8773 ± 0.0644,
or (0.8129, 0.9417).
The width of the confidence interval in Example 9 is 0.103, whereas the
interval just computed has width 0.1288.
(b) Now, we compute the 95% prediction interval for a single country
corresponding to x = 22. As in (a), β̂₀ + β̂₁(22) = 0.8773, but the estimated
SE now includes the extra factor
√(1 + 1/15 + (22 − 9.953)²/1173.46) = 1.091,
so the estimated SE is s(1.091) = 0.0683(1.091) = 0.0745. Since
t₀.₀₂₅ = 2.160 for d.f. = 13, the 95% prediction interval is
0.8773 ± 2.160(0.0745) = 0.8773 ± 0.161,
or (0.716, 1.038).
(c) No, it cannot establish causality.
11.32 (a) The fitted line from Example 9 was ŷ = 0.493 + 0.174x. To obtain the
line with the predictor variable reversed, solve this equation for x to obtain
x̂ = −2.833 + 5.747y.
(b) Test the hypotheses
H₀: β₁ = 0 versus H₁: β₁ ≠ 0
Using α = 0.05 and t₀.₀₂₅ = 2.160 with d.f. = 13, we use the two-sided rejection
region R: |T| ≥ 2.160. Now, note that
SSE = Sxx − Sxy²/Syy = 1173.46 − (20.471)²/0.41772 = 170.248
s = √(SSE/(n − 2)) = √(170.248/13) = 3.619
The test statistic is
T = 5.747/(s/√Syy) = 5.747/(3.619/√0.41772) = 1.026
Since this value does not lie in R, we do not reject H₀ at this level.
(c) The expected value of internet usage corresponding to y = 0.650 is
estimated as x̂ = −2.833 + 5.747(0.650) = 0.903, with estimated SE 0.9393.
Since t₀.₀₂₅ = 2.160 for d.f. = 13, the 95% confidence interval is
0.903 ± 2.160(0.9393) = 0.903 ± 2.029,
or (−1.126, 2.932).
(d) Now, we compute the 95% prediction interval for a single country with
HDI 0.650. As in (c), the center is 0.903, but the estimated SE is now 3.7389.
Since t₀.₀₂₅ = 2.160 for d.f. = 13, the 95% prediction interval is
0.903 ± 2.160(3.7389) = 0.903 ± 8.076,
or (−7.173, 8.979).
11.33 (a) The model is Y = β₀ + β₁x + e, and the fit suggested by the data is
ŷ = 994 + 0.10373x, with the estimate of σ = sd(e) equal to 299.4.
Note that r² is only 0.302. This means that only 30.2% of the
variability in the data is explained by the model (refer to Exercise 11.44).
(b) The t-ratio on the computer output is the t-statistic for testing that the
coefficient is zero. Since the t-ratio for the x term is 3.48 with p-value
0.002, we reject H₀: β₁ = 0 at α = 0.05.
11.34 (a) At x = 5000, the predicted mean response is given by
ŷ = 994 + 0.10373(5000) = 1512.7.
(b) Since t₀.₀₅ = 1.701 for d.f. = 28, a 90% confidence interval for the mean
response at x = 5000 is given by
1512.7 ± 1.701(299.4)√(1/30 + (5000 − 8354)²/Sxx),
or (1316.4, 1709.0).
11.35 (a) The model is Y = β₀ + β₁x + e, and the fit suggested by the data is
ŷ = 0.3381 + 0.83099x, with the estimate of σ = sd(e) equal to 0.1208.
(b) Since the t-ratio for the x term is 9.55 with p-value less than 0.0001, we
reject H₀: β₁ = 0 at α = 0.05. As such, the x term is needed in the model.
11.36 (a) At x = 3, the predicted mean response is
ŷ = 0.3381 + 0.83099(3) = 2.8311.
(b) Since t₀.₀₅ = 1.714 for d.f. = 23, a 90% confidence interval for the mean
response at x = 3 is given by
2.8311 ± 1.714(0.1208)√(1/25 + (3 − 1.793)²/1.848) = 2.8311 ± 0.1885,
or (2.643, 3.020).
(c) This 90% confidence interval is given by
[0.338 + 0.831(2)] ± 1.714√0.0146 · √(1/25 + (2 − 1.793)²/1.848),
or (1.95, 2.05).
11.37 (a) Using Minitab, we find that the fitted line plot is as follows:
[Fitted line plot of length versus age: Y = 26.3101 + 0.537657X, R-Sq = 27.7%]
(b) Enter the data into a Minitab worksheet. The output is as follows:
Regression Analysis
The regression equation is
length = 26.3 + 0.538 age
Predictor Coef StDev T P
Constant 26.3101 0.7356 35.77 0.000
age 0.5377 0.2105 2.55 0.021
S = 1.722 RSq = 27.7% RSq(adj) = 23.5%
Analysis of Variance
Source DF SS MS F P
Regression 1 19.353 19.353 6.52 0.021
Residual Error 17 50.437 2.967
Total 18 69.789
Look in the age row: the p-value of 0.021 is the result of the hypothesis
test that the slope is zero, at the 5% level of significance. Here, we reject
H₀ in favor of claiming there is a linear relationship between the two variables.
(c) & (d) One can proceed by hand as in other exercises/examples. We, however,
choose to use Minitab to obtain the following two 90% intervals
corresponding to age x = 4:
Predicted Values
Fit StDev Fit 90.0% CI 90.0% PI
28.461 0.453 ( 27.673, 29.249) ( 25.362, 31.559)
11.38 r² = Sxy²/(Sxx Syy) = (6191.04)²/((994.038)(76,293.2)) = 0.505
11.39 r² = Sxy²/(Sxx Syy) = (290.10)²/((358.55)(618.2)) = 0.380
11.40 r² = Sxy²/(Sxx Syy) = (20.471)²/((1173.46)(0.41772)) = 0.855
11.41 (a) and (b): The r² value is the same as in Exercise 11.40.
11.42 By Exercise 11.25, we have Sxx = 10, Syy = 74, Sxy = 25. So,
(a) r² = Sxy²/(Sxx Syy) = (25)²/((10)(74)) = 0.845
(b) r = 25/√((10)(74)) = 0.919
(c) SSE = 74 − (25)²/10 = 11.5
(d) s² = SSE/(n − 2) = 11.5/3 = 3.833
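These correlation quantities follow directly from the three sums of squares; a quick check:

```python
import math

sxx, syy, sxy = 10, 74, 25      # from Exercise 11.25
r = sxy / math.sqrt(sxx * syy)  # sample correlation
print(round(r, 3))              # 0.919
print(round(r * r, 3))          # 0.845, the proportion of variance explained
```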
11.43 By Exercise 11.28, we have Sxx = 10, Syy = 0.82, Sxy = 2.8. So,
(a) Proportion of variance explained: r² = Sxy²/(Sxx Syy) = 0.956
(b) r = Sxy/√(Sxx Syy) = 0.978
11.44 Proportion explained = r² = 0.302
11.45 Proportion explained = r² = 0.799
11.46 (a) r = 0.649 and r² = 0.421
(b) r = 0.279 and r² = 0.078
(c) r = 0.733 and r² = 0.537
(d) The pattern is quite different for male and female wolves.
[Scatter diagram omitted]
11.47 (a) Recall β̂₁ = Sxy/Sxx, so multiplying r by √Syy/√Sxx gives
r · √Syy/√Sxx = [Sxy/√(Sxx Syy)] · √Syy/√Sxx = Sxy/Sxx = β̂₁
(b) SSE = Syy − Sxy²/Sxx = Syy[1 − Sxy²/(Sxx Syy)] = Syy(1 − r²)
11.48 SS due to regression = β̂₁²Sxx = (Sxy/Sxx)²Sxx = Sxy²/Sxx.
11.49 The product x = (leaf length) × (leaf width) is the area of a rectangle that
contains the leaf. This area is larger than the leaf's area, so the slope should
be less than one.
11.50 (a) & (c) The fitted line is given by ŷ = 3.966 + 3.144x.
[Scatter diagram with fitted line omitted]
(b) Observe that
x̄ = 23/9 = 2.556    Sxx = Σ(x − x̄)² = 16.2018
ȳ = 108/9 = 12      Sxy = Σ(x − x̄)(y − ȳ) = 50.952
                    Syy = Σ(y − ȳ)² = 176
(d) The predicted value is 3.966 + 3.144(3) = 13.398.
11.51 (a) Note that
β̂₁ = Sxy/Sxx = 3.144,  β̂₀ = ȳ − β̂₁x̄ = 12 − 3.144(2.556) = 3.964
So, ŷ = β̂₀ + β̂₁x. The residuals and their sum are calculated in the
following table:

   x     y     ŷ        ê = y − ŷ   (y − ŷ)²
   1     8     8.802     −0.802      0.643
   1     6     8.802     −2.802      7.851
   1     7     8.802     −1.802      3.247
   2    10    10.857     −0.857      0.734
   3    15    12.912      2.088      4.360
   3    12    12.912     −0.912      0.832
   3    13    12.912      0.088      0.008
   4    19    14.967      4.033     16.265
   5    18    17.022      0.978      0.956
Total                               34.896

So, SSE = 34.896.
(b) SSE = sum of squared residuals = 34.896. By the formula,
SSE = Syy − Sxy²/Sxx = 176 − (69.904)²/34.014 = 32.337
(c) s² = SSE/(n − 2) = 34.896/(9 − 2) = 4.985
11.52 (a) A 95% confidence interval for β₁ is given by
3.144 ± 2.365(2.233/√16.2018) = 3.144 ± 1.312,
or (1.832, 4.456).
(b) This 95% confidence interval is given by
[3.964 + 3.144(4)] ± 1.895(2.233)√(1/9 + (4 − 2.556)²/16.2018) = 16.54 ± 2.072,
or (14.468, 18.612).
11.53 (a) β̂₁ = Sxy/Sxx = −12.4/5.6 = −2.214,
β̂₀ = ȳ − β̂₁x̄ = 54.8 + (2.214)(8.3) = 73.18
The fitted line is then given by ŷ = 73.18 − 2.214x. Also,
SSE = Syy − Sxy²/Sxx = 38.7 − (−12.4)²/5.6 = 11.24
s² = SSE/(n − 2) = 11.24/13 = 0.8648
(b) We test the hypotheses
H₀: β₁ = −2 versus H₁: β₁ < −2
Since H₁ is left-sided and α = 0.05, the rejection region is
R: T < −t₀.₀₅ = −1.771 (for d.f. = 13). The value of the observed t is
t = (β̂₁ + 2)/(s/√Sxx) = (−2.214 + 2)/√(0.8648/5.6) = −0.5446,
which does not lie in R. Hence, H₀ is not rejected at α = 0.05.
(c) A 95% confidence interval (for x = 10) is given by
[73.18 − 2.214(10)] ± 2.160(0.930)√(1/15 + (10 − 8.3)²/5.6) = 51.04 ± 1.53,
or (49.51, 52.57).
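The left-sided test in (b) can be reproduced as follows (a sketch; t₀.₀₅ = 1.771 for 13 d.f. is taken from tables):

```python
import math

b1, s2, sxx = -2.214, 0.8648, 5.6
# H0: beta1 = -2 vs H1: beta1 < -2; reject if T < -1.771
t = (b1 - (-2)) / math.sqrt(s2 / sxx)
print(round(t, 4))    # -0.5446
print(t < -1.771)     # False, so H0 is not rejected
```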
11.54 (a) The decomposition of the total y-variability is given by
Syy  =  Sxy²/Sxx  +  SSE
38.7 =    27.46   +  11.24
(Total = Explained by linear relation + Error)
(b) Proportion explained = r² = 27.46/38.7 = 0.71
(c) r = Sxy/√(Sxx Syy) = −12.4/√((5.6)(38.7)) = −0.84
11.55 (a) [Scatter diagram of rent (y) versus size (x) omitted]
We first calculate the means and sums of squares and products:
x̄ = 7880/8 = 985,  ȳ = 6845/8 = 855.625
Σx² = 7,797,438,  Σy² = 5,914,875,  Σxy = 6,785,540
Thus,
Sxx = Σx² − (Σx)²/8 = 35,638
Sxy = Σxy − (Σx)(Σy)/8 = 43,215
Syy = Σy² − (Σy)²/8 = 58,121.88
Consequently, we have
β̂₁ = Sxy/Sxx = 2.054,  β̂₀ = ȳ − β̂₁x̄ = −1167.57
The fitted line is ŷ = −1167.57 + 2.054x. Furthermore,
SSE = Syy − Sxy²/Sxx = 5718.93
s² = SSE/(n − 2) = 5718.93/6 = 953.16, so that S = 30.87.
(b) We test the hypotheses
H₀: β₁ = 0 versus H₁: β₁ > 0
Since H₁ is right-sided and α = 0.05, the rejection region is
R: T > t₀.₀₅ = 1.943 (for d.f. = 6). The value of the observed t is
t = (β̂₁ − 0)/(s/√Sxx) = 5.406,
which lies in R. Hence, H₀ is rejected at α = 0.05. This implies that the
mean rent y increases with size x.
(c) The expected increase per additional square foot is β₁. A 95% confidence
interval is calculated as
2.054 ± 0.0021,
or (2.0519, 2.0561).
(d) For a specific apartment of size x = 1025, a 95% prediction interval is
given by
[−1167.57 + 2.054(1025)] ± 2.447(30.87)√(1 + 1/8 + (1025 − 985)²/35,638)
= 937.78 ± 81.70,
or (856.08, 1019.48).
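The prediction interval in (d) can be reproduced numerically (a sketch; t₀.₀₂₅ = 2.447 for 6 d.f. is taken from tables):

```python
import math

b0, b1, s, n = -1167.57, 2.054, 30.87, 8
xbar, sxx = 985, 35638
x0 = 1025                                  # apartment size in question
yhat = b0 + b1 * x0                        # predicted rent
# prediction-interval half-width: extra "1 +" accounts for a single new response
half = 2.447 * s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)
print(round(yhat, 2), round(half, 2))      # 937.78 81.7
```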
11.56 (a) r = Sxy/√(Sxx Syy) = 43,215/√((35,638)(58,121.88)) = 0.9495
(b) Proportion of variability explained is r² = 0.9016.
11.57 (a) & (b) [Scatter diagram and fitted line omitted]
x̄ = 40/9 = 4.4       Sxx = Σ(x − x̄)² = 50.58
ȳ = 100.7/9 = 11.19   Sxy = Σ(x − x̄)(y − ȳ) = −77.85
                      Syy = Σ(y − ȳ)² = 136.41
As such, we have
β̂₁ = Sxy/Sxx = −1.558,  β̂₀ = ȳ − β̂₁x̄ = 18.114
So, the fitted line is ŷ = 18.114 − 1.558x.
(c) Note that SSE = Syy − Sxy²/Sxx = 16.59, so that s = √(SSE/7) = 1.539.
Since t₀.₀₂₅ = 2.365 for d.f. = 7, a 95% confidence interval for β₁ is
−1.558 ± 2.365(1.539/√50.58),
or (−2.07, −1.046).
11.58 (a) The predicted value is 18.114 − 1.558(5) = 10.324.
A 95% confidence interval for the mean response at this x is then given by
10.324 ± 2.365(1.539)√(1/9 + (5 − 4.4)²/50.58) = 10.324 ± 1.252,
or (9.072, 11.576).
(b) The predicted value for a single car is 16.556.
A 90% prediction interval for this value is then given by
16.556 ± 1.895(1.539)√(1 + 1/9 + (5 − 4.4)²/50.58) = 16.556 ± 3.084,
or (13.472, 19.64).
(c) No. There is no data over that region. The linear relation cannot hold for
cars that old because we know the selling price never goes below zero.
11.59 r = Sxy/√(Sxx Syy) = −77.85/√((50.58)(136.41)) = −0.9455. Since
r² = 0.8939 is the proportion of variance explained by the linear regression
of y on x, the fit appears to be adequate.
11.60 (a) Observe that
Sxy = 624 − (106)(63)/20 = 290.1
Sxx = 1180 − (106)²/20 = 618.20
Syy = 557 − (63)²/20 = 358.55
So,
β̂₁ = Sxy/Sxx = 290.1/618.20 = 0.469,
β̂₀ = ȳ − β̂₁x̄ = 3.15 − (0.469)(5.3) = 0.6643
The fitted line is ŷ = 0.6643 + 0.469x.
(b) r = Sxy/√(Sxx Syy) = 290.1/√((618.20)(358.55)) = 0.616
(c) r² = 0.380, so the proportion of y-variability explained by the
straight-line fit is nearly 38%.
11.61 (a) β̂₀ = 1.071, β̂₁ = 2.7408
(b) SSE = 63.65
(c) Estimated S.E. of β̂₀ is 2.751; estimated S.E. of β̂₁ is 0.4411.
(d) For testing H₀: β₀ = 0, t = 0.39.
For testing H₀: β₁ = 0, t = 6.21.
(e) r² = 0.828
(f) The decomposition of the total sum of squares is
Syy    =  Sxy²/Sxx  +  SSE
370.90 =   307.25   +  63.65
(Total = Explained by linear relation + Error)
11.62 [Minitab output omitted]
(a) [Scatter diagram omitted]
(b) From the output, the least squares regression line is ŷ = 53.17 + 1.0349x.
According to the large p-value for the test of zero intercept, the model
could be refit without an intercept term. However, the unusual
observation noted in the output could make the current analysis misleading
(see Exercise 11.63). We proceed as if the intercept term is needed.
(c) For d.f. = 17, t₀.₀₅ = 1.740. Therefore, the null hypothesis H₀: β₁ = 0 will
be rejected at α = 0.05 if the observed t value is in the rejection region
R: T ≥ 1.740. According to the output, the observed t is given by
t = 1.0349/0.2945 = 3.51, which lies in R. Hence, H₀ is rejected at α = 0.05.
Furthermore, since the p-value of 0.003 is very small, the data strongly
support that β₁ > 0 which, in turn, indicates that the expected value of
weight increases with body length.
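The observed t in (c) is just the estimated slope divided by its standard error, both read from the output:

```python
coef, se = 1.0349, 0.2945   # slope estimate and its StDev from the Minitab output
t = coef / se
print(round(t, 2))   # 3.51
print(t >= 1.740)    # True, so H0: beta1 = 0 is rejected at alpha = 0.05
```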
11.63 [Minitab output omitted]
(a) From the output, the fitted line is ŷ = 87.17 + 1.2765x.
(b) For d.f. = 16, t₀.₀₅ = 1.746. Therefore, the null hypothesis H₀: β₁ = 0 will
be rejected at α = 0.05 if the observed t value is in the rejection region
R: T ≥ 1.746. According to the output, the observed t is given by
t = 1.2765/0.2156 = 5.92, which lies in R. Hence, H₀ is rejected at α = 0.05.
Furthermore, since the p-value is very small, the data strongly support that
β₁ > 0 which, in turn, indicates that the expected value of weight increases
with body length.
(c) The intercept term is now needed since the p-value is 0.003. The
estimated slope has increased.
11.64 (a) From the data, we calculate:
x̄ = 154.7    Sxx = Σ(x − x̄)² = 11,262.2
ȳ = 616.1    Sxy = Σ(x − x̄)(y − ȳ) = 41,914.6
             Syy = Σ(y − ȳ)² = 225,897.8
So, the fitted line is ŷ = 40.35 + 3.722x.
(b) The residual sum of squares is SSE = 69,904, and the estimate of σ is
s = √(SSE/18) = 62.3. Since t₀.₀₂₅ = 2.101 for d.f. = 18, the 95% confidence
interval is given by
3.722 ± 2.101(62.3/√11,262.2),
or (2.489, 4.955).
(c) The predicted value of y for x = 150 is 40.35 + 3.722(150) = 599.
A 95% prediction interval is
599 ± 2.101(62.3)√(1 + 1/20 + (150 − 154.7)²/11,262.2) = 599 ± 134,
or (465, 733).
(d) The calculations are similar to those in part (c), so we present only the
final results:
At x = 175, the interval is 692 ± 136, or (556, 828).
At x = 195, the interval is 766 ± 143, or (623, 909).
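The interval in (c) can be reproduced numerically (a sketch; t₀.₀₂₅ = 2.101 for 18 d.f. is taken from tables):

```python
import math

b0, b1, s, n = 40.35, 3.722, 62.3, 20
xbar, sxx = 154.7, 11262.2
x0 = 150
yhat = b0 + b1 * x0
# prediction-interval half-width (the "1 +" term is for a single new observation)
half = 2.101 * s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)
print(round(yhat))    # 599
print(round(half))    # 134
```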
11.65 (a) From the data, we calculate:
x̄ = 16.66    Sxx = Σ(x − x̄)² = 40.476
ȳ = 80.04    Sxy = Σ(x − x̄)(y − ȳ) = 133.804
             Syy = Σ(y − ȳ)² = 629.836
So,
β̂₁ = Sxy/Sxx = 133.804/40.476 = 3.306,
β̂₀ = ȳ − β̂₁x̄ = 80.04 − (3.306)(16.66) = 24.96
So, the fitted line is ŷ = 24.96 + 3.306x.
(b) The residual sum of squares is SSE = 187.5, and the estimate of σ is
s = √(SSE/13) = 3.798. Since t₀.₀₂₅ = 2.160 for d.f. = 13, the 95% confidence
interval is given by
3.306 ± 2.160(3.798/√40.476),
or (2.02, 4.60).
(c) The predicted temperature (°F) for x = 15