
Chapter 12

Regression Models

12.1 The point (x̂, ŷ) is the closest if it lies at the right-angle vertex of the right triangle whose other vertices are (x₀, y₀) and (x₀, a + bx₀). By the Pythagorean theorem, we must have

\[
\Bigl[(\hat{x}-x_0)^2+\bigl(\hat{y}-(a+bx_0)\bigr)^2\Bigr]+\Bigl[(\hat{x}-x_0)^2+(\hat{y}-y_0)^2\Bigr]
=(x_0-x_0)^2+\bigl(y_0-(a+bx_0)\bigr)^2 .
\]

Substituting the values of x̂ and ŷ from (12.2.7), we obtain for the LHS above

\[
\left[\left(\frac{b(y_0-bx_0-a)}{1+b^2}\right)^{\!2}+\left(\frac{b^2(y_0-bx_0-a)}{1+b^2}\right)^{\!2}\right]
+\left[\left(\frac{b(y_0-bx_0-a)}{1+b^2}\right)^{\!2}+\left(\frac{y_0-bx_0-a}{1+b^2}\right)^{\!2}\right]
=\bigl(y_0-(a+bx_0)\bigr)^2\,\frac{b^2+b^4+b^2+1}{(1+b^2)^2}
=\bigl(y_0-(a+bx_0)\bigr)^2 .
\]
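As a quick numerical sanity check (mine, not part of the original solution), the identity can be verified in R, using the closest-point formulas x̂ = (x₀ + b(y₀ − a))/(1 + b²) and ŷ = a + bx̂, which is my reading of (12.2.7):

a <- 2; b <- 1.5; x0 <- 3; y0 <- -1   # arbitrary test values
xhat <- (x0 + b*(y0 - a))/(1 + b^2)   # closest point on y = a + bx, per (12.2.7)
yhat <- a + b*xhat
lhs <- ((xhat - x0)^2 + (yhat - (a + b*x0))^2) + ((xhat - x0)^2 + (yhat - y0)^2)
rhs <- (y0 - (a + b*x0))^2
all.equal(lhs, rhs)                   # TRUE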
12.3 a. Differentiation yields

\[
\frac{\partial f}{\partial \xi_i}
=-2(x_i-\xi_i)-2\beta\bigl[y_i-(\alpha+\beta\xi_i)\bigr]
\stackrel{\text{set}}{=}0
\quad\Longrightarrow\quad
\xi_i(1+\beta^2)=x_i+\beta(y_i-\alpha),
\]

which is the required solution. Also, ∂²f/∂ξᵢ² = 2(1 + β²) > 0, so this is a minimum.
b. Parts i), ii), and iii) are immediate. For iv), just note that D is the Euclidean distance between (x₁, y₁) and (x₂, y₂), and hence satisfies the triangle inequality.
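Part a) is also easy to confirm numerically; here is a minimal R check, assuming only the definition of f used above:

alpha <- 1; beta <- -0.7; x <- 2; y <- 0.5    # arbitrary values
f <- function(xi) (x - xi)^2 + (y - (alpha + beta*xi))^2
optimize(f, c(-10, 10))$minimum               # numerical minimizer
(x + beta*(y - alpha))/(1 + beta^2)           # closed-form solution from part a)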
12.5 Differentiate log L, for L in (12.2.17), to get

\[
\frac{\partial}{\partial\sigma^2}\log L
=-\frac{n}{2\sigma^2}+\frac{1}{2(\sigma^2)^2}\,\frac{1}{1+\beta^2}\sum_{i=1}^{n}\bigl[y_i-(\alpha+\beta x_i)\bigr]^2 .
\]

Set this equal to zero and solve for σ². The answer is (12.2.18).
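Solving yields σ̂² = Σᵢ[yᵢ − (α + βxᵢ)]²/(n(1 + β²)). A small R check of this argmax, assuming the σ²-profile of log L has the form implied by the derivative above:

set.seed(1)
alpha <- 1; beta <- 2; n <- 50
x <- runif(n); y <- alpha + beta*x + rnorm(n)          # any data will do
S <- sum((y - (alpha + beta*x))^2)
loglik <- function(s2) -(n/2)*log(s2) - S/(2*s2*(1 + beta^2))
optimize(loglik, c(1e-4, 10), maximum = TRUE)$maximum  # numerical argmax
S/(n*(1 + beta^2))                                     # closed form (same value)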
12.7 a. Suppressing the subscript i and the minus sign, the exponent is

\[
\frac{(x-\xi)^2}{2\sigma^2}+\frac{\bigl[y-(\alpha+\beta\xi)\bigr]^2}{2\tau^2}
=\frac{\tau^2+\beta^2\sigma^2}{2\sigma^2\tau^2}\,(\xi-k)^2
+\frac{\bigl[y-(\alpha+\beta x)\bigr]^2}{2(\tau^2+\beta^2\sigma^2)},
\]

where k = (τ²x + βσ²(y − α))/(τ² + β²σ²). Thus, integrating with respect to ξ eliminates the first term.
b. The resulting function would have to be the joint pdf of X and Y; its double integral is infinite, however, so it cannot be a pdf.
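The completed square in part a) can be checked numerically; a short R sketch at arbitrary parameter values:

x <- 0.7; y <- -1.2; xi <- 0.3                # arbitrary values
alpha <- 1; beta <- 2; sigma2 <- 0.5; tau2 <- 1.5
k <- (tau2*x + beta*sigma2*(y - alpha))/(tau2 + beta^2*sigma2)
lhs <- (x - xi)^2/(2*sigma2) + (y - (alpha + beta*xi))^2/(2*tau2)
rhs <- (tau2 + beta^2*sigma2)/(2*sigma2*tau2)*(xi - k)^2 +
       (y - (alpha + beta*x))^2/(2*(tau2 + beta^2*sigma2))
all.equal(lhs, rhs)                           # TRUE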
12.9 a. From the last two equations in (12.2.19),

\[
\hat{\sigma}^2=\frac{1}{n}S_{xx}-\frac{1}{n}\,\frac{S_{xy}}{\hat{\beta}},
\]

which is positive only if Sxx > Sxy/β̂. Similarly,

\[
\hat{\tau}^2=\frac{1}{n}S_{yy}-\frac{1}{n}\,\hat{\beta}S_{xy},
\]

which is positive only if Syy > β̂Sxy.

b. We have from part a) that σ̂² > 0 ⟺ Sxx > Sxy/β̂ and τ̂² > 0 ⟺ Syy > β̂Sxy. Furthermore, σ̂ξ² > 0 implies that Sxy and β̂ have the same sign. Thus Sxx > |Sxy|/|β̂| and Syy > |β̂||Sxy|. Combining yields

\[
\frac{|S_{xy}|}{S_{xx}}<|\hat{\beta}|<\frac{S_{yy}}{|S_{xy}|}.
\]
12.11 a.

\begin{align*}
\operatorname{Cov}(aY+bX,\;cY+dX)
&=\operatorname{E}(aY+bX)(cY+dX)-\operatorname{E}(aY+bX)\operatorname{E}(cY+dX)\\
&=\operatorname{E}\bigl[acY^2+(bc+ad)XY+bdX^2\bigr]-\operatorname{E}(aY+bX)\operatorname{E}(cY+dX)\\
&=ac\operatorname{Var}Y+ac(\operatorname{E}Y)^2+(bc+ad)\operatorname{Cov}(X,Y)\\
&\qquad+(bc+ad)\operatorname{E}X\operatorname{E}Y+bd\operatorname{Var}X+bd(\operatorname{E}X)^2-\operatorname{E}(aY+bX)\operatorname{E}(cY+dX)\\
&=ac\operatorname{Var}Y+(bc+ad)\operatorname{Cov}(X,Y)+bd\operatorname{Var}X.
\end{align*}
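Because the sample variance and covariance are bilinear in exactly the same way, the identity holds for sample moments too; a quick R illustration (arbitrary constants and data):

set.seed(2)
n <- 10000
X <- rnorm(n); Y <- 0.6*X + rnorm(n)             # any correlated pair will do
a <- 2; b <- -1; c <- 0.5; d <- 3
cov(a*Y + b*X, c*Y + d*X)                        # left-hand side
a*c*var(Y) + (b*c + a*d)*cov(X, Y) + b*d*var(X)  # right-hand side (identical)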

b. Identify a = β, b = 1, c = 1, d = −β, and using (12.3.19),

\begin{align*}
\operatorname{Cov}(\beta Y_i+X_i,\;Y_i-\beta X_i)
&=\beta\operatorname{Var}Y+(1-\beta^2)\operatorname{Cov}(X,Y)-\beta\operatorname{Var}X\\
&=\beta\bigl(\beta^2\sigma_\xi^2+\tau^2\bigr)+(1-\beta^2)\beta\sigma_\xi^2-\beta\bigl(\sigma_\xi^2+\sigma^2\bigr)\\
&=\beta(\tau^2-\sigma^2)=0
\end{align*}

if σ² = τ². (Note that we did not need the normality assumption, just the moments.)
c. Let Wᵢ = βYᵢ + Xᵢ and Vᵢ = Yᵢ − βXᵢ. Exercise 11.33 shows that if Cov(Wᵢ, Vᵢ) = 0, then √(n−2) r/√(1−r²) has a t_{n−2} distribution. Thus √(n−2) r(β)/√(1−r²(β)) has a t_{n−2} distribution for all values of β, by part (b). Also,

\[
P\!\left(\beta:\;\left(\frac{\sqrt{n-2}\,r(\beta)}{\sqrt{1-r^2(\beta)}}\right)^{\!2}\le F_{1,n-2,\alpha}\right)
=P\!\left((X,Y):\;\frac{(n-2)\,r^2(\beta)}{1-r^2(\beta)}\le F_{1,n-2,\alpha}\right)=1-\alpha .
\]

12.13 a. Rewrite (12.2.22) to get

\[
\left\{\beta:\;-t_{n-2,\alpha/2}\le\frac{\sqrt{n-2}\,r(\beta)}{\sqrt{1-r^2(\beta)}}\le t_{n-2,\alpha/2}\right\}
=\left\{\beta:\;\frac{(n-2)\,r^2(\beta)}{1-r^2(\beta)}\le F_{1,n-2,\alpha}\right\}.
\]

b. For β̂ of (12.2.16), the numerator of r(β) in (12.2.22) can be written

\[
\beta S_{yy}+(1-\beta^2)S_{xy}-\beta S_{xx}
=-\beta^2 S_{xy}-\beta(S_{xx}-S_{yy})+S_{xy}
=-S_{xy}\,(\beta-\hat{\beta})\Bigl(\beta+\frac{1}{\hat{\beta}}\Bigr).
\]

Again from (12.2.22), we have

\[
\frac{r^2(\beta)}{1-r^2(\beta)}
=\frac{\bigl(\beta S_{yy}+(1-\beta^2)S_{xy}-\beta S_{xx}\bigr)^2}
{\bigl(\beta^2 S_{yy}+2\beta S_{xy}+S_{xx}\bigr)\bigl(S_{yy}-2\beta S_{xy}+\beta^2 S_{xx}\bigr)-\bigl(\beta S_{yy}+(1-\beta^2)S_{xy}-\beta S_{xx}\bigr)^2},
\]

and a great deal of straightforward (but tedious) algebra will show that the denominator of this expression is equal to

\[
(1+\beta^2)^2\bigl[S_{yy}S_{xx}-S_{xy}^2\bigr].
\]

Thus

\[
\frac{r^2(\beta)}{1-r^2(\beta)}
=\frac{S_{xy}^2\,(\beta-\hat{\beta})^2\bigl(\beta+\frac{1}{\hat{\beta}}\bigr)^2}{(1+\beta^2)^2\bigl(S_{yy}S_{xx}-S_{xy}^2\bigr)}
=\frac{(\beta-\hat{\beta})^2\Bigl(\frac{1+\hat{\beta}\beta}{1+\beta^2}\Bigr)^{\!2}S_{xy}^2}{\hat{\beta}^2\bigl(S_{yy}S_{xx}-S_{xy}^2\bigr)}
\left[\frac{(1+\hat{\beta}^2)^2 S_{xy}^2}{\hat{\beta}^2\bigl[(S_{xx}-S_{yy})^2+4S_{xy}^2\bigr]}\right],
\]

after substituting for β̂² from page 588. Now using the fact that β̂ and −1/β̂ are both roots of the same quadratic equation, we have

\[
\left(\frac{1+\hat{\beta}^2}{\hat{\beta}}\right)^{\!2}
=\left(\hat{\beta}+\frac{1}{\hat{\beta}}\right)^{\!2}
=\frac{(S_{xx}-S_{yy})^2+4S_{xy}^2}{S_{xy}^2}.
\]

Thus the expression in square brackets is equal to 1.
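The root identity can be checked numerically in R; the expression for β̂ below is the positive root of the quadratic Sxy β² + (Sxx − Syy)β − Sxy = 0, which is my reading of (12.2.16):

Sxx <- 4.2; Syy <- 7.5; Sxy <- 2.1            # arbitrary values with Sxy^2 < Sxx*Syy
betahat <- ((Syy - Sxx) + sqrt((Syy - Sxx)^2 + 4*Sxy^2))/(2*Sxy)
quad <- function(b) Sxy*b^2 + (Sxx - Syy)*b - Sxy
c(quad(betahat), quad(-1/betahat))            # both are (numerically) zero
((1 + betahat^2)/betahat)^2                   # left side of the identity
((Sxx - Syy)^2 + 4*Sxy^2)/Sxy^2               # right side (same value)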


12.15 a.

\[
\pi(-\alpha/\beta)=\frac{e^{\alpha+\beta(-\alpha/\beta)}}{1+e^{\alpha+\beta(-\alpha/\beta)}}=\frac{e^{0}}{1+e^{0}}=\frac{1}{2}.
\]

b.

\[
\pi\bigl((-\alpha/\beta)+c\bigr)=\frac{e^{\alpha+\beta((-\alpha/\beta)+c)}}{1+e^{\alpha+\beta((-\alpha/\beta)+c)}}=\frac{e^{\beta c}}{1+e^{\beta c}},
\]

and

\[
1-\pi\bigl((-\alpha/\beta)-c\bigr)=1-\frac{e^{-\beta c}}{1+e^{-\beta c}}=\frac{1}{1+e^{-\beta c}}=\frac{e^{\beta c}}{1+e^{\beta c}}.
\]

c.

\[
\frac{d}{dx}\,\pi(x)=\frac{\beta e^{\alpha+\beta x}}{\bigl[1+e^{\alpha+\beta x}\bigr]^2}=\beta\,\pi(x)\bigl(1-\pi(x)\bigr).
\]

d. Because

\[
\frac{\pi(x)}{1-\pi(x)}=e^{\alpha+\beta x},
\]

the result follows from direct substitution.
e. Follows directly from (d).
f. Follows directly from

\[
\frac{\partial}{\partial\alpha}F(\alpha+\beta x)=f(\alpha+\beta x)
\qquad\text{and}\qquad
\frac{\partial}{\partial\beta}F(\alpha+\beta x)=x\,f(\alpha+\beta x).
\]

g. For F(x) = eˣ/(1 + eˣ), f(x) = F(x)(1 − F(x)) and the result follows. For F(x) = π(x) of (12.3.2), it follows from part (c) that f/[F(1 − F)] = β.
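Part (c) is easy to confirm numerically; a minimal R sketch at arbitrary α, β, and x:

alpha <- -1; beta <- 0.8
pi.fun <- function(x) exp(alpha + beta*x)/(1 + exp(alpha + beta*x))
x <- 0.37; h <- 1e-6
(pi.fun(x + h) - pi.fun(x - h))/(2*h)   # central-difference derivative
beta*pi.fun(x)*(1 - pi.fun(x))          # closed form from part (c), same value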
12.17 a. The likelihood equations and solution are the same as in Example 12.3.1, with the exception that here π(xⱼ) = Φ(α + βxⱼ), where Φ is the cdf of the standard normal.
b. If the 0–1 failure response is denoted oring and the temperature data is temp, the following R code will generate the logit and probit regressions:
summary(glm(oring~temp, family=binomial(link=logit)))
summary(glm(oring~temp, family=binomial(link=probit)))

For the logit model we have

             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)   15.0429      7.3719    2.041    0.0413
temp          -0.2322      0.1081   -2.147    0.0318

and for the probit model we have

             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)   8.77084     3.86222    2.271    0.0232
temp         -0.13504     0.05632   -2.398    0.0165

Although the coefficients are different, the fit is qualitatively the same, and the probability of failure at 31°F, using the probit model, is .9999.
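The .9999 figure can be reproduced directly from the probit coefficients above (predict(..., type="response") on the fitted glm object gives the same number):

pnorm(8.77084 - 0.13504*31)   # P(failure) at 31 degrees, approximately .9999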
12.19 a. Using the notation of Example 12.3.1, the likelihood (joint density) is

\[
\prod_{j=1}^{J}\left(\frac{e^{\alpha+\beta x_j}}{1+e^{\alpha+\beta x_j}}\right)^{\!y_j}\left(\frac{1}{1+e^{\alpha+\beta x_j}}\right)^{\!n_j-y_j}
=\left[\prod_{j=1}^{J}\left(\frac{1}{1+e^{\alpha+\beta x_j}}\right)^{\!n_j}\right]e^{\alpha\sum_j y_j+\beta\sum_j x_j y_j}.
\]

By the Factorization Theorem, Σⱼ yⱼ and Σⱼ xⱼyⱼ are sufficient.
b. Straightforward substitution.
b. Straightforward substitution.
12.21 Since (d/dp) log(p/(1 − p)) = 1/(p(1 − p)), the delta method gives

\[
\operatorname{Var}\!\left(\log\frac{\hat{p}}{1-\hat{p}}\right)
\approx\left[\frac{1}{p(1-p)}\right]^2\frac{p(1-p)}{n}
=\frac{1}{n\,p(1-p)}.
\]
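A quick simulation check of this approximation in R (my sketch; the values of n and p are arbitrary):

set.seed(3)
n <- 200; p <- 0.3
phat <- rbinom(100000, n, p)/n
var(log(phat/(1 - phat)))     # simulated variance of the sample logit
1/(n*p*(1 - p))               # delta-method approximation (close)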
12.23 a. If Σᵢ aᵢ = 0,

\[
\operatorname{E}\sum_i a_iY_i=\sum_i a_i\bigl[\alpha+\beta x_i+\gamma(1-\delta)\bigr]=\beta\sum_i a_ix_i=\beta
\]

for aᵢ = (xᵢ − x̄)/Σᵢ(xᵢ − x̄)².
b.

\[
\operatorname{E}\bigl(\bar{Y}-\hat{\beta}\bar{x}\bigr)=\frac{1}{n}\sum_i\bigl[\alpha+\beta x_i+\gamma(1-\delta)\bigr]-\beta\bar{x}=\alpha+\gamma(1-\delta),
\]

so the least squares estimate â is unbiased in the model Yᵢ = α′ + βxᵢ + εᵢ, where α′ = α + γ(1 − δ).
12.25 a. The least absolute deviation line minimizes

\[
|y_1-(c+dx_1)|+|y_2-(c+dx_1)|+|y_3-(c+dx_3)|
\]

(here x₂ = x₁). Any line that lies between (x₁, y₁) and (x₁, y₂) has the same value for the sum of the first two terms, and this value is smaller than that of any line outside of (x₁, y₁) and (x₁, y₂). Of all the lines that lie inside, the ones that go through (x₃, y₃) minimize the entire sum.
b. For the least squares line, a = −53.88 and b = .53. Any line with b between (17.9 − 14.4)/9 = .39 and (17.9 − 11.9)/9 = .67 and a = 17.9 − 136b is a least absolute deviation line.
12.27 In the terminology of M-estimators (see the argument on pages 485–486), β̂_L is consistent for the β₀ that satisfies E_{β₀} Σᵢ ψ(Yᵢ − β₀xᵢ) = 0, so we must take the true β to be this value. We then see that

\[
\sum_i\psi\bigl(Y_i-\hat{\beta}_L x_i\bigr)\to 0
\]

as long as the derivative term is bounded, which we assume is so.



12.29 The argument for the median is a special case of Example 12.4.3, where we take xᵢ = 1, so (1/n)Σᵢ xᵢ² = 1. The asymptotic distribution is given in (12.4.5), which, for (1/n)Σᵢ xᵢ² = 1, agrees with Example 10.2.3.
12.31 The LAD estimates, from Example 12.4.2, are â = 18.59 and b̂ = −.89. Here is Mathematica
code to bootstrap the standard deviations. (Mathematica is probably not the best choice here,
as it is somewhat slow. Also, the minimization seemed a bit delicate, and worked better when
done iteratively.) Sad is the sum of the absolute deviations, which is minimized iteratively
in bmin and amin. The residuals are bootstrapped by generating random indices u from the
discrete uniform distribution on the integers 1 to 23.
1. First enter data and initialize
Needs["Statistics`Master`"]
Clear[a,b,r,u]
a0=18.59;b0=-.89;aboot=a0;bboot=b0;
y0={1,1.2,1.1,1.4,2.3,1.7,1.7,2.4,2.1,2.1,1.2,2.3,1.9,2.4,
2.6,2.9,4,3.3,3,3.4,2.9,1.9,3.9};
x0={20,19.6,19.6,19.4,18.4,19,19,18.3,18.2,18.6,19.2,18.2,
18.7,18.5,18,17.4,16.5,17.2,17.3,17.8,17.3,18.4,16.9};
model=a0+b0*x0;
r=y0-model;
u:=Random[DiscreteUniformDistribution[23]]
Sad[a_,b_]:=Mean[Abs[model+rstar-(a+b*x0)]]
bmin[a_]:=FindMinimum[Sad[a,b],{b,{.5,1.5}}]
amin:=FindMinimum[Sad[a,b/.bmin[a][[2]]],{a,{16,19}}]
2. Here is the actual bootstrap. The vectors aboot and bboot contain the bootstrapped values.
B=500;
Do[
 rstar=Table[r[[u]],{i,1,23}];   (* resample the residuals with replacement *)
 astar=a/.amin[[2]];             (* intercept minimizing Sad, with b profiled out *)
 bstar=b/.bmin[astar][[2]];      (* slope minimizing Sad at that intercept *)
 aboot=Flatten[{aboot,astar}];
 bboot=Flatten[{bboot,bstar}],
 {i,1,B}]
3. Summary Statistics
Mean[aboot]
StandardDeviation[aboot]
Mean[bboot]
StandardDeviation[bboot]
4. The results are: Intercept: Mean 18.66, SD .923; Slope: Mean −.893, SD .050.
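For readers without Mathematica, the same residual bootstrap is quick in R. This is a sketch, assuming the quantreg package (whose rq() with tau = 0.5 fits the LAD regression) and the x0, y0 data above re-entered as R vectors; LAD solutions can be nonunique, so expect small numerical differences.

library(quantreg)
fit <- rq(y0 ~ x0, tau = 0.5)            # LAD fit; coef(fit) should be near (18.59, -.89)
r <- resid(fit); yhat <- fitted(fit)
B <- 500
boot <- t(replicate(B, {
  ystar <- yhat + sample(r, replace = TRUE)   # resample residuals with replacement
  coef(rq(ystar ~ x0, tau = 0.5))
}))
colMeans(boot); apply(boot, 2, sd)       # bootstrap means and SDs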
