Regression Models
12.1 The point $(x_0', y_0')$ is the closest if it lies on the vertex of the right angle of the right triangle whose other two vertices are $(x_0, y_0)$ and $(x_0, a + bx_0)$. By the Pythagorean theorem, we must have
\[
\left[(x_0'-x_0)^2 + (y_0'-y_0)^2\right] + \left[(x_0'-x_0)^2 + \left(y_0'-(a+bx_0)\right)^2\right]
= (x_0-x_0)^2 + \left(y_0-(a+bx_0)\right)^2.
\]
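This right-angle identity is easy to check numerically. Below is a minimal Python sketch (not part of the original solution): it computes the foot of the perpendicular from $(x_0, y_0)$ to the line $y = a + bx$ by minimizing the squared distance, then verifies the Pythagorean relation above.

```python
# Closest point on y = a + b*x to (x0, y0): minimizing
# (x - x0)^2 + (a + b*x - y0)^2 gives the foot of the perpendicular.
a, b = 1.0, 2.0
x0, y0 = 3.0, -1.0

xp = (x0 + b * (y0 - a)) / (1 + b ** 2)   # x-coordinate of the closest point
yp = a + b * xp                           # lies on the line

def dist2(p, q):
    """Squared Euclidean distance between points p and q."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

leg1 = dist2((x0, y0), (xp, yp))          # (x0, y0) to the closest point
leg2 = dist2((xp, yp), (x0, a + b * x0))  # along the line
hyp = dist2((x0, y0), (x0, a + b * x0))   # the vertical segment
print(leg1 + leg2, hyp)
```

The two legs and the hypotenuse satisfy the displayed equation for any choice of $a$, $b$, $(x_0, y_0)$.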
b. We have from part (a), $\sigma_\delta^2 > 0 \Rightarrow S_{xx} > S_{xy}/\beta$ and $\sigma_\epsilon^2 > 0 \Rightarrow S_{yy} > \beta S_{xy}$. Furthermore, $\sigma_\xi^2 > 0$ implies that $S_{xy}$ and $\beta$ have the same sign. Thus $S_{xx} > |S_{xy}|/|\beta|$ and $S_{yy} > |\beta||S_{xy}|$. Combining yields
\[
\frac{|S_{xy}|}{S_{xx}} < |\beta| < \frac{S_{yy}}{|S_{xy}|}.
\]
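As a sanity check (not part of the original solution), the sandwich inequality can be verified numerically. The sketch below assumes the equal-error-variance EIV slope estimate $\hat\beta = \bigl[S_{yy}-S_{xx}+\sqrt{(S_{yy}-S_{xx})^2+4S_{xy}^2}\,\bigr]/(2S_{xy})$ as the solution referred to in the text; the simulated data are illustrative only.

```python
import math
import random

# Simulate an errors-in-variables sample: latent xi, both coordinates
# observed with error, true slope 1.5.  Illustrative data only.
random.seed(7)
n = 500
xi_true = [random.gauss(0.0, 2.0) for _ in range(n)]
x = [v + random.gauss(0.0, 1.0) for v in xi_true]
y = [2.0 + 1.5 * v + random.gauss(0.0, 1.0) for v in xi_true]

xbar = sum(x) / n
ybar = sum(y) / n
Sxx = sum((v - xbar) ** 2 for v in x)
Syy = sum((w - ybar) ** 2 for w in y)
Sxy = sum((v - xbar) * (w - ybar) for v, w in zip(x, y))

# Equal-variance EIV (orthogonal regression) slope estimate.
beta_hat = (Syy - Sxx + math.sqrt((Syy - Sxx) ** 2 + 4.0 * Sxy ** 2)) / (2.0 * Sxy)

lower = abs(Sxy) / Sxx      # |OLS slope of y on x|
upper = Syy / abs(Sxy)      # reciprocal of the slope of x on y
print(lower, abs(beta_hat), upper)
```

In fact the sandwich holds deterministically for any sample with $S_{xx}S_{yy} > S_{xy}^2$, which is why the assertion below never depends on the random seed.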
12.11 a.
\[
\mathrm{Cov}(W_i, V_i) = \beta\sigma_\epsilon^2 - \beta\sigma_\delta^2 = 0
\]
if $\sigma_\delta^2 = \sigma_\epsilon^2$. (Note that we did not need the normality assumption, just the moments.)
c. Let $W_i = \beta Y_i + X_i$, $V_i = Y_i - \beta X_i$. Exercise 11.33 shows that if $\mathrm{Cov}(W_i, V_i) = 0$, then $\sqrt{n-2}\,r/\sqrt{1-r^2}$ has a $t_{n-2}$ distribution. Thus $\sqrt{n-2}\,r(\beta)/\sqrt{1-r^2(\beta)}$ has a $t_{n-2}$ distribution for all values of $\beta$, by part (b). Also
\[
P\!\left(\beta:\ \frac{(n-2)\,r^2(\beta)}{1-r^2(\beta)} \le F_{1,n-2,\alpha}\right)
= P\!\left((X,Y):\ \frac{(n-2)\,r^2(\beta)}{1-r^2(\beta)} \le F_{1,n-2,\alpha}\right) = 1-\alpha.
\]
Here
\[
\frac{r^2(\beta)}{1-r^2(\beta)}
= \frac{\left[\beta S_{yy} + (1-\beta^2)S_{xy} - \beta S_{xx}\right]^2}
{\left(\beta^2 S_{yy} + 2\beta S_{xy} + S_{xx}\right)\left(S_{yy} - 2\beta S_{xy} + \beta^2 S_{xx}\right) - \left[\beta S_{yy} + (1-\beta^2)S_{xy} - \beta S_{xx}\right]^2},
\]
and a great deal of straightforward (but tedious) algebra will show that the denominator of this expression is equal to $(1+\beta^2)^2\left(S_{xx}S_{yy} - S_{xy}^2\right)$.
Thus
\[
\frac{r^2(\beta)}{1-r^2(\beta)}
= \frac{\left[\beta(S_{yy}-S_{xx}) + (1-\beta^2)S_{xy}\right]^2}{(1+\beta^2)^2\left(S_{xx}S_{yy}-S_{xy}^2\right)},
\]
and the maximum of the right-hand side over $\beta$ is
\[
\frac{(1+\hat\beta^2)^2 S_{xy}^2}{4\hat\beta^2\left(S_{xx}S_{yy}-S_{xy}^2\right)}
\]
after substituting $\hat\beta^2$ from page 588. Now using the fact that $\hat\beta$ and $-1/\hat\beta$ are both roots of the same quadratic equation, we have
\[
\frac{(1+\hat\beta^2)^2}{\hat\beta^2} = \left(\hat\beta + \frac{1}{\hat\beta}\right)^2 = \frac{(S_{xx}-S_{yy})^2 + 4S_{xy}^2}{S_{xy}^2}.
\]
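Both root identities can be confirmed numerically. In the sketch below, the quadratic $S_{xy}\beta^2 - (S_{yy}-S_{xx})\beta - S_{xy} = 0$ is assumed to be the page-588 equation satisfied by the slope estimate (a reconstruction, since that page is not reproduced here); the sample quantities are arbitrary illustrative values.

```python
import math

# Illustrative sample quantities (any values with Sxy != 0 work).
Sxx, Syy, Sxy = 4.0, 2.5, 1.2

# Roots of  Sxy*b^2 - (Syy - Sxx)*b - Sxy = 0.  Their product is
# -Sxy/Sxy = -1, so the roots are beta-hat and -1/beta-hat.
c = Syy - Sxx
disc = math.sqrt(c * c + 4.0 * Sxy * Sxy)
b1 = (c + disc) / (2.0 * Sxy)
b2 = (c - disc) / (2.0 * Sxy)

beta = b1
lhs = (1.0 + beta ** 2) ** 2 / beta ** 2
rhs = ((Sxx - Syy) ** 2 + 4.0 * Sxy ** 2) / Sxy ** 2
print(b1 * b2, lhs, rhs)
```

The check works for any $S_{xx}$, $S_{yy}$, $S_{xy}$ because $(\hat\beta + 1/\hat\beta)^2 = (\hat\beta - 1/\hat\beta)^2 + 4$ and the sum of the roots is $(S_{yy}-S_{xx})/S_{xy}$.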
d. Because
\[
\frac{\pi(x)}{1-\pi(x)} = e^{\alpha + \beta x},
\]
the result follows from direct substitution.
e. Follows directly from (d).
f. Follows directly from
\[
\frac{\partial}{\partial\alpha} F(\alpha+\beta x) = f(\alpha+\beta x)
\quad\text{and}\quad
\frac{\partial}{\partial\beta} F(\alpha+\beta x) = x\, f(\alpha+\beta x).
\]
g. For $F(x) = e^x/(1+e^x)$, $f(x) = F(x)\bigl(1-F(x)\bigr)$ and the result follows. For $F(x) = \Phi(x)$ of (12.3.2), the result follows from part (c).
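The logistic identity used in part (g) is easy to verify numerically; a minimal sketch (not part of the original solution) compares a central-difference derivative of $F$ against $F(1-F)$:

```python
import math

# Check the identity used in part (g): for the logistic cdf
# F(x) = e^x / (1 + e^x), the density is f(x) = F(x)(1 - F(x)).
def F(x):
    return math.exp(x) / (1.0 + math.exp(x))

h = 1e-6
err = max(
    abs((F(x + h) - F(x - h)) / (2.0 * h) - F(x) * (1.0 - F(x)))
    for x in (-2.0, -0.5, 0.0, 1.5, 3.0)
)
print(err)
```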
12.17 a. The likelihood equations and solution are the same as in Example 12.3.1, with the exception that here $\pi(x_j) = \Phi(\alpha + \beta x_j)$, where $\Phi$ is the cdf of a standard normal.
b. If the 0--1 failure response is denoted oring and the temperature data is temp, the following R code will generate the logit and probit regressions:
summary(glm(oring~temp, family=binomial(link=logit)))
summary(glm(oring~temp, family=binomial(link=probit)))
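For readers without R, here is a rough pure-Python analogue of the logit fit only (a sketch under stated assumptions: the data below are simulated stand-ins, since the O-ring data are not reproduced here, and `fit_logit` is an ad hoc Newton-Raphson helper, not a library routine).

```python
import math
import random

# Simulated 0-1 response versus temperature -- NOT the real O-ring data,
# just a stand-in so the example runs end to end.
random.seed(1)
temp = [random.uniform(50.0, 85.0) for _ in range(200)]
oring = [1 if random.random() < 1.0 / (1.0 + math.exp(-(15.0 - 0.25 * t))) else 0
         for t in temp]

def fit_logit(x, y, iters=30):
    """Newton-Raphson (IRLS) for P(Y=1|x) = F(a + b*x), F the logistic cdf."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, a + b * xi))  # clamp for stability
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p          # score for a
            g1 += xi * (yi - p)   # score for b
            h00 += w              # 2x2 observed information
            h01 += xi * w
            h11 += xi * xi * w
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

a_hat, b_hat = fit_logit(temp, oring)
print(a_hat, b_hat)
```

The probit fit differs only in the weights and working response; the R `glm` calls above handle both links directly.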
12-4 Solutions Manual for Statistical Inference
for $a_i = x_i - \bar{x}$.
b.
\[
\mathrm{E}\bigl(\bar{Y} - \hat\beta\bar{x}\bigr)
= \frac{1}{n}\sum_i \left[\alpha + \beta x_i\right] - \beta\lambda\bar{x}
= \alpha + \beta(1-\lambda)\bar{x},
\]
Any line that lies between $(x_1, y_1)$ and $(x_1, y_2)$ has the same value for the sum of the first two terms, and this value is smaller than that of any line outside of $(x_1, y_1)$ and $(x_1, y_2)$. Of all the lines that lie inside, the ones that go through $(x_3, y_3)$ minimize the entire sum.
b. For the least squares line, $a = -53.88$ and $b = .53$. Any line with $b$ between $(17.9-14.4)/9 = .39$ and $(17.9-11.9)/9 = .67$ and $a = 17.9 - 136b$ is a least absolute deviation line.
12.27 In the terminology of M-estimators (see the argument on pages 485--486), $\hat\beta_L$ is consistent for the $\beta_0$ that satisfies $\mathrm{E}_{\beta_0}\sum_i \psi(Y_i - \beta_0 x_i) = 0$, so we must take the true $\beta$ to be this value. We then see that
\[
\sum_i \psi\bigl(Y_i - \hat\beta_L x_i\bigr) \approx 0.
\]
12.29 The argument for the median is a special case of Example 12.4.3, where we take $x_i \equiv 1$ so $\overline{x^2} = 1$. The asymptotic distribution is given in (12.4.5) which, for $\overline{x^2} = 1$, agrees with Example 10.2.3.
12.31 The LAD estimates, from Example 12.4.2, are $\hat\alpha = 18.59$ and $\hat\beta = -.89$. Here is Mathematica
code to bootstrap the standard deviations. (Mathematica is probably not the best choice here,
as it is somewhat slow. Also, the minimization seemed a bit delicate, and worked better when
done iteratively.) Sad is the sum of the absolute deviations, which is minimized iteratively
in bmin and amin. The residuals are bootstrapped by generating random indices u from the
discrete uniform distribution on the integers 1 to 23.
1. First enter data and initialize
Needs["Statistics`Master`"]
Clear[a,b,r,u]
a0=18.59;b0=-.89;aboot=a0;bboot=b0;
y0={1,1.2,1.1,1.4,2.3,1.7,1.7,2.4,2.1,2.1,1.2,2.3,1.9,2.4,
2.6,2.9,4,3.3,3,3.4,2.9,1.9,3.9};
x0={20,19.6,19.6,19.4,18.4,19,19,18.3,18.2,18.6,19.2,18.2,
18.7,18.5,18,17.4,16.5,17.2,17.3,17.8,17.3,18.4,16.9};
model=a0+b0*x0;
r=y0-model;
u:=Random[DiscreteUniformDistribution[23]]
Sad[a_,b_]:=Mean[Abs[model+rstar-(a+b*x0)]]
bmin[a_]:=FindMinimum[Sad[a,b],{b,{-.5,-1.5}}]
amin:=FindMinimum[Sad[a,b/.bmin[a][[2]]],{a,{16,19}}]
2. Here is the actual bootstrap. The vectors aboot and bboot contain the bootstrapped values.
B=500;
Do[
rstar=Table[r[[u]],{i,1,23}];
astar=a/.amin[[2]];
bstar=b/.bmin[astar][[2]];
aboot=Flatten[{aboot,astar}];
bboot=Flatten[{bboot,bstar}],
{i,1,B}]
3. Summary Statistics
Mean[aboot]
StandardDeviation[aboot]
Mean[bboot]
StandardDeviation[bboot]
4. The results are: Intercept: mean 18.66, SD .923; Slope: mean -.893, SD .050.
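The same residual bootstrap can be sketched in Python (an illustrative translation, not from the text): `sad` and `lad_fit` are ad hoc helpers, and the pair-enumeration LAD fit, which uses the standard fact that some optimal least absolute deviations line passes through two of the data points, replaces Mathematica's `FindMinimum`. The data are those entered above.

```python
import random

y0 = [1, 1.2, 1.1, 1.4, 2.3, 1.7, 1.7, 2.4, 2.1, 2.1, 1.2, 2.3, 1.9, 2.4,
      2.6, 2.9, 4, 3.3, 3, 3.4, 2.9, 1.9, 3.9]
x0 = [20, 19.6, 19.6, 19.4, 18.4, 19, 19, 18.3, 18.2, 18.6, 19.2, 18.2,
      18.7, 18.5, 18, 17.4, 16.5, 17.2, 17.3, 17.8, 17.3, 18.4, 16.9]

def sad(a, b, x, y):
    """Sum of absolute deviations from the line y = a + b*x."""
    return sum(abs(yi - a - b * xi) for xi, yi in zip(x, y))

def lad_fit(x, y):
    """Exact LAD fit: some optimal line passes through two data points,
    so enumerate all pairs and keep the best."""
    m = len(x)
    best = None
    for i in range(m):
        for j in range(i + 1, m):
            if x[i] == x[j]:
                continue                      # vertical line: skip
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            s = sad(a, b, x, y)
            if best is None or s < best[0]:
                best = (s, a, b)
    return best[1], best[2]

a0, b0 = lad_fit(x0, y0)
resid = [yi - (a0 + b0 * xi) for xi, yi in zip(x0, y0)]

# Residual bootstrap: resample residuals, refit the LAD line each time.
random.seed(3)
B = 200
aboot, bboot = [], []
for _ in range(B):
    ystar = [a0 + b0 * xi + random.choice(resid) for xi in x0]
    astar, bstar = lad_fit(x0, ystar)
    aboot.append(astar)
    bboot.append(bstar)

mean_b = sum(bboot) / B
sd_b = (sum((v - mean_b) ** 2 for v in bboot) / (B - 1)) ** 0.5
print(a0, b0, sd_b)
```

The point estimates should land near the values quoted above, and the bootstrap standard deviations near the Mathematica results, up to resampling noise.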