1 The model
$$y_i = \beta_0 + \beta_1 x_i + e_i, \qquad i = 1, 2, \ldots, n. \tag{1}$$
[Figure: the regression line E(Y|X=x) plotted against the predictor X, with intercept β0 and slope β1.]
x̄ = 202.95294, ȳ = 139.60529
Sxx = 530.78235, Sxy = 475.31224,
Syy = 427.79402.
Thus, the parameter estimates are
$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = 0.895, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x = -42.138.$$
The estimated line is given by
$$\hat{E}(\mathrm{Lpress} \mid \mathrm{Temp}) = -42.138 + 0.895\,\mathrm{Temp}.$$
The fit of this line to the data is excellent as shown
in Figure 2.
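As a quick check on the arithmetic, the estimates can be recomputed from the summary statistics quoted above. This is only an illustrative sketch; the helper name `fit_from_summary` is hypothetical and not part of the notes.

```python
def fit_from_summary(x_bar, y_bar, sxx, sxy):
    """Return (intercept, slope) of the least squares line from summary statistics."""
    slope = sxy / sxx                  # beta1_hat = Sxy / Sxx
    intercept = y_bar - slope * x_bar  # beta0_hat = y_bar - beta1_hat * x_bar
    return intercept, slope


# Summary statistics as quoted in the notes
b0, b1 = fit_from_summary(x_bar=202.95294, y_bar=139.60529,
                          sxx=530.78235, sxy=475.31224)
print(f"slope = {b1:.3f}, intercept = {b0:.3f}")
```

Rounded to three decimals, the printed values should agree with the estimates reported in the text.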
[Figure 2: left panel, log(Pressure) versus Temperature; right panel, Residuals versus Temperature.]
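$$E(\hat\beta_1) = \beta_1, \qquad E(\hat\beta_0) = \beta_0,$$
$$\operatorname{var}(\hat\beta_1) = \frac{\sigma^2}{S_{xx}}, \qquad \operatorname{var}(\hat\beta_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right).$$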
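The unbiasedness and variance formulas can be checked empirically with a small Monte Carlo experiment. The sketch below is only illustrative: the true parameters, the design points, the seed, and the number of replications are all hypothetical choices, not values from the notes.

```python
import random
import statistics


def ls_fit(x, y):
    """Least squares (intercept, slope) for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# ASSUMPTION: hypothetical simulation settings, chosen only for illustration
beta0, beta1, sigma = 2.0, 0.5, 1.0
x = [float(i) for i in range(10)]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

rng = random.Random(0)  # fixed seed so the run is reproducible
fits = []
for _ in range(5000):
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    fits.append(ls_fit(x, y))

b0s = [f[0] for f in fits]
b1s = [f[1] for f in fits]

# Theoretical variances from the formulas above
theory_var_b1 = sigma ** 2 / sxx
theory_var_b0 = sigma ** 2 * (1 / n + x_bar ** 2 / sxx)

print("mean of slope estimates:    ", statistics.mean(b1s), "vs", beta1)
print("mean of intercept estimates:", statistics.mean(b0s), "vs", beta0)
print("var of slope estimates:     ", statistics.variance(b1s), "vs", theory_var_b1)
print("var of intercept estimates: ", statistics.variance(b0s), "vs", theory_var_b0)
```

The simulated means and variances should be close to, but not exactly equal to, the theoretical values; the gap shrinks as the number of replications grows.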
There are several other useful properties of the
least squares fit:
$$\hat\sigma^2 = \frac{SS_{Res}}{n-2} = MS_{Res}.$$
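The residual mean square and the resulting standard errors can be sketched from the summary statistics alone. Two things here are assumptions rather than statements from the notes: the identity $SS_{Res} = S_{yy} - S_{xy}^2/S_{xx}$ (a standard result for simple linear regression) and the sample size n = 17 for Forbes' data, which this excerpt does not state. The function names are hypothetical.

```python
import math


def residual_mean_square(syy, sxx, sxy, n):
    """MSRes = SSRes / (n - 2), with SSRes = Syy - Sxy^2 / Sxx (standard identity)."""
    ss_res = syy - sxy ** 2 / sxx
    return ss_res / (n - 2)


def standard_errors(ms_res, n, x_bar, sxx):
    """Plug MSRes in for sigma^2 in the variance formulas; return (se_b0, se_b1)."""
    se_b1 = math.sqrt(ms_res / sxx)
    se_b0 = math.sqrt(ms_res * (1 / n + x_bar ** 2 / sxx))
    return se_b0, se_b1


n = 17  # ASSUMPTION: sample size is not given in this excerpt
ms_res = residual_mean_square(syy=427.79402, sxx=530.78235, sxy=475.31224, n=n)
se_b0, se_b1 = standard_errors(ms_res, n, x_bar=202.95294, sxx=530.78235)
print(f"MSRes = {ms_res:.4f}, se(b0) = {se_b0:.3f}, se(b1) = {se_b1:.4f}")
```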
5 The models in the centered form
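$$\begin{aligned}
y_i &= \beta_0 + \beta_1 (x_i - \bar x) + \beta_1 \bar x + \epsilon_i \\
    &= (\beta_0 + \beta_1 \bar x) + \beta_1 (x_i - \bar x) + \epsilon_i \\
    &= \beta_0' + \beta_1 (x_i - \bar x) + \epsilon_i
\end{aligned}$$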
It is easy to show that β̂0′ = ȳ , the estimator of
the slope is unaffected by the transformation, and
Cov(β̂0′ , β̂1 ) = 0.
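The first two claims are easy to confirm numerically. The sketch below fits the same line with and without centering the predictor; the five data points are hypothetical toy values chosen only for illustration.

```python
def ls_fit(x, y):
    """Least squares (intercept, slope) for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# ASSUMPTION: hypothetical toy data, not from the notes
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

b0, b1 = ls_fit(x, y)                  # original parameterization
x_centered = [xi - x_bar for xi in x]
b0c, b1c = ls_fit(x_centered, y)       # centered parameterization

print("slope (original, centered):", b1, b1c)
print("centered intercept vs y_bar:", b0c, y_bar)
print("b0 + b1 * x_bar:", b0 + b1 * x_bar)
```

The centered intercept equals the sample mean of y, the slope is unchanged, and the centered intercept also equals b0 + b1 * x_bar, matching the definition of β0′ in the display above.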
6 Hypothesis testing on the slope and intercept
$$H_0: \beta_1 = \beta_{10}, \qquad H_1: \beta_1 \neq \beta_{10}. \tag{5}$$
Since $e_i \sim N(0, \sigma^2)$, we have $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$ and $\hat\beta_1 \sim N(\beta_1, \sigma^2/S_{xx})$. Therefore,
$$Z_0 = \frac{\hat\beta_1 - \beta_{10}}{\sqrt{\sigma^2/S_{xx}}} \sim N(0, 1)$$
if the null hypothesis $H_0: \beta_1 = \beta_{10}$ is true. If $\sigma^2$ were known, we could use $Z_0$ to test the hypotheses (5).
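If $\sigma^2$ is unknown, we know that (1) $MS_{Res}$ is an unbiased estimator of $\sigma^2$; (2) $(n-2)MS_{Res}/\sigma^2$ follows a $\chi^2_{n-2}$ distribution; and (3) $MS_{Res}$ and $\hat\beta_1$ are independent. Therefore,
$$t_0 = \frac{\hat\beta_1 - \beta_{10}}{\sqrt{MS_{Res}/S_{xx}}} = \frac{\hat\beta_1 - \beta_{10}}{se(\hat\beta_1)}$$
follows a $t_{n-2}$ distribution if the null hypothesis $H_0: \beta_1 = \beta_{10}$ is true.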
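To test
$$H_0: \beta_0 = \beta_{00}, \qquad H_1: \beta_0 \neq \beta_{00}, \tag{6}$$
we could use the test statistic
$$t_0 = \frac{\hat\beta_0 - \beta_{00}}{\sqrt{MS_{Res}\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)}} = \frac{\hat\beta_0 - \beta_{00}}{se(\hat\beta_0)}.$$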
A very important special case of the hypotheses in (5) is
$$H_0: \beta_1 = 0, \qquad H_1: \beta_1 \neq 0. \tag{7}$$
Failing to reject the null hypothesis implies that there is no evidence of a linear relationship between x and y.
Consider the snowfall data, where $\hat\beta_1 = 0.2035$ and $se(\hat\beta_1) = 0.1310$. Thus, $t = (0.2035 - 0)/0.1310 = 1.553$. Comparing $|t|$ with the critical value $t(0.05, 91) = 1.986$, we fail to reject $H_0: \beta_1 = 0$; the data provide no evidence of a linear relationship between early and late season snowfalls.
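The snowfall calculation can be reproduced in a few lines. This is a minimal sketch: the function name `t_statistic` is hypothetical, and the critical value is taken from the text rather than computed, since the standard library has no t quantile function.

```python
def t_statistic(estimate, hypothesized, se):
    """t0 = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / se


# Values quoted in the notes for the snowfall data
b1_hat = 0.2035
se_b1 = 0.1310
t_crit = 1.986  # critical value t(0.05, 91) as quoted in the text

t0 = t_statistic(b1_hat, 0.0, se_b1)
reject = abs(t0) > t_crit  # two-sided test of H0: beta1 = 0

print(f"t0 = {t0:.3f}, reject H0: {reject}")
```

Because |t0| falls below the critical value, the sketch reports that H0 is not rejected.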
7 The analysis of variance
$F_0 > F_{\alpha,1,n-2}$
in Table 2.
8 Coefficient of determination
The quantity
$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_{Res}}{SS_T}$$
is called the coefficient of determination. For Forbes' data,
$$R^2 = \frac{425.63910}{427.79402} = 0.995,$$
and thus about 99.5% of the variability in the observed values is explained by boiling point.
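The value of R² can be checked from the summary statistics. The sketch assumes the standard identities $SS_R = S_{xy}^2/S_{xx}$ and $SS_T = S_{yy}$, which this excerpt does not state explicitly; the function name is hypothetical.

```python
def r_squared(sxx, sxy, syy):
    """Return (SSR, R^2), assuming SSR = Sxy^2 / Sxx and SST = Syy."""
    ss_r = sxy ** 2 / sxx
    return ss_r, ss_r / syy


# Summary statistics as quoted in the notes for Forbes' data
ss_r, r2 = r_squared(sxx=530.78235, sxy=475.31224, syy=427.79402)
print(f"SSR = {ss_r:.5f}, R^2 = {r2:.3f}")
```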
Below, we list some properties of $R^2$.
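The sampling distribution of both
$$\frac{\hat\beta_1 - \beta_1}{se(\hat\beta_1)} \quad \text{and} \quad \frac{\hat\beta_0 - \beta_0}{se(\hat\beta_0)}$$
is t with n − 2 degrees of freedom. Therefore, a 100(1 − α) percent confidence interval on the slope $\beta_1$ is given by
$$\hat\beta_1 - t_{\alpha/2,\,n-2}\, se(\hat\beta_1) \le \beta_1 \le \hat\beta_1 + t_{\alpha/2,\,n-2}\, se(\hat\beta_1).$$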
i.e.,
$$22.805 \le \mathrm{Pressure} \le 24.054.$$
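Returning to the confidence interval on the slope: assuming the usual form, estimate ± critical value × standard error, a minimal sketch follows. It reuses the snowfall estimate, standard error, and critical value quoted earlier in the notes; the function name `confidence_interval` is hypothetical.

```python
def confidence_interval(estimate, se, t_crit):
    """Two-sided interval: estimate -/+ t_crit * se."""
    half_width = t_crit * se
    return estimate - half_width, estimate + half_width


# Snowfall values quoted in the notes; 1.986 is the quoted critical value (91 df)
lower, upper = confidence_interval(estimate=0.2035, se=0.1310, t_crit=1.986)
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

The interval contains zero, which agrees with the earlier t test failing to reject a zero slope for these data.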
11 The Residuals
[Figure: residuals plotted against fitted values.]