
Chapter 4

Simultaneous Inferences and Other Topics in Regression Analysis

許湘伶
Applied Linear Regression Models
(Kutner, Nachtsheim, Neter, Li)

hsuhl (NUK) LR Chap 4 1 / 45


Joint Estimation of β0 and β1

Goal: draw inferences about both β0 and β1 with a family confidence coefficient of 0.95.
Chap. 2 does not provide joint 95% C.I.s for β0 and β1: even if the two inferences were independent, the joint confidence coefficient would be only (0.95)² = 0.9025.
A family confidence coefficient of 0.95 indicates that if repeated samples were selected and interval estimates for both β0 and β1 were calculated for each sample by the specified procedures, 95 percent of the samples would yield a family of estimates in which both confidence intervals are correct.



Bonferroni Joint Confidence Intervals

Goal: develop joint confidence intervals for β0 and β1 with a specified family confidence coefficient.
Each statement confidence coefficient is adjusted to be higher than 1 − α so that the family confidence coefficient is at least 1 − α.
The ordinary 1 − α confidence limits for β0 and β1:

b0 ± t(1 − α/2; n − 2)s{b0}

b1 ± t(1 − α/2; n − 2)s{b1}



Bonferroni Joint Confidence Intervals (cont.)

What is the probability that one or both of these intervals are incorrect?

A1: the event that the first C.I. does not cover β0
A2: the event that the second C.I. does not cover β1
⇒ P(A1) = α, P(A2) = α

Bonferroni inequality:

P(Ā1 ∩ Ā2) ≥ 1 − P(A1) − P(A2) = 1 − 2α



Bonferroni Joint Confidence Intervals (cont.)

If β0 and β1 are separately estimated with 95% C.I.s, the Bonferroni inequality guarantees a family confidence coefficient of at least 90% that both intervals based on the same sample are correct.
The 1 − α family confidence limits for β0 and β1 for model (2.1) by the Bonferroni procedure:

b0 ± Bs{b0}    b1 ± Bs{b1}
B = t(1 − α/4; n − 2)



Bonferroni Joint Confidence Intervals (cont.)

Example (p.156)
Toluca company
α = 0.1; B = t(1 − 0.1/4; 23) = 2.069

b0 = 62.37 s{b0 } = 26.18


b1 = 3.5702 s{b1 } = 0.3470

The joint C.I.: (b0 ± Bs{b0 } b1 ± Bs{b1 })

8.20 ≤ β0 ≤ 116.5
2.85 ≤ β1 ≤ 4.29



Bonferroni Joint Confidence Intervals (cont.)

toluca <- read.table("toluca.txt", header = TRUE)
attach(toluca)
fitreg <- lm(Hrs ~ Size)
alpha <- 0.1
confint(fitreg)                        # ordinary 95% C.I.s
confint(fitreg, level = 1 - alpha/2)   # Bonferroni joint C.I.s (g = 2)



Bonferroni Joint Confidence Intervals (cont.)

The Bonferroni 1 − α family confidence coefficient is actually a lower bound on the true family confidence coefficient.
If g interval estimates are desired with family confidence coefficient 1 − α, constructing each interval estimate with statement confidence coefficient 1 − α/g will suffice.
The Bonferroni technique is ordinarily most useful when the number of simultaneous estimates is not too large.
It is not necessary with the Bonferroni procedure that the C.I.s have the same statement confidence coefficient; it suffices that P(A1) + P(A2) = α.
Joint confidence intervals can be used directly for testing.



Bonferroni Joint Confidence Intervals (cont.)

b0 and b1 are usually correlated:

σ{b0, b1} = −X̄ σ²{b1}

When the predictor variable is Xi − X̄, b0* and b1 are uncorrelated, because the mean of the Xi − X̄ observations is zero.
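A quick numerical check of this identity, as a sketch with simulated data (the lot sizes and coefficients below are made up, not the Toluca values):

```r
# Centering X makes the intercept and slope estimates uncorrelated.
set.seed(1)
X <- 20:44                          # hypothetical predictor values
Y <- 62 + 3.6 * X + rnorm(25, sd = 50)
fit  <- lm(Y ~ X)                   # ordinary fit
fitc <- lm(Y ~ I(X - mean(X)))      # centered predictor
vcov(fit)[1, 2]                     # equals -Xbar * s^2{b1}, nonzero
vcov(fitc)[1, 2]                    # essentially zero
```

The off-diagonal entry of `vcov(fit)` reproduces σ{b0, b1} = −X̄ σ²{b1} exactly, while centering drives it to zero.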



Simultaneous Estimation of Mean Response

Often the mean responses at a number of X levels must be estimated.
Ex: Toluca Company: the mean number of work hours for lots of X = 30, 65, 100 units
Two procedures for simultaneous estimation of several different mean responses:
Working-Hotelling procedure
Bonferroni procedure



Simultaneous Estimation of Mean Response

Working-Hotelling procedure

Working-Hotelling procedure: based on the confidence band for the regression line (Chap. 2.6):

Ŷh ± Ws{Ŷh},   W² = 2F(1 − α; 2, n − 2)

The confidence band contains the entire regression line ⇒ it contains the mean responses at all X levels.
The simultaneous confidence limits for g mean responses E{Yh} for model (2.1) with the Working-Hotelling procedure:

Ŷh ± Ws{Ŷh},   W² = 2F(1 − α; 2, n − 2)

Ŷh = b0 + b1Xh
s²{Ŷh} = MSE[1/n + (Xh − X̄)²/Σ(Xi − X̄)²]



Simultaneous Estimation of Mean Response

Working-Hotelling procedure (cont.)

Toluca Company example: estimate E{Yh} at Xh = 30, 65, 100
1 − α = 0.90; F(0.90; 2, 23) = 2.549
W² = 5.098 ⇒ W = 2.258

Xh    Ŷh      s{Ŷh}
30    169.5   16.96
65    294.4   9.918
100   419.4   14.27

⇒ 131.2 ≤ E{Yh} ≤ 207.8   (Xh = 30)
  272.0 ≤ E{Yh} ≤ 316.8   (Xh = 65)
  387.2 ≤ E{Yh} ≤ 451.6   (Xh = 100)



Simultaneous Estimation of Mean Response

Working-Hotelling procedure (cont.)

fitreg <- lm(Hrs ~ Size)
Xh <- c(30, 65, 100)
pred <- predict(fitreg, data.frame(Size = Xh), se.fit = TRUE)
W <- sqrt(2 * qf(0.90, 2, fitreg$df.residual))
rbind(pred$fit - W * pred$se.fit, pred$fit + W * pred$se.fit)

            1        2        3
[1,] 131.1542 272.0351 387.1591
[2,] 207.7897 316.8229 451.6130





Simultaneous Estimation of Mean Response

Bonferroni procedure

The Bonferroni confidence limits for E{Yh} at g levels Xh with family confidence coefficient 1 − α:

Ŷh ± Bs{Ŷh},   B = t(1 − α/(2g); n − 2)



Simultaneous Estimation of Mean Response

Bonferroni procedure (cont.)

Toluca Company example: estimate E{Yh} at Xh = 30, 65, 100 (g = 3)
1 − α = 0.90; B = t(1 − 0.1/(2 · 3); 23) = 2.263
⇒ 131.1 ≤ E{Yh} ≤ 207.9
  272.0 ≤ E{Yh} ≤ 316.8
  387.1 ≤ E{Yh} ≤ 451.7



Simultaneous Estimation of Mean Response

Bonferroni procedure (cont.)

fitreg <- lm(Hrs ~ Size)
Xh <- c(30, 65, 100)
alpha <- 0.1
pred <- predict(fitreg, data.frame(Size = Xh), se.fit = TRUE)
B <- qt(1 - alpha/(2 * length(Xh)), fitreg$df.residual)
rbind(pred$fit - B * pred$se.fit, pred$fit + B * pred$se.fit)

            1        2        3
[1,] 131.0570 271.9783 387.0774
[2,] 207.8868 316.8797 451.6947





Simultaneous Prediction Intervals for New Observations

Goal: simultaneous predictions of g new observations on Y in g independent trials at g different levels of X.
Two procedures:
Scheffé procedure, using the F distribution:

Ŷh ± Ss{pred},   S² = gF(1 − α; g, n − 2)

Bonferroni procedure, using the t distribution:

Ŷh ± Bs{pred},   B = t(1 − α/(2g); n − 2)

s²{pred} = MSE[1 + 1/n + (Xh − X̄)²/Σ(Xi − X̄)²]



Simultaneous Prediction Intervals for New Observations (cont.)

Toluca Company example: predict Yh(new) at Xh = 80, 100 (g = 2)
1 − α = 0.95; S² = 2F(0.95; 2, 23) = 6.844 ⇒ S = 2.616;
B = t(1 − 0.05/(2 · 2); 23) = 2.398

Xh    Ŷh      s{pred}   Bs{pred}
80    348.0   49.91     119.7
100   419.4   50.87     122.0

228.3 ≤ Yh(new) ≤ 467.7   (Xh = 80)
297.4 ≤ Yh(new) ≤ 541.4   (Xh = 100)



Simultaneous Prediction Intervals for New Observations (cont.)

Xh <- c(80, 100)
g <- length(Xh)
alpha <- 0.05
CI.New <- predict(fitreg, data.frame(Size = Xh), se.fit = TRUE)
B <- qt(1 - alpha/(2 * g), fitreg$df.residual)        # Bonferroni multiple
S <- sqrt(g * qf(1 - alpha, g, fitreg$df.residual))   # Scheffe multiple

spred <- sqrt(CI.New$residual.scale^2 + CI.New$se.fit^2)  # (2.38)

pred.new <- t(rbind(
  "Xh"      = Xh,
  "s.pred"  = spred,
  "fit"     = CI.New$fit,
  "lower.B" = CI.New$fit - B * spred,
  "upper.B" = CI.New$fit + B * spred,
  "lower.S" = CI.New$fit - S * spred,
  "upper.S" = CI.New$fit + S * spred))



Simultaneous Prediction Intervals for New Observations (cont.)

(a) g = 2   (b) g = 3



Simultaneous Prediction Intervals for New Observations (cont.)

Both the B and S multiples for simultaneous predictions become larger as g increases.
Reference: http://rstudio-pubs-static.s3.amazonaws.com/5218_61195adcdb7441f7b08af3dba795354f.html



Regression through Origin

Sometimes the regression function is known to be linear and to pass through the origin (0, 0):

Yi = β1Xi + εi,   εi ∼ N(0, σ²)   (4.10)

E{Y} = β1X: a straight line through the origin with slope β1
The point estimator (both LSE and MLE):

b1 = ΣXiYi / ΣXi²

df: n − 1
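In R, model (4.10) can be fit by dropping the intercept; a minimal sketch with made-up data (not the warehouse example shown later):

```r
# Regression through the origin: "~ 0 + X" (or "~ X - 1") drops the intercept.
set.seed(2)
X <- c(20, 196, 115, 50, 122, 100, 33, 154, 80, 147, 182, 160)
Y <- 4.7 * X + rnorm(12, sd = 15)  # simulated through-origin relation
fit0 <- lm(Y ~ 0 + X)
coef(fit0)                          # equals sum(X * Y) / sum(X^2)
df.residual(fit0)                   # n - 1 = 11
confint(fit0)                       # uses t(1 - alpha/2; n - 1)
```

The printed coefficient agrees with the closed form b1 = ΣXiYi/ΣXi², and the residual df is n − 1 since only one parameter is estimated.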
Regression through Origin

Regression through Origin (cont.)

Figure : Confidence limits for Regression through Origin



Regression through Origin

Regression through Origin (cont.)

Charles Plumbing Supplies Company: operates 12 warehouses
X: number of work units performed
Y: total variable labor cost
b1 and Ŷ:

b1 = 4.68527
Ŷ = b1X = 4.68527X

s² = MSE = 223.42;  s²{b1} = 0.0011700
95% C.I.: 4.61 ≤ β1 ≤ 4.76



Regression through Origin

Regression through Origin (cont.)

(a) Data (b) plot

Figure : Regression through Origin-Warehouse Example



Regression through Origin

Regression through Origin (cont.)

Important caution for using regression through origin, model (4.10):
The residuals usually do not sum to zero. (See Table 4.2)
The only constraint (from the normal equation):

ΣXiei = 0

The residual plot will usually not be balanced around the zero line.
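This can be checked directly; a sketch with simulated data whose true intercept is far from zero (not the textbook's Table 4.2 data):

```r
# Through-origin residuals satisfy sum(X * e) = 0 but generally not sum(e) = 0.
set.seed(3)
X <- 1:12
Y <- 10 + 2 * X + rnorm(12)   # true relation has a nonzero intercept
e <- resid(lm(Y ~ 0 + X))     # residuals from the through-origin fit
sum(e)                        # generally NOT zero
sum(X * e)                    # ~0: the only constraint (normal equation)
```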



Regression through Origin

Regression through Origin (cont.)

SSE = Σei² may exceed SSTO = Σ(Yi − Ȳ)² when the relation is
curvilinear, or
linear with an intercept away from the origin

⇒ R² = 1 − SSE/SSTO may turn out < 0
⇒ R² has no clear meaning for regression through the origin.

The model should still be evaluated for aptness.
It is generally a safe practice not to use the regression-through-the-origin model (4.10) and to use model (2.1) instead.
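A small simulation showing the negative-R² pathology (an assumed setup: Y has a large intercept and an almost flat slope, so forcing the line through the origin fits worse than Ȳ):

```r
# Forcing the fit through the origin can make SSE exceed SSTO.
set.seed(4)
X <- 1:10
Y <- 100 + 0.1 * X + rnorm(10)  # large intercept, nearly flat
fit0 <- lm(Y ~ 0 + X)
SSE  <- sum(resid(fit0)^2)
SSTO <- sum((Y - mean(Y))^2)
1 - SSE / SSTO                   # well below zero for this data
```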



Regression through Origin

Regression through Origin (cont.)

Some statistical packages calculate R² according to (2.72) and can show a negative R²; other packages calculate R² using SSTOU (the uncorrected total sum of squares).
SSTOU avoids a negative coefficient but lacks any meaningful interpretation.



Effects of Measurement Errors

Effects of Measurement Errors

Measurement errors:
In the response variable Y:
no new problems are created when these errors are uncorrelated and unbiased
they are absorbed in the model error term ε
In the predictor variable X, e.g.:
pressure in a tank
temperature in an oven
speed of a production line



Effects of Measurement Errors

Measurement Errors in X

Illustration:
Xi: the true age of the ith employee
Xi*: the age reported by the employee on the employment record
Measurement error δi:

δi = Xi* − Xi

The regression model:

Yi = β0 + β1Xi + εi



Effects of Measurement Errors

Measurement Errors in X (cont.)


We observe only Xi*:

Yi = β0 + β1(Xi* − δi) + εi   (4.23)
⇒ Yi = β0 + β1Xi* + (εi − β1δi)   (4.24)

Xi* is a random variable that is correlated with εi − β1δi:
εi − β1δi is not independent of Xi* (∵ δi = Xi* − Xi)
To examine the dependence, assume the conditions:

E{δi} = 0,  E{εi} = 0,  E{δiεi} = 0
⇒ E{Xi*} = Xi



Effects of Measurement Errors

Measurement Errors in X (cont.)

E{δi} = 0: the reported ages are unbiased estimates of the true ages
E{εi} = 0: the model error term has expectation 0
E{δiεi} = 0: the measurement error δi is not correlated with εi
⇒ σ{δi, εi} = 0
The covariance between Xi* and εi − β1δi is not zero:

E{εi − β1δi} = 0;
σ{Xi*, εi − β1δi} = E{δiεi − β1δi²} = −β1σ²{δi}



Effects of Measurement Errors

Measurement Errors in X (cont.)

Assume (Y, X*) ∼ bivariate normal distribution.
The conditional distributions of the Yi given Xi* are normal and independent, with:

Conditional mean: E{Yi|Xi*} = β0* + β1*Xi*
Conditional variance: σ²{Y|X*}
⇒ β1* = β1[σX²/(σX² + σδ²)] < β1

If σδ² is small relative to σX², the bias would be small; otherwise the bias may be substantial.
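A simulation sketch of this attenuation (all parameters here are made up): with σX² = 4 and σδ² = 1, the slope estimate should shrink by the factor 4/(4 + 1) = 0.8.

```r
# Measurement error in X attenuates the estimated slope toward zero.
set.seed(5)
n <- 10000
X     <- rnorm(n, sd = 2)   # true predictor: sigma_X^2 = 4
delta <- rnorm(n, sd = 1)   # measurement error: sigma_delta^2 = 1
Xstar <- X + delta          # observed X*
Y <- 3 + 2 * X + rnorm(n)   # true slope beta1 = 2
coef(lm(Y ~ Xstar))[2]      # approx 2 * 4/(4 + 1) = 1.6, not 2
```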



Effects of Measurement Errors

Berkson Model

There is one situation where measurement errors in X are no problem: the Berkson model.
The observation Xi* is a fixed quantity
The unobserved true value Xi is a random variable
The measurement error δi = Xi* − Xi (no constraint on the relation between Xi* and δi)
E{δi} = 0
Model (4.24) is still applicable for the Berkson case:

Yi = β0 + β1Xi* + (εi − β1δi)   (4.28)



Effects of Measurement Errors

Berkson Model (cont.)

E{εi − β1δi} = 0 (∵ E{εi} = 0, E{δi} = 0)
⇒ σ{Xi*, εi − β1δi} = 0 (∵ Xi* is a constant)

Least squares procedures can be applied for the Berkson case, yielding unbiased estimators b0, b1:
The error terms have expectation zero
The predictor variable is a constant, and the error terms are not correlated with it.
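A simulation sketch of the Berkson case (the settings are assumptions for illustration): Xi* is a dial setting fixed by the experimenter, the true Xi varies around it, and the slope estimate stays unbiased.

```r
# Berkson model: X* fixed by design, true X = X* - delta is random.
set.seed(6)
Xstar <- rep(c(100, 200, 300, 400), each = 250)  # fixed target settings
delta <- rnorm(1000, sd = 10)
X <- Xstar - delta                    # unobserved true values
Y <- 5 + 0.3 * X + rnorm(1000, sd = 5)
coef(lm(Y ~ Xstar))[2]                # close to 0.3: no attenuation bias
```

Contrast with the previous simulation: here the regressor Xi* is a constant, so the composite error εi − β1δi is uncorrelated with it.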



4.6 Inverse Predictions

Inverse Predictions

Figure : Scatter Plot and Fitted Regression Line-calibration Example.



4.6 Inverse Predictions

Inverse Predictions (cont.)

Two examples:
A trade association analyst: the selling price Yh(new) of a firm not belonging to the trade association is known ⇒ it is desired to estimate its cost Xh(new)
Y = the amount of decrease in cholesterol level; X = the dosage of a new drug, observed for 50 patients ⇒ for a new patient with decrease Yh(new), estimate the appropriate dosage level Xh(new)



4.6 Inverse Predictions

Inverse Predictions (cont.)


Yi = β0 + β1Xi + εi
Ŷ = b0 + b1X
⇒ X̂h(new) = (Yh(new) − b0)/b1,   b1 ≠ 0

X̂h(new) is the MLE of Xh(new) for normal error regression model (2.1)
Approximate 1 − α confidence limits for Xh(new):

X̂h(new) ± t(1 − α/2; n − 2)s{predX}   (4.32)

s²{predX} = (MSE/b1²)[1 + 1/n + (X̂h(new) − X̄)²/Σ(Xi − X̄)²]



4.6 Inverse Predictions

Inverse Predictions-Example

Medical calibration example: measuring low concentrations of galactose in the blood
12 samples; X: known concentration, at four different levels; Y: measured concentration

n = 12,  b0 = −0.100,  b1 = 1.017,  MSE = 0.0272
Σ(Xi − X̄)² = 135,  s{b1} = 0.0142,  X̄ = 5.500,  Ȳ = 5.492

Ŷ = −0.100 + 1.017X

H0: β1 = 0 is rejected ⇒ conclude β1 ≠ 0
Yh(new) = 6.52 ⇒ X̂h(new) = 6.509 ⇒ 6.13 ≤ Xh(new) ≤ 6.89
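The interval above can be reproduced from the slide's summary statistics; a sketch using only those numbers:

```r
# Inverse prediction (calibration) limits from the galactose example.
n <- 12; b0 <- -0.100; b1 <- 1.017
MSE <- 0.0272; Sxx <- 135; Xbar <- 5.500
Yh.new <- 6.52
Xhat <- (Yh.new - b0) / b1                           # point estimate, ~6.509
s2.predX <- (MSE / b1^2) * (1 + 1/n + (Xhat - Xbar)^2 / Sxx)
Xhat + c(-1, 1) * qt(0.975, n - 2) * sqrt(s2.predX)  # approx (6.13, 6.89)
```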



4.6 Inverse Predictions

Inverse Predictions-Example (cont.)

The inverse prediction problem is also known as the calibration problem.
The approximate confidence interval (4.32) is appropriate if

[t(1 − α/2; n − 2)]² MSE / [b1² Σ(Xi − X̄)²]

is small, say less than 0.1.



4.7 Choice of X Levels

Choice of X Levels

In an experiment, the X levels are under the control of the experimenter.
Among other things, consider:
1 How many levels of X should be investigated?
2 What should the two extreme levels be?
3 How should the other levels of X be spaced?
4 How many observations should be taken at each level of X?
No single answer: different purposes lead to different answers, e.g.:
estimating the intercept
predicting new observations
estimating mean responses
other purposes



4.7 Choice of X Levels

Choice of X Levels (cont.)

The variances developed for regression model (2.1):

σ²{b0} = σ²[1/n + X̄²/Σ(Xi − X̄)²]
σ²{b1} = σ²/Σ(Xi − X̄)²
σ²{Ŷh} = σ²[1/n + (Xh − X̄)²/Σ(Xi − X̄)²]
σ²{pred} = σ²[1 + 1/n + (Xh − X̄)²/Σ(Xi − X̄)²]



4.7 Choice of X Levels

Choice of X Levels (cont.)

To estimate β1: minimize s²{b1} ⇔ maximize Σ(Xi − X̄)² ⇒ use two levels of X at the two extremes, placing half of the observations at each of the two levels.
To estimate E{Yh}: choose the levels so that X̄ = Xh.
Cox's advice on the number of levels:
2 levels: only interested in whether there is an effect and its direction
3 levels: goal is describing the relation and any possible curvature
4 or more levels: further description of the response curve and any potential nonlinearity, such as an asymptotic value
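A quick check of the first point, under an assumed design space (X range [0, 10], n = 10): with the range fixed, the two-extremes design maximizes Σ(Xi − X̄)² and hence minimizes s²{b1}.

```r
# Compare Sxx = sum((X - Xbar)^2) for two designs over the same range.
extreme <- rep(c(0, 10), each = 5)       # half the points at each extreme
spread  <- seq(0, 10, length.out = 10)   # evenly spaced alternative
sum((extreme - mean(extreme))^2)         # 250: maximal Sxx, minimal s^2{b1}
sum((spread - mean(spread))^2)           # about 101.9: larger s^2{b1}
```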

