
Simple Linear Regression 35

Problems

1. Consider a set of data (xi , yi ), i = 1, 2, · · · , n, and the following two


regression models:

yi = β0 + β1 xi + εi , (i = 1, 2, · · · , n),   Model A
yi = γ0 + γ1 xi + γ2 xi² + εi , (i = 1, 2, · · · , n),   Model B

Suppose both models are fitted to the same data. Show that

SSRes, A ≥ SSRes, B

If higher-order terms are added to Model B, i.e.,

yi = γ0 + γ1 xi + γ2 xi² + γ3 xi³ + · · · + γk xi^k + εi , (i = 1, 2, · · · , n),

show that the inequality SSRes, A ≥ SSRes, B still holds.
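A quick numerical check of the inequality: Model A is a special case of Model B (set the higher-order γ's to zero), so the larger model can never fit worse. The following sketch, on hypothetical data, fits polynomials of increasing degree and confirms that the residual sum of squares is non-increasing:

```python
import numpy as np

# Numerical illustration (not a proof) of SSRes,A >= SSRes,B.
# Data are hypothetical; any data set would do.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)

def ss_res(degree):
    """SSRes after a least-squares polynomial fit of the given degree."""
    coef = np.polyfit(x, y, degree)
    resid = y - np.polyval(coef, x)
    return float(resid @ resid)

# Model A is degree 1; each extra power of x enlarges the model space,
# so the minimized SSRes cannot increase.
ss = [ss_res(k) for k in range(1, 6)]
```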


2. Consider the zero intercept model given by

yi = β1 xi + εi , (i = 1, 2, · · · , n)

where the εi ’s are independent normal variables with constant variance


σ² . Show that the 100(1 − α)% confidence interval on E(y|x0 ) is given
by

b1 x0 ± t_{α/2, n−1} s √( x0² / Σ_{i=1}^n xi² )

where s = √( Σ_{i=1}^n (yi − b1 xi )² / (n − 1) ) and b1 = ( Σ_{i=1}^n yi xi ) / ( Σ_{i=1}^n xi² ).
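The interval above can be sketched numerically. In this hypothetical example (the data, the slope 1.7, and the point x0 = 5.0 are all illustrative), note the n − 1 degrees of freedom, since the zero-intercept model fits only one parameter:

```python
import numpy as np
from scipy import stats

# Hypothetical zero-intercept data for illustrating the interval.
rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 15)
y = 1.7 * x + rng.normal(scale=0.8, size=x.size)
n = x.size

b1 = np.sum(y * x) / np.sum(x**2)               # zero-intercept LS slope
s = np.sqrt(np.sum((y - b1 * x)**2) / (n - 1))  # n-1 df: one fitted parameter
x0 = 5.0                                        # arbitrary point of interest
half = stats.t.ppf(0.975, n - 1) * s * np.sqrt(x0**2 / np.sum(x**2))
ci = (b1 * x0 - half, b1 * x0 + half)           # 95% CI on E(y|x0)
```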
3. Derive and discuss the (1 − α)100% confidence interval on the slope β1
for the simple linear model with zero intercept.
4. Consider the fixed zero intercept regression model

yi = β1 xi + εi , (i = 1, 2, · · · , n)

The appropriate estimator of σ² is given by

s² = Σ_{i=1}^n (yi − ŷi )² / (n − 1)

Show that s² is an unbiased estimator of σ².
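Before attempting the algebraic proof, unbiasedness can be checked by simulation: over repeated samples, the average of s² should be close to σ². A Monte Carlo sketch (a sanity check, not a proof; the x values, β1 = 2, and σ² = 4 are hypothetical):

```python
import numpy as np

# Monte Carlo check that s^2 with divisor n-1 is (roughly) unbiased
# for sigma^2 in the zero-intercept model. All numbers are illustrative.
rng = np.random.default_rng(2)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
beta1, sigma2, n = 2.0, 4.0, x.size

reps = 20000
est = np.empty(reps)
for r in range(reps):
    y = beta1 * x + rng.normal(scale=np.sqrt(sigma2), size=n)
    b1 = np.sum(y * x) / np.sum(x**2)            # LS slope for this sample
    est[r] = np.sum((y - b1 * x)**2) / (n - 1)   # one parameter -> n-1 df

mean_s2 = est.mean()   # should be close to sigma2 = 4
```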


Linear Regression Analysis: Theory and Computing

Table 2.10 Data for Two Parallel
Regression Lines

x            y
x1           y1
...          ...
xn1          yn1
xn1+1        yn1+1
...          ...
xn1+n2       yn1+n2

5. Consider a situation in which the regression data set is divided into two
parts as shown in Table 2.10.
The regression model is given by

     ⎧ β0(1) + β1 xi + εi ,   i = 1, 2, · · · , n1 ;
yi = ⎨
     ⎩ β0(2) + β1 xi + εi ,   i = n1 + 1, · · · , n1 + n2 .

In other words, there are two regression lines with common slope. Using
the centered regression model

     ⎧ β0(1∗) + β1 (xi − x̄1 ) + εi ,   i = 1, 2, · · · , n1 ;
yi = ⎨
     ⎩ β0(2∗) + β1 (xi − x̄2 ) + εi ,   i = n1 + 1, · · · , n1 + n2 ,

where x̄1 = Σ_{i=1}^{n1} xi /n1 and x̄2 = Σ_{i=n1+1}^{n1+n2} xi /n2 . Show that the least
squares estimate of β1 is given by

b1 = [ Σ_{i=1}^{n1} (xi − x̄1 )yi + Σ_{i=n1+1}^{n1+n2} (xi − x̄2 )yi ] / [ Σ_{i=1}^{n1} (xi − x̄1 )² + Σ_{i=n1+1}^{n1+n2} (xi − x̄2 )² ]
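The estimate pools the within-group centered cross-products. A sketch on two hypothetical groups that share slope 2 but have different intercepts (centering each group about its own mean removes the intercepts from the slope calculation):

```python
import numpy as np

# Hypothetical two-group data with a common slope of 2.
rng = np.random.default_rng(3)
x1 = np.linspace(0.0, 5.0, 12); y1 = 1.0 + 2.0 * x1 + rng.normal(0.0, 0.3, 12)
x2 = np.linspace(3.0, 9.0, 10); y2 = 4.0 + 2.0 * x2 + rng.normal(0.0, 0.3, 10)

c1 = x1 - x1.mean()   # center each group about its own mean
c2 = x2 - x2.mean()
# Pooled common-slope estimate from the formula above.
b1 = (c1 @ y1 + c2 @ y2) / (c1 @ c1 + c2 @ c2)
```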

6. Consider two simple linear models


Y1j = α1 + β1 x1j + ε1j , j = 1, 2, · · · , n1
and
Y2j = α2 + β2 x2j + ε2j , j = 1, 2, · · · , n2
Assume that β1 ≠ β2 , so that the two regression lines intersect, and let
x0 be the point on the x-axis at which they intersect. Also assume that
the εij are independent normal variables with common variance σ².
Show that

(a). x0 = (α2 − α1 )/(β1 − β2 ).
(b). Find the maximum likelihood estimate (MLE) of x0 using the
least squares estimators α̂1 , α̂2 , β̂1 , and β̂2 .

(c). Show that the distribution of Z, where


Z = (α̂1 − α̂2 ) + x0 (β̂1 − β̂2 ),
is the normal distribution with mean 0 and variance A²σ², where

A² = ( Σ x1j² − 2x0 Σ x1j + x0² n1 ) / ( n1 Σ (x1j − x̄1 )² )
   + ( Σ x2j² − 2x0 Σ x2j + x0² n2 ) / ( n2 Σ (x2j − x̄2 )² ).

(d). Show that U = N σ̂²/σ² is distributed as χ²(N ), where N = n1 + n2 − 4.

(e). Show that U and Z are independent.

(f). Show that W = Z²/(A²σ̂²) has the F distribution with degrees of
freedom 1 and N .
(g). Let S1² = Σ (x1j − x̄1 )² and S2² = Σ (x2j − x̄2 )². Consider the
quadratic equation in x0 , q(x0 ) = a x0² + 2b x0 + c = 0, i.e.,

[ (β̂1 − β̂2 )² − ( 1/S1² + 1/S2² ) σ̂² Fα,1,N ] x0²
+ 2 [ (α̂1 − α̂2 )(β̂1 − β̂2 ) + ( x̄1 /S1² + x̄2 /S2² ) σ̂² Fα,1,N ] x0
+ [ (α̂1 − α̂2 )² − ( Σ x1j² /(n1 S1² ) + Σ x2j² /(n2 S2² ) ) σ̂² Fα,1,N ] = 0.

Show that if a > 0 and b² − ac ≥ 0, then a (1 − α)100% confidence
interval on x0 is

( −b − √(b² − ac) ) / a ≤ x0 ≤ ( −b + √(b² − ac) ) / a.
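This is a Fieller-type interval: it inverts the F test of part (f) rather than linearizing the ratio. The computation in part (g) can be sketched end to end on two hypothetical intersecting lines (all data and coefficients below are illustrative, chosen so the true intersection is at x0 = 4):

```python
import numpy as np
from scipy import stats

# Two hypothetical lines: y = 1 + 1.5x and y = 5 + 0.5x, intersecting at x0 = 4.
rng = np.random.default_rng(4)
x1 = np.linspace(0.0, 10.0, 20); y1 = 1.0 + 1.5 * x1 + rng.normal(0.0, 0.5, 20)
x2 = np.linspace(0.0, 10.0, 20); y2 = 5.0 + 0.5 * x2 + rng.normal(0.0, 0.5, 20)
n1, n2 = x1.size, x2.size

def ls(x, y):
    """Least-squares (intercept, slope) for a simple linear fit."""
    b = np.sum((x - x.mean()) * y) / np.sum((x - x.mean())**2)
    return y.mean() - b * x.mean(), b

a1, b1 = ls(x1, y1)
a2, b2 = ls(x2, y2)
S1 = np.sum((x1 - x1.mean())**2)
S2 = np.sum((x2 - x2.mean())**2)
N = n1 + n2 - 4                       # pooled residual degrees of freedom
s2 = (np.sum((y1 - a1 - b1 * x1)**2) +
      np.sum((y2 - a2 - b2 * x2)**2)) / N
F = stats.f.ppf(0.95, 1, N)

# Coefficients of q(x0) = a*x0^2 + 2*b*x0 + c from part (g).
a = (b1 - b2)**2 - (1.0/S1 + 1.0/S2) * s2 * F
b = (a1 - a2) * (b1 - b2) + (x1.mean()/S1 + x2.mean()/S2) * s2 * F
c = (a1 - a2)**2 - (np.sum(x1**2)/(n1*S1) + np.sum(x2**2)/(n2*S2)) * s2 * F

disc = b * b - a * c
lo, hi = (-b - np.sqrt(disc)) / a, (-b + np.sqrt(disc)) / a   # 95% CI on x0
```

Note that when a > 0 (i.e., the slope difference is significant at level α), the discriminant is automatically positive, because q evaluated at the point estimate x̂0 is negative.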
7. Observations on the yield of a chemical reaction taken at various tem-
peratures were recorded in Table 2.11:
(a). Fit a simple linear regression and estimate β0 and β1 using the
least squares method.
(b). Compute 95% confidence intervals on E(y|x) at the 4 temperature
levels in the data. Plot the upper and lower confidence limits
around the regression line.

Table 2.11 Chemical Reaction Data

Temperature (°C)    Yield of chemical reaction (%)

150 77.4
150 77.4
150 77.4
150 77.4
150 77.4
150 77.4
150 77.4
150 77.4
150 77.4
150 77.4
150 77.4
150 77.4

Data Source: Raymond H. Myers, Classical and Modern
Regression Analysis with Applications, p. 77.

(c). Plot a 95% confidence band around the regression line, on the
same graph as in part (b), and comment on it.
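Parts (b) and (c) both rest on the pointwise interval ŷ0 ± t_{α/2, n−2} s √(1/n + (x0 − x̄)²/Sxx). A sketch of the computation (the data below are hypothetical stand-ins with four temperature levels; substitute the Table 2.11 values):

```python
import numpy as np
from scipy import stats

# Hypothetical data with four temperature levels, three runs each.
rng = np.random.default_rng(5)
x = np.repeat([150.0, 200.0, 250.0, 300.0], 3)
y = 62.0 + 0.1 * x + rng.normal(0.0, 0.7, x.size)
n = x.size

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * y) / Sxx          # LS slope
b0 = y.mean() - b1 * x.mean()                  # LS intercept
s = np.sqrt(np.sum((y - b0 - b1 * x)**2) / (n - 2))
t = stats.t.ppf(0.975, n - 2)

grid = np.unique(x)                            # the four levels
half = t * s * np.sqrt(1.0/n + (grid - x.mean())**2 / Sxx)
lower = b0 + b1 * grid - half                  # plot these two curves
upper = b0 + b1 * grid + half                  # around the fitted line
```

The band is narrowest at x = x̄ and widens as x0 moves away from the mean, which the plot in part (c) should make visible.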
8. The study “Development of LIFETEST, a Dynamic Technique to As-
sess Individual Capability to Lift Material” was conducted at Virginia
Polytechnic Institute and State University in 1982 to determine whether
certain static arm strength measures influence the “dynamic lift” char-
acteristics of individuals. Twenty-five individuals were subjected to
strength tests and then asked to perform a weight-lifting test in which
weight was dynamically lifted overhead. The data are in Table 2.12:
(a). Find the linear regression line using the least squares method.
(b). Define the joint hypothesis H0 : β0 = 0, β1 = 2.2. Test this
hypothesis using a 95% joint confidence region for β0 and β1 ,
and draw your conclusion.
(c). Calculate the studentized residuals for the regression model. Plot
the studentized residuals against x and comment on the plot.
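For part (c), the internally studentized residual divides each raw residual by its estimated standard deviation, which depends on the point's leverage. A sketch of the computation, using a small subset of the Table 2.12 observations for illustration:

```python
import numpy as np

# Internally studentized residuals for a simple linear fit.
# x, y are a subset of the Table 2.12 (Arm Strength, Dynamic Lift) pairs.
x = np.array([17.3, 19.5, 22.9, 26.4, 28.1, 31.3, 36.0, 44.3])
y = np.array([71.4, 48.3, 91.7, 73.3, 68.3, 85.0, 88.3, 100.0])
n = x.size

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * y) / Sxx
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)
s2 = np.sum(resid**2) / (n - 2)

# Leverage h_i; residuals at high-leverage x values have smaller variance,
# so dividing by sqrt(s2 * (1 - h_i)) puts them on a common scale.
h = 1.0/n + (x - x.mean())**2 / Sxx
r = resid / np.sqrt(s2 * (1.0 - h))   # plot r against x for part (c)
```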

Table 2.12 Weight-lifting Test Data

Individual Arm Strength (x) Dynamic Lift (y)

1 17.3 71.4
2 19.5 48.3
3 19.5 88.3
4 19.7 75.0
5 22.9 91.7
6 23.1 100.0
7 26.4 73.3
8 26.8 65.0
9 27.6 75.0
10 28.1 88.3
11 28.1 68.3
12 28.7 96.7
13 29.0 76.7
14 29.6 78.3
15 29.9 60.0
16 29.9 71.7
17 30.3 85.0
18 31.3 85.0
19 36.0 88.3
20 39.5 100.0
21 40.4 100.0
22 44.3 100.0
23 44.6 91.7
24 50.4 100.0
25 55.9 71.7

Data Source: Raymond H. Myers, Classical and Modern
Regression Analysis with Applications, p. 76.
