
Linear models and their mathematical foundations Study sheet 2

Cynthia Huber, Steffen Unkel Winter term 2018/19

Exercise 1:

Weight MPG
1 2620 21.0
2 2875 21.0
3 2320 22.8
4 3215 21.4
5 3440 18.7
6 3460 18.1

An excerpt of data on weight in pounds (1st column) and petrol consumption in miles per
gallon (2nd column) of 32 vehicle variants is given above. To estimate the influence of
weight (Weight) on petrol consumption (MPG), a linear model is fitted. The output obtained
by using the function lm() in R is given below.

Call:
lm(formula = MPG ~ Weight, data = auto)

Residuals:
Min 1Q Median 3Q Max
-4.5432 -2.3647 -0.1252 1.4096 6.8727

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.2851262 1.8776273 19.858 < 2e-16 ***
Weight -0.0053445 0.0005591 -9.559 1.29e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.046 on 30 degrees of freedom


Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10

a) Interpret the estimated parameters.

b) Calculate a 95% confidence interval for the coefficient of weight (β1 ).
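As a numeric sketch of subtask b) (not part of the exercise output), the interval can be computed directly from the estimate and standard error printed above; the quantile t_{0.975, 30} ≈ 2.042 is taken from standard t tables.

```python
# 95% CI for the Weight coefficient, using the values printed by the R output.
beta1_hat = -0.0053445   # estimate for Weight
se = 0.0005591           # its standard error
t_crit = 2.042           # t_{0.975, 30}, from standard tables

lower = beta1_hat - t_crit * se
upper = beta1_hat + t_crit * se
print(f"95% CI for beta1: ({lower:.6f}, {upper:.6f})")
# roughly (-0.0065, -0.0042): zero is not contained in the interval
```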

c) Is there a linear relationship between weight and the petrol consumption (α = 0.05)?
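The decision in c) can likewise be read off the printed output; a minimal sketch, comparing both |t| against the critical value t_{0.975, 30} ≈ 2.042 and the p-value against α:

```python
# Test H0: beta1 = 0 against H1: beta1 != 0 at alpha = 0.05,
# using the t value and p-value printed in the R output above.
t_value = -9.559
p_value = 1.29e-10
t_crit = 2.042   # t_{0.975, 30}
alpha = 0.05

reject_h0 = abs(t_value) > t_crit and p_value < alpha
print(f"|t| = {abs(t_value)} > {t_crit}, p = {p_value} < {alpha}: reject H0 -> {reject_h0}")
```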

d) Why does it make more sense to specify the consumption in gallons per mile than in miles
per gallon? When we fit a linear model with this transformed variable (GPM), we obtain
the following output:

Date: 14 November 2018 Page 1


Call:
lm(formula = GPM ~ Weight, data = auto)

Residuals:
Min 1Q Median 3Q Max
-0.0179837 -0.0044301 0.0009413 0.0045354 0.0116583

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.169e-03 4.695e-03 1.314 0.199
Weight 1.494e-05 1.398e-06 10.685 9.57e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.007616 on 30 degrees of freedom


Multiple R-squared: 0.7919, Adjusted R-squared: 0.785
F-statistic: 114.2 on 1 and 30 DF, p-value: 9.566e-12

e) How do the parameter estimates change if one specifies the weight in kilograms and the
consumption in litres per 100 km (1 kg = 2.2046 American pounds, 1 km = 0.6214 miles,
1 litre = 0.22 gallons)? Derive the general conversion formula for the LS estimates after
the following linear transformations have been applied:

xi → ti = a0 + a1 xi , with a1 ≠ 0 ,
yi → ui = b0 + b1 yi , with b1 ≠ 0 ,

and then calculate the new parameter estimates by means of subtask d). Is it possible to
use the results of 1 a) for this calculation?
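A numeric sketch of the conversion, assuming (as is to be derived in e)) that under t = a0 + a1 x and u = b0 + b1 y the LS estimates transform as slope_new = (b1/a1) · slope_old and intercept_new = b0 + b1 · intercept_old − (b1/a1) · slope_old · a0; the estimates from the GPM output in d) are plugged in:

```python
# Sketch: converting the GPM-on-pounds estimates from d) to a regression of
# litres/100 km on kilograms. The transformation formulas below are an
# assumption to be verified in subtask e).
LB_PER_KG = 2.2046      # 1 kg = 2.2046 lb  ->  x_kg = x_lb / LB_PER_KG
MILES_PER_KM = 0.6214   # 1 km = 0.6214 miles
GAL_PER_LITRE = 0.22    # 1 litre = 0.22 gallons

a0, a1 = 0.0, 1.0 / LB_PER_KG                               # pounds -> kilograms
b0, b1 = 0.0, (1.0 / GAL_PER_LITRE) * MILES_PER_KM * 100.0  # gal/mile -> L/100 km

intercept_old, slope_old = 6.169e-03, 1.494e-05             # from the GPM output

slope_new = (b1 / a1) * slope_old
intercept_new = b0 + b1 * intercept_old - (b1 / a1) * slope_old * a0
print(f"slope ~ {slope_new:.5f} L/100km per kg, intercept ~ {intercept_new:.3f} L/100km")
```

So each additional kilogram is associated with roughly 0.009 litres/100 km more consumption under this sketch.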



Exercise 2:
In case the data are such that the fitted regression line must go through the origin, a simple
linear regression model without an intercept is required, that is, we need a model of the form

yi = β xi + εi , i = 1, . . . , n ,

where the εi's are independent and identically distributed random variables with E(εi) = 0
and Var(εi) = σ².

a) Derive the least squares estimator for β.

b) Let β̃ = (Σi yi)/(Σi xi) and β̌ = (1/n) Σi (yi/xi), where the sums run over i = 1, . . . , n
and xi ≠ 0 for all i, be two estimators of β. Are β̃ and β̌ unbiased estimators of β?

c) The dataset hubble from the R package gamair (load via data(hubble)) contains data
from the Hubble space telescope on distances and velocities of 24 galaxies (see also
?hubble). Fit the following simple linear regression model without intercept to the data:

velocity = β1 distance + ε . (1)

This is essentially what astronomers call Hubble’s Law and β1 is known as Hubble’s
constant. Use the estimated value of β1 to find an approximate value for the age of the
universe.
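The final step of c) is a pure unit conversion: the age is approximately 1/β1 once megaparsecs are converted to kilometres. A minimal sketch, where the value of β1 is a placeholder (the actual slope must come from your own fit to the hubble data):

```python
# Convert a Hubble constant H0 (in km/s per Mpc) into an approximate age of
# the universe via age = 1/H0.
KM_PER_MPC = 3.0857e19            # kilometres in one megaparsec
SECONDS_PER_YEAR = 60 * 60 * 24 * 365.25

def age_in_years(h0_km_s_per_mpc):
    # 1/H0 has units s * Mpc / km; multiplying by km per Mpc gives seconds.
    age_seconds = KM_PER_MPC / h0_km_s_per_mpc
    return age_seconds / SECONDS_PER_YEAR

h0 = 76.58  # PLACEHOLDER value, km/s per Mpc; replace with your fitted beta1
print(f"age ~ {age_in_years(h0) / 1e9:.1f} billion years")
```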

Exercise 3:
Consider the simple regression model

yi = β0 + β1 xi + εi , i = 1, 2, . . . , n

and assume that the assumptions A1–A4, which have been introduced in the lecture, hold
(see slide 3 of the set of slides “Linear models and their mathematical foundations: Simple
linear regression”).

a) Show that the least-squares estimator for (β0 , β1 ) is the same as the maximum-likelihood
estimator.

b) What is the maximum-likelihood estimator for σ 2 ?

c) Show that Var(β̂1) = σ² / Σi (xi − x̄)², where the sum runs over i = 1, . . . , n.

d) Show that Var(β̂0) = σ² (1/n + x̄² / Σi (xi − x̄)²).



Exercise 4:
In the lecture, the bivariate normal regression model with a single stochastic predictor variable
x has been introduced. Now suppose that k stochastic predictors are given and assume that
(y, x1, . . . , xk) = (y, x⊤) is distributed as N_{k+1}(μ, Σ).

a) Derive E(y|x) and define β0 and β1 .

b) Derive Var(y|x).

Hint: Make use of the result that is displayed on slide 14 of the set of slides “Linear models
and their mathematical foundations: Preliminaries”.

