MAS367/MAS467/MAS6003 Linear and Generalised Linear Models 1

Linear and Generalised Linear Models


Exercise Sheet 2

Submit only those questions marked with (*)


Submit your work either in class or by email to kostas@sheffield.ac.uk. The deadline
is Thursday 9th November.

1. (*) A linear model of the form

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 \ln(x_i) + \epsilon_i$$

is fitted to the tractor data. Write down the design matrix X. Perform by hand (and
using the anova command or otherwise in R where possible) the following 2-sided tests:

(a) H0 : β1 = β2 = 0
(b) H0 : β2 = 0
(c) H0 : β2 = 3
(d) H0 : β1 = β2
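
As a rough guide to the R side of this question, each hypothesis can be cast as a comparison between the full model and a restricted model, which anova then tests with an F statistic. The sketch below is only illustrative: the tractor data are not reproduced on this sheet, so the data frame `dat` and its columns `x` and `y` are hypothetical placeholders filled with synthetic values, and the restricted fits `null_a` to `null_d` are names introduced here.

```r
# Placeholder data: synthetic values standing in for the tractor data.
# Replace `dat` with the real data frame before drawing any conclusions.
set.seed(1)
dat <- data.frame(x = seq(1, 10, length.out = 15))
dat$y <- 2 + 0.5 * dat$x + 3 * log(dat$x) + rnorm(15, sd = 0.5)

# Full model and its design matrix (columns: intercept, x, log(x))
full <- lm(y ~ x + log(x), data = dat)
X <- model.matrix(full)

# (a) H0: beta1 = beta2 = 0, i.e. the intercept-only model
null_a <- lm(y ~ 1, data = dat)
test_a <- anova(null_a, full)

# (b) H0: beta2 = 0, i.e. drop log(x)
null_b <- lm(y ~ x, data = dat)
test_b <- anova(null_b, full)

# (c) H0: beta2 = 3, i.e. fix the coefficient of log(x) at 3 via an offset
null_c <- lm(y ~ x + offset(3 * log(x)), data = dat)
test_c <- anova(null_c, full)

# (d) H0: beta1 = beta2, i.e. one common coefficient on x + log(x)
null_d <- lm(y ~ I(x + log(x)), data = dat)
test_d <- anova(null_d, full)
```

Each `test_*` object is an analysis of variance table whose F statistic and p-value can be checked against the hand calculation. For the single-restriction tests (b) to (d), the F statistic should equal the square of the corresponding t statistic.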

2. (*) The following data give the viscosity of a compound with different quantities of oil
and filler.
Oil level      0   0   0   0   0   0  10  10  10  10  10  10
Filler level   0  12  24  36  48  60   0  12  24  36  48  60
Viscosity     26  38  50  76 108 157  17  26  37  53  83 124

Oil level     20  20  20  20  20  20  30  30  30  30  30
Filler level   0  12  24  36  48  60  12  24  36  48  60
Viscosity     13  20  27  37  57  87  15  22  27  41  63

(a) We wish to fit a model to express viscosity as a linear function of oil and filler
levels (with no interactions). Consider the Box-Cox transformations given in the
lecture notes. Describe how the variance of the response changes as a function of
the expected value of the response if the following values of λ (separately) stabilize
the variance of the response (λ = −1, −1/2, 0, 1/2, 1). Is it possible to tell by eye
which of these might be the best transformation for this data set?
(b) Using the boxcox command in R, determine the most suitable transformation from
the Box-Cox power family of transformations. Are any of the λ = −1, −1/2, 0, 1/2, 1
transformations contained within the 95% confidence interval for λ?
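
One possible way to set up part (b) in R is sketched below. It assumes the boxcox function from the MASS package is the one intended, enters the data from the table above, and reads an approximate 95% interval off the profile log-likelihood. The object names `visc`, `fit`, `bc`, `lambda_hat` and `ci` are introduced here for illustration only.

```r
library(MASS)  # provides boxcox()

# Viscosity data as listed in the question (23 observations)
visc <- data.frame(
  oil       = c(rep(c(0, 10, 20), each = 6), rep(30, 5)),
  filler    = c(rep(seq(0, 60, by = 12), 3), seq(12, 60, by = 12)),
  viscosity = c(26, 38, 50, 76, 108, 157,
                17, 26, 37, 53, 83, 124,
                13, 20, 27, 37, 57, 87,
                15, 22, 27, 41, 63)
)

# Additive model in oil and filler, with no interaction
fit <- lm(viscosity ~ oil + filler, data = visc)

# Profile log-likelihood over a grid of lambda values
bc <- boxcox(fit, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Grid value of lambda that maximises the profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum
in_ci <- bc$y > max(bc$y) - qchisq(0.95, df = 1) / 2
ci <- range(bc$x[in_ci])
```

Calling boxcox with `plotit = TRUE` instead draws the profile log-likelihood with the interval marked, which is the easiest way to see which of the candidate values of λ fall inside it.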

3. The stackloss data in library(datasets) in R provides the values of the air flow X1,
cooling water inlet temperature X2, acid concentration X3 and loss of NH3 (in tenths of
a percent) Y for 21 days of operation of a plant oxidising NH3 to HNO3.
Assume that a linear relationship exists between Y and X1, X2, X3 and the interactions
between them (for continuous variables, an interaction means a product of terms). Using
forward, backward and full stepwise selection, explore the best selection of variables
for these data. How much of the total sum of squares does the chosen model account for?
Check other model diagnostics for the model that you choose.
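
A minimal sketch of the three searches is given below, assuming the step function in R is an acceptable tool. Note that step selects by AIC, which may differ from the criterion used in the lecture notes. The object names `null_fit`, `full_fit`, `upper`, `fwd`, `bwd`, `both` and `r2` are introduced here for illustration.

```r
# stackloss ships with R; its columns are
# Air.Flow, Water.Temp, Acid.Conc. and stack.loss
data(stackloss)

# Smallest and largest models considered
null_fit <- lm(stack.loss ~ 1, data = stackloss)
full_fit <- lm(stack.loss ~ Air.Flow * Water.Temp * Acid.Conc.,
               data = stackloss)
upper <- ~ Air.Flow * Water.Temp * Acid.Conc.

# Forward selection, starting from the intercept-only model
fwd <- step(null_fit, scope = list(lower = ~ 1, upper = upper),
            direction = "forward", trace = 0)

# Backward elimination, starting from the model with all interactions
bwd <- step(full_fit, direction = "backward", trace = 0)

# Full stepwise search (terms may be added or dropped at each step)
both <- step(null_fit, scope = list(lower = ~ 1, upper = upper),
             direction = "both", trace = 0)

# Proportion of the total sum of squares explained by the stepwise model
r2 <- summary(both)$r.squared
```

The three searches need not agree, so it is worth comparing `formula(fwd)`, `formula(bwd)` and `formula(both)`. Calling `plot(both)` gives the standard residual diagnostics for the chosen fit.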

4. (*) A discrete random variable has probability distribution given by

$$f(y; \beta) = \frac{k y^2 \beta^{(y+k)}}{(\beta + 3)^{(y+2k)} \sqrt{y + 1}}, \qquad \beta > 0, \quad y = 0, 1, \ldots$$

where k is a known positive constant.

(a) Show that $f(y; \beta)$ can be written in the form given for GLMs in the notes where $\phi$
and $w$ are unity, $\theta = \ln\left(\frac{\beta}{\beta+3}\right)$ and $b(\theta) = -k \ln\left(\frac{\beta}{(\beta+3)^2}\right)$

(b) Express $b(\theta)$ in terms of $\theta$
(c) Hence show that $E(Y) = k(\beta/3 - 1)$
(d) Find an expression in terms of $\beta$ for $\mathrm{var}(Y)$
(e) Show that the canonical link is given by $g(\mu) = \ln\left(\frac{k+\mu}{2k+\mu}\right)$
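
For reference, parts (a) to (e) all lean on the standard exponential-family identities. The block below is a sketch that assumes the notes use the usual parameterisation with dispersion $\phi$ and weight $w$. The function $c(y, \phi)$ is a name introduced here for the term that does not involve $\theta$, and the notes may label it differently.

```latex
\[
  f(y; \theta, \phi) = \exp\left\{ \frac{w\,[\,y\theta - b(\theta)\,]}{\phi} + c(y, \phi) \right\}
\]
\[
  E(Y) = \mu = b'(\theta), \qquad
  \mathrm{var}(Y) = \frac{\phi}{w}\, b''(\theta), \qquad
  g(\mu) = \theta \quad \text{(canonical link), i.e. } g = (b')^{-1}.
\]
```

With $\phi = w = 1$, as in this question, the mean and variance are simply the first and second derivatives of $b(\theta)$, and the canonical link comes from solving $\mu = b'(\theta)$ for $\theta$.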
