
Stat 431 Assignment 1 Winter 2017

Due by 1:00pm on Friday, January 20, 2017

Notes for Submission: Upload your assignment directly to Crowdmark via the link you received by email
(let me know if you have not received this email). It is your responsibility to make sure your solution to
each question is submitted in the correct section, that the pages are rotated correctly, and that everything is
legible. Typed solutions are preferred. Be sure to include all R code and relevant output for each question
(where applicable). Once the solution key is posted on Learn, no further late submissions will be accepted.

Question 1
An article in Technometrics (1974, Vol. 16, pp. 523–531) considered the following stack-loss data from a
plant oxidizing ammonia to nitric acid. Twenty-one daily responses of stack loss y (the amount of ammonia
escaping) were measured with air flow x1, temperature x2, and acid concentration x3. R code to input the
data is given below.¹

y = c(42, 37, 37, 28, 18, 18, 19, 20, 15, 14, 14, 13, 11, 12, 8, 7, 8, 8, 9, 15, 15)
x1 = c(80, 80, 75, 62, 62, 62, 62, 62, 58, 58, 58, 58, 58, 58, 50, 50, 50, 50, 50, 56, 70)
x2 = c(27, 27, 25, 24, 22, 23, 24, 24, 23, 18, 18, 17, 18, 19, 18, 18, 19, 19, 20, 20, 20)
x3 = c(89, 88, 90, 87, 87, 87, 93, 93, 87, 80, 89, 88, 82, 93, 89, 86, 72, 79, 80, 82, 91)

(a) Fit a linear regression model relating stack loss to the three regressor variables.
Provide a summary output of your fitted model. Use the model to predict stack loss when x1 = 60,
x2 = 26, and x3 = 85.
(b) Conduct a t-test for the null hypothesis H0 : β3 = 0 at the α = 0.05 level. Show the calculation of the
test statistic and p-value. What conclusion do you draw regarding the relationship between stack
loss and acid concentration?

(c) Calculate a 90% confidence interval for β2 (show your work) and provide a written interpretation of
this regression coefficient.
(d) Conduct a residual analysis of the fitted model using various residual plots. What conclusions do you
draw about the overall fit of the model?
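
The steps in parts (a) to (d) can be sketched in R roughly as follows. This is a minimal sketch, not a model solution: it re-enters the data vectors from above so that it runs on its own, and the object names (fit, new_obs, est, t3, p3, ci_beta2) are illustrative choices rather than anything required by the assignment.

```r
# Stack-loss data, re-entered from the question so the sketch is self-contained
y  <- c(42, 37, 37, 28, 18, 18, 19, 20, 15, 14, 14, 13, 11, 12, 8, 7, 8, 8, 9, 15, 15)
x1 <- c(80, 80, 75, 62, 62, 62, 62, 62, 58, 58, 58, 58, 58, 58, 50, 50, 50, 50, 50, 56, 70)
x2 <- c(27, 27, 25, 24, 22, 23, 24, 24, 23, 18, 18, 17, 18, 19, 18, 18, 19, 19, 20, 20, 20)
x3 <- c(89, 88, 90, 87, 87, 87, 93, 93, 87, 80, 89, 88, 82, 93, 89, 86, 72, 79, 80, 82, 91)

# (a) Fit the three-regressor model and predict at a new point
fit <- lm(y ~ x1 + x2 + x3)
print(summary(fit))
new_obs <- data.frame(x1 = 60, x2 = 26, x3 = 85)
pred <- predict(fit, newdata = new_obs)

# (b) t statistic and two-sided p-value for H0: beta3 = 0, computed by hand
est <- coef(summary(fit))              # estimates, standard errors, t values
df_res <- df.residual(fit)             # n - p residual degrees of freedom
t3 <- est["x3", "Estimate"] / est["x3", "Std. Error"]
p3 <- 2 * pt(-abs(t3), df = df_res)

# (c) 90% confidence interval for beta2: estimate +/- t quantile * standard error
ci_beta2 <- est["x2", "Estimate"] +
  c(-1, 1) * qt(0.95, df = df_res) * est["x2", "Std. Error"]

# (d) Two common residual plots: residuals vs fitted values, and a normal Q-Q plot
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(rstandard(fit))
qqline(rstandard(fit))
```

The hand calculations in (b) and (c) can be checked against the t value column of summary(fit) and against confint(fit, "x2", level = 0.90).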

¹ This question is adapted from Montgomery, D. C., & Runger, G. C. (2010). Applied statistics and probability for engineers. John Wiley & Sons.

Question 2
The angle θ at which electrons are emitted in muon decay has a distribution with the density:
$$f(x; \alpha) = \frac{1 + \alpha x}{2}, \qquad -1 \le x \le 1, \quad -1 \le \alpha \le 1$$
where x = cos θ.

(a) Find the likelihood, log-likelihood, score, and information functions for a sample of n independent
observations from this distribution.
(b) Use the Newton-Raphson algorithm to find the maximum likelihood estimate of α for the data given
below. Note: you must code the algorithm yourself instead of using any built-in optimization or root
finding functions.

x = c(0.164747403, 0.106092128, 0.855715027, 0.221426789, 0.177047372, -0.684621760,
0.194327486, 0.745426807, 0.375342389, -0.176311307, 0.604868366, 0.291522420,
0.145012995, -0.682037664, -0.004203192, 0.998613873, 0.334344244, -0.463665374,
0.255391879, -0.308331904, 0.549739806, 0.143395894, 0.660216568, 0.260438615,
0.365576435, -0.988310236, 0.317882172, -0.710406476, -0.805007831, 0.643207268,
-0.256027985, 0.256180027, 0.325371336, 0.072878236, -0.428863335, 0.184964353,
-0.701840279, 0.729145080, -0.191107998, 0.286108217, -0.309805516, -0.451841456,
-0.463702736, 0.045797852, 0.982804115, -0.957954171, 0.985425250, 0.479191423)

(c) Calculate 95% confidence intervals for α using the likelihood ratio, score, and Wald asymptotic results.
Which of the three intervals do you prefer, and why?
(d) Use each of the likelihood ratio, score, and Wald results to test the null hypothesis that α = 0.25.
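
As an illustration of what "coding the algorithm yourself" in part (b) might look like, here is a minimal Newton-Raphson sketch for a single parameter. It assumes the score and information that follow from the density above, namely U(α) = Σ x_i / (1 + α x_i) and I(α) = Σ x_i² / (1 + α x_i)². The sample x_toy is a small hypothetical vector invented for this illustration, not the assignment data, and the function names (score_alpha, info_alpha, newton_raphson) and the starting value α = 0 are likewise assumptions.

```r
# Hypothetical toy sample for illustration only (NOT the assignment data)
x_toy <- c(0.5, -0.2, 0.8, 0.1, -0.6, 0.9)

# Score: first derivative of the log-likelihood sum(log(1 + alpha * x)) - n * log(2)
score_alpha <- function(alpha, x) sum(x / (1 + alpha * x))

# Observed information: negative second derivative of the log-likelihood
info_alpha <- function(alpha, x) sum(x^2 / (1 + alpha * x)^2)

# Newton-Raphson update: alpha_new = alpha_old + U(alpha_old) / I(alpha_old)
newton_raphson <- function(x, alpha0 = 0, tol = 1e-10, max_iter = 100) {
  alpha <- alpha0
  for (iter in seq_len(max_iter)) {
    step <- score_alpha(alpha, x) / info_alpha(alpha, x)
    alpha <- alpha + step
    if (abs(step) < tol) break   # stop once the update is negligible
  }
  list(alpha = alpha, iterations = iter)
}

fit_toy <- newton_raphson(x_toy)
print(fit_toy)
```

The sketch does not guard against an iterate leaving the interval [−1, 1]; for real data it is worth checking that the final estimate respects the constraint on α and that the score is close to zero there.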

Question 3
Suppose that Y is a random variable from the exponential distribution with rate parameter λ > 0 and
probability density function:
$$f(y; \lambda) = \lambda e^{-\lambda y}$$

(a) Show that the distribution of Y is a member of the exponential family by identifying the canonical
parameter, the dispersion parameter, and the functions a(φ), b(θ), c(y; φ).
(b) Obtain an expression for the mean and variance of Y and identify the canonical link.
(c) Suppose $Y_i$, $i = 1, \ldots, n$ are iid and for each $Y_i$ there is a vector of explanatory variables
$x_i = (1, x_{i1}, \ldots, x_{i,p-1})'$. Consider the linear predictor $\eta_i = x_i'\beta$ and the canonical link found in (b).
Find the specific form of the score vector and information matrix for β.
(d) The R code below gives data on y, the time in years until a first claim, for 25 insurance policies and
x, a proprietary measure of risk. Use Newton-Raphson to estimate β = (β0, β1) from an exponential
generalized linear model with the canonical link. Again, you must code your own Newton-Raphson
algorithm rather than relying on any built-in functions in R.

y = c(0.9683, 0.4515, 17.4488, 0.6287, 2.2330, 2.6467, 3.9589, 0.0782, 5.4717, 4.1161,
0.6715, 1.6350, 0.1640, 0.3331, 0.7501, 3.0846, 0.6889, 6.3826, 7.0869, 0.7967,
3.2684, 0.1373, 2.8698, 1.5126, 0.9055)
x = c(0.1036, 2.1824, 0.1745, 2.0089, 1.2317, 0.6166, 0.4675, 3.2074, 0.0277, 1.2962,
0.6812, 0.1946, 1.3291, 0.4381, 0.2984, 0.3018, 0.7928, 0.2021, 1.0280, 0.0121,
1.2043, 2.9322, 1.4526, 0.6444, 0.1849)
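
A multi-parameter Newton-Raphson iteration for part (d) might be organised along the following lines. This is a sketch under stated assumptions, not the required solution. It assumes the canonical parameter is θ = −λ, so that the canonical link sets η_i = −λ_i = −1/μ_i; if your course writes the link as 1/μ, the estimates change sign. The data y_toy and x_toy are hypothetical values made up for the illustration, not the insurance data above. The function names, the intercept-only starting value, and the step-halving safeguard (which keeps every η_i negative) are also illustrative choices.

```r
# Hypothetical toy data for illustration only (NOT the assignment data)
y_toy <- c(1.2, 0.4, 3.1, 0.7, 2.5, 0.3)
x_toy <- c(0.2, 1.5, 0.1, 1.1, 0.4, 2.0)
X_toy <- cbind(1, x_toy)                 # design matrix with an intercept column

# Log-likelihood under eta_i = -lambda_i: sum(y_i * eta_i + log(-eta_i))
loglik_beta <- function(beta, X, y) {
  eta <- drop(X %*% beta)
  if (any(eta >= 0)) return(-Inf)        # outside the parameter space
  sum(y * eta + log(-eta))
}

# Score vector: X' (y - mu), where mu_i = -1 / eta_i
score_beta <- function(beta, X, y) {
  eta <- drop(X %*% beta)
  drop(crossprod(X, y + 1 / eta))
}

# Information matrix: X' W X with weights W_ii = 1 / eta_i^2 = mu_i^2
info_beta <- function(beta, X) {
  eta <- drop(X %*% beta)
  crossprod(X, X / eta^2)
}

fit_exp_glm <- function(X, y, tol = 1e-8, max_iter = 50) {
  # Start at the intercept-only fit, where mu = mean(y) for every observation
  beta <- c(-1 / mean(y), rep(0, ncol(X) - 1))
  for (iter in seq_len(max_iter)) {
    U <- score_beta(beta, X, y)
    if (max(abs(U)) < tol) break
    step <- solve(info_beta(beta, X), U) # Newton direction I^{-1} U
    t <- 1
    # Halve the step while it lowers the log-likelihood or leaves the valid region
    while (loglik_beta(beta + t * step, X, y) < loglik_beta(beta, X, y)) t <- t / 2
    beta <- beta + t * step
  }
  list(beta = beta, iterations = iter)
}

fit_toy <- fit_exp_glm(X_toy, y_toy)
print(fit_toy)
```

A plain Newton-Raphson update corresponds to always taking t = 1. The halving loop is there only because a full step can push some η_i to zero or above, where the log-likelihood is undefined.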