
The University of British Columbia

Irving K. Barber School of Arts and Sciences


DATA 101
Lab Assignment 4
Date: Due on Nov 5. Submit your answers to Canvas.
Demonstration. The TA will go through a set of examples with you.
1. Source the list in coursegrades.R into an R session. This list consists of final grades in
a number of courses.
(a) How many list elements are there?
(b) Use lapply to calculate the mean grade for each course, and then compute the
median grade for each course. Also, find the number of student grades in each
course. Which class is smallest?
Course1 only has 18 students.
(c) Compare the result of the previous part with what happens when you use vapply as
in
(The use of numeric here is to tell the function that the output will be in the form
of vectors of length 1).
(d) Use lapply and vapply to see what the output looks like when FUN = summary. (In
vapply, the output will be in vectors of length 6, so you need to use numeric(6)
this time.)
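
The difference between lapply and vapply in parts (b) to (d) can be sketched on a small stand-in list. The list `grades` and its values below are made up for illustration (the real data comes from coursegrades.R). Only base R functions are used.

```r
# Hypothetical stand-in for the list defined in coursegrades.R
grades <- list(
  Course1 = c(55, 62, 71, 80),
  Course2 = c(48, 66, 74, 85, 90),
  Course3 = c(70, 75, 88)
)

length(grades)                 # number of list elements

# lapply always returns a list, one element per course
mean_list <- lapply(grades, mean)

# vapply returns an atomic vector; numeric(1) declares that each
# result must be a numeric vector of length 1
mean_vec <- vapply(grades, mean, numeric(1))
size_vec <- vapply(grades, length, integer(1))
names(which.min(size_vec))     # name of the smallest class

# summary() on a numeric vector gives 6 values, so the template is numeric(6);
# the result is a matrix with one column per course
summary_mat <- vapply(grades, summary, numeric(6))
```

Because vapply checks each result against the declared template, it stops with an error if a result has an unexpected type or length, whereas lapply returns whatever it gets.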
2. Consider the data in p2.10 in the MPV package. This data relates measurements of
blood pressure taken for males of varying weights.
(a) Construct a scatterplot which relates pressure to weight.
(b) Fit a simple regression model to the data which can be used to predict pressure from
weight. In particular, write down the slope and intercept estimates for the best fit
line.
(c) Overlay the scatterplot with the best-fit line obtained in the previous part.
(d) Find all of the fitted values.
(e) Find all of the residuals.
(f) Predict the blood pressure when the weight is 150. Also, calculate a 95%
prediction interval for this. Are you extrapolating?
(g) Predict the blood pressure when the weight is 110. Also, calculate a 95% prediction
interval for this. Are you extrapolating?
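
The steps in this question follow a common pattern in base R, sketched below on made-up data. The data frame `dat`, its column names, and the simulated values are assumptions for illustration only. In the lab, substitute the p2.10 data frame and its actual column names.

```r
# Made-up data for illustration only; in the lab, use the p2.10 data frame
# from the MPV package and its actual column names
set.seed(1)
dat <- data.frame(weight = seq(130, 230, by = 10))
dat$pressure <- 70 + 0.4 * dat$weight + rnorm(nrow(dat), sd = 5)

plot(pressure ~ weight, data = dat)        # (a) scatterplot
fit <- lm(pressure ~ weight, data = dat)   # (b) simple linear regression
coef(fit)                                  #     intercept and slope estimates
abline(fit)                                # (c) overlay the fitted line

fitted(fit)                                # (d) fitted values
resid(fit)                                 # (e) residuals

# (f), (g) point prediction with a 95% prediction interval
new_obs <- data.frame(weight = 150)
pred <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)
pred                                       # columns: fit, lwr, upr
```

The column name in `newdata` must match the predictor name used in the model formula. Otherwise predict() cannot find the variable.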

Exercises for Submission
In each question below, write out (or type) the required lines of R code, if needed, together
with the answer to the question. Total: 30 points (2 points for each question part).

1. Suppose a simple regression model has been fit to data relating next month’s change in
unemployment rate (in percent), E, to this month’s change in GDP (in percent), G:
E = 0.005 − 0.011G. Use this model to predict next month’s unemployment rate, assuming
this month’s rate is 4.582%, supposing the GDP
(a) increased by 1% this month.
(b) decreased by 1% this month.
(c) stayed the same this month.
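
One way to organize this calculation is sketched below. The function names `predict_change` and `predict_rate` are hypothetical, and the sketch assumes that E is a change in percentage points that is added to this month's rate. The GDP change of 0.5% is an illustrative value, not one of the cases above.

```r
# Predicted change in unemployment rate (in percent) for a GDP change G (in percent)
predict_change <- function(G) 0.005 - 0.011 * G

# Next month's rate = this month's rate + predicted change
# (assumes E is an additive change in percentage points)
predict_rate <- function(current_rate, G) current_rate + predict_change(G)

# Illustrative value not taken from the exercise: GDP up by 0.5%
predict_rate(4.582, 0.5)
```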
2. Consider the data in p2.16 in the MPV package. This data relates measurements of air
pressure taken in a tank with various volumes of a liquid.
(a) Construct a scatterplot which relates pressure to volume.
(b) Fit a simple regression model to the data which can be used to predict pressure from
volume. In particular, write down the slope and intercept estimates for the best fit
line.
(c) Overlay the scatterplot with the best-fit line obtained in the previous part.
(d) Find all of the fitted values.
(e) Find all of the residuals.
(f) Predict the pressure when the liquid volume is 2100. Also, calculate a 95% prediction
interval for this. Are you extrapolating?
(g) Predict the pressure when the liquid volume is 1100. Also, calculate a 95% prediction
interval for this. Are you extrapolating?
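
Whether a prediction is an extrapolation can be checked by comparing the new predictor value with the range of observed predictor values. The sketch below uses made-up data. The data frame `tank`, its volume range, and the helper `is_extrapolating` are assumptions for illustration, so the range of the actual p2.16 data must be checked separately.

```r
# Made-up data for illustration; in the exercise, use p2.16 from the MPV package
set.seed(2)
tank <- data.frame(volume = seq(500, 2000, by = 100))
tank$pressure <- 10 + 0.05 * tank$volume + rnorm(nrow(tank), sd = 3)
fit <- lm(pressure ~ volume, data = tank)

# A prediction is an extrapolation when the new predictor value lies
# outside the range of predictor values used to fit the model
is_extrapolating <- function(x_new, x_observed) {
  x_new < min(x_observed) | x_new > max(x_observed)
}

new_obs <- data.frame(volume = c(1100, 2100))
cbind(new_obs,
      predict(fit, newdata = new_obs, interval = "prediction"),
      extrapolating = is_extrapolating(new_obs$volume, tank$volume))
```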
3. Refer to the previous question. Perform the following assignment:
(a) Is p2.16.summary a list?
(b) Find the names of the objects in p2.16.summary.
(c) List the contents of p2.16.summary$coefficients.
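
The assignment statement itself is not shown above. Assuming it stores the result of summary() applied to a fitted lm object under the name p2.16.summary, the three parts could be explored as in this sketch. The data and the model object name `tank_fit` are made up for illustration.

```r
# Made-up data; the model object name `tank_fit` is hypothetical
set.seed(3)
tank <- data.frame(volume = seq(500, 2000, by = 100))
tank$pressure <- 10 + 0.05 * tank$volume + rnorm(nrow(tank), sd = 3)
tank_fit <- lm(pressure ~ volume, data = tank)

# Assumed form of the missing assignment
p2.16.summary <- summary(tank_fit)

is.list(p2.16.summary)        # (a) is it a list?
names(p2.16.summary)          # (b) names of its components
p2.16.summary$coefficients    # (c) estimates, std. errors, t values, p-values
```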

4. Continue working with the coursegrades list.


(a) Use lapply and vapply to find the interquartile range (IQR) for each class.
(b) Use lapply and vapply to find the range for each class. (Note that the output is a
2-vector here.)
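
The effect of the output length on the vapply template can be sketched on a small stand-in list. The list `grades` below is hypothetical and does not contain the values from coursegrades.R.

```r
# Hypothetical stand-in for the coursegrades list
grades <- list(
  Course1 = c(55, 62, 71, 80),
  Course2 = c(48, 66, 74, 85, 90)
)

# IQR() returns a single number per course
iqr_list <- lapply(grades, IQR)
iqr_vec  <- vapply(grades, IQR, numeric(1))

# range() returns c(min, max), so the vapply template is numeric(2);
# the result is a 2-row matrix with one column per course
range_list <- lapply(grades, range)
range_mat  <- vapply(grades, range, numeric(2))
```

With lapply the two range values stay together as one list element per course, while vapply arranges them as the rows of a matrix.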
