Project 4
Part 1: Individual Submission
Full Name:
Student Number:
Instructions
• These problems are to be completed by each student individually. You are allowed and encouraged to discuss these
problems with other members of your hive, but you must write your own solution for submission.
• Write clearly and concisely in a linear fashion. Explain your steps. Do not submit messy scrapwork.
• Always state your final answer in the form of a sentence, including units wherever applicable.
• You must submit your answers by handwriting directly into this template. You can do this either by using
a tablet or by printing the page, writing in your responses and scanning them. Please do not type into this document.
• Your submission must be in the form of a single pdf, with the pages in the correct order. If you use the
print+scan method, you can use a scanner app on your smartphone to scan all pages into a single pdf, in the correct
order. Solutions submitted incorrectly will not be marked.
• Please write your Student ID number at the top of every page.
• Upload every single page, even if you don’t write anything on some of the pages.
Please handwrite the sequence of short sentences as given in the instructions booklet.
Instructions here assume that you are using Excel. If you wish to use different software, such as R or MATLAB,
then you may do so. In this case, please indicate that you used different software and explain your steps clearly
enough that another student could easily follow them.
Recall from your Week 21 Tutorial that the line of best fit for the data set $(x_1, y_1), \ldots, (x_n, y_n)$ is the line $y = mx + b$
such that the function $f(m, b) = \sum_{i=1}^{n} \left(y_i - (m x_i + b)\right)^2$ is minimized.
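The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `sum_squared_residuals` and the small data set are hypothetical and are not the Snoozeville data.

```python
def sum_squared_residuals(m, b, xs, ys):
    """Return f(m, b): the sum of (y_i - (m*x_i + b))**2 over all data points."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data for illustration only (not the Snoozeville table).
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]  # these points lie exactly on y = 2x + 1

# The true line leaves no vertical error at any point.
exact_fit = sum_squared_residuals(2, 1, xs, ys)

# A line with the wrong slope leaves residuals 0, 1, 2, 3.
worse_fit = sum_squared_residuals(1, 1, xs, ys)
```

The line of best fit is the pair (m, b) that makes this value as small as possible; any other line, such as the second one here, gives a larger sum.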
1. The people of the town of Snoozeville want to know if there’s a link between nighttime traffic and their quality
of sleep. They record the average number of cars on the road in Snoozeville at night and compare that to the
average nightly sleep that citizens of Snoozeville get in hours. They provide you with the following data:
(a) Plot the five data points on the grid below. (Please sketch the points only; do not connect the dots.)
[Blank grid: horizontal axis "Average nightly traffic x (cars)", 0 to 50; vertical axis "Average nightly sleep y (hours)".]
(b) The least squares line of best fit is found by minimizing the function f (m, b) which is the sum of the
squares of the vertical distances between the line y = mx + b and each data point. For the Snoozeville
dataset, please write out this function below, including all five terms in your sum.
$f(m, b) = \sum_{i=1}^{5} \left(y_i - (m x_i + b)\right)^2$
$= (8 - b)^2 + [\text{illegible term}] + (7.6 - 20m - b)^2 + (7 - 30m - b)^2 + (6.4 - 50m - b)^2$
(c) On scrap paper, find and simplify the two partial derivatives of f (m, b), then record your results below.
$f_b(m, b) = -74 + 220m + 10b$
MAT133 Student ID:
(d) The function f (m, b) has one critical point. Find that critical point and use the second derivative test to
classify it as a local minimum.
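Setting $f_m = 0$ and $f_b = 0$:
$-1536 + 7800m + 220b = 0$
$-74 + 220m + 10b = 0$
Multiplying the second equation by 22 gives $-1628 + 4840m + 220b = 0$. Subtracting this from the first equation:
$92 + 2960m = 0$, so $m \approx -0.031$
$b = \dfrac{1536 - 7800m}{220} \approx 8.08$
[Remaining handwritten work illegible.]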
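The critical point and its classification can be double-checked numerically. The sketch below assumes the two partial derivatives are $f_m = -1536 + 7800m + 220b$ and $f_b = -74 + 220m + 10b$; the coefficients come from the handwritten work, but the signs are reconstructed and should be treated as an assumption. It solves the 2×2 system with Cramer's rule and then applies the second derivative test.

```python
# System f_m = 0, f_b = 0, rearranged (signs are an assumption, see above):
#   7800*m + 220*b = 1536
#    220*m +  10*b =   74
a11, a12, c1 = 7800.0, 220.0, 1536.0
a21, a22, c2 = 220.0, 10.0, 74.0

# Cramer's rule for a 2x2 linear system.
det = a11 * a22 - a12 * a21
m = (c1 * a22 - a12 * c2) / det
b = (a11 * c2 - a21 * c1) / det

# Second derivative test: the second partials are the constant coefficients
# f_mm = 7800, f_bb = 10, f_mb = 220.
f_mm, f_bb, f_mb = a11, a22, a12
D = f_mm * f_bb - f_mb ** 2
is_local_min = D > 0 and f_mm > 0

print(f"m = {m:.4f}, b = {b:.4f}, D = {D:.0f}, local min: {is_local_min}")
```

Under these assumptions D is positive and f_mm is positive, so the critical point is a local minimum, and the rounded values agree with m ≈ -0.031 and b ≈ 8.08.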
(e) The local minimum you found is also a global minimum. Hence, the function $f(m, b)$ is minimized when
$m = -0.031$ hours per car and $b = 8.08$ average hours of sleep.
(f) If the line of best fit is an accurate model of this data, how much sleep would you expect Snoozeville
residents to get on average on a night with 300 cars on the road? Use your result to comment on whether
the line of best fit is a reasonable way to predict nightly sleep in this case.
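As a worked check of part (f), the values recorded in part (e) can be substituted into the line. This assumes the slope is negative (m ≈ -0.031 hours per car), which is what the reconstructed partial derivatives imply, and b ≈ 8.08 hours.

```latex
\begin{align*}
y &= mx + b \\
  &\approx (-0.031)(300) + 8.08 \\
  &= -9.3 + 8.08 \\
  &= -1.22 \text{ hours}
\end{align*}
```

A negative amount of sleep is impossible. The value x = 300 lies far outside the 0 to 50 car range shown on the grid in part (a), so under these assumptions the line is not a reasonable predictor that far from the observed data.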
2. Your argument from Question 1 can be generalized to derive a formula for the least-squares line of best fit for
any data set. Please read this argument in the Focus on Theory section of Chapter 8. However, note that the
authors did not explain why the critical point corresponds to a minimum! Use the second derivative test to
check that f has a local minimum at this critical point. Use the abbreviated notation SY, SX, SXY, and SXX
in your calculations and explanation. You may assume that $(SX)^2 < n \cdot SXX$.
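The general argument can be sketched in Python. The meanings of the abbreviations are an assumption here, since the Focus on Theory section is not reproduced: SX = Σx_i, SY = Σy_i, SXY = Σx_i·y_i and SXX = Σx_i². The function name `fit_line_from_sums` and the sample data are hypothetical.

```python
def fit_line_from_sums(xs, ys):
    """Least-squares slope and intercept from the sums SX, SY, SXY, SXX,
    plus the second derivative test at the critical point."""
    n = len(xs)
    SX = sum(xs)
    SY = sum(ys)
    SXY = sum(x * y for x, y in zip(xs, ys))
    SXX = sum(x * x for x in xs)

    denom = n * SXX - SX ** 2  # positive exactly when (SX)^2 < n*SXX
    if denom <= 0:
        raise ValueError("need (SX)^2 < n*SXX (x-values must not all be equal)")

    # Critical point of f(m, b), from solving f_m = 0 and f_b = 0.
    m = (n * SXY - SX * SY) / denom
    b = (SY - m * SX) / n

    # Second partials of f: f_mm = 2*SXX, f_bb = 2*n, f_mb = 2*SX.
    f_mm, f_bb, f_mb = 2 * SXX, 2 * n, 2 * SX
    D = f_mm * f_bb - f_mb ** 2  # equals 4 * denom
    is_local_min = D > 0 and f_mm > 0
    return m, b, D, is_local_min


# Hypothetical data lying exactly on y = 2x + 1.
m, b, D, is_local_min = fit_line_from_sums([0, 1, 2, 3], [1, 3, 5, 7])
```

Because D = 4(n·SXX − (SX)²), the assumption (SX)² < n·SXX is precisely what makes D positive, and f_mm = 2·SXX is positive whenever some x_i is nonzero.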
https://www.kaggle.com/datasets/andrewsundberg/college-basketball-dataset?resource=download
(b) Create a scatterplot of your data. Then, add a line of best fit. In Excel, this can be done by adding a
trendline to your scatterplot. In the space below, sketch a graph of the line of best fit for your data. Make
sure to properly label your axes.
(c) Find the line of best fit for your data. In Excel, this can be done using the LINEST function.
The line of best fit for my data is $y = 0.6x + 24.733$.
(d) Explain in plain language what the coefficients m and b tell you about the relationship between your two
variables. What do you find interesting about this relationship?