MAT133: Calculus and Linear Algebra for Commerce

Project 4
Part 1: Individual Submission

Full Name:

UofT Email Address:

Student Number:

Instructions
• These problems are to be completed by each student individually. You are allowed and encouraged to discuss these
problems with other members of your hive, but you must write your own solution for submission.
• Write clearly and concisely in a linear fashion. Explain your steps. Do not submit messy scrapwork.
• Always state your final answer in the form of a sentence, including units wherever applicable.
• You must submit your answers by handwriting directly into this template. You can do this either by using
a tablet or by printing the page, writing in your responses and scanning them. Please do not type into this document.
• Your submission must be in the form of a single pdf, with the pages in the correct order. If you use the
print+scan method, you can use a scanner app on your smartphone to scan all pages into a single pdf, in the correct
order. Solutions submitted incorrectly will not be marked.
• Please write your Student ID number at the top of every page.
• Upload every single page, even if you don’t write anything on some of the pages.

Examples of incorrect submissions


• Typed solutions will not be marked.
• Raw photos submitted without the use of a scanner app will result in a 20% penalty. Scanner apps improve contrast,
ensure that the pages are upright and cropped properly so that we can read them.
• Incorrectly ordered pages will result in a 20% penalty.
• Writing on blank paper or altering the template cannot be marked and will result in a mark of zero.
• Submissions after the deadline will not be accepted.
• Submissions via email will not be accepted.
• Submissions without the academic integrity statement will receive a mark of 0.

Academic Integrity Statement


After reviewing the Project Instructions Booklet, please handwrite the abbreviated statement in the box.

Please handwrite the sequence of short sentences as given in the instructions booklet.

Student ID Number Signature Submission Date


MAT133 Student ID:
In this individual portion of the project, you will use multivariable calculus to investigate the line of best fit for a
data set that consists of one independent variable and one dependent variable. In the pod portion of the project,
you will find a plane of best fit for a data set that consists of two independent variables and one dependent variable.

Instructions here assume that you are using Excel. If you wish to use different software, such as R or MATLAB,
then you may do so. In this case, please indicate that you used a different software and explain your steps clearly
enough that another student could easily follow them.

Recall from your Week 21 Tutorial that the line of best fit for the data set $(x_1, y_1), \ldots, (x_n, y_n)$ is the line $y = mx + b$ such that the function

$$f(m, b) = \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr)^2$$

is minimized.
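
The objective function above can be evaluated directly for any candidate line. The following is a minimal Python sketch, not part of the course materials; the function name `sum_squared_errors` and the three sample points are hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
def sum_squared_errors(m, b, xs, ys):
    """Return f(m, b): the sum of squared vertical distances to y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical points that lie exactly on the line y = 2x + 1.
xs = [0, 1, 2]
ys = [1, 3, 5]

perfect_fit = sum_squared_errors(2, 1, xs, ys)  # the line passes through every point
poor_fit = sum_squared_errors(0, 0, xs, ys)     # the line y = 0 misses every point
print(perfect_fit, poor_fit)
```

A smaller value of f(m, b) means the line sits closer to the data, which is why the line of best fit is defined as the minimizer.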

1. The people of the town of Snoozeville want to know if there’s a link between nighttime traffic and their quality
of sleep. They record the average number of cars on the road in Snoozeville at night and compare that to the
average nightly sleep that citizens of Snoozeville get in hours. They provide you with the following data:

Average nightly traffic x (cars)    0     10     20     30     50
Average nightly sleep y (hours)     8     7.7    7.6    7.3    6.4

(a) Plot the five data points on the grid below. (Please sketch the points only; do not connect the dots.)
[Grid: average nightly sleep y (hours), from 6 to 8, against average nightly traffic x (cars), from 0 to 50]

(b) The least squares line of best fit is found by minimizing the function f (m, b) which is the sum of the
squares of the vertical distances between the line y = mx + b and each data point. For the Snoozeville
dataset, please write out this function below, including all five terms in your sum.

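$$f(m, b) = \sum_{i=1}^{5} \bigl(y_i - (m x_i + b)\bigr)^2$$

$$f(m, b) = (8 - b)^2 + \bigl(7.7 - (10m + b)\bigr)^2 + \bigl(7.6 - (20m + b)\bigr)^2 + \bigl(7.3 - (30m + b)\bigr)^2 + \bigl(6.4 - (50m + b)\bigr)^2$$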

(c) On scrap paper, find and simplify the two partial derivatives of f (m, b), then record your results below.

$f_m(m, b) = -1536 + 7800m + 220b$

$f_b(m, b) = -74 + 220m + 10b$
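
The two partial derivatives in part (c) can be cross-checked numerically. The sketch below is a rough Python check, not the method the worksheet asks for. It assumes the shorthand SX, SY, SXX and SXY stands for the sums of x_i, y_i, x_i^2 and x_i*y_i, and it compares the resulting formulas f_m = -2 SXY + 2 SXX m + 2 SX b and f_b = -2 SY + 2 SX m + 2 n b against central-difference estimates at an arbitrary test point (m0, b0).

```python
xs = [0, 10, 20, 30, 50]        # average nightly traffic (cars)
ys = [8, 7.7, 7.6, 7.3, 6.4]    # average nightly sleep (hours)

n = len(xs)
SX = sum(xs)
SY = sum(ys)
SXX = sum(x * x for x in xs)
SXY = sum(x * y for x, y in zip(xs, ys))


def f(m, b):
    """Sum of squared vertical distances for the Snoozeville data."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


def f_m(m, b):
    """Partial derivative of f with respect to m."""
    return -2 * SXY + 2 * SXX * m + 2 * SX * b


def f_b(m, b):
    """Partial derivative of f with respect to b."""
    return -2 * SY + 2 * SX * m + 2 * n * b


# Central-difference estimates at an arbitrary (hypothetical) test point.
h = 1e-6
m0, b0 = 0.5, 2.0
approx_fm = (f(m0 + h, b0) - f(m0 - h, b0)) / (2 * h)
approx_fb = (f(m0, b0 + h) - f(m0, b0 - h)) / (2 * h)

print("coefficients of f_m:", -2 * SXY, 2 * SXX, 2 * SX)
print("coefficients of f_b:", -2 * SY, 2 * SX, 2 * n)
print("f_m check:", f_m(m0, b0), approx_fm)
print("f_b check:", f_b(m0, b0), approx_fb)
```

Up to floating-point rounding, the printed coefficients should agree with -1536, 7800, 220 for f_m and -74, 220, 10 for f_b.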

(d) The function f (m, b) has one critical point. Find that critical point and use the second derivative test to
classify it as a local minimum.

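Setting $f_b = 0$: $-74 + 220m + 10b = 0$, so $b = 7.4 - 22m$.

Substituting into $f_m = 0$: $-1536 + 7800m + 220(7.4 - 22m) = 0$, so $-1536 + 7800m + 1628 - 4840m = 0$, which gives $92 + 2960m = 0$ and $m \approx -0.031$.

Then $b = 7.4 - 22m \approx 8.08$.

Second derivative test: $f_{mm} = 7800$, $f_{bb} = 10$ and $f_{mb} = 220$, so $D = (7800)(10) - (220)^2 = 29600 > 0$ and $f_{mm} > 0$. The critical point $(m, b) \approx (-0.031, 8.08)$ is therefore a local minimum.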
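
The critical point and its classification can be double-checked with a few lines of arithmetic. This is a minimal Python sketch, assuming the partial derivatives for the Snoozeville data are f_m = -1536 + 7800m + 220b and f_b = -74 + 220m + 10b. It solves the two equations f_m = 0 and f_b = 0 with Cramer's rule, which is one of several equally valid ways to solve a 2-by-2 linear system, and then applies the second derivative test.

```python
# Linear system from setting both partial derivatives to zero:
#   7800 m + 220 b = 1536
#    220 m +  10 b =   74
a11, a12, c1 = 7800, 220, 1536
a21, a22, c2 = 220, 10, 74

det = a11 * a22 - a12 * a21          # determinant of the coefficient matrix
m = (c1 * a22 - a12 * c2) / det      # Cramer's rule for m
b = (a11 * c2 - a21 * c1) / det      # Cramer's rule for b

# Second derivative test: the second partials of f are constants.
f_mm, f_bb, f_mb = 7800, 10, 220
D = f_mm * f_bb - f_mb ** 2
is_local_min = D > 0 and f_mm > 0

print(round(m, 4), round(b, 4), D, is_local_min)
```

Because f is quadratic in m and b, the second partial derivatives do not depend on the point, so the test gives the same verdict everywhere.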
(e) The local minimum you found is also a global minimum. Hence, the function f (m, b) is minimized when

m = -0.031 hours per car and b = 8.08 average hours of sleep.

Therefore, the line of best fit is y = -0.031x + 8.08.

(f) If the line of best fit is an accurate model of this data, how much sleep would you expect Snoozeville
residents to get on average on a night with 300 cars on the road? Use your result to comment on whether
the line of best fit is a reasonable way to predict nightly sleep in this case.

I predict that the residents will get an average of -1.24 hours of sleep. This tells me... that the line of best fit is not an accurate predictor in this case, as sleep can never be a negative value.
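
The extrapolation in part (f) can be reproduced with a short calculation. The Python sketch below assumes the unrounded slope and intercept of the Snoozeville line, m = -92/2960 and b = 7.4 - 22m; the helper name `predicted_sleep` is hypothetical. It also reports the traffic level at which the line predicts zero sleep, to show how far 300 cars lies outside the observed range of 0 to 50 cars.

```python
m = -92 / 2960          # slope in hours of sleep per car
b = 7.4 - 22 * m        # intercept in hours of sleep


def predicted_sleep(cars):
    """Sleep (hours) predicted by the line of best fit for a given traffic level."""
    return m * cars + b


estimate = predicted_sleep(300)     # far outside the observed range of 0 to 50 cars
zero_sleep_traffic = -b / m         # traffic level where the line crosses y = 0

print(round(estimate, 2), round(zero_sleep_traffic, 1))
```

Any traffic level beyond the zero crossing produces a negative prediction, which is the sense in which the linear model breaks down when extrapolated this far.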

2. Your argument from Question 1 can be generalized to derive a formula for the least-squares line of best fit for
any data set. Please read this argument in the Focus on Theory section of Chapter 8. However, note that the
authors did not explain why the critical point corresponds to a minimum! Use the second derivative test to
check that f has a local minimum at this critical point. Use the abbreviated notation SY, SX, SXY, and SXX
in your calculations and explanation. You may assume that (SX)2 < nSXX.

$$b = \frac{(SXX)(SY) - (SX)(SXY)}{n(SXX) - (SX)^2} = \frac{(3900)(37) - (110)(768)}{5(3900) - (110)^2}$$

$$m = \frac{n(SXY) - (SX)(SY)}{n(SXX) - (SX)^2} = \frac{5(768) - (110)(37)}{5(3900) - (110)^2}$$
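
For reference, the second derivative test that Question 2 asks for can be sketched in the abbreviated notation as follows. This is an outline under the assumption that SX, SY, SXX and SXY denote the sums of $x_i$, $y_i$, $x_i^2$ and $x_i y_i$, as in the textbook's Focus on Theory section; it is not a transcription of the handwritten response.

```latex
\begin{align*}
f(m, b) &= \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr)^2 \\
f_m &= -2\,SXY + 2m\,SXX + 2b\,SX, \qquad f_b = -2\,SY + 2m\,SX + 2nb \\
f_{mm} &= 2\,SXX, \qquad f_{bb} = 2n, \qquad f_{mb} = 2\,SX \\
D &= f_{mm} f_{bb} - (f_{mb})^2 = 4n\,SXX - 4(SX)^2 = 4\bigl(n\,SXX - (SX)^2\bigr) > 0
\end{align*}
```

The inequality $D > 0$ uses the assumption $(SX)^2 < n\,SXX$. That same assumption gives $n\,SXX > (SX)^2 \ge 0$, so $SXX > 0$ and $f_{mm} = 2\,SXX > 0$. With $D > 0$ and $f_{mm} > 0$, the critical point is a local minimum.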
3. Choose a data set. Your data set should have two variables, one of which you will think of as an input variable
x and the other as an output variable y. Please choose a data set such that your input variable x is not time.
Make sure you use real and reliable data. While sites like Kaggle contain a lot of real datasets, they also contain
fictitious datasets designed solely to test machine learning algorithms. You should not assume that data posted
on one of the sites in the Appendix is automatically suitable! Instead, look for information on how the data
was collected to verify that the data is real and reliable.
(a) Cite your data set, including a link if possible.

https://www.kaggle.com/datasets/andrewsundberg/college-basketball-dataset?resource=download

(b) Create a scatterplot of your data. Then, add a line of best fit. In Excel, this can be done by adding a
trendline to your scatterplot. In the space below, sketch a graph of the line of best fit for your data. Make
sure to properly label your axes.

(c) Find the line of best fit for your data. In Excel, this can be done using the LINEST function.

The line of best fit for my data is y = 0.6x + 24.733.

(d) Explain in plain language what the coefficients m and b tell you about the relationship between your two
variables. What do you find interesting about this relationship?

m = 0.6, which tells us how many additional college basketball players were drafted in each year. b = 24.733, which is the average number of players at the starting time. The change is quite drastic throughout.
