
MAT133: Calculus and Linear Algebra for Commerce

Project 4
Part 1: Individual Submission

Full Name:

UofT Email Address:

Student Number:

Instructions
• These problems are to be completed by each student individually. You are allowed and encouraged to discuss these
problems with other members of your hive, but you must write your own solution for submission.
• Write clearly and concisely in a linear fashion. Explain your steps. Do not submit messy scrapwork.
• Always state your final answer in the form of a sentence, including units wherever applicable.
• You must submit your answers by handwriting directly into this template. You can do this either by using
a tablet or by printing the page, writing in your responses and scanning them. Please do not type into this document.
• Your submission must be in the form of a single pdf, with the pages in the correct order and no
additional pages. If you use the print+scan method, you can use a scanner app on your smartphone to scan all
pages into a single pdf, in the correct order. Solutions submitted incorrectly will receive a grade of zero.
• It is your responsibility to check that the answer boxes in your submission line up with the answer boxes in Gradescope.

Examples of incorrect submissions


• Typed solutions will not be marked and will receive a grade of 0.
• Raw photos will not be marked and will receive a grade of 0. Scanner apps improve contrast and ensure that the pages
are upright and properly cropped so that we can read them.
• Incorrectly ordered pages will not be marked and will receive a grade of 0.
• Writing on blank paper or altering the template will result in a mark of zero, because your submission cannot
be marked.
• Submissions after the deadline will not be accepted.
• Submissions via email will not be accepted.
• Submissions without the academic integrity statement will receive a mark of 0.

Academic Integrity Statement


After reviewing the Project Instructions Booklet, please handwrite the abbreviated statement in the box.

Please handwrite the sequence of short sentences as given in the instructions booklet.

Student ID Number: ______    Signature: ______    Submission Date: ______


MAT133 Student ID:
In this individual portion of the project, you will use multivariable calculus to investigate the line of best fit for a
data set that consists of one independent variable and one dependent variable. In the pod portion of the project,
you will find a plane of best fit for a data set that consists of two independent variables and one dependent variable.

Instructions here assume that you are using Excel. If you wish to use different software, such as R or MATLAB,
you may do so. In that case, please indicate which software you used and explain your steps clearly
enough that another student could easily follow them.

Recall from your Week 21 Tutorial that the line of best fit for the data set $(x_1, y_1), \ldots, (x_n, y_n)$ is the line $y = mx + b$
such that the function $f(m, b) = \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr)^2$ is minimized.
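
As a rough illustration of the definition above, the sketch below evaluates f(m, b) for two candidate lines. The function name `sum_squared_residuals` and the four data points are hypothetical and are not taken from this project; the line with the smaller value of f fits the sample data better in the least-squares sense.

```python
def sum_squared_residuals(m, b, xs, ys):
    """Return f(m, b): the sum of (y_i - (m*x_i + b))^2 over all data points."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

f_line_a = sum_squared_residuals(2.0, 0.0, xs, ys)  # candidate line y = 2x
f_line_b = sum_squared_residuals(1.0, 1.0, xs, ys)  # candidate line y = x + 1

print("f for y = 2x:    ", f_line_a)
print("f for y = x + 1: ", f_line_b)
```

Comparing candidate lines this way only ranks the lines you try; finding the best line overall requires minimizing f over all m and b, which is what the calculus in this project does.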

1. The people of the town of Snoozeville want to know if there’s a link between nighttime traffic and their quality
of sleep. They record the average number of cars on the road in Snoozeville at night and compare that to the
average nightly sleep that citizens of Snoozeville get in hours. They provide you with the following data:

Average nightly traffic x (cars)  |   0 |  10 |  20 |  30 |  50
Average nightly sleep y (hours)   | 8.2 | 7.6 | 7.2 | 6.9 | 6.4

(a) Plot the five data points on the grid below. (Please sketch the points only; do not connect the dots.)
[Blank grid. Horizontal axis: average nightly traffic x (cars), from 0 to 50. Vertical axis: average nightly sleep y (hours), with ticks at 6 and 8.]

(b) The least squares line of best fit is found by minimizing the function f(m, b), which is the sum of the
squares of the vertical distances between the line y = mx + b and each data point. For the Snoozeville
dataset, please write out this function below, including all five terms in your sum.

f(m, b) =

(c) In order to find the minimum value of f(m, b), it will be helpful to know the partial derivatives. On scrap
paper, find and simplify the two partial derivatives of f(m, b), then record your results below.

f_m(m, b) =

f_b(m, b) =

(d) The function f(m, b) has one critical point. Find that critical point and use the second derivative test to
verify that it is a local minimum.

(e) The local minimum you found is also a global minimum. Hence, the function f(m, b) is minimized when

m = ______ (number) ______ (units) and b = ______ (number) ______ (units).

Therefore, the line of best fit is y = ______.

(f) If the line of best fit is an accurate model of this data, how much sleep would you expect Snoozeville
residents to get on average on a night with 300 cars on the road? Use your result to comment on whether
the line of best fit is a reasonable way to predict nightly sleep in this case.

I predict that the residents will get an average of ______ hours of sleep. This tells me...
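
For reference in part (d), the second derivative test for a function of two variables can be summarized as follows. This is the standard statement of the test, written in the variables m and b used here; the symbol D and the name $(m_0, b_0)$ for the critical point are notation introduced only for this summary.

```latex
% Second derivative test at a critical point (m_0, b_0) of f(m, b)
D = f_{mm}(m_0, b_0)\, f_{bb}(m_0, b_0) - \bigl(f_{mb}(m_0, b_0)\bigr)^2

\begin{cases}
D > 0 \text{ and } f_{mm}(m_0, b_0) > 0 & \Rightarrow \text{local minimum} \\
D > 0 \text{ and } f_{mm}(m_0, b_0) < 0 & \Rightarrow \text{local maximum} \\
D < 0 & \Rightarrow \text{saddle point} \\
D = 0 & \Rightarrow \text{the test is inconclusive}
\end{cases}
```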

2. Your argument from Question 1 can be generalized to derive a formula for the least-squares line of best fit for
any data set. Please read this argument in the Focus on Theory section of Chapter 8. However, note that the
authors did not explain why the critical point corresponds to a minimum! Your job is to fill this gap in their
reasoning: Use the second derivative test to check that f has a local minimum at this critical point. Use the
abbreviated notation SY, SX, SXY, and SXX in your calculations and explanation. You may assume that
$(SX)^2 < n \cdot SXX$.
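
The sketch below shows how the abbreviated sums might be computed for a small data set and used in the usual closed-form expressions for the slope and intercept. It assumes that SX, SY, SXY, and SXX denote the sums of the x_i, the y_i, the products x_i·y_i, and the squares x_i^2; check these meanings against the textbook. The function name `least_squares_line` and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line, using the abbreviated sums."""
    n = len(xs)
    sx = sum(xs)                               # SX  (assumed: sum of x_i)
    sy = sum(ys)                               # SY  (assumed: sum of y_i)
    sxy = sum(x * y for x, y in zip(xs, ys))   # SXY (assumed: sum of x_i * y_i)
    sxx = sum(x * x for x in xs)               # SXX (assumed: sum of x_i^2)

    denom = n * sxx - sx ** 2
    # The assumption (SX)^2 < n*SXX says exactly that this denominator is positive.
    if denom <= 0:
        raise ValueError("need (SX)^2 < n*SXX, i.e. at least two distinct x values")

    m = (n * sxy - sx * sy) / denom
    b = (sy - m * sx) / n
    return m, b


# Hypothetical sample data, chosen only for illustration.
m, b = least_squares_line([1, 2, 3, 4], [2, 4, 5, 8])
print(f"y = {m:.3f} x + {b:.3f}")
```

The code only evaluates the formula numerically. It does not show why the critical point is a minimum, which is the argument this question asks you to supply.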

3. Choose a data set. Your data set should have two variables, one of which you will think of as an input variable
x and the other as an output variable y. Please choose a data set such that your input variable x is not time.
Make sure you use real and reliable data. While sites like Kaggle contain a lot of real datasets, they
also contain fictitious datasets designed solely to test machine learning algorithms. You should not assume that
data posted on one of the sites in the Appendix is automatically suitable! Instead, look for information on how
the data was collected to verify that the data is real and reliable.
(a) Cite your data set, including the source, authors, and URL (if applicable).

(b) Create a scatterplot of your data in Excel. Then, add a line of best fit. In Excel, this can be done by
adding a trendline to your scatterplot. In the space below, roughly sketch your graph and label your axes.
Your sketch does not have to be perfect, but include any important features you notice.

(c) Find the line of best fit for your data. In Excel, this can be done using the LINEST function.

The line of best fit for my data is y = ______ x + ______.

(d) Explain in plain language what the coefficients m and b tell you about the relationship between your two
variables. What do you find interesting about this relationship?

