1. Email *
2. Name *
3. Roll No. *
Linear Regression
Given the dataset of heights and weights of six individuals, fit a linear regression line
using the ordinary least squares method.
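The ordinary least squares fit for a single predictor can be sketched as follows. This is a minimal illustration, not a reference solution: the function names `fit_ols`, `predict`, and `mean_squared_error` are made up for this sketch, and the `heights` and `weights` values are hypothetical placeholders that must be replaced with the six observations from the assignment's dataset.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope = S_xy / S_xx (sample covariance over sample variance of x).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    # The least squares line always passes through (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted response for a single new x value."""
    return intercept + slope * x_new


def mean_squared_error(x, y, intercept, slope):
    """Average squared residual of the fitted line on (x, y)."""
    residuals = [yi - predict(intercept, slope, xi) for xi, yi in zip(x, y)]
    return sum(r ** 2 for r in residuals) / len(residuals)


# HYPOTHETICAL placeholder values, NOT the assignment's dataset.
heights = [60, 62, 64, 66, 68, 70]
weights = [115, 120, 130, 138, 150, 157]

intercept, slope = fit_ols(heights, weights)
print(round(predict(intercept, slope, 67), 2))
print(round(mean_squared_error(heights, weights, intercept, slope), 2))
```

The mean squared error here divides by the number of observations; some texts divide by n - 2 instead, so check which convention the course uses.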
https://docs.google.com/forms/d/1DhEhoLPO3y5h2KUbE3MF5p2AxCx8gElBsT2k_A-5rQo/edit 1/7
24/03/2024, 09:30 Machine Learning Assignment 2024
6. What is the value of the predicted weight (rounded to 2 decimal places) for
height = 67? * 1 point
7. Find the mean squared error (rounded to 2 decimal places) for the best-fit
line. * 1 point
8. What will happen if we add a constant 'c' to all height values? * 1 point
Both intercept and regression coefficient will increase by the constant 'c'
9. Suppose that for some linear regression problem we have a training set,
and for that training set we managed to find some theta_0 and theta_1
such that J(theta_0, theta_1) = 0. Which of the statements below must then
be true? * 1 point
Our training set can be fit perfectly by a straight line, i.e., all of our training examples
lie perfectly on some straight line.
For this to be true, we must have theta_0 = 0 and theta_1 = 0 so that h(x) = 0.
For this to be true, we must have y_i = 0 for every value of i = 1, 2, 3, ..., m.
We can perfectly predict the value of y even for new examples that we have not yet
seen. (e.g., we can perfectly predict prices of even new houses that we have not yet
seen.)
10. Which of the following statements are true with respect to the least squares method? * 1 point
Logistic Regression
Certain health risk factors, such as high blood pressure and cigarette smoking, can
lead to sudden death. A multiple logistic regression model was therefore fit, with the
regression coefficients shown below.
11. Also, predict the probability of death if the diastolic blood pressure is 180
mmHg, with all other conditions remaining the same. Round your answer to 2
decimal places. * 1 point
12. Using logistic regression, predict the probability of death for a 50-year-old
man with a diastolic blood pressure of 120 mmHg, a relative weight of 100
Kg of study mean, a cholesterol level of 250 mg/100mL, and a glucose level
of 100 mg/100mL, who smokes 10 cigarettes per day. Round your answer
to 2 decimal places. * 1 point
Using the dataset given below, build a logistic regression model to predict the target
variable y using the gradient descent method. Assume a learning rate of 0.1 and initial
values of 0 for all coefficients. Apply ONE iteration of the gradient descent algorithm
and answer the questions given below.
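One iteration of gradient descent for logistic regression can be sketched as below. This is a hedged illustration under stated assumptions: it uses batch gradient descent with the gradient averaged over the m training examples (some course notes use the sum instead of the mean, which changes the numbers), the function names are invented for this sketch, and the `X` and `y` values are hypothetical placeholders rather than the assignment's dataset.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(theta, features):
    """P(y = 1 | features); theta[0] is the intercept term."""
    z = theta[0] + sum(t * f for t, f in zip(theta[1:], features))
    return sigmoid(z)


def cost(theta, X, y):
    """Average cross-entropy (log loss) over the training set."""
    total = 0.0
    for features, label in zip(X, y):
        p = predict_proba(theta, features)
        total += -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return total / len(y)


def gradient_step(theta, X, y, learning_rate):
    """Return the coefficients after one batch gradient descent update."""
    m = len(y)
    # Prediction error for every example, computed with the OLD theta.
    errors = [predict_proba(theta, f) - label for f, label in zip(X, y)]
    new_theta = [theta[0] - learning_rate * sum(errors) / m]
    for j in range(len(theta) - 1):
        grad_j = sum(e * f[j] for e, f in zip(errors, X)) / m
        new_theta.append(theta[j + 1] - learning_rate * grad_j)
    return new_theta


# HYPOTHETICAL placeholder values, NOT the assignment's dataset.
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
y = [0, 0, 1, 1]

theta = [0.0, 0.0, 0.0]                      # all coefficients start at 0
theta = gradient_step(theta, X, y, learning_rate=0.1)
print([round(t, 3) for t in theta])          # theta0, theta1, theta2
print(round(cost(theta, X, y), 2))           # cost after the update
print(predict_proba(theta, (1.5, 2.2)))      # P(y = 1) for x1 = 1.5, x2 = 2.2
```

With every coefficient at 0, each predicted probability is 0.5, so the cost before the first update is ln 2 regardless of the data; this makes a convenient sanity check.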
13. What is the value of theta0 after the first iteration? Round your answer to 3
decimal places. * 1 point
14. What is the value of theta1 after the first iteration? Round your answer to 3
decimal places. * 1 point
15. What is the value of theta2 after the first iteration? Round your answer to 3
decimal places. * 1 point
16. What is the value of the cost function after the first iteration? Round your
answer to 2 decimal places. * 1 point
17. What is the probability of y = 1 for an instance with x1 = 1.5 and x2 = 2.2? * 1 point
Performance Metrics
Suppose 10000 patients get tested for flu; out of them, 9000 are actually healthy and
1000 are actually sick. For the sick people, a test was positive for 620 and negative for
380. For healthy people, the same test was positive for 180 and negative for 8820.
Construct a confusion matrix for the data.
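The counts above can be arranged into a confusion matrix and summarized with the usual metrics, as in the sketch below. It assumes "sick" is the positive class, so a positive test on a sick patient is a true positive; the function name `metrics` is made up for this illustration.

```python
# Counts taken from the problem statement, with "sick" as the positive class.
tp, fn = 620, 380    # sick patients: test positive / test negative
fp, tn = 180, 8820   # healthy patients: test positive / test negative

# Rows are the actual condition, columns are the test result.
print(f"{'':16}{'Test positive':>15}{'Test negative':>15}")
print(f"{'Actually sick':16}{tp:>15}{fn:>15}")
print(f"{'Actually healthy':16}{fp:>15}{tn:>15}")


def metrics(tp, fn, fp, tn):
    """Return common performance metrics derived from the four counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


for name, value in metrics(tp, fn, fp, tn).items():
    print(f"{name}: {value:.4f}")
```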
K-Means Clustering
Consider the given data points A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4),
A7(1, 2), A8(4, 9). Cluster these points into 3 clusters using the k-means algorithm. The
initial cluster centers are C1 (2, 10), C2 (5, 8) and C3 (1, 2). Show your results after
the first iteration only. Use the Manhattan distance to measure the distance between two
points.
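A single k-means iteration on these points can be sketched as below. This is an illustration under stated assumptions: points are assigned to the nearest center by Manhattan distance, ties (if any) go to the first center in dictionary order, and each new center is the coordinate-wise mean of its members, as in standard k-means. The function names are invented for this sketch.

```python
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
centers = {"C1": (2, 10), "C2": (5, 8), "C3": (1, 2)}


def manhattan(p, q):
    """Manhattan (L1) distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def assign(points, centers):
    """Assignment step: map each center name to the labels of its nearest points."""
    clusters = {name: [] for name in centers}
    for label, p in points.items():
        nearest = min(centers, key=lambda name: manhattan(p, centers[name]))
        clusters[nearest].append(label)
    return clusters


def update(points, clusters, centers):
    """Update step: move each center to the mean of its assigned points."""
    new_centers = {}
    for name, members in clusters.items():
        if not members:                      # keep an empty cluster's center
            new_centers[name] = centers[name]
            continue
        xs = [points[label][0] for label in members]
        ys = [points[label][1] for label in members]
        new_centers[name] = (sum(xs) / len(members), sum(ys) / len(members))
    return new_centers


clusters = assign(points, centers)
new_centers = update(points, clusters, centers)
print(clusters)
print(new_centers)
```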
Answer the following:
C1
C2
C3
27. Which points belong to cluster C2? (Select all points in cluster C2) * 1 point
A1
A2
A3
A4
A5
A6
A7
A8
28. What is the new center of cluster C3 after the first iteration? Write it as (#,#). * 1 point
Files submitted: