
Riphah International University

Introduction to Machine Learning Spring 2023

Mar 22 Assignment 1 Marks: 20


Due on: Mar 28

A file, compensation.csv, is uploaded along with this assignment. The file contains
a dataset describing the different forms of benefits that employees drew during
their service and, based on those benefits, the final compensation given to them at
the time of retirement.

The compensation feature is the target. This means you need to carry out linear
regression to predict the compensation value from the input features. Follow the
given steps to complete your assignment.

Part A

1. Import the appropriate libraries
2. Read the csv file into a dataframe using the appropriate function
3. Describe your dataset using the appropriate pandas function
4. Plot each input feature against the output feature/target in a scatter plot to
see if there is a linear trend
5. Define separate dataframes X and y representing the input and target features
6. Use appropriate function to split the dataset into training and testing partitions
7. Create an instance of LinearRegression
8. Call the fit method for multiple linear regression using all input features
9. Predict the values for y_test and plot the true and predicted values
10. Print the score (r2)
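
The ten steps above can be sketched roughly as follows. This is a minimal outline under stated assumptions, not a model answer: because compensation.csv is not reproduced here, the sketch builds a synthetic stand-in dataframe, and the feature names salary, overtime and health_benefits are hypothetical. Only the target name compensation comes from the assignment text. The 80/20 split and the random_state values are likewise arbitrary choices.

```python
# Step 1: import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Step 2: with the real file this would be
#   df = pd.read_csv("compensation.csv")
# The frame below is SYNTHETIC stand-in data; the feature names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "salary": rng.uniform(30_000, 120_000, n),
    "overtime": rng.uniform(0, 20_000, n),
    "health_benefits": rng.uniform(2_000, 15_000, n),
})
df["compensation"] = (
    1.2 * df["salary"]
    + 0.8 * df["overtime"]
    + 1.5 * df["health_benefits"]
    + rng.normal(0, 5_000, n)  # noise so the fit is not perfect
)

# Step 3: summary statistics for every numeric column
print(df.describe())

# Step 4: one scatter plot per input feature against the target
target = "compensation"
features = [c for c in df.columns if c != target]
fig, axes = plt.subplots(1, len(features),
                         figsize=(4 * len(features), 3.5), squeeze=False)
for ax, col in zip(axes[0], features):
    ax.scatter(df[col], df[target], s=10)
    ax.set_xlabel(col)
    ax.set_ylabel(target)
fig.tight_layout()

# Step 5: separate inputs and target
X = df[features]
y = df[target]

# Step 6: hold out 20% of the rows for testing (ratio is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Steps 7-8: multiple linear regression on all input features
model = LinearRegression()
model.fit(X_train, y_train)

# Step 9: predict on the test partition and plot true vs. predicted values
y_pred = model.predict(X_test)
fig2, ax2 = plt.subplots()
ax2.scatter(y_test, y_pred, s=12)
lims = [min(y_test.min(), y_pred.min()), max(y_test.max(), y_pred.max())]
ax2.plot(lims, lims, linestyle="--")  # points on this line are perfect predictions
ax2.set_xlabel("true compensation")
ax2.set_ylabel("predicted compensation")

# Step 10: coefficient of determination (R^2) on the test partition
r2 = model.score(X_test, y_test)
print(f"R^2 on test set: {r2:.3f}")

# plt.show()  # uncomment when running as a script; notebooks render figures automatically
```

For a regressor, model.score returns the coefficient of determination, so it matches what sklearn.metrics.r2_score would give for y_test and y_pred.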

Part B

7. After step 6 of Part A, perform K-fold cross-validation using 3, 5 and 10 splits
and report the validation score for each fold.
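
One possible reading of this step is sketched below, assuming that "after step 6" means the folds are drawn from the training partition only. The data comes from scikit-learn's make_regression as a synthetic stand-in for compensation.csv, and the shuffle and random_state settings are assumptions rather than requirements of the assignment.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# SYNTHETIC stand-in for the compensation data (3 input features).
X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)

# Same kind of split as step 6 of Part A.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

fold_scores = {}
for k in (3, 5, 10):
    # Shuffle so that the folds do not depend on row order.
    kf = KFold(n_splits=k, shuffle=True, random_state=0)
    # A fresh LinearRegression is fitted on k-1 folds and scored (R^2) on the
    # remaining fold, once per fold.
    scores = cross_val_score(LinearRegression(), X_train, y_train,
                             cv=kf, scoring="r2")
    fold_scores[k] = scores
    for i, s in enumerate(scores, start=1):
        print(f"k={k:2d}  fold {i:2d}: R^2 = {s:.3f}")
    print(f"k={k:2d}  mean R^2 = {scores.mean():.3f}")
```

With 10 splits each validation fold holds only 16 rows here, so per-fold scores tend to vary more than with 3 splits; reporting the mean alongside the individual folds makes that visible.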

Evaluation Rubric

Criteria                  | Marks: 0-30%                          | Marks: 31-70%                 | Marks: 71-100%
Syntactic Correctness (4) | More than 4 independent syntax errors | 2-3 independent syntax errors | Less than 2 independent syntax errors
Logical Correctness (6)   | 3 or more logical errors              | 1-2 logical errors            | 1 or no logical errors
Results Accuracy (4)      | More than 4 wrong values              | 2-3 wrong values              | 1 or no wrong values reported
Completion (6)            | More than 3 tasks undone              | 1 or few tasks undone         | All tasks completed
