



TITLE: Python lab exercise to implement linear regression

AIM: To find the best fit line for the given data using Linear Regression


Auto Insurance in Sweden

The following data were collected for geographical zones in Sweden, where

X = number of claims
Y = total payment for all the claims in thousands of Swedish Kronor

X Y
108 392
19 46
13 1
124 422
40 114
57 170
23 5
14 77
45 214
10 65
5 20
48 248
11 23
23 39
7 48
2 6
24 134
6 50
3 4
23 113
6 14
9 48
9 52
3 13
29 103
7 77
4 11
20 98
7 27
4 38
0 0
25 69
6 14
5 40
22 161
11 57
61 217
12 58
4 12
16 59
13 89
60 202
41 181
37 152
55 162
41 73
11 21
27 92
8 76
3 39
17 142
13 93
13 31
15 32
8 55
29 133
30 194
24 137
9 87
31 20
14 95
53 244
26 187

In statistics, linear regression is a linear approach to modelling the
relationship between a dependent variable and one or more
independent variables. Let X be the independent variable and Y be the
dependent variable. We will define a linear relationship between these
two variables as follows:
Y = mX + c
Here m is the slope of the line and c is the y-intercept. Our objective
is to determine the values of m and c such that the corresponding line
is the best fitting line, that is, the line that gives the minimum error.

Loss Function:
The loss is the error in the values predicted with the current m and c.
We will use the Mean Squared Error (MSE) function to calculate the loss.
There are three steps in this function:
• Find the difference between the actual y and predicted y value
(y = mx + c), for a given x.
• Square this difference.
• Find the mean of the squares for every value in X.

$$E = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y}_i)^2$$

Here yᵢ is the actual value and ȳᵢ is the predicted value. Let’s
substitute the value of ȳᵢ:

$$E = \frac{1}{n} \sum_{i=1}^{n} (y_i - (m x_i + c))^2$$

So, we square the error and find the mean, hence the name Mean
Squared Error.
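The three steps above can be sketched as a small function. This is a minimal illustration and not part of the lab code: the function name `mean_squared_error` and the candidate values of m and c are assumptions chosen for the example, and the data are only the first five rows of the table.

```python
import numpy as np

def mean_squared_error(x, y, m, c):
    """Mean of the squared differences between actual y and predicted m*x + c."""
    y_pred = m * x + c      # predicted value for each x
    diff = y - y_pred       # step 1: actual minus predicted
    squared = diff ** 2     # step 2: square each difference
    return squared.mean()   # step 3: mean of the squares

# First five (X, Y) rows from the table (a subset, for illustration only)
x = np.array([108, 19, 13, 124, 40], dtype=float)
y = np.array([392, 46, 1, 422, 114], dtype=float)

# Hypothetical candidate lines; a lower value means a better fit on these rows
print(mean_squared_error(x, y, m=0.0, c=0.0))
print(mean_squared_error(x, y, m=3.5, c=0.0))
```

A line that passes through every point gives a loss of exactly 0, which is the ideal case that gradient descent tries to approach.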

The Gradient Descent Algorithm:

Gradient descent is an iterative optimization algorithm to find the
minimum of a function. Here that function is our Loss Function.

Now, apply gradient descent to m and c and approach it step by step:

• Initially let m = 0 and c = 0. Let L be our learning rate. This
controls how much the values of m and c change with each step. L
could be a small value like 0.0001 for good accuracy.
• Calculate the partial derivative of the loss function with respect
to m, and plug in the current values of x, y, m and c in it to
obtain the derivative value Dₘ:

$$D_m = \frac{1}{n} \sum_{i=1}^{n} 2 (y_i - (m x_i + c))(-x_i) = \frac{-2}{n} \sum_{i=1}^{n} x_i (y_i - \bar{y}_i)$$

Dₘ is the value of the partial derivative with respect to m.

Similarly, let’s find the partial derivative with respect to c, D_c:

$$D_c = \frac{-2}{n} \sum_{i=1}^{n} (y_i - \bar{y}_i)$$

• Now we update the current values of m and c using the following
equations:

$$m = m - L \times D_m$$
$$c = c - L \times D_c$$

• We repeat this process until our loss function is a very small
value or ideally 0 (which means 0 error or 100% accuracy). The
values of m and c that we are left with now will be the optimum
values.
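A single update of the procedure above can be sketched as follows. The helper names `loss`, `gradients` and `gradient_step` are illustrative assumptions, not part of the lab code, and the data are only the first five rows of the table. The sketch also compares the analytic derivative with respect to m against a finite-difference estimate, as a sanity check on the formula.

```python
import numpy as np

def loss(x, y, m, c):
    """Mean squared error of the line y = m*x + c."""
    return np.mean((y - (m * x + c)) ** 2)

def gradients(x, y, m, c):
    """Partial derivatives of the loss with respect to m and c."""
    n = len(x)
    y_pred = m * x + c
    d_m = (-2.0 / n) * np.sum(x * (y - y_pred))
    d_c = (-2.0 / n) * np.sum(y - y_pred)
    return d_m, d_c

def gradient_step(x, y, m, c, lr):
    """One gradient descent update of m and c with learning rate lr."""
    d_m, d_c = gradients(x, y, m, c)
    return m - lr * d_m, c - lr * d_c

# First five (X, Y) rows from the table (a subset, for illustration only)
x = np.array([108, 19, 13, 124, 40], dtype=float)
y = np.array([392, 46, 1, 422, 114], dtype=float)

m, c, lr = 0.0, 0.0, 0.0001

# Finite-difference estimate of dE/dm, to compare with the analytic value
eps = 1e-4
numeric_d_m = (loss(x, y, m + eps, c) - loss(x, y, m - eps, c)) / (2 * eps)
analytic_d_m, analytic_d_c = gradients(x, y, m, c)
print(analytic_d_m, numeric_d_m)

# One update: the loss after the step should be lower than before
loss_before = loss(x, y, m, c)
m, c = gradient_step(x, y, m, c, lr)
loss_after = loss(x, y, m, c)
print(loss_before, loss_after)
```

Repeating `gradient_step` in a loop gives the full algorithm; the step is kept as a separate function here only so that one update can be inspected on its own.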

# Making the imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (12.0, 9.0) # plot size

# Preprocessing Input data

data = pd.read_csv('ml_lab_1.csv')
X = data.iloc[:, 0] # get all rows from column 0
Y = data.iloc[:, 1] # get all rows from column 1
plt.scatter(X, Y) # draw a scatter plot
plt.show() # display all figures

# Building the model

m = 0 # m is the slope of the line
c = 0 # c is the y intercept
L = 0.0001 # The learning Rate
iters = 1000 # The number of iterations to perform gradient descent

n = float(len(X)) # Number of elements in X

# Performing Gradient Descent

for i in range(iters):
    Y_pred = m*X + c # The current predicted value of Y
    D_m = (-2/n) * sum(X * (Y - Y_pred)) # Derivative wrt m
    D_c = (-2/n) * sum(Y - Y_pred) # Derivative wrt c
    m = m - L * D_m # Update m
    c = c - L * D_c # Update c

print(m, c)

# Making predictions
Y_pred = m*X + c

plt.scatter(X, Y)
plt.plot([min(X), max(X)], [min(Y_pred), max(Y_pred)], color='green') # predicted
plt.show()

Output: m = 3.70478986556524, c = 1.6365274116383624
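To show how the fitted line is used, the sketch below plugs the reported m and c into y = mx + c. The function name `predict_payment` and the input of 40 claims are assumptions made only for this illustration.

```python
# Values of m and c as reported in the output above
m = 3.70478986556524
c = 1.6365274116383624

def predict_payment(num_claims):
    """Predicted total payment (thousands of Swedish Kronor) for a number of claims."""
    return m * num_claims + c

# Hypothetical query: predicted payment for a zone with 40 claims
print(predict_payment(40))
```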

• When the learning rate is set to 0.0001, we get a line with
minimum error, and 13 points lie very close to the line.

• When the learning rate is set to 0.001, the points move a little
further from the line.

• When the learning rate is set to 0.00001, we get a line with
minimum error and 13 points very close to the line, almost the
same as the result obtained with L = 0.0001.

• When the learning rate is 0.1, we do not get any line.

Hence, we can say 0.0001 is the ideal learning rate for plotting the
best fit line.
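A comparison like the one above can be rerun with a sketch such as the following, which applies the same update rule for each learning rate mentioned and reports the final mean squared error. The function name `run_gradient_descent` is an assumption for illustration, the data are typed in from the table rather than read from the CSV file, and the outcome depends on the number of iterations, so the numbers may not match the observations above exactly.

```python
import numpy as np

# (X, Y) pairs typed in from the table at the start of this exercise
X = np.array([108, 19, 13, 124, 40, 57, 23, 14, 45, 10, 5, 48, 11, 23, 7, 2,
              24, 6, 3, 23, 6, 9, 9, 3, 29, 7, 4, 20, 7, 4, 0, 25, 6, 5, 22,
              11, 61, 12, 4, 16, 13, 60, 41, 37, 55, 41, 11, 27, 8, 3, 17, 13,
              13, 15, 8, 29, 30, 24, 9, 31, 14, 53, 26], dtype=float)
Y = np.array([392, 46, 1, 422, 114, 170, 5, 77, 214, 65, 20, 248, 23, 39, 48,
              6, 134, 50, 4, 113, 14, 48, 52, 13, 103, 77, 11, 98, 27, 38, 0,
              69, 14, 40, 161, 57, 217, 58, 12, 59, 89, 202, 181, 152, 162,
              73, 21, 92, 76, 39, 142, 93, 31, 32, 55, 133, 194, 137, 87, 20,
              95, 244, 187], dtype=float)

def run_gradient_descent(x, y, lr, iters=1000):
    """Return (m, c, final MSE) after `iters` updates starting from m = c = 0."""
    m, c = 0.0, 0.0
    n = len(x)
    # A learning rate that is too large can overflow; silence the warnings
    # and let the values become inf/nan so that the failure is visible.
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(iters):
            y_pred = m * x + c
            d_m = (-2.0 / n) * np.sum(x * (y - y_pred))
            d_c = (-2.0 / n) * np.sum(y - y_pred)
            m = m - lr * d_m
            c = c - lr * d_c
        mse = np.mean((y - (m * x + c)) ** 2)
    return m, c, mse

initial_mse = np.mean(Y ** 2)  # loss at the starting point m = 0, c = 0

for lr in (0.00001, 0.0001, 0.001, 0.1):
    m, c, mse = run_gradient_descent(X, Y, lr)
    improved = bool(np.isfinite(mse) and mse < initial_mse)
    print(f"L = {lr}: m = {m}, c = {c}, MSE = {mse}, improved = {improved}")
```

Reporting the final loss for each rate gives a numeric basis for the comparison, in addition to judging the plots by eye.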
