Finding the best fit line for a given dataset in Octave
Shubham-AE18B041

September 30, 2019

1 Introduction
In this assignment we find the best fit line for a given dataset using the
"least squares fitting" method. We are given three datasets, each containing
the co-ordinates (x and y) of 200 points. The data is given in .txt format.

2 Logic
The data we have is in .txt format. First we assign the values to two
variables, x and y. The variable x stores the x co-ordinates and y stores
the y co-ordinates. Let m and c be the slope and y-intercept of the best fit
line which we have to find. Then the equation y=mx+c for all n = 200 points
can be written in matrix form as
\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}
\begin{bmatrix} m \\ c \end{bmatrix}
\]

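Now we multiply both sides on the left by x^T, the transpose of the matrix
of x co-ordinates and ones:

\[
\begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{bmatrix}
\begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}
\begin{bmatrix} m \\ c \end{bmatrix}
\]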

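Carrying out the products gives a 2-by-1 vector with entries A and B on the
left and a 2-by-2 matrix with entries C, D, E and F on the right. Now we
multiply both sides by the inverse of this 2-by-2 matrix:

\[
\begin{bmatrix} C & D \\ E & F \end{bmatrix}^{-1}
\begin{bmatrix} A \\ B \end{bmatrix}
=
\begin{bmatrix} C & D \\ E & F \end{bmatrix}^{-1}
\begin{bmatrix} C & D \\ E & F \end{bmatrix}
\begin{bmatrix} m \\ c \end{bmatrix}
\]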
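Written out entry by entry, the quantities in this step are sums over the n data points. The block below is a sketch that takes A and B to be the two entries of the left-hand product and C, D, E, F to be the entries of the 2-by-2 product, as their positions in the equation suggest. It assumes CF - DE is non-zero, which holds whenever the x co-ordinates are not all equal.

```latex
\[
A = \sum_{i=1}^{n} x_i y_i, \quad
B = \sum_{i=1}^{n} y_i, \quad
C = \sum_{i=1}^{n} x_i^2, \quad
D = E = \sum_{i=1}^{n} x_i, \quad
F = n
\]
\[
\begin{bmatrix} C & D \\ E & F \end{bmatrix}^{-1}
= \frac{1}{CF - DE}
\begin{bmatrix} F & -D \\ -E & C \end{bmatrix}
\quad\Longrightarrow\quad
m = \frac{FA - DB}{CF - DE}, \qquad
c = \frac{CB - EA}{CF - DE}
\]
```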

By doing the above matrix multiplication we get the values of m and c:

\[
\begin{bmatrix} m \\ c \end{bmatrix}
=
\begin{bmatrix} \text{slope} \\ \text{y-intercept} \end{bmatrix}
\]
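The derivation can be checked numerically before touching the real data files. The following is a minimal Octave sketch on made-up data (the values of x and y and the names X, XtX, Xty and coeffs are assumptions for illustration, not part of the assignment). It builds the matrix of x co-ordinates and ones, solves the resulting 2-by-2 system, and compares the answer with Octave's built-in polyfit.

```octave
% Made-up data (assumed for illustration): points close to y = 2x + 1.
x = (1:50)';
y = 2 * x + 1 + 0.1 * sin(x);   % small deterministic wiggle instead of noise

X = [x, ones(numel(x), 1)];     % rows are [x_i 1]
XtX = X' * X;                   % 2-by-2 matrix on the right-hand side
Xty = X' * y;                   % 2-by-1 vector on the left-hand side
coeffs = XtX \ Xty;             % solves XtX * [m; c] = Xty
m = coeffs(1);
c = coeffs(2);

p = polyfit(x, y, 1);           % built-in degree-1 least-squares fit
printf("matrix method: m = %.6f, c = %.6f\n", m, c);
printf("polyfit:       m = %.6f, c = %.6f\n", p(1), p(2));
```

The two lines of output should agree to within rounding error, since both solve the same least-squares problem.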

3 Code
The Octave code for performing the above operations is given below.

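data = load('data3.txt');   # we can change the data file accordingly

l = 150;                    # we can change no. of points accordingly
x = ones(l, 2);             # second column of ones multiplies the intercept c
y = data(1:l, 2);           # y co-ordinates
x(:, 1) = data(1:l, 1);     # first column holds the x co-ordinates
xtran = transpose(x);
A = xtran * y;
B = xtran * x;
coeffs = B \ A;             # solves B*coeffs = A; coeffs(1) = m, coeffs(2) = c
plot(x(:, 1), y, 'or', 'markersize', 2)
xfit = x(:, 1);             # draw the line over the x values actually used
yfit = coeffs(1) * xfit + coeffs(2);
hold on
p2 = plot(xfit, yfit, '-b');
xlabel('X-data')
ylabel('Y-data')
title('Dataset-3, 150 points')   # we can change the title accordingly
legend(p2, sprintf("y = %f*x + %f", coeffs(1), coeffs(2)))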
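Since the same fit is repeated for several point counts, it may help to wrap it in a small helper. The sketch below is one possible way to do that. The helper name fit_line and the stand-in data are hypothetical, and the backslash operator is applied directly to the matrix of x co-ordinates and ones, which gives the least-squares solution without forming the inverse explicitly.

```octave
% Hypothetical helper: returns [m; c] for the least-squares line through (x, y).
fit_line = @(x, y) [x(:), ones(numel(x), 1)] \ y(:);

% Stand-in data (assumed); replace with the two columns of a loaded data file.
x_all = (1:200)';
y_all = 0.5 * x_all + 3 + 0.2 * cos(x_all);

counts = [25, 75, 150];              % point counts used for the figures
results = zeros(numel(counts), 2);   % one row [m c] per point count
for k = 1:numel(counts)
  l = counts(k);
  coeffs = fit_line(x_all(1:l), y_all(1:l));
  results(k, :) = coeffs';
  printf("%3d points: y = %f*x + %f\n", l, coeffs(1), coeffs(2));
end
```

Each row of results then holds the slope and intercept for one point count, which makes it easy to see how the fit changes as more points are included.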

4 Result
The resultant graphs for different datasets and different numbers of points
are shown below.

Figure 1: Dataset-1, no. of points: 25

Figure 2: Dataset-1, no. of points: 75

Figure 3: Dataset-1, no. of points: 150

Figure 4: Dataset-2, no. of points: 25

Figure 5: Dataset-2, no. of points: 75

Figure 6: Dataset-2, no. of points: 150

Figure 7: Dataset-3, no. of points: 25

Figure 8: Dataset-3, no. of points: 75

Figure 9: Dataset-3, no. of points: 150

5 Inference
We can see that the graphs plotted in Python and Octave are exactly the same.
For datasets 2 and 3 the data is nearly linear, so every point lies very close
to the best fit line. For dataset 1 the data is not linear, so some points lie
close to the best fit line whereas others lie far from it.
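How close the points lie to the line can also be expressed as a number. One common choice, not used in the original assignment, is the coefficient of determination R^2, which is near 1 when the points lie close to the fitted line and near 0 when a straight line explains little of the variation. A minimal sketch on stand-in data (both datasets and the helper name r_squared are assumptions for illustration):

```octave
% Stand-in data (assumed): one nearly linear set and one clearly curved set.
x = (1:100)';
y_linear = 3 * x + 2 + 0.5 * sin(x);
y_curved = 0.05 * (x - 50).^2;

X = [x, ones(numel(x), 1)];     % rows are [x_i 1]

% R^2 = 1 - SS_res / SS_tot for the least-squares line through (x, y).
r_squared = @(y) 1 - sum((y - X * (X \ y)).^2) / sum((y - mean(y)).^2);

r2_linear = r_squared(y_linear);
r2_curved = r_squared(y_curved);
printf("R^2, nearly linear data: %f\n", r2_linear);
printf("R^2, curved data:        %f\n", r2_curved);
```

Applied to the actual datasets, the same calculation would put a number on the visual impression that datasets 2 and 3 follow the line more closely than dataset 1.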
