
Introduction to Machine Learning

Dr. Shekha Rai

Department of Electrical Engineering


National Institute of Technology Rourkela
India

Contents

Regression

Regression

Supervised learning problem: given example instances (X, Y), learn a function so that, given an unknown X, we can predict Y.

f : X → Y (Y continuous)    (1)

Many functions can be used. Simplest → linear function.

X comprises one or more features.

In linear regression, a linear function is assumed, and out of the different linear functions, the one that optimizes a certain criterion is chosen.

Regression

Figure: training instances (blue circles), the true function (green line), and a fitted line (red line).
Regression

Blue circles → different instances in the training data.

Let the green line be the actual function from which the points were generated.

In regression, the instances are given and a function has to be found.

Here the blue points are given and a function (red line) is found.

Need to find out how good the line is w.r.t. the training examples.

To measure how good the line is → define the error of the line.

Regression

Find the vertical distance of each point from the red line, square it, and sum these squared distances over all points.

This sum of squared errors (SSE) is one measure of error.
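As an illustration of this error measure, here is a minimal Python sketch (the data points and candidate line are invented for this example, not taken from the slides):

    import numpy as np

    # Hypothetical training points (the blue circles)
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

    # An assumed candidate line (the red line): y = b0 + b1*x
    b0, b1 = 0.1, 1.0

    residuals = y - (b0 + b1 * x)   # vertical distance of each point from the line
    sse = np.sum(residuals ** 2)    # sum of squared errors
    print(sse)

A smaller SSE means the line lies closer to the training points.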

Regression

Figure: four fits to the same data: (a) a constant, (b) a line, (c) a cubic, (d) a 9th-degree polynomial.
Regression

Fig(a) → linear function parallel to the x-axis, so that y = c (constant).

Fig(b) → the function corresponds to y = mx + c.

Fig(c) → cubic function y = ax^3 + bx^2 + cx + d.

Fig(d) → 9th-degree polynomial.

Regression

Fig(a) → not a very good fit to the data.

Fig(b) → fit is slightly better than Fig(a).

Sum of squared errors: Fig(a) > (b) > (c) > (d).

Fig(d) → even though the red line passes through all the points, this function does not really correspond to the green line, so for other points the error may be higher (overfitting).

Fig(c) → the function seems to fit the points much better, and within this range the error w.r.t. the green line will be smaller.
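A sketch of this degree comparison using numpy.polyfit (the slides do not prescribe an implementation; the data, noise level, and true function here are assumptions made for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 10)
    y_true = np.sin(2 * np.pi * x)                  # stand-in for the green line
    y = y_true + rng.normal(0, 0.2, size=x.shape)   # noisy training points (blue)

    for degree in (0, 1, 3, 9):                     # Figs (a), (b), (c), (d)
        coeffs = np.polyfit(x, y, degree)           # least-squares polynomial fit
        sse = np.sum((y - np.polyval(coeffs, x)) ** 2)
        print(degree, sse)                          # training SSE shrinks as degree grows

The degree-9 fit drives the training SSE to (nearly) zero by tracking the noise, which is exactly the overfitting behaviour described for Fig(d).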

Types of regression models

Simple (1 feature)
    Linear
    Non-linear

Multiple (2+ features)
    Linear
    Non-linear

Linear regression

Given the data points, the task is to find the parameters of the line, i.e., b_0 and b_1.

Assuming some random noise ε:

Y = b_0 + b_1 x + ε    (2)

The error ε has zero mean.

Eq. (2) denotes simple regression.

Linear regression

For p predictor variables:

Y = b_0 + b_1 x_1 + b_2 x_2 + ... + b_p x_p + ε    (3)

Eq. (3) denotes multiple regression.

Define a model to estimate the parameters (i.e., b_0, b_1, ..., b_p):

E(Y | X) = b_0 + b_1 x_1 + b_2 x_2 + ... + b_p x_p    (4)

Need to find b̂_0 + b̂_1 x_1 + b̂_2 x_2 + ... + b̂_p x_p as an estimate of the original function.

Minimize the sum of squared errors.

Least-squares line → b̂_0 + b̂_1 x_1 + b̂_2 x_2 + ... + b̂_p x_p.

Linear regression
Assumptions

E(ε_i) = 0

σ(ε_i) = σ(ε), where σ(ε) is unknown (constant variance).

Errors are independent.

Errors are normally distributed with mean 0 and standard deviation σ(ε).

→ Gaussian noise or white noise.
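A small simulation of what these assumptions say about the data-generating process (the parameter values are arbitrary choices for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    b0, b1, sigma = 2.0, 0.5, 0.3                 # assumed true parameters
    x = rng.uniform(0, 10, size=1000)
    eps = rng.normal(0.0, sigma, size=x.shape)    # independent Gaussian (white) noise
    y = b0 + b1 * x + eps

    print(eps.mean())   # ≈ 0: errors have zero mean
    print(eps.std())    # ≈ sigma: the same (unknown in practice) spread for every point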

Linear regression

The least-squares regression line is the unique line for which the sum of squared vertical (y) distances between the data points and the line is the smallest possible.

Given the training points d_i = (x_i, y_i), the SSE is

SSE = Σ_{d_i ∈ D} (y_i − (b̂_0 + b̂_1 x_i))^2    (5)

Find b̂_0 and b̂_1.

Simple Linear regression

For a 2-D problem:

Y = b_0 + b_1 X    (6)

To find the coefficient values that minimize the objective function (SSE), take the partial derivative of the SSE w.r.t. each coefficient and set it to 0:

b_1 = (n Σxy − Σx Σy) / (n Σx^2 − (Σx)^2)    (7)

b_0 = (Σy − b_1 Σx) / n    (8)

This is a closed-form solution.
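Eqs. (7) and (8) transcribe directly into Python; a sketch with made-up data:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
    n = len(x)

    # Eq. (7): slope
    b1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / \
         (n * np.sum(x ** 2) - np.sum(x) ** 2)
    # Eq. (8): intercept
    b0 = (np.sum(y) - b1 * np.sum(x)) / n
    print(b0, b1)

No iteration is needed: the coefficients come out in one shot, which is what "closed form" means here.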

Multiple Linear regression

For the multiple linear regression problem:

Y = b_0 + b_1 X_1 + b_2 X_2 + ... + b_n X_n    (9)

h(x) = Σ_{i=0}^{n} b_i x_i  (with x_0 = 1)    (10)

The closed-form solution requires matrix inversion.

An alternative is to use an iterative algorithm such as the delta (LMS) rule.

Update the values of b_0, b_1, ... to minimize the sum of squared errors.
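A sketch of both routes on synthetic data (the dataset, learning rate, and iteration count are assumptions for illustration; the iterative part is a batch-gradient variant of the LMS/delta update):

    import numpy as np

    rng = np.random.default_rng(2)
    X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])  # x_0 = 1 column for b_0
    true_b = np.array([1.0, 2.0, -0.5])
    y = X @ true_b + rng.normal(0, 0.1, size=100)

    # Closed form: solve (X^T X) b = X^T y (involves inverting X^T X)
    b_closed = np.linalg.solve(X.T @ X, X.T @ y)

    # Iterative LMS-style updates: step b_0, b_1, ... against the SSE gradient
    b = np.zeros(3)
    lr = 0.01
    for _ in range(1000):
        err = y - X @ b
        b += lr * (X.T @ err) / len(y)
    print(b_closed, b)   # both should approach true_b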

Multiple-choice

In the mathematical equation of linear regression Y = b_0 + b_1 X + ε, (b_0, b_1) refers to:

1. (X-intercept, Slope)

2. (Slope, X-intercept)

3. (Y-intercept, Slope)

4. (Slope, Y-intercept)

Answer: (3)

Thank You

