
Introduction to Machine Learning

Dr. Shekha Rai

Department of Electrical Engineering


National Institute of Technology Rourkela
India

Contents

Regression

Regression

Supervised learning problem: given example instances (X, Y), learn a function so that, given an unknown X, we can predict Y.

f : X → Y (Y continuous)    (1)

Many functions can be used. Simplest → linear function.

X comprises one or more features.

In linear regression, a linear function is assumed, and out of the different linear functions, the one that optimizes a certain criterion is chosen.

Regression

Figure: training instances (blue circles), the true function (green line), and a fitted line (red line).
Regression

Blue circles → different instances in the training data.

Let the green line be the actual function from which the points were generated.

In regression, the instances are given and a function has to be found.

Here the blue points are given and a function (red line) is found.

Need to find out how good the line is w.r.t. the training examples.

To measure how good the line is → define the error of the line.

Regression

Find the vertical distance of each point from the red line, square it, and sum these squared distances over all points.

This sum of squared errors (SSE) is one measure of error.
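As an illustration of this error measure, here is a minimal Python sketch (the data points and candidate line are invented for this example, not taken from the slides):

    import numpy as np

    # Hypothetical training points (the blue circles)
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

    # An assumed candidate line (the red line): y = b0 + b1*x
    b0, b1 = 0.1, 1.0

    residuals = y - (b0 + b1 * x)   # vertical distance of each point from the line
    sse = np.sum(residuals ** 2)    # sum of squared errors
    print(sse)

A smaller SSE means the line lies closer to the training points.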

Regression

Figure: four fits to the same data: (a) a constant, (b) a line, (c) a cubic, (d) a 9th-degree polynomial.
Regression

Fig(a) → linear function parallel to the x-axis, so that y = c (constant).

Fig(b) → the function corresponds to y = mx + c.

Fig(c) → cubic function y = ax^3 + bx^2 + cx + d.

Fig(d) → 9th-degree polynomial.

Regression

Fig(a) → not a very good fit to the data.

Fig(b) → fit is slightly better than Fig(a).

Sum of squared errors: Fig(a) > (b) > (c) > (d).

Fig(d) → even though the red line passes through all the points, this function does not really correspond to the green line, so for other points the error may be higher (overfitting).

Fig(c) → the function seems to fit the points much better, and within this range the error w.r.t. the green line will be smaller.
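A sketch of this degree comparison using numpy.polyfit (the slides do not prescribe an implementation; the data, noise level, and true function here are assumptions made for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 10)
    y_true = np.sin(2 * np.pi * x)                  # stand-in for the green line
    y = y_true + rng.normal(0, 0.2, size=x.shape)   # noisy training points (blue)

    for degree in (0, 1, 3, 9):                     # Figs (a), (b), (c), (d)
        coeffs = np.polyfit(x, y, degree)           # least-squares polynomial fit
        sse = np.sum((y - np.polyval(coeffs, x)) ** 2)
        print(degree, sse)                          # training SSE shrinks as degree grows

The degree-9 fit drives the training SSE to (nearly) zero by tracking the noise, which is exactly the overfitting behaviour described for Fig(d).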

Types of regression models

Simple (1 feature)
    Linear
    Non-linear

Multiple (2+ features)
    Linear
    Non-linear

Linear regression

Given the data points, the task is to find the parameters of the line, i.e., b_0 and b_1.

Assuming some random noise ε:

Y = b_0 + b_1 x + ε    (2)

The error ε has zero mean.

Eq. (2) denotes simple regression.

Linear regression

For p predictor variables:

Y = b_0 + b_1 x_1 + b_2 x_2 + ... + b_p x_p + ε    (3)

Eq. (3) denotes multiple regression.

Define a model to estimate the parameters (i.e., b_0, b_1, ..., b_p):

E(Y | X) = b_0 + b_1 x_1 + b_2 x_2 + ... + b_p x_p    (4)

Need to find b̂_0 + b̂_1 x_1 + b̂_2 x_2 + ... + b̂_p x_p as an estimate of the original function.

Minimize the sum of squared errors.

Least-squares line → b̂_0 + b̂_1 x_1 + b̂_2 x_2 + ... + b̂_p x_p.

Linear regression
Assumptions

E(ε_i) = 0

σ(ε_i) = σ(ε), where σ(ε) is unknown (constant variance).

Errors are independent.

Errors are normally distributed with mean 0 and standard deviation σ(ε).

→ Gaussian noise or white noise.
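A small simulation of what these assumptions say about the data-generating process (the parameter values are arbitrary choices for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    b0, b1, sigma = 2.0, 0.5, 0.3                 # assumed true parameters
    x = rng.uniform(0, 10, size=1000)
    eps = rng.normal(0.0, sigma, size=x.shape)    # independent Gaussian (white) noise
    y = b0 + b1 * x + eps

    print(eps.mean())   # ≈ 0: errors have zero mean
    print(eps.std())    # ≈ sigma: the same (unknown in practice) spread for every point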

Linear regression

The least-squares regression line is the unique line for which the sum of squared vertical (y) distances between the data points and the line is the smallest possible.

Given the training points d_i = (x_i, y_i), the SSE is

SSE = Σ_{d_i ∈ D} (y_i − (b̂_0 + b̂_1 x_i))^2    (5)

Find b̂_0 and b̂_1.

Simple Linear regression

For a 2-D problem:

Y = b_0 + b_1 X    (6)

To find the coefficient values that minimize the objective function (SSE), take the partial derivative of the SSE w.r.t. each coefficient and set it to 0:

b_1 = (n Σxy − Σx Σy) / (n Σx^2 − (Σx)^2)    (7)

b_0 = (Σy − b_1 Σx) / n    (8)

This is a closed-form solution.
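Eqs. (7) and (8) transcribe directly into Python; a sketch with made-up data:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
    n = len(x)

    # Eq. (7): slope
    b1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / \
         (n * np.sum(x ** 2) - np.sum(x) ** 2)
    # Eq. (8): intercept
    b0 = (np.sum(y) - b1 * np.sum(x)) / n
    print(b0, b1)

No iteration is needed: the coefficients come out in one shot, which is what "closed form" means here.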

Multiple Linear regression

For the multiple linear regression problem:

Y = b_0 + b_1 X_1 + b_2 X_2 + ... + b_n X_n    (9)

h(x) = Σ_{i=0}^{n} b_i x_i  (with x_0 = 1)    (10)

The closed-form solution requires matrix inversion.

An alternative is to use an iterative algorithm such as the delta (LMS) rule.

Update the values of b_0, b_1, ... to minimize the sum of squared errors.
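A sketch of both routes on synthetic data (the dataset, learning rate, and iteration count are assumptions for illustration; the iterative part is a batch-gradient variant of the LMS/delta update):

    import numpy as np

    rng = np.random.default_rng(2)
    X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])  # x_0 = 1 column for b_0
    true_b = np.array([1.0, 2.0, -0.5])
    y = X @ true_b + rng.normal(0, 0.1, size=100)

    # Closed form: solve (X^T X) b = X^T y (involves inverting X^T X)
    b_closed = np.linalg.solve(X.T @ X, X.T @ y)

    # Iterative LMS-style updates: step b_0, b_1, ... against the SSE gradient
    b = np.zeros(3)
    lr = 0.01
    for _ in range(1000):
        err = y - X @ b
        b += lr * (X.T @ err) / len(y)
    print(b_closed, b)   # both should approach true_b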

Multiple-choice

In the mathematical equation of linear regression Y = b_0 + b_1 X + ε, (b_0, b_1) refers to:

1. (X-intercept, Slope)

2. (Slope, X-intercept)

3. (Y-intercept, Slope)

4. (Slope, Y-intercept)

Answer: (3)

Thank You

