
Chapter II:

Supervised Learning
Lecture 2.3
Road Map
◼ Introduction
◼ Generalization, Overfitting, and Underfitting
◼ Some Sample Datasets
◼ Supervised Machine Learning Algorithms
◼ k-Nearest Neighbors
◼ Linear Models
◼ Naive Bayes Classifiers
◼ Decision Trees
◼ Support Vector Machines



What is Linear Regression?
◼ Linear regression is an algorithm that models a linear relationship between an independent variable and a dependent variable in order to predict the outcome of future events.
◼ It is a supervised learning algorithm that captures a mathematical relationship between variables and makes predictions for continuous or numeric targets such as salary, age, product price, etc.
Cont..
◼ It learns a relation that predicts the outcome of an event from the independent-variable data points.
◼ The relation is usually a straight line that fits the different data points as closely as possible. The output is of a continuous form, i.e., a numerical value. For example, the output could be revenue or sales in currency, the number of products sold, etc.
◼ The independent variables in such a relation may be one (single) or many (multiple).
Simple Linear Regression
◼ Simple linear regression is an approach
for predicting a response using a single
feature.
◼ It is assumed that the two variables are linearly related. Hence, we try to find a linear function that predicts the response value (y) as accurately as possible as a function of the feature, or independent variable (x).



Cont..
◼ Let us consider a dataset where we have a value of the response y for every value of the feature x:

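The table from the slide is not reproduced here. A small hypothetical dataset of the same shape, reused in the sketches that follow, could be defined as:

import numpy as np

# Hypothetical (x, y) pairs standing in for the slide's table
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9])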


Cont..
◼ A scatter plot of the above dataset looks like this:

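The plot itself is not reproduced; a minimal matplotlib sketch that draws the equivalent figure for the hypothetical dataset above:

import matplotlib.pyplot as plt

plt.scatter(x, y)            # one point per (x, y) observation
plt.xlabel("x (feature)")
plt.ylabel("y (response)")
plt.title("Scatter plot of the dataset")
plt.show()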


Cont..
◼ Now, the task is to find the line that best fits the above scatter plot, so that we can predict the response for any new feature value (i.e., a value of x not present in the dataset).
◼ This line is called a regression line.
◼ The equation of the regression line is:
ŷ = β0 + β1x
where β0 is the intercept and β1 is the slope of the line.





Cont..
◼ Is this the best line we should use?





How could we get the best-fitting line?

◼ Take the line that has the least squared error!


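Minimizing the sum of squared errors has a closed-form solution in the simple (one-feature) case. A minimal NumPy sketch, reusing the hypothetical dataset from above:

import numpy as np

# Hypothetical data (same as the earlier example)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9])

x_mean, y_mean = x.mean(), y.mean()
# Least-squares estimates: slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)²,
# intercept = ȳ - slope · x̄
b1 = ((x - x_mean) * (y - y_mean)).sum() / ((x - x_mean) ** 2).sum()
b0 = y_mean - b1 * x_mean
print(f"slope ≈ {b1:.3f}, intercept ≈ {b0:.3f}")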


Gradient Descent!
◼ Another important concept in linear regression is gradient descent.
◼ It is a popular optimization approach used to train machine learning models by reducing the error between actual and predicted outcomes.
◼ Optimization in machine learning is the task of minimizing a cost function parameterized by the model's parameters.
◼ The primary goal of gradient descent is to minimize a (typically convex) cost function by iteratively updating the parameters; a minimal sketch follows.
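Here is a minimal sketch of batch gradient descent for simple linear regression; the data, learning rate, and iteration count are all illustrative assumptions:

import numpy as np

# Hypothetical data (same shape as the earlier example)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9])

b0, b1 = 0.0, 0.0        # intercept and slope, initialized at zero
lr = 0.01                # learning rate (assumed)
n = len(x)

for _ in range(5000):    # fixed iteration budget (assumed)
    y_pred = b0 + b1 * x
    error = y_pred - y
    # Gradients of the mean-squared-error cost with respect to b0 and b1
    b0 -= lr * (2.0 / n) * error.sum()
    b1 -= lr * (2.0 / n) * (error * x).sum()

print(f"intercept ≈ {b0:.3f}, slope ≈ {b1:.3f}")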


Linear Regression Equation
◼ Linear regression can be expressed
mathematically as:
◼ y = β0 + β1x + ε

Here,
y = dependent variable
x = independent variable
β0 = intercept of the line
β1 = linear regression coefficient (slope of the line)
ε = random error
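
As a quick worked example with assumed numbers: if β0 = 1.5 and β1 = 2, then for x = 4 the model predicts y = 1.5 + 2 · 4 = 9.5, plus a random error ε that the model does not capture.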





Linear Regression Model
◼ Since the Linear Regression algorithm represents a linear relationship between a dependent variable (y) and one or more independent variables (x), it is known as Linear Regression.
◼ This means it finds how the value of the dependent variable changes according to the change in the value of the independent variable. The relation between the independent and dependent variables is a straight line with a slope.



Types of Linear Regression
◼ Linear Regression can be broadly classified
into two types of algorithms:
1. Simple Linear Regression
◼ Simple Linear Regression uses a straight-line equation involving a slope and an intercept. Its simple form is:
◼ y = mx + c, where y denotes the output, x is the independent variable, m is the slope, and c is the intercept (the value of y when x = 0). With this equation, the algorithm fits the machine learning model so that its predictions are as accurate as possible.



Cont..
2. Multiple Linear Regression
◼ When the number of independent variables is more than one, the governing linear equation takes the more general form shown below.
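In general, with n independent variables x1, …, xn, the model becomes:

y = β0 + β1x1 + β2x2 + … + βnxn + ε

where each βi is the coefficient of the corresponding feature xi.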
Non-Linear Regression
◼ When the best-fitting line is not a straight line but a curve, the problem is referred to as Non-Linear Regression.
Cont..
◼ Now let's look at how we can apply the Linear Regression algorithm using scikit-learn.
◼ First, let us import the necessary libraries.
◼ Second, let us split our data into a training set and a test set so we can evaluate generalization performance.
Cont..
Here is the code:

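The original code listing is not reproduced here. Below is a minimal sketch of the workflow the slides describe; the synthetic dataset is an assumption, standing in for whatever data the lecture used:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Synthetic 1-D regression data (assumed; the lecture's dataset is not shown)
rng = np.random.RandomState(42)
X = rng.uniform(-3, 3, size=(60, 1))
y = 0.5 * X.ravel() + rng.normal(scale=0.3, size=60)

# Split the data into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit ordinary least squares on the training set
lr = LinearRegression().fit(X_train, y_train)
print("lr.coef_:", lr.coef_)            # learned slope(s)
print("lr.intercept_:", lr.intercept_)  # learned intercept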


Cont..
Let’s look at the training set and test set
performance:

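Continuing the sketch above, LinearRegression.score returns the coefficient of determination R²; similar scores on the training and test sets suggest the model is neither badly overfitting nor underfitting:

print("Training set score: {:.2f}".format(lr.score(X_train, y_train)))
print("Test set score: {:.2f}".format(lr.score(X_test, y_test)))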


Discussions
◼ Linear regression, or ordinary least squares
(OLS), is the simplest and most classic linear
method for regression.
◼ Linear regression finds the slope m and intercept b that minimize the mean squared error between the predictions and the true regression targets, y, on the training set.
◼ The mean squared error is the average of the squared differences between the predictions and the true values.
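As a one-line illustration (the arrays are assumed values):

import numpy as np

y_true = np.array([3.0, 5.0, 7.0])     # assumed true targets
y_pred = np.array([2.8, 5.3, 6.9])     # assumed model predictions
mse = np.mean((y_pred - y_true) ** 2)  # average of squared differences
print(mse)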
Thank You!

