
Artificial Intelligence

LEARNING: REGRESSION VS CLASSIFICATION

Week-15 Lecture-30
Agenda

 Regression

   Multiple Linear Regression

   Polynomial Regression

 Classification

   Logistic Regression

   k-nearest neighbors

   Tree-based methods


Regression
Recap: Linear Regression

ŷ = b0 + b1x

  y  : dependent variable
  x  : independent variable
  b0 : constant (intercept)
  b1 : coefficient (slope)

(Figure: scatter plot of the dependent variable y against the independent variable x, with the fitted regression line.)
Example: Linear Regression

(Figure: scatter plot of the Salary data.)
Example: Linear Regression (Ordinary Least Squares)

Ordinary least squares chooses b0 and b1 to minimize the sum of squared residuals, Σi (yi − ŷi)².
Multiple features (variables)

• Example: Price of a House

  Size (feet²)    Price ($1000)
  2104            460
  1416            232
  1534            315
  852             178
  …               …

ŷ = b0 + b1x
Multiple features (variables)

  Size (feet²)   Number of   Number of   Age of home   Price
                 bedrooms    floors      (years)       ($1000)
  2104           5           1           45            460
  1416           3           2           40            232
  1534           3           2           30            315
  852            2           1           36            178
  …              …           …           …             …

Notation:
n       = number of features
x^(i)   = input (features) of the i-th training example
x_j^(i) = value of feature j in the i-th training example
Hypothesis:

Previously:
ŷ = b0 + b1x   (no more)

Now:

ŷ = b0 + b1x1 + b2x2 + b3x3 + b4x4


11

For convenience of notation, define x0 = 1, so that

ŷ = b0x0 + b1x1 + b2x2 + … + bnxn = bᵀx

This is multivariate linear regression.
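As a minimal sketch of this vectorized hypothesis in Python (assuming NumPy; the coefficient and feature values below are made up for illustration):

    import numpy as np

    # Hypothetical coefficients b = [b0, b1, b2, b3, b4] and one example's
    # features x = [x0 = 1, size, bedrooms, floors, age]; values are made up.
    b = np.array([80.0, 0.1, 10.0, 3.0, -2.0])
    x = np.array([1.0, 2104.0, 5.0, 1.0, 45.0])

    # With x0 = 1 absorbed into x, the hypothesis is a single dot product.
    y_hat = b @ x
    print(y_hat)  # predicted price in $1000s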


Building a Multivariate Linear Regression Model

 There are five methods for building such models:

 All-in

 Backward Elimination (a sketch follows below)

 Forward Selection

 Bidirectional Elimination

 Score Comparison
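A minimal sketch of Backward Elimination, assuming statsmodels and a numeric NumPy feature matrix X (the 0.05 significance level is the conventional default, not one fixed by the slides):

    import numpy as np
    import statsmodels.api as sm

    def backward_elimination(X, y, sl=0.05):
        """Drop the predictor with the highest p-value, one at a time,
        until every remaining p-value is at or below sl."""
        X = sm.add_constant(X)            # prepend the intercept column (x0 = 1)
        cols = list(range(X.shape[1]))    # indices of columns still in the model
        while len(cols) > 1:
            model = sm.OLS(y, X[:, cols]).fit()
            worst = int(np.argmax(model.pvalues))
            if model.pvalues[worst] <= sl:
                return model, cols        # every remaining predictor is significant
            del cols[worst]               # eliminate the least significant one
        return sm.OLS(y, X[:, cols]).fit(), cols

Forward Selection runs the same idea in reverse (start with no predictors and add the most significant one at each step), and Bidirectional Elimination alternates the two.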
Housing prices prediction

y = b0 + b1 × width + b2 × length,  where x1 = width and x2 = length

Instead of using width and length separately, we can define a new feature,

area = width × length

and fit  y = b0 + b1 × area
Polynomial regression

(Figure: price (y) vs. size (x), comparing a quadratic fit, b0 + b1x + b2x², with a cubic fit, b0 + b1x + b2x² + b3x³.)

ŷ = b0 + b1x + b2x² + b3x³
  = b0 + b1(size) + b2(size)² + b3(size)³
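A minimal sketch of fitting such a cubic model with scikit-learn, where the polynomial terms are generated as new features (the data values below are made up):

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures

    # Made-up training data: house size (feet^2) vs. price ($1000s).
    X = np.array([[852], [1416], [1534], [2104]])
    y = np.array([178, 232, 315, 460])

    # Expand size into [size, size^2, size^3]; the model is still linear in b.
    poly = PolynomialFeatures(degree=3, include_bias=False)
    X_poly = poly.fit_transform(X)

    reg = LinearRegression().fit(X_poly, y)
    print(reg.predict(poly.transform([[1800]])))  # price estimate for 1800 ft^2

Note that this is still linear regression in the coefficients b; only the features are polynomial.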
Other Types of Regression

Support Vector Regression

Decision Tree Regression

Random Forest Regression

Artificial neural network regression

Deep learning based regression


Example: Linear Regression (Implementation)

Importing Libraries

Data Import

Train and Test Split


Example: Simple Linear Regression (Implementation)

Fitting a linear regressor to the Salary data

Plotting the Results
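Continuing the sketch above with the same assumed variables:

    import matplotlib.pyplot as plt
    from sklearn.linear_model import LinearRegression

    # Fitting a linear regressor to the training data
    regressor = LinearRegression()
    regressor.fit(X_train, y_train)

    # Plotting the results: training points and the fitted line
    plt.scatter(X_train, y_train, color='red')
    plt.plot(X_train, regressor.predict(X_train), color='blue')
    plt.title('Salary vs Experience (training set)')
    plt.xlabel('Years of experience')
    plt.ylabel('Salary')
    plt.show()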


Example: Simple Linear Regression (Results)

(Figures: the fitted line plotted over the training-set and test-set points.)
Classification
Classification

 A classification problem is one where the output variable is a category,
e.g., "red" or "blue", or "disease" or "no disease".

 A classification model attempts to draw some conclusion
from observed values.

 Given one or more inputs, a classification model will try to
predict the value of one or more outcomes.

Which of the following is/are classification problem(s)?

 Predicting the gender of a person by his/her handwriting style

 Predicting house price based on area

 Predicting whether the monsoon will be normal next year

 Predicting the number of copies of a music album that will be sold next month

(Image source: Kaggle)
Classification:
Logistic Regression
Logistic Regression: Overview

 Definition: a model for predicting one variable from other variable(s).

 Variables: the independent variable(s) (IVs) are continuous or categorical;
the dependent variable (DV) is dichotomous (a contrast between two outcomes).

 Relationship: prediction of group membership.

 Example: can we predict whether students pass from GPA, etc.?

 Assumptions: absence of multicollinearity (linearity and normality are not required).


Comparison to Linear Regression

 Because the outcome is dichotomous, we cannot use linear
regression: the relationship is not linear.

 With a dichotomous outcome, we are now talking
about probabilities (of 0 or 1).

 So logistic regression is about predicting the probability of
the outcome occurring.
Logistic Regression basics

• Logistic regression is based on the "odds ratio",

• which is the probability of an event divided by the probability of the non-event.

• For example, if Exp(b) = 2, then a one-unit change in the predictor would make the event twice as likely (.67/.33) to occur.

Exp(b) = (odds after a unit change in the predictor) / (odds before a unit change in the predictor)
Logistic Regression basics

 Single predictor:

P(Y) = 1 / (1 + e^−(b0 + b1X1 + εi))

 Multiple predictors:

P(Y) = 1 / (1 + e^−(b0 + b1X1 + b2X2 + … + bnXn + εi))

 Notice the linear regression equation inside the exponent.

 e is the base of the natural logarithm (about 2.718).
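A minimal numeric sketch of the single-predictor formula (the coefficient values below are made up):

    import math

    def logistic_probability(x, b0=-4.0, b1=1.5):
        # P(Y) = 1 / (1 + e^-(b0 + b1*x)); the coefficients here are
        # illustrative, not fitted values.
        return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

    # The linear part b0 + b1*x can be any real number, but the
    # output is always squashed into the interval (0, 1).
    for x in (0, 2, 4):
        print(x, round(logistic_probability(x), 3))   # 0.018, 0.269, 0.881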
k-nearest neighbor: Intuition
 k-nearest neighbors (KNN) is a simple algorithm that stores all available
cases and classifies new cases based on a similarity measure
(e.g., a distance function).

 KNN has been used in statistical estimation and pattern
recognition as a non-parametric technique since the early 1970s.

Hasan, M.J.; Kim, J.-M. Fault Detection of a Spherical Tank Using a Genetic Algorithm-Based Hybrid Feature Pool and k-Nearest Neighbor Algorithm. Energies 2019, 12, 991.
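A minimal classification sketch with scikit-learn's KNN implementation (the toy points and the choice k = 3 are illustrative):

    from sklearn.neighbors import KNeighborsClassifier

    # Toy 2-D points with two classes (values are made up).
    X_train = [[1, 1], [1, 2], [2, 1], [6, 5], [7, 7], [6, 6]]
    y_train = [0, 0, 0, 1, 1, 1]

    # Store the training cases; classify new points by majority vote
    # among the k = 3 nearest neighbors under Euclidean distance.
    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(X_train, y_train)

    print(knn.predict([[2, 2], [6, 7]]))   # -> [0 1]
    print(knn.predict_proba([[5, 5]]))     # proportion of neighbor votes per class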
Tree-Based Methods

Decision Tree Complexity

* Slides from Seo Hui (LG Electronics), "Gradient Boosting Model"
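As a minimal sketch of a tree-based classifier, where max_depth caps the tree's complexity (the depth limit and the Iris dataset are illustrative choices):

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    # Limiting max_depth controls complexity: deeper trees fit the
    # training data more closely but are more prone to overfitting.
    X, y = load_iris(return_X_y=True)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X, y)

    print(tree.get_depth(), tree.score(X, y))   # depth and training accuracy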
Implementation: Logistic Regression
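A minimal sketch of such an implementation with scikit-learn, using a synthetic dataset as a stand-in for the lecture's data (every name and parameter below is illustrative):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, confusion_matrix
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    # Synthetic two-feature, two-class data as a stand-in dataset.
    X, y = make_classification(n_samples=400, n_features=2, n_redundant=0,
                               n_informative=2, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # Feature scaling keeps the features comparable and helps the solver.
    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

    # Fit the classifier, then evaluate on both splits.
    clf = LogisticRegression(random_state=0).fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(confusion_matrix(y_test, y_pred))
    print('train acc:', clf.score(X_train, y_train),
          'test acc:', accuracy_score(y_test, y_pred))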
Logistic Regression: Training and Testing Results

(Figures: classifier results on the training and test sets.)
Other Classification Algorithms
Support vector machine
Random Forest
Naïve Bayes
Neural Network (MLP)

References

 Y. Bengio, I. Goodfellow, and A. Courville, Deep Learning, vol. 1. Citeseer, 2017.

 ai.berkeley.edu

 SuperDataScience

 towardsdatascience.com
