Professional Documents
Culture Documents
www.intellipaat.com
Agenda for
Today’s Session
01 03 05
What is Logistic
Linear Regression Spam Email
Regression?
Recap Classifier
02 04 06
www.intellipaat.com
Look back at Linear Regression
www.intellipaat.com
Hey Josh! Help me
Regression
(Recap)
www.intellipaat.com
Hey. Can you do
Linear something about it?
Sure…
Regression
(Recap)
www.intellipaat.com
So,
Hey. Can you do Spending “X” bucks
Linear something about it? will get her a
property of area “Y”
Regression
(Recap)
www.intellipaat.com
How did you find Simple linear
Linear out? regression.
Regression
(Recap)
www.intellipaat.com
How did you find Let me show you
Linear out? how.
Regression
(Recap)
www.intellipaat.com
Dependent
Variable
(Property Size)
+ ve
Spending more
Linear money will get her
bigger property.
Regression
(Recap)
Independent
Variable
(Money)
www.intellipaat.com
Dependent
Variable
(House Area)
Regression
(Recap)
- ve
Independent
Variable
(Garden Area)
www.intellipaat.com
Dependent
Variable
(Property Area)
Y
Linear
Regression
(Recap)
n to Classification
Linear
Regression
Regression
Logistic
Regression
comes into
the picture
www.intellipaat.com
Linear
Introductio Regression
Regression
n to
Logistic Supervised
Logistic
Regression
Regression Machine
Learning
Classification
SVM
Unsupervised
www.intellipaat.com
A statistical classification model
Deals with categorical dependent variables
What is Could be binary or dichotomous
www.intellipaat.com
Tool for applied statistics and discrete data analysis
Gives outcome in terms of probability
Why In-turn helps in classifying the given data
Logistic
Regression
?
www.intellipaat.com
Let’s understand this with an example
www.intellipaat.com
Let us explore with an example of Spam email classifier .
Classifier Email
Not
Spam
www.intellipaat.com
Steps:
www.intellipaat.com
Spam Email
Let’s get started.
Classifier
www.intellipaat.com
Independent variable:
Count of spam words
www.intellipaat.com
Dependent variable:
Probability of mail being spam.
Binary or dichotomous
STEP:1 1
Word
count
0 1 2 3 4 5 6 7 9 10
www.intellipaat.com
Probability
2 words 0
6 words 1
STEP:2 8 words 1
1 words 0
Define the variable
7 words 1
Plot labeled data
3 words 0
Draw regression line
9 words 1
Find out the best using MLE
8 words 0
Pre-labeled Dataset
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
Probability
1
1 words 0
5 words 1
3 words 1
2 words 0
7 words 1
4 words 0
9 words 1 0
8 words 0
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
Probability Probability
1 words 0 1
5 words 1 Spam if
>= 0.5
3 words 1
2 words 0 0.5
9 words 1 0
8 words 0
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
Probability Probability
1 words 0 1
5 words 1 Spam if
>= 0.5
3 words 1
2 words 0 0.5
9 words 1 0
8 words 0
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
Probability
1
Not spam if
But how do we find the < 0.5
best fit? 0
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
How? log(odds)
+∞
Probability
1
log(odds) 0
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds) How?
+∞
Probability of
mail being spam
1
Sigmoid
0
Function
0.03
0
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
How?
Probability of
mail being spam
1
Individual
log(likelihood)
likelihood
0.03
0
www.intellipaat.com
STEP:3
Define the variable
What does log(odds) mean?
Plot labeled data
www.intellipaat.com
Probability vs odds
This guy went fishing 5 times
a week
STEP:3
Caught a fish 2 times
Failed to catch 3 times.
Chances for 2
Draw regression line
Probability= =
Total chances 5
Find out the best using MLE
Chances for 2
Odds = =
Chances Against 3
www.intellipaat.com
Log(odds) and log(odds ratio)
2
Odds of catching on sunny day=
3
Define the variable
3
Odds of catching on rainy day=
Plot labeled data 2
2
Log(odds of catching on sunny day) =log
3
Draw regression line
3
Log(odds of catching on rainy day)=log( )
Find out the best using MLE 2
2
𝑜𝑑𝑑𝑠 𝑜𝑛 𝑟𝑎𝑖𝑛𝑦 𝑑𝑎𝑦
Log(odds ratio)= Log( )=log ( ) =log(0.44)
3
3
𝑜𝑑𝑑𝑠 𝑜𝑛 𝑠𝑢𝑛𝑛𝑦 𝑑𝑎𝑦
2
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds)
+∞
Probability
1
log(odds) 0
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
𝑃(𝑆𝑝𝑎𝑚)
log(odds) = log( )
1−𝑃(𝑆𝑝𝑎𝑚)
1
Probability log(odds) = log( )
1−1
1 1
log(odds) = log( )
0
+∞
log𝑏 0 = 𝑐
=>0=𝑏 𝑐 -------(1)
0
So, for equation(1) to be true.
If b<1 =>c - ∞
If b>1 =>c + ∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds)
+∞
Probability
1 𝑃(𝑆𝑝𝑎𝑚)
log(odds) = log( )
1−𝑃(𝑆𝑝𝑎𝑚)
1
log(odds) = log( ) 0
0
log(odds)=log(1)-log(0)
log(odds)=0-(- ∞)+∞
0
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds)
+∞
Probability
1 𝑃(𝑁𝑜𝑡 𝑆𝑝𝑎𝑚)
log(odds) = log( )
1−𝑃(𝑁𝑜𝑡 𝑆𝑝𝑎𝑚)
0
log(odds) = log( ) 0
1−0
0
log(odds) = log( )
1
0 log(odds)=log(0)-log(1)-∞
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds) log(odds)
+∞ +∞
0 0
-∞ -∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds) How?
+∞
Probability of
mail being spam
1
Sigmoid
0
Function
0.03
0
-∞
www.intellipaat.com
STEP:3
Define the variable What does Sigmoid Function mean?
Plot labeled data
www.intellipaat.com
Sigmoid Function:
Sigmoid function is the standard logistic function
𝐿(𝑒 𝑘 𝑥−𝑥0 )
Logistic function=
STEP:3 1+𝑒 𝑘(𝑥−𝑥0 )
Here,
Define the variable
L – Curve’s maximum value
k – Steepness of the curve
x0 – x value of Sigmoid midpoint
Plot labeled data
𝑒𝑥
Sigmoid function=
Draw regression line 1+𝑒 𝑥
Here,
Find out the best using MLE k=1
x0=0
L=1
www.intellipaat.com
Sigmoid Function:
𝑒𝑥
Sigmoid function=
1+𝑒 𝑥
STEP:3
S’ shaped curve.
Sigmoid curve has a finite limit of:
Define the variable ‘0’ as x approaches −∞
‘1’ as x approaches +∞
Plot labeled data
Y
Draw regression line 1
0 X
www.intellipaat.com
Sigmoid Function:
STEP:3 X 1
Find out the best using MLE Takes any real-valued number and maps into a
value between 0 and 1.
Helpful while solving classification problems
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds) How?
+∞
Probability of
mail being spam
1
Sigmoid
0
Function
0.03
0
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds)
+∞
Probability of
mail being spam
𝑒 log(𝑜𝑑𝑑𝑠) 1
P=
1+𝑒 log(𝑜𝑑𝑑𝑠)
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds)
+∞
Probability of
mail being spam
𝑒 log(−3.2) 1
P= = 0.03
1+𝑒 log(−3.2)
0.5
0
-3.2
0.03 .
0
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds)
+∞
Probability of
mail being spam
𝑒 log(5.6) 1
0.99
5.6
P= = 0.99
1+𝑒 log(5.6)
-3.2
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds)
+∞
Probability of
mail being spam
𝑒 log(−4.5) 1
5.6
P= = 0.01
1+𝑒 log(−4.5)
-3.2
0.01
-4.5 0
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:3 Define the variable Plot labeled data
Line using MLE
log(odds)
+∞
Probability of
mail being spam
5.6
𝑒 log(odds) 1
P=
1+𝑒 log(𝑜𝑑𝑑𝑠)
-3.2
0.01
-4.5 0
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
Probability
1
0.99
0.03
0
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
Probability
1
0.99
log(1-0.01) + log(1-0.02)
+ log(1-0.05) + log(1-0.05)
log(Likelihood of data given this curve) = -0.084
0.5 + log(0.97) + log(0.99)
+ log(0.99) + log(0.99)
0.03
0
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
+∞
Probability
5.6 1
0.99
0.03
0
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
+∞
5.6
0 Now let us rotate this line to find out the best fitted
regression line.
-3.2
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
+∞
Probability
1
0.93
2.7
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
Probability
1
0.93
log(0.93) + log(0.97)
log(Likelihood of data given this curve) = + log(0.99) + log(0.99)
-0.207
0.5 + log(1-0.18) + log(1-0.2) +
log(1-0.03) + log(1-0.03)
0.18
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
+∞
Probability
1
0.93
2.7
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
+∞ +∞
5.6
2.7
-∞ -∞
+∞
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
+∞
-∞
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
+∞
www.intellipaat.com
Draw Regression Find out the best
STEP:4 Define the variable Plot labeled data
Line using MLE
+∞
www.intellipaat.com
Summary
www.intellipaat.com
Summary
A B C
Draw a regression line and Use MLE to find out the best
Plot the labeled data
keep rotating fitted regression line
0 0 0
Sigmoid
Log(odds) MLE
Function
www.intellipaat.com
www.intellipaat.com
India : +91-7847955955
sales@intellipaat.com
www.intellipaat.com