Logistic Regression
Dr. D. Harimurugan
Logistic regression
Binary classification (0/1)
Multiclass classification (0, 1, 2)
g(z) → 1 as z → ∞;  g(z) → 0 as z → −∞
hθ(x) represents the estimated probability that the data point belongs to class 1.
ML Dr. D. Harimurugan, EE - NITJ
Outline: Logistic regression (Introduction · Sigmoid function · Decision Boundary · Cost Function) · Multiclass classification · Evaluation Metrics · Naive Bayes' Algorithm
hθ(x) = g(X·θ) = 1 / (1 + e^(−X·θ))
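A minimal NumPy sketch of this function (the names are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Works element-wise on arrays as well as scalars; g(0) = 0.5
print(sigmoid(0.0))                          # 0.5
print(sigmoid(np.array([-10.0, 0.0, 10.0])))
```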
z ≥ 0 ⇒ g(z) ≥ 0.5 ⇒ hθ(x) ≥ 0.5 ⇒ Class 1
z < 0 ⇒ g(z) < 0.5 ⇒ hθ(x) < 0.5 ⇒ Class 0
X·θ ≥ 0 ⇒ g(X·θ) ≥ 0.5 ⇒ Class 1
X·θ < 0 ⇒ g(X·θ) < 0.5 ⇒ Class 0
Predicting the probability of y belonging to class 1 or class 0 is therefore equivalent to predicting whether X·θ is greater than or less than zero. Based on the value of h, we divide the dataset into classes; the separating boundary is called the “decision boundary”.
Decision Boundary

hθ(x) = g(θ0 + θ1 x1 + θ2 x2)

For θ = [−4, 1, 1]ᵀ the boundary is
x1 + x2 = 4, i.e. x1 + x2 − 4 = 0

Predict y = 1 if x1 + x2 ≥ 4
Predict y = 0 if x1 + x2 < 4

On the boundary x1 + x2 = 4, hθ(x) = 0.5.
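A short sketch of this rule, assuming θ = [−4, 1, 1]ᵀ and two hypothetical points:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-4.0, 1.0, 1.0])      # [theta0, theta1, theta2]
X = np.array([[1.0, 3.0, 3.0],          # x1 + x2 = 6 >= 4 -> predict 1
              [1.0, 1.0, 2.0]])         # x1 + x2 = 3 <  4 -> predict 0
h = sigmoid(X @ theta)                  # h >= 0.5 exactly when X.theta >= 0
y_pred = (h >= 0.5).astype(int)
print(y_pred)                           # [1 0]
```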
Decision Boundary (non-linear)

The decision boundary is
x1² + x2² = 1
x1² + x2² − 1 ≥ 0 ⇒ y = 1
x1² + x2² − 1 < 0 ⇒ y = 0

For θ = [−1, 0, 0, 1, 1]ᵀ over the features [1, x1, x2, x1², x2²]:
hθ(x) = g(x1² + x2² − 1)
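The circular boundary can be sketched the same way, assuming the feature vector [1, x1, x2, x1², x2²]:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])   # weights for [1, x1, x2, x1^2, x2^2]

def predict(x1, x2):
    features = np.array([1.0, x1, x2, x1**2, x2**2])
    # g(x1^2 + x2^2 - 1) >= 0.5 exactly when the point is on or outside the unit circle
    return int(sigmoid(features @ theta) >= 0.5)

print(predict(2.0, 2.0))   # outside the circle -> 1
print(predict(0.1, 0.1))   # inside the circle  -> 0
```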
cost(hθ(x), y) = −log(hθ(x))        if y = 1
cost(hθ(x), y) = −log(1 − hθ(x))    if y = 0
The combined cost, cost(hθ(x), y) = −y·log(hθ(x)) − (1 − y)·log(1 − hθ(x)), recovers both cases:

y = 1:
cost(hθ(x), y) = −1·log(hθ(x)) − (1 − 1)·log(1 − hθ(x))
cost(hθ(x), y) = −log(hθ(x))

y = 0:
cost(hθ(x), y) = −0·log(hθ(x)) − (1 − 0)·log(1 − hθ(x))
cost(hθ(x), y) = −log(1 − hθ(x))
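A small sketch of this per-example cost (the h values below are hypothetical):

```python
import numpy as np

def cost(h, y):
    """Logistic cost: -y*log(h) - (1 - y)*log(1 - h)."""
    return -y * np.log(h) - (1.0 - y) * np.log(1.0 - h)

# y = 1: the penalty shrinks as the predicted probability h approaches 1
print(cost(0.9, 1))   # small penalty
print(cost(0.1, 1))   # large penalty
# y = 0: the mirror image, small when h is near 0
print(cost(0.1, 0))
```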
hθ(x) = 1 / (1 + e^(−X·θ))
Find the probability from each model; the test point belongs to the class whose model gives the highest probability.
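One-vs-all prediction can be sketched as below; the per-class θ vectors are made-up stand-ins for trained models:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One theta per class, as if each were trained on "this class vs the rest"
thetas = {0: np.array([ 2.0, -1.0]),
          1: np.array([-1.0,  0.5]),
          2: np.array([-3.0,  1.2])}

def predict(x):
    # Score the point with every per-class model, pick the highest probability
    probs = {c: sigmoid(x @ th) for c, th in thetas.items()}
    return max(probs, key=probs.get)

x = np.array([1.0, 4.0])     # [bias term, feature]
print(predict(x))            # class 2 wins here
```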
Accuracy
Confusion matrix
Precision and recall
F1-score
AUC-ROC
Log loss
Gini coefficient
True positive fraction (sensitivity): TPF = TP / (TP + FN)
False positive fraction: FPF = FP / (TN + FP)
False negative fraction: FNF = FN / (TP + FN)
True negative fraction (specificity): TNF = TN / (TN + FP)
Positive predicted value: PPV = TP / (TP + FP)
Negative predicted value: NPV = TN / (TN + FN)
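These fractions can be computed directly from confusion-matrix counts (the counts below are hypothetical):

```python
def confusion_fractions(tp, fp, fn, tn):
    """Fractions derived from a binary confusion matrix."""
    return {
        "TPF (sensitivity)": tp / (tp + fn),
        "FPF":               fp / (tn + fp),
        "FNF":               fn / (tp + fn),
        "TNF (specificity)": tn / (tn + fp),
        "PPV":               tp / (tp + fp),
        "NPV":               tn / (tn + fn),
    }

m = confusion_fractions(tp=40, fp=10, fn=5, tn=45)
print(m["TPF (sensitivity)"])   # 40 / 45
print(m["TNF (specificity)"])   # 45 / 55
```

Note that TPF + FNF = 1 by construction, a quick sanity check on any reported table.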
P(No | sunny, hot, normal, false) = [P(sunny|No)·P(hot|No)·P(normal|No)·P(false|No)]·P(No) / [P(sunny)·P(hot)·P(normal)·P(false)]
P(sunny|yes) = 3/9
P(hot|yes) = 2/9
P(normal|yes) = 6/9
P(false|yes) = 6/9
P(yes) = 9/14

P(yes|sunny, hot, normal, false) = [P(sunny|yes)·P(hot|yes)·P(normal|yes)·P(false|yes)]·P(yes)
= (3/9)·(2/9)·(6/9)·(6/9)·(9/14) = 0.0211
P(No|sunny, hot, normal, false) = (2/5)·(2/5)·(1/5)·(2/5)·(5/14) = 0.0046
Since P(yes|today) > P(no|today), the test data belongs to class “Yes”.
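The worked comparison above can be reproduced with the slide's own fractions; since the denominator P(sunny)·P(hot)·P(normal)·P(false) is common to both classes, comparing the numerators is enough:

```python
# Numerators of Bayes' rule for each class, using the fractions from the slide
p_yes = (3/9) * (2/9) * (6/9) * (6/9) * (9/14)   # ~ 0.0211
p_no  = (2/5) * (2/5) * (1/5) * (2/5) * (5/14)

# The class with the larger unnormalised score wins
print("yes" if p_yes > p_no else "no")
```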
Bayes' theorem:
P(Y|x) = P(x|Y)·P(Y) / P(x)