
Perceptrons and SVMs

CS771: Introduction to Machine Learning


Nisheeth
Hyperplane

A hyperplane is the set of points x satisfying w^T x + b = 0. For any other point, the score w^T x + b can be positive or negative, depending on which side of the hyperplane the point lies.
Hyperplane based (binary) classification

Prediction rule: predict the label as the side of the hyperplane the input falls on, y = sign(w^T x + b).

For multi-class classification with hyperplanes, there will be multiple hyperplanes (e.g., one for each pair of classes); more on this later.
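As a quick illustration, here is this prediction rule in NumPy; the weights, bias, and inputs are made-up values, not from the slides:

    import numpy as np

    def predict(w, b, X):
        """Hyperplane classifier: label is the sign of w^T x + b."""
        scores = X @ w + b                # positive or negative, depending on side
        return np.where(scores >= 0, 1, -1)

    w = np.array([2.0, -1.0])             # made-up weight vector
    b = 0.5                               # made-up bias
    X = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    print(predict(w, b, X))               # [ 1 -1]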
Loss Functions for Classification

0-1 Loss: l(y, w^T x) = 1 if y(w^T x) <= 0 and 0 otherwise, i.e., one unit of loss for each misclassified training example.

Non-convex, non-differentiable, and NP-Hard to optimize (also no useful gradient info for the most part).

[Figure: the 0-1 loss plotted against y(w^T x), equal to 1 on the negative side and 0 on the positive side.]
Loss Functions for Classification (contd)

"Perceptron" Loss: l(y, w^T x) = max(0, -y(w^T x)). Convex and non-differentiable.

Log(istic) Loss: l(y, w^T x) = log(1 + exp(-y(w^T x))). Convex and differentiable. Already saw this in logistic regression (the likelihood resulted in this loss function).

Hinge Loss: l(y, w^T x) = max(0, 1 - y(w^T x)). Convex and non-differentiable.

[Figure: the three losses plotted against y(w^T x); the perceptron loss kinks at 0, the hinge loss kinks at 1, and the log(istic) loss decreases smoothly.]
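A small sketch of these losses as functions of the margin s = y(w^T x), to make the shapes concrete (the grid of s values is arbitrary):

    import numpy as np

    def zero_one_loss(s):
        return (s <= 0).astype(float)      # 1 on a mistake, 0 otherwise

    def perceptron_loss(s):
        return np.maximum(0.0, -s)         # convex, kinked at s = 0

    def logistic_loss(s):
        return np.log1p(np.exp(-s))        # convex and smooth everywhere

    def hinge_loss(s):
        return np.maximum(0.0, 1.0 - s)    # convex, kinked at s = 1

    s = np.linspace(-2, 2, 5)
    for f in (zero_one_loss, perceptron_loss, logistic_loss, hinge_loss):
        print(f.__name__, f(s))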
Learning by Optimizing Perceptron Loss

The perceptron loss can be minimized by stochastic (sub)gradient descent, using one randomly chosen example in each iteration. Since max(0, -y_n(w^T x_n)) has zero (sub)gradient on correctly classified examples, only mistakes produce an update.
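In symbols, with learning rate eta (a standard way of writing the step; the slide's own equations did not survive extraction):

    g_n = -y_n x_n   if y_n (w^T x_n) <= 0 (a mistake),   else  g_n = 0
    w  <-  w - eta * g_n  =  w + eta * y_n x_n   on a mistake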
The Perceptron Algorithm

Repeat: pick a training example (x_n, y_n); note that an example may get chosen several times during the entire run. Test the mistake condition y_n (w^T x_n) <= 0; if it holds, update w <- w + y_n x_n, otherwise leave w unchanged.

If the training data is linearly separable, the Perceptron algo will converge in a finite number of iterations (Block & Novikoff theorem).
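A minimal sketch of the algorithm in NumPy; the iteration budget, random seed, and toy data are illustrative choices:

    import numpy as np

    def perceptron(X, y, max_iters=1000, seed=0):
        """Perceptron for labels y in {-1, +1}; X holds one example per row."""
        rng = np.random.default_rng(seed)
        w = np.zeros(X.shape[1])
        for _ in range(max_iters):
            n = rng.integers(len(X))           # one randomly chosen example
            if y[n] * (w @ X[n]) <= 0:         # mistake condition
                w = w + y[n] * X[n]            # update only on a mistake
        return w

    X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
    y = np.array([1, 1, -1, -1])
    w = perceptron(X, y)
    print(np.sign(X @ w))                      # should match y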
Perceptron and (lack of) Margins

The Perceptron stops at whichever separating hyperplane it finds first, and that hyperplane may pass very close to the training data. This is kind of an "unsafe" situation to have; ideally we would like the hyperplane to be reasonably away from the closest training examples from either class.
Support Vector Machine (SVM)

SVM was originally proposed by Vapnik and colleagues in the early 90s.

The "margin" of a hyperplane is its distance from the closest training point (on either side). The SVM looks for the separating hyperplane with the largest margin, which turns learning into a constrained optimization problem.

[Figure: classes +1 and -1 separated by a hyperplane, with the margin marked as the distance to the closest point on either side.]
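In symbols (a standard formulation; the slide's own equations were lost in extraction): the distance of example (x_n, y_n) from the hyperplane is

    gamma_n = y_n (w^T x_n + b) / ||w||

which is positive exactly when the example is correctly classified, and the margin of the hyperplane is gamma = min_n gamma_n. The SVM seeks the (w, b) that maximizes gamma.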
Hard-Margin SVM

Fixing the scale of (w, b) so that the closest examples satisfy y_n (w^T x_n + b) = 1 makes the margin equal to 1/||w||, so maximizing the margin becomes:

    minimize    (1/2) ||w||^2
    subject to  y_n (w^T x_n + b) >= 1   for all n

"Hard margin" because every training example must lie on its correct side, on or beyond the margin boundary.

[Figure: classes +1 and -1 with the max-margin hyperplane and the two supporting hyperplanes w^T x + b = +1 and w^T x + b = -1.]
Solving Hard-Margin SVM

(Note: For various SVM solvers, see "Support Vector Machine Solvers" by Bottou and Lin)
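The derivation on these slides did not survive extraction; what follows is the standard Lagrangian-dual route for this problem, not necessarily the slides verbatim. Introducing a multiplier alpha_n >= 0 for each constraint:

    L(w, b, alpha) = (1/2)||w||^2 - sum_n alpha_n [ y_n (w^T x_n + b) - 1 ]

    dL/dw = 0  =>  w = sum_n alpha_n y_n x_n
    dL/db = 0  =>  sum_n alpha_n y_n = 0

Substituting back gives the dual problem:

    max_{alpha >= 0}  sum_n alpha_n - (1/2) sum_{m,n} alpha_m alpha_n y_m y_n (x_m^T x_n)
    subject to        sum_n alpha_n y_n = 0

Only the examples with alpha_n > 0 (the support vectors) contribute to w.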
Soft-Margin SVM (More Commonly Used)

Each training example n gets a slack xi_n >= 0 measuring by how much it may violate the margin. Soft-margin constraint: y_n (w^T x_n + b) >= 1 - xi_n for all n.
Soft-Margin SVM (Contd)

    minimize    (1/2)||w||^2 + C * sum_n xi_n
    subject to  y_n (w^T x_n + b) >= 1 - xi_n,   xi_n >= 0   for all n

The first term is inversely proportional to the margin, the sum of slacks is like the training error, and C is the trade-off hyperparameter balancing margin size against training error.
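Since the optimal slack is xi_n = max(0, 1 - y_n (w^T x_n + b)), the problem can also be attacked as unconstrained hinge-loss minimization; a subgradient-descent sketch, with the step size and iteration count as made-up choices:

    import numpy as np

    def soft_margin_svm(X, y, C=1.0, eta=0.01, iters=2000):
        """Subgradient descent on (1/2)||w||^2 + C * sum_n max(0, 1 - y_n (w^T x_n + b))."""
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(iters):
            viol = y * (X @ w + b) < 1        # inside the margin or misclassified
            grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
            grad_b = -C * y[viol].sum()
            w -= eta * grad_w
            b -= eta * grad_b
        return w, b

    X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
    y = np.array([1, 1, -1, -1])
    w, b = soft_margin_svm(X, y)
    print(np.sign(X @ w + b))                 # should match y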
Support Vectors in Soft-Margin SVM

A support vector in the soft-margin case can be of three kinds:

1. Lying on the supporting hyperplanes

2. Lying within the margin region but still on the correct side of the hyperplane

3. Lying on the wrong side of the hyperplane (misclassified training examples)
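scikit-learn's SVC (a soft-margin SVM) exposes which training points end up as support vectors, which makes the three kinds easy to inspect; the toy data here is made up:

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[2.0, 1.0], [1.0, 3.0], [3.0, 3.0],
                  [-1.0, -2.0], [-2.0, -1.0], [-3.0, -3.0]])
    y = np.array([1, 1, 1, -1, -1, -1])

    clf = SVC(kernel="linear", C=1.0).fit(X, y)
    print(clf.support_)            # indices of the support vectors
    print(clf.support_vectors_)    # the support vectors themselves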
Solving Soft-Margin SVM

As in the hard-margin case, the optimal w is a weighted sum of training inputs, w = sum_n alpha_n y_n x_n, with nonzero weight only on the support vectors.

(Note: For various SVM solvers, see "Support Vector Machine Solvers" by Bottou and Lin)
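This weighted-sum form is easy to verify with scikit-learn's linear SVC, whose dual_coef_ attribute stores alpha_n * y_n for each support vector:

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
    y = np.array([1, 1, -1, -1])
    clf = SVC(kernel="linear", C=1.0).fit(X, y)

    # reconstruct w as the weighted sum of (support) training inputs
    w = clf.dual_coef_ @ clf.support_vectors_
    print(np.allclose(w, clf.coef_))   # True: matches the learned hyperplane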
