Support Vector Machines: Optimization Objective
Machine Learning
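Alternative view of logistic regression
$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$
If $y = 1$, we want $h_\theta(x) \approx 1$, $\theta^T x \gg 0$.
If $y = 0$, we want $h_\theta(x) \approx 0$, $\theta^T x \ll 0$.
Andrew Ng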
Alternative view of logistic regression
Cost of example: $-\left(y \log h_\theta(x) + (1 - y) \log(1 - h_\theta(x))\right)$
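As a rough illustration of the cost of a single example, the sketch below compares the logistic cost with a piecewise-linear surrogate of the kind an SVM uses. The names `logistic_cost`, `cost1`, and `cost0` are chosen for this example, and the unit slope of the linear pieces is an assumption (a hinge-style choice), not something the slide fixes.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_cost(z, y):
    """Logistic regression cost of one example, with z = theta^T x."""
    h = sigmoid(z)
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))

def cost1(z):
    """Assumed surrogate for y = 1: zero once z >= 1, linear below that."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Assumed surrogate for y = 0: zero once z <= -1, linear above that."""
    return max(0.0, 1.0 + z)

# Compare the two costs for a positive example at a few values of z.
for z in (-2.0, 0.0, 2.0):
    print(z, logistic_cost(z, 1), cost1(z))
```

Both costs fall as z grows for a positive example; the surrogate reaches exactly zero at z = 1, while the logistic cost only approaches zero.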
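Support vector machine
Logistic regression:
$\min_\theta \frac{1}{m} \left[ \sum_{i=1}^{m} y^{(i)} \left(-\log h_\theta(x^{(i)})\right) + (1 - y^{(i)}) \left(-\log(1 - h_\theta(x^{(i)}))\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$
Support vector machine:
$\min_\theta C \sum_{i=1}^{m} \left[ y^{(i)} \text{cost}_1(\theta^T x^{(i)}) + (1 - y^{(i)}) \text{cost}_0(\theta^T x^{(i)}) \right] + \frac{1}{2} \sum_{j=1}^{n} \theta_j^2$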
SVM hypothesis
Hypothesis: $h_\theta(x) = 1$ if $\theta^T x \geq 0$, and $h_\theta(x) = 0$ otherwise.
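To make the objective and the hypothesis concrete, here is a minimal sketch that evaluates an SVM-style objective of the form C times the summed per-example costs plus half the sum of squared parameters, and then predicts with the sign of $\theta^T x$. The helper names (`dot`, `svm_objective`, `predict`), the hinge-style costs, and the toy data are all assumptions made for this example; each input row is assumed to carry a leading $x_0 = 1$, and $\theta_0$ is left out of the regularization term.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cost1(z):
    """Assumed hinge-style cost for y = 1."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Assumed hinge-style cost for y = 0."""
    return max(0.0, 1.0 + z)

def svm_objective(theta, X, y, C):
    """C * sum of per-example costs + 0.5 * sum of theta_j^2 for j >= 1."""
    data_term = sum(
        yi * cost1(dot(theta, xi)) + (1 - yi) * cost0(dot(theta, xi))
        for xi, yi in zip(X, y)
    )
    reg_term = 0.5 * sum(t * t for t in theta[1:])
    return C * data_term + reg_term

def predict(theta, x):
    """Hypothesis: 1 if theta^T x >= 0, else 0."""
    return 1 if dot(theta, x) >= 0 else 0

# Toy data: each row is [x0, x1] with x0 = 1.
X = [[1.0, 2.0], [1.0, -2.0]]
y = [1, 0]
theta = [0.0, 1.0]
print(svm_objective(theta, X, y, C=1.0))
print([predict(theta, xi) for xi in X])
```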
Support Vector Machines: Large Margin Intuition
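Support Vector Machine
[Figure: $\text{cost}_1(z)$ and $\text{cost}_0(z)$ plotted against $z$, with ticks at $-1$ and $1$.]
If $y = 1$, we want $\theta^T x \geq 1$ (not just $\geq 0$).
If $y = 0$, we want $\theta^T x \leq -1$ (not just $< 0$).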
SVM Decision Boundary
Whenever $y^{(i)} = 1$: $\theta^T x^{(i)} \geq 1$
Whenever $y^{(i)} = 0$: $\theta^T x^{(i)} \leq -1$
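The margin conditions can be checked directly for a candidate parameter vector. The sketch below assumes the conditions are $\theta^T x \geq 1$ for positive examples and $\theta^T x \leq -1$ for negative ones; the function name `satisfies_margin` and the toy data are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def satisfies_margin(theta, X, y):
    """True if every example meets its margin condition."""
    for xi, yi in zip(X, y):
        z = dot(theta, xi)
        if yi == 1 and z < 1:
            return False  # positive example inside the margin
        if yi == 0 and z > -1:
            return False  # negative example inside the margin
    return True

# Toy data: each row is [x0, x1] with x0 = 1.
X = [[1.0, 2.0], [1.0, -2.0]]
y = [1, 0]
print(satisfies_margin([0.0, 1.0], X, y))
print(satisfies_margin([0.0, 0.25], X, y))
```

The second parameter vector classifies both points correctly by sign, but it fails the stricter margin conditions because $\theta^T x$ is only 0.5 in magnitude.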
SVM Decision Boundary: Linearly separable case
[Figure: two plots on axes $x_1$ and $x_2$.]
Support Vector Machines: The mathematics behind large margin classification (optional)
Vector Inner Product
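One standard way to read the inner product, sketched here, is as a projection: $u^T v = p \cdot \|u\|$, where $p$ is the signed length of the projection of $v$ onto $u$. The function names and the example vectors are assumptions chosen for illustration.

```python
import math

def dot(u, v):
    """Inner product u^T v."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Euclidean length of u."""
    return math.sqrt(dot(u, u))

def projection_length(v, u):
    """Signed length p of the projection of v onto u."""
    return dot(u, v) / norm(u)

u = [4.0, 2.0]
v = [1.0, 3.0]
p = projection_length(v, u)
print(p * norm(u), dot(u, v))  # the two values agree up to rounding

# When the angle between the vectors exceeds 90 degrees, p is negative.
w = [-1.0, -1.0]
print(projection_length(w, u))
```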
SVM Decision Boundary
Support Vector Machines: Kernels I
Non-linear Decision Boundary
[Figure: plots on axes $x_1$ and $x_2$.]
Kernels and Similarity
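A similarity function of the Gaussian kind can be sketched as follows, assuming the form $\exp\left(-\frac{\|x - l\|^2}{2\sigma^2}\right)$ for a point $x$ and a landmark $l$. The function name `gaussian_similarity` and the sample points are assumptions for this example.

```python
import math

def gaussian_similarity(x, landmark, sigma):
    """exp(-||x - landmark||^2 / (2 * sigma^2)) for two equal-length lists."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

landmark = [3.0, 5.0]
print(gaussian_similarity([3.0, 5.0], landmark, sigma=1.0))    # x on the landmark
print(gaussian_similarity([10.0, 12.0], landmark, sigma=1.0))  # x far from it
print(gaussian_similarity([4.0, 5.0], landmark, sigma=3.0))    # wider sigma
```

The value is 1 when $x$ sits on the landmark and decays toward 0 as $x$ moves away; a larger `sigma` makes that decay slower.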
Example:
[Figure: plot on axes $x_1$ and $x_2$.]
Support Vector Machines: Kernels II
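Choosing the landmarks
Given $x$: $f_i = \text{similarity}(x, l^{(i)}) = \exp\left(-\frac{\|x - l^{(i)}\|^2}{2\sigma^2}\right)$
[Figure: plot on axes $x_1$ and $x_2$.]
Predict $y = 1$ if $\theta_0 + \theta_1 f_1 + \theta_2 f_2 + \theta_3 f_3 \geq 0$.
Where to get $l^{(1)}, l^{(2)}, l^{(3)}, \ldots$?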
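SVM with Kernels
Given $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})$, choose $l^{(1)} = x^{(1)}, l^{(2)} = x^{(2)}, \ldots, l^{(m)} = x^{(m)}$.
Given example $x$: $f_1 = \text{similarity}(x, l^{(1)})$, $f_2 = \text{similarity}(x, l^{(2)})$, $\ldots$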
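The mapping from an example to its kernel features can be sketched as below, assuming each training example serves as a landmark and the similarity is Gaussian. The names `gaussian_similarity` and `kernel_features`, the leading constant feature $f_0 = 1$, and the toy training set are assumptions for this example.

```python
import math

def gaussian_similarity(x, landmark, sigma):
    """exp(-||x - landmark||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_features(x, landmarks, sigma):
    """Feature vector [f0, f1, ..., fm] with f0 = 1 and fi = similarity(x, l_i)."""
    return [1.0] + [gaussian_similarity(x, l, sigma) for l in landmarks]

# Assumed toy training set; every example doubles as a landmark.
X_train = [[1.0, 1.0], [2.0, 3.0], [5.0, 0.0]]
landmarks = X_train

f = kernel_features(X_train[1], landmarks, sigma=1.0)
print(len(f), f)
```

With $m$ training examples the feature vector has $m + 1$ entries, and the entry that compares an example with itself equals 1.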
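SVM with Kernels
Hypothesis: Given $x$, compute features $f \in \mathbb{R}^{m+1}$.
Predict "y=1" if $\theta^T f \geq 0$.
Training: $\min_\theta C \sum_{i=1}^{m} \left[ y^{(i)} \text{cost}_1(\theta^T f^{(i)}) + (1 - y^{(i)}) \text{cost}_0(\theta^T f^{(i)}) \right] + \frac{1}{2} \sum_{j=1}^{m} \theta_j^2$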
SVM parameters:
C ($= \frac{1}{\lambda}$).
Large C: Lower bias, high variance.
Small C: Higher bias, low variance.
Support Vector Machines
Use an SVM software package (e.g. liblinear, libsvm, …) to solve for parameters $\theta$.
Need to specify:
Choice of parameter C.
Choice of kernel (similarity function):
E.g. No kernel ("linear kernel"): Predict "y = 1" if $\theta^T x \geq 0$.
Gaussian kernel: $f_i = \exp\left(-\frac{\|x - l^{(i)}\|^2}{2\sigma^2}\right)$, where $l^{(i)} = x^{(i)}$. Need to choose $\sigma^2$.
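Kernel (similarity) functions:
function f = kernel(x1, x2)
  % Gaussian similarity between vectors x1 and x2.
  sigma = 1;  % assumed value; choose sigma to suit the data
  f = exp(-sum((x1 - x2) .^ 2) / (2 * sigma ^ 2));
return
Note: Do perform feature scaling before using the Gaussian kernel.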
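Feature scaling matters here because the Gaussian kernel depends on squared distances, so a feature with a large numeric range would otherwise dominate. A minimal sketch of one common approach, standardizing each feature to zero mean and unit variance, is shown below; the function name `standardize` and the sample data (floor area and number of bedrooms) are assumptions for illustration.

```python
import math

def standardize(X):
    """Scale each column of X to zero mean and unit (population) standard deviation."""
    m, n = len(X), len(X[0])
    means = [sum(row[j] for row in X) / m for j in range(n)]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / m) or 1.0
        for j in range(n)
    ]
    scaled = [[(row[j] - means[j]) / stds[j] for j in range(n)] for row in X]
    return scaled, means, stds

# Assumed toy data: [floor area, bedrooms]; the first feature is far larger in scale.
X = [[1000.0, 1.0], [2000.0, 3.0], [3000.0, 5.0]]
scaled, means, stds = standardize(X)
print(scaled)
```

The same `means` and `stds` computed on the training set would be reused to scale any new example before computing its kernel features.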
Other choices of kernel
Note: Not all similarity functions make valid kernels. (Need to satisfy technical condition called "Mercer's Theorem" to make sure SVM packages' optimizations run correctly, and do not diverge.)