
Support Vector Machine

Compiled by Koushika B, COE17B044


Guided by
Dr Umarani Jayaraman

Department of Computer Science and Engineering


Indian Institute of Information Technology Design and Manufacturing
Kancheepuram

May 5, 2022

SVM

w^T X_i + w_0 > 0

w·X_i + w_0 > 0   if X_i ∈ ω_1
w·X_i + w_0 < 0   if X_i ∈ ω_2

g(X) = w·X_i + w_0 ≥ b   for good generalization

The distance r of a point X from the hyperplane H, where g(X) is the
discriminant, is r = g(X)/||w||, which is nothing but (w·X_i + w_0)/||w||.
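As a quick illustration (a minimal sketch; the hyperplane parameters and the point are hypothetical values, not from the slides), this distance can be computed directly:

```python
import numpy as np

# Hypothetical hyperplane parameters, chosen only for illustration
w = np.array([2.0, 1.0])    # weight vector w
w0 = -3.0                   # bias term w_0

x = np.array([4.0, 5.0])    # a sample point X

g = np.dot(w, x) + w0            # discriminant g(X) = w.X + w_0
r = g / np.linalg.norm(w)        # signed distance of X from the hyperplane H
print(g, r)                      # positive sign => X lies on the omega_1 side
```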

So, we must ensure that

(w·X_i + w_0)/||w|| ≥ b   ... (1)

(w·X_i + w_0) ≥ b·||w||

Scaling w and w_0 so that b·||w|| = 1 gives:

w·X_i + w_0 ≥ +1   if X_i ∈ ω_1
w·X_i + w_0 ≤ −1   if X_i ∈ ω_2   ... (2)

From (2), we can establish a uniform criterion as follows. Let y_i be the
class label, whose value is ±1:
+1 for class ω_1
−1 for class ω_2

y_i(w·X_i + w_0) ≥ 1, and the equality holds at the support vectors:

y_i(w·X_i + w_0) = 1   if X_i is a support vector
y_i(w·X_i + w_0) > 1   if X_i is not a support vector
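This criterion is easy to check numerically (a minimal sketch; the toy data, w, and w_0 below are hand-picked for illustration only):

```python
import numpy as np

# Toy separable data; values are hand-picked for illustration only
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1, +1, -1, -1])      # +1 for omega_1, -1 for omega_2

w = np.array([0.25, 0.25])          # assumed hyperplane parameters
w0 = 0.0

margins = y * (X @ w + w0)          # y_i (w.X_i + w_0) for every sample
print(margins)                      # [1.  1.5 1.  1. ] -- equals 1 at support vectors
```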

From (1), in order to maximize the margin b, ||w|| has to be minimized and,
at the same time, w_0 has to be maximized.
For the minimization of ||w||, consider the other constraint:

y_i(w·X_i + w_0) = 1

Because this is a constrained optimization problem, it can be converted into
an unconstrained problem by using Lagrange multipliers.

Minimization of ||w|| is the same as minimization of φ(w), a function of w:

φ(w) = w^T·w
φ(w) = ½ (w·w)   (dot product)

The factor ½ is introduced for mathematical convenience.

This is subject to the constraints

y_i(w·X_i + w_0) = 1   if X_i are support vectors
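As a one-line sketch of this objective (the toy weight vector is the assumed one from the earlier example):

```python
import numpy as np

def phi(w):
    """Objective phi(w) = 1/2 (w . w); the 1/2 cancels when differentiating."""
    return 0.5 * np.dot(w, w)

print(phi(np.array([0.25, 0.25])))   # 0.0625 for the toy weight vector above
```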

It can be written as an unconstrained optimization problem as follows:

L(||w||, w_0) = ½||w||² − Σ_{i=1}^{n} λ_i [y_i(w·X_i + w_0) − 1]

minimize ||w||.
½||w||² is the objective function.
[y_i(w·X_i + w_0) − 1] is the constraint.

We can define the Lagrangian of the form:

L(w, w_0) = ½(w·w) − Σ_i α_i [y_i(w·X_i + w_0) − 1]

minimize ||w||; maximize w_0.
α_i is the Lagrange multiplier.

Take the derivative of L(w, w_0) with respect to w and w_0.

L(w, w_0) = ½(w·w) − Σ_i α_i [y_i(w·X_i + w_0) − 1]
L(w, w_0) = ½(w·w) − Σ_i α_i y_i (w·X_i) − Σ_i α_i y_i w_0 + Σ_i α_i

∂L/∂w_0 = −Σ_i α_i y_i = 0

Σ_{i=1}^{n} α_i y_i = 0   ⇐ one of the constraints

n = number of samples during training

L(w, w_0) = ½(w·w) − Σ_i α_i y_i (w·X_i) − Σ_i α_i y_i w_0 + Σ_i α_i

∂L/∂w = w − Σ_i α_i y_i X_i = 0

w = Σ_{i=1}^{n} α_i y_i X_i

Substitute these values back into the Lagrangian:

L(w, w_0) = ½(w·w) − Σ_i α_i y_i (w·X_i) − Σ_i α_i y_i w_0 + Σ_i α_i

where w·X_i = Σ_j α_j y_j (X_j·X_i) and Σ_i α_i y_i w_0 = 0.
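Both the constraint and the expansion of w are one line each in code (a sketch continuing the toy data; the α values are assumed, not computed by the slides):

```python
import numpy as np

# Continuing the toy data; the alpha values are assumed for illustration
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])
alpha = np.array([0.0625, 0.0, 0.0625, 0.0])   # hypothetical multipliers

print(np.dot(alpha, y))     # constraint: sum_i alpha_i y_i = 0
w = (alpha * y) @ X         # w = sum_i alpha_i y_i X_i
print(w)                    # [0.25 0.25], the weight vector used earlier
```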

= ½ Σ_i Σ_j α_i α_j y_i y_j (X_i·X_j) − Σ_i Σ_j α_i α_j y_i y_j (X_i·X_j) + Σ_i α_i

L(α) = Σ_{i=1}^{n} α_i − ½ Σ_i Σ_j α_i α_j y_i y_j (X_i·X_j)

We have to maximize this over different values of α_i.
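A direct sketch of this dual objective (the data and multipliers are the hypothetical ones from before):

```python
import numpy as np

def dual_objective(alpha, X, y):
    """L(alpha) = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j (X_i.X_j)."""
    G = (y[:, None] * X) @ (y[:, None] * X).T    # G_ij = y_i y_j (X_i . X_j)
    return alpha.sum() - 0.5 * alpha @ G @ alpha

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])
alpha = np.array([0.0625, 0.0, 0.0625, 0.0])     # hypothetical multipliers
print(dual_objective(alpha, X, y))               # value to maximize over alpha
```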

The Lagrange multipliers α_i are always non-negative:

α_i ≥ 0
Σ_{i=1}^{n} α_i y_i = 0

When we maximize this Lagrangian, it is quite likely that some of the
multipliers are zero while a few are very high.
If α_i = 0, the corresponding training vector X_i is not a support vector.
If α_i ≠ 0, the corresponding training vector X_i has a high influence on the
position of the hyperplane (it is a support vector).
Only the vectors with α_i ≠ 0 go into the decision making (see the sketch below).
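Picking out those vectors is a simple filter (a sketch; the multipliers are the assumed toy values):

```python
import numpy as np

alpha = np.array([0.0625, 0.0, 0.0625, 0.0])   # hypothetical multipliers
tol = 1e-8                                     # numerical tolerance for "zero"

sv_idx = np.where(alpha > tol)[0]   # indices where alpha_i != 0
print(sv_idx)                       # only these X_i enter the decision rule
```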

For an unknown feature vector Z:

g(Z) = w·Z + w_0
g(Z) = sign(Σ_{i=1}^{n} α_i y_i (X_i·Z) + w_0)

If the sign is +ve, Z ∈ ω_1.
If the sign is −ve, Z ∈ ω_2.
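This decision rule translates directly (a sketch reusing the toy data; Z is an arbitrary test point):

```python
import numpy as np

def classify(Z, X, y, alpha, w0):
    """Decision rule: sign(sum_i alpha_i y_i (X_i . Z) + w_0)."""
    return np.sign(np.dot(alpha * y, X @ Z) + w0)

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])
alpha = np.array([0.0625, 0.0, 0.0625, 0.0])

Z = np.array([1.0, 2.0])                 # unknown feature vector Z
print(classify(Z, X, y, alpha, w0=0.0))  # +1.0 => omega_1, -1.0 => omega_2
```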

The steps of SVM design to estimate w and w_0:

w = Σ_{i=1}^{n} α_i y_i X_i

and

w_0 = −½ [min_{∀y_j=+1} Σ_i α_i y_i (X_i·X_j) + max_{∀y_j=−1} Σ_i α_i y_i (X_i·X_j)]

w_0 = −½ [min_{∀y_i=+1} w·X_i + max_{∀y_i=−1} w·X_i]

Substitute these w and w_0 into the classification rule to classify the
unknown feature vector Z:

g(Z) = w·Z + w_0
g(Z) = sign(Σ_{i=1}^{n} α_i y_i (X_i·Z) + w_0)
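As an end-to-end sanity check (a sketch using scikit-learn, not part of the original slides; the large C value approximates the hard-margin case assumed here), one can train a linear SVM, read off the multipliers, and rebuild w and w_0:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1, +1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)    # very large C ~ hard margin

# dual_coef_ stores alpha_i * y_i for the support vectors only
w = clf.dual_coef_ @ clf.support_vectors_      # w = sum_i alpha_i y_i X_i
w0 = clf.intercept_

Z = np.array([[1.0, 2.0]])                     # unknown feature vector Z
print(np.sign(Z @ w.T + w0), clf.predict(Z))   # both report the class of Z
```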

But in implementation, for a support vector X_i from class ω_1 (y_i = +1):

w^T X_i + w_0 = 1
w·X_i + w_0 = 1

w_0 = 1 − (w·X_i)
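In code, this is a single subtraction (a sketch using the assumed toy values from earlier):

```python
import numpy as np

w = np.array([0.25, 0.25])     # weight vector from the earlier sketch
x_sv = np.array([2.0, 2.0])    # a support vector from class omega_1

w0 = 1.0 - np.dot(w, x_sv)     # w_0 = 1 - (w . X_i) for a positive support vector
print(w0)                      # 0.0, consistent with the toy example
```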

Additional slides

Consider the optimization problem: maximize f(x, y) subject to

g(x, y) = 0   or   g(x, y) = c

L(x, y, λ) = f(x, y) − λ·g(x, y)

f(x, y) is the objective function.
g(x, y) is the constraint.

Assume both f and g have continuous first partial derivatives.
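To make the technique concrete (a worked toy example, not from the slides: maximize f(x, y) = xy subject to x + y = 1):

```python
import sympy as sp

# Worked toy example: maximize f(x, y) = x*y subject to g(x, y) = x + y - 1 = 0
x, y, lam = sp.symbols('x y lambda')
f = x * y
g = x + y - 1

L = f - lam * g                        # Lagrangian L = f - lambda * g
stationary = sp.solve([sp.diff(L, v) for v in (x, y, lam)], [x, y, lam], dict=True)
print(stationary)                      # x = y = 1/2, so the maximum is f = 1/4
```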

Additional slides

Here:

f(x, y) = f(||w||, w_0) ⇒ objective function
g(x, y): y_i(w·X_i + w_0) = 1 ⇒ constraint

We can define the Lagrangian of the form:

L(||w||, w_0) = ½||w||² − Σ_{i=1}^{n} λ_i [y_i(w·X_i + w_0) − 1]

minimize ||w||; maximize w_0.
½||w||² is the objective function.
[y_i(w·X_i + w_0) − 1] is the constraint.

THANK YOU
