Figure 13: squares showing the negative examples and circles showing the positive examples
As shown in the figure above, there is no linear function that can separate the four training points, so we use the kernel function

$$k(\vec{x}_i, \vec{x}_j) = (\vec{x}_i \cdot \vec{x}_j + 1)^2 \qquad \text{(a)}$$

where $\vec{x}_i = (x_{i1}, x_{i2})$ and $\vec{x}_j = (x_{j1}, x_{j2})$.
Expanded, the kernel function can be expressed as

$$k(\vec{x}_i, \vec{x}_j) = 1 + x_{i1}^2 x_{j1}^2 + 2 x_{i1} x_{j1} x_{i2} x_{j2} + x_{i2}^2 x_{j2}^2 + 2 x_{i1} x_{j1} + 2 x_{i2} x_{j2} \qquad \text{(b)}$$

The image of an input vector $\vec{x}_i$ induced in the feature space is therefore

$$\Phi(\vec{x}_i) = \left(1,\; x_{i1}^2,\; \sqrt{2}\, x_{i1} x_{i2},\; x_{i2}^2,\; \sqrt{2}\, x_{i1},\; \sqrt{2}\, x_{i2}\right) \qquad \text{(c)}$$
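As a quick numerical check (a Python sketch; the names `kernel`, `phi`, and `dot` are illustrative, and the four XOR points from the figure are assumed), the feature map in (c) reproduces the kernel in (a), i.e. $\Phi(\vec{x}_i) \cdot \Phi(\vec{x}_j) = (\vec{x}_i \cdot \vec{x}_j + 1)^2$:

```python
import math
import itertools

def kernel(x, z):
    # Polynomial kernel from equation (a): (x . z + 1)^2
    return (x[0] * z[0] + x[1] * z[1] + 1) ** 2

def phi(x):
    # Induced feature map from equation (c)
    x1, x2 = x
    return (1, x1 ** 2, math.sqrt(2) * x1 * x2, x2 ** 2,
            math.sqrt(2) * x1, math.sqrt(2) * x2)

def dot(u, v):
    # Inner product in the six-dimensional feature space
    return sum(a * b for a, b in zip(u, v))

# Check phi(x) . phi(z) == kernel(x, z) on all pairs of the four XOR points
points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
for x, z in itertools.product(points, points):
    assert math.isclose(dot(phi(x), phi(z)), kernel(x, z))
```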
Here, all four points are support vectors, and by substituting the kernel function from equation (a) we have

$$W(\alpha) = \sum_{i=1}^{4} \alpha_i - \frac{1}{2} \sum_{i=1}^{4} \sum_{j=1}^{4} \alpha_i \alpha_j y_i y_j \left(1 + \vec{x}_i \cdot \vec{x}_j\right)^2 \qquad \text{(e)}$$

$$= \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 - \frac{1}{2}\left[\alpha_1 \alpha_1 y_1 y_1 (1 + \vec{x}_1 \cdot \vec{x}_1)^2 + \alpha_1 \alpha_2 y_1 y_2 (1 + \vec{x}_1 \cdot \vec{x}_2)^2 + \alpha_1 \alpha_3 y_1 y_3 (1 + \vec{x}_1 \cdot \vec{x}_3)^2 + \cdots + \alpha_4 \alpha_4 y_4 y_4 (1 + \vec{x}_4 \cdot \vec{x}_4)^2\right] \qquad \text{(f)}$$
$$\vec{x}_1 \cdot \vec{x}_1 = \begin{bmatrix} -1 & -1 \end{bmatrix} \begin{bmatrix} -1 \\ -1 \end{bmatrix} = 2$$

$$\vec{x}_1 \cdot \vec{x}_2 = \begin{bmatrix} -1 & -1 \end{bmatrix} \begin{bmatrix} -1 \\ +1 \end{bmatrix} = 0$$

Similarly,

$$\vec{x}_1 \cdot \vec{x}_3 = 0, \quad \vec{x}_1 \cdot \vec{x}_4 = -2, \quad \vec{x}_2 \cdot \vec{x}_2 = 2, \quad \vec{x}_2 \cdot \vec{x}_3 = -2,$$
$$\vec{x}_2 \cdot \vec{x}_4 = 0, \quad \vec{x}_3 \cdot \vec{x}_3 = 2, \quad \vec{x}_3 \cdot \vec{x}_4 = 0, \quad \vec{x}_4 \cdot \vec{x}_4 = 2$$
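These dot products, and the kernel values built from them, can also be tabulated with a short script (a sketch; the four XOR coordinates are taken from the worked products above):

```python
# XOR training points implied by the worked dot products above
points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def dot2(x, z):
    # Plain inner product x . z in the input space
    return x[0] * z[0] + x[1] * z[1]

# Table of inner products x_i . x_j
dots = [[dot2(x, z) for z in points] for x in points]

# Kernel values K_ij = (1 + x_i . x_j)^2 used in equation (f)
K = [[(1 + d) ** 2 for d in row] for row in dots]
```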
Substituting these values in equation (f), we get

$$W(\alpha) = \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 - \frac{1}{2}\left[9\alpha_1^2 - 2\alpha_1\alpha_2 - 2\alpha_1\alpha_3 + 2\alpha_1\alpha_4 + 9\alpha_2^2 + 2\alpha_2\alpha_3 - 2\alpha_2\alpha_4 + 9\alpha_3^2 - 2\alpha_3\alpha_4 + 9\alpha_4^2\right] \qquad \text{(g)}$$

Optimizing $W(\alpha)$ with respect to the Lagrange multipliers yields the following set of simultaneous equations:
w.r.t. $\alpha_1$: $\; 1 - \frac{1}{2}\left(18\alpha_1 - 2\alpha_2 - 2\alpha_3 + 2\alpha_4\right) = 0$, i.e.

$$9\alpha_1 - \alpha_2 - \alpha_3 + \alpha_4 = 1 \qquad \text{(h)}$$
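The stationarity conditions form a 4×4 linear system, which can also be solved numerically (a sketch using NumPy; the XOR coordinates and the labels $y = (-1, +1, +1, -1)$ are assumptions taken from the figure):

```python
import numpy as np

# XOR training points and (assumed) labels from the figure
points = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
y = np.array([-1.0, 1.0, 1.0, -1.0])

# Kernel matrix K_ij = (1 + x_i . x_j)^2 from equation (a)
K = (1.0 + points @ points.T) ** 2

# Stationarity of W(alpha) gives H alpha = 1 with H_ij = y_i y_j K_ij;
# its first row reproduces equation (h): 9a1 - a2 - a3 + a4 = 1
H = np.outer(y, y) * K
alpha = np.linalg.solve(H, np.ones(4))
```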
Differentiating similarly with respect to $\alpha_2$, $\alpha_3$ and $\alpha_4$ and solving the resulting simultaneous equations gives $\alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = \frac{1}{8}$. The weight vector is then

$$\vec{w} = \sum_{i=1}^{4} \alpha_i y_i \Phi(\vec{x}_i) = \frac{1}{8}\left[\, y_1 \Phi(\vec{x}_1) + y_2 \Phi(\vec{x}_2) + y_3 \Phi(\vec{x}_3) + y_4 \Phi(\vec{x}_4) \right]$$

$$= \frac{1}{8}\left[ -\begin{bmatrix} 1 \\ 1 \\ \sqrt{2} \\ 1 \\ -\sqrt{2} \\ -\sqrt{2} \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ -\sqrt{2} \\ 1 \\ -\sqrt{2} \\ \sqrt{2} \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ -\sqrt{2} \\ 1 \\ \sqrt{2} \\ -\sqrt{2} \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ \sqrt{2} \\ 1 \\ \sqrt{2} \\ \sqrt{2} \end{bmatrix} \right]$$

$$= \frac{1}{8}\begin{bmatrix} 0 \\ 0 \\ -4\sqrt{2} \\ 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \\ 0 \end{bmatrix}$$
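The weight vector $\vec{w} = \sum_i \alpha_i y_i \Phi(\vec{x}_i)$ can be checked numerically as well (a sketch; the XOR points, the labels $y = (-1, +1, +1, -1)$, and $\alpha_i = 1/8$ are assumptions consistent with the derivation, and `phi` follows equation (c)):

```python
import math

points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # assumed XOR inputs
labels = [-1, 1, 1, -1]                        # assumed XOR labels
alphas = [1 / 8] * 4                           # multipliers from the dual

def phi(x):
    # Feature map from equation (c)
    x1, x2 = x
    return (1, x1 ** 2, math.sqrt(2) * x1 * x2, x2 ** 2,
            math.sqrt(2) * x1, math.sqrt(2) * x2)

# w = sum_i alpha_i * y_i * phi(x_i), computed component by component
w = [sum(a * y * f for a, y, f in zip(alphas, labels, col))
     for col in zip(*(phi(p) for p in points))]

# Only the sqrt(2) x1 x2 component survives: w = (0, 0, -1/sqrt(2), 0, 0, 0)
```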
The bias $b$ is 0, because the first element of $\vec{w}$ is 0.

The optimal hyperplane becomes

$$\vec{w} \cdot \Phi(\vec{x}) + b = 0$$

Substituting the values of $\vec{w}$ and $b$,
$$\begin{bmatrix} 0 & 0 & -\frac{1}{\sqrt{2}} & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ x_1^2 \\ \sqrt{2}\, x_1 x_2 \\ x_2^2 \\ \sqrt{2}\, x_1 \\ \sqrt{2}\, x_2 \end{bmatrix} = 0$$
which reduces to the optimal hyperplane

$$-x_1 x_2 = 0$$
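As a final sanity check, the decision function induced by $\vec{w}$ and $b$, namely $f(x) = -x_1 x_2$, recovers the correct class for all four training points (a sketch; the XOR labelling is assumed from the figure):

```python
# Decision function induced by w and b = 0: f(x) = w . phi(x) = -x1 * x2
def f(x):
    return -x[0] * x[1]

# Assumed XOR labelling: (-1,-1) and (1,1) negative, the other two positive
samples = {(-1, -1): -1, (-1, 1): 1, (1, -1): 1, (1, 1): -1}
for x, label in samples.items():
    assert f(x) == label  # f agrees with the class label on every point
```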