November 2018
Boucetta Imad
Contents

1 Introduction
2 Mathematical background
  2.1 An optimization problem
  2.2 Misclassification tolerance
  2.3 Dual problem
  2.4 Kernel trick
4 Results
  4.0.1 Data using hard margin separator
  4.0.2 Data using soft margin separator
  4.0.3 Data using polynomial kernel
  4.0.4 Data using Gaussian kernel
5 Summary
1 Introduction
Support vector machines (SVMs) are a family of algorithms that are among the best supervised machine-learning algorithms for classification.
We will discuss some of their variants, apply them to 4 different types of randomly generated data, and work on finding the best results (i.e. the parameters most suitable to each data set).
Each data set is randomly generated and is composed of 50 points (2 features each) to train our separator and 300 points (2 features each) to test our decision boundary.
The whole work is carried out in MATLAB.
2 Mathematical background
To separate the data we look for a hyperplane (i.e. a decision boundary, e.g. a line for data with 2 features or a plane for data with 3 features, etc.) described by equation (1),

$$\langle w, x\rangle + b = 0, \tag{1}$$

where $w$ is the hyperplane's normal and $b$ its bias.
As there is an infinity of hyperplanes that can separate the data (perfectly or not), we look for a separator that not only separates the data but also maximizes the margin (i.e. a hyperplane whose distance to the closest points is maximized). As the data is separated, each point satisfies

$$\langle w, x_i\rangle + b \ge +1 \quad \text{for } y_i = +1, \tag{2}$$
$$\langle w, x_i\rangle + b \le -1 \quad \text{for } y_i = -1, \tag{3}$$

where $y_i$ is the label of point $i$ (i.e. it belongs to class 1 ($y_i = +1$) or class 2 ($y_i = -1$)). Equations (2) and (3) can be combined into the single equation (4):

$$y_i(\langle w, x_i\rangle + b) \ge 1. \tag{4}$$
The points that sit on the margin (i.e. $\langle w, x_i\rangle + b = \pm 1$) are called support vectors, because they are the ones that actually orient the hyperplane, being the closest points to it.
As a consequence their distance to the hyperplane is the same and equal to $\frac{1}{\|w\|}$ (this distance is easily derived), so the width of the margin is $\frac{2}{\|w\|}$.
To maximize the margin we can reformulate the problem and minimize $\|w\|$, which is the same as minimizing $\frac{1}{2}\|w\|^2$ (quadratic optimization turns out to be easier).
Thus the optimization problem is:

$$\min_{w}\ \frac{1}{2}\|w\|^2 \quad \text{subject to } y_i(\langle w, x_i\rangle + b) \ge 1,\ i = 1, \dots, n. \tag{5}$$
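The claim above that a support vector lies at distance $1/\|w\|$ from the hyperplane follows in one line from the point-to-hyperplane distance formula:

```latex
% For any point x_0, the distance to the hyperplane <w, x> + b = 0 is
d(x_0) = \frac{|\langle w, x_0\rangle + b|}{\|w\|}.
% A support vector x_s satisfies y_s(\langle w, x_s\rangle + b) = 1,
% hence |\langle w, x_s\rangle + b| = 1, and therefore
d(x_s) = \frac{1}{\|w\|},
\qquad \text{margin width} = \frac{2}{\|w\|}.
```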
2.3 Dual problem
The constrained optimization can be solved using Lagrange multipliers:

$$\max_{\alpha}\ \min_{w,b}\ \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i (\langle w, x_i\rangle + b) - 1 \right] \quad \text{subject to } \alpha_i \ge 0,\ i = 1, \dots, n. \tag{7}$$

Setting the derivatives with respect to $w$ and $b$ to zero gives

$$w = \sum_{i=1}^{n} \alpha_i y_i x_i, \tag{8}$$

$$\sum_{i=1}^{n} \alpha_i y_i = 0. \tag{9}$$
Figure 4: Data not linearly separable
2.4 Kernel trick
The importance of the kernel trick is that we can compute the kernel without computing (or even knowing) the transformation function.
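For example, with 2-feature points (as in our data), the degree-2 polynomial kernel corresponds to an explicit 6-dimensional feature map $\phi$ that we never need to evaluate:

```latex
k(x, z) = (1 + \langle x, z\rangle)^2 = \langle \phi(x), \phi(z)\rangle,
\qquad
\phi(x) = \bigl(1,\ \sqrt{2}\,x_1,\ \sqrt{2}\,x_2,\ x_1^2,\ x_2^2,\ \sqrt{2}\,x_1 x_2\bigr).
```

Expanding $(1 + x_1 z_1 + x_2 z_2)^2$ term by term reproduces $\langle \phi(x), \phi(z)\rangle$ exactly, so evaluating the kernel costs a dot product in 2 dimensions instead of 6.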
Finally:

$$\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i - \frac{1}{2}\sum_{i,j=1}^{n} y_i y_j \alpha_i \alpha_j\, k(x_i, x_j) \quad \text{subject to } \alpha_i \ge 0,\ i = 1, \dots, n. \tag{13}$$
MATLAB's quadprog solves quadratic programs of the form

$$\min_{x}\ \tfrac{1}{2}x^\top H x + f^\top x \quad \text{subject to } A x \le b,\ A_{eq} x = b_{eq},\ lb \le x \le ub,$$

where H, A, and Aeq are matrices, and f, b, beq, lb, ub, and x are vectors.
To solve our problem we map it onto the solver's parameters, where (in our parameter setup) x is the data matrix, y the data labels, and n the number of training points. Finally we call the solver quadprog and compute w and b.
[alphas,fval,ef,op,ld]=quadprog(H,f,A,b,Aeq,beq,lb,ub);
w=x.'*(alphas.*y);
[maxAlphas,imaxAlphas]=max(alphas);
b=y(imaxAlphas)-w.'*x(imaxAlphas,:).';
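As a cross-check of these four lines, here is a dependency-free Python sketch of the same recovery of (w, b) from the dual solution (the function and variable names are illustrative, not part of the report's MATLAB code):

```python
def recover_w_b(X, y, alpha):
    # w = X' * (alpha .* y), as in the MATLAB line above.
    d = len(X[0])
    w = [sum(alpha[i] * y[i] * X[i][k] for i in range(len(X))) for k in range(d)]
    # Pick the point with the largest alpha (a support vector) and use
    # y_s = <w, x_s> + b, i.e. b = y_s - <w, x_s>.
    s = max(range(len(alpha)), key=lambda i: alpha[i])
    b = y[s] - sum(w[k] * X[s][k] for k in range(d))
    return w, b

# Toy 2-point problem whose dual optimum is alpha = [0.5, 0.5]:
w, b = recover_w_b([[1.0, 0.0], [-1.0, 0.0]], [1.0, -1.0], [0.5, 0.5])
# w == [1.0, 0.0], b == 0.0: the separator is the vertical axis, margin width 2
```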
3.2 Non-linearly separable data
For non-linearly separable data we have to use other approaches.
For problems where misclassification is tolerated, we can use equation (11); the solver's parameters are therefore:
C=Tradeoffvalue;
H=(x*x.').*(y*y.');
f=-ones(n,1);
A=[];
b=[];
lb=zeros(n,1);
ub=C*ones(n,1);
Aeq=y.';
beq=0;
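For reference (the numbered equation is not reproduced in this copy), the soft-margin dual that these parameters encode is the standard one, presumably equation (11) in the original numbering. Note how `lb` and `ub` implement the box constraint, `Aeq`/`beq` the equality constraint, and `f=-ones(n,1)` the sign flip needed because quadprog minimizes:

```latex
\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
 - \frac{1}{2}\sum_{i,j=1}^{n} y_i y_j \alpha_i \alpha_j \langle x_i, x_j\rangle
\quad \text{subject to } 0 \le \alpha_i \le C,\quad \sum_{i=1}^{n} \alpha_i y_i = 0.
```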
To apply the kernel trick we first have to create the kernel functions; in our case we created the polynomial kernel:
function [Z] = kernelPol(X1,X2,q)
[xxx1,yyy1]=meshgrid(X1(:,1),X2(:,1));
[xxx2,yyy2]=meshgrid(X1(:,2),X2(:,2));
Z=(1+xxx1.*yyy1+xxx2.*yyy2).^q;
end
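The same Gram-matrix computation in a dependency-free Python sketch (kernel_pol is a hypothetical name mirroring kernelPol), which makes the meshgrid orientation explicit:

```python
def kernel_pol(X1, X2, q):
    # Z[i][j] = (1 + <X2[i], X1[j]>)^q: rows range over X2 and columns
    # over X1, exactly as the meshgrid construction above lays them out.
    return [[(1 + sum(a * b for a, b in zip(x2, x1))) ** q
             for x1 in X1]
            for x2 in X2]

Z = kernel_pol([[1, 0], [0, 1]], [[1, 1]], 2)
# Z == [[4, 4]]: both points have <x, [1,1]> = 1, and (1 + 1)^2 = 4
```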
And the radial basis function kernel:

$$k(x_1, x_2) = e^{-\frac{\|x_1 - x_2\|^2}{2\sigma^2}}$$
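A dependency-free Python sketch of the same RBF Gram-matrix computation (kernel_rbf is an illustrative name; the report's MATLAB version is kernel_RBF):

```python
import math

def kernel_rbf(X1, X2, sigma):
    # Z[i][j] = exp(-||X1[j] - X2[i]||^2 / (2 sigma^2)):
    # rows range over X2, columns over X1, as in the polynomial kernel above.
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(x1, x2)) / (2 * sigma ** 2))
             for x1 in X1]
            for x2 in X2]

Z = kernel_rbf([[0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]], 1.0)
# Z[0][0] == 1.0 (zero distance); Z[1][0] == exp(-0.5)
```

With a small σ the off-diagonal entries decay toward 0 quickly, which is why a small σ makes the boundary depend only on nearby points.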
4 Results
We have 4 types of data distributions.
Figure 7: Training data
As expected, there is no misclassified data, and therefore the linear decision boundary is suitable for our data.
4.0.2 Data using soft margin separator
In this section we will use the code mentioned in section 3.2.1 and act on the variable C to get the smallest possible misclassification percentage.
Figure 9: Training and test data separated
Figure 11: Training and test data separated
* The trade-off parameter C represents how much we want to stick to our data. A small C means we want to enlarge the margins by sacrificing training-data separation; on the other hand, a large C means we want to stick to our training data.
* Even though the data is not linearly separable, by introducing a trade-off variable to the optimizer we obtained reasonable results, with separators featuring a low misclassification percentage.
4.0.3 Data using polynomial kernel
In this section we will include the polynomial kernel using the code mentioned in section 3.2.2. We will act on the polynomial's degree to find the most suitable separator for our data.
Figure 15: Test data separated by our decision boundary
4.0.4 Data using Gaussian kernel
In this section we will include the radial basis kernel using the code mentioned in section 3.2.3. We will act on the RBF dispersion σ and the trade-off value C to find the most suitable separator for our data.
Figure 17: Training data
We can use a loop to find the separator for different values of σ and C.
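A minimal sketch of that loop, assuming a hypothetical train_and_score(C, sigma) helper that trains the SVM and returns the test misclassification rate (the report's MATLAB loop calls quadprog inside a try/catch instead):

```python
def grid_search(train_and_score, Cs, sigmas):
    # Evaluate every (C, sigma) pair and keep the lowest-error one.
    best = None
    for C in Cs:
        for sigma in sigmas:
            err = train_and_score(C, sigma)  # misclassification rate in [0, 1]
            if best is None or err < best[0]:
                best = (err, C, sigma)
    return best

# Toy stand-in for the real trainer, with its minimum at C = 10, sigma = 0.1:
score = lambda C, sigma: abs(C - 10) / 100 + abs(sigma - 0.1)
best_err, best_C, best_sigma = grid_search(score, [0.1, 1, 10, 100], [0.01, 0.1, 1])
# best_C == 10, best_sigma == 0.1
```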
Figure 19: Training and test data separated
Figure 21: Training and test data separated
* We can extract the misclassification-percentage results by changing C and σ over a search grid (i.e. fix C and vary σ, and vice versa) to find the most suitable parameters for our data distribution.
* The SVM optimization depends on the dot product ⟨xi, xj⟩ of the points; in the RBF kernel, however, this dot product is replaced by the distance between points ‖xi − xj‖, and σ acts as a scale for this distance: if σ is small, only very close points define the decision boundary, while a large σ also takes relations between widely spaced points into consideration.
The RBF kernel is especially useful for separating data where one class encircles another.
Figure 23: Test data separated by our decision boundary
5 Summary
* Support vector machines are known for their performance.
* The trade-off parameter C represents how much separation we are willing to sacrifice for a bigger margin.
* Kernels are useful for generating a decision boundary for non-linearly separable data.
* The σ in the RBF kernel determines the scale at which we define our decision boundary: whether we want it driven only by very close points, or we want to create larger decision boundaries.
A Hard margin MATLAB code
na=25;
nt=150;
[ax1,ay1,tx1,ty1]=Gen_data(1,na,nt);
figure
h=plot_data(ax1,ay1);
legend(h,'class 1','class 2')
tyf=tx1*w+b;
Err=1-sum(sign(tyf)==sign(ty1))/length(ty1)
hold on
fplot(@(x) -(w(1)*x+b)/w(2),'k');
fplot(@(x) -(w(1)*x+b-1)/w(2),'k--');
fplot(@(x) -(w(1)*x+b+1)/w(2),'k--');
legend(h,'class 1','class 2')
figure
h=plot_data(tx1,ty1);
hold on
fplot(@(x) -(w(1)*x+b)/w(2),'k');
fplot(@(x) -(w(1)*x+b-1)/w(2),'k--');
fplot(@(x) -(w(1)*x+b+1)/w(2),'k--');
legend(h,'class 1','class 2')
annotation('textbox',...
    [0.15 0.85 0.29 0.05],...
    'String',{['Misclassified %= ' num2str(round(Err*100,2)) '%']},...
    'BackgroundColor',[1 1 1]);
title('Linear separator')
hold off
na=25;
nt=150;
[ax1,ay1,tx1,ty1]=Gen_data(2,na,nt);
figure
h=plot_data(ax1,ay1);
legend(h,'class 1','class 2')
C=40;
for C=[0.00001 0.0001 0.001 0.01 0.1 1 10 100 1000 10000 100000 1000000]
try
[alphas,fval2,exitflag2,output2,lambda2]=quadprog(H,f,A,b,Aeq,beq,lb,ub);
w=ax1.'*(alphas.*ay1);
tyf=tx1*w+b;
ayf=ax1*w+b;
Err=1-sum(sign(tyf)==sign(ty1))/length(ty1);
Errx=1-sum(sign(ayf)==sign(ay1))/length(ay1);
figure
subplot(1,2,1)
h=plot_data(ax1,ay1);
hold on
fplot(@(x) -(w(1)*x+b)/w(2),'k');
fplot(@(x) -(w(1)*x+b-1)/w(2),'k--');
fplot(@(x) -(w(1)*x+b+1)/w(2),'k--');
legend(h,'class 1','class 2')
annotation('textbox',...
    [0.15 0.85 0.12 0.05],...
    'String',{['Misclassified %= ' num2str(round(Errx*100,2)) '%']},...
    'BackgroundColor',[1 1 1]);
title(['Training data separation for C = ' num2str(C)])
hold off
subplot(1,2,2)
h=plot_data(tx1,ty1);
hold on
fplot(@(x) -(w(1)*x+b)/w(2),'k');
fplot(@(x) -(w(1)*x+b-1)/w(2),'k--');
fplot(@(x) -(w(1)*x+b+1)/w(2),'k--');
legend(h,'class 1','class 2')
annotation('textbox',...
    [0.59 0.85 0.12 0.05],...
    'String',{['Misclassified %= ' num2str(round(Err*100,2)) '%']},...
    'BackgroundColor',[1 1 1]);
title(['Test data separation for C = ' num2str(C)])
set(gcf,'units','points','position',[0,0,width,height])
hold off
catch
end
end
na=25;
nt=150;
[ax1,ay1,tx1,ty1]=Gen_data(4,na,nt);
figure
h=plot_data(ax1,ay1);
legend(h,'class 1','class 2')
C=10;
sigma=10;
for C=[0.00001 0.0001 0.001 0.01 0.1 1 10 100 1000 10000 100000 1000000]
for sigma=[0.00001 0.0001 0.001 0.01 0.1 1 10 100 1000 10000 100000 1000000]
try
[alphas,fval,exitflag,output,lambda2]=quadprog(H,f,A,b2,Aeq,beq,lb,ub);
eps=1e-6;
svs=find(alphas>eps & alphas<C-eps);
b=ay1(svs(1))-kernel_RBF(ax1,ax1(svs(1),:),sigma)*(ay1.*alphas);
figure
subplot(1,2,1)
h=plot_data(ax1,ay1);
hold on
mn=min(ax1);
mx=max(ax1);
Xc=linspace(min(ax1(:,1)),max(ax1(:,1)),100);
Yc=linspace(min(ax1(:,2)),max(ax1(:,2)),100);
[Xc,Yc]=meshgrid(Xc,Yc);
Xc=Xc(:);
Yc=Yc(:);
Zc=kernel_RBF(ax1,[Xc,Yc],sigma)*(ay1.*alphas)+b;
n=ceil(sqrt(numel(Zc)));
Zcf=zeros(n);
Xcf=zeros(n);
Ycf=zeros(n);
Zcf(1:numel(Zcf))=Zc(:);
Ycf(1:numel(Zcf))=Yc(:);
Xcf(1:numel(Zcf))=Xc(:);
contour(Xcf,Ycf,Zcf,[0 0],'k')
legend(h,'class 1','class 2')
annotation('textbox',...
    [0.15 0.85 0.12 0.05],...
    'String',{['Misclassified %= ' num2str(round(Errx*100,2)) '%']},...
    'BackgroundColor',[1 1 1]);
title(['Training data separation by RBF kernel for \sigma = ' num2str(sigma) ...
    ', C = ' num2str(C)])
hold off
subplot(1,2,2)
h=plot_data(tx1,ty1);
hold on
mn=min(ax1);
mx=max(ax1);
Xc=linspace(min(tx1(:,1)),max(tx1(:,1)),100);
Yc=linspace(min(tx1(:,2)),max(tx1(:,2)),100);
[Xc,Yc]=meshgrid(Xc,Yc);
Xc=Xc(:);
Yc=Yc(:);
Zc=kernel_RBF(ax1,[Xc,Yc],sigma)*(ay1.*alphas)+b;
n=ceil(sqrt(numel(Zc)));
Zcf=zeros(n);
Xcf=zeros(n);
Ycf=zeros(n);
Zcf(1:numel(Zcf))=Zc(:);
Ycf(1:numel(Zcf))=Yc(:);
Xcf(1:numel(Zcf))=Xc(:);
contour(Xcf,Ycf,Zcf,[1 -1],'k--')
hold on
contour(Xcf,Ycf,Zcf,[0 0],'k')
legend(h,'class 1','class 2')
annotation('textbox',...
    [0.59 0.85 0.12 0.05],...
    'String',{['Misclassified %= ' num2str(round(Err*100,2)) '%']},...
    'BackgroundColor',[1 1 1]);
title(['Test data separation by RBF kernel for \sigma = ' num2str(sigma) ...
    ', C = ' num2str(C)])
hold off
catch
end
end
end