\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}\xi_i \quad \text{s.t.}\quad y_i(w^T x_i + b) \ge 1 - \xi_i,\ \xi_i \ge 0,\ i = 1,\dots,l,
where C is a constant which controls the trade-off between maximizing the margin and minimizing the classification error of the training samples.
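The role of C can be made concrete with a small sketch (illustrative, not from the paper): it evaluates the soft-margin objective above for given w, b, where the slack of each sample is the hinge term max(0, 1 - y_i(w.x_i + b)).

```python
import numpy as np

# Illustrative sketch (not from the paper): the soft-margin SVM objective
# 0.5*||w||^2 + C * sum_i xi_i, with slacks xi_i = max(0, 1 - y_i*(w.x_i + b)).
def soft_margin_objective(w, b, X, y, C):
    margins = y * (X @ w + b)                  # functional margins y_i(w.x_i + b)
    slacks = np.maximum(0.0, 1.0 - margins)    # classification-error slacks
    return 0.5 * float(w @ w) + C * float(slacks.sum())
```

A large C penalizes training errors heavily (fewer violations, narrower margin), while a small C favors a wider margin at the cost of more violations.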
2.2 Twin Support Vector Machine
The twin support vector machine (TSVM) is a nonparallel-plane classifier for binary data classification. It generates two nonparallel planes by solving two smaller-sized quadratic programming problems, such that each plane is closer to one of the two classes and as far as possible from the other. A new sample is assigned to class +1 or -1 according to its proximity to the two nonparallel hyperplanes. The linear TSVM aims to obtain the following two nonparallel planes:
w_+^T x + b_+ = 0, \qquad w_-^T x + b_- = 0.
This leads to the following pair of quadratic optimization problems:
\min_{w_+,b_+,\xi_-}\ \frac{1}{2}\|Aw_+ + e_+ b_+\|^2 + c_1 e_-^T \xi_- \quad \text{s.t.}\quad -(Bw_+ + e_- b_+) + \xi_- \ge e_-,\ \xi_- \ge 0, \qquad (2)
\min_{w_-,b_-,\xi_+}\ \frac{1}{2}\|Bw_- + e_- b_-\|^2 + c_2 e_+^T \xi_+ \quad \text{s.t.}\quad (Aw_- + e_+ b_-) + \xi_+ \ge e_+,\ \xi_+ \ge 0, \qquad (3)
where c_1 and c_2 are the tuning parameters, e_+ and e_- are vectors of ones of appropriate dimensions, and \xi_+, \xi_- are slack vectors. Introducing Lagrange multipliers \alpha and \gamma, the Wolfe duals of (2) and (3) are
\max_\alpha\ e_-^T\alpha - \frac{1}{2}\alpha^T G (H^T H)^{-1} G^T \alpha \quad \text{s.t.}\quad 0 \le \alpha \le c_1 e_-,
\max_\gamma\ e_+^T\gamma - \frac{1}{2}\gamma^T Q (P^T P)^{-1} Q^T \gamma \quad \text{s.t.}\quad 0 \le \gamma \le c_2 e_+, \qquad (4)
where H = [A\ e_+], G = [B\ e_-], P = [B\ e_-] and Q = [A\ e_+]. The two nonparallel hyperplanes can then be obtained from the dual solutions:
v_+ = [w_+^T\ b_+]^T = -(H^T H + \varepsilon I)^{-1} G^T \alpha,
v_- = [w_-^T\ b_-]^T = (P^T P + \varepsilon I)^{-1} Q^T \gamma. \qquad (5)
In order to deal with the case in which H^T H or P^T P is singular, and to avoid possible ill-conditioning of H^T H and P^T P, formula (5) above introduces a regularization term \varepsilon I (\varepsilon > 0), where I is an identity matrix of appropriate dimension.
It can be seen from the above discussion that the linear TSVM only requires the inversion of matrices of size (n+1)\times(n+1), where n is much smaller than the number of patterns of classes +1 and -1.
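Once the dual multipliers are in hand, the two planes of (5) follow from two small linear solves; a minimal numpy sketch (variable names are ours, multipliers assumed already computed from the duals):

```python
import numpy as np

# Sketch of formula (5): recover the two TSVM planes from given dual
# multipliers alpha, gamma, with the eps*I regularization from the text.
def tsvm_planes(A, B, alpha, gamma, eps=1e-4):
    H = np.hstack([A, np.ones((A.shape[0], 1))])   # H = [A  e+]  (also Q's rows)
    G = np.hstack([B, np.ones((B.shape[0], 1))])   # G = [B  e-]  (also P's rows)
    I = np.eye(H.shape[1])
    v_pos = -np.linalg.solve(H.T @ H + eps * I, G.T @ alpha)  # (w+, b+)
    v_neg = np.linalg.solve(G.T @ G + eps * I, H.T @ gamma)   # (w-, b-)
    return (v_pos[:-1], v_pos[-1]), (v_neg[:-1], v_neg[-1])
```

Using `np.linalg.solve` on the regularized normal matrices avoids forming an explicit inverse, which is the numerically preferable way to evaluate (5).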
For the nonlinear case, the separating nonparallel planes are replaced by kernel-generated surfaces, namely
K(x^T, C^T)u_1 + b_1 = 0, \qquad K(x^T, C^T)u_2 + b_2 = 0,
where C^T = [A^T\ B^T] and K is an appropriately chosen kernel. The primal problems of the nonlinear TSVM are given as follows:
\min_{u_1,b_1,\xi_2}\ \frac{1}{2}\|K(A, C^T)u_1 + e_+ b_1\|^2 + c_1 e_-^T \xi_2 \quad \text{s.t.}\quad -(K(B, C^T)u_1 + e_- b_1) + \xi_2 \ge e_-,\ \xi_2 \ge 0,
\min_{u_2,b_2,\xi_1}\ \frac{1}{2}\|K(B, C^T)u_2 + e_- b_2\|^2 + c_2 e_+^T \xi_1 \quad \text{s.t.}\quad (K(A, C^T)u_2 + e_+ b_2) + \xi_1 \ge e_+,\ \xi_1 \ge 0.
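The blocks K(A, C^T) and K(B, C^T) are ordinary Gram matrices between class samples and all training samples; a sketch with the Gaussian (RBF) kernel, where the bandwidth sigma is an assumption of this example:

```python
import numpy as np

# Sketch of the Gram block K(A, C^T) with a Gaussian (RBF) kernel,
# K(x, z) = exp(-||x - z||^2 / (2 sigma^2)); C stacks all training rows.
def rbf_block(A, C, sigma=1.0):
    # squared Euclidean distances between every row of A and every row of C
    d2 = ((A[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

With C = [A; B] stacked row-wise, K(A, C^T) has one row per positive sample and one column per training sample of either class.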
2.3 ν-Twin Support Vector Machine
Similar to ν-SVM, Peng introduced two new parameters ν_1 and ν_2 in place of the trade-off factors c_1 and c_2, proposed the ν-twin support vector machine (ν-TSVM), and rewrote the primal optimization problems as follows [5]:
\min_{w_+,b_+,\rho_+,\xi}\ \frac{1}{2}\|Aw_+ + e_+ b_+\|^2 - \nu_1\rho_+ + \frac{1}{l_-}\sum_{j\in I_-}\xi_j \quad \text{s.t.}\quad -(w_+^T x_j + b_+) \ge \rho_+ - \xi_j,\ \xi_j \ge 0,\ \rho_+ \ge 0,\ j \in I_-,
\min_{w_-,b_-,\rho_-,\xi}\ \frac{1}{2}\|Bw_- + e_- b_-\|^2 - \nu_2\rho_- + \frac{1}{l_+}\sum_{i\in I_+}\xi_i \quad \text{s.t.}\quad (w_-^T x_i + b_-) \ge \rho_- - \xi_i,\ \xi_i \ge 0,\ \rho_- \ge 0,\ i \in I_+,
where I_+ and I_- denote the index sets of the positive and negative samples and l_+, l_- their cardinalities.
To understand the roles of \rho_+ and \rho_-, note that when \xi_j = 0 for all j \in I_- (or \xi_i = 0 for all i \in I_+), the negative (positive) samples are separated from the positive (negative) hyperplane with margin \rho_+/\sqrt{w_+^T w_+} (or \rho_-/\sqrt{w_-^T w_-}). At the same time, this adaptive quantity effectively overcomes the above shortcomings of the TSVM. By introducing Lagrange multipliers, two dual QPPs are obtained in the following:
\min_\alpha\ \frac{1}{2}\Big(\sum_{j\in I_-}\alpha_j z_j\Big)^T\Big(\sum_{i\in I_+} z_i z_i^T\Big)^{-1}\Big(\sum_{j\in I_-}\alpha_j z_j\Big) \quad \text{s.t.}\quad 0 \le \alpha_j \le \frac{1}{l_-},\ \sum_{j\in I_-}\alpha_j \ge \nu_1,\ j \in I_-,
\min_\gamma\ \frac{1}{2}\Big(\sum_{i\in I_+}\gamma_i z_i\Big)^T\Big(\sum_{j\in I_-} z_j z_j^T\Big)^{-1}\Big(\sum_{i\in I_+}\gamma_i z_i\Big) \quad \text{s.t.}\quad 0 \le \gamma_i \le \frac{1}{l_+},\ \sum_{i\in I_+}\gamma_i \ge \nu_2,\ i \in I_+,
where z_i = (x_i^T, 1)^T.
To compute \rho_+ and \rho_-, the samples x_i, i \in I_+ (or x_j, j \in I_-) with 0 < \gamma_i < \frac{1}{l_+} (or 0 < \alpha_j < \frac{1}{l_-}) are chosen; for these samples \xi_i = 0 (or \xi_j = 0) and w_-^T x_i + b_- = \rho_- (or -(w_+^T x_j + b_+) = \rho_+).
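This recovery rule amounts to averaging the plane values over the samples whose multiplier lies strictly inside the box; a small sketch (function and variable names are ours):

```python
import numpy as np

# Sketch of recovering rho: average w^T x_i + b over the samples whose
# multiplier is strictly inside (0, 1/l), i.e. the zero-slack samples
# lying exactly on the margin.
def estimate_rho(w, b, X, mult, l, tol=1e-8):
    inside = (mult > tol) & (mult < 1.0 / l - tol)   # strictly in-bound multipliers
    return float((X[inside] @ w + b).mean())
```

Averaging over all in-bound samples, rather than picking a single one, smooths out small numerical errors in the multipliers.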
3. FUZZY TWIN SUPPORT VECTOR MACHINE
3.1 Linear Fuzzy Twin Support Vector Machine
It may be seen that neither the twin support vector machine nor the ν-twin support vector machine considers the effect of individual training samples on the optimal separating hyperplane. In this paper, we propose the fuzzy twin support vector machine (FTSVM) by introducing the importance of each sample. Weighting the samples on the two non-parallel hyperplanes, the quadratic programming problems are given as follows:
International Journal of Application or Innovation in Engineering& Management (IJAIEM)
Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com
Volume 2, Issue 3, March 2013 ISSN 2319 - 4847
Volume 2, Issue 3, March 2013 Page 462
\min_{w^{(1)},b^{(1)},\xi^{(2)},\rho_1}\ \frac{1}{2}\|Aw^{(1)} + e_1 b^{(1)}\|^2 - \nu_1\rho_1 + \frac{1}{l_2}s_2^T\xi^{(2)} \quad \text{s.t.}\quad -(Bw^{(1)} + e_2 b^{(1)}) \ge \rho_1 e_2 - \xi^{(2)},\ \xi^{(2)} \ge 0,\ \rho_1 \ge 0, \qquad (6)
\min_{w^{(2)},b^{(2)},\xi^{(1)},\rho_2}\ \frac{1}{2}\|Bw^{(2)} + e_2 b^{(2)}\|^2 - \nu_2\rho_2 + \frac{1}{l_1}s_1^T\xi^{(1)} \quad \text{s.t.}\quad (Aw^{(2)} + e_1 b^{(2)}) \ge \rho_2 e_1 - \xi^{(1)},\ \xi^{(1)} \ge 0,\ \rho_2 \ge 0, \qquad (7)
where \nu_1, \nu_2 \in (0,1] denote the regularization parameters of the positive and negative samples, respectively, and s_1, s_2 \in (0,1] denote the fuzzy memberships of the positive and negative samples, respectively. The objective of FTSVM is to find two non-parallel hyperplanes, namely a positive hyperplane and a negative hyperplane. For the sake of brevity, we only consider the dual of problem (6). In order to solve the optimization problem, we construct the following Lagrange function corresponding to problem (6):
L = \frac{1}{2}\|Aw^{(1)} + e_1 b^{(1)}\|^2 - \nu_1\rho_1 + \frac{1}{l_2}s_2^T\xi^{(2)} + \alpha^T(Bw^{(1)} + e_2 b^{(1)} + \rho_1 e_2 - \xi^{(2)}) - \beta^T\xi^{(2)} - \tau\rho_1, \qquad (8)
where the Lagrange multipliers \alpha, \beta, \tau are all non-negative. According to the Karush-Kuhn-Tucker (KKT) conditions,
\partial L/\partial w^{(1)} = A^T(Aw^{(1)} + e_1 b^{(1)}) + B^T\alpha = 0, \qquad (9)
\partial L/\partial b^{(1)} = e_1^T(Aw^{(1)} + e_1 b^{(1)}) + e_2^T\alpha = 0, \qquad (10)
\partial L/\partial \rho_1 = -\nu_1 + e_2^T\alpha - \tau = 0, \qquad (11)
\partial L/\partial \xi^{(2)} = s_2/l_2 - \alpha - \beta = 0. \qquad (12)
By a simple computation from the above equations, we obtain equation (13):
[A\ e_1]^T [A\ e_1] [w^{(1)T}\ b^{(1)}]^T + [B\ e_2]^T \alpha = 0. \qquad (13)
Let H = [A\ e_1], U = [w^{(1)T}\ b^{(1)}]^T, G = [B\ e_2], and rewrite (13) as follows:
H^T H U + G^T\alpha = 0, \qquad U = -(H^T H)^{-1} G^T \alpha. \qquad (14)
It is well known that H^T H is always positive semi-definite; however, it may be ill-conditioned in some situations. Thus, following ridge regression approaches, we introduce a regularization term \varepsilon I into the expression for U to deal with the possible ill-conditioning of H^T H, where I is the identity matrix of suitable order. Now (14) becomes
U = -(H^T H + \varepsilon I)^{-1} G^T \alpha.
Applying equations (9) to (12) to the Lagrange function, the primal problem (6) can be transformed into the following dual problem:
\min_\alpha\ \frac{1}{2}\alpha^T G (H^T H)^{-1} G^T \alpha \quad \text{s.t.}\quad 0 \le \alpha \le \frac{s_2}{l_2},\ e_2^T\alpha \ge \nu_1. \qquad (15)
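Dual (15) is a small QP with a box constraint and one linear inequality, so any general-purpose solver applies. A hedged sketch using scipy's SLSQP (the solver choice is ours, not the paper's, which used Matlab):

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of solving dual (15): min 0.5 a^T M a, M = G (H^T H + eps I)^{-1} G^T,
# subject to 0 <= a <= s2 / l2 componentwise and sum(a) >= v1.
def solve_dual_15(H, G, s2, v1, eps=1e-4):
    M = G @ np.linalg.solve(H.T @ H + eps * np.eye(H.shape[1]), G.T)
    l2 = G.shape[0]
    res = minimize(
        lambda a: 0.5 * a @ M @ a,
        np.full(l2, v1 / l2),                         # feasible start when v1 <= min(s2)
        bounds=[(0.0, s2[j] / l2) for j in range(l2)],
        constraints=[{"type": "ineq", "fun": lambda a: a.sum() - v1}],
    )
    return res.x
```

For large problems a dedicated QP solver is preferable, but the constraint structure stays the same.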
From the KKT conditions, we also obtain the complementary slackness conditions
\alpha^T(Bw^{(1)} + e_2 b^{(1)} + \rho_1 e_2 - \xi^{(2)}) = 0, \qquad \beta^T\xi^{(2)} = 0, \qquad \tau\rho_1 = 0.
Similarly, we obtain the dual problem whose solution gives the parameters of the other hyperplane:
\min_\beta\ \frac{1}{2}\beta^T P (Q^T Q)^{-1} P^T \beta \quad \text{s.t.}\quad 0 \le \beta \le \frac{s_1}{l_1},\ e_1^T\beta \ge \nu_2, \qquad [w^{(2)T}\ b^{(2)}]^T = R = (Q^T Q + \varepsilon I)^{-1} P^T \beta, \qquad (16)
where P = [A\ e_1] and Q = [B\ e_2].
3.2 The Nonlinear Fuzzy Twin Support Vector Machine
In this section, we extend the method presented above to the nonlinear situation using the kernel trick, and consider kernel-generated surfaces rather than planes in the primal space, namely
K(x^T, C^T)w^{(1)} + b^{(1)} = 0, \qquad K(x^T, C^T)w^{(2)} + b^{(2)} = 0,
where C^T = [A^T\ B^T] and K denotes a chosen kernel function. We construct the optimization problems of FTSVM as follows:
\min_{w^{(1)},b^{(1)},\xi^{(2)},\rho_1}\ \frac{1}{2}\|K(A,C^T)w^{(1)} + e_1 b^{(1)}\|^2 - \nu_1\rho_1 + \frac{1}{l_2}s_2^T\xi^{(2)} \quad \text{s.t.}\quad -(K(B,C^T)w^{(1)} + e_2 b^{(1)}) \ge \rho_1 e_2 - \xi^{(2)},\ \xi^{(2)} \ge 0,\ \rho_1 \ge 0, \qquad (17)
\min_{w^{(2)},b^{(2)},\xi^{(1)},\rho_2}\ \frac{1}{2}\|K(B,C^T)w^{(2)} + e_2 b^{(2)}\|^2 - \nu_2\rho_2 + \frac{1}{l_1}s_1^T\xi^{(1)} \quad \text{s.t.}\quad (K(A,C^T)w^{(2)} + e_1 b^{(2)}) \ge \rho_2 e_1 - \xi^{(1)},\ \xi^{(1)} \ge 0,\ \rho_2 \ge 0. \qquad (18)
We construct the Lagrange function of primal problem (17) as
L = \frac{1}{2}\|K(A,C^T)w^{(1)} + e_1 b^{(1)}\|^2 - \nu_1\rho_1 + \frac{1}{l_2}s_2^T\xi^{(2)} + \alpha^T(K(B,C^T)w^{(1)} + e_2 b^{(1)} + \rho_1 e_2 - \xi^{(2)}) - \beta^T\xi^{(2)} - \tau\rho_1,
where \alpha \ge 0, \beta \ge 0, \tau \ge 0 are Lagrange multipliers. According to the KKT conditions,
\partial L/\partial w^{(1)} = K(A,C^T)^T(K(A,C^T)w^{(1)} + e_1 b^{(1)}) + K(B,C^T)^T\alpha = 0,
\partial L/\partial b^{(1)} = e_1^T(K(A,C^T)w^{(1)} + e_1 b^{(1)}) + e_2^T\alpha = 0,
\partial L/\partial \rho_1 = -\nu_1 + e_2^T\alpha - \tau = 0,
\partial L/\partial \xi^{(2)} = s_2/l_2 - \alpha - \beta = 0.
Following the same method as above, we obtain equation (19):
[K(A,C^T)\ e_1]^T [K(A,C^T)\ e_1] [w^{(1)T}\ b^{(1)}]^T + [K(B,C^T)\ e_2]^T \alpha = 0. \qquad (19)
Let H = [K(A,C^T)\ e_1], U = [w^{(1)T}\ b^{(1)}]^T, G = [K(B,C^T)\ e_2]. Then equation (19) becomes
H^T H U + G^T\alpha = 0, \qquad U = -(H^T H)^{-1} G^T \alpha.
So the dual problem of the nonlinear FTSVM is given by
\min_\alpha\ \frac{1}{2}\alpha^T G (H^T H)^{-1} G^T \alpha \quad \text{s.t.}\quad 0 \le \alpha \le \frac{s_2}{l_2},\ e_2^T\alpha \ge \nu_1. \qquad (20)
Similarly, we obtain the following dual problem for optimization problem (18), where P = [K(A,C^T)\ e_1] and Q = [K(B,C^T)\ e_2]:
\min_\beta\ \frac{1}{2}\beta^T P (Q^T Q)^{-1} P^T \beta \quad \text{s.t.}\quad 0 \le \beta \le \frac{s_1}{l_1},\ e_1^T\beta \ge \nu_2. \qquad (21)
Based on the above derivation, we give the fuzzy twin support vector machine algorithm (FTSVM) in the following, which covers both the linear FTSVM and the nonlinear FTSVM.
Step 1. Choose a kernel function and compute the membership of each sample of class +1 and class -1 to construct the vectors s_1 and s_2.
Step 2. Compute H and G.
Step 3. Set the values of the parameters \nu_1, \nu_2 \in (0,1).
Step 4. Solve the quadratic programming problems (15) or (20) and (16) or (21) to obtain U and R for the two nonparallel hyperplanes.
Step 5. For a sample x \in R^n, compute the distance dist_{+1} between x and the hyperplane x^T w^{(1)} + b^{(1)} = 0, and the distance dist_{-1} between x and x^T w^{(2)} + b^{(2)} = 0, respectively.
Step 6. Compare dist_{+1} with dist_{-1}: if dist_{+1} < dist_{-1}, then x is assigned to class +1; otherwise to class -1.
3.3 Fuzzy Membership Function
The design of the fuzzy membership function is the key step of a fuzzy algorithm. In this paper, we use the class center method to generate the fuzzy memberships. First, we denote the mean of class +1 as the class center \bar{x}_+ and the mean of class -1 as the class center \bar{x}_-. The membership of each sample is then computed from its distance to its class center,
s_i = 1 - \frac{\|x_i - \bar{x}_\pm\|}{r_\pm + \delta},
where r_\pm is the radius of the corresponding class (the largest distance from a class sample to its center) and \delta > 0 is used to avoid the case s_i = 0.
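The class-center rule can be sketched as follows (the form s_i = 1 - d_i/(r + delta) follows the class-center method of [1]; parameter names are ours):

```python
import numpy as np

# Sketch of class-center fuzzy membership: s_i = 1 - d_i / (r + delta),
# where d_i is the distance of x_i to its class center and r = max_i d_i.
def class_center_membership(X, delta=0.1):
    center = X.mean(axis=0)                    # class center (mean of the class)
    d = np.linalg.norm(X - center, axis=1)     # distance of each sample to the center
    r = d.max()                                # class radius
    return 1.0 - d / (r + delta)               # memberships in (0, 1]
```

Samples near the class center receive membership close to 1, while outliers (and hence noise points) receive small memberships and contribute little to the objective.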
4. EXPERIMENTAL RESULTS AND ANALYSIS
In this section, to evaluate the performance of the proposed algorithm FTSVM, we investigate its classification accuracy and computational efficiency on 7 real-world UCI benchmark datasets [6]. In the experiments, we focus on the comparison between the proposed FTSVM and several other methods, namely TSVM, FSVM and SVM. All the classification methods are implemented in the Matlab 7.0 environment on a PC with an Intel P4 processor and 1GB RAM. We compute the fuzzy membership as a function of the distance between each point and its class center.
Table 1 compares the classification accuracy of linear FTSVM with that of TSVM, FSVM and SVM using 5-fold cross-validation. From Table 1, we can see that the accuracy of linear FTSVM is better than that of linear TSVM on all 7 UCI datasets. We also report the training times of the algorithms in Table 2. They indicate that FTSVM is faster than FSVM, because it solves two smaller-sized problems instead of one large problem over all samples. However, there is no statistical difference in average training time between FTSVM and FSVM on the bupa dataset; there, FTSVM is still better than FSVM in accuracy. Table 3 compares the performance of the FTSVM classifier with that of TSVM, FSVM and SVM for the Gaussian kernel. The results in Table 3 are similar to those in Table 1; that is, FTSVM has better classification accuracy than TSVM on all datasets.
Table 1: Classification accuracy (%) using 5-fold cross-validation
Data Set        FTSVM        TSVM         FSVM         SVM
australian      85.97±5.16   85.79±5.09   85.89±4.79   85.51±4.58
breast-cancer   65.00±4.13   62.83±3.16   64.86±2.48   64.86±4.73
bupa            74.82±3.18   68.40±6.38   74.80±2.52   69.28±3.02
fourclass       64.39±7.28   64.39±5.70   68.54±8.84   73.66±6.32
german          78.14±8.15   71.20±6.35   70.80±8.20   76.90±7.63
heart           85.56±4.45   82.22±6.60   82.59±6.08   81.48±8.58
pima            79.08±5.92   73.02±6.05   76.95±2.45   76.55±2.40
Table 2: Training time (in seconds)
Data Set FTSVM FSVM
australian 12.50 133.53
breast-cancer 16.27 150.17
bupa 2.14 2.13
fourclass 7.89 29.44
german 14.98 50.09
heart 0.23 0.45
pima 8.58 42.65
Table 3: Classification accuracy (%) using 5-fold cross-validation with RBF kernel
Data Set        FTSVM        TSVM         FSVM         SVM
australian      86.08±1.43   84.81±2.15   85.56±2.30   85.51±2.16
breast-cancer   65.60±4.32   64.42±3.87   65.01±2.48   65.42±4.53
bupa            77.80±3.87   71.45±5.49   76.67±2.21   72.78±3.97
fourclass       64.53±5.51   64.45±5.49   64.38±6.18   64.35±6.48
german          78.20±8.15   72.45±6.35   71.68±8.20   73.50±7.63
heart           84.44±4.53   81.89±4.31   83.33±5.00   82.22±6.67
pima            79.51±5.92   73.70±6.05   77.42±2.45   76.55±2.40
In addition, to address the noise sensitivity of the twin support vector machine, we study and compare the performance of the fuzzy twin support vector machine (FTSVM) and the twin support vector machine on synthetic data. First, we randomly generate two classes of samples. Then, we add three noise data points (-2, -2), (-2, 0) and (-1, -1). Figure 1 shows the classification result of TSVM on the data without noise. Figure 2 and Figure 3 show the classification results with the noise data for TSVM and FTSVM, respectively. From Figure 2 and Figure 3, we observe that the fuzzy twin support vector machine is almost unaffected by the noise data, whereas the twin support vector machine is strongly affected by it. This indicates that the fuzzy memberships of the samples play an important role in the classification.
5. CONCLUSIONS
In this paper, we study the fuzzy twin support vector machine (FTSVM), which applies fuzzy memberships to the training samples. Samples are classified by assigning them to the nearest of two nonparallel planes. Experiments on several UCI benchmark datasets show that the presented algorithm FTSVM is effective and feasible compared with the twin support vector machine, the fuzzy support vector machine and the standard support vector machine. Moreover, we show that the presented algorithm FTSVM has anti-noise capability. In the future, we will further study the fuzzy twin support vector machine and extend it to multi-class classification problems.
Acknowledgements
This work is supported by the Natural Science Foundation of China (No. 61073121) and the Natural Science Foundation of Hebei Province (No. F2012201014).
References
[1] Lin Chun-Fu and Wang Sheng-De, "Fuzzy Support Vector Machines," IEEE Transactions on Neural Networks, 13(2), pp. 464-471, 2002.
[2] Fung G., Mangasarian O. L., "Proximal support vector machine classifiers," In: Proc. 7th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, pp. 77-86, 2001.
[3] Jayadeva, Khemchandani Reshma, Suresh Chandra, "Twin support vector machines for pattern classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(5), pp. 905-910, 2007.
[4] Kumar M. Arun, Gopal M., "Least squares twin support vector machines for pattern classification," Expert Systems with Applications, 36(4), pp. 7535-7543, 2009.
[5] Peng Xinjun, "A ν-twin support vector machine (ν-TSVM) classifier and its geometric algorithms," Information Sciences, 180, pp. 3863-3875, 2010.
[6] Blake C. L., Merz C. J., UCI Repository of Machine Learning Databases, Irvine, CA: University of California, Department of Information and Computer Sciences, http://www.ics.uci.edu/mlearn/MLRepository.html, 1998.
AUTHOR
Kai Li received the B.S. and M.S. degrees from the Mathematics Department and the Electrical Engineering Department of Hebei University, Baoding, China, in 1982 and 1992, respectively. He received the
Ph.D. degree from Beijing Jiaotong University, Beijing, China, in 2001. He is currently a Professor in
School of Mathematics and Computer Science, Hebei University. His current research interests include
machine learning, data mining, computational intelligence, and pattern recognition.
Hongyan Ma received the B.S. and M.S. degrees in Information and Computational Science and Applied Mathematics from Hebei University, Baoding, China, in 2000 and 2007, respectively. She is currently a teacher at the Industrial and Commercial College of Hebei University. Her current research
interests include machine learning, data mining and information retrieval.