Key Lab. of Machine Learning and Computational Intelligence, College of Mathematics and Computer Science, Hebei
University, Baoding 071002, China
E-MAIL: mclsx@hbu.cn, mengjie655@163.com
map data vectors from the input space to a high-dimensional feature space using a nonlinear mapping Φ. The mapping Φ can be replaced by a kernel function k(x_i, x_j) that obeys Mercer's condition: only the inner products between the support vectors Φ(x_i) and the pattern Φ(x) need to be computed, without using the specific form of Φ. The commonly used kernel functions are polynomial functions, radial basis functions and certain sigmoid functions. For an unknown input pattern x, we have the following discriminative function:

f(x) = \mathrm{sgn}\Big( \sum_{i=1}^{n} \alpha_i y_i k(x_i, x) + b \Big),

where n is the number of the support vectors, which satisfy the inequality 0 < \alpha_i < C.

(4) Find the smallest sphere; the squared distance between a given test point z and the center a is then

\| z - a \|^2 = (z \cdot z) - 2 \sum_i \alpha_i (z \cdot x_i) + \sum_i \sum_j \alpha_i \alpha_j (x_i \cdot x_j). \quad (12)

Typically, the decision of whether z belongs to the same class as the training data is made by comparing this distance with the radius: if the distance is greater than the radius, z is rejected; otherwise it is accepted. Analogous to the standard SVM of Section 2, all the inner products can simply be replaced by the kernel function k(x_i, x_j); in the kernel version, the hyper-sphere lives in a high (maybe infinite) dimensional space induced by the kernel.
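To make the test rule concrete, the following is a minimal sketch (not the authors' code) of the kernelized distance of Eq. (12) and the accept/reject decision. The RBF kernel and its gamma value are illustrative choices, and X_sv, alpha and radius_sq are assumed to come from an already-trained SVDD model.

```python
import numpy as np

def rbf(u, v, gamma=0.5):
    # Illustrative RBF kernel; gamma = 0.5 is an assumed value.
    d = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return np.exp(-gamma * np.dot(d, d))

def svdd_squared_distance(z, X_sv, alpha, kernel=rbf):
    # Eq. (12) with every inner product replaced by the kernel:
    # ||z-a||^2 = k(z,z) - 2*sum_i a_i k(z,x_i) + sum_ij a_i a_j k(x_i,x_j)
    alpha = np.asarray(alpha, dtype=float)
    k_zz = kernel(z, z)
    k_zx = np.array([kernel(z, x) for x in X_sv])
    K = np.array([[kernel(xi, xj) for xj in X_sv] for xi in X_sv])
    return k_zz - 2.0 * alpha @ k_zx + alpha @ K @ alpha

def svdd_accepts(z, X_sv, alpha, radius_sq, kernel=rbf):
    # Accept z as belonging to the described class iff it lies no
    # farther from the center than the sphere radius; reject otherwise.
    return svdd_squared_distance(z, X_sv, alpha, kernel) <= radius_sq
```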
4. SVM classifier based on the reduced samples
The algorithm we propose is described in detail below. There are two key steps: first, the samples outside the sphere are removed using the SVDD algorithm of Section 3; second, for the samples inside the sphere, the edge points are removed according to the Euclidean distance. The proposed algorithm proceeds as follows:
(1) Cluster initialization: each class of the data is clustered into several groups using k-means. Take a dataset with two classes as an example: suppose the positive class P is clustered into u groups and the negative class N into v groups. Then the positive class can be denoted as
P = P1 ∪ P2 ∪ … ∪ Pu,
and the negative class as
N = N1 ∪ N2 ∪ … ∪ Nv.
Here we take u = 2, v = 1 as an example.
(2) SVDD: use the SVDD algorithm described in Section 3 to obtain the smallest spheres S_P1, S_P2 and S_N1 for P1, P2 and N1, and remove the samples that fall outside their spheres.
(3) Euclidean distance: for the samples inside the sphere, remove the edge points according to the distance of each point xi to the center of the negative class. As in the example above, the positive class has two groups while the negative class has only one. So we first use the method above to reduce the negative class using only one of the groups (S_P1); then, for the remaining samples of the negative class, the same method is used to remove the edge samples with the other group (S_P2). The reduced samples are thus obtained.
(4) The samples reduced by the above three steps are used to train the SVM classifier and find the optimal hyper-plane (a sketch of the full pipeline is given below, after the figure captions).

Figure 1. The initial clusters of the samples
Figure 2. The samples reduced by SVDD
Figure 3. The samples reduced based on the Euclidean distance
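Putting the four steps together, the following is a minimal sketch of the pipeline under stated assumptions, not the authors' implementation. It uses scikit-learn; OneClassSVM with an RBF kernel stands in for the SVDD of Section 3 (the two coincide for RBF kernels), the class centers are taken to be the class means, the edge-point rule is a simplified reading of step (3), and nu and edge_frac are illustrative values the paper does not specify.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC, OneClassSVM

def keep_inside_sphere(X, nu=0.1):
    # Step (2): keep only the samples inside the enclosing sphere.
    # OneClassSVM with an RBF kernel stands in for SVDD here.
    sphere = OneClassSVM(kernel="rbf", nu=nu).fit(X)
    return X[sphere.predict(X) == 1]

def drop_edge_points(X, other_center, edge_frac=0.1):
    # Step (3): drop the edge_frac of points whose Euclidean distance
    # to the other class's center is smallest (a simplified reading).
    d = np.linalg.norm(X - other_center, axis=1)
    return X[d >= np.quantile(d, edge_frac)]

def reduce_and_train(X_pos, X_neg, u=2, v=1):
    # Step (1): cluster each class with k-means (u = 2, v = 1 as in the example).
    lp = KMeans(n_clusters=u, n_init=10).fit_predict(X_pos)
    ln = KMeans(n_clusters=v, n_init=10).fit_predict(X_neg)
    pos_groups = [X_pos[lp == i] for i in range(u)]
    neg_groups = [X_neg[ln == i] for i in range(v)]

    # Steps (2)-(3), applied group by group; class centers are the
    # class means (an assumption).
    pos_red = [drop_edge_points(keep_inside_sphere(G), X_neg.mean(axis=0))
               for G in pos_groups]
    neg_red = [drop_edge_points(keep_inside_sphere(G), X_pos.mean(axis=0))
               for G in neg_groups]

    # Step (4): train a standard SVM on the reduced samples.
    X = np.vstack(pos_red + neg_red)
    y = np.hstack([np.ones(sum(len(G) for G in pos_red)),
                   -np.ones(sum(len(G) for G in neg_red))])
    return SVC(kernel="rbf", C=1.0).fit(X, y)
```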
TABLE 1 (continued)
datasets             samples   features
ijcnn1               49990     22
cod-rna-scale        59535     8

TABLE 2. THE OPTIONS OF THE TRAINING PARAMETERS
datasets             t         c     v
german.numer-scale   rbf       1     10
svmguide3            rbf       10    10
svmguide1-scale      linear    100   10
ijcnn1               rbf       1     10
cod-rna-scale        rbf       10    10
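Reading Table 2's columns as LIBSVM's training options (t: kernel type, c: the penalty parameter C, v: v-fold cross-validation), the setup can be reproduced with LIBSVM's scikit-learn wrapper. A minimal sketch; the local data paths and SVC's default gamma are assumptions, since the paper reports neither.

```python
# Reproduce Table 2's training options with scikit-learn's
# LIBSVM-backed SVC and v-fold cross-validation.
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

options = {  # dataset: (t, c, v), copied from Table 2
    "german.numer-scale": ("rbf", 1, 10),
    "svmguide3": ("rbf", 10, 10),
    "svmguide1-scale": ("linear", 100, 10),
    "ijcnn1": ("rbf", 1, 10),
    "cod-rna-scale": ("rbf", 10, 10),
}

for name, (kernel, c, v) in options.items():
    X, y = load_svmlight_file(f"data/{name}")  # hypothetical path
    scores = cross_val_score(SVC(kernel=kernel, C=c), X, y, cv=v)
    print(f"{name}: mean {v}-fold CV accuracy = {scores.mean():.4f}")
```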
…greatly: the accuracy is improved from 94.9677% to 96.4548%, and the training time is decreased from 1282.69580s to 42.9563s (a speed-up of roughly 30 times).

As the results above show, the proposed algorithm, especially on large datasets, performs well: it decreases the consumption of computer memory, improves the classification accuracy, and accelerates the training of SVM.

6. Conclusion

In this paper we present a new sample reduction method, which first reduces the training samples through the SVDD algorithm and then removes the edge points based on the Euclidean distance. The experimental results show that the new algorithm is capable of reducing the number of samples as well as the training time while maintaining high accuracy.

Acknowledgements

This research is supported in part by the Natural Science Foundation of Hebei Province (No. F2008000635), the plan of the Natural Science Foundation of Hebei University (doctor project) (No. Y2008122), the key project foundation of applied fundamental research of Hebei Province (No. 08963522D), and the Scientific Research Project of the Department of Education of Hebei Province (No. 2009107).