College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
E-MAIL: davis525@sina.com, zhangjianpei@0451.com, yj_fmt@0451.com
0-7803-8403-2/04/$20.00 ©2004 IEEE
Authorized licensed use limited to: SLIIT - Sri Lanka Institute of Information Technology. Downloaded on June 09,2023 at 09:03:56 UTC from IEEE Xplore. Restrictions apply.
Proceedings of the Third International Conference on Machine Learning and Cybernetics, Shanghai, 26-29 August 2004
Given history data set $X_i$ and support vector set $SV_i$, optimal hyperplane $\psi_i$ constructed by $SV_i$;
new data set $X_{i+1}$;
Initialize $SV = SV_i$, $SV_r = \emptyset$;
Execute
1. Partition $X_i$ into $X_i^{+}$ and $X_i^{-}$ with $\psi_i$;
2. Train $X_{i+1}$, pick up support vector set $SV_{i+1}$ and construct $\psi_{i+1}$; partition $X_{i+1}$ into $X_{i+1}^{+}$ and $X_{i+1}^{-}$ with $\psi_{i+1}$;
3. Set $SV = SV + SV_{i+1}$;
4. Construct $\psi$ with $SV$;
5. Partition $X_i$ into $X_i^{+\prime}$ and $X_i^{-\prime}$ with $\psi$; partition $X_{i+1}$ into $X_{i+1}^{+\prime}$ and $X_{i+1}^{-\prime}$ with $\psi$;
6. Set $SV_r = (X_i^{+} - X_i^{+\prime}) + (X_i^{-} - X_i^{-\prime}) + (X_{i+1}^{+} - X_{i+1}^{+\prime}) + (X_{i+1}^{-} - X_{i+1}^{-\prime})$;
7. Set $SV = SV + SV_r$;
8. Set $i = i + 1$;
While ($SV_r \le \varepsilon$)
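The partition-difference bookkeeping of steps 5 and 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: hyperplanes are taken to be linear and given as a weight vector plus bias (the experiment uses a polynomial kernel), the SVM training of steps 2 and 4 is left out so the three hyperplanes are supplied by hand, and the names `side`, `partition_difference`, `heuristic_candidates`, and `epsilon_for` are hypothetical. The threshold helper follows the 3% rule stated for the experiment, and the comparison against it mirrors the loop condition as printed.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
# Assumption: a linear hyperplane (w, b) with decision value w.x + b.
Hyperplane = Tuple[Tuple[float, ...], float]


def side(plane: Hyperplane, x: Point) -> int:
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    w, b = plane
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if value >= 0 else -1


def partition_difference(points: Sequence[Point],
                         old_plane: Hyperplane,
                         new_plane: Hyperplane) -> List[Point]:
    """Points that the two hyperplanes assign to different sides.

    This equals (X+ - X+') + (X- - X-') for one data set in step 6.
    """
    return [x for x in points if side(old_plane, x) != side(new_plane, x)]


def heuristic_candidates(history: Sequence[Point],
                         new_batch: Sequence[Point],
                         plane_i: Hyperplane,
                         plane_next: Hyperplane,
                         plane_merged: Hyperplane) -> List[Point]:
    """Step 6: partition difference of both data sets under the merged hyperplane."""
    return (partition_difference(history, plane_i, plane_merged)
            + partition_difference(new_batch, plane_next, plane_merged))


def epsilon_for(batch_size: int, fraction: float = 0.03) -> float:
    """Threshold as a fraction of the batch size (3% in the experiment)."""
    return fraction * batch_size


# Toy data, invented for illustration only.
history = [(0.0, 2.0), (1.0, 1.0), (3.0, 0.5), (4.0, 3.0)]
new_batch = [(1.5, 0.0), (2.5, 2.0), (5.0, 1.0)]
plane_i = ((1.0, 0.0), -2.5)       # hyperplane from the history support vectors
plane_next = ((1.0, 0.0), -2.0)    # hyperplane trained on the new batch
plane_merged = ((1.0, 0.0), -3.5)  # hyperplane built from the merged support vectors

sv_r = heuristic_candidates(history, new_batch, plane_i, plane_next, plane_merged)
print(sv_r)  # the points whose side changed under the merged hyperplane
print(len(sv_r) <= epsilon_for(len(new_batch)))
```

In a full implementation the points in `sv_r` would be appended to the support vector set (step 7) before the next batch arrives; any SVM library that exposes its support vectors could supply the three hyperplanes.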
algorithm and the heuristic incremental SVM learning algorithm in an experiment on a business text database. The data set is labeled by hand and consists of 1675 data points, each of dimension 27. 368 data points are taken at random as the initial training set and 216 as the test set; the remaining data points are separated into 4 subsets, and the polynomial kernel is used. The value of $\varepsilon$ is 3% of the size of each subset. Table 1 provides the classification precision results for this experiment.
Because the data points in the partition difference set are added to the support vector set, the heuristic incremental SVM learning algorithm keeps more support vectors than the batch incremental case. Table 2 provides the support

                 Number of      batch incremental    heuristic incremental
                 data points    SVM (#SVs)           SVM (#SVs)
training data    368            46                   46
subset 1         288            I1                   131
subset 2         326            115                  146
subset 3         205            138                  163
subset 4         212            165                  182

The following criteria are often used to evaluate an incremental classification algorithm: the classification precision, the scalability of learning and classification on
large data sets, and the robustness to noise. Table 1 shows that the classification precision improves over the incremental steps. Because $\varepsilon$ is determined only by the scale of the data and the required classification precision, and not by the data distribution, the algorithm proposed in this paper satisfies these criteria on the whole.

5. Conclusions

SVM is well suited to incremental learning for large-scale data classification because of its outstanding power to summarize the data space. This paper proposes a heuristic incremental SVM learning algorithm that takes into account the possible influence of a new data set on the history data. From the partition difference of the training data sets, it collects as support vectors additional data points that contribute more to the final hyperplane. Experiments show that the algorithm deals efficiently with large-scale data classification problems and achieves higher classification precision.
The results achieved in this paper are promising, and further research will be carried out on data classification with larger and more varied data.

Acknowledgements

This paper is sponsored by the Natural Science Foundation of Heilongjiang Province under Grant No. Fo304.