
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 26, NO. 10, OCTOBER 2015

Twin Support Vector Machine for Clustering


Zhen Wang, Yuan-Hai Shao, Lan Bai, and Nai-Yang Deng

Abstract— The twin support vector machine (TWSVM) is one of the powerful classification methods. In this brief, a TWSVM-type clustering method, called twin support vector clustering (TWSVC), is proposed. Our TWSVC includes both linear and nonlinear versions. It determines k cluster center planes by solving a series of quadratic programming problems. To make TWSVC more efficient and stable, an initialization algorithm based on the nearest neighbor graph is also suggested. The experimental results on several benchmark data sets have shown a comparable performance of our TWSVC.

Index Terms— Manifold clustering, plane-based clustering, twin support vector machine (TWSVM), unsupervised learning.

Manuscript received June 17, 2014; revised October 26, 2014; accepted December 5, 2014. Date of publication January 6, 2015; date of current version September 16, 2015. This work was supported in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LQ12A01020 and in part by the National Natural Science Foundation of China under Grant 11201426, Grant 10971223, and Grant 11371365. (Corresponding authors: Z. Wang and Y.-H. Shao.)
Z. Wang and L. Bai are with the School of Mathematical Sciences, Inner Mongolia University, Hohhot 010021, China (e-mail: wangz11@mails.jlu.edu.cn; bailanhaomei@163.com).
Y.-H. Shao is with Zhijiang College, Zhejiang University of Technology, Hangzhou 310014, China (e-mail: shaoyuanhai21@163.com).
N.-Y. Deng is with the College of Science, China Agricultural University, Beijing 100083, China (e-mail: dengnaiyang@cau.edu.cn).
Digital Object Identifier 10.1109/TNNLS.2014.2379930

I. INTRODUCTION

Clustering, which aims at dividing the data samples into different clusters, is one of the most fundamental problems in machine learning [1]–[4]. It has been applied to many real-world problems, e.g., marketing, text mining, and web analysis [5], [6].

In contrast to the classical point-based clustering methods (such as k-means [7]), plane-based clustering methods [such as k-plane clustering (kPC) [8] and proximal plane clustering (PPC) [9]] have also been proposed. However, kPC ignores the influence between the clusters, and PPC may have difficulty when the matrix in its objective is not positive definite.

In this brief, we propose a novel plane-based clustering method based on the twin support vector machine (TWSVM) [10], named twin support vector clustering (TWSVC). TWSVM is a milestone in the development of plane-based classification and has been widely studied [11], [12]. This is the first time that TWSVM has been extended to the clustering problem. The main contributions of this brief include the following.
1) Following the spirit of TWSVM, our TWSVC exploits information both within clusters and between clusters.
2) Different from TWSVM, where each class center plane is required to keep the samples of the other classes far away from only one side, in our TWSVC this one-sided requirement is replaced by the more reasonable requirement of being far away from both sides of the cluster center plane.
3) Linear TWSVC is extended to the nonlinear case by the kernel trick to cope with manifold clustering [13]–[15].
4) To make our TWSVC more efficient and stable, an initialization algorithm based on the nearest neighbor graph (NNG) is proposed. The experimental results on several benchmark data sets have shown that our TWSVC performs better than the relevant plane-based clustering methods.

The rest of this brief is organized as follows. We give a quick review of k-means, kPC, and TWSVM in Section II. Our TWSVC with the corresponding initializations is described in Section III. The experiments and the conclusion are arranged in Sections IV and V, respectively.

II. BACKGROUND

In this brief, we consider m data samples {x_1, x_2, ..., x_m} in the n-dimensional real vector space R^n. Assume these m samples belong to k classes or clusters with corresponding labels in {1, 2, ..., k}, and are represented by the m × n matrix X = (x_1, x_2, ..., x_m)^T. We further organize the data samples with label i into the matrix X_i and those with the remaining labels into the matrix X̂_i, where i = 1, 2, ..., k. For the reader's convenience, the symbols X, X_i, and X̂_i will also refer to the corresponding sets, depending on the specific context in which they appear. For example, the symbol X can be understood as X = (x_1, x_2, ..., x_m)^T or X = {x_1, x_2, ..., x_m}.

A. k-Means

Consider the clustering problem with a set X of m unlabeled data samples in R^n. k-means [7], [16] wishes to cluster X into k clusters X_1, X_2, ..., X_k such that the data samples lie around k cluster center points

Center-point_i := x̄_i ∈ R^n, i = 1, 2, ..., k.    (1)

It tries to find the solution to the following problem with i = 1, ..., k:

$\min_{\bar{x}_i, X_i} \; \frac{1}{2} \sum_{j=1}^{m_i} \| X_i(j) - \bar{x}_i \|^2$    (2)

where X_i(j) denotes the jth sample in X_i, m_i is the number of samples in X_i, m_1 + ... + m_k = m, and ||·|| denotes the L_2 norm.

However, k-means does not solve (2) directly, since X_i ⊂ X is a discrete variable. In practice, it always starts from a random initial assignment of the samples. Then, the means of the corresponding cluster samples are set as the k cluster centers (1), since the mean is the solution to (2) when X_i is given. Next, each sample is relabeled according to its nearest cluster center by

$y = \arg\min_i \{ \| x - \bar{x}_i \|, \; i = 1, \ldots, k \}.$    (3)

The cluster center points and the sample labels are updated alternately until some termination conditions are satisfied.
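This alternation between the center update (2) and the relabeling (3) is short enough to write down directly. The following Python/NumPy code is a minimal illustrative sketch of the classical k-means loop, not the authors' implementation; the function name, the random restart for an empty cluster, and the stopping test are our own choices.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain k-means: alternate between the center update (2) and relabeling (3)."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    labels = rng.integers(0, k, size=m)          # random initial assignment
    centers = None
    for _ in range(max_iter):
        # cluster centers = means of the current clusters (solution of (2) for fixed X_i)
        centers = np.stack([X[labels == i].mean(axis=0) if np.any(labels == i)
                            else X[rng.integers(m)] for i in range(k)])
        # relabel each sample by its nearest center, as in (3)
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):   # terminate when labels stop changing
            break
        labels = new_labels
    return labels, centers
```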
B. kPC

For the clustering problem, kPC [8] wishes to cluster X into k clusters such that the data samples lie along k cluster center planes

Center-plane_i := w_i^T x + b_i = 0, i = 1, ..., k    (4)

where w_i ∈ R^n and b_i ∈ R. It tries to find the solution to the following problem with i = 1, ..., k:

$\min_{w_i, b_i, X_i} \; \frac{1}{2} \| X_i w_i + b_i e \|^2 \quad \text{s.t.} \; \| w_i \|^2 = 1$    (5)

where e is a column vector of ones with an appropriate dimension.

In practice, kPC also starts from a random initial assignment of the samples. For a certain X_i, it obtains the solution to (5) by solving an eigenvalue problem [8] with i = 1, ..., k. Then, each sample is relabeled by

$y = \arg\min_i \; | w_i^T x + b_i |, \quad i = 1, \ldots, k$    (6)

where |·| denotes the absolute value. The cluster center planes and the sample labels are also updated alternately until some termination conditions are satisfied.
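For a fixed X_i, the minimizer of (5) can be computed from the centered scatter matrix of the cluster; this is the same eigenvalue problem that reappears as (19) in Section III-C. The sketch below is our own NumPy illustration, not the authors' code: it assumes the optimal intercept b_i = −e^T X_i w_i / (e^T e) and takes w_i as the eigenvector of the scatter matrix with the smallest eigenvalue.

```python
import numpy as np

def kpc_plane(Xi):
    """Fit one kPC cluster center plane w^T x + b = 0 for the samples in Xi (rows)."""
    mean = Xi.mean(axis=0)
    S = (Xi - mean).T @ (Xi - mean)          # X_i^T (I - e e^T / e^T e) X_i
    eigvals, eigvecs = np.linalg.eigh(S)     # eigenvalues in ascending order
    w = eigvecs[:, 0]                        # eigenvector of the smallest eigenvalue
    b = -mean @ w                            # b = -e^T X_i w / (e^T e)
    return w, b

def kpc_relabel(X, planes):
    """Relabel every sample by the nearest plane, as in (6)."""
    dist = np.abs(np.stack([X @ w + b for (w, b) in planes], axis=1))
    return dist.argmin(axis=1)
```

Alternating kpc_plane over the current clusters and kpc_relabel over all samples reproduces the kPC loop described above.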
C. TWSVM

For the classification problem, given k classes of samples X_1, X_2, ..., X_k as the training set X, TWSVM [10], [17] seeks k class center planes, which can be expressed as

Center-plane_i := w_i^T x + b_i = 0, i = 1, ..., k.    (7)

Each plane is close to the samples of one class and far away from the samples of the other classes from one side. To find the planes in (7), it is required to solve the following primal problem with i = 1, ..., k:

$\min_{w_i, b_i, \xi_i} \; \frac{1}{2} \| X_i w_i + b_i e \|^2 + c e^T \xi_i \quad \text{s.t.} \; \hat{X}_i w_i + b_i e \ge e - \xi_i, \; \xi_i \ge 0$    (8)

where c > 0 is a penalty parameter and ξ_i ∈ R^{m−m_i} is a slack vector.

The problem in (8) is a quadratic programming problem (QPP), and its geometric meaning is clear. For example, when i = 1, its objective function makes the data samples in Class 1 proximal to the first class center plane w_1^T x + b_1 = 0, while the constraints make the data samples in the rest of the classes keep a distance of at least 1 from this plane from one side.

A new sample x is assigned to a class depending on its distances to these k class center planes.
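The decision rule can be written compactly. The snippet below is an illustrative helper of our own (not from the paper) that assigns a new sample to the class whose center plane is nearest in the point-to-plane distance |w_i^T x + b_i| / ||w_i||.

```python
import numpy as np

def twsvm_predict(x, planes):
    """planes: list of (w_i, b_i); return the index of the nearest class center plane."""
    dists = [abs(w @ x + b) / np.linalg.norm(w) for (w, b) in planes]
    return int(np.argmin(dists))
```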
III. TWIN SUPPORT VECTOR CLUSTERING

A. Linear TWSVC

For the clustering problem, the proposed TWSVC seeks k cluster center planes

Center-plane_i := w_i^T x + b_i = 0, i = 1, ..., k    (9)

by considering the following problem with i = 1, ..., k:

$\min_{w_i, b_i, \xi_i, X_i} \; \frac{1}{2} \| X_i w_i + b_i e \|^2 + c e^T \xi_i \quad \text{s.t.} \; | \hat{X}_i w_i + b_i e | \ge e - \xi_i, \; \xi_i \ge 0.$    (10)

Fig. 1. Geometric interpretation of TWSVC.

Now, we give a simple example in Fig. 1 with three clusters to show the geometric interpretation of our TWSVC. The first, the second, and the third cluster samples are represented by purple ◦, blue ×, and green +, respectively. The first cluster center plane w_1^T x + b_1 = 0 (red solid) is required to be as close as possible to the first cluster samples (purple ◦) and far away from the samples of the other clusters (both blue × and green +). More precisely, the ith cluster center plane in our TWSVC is required to be as close as possible to the ith cluster X_i and far away from the other clusters X̂_i from both sides, with i = 1, ..., k.

With the initial cluster labels of X, TWSVC updates all the cluster center planes and the sample labels alternately, similar to kPC, until some termination conditions are satisfied.

In the following, we solve (10) with a certain X_i. Note that (10) can be solved by the concave–convex procedure (CCCP) [18], which decomposes the ith problem in (10) into a series of convex quadratic subproblems with the initial w_i^0 and b_i^0 as follows:

$\min_{w_i^{j+1}, b_i^{j+1}, \xi_i^{j+1}} \; \frac{1}{2} \| X_i w_i^{j+1} + b_i^{j+1} e \|^2 + c e^T \xi_i^{j+1} \quad \text{s.t.} \; T(| \hat{X}_i w_i^{j+1} + b_i^{j+1} e |) \ge e - \xi_i^{j+1}, \; \xi_i^{j+1} \ge 0$    (11)

where the index of the subproblem is j = 0, 1, 2, ..., and T(·) denotes the first-order Taylor expansion.

By introducing the subgradient [19] of $|\hat{X}_i w_i^j + b_i^j e|$ with respect to w_i and b_i, we have $\nabla(|\hat{X}_i w_i^j + b_i^j e|) = \mathrm{diag}(\mathrm{sign}(\hat{X}_i w_i^j + b_i^j e))[\hat{X}_i, e]$. Note that $|\hat{X}_i w_i^j + b_i^j e| = \mathrm{diag}(\mathrm{sign}(\hat{X}_i w_i^j + b_i^j e))(\hat{X}_i w_i^j + b_i^j e)$; then

$T(|\hat{X}_i w_i^{j+1} + b_i^{j+1} e|) = |\hat{X}_i w_i^j + b_i^j e| + \nabla(|\hat{X}_i w_i^j + b_i^j e|)\,([w_i^{j+1}; b_i^{j+1}] - [w_i^j; b_i^j])$
$= \mathrm{diag}(\mathrm{sign}(\hat{X}_i w_i^j + b_i^j e))[\hat{X}_i, e][w_i^{j+1}; b_i^{j+1}] + |\hat{X}_i w_i^j + b_i^j e| - \mathrm{diag}(\mathrm{sign}(\hat{X}_i w_i^j + b_i^j e))[\hat{X}_i, e][w_i^j; b_i^j]$
$= \mathrm{diag}(\mathrm{sign}(\hat{X}_i w_i^j + b_i^j e))(\hat{X}_i w_i^{j+1} + b_i^{j+1} e).$    (12)

Thus, (11) becomes

$\min_{w_i^{j+1}, b_i^{j+1}, \xi_i^{j+1}} \; \frac{1}{2} \| X_i w_i^{j+1} + b_i^{j+1} e \|^2 + c e^T \xi_i^{j+1} \quad \text{s.t.} \; \mathrm{diag}(\mathrm{sign}(\hat{X}_i w_i^j + b_i^j e))(\hat{X}_i w_i^{j+1} + b_i^{j+1} e) \ge e - \xi_i^{j+1}, \; \xi_i^{j+1} \ge 0.$    (13)

Inspired by SVM [20], [21] and TWSVM [10], [11], the solution $[w_i^{j+1}; b_i^{j+1}]$ to (13) can be obtained by solving its dual problem

$\min_{\alpha} \; \frac{1}{2} \alpha^T G (H^T H)^{-1} G^T \alpha - e^T \alpha \quad \text{s.t.} \; 0 \le \alpha \le c e$    (14)

where $G = \mathrm{diag}(\mathrm{sign}(\hat{X}_i w_i^j + b_i^j e))[\hat{X}_i, e]$, $H = [X_i, e]$, and α ∈ R^{m−m_i} is the Lagrangian multiplier vector.

Problem (14) is a convex QPP and can be solved efficiently by the successive overrelaxation method [22], which is an iterative method for solving systems of linear equations and has been successfully extended to solve problems of this form [11]. Thus, the solution to (13) can be obtained from the solution to (14) by

$[w_i^{j+1}; b_i^{j+1}] = (H^T H)^{-1} G^T \alpha.$    (15)
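To make (13)–(15) concrete, the following NumPy sketch performs one CCCP update: it builds G and H, solves the box-constrained dual (14) by a simple projected-gradient loop (a stand-in for the successive overrelaxation solver of [22]), and recovers [w_i^{j+1}; b_i^{j+1}] from (15). This is our own illustrative code, not the authors' implementation; the ridge term eps added to H^T H and the step-size rule are numerical safeguards of our own, not part of the paper.

```python
import numpy as np

def cccp_subproblem(Xi, Xhat, w, b, c, eps=1e-8, iters=500):
    """One CCCP step (13)-(15) for cluster i.

    Xi:   samples currently in cluster i (rows); Xhat: the remaining samples.
    (w, b): current plane [w_i^j; b_i^j]; c: penalty parameter.
    Returns the updated plane [w_i^{j+1}; b_i^{j+1}].
    """
    e_hat = np.ones(Xhat.shape[0])
    H = np.hstack([Xi, np.ones((Xi.shape[0], 1))])                  # H = [X_i, e]
    s = np.sign(Xhat @ w + b)
    s[s == 0] = 1.0
    G = s[:, None] * np.hstack([Xhat, np.ones((Xhat.shape[0], 1))])  # G = diag(sign(.)) [Xhat_i, e]
    HtH_inv = np.linalg.inv(H.T @ H + eps * np.eye(H.shape[1]))      # small ridge for stability
    Q = G @ HtH_inv @ G.T                                            # dual Hessian of (14)

    # projected gradient descent on the dual: min 0.5 a'Qa - e'a, s.t. 0 <= a <= c
    alpha = np.zeros(Xhat.shape[0])
    step = 1.0 / (np.linalg.norm(Q, 2) + eps)
    for _ in range(iters):
        grad = Q @ alpha - e_hat
        alpha = np.clip(alpha - step * grad, 0.0, c)

    wb = HtH_inv @ G.T @ alpha                                       # recover [w; b] via (15)
    return wb[:-1], wb[-1]
```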

TABLE I
Accuracies of the Linear Clustering Methods on the Benchmark Data Sets

In short, for i = 1, 2, ..., k, (10) can be solved by the following procedure.
1) Select the initial [w_i^0; b_i^0].
2) For j = 0, 1, ..., find [w_i^{j+1}; b_i^{j+1}] by (15).
3) Stop if ||[w_i^{j+1}; b_i^{j+1}] − [w_i^j; b_i^j]|| is small enough, and then set w_i = w_i^{j+1}, b_i = b_i^{j+1}.
It has been proved that the CCCP is able to find a local solution to (10) [18]. Once the solution [w_i; b_i] with i = 1, ..., k is obtained, the cluster labels of the data samples can be updated by (6).
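The overall linear TWSVC alternation, i.e., the three-step CCCP procedure above for each cluster followed by the relabeling (6), might be organized as in the sketch below. This is an illustrative wrapper of our own: `subproblem_update` is assumed to be a routine such as the one sketched after (15), the crude initial plane stands in for the kPC-based initialization of Section III-C, and empty clusters are not handled.

```python
import numpy as np

def twsvc_linear(X, labels, k, c, subproblem_update, max_outer=50, tol=1e-4, max_cccp=20):
    """Alternate plane updates (CCCP on (10)) and label updates (6) until labels stop changing."""
    m, n = X.shape
    planes = []
    for _ in range(max_outer):
        planes = []
        for i in range(k):
            Xi, Xhat = X[labels == i], X[labels != i]
            w, b = np.ones(n) / np.sqrt(n), 0.0        # crude initial [w_i^0; b_i^0]; see Section III-C
            for _ in range(max_cccp):                  # steps 1)-3) of the procedure
                w_new, b_new = subproblem_update(Xi, Xhat, w, b, c)
                done = np.linalg.norm(np.append(w_new - w, b_new - b)) < tol
                w, b = w_new, b_new
                if done:
                    break
            planes.append((w, b))
        # relabel every sample by the nearest cluster center plane, as in (6)
        dist = np.abs(np.stack([X @ w + b for (w, b) in planes], axis=1))
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, planes
```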
B. Nonlinear TWSVC

Now, let us turn to extending the above linear TWSVC to manifold clustering by the kernel trick [23]. Similar to [10], [11], and [24], our nonlinear TWSVC seeks k cluster center manifolds in an appropriate kernel-generated space as follows:

Center-manifold_i := K(x, X)u_i + γ_i = 0, i = 1, 2, ..., k    (16)

where K(·, ·) is an appropriate kernel function [10], [11], u_i ∈ R^m, and γ_i ∈ R.

The counterpart of (10) is

$\min_{u_i, \gamma_i, \eta_i, X_i} \; \frac{1}{2} \| K(X_i, X) u_i + \gamma_i e \|^2 + c e^T \eta_i \quad \text{s.t.} \; | K(\hat{X}_i, X) u_i + \gamma_i e | \ge e - \eta_i, \; \eta_i \ge 0$    (17)

where η_i is a slack vector, i = 1, 2, ..., k. The above problem can also be solved by the CCCP similar to the linear case. The details are omitted.
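Although the details are omitted in the paper, one way to read (16)–(17) is that the kernelized problem has the same algebraic form as the linear one, with K(X_i, X) and K(X̂_i, X) in place of X_i and X̂_i and [u_i; γ_i] in place of [w_i; b_i]. The sketch below only shows this substitution with an RBF kernel; the kernel choice, the function names, and the reuse of a linear-style subproblem solver are our assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(A, B, mu=1.0):
    """K(A, B) with entries exp(-mu * ||a - b||^2)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-mu * sq)

def nonlinear_cluster_matrices(X, labels, i, mu=1.0):
    """Form the kernelized counterparts of X_i and Xhat_i used in (17)."""
    Ki = rbf_kernel(X[labels == i], X, mu)     # plays the role of X_i
    Khat = rbf_kernel(X[labels != i], X, mu)   # plays the role of Xhat_i
    return Ki, Khat

# Ki and Khat can then be fed to the same CCCP subproblem routine used in the
# linear case, yielding [u_i; gamma_i] instead of [w_i; b_i]; a new sample x is
# then scored by |K(x, X) u_i + gamma_i| in place of |w_i^T x + b_i|.
```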
C. Initializations

First, we consider the initialization of the labels of TWSVC. Traditionally, the initial labels in clustering are randomly generated. However, the experiments on k-means [7], kPC [8], and PPC [9] have indicated that the results are unstable and strongly depend on the initial labels. Therefore, we present an initialization algorithm based on the NNG [25], which has been frequently used in manifold-based learning. The main process is as follows.
1) For the given data set, select a positive integer parameter p and then construct the p nearest neighbor undirected graph, i.e., for i = 1, ..., m, find x_i's p nearest neighbors and connect x_i with its neighbors.
2) Create the clusters by associating the connected samples, resulting in t clusters. If the current number of clusters t is equal to k, stop.
3) If t < k, disconnect the two connected samples with the largest distance and go to step 2).
4) If t > k, compute the Hausdorff distance [26] between every two clusters among the t clusters and sort all pairs in ascending order. Merge the nearest pair of clusters into one, until k clusters are formed, where the Hausdorff distance between two sets S_1 and S_2 of samples is defined as

$h(S_1, S_2) = \max\{ \max_{x \in S_1} \min_{y \in S_2} \| x - y \|, \; \max_{x \in S_2} \min_{y \in S_1} \| x - y \| \}.$    (18)

Second, we consider the initialization of the CCCP in our TWSVC, where one needs to select an initial point [w_i^0; b_i^0]. Noting the relationship between our TWSVC and kPC, the solution to (5) in kPC is taken. Problem (5) can be converted to an eigenvalue problem by the Karush–Kuhn–Tucker conditions [27] as

$X_i^T \Big( I - \frac{1}{e^T e} e e^T \Big) X_i \, w_i^0 = \lambda w_i^0$    (19)

where I is an identity matrix, so that w_i^0 should be the eigenvector corresponding to the smallest eigenvalue λ, and $b_i^0 = - e^T X_i w_i^0 / (e^T e)$, i = 1, ..., k.
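A compact implementation of the NNG-based label initialization might look as follows. This Python/SciPy sketch is our own reading of steps 1)–4): it builds the symmetric p-nearest-neighbor graph, deletes the longest remaining edge while there are too few clusters, and merges the closest pair of clusters in Hausdorff distance (18) while there are too many; ties and degenerate cases are handled in the simplest way.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def nng_init_labels(X, k, p=3):
    """NNG-based initial cluster labels (steps 1)-4) of Section III-C)."""
    m = X.shape[0]
    D = cdist(X, X)
    # step 1): connect every sample with its p nearest neighbors (undirected graph)
    A = np.zeros((m, m), dtype=bool)
    for i in range(m):
        nbrs = np.argsort(D[i])[1:p + 1]
        A[i, nbrs] = A[nbrs, i] = True
    # step 2): clusters = connected components of the graph
    t, labels = connected_components(csr_matrix(A), directed=False)
    # step 3): too few clusters -> delete the longest remaining edge and recompute
    while t < k:
        edges = np.argwhere(A)
        i, j = max(map(tuple, edges), key=lambda e: D[e])
        A[i, j] = A[j, i] = False
        t, labels = connected_components(csr_matrix(A), directed=False)
    # step 4): too many clusters -> merge the pair with the smallest Hausdorff distance (18)
    while t > k:
        ids = np.unique(labels)
        best, pair = np.inf, None
        for a in range(len(ids)):
            for b in range(a + 1, len(ids)):
                Dab = D[np.ix_(labels == ids[a], labels == ids[b])]
                h = max(Dab.min(axis=1).max(), Dab.min(axis=0).max())   # Hausdorff distance
                if h < best:
                    best, pair = h, (ids[a], ids[b])
        labels[labels == pair[1]] = pair[0]
        t -= 1
    return np.unique(labels, return_inverse=True)[1]   # relabel to 0..k-1
```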

TABLE II
Accuracies of the Manifold Clustering Methods on the Benchmark Data Sets

Fig. 2. Illustration of the effectiveness of linear TWSVC with different parameters.

IV. EXPERIMENTAL RESULTS

In this section, we analyze the performance of our TWSVC compared with k-means (linear and nonlinear formulations [7], [28]), kPC (linear and nonlinear formulations [8]), PPC (linear and nonlinear formulations [9], where the nonlinear formulation can be obtained easily by the kernel trick as in TWSVC), fuzzy c-means (FCM, linear formulation [29]), and the Camastra method (nonlinear formulation [30]) on several benchmark data sets [31], [32]. All the methods are implemented in MATLAB [33] on a PC with an Intel Core Duo processor (dual-core, 3.4 GHz) and 4-GB RAM.

In the experiments, we used the metric accuracy to measure the performance of these methods [34]. Given the cluster labels y_i ∈ N, i = 1, ..., m, it is easy to compute the corresponding similarity matrix M ∈ R^{m×m}, where

$M(i, j) = \begin{cases} 1, & \text{if } y_i = y_j \\ 0, & \text{otherwise.} \end{cases}$    (20)

Suppose M_t is the similarity matrix computed from the true cluster labels of the data set, and M_p is the one computed from the prediction of a clustering method. Then, the metric accuracy of the clustering method is defined as the Rand statistic [34]

$\text{Accuracy} = \frac{n_{00} + n_{11} - m}{m^2 - m} \times 100\%$    (21)

where n_00 is the number of entries that are zero in both M_p and M_t, and n_11 is the number of entries that are one in both M_p and M_t.
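The Rand-statistic accuracy of (20)–(21) can be computed directly from the two label vectors; the following NumPy function is an illustrative helper of our own, assuming integer label arrays.

```python
import numpy as np

def rand_accuracy(y_true, y_pred):
    """Accuracy of (21): agreement of the two similarity matrices (20), off-diagonal only."""
    Mt = (y_true[:, None] == y_true[None, :])
    Mp = (y_pred[:, None] == y_pred[None, :])
    m = len(y_true)
    n11 = np.sum(Mt & Mp)          # entries that are 1 in both M_p and M_t (includes the diagonal)
    n00 = np.sum(~Mt & ~Mp)        # entries that are 0 in both
    return (n00 + n11 - m) / (m**2 - m) * 100.0
```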
To test the proposed initialization strategy, all initial cluster labels are selected by both random initialization and NNG-based initialization. The parameters c and μ in kPC, PPC, or TWSVC are selected from {2^i | i = −8, −7, ..., 7}, and p in the NNG-based initialization is selected from {1, 2, 3, 4, 5}. For random initialization, all the methods are run 10 times, and the average accuracy, the standard deviation, and the one-run CPU time are recorded in Tables I and II for linear and nonlinear clustering, respectively.

Tables I and II show the following.
1) TWSVC gets higher accuracy than the other plane-based clustering methods with either random initialization or NNG-based initialization on most data sets.
2) TWSVC has the highest average accuracy among these clustering methods.
3) The NNG-based initialization is superior to random initialization on most data sets, especially for the plane-based methods. However, the training time of TWSVC is longer than that of the others because it needs to solve a series of QPPs.

Fig. 2 shows the relations between the parameters and the accuracy (vertical axis) of our linear TWSVC on the above data sets. It can be found from Fig. 2 that the accuracy of our TWSVC is affected by both p and c, and higher accuracy is reached by smaller p for most data sets.

Fig. 3. Illustration of the effectiveness of nonlinear TWSVC with different parameters on (i)–(v) Dermatology, (vi)–(x) Ecoli, and (xi)–(xv) Haberman.

Fig. 3 shows the relations between the parameters and the accuracy (vertical axis) of our nonlinear TWSVC on only three data sets. More results on other data sets can be found at http://www.optimal-group.org/Resource/TWSVC.html. In Fig. 3, the rows correspond to the data sets and the columns correspond to the parameters. From Fig. 3, it can be seen that:
1) the parameter p ≤ 3 often makes nonlinear TWSVC perform well, which is similar to linear TWSVC;
2) the parameter c ≥ 1 is always a good option for most data sets;
3) the performance of the nonlinear TWSVC is affected by the parameter μ significantly;
4) different data sets correspond to different optimal μ, which may be affected by the data structure.

V. CONCLUSION

A TWSVM-type plane-based clustering method (TWSVC) has been proposed. It contains both linear and nonlinear formulations. The cluster center planes in TWSVC are obtained by solving a series of QPPs instead of the eigenvalue problems used in both kPC and PPC. In addition, an efficient and stable NNG-based initialization is also presented. The experimental results on several publicly available data sets have indicated that our TWSVC has higher accuracy compared with current plane-based clustering methods. For practical convenience, the corresponding TWSVC MATLAB code can be downloaded from http://www.optimal-group.org/Resource/TWSVC.html.

It should be pointed out that, in our TWSVC, several parameters need to be selected and a series of QPPs needs to be solved. Consequently, designing more efficient solvers and model selection methods is interesting.

ACKNOWLEDGMENT

The authors would like to thank the editor and the anonymous reviewers for their valuable comments and suggestions.

REFERENCES

[1] M. Aldenderfer and R. Blashfield, Cluster Analysis. Los Angeles, CA, USA: Sage Publications, 1985.
[2] A. K. Jain, M. N. Murty, and P. J. Flynn, "Data clustering: A review," ACM Comput. Surv., vol. 31, no. 3, pp. 264–323, 1999.
[3] P. Padungweang, C. Lursinsap, and K. Sunat, "A discrimination analysis for unsupervised feature selection via optic diffraction principle," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 10, pp. 1587–1600, Oct. 2012.
[4] I. Cattinelli, G. Valentini, E. Paulesu, and N. A. Borghese, "A novel approach to the problem of non-uniqueness of the solution in hierarchical clustering," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 7, pp. 1166–1173, Jul. 2013.
[5] M. W. Berry, Survey of Text Mining: Clustering, Classification, and Retrieval, vol. 1. New York, NY, USA: Springer-Verlag, 2004.
[6] R. Ilin, "Unsupervised learning of categorical data with competing models," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 11, pp. 1726–1737, Nov. 2012.
[7] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data. Upper Saddle River, NJ, USA: Prentice-Hall, 1988.
[8] P. S. Bradley and O. L. Mangasarian, "k-plane clustering," J. Global Optim., vol. 16, no. 1, pp. 23–32, 2000.
[9] Y.-H. Shao, L. Bai, Z. Wang, X.-Y. Hua, and N.-Y. Deng, "Proximal plane clustering via eigenvalues," Proc. Comput. Sci., vol. 17, pp. 41–47, May 2013.
[10] Jayadeva, R. Khemchandani, and S. Chandra, "Twin support vector machines for pattern classification," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 5, pp. 905–910, May 2007.
[11] Y.-H. Shao, C.-H. Zhang, X.-B. Wang, and N.-Y. Deng, "Improvements on twin support vector machines," IEEE Trans. Neural Netw., vol. 22, no. 6, pp. 962–968, Jun. 2011.
[12] L. Bai, Z. Wang, Y.-H. Shao, and N.-Y. Deng, "A novel feature selection method for twin support vector machine," Knowl.-Based Syst., vol. 59, pp. 1–8, Mar. 2014.
[13] R. Souvenir and R. Pless, "Manifold clustering," in Proc. 10th IEEE Int. Conf. Comput. Vis. (ICCV), vol. 1, Oct. 2005, pp. 648–653.
[14] W. Cao and R. Haralick, "Nonlinear manifold clustering by dimensionality," in Proc. 18th Int. Conf. Pattern Recognit. (ICPR), vol. 1, 2006, pp. 920–924.
[15] S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, no. 5500, pp. 2323–2326, 2000.
[16] X. Huang, Y. Ye, and H. Zhang, "Extensions of k-means-type algorithms: A new clustering framework by integrating intracluster compactness and intercluster separation," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 8, pp. 1433–1446, Aug. 2014.
[17] W. Zhen, C. Jin, and Q. Ming, "Non-parallel planes support vector machine for multi-class classification," in Proc. Int. Conf. Logistics Syst. Intell. Manage., vol. 1, Jan. 2010, pp. 581–585.

[18] A. L. Yuille and A. Rangarajan, "The concave-convex procedure (CCCP)," in Advances in Neural Information Processing Systems, vol. 2. Cambridge, MA, USA: MIT Press, 2002, pp. 1033–1040.
[19] P.-M. Cheung and J. T. Kwok, "A regularization framework for multiple-instance learning," in Proc. 23rd Int. Conf. Mach. Learn., 2006, pp. 193–200.
[20] C. Cortes and V. Vapnik, "Support-vector networks," Mach. Learn., vol. 20, no. 3, pp. 273–297, 1995.
[21] N. Deng, Y. Tian, and C. Zhang, Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions. Philadelphia, PA, USA: CRC Press, 2012.
[22] O. L. Mangasarian and D. R. Musicant, "Successive overrelaxation for support vector machines," IEEE Trans. Neural Netw., vol. 10, no. 5, pp. 1032–1037, Sep. 1999.
[23] B. Schölkopf and A. J. Smola, Learning With Kernels. Cambridge, MA, USA: MIT Press, 2002.
[24] O. L. Mangasarian and E. W. Wild, "Multisurface proximal support vector machine classification via generalized eigenvalues," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 1, pp. 69–74, Jan. 2006.
[25] D. T. Larose, "k-nearest neighbor algorithm," in Discovering Knowledge in Data: An Introduction to Data Mining. Warwick, U.K.: Wiley, 2005, pp. 90–106.
[26] F. Hausdorff, Mengenlehre. Berlin, Germany: Walter de Gruyter, 1927.
[27] R. Fletcher, Practical Methods of Optimization. New York, NY, USA: Wiley, 1987.
[28] I. S. Dhillon, Y. Guan, and B. Kulis, "Kernel k-means: Spectral clustering and normalized cuts," in Proc. 10th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2004, pp. 551–556.
[29] D. Dembélé and P. Kastner, "Fuzzy C-means method for clustering microarray data," Bioinformatics, vol. 19, no. 8, pp. 973–980, 2003.
[30] F. Camastra and A. Verri, "A novel kernel method for clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 5, pp. 801–805, May 2005.
[31] C. T. Zahn, "Graph-theoretical methods for detecting and describing gestalt clusters," IEEE Trans. Comput., vol. C-20, no. 1, pp. 68–86, Jan. 1971.
[32] C. Blake and C. Merz. (1998). UCI Repository of Machine Learning Databases. [Online]. Available: http://www.ics.uci.edu/~mlearn/MLRepository.html
[33] The MathWorks, Inc. (1994–2010). MATLAB User's Guide. [Online]. Available: http://www.mathworks.com
[34] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, 1st ed. Boston, MA, USA: Addison-Wesley, 2005.
