Article info

Article history: Received 6 July 2017; Received in revised form 20 November 2017; Accepted 23 November 2017; Available online 12 December 2017.

Keywords: Clustering validity; Data envelopment analysis; Linear programming; Internal measure

Abstract

Because clustering is an unsupervised learning task, a number of different validity indices have been proposed to measure the quality of clustering results. However, there is no single best validity measure for all types of clustering tasks because individual clustering validity indices have both advantages and shortcomings. Because each validity index has demonstrated its effectiveness in particular cases, it is reasonable to expect that a more generalized clustering validity index can be developed if individually effective cluster validity indices are appropriately integrated. In this paper, we propose a new cluster validity index, named Charnes, Cooper & Rhodes − cluster validity (CCR-CV), which integrates eight internal clustering efficiency measures based on data envelopment analysis (DEA). The proposed CCR-CV can be used for more general purposes because it extends the coverage of a single validity index by adaptively adjusting the combining weights of different validity indices for different datasets. Based on the experimental results on 12 artificial and 30 real datasets, the proposed clustering validity index demonstrates superior ability to determine the optimal and plausible cluster structures compared to benchmark individual validity indices.

© 2017 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.asoc.2017.11.052
1568-4946
B. Kim et al. / Applied Soft Computing 64 (2018) 94–108 95
Table 1
Examples of internal clustering validity indices.

R-squared (RS), rule: Elbow
    RS = ( Σ_{x∈D} ‖x − c‖² − Σ_i Σ_{x∈C_i} ‖x − c_i‖² ) / Σ_{x∈D} ‖x − c‖²

Modified Hubert Γ statistic, rule: Elbow
    Γ = (2 / (n(n − 1))) Σ_{x∈D} Σ_{y∈D} d(x, y) d_{x∈C_i, y∈C_j}(c_i, c_j)

Calinski-Harabasz index (CH), rule: Max
    CH = [ Σ_i n_i d²(c_i, c) / (NC − 1) ] / [ Σ_i Σ_{x∈C_i} d²(x, c_i) / (n − NC) ]

I index (I), rule: Max
    I = ( (1/NC) · ( Σ_{x∈D} d(x, c) / Σ_i Σ_{x∈C_i} d(x, c_i) ) · max_{i,j} d(c_i, c_j) )²

Dunn's index (D), rule: Max
    D = min_i { min_{j≠i} [ min_{x∈C_i, y∈C_j} d(x, y) / max_k max_{x,y∈C_k} d(x, y) ] }

Silhouette index (S), rule: Max
    S = (1/NC) Σ_i { (1/n_i) Σ_{x∈C_i} (b(x) − a(x)) / max[b(x), a(x)] }

Davies-Bouldin index (DB), rule: Min
    DB = (1/NC) Σ_i max_{j≠i} { [ (1/n_i) Σ_{x∈C_i} d(x, c_i) + (1/n_j) Σ_{x∈C_j} d(x, c_j) ] / d(c_i, c_j) }

Here D denotes the dataset with n records and overall center c, C_i the i-th cluster with center c_i and size n_i, NC the number of clusters, and d(·,·) a distance measure.
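As a concrete illustration of the formulas in Table 1, two of the indices (Davies-Bouldin, to be minimized, and Silhouette, to be maximized) can be computed directly from a labeled dataset. The sketch below uses plain NumPy; the function names and interface are ours, not the paper's.

```python
import numpy as np

def davies_bouldin(X, labels):
    """Davies-Bouldin index (Table 1): lower is better."""
    clusters = np.unique(labels)
    centers = np.array([X[labels == k].mean(axis=0) for k in clusters])
    # Per-cluster compactness: average distance of records to their center.
    scatter = np.array([
        np.linalg.norm(X[labels == k] - centers[i], axis=1).mean()
        for i, k in enumerate(clusters)
    ])
    nc = len(clusters)
    worst_ratio = [
        max((scatter[i] + scatter[j]) / np.linalg.norm(centers[i] - centers[j])
            for j in range(nc) if j != i)
        for i in range(nc)
    ]
    return float(np.mean(worst_ratio))

def silhouette(X, labels):
    """Silhouette index (Table 1): higher is better."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    s = np.empty(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        own[i] = False  # exclude the record itself from a(x)
        a = dist[i, own].mean() if own.any() else 0.0
        b = min(dist[i, labels == k].mean() for k in clusters if k != labels[i])
        s[i] = (b - a) / max(a, b)
    # Average of the per-cluster mean silhouette values, as in Table 1.
    return float(np.mean([s[labels == k].mean() for k in clusters]))
```

On two well-separated blobs, a correct labeling gives a Silhouette close to 1 and a small DB value, while a shuffled labeling degrades both, which is exactly the behavior the "Max" and "Min" rules in Table 1 encode.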
research [16,17]. However, they cannot be applied to real world problems because such external information is not readily available. Hence, internal validation indices are more commonly used in practice. The main advantage of internal validation measures is that they do not require any prior knowledge on the clustering structure of a given dataset [7]. They evaluate the compactness within clusters and separation between clusters based on their own formulas. Different formulas are a result of considering the impact of various factors such as noise, density, sub-clusters, skewed distributions, and monotonicity of index [6].

Because datasets have their own intrinsic characteristics, there is no single unique internal validity measure that is best fitted to all data structures [11,12]. In supervised learning, it is also known that there is no single algorithm that outperforms the other algorithms for all datasets [18]. However, if multiple algorithms are properly combined, the predictive performance of this combination, known as an ensemble, is typically superior to single algorithms [19–21]. Similarly, in unsupervised clustering, it is a reasonable expectation that the effectiveness of clustering validity measures can be improved if they can be collectively used with an appropriate integration technique. To this end, some studies attempted to form an ensemble of multiple clustering validity measures to resolve the limitations of individual validity measures. For example, Jaskowiak et al. [22] constructed an ensemble validity measure based on 28 different measures with nine different selection strategies. However, none of these strategies has a sound theoretical basis for integration; the integration is done empirically. Kou et al. [23] employed three multiple-criteria decision-making (MCDM) methods for evaluating clustering results for financial risk analysis. Although their idea of using MCDM as a tool for integrating validity measures is interesting, the experiments have some limitations; they only considered financial datasets and the results were inconsistent with the known properties of the adopted methods.

In this paper, we propose an integrated clustering validity measure named Charnes, Cooper & Rhodes − cluster validity (CCR-CV), which combines eight internal validity indices based on data envelopment analysis (DEA). There exist two main difficulties of cluster validity integration. First, some validity indices are designed to be minimized with the optimal cluster structure whereas others are designed to be maximized [6]. Moreover, the combining weights must not be fixed constants; rather, they must vary according to the intrinsic characteristics of the dataset. DEA was originally developed to evaluate the efficiency of a system by measuring the ratio of the weighted sum of the output components to the weighted sum of the input components [24]. The weights are not fixed; rather, they are determined by solving an optimization problem, considering not only the features of the system itself but also its competitors. Therefore, we employ four validity indices pursuing maximization as the output component and four validity indices pursuing minimization as the input component to define the efficiency of DEA. To determine the appropriate combining weights of the validity indices for a certain clustering algorithm with its associated parameters, we formulate the optimization problem using all candidate algorithm-parameter pairs. Hence, we expect that the coverage of the proposed clustering validity index can be extended with superior performance compared to the individual indices.

The remainder of this paper is organized as follows. In Section 2, we briefly review the internal validity measures used in this study. In Section 3, we introduce DEA and the formulation of the optimization problem used in DEA. Then, we demonstrate the proposed DEA-based integrated clustering validity measures. In Section 4, experimental settings including dataset description, clus-
We employ four clustering validity indices whose values are minimized with the optimal clustering structure: the Davies-Bouldin index [25], Xie-Beni index [26], Ray-Turi index [27], and Comp-Sepa index [28].

The Davies-Bouldin index (DB) computes the compactness of a cluster by the average distance between its records and the cluster center (see Table 1).

For the Comp-Sepa index, cluster compactness is defined through the minimum spanning tree (MST) of each cluster:

MST_k : minimum spanning tree of C_k  (9)
Comp(C_k) : largest edge of MST_k  (10)
Comp_All = max_{k=1,2,...,K} Comp(C_k)  (11)

2.2. Validity index for maximization

We employ four clustering validity indices whose values are maximized with the optimal clustering structure: the Dunn index [29], Calinski-Harabasz index [30], Silhouette index [31], and PBM index [32].

The Dunn index (Dunn) evaluates the quality of a clustering structure using the ratio of the minimum between-cluster distance to the maximum within-cluster distance. The between-cluster distance is defined by the minimum distance between two different clusters as in (14); the within-cluster distance is defined as the maximum distance between two records in the same cluster as in (16).

d_{kk*} = min_{i,j} ‖ M_{ik} − M_{jk*} ‖  (14)

The Silhouette index (S) is expressed as follows:

o(M_i, C_{k*}) = (1/n_{k*}) Σ_{i*∈I_{k*}} d(M_i, M_{i*})  (21)
b(i) = min_{k*≠k} o(M_i, C_{k*})  (22)
s(i) = (b(i) − a(i)) / max(a(i), b(i))  (23)
S_k = (1/n_k) Σ_{i∈I_k} s(i)  (24)
S = (1/K) Σ_{k=1}^{K} S_k  (25)

where a(i) denotes the average distance between M_i and the other records of its own cluster, defined analogously to (21).

The PBM index (PBM) computes the compactness of a cluster (E_W) by averaging the distance between the center of mass and all the records in the cluster (27); the separation between clusters (D_B) is computed by the maximum distance between two cluster centers of mass (26). Then, the ratio of D_B to E_W is weighted by E_T over K, which are the average distance between individual records and the center of mass of the entire data and the number of clusters, respectively.

D_B = max_{k<k*} d(G_k, G_{k*})  (26)
E_W = Σ_{k=1}^{K} Σ_{i∈I_k} d(M_i, G_k)  (27)

The relative efficiency of DEA is expressed as follows:

Relative Efficiency = (Weighted sum of outputs) / (Weighted sum of inputs)  (30)

DEA evaluates the relative efficiency by solving a linear programming (LP) problem that generates not only the relative efficiency score but also the weights of the input and output elements.

Since its introduction, a considerable number of DEA variation models have been proposed [41]; however, we employ the basic CCR model in our study because it has the simplest form and is sufficient to evaluate the validity of clustering results. In DEA, each unit is called a decision-making unit (DMU), and the inputs and the outputs must be non-negative numeric values. The objective function and the constraints of the CCR model are formulated as follows:

max E_e = (u_1 O_{1e} + u_2 O_{2e} + ... + u_M O_{Me}) / (v_1 I_{1e} + v_2 I_{2e} + ... + v_N I_{Ne})  (31)
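The fractional CCR program in (31) is conventionally solved as an LP via the Charnes-Cooper transformation: fix the weighted input of the evaluated DMU to 1, maximize its weighted output, and bound every DMU's weighted output by its weighted input. A minimal sketch using `scipy.optimize.linprog` is given below; the function name and array layout are our assumptions, not the paper's code.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(outputs, inputs, e):
    """Relative efficiency of DMU e under the CCR model (eqs. (30)-(31)),
    solved as an LP via the Charnes-Cooper transformation:
        max  u . O_e
        s.t. v . I_e = 1,
             u . O_k - v . I_k <= 0  for every DMU k,
             u, v >= 0.
    outputs: (n_dmu, M) array; inputs: (n_dmu, N) array; both non-negative."""
    O = np.asarray(outputs, dtype=float)
    I = np.asarray(inputs, dtype=float)
    n, M = O.shape
    N = I.shape[1]
    c = np.concatenate([-O[e], np.zeros(N)])             # maximize u . O_e
    A_ub = np.hstack([O, -I])                            # u.O_k - v.I_k <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.zeros(M), I[e]])[None, :]  # v . I_e = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (M + N), method="highs")
    return -res.fun  # efficiency score in (0, 1]
```

In the CCR-CV setting, the outputs would be the four maximization indices and the inputs the four minimization indices of each DMU, shifted or scaled if necessary so that all values are non-negative, with one such LP solved per DMU.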
With the four maximization indices as outputs and the four minimization indices as inputs, the CCR constraint for each DMU k takes the form

(u_1 Dunn_k + u_2 CH_k + u_3 Silhouette_k + u_4 PBM_k) − (v_1 DB_k + v_2 XB_k + v_3 RT_k + v_4 CS_k) ≤ 0  (38)

Table 3
Summary of artificial datasets.

Table 4
Summary of real datasets (the number in parentheses is the original number of records in the dataset).

Table 5
Description of DMUs.
algorithms are selected. K-Means is the most widely used clustering algorithm in both academia and industry. The purpose of K-Means is to determine the K cluster centroids and assign the cluster memberships to all records in the dataset. The only parameter for K-Means is the number of clusters K. K-Means begins with the K initial centroids, which are usually chosen at random. Then, every record is assigned to the nearest centroid. After the membership assignment, the centroids are updated by averaging all the members in the same cluster. These procedures are repeated alternately until the cluster centroids and corresponding memberships do not change. K-Medoids uses the same procedure as K-Means except for the computation of the centroid. The centroid of a cluster is defined by the median value of the cluster members in K-Medoids, whereas it is defined by the mean value of the cluster members in K-Means. We also employ Fuzzy C-Means (FCM) clustering, which is the soft clustering version of K-Means. Similar to K-Means, FCM finds the optimal centroids and the cluster membership of individual records during the training. The only difference between K-Means and FCM is that FCM allows a record to have multiple cluster memberships with different assigning probabilities. Since the cluster validity measures used in this study are designed for hard clustering methods, each record is assigned to the cluster centroid with the highest assigning probability.

For the hierarchical-based approach, Agglomerative Nesting (AN) [45] is adopted. Hierarchical clustering can be conducted either by the top-down method, which divides the entire dataset into sub-regions until every record forms its own cluster, or by the bottom-up method, which begins with every record as its own cluster and merges the two nearest clusters until the entire dataset is merged into a single cluster. Because the bottom-up approach is more commonly adopted in practice, we also employ the bottom-up approach-based AN. The parameter for AN is the distance measure between two clusters. We consider three distance measures: single linkage, complete linkage, and average linkage.

DBSCAN [46] is adopted for the density-based approach. The assumption of density-based clustering algorithms is that the records in a cluster are generated from the same distribution and different clusters have different data-generating distributions. DBSCAN has two model parameters: the radius ε and the minimum number of points (minpts) required to form a dense region. DBSCAN begins with an arbitrary record that has not yet been considered. If the ε-neighborhood of the target record contains sufficiently many records, a cluster is started. Then, the ε-neighborhood of the members of the newly formed cluster is also added to the same cluster. This diffusion process is repeated until there are no records in the ε-neighborhood of the current cluster members. The original DBSCAN regards the records that have no records in their ε-neighborhood as noise and does not assign the cluster membership to them. Because all the other clustering algorithms assign the cluster membership to all records in the given dataset, we assign the cluster membership of the records identified as noise by the original DBSCAN to the nearest cluster centroid.

Among graph-based clustering algorithms, the k-minimum spanning tree (k-MST) [47] is employed. In k-MST, each record is represented as a node in the graph and close nodes are connected by edges with weights proportional to the distance between the two nodes. The two objectives of k-MST are that (1) there must not be any cycles in the graph and (2) the total sum of the weighted edges is minimized. Then, k-MST cuts the (k − 1) largest edges to generate k isolated sub-trees, which are considered as clusters.

4.3. Design DMUs

Based on the five clustering algorithms and their associated parameters, we design 54 DMUs for the artificial datasets as illustrated in Table 5. First, 15 DMUs each are used for K-Means and FCM: three different numbers of clusters and five sets of initial centroids. Because the class information is available for all the datasets, we set the number of clusters in K-Means to the numbers adjacent to the actual number of classes: C (the actual number of classes), C − 1, and C + 1. For those datasets that have only two classes, the number of clusters is set to two, three, and four. Furthermore, because the final clustering result of K-Means is highly dependent on the initial centroids, which are randomly chosen in practice, we repeat the centroid initiation five times for each number of clusters. Three DMUs are used for K-Medoids by changing the number of clusters. Unlike K-Means, K-Medoids always selects an actual record as a centroid; it does not depend on the centroid initiation. Nine DMUs are designed for AN based on the number of clusters and the distance measure between two clusters: single linkage, complete linkage, and average linkage. Nine DMUs are designed for DBSCAN: three ε values and three minpts values. Because ε is considerably sensitive to the data structure, we conduct a preliminary experiment to determine reasonable ε values by searching a sufficient number of candidates. The selected ε values for each artificial dataset are provided in Table 6. Because minpts is not as sensitive as ε to the data structure, we use the same minpts candidates (six, seven, and eight) for all artificial datasets.

Table 6
Candidate ε and minpts values for each artificial dataset.

Dataset           ε candidates       minpts candidates
Well separated    2, 4, 6            6, 7, 8
Flame             1, 1.5, 2          6, 7, 8
Two moon          0.1, 0.15, 0.2     6, 7, 8
Tri moon          0.05, 0.1, 0.15    6, 7, 8
Spiral            1, 2, 3            1, 2, 3
Two circle        0.05, 0.1, 0.15    6, 7, 8
Bulls eye         0.05, 0.1, 0.15    6, 7, 8
Pathbased1        1.5, 2, 2.5        6, 7, 8

A total of 30 DMUs are used for the real datasets. All the DMUs in the experiment with the artificial datasets are also used except for the nine DMUs associated with DBSCAN. Not only is the ε in DBSCAN sensitive to the data structure, but its computational cost also increases exponentially with regard to the number of records in a dataset.

The cluster-level performance measures are averaged over the K clusters:

Balanced correction rate = (1/K) Σ_{k=1}^{K} Balanced correction rate_k  (41)
Balanced accuracy_k = (1/2)(Sensitivity_k + Specificity_k)  (42)
Balanced accuracy = (1/K) Σ_{k=1}^{K} Balanced accuracy_k  (43)
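The macro-averaged balanced accuracy of (42)-(43) can be sketched as follows. This assumes the cluster labels have already been mapped to class labels, as required before sensitivity and specificity can be computed; the function name is ours.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Macro-averaged balanced accuracy (eqs. (42)-(43)):
    per class k, BACC_k = (sensitivity_k + specificity_k) / 2,
    then the mean over all K classes."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for k in np.unique(y_true):
        pos = y_true == k
        neg = y_true != k
        sensitivity = np.mean(y_pred[pos] == k)   # true positive rate
        specificity = np.mean(y_pred[neg] != k)   # true negative rate
        scores.append((sensitivity + specificity) / 2.0)
    return float(np.mean(scores))
```

Unlike plain accuracy (ACC), this measure is insensitive to class imbalance because every class contributes equally to the average, which is why the paper reports BACC alongside ACC.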
Table 8
Summary of whether each clustering validity index can determine the optimal clustering structure for each artificial dataset (O: determined; X: not determined).
Index Well separated Flame Two moon Tri moon Spiral Two circle Bulls eye Pathbased1 Ratio
DB X X X X X X X X 0/8
XB O O O O O O O X 7/8
RT X O X X O X X O 3/8
CS O O O O O O O X 7/8
Dunn O O O O O O O X 7/8
CH O X X X X X X O 2/8
Silhouette O O X X X X X X 2/8
PBM O O X X X X X X 2/8
CCR-CV O O O O O O O O 8/8
DEA employed by CCR-CV could adaptively adjust the aggregating weights of the indices for different data structures to identify the optimal cluster structure.

Figs. 3 and 4 show the resulting clustering structures of some selected DMUs and their corresponding CCR-CV scores for the "Well separated" dataset and the "Bulls eye" dataset, respectively. For the "Well separated" dataset, the DMUs resulting in the optimal cluster structure (Fig. 3(b), (e), and (f)) have efficiency scores of 1. Other non-optimal structures result in CCR-CV scores lower than 1. CCR-CV can also find a more complicated cluster structure such as "Bulls eye", as shown in Fig. 4. The only DMU with a CCR-CV score of 1 used agglomerative clustering with single linkage and was instructed to find two clusters (Fig. 4(f)). The other DMUs could not find the optimal cluster structure and resulted in CCR-CV scores lower than 1.

Table 9 shows the total elapsed time to compute each clustering validity index for all DMUs for the synthetic datasets. Because CCR-CV takes the values of the eight individual validity measures, it takes the longest time. However, the total elapsed time for CCR-CV computation is highly dependent on the computation time of the validity index with the highest time complexity, which is CS in this experiment. The relative increase in CCR-CV computation time is only 1.76% on average (maximum of 2.28% and minimum of 1.20%) compared to that of CS. In addition, the time required to solve the DEA optimization is less than 10^−2 seconds, which is negligible compared to the time required to compute the validity indices.

5.2. Results on real datasets

The average ACC, BACC, and BCR of each clustering validity index for each real dataset over 30 repetitions are provided in Tables 10–12. In terms of ACC, the proposed CCR-CV yielded the best performance for 26 of the 30 datasets, followed by CH and Silhouette (three datasets each). The other indices resulted in the best performance for two or fewer datasets; XB and Dunn were not found to be the best for any dataset. Similar results were demonstrated for BACC and BCR: CCR-CV yielded the best performance for 27 and 28 datasets in terms of BACC and BCR, respectively. Except for CCR-CV, none of the clustering validity indices resulted in the highest score for more than five datasets.
Fig. 3. CCR-CV scores for the clustering structures produced by different clustering algorithms for the "Well separated" dataset.
Fig. 4. CCR-CV scores for the clustering structures produced by different clustering algorithms for the "Bulls eye" dataset.
Table 9
Elapsed time (seconds) for computing each cluster validity index for the synthetic datasets (the last column is the time required to solve the DEA optimization problem).

Dataset           DB    XB    RT    CS      Dunn  CH    Silhouette  PBM   CCR-CV  DEA
Well separated    0.09  0.09  0.10  41.76   0.15  0.08  0.10        0.11  42.48   0.00
Flame             0.08  0.08  0.10  22.81   0.06  0.06  0.06        0.08  23.33   0.00
Two moon          0.10  0.11  0.09  37.40   0.10  0.10  0.10        0.09  38.09   0.00
Tri moon          0.11  0.11  0.09  38.16   0.11  0.11  0.12        0.11  38.92   0.01
Spiral            0.36  0.36  0.37  216.49  0.37  0.38  0.39        0.36  219.08  0.00
Two circle        0.16  0.17  0.18  68.70   0.17  0.18  0.17        0.17  69.90   0.00
Bulls eye         0.37  0.37  0.36  179.48  0.38  0.37  0.39        0.38  182.10  0.01
Pathbased1        0.25  0.26  0.27  98.69   0.26  0.27  0.27        0.26  100.53  0.00
The last row of Tables 10–12 indicates the number of datasets for which each individual clustering validity index yielded the highest score when CCR-CV was not considered. In this situation, although CH most frequently resulted in the best performance, the number of best performances was less biased toward a single validity index, as we know that none of the existing clustering validity indices outperforms the others for all data structures. However, as in the results for the artificial datasets, if multiple clustering validity indices are properly integrated, we can expect improved performance because the optimal cluster structure determined by CCR-CV is more consistent with the actual class distribution than those identified by individual validity indices.

Because the experimental results in Tables 11–13 are based on the stratified sampled dataset, the experimental results with the original datasets (without sampling) are provided in Appendix A. The only difference between these two sets of results is that CS was excluded because of its heavy computational complexity. For the original datasets, the proposed CCR-CV generally outperformed the other validity indices, which implies that the proposed CCR-CV is independent of the dataset size.

The average performance improvements of CCR-CV compared to the other clustering validity indices for the three metrics are summarized in Table 13. Because some validity measures resulted in a zero value for some datasets and performance measures, e.g., DB in the "Wine" dataset in terms of BCR, we excluded these cases when computing the relative improvement. The values in the parentheses are the p-values of the t-test with the following hypotheses. The null hypothesis (H0) is that the performances of CCR-CV and the clustering validity index in the corresponding column are the same (statistically not distinguishable), whereas the alternative hypothesis (H1) is that CCR-CV statistically outperforms the clustering validity index in the corresponding column. Note that all the p-values are smaller than 0.001, which implies that CCR-CV outperformed all the compared individual clustering validity indices irrespective of the performance measure. Further, the relative improvement of CCR-CV is noticeable in terms of BCR: it at least doubles the BCR (the relative improvement is higher than 100%) on the real datasets against every individual clustering validity measure, with the only exception being CH.
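The one-sided paired t-test behind the p-values in Table 13 might be run as follows with SciPy; the per-dataset scores here are hypothetical placeholders, not values from Tables 10–12.

```python
import numpy as np
from scipy import stats

# Hypothetical per-dataset scores (NOT values from Tables 10-12),
# paired dataset by dataset as in the paper's comparison.
ccr_cv = np.array([0.92, 0.88, 0.83, 0.95, 0.77, 0.90])
baseline = np.array([0.85, 0.80, 0.79, 0.90, 0.70, 0.84])

# H0: equal mean performance; H1: CCR-CV outperforms the baseline index.
t_stat, p_value = stats.ttest_rel(ccr_cv, baseline, alternative="greater")
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.4f}")
```

The pairing matters: scores for the same dataset are correlated, so a paired test is more powerful than comparing the two groups of scores independently.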
Table 10
ACC of each clustering validity index for each real dataset (bold number denotes the best result, with an asterisk (*) marking the best without ties).

Dataset  DB  XB  RT  CS  Dunn  CH  Silhouette  PBM  CCR-CV
Wine 0.3989 0.3989 0.3989 0.3989 0.3989 0.9663 0.3989 0.3989 0.9663
Seeds 0.9190 0.3381 0.6429 0.3381 0.3381 0.6571 0.6429 0.9095 0.9190
Glass 0.5421 0.3692 0.3692 0.3738 0.3692 0.5701 0.5561 0.3692 0.5794*
Iris 0.6667 0.6667 0.6667 0.6667 0.6667 0.6667 0.6667 0.6867 0.8333*
Newthyroid 0.7116 0.7070 0.7070 0.7070 0.7070 0.8744 0.7070 0.7070 0.8744
Ecoli 0.6935 0.4554 0.6369 0.4554 0.4554 0.6726 0.6845 0.6935 0.6964*
Abalone 0.3667 0.3667 0.3667 0.3667 0.3667 0.5200 0.3683 0.3667 0.5300*
Yeast 0.5583* 0.3283 0.3683 0.3283 0.3267 0.5467 0.5467 0.5467 0.5517
Winequality red 0.4300 0.4283 0.4267 0.4283 0.4283 0.5433 0.5467 0.4283 0.5633*
Winequality white 0.4783 0.4550 0.4550 0.4583 0.4550 0.4783 0.4750 0.4783 0.4950*
Vehicle 0.3688 0.2636 0.3688 0.2648 0.2636 0.3641 0.3735 0.3735 0.3735
Penbased 0.1242 0.1227 0.6121 0.1227 0.1227 0.6379* 0.5606 0.5606 0.6121
Satimage 0.2609 0.2438 0.5202 0.2453 0.2438 0.7174 0.5202 0.7220 0.7717*
Vowel 0.1384 0.1394 0.3212 0.1414 0.1394 0.3051 0.1475 0.2869 0.3576*
Texture 0.2855 0.2800 0.4436 0.2800 0.5582 0.6473 0.6291 0.4436 0.6655*
Segment 0.1501 0.1501 0.1501 0.1501 0.1530 0.5657 0.6248* 0.6017 0.6017
BreastTissue 0.4717 0.2264 0.4057 0.2264 0.2830 0.5189 0.5189 0.5189 0.5566*
Balance 0.4696 0.6667 0.6795 0.6795 0.6715 0.6042 0.4647 0.4647 0.6683
Pageblocks 0.9051 0.9033 0.9069 0.9051 0.9033 0.9179 0.9051 0.9051 0.9234*
Shuttle 0.8603 0.7897 0.7897 0.7897 0.7897 0.8362 0.8638 0.7897 0.8638
User knowledge modeling 0.3251 0.3275 0.3251 0.3275 0.3251 0.4491 0.4268 0.3251 0.5955*
Urban land cover 0.3869 0.2143 0.2262 0.2262 0.2202 0.5833 0.6131 0.5119 0.6964*
Image segmentation 0.4333 0.1714 0.3095 0.1714 0.1714 0.5762 0.6238 0.5762 0.7952*
Wall-following robot navigation 0.4103 0.4084 0.4103 0.4084 0.4084 0.5018 0.4267 0.4084 0.5092*
Foresttypes 0.4685 0.3748 0.3748 0.3767 0.3748 0.4990 0.4990 0.3748 0.7706*
Wdbc 0.6327 0.6309 0.6309 0.6309 0.6309 0.9121 0.6309 0.6309 0.9139*
Sonar 0.5337 0.5337 0.5337 0.5337 0.5337 0.5337 0.5337 0.5337 0.6010*
Optdigits 0.1352 0.1157 0.1263 0.1157 0.1157 0.6157 0.5854 0.1317 0.6673*
Hepatobiliary disorders 0.3358 0.3340 0.3358 0.3358 0.3358 0.3582 0.3582 0.3340 0.4011*
Banknote 0.5743 0.5554 0.6866 0.5554 0.5554 0.7289 0.6822 0.6866 0.8397*
No. best cases 2 0 1 1 0 3 3 1 26
No. best cases (without CCR-CV) 5 1 2 1 1 14 9 6
Table 11
BACC of each clustering validity index for each real dataset (bold number denotes the best result, with an asterisk (*) marking the best without ties).

Dataset  DB  XB  RT  CS  Dunn  CH  Silhouette  PBM  CCR-CV
Wine 0.5000 0.5000 0.5000 0.5000 0.5000 0.9779 0.5000 0.5000 0.9779
Seeds 0.9393 0.5036 0.7321 0.5036 0.5036 0.7429 0.7321 0.9321 0.9393
Glass 0.6219 0.5239 0.5239 0.5274 0.5239 0.6829 0.6313 0.5339 0.6872*
Iris 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7650 0.8750*
Newthyroid 0.5244 0.5162 0.5162 0.5162 0.5162 0.7958 0.5162 0.5162 0.7958
Ecoli 0.7236 0.6340 0.7065 0.6340 0.6340 0.7078 0.6821 0.7236 0.7367*
Abalone 0.5013 0.5013 0.5013 0.5013 0.5013 0.6306 0.5028 0.5013 0.6363*
Yeast 0.6586 0.5706 0.5821 0.5706 0.5702 0.6559 0.6559 0.6561 0.6608*
Winequality red 0.5268 0.5058 0.5045 0.5058 0.5058 0.5447 0.5458 0.5012 0.5521*
Winequality white 0.5604 0.5409 0.5409 0.5422 0.5409 0.5604 0.5231 0.5604 0.5687*
Vehicle 0.5878 0.5050 0.5878 0.5057 0.5050 0.5788 0.5909 0.5909 0.5909
Penbased 0.5111 0.5103 0.7856 0.5103 0.5103 0.7981* 0.7561 0.7561 0.7856
Satimage 0.5204 0.5054 0.6845 0.5068 0.5054 0.7678 0.6845 0.8047 0.8312*
Vowel 0.5261 0.5267 0.6267 0.5278 0.5267 0.6178 0.5311 0.6078 0.6467*
Texture 0.6070 0.6040 0.6940 0.6040 0.7570 0.8060 0.7960 0.6940 0.8160*
Segment 0.5042 0.5042 0.5042 0.5042 0.5059 0.7466 0.7811* 0.7677 0.7677
BreastTissue 0.6624 0.5143 0.6048 0.5143 0.5429 0.6861 0.6861 0.6861 0.7172*
Balance 0.5054 0.6376 0.6462 0.6462 0.6408 0.5957 0.5021 0.5021 0.6386
Pageblocks 0.6032 0.5931 0.5990 0.6032 0.5931 0.5605 0.5861 0.5861 0.6888*
Shuttle 0.7104 0.6270 0.6270 0.6270 0.6270 0.6080 0.7886 0.6270 0.7886
User knowledge modeling 0.5059 0.5071 0.5059 0.5071 0.5059 0.5794 0.5588 0.5059 0.6417*
Urban land cover 0.6232 0.5249 0.5303 0.5303 0.5277 0.7697 0.7799 0.6993 0.8430*
Image segmentation 0.6694 0.5167 0.5972 0.5167 0.5167 0.7528 0.7806 0.7528 0.8806*
Wall-following robot navigation 0.5039 0.5029 0.5039 0.5029 0.5029 0.6632 0.5127 0.5029 0.6632
Foresttypes 0.5946 0.5019 0.5019 0.5038 0.5019 0.6255 0.6255 0.5019 0.8307*
Wdbc 0.5071 0.5047 0.5047 0.5047 0.5047 0.8945 0.5047 0.5047 0.9017*
Sonar 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5337 0.5956*
Optdigits 0.5168 0.5078 0.5137 0.5078 0.5078 0.7869 0.7702 0.5166 0.8147*
Hepatobiliary disorders 0.5027 0.5014 0.5027 0.5027 0.5027 0.5536 0.5536 0.5014 0.6080*
Banknote 0.5213 0.5000 0.6475 0.5000 0.5000 0.7137 0.6426 0.6475 0.8360*
No. best cases 1 0 1 1 0 4 3 1 27
No. best cases (without CCR-CV) 6 1 3 3 1 15 10 7
Table 12
BCR of each clustering validity index for each real dataset (bold number denotes the best result, with an asterisk (*) marking the best without ties).

Dataset  DB  XB  RT  CS  Dunn  CH  Silhouette  PBM  CCR-CV
Wine 0.0000 0.0000 0.0000 0.0000 0.0000 0.9775 0.0000 0.0000 0.9775
Seeds 0.9390 0.0680 0.5521 0.0680 0.0680 0.5650 0.5521 0.9315 0.9390
Glass 0.3923 0.1455 0.1455 0.1803 0.1455 0.4892 0.4173 0.2094 0.4987*
Iris 0.6290 0.5690 0.5690 0.5690 0.5690 0.5690 0.5690 0.6577 0.8732*
Newthyroid 0.1770 0.1445 0.1445 0.1445 0.1445 0.7657 0.1445 0.1445 0.7657
Ecoli 0.5300 0.3418 0.4868 0.3418 0.3418 0.5132 0.4307 0.5300 0.5477*
Abalone 0.0414 0.0414 0.0414 0.0414 0.0414 0.4615 0.0716 0.0414 0.4641*
Yeast 0.4413 0.2221 0.2874 0.2221 0.2189 0.4385 0.4385 0.4387 0.4498*
Winequality red 0.1554 0.0692 0.0499 0.0692 0.0692 0.2103 0.2115 0.0308 0.2186*
Winequality white 0.2613 0.1638 0.1638 0.1735 0.1638 0.2613 0.1571 0.2613 0.2741*
Vehicle 0.3207 0.0715 0.3207 0.0885 0.0715 0.4442* 0.3259 0.3259 0.3259
Penbased 0.1103 0.1057 0.7027 0.1057 0.1057 0.6874 0.6070 0.6070 0.7027
Satimage 0.1063 0.0549 0.4314 0.0614 0.0549 0.5997 0.4314 0.7179 0.7464*
Vowel 0.1589 0.1149 0.5099 0.1172 0.1149 0.4716 0.1238 0.3472 0.5682*
Texture 0.2806 0.2584 0.4638 0.2584 0.5977 0.7361 0.7236 0.4638 0.7457*
Segment 0.0452 0.0452 0.0452 0.0452 0.0763 0.5890 0.6370* 0.6218 0.6218
BreastTissue 0.3998 0.0929 0.2581 0.0929 0.1608 0.4301 0.4301 0.4301 0.5479*
Balance 0.1339 0.4375 0.4544 0.4544 0.4449 0.4289 0.1171 0.1171 0.4696*
Pageblocks 0.3709 0.3407 0.3602 0.3709 0.3407 0.2195 0.2883 0.2883 0.4780*
Shuttle 0.4597 0.2818 0.2818 0.2818 0.2818 0.3377 0.7495 0.2818 0.7495
User knowledge modeling 0.0777 0.0997 0.0777 0.0997 0.0777 0.2771 0.2240 0.0777 0.4603*
Urban land cover 0.3913 0.1631 0.1949 0.1949 0.1710 0.6009 0.6612 0.4760 0.8311*
Image segmentation 0.4641 0.1151 0.2841 0.1151 0.1151 0.6083 0.6799 0.6083 0.8577*
Wall-following robot navigation 0.0622 0.0539 0.0622 0.0539 0.0539 0.5357 0.1139 0.0539 0.5362*
Foresttypes 0.2939 0.0412 0.0412 0.0583 0.0412 0.3407 0.3407 0.0412 0.8191*
Wdbc 0.1190 0.0971 0.0971 0.0971 0.0971 0.8919 0.0971 0.0971 0.9004*
Sonar 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.5902*
Optdigits 0.1426 0.0942 0.1151 0.0942 0.0942 0.6698 0.6518 0.1185 0.7493*
Hepatobiliary disorders 0.0504 0.0357 0.0504 0.0504 0.0504 0.3022 0.3022 0.0357 0.4693*
Banknote 0.2065 0.0000 0.5432 0.0000 0.0000 0.7005 0.5341 0.5432 0.8354*
No. best cases 1 0 1 0 0 3 2 2 28
No. best cases (without CCR-CV) 6 1 4 3 1 16 9 6
Table 13
Relative performance improvement of CCR-CV over the other clustering validity measures (the value in the parentheses is the p-value for the t-test).

Metric  DB  XB  RT  CS  Dunn  CH  Silhouette  PBM
ACC 79.05% (<0.001) 114.72% (<0.001) 66.63% (<0.001) 113.73% (<0.001) 108.62% (<0.001) 10.95% (<0.001) 26.92% (<0.001) 42.44% (<0.001)
BACC 29.80% (<0.001) 39.53% (<0.001) 29.17% (<0.001) 39.27% (<0.001) 38.32% (<0.001) 8.48% (<0.001) 19.57% (<0.001) 23.36% (<0.001)
BCR 304.93% (<0.001) 561.22% (<0.001) 355.80% (<0.001) 510.04% (<0.001) 514.97% (<0.001) 28.19% (<0.001) 131.92% (<0.001) 320.22% (<0.001)
Based on the experimental results on artificial and real datasets, we conclude the following. First, the proposed CCR-CV demonstrates a superior ability to determine the optimal clustering structure by assigning appropriate weights to each individual validity measure when solving the LP problem defined by DEA. It is empirically supported that CCR-CV can identify the inherent structure not only of well-separated spherical clusters but also of arbitrarily shaped clusters in the artificial datasets. Moreover, CCR-CV determines a cluster structure that is more consistent with the actual class distributions than the individual clustering validation indices do. Based on the performance measures commonly employed in classification tasks, CCR-CV yields the best scores for the majority of the datasets, demonstrating a significant relative improvement.
Table 14 shows the total elapsed time to compute each clustering validity index for all DMUs for the real datasets. These results are similar to the time-complexity results for the synthetic datasets. The total elapsed time for CCR-CV is dominated by the validity index with the highest time complexity (CS), while the DEA part of CCR-CV takes little time (less than 0.03 s). The relative time increase of CCR-CV compared to CS is 1.06% on average (maximum 1.89%, minimum 0.47%).
6. Conclusion
In this paper, we proposed a new clustering validity index that integrates eight different clustering validity indices. To allocate the appropriate combining weights for the individual indices under different circumstances, we employed the CCR model of the DEA approach and formed an integrated validity index as the ratio of the four validity measures pursuing maximization to the four validity measures pursuing minimization. The experimental results confirmed that the proposed CCR-CV could not only determine the inherent cluster structure but also divide the datasets into more homogeneous clusters.
Although the effectiveness of the proposed CCR-CV is empirically supported, the current research has some limitations, which lead us to future research directions. First, because the DEA model allows more than one DMU to attain an efficiency score of one, CCR-CV can assign the highest efficiency score, i.e., "1," not only to the optimal cluster structure but also to other cluster structures. It would be more practically useful if the relative superiority among the cluster structures sharing the equal highest CCR-CV score could be determined. Second, we believe that the eight internal measures employed were sufficiently diversified with regard to how they measure compactness within a cluster and separation between clusters. However, there could be other cluster validity measures that are individually effective and collectively diversified when combined with the current indices. If these can be identified, the coverage of CCR-CV could be extended and its practical usefulness further improved.
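The scoring step described above reduces, for each candidate cluster structure treated as a DMU, to a small linear program: the four validity measures pursuing minimization act as DEA inputs and the four pursuing maximization as DEA outputs. A minimal sketch of the multiplier-form CCR model using SciPy follows; the function name, the `eps` lower bound on the weights, and the toy data in the usage note are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(inputs, outputs, k, eps=1e-6):
    """CCR efficiency of DMU k in multiplier form.

    inputs  : (n_dmu, m) array, validity measures to be minimized
    outputs : (n_dmu, s) array, validity measures to be maximized
    Solves:  max u.y_k  s.t.  v.x_k = 1,
             u.y_j - v.x_j <= 0  for every DMU j,
             u >= eps, v >= eps.
    """
    n, m = inputs.shape
    _, s = outputs.shape
    # Decision variables: [v (m input weights), u (s output weights)].
    c = np.concatenate([np.zeros(m), -outputs[k]])   # minimize -u.y_k
    A_ub = np.hstack([-inputs, outputs])             # u.y_j - v.x_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([inputs[k], np.zeros(s)]).reshape(1, -1)
    b_eq = np.array([1.0])                           # normalization v.x_k = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(eps, None)] * (m + s))
    return -res.fun                                  # efficiency in (0, 1]
```

The LP picks, for each structure k, the weight vectors (v, u) most favorable to k; a score of 1 marks a DEA-efficient structure, and several DMUs can tie at 1, which is precisely the multiplicity limitation of the CCR model discussed in the conclusion.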
Table 14
Elapsed time (seconds) for computing each cluster validity index for the real datasets. The fourth column, with the largest values, corresponds to CS, the index with the highest time complexity; the second-to-last column is the total elapsed time for CCR-CV; the last column is the time required to solve the DEA optimization problem.
Wine 0.08 0.06 0.06 24.38 0.06 0.06 0.05 0.06 24.83 0.02
Seeds 0.08 0.08 0.08 27.31 0.08 0.08 0.09 0.08 27.90 0.02
Glass 0.08 0.08 0.08 52.22 0.09 0.08 0.09 0.08 52.83 0.03
Iris 0.05 0.04 0.03 13.66 0.05 0.05 0.05 0.05 14.01 0.03
Newthyroid 0.08 0.08 0.08 31.12 0.07 0.09 0.08 0.06 31.67 0.01
Ecoli 0.15 0.16 0.15 113.28 0.16 0.15 0.15 0.16 114.37 0.01
Abalone 0.48 0.50 0.50 302.79 0.50 0.52 0.58 0.55 306.44 0.02
Yeast 0.55 0.50 0.51 369.72 0.50 0.49 0.50 0.50 373.31 0.04
Winequality red 0.53 0.54 0.55 347.00 0.53 0.52 0.50 0.52 350.70 0.01
Winequality white 0.52 0.53 0.52 346.33 0.53 0.50 0.54 0.52 350.01 0.02
Vehicle 1.22 1.19 1.21 901.15 1.24 1.21 1.24 1.23 909.70 0.01
Penbased 0.73 0.73 0.72 435.72 0.71 0.71 0.71 0.70 440.74 0.01
Satimage 0.99 0.99 0.99 564.73 0.98 0.99 1.02 1.00 571.70 0.01
Vowel 1.40 1.39 1.38 948.16 1.36 1.36 1.44 1.36 957.89 0.01
Texture 0.76 0.77 0.75 547.42 0.77 0.78 0.78 0.78 552.82 0.01
Segment 0.88 0.83 0.85 590.54 0.82 0.82 0.84 0.83 596.42 0.01
BreastTissue 0.03 0.03 0.03 17.71 0.03 0.03 0.03 0.03 17.93 0.01
Balance 0.49 0.49 0.49 241.30 0.49 0.47 0.47 0.49 244.71 0.02
Pageblocks 0.47 0.44 0.43 287.47 0.45 0.44 0.44 0.44 290.60 0.02
Shuttle 0.47 0.47 0.49 322.29 0.47 0.45 0.47 0.47 325.60 0.02
User knowledge modeling 0.22 0.22 0.23 104.00 0.27 0.24 0.23 0.22 105.65 0.02
Urban land cover 0.22 0.24 0.22 264.76 0.22 0.22 0.23 0.23 266.37 0.03
Image segmentation 0.09 0.09 0.09 72.29 0.09 0.09 0.09 0.10 72.94 0.01
Wall-following robot navigation 0.56 0.56 0.57 330.35 0.55 0.56 0.58 0.54 334.29 0.02
Foresttypes 0.55 0.56 0.57 348.78 0.56 0.56 0.56 0.57 352.72 0.01
Wdbc 0.71 0.69 0.69 487.58 0.67 0.65 0.69 0.69 492.39 0.02
Sonar 0.18 0.17 0.17 75.72 0.18 0.17 0.17 0.17 76.95 0.02
Optdigits 1.12 1.12 1.11 875.90 1.12 1.11 1.15 1.14 883.79 0.02
Hepatobiliary disorders 0.40 0.42 0.42 251.77 0.41 0.41 0.42 0.40 254.66 0.01
Banknote 0.58 0.60 0.60 347.13 0.59 0.57 0.60 0.58 351.26 0.01
Table A1
ACC of each clustering validity index for each sampled real dataset without stratified sampling (bold numbers denote the best results; the asterisk (*) marks the best result without ties; the last column corresponds to CCR-CV).
Abalone (4177) 0.4643 0.3659 0.3659 0.3659 0.5299 0.5299* 0.3659 0.5278
Winequality white (4898) 0.4500 0.4492 0.4492 0.4492 0.4761 0.4755 0.4490 0.4780*
Penbased (10,992) 0.1108 0.1107 0.6234 0.1058 0.6775 0.3993 0.3976 0.7586*
Satimage (6435) 0.7487* 0.2389 0.5308 0.2389 0.6706 0.2398 0.6777 0.7484
Texture (5500) 0.0931 0.0927 0.3820 0.0927 0.5878 0.5878 0.3820 0.6329*
Pageblocks (5472) 0.8988 0.8988 0.8988 0.8988 0.9128 0.9145 0.9004 0.9145
Wall-following robot navigation (5456) 0.4195 0.4043 0.4049 0.4043 0.4855 0.4098 0.4043 0.4855
Optdigits (5620) 0.1069 0.1046 0.1071 0.1046 0.6023 0.6972 0.1069 0.7183*
No. best cases 1 0 0 0 1 2 0 6
No. best cases (without CCR-CV) 1 0 0 0 4 4 0
Table A2a
BACC of each clustering validity index for each sampled real dataset without stratified sampling (bold numbers denote the best results; the asterisk (*) marks the best result without ties; the last column corresponds to CCR-CV).
Abalone (4177) 0.5904 0.5002 0.5002 0.5002 0.6383 0.6373 0.5002 0.6450*
Winequality white (4898) 0.5063 0.5037 0.5037 0.5037 0.5149 0.5146 0.5036 0.5231*
Penbased (10,992) 0.5038 0.5038 0.7918 0.5009 0.8223 0.6625 0.6616 0.8659*
Satimage (6435) 0.7824 0.5005 0.6889 0.5005 0.7477 0.5014 0.7889 0.8229*
Texture (5500) 0.5012 0.5010 0.6601 0.5010 0.7733 0.7733 0.6601 0.7981*
Pageblocks (5472) 0.5050 0.5050 0.5050 0.5050 0.5832 0.6563 0.5476 0.6563
Wall-following robot navigation (5456) 0.5190 0.5001 0.5004 0.5001 0.5482 0.5137 0.5001 0.5482
Optdigits (5620) 0.5028 0.5016 0.5029 0.5016 0.7794 0.8321 0.5028 0.8438*
No. best cases 0 0 0 0 1 1 0 8
No. best cases (without CCR-CV) 0 0 0 0 5 3 1
Table A2b
BCR of each clustering validity index for each sampled real dataset without stratified sampling (bold numbers denote the best results; the asterisk (*) marks the best result without ties; the last column corresponds to CCR-CV).
Abalone (4177) 0.5493 0.0157 0.0157 0.0157 0.4693 0.4674 0.0157 0.6012*
Winequality white (4898) 0.1006 0.0396 0.0396 0.0396 0.1180 0.1171 0.0347 0.1548*
Penbased (10,992) 0.0504 0.0491 0.6378 0.0262 0.7580 0.3920 0.3911 0.8517*
Satimage (6435) 0.6145 0.0173 0.4335 0.0173 0.5742 0.0370 0.7011 0.7374*
Texture (5500) 0.0304 0.0251 0.3918 0.0251 0.6518 0.6518 0.3918 0.7279*
Pageblocks (5472) 0.0837 0.0837 0.0837 0.0837 0.3108 0.4355 0.2002 0.4355
Wall-following robot navigation (5456) 0.1958 0.0098 0.0197 0.0098 0.2969 0.1515 0.0098 0.2969
Optdigits (5620) 0.0480 0.0373 0.0522 0.0373 0.6283 0.7701 0.0480 0.7828*
No. best cases 0 0 0 0 1 1 0 8
No. best cases (without CCR-CV) 0 0 0 0 5 3 1