
Applied Soft Computing 64 (2018) 94–108


Integrating cluster validity indices based on data envelopment analysis

Boseop Kim^a, Hakyeon Lee^b, Pilsung Kang^a,∗

^a School of Industrial Management Engineering, Korea University, Seoul, South Korea
^b Department of Industrial and Systems Engineering, Seoul National University of Science and Technology, Seoul, South Korea

Article history: Received 6 July 2017; received in revised form 20 November 2017; accepted 23 November 2017; available online 12 December 2017.

Keywords: Clustering validity; Data envelopment analysis; Linear programming; Internal measure

Abstract

Because clustering is an unsupervised learning task, a number of different validity indices have been proposed to measure the quality of clustering results. However, there is no single best validity measure for all types of clustering tasks because individual clustering validity indices have both advantages and shortcomings. Because each validity index has demonstrated its effectiveness in particular cases, it is reasonable to expect that a more generalized clustering validity index can be developed if individually effective cluster validity indices are appropriately integrated. In this paper, we propose a new cluster validity index, named Charnes, Cooper & Rhodes − cluster validity (CCR-CV), by integrating eight internal clustering efficiency measures based on data envelopment analysis (DEA). The proposed CCR-CV can be used for more general purposes because it extends the coverage of a single validity index by adaptively adjusting the combining weights of different validity indices for different datasets. Based on the experimental results on 12 artificial and 30 real datasets, the proposed clustering validity index demonstrates superior ability to determine the optimal and plausible cluster structures compared to benchmark individual validity indices.

© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Clustering is not only one of the most actively studied multivariate data analysis algorithms in the pattern recognition and machine learning fields, but it is also one of the most widely applied algorithms for solving real-world problems such as customer segmentation in marketing and new product or service development in retail business [1–3]. The purpose of clustering is to determine a number of groups (clusters) and associated cluster memberships for all records such that records in the same cluster are homogeneous (similar to each other) whereas records in different clusters are heterogeneous (different from each other). Clustering is regarded as unsupervised learning because there are no explicit answers to the following two questions: (1) what is the optimal number of clusters for a given dataset? and (2) what are the best cluster membership assignments for all records in the dataset? Because there are no explicit answers to these questions, it becomes difficult to evaluate the quality of clustering results, which has led to the development of a significant number of different clustering validity indices [4–10]. Although all validity indices agree that an effective clustering result must satisfy two qualitative principles, i.e., homogeneity within clusters and heterogeneity between clusters, they employ different formulas to quantify these principles. It is accepted that none of the currently existing cluster validity measures can guarantee the best results for all clustering tasks [11–13]. Hence, in practice, multiple clustering algorithms are employed to determine different numbers of clusters and associated cluster memberships. Then, these clustering results are evaluated by multiple validity measures to allow data analysts or domain experts to determine the most practically plausible clustering result based on their domain knowledge [7].

Clustering validity indices can be grouped into two major categories: external and internal [11]. External indices evaluate the clustering results by comparing the cluster memberships assigned by a clustering algorithm with previously known knowledge such as externally supplied class labels [14,15]; internal indices evaluate the goodness of the cluster structure by focusing on the intrinsic information of the data itself [12]. Because external indices allow a more objective comparison between clustering algorithms with different parameters, e.g., the number of clusters, they have been adopted to validate newly proposed clustering algorithms by comparing them with existing algorithms in academic

∗ Corresponding author at: 801A Innovation Hall, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, South Korea.
E-mail addresses: svie89@korea.ac.kr (B. Kim), hylee@snut.ac.kr (H. Lee), pilsung_kang@korea.ac.kr (P. Kang).

https://doi.org/10.1016/j.asoc.2017.11.052

Table 1
Examples of internal clustering validity indices. Reprinted from Ref. [6].

| Index | Abb. | Definition | Optimal value |
| Root-mean-square std dev | RMSSTD | $\{\sum_i \sum_{x \in C_i} \|x - c_i\|^2 / [P \sum_i (n_i - 1)]\}^{1/2}$ | Elbow |
| R-squared | RS | $(\sum_{x \in D} \|x - c\|^2 - \sum_i \sum_{x \in C_i} \|x - c_i\|^2) / \sum_{x \in D} \|x - c\|^2$ | Elbow |
| Modified Hubert Γ statistic | Γ | $\frac{2}{n(n-1)} \sum_{x \in D} \sum_{y \in D} d(x, y)\, d_{x \in C_i, y \in C_j}(c_i, c_j)$ | Elbow |
| Calinski-Harabasz index | CH | $\frac{\sum_i n_i d^2(c_i, c) / (NC - 1)}{\sum_i \sum_{x \in C_i} d^2(x, c_i) / (n - NC)}$ | Max |
| I index | I | $\left(\frac{1}{NC} \cdot \frac{\sum_{x \in D} d(x, c)}{\sum_i \sum_{x \in C_i} d(x, c_i)} \cdot \max_{i,j} d(c_i, c_j)\right)^2$ | Max |
| Dunn's indices | D | $\min_i \min_{j \neq i} \left\{ \frac{\min_{x \in C_i, y \in C_j} d(x, y)}{\max_k \max_{x, y \in C_k} d(x, y)} \right\}$ | Max |
| Silhouette index | S | $\frac{1}{NC} \sum_i \left\{ \frac{1}{n_i} \sum_{x \in C_i} \frac{b(x) - a(x)}{\max[b(x), a(x)]} \right\}$ | Max |
| Davies-Bouldin index | DB | $\frac{1}{NC} \sum_i \max_{j \neq i} \left\{ \left[ \frac{1}{n_i} \sum_{x \in C_i} d(x, c_i) + \frac{1}{n_j} \sum_{x \in C_j} d(x, c_j) \right] / d(c_i, c_j) \right\}$ | Min |
| Xie-Beni index | XB | $\left[ \sum_i \sum_{x \in C_i} d^2(x, c_i) \right] / \left[ n \cdot \min_{i, j \neq i} d^2(c_i, c_j) \right]$ | Min |
| SD validity index | SD | $\mathrm{Dis}(NC_{\max})\,\mathrm{Scat}(NC) + \mathrm{Dis}(NC)$, where $\mathrm{Scat}(NC) = \frac{1}{NC} \sum_i \|\sigma(C_i)\| / \|\sigma(D)\|$ and $\mathrm{Dis}(NC) = \frac{\max_{i,j} d(c_i, c_j)}{\min_{i,j} d(c_i, c_j)} \sum_i \left( \sum_j d(c_i, c_j) \right)^{-1}$ | Min |
| S_Dbw validity index | S_Dbw | $\mathrm{Scat}(NC) + \mathrm{Dens\_bw}(NC)$, where $\mathrm{Dens\_bw}(NC) = \frac{1}{NC(NC-1)} \sum_i \sum_{j \neq i} \frac{\sum_{x \in C_i \cup C_j} f(x, u_{ij})}{\max\{\sum_{x \in C_i} f(x, c_i), \sum_{x \in C_j} f(x, c_j)\}}$ | Min |
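Several of the indices in Table 1 have off-the-shelf implementations. Purely as an illustration (not part of the original study, and assuming scikit-learn is available), CH, S, and DB can be compared across candidate cluster counts; the "Optimal value" column of Table 1 determines the direction in which each score should be read:

```python
# Illustrative only: computing three of the Table 1 indices with scikit-learn;
# the paper's own experiments do not necessarily use this library.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (calinski_harabasz_score,
                             davies_bouldin_score, silhouette_score)

# Synthetic data with four well-separated groups (hypothetical example)
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

for k in (2, 3, 4, 5, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k,
          round(silhouette_score(X, labels), 3),         # Max is better
          round(calinski_harabasz_score(X, labels), 1),  # Max is better
          round(davies_bouldin_score(X, labels), 3))     # Min is better
```

As the paper argues, the three scores need not agree on the same k for every dataset, which is exactly the motivation for integrating them.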

research [16,17]. However, they cannot be applied to real-world problems because such external information is not readily available. Hence, internal validation indices are more commonly used in practice. The main advantage of internal validation measures is that they do not require any prior knowledge of the clustering structure of a given dataset [7]. They evaluate the compactness within clusters and the separation between clusters based on their own formulas. The different formulas result from considering the impact of various factors such as noise, density, sub-clusters, skewed distributions, and the monotonicity of the index [6].

Because datasets have their own intrinsic characteristics, there is no single internal validity measure that is best fitted to all data structures [11,12]. In supervised learning, it is likewise known that there is no single algorithm that outperforms all other algorithms for all datasets [18]. However, if multiple algorithms are properly combined, the predictive performance of the combination, known as an ensemble, is typically superior to that of single algorithms [19–21]. Similarly, in unsupervised clustering, it is reasonable to expect that the effectiveness of clustering validity measures can be improved if they are used collectively with an appropriate integration technique. To this end, some studies have attempted to form an ensemble of multiple clustering validity measures to resolve the limitations of individual validity measures. For example, Jaskowiak et al. [22] constructed an ensemble validity measure based on 28 different measures with nine different selection strategies. However, none of these ensembles has a sound theoretical basis for integration; the integration is done empirically. Kou et al. [23] employed three multiple-criteria decision-making (MCDM) methods for evaluating clustering results for financial risk analysis. Although their idea of using MCDM as a tool for integrating validity measures is interesting, the experiments have some limitations: they only considered financial datasets, and the results were inconsistent with the known properties of the adopted methods.

In this paper, we propose an integrated clustering validity measure named Charnes, Cooper & Rhodes − cluster validity (CCR-CV), which combines eight internal validity indices based on data envelopment analysis (DEA). There are two main difficulties in cluster validity integration. First, some validity indices are designed to be minimized with the optimal cluster structure whereas others are designed to be maximized [6]. Second, the combining weights must not be fixed constants; rather, they must vary according to the intrinsic characteristics of the dataset. DEA was originally developed to evaluate the efficiency of a system by measuring the ratio of the weighted sum of the output components to the weighted sum of the input components [24]. The weights are not fixed; rather, they are determined by solving an optimization problem that considers not only the features of the system itself but also those of its competitors. Therefore, we employ four validity indices pursuing maximization as the output components and four validity indices pursuing minimization as the input components to define the efficiency in DEA. To determine the appropriate combining weights of the validity indices for a certain clustering algorithm with its associated parameters, we formulate the optimization problem using all candidate algorithm–parameter pairs. Hence, we expect that the coverage of the proposed clustering validity index can be extended with superior performance compared to the individual indices.

The remainder of this paper is organized as follows. In Section 2, we briefly review the internal validity measures used in this study. In Section 3, we introduce DEA and the formulation of the optimization problem used in DEA; then, we demonstrate the proposed DEA-based integrated clustering validity measure. In Section 4, the experimental settings, including the dataset descriptions, clustering algorithms, and performance measures, are demonstrated. In Section 5, the experimental results are provided and their implications are discussed. In Section 6, we summarize the current research and discuss directions for future work.

Table 2
Notations used in this paper.

| Notation | Definition |
| A | Data matrix of size N × p (N: number of records, p: number of variables) |
| M_i | ith row of the data matrix A |
| O_i | ith row of A, corresponding to an individual instance of the data |
| P | An integer vector with values between 1 and K; for each index i, the coordinate P_i is equal to the number k (1 ≤ k ≤ K) of the cluster to which observation O_i belongs |
| C_k | The kth cluster; it can be represented by a submatrix A^{(k)} composed of the rows of A whose index i is such that P_i = k |
| n_k | Cardinal of C_k; the matrix A^{(k)} has size n_k × p and $\sum_k n_k = N$ |
| I_k | Set of indices of the observations belonging to the cluster C_k: $I_k = \{i \mid O_i \in C_k\} = \{i \mid P_i = k\}$ |
| G_k | Barycenter of the observations in the cluster C_k |
| G | Barycenter of all (entire) data |
| WGSS_k | Within-cluster dispersion of C_k: $WGSS_k = \sum_{i \in I_k} \|M_i - G_k\|^2$ |
| WGSS | Pooled within-cluster dispersion: $WGSS = \sum_{k=1}^{K} WGSS_k$ |
| BGSS | Between-cluster dispersion: $BGSS = \sum_{k=1}^{K} n_k \|G_k - G\|^2$ |

2. Internal clustering validity index

The purpose of an internal clustering validity index is to assess the goodness of a given clustering structure based on two fundamental principles: (1) how homogeneously each single cluster is formed (compactness), such that all records in the same cluster are similar to each other, and (2) how heterogeneously the different clusters are determined (separation), such that two sets of records in different clusters are different from each other. An effective clustering structure should minimize the compactness (also known as within-cluster variance) and maximize the separation (also known as between-cluster variance). The main difference among internal clustering validity indices is the manner of measuring the compactness and separation based on the assigned cluster memberships. Table 1 provides some examples of internal clustering validity measures. As the formulas indicate, all validity indices use the distance between records and the distance between a record and its cluster centroid; however, the combining rules are different. In this section, we briefly review eight internal validity measures. Table 2 summarizes the notations used in this paper.

2.1. Validity index for minimization

We employ four clustering validity indices whose values are minimized with the optimal clustering structure: the Davies-Bouldin index [25], the Xie-Beni index [26], the Ray-Turi index [27], and the Comp-Sepa index [28].

The Davies-Bouldin index (DB) computes the compactness of a cluster ($\delta_k$) by averaging the distance between individual records and their cluster center of mass, as in (1). The separation between two clusters ($\Delta_{kk^*}$) is defined by the distance between their centers of mass (2). Then, the intermediate measure $M_k$, which considers the compactness and separation between the kth cluster and the other clusters, is computed by (3). Finally, DB is defined as the average of the $M_k$.

$\delta_k = \frac{1}{n_k} \sum_{i \in I_k} \| M_i - G_k \|$   (1)

$\Delta_{kk^*} = d(G_k, G_{k^*}) = \| G_k - G_{k^*} \|$   (2)

$M_k = \max_{k^* \neq k} \frac{\delta_k + \delta_{k^*}}{\Delta_{kk^*}}$   (3)

$DB = \frac{1}{K} \sum_{k=1}^{K} M_k$   (4)

The Xie-Beni index (XB) was originally developed to evaluate fuzzy clustering results that generate probabilistic cluster memberships rather than deterministic memberships. However, it can also be used to evaluate crisp clustering. Unlike the DB index, which considers all records in a cluster when measuring the separation between two clusters, XB defines the degree of cluster separation ($\delta_1$) by the minimum distance between two individual records in different clusters (5). The compactness of the clustering result is computed by the weighted sum of the squares of the clusters (WGSS). Finally, XB is defined as the ratio of WGSS to the square of the minimum distance between two clusters, as in (6). Both DB and XB focus on the separation between nearby clusters; however, XB places more emphasis on the two closest clusters among all clusters.

$\delta_1(C_k, C_{k^*}) = \min_{i \in I_k, j \in I_{k^*}} d(M_i, M_j)$   (5)

$XB = \frac{1}{N} \cdot \frac{WGSS}{\min_{k < k^*} \delta_1(C_k, C_{k^*})^2}$   (6)

Similar to the XB index, the Ray-Turi index (RT) uses the same measure to evaluate the compactness within a cluster; however, it considers the distance between the centers of mass of two clusters when measuring the degree of cluster separation. $\Delta_{kk^*}^2$ is defined by the squared distance between two different clusters, and RT uses the minimum $\Delta_{kk^*}^2$ among all possible pairs as the denominator.

$\min_{k < k^*} \Delta_{kk^*}^2 = \min_{k < k^*} d(G_k, G_{k^*})^2 = \min_{k < k^*} \| G_k - G_{k^*} \|^2$   (7)

$RT = \frac{1}{N} \cdot \frac{WGSS}{\min_{k < k^*} \Delta_{kk^*}^2}$   (8)

A distinctive feature of the Comp-Sepa (CS) index is that it measures the compactness of a single cluster by the longest edge among the edges of the minimum spanning tree (MST) constructed for the cluster, rather than by the (weighted) average distance between the records and their centroid as used in the previous three indices, as indicated in (9).

$MST_k: \text{minimum spanning tree of } C_k$   (9)

$Comp(C_k): \text{largest edge of } MST_k$   (10)

$Comp_{All} = \max_{k = 1, 2, \ldots, K} Comp(C_k)$   (11)

$Sepa = \min_{k^* \neq k} d(G_k, G_{k^*})$   (12)

$CompSepa = \frac{Comp_{All}}{Sepa}$   (13)
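The four minimization indices above can be transcribed almost directly from Eqs. (1)–(13). The following NumPy sketch is our illustration (not the authors' code); variable names follow the notation of Table 2:

```python
# Sketch: direct NumPy transcription of the four minimization indices
# (DB, Eqs. (1)-(4); XB, Eqs. (5)-(6); RT, Eqs. (7)-(8); CS, Eqs. (9)-(13)).
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def _centroids(X, labels):
    ks = np.unique(labels)
    return ks, np.array([X[labels == k].mean(axis=0) for k in ks])

def davies_bouldin(X, labels):
    ks, G = _centroids(X, labels)
    # Eq. (1): average distance of each record to its cluster barycenter
    delta = np.array([np.linalg.norm(X[labels == k] - G[i], axis=1).mean()
                      for i, k in enumerate(ks)])
    M = [max((delta[i] + delta[j]) / np.linalg.norm(G[i] - G[j])  # Eqs. (2)-(3)
             for j in range(len(ks)) if j != i)
         for i in range(len(ks))]
    return float(np.mean(M))                                      # Eq. (4)

def _wgss(X, labels, ks, G):
    return sum(((X[labels == k] - G[i]) ** 2).sum() for i, k in enumerate(ks))

def xie_beni(X, labels):
    ks, G = _centroids(X, labels)
    # Eq. (5): minimum squared distance between records in different clusters
    sep = min(((X[labels == ka][:, None, :] - X[labels == kb][None, :, :]) ** 2)
              .sum(-1).min()
              for a, ka in enumerate(ks) for kb in ks[a + 1:])
    return _wgss(X, labels, ks, G) / (len(X) * sep)               # Eq. (6)

def ray_turi(X, labels):
    ks, G = _centroids(X, labels)
    sep = min(((G[a] - G[b]) ** 2).sum()                          # Eq. (7)
              for a in range(len(ks)) for b in range(a + 1, len(ks)))
    return _wgss(X, labels, ks, G) / (len(X) * sep)               # Eq. (8)

def comp_sepa(X, labels):
    ks, G = _centroids(X, labels)
    comp_all = 0.0
    for k in ks:                                                  # Eqs. (9)-(11)
        P = X[labels == k]
        D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
        mst = minimum_spanning_tree(D)        # zeros are treated as "no edge"
        comp_all = max(comp_all, mst.data.max() if mst.nnz else 0.0)
    sepa = min(np.linalg.norm(G[a] - G[b])                        # Eq. (12)
               for a in range(len(ks)) for b in range(a + 1, len(ks)))
    return comp_all / sepa                                        # Eq. (13)
```

For example, for two tight clusters of two points each, ten units apart, all four indices return small values, consistent with their "minimize" orientation.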
2.2. Validity index for maximization

We employ four clustering validity indices whose values are maximized with the optimal clustering structure: the Dunn index [29], the Calinski-Harabasz index [30], the Silhouette index [31], and the PBM index [32].

The Dunn index (Dunn) evaluates the quality of a clustering structure using the ratio of the minimum between-cluster distance to the maximum within-cluster distance. The between-cluster distance is defined by the minimum distance between two different clusters, as in (14); the within-cluster distance is defined as the maximum distance between two records in the same cluster, as in (16).

$d_{kk^*} = \min_{i \in I_k, j \in I_{k^*}} \| M_i - M_j \|$   (14)

$d_{min} = \min_{k \neq k^*} d_{kk^*}$   (15)

$D_k = \max_{i, j \in I_k, i \neq j} \| M_i - M_j \|$   (16)

$d_{max} = \max_{1 \leq k \leq K} D_k$   (17)

$Dunn = \frac{d_{min}}{d_{max}}$   (18)

The Calinski-Harabasz index (CH) adopts the WGSS used in XB and RT to measure the compactness of the clusters. To measure the separation among the clusters, it aggregates the distances between cluster centers of mass using the number of records in the clusters as the combining weights (BGSS), as in (19).

$CH = \frac{BGSS / (K - 1)}{WGSS / (N - K)} = \frac{N - K}{K - 1} \cdot \frac{BGSS}{WGSS}$   (19)

The Silhouette index (Silhouette) has a distinctive feature compared to the other validity indices: it measures the compactness and separation not at the cluster level but at the individual record level. The record-level compactness is measured by $a(i)$, which is defined as the mean distance between the ith record and the other records in the same cluster (20). The record-level separation is measured by $b(i)$, which is defined as the minimum average distance between the ith record and the records in the clusters to which the ith record does not belong (22). The record-level cluster validity, $s(i)$, is computed as the difference between $b(i)$ and $a(i)$ over the maximum of the two (23). Then, the cluster-level validity $S_k$ is computed by averaging the $s(i)$'s in the same cluster (24). Finally, the Silhouette index is computed by averaging the cluster-level validities $S_k$ (25).

$a(i) = \frac{1}{n_k - 1} \sum_{i^* \in I_k, i^* \neq i} d(M_i, M_{i^*})$   (20)

$o(M_i, C_{k^*}) = \frac{1}{n_{k^*}} \sum_{i^* \in I_{k^*}} d(M_i, M_{i^*})$   (21)

$b(i) = \min_{k^* \neq k} o(M_i, C_{k^*})$   (22)

$s(i) = \frac{b(i) - a(i)}{\max(a(i), b(i))}$   (23)

$S_k = \frac{1}{n_k} \sum_{i \in I_k} s(i)$   (24)

$S = \frac{1}{K} \sum_{k=1}^{K} S_k$   (25)

The PBM index (PBM) computes the compactness of a cluster ($E_W$) by aggregating the distance between the center of mass and all the records in the cluster (27); the separation between clusters ($D_B$) is computed by the maximum distance between two cluster centers of mass (26). Then, the ratio of $D_B$ to $E_W$ is weighted by $E_T$ over $K$, which are, respectively, the total distance between individual records and the center of mass of the entire data, and the number of clusters.

$D_B = \max_{k < k^*} d(G_k, G_{k^*})$   (26)

$E_W = \sum_{k=1}^{K} \sum_{i \in I_k} d(M_i, G_k)$   (27)

$E_T = \sum_{i=1}^{N} d(M_i, G)$   (28)

$PBM = \left( \frac{1}{K} \cdot \frac{E_T}{E_W} \cdot D_B \right)^2$   (29)

3. Integrating cluster validity indices based on data envelopment analysis

3.1. Data envelopment analysis

DEA measures the efficiency of decision-making units (DMUs) that perform a homogeneous transformation process with multiple inputs and outputs. Although it was originally developed for measuring performance in terms of efficiency, recent years have seen an increasing role of DEA as a tool for multiple-criteria decision-making (MCDM) [33–35]. MCDM is a field of operations research that explicitly evaluates multiple criteria that are typically conflicting in a decision-making process; some should be maximized, and others should be minimized [36]. The original purposes of DEA and MCDM differ. However, scholars recognized that the MCDM and DEA formulations coincide if inputs and outputs are viewed as criteria, with minimization of inputs and maximization of outputs [37,38]. Negative items (the smaller the value, the better) are considered as inputs, whereas positive items (the greater the value, the better) are regarded as outputs [39]. However, it is not assumed that inputs are transformed into outputs [40]. Efficiency scores of DMUs then correspond to priority or performance scores in MCDM.

In general, efficiency can be defined as the ratio of the outputs of an organization to its inputs. For example, to evaluate the efficiency of bank branches, the input elements could be the number of employees and the rental cost of the office, whereas the output elements could be the number of new customers and the total amount of new loans. Because the weight for each input or output element can be different, the relative efficiency can be expressed as follows:

$\text{Relative Efficiency} = \frac{\text{Weighted sum of outputs}}{\text{Weighted sum of inputs}}$   (30)

DEA evaluates the relative efficiency by solving a linear program (LP) that generates not only the relative efficiency score but also the weights of the input and output elements.

Since its introduction, a considerable number of DEA variation models have been proposed [41]; however, we employ the basic CCR model in our study because it has the simplest form and is sufficient to evaluate the validity of clustering results. In DEA, each unit is called a DMU, and the inputs and outputs must be non-negative numeric values. The objective function and the constraints of the CCR model are formulated as follows:

$\max E_e = \frac{u_1 O_{1e} + u_2 O_{2e} + \ldots + u_M O_{Me}}{v_1 I_{1e} + v_2 I_{2e} + \ldots + v_N I_{Ne}},$   (31)
$\text{s.t. } \frac{u_1 O_{1k} + u_2 O_{2k} + \ldots + u_M O_{Mk}}{v_1 I_{1k} + v_2 I_{2k} + \ldots + v_N I_{Nk}} \leq 1, \quad k = 1, 2, \ldots, K,$   (32)

where $E_k$ is the efficiency of the kth unit ($0 \leq E_k \leq 1$), $u_j$ is the weight of the jth output ($j = 1, \ldots, M$), $v_i$ is the weight of the ith input ($i = 1, \ldots, N$), $O_{jk}$ is the amount of the jth output in the kth DMU ($\forall O_{jk} \geq 0$), $I_{ik}$ is the amount of the ith input in the kth DMU ($\forall I_{ik} \geq 0$), and $K$ is the total number of DMUs. This optimization problem can be rewritten in an LP form as follows:

$\max E_e = u_1 O_{1e} + u_2 O_{2e} + \ldots + u_M O_{Me},$   (33)

$\text{s.t. } (u_1 O_{1k} + u_2 O_{2k} + \ldots + u_M O_{Mk}) - (v_1 I_{1k} + v_2 I_{2k} + \ldots + v_N I_{Nk}) \leq 0, \quad k = 1, 2, \ldots, K,$   (34)

$v_1 I_{1e} + v_2 I_{2e} + \ldots + v_N I_{Ne} = 1,$
$u_j \geq 0, \quad j = 1, 2, \ldots, M,$
$v_i \geq 0, \quad i = 1, 2, \ldots, N.$

The solutions of the above LP are the efficiency score of each DMU ($E_k$) and the most favorable input and output weights ($v_i$ and $u_j$) for that DMU. If the efficiency score of a DMU is less than one, it implies that there are other DMUs that can be more efficient, even though the weights of the inputs and outputs are most favorably assigned to the current DMU.

3.2. CCR-CV

Because clustering is an unsupervised learning task, there is no single validity index that can be globally applied to data structures that are differently affected by noise, density, and sub-clusters. However, it can reasonably be assumed that if multiple validity indices are appropriately integrated, they can be applied to a wider range of data structures than an individual validity index. To integrate multiple clustering validity indices, we propose CCR-CV by employing the concept of DEA with the CCR model.

In CCR-CV, the eight clustering validity indices reviewed in Sections 2.1 and 2.2 are adopted. The four indices DB, XB, RT, and CS are regarded as the output elements, whereas the other four indices, Dunn, CH, Silhouette, and PBM, are regarded as the input elements in the CCR model. A DMU consists of (1) a clustering algorithm, (2) the number of clusters, and (3) any other parameters required to execute the algorithm. The objective function and the constraints of CCR-CV are formulated as follows:

$\max E_e = \frac{u_1 DB_e + u_2 XB_e + u_3 RT_e + u_4 CS_e}{v_1 Dunn_e + v_2 CH_e + v_3 Silhouette_e + v_4 PBM_e},$   (35)

$\text{s.t. } \frac{u_1 DB_k + u_2 XB_k + u_3 RT_k + u_4 CS_k}{v_1 Dunn_k + v_2 CH_k + v_3 Silhouette_k + v_4 PBM_k} \leq 1, \quad k = 1, 2, \ldots, K.$   (36)

Similar to (33) and (34), the above optimization problem can be rewritten in an LP form as in (37) and (38).

$\max E_e = u_1 DB_e + u_2 XB_e + u_3 RT_e + u_4 CS_e,$   (37)

$\text{s.t. } (u_1 DB_k + u_2 XB_k + u_3 RT_k + u_4 CS_k) - (v_1 Dunn_k + v_2 CH_k + v_3 Silhouette_k + v_4 PBM_k) \leq 0, \quad k = 1, 2, \ldots, K,$   (38)

$v_1 Dunn_e + v_2 CH_e + v_3 Silhouette_e + v_4 PBM_e = 1,$
$u_j \geq 0, \quad j = 1, 2, 3, 4,$
$v_i \geq 0, \quad i = 1, 2, 3, 4.$

By comparing the relative efficiency scores produced by CCR-CV, we can identify the best clustering structure.

Theoretically, CCR-CV places no restriction on the number of cluster validity indices to be minimized (inputs) or maximized (outputs), because DEA can accept an arbitrary number of inputs and outputs. One can use other clustering validity indices as inputs or outputs as long as they attain, respectively, the smallest or largest value with the optimal clustering structure. To verify the idea of the CCR-CV structure, we used the eight manually selected clustering validity indices introduced in Section 2.

4. Experimental settings

4.1. Datasets

To verify CCR-CV, we used eight artificially generated datasets and 30 real datasets obtained from three different data repositories: (1) the UCI machine learning repository (UCI, http://archives.ics.uci.edu/ml/), (2) Knowledge extraction based on evolutionary learning (KEEL, http://sci2s.ugr.es/keel/datasets.php), and (3) the Indian Statistical Institute [42]. Table 3 provides the number of instances and clusters in each artificial dataset. The optimal clustering structures of the artificial datasets are also illustrated in Fig. 1. We assume that the optimal cluster structure of the "Well separated" and "Flame" datasets can be identified by any individual clustering validity index. However, it would be difficult to determine the optimal cluster structure with a single validity index for the remaining six datasets. We attempt to verify whether the proposed CCR-CV can function effectively in these cases.

Fig. 1. Visualization of artificial datasets.

Table 3
Summary of artificial datasets.

| Name | # of instances | # of clusters |
| Well separated | 500 | 5 |
| Flame | 240 | 2 |
| Two moon | 600 | 2 |
| Tri moon | 600 | 3 |
| Spiral | 312 | 3 |
| Two circle | 400 | 2 |
| Bulls eye | 300 | 2 |
| Pathbased1 | 300 | 3 |

Table 4 provides a description of the real datasets. Because the agglomerative clustering algorithm requires heavy computational resources, we reduced the number of instances of the "Winequality red", "Winequality white", "Penbased", "Texture", "Segment", "Shuttle", "Wall-following robot navigation", "Optdigits", and "Banknote" datasets based on a stratified sampling that preserved the class distribution.

Table 4
Summary of real datasets (the number in parenthesis is the original number of records in the dataset).

| Name | # of instances | # of attributes | # of classes | Source |
| Wine | 178 | 13 | 3 | UCI |
| Seeds | 210 | 7 | 3 | UCI |
| Glass | 214 | 9 | 6 | UCI |
| Iris | 150 | 4 | 3 | UCI |
| Newthyroid | 214 | 5 | 3 | KEEL |
| Ecoli | 336 | 7 | 8 | KEEL |
| Abalone | 600 (4177) | 8 | 3 | UCI |
| Yeast | 600 (1484) | 8 | 10 | UCI |
| Winequality red | 600 (1599) | 11 | 6 | KEEL |
| Winequality white | 600 (4898) | 11 | 7 | KEEL |
| Vehicle | 846 | 18 | 4 | KEEL |
| Penbased | 660 (10,992) | 16 | 10 | KEEL |
| Satimage | 644 (6435) | 36 | 6 | KEEL |
| Vowel | 990 | 10 | 11 | KEEL |
| Texture | 550 (5500) | 40 | 11 | KEEL |
| Segment | 693 (2310) | 19 | 6 | KEEL |
| BreastTissue | 106 | 9 | 5 | KEEL |
| Balance | 624 | 4 | 3 | KEEL |
| Pageblocks | 548 (5472) | 10 | 5 | UCI |
| Shuttle | 580 (57,999) | 9 | 4 | KEEL |
| User knowledge modeling | 403 | 5 | 5 | UCI |
| Urban land cover | 168 | 147 | 9 | UCI |
| Image segmentation | 210 | 19 | 7 | UCI |
| Wall-following robot navigation | 546 (5456) | 24 | 4 | UCI |
| Foresttypes | 523 | 27 | 4 | UCI |
| Wdbc | 569 | 30 | 2 | KEEL |
| Sonar | 208 | 60 | 2 | KEEL |
| Optdigits | 562 (5620) | 64 | 10 | KEEL |
| Hepatobiliary disorders | 536 | 9 | 4 | ISI |
| Banknote | 686 (1372) | 4 | 2 | KEEL |

4.2. Clustering algorithms

To ensure sufficiently diverse clustering results for each dataset, we employ six clustering algorithms based on four different approaches: partition-based, hierarchical, density-based, and graph-based strategies. In this study, we consider only hard cluster assignments, by which every instance is assigned to exactly one cluster, because the clustering validity indices are computed under the assumption that each instance belongs to only one cluster. Hence, soft clustering algorithms, such as fuzzy C-means, are included only after their probabilistic cluster memberships are converted into hard assignments.

From the partitioning-based approach, the K-Means Clustering (K-Means) [43] and K-Medoids Clustering (K-Medoids) [44]
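The CCR-CV efficiency of each DMU is obtained by solving the LP in Eqs. (37)–(38) once per DMU. The following sketch is our illustration using scipy.optimize.linprog (assumed available, not the authors' code); the index values below are made-up numbers for illustration only:

```python
# Sketch: solving the CCR-CV linear program (Eqs. (37)-(38)) with SciPy.
# Rows are DMUs; outputs = (DB, XB, RT, CS), inputs = (Dunn, CH,
# Silhouette, PBM), exactly as in the paper's formulation.
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(outputs, inputs, e):
    """Efficiency of DMU e; decision vector x = [u_1..u_M, v_1..v_N]."""
    K, M = outputs.shape
    N = inputs.shape[1]
    c = np.concatenate([-outputs[e], np.zeros(N)])       # maximize u.O_e
    A_ub = np.hstack([outputs, -inputs])                 # u.O_k - v.I_k <= 0
    b_ub = np.zeros(K)
    A_eq = np.concatenate([np.zeros(M), inputs[e]])[None, :]  # v.I_e = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=(0, None))
    return -res.fun

# Hypothetical validity-index values for three candidate DMUs
outputs = np.array([[0.9, 0.4, 0.3, 0.5],   # DB, XB, RT, CS per DMU
                    [0.5, 0.2, 0.2, 0.3],
                    [0.7, 0.6, 0.4, 0.6]])
inputs = np.array([[0.8, 0.9, 0.7, 0.6],    # Dunn, CH, Silhouette, PBM
                   [0.4, 0.5, 0.6, 0.3],
                   [0.9, 0.8, 0.5, 0.7]])
scores = [ccr_efficiency(outputs, inputs, e) for e in range(3)]
```

In the CCR model, at least one DMU attains an efficiency of 1; the DMU(s) with the highest score correspond to the most plausible clustering structure under CCR-CV.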
Table 5
Description of DMUs.

| Clustering algorithm | No. of DMUs | Parameters | Dataset |
| K-Means | 15 | Number of clusters: 3 (C − 1, C, C + 1; C: the number of classes in the dataset); number of centroid initializations: 5 | Artificial, Real |
| K-Medoids | 3 | Number of clusters: 3 (C − 1, C, C + 1) | Artificial, Real |
| Fuzzy C-Means | 15 | Number of clusters: 3 (C − 1, C, C + 1); number of centroid initializations: 5 | Artificial, Real |
| AN | 9 | Number of clusters: 3 (C − 1, C, C + 1); distance measure: 3 (single linkage, complete linkage, average linkage) | Artificial, Real |
| DBSCAN | 9 | Number of ε values: 3; number of minpts values: 3 | Artificial |
| k-MST | 3 | None | Artificial, Real |

algorithms are selected. K-Means is the most widely used clustering algorithm in both academia and industry. The purpose of K-Means is to determine the K cluster centroids and assign the cluster memberships to all records in the dataset. The only parameter for K-Means is the number of clusters, K. K-Means begins with K initial centroids, which are usually chosen at random. Then, every record is assigned to the nearest centroid. After the membership assignment, the centroids are updated by averaging all the members of the same cluster. These procedures are repeated alternately until the cluster centroids and the corresponding memberships no longer change. K-Medoids uses the same procedure as K-Means except for the computation of the centroid: the centroid of a cluster is defined by the median value of the cluster members in K-Medoids, whereas it is defined by the mean value of the cluster members in K-Means. We also employ Fuzzy C-Means (FCM) clustering, which is the soft clustering version of K-Means. Similar to K-Means, FCM finds the optimal centroids and the cluster memberships of individual records during training. The only difference between K-Means and FCM is that FCM allows a record to have multiple cluster memberships with different assignment probabilities. Since the cluster validity measures used in this study are designed for hard clustering methods, each record is assigned to the cluster centroid with the highest assignment probability.

For the hierarchical approach, Agglomerative Nesting (AN) [45] is adopted. Hierarchical clustering can be conducted either by the top-down method, which divides the entire dataset into sub-regions until every single record constitutes its own cluster, or by the bottom-up method, which begins with each record as a single cluster and merges the two nearest clusters until the entire dataset is merged into a single cluster. Because the bottom-up approach is more commonly adopted in practice, we also employ the bottom-up approach-based AN. The parameter for AN is the distance measure between two clusters. We consider three distance measures: single linkage, complete linkage, and average linkage.

DBSCAN [46] is adopted for the density-based approach. The assumption of density-based clustering algorithms is that the records in a cluster are generated from the same distribution and that different clusters have different data-generating distributions. DBSCAN has two model parameters: the radius ε and the minimum number of points (minpts) required to form a dense region. DBSCAN begins with an arbitrary record that has not yet been considered. If the ε-neighborhood of the target record contains sufficiently many records, a cluster is started. Then, the ε-neighborhoods of the members of the newly formed cluster are also added to the same cluster. This diffusion process is repeated until there are no records in the ε-neighborhood of the current cluster members. Because every record in the given dataset requires a cluster membership, we assign the cluster membership of the records identified as noise by the original DBSCAN to the nearest cluster centroid.

Table 6
Candidate ε and minpts values for each artificial dataset.

| Dataset | ε candidates | minpts candidates |
| Well separated | 2, 4, 6 | 6, 7, 8 |
| Flame | 1, 1.5, 2 | 6, 7, 8 |
| Two moon | 0.1, 0.15, 0.2 | 6, 7, 8 |
| Tri moon | 0.05, 0.1, 0.15 | 6, 7, 8 |
| Spiral | 1, 2, 3 | 1, 2, 3 |
| Two circle | 0.05, 0.1, 0.15 | 6, 7, 8 |
| Bulls eye | 0.05, 0.1, 0.15 | 6, 7, 8 |
| Pathbased1 | 1.5, 2, 2.5 | 6, 7, 8 |

Among graph-based clustering algorithms, the k-minimum spanning tree (k-MST) [47] is employed. In k-MST, each record is represented as a node in a graph, and close nodes are connected by edges with weights proportional to the distance between the two nodes. The two objectives of k-MST are that (1) there must not be any cycles in the graph and (2) the total sum of the weighted edges is minimized. Then, k-MST cuts the (k − 1) largest edges to generate k isolated sub-trees, which are considered as clusters.

4.3. Design of DMUs

Based on the six clustering algorithms and their associated parameters, we design 54 DMUs for the artificial datasets, as illustrated in Table 5. First, 15 DMUs each are used for K-Means and FCM: three different numbers of clusters and five sets of initial centroids. Because the class information is available for all the datasets, we set the number of clusters in K-Means to the numbers adjacent to the actual number of classes: C (the actual number of classes), C − 1, and C + 1. For those datasets that have only two classes, the number of clusters is set to two, three, and four. Furthermore, because the final clustering result of K-Means is highly dependent on the initial centroids, which are randomly chosen in practice, we repeat the centroid initialization five times for each number of clusters. Three DMUs are used for K-Medoids, obtained by changing the number of clusters; unlike K-Means, K-Medoids always selects an actual record as a centroid and therefore does not depend on the centroid initialization. Nine DMUs are designed for AN based on the number of clusters and the distance measure between two clusters: single linkage, complete linkage, and average linkage. Nine DMUs are designed for DBSCAN: three ε values and three minpts values. Because ε is considerably sensi-
tive to the data structure, we conduct a preliminary experiment to
original DBSCAN regards the records that have no records in their
determine reasonable ␧ values by searching a sufficient number of
␧−neighborhood as noise and does not assign the cluster member-
␧ candidates. The selected ␧ values for each artificial dataset are
ship to them. Because all the other clustering algorithms assign the
provided in Table 6. Because minpts is not as sensitive as ␧ to the
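The k-MST procedure described above can be sketched with SciPy's spanning-tree routines. This is a minimal illustration under our own naming (`k_mst_clusters`), not the implementation used in the paper:

```python
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def k_mst_clusters(X, k):
    """Cluster X by cutting the (k - 1) heaviest edges of the minimum
    spanning tree of the pairwise-distance graph; the k remaining
    connected sub-trees are the clusters."""
    dist = squareform(pdist(X))                  # dense distance matrix
    mst = minimum_spanning_tree(dist).toarray()  # tree with n - 1 edges
    edges = np.argwhere(mst > 0)                 # edge list, row-major order
    order = np.argsort(mst[mst > 0])[::-1]       # heaviest edges first
    for i, j in edges[order[:k - 1]]:            # cut the (k - 1) largest
        mst[i, j] = 0.0
    _, labels = connected_components(mst, directed=False)
    return labels
```

With two well-separated groups and k = 2, the heaviest MST edge is the bridge between the groups, so cutting it recovers them. Note that `minimum_spanning_tree` treats zero distances as missing edges, so exact duplicate records would need a small jitter in this sketch.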
B. Kim et al. / Applied Soft Computing 64 (2018) 94–108 101

Because minpts is not as sensitive as ε to the data structure, we use the same minpts candidates (six, seven, and eight) for all artificial datasets.

A total of 30 DMUs are used for the real datasets. All the DMUs in the experiment with the artificial datasets are also used, except for the nine DMUs associated with DBSCAN. Not only is ε in DBSCAN sensitive to the data structure, but its computational cost also increases exponentially with the number of records in a dataset.

Table 7
Confusion matrix with the actual class label and assigned class label by clustering.

Original class    Predicted class
                  A          ...   k          ...   K
A                 n_{11}     ...   n_{1k}     ...   n_{1K}
...
k*                n_{k*1}    ...   n_{k*k}    ...   n_{k*K}
...
K                 n_{K1}     ...   n_{Kk}     ...   n_{KK}

\[
\text{Accuracy} = \frac{\sum_{k=1}^{K} n_{kk}}{\sum_{k^*=1}^{K}\sum_{k=1}^{K} n_{k^*k}}, \qquad
\text{Sensitivity}_k = \frac{n_{kk}}{\sum_{k^*=1}^{K} n_{k^*k}}, \qquad
\text{Specificity}_k = \frac{\sum_{k^*\neq k}\sum_{k'\neq k} n_{k^*k'}}{\sum_{k^*=1}^{K}\sum_{k'\neq k} n_{k^*k'}} \tag{39}
\]

\[
\text{Balanced correction rate}_k = \sqrt{\text{Sensitivity}_k \times \text{Specificity}_k} \tag{40}
\]

\[
\text{Balanced correction rate} = \frac{1}{K}\sum_{k=1}^{K} \text{Balanced correction rate}_k \tag{41}
\]

\[
\text{Balanced accuracy}_k = \frac{1}{2}\left(\text{Sensitivity}_k + \text{Specificity}_k\right) \tag{42}
\]

\[
\text{Balanced accuracy} = \frac{1}{K}\sum_{k=1}^{K} \text{Balanced accuracy}_k \tag{43}
\]
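The majority-voting evaluation behind Eqs. (39)–(43) — vote each cluster to a class label, then average per-class sensitivity and specificity — can be sketched as follows. This is our own illustration using the conventional one-vs-rest definitions of sensitivity and specificity, not the authors' code:

```python
import numpy as np

def external_scores(y_true, clusters):
    """Assign each cluster its majority class label, then compute ACC,
    BACC and BCR from the resulting label assignment."""
    y_true = np.asarray(y_true)
    y_pred = np.empty_like(y_true)
    for c in np.unique(clusters):
        members = np.asarray(clusters) == c
        vals, counts = np.unique(y_true[members], return_counts=True)
        y_pred[members] = vals[np.argmax(counts)]  # majority vote per cluster

    acc = float(np.mean(y_pred == y_true))
    sens, spec = [], []
    for k in np.unique(y_true):
        tp = np.sum((y_true == k) & (y_pred == k))
        tn = np.sum((y_true != k) & (y_pred != k))
        sens.append(tp / np.sum(y_true == k))      # one-vs-rest sensitivity
        spec.append(tn / np.sum(y_true != k))      # one-vs-rest specificity
    sens, spec = np.array(sens), np.array(spec)
    bacc = float(np.mean((sens + spec) / 2))       # cf. Eqs. (42)-(43)
    bcr = float(np.mean(np.sqrt(sens * spec)))     # cf. Eqs. (40)-(41)
    return acc, bacc, bcr
```

For example, with true labels [0, 0, 0, 1, 1, 1] split into three clusters [0, 0, 1, 1, 2, 2], the middle cluster is voted (deterministically, ties break to the smallest label here) to class 0, giving one misclassified record.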

4.4. Performance measures

Because we know the optimal clustering structure for the artificial datasets, the proposed CCR-CV and the other eight clustering validity indices are tested to determine whether they can identify the optimal clustering structure. The greater the number of datasets for which a clustering validity index can determine the optimal structure, the more generally effective the index.

For the real datasets, we measure the quality of a cluster validity index by comparing the actual class labels of the records with the cluster memberships determined as optimal by the cluster validity index. It is reasonable to assume that records with the same class label have more similar characteristics than records with different class labels. Hence, the class entropy in an individual cluster will be low if the dataset is well clustered, and an effective cluster validity index should identify this. Once the optimal cluster structure is determined by a clustering validity index, the majority class label of each cluster is assigned to all the records in that cluster. With the actual class labels and the classes assigned based on the clustering result, a confusion matrix can be constructed as illustrated in Table 7. Based on the confusion matrix, we compute three performance measures to compare the clustering validity indices: the simple accuracy (ACC), balanced accuracy (BACC), and balanced correction rate (BCR), as indicated in (39)–(43). Fig. 2 presents an illustrative example of computing these measures when the dataset is partitioned into three clusters; however, the actual number of classes is two.

Fig. 2. Example of calculating external measures by majority voting.

5. Experimental results

In this section, we compare the clustering validity results obtained by the proposed CCR-CV method with those of the individual validity indices in terms of their ability to find the optimal clustering structure (synthetic datasets), the three classification accuracy measures (real datasets), and the time complexity required to compute each validity index (both datasets).

5.1. Results on artificial datasets

The results of determining the optimal structure of each artificial dataset by each clustering index are summarized in Table 8. Among the individual measures, no measure could identify the optimal structure for all the artificial datasets. Although XB, CS, and Dunn determined the optimal cluster structure for seven datasets, they could not identify the optimal structure for the "Pathbased1" dataset. RT was successful in determining the optimal clustering structure for only three datasets; CH, Silhouette, and PBM succeeded for only two datasets. DB could not identify the optimal clustering structure for any of the artificial datasets. Conversely, the proposed CCR-CV index determined the optimal structure for all the artificial datasets. Because at least one validity index could determine the optimal cluster structure for each dataset, the



Table 8
Summary indicating whether each clustering validity index can determine the optimal clustering structure for each artificial dataset.

Index Well separated Flame Two moon Tri moon Spiral Two circle Bulls eye Pathbased1 Ratio

DB X X X X X X X X 0/8
XB O O O O O O O X 7/8
RT X O X X O X X O 3/8
CS O O O O O O O X 7/8
Dunn O O O O O O O X 7/8
CH O X X X X X X O 2/8
Silhouette O O X X X X X X 2/8
PBM O O X X X X X X 2/8
CCR-CV O O O O O O O O 8/8

DEA employed by CCR-CV could adaptively adjust the aggregating weights of the indices for different data structures to identify the optimal cluster structure.

Figs. 3 and 4 show the resulting clustering structures of selected DMUs and their corresponding CCR-CV scores for the "Well separated" dataset and the "Bulls eye" dataset, respectively. For the "Well separated" dataset, the DMUs resulting in the optimal cluster structure (Fig. 3(b), (e), and (f)) have efficiency scores of 1. Other, non-optimal structures result in CCR-CV scores lower than 1. CCR-CV can also find a more complicated cluster structure such as "Bulls eye", as shown in Fig. 4. The only DMU with a CCR-CV score of 1 used agglomerative clustering with single linkage and was instructed to find two clusters (Fig. 4(f)). The other DMUs could not find the optimal cluster structure and resulted in CCR-CV scores lower than 1.

Table 9 shows the total elapsed time to compute each clustering validity index for all DMUs for the synthetic datasets. Because CCR-CV takes the values of the eight individual validity measures, it takes the longest time. However, the total elapsed time for CCR-CV computation is highly dependent on the computation time of the validity index with the highest time complexity, which is CS in this experiment. The relative increase in CCR-CV computation time is only 1.76% on average (maximum of 2.28% and minimum of 1.20%) compared to that of CS. In addition, the time required to solve the DEA optimization is less than 10⁻² seconds, which is negligible compared to the time required to compute the validity indices.

5.2. Results on real datasets

The average ACC, BACC, and BCR of each clustering validity index for each real dataset over 30 repetitions are provided in Tables 10–12. In terms of ACC, the proposed CCR-CV yielded the best performance for 26 of the 30 datasets, followed by CH and Silhouette (three datasets each). The other indices resulted in the best performance for fewer than two datasets; XB and Dunn were not found to be the best for any dataset. Similar results were demonstrated for BACC and BCR. CCR-CV yielded the best performance for 27 and 28 datasets in terms of BACC and BCR, respectively. Except for CCR-CV, none of the clustering validity indices resulted in the highest score for more than five datasets.

Fig. 3. CCR-CV scores for resulting clustering structure by different clustering algorithms for Well separated dataset.

Fig. 4. CCR-CV scores for resulting clustering structure by different clustering algorithms for Bulls eye dataset.

Table 9
Elapsed time (seconds) for computing each cluster validity index for the synthetic datasets.

Time complexity DB XB RT CS Dunn CH Silhouette PBM CCR-CV (Total) CCR-CV (DEA)

Well separated 0.09 0.09 0.10 41.76 0.15 0.08 0.10 0.11 42.48 0.00
Flame 0.08 0.08 0.10 22.81 0.06 0.06 0.06 0.08 23.33 0.00
Two moon 0.10 0.11 0.09 37.40 0.10 0.10 0.10 0.09 38.09 0.00
Tri moon 0.11 0.11 0.09 38.16 0.11 0.11 0.12 0.11 38.92 0.01
Spiral 0.36 0.36 0.37 216.49 0.37 0.38 0.39 0.36 219.08 0.00
Two circle 0.16 0.17 0.18 68.70 0.17 0.18 0.17 0.17 69.90 0.00
Bulls eye 0.37 0.37 0.36 179.48 0.38 0.37 0.39 0.38 182.10 0.01
Pathbased1 0.25 0.26 0.27 98.69 0.26 0.27 0.27 0.26 100.53 0.00

The last row of Tables 10–12 indicates the number of datasets for which each individual clustering validity index yielded the highest score when CCR-CV was not considered. In this situation, although CH most frequently resulted in the best performance, the number of best performances was less biased toward a single validity index, as we know that none of the existing clustering validity indices outperforms the others for all data structures. However, as in the results for the artificial datasets, if multiple clustering validity indices are properly integrated, we can expect improved performance because the optimal cluster structure determined by CCR-CV is more consistent with the actual class distribution than those identified by individual validity indices.

Because the experimental results in Tables 11–13 are based on the stratified sampled datasets, the experimental results with the original datasets (without sampling) are provided in Appendix A. The only difference between these two sets of results is that CS was excluded because of its heavy computational complexity. For the original datasets, the proposed CCR-CV generally outperformed the other validity indices, which implies that the proposed CCR-CV is independent of the dataset size.

The average performance improvements of CCR-CV compared to the other clustering validity indices for the three metrics are summarized in Table 13. Because some validity measures resulted in a zero value for some datasets and performance measures, e.g., DB on the "Wine" dataset in terms of BCR, we excluded these cases when computing the relative improvement. The values in parentheses are the p-values of the t-tests with the following hypotheses. The null hypothesis (H0) is that the performances of CCR-CV and the clustering validity index in the corresponding column are the same (statistically not distinguishable), whereas the alternative hypothesis (H1) is that CCR-CV statistically outperforms the clustering validity index in the corresponding column. Note that all the p-values are smaller than 0.001, which implies that CCR-CV outperformed all the compared individual clustering validity indices irrespective of the performance measure. Further, the relative improvement of CCR-CV is noticeable in terms of BCR: it at least doubles the BCR (the relative improvement is higher than 100%) for the real datasets against the individual clustering validity measures, with the only exception being CH.
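The one-sided test described above can be reproduced along these lines. The snippet is a sketch with synthetic per-dataset scores, assuming a paired design across datasets (the paper does not spell out the exact test variant); `alternative="greater"` requires SciPy ≥ 1.6:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
ccr_cv = rng.uniform(0.6, 0.9, size=30)              # hypothetical scores, one per dataset
baseline = ccr_cv - rng.uniform(0.05, 0.2, size=30)  # a weaker index on the same datasets

# H0: mean performances are equal; H1: CCR-CV's mean performance is greater.
t_stat, p_value = stats.ttest_rel(ccr_cv, baseline, alternative="greater")
print(t_stat, p_value)
```

With every paired difference positive, the one-sided p-value falls well below the 0.001 threshold reported in Table 13.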

Table 10
ACC of each clustering validity index for each real dataset (bold number denotes the best results with the asterisk (*) being the best without ties).

ACC DB XB RT CS Dunn CH Silhouette PBM CCR-CV

Wine 0.3989 0.3989 0.3989 0.3989 0.3989 0.9663 0.3989 0.3989 0.9663
Seeds 0.9190 0.3381 0.6429 0.3381 0.3381 0.6571 0.6429 0.9095 0.9190
Glass 0.5421 0.3692 0.3692 0.3738 0.3692 0.5701 0.5561 0.3692 0.5794*
Iris 0.6667 0.6667 0.6667 0.6667 0.6667 0.6667 0.6667 0.6867 0.8333*
Newthyroid 0.7116 0.7070 0.7070 0.7070 0.7070 0.8744 0.7070 0.7070 0.8744
Ecoli 0.6935 0.4554 0.6369 0.4554 0.4554 0.6726 0.6845 0.6935 0.6964*
Abalone 0.3667 0.3667 0.3667 0.3667 0.3667 0.5200 0.3683 0.3667 0.5300*
Yeast 0.5583* 0.3283 0.3683 0.3283 0.3267 0.5467 0.5467 0.5467 0.5517
Winequality red 0.4300 0.4283 0.4267 0.4283 0.4283 0.5433 0.5467 0.4283 0.5633*
Winequality white 0.4783 0.4550 0.4550 0.4583 0.4550 0.4783 0.4750 0.4783 0.4950*
Vehicle 0.3688 0.2636 0.3688 0.2648 0.2636 0.3641 0.3735 0.3735 0.3735
Penbased 0.1242 0.1227 0.6121 0.1227 0.1227 0.6379* 0.5606 0.5606 0.6121
Satimage 0.2609 0.2438 0.5202 0.2453 0.2438 0.7174 0.5202 0.7220 0.7717*
Vowel 0.1384 0.1394 0.3212 0.1414 0.1394 0.3051 0.1475 0.2869 0.3576*
Texture 0.2855 0.2800 0.4436 0.2800 0.5582 0.6473 0.6291 0.4436 0.6655*
Segment 0.1501 0.1501 0.1501 0.1501 0.1530 0.5657 0.6248* 0.6017 0.6017
BreastTissue 0.4717 0.2264 0.4057 0.2264 0.2830 0.5189 0.5189 0.5189 0.5566*
Balance 0.4696 0.6667 0.6795 0.6795 0.6715 0.6042 0.4647 0.4647 0.6683
Pageblocks 0.9051 0.9033 0.9069 0.9051 0.9033 0.9179 0.9051 0.9051 0.9234*
Shuttle 0.8603 0.7897 0.7897 0.7897 0.7897 0.8362 0.8638 0.7897 0.8638
User knowledge modeling 0.3251 0.3275 0.3251 0.3275 0.3251 0.4491 0.4268 0.3251 0.5955*
Urban land cover 0.3869 0.2143 0.2262 0.2262 0.2202 0.5833 0.6131 0.5119 0.6964*
Image segmentation 0.4333 0.1714 0.3095 0.1714 0.1714 0.5762 0.6238 0.5762 0.7952*
Wall-following robot navigation 0.4103 0.4084 0.4103 0.4084 0.4084 0.5018 0.4267 0.4084 0.5092*
Foresttypes 0.4685 0.3748 0.3748 0.3767 0.3748 0.4990 0.4990 0.3748 0.7706*
Wdbc 0.6327 0.6309 0.6309 0.6309 0.6309 0.9121 0.6309 0.6309 0.9139*
Sonar 0.5337 0.5337 0.5337 0.5337 0.5337 0.5337 0.5337 0.5337 0.6010*
Optdigits 0.1352 0.1157 0.1263 0.1157 0.1157 0.6157 0.5854 0.1317 0.6673*
Hepatobiliary disorders 0.3358 0.3340 0.3358 0.3358 0.3358 0.3582 0.3582 0.3340 0.4011*
Banknote 0.5743 0.5554 0.6866 0.5554 0.5554 0.7289 0.6822 0.6866 0.8397*
No. best cases 2 0 1 1 0 3 3 1 26
No. best cases (without CCR-CV) 5 1 2 1 1 14 9 6

Table 11
BACC of each clustering validity index for each real dataset (bold number denotes the best results with the asterisk (*) being the best without ties).

BACC DB XB RT CS Dunn CH Silhouette PBM CCR-CV

Wine 0.5000 0.5000 0.5000 0.5000 0.5000 0.9779 0.5000 0.5000 0.9779
Seeds 0.9393 0.5036 0.7321 0.5036 0.5036 0.7429 0.7321 0.9321 0.9393
Glass 0.6219 0.5239 0.5239 0.5274 0.5239 0.6829 0.6313 0.5339 0.6872*
Iris 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7650 0.8750*
Newthyroid 0.5244 0.5162 0.5162 0.5162 0.5162 0.7958 0.5162 0.5162 0.7958
Ecoli 0.7236 0.6340 0.7065 0.6340 0.6340 0.7078 0.6821 0.7236 0.7367*
Abalone 0.5013 0.5013 0.5013 0.5013 0.5013 0.6306 0.5028 0.5013 0.6363*
Yeast 0.6586 0.5706 0.5821 0.5706 0.5702 0.6559 0.6559 0.6561 0.6608*
Winequality red 0.5268 0.5058 0.5045 0.5058 0.5058 0.5447 0.5458 0.5012 0.5521*
Winequality white 0.5604 0.5409 0.5409 0.5422 0.5409 0.5604 0.5231 0.5604 0.5687*
Vehicle 0.5878 0.5050 0.5878 0.5057 0.5050 0.5788 0.5909 0.5909 0.5909
Penbased 0.5111 0.5103 0.7856 0.5103 0.5103 0.7981* 0.7561 0.7561 0.7856
Satimage 0.5204 0.5054 0.6845 0.5068 0.5054 0.7678 0.6845 0.8047 0.8312*
Vowel 0.5261 0.5267 0.6267 0.5278 0.5267 0.6178 0.5311 0.6078 0.6467*
Texture 0.6070 0.6040 0.6940 0.6040 0.7570 0.8060 0.7960 0.6940 0.8160*
Segment 0.5042 0.5042 0.5042 0.5042 0.5059 0.7466 0.7811* 0.7677 0.7677
BreastTissue 0.6624 0.5143 0.6048 0.5143 0.5429 0.6861 0.6861 0.6861 0.7172*
Balance 0.5054 0.6376 0.6462 0.6462 0.6408 0.5957 0.5021 0.5021 0.6386
Pageblocks 0.6032 0.5931 0.5990 0.6032 0.5931 0.5605 0.5861 0.5861 0.6888*
Shuttle 0.7104 0.6270 0.6270 0.6270 0.6270 0.6080 0.7886 0.6270 0.7886
User knowledge modeling 0.5059 0.5071 0.5059 0.5071 0.5059 0.5794 0.5588 0.5059 0.6417*
Urban land cover 0.6232 0.5249 0.5303 0.5303 0.5277 0.7697 0.7799 0.6993 0.8430*
Image segmentation 0.6694 0.5167 0.5972 0.5167 0.5167 0.7528 0.7806 0.7528 0.8806*
Wall-following robot navigation 0.5039 0.5029 0.5039 0.5029 0.5029 0.6632 0.5127 0.5029 0.6632
Foresttypes 0.5946 0.5019 0.5019 0.5038 0.5019 0.6255 0.6255 0.5019 0.8307*
Wdbc 0.5071 0.5047 0.5047 0.5047 0.5047 0.8945 0.5047 0.5047 0.9017*
Sonar 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5337 0.5956*
Optdigits 0.5168 0.5078 0.5137 0.5078 0.5078 0.7869 0.7702 0.5166 0.8147*
Hepatobiliary disorders 0.5027 0.5014 0.5027 0.5027 0.5027 0.5536 0.5536 0.5014 0.6080*
Banknote 0.5213 0.5000 0.6475 0.5000 0.5000 0.7137 0.6426 0.6475 0.8360*
No. best cases 1 0 1 1 0 4 3 1 27
No. best cases (without CCR-CV) 6 1 3 3 1 15 10 7

Table 12
BCR of each clustering validity index for each real dataset (bold number denotes the best results with the asterisk (*) being the best without ties).

BCR DB XB RT CS Dunn CH Silhouette PBM CCR-CV

Wine 0.0000 0.0000 0.0000 0.0000 0.0000 0.9775 0.0000 0.0000 0.9775
Seeds 0.9390 0.0680 0.5521 0.0680 0.0680 0.5650 0.5521 0.9315 0.9390
Glass 0.3923 0.1455 0.1455 0.1803 0.1455 0.4892 0.4173 0.2094 0.4987*
Iris 0.6290 0.5690 0.5690 0.5690 0.5690 0.5690 0.5690 0.6577 0.8732*
Newthyroid 0.1770 0.1445 0.1445 0.1445 0.1445 0.7657 0.1445 0.1445 0.7657
Ecoli 0.5300 0.3418 0.4868 0.3418 0.3418 0.5132 0.4307 0.5300 0.5477*
Abalone 0.0414 0.0414 0.0414 0.0414 0.0414 0.4615 0.0716 0.0414 0.4641*
Yeast 0.4413 0.2221 0.2874 0.2221 0.2189 0.4385 0.4385 0.4387 0.4498*
Winequality red 0.1554 0.0692 0.0499 0.0692 0.0692 0.2103 0.2115 0.0308 0.2186*
Winequality white 0.2613 0.1638 0.1638 0.1735 0.1638 0.2613 0.1571 0.2613 0.2741*
Vehicle 0.3207 0.0715 0.3207 0.0885 0.0715 0.4442* 0.3259 0.3259 0.3259
Penbased 0.1103 0.1057 0.7027 0.1057 0.1057 0.6874 0.6070 0.6070 0.7027
Satimage 0.1063 0.0549 0.4314 0.0614 0.0549 0.5997 0.4314 0.7179 0.7464*
Vowel 0.1589 0.1149 0.5099 0.1172 0.1149 0.4716 0.1238 0.3472 0.5682*
Texture 0.2806 0.2584 0.4638 0.2584 0.5977 0.7361 0.7236 0.4638 0.7457*
Segment 0.0452 0.0452 0.0452 0.0452 0.0763 0.5890 0.6370* 0.6218 0.6218
BreastTissue 0.3998 0.0929 0.2581 0.0929 0.1608 0.4301 0.4301 0.4301 0.5479*
Balance 0.1339 0.4375 0.4544 0.4544 0.4449 0.4289 0.1171 0.1171 0.4696*
Pageblocks 0.3709 0.3407 0.3602 0.3709 0.3407 0.2195 0.2883 0.2883 0.4780*
Shuttle 0.4597 0.2818 0.2818 0.2818 0.2818 0.3377 0.7495 0.2818 0.7495
User knowledge modeling 0.0777 0.0997 0.0777 0.0997 0.0777 0.2771 0.2240 0.0777 0.4603*
Urban land cover 0.3913 0.1631 0.1949 0.1949 0.1710 0.6009 0.6612 0.4760 0.8311*
Image segmentation 0.4641 0.1151 0.2841 0.1151 0.1151 0.6083 0.6799 0.6083 0.8577*
Wall-following robot navigation 0.0622 0.0539 0.0622 0.0539 0.0539 0.5357 0.1139 0.0539 0.5362*
Foresttypes 0.2939 0.0412 0.0412 0.0583 0.0412 0.3407 0.3407 0.0412 0.8191*
Wdbc 0.1190 0.0971 0.0971 0.0971 0.0971 0.8919 0.0971 0.0971 0.9004*
Sonar 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.5902*
Optdigits 0.1426 0.0942 0.1151 0.0942 0.0942 0.6698 0.6518 0.1185 0.7493*
Hepatobiliary disorders 0.0504 0.0357 0.0504 0.0504 0.0504 0.3022 0.3022 0.0357 0.4693*
Banknote 0.2065 0.0000 0.5432 0.0000 0.0000 0.7005 0.5341 0.5432 0.8354*
No. best cases 1 0 1 0 0 3 2 2 28
No. best cases (without CCR-CV) 6 1 4 3 1 16 9 6

Table 13
Relative performance improvement of CCR-CV over the other clustering validity measures (the values in parentheses are the p-values for the t-test).

DB XB RT CS Dunn CH Silhouette PBM

ACC 79.05% (<0.001) 114.72% (<0.001) 66.63% (<0.001) 113.73% (<0.001) 108.62% (<0.001) 10.95% (<0.001) 26.92% (<0.001) 42.44% (<0.001)
BACC 29.80% (<0.001) 39.53% (<0.001) 29.17% (<0.001) 39.27% (<0.001) 38.32% (<0.001) 8.48% (<0.001) 19.57% (<0.001) 23.36% (<0.001)
BCR 304.93% (<0.001) 561.22% (<0.001) 355.80% (<0.001) 510.04% (<0.001) 514.97% (<0.001) 28.19% (<0.001) 131.92% (<0.001) 320.22% (<0.001)

Based on the experimental results on artificial and real datasets, we conclude the following. First, the proposed CCR-CV demonstrates superior ability to determine the optimal clustering structure by assigning appropriate weights to each individual validity measure when solving the LP problem defined by DEA. It is empirically supported that CCR-CV can identify the inherent structure of not only well-separated spherical clusters but also arbitrarily shaped clusters in the artificial datasets. Moreover, CCR-CV can determine a cluster structure more consistent with the actual class distributions than the individual clustering validation indices can. Based on the performance measures commonly employed in classification tasks, CCR-CV yields the best scores for the majority of the datasets, demonstrating a significant relative improvement.

Table 14 shows the total elapsed time to compute each clustering validity index for all DMUs for the real datasets. These results are similar to the time-complexity results for the synthetic datasets. The total elapsed time for CCR-CV is highly dependent on the validity index with the highest time complexity (CS), and the DEA part of CCR-CV did not take much time (less than 0.03 s). The relative time increase of CCR-CV compared to CS is 1.06% on average (maximum of 1.89% and minimum of 0.47%).

6. Conclusion

In this paper, we proposed a new clustering validity index by integrating eight different clustering validity indices. To allocate the appropriate combining weights to the individual indices under different circumstances, we employed the CCR model in the DEA approach and formed an integrated validity index as the ratio of the four validity measures pursuing maximization to the four validity measures pursuing minimization. The experimental results confirmed that the proposed CCR-CV could not only determine the inherent cluster structure but also divide the datasets into more homogeneous clusters.

Although the effectiveness of the proposed CCR-CV is empirically supported, there are some limitations of the current research, which lead us to future research directions. First, because the DEA model allows more than one DMU to have an efficiency score of one, CCR-CV can assign the highest efficiency score, i.e., "1," not only to the optimal cluster structure but also to other cluster structures. It would be more practically useful if the relative superiority among the cluster structures with the equal highest CCR-CV scores were determined. Secondly, we believe that the eight internal measures employed were sufficiently diversified with regard to how they measure the compactness within a cluster and the separation between clusters. However, there could be other cluster validity measures that are individually effective and collectively diversified when combined with the current indices. If these can be identified, the coverage of CCR-CV could be extended and improved practical usefulness would be secured.
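The CCR efficiency score underlying CCR-CV can be sketched as a linear program in multiplier form: for each DMU, the minimization-type validity measures play the role of DEA inputs and the maximization-type measures the role of outputs. The snippet below is a minimal one-stage illustration with SciPy; the paper's exact formulation, including any lower bounds on the weights, may differ:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """CCR efficiency of DMU o.
    X: (n, m) inputs (validity measures to be minimized),
    Y: (n, s) outputs (validity measures to be maximized).
    Linearized model: max u'y_o  s.t.  v'x_o = 1,
    u'y_j - v'x_j <= 0 for all DMUs j, and u, v >= 0."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.concatenate([np.zeros(m), -Y[o]])             # linprog minimizes, so negate
    A_ub = np.hstack([-X, Y])                            # u'y_j - v'x_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([X[o], np.zeros(s)])[None, :]  # v'x_o = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (m + s))
    return -res.fun                                      # efficiency score in (0, 1]
```

With a single input and output, a DMU producing one unit of output from one unit of input scores 1, while a DMU needing two units of input for the same output scores 0.5; several DMUs may tie at 1, which is exactly the limitation discussed in the conclusion.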

Table 14
Elapsed time (seconds) for computing each cluster validity index for the real datasets (the time in the last column is the time required to solve the DEA optimization problem).

Time complexity DB XB RT CS Dunn CH Silhouette PBM CCR-CV (Total) CCR-CV (DEA)

Wine 0.08 0.06 0.06 24.38 0.06 0.06 0.05 0.06 24.83 0.02
Seeds 0.08 0.08 0.08 27.31 0.08 0.08 0.09 0.08 27.90 0.02
Glass 0.08 0.08 0.08 52.22 0.09 0.08 0.09 0.08 52.83 0.03
Iris 0.05 0.04 0.03 13.66 0.05 0.05 0.05 0.05 14.01 0.03
Newthyroid 0.08 0.08 0.08 31.12 0.07 0.09 0.08 0.06 31.67 0.01
Ecoli 0.15 0.16 0.15 113.28 0.16 0.15 0.15 0.16 114.37 0.01
Abalone 0.48 0.50 0.50 302.79 0.50 0.52 0.58 0.55 306.44 0.02
Yeast 0.55 0.50 0.51 369.72 0.50 0.49 0.50 0.50 373.31 0.04
Winequality red 0.53 0.54 0.55 347.00 0.53 0.52 0.50 0.52 350.70 0.01
Winequality white 0.52 0.53 0.52 346.33 0.53 0.50 0.54 0.52 350.01 0.02
Vehicle 1.22 1.19 1.21 901.15 1.24 1.21 1.24 1.23 909.70 0.01
Penbased 0.73 0.73 0.72 435.72 0.71 0.71 0.71 0.70 440.74 0.01
Satimage 0.99 0.99 0.99 564.73 0.98 0.99 1.02 1.00 571.70 0.01
Vowel 1.40 1.39 1.38 948.16 1.36 1.36 1.44 1.36 957.89 0.01
Texture 0.76 0.77 0.75 547.42 0.77 0.78 0.78 0.78 552.82 0.01
Segment 0.88 0.83 0.85 590.54 0.82 0.82 0.84 0.83 596.42 0.01
BreastTissue 0.03 0.03 0.03 17.71 0.03 0.03 0.03 0.03 17.93 0.01
Balance 0.49 0.49 0.49 241.30 0.49 0.47 0.47 0.49 244.71 0.02
Pageblocks 0.47 0.44 0.43 287.47 0.45 0.44 0.44 0.44 290.60 0.02
Shuttle 0.47 0.47 0.49 322.29 0.47 0.45 0.47 0.47 325.60 0.02
User knowledge modeling 0.22 0.22 0.23 104.00 0.27 0.24 0.23 0.22 105.65 0.02
Urban land cover 0.22 0.24 0.22 264.76 0.22 0.22 0.23 0.23 266.37 0.03
Image segmentation 0.09 0.09 0.09 72.29 0.09 0.09 0.09 0.10 72.94 0.01
Wall-following robot navigation 0.56 0.56 0.57 330.35 0.55 0.56 0.58 0.54 334.29 0.02
Foresttypes 0.55 0.56 0.57 348.78 0.56 0.56 0.56 0.57 352.72 0.01
Wdbc 0.71 0.69 0.69 487.58 0.67 0.65 0.69 0.69 492.39 0.02
Sonar 0.18 0.17 0.17 75.72 0.18 0.17 0.17 0.17 76.95 0.02
Optdigits 1.12 1.12 1.11 875.90 1.12 1.11 1.15 1.14 883.79 0.02
Hepatobiliary disorders 0.40 0.42 0.42 251.77 0.41 0.41 0.42 0.40 254.66 0.01
Banknote 0.58 0.60 0.60 347.13 0.59 0.57 0.60 0.58 351.26 0.01

Acknowledgements

This research was supported by (1) Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2016R1D1A1B03930729), (2) National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT; Ministry of Science, ICT) (NRF-2015R1A2A2A04007359), and (3) Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-00349, Development of Media Streaming system with Machine Learning using QoE (Quality of Experience)).

Appendix A. Clustering validity performance without stratified sampling

See Tables A1, A2a and A2b.

Table A1
ACC of each clustering validity index for each sampled real dataset without stratified sampling (bold number denotes the best results with the asterisk (*) being the best
without ties).

ACC DB XB RT Dunn CH Silhouette PBM CCR-CV

Abalone (4177) 0.4643 0.3659 0.3659 0.3659 0.5299 0.5299* 0.3659 0.5278
Winequality white (4898) 0.4500 0.4492 0.4492 0.4492 0.4761 0.4755 0.4490 0.4780*
Penbased (10,992) 0.1108 0.1107 0.6234 0.1058 0.6775 0.3993 0.3976 0.7586*
Satimage (6435) 0.7487* 0.2389 0.5308 0.2389 0.6706 0.2398 0.6777 0.7484
Texture (5500) 0.0931 0.0927 0.3820 0.0927 0.5878 0.5878 0.3820 0.6329*
Pageblocks (5472) 0.8988 0.8988 0.8988 0.8988 0.9128 0.9145 0.9004 0.9145
Wall-following robot navigation (5456) 0.4195 0.4043 0.4049 0.4043 0.4855 0.4098 0.4043 0.4855
Optdigits (5620) 0.1069 0.1046 0.1071 0.1046 0.6023 0.6972 0.1069 0.7183*
No. best cases 1 0 0 0 1 2 0 6
No. best cases (without CCR-CV) 1 0 0 0 4 4 0

Table A2a
BACC of each clustering validity index for each sampled real dataset without stratified sampling (bold number denotes the best results with the asterisk (*) being the best
without ties).

BACC DB XB RT Dunn CH Silhouette PBM CCR-CV

Abalone (4177) 0.5904 0.5002 0.5002 0.5002 0.6383 0.6373 0.5002 0.6450*
Winequality white (4898) 0.5063 0.5037 0.5037 0.5037 0.5149 0.5146 0.5036 0.5231*
Penbased (10,992) 0.5038 0.5038 0.7918 0.5009 0.8223 0.6625 0.6616 0.8659*
Satimage (6435) 0.7824 0.5005 0.6889 0.5005 0.7477 0.5014 0.7889 0.8229*
Texture (5500) 0.5012 0.5010 0.6601 0.5010 0.7733 0.7733 0.6601 0.7981*
Pageblocks (5472) 0.5050 0.5050 0.5050 0.5050 0.5832 0.6563 0.5476 0.6563


Wall-following robot navigation (5456) 0.5190 0.5001 0.5004 0.5001 0.5482 0.5137 0.5001 0.5482
Optdigits (5620) 0.5028 0.5016 0.5029 0.5016 0.7794 0.8321 0.5028 0.8438*
No. best cases 0 0 0 0 1 1 0 8
No. best cases (without CCR-CV) 0 0 0 0 5 3 1

Table A2b
BCR of each clustering validity index for each sampled real dataset without stratified sampling (bold number denotes the best results with the asterisk (*) being the best
without ties).

BCR DB XB RT Dunn CH Silhouette PBM CCR-CV

Abalone (4177) 0.5493 0.0157 0.0157 0.0157 0.4693 0.4674 0.0157 0.6012*
Winequality white (4898) 0.1006 0.0396 0.0396 0.0396 0.1180 0.1171 0.0347 0.1548*
Penbased (10,992) 0.0504 0.0491 0.6378 0.0262 0.7580 0.3920 0.3911 0.8517*
Satimage (6435) 0.6145 0.0173 0.4335 0.0173 0.5742 0.0370 0.7011 0.7374*
Texture (5500) 0.0304 0.0251 0.3918 0.0251 0.6518 0.6518 0.3918 0.7279*
Pageblocks (5472) 0.0837 0.0837 0.0837 0.0837 0.3108 0.4355 0.2002 0.4355
Wall-following robot navigation (5456) 0.1958 0.0098 0.0197 0.0098 0.2969 0.1515 0.0098 0.2969
Optdigits (5620) 0.0480 0.0373 0.0522 0.0373 0.6283 0.7701 0.0480 0.7828*
No. best cases 0 0 0 0 1 1 0 8
No. best cases (without CCR-CV) 0 0 0 0 5 3 1

References

[1] L. Kaufman, P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley, Hoboken, NJ, USA, 1990.
[2] D. Zakrzewska, J. Murlewski, Clustering algorithms for bank customer segmentation, 5th International Conference on Intelligent Systems Design and Applications (ISDA'05) (2005) 197–202.
[3] F.T. Piller, D. Walcher, Toolkits for idea competitions: a novel method to integrate users in new product development, R&D Manage. 36 (2006) 307–318.
[4] M. Halkidi, Y. Batistakis, M. Vazirgiannis, Cluster validity methods: part I, SIGMOD Rec. 31 (2002) 40–45.
[5] M. Halkidi, Y. Batistakis, M. Vazirgiannis, Clustering validity checking methods: part II, SIGMOD Rec. 31 (2002) 19–27.
[6] Y. Liu, Z. Li, H. Xiong, X. Gao, J. Wu, Understanding of internal clustering validation measures, 2010 IEEE International Conference on Data Mining (2010) 911–916.
[7] M. Halkidi, Y. Batistakis, M. Vazirgiannis, On clustering validation techniques, J. Intell. Inf. Syst. 17 (2001) 107–145.
[8] S. Askari, N. Montazerin, M.H. Fazel Zarandi, Generalized Possibilistic Fuzzy C-Means with novel cluster validity indices for clustering noisy data, Appl. Soft Comput. 53 (2017) 262–283.
[9] S. Saha, S. Bandyopadhyay, Some connectivity based cluster validity indices, Appl. Soft Comput. 12 (2012) 1555–1565.
[10] H.-L. Shieh, Robust validity index for a modified subtractive clustering algorithm, Appl. Soft Comput. 22 (2014) 47–59.
[11] E. Rendón, I.M. Abundez, C. Gutierrez, S.D. Zagal, A. Arizmendi, E.M. Quiroz, H.E. Arzate, A comparison of internal and external cluster validation indexes, in: Proceedings of the 2011 American Conference on Applied Mathematics and the 5th WSEAS International Conference on Computer Engineering and Applications, World Scientific and Engineering Academy and Society (WSEAS), Puerto Morelos, Mexico, 2011, pp. 158–163.
[12] L.J. Deborah, R. Baskaran, A. Kannan, A survey on internal validity measure for cluster validation, Int. J. Comput. Sci. Eng. Surv. 1 (2010) 85–102.
[13] O. Arbelaitz, I. Gurrutxaga, J. Muguerza, J.M. Pérez, I. Perona, An extensive comparative study of cluster validity indices, Pattern Recogn. 46 (2013) 243–256.
[14] J. Wu, J. Chen, H. Xiong, M. Xie, External validation measures for K-means clustering: a data distribution perspective, Expert Syst. Appl. 36 (2009) 6050–6061.
[15] Y. Lei, J.C. Bezdek, S. Romano, N.X. Vinh, J. Chan, J. Bailey, Ground truth bias in external cluster validity indices, Pattern Recogn. 65 (2017) 58–70.
[16] B. Wu, B.-G. Hu, Q. Ji, A Coupled Hidden Markov Random Field model for simultaneous face clustering and tracking in videos, Pattern Recogn. 64 (2017) 361–373.
[17] M. Reiter, P. Rota, F. Kleber, M. Diem, S. Groeneveld-Krentz, M. Dworzak, Clustering of cell populations in flow cytometry data using a combination of Gaussian mixtures, Pattern Recogn. 60 (2016) 1029–1040.
[18] M. Fernández-Delgado, E. Cernadas, S. Barro, D. Amorim, Do we need hundreds of classifiers to solve real world classification problems? J. Mach. Learn. Res. 15 (2014) 3133–3181.
[19] R. Caruana, A. Niculescu-Mizil, An empirical comparison of supervised learning algorithms, in: Proceedings of the 23rd International Conference on Machine Learning, ACM, Pittsburgh, Pennsylvania, USA, 2006, pp. 161–168.
[20] G. Seni, J. Elder, Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions, Morgan and Claypool Publishers, 2010.
[21] D. Opitz, R. Maclin, Popular ensemble methods: an empirical study, J. Artif. Intell. Res. 11 (1999) 169–198.
[22] P.A. Jaskowiak, D. Moulavi, A.C.S. Furtado, R.J.G.B. Campello, A. Zimek, J. Sander, On strategies for building effective ensembles of relative clustering validity criteria, Knowl. Inf. Syst. 47 (2016) 329–354.
[23] G. Kou, Y. Peng, G. Wang, Evaluation of clustering algorithms for financial risk analysis using MCDM methods, Inf. Sci. 275 (2014) 1–12.
[24] A. Charnes, W.W. Cooper, E. Rhodes, Measuring the efficiency of decision making units, Eur. J. Oper. Res. 2 (1978) 429–444.
[25] D.L. Davies, D.W. Bouldin, A cluster separation measure, IEEE Trans. Pattern Anal. Mach. Intell. 1 (1979) 224–227.
[26] X.L. Xie, G. Beni, A validity measure for fuzzy clustering, IEEE Trans. Pattern Anal. Mach. Intell. 13 (1991) 841–847.
[27] S. Ray, R.H. Turi, Determination of number of clusters in k-means clustering and application in colour image segmentation, The 4th International Conference on Advances in Pattern Recognition and Digital Techniques (1999) 137–143.
[28] S. Liu, Y.L. Huang, A new clustering validity index for evaluating arbitrary shape clusters, 2007 International Conference on Machine Learning and Cybernetics (2007) 3969–3974.
[29] J.C. Dunn, Well-separated clusters and optimal fuzzy partitions, J. Cybern. 4 (1974) 95–104.
[30] T. Caliński, J. Harabasz, A dendrite method for cluster analysis, Commun. Stat. 3 (1974) 1–27.
[31] P.J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, J. Comput. Appl. Math. 20 (1987) 53–65.
[32] M.K. Pakhira, S. Bandyopadhyay, U. Maulik, Validity index for crisp and fuzzy clusters, Pattern Recogn. 37 (2004) 487–501.
[33] D. Bouyssou, Using DEA as a tool for MCDM: some remarks, J. Oper. Res. Soc. 50 (1999) 974–978.
[34] R. Ramanathan, Data envelopment analysis for weight derivation and aggregation in the analytic hierarchy process, Comput. Oper. Res. 33 (2006) 1289–1307.
[35] H. Lee, C. Kim, Benchmarking of service quality with data envelopment analysis, Expert Syst. Appl. 41 (2014) 3761–3768.
[36] M. Zeleny, J.L. Cochrane, Multiple Criteria Decision Making, University of South Carolina Press, 1973.
[37] V. Belton, S.P. Vickers, Demystifying DEA: a visual interactive approach based on multiple criteria analysis, J. Oper. Res. Soc. 44 (1993) 883–896.
[38] J. Doyle, R. Green, Data envelopment analysis and multiple criteria decision making, Omega 21 (1993) 713–715.
[39] T.J. Stewart, Relationships between data envelopment analysis and multicriteria decision analysis, J. Oper. Res. Soc. 47 (1996) 654–665.
[40] W.D. Cook, K. Tone, J. Zhu, Data envelopment analysis: prior to choosing a model, Omega 44 (2014) 1–4.
[41] W.W. Cooper, L.M. Seiford, K. Tone, Introduction to Data Envelopment Analysis and Its Uses with DEA-Solver Software and References, Springer, US, 2006.
[42] W. Duch, K. Grudziński, Ensembles of similarity-based models, in: International Symposium on the Intelligent Information Systems X, Springer-Verlag, Zakopane, Poland, 2001, pp. 75–85.
[43] J. MacQueen, Some methods for classification and analysis of multivariate observations, in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, Berkeley, Calif., 1967, pp. 281–297.
[44] L. Kaufman, P.J. Rousseeuw, Partitioning around medoids (Program PAM), in: Finding Groups in Data, John Wiley & Sons, Inc., 2008, pp. 68–125.
[45] K. Chidananda Gowda, G. Krishna, Agglomerative clustering using the concept of mutual nearest neighbourhood, Pattern Recogn. 10 (1978) 105–112.
[46] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, AAAI Press, Portland, Oregon, 1996, pp. 226–231.
[47] O. Grygorash, Y. Zhou, Z. Jorgensen, Minimum spanning tree based clustering algorithms, 2006 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'06) (2006) 73–81.