Distributed k-Means Algorithm and Fuzzy c-Means Algorithm for Sensor Networks Based on Multiagent Consensus Theory
Jiahu Qin, Member, IEEE, Weiming Fu, Huijun Gao, Fellow, IEEE, and Wei Xing Zheng, Fellow, IEEE

Abstract—This paper is concerned with developing a distributed k-means algorithm and a distributed fuzzy c-means algorithm for wireless sensor networks (WSNs) where each node is equipped with sensors. The underlying topology of the WSN is supposed to be strongly connected. The consensus algorithm in multiagent consensus theory is utilized to exchange the measurement information of the sensors in the WSN. To obtain a faster convergence speed as well as a higher possibility of reaching the global optimum, a distributed k-means++ algorithm is first proposed to find the initial centroids before executing the distributed k-means algorithm and the distributed fuzzy c-means algorithm. The proposed distributed k-means algorithm is capable of partitioning the data observed by the nodes into measure-dependent groups which have small in-group and large out-group distances, while the proposed distributed fuzzy c-means algorithm is capable of partitioning the data observed by the nodes into different measure-dependent groups with degrees of membership ranging from 0 to 1. Simulation results show that the proposed distributed algorithms can achieve almost the same results as those given by the centralized clustering algorithms.

Index Terms—Consensus, distributed algorithm, fuzzy c-means, hard clustering, k-means, soft clustering, wireless sensor network (WSN).

Manuscript received June 24, 2015; revised October 25, 2015 and January 24, 2016; accepted February 2, 2016. This work was supported in part by the National Natural Science Foundation of China under Grant 61333012 and Grant 61473269, in part by the Fundamental Research Funds for the Central Universities under Grant WK2100100023, in part by the Anhui Provincial Natural Science Foundation under Grant 1408085QF105, and in part by the Australian Research Council under Grant DP120104986. This paper was recommended by Associate Editor R. Selmic. J. Qin and W. Fu are with the Department of Automation, University of Science and Technology of China, Hefei 230027, China (e-mail: jhqin@ustc.edu.cn; fwm1993@mail.ustc.edu.cn). H. Gao is with the Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, China (e-mail: hjgao@hit.edu.cn). W. X. Zheng is with the School of Computing, Engineering and Mathematics, Western Sydney University, Sydney, NSW 2751, Australia (e-mail: w.zheng@westernsydney.edu.au). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCYB.2016.2526683

I. INTRODUCTION

A WIRELESS sensor network (WSN) consists of spatially distributed autonomous sensors with limited computing power, storage space, and communication range, deployed to monitor physical or environmental conditions, such as temperature, sound, and pressure, and to cooperatively pass their data through the network to a main location. The development of WSNs was motivated by military applications such as battlefield surveillance; today such networks are used in many industrial applications, for instance, industrial and manufacturing automation, machine health monitoring, and so on [1]–[6].

When using a WSN as an exploratory infrastructure, it is often desired to infer hidden structures in the distributed data collected by the sensors. For a scene segmentation and monitoring application, nodes in the WSN need to collaborate with each other in order to segment the scene, identify the object of interest, and thereafter classify it. Outlier detection is also a typical task for WSNs, with applications in monitoring chemical spillage and intrusion detection [14].

Many centralized clustering algorithms have been proposed to solve the data clustering problem in order to grasp the fundamental aspects of the ongoing situation [7]–[11]. However, in a WSN, large amounts of data are dispersedly collected in geographically distributed nodes over the network. Due to the limited energy, communication, computation, and storage resources, centralizing the whole distributed data set at one fusion node to perform centralized clustering may be impractical. Thus, there is a great demand for distributed data clustering algorithms in which the global clustering problem is solved at each individual node based on local data and limited information exchanges among nodes. In addition, compared with centralized clustering, distributed clustering is more flexible and robust to node and/or link failures.

Clustering can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find one. One of the most popular notions of a cluster is a group with small in-group and large out-group differences [7]–[11]. Different measures of difference (or similarity) may be used to place items into clusters, including distance and intensity. A great number of algorithms [7]–[10] have been proposed to deal with the clustering problem based on distance. Among them, the k-means algorithm is the most popular hard clustering algorithm, which features simple and fast-convergent iterations [7].

Parallel and distributed implementations of the k-means algorithm have arisen most often because of its high efficiency in dealing with large data sets. The algorithm proposed in [12] is a parallel version of the k-means algorithm for multiprocessors with distributed memory, which has no communication constraints.
In [13], observations are assigned to distributed stations until all observations in a station belong to a cluster based on the partition method of the k-means algorithm. It is apparent that a large number of observations need to be exchanged, which would make the network burden heavy. A kind of distributed k-means algorithm is proposed in [14] for peer-to-peer networks, in which each node gathers basic information about itself and then disseminates it to all the others. It can be observed that the algorithm proposed in [14] does not scale well, since a very large amount of data needs to be stored when the number of nodes is huge. Forero et al. [15], [16] treated the clustering problem performed by distributed k-means algorithms as a distributed optimization problem. The duality theory and decomposition methods, which play a major role in the field of distributed optimization [17], are utilized to derive the distributed k-means algorithms. However, an additional parameter is introduced, and furthermore, the clustering results may not converge when an unsuitable parameter is selected. In [18], the distributed consensus algorithm from the multiagent systems community [19] is applied to help develop a distributed k-means algorithm. Although the algorithm proposed in [18] is feasible for accomplishing the clustering work in a distributed manner without introducing any parameters, there are still some issues that need to be noted. The first issue is the high computational complexity. The second issue is that it is impossible for the algorithm to work when the network is a directed graph, which does happen in practice since different sensors may have different sensing radii. Finally, the hard clustering algorithm is not always suitable, because in some situations, such as soil science [20] and image processing [21], overlapping classes may be useful in dealing with the observations, for which soft clustering is required. Certainly, there are also some other techniques to tackle distributed clustering problems which are not based on the k-means algorithm. For example, [22] presents a type of distributed information theoretic clustering algorithm based on the centralized one. In [23], distributed clustering and learning schemes over networks are considered, in which different learning tasks are completed by different clusters.

We consider in this paper the framework in which the nodes in the WSN hold a (possibly high-dimensional) measure of information and the network is a strongly connected directed graph. To be specific, we aim to extend the algorithms proposed in [18] from the following perspectives.

1) In order to obtain a faster convergence speed as well as a higher possibility of reaching the global optimum, the k-means++ algorithm [24], a kind of centralized algorithm to find the initial centroids of the k-means algorithm, is extended to the distributed setting.

2) The distributed k-means algorithm proposed in this paper has a relatively lower computational complexity than the one in [18] and can be used to deal with the condition that the network topology is a directed graph, irrespective of whether the underlying topology of each cluster is connected or not.

3) A distributed fuzzy c-means clustering algorithm is provided for distributed soft clustering.

The remainder of this paper is organized as follows. Some graph theory and the consensus algorithm are introduced in Section II. The k-means clustering algorithm is reviewed and a distributed k-means algorithm is proposed in Section III, while a distributed fuzzy c-means algorithm is developed in Section IV. Section V provides simulation results to demonstrate the usefulness of the proposed algorithms. Finally, some conclusions are drawn in Section VI.

II. PRELIMINARIES

In order to provide a distributed implementation of the k-means algorithm, we need a tool to diffuse information among the nodes. The average-consensus, max-consensus, and min-consensus algorithms have proved effective in fusing local observations by means of one-hop communication. In this section, we first introduce some notation from graph theory and then the consensus algorithms.

A. Graph Theory

Let G = (V, E, A) be a weighted directed graph of order n with a set of nodes V = {1, . . . , n}, a set of edges E ⊂ V × V, and a weighted adjacency matrix A = [a_ij] ∈ R^{n×n} with nonnegative adjacency elements a_ij. If there is a directed edge from j to i in the directed graph G, then the edge is denoted by (i, j). The adjacency elements associated with the edges of the graph are positive, i.e., (i, j) ∈ E ⇔ a_ij > 0. The set of neighbors of node i is denoted by N_i = {j ∈ V : (i, j) ∈ E}.

A directed path from node i to node j of G is a sequence of edges of the form (i, i_1), (i_1, i_2), . . . , (i_{l−1}, i_l), (i_l, j), whose length is l + 1. The distance between two nodes is the length of the shortest path connecting them. A directed graph is strongly connected if there is a directed path from every node to every other node. The diameter D of a strongly connected graph is the maximum of the distances between all the nodes.

For a directed graph, the in-degree and out-degree of node i are, respectively, the sums of the weights of its incoming and outgoing edges. A directed graph is weight balanced if for all i ∈ V, Σ_{j=1}^n a_ij = Σ_{j=1}^n a_ji, i.e., the in-degree and the out-degree coincide.

An undirected graph is a special kind of directed graph in which a_ij = a_ji for all i, j ∈ V. Apparently, an undirected graph is always weight balanced.
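To make these notions concrete, the following self-contained Python sketch (ours, not from the paper; numpy is assumed available) stores a weighted digraph as its adjacency matrix, with the convention above that a_ij > 0 encodes an edge from j to i, and checks strong connectivity and weight balance on an assumed toy ring network.

import numpy as np

def in_neighbors(A, i):
    """N_i = {j : a_ij > 0}: nodes with an edge pointing to node i."""
    return np.flatnonzero(A[i] > 0)

def is_strongly_connected(A):
    """All pairs mutually reachable: (I + B)^n has no zero entry,
    where B is the Boolean edge-indicator matrix."""
    n = A.shape[0]
    reach = np.linalg.matrix_power((A > 0).astype(int) + np.eye(n, dtype=int), n)
    return bool((reach > 0).all())

def is_weight_balanced(A):
    """In-degree equals out-degree at every node: row sums match column sums."""
    return bool(np.allclose(A.sum(axis=0), A.sum(axis=1)))

# Assumed toy example: a 4-node directed ring, strongly connected and balanced.
A = np.array([[0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
print(is_strongly_connected(A), is_weight_balanced(A))  # True True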
B. Consensus Algorithm

Based on the notation given above, we review the consensus algorithms in this section. A large number of consensus algorithms have been proposed (see [19], [25]–[29] and the references therein), but we only introduce some simple ones here.

For a set of n nodes associated with a graph G, the discrete-time dynamics is described by

x_i(t + 1) = u_i(N_i ∪ {i}, t), x_i(0) = x_i0, i = 1, . . . , n (1)

where x_i(t), x_i0 ∈ R^d, and u_i(N_i ∪ {i}, t) is a function of the states of the nodes in the set N_i ∪ {i}. Here the dimension of the states of the nodes is supposed to be 1 for simplification. In fact, if the state is high-dimensional, then the same operation can be conducted on every component of the state individually.
Let X(x_10, . . . , x_n0) ∈ R be any function of the initial states of all the nodes. The X-problem is to seek a proper u_i(N_i ∪ {i}, t) such that lim_{t→∞} x_i(t) = X(x_10, . . . , x_n0) in a distributed way.

In the max-consensus problem, the nodes are required to converge to the maximum of the initial states, i.e., X(·) is the maximum of its arguments. When the directed graph is strongly connected, max-consensus can be reached [19] if the following control law is chosen:

u_i(N_i ∪ {i}, t) = max_{j∈N_i∪{i}} x_j(t), i = 1, . . . , n. (2)

Apparently, the maximum value can be spread through the network within D steps, the diameter of the network, which can be upper-bounded by n, the number of the nodes. Regarding the computational complexity, for node i we have |N_i| ∝ n, where |·| denotes the cardinality of a set; thus the computational complexity is O(n^2) for a node.

Similarly, when the directed graph is strongly connected, min-consensus can be reached [19] if the following control law is chosen:

u_i(N_i ∪ {i}, t) = min_{j∈N_i∪{i}} x_j(t), i = 1, . . . , n (3)

and the computational complexity for one node is also O(n^2).

Remark 1: In order to perform the max-consensus and min-consensus algorithms, every node needs to know the number of nodes n in the network. This is a strong assumption that lacks robustness (e.g., some nodes could run out of battery) and prevents scaling (i.e., if a new node is added into the network, every other node should update the number of nodes n). This problem can be alleviated by using a semi-distributed algorithm for estimating n (see the work in [30]). On the other hand, if an upper bound on n is provided, which is a much milder condition, then the max-consensus and min-consensus algorithms can also perform well.
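A minimal Python sketch of the control law (2) — the names and the toy graph are ours: each node repeatedly replaces its value with the maximum over N_i ∪ {i}, so after at most D (≤ n) rounds every node holds the global maximum. Swapping max for min gives (3).

import numpy as np

def max_consensus(x0, A, rounds):
    """Iterate (2): x_i(t+1) = max over N_i ∪ {i} of x_j(t)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(rounds):
        x = np.array([max(x[i], x[np.flatnonzero(A[i] > 0)].max(initial=-np.inf))
                      for i in range(len(x))])
    return x

# Directed 4-node ring (diameter 3): n rounds are always enough.
A = np.array([[0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
print(max_consensus([0.3, 0.9, 0.1, 0.5], A, rounds=4))  # [0.9 0.9 0.9 0.9]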
In the average-consensus problem, the nodes are required to converge to the average of their initial states, i.e., X(·) is the mean of its arguments. When the directed graph is strongly connected and weight balanced, average-consensus can be reached [19] if the following control law is chosen:

u_i(N_i ∪ {i}, t) = x_i(t) + τ Σ_{j∈N_i} a_ij (x_j(t) − x_i(t)) (4)

where the parameter τ is assumed to satisfy τ ≤ 1/max_i(Σ_{j≠i} a_ij). Suppose that the number of iterations is t_max; then the computational complexity is O(n t_max) for a node since |N_i| ∝ n.

Remark 2: Note that the result of the average-consensus algorithm is only asymptotically correct. This is different from the max-consensus/min-consensus algorithms, which provide an exact result after a finite number of iterations. In the following, we will introduce the finite-time average-consensus algorithm, which can get the exact average in a finite number of steps.

Introduce W = [w_ij] ∈ R^{N×N} with

w_ij = τ a_ij if i ≠ j, and w_ii = 1 − τ Σ_{k=1}^N a_ik.

Denote by q(t) the minimal polynomial of the matrix W, which is the unique monic polynomial of smallest degree such that q(W) = 0. Suppose that the minimal polynomial of W has degree σ + 1 (σ + 1 ≤ N) and takes the form q(t) = t^{σ+1} + α_σ t^σ + · · · + α_1 t + α_0.

In the finite-time average-consensus problem, the nodes are required to get the exact average of their initial states after finitely many steps. When the directed graph is strongly connected and weight balanced, finite-time average-consensus can be reached within σ steps [31] by letting

x_i(σ + 1) = [x_i(σ) x_i(σ − 1) · · · x_i(0)] S / ([1 1 · · · 1] S) (5)

with

S = [1, 1 + α_σ, 1 + α_σ + α_{σ−1}, . . . , 1 + Σ_{j=1}^σ α_j]^T

if the control law (4) is chosen, since

lim_{t→∞} x_i(t) = [x_i(σ) x_i(σ − 1) · · · x_i(0)] S / ([1 1 · · · 1] S).

The distributed method to obtain α_0, . . . , α_σ is also provided in [31]. Similarly, the corresponding computational complexity is O(n^2) for a node.

In the following, we will denote by

x = max-consensus(x_i, N_i, n)

and

x = min-consensus(x_i, N_i, n)

the execution of n iterations of the max-consensus procedure described by (2) and of the min-consensus procedure described by (3) at the ith node, respectively, and also denote by

x = average-consensus(x_i, N_i, a_i, n)

the execution of n iterations of the finite-time average-consensus procedure described by (4) at the ith node, where x_i is the initial state of node i, N_i is the set of neighbors of node i, a_i is the stack of the weights of node i's ingoing edges, and n is the number of the nodes in the network. The returned value x of each of the first two algorithms is the final state of node i, and the returned value x of the last algorithm is obtained by (5) for node i, which is the same for every node after running each algorithm.

It is a strong assumption that a directed graph is weight balanced. However, some weight balancing techniques [32], [33] can be utilized. In this paper, we employ the mirror imbalance-correcting algorithm presented in [33] to make the network weight balanced; its basic idea is that each node changes the weights of its outgoing edges according to a certain strategy. In the worst-case scenario, the computational complexity is O(n^4) for each node.
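The rule (5) can be checked numerically with the sketch below (ours). The minimal-polynomial coefficients α_0, . . . , α_σ are recovered centrally here from the distinct eigenvalues of W — valid when W is diagonalizable, as for the assumed toy ring — whereas in an actual deployment the nodes would obtain them distributively as in [31].

import numpy as np

# Assumed toy network: weight-balanced, strongly connected 4-node directed ring.
A = np.array([[0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
n = A.shape[0]
tau = 0.5 / A.sum(axis=1).max()                      # satisfies tau <= 1/max_i sum_j a_ij
W = np.eye(n) + tau * (A - np.diag(A.sum(axis=1)))   # w_ij = tau*a_ij (i != j), w_ii = 1 - tau*sum_k a_ik

# Minimal polynomial q(t) of W from its distinct eigenvalues (W diagonalizable here).
eigs = np.linalg.eigvals(W)
distinct = eigs[np.unique(np.round(eigs, 8), return_index=True)[1]]
q = np.real(np.poly(distinct))                       # [1, alpha_sigma, ..., alpha_0]
sigma = len(q) - 2

# S = [1, 1+alpha_sigma, 1+alpha_sigma+alpha_{sigma-1}, ..., 1+sum_{j=1}^{sigma} alpha_j]
S = 1.0 + np.concatenate(([0.0], np.cumsum(q[1:-1])))

# Run sigma steps of x(t+1) = W x(t), then apply (5) at every node simultaneously.
x0 = np.array([3.0, -1.0, 4.0, 2.0])
history = [x0]
for _ in range(sigma):
    history.append(W @ history[-1])
stack = np.stack(history[::-1])                      # rows: x(sigma), x(sigma-1), ..., x(0)
print((S @ stack) / S.sum(), x0.mean())              # every node recovers the exact average

The vector S collects the coefficients of q(t)/(t − 1), so q(W) = 0 forces the weighted combination in (5) onto the consensus eigenspace, which for a weight-balanced graph carries exactly the average.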
III. DISTRIBUTED k-MEANS ALGORITHM

In this section, the distributed k-means algorithm will be proposed. For this purpose, we first briefly introduce the centralized k-means algorithm, which has been very well developed in the existing literature.

A. Introduction to the Centralized k-Means Algorithm

Given a set of observations {x_1, x_2, . . . , x_n}, where each observation is a d-dimensional real-valued vector, the k-means algorithm [7] aims to partition the n observations into k (≤ n) sets S = {S_1, S_2, . . . , S_k} so as to minimize the within-cluster sum of squares (WCSS) function. In other words, its objective is to find

argmin_S Σ_{j=1}^k Σ_{x_i∈S_j} ||x_i − c_j||^2 (6)

where c_j is the representation of cluster j, generally the centroid of the points in S_j.

The algorithm uses an iterative refinement technique. Given an initial set of k centroids c_1(1), . . . , c_k(1), the algorithm proceeds by alternating between an assignment step and an update step as follows.

During the assignment step, assign each observation to the cluster characterized by the nearest centroid, that is,

S_i(T) = {x_p : ||x_p − c_i(T)||^2 ≤ ||x_p − c_j(T)||^2, 1 ≤ j ≤ k}, i = 1, . . . , k (7)

where each x_p is assigned to exactly one cluster, even if it could be assigned to two or more of them. Apparently, this step minimizes the WCSS function.

During the update step, the centroid, say c_i(T + 1), of the observations in the new cluster is computed as follows:

c_i(T + 1) = (1/|S_i(T)|) Σ_{x_j∈S_i(T)} x_j, i = 1, . . . , k. (8)

Since the arithmetic mean is a least-squares estimator, this step also minimizes the WCSS function.

The algorithm converges if the centroids no longer change. The k-means algorithm can converge to a (local) optimum, while there is no guarantee that it converges to the global optimum [7]. Since for a given k the result of the above clustering algorithm depends solely on the initial centroids, a common practice in clustering the sensor observations is to execute the algorithm several times for different initial centroids and then select the best solution.

The computational complexity of the k-means algorithm is O(nkdM) [7], where n is the number of the d-dimensional vectors, k is the number of clusters, and M is the number of iterations before reaching convergence.
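For reference, here is a compact numpy sketch (ours) of the centralized iteration just described: the assignment step (7), the update step (8), and the WCSS objective (6).

import numpy as np

def kmeans(X, c0, max_iter=100):
    """Alternate assignment (7) and update (8) until the centroids stop moving."""
    c = c0.copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # assignment step: each observation joins the cluster of its nearest centroid
        labels = np.linalg.norm(X[:, None, :] - c[None, :, :], axis=2).argmin(axis=1)
        # update step: each centroid becomes the mean of its assigned observations
        c_new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else c[j]
                          for j in range(len(c))])
        if np.allclose(c_new, c):              # convergence: centroids no longer change
            break
        c = c_new
    return c, labels

def wcss(X, c, labels):
    """Within-cluster sum of squares, the objective in (6)."""
    return float(np.sum((X - c[labels]) ** 2))

Empty clusters, which (7) does not rule out, are handled in this sketch by simply retaining the old centroid.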
B. Choosing Initial Centroids Using the Distributed k-Means++ Algorithm

The choice of the initial centroids is the key to making the k-means algorithm work well. In the k-means algorithm, the partition result depends only on the initial centroids for a given k, and so does the distributed one. Thus, an effective distributed method for choosing the initial centroids is important. Here we provide a distributed implementation of the k-means++ algorithm [24], a centralized algorithm to find the initial centroids for the k-means algorithm. It is noted that k-means++ generally outperforms k-means in terms of both accuracy and speed [24].

Let D(x) denote the distance from observation x to the closest centroid that has already been chosen. The k-means++ algorithm [24] is executed as follows.

1) Choose randomly an observation from {x_1, x_2, . . . , x_n} as the first centroid c_1.

2) Take a new center c_j, choosing x from the observations with probability D(x)^2 / (Σ_{x∈{x_1,x_2,...,x_n}} D(x)^2).

3) Repeat step 2) until k centers have been taken altogether.

Consider that the n nodes are deployed in a Euclidean space of any dimension and suppose that each node has a limited sensing radius, which may be different for different nodes, therefore leading to the fact that the underlying topology of the WSN is directed. Each node is endowed with a real vector x_i ∈ R^d representing the observation. Here we assume that every node has its own unique identification (ID), and further that the underlying topology of the WSN is strongly connected. The detailed realization of the distributed k-means++ algorithm is shown in Algorithm 1.

Algorithm 1: Distributed k-Means++ Algorithm: ith Node

Data: x_i, n, k, N_i
Result: c(1)

\* Propagation of the first centroid *\
temp_i = random(0, 1);
temp = max-consensus(temp_i, N_i, n);
if temp_i == temp then
    x_ic = x_i;
else
    x_ic = [−∞, . . . , −∞]^T;
end
c_1(1) = max-consensus(x_ic, N_i, n);
for m = 2; m ≤ k; m++ do
    \* Propagation of the maximum value of all the nodes' distances to their nearest centroids *\
    d_i = min_{1≤j≤m−1} ||x_i − c_j(1)||;
    rand_i = d_i^2 × random(0, 1);
    randmax = max-consensus(rand_i, N_i, n);
    \* Propagation of the mth centroid *\
    if rand_i == randmax then
        x_ic = x_i;
    else
        x_ic = [−∞, . . . , −∞]^T;
    end
    c_m(1) = max-consensus(x_ic, N_i, n);
end
c(1) = [c_1(1)^T, . . . , c_k(1)^T]^T;
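The sketch below (ours) mirrors Algorithm 1 in Python, with every max-consensus call emulated by a global argmax — which is exactly what n rounds of (2) deliver on a strongly connected network. The tie-breaking device of Remark 3 below is omitted, since ties occur with probability zero here.

import numpy as np

def distributed_kmeanspp(X, k, rng):
    """Row i of X is node i's observation. Consensus is emulated globally:
    after n rounds of max-consensus every node would know the same winner."""
    n = X.shape[0]
    temp = rng.random(n)                   # temp_i = random(0, 1) at each node
    centroids = [X[temp.argmax()]]         # first centroid: observation of the winner
    for _ in range(2, k + 1):
        # d_i: distance from node i to its nearest already-fixed centroid
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        rand = d ** 2 * rng.random(n)      # rand_i = d_i^2 * random(0, 1)
        centroids.append(X[rand.argmax()]) # the winning node's observation spreads
    return np.array(centroids)

rng = np.random.default_rng(0)
X = rng.random((40, 2))                    # e.g., 40 nodes observing 2-D positions
print(distributed_kmeanspp(X, 4, rng))     # k = 4 initial centroids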
For node i, it knows its observation x_i, n (the number of the nodes in the network), k (the number of the clusters), as well as N_i (the set of node i's neighbors). Executing the distributed k-means++ algorithm on all the nodes synchronously yields initial centroids c(1) which are propagated to all the nodes. The detailed procedures are listed as follows.

1) First, every node produces a random number between 0 and 1, and the maximum of these numbers can be figured out by all nodes through the max-consensus algorithm. The node with the maximum number chooses its observation while the others choose [−∞, . . . , −∞]^T ∈ R^d to participate in the max-consensus algorithm, so that the observation of the node with the largest random number, designated as the first centroid c_1(1), becomes known to every node.

2) Next, all the nodes perform the following steps iteratively until all the k centroids are fixed. In the mth iteration, calculate the distance from each node to the existing centroids and obtain d_i = min_{1≤j≤m−1} ||x_i − c_j(1)||, i = 1, . . . , n. Then, every node produces a random number rand_i between 0 and d_i^2, followed by using the max-consensus algorithm to spread randmax = max_{1≤i≤n} rand_i throughout the network. If a certain node, say i, satisfies rand_i = randmax, then its observation is designated as the mth centroid and is further obtained by the other nodes in the same manner as for the first centroid.

Since the computational complexity of the max-consensus algorithm is O(dn^2), and the max-consensus algorithm needs to be executed 2k times, the distributed k-means++ algorithm's computational complexity is O(kdn^2) for each node.

Remark 3: A contradiction will appear if there exist at least two nodes, say node i and node j, such that rand_i = rand_j = randmax in Algorithm 1, since these two nodes both choose their own observations as the new centroid. A solution to this contradiction is that any node, say node l, with rand_l = randmax provides its ID while the others provide 0 to participate in the max-consensus algorithm. Therefore, the maximum ID among the nodes with rand = randmax can be obtained by every node, and such a node's observation can then be chosen as the new centroid. In fact, the same operation needs to be performed for obtaining the first centroid, although the probability for two nodes to produce the same random number is normally very small.

Fig. 1. Flowchart for the proposed algorithms. (a) Distributed k-means algorithm. (b) Distributed fuzzy c-means algorithm.

C. Distributed k-Means Algorithm

Based on the above work, in this section we will introduce the proposed distributed k-means algorithm in detail.

The objective of the distributed k-means algorithm is to partition the data observed by the nodes into k clusters, which minimizes the WCSS function (6) via a distributed approach. Similarly to the work in [18], we follow the same steps as those for the centralized algorithm except that every step is performed in a decentralized manner. During the assignment step, if all the centroids are spread through the network via the max-consensus algorithm, then each node knows which cluster it belongs to. During the update step, the finite-time average-consensus algorithm can be utilized to choose the mean of each cluster as the new centroid. Note that some pretreatments such as weight balancing and normalization [see Fig. 1(a)] need to be done before the distributed algorithm is executed. The flowchart for the distributed k-means algorithm is shown in Fig. 1(a) and the concrete realization of the algorithm is shown in Algorithm 2.

Algorithm 2: Distributed k-Means Algorithm: ith Node

Data: x_i = [x_i1, . . . , x_id]^T, n, k, N_i
Result: c, k_i

\* Weight balancing *\
a_i = mirror-imbalance-correcting(N_i);
\* Normalization *\
[max_1, . . . , max_d] = max-consensus(x_i, N_i, n);
[min_1, . . . , min_d] = min-consensus(x_i, N_i, n);
x̂_ij = (x_ij − min_j)/(max_j − min_j), for j = 1, . . . , d;
x̂_i = [x̂_i1, . . . , x̂_id]^T;
\* Choice of the initial centroids *\
c(1) ≜ [c_1(1)^T, . . . , c_k(1)^T]^T = distributed-k-means++(x_i, n, k, N_i);
T = 1;
while true do
    \* Assignment step *\
    k_i = argmin_j ||x̂_i − c_j(T)||;
    \* Update step *\
    T = T + 1;
    for j = 1; j ≤ k; j++ do
        if k_i == j then
            n_ic = 1;
        else
            n_ic = 0;
        end
        c_j(T) = average-consensus(n_ic x̂_i, N_i, a_i, n);
        n_j = average-consensus(n_ic, N_i, a_i, n);
        c_j(T) = c_j(T)/n_j;
    end
    c(T) = [c_1(T)^T, . . . , c_k(T)^T]^T;
    \* Check convergence *\
    if c(T) == c(T − 1) then
        break;
    end
end
c = c(T);

Note that in Algorithm 2, mirror-imbalance-correcting(·) is the mirror imbalance-correcting algorithm in [33] and distributed-k-means++(·) is the distributed k-means++ algorithm proposed in Section III-B.

For node i, it knows its observation x_i ∈ R^d, n (the number of the nodes), k (the number of the clusters), as well as N_i (the set of node i's neighbors). All the nodes execute the distributed k-means algorithm synchronously, and the final centroids c and the cluster k_i which it belongs to are obtained by every node. The whole algorithm is executed as follows.

1) Weight Balancing: To compute the average of a number of observations by the finite-time average-consensus algorithm, the strongly connected directed graph needs to be weight balanced. The mirror imbalance-correcting algorithm [33] is executed to assign weights to node i's outgoing edges so as to make the network weight balanced.

2) Normalization: A min–max value standardization method is used to normalize the components of the observations. The max-consensus and min-consensus algorithms [19] are used first to obtain the maximum and minimum values of every component of the observations. For the jth component of node i's observation, x_ij, if the maximum and minimum values are max_j and min_j, respectively, then invoking the min–max value standardization, x_ij is normalized as follows:

x̂_ij = (x_ij − min_j)/(max_j − min_j). (9)
3) Choice of the Initial Centroids: Similar to the centralized k-means++, this step is the key to yielding a faster convergence speed as well as a higher possibility of obtaining the global optimum. The normalized observations are fed to the distributed k-means++ algorithm to produce the initial centroids. Let T = 1, and the initial centroids are c(1).

4) Assignment Step: Since the centroids have been provided in step 3), each node needs to figure out which cluster, represented by the centroid, it belongs to. More specifically, node i belongs to the cluster k_i which satisfies the condition that the distance from node i to the centroid of cluster k_i is the shortest among all the distances from the centroids to such a node, i.e., it belongs to the k_i-th cluster with k_i = argmin_j ||x̂_i − c_j(T)||.

5) Update Step: After the assignment step, the update step is executed. Let T = T + 1. To compute the jth centroid, each node provides its normalized observation if it is assigned to the jth cluster and [0, . . . , 0]^T ∈ R^d otherwise, to participate in the finite-time average-consensus algorithm. If the number of nodes in the jth cluster is q_j and the centroid is c_j(T), then it is apparent that q_j c_j(T) = Σ_{i=1}^n n_ic x̂_i. Thus, the result of the finite-time average-consensus algorithm is q_j c_j(T)/n.

Then every node chooses 1 if it is assigned to the jth cluster and 0 otherwise to participate in the finite-time average-consensus algorithm. The finite-time average-consensus algorithm's result is q_j/n since q_j = Σ_{i=1}^n n_ic. Apparently, the jth cluster's centroid can be obtained by c_j(T) = (q_j c_j(T)/n)/(q_j/n).

6) Check Convergence: After the new centroids are produced, we need to judge whether the algorithm has converged. Obviously, if c(T) = c(T − 1), then c(T) gives the final cluster centroids c. Otherwise, the assignment step and the update step are repeated until the algorithm converges.

It is known that the computational complexity of the distributed k-means++ algorithm is O(kdn^2) and the complexity of the mirror imbalance-correcting algorithm is O(n^4) for a node. During the normalization step, every node executes the max-consensus and the min-consensus once and the dimension of the observations is d, so the computational complexity is O(dn^2). As for the assignment step and the update step, if the algorithm converges within T steps, then the computational complexity is O(kdn^2 T), since the complexity of the finite-time average-consensus algorithm is O(dn) for a node. In conclusion, the computational complexity of the distributed k-means algorithm is O(n^2 max{n^2, dkT}) for a node, and when the network is undirected, the computational complexity is O(n^2 dkT) for a node since an undirected graph is always weight balanced.

D. Advantages of the Distributed k-Means Algorithm

Compared with the algorithms proposed in [18], the main difference occurs in the update step, regarding how to calculate the averages of the observations in each cluster. For the algorithm proposed in this paper, when calculating the average of the nodes' observations in each cluster, each node provides valid information if it belongs to the cluster and invalid information (i.e., 0) otherwise, through the whole network. Thus, the proposed algorithm is valid as long as the whole network is strongly connected. However, for the algorithms given in [18], the centroid for each cluster of observations is calculated by employing the finite-time average-consensus algorithm on the underlying network of that cluster, thus leading to two shortcomings of the algorithm proposed in [18].
The first shortcoming is that it cannot be extended directly to the case when the network topology is a directed graph, because there is no guarantee that the subnetwork (i.e., the underlying topology of each cluster) is strongly connected as well if the whole network itself is strongly connected. The second shortcoming is that for the algorithm provided in [18] to cope with disconnected clusters, it involves a high computational complexity of O(n^2 k max{d, n} σ(S_j(t)) T), where σ(S_j(t)) means the set of subclusters of cluster S_j(t) at step t, while the computational complexity of the algorithm proposed herein is only O(n^2 dkT). In other words, the advantages of the algorithm proposed in this paper over that in [18] are its capability of dealing with the condition that the network topology is a directed graph and its lower computational complexity.

Remark 4: The distributed k-means algorithm prefers clusters of approximately similar size, as it will always assign an observation to the nearest centroid. This often leads to incorrect cut borders between clusters. Moreover, it is sometimes necessary for an observation to be assigned to more than one cluster [20], [21]. For example, when one data point is located in the overlapping area of many different clusters, it may not be appropriate to assign the data point to exactly one cluster since it has the properties of those overlapping clusters. Hence, a distributed fuzzy partition method, which will be introduced in the next section, is needed.

IV. DISTRIBUTED FUZZY c-MEANS ALGORITHM

In this section, we introduce the distributed fuzzy c-means algorithm. To this end, the fuzzy c-means algorithm is first briefly presented.

A. Introduction to the Centralized Fuzzy c-Means Algorithm

Consider a given set of observations {x_1, x_2, . . . , x_n}, where each observation is a d-dimensional real vector. Similarly to the k-means algorithm, the fuzzy c-means algorithm [34], [35] aims to partition the n observations into k (≤ n) sets S = {S_1, S_2, . . . , S_k} with degrees described by the membership matrix, a row-stochastic matrix U = [u_ij] ∈ R^{n×k}, where the membership value u_ij means that node i belongs to cluster j with degree u_ij, so as to minimize an objective function. The objective function is given by

Σ_{i=1}^n Σ_{j=1}^k u_ij^m ||x_i − c_j||^2 (10)

where c_j is the representation of cluster j, which is called the center later, and m is the fuzzifier determining the level of cluster fuzziness. In the absence of experimentation or domain knowledge, m is commonly set to 2 [35].

The Lagrangian multiplier method [34] can be used to solve this problem, and the necessary conditions to minimize the objective function are

c_j = (Σ_{i=1}^n u_ij^m x_i) / (Σ_{i=1}^n u_ij^m), j = 1, . . . , k (11)

and

u_ij = 1 / Σ_{t=1}^k (||x_i − c_j|| / ||x_i − c_t||)^{2/(m−1)}, i = 1, . . . , n, j = 1, . . . , k. (12)

Similar to the k-means algorithm, the fuzzy c-means algorithm also uses an iterative refinement technique. Given an initial set of k centers c_1(1), . . . , c_k(1) or an initial membership matrix, the algorithm proceeds by alternating between updating c_j and U according to (11) and (12).

Remark 5: The algorithm minimizes intracluster variance as well, but it has the same problems as the k-means algorithm, that is, the obtained minimum is a local minimum and the results depend on the initial choice.
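A numpy sketch (ours) of the centralized alternation between the membership update (12) and the center update (11); the small floor on the distances guards against a zero denominator when an observation coincides with a center.

import numpy as np

def fuzzy_cmeans(X, c0, m=2.0, delta=1e-6, max_iter=200):
    """Alternate the membership update (12) and the center update (11)."""
    c = c0.copy()
    U = np.full((len(X), len(c)), 1.0 / len(c))
    for _ in range(max_iter):
        dist = np.linalg.norm(X[:, None] - c[None], axis=2)   # ||x_i - c_j||
        dist = np.maximum(dist, 1e-12)                        # avoid division by zero
        # (12): u_ij = 1 / sum_t (||x_i - c_j|| / ||x_i - c_t||)^{2/(m-1)}
        U = 1.0 / ((dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1.0))).sum(axis=2)
        # (11): c_j = sum_i u_ij^m x_i / sum_i u_ij^m
        Um = U ** m
        c_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
        if np.linalg.norm(c_new - c) < delta:                 # centers nearly unchanged
            return c_new, U
        c = c_new
    return c, U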
B. Distributed Fuzzy c-Means Algorithm

In this section, the proposed distributed fuzzy c-means algorithm is presented in detail.

We make the same assumptions as for the distributed k-means algorithm, i.e., the n nodes deployed in a space of any dimension are endowed with real vectors x_i ∈ R^d, i = 1, . . . , n, which represent the observations, and the underlying topology of the WSN is strongly connected.

The goal of the proposed algorithm is to obtain the centers representing the different clusters and the membership matrix describing to what extent a node belongs to each cluster, so as to minimize the objective function described in (10). We follow the same steps as those for the centralized algorithm except that every step is performed in a decentralized manner. If all the centers can be spread across the network via the max-consensus algorithm, then each node can calculate the membership matrix directly according to (12). When computing the new centers, the finite-time average-consensus algorithm can be utilized. The flowchart for the distributed fuzzy c-means algorithm is shown in Fig. 1(b) and the concrete realization of the algorithm is given in Algorithm 3.

Algorithm 3: Distributed Fuzzy c-Means Algorithm: ith Node

Data: x_i = [x_i1, . . . , x_id]^T, n, k, N_i, δ
Result: c, u_ij (j = 1, . . . , k)

\* Weight balancing *\
a_i = mirror-imbalance-correcting(N_i);
\* Normalization *\
[max_1, . . . , max_d] = max-consensus(x_i, N_i, n);
[min_1, . . . , min_d] = min-consensus(x_i, N_i, n);
x̂_ij = (x_ij − min_j)/(max_j − min_j), for j = 1, . . . , d;
x̂_i = [x̂_i1, . . . , x̂_id]^T;
\* Choice of the initial centers *\
c(1) ≜ [c_1(1)^T, . . . , c_k(1)^T]^T = distributed-k-means++(x_i, n, k, N_i);
T = 1;
while true do
    \* Calculate membership matrix *\
    for j = 1; j ≤ k; j++ do
        u_ij = 1 / Σ_{t=1}^k (||x̂_i − c_j||/||x̂_i − c_t||)^{2/(m−1)};
    end
    \* Calculate new centers *\
    T = T + 1;
    for j = 1; j ≤ k; j++ do
        c_j(T) = average-consensus(u_ij^m x̂_i, N_i, a_i, n);
        n_j = average-consensus(u_ij^m, N_i, a_i, n);
        c_j(T) = c_j(T)/n_j;
    end
    c(T) = [c_1(T)^T, . . . , c_k(T)^T]^T;
    \* Check convergence *\
    if ||c(T) − c(T − 1)|| < δ then
        break;
    end
end
c = c(T);

For node i, it knows its observation x_i ∈ R^d, n (the number of the nodes in the network), k (the number of the clusters), N_i (the set of neighbors of node i), as well as δ (the parameter used to judge whether the algorithm has converged). All the nodes execute the distributed fuzzy c-means algorithm synchronously, and the final centers c and the membership values u_ij describing to what extent node i belongs to cluster j are obtained by every node. The whole algorithm is executed as follows.

1) Weight Balancing, Normalization, and Choice of the Initial Centers: These steps are the same as in the distributed k-means algorithm.

2) Calculate Membership Matrix: Since the centers have been provided in step 1), each node can calculate the membership matrix as shown in (12).

3) Calculate New Centers: Each node provides u_ij^m x̂_i and u_ij^m to participate in the finite-time average-consensus algorithm. The resulting finite-time average-consensus values are then (Σ_{i=1}^n u_ij^m x̂_i)/n and (Σ_{i=1}^n u_ij^m)/n. Apparently, the ratio of the two results is the new center according to (11).

4) Check Convergence: After the new centers are produced, we need to judge whether the algorithm has converged. Obviously, if ||c(T) − c(T − 1)|| < δ, which means that the centers change little between two iterations, then c(T) gives the final cluster centers c. Otherwise, with T = T + 1, we calculate the membership matrix and the new centers iteratively until the algorithm converges.

Similar to the distributed k-means algorithm, the computational complexity for a node of the distributed fuzzy c-means algorithm is O(n^2 max{n^2, dkT}) if the network is directed and O(n^2 dkT) if the network is undirected.

Remark 6: Similar to the centralized fuzzy c-means algorithm, the distributed fuzzy algorithm can begin with either a given initial set of centers or a given initial membership matrix. Note that the distributed k-means++ algorithm provided in Section III-B is also applicable for finding an initial set of centers for the distributed fuzzy c-means algorithm.

V. SIMULATION RESULTS

Now we provide several examples to illustrate the capabilities of the proposed distributed k-means algorithm and distributed fuzzy c-means algorithm.
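Before turning to the examples, the following sketch (ours) assembles the pieces of Section III — normalization (9), the distributed_kmeanspp sketch given after Algorithm 1, and the assignment/update loop of Algorithm 2 — with every consensus primitive replaced by its exact limit (global max/min, and the global mean for the finite-time average-consensus), which is what the primitives return on a weight-balanced, strongly connected network.

import numpy as np

def distributed_kmeans(X, k, rng):
    """Algorithm 2 with consensus emulated by its limit values; reuses the
    distributed_kmeanspp sketch defined earlier."""
    Xn = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))   # normalization (9)
    c = distributed_kmeanspp(Xn, k, rng)                         # initial centroids
    while True:
        # assignment step: k_i = argmin_j ||x_i_hat - c_j(T)||
        labels = np.linalg.norm(Xn[:, None] - c[None], axis=2).argmin(axis=1)
        c_new = np.empty_like(c)
        for j in range(k):
            ind = (labels == j).astype(float)       # n_ic: membership indicator
            num = (ind[:, None] * Xn).mean(axis=0)  # avg-consensus of n_ic*x_i -> q_j c_j / n
            den = ind.mean()                        # avg-consensus of n_ic     -> q_j / n
            c_new[j] = num / den if den > 0 else c[j]
        if np.allclose(c_new, c):                   # convergence check
            return c, labels
        c = c_new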
Example 1: In this example, we consider that the network topology is an undirected graph. Here the 40 nodes are randomly placed in the region [0, 1] × [0, 1] and the observation of each node is its position. We need to partition the 40 measures into four clusters. The network is shown in Fig. 2 and the initial centroids chosen by the distributed k-means algorithm are marked by red crosses.

Fig. 2. Observations of the positions with initial centroids and undirected network topology.

First, we use the proposed distributed k-means algorithm to deal with these observations; the clustering result is depicted in Fig. 3, in which the different types of points denote different clusters while the final centroids are marked by red crosses.

Fig. 3. Clustering result with final centroids of the distributed k-means algorithm in the undirected graph.

Then the proposed distributed fuzzy c-means algorithm is used to deal with the observations. Considering that soft clustering does not give a fixed partition, we make the following treatment so that the clustering result can be observed directly. For node i, if u_ij ≥ 0.5, then we partition node i into cluster j. If u_ij < 0.5 for all j, then we call it a fuzzy point, presented by a "plus" in the simulation results. The simulation result is depicted in Fig. 4.
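The display treatment just described amounts to thresholding the membership matrix; a tiny sketch (the function name harden is ours):

import numpy as np

def harden(U, threshold=0.5):
    """Assign node i to argmax_j u_ij when that membership reaches the threshold;
    otherwise mark it as a fuzzy (border) point with the label -1."""
    labels = U.argmax(axis=1)
    labels[U.max(axis=1) < threshold] = -1
    return labels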
Fig. 4. Clustering result with final centroids of the distributed fuzzy c-means algorithm in the undirected graph.

Note that the fuzzy points are always on the borders of clusters, which tallies with the actual situation.

Here we provide the same observations except that the network is a connected directed graph, as shown in Fig. 5, where the initial centers are marked by red crosses. The clustering result using the distributed k-means algorithm is depicted in Fig. 6, in which the different types of points denote different clusters and the final centers are marked by red crosses. When the distributed fuzzy c-means algorithm is applied, the corresponding results are shown in Fig. 7. From Figs. 3 and 6, which are plotted using the distributed k-means algorithm, as well as Figs. 4 and 7, which are plotted using the distributed fuzzy c-means algorithm, it can be observed that the clustering is the same for the same observations regardless of whether the network topology is undirected or not.

Fig. 5. Observations of the positions with initial centroids and directed network topology.

Fig. 6. Clustering result with final centroids of the distributed k-means algorithm in the directed graph.

Fig. 7. Clustering result with final centroids of the distributed fuzzy c-means algorithm in the directed graph.

Example 2: Consider a WSN with 3600 nodes. The network is a connected undirected graph with each node communicating with 50 other nodes. The observed data are generated from nine classes, with the vectors from each class generated from a 2-D Gaussian distribution with the common covariance Σ = 2I_2 and the corresponding means μ1 = [6, 6]^T, μ2 = [0, 6]^T, μ3 = [−6, 6]^T, μ4 = [6, 0]^T, μ5 = [0, 0]^T, μ6 = [−6, 0]^T, μ7 = [6, −6]^T, μ8 = [0, −6]^T, and μ9 = [−6, −6]^T. Every node randomly draws one data point from the nine classes and 400 data points are chosen from each class.
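The synthetic data of this example can be reproduced with a few lines of numpy (our sketch; the seed is arbitrary):

import numpy as np

rng = np.random.default_rng(0)
# the nine class means mu_1, ..., mu_9 on the 3 x 3 grid {-6, 0, 6}^2
means = np.array([[x, y] for y in (6, 0, -6) for x in (6, 0, -6)], dtype=float)
cov = 2.0 * np.eye(2)                 # common covariance 2*I_2
# 400 observations per class, 3600 in total: one observation per node
X = np.vstack([rng.multivariate_normal(mu, cov, size=400) for mu in means])
rng.shuffle(X)                        # each node draws its data point at random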
Fig. 8(b) depicts the evolution of the WCSS functions of the distributed k-means algorithm with k = 6, 9, 12 and of the centralized k-means algorithm with k = 9. This illustrates that the convergence of the proposed distributed implementation is similar to that of the centralized algorithm. The clustering result of the distributed k-means algorithm with k = 9 is displayed in Fig. 8(a). Table I compares the performance, including the final value of the WCSS function and the number of iterations before convergence over 100 Monte Carlo runs, for the distributed k-means algorithm (represented by dk-means), the k-means algorithm with the first centroids chosen by using the k-means++ algorithm (represented by k-means++), and the k-means algorithm with the first centroids randomly chosen from the observations (represented by k-means). It can be observed that the distributed k-means algorithm performs the best both in terms of speed and accuracy.
Fig. 8. Clustering of the distributed k-means algorithm in the undirected graph with 3600 observations fitting a Gaussian distribution. (a) Clustering result of the distributed k-means algorithm with k = 9. (b) Evolution of the WCSS functions of the distributed k-means algorithm with k = 6, 9, 12 and of the centralized k-means algorithm with k = 9.

Fig. 9. Clustering of the distributed fuzzy c-means algorithm in the undirected graph with 3600 observations fitting a Gaussian distribution. (a) Clustering result of the distributed fuzzy c-means algorithm with k = 9. (b) Evolution of the objective functions of the distributed fuzzy c-means algorithm with k = 6, 9, 12 and of the centralized fuzzy c-means algorithm with k = 9.

TABLE I. Centralized versus distributed k-means: performance comparison.

TABLE II. Centralized versus distributed fuzzy c-means: performance comparison.

Fig. 9(b) depicts the evolution of the objective functions of the distributed fuzzy c-means algorithm with k = 6, 9, 12 and of the centralized fuzzy c-means algorithm with k = 9. This illustrates that the convergence of the proposed distributed implementation is similar to that of the centralized algorithm. The clustering result of the distributed fuzzy c-means algorithm with k = 9 is displayed in Fig. 9(a), in which the fuzzy points are presented by black circles. Table II compares the performance, including the final value of the objective function and the number of iterations before convergence over 100 Monte Carlo runs, for the distributed fuzzy c-means algorithm (represented by dc-means), the fuzzy c-means algorithm with the first centers chosen by using the k-means++ algorithm (represented by fuzzy c-means++), and the fuzzy c-means algorithm with the first centers randomly chosen from the observations (represented by fuzzy c-means). It can be seen that the distributed fuzzy c-means algorithm performs the best both in terms of speed and accuracy.

Example 3: This example tests the proposed decentralized schemes on real data, namely the first cloud cover database available from the UC-Irvine Machine Learning Repository, with the goal of identifying regions sharing common characteristics. The cloud dataset is composed of 1024 points in 10-D. Suppose that the WSN consisting of 1024 nodes is a connected undirected graph with each node communicating with 20 other nodes.
Fig. 10. Evolution of the WCSS functions of the distributed k-means algorithm with k = 10, 25, 50 in the undirected graph with 1024 10-D observations from the cloud dataset.

Fig. 11. Evolution of the objective functions of the distributed fuzzy c-means algorithm with k = 10, 25, 50 in the undirected graph with 1024 10-D observations from the cloud dataset.

Fig. 10 depicts the evolution of the WCSS functions of the distributed k-means algorithm with k = 10, 25, 50, which confirms the convergence of the proposed k-means algorithm. Moreover, Fig. 11 plots the evolution of the objective functions of the distributed fuzzy c-means algorithm with k = 10, 25, 50, which demonstrates the convergence of the proposed c-means algorithm.

VI. CONCLUSION

In this paper, based on the distributed consensus theory in multiagent systems, we have developed a distributed k-means clustering algorithm as well as a distributed fuzzy c-means algorithm for clustering data in WSNs with strongly connected underlying graph topology. The proposed distributed k-means algorithm, by virtue of one-hop communication, has been proven to be feasible in partitioning the data observed by the nodes into measure-dependent groups which have small in-group and large out-group differences. On the other hand, the proposed distributed fuzzy c-means algorithm has been shown to be workable in partitioning the data observed by the nodes into different measure-dependent groups with degrees of membership values. Finally, some simulation results have been provided to illustrate the feasibility of the proposed distributed algorithms.

ACKNOWLEDGMENT

The authors would like to thank the Associate Editor and the anonymous reviewers for their valuable comments and suggestions that have helped to improve this paper considerably.

REFERENCES

[1] K. Sohraby, D. Minoli, and T. F. Znati, Wireless Sensor Networks: Technology, Protocols, and Applications. Hoboken, NJ, USA: Wiley, 2007.
[2] J. Heo, J. Hong, and Y. Cho, "EARQ: Energy aware routing for real-time and reliable communication in wireless industrial sensor networks," IEEE Trans. Ind. Informat., vol. 5, no. 1, pp. 3–11, Feb. 2009.
[3] Q.-S. Jia, L. Shi, Y. Mo, and B. Sinopoli, "On optimal partial broadcasting of wireless sensor networks for Kalman filtering," IEEE Trans. Autom. Control, vol. 57, no. 3, pp. 715–721, Mar. 2012.
[4] J. Liang, Z. Wang, B. Shen, and X. Liu, "Distributed state estimation in sensor networks with randomly occurring nonlinearities subject to time delays," ACM Trans. Sensor Netw., vol. 9, no. 1, 2012, Art. ID 4.
[5] W. Kim, J. Park, J. Yoo, H. J. Kim, and C. G. Park, "Target localization using ensemble support vector regression in wireless sensor networks," IEEE Trans. Cybern., vol. 43, no. 4, pp. 1189–1198, Aug. 2013.
[6] S. H. Semnani and O. A. Basir, "Semi-flocking algorithm for motion control of mobile sensors in large-scale surveillance systems," IEEE Trans. Cybern., vol. 45, no. 1, pp. 129–137, Jan. 2015.
[7] J. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proc. 5th Berkeley Symp. Math. Stat. Probab., vol. 1. Berkeley, CA, USA, 1967, pp. 281–297.
[8] J. C. Bezdek, R. Ehrlich, and W. Full, "FCM: The fuzzy c-means clustering algorithm," Comput. Geosci., vol. 10, nos. 2–3, pp. 191–203, 1984.
[9] G. H. Ball and D. J. Hall, ISODATA, A Novel Method of Data Analysis and Pattern Classification. Menlo Park, CA, USA: Stanford Res. Inst., 1965.
[10] J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters," J. Cybern., vol. 3, no. 3, pp. 32–57, 1973.
[11] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise," in Proc. 2nd Int. Conf. Knowl. Disc. Data Mining, Portland, OR, USA, Aug. 1996, pp. 226–231.
[12] I. S. Dhillon and D. S. Modha, "A data-clustering algorithm on distributed memory multiprocessors," in Large-Scale Parallel Data Mining (LNCS 1759). Berlin, Germany: Springer, 2000, pp. 245–260.
[13] S. Kantabutra and A. L. Couch, "Parallel K-means clustering algorithm on NOWs," NECTEC Tech. J., vol. 1, no. 6, pp. 243–247, 2000.
[14] S. Bandyopadhyay et al., "Clustering distributed data streams in peer-to-peer environments," Inf. Sci., vol. 176, no. 14, pp. 1952–1985, 2006.
[15] P. A. Forero, A. Cano, and G. B. Giannakis, "Consensus-based k-means algorithm for distributed learning using wireless sensor networks," in Proc. Workshop Sensors Signal Inf. Process., Sedona, AZ, USA, May 2008, pp. 11–14.
[16] P. A. Forero, A. Cano, and G. B. Giannakis, "Distributed clustering using wireless sensor networks," IEEE J. Sel. Topics Signal Process., vol. 5, no. 4, pp. 707–724, Aug. 2011.
[17] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Upper Saddle River, NJ, USA: Prentice-Hall, 1989.
[18] G. Oliva, R. Setola, and C. N. Hadjicostis. (Nov. 10, 2014). Distributed K-Means Algorithm. [Online]. Available: http://arxiv.org/abs/1312.4176
[19] R. Olfati-Saber, J. A. Fax, and R. M. Murray, "Consensus and cooperation in networked multi-agent systems," Proc. IEEE, vol. 95, no. 1, pp. 215–233, Jan. 2007.
[20] A. B. McBratney and I. O. A. Odeh, "Application of fuzzy sets in soil science: Fuzzy logic, fuzzy measurements and fuzzy decisions," Geoderma, vol. 77, nos. 2–4, pp. 85–113, 1997.
[21] S. Balla-Arabé, X. Gao, and B. Wang, "A fast and robust level set method for image segmentation using fuzzy clustering and lattice Boltzmann method," IEEE Trans. Cybern., vol. 43, no. 3, pp. 910–920, Jun. 2013.
[22] P. Shen and C. Li, "Distributed information theoretic clustering," IEEE Trans. Signal Process., vol. 62, no. 13, pp. 3442–3453, Jul. 2014.
[23] X. Zhao and A. H. Sayed, "Distributed clustering and learning over networks," IEEE Trans. Signal Process., vol. 63, no. 13, pp. 3285–3300, Jul. 2015.
[24] D. Arthur and S. Vassilvitskii, "k-means++: The advantages of careful seeding," in Proc. 18th Annu. ACM-SIAM Symp. Discrete Algorithms, New Orleans, LA, USA, Jan. 2007, pp. 1027–1035.
[25] Z. Lin, B. Francis, and M. Maggiore, "State agreement for continuous-time coupled nonlinear systems," SIAM J. Control Optim., vol. 46, no. 1, pp. 288–307, 2007.
[26] W. Ren, R. W. Beard, and E. M. Atkins, "Information consensus in multivehicle cooperative control," IEEE Control Syst. Mag., vol. 27, no. 2, pp. 71–82, Apr. 2007.
[27] J. Qin, H. Gao, and W. X. Zheng, "Second-order consensus for multi-agent systems with switching topology and communication delay," Syst. Control Lett., vol. 60, no. 6, pp. 390–397, 2011.
[28] Y. Tang, H. Gao, W. Zou, and J. Kurths, "Distributed synchronization in networks of agent systems with nonlinearities and random switchings," IEEE Trans. Cybern., vol. 43, no. 1, pp. 358–370, Feb. 2013.
[29] J. Qin, W. X. Zheng, and H. Gao, "Coordination of multiple agents with double-integrator dynamics under generalized interaction topologies," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 42, no. 1, pp. 44–57, Feb. 2012.
[30] F. Garin and L. Schenato, "A survey on distributed estimation and control applications using linear consensus algorithms," in Networked Control Systems. London, U.K.: Springer, 2010, pp. 75–107.
[31] S. Sundaram and C. N. Hadjicostis, "Finite-time distributed consensus in graphs with time-invariant topologies," in Proc. Amer. Control Conf., New York, NY, USA, Jul. 2007, pp. 711–716.
[32] C. N. Hadjicostis and A. Rikos, "Distributed strategies for balancing a weighted digraph," in Proc. 20th IEEE Mediterr. Conf. Control Autom., Barcelona, Spain, Jul. 2012, pp. 1141–1146.
[33] B. Gharesifard and J. Cortés, "Distributed strategies for generating weight-balanced and doubly stochastic digraphs," Eur. J. Control, vol. 18, no. 6, pp. 539–557, 2012.
[34] J. C. Bezdek, Pattern Recognition With Fuzzy Objective Function Algorithms. New York, NY, USA: Springer, 1981.
[35] N. R. Pal and J. C. Bezdek, "On cluster validity for the fuzzy c-means model," IEEE Trans. Fuzzy Syst., vol. 3, no. 3, pp. 370–379, Aug. 1995.

Jiahu Qin (M'12) received the first Ph.D. degree in control science and engineering from the Harbin Institute of Technology, Harbin, China, in 2012, and the second Ph.D. degree in information engineering from Australian National University, Canberra, ACT, Australia, in 2014. He was a Visiting Fellow with the University of Western Sydney, Sydney, NSW, Australia, in 2009 and 2015, respectively, and a Research Fellow with the Australian National University from 2013 to 2014. He is currently a Professor with the University of Science and Technology of China, Hefei, China. His current research interests include consensus and coordination in multiagent systems as well as complex network theory and its application.

Weiming Fu received the B.E. degree in automation from the University of Science and Technology of China, Hefei, China, in 2014, where he is currently pursuing the master's degree. His current research interests include consensus problems in multiagent coordination and synchronization of complex dynamical networks.

Huijun Gao (F'14) received the Ph.D. degree in control science and engineering from the Harbin Institute of Technology, Harbin, China, in 2005. He was a Research Associate with the Department of Mechanical Engineering, University of Hong Kong, Hong Kong, from 2003 to 2004. From 2005 to 2007, he was a Post-Doctoral Research Fellow with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada. Since 2004, he has been with the Harbin Institute of Technology, where he is currently a Professor and the Director of the Research Institute of Intelligent Control and Systems. His current research interests include network-based control, robust control/filter theory, time-delay systems, and their engineering applications. Dr. Gao was a recipient of the IEEE J. David Irwin Early Career Award from the IEEE IES. He is a Co-Editor-in-Chief for the IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, and an Associate Editor for Automatica, the IEEE TRANSACTIONS ON CYBERNETICS, the IEEE TRANSACTIONS ON FUZZY SYSTEMS, the IEEE/ASME TRANSACTIONS ON MECHATRONICS, and the IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY. He is an AdCom Member of the IEEE Industrial Electronics Society.

Wei Xing Zheng (M'93–SM'98–F'14) received the Ph.D. degree in electrical engineering from Southeast University, Nanjing, China, in 1989. He has held various faculty/research/visiting positions with Southeast University; the Imperial College of Science, Technology and Medicine, London, U.K.; the University of Western Australia, Perth, WA, Australia; the Curtin University of Technology, Perth; the Munich University of Technology, Munich, Germany; the University of Virginia, Charlottesville, VA, USA; and the University of California at Davis, Davis, CA, USA. He is currently a Full Professor with Western Sydney University, Sydney, NSW, Australia. Dr. Zheng served as an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: FUNDAMENTAL THEORY AND APPLICATIONS, the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, the IEEE SIGNAL PROCESSING LETTERS, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: EXPRESS BRIEFS, and the IEEE TRANSACTIONS ON FUZZY SYSTEMS, and as a Guest Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: REGULAR PAPERS. He is currently an Associate Editor for Automatica, the IEEE TRANSACTIONS ON AUTOMATIC CONTROL (the second term), the IEEE TRANSACTIONS ON CYBERNETICS, the IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, and other scholarly journals. He has also served as the Chair of the IEEE Circuits and Systems Society's Technical Committee on Neural Systems and Applications and of the IEEE Circuits and Systems Society's Technical Committee on Blind Signal Processing. He is a Fellow of IEEE.