
The 13th International Conference on Computer Science & Education (ICCSE 2018)
August 8-11, 2018, Colombo, Sri Lanka

Fuzzy clustering based on Water Wave Optimization


Zifei Ren
School of Computer Science, Hubei University of Technology
Wuhan, China
renzifi@163.com

Chunzhi Wang
School of Computer Science, Hubei University of Technology
Wuhan, China
chunzhiwang@vip.163.com

Xinkai Fan
School of Computer Science, Hubei University of Technology
Wuhan, China
Fxk7676@tom.com

Zhiwei Ye
School of Computer Science, Hubei University of Technology
Wuhan, China
weizhiye121@163.com

Abstract—Fuzzy clustering is one of the most widely used clustering techniques. However, one of its fatal weaknesses is that it is very sensitive to initialization and easily falls into local minima. The water wave optimization (WWO) algorithm is a widely used global optimization method whose main advantages are simplicity, generality, and suitability for parallel processing. Combining WWO with fuzzy clustering therefore brings WWO's global optimization ability into play while retaining the local optimization ability of fuzzy clustering, and improves the convergence speed at the same time, so that the clustering problem is solved better. The proposed algorithm uses WWO to search for an optimal solution, which serves as the initial clustering center of fuzzy clustering; fuzzy clustering then starts from this center and finally obtains the global optimal solution, thus overcoming the shortcomings of fuzzy clustering.

Index Terms—water wave optimization; fuzzy clustering; center selection

I. INTRODUCTION

Traditional cluster analysis performs a hard partition, which strictly assigns each object to be identified to exactly one class [1]. In such hard partitioning, classification is an either/or decision, so the boundaries between categories are crisp. In fact, most objects do not possess strictly crisp attributes, and a hard division does not truly reflect the actual relationship between objects and classes. As a result, soft partitioning methods were proposed, and the fuzzy C-means (FCM) algorithm [2] is one of the most widely used, and most initialization-sensitive, algorithms in fuzzy cluster analysis [3][14]. It is not only accepted by researchers in fuzzy engineering, but has also been extended to other branches of science, such as medical diagnosis, computer visualization, communication, and process sensing. Because FCM finds its solution with an iterative hill-climbing procedure, it is a local search algorithm and has two fatal weaknesses: first, it takes a lot of time to process large data sets; second, it is very sensitive to initialization and easily falls into local minima. In response to these problems, various improvements have been proposed, such as clustering with neural network techniques. However, neural networks can only save time through parallel processing and still cannot remove the sensitivity to initialization, so a global optimization algorithm is urgently needed to overcome this disadvantage.

The WWO algorithm is an evolutionary algorithm based on the theory of shallow water waves. It maps each individual in the population to a "wave" object and searches the problem space efficiently by simulating wave propagation, refraction, and breaking operations. WWO has a simple framework, few control parameters, and a small population size. Test results on an important set of benchmark functions show that the overall performance of WWO is higher than that of a series of other emerging evolutionary algorithms, such as biogeography-based optimization, the gravitational search algorithm, the foraging search algorithm, and the bat algorithm. WWO has also been successfully applied to a railway dispatching problem, which reflects its broad application prospects in practical engineering optimization problems.

II. WWO ALGORITHM

WWO solves an optimization problem by simulating water wave motion. The solution space of the problem corresponds to the seabed, and the fitness of a solution is inversely related to its vertical distance from the still water level: the higher the fitness of a water wave, the shallower the seabed at its position [6]. The algorithm constructs a group of water waves at initialization, where the wave height h of each wave is set to a constant hmax and the wavelength λ is set to a constant. In the iterative process of the algorithm, the population is evolved by repeatedly applying three operations: propagation, refraction, and breaking [7].

(1) Propagation. Each wave performs a propagation operation in every iteration of the algorithm. Let the new wave obtained after propagating the water wave X be X'. Its position in each dimension d (1 ≤ d ≤ D, where D is the dimension of the problem) is calculated as follows:

X'_d = X_d + rand(-1, 1) \times \lambda \times L_d    (1)

where rand(-1, 1) is a uniformly distributed random number in [-1, 1] and L_d is the length of the d-th dimension of the search space.
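As an illustration, the following minimal Python sketch (an assumption of this edit, not code from the paper) shows how the propagation step of equation (1) can be implemented for a single wave; lam denotes the wave's current wavelength λ, lower and upper bound the search space, and out-of-range coordinates are reset to random feasible values, which is one common convention the paper does not specify.

import numpy as np

def propagate(position, lam, lower, upper, rng=np.random.default_rng()):
    """Propagation step of equation (1): shift every dimension of a wave by a
    uniform random offset in [-1, 1] scaled by the wavelength lam and the
    search-range length L_d of that dimension."""
    lengths = upper - lower                      # L_d for each dimension d
    offset = rng.uniform(-1.0, 1.0, size=position.shape)
    new_pos = position + offset * lam * lengths  # X'_d = X_d + rand(-1,1) * lam * L_d
    # Reset coordinates that leave the search space to random feasible values
    # (a common convention; the paper does not state how this case is handled).
    out = (new_pos < lower) | (new_pos > upper)
    new_pos[out] = rng.uniform(lower[out], upper[out])
    return new_pos

# Example: propagate one wave in a 2-D search space [0, 10] x [0, 10]
lower, upper = np.zeros(2), np.full(2, 10.0)
wave = np.array([3.0, 7.0])
print(propagate(wave, lam=0.5, lower=lower, upper=upper))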




After propagation, the fitness values of the new water wave X' and the original water wave X are compared under the fitness function f. If f(X') > f(X), X is replaced in the population by X' and the wave height of X' is reset to hmax; otherwise, X is kept and its wave height is decreased by 1 to record the loss of energy. After each iteration, the WWO algorithm updates the wavelength of every water wave X in the following manner:

\lambda = \lambda \times \alpha^{-(f(X) - f_{min} + \varepsilon)/(f_{max} - f_{min} + \varepsilon)}    (2)

where f_max and f_min represent the maximum and minimum fitness values in the current population, α is the wavelength attenuation coefficient, and ε is a very small positive constant (to avoid division by zero).

(2) Refraction. If a wave does not improve after several propagation operations, its wave height decays to zero; the wave is then refracted toward the current best solution and its wave height is reset to hmax.

(3) Breaking. The breaking operation randomly selects k dimensions out of the D dimensions (k is a random number between 1 and a predefined parameter kmax) and perturbs each selected dimension with a Gaussian random offset proportional to β × L_d, producing solitary waves around the newly found best solution.
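The following Python sketch (illustrative only; names such as alpha, beta, h_max, and k_max mirror the symbols above and are assumptions of this edit) shows how the wavelength update of equation (2) and simplified refraction and breaking operators can be written.

import numpy as np

rng = np.random.default_rng()

def update_wavelength(lam, fit, f_min, f_max, alpha=1.01, eps=1e-12):
    """Equation (2): shrink the wavelength of fitter waves, enlarge that of weaker ones."""
    return lam * alpha ** (-(fit - f_min + eps) / (f_max - f_min + eps))

def refract(position, best, h_max=12):
    """Refraction: redraw each coordinate around the midpoint between the stalled
    wave and the current best solution; the wave height is reset to h_max."""
    mu = (position + best) / 2.0
    sigma = np.abs(position - best) / 2.0
    return rng.normal(mu, sigma), h_max

def break_wave(position, lower, upper, beta=0.1, k_max=7):
    """Breaking: perturb k randomly chosen dimensions of a wave with a Gaussian
    offset proportional to beta * L_d, producing one solitary wave."""
    d = position.size
    k = rng.integers(1, min(k_max, d) + 1)
    dims = rng.choice(d, size=k, replace=False)
    solitary = position.copy()
    solitary[dims] += rng.normal(0.0, 1.0, size=k) * beta * (upper[dims] - lower[dims])
    return solitary

In the full algorithm, a wave whose height reaches zero is replaced by its refracted version, and breaking is applied only to a wave that becomes the new best solution.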
III. FCM ALGORITHM

In the FCM algorithm, the degree of membership u_ij is extended to be continuous, taking any value in [0, 1]; the remaining parameter variables are the same as before [4]. m is the weighting exponent that determines the fuzziness of the clustering result [5]. When m = 1, the FCM clustering algorithm degenerates into the hard division described earlier [8]. In practical applications, m is usually taken in [1.25, 2.5] [16].

Given a data set X = {x_1, x_2, ..., x_n}, where each sample has s attributes, the FCM algorithm divides X into c (2 ≤ c ≤ n) classes, with v = {v_1, v_2, ..., v_c} being the c cluster centers. u_ij indicates the degree to which sample j belongs to class i [15].

The objective function of FCM clustering is

J(U, V) = \sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij}^{m} d_{ij}^{2}    (3)

where d_ij = ||x_j - v_i|| is the distance between sample x_j and cluster center v_i.

In order to minimize the objective function J(U, V) of equation (3), the cluster centers v_i and the membership matrix U can be calculated by the following formulas:

v_i = \sum_{j=1}^{n} u_{ij}^{m} x_j / \sum_{j=1}^{n} u_{ij}^{m}, \quad i = 1, 2, ..., c    (4)

When d_ij ≠ 0,

u_{ij} = \left[ \sum_{k=1}^{c} \left( \|x_j - v_i\| / \|x_j - v_k\| \right)^{2/(m-1)} \right]^{-1}, \quad j = 1, 2, ..., n    (5)

and when d_ij = 0, u_ij = 1 and u_kj = 0 for k ≠ i.

In a word, the main part of fuzzy clustering is to determine similarity, and the fuzzy clustering algorithm itself is a simple iterative process.
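To make the update loop of equations (3)-(5) concrete, here is a short NumPy sketch (an illustrative assumption of this edit, not the authors' code) that alternates the center update (4) and the membership update (5) until the centers move by less than a tolerance delta.

import numpy as np

def fcm(X, c, m=2.0, delta=1e-6, max_iter=500, init_centers=None, rng=np.random.default_rng()):
    """Plain FCM: alternate equations (4) and (5) until the centers stabilize."""
    n, s = X.shape
    V = X[rng.choice(n, size=c, replace=False)] if init_centers is None else init_centers.copy()
    for _ in range(max_iter):
        # distances d_ij between every center i and sample j, shape (c, n)
        d = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                                    # guard the d_ij = 0 case
        # membership update, equation (5)
        U = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)
        # center update, equation (4)
        Um = U ** m
        V_new = (Um @ X) / Um.sum(axis=1, keepdims=True)
        if np.linalg.norm(V_new - V) <= delta:                   # termination test
            V = V_new
            break
        V = V_new
    J = np.sum(Um * d ** 2)                                      # objective (3) at the last memberships
    return V, U, J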
IV. WWO BASED FCM ALGORITHM

WWO is a new global optimization algorithm that propagates water waves starting from multiple individuals. Refraction and breaking are used to manipulate particular individuals so that they are renewed; in this way the individuals in the population are updated from generation to generation and gradually approach the optimal solution. The key idea of WWO is that low-energy waves have large wavelengths and therefore explore a wide range within a short life cycle, whereas high-energy waves have shorter wavelengths, so the local search is strengthened in small areas over a long life cycle. Its advantages are that it is simple, universal, and suitable for parallel processing. It is much more efficient than blind search and more versatile than an algorithm designed for one specific problem; it is a problem-independent solution model. It is precisely because of these characteristics that the water wave optimization algorithm can overcome the sensitivity of FCM to initialization [12][13]. Combining the water wave optimization algorithm with FCM can not only strengthen the global optimization ability but also take advantage of FCM's local optimization ability.

A. The construction of the fitness function

Biologists use the term fitness to measure how well a species adapts to its habitat. Similarly, the concept of fitness is used in the water wave optimization algorithm to measure how good each individual in the population is, that is, how likely it is to lead to the optimal solution in the optimization calculation [16].

For fuzzy clustering, the smaller the objective function value corresponding to a clustering result, the better the clustering effect [9]; accordingly, the smaller the objective function value, the larger the fitness assigned to that clustering result should be [10]. This paper therefore uses the objective function of the FCM algorithm to define the individual fitness function [11].

WWO first initializes a set of waves whose heights are set to hmax; each wave encodes a set of cluster centers and is shifted by a random offset to create a new wave X' during the search. For the evaluation of each solution (set of cluster centers) in the water wave optimization algorithm, the individual fitness function is defined as

f(x_i) = 1 / J = 1 / \sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij}^{m} d_{ij}^{2}    (6)

where J is the value of the objective function of equation (3). The smaller J is, the higher f(x_i) is, and the better the clustering effect.
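A minimal sketch of this fitness, assuming the FCM objective J is computed as in equation (3) (the helper names fcm_objective and fitness are illustrative, not from the paper):

import numpy as np

def fcm_objective(X, V, m=2.0):
    """Equation (3): J(U, V) for the memberships U implied by the centers V."""
    d = np.fmax(np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2), 1e-12)
    U = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)
    return np.sum((U ** m) * d ** 2)

def fitness(X, V, m=2.0):
    """Equation (6): the fitness of a wave (candidate centers V) is 1 / J."""
    return 1.0 / fcm_objective(X, V, m)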
B. WWO-FCM algorithm steps

Step 1. In water wave optimization, each solution corresponds to a water wave. Every individual has three characteristics: a fitness value f(x), a wave height h, and a wavelength λ. The initial wave height h is a constant (hmax) and the initial wavelength λ is also a constant. Calculate the fitness value of each individual water wave and start iterating; if the termination condition is satisfied, return the optimal solution and end the algorithm.


Step 2. For each wave X in the population, perform the propagation operation according to equation (1) to generate X'. If f(X') > f(X), perform the breaking operation and replace X with X'; otherwise, perform refraction.

Step 3. Update the global optimal solution of the entire population and output the individual with the best fitness as the optimal matching result v_i^(0).

Step 4. Input the number of clusters c, the fuzzy factor m, and the iteration termination condition δ. Take v_i^(0) as the initial clustering centers.

Step 5. Use formula (5) to calculate u_ij and formula (4) to calculate v_i^(1).

Step 6. If ||v_i^(1) - v_i^(0)|| ≤ δ, terminate the iteration and go to Step 7; otherwise, let v_i^(0) = v_i^(1) (i = 1, 2, ..., c) and go to Step 5.

Step 7. Output the clustering result (V, U).
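The steps above can be summarized in the following Python sketch, which wires the WWO search for initial centers to the FCM refinement. It reuses the hypothetical helpers propagate, update_wavelength, refract, break_wave, fitness, and fcm sketched earlier, and its parameter defaults follow Table II only loosely; it is a schematic outline under those assumptions, not the authors' implementation.

import numpy as np

def wwo_fcm(X, c, n_waves=5, iters=100, m=2.0, h_max=12, lam0=0.5,
            alpha=1.01, beta=0.1, k_max=7, delta=1e-6,
            rng=np.random.default_rng()):
    """Steps 1-7: WWO searches for good cluster centers (encoded as flat vectors),
    and the best wave found is handed to FCM as its initial centers."""
    n, s = X.shape
    lower, upper = np.tile(X.min(0), c), np.tile(X.max(0), c)   # bounds of a flattened center set
    waves = rng.uniform(lower, upper, size=(n_waves, c * s))    # Step 1: initial population
    heights = np.full(n_waves, h_max)
    lams = np.full(n_waves, lam0)
    fits = np.array([fitness(X, w.reshape(c, s), m) for w in waves])
    best_idx = fits.argmax()
    best, best_fit = waves[best_idx].copy(), fits[best_idx]

    for _ in range(iters):
        for i in range(n_waves):                                 # Step 2: propagate each wave
            new = propagate(waves[i], lams[i], lower, upper, rng)
            new_fit = fitness(X, new.reshape(c, s), m)
            if new_fit > fits[i]:
                waves[i], fits[i], heights[i] = new, new_fit, h_max
                if new_fit > best_fit:                           # breaking around a new best wave
                    best, best_fit = new.copy(), new_fit
                    solitary = break_wave(new, lower, upper, beta, k_max)
                    s_fit = fitness(X, solitary.reshape(c, s), m)
                    if s_fit > best_fit:
                        best, best_fit = solitary, s_fit
            else:
                heights[i] -= 1
                if heights[i] <= 0:                              # refraction of an exhausted wave
                    waves[i], heights[i] = refract(waves[i], best, h_max)
                    fits[i] = fitness(X, waves[i].reshape(c, s), m)
        lams = update_wavelength(lams, fits, fits.min(), fits.max(), alpha)

    # Steps 3-7: refine the best WWO centers with standard FCM
    V0 = best.reshape(c, s)
    return fcm(X, c, m=m, delta=delta, init_centers=V0)

A call such as V, U, J = wwo_fcm(X, c=6) on a data set like Glass would then yield the refined centers, the membership matrix, and the final objective value.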
V. EXPERIMENTAL STUDIES

In order to verify the validity and feasibility of the WWO-FCM algorithm, one real data set, Glass, from the UCI standard repository and two simulated data sets, leuk72_3k and X4k2_far, were used to test the algorithm. The data set information is summarized in Table I.

TABLE I. Information for the data sets

Data set    Size   Number of clusters   Dimension
Glass       214    6                    13
leuk72_3k   72     3                    39
X4k2_far    400    4                    2

The Glass data set consists of samples of glass fragments left behind at crime scenes. It contains 214 samples belonging to 6 glass classes: 70 samples of building_windows_float, 76 samples of building_windows_non_float, 17 samples of vehicle_windows_float, 13 samples of containers, 9 of tableware, and 29 of headlamps. Each glass sample has 10 attributes: an identity label (Id number), the refractive index (RI), and the content of sodium (Na), magnesium (Mg), aluminum (Al), silicon (Si), potassium (K), calcium (Ca), barium (Ba), and iron (Fe). The two data sets leuk72_3k and X4k2_far are publicly available simulated data sets; their details are also shown in Table I.

The GA-FCM, PSO-FCM, and WWO-FCM algorithms were used to cluster the above three experimental data sets. The minimum error of each algorithm was 10^-6, and the fuzzy weighting exponent was m = 2. The parameters of the PSO-FCM algorithm were set as follows: particle swarm population size n = 20, maximum number of iterations 500, learning factors C1 = C2 = 1.5, inertia weight Wmax = 0.9 and Wmin = 0.4. The WWO-FCM parameter settings are shown in Table II below.

TABLE II. WWO-FCM algorithm parameter settings

Parameter name   Glass data set   leuk72_3k data set   X4k2_far data set
N0               6                3                    4
hmax             12               12                   12
d                13               39                   2
n                2                2                    2
λ                0.5              0.5                  0.5
α                1.01             1.01                 1.01
β                0.1              0.1                  0.1
kmax             7                12                   1

The three algorithms were each run 30 times. Figs. I-VI show the clustering results on the Glass, leuk72_3k, and X4k2_far data sets together with the corresponding convergence curves of the objective function. The average accuracy and running time of the three algorithms were recorded, and the statistical results are shown in Tables III to V.

According to the experimental results in Tables III to V, the objective function value of the WWO-FCM algorithm converges to the global minimum, and its accuracy is better than that of the GA-FCM and PSO-FCM algorithms; the classification result is also more accurate.

Fig. I. Clustering results of the WWO-FCM algorithm on Glass

Fig. II. The convergence comparison of different methods on Glass
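For reproducibility, accuracy figures of the kind reported below can be computed by matching cluster labels to ground-truth labels; the paper does not state the matching procedure, so the permutation-based helper here is purely an assumption of this edit.

from itertools import permutations
import numpy as np

def clustering_accuracy(true_labels, cluster_labels, c):
    """Best accuracy over all mappings of cluster indices to class indices
    (feasible for the small cluster counts, 3 to 6, used in these experiments)."""
    true_labels = np.asarray(true_labels)
    cluster_labels = np.asarray(cluster_labels)
    best = 0.0
    for perm in permutations(range(c)):
        mapped = np.array([perm[k] for k in cluster_labels])
        best = max(best, np.mean(mapped == true_labels))
    return 100.0 * best  # percentage, as in Tables III-V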


TABLE III. Clustering results on the Glass data set

Algorithm   Average accuracy (%)   Objective function optimal value   Number of iterations   Time (s)
GA-FCM      83.97                  154.146                            93                     7.44
PSO-FCM     85.00                  152.286                            88                     6.17
WWO-FCM     90.35                  151.146                            92                     7.15

Fig. III. Clustering results of the WWO-FCM algorithm on leuk72_3k

Fig. IV. The convergence comparison of different methods on leuk72_3k

TABLE IV. Clustering results on the leuk72_3k data set

Algorithm   Average accuracy (%)   Objective function optimal value   Number of iterations   Time (s)
GA-FCM      78.61                  814.77                             36                     3.38
PSO-FCM     86.63                  806.97                             34                     3.17
WWO-FCM     89.48                  804.54                             33                     3.92

Fig. V. Clustering results of the WWO-FCM algorithm on X4k2_far

Fig. VI. The convergence comparison of different methods on X4k2_far

TABLE V. Clustering results on the X4k2_far data set

Algorithm   Average accuracy (%)   Objective function optimal value   Number of iterations   Time (s)
GA-FCM      95.32                  122.93                             17                     3.28
PSO-FCM     97.25                  110.76                             16                     3.13
WWO-FCM     99.04                  100.00                             14                     2.52

From the convergence point of view, on the Glass data set the WWO-FCM algorithm sometimes needs a longer running time and more iterations than the other two algorithms, but its accuracy is improved. On the other two data sets, WWO-FCM occasionally has a longer running time, but its number of iterations is the lowest and its accuracy is always the highest. Therefore, on the whole, the WWO-FCM algorithm achieves a better clustering effect.


VI. CONCLUSION

The WWO-FCM algorithm proposed in this paper combines the water wave optimization algorithm with the FCM fuzzy clustering algorithm. The water wave optimization algorithm first uses its global optimization ability to find an optimal solution, which is used as the initial clustering center of FCM; FCM then carries out its clustering steps to obtain the global optimal solution. Because the water wave optimization algorithm has a mechanism to jump out of local optima, the proposed algorithm overcomes the sensitivity of the FCM algorithm to the initial values and to noise, and also overcomes the defect that FCM easily falls into a local optimum. The experimental results show that WWO-FCM not only has a strong global search ability, but also achieves higher clustering accuracy than the traditional GA-FCM and PSO-FCM algorithms; the clustering speed is improved and the overall clustering effect is significantly better. However, some initial parameters of the WWO-FCM algorithm are currently determined only empirically and need further study and improvement.

ACKNOWLEDGMENT

This work is funded by the National Natural Science Foundation of China under Grants No. 61772180 and No. 61502155, covering the in-depth study of unstructured big data analysis algorithms and the research of social-network-based image search technology under the big data environment (youth project).

REFERENCES

[1] Sheshasaayee A, Sridevi D. A Combined System for Regionalization in Spatial Data Mining Based on Fuzzy C-Means Algorithm with Gravitational Search Algorithm[J]. 2017, pp 787-794.
[2] Raveen S, Prasad P S S, Chillarige R R. A New Preprocessor to Fuzzy c-Means Algorithm[C]// International Workshop on Multi-Disciplinary Trends in Artificial Intelligence. Springer-Verlag New York, Inc., 2014: 124-135.
[3] Rubio E, Castillo O, Melin P. Interval Type-2 Fuzzy System Design Based on the Interval Type-2 Fuzzy C-Means Algorithm[J]. 2016.
[4] Pang L, Xiao K, Liang A, et al. An Improved Clustering Analysis Method Based on Fuzzy C-Means Algorithm by Adding PSO Algorithm[M]// Hybrid Artificial Intelligent Systems. Springer Berlin Heidelberg, 2012: 231-242.
[5] Zhang B, Xue L, Wang W, et al. An Improving Fuzzy C-means Algorithm for Concept-Drifting Data Stream[J]. 2016, pp 439-450.
[6] Zheng Yu-jun. Water wave optimization: A new nature-inspired metaheuristic[J]. Computers and Operations Research, 2015, 55: 1-11.
[7] Zheng Yu-jun, Zhang Bei. A simplified water wave optimization algorithm[C]// Proceedings of the 2015 IEEE Congress on Evolutionary Computation. IEEE Press, New York, 2015.
[8] Garai P, Maji P. Identification of Co-Expressed microRNAs Using Rough Hypercuboid Based Interval Type-2 Fuzzy C-Means Algorithm[C]// International Conference on Advanced Computing and Intelligent Engineering. 2016.
[9] Tripathy B K, Tripathy A, Govindarajulu K, et al. On Kernel Based Rough Intuitionistic Fuzzy C-means Algorithm and a Comparative Analysis[J]. 2014, 2014(27): 349-359.
[10] Li M, Song Y C, Li Y, et al. Research of improved fuzzy c-means algorithm based on a new metric norm[J]. Journal of Shanghai Jiaotong University, 2015, 20(1): 51-55.
[11] Kuo H C, Lin Y J. The Optimal Estimation of Fuzziness Parameter in Fuzzy C-Means Algorithm[M]// Rough Sets. 2017, 2017(10313): 51-55.
[12] Szilágyi L, Szilágyi S M, Enăchescu C. A Study on Cluster Size Sensitivity of Fuzzy c-Means Algorithm Variants[M]// Neural Information Processing. Springer International Publishing, 2016, 2016(9948): 470-478.
[13] Agrawal S, Tripathy B K. Decision Theoretic Rough Intuitionistic Fuzzy C-Means Algorithm[J]. 2016, 2016(50): 71-82.
[14] Endo Y, Kinoshita N, Iwakura K, et al. Hard and Fuzzy c-means Algorithms with Pairwise Constraints by Non-metric Terms[J]. 2014, 8825: 145-157.
[15] Kumar S, Kashyap M, Saluja A, et al. Segmentation of Cotton Bolls by Efficient Feature Selection Using Conventional Fuzzy C-Means Algorithm with Perception of Color[J]. 2016, 2016(239): 731-741.
[16] Hassen D B, Taleb H, Yaacoub I B, et al. Classification of Chest Lesions with Using Fuzzy C-Means Algorithm and Support Vector Machines[M]// International Joint Conference SOCO'13-CISIS'13-ICEUTE'13. Springer International Publishing, 2014: 319-328.

