
Fuzzy Clustering Techniques: Fuzzy C-Means and Fuzzy Min-Max Clustering Neural Networks
Benjamin James Bush
SSIE 617 Term Paper, Fall 2012

|1| INTRODUCTION
Data clustering is a data processing strategy which aims to organize a collection of
data points (hereafter simply called points) into groups. Traditionally, the data set is
partitioned so that each point belongs to one and only one cluster. However, unless
the data is very highly clustered, it is often the case that some points do not
completely belong to any one cluster. With the arrival of fuzzy clustering, each such
point can be assigned a set of membership degrees, one per cluster, instead of
being artificially pigeonholed as belonging to only one cluster. The volume
of literature available on fuzzy clustering is immense; a general review of the
literature is outside the scope of this term paper. This paper discusses only two
approaches to fuzzy clustering: the ubiquitous Fuzzy C-Means Clustering algorithm
and the less well known but interesting Fuzzy Min-Max Clustering Neural Network.
These approaches are discussed in sections 2 and 3, respectively. In section 4 I
briefly discuss several applications which use the fuzzy clustering techniques
covered here.

|2| THE FUZZY C-MEANS (FCM) CLUSTERING ALGORITHM
Fuzzy C-Means, also known as Fuzzy K-Means and Fuzzy ISODATA, is one of the
oldest and most ubiquitous fuzzy clustering algorithms. FCM is a generalization of
the K-Means clustering algorithm, which is a simple and widely used method for
finding crisp clusters. Because understanding FCM’s crisp ancestor is instructive,
K-Means is discussed first.
|2.1| K-MEANS CLUSTERING
The “K” in K-Means refers to the fact that in K-Means clustering, the number of
clusters is decided before the process begins. The “Means” in K-Means refers to the
fact that each cluster is characterized by the mean of all the points that belong to
the cluster. Thus, in K-Means clustering our goal is literally to find K means, thereby
giving us the K clusters we seek. In particular, the means we seek are those which
minimize the cost function depicted in the following figure:

Figure 1: Cost function minimized during K-Means clustering. Equation taken from [1]. Annotations by me.
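For reference, the K-Means cost function is commonly written in the form below. The symbols are assumptions chosen for illustration and may differ from the notation of the annotated figure: K is the number of clusters, G_i is the set of points assigned to cluster i, c_i is the centroid (mean) of cluster i, and x_k is a data point.

```latex
J = \sum_{i=1}^{K} \; \sum_{x_k \in G_i} \lVert x_k - c_i \rVert^{2}
```

The outer sum runs over clusters and the inner sum runs only over the points that belong to each cluster, so every point contributes its squared distance to exactly one centroid.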

The process is initialized by picking K different “centroids” at random from the space in which the points are embedded. Each centroid is associated with a different cluster. From here, the K-Means process can be divided into two phases:

Phase 1: Form Clusters. To form these clusters, each point in the data set is evaluated in turn. When evaluated, a point is assigned to the cluster corresponding to the closest centroid.

Phase 2: Move Centroids. Each centroid is now moved to the position obtained by taking the mean of the points in the cluster associated with that centroid.

These two phases are repeated in turn until convergence is reached (i.e., until the value of the cost function stops decreasing significantly). It should be noted that there is no guarantee that the global minimum of the cost function will be found; the outcome depends on the initial positions of the centroids. A flow chart is provided below to aid the reader’s understanding of the algorithm.

Figure 2: A flow chart summarizing the K-Means Clustering Algorithm.

Understanding of the K-Means clustering algorithm can be further enhanced by viewing a series of animated GIF images produced by Andrey A. Shabalin, available at http://shabal.in/visuals.html. Key frames from the animation are provided below for the reader’s convenience. Visually inspecting these key frames in conjunction with the above flow chart can be very instructive.
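The two phases described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `k_means`, its parameters, and the use of NumPy are assumptions, and the initial centroids are drawn from the data points themselves rather than from arbitrary positions in the data space.

```python
import numpy as np

def k_means(points, k, tol=1e-6, max_iter=100, seed=0):
    """Minimal K-Means sketch. Returns (centroids, labels, cost)."""
    rng = np.random.default_rng(seed)
    # Initialization (assumption): use k distinct data points as centroids.
    start = rng.choice(len(points), size=k, replace=False)
    centroids = points[start].astype(float)
    prev_cost = np.inf
    for _ in range(max_iter):
        # Phase 1 (form clusters): assign each point to its closest centroid.
        sq_dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        cost = sq_dist[np.arange(len(points)), labels].sum()
        # Stop once the cost no longer decreases significantly.
        if prev_cost - cost < tol:
            break
        prev_cost = cost
        # Phase 2 (move centroids): move each centroid to the mean of its cluster.
        for i in range(k):
            members = points[labels == i]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[i] = members.mean(axis=0)
    return centroids, labels, cost

# Usage: two well separated groups of three points each.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels, cost = k_means(points, k=2)
```

Running the sketch with different `seed` values is a simple way to observe the dependence on initial conditions mentioned above.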

Figure 3: Key frames from an animation on K-Means clustering by Andrey A. Shabalin, PhD.

|2.2| FUZZY C-MEANS CLUSTERING (FCM)
FCM is a generalization of K-Means. While K-Means assigns each point to one and only one cluster, FCM allows clusters to be fuzzy sets, so that each point belongs to all clusters to varying degrees, with the following restriction: the sum of all membership degrees for any given data point is equal to 1. The cost function used in FCM (shown in Figure 4) is very similar to the one used by K-Means, but there are some key differences: the inner sum contains a term for each data point in the set, and each of these terms is weighted by a membership degree raised to the power of a fuzziness exponent.

Figure 4: Cost function for FCM. Compare with Figure 1. Figure adapted from [1]. Annotations by me.

Applying the method of Lagrange multipliers to minimize the above cost function yields the following necessary (but not sufficient) conditions [1]:
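For reference, the FCM cost function and the two necessary conditions are commonly stated in the form below. The notation is an assumption chosen for illustration and may differ from that of [1]: C is the number of clusters, n the number of points, x_j a data point, c_i a centroid, u_ij the membership degree of point j in cluster i, and m > 1 the fuzziness exponent.

```latex
% Cost function and the membership restriction
J = \sum_{i=1}^{C} \sum_{j=1}^{n} u_{ij}^{m} \, \lVert x_j - c_i \rVert^{2},
\qquad \sum_{i=1}^{C} u_{ij} = 1 \quad \text{for every } j

% First condition: centroid positions
c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} \, x_j}{\sum_{j=1}^{n} u_{ij}^{m}}

% Second condition: membership degrees
u_{ij} = \frac{1}{\sum_{k=1}^{C}
  \left( \dfrac{\lVert x_j - c_i \rVert}{\lVert x_j - c_k \rVert} \right)^{2/(m-1)}}
```

The first condition places each centroid at a weighted mean of all points, and the second sets each membership degree from the ratios of a point's distances to the centroids.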

Like K-Means, FCM is initialized by choosing a fixed number of centroids at random. Each centroid is associated with a different fuzzy cluster. Also like K-Means, FCM after initialization is divided into two phases:

Phase 1: Form Clusters. To form these clusters, each point in the data set is evaluated in turn. When evaluated, a point is assigned a membership degree with respect to each cluster. The numerical value of these degrees is given by the second of the above conditions.

Phase 2: Move Centroids. Each centroid is now moved to the position obtained via the first of the above conditions.

The reader should verify that the flow chart for FCM provided below closely resembles the flow chart for K-Means above. Note also the incorporation of the aforementioned conditions.

Figure 5: A flow chart summarizing the FCM Clustering Algorithm. Compare to Figure 2.

It is instructive to visualize the fuzzy clusters produced by FCM. For this purpose, it is convenient to use a one dimensional data set, as is used in Figure 6 below.

Figure 6: Three fuzzy clusters produced by FCM on a 1 dimensional data set. Figure taken from [2].
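The two FCM phases can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `fuzzy_c_means`, its parameters, and the use of NumPy are hypothetical, the initial centroids are drawn from the data points, and the update rules are the commonly stated centroid and membership conditions for FCM.

```python
import numpy as np

def fuzzy_c_means(points, c, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Minimal FCM sketch. Returns (centroids, memberships, cost).

    memberships[i, j] is the degree to which point j belongs to cluster i.
    """
    rng = np.random.default_rng(seed)
    # Initialization (assumption): use c distinct data points as centroids.
    start = rng.choice(len(points), size=c, replace=False)
    centroids = points[start].astype(float)
    prev_cost = np.inf
    for _ in range(max_iter):
        # Phase 1 (form clusters): compute membership degrees from distances.
        dist = np.linalg.norm(points[None, :, :] - centroids[:, None, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        ratio = dist[:, None, :] / dist[None, :, :]  # ratio[i, k, j] = d_ij / d_kj
        u = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=1)
        cost = ((u ** m) * dist ** 2).sum()
        # Stop once the cost no longer decreases significantly.
        if prev_cost - cost < tol:
            break
        prev_cost = cost
        # Phase 2 (move centroids): weighted mean with weights u^m.
        weights = u ** m
        centroids = (weights @ points) / weights.sum(axis=1, keepdims=True)
    return centroids, u, cost

# Usage: a one dimensional data set with two groups.
data = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
fcm_centroids, fcm_u, fcm_cost = fuzzy_c_means(data, c=2, m=2.0)
```

Each column of the returned membership matrix sums to 1, which is the restriction on membership degrees stated earlier.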

MATLAB’s fcmdemo command provides a great way to interact with FCM using 2 dimensional data (see footnote 1). To start the demo, simply enter the command fcmdemo into the MATLAB command window. One can run FCM on several preloaded data sets or provide a custom data file. The number of clusters can be varied, as can the fuzziness exponent and the stopping criteria. Once FCM has finished running, one can directly view and manipulate each of the fuzzy clusters. Screenshots follow.

Footnote 1: MATLAB’s fcmdemo depends on the Fuzzy Logic Toolbox, which is available for purchase from MathWorks at the following URL: http://www.mathworks.com/products/fuzzy-logic/index.html The laptops in the Enginet classrooms at Binghamton University already have the Fuzzy Logic Toolbox installed.

Figure 7: The main window of fcmdemo after running it on data set 2 with C = 3 and m = 2.

Figure 8: Membership function plots after running fcmdemo with fuzziness exponent m = 1.5.

Figure 9: Membership function plots after running fcmdemo with fuzziness exponent m = 4. Compare with Figure 8.
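The effect of the fuzziness exponent can also be checked numerically. The sketch below is an illustration under assumptions (the helper name `memberships` is hypothetical, and it uses the commonly stated FCM membership formula): it computes the membership degrees of a single point that lies at distance 1 from one centroid and distance 2 from another, for several values of m.

```python
import numpy as np

def memberships(distances, m):
    """Membership degrees of one point, given its distances to each centroid."""
    d = np.asarray(distances, dtype=float)
    ratio = d[:, None] / d[None, :]  # ratio[i, k] = d_i / d_k
    return 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=1)

# The same point, evaluated with increasingly large fuzziness exponents.
for m in (1.5, 2.0, 4.0):
    print(m, memberships([1.0, 2.0], m))
```

With m = 2 the degrees are 0.8 and 0.2. A smaller exponent pushes the degrees toward 0 and 1 (nearly crisp clusters), while a larger exponent pulls them toward equal membership in every cluster.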

|3| FUZZY MIN-MAX CLUSTERING NEURAL NETWORKS (FMMCNN)
FCM requires that the number of clusters be specified in advance. However, the number of clusters that should be used is not always clear, as the figure below illustrates.

Figure 10: A data set (top) can be clustered into 4 (bottom left) or 2 (bottom right) clusters.

There are many fuzzy clustering techniques which will automatically determine the number of clusters that should be used. Among them is the Fuzzy Min-Max Clustering Neural Network (FMMCNN), which we discuss in this section.

|3.1| HYPERBOX FUZZY SETS
The fuzzy clusters used in a FMMCNN are called hyperbox fuzzy sets. A hyperbox is completely defined by its min point and its max point. The min point is a vector whose components provide a lower bound for each dimension, which must be met or exceeded to remain within the hyperbox. Analogously, the max point provides an upper bound for each dimension, which must be respected to remain within the hyperbox. For example, suppose we have a 2 dimensional hyperbox with min point <5, 20>. Then for a data point <x, y> to lie within the hyperbox, it is necessary that x ≥ 5 and y ≥ 20. A hyperbox fuzzy set has a hyperbox core, so that every point that lies within the hyperbox is given a membership degree of 1. The membership function of the hyperbox fuzzy set then decays linearly as one moves further away from the hyperbox core. A systemwide parameter γ controls the rate of this decay. A formal definition of the membership function associated with a hyperbox fuzzy set is shown in Figure 11.

To gain a more practical and intuitive understanding of hyperbox fuzzy sets, contour plots can be generated and manipulated using a Mathematica notebook created by me. With this notebook, one can control the position of the min and max points, as well as the gamma membership decay parameter. The notebook can also plot one dimensional hyperbox fuzzy sets, thereby revealing that hyperbox fuzzy sets can be thought of as generalized symmetric trapezoidal fuzzy numbers. The notebook is available from my website at the following URL: http://www.benjaminjamesbush.com/fuzzyclustering Screenshots are given in Figure 12 for the reader’s convenience.

Figure 11: Membership function of a Hyperbox Fuzzy Set. Adapted from [3] (see footnote 2).

Footnote 2: [3] contains some typographical errors. They have been corrected in Figure 11.
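The hyperbox membership function described above can be sketched in Python. This is an illustration under assumptions: the names `ramp` and `hyperbox_membership` are hypothetical, and the formula mirrors the form used in the appendix code (one minus two ramp terms per dimension, averaged over the dimensions).

```python
import numpy as np

def ramp(x, gamma):
    """Ramp function: 0 when x*gamma < 0, 1 when x*gamma > 1, x*gamma in between."""
    return np.clip(x * gamma, 0.0, 1.0)

def hyperbox_membership(a, min_point, max_point, gamma):
    """Membership degree of point a in the hyperbox [min_point, max_point]."""
    a = np.asarray(a, dtype=float)
    v = np.asarray(min_point, dtype=float)
    w = np.asarray(max_point, dtype=float)
    # Each dimension loses membership linearly once the point leaves the core.
    per_dimension = 1.0 - ramp(a - w, gamma) - ramp(v - a, gamma)
    # Average the per-dimension values to obtain the overall degree.
    return per_dimension.mean(axis=-1)

# Usage: a point inside the core and a point just outside it.
inside = hyperbox_membership([0.5, 0.5], [0.2, 0.2], [0.8, 0.8], gamma=4.0)
outside = hyperbox_membership([1.0, 0.5], [0.2, 0.2], [0.8, 0.8], gamma=4.0)
```

Points inside the core receive a degree of exactly 1, and a larger gamma makes the membership fall off more steeply outside the core.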

Figure 12: Manipulating hyperbox fuzzy sets in Mathematica. One dimensional (top) and two dimensional (bottom).

|3.2| FUZZY MIN MAX NEURAL NETWORKS
A major advantage of using hyperbox fuzzy sets for fuzzy clustering is the fact that they can easily be implemented as 2 layer artificial neural networks. The following figure illustrates how this is done.

Figure 13: A hyperbox fuzzy set implemented as a 2 layer artificial neural network.

The input layer contains one node per dimension of the space in which the data points are embedded. Each input node is connected to the output node via a pair of weighted links, which are weighted by the corresponding component values of the max point and min point, respectively. Implementing a clustering system in this way allows for the development of massively parallel systems that can quickly calculate the membership values for incoming data.

|3.3| EVOLVING FUZZY CLUSTERS
Another advantage of hyperbox fuzzy sets is the relative simplicity with which they can be expressed. As previously mentioned, a hyperbox fuzzy set can be completely represented by a min point and a max point. This makes it very easy to design evolutionary algorithms which can be used to evolve sets of hyperbox fuzzy sets for use within fuzzy min-max clustering neural networks. One such algorithm was published by Fogel and Simpson in [3] and is outlined in the flow chart in Figure 14. For their fitness function, Fogel and Simpson use the minimum description length (MDL), which is in some way an optimal compromise between fitting the data and using the smallest possible number of clusters. For more information on the MDL, see [4].

Figure 14: Flow chart summarizing the evolutionary algorithm used in [3].

|4| APPLICATIONS
Fuzzy clustering is becoming an important data processing technique in many scientific fields. While the use of FCM is widespread, applications of fuzzy min-max clustering neural networks are harder to come by. Below I list a few interesting applications which I encountered in the literature.

GENETICS Gasch and Eisen used FCM to find clusters of yeast genes [5].

POLITICS Teran and Meier designed a fuzzy system that used FCM to simplify the complex political landscape and recommend candidates to voters based on fuzzy data obtained from surveys [6].

RADIOLOGY John, Innocent and Barnes used a fuzzy min-max clustering neural network to group x-ray images of the tibia into clusters [7].

INDUSTRIAL ENGINEERING Dobado et al. used a fuzzy min-max clustering neural network to group parts into part families, an important step in the formation of cells for cellular manufacturing [8].

APPENDIX: MATHEMATICA CODE
The following Mathematica code can be used to create interactive plots of hyperbox fuzzy sets in one and two dimensions. The code has been tested on Mathematica 8.

(* Ramp function: 1 when x y > 1, x y when 0 <= x y <= 1, and 0 otherwise *)
f[x_, y_] := If[x y > 1, 1, If[0 <= x y <= 1, x y, 0]]

(* Membership function of a one dimensional hyperbox fuzzy set *)
b1D[a_, min_, max_, gamma_] := 1 - f[a - max, gamma] - f[min - a, gamma]

(* Interactive one dimensional plot; initial slider values are illustrative *)
Manipulate[
 Show[
  Plot[b1D[a, min, max, gamma], {a, 0, 1}, PlotRange -> {0, 1}],
  Graphics[{PointSize[0.05], Black, Point[{min, 0}]}],
  Graphics[{PointSize[0.05], Red, Point[{max, 0}]}]],
 {{min, 0.2}, 0, 1}, {{max, 0.8}, 0, 1}, {{gamma, 6}, 0, 40}]

(* Membership function of a two dimensional hyperbox fuzzy set *)
b2D[a1_, a2_, min1_, min2_, max1_, max2_, gamma_] :=
 1/2 ((1 - f[a1 - max1, gamma] - f[min1 - a1, gamma]) +
      (1 - f[a2 - max2, gamma] - f[min2 - a2, gamma]))

(* Interactive contour plot; initial slider values and contour count are illustrative *)
Manipulate[
 Show[
  ContourPlot[b2D[a1, a2, min1, min2, max1, max2, gamma],
   {a1, 0, 1}, {a2, 0, 1}, Contours -> 5, ContourLabels -> True],
  Graphics[{EdgeForm[Thick], White, Rectangle[{min1, min2}, {max1, max2}]}],
  Graphics[Line[{{min1, min2}, {max1, max2}}]],
  Graphics[{PointSize[0.05], Black, Point[{min1, min2}]}],
  Graphics[{PointSize[0.05], Red, Point[{max1, max2}]}]],
 {{min1, 0.2}, 0, 1}, {{min2, 0.3}, 0, 1},
 {{max1, 0.6}, 0, 1}, {{max2, 0.5}, 0, 1}, {{gamma, 5}, 0, 10}]

WORKS CITED
[1] J.-S. R. Jang, C-T Sun, and E Mizutani, Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence. Prentice Hall, 1997.
[2] Matteo Matteucci. (2012, May) A Tutorial on Clustering Algorithms: Fuzzy C-Means Clustering. [Online]. http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html
[3] D B Fogel and P K Simpson, "Evolving Fuzzy Clusters," in IEEE International Conference on Neural Networks, 1993.
[4] Peter Grünwald. (2008, August) Videolectures.net: MDL Tutorial. [Online]. http://videolectures.net/icml08_grunwald_mld/
[5] A P Gasch and M B Eisen, "Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering," Genome Biology, vol. 3, no. 11, October 2002.
[6] L Teran and A Meier, "A Fuzzy Recommender System for eElections," in Electronic Government and the Information Systems Perspective, 2010, pp. 62-76.
[7] R I John, P R Innocent, and M R Barnes, "Neuro-fuzzy clustering of radiographic tibia image data using type 2 fuzzy sets," Information Sciences, vol. 125, no. 1-4, pp. 65-82, June 2000.
[8] D Dobado, S Lozano, J M Bueno, and J Larrañeta, "Cell formation using a Fuzzy Min-Max neural network," International Journal of Production Research, vol. 40, no. 1, pp. 93-107, November 2010.