c-MEANS CLUSTERING

Bezdek (1981) developed an extremely powerful classification method to accommodate fuzzy data. It is an extension of a method known as c-means, or hard c-means, when employed in a crisp classification sense. To introduce this method, we define a sample set of $n$ data samples that we wish to classify:

$$X = \{x_1, x_2, x_3, \ldots, x_n\} \quad (10.2)$$

Each data sample, $x_i$, is defined by $m$ features, that is,

$$x_i = \{x_{i1}, x_{i2}, \ldots, x_{im}\} \quad (10.3)$$

so each $x_i$ is a point in $m$-dimensional feature space. We seek to partition $X$ into $c$ classes $A_i$, $i = 1, 2, \ldots, c$, satisfying

$$\bigcup_{i=1}^{c} A_i = X \quad (10.4)$$

$$A_i \cap A_j = \varnothing, \quad \text{for all } i \neq j \quad (10.5)$$

$$\varnothing \subset A_i \subset X, \quad \text{for all } i \quad (10.6)$$

with $2 \leq c < n$: taking $c = n$ classes just places each data sample into its own class, and $c = 1$ places all data samples into the same class; neither case requires any effort in classification, and both are intrinsically uninteresting. Equation (10.4) expresses the fact that the set of all classes exhausts the universe of data samples. Equation (10.5) indicates that none of the classes overlap, in the sense that no data sample can belong to more than one class. Equation (10.6) simply expresses that a class cannot be empty and it cannot contain all the data samples.

Suppose we have the case where $c = 2$. Equations (10.4) and (10.5) are then manifested in the following set expressions:

$$A_2 = \overline{A}_1, \quad A_1 \cup A_2 = X, \quad \text{and} \quad A_1 \cap A_2 = \varnothing \quad (10.7)$$

These set expressions are equivalent to the excluded middle axioms (Equation (2.12)).

The function-theoretic expressions associated with Equations (10.4)-(10.6) are as follows:

$$\bigvee_{i=1}^{c} \chi_{A_i}(x_k) = 1, \quad \text{for all } k \quad (10.8)$$

$$\chi_{A_i}(x_k) \wedge \chi_{A_j}(x_k) = 0, \quad \text{for all } k \quad (10.9)$$

$$0 < \sum_{k=1}^{n} \chi_{A_i}(x_k) < n, \quad \text{for all } i \quad (10.10)$$

where the characteristic function $\chi_{A_i}(x_k)$ is defined once again as

$$\chi_{A_i}(x_k) = \begin{cases} 1, & x_k \in A_i \\ 0, & x_k \notin A_i \end{cases} \quad (10.11)$$

Equations (10.8) and (10.9) explain that any sample $x_k$ can only and definitely belong to one of the $c$ classes. Equation (10.10) implies that no class is empty and no class is the whole set $X$ (i.e., the universe).

For simplicity in notation, the membership assignment of the $j$th data point in the $i$th class is written $\chi_{ij} \equiv \chi_{A_i}(x_j)$, and we define a matrix $U$ comprising the elements $\chi_{ij}$ ($i = 1, 2, \ldots, c$; $j = 1, 2, \ldots, n$), that is, a matrix with $c$ rows and $n$ columns. The hard $c$-partition space for $X$ is then the matrix set

$$M_c = \left\{ U \;\middle|\; \chi_{ij} \in \{0, 1\},\; \sum_{i=1}^{c} \chi_{ij} = 1,\; 0 < \sum_{j=1}^{n} \chi_{ij} < n \right\} \quad (10.12)$$

Any matrix $U \in M_c$ is a hard $c$-partition. The cardinality of the hard $c$-partition space $M_c$ is

$$\eta_{M_c} = \frac{1}{c!} \left[ \sum_{i=1}^{c} \binom{c}{i} (-1)^{c-i} \, i^n \right] \quad (10.13)$$

where the expression $\binom{c}{i}$ is the binomial coefficient of $c$ things taken $i$ at a time.

Example 10.6. Suppose we have five data points in a universe, $X = \{x_1, x_2, x_3, x_4, x_5\}$. Also, suppose we want to cluster these five points into two classes. For this case we have $n = 5$ and $c = 2$. The cardinality, using Equation (10.13), of this hard 2-partition space is

$$\eta_{M_c} = \tfrac{1}{2}\left[2(-1) + 2^5\right] = 15$$

Some of the 15 possible hard 2-partitions are, for example,

$$\begin{bmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}, \quad \ldots$$

Notice that a matrix and its row-swapped counterpart, such as

$$\begin{bmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 0 \end{bmatrix}$$

are not different clustering 2-partitions; they are the same 2-partition irrespective of an arbitrary row-swap. If we label the first row of the first $U$ matrix class $c_1$ and the second row class $c_2$, we get the same classification from the second $U$ matrix by simply relabeling the rows: the first row is $c_2$ and the second row is $c_1$. The cardinality measure given in Equation (10.13) counts only the unique $c$-partitions of the $n$ data points.
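Equation (10.13) is easy to check numerically. The short Python sketch below evaluates the binomial sum and reproduces the count of 15 from Example 10.6; the function name `hard_partition_cardinality` is our own, not notation from the text. The quantity computed is the Stirling number of the second kind, $S(n, c)$.

```python
from math import comb, factorial

def hard_partition_cardinality(c: int, n: int) -> int:
    """Eq. (10.13): number of unique hard c-partitions of n samples."""
    total = sum(comb(c, i) * (-1) ** (c - i) * i ** n
                for i in range(1, c + 1))
    return total // factorial(c)  # the sum is always divisible by c!

print(hard_partition_cardinality(2, 5))    # 15, as in Example 10.6
print(hard_partition_cardinality(10, 25))  # ~1.2e18; see the discussion below
```

The second call previews the combinatorial explosion discussed next: even a modest problem with $n = 25$ and $c = 10$ already has on the order of $10^{18}$ unique partitions.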
An interesting question now arises: of all the possible $c$-partitions for $n$ data samples, how can we select the most reasonable $c$-partition from the partition space $M_c$? For instance, in the example just provided, which of the 15 possible hard 2-partitions for five data points and two classes is the best? The answer is provided by the objective function (classification criterion) used to classify or cluster the data. The one proposed for the hard c-means (HCM) algorithm is known as a within-class sum of squared errors approach, using a Euclidean norm to characterize distance. This algorithm is denoted $J(U, v)$, where $U$ is the partition matrix and the parameter $v$ is a vector of cluster centers. The objective function is given as

$$J(U, v) = \sum_{k=1}^{n} \sum_{i=1}^{c} \chi_{ik} \, (d_{ik})^2 \quad (10.14)$$

where $d_{ik}$ is a Euclidean distance measure (in $m$-dimensional feature space, $\mathbb{R}^m$) between the $k$th data sample $x_k$ and the $i$th cluster center $v_i$, given as

$$d_{ik} = d(x_k - v_i) = \lVert x_k - v_i \rVert = \left[ \sum_{j=1}^{m} (x_{kj} - v_{ij})^2 \right]^{1/2} \quad (10.15)$$

Since each data sample requires $m$ coordinates to describe its location in $\mathbb{R}^m$-space, each cluster center also requires $m$ coordinates to describe its location in this same space. Therefore, the $i$th cluster center is a vector of length $m$,

$$v_i = \{v_{i1}, v_{i2}, \ldots, v_{im}\}$$

where the $j$th coordinate is calculated by

$$v_{ij} = \frac{\displaystyle\sum_{k=1}^{n} \chi_{ik} \, x_{kj}}{\displaystyle\sum_{k=1}^{n} \chi_{ik}} \quad (10.16)$$

We seek the optimum partition, $U^*$, to be the partition that produces the minimum value of the function $J$, that is,

$$J(U^*, v^*) = \min_{U \in M_c} J(U, v) \quad (10.17)$$

Finding the optimum partition matrix $U^*$ is exceedingly difficult for practical problems because the cardinality of $M_c$ grows explosively even for modest-sized problems. For example, for the case where $n = 25$ and $c = 10$, the cardinality approaches an extremely large number, $\eta_{M_c} \to 10^{18}$! Obviously, a search for optimality by exhaustion is not computationally feasible for problems of reasonable interest. Fortunately, very useful and effective alternative search algorithms have been devised (Bezdek, 1981).

One such search algorithm is known as iterative optimization. Basically, this method is like many other iterative methods in that we start with an initial guess at the $U$ matrix. From this assumed matrix, the input value for the number of classes, and an iteration tolerance (the accuracy we demand in the solution), we calculate the centers of the clusters (classes). From these cluster, or class, centers we recalculate the membership values that each data point has in the clusters. We compare these values with the assumed values and continue this process until the changes from cycle to cycle are within our prescribed tolerance level; a code sketch of this loop is given after the step list below.

The step-by-step procedures in this iterative optimization method are provided as follows (Bezdek, 1981):

1. Fix $c$ ($2 \leq c < n$)

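As a concrete illustration of the iteration just described, here is a minimal Python sketch; it is ours, not Bezdek's own listing, and names such as `hard_c_means`, `max_iter`, and `seed` are our assumptions. It initializes a hard partition, computes centers with Equation (10.16), reassigns each sample to its nearest center using the distances of Equation (10.15), and stops when the partition no longer changes.

```python
import numpy as np

def hard_c_means(X, c, max_iter=100, seed=0):
    """Iterative optimization sketch for hard c-means (HCM).

    X is an (n, m) array of n data samples with m features.
    Returns the (c, n) hard partition matrix U and the (c, m) centers v.
    """
    n, m = X.shape
    rng = np.random.default_rng(seed)

    # Step 1: fix c (2 <= c < n) and guess an initial hard partition.
    # Seeding each class once keeps U inside M_c (no empty classes).
    labels = rng.permutation(np.r_[np.arange(c), rng.integers(0, c, n - c)])

    for _ in range(max_iter):
        U = np.zeros((c, n))
        U[labels, np.arange(n)] = 1.0

        # Eq. (10.16): the ith center is the mean of its class members.
        v = (U @ X) / np.maximum(U.sum(axis=1, keepdims=True), 1)

        # Eq. (10.15): Euclidean distances d_ik in m-dimensional space.
        d = np.linalg.norm(v[:, None, :] - X[None, :, :], axis=2)

        # Recalculate memberships: each sample joins its nearest center.
        new_labels = d.argmin(axis=0)

        # With crisp 0/1 memberships, "change within tolerance" reduces
        # to "no sample switched class", i.e., a fixed point of the loop.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return U, v

# Two well-separated groups in R^2; the centers should settle near
# (0.1, 0.1) and (5.0, 5.03), a local minimum of J(U, v).
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
U, v = hard_c_means(X, c=2)
print(U)
print(v)
```

In the procedure of the text, the stopping test compares successive membership values against the prescribed tolerance; because hard memberships are 0 or 1, that test is equivalent to the fixed-point check used in this sketch.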