unproblematic to implement and works fast in most situations. However, the sensitivity of the KM algorithm to initialization makes it easily trapped in local optima. K-Harmonic Means (KHM) clustering resolves the initialization problem faced by the KM algorithm, but KHM also easily runs into local optima. The PSO algorithm is a global optimization technique. A hybrid data clustering algorithm based on PSO and KHM (PSOKHM) was proposed by Yang et al. in [3]. This hybrid data clustering algorithm utilizes the advantages of both algorithms: the PSOKHM algorithm not only helps KHM clustering escape from local optima but also overcomes the slow convergence speed of the PSO algorithm. The authors conducted experiments comparing the hybrid data clustering algorithm with PSO and KHM clustering on seven different data sets. The experimental results show that PSOKHM was clearly superior to the other two clustering algorithms.

Huang in [4] put forth a technique that extends the K-Means algorithm to various data sets. Generally, the efficiency of the K-Means algorithm in clustering data sets is high. The restriction on applying K-Means to cluster real-world data containing categorical values stems from the fact that it was designed mainly for numerical values. Huang presented two algorithms which extend the k-means algorithm to categorical domains and to domains with mixed numeric and categorical values. The k-modes algorithm uses a simple matching dissimilarity measure to deal with categorical objects, replaces the means of clusters with modes, and uses a frequency-based method to update modes in the clustering process to decrease the clustering cost function. The k-prototypes algorithm, through the definition of a combined dissimilarity measure, further integrates the k-means and k-modes algorithms to allow for clustering objects described by mixed numeric and categorical attributes.
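A minimal sketch of the two k-modes ingredients just described — the simple matching dissimilarity and the frequency-based mode of a cluster — is given below; the function names are illustrative, not from [4]:

```python
from collections import Counter

def matching_dissimilarity(x, y):
    """Simple matching dissimilarity: the number of attributes on
    which two categorical objects disagree (0 means identical)."""
    return sum(1 for a, b in zip(x, y) if a != b)

def cluster_mode(objects):
    """Frequency-based mode: for each attribute, take the most
    frequent category among the cluster's objects."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*objects))

# Toy categorical data: the mode plays the role the mean plays in k-means.
cluster = [("red", "small"), ("red", "large"), ("blue", "small")]
mode = cluster_mode(cluster)                          # ("red", "small")
d = matching_dissimilarity(("blue", "large"), mode)   # disagrees on both -> 2
```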
The experiments were conducted on the well-known soybean disease and credit approval data sets to demonstrate the clustering performance of the two algorithms.

Kluger [5] first proposed spectral biclustering for processing gene expression data, but Kluger's focus is mainly on unsupervised clustering, not on gene selection.

There are some existing works related to finding initial centroids. One such method proceeds as follows:
1. Compute the mean (μj) and standard deviation (σj) of the jth attribute's values, for every attribute j.
2. Compute the percentiles z1, z2, ..., zk corresponding to the area under the normal curve from −∞ to (2s−1)/2k, s = 1, 2, ..., k (clusters).
3. Compute attribute values xs = zs·σj + μj corresponding to these percentiles, using the mean and standard deviation of the attribute.
4. Perform K-Means to cluster the data based on the jth attribute's values, using the xs as initial centers, and assign a cluster label to every data item.
5. Repeat steps 3-4 for all attributes (l).
6. For every data item t, create the string of class labels Pt = (P1, P2, ..., Pl), where Pj is the class label of t when the jth attribute's values are used for the clustering in step 4.
7. Merge the data items which have the same pattern string Pt, yielding K′ clusters. The centroids of the K′ clusters are computed. If K′ > K, apply the Merge-DBMSDC (Density-based Multi-Scale Data Condensation) algorithm [6] to merge these K′ clusters into K clusters.
8. Find the centroids of the K clusters and use them as initial centers for clustering the original dataset using K-Means.

Although the above initialization algorithms can help find good initial centers to some extent, they are quite complex, and some use the K-Means algorithm as part of their procedure, which still requires the random method for cluster center initialization. The proposed approach for finding initial cluster centroids is presented in the following section.
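The percentile-based portion of the procedure above (steps 1-4, applied to one attribute) can be sketched as follows; `percentile_initial_centers` is an illustrative name, and only the first assignment step of the 1-D K-Means in step 4 is shown:

```python
import numpy as np
from statistics import NormalDist

def percentile_initial_centers(values, k):
    """Steps 1-3 for one attribute: z_s is the standard-normal
    quantile of (2s-1)/(2k), and the initial center is
    x_s = z_s * sigma + mu (mean and std of the attribute)."""
    mu, sigma = np.mean(values), np.std(values)
    return [NormalDist().inv_cdf((2 * s - 1) / (2 * k)) * sigma + mu
            for s in range(1, k + 1)]

# Step 4 (sketch): assign each value to its nearest initial center,
# i.e. the first assignment pass of a 1-D K-Means.
vals = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers = percentile_initial_centers(vals, 2)
labels = [min(range(2), key=lambda c: abs(v - centers[c])) for v in vals]
```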
III. METHODOLOGY
3.1. Initial Cluster Centers Derived from Data Partitioning
The algorithm follows a novel approach that performs data partitioning along the data axis with the highest variance. This approach has been used successfully for color quantization [7]. The data partitioning tries to divide the data space into small cells or clusters where intercluster distances are as large as possible and intracluster distances are as small as possible.
Fig. 1 Diagram of ten data points in 2D, sorted by their X values, with an ordering number for each data point
For instance, consider Fig. 1, where ten data points in a 2D data space are given. The goal is to partition the ten data points of Fig. 1 into two disjoint cells such that the sum of the total clustering errors of the two cells is minimal (see Fig. 2). Suppose a cutting plane perpendicular to the X-axis is used to partition the data. Let C1 and C2 be the first cell and the second cell, and let cm1 and cm2 be the cell centroids of the first cell and the second cell, respectively. The total clustering error of the first cell is thus computed by:

Err(C1) = Σ_{ci ∈ C1} || ci − cm1 ||,    (1)

and the total clustering error of the second cell is thus computed by:

Err(C2) = Σ_{ci ∈ C2} || ci − cm2 ||,    (2)
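A minimal sketch of this partitioning step, assuming Euclidean distance and a brute-force search over all cut positions along the highest-variance axis (the name `best_cut` is illustrative):

```python
import numpy as np

def best_cut(points):
    """Sort points along the axis with the highest variance and pick
    the cut position minimizing the summed clustering error of the
    two cells, Err(C1) + Err(C2) as in Eqs. (1)-(2)."""
    pts = np.asarray(points, dtype=float)
    axis = int(np.argmax(pts.var(axis=0)))    # axis with highest variance
    pts = pts[pts[:, axis].argsort()]         # sort along that axis

    def err(cell):  # total Euclidean distance of a cell to its centroid
        return np.linalg.norm(cell - cell.mean(axis=0), axis=1).sum()

    cuts = [(err(pts[:i]) + err(pts[i:]), i) for i in range(1, len(pts))]
    _, i = min(cuts)
    return pts[:i], pts[i:]

# Two well-separated groups: the cut falls between them.
pts = [(0, 0), (1, 1), (1, 0), (9, 9), (10, 8), (10, 10)]
c1, c2 = best_cut(pts)
```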
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 9, December 2010. http://sites.google.com/site/ijcsis/ ISSN 1947-5500