
Algorithm 1

K-means is a method of clustering observations into a specific number of disjoint clusters. The K refers to the number of clusters specified. Various distance measures exist to determine which observation is to be appended to which cluster. The algorithm aims at minimizing the measure between the centroid of the cluster and the given observation by iteratively appending an observation to any cluster, and it terminates when the lowest distance measure is achieved.

1. The sample space is initially partitioned into K clusters and the observations are randomly assigned to the clusters.

2. For each sample: calculate the distance from the observation to the centroid of each cluster. IF the sample is closest to its own cluster THEN leave it ELSE move it to the closest cluster.

3. Recalculate the centroids and repeat step 2 until no observations are moved from one cluster to another.

When step 3 terminates, the clusters are stable and each sample is assigned to the cluster whose centroid is closest to it. This is a local optimum: the result depends on the initial assignment, so a different start can give a different partition.
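The steps above can be sketched in a few lines of Python (standard library only). This is a minimal illustration and not the implementation used for the figures in this chapter; the function names, the fixed random seed, the iteration cap and the handling of an empty cluster are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, seed=0, max_iter=100):
    """Return (labels, centroids); assumes len(points) >= k."""
    rng = random.Random(seed)
    # Step 1: random initial assignment. The first k shuffled points get
    # distinct labels so that no cluster starts out empty.
    order = list(range(len(points)))
    rng.shuffle(order)
    labels = [0] * len(points)
    for rank, idx in enumerate(order):
        labels[idx] = rank if rank < k else rng.randrange(k)

    centers = []
    for _ in range(max_iter):
        # Recompute each centroid from its current members.
        centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Assumption: an empty cluster is re-seeded with a random point.
            centers.append(centroid(members) if members else rng.choice(points))
        # Step 2: move every sample to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        # Step 3: stop as soon as no sample changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    labels, centers = kmeans(data, 2)
    print(labels)   # the first three points share one label, the last three the other
    print(centers)
```

Which of the two groups receives label 0 depends on the random start, so only the grouping itself is meaningful.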

Common distance measures include the Euclidean distance, the squared Euclidean distance and the Manhattan (city-block) distance.

The Euclidean measure corresponds to the shortest geometric distance between two points.

$$ d = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2} \tag{1.1} $$


CHAPTER 1. THE K-MEANS CLUSTERING ALGORITHM

A faster way of determining the distance is to use the squared Euclidean distance, which is the above distance squared and therefore avoids the square root, i.e.

$$ d_{sq} = \sum_{i=1}^{N} (x_i - y_i)^2 \tag{1.2} $$

The Manhattan measure calculates the distance between two points along the axes of a grid, i.e. as the sum of the absolute coordinate differences, and is illustrated in Figure 1.1.

Figure 1.1: Comparison between the Euclidean and the Manhattan measure.
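The three measures can be compared on a single pair of points. The sketch below is plain Python written for illustration; the function names and the example points are assumptions, not part of the original text.

```python
import math


def euclidean(x, y):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def squared_euclidean(x, y):
    """Squared Euclidean distance: the same sum without the square root."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def manhattan(x, y):
    """Manhattan distance: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q))          # 5.0
print(squared_euclidean(p, q))  # 25.0
print(manhattan(p, q))          # 7.0
```

Because the square root is monotonic, the Euclidean and the squared Euclidean distance always agree on which centroid is closest to a point, which is why the cheaper squared form can be used for the cluster assignment.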

For applications in speech processing the squared Euclidean distance is widely used.

K-means can be used to cluster the features extracted from speech signals, for instance mel frequency cepstral coefficients or line spectrum pairs. This allows speech signals with similar spectral characteristics to be placed at the same position in the codebook. In this way similar narrow band signals are predicted in the same way, which limits the size of the codebook.
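As a rough illustration of the codebook idea, the sketch below maps each feature vector to the index of its nearest codebook entry using the squared Euclidean distance. The codebook values and the two-dimensional "feature vectors" are made up for the example; real entries would be the K-means centroids of features such as mel frequency cepstral coefficients, and the function name `quantize` is an assumption.

```python
def squared_euclidean(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def quantize(feature, codebook):
    """Index of the codebook entry that is closest to the feature vector."""
    return min(
        range(len(codebook)),
        key=lambda i: squared_euclidean(feature, codebook[i]),
    )


# Hypothetical codebook: in practice these would be K-means centroids.
codebook = [(0.0, 0.0), (5.0, 5.0), (-4.0, 3.0)]
# Hypothetical feature vectors, one per speech frame.
frames = [(0.2, -0.1), (4.6, 5.3), (0.1, 0.4), (-3.5, 2.9)]

indices = [quantize(f, codebook) for f in frames]
print(indices)  # [0, 1, 0, 2]
```

The first and third frames are similar and therefore share codebook entry 0, so four frames are represented by three entries.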

The following figures illustrate the K-means algorithm on a 2-dimensional data set.


1.4. EXAMPLE OF K-MEANS CLUSTERING

[Scatter plot; both axes run from -20 to 20.]

Figure 1.2: Example of signal data made from Gaussian white noise.

[Scatter plot; both axes run from -20 to 20.]

Figure 1.3: The signal data are separated into seven clusters. The centroids are marked with a cross.


[Silhouette plot; axes: Silhouette Value and Cluster.]

Figure 1.4: The Silhouette diagram shows how well the data are separated into the seven clusters. If a point lies about equally close to two clusters, it could belong to either of them and its Silhouette value is close to zero. If a point is on average closer to the points of another cluster than to those of its own, the result is a conflict which gives a negative value in the Silhouette diagram. The positive part of the Silhouette diagram shows that there is a clear separation of the points between the clusters.
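The Silhouette value of a point is commonly defined as s = (b - a) / max(a, b), where a is the mean distance from the point to the other members of its own cluster and b is the smallest mean distance from the point to the members of any other cluster. The sketch below computes it in plain Python (3.8 or later, for `math.dist`). It is an illustration only: it uses the Euclidean distance, assigns 0 to single-member clusters, and assumes at least two clusters; the function name and the example data are assumptions.

```python
import math


def silhouette_values(points, labels):
    """Silhouette value s = (b - a) / max(a, b) for every point."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)  # convention for a single-member cluster
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, hence the n - 1 divisor).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


pts = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
good = silhouette_values(pts, [0, 0, 0, 1, 1, 1])
bad = silhouette_values(pts, [0, 0, 1, 1, 1, 1])   # third point mislabelled
print(all(s > 0 for s in good))  # True
print(bad[2] < 0)                # True
```

With the correct labels every value is positive; assigning the third point to the distant cluster makes its value negative, which is the conflict described in the caption of Figure 1.4.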


1.5. MATLAB SOURCE CODE

close all
clear all
clc

Limit = 20;   % radius of the disc to which the data are restricted

% 800 two-dimensional samples of Gaussian white noise (standard deviation 10)
X = [10*randn(400,2); 10*randn(400,2)];
plot(X(:,1), X(:,2), 'k.')
length(X(:,1))
figure

% Keep only the samples that lie within Limit of the origin
inside = sqrt(X(:,1).^2 + X(:,2).^2) <= Limit;
Y = X(inside, :);
plot(Y(:,1), Y(:,2), 'k.')
figure

% Seven clusters, squared Euclidean distance, best of 5 replicates
[cidx, ctrs] = kmeans(Y, 7, 'Distance', 'sqEuclidean', 'Replicates', 5, ...
    'Display', 'final', 'EmptyAction', 'singleton');

% One colour per cluster; the centroids are marked with black crosses
plot(Y(cidx==1,1), Y(cidx==1,2), 'r.', ...
     Y(cidx==2,1), Y(cidx==2,2), 'b.', ctrs(:,1), ctrs(:,2), 'kx');
hold on
plot(Y(cidx==3,1), Y(cidx==3,2), 'y.', Y(cidx==4,1), Y(cidx==4,2), 'g.');
plot(Y(cidx==5,1), Y(cidx==5,2), 'c.', Y(cidx==6,1), Y(cidx==6,2), 'm.');
plot(Y(cidx==7,1), Y(cidx==7,2), 'k.');

figure
% The statement that defines silk is missing from the source listing; silk is
% assumed to hold the Silhouette values of the clustering (see Figure 1.4).
mean(silk)
