
K-MEDOIDS CLUSTERING

A Problem of K-means
 Sensitive to outliers

Outlier: an object with an extremely large (or small) value
 May substantially distort the distribution of the data

[Figure: a compact cluster of points plus one distant outlier]
Limitations of K-means: Outlier Problem

• The k-means algorithm is sensitive to outliers, since an object with an
extremely large value may substantially distort the distribution of the data.

• Solution: instead of taking the mean value of the objects in a cluster as a
reference point, use a medoid: the most centrally located object in the cluster.
k-Medoids Clustering Method
 k-medoids: Find k representative objects, called medoids

PAM (Partitioning Around Medoids, 1987)

CLARA (Kaufmann & Rousseeuw, 1990)

CLARANS (Ng & Han, 1994): Randomized sampling

[Figure: the same data set clustered by k-means (left) and k-medoids (right)]
MEDOID

1. A medoid can be defined as the point in the cluster whose similarity
with all the other points in the cluster is maximal.

2. In k-medoids clustering, each cluster is represented by one of the data
points in the cluster. These points are named cluster medoids.

3. The term medoid refers to an object within a cluster for which the
average dissimilarity between it and all the other members of the cluster
is minimal. It corresponds to the most centrally located point in the
cluster.
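Definition 3 above can be turned into code directly: pick the point that minimizes the total distance to the rest. A minimal sketch (the helper name `medoid` and the choice of Euclidean distance are mine, not from the slides):

```python
import math

def medoid(points):
    # The medoid is the data point whose total distance
    # to all other points in the cluster is minimal.
    def total_dist(p):
        return sum(math.dist(p, q) for q in points)
    return min(points, key=total_dist)

cluster = [(1, 1), (2, 1), (2, 2), (8, 8)]
print(medoid(cluster))  # (2, 2): the most centrally located point
```

Note that the outlier (8, 8) pulls a mean far outside the dense group, but it cannot become the medoid, which is exactly the robustness argument made above.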
PAM (Partitioning Around Medoids) (1987)

 PAM (Kaufman and Rousseeuw, 1987)

 Arbitrarily choose k objects as the initial medoids
 Until no change, do
  (Re)assign each object to the cluster with the nearest medoid
  Improve the quality of the k-medoids (randomly select a non-medoid
  object, Orandom, and compute the total cost of swapping a medoid with
  Orandom)
 Works well for small data sets (e.g., 100 objects in 5 clusters)
 Not efficient for medium and large data sets
A Typical K-Medoids Algorithm (PAM)

K = 2; Total Cost({mi}) = 20

1. Arbitrarily choose k objects as the initial medoids (mi, i = 1..k)
2. Assign each remaining object to the nearest medoid
3. Randomly select a non-medoid object, Oj
4. Compute the total cost of all possible new sets of medoids
   (Oj, {mi}, i ≠ t), i.e., of swapping a medoid mt with Oj
5. If cost({Oj, {mi, i ≠ t}}) is the smallest and
   cost({Oj, {mi, i ≠ t}}) < cost({mi}), swap mt and Oj
   (in the illustration, the swap reduces the total cost to 18)
6. Do loop: repeat steps 2–5 until no change

[Figure: three scatter plots showing the initial medoids, the first
assignment, and the configuration after a cost-reducing swap]
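The loop above can be sketched in Python. This is a minimal illustration, not the optimized PAM of Kaufman and Rousseeuw; the function names `pam` and `total_cost` are mine, and Manhattan distance is assumed to match the worked example later in the deck:

```python
import random
from itertools import product

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def total_cost(points, medoids):
    # Sum of distances from each point to its nearest medoid.
    return sum(min(manhattan(p, m) for m in medoids) for p in points)

def pam(points, k, seed=0):
    random.seed(seed)
    medoids = random.sample(points, k)          # arbitrary initial medoids
    best = total_cost(points, medoids)
    improved = True
    while improved:                             # "until no change"
        improved = False
        non_medoids = [p for p in points if p not in medoids]
        for m, o in product(list(medoids), non_medoids):
            candidate = [o if x == m else x for x in medoids]
            cost = total_cost(points, candidate)
            if cost < best:                     # keep the swap only if it lowers cost
                medoids, best = candidate, cost
                improved = True
                break                           # rescan from the new medoid set
    return medoids, best

objs = [(2, 6), (3, 4), (3, 8), (4, 7), (6, 2),
        (6, 4), (7, 3), (7, 4), (8, 5), (7, 6)]
print(total_cost(objs, [(3, 4), (7, 4)]))  # 20, the cost of medoids (O2, O8)
meds, cost = pam(objs, 2)                  # converges to a set at least that good
```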
The k-Medoids Algorithm

Total swapping cost (TCih)

 Total swapping cost: TCih = Σj Cjih
 – where Cjih is the cost of swapping medoid i with h, summed over all
   non-medoid objects j
 – Cjih varies depending on the case

Sum of Squared Error (SSE)

 E = Σ_{i=1}^{k} Σ_{j ∈ Ci} d²(j, i)

[Figure: two clusters, C1 = {x1, x2, x3} and C2 = {y1, y2, y3}]
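The SSE above is straightforward to compute. A small sketch, assuming clusters are given as a mapping from each representative point (mean or medoid) to its members; the function name `sse` is my own:

```python
import math

def sse(clusters):
    # Sum of squared distances from each point to its cluster's
    # representative; `clusters` maps a representative to its members.
    return sum(math.dist(p, rep) ** 2
               for rep, members in clusters.items()
               for p in members)

clusters = {(0, 0): [(0, 1), (1, 0)], (5, 5): [(5, 6), (6, 5)]}
print(sse(clusters))  # 1 + 1 + 1 + 1 = 4.0
```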
K-MEDOIDS ALGORITHM: EXAMPLE-1

1. Randomly select K medoids
2. Allocate each point to its closest medoid
3. Determine the new medoid of each cluster
4. Reallocate each point to its closest medoid
5. Repeat steps 3–4 until the medoids no longer change, then stop the process

[Figure: sequence of scatter plots illustrating steps 1–5]
Case (i): Computation of Cjih

• j currently belongs to the cluster represented by medoid i (the medoid
  being swapped out)
• j is less similar to medoid t than to h

Therefore, after the swap, j belongs to the cluster represented by h.

Cjih = d(j, h) - d(j, i)

[Figure: j reassigned from i to h]
Case (ii): Computation of Cjih

• j currently belongs to the cluster represented by medoid i (the medoid
  being swapped out)
• j is more similar to medoid t than to h

Therefore, after the swap, j belongs to the cluster represented by t.

Cjih = d(j, t) - d(j, i)

[Figure: j reassigned from i to t]
Case (iii): Computation of Cjih

• j currently belongs to the cluster represented by medoid t
• j is more similar to medoid t than to h

Therefore, j stays in the cluster represented by t.

Cjih = 0

[Figure: j remains with t]
Case (iv): Computation of Cjih

• j currently belongs to the cluster represented by medoid t
• j is less similar to medoid t than to h

Therefore, after the swap, j belongs to the cluster represented by h.

Cjih = d(j, h) - d(j, t)

[Figure: j reassigned from t to h]

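All four cases collapse into a single rule: Cjih is the change in j's distance to its nearest medoid when i is replaced by h. A sketch (the function name `swap_cost` is mine; Manhattan distance is assumed to match the worked example below):

```python
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def swap_cost(j, medoids, i, h):
    # Change in j's distance to its nearest medoid when medoid i is
    # replaced by non-medoid h; this covers all four cases at once.
    old = min(manhattan(j, m) for m in medoids)
    new_set = [h if m == i else m for m in medoids]
    new = min(manhattan(j, m) for m in new_set)
    return new - old

medoids = [(3, 4), (7, 4)]
print(swap_cost((6, 2), medoids, (7, 4), (7, 3)))  # -1: j follows the new medoid h
print(swap_cost((2, 6), medoids, (7, 4), (7, 3)))  # 0: case (iii), j stays with t
```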
PAM or K-Medoids: Example

Data Objects
      A1  A2
 O1    2   6
 O2    3   4
 O3    3   8
 O4    4   7
 O5    6   2
 O6    6   4
 O7    7   3
 O8    7   4
 O9    8   5
 O10   7   6

Goal: create two clusters
Randomly choose two medoids: O8 = (7,4) and O2 = (3,4)

[Figure: scatter plot of the ten objects]
PAM or K-Medoids: Example (cont.)

Assign each object to the closest representative object.

Using the L1 metric (Manhattan distance), we form the following clusters:
 Cluster1 = {O1, O2, O3, O4}
 Cluster2 = {O5, O6, O7, O8, O9, O10}

[Figure: scatter plot with cluster1 and cluster2 marked]
PAM or K-Medoids: Example (cont.)

Compute the absolute-error criterion for the set of medoids (O2, O8):

 E = Σ_{i=1}^{k} Σ_{p ∈ Ci} |p − Oi|
   = (|O1 − O2| + |O3 − O2| + |O4 − O2|)
   + (|O5 − O8| + |O6 − O8| + |O7 − O8| + |O9 − O8| + |O10 − O8|)

PAM or K-Medoids: Example (cont.)

The absolute-error criterion for the set of medoids (O2, O8):

 E = (3 + 4 + 4) + (3 + 1 + 1 + 2 + 2) = 20
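This arithmetic can be checked mechanically; a quick sketch using the Manhattan distances from the data table (the dictionary layout is my own):

```python
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

objects = {'O1': (2, 6), 'O2': (3, 4), 'O3': (3, 8), 'O4': (4, 7),
           'O5': (6, 2), 'O6': (6, 4), 'O7': (7, 3), 'O8': (7, 4),
           'O9': (8, 5), 'O10': (7, 6)}

cluster1 = ['O1', 'O2', 'O3', 'O4']               # medoid O2 contributes 0
cluster2 = ['O5', 'O6', 'O7', 'O8', 'O9', 'O10']  # medoid O8 contributes 0

E = (sum(manhattan(objects[o], objects['O2']) for o in cluster1)
     + sum(manhattan(objects[o], objects['O8']) for o in cluster2))
print(E)  # (3+4+4) + (3+1+1+2+2) = 20
```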
PAM or K-Medoids: Example (cont.)

• Choose a random non-medoid object, O7
• Swap O8 and O7
• Compute the absolute-error criterion for the set of medoids (O2, O7):

 E = (3 + 4 + 4) + (2 + 2 + 1 + 3 + 3) = 22
PAM or K-Medoids: Example (cont.)

→ Compute the cost function:

 S = absolute error (O2, O7) − absolute error (O2, O8)
   = 22 − 20 = 2

S > 0 => it is a bad idea to replace O8 by O7
PAM or K-Medoids: Example (cont.)

The same result via the per-object swapping costs:

 C187 = 0, C387 = 0, C487 = 0      (cluster-1 objects are unaffected)
 C587 = d(O7, O5) − d(O8, O5) = 2 − 3 = −1
 C687 = d(O7, O6) − d(O8, O6) = 2 − 1 = 1
 C987 = d(O7, O9) − d(O8, O9) = 3 − 2 = 1
 C1087 = d(O7, O10) − d(O8, O10) = 3 − 2 = 1

 TCih = Σj Cjih = −1 + 1 + 1 + 1 = 2 = S
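The per-object costs can be reproduced as plain distance differences (a sketch; only the cluster-2 objects contribute, since C187 = C387 = C487 = 0):

```python
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

O5, O6, O7, O8 = (6, 2), (6, 4), (7, 3), (7, 4)
O9, O10 = (8, 5), (7, 6)

# Cost contribution of each affected object when O8 is swapped with O7:
# distance to the new medoid minus distance to the old one.
costs = [manhattan(j, O7) - manhattan(j, O8) for j in (O5, O6, O9, O10)]
TC = sum(costs)
print(costs, TC)  # [-1, 1, 1, 1] 2
```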
CLARA (Clustering Large Applications) (1990)

CLARA (Clustering LARge Applications, Kaufmann and Rousseeuw, 1990)

Sampling-based method:
 It draws multiple samples of the data set,
 applies PAM on each sample, and
 returns the best clustering as the output (minimizing the cost/SSE)

Strength: deals with larger data sets than PAM

Weakness:
 Efficiency depends on the sample size
 A good clustering based on samples will not necessarily represent a
 good clustering of the whole data set if the sample is biased
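The sampling loop can be sketched as follows. For brevity, the inner PAM call is replaced by an exhaustive search over medoid sets in the sample, which is feasible only because the sample is small; `clara` and `best_medoids` are my names, and Manhattan distance is assumed:

```python
import random
from itertools import combinations

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def total_cost(points, medoids):
    return sum(min(manhattan(p, m) for m in medoids) for p in points)

def best_medoids(sample, k):
    # Stand-in for PAM on a small sample: try every k-subset as medoids.
    return min(combinations(sample, k), key=lambda ms: total_cost(sample, ms))

def clara(points, k, n_samples=5, sample_size=40, seed=0):
    rng = random.Random(seed)
    best, best_cost = None, float('inf')
    for _ in range(n_samples):                    # draw multiple samples
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids = best_medoids(sample, k)         # cluster the sample
        cost = total_cost(points, medoids)        # evaluate on ALL points
        if cost < best_cost:                      # keep the best clustering
            best, best_cost = list(medoids), cost
    return best, best_cost
```

Evaluating each sample's medoids on the full data set is what lets CLARA pick the sample whose clustering generalizes best; a biased sample still yields poor medoids, which is exactly the weakness noted above.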
