Tutorial 12 Answers

IS328 Data Mining
Semester 2, 2019
Partitional Clustering Techniques

K-Means and K-Medoids Clustering
Tutorial 12 Exercises
Q1
Suppose we want to group the visitors to a website using just their age (a one-dimensional space)
as follows:
15,17,17,19,19,20,20,21,22,28,35,45,52,58,59,60,60,61,61
Assume the initial centres as 15 and 20
Use K = 2 with the K-Means algorithm. Show your calculations and steps.
The initial centres are 15 and 20.
The initial clusters are

[15] [15, 17, 17]
[20] [19,19,20,20,21,22,28,35,45,52,58,59,60,60,61,61]
The new centres are 16.33 and 40.

The new clusters are
[16.33] [15, 17, 17, 19,19,20,20,21,22,28]
[40] [35,45,52,58,59,60,60,61,61]
The new centres are 19.8 and 54.6

The new clusters are
[19.8] [15, 17, 17, 19, 19, 20, 20, 21, 22, 28, 35]
[54.6] [45, 52, 58, 59, 60, 60, 61, 61]
The new centres are 21.2 and 57.

The new clusters are
[21.2] [15, 17, 17, 19, 19, 20, 20, 21, 22, 28, 35]
[57] [45, 52, 58, 59, 60, 60, 61, 61]
Here the K-Means algorithm terminates, as the clusters no longer change.
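The iterations above can be reproduced with a short script. This is only a minimal sketch, not part of the original answer: the function name kmeans_1d is hypothetical, ties are broken in favour of the first centre, and it assumes no cluster ever becomes empty.

```python
def kmeans_1d(points, centres, max_iter=100):
    """Plain K-Means on 1-D data; returns (centres, clusters)."""
    clusters = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        new_clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            new_clusters[nearest].append(p)
        # Stop when the membership no longer changes.
        if new_clusters == clusters:
            break
        clusters = new_clusters
        # Update step: each centre becomes the mean of its cluster
        # (assumes every cluster has at least one member).
        centres = [sum(c) / len(c) for c in clusters]
    return centres, clusters


ages = [15, 17, 17, 19, 19, 20, 20, 21, 22, 28, 35,
        45, 52, 58, 59, 60, 60, 61, 61]
centres, clusters = kmeans_1d(ages, [15, 20])
for centre, members in zip(centres, clusters):
    print(round(centre, 2), members)
```

The script prints the final centre and members of each cluster, which can be compared with the last step of the worked answer.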
Q2 K-means clustering with Manhattan Distance
We are given the following data on 5 objects:
Object X1 X2
1 3 2
2 8 6
3 6 7
4 3 4
5 7 2
Cluster this data into two clusters, using the k-means algorithm.
To initialize the algorithm, put objects 1 and 3 in one cluster, and objects 2, 4 and 5 in the
other cluster.
Show the steps of the algorithm clearly. Use Manhattan distance for calculating distances.
ITERATION 1
Cluster A = [ (3, 2), (6, 7)]
Cluster B = [(8, 6), (3, 4), (7, 2)]
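C1 = ((3+6)/2, (2+7)/2) = (4.5, 4.5)

C2 = ((8+3+7)/3, (6+4+2)/3) = (6, 4)
Manhattan Distances from the Centres

Object   C1 (4.5, 4.5)   C2 (6, 4)
(3, 2)   4               5
(8, 6)   5               4
(6, 7)   4               3
(3, 4)   2               3
(7, 2)   5               3
Each object is assigned to the nearer centre, so (3, 2) and (3, 4) go to Cluster A and the other three objects go to Cluster B.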
ITERATION 2
Cluster A = [ (3, 2), (3,4)]
Cluster B = [(8, 6), (6,7), (7, 2)]
C1 = ((3+3)/2, (2+4)/2) = (3, 3)

C2 = ((8+6+7)/3, (6+7+2)/3) = (7, 5)
Manhattan Distances from the Centres
Object   C1 (3, 3)   C2 (7, 5)
(3, 2)   1           7
(8, 6)   8           2
(6, 7)   7           3
(3, 4)   1           5
(7, 2)   5           3
ITERATION 3
Cluster A = [ (3, 2), (3,4)]
Cluster B = [(8, 6), (6,7), (7, 2)]
The members of clusters A and B are the same in Iterations 2 and 3.
Therefore K-Means terminates.
The final clusters are

Cluster A = [ (3, 2), (3,4)]
Cluster B = [(8, 6), (6,7), (7, 2)]
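As a cross-check, the procedure above can be sketched in code. This is an illustrative sketch only: the helper names (manhattan, mean_point, kmeans_from_partition) are hypothetical, the centres are coordinate-wise means as in the worked answer, and the code assumes no cluster becomes empty.

```python
def manhattan(p, q):
    """Manhattan (city-block) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans_from_partition(clusters, max_iter=100):
    """K-Means started from an initial partition instead of initial centres."""
    for _ in range(max_iter):
        # Update step: centre of each current cluster.
        centres = [mean_point(c) for c in clusters]
        # Assignment step: move every object to its nearest centre.
        all_points = [p for c in clusters for p in c]
        new_clusters = [[] for _ in centres]
        for p in all_points:
            nearest = min(range(len(centres)),
                          key=lambda i: manhattan(p, centres[i]))
            new_clusters[nearest].append(p)
        # Stop when the membership of every cluster is unchanged.
        if [sorted(c) for c in new_clusters] == [sorted(c) for c in clusters]:
            break
        clusters = new_clusters
    return centres, clusters


initial = [[(3, 2), (6, 7)], [(8, 6), (3, 4), (7, 2)]]
centres, clusters = kmeans_from_partition(initial)
print(centres)
print(clusters)
```

Starting from a partition rather than from seed points means the first step is a centre update, which matches how the exercise is initialised.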
Q3. K-means clustering with Euclidean Distance

Use the k-means algorithm and Euclidean distance to cluster the following 6 objects into 3
clusters:
A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(1,2)
Suppose that the initial seeds (centers of each cluster) are A1, A4 and A6.
Run the k-means algorithm to cluster the above data:
Draw a 10 by 10 space with all the 6 points and show the clusters after each iteration.
Initial centres are C1(2,10), C2(5, 8), C3 (1, 2)
ITERATION 1
Euclidean Distances from the Centres

Object      C1 (2, 10)   C2 (5, 8)   C3 (1, 2)
A1 (2, 10)  0            3.61        8.06
A2 (2, 5)   5            4.24        3.16
A3 (8, 4)   8.49         5           7.28
A4 (5, 8)   3.61         0           7.21
A5 (7, 5)   7.07         3.61        6.71
A6 (1, 2)   8.06         7.21        0
The current clusters are
Cluster A = [ (2, 10)]
Cluster B = [(8, 4), (5, 8), (7, 5)]
Cluster C = [(2, 5), (1, 2)]
The new centres are C1(2, 10), C2 (6.67, 5.67), and C3(1.5, 3.5)
ITERATION 2
Euclidean Distances from the Centres

Object      C1 (2, 10)   C2 (6.67, 5.67)   C3 (1.5, 3.5)
A1 (2, 10)  0            6.37              6.52
A2 (2, 5)   5            4.72              1.58
A3 (8, 4)   8.49         2.13              6.52
A4 (5, 8)   3.61         2.87              5.70
A5 (7, 5)   7.07         0.75              5.70
A6 (1, 2)   8.06         6.75              1.58
The current clusters are

Cluster A = [ (2, 10)]
Cluster B = [(8, 4), (5, 8), (7, 5)]
Cluster C = [(2, 5), (1, 2)]
Since the members have not changed, K-Means terminates here.

The final clusters are
Cluster A = [ (2, 10)]
Cluster B = [(8, 4), (5, 8), (7, 5)]
Cluster C = [(2, 5), (1, 2)]
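The two distance tables and the centre update can be recomputed with the standard library. The sketch below is not from the original answer: the helper names assign and update are hypothetical, math.dist requires Python 3.8 or later, and the code assumes every cluster keeps at least one member. Because the script uses unrounded centres, a printed distance may differ in the last digit from a value computed by hand from centres rounded to two decimals.

```python
import math

points = {"A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
          "A4": (5, 8), "A5": (7, 5), "A6": (1, 2)}


def assign(points, centres):
    """Return {name: (list of distances, index of the nearest centre)}."""
    table = {}
    for name, p in points.items():
        dists = [math.dist(p, c) for c in centres]
        table[name] = (dists, dists.index(min(dists)))
    return table


def update(points, table, k):
    """Mean of the members of each of the k clusters (assumed non-empty)."""
    centres = []
    for i in range(k):
        members = [points[n] for n, (_, idx) in table.items() if idx == i]
        centres.append(tuple(sum(x) / len(members) for x in zip(*members)))
    return centres


centres1 = [points["A1"], points["A4"], points["A6"]]  # initial seeds
table1 = assign(points, centres1)
centres2 = update(points, table1, 3)
table2 = assign(points, centres2)

for name in points:
    dists, idx = table2[name]
    print(name, [round(d, 2) for d in dists], "-> C%d" % (idx + 1))
```

Comparing the cluster indices in table1 and table2 shows whether the membership changed between the two iterations.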
Q4. K-Medoid Clustering Using Distance Matrix

K-Medoids
Initial Clusters
C1 A, C, D
C2 B, E, F
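Medoid of (A, C, D)
    A     C     D     Total
A   0     5.66  3.61  9.27
C   5.66  0     2.24  7.90
D   3.61  2.24  0     5.85
D has the smallest total distance, so D is the medoid of C1.
Medoid of (B, E, F)
    B     E     F     Total
B   0     3.54  2.50  6.04
E   3.54  0     1.12  4.66
F   2.50  1.12  0     3.62
F has the smallest total distance, so F is the medoid of C2.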
Clusters 2
C1 D, C, E
C2 F, A, B
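Medoid of (D, C, E)
    D     C     E     Total
D   0     2.24  1.00  3.24
C   2.24  0     1.41  3.65
E   1.00  1.41  0     2.41
E has the smallest total distance, so E is the medoid of C1.
Medoid of (F, A, B)
    F     A     B     Total
F   0     3.20  2.50  5.70
A   3.20  0     0.71  3.91
B   2.50  0.71  0     3.21
B has the smallest total distance, so B is the medoid of C2.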
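Clusters 3
C1 E, C, D, F
C2 B, A
Medoid of (D, C, E, F)
    D     C     E     F     Total
D   0     2.24  1.00  0.50  3.74
C   2.24  0     1.41  2.50  6.15
E   1.00  1.41  0     1.12  3.53
F   0.50  2.50  1.12  0     4.12
The E-F distance is 1.12, as in row F and in the (B, E, F) table, so E has the smallest total and remains the medoid of C1. A and B are 0.71 apart, so either can serve as the medoid of C2; B is kept.
Clusters 4
C1 E, C, D, F
C2 B, A
The clusters are the same as in the previous step, so K-Medoids terminates.
Therefore the final clusters are {A, B} and {C, D, E, F}.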
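The medoid selection used in each step can be sketched as follows. This is an illustrative sketch under stated assumptions: the names PAIRS, dist and medoid are hypothetical, and the dictionary holds only the pairwise distances that appear in the three-object tables of this answer, not the full distance matrix from the question.

```python
# Pairwise distances copied from the three-object tables (each pair once).
PAIRS = {
    ("A", "B"): 0.71, ("A", "C"): 5.66, ("A", "D"): 3.61, ("A", "F"): 3.20,
    ("B", "E"): 3.54, ("B", "F"): 2.50, ("C", "D"): 2.24, ("C", "E"): 1.41,
    ("D", "E"): 1.00, ("E", "F"): 1.12,
}


def dist(x, y):
    """Symmetric lookup; raises KeyError for a pair not listed in PAIRS."""
    if x == y:
        return 0.0
    return PAIRS[(x, y)] if (x, y) in PAIRS else PAIRS[(y, x)]


def medoid(cluster):
    """Return (medoid, totals): the member with the smallest total distance
    to the other members, plus every member's total."""
    totals = {x: sum(dist(x, y) for y in cluster) for x in cluster}
    return min(totals, key=totals.get), totals


for cluster in (["A", "C", "D"], ["B", "E", "F"],
                ["D", "C", "E"], ["F", "A", "B"]):
    m, totals = medoid(cluster)
    print(cluster, "medoid:", m,
          {k: round(v, 2) for k, v in totals.items()})
```

A full K-Medoids run would also need the distances between every object and every medoid for the reassignment step, which requires the complete distance matrix.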
