
ITS665 || Data Mining

Tutorial 6 Part 2 – Topic 6 Part 2 (Cluster Analysis)

Question 1

Consider the following dissimilarity matrix for five objects, A to E.

     A     B     C     D     E
A    0
B    0.07  0
C    0.38  0.14  0
D    0.25  0.08  0.06  0
E    0.12  0.25  0.35  0.14  0

Apply the k-means algorithm to find two clusters, using A and B as the initial centres.
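
Only pairwise dissimilarities are given here, so a mean cannot be computed from coordinates. The sketch below covers just the assignment step, assuming each object joins whichever of the two initial centres (A or B) it is least dissimilar to. The names `d`, `dissimilarity` and `assign` are illustrative and not part of the tutorial.

```python
# Dissimilarity values copied from the lower triangle of the matrix in Question 1.
d = {
    ("A", "B"): 0.07,
    ("A", "C"): 0.38, ("B", "C"): 0.14,
    ("A", "D"): 0.25, ("B", "D"): 0.08, ("C", "D"): 0.06,
    ("A", "E"): 0.12, ("B", "E"): 0.25, ("C", "E"): 0.35, ("D", "E"): 0.14,
}


def dissimilarity(p, q):
    """Look up the symmetric dissimilarity between two objects."""
    if p == q:
        return 0.0
    return d[(p, q)] if (p, q) in d else d[(q, p)]


def assign(objects, centres):
    """Assignment step only: put each object with its least dissimilar centre."""
    clusters = {c: [] for c in centres}
    for obj in objects:
        nearest = min(centres, key=lambda c: dissimilarity(obj, c))
        clusters[nearest].append(obj)
    return clusters


clusters = assign("ABCDE", ["A", "B"])
for centre, members in clusters.items():
    print(centre, members)
```

How the centres are updated afterwards depends on the convention used in class (for example, picking the member with the smallest total dissimilarity to the rest of its cluster), so that step is left out of the sketch.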

Question 2

The following six points, X1, X2, X3, X4, X5, X6 represent tourist locations around the city
of Shah Alam. The task is to cluster those points into TWO (2) clusters with X2 and X4 as
the centre of each cluster.

X1 = (2, 6)
X2 = (4, 7)
X3 = (5, 11)
X4 = (7, 10)
X5 = (8, 9)
X6 = (9, 8)

a) Calculate the Euclidean distance from each point to each initial centroid.

b) Apply the k-means algorithm to produce TWO (2) clusters. Show the steps and give
the clusters after iteration 1.

c) Sketch the clusters based on the points.
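
The hand calculation for parts (a) and (b) can be checked with a short script. This is a minimal sketch, assuming ties (none occur here) would go to the first centroid; the variable names are illustrative only.

```python
import math

points = {
    "X1": (2, 6), "X2": (4, 7), "X3": (5, 11),
    "X4": (7, 10), "X5": (8, 9), "X6": (9, 8),
}
centroids = [points["X2"], points["X4"]]  # initial centres given in the question

# Part (a): Euclidean distance from every point to each initial centroid.
distances = {
    name: [math.dist(p, c) for c in centroids]
    for name, p in points.items()
}

# Part (b): assign each point to its nearest centroid.
clusters = [[] for _ in centroids]
for name, dists in distances.items():
    clusters[dists.index(min(dists))].append(name)

# Update step: the new centroid is the coordinate-wise mean of its members.
new_centroids = [
    tuple(sum(points[n][i] for n in members) / len(members) for i in range(2))
    for members in clusters
]

for name, dists in distances.items():
    print(name, [round(x, 3) for x in dists])
print(clusters)
print(new_centroids)
```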



Question 3

Given the following data points and the distance matrix based on Euclidean distance:

A1 = (2, 10)    A5 = (7, 5)
A2 = (2, 5)     A6 = (6, 4)
A3 = (8, 4)     A7 = (1, 2)
A4 = (5, 8)     A8 = (4, 9)

     A1    A2    A3    A4    A5    A6    A7    A8
A1   0
A2   √25   0
A3   √72   √37   0
A4   √13   √18   √25   0
A5   √50   √25   √2    √13   0
A6   √52   √17   √4    √17   √2    0
A7   √65   √10   √53   √52   √45   √29   0
A8   √5    √20   √41   √2    √25   √29   √58   0
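
Each entry of the matrix has the form √n, where n is the squared Euclidean distance, so the table can be rebuilt from the coordinates with integer arithmetic alone. A minimal sketch, in which the helper name `squared_distance` is an assumption for illustration:

```python
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def squared_distance(p, q):
    """Squared Euclidean distance; its square root is the matrix entry."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


names = list(points)
# Print the lower triangle row by row, in the same layout as the matrix.
for i, row in enumerate(names):
    cells = [f"√{squared_distance(points[row], points[col])}" for col in names[:i]]
    print(row, *cells, "0")
```

Since the square root is monotonic, comparing the values under the root is enough to decide which centre is nearest.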

a) Show the steps in applying the k-means algorithm for 1 epoch only to create THREE
(3) clusters based on the 8 points. Suppose that the initial centres are A1, A4 and
A7.

b) Draw a 10 by 10 space with all the 8 points, showing the clusters after the first epoch.

c) Calculate the new clusters and mark them on the graph.



Question 4

Suppose that the data mining task is to cluster the following eight points into THREE (3)
clusters. The initial centres of each cluster are A1, A4 and A7.

A1 = (3, 10)    A5 = (7, 5)
A2 = (2, 5)     A6 = (6, 3)
A3 = (8, 2)     A7 = (1, 1)
A4 = (5, 8)     A8 = (3, 9)

     A1    A2    A3    A4    A5    A6    A7    A8
A1   0     √26   √89   √8    √41   √58   √85   √1
A2         0     √45   √18   √25   √20   √17   √17
A3               0     √45   √10   √5    √50   √74
A4                     0     √13   √26   √65   √5
A5                           0     √5    √52   √32
A6                                 0     √29   √45
A7                                       0     √68
A8                                             0

a) Show the new cluster of each point.

b) Draw a 10 by 10 grid with all the 8 points and show the clusters after the first epoch.

c) Calculate the new centres for each cluster.
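
Parts (a) and (c) amount to one assignment step followed by one update step. The sketch below is one way to check the working, assuming Euclidean distance and the initial centres A1, A4 and A7 from the question; `kmeans_epoch` is a hypothetical helper name, not a library function.

```python
import math

points = {
    "A1": (3, 10), "A2": (2, 5), "A3": (8, 2), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 3), "A7": (1, 1), "A8": (3, 9),
}


def kmeans_epoch(points, centres):
    """Run one epoch: assign points to the nearest centre, then recompute centres."""
    clusters = [[] for _ in centres]
    for name, p in points.items():
        dists = [math.dist(p, c) for c in centres]
        clusters[dists.index(min(dists))].append(name)

    # Each initial centre is itself a data point, so no cluster is empty here.
    new_centres = []
    for members in clusters:
        xs = [points[n][0] for n in members]
        ys = [points[n][1] for n in members]
        new_centres.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return clusters, new_centres


initial = [points["A1"], points["A4"], points["A7"]]
clusters, new_centres = kmeans_epoch(points, initial)
print(clusters)
print(new_centres)
```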



Question 5

The following diagram shows the results of k-means clustering with k running from 2 to
12.

What is the best number of clusters based on the above figure? Justify why.

Question 6

The following points represent the location of eight cities:

X1 = (5, 5) Y2 = (3, 5)
X2 = (12, 4) Z1 = (9, 2)
X3 = (8, 2) Z2 = (11, 2)
Y1 = (4, 6) Z3 = (4, 9)

The task is to cluster these points into three clusters. Suppose X2, Y1 and Z2 are assigned
as the initial centers of the clusters. Use the k-means algorithm with the Manhattan distance
function to show the three cluster centers after the first round of execution.
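
The first round can be checked with the sketch below. It assumes that points are assigned by Manhattan distance while the new center is still the coordinate-wise mean, as the name k-means suggests; a k-medians variant would take the median instead. The function and variable names are illustrative.

```python
cities = {
    "X1": (5, 5), "X2": (12, 4), "X3": (8, 2), "Y1": (4, 6),
    "Y2": (3, 5), "Z1": (9, 2), "Z2": (11, 2), "Z3": (4, 9),
}


def manhattan(p, q):
    """Manhattan (city-block) distance: sum of absolute coordinate differences."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


initial = ["X2", "Y1", "Z2"]  # initial centers given in the question

# Assignment step: each city joins the nearest initial center.
clusters = {name: [] for name in initial}
for city, p in cities.items():
    nearest = min(initial, key=lambda c: manhattan(p, cities[c]))
    clusters[nearest].append(city)

# Update step (assumption: mean of the members, not the median).
centers = {
    name: (
        sum(cities[m][0] for m in members) / len(members),
        sum(cities[m][1] for m in members) / len(members),
    )
    for name, members in clusters.items()
}

print(clusters)
print(centers)
```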

Question 7

Suppose the data mining task is to cluster the following six points into two clusters.

A1 = (3, 1)    A3 = (4, 5)    A5 = (1, 2)
A2 = (2, 3)    A4 = (0, 3)    A6 = (4, 7)

The distance function is Manhattan distance. Suppose A1 and A5 are initially assigned
as the centers of the clusters. Use the k-means algorithm to show only the two
cluster centers after the first round of execution.
