
Answer

Given: K = 2, initial cluster centers (0,0) and (5,6), and

Data samples: (0,0), (0,1), (1,0), (3,3), (5,6), (8,9), (9,8), (9,9)

Iteration 1

              Distance from               Distance from
Data Sample   Cluster-1’s center (0,0)    Cluster-2’s center (5,6)
(0,0)         0                           7.81
(0,1)         1                           7.07
(1,0)         1                           7.21
(3,3)         4.24                        3.61
(5,6)         7.81                        0
(8,9)         12.04                       4.24
(9,8)         12.04                       4.47
(9,9)         12.73                       5

Updated cluster centers:

For Cluster-1: centroid of samples (0,0), (0,1), and (1,0) = (0.33,0.33)

For Cluster-2: centroid of samples (3,3), (5,6), (8,9), (9,8), and (9,9) = (6.8,7)
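The assignment and update steps of one iteration can be sketched in plain Python (variable names such as `clusters` and `euclidean` are my own; no libraries are assumed):

```python
import math

points = [(0, 0), (0, 1), (1, 0), (3, 3), (5, 6), (8, 9), (9, 8), (9, 9)]
centers = [(0, 0), (5, 6)]  # initial cluster centers

def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Assignment step: each point goes to its nearest center.
clusters = [[] for _ in centers]
for p in points:
    nearest = min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
    clusters[nearest].append(p)

# Update step: each center moves to the centroid of its cluster.
centers = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
           for c in clusters]

print(clusters)  # [[(0,0), (0,1), (1,0)], [(3,3), (5,6), (8,9), (9,8), (9,9)]]
print(centers)   # [(0.33..., 0.33...), (6.8, 7.0)]
```

Running the same two steps again reproduces Iterations 2 and 3 above.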
Iteration 2

              Distance from                     Distance from
Data Sample   Cluster-1’s center (0.33,0.33)    Cluster-2’s center (6.8,7)
(0,0)         0.47                              9.76
(0,1)         0.75                              9.07
(1,0)         0.75                              9.09
(3,3)         3.78                              5.52
(5,6)         7.35                              2.06
(8,9)         11.58                             2.33
(9,8)         11.58                             2.42
(9,9)         12.26                             2.97

Updated cluster centers:

For Cluster-1: centroid of samples (0,0), (0,1), (1,0), and (3,3) = (1,1)

For Cluster-2: centroid of samples (5,6), (8,9), (9,8), and (9,9) = (7.75,8)

Iteration 3

              Distance from               Distance from
Data Sample   Cluster-1’s center (1,1)    Cluster-2’s center (7.75,8)
(0,0)         1.41                        11.14
(0,1)         1                           10.44
(1,0)         1                           10.47
(3,3)         2.83                        6.90
(5,6)         6.40                        3.40
(8,9)         10.63                       1.03
(9,8)         10.63                       1.25
(9,9)         11.31                       1.60
Updated cluster centers:

For Cluster-1: centroid of samples (0,0), (0,1), (1,0), and (3,3) = (1,1)

For Cluster-2: centroid of samples (5,6), (8,9), (9,8), and (9,9) = (7.75,8)

No change in cluster centers. Therefore, final clusters are

Cluster-1: (0,0), (0,1), (1,0), and (3,3)

Cluster-2: (5,6), (8,9), (9,8), and (9,9)

Within-cluster sum of squares (WCSS):

WCSS = {(0−1)² + (0−1)²} + {(0−1)² + (1−1)²} + {(1−1)² + (0−1)²} + {(3−1)² + (3−1)²}
     + {(5−7.75)² + (6−8)²} + {(8−7.75)² + (9−8)²} + {(9−7.75)² + (8−8)²} + {(9−7.75)² + (9−8)²}
     = 12 + 16.75 = 28.75
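A quick numerical check of this sum, assuming the final centers (1,1) and (7.75,8) computed above (the `clusters` mapping is my own notation):

```python
# Final clusters keyed by their centroid.
clusters = {
    (1.0, 1.0): [(0, 0), (0, 1), (1, 0), (3, 3)],
    (7.75, 8.0): [(5, 6), (8, 9), (9, 8), (9, 9)],
}

# WCSS: squared Euclidean distance of every point to its own centroid.
wcss = sum((x - cx) ** 2 + (y - cy) ** 2
           for (cx, cy), members in clusters.items()
           for (x, y) in members)

print(wcss)  # 28.75
```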
Subjecting the data points in Question 3.1 to K-medoids

Assumption: K=2, initial medoids are (0,0) & (5,6), and Manhattan distance is
used as a dissimilarity measure/metric.

              Manhattan distance from     Manhattan distance from
Data Sample   Cluster-1’s medoid (0,0)    Cluster-2’s medoid (5,6)
(0,0)         0                           11
(0,1)         1                           10
(1,0)         1                           10
(3,3)         6                           5
(5,6)         11                          0
(8,9)         17                          6
(9,8)         17                          6
(9,9)         18                          7

Cost = 0+1+1+5+0+6+6+7 = 26.

What will be the cost if, in Cluster-1, the medoid (0,0) is swapped with the non-medoid data point (1,0)?

              Manhattan distance from     Manhattan distance from
Data Sample   Cluster-1’s medoid (1,0)    Cluster-2’s medoid (5,6)
(0,0)         1                           11
(0,1)         2                           10
(1,0)         0                           10
(3,3)         5                           5
(5,6)         10                          0
(8,9)         16                          6
(9,8)         16                          6
(9,9)         17                          7

Cost = 1+2+0+5+0+6+6+7 = 27. The cost increases; therefore, the swap should be rejected.
Will there be a decrease in the cost if in Cluster-2, the medoid (5,6) is swapped
with the non-medoid data point (8,9)?

              Manhattan distance from     Manhattan distance from
Data Sample   Cluster-1’s medoid (0,0)    Cluster-2’s medoid (8,9)
(0,0)         0                           17
(0,1)         1                           16
(1,0)         1                           16
(3,3)         6                           11
(5,6)         11                          6
(8,9)         17                          0
(9,8)         17                          2
(9,9)         18                          1

Cost = 0+1+1+6+6+0+2+1 = 17. The cost decreases and therefore, they can be
swapped. The medoid for Cluster-2 should be (8,9).
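The three cost comparisons above can be reproduced with a small sketch (the `cost` helper is my own, not from any library):

```python
points = [(0, 0), (0, 1), (1, 0), (3, 3), (5, 6), (8, 9), (9, 8), (9, 9)]

def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def cost(medoids):
    # Each point contributes its Manhattan distance to the nearest medoid.
    return sum(min(manhattan(p, m) for m in medoids) for p in points)

print(cost([(0, 0), (5, 6)]))  # 26: initial configuration
print(cost([(1, 0), (5, 6)]))  # 27: swap (0,0)->(1,0) rejected
print(cost([(0, 0), (8, 9)]))  # 17: swap (5,6)->(8,9) accepted
```

This is exactly the PAM-style swap test: a candidate swap is kept only if it lowers the total cost.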

Since no other swap reduces the cost below 17, the final clusters are formed from the medoids (0,0) and (8,9). The final clusters will be

Cluster-1: (0,0), (0,1), (1,0), and (3,3)

Cluster-2: (5,6), (8,9), (9,8), and (9,9)


Answer for part-(a): single-linkage clustering

(The matrices below hold pairwise similarities, with 1 meaning identical, so each iteration merges the pair of clusters with the highest entry; under single linkage, the similarity between two clusters is the maximum similarity over their cross pairs.)

ITERATION 1

STEP 1.1

P1 P2 P3 P4 P5 P6
P1 1
P2 0.7895 1
P3 0.1579 0.3684 1
P4 0.0100 0.2105 0.8421 1
P5 0.5292 0.7023 0.5292 0.3840 1
P6 0.3542 0.5480 0.6870 0.5573 0.8105 1

STEP 1.2

P1 P2 P34 P5 P6
P1 1
P2 0.7895 1
P34 0.1579 0.3684 1
P5 0.5292 0.7023 0.5292 1
P6 0.3542 0.5480 0.6870 0.8105 1
ITERATION 2

STEP 2.1

P1 P2 P34 P5 P6
P1 1
P2 0.7895 1
P34 0.1579 0.3684 1
P5 0.5292 0.7023 0.5292 1
P6 0.3542 0.5480 0.6870 0.8105 1

STEP 2.2

P1 P2 P34 P56
P1 1
P2 0.7895 1
P34 0.1579 0.3684 1
P56 0.5292 0.7023 0.6870 1

ITERATION 3

STEP 3.1

P1 P2 P34 P56
P1 1
P2 0.7895 1
P34 0.1579 0.3684 1
P56 0.5292 0.7023 0.6870 1

STEP 3.2

P12 P34 P56


P12 1
P34 0.3684 1
P56 0.7023 0.6870 1
ITERATION 4

STEP 4.1

P12 P34 P56


P12 1
P34 0.3684 1
P56 0.7023 0.6870 1

The next (i.e. topmost) level of the hierarchy will have P1256 and P34.

Resulting dendrogram
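The merge sequence shown in the dendrogram can be reproduced with a minimal sketch in plain Python (names such as `link` and `merges` are my own):

```python
# Pairwise similarities from STEP 1.1 (1 = identical).
sim = {
    ('P1', 'P2'): 0.7895, ('P1', 'P3'): 0.1579, ('P1', 'P4'): 0.0100,
    ('P1', 'P5'): 0.5292, ('P1', 'P6'): 0.3542, ('P2', 'P3'): 0.3684,
    ('P2', 'P4'): 0.2105, ('P2', 'P5'): 0.7023, ('P2', 'P6'): 0.5480,
    ('P3', 'P4'): 0.8421, ('P3', 'P5'): 0.5292, ('P3', 'P6'): 0.6870,
    ('P4', 'P5'): 0.3840, ('P4', 'P6'): 0.5573, ('P5', 'P6'): 0.8105,
}

def link(a, b):
    # Single linkage: best (largest) similarity over all cross pairs.
    return max(sim[tuple(sorted((x, y)))] for x in a for y in b)

clusters = [frozenset([p]) for p in ('P1', 'P2', 'P3', 'P4', 'P5', 'P6')]
merges = []
while len(clusters) > 1:
    # Merge the most similar pair of clusters.
    a, b = max(((a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]),
               key=lambda pair: link(*pair))
    merges.append((sorted(a | b), link(a, b)))
    clusters = [c for c in clusters if c not in (a, b)] + [a | b]

for members, score in merges:
    print(members, score)
# P34 (0.8421), P56 (0.8105), P12 (0.7895), P1256 (0.7023), all (0.6870)
```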
Answer for part-(b): complete-linkage clustering

(Under complete linkage, the similarity between two clusters is the minimum similarity over their cross pairs, so each merged row below keeps the smaller of the two combined values.)

ITERATION 1

STEP 1.1

P1 P2 P3 P4 P5 P6
P1 1
P2 0.7895 1
P3 0.1579 0.3684 1
P4 0.0100 0.2105 0.8421 1
P5 0.5292 0.7023 0.5292 0.3840 1
P6 0.3542 0.5480 0.6870 0.5573 0.8105 1

STEP 1.2

P1 P2 P34 P5 P6
P1 1
P2 0.7895 1
P34 0.0100 0.2105 1
P5 0.5292 0.7023 0.3840 1
P6 0.3542 0.5480 0.5573 0.8105 1

ITERATION 2

STEP 2.1

P1 P2 P34 P5 P6
P1 1
P2 0.7895 1
P34 0.0100 0.2105 1
P5 0.5292 0.7023 0.3840 1
P6 0.3542 0.5480 0.5573 0.8105 1
STEP 2.2

P1 P2 P34 P56
P1 1
P2 0.7895 1
P34 0.0100 0.2105 1
P56 0.3542 0.5480 0.3840 1

ITERATION 3

STEP 3.1

P1 P2 P34 P56
P1 1
P2 0.7895 1
P34 0.0100 0.2105 1
P56 0.3542 0.5480 0.3840 1

STEP 3.2

P12 P34 P56


P12 1
P34 0.0100 1
P56 0.3542 0.3840 1

ITERATION 4

STEP 4.1

P12 P34 P56


P12 1
P34 0.0100 1
P56 0.3542 0.3840 1

The next (i.e. topmost) level of the hierarchy will have P3456 and P12.
Resulting dendrogram
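The same sketch as in part-(a) reproduces this merge order once the scoring rule is flipped to the minimum cross-pair similarity (again, `link` and `merges` are my own names):

```python
# Pairwise similarities from STEP 1.1 (1 = identical).
sim = {
    ('P1', 'P2'): 0.7895, ('P1', 'P3'): 0.1579, ('P1', 'P4'): 0.0100,
    ('P1', 'P5'): 0.5292, ('P1', 'P6'): 0.3542, ('P2', 'P3'): 0.3684,
    ('P2', 'P4'): 0.2105, ('P2', 'P5'): 0.7023, ('P2', 'P6'): 0.5480,
    ('P3', 'P4'): 0.8421, ('P3', 'P5'): 0.5292, ('P3', 'P6'): 0.6870,
    ('P4', 'P5'): 0.3840, ('P4', 'P6'): 0.5573, ('P5', 'P6'): 0.8105,
}

def link(a, b):
    # Complete linkage: worst (smallest) similarity over all cross pairs.
    return min(sim[tuple(sorted((x, y)))] for x in a for y in b)

clusters = [frozenset([p]) for p in ('P1', 'P2', 'P3', 'P4', 'P5', 'P6')]
merges = []
while len(clusters) > 1:
    # Still merge the most similar pair; only the cluster score changed.
    a, b = max(((a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]),
               key=lambda pair: link(*pair))
    merges.append((sorted(a | b), link(a, b)))
    clusters = [c for c in clusters if c not in (a, b)] + [a | b]

for members, score in merges:
    print(members, score)
# P34 (0.8421), P56 (0.8105), P12 (0.7895), P3456 (0.3840), all (0.0100)
```

Note the effect of the stricter rule: P34 and P56 now merge at 0.3840 rather than P12 joining P56, which is why the topmost level pairs P3456 with P12.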
