Professional Documents
Culture Documents
1
0.05 1
3
0
1 3 2 5 4 6
Strengths of Hierarchical Clustering
• Do not have to assume any particular number
of clusters
– Any desired number of clusters can be obtained by
‘cutting’ the dendrogram at the proper level
– Divisive:
• Start with one, all-inclusive cluster
• At each step, split a cluster until each cluster contains an individual point (or there
are k clusters)
proximity matrix p1
p2
p3
p4
p5
.
.
. Proximity Matrix
...
p1 p2 p3 p4 p9 p10 p11 p12
Intermediate Situation
• After some merging steps, we have some clusters
C1 C2 C3 C4 C5
C1
C2
C3
C3
C4
C4
C5
Proximity Matrix
C1
C2 C5
...
p1 p2 p3 p4 p9 p10 p11 p12
Intermediate Situation
• We want to merge the two closest clusters (C2C1and
C2C5)C3and
C4update
C5
C2
C3
C3
C4
C4
C5
Proximity Matrix
C1
C2 C5
...
p1 p2 p3 p4 p9 p10 p11 p12
After Merging
C2
• The question is “How do we update the proximity matrix?”
U
C1 C5 C3 C4
C1 ?
C2 U C5 ? ? ? ?
C3
C3 ?
C4
C4 ?
Proximity Matrix
C1
C2 U C5
...
p1 p2 p3 p4 p9 p10 p11 p12
How to Define Inter-Cluster Distance
p1 p2 p3 p4 p5 ...
p1
Similarity?
p2
p3
p4
p5
MIN
.
MAX .
Group Average .
Proximity Matrix
Distance Between Centroids
Other methods driven by an objective
function
– Ward’s Method uses squared error
How to Define Inter-Cluster Similarity
p1 p2 p3 p4 p5 ...
p1
p2
p3
p4
p5
MIN
.
MAX .
Group Average .
Proximity Matrix
Distance Between Centroids
Other methods driven by an objective
function
– Ward’s Method uses squared error
How to Define Inter-Cluster Similarity
p1 p2 p3 p4 p5 ...
p1
p2
p3
p4
p5
MIN
.
MAX .
Group Average .
Proximity Matrix
Distance Between Centroids
Other methods driven by an objective
function
– Ward’s Method uses squared error
How to Define Inter-Cluster Similarity
p1 p2 p3 p4 p5 ...
p1
p2
p3
p4
p5
MIN
.
MAX .
Group Average .
Proximity Matrix
Distance Between Centroids
Other methods driven by an objective
function
– Ward’s Method uses squared error
How to Define Inter-Cluster Similarity
p1 p2 p3 p4 p5 ...
p1
p2
p3
p4
p5
MIN
.
MAX .
Group Average .
Proximity Matrix
Distance Between Centroids
Other methods driven by an objective
function
– Ward’s Method uses squared error
MIN or Single Link
• Proximity of two clusters is based on the two closest points in
the different clusters
– Determined by one pair of points, i.e., by one link in the proximity graph
– You start with smaller clusters and then merge these with other clusters
to form bigger ones
• Example:
Distance Matrix:
Hierarchical Clustering: MIN
5
1 Dendrogram
3
5 0.2
2 1 0.15
2 3 6 0.1
0.05
4
4 0
3 6 2 5 4 1
• 1st level: 5 and 2 are closer (sub-cluster A); 3 and 6 are closer (sub-cluster B)
• 2nd level: sub-cluster A is closer to sub-cluster B; merge them to form sub-cluster C
• 3rd level: Merge 4 into sub-cluster C to form a bigger cluster D.
• Finally, merge 1 into sub-cluster D to form a cluster E
Nested Clusters
Strength of MIN
Two Clusters
Original Points
4 1
2 5 0.4
0.35
5
2 0.3
0.25
3 6
0.2
3 0.15
1 0.1
4 0.05
0
3 6 4 1 2 5
p jClusterj
proximity(Clusteri , Clusterj )
|Clusteri ||Clusterj |
5 4 1
2 0.25
5 0.2
2
0.15
3 6 0.1
1 0.05
4 0
3 3 6 4 1 2 5
• Strengths
– Less susceptible to noise and outliers
• Limitations
– Biased towards globular clusters