▪ Divisive: works the other way around, as a top-down approach.
AGGLOMERATIVE CLUSTERING
ALGORITHM
1. Consider each data record as a cluster (i.e., “single-point” cluster).
2. The number of clusters is initially n, the number of data records in the input dataset.
3. Merge the two closest clusters into one bigger cluster. The number of clusters becomes n-1.
4. Repeat step 3 until a single cluster is formed: the “universe” cluster.
5. Construct a tree (i.e., dendrogram) to visualize the progression of the formed clusters at each step.
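The steps above can be sketched in plain Python. This is only an illustrative sketch: the function name `agglomerate`, the sample `points`, and the choice of single linkage (cluster distance = smallest distance between any two members) are assumptions, not something the algorithm description itself fixes.

```python
import math

def agglomerate(points):
    """Merge clusters bottom-up until one 'universe' cluster remains.

    Returns a list of (cluster_a, cluster_b, distance) tuples, one per merge.
    Assumption: single linkage is used to measure cluster-to-cluster distance.
    """
    # Steps 1-2: every record starts as its own single-point cluster (n clusters).
    clusters = [[i] for i in range(len(points))]
    merges = []

    def cluster_distance(a, b):
        # Smallest Euclidean distance between a member of a and a member of b.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    # Steps 3-4: keep merging the two closest clusters until one is left.
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        # Delete the higher index first so the lower index stays valid.
        del clusters[y]
        del clusters[x]
        clusters.append(merged)
    return merges

# Hypothetical records (not taken from the slides).
points = [(2, 3), (3, 4), (9, 8), (11, 10), (6, 1)]
merges = agglomerate(points)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 2))
```

The returned merge list is the information step 5 needs: each entry records which clusters were combined and at what distance. With n records there are always n-1 merges.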
Example:
1. Compute the Euclidean distance between the students.
2. Create the proximity matrix of the Euclidean distances between all the data records.
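As a rough illustration of these two steps, the sketch below computes a proximity matrix with Python's standard `math.dist`. The `records` values are hypothetical (two made-up marks per student) and are not the dataset used in the slides; `proximity_matrix` is likewise an illustrative name.

```python
import math

# Hypothetical records: two marks per student (NOT the slide's dataset).
records = [(2, 3), (3, 4), (9, 8), (11, 10), (6, 1)]

def proximity_matrix(records):
    """Return the n x n matrix of pairwise Euclidean distances."""
    n = len(records)
    return [[math.dist(records[i], records[j]) for j in range(n)]
            for i in range(n)]

matrix = proximity_matrix(records)
for row in matrix:
    print("  ".join(f"{d:5.2f}" for d in row))
```

The matrix is symmetric with zeros on the diagonal, so only the entries above (or below) the diagonal need to be inspected when looking for the closest pair.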
The next step is to combine the two closest data records, data record number 1 and data record number 2, into one cluster (Cluster 1), because their distance (3.61) is the minimum value in the proximity matrix.
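Finding the minimum entry amounts to scanning the part of the proximity matrix above the diagonal. A minimal sketch follows; the function name `closest_pair` and the small four-record `matrix` are hypothetical and do not reproduce the slide's values.

```python
def closest_pair(matrix):
    """Return (i, j, distance) for the smallest off-diagonal entry."""
    best = None
    for i in range(len(matrix)):
        # Only look above the diagonal: the matrix is symmetric.
        for j in range(i + 1, len(matrix)):
            if best is None or matrix[i][j] < best[2]:
                best = (i, j, matrix[i][j])
    return best

# Hypothetical symmetric proximity matrix for four records.
matrix = [
    [0.0, 2.0, 6.0, 10.0],
    [2.0, 0.0, 5.0, 9.0],
    [6.0, 5.0, 0.0, 4.0],
    [10.0, 9.0, 4.0, 0.0],
]
i, j, d = closest_pair(matrix)
# Indices are zero-based in code, so add 1 to get record numbers.
print(f"merge records {i + 1} and {j + 1} at distance {d}")
```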
▪ The next minimum distance is 9.22, between data record number 3 and data record number 5; they are therefore combined into one cluster (Cluster 2).
▪ The next minimum distance between the data records (or clusters) is 9.43, found between data record number 3 and data record number 4. Since data record number 3 already belongs to Cluster 2, data record number 4 is merged with Cluster 2 to form Cluster 3.
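The example appears to measure the distance between a cluster and another record (or cluster) as the smallest distance between any pair of their members, which is usually called single linkage. The sketch below shows that rule under this assumption; `single_linkage` and the `records` values are hypothetical.

```python
import math

def single_linkage(cluster_a, cluster_b, records):
    """Smallest distance between any member of cluster_a and any of cluster_b."""
    return min(math.dist(records[i], records[j])
               for i in cluster_a for j in cluster_b)

# Hypothetical records (not the slide's dataset), indexed from 0.
records = [(2, 3), (3, 4), (9, 8), (11, 10), (6, 1)]

existing_cluster = [0, 1]   # a cluster formed by an earlier merge
lone_record = [4]           # a record that is still a single-point cluster

d = single_linkage(existing_cluster, lone_record, records)
print(round(d, 2))
```

Here the lone record is closer to the second member of the cluster than to the first, so that pair alone determines the cluster-to-record distance.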
▪ The next minimum distance is 14.14 between data record number 1 (element of
Cluster 1) and data record number 4 (element of Cluster 3). A larger cluster
(Cluster 4) is then formed by combining clusters 1 and 3. Since all the data records
are included in this new cluster, this cluster is the “universe” cluster.
▪ Finally, the tree (i.e., dendrogram) is constructed as a two-dimensional graph, with the x-axis for the data records of the input dataset and the y-axis for the recorded distance between the combined data records (or clusters) at each step, as shown in the following figure.
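The information a dendrogram encodes can also be sketched as a nested tree built from the merge history. The code below is a text-only illustration, not a plot: the labels, the merge distances, and the helper names `build_tree` and `render` are all hypothetical. Each merge refers to its two inputs by index, with indices past the last record standing for clusters created by earlier merges.

```python
def build_tree(labels, merges):
    """Build a nested (distance, left, right) tree from a merge history.

    merges: list of (a, b, distance); a and b index into the growing node
    list, where the first len(labels) entries are the individual records.
    """
    nodes = list(labels)
    for a, b, distance in merges:
        nodes.append((distance, nodes[a], nodes[b]))
    return nodes[-1]  # the last merge is the "universe" cluster

def render(node, depth=0):
    """Return indented text lines: merge heights with their children below."""
    if isinstance(node, tuple):
        distance, left, right = node
        lines = ["  " * depth + f"merge at {distance:.2f}"]
        lines += render(left, depth + 1)
        lines += render(right, depth + 1)
        return lines
    return ["  " * depth + str(node)]

labels = ["R1", "R2", "R3", "R4", "R5"]
# Hypothetical merges; indices 5, 6, 7 are clusters made by earlier merges.
merges = [(0, 1, 1.41), (2, 3, 2.83), (4, 5, 4.24), (6, 7, 7.21)]
tree = build_tree(labels, merges)
print("\n".join(render(tree)))
```

In a plotted dendrogram the leaves would sit along the x-axis and each merge would be drawn at the height given by its distance on the y-axis.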