Distance Measures
Euclidean Distance
Mahalanobis Distance
Minkowski Distance
Euclidean Distance
The Euclidean distance is a non-negative measure of the straight-line
distance between two points. It follows from the Pythagorean theorem:
the square root of the sum of the squared coordinate differences.
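As a minimal sketch of the definition above, the function below computes the Euclidean distance for points of any dimension. The function name and the sample points are illustrative assumptions, not taken from the original material.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Pythagorean theorem generalised to n dimensions
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical points forming a 3-4-5 right triangle
print(euclidean_distance((0, 0), (3, 4)))
```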
Mahalanobis Distance
The Mahalanobis distance measures the distance between two points in
units of standard deviation, much like a multivariate z-score. It takes
the scale (variance) and correlation (covariance) of the data into
account.
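One way to see this is a small two-dimensional sketch that evaluates sqrt((x - y)^T S^-1 (x - y)) for a given covariance matrix S. The function name, the restriction to 2-D, and the example covariance values are assumptions made for illustration.

```python
import math

def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between 2-D points x and y for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = x[0] - y[0], x[1] - y[1]
    # Quadratic form (x - y)^T S^-1 (x - y)
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)

# Hypothetical covariance: variance 4 along x, variance 1 along y, no correlation
cov = ((4.0, 0.0), (0.0, 1.0))
print(mahalanobis_2d((0, 0), (2, 0), cov))  # 2 units along x is one standard deviation
```

With the identity matrix as the covariance, the result reduces to the Euclidean distance.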
Minkowski Distance
Single Linkage
Complete Linkage
Centroid Linkage
Ward’s Linkage
Average Linkage
The process is then repeated until there is only a single cluster left.
For the linkage examples, I will use a simple scatter plot of the
points defined in the following table.
The scatter plot suggests three clusters.
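Hierarchical Clustering using Single Linkage
For single linkage, the distance between two clusters is the minimum
distance between any pair of points, one from each cluster. The two
clusters with the smallest such distance are merged. This process
repeats until there is only a single cluster left.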
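The merge loop can be sketched in a few lines of plain Python. The original table of points is not available here, so the five points below are hypothetical, and the function names `single_linkage` and `agglomerate` are assumptions chosen for this illustration.

```python
import math
from itertools import combinations

def single_linkage(a, b):
    """Smallest distance between any point in cluster a and any point in cluster b."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, linkage):
    """Repeatedly merge the closest pair of clusters; return the merge history."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    history = []
    while len(clusters) > 1:
        # Pick the pair of cluster indices with the smallest linkage distance
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        d = linkage(clusters[i], clusters[j])
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

# Hypothetical points, not the ones from the original table
points = [(1, 1), (2, 1), (5, 4), (6, 5), (9, 1)]
for left, right, d in agglomerate(points, single_linkage):
    print(left, "+", right, "at distance", round(d, 3))
```

Passing a different cluster-distance function as `linkage` changes the merge rule without touching the loop.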
Hierarchical Clustering using Complete Linkage
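For complete linkage, the distance between two clusters is the maximum
distance between any pair of points, one from each cluster. The two
clusters with the smallest such maximum distance are merged. This
process repeats until there is only a single cluster left.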
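A minimal sketch of the complete-linkage distance, assuming three hypothetical clusters (the original table of points is not reproduced here). The function and variable names are illustrative.

```python
import math

def complete_linkage(a, b):
    """Largest distance between any point in cluster a and any point in cluster b."""
    return max(math.dist(p, q) for p in a for q in b)

# Hypothetical clusters
ab = [(1, 1), (2, 1)]
cd = [(5, 4), (6, 5)]
e = [(9, 1)]

candidates = {"ab+cd": complete_linkage(ab, cd),
              "ab+e": complete_linkage(ab, e),
              "cd+e": complete_linkage(cd, e)}
# The pair with the smallest farthest-point distance is merged next
print(min(candidates, key=candidates.get), candidates)
```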
Hierarchical Clustering using Centroid Linkage
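For centroid linkage, the distance between two clusters is the distance
between their centroids (the mean of each cluster's points). The two
clusters with the closest centroids are merged. This process repeats
until there is only a single cluster left.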
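The centroid rule can be illustrated with two small helpers. The clusters below are hypothetical, and the names `centroid` and `centroid_linkage` are assumptions for this sketch.

```python
import math

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def centroid_linkage(a, b):
    """Euclidean distance between the centroids of clusters a and b."""
    return math.dist(centroid(a), centroid(b))

# Hypothetical clusters
ab = [(1, 1), (2, 1)]
cd = [(5, 4), (6, 5)]

print(centroid(ab), centroid(cd), centroid_linkage(ab, cd))
```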
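Hierarchical Clustering using Ward’s Linkage
For Ward’s linkage, two clusters are merged based on the error sum of
squares (ESS), the sum of squared distances from each point to its
cluster centroid. The two clusters whose merger gives the smallest
increase in total ESS are merged. This process repeats until there is
only a single cluster left.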
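As a rough sketch of the ESS criterion, the helpers below compute the ESS of a cluster and the increase in ESS caused by merging two clusters. The clusters are hypothetical and the names `ess` and `ward_cost` are assumptions for this example.

```python
def ess(cluster):
    """Error sum of squares: squared distances from each point to the cluster centroid."""
    n = len(cluster)
    center = [sum(coords) / n for coords in zip(*cluster)]
    return sum((x - m) ** 2 for p in cluster for x, m in zip(p, center))

def ward_cost(a, b):
    """Increase in total ESS if clusters a and b (lists of points) were merged."""
    return ess(a + b) - ess(a) - ess(b)

# Hypothetical clusters
ab = [(1, 1), (2, 1)]
cd = [(5, 4), (6, 5)]
e = [(9, 1)]

costs = {"ab+cd": ward_cost(ab, cd),
         "ab+e": ward_cost(ab, e),
         "cd+e": ward_cost(cd, e)}
# The pair with the smallest ESS increase is merged next
print(min(costs, key=costs.get), costs)
```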