Ki2 s07 Clustering Algorithms
Clustering Algorithms
Johan Everts
What is Clustering?
Find K clusters (or a classification that consists of K clusters) so that the objects of one cluster are similar to each other whereas objects of different clusters are dissimilar. (Bacher 1996)
Stages in clustering
Hierarchical Clustering
Agglomerative clustering treats each data point as a singleton cluster, and then successively merges clusters until all points have been merged into a single remaining cluster. Divisive clustering works the other way around.
Agglomerative Clustering
Single link: in single-link hierarchical clustering, each step merges the two clusters whose closest members have the smallest distance.
Agglomerative Clustering
Complete link: in complete-link hierarchical clustering, each step merges the two clusters whose merger has the smallest diameter.
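The two merge rules can be compared side by side with a minimal sketch. It is not taken from the slides: the function names and the four 1-D points are illustrative assumptions, and complete link is implemented in its usual form, as the distance between the two farthest members of a pair of clusters.

```python
from itertools import combinations

def linkage_distance(a, b, dist, method):
    """Distance between clusters a and b, given as lists of point indices."""
    pairwise = [dist[i][j] for i in a for j in b]
    # Single link looks at the closest pair, complete link at the farthest pair.
    return min(pairwise) if method == "single" else max(pairwise)

def agglomerate(dist, method="single"):
    """Merge clusters until one remains; return (merged cluster, distance) per step."""
    clusters = [[i] for i in range(len(dist))]  # every point starts as a singleton
    history = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage_distance(pair[0], pair[1], dist, method),
        )
        d = linkage_distance(a, b, dist, method)
        clusters.remove(a)
        clusters.remove(b)
        merged = sorted(a + b)
        clusters.append(merged)
        history.append((merged, d))
    return history

# Illustrative 1-D points (hypothetical data, not from the slides).
points = [0.0, 1.0, 3.0, 5.5]
dist = [[abs(p - q) for q in points] for p in points]

single_history = agglomerate(dist, "single")
complete_history = agglomerate(dist, "complete")
```

On these points the two rules build different hierarchies: single link keeps growing one chain of points, while complete link first pairs the two right-hand points before joining the two pairs.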
[Worked example (figure residue): successive distance matrices for agglomerative clustering of six items labelled BA, FI, MI, NA, RM and TO. Clusters formed along the way: MI/TO, NA/RM, BA/NA/RM, BA/FI/NA/RM. The final matrix has two clusters, BA/FI/NA/RM and MI/TO, at distance 295.]
Square error
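The formula on this slide did not survive extraction. As a hedged reconstruction, the squared-error criterion in its standard form is shown below; the symbols are assumptions rather than the slide's own notation: $C_j$ is the $j$-th of $K$ clusters, $x_i$ a pattern assigned to it, and $c_j$ its centroid.

```latex
e^2 = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - c_j \rVert^2,
\qquad
c_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i
```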
K-Means
Step 0: Start with a random partition into K clusters.
Step 1: Generate a new partition by assigning each pattern to its closest cluster center.
Step 2: Compute new cluster centers as the centroids of the clusters.
Step 3: Repeat steps 1 and 2 until there is no change in membership (the cluster centers then also remain the same).
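The steps above can be sketched in plain Python. This is a minimal illustration, not the presenter's implementation: the function names, the optional `initial_labels` argument, the handling of empty clusters, and the sample points are all assumptions. Because step 0 produces a partition rather than centers, the sketch first computes the centroids of that partition before entering the loop.

```python
import random

def closest(point, centers):
    """Index of the center nearest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])),
    )

def centroid(members):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(members) for axis in zip(*members))

def k_means(points, k, initial_labels=None, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 0: start from a random partition unless one is supplied.
    if initial_labels is not None:
        labels = list(initial_labels)
    else:
        labels = [rng.randrange(k) for _ in points]
    centers = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        # Assumption: an empty cluster is seeded with a randomly chosen point.
        centers.append(centroid(members) if members else tuple(rng.choice(points)))
    for _ in range(max_iter):
        # Step 1: assign each pattern to its closest cluster center.
        new_labels = [closest(p, centers) for p in points]
        # Step 3: stop once the membership no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 2: recompute each center as the centroid of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels, centers

# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(points, k=2, initial_labels=[0, 1, 0, 1, 0, 1])
```

Passing `initial_labels` makes the run reproducible; leaving it out gives the random starting partition described in step 0, in which case the result can depend on the seed.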
Leader - Follower
Online: instances are processed one at a time.
Specify a threshold distance.
For each instance, find the closest cluster center.
If the distance is above the threshold, create a new cluster.
Otherwise, add the instance to that cluster and update its cluster center.
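A minimal sketch of this online procedure follows. The slides do not say how the cluster center is updated, so the running mean used here is an assumption (a fixed learning rate is another common choice); the function name and the sample stream are also illustrative.

```python
import math

def leader_follower(stream, threshold):
    """One pass over the stream; the number of clusters is not fixed in advance."""
    centers = []  # current cluster centers
    counts = []   # number of instances absorbed by each cluster
    labels = []   # cluster index assigned to each instance, in arrival order
    for x in stream:
        j = None
        if centers:
            # Find the closest existing cluster center.
            j = min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))
        if j is None or math.dist(x, centers[j]) > threshold:
            # Distance above threshold (or no cluster yet): create a new cluster.
            centers.append(tuple(x))
            counts.append(1)
            labels.append(len(centers) - 1)
        else:
            # Add the instance to the cluster and move the center toward it.
            # Assumption: the center is kept as the running mean of its members.
            counts[j] += 1
            centers[j] = tuple(
                c + (xi - c) / counts[j] for c, xi in zip(centers[j], x)
            )
            labels.append(j)
    return labels, centers

# Hypothetical stream of 2-D instances.
stream = [(0, 0), (1, 0), (10, 10), (0, 1), (11, 10)]
lf_labels, lf_centers = leader_follower(stream, threshold=3.0)
```

The result depends on both the threshold and the order in which instances arrive, which is the price of processing the data in a single pass.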
Kohonen SOMs
The Self-Organizing Map (SOM) is an unsupervised artificial neural network algorithm. It is a compromise between biological modeling and statistical data processing.
Kohonen SOMs
Each weight is representative of a certain input. Input patterns are shown to all neurons simultaneously. Competitive learning: the neuron with the largest response is chosen.
Kohonen SOMs
1. Select the next input pattern.
2. Find the Best Matching Unit.
3. Update the weights of the winner and its neighbours.
4. Decrease the learning rate and the neighbourhood size.
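The training loop above can be sketched for a one-dimensional map. Several details are assumptions, since the slides do not specify them: the Best Matching Unit is taken to be the neuron whose weight vector has the smallest Euclidean distance to the input, the neighbourhood is Gaussian, and both the learning rate and the radius decay linearly. All names and the sample data are illustrative.

```python
import math
import random

def best_matching_unit(x, weights):
    """Index of the neuron whose weight vector is closest to the input x."""
    return min(range(len(weights)), key=lambda u: math.dist(x, weights[u]))

def train_som(data, n_units, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a 1-D chain of n_units neurons on a list of equal-length tuples."""
    rng = random.Random(seed)
    dim = len(data[0])
    if radius0 is None:
        radius0 = n_units / 2
    # Random initial weights in [0, 1); assumes the data is scaled similarly.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        # Select the next input pattern (shuffled order within each epoch).
        for x in rng.sample(data, len(data)):
            # Decrease learning rate and neighbourhood size (linear decay).
            frac = 1 - t / steps
            lr = lr0 * frac
            radius = max(radius0 * frac, 1e-3)
            # Find the Best Matching Unit.
            bmu = best_matching_unit(x, weights)
            # Update the winner and its neighbours on the chain.
            for u in range(n_units):
                h = math.exp(-((u - bmu) ** 2) / (2 * radius ** 2))
                weights[u] = [w + lr * h * (xi - w) for w, xi in zip(weights[u], x)]
            t += 1
    return weights

# Hypothetical 2-D inputs already scaled to [0, 1].
data = [(0.1, 0.1), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
som_weights = train_som(data, n_units=4)
```

A two-dimensional map works the same way; only the distance between neuron positions in the neighbourhood function changes.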
Performance Analysis
K-Means
Leader Follower
Performance Analysis
Stability and convergence assured
Principle of self-ordering
Slow: many iterations needed for convergence
Computationally intensive
Conclusion
Any elevated performance over one class is exactly paid for in performance over another class.
Ensemble clustering?
Use SOM and Basic Leader Follower to identify clusters, and then use K-Means clustering to refine them.
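One way to read this suggestion is as a two-stage pipeline, sketched below under stated assumptions. Only the Leader Follower stage is used to propose initial centers (the SOM stage is omitted for brevity), the center update is a running mean, and the function names and data are illustrative rather than taken from the slides.

```python
import math

def nearest(x, centers):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))

def leader_follower_centers(data, threshold):
    """Stage 1: a single online pass that proposes cluster centers."""
    centers, counts = [], []
    for x in data:
        j = nearest(x, centers) if centers else None
        if j is None or math.dist(x, centers[j]) > threshold:
            centers.append(tuple(x))
            counts.append(1)
        else:
            counts[j] += 1
            # Assumption: running-mean update of the cluster center.
            centers[j] = tuple(
                c + (xi - c) / counts[j] for c, xi in zip(centers[j], x)
            )
    return centers

def refine_with_k_means(data, centers, max_iter=100):
    """Stage 2: K-Means iterations started from the proposed centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(x, centers) for x in data]
        if new_labels == labels:
            break  # membership unchanged, so the centers are stable too
        labels = new_labels
        for j in range(len(centers)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(a) / len(members) for a in zip(*members))
    return labels, centers

# Hypothetical data: two groups, interleaved in arrival order.
data = [(0, 0), (1, 0), (10, 10), (0, 1), (11, 10), (10, 11)]
seeds = leader_follower_centers(data, threshold=3.0)
ens_labels, ens_centers = refine_with_k_means(data, seeds)
```

In this arrangement the first stage chooses the number of clusters through its threshold, so K does not have to be fixed before the K-Means refinement.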
Any questions?