Chapter Four Part 2
CHAPTER FOUR
ADVANCED ANALYTICAL THEORY AND METHODS:
CLUSTERING
Outline
▪ Overview
▪ Clustering Algorithms
• Partitional Algorithms
✓ K-means
✓ Example for K-means Clustering
• Hierarchical
✓ Example for Hierarchical Clustering
• Density-based
✓ DBSCAN
Source: https://www.youtube.com/watch?v=DfJJzu6Vzi4
▪ Place K points into the space represented by the objects being clustered. These points are the initial group centroids.
▪ Assign each object to the group with the closest centroid.
▪ When all objects have been assigned, recalculate the positions of the K centroids.
▪ Repeat the assignment and recalculation steps until the centroids no longer move. This separates the objects into groups from which the metric to be minimized can be calculated.
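The four steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation. It assumes Euclidean distance, picks the initial centroids by sampling K of the objects, and takes the sum of squared distances to the assigned centroid as the metric to be minimized. The function name `kmeans` and its parameters are hypothetical.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster a list of coordinate tuples into k groups (illustrative sketch)."""
    rng = random.Random(seed)

    def closest(p, centroids):
        # Index of the centroid nearest to p (Euclidean distance, an assumption).
        return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))

    # Step 1: place K initial centroids (here: K randomly chosen objects).
    centroids = rng.sample(points, k)

    for _ in range(max_iter):
        # Step 2: assign each object to the group with the closest centroid.
        labels = [closest(p, centroids) for p in points]

        # Step 3: recalculate each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty group keeps its previous centroid.
                new_centroids.append(centroids[j])

        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    # Final assignment and the metric being minimized (sum of squared distances).
    labels = [closest(p, centroids) for p in points]
    sse = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, sse


data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels, sse = kmeans(data, k=2)
print(centroids, labels, sse)
```

The result depends on the initial centroids, so in practice the procedure is often run several times with different starting points and the run with the smallest metric is kept.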
12/26/2022 DATA SCIENCE AND BIG DATA ANALYTICS, CHAPTER FOUR 17
K-means clustering
▪ Graphical representation →
▪ Min = 2
▪ Epsilon = 3
▪ Min = 3
▪ Epsilon = 4
• Resistant to noise
• Can handle clusters of different shapes and sizes
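To make the roles of the two settings concrete, here is a minimal DBSCAN sketch in plain Python. It assumes that "Min" is the minimum number of points required in a neighborhood (called `min_pts` here) and that "Epsilon" is the neighborhood radius (`eps`). It also assumes Euclidean distance and counts a point as its own neighbor. The function name `dbscan` and the sample data are hypothetical.

```python
import math

NOISE = -1  # label given to points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers (illustrative sketch)."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i, including i itself (an assumption).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            # Not a core point; may later be relabeled as a border point.
            labels[i] = NOISE
            continue

        # Point i is a core point: start a new cluster and grow it.
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is also a core point, keep expanding
        cluster += 1
    return labels


data = [(1, 1), (1, 2), (2, 1), (2, 2),
        (8, 8), (8, 9), (9, 8), (9, 9),
        (5, 20)]
print(dbscan(data, eps=1.5, min_pts=3))
```

In this sketch the isolated point `(5, 20)` has too few neighbors within `eps`, so it is labeled as noise rather than being forced into a cluster, which is the behavior behind the "resistant to noise" property.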
DBSCAN - Challenges
▪ Mean-Shift Clustering
✓ Used mainly in image processing and computer vision
▪ etc.