Data Mining
Dr. Ismael A. Ali | ismaelali.net
UoZ
K-means Clustering
● It automatically groups the data into clusters
● We need to choose the number of clusters (k) before running it
● Each cluster has a centroid (center point)
● Each point is assigned to the cluster with the closest centroid
k=3
Goal: Cluster the Iris flowers in the garden into 3
categories/clusters. After clustering we will have
3 clusters/groups of flowers with similar petal and
sepal properties (length/width).
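As a rough sketch of this goal in Python, the block below clusters the Iris measurements into 3 groups. It assumes scikit-learn is installed; the slides do not name a library, so `KMeans`, `load_iris`, and the settings `n_init=10` and `random_state=0` are illustrative choices. Note that scikit-learn's `KMeans` measures closeness with Euclidean distance.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# 150 flowers, 4 measurements each: sepal length/width and petal length/width (cm)
iris = load_iris()
X = iris.data

# k=3 clusters; the fixed random_state only makes the run repeatable
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(X)  # cluster index (0, 1 or 2) for every flower

print(labels[:10])             # cluster of the first 10 flowers
print(model.cluster_centers_)  # one centroid per cluster, 4 values each
```

The cluster numbers are arbitrary labels; they group similar flowers but do not name the species.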
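Distance between two points p1 (x1, y1) and p2 (x2, y2) (Manhattan distance):
Distance(p1, p2) = |x2 - x1| + |y2 - y1|
Example, with p1 = (-0.75, 0.70) and p2 = (-0.6, 1.8):
Distance(p1, p2) = |-0.6 - (-0.75)| + |1.8 - 0.70| = 0.15 + 1.10 = 1.25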
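The distance formula above adds up the absolute differences along each axis, which is known as the Manhattan (city-block) distance. A minimal Python sketch of the same calculation follows; the function name `manhattan_distance` is an illustrative choice, not something from the slides.

```python
def manhattan_distance(p1, p2):
    """Sum of the absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p1, p2))


# The two points from the slide
p1 = (-0.75, 0.70)
p2 = (-0.6, 1.8)

print(round(manhattan_distance(p1, p2), 2))  # 1.25
```

Because the function loops over all coordinates, it also works for points with more than two features, such as the four Iris measurements.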
Example of K-means Clustering
1- select k=3 random centers
2- assign each point to its closest center
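Steps 1 and 2 can be sketched in plain Python as follows. This is only one possible reading of the slide: it assumes the random centers are picked from the data points themselves, and it reuses the sum-of-absolute-differences distance. The function names and the sample points are made up for illustration.

```python
import random


def manhattan_distance(p1, p2):
    """Sum of the absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p1, p2))


def pick_random_centers(points, k, seed=0):
    """Step 1: pick k distinct data points as the starting centers."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return rng.sample(points, k)


def assign_to_closest(points, centers):
    """Step 2: for each point, return the index of its closest center."""
    return [
        min(range(len(centers)), key=lambda i: manhattan_distance(p, centers[i]))
        for p in points
    ]


# Made-up sample data: six 2-D points
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 9.0), (2.0, 8.5)]

centers = pick_random_centers(points, k=3)
labels = assign_to_closest(points, centers)

print(centers)  # the 3 starting centers
print(labels)   # for each point, the index of its closest center
```

Different starting centers can lead to different final clusters, which is why the seed is fixed here.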
3- move each center/centroid to the middle of its group
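A minimal sketch of step 3, and of how the steps fit together, is given below. It assumes "the middle" means the mean of the points in each group, and that the standard algorithm repeats the assign and move steps until the centers stop moving. The starting centers, the sample points, the `max_iter` cap, and the rule of keeping an empty cluster's old center are all illustrative assumptions. Standard k-means uses Euclidean distance; the sum-of-absolute-differences distance is used here instead.

```python
def manhattan_distance(p1, p2):
    """Sum of the absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p1, p2))


def assign_to_closest(points, centers):
    """For each point, return the index of its closest center."""
    return [
        min(range(len(centers)), key=lambda i: manhattan_distance(p, centers[i]))
        for p in points
    ]


def move_centers(points, labels, centers):
    """Step 3: move each center to the mean of the points assigned to it."""
    new_centers = []
    for i, old_center in enumerate(centers):
        group = [p for p, label in zip(points, labels) if label == i]
        if not group:
            # Assumption: a center with no points stays where it is
            new_centers.append(old_center)
            continue
        dims = len(group[0])
        new_centers.append(
            tuple(sum(p[d] for p in group) / len(group) for d in range(dims))
        )
    return new_centers


def k_means(points, centers, max_iter=100):
    """Repeat assign + move until the centers no longer change."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign_to_closest(points, centers)
        new_centers = move_centers(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign_to_closest(points, centers)


# Made-up sample data and hand-picked starting centers (k=3)
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 9.0), (2.0, 8.5)]
start = [(1.0, 1.0), (8.0, 8.0), (1.0, 9.0)]

centers, labels = k_means(points, start)
print(centers)  # [(1.25, 1.5), (8.5, 8.75), (1.5, 8.75)]
print(labels)   # [0, 0, 1, 1, 2, 2]
```

On this small example the centers settle after one move, because the second pass assigns every point to the same group as before.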
Next week ...
● Practice data clustering in Python
● Look at real-world datasets
● Explain the Data Clustering Assignment
○ Report template
○ List of datasets to work on