Introduction to Clustering
Clustering is the task of grouping a set of objects in such a way that objects in
the same group (or cluster) are more similar to each other than to those in
other groups. It is a key technique in exploratory data analysis and data
mining.
Types of Clustering Algorithms
Partitioning Algorithms
These algorithms divide the data into several groups without overlapping.

Hierarchical Algorithms
This method creates a tree of clusters, also known as a dendrogram.

Density-Based Algorithms
They are based on the idea that a cluster is a dense area of data points.
K-means Clustering
1 Initial Cluster Centers
Randomly select k initial cluster centers in the data space.

2 Cluster Assignment
Assign each point to the cluster whose center is nearest, typically by Euclidean distance.

3 Update Centers
Update each cluster center to be the mean of the points assigned to that cluster.

Steps 2 and 3 repeat until the assignments stop changing.
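The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names (`kmeans`, `squared_distance`), the fixed `seed`, the `max_iters` cap, and the choice to draw the initial centers from the data points themselves are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centers
    # (an assumption; other initialization schemes exist).
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels
```

Calling `kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], 2)` returns two centers and one label per point; the three points near the origin share one label and the three points near (10, 10) share the other.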
Hierarchical Clustering
1 Agglomerative Approach
Start with each point as a cluster and merge the closest clusters successively.

2 Divisive Approach
Start with one cluster containing all points and split into smaller clusters.
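The agglomerative approach can be sketched as follows. The example assumes single linkage (the distance between two clusters is the distance between their closest pair of points), which is only one of several possible linkage rules. The function names and the `num_clusters` stopping rule are hypothetical choices for illustration.

```python
import math


def single_linkage(cluster_a, cluster_b):
    """Distance between the closest pair of points across two clusters."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerative(points, num_clusters):
    """Merge the closest clusters until only `num_clusters` remain."""
    clusters = [[p] for p in points]  # start with each point as a cluster
    merges = []  # merge history: (cluster, cluster, distance)
    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, merges
```

The `merges` list records which clusters were joined and at what distance, which is the information a dendrogram displays. This brute-force search is slow for large datasets and is meant only to make the merge loop concrete.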
Density-Based Clustering
Core Points
Data points that have at least a specified minimum number of points within a given radius.

Directly Density Reachable
A data point is directly density reachable from a core point if it lies within the given radius of that core point.

Outliers
Points that are not density reachable from any core point and therefore do not belong to any cluster.
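A small sketch can make these point categories concrete. It follows the usual DBSCAN convention: a point is core if at least `min_pts` points (counting itself) lie within radius `eps`; a non-core point within `eps` of a core point is directly density reachable from it and is commonly called a border point; everything else is an outlier that belongs to no cluster. The function name and the label strings are assumptions for the example, and the code only classifies points rather than forming the clusters themselves.

```python
import math


def classify_points(points, eps, min_pts):
    """Label each point as 'core', 'border', or 'outlier'."""
    # Indices of all points within eps of each point (the point itself included).
    neighbors = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nbrs in enumerate(neighbors) if len(nbrs) >= min_pts}
    labels = []
    for i in range(len(points)):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbors[i]):
            # Not core, but directly density reachable from a core point.
            labels.append("border")
        else:
            labels.append("outlier")
    return labels
```

For the points `[(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (10, 10)]` with `eps=1.0` and `min_pts=3`, the four corners of the unit square are core points, (2, 1) is a border point reachable from (1, 1), and (10, 10) is an outlier.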
Evaluating Clustering Results
1 Cohesion
Measure of how closely related all the objects in the cluster are.

2 Separation
Measure of how distinct or well-separated a cluster is from other clusters.
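One common way to put numbers on these two ideas, shown here as an assumption rather than the only definition, is the within-cluster sum of squared distances for cohesion and the between-cluster sum of squares for separation. The function names below are hypothetical.

```python
def centroid(cluster):
    """Mean point of a cluster given as a list of tuples."""
    n = len(cluster)
    return tuple(sum(c) / n for c in zip(*cluster))


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cohesion(cluster):
    """Sum of squared distances from each point to the cluster centroid.

    Lower values mean a tighter cluster.
    """
    c = centroid(cluster)
    return sum(squared_distance(p, c) for p in cluster)


def separation(clusters):
    """Size-weighted squared distance from each centroid to the overall mean.

    Higher values mean the clusters lie further apart.
    """
    all_points = [p for cluster in clusters for p in cluster]
    overall = centroid(all_points)
    return sum(
        len(cluster) * squared_distance(centroid(cluster), overall)
        for cluster in clusters
    )
```

A good clustering under these measures has low cohesion values for each cluster and a high separation value overall. Other scores, such as the silhouette coefficient, combine both ideas into a single number.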
Applications of Clustering
1 Customer Segmentation
Group customers based on similar behavior and preferences.

2 Anomaly Detection
Identify outliers or abnormal data points in a dataset.

3 Image Segmentation
Partition an image into multiple segments for analysis.
Conclusion and Future Directions
1 Future Potential
Integration with deep learning and artificial intelligence for advanced applications.

2 Enhanced Algorithms
Developing more efficient and scalable clustering algorithms for big data.
