Lecture - 35 Clustering
CLUSTER ANALYSIS
Discovering Groups
Introduction to Cluster Analysis
■ Clustering is the grouping of a set of objects into classes, or clusters,
of similar objects. In cluster analysis, the data is partitioned so that
records within a cluster are highly similar to one another (intra-cluster
similarity) but highly dissimilar to records in other clusters
(inter-cluster dissimilarity).
■ The similarity of records is determined from the values of the attributes
that describe them. Cluster analysis is an important human activity.
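The intra-cluster versus inter-cluster idea above can be checked numerically. The following is a minimal sketch, assuming records are numeric tuples and that dissimilarity is measured by Euclidean distance. The helper names and the two toy clusters are illustrative and do not come from the lecture.

```python
import math
from itertools import combinations, product

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_intra_distance(cluster):
    """Average distance over all pairs of records inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)

def mean_inter_distance(c1, c2):
    """Average distance over all pairs with one record from each cluster."""
    pairs = list(product(c1, c2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)

# Hypothetical toy data: two well-separated groups of 2-D records.
cluster_a = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
cluster_b = [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]

intra_a = mean_intra_distance(cluster_a)
inter_ab = mean_inter_distance(cluster_a, cluster_b)
print(intra_a < inter_ab)  # a good clustering keeps intra small, inter large
```

For a sensible grouping, the mean intra-cluster distance should be much smaller than the mean inter-cluster distance, as it is for these two toy clusters.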
Characteristics of clusters
Applications of Cluster Analysis
■ Marketing
■ Land use
■ Insurance
■ City planning
■ Earthquake studies
■ Biology studies
■ Web discovery
■ Fraud detection
Desired Features of Clustering
■ Scalability
■ Ability to handle different types of attributes
■ Independence from the order of the input data
■ Identification of clusters with different shapes
■ Ability to handle noisy data
■ High performance
■ Interpretability
■ Ability to stop and resume
■ Minimal user guidance
Distance Metrics
■ Euclidean distance
[Credits: https://numerics.mathdotnet.com/Distance.html]
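The Euclidean distance between two points x and y is the straight-line distance, d(x, y) = sqrt(Σ (x_i − y_i)²). A minimal sketch in plain Python follows. The function name `euclidean_distance` and the sample points are illustrative, not taken from the lecture.

```python
import math

def euclidean_distance(x, y):
    """Square root of the sum of squared coordinate differences."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of attributes")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Hypothetical sample points forming a 3-4-5 right triangle.
d_euclid = euclidean_distance((0, 0), (3, 4))
print(d_euclid)  # 5.0
```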
Distance Metrics
■ Manhattan distance
[Credits: https://numerics.mathdotnet.com/Distance.html]
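The Manhattan (city-block) distance sums the absolute differences along each attribute, d(x, y) = Σ |x_i − y_i|. A minimal sketch in plain Python follows. The function name `manhattan_distance` and the sample points are illustrative, not taken from the lecture.

```python
def manhattan_distance(x, y):
    """Sum of absolute coordinate differences (city-block distance)."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of attributes")
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

# Hypothetical sample points: 3 steps along one axis plus 4 along the other.
d_manhattan = manhattan_distance((0, 0), (3, 4))
print(d_manhattan)  # 7
```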
Distance Metrics
■ Chebyshev distance
[Credits: https://numerics.mathdotnet.com/Distance.html]
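The Chebyshev distance is the largest absolute difference along any single attribute, d(x, y) = max_i |x_i − y_i|. A minimal sketch in plain Python follows. The function name `chebyshev_distance` and the sample points are illustrative, not taken from the lecture.

```python
def chebyshev_distance(x, y):
    """Maximum absolute coordinate difference across all attributes."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of attributes")
    return max(abs(xi - yi) for xi, yi in zip(x, y))

# Hypothetical sample points: the differences are 3 and 4, so the maximum is 4.
d_chebyshev = chebyshev_distance((0, 0), (3, 4))
print(d_chebyshev)  # 4
```

For the same pair of points, the Chebyshev distance never exceeds the Euclidean distance, which in turn never exceeds the Manhattan distance (here 4 ≤ 5 ≤ 7).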