
Cluster Analysis

Cluster:
A set of objects that are similar to each other and
separated from the other objects.
Cluster analysis:
Finding similarities between data according to the
characteristics found in the data and grouping
similar data objects into clusters
Example applications:
Employee opinion surveys
Market research
Factor Analysis: to find patterns within
variables
Cluster Analysis: to find patterns between
individuals
Discriminant Analysis: to look for differences
between groups
How to measure similarity?
How to form clusters?
How many clusters?
Key Terms:
Hierarchical clustering is best suited to small
datasets because the procedure computes a
proximity matrix holding the distance/similarity
of every case to every other case, and that matrix
grows with the square of the number of cases.
Either an agglomerative or a divisive method can
be used to cluster the cases.
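The proximity matrix described above can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance as the proximity measure; the function name `proximity_matrix` and the sample cases are hypothetical and not part of the original slides.

```python
import math

def proximity_matrix(cases):
    """Return the symmetric matrix of Euclidean distances between all cases."""
    n = len(cases)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is computed once and mirrored across the diagonal.
            d = math.dist(cases[i], cases[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix

# Hypothetical sample data: three cases measured on two variables.
cases = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = proximity_matrix(cases)
```

For n cases there are n(n-1)/2 distinct pairs, which is why the cost rises quickly as the dataset grows.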
Agglomerative method: It begins with each case
as a cluster by itself and repeatedly merges the
two most similar clusters until all cases belong
to a single cluster (or a chosen number of
clusters remains).
Divisive method: It begins with every case in
one cluster and repeatedly splits clusters until
each case forms its own cluster.
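The agglomerative procedure can be sketched as follows. This is only an illustration under stated assumptions: it uses single linkage (the distance between the closest members of two clusters) and Euclidean distance, neither of which the slides prescribe, and the names `agglomerate` and `target_clusters` are hypothetical.

```python
import math

def agglomerate(cases, target_clusters=1):
    """Merge the two closest clusters until target_clusters remain."""
    # Start with each case as its own cluster (lists of case indices).
    clusters = [[i] for i in range(len(cases))]
    merges = []

    def linkage(a, b):
        # Single linkage (an assumption): closest pair of members.
        return min(math.dist(cases[i], cases[j]) for i in a for j in b)

    while len(clusters) > target_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        merges.append((clusters[p], clusters[q], d))
        merged = clusters[p] + clusters[q]
        # Drop the two merged clusters and add their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (p, q)]
        clusters.append(merged)
    return clusters, merges

# Hypothetical sample data: two tight pairs and one distant case.
cases = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (8.0, 9.0), (20.0, 20.0)]
clusters, merges = agglomerate(cases, target_clusters=2)
```

The recorded `merges` list is the information a dendrogram would display. A divisive method would run in the opposite direction, starting from one cluster and splitting.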
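Ward's method: Compute the sum of squared
distances within clusters and, at each step,
merge the pair of clusters that produces the
smallest increase in that sum.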
Centroid method: The distance between two
clusters is defined as the distance between their
centroids (cluster averages).
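The two criteria can be compared on a small example. This is a sketch, assuming Euclidean distance and clusters stored as lists of coordinate tuples; the helper names (`centroid`, `within_ss`, `centroid_distance`, `ward_increase`) are hypothetical.

```python
import math

def centroid(cluster):
    """Cluster average: the mean of each variable across the cases."""
    n = len(cluster)
    return tuple(sum(values) / n for values in zip(*cluster))

def within_ss(cluster):
    """Sum of squared distances from each case to the cluster centroid."""
    c = centroid(cluster)
    return sum(math.dist(point, c) ** 2 for point in cluster)

def centroid_distance(a, b):
    """Centroid method: distance between the two cluster centroids."""
    return math.dist(centroid(a), centroid(b))

def ward_increase(a, b):
    """Ward's criterion: growth in within-cluster sum of squares if a and b merge."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)

# Hypothetical sample clusters, two cases each.
a = [(0.0, 0.0), (2.0, 0.0)]
b = [(10.0, 0.0), (12.0, 0.0)]
cd = centroid_distance(a, b)
wi = ward_increase(a, b)
```

At each agglomeration step, either measure would be evaluated for every pair of clusters and the pair with the smallest value merged.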
Euclidean Distance:
\[
d_{ij} = \left( \sum_{k} (x_{ik} - x_{jk})^{2} \right)^{1/2}
\]
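The Euclidean distance between two cases i and j, the square root of the summed squared differences over the variables k, can be written directly in code. This is a minimal sketch; the function name `euclidean_distance` and the sample values are hypothetical.

```python
def euclidean_distance(x_i, x_j):
    """Square root of the sum over variables k of (x_ik - x_jk)^2."""
    if len(x_i) != len(x_j):
        raise ValueError("both cases must have the same number of variables")
    return sum((a - b) ** 2 for a, b in zip(x_i, x_j)) ** 0.5

# Hypothetical sample: two cases measured on three variables.
d = euclidean_distance((1.0, 2.0, 3.0), (4.0, 6.0, 3.0))
```

Because the squared differences are summed across variables, variables on larger scales dominate the result, so standardizing them before clustering is often worth considering.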
