You are on page 1of 7

Cluster Analysis

Cluster analysis is a class of techniques used to


classify objects or cases into relatively
homogeneous groups called clusters. Objects in
each cluster tend to be similar to each other and
dissimilar to objects in the other clusters. Cluster
analysis is also called classification analysis, or
numerical taxonomy.
Both cluster analysis and discriminant analysis
are concerned with classification. However,
discriminant analysis requires prior knowledge of
the cluster or group membership for each object
or case included, to develop the classification
rule. In contrast, in cluster analysis there is no a
priori information about the group or cluster
membership for any of the objects. Groups or

Variable 1

AnFig.Ideal
Clustering
Situation
20.1

Variable 2

Variable 1

A Practical Clustering
Fig. 20.2
Situation

Variable 2

A Classification of Clustering
Clustering
Procedures
Procedures
Fig. 20.4
Nonhierarchical

Hierarchical

Agglomerative

Divisive

Sequential
Threshold
Linkage
Methods

Variance
Methods

Parallel
Threshold
Centroid
Methods

Wards Method

Single

Complete

Average

Optimizing
Partitioning

Linkage Methods of
Fig. 20.5
Single Linkage
Clustering
Minimum
Distance
Cluster 1

Complete Linkage

Cluster 2

Maximum
Distance

Cluster 1

Cluster 2

Average Linkage

Cluster 1

Average
Distance

Cluster 2

Clustering Variables
In this instance, the units used for analysis are
the variables, and the distance measures are
computed for all pairs of variables.
Hierarchical clustering of variables can aid in the
identification of unique variables, or variables
that make a unique contribution to the data.
Clustering can also be used to reduce the number
of variables. Associated with each cluster is a
linear combination of the variables in the cluster,
called the cluster component. A large set of
variables can often be replaced by the set of
cluster components with little loss of information.
However, a given number of cluster components
does not generally explain as much variance as
the same number of principal components.

You might also like