
EDUC 343: INTRODUCTION TO APPLIED MULTIVARIATE DATA ANALYSIS

Cluster Analysis

Cluster analysis is a multivariate analysis technique that seeks to organize information about variables so that
relatively homogeneous groups, or "clusters," can be formed. The clusters formed with this family of methods
should be highly internally homogeneous (members are similar to one another) and highly externally heterogeneous
(members are not like members of other clusters).

Although cluster analysis is relatively simple and can use a variety of input data, it is a relatively new technique
and is not supported by a comprehensive body of statistical literature. As a result, most of the guidelines for using
cluster analysis are rules of thumb, and some authors caution researchers in their use of cluster analysis.

What do you need in order to do a cluster analysis?

Like MDS, cluster analysis can accept a wide variety of input data. While these are generally called "similarity"
measures, they can also be termed "proximity," "resemblance," or "association." Some authors recommend using
standardized data, since you may be clustering items measured on different scales, and standardizing will give you
a "unit free" measure.
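
As a minimal sketch of the standardizing idea, the Python snippet below converts each variable to z-scores (mean 0, standard deviation 1) so that no variable dominates merely because of its scale. The function name `standardize` and the small height and income data set are hypothetical, chosen only for illustration.

```python
import numpy as np

def standardize(X):
    """Convert each column (variable) to z-scores: mean 0, standard deviation 1."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation of each column
    return (X - mean) / std

# Hypothetical data: height in cm and income in dollars for four people.
raw = np.array([
    [160.0, 30000.0],
    [170.0, 45000.0],
    [180.0, 60000.0],
    [175.0, 52000.0],
])

z = standardize(raw)
print(z)
```

After this step both columns are expressed in standard deviation units, so a distance computed between two people is no longer driven by the income column alone.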

Steps in conducting a cluster analysis

There are four basic cluster analysis steps:

• data collection and selection of the variables for analysis
• generation of a similarity matrix
• decision about number of clusters and interpretation
• validation of cluster solution
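
The four steps can be sketched in Python with SciPy. This is an illustration under stated assumptions, not a prescribed procedure: the six data points are made up, Euclidean distance and average linkage are arbitrary choices, and the cophenetic correlation is just one possible validity check (the handout does not name a specific validation method).

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist, squareform

# Step 1: hypothetical data, six cases measured on two comparable variables.
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2],
])

# Step 2: proximity matrix. Euclidean distance is a dissimilarity,
# so smaller values mean more similar cases.
condensed = pdist(X, metric="euclidean")  # pairwise distances, condensed form
dist_matrix = squareform(condensed)       # full 6 x 6 symmetric matrix

# Step 3: merge cases agglomeratively, then cut the tree into two clusters.
Z = linkage(condensed, method="average")
labels = fcluster(Z, t=2, criterion="maxclust")

# Step 4: one possible check (an assumption, not from the handout):
# the cophenetic correlation compares the distances implied by the tree
# with the original distances; values near 1 indicate a faithful tree.
coph_corr, _ = cophenet(Z, condensed)

print(labels, round(float(coph_corr), 3))
```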

Output of a cluster analysis

The main outcome of a cluster analysis is a dendrogram, which is also called a tree diagram or a tree plot.
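
A dendrogram can be produced with SciPy as in the sketch below. The case names and data are hypothetical, and average linkage is an assumed choice. Passing `no_plot=True` returns the tree layout as a dictionary without drawing anything; omitting it draws the figure, which requires matplotlib.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical data: six named cases on two variables.
names = ["A", "B", "C", "D", "E", "F"]
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2],
])

# linkage accepts raw observations and uses Euclidean distance by default.
Z = linkage(X, method="average")

# Compute the tree layout only; tree["ivl"] holds the leaf labels in the
# left-to-right order in which they would appear on the plot.
tree = dendrogram(Z, labels=names, no_plot=True)
print(tree["ivl"])
```

Cases that sit next to each other in the leaf order and join at a low height are the ones the analysis treats as most similar.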

Like the other techniques, cluster analysis presents the problem of how many factors, dimensions, or clusters to
keep. One rule of thumb is to cut the dendrogram at a place where the cluster structure remains stable over a long
distance, that is, where the number of clusters does not change across a wide range of merge distances. Other
possibilities are to look for cluster groupings that agree with existing or expected structures, or to replicate the
analysis on subsets of the data to see whether the same structures emerge consistently.
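
One way to make the "stable for a long distance" rule of thumb concrete is to look for the largest jump between successive merge heights and cut the tree there. The sketch below is one possible reading of that rule, not a standard named procedure; the helper `clusters_by_largest_gap` and the data are hypothetical, and it assumes a linkage method whose merge heights never decrease (such as average linkage) and at least three cases.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def clusters_by_largest_gap(Z):
    """Suggest a cluster count by cutting inside the largest gap in merge heights."""
    heights = Z[:, 2]           # distance at which each merge happens
    gaps = np.diff(heights)     # jump from one merge to the next
    i = int(np.argmax(gaps))    # largest jump lies between merge i and merge i + 1
    n_cases = Z.shape[0] + 1    # n cases produce n - 1 merges
    return n_cases - (i + 1)    # clusters remaining after the first i + 1 merges

# Hypothetical data: two tight groups of three cases each.
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2],
])

Z = linkage(X, method="average")
k = clusters_by_largest_gap(Z)
labels = fcluster(Z, t=k, criterion="maxclust")
print(k, labels)
```

A suggestion from a rule like this is best checked against the other criteria mentioned above, such as agreement with expected structures or replication on subsets of the data.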
