
# Clustering for Segmentation

## Automatic Cluster Detection

The K-means clustering algorithm depends on a geometric interpretation of the data
Other automatic cluster detection (ACD) algorithms include:
Gaussian mixture models
Agglomerative clustering
Divisive clustering
Self-organizing maps (SOM) - Neural Nets
ACD is a tool
No preclassified training data set
No distinction between independent and dependent variables
Marketing clusters referred to as segments
Customer segmentation is a popular application of clustering
ACD is rarely used in isolation; other methods follow up
Segmentation
Organizing Customers into groups with similar traits, product
preferences or expectations
Demographic Characteristics
Psychographics (interests, attitudes, opinions, personality, values, lifestyles)
Desired benefits from products/services
Past-purchase or product use behaviors
K-means Clustering

"K" (circa 1967): this algorithm looks for a fixed number (K) of clusters, which are defined in terms of the proximity of data points to each other
How K-means works (see next slide figures):
Algorithm selects K data points randomly
Assigns each of the remaining data points to one of K clusters
Calculate the mean of the cases in each cluster and move the K data points (cluster seeds) to the mean of their cluster
Reassign each case to the cluster of its closest new seed: cases closest to seed i belong to cluster i
Euclidean distance: the distance between two points $(u_1, v_1)$ and $(u_2, v_2)$ is $\sqrt{(u_1 - u_2)^2 + (v_1 - v_2)^2}$
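The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular tool: the function names, the `initial_seeds` and `rng_seed` parameters, the stopping rule (stop when no assignment changes), and the sample points are all assumptions made for the example.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, k, initial_seeds=None, max_iter=100, rng_seed=0):
    """Cluster `points` into k groups; returns (seeds, labels)."""
    # Step 1: pick K data points as the starting seeds (random unless supplied).
    if initial_seeds is None:
        seeds = random.Random(rng_seed).sample(points, k)
    else:
        seeds = list(initial_seeds)
    labels = None
    for _ in range(max_iter):
        # Steps 2 and 4: assign every point to the cluster of its closest seed.
        new_labels = [
            min(range(k), key=lambda i: euclidean(p, seeds[i])) for p in points
        ]
        if new_labels == labels:  # no point changed cluster, so stop
            break
        labels = new_labels
        # Step 3: move each seed to the mean of the points assigned to it.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous seed
                seeds[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return seeds, labels


# Hypothetical data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
seeds, labels = k_means(points, k=2, initial_seeds=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The starting seeds are passed explicitly here so the result is reproducible. With random seeds, K-means can settle on a different, sometimes worse, grouping, which is one reason it is usually run several times.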
K-means Clustering
Data Preparation for K-Means

Abbott Analytics, 2001-2017

Figure: Sum of Squared Errors vs. Number of Clusters
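A curve of this kind can be produced by running K-means for several values of K and recording the sum of squared errors (SSE), that is, the sum of squared distances from each point to its cluster mean. The sketch below is only an illustration under stated assumptions: the compact `k_means` helper, its choice of the first K points as starting seeds (made so the run is deterministic), and the six sample points are hypothetical and not taken from the slides.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100):
    """Compact K-means; assumption: the first k points serve as starting seeds."""
    seeds = list(points[:k])
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: sq_dist(p, seeds[i])) for p in points
        ]
        if new_labels == labels:  # assignments are stable
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous seed
                seeds[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return seeds, labels


def sum_squared_errors(points, seeds, labels):
    """Sum of squared distances from each point to its assigned cluster mean."""
    return sum(sq_dist(p, seeds[lab]) for p, lab in zip(points, labels))


# Hypothetical data: two tight groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]

sse_by_k = {}
for k in range(1, len(points) + 1):
    seeds, labels = k_means(points, k)
    sse_by_k[k] = sum_squared_errors(points, seeds, labels)

for k, sse in sse_by_k.items():
    print(k, round(sse, 2))
```

On this toy data the SSE falls sharply from K=1 to K=2 and reaches zero when every point is its own cluster. The K at which the curve stops dropping steeply is a common, though informal, guide to the number of clusters.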

K-means Clustering

Resulting clusters describe underlying structure in the data; however, there is no one right description of that structure
Similarity & Difference
Automatic Cluster Detection is quite simple for a software program to accomplish: data points are mapped in space and grouped into clusters
Business data, however, consists of purchases, phone calls, airplane trips, car registrations, etc., which have no obvious connection to the dots in a cluster diagram
Similarity & Difference
Clustering business data requires some notion of natural association: records (data) in a given cluster are more similar to each other than to those in another cluster
For DM software, this concept of association must be translated into some sort of numeric measure of the degree of similarity
The most common approach is to translate data values (e.g., gender, age, product, etc.) into numeric values so that records can be treated as points in space
If two points are close in the geometric sense, then they represent similar data in the database
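One way to picture this translation is to turn each record into a list of numbers and then measure the geometric distance between records. The sketch below assumes a hypothetical record layout with just an age and a product flavor, rescales age to the 0-1 range using an assumed age range, and uses one-hot encoding for the product. None of these choices comes from the slides; they are one common option among several.

```python
import math

PRODUCTS = ["mint", "cherry", "chocolate"]  # categorical values used for illustration
AGE_MIN, AGE_MAX = 20.0, 70.0               # hypothetical age range for scaling


def to_point(record):
    """Translate one record into a numeric point: scaled age plus one-hot product."""
    age_scaled = (record["age"] - AGE_MIN) / (AGE_MAX - AGE_MIN)
    one_hot = [1.0 if record["product"] == p else 0.0 for p in PRODUCTS]
    return [age_scaled] + one_hot


def distance(p, q):
    """Euclidean distance between two numeric points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical customer records.
a = to_point({"age": 30, "product": "mint"})
b = to_point({"age": 32, "product": "mint"})
c = to_point({"age": 60, "product": "cherry"})

print(distance(a, b) < distance(a, c))  # True
```

Records `a` and `b` share a product and are close in age, so they land near each other, while `c` differs on both and lands farther away.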
Similarity & Difference

Categorical (e.g., mint, cherry, chocolate)
Ranks (e.g., freshman, sophomore, etc., or valedictorian, salutatorian)
Intervals (e.g., 56 degrees, 72 degrees, etc.)
True measures: interval variables that measure from a meaningful zero point
Age, weight, height, length, and tenure are good examples
Pattern Discovery
the discovery of interesting, unexpected, or valuable structures in
large data sets.
- David Hand, Professor of Statistics, Imperial College

If you've got terabytes of data, and you're relying on data mining to find interesting things in there for you, you've lost before you've even begun. You really need people who understand what it is they are looking for and what they can do with it once they find it.
- Herb Edelstein, President of Two Crows Corporation
Inputs (Desirable Characteristics)
Meaningful to the analysis objective
Relatively independent
Limited in number
Have an interval measurement level
Have low kurtosis (a measure of how the distribution's density is spread across its peak, flanks, and tails) and low skewness (a measure of the distribution's asymmetry) statistics
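Skewness and kurtosis can be checked before clustering with a few lines of code. The sketch below uses the simple moment-based definitions (other definitions with small-sample corrections exist) and reports kurtosis as excess kurtosis, so a normal distribution scores about 0. The sample values and the log transform are assumptions chosen for illustration; a log transform is one common way to reduce right skew, not a step prescribed by the slides.

```python
import math


def central_moment(values, order):
    """Average of (x - mean) ** order over the sample."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: 0 for a symmetric sample, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]  # hypothetical right-skewed input
logged = [math.log(x) for x in raw]           # log transform, one common remedy

print(skewness(raw) > 0)             # True: long right tail
print(abs(skewness(logged)) < 1e-9)  # True: evenly spaced after the log
```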
Grocery Store Case Study
Analysis goal:
Where should you open new grocery store locations?
Group geographic regions into segments based on
income, household size, and population density.
Analysis plan:
Select and transform segmentation inputs.
Select the number of segments to create.
Create segments with the Cluster tool.
Interpret the segments.
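The "transform segmentation inputs" step matters because Euclidean distance treats every coordinate equally: income measured in dollars would swamp household size measured in persons. A minimal sketch of one common transformation, standardizing each input to z-scores, is shown below. The region values are hypothetical and are not taken from the case study, and the Cluster tool may apply its own standardization options.

```python
import statistics


def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1 (z-scores)."""
    mean = statistics.mean(column)
    sd = statistics.pstdev(column)
    return [(x - mean) / sd for x in column]


# Hypothetical values for four regions; not taken from the case study.
income = [42000.0, 55000.0, 61000.0, 98000.0]        # dollars
household_size = [2.1, 2.8, 3.4, 2.5]                # persons
population_density = [150.0, 900.0, 2400.0, 5200.0]  # people per square mile

columns = [standardize(c) for c in (income, household_size, population_density)]
regions = list(zip(*columns))  # one 3-coordinate point per region

for region in regions:
    print([round(v, 2) for v in region])
```

After this step each region is a point whose three coordinates are on comparable scales, ready to be passed to a clustering routine.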
