2022 BM MRS Cluster Analysis

Cluster analysis
WHAT IS SEGMENTATION?
• Segmenting, at its most basic, is the separation of a group of customers with different needs into subgroups of customers with similar needs and preferences.
• By doing this, a company can better tailor and target its products and services to meet each segment’s needs.
WHY DO WE NEED SEGMENTATION?
• Segmentation is a critical enabler to achieve business objectives and realize benefits
• Segmentation is critical to identify white spaces for new products/offerings
• Segmentation helps organizations to optimize their retention and acquisition strategy
• Segmentation is often used to optimize pricing across different products
• Segmentation enables organizations to become more customer-centric
• Market Dynamics make segmentation critical to business success
WHAT ARE THE DIFFERENT WAYS OF SEGMENTATION?
• Purchase behaviour segmentation: What are they doing?
– Product usage and loyalty
– Brand awareness
– Price paid, share of wallet (SoW), frequency
• Channel segmentation: How are they doing it?
– Purchase and shopping behaviours
– Key influencers
• Demographics segmentation: Who are they?
– Lifestyle and life stage
– Geography
– Industry type (B2B)
• Occasion segmentation: When and where are they doing it?
– Purchase and usage occasions
• Needs segmentation: Why are they doing it?
– Category needs, desires and beliefs
(The original figure marks a scale running from "tactic" to "strategic" alongside these types.)
Picture courtesy Prof. Theodoros Evgeniou

WHAT ARE THE DIFFERENT KINDS OF DATA USED FOR SEGMENTATION?
Primary data (Qual and Quant)
• Behavioral
• Satisfaction
• Preferred comm channels
• Preferred engagement level
• Attitudes
• Some demographics
Customer data
• Product/Service usage
• Subscription
• Features usage
• Social network integration
• Acquisition channel
Third party data
• Credit score
• Demographics
• Behavioral
WHAT MAKES FOR A GOOD SEGMENTATION?
A segmentation exercise is considered successful if the segments formed are:
• Identifiable
• Substantial
• Accessible
• Stable
• Differentiable
• Actionable
CLUSTER ANALYSIS – THE BASIC CONCEPT
• Cluster analysis is an interdependence technique used to classify objects into relatively homogeneous groups called clusters.
• Cluster analysis is a classification technique that falls under the umbrella of unsupervised learning methods, and so differs from classification methods such as Logistic Regression, Discriminant Analysis, CART and CHAID, which are termed supervised learning methods.
• The difference between cluster analysis and the methods mentioned above is that the clusters are discovered from the data and are not known a priori.
Conducting cluster analysis – the steps involved
STEP-1: FORMULATE THE PROBLEM
• Formulate the problem (select the variables that form the basis of clustering)
– Select variables that describe the similarity between objects in terms that are relevant to the marketing problem
– Select variables based on past research, theory or consideration of hypotheses to be tested
– Consult experts in the category
– Distinguish clustering variables (used to form the clusters) from profiling variables (used afterwards to describe them)
STEP- 2: SELECT A DISTANCE OR SIMILARITY MEASURE
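• Measure similarity in terms of distance between objects: the smaller the distance, the greater the similarity (Similarity ∝ 1/distance)
• Measures of similarity between objects a and b, where $V_{ai}$ is the value of object a on variable i:
– Euclidean distance: $\left\{\sum_i (V_{ai} - V_{bi})^2\right\}^{1/2}$
– City block or Manhattan distance: $\sum_i |V_{ai} - V_{bi}|$
– Chebychev distance: $\max_i |V_{ai} - V_{bi}|$
• Euclidean distance is the most commonly used distance metric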
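The three distance measures above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the two sample respondents are hypothetical, not taken from the slides.

```python
import math

def euclidean(a, b):
    """Square root of the summed squared differences across variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def city_block(a, b):
    """Sum of absolute differences across variables (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    """Largest absolute difference on any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical respondents a and b, each measured on three variables.
a = [1.0, 5.0, 3.0]
b = [4.0, 1.0, 3.0]

print(euclidean(a, b))   # 5.0
print(city_block(a, b))  # 7.0
print(chebychev(a, b))   # 4.0
```

All three measures depend on the scale of each variable, so variables measured in very different units are usually standardized before distances are computed.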
STEP- 3: SELECT A CLUSTERING PROCEDURE
• Hierarchical (a procedure characterized by a tree-like structure)
• Non-hierarchical (K-means clustering: a procedure that assigns a cluster center and groups all objects within a specified threshold)
HIERARCHICAL CLUSTERING PROCEDURES
[Figure: hierarchical clustering illustration with groups labelled Cluster1 and Cluster2]
NON HIERARCHICAL CLUSTERING PROCEDURES
• Choose the number of clusters, k
• Generate k random points as cluster centroids
• Assign each point to the nearest centroid
• Recompute each cluster centroid from the points assigned to it
• Repeat until the convergence criterion is met (assignment to clusters does not change over multiple iterations)
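The loop above can be sketched in plain Python as follows. This is an illustrative sketch, not a production implementation: the function name `k_means`, the sample points and the seed are hypothetical. It also makes two assumptions that the steps do not specify: the k starting centroids are drawn from the observed points, and the loop stops the first time the assignments are unchanged.

```python
import random

def k_means(points, k, max_iter=100, seed=0):
    """Basic k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Assumption: use k distinct observed points as the random starting centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assign each point to the nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments unchanged: treat as converged
            break
        labels = new_labels
        # Recompute each centroid as the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical data: two visibly separated groups of customers on two variables.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

Because the starting centroids are random, different seeds can produce different solutions on less clearly separated data, which is one reason the procedure is often rerun from several starting points.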
STEP- 3: SELECT A CLUSTERING PROCEDURE
• Question: What are the advantages and disadvantages of non-hierarchical clustering?
• Answer:
– Advantages: faster, and has merit when the number of objects is large
– Disadvantages: the number of clusters must be pre-specified; the selection of cluster centers is arbitrary; the clustering solution depends on the order of objects
STEP- 4: DECIDE ON THE NUMBER OF CLUSTERS
• Theoretical, conceptual or practical considerations might suggest a number
• In hierarchical clustering, the distances at which clusters are combined can be used as a criterion; this information is available from the agglomeration schedule or dendrogram
• In non-hierarchical clustering, the ratio of within-group variance to between-group variance can be plotted against the number of clusters; the point at which an elbow occurs indicates the number of clusters
• The relative sizes of the clusters should be meaningful
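The within-to-between ratio used for the elbow plot can be sketched as below. This is a rough sketch under stated assumptions: it uses sums of squares rather than variances (the degrees-of-freedom scaling is omitted), it needs at least two clusters because the between-group term is zero for a single cluster, and the function name and data are hypothetical.

```python
def within_between_ratio(points, labels):
    """Ratio of within-group to between-group sum of squares (needs >= 2 clusters)."""
    dims = len(points[0])
    n = len(points)
    grand_mean = [sum(p[d] for p in points) / n for d in range(dims)]
    within = 0.0
    between = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        mean = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        # Spread of the members around their own cluster centroid.
        within += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dims))
        # Spread of the cluster centroid around the overall mean, weighted by size.
        between += len(members) * sum((mean[d] - grand_mean[d]) ** 2 for d in range(dims))
    return within / between

# Hypothetical one-variable data with two obvious groups.
points = [(1.0,), (2.0,), (9.0,), (10.0,)]
ratio_k2 = within_between_ratio(points, [0, 0, 1, 1])
ratio_k3 = within_between_ratio(points, [0, 0, 1, 2])
print(ratio_k2)  # 0.015625
print(ratio_k3)  # smaller still, but the improvement over k=2 is slight
```

In practice the ratio would be computed for the solution at each candidate k and plotted; the elbow is the k after which further clusters reduce the ratio only marginally.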
EXHIBIT 1: AGGLOMERATION SCHEDULE

       Cluster combined                  Stage cluster first appears
Stage  Cluster1  Cluster2  Coefficients  Cluster1  Cluster2  Next stage
1      2         7         1.922         0         0         2
2      2         3         6.452         1         0         10
3      4         11        10.580        0         0         5
4      1         12        13.700        0         0         6
5      4         9         62.775        3         0         7
6      1         10        101.530       4         0         9
7      4         8         316.408       5         0         11
8      5         6         489.957       0         0         9
9      1         5         1530.504      6         8         10
10     1         2         2271.371      9         2         11
11     1         4         11671.97      10        7         0

The Coefficients column indicates the distance between the two clusters (or cases) joined at each stage.
For a good cluster solution, you will see a sudden jump in the distance coefficient (or a sudden drop in the similarity coefficient) as you read down the table.
In this schedule, 4 clusters remain after stage 8.
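One way to read the schedule is to tabulate, for each stage, how many clusters remain and how much the coefficient grows at the next merge. The sketch below does this for the Coefficients column of Exhibit 1. The helper name `schedule_summary` is hypothetical, and using the growth ratio to express a "jump" is an assumption made for illustration; the slides do not prescribe a formula.

```python
# Coefficients column of Exhibit 1, one value per stage.
coefficients = [1.922, 6.452, 10.580, 13.700, 62.775, 101.530,
                316.408, 489.957, 1530.504, 2271.371, 11671.97]
n_cases = len(coefficients) + 1  # 11 merges join 12 cases into one cluster

def schedule_summary(coefs, n):
    """Per stage: clusters remaining and growth factor of the next coefficient."""
    rows = []
    for stage, coef in enumerate(coefs, start=1):
        remaining = n - stage  # each merge reduces the cluster count by one
        nxt = coefs[stage] if stage < len(coefs) else None
        ratio = nxt / coef if nxt is not None else None
        rows.append((stage, remaining, coef, ratio))
    return rows

for stage, remaining, coef, ratio in schedule_summary(coefficients, n_cases):
    growth = f"{ratio:.2f}x" if ratio is not None else "-"
    print(f"stage {stage:2d}: {remaining:2d} clusters remain, "
          f"coefficient {coef:10.3f}, next merge {growth}")
```

The row for stage 8 shows 4 clusters remaining, with the next merge roughly tripling the coefficient. Several other stages also show large relative increases, so the cut-off remains a judgment that should be weighed against the other considerations listed for this step.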
EXHIBIT 2 - DENDROGRAM
[Figure: dendrogram of the cluster solution, shown beside an excerpt of the agglomeration schedule]
• At each stage, one case or cluster is joined with another case or cluster.
• When clusters or cases are joined, they are subsequently labeled with the smaller of the two cluster numbers.
STEP- 5: INTERPRET AND PROFILE THE CLUSTERS
• Examine the cluster centroids
• Profile clusters based on variables that were not used for clustering
• Identify variables that significantly differentiate between clusters using Discriminant analysis or ANOVA
• Significant differences across not only the clustering variables but also the descriptive variables indicate the presence of natural clusters
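As an illustration of the ANOVA check, the sketch below computes the one-way ANOVA F statistic for a single variable across clusters. The function name and the age values are hypothetical, and the sketch stops at the F statistic: judging significance would additionally require comparing it with the F distribution on (k - 1, n - k) degrees of freedom.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the cluster means around the overall mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own cluster mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical profiling variable (age) split by cluster membership.
age_by_cluster = [[23, 25, 27], [41, 43, 45], [60, 62, 64]]
f_stat = one_way_anova_f(age_by_cluster)
print(f_stat)  # about 256.75
```

A large F value for a variable that was not used in the clustering suggests that the clusters differ on it, which is the kind of evidence the last bullet associates with natural clusters.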
CLUSTER ANALYSIS – STEPS INVOLVED
• Formulate the problem
• Select a similarity measure
• Select a clustering procedure
• Decide on the number of clusters
• Interpret and profile the clusters
• Assess reliability and validity
Thank you