
Marketing Toolkit

01 Cluster Analysis

Cluster analysis is a handy method in the data science toolkit. "Cluster analysis is a group of multivariate techniques whose primary purpose is to group objects (entities) based on their characteristics" (S. Pour Mohammad, 2022, slide 6). It is a form of exploratory data analysis (EDA) where observations are divided into meaningful groups that share common characteristics (features) (Gorenshteyn et al., n.d., Chapter 1). It identifies clusters with similar or identical properties. These commonalities can guide various business decisions; for example, organisations can use them to moderate and target unique customer segments through segmentation analyses of consumers.

02 Aims:

Cluster analysis aims to maximise homogeneity within clusters (reduce dissimilarity) whilst maximising heterogeneity between any 2 clusters. Dissimilarity is given by the distance between clusters. Here, Distance = 1 - Similarity.

The most commonly used distance measure is the Euclidean distance, the length of the hypotenuse of the right triangle formed between two points in a plane. It is calculated using dist() in R.

Distance is directly proportional to dissimilarity: a larger distance between two observations indicates a greater dissimilarity between them, and vice versa.

03 Steps to conduct Cluster Analysis:

Before performing cluster analysis, we must ensure that the data is standardised, especially if the variables are on different measurement scales. This can be done with the scale() function in R.

Step 1: Select variables that are relevant to the market research problem, on the basis of which clusters are determined.
Step 2: Identify a suitable similarity measure to assess how different the objects are.
Step 3: Select the appropriate clustering method: Hierarchical (Agglomerative/Divisive), Non-Hierarchical or Combination.
Step 4: Determine an adequate number of clusters and visualise using plots or dendrograms such that clusters do not overlap.
Step 5: Examine cluster centroids and interpret cluster profiles.
Step 6: Test validity by performing cluster analysis with different distance measures.

Formation of clusters: Starting with each observation as its own "cluster", the two most similar (closest) observations are grouped together.

04 Types of Clustering Methods

They can be broadly classified into 3 types:

Hierarchical
Non-Hierarchical (e.g. k-means)
Combination

Hierarchical methods can be further classified as:

Agglomerative
Divisive

The most popular Agglomerative approaches are:

Single-linkage
Complete-linkage
Average-linkage
Centroid Method

05 Execution in R - Cluster Analysis

# Import data
# Explore data using functions like str(), names(), summary(), plot() etc.
# Distance-based cluster analysis requires quantitative data, which must be normalised first.
# Standardise dataset using scale().
# Calculate distance (dissimilarity) using dist() function to get the proximity matrix
# Use clustering algorithm hclust() to form clusters using different methods
# Plot clusters using plot() to visualise them in a dendrogram.
# Use cutree() function to split into an adequate number of clusters.
# Number of clusters can be verified using a scree plot.
# Non-hierarchical clustering is conducted using kmeans() function on a normalised dataset.

06 Choice Models

A Choice Model is a mathematical/statistical model that predicts how a firm's marketing interventions, customer traits, and/or environmental circumstances influence the likelihood of an observed consumer choice or response. Choice models can be used to make crucial marketing decisions by predicting what consumers are likely to choose based on the features of the options available. They also help to determine the most important factors influencing a customer's choice likelihood and to segment and target customers based on those similarities. They can simulate the potential market share for various products based on customer choice.

07 How it Works:

The model uses a dependent and an independent variable to estimate:

1. Each attribute's weight (coefficient) needed to cause customers to pick a certain product. This reflects the most critical influences on customer choice likelihood.
2. Predictions: Using only product attributes and weights in the model, we predict the most probable choices by a new set of customers. Then the firm can segment and target customers according to their choice likelihood.
3. Simulated market share of a product category. The model adds up product choices by all customers faced with all products. The information can help managers plan their marketing efforts.

08 Choice Modelling in R

Choice modelling in R is done using a type of regression analysis called Multinomial Regression analysis, which is used when the response (dependent) variable has more than two nominal categories. Binomial regression is used if there are precisely two categories.

# Load the data
# Explore data using names(), summary() or head() functions
# Optional: Visualise data using graphs and plots
# Call the glm() function and pass your R formula as the first argument
# Pass argument family="binomial"
# Summarise the fitted model with summary() function.
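The binomial choice-modelling steps listed under "Choice Modeling in R" can be sketched as follows. This is only an illustration under stated assumptions: the data frame `choices`, its columns `price`, `quality` and `bought`, and the weights used to simulate it are all hypothetical and do not come from the source. The sketch uses glm(), which accepts a formula. For a response with more than two categories, a function such as multinom() from the nnet package is one possible option.

```r
# Hypothetical simulated data: every name and value here is an illustrative assumption.
set.seed(42)
n <- 200
choices <- data.frame(
  price   = runif(n, min = 1, max = 10),  # product attribute 1
  quality = runif(n, min = 1, max = 5)    # product attribute 2
)
# Assumed weights, used only to simulate a 0/1 purchase decision
utility <- 1 - 0.5 * choices$price + 0.8 * choices$quality
choices$bought <- rbinom(n, size = 1, prob = plogis(utility))

# Explore the data
names(choices)
summary(choices)
head(choices)

# Fit a binomial (logistic) regression: the formula is the first argument
fit <- glm(bought ~ price + quality, data = choices, family = "binomial")

# Summarise the fitted model: the coefficients are the estimated attribute weights
summary(fit)

# Predict choice probabilities for a new (hypothetical) set of offers
new_offers <- data.frame(price = c(2, 8), quality = c(4, 2))
pred <- predict(fit, newdata = new_offers, type = "response")
print(pred)

# A simple simulated share: the average predicted choice probability
share <- mean(predict(fit, type = "response"))
print(share)
```

The estimated coefficients correspond to the attribute weights, the predict() call to the predictions for new customers, and the averaged probabilities to a rough simulated share, as described under "How it Works".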

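The cluster-analysis steps listed under "Execution in R - Cluster Analysis" could look roughly like the sketch below. It assumes the built-in USArrests dataset as a stand-in for market research data, complete linkage as the agglomerative method, and k = 4 clusters. None of these choices come from the source, so they should be adapted to the problem at hand.

```r
# Import data (USArrests ships with R; it is used here purely as a stand-in)
data(USArrests)

# Explore data
str(USArrests)
summary(USArrests)

# Standardise the variables so that each has mean 0 and standard deviation 1
scaled <- scale(USArrests)

# Proximity matrix of Euclidean distances (dissimilarities)
d <- dist(scaled, method = "euclidean")

# Agglomerative hierarchical clustering; other methods include
# "single", "average" and "centroid"
hc <- hclust(d, method = "complete")

# Visualise the clusters in a dendrogram
plot(hc, main = "Dendrogram (complete linkage)")

# Split into k clusters (k = 4 is an assumption)
groups <- cutree(hc, k = 4)
table(groups)

# Scree plot: total within-cluster sum of squares for k = 1..8
set.seed(1)
wss <- sapply(1:8, function(k) kmeans(scaled, centers = k, nstart = 20)$tot.withinss)
plot(1:8, wss, type = "b",
     xlab = "Number of clusters", ylab = "Within-cluster sum of squares")

# Non-hierarchical clustering with k-means on the normalised data
km <- kmeans(scaled, centers = 4, nstart = 20)

# Cluster profiles: mean of each original variable per cluster
profiles <- aggregate(USArrests, by = list(cluster = km$cluster), FUN = mean)
print(profiles)
```

To test validity, the same steps can be repeated with a different distance measure, for example dist(scaled, method = "manhattan"), and the resulting groupings compared.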