Data segmentation

• Data segmentation involves dividing up and grouping data into relevant segments, allowing an organisation to make better marketing decisions based on customer personalisation and prospect insights.
• For instance, customer data can be segmented by lifestyle choices,
location, personal identifiers, and other parameters that help a brand
identify and market to its best customers.
• The benefit of data segmentation is that it gives organisations more personalised datasets. A strong understanding of who customers are and which experiences they value brings a stronger understanding of how best to communicate and connect with them.
Customer characteristics
• You begin by gathering all of the relevant and required information about your customers into one spreadsheet. The first question typically is: which characteristics do you use?
• The types of customer characteristics mainly fall into one of three
categories. First, there are the characteristics that most people
usually come up with first. Where is the customer located? What is
the customer's industry? How many employees does it have? What is
its revenue? How many regions is the customer in? These
characteristics are the demographic characteristics of your
customers, and your customer relationship management (CRM)
systems often already contain these data points.
• Second, there are characteristics of your customers' behavior. These behavioral characteristics are data points such as the number of orders in a month, the average value of orders, and the number of days to pay. Often, you use queries to extract this information from your enterprise resource planning (ERP) system. You might already have such behavioral characteristics of your customers available; sometimes, you create new calculations in queries to derive new numbers.
• Third, there are characteristics of your customers that do not come from any centralized database. Examples of this type of information include an assessment of the relationship quality from your salesperson, or a rating based on the number of returns or complaints. You might have to add this type of data manually. A small sketch assembling all three types of characteristics into one table follows.
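Below is a minimal, hypothetical pandas sketch (invented column names and values) of assembling the three types of characteristics, demographic, behavioral, and manual, into one segmentation-ready table:

```python
import pandas as pd

# 1. Demographic characteristics, typically exported from a CRM.
crm = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "location": ["Berlin", "Lyon", "Porto"],
    "industry": ["Retail", "Logistics", "Retail"],
    "employees": [120, 850, 40],
})

# 2. Behavioral characteristics, usually derived from ERP order data.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "order_value": [250.0, 310.0, 1200.0, 90.0, 75.0, 130.0],
    "days_to_pay": [30, 28, 45, 14, 20, 12],
})
behavior = orders.groupby("customer_id").agg(
    n_orders=("order_value", "size"),
    avg_order_value=("order_value", "mean"),
    avg_days_to_pay=("days_to_pay", "mean"),
).reset_index()

# 3. Characteristics added manually, e.g. a salesperson's relationship rating.
manual = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "relationship_rating": [4, 2, 5],  # 1 (poor) to 5 (excellent)
})

# One segmentation-ready table: demographic + behavioral + manual columns.
segmentation_data = crm.merge(behavior, on="customer_id").merge(manual, on="customer_id")
print(segmentation_data)
```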
The importance of data segmentation
Data segmentation, often referred to simply as segmentation, is important for several reasons. By segmenting customer data into relevant subgroups, organisations can identify opportunities more effectively, deliver more targeted communications, and improve revenue streams.
Data Segmentation can help businesses achieve the following:
1. Identifying opportunities
• When it comes to customer data, there is often a lot of information to take in. Data segmentation helps break up the data in meaningful ways, providing more detailed insight into the types of customers in the database. Once segmented, organisations can identify opportunities and patterns amongst subgroups of customers. For instance, older demographics might be more responsive to one form of communication than younger demographics.
2. Targeting communications
• With segmented data, organisations are able to tailor communications to relevant audiences more effectively. This can help deliver marketing messages in ways that are more likely to resonate with each individual customer or prospect.
• By understanding what customers are interested in, and more
about who they are, your communications team can engage with
customers in ways that they are more receptive to.
• Data segmentation can also be applied to qualify prospective
customers in your pipeline.
3. Improving revenue
• The main benefit of data segmentation from a business
perspective is the potential to increase revenue.
• By understanding customers in more detail and targeting
communications in a way that they are more receptive to,
businesses can in turn increase revenue opportunities.
• This can also save your sales team a lot of time since they’ll have
access to more detailed overviews of customers through
segmentation.
Models of segmentation using SPSS
• Two-step
• K-Means
• Hierarchical
• Tree
• Discriminant
• Nearest neighbor
Two-step Cluster Analysis
• The two-step clustering component is a cluster analysis designed to handle large datasets that may combine continuous and categorical variables. Quantitative variables with different scale units and nominal-scaled variables can be analyzed simultaneously; the user must decide whether to handle ordinal variables as continuous or as categorical. The two-step cluster analysis procedure uses a likelihood distance measure which assumes that the variables in the cluster model are independent of one another.
• Two-step cluster analysis is so named because it consists of two steps. In the first step, the observations are clustered into small sub-clusters, which are then treated as individual observations; a distance criterion determines whether an observation joins an existing sub-cluster or forms a new one. The algorithm can determine the number of clusters automatically. In the second step, the sub-clusters created in the first step are grouped into the required number of clusters. SPSS uses an agglomerative hierarchical clustering method, which works efficiently through the auto-cluster feature of the two-step clustering component. The procedure is accurate and scalable, and determines the number of clusters automatically even when working with large data files. A rough Python approximation of the two-step idea is sketched after these notes.
• For example, suppose we analyze information about the customers of a bank, dividing them into three clusters using the SPSS two-step cluster method. Two-step creates three customer profiles. The largest group contains skilled customers whose loan purpose is education or business. The second group consists of people who own real estate but are mostly unemployed, and who asked for credit for retraining or for household goods. The third profile groups people with unknown property, who request credit for a car or a television and then for education. The purpose of the analysis is to reinforce the company's profits by managing its clients more effectively.
• Clustering analysis is a form of exploratory data analysis in which observations are divided into groups that share common features. The purpose of cluster analysis is to make the resulting groups as different from one another as possible. The two main types are K-Means clustering and hierarchical clustering: K-Means is used when the number of classes is fixed in advance, while hierarchical clustering is used when the number of classes is unknown. Clustering algorithms use distance to separate observations into different groups.
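SPSS's two-step procedure itself uses a log-likelihood distance and automatic cluster-number selection; the following is only a rough, illustrative Python approximation of the same pre-cluster-then-group idea using scikit-learn (synthetic data; all parameter choices are invented):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import MiniBatchKMeans, AgglomerativeClustering

# Synthetic stand-in for a large customer dataset.
X, _ = make_blobs(n_samples=5000, centers=3, random_state=0)

# Step 1: pre-cluster the observations into many small sub-clusters.
pre = MiniBatchKMeans(n_clusters=50, n_init=3, random_state=0).fit(X)

# Step 2: group the sub-cluster centers hierarchically into the final clusters.
grouping = AgglomerativeClustering(n_clusters=3).fit(pre.cluster_centers_)

# Each observation inherits the final cluster of its sub-cluster.
final_labels = grouping.labels_[pre.labels_]
print(np.bincount(final_labels))  # size of each final cluster
```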
Introduction to Clustering
Clustering is defined as dividing data points or a population into groups such that similar data points fall in the same group. The aim is to segregate groups based on shared traits. Broadly, clustering can be divided into two subgroups:
• Soft clustering: instead of assigning each data point to exactly one cluster, a likelihood or probability of the data point belonging to each cluster is assigned.
• Hard clustering: each data point either belongs entirely to a cluster or not at all.
The task of clustering is subjective; there are many ways of achieving the goal, and many different sets of rules for defining similarity among data points. Of the more than 100 clustering algorithms, only a few are widely used. They fall into the following families (a minimal sketch instantiating each family follows this list):
• Connectivity models: these models are based on the idea that data points closer together in the data space are more similar than those lying farther apart. One approach starts by placing each data point in its own cluster and merges clusters as the distance decreases; another starts with all data points in a single cluster and partitions it as the distance increases. The choice of distance function is subjective. These models are easily interpreted but lack the scalability to handle large datasets. Example: hierarchical clustering.
• Centroid models: iterative clustering algorithms in which similarity is derived from the closeness of a data point to the cluster's centroid. Example: K-Means clustering. The number of clusters must be specified in advance, which requires prior knowledge of the dataset.
• Distribution models: these models are based on the likelihood that all data points in a cluster belong to the same distribution. Overfitting is common in these models.
• Density models: these models search the data space for areas of varying density. They isolate regions of differing density and assign the data points within a region to the same cluster.
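As a quick illustration, here is a minimal scikit-learn sketch (synthetic data, illustrative parameters) instantiating one algorithm from each family above:

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering, KMeans, DBSCAN
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# Connectivity model: hierarchical (agglomerative) clustering.
connectivity = AgglomerativeClustering(n_clusters=3).fit_predict(X)
# Centroid model: K-Means, with the number of clusters fixed in advance.
centroid = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
# Distribution model: a Gaussian mixture.
distribution = GaussianMixture(n_components=3, random_state=0).fit_predict(X)
# Density model: DBSCAN isolates regions of differing density.
density = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)
```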
K-Means Clustering
• The most common clustering method is K-Means. The first step is to create k new points among our unlabelled data, placed randomly; these are called centroids. The number of centroids represents the number of output classes. In each step of the iterative process, every observation is assigned to the nearest centroid (in terms of Euclidean distance). Next, for each class, the average of all the points assigned to it is computed; this average becomes the new centroid of the class.
• With every iteration, observations can be reassigned to another centroid. After several iterations, the centroids move less and less as the initially random centroids converge towards the real ones; the process ends when the centroids' positions no longer change. To choose the number of clusters, many methods can be employed, but a common one is the 'elbow method', sketched below: we look for a low level of variation within the clusters, as measured by the within-cluster sum of squares. This variation decreases as the number of centroids grows, so simply setting the highest possible number of centroids would be meaningless.
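A minimal sketch of the elbow method with scikit-learn (synthetic data; KMeans's inertia_ attribute is the within-cluster sum of squares):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

# Fit K-Means for a range of k and record the within-cluster sum of squares.
ks = range(1, 11)
wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks]

# The "elbow" is the k after which the curve flattens out.
plt.plot(ks, wcss, marker="o")
plt.xlabel("Number of clusters k")
plt.ylabel("Within-cluster sum of squares")
plt.title("Elbow method")
plt.show()
```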
Hierarchical Clustering
This algorithm uses two techniques: agglomerative and divisive. In hierarchical clustering (HC), the number of clusters K can be set exactly as in K-means, with n the number of data points such that n > K. Agglomerative HC starts from n clusters and merges them until K clusters are obtained; divisive HC starts from a single cluster and then splits it depending on similarities until K clusters are obtained. Similarity here is the distance between points, which can be computed in many ways, and it is the crucial element of discrimination. It can be computed with different approaches (see the sketch after this list):
• Min: given two clusters C1 and C2 with point a in C1 and point b in C2, the similarity between the clusters is the minimum of the distances over such pairs.
• Max: the similarity is the maximum of the distances over pairs a and b.
• Average: all pairs of points are taken and their distances computed; the average of these distances is the similarity between C1 and C2.
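These three criteria correspond to what scikit-learn calls 'single', 'complete', and 'average' linkage; a minimal sketch (synthetic data):

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering

X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# 'single' = min distance, 'complete' = max distance, 'average' = mean distance.
for linkage in ("single", "complete", "average"):
    labels = AgglomerativeClustering(n_clusters=3, linkage=linkage).fit_predict(X)
    print(linkage, labels[:10])
```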
• If the dataset contains a specific number of clusters but the group each point belongs to is unknown, choose K-means.
• If the number of clusters is not fixed by prior beliefs, hierarchical clustering can be used to discover it.
• With a large number of variables, K-means computes faster.
• The result of K-means is less structured, while that of hierarchical clustering is more interpretable and informative.
• It is easier to determine the number of clusters from hierarchical clustering's dendrogram, sketched below.
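A minimal SciPy sketch (synthetic data) of such a dendrogram:

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=30, centers=3, random_state=0)

# Build the agglomerative merge history and draw the tree.
Z = linkage(X, method="average")
dendrogram(Z)
plt.xlabel("Observations")
plt.ylabel("Merge distance")
plt.show()
```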
Decision tree
• Decision trees are used in everyday decisions: flow diagrams are in fact visual representations of decision trees.
• A decision tree is often a generalization of the experts’ experience,
a means of sharing knowledge of a particular process. For example,
before the introduction of scalable machine learning algorithms,
the credit scoring task in the banking sector was solved by experts.
The decision to grant a loan was made on the basis of some
intuitively (or empirically) derived rules that could be represented
as a decision tree.
• The decision tree as a machine learning algorithm is essentially the same thing as such a flow diagram: we incorporate a stream of logical rules of the form "feature a value is less than x and feature b value is less than y ... => Category 1" into a tree-like data structure. The advantage of this algorithm is that the result is easily interpretable. For example, using such a scheme, the bank can explain to the client why they were denied a loan: e.g. the client does not own a house and her income is less than 5,000. A minimal sketch of such a tree follows.
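A minimal, hypothetical scikit-learn sketch (toy data; the features owns_house and monthly_income are invented for illustration) of such a credit-scoring tree, with the learned rules printed as text:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy training data: [owns_house (0/1), monthly_income]; labels: 1 = granted.
X = [[0, 3000], [0, 7000], [1, 4000], [1, 9000], [0, 4500], [1, 2500]]
y = [0, 1, 1, 1, 0, 0]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned if/else rules, mirroring the expert rules described above.
print(export_text(tree, feature_names=["owns_house", "monthly_income"]))

# An applicant who does not own a house and earns less than 5,000:
print(tree.predict([[0, 4800]]))
```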
Challenges in data segmentation
Data segmentation is one of the most valuable data management tools for any business holding customer or prospect data. However, many businesses find it difficult to segment data effectively.
Some of the most common challenges businesses face with data segmentation
include:
• Not having enough data
• Having too much data
• Inaccurate data
• Lack of internal resources
However, many of these challenges are often easier to overcome than people
think.
Size of a dataset
• In most cases, it is unlikely that a business has too little data to segment. Even with smaller datasets, data segmentation can provide hyper-personalised insights into customer groups.
• Smaller businesses that don’t hold much customer data can also
ensure that sales communications are highly personalised by
segmenting B2B or B2C prospect lists.
• Equally, having too much data does not hinder the ability to
segment data. If you have a dataset of any size, it is likely that
there are key insights to uncover with data segmentation.
Inaccurate data
• Another challenge businesses face in data segmentation is
inaccurate data. Where there are inaccuracies in the dataset, it
becomes more difficult to make reliable business decisions based
on the segmentation.
• However, this issue can easily be overcome by validating and cleaning the data beforehand, as in the sketch below.
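A minimal pandas sketch (hypothetical columns and toy values) of validating and cleaning customer data before segmentation:

```python
import pandas as pd

# Toy customer table with a duplicate, a bad email, and an impossible value.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "email": ["a@x.com", "b@y.com", "b@y.com", "not-an-email", None],
    "revenue": [1200.0, -50.0, -50.0, 900.0, 400.0],
})

df = df.drop_duplicates(subset="customer_id")     # remove duplicate records
df = df[df["email"].str.contains("@", na=False)]  # keep plausible emails only
df = df[df["revenue"] >= 0]                       # drop impossible values

print(df)
```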
Lack of internal resources
• Many businesses also struggle due to a lack of internal resources.
This is especially true for businesses with larger datasets, or smaller
teams.
• This can make it difficult to manage the data effectively,
particularly when datasets contain inaccuracies that need
addressing before segmenting the data.
Automated Data Preparation
• Preparing data for analysis is one of the most important steps in any project, and traditionally one of the most time-consuming. Automated Data Preparation (ADP) handles the task for you, analyzing your data and identifying fixes, screening out fields that are problematic or not likely to be useful, deriving new attributes when appropriate, and improving performance through intelligent screening techniques. You can use the algorithm in fully automatic fashion, allowing it to choose and apply fixes, or you can use it in interactive fashion, previewing the changes before they are made and accepting or rejecting them as you want.
• Using ADP enables you to make your data ready for model building quickly and
easily, without needing prior knowledge of the statistical concepts involved.
Models will tend to build and score more quickly; in addition, using ADP improves
the robustness of automated modeling processes.
• Note: when ADP prepares a field for analysis, it creates a new field containing the
adjustments or transformations, rather than replacing the existing values and
properties of the old field. The old field is not used in further analysis; its role is set to
None. Also note that any user-missing value information is not transferred to these
newly created fields, and any missing values in the new field are system-missing.
• Example. An insurance company with limited resources to investigate homeowner's
insurance claims wants to build a model for flagging suspicious, potentially
fraudulent claims. Before building the model, they will ready the data for modeling
using automated data preparation. Since they want to be able to review the
proposed transformations before the transformations are applied, they will use
automated data preparation in interactive mode.
• An automotive industry group keeps track of the sales for a variety of personal
motor vehicles. In an effort to be able to identify over- and underperforming models,
they want to establish a relationship between vehicle sales and vehicle
characteristics. They will use automated data preparation to prepare the data for
analysis, and build models using the data "before" and "after" preparation to see
how the results differ.
• What is your objective? Automated data preparation recommends data preparation steps that will improve the speed with which other algorithms can build models and improve the predictive power of those models. This can include transforming, constructing and selecting features; the target can also be transformed. You can specify the model-building priorities that the data preparation process should concentrate on. A rough scikit-learn analog of this prepare-then-model idea is sketched after these options.
• Balance speed and accuracy. This option prepares the data to give equal priority to both
the speed with which data are processed by model-building algorithms and the accuracy
of the predictions.
• Optimize for speed. This option prepares the data to give priority to the speed with
which data are processed by model-building algorithms. When you are working with very
large datasets, or are looking for a quick answer, select this option.
• Optimize for accuracy. This option prepares the data to give priority to the accuracy of
predictions produced by model-building algorithms.
• Custom analysis. When you want to manually change the algorithm on the Settings tab,
select this option. Note that this setting is automatically selected if you subsequently
make changes to options on the Settings tab that are incompatible with one of the other
objectives.
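ADP itself is an SPSS feature with no direct Python equivalent; the following is only an illustrative scikit-learn analog (hypothetical field names throughout) of the same idea: imputing, encoding, rescaling, and screening features in one pipeline before modeling:

```python
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

numeric = ["claim_amount", "property_age"]   # hypothetical field names
categorical = ["region", "policy_type"]      # hypothetical field names

# Impute, rescale and encode fields instead of editing them by hand.
prepare = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical),
])

model = Pipeline([
    ("prepare", prepare),
    ("screen", SelectKBest(f_classif, k=5)),       # screen out weak features
    ("classify", LogisticRegression(max_iter=1000)),
])
# model.fit(claims[numeric + categorical], claims["suspicious"])  # hypothetical data
```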
