Data segmentation involves dividing customer data into meaningful subgroups to gain insights. This allows companies to better understand their customers and target communications. There are several clustering algorithms that can be used for data segmentation, such as k-means clustering, hierarchical clustering, and two-step clustering. Two-step clustering in SPSS automatically determines the optimal number of clusters. Data segmentation benefits companies by helping identify opportunities among customer subgroups, improving targeted communications, and increasing revenue opportunities overall.
• Data Segmentation involves dividing up and grouping data into relevant segments, allowing an organisation to make better marketing decisions based on customer personalisation and prospect insights.
• For instance, customer data can be segmented by lifestyle choices, location, personal identifiers, and other parameters that help a brand identify and market to its best customers.
• The benefit of Data Segmentation is that it gives organisations more personalised datasets. With a strong understanding of who customers are and the experiences they value comes a stronger understanding of how best to communicate and connect with them.
Customer characteristics
• You begin by gathering all of the relevant and required information about your customers into one spreadsheet. The first question typically is: which characteristics do you use?
• The types of customer characteristics mainly fall into one of three categories. First, there are the characteristics that most people come up with first: Where is the customer located? What is the customer's industry? How many employees does it have? What is its revenue? How many regions does it operate in? These are the demographic characteristics of your customers, and your customer relationship management (CRM) system often already contains these data points.
• Second, there are characteristics of your customers' behaviour. These are data points such as the number of orders in a month, the average value of orders, and the number of days to pay. Often, you use queries to extract this information from your enterprise resource planning system; sometimes you create new calculations in queries to derive new numbers. You might already have such behavioural characteristics of your customers available now.
• Third, there are characteristics of your customers that do not come from any centralised database. Examples include an assessment of the relationship quality from your salesperson, or a rating based on the number of returns or complaints. You might have to add this type of data manually. A sketch of assembling all three types into one table follows below.
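As an illustration, the following sketch assembles the three types of characteristics into a single customer table using pandas. The column names, customer IDs, and the manual relationship-quality ratings are all hypothetical.

```python
import pandas as pd

# Demographic characteristics, typically exported from a CRM (hypothetical columns)
demographics = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "location": ["UK", "DE", "US"],
    "industry": ["Retail", "Manufacturing", "Retail"],
    "employees": [40, 1200, 300],
})

# Behavioural characteristics, typically extracted from an ERP system via queries
behaviour = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "orders_per_month": [12, 3, 7],
    "avg_order_value": [250.0, 4100.0, 980.0],
    "days_to_pay": [14, 45, 30],
})

# Manually collected data, e.g. a salesperson's relationship-quality rating (1-5)
manual = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "relationship_quality": [5, 3, 4],
})

# Merge the three sources into one table, one row per customer
customers = demographics.merge(behaviour, on="customer_id").merge(manual, on="customer_id")
print(customers)
```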
The importance of data segmentation
Data Segmentation, also known simply as Segmentation, is important for several reasons. By segmenting customer data into relevant subgroups, organisations can identify opportunities more effectively, deliver more targeted communications, and improve revenue streams. Data Segmentation can help businesses achieve the following:
1. Identifying opportunities
• When it comes to customer data, there is often a lot of information to take in. Data segmentation helps break up the data in meaningful ways, providing more detailed insight into the types of customers in the database. Once segmented, organisations can identify opportunities and patterns amongst subgroups of customers. For instance, older demographics might be more responsive to one form of communication than younger demographics.
2. Targeting communications
• With segmented data, organisations are able to tailor communications to relevant audiences more effectively. This can help deliver marketing messages in ways that each individual customer or prospect is more likely to resonate with.
• By understanding what customers are interested in, and more about who they are, your communications team can engage with customers in ways they are more receptive to.
• Data segmentation can also be applied to qualify prospective customers in your pipeline.
3. Improving revenue
• The main benefit of data segmentation from a business perspective is the potential to increase revenue.
• By understanding customers in more detail and targeting communications in a way that they are more receptive to, businesses can in turn increase revenue opportunities.
• This can also save your sales team a lot of time, since they will have access to more detailed overviews of customers through segmentation.
Models of segmentation using SPSS
• Two-step
• K-Means
• Hierarchical
• Tree
• Discriminant
• Nearest neighbour
Two-step Cluster Analysis
• The Two-step Clustering Component is a cluster analysis designed to handle large datasets that may combine both continuous and categorical variables. Quantitative variables with different scale units and nominal-scaled variables may be analysed simultaneously; the user must decide whether to handle ordinal variables as continuous or as categorical. The two-step cluster analysis procedure uses a likelihood distance measure which assumes that the variables in the cluster model are independent of one another.
• Two-step cluster analysis is named for its two steps. In the first step, the observations are clustered into small sub-clusters, which are then treated as individual observations; a distance criterion determines whether an observation joins an existing sub-cluster or forms a new one. In the second step, the sub-clusters created in the first step are grouped, using agglomerative hierarchical clustering, into the required number of clusters. Through its auto-clustering feature, the two-step algorithm can determine the number of clusters automatically, and it remains accurate and scalable in performance even when working with large data files.
• For example, suppose we analyse information about the customers of a bank, dividing them into three clusters using the SPSS two-step cluster method. Two-step creates three customer profiles. The largest group contains skilled customers whose purpose for the loan is education or business. The second group consists of persons with real estate, but mostly unemployed, who asked for credit for retraining or for household goods. The third profile groups people with unknown property, who request credit for a car or a television and then for education. The purpose of the analysis is to reinforce the company's profits by managing its clients more effectively. A sketch of a comparable two-step-style analysis appears below.
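SPSS's two-step procedure is built into the software, but its pre-cluster-then-group design is closely related to the BIRCH algorithm, which scikit-learn exposes. The sketch below is therefore an analogue, not the SPSS procedure itself: Birch first builds small sub-clusters in a CF tree, then an agglomerative step groups them into three final clusters. The synthetic "bank customer" data is purely illustrative.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.preprocessing import StandardScaler

# Synthetic, purely illustrative customer data: annual income and loan amount
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([30_000, 5_000], [4_000, 1_000], size=(100, 2)),   # small loans
    rng.normal([55_000, 20_000], [6_000, 3_000], size=(100, 2)),  # mid-range loans
    rng.normal([90_000, 60_000], [8_000, 8_000], size=(100, 2)),  # large loans
])

# Standardise so both variables contribute comparably to the distance measure
X_scaled = StandardScaler().fit_transform(X)

# Step 1: Birch builds compact sub-clusters in a CF tree (threshold controls their size).
# Step 2: the sub-clusters are grouped into n_clusters final clusters.
model = Birch(threshold=0.3, n_clusters=3)
labels = model.fit_predict(X_scaled)

print("Sub-clusters found in step 1:", len(model.subcluster_centers_))
print("Customers per final cluster:", np.bincount(labels))
```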
Introduction to Clustering
• Clustering analysis is a form of exploratory data analysis in which observations are divided into groups that share common features. The purpose of cluster analysis is to ensure that observations in different groups are as different as possible. The two main types are K-Means clustering and Hierarchical Clustering: K-Means is used when the number of classes is fixed, while hierarchical clustering is used when the number of classes is unknown. Clustering algorithms use distance to separate observations into different groups.
• Clustering is defined as dividing data points or a population into several groups such that similar data points fall in the same group; the aim is to segregate groups based on similar traits.
Clustering can be divided broadly into two subgroups:
• Soft Clustering – instead of putting each data point into exactly one cluster, a likelihood or probability of the data point belonging to each cluster is assigned.
• Hard Clustering – each data point either belongs to a cluster entirely or not at all.
The task of clustering is subjective; there are many ways of achieving the goal, and there can be many different sets of rules for defining similarity among data points. Out of more than 100 clustering algorithms, a few are commonly used. They are as follows:
• Connectivity models – based on the fact that data points closer together in the data space exhibit more similarity than those lying farther away. One approach starts with every data point in a separate cluster and aggregates them as the distance decreases; another starts with all data points in a single cluster and partitions it as the distance increases. The choice of distance function is subjective. These models are easily interpreted but lack the scalability to handle large datasets. Example: hierarchical clustering.
• Centroid models – iterative clustering algorithms in which similarity is defined by the closeness of a data point to the cluster's centroid. Example: K-Means clustering. The number of clusters must be specified in advance, which requires prior knowledge of the dataset.
• Distribution models – based on the likelihood that all data points in a cluster belong to the same distribution. Overfitting is common in these models.
• Density models – search the data space for areas of varying density, isolate the regions of different density, and assign the data points within the same region to the same cluster.
K-Means Clustering
• The most common clustering method is K-Means. The first step is to create k new points, called centroids, and place them randomly among the unlabelled data; the number of centroids equals the number of output classes. Each iteration then assigns every data point to its nearest centroid (in terms of Euclidean distance) and, for each class, computes the average of all the points assigned to it. That average becomes the new centroid of the class.
• With every iteration, observations can be reassigned to another centroid. After several iterations, the centroids move less and less as the initial random centroids converge towards the real ones; the process ends when the centroids' positions no longer change. To choose the number of clusters, many methods can be employed, but a common one is the 'elbow method': within-cluster variation, measured by the within-cluster sum of squares, should be low, but since it always decreases as more centroids are added, simply setting the highest possible number of centroids would be meaningless. The elbow method instead picks the point where adding further centroids yields little improvement. A sketch of K-Means with the elbow method follows below.
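A minimal sketch of K-Means with the elbow method, using scikit-learn on synthetic data; the dataset and the range of k values tried are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with 3 true groups, for illustration only
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Elbow method: fit K-Means for a range of k and record the
# within-cluster sum of squares (called inertia_ in scikit-learn)
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    print(f"k={k}: within-cluster sum of squares = {km.inertia_:.1f}")

# The 'elbow' is where the curve flattens; here that should be around k=3.
# Fit the final model and inspect the cluster assignments.
final = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("Cluster sizes:", np.bincount(final.labels_))
```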
Hierarchical Clustering
• This algorithm uses one of two techniques: agglomerative and divisive. In hierarchical clustering (HC), the number of clusters K can be set exactly as in K-Means, with n the number of data points such that n > K. Agglomerative HC starts from n clusters and merges them until K clusters are obtained; divisive HC starts from a single cluster and splits it, depending on similarities, until K clusters are obtained. The similarity here is the distance between points, which can be computed in many ways and is the crucial element of discrimination. Given two clusters C1 and C2 such that point a belongs to C1 and point b to C2, it can be computed with different approaches:
• Min (single linkage): the similarity between C1 and C2 is equal to the minimum of the distances between their points.
• Max (complete linkage): the similarity between C1 and C2 is equal to the maximum of the distances between their points.
• Average (average linkage): all pairs of points are taken and their distances computed; the average of these distances is the similarity between C1 and C2.
Choosing between K-Means and hierarchical clustering:
• If the dataset contains a specific number of clusters but the group each point belongs to is unknown, choose K-Means.
• If there is no prior belief about the number of clusters, hierarchical clustering can be used to determine it.
• With a large number of variables, K-Means computes faster.
• The result of K-Means is unstructured, whereas that of hierarchical clustering is more interpretable and informative.
• It is easier to determine the number of clusters from a hierarchical clustering dendrogram; a sketch comparing the linkages and drawing a dendrogram follows below.
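A minimal sketch comparing the three linkage criteria with scikit-learn and drawing a dendrogram with SciPy; the synthetic data is illustrative only.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Synthetic data with 3 groups, for illustration only
X, _ = make_blobs(n_samples=60, centers=3, random_state=1)

# Agglomerative clustering with the three linkage criteria described above:
# 'single' = Min, 'complete' = Max, 'average' = Average
for link in ("single", "complete", "average"):
    labels = AgglomerativeClustering(n_clusters=3, linkage=link).fit_predict(X)
    print(f"{link:>8} linkage cluster sizes:", np.bincount(labels))

# The dendrogram shows the merge order and merge heights, which makes it
# easier to judge a sensible number of clusters by eye.
Z = linkage(X, method="average")
dendrogram(Z)
plt.title("Average-linkage dendrogram")
plt.show()
```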
Decision tree
• Decision trees are used in everyday life decisions; flow diagrams are in fact visual representations of decision trees.
• A decision tree is often a generalisation of experts' experience, a means of sharing knowledge of a particular process. For example, before the introduction of scalable machine learning algorithms, the credit-scoring task in the banking sector was solved by experts: the decision to grant a loan was made on the basis of intuitively (or empirically) derived rules that could be represented as a decision tree.
• The decision tree as a machine learning algorithm is essentially the same thing as such a flow diagram: we incorporate a stream of logical rules of the form "feature a's value is less than x and feature b's value is less than y … => Category 1" into a tree-like data structure. The advantage of this algorithm is that the rules are easily interpretable. For example, the bank can explain to a client why they were denied a loan: e.g. the client does not own a house and her income is less than 5,000. A sketch of such a tree follows below.
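A minimal sketch of a credit-scoring decision tree with scikit-learn; the tiny training set and the owns_house/income features are hypothetical stand-ins for the example above, where a loan is denied if the client does not own a house and earns less than 5,000.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical training data: [owns_house (0/1), monthly income]
X = [
    [1, 9000], [1, 4000], [0, 8000], [0, 4500],
    [1, 7000], [0, 3000], [0, 9500], [1, 2500],
]
# 1 = loan approved, 0 = loan denied; denied iff no house and income < 5000
y = [1, 1, 1, 0, 1, 0, 1, 1]

# A shallow tree keeps the learned rules easy to read and explain
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned rules, e.g. a split on owns_house followed by one on income
print(export_text(tree, feature_names=["owns_house", "income"]))
print("Decision for a renter earning 4,000:", tree.predict([[0, 4000]]))
```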
Challenges in data segmentation
Data segmentation is one of the most valuable data management tools for any business holding customer or prospect data. However, many businesses find it difficult to segment data effectively. Some of the most common challenges businesses face with data segmentation include:
• Not having enough data
• Having too much data
• Inaccurate data
• Lack of internal resources
Many of these challenges are often easier to overcome than people think.
Size of a dataset
• In most cases, it is unlikely that there is not enough data to segment. Even with smaller datasets, data segmentation can provide hyper-personalised insights into customer groups.
• Smaller businesses that don't hold much customer data can also ensure that sales communications are highly personalised by segmenting B2B or B2C prospect lists.
• Equally, having too much data does not hinder the ability to segment it. If you have a dataset of any size, it is likely that there are key insights to uncover with data segmentation.
Inaccurate data
• Another challenge businesses face in data segmentation is inaccurate data. Where there are inaccuracies in the dataset, it becomes more difficult to make reliable business decisions based on the segmentation.
• However, this issue can easily be overcome by validating and cleaning the data beforehand.
Lack of internal resources
• Many businesses also struggle due to a lack of internal resources. This is especially true for businesses with larger datasets or smaller teams.
• This can make it difficult to manage the data effectively, particularly when datasets contain inaccuracies that need addressing before segmenting the data.
Automated Data Preparation
• Preparing data for analysis is one of the most important steps in any project, and traditionally one of the most time-consuming. Automated Data Preparation (ADP) handles the task for you, analyzing your data and identifying fixes, screening out fields that are problematic or not likely to be useful, deriving new attributes when appropriate, and improving performance through intelligent screening techniques. You can use the algorithm in fully automatic fashion, allowing it to choose and apply fixes, or you can use it in interactive fashion, previewing the changes before they are made and accepting or rejecting them as you wish.
• Using ADP enables you to make your data ready for model building quickly and easily, without needing prior knowledge of the statistical concepts involved. Models will tend to build and score more quickly; in addition, using ADP improves the robustness of automated modeling processes.
• Note: when ADP prepares a field for analysis, it creates a new field containing the adjustments or transformations, rather than replacing the existing values and properties of the old field. The old field is not used in further analysis; its role is set to None. Also note that any user-missing value information is not transferred to these newly created fields, and any missing values in the new field are system-missing.
• Example: an insurance company with limited resources to investigate homeowner's insurance claims wants to build a model for flagging suspicious, potentially fraudulent claims. Before building the model, they ready the data for modeling using automated data preparation. Since they want to be able to review the proposed transformations before they are applied, they use automated data preparation in interactive mode.
• Example: an automotive industry group keeps track of the sales for a variety of personal motor vehicles. In an effort to identify over- and underperforming models, they want to establish a relationship between vehicle sales and vehicle characteristics. They use automated data preparation to prepare the data for analysis, and build models using the data "before" and "after" preparation to see how the results differ.
What is your objective?
Automated data preparation recommends data preparation steps that will affect the speed with which other algorithms can build models and improve the predictive power of those models. This can include transforming, constructing and selecting features; the target can also be transformed. You can specify the model-building priorities that the data preparation process should concentrate on:
• Balance speed and accuracy. This option prepares the data to give equal priority to the speed with which data are processed by model-building algorithms and the accuracy of the predictions.
• Optimize for speed. This option prepares the data to give priority to the speed with which data are processed by model-building algorithms. Select this option when you are working with very large datasets or looking for a quick answer.
• Optimize for accuracy. This option prepares the data to give priority to the accuracy of predictions produced by model-building algorithms.
• Custom analysis. Select this option when you want to manually change the algorithm on the Settings tab. Note that this setting is automatically selected if you subsequently make changes to options on the Settings tab that are incompatible with one of the other objectives.
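ADP itself is an SPSS feature, but the kind of preparation it automates can be sketched as a scikit-learn preprocessing pipeline. The sketch below is an analogue under stated assumptions, not the ADP algorithm: it imputes missing values, scales continuous fields, and encodes categorical ones, and, like ADP, it produces new transformed fields while leaving the original data untouched. The claims dataset and its column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical insurance-claims data with a missing value and mixed field types
claims = pd.DataFrame({
    "claim_amount": [1200.0, 560.0, np.nan, 8900.0],
    "property_age": [12, 3, 40, 7],
    "region": ["north", "south", "south", "east"],
})

# Continuous fields: impute missing values, then standardise
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical fields: one-hot encode, tolerating unseen categories at scoring time
prep = ColumnTransformer([
    ("num", numeric, ["claim_amount", "property_age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),
])

# Like ADP, this yields new, transformed fields; the `claims` table is unchanged
X_prepared = prep.fit_transform(claims)
print(X_prepared)
```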