
1. Clustering
2. Association
3. Data Cleaning 
4. Data Visualization
5. Classification
6. Machine Learning 
7. Prediction
8. Neural Networks
9. Outlier Detection
10. Data Warehousing

10 Data Mining Techniques


1. Clustering
Clustering is a technique used to group similar data points, with the results often represented
visually, such as in graphs that show buying trends or sales demographics for a particular product.

What Is Clustering in Data Mining?


Clustering refers to the process of grouping a series of different data points based on their
characteristics. By doing so, data miners can seamlessly divide the data into subsets,
allowing for more informed decisions in terms of broad demographics (such as consumers
or users) and their respective behaviors. 

Methods for Data Clustering

• Partitioning method: This involves dividing a data set into a specified number of clusters
for evaluation based on the criteria of each individual cluster. In this method, each data
point belongs to just one group or cluster.
• Hierarchical method: In the common bottom-up form of the hierarchical method, each data
point starts as its own cluster, and clusters are then merged step by step based on their
similarities. The resulting clusters can be analyzed separately from each other.
• Density-based method: A machine learning method in which data points that lie close
together in dense regions are grouped and analyzed further, while isolated data points
are labeled “noise” and discarded.
• Grid-based method: This involves dividing data into cells on a grid, which then can be
clustered by individual cells rather than by the entire database. As a result, grid-based
clustering has a fast processing time.
• Model-based method: In this method, a model is assumed for each data cluster, and the
data points that best fit that particular model are assigned to it.
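
To make the density-based idea from the list above concrete, here is a minimal Python sketch of the noise test alone. The function name label_noise, the parameters eps and min_neighbors, and the sample points are illustrative assumptions, not part of any particular library. A full density-based algorithm such as DBSCAN does considerably more than this.

```python
from math import dist


def label_noise(points, eps, min_neighbors):
    """Return a list of booleans: True where a point counts as noise.

    A point is treated as noise when fewer than `min_neighbors` other
    points lie within distance `eps` of it. This is only the density
    test, not a complete density-based clustering algorithm.
    """
    flags = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points) if i != j and dist(p, q) <= eps
        )
        flags.append(neighbors < min_neighbors)
    return flags


# Hypothetical 2-D data: two dense groups and one isolated point.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
          (9.0, 0.0)]
noise = label_noise(points, eps=0.5, min_neighbors=2)
print([p for p, is_noise in zip(points, noise) if is_noise])
```

With these sample values, only the isolated point is flagged. The thresholds would need tuning for real data.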
Examples of Clustering in Business
Clustering helps businesses manage their data more effectively. For example, retailers can
use clustering models to determine which customers buy particular products, on which days,
and with what frequency. This can help retailers target products and services to customers
in a specific demographic or region.
Clustering can help grocery stores group products by a variety of characteristics (brand, size,
cost, flavor, etc.) and better understand their sales tendencies. It can also help car insurance
companies that want to identify a set of customers who typically have high annual claims
in order to price policies more effectively. In addition, banks and financial institutions might
use clustering to better understand how customers use in-person versus virtual services to
better plan branch hours and staffing.

2. Association
Association rules are used to find correlations, or associations, between points in a data set. 

What Is Association in Data Mining?


Data miners use association to discover unique or interesting relationships between
variables in databases. Association is often used to help companies shape their
market research and strategy.

Methods for Data Mining Association


Two primary approaches to association in data mining are the single-dimensional and
multi-dimensional methods.

• Single-dimensional association: This involves looking for repeated instances of a single
data point or attribute. For instance, a retailer might search its database for the
instances in which a particular product was purchased.
• Multi-dimensional association: This involves looking for more than one data point in a
data set. That same retailer might want to know more than what a customer
purchased, such as the customer’s age or method of purchase (cash or credit card).
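
Association between items is commonly quantified with two standard measures, support and confidence, which the text above does not name. The sketch below is a minimal illustration under that assumption. The transactions and the function names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the share that also
    contain `consequent`. Assumes the antecedent occurs at least once."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical purchase records.
transactions = [
    {"smartphone", "case", "cable"},
    {"smartphone", "case"},
    {"tablet", "cable"},
    {"smartphone"},
]
print(support(transactions, {"smartphone"}))
print(confidence(transactions, {"smartphone"}, {"case"}))
```

Here the rule "smartphone implies case" holds in two of the three smartphone purchases. A retailer could use a threshold on such a value to decide which suggestions to show.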

Examples of Association in Business


The analysis of impromptu shopping behavior is an example of association — that is,
retailers notice in data studies that parents shopping for childcare supplies are more likely to
purchase specialty food or beverage items for themselves during the same trip. These
purchases can be analyzed through statistical association.
Association analysis carries many other uses in business. For retailers, it’s particularly
helpful in making purchasing suggestions. For example, if a customer buys a smartphone,
tablet, or video game device, association analysis can recommend related items like cables,
applicable software, and protective cases. 
Additionally, governments apply association to census data to plan for
public services; it is also used by doctors to diagnose various illnesses and conditions more
effectively.

3. Data Cleaning
Data cleaning is the process of preparing data to be mined.

What Is Data Cleaning in Data Mining?


Data cleaning involves organizing data, eliminating duplicate or corrupted data, and filling in
any null values. When this process is complete, the most useful information can be
harvested for analysis.

Methods for Data Cleaning

 Verifying the data: This involves checking that each data point in the data set is in the
proper format (e.g., telephone numbers, social security numbers). 
 Converting data types: This ensures data is uniform across the data set. For instance,
numeric variables only contain numbers, while string variables can contain letters,
numbers, and characters. 
 Removing irrelevant data: This clears useless or inapplicable data so full emphasis can
be placed on necessary data points. 
 Eliminating duplicate data points: This helps speed up the mining process by boosting
efficiency and reducing errors.
 Removing errors: This eliminates typing mistakes, spelling errors, and input errors that
could negatively affect analysis outcomes.
 Completing missing values: This provides an estimated value wherever data is missing.
Left unfilled, missing values can lead to skewed or incorrect results.
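
Several of the steps in the list above can be sketched in a few lines of plain Python. This is an illustration only: the record layout (name and age fields), the choice to fill missing ages with the rounded mean, and the function name clean are all assumptions made for the example.

```python
from statistics import mean


def clean(records):
    """Deduplicate records, convert ages to int, fill missing ages with the mean."""
    seen, rows = set(), []
    for rec in records:
        name = rec["name"].strip().title()          # normalize formatting
        raw_age = rec.get("age")
        # Convert the data type; empty or absent values stay missing for now.
        age = int(raw_age) if raw_age not in (None, "") else None
        key = (name, age)
        if key in seen:                             # drop exact duplicates
            continue
        seen.add(key)
        rows.append({"name": name, "age": age})
    known = [r["age"] for r in rows if r["age"] is not None]
    fill = round(mean(known)) if known else None    # estimate for missing values
    for r in rows:
        if r["age"] is None:
            r["age"] = fill
    return rows


# Hypothetical raw input.
raw = [
    {"name": " alice ", "age": "30"},
    {"name": "Alice", "age": "30"},   # duplicate once normalized
    {"name": "bob", "age": ""},       # missing value
    {"name": "Cara", "age": "40"},
]
print(clean(raw))
```

Filling with the mean is only one possible estimate. Other choices (median, a model-based estimate, or dropping the row) may suit a given data set better.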

Examples of Data Cleaning in Business


According to Experian, 95 percent of businesses say they have been impacted by poor data
quality. Working with incorrect data wastes time and resources, increases analysis costs
(because models need to be repeated), and often leads to faulty analytics.
Ultimately, no matter how great their models or algorithms are, businesses suffer when their
data is incorrect, incomplete, or corrupted.

4. Data Visualization
Data visualization is the translation of data into graphic form to illustrate its meaning to
business stakeholders. 

What Is Data Visualization in Data Mining?


Data can be presented in visual ways through charts, graphs, maps, diagrams, and more.
This is a primary way in which data scientists display their findings. 

Methods for Data Visualization


Many methods exist for representing data visually. Here are a few:

 Comparison charts: Charts and tables express relationships in the data, such as monthly
product sales over a one-year period.
 Maps: Data maps are used to visualize data pertaining to specific geographic locations.
Through maps, data can be used to show population density and changes; compare
populations of neighboring states, counties, and countries; detect how populations are
spread over geographic regions; and compare characteristics in one region to those in
other regions. 
 Heat maps: This is a popular visualization technique that represents data through
different colors and shading to indicate patterns and ranges in the data. It can be used
to track everything from a region’s temperature changes to its food and pop culture
trends.
 Density plots: These visualizations show how data is distributed over a continuous range,
often a period of time, producing a smooth curve that can look like a mountain range.
Density plots make it easy to represent occurrences of single events over time (e.g.,
month, year, decade).
 Histograms: These are similar to density plots but are represented by bars on a graph
instead of a smooth curve.
 Network diagrams: These diagrams show how data points relate to each other by using
a series of lines (or links) to connect objects together.
 Scatter plots: These graphs represent the relationship between two variables, one on
each axis. Scatter plots can be used to compare variables such as a country’s life
expectancy and the amount of money it spends on healthcare annually.
 Word clouds: These graphics are used to highlight specific word or phrase instances
appearing in a body of text; the larger the word’s size in the cloud, the more frequent its
use.
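
As a small illustration of the histogram idea from the list above, the following Python sketch counts values per bin and prints a text-only bar chart. The order values, the bin width, and the function name histogram are hypothetical. A charting library would normally draw the bars.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin of width `bin_width`."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical order values in whole dollars.
orders = [12, 18, 25, 27, 29, 31, 44, 47, 52]
width = 10
for start, count in histogram(orders, width).items():
    # One '#' per order that falls into the bin.
    print(f"{start:>3}-{start + width - 1:<3} {'#' * count}")
```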

Examples of Data Visualization in Business


Representing data visually is an important skill because it makes data readily
understandable to executives, clients, and customers. According to Markets and Markets,
the market size for global data visualization tools is expected to nearly double (to $10.2
billion) by 2026.
Companies can make faster, more informed decisions when presented with data that is easy
to understand and interpret. Today, this is typically accomplished through effective, visually
accessible mediums such as graphs, 3D models, and even augmented reality. As a result, it’s
a good idea for aspiring data professionals to consider learning such skills through a data
science and visualization bootcamp.

5. Classification
Classification is a fundamental technique in data mining and can be applied to nearly every
industry. It is a process in which data points from large data sets are assigned to categories
based on how they’re being used.

What Is Classification in Data Mining?


In data mining, classification is closely related to clustering in that it is useful for
extracting comparable points of data for comparative analysis. Unlike clustering, however,
classification assigns data points to categories that are defined in advance. Classification is
also used to designate broad groups within a demographic, target audience, or user base
through which businesses can gain stronger insights.

Methods for Data Mining Classification

• Logistic regression: This algorithm estimates the probability of a specific
outcome out of two possible results. For example, an email service can use logistic
regression to predict whether or not an email is spam.
• Decision trees: Data is classified by asking a sequence of questions, and the results are
diagrammed into a chart called a decision tree. For example, if a computer company
wants to predict the likelihood of laptop purchases, it may ask, Is the potential buyer a
student? The data is split into “Yes” and “No” branches, with other questions to
be asked afterward in a similar fashion.
• K-nearest neighbors (KNN): This is an algorithm that tries to identify an unknown object
by comparing it to the most similar known examples. For instance, grocery chains might
use the K-nearest neighbors algorithm to decide whether to include a sushi or hot meals
station in their new store layout based on consumer habits in the local marketplace.
• Naive Bayes: Based on Bayes’ theorem, this algorithm uses historical data to estimate
the probability that a new data point belongs to each class, treating the features as
independent of one another.
• Support Vector Machine (SVM): This machine learning algorithm is often used to define
the line that best divides a data set into two classes. An SVM can help classify images
and is used in facial and handwriting recognition software.
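
The K-nearest neighbors entry in the list above can be sketched in plain Python as follows. The training data (weekly visits and average basket value per shopper), the labels, and the function name knn_predict are invented for illustration. In practice the features would usually be scaled first so that no single feature dominates the distance.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical data: (weekly visits, average basket in dollars) -> preferred station.
train = [
    ((1, 20), "hot meals"), ((2, 25), "hot meals"), ((1, 30), "hot meals"),
    ((5, 60), "sushi"), ((6, 55), "sushi"), ((5, 70), "sushi"),
]
print(knn_predict(train, (5, 58)))
```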

Examples of Classification in Business


Financial institutions classify consumers based on many variables to market new loans or
project credit card risks. Meanwhile, weather apps classify data to project snowfall totals
and other similar figures. Grocery stores also use classification to group products by the
consumers who buy them, helping forecast buying patterns.

6. Machine Learning
Machine learning is the process by which computers use algorithms to learn on their own.
An increasingly relevant part of modern technology, machine learning makes computers
“smarter” by teaching them how to perform tasks based on the data they have gathered.

What Is Machine Learning in Data Mining?

In data mining, machine learning’s applications are vast. Machine learning and data mining
fall under the umbrella of data science but aren’t interchangeable terms. For instance,
computers perform data mining as part of their machine learning functions.

Methods for Machine Learning

 Supervised learning: In this method, algorithms train machines to learn using pre-
labeled data with correct values, which the machines then classify on their own. It’s
called supervised because the process trains (or “supervises”) computers to classify
data and predict outcomes. Supervised machine learning is used in data mining
classification.
 Unsupervised learning: When computers handle unlabeled data, they engage in
unsupervised learning. In this case, the computer groups the data itself and then looks
for patterns on its own. Unsupervised models are used to perform clustering and
association.
 Semi-supervised learning: Semi-supervised learning uses a combination of labeled and
unlabeled data, making it a hybrid of the above models.
 Reinforcement learning: This is a more layered process in which computers learn to
make decisions by acting in a specific environment and receiving feedback on the
results. For example, a computer might learn to play chess by playing thousands of
games and adjusting its strategy based on the outcomes.

Examples of Machine Learning in Business


With machine learning, companies can use computers to quickly identify all sorts of data
patterns (in sales, product usage, buying habits, etc.) and develop business plans using
those insights. This is a growing need in many industries. 
According to a MicroStrategy survey, 18 percent of analytics professionals said machine
learning and AI will have the most significant impact on their strategies over the next five
years. Learning more advanced topics like machine learning is thus becoming imperative for
data scientists.
Classification Problems Real-world Examples
Here is a list of real-life examples of machine learning classification problems:

 Customer behavior prediction: Customers can be classified into different
categories based on their buying patterns, web store browsing patterns, etc. For
example, classification models can be used to determine whether a customer is
likely to purchase more items or not. If the classification model predicts a greater
likelihood that they are about to make more purchases, then you might want to
send them promotional offers and discounts accordingly. Or, if the model predicts
that their purchasing will probably drop off soon, you might set them aside for
later while keeping their information readily available.
 Document classification: A multinomial classification model can be trained to
classify documents in different categories. In this case, the classification model can
be thought of as a function that maps from a document to a category label.
Different algorithms can be used for document classification such as Naive Bayes
classifier, Support Vector Machines (SVM), or Neural Networks models. Deep
learning algorithms such as Deep Boltzmann Machines (DBMs), Deep Belief
Networks (DBNs), and Stacked Autoencoders (SAEs) give state-of-the-art
classification results on different document classification datasets.
 Spam filtering: An algorithm is trained to recognize spam email by learning the
characteristics of what constitutes spam vs non-spam email. The classification
model could be a function that maps from an email text to a spam classification (or
non-spam classification). Algorithms such as Naive Bayes and Support Vector
Machines can be used for classification. Once the classification model is trained, it
can then be used to filter new incoming emails as spam or non-spam.
 Image classification: One of the most popular classification problems is image
classification: determining what type of object (or scene) is in a digital image.
Images can be thought of as high-dimensional vectors which we would like to
classify into different classes such as cat, car, human, and airplane. A multinomial
classification model can be trained to classify images into different categories. For
example, in order to classify images of dogs and cats for use within machine vision
systems, machine learning techniques can help automate this process based on
pre-classified images of dogs and cats. Deep learning algorithms
such as Convolutional Neural Networks (CNN)-based classification models are
state-of-the-art in different image classification tasks. Another use case is image
segmentation, where the pixels of an image are assigned a label based on what
object they belong to. Image segmentation is defined as “the process of
distinguishing semantically meaningful image regions on the basis of visual
features”.
 Web text classification: Classifying web pages/documents into different topics is
another classification problem. This classification task can be carried out by
mapping a text document to its corresponding topic category, which can further be
used for other downstream classification tasks such as automatic tagging of web
pages. The naive Bayes classification model is usually used for this classification
task, but deep learning models have been shown to give better classification
accuracy than naive Bayes models. For example, classification models can be used
to automatically classify web text into one of the following categories: Sports,
Entertainment, or Technology. Google News is a classic example of this
classification problem: it automatically classifies articles into different topic
categories.
 Ad click-through rate prediction: Binary classification models can be used to
predict whether one or more ads on the website will be clicked or not. Such models
are used to optimize the ad inventory on websites by selecting which ads will have a
better chance of being clicked. A machine learning classification model can be built
using historical data about what types of users do or don’t click on certain ads,
along with information like demographics and content within each web page where
an ad appears; then it is used to predict the chances that a user will click on an ad.
 Product categorization: A multinomial classification can be used to categorize
the products sold by different retailers in the same categories irrespective of
categories assigned to the product by the respective retailers. This use case is
relevant for eCommerce aggregators. Product classification is used in catalog-based
shopping websites such as Amazon, where products are automatically classified
into different categories based on their features or usage.
 Malware classification: A multinomial classification can be used to classify the
new/emerging-malware on the basis of comparable features of similar malware.
Malware classification is very useful for security experts to take appropriate actions
for combating/preventing malware. Machine learning classification algorithms
such as Naïve Bayes, k-NN, and tree-based models can be used for malware
classification.
 Image sentiment analysis: Machine learning binary classification models can
be built based on machine learning algorithms to classify whether the image
contains a positive or negative emotion/sentiment or not. This use case is relevant
in the field of social media analytics where machine learning techniques are applied
to understand users’ opinions and sentiments on different topics.
 Customer churn prediction: A binary classification model can be used to
classify whether a customer will churn or not in the near future. The application of
the customer churn classification model can be found in different business
scenarios like up-selling/cross-selling to existing customers, identifying at-risk
accounts in the customer base, etc. Most commonly, telecommunications
companies have been found to use machine learning classification models for
churn prediction.
 Customer behavior assessment for promotional offers: A binary
classification model can be used to classify whether an account is customer-friendly
or not in the context of a specific business scenario like upselling, cross-selling, etc.
For example, based on past data about how customers respond to certain types of
offers, machine learning techniques can be used to predict whether a given
customer will respond positively or negatively to the offer.
 Anomaly detection problems such as fraud detection: Anomaly detection
models can be built using machine learning classification algorithms like Naïve
Bayes, k-NN, etc. The application of these machine learning anomaly detection
models is very wide and includes use cases such as finding unusual patterns in
financial transactions that may indicate fraud, finding machine problems by
detecting unusual machine readings, and monitoring machine parameters to detect
abnormalities. In the case of an anomaly detection problem, the machine learning
model is trained to predict a particular class (e.g., normal or anomalous) and the
classification model is then used to classify new data as either belonging to the
normal set of data points or anomalous set of data points. The classification
algorithm should be able to detect rare cases that lie outside the training
distribution and can be used to detect possible fraudulent credit card transactions,
for example.
 Credit card fraud detection: A binary classification model can be used for
credit card fraud detection where the historical transaction data of a customer is
analyzed using machine learning algorithms like Naïve Bayes, k-NN, etc. Based on
past fraudulent or non-fraudulent transaction data and machine learning
classification models, it can be predicted whether a given credit card transaction is
fraudulent or not.
 Deduction validation classification: A binary classification model can be used
to classify whether a deduction claimed by the buyer on a given invoice is a valid or
invalid deduction. This would be useful in accounts receivable to classify whether
the given invoice will be paid in full or in part based on deduction validation
classification.
 Credit-worthiness assessment: A machine learning classification model can be
trained to predict the probability of default for a customer based on past
transaction data and historical information about customers who have
defaulted/not defaulted in their payments. Credit card companies, financial
institutions like banks, etc. can use such models.
 Blocked order release recommendation: A binary classification model can be
built to classify whether an order placed by the customer should be blocked or not
based on the buyer’s credit exposure. This use case is very prevalent in accounts
receivable where machine learning classification models are used to predict
whether a given order should be blocked or not. This would help the business save
costs by identifying high-risk customers.
 Sentiment analysis: A machine learning binary classification model can be
trained to identify the sentiment (positive/negative) of a given text document based
on classification algorithms like Naïve Bayes, SVM etc. This would help determine
whether the sentiment expressed in a document such as an email is positive or
negative for business purposes like identifying whether a customer is satisfied or
dissatisfied with the service provided.
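
As an illustration of the spam filtering and sentiment analysis entries above, here is a minimal Naive Bayes text classifier in plain Python. It is a sketch under stated assumptions: the four training messages are invented, the label "ham" is simply a conventional name for non-spam, whitespace tokenization is a simplification, and add-one (Laplace) smoothing is one common choice rather than something the text prescribes.

```python
from collections import Counter
from math import log


def train_nb(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict_nb(model, text):
    """Return the label with the higher log-probability under Naive Bayes."""
    word_counts, label_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        score = log(label_counts[label] / total)            # class prior
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing avoids zero probabilities for unseen words.
            score += log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training messages.
examples = [
    ("win a free prize now", "spam"),
    ("free offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train_nb(examples)
print(predict_nb(model, "free prize offer"))
print(predict_nb(model, "monday team meeting"))
```

The same structure would apply to positive/negative sentiment by swapping the two labels and the training texts.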
Machine learning classification models can be used to solve a wide variety of business
problems. There are many machine learning algorithms that can be applied in order to help
you classify different types of documents, images, and even customer behavior for
promotional offers or up-selling/cross-selling opportunities. For example, machine
intelligence could detect fraud by looking at unusual patterns in financial transactions which
may indicate fraudulent activity. Machine intelligence is also widely implemented within the
telecommunications industry where machine learning classification models have been found
to predict churn rates with more accuracy than traditional methods based on historical data
from past customers. In summary, machine learning classification algorithms/models are an
extremely powerful tool with vast applications across industries and use cases, ranging
from credit card fraud detection and image classification (e.g., categorizing a given image
as a dog or cat) to classifying customer behavior for up-selling/cross-selling.


7 Innovative Uses of Clustering Algorithms in the Real World
Clustering algorithms are a powerful technique for machine learning on unlabeled data. The most
common clustering algorithms in machine learning are hierarchical clustering and K-Means clustering.
These two algorithms are incredibly powerful when applied to different machine learning problems.
Both k-means and hierarchical clustering have been applied to different scenarios to help gain new
insights into the problem. Before diving into the innovative uses of clustering algorithms, I will first share
an overview of the two algorithms. 

What is unsupervised learning?


Before we get started, let me first introduce the concept of unsupervised learning. Unsupervised learning
is where you train a machine learning algorithm, but you don’t give it the answer to the problem.

1) K-means clustering algorithm


The K-Means clustering algorithm is an iterative process that tries to minimize the distance
between each data point and the average data point (the centroid) of the cluster it is assigned to.
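
The iterative process described above can be sketched in plain Python. This is a simplified illustration, not a production implementation: using the first k points as starting centroids and running a fixed number of iterations are assumptions made to keep the example short and deterministic, and the sample points are hypothetical.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = list(points[:k])  # simplistic initialization: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Hypothetical 2-D points forming two obvious groups.
points = [(1, 1), (8, 8), (1, 2), (2, 1), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

Each pass moves every centroid to the average of the points assigned to it, which is what drives the point-to-centroid distances down.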
2) Hierarchical clustering
Hierarchical clustering algorithms seek to create a hierarchy of clustered data points.
The agglomerative form of the algorithm reduces the number of clusters step by step by merging those
closest to one another, using a distance measurement such as Euclidean distance for numeric data or
Hamming distance for text.
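
A minimal sketch of this merging process in Python is shown below. It assumes single linkage (the distance between two clusters is the distance between their closest members), which is one of several possible linkage rules. The function names and sample data are illustrative.

```python
from math import dist


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length inputs")
    return sum(x != y for x, y in zip(a, b))


def agglomerative(items, n_clusters, distance):
    """Single-linkage clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[item] for item in items]    # every item starts as its own cluster
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(distance(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


# Euclidean distance for numeric points, Hamming distance for equal-length strings.
print(agglomerative([(0, 0), (0, 1), (5, 5), (6, 5), (5, 6)], 2, dist))
print(agglomerative(["karolin", "kathrin", "kerstin", "1011101", "1001001"], 2, hamming))
```

Recording the order of the merges, rather than only the final groups, is what yields the hierarchy that gives the method its name.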

Here are 7 examples of clustering algorithms in action.


1. Identifying Fake News
Fake news is not a new phenomenon, but it is one that is becoming prolific.
What the problem is: Fake news is being created and spread at a rapid rate due to technology
innovations such as social media. The issue gained attention recently during the 2016 US presidential
campaign. During this campaign, the term Fake News was referenced an unprecedented number of
times.
How clustering works: In a paper by two computer science students at the University
of California, Riverside, clustering algorithms are used to identify fake news based on the content.
The algorithm takes in the content of the articles (the corpus), examines the words used, and then
clusters them. These clusters help the algorithm determine which pieces are genuine and which are
fake news. Certain words are found more commonly in sensationalized, click-bait articles. When an
article contains a high percentage of these terms, there is a higher probability that the material
is fake news.

2. Spam filter
You know the junk folder in your email inbox? It is the place where emails that have been identified as
spam by the algorithm end up.
Many machine learning courses, such as Andrew Ng’s famed Coursera course, use the spam filter as an
example of unsupervised learning and clustering. 
What the problem is: Spam emails are at best an annoying part of modern day marketing techniques,
and at worst, an example of people phishing for your personal data. To avoid getting these emails in
your main inbox, email companies use algorithms. The purpose of these algorithms is to correctly flag
whether an email is spam or not.
How clustering works: K-Means clustering techniques have proven to be an effective way of identifying
spam. The way that it works is by looking at the different sections of the email (header, sender, and
content). The data is then grouped together. 
These groups can then be classified to identify which are spam. Including clustering in the classification
process improves the accuracy of the filter to 97%. This is excellent news for people who want to be
sure they’re not missing out on their favorite newsletters and offers.

3. Marketing and Sales


Personalization and targeting in marketing is big business. 
This is achieved by looking at specific characteristics of a person and sharing campaigns with them that
have been successful with other similar people. 
What the problem is: If you are a business trying to get the best return on your marketing investment, it
is crucial that you target people in the right way. If you get it wrong, you risk not making any sales, or
worse, damaging your customers’ trust.
How clustering works: Clustering algorithms are able to group together people with similar traits and
likelihood to purchase. Once you have the groups, you can run tests on each group with different
marketing copy that will help you better target your messaging to them in the future.

4. Classifying network traffic


Imagine you want to understand the different types of traffic coming to your website. You are
particularly interested in understanding which traffic is spam or coming from bots.
What the problem is: As more and more services begin to use APIs on your application, or as your
website grows, it is important you know where the traffic is coming from. For example, you want to be
able to block harmful traffic and double down on areas driving growth. However, it is hard to know which
is which when it comes to classifying the traffic.
How clustering works: K-means clustering is used to group together characteristics of the traffic
sources. When the clusters are created, you can then classify the traffic types. The process is faster and
more accurate than the previous AutoClass method. By having precise information on traffic sources,
you are able to grow your site and plan capacity effectively.

5. Identifying fraudulent or criminal activity 


In this scenario, we are going to focus on fraudulent taxi driver behavior. However, the technique has
been used in multiple scenarios.
What the problem is: You need to look into fraudulent driving activity. The challenge is how to
identify which activity is genuine and which is fraudulent.
How clustering works: By analysing the GPS logs, the algorithm is able to group similar behaviors.
Based on the characteristics of the groups, you are then able to classify them into those that are
genuine and those that are fraudulent.

6. Document analysis
There are many different reasons why you would want to run an analysis on a document. In this
scenario, you want to be able to organize the documents quickly and efficiently.
What the problem is: Imagine you are limited in time and need to organize information held in
documents quickly. To be able to complete this task you need to: understand the theme of the text,
compare it with other documents, and classify it.
How clustering works: Hierarchical clustering has been used to solve this problem. The algorithm is able
to look at the text and group it into different themes. Using this technique, you can cluster and organize
similar documents quickly using the characteristics identified in the paragraph.

7. Fantasy Football and Sports 


OK, so up until this point we have looked into different business problems and how clustering algorithms
have been applied to solve them.
But now for the critical issue: fantasy football!
What the problem is: Who should you have in your team? Which players are going to perform best for
your team and allow you to beat the competition? The challenge at the start of the season is that there is
very little if any data available to help you identify the winning players. 
How clustering works: When there is little performance data available to train your model on,
unsupervised learning has an advantage, because it does not need labeled outcomes. In this type of
machine learning problem, you can find similar players using some of their characteristics. This has
been done using K-Means clustering. Ultimately this means you can build a better team more quickly at
the start of the year, giving you an advantage.

How will you use clustering algorithms?


So there you have it, those were 7 innovative uses of clustering algorithms. As you can see, while the
technique remains reasonably constant, you can apply it to many different scenarios. 
Looking at the characteristics of different groups of data can help you make better predictions of
behavior. In this scenario, the real value of the algorithms is to help you create the best possible groups
of data.
Once you have a solid foundation of grouped data to work with, the opportunities become infinite.

Real-Life Examples of Association Analysis, Clustering Analysis, Text Mining, and Web
Usage Mining

Association Analysis (Data mining):

Association Analysis is a data mining technique that analyzes the co-occurrence of purchasing
events or other behaviors in customer transaction data. Its purpose is to propose more
suitable offers to customers based on an analysis of their past behavior. In other words,
Association Analysis makes it possible to predict that a customer who buys product X may
also want product Y.

Spotify’s song recommendations, based on users’ listening histories, are a real-life example
of Association Analysis.

Thanks to Bandits for Recommendations as Treatments (BaRT)[1], Spotify can determine which
songs each user likes and suggest new songs accordingly. To determine users’ preferences,
Spotify analyzes historical interactions such as songs listened to, songs skipped, songs added to
playlists, and radio activity. It then detects users with similar musical tastes and recommends
songs from one user’s personal playlists to another. It is reasonable to conclude that Spotify uses
Association Analysis to identify relations between users’ preferences, genres, and singers,
because the characteristics of Association Analysis (offering personal suggestions, estimating
which proposals will be liked, and so on) match the business algorithm used by Spotify.

In a field where competition is tough, Spotify needed to differentiate itself from rivals such as
Apple Music and Pandora.[2] All of its competitors had similar features and an essentially
identical, immensely large catalog of music. Therefore, Spotify realized the importance of a
unique value proposition to gain an advantage over its competitors.


To that end, Spotify created the ‘Discover Weekly’ feature, which uses Association Analysis to set
it apart from its competitors and reduces the risk that potential customers choose a competitor
instead. Spotify understands how important customization is to its users, and in this way the
company builds loyalty among its listeners. Moreover, Scott Wolf remarked that Spotify’s data
reveals a more realistic personality analysis than a traditional psychological test. When people
choose their songs, they do not pretend to be someone else, so the data gives more accurate
results about individuals’ characteristics, preferences, and attitudes. [3]

The critical challenge at this point is the security of personal data. One of the revenue sources
of Spotify and similar companies is selling user data to other companies for customized
advertisements. Adam Bly states that thanks to Spotify’s data, companies can apply the right
strategy: they can present the right message to the right people at the right time. Even though
Spotify obtains permission from users to use their data during the registration phase and makes
clear that nothing illegal is involved, trust will evidently remain a problem for people who use
Spotify, because, beyond personalized product advertisements, data has been used for political
goals by Pandora[4], a competitor of Spotify. Users are therefore concerned about this issue, and
the situation creates a new challenge for Spotify.

Integrating Association Analysis creates numerous competitive advantages. Some of them are
increasing customer satisfaction and loyalty, supporting effective marketing,
strengthening customer relationship management, enhancing productivity, and reducing risks
and uncertainties.[5]

Using Association Analysis creates these competitive advantages. It increases customer
satisfaction by making it easier for companies to respond to changing customer habits and
needs. Satisfied users continue to use the products or services, and loyalty follows.
Additionally, understanding customers’ habits gives companies the chance to create an effective
marketing strategy. Following customer behavior also helps companies understand what
customers like and dislike, which strengthens customer relationship management and reduces
risks.

Clustering Analysis (Data Mining):

Clustering Analysis groups data points that are similar to one another (in some sense) and
separates them from the rest. It tries to create distinct, accurate clusters based on the given
information. The goal of Clustering Analysis is to segment customers and to forecast future
events based on the data set. In this way, corporations can make better decisions.

Migros’s use of the Money Club Card to identify customer segments is an example of
Clustering Analysis.

Migros analyzes customer purchasing behavior, such as which customer buys which product and
from which store. In this way, Migros classifies its customers according to demographic features
and tries to identify which lifestyle each customer is closest to. Based on each customer’s
purchasing behavior, Migros makes personalized product offers to influence people, sell more
products, and increase its revenue. Migros’s data analysis therefore matches the definition of
Clustering Analysis given above.

Before Clustering Analysis, it was not possible to offer personalized products to customers
because Migros and other retailers could not identify customers’ purchasing habits. With over
400 campaigns a year, Migros risked boring its customers because it could not tell which
customers would be interested in which campaign.[6] Moreover, mass marketing campaigns
cannibalize one another, which reduces revenue.

Migros executives state that once distinct segments had been determined, they started to assign
strategies to them. They mainly focused on segments consisting of people who eat healthy
foods, gourmets, people who eat junk food, and families with children. Migros contacts those
segments directly and offers special prices for products that they purchase frequently. People
are also informed about products that they would find attractive. For example, people with
healthy eating habits are informed about the benefits of lycopene and the products in which it
can be found. Because campaigns reach the right target customers, campaign profits increase.
In support of this, Alexandra Brunner stated that the conversion rate rises when relevant offers
reach relevant customers directly.

Unfortunately, using Clustering Analysis has also created some challenges, such as the control of
operational processes. The security of customer data is another issue for Clustering Analysis:
significant personal information is collected through the Money Club Card, and some potential
customers who care about their privacy may avoid Migros in order to protect their personal
information.

Adopting Clustering Analysis makes it easier to target customer segments, raise customer
satisfaction, and increase efficiency and effectiveness. Companies become able to respond to
customers in time. Special offers create a lock-in effect. A further competitive advantage is
the Network Effect.

Encountering products that meet their needs at the right time keeps customers satisfied. In
addition, customers who stop purchasing from Migros incur a switching cost because they lose
the special discounts. This builds loyalty between the customer and the company. Last but not
least, as Migros increases its number of customers, people influence others, which creates a
Network Effect.

Text Mining:

Text mining makes it possible to examine unstructured data and obtain meaningful information
from it. It extracts the main elements by discovering the relationships between words.
Sentiment analysis is one text mining technique used to understand customers’ perspectives.

Casio’s use of Sprout Social is an example of Sentiment Analysis.

Thanks to Sprout Social, customer feedback about Casio on Twitter is gathered, and Casio can
respond to its customers more quickly than in the past. Sprout Social classifies the tone of
people’s messages as neutral, negative, or positive. In this way, Casio can understand customers’
views of the business effectively and can tell whether customers like its products or not. Casio
adjusts its processes according to this feedback. By the definition above, Casio’s use of Sprout
Social is a real-life example of Sentiment Analysis.

Casio wanted to compete aggressively with its rivals and needed to provide more efficient
social customer care. It needed to build a bridge between its marketing and customer care
departments in a way that would also work for large customer volumes.

Sprout Social makes customer care requests directly accessible to executives. The information
obtained is shared between the marketing and customer care departments on a single
platform.[7] Additionally, Sprout Social not only analyzes the tone of customers’ comments but
also tracks how that tone changes over time, which means Casio can follow whether the value of
its brand is increasing or not.[8] Integrating Sprout Social increased Casio’s response rates and
helped it recognize customers’ current requests and demands and understand trends. Another
crucial effect of Sentiment Analysis is helping to determine the right marketing strategy, such as
using distinct words in social marketing campaigns.

Even though there are no reported challenges for Casio created by sentiment analysis, constantly
changing demands and trends can cause difficulties for the brand, because preparing marketing
campaigns takes time. Casio must therefore also prepare quickly to keep up with customers’
needs and requests.

As competitive advantages, Sentiment Analysis offers companies many benefits, such as
differentiation from competitors, customer satisfaction, and the ability to build closer
relationships with customers.

Sentiment Analysis allows companies to identify what their value is compared to rivals, and
why and at which points their customers prefer other brands. The answers to these questions
provide an opportunity for differentiation. Gaining insight about customers also increases the
opportunity for customer satisfaction.

Web Usage Mining:

Web mining extracts information from web sites. Its core purpose is to analyze customer
behavior for companies and to assess the effectiveness of websites so they can be improved. It
can be divided into three types: web content mining, web structure mining, and web usage
mining. Web usage mining can be described as a technique that analyzes usage patterns from
web server data. In this way, customers’ needs can be revealed and the requirements of
web-based applications can be served better.[9]

By this definition, A.S. Watson Group’s integration of Tableau can be given as a real-life
example of Web Usage Mining.

Thanks to Tableau, Watson gained the capability to filter and drill down into data. It also helped
Watson’s idea generation process. Instead of relying on time-consuming email methods, Watson
obtains customer insights instantly.[10] The aim is to gain deep insight into customer behavior
and to increase the effectiveness of the website. Because Watson investigates user interaction to
achieve its goals, this is a real-life example of Web Usage Mining.

ASW states that the main problem was time-consuming processes because, before Tableau, all
work was done manually. The traditional reports also did not show the performance of
sub-categories, so the perspective they offered was limited. Separate reports had to be created
to see that performance, which was also time-consuming.

Tableau visualizes all the data and provides a single user interface. In this way, Watson
interprets its data well and can make better, more timely decisions. Moreover, collaboration
between departments is reinforced. This gives Watson a competitive advantage and solves all
the problems mentioned above.

Again, even though Watson has not reported any problems, the main challenge of web usage
mining is ethical. Privacy concerns create suspicion among customers and draw criticism,
especially when users are not informed.

The primary competitive advantages gained are speed and better performance for customers.

Tableau integrates external and internal data sources, so customer insights are more
accessible for Watson. Because Tableau is easy to use, employees can analyze and interpret
data without the need for data analysts, which saves time. Thanks to visualization, Watson’s
employees can recognize and highlight the areas that require immediate action in order to
catch market trends and stay ahead of competitors.
[1] Gershgorn, D. (2019, October 4). How Spotify’s Algorithm Knows Exactly What You Want to
Listen To. Retrieved from
https://onezero.medium.com/how-spotifys-algorithm-knows-exactly-what-you-want-to-listen-to-4b6991462c5c

[2] https://qz.com/571007/the-magic-that-makes-spotifys-discover-weekly-playlists-so-damn-good/

[3] https://www.spotifyforbrands.com/tr/insights/inside-spotifys-data-mission/

[4] Jacob, S. How Pandora Uses Data to Improve Its Service and Music Stations. Retrieved
from https://neilpatel.com/blog/how-pandora-uses-data/

[5] Bal, Y., Bal, M., Demirhan, Y. (2011, January). Creating Competitive Advantage by Using Data
Mining Technique as an Innovative Method for Decision Making Process in Business. Retrieved
from
https://www.researchgate.net/publication/220449370_Creating_Competitive_Advantage_by_Using_Data_Mining_Technique_as_an_Innovative_Method_for_Decision_Making_Process_in_Business

[6] https://www.sas.com/en_us/customers/migros-ch.html

[7] https://partners.twitter.com/en/success-stories/sprout-social-helps-casio-respond-to-customers-faster

[8] https://www.commsights.com/sentiment-analysis-in-social-media/

[9] https://en.wikipedia.org/wiki/Web_mining#Web_usage_mining

[10] https://www.tableau.com/solutions/customer/ASWatson-builds-competitive-edge-with-data-in-fast-moving-retail-industry

Association Analysis – Retail Case Study Example (Part 4)


Edward Scissorhands – by Roopam

This is a continuation of the case study example of marketing analytics we have been discussing for the
last few articles. You can find the previous parts at the following links ( Part 1, Part 2, and Part 3).  In the
last part, we discussed exploratory data analysis (EDA: Part 3). In this article we will talk about
association analysis, a helpful technique to mine interesting patterns in customers’ transaction data.
Association analysis can be used as a handy tool for extended exploratory data analysis. By the way,
association analysis is also the core of market basket analysis or sequence analysis. Later in the article,
we will use association analysis in our case study example to design effective offer catalogs for
campaigns and also online store design (website).

Scissorhands
I must have been 9 or 10 years old when in our school we had our first craft lecture. Craft lectures are
called SUPW in India, it’s an abbreviation for ‘Socially Useful Productive Work’. As a part of the first
lecture, each student was provided with an A4 sized color paper and a pair of scissors. In the first lecture
excited kids with no direction discovered that they could cut a sheet in a virtually infinite number of ways.
It was neither socially useful nor productive work and created a lot of wasted paper. A more apt long
form of SUPW in this case is ‘Some Useful Paper Wasted’. Later with a more directed effort we
discovered that there are so many cool shapes hidden in a piece of paper as long as scissors are used
wisely.

This is precisely the kind of experience many analysts have when they come across customers’
transaction data in companies. There is a wealth of information about customer behavior hidden in this
data, but it is hard to figure out where to start. Transaction data can be sliced, diced, and grouped in
infinitely many ways, similar to a piece of paper dissected with scissors. The key in both cases is
direction.

Hollywood Image of Data Analysis


Let me describe a typical Hollywood visual for data analysis: a man standing in front of a giant screen
with data (sequences of numbers) floating all over it. This man will detect patterns in this data on
the fly. This is a powerful image but completely untrue. The above technique of staring at data and
hoping to find patterns is guaranteed to generate all noise and very little signal. Even great code
breakers like John Nash and Alan Turing would fail if they tried to find patterns in data using this
Hollywood technique.

The point I am trying to drive at here is that data analysis is a highly planned activity. As an analyst,
never touch your data before you have a proper plan of action (hypotheses etc.) in place. Having said
this, there are always going to be times as an analyst when you have to enter uncharted territories of
data to find patterns. In these cases, I recommend you rely on machine learning algorithms or create
your own modified algorithms specific to your requirements. In my opinion, machines are any day better
than us humans at this task. Association analysis powered by the Apriori algorithm is one
such technique to mine transaction data. Let’s explore association analysis in the next part.

Association Analysis
Association analysis, as you will discover soon, is primarily frequency analysis performed on a large
dataset. Since datasets for most practical problems are large you need clever algorithms like  Apriori to
manage association analysis. Let’s consider a much smaller transaction dataset to learn about
association analysis. Here, each row or transaction number represents market baskets of customers.
For the subsequent products columns, 1 represents ‘bought the product in that transaction’, whereas 0
stands for ‘did not buy’.

Transaction #   Shirts   Trousers   Ties
001             1        1          1
002             0        1          0
003             1        0          1
004             1        0          1
005             1        1          0

There are a few association analysis metrics (i.e. support, confidence, and lift) that are really helpful in
deciphering information hidden in this kind of dataset. Let us explore these metrics and understand their
usage. Support for the purchase of shirts and ties together in association analysis is defined as:

Support (shirts and ties) = (transactions with both shirts and ties) / (total transactions)

For our data there are 3 transactions with both shirts and ties (shirts∩ties) out of a total of 5
transactions:

Support (shirts and ties) = 3 / 5 = 60%

60% is a fairly high value for support and you will rarely find such high values for support in real world
examples. For real world problems with several product groups, support of 1% or at times even lower
depending upon the nature of your problem is also useful.

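Confidence for association is calculated using the following formula:

Confidence (shirts → ties) = (transactions with both shirts and ties) / (transactions with shirts)

In our dataset, there are 3 transactions with both shirts and ties together out of 4 transactions with
shirts. The calculation for confidence for our dataset is:

Confidence (shirts → ties) = 3 / 4 = 75%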

Again you will rarely find such high value of confidence for most real world problems unless there are
appealing combo offers on two products. A good value of confidence is again problem specific.

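A third useful metric for association analysis is lift; it is defined as:

Lift (shirts → ties) = Confidence (shirts → ties) / Expected confidence (ties)

Expected confidence in the above formula is the presence of ties in the overall dataset, i.e. there are 3
instances of tie purchases out of 5, or 60%:

Lift (shirts → ties) = 75% / 60% = 125%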

The value for lift, 125%, shows that purchases of ties improve when customers buy shirts. The
question you are asking here is: if the customer buys a shirt, does his chance of buying ties go up,
i.e. is the value of lift above 100%? Let us use our knowledge about association analysis for the case
study example we have been working on.
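
The three metrics can be checked against the five-transaction table above with a short sketch. The function names and the use of Python sets for baskets are choices made for this illustration, and exact fractions are used so the percentages come out without rounding noise.

```python
from fractions import Fraction

# The five market baskets from the table above.
transactions = {
    "001": {"shirts", "trousers", "ties"},
    "002": {"trousers"},
    "003": {"shirts", "ties"},
    "004": {"shirts", "ties"},
    "005": {"shirts", "trousers"},
}


def support(items):
    """Share of all transactions that contain every product in `items`."""
    items = set(items)
    hits = sum(1 for basket in transactions.values() if items <= basket)
    return Fraction(hits, len(transactions))


def confidence(antecedent, consequent):
    """Of the transactions with `antecedent`, the share that also has `consequent`."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence divided by the expected confidence (support of `consequent`)."""
    return confidence(antecedent, consequent) / support(consequent)


support_value = support({"shirts", "ties"})
confidence_value = confidence({"shirts"}, {"ties"})
lift_value = lift({"shirts"}, {"ties"})

print(f"support:    {float(support_value):.0%}")
print(f"confidence: {float(confidence_value):.0%}")
print(f"lift:       {float(lift_value):.0%}")
```

This prints 60%, 75%, and 125%. On real data with many products you would not enumerate item pairs by hand like this; that is where an algorithm such as Apriori comes in.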

Retail Case Study Example – Association Analysis

DresSMart Inc., where you are the Chief Analytics Officer & Business Strategy Head, is an online retail
store for clothes and apparel. They showcase different products, brands, and styles. You know
association analysis works best when performed separately on different customer segments (read
about customer segmentation). However, you have decided to do a quick association analysis on the
data available in your company.

With your company’s data for formal shirts and ties, the product pair we explored in the above example,
you got support of 0.2% with confidence of 12% and lift of 509%. This implies that though only a small
percentage of transactions contain both ties and shirts, once a customer buys a formal shirt his
chances of buying a tie go up fivefold.
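
As a back-of-the-envelope check on these figures, the definitions of lift and confidence can be rearranged to see how common ties and formal shirts are overall. This is only a sketch: the variable names are illustrative, and the inputs are the rounded values quoted above, so the derived numbers are approximate.

```python
support_both = 0.002  # 0.2% of transactions contain formal shirts and ties
confidence = 0.12     # 12% of formal-shirt transactions also contain a tie
lift = 5.09           # 509%

# lift = confidence / expected confidence
#   => expected confidence = share of all transactions that contain a tie
expected_confidence = confidence / lift

# confidence = support_both / support_shirts
#   => support_shirts = share of all transactions that contain a formal shirt
support_shirts = support_both / confidence

print(f"ties in all transactions:          {expected_confidence:.2%}")
print(f"formal shirts in all transactions: {support_shirts:.2%}")
```

Under these assumptions, ties appear in roughly 2.4% of all transactions but in 12% of formal-shirt transactions, which is where the fivefold lift comes from.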

DresSMart gives its customers the option to return an undamaged product within 30 days for a full
refund. You did a further investigation of customers who are buying ties along with shirts and
found that the product return rates of the ties for these transactions are 3 times higher than other
return rates. This is an indicator that customers are struggling to choose matching ties while placing
orders online along with shirts. There is a need to improve this process on the company’s website. The
idea is to reduce the product return rate while exploiting the full opportunity for cross-selling ties
with shirts.

You have found some good clues to improve the profitability of your company through exploratory data
analysis tools. Now you want to prepare and address the original objectives (Part 2) to improve
profitability for campaign efforts. You will delve into serious modeling for this task next time around.
Sign-off Note

Hope you enjoy being Edward Scissorhands with your data! See you soon with the next part of this case
study example where we will explore more about decision tree algorithms.
