– Regression [Predictive]
– Deviation Detection [Predictive]
Classification: Application
• Customer Attrition/Churn:
– Goal: To predict whether a customer is likely to be lost to a
competitor.
– Approach:
• Use detailed records of transactions with each past and
present customer to find attributes.
– How often the customer calls, where they call, what
time of day they call most, their financial status,
marital status, etc.
• Label the customers as loyal or disloyal.
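The approach above can be sketched in a few lines. This is only an illustration under stated assumptions: the two attributes (calls per month, share of evening calls) and the four labeled records are invented toy values, and a 1-nearest-neighbour rule stands in for whatever classifier would actually be chosen.

```python
import math

# Hypothetical toy records: (calls_per_month, evening_call_share) -> label.
# These values are invented for illustration; they are not real data.
labeled_customers = [
    ((120, 0.20), "loyal"),
    ((95, 0.30), "loyal"),
    ((15, 0.80), "disloyal"),
    ((25, 0.70), "disloyal"),
]

def predict_label(attributes, training=labeled_customers):
    """Return the label of the closest labeled customer (1-nearest neighbour)."""
    nearest = min(training, key=lambda rec: math.dist(rec[0], attributes))
    return nearest[1]

print(predict_label((20, 0.75)))
print(predict_label((110, 0.25)))
```

In practice the attributes would likely be rescaled first, since calls per month dominates the distance here.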
Clustering Definition
• Market Segmentation:
– Goal: subdivide a market into distinct subsets of
customers where any subset may conceivably be selected
as a market target to be reached with a distinct marketing
mix.
– Approach:
• Collect different attributes of customers based on their
geographical and lifestyle related information.
• Find clusters of similar customers.
• Measure the clustering quality by comparing the buying
patterns of customers in the same cluster with those of
customers in different clusters.
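One way to make the quality check concrete is to compare how similar purchases are within a cluster and across clusters. The sketch below assumes Jaccard overlap of purchase sets as the similarity measure; the customers, purchases, and cluster assignments are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap of two purchase sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean(values):
    """Arithmetic mean, or NaN for an empty list."""
    return sum(values) / len(values) if values else float("nan")

def within_vs_between(purchases, cluster_of):
    """Mean purchase similarity for same-cluster pairs and cross-cluster pairs."""
    same, different = [], []
    for c1, c2 in combinations(purchases, 2):
        score = jaccard(purchases[c1], purchases[c2])
        if cluster_of[c1] == cluster_of[c2]:
            same.append(score)
        else:
            different.append(score)
    return mean(same), mean(different)

# Hypothetical customers, purchases, and cluster assignments (illustration only).
purchases = {
    "a": {"tent", "boots"},
    "b": {"tent", "boots", "stove"},
    "c": {"laptop", "headphones"},
    "d": {"laptop", "mouse"},
}
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1}

within, between = within_vs_between(purchases, cluster_of)
print(within, between)
```

A clustering looks useful when the within-cluster similarity is clearly higher than the between-cluster similarity.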
Association Rule Discovery: Definition
TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules Discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
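The two discovered rules can be checked against the five transactions. The slide does not define how rule strength is measured, so the sketch below assumes the usual support and confidence measures; the function names are illustrative.

```python
# The five transactions from the table, one set of items per TID.
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Among transactions containing antecedent, the fraction also containing consequent."""
    return support(antecedent | consequent) / support(antecedent)

for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs),
          "support", round(support(lhs | rhs), 2),
          "confidence", round(confidence(lhs, rhs), 2))
```

On this data, {Milk} --> {Coke} holds in 3 of 5 transactions (support 0.6) and in 3 of the 4 that contain Milk (confidence 0.75). {Diaper, Milk} --> {Beer} holds in 2 of 5 (support 0.4) and in 2 of the 3 that contain both Diaper and Milk (confidence about 0.67).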
Association Rule Discovery
• Agriculture:
– Data mining is an emerging technology in agriculture, used for crop
yield analysis with respect to four parameters: year, rainfall,
production, and area of sowing. Yield prediction is a very important
agricultural problem that remains to be solved based on the
available data.
Data Mining Process
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Model Building
5. Testing and Evaluation
6. Deployment
1. Business Understanding
• The key element of any data mining study is to know what the
study is for.
• Specific goals are needed, such as “What are the common
characteristics of the customers we have lost to our competitors
recently?” or “What are typical profiles of our customers, and how
much value does each of them provide to us?”
• First and foremost, the analyst should be clear and concise about the
description of the data mining task so that the most relevant data can be
identified
Step 5: Testing and Evaluation
• The developed models are assessed and evaluated for their accuracy
and generality. This step assesses the degree to which the selected
model (or models) meets the business objectives and, if so, to
what extent (i.e., do more models need to be developed and
assessed). Another option is to test the developed model(s) in a real-
world scenario if time and budget constraints permit.
• Even though the outcome of the developed models is expected to
relate to the original business objectives, other findings are
often discovered that are not necessarily related to those
objectives but that might unveil additional information or hints
for future directions.
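Assessing accuracy and generality can be illustrated by scoring a model on the data it was built from and on held-out data. The sketch below is one simple reading of that step, not a prescribed procedure; the labels and predictions are hypothetical.

```python
def accuracy(predicted, actual):
    """Share of cases where the predicted label matches the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical labels and predictions (illustration only).
train_actual = ["loyal", "loyal", "disloyal", "disloyal"]
train_predicted = ["loyal", "loyal", "disloyal", "disloyal"]
test_actual = ["loyal", "disloyal", "loyal", "disloyal"]
test_predicted = ["loyal", "disloyal", "disloyal", "disloyal"]

train_acc = accuracy(train_predicted, train_actual)
test_acc = accuracy(test_predicted, test_actual)

# A large gap suggests the model does not generalize beyond its training data.
generalization_gap = train_acc - test_acc
print(train_acc, test_acc, generalization_gap)
```

Whether a given accuracy or gap is acceptable still has to be judged against the business objectives.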
Step 6: Deployment