DATA MINING FOR BUSINESS INTELLIGENCE (1 and 2)

Data Mining: A process for extracting information from large data sets to solve business problems.

Data Warehouse: A large database created specifically for decision support throughout the enterprise. It usually consists of data extracted from other company databases. This data has been cleaned and organized for easy access. Often includes a metadata store as well.

Data Mining: Data mining is defined as the process of extracting significant and potentially useful patterns from large volumes of data. In other words, data mining is the search for the relationships and global patterns that exist in large databases but are hidden among vast amounts of data. These relationships represent valuable knowledge about the database, provided the database is a faithful mirror of the real world it registers.

Some application areas:
(i) Sales and Customer Service: Market Basket Analysis (analysis of transactional databases to find sets of items that appear frequently together in a single purchase) has already shown phenomenal gains in cross-selling, floor and shelf layout, better layout of catalog and web pages, and effective promotion schemes.
(ii) Customer Retention: Identifying patterns that lead to defection of customers and suggesting preventive measures for the current customers.
(iii) Risk Assessment and Fraud Detection:
- A mail order retailer can identify payment patterns from different customers at the same address, identifying potentially fraudulent practices by an individual using different names.
- An insurance company can identify clients who may have different kinds of policies totaling more than an acceptable level.
- A bank can identify companies that may be in financial jeopardy before extending a loan to them.
(iv) Customer Segmentation
(v) Product Grouping

TYPES OF KNOWLEDGE EXTRACTED USING DATA MINING
- Association Rule
- Classification
- Clustering
- Feature Selection
- Factor Analysis
- Sequence Mining
- Regression

ASSOCIATION RULE
Association rule mining is a type of data mining that correlates one set of items or events with another set of items or events. It employs association or linkage analysis, searching transactions from operational systems for interesting patterns with a high probability of repetition.
Algorithms:
- Apriori
- Partitioning
- Dynamic Itemset Counting
- Frequent Pattern Tree Algorithm
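As an illustration of the first algorithm in the list, here is a minimal sketch of the level-wise Apriori idea: frequent k-itemsets are built only from frequent (k-1)-itemsets. The function name `apriori`, the `min_support` parameter (a fraction, not a percentage), and the toy baskets are assumptions made for this example; they do not come from the slides, and a production implementation would count supports far more efficiently.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support fraction} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    frequent = {s: support(s) for s in current}

    k = 2
    while current:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent

# Hypothetical toy baskets, invented for illustration only.
baskets = [
    {"butter", "bread", "milk"},
    {"butter", "bread"},
    {"bread", "milk"},
    {"butter", "bread", "jam"},
    {"milk", "jam"},
]
result = apriori(baskets, min_support=0.4)
for itemset, sup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is discarded without counting if any of its subsets is already known to be infrequent.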

Examples of Association Rules
(i) When people buy butter, they also buy bread 70% of the time (Association)
(ii) When people buy Pepsi, they also buy Lays chips 70% of the time, on Sunday evenings (Temporal Association)
(iii) 70% of the readers who buy a DBMS book also buy a Data Mining book after a semester (Sequence Rule, a type of Temporal Association)
(iv) When people buy coke, they do not buy coffee 95% of the time (Negative Association)

The strength of an association rule is defined under the framework of 'support', 'confidence' and 'lift'.
'Support' of an itemset in a transaction database is defined as the percentage of occurrence of the itemset, out of all the transactions. X=>Y holds with support s% if s% of the transactions in the database contain both 'X' and 'Y'.

For a given transaction database, if 'X' and 'Y' represent two items/itemsets such that X ∩ Y = ∅, i.e., there is no common item in them, we say 'X' associates 'Y', represented as X=>Y.

'Confidence' of an association rule X=>Y is defined as the percentage of transactions containing both X and Y, out of all the transactions containing X. X=>Y holds with confidence c% if c% of the transactions in the database that support 'X' also support 'Y'.

'Lift' of an association rule X=>Y
The lift ratio is the confidence of the rule divided by the confidence assuming independence of consequent from antecedent:
Lift = Confidence / Percentage Support of Y
A lift ratio greater than 1.0 suggests that there is some usefulness to the rule. The larger the lift ratio, the greater the strength of the association.
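The three measures can be sketched directly from their definitions. The function names and the five toy transactions below are assumptions made for illustration; they are not data from the slides. Support and confidence are returned as percentages, so lift is simply their ratio.

```python
def support(transactions, itemset):
    """Percentage of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return 100.0 * hits / len(transactions)

def confidence(transactions, x, y):
    """Percentage of the transactions containing X that also contain Y."""
    return 100.0 * support(transactions, set(x) | set(y)) / support(transactions, x)

def lift(transactions, x, y):
    """Confidence of X=>Y divided by the percentage support of Y."""
    return confidence(transactions, x, y) / support(transactions, y)

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"butter", "bread"},
    {"butter", "bread"},
    {"butter"},
    {"bread"},
    {"milk"},
]

sup = support(transactions, {"butter", "bread"})
conf = confidence(transactions, {"butter"}, {"bread"})
lft = lift(transactions, {"butter"}, {"bread"})
print(f"support={sup:.1f}%  confidence={conf:.1f}%  lift={lft:.2f}")
```

Here butter and bread each appear in 3 of 5 transactions and together in 2, so the rule butter=>bread has a lift slightly above 1.0: the pair co-occurs a little more often than independence would predict.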

Various types of association rules:
- Ordinary Association Rule: Boolean Type, Quantitative Type, Categorical Type
- Temporal Association Rule: Boolean Type, Quantitative Type, Categorical Type
- Spatial Association Rule: Boolean Type, Quantitative Type, Categorical Type

Transaction of items:

TID   1  2  3  4  5  6  7  8  9  10
I1    1  1  0  1  1  1  0  0  1  1
I2    0  1  0  0  1  0  0  0  0  0
I3    1  0  0  0  1  0  0  1  1  1

Total no. of transactions = 10
n(I1) = 7, n(I2) = 2, n(I3) = 5
n(I1 and I2) = 2, n(I1 and I3) = 4, n(I2 and I3) = 1
n(I1, I2 and I3) = 1
Find all the association rules for min_support = 30%, min_confidence = 60% and min_lift = 1.

Itemset     No. of Trans.  Support  Frequent Itemset
I1          7              70%      I1
I2          2              20%      -
I3          5              50%      I3
I1,I2       2              20%      -
I1,I3       4              40%      (I1,I3)
I2,I3       1              10%      -
I1,I2,I3    1              10%      -

Confidence & Lift for the frequent itemset (I1,I3):
I1=>I3: Conf = n(I1,I3)/n(I1) = 4/7 = 57.1% (below min_confidence, so rejected)
I3=>I1: Conf = n(I1,I3)/n(I3) = 4/5 = 80%, Lift = 80/70 = 1.14

Association Rules: I3=>I1
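The worked example can be checked with a short brute-force sketch that enumerates every itemset of size two or more, keeps those meeting min_support, and tests each candidate rule against min_confidence and min_lift. The variable and function names are assumptions for this example; the data are the ten transactions given above.

```python
from itertools import combinations

# Item columns of the transaction table (TID 1..10).
rows = {
    "I1": [1, 1, 0, 1, 1, 1, 0, 0, 1, 1],
    "I2": [0, 1, 0, 0, 1, 0, 0, 0, 0, 0],
    "I3": [1, 0, 0, 0, 1, 0, 0, 1, 1, 1],
}
n = 10
transactions = [{item for item, bits in rows.items() if bits[t]} for t in range(n)]

def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

min_support, min_confidence, min_lift = 30.0, 60.0, 1.0

rules = []
items = sorted(rows)
for size in (2, 3):
    for itemset in combinations(items, size):
        if 100.0 * count(itemset) / n < min_support:
            continue  # not a frequent itemset
        # Split the frequent itemset into antecedent x and consequent y.
        for r in range(1, size):
            for x in combinations(itemset, r):
                y = tuple(i for i in itemset if i not in x)
                conf = 100.0 * count(itemset) / count(x)
                lift = conf / (100.0 * count(y) / n)
                if conf >= min_confidence and lift >= min_lift:
                    rules.append((x, y, round(conf, 1), round(lift, 2)))

print(rules)
```

Only (I1,I3) survives the support threshold, and of its two candidate rules only I3=>I1 meets the confidence threshold, which matches the table.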

Association Rules in Retail Sale Transaction Data

Rule #  Conf. %  Antecedent (a)                                      Consequent (c)                                   Support(a)  Support(c)  Support(a U c)  Lift Ratio
1       92.62    Noodles-maggi-200gm, Tomato sauce (Maggi)-200gm =>  MDH-Masala-chicken masala                        149         201         138             2.76472
2       68.66    MDH-Masala-chicken masala =>                        Noodles-maggi-200gm, Tomato sauce (Maggi)-200gm  201         149         138             2.76472
3       87.34    MDH-Masala-chicken masala, Noodles-maggi-200gm =>   Tomato sauce (Maggi)-200gm                       158         204         138             2.56888
4       67.65    Tomato sauce (Maggi)-200gm =>                       MDH-Masala-chicken masala, Noodles-maggi-200gm   204         158         138             2.56888
5       92.41    Kismis-100 gm =>                                    Basmati Rice (1 Kg)                              158         233         146             2.37953
6       74.63    MDH-Masala-chicken masala =>                        Tomato sauce (Maggi)-200gm                       201         204         150             2.19491
7       73.53    Tomato sauce (Maggi)-200gm =>                       MDH-Masala-chicken masala                        204         201         150             2.19491
8       66.99    Coffee (50gm)_Nescafe =>                            Bournvita-Cadbury 500gm                          209         230         140             1.74745
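The confidence and lift columns of the retail table can be recomputed from its support counts, as a sanity check on how the measures relate. One caveat: the slides do not state the total number of transactions. The value `N_ASSUMED = 600` below is an assumption, chosen only because it makes the recomputed lift ratios agree with the table to about four decimal places; the names in this sketch are likewise illustrative.

```python
# ASSUMPTION: total transaction count is not given in the slides; 600 is inferred.
N_ASSUMED = 600

# (rule #, support(a), support(c), support(a U c)) as read from the table.
table = [
    (1, 149, 201, 138),
    (2, 201, 149, 138),
    (3, 158, 204, 138),
    (4, 204, 158, 138),
    (5, 158, 233, 146),
    (6, 201, 204, 150),
    (7, 204, 201, 150),
    (8, 209, 230, 140),
]

def metrics(support_a, support_c, support_both, n=N_ASSUMED):
    """Return (confidence %, lift) from raw support counts."""
    conf = 100.0 * support_both / support_a
    lift = conf / (100.0 * support_c / n)
    return conf, lift

recomputed = {rule: metrics(a, c, both) for rule, a, c, both in table}
for rule, (conf, lift) in recomputed.items():
    print(f"rule {rule}: conf={conf:.2f}%  lift={lift:.4f}")
```

Note that a rule and its reverse (rules 1 and 2, 3 and 4, 6 and 7) share the same lift, because lift is symmetric in antecedent and consequent, while their confidences differ.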


Case-I
Maximize return on investment in retail industry customer rewards programs
(Publications: Articles & Whitepapers, Inductis Retail Rewards Solutions, "Optimally Manage Churn by Leveraging Each Shopper's Inherent Loyalty Intensity," by Dr. Matt Hasan, May 2007)

Recent research on retail customer rewards programs from industry sources reveals the following interesting facts:
- Reward programs, especially card-based ones, are widespread in the retail industry
- Programs have very high adoption rates due to no joining fee and low barriers to entry; 80% of the customers have a loyalty card
- 54% of shoppers have multiple loyalty cards from competing retailers
- 80%-90% of grocery purchases are made with a loyalty card
- Less than 30% of shoppers say reward programs have a major impact on their shopping decisions
- 40% of long-term reward program members never redeem their rewards
- Shoppers with emotional ties to a store (or chain) tend to shop there even if they have to go further or pay more
- The majority of retail rewards programs use just the cumulative dollar value of purchases to allocate rewards; some use transaction data to match coupons and discounts to customer buying patterns

Success Story of a Leading Retail Chain in USA
Using the SCWM (Strategic Customer Worth Management, developed by Inductis): "They were able to reverse a decline in sales while decreasing expenditures on rewards program, resulting in a gross margin improvement of 17%."
Strategic Customer Worth Management (SCWM): SCWM is a proven enterprise solution based on a unique "true customer worth" evaluating framework that determines the optimal treatment and rewards to offer to acquire, retain, or expand business with each customer.
Background: Store sales were declining, with:
- The average shelf life of store merchandise was higher than industry norms
- The Rewards/Loyalty program, including discounts and coupons, was attracting less-profitable customers without significantly impacting the top line

Current Rewards/Loyalty Programs
Data sources:
- Point of Sale Data
- Customer Database
- Product/Discount Database
Characteristics:
- Provides broad demographic and transaction data
- Rewards frequent and high-spending shoppers

Enhancements/Additions Based on SCWM:
- Personalized Customer Interaction
- Predictive Modeling
Characteristics:
- Matches incentives to customer profile
- Scores customers for true economic worth
- Provides details on inherent loyalty (emotional) profile of customers in addition to demographic and transaction data

Approach
- They applied the unique SCWM framework to perform in-depth customer segmentation of intrinsic loyalty, as well as preferences for merchandise and communications channel. They used in-store survey data as well as geodemographic data from consumer research services
- Aligned the loyalty programs, merchandizing, and communications channels of each store type according to the profile of the profitable customers of that store type
- Implemented a targeted direct mail program designed to resonate with shoppers matching the profitable customer profile of each store type
- Designed and executed a rewards program based on true customer worth and inherent loyalty coefficient of each customer segment
- Instituted an ongoing survey and tracking system to collect data on customer demographics and browsing, purchasing, and loyalty patterns, to create a virtuous circle of understanding customers and creating programs that retain their loyalty

Results
- Average revenue per store increased by 5%
- Among those stores which were previously the worst performers, average revenue per customer increased by 14%
- Traffic increased by an average of 11%, with some stores seeing an increase of as much as 20%
- Expenditures on rewards programs decreased by 8%, as targeted rewards programs replaced some sales, coupons, and discounts
- Gross margin increased by 17%
