ENHANCING AND DERIVING ACTIONABLE KNOWLEDGE FROM DECISION TREES
1. P. Senthil Vadivu, Head, Department of Computer Applications, Hindusthan College of Arts and Science, Coimbatore-641028, Tamil Nadu, India. Email: sowju_sashi@rediffmail.com
2. Dr. (Mrs) Vasantha Kalyani David, Associate Professor, Department of Computer Science, Avinashilingam Deemed University, Coimbatore, Tamil Nadu, India. Email: vasanthadavid@yahoo.com
Abstract
Data mining algorithms are used to discover customer models for distributional information. Customer profiles have been used in customer relationship management (CRM) to point out which customers are loyal and which are attritors, but they require human experts to discover the knowledge manually. Many post-processing techniques have been introduced that do not suggest actions to increase an objective function such as profit. In this paper, a novel algorithm is proposed that suggests actions to change customers from an undesired status to the desired one. These algorithms can discover cost-effective actions to transform customers from undesirable classes to desirable ones. Many tests have been conducted, and the experimental results are analyzed in this paper.

Key words: CRM, BSP, ACO, decision trees, attrition
 
1. Introduction
Much research has been done in data mining. Various models such as Bayesian models, decision trees, support vector machines, and association rules have been applied to industrial applications such as customer relationship management (CRM) [1][2], which maximizes profit and reduces costs, relying on post-processing techniques such as visualization and interestingness ranking.

Because of massive industry deregulation across the world, each customer faces an ever-growing number of choices in the telecommunications and financial services industries [3][10]. The result is that an increasing number of customers are switching from one service provider to another. This phenomenon is called customer "churning" or "attrition".

A main approach in the data mining area is to rank the customers according to the estimated likelihood that they will respond to direct marketing actions, and to compare the rankings using a lift chart or the area-under-curve measure from the ROC curve. Ensemble-based methods have been examined under cost-sensitive learning frameworks, for example by integrating boosting algorithms with cost considerations. A class of reinforcement learning problems and associated techniques is used to learn how to make sequential decisions based on delayed reinforcement so as to maximize cumulative rewards.

A common problem in current applications of data mining in intelligent CRM is that people tend to focus on, and be satisfied with, building the models and interpreting them, but not with using them to obtain profit explicitly. More specifically, most data mining algorithms only aim at constructing customer profiles that predict the characteristics of customers of certain classes. An example question of this kind is: which customers are likely attritors, and which are loyal customers?

Such knowledge can be acted on in the telecommunications industry, for example, by reducing the monthly rates or increasing the service level for valuable customers. Unlike distributional knowledge, actionable knowledge must take into account resource constraints such as direct mailing and sales promotion [14]. To make a decision, one must take into account the cost as well as the benefit of the actions to the enterprise.

This paper presents several algorithms, for the creation of the decision tree, BSP (Bounded Segmentation Problem), Greedy-BSP, and ACO (Ant Colony Optimization), which help us obtain actions that maximize the profit and find the number of customers who are likely to be loyal.
2. Extracting Actions from Decision Trees
For CRM applications, a decision tree can be built from a set of examples, each described by a set of attributes such as name, sex, and birthday, financial information such as yearly income, and family information such as lifestyle and number of children. Decision trees are used across a vast area of data mining because one can easily convert them into rules and also obtain the characteristics of the customers who belong to a certain class. The algorithm used in this paper does not rely only on prediction; it can also classify which customers are loyal, and such rules can be easily derived from decision trees.

The first step is to extract rules when there is no restriction on the number of rules that can be produced. This is called the unlimited resource case [3]. The overall process of the algorithm is as follows:
Algorithm 1:
Step 1: Import customer data, with data collection, data cleaning, data preprocessing, and so on.
Step 2: Build a decision tree using a decision tree learning algorithm [11] to predict whether a customer is in the desired status or not. One improvement to the tree building is to use the area under the ROC curve [7].
Step 3: Search for optimal actions for each customer using the key component, the proactive solution [3].
Step 4: Produce reports for domain experts to review the actions and then deploy them.
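A minimal end-to-end sketch of Algorithm 1 is given below, assuming a Python environment with pandas and scikit-learn; the toy data, column names, and classifier settings are illustrative assumptions, not the authors' implementation.

# Sketch of Algorithm 1 (toy data and names are assumptions, not the paper's code).
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Step 1: import customer data (here a tiny inline sample), then clean/preprocess.
data = pd.DataFrame({
    "service": [0, 0, 1, 2, 2, 1],   # 0 = Low, 1 = Med, 2 = High (already discretized)
    "sex":     [0, 1, 0, 1, 0, 1],   # 0 = F, 1 = M (a hard attribute)
    "rate":    [0, 1, 0, 1, 1, 0],   # 0 = L, 1 = H
    "loyal":   [1, 0, 1, 1, 0, 1],   # 1 = desired status, 0 = attritor
})
X, y = data[["service", "sex", "rate"]], data["loyal"]

# Step 2: build a decision tree predicting whether a customer is in the desired
# (loyal) status; the ROC curve / AUC can be used to tune this step, as noted above.
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(model.get_n_leaves(), "leaf nodes built")

# Step 3: search for optimal actions for each customer (leaf-node search, Section 2.1).
# Step 4: produce reports for domain experts to review and deploy the actions.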
2.1 A Search for a Leaf Node in the Unlimited Resources Case
This algorithm searches for optimal actions and transforms each leaf node to another node in a more desirable fashion. Once the customer profile is built, each customer in the training examples falls into a particular leaf node; moving the customer to a leaf node with a more desirable status yields a probability gain, which can then be converted into an expected gross profit.

When a customer is moved from one leaf node to another, some attribute values of the customer must be changed. When an attribute value is transformed from V1 to V2, it corresponds to an action that incurs a cost, which is defined in a cost matrix.

The leaf node search algorithm searches all leaves in the tree so that, for every leaf node, a best destination leaf node is found to move the customer to; the collection of moves is chosen to maximize the net profit.

Using the domain-specific cost matrix, the net profit of an action can be defined as follows:

P_Net = P_E * P_gain - Σ_i COST_ij    (1)

where P_Net denotes the net profit, P_E denotes the total profit of the customer in the desired status, P_gain denotes the probability gain, and COST_ij denotes the cost of each action involved.

The leaf node search algorithm for finding the best actions can be described as follows:
Algorithm: leaf-node search
1. For each customer x, do
2. Let S be the source leaf node into which x falls;
3. Let D be the destination leaf node for x with the maximum net profit P_Net;
4. Output (S, D, P_Net);
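The sketch below combines Eq. (1) with the leaf-node search above. It assumes each leaf stores its probability of the desired status and that a cost table gives the total cost of the attribute changes needed to move a customer between two leaves; all names and values are illustrative, not the authors' implementation.

# Hedged sketch of the leaf-node search; leaf probabilities and the move-cost
# table are assumptions for illustration, not the paper's implementation.
from typing import Dict, Tuple

def net_profit(p_e: float, p_gain: float, action_cost: float) -> float:
    # Eq. (1): P_Net = P_E * P_gain - sum of action costs.
    return p_e * p_gain - action_cost

def leaf_node_search(source: str,
                     leaf_prob: Dict[str, float],
                     move_cost: Dict[Tuple[str, str], float],
                     p_e: float) -> Tuple[str, float]:
    # Return the destination leaf with the maximum net profit for a customer
    # that currently falls into `source`; stay put if no move is profitable.
    best_dest, best_net = source, 0.0
    for dest, prob in leaf_prob.items():
        if dest == source:
            continue
        p_gain = prob - leaf_prob[source]
        cost = move_cost.get((source, dest), float("inf"))  # infinite for hard moves
        p_net = net_profit(p_e, p_gain, cost)
        if p_net > best_net:
            best_dest, best_net = dest, p_net
    return best_dest, best_net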
An example of a customer profile:

[Figure 1: Example decision tree with internal nodes on Service, Sex, and Rate, and five leaf nodes A-E, each labeled with the probability of a customer being loyal (A = 0.9, B = 0.2, C = 0.1, D = 0.8, E = 0.5).]

Consider the decision tree in the figure above; the tree has five leaf nodes, A, B, C, D, and E, each with a probability of the customer being loyal. The probability of being an attritor is simply 1 minus this probability.

Consider a customer, Alexander, whose record states that Service = L (service level is low), Sex = M (male), and Rate = L (mortgage rate is low). The customer is classified by the decision tree, and it can be seen that Alexander falls into leaf node B, which predicts that Alexander will have only a 20 percent chance of being loyal. The algorithm will now search through all other leaves (A, C, D, and E) in the decision tree to see if Alexander can be "moved" to a better leaf with the highest net profit.

1. Consider leaf node A. Although it has a high probability of being loyal (90 percent), the cost of the action would be very high, because Alexander would have to be changed to female, so the net profit is negative infinity.
2. Consider leaf node C; it has an even lower probability of being loyal, so we can simply skip it.
3. Consider leaf node D; the probability gain is 60 percent (80 percent - 20 percent) if Alexander falls into D, and the action needed is to change Service from L (low) to H (high).
4. Consider leaf node E; the probability gain is 30 percent (50 percent - 20 percent), which translates to $300 of expected gross profit. Assume that the cost of the actions (change Service from L to H and change Rate from L to H) is $250; then the net profit of moving Alexander from B to E is $50 (300 - 250).

Clearly, the node with the maximum net profit for Alexander is D, which suggests the action of changing the Service level from L to H.
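Using the leaf_node_search sketch from Section 2.1, the Alexander example can be reproduced numerically. The value P_E = $1,000 is an assumption chosen so that the 30 percent gain for leaf E corresponds to the $300 gross profit quoted above, and the $200 cost for moving to D is likewise assumed (the text gives only the $250 cost for the move to E).

# Worked example for customer Alexander, who falls into source leaf B.
leaf_prob = {"A": 0.9, "B": 0.2, "C": 0.1, "D": 0.8, "E": 0.5}
move_cost = {
    ("B", "A"): float("inf"),  # would require changing Sex, a hard attribute
    ("B", "D"): 200.0,         # change Service L -> H (assumed cost)
    ("B", "E"): 250.0,         # change Service L -> H and Rate L -> H (from the text)
}
dest, profit = leaf_node_search("B", leaf_prob, move_cost, p_e=1000.0)
print(dest, profit)  # D is the most profitable destination, as concluded above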
3. Cost Matrix
Each attribute-value change incurs a cost, and the cost for each attribute is determined by the domain experts. The values of many attributes, such as sex, address, and number of children, cannot be changed with any reasonable amount of money. These attributes are called "hard attributes", and the user must assign a large number to every corresponding entry in the cost matrix. Other values can be changed easily at reasonable cost; these attributes, such as the service level, interest rate, and promotion packages, are called "soft attributes".

The hard attributes should nevertheless be included in the tree building process in the first place, even though customers cannot be moved along them to other leaves, because many hard attributes are important for accurate probability estimation at the leaves.

For continuous-valued attributes, such as the interest rate, which vary within a certain range, the numerical ranges should be discretized first for feature transformation.
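One simple way to encode such a cost matrix is as a mapping from an attribute and a (from, to) value pair to a cost, with an effectively infinite cost on hard attributes. The encoding and the specific numbers below are only an illustration; the soft-attribute costs are chosen so that the Service and Rate changes of the earlier example total $250.

# Illustrative cost-matrix encoding; the cost values are assumptions.
HARD = float("inf")

cost_matrix = {
    # attribute: {(from_value, to_value): cost}
    "sex":     {("M", "F"): HARD, ("F", "M"): HARD},     # hard attribute
    "service": {("L", "M"): 100.0, ("L", "H"): 200.0},   # soft attribute
    "rate":    {("L", "H"): 50.0,  ("H", "L"): 80.0},    # soft attribute
}

def action_cost(changes):
    # Total cost of a set of attribute-value changing actions.
    return sum(cost_matrix[attr][(u, v)] for attr, u, v in changes)

print(action_cost([("service", "L", "H"), ("rate", "L", "H")]))  # 250.0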
4. The Limited Resource Case: Postprocessing Decision Trees

4.1 BSP (Bounded Segmentation Problem)
In the example considered above, each leaf node of the decision tree is treated as a separate customer group, and for each customer group we design actions to increase the net profit. In practice, however, the company may be limited in its resources, and when such limitations occur the leaf nodes must be merged into K customer segments, so that a responsible manager can apply several actions to each segment to increase the overall profit.

Step 1: Build a decision tree with a collection S(m) of source leaf nodes and a collection D(m) of destination leaf nodes.
Step 2: Choose a constant K (K < m), where m is the total number of source leaf nodes.
Step 3: Build a cost matrix over attribute values u and v.
Step 4: Build a unit benefit vector giving the benefit when a customer belongs to the positive class.
Step 5: Build a set of test cases.

The goal is to find a solution with maximum net profit by transforming customers that belong to a source node S to a destination node D via a number of attribute-value changing actions.
GOALS:
The goal is to transform a set of source leaf nodes S to a destination leaf node D, S -> D.

ACTIONS:
In order to make the change, one has to apply one or more attribute-value changing actions, each denoted by {Attr, u -> v}.

Thus, the BSP problem is to find the best K groups of source leaf nodes {Group_i, i = 1, 2, ..., K} and their corresponding goals and associated action sets that maximize the total net profit for a given data set C_test.

[Figure 2: Decision tree for the limited-resources example, with internal nodes on Status, Service, and Rate and leaf nodes L1-L4 labeled with loyalty probabilities 0.9, 0.2, 0.8, and 0.5.]

Example: To illustrate the limited resources problem, consider again the decision tree in the figure above. Suppose that we wish to find a single customer segment (K = 1). A candidate group is {L2, L4}, with a selected action set {Service = H, Rate = C}, which can transform the group to node L3. Assume that, in moving the group to leaf node L3, L2 changes the service level only and thus has a profit gain of (0.8 - 0.2) * 1 - 0.1 = 0.5, while L4 has a profit gain of (0.8 - 0.5) * 1 - 0.1 = 0.2. Thus, the net benefit for this group is 0.5 + 0.2 = 0.7.

As an example of the profit matrix computation, a part of the profit matrix corresponding to the source leaf node L2 is shown in the table, where Aset1 = {Status = A}, Aset2 = {Service = H, Rate = C}, and Aset3 = {Service = H, Rate = D}. Here, for convenience, we ignore the source values of the attributes, which depend on the actual test cases.
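The group net benefit in this example can be written out directly. The sketch below only reproduces the arithmetic for the candidate group {L2, L4} moving to L3; it does not implement the Greedy-BSP or ACO search itself, and the unit benefit of 1 and action cost of 0.1 are taken from the example above.

# Net benefit of moving the candidate group {L2, L4} to destination leaf L3.
leaf_prob = {"L1": 0.9, "L2": 0.2, "L3": 0.8, "L4": 0.5}
unit_benefit = 1.0   # benefit per customer in the positive (loyal) class
action_cost = 0.1    # cost of the shared action set, from the example

def group_net_benefit(group, dest):
    # Sum of per-source profit gains: (P(dest) - P(src)) * benefit - cost.
    return sum((leaf_prob[dest] - leaf_prob[src]) * unit_benefit - action_cost
               for src in group)

print(round(group_net_benefit(["L2", "L4"], "L3"), 2))  # 0.7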
