

**ENHANCING AND DERIVING ACTIONABLE KNOWLEDGE FROM DECISION TREES**

1. P. Senthil Vadivu, Head, Department of Computer Applications, Hindusthan College of Arts and Science, Coimbatore-641028, Tamil Nadu, India. Email: sowju_sashi@rediffmail.com
2. Dr. (Mrs) Vasantha Kalyani David, Associate Professor, Department of Computer Science, Avinashilingam Deemed University, Coimbatore, Tamil Nadu, India. Email: vasanthadavid@yahoo.com

Abstract

Data mining algorithms are used to discover customer models for distributional information. Using customer profiles in customer relationship management (CRM), they have been used to point out which customers are loyal and which are attritors, but they require human experts to discover knowledge manually. Many post-processing techniques have been introduced, but they do not suggest actions that increase an objective function such as profit. In this paper, a novel algorithm is proposed that suggests actions to change a customer from an undesired status to a desired one. These algorithms can discover cost-effective actions to transform customers from undesirable classes to desirable ones. Many tests have been conducted, and the experimental results are analyzed in this paper.

Key words: CRM, BSP, ACO, decision trees, attrition

1. Introduction

Extensive research has been done in data mining. Various models such as Bayesian models, decision trees, support vector machines and association rules have been applied to industrial applications such as customer relationship management (CRM) [1][2], which maximizes profit and reduces costs, relying on post-processing techniques such as visualization and interestingness ranking. Because of massive industry deregulation across the world, each customer is facing an ever-growing number of choices in the telecommunications industry and in financial services [3][10]. The result is that an increasing number of customers are switching from one service provider to another. This phenomenon is called customer “churning” or “attrition”.

A main approach in the data mining area is to rank the customers according to their estimated likelihood of responding to direct marketing actions and to compare the rankings using a lift chart or the area under the curve (AUC) measure from the ROC curve. Ensemble-based methods are examined under cost-sensitive learning frameworks, for example, boosting algorithms integrated with cost considerations. A class of reinforcement learning problems and associated techniques is used to learn how to make sequential decisions based on delayed reinforcement so as to maximize cumulative rewards.

A common problem in current applications of data mining in intelligent CRM is that people tend to focus on, and be satisfied with, building the models and interpreting them, but do not use them to obtain profit explicitly. More specifically, most data mining algorithms only aim at constructing customer profiles, which predict the characteristics of customers of certain classes. An example question of this kind is: what kind of customers are likely attritors, and what kind are loyal customers? Acting on such profiles is possible in the telecommunications industry, for example, by reducing the monthly rates or increasing the service level for valuable customers. Unlike distributional knowledge, actionable knowledge must take into account resource constraints such as those on direct mailing and sales promotion [14]. To make a decision, one must take into account the cost as well as the benefit of actions to the enterprise.

This paper presents several algorithms, covering the creation of the decision tree, BSP (Bounded Segmentation Problem), Greedy-BSP and ACO (Ant Colony Optimization), which help to obtain actions that maximize the profit and to find the number of customers who are likely to be loyal.


http://sites.google.com/site/ijcsis/ ISSN 1947-5500

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 9, December 2010

**2. Extracting Actions in Decision Tree**

For CRM applications, a decision tree can be built from a set of examples described by a set of attributes such as name, sex and birthday, financial information such as yearly income, and family information such as lifestyle and number of children. Decision trees are used in a vast area of data mining because one can easily convert them into rules and also obtain the characteristics of the customers who belong to a certain class. The algorithm used in this paper does not rely only on prediction; it can also classify the customers who are loyal, and such rules can be easily derived from decision trees. The first step is to extract rules when there is no restriction on the number of rules that can be produced. This is called the unlimited resource case [3]. The overall process of the algorithm is described as follows:

Algorithm 1:
Step 1: Import customer data, with data collection, data cleaning, data preprocessing and so on.
Step 2: Build a decision tree using a decision tree learning algorithm [11] to predict whether a customer is in the desired status or not. One improvement for the tree building is to use the area under the curve of the ROC curve [7].
Step 3: Search for optimal actions for each customer. This is the key component of the proactive solution [3].
Step 4: Produce reports for domain experts to review the actions and to deploy them.

Using a domain-specific cost matrix, the net profit of an action can be defined as follows:

$$P_{Net} = P_E \times P_{gain} - \sum_i COST_{ij} \qquad (1)$$

where $P_{Net}$ denotes the net profit, $P_E$ denotes the total profit of the customer in the desired status, $P_{gain}$ denotes the probability gain, and $COST_{ij}$ denotes the cost of each action involved. The leaf-node search algorithm for searching the best actions can be described as follows:

**Algorithm: leaf-node search**

1. For each customer x, do
2. Let S be the source leaf node into which x falls;
3. Let D be a destination leaf node for x with the maximum net profit PNet;
4. Output (S, D, PNet);

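Figure: An example of a customer profile (decision tree). The root splits on Service: Low leads to a split on Sex (F: leaf A, 0.9; M: leaf B, 0.2); Med leads to leaf C (0.1); High leads to a split on Rate (L: leaf D, 0.8; H: leaf E, 0.5). The value at each leaf is the probability of the customer being loyal.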
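The leaf-node search over this example tree can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The value `PROFIT_DESIRED = 1000` is inferred from the paper's statement that a 30 percent gain corresponds to $300. The individual costs in `SOFT_COST` are assumptions, chosen only so that changing both service and rate costs $250 as in the paper. All function and variable names are hypothetical.

```python
# Leaves of the example tree: (attribute tests on the path, probability of being loyal).
LEAVES = {
    "A": ({"service": "Low", "sex": "F"}, 0.9),
    "B": ({"service": "Low", "sex": "M"}, 0.2),
    "C": ({"service": "Med"}, 0.1),
    "D": ({"service": "High", "rate": "L"}, 0.8),
    "E": ({"service": "High", "rate": "H"}, 0.5),
}

HARD_ATTRIBUTES = {"sex"}                     # cannot be changed at any reasonable cost
SOFT_COST = {"service": 200.0, "rate": 50.0}  # assumed cost of one value change
PROFIT_DESIRED = 1000.0                       # assumed P_E (profit of a loyal customer)


def find_leaf(customer):
    """Return the name of the leaf whose path tests all match the customer."""
    for name, (conditions, _) in LEAVES.items():
        if all(customer.get(attr) == value for attr, value in conditions.items()):
            return name
    raise ValueError("customer matches no leaf")


def net_profit(customer, source, dest):
    """Equation (1): P_E * P_gain minus the summed cost of the required changes."""
    conditions, p_dest = LEAVES[dest]
    p_source = LEAVES[source][1]
    cost = 0.0
    for attr, target in conditions.items():
        if customer.get(attr) != target:
            # Hard attributes get an infinite cost, so the move is never chosen.
            cost += float("inf") if attr in HARD_ATTRIBUTES else SOFT_COST[attr]
    return PROFIT_DESIRED * (p_dest - p_source) - cost


def leaf_node_search(customer):
    """Return (source leaf, best destination leaf, net profit of that move)."""
    source = find_leaf(customer)
    candidates = [leaf for leaf in LEAVES if leaf != source]
    dest = max(candidates, key=lambda leaf: net_profit(customer, source, leaf))
    return source, dest, net_profit(customer, source, dest)


alexander = {"service": "Low", "sex": "M", "rate": "L"}
print(leaf_node_search(alexander))
```

Under these assumed costs, the customer with Service = Low, Sex = M and Rate = L falls into B and is moved to D, while the move from B to E nets about $50, as in the paper's worked example. The sketch always returns the best other leaf, even when every move has a negative net profit; a real deployment would probably leave such customers where they are.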

**2.1 A search for a leaf node in the unlimited resource case**

This algorithm searches for optimal actions that move each customer from one leaf node to another, more desirable one. Once the customer profile is built, each customer in the training examples falls into a particular leaf node. Moving the customer to a leaf node with a more desirable status yields a probability gain, which can then be converted into an expected gross profit. When a customer is moved from one leaf node to another, some attribute values of the customer must be changed. When an attribute value is transformed from V1 to V2, this corresponds to an action that incurs a cost, which is defined in a cost matrix. The leaf-node search algorithm searches all the leaves in the tree so that, for every leaf node, a best destination leaf node is found to move the customer to. The collection of moves is required to maximize the net profit.

Consider the decision tree in the example customer profile. The tree has five leaf nodes, A, B, C, D and E, each with the probability of customers being loyal. The probability of being an attritor is simply 1 minus this probability. Consider a customer, Alexander, whose record states that Service = Low (the service level is low), Sex = M (male) and Rate = L (the mortgage rate is low). The customer is classified by the decision tree. It can be seen that Alexander falls into leaf node B, which predicts that Alexander has only a 20 percent chance of being loyal. The algorithm now searches through all the other leaves (A, C, D and E) in the decision tree to see whether Alexander can be “replaced” into a better leaf with the highest net profit.

1. Consider leaf node A. It does have a high probability of being loyal (90 percent), but the cost of the action would be very high (Alexander would have to be changed to female), so the net profit is negative infinity.
2. Consider leaf node C. It has a lower probability of being loyal, so we can easily skip it.
3. Consider leaf node D. The probability gain is 60 percent (80 percent - 20 percent) if Alexander falls into D; the action needed is to change the service from L (low) to H (high).
4. Consider leaf node E. The probability gain is 30 percent (50 percent - 20 percent), which translates to $300 of expected gross profit. Assume that the cost of the actions (changing the service from L to H and the rate from L to H) is $250; then the net profit of moving Alexander from B to E is $50 (300 - 250).

Clearly, the node with the maximum net profit for Alexander is D, which suggests the action of changing the service from L to H.

3. COST MATRIX:

Each attribute value change incurs a cost, and the cost for each attribute is determined by the domain experts. The values of many attributes, such as sex, address and number of children, cannot be changed with any reasonable amount of money. These attributes are called “hard attributes”. For them, the users must assign a very large number to every entry in the cost matrix. Some values can be changed easily at reasonable cost; such attributes, for example the service level, interest rate and promotion packages, are called “soft attributes”. The hard attributes should still be included in the tree-building process, even though they can prevent customers from being moved to other leaves, because many hard attributes are important for accurate probability estimation of the leaves. For continuous-valued attributes, such as an interest rate that varies within a certain range, the numerical ranges should be discretized first for feature transformation.

4. THE LIMITED RESOURCE CASE: POSTPROCESSING DECISION TREES:

4.1 BSP (Bounded Segmentation Problem)

In the previous example, each leaf node of the decision tree is a separate customer group. For each customer group, we have to design actions to increase the net profit.
In practice, however, the company may be limited in its resources. When such limitations occur, all the leaf nodes have to be merged into K segments, which is difficult. A responsible manager can then apply several actions to each segment to increase the overall profit.

Step 1: Build a decision tree with a collection S(m) of source leaf nodes and a collection D(m) of destination leaf nodes.
Step 2: Choose a constant k (k < m), where m is the total number of source leaf nodes.
Step 3: Build a cost matrix over the attribute values u and v.
Step 4: Build a unit benefit vector for the case when a customer belongs to the positive class.
Step 5: Build a set of test cases.

The goal is to find a solution with the maximum net profit by transforming customers that belong to a source node S to a destination node D via a number of attribute-value-changing actions.

GOALS: The goal is to transform a set of source leaf nodes S to a destination leaf node D, denoted S->D.
ACTIONS: In order to make the change, one has to apply attribute-value-changing actions, each denoted by {Attr, u->v}.

Thus, the BSP problem is to find the best K groups of source leaf nodes {Group i, i = 1, 2, …, K}, their corresponding goals and the associated action sets that maximize the total net profit for a given data set Ctest.

Figure: Decision tree for the limited resource example. The root splits on Service: Low leads to a split on Status (A: leaf L1, 0.9; B: leaf L2, 0.2); High leads to a split on Rate (C: leaf L3, 0.8; D: leaf L4, 0.5).

Example: To illustrate the limited resources problem, consider again the decision tree in the above figure. Suppose that we wish to find a single customer segment (k = 1). A candidate group is {L2, L4}, with a selected action set {Service <- H, Rate <- C}, which can transform the group to node L3. Assume that, to move to leaf node L3, L2 changes the service level only and thus has a profit gain of (0.8 - 0.2) * 1 - 0.1 = 0.5, and L4 has a profit gain of (0.8 - 0.5) * 1 - 0.1 = 0.2. Thus, the net benefit for this group is 0.2 + 0.5 = 0.7. As an example of the profit matrix computation, a part of the profit matrix corresponding to the source leaf node L2 is shown in Table 1, where Aset1 = {Status = A}, Aset2 = {Service = H, Rate = C} and Aset3 = {Service = H, Rate = D}. Here, for convenience, we ignore the source values of the attributes, which depend on the actual test cases.


TABLE 1: An example of the profit matrix computation

| Aset1(L2) (Goal = ->L1) | Aset2(L2) (Goal = ->L3) | Aset3(L2) (Goal = ->L4) |
|---|---|---|
| 0.6 | 0.4 | 0.1 |
| … | … | … |
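The entries of Table 1 and the group benefit of the {L2, L4} example can be reproduced with a short Python sketch. It assumes a unit benefit of 1 and a cost of 0.1 per attribute change, which are the values used in the arithmetic of the example. Charging every attribute in an action set when the source values are ignored is an interpretation that happens to reproduce the table, not something the paper states. The names `profit_entry` and `already_satisfied` are hypothetical.

```python
# Loyalty probabilities of the four leaves in the limited resource example.
LEAF_PROB = {"L1": 0.9, "L2": 0.2, "L3": 0.8, "L4": 0.5}
UNIT_BENEFIT = 1.0       # benefit of one customer in the positive class
COST_PER_ACTION = 0.1    # assumed cost of one attribute value change

# Each action set: (attribute changes it applies, destination leaf it targets).
ACTION_SETS = {
    "Aset1": ({"status": "A"}, "L1"),
    "Aset2": ({"service": "H", "rate": "C"}, "L3"),
    "Aset3": ({"service": "H", "rate": "D"}, "L4"),
}


def profit_entry(source, action_set, already_satisfied=0):
    """Net profit of applying an action set to customers in a source leaf.

    already_satisfied counts the attribute values in the action set that the
    customers already have, so no cost is charged for them.
    """
    changes, goal = ACTION_SETS[action_set]
    gain = LEAF_PROB[goal] - LEAF_PROB[source]
    n_actions = len(changes) - already_satisfied
    return gain * UNIT_BENEFIT - n_actions * COST_PER_ACTION


# Row of the profit matrix for source leaf L2, ignoring source values.
row_l2 = {name: round(profit_entry("L2", name), 2) for name in ACTION_SETS}

# Group {L2, L4} moved to L3 with Aset2: each leaf needs only one change.
group_benefit = round(
    profit_entry("L2", "Aset2", already_satisfied=1)
    + profit_entry("L4", "Aset2", already_satisfied=1),
    2,
)
print(row_l2, group_benefit)
```

With these assumptions the L2 row comes out as 0.6, 0.4 and 0.1, and the group benefit as 0.7, matching the figures quoted in the text.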

Then the BSP problem becomes one of picking the best k columns of the profit matrix M such that the sum of the maximum net profit value for each source leaf node among the k columns is maximized. When all Pij elements are of unit cost, this is essentially a maximum coverage problem, which aims at finding k sets such that the total weight of the elements covered is maximized, where the weight of an element is the same for all the sets. A special case of the BSP problem is thus equivalent to the maximum coverage problem with unit costs. Our aim is then to find approximate solutions to the BSP problem.

Algorithm for BSP:
Step 1: Choose any combination of k action sets.
Step 2: Group the leaf nodes into k groups.
Step 3: Evaluate the net benefit of the action sets on the groups.
Step 4: Return the k action sets with the associated leaf nodes.

Since BSP needs to examine every combination of k action sets, its computational complexity is high. To avoid this, we have developed the Greedy BSP algorithm, which can reduce the computational cost and guarantee the quality of the solution. We illustrate the intuition of the Greedy BSP algorithm using an example profit matrix M, shown in Table 2, where we assume a limit of k = 2. In this table, each number is a profit Pij value computed from the input parameters.

TABLE 2: Illustrating the Greedy-BSP algorithm

| Source nodes | Aset1 (goal = ->D1) | Aset2 (goal = ->D2) | Aset3 (goal = ->D3) | Aset4 (goal = ->D4) |
|---|---|---|---|---|
| S1 | 2 | 0 | 1 | 1 |
| S2 | 0 | 1 | 0 | 0 |
| S3 | 0 | 1 | 0 | 0 |
| S4 | 0 | 1 | 0 | 0 |
| Column sum | 2 | 3 | 1 | 1 |
| Selected actions | | X | | |

The Greedy BSP algorithm processes this matrix in sequence for k iterations. In each iteration, it considers adding one additional column of the matrix M, until it has considered all k columns. That is, Greedy BSP considers how to expand the customer groups by one: it determines which additional column increases the total net profit the most when one more column is included.

5. IMPROVING THE ROBUSTNESS USING MULTIPLE TREES:

The advantage of the Greedy BSP algorithm is that it can significantly reduce the computational cost while guaranteeing the high quality of the solution at the same time. In the Greedy BSP algorithm, the decision tree that is built always chooses the most informative attribute as the root node. Therefore, we have also proposed an algorithm, referred to as Greedy BSP Multiple, which is based on integrating an ensemble of decision trees [16], [5], [15]. The basic idea is to construct multiple decision trees using different top-ranked attributes as their root nodes. For each set of test cases, the ensemble of decision trees returns the median net profit and the corresponding leaf nodes and action sets as the final solution. Thus, we expect that when the training data are unstable, the ensemble-based decision tree method performs much more stably than a single decision tree.

Algorithm Greedy BSP Multiple:
Step 1: Given a training data set described by p attributes:
1.1 Calculate gain ratios to rank all the attributes in descending order.
1.2 For i = 1 to p: use the ith attribute as the root node to construct the ith decision tree. End for.
Step 2: Take a set of testing examples as input:
2.1 For i = 1 to p: use the ith decision tree to calculate the net profit by calling algorithm Greedy BSP. End for.
2.2 Return the k action sets corresponding to the median net profit.

Since Greedy BSP Multiple relies on building multiple decision trees to calculate the median net profit, different sampling can only affect the construction of a small portion of the decision trees. Therefore, Greedy BSP Multiple can produce net profit with less variance.
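The greedy column selection and the median rule of Greedy BSP Multiple can be sketched in Python as follows. `TABLE_2` holds the Table 2 values with source nodes as rows and action sets as columns. The two extra matrices stand in for profit matrices from other decision trees and are invented purely for illustration. Using `statistics.median_low` so that the median always corresponds to an actual tree is an assumption of this sketch, as are all function names.

```python
from itertools import combinations
from statistics import median_low


def total_profit(matrix, columns):
    """Each source node (row) takes its best selected column, or no action (0)."""
    return sum(max([0] + [row[c] for c in columns]) for row in matrix)


def greedy_bsp(matrix, k):
    """Add, k times, the column that raises the total net profit the most."""
    selected = []
    n_cols = len(matrix[0])
    for _ in range(min(k, n_cols)):
        remaining = [c for c in range(n_cols) if c not in selected]
        best = max(remaining, key=lambda c: total_profit(matrix, selected + [c]))
        selected.append(best)
    return selected, total_profit(matrix, selected)


def optimal_bsp(matrix, k):
    """Exhaustive BSP: examine every combination of k columns."""
    n_cols = len(matrix[0])
    best = max(combinations(range(n_cols), k),
               key=lambda cols: total_profit(matrix, cols))
    return list(best), total_profit(matrix, best)


def greedy_bsp_multiple(matrices, k):
    """Run Greedy BSP per tree and return the result with the median profit."""
    results = [greedy_bsp(m, k) for m in matrices]
    target = median_low([profit for _, profit in results])
    for tree_index, (columns, profit) in enumerate(results):
        if profit == target:
            return tree_index, columns, profit


# Table 2: rows S1..S4, columns Aset1..Aset4.
TABLE_2 = [
    [2, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
# Hypothetical profit matrices from two other trees (illustrative values only).
OTHER_TREE_1 = [[1, 0], [0, 1], [1, 0]]
OTHER_TREE_2 = [[3, 0, 0], [0, 2, 0], [0, 2, 1]]

print(greedy_bsp(TABLE_2, 2))
print(optimal_bsp(TABLE_2, 2))
print(greedy_bsp_multiple([TABLE_2, OTHER_TREE_1, OTHER_TREE_2], 2))
```

On Table 2 with k = 2, the greedy pass first picks Aset2 (column sum 3) and then Aset1, reaching a total of 5, which is also what the exhaustive search finds on this small matrix. The greedy pass evaluates on the order of k times the number of columns candidate sets, whereas the exhaustive search evaluates every k-column combination.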


6. AACO (ADAPTIVE ANT COLONY OPTIMISATION):

The searching process of ACO is based on positive feedback reinforcement [4][12]. Thus, escaping from local optima is more difficult than in other metaheuristics. Therefore, recognizing the searching status and escaping from local optima are important for improving the search performance of ACO. Regarding the recognition of the searching status, the proposed algorithm utilizes the transition of the distance of the best (shortest) tour. The period in which each ant builds a tour represents one generation. There is a cranky ant, which selects a path that has not been selected and which is the shortest.

ADAPTIVE ANT COLONY ALGORITHM
1. Initialize the parameters α, tmax, t, sx, sy
2. For each agent do
3. Place the agent at a randomly selected site on the grid
4. End for
5. While (not termination) // such as t ≤ tmax
6. For each agent do
7. Compute the agent's fitness t(agent) and activation probability Pa(agent) according to (4) and (7)
8. r <- random([0,1])
9. If r ≤ Pa then
10. Activate the agent and move it to a randomly selected neighboring site not occupied by other agents
11. Else
12. Stay at the current site and sleep
13. End if
14. End for
15. Adaptively update the parameters α, t <- t+1
16. End while
17. Output the locations of the agents

7. DIFFERENCE FROM A PREVIOUS WORK

Machine learning and data mining research has contributed to business practices by addressing some issues in marketing. One issue is that direct marketing is cost sensitive. To solve this problem, an association-rule-based approach is used to differentiate between the positive-class and negative-class members and to use these rules for segmentation.

Collecting customer data [9] and using the data for direct marketing operations have increasingly become possible. One approach is known as database marketing, which creates a bank of information about individual customers from their orders, queries and other activities, and uses it to analyze customer behavior and develop intelligent strategies [10], [13], [6]. Another important computational aspect is to segment a customer group into subgroups. AACO is a new technique that is used in this paper to find the accuracy. All the above research works have aimed at finding a segmentation of the customer database and taking a predefined action for every customer based on that customer's current status. None of them has addressed discovering actions that might be taken from a customer database. In this paper, we have addressed how to extract actions and find the best accuracy for the customer to be loyal.

EXPERIMENTAL EVALUATION

This experimental evaluation shows the entropy value that has been calculated for each parameter.


The best tree is selected from the various experimental results analyzed.

In our example, each action set contains a number of actions, and there are four different action sets with attribute changes. The experiment with Greedy BSP found action sets with the maximum net profit and is more efficient than optimal BSP. We also conducted this experiment with AACO and found that AACO is more accurate than the Greedy BSP algorithm.

REFERENCES

[1] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules,” Proc. Int'l Conf. Very Large Data Bases (VLDB '94), pp. 487-499, Sept. 1994.


[2] Bank Marketing Association, Building a Financial Service Plan: Working Plans for Product and Segment Marketing, Financial Source Book, 1989.
[3] A. Berson, K. Thearling, and S.J. Smith, Building Data Mining Applications for CRM. McGraw-Hill, 1999.
[4] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, 1981.
[5] M.S. Chen, J. Han, and P.S. Yu, “Data Mining: An Overview from a Database Perspective,” IEEE Trans. Knowledge and Data Engineering, 1996.
[6] R.G. Drozdenko and P.D. Drake, Optimal Database Marketing, 2002.
[7] J. Huang and C.X. Ling, “Using AUC and Accuracy in Evaluating Learning Algorithms,” IEEE Trans. Knowledge and Data Engineering, pp. 299-310, 2005.
[8] Lyche, A Guide to Customer Relationship Management, 2001.
[9] H. Mannila, H. Toivonen, and A.I. Verkamo, “Efficient Algorithms for Discovering Association Rules,” Proc. Workshop Knowledge Discovery in Databases.
[10] E.L. Nash, Database Marketing. McGraw-Hill, 1993.
[11] J.R. Quinlan, C4.5: Programs for Machine Learning, 1993.
[12] S. Rajasekaran and Vasantha Kalyani David, Pattern Recognition Using Neural and Functional Networks, 2008.
[13] Rashes and M. Stone, Database Marketing. John Wiley, 1998.
[14] Q. Yang, J. Yin, C.X. Ling, and T. Chen, “Postprocessing Decision Trees to Extract Actionable Knowledge,” Proc. IEEE Conf. Data Mining, pp. 685-688, 2003.
[15] X. Zhang and C.E. Brodley, “Boosting Lazy Decision Trees,” Proc. Conf. Machine Learning (ICML), pp. 178-185, 2003.
[16] Z.H. Zhou, J. Wu, and W. Tang, “Ensembling Neural Network,” IEEE Conf. Data Mining, pp. 585-588, 2003.

P. Senthil Vadivu received her MSc (Computer Science) from Avinashilingam Deemed University in 1999 and completed her M.Phil at Bharathiar University in 2006. She is currently working as the Head, Department of Computer Applications, Hindusthan College of Arts and Science, Coimbatore-28. Her research area is decision trees using neural networks.
