
Expert Systems with Applications 40 (2013) 6561–6569



journal homepage: www.elsevier.com/locate/eswa

A combined mining-based framework for predicting telecommunications customer payment behaviors
Chun-Hao Chen, Rui-Dong Chiang, Terng-Fang Wu, Huan-Chen Chu
Department of Computer Science and Information Engineering, Tamkang University, Taipei 251, Taiwan, ROC

Article info

Keywords:
Late payment prediction system
Association rules
Clustering
Decision trees
Domain-driven data mining

Abstract
Most existing data mining algorithms apply data-driven data mining technologies. The major disadvantage of this approach is that expert analysis is required before the derived information can be used. In this paper, we therefore adopt a domain-driven data mining strategy and use association rules, clustering, and decision trees to analyze data from fixed-line users in order to establish a late payment prediction system, namely the Combined Mining-based Customer Payment Behavior Prediction System (CM-CoP). CM-CoP identifies users who may not pay their fees on time. In the implementation of the proposed system, association rules were first used to analyze customer payment behavior, and the results of the analysis were used to generate derived attributes. Next, a clustering algorithm was used for customer segmentation. The cluster of customers who paid their bills on time was identified and removed to reduce data imbalance. Finally, a decision tree was used to analyze the remaining data and make predictions, using the derived attributes together with the attributes provided by the telecom providers. In the evaluation, the average accuracy of the CM-CoP model was 78.53%, with an average recall of 88.13% and an average gain of 11.2%, over a six-month validation. Since the prediction accuracy of the existing method used by telecom providers was 65.60%, the prediction accuracy of the proposed model was about 13 percentage points higher. In other words, the results indicate that the CM-CoP model is effective and performs better than the existing approach used by the telecom providers.
© 2013 Elsevier Ltd. All rights reserved.

1. Introduction
The telecom market has developed rapidly, and telecom providers have spared no effort to increase their revenue by winning more customers and improving performance. However, they still have to deal with late payments from customers. Most customers pay their bills on time, but some do not, either intentionally or because they forget to make the payment. These two behaviors are collectively called late payments.
There are many types of fraud, and telephone fraud is a common one (Taniguchi, Haft, Hollmen, & Tresp, 1998). The literature indicates that telephone fraud causes losses of two to three billion US dollars each year, and that these losses comprise 1.5% to 5% of total turnover. The traditional monitoring methods used by fixed-line providers identify abnormal situations only after a payment period has expired, by which time the loss has already occurred. Traditional monitoring methods therefore do not satisfy the telecom providers' needs for risk control. A number of scholars have used data mining technologies (without the use of additional equipment or extra loads) to analyze communications and liaison records, and to conduct user profile analysis in order to discover fraudulent behavior (Cahill, Lambert, Pinheiro, & Sun, 2002; Fawcett & Provost, 1997; Hung, Yen, & Wang, 2006; Schommer, 2010; Wheeler & Aitken, 2000; Yan, Wolniewicz, & Dodier, 2004).

Corresponding author.
E-mail addresses: chchen@mail.tku.edu.tw (C.-H. Chen), 081863@mail.tku.edu.tw, chiang@cs.tku.edu.tw (R.-D. Chiang), tfwu945@gmail.com (T.-F. Wu), HuanChen.Chu@gmail.com (H.-C. Chu).
0957-4174/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.eswa.2013.06.001

Late payments may be caused by fraud, habitual delays, or other special reasons (Hung et al., 2006; Schommer, 2010). Fraudulent behavior can cause telecom providers to suffer heavy short-term losses. Although late payments caused by non-fraudulent behavior may not immediately cause significant losses, they may cause other losses such as reduced cash flow, increased labor costs for debt collection (fixed-line providers indicate that some users paid all their fees within two years) and customer churn. Recent literature has discussed fraudulent behavior (Hung et al., 2006; Schommer, 2010), but has seldom paid attention to late payments caused by non-fraudulent behavior. Thus, this study focuses on late payments caused by specific reasons or irregular habits.
The above-mentioned studies employed data-driven data mining technologies. The disadvantage of that approach is that the mined information requires expert analysis before use (Hung et al., 2006; Schommer, 2010). Data-driven data mining technology is useful for information technology personnel but is usually not meaningful for business personnel. Take association rule mining as an example: some of the derived rules only expose common-sense knowledge and may not be interesting from a business point of view. For example, suppose the rule "If milk is bought, Then bread is bought" is derived with high support and confidence. According to the well-known Apriori algorithm this is a reliable rule, and it is therefore technically interesting. However, it may not be valuable for business since it is common-sense knowledge, and it may also mislead decision makers, because the rule "If milk is bought, Then bread is not bought" may also exist if the "not bought" concept is taken into consideration at the same time. Likewise, if the proposed approach considered only technical interestingness, common-sense information might be derived, such as which customers often delay payment but eventually pay, or which customers always pay on time. Recently, Longbing Cao suggested the domain-driven data mining concept (D3M) (Cao, 2010; Cao et al., 2010), which incorporates industry knowledge to mine information that is useful in practice. They emphasize the issues surrounding real-world data mining, and propose a shift from data-centered hidden pattern discovery to domain-driven actionable knowledge discovery (AKD). Here, "actionable" means that the derived knowledge patterns not only give business decision makers solid grounds for taking appropriate actions, but also deliver the expected outcomes to the business. Based on that innovation, several more relevant studies were performed (Du & Ling, 2010; Jin et al., 2010; Mansingh, Osei-Bryson, & Reichgelt, 2011; Marinica & Guillet, 2010; Xu, Lin, & Xu, 2009).
As mentioned in Cao (2010) and Cao et al. (2010), "For many complex enterprise applications, one-scan mining seems unworkable for many reasons. To this end, we propose the Combined Mining based AKD (CM-AKD) framework to progressively extract actionable knowledge." Here, "one-scan mining seems unworkable" means that the problem at hand may require analyzing the data several times (not just once) with various techniques before it can be solved effectively. We believe that predicting telecommunications customer payment behavior is also a highly domain-driven task. In this paper, we thus adopt the CM-AKD framework to analyze data on fixed-line users in order to establish a late payment prediction system, namely the Combined Mining-based Customer Payment Behavior Prediction System (CM-CoP). CM-CoP indicates which users are more likely not to pay their bills on time. In the implementation of the late payment prediction system, this study used association rules to analyze customer payment behavior but not to produce prediction rules, which distinguishes it from the related studies (Fawcett & Provost, 1997; Hung et al., 2006; Schommer, 2010). Instead, the analysis of customer payment behavior is part of data processing, and the payment behavior rules are used to produce derived attributes, so that customer payment behavior can be stored directly in the data. Next, a clustering algorithm is used for customer segmentation, and the cluster of customers who paid their fees punctually is eliminated to reduce data imbalance. Finally, a decision tree is used to construct a prediction model from the rest of the data, using the derived attributes from the association rules and the attributes provided by the telecom providers. Thus, based on the real dataset given by the telecom providers and the proposed framework, service personnel can remind customers who are likely to pay late to pay their fees.
Modeling requires an efficiency evaluation to maintain its predictive power. Generally, when the time frame is too long or policies change, user behavior may change, and the accuracy, recall, and predictive power of the system will be reduced. To avoid this, the system automatically verifies and compares the accuracy rate of each rule when it analyzes the monthly user payment records. If the accuracy rate of a rule is lower than the set threshold value, the system highlights it for review and the providers decide whether to delete it. Next, providers may use new data and the constructed model to produce new rules. If a new rule is verified, the system adds it to the database, thus maintaining the system's predictive power. After the system was built, the provider spent six months conducting verifications. A comparison with the data from providers indicated that, even though the testing environment differed from the conditions at the providers, the efficacy of the rules produced by CM-CoP was greater than that of the existing rules used by telecom providers. Thus, the two main contributions of this work are as follows:
1. Firstly, we adopt a domain-driven data mining strategy and use data mining techniques to analyze data from fixed-line users in order to establish a late payment prediction system, namely the Combined Mining-based Customer Payment Behavior Prediction System (CM-CoP).
2. Secondly, the average accuracy of the CM-CoP model was 78.53%, with an average recall of 88.13% and an average gain of 11.2%, over a six-month validation.
Since this paper focuses on a real-world application in the telecom market, background concepts related to association rules, clustering, and the decision tree method are introduced only briefly. This paper is organized as follows: Section 2 introduces D3M and related work; Section 3 describes the framework of the proposed CM-CoP, including the system flow, data preprocessing, and data mining flow; Section 4 presents cases and verification results; and conclusions are offered in Section 5.
2. Related work
This section introduces related data mining methods and domain-driven data mining concepts and structures. Data mining approaches, including association rules, clustering, and decision trees, are described in Section 2.1. The concepts of domain-driven data mining and relevant existing mining structures are introduced in Section 2.2.
2.1. Related data-mining approaches
Data mining aims to extract useful knowledge and patterns from existing data to solve a specific issue. To date, it has been used in many different fields, such as shopping cart analysis (Agrawal, Imielinski, & Swami, 1993), network intrusion detection (Tajbakhsh, Rahmati, & Mirzaei, 2009), and stock market analysis (Au & Chan, 2003; Hadavandi, Shavandi, & Ghanbari, 2010). One common use is the mining of association rules from transaction data, i.e., an analysis of correlations between products purchased by customers. An association rule is represented as A → B, where A and B are products, and the rule states that if product A is purchased, product B is likely to be purchased together with it. Two measures are used to gauge the validity of association rules: support and confidence. The earliest association rule mining approach was suggested by Agrawal et al. (1993), and its three main steps are: (1) produce candidate itemsets; (2) produce frequent itemsets based on minimum support; and (3) produce association rules from the frequent itemsets based on minimum confidence.
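As a small illustration of the two measures, the sketch below computes support and confidence for a rule A → B over a list of transactions. The basket data and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import FrozenSet, List


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(A and B together) / support(A); 0.0 when A never occurs."""
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


# Hypothetical toy baskets (illustrative only, not data from the paper).
baskets = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "bread", "eggs"}),
    frozenset({"milk"}),
    frozenset({"bread"}),
]

# Rule "milk -> bread": support counts baskets holding both items,
# confidence divides that by the support of "milk" alone.
rule_support = support(frozenset({"milk", "bread"}), baskets)
rule_confidence = confidence(frozenset({"milk"}), frozenset({"bread"}), baskets)
```

A rule would be kept only if both values reach the chosen minimum support and minimum confidence.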
Grouping data according to similarity is known as clustering, and it is an unsupervised mining technique. The k-means method has been widely used (McQueen, 1967). The inputs to k-means are the given set of items and the designated number of clusters k, where k is an integer greater than 1. In k-means, the centroid of each cluster is used as the representative of the cluster. The procedure has four steps: (1) k items from the data set are selected as the initial centroids; (2) each item is assigned to the cluster of its nearest centroid; (3) the centroid of each cluster is recalculated from the items now assigned to it; (4) if every item still belongs to the same cluster as before, the procedure ends; otherwise step (2) is repeated.
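The four steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: squared Euclidean distance, random choice of the initial centroids, and a hypothetical iteration cap `max_iter`. It is not the implementation used in the paper, which relies on an existing tool.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Point, q: Point) -> float:
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points: Sequence[Point]) -> Point:
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points: Sequence[Point], k: int, seed: int = 0,
            max_iter: int = 100) -> Tuple[List[int], List[Point]]:
    rng = random.Random(seed)
    # Step 1: pick k items as the initial centroids.
    centroids = rng.sample(list(points), k)
    assignment = [-1] * len(points)
    for _ in range(max_iter):
        # Step 2: assign every item to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop when no item changed cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: recompute each centroid from its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return assignment, centroids


# Hypothetical toy data: two well-separated groups of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = k_means(toy_points, k=2)
```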
Classification aims to find rules for classifying new data. It is a supervised method: useful information is extracted from data whose classes are already known and serves as the basis for classifying new data. Most classification mining approaches find the rules and organize them into an easy-to-use structure called a classifier. The decision tree (Quinlan, 2003), proposed by Quinlan in 1992, is often used. The decision tree uses the differences in the distributions of the different classes of data as the classification criterion, and thus the classification rules extracted from the decision tree can use these characteristics to classify new data. In addition, the decision tree algorithm finds the classification rules and converts them into a tree structure. With the tree structure, the rules can be applied to new data more rapidly, because the tree itself serves as the classifier produced by the classification mining method.
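One common way to turn "distribution differences between classes" into a split criterion is entropy-based information gain. The sketch below assumes that criterion for illustration only; the text does not state which criterion the tool used in this study applies, and the attribute names and rows are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows: List[Dict[str, str]], attribute: str,
                     label_key: str) -> float:
    """Entropy of the labels minus the weighted entropy after splitting."""
    labels = [r[label_key] for r in rows]
    groups: Dict[str, List[str]] = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[label_key])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_split(rows: List[Dict[str, str]], attributes: Sequence[str],
               label_key: str) -> str:
    """Attribute whose split separates the classes best."""
    return max(attributes, key=lambda a: information_gain(rows, a, label_key))


# Hypothetical rows: "late_before" separates the classes, "plan" does not.
toy_rows = [
    {"late_before": "yes", "plan": "A", "late": "yes"},
    {"late_before": "yes", "plan": "B", "late": "yes"},
    {"late_before": "no", "plan": "A", "late": "no"},
    {"late_before": "no", "plan": "B", "late": "no"},
]
root_attribute = best_split(toy_rows, ["late_before", "plan"], "late")
```

A tree builder would apply `best_split` recursively to each resulting group until the groups are pure or no attributes remain.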
2.2. Domain-driven data mining concepts
Domain-driven data mining (D3M) (Cao, 2010; Cao et al., 2010) was proposed by Longbing Cao at the University of Technology in Sydney, Australia. D3M is defined by the following five items.
(1) Based on meta knowledge, a heuristic method is used for continuous testing to solve problems.
(2) The main objective is actionable knowledge discovery (AKD).
(3) Both technical and business objectives are conditions of interest.
(4) An actual complex enterprise application is a prerequisite.
(5) An actual business application is a post-condition.
Based on the above definition, Longbing Cao further described four frameworks for the logical concept of D3M: (1) post-analysis-based AKD (PA-AKD) (Du & Ling, 2010; Marinica & Guillet, 2010; Mansingh et al., 2011; Xu et al., 2009); (2) unified interestingness-based AKD (UI-AKD) (Jin et al., 2010; Sim, Indrawan, Zutshi, & Srinivasan, 2010); (3) combined mining-based AKD (CM-AKD) (He, Zhang, Shi, & Huang, 2010; Pradeep, Krishna, Illapu, Kumar, & Koyi, 2010); and (4) multisource combined-mining-based AKD (MSCM-AKD) (Xiang, Cao, Hu, & Yang, 2010). The common central concept of the four frameworks is actionable knowledge discovery (AKD).
The PA-AKD framework has two steps. First, general patterns are found in the data, and then meta knowledge and domain-specific business interestingness are used to extract operable business rules. The UI-AKD framework uses meta knowledge and domain knowledge to develop a mining technique that derives actionable knowledge patterns directly. The CM-AKD framework focuses on using different mining approaches, finding different patterns, and using these patterns for feature construction or comparison. Finally, a pattern merger method is combined with domain knowledge to aggregate the final mining results. The MSCM-AKD framework is the most complex of the four, and its main purpose is to extend the CM-AKD framework to a larger number of different data sources. For this reason, there is not much literature on MSCM-AKD. In this paper, the CM-AKD framework is used to establish the customer payment behavior prediction system, which can be formalized as Eq. (1):


where $t_{i,j}$ and $b_{i,j}$ are the technical and business interestingness of model $m_j$, $[i_{i,j}()]$ indicates the alternative checking of unified interestingness, $\cup_J P_j$ is the merger function, and $X_m$ is the meta-knowledge consisting of meta-data about patterns, features and their relationships.
In other words, CM-AKD consists of multiple steps of pattern extraction and refinement on the whole dataset. The task is first split into $J$ mining steps based on business understanding, data understanding, exploratory analysis and goal definition. Then, each step $j$ extracts a pattern subset $P_j$ based on technical significance ($t_i()$). The pattern subset $P_j$ is then fed into step $j + 1$ to guide the corresponding feature construction and the mining of pattern set $P_{j+1}$. The derived pattern subsets are then merged into a final pattern set ($P$) based on the environment ($e$), domain knowledge ($X_m$) and business expectations ($b_i$). Finally, the merged pattern set $P$ is converted into business rules as the final deliverables that reflect business preferences and needs. Based on the CM-AKD framework, the details of the proposed customer payment behavior prediction system are described in the next section. Note that CM-AKD is one of the domain-driven data mining frameworks proposed by Cao et al. for mining actionable knowledge rules (patterns). Based on the CM-AKD framework, we propose the CM-CoP framework for predicting telecommunications customer payment behavior. The domain-driven data mining strategy focuses on how to take objective and subjective interestingness, in terms of both technical and business goals, into consideration when deriving actionable rules (patterns). The proposed domain-driven data mining strategy relates to the five items as follows. Firstly, based on the CM-AKD framework, we propose the Combined Mining-based Customer Payment Behavior Prediction framework (CM-CoP), which uses heuristic methods (Step-1 to Step-3 mining, see Fig. 1) for continuous testing to solve problems. Secondly, the main objective of CM-CoP is to use customer payment and communication behaviors to predict which users might not pay their bills. Thirdly, in the proposed approach, we take the attributes provided by the telecom providers (as business interestingness) and the rules produced by association rule mining (as technical interestingness) into consideration to achieve more accurate results. As for the last two items, telecommunications customer payment behavior prediction is a complex enterprise application, as mentioned in the previous section, and after verification by the provider, the accuracy of the proposed model was higher than that of the existing model.
3. The proposed predicting customer payment behavior system
This section introduces the structure of the method. First, we describe the proposed CM-CoP system framework, which follows the CM-AKD framework. Next, the proposed customer payment behavior prediction algorithm is described, including using association rules to produce payment behavior patterns for analysis and derived attributes, using clustering to reduce data imbalance, and combining the derived attributes with the attributes provided by the telecom providers to construct decision trees for predicting user payment behavior patterns.
3.1. CM-CoP system framework
Based on the CM-AKD framework, we propose a system, namely the Combined Mining-based Customer Payment Behavior Prediction framework (CM-CoP), which combines data mining techniques, including association rules, clustering, and decision trees, with industry knowledge provided by fixed-line providers to analyze fixed-line user data. The purpose of CM-CoP is to use customer payment and communication behaviors to predict which users might not pay their bills. The proposed CM-CoP framework is shown in Fig. 1.

[Figure 1 (diagram). Domain-driven mining phase: CDR/BASE Database; Payment Records; ETL Transformation; Extracted DB; Step-1 Mining: Association Mining (Association Patterns); Step-2 Mining: Clustering Analysis (Desired Clustering); Step-3 Mining: Decision Trees (Decision Tree 1, Decision Tree 2, ..., Decision Tree n); Pattern Merger; High Precision Rule Set; Actionable Rule Set (Model); Meta Knowledge; Domain Knowledge. Model Tuning phase: Telecom Provider; New Payment Records; Late Payment Customers Predicting; Late Payment Customers List; Experts Validation; Validated Rules (R1: X1 → Y1, R2: X2 → Y2, ..., Rn: Xn → Yn); Actionable Rule Set (Model) updating.]
Fig. 1. Combined Mining-based Customer Payment Behavior Prediction Framework (CM-CoP).

The overall system execution flow is shown in Fig. 1, and it includes two parts: (1) the domain-driven mining phase and (2) the model tuning phase. In the first part, ETL (extraction, transformation, loading) is used to derive the CDR historical data. Next, this study uses association rules to analyze user billing data based on the telecom providers' practices in order to create a behavioral model of potential late-paying users. From the derived rules, in combination with the professional knowledge of the providers, the derived attributes describing payment behavior are established. The clustering technique is then used to derive the desired groups with business interestingness. Finally, decision tree algorithms are used to analyze the data using the various attributes, and the derived rules are stored in a database for validation.
In the second part, after the model is constructed, its efficiency has to be evaluated to maintain its predictive power. Generally, when the time frame is too long or policies change, user behavior may change, and the accuracy and recall of the system's predictions will decrease. The system's design must take this into account. The system automatically verifies and compares the accuracy rate of each rule when it retrieves the monthly user payment records. If the accuracy rate of a rule is lower than the set threshold value, the system highlights the rule for the providers to review. In addition, the providers can use new data to create rules from the constructed model, and if a new rule is verified, the system adds it to the database, thus maintaining predictive power.
3.2. The proposed domain-driven mining approach
In this subsection, based on the proposed CM-CoP framework, the CM-CoP algorithm for predicting customer payment behavior is proposed. The details of the algorithm are given in Table 1.
As Table 1 shows, the proposed CM-CoP algorithm can be divided into four parts: association pattern mining (lines 2–3), clustering analysis (lines 4–5), decision tree mining (lines 6–11) and rule evaluation (lines 12–16). In the first part, the preprocessed payment records are used to derive association patterns that represent customer behavior (see Section 3.3 for more details). Then, since only a few customers pay their bills late, the proportion between customers who pay late and customers who pay on time is imbalanced. Thus, the data imbalance should be taken into consideration before building the model. Here, after consulting the experts of the telecom providers to derive appropriate attributes, the clustering technique is used to divide customers into groups. Those groups that can be identified as containing customers who pay their bills on time are then removed. The remaining groups are the customers that we need to focus on. However, it is not an easy task to find general actionable rules for predicting customers' behavior, and directly using the rules produced by association rule mining cannot solve the problem efficiently. To overcome this issue, we select different sets of attributes from the derived association patterns (as technical interestingness) and the consulted attributes (as business interestingness) for constructing decision trees. Each rule in the decision trees is then evaluated by the telecom providers to enhance its predictive ability. Finally, the verified rules are collected as the operable business rule set (see Section 3.4 for more details).

Table 1
The proposed CM-CoP algorithm.
Algorithm: CM-CoP algorithm
Input: A CDR/BASE dataset CDR, a set of payment records PR, business problem w, minimum support a, minimum confidence k, meta knowledge Xm, domain knowledge Xd.
Output: The operable business rule set R′.
Procedure CM-CoP(){
(1) AKD is split into 3 steps of mining;
(2) PR′ ← dataPreprocessing(PR, Xm);
(3) associationPattern ← Step-I AssociationMining(PR′, a, k);
(4) CDR′ ← dataPreprocessing(CDR, Xm);
(5) desiredCluster ← Step-II ClusteringAnalysis(CDR′);
(6) attributeSet ← featureSelection(CDR′);
(7) For each subset subAttributes of attributeSet
(8)    decisionTree ← Step-III MiningDecisionTree
(9)       (associationPattern, desiredCluster, subAttributes);
(10)   decisionTreeSet ← decisionTreeSet ∪ decisionTree;
(11) End For
(12) For each rule Ri in decisionTreeSet
(13)   If evaluationFunction(Ri, Xm, Xd) == true
(14)      R′ ← R′ ∪ Ri;
(15)   End If
(16) End For
(17) Output R′
}

Table 2
The operable business rule tuning procedure.
Procedure: The operable business rule tuning procedure
Input: The operable business rule set R′, a set of newly arrived payment records newPR, a CDR/BASE dataset CDR, meta knowledge Xm, domain knowledge Xd, an accuracy threshold k.
Output: The tuned operable business rule set R′.
Procedure RuleTuning(){
(1) R′ ← R′ ∪ CM-CoP(CDR, newPR, Xm, Xd);
(2) For each rule Ri in R′
(3)    If expertValidation(Ri, k, Xm, Xd) == false
(4)       Remove Ri from R′;
(5)    End If
(6) End For
(7) Output the tuned operable business rule set R′;
}

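The cluster-removal step described in this subsection (groups identified as on-time payers are dropped before the decision trees are built) could be sketched as follows. The field names `customer_id` and `paid_on_time`, the mapping `cluster_of`, and the cut-off `max_on_time_rate` are all hypothetical; the text does not say how a group is judged to consist of on-time payers, so a simple on-time share threshold is assumed here.

```python
from collections import defaultdict
from typing import Dict, List


def drop_punctual_clusters(records: List[dict], cluster_of: Dict[str, int],
                           max_on_time_rate: float = 0.95) -> List[dict]:
    """Keep only records from clusters that are not almost entirely on-time.

    The threshold rule is an assumption made for this sketch.
    """
    by_cluster: Dict[int, List[dict]] = defaultdict(list)
    for rec in records:
        by_cluster[cluster_of[rec["customer_id"]]].append(rec)

    kept: List[dict] = []
    for members in by_cluster.values():
        on_time_rate = sum(1 for r in members if r["paid_on_time"]) / len(members)
        if on_time_rate <= max_on_time_rate:
            kept.extend(members)
    return kept


# Hypothetical toy data: cluster 0 is all on-time, cluster 1 is mixed.
toy_records = [
    {"customer_id": "a", "paid_on_time": True},
    {"customer_id": "b", "paid_on_time": True},
    {"customer_id": "c", "paid_on_time": True},
    {"customer_id": "d", "paid_on_time": True},
    {"customer_id": "e", "paid_on_time": False},
    {"customer_id": "f", "paid_on_time": True},
]
toy_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
remaining = drop_punctual_clusters(toy_records, toy_clusters)
```

Removing the punctual group raises the share of late payers in the remaining data (here from 1 in 6 to 1 in 3), which is the imbalance reduction the text describes.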
Furthermore, modeling requires an efficiency evaluation to maintain its predictive power. Generally, when the time frame is too long or policies change, user behavior may change and the predictive power of the system will be reduced. To avoid this, the system automatically verifies and compares the accuracy rate of each rule when it analyzes the monthly user payment records, using the operable business rule tuning procedure shown in Table 2.
As shown in Table 2, the new payment records are first used to generate new operable business rules (line 1). Then, for each rule in the operable business rule set R′, experts verify its predictive power. If its predictive power is lower than a threshold, the rule is removed from the operable business rule set (lines 2–6). Since we focus on analyzing real data to predict the customers who may make delayed payments, the goal of this paper is to design the CM-CoP framework (Fig. 1) and its algorithms (Tables 1 and 2). The individual data mining algorithms are therefore carried out with the existing tools in IBM Intelligent Miner.
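The tuning loop of Table 2 could look roughly like the sketch below. It makes two assumptions that go beyond the text: the expert validation step is replaced by an automatic accuracy check against the threshold, and a rule's accuracy is taken to be the share of customers it flags who actually paid late. The `Rule` class and the record fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, bool]


@dataclass
class Rule:
    name: str
    predicts_late: Callable[[Record], bool]


def rule_accuracy(rule: Rule, records: List[Record]) -> float:
    """Share of flagged customers who really paid late (assumed definition)."""
    flagged = [r for r in records if rule.predicts_late(r)]
    if not flagged:
        return 0.0
    return sum(1 for r in flagged if r["late"]) / len(flagged)


def tune_rules(rules: List[Rule], new_records: List[Record],
               threshold: float) -> Tuple[List[Rule], List[Rule]]:
    """Split rules into those kept and those highlighted for review."""
    kept: List[Rule] = []
    for_review: List[Rule] = []
    for rule in rules:
        if rule_accuracy(rule, new_records) >= threshold:
            kept.append(rule)
        else:
            for_review.append(rule)
    return kept, for_review


# Hypothetical monthly payment records and two illustrative rules.
monthly_records = [
    {"late_before": True, "late": True},
    {"late_before": True, "late": True},
    {"late_before": False, "late": False},
    {"late_before": False, "late": False},
]
candidate_rules = [
    Rule("flag_everyone", lambda r: True),
    Rule("late_before", lambda r: r["late_before"]),
]
kept_rules, review_rules = tune_rules(candidate_rules, monthly_records, 0.7)
```

In the procedure described in the text the final decision to delete a rule rests with the providers, so `review_rules` would be handed to them rather than dropped automatically.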
3.3. Payment behavior pattern analysis and derived attributes
The user communication characteristics consist of CDR information. Users' telephone usage habits are expressed by the start time, the number of users to make the call, the sum, duration, call type, call variation information, and other statistical data. As specific fraudulent behavior may follow a fixed pattern, most fraudulent behavior can be found by examining the CDR data. In combination with the professional knowledge of the providers and user data, fraudulent behavior can then be determined. Normal users also sometimes delay payment, because of special reasons or habitual delays. Although these delays differ from fraudulent behavior, they also follow fixed behavior patterns. To identify late-paying users, this behavior must be compared with customer payment records. With the derived attributes (rules of the form X → no) of user payment and call behavior patterns, obtained from the CDR data in discussion with the telecom providers, the proposed approach can be used to predict whether a user will default on a debt. As shown in Fig. 1, the system requires data on CDR, the customer base, and customer payment status. The attributes of the important data for user payment status are shown in Table 3.
As shown in Table 3, when a user's Payment Status is "00", the user paid the bill on time; when it is "01", the user failed to pay the bill on time; and when it is "02", the user never paid the bill. Meanwhile, there is a delay of 35 days between the time the telephone is used and the time by which the system needs to predict whether a user may default on their debt for more than ten days. Data acquisition and prediction can be completed in 20 days. Service personnel can then remind customers who may make delayed payments to pay the fee. Based on the definition used by telecom providers, there are six billing cycles. The data provided by the telecom providers were taken from customer data in one area and one payment cycle. One billing cycle is listed in Table 4. Differences in regions and billing cycle time points are not considered.
Assume that it is early in the seventh month and that we need to relate payment behavior in the sixth month to payment habits in the first to fifth months. Since the customer payment status for the fifth month is still unknown early in the seventh month, the payment records for the fifth month are not used in the analysis of the customer's payment pattern. Thus, only the relation between user payment behavior from the first to fourth month and late payment behavior in the sixth month can be described.
Next, for the analysis of user payment behavior patterns, the payment records for the first to sixth month are summarized in the payment status table, as shown in Table 4, and the data for all of the months are aggregated to the original payment status. Attention is paid to whether a payment is more than ten days overdue, not to whether the user eventually paid the bill. When the original payment status for the sixth month in Table 4 is "00", the new payment status for the sixth month is "yes"; otherwise it is "no". "Yes" indicates a timely payment, and "no" indicates delinquency. For example, in Table 4, the user payment status for the sixth month is changed from "02" to the new payment status "no". The payment status for the fourth month is changed from "01" to the new payment status "01C", where C represents the payment status of the fourth month. Because each new status carries its month letter, the same status occurring in different months yields distinct items, which overcomes the limitation on repeated occurrences of the same item in the association rule analysis.
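The recoding described above can be sketched as a small function. The month letters (C for the 4th month through F for the 1st) and the yes/no target follow the text and Table 4; the function name, the dictionary layout, and the variable names are hypothetical.

```python
from typing import Dict, List, Tuple

# Month letters as given in Table 4: C = 4th month, D = 3rd, E = 2nd, F = 1st.
MONTH_SUFFIX = {4: "C", 3: "D", 2: "E", 1: "F"}


def recode(statuses: Dict[int, str]) -> Tuple[List[str], str]:
    """Turn one customer's original statuses into a transaction and a target.

    `statuses` maps month number to the original status ("00", "01" or "02").
    The 5th month is skipped because its status is not yet known when the
    prediction is made; the 6th month only supplies the yes/no target.
    """
    target = "yes" if statuses[6] == "00" else "no"
    items = [statuses[m] + MONTH_SUFFIX[m] for m in (4, 3, 2, 1) if m in statuses]
    return items, target


# Customer 001 from Table 4.
customer_001 = {6: "02", 4: "01", 3: "02", 2: "00", 1: "02"}
items_001, target_001 = recode(customer_001)
```

Because "02D" and "02F" are different items, the two unpaid months of this customer remain distinguishable within a single transaction.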
During the analysis using association rules, the payment status
of each customer during the period from the first to fourth month


Table 3
Data format of original CDR payment situations.

Attribute name    Description
Amount            Total amount of the bill, which includes communication fees for seven days, excluding international calls
Installed date    Installation date
Billing cycle     Billing cycle for each month, given by cycle number, billing period and payment deadline
                  Billing period: from the 1st day of the previous month to the end of the previous month
                  Payment deadline: the 25th of the current month
Payment status    Payment status of the current month
                  00: paid punctually; 01: paid, 10 days overdue; 02: not paid

Table 4
New payment status.

Customer ID    Bill month    Original payment status    New payment status
001            06            02                         No
001            04            01                         01C
001            03            02                         02D
001            02            00                         00E
001            01            02                         02F

C: the 4th month; D: the 3rd month; E: the 2nd month; F: the 1st month.

is regarded as one transaction, and association rule mining is used to
find rules of the form X → no with a confidence greater than 50%, since the goal of the proposed approach
is to predict late-paying users. X refers to a subset of the payment
statuses for the past four months, and is also the user payment model
to be found. In the subsequent data preprocessing, all occurrences
of X are regarded as the derived attributes related to payment
behavior. This study uses ETL (extract, transform, load) to set the derived attributes. Finally,
a decision tree is utilized to analyze the training data. To avoid
irrelevant attributes in the final result and to increase rule comprehensibility, payment behavior-related derived attributes are used
as substitutes for the original payment status from the first to
fourth month.
From the payment behavior pattern analysis, a total of 19
meaningful rules are produced, and 19 derived attributes are set.
An example using one of the 19 rules is shown as follows:


{01D, 01C} → no (support = 0.39%, confidence = 60.18%, lift = 7.19)


It can be seen that the probability of late payment in
the sixth month is 60.18% when the user made late payments in both the third
and fourth months. Put differently, if the user paid late both two and three
months before the analyzed month, the probability of a late payment
in the analyzed month is 60.18%. During data preprocessing, the derived
attribute corresponding to a rule is set
to "yes" for users who match the rule's pattern, and to "no" otherwise. For long-term users, the advantage of this approach is that it can combine payment behavior patterns from the past two to five months with call behavior to
increase prediction accuracy. This method is also applied to customers who have insufficient historical payment records. As long
as the payment records of new customers satisfy any rule, the system can utilize the payment behavior patterns of new customers to
increase the accuracy rate.
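For reference, the three measures reported with the rule (support, confidence and lift) can be computed as in the following sketch. The function `rule_metrics` and the eight toy transactions are hypothetical and only illustrate the standard definitions; they are not the provider's data and do not reproduce the reported 0.39%, 60.18% and 7.19.

```python
def rule_metrics(transactions, antecedent, consequent="no"):
    """Return (support, confidence, lift) of the rule antecedent -> consequent.

    transactions is a list of (item_set, label) pairs, one per customer.
    """
    n = len(transactions)
    # Labels of all customers whose items contain the whole antecedent X.
    matched = [label for items, label in transactions if antecedent <= items]
    both = sum(1 for label in matched if label == consequent)
    base = sum(1 for _, label in transactions if label == consequent)
    support = both / n                                   # P(X and no)
    confidence = both / len(matched) if matched else 0.0  # P(no | X)
    lift = confidence / (base / n) if base else 0.0      # P(no | X) / P(no)
    return support, confidence, lift


# Hypothetical toy data: eight customers, three of them late ("no").
toy = [
    ({"01C", "01D"}, "no"),
    ({"01C", "01D"}, "no"),
    ({"01C", "01D"}, "yes"),
    ({"00C", "00D"}, "yes"),
    ({"00C", "01D"}, "yes"),
    ({"00C", "00D"}, "no"),
    ({"00C", "00D"}, "yes"),
    ({"00C", "00D"}, "yes"),
]
support, confidence, lift = rule_metrics(toy, {"01D", "01C"})
print(support, confidence, lift)
```

A lift above 1 means that customers matching X are late more often than customers in general, which is why such rules are useful as derived attributes.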
3.4. Data mining implementation process
Late-paying customers comprise 7% of all customers. If a decision tree is used directly for analysis, prediction is
difficult because of this small percentage of late-paying customers.
In fact, if a decision tree is used directly, the resulting tree has a
single node, the root node, which classifies every customer as normal, so the prediction accuracy for normal
customers is 93%. This accuracy is high, but it fails to meet the
model prediction requirements, since the model prediction goal is
to find customers who may make delayed payments. As shown
in Fig. 2, clustering and decision trees are used to perform data
mining in the second phase to analyze user behavior and find target customers.

Fig. 2. Data mining implementation flow (Start; select data depending on billing cycle from the CDR/BASE database; clustering; remove customers who paid punctually; decision trees 1 to n; select high-accuracy rules; rules in rule base; End).

In the first phase, clustering is used to group customers and to
find the cluster of normal customers. Then, the decision tree is utilized to create rules from those clusters. The behavior of normal
customers does not change with time. After the behavior rules
for normal customers are found, they are then used on all the window data arrays. In addition, some fraudulent behavior or default
behavior can be found by using the clustering algorithm in this
phase. For example, an analysis of the pay-per-call usage cluster
shows that some users make numerous pay-per-calls over a short time. Since this type of customer often cannot
pay their bills, if these customers had similar behavior in the past,
their payment behavior will be used to identify whether the users
usually paid the fee normally. If customers don't have similar
behavior in the past, they will be included in the forecast list.
In the second phase, the behavior rules for normal customers from
the first phase are used to eliminate the customers who satisfy
the rules, and the percentage of default customers in the data
therefore increases. In this phase, a decision tree is built to
predict and analyze the rest of the data. However,
there are many different kinds of default behavior. Customers
signed up on different dates, and some new customers have
no historical data. In order to find the customers who meet the
objective, different attributes are selected for analysis to produce
different decision trees and rules. After verification, the rules that
are more than 80% accurate are selected and stored in an SQL database in order to find the target customers.
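The rule selection step can be illustrated with a short sketch. It assumes that "accuracy" of a rule means the share of flagged customers who really paid late, and it uses hypothetical names (`rule_accuracy`, `select_rules`) and toy verification records; the storage of the selected rules in an SQL database is not shown.

```python
def rule_accuracy(rule, records):
    """Share of customers flagged by `rule` who actually paid late."""
    flagged = [r for r in records if rule(r)]
    if not flagged:
        return 0.0
    return sum(1 for r in flagged if r["late"]) / len(flagged)


def select_rules(rules, records, threshold=0.80):
    """Keep only the rules whose accuracy on `records` exceeds the threshold."""
    return {
        name: rule
        for name, rule in rules.items()
        if rule_accuracy(rule, records) > threshold
    }


# Hypothetical verification records and candidate rules.
records = [
    {"TOTALAMOUNT": 500, "late": True},
    {"TOTALAMOUNT": 450, "late": True},
    {"TOTALAMOUNT": 400, "late": True},
    {"TOTALAMOUNT": 350, "late": True},
    {"TOTALAMOUNT": 320, "late": True},
    {"TOTALAMOUNT": 310, "late": False},
    {"TOTALAMOUNT": 100, "late": False},
    {"TOTALAMOUNT": 50, "late": True},
]
rules = {
    "high_amount": lambda r: r["TOTALAMOUNT"] > 300,  # 5 of 6 flagged are late
    "any_usage": lambda r: r["TOTALAMOUNT"] > 0,      # 6 of 8 flagged are late
}
selected = select_rules(rules, records)
print(list(selected))
```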
4. Experimental results
This study used data from fixed-line providers in Taiwan for
case discussions. The fixed-line providers provided the
CDR and payment data of customers from one base station, area,
and payment cycle for a period of twelve months. The data was
then used for model construction. According to statistics, 7% of
the users still hadn't paid their bills more than ten days after they
were overdue. Due to the confidentiality of the service agreements,
this study only discusses the data mining process in the second
phase as well as some verification results.
4.1. Training data and testing data
Since late payment behavior may change over time, the predictive power of the model may decrease. Thus, in order to overcome this problem and prevent the model from depending on historical data, the system is operated so that the datasets of the model cover different time intervals. The time window concept is used to set the datasets for the model construction (including the training data and testing data). The fixed-line provider offered twelve months of CDR data and payment data for model construction. The data was divided into six sections according to the sequence of months, and the model was in turn constructed to determine the useful rules, so that different datasets for the model would be able to cover different time intervals. In other words, the sliding window size was set at six, and since the accuracy of the last month could not be evaluated, six datasets were then generated for model construction. During each time interval, the customer data in the sixth month was used as the training data, and the data from the first month to the fifth month were used as historical data to construct the prediction model. The data from the following months were used as testing data to verify the derived rules. Since the derived rules would become ineffective over time, less accurate rules are deleted and new rules are generated during each system test in order to prevent the model from depending on historical data.
There are many reasons that customers will pay their bills late. Some fraudulent behaviors may cause heavy short-term losses to providers. We thus expect that the system can analyze and predict user behavior through a small amount of CDR. Based on the studied experiences of the past three months and users who made habitual late payments, we also found that fraudulent behavior could be identified based on the behavioral difference between the current week and past weeks, and between the current week and the past several months. To prevent special fraudulent behavior from being diluted by other behavior, the training data from each month was subdivided into different datasets for one week, two weeks, three weeks and one month. Ten datasets in total were then generated: four, three, two, and one datasets for one week, two weeks, three weeks, and one month, respectively. Thus, a stable model that adapts easily over time was established, and the effectiveness of the data mining was increased.
4.2. Case discussion
In the proposed system, this study used the function fields and supplemental field shown in Table 5 to cluster and describe user behavior in the first phase, in order to find the cluster for normal customers. Next, a decision tree was used to analyze the derived clusters.
After clustering, about thirty clusters were derived. However, most of them contain only a small share of users (less than 2% of all users). The two representative clusters, namely cluster [6]6 and cluster [4]3, which account for 55.92% and 31.49% of all users, respectively, were used for further analysis. The clustering results are shown in Fig. 3.
From Fig. 3, this study checked the supplemental field of the two clusters, i.e. the PAYSTATUS distribution. Cluster [4]3 has 5644 instances. According to PAYSTATUS, the percentage of late payment users in cluster [4]3 (approximately 22.57% (=1274/5644)) was higher than the percentage of total late payment users, so this cluster does not meet the goal of finding normal customers. The number of late payment users in cluster [6]6 was 82, comprising 0.818% of the users in the cluster, which was lower than the percentage of total late-paying users.

Table 5
Important related fields.

Field name        Definition                                                     Function
MAXAMOUNT         Maximum amount for making one call in the current week         Function field
TOTALAMOUNT       Total amount for making calls in the current week              Function field
STDAMOUNT         Standard deviation of the call amount for the current week     Function field
AVGAMOUNT         Average amount of each call in the current week                Function field
NUMBERCOUNT       The number used to make calls in the current week              Function field
PAYTYPE           Payment type                                                   Function field
PAYLASTMONTH      Pay-per-call in last month                                     Function field
NUMBERCLRS3       Number of different calls for the current week                 Function field
TOTALDURITIONS    Total call time for the current week                           Function field
FIRSTCALL         Pay-per-call in the first usage                                Function field
PAYSTATUS         Timely payment                                                 Supplemental field

Fig. 3. The clustering results of cluster [6]6 and cluster [4]3.

With reference to the distribution of the three function fields for
user call behavior in the representative cluster, TOTALAMOUNT,
MAXAMOUNT and AVGAMOUNT, the call amounts of the users in cluster [6]6 were compared with those of all users. It was found that users whose
telecom fee was lower than NT$200 each month accounted for 97%
of the users in the cluster, and users whose maximum amount and average
amount for each call were lower than NT$10 comprised 80%. These percentages were higher
than those for all users. When the call amount was higher, the percentage of users in the cluster was lower than the percentage of the
total users. Thus, it can be deduced that the users in cluster [6]6
were normal users, and this cluster can therefore be excluded from
the late payment analysis. After customers with the characteristics of
cluster [6]6 are eliminated, the percentage of late-paying customers increases. In fact, it then comprised 15% of the remaining users.
For users in cluster [4]3, the percentage of users whose bill
was lower than NT$200 each month, and the percentage of users
whose maximum and average amounts per call were lower than
NT$10, were both lower than the percentage for all users. When
the telephone fee was high, the percentage of users in the cluster
was higher than the percentage for all users. The users in cluster
[4]3 displayed behavior opposite to that of users in cluster
[6]6, so the users in cluster [4]3 were retained for further
analysis.
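The late payment rates used to separate the two clusters can be checked with a small sketch. The counts 1274 and 5644, the reported 0.818% and the overall 7% are taken from the text; the helper names are hypothetical, and the sketch compares late rates only, whereas the study also inspects the call amount distributions before excluding a cluster.

```python
OVERALL_LATE_RATE = 0.07  # share of late-paying customers reported in the text


def late_rate(late_users, cluster_size):
    """Fraction of users in a cluster who paid late."""
    return late_users / cluster_size


def is_normal_cluster(rate, baseline=OVERALL_LATE_RATE):
    """Treat a cluster as "normal" when its late rate is below the baseline."""
    return rate < baseline


rate_4_3 = late_rate(1274, 5644)  # cluster [4]3, counts from the text
rate_6_6 = 0.00818                # cluster [6]6, percentage as reported

print(f"{rate_4_3:.2%}", is_normal_cluster(rate_4_3))
print(f"{rate_6_6:.3%}", is_normal_cluster(rate_6_6))
```

Under this simple criterion, cluster [6]6 is treated as normal and removed, while cluster [4]3 is kept for the decision tree in the second phase.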
In the second phase, a decision tree was directly used to predict
and analyze the rest of the data. In this study, ten useful rules were
derived. Two rules are described as follows:

IF PAY_PATTERN_04 = Y and TOTALAMOUNT > 300 THEN PAY = N    (1)
In rule (1), PAY_PATTERN_04 = Y indicates that the user did not
pay their bill in the fourth month. Since the threshold value of the
rule is set to 80%, we can say that, according to rule (1), if the user
has not paid their bill for the fourth month and the total call
amount of the current week exceeds NT$ 300, the probability of
late payment is greater than 80%.

IF USEDMONTH < 6.5 and MAXAMOUNT > 697.5 and TOTALAMOUNT > 5397.5 THEN PAY = N    (2)
According to rule (2), if a new customer has used the service for
less than 6.5 months and the price per call for the current week is
greater than or equal to NT$657.5, and the total call amount for the
week is greater than NT$5,397.5, the probability of default for the
new customer exceeds 80%.
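As an illustration, rules (1) and (2) can be written as simple predicates over a customer record. The dictionary-based record format, the function names and the example customers are assumptions made for this sketch; the field names and thresholds are copied from the displayed rules.

```python
def rule_1(customer):
    """Rule (1): late-payment pattern in the fourth month and high weekly amount."""
    return (
        customer.get("PAY_PATTERN_04") == "Y"
        and customer.get("TOTALAMOUNT", 0) > 300
    )


def rule_2(customer):
    """Rule (2): new customer with an expensive single call and high weekly amount."""
    return (
        customer.get("USEDMONTH", float("inf")) < 6.5
        and customer.get("MAXAMOUNT", 0) > 697.5
        and customer.get("TOTALAMOUNT", 0) > 5397.5
    )


def predict_late(customer):
    """PAY = N (predicted late payment) if either rule fires."""
    return rule_1(customer) or rule_2(customer)


# Hypothetical customers, for illustration only.
long_term = {"PAY_PATTERN_04": "Y", "TOTALAMOUNT": 420, "USEDMONTH": 36, "MAXAMOUNT": 15}
new_heavy = {"PAY_PATTERN_04": "N", "TOTALAMOUNT": 6000, "USEDMONTH": 3, "MAXAMOUNT": 900}
low_usage = {"PAY_PATTERN_04": "N", "TOTALAMOUNT": 120, "USEDMONTH": 48, "MAXAMOUNT": 8}

print(predict_late(long_term), predict_late(new_heavy), predict_late(low_usage))
```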
4.3. Verification results
The last item in this study is a verification conducted by the
telecom provider over six months. The provider investigated users
whose monthly telephone fees exceeded NT$2,000 and compared
their data with the prediction results of the system. The statistical
results are shown in Table 6. During the comparison, the subjects
investigated by the provider were considered to be the population;
therefore, the recall of the provider was not calculated. The evaluation criteria in Table 6 are calculated
using Eqs. (2)-(4):

$$\text{Prediction accuracy} = \frac{\text{number of correctly predicted late payment users}}{\text{total predicted number of late payment users}} = \frac{A}{B} \qquad (2)$$

$$\text{Recall} = \frac{\text{number of correctly predicted late payment users}}{\text{number of actual late payment users}} = \frac{A}{C} \qquad (3)$$

$$\text{Accuracy of provider} = \frac{\text{number of actual late payment users}}{\text{total actual number of users}} = \frac{C}{D} \qquad (4)$$
The results according to these evaluation criteria are shown in Table 6.
The average recall and the
average accuracy of the proposed CM-CoP model were 88.13% and 78.53%, respectively. The accuracy
was about 13 percentage points higher than that of the predictions resulting from the
existing method used by the telecom provider, which was
65.60%. Meanwhile, the average gain of the system, i.e. its accuracy relative to the 7% share of late-paying customers, was 11.2 (=78.53%/7%). Thus, the proposed CM-CoP model is effective for predicting late-paying customers.
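The averages quoted above can be recomputed from the counts A, B, C and D in Table 6 with Eqs. (2)-(4). The sketch below does this; the variable and function names are chosen for illustration, and the gain is computed against the 7% late payment rate stated in the text.

```python
# (A, B, C, D) for the six verification months, taken from Table 6:
# A = correctly predicted late payers, B = all predicted late payers,
# C = actual late payers among investigated users, D = all investigated users.
TABLE_6 = [
    (1576, 2012, 1837, 2848),
    (1592, 2047, 1778, 2765),
    (1588, 1991, 1798, 2777),
    (1602, 2026, 1795, 2650),
    (1588, 2023, 1788, 2752),
    (1605, 2065, 1843, 2737),
]


def mean(values):
    return sum(values) / len(values)


accuracy = mean([a / b for a, b, c, d in TABLE_6])           # Eq. (2)
recall = mean([a / c for a, b, c, d in TABLE_6])             # Eq. (3)
provider_accuracy = mean([c / d for a, b, c, d in TABLE_6])  # Eq. (4)
gain = accuracy / 0.07  # relative to the 7% share of late-paying customers

print(f"accuracy={accuracy:.2%} recall={recall:.2%} "
      f"provider={provider_accuracy:.2%} gain={gain:.1f}")
```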

Table 6
Verification of statistical results (one block per verification month, in the order given).

                         Prediction results    Subjects investigated
Late payment             1576 (A)              1837 (C)
Normal payment           436                   1011
Total number of users    2012 (B)              2848 (D)
Predictive accuracy      78.33%
Recall                   85.79%
Accuracy of provider     64.50%

Late payment             1592                  1778
Normal payment           455                   987
Total number of users    2047                  2765
Predictive accuracy      77.77%
Recall                   89.54%
Accuracy of provider     64.30%

Late payment             1588                  1798
Normal payment           403                   979
Total number of users    1991                  2777
Predictive accuracy      79.76%
Recall                   88.32%
Accuracy of provider     64.75%

Late payment             1602                  1795
Normal payment           424                   855
Total number of users    2026                  2650
Predictive accuracy      79.07%
Recall                   89.25%
Accuracy of provider     67.74%

Late payment             1588                  1788
Normal payment           435                   964
Total number of users    2023                  2752
Predictive accuracy      78.50%
Recall                   88.81%
Accuracy of provider     64.97%

Late payment             1605                  1843
Normal payment           460                   894
Total number of users    2065                  2737
Predictive accuracy      77.72%
Recall                   87.09%
Accuracy of provider     67.34%


5. Conclusions and future work


In this paper, we first propose a late payment prediction framework, namely the Combined Mining-based Customer Payment
Behavior Predication (CM-CoP) framework, which incorporates
data mining technology with the domain-driven data mining strategy to predict which customers are most likely to leave their bill unpaid
more than ten days after payment is due. Then, the CM-CoP algorithm is proposed for achieving this goal. After verification by the
provider, the accuracy of the proposed model, which was 78.53%,
was higher than that of the existing model, which was 65.60%. During the implementation of the plan, the user payment records for
the previous month (the fifth month of each time interval) were
added to the user payment behavior model, and the accuracy
and recall of the existing model improved.
In addition, there are many reasons for late payment, and some
behavior can cause providers to suffer heavy short-term losses.
Since we hope that the system can use a small amount of recent
CDR to predict and analyze users' behavior, each month-based
window was subdivided into several week-based windows. The
week-based windows increased the real-time capabilities of the
system. Because of this, providers do not have to wait until the middle ten days of the next month for CDR prediction analysis, but can
perform an analysis after the system collects the CDR for one week.
In fact, if the providers are willing to use a week-based operation
model to develop a real-time function, the system can further analyze different late payment behaviors in a short time.
In the future, the authors will communicate with telecom providers regarding this issue and will conduct further analysis of each
type of behavior in the hope that they will offer user payment records for the previous month on time. Further, the existing rules
used by providers can be combined with the model. Different models can be provided for users with different call amounts in order to
further improve the model's efficiency.
References
Agrawal, R., Imielinksi, T., & Swami, A. (1993). Mining association rules between
sets of items in large database. In The 1993 ACM SIGMOD conference, Washington
DC, USA.
Au, W. H., & Chan, K. C. C. (2003). Mining fuzzy association rules in a bank-account
database. IEEE Transactions on Fuzzy Systems, 11(2), 238248.
Cahill, M. H., Lambert, D., Pinheiro, J. C., & Sun, D. X. (2002). Detecting fraud in the
realworld. Handbook of massive data sets (massive computing 4). Kluwer
Acadamic Publishers (pp. 911929). Kluwer Acadamic Publishers.

6569

Cao, L. (2010). Domain-driven data mining: challenges and prospects. IEEE


Transactions on Knowledge and Data Engineering, 22(6), 755769.
Cao, L., Zhao, Y., Zhang, H., Luo, D., Zhang, C., & Park, E. K. (2010). Flexible
frameworks for actionable knowledge discovery. IEEE Transactions on Knowledge
and Data Engineering, 22(9), 12991312.
Du, J., & Ling, C. X. (2010). Asking generalized queries to domain experts to improve
learning. IEEE Transactions on Knowledge and Data Engineering, 22(6), 812825.
Fawcett, T., & Provost, F. (1997). Adaptive fraud detection. Data Mining and
Knowledge Discovery, 1(3).
Hadavandi, E., Shavandi, H., & Ghanbari, A. (2010). Integration of genetic fuzzy
systems and articial neural networks for stock price forecasting. KnowledgeBased System, 23(8), 800808.
He, J., Zhang, Y., Shi, Y., & Huang, G. (2010). Domain-driven classication based on
multiple criteria and multiple constraint-level programming for intelligent
credit scoring. IEEE Transactions on Knowledge and Data Engineering, 22(6),
826838.
Hung, S. Y., Yen, D. C., & Wang, H. Y. (2006). Applying data mining to telecom churn
management. Expert Systems with Applications, 31, 515524.
Jin, H., Chen, J., He, H., Kelman, C., McAullay, D., & OKeefe, C. M. (2010). Signaling
potential adverse drug reactions from administrative health databases. IEEE
Transactions on Knowledge and Data Engineering, 22(6), 839853.
Mansingh, G., Osei-Bryson, K., & Reichgelt, H. (2011). Using ontologies to facilitate
post-processing of association rules by domain experts. Information Sciences,
181(3), 419434.
Marinica, C., & Guillet, F. (2010). Knowledge-based interactive postmining of
association rules using ontologies. IEEE Transactions on Knowledge and Data
Engineering, 22(6), 784797.
McQueen, J. B. (1967). Some methods of classication and analysis of mutivariate
observations. In The symposium on mathematical satistics and probability (pp.
281297).
Pradeep, I. K., Krishna, S. M., Illapu, S. S. R., Kumar, A., & Koyi, L. P. (2010). CRM
system using CM-AKD approach of D3M. International Journal of Engineering
Science and Technology, 2(3), 237242.
Quinlan, J. R. (2003). Induction of decision trees. Machine Learning, 1(1), 81106.
Schommer, C. (2010). Discovering fraud behaviour in call detailed records. Grande
region security and reliability day.
Sim, A. T. H., Indrawan, M., Zutshi, S., & Srinivasan, B. (2010). Logic-based pattern
discovery. IEEE Transactions on Knowledge and Data Engineering, 22(6), 798811.
Tajbakhsh, A., Rahmati, M., & Mirzaei, A. (2009). Intrusion detection using fuzzy
association rules. Applied Soft Computing, 9(2), 462469.
Taniguchi, M., Haft, M., Hollmen, J., & Tresp, V. (1998). Fraud detection in
communication networks using neural and probilistic methods. IEEE
International Conference on Acoustics, Speech and Signal Processing, 2, 1215.
Wheeler, R., & Aitken, S. (2000). Multiple algorithms for fraud detection. KnowledgeBased System, 13, 9399.
Xiang, E. W., Cao, B., Hu, D. H., & Yang, Q. (2010). Bridging domains using world
wide knowledge for transfer learning. IEEE Transactions on Knowledge and Data
Engineering, 22(6), 70783.
Xu, X., Lin, J., & Xu, D. (2009). Mining pattern of supplier with the methodology of
domain-driven data mining. IEEE International Conference on Fuzzy Systems,
19251930.
Yan, L., Wolniewicz, R. H., & Dodier, R. (2004). Predicting customer behavior in
telecommunications. IEEE Intelligent Systems, 5058.
