A Robust Rule-Based Event Management Architecture for Call-Data Records

C. W. Ong and J. C. Tay

Center for Computational Intelligence,


Nanyang Technological University
asjctay@ntu.edu.sg

Abstract. Rules provide a flexible method of recognizing events and event
patterns through the matching of CDR data fields. The first step in automatic
CDR filtering is to identify the data fields that comprise the CDR format. In the
particular case of the Nortel Meridian One PABX, five different call data types
can be identified that are critical for call reporting. The architecture we have
proposed will allow for line activity analysis while continuously publishing
action choices in real-time. For performance evaluation of serial-line CDR data
communications, an approximation to the CDR record loss rate at different
simulated call traffic intensities was calculated. Here, the arrival process
represents the arrival of newly generated CDR to the output buffer and the
service process represents the process of transmitting the CDR over the serial
connection. We calculated the CDR loss rate at different arrival intensities and
observed that the CDR loss rate is negligible when the CDR arrival rate is less
than 4 CDRs per second.

Keywords. Rule-based system, CDR loss rate, event filtering and correlation.

1 Introduction and Motivation


Telecommunications monitoring is usually conducted by maintaining raw data logs
produced by the Private Automatic Branch Exchange (or PABX). There is usually
very little meaningful correlation to the user directory to allow the management to
view user call details before monthly bills are invoiced from the service provider.
Call data records (or CDRs) are typically produced during or after calls are made
and need to be filtered and identified before they can be used meaningfully. In this
manner, we can classify CDRs as events that can be processed. One common method
of call data filtering is essentially a set of ‘if-then’ control structures used to
systematically decipher each CDR according to predefined vendor-specific CDR
formatting. Such an implementation implies little room for variation in call data
formatting, and imposes brittleness on the design of the filtration and correlation
function. Another factor motivating automatic filtration of CDR information is the
incidence of telecommunications fraud, typified by overseas calls and prolonged
phone usage during work hours. This is a costly issue which can be prevented
through the proactive use of rules to ensure suspicious activity is recognized and that
management is alerted.
Rules provide a flexible method of recognizing events and event patterns through
the matching of CDR data fields. A basic architecture must provide a sufficient
subset of rules that can match most usage patterns and data formats while allowing
unknown patterns to be learnt via user intervention. Such a system would avoid the
tedious and practically impossible task of categorizing each and every vendor-specific
CDR format and having to hard-code for all present and future usage patterns. Although
there is commercially available software on the market, such as Hansen Software’s CASH
Call Accounting Software or TelSoft Solutions [1], these products are closed-source and
are not ideal platforms for customizing to the user’s needs.

M.Gh. Negoita et al. (Eds.): KES 2004, LNAI 3215, pp. 16–23, 2004.
© Springer-Verlag Berlin Heidelberg 2004

In this paper, we propose an effective architecture for filtering and learning CDRs
from correct, partially-correct, and unknown PABX data formats through the use of
an embedded forward-chaining rule-based engine. In addition, the proposed
architecture also provides web-based customizable reports for trend and historical
analysis of phone-line usage. This research is based on our experience in
implementing similar systems for Banks, Credit Collection Agencies and
Manufacturing research centers [2].

2 Overview of CDR Filtering and Correlation


The CDR transaction model we assume in our study is one in which the CDR is
produced only after a call has been placed. On one hand, this model makes the job of
collecting call data for completed phone calls simpler but on the other, this implies that
a complete picture of the state of current phone-line usage is unknown. However, it
remains realistic as most phone systems issue CDRs only during or after a call is made.
The first step in automatic CDR filtering is to identify the data fields that comprise
the CDR format and which can be used to identify the type of CDR being produced.
In the particular case of the Nortel Meridian One PABX [3], five different call data
types [4] can be identified that are critical for call reporting. They are: Normal Calls,
Start/End Calls, ACD Call Connections, Authorization and Internal.

Fig. 2.1. CDR Generation for Normal Calls. A CDR record is generated for the entire
event comprising call-commencement, call-duration and call-termination of a Normal Call

A more complex scenario occurs when calls are transferred and CDRs have to be
correlated to form a single record in the system. This is shown in Fig 2.2.
After the variables are identified, they can be classified and used within rules for
filtering and identifying CDRs.

Fig. 2.2. CDR Generation for Start/End Calls

3 Simple Fraud Alert Rules Development

In traditional fraud monitoring practices, the administrator would only be aware of the
case after it has occurred, and by then it would be too late to take measures against the
perpetrator. Usually the reports will have to be viewed by the administrator and
flagged for suspicious activity, all of which is time-consuming and prone to errors.
The model presented here is intended as a first step towards improving fraud
detection efficiency and effectiveness. Once developed and introduced, the expert
system model will allow for line activity analysis while continuously publishing action
choices in real-time. Specifically, after every interval of CDR arrival, the new CDRs can be
checked against fraud rules and reports can immediately be sent to the administrator.
The administrator can then take further action by flagging specific lines for further
analysis and monitoring. The action will usually be recommended (based on similar
past actions) by the system to minimize the administrative load.

3.1 Rule-Based Fraud Modeling

Ideally the process of modeling fraud activity involves collecting historical line
activity data, and then applying statistical techniques (such as discriminant analysis)
on the dataset to obtain a predictive model that is used to distinguish between fraud
and non-fraud activity [5][6]. In our case however, it would be more difficult to
distinguish fraud and non-fraud activity as CDRs are only issued after a call is made.
Instead, a rule-based approach is used to correlate variables which form conjunctive
patterns in fraudulent phone-line usage. A fraud variable datum (or FVD) is a data
field in a CDR which is used as part of a rule condition to detect fraudulent
phone-line usage.
Some examples of FVDs (in order of significance) are: Duration of Call,
Frequency of Call, Time of Call, Day of Call and Destination of Call. The FVDs
represent the signs which a human operator should take note of when detecting fraud.
Duration of Call is often the prime suspect, since an international (IDD) call that is
placed for more than a certain duration incurs great cost, is unlikely to be used for
meetings, and is therefore cause for alarm. Frequency of Call can also indicate fraud activity since a
high frequency detected within a certain time period could be indicative of redial
activity in an attempt to gain unauthorized access to systems. By monitoring Time and
Day of Call, unusual activities can also be caught during periods when calls are not
expected to be placed for most extensions. The last category, Destination of Call,
could be used to monitor lines in which IDD calls are not allowed. Some examples of
Fraud Detection Rules are:
• If DurationOfCall > 3600 then email Administrator;
• If FrequencyOfCall > 10 AND TimeOfCall = 5 then email Administrator;
• If TimeOfCall > OFF_PEAK then monitor line;
Actions or Jobs can be specified to be executed when rules are triggered to
facilitate reporting and alert creation. Actions include using HTML alert, E-Mail and
short message service (or SMS).
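The three example rules above can be sketched in plain Python to show how conjunctive FVD conditions map to actions. This is only an illustration: the system described here expresses its rules in JESS, and this stand-in performs no forward chaining. The dictionary keys mirror the FVD names in the text, while the OFF_PEAK value and the reading of TimeOfCall as an hour of the day are assumptions.

```python
# Hypothetical threshold: the paper does not give a value for OFF_PEAK.
OFF_PEAK = 18  # assumed to be an hour of the day (24-hour clock)

# Each rule is (name, condition over the FVDs of one CDR, action).
# The conditions copy the three example rules from the text.
RULES = [
    ("long call",
     lambda cdr: cdr["DurationOfCall"] > 3600,
     "email Administrator"),
    ("frequent calls at hour 5",
     lambda cdr: cdr["FrequencyOfCall"] > 10 and cdr["TimeOfCall"] == 5,
     "email Administrator"),
    ("off-peak call",
     lambda cdr: cdr["TimeOfCall"] > OFF_PEAK,
     "monitor line"),
]


def triggered_actions(cdr):
    """Return (rule name, action) for every rule whose condition holds."""
    return [(name, action) for name, condition, action in RULES if condition(cdr)]


# Example CDR with made-up values: a 4000-second call placed at 22:00.
example_cdr = {"DurationOfCall": 4000, "FrequencyOfCall": 2, "TimeOfCall": 22}
for name, action in triggered_actions(example_cdr):
    print(f"rule '{name}' fired -> {action}")
```

Keeping each rule as data (a condition paired with an action) is what lets rules be added or changed without touching the matching loop, which is the flexibility the rule-based approach is after.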

4 An Event Management Architecture

By classifying incoming CDRs as events to be managed, we design a system that is
able to process incoming events using JESS rules. In this case, events are generated
from the Nortel PABX. The CDRTool system consists of modules to intercept events,
process them with the help of the Rule Framework and store processed events in a
database. This database can then be processed offline by a Report Framework for
statistical analysis and report generation.
The architecture shown in Fig 4.1 allows for
component failure and recovery with little intervention from the administrator. A
Supervisor Module ensures each component is running by using a Publish/Subscribe
message service. Clients address messages to a topic. Topics retain messages only as
long as it takes to distribute them to current subscribers. Publishers and subscribers
have a timing dependency. A client that subscribes to a topic can consume only
messages published after the client has created a subscription, and the subscriber must
continue to be active in order for it to consume messages.
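The timing dependency described above can be illustrated with a small in-memory sketch. It is not the JMS API: the Topic class, its method names and the "heartbeat" topic are hypothetical and only mimic the stated semantics, namely that a topic delivers each message to the subscribers present at publish time and retains nothing afterwards.

```python
class Topic:
    """Minimal stand-in for a publish/subscribe topic (not the JMS API)."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, callback):
        # A subscriber only sees messages published after this call.
        self._subscribers.append(callback)

    def publish(self, message):
        # Deliver to current subscribers only; the message is not retained.
        for callback in list(self._subscribers):
            callback(message)


# Hypothetical use: components report that they are alive on one topic.
heartbeat = Topic("heartbeat")
heartbeat.publish("PABX Client alive")      # no subscriber yet, so this is lost

received = []
heartbeat.subscribe(received.append)        # the supervisor subscribes now
heartbeat.publish("CDR Processor alive")    # delivered to the supervisor

print(received)
```

The first message is never seen by the late subscriber, which is why a monitoring component has to stay subscribed for as long as it wants to observe the others.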
Each component in the workflow that turns a raw CDR into a
processed CDR for rule matching communicates through a point-to-point messaging
system, also via JMS (Java Message Service). A point-to-point (or PTP) product or
application is built around the concept of message queues, senders, and receivers.
Each message is addressed to a specific queue, and receiving clients extract messages
from the queue(s) established to hold their messages. Queues retain all messages sent
to them until the messages are consumed or until the messages expire.
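A minimal sketch of these queue semantics follows, again as an in-memory illustration rather than the JMS API. The MessageQueue class, its method names and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are all assumptions.

```python
from collections import deque


class MessageQueue:
    """Minimal stand-in for a point-to-point queue (not the JMS API)."""

    def __init__(self):
        self._messages = deque()

    def send(self, body, now, ttl=None):
        # Keep the message until it is consumed or its time-to-live runs out.
        expires_at = None if ttl is None else now + ttl
        self._messages.append((body, expires_at))

    def receive(self, now):
        # Return the oldest unexpired message, or None if the queue is empty.
        while self._messages:
            body, expires_at = self._messages.popleft()
            if expires_at is None or now < expires_at:
                return body
        return None


queue = MessageQueue()
queue.send("raw CDR 1", now=0, ttl=10)   # expires at time 10
queue.send("raw CDR 2", now=0)           # never expires

first = queue.receive(now=20)            # "raw CDR 1" has expired by now
second = queue.receive(now=20)           # nothing left to consume
print(first, second)
```

Unlike a topic, the queue holds a message while no receiver is running, so a component that restarts after a failure can pick up where it left off.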
The application business model allows a component to send information to another
and to continue to operate without receiving an immediate response. Using messaging
for these tasks allows the various components to interact with one another efficiently,
without tying up network or other resources. The PABX Client module is a Web-based
communications program in charge of reading raw CDR from the PABX.
The CDR Processor Module retrieves from the Attributes and Schema database
the attributes that make up a CDR and attempts to create a well-formed CDR from the
raw record. If it is able to do so, it then queues the CDR for processing by the
CDR Manager. If not, errors are reported and tagged for batch processing to identify
those CDR sets which will be used by the administrator to take corrective action (for
example, adding it to the database so that future CDRs can be identified).
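The behaviour of the CDR Processor can be sketched as schema-driven parsing with an error path. Everything concrete here is hypothetical: the whitespace-delimited layout, the field names and the record-type codes are invented for illustration and are not the Nortel CDR format, and in the real system the schema comes from the Attributes and Schema database.

```python
# Hypothetical schema: an ordered list of (field name, validator) pairs.
SCHEMA = [
    ("record_type", lambda value: value in {"N", "S", "E"}),
    ("extension", str.isdigit),
    ("duration_s", str.isdigit),
]


def parse_raw_cdr(raw, schema=SCHEMA):
    """Return (cdr, None) if the raw record is well-formed, else (None, reason)."""
    tokens = raw.split()
    if len(tokens) != len(schema):
        return None, f"expected {len(schema)} fields, got {len(tokens)}"
    cdr = {}
    for (name, is_valid), token in zip(schema, tokens):
        if not is_valid(token):
            return None, f"invalid value {token!r} for field {name!r}"
        cdr[name] = token
    return cdr, None


def route(raw_records):
    """Split records into well-formed CDRs and tagged errors for batch review."""
    well_formed, errors = [], []
    for raw in raw_records:
        cdr, reason = parse_raw_cdr(raw)
        if cdr is not None:
            well_formed.append(cdr)          # would be queued for the CDR Manager
        else:
            errors.append((raw, reason))     # kept for the administrator
    return well_formed, errors


ok, bad = route(["N 4321 125", "X 4321 125", "N 4321"])
print(len(ok), "well-formed;", len(bad), "tagged as errors")
```

Because the schema is data rather than code, extending it with a newly identified format does not require changing the parsing loop.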
The CDR Manager uses JESS [7] to identify the CDRs that are possibly alert cases
based on rules set down by the administrator and also logs them into a central
database for report generation. CDRs that arrive at the CDR Manager are assumed to
be well-formed (as verified by CDR Processor); however, the rule engine still
performs a verification check against the schema database for any data range changes
if they have been specified by the administrator. This two-check process ensures that
the CDRs that enter into the database are valid. Fraud rules are also applied at this
step and any fraud triggers are then queued for job execution. CDRs which trigger the
fraud rules are tagged upon insertion into the CDR database and the appropriate alert
action (using E-Mail or SMS) is taken.

Fig. 4.1. The CDRTool System Architecture

The Attribute Manager is a web-based application that allows user specification of
CDR schema for the specific PABX that is in use. The various attributes and variables
specified here will be used by the system to correlate CDRs and also identify unknown
CDRs for system learning. The schema is stored in XML form for portability.
The rule builder is where users can enter new rules. The rule builder retrieves
conditions from the Attributes and Schema database and performs data range
verification. The information is grouped and displayed so that rules can be added
quickly and all the system rules viewed at a glance. Job types can be specified when
rules are matched (Job options are Email, SMS or HTML reports). The results of rule
executions are sent via the Report Framework.
The rules which are entered via the Rule Builder Form are stored into a JESS
rule-base. The rule engine is used primarily by the CDR Manager and also for
dispatching of alerts to the Report Framework when fraud rules are triggered. Rules
are created by the Rule Builder and stored in a SQL database called the Rule Database
which is used by the rule engine. A wizard-guiding system for adding rules was
developed to ease the administrative burden of manually going through error logs to
identify new rules.
A reporting framework provides the system with a means to generate different user
specified reports in HTML, E-Mail or Excel spreadsheet formats. The system also
allows report templates to be customized using the web-based GUI. The major modules
in the Report Framework are the Template Builder, the User Query Interface and the
Report Presenter. Each module represents the work required from creating reports to
gathering user queries and finally presenting the results in HTML or Excel. These
results can then be viewed either in a browser or sent as E-mail as well as SMS alerts.
The Template Builder gathers attributes from the Attributes and Schema database
and provides a web-based interface for building report templates. This allows
customisation to suit departmental needs and data analysis requirements. Each
user-query is then manipulated for each report. Each query is built as a SQL statement
whose results can be in graphical format or raw data format. A test SQL function is
provided to ensure the query executes correctly against the database. The User Query
Interface obtains Report Templates from the Report Template Database and builds the
user interface for presentation to the user (using HTML for display on a browser).
Finally, from the raw data returned by the User Query, the Report Presenter
formats reports to suit user needs. Drill-down reporting allows a more detailed
view of the data.

5 Performance Testing

From Fig 5.1, there were a total of 103574 records for the month of January 2003.
The implementation of hard-coded IF-Else statements shown in Fig 1.1 produced
5867 error records, which meant there was a 5.36% error rate. The rule-based
approach, in which wizards are used to modify rules, produced 1352 error records even
after rule adjustment, due to inability to filter the CDR. This translates to a 1.28%
error rate, roughly a fourfold reduction compared with the old system. The disadvantages of
the naive approach are that the hard-coded rules are difficult to change and usually can
only be modified by shutting down the server and examining the error logs. A
rule-based system does not require a shutdown of the system, since the rules can be
compiled by the CDR Processor immediately when new rules are added. CDRs with
recurring errors are also accumulated and presented to the user with the option to add
a new CDR filter rule based on the closest rule match.
In this section, an approximation to the CDR record loss rate at different simulated
call traffic intensities is calculated. This approximation is made to investigate the
limitations of using the serial interface for output of CDR data. The approximation is
based on a simple queuing model, an M/M/1/B system. This system
assumes exponentially distributed interarrival times and exponentially distributed
service times, using only one server and having a limited buffer space of B buffers.
Here, the arrival process represents the arrival of newly generated CDR to the
output buffer and the service process represents the process of transmitting the CDR
over the serial connection. Exponentially distributed interarrival times are often used
in telephony-based queuing systems to model the arrival process and have been shown to
be a very good approximation in many cases (see [8]). The time to transmit the data over the
serial line is intuitively constant in this case, since the size of each CDR and the
transmitting rate are constant. However, as also mentioned in [8], systems with
general and deterministic service times can often be very closely approximated using
exponentially distributed service times in the model. By using exponential
distributions rather than general, the calculations can be simplified but still be
accurate enough to be used for approximating the limitations of the system.
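
For reference, the textbook steady-state result for a single-server queue with exponential interarrival and service times and finite capacity can be written as below. The symbols are assumptions introduced here: λ is the CDR arrival rate, μ is the rate at which the serial line transmits CDRs, and B is taken to count every CDR in the system, including the one being transmitted. The paper does not state numerical values for μ or B in the text.

```latex
\rho = \frac{\lambda}{\mu}, \qquad
p_n =
\begin{cases}
\dfrac{(1-\rho)\,\rho^{n}}{1-\rho^{B+1}}, & \rho \neq 1,\\[2ex]
\dfrac{1}{B+1}, & \rho = 1,
\end{cases}
\qquad n = 0, 1, \dots, B

P_{\text{loss}} = p_B, \qquad
\text{CDR loss rate} = \lambda \, p_B
```

Here p_n is the probability that n CDRs are in the system. An arriving CDR is lost only when it finds the system full, so the loss probability is p_B and the loss rate in CDRs per second is λ p_B.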

Type of Call          Number of Calls
Outgoing Call         53310
Incoming Call         39800
Initial Connection    4106
End                   1789
Internal Call         1726
Start                 1438
Outgoing T2T          945
Incoming T2T          460
Total                 103574

Fig. 5.1. Typical Records from PABX for one month

The CDR loss rate was calculated for different arrival intensities and plotted in a
graph (see Fig 5.2). From the graph it can be determined that the CDR loss rate may
be neglected when the CDR arrival rate is below approximately 4 CDRs per second. When
the arrival rate reaches 4 CDRs per second, the output buffer starts to fill up and CDRs
are lost. Under stress testing, call traffic generates a maximum arrival intensity of
approximately 1 CDR per second, which is far lower than the critical arrival
intensity at which call information records begin to get lost. Even if the traffic load
increases to three times the traffic load of today, there is no immediate risk of losing
CDRs due to saturated output buffers.
From Fig 5.3 we can see that this arrival rate, the point at which the output buffer
starts to fill up, corresponds to a traffic intensity of about 80%.
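
The kind of calculation behind these figures can be sketched as follows, using the standard finite-capacity single-server formula. Two inputs are assumptions: the service rate of 5 CDRs per second is inferred from the statement that 4 CDRs per second corresponds to a traffic intensity of about 80%, and the capacity of 20 CDRs is purely hypothetical because the buffer size is not given. The sketch is therefore not expected to reproduce Fig 5.2 exactly.

```python
def blocking_probability(lam, mu, capacity):
    """Probability that an arriving CDR finds the system full (M/M/1/B)."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        # Limiting case rho = 1: all capacity + 1 states are equally likely.
        return 1.0 / (capacity + 1)
    return (1.0 - rho) * rho**capacity / (1.0 - rho ** (capacity + 1))


def loss_rate(lam, mu, capacity):
    """Expected number of CDRs lost per second."""
    return lam * blocking_probability(lam, mu, capacity)


MU = 5.0        # assumption: inferred from rho of about 0.8 at 4 CDRs per second
CAPACITY = 20   # hypothetical: the paper does not state the buffer size

for lam in (1.0, 3.0, 4.0, 5.0, 6.0, 8.0):
    rho = lam / MU
    print(f"lambda={lam:.1f}  rho={rho:.2f}  loss rate={loss_rate(lam, MU, CAPACITY):.4f}")
```

Under these assumptions the loss rate stays close to zero at the stress-test load of 1 CDR per second and at three times that load, and grows quickly once the arrival rate passes the service rate.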

6 Summary and Conclusion

The CDR transaction model we have assumed in our study is one in which the CDR is
produced only after a call has been placed. The first step in automatic CDR filtering is to
identify the data fields that comprise the CDR format and which can be used to identify
the type of CDR being produced. In the particular case of the Nortel Meridian One PABX
[3], five different call data types were identified that are critical for call reporting. The
architecture that is proposed will allow for line activity analysis while continuously
publishing action choices in real-time. For performance evaluation, an approximation
to the CDR record loss rate at different simulated call traffic intensities was calculated.
From the results, we observe that the CDR loss rate is negligible when the CDR arrival
rate is less than 4 CDRs per second. Under stress testing, call traffic generates a maximum
arrival intensity of only approximately 1 CDR per second, which is far lower than
the critical arrival intensity at which call information records begin to get lost.
Fig. 5.2. CDR Record Rate Loss (plot of loss rate [CDR / second] against arrival intensity [CDR / second], for arrival intensities from 0 to 8)

Fig. 5.3. Traffic Intensity (plot of ρ(λ) against λ, for λ from 0.5 to 8)

References
[1] TelSoft Solutions for Call Accounting, http://telsoftsolutions.com/callaccount.html
(verified on 15 Jan 2004).
[2] Nguyen A. T., J. B. Zhang, J. C. Tay, “Intelligent management of manufacturing event &
alarms”, technical report, School of Computer Engineering, Nanyang Technological
University, Jan 2004.
[3] Reference for Nortel Meridian 1 PBX/Meridian Link Services,
http://callpath.genesyslab.com/docs63/html/nortsl1/brsl1m02.htm#ToC (verified on 15 Jan 2004).
[4] Nortel Networks. (2002) CDR Description and Formats. Document Number:
553-2631-100 Document Release: Standard 9.00.
[5] Nikbakht, E. and Tafti, M.H.A, Application of Expert Systems in evaluation of credit card
borrowers. Managerial Finance 15/5, 19-27, 1989.
[6] Peter B., John S., Yves M., Bart P., Christof S., Chris C., Fraud Detection and Management in
Mobile Telecommunications Networks, Proceedings of the European Conference on Security
and Detection ECOS 97, pp. 91-96, London, April 28-30, 1997. ESAT-SISTA TR97-41.
[7] Java Expert System Shell or JESS, website at http://herzberg.ca.sandia.gov/jess/
[8] Jain, Raj, The Art of Computer Systems Performance Analysis. ISBN 0-471-50336-3, USA:
John Wiley & Sons, Inc., 1991.
