M S Ramaiah Institute of Technology (Autonomous Institute Affiliated to VTU) Bangalore – 560 054
TABLE OF CONTENTS
1.0 Abstract
2.0 Introduction to Data Mining
2.1 Data Mining Techniques
2.2 Telecommunication Marketing
2.3 Information about WEKA
2.4 Telecommunication Fraud Detection
2.5 Network Fault Isolation
3.0 Telecommunication Fraud
3.1 Subscription Fraud
3.2 Bad Debt
3.3 Call Detail Record
4.0 Problem Definition
4.1 Algorithms Used
4.2 Snapshots
5.0 Conclusion
6.0 References
1. Abstract
Huge amounts of data are being collected as a result of the increased use of mobile telecommunications. Insight into information and knowledge derived from these databases can give operators a competitive edge in terms of customer care and retention, marketing and fraud detection. One of the strategies for fraud detection checks for signs of questionable changes in user behavior. Although the intentions of mobile phone users cannot be observed, their intentions are reflected in the call data, which define usage patterns. Over a period of time, an individual phone generates a large pattern of use. While call data are recorded for subscribers for billing purposes, we make no prior assumptions about the data indicative of fraudulent call patterns, i.e. the calls recorded for billing purposes are unlabeled. Further analysis is thus required to be able to isolate fraudulent usage. An unsupervised learning algorithm can analyze and cluster call patterns for each subscriber in order to facilitate the fraud detection process.
This research investigates the unsupervised learning potential of two neural networks for the profiling of calls made by users over a period of time in a mobile telecommunication network. Our study provides a comparative analysis and application of Self-Organizing Maps (SOM) and Long Short-Term Memory (LSTM) recurrent neural network algorithms to user call data records in order to conduct descriptive data mining on users' call patterns. Our investigation shows the learning ability of both techniques to discriminate user call patterns, with the LSTM recurrent neural network algorithm providing better discrimination than the SOM algorithm in terms of long time series modeling. LSTM discriminates different types of temporal sequences and groups them according to a variety of features. The ordered features can later be interpreted and labeled according to specific requirements of the mobile service provider. Thus, suspicious call behaviors are isolated within the mobile telecommunication network and can be used to identify fraudulent calls.
2. Introduction
Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.
Most companies already collect and refine massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. When implemented on high performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to questions such as, "Which clients are most likely to respond to my next promotional mailing, and why?"
This white paper provides an introduction to the basic technologies of data mining. Examples of profitable applications illustrate its relevance to today's business environment, together with a basic description of how data warehouse architectures can evolve to deliver the value of data mining to end users.
2.1 Data Mining Techniques
Several major data mining techniques have been developed and used in data mining projects recently, including association, classification, clustering, prediction and sequential patterns. We will briefly examine each of these techniques with an example to give a good overview of them.
1. Association : Association is one of the best known data mining techniques. In association, a pattern is discovered based on the relationship of a particular item to other items in the same transaction. For example, the association technique is used in market basket analysis to identify which products customers frequently purchase together. Based on this data, businesses can run corresponding marketing campaigns to sell more products and make more profit.
2. Classification : Classification is a classic data mining technique based on machine learning. Basically, classification is used to assign each item in a set of data to one of a predefined set of classes or groups. Classification methods make use of mathematical techniques such as decision trees, linear programming, neural networks and statistics. In classification, we build software that can learn how to classify the data items into groups. For example, we can apply classification to the task "given all past records of employees who left the company, predict which current employees are likely to leave in the future." In this case, we divide the employee records into two groups, "leave" and "stay", and then ask our data mining software to classify the employees into each group.
3. Clustering : Clustering is a data mining technique that automatically forms meaningful or useful clusters of objects that have similar characteristics. Unlike classification, where objects are assigned to predefined classes, clustering also defines the classes themselves and then puts objects in them. To make the concept clearer, we can take a library as an example. In a library, books cover a wide range of topics. The challenge is how to keep those books in a way that lets readers take several books on a specific topic without hassle. By using clustering, we can keep books that have some kind of similarity in one cluster or on one shelf and label it with a meaningful name. Readers who want books on a topic only need to go to that shelf instead of searching the whole library.
4. Prediction : Prediction, as its name implies, is a data mining technique that discovers relationships among independent variables and relationships between dependent and independent variables. For instance, prediction analysis can be used in sales to predict future profit: if we consider sales as the independent variable, profit could be the dependent variable. Then, based on the historical sales and profit data, we can draw a fitted regression curve that is used for profit prediction.
5. Sequential Patterns : Sequential pattern analysis is a data mining technique that seeks to discover similar patterns in transaction data over a business period. The uncovered patterns are used for further business analysis to recognize relationships among data.
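The prediction example in item 4 can be sketched in a few lines. The following is a minimal illustration of fitting a straight regression line by ordinary least squares, assuming a single independent variable; the function names `fit_line` and `predict` and the sales and profit figures are hypothetical and are not taken from the project data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at a new value of the independent variable."""
    return slope * x + intercept


# Hypothetical historical data: sales (independent) and profit (dependent).
sales = [10.0, 20.0, 30.0, 40.0]
profit = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(sales, profit)
forecast = predict(slope, intercept, 50.0)
```

A straight line is only the simplest choice of "fitted regression curve"; real sales data would usually call for checking the fit before trusting the forecast.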
2.2 Telecommunication Marketing
The telecommunications industry generates and stores a tremendous amount of data. These data include call detail data, which describe the calls that traverse the telecommunication networks; network data, which describe the state of the hardware and software components in the network; and customer data, which describe the telecommunication customers. The amount of data is so great that manual analysis is difficult, if not impossible. The need to handle such large volumes of data led to the development of knowledge-based expert systems. These automated systems performed important functions such as identifying fraudulent phone calls and identifying network faults. The problem with this approach is that it is time consuming to obtain the knowledge from human experts (the "knowledge acquisition bottleneck") and, in many cases, the experts do not have the requisite knowledge. The advent of data mining technology promised solutions to these problems, and for this reason the telecommunications industry was an early adopter of data mining technology.
Telecommunication data pose several interesting issues for data mining. The first concerns scale, since telecommunication databases may contain billions of records and are amongst the largest in the world. A second issue is that the raw data is often not suitable for data mining. For example, both call detail and network data are time-series data that represent individual events. Before this data can be effectively mined, useful "summary" features must be identified and then the data must be summarized using these features. Because many data mining applications in the telecommunications industry involve predicting very rare events, such as the failure of a network element or an instance of telephone fraud, rarity is a third issue that must be dealt with. The fourth and final data mining issue concerns real-time performance: many data mining applications, such as fraud detection, require that any learned model or rules be applied in real time.
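The summarization step described above, where individual call events are collapsed into per-customer "summary" features, might look like the following sketch. The record layout (subscriber id, duration in seconds, international flag) and the three chosen features are assumptions made for illustration; a real call detail feed would carry many more fields.

```python
from collections import defaultdict


def summarize_calls(calls):
    """Collapse per-call events into one feature row per subscriber.

    Each call is a tuple (subscriber_id, duration_seconds, is_international).
    """
    totals = defaultdict(lambda: {"calls": 0, "seconds": 0, "intl": 0})
    for subscriber, seconds, is_international in calls:
        row = totals[subscriber]
        row["calls"] += 1
        row["seconds"] += seconds
        row["intl"] += 1 if is_international else 0

    features = {}
    for subscriber, row in totals.items():
        features[subscriber] = {
            "call_count": row["calls"],
            "avg_duration": row["seconds"] / row["calls"],
            "intl_fraction": row["intl"] / row["calls"],
        }
    return features


# Hypothetical call events for two subscribers.
calls = [
    ("A", 60, False),
    ("A", 180, True),
    ("B", 30, False),
]
features = summarize_calls(calls)
```

Each subscriber ends up as a single row with a fixed set of attributes, which is the shape most mining tools expect as input.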
2.3 Information about WEKA
The Weka workbench contains a collection of visualization tools and algorithms for data analysis and predictive modeling, together with graphical user interfaces for easy access to this functionality. The original non-Java version of Weka was a Tcl/Tk front-end to (mostly third-party) modeling algorithms implemented in other programming languages, plus data preprocessing utilities in C, and a Makefile-based system for running machine learning experiments. This original version was primarily designed as a tool for analyzing data from agricultural domains, but the more recent fully Java-based version (Weka 3), for which development started in 1997, is now used in many different application areas, in particular for educational purposes and research.
Advantages of Weka include:
Free availability under the GNU General Public License.
Portability, since it is fully implemented in the Java programming language and thus runs on almost any modern computing platform.
A comprehensive collection of data preprocessing and modeling techniques.
Ease of use due to its graphical user interfaces.
Weka supports several standard data mining tasks, more specifically data preprocessing, clustering, classification, regression, visualization, and feature selection. All of Weka's techniques are predicated on the assumption that the data is available as a single flat file or relation, where each data point is described by a fixed number of attributes (normally numeric or nominal attributes, but some other attribute types are also supported). Weka provides access to SQL databases using Java Database Connectivity and can process the result returned by a database query. It is not capable of multi-relational data mining, but there is separate software for converting a collection of linked database tables into a single table that is suitable for processing using Weka. Another important area that is currently not covered by the algorithms included in the Weka distribution is sequence modeling.
The Explorer interface features several panels providing access to the main components of the workbench:
The Preprocess panel has facilities for importing data from a database, a CSV file, etc., and for preprocessing this data using a so-called filtering algorithm. These filters can be used to transform the data (e.g., turning numeric attributes into discrete ones) and make it possible to delete instances and attributes according to specific criteria.
The Classify panel enables the user to apply classification and regression algorithms (indiscriminately called classifiers in Weka) to the resulting dataset, to estimate the accuracy of the resulting predictive model, and to visualize erroneous predictions, ROC curves, etc., or the model itself (if the model is amenable to visualization, e.g., a decision tree).
The Associate panel provides access to association rule learners that attempt to identify all important interrelationships between attributes in the data.
The Cluster panel gives access to the clustering techniques in Weka, e.g., the simple k-means algorithm. There is also an implementation of the expectation maximization algorithm for learning a mixture of normal distributions.
The Select attributes panel provides algorithms for identifying the most predictive attributes in a dataset.
The Visualize panel shows a scatter plot matrix, where individual scatter plots can be selected and enlarged, and analyzed further using various selection operators.
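To make the "single flat file" assumption concrete, the sketch below writes a small table in ARFF, the plain-text relation format that Weka reads. The helper `to_arff`, the relation name and the attribute names are hypothetical; the sketch handles only numeric and nominal attributes and does not quote names or values that contain spaces or commas.

```python
def to_arff(relation, attributes, rows):
    """Render a flat table as ARFF text.

    attributes: list of (name, kind) pairs, where kind is the string
    "numeric" or a list of allowed nominal values.
    rows: list of sequences, one value per attribute.
    """
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if kind == "numeric":
            lines.append("@attribute " + name + " numeric")
        else:
            # Nominal attributes list their allowed values in braces.
            lines.append("@attribute " + name + " {" + ",".join(kind) + "}")
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("every row needs one value per attribute")
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical per-subscriber summary table.
attributes = [
    ("call_count", "numeric"),
    ("avg_duration", "numeric"),
    ("label", ["normal", "suspicious"]),
]
rows = [
    (2, 120.0, "normal"),
    (40, 15.5, "suspicious"),
]
arff_text = to_arff("subscriber_profiles", attributes, rows)
```

Every row carries exactly the same attributes in the same order, which is why linked tables have to be joined into one relation before Weka can use them.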
2.4 Telecommunication Fraud Detection
Fraud, the act of deceiving others for personal gain, is certainly as old as civilization itself. The word comes from the Latin fraudem, meaning deceit or injury, and over the years has come to represent a wide array of injustices, including forged artwork, academic plagiarism, confidence schemes, and email advance-fee frauds (such as the well-known Nigerian-email scam). Usually the fraud is for monetary gain, but not always, as fraud may be perpetrated for political causes (e.g., electoral fraud), personal prestige (e.g., plagiarism), or self-preservation (e.g., perjury). Although these forms of fraud are very different in nature, they all have in common a dishonest attempt to convince an innocent party that a legitimate transaction is occurring when in fact it is not.
In the twentieth century, fraud matured in the area of transactional businesses, most notably in the telecommunications and credit card industries. Due to the sheer volume of transactions in these businesses, fraud could go unnoticed fairly easily, because it was such a small proportion of the overall business. In the early days of the telecommunications business, "social engineering" was used to convince telephone operators to give access to lines or complete calls that were not authorized. In the 1950s, AT&T started automating direct-dial long distance calling, exposing itself to the first generation of hackers. Because fraud could now be perpetrated without speaking to a human, it could be automated.
2.5 Network Fault Isolation
Fault isolation means determining the cause of a problem. Also known as "fault diagnosis," the term may refer to hardware or software, but always deals with methods that can isolate the component, device or software module causing the error. Fault isolation may be part of hardware design at the circuit level all the way up to the complete system. It is accomplished by building in test circuits and/or by dividing operations into multiple regions or components that can be monitored separately. After fault isolation is accomplished, parts can be replaced manually or automatically. Although the terms "fault isolation" and "fault detection" are sometimes used synonymously, fault detection means determining that a problem has occurred, whereas fault isolation pinpoints the exact cause and location.
Software can also be created and run with fault isolation in mind. Many techniques can be used. For example, program modules can be run in different address spaces to achieve separation. In addition, generating intermediate output that can be examined, as well as recording operational steps in a log, are ways to assist the troubleshooter in manually determining which routine caused the application to stop working or stop working properly.
3. Telecommunication Fraud
Telecommunication fraud can be defined as the theft of services or deliberate abuse of voice or data networks. Telecommunication fraud can be broken down into several generic classes. These classes describe the mode in which the operator was defrauded, for example, subscription using a false identity. Each mode can be used to defraud the network for revenue-based or non-revenue-based purposes. Most of these frauds are perpetrated either by the fraudster impersonating someone else or by technically deceiving the network systems.
3.1 Subscription Fraud : Subscription fraud is the obtaining of a postpaid telecommunication account through the normal procedure, using either a false or a stolen identity, with no intention of paying the bill. This is usually done by giving a wrong address so that the subscriber remains untraceable. The typical pattern of such fraudsters is to run up a high bill in a short time.
3.2 Bad Debt : Bad debt occurs when payment is not received for goods or services rendered. In a telecommunication company, for example, this is the case where the callers or customers appear to have originally intended to honour their bills but have since lost the ability or desire to pay. If someone does not pay their bill, then the telecom company has to establish whether the person was fraudulent or was merely unable to pay.
3.3 Call Detail Record (CDR) : A Call Detail Record is descriptive information about a call placed on a telecommunication network. It includes sufficient information to describe the important characteristics of each call, such as OPC, DPC, call start and end time, duration of the call, etc.
4. Problem Definition
Over a period of time, an individual handset's Subscriber Identity Module (SIM) card generates a large pattern of use. The pattern of use may include international calls and time-varying call patterns, among others. Anomalous use can be detected within the overall pattern, for example a subscriber's abuse of free call services such as emergency services. Anomalous use can be identified as belonging to one of two types:
1. The pattern is intrinsically fraudulent; it will almost never occur in normal use. This type is relatively easy to detect.
2. The pattern is anomalous only relative to the historical pattern established for that phone.
In order to detect fraud of the second type, it is necessary to have knowledge of the history of SIM usage. Because of the huge call volumes, it is virtually impossible to analyse this history without sophisticated techniques and tools. While call data are recorded for subscribers for billing purposes, no prior assumptions are made about the data indicative of fraudulent call patterns. In other words, the calls recorded for billing purposes are unlabeled. Further analysis is thus required to be able to identify possible fraudulent usage. Hence, a descriptive analysis of the call profile of each subscriber can be used for knowledge extraction. Interpretation by way of clustering or grouping of similar patterns can help in isolating suspicious call behaviour within the mobile telecommunication network. This can also help fraud analysts in their further investigation and call pattern analysis of subscribers.
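The second type of anomaly, use that is unusual only relative to a phone's own history, can be illustrated with a very simple check. The sketch below flags a day whose usage lies more than a chosen number of standard deviations from the subscriber's historical mean. This z-score test is a deliberately simple stand-in chosen for illustration and is not the clustering or pattern-learning method used in this project; the function name, the threshold of 3.0 and the usage figures are all assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, today, threshold=3.0):
    """Return True when today's usage deviates from the subscriber's own
    history by more than `threshold` sample standard deviations."""
    if len(history) < 2:
        return False  # not enough history to establish a pattern
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly constant history: any change at all is a deviation.
        return today != mu
    return abs(today - mu) / sigma > threshold


# Hypothetical daily call minutes for one SIM over the past week.
history = [12, 15, 11, 14, 13, 12, 14]

normal_day = is_anomalous(history, 14)
suspicious_day = is_anomalous(history, 95)
```

The same 95 minutes might be unremarkable for a heavy user, which is why the comparison is made against each SIM's own history rather than a global limit.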
4.1 Algorithms Used
Bayesian Networks : A Bayesian network, Bayes network, belief network, hierarchical Bayes(ian) model or directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.
A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data analysis. One, because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian network can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Three, because the model has both causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) and data. Four, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the overfitting of data. In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical methods for using data to improve these models. With regard to the latter task, we describe methods for learning both the parameters and structure of a Bayesian network, including techniques for learning with incomplete data. In addition, we relate Bayesian-network methods for learning to techniques for supervised and unsupervised learning.
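The disease and symptom example can be worked through for the smallest possible network, a single arc Disease -> Symptom. The sketch below applies Bayes' rule to obtain the probability of the disease once the symptom is observed. The function name and the three probabilities are made-up illustrative values, and a real network with many nodes would need a general inference procedure rather than this closed form.

```python
def posterior_disease(p_disease, p_symptom_given_disease, p_symptom_given_healthy):
    """P(disease | symptom observed) for the two-node network Disease -> Symptom."""
    # Joint probability of each way the symptom can arise.
    joint_disease = p_disease * p_symptom_given_disease
    joint_healthy = (1.0 - p_disease) * p_symptom_given_healthy
    evidence = joint_disease + joint_healthy  # P(symptom)
    if evidence == 0:
        raise ValueError("the symptom has zero probability under this model")
    return joint_disease / evidence


# Hypothetical numbers: a rare disease with a fairly reliable symptom.
p = posterior_disease(
    p_disease=0.01,
    p_symptom_given_disease=0.9,
    p_symptom_given_healthy=0.05,
)
```

With these numbers the posterior is 0.009 / 0.0585, roughly 0.15: the symptom raises the probability of disease from 1% to about 15%, but the low prior keeps it well below certainty. The same reasoning applies when the rare event is fraud and the evidence is a call pattern.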
Neural Network Method
An artificial neural network (ANN), usually called a neural network (NN), is a mathematical or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs or to find patterns in data.
The word network in the term 'artificial neural network' refers to the interconnections between the neurons in the different layers of each system. An example system has three layers. The first layer has input neurons, which send data via synapses to the second layer of neurons, and then via more synapses to the third layer of output neurons. More complex systems will have more layers of neurons, with some having increased layers of input neurons and output neurons. The synapses store parameters called "weights" that manipulate the data in the calculations. An ANN is typically defined by three types of parameters:
1. The interconnection pattern between different layers of neurons.
2. The learning process for updating the weights of the interconnections.
3. The activation function that converts a neuron's weighted input to its output activation.
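The three-layer example described above can be sketched as a forward pass in plain Python. This shows only how data flows from input neurons through weighted synapses to the output; it assumes fully connected layers and a sigmoid activation function, and it leaves out the learning process entirely. The weights and inputs are arbitrary illustrative values.

```python
import math


def sigmoid(z):
    """Activation function mapping a weighted input to an output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Compute one layer's activations.

    weights[j][i] is the weight on the synapse from input i to neuron j.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Two inputs, two hidden neurons, one output neuron (arbitrary weights).
hidden_w = [[0.5, -0.4], [0.3, 0.8]]
hidden_b = [0.0, -0.1]
output_w = [[1.2, -0.7]]
output_b = [0.05]

output = forward([1.0, 2.0], hidden_w, hidden_b, output_w, output_b)
```

Training would adjust `hidden_w`, `hidden_b`, `output_w` and `output_b` so that the outputs match the desired targets, which is the second of the three defining elements listed above.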
Rule Based Method
Rule-based methods, also called rule discovery or rule extraction from data, are data mining techniques aimed at understanding data structures, providing a comprehensible description instead of only black-box prediction. Rule-based systems should expose in a comprehensible way the knowledge hidden in data, providing logical justification for drawing conclusions, showing possible inconsistencies and avoiding the unpredictable conclusions that black-box predictors may generate in untypical situations. Sets of rules are useful if the rules are not too numerous, are comprehensible, and have sufficiently high accuracy. Rules are used to support decision making in classification (Classification, Machine Learning), regression (Regression, Statistics) and association tasks. Various forms of rules that allow expression of different types of knowledge are used: classical propositional logic (C-rules), association rules (A-rules), fuzzy logic (F-rules), M-of-N or threshold rules (T-rules), and similarity or prototype-based rules (P-rules). Algorithms for extraction of rules from data have been advanced in the Statistics, Machine Learning, Computational Intelligence and Artificial Intelligence fields.
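As one concrete instance, an M-of-N (threshold) rule fires when at least M of its N conditions hold. The sketch below builds such a rule over a subscriber profile. The field names and cut-off values are invented for illustration; they are not rules learned from data or produced by any particular rule-learning algorithm.

```python
def m_of_n(conditions, m):
    """Build a T-rule that fires when at least m of the conditions are true."""
    def rule(record):
        satisfied = sum(1 for condition in conditions if condition(record))
        return satisfied >= m
    return rule


# Hypothetical conditions on a per-subscriber summary record.
conditions = [
    lambda r: r["intl_fraction"] > 0.5,          # mostly international calls
    lambda r: r["call_count"] > 100,             # very heavy usage
    lambda r: r["days_since_activation"] < 30,   # recently opened account
]
looks_suspicious = m_of_n(conditions, m=2)

heavy_new_user = {"intl_fraction": 0.2, "call_count": 150, "days_since_activation": 10}
ordinary_user = {"intl_fraction": 0.1, "call_count": 20, "days_since_activation": 10}

flag_heavy = looks_suspicious(heavy_new_user)
flag_ordinary = looks_suspicious(ordinary_user)
```

Because each condition is readable on its own, an analyst can see exactly which conditions caused a subscriber to be flagged, which is the comprehensibility advantage the text attributes to rule-based methods.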
5. Conclusion
In this project report, the detection of subscription fraud and bad debts in telecommunication using pattern learning with the BayesNet and JRip algorithms has been discussed. Theoretical and experimental results have been presented which show that pattern learning techniques can be useful in detecting subscription fraud and bad debts in telecommunication.
6. References
Cortes, C., Pregibon, D. Giga-mining. Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining; 1998 August 27-31; New York, NY: AAAI Press, 1998. 174-178.
Cortes, C., Pregibon, D. Signature-based methods for data streams. Data Mining and Knowledge Discovery 2001; 5(3):167-182.
Ezawa, K., Norton, S. Knowledge discovery in telecommunication services data using Bayesian network models. Proceedings of the First International Conference on Knowledge Discovery and Data Mining; 1995 August 20-21; Montreal, Canada. AAAI Press: Menlo Park, CA, 1995.
Fawcett, T., Provost, F. Adaptive fraud detection. Data Mining and Knowledge Discovery 1997; 1(3):291-316.
Fawcett, T., Provost, F. Activity monitoring: Noticing interesting changes in behavior.