Comparative Study of Credit Card Fraud Detection Using HMM and Neural Network

CHAPTER 47
Pukhraj P. Shrishrimal, Sunil A. Khillare, Vishal B. Waghmare, C. Namrata Mahender and R.R. Deshmukh
Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University Aurangabad
E-mail: pukhraj.shrishrimal@gmail.com, sunil1.khillare@gmail.com, vishal.pri12@gmail.com, nam.mah@gmail.com,
ratnadeep_deshmukh@yahoo.co.in
ABSTRACT
In the era of electronic commerce (e-commerce), the whole world is moving towards greater use of credit cards. The
popularity of credit cards has increased dramatically because they can be used for both online and regular shopping, and
the number of fraud cases has risen accordingly. This paper compares two techniques used for detecting credit card fraud:
the HMM (Hidden Markov Model) and the NN (Neural Network). The reported accuracy of fraud detection using a Neural
Network is higher than that of the Hidden Markov Model.
Keywords: Internet, Credit cards, E-commerce Security, Fraud Detection, Hidden Markov Model, Neural Network.
1. INTRODUCTION
The craze for online shopping is growing rapidly day by day. As the economy grows, people's expenditure increases.
Credit cards are used not only for online shopping but also for regular shopping. One can gauge the popularity of credit
card usage from the number of transactions reported by Barclays by the end of the last century: 350 million card
transactions per year. By comparison, the world's biggest retail chain, Wal-Mart, records more than 7 billion transactions
per year [1]. Both online and regular transactions are increasing rapidly. As the number of credit card users grows, so
does the chance for an attacker to obtain card details and carry out fraudulent transactions.
Card transactions can be classified into two categories: (i) virtual or online and (ii) physical. In the first
category, the attacker obtains the necessary card details (such as the credit card number, date of expiry, CVV2 security
number and, in a few cases, the online account details of the cardholder). This type of card fraud is mostly carried out
over the Internet and by phone. In the second category, the cardholder needs to present the card to the retailer to
perform a transaction or purchase, so the attacker needs to steal the credit card. If the cardholder does not realize that
the card is lost, then the credit card company can suffer a substantial financial loss. In the first category, the
cardholder comes to know that the card is being used by another person only when the credit card statement is received.
308 Knowledge Engineering
Credit card fraud has become a nightmare for all parties involved, and credit card fraud losses are increasing daily
[2]. Experts from Visa International predict annual growth of up to 65% for some fraud types [4]. For the credit card
industry, fraud accounts for over $850 million in losses each year in the United States and $10 billion worldwide [5].
Fraudulent transactions can be discovered through inconsistent spending patterns. Detection becomes more difficult if
the fraudulent transactions follow a spending pattern similar to that of the cardholder. Many different techniques can be
applied to find inconsistent spending patterns. The most popular are the Neural Network and the HMM (Hidden Markov
Model).
This paper is organized as follows: Section 2 explains what is meant by credit card fraud detection and the most
common problems faced. Section 3 presents the HMM (Hidden Markov Model). Section 4 presents Neural Network techniques
in credit card fraud detection. Section 5 compares the HMM and Neural Network techniques. Finally, Section 6 contains
the conclusion of the paper.
2. CREDIT CARD FRAUD DETECTION

This section deals with credit card fraud and the problems faced by a Fraud Detection System (FDS). The discussion is
framed in terms of credit cards, but it also applies to debit cards, for which fraud is similar.
2.1 Meaning of Credit Card Fraud

A fraudulent transaction is a transaction made with a credit card by someone other than the cardholder. The growing use
of credit cards in daily life has brought many conveniences: cardholders can make purchases when they are short of cash
and pay the credit card company later, within the given time, without incurring any interest. The credit card has also
given fraudsters a good opportunity to obtain a handsome reward without paying a single penny from their own pocket.
Credit card fraud attracts attackers because they are hard to trace: the fraud is detected only 2 to 3 weeks after it
occurs. Online card fraud in particular is increasing, as no one can see who has actually made the purchase or who has
registered. Fraud is almost always directed toward cash or items that can be converted into cash, and is compressed in
time [5]. The most common fraud techniques are:
(a) Getting a copy of the card and getting hold of the PIN of the card [3].
(b) Getting hold of card details (such as card number, expiry date and CVV2 security code).
(c) Vendors charging the customer more money than agreed, without notifying the customer.
Credit card fraud leads to financial loss for both the financial organization and its customers. The cardholder pays
for part (possibly all) of the loss through higher interest rates, reduced benefits and higher membership fees [3].
Reducing card fraud is therefore in the interest of both the bank or financial organization and the cardholder, which has
led to the development of Fraud Detection Systems.
2.2 Problems in Credit card Fraud detection

The biggest problem for fraud detection research is the lack of real-world data on which academic researchers can
perform experiments. This is because fraud detection involves sensitive financial data of the customers of the financial
organization, which is confidential under customer privacy policies [3].
Another problem is that, when looking for inconsistent spending patterns, a genuine transaction by the cardholder can
sometimes be declined. A further problem is that a fraudulent transaction whose amount is similar to the cardholder's
usual spending pattern can bypass the Fraud Detection System.
Pukhraj P. Shrishrimal et al. 309
The qualities that a Fraud Detection System should possess are:
(a) It should be able to identify inconsistent spending patterns even though actual fraudulent transactions are only a
few percent of total transactions.
(b) It should be able to adapt to new kinds of fraud [3].
(c) It should be able to separate overlapping data, i.e. genuine transactions that resemble fraudulent ones and
fraudulent transactions that resemble genuine ones [3].
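Because fraudulent transactions are only a few percent of the total, a single accuracy figure can hide how well a system actually meets these requirements. The following is a minimal sketch of that effect, assuming a made-up batch of 1000 transactions with 20 frauds; the function names and all counts are hypothetical and do not come from any of the cited studies.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; label 1 means fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all transactions that were labelled correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual frauds that were caught."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def false_alarm_rate(y_true, y_pred):
    """Fraction of genuine transactions that were wrongly flagged."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Hypothetical batch: 1000 transactions, 20 of them fraudulent (2%).
y_true = [1] * 20 + [0] * 980

# A useless detector that approves every transaction.
approve_all = [0] * 1000

# A hypothetical detector that catches 15 of the 20 frauds
# and wrongly flags 30 genuine transactions.
detector = [1] * 15 + [0] * 5 + [1] * 30 + [0] * 950

for name, y_pred in (("approve_all", approve_all), ("detector", detector)):
    print(name, accuracy(y_true, y_pred), recall(y_true, y_pred),
          false_alarm_rate(y_true, y_pred))
```

In this made-up batch the detector that approves everything has the higher accuracy although it catches no fraud at all, which is why the fraud catch rate and the false alarm rate are worth reporting alongside accuracy.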
3. HIDDEN MARKOV MODEL

A Hidden Markov Model is a doubly embedded stochastic process with two hierarchy levels. It can be used to model
much more complicated stochastic processes than a traditional Markov model. An HMM has a finite set of states
governed by a set of transition probabilities. In a particular state, an outcome or observation can be generated according
to an associated probability distribution. Only the outcome, and not the state, is visible to an external observer
[6]. Hidden Markov Model based applications are common in various areas.
An HMM can be characterized by the following [6]:
(i) N, the number of states in the model. We denote the set of states $S = \{S_1, S_2, \ldots, S_N\}$, where $S_i$,
$i = 1, 2, \ldots, N$ is an individual state. The state at time instant t is denoted by $q_t$.
(ii) M, the number of distinct observation symbols per state. The observation symbols correspond to the physical
output of the system being modeled. We denote the set of symbols $V = \{V_1, V_2, \ldots, V_M\}$, where $V_i$,
$i = 1, 2, \ldots, M$ is an individual symbol.
(iii) The state transition probability matrix $A = [a_{ij}]$, where
$$a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \quad 1 \le i \le N, \; 1 \le j \le N; \; t = 1, 2, \ldots \tag{1}$$
For the general case where any state j can be reached from any other state i in a single step, we have $a_{ij} > 0$ for
all i, j. Also,
$$\sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i \le N$$
(iv) The observation symbol probability matrix $B = [b_j(k)]$, where
$$b_j(k) = P(V_k \mid S_j), \quad 1 \le j \le N, \; 1 \le k \le M \quad \text{and} \quad \sum_{k=1}^{M} b_j(k) = 1, \quad 1 \le j \le N \tag{2}$$
(v) The initial state probability vector $\pi = [\pi_i]$, where
$$\pi_i = P(q_1 = S_i), \quad 1 \le i \le N, \quad \text{such that} \quad \sum_{i=1}^{N} \pi_i = 1 \tag{3}$$
(vi) The observation sequence $O = O_1, O_2, O_3, \ldots, O_R$, where each observation $O_t$ is one of the symbols from V
and R is the number of observations in the sequence.
It is evident that a complete specification of an HMM requires the estimation of two model parameters, N and M, and
three probability distributions A, B and $\pi$. We use the notation $\lambda = (A, B, \pi)$ to indicate the complete set
of parameters of the model, where A, B implicitly include N and M.
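As an illustration of this parameter set, the sketch below stores the two matrices and the initial state vector in a small Python class and checks the normalisation constraints of Eqs. 1 to 3. The class name `HMM`, its `validate` method and the 2-state, 3-symbol numbers are hypothetical choices made for this example, not values from the cited work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HMM:
    """Parameter set of a discrete HMM: transition matrix, emission matrix, initial vector."""
    A: List[List[float]]   # N x N state transition probabilities a_ij
    B: List[List[float]]   # N x M observation symbol probabilities b_j(k)
    pi: List[float]        # initial state probabilities

    @property
    def N(self) -> int:
        """Number of states, implied by the size of A."""
        return len(self.A)

    @property
    def M(self) -> int:
        """Number of observation symbols, implied by the width of B."""
        return len(self.B[0])

    def validate(self, tol: float = 1e-9) -> None:
        """Raise ValueError if shapes disagree or a distribution does not sum to 1."""
        if len(self.B) != self.N or len(self.pi) != self.N:
            raise ValueError("A, B and pi must agree on the number of states N")
        for name, rows, width in (("A", self.A, self.N), ("B", self.B, self.M)):
            for row in rows:
                if len(row) != width:
                    raise ValueError(f"{name} has a row of the wrong length")
                if any(p < 0 for p in row) or abs(sum(row) - 1.0) > tol:
                    raise ValueError(f"each row of {name} must be a probability distribution")
        if any(p < 0 for p in self.pi) or abs(sum(self.pi) - 1.0) > tol:
            raise ValueError("pi must be a probability distribution")


# Hypothetical 2-state, 3-symbol model; every number here is made up.
model = HMM(
    A=[[0.7, 0.3],
       [0.4, 0.6]],
    B=[[0.6, 0.3, 0.1],
       [0.1, 0.3, 0.6]],
    pi=[0.5, 0.5],
)
model.validate()
print(model.N, model.M)
```

Deriving N and M from the matrix shapes mirrors the remark that A and B implicitly include them.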
An observation sequence O, as mentioned above, can be generated by many possible state sequences. Consider one
such particular sequence
$$Q = q_1, q_2, \ldots, q_R \tag{4}$$
where $q_1$ is the initial state.
The probability that O is generated from this state sequence is given by
$$P(O \mid Q, \lambda) = \prod_{t=1}^{R} P(O_t \mid q_t, \lambda) \tag{5}$$
where statistical independence of observations is assumed. Eq. 5 can be expanded as
$$P(O \mid Q, \lambda) = b_{q_1}(O_1) \cdot b_{q_2}(O_2) \cdots b_{q_R}(O_R) \tag{6}$$
The probability of the state sequence Q is given as
$$P(Q \mid \lambda) = \pi_{q_1} \, a_{q_1 q_2} \, a_{q_2 q_3} \cdots a_{q_{R-1} q_R} \tag{7}$$
Thus, the probability of generation of the observation sequence O by the Hidden Markov Model specified by $\lambda$ can be
written as
$$P(O \mid \lambda) = \sum_{\text{all } Q} P(O \mid Q, \lambda) \, P(Q \mid \lambda) \tag{8}$$
Deriving the value of $P(O \mid \lambda)$ using the direct definition of Eq. 8 is computationally intensive. Hence, a
procedure known as the Forward-Backward procedure [6] is used to compute $P(O \mid \lambda)$.
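To make the cost difference concrete, the sketch below evaluates the probability of an observation sequence in two ways: by summing over every state sequence as in Eq. 8, and by the forward pass of the Forward-Backward procedure, which is the only half needed for this quantity. The model numbers and the encoding of symbols as 0, 1, 2 (for example low, medium and high spending) are assumptions made for illustration only.

```python
from itertools import product


def sequence_probability_brute_force(A, B, pi, obs):
    """Eq. 8 evaluated directly: sum over all N**R state sequences Q (obs must be non-empty)."""
    n_states = len(pi)
    total = 0.0
    for Q in product(range(n_states), repeat=len(obs)):
        # Probability of the state sequence itself, Eq. 7.
        p_q = pi[Q[0]]
        for prev, nxt in zip(Q, Q[1:]):
            p_q *= A[prev][nxt]
        # Probability of the observations given that state sequence, Eq. 6.
        p_o_given_q = 1.0
        for state, symbol in zip(Q, obs):
            p_o_given_q *= B[state][symbol]
        total += p_o_given_q * p_q
    return total


def sequence_probability_forward(A, B, pi, obs):
    """Forward pass: alpha[j] is the probability of the observations so far, ending in state j."""
    n_states = len(pi)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][symbol]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical 2-state, 3-symbol model; every number here is made up.
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.6, 0.3, 0.1],
     [0.1, 0.3, 0.6]]
pi = [0.5, 0.5]

obs = [0, 0, 1, 2]  # assumed encoding: 0 = low, 1 = medium, 2 = high spending
print(sequence_probability_brute_force(A, B, pi, obs))
print(sequence_probability_forward(A, B, pi, obs))
```

Both functions return the same value up to rounding, but the brute-force version visits N to the power R state sequences while the forward pass needs on the order of N squared times R multiplications.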
4. NEURAL NETWORK
Neural networks are an extension of risk scoring techniques. They are based on the statistical knowledge contained in
extensive databases of historical transactions, and fraudulent ones in particular.
These neural network models are trained using examples of both legitimate and fraudulent transactions
and are able to correlate and weigh various fraud indicators (e.g., unusual transaction amount, card history, etc.) to the
occurrence of fraud.
A neural network is a computerized system that sorts data logically by performing the following tasks:
(a) Identifies cardholders' buying patterns and fraudulent activity patterns.
(b) Processes data by trial and elimination (excluding data that is not relevant to the pattern).
(c) Finds relationships in the patterns and current transaction data.
The principles of neural networks are motivated by the functions of the brain, especially pattern recognition and
associative memory. The neural network recognizes similar patterns and predicts future values or events based upon the
associative memory of the patterns it has learned.
The advantage neural networks offer over other techniques is that these models are able to learn from the past and
thus improve results as time passes. They can also extract rules and predict future activity based on the current situation.
By employing neural networks effectively, banks can detect fraudulent use of a card faster and more efficiently [7].
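The training idea described in this section can be sketched with a very small network written in plain Python. This is an illustrative toy, not the model used in [9]: the class `TinyNetwork`, the two input features (transaction amount relative to the cardholder's usual amount, and a new-merchant flag), the synthetic data and all hyperparameters are assumptions chosen for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNetwork:
    """One hidden tanh layer and a sigmoid output that scores fraud between 0 and 1."""

    def __init__(self, n_inputs, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        score = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, score

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, y, lr):
        """One stochastic gradient step on the cross-entropy loss."""
        hidden, score = self.forward(x)
        delta_out = score - y  # gradient of the loss at the output unit
        for j, h in enumerate(hidden):
            # Back-propagate through tanh before w2[j] is changed.
            delta_h = delta_out * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= lr * delta_out * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
            self.b1[j] -= lr * delta_h
        self.b2 -= lr * delta_out


rng = random.Random(0)
# Synthetic, made-up examples: ([amount ratio, new merchant flag], label), label 1 = fraud.
genuine = [([rng.uniform(0.2, 1.2), float(rng.random() < 0.1)], 0) for _ in range(160)]
fraud = [([rng.uniform(2.0, 4.0), float(rng.random() < 0.8)], 1) for _ in range(40)]
data = genuine + fraud

net = TinyNetwork(n_inputs=2, n_hidden=3, rng=rng)
for epoch in range(200):
    rng.shuffle(data)
    for x, y in data:
        net.train_step(x, y, lr=0.1)

correct = sum((net.predict(x) >= 0.5) == bool(y) for x, y in data)
print("training accuracy:", correct / len(data))
```

The accuracy printed here is measured on the same synthetic examples used for training, so it says nothing about performance on real transactions; it only shows the network weighing the indicators it was given.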
5. COMPARISON OF HIDDEN MARKOV MODEL AND NEURAL NETWORK FOR CREDIT CARD FRAUD DETECTION
In the study of credit card fraud detection using the Hidden Markov Model, an accuracy of approximately 80% was
reported [8]. In the Neural Network study, the accuracy achieved was 92.50% on the training database used [9].
The learning time for an Artificial Neural Network can be longer, while the learning time for a Hidden Markov Model
depends upon the type of clustering algorithm used.
6. CONCLUSION
The above comparison indicates that credit card fraud detection using a Neural Network achieves higher accuracy than
credit card fraud detection using a Hidden Markov Model. The research on credit card fraud detection has been carried out
on imaginary datasets rather than real data, and testing the results on real data is difficult due to the unavailability
of such data.
References
1. D.J. Hand, G. Blunt, M.G. Kelly and N.M. Adams, Data Mining for Fun and Profit, Statistical Science, vol. 15, no. 2, pp. 111-131, 2000.
2. Dr. Saleh Al-Furiah, Lamia AL-Braheem, Comprehensive study on methods of fraud prevention in credit card e-payment system.
3. Sam Maes, Karl Tuyls, Bram Vanschoenwinkel, Bernard Manderick, Credit card Fraud detection using Bayesian and Neural Networks.
4. Vladimir Zaslavsky and Anna Strizhak, Credit card Fraud Detection using self organizing maps.
5. Robert J. Richardson, Monitoring Sale Transactions for Illegal Activity.
6. L.R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.
7. Tej Paul Bhatla, Vikram Prabhu, Amit Dua, Understanding Credit Card Frauds.
8. Abhinav Srivastava, Amlan Kundu, Shamik Sural, Arun K. Majumdar, Credit card Fraud Detection using Hidden Markov Model.
9. S. Ghosh, D.L. Reilly, Credit card Fraud Detection with a Neural Network.
