Team Members
Supervisor: Deepika Bansal
Abstract:
Electronic payment systems continue to aid business transactions seamlessly
across the world, and credit cards have emerged as a dominant means of making
payments in e-payment systems. Fraud arising from credit card usage, however,
remains a major global threat to financial institutions, with several reports
and statistics laying bare the extent of the challenge. In this paper, we
implement a champion-challenger framework using three different hybrid
ensemble models and select the best model among them. Challenger 1, an
ensemble of Random Forest, AdaBoost, and LSTM, became the champion with an
accuracy of 99.86%, precision of 99.73%, recall of 99.99%, and F1-score of
99.86%.
Introduction:
Fraud can be defined as an unjust or criminal deception intended to result in
personal gain [1, 2]. A credit card is a physical medium for buying goods or
services without having cash in hand. Credit Card Fraud Detection (CCFD) is
the procedure of identifying whether a transaction is normal or abnormal.
Credit Card Crime (CCC) [1, 2] refers to stealing the identity of someone (the
victim) and carrying out fraudulent transactions in their name. Consequently,
the victim can be exposed to unforeseen consequences [3, 4].
Motivation:
Credit cards are becoming more and more popular in financial transactions,
and at the same time, fraud is also increasing. Conventional methods use
rule-based expert systems to detect fraudulent behavior. These methods,
however, are not robust, as they neglect diverse situations and the extreme
imbalance between positive and negative samples. This motivated us to work on
a hybrid ensemble champion-challenger approach that can capture the intricate
patterns of fraud behavior learned from labeled data.
Dataset:
The dataset contains transactions made with credit cards in September 2013 by
European cardholders. It covers transactions that occurred over two days, with
492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the
positive class (frauds) accounts for 0.172% of all transactions.
A few features from the dataset are visualized below. These features, labeled
V1 to V28, were obtained by applying PCA to the original data to hide private
information and protect the security of the users whose data is being used.
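As a quick sanity check on the imbalance figures quoted above, the fraud rate can be recomputed from the reported counts (492 frauds out of 284,807 transactions); this is a minimal sketch, not part of the original pipeline.

```python
# Recompute the positive-class percentage from the counts reported for the
# Kaggle credit card dataset (492 frauds out of 284,807 transactions).
frauds, total = 492, 284_807
fraud_rate = 100 * frauds / total  # percentage of positive (fraud) samples
print(f"Positive class: {fraud_rate:.3f}% of all transactions")
```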
Literature Review
Saad M. Darwish [7] offers an improved two-level credit card fraud tracking
model for imbalanced datasets, based on the semantic fusion of k-means and
the artificial bee colony (ABC) algorithm, to improve identification precision
and accelerate the convergence of detection. Experimental findings show that
the suggested model can improve the precision of ranking the risk of suspect
operations and provides higher accuracy relative to traditional techniques.
Ruttala Sailusha et al. [12] focus mainly on machine learning algorithms,
namely the Random Forest and AdaBoost algorithms. The two algorithms were
compared, and the one with the greatest accuracy, precision, recall, and
F1-score was considered the better algorithm for detecting fraud.
Kuldeep Randhawa et al. [13] discuss credit card fraud detection using
machine learning algorithms. A publicly available credit card data set was
used for evaluation with individual (standard) models and with hybrid models
built from AdaBoost and majority-voting combination methods. The Matthews
correlation coefficient (MCC) was adopted as the performance measure. A
perfect MCC score of 1 was achieved using the AdaBoost and majority-voting
methods. To further evaluate the hybrid models, noise from 10% to 30% was
added to the data samples. The majority-voting method yielded the best MCC
score of 0.942 with 30% noise added to the data set, showing that its
performance is stable in the presence of noise. All accuracy rates were above
99%, with the exception of SVM at 95.5%. The non-fraud detection rates of NB,
DT, and LIR were 100%, while the rest were close to perfect, again with the
exception of SVM.
Methodology:
For our implementation of the champion-challenger framework, we used three
hybrid ensemble models, each incorporating a deep learning component, as
challengers, from which we wish to find the champion. Since the top-performing
models for credit card fraud detection are hybrid ensemble models according to
the authors of the base paper, and a deep learning model was able to outperform
these in their testing, we make use of both ideas by incorporating a deep
learning unit in each of our hybrid ensemble classifiers.
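The selection step of the framework described above can be sketched as follows. This is a hypothetical sketch: ranking by F1-score alone is an illustrative assumption, and the challenger objects are stand-ins for the fitted hybrid ensembles.

```python
# Hypothetical sketch of champion-challenger selection: each challenger is a
# fitted classifier evaluated on held-out data, and the best one becomes the
# champion. Using F1-score as the ranking metric is an assumption.
from sklearn.metrics import f1_score

def pick_champion(challengers, X_val, y_val):
    """challengers: dict mapping a name to a fitted classifier."""
    scores = {name: f1_score(y_val, model.predict(X_val))
              for name, model in challengers.items()}
    champion = max(scores, key=scores.get)
    return champion, scores
```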
Algorithm
Random forest: The first part of this model is a random forest, which is a
combination of tree predictors such that each tree depends on the values of a
random vector sampled independently and with the same distribution for all
trees in the forest.
This model uses a random forest with 75 decision trees and a winner-takes-all
approach. The trees are standard decision tree classifiers.
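The random-forest stage above can be sketched with scikit-learn; the 75 trees come from the text, while the synthetic data and the remaining default hyperparameters are assumptions for illustration.

```python
# Sketch of the random-forest stage: 75 decision-tree classifiers whose
# majority ("winner takes all") vote gives the prediction. Synthetic,
# imbalanced data stands in for the real transactions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, weights=[0.95], random_state=0)
rf = RandomForestClassifier(n_estimators=75, random_state=0).fit(X, y)
print(rf.predict(X[:5]))
```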
LSTM: Long short-term memory recurrent neural networks (LSTM RNNs) are an
improvement over general recurrent neural networks, which suffer from a
vanishing gradient problem. LSTM RNNs address this problem by incorporating
gating functions into their state dynamics; variants that combine them with
fully convolutional networks (LSTM-FCN) or with attention (ALSTM-FCN) have
also been proposed [16].
For our hybrid ensemble model we use an LSTM with tanh as the activation
function and a dropout value of 0.2. It produces a 256-dimensional vector,
which is passed to a fully connected layer that maps the vector to another of
the same size for further processing. The model uses downsampling via smaller
output dense layers and has dropout and batch normalization layers to help
prevent overfitting and reduce training time. The architecture of this unit is
given in the accompanying diagram (dropout percentage 0.5).
For the LSTM module, dropout values of 0.2 and 0.5 were tested; for the
momentum of the batch normalization layer, 0.98 and 0.95 were tested. ReLU
activation was chosen for the fully connected layers so that false or weak
outputs are nullified immediately. Changing the value of epsilon in batch
normalization did not yield different results.
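A Keras sketch of the LSTM unit described above, assuming the hyperparameters stated in the text (tanh activation, dropout 0.2, 256-dimensional output, batch normalization momentum 0.98, ReLU dense layers). The input shape and the sizes of the downsampling dense layers are illustrative assumptions.

```python
# Sketch of the LSTM sub-module: tanh LSTM with dropout 0.2 producing a
# 256-dim vector, a same-size fully connected layer, batch normalization
# (momentum 0.98), and smaller "downsampling" dense layers. Input shape
# (1 timestep, 30 features) is an assumption.
from tensorflow.keras import layers, models

def build_lstm_unit(timesteps=1, n_features=30):
    return models.Sequential([
        layers.Input(shape=(timesteps, n_features)),
        layers.LSTM(256, activation="tanh", dropout=0.2),
        layers.Dense(256, activation="relu"),   # fully connected, same size
        layers.BatchNormalization(momentum=0.98),
        layers.Dense(64, activation="relu"),    # downsampling dense layer
        layers.Dense(1, activation="sigmoid"),  # fraud / non-fraud output
    ])
```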
The AdaBoost classifier in our model uses the Stagewise Additive Modeling
using a Multi-class Exponential loss function (SAMME) algorithm, with 100
estimators and a learning rate of 0.75. A random seed value of 0 is used for
generating random states so that the output is reproducible. The
implementation is configured to fully utilize the system resources for
training and testing, thus reducing the time taken.
Convolutional neural networks (CNNs) find applications in many fields and can
be applied to credit card fraud detection as well. CNNs in combination with
sampling techniques such as SMOTE achieve satisfying results. A
one-dimensional CNN, resembling those used in natural language processing,
can be used to extract the dataset's important features before sending them
for classification to a fully connected dense layer [18].
The CNN used in our model has three convolutional layers and two max pooling
layers, with the hyperparameters mentioned below, whose output is fed to a
fully connected dense layer. Dropout and batch normalization layers are used
to help prevent overfitting and reduce training time, after which the output
from the batch normalization layer is flattened and fed to the final fully
connected dense layer. The tail of the model summary shows a Dropout layer
(rate 0.3, output shape (None, 3, 5), 0 parameters) followed by a
BatchNormalization layer (output shape (None, 3, 5), 20 parameters).
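A Keras sketch of the CNN unit described above: three 1-D convolutional layers, two max-pooling layers, dropout (0.3) and batch normalization, then a flatten and final dense layer. The filter counts, kernel sizes, and input shape are assumptions, since the text does not fix them.

```python
# Sketch of the 1-D CNN sub-module: three Conv1D layers, two MaxPooling1D
# layers, then Dropout (0.3) and BatchNormalization before the flattened
# output feeds the final dense layer. Filter counts and kernel sizes are
# illustrative assumptions.
from tensorflow.keras import layers, models

def build_cnn_unit(n_features=30):
    return models.Sequential([
        layers.Input(shape=(n_features, 1)),
        layers.Conv1D(32, 3, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(16, 3, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(5, 3, activation="relu"),
        layers.Dropout(0.3),
        layers.BatchNormalization(),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),
    ])
```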
The core issue in SVM research is the choice of the kernel function; there is
no generally feasible and efficient method for constructing a suitable kernel
function for a specific problem. In practice, the more commonly used kernels
are the linear, polynomial, RBF (radial basis function), and sigmoid kernel
functions [19].
The SVM classifier in our model uses the RBF kernel, with the other
parameters set to their default values. Polynomial kernels with degrees 3, 5,
7, and 9 were also tried, but the SVM with the RBF kernel gave better results.
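The SVM configuration above is a one-liner in scikit-learn; the synthetic data is a stand-in for the real transactions.

```python
# Sketch of the SVM stage: RBF kernel, all other parameters at
# scikit-learn defaults, as stated in the text.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
svm = SVC(kernel="rbf").fit(X, y)
print(f"Training accuracy: {svm.score(X, y):.3f}")
```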
GBM: The gradient boosting algorithm can be used for predicting not only
continuous target variables (as a regressor) but also categorical target
variables (as a classifier).
Decision trees learn splits and try to classify instances. There are many
splits at which misclassification can occur; a decision tree may be weak at
some splits and better at others. Gradient boosting is an ensemble of many
decision trees: many weak learners are combined to make the whole model much
better at classification. The tree ensemble model is a set of classification
and regression trees (CART), and the prediction scores at each level are
summed to obtain the final predicted score [20].
We experimented with the GBM in our model using learning_rate values of 0.05,
0.01, 0.1, and 0.5; n_estimators values of 100, 150, and 200; and max_depth
values of 3, 5, 10, and 15. The performance of the model changed drastically
with the learning rate, but changes in n_estimators and max_depth did not
cause a noticeable change in performance. The optimal values for the model
are learning_rate 0.5, n_estimators 100, and max_depth 3.
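The optimal GBM configuration reported above can be sketched as follows; the synthetic data is an illustrative stand-in.

```python
# Sketch of the GBM stage with the reported optimal hyperparameters:
# learning_rate 0.5, n_estimators 100, max_depth 3.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=400, weights=[0.9], random_state=0)
gbm = GradientBoostingClassifier(learning_rate=0.5, n_estimators=100,
                                 max_depth=3, random_state=0).fit(X, y)
print(f"Training accuracy: {gbm.score(X, y):.3f}")
```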
Logistic Regression: In statistics, the logistic model (or logit model) is
used to model the probability of a certain class or event, such as pass/fail,
win/lose, alive/dead, or healthy/sick. This can be extended to model several
classes of events, such as determining whether an image contains a cat, dog,
lion, etc.; each object detected in the image would be assigned a probability
between 0 and 1, with the probabilities summing to one.
Logistic regression is a statistical model that in its basic form uses a
logistic function to model a binary dependent variable, although many more
complex extensions exist. In regression analysis, logistic (or logit)
regression estimates the parameters of a logistic model, a form of binary
regression. Mathematically, a binary logistic model has a dependent variable
with two possible values, such as pass/fail, represented by an indicator
variable whose two values are labeled "0" and "1". In the logistic model, the
log-odds (the logarithm of the odds) for the value labeled "1" is a linear
combination of one or more independent variables ("predictors"); each
independent variable can be a binary variable (two classes, coded by an
indicator variable) or a continuous variable (any real value).
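For the binary fraud/non-fraud case described above, a minimal scikit-learn sketch looks like this; the synthetic data is an assumption.

```python
# Minimal logistic-regression sketch: the model outputs class probabilities
# in [0, 1] whose log-odds are a linear combination of the predictors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, random_state=0)
lr = LogisticRegression(max_iter=1000).fit(X, y)
proba = lr.predict_proba(X[:1])[0]  # [P(class 0), P(class 1)]
print(proba)
```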
● Evaluation Metrics
The following metrics are used to gauge the performance of our system:
accuracy, precision, recall, and F1-score.
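The four metrics used throughout this report can be computed with scikit-learn; the labels below are a small illustrative example, not results from the actual experiments.

```python
# Accuracy, precision, recall, and F1-score on a small illustrative example
# (1 = fraud, 0 = normal); these are toy labels, not experimental results.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 1]
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
```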
Graphs
CNN: training and validation loss and accuracy vs. epoch.
ANN: training and validation loss and accuracy vs. epoch.
Comparison Table
Challenger 1 modules
Challenger 2 modules
Challenger 3 modules
Champion-challenger comparison
Conclusion
Credit card fraud is without a doubt an act of criminal dishonesty. This
report has listed the most common methods of fraud along with their detection
methods and has reviewed recent findings in the field. We have tackled the
problem of class imbalance present in the dataset. The report has also
explained in detail how machine learning can be applied to get better results
in fraud detection, along with the algorithm, pseudocode, an explanation of
its implementation, and the experimentation results. The proposed approach
uses three different hybrid ensemble models and then uses the
champion-challenger framework to compare them. The three challenger models
were fed into the framework, and Challenger 1 became the champion with an
accuracy of 99.86%, precision of 99.73%, recall of 99.99%, and F1-score of
99.86%.
References:
Base Paper:
Eunji Kim, Jehyuk Lee, Hunsik Shin, Hoseong Yang, Sungzoon Cho, Seung-kwan Nam,
Youngmi Song, Jeong-a Yoon, Jong-il Kim, Champion-challenger analysis for credit card fraud
detection: Hybrid ensemble and deep learning, Expert Systems with Applications, Volume 128,
2019, Pages 214-224, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2019.03.042.
Dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud
[1] Xuan S, Liu G, Li Z, Zheng L, Wang S, Jiang C. Random forest for credit card fraud
detection. ICNSC 2018 - 15th IEEE Int. Conf. Networking, Sens. Control 2018: 1–6.
[5] Johnson JM, Khoshgoftaar TM. Survey on deep learning with class imbalance. J. Big Data
2019. https://doi.org/10.1186/s40537-019-0192-5.
[6] Walke A. Comparison of Supervised and Unsupervised Fraud Detection, no. September
2013. Springer International Publishing; 2019.
[7] Darwish, S.M. A bio-inspired credit card fraud detection model based on user behavior
analysis suitable for business management in electronic banking. J Ambient Intell Human
Comput 11, 4873–4887 (2020)
[8] J. Forough and S. Momtazi, Ensemble of deep sequential models for credit card fraud
detection, Applied Soft Computing Journal (2020),
doi:https://doi.org/10.1016/j.asoc.2020.106883.
[9] Toluwase Ayobami Olowookere, Olumide Sunday Adewale, A framework for detecting credit
card fraud with cost-sensitive meta-learning ensemble approach, Scientific African, Volume 8,
2020, E00464, ISSN 2468-2276, https://doi.org/10.1016/j.sciaf.2020.e00464.
[10] Naoufal Rtayli, Nourddine Enneya, Enhanced credit card fraud detection based on
SVM-recursive feature elimination and hyper-parameters optimization, Journal of Information
Security and Applications, Volume 55, 2020, 102596, ISSN 2214-2126.
[11] Sanaz Nami, Mehdi Shajari, Cost-sensitive payment card fraud detection based on dynamic
random forest and k-nearest neighbors, Expert Systems with Applications, Volume 110, 2018,
Pages 381-392, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2018.06.011.
[12] R. Sailusha, V. Gnaneswar, R. Ramesh and G. R. Rao, "Credit Card Fraud Detection Using
Machine Learning," 2020 4th International Conference on Intelligent Computing and Control
Systems (ICICCS), 2020, pp. 1264-1270, doi: 10.1109/ICICCS48265.2020.9121114.
[13] Kuldeep Randhawa, Chu Kiong Loo, Manjeevan Seera, Chee Peng Lim, Asoke K. Nandi,
"Credit Card Fraud Detection Using AdaBoost and Majority Voting," in IEEE Access, vol. 6,
pp. 14277-14284, 2018.
[14] X. Kewei, B. Peng, Y. Jiang and T. Lu, "A Hybrid Deep Learning Model For Online Fraud
Detection," 2021 IEEE International Conference on Consumer Electronics and Computer
Engineering (ICCECE), 2021, pp. 431-434, doi: 10.1109/ICCECE51280.2021.9342110.
[16] F. Karim, S. Majumdar, H. Darabi and S. Chen, "LSTM Fully Convolutional Networks for
Time Series Classification," in IEEE Access, vol. 6, pp. 1662-1669, 2018, doi:
10.1109/ACCESS.2017.2779939.
[17] K. Randhawa, C. K. Loo, M. Seera, C. P. Lim and A. K. Nandi, "Credit Card Fraud Detection
Using AdaBoost and Majority Voting," in IEEE Access, vol. 6, pp. 14277-14284, 2018, doi:
10.1109/ACCESS.2018.2806420.
[18] Z. Zhang and S. Huang, "Credit Card Fraud Detection via Deep Learning Method Using
Data Balance Tools," 2020 International Conference on Computer Science and Management
Technology (ICCSMT), 2020, pp. 133-137, doi: 10.1109/ICCSMT51754.2020.00033.
[19] W. Xu and Y. Liu, "An Optimized SVM Model for Detection of Fraudulent Online Credit Card
Transactions," 2012 International Conference on Management of e-Commerce and
e-Government, 2012, pp. 14-17, doi: 10.1109/ICMeCG.2012.39.
[20] A. Mishra and C. Ghorpade, "Credit Card Fraud Detection on the Skewed Data Using
Various Classification and Ensemble Techniques," 2018 IEEE International Students'
Conference on Electrical, Electronics and Computer Science (SCEECS), 2018, pp. 1-5, doi:
10.1109/SCEECS.2018.8546939.
[21] Chen, T., & Guestrin, C. (2016). XGBoost. Proceedings of the 22nd ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining.
[22] R. Nayak, L.C. Jain, B.K.H. Ting, Artificial Neural Networks in Biomedical Engineering: A
Review, Editor(s): S. Valliappan, N. Khalili, Computational Mechanics–New Frontiers for the
New Millennium, Elsevier, 2001, Pages 887-892, ISBN 9780080439815,
https://doi.org/10.1016/B978-0-08-043981-5.50132-2.
[23] Asha RB, Suresh Kumar KR, Credit card fraud detection using artificial neural network,
Global Transitions Proceedings, Volume 2, Issue 1, 2021, Pages 35-41, ISSN 2666-285X,
https://doi.org/10.1016/j.gltp.2021.01.006.