Team Members
Supervisor: Deepika Bansal
Abstract:
Electronic payment systems continue to aid business transactions seamlessly
across the world, and credit cards have emerged as a dominant means of making
payments in e-payment systems. Fraud arising from credit card usage, however,
remains a major global threat to financial institutions, with several reports
and statistics laying bare the extent of the challenge. In this paper, we
implement a champion-challenger framework using three different hybrid
ensemble models and select the best model among them. Challenger 1, an
ensemble of Random Forest, AdaBoost, and LSTM, became the champion with an
accuracy of 99.86%, precision of 99.73%, recall of 99.99%, and F1-score of
99.86%.
Introduction:
Fraud can be defined as an unjust or criminal deception intended to result in
personal gain [1, 2]. A credit card is a physical medium for buying goods or
services without having cash in hand. Credit Card Fraud Detection (CCFD) is
the procedure of identifying whether a transaction is normal or abnormal.
Credit Card Crime (CCC) [1, 2] refers to stealing the identity of someone (the
victim) and carrying out fraudulent transactions in their name. Consequently,
the victim can be exposed to unforeseen consequences [3, 4].
Motivation:
Credit cards are becoming more and more popular in financial transactions,
and at the same time, fraud is also increasing. Conventional methods use
rule-based expert systems to detect fraudulent behavior. These methods,
however, are not robust, as they neglect diverse situations and the extreme
imbalance between positive and negative samples. This motivated us to work on
a hybrid ensemble champion-challenger approach that can capture the intricate
patterns of fraud behavior learned from labeled data.
Dataset:
The dataset contains transactions made with credit cards in September 2013 by
European cardholders. It covers transactions that occurred over two days, with
492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the
positive class (frauds) accounts for 0.172% of all transactions.
A few features from the dataset are visualized below. These features, labeled
V1 to V28, were obtained by applying PCA to the original data to hide private
information and protect the security of the users whose data is being used.
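As a quick sanity check on the imbalance figures quoted above, the fraud rate can be recomputed from the reported counts (492 frauds out of 284,807 transactions); this is a minimal sketch, not part of the original pipeline.

```python
# Recompute the positive-class percentage from the counts reported for the
# Kaggle credit card dataset (492 frauds out of 284,807 transactions).
frauds, total = 492, 284_807
fraud_rate = 100 * frauds / total  # percentage of positive (fraud) samples
print(f"Positive class: {fraud_rate:.3f}% of all transactions")
```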
Literature Review
Saad M. Darwish [7] offers an improved two-level credit card fraud tracking
model for imbalanced datasets, based on the semantic fusion of k-means and
the artificial bee colony (ABC) algorithm, to improve identification precision
and accelerate the convergence of detection. Experimental findings show that
the suggested model can improve the precision of ranking the risk of suspect
operations and provides higher accuracy relative to traditional techniques.
Ruttala Sailusha et al. [12] focus mainly on machine learning algorithms,
namely the Random Forest and AdaBoost algorithms. The two algorithms were
compared, and the one with the greatest accuracy, precision, recall, and
F1-score was considered the better algorithm for detecting fraud.
Kuldeep Randhawa et al. [13] discuss credit card fraud detection using
machine learning algorithms. A publicly available credit card data set was
used for evaluation with individual (standard) models and with hybrid models
built from AdaBoost and majority-voting combination methods. The Matthews
correlation coefficient (MCC) was adopted as the performance measure. A
perfect MCC score of 1 was achieved using the AdaBoost and majority-voting
methods. To further evaluate the hybrid models, noise from 10% to 30% was
added to the data samples. The majority-voting method yielded the best MCC
score of 0.942 with 30% noise added to the data set, showing that its
performance is stable in the presence of noise. All accuracy rates were above
99%, with the exception of SVM at 95.5%. The non-fraud detection rates of NB,
DT, and LIR were 100%, while the rest were close to perfect, again with the
exception of SVM.
Methodology:
For our implementation of the champion-challenger framework, we used three
hybrid ensemble models, each incorporating a deep learning component, as
challengers, from which we wish to find the champion. Since the top-performing
models for credit card fraud detection are hybrid ensemble models according to
the authors of the base paper, and a deep learning model was able to outperform
these in their testing, we make use of both ideas by incorporating a deep
learning unit in each of our hybrid ensemble classifiers.
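The selection step of the framework described above can be sketched as follows. This is a hypothetical sketch: ranking by F1-score alone is an illustrative assumption, and the challenger objects are stand-ins for the fitted hybrid ensembles.

```python
# Hypothetical sketch of champion-challenger selection: each challenger is a
# fitted classifier evaluated on held-out data, and the best one becomes the
# champion. Using F1-score as the ranking metric is an assumption.
from sklearn.metrics import f1_score

def pick_champion(challengers, X_val, y_val):
    """challengers: dict mapping a name to a fitted classifier."""
    scores = {name: f1_score(y_val, model.predict(X_val))
              for name, model in challengers.items()}
    champion = max(scores, key=scores.get)
    return champion, scores
```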
Algorithm
Random forest: The first part of this model is a random forest, which is a
combination of tree predictors such that each tree depends on the values of a
random vector sampled independently and with the same distribution for all
trees in the forest.
This model uses a random forest with 75 decision trees and a winner-takes-all
approach. The trees are standard decision tree classifiers.
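The random-forest stage above can be sketched with scikit-learn; the 75 trees come from the text, while the synthetic data and the remaining default hyperparameters are assumptions for illustration.

```python
# Sketch of the random-forest stage: 75 decision-tree classifiers whose
# majority ("winner takes all") vote gives the prediction. Synthetic,
# imbalanced data stands in for the real transactions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, weights=[0.95], random_state=0)
rf = RandomForestClassifier(n_estimators=75, random_state=0).fit(X, y)
print(rf.predict(X[:5]))
```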
LSTM: Long short-term memory recurrent neural networks (LSTM RNNs) are an
improvement over general recurrent neural networks, which suffer from a
vanishing gradient problem. LSTM RNNs address this problem by incorporating
gating functions into their state dynamics; variants that combine them with
fully convolutional networks (LSTM-FCN) or with attention (ALSTM-FCN) have
also been proposed [16].
For our hybrid ensemble model we use an LSTM with tanh as the activation
function and a dropout value of 0.2. It produces a 256-dimensional vector,
which is passed to a fully connected layer that maps the vector to another of
the same size for further processing. The model uses downsampling via smaller
output dense layers and has dropout and batch normalization layers to help
prevent overfitting and reduce training time. The architecture of this unit is
given in the accompanying diagram (dropout percentage 0.5).
For the LSTM module, dropout values of 0.2 and 0.5 were tested; for the
momentum of the batch normalization layer, 0.98 and 0.95 were tested. ReLU
activation was chosen for the fully connected layers so that false or weak
outputs are nullified immediately. Changing the value of epsilon in batch
normalization did not yield different results.
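A Keras sketch of the LSTM unit described above, assuming the hyperparameters stated in the text (tanh activation, dropout 0.2, 256-dimensional output, batch normalization momentum 0.98, ReLU dense layers). The input shape and the sizes of the downsampling dense layers are illustrative assumptions.

```python
# Sketch of the LSTM sub-module: tanh LSTM with dropout 0.2 producing a
# 256-dim vector, a same-size fully connected layer, batch normalization
# (momentum 0.98), and smaller "downsampling" dense layers. Input shape
# (1 timestep, 30 features) is an assumption.
from tensorflow.keras import layers, models

def build_lstm_unit(timesteps=1, n_features=30):
    return models.Sequential([
        layers.Input(shape=(timesteps, n_features)),
        layers.LSTM(256, activation="tanh", dropout=0.2),
        layers.Dense(256, activation="relu"),   # fully connected, same size
        layers.BatchNormalization(momentum=0.98),
        layers.Dense(64, activation="relu"),    # downsampling dense layer
        layers.Dense(1, activation="sigmoid"),  # fraud / non-fraud output
    ])
```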
The AdaBoost classifier in our model uses the Stagewise Additive Modeling
using a Multi-class Exponential loss function (SAMME) algorithm, with 100
estimators and a learning rate of 0.75. A random seed value of 0 is used for
generating random states so that the output is reproducible. The
implementation is configured to fully utilize the system resources for
training and testing, thus reducing the time taken.
Convolutional neural networks (CNNs) find applications in many fields and can
be applied to credit card fraud detection as well. CNNs in combination with
sampling techniques such as SMOTE achieve satisfying results. A
one-dimensional CNN, resembling those used in natural language processing,
can be used to extract the dataset's important features before sending them
for classification to a fully connected dense layer [18].
The CNN used in our model has three convolutional layers and two max pooling
layers, with the hyperparameters mentioned below, whose output is fed to a
fully connected dense layer. Dropout and batch normalization layers are used
to help prevent overfitting and reduce training time, after which the output
from the batch normalization layer is flattened and fed to the final fully
connected dense layer. The tail of the model summary shows a Dropout layer
(rate 0.3, output shape (None, 3, 5), 0 parameters) followed by a
BatchNormalization layer (output shape (None, 3, 5), 20 parameters).
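A Keras sketch of the CNN unit described above: three 1-D convolutional layers, two max-pooling layers, dropout (0.3) and batch normalization, then a flatten and final dense layer. The filter counts, kernel sizes, and input shape are assumptions, since the text does not fix them.

```python
# Sketch of the 1-D CNN sub-module: three Conv1D layers, two MaxPooling1D
# layers, then Dropout (0.3) and BatchNormalization before the flattened
# output feeds the final dense layer. Filter counts and kernel sizes are
# illustrative assumptions.
from tensorflow.keras import layers, models

def build_cnn_unit(n_features=30):
    return models.Sequential([
        layers.Input(shape=(n_features, 1)),
        layers.Conv1D(32, 3, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(16, 3, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(5, 3, activation="relu"),
        layers.Dropout(0.3),
        layers.BatchNormalization(),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),
    ])
```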
The core issue in SVM research is the choice of the kernel function; there is
no generally feasible and efficient method for constructing a suitable kernel
function for a specific problem. In practice, the more commonly used kernels
are the linear, polynomial, RBF (radial basis function), and sigmoid kernel
functions [19].
The SVM classifier in our model uses the RBF kernel, with the other
parameters set to their default values. Polynomial kernels with degrees 3, 5,
7, and 9 were also tried, but the SVM with the RBF kernel gave better results.
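The SVM configuration above is a one-liner in scikit-learn; the synthetic data is a stand-in for the real transactions.

```python
# Sketch of the SVM stage: RBF kernel, all other parameters at
# scikit-learn defaults, as stated in the text.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
svm = SVC(kernel="rbf").fit(X, y)
print(f"Training accuracy: {svm.score(X, y):.3f}")
```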
GBM: The gradient boosting algorithm can be used for predicting not only
continuous target variables (as a regressor) but also categorical target
variables (as a classifier).
Decision trees learn splits and try to classify instances. There are many
splits at which misclassification can occur; a decision tree may be weak at
some splits and better at others. Gradient boosting is an ensemble of many
decision trees: many weak learners are combined to make the whole model much
better at classification. The tree ensemble model is a set of classification
and regression trees (CART), and the prediction scores at each level are
summed to obtain the final predicted score [20].
We experimented with the GBM in our model using learning_rate values of 0.05,
0.01, 0.1, and 0.5; n_estimators values of 100, 150, and 200; and max_depth
values of 3, 5, 10, and 15. The performance of the model changed drastically
with the learning rate, but changes in n_estimators and max_depth did not
cause a noticeable change in performance. The optimal values for the model
are learning_rate 0.5, n_estimators 100, and max_depth 3.
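The optimal GBM configuration reported above can be sketched as follows; the synthetic data is an illustrative stand-in.

```python
# Sketch of the GBM stage with the reported optimal hyperparameters:
# learning_rate 0.5, n_estimators 100, max_depth 3.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=400, weights=[0.9], random_state=0)
gbm = GradientBoostingClassifier(learning_rate=0.5, n_estimators=100,
                                 max_depth=3, random_state=0).fit(X, y)
print(f"Training accuracy: {gbm.score(X, y):.3f}")
```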
Logistic Regression: In statistics, the logistic model (or logit model) is
used to model the probability of a certain class or event, such as pass/fail,
win/lose, alive/dead, or healthy/sick. This can be extended to model several
classes of events, such as determining whether an image contains a cat, dog,
lion, etc.; each object detected in the image would be assigned a probability
between 0 and 1, with the probabilities summing to one.
Logistic regression is a statistical model that in its basic form uses a
logistic function to model a binary dependent variable, although many more
complex extensions exist. In regression analysis, logistic (or logit)
regression estimates the parameters of a logistic model, a form of binary
regression. Mathematically, a binary logistic model has a dependent variable
with two possible values, such as pass/fail, represented by an indicator
variable whose two values are labeled "0" and "1". In the logistic model, the
log-odds (the logarithm of the odds) for the value labeled "1" is a linear
combination of one or more independent variables ("predictors"); each
independent variable can be a binary variable (two classes, coded by an
indicator variable) or a continuous variable (any real value).
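For the binary fraud/non-fraud case described above, a minimal scikit-learn sketch looks like this; the synthetic data is an assumption.

```python
# Minimal logistic-regression sketch: the model outputs class probabilities
# in [0, 1] whose log-odds are a linear combination of the predictors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, random_state=0)
lr = LogisticRegression(max_iter=1000).fit(X, y)
proba = lr.predict_proba(X[:1])[0]  # [P(class 0), P(class 1)]
print(proba)
```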
● Evaluation Metrics
The following metrics are used to gauge the performance of our system:
accuracy, precision, recall, and F1-score.
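The four metrics used throughout this report can be computed with scikit-learn; the labels below are a small illustrative example, not results from the actual experiments.

```python
# Accuracy, precision, recall, and F1-score on a small illustrative example
# (1 = fraud, 0 = normal); these are toy labels, not experimental results.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 1]
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
```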
Graphs
CNN: training and validation loss and accuracy vs. epoch.
ANN: training and validation loss and accuracy vs. epoch.
Comparison Table
Challenger 1 modules
Challenger 2 modules
Challenger 3 modules
Champion-challenger comparison
Conclusion
Credit card fraud is without a doubt an act of criminal dishonesty. This
report has listed the most common methods of fraud along with their detection
methods and has reviewed recent findings in the field. We have tackled the
problem of class imbalance present in the dataset. The report has also
explained in detail how machine learning can be applied to get better results
in fraud detection, along with the algorithm, pseudocode, an explanation of
its implementation, and the experimentation results. The proposed approach
uses three different hybrid ensemble models and then uses the
champion-challenger framework to compare them. The three challenger models
were fed into the framework, and Challenger 1 became the champion with an
accuracy of 99.86%, precision of 99.73%, recall of 99.99%, and F1-score of
99.86%.
References:
Base Paper:
Eunji Kim, Jehyuk Lee, Hunsik Shin, Hoseong Yang, Sungzoon Cho, Seung-kwan Nam,
Youngmi Song, Jeong-a Yoon, Jong-il Kim, Champion-challenger analysis for credit card fraud
detection: Hybrid ensemble and deep learning, Expert Systems with Applications, Volume 128,
2019, Pages 214-224, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2019.03.042.
Dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud
[1] Xuan S, Liu G, Li Z, Zheng L, Wang S, Jiang C. Random forest for credit card fraud
detection. ICNSC 2018 - 15th IEEE Int. Conf. Networking, Sens. Control 2018: 1–6.
[5] Johnson JM, Khoshgoftaar TM. Survey on deep learning with class imbalance. J. Big Data
2019. https://doi.org/10.1186/s40537-019-0192-5.
[6] Walke A. Comparison of Supervised and Unsupervised Fraud Detection, no. September
2013. Springer International Publishing; 2019.
[7] Darwish, S.M. A bio-inspired credit card fraud detection model based on user behavior
analysis suitable for business management in electronic banking. J Ambient Intell Human
Comput 11, 4873–4887 (2020)
[8] J. Forough and S. Momtazi, Ensemble of deep sequential models for credit card fraud
detection, Applied Soft Computing Journal (2020),
doi:https://doi.org/10.1016/j.asoc.2020.106883.
[9] Toluwase Ayobami Olowookere, Olumide Sunday Adewale, A framework for detecting credit
card fraud with cost-sensitive meta-learning ensemble approach, Scientific African, Volume 8,
2020, E00464, ISSN 2468-2276, https://doi.org/10.1016/j.sciaf.2020.e00464.
[10] Naoufal Rtayli, Nourddine Enneya, Enhanced credit card fraud detection based on
SVM-recursive feature elimination and hyper-parameters optimization, Journal of Information
Security and Applications, Volume 55, 2020, 102596, ISSN 2214-2126.
[11] Sanaz Nami, Mehdi Shajari, Cost-sensitive payment card fraud detection based on dynamic
random forest and k-nearest neighbors, Expert Systems with Applications, Volume 110, 2018,
Pages 381-392, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2018.06.011.
[12] R. Sailusha, V. Gnaneswar, R. Ramesh and G. R. Rao, "Credit Card Fraud Detection Using
Machine Learning," 2020 4th International Conference on Intelligent Computing and Control
Systems (ICICCS), 2020, pp. 1264-1270, doi: 10.1109/ICICCS48265.2020.9121114.
[13] Kuldeep Randhawa, Chu Kiong Loo, Manjeevan Seera, Chee Peng Lim, Asoke K. Nandi,
"Credit Card Fraud Detection Using AdaBoost and Majority Voting," in IEEE Access, vol. 6,
pp. 14277-14284, 2018.
[14] X. Kewei, B. Peng, Y. Jiang and T. Lu, "A Hybrid Deep Learning Model For Online Fraud
Detection," 2021 IEEE International Conference on Consumer Electronics and Computer
Engineering (ICCECE), 2021, pp. 431-434, doi: 10.1109/ICCECE51280.2021.9342110.
[16] F. Karim, S. Majumdar, H. Darabi and S. Chen, "LSTM Fully Convolutional Networks for
Time Series Classification," in IEEE Access, vol. 6, pp. 1662-1669, 2018, doi:
10.1109/ACCESS.2017.2779939.
[17] K. Randhawa, C. K. Loo, M. Seera, C. P. Lim and A. K. Nandi, "Credit Card Fraud Detection
Using AdaBoost and Majority Voting," in IEEE Access, vol. 6, pp. 14277-14284, 2018, doi:
10.1109/ACCESS.2018.2806420.
[18] Z. Zhang and S. Huang, "Credit Card Fraud Detection via Deep Learning Method Using
Data Balance Tools," 2020 International Conference on Computer Science and Management
Technology (ICCSMT), 2020, pp. 133-137, doi: 10.1109/ICCSMT51754.2020.00033.
[19] W. Xu and Y. Liu, "An Optimized SVM Model for Detection of Fraudulent Online Credit Card
Transactions," 2012 International Conference on Management of e-Commerce and
e-Government, 2012, pp. 14-17, doi: 10.1109/ICMeCG.2012.39.
[20] A. Mishra and C. Ghorpade, "Credit Card Fraud Detection on the Skewed Data Using
Various Classification and Ensemble Techniques," 2018 IEEE International Students'
Conference on Electrical, Electronics and Computer Science (SCEECS), 2018, pp. 1-5, doi:
10.1109/SCEECS.2018.8546939.
[21] Chen, T., & Guestrin, C. (2016). XGBoost. Proceedings of the 22nd ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining.
[22] R. Nayak, L.C. Jain, B.K.H. Ting, Artificial Neural Networks in Biomedical Engineering: A
Review, Editor(s): S. Valliappan, N. Khalili, Computational Mechanics–New Frontiers for the
New Millennium, Elsevier, 2001, Pages 887-892, ISBN 9780080439815,
https://doi.org/10.1016/B978-0-08-043981-5.50132-2.
[23] Asha RB, Suresh Kumar KR, Credit card fraud detection using artificial neural network,
Global Transitions Proceedings, Volume 2, Issue 1, 2021, Pages 35-41, ISSN 2666-285X,
https://doi.org/10.1016/j.gltp.2021.01.006.