

Abstract—The threat posed by financial transaction fraud to organizations and individuals has prompted the development of cutting-edge methods for detection and prevention. This research study explores the use of real-time monitoring systems and machine learning algorithms to improve fraud detection and prevention in financial transactions. The paper addresses the drawbacks of conventional rule-based systems, explains why real-time monitoring and machine learning should be used, and describes the goals of the research. A thorough literature study is conducted to understand current methodologies and pinpoint research gaps. The suggested approach includes data preparation, feature engineering, dimensionality reduction, and the application of machine learning models built into a real-time monitoring system. Results are assessed using performance measures and contrasted with the performance of current systems. Two proactive fraud prevention strategies, adaptive thresholds and dynamic risk scoring, are also investigated. Considerations for scalability and deployment, including data security and legal compliance, are covered as well. The study suggests areas for additional research in this field and helps to design reliable fraud detection systems.

Keywords—fraud detection, fraud prevention, machine learning, real-time monitoring, financial transactions

I. INTRODUCTION

For organizations, financial institutions, and people everywhere, detecting and preventing fraud in financial transactions is a top priority. The need to investigate more sophisticated techniques has arisen as increasingly sophisticated fraud has made clear the limitations of conventional rule-based systems. This study explores how real-time monitoring systems and machine learning algorithms can be used to improve financial transaction fraud detection and prevention capabilities. The importance of fraud prevention and detection in financial transactions has been extensively discussed in the literature. In addition to causing significant financial losses, financial fraud erodes public faith in the financial system (Association of Certified Fraud Examiners, 2020). Traditional rule-based systems look for suspected fraudulent actions using predetermined rules and patterns, but they struggle to adjust to new and developing fraud strategies, which results in many false negatives and potential financial losses (Kumar et al., 2020). The use of machine learning algorithms has drawn a lot of interest as a solution to these restrictions. Large volumes of transactional data can be automatically mined for patterns and abnormalities using machine learning algorithms, leading to more precise and adaptable fraud detection. Financial institutions can examine past transactional data to find trends linked to fraudulent actions by utilizing machine learning techniques such as supervised learning, unsupervised learning, and deep learning (Dal Pozzolo et al., 2015). Additionally, by continuously monitoring transactions in real time and sending out notifications for suspected fraud, the integration of real-time monitoring systems improves fraud detection (Bolton et al., 2011). With timely action made possible by this proactive strategy, potential losses and damages are reduced. The necessity for a more effective and efficient strategy to counteract changing fraud tactics is what motivates the use of machine learning algorithms and real-time monitoring systems. Financial fraud is dynamic, necessitating adaptable systems that can recognize emerging trends and abnormalities. Machine learning algorithms make it possible to detect complex and changing fraud patterns, allowing for early identification and prevention (Phua et al., 2010). In addition to machine learning, real-time monitoring systems offer fast response capabilities, enabling prompt intervention to stop fraudulent transactions (Kou et al., 2020).

1.1 Research Objectives
1. Investigate the use of machine learning algorithms for fraud detection in financial transactions.
2. Design and develop a real-time monitoring system for continuous fraud detection and prevention.
3. Assess the performance of the suggested approach in comparison to conventional rule-based systems.
4. Explore proactive measures for fraud prevention, such as dynamic risk scoring and adaptive thresholds.
5. Analyse scalability and deployment considerations for implementing the proposed system in real-world financial institutions.

1.2 Research Questions
1. How can machine learning algorithms be used in financial transactions to spot and stop fraud?
2. What effect do real-time monitoring systems have on the capacity for fraud detection and prevention?
3. How effective and accurate at detecting fraud is the suggested method compared to conventional rule-based systems?
4. What preventative measures can be built into the system to stop fraud before it happens?
5. What factors need to be considered while deploying the suggested system in actual financial institutions?

II. LITERATURE REVIEW

In recent years, there has been a lot of study on applying machine learning algorithms to detect fraud in financial transactions. Various strategies and algorithms have been examined in several studies to increase the precision and effectiveness of fraud detection systems. This section reviews earlier studies and research articles in the field, addressing the benefits and drawbacks of various strategies while identifying the gaps in the body of knowledge that the current study seeks to fill.

2.1 Supervised Learning Approaches

A fraud detection system based on logistic regression was proposed by Buczak & Guven (2016). The study showed that logistic regression is useful for spotting fraudulent transactions. Logistic regression, a popular classification approach, models the association between input features and the likelihood that a transaction is fraudulent. It is a desirable option for fraud detection systems because of its readability and simplicity. Another well-liked supervised learning strategy for fraud detection is decision trees. To categorize occurrences as fraudulent or authentic, decision tree algorithms, such as the C4.5 algorithm, build a tree-like model that divides the dataset depending on feature values. Because they can manage non-linear correlations between features and the target variable, decision trees have the advantage of being ideal for identifying intricate fraud patterns. The ability of Support Vector Machines (SVMs) to handle high-dimensional data and nonlinear relationships has led to their use in fraud detection as well. SVMs look for an ideal hyperplane that can distinguish between fraudulent and legal transactions with the greatest margin. Even when dealing with unbalanced datasets, SVMs have been shown to perform well at classifying fraudulent transactions. Although these supervised learning algorithms are easy to use and interpret, they can have trouble spotting fraud. The complexity of fraud patterns is one of the biggest problems: the techniques used by fraudsters are constantly changing, creating complex and dynamic fraud patterns that these algorithms find challenging to detect. The unbalanced character of fraud datasets, where the proportion of legal transactions is noticeably higher than that of fraudulent transactions, presents another difficulty. The model may be biased toward the majority class (legal transactions) because of unbalanced datasets, which leads to decreased performance in identifying the minority class (fraudulent transactions). Techniques such as the Synthetic Minority Over-sampling Technique (SMOTE), which oversamples the minority class, or under-sampling the majority class have been suggested as solutions to the problem of unbalanced data. These methods seek to improve the identification of fraudulent transactions while balancing the distribution of classes.
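As an illustration of how such rebalancing can be applied, the following Python sketch oversamples the minority class with SMOTE from the imbalanced-learn package; the synthetic dataset and the 1% fraud rate are placeholder assumptions rather than values from this study.

import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Synthetic stand-in for a transaction dataset with roughly 1% fraud.
X, y = make_classification(n_samples=5000, n_features=10,
                           weights=[0.99, 0.01], random_state=42)

smote = SMOTE(random_state=42)
X_resampled, y_resampled = smote.fit_resample(X, y)

print("Class counts before:", np.bincount(y))            # heavily skewed
print("Class counts after: ", np.bincount(y_resampled))  # balanced classes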
2.2 Unsupervised Learning Approaches

For spotting fraud in numerous domains, unsupervised learning techniques such as clustering and anomaly detection have been investigated. The goal of these strategies, which do not require labelled data, is to find patterns and anomalies in the data that may point to fraudulent activity. Clustering algorithms were used in a study by Ranshous et al. (2015) to identify fraud. To find clusters of connected fraudulent transactions, the authors used clustering techniques, which made it possible to spot trends and similarities in fraudulent behaviour. This method is especially beneficial for identifying novel or previously unidentified fraud patterns that may not be picked up by predetermined rules or labelled data. Unsupervised learning techniques have the advantage of being able to adapt to new fraud methods without relying on predetermined labels; they can find irregularities and patterns in the data that may be signs of fraud. However, unsupervised learning techniques face considerable difficulties due to their higher false positive rate when compared to supervised methods. Unsupervised models have a high rate of false positives because they can classify genuine transactions as anomalies or find clusters that include both valid and fraudulent transactions. Another drawback is the challenge of identifying specific fraud incidents. While unsupervised learning techniques offer a more comprehensive perspective of fraud tendencies, they can fall short of the level of detail needed to pinpoint fraudulent transactions or the participants. To recognize and authenticate specific fraud cases, further research and analysis are frequently required. Hybrid methods that blend supervised and unsupervised techniques have been developed to address the issues of false positives and the difficulty of identifying specific fraud instances.
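To make the anomaly detection idea concrete, the sketch below scores transactions with an Isolation Forest from scikit-learn; the column names and the contamination rate are illustrative assumptions, not properties of the datasets surveyed above.

import pandas as pd
from sklearn.ensemble import IsolationForest

# Toy transaction table; in practice this would hold engineered features.
transactions = pd.DataFrame({
    "amount":  [12.5, 80.0, 45.3, 9000.0, 22.1],
    "balance": [500.0, 1200.0, 300.0, 150.0, 800.0],
})

# contamination = expected share of anomalous (potentially fraudulent) records
model = IsolationForest(contamination=0.1, random_state=42)
labels = model.fit_predict(transactions)    # -1 = anomaly, 1 = normal
scores = model.score_samples(transactions)  # lower score = more anomalous

print(transactions[labels == -1])           # records flagged for review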
2.3 Hybrid Approaches

In fraud detection research, hybrid systems that blend supervised and unsupervised techniques have gained popularity. These solutions try to exploit the advantages of both tactics while addressing the weaknesses of each, such as high false positive rates or the inability to manage intricate fraud patterns. A hybrid fraud detection system with integrated clustering and classification algorithms was proposed by Bhattacharyya et al. (2018). Once the clustering algorithm had identified groups of similar transactions, the classification technique was used to separate fraudulent from valid transactions inside each cluster. When compared to employing either strategy alone, their hybrid model showed enhanced fraud detection performance. The benefit of hybrid techniques is their capacity for both supervised learning to capture well-known fraud patterns and unsupervised learning to detect new fraud patterns. Hybrid models seek to increase fraud detection accuracy while lowering false positives by incorporating the best features of both approaches. However, using hybrid models in practical settings is not without its difficulties. When compared to individual approaches, these models are typically more intricate and computationally intensive. Large-scale implementation may be more difficult because of the need for additional resources and knowledge for the integration and coordination of multiple algorithms.
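The hybrid idea can be sketched in a few lines of Python: cluster the transactions first, then pass the cluster assignment to a supervised classifier as an additional feature. This is a simplified illustration of the general approach, not a reproduction of the system proposed by Bhattacharyya et al. (2018); the data and parameters are placeholders.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=8,
                           weights=[0.97, 0.03], random_state=0)

# Step 1: unsupervised grouping of similar transactions.
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

# Step 2: supervised classification with the cluster id as an extra feature.
X_hybrid = np.column_stack([X, clusters])
X_train, X_test, y_train, y_test = train_test_split(
    X_hybrid, y, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))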
2.4 Deep Learning Approaches

Due to their effectiveness in extracting complicated patterns from vast amounts of data, deep learning models, particularly neural networks, have drawn a lot of interest in the field of fraud detection. In a thorough review of data mining-based fraud detection research, Phua et al. (2010) emphasized the efficiency of neural networks in identifying credit card fraud. Deep learning methods based on neural networks have demonstrated exceptional performance in detecting credit card fraud. Even complex fraud patterns that are difficult for people or conventional machine learning algorithms to recognize can be detected by these models, which can automatically learn and capture key attributes. Deep neural networks can successfully extract high-level representations of the input data by using numerous layers of interconnected nodes (neurons), enabling precise fraud detection. However, there are a few things to consider when using deep learning models for fraud detection. First, for deep learning models to operate at their best, a lot of labelled training data is frequently necessary. In the area of fraud detection, gathering an extensive and precisely annotated dataset can be difficult because fraudulent instances are far rarer than valid ones. To lessen the problem of imbalanced datasets, sophisticated sampling techniques and data augmentation approaches might be used. Second, training and optimizing deep learning models can be computationally taxing and may call for a lot of processing power. Large datasets and complex neural architectures may require specialized hardware or distributed computing resources in order to train models effectively. Despite these difficulties, deep learning approaches such as convolutional neural networks and recurrent neural networks have advanced and continue to help fraud detection systems become more effective. The goal of ongoing research is to improve the effectiveness of deep learning models for fraud detection, including developing lightweight architectures, model compression methods, and transfer learning methods.
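As a minimal illustration, the following Keras sketch defines a small feed-forward network for binary fraud classification; the layer sizes, training settings, and random placeholder data are assumptions for demonstration and do not describe the architectures evaluated in the studies cited above.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(1000, 10).astype("float32")        # placeholder features
y = (np.random.rand(1000) < 0.02).astype("float32")   # ~2% positive labels

model = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),             # fraud probability
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[keras.metrics.Precision(), keras.metrics.Recall()])
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)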
Despite the advancements made in machine learning-based fraud detection, the current study tries to fill several gaps in the literature. These gaps include the following:
1. Limited attention paid to real-time fraud detection: While real-time fraud detection calls for prompt identification and prevention during live transactions, much existing research concentrates on offline analysis of past data.
2. Insufficient attention to temporal aspects: Although they frequently go unnoticed, time-dependent characteristics and temporal dependencies in financial transactions are vital for spotting fraud.
3. Lack of consideration for interpretability and explainability: To win the trust of stakeholders and meet regulatory obligations, it is crucial to offer explanations and interpretability as machine learning models get increasingly complicated.
4. Inadequate analysis of unbalanced datasets: In fraud detection, where there are far fewer cases of fraud than there are of valid transactions, unbalanced datasets are typical. Further research is required to determine how well current approaches perform on unbalanced data.
2.5 Feature Extraction

Feature extraction is the process of building new features out of existing ones to capture additional information. The following methods are frequently employed for feature extraction in financial transaction data:
• Aggregation: Summarization of transaction data over predetermined time periods (e.g., daily, weekly) in order to extract characteristics such as the total number of transactions, the average frequency of transactions, or the maximum transaction amount.
• Time-Based Features: Extraction of temporal data, such as the day of the week, the hour of the day, or the amount of time since the last transaction, using transaction timestamps.
• Statistical Features: Calculation of statistical measures of transaction amounts or other pertinent variables, such as mean, standard deviation, and skewness.
• Text Mining: Extraction of terms or patterns from text-based fields, such as transaction descriptions, that may be indicators of fraud.
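The aggregation and time-based features listed above can be derived with pandas as sketched below; the column names ('customer_id', 'amount', 'timestamp') are assumed stand-ins for the fields of a transaction dataset.

import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [20.0, 35.5, 100.0, 250.0, 40.0],
    "timestamp": pd.to_datetime([
        "2023-01-01 09:00", "2023-01-01 18:30", "2023-01-02 10:15",
        "2023-01-02 11:00", "2023-01-03 23:45"]),
})

# Time-based features extracted from the transaction timestamp.
df["hour"] = df["timestamp"].dt.hour
df["day_of_week"] = df["timestamp"].dt.dayofweek
df = df.sort_values(["customer_id", "timestamp"])
df["secs_since_last_txn"] = (
    df.groupby("customer_id")["timestamp"].diff().dt.total_seconds())

# Aggregation and statistical features per customer.
per_customer = df.groupby("customer_id")["amount"].agg(
    txn_count="count", avg_amount="mean", max_amount="max", std_amount="std")
print(per_customer)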
2.6 Dimensionality Reduction

Dimensionality reduction techniques reduce the number of features in a dataset while keeping the most crucial information. This helps combat computational complexity and the "curse of dimensionality." Frequently employed dimensionality reduction techniques include:
• Principal Component Analysis (PCA): The original features are converted into a new set of uncorrelated variables (principal components) that account for most of the variance in the data.
• Linear Discriminant Analysis (LDA): A supervised dimensionality reduction technique that maximizes the separation between classes while minimizing within-class variation.
• t-Distributed Stochastic Neighbour Embedding (t-SNE): A non-linear technique, frequently used for visualization, that preserves the data's local structure while lowering its dimensionality.
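For example, PCA can be applied with scikit-learn as in the sketch below, standardizing the features and keeping enough principal components to explain roughly 95% of the variance; the data and the 95% threshold are illustrative assumptions.

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X = np.random.rand(500, 20)               # placeholder transaction features

X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.95)              # keep ~95% of the variance
X_reduced = pca.fit_transform(X_scaled)

print("Original dimensions:", X.shape[1], "-> reduced:", X_reduced.shape[1])
print("Explained variance ratio:", pca.explained_variance_ratio_.sum())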
III. METHODOLOGY

3.1 Dataset Description:
The dataset used for the research is a synthetic dataset generated for the purpose of this study (Appendix 1). It contains information about financial transactions, including transaction IDs, customer IDs, transaction amounts, transaction timestamps, regions, states, customer categories, and account balances. The dataset consists of 10,000 records and includes characteristics such as geographical information, customer profiles, and transaction details.

3.2 Preprocessing Steps:
Before applying machine learning algorithms for fraud detection, several preprocessing steps were employed to clean and transform the data. These steps are as follows:
• Handling missing values: Identify and handle any missing values in the dataset, either by imputing them or removing the corresponding records.
• Data normalization: Scale numerical features such as transaction amounts and account balances to a common range to ensure they have a similar impact during model training.
• Encoding categorical variables: Convert categorical variables like regions, states, and customer categories into numerical representations using techniques such as one-hot encoding or label encoding.
• Feature selection: Identify and select the most relevant features that contribute significantly to fraud detection, considering their impact and reducing computational complexity.
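A minimal scikit-learn sketch of these preprocessing steps is shown below; the column names mirror the dataset description, but the exact imputation and scaling choices are assumptions rather than the settings used in this study.

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

numeric_cols = ["transaction_amount", "account_balance"]
categorical_cols = ["region", "state", "customer_category"]

numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # handle missing values
    ("scale", StandardScaler()),                    # data normalization
])
categorical_pipe = Pipeline([
    ("onehot", OneHotEncoder(handle_unknown="ignore")),  # encode categories
])

preprocessor = ColumnTransformer([
    ("num", numeric_pipe, numeric_cols),
    ("cat", categorical_pipe, categorical_cols),
])

raw_df = pd.DataFrame({                    # tiny placeholder sample
    "transaction_amount": [20.0, None, 150.0],
    "account_balance": [500.0, 900.0, 1200.0],
    "region": ["North", "South", "North"],
    "state": ["A", "B", "A"],
    "customer_category": ["Low-Profile", "High-Profile", "Medium-Profile"],
})
X_processed = preprocessor.fit_transform(raw_df)
print(X_processed.shape)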
3.3 Exploratory Data Analysis:
Data visualization can be a valuable step to gain insights into the dataset and understand its characteristics. The visualization techniques applied were:
• Histograms: Plotting histograms provides an overview of the distribution of numerical features such as transaction amounts and account balances.
• Bar plots: Visualizing categorical variables like regions, states, and customer categories using bar plots helps in understanding their frequency distribution.
• Scatter plots: Plotting transaction amounts against account balances can reveal potential patterns or outliers.
• Heatmaps: Using a heatmap, correlations between different features can be explored, which helps identify relationships and potential predictors of fraud.
By visualizing the data, it becomes easier to identify any anomalies, outliers, or patterns that may require further investigation or preprocessing before training the machine learning models.
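The visualizations above can be generated with pandas and matplotlib, as sketched below on randomly generated placeholder data; the column names are assumptions that mirror the dataset description.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "transaction_amount": rng.exponential(100, 500),
    "account_balance": rng.normal(1000, 250, 500),
    "region": rng.choice(["North", "South", "East", "West"], 500),
})

fig, axes = plt.subplots(2, 2, figsize=(10, 8))
df["transaction_amount"].hist(ax=axes[0, 0], bins=30)           # histogram
df["region"].value_counts().plot(kind="bar", ax=axes[0, 1])     # bar plot
axes[1, 0].scatter(df["transaction_amount"],                    # scatter plot
                   df["account_balance"], s=8)
corr = df[["transaction_amount", "account_balance"]].corr()
im = axes[1, 1].imshow(corr.to_numpy(), cmap="coolwarm",        # heatmap
                       vmin=-1, vmax=1)
fig.colorbar(im, ax=axes[1, 1])
plt.tight_layout()
plt.show()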
3.4 Feature Engineering and Dimensionality Reduction:
The chosen feature engineering approaches and dimensionality reduction techniques should be aligned with the specific properties of the financial transaction data and the goals of fraud detection. The following methods were adopted:
• Feature selection: By focusing on the most crucial elements that helped with fraud detection, we scanned through the data to identify and discard noise. This lessened the possibility of overfitting while also enhancing the model's accuracy and interpretability.
• Feature extraction: Transaction data frequently contains important information that may not be readily captured by the raw features. Meaningful representations were created to identify significant fraud-related patterns or trends.
• Dimensionality reduction: Datasets related to financial transactions may be highly dimensional, which increases computing complexity and raises the possibility of overfitting. Dimensionality reduction methods reduced the number of features while retaining the most important information, which helped to address these problems.
The trade-off between model performance and interpretability was considered while choosing these strategies. Higher predictive accuracy may be obtained using more sophisticated approaches such as deep learning or ensemble methods, but they may also be more difficult to comprehend. To balance model complexity, interpretability, and computing efficiency, one must consider both the resources at hand and the needs of the fraud detection system.
3.5 Machine Learning Algorithms:
The selection and implementation of machine learning algorithms for fraud detection depend on the specific requirements of the problem and the characteristics of the dataset. In this research, the following algorithms were applied:
• Logistic Regression: This algorithm is suitable for binary classification tasks and can provide interpretable results.
• Decision Trees: Decision trees can capture non-linear relationships and are effective in handling categorical features.
• Random Forest: This ensemble method combines multiple decision trees to improve accuracy and handle complex fraud patterns.
• Support Vector Machines (SVM): SVMs can handle high-dimensional data and are effective in separating classes with a clear margin.
The four algorithms were used to establish the best possible result, along with the associated algorithm and the applicable hyperparameters.
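A hedged sketch of how the four classifiers can be trained and compared with cross-validated F1 scores is given below; the hyperparameter values and the synthetic data are illustrative, not the tuned settings reported in this study.

from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=3000, n_features=12,
                           weights=[0.97, 0.03], random_state=1)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=6),
    "random_forest": RandomForestClassifier(n_estimators=200),
    "svm": SVC(kernel="rbf", class_weight="balanced"),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")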
3.6 Solution Deployment:
Deploying the machine learning models for fraud detection in a production setting comes next after they have been trained and assessed. The following main factors for algorithm deployment were applied:
• Model serialization: A format was created that makes it simple to load and use the trained machine learning models during deployment by serializing them. Pickle files, joblib files, or serialized representations particular to the chosen machine learning framework are examples of common formats.
The final machine learning model was deployed to a local device, which simulates the on-premises scenario.
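A minimal serialization sketch with joblib is shown below; the estimator and file name are placeholders for whichever final model is selected.

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=50).fit(X, y)

joblib.dump(model, "fraud_model.joblib")       # serialize to disk
restored = joblib.load("fraud_model.joblib")   # reload at deployment time
print(restored.predict(X[:5]))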
3.7 Model Deployment Options:
Machine learning models can be deployed in a variety of ways, depending on the infrastructure and needs:
• On-Premises Deployment: Setting up the models on the organization's own local servers or infrastructure.
• Cloud Deployment: Hosting the models on cloud infrastructure like AWS, Azure, or Google Cloud.
• Containerization: Packing the models into containers (e.g., Docker) for scalability and simple deployment.
• Serverless Deployment: This method involves deploying the models as functions using serverless platforms (such as AWS Lambda and Google Cloud Functions).

API Development:
To expose the deployed models, a microservice or an API endpoint was created. This made it possible for other programs or systems to communicate with the fraud detection models and make predictions. Transaction data are accepted as input by the API, which then outputs an estimated fraud probability or binary label.
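The sketch below shows one possible minimal prediction endpoint using Flask; the route name, payload fields, decision threshold, and model file are assumptions for illustration rather than the interface implemented in this study.

import joblib
import numpy as np
from flask import Flask, request, jsonify

app = Flask(__name__)
model = joblib.load("fraud_model.joblib")   # previously serialized model

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()                         # {"features": [...]}
    features = np.array(payload["features"]).reshape(1, -1)
    probability = float(model.predict_proba(features)[0, 1])
    return jsonify({"fraud_probability": probability,
                    "is_fraud": probability > 0.5})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)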
Scalability and Effectiveness:
The solution was developed to handle increasing transaction volumes in real time. To increase performance and scalability, strategies such as load balancing, caching, and parallel processing are suggested.

Monitoring and Logging Systems:
Monitoring and logging systems were implemented to keep track of the operation and behaviour of the deployed models. This entailed logging all input information, predictions, and runtime faults or exceptions. Continuous improvement is made possible via monitoring, which helps find any drift in model performance over time.

Security Considerations:
Proper security precautions were applied to safeguard the deployed models and the data they analyse. Access controls, encryption of sensitive data, and frequent security audits may all be necessary for this.

Versioning and Updates:
A versioning mechanism for the deployed models was created to keep track of changes and simplify future updates. To adapt to changing fraud tendencies, automated pipelines are suggested for model updates and retraining.

A/B Testing and Evaluation:
A/B testing was performed to compare the performance of the deployed models against a baseline or alternative approaches, together with continuous evaluation of the effectiveness of the deployed models using relevant metrics including precision, recall, and F1-score.
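The evaluation metrics named above can be computed with scikit-learn as in the short sketch below; the label arrays are placeholders rather than outputs of the deployed system.

from sklearn.metrics import (precision_score, recall_score, f1_score,
                             classification_report)

y_true = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]   # ground-truth labels (1 = fraud)
y_pred = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0]   # labels from the deployed model

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1-score: ", f1_score(y_true, y_pred))
print(classification_report(y_true, y_pred,
                            target_names=["legitimate", "fraud"]))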
Continuous Improvement:
Feedback loops were incorporated to collect labelled data on detected fraud cases and use it to improve the models. This iterative process helps enhance the accuracy and effectiveness of the fraud detection system over time.

IV. RESULTS AND FINDINGS

The bar plot reveals the distribution of customer categories in the dataset. The x-axis represents the different customer categories, and the y-axis represents the count of customers in each category. The following observations can be made from the plot:
Low-Profile: This category has the highest count, indicating that a significant portion of the customers falls into this category.
Medium-Profile: The count of customers in this category is moderately high, suggesting a considerable presence.
High-Profile: This category has a relatively low count compared to the others, indicating a smaller proportion of customers.
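For reference, a bar plot of this kind can be produced with pandas and matplotlib as sketched below; the category counts are placeholders, not the actual values from the dataset.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"customer_category":
                   ["Low-Profile"] * 60 + ["Medium-Profile"] * 30
                   + ["High-Profile"] * 10})

counts = df["customer_category"].value_counts()
counts.plot(kind="bar", rot=0)
plt.xlabel("Customer category")
plt.ylabel("Count of customers")
plt.title("Distribution of customer categories")
plt.tight_layout()
plt.show()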
Implications: The distribution of customer categories provides valuable insights into the customer base. The dominance of the Low-Profile category suggests that most customers in the dataset have low transaction activity or account balances. On the other hand, the presence of the Medium-Profile and High-Profile categories indicates the existence of customers with relatively higher transaction activity or account balances.

Understanding the distribution of customer categories can be useful for various purposes, such as targeted marketing campaigns, customer segmentation, and fraud detection. Further analysis can be performed to explore the relationships between customer categories and other variables in the dataset. It is important to note that this analysis is based on the given dataset and may not represent the entire population accurately. Additional data and more comprehensive analysis can provide deeper insights into customer categories and their significance in the context of the domain. In conclusion, the categorical analysis of the 'customer_category' variable provides a high-level understanding of the distribution of customer categories within the dataset. The bar plot visually represents the counts of each category, highlighting the dominance of the Low-Profile category and the presence of the Medium-Profile and High-Profile categories.
ACKNOWLEDGMENT

I would like to express my sincere gratitude to the following individuals and institutions for their invaluable contributions to the completion of my project, IntelliAI.

Gurpreet Singh Panesar, Asst. Professor, AIT-CSE: Your unwavering guidance, expert feedback, and constant encouragement were instrumental in shaping the direction and quality of our project. Your willingness to share your knowledge and insights was truly inspiring.

Committee Members:
Afridi Haque, UID - 20BCS3879, AIT-CSE (20BDA2-A)
Amninder Sran, UID - 20BCS3959, AIT-CSE (20BDA2-B)
Laksh Walia, UID - 20BCS3889, AIT-CSE (20BDA2-A)
Akshat Shrivasta, UID - 20BCS3888, AIT-CSE (20BDA2-A)

I am deeply appreciative of your insightful comments, constructive suggestions, and thorough review of my project. Your expertise and valuable perspectives helped to refine my work and enhance its overall impact. I am fortunate to have collaborated with these talented individuals. Their expertise, dedication, and collaborative spirit were essential to the successful completion of this project. I would like to extend my heartfelt gratitude to all the individuals who participated in my research study. Their willingness to share their time and experiences was invaluable to my research.

CONCLUSION

This study examined numerous methods to address the pressing issue of financial transaction fraud detection and prevention. To identify fraudulent activity, the study looked at the use of supervised learning algorithms, unsupervised learning algorithms, and hybrid approaches. In addition, the capacity to recognize intricate fraud patterns was tested for deep learning models, notably neural networks. The study also stressed the significance of incorporating machine learning models into real-time monitoring to create a reliable fraud detection system.

Research Contributions and Findings:
The research's conclusions showed that each strategy has advantages and disadvantages. While demonstrating interpretability and ease of use, supervised learning methods such as logistic regression and decision trees struggled with complicated fraud patterns and unbalanced datasets. Clustering and anomaly detection are two unsupervised learning approaches that excel at spotting novel or undiscovered fraud trends but have a high rate of false positives and are unable to identify specific fraud instances. Although hybrid approaches sought to integrate the best features of both supervised and unsupervised techniques, their complexity and processing requirements made large-scale deployment difficult. By extracting complex patterns from enormous volumes of data, deep learning models, in particular neural networks, showed promise in the detection of fraud, but they needed a lot of labelled data and processing power for efficient training.

Future Study and Developments:
a) Despite the advancements gained in this research, there are still a number of opportunities for system improvement and exploration in the future.
b) Examine the usage of ensemble models, like Random Forest or Gradient Boosting Machines, to combine the advantages of many methods and raise the accuracy of fraud detection.
c) Focus on creating more explainable AI models to offer insights into how fraud detection judgments are made, improving system transparency and trust.
d) Investigate the use of online learning strategies to modify the fraud detection system in real time as new data becomes available, enhancing its response to changing fraud patterns.
e) Investigate how deep reinforcement learning can be used to detect fraud. Through interactions with its environment, the system can learn the best practices for preventing fraud.
f) Enhanced Data Preprocessing: Improve the training dataset's quality by further refining data preprocessing procedures to manage missing or noisy data.
g) Integration with External Data Sources: To improve the fraud detection process, consider integrating external data sources, such as social media data or transaction history from partner institutions.
h) Develop a thorough system for continual monitoring, evaluation, and modification to accommodate new fraud schemes and guarantee the system's continued applicability.

REFERENCES

[1] Buczak, A. L., & Guven, E. (2016). A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection. IEEE Communications Surveys & Tutorials, 18(2), 1153-1176. DOI: 10.1109/COMST.2015.2494502.
[2] Ranshous, S., Bay, C., Cramer, N., Henricksen, M., & Hannigan, B. (2015). Combining Clustering and Classification for Anomalous Activity Detection in Cybersecurity. In Proceedings of the 2015 Workshop on Artificial Intelligence and Security (pp. 49-58).
[3] Bhattacharyya, D., Kalaimannan, E., & Verma, A. (2018). Anomalous Pattern Detection in Enterprise Data Using Hybrid Classification and Clustering Techniques. Procedia Computer Science, 132, 1066-1075. DOI: 10.1016/j.procs.2018.05.110.
[4] Phua, C., Lee, V., Smith, K., & Gayler, R. (2010). A Comprehensive Survey of Data Mining-based Fraud Detection Research. Artificial Intelligence Review, 33(4), 229-246. DOI: 10.1007/s10462-009-9128-7.
[5] Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, NY: Springer-Verlag.
[6] Brownlee, J. (2020). Master Machine Learning Algorithms. Machine Learning Mastery.
[7] Chollet, F. (2018). Deep Learning with Python. Manning Publications.
[8] Varshney, A., Mishra, S., & Jha, R. P. (2019). A Review on Machine Learning Algorithms for Fraud Detection. Procedia Computer Science, 132, 1575-1584. DOI: 10.1016/j.procs.2019.04.169.
[9] Cawley, G. C., & Talbot, N. L. (2010). On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation. Journal of Machine Learning Research, 11, 2079-2107.
[10] Kuhn, M., & Johnson, K. (2013). Applied Predictive Modeling. New York, NY: Springer-Verlag.
[11] Kotsiantis, S. B. (2013). Decision Trees: A Recent Overview. Artificial Intelligence Review, 39(4), 261-283. DOI: 10.1007/s10462-011-9272-4.
[12] Kingma, D. P., & Ba, J. (2015). Adam: A Method for Stochastic Optimization. International Conference on Learning Representations (ICLR).
