
UPI FRAUD DETECTION USING MACHINE LEARNING ALGORITHM

ABSTRACT

 With the increasing digitization of financial transactions, ensuring the security and integrity of online payment systems has
become a critical concern. This research explores the application of machine learning techniques to detect and prevent
fraudulent activities in the realm of financial transactions.

 The study involves the collection and analysis of a diverse dataset containing both legitimate and potentially fraudulent
transactions. Features such as transaction amounts, user behaviour patterns, and temporal information are extracted to
develop a robust machine learning model.

 An XGBoost-based machine learning model is implemented to detect and distinguish fraudulent and non-fraudulent users.
PROBLEM STATEMENT

• As digital payment systems, particularly those utilizing the Unified Payments Interface (UPI),
continue to gain widespread adoption, the vulnerability to fraudulent activities within these
transactions becomes a critical concern. Instances of unauthorized access, identity theft, and
financial losses pose substantial threats to users and financial institutions relying on UPI. The
existing security measures, though robust, are susceptible to evolving techniques employed by
malicious actors.
SCOPE

• The main aim of the project is to significantly enhance the security and reliability of UPI
transactions, fostering trust among users and stakeholders in the digital payment ecosystem.

• The scope of the UPI fraud detection system encompasses various dimensions, including
technical, operational, and regulatory aspects.
INTRODUCTION

 In recent years, the proliferation of digital payment systems has transformed the landscape
of financial transactions, offering convenience and efficiency to users worldwide.

 One such innovation, the Unified Payments Interface (UPI), has played a pivotal role in
revolutionizing how individuals and businesses conduct electronic transactions in India.

 As UPI gains prominence, ensuring the integrity and security of these transactions becomes
paramount, particularly in light of the escalating threat posed by fraudulent activities.
PROBLEM FORMULATION

• Develop an anti-fraud model for internet loans to detect fraudulent applications, minimize false
positives and optimize customer experience.

• Internet loan fraud poses a risk to lenders, as fraudulent borrowers can take out loans with the
intention of never paying them back.
LITERATURE SURVEY
Title | Author, year, journal | Method or algorithm | Advantage | Disadvantage
Association rules applied to credit card fraud detection | D. Sánchez, M. A. Vila, L. Cerda, and J. M. Serrano, IEEE | XGBoost (XGB) | AUC = 0.86 | Need to increase the accuracy.
The design of FFML: A rule-based policy modelling language for proactive fraud management in financial data streams | M. E. Edge and P. R. F. Sampaio | Tri-relationship embedding framework has been used | Better overall performance | Need to increase the accuracy of the model.
A cost-sensitive decision tree approach for fraud detection | Federico Monti et al, 2013, IEEE | Decision tree | 92.7% ROC AUC | More system complexity
Anti-fraud measures in Southern Africa | Kai Shu, 2018, IEEE | Trust based approaches used | Better performance | Not suitable for different type of data set
Anti-fraud technologies: A business essential in the card industry | Yaqing Wang, 2019, KDD | Event Adversarial Neural Networks | Better accuracy. | Need to decrease the complexity of the model.
Learning imbalanced datasets based on SMOTE and Gaussian distribution | Shuo Yang, 2019 | Unsupervised Framework | Automated process | Need to increase the accuracy of the model.
Hybrid dual Kalman filtering model for short-term traffic flow forecasting | Xinyi Zhou, Reza Zafarani, 2019 | Kalman filter | Kalman filter based | Tracking efficiency low
EXISTING SYSTEM

• The existing system uses a histogram-based outlier score (HBOS) algorithm.

• Machine learning models have been implemented.

• An SVM (support vector machine) scheme has been implemented.

• A random forest scheme has been implemented.

• A Naive Bayes scheme has been implemented in the existing system.
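
To make the histogram-based outlier score mentioned above concrete, the following is a minimal sketch, assuming equal-width bins and a score equal to the sum over features of the log of the inverse normalised bin height. The function name hbos_scores, the bin count, and the transaction amounts are hypothetical and are not taken from the existing system.

```python
import math


def hbos_scores(rows, n_bins=5):
    """Histogram-based outlier score with equal-width bins.

    rows: list of equal-length numeric feature lists.
    Returns one score per row; a higher score means a more unusual row.
    """
    n_features = len(rows[0])
    scores = [0.0] * len(rows)
    for j in range(n_features):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant columns

        def bin_index(value):
            # Clamp so the maximum value falls in the last bin.
            return min(int((value - lo) / width), n_bins - 1)

        counts = [0] * n_bins
        for value in column:
            counts[bin_index(value)] += 1
        tallest = max(counts)
        for i, value in enumerate(column):
            # Normalised bin height in (0, 1]; sparsely filled bins give large scores.
            height = counts[bin_index(value)] / tallest
            scores[i] += math.log(1.0 / height)
    return scores


# Hypothetical transaction amounts: nine ordinary values and one extreme one.
amounts = [[120.0], [95.0], [110.0], [130.0], [105.0],
           [99.0], [125.0], [115.0], [101.0], [5000.0]]
scores = hbos_scores(amounts)
most_unusual = max(range(len(scores)), key=scores.__getitem__)
```

Each feature is scored independently, which keeps the method fast but means it cannot see interactions between features.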


DISADVANTAGE
• Less efficiency
• Data training loss
• Attacks possible
PROPOSED SYSTEM
• The proposed system for UPI fraud detection implements XGBoost (Extreme Gradient
Boosting), a powerful machine learning algorithm, to analyze features and predict whether a
loan application is likely to be fraudulent. The steps involved are outlined in the module list.

• Implementing XGBoost for UPI fraud loan prediction requires a multidisciplinary approach,
combining expertise in machine learning, finance, and regulatory compliance. Regular updates and
collaboration with industry stakeholders are essential to address emerging challenges in fraud
detection.
ADVANTAGE

• High efficiency
• Support for large datasets
• High prediction accuracy
SYSTEM ARCHITECTURE

[Figure: system architecture. Recoverable block labels: Admin login; Dataset collection; Preprocessing (stemming and tokenization); Feature extraction; Train the model; Model deployment; Test user login feature; Loan prediction result.]



MODULE LIST

• Dataset acquisition

• Preprocessing

• Feature extraction

• Training of features

• Testing and performance evaluation


DATASET ACQUISITION

• The data set is collected from the Kaggle website and divided into three categories: a training
set, a validation set, and a testing set.

• The dataset is split into training, validation, and testing sets in the following ratio:
80% for training (of that, 10% for validation) and 20% for testing. The original dataset consisted
of 162 slide images scanned at 40x. The class data is imbalanced, with over twice as many
negative data points as positive data points.
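
The split described above (20% held out for testing, then 10% of the remaining training portion used for validation) can be sketched as follows. This is an illustration under stated assumptions: the record layout, the function names, and the seed are hypothetical, and only the standard library is used.

```python
import random


def split_dataset(records, test_frac=0.20, val_frac=0.10, seed=42):
    """Shuffle and split records into (train, validation, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_frac))
    test, remainder = shuffled[:n_test], shuffled[n_test:]
    # Validation is carved out of the training portion, not the whole set.
    n_val = int(round(len(remainder) * val_frac))
    validation, train = remainder[:n_val], remainder[n_val:]
    return train, validation, test


def negative_to_positive_ratio(labels):
    """Ratio of negative (0) labels to positive (1) labels."""
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    return negatives / positives


# Hypothetical labelled records: (record_id, label), roughly 2 negatives per positive.
records = [(i, 1 if i % 3 == 0 else 0) for i in range(1000)]
train, validation, test = split_dataset(records)
ratio = negative_to_positive_ratio([label for _, label in train])
```

With 1000 records this yields 720 training, 80 validation, and 200 testing records. A ratio like the one computed here is commonly supplied to XGBoost through its scale_pos_weight parameter to compensate for class imbalance.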
PREPROCESSING

• Preprocessing reduces the dimensions of each image.

• We specify the input image volume shape to our network, where depth is the number of color
channels each image contains.

• Each image is resized to the row and column dimensions expected by the network's input layer.
FEATURE EXTRACTION

• Extract relevant features that capture user behavior, transaction patterns, and other indicators of
potential fraudulent activity. Consider factors such as transaction frequency, average transaction
amount, credit score, and user profile information.
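
As an illustration of the features named above, the sketch below aggregates transaction frequency and average transaction amount per user. The tuple layout (user_id, amount, hour), the night-time share feature, and the sample values are assumptions made for the example, not fields confirmed by the dataset.

```python
from collections import defaultdict
from statistics import mean


def user_features(transactions):
    """Aggregate per-user features from (user_id, amount, hour) tuples."""
    by_user = defaultdict(list)
    for user_id, amount, hour in transactions:
        by_user[user_id].append((amount, hour))
    features = {}
    for user_id, items in by_user.items():
        amounts = [amount for amount, _ in items]
        # Assumed temporal feature: share of transactions made before 06:00.
        night = sum(1 for _, hour in items if hour < 6)
        features[user_id] = {
            "txn_count": len(items),
            "avg_amount": mean(amounts),
            "max_amount": max(amounts),
            "night_share": night / len(items),
        }
    return features


# Hypothetical transactions for two users.
transactions = [
    ("u1", 100.0, 10), ("u1", 300.0, 14),
    ("u2", 50.0, 2), ("u2", 950.0, 3), ("u2", 200.0, 23),
]
features = user_features(transactions)
```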
TRAINING

• Divide the dataset into training and testing sets to evaluate the model's performance on unseen
data. This ensures an unbiased assessment of the model's generalization capabilities.

• Implement the XGBoost algorithm for binary classification. Fine-tune hyperparameters using
techniques like grid search or random search to optimize model performance. Train the model on
the labeled training dataset.
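
To show the idea behind gradient boosting for binary classification, here is a simplified from-scratch sketch. It is not the XGBoost library: it boosts one-split trees (stumps) on the logistic loss, choosing splits by a regularised gain and setting leaf values by a Newton step, which mirrors the general approach in a very reduced form. The feature layout [amount, transactions in the last hour], the labels, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def best_stump(X, grad, hess, lam=1.0):
    """Return (feature, threshold, left_value, right_value) with the best gain."""
    best, best_gain = None, -1.0
    for j in range(len(X[0])):
        # Candidate thresholds: every distinct value except the largest.
        for threshold in sorted({row[j] for row in X})[:-1]:
            gl = hl = gr = hr = 0.0
            for row, g, h in zip(X, grad, hess):
                if row[j] <= threshold:
                    gl, hl = gl + g, hl + h
                else:
                    gr, hr = gr + g, hr + h
            gain = gl * gl / (hl + lam) + gr * gr / (hr + lam)
            if gain > best_gain:
                best_gain = gain
                # Newton step per leaf: -G / (H + lambda).
                best = (j, threshold, -gl / (hl + lam), -gr / (hr + lam))
    return best


def fit_boosted_stumps(X, y, n_rounds=20, learning_rate=0.3):
    """Fit an additive model of stumps on the logistic loss."""
    model, margins = [], [0.0] * len(X)
    for _ in range(n_rounds):
        probs = [sigmoid(m) for m in margins]
        grad = [p - t for p, t in zip(probs, y)]   # first derivative of the loss
        hess = [p * (1.0 - p) for p in probs]      # second derivative of the loss
        stump = best_stump(X, grad, hess)
        if stump is None:  # no feature has two distinct values
            break
        j, threshold, left, right = stump
        left, right = learning_rate * left, learning_rate * right
        model.append((j, threshold, left, right))
        for i, row in enumerate(X):
            margins[i] += left if row[j] <= threshold else right
    return model


def predict_proba(model, row):
    """Probability of the positive (fraud) class for one feature row."""
    margin = sum(left if row[j] <= t else right for j, t, left, right in model)
    return sigmoid(margin)


# Hypothetical rows: [amount, transactions in the last hour]; label 1 = fraud.
X = [[100, 1], [250, 2], [80, 1], [120, 1],
     [9000, 8], [7000, 6], [150, 9], [200, 7]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_boosted_stumps(X, y)
p_fraud = predict_proba(model, [8000, 7])
p_legit = predict_proba(model, [90, 1])
```

In practice one would more likely use the library's own classifier (for example xgboost.XGBClassifier) and tune parameters such as tree depth, learning rate, and number of rounds with a grid or random search (for example scikit-learn's GridSearchCV), rather than a hand-written loop like this one.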
TESTING PROCESS
• In the testing process, the data is split so that the test set holds 30% of the original data set.

• The input layer just specifies the size of the input, called D (given by the shape of X_train).

• The dense layer is where the real work happens: it takes the input and applies a linear transformation
to get an output of size 1. A sigmoid activation function is then applied so that the output lies in the
range 0 to 1.

• Loss per iteration, training loss, and validation loss are tracked in this module.

• Accuracy and sensitivity of the model are analyzed.
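
The dense layer and the evaluation metrics described above can be sketched in a few lines. The example assumes a single output unit with D = 2 input features; the weights, bias, test rows, and the 0.5 decision threshold are hypothetical values chosen for illustration.

```python
import math


def dense_sigmoid(x, weights, bias):
    """One dense unit: a linear transformation followed by a sigmoid."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def accuracy_and_sensitivity(y_true, y_pred):
    """Accuracy = (TP + TN) / N; sensitivity (recall) = TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# Hypothetical weights for D = 2 input features.
weights, bias = [0.8, -0.5], 0.1
X_test = [[2.0, 1.0], [-1.0, 2.0], [0.5, 0.5], [-2.0, -1.0]]
y_test = [1, 0, 1, 1]
probs = [dense_sigmoid(x, weights, bias) for x in X_test]
y_pred = [1 if p >= 0.5 else 0 for p in probs]
accuracy, sensitivity = accuracy_and_sensitivity(y_test, y_pred)
```

Sensitivity matters in fraud detection because it measures the share of actual fraud cases that the model catches, which accuracy alone can hide when the classes are imbalanced.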


SYSTEM REQUIREMENT
H/W SYSTEM CONFIGURATION:
• Processor - Pentium IV
• RAM - 4 GB (min)
• Hard Disk - 20 GB
S/W SYSTEM CONFIGURATION:
• Operating System - Windows 7 or 8
• Software - Python IDLE
• Front end - HTML, CSS
• Back end - MySQL
BENEFITS
• Enhanced security
• Fraud prevention
• Cost reduction
• Efficiency in loan processing
CONCLUSION
The proposed system emphasizes transparency, interpretability, and adherence to
regulatory compliance standards. Feature importance analyses and explainability techniques
contribute to understanding the decision-making process, instilling confidence in users and
regulatory bodies regarding the fairness and reliability of the UPI fraud loan prediction system.
REFERENCE

1) Y. Peng, G. Wang, G. Kou, and Y. Shi, "An empirical study of classification algorithm evaluation for financial risk
prediction," Appl. Soft Comput., vol. 11, no. 2, pp. 2906–2915, Mar. 2011.

2) J. Z. Lei and A. A. Ghorbani, "Improved competitive learning neural networks for network intrusion and fraud
detection," Neurocomputing, vol. 75, no. 1, pp. 135–145, Jan. 2012.

3) Y. Sahin, S. Bulkan, and E. Duman, "A cost-sensitive decision tree approach for fraud detection," Expert Syst.
Appl., vol. 40, no. 15, pp. 5916–5923, Nov. 2013.

4) N. Soltani Halvaiee and M. K. Akbari, "A novel model for credit card fraud detection using artificial immune
systems," Appl. Soft Comput., vol. 24, pp. 40–49, Nov. 2014.

5) M. E. Edge and P. R. F. Sampaio, "The design of FFML: A rule-based policy modelling language for proactive
fraud management in financial data streams," Expert Syst. Appl., vol. 39, no. 11, pp. 9966–9985, Sep. 2012.
