
Sanjivani Rural Education Society’s

Sanjivani College of Engineering, Kopargaon-423 603


(An Autonomous Institute, Affiliated to Savitribai Phule Pune University, Pune)
NAAC ‘A’ Grade Accredited, ISO 9001:2015 Certified

Credit Card Fraud Detection Using Machine Learning

• Contents of the Presentation
• Introduction
• Scope
• Objectives
• Literature Review
• System Architecture
• Algorithm
• System Requirement
• Experimental Analysis
• Snapshots
• Conclusion

DEPARTMENT OF COMPUTER ENGINEERING, Sanjivani COE, Kopargaon 2


• Introduction
➢ Problem Definition:
➢ What is fraud?
‘Fraud’ in credit card transactions is the unauthorized and unwanted use of an account by someone other
than the owner of that account.
➢ How does fraud take place?
a) Lost or stolen card
b) Identity theft
c) Skimming (or cloning)
d) Counterfeit card
e) Mail intercept fraud
➢ What is the aim of the proposed system?
The aim of the proposed system is to build a classifier that can detect fraudulent credit card transactions.
➢ Which algorithms are used?
Quadratic Discriminant Analysis (QDA), Logistic Regression (LR) and Support Vector Regression (SVR)



• Scope
a. System focuses on recognizing whether a new transaction is fraudulent or
not.
b. Detecting fraud automatically.
c. Reducing the time needed for transaction verification.



• OBJECTIVES
a. The key objective of the proposed system is to identify suspicious events
and report them to an analyst while letting normal transactions be
automatically processed.
b. To provide results with higher accuracy.
c. To reduce losses due to payment fraud for both merchants and issuing
banks and increase revenue opportunities for merchants.
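
Objective (a) can be pictured as a simple routing rule on a model's fraud score. The sketch below is only an illustration: the function name `route_transactions`, the 0.5 cut-off and the example scores are assumptions, not part of the proposed system.

```python
from typing import List, Tuple

# Hypothetical cut-off: scores at or above it are reported to an analyst.
REVIEW_THRESHOLD = 0.5


def route_transactions(
    scores: List[float], threshold: float = REVIEW_THRESHOLD
) -> Tuple[List[int], List[int]]:
    """Split transaction indices into (report_to_analyst, auto_processed)."""
    flagged = [i for i, s in enumerate(scores) if s >= threshold]
    auto = [i for i, s in enumerate(scores) if s < threshold]
    return flagged, auto


# Illustrative fraud probabilities, not real model output.
example_scores = [0.02, 0.91, 0.10, 0.55]
flagged, auto = route_transactions(example_scores)
print("Report to analyst:", flagged)
print("Process automatically:", auto)
```

Lowering the threshold sends more transactions to the analyst and lets fewer through automatically, so the cut-off is a trade-off between missed fraud and analyst workload.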
• Literature review
NO. | AUTHOR (YEAR) | TITLE | SOURCE | FINDINGS
1. | P. Naveen and B. Diwan (2020) | Relative Analysis of ML Algorithm QDA, LR and SVR for Credit Card Fraud Detection Dataset | IEEE Conference Paper | The proposed module applies to a large dataset and provides results with higher accuracy.
2. | Shivani Tyagi, Sangeeta Mittal (2019) | “Sampling Approaches for Imbalanced Data Classification Problem in Machine Learning” | ICRIC | Skewness in the datasets can be handled either by oversampling minority class examples or by undersampling the majority class. The adaptive synthetic oversampling approach can best improve the imbalance ratio as well as classification results.
3. | Clifton P, Vincent L, Kate S & Ross G (2010) | A Comprehensive Survey of Data Mining-based Fraud Detection Research | International Conference on Intelligent Computation Technology and Automation | The system defines the professional fraudster, formalizes the main types and subtypes of known fraud, and presents the nature of data evidence collected within affected industries.


• Literature review
NO. | AUTHOR (YEAR) | TITLE | SOURCE | FINDINGS
4. | Yashvi J, Namrata T, Shripriya D, Sarika J (2019) | “A Comparative Analysis of Various Credit Card Fraud Detection Techniques” | International Journal of Recent Technology and Engineering (IJRTE) | This system explains various techniques available for a fraud detection system such as Support Vector Machine (SVM), Artificial Neural Networks (ANN), Bayesian Network, K-Nearest Neighbour (KNN), Hidden Markov Model, Fuzzy Logic Based System and Decision Trees.
5. | Seong H, Jeong, H Kim, Youngsang S, Taejin L, Huy K K (2015) | “A Survey of Fraud Detection Research based on Transaction Analysis and Data Mining Technique” | IEEE Conference Paper | The system provides an overview of the research on data mining based fraud detection.


• Requirements analysis
➢ Requirements Analysis is the process of defining the expectations of
the users for an application that is to be built or modified.

• Requirement Analysis includes:


1. Requirement Specification
2. Validation of Requirement
3. System Requirements



• Requirements analysis
1. Requirement Specification
• Captures a complete description of how the system is expected to perform.
• The Software Requirements Specification (SRS) minimizes development time and effort.
• Minimizes the development cost.

➢ NR: Normal Requirements:

• These are the requirements that are clearly stated by the customer.
• NR1: The system should distinguish fraudulent from non-fraudulent transactions using the
previous history of transactions.
• NR2: The system should abort the transaction if it finds it to be fraudulent.
• NR3: The system should detect fraud automatically.
• NR4: The system should provide results with higher accuracy.

• Requirements analysis
➢ ER: Expected Requirements:
These are basic requirements that the customer may not state explicitly but still expects the system to meet.
• ER1: The prediction should be made by the algorithm that gives the highest accuracy.
• ER2: The system should handle a large amount of data.
• ER3: The system should be reliable and respond in a short time.
• ER4: The system should present results in the form of graphs.

➢ XR: Exciting Requirements:

• These requirements are neither stated by the customer nor expected.
• XR1: The system should minimize false alerts.
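
One way to make XR1 measurable is to count false alerts from a confusion matrix. The sketch below is an assumption about how this could be tracked, not a description of the implemented system; the function names and the example labels are hypothetical, with 1 standing for fraud and 0 for a normal transaction.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; 1 = fraud, 0 = normal."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1  # a false alert
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1  # a missed fraud
    return tp, fp, tn, fn


def false_alert_rate(y_true, y_pred):
    """Share of normal transactions that were wrongly flagged."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return fp / (fp + tn) if (fp + tn) else 0.0


def precision(y_true, y_pred):
    """Share of raised alerts that were actually fraud."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical labels for six transactions.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 0]
print("false alert rate:", false_alert_rate(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
```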

2. Validation of Requirement:
• To check all the issues related to requirements.
• The basic objective is to ensure that the SRS reflects the actual requirements accurately and clearly.



• Requirements analysis
➢ Validation of Normal Requirements
• VNR1: The system detects fraudulent and non-fraudulent transactions.
• VNR2: The system aborts the transaction if it finds the transaction to be fraudulent.

➢ Validation of Expected Requirements

• VER1: Using machine learning algorithms, we selected the best-performing model; it can handle a large dataset and provides
results with higher accuracy.
• VER2: The system presents results in the form of graphs.

➢ Validation of Exciting Requirements

• VXR1: This requirement is satisfied if the user is able to get an accurate result.
• VXR2: This requirement cannot be validated, as our system only focuses on detecting fraudulent and non-fraudulent
transactions.



• System Architecture



• Algorithms
• Logistic Regression
• Support Vector Machine
• Decision Tree



• Algorithms
Logistic Regression
Logistic regression is a machine learning classification algorithm used to assign observations to a discrete set of classes.
Examples of classification problems include email (spam or not spam), online transactions (fraud or not fraud) and tumors
(malignant or benign). It is a predictive analysis algorithm based on the concept of probability: it transforms its output
using the logistic sigmoid function to return a probability value, which is then mapped to a class.

Steps:
1. Data pre-processing.
2. Fitting Logistic Regression to the training set.
3. Predicting the test result.
4. Testing the accuracy of the result (creation of the confusion matrix).
5. Visualizing the test set result.
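
The first four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: the data is synthetic and already scaled (it is not creditcard.csv), the helper names, learning rate and epoch count are hypothetical, and in practice a library implementation would normally be used. The visualization step is omitted.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(row, w, b):
    """Estimated probability that a row belongs to class 1 (fraud)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b)


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Step 2: fit weights by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = predict_proba(row, w, b) - label
            grad_w = [g + err * xj for g, xj in zip(grad_w, row)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Step 1: toy, already-scaled data (synthetic, two features per row).
random.seed(0)
data = [([random.gauss(-2, 1), random.gauss(-2, 1)], 0) for _ in range(100)]
data += [([random.gauss(2, 1), random.gauss(2, 1)], 1) for _ in range(100)]
random.shuffle(data)
train, test = data[:150], data[150:]
X_train, y_train = [x for x, _ in train], [t for _, t in train]
X_test, y_test = [x for x, _ in test], [t for _, t in test]

w, b = fit_logistic(X_train, y_train)

# Step 3: predict the test set with a 0.5 probability cut-off.
y_pred = [1 if predict_proba(row, w, b) >= 0.5 else 0 for row in X_test]

# Step 4: confusion matrix as counts of (actual, predicted) pairs.
confusion = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
for actual, predicted in zip(y_test, y_pred):
    confusion[(actual, predicted)] += 1
accuracy = (confusion[(0, 0)] + confusion[(1, 1)]) / len(y_test)
print("confusion matrix:", confusion)
print("accuracy:", accuracy)
```

The toy classes are balanced for simplicity; real fraud data is heavily skewed, so accuracy alone would be a weak measure there.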



• Algorithms
Support Vector Machine
• A supervised learning algorithm used for classification as well as regression problems.
• The goal of the SVM algorithm is to create the best line or decision boundary that segregates n-dimensional space into
classes so that new data points can easily be put in the correct category in the future. This best decision boundary is called
a hyperplane.
• SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support
vectors, and hence the algorithm is called Support Vector Machine.
• Steps:
Step 1: Importing the libraries (NumPy, matplotlib, Pandas)
Step 2: Importing the dataset
Step 3: Feature Scaling
Step 4: Training the Support Vector Machine model on the Training set
Step 5: Predicting the Test set Results
Step 7: Comparing the Test Set with Predicted Values
Step 8: Visualizing the SVM results
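
The feature scaling, training, prediction and comparison steps can be illustrated with a small standard-library sketch. It is an assumption-laden stand-in, not the project's implementation: it trains a linear soft-margin SVM by sub-gradient descent on the hinge loss (library SVMs typically use a different solver and may use kernels), the data is synthetic, and all function names and hyperparameters are hypothetical.

```python
import math
import random


def column_stats(X):
    """Per-feature mean and standard deviation, computed on training data."""
    n, stats = len(X), []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        stats.append((mean, std))
    return stats


def scale(X, stats):
    """Feature scaling: subtract the mean and divide by the std."""
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in X]


def fit_linear_svm(X, y, lam=0.01, lr=0.01, epochs=100):
    """Linear soft-margin SVM via sub-gradient descent; labels are -1 or +1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for row, label in zip(X, y):
            margin = label * (sum(wj * xj for wj, xj in zip(w, row)) + b)
            if margin < 1:  # inside the margin or misclassified
                w = [wj - lr * (lam * wj - label * xj) for wj, xj in zip(w, row)]
                b += lr * label
            else:  # safely classified: only apply regularization
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(X, w, b):
    """Side of the hyperplane w.x + b = 0 that each row falls on."""
    return [1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1 for row in X]


# Synthetic data with features on very different scales (+1 = fraud).
random.seed(1)
data = [([random.gauss(100, 20), random.gauss(0, 1)], -1) for _ in range(100)]
data += [([random.gauss(300, 20), random.gauss(4, 1)], 1) for _ in range(100)]
random.shuffle(data)
train, test = data[:150], data[150:]
X_train, y_train = [x for x, _ in train], [t for _, t in train]
X_test, y_test = [x for x, _ in test], [t for _, t in test]

stats = column_stats(X_train)  # scaling parameters come from the training set only
w, b = fit_linear_svm(scale(X_train, stats), y_train)
y_pred = predict(scale(X_test, stats), w, b)

# Compare the test labels with the predicted values.
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print("first five (actual, predicted):", list(zip(y_test, y_pred))[:5])
print("accuracy:", accuracy)
```

Scaling matters here because the first feature is roughly a hundred times larger than the second; without it, the larger feature would dominate the margin.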



• Algorithms
Decision Tree:

• It is a supervised learning technique that can be used for both classification and regression problems.
• A decision tree has two types of nodes: Decision Nodes and Leaf Nodes. Decision nodes are used to make a
decision and have multiple branches, whereas Leaf nodes are the output of those decisions and do not contain any further
branches.
• Steps:
1. Data pre-processing.
2. Fitting a Decision Tree algorithm to the training set.
3. Predicting the test result.
4. Testing the accuracy of the result (creation of the confusion matrix).
5. Visualizing the test set result.
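
The distinction between decision nodes and leaf nodes can be made concrete with a small tree built in plain Python. This is a sketch under assumptions: it uses Gini impurity as the split criterion (the slides do not name one), the eight example rows of [amount, hour] are invented, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the node is pure)."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_split(X, y):
    """Return (feature, threshold, weighted_gini) of the best split, or None."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[2]:
                best = (j, thr, score)
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Recursively build a tree of decision nodes ending in leaf nodes."""
    split = None if depth >= max_depth or gini(y) == 0.0 else best_split(X, y)
    if split is None:
        # Leaf node: predict the majority class of the rows that reached it.
        return {"leaf": 1 if 2 * sum(y) >= len(y) else 0}
    j, thr, _ = split
    left = [i for i, row in enumerate(X) if row[j] <= thr]
    right = [i for i, row in enumerate(X) if row[j] > thr]
    return {
        "feature": j,
        "threshold": thr,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth),
    }


def predict_one(node, row):
    """Walk from the root through decision nodes until a leaf is reached."""
    while "leaf" not in node:
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node["leaf"]


# Invented rows of [amount, hour of day]; label 1 = fraud.
X = [[20, 14], [35, 10], [15, 20], [900, 3], [40, 2], [1200, 2], [1100, 15], [1000, 4]]
y = [0, 0, 0, 1, 0, 1, 0, 1]
tree = build_tree(X, y)
print(tree)
print("prediction for [950, 3]:", predict_one(tree, [950, 3]))
```

On this toy data the root decision node splits on amount and its right child splits on hour, so only large transactions made at night reach a fraud leaf.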



• System MODULES
1. Breakdown Structure



• System Modules
Module Details:
1. Preprocessing (transform raw data into a useful and efficient format)
a) Scaling the feature Amount (normalize the range of independent variables or features)
b) Subset selection (identify and remove irrelevant and redundant information)
c) Handling class imbalance (SMOTE: Synthetic Minority Oversampling Technique)
2. Comparison
a) Model Training.
b) Model Testing.
3. Prediction
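
Two of the preprocessing sub-steps, scaling the Amount feature and SMOTE-style oversampling, can be sketched with the standard library. This is a simplified illustration under assumptions: the amounts and minority points are invented, the function names are hypothetical, and `smote_like` only captures the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours. Subset selection is not shown.

```python
import math
import random


def standardize(values):
    """Scale a feature to zero mean and unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
    return [(v - mean) / std for v in values]


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by neighbour interpolation.

    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Invented transaction amounts and fraud-class points.
amounts = [10.0, 20.0, 30.0, 40.0, 900.0]
scaled = standardize(amounts)
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.5]]
synthetic = smote_like(minority, n_new=4)
print("scaled amounts:", scaled)
print("synthetic fraud samples:", synthetic)
```

Oversampling of this kind is normally applied to the training set only, so that the test set keeps the real class ratio.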



• System Requirements
➢ Software Requirements
• 1. Operating system Windows 7 or above
• 2. IDE: Anaconda Navigator 2.0.4, Jupyter Notebook 6.0.3
• 4. Dataset: creditcard.csv

➢ Hardware Requirements
• 1. Processor i3 or above
• 2. RAM 3GB or above
• 3. Hard Disk



• Snapshots
Home Page Dashboard



• Snapshots

About page Administrative page



• Snapshots
Upload File/Dataset File Reports



• Snapshots
Single CSV Record



• Snapshots
Multiple records result/file uploaded Result:



• CONCLUSION
The system obtained accurate results in the analysis of credit card fraud. Compared
with other approaches, this module works on a large dataset and provides results
with higher accuracy. Using more pre-processing techniques would help further.
The motivation for this work was to study and develop models based on Quadratic
Discriminant Analysis, Logistic Regression and Support Vector Regression. Considerable
effort went into building the credit card fraud detection system. In future work, the
categorical values of parameters such as data series and quantity will be embedded.
Thank You

