Bank Data Science Project Predicts Genuine Notes

Uploaded by

Alejo Guerra Fernandez

Original Title

Final Project

FINAL DATA SCIENCE PROJECT-MODELING FOR A BANK

The purpose of this data science project is to show whether the methodology we apply to identify
which banknotes are genuine and which are forged works well. The analysis is based on the data
provided by the bank.

First, we looked at how the features of the data are distributed (see Fig [1]) to decide how to
proceed.

Fig [1]. Variance vs Skewness

Knowing how the features looked, we decided to standardize the data, both to make it easier to
interpret and to look for outliers. We tried to find the outliers using an ellipse, on the
assumption that outliers lie more than two standard deviations from the mean, as shown in Fig [2].

Fig [2]. Variance vs Skewness-Standardized
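
The standardization and the two-standard-deviation ellipse described above can be sketched as follows. This is only a minimal illustration, not the code used in the project: the toy values and the names standardize and outside_ellipse are assumptions, and it is kept to the Python standard library (in practice a scaler such as scikit-learn's StandardScaler would do the same job).

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]


def outside_ellipse(x, y, cx=0.0, cy=0.0, rx=2.0, ry=2.0):
    """True if (x, y) lies outside the axis-aligned ellipse centred at (cx, cy).

    With standardized data the default semi-axes of 2.0 correspond to
    two standard deviations along each feature.
    """
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 > 1.0


# Hypothetical toy values standing in for the bank's variance and skewness columns.
variance = [3.6, 4.5, 3.9, -1.4, 0.3, -3.7, 2.1, -0.9]
skewness = [8.7, 8.2, -2.6, 3.3, -4.5, -12.0, 7.6, 1.5]

z_var = standardize(variance)
z_skew = standardize(skewness)

# Indices of the points flagged as outliers by the two-sigma ellipse.
flagged = [i for i, (x, y) in enumerate(zip(z_var, z_skew)) if outside_ellipse(x, y)]
print(flagged)
```

After standardization each feature has mean 0 and standard deviation 1, so a single ellipse centred on the origin can be used for both axes.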

Unfortunately, we realized that this result is not very informative, because the graph mixes
several different trends in the same plot. That is why I think K-means clustering is worth
applying: it lets us take advantage of the data by splitting it into groups, and since we have
more than one feature the method becomes more trustworthy and the trend of each group is easier
to study.

The red dot shows the mean. On its own it tells us little, because too much is mixed into the
same graph, which is one more reason to use K-means clustering. With the respective centroids,
one per cluster, it may become clearer where the outliers are. I decided to use two clusters
because the data appears to show two distinct groups; afterwards I will be able to check whether
the method needs more data.

Using K-means clustering, we looked for the outliers again, now taking into account the two
additional centroids. Knowing the location of the two centroids and the mean gives us an idea of
how many outliers there are; see Fig [3].

Fig [3]. Variance vs skewness Standardized Using K-means clustering
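
For reference, a two-cluster K-means (Lloyd's algorithm) can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than the project's code: the toy points are hypothetical, and the farthest-first initialization is a choice made here only to keep the example deterministic (the report does not say how the centroids were initialized). A library implementation such as scikit-learn's KMeans would normally be used instead.

```python
import math


def farthest_first(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k=2, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return centroids, labels


# Hypothetical standardized (variance, skewness) pairs forming two groups.
points = [(-1.2, -0.9), (-1.0, -1.1), (-0.8, -1.0), (1.0, 1.1), (0.9, 0.8), (1.2, 1.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

K-means can converge to different solutions depending on the starting centroids, so results on real data should be checked with several initializations.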

In Fig [3] we can see three ellipses: the orange ones are drawn around the two cluster centroids
and the red one around the mean. We did this to cover a larger area of the data when looking for
outliers. The graph suggests that even after the code is finished we will still have a
considerable error, but the error should be smaller if the bank provides more data, which would
make the K-means clustering more robust.
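
One way to read the three-ellipse check is that a point counts as an outlier only when it falls outside all of the ellipses. The sketch below illustrates that reading; the centres, the semi-axes and the helper names are hypothetical, since the report does not state the sizes of the ellipses.

```python
def inside_ellipse(point, centre, radii):
    """True if point lies inside or on the axis-aligned ellipse."""
    return sum(((p - c) / r) ** 2 for p, c, r in zip(point, centre, radii)) <= 1.0


def find_outliers(points, ellipses):
    """Indices of points that fall outside every (centre, radii) ellipse."""
    return [
        i
        for i, p in enumerate(points)
        if not any(inside_ellipse(p, centre, radii) for centre, radii in ellipses)
    ]


# Hypothetical ellipses on standardized data (all values are assumptions).
ellipses = [
    ((0.0, 0.0), (2.0, 2.0)),    # red: around the overall mean
    ((-1.0, -1.0), (1.0, 1.0)),  # orange: around the first centroid
    ((1.0, 1.0), (1.0, 1.0)),    # orange: around the second centroid
]

sample = [(0.5, 0.5), (-1.5, -1.6), (2.5, -2.5), (1.8, 1.2)]
print(find_outliers(sample, ellipses))  # prints [2]
```

Only the third point lies outside all three ellipses; the second and fourth fall outside the red ellipse but are covered by an orange one, which is how the extra ellipses enlarge the covered area.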

Knowing that the results will contain some errors, we kept working with the standardized data,
which should reduce the margin of error, since both features are on the same scale and the
distances between the data points are smaller. With this data we used K-means clustering to
predict which notes are genuine and which are forged, with the two groups distinguished by
color; see Fig [4]:

Fig [4]. Variance vs Skewness Standardized Using K-Means Clustering

After this, we labeled the genuine notes 0 and the forged notes 1 and compared the prediction
with the real labels provided by the bank. The comparison gave an error of 12.8%, which means
that our prediction is 87.2% correct.

Number of successes: 1197 of 1372

Reliability: 87.2%
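
The comparison with the bank's labels can be sketched as below. K-means numbers its clusters arbitrarily, so the sketch checks both possible mappings of clusters to the labels 0 and 1 and keeps the better one; the function name and the toy labels are assumptions. The last lines redo the arithmetic for the reported count of 1197 successes out of 1372 notes.

```python
def clustering_accuracy(true_labels, cluster_labels):
    """Return (hits, accuracy) for binary labels, allowing for the fact
    that cluster ids 0 and 1 may be swapped relative to the true labels."""
    n = len(true_labels)
    matches = sum(t == c for t, c in zip(true_labels, cluster_labels))
    hits = max(matches, n - matches)  # n - matches is the score with ids flipped
    return hits, hits / n


# Hypothetical toy labels: 0 = genuine, 1 = forged.
true_labels = [0, 0, 1, 1, 0]
cluster_labels = [1, 1, 0, 0, 0]
print(clustering_accuracy(true_labels, cluster_labels))  # prints (4, 0.8)

# Arithmetic for the counts reported in this document.
successes, total = 1197, 1372
print(f"accuracy: {successes / total:.1%}")    # prints accuracy: 87.2%
print(f"error:    {1 - successes / total:.1%}")  # prints error:    12.8%
```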

In conclusion, the prediction is not reliable enough: it was only 87.2% correct. We need more
data from the bank; as more data becomes available for study, the margin of error should become
smaller and the prediction more accurate. My advice is therefore not to use this prediction,
since a lot of money would be lost to its errors. It could improve if the bank keeps providing
information, both on the variance and the skewness and on other features, to make the prediction
more trustworthy.
