
CREDIT CARD FRAUD DETECTION IN DARKWEB

Abstract

Due to the rapid advancement of electronic commerce technology, the use of
credit cards has dramatically increased. As the credit card becomes the most
popular mode of payment for both online and regular purchases, cases of fraud
associated with it are also rising. Financial fraud is an ever-growing menace
with far-reaching consequences in the financial industry. Data mining has
played an imperative role in the detection of credit card fraud on the dark
web. Credit card fraud detection, which is a data mining problem, is
challenging for two major reasons: first, the profiles of normal and
fraudulent behaviour change constantly, and second, credit card fraud data
sets are highly skewed. The performance of fraud detection for credit card
transactions on the dark web is greatly affected by the sampling approach
applied to the dataset, the selection of variables, and the detection
technique(s) used. In this project we fit our first model, beginning with
logistic regression. Logistic regression is used for modelling the outcome
probability of a class such as pass/fail, positive/negative and, in our case,
fraud/not fraud. The goal of the task is to automatically identify fraudulent
credit card transactions using machine learning. The dataset of credit card
transactions is sourced from cardholders and contains 3076 transactions. A
hybrid technique of under-sampling and over-sampling is carried out on the
skewed data. The three techniques are applied to the raw and pre-processed
data. The work is implemented in Python. The performance of the techniques is
evaluated based on accuracy, sensitivity, specificity, precision, correlation
coefficient and balanced classification rate. The results show that the
optimal accuracies of the naïve Bayes, k-nearest neighbour and logistic
regression classifiers are all above 90%. The comparative results show that
k-nearest neighbour performs better than the naïve Bayes and logistic
regression techniques.
BACKGROUND STUDY

STUDIES ON THE EXISTING SYSTEM

The system study phase examines the requirements of the existing system. The
existing system deals with precious and rare items, collected from various
sources, that are sold at auction so that the seller can make more profit on
the product than through ordinary sales. Card fraud arises in several ways:
first, and most obviously, when your card details are overseen by some other
person; when your card is lost or stolen and the person possessing it knows
how to get things done; through a fake phone call convincing you to share the
details; and lastly, and most improbably, through high-level hacking of bank
account details.

Though profitable, there is considerable risk associated with it. In the
existing system the auction is conducted manually. A customer who needs to
purchase a product has to go to the shop and look up the types of products
available in the store, for which he must refer to the index or contact an
employee of the store. After reviewing the products he selects the ones to
purchase, and then has to stand in a queue for billing.

LIMITATION OF EXISTING SYSTEM

 It needs manpower to record the details of all customers.
 Customer details have to be registered manually every time the customer purchases products.
 On the administrator side, it is difficult to manage a large number of customers.
 A large panel of staff is needed to handle all types of customers.
 Credit card validation is not done.
 It incurs high costs.
 Maintenance costs are high.
 The administrator has to keep track of all processes.
 As files are maintained manually, the chances of errors are large.
 It needs more manpower.
 Publicity cannot be assured.
 Enormous amounts of data are processed every day, and the model built must be fast enough to respond to scams in time.
 The data is imbalanced, i.e. most of the transactions (99.8%) are not fraudulent, which makes it really hard to detect the fraudulent ones.
 Data availability is limited, as the data is mostly private.
 Misclassified data can be another major issue, as not every fraudulent transaction is caught and reported.
 And last but not least, adaptive techniques are used against the model by the scammers.
STUDIES ON THE PROPOSED SYSTEM
The proposed system should overcome all the disadvantages of the existing
system. Enormous amounts of data are processed every day, so the model built
must be fast enough to respond to scams in time. Data availability is limited,
as the data is mostly private. Misclassified data can be another major issue,
as not every fraudulent transaction is caught and reported. And last but not
least, adaptive techniques are used against the model by the scammers. We
solve the problem of detecting fraudulent credit card transactions using
machine learning libraries such as NumPy, scikit-learn and a few other Python
libraries.

DEFINING THE PROBLEM


The main objectives of the proposed system are to reduce human work, reduce
maintenance time, and make the system more user-friendly, efficient, accurate
and fast. We used the web service concept to develop this application.

Advantage of proposed system

 Compared to the existing system, it reduces maintenance cost and investment.
 It prevents errors thanks to a systematic process.
 Gaining more customers becomes much easier.
Modules

DATA COLLECTION

The dataset consists of 12 parameters: Merchant_id, Transaction date, Average
Amount/transaction/day, Transaction_amount, Is declined, Total Number of
declines/day, isForeignTransaction, isHighRiskCountry,
Daily_chargeback_avg_amt, 6_month_avg_chbk_amt, 6-month_chbk_freq and
isFradulent. The label distribution suffers from imbalance issues. The data we
are going to use is the Kaggle Credit Card Fraud Detection dataset. We neglect
the time feature, which is of no use for building the models. The remaining
features are the 'Amount' feature, which contains the total amount of money
being transacted, and the 'Class' feature, which indicates whether the
transaction is a fraud case or not.

DATA ACQUISITION

There are a total of 3076 transactions. Let's import the necessary modules,
load our dataset, and perform exploratory data analysis (EDA) on it.
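
As a minimal sketch of this step, assuming the Kaggle CSV has been downloaded
locally (the file name creditcard.csv is an assumption):

    import pandas as pd

    # Load the transactions; the file name is an assumption.
    df = pd.read_csv("creditcard.csv")

    # Basic exploratory checks: shape, first rows, summary statistics,
    # and the label distribution (fraud vs. genuine).
    print(df.shape)
    print(df.head())
    print(df.describe())
    print(df["Class"].value_counts())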

DATA PRE PROCESSING

Now, check for null values in the credit card dataset. Luckily, there aren’t any null
or NaN values in our dataset. The feature we are most interested in is the
“Amount”. Now, let’s check the number of occurrences of each class label and plot
the information using matplotlib. We can observe that the genuine transactions are
over 99%! This is not good. Let’s apply scaling techniques on the “Amount”
feature to transform the range of values. We drop the original “Amount” column
and add a new column with the scaled values. We also drop the “Time” column as
it is irrelevant.
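
A sketch of this pre-processing, assuming the Kaggle column names "Time",
"Amount" and "Class" and the DataFrame df loaded above:

    from sklearn.preprocessing import StandardScaler

    # Confirm there are no null/NaN values before transforming.
    print(df.isnull().sum().max())

    # Scale "Amount", then drop the raw "Amount" and the irrelevant "Time".
    scaler = StandardScaler()
    df["scaled_amount"] = scaler.fit_transform(df[["Amount"]]).ravel()
    df = df.drop(["Amount", "Time"], axis=1)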

TRAIN AND TEST SPLIT

Now, it’s time to split credit card data with a split of 70-30 using train_test_split().
Let’s train different models on our dataset and observe which algorithm works
better for our problem. This is actually a binary classification problem as we have
to predict only 1 of the 2 class labels. We can apply a variety of algorithms for this
problem like Random Forest, Decision Tree, Support Vector Machine algorithms,
etc. In this machine learning project, we build Random Forest and Decision
Tree classifiers and see which one works best. We address the "class
imbalance" problem by picking the best-performing model. But before we go into
the code, let's understand what random forests and decision trees are.
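
Before that, here is a minimal sketch of the split and the two candidate
models, carrying over the pre-processed df from the previous step:

    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier

    X = df.drop("Class", axis=1)
    y = df["Class"]

    # stratify=y keeps the fraud/genuine ratio equal in both splits,
    # which matters for such an imbalanced dataset.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=42
    )

    dt = DecisionTreeClassifier(max_depth=4, random_state=42)
    rf = RandomForestClassifier(n_estimators=100, random_state=42)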

EVALUATION MODEL

Model evaluation is an essential part of the model development process. It
helps to find the model that best represents our data and indicates how well
the selected model will work in the future. Evaluating model performance with
the data used for training is not acceptable in data science because it can
effortlessly generate over-optimistic and over-fitted models. To avoid
over-fitting, evaluation methods such as hold-out and cross-validation are
used to test model performance. The result will be in visualized form, a
representation of the classified data as graphs. Accuracy is defined as the
proportion of correct predictions on the test data. It can be calculated
easily by dividing the number of correct predictions by the total number of
predictions.
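
A sketch of a hold-out evaluation helper computing the metrics named in the
abstract, assuming scikit-learn and arrays of true and predicted labels:

    from sklearn.metrics import (
        accuracy_score, precision_score, confusion_matrix, matthews_corrcoef
    )

    def report(y_true, y_pred):
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        sensitivity = tp / (tp + fn)   # recall on the fraud class
        specificity = tn / (tn + fp)
        print("Accuracy   :", accuracy_score(y_true, y_pred))
        print("Precision  :", precision_score(y_true, y_pred))
        print("Sensitivity:", sensitivity)
        print("Specificity:", specificity)
        print("MCC        :", matthews_corrcoef(y_true, y_pred))
        # Balanced classification rate: mean of sensitivity and specificity.
        print("BCR        :", (sensitivity + specificity) / 2)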

EVALUATE DATASET MODEL

Evaluate the newly created models on the dataset and pick the best one. Train
the decision tree and random forest models on the dataset using the fit()
function. Record the predictions made by the models using the predict()
function and evaluate them. Let's visualize the scores of each of our credit
card fraud classifiers.
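
A sketch of this final step, reusing the models, the split and the report()
helper introduced above (all assumptions of this walkthrough):

    import matplotlib.pyplot as plt
    from sklearn.metrics import accuracy_score

    scores = {}
    for name, model in [("Decision Tree", dt), ("Random Forest", rf)]:
        model.fit(X_train, y_train)        # train on the 70% split
        y_pred = model.predict(X_test)     # predict on the 30% split
        report(y_test, y_pred)
        scores[name] = accuracy_score(y_test, y_pred)

    # Visualize the accuracy of each classifier as a bar chart.
    plt.bar(list(scores), list(scores.values()))
    plt.ylabel("Test accuracy")
    plt.title("Credit card fraud classifiers")
    plt.show()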
SYSTEM SPECIFICATION

HARDWARE SPECIFICATION:

PROCESSOR : Intel i3 processors

HARD DISK CAPACITY : 400 GB

MONITOR : 14" SAMTRON MONITOR

INTERNAL MEMORY CAPACITY : 2 GB

KEYBOARD : LOGITECH OF 104 KEYS

CPU CLOCK : 1.08 GHz

MOUSE : LOGITECH MOUSE

SOFTWARE SPECIFICATION:

OPERATING SYSTEM : WINDOWS XP/7 ULTIMATE

FRONT END : PYTHON2

TOOLS USED : SPYDER IDE


PYTHON
Python is an easy to learn, powerful programming language. It has efficient high-
level data structures and a simple but effective approach to object-oriented
programming. Python’s elegant syntax and dynamic typing, together with its
interpreted nature, make it an ideal language for scripting and rapid application
development in many areas on most platforms.
The Python interpreter and the extensive standard library are freely available in
source or binary form for all major platforms. Python is a high-level, interpreted,
interactive and object-oriented scripting language. Python is designed to be highly
readable. It uses English keywords frequently, whereas other languages use
punctuation, and it has fewer syntactical constructions than other languages.
Python can be used on a server to create web applications. Python works on
different platforms (Windows, Mac, Linux, Raspberry Pi, etc). Python can connect
to database systems. It can also read and modify files.
Python is Interpreted − Python is processed at runtime by the interpreter. You do
not need to compile your program before executing it. This is similar to PERL and
PHP.
Python is Interactive − You can actually sit at a Python prompt and interact with
the interpreter directly to write your programs.
Python is Object-Oriented − Python supports Object-Oriented style or technique
of programming that encapsulates code within objects.
Python is a Beginner's Language − Python is a great language for beginner-
level programmers and supports the development of a wide range of applications,
from simple text processing to WWW browsers to games.
Python is derived from many other languages, including ABC, Modula-3, C, C++,
Algol-68, SmallTalk, and Unix shell and other scripting languages.
Often, programmers fall in love with Python because of the increased
productivity it provides. Since there is no compilation step, the edit-test-debug
cycle is incredibly fast. Debugging Python programs is easy: a bug or bad input
will never cause a segmentation fault. Instead, when the interpreter discovers an
error, it raises an exception. When the program doesn't catch the exception, the
interpreter prints a stack trace. A source level debugger allows inspection of local
and global variables, evaluation of arbitrary expressions, setting breakpoints,
stepping through the code a line at a time, and so on. The debugger is written in
Python itself, testifying to Python's introspective power. On the other hand, often
the quickest way to debug a program is to add a few print statements to the source:
the fast edit-test-debug cycle makes this simple approach very effective.
Python was conceived in the late 1980s by Guido van Rossum at Centrum
Wiskunde & Informatica (CWI) in the Netherlands as a successor to the ABC
language (itself inspired by SETL), capable of exception handling and interfacing
with the Amoeba operating system. Its implementation began in December 1989.
Van Rossum shouldered sole responsibility for the project, as the lead developer,
until 12 July 2018, when he announced his "permanent vacation" from his
responsibilities as Python's Benevolent Dictator For Life, a title the Python
community bestowed upon him to reflect his long-term commitment as the
project's chief decision-maker. In January 2019, active Python core developers
elected Brett Cannon, Nick Coghlan, Barry Warsaw, Carol Willing and Van
Rossum to a five-member "Steering Council" to lead the project, so he now
shares leadership with that council.
Python 2.0 was released on 16 October 2000 with many major new features,
including a cycle-detecting garbage collector and support for Unicode.
Python 3.0 was released on 3 December 2008. It was a major revision of the
language that is not completely backward-compatible. Many of its major features
were backported to Python 2.6.x and 2.7.x version series. Releases of Python 3
include the 2to3 utility, which automates (at least partially) the translation of
Python 2 code to Python 3.
Python 2.7's end-of-life date was initially set at 2015 then postponed to 2020 out of
concern that a large body of existing code could not easily be forward-ported to
Python 3.
Python is a multi-paradigm programming language. Object-oriented programming
and structured programming are fully supported, and many of its features support
functional programming and aspect-oriented programming (including by
metaprogramming and metaobjects (magic methods)). Many other paradigms are
supported via extensions, including design by contract and logic programming.
Python uses dynamic typing and a combination of reference counting and a cycle-
detecting garbage collector for memory management. It also features dynamic
name resolution (late binding), which binds method and variable names during
program execution.
Python's design offers some support for functional programming in the Lisp
tradition. It has filter, map, and reduce functions; list comprehensions, dictionaries,
sets, and generator expressions. The standard library has two modules (itertools
and functools) that implement functional tools borrowed from Haskell and
Standard ML.
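
For instance, a short illustration of these functional features:

    from functools import reduce
    from itertools import count, islice

    squares = [x * x for x in range(6)]                  # list comprehension
    evens = list(filter(lambda n: n % 2 == 0, squares))  # filter
    doubled = list(map(lambda n: 2 * n, evens))          # map
    total = reduce(lambda a, b: a + b, doubled, 0)       # reduce (functools)

    # itertools: the first three odd numbers from an infinite counter.
    odds = list(islice((n for n in count(1) if n % 2), 3))
    print(squares, evens, doubled, total, odds)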
Rather than having all of its functionality built into its core, Python was designed
to be highly extensible. This compact modularity has made it particularly popular
as a means of adding programmable interfaces to existing applications. Van
Rossum's vision of a small core language with a large standard library and easily
extensible interpreter stemmed from his frustrations with ABC, which espoused the
opposite approach.
MYSQL
MySQL Server is a powerful database management system with which the user can
create applications that require little or no programming. It supports GUI
administration through tools such as phpMyAdmin, which can be used to develop
richer and more developed applications. There are quite a few reasons to use
it, the first being that MySQL is a feature-rich program that can handle any
database-related task you have. You can create places to store your data,
build tools that make it easy to read and modify your database contents, and
ask questions of your data. MySQL is a relational database, a database that
stores information about related objects. In MySQL a database means a
collection of tables that hold data. It collectively stores all the other
related objects, such as queries, forms and reports, that are used to
implement functions effectively.

The MySQL database can act as a back-end database with PHP as a front end;
MySQL supports the user with its powerful database management functions. A
beginner can create his/her own database very simply with a few mouse clicks.
Another good reason to use MySQL as a back-end tool is that it is a component
of the overwhelmingly popular open-source software stack.

MySQL is a freely available open source Relational Database Management System


(RDBMS) that uses Structured Query Language (SQL). SQL is the most popular
language for adding, accessing and managing content in a database. It is most
noted for its quick processing, proven reliability, ease and flexibility of use.
MySQL is an essential part of almost every open source PHP application. One of
the most important things about using MySQL is to have a MySQL specialized
host. Its name is a combination of "My", the name of co-founder Michael
Widenius's daughter and "SQL", the abbreviation for Structured Query Language.
MySQL is free and open-source software under the terms of the GNU General
Public License, and is also available under a variety of proprietary licenses.
MySQL was owned and sponsored by the Swedish company MySQL AB, which
was bought by Sun Microsystems (now Oracle Corporation). In 2010, when Oracle
acquired Sun, Widenius forked the open-source MySQL project to create
MariaDB.

MySQL is written in C and C++. Its SQL parser is written in yacc, but it uses a
home-brewed lexical analyzer. MySQL works on many system platforms,
including AIX, BSDi, FreeBSD, HP-UX, eComStation, i5/OS, IRIX, Linux,
macOS, Microsoft Windows, NetBSD, Novell NetWare, OpenBSD, OpenSolaris,
OS/2 Warp, QNX, Oracle Solaris, Symbian, SunOS, SCO OpenServer, SCO
UnixWare, Sanos and Tru64. A port of MySQL to OpenVMS also exists.

MySQL was created by a Swedish company, MySQL AB, founded by David


Axmark, Allan Larsson and Michael "Monty" Widenius. Original development of
MySQL by Widenius and Axmark began in 1994. The first version of MySQL
appeared on 23 May 1995. It was initially created for personal usage, derived
from mSQL and based on the low-level ISAM storage method, which the creators
considered too slow and inflexible. They created a new SQL interface, while
keeping the same API as
mSQL. By keeping the API consistent with the mSQL system, many developers
were able to use MySQL instead of the (proprietarily licensed) mSQL antecedent.

MySQL is based on a client-server model. The core of MySQL is MySQL server,


which handles all of the database instructions (or commands). MySQL server is
available as a separate program for use in a client-server networked environment
and as a library that can be embedded (or linked) into separate applications.
MySQL operates along with several utility programs which support the
administration of MySQL databases. Commands are sent to MySQL Server via the
MySQL client, which is installed on a computer. MySQL was originally developed
to handle large databases quickly. Although MySQL is typically installed on only
one machine, it is able to send the database to multiple locations, as users are able
to access it via different MySQL client interfaces. These interfaces send SQL
statements to the server and then display the results.

MySQL enables data to be stored and accessed across multiple storage engines,
including InnoDB, CSV, and NDB. MySQL is also capable of replicating data and
partitioning tables for better performance and durability. MySQL users aren't
required to learn new commands; they can access their data using standard SQL
commands.

The RDBMS supports large databases with millions of records and supports many
data types, including signed or unsigned integers 1, 2, 3, 4, and 8 bytes
long; FLOAT; DOUBLE; CHAR; VARCHAR; BINARY; VARBINARY; TEXT; BLOB; DATE;
TIME; DATETIME; TIMESTAMP; YEAR; SET; ENUM; and OpenGIS spatial types. Fixed-
and variable-length string types are also supported.
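
As a sketch of a few of these types in use from Python, assuming the MySQL
Connector/Python driver is installed (pip install mysql-connector-python) and
a local server with these hypothetical credentials and database name exists:

    import mysql.connector

    # The credentials and database name are assumptions for illustration.
    conn = mysql.connector.connect(
        host="localhost", user="root", password="secret", database="frauddb"
    )
    cur = conn.cursor()

    # A table using several of the data types listed above.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS transactions (
            id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
            merchant_id VARCHAR(32),
            amount DOUBLE,
            txn_time DATETIME,
            is_fraud ENUM('yes', 'no')
        )
    """)
    conn.commit()
    conn.close()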
HYPER TEXT MARKUP LANGUAGE (HTML)

HTML is an application of the Standard Generalized Markup Language (SGML),


which was approved as an international standard in the year 1986. SGML provides
a way to encode hyper documents so they can be interchanged.

SGML is also a meta-language for formally describing document markup systems.
In fact, HTML uses SGML to define a language that describes a WWW hyper-
document's structure and interconnectivity.

Following the rigours of SGML, Tim Berners-Lee brought HTML to the world in
1990. Since then, many of us have found it easy to use but sometimes quite
limiting. These limiting factors are being addressed by the World Wide Web
Consortium (aka W3C) at MIT. But HTML had to start somewhere, and its success
argues that it didn't start out too badly.

Hypertext Markup Language (HTML) is the standard markup language for


documents designed to be displayed in a web browser. It can be assisted by
technologies such as Cascading Style Sheets (CSS) and scripting languages such as
JavaScript. HTML is a computer language devised to allow website creation.
These websites can then be viewed by anyone else connected to the Internet. It is
relatively easy to learn, with the basics being accessible to most people in one
sitting; and quite powerful in what it allows you to create. It is constantly
undergoing revision and evolution to meet the demands and requirements of the
growing Internet audience under the direction of the W3C, the organisation
charged with designing and maintaining the language.

HyperText is the method by which you move around on the web — by clicking on
special text called hyperlinks which bring you to the next page. The fact that it is
hyper just means it is not linear — i.e. you can go to any place on the Internet
whenever you want by clicking on links — there is no set order to do things in.
Markup is what HTML tags do to the text inside them. They mark it as a certain
type of text (italicised text, for example). HTML is a Language, as it has code-
words and syntax like any other language.

HTML consists of a series of short codes typed into a text-file by the site author —
these are the tags. The text is then saved as an HTML file, and viewed through a
browser, like Internet Explorer or Netscape Navigator. This browser reads the file
and translates the text into a visible form, hopefully rendering the page as the
author had intended. Writing your own HTML entails using tags correctly to create
your vision. You can use anything from a rudimentary text-editor to a powerful
graphical editor to create HTML pages.

The tags are what separate normal text from HTML code. You might know them
as the words between the <angle-brackets>. They allow all the cool stuff like
images and tables and stuff, just by telling your browser what to render on the
page. Different tags will perform different functions. The tags themselves don’t
appear when you view your page through a browser, but their effects do. The
simplest tags do nothing more than apply formatting to some text.

Web browsers receive HTML documents from a web server or from local storage
and render the documents into multimedia web pages. HTML describes the
structure of a web page semantically and originally included cues for the
appearance of the document.

HTML elements are the building blocks of HTML pages. With HTML constructs,
images and other objects such as interactive forms may be embedded into the
rendered page. HTML provides a means to create structured documents by
denoting structural semantics for text such as headings, paragraphs, lists, links,
quotes and other items. HTML elements are delineated by tags, written using angle
brackets. Tags such as <img /> and <input /> directly introduce content into the
page. Other tags such as <p> surround and provide information about document
text and may include other tags as sub-elements. Browsers do not display the
HTML tags, but use them to interpret the content of the page.

HTML can embed programs written in a scripting language such as JavaScript,


which affects the behavior and content of web pages. Inclusion of CSS defines the
look and layout of content. The World Wide Web Consortium (W3C), former
maintainer of the HTML standard and current maintainer of the CSS standard,
has encouraged the use of CSS over explicit presentational HTML since 1997.

The first publicly available description of HTML was a document called "HTML
Tags", first mentioned on the Internet by Tim Berners-Lee in late 1991. It
describes 18 elements comprising the initial, relatively simple design of HTML.
Except for the hyperlink tag, these were strongly influenced by SGMLguid, an in-
house Standard Generalized Markup Language (SGML)-based documentation
format at CERN. Eleven of these elements still exist in HTML 4.

After the HTML and HTML+ drafts expired in early 1994, the IETF created an
HTML Working Group, which in 1995 completed "HTML 2.0", the first HTML
specification intended to be treated as a standard against which future
implementations should be based.

HTML is not the only language involved: since making websites became more
popular and needs increased, many other supporting languages have been created
to allow new things to happen, and HTML is modified every few years to make
way for improvements. Cascading Style Sheets are used to control how your
pages are presented, and make pages more accessible. Basic special effects and
interaction are provided by JavaScript, which adds a lot of power to basic
HTML. Most of this advanced stuff is for later down the road, but when using
all of these technologies together, you have a lot of power at your disposal.
CSS

Cascading Style Sheets (CSS) is a style sheet language used for describing the
presentation of a document written in a markup language like HTML. CSS is a
cornerstone technology of the World Wide Web, alongside HTML and JavaScript.
CSS is designed to enable the separation of presentation and content, including
layout, colors, and fonts. This separation can improve content accessibility,
provide more flexibility and control in the specification of presentation
characteristics, enable multiple web pages to share formatting by specifying the
relevant CSS in a separate .css file, and reduce complexity and repetition in the
structural content.

Separation of formatting and content also makes it feasible to present the same
markup page in different styles for different rendering methods, such as on-screen,
in print, by voice (via speech-based browser or screen reader), and on Braille-
based tactile devices. CSS also has rules for alternate formatting if the content is
accessed on a mobile device. The name cascading comes from the specified
priority scheme to determine which style rule applies if more than one rule matches
a particular element. This cascading priority scheme is predictable.

The CSS specifications are maintained by the World Wide Web Consortium
(W3C). Internet media type (MIME type) text/css is registered for use with CSS by
RFC 2318 (March 1998). The W3C operates a free CSS validation service for CSS
documents. In addition to HTML, other markup languages support the use of CSS
including XHTML, plain XML, SVG, and XUL.

CSS has a simple syntax and uses a number of English keywords to specify the
names of various style properties. A style sheet consists of a list of rules. Each rule
or rule-set consists of one or more selectors, and a declaration block.
Before CSS, nearly all presentational attributes of HTML documents were
contained within the HTML markup. All font colors, background styles, element
alignments, borders and sizes had to be explicitly described, often repeatedly,
within the HTML. CSS lets authors move much of that information to another file,
the style sheet, resulting in considerably simpler HTML.

CSS stands for "Cascading Style Sheet." Cascading style sheets are used to format the
layout of Web pages. They can be used to define text styles, table sizes, and other
aspects of Web pages that previously could only be defined in a page's HTML.

CSS helps Web developers create a uniform look across several pages of a Web
site. Instead of defining the style of each table and each block of text within a
page's HTML, commonly used styles need to be defined only once in a CSS
document. Once the style is defined in a cascading style sheet, it can be used by any
page that references the CSS file. Plus, CSS makes it easy to change styles across
several pages at once. For example, a Web developer may want to increase the
default text size from 10pt to 12pt for fifty pages of a Web site. If the pages all
reference the same style sheet, the text size only needs to be changed on the style
sheet and all the pages will show the larger text.

While CSS is great for creating text styles, it is helpful for formatting other aspects
of Web page layout as well. For example, CSS can be used to define the cell
padding of table cells, the style, thickness, and color of a table's border, and the
padding around images or other objects. CSS gives Web developers more exact
control over how Web pages will look than HTML does. This is why most Web
pages today incorporate cascading style sheets.

CSS is created and maintained through a group of people within the W3C called
the CSS Working Group. The CSS Working Group creates documents called
specifications. When a specification has been discussed and officially ratified by
the W3C members, it becomes a recommendation. These ratified specifications are
called recommendations because the W3C has no control over the actual
implementation of the language. Independent companies and organizations create
that software.

JAVASCRIPT

JavaScript is a dynamic computer programming language. It is lightweight and


most commonly used as a part of web pages, whose implementations allow client-
side script to interact with the user and make dynamic pages. It is an interpreted
programming language with object-oriented capabilities. JavaScript was first
known as LiveScript, but Netscape changed its name to JavaScript, possibly
because of the excitement being generated by Java. JavaScript made its first
appearance in Netscape 2.0 in 1995 with the name LiveScript. The general-purpose
core of the language has been embedded in Netscape, Internet Explorer, and other
web browsers.

Client-side JavaScript is the most common form of the language. The script should
be included in or referenced by an HTML document for the code to be interpreted
by the browser. It means that a web page need not be a static HTML, but can
include programs that interact with the user, control the browser, and dynamically
create HTML content. The JavaScript client-side mechanism provides many
advantages over traditional CGI server-side scripts. For example, you might use
JavaScript to check if the user has entered a valid e-mail address in a form field.
The JavaScript code is executed when the user submits the form, and only if
all the entries are valid are they submitted to the web server. JavaScript can be
used to trap user-initiated events such as button clicks, link navigation, and other
actions that the user initiates explicitly or implicitly.

JavaScript can be implemented using JavaScript statements that are placed within
the <script>... </script> HTML tags in a web page.

You can place the <script> tags, containing your JavaScript, anywhere within your
web page, but it is normally recommended that you should keep it within the
<head> tags.

The <script> tag alerts the browser program to start interpreting all the text
between these tags as a script.

JavaScript ignores spaces, tabs, and newlines that appear in JavaScript programs.
You can use spaces, tabs, and newlines freely in your program and you are free to
format and indent your programs in a neat and consistent way that makes the code
easy to read and understand. Simple statements in JavaScript are generally
followed by a semicolon character, just as they are in C, C++, and Java.
JavaScript, however, allows you to omit this semicolon if each of your
statements is placed on a separate line; for example, code written with one
statement per line can omit semicolons entirely.

JavaScript is a case-sensitive language. This means that the language keywords,


variables, function names, and any other identifiers must always be typed with a
consistent capitalization of letters. So the identifiers Time and TIME will convey
different meanings in JavaScript.

All the modern browsers come with built-in support for JavaScript. Frequently,
you may need to enable or disable this support manually. This section explains
the procedure for enabling and disabling JavaScript support in browsers such
as Internet Explorer, Firefox, Chrome, and Opera.

JavaScript, often abbreviated as JS, is an interpreted programming language that


conforms to the ECMAScript specification. JavaScript is high-level, often just-in-
time compiled, and multi-paradigm. It has curly-bracket syntax, dynamic typing,
prototype-based object-orientation, and first-class functions. Alongside HTML and
CSS, JavaScript is one of the core technologies of the World Wide Web.
JavaScript enables interactive web pages and is an essential part of web
applications. The vast majority of websites use it for client-side page behavior, and
all major web browsers have a dedicated JavaScript engine to execute it. As a
multi-paradigm language, JavaScript supports event-driven, functional, and
imperative programming styles. It has application programming interfaces (APIs)
for working with text, dates, regular expressions, standard data structures, and the
Document Object Model (DOM). However, the language itself does not include
any input/output (I/O), such as networking, storage, or graphics facilities, as the
host environment (usually a web browser) provides those APIs. Originally used
only in web browsers, JavaScript engines are also now embedded in server-side
website deployments and non-browser applications. Although there are similarities
between JavaScript and Java, including language name, syntax, and respective
standard libraries, the two languages are distinct and differ greatly in design.
Flask

Flask is a web application framework written in Python. Armin Ronacher, who


leads an international group of Python enthusiasts named Pocco, developed it. Flask
is based on Werkzeug WSGI toolkit and Jinja2 template engine. Both are Pocco
projects. Flask is a micro web framework written in Python. It is classified as a
microframework because it does not require particular tools or libraries. It has no
database abstraction layer, form validation, or any other components where pre-
existing third-party libraries provide common functions. However, Flask supports
extensions that can add application features as if they were implemented in Flask
itself. Extensions exist for object-relational mappers, form validation, upload
handling, various open authentication technologies and several common
framework-related tools. Extensions are updated far more frequently than the
core Flask program.

A web application framework, or simply a web framework, represents a
collection of libraries and modules that enables a web application developer
to write applications without having to bother about low-level details such
as protocols, thread management, etc. The Web Server Gateway Interface (WSGI)
has been adopted as a standard for Python web application development. WSGI
is a specification for a universal interface between the web server and the
web applications.

Python 2.6 or higher is usually required for the installation of Flask.
Although Flask and its dependencies work well with Python 3 (Python 3.3
onwards), many Flask extensions do not support it properly; hence, it is
recommended that Flask be installed on Python 2.7. virtualenv is a virtual
Python environment builder. It helps a user create multiple Python
environments side by side, and can thereby avoid compatibility issues between
different versions of the libraries. The installation command needs
administrator privileges: add sudo before pip on Linux/Mac OS; if you are on
Windows, log in as administrator. On Ubuntu, virtualenv may be installed using
the package manager. The route() function of the Flask class is a decorator
which tells the application which URL should call the associated function.
Importing the flask module in the project is mandatory. An object of the Flask
class is our WSGI application. The Flask constructor takes the name of the
current module (__name__) as an argument. The rule parameter represents the
URL binding with the function, and the options parameter is a list of
parameters to be forwarded to the underlying Rule object. Finally, the run()
method of the Flask class runs the application on the local development
server.
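
A minimal sketch of that pattern:

    from flask import Flask

    app = Flask(__name__)      # the current module name is passed as argument

    @app.route("/")            # route() binds the URL "/" to the function below
    def hello_world():
        return "Hello, World!"

    if __name__ == "__main__":
        app.run()              # starts the local development server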

A Flask application is started by calling the run() method. However, while the
application is under development, it should be restarted manually for each
change in the code. To avoid this inconvenience, enable debug support: the
server will then reload itself whenever the code changes, and it will also
provide a useful debugger to track any errors in the application. Debug mode
is enabled by setting the debug property of the application object to True
before running, or by passing the debug parameter to the run() method.

Modern web frameworks use the routing technique to help a user remember
application URLs. It is useful to access the desired page directly without
having to navigate from the home page. The route() decorator in Flask is used
to bind a URL to a function. As a result, if a user visits the
http://localhost:5000/hello URL, the output of the hello_world() function will
be rendered in the browser. The add_url_rule() function of an application
object is also available to bind a URL with a function, just as route() is
used in the example above. It is possible to build a URL dynamically by adding
variable parts to the rule parameter. A variable part is marked as
<variable-name> and is passed as a keyword argument to the function with which
the rule is associated. In the following sketch, the rule parameter of the
route() decorator contains a <name> variable part attached to the URL
'/hello'. Hence, if http://localhost:5000/hello/TutorialsPoint is entered as a
URL in the browser, 'TutorialsPoint' will be supplied to the hello() function
as an argument.
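
A sketch of such a dynamic rule, with debug mode enabled as described above:

    from flask import Flask

    app = Flask(__name__)

    @app.route("/hello/<name>")    # <name> is the variable part of the URL
    def hello(name):
        return "Hello %s!" % name

    # add_url_rule() achieves the same binding without the decorator:
    # app.add_url_rule("/hello/<name>", "hello", hello)

    if __name__ == "__main__":
        app.run(debug=True)        # reloads the server on code changes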

An advantage of using Flask might be the fact that this framework is light,
and the risk of encountering Flask security bugs is minimal. At the same time,
a drawback might be the fact that it requires quite some effort on the part of
the programmer to grow the list of dependencies via plugins. A great thing
about Flask is the template engine available. The purpose of such templates is
to allow basic layout configuration for web pages, marking which elements are
susceptible to change. As such, you will be able to define your template once
and keep it the same across the pages of a website. With the aid of a template
engine, you will be able to save a lot of time when setting up your
application, and even when it comes to updates or maintenance. Overall, Flask
is easy to learn and manage as a scalable tool. It allows any type of approach
or programming technique, as there are no restrictions on the app architecture
or data abstraction layers. You can even run it on embedded systems like a
Raspberry Pi, and your web app can be loaded on any device, including a mobile
phone, desktop PC or even a TV. Besides, it benefits from a community that
offers support and suggests solutions to a multitude of problems that
programmers might face when using Flask in Python. The core benefit of Flask
is that the programmer controls everything, while gaining a deeper
understanding of how the internal mechanics of frameworks function.

Werkzeug

Werkzeug is a utility library for the Python programming language, in other
words a toolkit for Web Server Gateway Interface (WSGI) applications, and is
licensed under a BSD License. Werkzeug provides software objects for requests,
responses, and utility functions. It can be used to build a custom software
framework on top of it and supports Python 2.6, 2.7 and 3.3.
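
A minimal WSGI application built directly on Werkzeug's request and response
objects, as a sketch (Werkzeug must be installed):

    from werkzeug.wrappers import Request, Response

    @Request.application
    def application(request):
        # Wraps the WSGI environ in a Request and expects a Response back.
        return Response("Hello from Werkzeug, path = %s" % request.path)

    if __name__ == "__main__":
        from werkzeug.serving import run_simple
        run_simple("localhost", 5000, application)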

Jinja (template engine)

Jinja, also by Ronacher, is a template engine for the Python programming
language and is licensed under a BSD License. Similar to the Django web
framework, it ensures that templates are evaluated in a sandbox.
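
A short sketch of rendering a Jinja template directly from Python:

    from jinja2 import Template

    # Substitutes variables and applies a filter inside the template.
    tpl = Template("Hello {{ name }}! You have {{ items|length }} items.")
    print(tpl.render(name="World", items=["a", "b"]))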

A framework "is a code library that makes a developer's life easier when
building reliable, scalable, and maintainable web applications" by providing
reusable code or extensions for common operations. There are a number of
frameworks for Python, including Flask, Tornado, Pyramid, and Django. Flask is
a Python framework that allows one to build web applications. Flask's
framework is more explicit than Django's and is also easier to learn, because
it has less base code to implement a simple web application. A web application
framework, or web framework, is a collection of modules and libraries that
helps the developer write applications without writing low-level code such as
protocols, thread management, etc. Flask is based on the WSGI (Web Server
Gateway Interface) toolkit and the Jinja2 template engine.
Why Flask?

 Easy to use
 Built-in development server and debugger
 Integrated unit testing support
 RESTful request dispatching
 Uses Jinja2 templating

DESIGN AND DEVELOPMENT PROCESS


FUNDAMENTAL DESIGN CONCEPTS

System design is a "how to" approach to the creation of a new system. System
design goes through two phases:

- Logical design

- Physical design

Logical design reviews the present physical system, prepares input and output
specifications, and makes edit, security and control specifications.

Physical design maps out the details of the physical system, plans the system
implementation, and devises a test and implementation plan.

DESIGN PROCESS

INPUT DESIGN
Input design is the process of converting the user-oriented input to a
computer-based format. The goal of input design is to make data entry easier,
logical and error-free. Errors in the input data are controlled by the input
design. The quality of the input determines the quality of the system output.

All the data entry screens are interactive in nature, so that the user can
directly enter data according to the prompted messages. The users are also
provided with the option of selecting an appropriate input from a list of
values. This reduces the number of errors, which are otherwise likely to arise
if the values were to be entered by the user.

Input design is one of the most important phases of system design. It is the
process where the input received by the system is planned and designed, so as
to get the necessary information from the user, eliminating information that
is not required. The aim of input design is to ensure the maximum possible
level of accuracy and to ensure that the input is accessible and understood by
the user. Input design is a part of the overall system design which requires
very careful attention. If the data going into the system is incorrect, then
the processing and output will magnify the errors.

The objectives considered during input design are:


 Nature of input processing.
 Flexibility and thoroughness of validation rules.
 Handling of properties within the input documents.
 Screen design to ensure accuracy and efficiency of the input relationship
with files.
 Careful design of the input also involves attention to error handling,
controls, batching and validation procedures.
Input design features can ensure the reliability of the system and produce
results from accurate data, or they can result in the production of erroneous
information.

Data Flow Diagram (DFD)

The first step is to draw a data flow diagram (DFD). The DFD was first developed
by Larry Constantine as a way of expressing system requirements in graphical
form.

A DFD, also known as a "bubble chart", has the purpose of clarifying system
requirements and identifying the major transformations that will become
programs in system design. It is thus the starting point of the design phase,
which functionally decomposes the requirement specifications down to the
lowest level of detail. A DFD consists of a series of bubbles joined by the
data flows in the system.

The purpose of data flow diagrams is to provide a semantic bridge between
users and systems developers. The diagrams are:

• Graphical, eliminating thousands of words;

• Logical representations, modeling WHAT a system does, rather than physical
models showing HOW it does it;

• Hierarchical, showing systems at any level of detail; and

• Jargon-free, allowing user understanding and reviewing.


The goal of data flow diagramming is to have a commonly understood model of
a system. The diagrams are the basis of structured systems analysis. Data flow
diagrams are supported by other techniques of structured systems analysis such as
data structure diagrams, data dictionaries, and procedure-representing techniques
such as decision tables, decision trees, and structured English.
External Entity

An external entity is a source or destination of a data flow which is outside
the area of study. Only those entities which originate or receive data are
represented on a business process diagram. The symbol used is an oval
containing a meaningful and unique identifier.

Process

A process shows a transformation or manipulation of data flows within the
system. The symbol used is a rectangular box, which contains descriptive
elements: firstly, an identification number appears in the upper left-hand
corner; this is allocated arbitrarily at the top level and serves as a unique
reference. Secondly, a location appears to the right of the identifier and
describes where in the system the process takes place.

Data Flow

A data flow shows the flow of information from its source to its destination. A data
flow is represented by a line, with arrowheads showing the direction of flow.
Information always flows to or from a process and may be written, verbal or
electronic. Each data flow may be referenced by the processes or data stores at its
head and tail, or by a description of its contents.

Data Store

A data store is a holding place for information within the system. It is
represented by an open-ended narrow rectangle. Data stores may be long-term
files such as sales ledgers, or short-term accumulations, for example batches
of documents waiting to be processed. Each data store should be given a
reference followed by an arbitrary number.
Resource Flow

A resource flow shows the flow of any physical material from its source to its
destination. For this reason they are sometimes referred to as physical flows. The
physical material in question should be given a meaningful name. Resource flows
are usually restricted to early, high-level diagrams and are used when a description
of the physical flow of materials is considered to be important to help the analysis.

OUTPUT DESIGN

The output of the system is either on screen or as hard copy. Output design
aims at communicating the results of processing to the users. The reports are
generated to suit the needs of the users and have to be generated with
appropriate levels of detail. In our project the outputs are generated as HTML
pages. As it is a web application, the output is designed in a very
user-friendly way and will be through the screen most of the time.

CODE DESIGN

The main purpose of code design is to simplify the coding and to achieve
better performance and quality, free of errors. The code is prepared in such a
way that the internal procedures are meaningful, and validation messages are
displayed for each column. The variables are coded in such a way that someone
other than the person who developed the package can understand their purpose.

To reduce the server load, the project is designed so that most field
validation is done on the client side, which is more effective.

DATABASE DESIGN
Database design involves the creation of tables that are represented in the
physical database as stored files. They have their own existence. Each table
consists of rows and columns, where each row can be viewed as a record of
related information and each column as a field of data of the same type. Some
positions in a table may be designed to allow a null value.

The database of the project is designed in such a way that values are kept
without redundancy and in normalized form.

DEVELOPMENT APPROACH

TOP DOWN APPROACH

The importance of the new system is that it is user-friendly, with a better
interface for the users working on it. It can overcome the problems of the
manual system and its security problems.

The top-down approach to software development is an incremental approach to
the construction of program structure. Modules are integrated by moving
through the control hierarchy, beginning with the main control module. Modules
subordinate to the main control module are incorporated into the structure in
either a depth-first or breadth-first manner.

The top-down approach is performed in a series of five steps:

1. The main module, i.e. the overall software, is divided into five modules
that are under the control of the main control module.
2. Depending on the top-down approach selected, subordinate stubs are replaced
one at a time with actual components.
3. Tests are conducted as each component is integrated.
4. On completion of each test, another stub is replaced with a real component.
5. Regression testing may be conducted to ensure that new errors have not been
introduced.

TESTING AND IMPLEMENTATION


SYSTEM TESTING

It is the process of exercising software with the intent of finding and
ultimately correcting errors. This fundamental philosophy does not change for
web applications. Because web-based systems and applications reside on a
network and inter-operate with many different operating systems, browsers,
hardware platforms and communication protocols, searching for errors is a
significant challenge for web applications.

Testing issues:

1. Client GUI should be considered.


2. Target environment and platform considerations
3. Distributed database considerations
4. Distributed processing consideration

TESTING AND METHODOLOGIES

System testing is the stage of implementation which is aimed at ensuring that
the system works accurately and efficiently as expected before live operation
commences. It certifies that the whole set of programs hangs together. System
testing requires a test plan that consists of several key activities and steps
for running program, string, system and user acceptance testing. The
implementation of the newly designed package is important in adopting a
successful new system.

Testing is an important stage in software development. The system test at
implementation should be a confirmation that all is correct and an opportunity
to show the users that the system works as they expected. It accounts for the
largest percentage of technical effort in the software development process.

The testing phase is the development phase that validates the code against the
functional specifications. Testing is vital to the achievement of the system
goals. The objective of testing is to discover errors. To fulfil this
objective, a series of test steps such as unit, integration, validation and
system tests were planned and executed.

Unit testing

Here each program is tested individually, so any faulty unit is debugged.
Sample data are given for the unit testing. The unit test results are recorded
for further reference. During unit testing, the functions of the program unit,
its validation and its limitations are tested.

Unit testing is testing changes made to an existing or new program. This test
is carried out during programming, and each module is found to be working
satisfactorily. For example, in the registration form, after entering all the
fields we click the submit button. When the submit button is clicked, all the
data in the form are validated. Only after validation are the entries added to
the database.

Unit testing comprises the set of tests performed by an individual prior to
integration of the unit into a larger system. The situation is illustrated as
follows:

Coding -> Debugging -> Unit testing -> Integration testing


The four categories of test that a programmer will typically perform on a
program unit are:

1. Functional test
2. Performance test
3. Stress test
4. Structure test

Functional tests involve exercising the code with nominal input values for
which the expected results are known, as well as with boundary values and
special values.

Performance testing determines the amount of execution time spent in various
parts of the unit, as well as program throughput, response time and device
utilization by the program.

A variation of stress testing is called sensitivity testing: in some
situations, a very small range of data contained within the bounds of valid
data may cause extreme and even erroneous processing or profound performance
degradation.

Structure testing is concerned with exercising the internal logic of a program
and traversing particular paths. Functional testing, stress testing and
performance testing are referred to as "black box" testing, while structure
testing is referred to as "white box" testing.

VALIDATION TESTING

Software validation is achieved through a series of tests that demonstrate
conformity with the requirements. Thus the proposed system under consideration
has been tested by validation and found to be working satisfactorily.
OUTPUT TESTING

The output generated by the system under consideration is tested by asking the
users about the format required by them. This can be done in two ways: one on
screen and the other in printed format. The output format on the screen is
found to be correct, as per the format designed in the system test.

SYSTEM TESTING
In system testing the whole system is tested: the interfaces between each
module and the program units are tested and recorded. This testing is done
with sample data. Security and communication between interfaces are tested.

System testing is actually a series of different tests whose primary purpose
is to fully exercise the computer-based system. Although each test has a
different purpose, all work to verify that all system elements are properly
integrated and perform their allocated functions.

It involves two kinds of activities, namely:

1. Integration testing

2. Acceptance testing

Integration testing

Integration testing is a systematic technique for constructing tests to
uncover errors associated with interfaces.

The objective is to take unit-tested modules and build a program structure
that has been dictated by the design.
Acceptance testing

Acceptance testing involves planning and execution of functional tests,
performance tests and stress tests to verify that the implemented system
satisfies the requirements.

Acceptance testing is the final stage with the user: the various possibilities
of the data are entered and the results are tested.

Validation testing

Software validation is achieved through a series of tests that demonstrate
conformity with the requirements. Thus the proposed system under consideration
has to be tested by validation and found to be working satisfactorily. For
example, when a customer enters the phone number, the field should contain
only digits, otherwise an error message is produced; the fields in all the
forms are validated similarly.

Testing results

All the tests should be traceable to customer requirements, and the focus of
testing shifts progressively from individual programs to the system as a
whole. Exhaustive testing is not possible; to be more effective, testing
should concentrate where the probability of finding errors is highest.

The following are the attributes of a good test:

1. A good test has a high probability of finding an error.

2. A good test should be "best of breed".

3. A good test should be neither too simple nor too complex.


SYSTEM IMPLEMENTATION

Implementation is the stage in the project where the theoretical design is
turned into a working system. The most crucial stage in achieving a successful
new system, and giving the user confidence that it will work efficiently and
effectively, is the implementation stage. The stage consists of:

 Testing the developed program with sample data
 Detection and correction of errors
 Checking whether the system meets the user requirements
 Making necessary changes as desired by users
 Training user personnel

IMPLEMENTATION PROCEDURES

The implementation phase is less creative than system design. A system design
may be dropped at any time prior to implementation, although this becomes more
difficult once it goes into the design phase. The final report of the
implementation phase includes procedural flowcharts, record layouts, and a
workable plan for implementing the candidate system design as an operational
one.

USER TRAINING

It is designed to prepare the users for testing and converting to the system.
There are several ways to train the users:

1) User manual

2) Help screens

3) Training demonstrations.

1) User manual:
The summary of important functions about the system & software can be provided
as a document to the user. User training is designed to prepare the user for testing
and convening a system

The summary of important functions about the system and the software can be
provided as a document to the user

1. Open http page


2. Type the file name with URL index .php in the address bar
3. Index. php is opened existing user the type the username and
password
4. Click the submit button
2) Help screens:
This feature is now available in almost every software package, especially when
the software is used with a menu. The user selects the "Help" option from the
menu, and the system displays the necessary description or information for the
user's reference.

3) Training demonstration:
Another user training element is a training demonstration. A live demonstration
with personal contact is extremely effective for training users.

OPERATIONAL DOCUMENTATION

Documentation is a means of communication; it establishes the design and
performance criteria for each phase of the project. Documentation is descriptive
information that portrays the use and/or operation of the system. In this
system, the user has to enter a username and password; if they are valid the
user can participate in the auction, otherwise a new user needs to register.

1) Documentation tools:

Document production and desktop publishing tools support nearly every aspect of
software development. Most software development organizations spend a
substantial amount of time developing documents, and in many cases the
documentation process itself is quite inefficient. For this reason,
documentation tools provide an important opportunity to improve productivity.

2) Document restructuring:

Creating documentation is far too time consuming. If the system works, we may
live with what we have; in some cases this is the correct approach, as it is not
possible to recreate documentation for hundreds of computer programs. When
documentation must be updated but resources are limited, it may not be necessary
to fully redocument an application; rather, those portions of the system that
are currently undergoing change are fully documented. Even when the system is
business critical and must be fully redocumented, an intelligent approach is to
pare the documentation down to an essential minimum.

SYSTEM MAINTENANCE

Maintenance is actually the implementation of the review plan, and it is as
important as development itself; programmers and analysts have to identify
themselves with the maintenance task. There are psychological, personality and
professional reasons for this. Analysts and programmers spend far more time
maintaining programs than they do writing them: maintenance accounts for 50-80%
of total system development effort. Maintenance is expensive; one way to reduce
maintenance costs is through maintenance management and software modification
audits. Types of maintenance are:

1. Perfective maintenance
2. Preventive maintenance

Perfective maintenance:

Changes made to the system to add features or to improve performance.

Preventive maintenance:

Changes made to the system to avoid future problems. Any changes required in the
future can be adopted by the project.
Conclusion

The project investigates the comparative performance of machine learning models
in binary classification of imbalanced credit card fraud data. The rationale for
investigating these three techniques is that they have attracted relatively
little comparison in past literature; a subsequent study comparing other single
and ensemble techniques using our approach is underway. The contribution of the
paper is summarized as follows:

1. Three classifiers based on different machine learning techniques are trained
on real-life credit card transaction data, and their performance on credit card
fraud detection is evaluated and compared using several relevant metrics.

2. The highly imbalanced dataset is sampled in a hybrid approach in which the
positive class is oversampled and the negative class under-sampled, yielding two
sets of data distributions (a minimal sketch of this sampling follows the list).

3. The performance of the three classifiers is examined on the two sets of data
distributions using accuracy, sensitivity, specificity, precision, balanced
classification rate and Matthews correlation coefficient.
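
As a minimal sketch of the hybrid sampling described in point 2 (the
pandas-based resampling shown here is an assumption about the implementation,
and the 10 percent positive fraction mirrors the 10:90 distribution discussed
below; this is not the project's confirmed code):

import pandas as pd

def hybrid_sample(df, label_col='isFradulent', pos_fraction=0.10, random_state=1):
    # Oversample the positive (fraud) class and undersample the negative class
    # so that positives make up pos_fraction of the returned data
    pos = df[df[label_col] == 1]
    neg = df[df[label_col] == 0]
    n_pos = int(len(df) * pos_fraction)
    n_neg = len(df) - n_pos
    pos_up = pos.sample(n=n_pos, replace=True, random_state=random_state)     # oversampling
    neg_down = neg.sample(n=min(n_neg, len(neg)), random_state=random_state)  # under-sampling
    # Shuffle the combined sample before returning
    return pd.concat([pos_up, neg_down]).sample(frac=1, random_state=random_state)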

Performance of the classifiers varies across the different evaluation metrics.
Results from the experiment show that the machine learning algorithms achieve
significant performance on all metrics evaluated except for accuracy in the
10:90 data distribution. This study shows the effect of hybrid sampling on the
performance of binary classification of imbalanced data. Expected future areas
of research include meta-classifiers and meta-learning approaches for handling
highly imbalanced credit card fraud data; the effects of other sampling
approaches can also be investigated.
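
Since most of the reported metrics derive from the confusion matrix, the
following helper (a sketch, not the project's exact evaluation code) shows how
sensitivity, specificity, precision, balanced classification rate and the
Matthews correlation coefficient are computed:

from sklearn.metrics import confusion_matrix, matthews_corrcoef

def report_metrics(y_true, y_pred):
    # Unpack true negatives, false positives, false negatives, true positives
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sensitivity = tp / (tp + fn)              # recall on the fraud class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    bcr = (sensitivity + specificity) / 2     # balanced classification rate
    mcc = matthews_corrcoef(y_true, y_pred)   # Matthews correlation coefficient
    return {'accuracy': accuracy, 'sensitivity': sensitivity,
            'specificity': specificity, 'precision': precision,
            'bcr': bcr, 'mcc': mcc}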
FUTURE WORK

A drawback of the project is that we cannot determine the identities behind
fraudulent and non-fraudulent transactions for the given dataset using machine
learning; further development of the project could address this problem using
various methods. While we could not reach the objective of 100% accuracy in
fraud identification, we ended up building a framework that, given sufficient
time and data, can get close to this objective. As with any project, there is
room for improvement here. The very nature of this undertaking allows multiple
techniques to be integrated together as modules, and their outcomes can be
combined to improve the accuracy of the end product. The work can be further
enhanced by the addition of more techniques, provided the output of each
algorithm is in the same format as the others; once this condition is fulfilled,
a new module is easy to plug into the code, which gives a great degree of
modularity and flexibility to the project. Further room for improvement lies in
the dataset: as shown previously, the precision of the algorithms increases as
more data becomes available.
SCREENSHOT

[Screenshot: dataset]
[Screenshot: home page]
[Screenshot: input form]
[Screenshot: output form]
SAMPLE CODE

Index.py

import pickle

import scipy.stats as stats
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

plt.style.use('ggplot')

# Load the credit card transactions dataset
df = pd.read_csv('creditcard.csv')
print('This data frame has {} rows and {} columns.'.format(df.shape[0], df.shape[1]))
df.sample(5)
df.info()

# Drop identifier and date columns that are not used as features
df.drop(['Merchant_id', 'Transaction date'], axis='columns', inplace=True)

# Encode the Y/N flags as 1/0
df['isFradulent'] = df['isFradulent'].map({'N': 0, 'Y': 1})
df['Is declined'] = df['Is declined'].map({'N': 0, 'Y': 1})
df['isForeignTransaction'] = df['isForeignTransaction'].map({'N': 0, 'Y': 1})
df['isHighRiskCountry'] = df['isHighRiskCountry'].map({'N': 0, 'Y': 1})

# Show the class distribution (N = genuine, Y = fraudulent); the original
# code renamed the index before the mapping above, which had no effect
class_names = {0: 'N', 1: 'Y'}
print(df.isFradulent.value_counts().rename(index=class_names))

# Cast the numeric amount columns to integers (the original code assigned the
# result to a new, differently spelled column; converting in place keeps the
# feature set consistent)
df['Average Amount/transaction/day'] = df['Average Amount/transaction/day'].astype(np.int64)
df['Transaction_amount'] = df['Transaction_amount'].astype(np.int64)
df['6_month_avg_chbk_amt'] = df['6_month_avg_chbk_amt'].astype(np.int64)

df.isna().sum()
df.info()

# The first nine columns are features; the last column is the fraud label
feature_names = df.iloc[:, 0:9].columns
target = df.iloc[:, 9:].columns
print(feature_names)
print(target)

data_features = df[feature_names]
data_target = df[target]

# 70/30 train/test split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    data_features, data_target, train_size=0.70, test_size=0.30, random_state=1)

print("Length of X_train is: {X_train}".format(X_train=len(X_train)))
print("Length of X_test is: {X_test}".format(X_test=len(X_test)))
print("Length of y_train is: {y_train}".format(y_train=len(y_train)))
print("Length of y_test is: {y_test}".format(y_test=len(y_test)))

# Logistic regression
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
model = LogisticRegression()
model.fit(X_train, y_train.values.ravel())
pred = model.predict(X_test)
LRacc = model.score(X_train, y_train.values.ravel())  # training-set accuracy
print(LRacc)

# Support vector machine (SVC is the classifier; the original used SVR, a
# regressor, whose score() reports R^2 rather than classification accuracy)
from sklearn.svm import SVC
SVM = SVC()
SVM.fit(X_train, y_train.values.ravel())
SVMacc = SVM.score(X_train, y_train.values.ravel())
print(SVMacc)

# Random forest
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
RF = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
RF.fit(X_train, y_train.values.ravel())
y_pred = RF.predict([[702.0144268, 6318.129841, 0, 0, 0, 0, 0, 0, 0]])  # single sample prediction
RFacc = RF.score(X_train, y_train.values.ravel())
print(RFacc)

# Multi-layer perceptron
NN = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1)
NN.fit(X_train, y_train.values.ravel())
NNacc = NN.score(X_train, y_train.values.ravel())
print(NNacc)

# Persist the trained neural network for the Flask app
file = open('my_model.pkl', 'wb')
pickle.dump(NN, file, protocol=3)
file.close()

# Confusion matrix heatmap for the logistic regression predictions
class_names = ['N', 'Y']
matrix = confusion_matrix(y_test.values.ravel(), pred)
dataframe = pd.DataFrame(matrix, index=class_names, columns=class_names)
sns.heatmap(dataframe, annot=True, cbar=None, cmap="Blues", fmt='g')
plt.title("Confusion Matrix"), plt.tight_layout()
plt.ylabel("True Class"), plt.xlabel("Predicted Class")
plt.show()

# Heatmap of feature correlations
corr = df.corr()
plt.figure(figsize=(12, 10))
heat = sns.heatmap(data=corr)
plt.title('Heatmap of Correlation')
App.py

import pickle

import numpy as np
from flask import Flask, render_template, request

# Load the model trained and pickled by Index.py
file = open('my_model.pkl', 'rb')
model = pickle.load(file)
file.close()

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def hello_world():
    if request.method == 'POST':
        # Read the nine feature values submitted by the input form
        mydict = request.form
        v1 = int(mydict['v1'])
        v2 = int(mydict['v2'])
        v3 = int(mydict['v3'])
        v4 = int(mydict['v4'])
        v5 = int(mydict['v5'])
        v6 = int(mydict['v6'])
        v7 = int(mydict['v7'])
        v8 = int(mydict['v8'])
        v9 = int(mydict['v9'])
        input_feature = [v1, v2, v3, v4, v5, v6, v7, v8, v9]
        input_feature = np.array(input_feature)
        # np.float was removed in recent NumPy releases; use the builtin float
        input_feature = input_feature.astype(float).reshape(1, -1)
        predict1 = model.predict(input_feature)
        print(predict1)
        return render_template('result.html', inf=predict1)
    return render_template('index.html')

if __name__ == '__main__':
    app.run(debug=False)
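
To try the app locally, run python App.py and open http://127.0.0.1:5000/ in a
browser (5000 is Flask's default port). Note that render_template expects a
templates/ directory containing index.html, with a form posting fields named v1
through v9, and result.html; these templates belong to the project but are not
reproduced here.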
