9-Apr ML Fridays - Technical Edition - Fraud Detection Using Deep Graph Networks

ML Fridays
Fraud Detection using Deep Graph Networks
Arun Kumar Lokanatha – AI/ML solution architect

Date – 09/04/2021
© 2020, Amazon Web Services, Inc. or its Affiliates.

Table of contents
• Fraud Types and Market Drivers
• Rule based Systems
• Fraud Detection using Machine Learning
  • Supervised
  • Unsupervised
• Fraud Detection using Deep Learning
  • Autoencoders
  • Deep Graph Networks
• Code Examples, Demo and Getting Started guides

Fraud comes in all shapes and forms
Payment fraud
• Compromised payment instruments (e.g., stolen cards)
• Intentional nonpayment (e.g., prepaid cards)

Account takeover or compromise
• User name and password
• API key
Abuse
• Free tier misuse
• Premium phone number

Market Drivers

Fraud is big business
• 120%: Account takeover losses reached $5.1 billion in 2017, down from 280% growth in 2015. (Javelin Research)
• $5.1T: Global Cost of Fraud (2019). (Crowe.com)
• 113%: The increase in application fraud, 2016. (Forrester)
• 53%: Increase in Imposter Scams. (FTC)
• $130B: Expected loss by retailers to card-not-present fraud in the next 5 years. (Juniper Research)
• 28%: Global New Account Fraud increased 28% in 2019. (Jumio)

Payment Fraud Trends
2016 2018 2021 2023
$22.8b $27.9b $32.9b $35.7b
Data from The Nilson Report, November 2019, Issue 1164 (https://nilsonreport.com/upload/content_promo/The_Nilson_Report_Issue_1164.pdf)

Fraud Prevention strategies

Fraud prevention strategy
Prevention Detection Containment Remediation

The Fraud Detection filter
All incoming requests pass through the Fraud Detection Filter, with four possible outcomes:
• Correctly APPROVED (TN)
• Incorrectly DECLINED (FP): you lose customers & money
  • increased churn
  • negative reviews
  • lost revenue
• Incorrectly APPROVED (FN): you lose money (chargebacks & fees)
• Correctly DECLINED (TP)
• A trade-off between False Positives vs False Negatives
Trade-off efficiency (figure: False Negatives plotted against False Positives)
• Toward more False Positives: increase in revenue loss, negative reviews, customer churn
• Toward fewer False Negatives: decrease in fraud losses
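The trade-off can be made concrete by sweeping a decision threshold over model scores. The sketch below is illustrative only: the scores, labels, and the `confusion` helper are hypothetical and not taken from the talk. It assumes a request is declined when its fraud score is at or above the threshold.

```python
# Hypothetical fraud scores (higher = more suspicious) with true labels (1 = fraud).
scored = [
    (0.05, 0), (0.10, 0), (0.20, 0), (0.35, 0), (0.40, 1),
    (0.55, 0), (0.60, 1), (0.75, 0), (0.85, 1), (0.95, 1),
]


def confusion(scored, threshold):
    """Count TP, FP, TN, FN when requests scoring >= threshold are declined."""
    tp = fp = tn = fn = 0
    for score, is_fraud in scored:
        declined = score >= threshold
        if declined and is_fraud:
            tp += 1      # correctly declined
        elif declined and not is_fraud:
            fp += 1      # incorrectly declined: lost customer
        elif not declined and is_fraud:
            fn += 1      # incorrectly approved: chargeback
        else:
            tn += 1      # correctly approved
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


for threshold in (0.3, 0.5, 0.8):
    print(threshold, confusion(scored, threshold))
```

On this toy data, raising the threshold removes false positives but lets more fraud through, which is the trade-off the filter has to balance.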

Types of Fraudulent Behaviour
Layer 1: Endpoint authentication, e.g. stolen card or machine
Layer 2: Anomaly within a session, e.g. transfer before balance
Layer 3: Anomaly within an account, e.g. unusual spikes in transfer
Layer 4: Anomaly within multiple channels of the same account, e.g. spikes
Layer 5: Anomaly within multiple channels and multiple accounts, e.g. irregular transfer
Fraudulent transactions are anomalies
Fraud detection at its core is an anomaly detection problem

Outliers
Kaggle visa dataset

Fraudsters continually attack using different methods

There is no silver bullet algorithm or solution to prevent fraud

Rule Based Fraud Detection

Rule-based Fraud Detection
if IP_ADDRESS_LOCATION is 'Japan' and
   CUST_ADDRESS_COUNTRY is 'Japan' and
   CUSTOMER_PHONE_LOC is 'Spain'
then
   Investigate
Rules look for specific conditions or behaviors
Pros
• Straightforward to implement
• Easy to explain
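A minimal sketch of the example rule as a Python function. The field names come from the slide; the `needs_investigation` function and the dictionary layout of an event are assumptions made for illustration.

```python
def needs_investigation(event):
    """Apply the example rule: IP and customer address in Japan, phone in Spain."""
    return (
        event.get("IP_ADDRESS_LOCATION") == "Japan"
        and event.get("CUST_ADDRESS_COUNTRY") == "Japan"
        and event.get("CUSTOMER_PHONE_LOC") == "Spain"
    )


# Hypothetical events.
suspicious = {
    "IP_ADDRESS_LOCATION": "Japan",
    "CUST_ADDRESS_COUNTRY": "Japan",
    "CUSTOMER_PHONE_LOC": "Spain",
}
ordinary = dict(suspicious, CUSTOMER_PHONE_LOC="Japan")

print(needs_investigation(suspicious))
print(needs_investigation(ordinary))
```

Each new fraud pattern needs another hand-written condition of this kind, which is why rule sets grow and depend on experts to maintain.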
Difficulties with rule-based Fraud Detection
• $$$ billions lost to fraud each year
• Lower FP vs FN trade-off efficiency
• Difficult to adapt to new fraud patterns
• Rules = more human reviews
• Dependent on experts to update detection logic
Machine Learning for Fraud Detection

Using Machine Learning Algorithms for Fraud Detection
• Problem: What to do when we don’t have annotated data, but want to identify potentially fraudulent transactions?
• Two “flavors” of machine learning:
  • Supervised: Access to labeled data
  • Unsupervised: Access to features alone

Using Machine Learning for Supervised Learning
1. Feed labeled data to algorithm
2. Discover relationships between input and output
3. Apply solution to unseen data
4. Make predictions

Using Machine Learning for Unsupervised Learning
1. Feed raw data to algorithm
2. Uncover hidden patterns
3. Automatically flag anomalies
4. Investigate potential fraud
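The unsupervised flow can be illustrated without labels using a simple robust outlier score. This sketch is a stand-in, not the Random Cut Forest algorithm used in the talk: it flags values whose modified z-score (based on the median absolute deviation) exceeds a cutoff. The amounts, the `flag_anomalies` helper, and the 3.5 cutoff are assumptions.

```python
from statistics import median


def flag_anomalies(amounts, cutoff=3.5):
    """Return indices whose modified z-score (MAD-based) exceeds the cutoff."""
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # no spread to measure against
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > cutoff
    ]


# Hypothetical transaction amounts; no labels are used.
amounts = [12.0, 15.5, 14.2, 13.8, 16.1, 15.0, 980.0, 14.9, 13.3, 15.7]
print(flag_anomalies(amounts))
```

The flagged indices are candidates for investigation, not confirmed fraud, which matches the last step of the flow.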

Solution
• Labeled Data -> Train XGBoost Model using SageMaker* -> Deploy XGBoost Model
• Unlabeled Data -> Train Random Cut Forest Model using SageMaker* -> Deploy Random Cut Forest Model
• Live Data (e.g. incoming, real-time transactions) -> deployed models -> Predictions (e.g. anomalous transactions, fraud)
* SageMaker built-in algorithm

Solution Architecture
• Transactions: Amazon API Gateway, AWS Lambda
• Anomaly Detection: Amazon SageMaker (Random Cut Forest)
• Fraud Detection: Amazon SageMaker (XGBoost)
• Amazon S3 bucket (Model and Data)
• Optional: Amazon Kinesis Data Firehose, Amazon S3 bucket (Results), Amazon QuickSight

The challenge
• Coming up with features is difficult
  • Time consuming
  • Requires domain knowledge
  • Time for tuning the features
• Labeled datasets are often not available, specifically for fraud detection scenarios
• Imbalanced datasets often need additional techniques to mitigate the imbalance
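One of the simplest mitigation techniques for class imbalance is random oversampling of the minority class. The sketch below is an illustration under stated assumptions: it is plain duplication, not SMOTE, and the `oversample_minority` helper and the row layout are hypothetical. In practice this would be applied to the training split only.

```python
import random


def oversample_minority(rows, label_key="label", seed=0):
    """Duplicate minority-class rows at random until both classes are equal in size."""
    rng = random.Random(seed)
    fraud = [r for r in rows if r[label_key] == 1]
    legit = [r for r in rows if r[label_key] == 0]
    minority, majority = (fraud, legit) if len(fraud) < len(legit) else (legit, fraud)
    if not minority:
        raise ValueError("need at least one example of each class")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return rows + extra


# Hypothetical dataset: 8 legitimate rows, 2 fraudulent rows.
rows = [{"amount": 10 + i, "label": 0} for i in range(8)] + [
    {"amount": 900, "label": 1},
    {"amount": 750, "label": 1},
]
balanced = oversample_minority(rows)
print(sum(r["label"] for r in balanced), len(balanced))
```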

Deep Learning for Fraud Detection

Autoencoders

Learning features
• Anomaly != (Normal)
• Problem to solve: “Learn Normal”
• Learning normal is an easier problem to solve, since data for (Normal) is abundantly available
Limitations and Considerations
1. A drawback of the Autoencoder is that it cannot distinguish between fraudulent and normal transactions that have similar reconstruction errors.
2. A typical mitigation is to build an additional model (MLP) that further classifies the abnormal detections from the Autoencoder.
3. This reduces the number of labelled samples needed to build the Fraud Model.
4. All the models discussed above work on linear features, whereas real-world data is often connected.
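The "learn normal" idea can be sketched with a deliberately tiny autoencoder: a linear encoder and decoder with tied weights and a one-dimensional bottleneck, trained only on normal data, then scored by reconstruction error. Everything here is an assumption for illustration (the synthetic data, the `train` and `reconstruction_error` helpers, the learning rate). Real fraud models would use deeper, nonlinear autoencoders built with a deep learning framework.

```python
import random

rng = random.Random(0)
# Hypothetical "normal" data: two features that move together, plus small noise.
normal = []
for _ in range(200):
    t = rng.uniform(-1.0, 1.0)
    normal.append((t, t + rng.gauss(0.0, 0.05)))


def reconstruction_error(w, x):
    """Squared error of a tied-weight linear autoencoder with a 1-D bottleneck."""
    z = w[0] * x[0] + w[1] * x[1]               # encoder: z = w . x
    r0, r1 = x[0] - z * w[0], x[1] - z * w[1]   # residual after decoding z * w
    return r0 * r0 + r1 * r1


def train(data, lr=0.05, epochs=300):
    """Fit w by batch gradient descent on the mean reconstruction error."""
    w = [1.0, 0.0]
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x in data:
            z = w[0] * x[0] + w[1] * x[1]
            r0, r1 = x[0] - z * w[0], x[1] - z * w[1]
            rw = r0 * w[0] + r1 * w[1]
            # Gradient of |x - z*w|^2 with respect to w.
            g0 += -2.0 * (rw * x[0] + z * r0)
            g1 += -2.0 * (rw * x[1] + z * r1)
        w[0] -= lr * g0 / len(data)
        w[1] -= lr * g1 / len(data)
    return w


w = train(normal)
threshold = max(reconstruction_error(w, x) for x in normal)
anomaly = (1.0, -1.0)  # breaks the learned relationship between the features
print(reconstruction_error(w, anomaly) > threshold)
```

A point that violates the pattern learned from normal data reconstructs poorly, so its error stands out against the threshold taken from the normal set.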

Graph Neural Networks

Overview of graphs and graph neural networks

What are graphs
Abstract representation of relationships between entities.
• Nodes
• Edges
• Homogeneous or Heterogeneous
Graph learning Tasks
• Node Classification
  • Fraud detection
  • Target the right customers
• Link Prediction
  • Recommendations
  • Missing relations in a knowledge graph
• Graph Classification
  • Predict a property of a chemical compound

Graph learning and Node Embeddings
Transform nodes to a numerical representation
• Embed nodes to a low-dimension space
• Embeddings capture the essential task-specific information
• For example, node similarities in the embedding space approximate the similarities in the original graph.
Figure: Original graph (Zachary’s Karate Club) and its embeddings (representation in 2D)

Traditional Graph learning techniques
Generate embeddings by manual feature engineering
• Requires domain expertise, involves considerable manual fine-tuning, time consuming, does not scale, …
Automatically generate embeddings using unsupervised dimensionality reduction approaches
• Singular value decomposition, tensor decomposition, co-factorization, DeepWalk, etc.
• Cannot effectively combine rich attributes with network structure.
• Employ mostly (multi-)linear models.
• Do not allow for end-to-end learning.

Graph Neural Networks
A family of (deep) neural networks that learn node, edge, and graph embeddings

How Graph Neural Networks work
Graph Neural Networks are based on Message Passing
Figure: nodes v1 to v5 with hidden states h1 to h5; AGGREGATE collects neighbor states into the message m1 for node v1, and COMBINE uses m1 to update h1.
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017, August). Neural message passing for quantum chemistry.
Xu, K., Hu, W., Leskovec, J., & Jegelka, S. (2018). How powerful are graph neural networks?
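Graph Neural Network models
Message from neighbor $w$ to node $v$ at layer $l$:
• GCN: $M_{vw}^{(l)} = \dfrac{h_w^{(l-1)}}{d_v + 1}$
• GAT: $M_{vw}^{(l)} = \alpha_{vw}^{(l)} \, h_w^{(l-1)}$
• R-GCN: $M_{vw}^{(l)} = \dfrac{1}{c_{v,r}} W_r^{(l)} h_w^{(l-1)}$
AGGREGATE: $m_v^{(l)} = \sum_{w \in N(v) \cup \{v\}} M_{vw}^{(l)}$
COMBINE: $h_v^{(l)} = \phi\left(m_v^{(l)} W^{(l)}\right)$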
Kipf, T. N., & Welling, M. (2016). Semi-supervised classification with graph convolutional networks.
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., & Bengio, Y. (2017). Graph attention networks.
Schlichtkrull, M., Kipf, T. N., Bloem, P., Van Den Berg, R., Titov, I., & Welling, M. (2018, June). Modeling relational data with graph convolutional networks.
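A minimal sketch of one GCN-style message-passing layer in plain Python, where each message is the sender's state divided by d_v + 1, messages are summed over N(v) ∪ {v}, and the result is passed through a weight matrix and a ReLU for φ. The star-shaped toy graph, the feature values, and the `gcn_layer` helper are assumptions; a real model would use DGL's built-in layers rather than this loop.

```python
def gcn_layer(adjacency, h, weight):
    """One message-passing layer: normalized sum over N(v) and v, then linear map + ReLU."""
    out_dim = len(weight[0])
    new_h = {}
    for v, neighbors in adjacency.items():
        senders = list(neighbors) + [v]      # N(v) union {v}
        norm = len(neighbors) + 1            # d_v + 1
        # AGGREGATE: m_v = sum over senders of h_w / (d_v + 1)
        m = [sum(h[w][i] for w in senders) / norm for i in range(len(h[v]))]
        # COMBINE: h_v = ReLU(m_v W)
        new_h[v] = [
            max(0.0, sum(m[i] * weight[i][j] for i in range(len(m))))
            for j in range(out_dim)
        ]
    return new_h


# Hypothetical toy graph: node 1 connected to nodes 2, 3, 4 and 5.
adjacency = {1: [2, 3, 4, 5], 2: [1], 3: [1], 4: [1], 5: [1]}
h = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.0, 1.0], 4: [0.0, 1.0], 5: [0.0, 1.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned W

h_next = gcn_layer(adjacency, h, identity)
print(h_next[1], h_next[2])
```

After one layer, node 1's state is dominated by its four neighbors, which is how signals from connected entities reach a node's embedding.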
Deep Graph Library (DGL) for Deep Learning on Graphs
1. DGL is a toolkit for Deep Learning on graphs.
2. Supports all popular frameworks.
3. Comes with predefined Graph Networks for GCN, R-GCN, GAT, etc.
4. Makes it easier to build your own custom networks by providing common functions.
5. Benefits of using DGL on Amazon SageMaker:
   - Efficiently train models for graphs with up to millions of nodes and billions of edges.
https://docs.aws.amazon.com/sagemaker/latest/dg/deep-graph-library.html
Fraud detection with graph neural networks

Fraud Detection with Graphs
• Common Issue: Fraudsters can evolve to fool rule-based methods or simple feature-based methods
• Observation: Fraudsters cannot mask their behavior with respect to the full interaction graph
  • Often connected objects -> guilt-by-association
  • Combine weak signals from individual nodes to derive stronger ones
• Node Aggregation: fraudulent or malicious users tend to connect with many other users or entities
• Activity Aggregation: fraudulent or malicious users are linked to accounts that act in a coordinated fashion in short bursts of time
Fraud Detection - Formulation
User   DeviceID         IP Address     MAC Address        Label
00049  990000862471854  216.3.128.12   00:0a:95:9d:68:16  0
⋮      ⋮                ⋮              ⋮                  ⋮
06302  351756051523999  66.249.64.163  00:0a:95:9d:68:16  1
Context
A user signs up and, once some usage data is collected, predict whether the user is fraudulent or not.
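A sketch of how such a table could be turned into a heterogeneous graph: each user is linked to the identifier nodes it has used, and identifiers attached to more than one user become the paths along which guilt-by-association travels. The two rows reuse the example values from the table; the `build_edges` and `users_sharing_identifiers` helpers and the column keys are hypothetical.

```python
from collections import defaultdict

# The two example rows from the formulation table (intermediate rows are elided there).
rows = [
    {"user": "00049", "device": "990000862471854", "ip": "216.3.128.12",
     "mac": "00:0a:95:9d:68:16", "label": 0},
    {"user": "06302", "device": "351756051523999", "ip": "66.249.64.163",
     "mac": "00:0a:95:9d:68:16", "label": 1},
]


def build_edges(rows, identifier_columns=("device", "ip", "mac")):
    """Return (user, relation, identifier) edges of a heterogeneous graph."""
    return [(r["user"], col, r[col]) for r in rows for col in identifier_columns]


def users_sharing_identifiers(edges):
    """Map each identifier node to its users, keeping only shared identifiers."""
    attached = defaultdict(set)
    for user, relation, value in edges:
        attached[(relation, value)].add(user)
    return {node: users for node, users in attached.items() if len(users) > 1}


edges = build_edges(rows)
shared = users_sharing_identifiers(edges)
print(shared)
```

Here the two users share a MAC address, so a graph model can pass information between them even though their device IDs and IP addresses differ.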

Fraud Detection – User Features
User   Time stamp            Activity         Success  Trans_amt  Bal Amt
00049  09/10/2007 @ 12:45am  ChangedPassword  0        N/A        N/A
⋮      ⋮                     ⋮                ⋮        ⋮          ⋮
06302  09/11/2007 @ 11:45pm  BalanceTransfer  1        3419       0
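Feature Transformation
Each user's activity log is flattened into one feature row with these column groups:
• day 1 ⋯ day 30, each split into hour 0 ⋯ hour 23
• Activity_x
• success
• Trans_amt, bucketed from ≤ 10^3 to > 10^6
Example row for user 00049: 18 ⋯ 18 ⋯ 0 ⋯ 0 3 -15 8 ⋯ 0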
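A rough sketch of this kind of transformation: an activity log is counted into per-day, per-hour buckets, per-activity counts, a success count, and transaction-amount buckets. The log entries, the `transform` helper, the month/day/year reading of the time stamps, and the use of only two amount buckets are all assumptions made to keep the example short.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log for one user: (time stamp, activity, success, transaction amount).
log = [
    ("09/10/2007 @ 12:45am", "ChangedPassword", 0, None),
    ("09/10/2007 @ 12:50am", "BalanceTransfer", 1, 3419),
    ("09/11/2007 @ 11:45pm", "BalanceTransfer", 1, 250),
]


def parse(ts):
    """Parse a time stamp, assuming month/day/year order."""
    return datetime.strptime(ts, "%m/%d/%Y @ %I:%M%p")


def transform(log, first_day):
    """Count events per (day number, hour), per activity, and per amount bucket."""
    hourly, activity, amount_bucket = Counter(), Counter(), Counter()
    successes = 0
    for ts, name, success, amount in log:
        when = parse(ts)
        day_number = (when.date() - first_day).days + 1   # day 1, day 2, ...
        hourly[(day_number, when.hour)] += 1
        activity[name] += 1
        successes += success
        if amount is not None:
            amount_bucket["<=1e3" if amount <= 1e3 else ">1e3"] += 1
    return hourly, activity, successes, amount_bucket


hourly, activity, successes, amount_bucket = transform(log, parse(log[0][0]).date())
print(dict(hourly), dict(activity), successes, dict(amount_bucket))
```

Laying the counters out in a fixed column order gives every user a feature vector of the same length, which is what the graph model consumes as node features.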
Using Graph Neural Networks to Detect Fraud
Graph + User features (n × m matrix, x_11 ⋯ x_nm) -> R-GNN Layer -> … -> R-GNN Layer -> User embeddings -> Classifier

Fraud Detection – GNN Results

Code Walkthrough
https://github.com/awslabs/sagemaker-graph-fraud-detection

Architecture using Amazon SageMaker

Resources

Resources
• Random Cut Forest Documentation: AWS Docs

• XGBoost Documentation: AWS Docs
• Imbalanced-learn (SMOTE): Library Documentation
• SMOTE original publication: On ArXiv
• Learning from imbalanced data review article: DOI Link
• Fraud Detection using Auto encoders - Github Link

Getting Started
All code can be found on GitHub:
https://github.com/awslabs/sagemaker-graph-fraud-detection

Q&A
Arun Kumar Lokanatha
AI/ML Solution Architect

Thank You
Arun Kumar L
