
Practicing Data Science

A Collection of Case Studies

Ivan Pazin
Kathrin Melcher

kathrin.melcher@knime.com

© 2019 KNIME AG. All Rights Reserved.


Agenda

• CRISP-DM
• Customer Intelligence
– Churn Prediction
– Customer Segmentation
• Anomaly Detection - Fraud Detection
• Text Classification - Sentiment Analysis



CRISP-DM

https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining



A Classic Data Science Project

It always starts with some data.

• Data Preparation
– Data Manipulation
– Data Blending
– Missing Values Handling
– Feature Generation
– Dimensionality Reduction
– Feature Selection
– Outlier Removal
– Normalization
– Partitioning
– …
• Model Training
– Model Training
– Bag of Models
– Model Selection
– Ensemble Models
– Own Ensemble Model
– External Models
– Import Existing Models
– Model Factory
– …
• Model Optimization
– Parameter Tuning
– Parameter Optimization
– Regularization
– Model Size
– No. Iterations
– …
• Model Testing
– Performance Measures
– Accuracy
– ROC Curve
– Cross-Validation
– …
• Deployment
– Files & DBs
– Dashboards
– REST API
– SQL Code Export
– Reporting
– …


Customer Intelligence



Churn Prediction: The Problem

CRM System: data about your customer
• Demographics
• Behavior
• Revenues
• …

Model

• Churn Prediction
• Upselling Likelihood
• Product Propensity / NBO
• Campaign Management
• Customer Segmentation


Churn Prediction: The Training Workflow



A Second Workflow for Deployment



Churn Prediction: Deployment on the WebPortal



Customer Segmentation: The Problem

CRM System: data about your customer
• Demographics
• Behavior
• Revenues

Model

• Churn Prediction
• Upselling Likelihood
• Product Propensity / NBO
• Customer Segmentation
• Campaign Management
• …


Customer Segmentation: The Clustering Workflow



Fraud & Anomaly Detection



Anomaly Detection: Use Cases

• Assembling Details
• System Health Monitoring
• Intrusion
• Weather Information
• Medicine
• Predictive Maintenance
• IoT
• Fault Detection
• Heart Beat
• Transactions
• Fraud Detection
• Networks
• Finance
• Sensor Data


Fraud & Anomaly Detection
What do all those use cases have in common?
• Discover rare events that shouldn’t happen => often no labeled data
• Find a problem before other people see it => anomaly is unknown

How can we train a model to detect fraudulent transactions without a labeled dataset?

Transaction Data → Model → Prediction: Fraud or Non-Fraud?

Fraud Detection using Autoencoder

Input Layer (input $x$) → Hidden Layers → Output Layer (output $x'$)

• Trained with backpropagation on just “normal” transactions
• If distance > threshold => possible fraud

Execution of the network:

$\lVert x_{new} - x'_{new} \rVert > \delta \Rightarrow$ anomaly
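
The thresholding rule on this slide can be illustrated with a minimal sketch. This is not the KNIME workflow from the talk. It is a small NumPy autoencoder written for illustration, and the synthetic "transactions", the network size, the learning rate, and the choice of the 99th percentile of the training error as the threshold $\delta$ are all assumptions. The network is trained with backpropagation on "normal" data only, and a new record is flagged when its reconstruction distance exceeds $\delta$.

```python
import numpy as np


def train_autoencoder(X, hidden=2, epochs=2000, lr=0.1, seed=0):
    """Train a tiny autoencoder (tanh hidden layer, linear output) with backpropagation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)             # encoder
        X_hat = H @ W2 + b2                  # decoder: reconstruction x'
        err = (X_hat - X) / n                # gradient of 0.5 * mean squared error w.r.t. X_hat
        dH = (err @ W2.T) * (1.0 - H ** 2)   # backpropagate through tanh
        W2 -= lr * (H.T @ err)
        b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2


def reconstruction_error(X, params):
    """Euclidean distance ||x - x'|| between each row and its reconstruction."""
    W1, b1, W2, b2 = params
    X_hat = np.tanh(X @ W1 + b1) @ W2 + b2
    return np.linalg.norm(X - X_hat, axis=1)


# Synthetic "normal" transactions (assumption): features 1-2 and 3-4 move together.
rng = np.random.default_rng(42)
z = rng.uniform(-1.0, 1.0, (500, 2))
X_normal = np.column_stack([z[:, 0], z[:, 0], z[:, 1], z[:, 1]])
X_normal += rng.normal(0.0, 0.01, X_normal.shape)

params = train_autoencoder(X_normal)

# Threshold delta (assumption): 99th percentile of the error on normal data.
delta = np.percentile(reconstruction_error(X_normal, params), 99)

# A new record that breaks the usual relationship between features 1 and 2.
x_new = np.array([[3.0, -3.0, 0.0, 0.0]])
is_fraud = reconstruction_error(x_new, params)[0] > delta
print("possible fraud" if is_fraud else "looks normal")
```

The percentile-based threshold is only one option. In practice $\delta$ would be tuned against whatever false-alarm rate is acceptable.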



Fraud Detection: without Fraud Examples



Deployment via REST on KNIME Server
Workflow deployed as (REST) web service on KNIME Server

Workflow calling another workflow on KNIME Server



Text Classification - Sentiment Analysis



Sentiment Analysis – An Example



Sentiment Analysis
Task: Determine the opinion expressed in a document/text, e.g. positive or negative.

Sentiment Analysis = Opinion Mining = Emotion AI

• Lexicon Based
• Machine Learning
• Deep Learning
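
Of the approaches named on this slide, the lexicon-based one is the simplest to sketch. The example below is an illustration under stated assumptions, not the workflow shown in the talk: the two word lists are tiny hypothetical lexicons, the tokenizer is a plain regular expression, and the "neutral" label for ties is an addition to the positive/negative classes mentioned above.

```python
import re

# Hypothetical toy lexicons (assumption); real lexicons contain thousands of scored terms.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "awful"}


def lexicon_sentiment(text):
    """Label a text by counting positive and negative lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # tie or no lexicon words found (assumption)


print(lexicon_sentiment("I love this great product"))
print(lexicon_sentiment("Terrible, awful service"))
```

A counting scheme like this ignores negation ("not good") and context, which is one reason to move on to the machine learning and deep learning approaches.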



Sentiment Analysis



Recommendation Engine



Recommendation Engines or Market Basket Analysis
Recommendation
Model

IF =>



Market Basket Analysis: with Association Rules
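
An association rule has the form IF antecedent => consequent and is usually judged by its support and confidence. The sketch below is a minimal illustration, not the KNIME workflow from the talk: the transactions are made up, only single-item antecedents are considered, and the minimum confidence of 0.6 is an arbitrary assumption.

```python
# Made-up shopping baskets (assumption).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule IF antecedent => consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


def recommend(basket, transactions, min_conf=0.6):
    """Recommend items b for which some rule IF {a} => {b}, a in basket, is confident enough."""
    all_items = set().union(*transactions)
    picks = set()
    for b in all_items - basket:
        for a in basket:
            if confidence({a}, {b}, transactions) >= min_conf:
                picks.add(b)
    return sorted(picks)


print(confidence({"bread"}, {"butter"}, transactions))
print(recommend({"bread"}, transactions))
```

Here the rule IF bread => butter holds in 3 of the 4 baskets that contain bread, so butter is the only item recommended for a basket containing just bread.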



Recommendation Engine/MBA: Deployment



Creative AI



Creative AI: The Problems

• Free Text Generation


– Simulating a writing style
– Writing in different languages
– Providing an answer in a specific style
– Generating candidates for product names

• Image Neuro-Styling
– Picasso
– Botero
– Matisse
– Manet
– …



Creative AI: Deployment and Results

Generated rap lyrics:

Yo!
This post is about generating free text
with a deep learning network
particularly it is about Brick X6,
Phey, cabe,
make you feel soom the way (I smoke good!)
I probably make (What?)
More money in six months,
Than what's in your papa's safe (I'm serious)
Look like I robbed a bank (Okay Okay)
I set it off like Queen Latifah
'Cause I'm living single I'm feeling cautious
I ain't scream when they served a subpoena
(Can't go back to jail)
I heard that he a leader
(Who pood, what to be f*****' up
The baugerout Black alro Black X6,
Phantom White X6 looks like a panda
Goin' out like I'm Montana
Hundred killers, hundred hammers Black X6,
Phantom White X6, panda
Pockets swole, Danny
Sellin' bar, candy
Man I'm the macho like Randy
The choppa go Oscar for Grammy
B**** n**** pull up ya panty
Hope you killas understand me
Hey Panda, Panda Panda,
Panda, Panda, Panda, Panda
I got broads in Atlanta
Twistin' dope, lean, and the Fanta
Credit cards and the scammers
Hittin' off licks in the bando

Generated Shakespeare-style dialogue (start text: a license agreement):

This License refers to version of the GNU General Public
License. Copyright also means copyright-bick,
Remade me any thing to his sword
To his salt and most hidden loose to be so for sings, but not in
a libutt of his matter than that shall be sure as will be soldye
As master compary, do not live in traitor.
Bless thy five wits!
-Kent
O pity!
Sir, where is the patience now,
That this is so far from the sea and some bidings to dismantle
So many folds of save and honest.
-Brabantio
I must not think the Turk of Cassio in the strange metting the
cribles of a charmer be the reviling of libe to say
That I can deceive him to the best advantage,
In her prophetic fairs of a little to presently at your powers;
whereof I thank you, sir.
-Albany
Gloucester, I will prove upancy of his sport and first accuriors
and guard and talking on the white.
-King Lear
Where are the thief?
Thou shalt never have the captains at the letter
To the Moor and thing we have not the better shall be sure as
worth if he be anger—
-Regan
I pray you, have a countend more than think to do a proclaim’d
there of my heart, Hot

The words save, honest, thief, master, traitor, and deceive seem to fit the context. Notice also that the dialogue sprouting from the start text of the license agreement interestingly involves mainly minor, less tragic characters from the plays.

Generated Italian text:

Caro amico ti scrivo così mi distraggo un po' E siccome sei molto lontano più forte ti scriverò.
Da quella prima folla strana, che aveva preso il suo nome, e di correre alla casa di don Abbondio, con un viso bene di non poterci andar la casa del padre Cristoforo, e gli disse che s'avvicinava all'uscio, e si mise a sparse di corsa, e di stare a sé, verso la strada di servizio, chiesto le parole che gli andavan dall'altra stanza, e con la sua condizione de' cappuccini, e di consigli ricerche di confidenza delle gride, nel suo passaggio, se non pensava con una certa ripugnanza a casa sua, che andavano a scomparire in un campo di buone ragioni che avevan potuto raccogliere i suoi pensieri, e di sopra non senza interrogare, che la sua avventura aveva fatto predicare, e con la forza d'un fatto come fuggitive che aveva preso il suo nome, e di correre alla casa di don Abbondio, con un cappuccino di quella sorte, con un certo sospiro, alzando le sue finestre, e le diede un'occhiata in carrozza. Si vendano a metter nelle mani di chi era stato a sedere sur una strada così fatta con le braccia in


Practicing Data Science: 22 Case Studies
• Disease Tagging
• Customer Segmentation
• Market Basket Analysis
• ClickStream Analysis
• Churn Prediction
• Recommendation Engine
• Influencers
• Product Naming
• Translation
• Topic Detection
• Taxi Demand Prediction
• E-mail Classification
• Sentiment Analysis
• Sentiment & Influencers
• Credit Risk Assessment
• Anomaly Detection
• Fraud Detection


Free Book as a Thank You
Free Copy of e-book:
“Practicing Data Science.
A Collection of Case Studies”
from KNIME Press

https://www.knime.com/knimepress

with this code: ZAGREB-1019



Stay connected with KNIME

Blog: knime.com/blog
Forum: forum.knime.com
KNIME Hub: hub.knime.com
KNIME E-Learning Course: www.knime.com/e-learning-course
Follow us on social media.


The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME AG under license from KNIME GmbH,
and are registered in the United States. KNIME® is also registered in Germany.

