Professional Documents
Culture Documents
®
Copyright © 2020 KNIME AG 12
What is KNIME Analytics Platform?
®
Copyright © 2020 KNIME AG 23
Visual KNIME Workflows
®
Copyright © 2020 KNIME AG 34
Data Access
• Databases
– MySQL, PostgreSQL
– any JDBC (Oracle, DB2, MS SQL
Server)
• Files
– CSV, txt
– Excel, Word, PDF
– SAS, SPSS
– XML
– PMML
– Images, texts, networks, chem
• Web, Cloud
– REST, Web services
– Twitter, Google
®
Copyright © 2020 KNIME AG 45
Big Data
®
Copyright © 2020 KNIME AG 56
Transformation
• Preprocessing
– Row, column, matrix based
• Data blending
– Join, concatenate, append
• Aggregation
– Grouping, pivoting, binning
• Feature Creation and Selection
®
Copyright © 2020 KNIME AG 67
Analysis & Data Mining
• Regression
– Linear, logistic
• Classification
– Decision tree, ensembles, SVM,
MLP, Naïve Bayes
• Clustering
– k-means, DBSCAN, hierarchical
• Validation
– Cross-validation, scoring, ROC
• Deep Learning
– Keras, DL4J
• External
– R, Python, Weka, H2O, Keras
®
Copyright © 2020 KNIME AG 78
Visualization
• Interactive Visualizations
• JavaScript-based nodes
– Scatter Plot, Box Plot, Line Plot
– Networks, ROC Curve, Decision
Tree
– Plotly Integration
– Adding more with each release!
• Misc
– Tag cloud, open street map,
molecules
• Script-based visualizations
– R, Python
®
Copyright © 2020 KNIME AG 89
Deployment
• Database
• Files
– Excel, CSV, txt
– XML
– PMML
– to: local, KNIME Server, SSH-,
FTP-Server
• BIRT Reporting
®
Copyright © 2020 KNIME AG 9
10
Over 2000 Native and Embedded Nodes Included:
®
Copyright © 2020 KNIME AG 10
11
Overview
®
Copyright © 2020 KNIME AG 11
12
Install KNIME Analytics Platform
®
Copyright © 2020 KNIME AG 12
13
Start KNIME Analytics Platform
®
Copyright © 2020 KNIME AG 13
14
The KNIME Workspace
®
Copyright © 2020 KNIME AG 14
15
The KNIME Workbench
®
Copyright © 2020 KNIME AG 15
16
KNIME Explorer
®
Copyright © 2020 KNIME AG 16
17
Creating New Workflows, Importing and Exporting
®
Copyright © 2020 KNIME AG 17
18
Node Repository
®
Copyright © 2020 KNIME AG 18
19
Console and Other Views
®
Copyright © 2020 KNIME AG 19
20
Description
®
Copyright © 2020 KNIME AG 20
21
Workflow Description
®
Copyright © 2020 KNIME AG 21
22
Workflow Coach
®
Copyright © 2020 KNIME AG 22
23
Tool Bar
The buttons in the toolbar can be used for the active workflow. The most
important buttons:
– Execute selected and executable nodes (F7)
– Execute all executable nodes
– Execute selected nodes and open first view
– Cancel all selected, running nodes (F9)
– Cancel all running nodes
®
Copyright © 2020 KNIME AG 23
24
KNIME File Extensions
®
Copyright © 2020 KNIME AG 24
25
More on Nodes…
Not Configured:
The node is waiting for configuration or incoming data.
Configured:
The node has been configured correctly, and can be executed.
Executed:
The node has been successfully executed. Results may be
viewed and used in downstream nodes.
®
Copyright © 2020 KNIME AG 25
26
Inserting and Connecting Nodes
Image
DB Connection DB Data
Data
®
Copyright © 2020 KNIME AG 26
27
Node Configuration
®
Copyright © 2020 KNIME AG 27
28
Node Execution
• Right-click node
• Select Execute in the context menu
• If execution is successful, status
shows green light
• If execution encounters errors, status
shows red light
®
Copyright © 2020 KNIME AG 28
29
Node Views
• Right-click node
• Select Views in context menu
• Select output port to inspect execution results
®
Copyright © 2020 KNIME AG 29
30
Curved Connections!
®
Copyright © 2020 KNIME AG 30
31
Getting Started: KNIME Example Server
®
Copyright © 2020 KNIME AG 31
32
Sharing Workflows
How to use the KNIME Hub
®
Copyright © 2020 KNIME AG 32
33
KNIME Hub
®
Copyright © 2020 KNIME AG 34
35
Searching Nodes and Workflows
®
Copyright © 2020 KNIME AG 35
36
Opening a Workflow from the Hub
®
Copyright © 2020 KNIME AG 36
37
Open Workflow in KNIME Analytics Platform
®
Copyright © 2020 KNIME AG 37
38
Saving the Workflow
®
Copyright © 2020 KNIME AG 38
39
Edit the Workflow
®
Copyright © 2020 KNIME AG 39
40
Sharing the Workflow
1. Save your Edits
®
Copyright © 2020 KNIME AG 40
41
Log in the Hub
KNIME Forum
Account Credentials
®
Copyright © 2020 KNIME AG 41
42
Publish your Workflow
1. Edit Metadata
®
Copyright © 2020 KNIME AG 42
43
Open your Workflow in the Hub
®
Copyright © 2020 KNIME AG 43
44
Open your Workflow in the Hub
®
Copyright © 2020 KNIME AG 44
45
Hot Keys (for Future Reference)
Task Hot key Description
Node Configuration F6 opens the configuration window of the selected node
F7 executes selected configured nodes
Shift + F7 executes all configured nodes
Node Execution Shift + F10 executes all configured nodes and opens all views
F9 cancels selected running nodes
Shift + F9 cancels all running nodes
Node Connections Ctrl + L connects selected nodes
Ctrl + Shift + L disconnects selected nodes
Ctrl + Shift + Arrow moves the selected node in the arrow direction
Move Nodes and Ctrl + Shift + PgUp/PgDown moves the selected annotation in the front or in the back of all
Annotations overlapping annotations
F8 resets selected nodes
Ctrl + S saves the workflow
Workflow Operations
Ctrl + Shift + S saves all open workflows
Ctrl + Shift + W closes all open workflows
Metanode Shift + F12 opens metanode wizard
®
Copyright © 2020 KNIME AG 45
46
Stay connected with KNIME
Blog: knime.com/blog
Follow us on social
media:
Forum: forum.knime.com
KNIME Hub:
hub.knime.com
®
Copyright © 2020 KNIME AG 46
47
Today’s Example: Churn Prediction
®
Copyright © 2020 KNIME AG 1
48
Today’s Example: Churn Prediction
It always starts
with some data
…
Data Manipulation Model Training Parameter Tuning Performance Measures Files & DBs
Data Blending Bag of Models Parameter Optimization Accuracy Dashboards
Missing Values Handling Model Selection Regularization ROC Curve REST API
Feature Generation Ensemble Models Model Size Cross-Validation SQL Code Export
Dimensionality Reduction Own Ensemble Model No. Iterations … Reporting
Feature Selection External Models … …
Outlier Removal Import Existing Models
Normalization Model Factory
Partitioning …
…
®
Copyright © 2020 KNIME AG 49
The Data
• The data files used in the exercises are available in the “data”
folder: data files in different file formats, web-based data, data
on a database, etc.
®
Copyright © 2020 KNIME AG 3
50
Today’s Example: Churn Prediction
®
Copyright © 2020 KNIME AG 1
52
Data Source Nodes
Output port
Status
Node label
®
Copyright © 2020 KNIME AG 2
53
New Node: File Reader
®
Copyright © 2020 KNIME AG 3
54
File Reader Configuration
File path
Basic
Settings
Advanced
Settings
Preview
Help Button
®
Copyright © 2020 KNIME AG 4
55
Alternative Faster Way …
®
Copyright © 2020 KNIME AG 5
56
Filenames and the knime:// Protocol
Absolute URL
Mountpoint-relative URL
Local path
®
Copyright © 2020 KNIME AG 6
57
Workflow-Relative File Paths
®
Copyright © 2020 KNIME AG 7
58
New Node: Excel Reader (XLS)
®
Copyright © 2020 KNIME AG 8
59
Excel Reader Configuration
File path
File system
Sheet
specific
settings
Preview
®
Copyright © 2020 KNIME AG 9
60
Filenames and the knime:// Protocol
Absolute URL
Local Path
Custom URL
®
Copyright © 2020 KNIME AG 10
61
New Node: Table Reader
File path
®
Copyright © 2020 KNIME AG 12
63
New Nodes: Database Connectors
®
Copyright © 2020 KNIME AG 13
64
Other Useful Data Sources
®
Copyright © 2020 KNIME AG 14
65
Importing Data Exercise
JSON Response:
XML Response:
®
Copyright © 2020 KNIME AG 16
67
RESTful Web Services
Provide authentication
Enter URL, or use if necessary
from column
https://www.knime.com/blog/a-restful-way-to-find-and-retrieve-data
https://www.knime.com/blog/OSM-meets-CSV-file-and-Google-API
®
Copyright © 2020 KNIME AG 17
68
JSON Reader and JSON Path nodes
• Use the JSON Reader (or GET Request) node to get a JSON
cell
• Use the JSON Path node to query the JSON file and extract
parameters
• Editor window simplifies
construction of JSON queries
by auto-generating them
(click on properties)
®
Copyright © 2020 KNIME AG 18
69
Google Sheets
®
Copyright © 2020 KNIME AG 20
71
Today’s Example
®
Copyright © 2020 KNIME AG 21
72
Data Manipulation
Clean, Join, Aggregate
®
Copyright © 2020 KNIME AG 1
73
Data Manipulation Nodes
®
Copyright © 2020 KNIME AG 2
74
New Node: Concatenate
®
Copyright © 2020 KNIME AG 3
75
Dynamic Ports
Add and remove node ports based on your needs, e.g. in order
to concatenate three or more tables
Click on the three dots
in the bottom left
corner
®
Copyright © 2020 KNIME AG 4
76
New Node: Cell Replacer
®
Copyright © 2020 KNIME AG 5
77
New Node: String Manipulation
®
Copyright © 2020 KNIME AG 6
78
Data Manipulation Exercise, Activity I
®
Copyright © 2020 KNIME AG 7
79
Joining Columns of Data
Left Table Right Table
CustomerKey OrderDate OrderID CustomerKey DoB City Gender
Join by CustomerKey
22 2019-09-23 #23444 17 1974-02-23 Berlin F
10 1993-01-13 Berlin M
CustomerKey OrderDate OrderID DoB City Gender CustomerKey OrderDate OrderID DoB City Gender
®
Copyright © 2020 KNIME AG 8
80
Joining Columns of Data
Left Table Right Table
CustomerKey OrderDate OrderID CustomerKey DoB City Gender
Join by CustomerKey
22 2019-09-23 #23444 17 1974-02-23 Berlin F
10 1993-01-13 Berlin M
65 ? ? 2001-05-25 Stuttgart F
35 ? ? 1988-08-05 Cologne M
24 2019-09-30 #23457 ? ? ?
®
Copyright © 2020 KNIME AG 9
81
New Node: Joiner
®
Copyright © 2020 KNIME AG 10
82
Joiner Configuration – Linking Rows
Joiner mode
®
Copyright © 2020 KNIME AG 11
83
Joiner Configuration – Column Selection
®
Copyright © 2020 KNIME AG 12
84
Data Aggregation
®
Copyright © 2020 KNIME AG 13
85
New Node: GroupBy
Aggregation methods
Select criteria to
keep row
®
Copyright © 2020 KNIME AG 15
87
New Node: Column Expression
®
Copyright © 2020 KNIME AG 16
88
Workflow Organization and
Documentation
®
Copyright © 2020 KNIME AG 17
89
Comments & Annotations
Double-click to
Double-click to write
write
Use the
Use thepanel
paneltoto
change properties
change properties
®
Copyright © 2020 KNIME AG 18
90
Workflow Organisation – Good Practices
• Workflow annotations
• Node labels
• Metanodes
− Right click -> Create
Metanode...
− Organize workflow by
task
− Hide complexity &
improve readability
®
Copyright © 2020 KNIME AG 19
91
Workflow Organisation – Components
• Component encapsulates
a reusable functionality as
a KNIME workflow
• Components can be
configured as any KNIME
nodes Drag and drop from the
KNIME Hub to your workflow
• Access and share
components on the KNIME
Hub
®
Copyright © 2020 KNIME AG 20
92
KNIME WorkflowDiff
®
Copyright © 2020 KNIME AG 21
93
Data Manipulation Exercise, Activity II
®
Copyright © 2020 KNIME AG 22
94
Today’s Example
®
Copyright © 2020 KNIME AG 23
95
Data Visualization
Charts and Tables
®
Copyright © 2020 KNIME AG 1
96
Data Visualization
®
Copyright © 2020 KNIME AG 2
97
Visualizations using 1 Column
®
Copyright © 2020 KNIME AG 98
Visualizations using 2 Columns
1 Column
®
Copyright © 2020 KNIME AG 99
Visualizations using 3 Columns
®
Copyright © 2020 KNIME AG 100
New Node: Scatter Plot
Image
outport
• Plots different columns on X and Y
• Displays data including color
information Interactivity
options
• Produces an interactive view and
an image
• Select data points and publish
selection to other views
®
Copyright © 2020 KNIME AG 6
101
New Node: Scatter Plot
®
Copyright © 2020 KNIME AG 7
102
New Node: Color Manager
Color range
Discrete
for numerical
colors for
values
nominal
values
®
Copyright © 2020 KNIME AG 8
103
New Node: Bar Chart
®
Copyright © 2020 KNIME AG 9
104
New Node: Line Plot
®
Copyright © 2020 KNIME AG 10
105
New Node: Stacked Area Chart
®
Copyright © 2020 KNIME AG 11
106
Selection & Filtering in JavaScript Views
Apply selection
®
Copyright © 2020 KNIME AG 12
107
Components – Combined Views
Table View
®
Copyright © 2020 KNIME AG 13
108
Interactivity across Charts: Selection and Filter Events
®
Copyright © 2020 KNIME AG 109
Interactivity across Charts: Selection and Filter Events
®
Copyright © 2020 KNIME AG 110
Interactivity across Charts: Selection and Filter Events
• Click layout button when inside • Add views and rows via drag&drop
Component to assign views to rows • Add columns using + buttons
and columns
®
Copyright © 2020 KNIME AG 18
113
Data Aggregation
Aggregation: Count
Category Online Onsite
Product Store Category # Ordered
ID Items Clothing 2 1
P1 Online Clothing 2 Home 0 1
P2 Onsite Home 3 Electronics 2 0
P3 Onsite Clothing 1
P4 Online Clothing 5
P5 Online Electronics 7 Aggregation: Sum (# Ordered Items)
P6 Online Electronics 5 Category Online Onsite
Clothing 7 1
Home 0 3
Electronics 12 0
Solution: Pivoting Node
®
Copyright © 2020 KNIME AG 19 114
Data Aggregation
®
Copyright © 2020 KNIME AG 20 115
New Node: Pivoting
®
Copyright © 2020 KNIME AG 21
116
New Node: Pivoting
Groups ~ Rows
Pivots ~ Columns
Aggregation
®
Copyright © 2020 KNIME AG 22
117
Script-based View Nodes
®
Copyright © 2020 KNIME AG 23
118
Visualization Exercise
Start with exercise: Visualization
• Read sales.csv data
• Assign a different color to each product
• Plot BasketValue against BasketSize using the Scatter Plot node
• Show the total BasketValue by time and product in a Line Plot and a Stacked Area Chart
(Use the Pivoting node to get the sum of sales by Quarter and Product!)
®
Copyright © 2020 KNIME AG 25
119
Today’s Example
®
Copyright © 2020 KNIME AG 26
120
Data Mining
Partition, Learn, Predict, Score
®
Copyright © 2020 KNIME AG 1
121
Data Mining Strategies
Example Applications:
• Anomaly Detection (fraud, predictive maintenance)
• Association Rule Learning (market basket analysis)
• Clustering (market segmentation)
• Classification (next best offer, churn preventions)
• Regression (trend estimation)
®
Copyright © 2020 KNIME AG 2
122
Data Mining: Process Overview
Train
Training
Model
Set
Apply Score
Model Model
Original
Data Set
Test
Set
®
Copyright © 2020 KNIME AG 3
123
Data Mining in KNIME
®
Copyright © 2020 KNIME AG 4
124
New Node: Partitioning
®
Copyright © 2020 KNIME AG 5
125
Learner-Predictor Motif
New Data!
®
Copyright © 2020 KNIME AG 6
126
Classification
• Methods
- Decision Trees
- Neural Networks
- Naïve Bayes
- Logistic Regression
®
Copyright © 2020 KNIME AG 7
127
Target Column
®
Copyright © 2020 KNIME AG 8
128
KNIME’s Decision Tree
• C4.5 builds a tree from a set of training data using the concept of
information entropy.
• At each node of the tree, the attribute of the data with the highest
normalized information gain (difference in entropy) is chosen to split the
data.
• The C4.5 algorithm then recurses on the smaller sub lists.
®
Copyright © 2020 KNIME AG 9
129
New Node: Decision Tree Learner
®
Copyright © 2020 KNIME AG 10
130
Decision Tree View
®
Copyright © 2020 KNIME AG 11
131
New Node: Decision Tree Predictor
®
Copyright © 2020 KNIME AG 12
132
New Node: Scorer
®
Copyright © 2020 KNIME AG 13
133
New Node: Scorer
Confusion matrix shows the distribution of
model errors
®
Copyright © 2020 KNIME AG 14
134
Confusion Matrix
®
Copyright © 2020 KNIME AG 15
135
Receiver Operating Characteristics
®
Copyright © 2020 KNIME AG 16
136
New Node: ROC Curve
®
Copyright © 2020 KNIME AG 17
137
Data Mining Exercise, Activity I
Start with exercise: Data Mining, Activity I:
• Partition the fully joined data into a training and test set (50%, Stratified
Sampling on Target)
• Train a decision tree on the training set to predict Target
• Use the trained model to predict Target in the test set
• What is the overall accuracy of your model?
• Optional: Evaluate the accuracy and robustness of the model with the
ROC Curve node
®
Copyright © 2020 KNIME AG 18
138
Regression
• Methods
- Linear
- Polynomial
- Regression Trees
- Partial Least Squares
®
Copyright © 2020 KNIME AG 19
139
New Nodes: Linear Regression Learner & Regression Predictor
®
Copyright © 2020 KNIME AG 20
140
New Node: Numeric Scorer
®
Copyright © 2020 KNIME AG 21
141
Data Mining Exercise, Activity II
®
Copyright © 2020 KNIME AG 22
142
Today’s Example
®
Copyright © 2020 KNIME AG 23
143
Clustering
• Applications
- Market Segmentation
- Diversity picking
• Methods
- K-means/medoids
- Hierarchical
- DBScan
- OPTICS
- Neighbourgrams
®
Copyright © 2020 KNIME AG 24
144
New Nodes: k-Means Clustering
®
Copyright © 2020 KNIME AG 25
145
Data Mining Exercise, Activity III
®
Copyright © 2020 KNIME AG 26
146
Scripting Integrations: R and Python
®
Copyright © 2020 KNIME AG 27
147
Java Snippet
®
Copyright © 2020 KNIME AG 28
148
Exporting Data & Deployment
®
Copyright © 2020 KNIME AG 1
149
Exporting Data
®
Copyright © 2020 KNIME AG 2
150
Input/Output in Deployment
Input Output
• File (CSV, Table, XLS, …) • Report (BIRT, Tableau,
• Database Spotfire, PowerBI)
• JSON for REST API • Email
• File (CSV, Table, XLS, …)
• WebPortal
®
Copyright © 2020 KNIME AG 3
151
To Report / Email
To BIRT Report
Also available:
Nodes for Tableau
and Spotfire
®
Copyright © 2020 KNIME AG 4
152
To File / Database
®
Copyright © 2020 KNIME AG 5
153
REST API (Available on KNIME Server)
®
Copyright © 2020 KNIME AG 6
154
To Dashboard on WebPortal
Step 3
Step 1 Step 2 Step 4 Step 5
Customize
Upload File Select Columns Interactive View Download Image
Column Domains
®
Copyright © 2020 KNIME AG 7
155
Workflow on KNIME WebPortal
WebPortal Page
(Step 1)
Upload File
®
Copyright © 2020 KNIME AG 8
156
Components to Produce Dashboard on Web Page
File
Selection Column
Selection
Stacked
Area Chart
Filter by
Row Filter
Range
®
Copyright © 2020 KNIME AG 9
157
Data Export Nodes
®
Copyright © 2020 KNIME AG 10
158
New Node: Table Writer
®
Copyright © 2020 KNIME AG 11
159
New Node: XLS Writer
®
Copyright © 2020 KNIME AG 12
160
New Node: Database Writer
®
Copyright © 2020 KNIME AG 13
161
Automation: Call Local Workflow
®
Copyright © 2020 KNIME AG 14
162
Automation: Call Local Workflow
Provide email
credentials, host, etc.
Convert binary
column to file and
save to temp dir
®
Copyright © 2020 KNIME AG 16
164
KNIME Server as a REST Resource
https://www.knime.org/blog/giving-the-knime-server-a-rest
®
Copyright © 2020 KNIME AG 17
165
KNIME Server as a REST resource
®
Copyright © 2020 KNIME AG 18
166
Remote File Handling – Cloud Storage
®
Copyright © 2020 KNIME AG 19
167
Remote File Handling – Cloud Storage
Enter
credentials
Upload!
Create
Create new
directory in
bucket
bucket
Create URIs
of local file
paths
®
Copyright © 2020 KNIME AG 20
168
Reporting in KNIME
®
Copyright © 2020 KNIME AG 21
169
Reporting in KNIME
®
Copyright © 2020 KNIME AG 22
170
Installation
®
Copyright © 2020 KNIME AG 23
171
New Node: Data to Report
®
Copyright © 2020 KNIME AG 24
172
New Node: Image to Report
®
Copyright © 2020 KNIME AG 25
173
Edit the Report
Open the workflow and click the Report Editor button in the tool bar
®
Copyright © 2020 KNIME AG 26
174
Reporting Perspective
Click button to
create report
®
Copyright © 2020 KNIME AG 27
175
Charting in BIRT
®
Copyright © 2020 KNIME AG 28
176
Tips & Tricks
®
Copyright © 2020 KNIME AG 29
177
Exporting Data Exercise
®
Copyright © 2020 KNIME AG 30
178
Today’s Example
®
Copyright © 2020 KNIME AG 31
179
Thank You!
education@knime.com
The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by
Copyright © 2020 KNIME AG KNIME AG under license from KNIME GmbH, and are registered in the United States.
KNIME® is also registered in Germany.