
AI & Maintenance

Lecture 3: Supervised Machine Learning


Recap: Last lecture

• CRISP-DM / CRISP-ML

• Visualisation

• Preprocessing
Machine Learning

• Learn patterns from data

• Use patterns to predict

• Classical problems
  • Predict a category → Classification
  • Predict a number → Regression


Sub-fields of AI

• Artificial Intelligence
  • Machine Learning: Supervised Learning, Unsupervised Learning, Reinforcement Learning
  • Planning
  • Expert Systems
  • Natural Language Processing: Sentiment Analysis, Topic Modeling, Chatbots
  • Knowledge Representation
  • Computer Vision
  • Robotics
  • Speech
Machine Learning

• Supervised Learning: Classification, Regression
• Unsupervised Learning: Clustering, Association, Dimensionality reduction (→ Lecture 5)
• Reinforcement Learning: Game AI, Robot navigation
Supervised Machine Learning

Supervised Learning

• Classification: object recognition, customer retention, credit scoring
• Regression: forecasting, predicting Remaining Useful Life

Supervised Machine Learning

Pipeline: Preprocessed Data → [Annotator] → Labeled Data → [Data Scientist: model training] → Output


Annotation

• Label already present in the data

• Human annotator (labor-intensive)

• Historic annotations (a by-product of the business process)


Classification

Pipeline: Preprocessed Data → [Annotator] → Labeled Data → [Data Scientist: model training] → Output


Toy dataset

• Binary target

• Two variables (x and y)

[Scatter plot: the two classes plotted against x and y]
Decision rules

• Often manually constructed

• IF _ THEN _ (ELSE _)

• Example: IF y > 2.5 THEN class ● ELSE class ●

[Scatter plot with the decision boundary y = 2.5]
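The example rule can be written directly as code. A minimal sketch; the class names `"blue"` and `"orange"` are placeholders, since the slide only distinguishes the two classes by color:

```python
def classify(x, y):
    """Decision rule from the slide: IF y > 2.5 THEN one class ELSE the other.

    The labels 'blue' and 'orange' are placeholder names for the two
    colored classes shown in the scatter plot.
    """
    return "blue" if y > 2.5 else "orange"

# Points above the line y = 2.5 get one class, points below the other.
print(classify(1.0, 4.0))  # blue
print(classify(3.0, 1.0))  # orange
```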
Decision Tree

• Built up split by split on the toy dataset:
  • Split 1: x < 3
  • Split 2: y > 3.5
  • Split 3: x < 1.5
  • Split 4: y < 2.5

• Each leaf is assigned the class of the points that fall into it

[Figure: the growing tree diagram next to the scatter plot, with each split drawn as an axis-parallel boundary]
Decision Tree

Hyperparameters

• Split method

• Max tree depth

• Max splits

• Min samples per leaf

• …
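As an illustration of where these hyperparameters show up in practice, here is a sketch using scikit-learn (an assumption; the slides name no library). The data points and labels are made up; each listed hyperparameter maps onto a constructor argument:

```python
# Illustrative sketch with scikit-learn (an assumption; the slides name
# no library). The listed hyperparameters map onto constructor arguments.
from sklearn.tree import DecisionTreeClassifier

X = [[1, 1], [1, 4], [2, 1], [2, 4], [4, 1], [4, 4]]  # made-up (x, y) points
labels = [0, 1, 0, 1, 1, 0]                            # made-up binary target

tree = DecisionTreeClassifier(
    criterion="gini",      # split method
    max_depth=3,           # max tree depth
    max_leaf_nodes=4,      # caps the number of splits
    min_samples_leaf=1,    # min samples per leaf
)
tree.fit(X, labels)
print(tree.predict([[1, 4]]))
```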
Decision Tree

Pros:
• Easy to explain
• Little preprocessing needed
• Non-parametric

Cons:
• Limited prediction power
• Prone to overfitting
• Sensitive to imbalanced data


Random Forest

• Train multiple decision trees
  • On different samples of the data

• Aggregate to get the result
  • Majority vote (classification)
  • Average (regression)
Random Forest (Majority vote)

[Figure: the predictions of several individual trees combined into one by majority vote]

Random Forest (Averaged)

[Figure: the predictions of several individual trees combined into one by averaging]
Random Forests

Hyperparameters

• All decision tree parameters

• Number of trees

• Sampling strategy

• Combination strategy
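The same hyperparameters can be seen in a scikit-learn sketch (an assumption; the slides name no library). Data is made up; the combination strategy (majority vote) is built into the classifier:

```python
# Illustrative scikit-learn sketch (an assumption; the slides name no
# library). The random-forest hyperparameters map onto arguments.
from sklearn.ensemble import RandomForestClassifier

X = [[1, 1], [1, 4], [2, 1], [2, 4], [4, 1], [4, 4]]  # made-up (x, y) points
labels = [0, 1, 0, 1, 1, 0]                            # made-up binary target

forest = RandomForestClassifier(
    n_estimators=10,   # number of trees
    max_depth=3,       # decision-tree parameter, applied to every tree
    bootstrap=True,    # sampling strategy: each tree sees a resampled dataset
    random_state=0,
)
forest.fit(X, labels)
print(forest.predict([[4, 1]]))  # majority vote over the 10 trees
```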
Random Forests

Pros:
• More predictive power
• Non-parametric

Cons:
• Less explainable
• Biased toward high-cardinality features
• Parameter complexity
Neural Networks

Binary classification

[Figure: network with an input layer, a hidden layer, and an output layer]
Neural Networks

Worked example: forward pass for one data point, (x, y) = (2, 4)

• Each hidden neuron computes a weighted sum of its inputs:
  • Hidden neuron 1: (2 × 2) + (4 × -1) = 0
  • Hidden neuron 2: (2 × ½) + (4 × 1) = 5

• The output neuron sums the weighted hidden activations: -5, which the activation function maps to 0

• Prediction 0 ≠ label 1, so the error is propagated back and the weights are adjusted (e.g. 1 → 1.2 and 1 → 0.8)

[Figure: input layer, hidden layer, output layer with annotated weights, evaluated step by step on the point (2, 4) from the scatter plot]
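The arithmetic of this build-up fits in a few lines. The input-to-hidden weights come from the slides; the hidden-to-output weights `[1, -1]` are an assumption chosen to reproduce the shown output sum of -5:

```python
# Forward pass for the worked example: input point (x, y) = (2, 4).
# Input-to-hidden weights are taken from the slides; the hidden-to-output
# weights [1, -1] are an assumption that reproduces the shown sum of -5.
def step(z):
    """Threshold activation: fire (1) only for positive input."""
    return 1 if z > 0 else 0

x, y = 2, 4
hidden = [
    2 * x + (-1) * y,   # (2×2) + (4×-1) = 0
    0.5 * x + 1 * y,    # (2×½) + (4×1) = 5
]
out_weights = [1, -1]   # assumption, not shown on the slides
output_sum = sum(w * h for w, h in zip(out_weights, hidden))  # -5
prediction = step(output_sum)                                 # 0

print(hidden, output_sum, prediction)
```

Since the prediction 0 disagrees with the label 1, training would nudge the weights, as the slides illustrate with the updates 1 → 1.2 and 1 → 0.8.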
Neural Networks

Multiclass classification

[Figure: network with an input layer, hidden layers, and a multi-neuron output layer]
Neural Networks

Hyperparameters

• Full architecture
• Hidden layers, neurons per layer

• Activation functions

• Learning rate
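These hyperparameters also have direct counterparts in a library implementation. A sketch using scikit-learn's MLPClassifier (an assumption; the slides name no library, and the data is made up):

```python
# Illustrative sketch with scikit-learn's MLPClassifier (an assumption;
# the slides name no library). Each listed hyperparameter has a counterpart.
from sklearn.neural_network import MLPClassifier

X = [[1, 1], [1, 4], [2, 1], [2, 4]]  # made-up (x, y) points
labels = [0, 1, 0, 1]                  # made-up binary target

net = MLPClassifier(
    hidden_layer_sizes=(8, 4),  # architecture: two hidden layers
    activation="relu",          # activation function
    learning_rate_init=0.01,    # learning rate
    max_iter=2000,
    random_state=0,
)
net.fit(X, labels)
print(net.predict([[1, 4]]))
```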
Neural Networks

Pros:
• Very powerful
• Transfer learning

Cons:
• Explainability
• Training effort


Regression

Pipeline: Preprocessed Data → [Annotator] → Labeled Data → [Data Scientist: model training] → Output


Linear Regression

• Straight line

• Finding trends
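Fitting a straight line needs only the closed-form least-squares slope and intercept. A minimal pure-Python sketch on made-up data with a perfect trend:

```python
# Ordinary least squares for a straight line y = a*x + b (made-up data).
xs = [0, 1, 2, 3, 4, 5]
ys = [10, 11.5, 13, 14.5, 16, 17.5]  # perfect trend: slope 1.5, intercept 10

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Slope = covariance of x and y divided by variance of x.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(slope, intercept)  # 1.5 10.0
```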
Linear Regression

[Figure: scatter of actual values with the predicted straight line through them]
Linear Regression

Pros:
• Very simple

Cons:
• Often too simple


ARIMA

• AutoRegressive Integrated Moving Average

• AutoRegressive: predict from the target's own past values

• Moving Average: model the target as a combination of past forecast errors

• Integrated → differencing of terms to make the series stationary

• ARIMA(p, d, q): p = AR order, d = degree of differencing, q = MA order
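To make the AR part concrete: an AR(1) model predicts the next value from the previous one. A minimal pure-Python sketch (the coefficient estimate is the usual least-squares fit; the series is made up to follow x[t] = 0.5 · x[t-1] exactly):

```python
# AR(1): predict the series from its own previous value, x[t] ≈ phi * x[t-1].
# Toy data following x[t] = 0.5 * x[t-1] exactly, so the fit recovers phi = 0.5.
series = [16.0, 8.0, 4.0, 2.0, 1.0, 0.5]

prev = series[:-1]
curr = series[1:]
# Least-squares estimate of the AR(1) coefficient.
phi = sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)

forecast = phi * series[-1]  # one-step-ahead forecast
print(phi, forecast)  # 0.5 0.25
```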
ARIMA

Pros:
• Only needs the target
• Explainability

Cons:
• Does not include features
Adapting classification methods

• Regression trees

• Neural nets for regression


Regression trees

• Same tree structure as for classification, but the leaves hold numeric values instead of classes
  • e.g. leaf values 2, 1.5, 0.75, 0.5 in place of the class labels

[Figure: classification tree and regression tree side by side; the predicted value forms a step function of x]
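A regression tree is just nested IF/ELSE rules returning numbers. One plausible reading of the tree in the figure as code; the branch layout is an assumption reconstructed from the slide, while the leaf values 2, 1.5, 0.75, 0.5 are the ones shown:

```python
def predict(x, y):
    """Regression tree sketch: same splits as the classification tree,
    but each leaf returns a number instead of a class.

    The branch layout is an assumption reconstructed from the figure;
    the leaf values 2, 1.5, 0.75, 0.5 are the ones shown on the slide.
    """
    if x < 3:
        if y > 3.5:
            return 2.0
        return 0.75 if x < 1.5 else 0.5
    return 1.5

print(predict(1.0, 4.0))  # 2.0
print(predict(4.0, 1.0))  # 1.5
```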
Neural Networks

• For regression, the output is a summation of the last hidden layer (a linear output unit instead of a classification output)

[Figure: network with an input layer, hidden layers, and a single linear output]
Recap

• Supervised Machine Learning

• Classification / Regression

• Random Forest

• Neural Networks
Next week

• Evaluation of supervised models

• Overfitting / Underfitting
