
AI & Maintenance

Lecture 3: Supervised Machine Learning


Recap: Last lecture

• CRISP-DM / CRISP-ML

• Visualisation

• Preprocessing
Machine Learning

• Learn patterns from data

• Use patterns to predict

• Classical problems
  • Predict a category → Classification
  • Predict a number → Regression


Sub-fields of AI

• Artificial Intelligence
  • Machine Learning: Supervised Learning, Unsupervised Learning, Reinforcement Learning
  • Planning
  • Expert Systems
  • Natural Language Processing: Sentiment Analysis, Topic Modeling, Chatbots
  • Knowledge Representation
  • Computer Vision
  • Robotics
  • Speech
Machine Learning

• Supervised Learning: Classification, Regression
• Unsupervised Learning: Clustering, Association, Dimensionality reduction (→ Lecture 5)
• Reinforcement Learning: Game AI, Robot navigation
Supervised Machine Learning

Supervised Learning

• Classification: object recognition, customer retention, credit scoring
• Regression: forecasting, predicting Remaining Useful Life

Supervised Machine Learning

Pipeline: Preprocessed Data → [Annotator] → Labeled Data → [Data Scientist: model training] → Output


Annotation

• Label already present in the data

• Human annotator (labor-intensive)

• Historic annotations (a by-product of the business process)


Classification

Pipeline: Preprocessed Data → [Annotator] → Labeled Data → [Data Scientist: model training] → Output


Toy dataset

• Binary target

• Two variables (x and y)

[Scatter plot: the two classes plotted against x and y]
Decision rules

• Often manually constructed

• IF _ THEN _ (ELSE _)

• Example: IF y > 2.5 THEN class ● ELSE class ●

[Scatter plot with the decision boundary y = 2.5]
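The example rule can be written directly as code. A minimal sketch; the class names `"blue"` and `"orange"` are placeholders, since the slide only distinguishes the two classes by color:

```python
def classify(x, y):
    """Decision rule from the slide: IF y > 2.5 THEN one class ELSE the other.

    The labels 'blue' and 'orange' are placeholder names for the two
    colored classes shown in the scatter plot.
    """
    return "blue" if y > 2.5 else "orange"

# Points above the line y = 2.5 get one class, points below the other.
print(classify(1.0, 4.0))  # blue
print(classify(3.0, 1.0))  # orange
```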
Decision Tree

• Built up split by split on the toy dataset:
  • Split 1: x < 3
  • Split 2: y > 3.5
  • Split 3: x < 1.5
  • Split 4: y < 2.5

• Each leaf is assigned the class of the points that fall into it

[Figure: the growing tree diagram next to the scatter plot, with each split drawn as an axis-parallel boundary]
Decision Tree

Hyperparameters

• Split method

• Max tree depth

• Max splits

• Min samples per leaf

• …
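As an illustration of where these hyperparameters show up in practice, here is a sketch using scikit-learn (an assumption; the slides name no library). The data points and labels are made up; each listed hyperparameter maps onto a constructor argument:

```python
# Illustrative sketch with scikit-learn (an assumption; the slides name
# no library). The listed hyperparameters map onto constructor arguments.
from sklearn.tree import DecisionTreeClassifier

X = [[1, 1], [1, 4], [2, 1], [2, 4], [4, 1], [4, 4]]  # made-up (x, y) points
labels = [0, 1, 0, 1, 1, 0]                            # made-up binary target

tree = DecisionTreeClassifier(
    criterion="gini",      # split method
    max_depth=3,           # max tree depth
    max_leaf_nodes=4,      # caps the number of splits
    min_samples_leaf=1,    # min samples per leaf
)
tree.fit(X, labels)
print(tree.predict([[1, 4]]))
```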
Decision Tree

Pros:
• Easy to explain
• Little preprocessing needed
• Non-parametric

Cons:
• Limited prediction power
• Prone to overfitting
• Sensitive to imbalanced data


Random Forest

• Train multiple decision trees
  • On different samples of the data

• Aggregate to get the result
  • Majority vote (classification)
  • Average (regression)
Random Forest (Majority vote)

[Figure: the predictions of several individual trees combined into one by majority vote]

Random Forest (Averaged)

[Figure: the predictions of several individual trees combined into one by averaging]
Random Forests

Hyperparameters

• All decision tree parameters

• Number of trees

• Sampling strategy

• Combination strategy
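The same hyperparameters can be seen in a scikit-learn sketch (an assumption; the slides name no library). Data is made up; the combination strategy (majority vote) is built into the classifier:

```python
# Illustrative scikit-learn sketch (an assumption; the slides name no
# library). The random-forest hyperparameters map onto arguments.
from sklearn.ensemble import RandomForestClassifier

X = [[1, 1], [1, 4], [2, 1], [2, 4], [4, 1], [4, 4]]  # made-up (x, y) points
labels = [0, 1, 0, 1, 1, 0]                            # made-up binary target

forest = RandomForestClassifier(
    n_estimators=10,   # number of trees
    max_depth=3,       # decision-tree parameter, applied to every tree
    bootstrap=True,    # sampling strategy: each tree sees a resampled dataset
    random_state=0,
)
forest.fit(X, labels)
print(forest.predict([[4, 1]]))  # majority vote over the 10 trees
```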
Random Forests

Pros:
• More predictive power
• Non-parametric

Cons:
• Less explainable
• Biased toward high-cardinality features
• Parameter complexity
Neural Networks

Binary classification

[Figure: network with an input layer, a hidden layer, and an output layer]
Neural Networks

Worked example: forward pass for one data point, (x, y) = (2, 4)

• Each hidden neuron computes a weighted sum of its inputs:
  • Hidden neuron 1: (2 × 2) + (4 × -1) = 0
  • Hidden neuron 2: (2 × ½) + (4 × 1) = 5

• The output neuron sums the weighted hidden activations: -5, which the activation function maps to 0

• Prediction 0 ≠ label 1, so the error is propagated back and the weights are adjusted (e.g. 1 → 1.2 and 1 → 0.8)

[Figure: input layer, hidden layer, output layer with annotated weights, evaluated step by step on the point (2, 4) from the scatter plot]
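The arithmetic of this build-up fits in a few lines. The input-to-hidden weights come from the slides; the hidden-to-output weights `[1, -1]` are an assumption chosen to reproduce the shown output sum of -5:

```python
# Forward pass for the worked example: input point (x, y) = (2, 4).
# Input-to-hidden weights are taken from the slides; the hidden-to-output
# weights [1, -1] are an assumption that reproduces the shown sum of -5.
def step(z):
    """Threshold activation: fire (1) only for positive input."""
    return 1 if z > 0 else 0

x, y = 2, 4
hidden = [
    2 * x + (-1) * y,   # (2×2) + (4×-1) = 0
    0.5 * x + 1 * y,    # (2×½) + (4×1) = 5
]
out_weights = [1, -1]   # assumption, not shown on the slides
output_sum = sum(w * h for w, h in zip(out_weights, hidden))  # -5
prediction = step(output_sum)                                 # 0

print(hidden, output_sum, prediction)
```

Since the prediction 0 disagrees with the label 1, training would nudge the weights, as the slides illustrate with the updates 1 → 1.2 and 1 → 0.8.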
Neural Networks

Multiclass classification

[Figure: network with an input layer, hidden layers, and a multi-neuron output layer]
Neural Networks

Hyperparameters

• Full architecture
• Hidden layers, neurons per layer

• Activation functions

• Learning rate
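These hyperparameters also have direct counterparts in a library implementation. A sketch using scikit-learn's MLPClassifier (an assumption; the slides name no library, and the data is made up):

```python
# Illustrative sketch with scikit-learn's MLPClassifier (an assumption;
# the slides name no library). Each listed hyperparameter has a counterpart.
from sklearn.neural_network import MLPClassifier

X = [[1, 1], [1, 4], [2, 1], [2, 4]]  # made-up (x, y) points
labels = [0, 1, 0, 1]                  # made-up binary target

net = MLPClassifier(
    hidden_layer_sizes=(8, 4),  # architecture: two hidden layers
    activation="relu",          # activation function
    learning_rate_init=0.01,    # learning rate
    max_iter=2000,
    random_state=0,
)
net.fit(X, labels)
print(net.predict([[1, 4]]))
```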
Neural Networks

Pros:
• Very powerful
• Transfer learning

Cons:
• Explainability
• Training effort


Regression

Pipeline: Preprocessed Data → [Annotator] → Labeled Data → [Data Scientist: model training] → Output


Linear Regression

• Straight line

• Finding trends
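Fitting a straight line needs only the closed-form least-squares slope and intercept. A minimal pure-Python sketch on made-up data with a perfect trend:

```python
# Ordinary least squares for a straight line y = a*x + b (made-up data).
xs = [0, 1, 2, 3, 4, 5]
ys = [10, 11.5, 13, 14.5, 16, 17.5]  # perfect trend: slope 1.5, intercept 10

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Slope = covariance of x and y divided by variance of x.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(slope, intercept)  # 1.5 10.0
```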
Linear Regression

[Figure: scatter of actual values with the predicted straight line through them]
Linear Regression

Pros:
• Very simple

Cons:
• Often too simple


ARIMA

• AutoRegressive Integrated Moving Average

• AutoRegressive: predict from the target's own past values

• Moving Average: model the target as a combination of past forecast errors

• Integrated → differencing of terms to make the series stationary

• ARIMA(p, d, q): p = AR order, d = degree of differencing, q = MA order
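To make the AR part concrete: an AR(1) model predicts the next value from the previous one. A minimal pure-Python sketch (the coefficient estimate is the usual least-squares fit; the series is made up to follow x[t] = 0.5 · x[t-1] exactly):

```python
# AR(1): predict the series from its own previous value, x[t] ≈ phi * x[t-1].
# Toy data following x[t] = 0.5 * x[t-1] exactly, so the fit recovers phi = 0.5.
series = [16.0, 8.0, 4.0, 2.0, 1.0, 0.5]

prev = series[:-1]
curr = series[1:]
# Least-squares estimate of the AR(1) coefficient.
phi = sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)

forecast = phi * series[-1]  # one-step-ahead forecast
print(phi, forecast)  # 0.5 0.25
```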
ARIMA

Pros:
• Only needs the target
• Explainability

Cons:
• Does not include features
Adapting classification methods

• Regression trees

• Neural nets for regression


Regression trees

• Same tree structure as for classification, but the leaves hold numeric values instead of classes
  • e.g. leaf values 2, 1.5, 0.75, 0.5 in place of the class labels

[Figure: classification tree and regression tree side by side; the predicted value forms a step function of x]
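A regression tree is just nested IF/ELSE rules returning numbers. One plausible reading of the tree in the figure as code; the branch layout is an assumption reconstructed from the slide, while the leaf values 2, 1.5, 0.75, 0.5 are the ones shown:

```python
def predict(x, y):
    """Regression tree sketch: same splits as the classification tree,
    but each leaf returns a number instead of a class.

    The branch layout is an assumption reconstructed from the figure;
    the leaf values 2, 1.5, 0.75, 0.5 are the ones shown on the slide.
    """
    if x < 3:
        if y > 3.5:
            return 2.0
        return 0.75 if x < 1.5 else 0.5
    return 1.5

print(predict(1.0, 4.0))  # 2.0
print(predict(4.0, 1.0))  # 1.5
```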
Neural Networks

• For regression, the output is a summation of the last hidden layer (a linear output unit instead of a classification output)

[Figure: network with an input layer, hidden layers, and a single linear output]
Recap

• Supervised Machine Learning

• Classification / Regression

• Random Forest

• Neural Networks
Next week

• Evaluation of supervised models

• Overfitting / Underfitting
