
Supervised learning

MACHINE LEARNING FOR EVERYONE

Hadrien Lacroix
Content Developer at DataCamp
Modeling



Types



What is supervised learning?



Classification and regression



Classification



Classification
Classification = assigning a category
Will this customer stop their subscription?
Yes, No

Is this mole cancerous?


Yes, No

What kind of wine is that?


Red, White, Rosé

What flower is that?


Rose, Tulip, Carnation, Lily
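The examples above all map an observation to one of a few labels. As a minimal illustration (not from the course), here is a hand-written rule that assigns a Yes/No category — the thresholds and feature names are invented for the sketch:

```python
# Toy classification: assign a "Yes"/"No" category to a mole based on
# two made-up features. Thresholds are illustrative, not medical advice.

def classify_mole(diameter_mm, irregularity):
    """Return 'Yes' (suspicious) or 'No' using simple hand-written rules."""
    if diameter_mm > 6 or irregularity > 0.7:
        return "Yes"
    return "No"

print(classify_mole(4.0, 0.2))  # -> No
print(classify_mole(8.5, 0.3))  # -> Yes
```

A machine learning classifier learns rules like these from labeled examples instead of having them written by hand.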



Observations



Features



Target



Graphing our data



Splitting data
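Splitting the data into a training set and a testing set is typically done with a utility like scikit-learn's train_test_split; a minimal sketch on made-up data, assuming scikit-learn is available:

```python
# Split 10 made-up observations into 80% training / 20% testing.
from sklearn.model_selection import train_test_split

X = [[i] for i in range(10)]        # 10 observations, 1 feature each
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # target labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

print(len(X_train), len(X_test))  # -> 8 2
```

The model only ever sees the training rows; the held-out testing rows are kept to check how well it generalizes.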



Manual classifier



Support vector machine - linear classifier



Support vector machine - polynomial classifier
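Switching a support vector machine from a linear to a polynomial boundary is a one-argument change in scikit-learn. A hedged sketch on tiny made-up data (assuming scikit-learn):

```python
# The same SVM with a linear kernel vs a polynomial kernel.
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [2, 2],   # class 0: points along the line y = x
     [0, 1], [1, 2], [2, 3]]   # class 1: points along y = x + 1
y = [0, 0, 0, 1, 1, 1]

linear_clf = SVC(kernel="linear").fit(X, y)
poly_clf = SVC(kernel="poly", degree=3).fit(X, y)

print(linear_clf.predict([[0, 2]]))  # above the separating line: class 1
print(linear_clf.predict([[2, 1]]))  # below the separating line: class 0
```

On this linearly separable toy data the linear kernel suffices; the polynomial kernel matters when no straight line can separate the classes.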



Regression



Regression
Regression = predicting a continuous value
How much will this stock be worth?

What is this exoplanet's mass?

How tall will this child be as an adult?
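A regression model outputs a number rather than a category. A minimal sketch of the temperature-from-humidity example that follows, fitting a line with closed-form least squares (the data points are made up and perfectly linear):

```python
# Fit temperature (continuous target) as a linear function of humidity.
humidity = [20, 40, 60, 80]      # feature (%)
temperature = [30, 25, 20, 15]   # target (°F)

n = len(humidity)
mean_x = sum(humidity) / n
mean_y = sum(temperature) / n

# Closed-form least-squares slope and intercept for one feature.
slope = sum((x - mean_x) * (y - mean_y)
            for x, y in zip(humidity, temperature)) \
        / sum((x - mean_x) ** 2 for x in humidity)
intercept = mean_y - slope * mean_x

print(intercept + slope * 50)  # predicted temperature at 50% humidity -> 22.5
```

Given a humidity reading, the fitted line returns a temperature anywhere on a continuous scale, not one of a few labels.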



Predicting temperature



Training data



Linear regression



Model



Given humidity...



...find temperature



Testing data



Classification vs regression
Regression = continuous
Any value within a finite (height) or infinite (time) interval
20°F, 20.1°F, 20.01°F...

Classi cation = category


One of a few specific values
Cold, Mild, Hot



Let's practice!

Unsupervised learning

Supervised learning



Unsupervised learning
Unsupervised learning = no target column
No guidance

Looks at the whole dataset

Tries to detect patterns



Applications



Clustering



Clustering example



Species cluster



Color cluster



Origin cluster



Clustering models
K-Means:
Specify the number of clusters

DBSCAN (density-based spatial clustering of applications with noise):


Specify what constitutes a cluster
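Both clustering models above are available in scikit-learn; a hedged sketch on two obvious blobs of made-up 2-D points:

```python
# K-Means vs DBSCAN on toy data with two well-separated blobs.
from sklearn.cluster import KMeans, DBSCAN

X = [[0, 0], [0, 1], [1, 0],        # one blob
     [10, 10], [10, 11], [11, 10]]  # another blob

# K-Means: we specify the number of clusters up front.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)

# DBSCAN: we specify what constitutes a cluster instead
# (max neighbor distance eps, minimum neighborhood size).
dbscan = DBSCAN(eps=2, min_samples=2).fit(X)
print(dbscan.labels_)
```

Both should put the first three points in one cluster and the last three in another; which cluster gets which numeric label is arbitrary.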



Iris table



K-Means with 4 clusters



K-Means with 3 clusters



Ground truth



Anomaly detection



Detecting outliers
Anomaly detection = detecting outliers

Outliers = observations that differ from the rest
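A very simple way to flag observations that differ from the rest (one of many; the two-standard-deviation threshold here is an arbitrary choice for the sketch):

```python
# Flag values more than two standard deviations from the mean.
from statistics import mean, stdev

values = [10, 11, 9, 10, 12, 10, 11, 48]  # 48 looks anomalous
mu, sigma = mean(values), stdev(values)

outliers = [v for v in values if abs(v - mu) > 2 * sigma]
print(outliers)  # -> [48]
```

Real anomaly-detection models use more robust criteria, but the idea is the same: score how far each observation sits from the bulk of the data.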



Outliers



Removing outliers



Some anomaly detection use cases
Discover devices that fail faster or last longer

Discover fraudsters that manage to trick the system

Discover patients that resist a fatal disease

...



Association



Association



Let's practice!

Evaluating performance

Evaluate step



Overfitting
Performs great on training data

Performs poorly on testing data

The model memorized the training data and can't generalize to new data

Use testing set to check model performance
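The gap between training and testing performance is exactly how overfitting shows up in practice. A hedged sketch, assuming scikit-learn and using its bundled iris dataset (an unrestricted decision tree will memorize the training split):

```python
# Compare training vs testing accuracy to diagnose overfitting.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))  # 1.0: memorized
print("test accuracy:", model.score(X_test, y_test))     # typically lower
```

A large train/test gap signals overfitting; similar scores on both sets suggest the model generalizes.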



Illustrating overfitting



Accuracy
Accuracy = correctly classified observations / all observations

48 / 50 = 96%
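The accuracy formula above is one line of code; the labels below are invented to reproduce the 48-out-of-50 example:

```python
# Accuracy = correctly classified observations / all observations.
def accuracy(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

y_true = [1] * 50               # made-up ground truth
y_pred = [1] * 48 + [0] * 2     # 48 of 50 predictions correct

print(accuracy(y_true, y_pred))  # -> 0.96
```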



Limits of accuracy: fraud example

Accuracy of this model:

Accuracy = 28 correctly classified / 30 total points = 93.33%
Misses majority of fraudulent transactions

Need a better metric



Confusion matrix



True positives



False negatives



Remembering False Negatives



Fill out the rest...



False positives, true negatives



Remembering False Positives

1 https://www.flickr.com/photos/59632563@N04/6104068209



Sensitivity

How many fraudulent transactions did we classify correctly?

Sensitivity = true positives / (true positives + false negatives) = 1/3 = 33.33%
Better to mark legitimate transactions as suspicious than to authorize fraudulent transactions



Specificity
Specificity = true negatives / (true negatives + false positives)
Spam filter:

Better to send spam to the inbox than real emails to the spam folder
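Both metrics are simple ratios of confusion-matrix counts. In the sketch below, the counts for sensitivity come from the fraud example above (1 fraudulent transaction caught, 2 missed); the specificity counts (27 true negatives, 0 false positives) are inferred from that same example's 28-of-30 correct total:

```python
# Sensitivity and specificity from confusion-matrix counts.
def sensitivity(tp, fn):
    """Share of actual positives that were classified correctly."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of actual negatives that were classified correctly."""
    return tn / (tn + fp)

print(round(sensitivity(tp=1, fn=2), 4))   # -> 0.3333
print(specificity(tn=27, fp=0))            # -> 1.0
```

High accuracy with low sensitivity, as here, is the signature of a model that ignores a rare positive class.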



Evaluating regression



Evaluating regression

Error = distance between point (actual value) and line (predicted value)

Many ways to calculate this, e.g. root mean square error
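Root mean square error squares each point-to-line distance, averages them, and takes the square root; a minimal sketch on made-up actual/predicted values:

```python
# Root mean square error between actual values and predictions.
from math import sqrt

actual = [20.0, 22.0, 19.0, 25.0]     # made-up ground truth
predicted = [21.0, 21.0, 18.0, 27.0]  # made-up model outputs

rmse = sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
print(round(rmse, 3))  # -> 1.323
```

Squaring the errors means a few large misses hurt the score more than many small ones.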



Unsupervised learning

1 https://www.flickr.com/photos/micahdowty/8540188997



Let's practice!

Improving performance

Machine learning workflow



Several options
Dimensionality reduction

Hyperparameter tuning

Ensemble methods



Dimensionality reduction
Reducing the number of features



Dimensionality reduction: example
Irrelevance: some features don't carry useful information



Dimensionality reduction: example
Correlation: some features carry similar information

Keep only one feature


e.g. height and shoe size --> height

Collapse multiple features into one underlying feature


e.g. height and weight --> Body Mass Index
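Collapsing correlated features into one underlying feature is commonly done with PCA; a hedged sketch assuming scikit-learn, on made-up height/weight rows:

```python
# Collapse two correlated features (height, weight) into one component.
from sklearn.decomposition import PCA

X = [[160, 55], [170, 65], [180, 75], [190, 85]]  # height cm, weight kg

pca = PCA(n_components=1)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)  # -> (4, 1): two features collapsed into one
```

The single component plays the same role as Body Mass Index in the example above: one number summarizing two correlated measurements.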



Hyperparameter tuning



Hyperparameter tuning: example
SVM algorithm hyperparameters:

kernel : "linear" --> "poly"

degree

gamma

shrinking

coef0

tol

...
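Trying hyperparameter combinations by hand is tedious; a grid search automates it. A hedged sketch assuming scikit-learn, searching over two of the SVM hyperparameters listed above on the bundled iris dataset:

```python
# Grid search over SVM hyperparameters with cross-validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

param_grid = {"kernel": ["linear", "poly"], "degree": [2, 3]}
search = GridSearchCV(SVC(), param_grid, cv=3).fit(X, y)

print(search.best_params_)  # the winning kernel/degree combination
print(search.best_score_)   # its mean cross-validated accuracy
```

Each combination is scored by cross-validation, so the chosen hyperparameters are the ones that generalize best rather than the ones that best fit a single split.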



Ensemble methods



Ensemble methods: classification
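For classification, one common ensemble lets several different models vote on each prediction. A hedged sketch assuming scikit-learn, combining two models on the bundled iris dataset (the estimator choices are illustrative):

```python
# A voting ensemble: each base model predicts, the majority wins.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

ensemble = VotingClassifier(estimators=[
    ("logreg", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(random_state=0)),
]).fit(X, y)

print(ensemble.score(X, y))  # accuracy of the combined vote
```

Combining models with different strengths often beats any single one of them.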



Ensemble methods: regression



Let's practice!