COMPX310-19A
Machine Learning
Chapter 3: Classification
An introduction using Python, Scikit-Learn, Keras, and Tensorflow
Unless otherwise indicated, all images are from Hands-on Machine Learning with
Scikit-Learn, Keras, and TensorFlow by Aurélien Géron, Copyright © 2019 O’Reilly Media
Housekeeping
Outline
03/08/2021 COMPX310 2
MNIST: the “hello world” of ML
Scikit-learn provides some benchmark datasets,
In this case: [Link]
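The full MNIST dataset is fetched with `fetch_openml('mnist_784')`; as a sketch that runs offline, the example below uses scikit-learn's small bundled `digits` dataset instead (1,797 8x8 images, a stand-in for the real 70,000 28x28 MNIST images):

```python
from sklearn.datasets import load_digits

# The real MNIST would be fetched over the network:
#   from sklearn.datasets import fetch_openml
#   mnist = fetch_openml('mnist_784', version=1)
# Here we use the small bundled digits set so this runs offline.
digits = load_digits()
X, y = digits.data, digits.target
print(X.shape)  # (1797, 64): one row per image, one column per pixel
print(y.shape)  # (1797,): one label (0-9) per image
```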
Handwritten digits
Preparing Y
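When MNIST is fetched from OpenML, the labels arrive as strings; the usual preparation step casts them to integers. A minimal sketch (the small string array below is a stand-in for `mnist.target`):

```python
import numpy as np

# fetch_openml returns labels as strings such as '5'; cast them to integers
y_raw = np.array(['5', '0', '4'])  # stand-in for mnist.target
y = y_raw.astype(np.uint8)
print(y)  # [5 0 4]
```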
Train/test, binary class
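A sketch of the split and the binary target, using the bundled `digits` data in place of MNIST (MNIST comes pre-shuffled with the first 60,000 rows as the train set; here we simply take the first 1,500 rows):

```python
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

# Binary task: "is this digit a 5?"
y_train_5 = (y_train == 5)
y_test_5 = (y_test == 5)
```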
Yet another learner: SGD
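`SGDClassifier` trains a linear model with stochastic gradient descent, processing instances one at a time, which makes it suitable for large datasets. A minimal sketch on the digits stand-in:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)  # binary target: is it a 5?

sgd_clf = SGDClassifier(random_state=42)  # linear model + SGD training
sgd_clf.fit(X, y_5)
pred = sgd_clf.predict(X[:3])  # one boolean prediction per instance
```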
Cross-validation
Cross-validation
Cross-validation is an alternative to a single train + validation split
The training set is split into k equal-sized folds (a common default: k = 10)
Use k-1 folds together as the new training set, validate on the
remaining fold
Repeat this k times, each time holding out a different fold => k results
Compute the mean + standard deviation of the k results
[can also repeat all of this multiple times with different random seeds
to reduce the variance of the estimate]
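The procedure above can be sketched by hand with `StratifiedKFold` (stratified so each fold keeps the class proportions); this is essentially what the built-in helpers automate:

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import StratifiedKFold

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
scores = []
for train_idx, val_idx in skf.split(X, y_5):
    clf = clone(SGDClassifier(random_state=42))        # fresh copy per fold
    clf.fit(X[train_idx], y_5[train_idx])              # train on k-1 folds
    scores.append(clf.score(X[val_idx], y_5[val_idx])) # validate on the held-out fold
print(np.mean(scores), np.std(scores))
```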
Cross-validation
A workhorse of ML, hence direct support in scikit-learn:
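A sketch of the direct support, `cross_val_score`, again on the digits stand-in:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

sgd_clf = SGDClassifier(random_state=42)
scores = cross_val_score(sgd_clf, X, y_5, cv=3, scoring="accuracy")
print(scores, scores.mean(), scores.std())  # one accuracy per fold
```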
Are we really that good?
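Accuracy looks impressive here mainly because only about 10% of the digits are 5s: a classifier that never predicts 5 already scores around 90%. A sketch with scikit-learn's `DummyClassifier` as the baseline:

```python
from sklearn.datasets import load_digits
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

never_5 = DummyClassifier(strategy="most_frequent")  # always predicts "not 5"
scores = cross_val_score(never_5, X, y_5, cv=3, scoring="accuracy")
print(scores.mean())  # ~0.90: high accuracy, but the classifier is useless
```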
Getting predictions from CV
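`cross_val_predict` runs the same k-fold loop but returns the out-of-fold prediction for every training instance, so each prediction comes from a model that never saw that instance:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

sgd_clf = SGDClassifier(random_state=42)
y_pred = cross_val_predict(sgd_clf, X, y_5, cv=3)  # one "clean" prediction per row
```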
Compare to perfection
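As a sanity check, feeding the true labels in as "predictions" yields a purely diagonal confusion matrix, i.e. what perfection would look like:

```python
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

perfect = confusion_matrix(y_5, y_5)  # pretend we predicted perfectly
print(perfect)  # all counts on the diagonal, zeros elsewhere
```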
Precision and Recall
Precision: how many of the predicted 5s are really 5s
Recall: how many of the real 5s do we actually find
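A tiny hand-made example; the arrays are chosen to reproduce the confusion matrix [[5, 1], [2, 3]] used on the next slide (precision = 3/4, recall = 3/5):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([0] * 6 + [1] * 5)                  # six negatives, five positives
y_pred = np.array([0] * 5 + [1] + [0] * 2 + [1] * 3)  # 1 FP, 2 FN, 3 TP

print(precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))     # 3 / (3 + 2) = 0.6
```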
TN, TP, FN, FP and the confusion matrix
[[5, 1],   TN=5, FP=1, FN=2, TP=3
 [2, 3]]
Rows: row 0 holds the instances whose true class is 0, …
Columns: column 0 holds the instances predicted as class 0, …
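The same matrix, computed with scikit-learn; for the binary case `ravel()` flattens it in TN, FP, FN, TP order:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0] * 6 + [1] * 5)
y_pred = np.array([0] * 5 + [1] + [0] * 2 + [1] * 3)

cm = confusion_matrix(y_true, y_pred)  # rows = true class, columns = predicted class
tn, fp, fn, tp = cm.ravel()
print(cm)              # [[5 1]
                       #  [2 3]]
print(tn, fp, fn, tp)  # 5 1 2 3
```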
F1: harmonic mean of recall & precision
F1 = 2 / (1/precision + 1/recall) = 2 · precision · recall / (precision + recall)
For the matrix [[5, 1], [2, 3]] from the previous slide (TP=3, FP=1, FN=2):
precision = 3/4, recall = 3/5, so F1 = 2 / (4/3 + 5/3) = 2/3
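The same toy counts with `f1_score`; the harmonic mean punishes imbalance between precision and recall, unlike the arithmetic mean:

```python
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([0] * 6 + [1] * 5)   # same toy data as before: TP=3, FP=1, FN=2
y_pred = np.array([0] * 5 + [1] + [0] * 2 + [1] * 3)

f1 = f1_score(y_true, y_pred)
# harmonic mean of precision (3/4) and recall (3/5):
# 2 / (4/3 + 5/3) = 2/3
print(f1)
```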
Some results
Thresholds: precision/recall trade-off
Classifiers return numeric scores
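A sketch: `decision_function` exposes the raw score, and `predict` is just that score compared against a threshold (0 for `SGDClassifier`):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X, y_5)

score = sgd_clf.decision_function(X[:1])  # raw score, not a class
pred = sgd_clf.predict(X[:1])             # equivalent to score > 0
print(score, pred)
```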
Precision recall curves
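A sketch of computing the curve: get out-of-fold scores with `cross_val_predict(..., method="decision_function")`, then feed them to `precision_recall_curve`:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

sgd_clf = SGDClassifier(random_state=42)
y_scores = cross_val_predict(sgd_clf, X, y_5, cv=3, method="decision_function")
precisions, recalls, thresholds = precision_recall_curve(y_5, y_scores)
# one precision/recall pair per candidate threshold,
# plus a final point at precision=1, recall=0
```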
Precision recall curves
Recall @ precision == 0.9
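To hit a target precision, find the first index where precision reaches it (`np.argmax` returns the first `True`) and read off the recall there:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

y_scores = cross_val_predict(SGDClassifier(random_state=42), X, y_5,
                             cv=3, method="decision_function")
precisions, recalls, thresholds = precision_recall_curve(y_5, y_scores)

idx = np.argmax(precisions >= 0.90)  # first index with precision >= 0.9
recall_at_90 = recalls[idx]
print(recall_at_90)  # the recall we must accept to get 90% precision
```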
Precision-recall curve
Alternative: ROC curve
Alternative: ROC curve
Plot true positive rate (TPR)
over false positive rate (FPR)
for all possible thresholds.
Best @ (0,1).
Diagonal is a random classifier.
Area under the curve (AUC) is
1.0 for best possible, and
0.5 for random classifier.
AUC is very popular:
it does not require choosing a threshold,
and it is often used for imbalanced data
(though the PR curve can be more informative when the positive class is rare).
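A sketch of the ROC computation; `roc_curve` sweeps all thresholds, and `roc_auc_score` summarizes the curve in one number:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

y_scores = cross_val_predict(SGDClassifier(random_state=42), X, y_5,
                             cv=3, method="decision_function")
fpr, tpr, thresholds = roc_curve(y_5, y_scores)  # one point per threshold
auc = roc_auc_score(y_5, y_scores)
print(auc)  # 1.0 = perfect, 0.5 = random
```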
Compare to Random Forest
Compare to Random Forest
Compare to Random Forest
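Random forests have no `decision_function`; the usual trick is to use the predicted probability of the positive class as the score:

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)
y_5 = (y == 5)

forest_clf = RandomForestClassifier(random_state=42)
y_probas = cross_val_predict(forest_clf, X, y_5, cv=3, method="predict_proba")
y_scores_forest = y_probas[:, 1]  # probability of the positive class as the score
print(roc_auc_score(y_5, y_scores_forest))
```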
Multi-class classification
Multi-class classification
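Trained on all ten classes, scikit-learn applies one-vs-rest under the hood for `SGDClassifier`: `decision_function` returns one score per class, and `predict` picks the class with the highest score:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)

sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X, y)  # all ten classes; OvR is applied automatically

scores = sgd_clf.decision_function(X[:1])  # shape (1, 10): one score per class
pred = sgd_clf.predict(X[:1])
print(scores.shape, pred)
```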
One-vs-One for Multiclass
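Forcing one-vs-one instead: wrap the estimator in `OneVsOneClassifier`, which trains one binary classifier per pair of classes, N·(N-1)/2 = 45 for ten digits:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsOneClassifier

X, y = load_digits(return_X_y=True)

ovo_clf = OneVsOneClassifier(SGDClassifier(random_state=42))
ovo_clf.fit(X, y)
print(len(ovo_clf.estimators_))  # 45 pairwise classifiers for 10 classes
```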
Random forest for multi-class
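Random forests handle multiple classes natively, so no OvR/OvO wrapper is needed; `predict_proba` simply returns one probability per class:

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

X, y = load_digits(return_X_y=True)

forest_clf = RandomForestClassifier(random_state=42)
forest_clf.fit(X, y)
probas = forest_clf.predict_proba(X[:1])  # shape (1, 10), rows sum to 1
print(probas)
```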
Error analysis: confusion matrix from CV
Error analysis: confusion matrix from CV
Error analysis: confusion matrix from CV
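A sketch of the error-analysis step: get out-of-fold multiclass predictions, build the 10x10 confusion matrix, normalize each row by the class size, and zero the diagonal so only the errors stand out:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)

y_pred = cross_val_predict(SGDClassifier(random_state=42), X, y, cv=3)
cm = confusion_matrix(y, y_pred)              # 10x10 counts
norm_cm = cm / cm.sum(axis=1, keepdims=True)  # error *rates* per true class
np.fill_diagonal(norm_cm, 0)                  # hide the correct predictions
# plt.matshow(norm_cm, cmap='gray') would now highlight the confusions
```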
Multilabel: more than one binary target
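A sketch: stack two binary targets per digit ("large", i.e. >= 7, and "odd"); `KNeighborsClassifier` accepts such a multilabel target directly:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

y_large = (y >= 7)    # first label: is the digit large?
y_odd = (y % 2 == 1)  # second label: is it odd?
y_multilabel = np.c_[y_large, y_odd]  # shape (n_samples, 2)

knn_clf = KNeighborsClassifier()
knn_clf.fit(X, y_multilabel)
pred = knn_clf.predict(X[:1])  # two boolean labels for one digit
```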
Multilabel: cross-validation
“Macro”: compute F1 for each label separately, then
average over all labels
“Micro”: pool the TP, FP, and FN counts across all labels,
then compute a single global F1 from the pooled counts
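A tiny hand-checkable example of the two averages (label 1 has perfect F1, label 2 has F1 = 2/3, so macro = 5/6; pooling the counts gives micro = 6/7):

```python
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[1, 0],
                   [1, 1],
                   [0, 1]])
y_pred = np.array([[1, 0],
                   [1, 0],   # one missed positive on the second label
                   [0, 1]])

macro = f1_score(y_true, y_pred, average="macro")  # mean of per-label F1s
micro = f1_score(y_true, y_pred, average="micro")  # F1 of the pooled TP/FP/FN
print(macro, micro)  # 5/6 and 6/7
```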
MultiOutput: multiple multiclass targets
E.g.: reconstruct an image from a corrupted version
[Figure: noisy input image (X) beside the clean target image (y)]
Adding noise, train & predict
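A sketch of the whole pipeline on the digits stand-in: add random pixel noise, then train a KNN whose target is the clean image, i.e. one multiclass output per pixel (pixel values are cast to int so they are treated as class labels):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
X = X.astype(int)  # pixel values 0..16 as integer class labels

rng = np.random.RandomState(42)
X_noisy = X + rng.randint(0, 5, X.shape)  # corrupted input

knn_clf = KNeighborsClassifier()
knn_clf.fit(X_noisy, X)                     # target = the clean image: 64 outputs
clean_digit = knn_clf.predict(X_noisy[:1])  # a denoised 64-pixel image
```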