
LINFO2262: Machine Learning

Classification and Evaluation

Pierre Dupont

ICTEAM Institute
Université catholique de Louvain – Belgium

P. Dupont (UCL Machine Learning Group) LINFO2262 1.


What is Machine Learning?

Outline

1 What is Machine Learning?


Introduction
Examples
What is a well-defined learning problem?
A mathematical viewpoint
Many disciplines, many more methods

2 Some Machine Learning Challenges

3 Course organization



What is Machine Learning? Introduction

What is Machine Learning?

Machine Learning
The science of getting computers to act without being explicitly
programmed
Construction of computer programs that automatically improve
with experience: spam filtering software, autonomous driving,
speech recognition, chess programs, etc.
Induction of a general theory (i.e. a model) from observed
examples (training set) in order to apply the model to previously
unseen examples (test set)

Note: Unsupervised learning and, more generally, data mining aim at
extracting knowledge from observed data



What is Machine Learning? Examples

Autonomous Driving

DARPA Grand Challenge 2005: build a robot capable of navigating 175 miles through desert terrain in
less than 10 hours, with no human intervention

The actual winning time of Stanley [Thrun et al. 05] was 6 hours 54
minutes.


Credit Risk Analysis


Data:
Customer103 (time=t0)         Customer103 (time=t1)         ...  Customer103 (time=tn)
Years of credit: 9            Years of credit: 9                 Years of credit: 9
Loan balance: $2,400          Loan balance: $3,250               Loan balance: $4,500
Income: $52k                  Income: ?                          Income: ?
Own House: Yes                Own House: Yes                     Own House: Yes
Other delinquent accts: 2     Other delinquent accts: 2          Other delinquent accts: 3
Max billing cycles late: 3    Max billing cycles late: 4         Max billing cycles late: 6
Profitable customer?: ?       Profitable customer?: ?            Profitable customer?: No
...                           ...                                ...
Logical rules learned from training data:
If Other-Delinquent-Accounts > 2, and
Number-Delinquent-Billing-Cycles > 1
Then Profitable-Customer? = No
[Deny Credit Card application]
If Other-Delinquent-Accounts = 0, and
(Income > $30k) OR (Years-of-Credit > 3)
Then Profitable-Customer? = Yes
[Accept Credit Card application]
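The two rules above can be written as a small Python function. This is only a sketch: the function and argument names are hypothetical, "Number-Delinquent-Billing-Cycles" is assumed to correspond to the "Max billing cycles late" field, and an unknown income (None) is assumed not to satisfy the income test.

```python
from typing import Optional


def profitable_customer(other_delinquent_accounts: int,
                        delinquent_billing_cycles: int,
                        income: Optional[float],
                        years_of_credit: int) -> Optional[bool]:
    """Apply the two learned rules; return None when neither rule fires."""
    # Rule 1: too many delinquent accounts and late billing cycles -> deny.
    if other_delinquent_accounts > 2 and delinquent_billing_cycles > 1:
        return False
    # Rule 2: no delinquent account, and enough income or credit history -> accept.
    # Assumption: a missing income (None) does not satisfy the income test.
    income_ok = income is not None and income > 30_000
    if other_delinquent_accounts == 0 and (income_ok or years_of_credit > 3):
        return True
    # Neither rule applies: the rule set makes no prediction.
    return None


# Customer103 at time tn: 3 delinquent accounts, 6 late cycles, income unknown.
decision_tn = profitable_customer(3, 6, None, 9)
```

For Customer103 at time tn the first rule fires, which agrees with the "Profitable customer?: No" entry in the data.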


Medical prognosis

Data:
Patient103 (time=1)           Patient103 (time=2)           ...  Patient103 (time=n)
Age: 23                       Age: 23                            Age: 23
FirstPregnancy: no            FirstPregnancy: no                 FirstPregnancy: no
Anemia: no                    Anemia: no                         Anemia: no
Diabetes: no                  Diabetes: YES                      Diabetes: no
PreviousPrematureBirth: no    PreviousPrematureBirth: no         PreviousPrematureBirth: no
Ultrasound: ?                 Ultrasound: abnormal               Ultrasound: ?
Elective C-Section: ?         Elective C-Section: no             Elective C-Section: no
Emergency C-Section: ?        Emergency C-Section: ?             Emergency C-Section: Yes
...                           ...                                ...

One of 18 learned logical rules:

If No previous normal delivery, and
Abnormal 2nd Trimester Ultrasound, and
Malpresentation at admission
Then Probability of Emergency C-Section is 0.6


Software that Customizes to User


PageRank
Likely, the most frequently used algorithm in the world

The PageRank algorithm learns the ranking of web pages from the
(changing) hyperlink structure of the Internet


Many, many more applications...



What is Machine Learning? What is a well-defined learning problem?

What is a Learning Problem?

Learning = Improving with experience at some task


Improve over task T,
with respect to performance measure Perf,
based on experience E.


Example: Gender Recognition

T: Determine the gender of people
E: some features assumed to be representative of gender in a set of labeled images (= training data)
Perf: % of people correctly recognized in new and unlabeled images
Note: one needs to collect, pre-process and label the training data!
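The performance measure Perf can be made concrete with a few lines of Python. This is a minimal sketch: the function name and the toy label lists are hypothetical, and the true labels of the new images are assumed to be known for evaluation purposes only.

```python
from typing import Sequence


def percent_correct(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Percentage of examples whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Hypothetical predictions on four new images versus their true labels.
perf = percent_correct(["F", "M", "F", "M"], ["F", "M", "M", "M"])
```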


Choose the Target Function and Features

Target function
A direct mapping
f : Image → Gender
An integer coding for the person’s gender
f : Image → ℕ
A probability estimate of the person’s gender
f : Image × Gender → [0, 1]

Features to be extracted or computed from an Image


Pixel intensities (black or white, grey intensity or RGB coding)
Resolution?
Shade information?
Shape information?


Choose Representation for Function to be Learned

Target function representation


Collection of logical rules encoded in a decision tree?
Linear or Polynomial function of numerical features from the image?
Probability distribution?
A weighted sum of new implicit features computed from the input features?

...



What is Machine Learning? A mathematical viewpoint

Learning is the estimation of a mathematical function

Mathematical viewpoint
Supervised learning is the estimation of a function f : X → Y mapping
the space X of input data to some output space Y. The estimation is based
on a finite training set of input data for which the mapping is known:
$\{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$

Two standard cases:


Supervised classification: Y is discrete or even binary, e.g. Y = {−1, 1}
Regression: Y is continuous
In both cases, the input space X may be made of continuous or
discrete features, a mix of both, or even more complex object features


Supervised classification example


Gender recognition task

A vectorial input space X
- a grey intensity in [0, 255] for each pixel
- each image can be represented by a vector x of pixel intensities, e.g. $x = (\dots, 104, \dots, 75, \dots)^\top$
- 1024 × 768 = 786432 dimensions
A binary output space Y = {−1, 1} coding for Male versus Female
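As an illustration of this vectorial representation, the sketch below flattens a grey-level image into a single vector. The row-by-row ordering and the uniform test image are assumptions made for the example; any fixed pixel ordering would do.

```python
WIDTH, HEIGHT = 1024, 768  # image resolution used in the example above


def flatten(image):
    """Concatenate the rows of a HEIGHT x WIDTH grid of grey levels into one vector."""
    return [pixel for row in image for pixel in row]


# Hypothetical image: a uniform mid-grey picture (the value 128 is arbitrary).
image = [[128] * WIDTH for _ in range(HEIGHT)]
x = flatten(image)
dimension = len(x)
```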


A Representation for the Function to be Learned

$g(x) = \sum_j w_j \cdot x_j$

The x_j’s are the image features: grayscale intensity for each pixel
x is a vector of pixel intensities
w_j is the weight of the j-th feature, to be estimated from learning examples

Simple decision function

f(x) = sign(g(x))
If f(x) = +1 then Female else Male
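The weighted sum and the sign-based decision can be sketched in Python as follows. The weights and the three-pixel input are made-up values for illustration (in practice the weights are learned), and mapping g(x) = 0 to −1 is an assumed convention.

```python
from typing import Sequence


def g(w: Sequence[float], x: Sequence[float]) -> float:
    """Weighted sum of the features: sum over j of w_j * x_j."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return sum(wj * xj for wj, xj in zip(w, x))


def f(w: Sequence[float], x: Sequence[float]) -> int:
    """Decision function sign(g(x)); g(x) = 0 is mapped to -1 by assumption."""
    return 1 if g(w, x) > 0 else -1


def label(w: Sequence[float], x: Sequence[float]) -> str:
    """Decode the binary output: +1 is Female, anything else is Male."""
    return "Female" if f(w, x) == 1 else "Male"


# Hypothetical weights and a tiny 3-pixel "image".
w = [0.5, -1.0, 0.25]
x = [104, 75, 30]
score = g(w, x)       # 52.0 - 75.0 + 7.5
decision = label(w, x)
```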


Design Choices
Type of training experience: DNA, Picture, FingerPrint
Target Function: Picture → Gender, Picture → scalar, (Picture, Gender) → [0,1]
Feature Extraction: RGB Coding, Compressed Format, GrayScale Intensities
Function Representation: Logical rules, Linear Function, Polynomial Function
Learning Algorithm: Convex Optimization, Perceptron, Gradient Descent


Regression
Time series prediction

One-dimensional input space X: time indexes x_1, x_2, ...
One-dimensional real output value Y: e.g. a traffic load, local temperature, currency exchange rate, etc.

Note: general regression problems may use multi-dimensional inputs and/or outputs
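One simple way to instantiate such a regression problem is to fit a straight line to the time series by ordinary least squares. The sketch below is only one possible choice of model, not the method prescribed by the course; the function name and the traffic-load values are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical traffic load observed at time indexes 1..5.
times = [1, 2, 3, 4, 5]
loads = [12.0, 14.0, 16.0, 18.0, 20.0]
slope, intercept = fit_line(times, loads)
prediction_at_6 = slope * 6 + intercept  # prediction for a previously unseen time index
```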
What is Machine Learning? Many disciplines, many more methods

Some Relevant Disciplines


Probability and statistics
- real data include noise and hidden variables
- some models are essentially estimates of probability distributions
- performance assessment requires refined statistical analysis
Mathematical optimization
- fitting a model to some data is often formulated as a constrained optimization problem
- links exist with operational research
Artificial intelligence
Computational complexity theory
Information theory
Control Theory
Signal Processing
...

Many model classes and learning algorithms


A non-exhaustive list of learning methods and model classes:

Tree-based models: Decision Trees, Random Forests, ...
Linear discriminants: Perceptron, Fisher Linear Discriminants, ...
Kernel methods: Support Vector Machines, Gaussian Processes
Deep learning: Deep Multi-layer Perceptrons, Convolutional Neural Nets, Generative Adversarial Networks, ...
Probabilistic models: Naive Bayes Classifier, Gaussian Classifier, Logistic Regression, ...
Instance-based methods: k-Nearest Neighbor, LVQ, ...
Graphical models: Bayesian Networks, Markov Logic Networks, ...
Sequential models: Markov Chains, Kalman Filters, Probabilistic Automata, Hidden Markov Models, Conditional Random Fields, ...
Some Machine Learning Challenges

Outline

1 What is Machine Learning?

2 Some Machine Learning Challenges


Fundamental questions
Good and bad news

3 Course organization



Some Machine Learning Challenges Fundamental questions

Some Recurrent Questions in Machine Learning

What algorithms can approximate functions well (and when)?
How does the number of training examples influence accuracy?
How does the complexity of the target function representation impact accuracy?
How does noisy data influence accuracy?
What are the theoretical limits of learnability?
How can the performance of a predictive model be estimated in a statistically sound way?
How can the learner's prior knowledge help?
Can we understand what the computer system has learned?
How can several models be combined optimally?



Some Machine Learning Challenges Good and bad news

Good news
Bayes classifier is optimal

A probabilistic view of the classification problem


x: input random variables = feature values computed from the input data
y: output random variable = class label 1, ..., C

MAP decision rule

Choose class
$y^* = \arg\max_{y \in \{1,\dots,C\}} P(y \mid x) = \arg\max_{y \in \{1,\dots,C\}} P(x \mid y)\,P(y)$
(by Bayes' rule, P(y|x) = P(x|y)P(y)/P(x), and P(x) does not depend on y)

Good news from Bayes Decision Theory: the MAP decision rule is optimal as it minimizes the probability of classification error
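The MAP rule is easy to apply once the probabilities are available. The sketch below assumes they are given as dictionaries indexed by class; the function names and all numerical values are illustrative, not estimates from real data.

```python
from typing import Dict, Hashable


def map_class(prior: Dict[Hashable, float], likelihood: Dict[Hashable, float]) -> Hashable:
    """Return argmax over y of P(x|y) * P(y), where likelihood[y] = P(x|y) for the observed x."""
    return max(prior, key=lambda y: likelihood[y] * prior[y])


def posterior(prior: Dict[Hashable, float], likelihood: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """P(y|x) for every class, obtained by normalizing P(x|y) * P(y) by P(x)."""
    joint = {y: likelihood[y] * prior[y] for y in prior}
    evidence = sum(joint.values())  # P(x), identical for all classes
    return {y: p / evidence for y, p in joint.items()}


# Illustrative values: class priors and the likelihood of one observed x.
prior = {"Female": 0.52, "Male": 0.48}
likelihood = {"Female": 0.010, "Male": 0.020}
best = map_class(prior, likelihood)
post = posterior(prior, likelihood)
```

Dividing by P(x) changes the scores but not their ranking, which is why both forms of the argmax select the same class.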


Bad news
Bayes classifier is often impossible to estimate reliably
P(y) and P(x|y) need to be reliably estimated from the training data and/or prior knowledge
- P̂(y): easy
- P̂(x|y): often very hard
Example: Gender classification from grey-coded pictures
- P̂(y = Female) = .52, P̂(y = Male) = .48
- P̂(x|y) = ?
  1024 × 768 = 786432 pixels ⇒ $x_1, x_2, \dots, x_{786432}$
  [0, 255] = 256 intensity values
  For each of the 2 classes: $256^{786432} \approx 10^{10^6}$ parameters to estimate!!
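The size of this number can be checked with logarithms, without ever building it. In this small sketch the variable names are arbitrary; it gives a base-10 exponent of roughly 1.9 million, which is the order of magnitude quoted above.

```python
import math

n_pixels = 1024 * 768   # one feature per pixel
levels = 256            # grey intensities in [0, 255]

# log10 of levels ** n_pixels, computed without materializing the huge integer.
log10_count = n_pixels * math.log10(levels)

# Same quantity expressed as a power of two: 256 = 2**8, so the count is 2**bits.
bits = n_pixels * int(math.log2(levels))
```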

Inductive bias
Need to restrict the set of possible distributions (e.g. Gaussian)
The relevant inductive bias is hard to choose

Good news
There is a wide range of mathematical functions to build a model
Generalization as an (often implicit) search through a space of possible
models. Example: regression with M-degree polynomials

[Figure: regression fits with M-degree polynomials; legend: training data, target function, predictive model]

Illustrations from Pattern Recognition and Machine Learning, C. Bishop, Springer, 2006


Bad news
Learning is impossible without inductive bias

The need for an inductive bias


If the function class is rich enough, you will always find at least 2
functions perfectly fitting the training data but predicting exactly
the opposite on new data
There is no way to tell which function (= predictive model) to
choose
Restriction of the function class is a necessary inductive bias
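The argument above can be illustrated with two "memorizing" functions. This is a toy sketch with hypothetical training examples: both functions reproduce the training labels exactly, yet they give opposite answers on every input they have not seen.

```python
from typing import Callable, Dict, Tuple

Point = Tuple[int, int]

# Hypothetical labeled training examples (input -> label in {-1, +1}).
train: Dict[Point, int] = {(0, 0): +1, (0, 1): -1, (1, 0): -1}


def make_memorizer(training_set: Dict[Point, int], default: int) -> Callable[[Point], int]:
    """Return a function that recalls training labels and answers `default` elsewhere."""
    def predict(x: Point) -> int:
        return training_set.get(x, default)
    return predict


f_plus = make_memorizer(train, +1)    # predicts +1 on every unseen input
f_minus = make_memorizer(train, -1)   # predicts -1 on every unseen input

unseen = (1, 1)
disagree = f_plus(unseen) == -f_minus(unseen)
```

Nothing in the training data favors one function over the other, which is why a restriction of the function class is needed.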


The need for capacity control and regularization


[Figure panels: degree 1; degree 9; degree 9, smoother]

Recall that the target model is unknown!

Capacity control: restrict the function class to be rich enough but not too rich (inductive bias is required)
Regularization: favor smooth models in the chosen function class to avoid overfitting
Illustrations from Pattern Recognition and Machine Learning, C. Bishop, Springer, 2006


From William of Ockham to Vladimir Vapnik

Ockham’s Razor: Principle of Simplicity (14th century)


Entia non sunt multiplicanda praeter necessitatem
Among all theories explaining the world equally well,
the simplest is the best

Vapnik’s Statistical Learning Theory (20th century)


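$$R[f] \le R_{\mathrm{emp}}[f] + \sqrt{\frac{1}{n}\left(h\left(\ln\frac{2n}{h} + 1\right) + \ln\frac{4}{\delta}\right)}$$
where R[f] is the true error, R_emp[f] the training error, and the square-root term the capacity term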
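To get a feel for the capacity term, it can be evaluated numerically. The sketch below assumes the usual reading of the symbols in this bound, which the slide does not spell out: n is the number of training examples, h the VC dimension of the function class, and δ the risk level (the bound holds with probability at least 1 − δ). The function name and the chosen values are illustrative.

```python
import math


def capacity_term(n: int, h: int, delta: float) -> float:
    """Square-root term of the bound: sqrt((h * (ln(2n/h) + 1) + ln(4/delta)) / n)."""
    if h <= 0 or n < h:
        raise ValueError("expected 0 < h <= n")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt((h * (math.log(2 * n / h) + 1) + math.log(4 / delta)) / n)


# Illustrative values: the term shrinks with more data and grows with capacity.
base = capacity_term(n=1000, h=10, delta=0.05)
more_data = capacity_term(n=100_000, h=10, delta=0.05)
more_capacity = capacity_term(n=1000, h=100, delta=0.05)
```

The comparison makes the trade-off visible: a richer function class can lower the training error but pays for it through a larger capacity term.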


A few words of caution


Machine learning is not magic: good generalizations are possible from examples, but the examples matter as well as the inductive bias and the learning algorithm
Look at your data, Look at your data, LOOK at your data
Check your model, CHECK your model:
- try to understand the acquired knowledge in your model
- confront its predictions with your expectations
Define the learning task properly
- Be prepared to do things (semi-)manually first, before automating them
- Pre-process your learning data (data cleansing, feature extraction, noise reduction, ...)
- Do not confuse learning a predictive model and using a predictive model
- It is useless to learn existing knowledge (e.g. chess rules)



Course organization

Outline

1 What is Machine Learning?

2 Some Machine Learning Challenges

3 Course organization


Course objectives

A student who successfully completes this course will be able to


understand and apply standard techniques to build computer
programs that automatically improve with experience
assess the quality of a learned model for a given task
assess the relative performance of several learning algorithms
justify the use of a particular learning algorithm given the nature of
the data, the learning problem and a relevant performance
measure
use, adapt and extend learning software (in Python)


Some References

Pattern Recognition and Machine Learning, C. Bishop, Springer, 2006.
An Introduction to Statistical Learning, G. James, D. Witten, T. Hastie, R. Tibshirani, Springer, 2013.
The Elements of Statistical Learning, T. Hastie, R. Tibshirani, J. Friedman, 2nd edition, Springer, 2009.
Deep Learning, I. Goodfellow, Y. Bengio, A. Courville, MIT Press, 2016.


Instructors

Prof. Pierre Dupont


Office: Réaumur A.142
Teaching Assistants: Alexander Gerniers, Victor Hamer
Office: Réaumur A.337.10


Course organization

Lectures (usually on Friday at 10:45)

Review slides, look at references

Theoretical questions and practical projects on
inginious.info.ucl.ac.be/course/LINFO2262


Evaluation

First session (June): 100% of the global grade is based on the Inginious assignments (no (re-)submission after the deadline)

Relative weights between assignments
- A1: 10% Decision Trees, Ensemble of Trees
- A2: 15% Linear Models, Support Vector Machines
- A3: 10% Evaluation Protocols - Performance Assessment
- A4: 15% Deep Learning
- A5: 50% ML Competition

Second session (August)
- A1 to A4: 50% of the global grade (projects are not re-evaluated)
- A5 replaced by a written exam (closed book): 50%


Course website
https://moodle.uclouvain.be/course/view.php?id=1836

Assignments
Submit your results and/or Python code through
inginious.info.ucl.ac.be/course/LINFO2262
in due time ⇒ check inginious deadlines, e.g. at 23:00 the day before the next lecture
Online feedback
- Theory: feedback after submission is closed
- Practical Problems, warm-up questions: real-time feedback (no impact on the grade)
- Practical Problems, test questions: feedback after submission is closed
Do not expect the inginious server to play the role of a Python debugger!


UCLouvain anti-plagiarism policy


Submitting your answers and Python code on inginious implies that
you agree with the following:
I hereby certify that the results and code that I will submit for this project is coming
from my own work. The submitted works will not be (even partial) copy/paste from the
work of other students. Re-use of publicly available code fragments (e.g. from Stack
Overflow) is allowed, provided any such use is explicitly quoted as a comment in your
submitted code.
I also certify that I will not distribute any answer or code related to these projects,
in person or on any repository (github, bitbucket, Facebook groups, etc.) accessible to
anybody, even after the deadlines.
Any violation of the above statements will be considered as cheating or plagiarism and
will be reported as such to the President of the Jury.

Automated similarity checks between students' code and answers will be performed
You are more than welcome to exchange ideas/questions openly
on the Moodle student forum.
Conclusion

Take Home Message

Machine Learning
ML is about making computers learn from experience

Birds and planes do not fly the same way
- Computers do not tend to learn as humans or animals do, even though biological learning systems inspire some ML research
- Machine learning ≠ cognitive science

ML is fun and novel applications arise every day
