Professional Documents
Culture Documents
Slides Lect 02
Slides Lect 02
Lecture 2
The Perceptron
The Learning Setup
A Simple Learning Algorithm: PLA
Other Views of Learning
Is Learning Feasible: A Puzzle
M. Magdon-Ismail
CSCI 4100/6100
recap: The Plan
1. What is Learning?
2. Can We do it?
3. How to do it?
concepts
4. How to do it well? theory
practice
5. General principles?
6. Advanced techniques.
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 2 /25 Recap: key players −→
recap: The Key Players
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 3 /25 Recap: learning setup −→
recap: Summary of the Learning Setup
TRAINING EXAMPLES
(x1 , y1 ), (x2 , y2 ), . . . , (xN , yN )
LEARNING FINAL
ALGORITHM HYPOTHESIS
A g≈f
(learned credit approval formula)
HYPOTHESIS SET
H
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 4 /25 Simple learning model −→
A Simple Learning Model
• Give importance weights to the different inputs and compute a “Credit Score”
d
X
“Credit Score” = w i xi .
i=1
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 5 /25 Rewriting the model −→
A Simple Learning Model
d
X
Approve credit if wixi > threshold,
i=1
Xd
Deny credit if wixi < threshold.
i=1
d
! !
X
h(x) = sign w i xi + w0
i=1
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 6 /25 Perceptron −→
The Perceptron Hypothesis Set
w0 1
w 1 x1
w= . ∈ Rd+1, x= . ∈ {1} × Rd.
. .
wd xd
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 7 /25 Geometry of perceptron −→
Geometry of The Perceptron
Income
Income
Age Age
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 8 /25 Use the data −→
Use the Data to Pick a Line
Income
Income
Age Age
A perceptron fits the data by using a line to separate the +1 from −1 data.
Fitting the data: How to find a hyperplane that separates the data?
(“It’s obvious - just look at the data and draw the line,” is not a valid solution.)
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 9 /25 How to learn g −→
How to Learn a Final Hypothesis g from H
Idea! Start with some weight vector and try to improve it.
Income
Age
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 10 /25 PLA −→
The Perceptron Learning Algorithm (PLA)
2: for iteration t = 1, 2, 3, . . . y∗ x∗
w(t + 1)
3: the weight vector is w(t). w(t)
4: From (x1, y1 ), . . . , (xN , yN ) pick any misclassified example.
x∗
5: Call the misclassified example (x∗, y∗ ),
sign (w(t) • x∗) 6= y∗ .
y∗ = −1
6: Update the weight:
w(t + 1) = w(t) + y∗x∗ . y∗ x∗
w(t)
w(t + 1)
7: t←t+1
x∗
PLA implements our idea: start at some weights and try to improve.
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 11 /25 PLA convergence −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 1
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 12 /25 Start −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 1
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 13 /25 Iteration 1 −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 1
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 14 /25 Iteratrion 2 −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 2
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 15 /25 Iteration 3 −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 3
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 16 /25 Iteration 4 −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 4
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 17 /25 Iteration 5 −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 5
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 18 /25 Iteration 6 −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 6
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 19 /25 Non-separable data? −→
Does PLA Work?
Theorem. If the data can be fit by a linear separator, then after some finite number
of steps, PLA will find one.
Income
What if the data cannot be fit by a perceptron?
Age
iteration 1
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 20 /25 We can fit! −→
We can Fit the Data
• We can find an h that works from infinitely many (for the perceptron).
(So computationally, things seem good.)
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 21 /25 Other views of learning −→
Other Views of Learning
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 22 /25 Coins – supervised −→
Supervised Learning - Classifying Coins
25
25
5
Mass
Mass
1 5
1
10 10
Size Size
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 23 /25 Coins – unsupervised −→
Unsupervised Learning - Categorizing Coins
type 4
Mass
Mass
type 3
type 2
type 1
Size Size
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 24 /25 Puzzle: outside the data −→
Outside the Data Set - A Puzzle
Dogs (f = −1)
Trees (f = +1)
Tree or Dog? (f = ?)
c AM
L Creator: Malik Magdon-Ismail The Perceptron: 25 /25