Learning from Observations
• Four Components
1. Performance Element: collection of knowledge
and procedures to decide on the next action.
E.g. walking, turning, drawing, etc.
2. Learning Element: takes in feedback from the
critic and modifies the performance element
accordingly.
Learning Agent (con’t)
• Key idea:
– To use specific examples to reach general
conclusions
• Given a set of examples, the system tries to
approximate the evaluation function.
• Also called Pure Inductive Inference
Recognizing Handwritten Digits
[Figure: training examples of handwritten digits]
Recognizing Handwritten Digits (con’t)
Is this a 7 or a 1? Some may be more biased toward 7 and others more biased toward 1.
Formal Definitions
• Example: a pair (x, f(x)), where
– x is the input,
– f(x) is the output of the function
applied to x.
• hypothesis: a function h that approximates f, given a
set of examples.
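The fit between examples and a hypothesis can be sketched in code. This is a toy illustration only: the target function f and the candidate hypotheses below are assumptions, not taken from the slides.

```python
# Pure inductive inference in miniature: given examples (x, f(x)) for an
# unknown function f, choose the hypothesis h that best approximates f.

examples = [(x, x % 2 == 0) for x in range(20)]  # f(x): "x is even"

# Candidate hypotheses (illustrative assumptions).
hypotheses = {
    "always_true": lambda x: True,
    "is_even": lambda x: x % 2 == 0,
    "less_than_10": lambda x: x < 10,
}

def accuracy(h, examples):
    """Fraction of examples on which h agrees with the observed f(x)."""
    return sum(h(x) == fx for x, fx in examples) / len(examples)

best = max(hypotheses, key=lambda name: accuracy(hypotheses[name], examples))
print(best)  # the hypothesis most consistent with the examples: is_even
```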
Task of Induction
Goal predicate: Will wait for a table?
[Figure: decision tree. Patrons? branches none → No, some → Yes, full → WaitEst?; WaitEst? branches >60 → No, 30-60 → Alternate?, 10-30 → Hungry?, 0-10 → Yes; the Alternate? and Hungry? tests branch no/yes to further tests or No/Yes leaves]
http://www.cs.washington.edu/education/courses/473/99wi/
Logical Representation of a Path
[Figure: a single root-to-leaf path highlighted in the tree: Patrons? = full, then WaitEst? = 10-30, then Hungry? = yes, ending in a Yes leaf; such a path corresponds to a conjunction of attribute tests implying the goal predicate]
Splitting Examples by Testing on Attributes
[Patrons? split: none gives - {X7, X11}; some gives + {X1, X3, X6, X8}; full gives + {X4, X12}, - {X2, X5, X9, X10}]
Splitting Examples by Testing on
Attributes (con’t)
+ Positive examples: X1, X3, X4, X6, X8, X12
- Negative examples: X2, X5, X7, X9, X10, X11
[Patrons? split: none gives - {X7, X11}, a No leaf; some gives + {X1, X3, X6, X8}, a Yes leaf; full gives the mixed subset + {X4, X12}, - {X2, X5, X9, X10}]
Splitting Examples by Testing on Attributes
(con’t)
+ Positive examples: X1, X3, X4, X6, X8, X12
- Negative examples: X2, X5, X7, X9, X10, X11
[Patrons? split: none gives - {X7, X11}, a No leaf; some gives + {X1, X3, X6, X8}, a Yes leaf; full is tested further on Hungry?]
[Hungry? split of the full branch: no gives - {X5, X9}; yes gives + {X4, X12}, - {X2, X10}]
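The splitting step above can be sketched in code. The table below keeps only the Patrons attribute and the +/- labels shown on the slides; the data structure itself is an illustrative assumption.

```python
# Split the 12 restaurant examples on the Patrons attribute and collect
# the positive (+) and negative (-) examples reaching each branch.
from collections import defaultdict

# example name -> (Patrons value, positive?)
examples = {
    "X1": ("some", True),   "X2": ("full", False),  "X3": ("some", True),
    "X4": ("full", True),   "X5": ("full", False),  "X6": ("some", True),
    "X7": ("none", False),  "X8": ("some", True),   "X9": ("full", False),
    "X10": ("full", False), "X11": ("none", False), "X12": ("full", True),
}

def split(examples):
    """Group examples by attribute value, keeping +/- labels separate."""
    branches = defaultdict(lambda: {"+": [], "-": []})
    for name, (value, positive) in examples.items():
        branches[value]["+" if positive else "-"].append(name)
    return branches

for value, branch in split(examples).items():
    print(value, branch)
# "none" and "some" come out pure (all -, all +) and become leaves;
# "full" is mixed and must be split again (e.g. on Hungry).
```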
What Makes a Good Attribute?
[Patrons? shown as the better attribute: its split leaves the none and some subsets pure, with none giving - {X7, X11} and some giving + {X1, X3, X6, X8}; only full gives a mixed subset + {X4, X12}, - {X2, X5, X9, X10}]
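One standard way to quantify "a good attribute" is information gain, the measure used by decision-tree learning. The entropy and gain formulas below are the textbook ones; the counts are the Patrons split from the slide.

```python
# Information gain of an attribute: entropy of the parent set minus the
# expected entropy of the child subsets after the split.
from math import log2

def entropy(p, n):
    """Entropy (in bits) of a set with p positive and n negative examples."""
    if p == 0 or n == 0:
        return 0.0
    q = p / (p + n)
    return -(q * log2(q) + (1 - q) * log2(1 - q))

def gain(p, n, branches):
    """Gain of an attribute whose branches have (p_i, n_i) example counts."""
    total = p + n
    remainder = sum((pi + ni) / total * entropy(pi, ni) for pi, ni in branches)
    return entropy(p, n) - remainder

# Patrons splits 6+/6- into none (0+, 2-), some (4+, 0-), full (2+, 4-):
print(gain(6, 6, [(0, 2), (4, 0), (2, 4)]))  # about 0.541 bits
```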
http://www.cs.washington.edu/education/courses/473/99wi/
Final Decision Tree
[Figure: the learned tree: Patrons? with branches none → No, some → Yes, full → Hungry?; Hungry? with branches no → No, yes → Type?]
http://www.cs.washington.edu/education/courses/473/99wi/
Outline
• Learning agents
• Inductive learning
• Learning decision trees
– Example of a decision tree
– Decision-tree-learning algorithm
– Assessing the performance
• Learning general logical descriptions
– Current-best hypothesis search algorithm
– Version space learning algorithm
• Computational learning theory
• Summary
Assessing the Performance of the
Learning Algorithm
• A learning algorithm is good if it produces
hypotheses that do a good job of predicting the
classifications of unseen examples.
• Test the algorithm’s prediction performance on a set
of new examples, called a test set.
Methodology in Assessing
Performance
1. Collect a large set of examples.
2. Divide it into two disjoint sets: the training set and the test set. It
is very important that these two sets are separate so that the
algorithm doesn’t cheat. Usually this division of examples is
done randomly.
3. Use the learning algorithm with the training set as examples to
generate a hypothesis H.
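The methodology above can be sketched as follows. The dataset and the deliberately crude majority-class "learner" are illustrative assumptions, not the algorithm from the slides.

```python
# Assessing a learner: random disjoint train/test split, learn on one,
# measure prediction accuracy on the other.
import random

random.seed(0)

# 1. Collect a large set of examples (here: does 3 divide x?).
examples = [(x, x % 3 == 0) for x in range(100)]

# 2. Randomly divide them into two disjoint sets: training and test.
random.shuffle(examples)
train, test = examples[:80], examples[80:]

# 3. Run the learning algorithm on the training set to get a hypothesis H.
def learn(training_set):
    positives = sum(label for _, label in training_set)
    majority = positives > len(training_set) / 2
    return lambda x: majority  # H ignores x entirely (crude on purpose)

H = learn(train)

# Finally, measure how well H predicts the classifications of the unseen
# test set. H never saw these examples, so it cannot "cheat".
accuracy = sum(H(x) == label for x, label in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```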
Methodology (con’t)
[Figure: "happy graph": % correct on the test set (y-axis, 0.4 to 0.9) plotted against training set size (x-axis, 0 to 100); prediction accuracy improves as the training set grows]
“Artificial Intelligence: A Modern Approach”, Stuart Russell and Peter Norvig
Overfitting
• Overfitting is what happens when a learning algorithm finds
meaningless “regularity” in the data.
• Caused by irrelevant attributes.
• Solution: decision tree pruning.
– The resulting decision tree is:
• Smaller.
• More tolerant to noise.
• More accurate in its predictions.
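The slides do not name a specific pruning method, but one common realization is reduced-error pruning: replace a subtree with a leaf whenever that does not hurt accuracy on a held-out validation set. The tree encoding, helper names, and example below are assumptions.

```python
# Reduced-error pruning sketch. A tree is either a class label (leaf) or a
# tuple (attribute, {value: subtree}) for an internal node.

def classify(tree, x):
    while isinstance(tree, tuple):        # internal node
        attr, branches = tree
        tree = branches[x[attr]]
    return tree                           # leaf: a class label

def accuracy(tree, examples):
    if not examples:
        return 1.0
    return sum(classify(tree, x) == y for x, y in examples) / len(examples)

def prune(tree, validation):
    """Bottom-up: replace a subtree by its majority-label leaf when that
    does not lower accuracy on the validation examples reaching it."""
    if not isinstance(tree, tuple):
        return tree
    attr, branches = tree
    pruned = (attr, {
        v: prune(sub, [e for e in validation if e[0][attr] == v])
        for v, sub in branches.items()
    })
    labels = [y for _, y in validation]
    if not labels:
        return pruned
    leaf = max(set(labels), key=labels.count)   # majority label
    return leaf if accuracy(leaf, validation) >= accuracy(pruned, validation) else pruned

# A split on an irrelevant "noise" attribute gets pruned away:
tree = ("noise", {0: True, 1: False})
validation = [({"noise": 0}, True), ({"noise": 1}, True)]
print(prune(tree, validation))  # True (the subtree collapses to a leaf)
```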
Practical Uses of Decision Tree Learning
• Designing oil platform equipment.
• Learning to fly a plane.
• Diagnosing heart attacks.
Outline
• Learning agents
• Inductive learning
• Learning decision trees
– Example of a decision tree
– Decision-tree-learning algorithm
– Assessing the performance
• Learning general logical descriptions
– Current-best hypothesis search algorithm
– Version space learning algorithm
• Computational learning theory
• Summary
Learning General Logical Descriptions
• Key idea:
– Look at inductive learning generally
– Find a logical description that is equivalent
to the (unknown) evaluation function
• Make our hypothesis more or less specific to match
the evaluation function.
Outline
• Learning agents
• Inductive learning
• Learning decision trees
– Example of a decision tree
– Decision-tree-learning algorithm
– Assessing the performance
• Learning general logical descriptions
– Current-best hypothesis search algorithm
– Version space learning algorithm
• Computational learning theory
• Summary
Current-best-hypothesis Search
• Key idea:
– Maintain a single hypothesis throughout.
– Update the hypothesis to maintain consistency as
a new example comes in.
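The update loop can be sketched as follows, assuming conjunctive hypotheses over discrete attributes: generalize on a false negative, specialize on a false positive. The representation and helper names are assumptions, and backtracking is omitted.

```python
# Current-best-hypothesis search sketch. A hypothesis h is a dict mapping
# attribute -> required value; an example x satisfies h if all values match.

def predict(h, x):
    return all(x[a] == v for a, v in h.items())

def generalize(h, x):
    """False negative: drop the conditions that positive example x violates."""
    return {a: v for a, v in h.items() if x[a] == v}

def specialize(h, x, positives):
    """False positive: add one condition that rules out x but keeps all
    positive examples seen so far."""
    for a in x:
        if a in h:
            continue
        values = {p[a] for p in positives}
        if len(values) == 1 and x[a] not in values:
            new = dict(h)
            new[a] = next(iter(values))
            return new
    return None  # no consistent specialization: would require backtracking

def current_best(examples):
    h, positives = None, []
    for x, label in examples:
        if label:
            positives.append(x)
            if h is None:
                h = dict(x)                 # start maximally specific
            elif not predict(h, x):
                h = generalize(h, x)
        elif h is not None and predict(h, x):
            h = specialize(h, x, positives)
    return h
```

For example, positives {round, red} and {round, green} with negative {square, red} leave the single hypothesis "shape = round".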
Definitions
http://www.pitt.edu/~suthers/infsci1054/8.html
How to Specialize
http://www.pitt.edu/~suthers/infsci1054/8.html
What do all these mean?
• Least-Commitment Search
• No backtracking
• Key idea:
– Maintain the most general and specific hypotheses
at any point in learning. Update them as new
examples come in.
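A minimal sketch of this idea, with hypotheses as attribute tuples and "?" as a wildcard: S is the most specific boundary (one hypothesis here) and G the most general. This is a simplified candidate elimination; the representation and data are assumptions.

```python
# Version-space learning sketch: keep a most-specific hypothesis S and a
# set G of most-general hypotheses consistent with all examples so far.

WILDCARD = "?"

def matches(h, x):
    return all(hv in (WILDCARD, xv) for hv, xv in zip(h, x))

def generalize(s, x):
    """Minimally generalize S so it covers positive example x."""
    return tuple(sv if sv == xv else WILDCARD for sv, xv in zip(s, x))

def specialize(g, s, x):
    """Minimal specializations of g that exclude negative x, guided by S."""
    out = []
    for i, gv in enumerate(g):
        if gv == WILDCARD and s[i] != WILDCARD and s[i] != x[i]:
            h = list(g)
            h[i] = s[i]
            out.append(tuple(h))
    return out

def version_space(examples, n_attrs):
    S, G = None, [tuple([WILDCARD] * n_attrs)]
    for x, positive in examples:
        if positive:
            S = tuple(x) if S is None else generalize(S, x)
            G = [g for g in G if matches(g, x)]   # drop too-specific g
        elif S is not None:
            kept = [g for g in G if not matches(g, x)]
            split = [h for g in G if matches(g, x) for h in specialize(g, S, x)]
            G = kept + split
    return S, G

# (sky, temp) examples: S and G converge on ("sunny", "?").
S, G = version_space([(("sunny", "warm"), True),
                      (("rainy", "cold"), False),
                      (("sunny", "cold"), True)], 2)
print(S, G)
```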
Definitions