
Machine Learning

Chapter 11

Machine Learning
What is learning?
That is what learning is. You suddenly understand
something you've understood all your life, but in a
new way.
(Doris Lessing, 2007 Nobel Prize in Literature)

Machine Learning
How to construct programs that automatically
improve with experience.

Learning problem:

Task T
Performance measure P
Training experience E

Machine Learning
Chess game:

Task T: playing chess games


Performance measure P: percent of games won against
opponents

Training experience E: playing practice games against itself

Machine Learning
Handwriting recognition:

Task T: recognizing and classifying handwritten words

Performance measure P: percent of words correctly classified

Training experience E: handwritten words with given classifications

Designing a Learning System


Choosing the training experience:

Direct or indirect feedback


Degree of learner's control
Representative distribution of examples

Designing a Learning System


Choosing the target function:

Type of knowledge to be learned


Function approximation

Designing a Learning System


Choosing a representation for the target function:

An expressive representation allows a close approximation of the target function

A simple representation keeps the training data and the learning algorithm simple

10

Designing a Learning System


Choosing a function approximation algorithm
(learning algorithm)

11

Designing a Learning System


Chess game:

Task T: playing chess games


Performance measure P: percent of games won against
opponents

Training experience E: playing practice games against itself


Target function: V: Board → ℝ

12

Designing a Learning System


Chess game:

Target function representation:

V̂(b) = w0 + w1·x1 + w2·x2 + w3·x3 + w4·x4 + w5·x5 + w6·x6


x1: the number of black pieces on the board
x2: the number of red pieces on the board

x3: the number of black kings on the board


x4: the number of red kings on the board

x5: the number of black pieces threatened by red

x6: the number of red pieces threatened by black
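
A minimal sketch of how this linear evaluation function can be computed for one board. The feature values and weights below are illustrative only (the slides do not fix them), and extracting x1..x6 from an actual board is not shown.

# Sketch: evaluate V̂(b) = w0 + w1*x1 + ... + w6*x6 for one board position.
# The feature values and weights here are made up for illustration.

def v_hat(features, weights):
    """features = (x1, ..., x6), weights = (w0, w1, ..., w6)."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

features = (3, 0, 1, 0, 0, 0)                      # x1..x6 for some board b
weights = (0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5)   # w0..w6 (illustrative)
print(v_hat(features, weights))                    # 6.0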

13

Designing a Learning System


Chess game:

Function approximation algorithm:

Training example: (<x1 = 3, x2 = 0, x3 = 1, x4 = 0, x5 = 0, x6 = 0>, 100)

x1: the number of black pieces on the board


x2: the number of red pieces on the board

x3: the number of black kings on the board


x4: the number of red kings on the board

x5: the number of black pieces threatened by red

x6: the number of red pieces threatened by black
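
The slides give one training example of the form (features, training value); one standard way to fit the weights from such examples is an LMS-style gradient update (the approach Mitchell uses for this example). A sketch with an illustrative learning rate and zero initial weights:

# Sketch: one LMS-style weight update from the training example
# (<x1..x6> = (3, 0, 1, 0, 0, 0), V_train = 100).

def v_hat(features, weights):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, lr=0.1):
    error = v_train - v_hat(features, weights)
    new_w = [weights[0] + lr * error]                            # w0 (x0 = 1)
    new_w += [w + lr * error * x for w, x in zip(weights[1:], features)]
    return tuple(new_w)

weights = (0.0,) * 7                                             # illustrative start
weights = lms_update(weights, (3, 0, 1, 0, 0, 0), 100)
print(weights)   # (10.0, 30.0, 0.0, 10.0, 0.0, 0.0, 0.0)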

14

Designing a Learning System


What is learning?

15

Designing a Learning System


Learning is an (endless) generalization or induction
process.

16

Designing a Learning System


[Diagram: the Experiment Generator produces a new problem (initial board) for the Performance System; the Performance System produces a solution trace (game history) for the Critic; the Critic produces training examples {(b1, V1), (b2, V2), ...} for the Generalizer; the Generalizer produces a hypothesis (V̂) that is passed back to the Experiment Generator.]

17

Issues in Machine Learning


Which learning algorithms should be used?
How much training data is sufficient?
When and how can prior knowledge guide the learning process?
What is the best strategy for choosing the next training experience?
What is the best way to reduce the learning task to one or more function approximation problems?
How can the learner automatically alter its representation to improve its learning ability?

18

Example
Experience:

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes

Prediction:

Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Rainy  Cold     High      Strong  Warm   Change    ?
Sunny  Warm     Low       Strong  Cool   Same      ?
Sunny  Warm     Normal    Strong  Warm   Same      ?
19

Example
Learning problem:

Task T: classifying days on which my friend enjoys water sport

Performance measure P: percent of days correctly classified


Training experience E: days with given attributes and classifications

20

Concept Learning
Inferring a boolean-valued function from training
examples of its input (instances) and output
(classifications).

21

Concept Learning
Learning problem:

Target concept: a subset of the set of instances X

c: X → {0, 1}

Target function:

Sky × AirTemp × Humidity × Wind × Water × Forecast → {Yes, No}

Hypothesis:

Characteristics of all instances of the concept to be learned

Constraints on instance attributes

h: X → {0, 1}

22

Concept Learning
Satisfaction:

h(x) = 1 iff x satisfies all the constraints of h

h(x) = 0 otherwise

Consistency:

h(x) = c(x) for every instance x of the training examples

Correctness:

h(x) = c(x) for every instance x of X
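
These definitions translate directly into code. A tiny sketch (names are mine): a hypothesis is treated as a callable returning 0 or 1, and D is a list of (x, c(x)) pairs.

# Sketch: consistency of a hypothesis h with training data D.
# h is any function mapping an instance to 0/1; D is a list of (x, c_x) pairs.

def consistent(h, D):
    return all(h(x) == c_x for x, c_x in D)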


23

Concept Learning
How to represent a hypothesis function?

24

Concept Learning
Hypothesis representation (constraints on instance attributes):
<Sky, AirTemp, Humidity, Wind, Water, Forecast>

? : any value is acceptable

a single specific value: only that value is acceptable

∅ : no value is acceptable
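
A minimal sketch of this representation, encoding "any value" as the string '?', a required value as that value's string, and the empty constraint ∅ as None; the function name is my own.

# Sketch: a hypothesis is a 6-tuple of constraints over
# <Sky, AirTemp, Humidity, Wind, Water, Forecast>, where each constraint is
#   '?'   -> any value is acceptable
#   value -> only that single value is acceptable
#   None  -> no value is acceptable (the empty constraint)

def h_value(hypothesis, instance):
    """Return 1 iff the instance satisfies every constraint of the hypothesis."""
    for c, v in zip(hypothesis, instance):
        if c is None or (c != '?' and c != v):
            return 0
    return 1

h = ('Sunny', '?', '?', 'Strong', '?', '?')
x = ('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same')
print(h_value(h, x))   # 1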

25

Concept Learning
General-to-specific ordering of hypotheses:

hj ≥g hk iff ∀x ∈ X: (hk(x) = 1) → (hj(x) = 1)

h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>
h3 = <Sunny, ?, ?, ?, Cool, ?>

[Diagram: the hypotheses of H form a lattice (partial order) from specific to general; h2 is more general than both h1 and h3.]
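
For this representation the ≥g relation can be checked syntactically, constraint by constraint. A sketch under the same '?'/value/None encoding as above; the only subtlety is that a hypothesis containing ∅ covers no instance, so every hypothesis is ≥g it.

# Sketch: hj >=g hk (hj is more general than or equal to hk)
# for conjunctive attribute constraints ('?', a value, or None).

def more_general_or_equal(hj, hk):
    if any(c is None for c in hk):        # hk covers no instance at all
        return True
    return all(cj == '?' or cj == ck for cj, ck in zip(hj, hk))

h1 = ('Sunny', '?', '?', 'Strong', '?', '?')
h2 = ('Sunny', '?', '?', '?', '?', '?')
h3 = ('Sunny', '?', '?', '?', 'Cool', '?')
print(more_general_or_equal(h2, h1))   # True:  h2 is more general than h1
print(more_general_or_equal(h1, h3))   # False: h1 and h3 are not comparable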
26

FIND-S
Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes

h = <∅, ∅, ∅, ∅, ∅, ∅>

h = <Sunny, Warm, Normal, Strong, Warm, Same>

h = <Sunny, Warm, ?, Strong, Warm, Same>

h = <Sunny, Warm, ?, Strong, ?, ?>

27

FIND-S
Initialize h to the most specific hypothesis in H
For each positive training instance x:
  For each attribute constraint ai in h:
    If the constraint ai is not satisfied by x
    Then replace ai by the next more general constraint that is satisfied by x
Output hypothesis h
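
A compact sketch of FIND-S for the attribute-constraint representation ('?' / a value / None for ∅), run on the four EnjoySport training examples; a sketch, not the textbook's exact code.

# Sketch: FIND-S for conjunctive attribute constraints.
# None = empty constraint, '?' = any value, otherwise a required value.

def find_s(examples):
    h = [None] * len(examples[0][0])            # most specific hypothesis in H
    for x, label in examples:
        if label != 'Yes':                      # FIND-S ignores negative examples
            continue
        for i, (c, v) in enumerate(zip(h, x)):
            if c is None:
                h[i] = v                        # first positive example: copy the value
            elif c != v:
                h[i] = '?'                      # generalize to "any value"
    return tuple(h)

D = [(('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'),   'Yes'),
     (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'),   'Yes'),
     (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), 'No'),
     (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), 'Yes')]
print(find_s(D))   # ('Sunny', 'Warm', '?', 'Strong', '?', '?')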

28

FIND-S

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes

h = <Sunny, Warm, ?, Strong, ?, ?>

Prediction:

Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Rainy  Cold     High      Strong  Warm   Change    No
Sunny  Warm     Low       Strong  Cool   Same      Yes
Sunny  Warm     Normal    Strong  Warm   Same      Yes

29

FIND-S
The output hypothesis is the most specific one that
satisfies all positive training examples.

30

FIND-S
The result is consistent with the positive training examples.

31

FIND-S
Is the result consistent with the negative training examples?

32

FIND-S
Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes
5        Sunny  Warm     Normal    Strong  Cool   Change    No

h = <Sunny, Warm, ?, Strong, ?, ?>

33

FIND-S
The result is consistent with the negative training examples
if the target concept is contained in H (and the training
examples are correct).

34

FIND-S
The result is consistent with the negative training examples
if the target concept is contained in H (and the training
examples are correct).
Sizes of the spaces:

Size of the instance space: |X| = 3·2·2·2·2·2 = 96

Size of the concept space: |C| = 2^|X| = 2^96

Size of the hypothesis space: |H| = 4·3·3·3·3·3 + 1 = 973 ≪ 2^96
(each attribute constraint is either "?" or one of the attribute's values, plus one hypothesis standing for all those that contain ∅ and hence classify every instance as negative)

The target concept (in C) may not be contained in H.
35

FIND-S
Questions:

Has the learner converged to the target concept, given that there can be several hypotheses consistent with the (positive and negative) training examples?

Why is the most specific hypothesis preferred?

What if there are several maximally specific consistent hypotheses?

What if the training examples are not correct?

36

List-then-Eliminate Algorithm
Version space: the set of all hypotheses that are consistent with the training examples.
Algorithm:

Initial version space = set containing every hypothesis in H

For each training example <x, c(x)>, remove from the version space any hypothesis h for which h(x) ≠ c(x)

Output the hypotheses in the version space
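
A sketch of LIST-THEN-ELIMINATE for the EnjoySport hypothesis space. The attribute domains are assumptions: the values appearing in the tables, plus Cloudy as the third Sky value (following Mitchell's textbook), which gives the |H| = 973 counted later.

from itertools import product

# Assumed attribute domains (Sky, AirTemp, Humidity, Wind, Water, Forecast).
DOMAINS = [('Sunny', 'Rainy', 'Cloudy'), ('Warm', 'Cold'), ('Normal', 'High'),
           ('Strong', 'Weak'), ('Warm', 'Cool'), ('Same', 'Change')]

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def list_then_eliminate(examples, domains=DOMAINS):
    # Enumerate H: the single all-empty hypothesis plus every '?'/value combination
    # (1 + 4*3*3*3*3*3 = 973 hypotheses for these domains).
    version_space = [(None,) * len(domains)]
    version_space += list(product(*[('?',) + d for d in domains]))
    for x, label in examples:
        version_space = [h for h in version_space
                         if matches(h, x) == (label == 'Yes')]
    return version_space

On the four EnjoySport training examples this should leave exactly the hypotheses bounded by S4 and G4 in the Candidate-Elimination trace below.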

37

List-then-Eliminate Algorithm
Requires an exhaustive enumeration of all hypotheses
in H

38

Compact Representation of
Version Space
G (the general boundary): the set of the most general hypotheses of H consistent with the training data D:

G = {g ∈ H | consistent(g, D) ∧ ¬∃g′ ∈ H: (g′ >g g) ∧ consistent(g′, D)}

S (the specific boundary): the set of the most specific hypotheses of H consistent with the training data D:

S = {s ∈ H | consistent(s, D) ∧ ¬∃s′ ∈ H: (s >g s′) ∧ consistent(s′, D)}
39

Compact Representation of
Version Space
Version space = <G, S> = {h ∈ H | ∃g ∈ G, ∃s ∈ S: g ≥g h ≥g s}

[Diagram: the version space lies between the specific boundary S and the general boundary G in the general-to-specific ordering.]
40

Candidate-Elimination Algorithm

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes

S0 = {<∅, ∅, ∅, ∅, ∅, ∅>}
G0 = {<?, ?, ?, ?, ?, ?>}

S1 = {<Sunny, Warm, Normal, Strong, Warm, Same>}
G1 = {<?, ?, ?, ?, ?, ?>}

S2 = {<Sunny, Warm, ?, Strong, Warm, Same>}
G2 = {<?, ?, ?, ?, ?, ?>}

S3 = {<Sunny, Warm, ?, Strong, Warm, Same>}
G3 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}

S4 = {<Sunny, Warm, ?, Strong, ?, ?>}
G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

41

Candidate-Elimination Algorithm
S4 = {<Sunny, Warm, ?, Strong, ?, ?>}

<Sunny, ?, ?, Strong, ?, ?>

<Sunny, Warm, ?, ?, ?, ?>

<?, Warm, ?, Strong, ?, ?>

G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

42

Candidate-Elimination Algorithm
Initialize G to the set of maximally general hypotheses
in H
Initialize S to the set of maximally specific hypotheses
in H

43

Candidate-Elimination Algorithm
For each positive example d:
  Remove from G any hypothesis inconsistent with d
  For each s in S that is inconsistent with d:
    Remove s from S
    Add to S all least generalizations h of s such that h is consistent with d and some hypothesis in G is more general than h
  Remove from S any hypothesis that is more general than another hypothesis in S

44

Candidate-Elimination Algorithm
For each negative example d:
  Remove from S any hypothesis inconsistent with d
  For each g in G that is inconsistent with d:
    Remove g from G
    Add to G all least specializations h of g such that h is consistent with d and some hypothesis in S is more specific than h
  Remove from G any hypothesis that is more specific than another hypothesis in G
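
A compact sketch of the whole algorithm for the attribute-constraint representation used above ('?' = any value, a value, None = the empty constraint ∅). The attribute domains are assumptions (values from the table, plus Cloudy as the third Sky value, as in Mitchell's textbook); names and structure are mine, not the textbook's code.

from itertools import chain

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general_or_equal(hj, hk):
    if any(c is None for c in hk):            # hk covers no instance at all
        return True
    return all(cj == '?' or cj == ck for cj, ck in zip(hj, hk))

def min_generalization(s, x):
    """Least generalization of s that covers the positive instance x."""
    new = list(s)
    for i, (c, v) in enumerate(zip(s, x)):
        if c is None:
            new[i] = v
        elif c != '?' and c != v:
            new[i] = '?'
    return tuple(new)

def min_specializations(g, x, domains):
    """Least specializations of g that exclude the negative instance x."""
    out = []
    for i, c in enumerate(g):
        if c == '?':
            out += [g[:i] + (v,) + g[i+1:] for v in domains[i] if v != x[i]]
    return out

def candidate_elimination(examples, domains):
    S = [(None,) * len(domains)]              # most specific boundary
    G = [('?',) * len(domains)]               # most general boundary
    for x, label in examples:
        if label == 'Yes':
            G = [g for g in G if matches(g, x)]
            S = [min_generalization(s, x) for s in S]
            S = [s for s in S
                 if matches(s, x) and any(more_general_or_equal(g, s) for g in G)]
            S = [s for s in S
                 if not any(t != s and more_general_or_equal(s, t) for t in S)]
        else:
            S = [s for s in S if not matches(s, x)]
            G = list(chain.from_iterable(
                [g] if not matches(g, x)
                else [h for h in min_specializations(g, x, domains)
                      if any(more_general_or_equal(h, s) for s in S)]
                for g in G))
            G = [g for g in G
                 if not any(h != g and more_general_or_equal(h, g) for h in G)]
    return S, G

# Assumed domains and the EnjoySport training data from the example above.
DOMAINS = [('Sunny', 'Rainy', 'Cloudy'), ('Warm', 'Cold'), ('Normal', 'High'),
           ('Strong', 'Weak'), ('Warm', 'Cool'), ('Same', 'Change')]
D = [(('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'),   'Yes'),
     (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'),   'Yes'),
     (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), 'No'),
     (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), 'Yes')]
S, G = candidate_elimination(D, DOMAINS)
print(S)   # expected: [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(G)   # expected: [('Sunny','?','?','?','?','?'), ('?','Warm','?','?','?','?')]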

45

Candidate-Elimination Algorithm
The version space will converge toward the correct target concept if:

H contains the correct target concept

There are no errors in the training examples

The training instance to be requested next should discriminate among the alternative hypotheses in the current version space.

46

Candidate-Elimination Algorithm
A partially learned concept can be used to classify new instances using the majority rule.

S4 = {<Sunny, Warm, ?, Strong, ?, ?>}

<Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
5        Rainy  Warm     High      Strong  Cool   Same      ?
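
A small sketch of that majority vote over the six hypotheses of this version space, applied to instance 5 above; the vote count is what the code computes, not something stated on the slide.

# Sketch: classify a new instance by a majority vote of the version space.

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

version_space = [
    ('Sunny', 'Warm', '?', 'Strong', '?', '?'),   # S4
    ('Sunny', '?', '?', 'Strong', '?', '?'),
    ('Sunny', 'Warm', '?', '?', '?', '?'),
    ('?', 'Warm', '?', 'Strong', '?', '?'),
    ('Sunny', '?', '?', '?', '?', '?'),           # G4
    ('?', 'Warm', '?', '?', '?', '?'),            # G4
]

x = ('Rainy', 'Warm', 'High', 'Strong', 'Cool', 'Same')   # instance 5
yes_votes = sum(matches(h, x) for h in version_space)
no_votes = len(version_space) - yes_votes
print('Yes' if yes_votes > no_votes else 'No', (yes_votes, no_votes))   # No (2, 4)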
47

Inductive Bias
Size of the instance space: |X| = 3·2·2·2·2·2 = 96
Number of possible concepts: |C| = 2^|X| = 2^96
Size of the hypothesis space: |H| = 4·3·3·3·3·3 + 1 = 973 ≪ 2^96

48

Inductive Bias
Size of the instance space: |X| = 3·2·2·2·2·2 = 96
Number of possible concepts: |C| = 2^|X| = 2^96
Size of the hypothesis space: |H| = 4·3·3·3·3·3 + 1 = 973 ≪ 2^96
⇒ a biased hypothesis space

49

Inductive Bias
An unbiased hypothesis space H can represent every subset of the instance space X, e.g. propositional logic sentences over the instances.

Positive examples: x1, x2, x3
Negative examples: x4, x5

h(x) = 1 ⟺ (x = x1) ∨ (x = x2) ∨ (x = x3), abbreviated x1 ∨ x2 ∨ x3
h(x) = 1 ⟺ (x ≠ x4) ∧ (x ≠ x5), abbreviated ¬x4 ∧ ¬x5

50

Inductive Bias
S boundary: x1 ∨ x2 ∨ x3
An intermediate hypothesis: x1 ∨ x2 ∨ x3 ∨ x6
G boundary: ¬x4 ∧ ¬x5

Any new instance x is classified positive by half of the version space and negative by the other half ⇒ not classifiable.

51

Inductive Bias
Example

[Table: ten examples with attributes Day (Mon, Tue, Wed, Thu, Sat, Sun), Actor (Famous, Infamous), and Price (Expensive, Moderate, Cheap), each labeled EasyTicket = Yes or No, e.g. (Mon, Famous, Expensive) → Yes and (Sun, Infamous, Cheap) → No; the EasyTicket value of the last instance is to be predicted (?).]
52

Inductive Bias
Example

Quality  Price  Buy
Good     Low    Yes
Bad      High   No
Good     High   ?
Bad      Low    ?
53

Inductive Bias
A learner that makes no prior assumptions regarding
the identity of the target concept cannot classify any
unseen instances.

54

Homework
Exercises

2.1 – 2.5 (Chapter 2, ML textbook)

55
