
Machine Learning

Lecture No. 2
VERSION SPACE
Instructor:
Prof. Dr. M Sultan Zia
Concept Learning by Induction
• Learning has been classified into several types: deductive, inductive, analytical, etc.

• Much of human learning involves acquiring general concepts from specific training
examples (this is called inductive learning)

• Example: Concept of ball


* red, round, small
* green, round, large
* red, round, medium

• More complicated concepts: “situations in which I should study more to pass the exam”
• Each concept can be thought of as a Boolean-valued function whose value is true for
some inputs and false for all the rest
(e.g. a function defined over all the animals, whose value is true for birds and
false for all the other animals)
• This lecture is about the problem of automatically inferring the general definition of
some concept, given examples labeled as members or nonmembers of the concept.
This task is called concept learning, or approximating (inferring) a Boolean-valued
function from examples

• Task:
– Learn (to imitate) a function f: X → Y (i.e. given x, predict y)
• Experience:
– Learning algorithm is given the correct value of the function for particular inputs → training examples (see table above)
– An example is a pair (x, y), where x is the input and y=f(x) is the output of the function
applied to x.
• Performance Measure:
– Find a function h: X → Y that predicts the same y as f: X → Y as often as possible.
Instance Space X: Set of all possible objects described by attributes.

Target Function 𝒇 (hidden): Maps each instance 𝑥 ∈ 𝑋 to target label 𝑦 ∈ 𝑌.

Hypothesis 𝒉: Function that approximates 𝑓.

Hypothesis Space 𝑯: Set of functions we consider for approximating 𝑓.

Training Data 𝑺: Sample of instances/examples labeled with target function 𝑓.


Concept Learning by Induction: Hypothesis Representation
• The possible concepts are called Hypotheses and we need an appropriate
representation for the hypotheses
• Let the hypothesis be a conjunction of constraints on the attribute-values

• For each attribute, the hypothesis will have one of:
?       any value is acceptable
value   only that single specific value is acceptable
∅       no value is acceptable

• If (sky = sunny ∧ temp = warm ∧ humidity = ? ∧ wind = strong ∧ water = ? ∧ forecast = same)
then
Enjoy Sport = Yes
else
Enjoy Sport = No

Alternatively, this can be written as: {sunny, warm, ?, strong, ?, same}
By selecting a hypothesis representation, the space of all hypotheses (that the
program can ever represent and therefore can ever learn) is implicitly defined

In our example, the instance space X contains 3·2·2·2·2·2 = 96 distinct instances

There are 5·4·4·4·4·4 = 5120 syntactically distinct hypotheses. Since every hypothesis containing even one ∅ classifies every instance as negative, the number of semantically distinct hypotheses is 4·3·3·3·3·3 + 1 = 973

Most practical learning tasks involve much larger, sometimes infinite, hypothesis spaces
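These counts can be checked with a few lines of arithmetic (a sketch; the per-attribute value counts 3, 2, 2, 2, 2, 2 are as stated above):

```python
# Number of possible values per attribute (sky has 3, the rest have 2)
values = [3, 2, 2, 2, 2, 2]

instances = 1
syntactic = 1
semantic = 1
for v in values:
    instances *= v        # each attribute takes one of its v values
    syntactic *= v + 2    # a hypothesis may also use '?' or the empty set
    semantic *= v + 1     # '?' or a value; all empty-set hypotheses collapse...
semantic += 1             # ...into the single "always negative" hypothesis

print(instances, syntactic, semantic)  # 96 5120 973
```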
Concept Learning by Induction: Basic Assumption

Once a hypothesis that best fits the training examples is found, we can use it to predict
the class label of new examples

The basic assumption while using this hypothesis is:

Any hypothesis found to approximate the target function well over a sufficiently
large set of training examples will also approximate the target function well over
other unobserved examples (The inductive learning hypothesis)

Concept Learning by Induction: General to Specific Ordering

Consider the three hypotheses h1, h2 and h3 (figure not reproduced)
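The general-to-specific ordering can be made precise: h1 is more general than or equal to h2 iff every instance satisfied by h2 is also satisfied by h1. For conjunctive hypotheses this reduces to a constraint-wise check (a sketch, with None standing in for ∅):

```python
def covers(c1, c2):
    """Constraint c1 is at least as general as constraint c2."""
    return c1 == '?' or c1 == c2 or c2 is None  # None stands in for the empty set

def more_general_or_equal(h1, h2):
    """h1 >=g h2: every instance matched by h2 is also matched by h1."""
    return all(covers(a, b) for a, b in zip(h1, h2))

h1 = ('sunny', '?', '?', 'strong', '?', '?')
h2 = ('sunny', 'warm', '?', 'strong', '?', 'same')
print(more_general_or_equal(h1, h2))  # True: h1 is more general than h2
print(more_general_or_equal(h2, h1))  # False
```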
Find-S Algorithm

How to find a hypothesis consistent with the observed training examples?
- A hypothesis is consistent with the training examples if it correctly classifies these examples

One way is to begin with the most specific possible hypothesis, then generalize it each time it fails to cover a positive training example (i.e. classifies it as negative)

The algorithm based on this method is called Find-S
The nodes shown in the diagram are the possible hypotheses allowed by our
hypothesis representation scheme

Note that our search is guided by the positive examples and we consider only those
hypotheses which are consistent with the positive training examples

The search moves from hypothesis to hypothesis, searching from the most specific to
progressively more general hypotheses

At each step, the hypothesis is generalized only as far as necessary to cover the new
positive example

Therefore, at each stage the hypothesis is the most specific hypothesis consistent with the training examples observed up to this point
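The procedure can be sketched in a few lines. The training data below is assumed for illustration (the classic EnjoySport example set; substitute your own labeled examples):

```python
# EnjoySport training set, assumed for illustration:
# (sky, temp, humidity, wind, water, forecast) -> EnjoySport?
examples = [
    (('sunny', 'warm', 'normal', 'strong', 'warm', 'same'),   True),
    (('sunny', 'warm', 'high',   'strong', 'warm', 'same'),   True),
    (('rainy', 'cold', 'high',   'strong', 'warm', 'change'), False),
    (('sunny', 'warm', 'high',   'strong', 'cool', 'change'), True),
]

def find_s(examples):
    h = None                       # maximally specific: all attributes empty
    for x, positive in examples:
        if not positive:
            continue               # Find-S simply ignores negative examples
        if h is None:
            h = list(x)            # first positive example: adopt its values
        else:
            for i, v in enumerate(x):
                if h[i] != v:
                    h[i] = '?'     # generalize only as far as necessary
    return tuple(h)

print(find_s(examples))  # ('sunny', 'warm', '?', 'strong', '?', '?')
```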
Note that the algorithm simply ignores every negative example
However, since at each step our current hypothesis is maximally specific, it will never
cover (falsely classify) any negative example. In other words, it will always be
consistent with each negative training example
However, the data must be noise-free, and our hypothesis representation should be such
that the true concept can be described by it


Problems with Find-S:

1. Has the learner converged to the true target concept?
2. Why prefer the most specific hypothesis?
3. Are the training examples consistent with each other?
4. What if there are several maximally specific consistent hypotheses?
Definition: Version Space

Version Space is the set of hypotheses consistent with the training examples of a problem

The Find-S algorithm finds one hypothesis present in the Version Space; however, there may be others

The next 2 algorithms will compute the Version Space
List-then-Eliminate Algorithm
This algorithm first initializes the version space to contain all possible hypotheses,
then eliminates any hypothesis found inconsistent with any training example

The version space of candidate hypotheses thus shrinks as more examples are
observed, until ideally just one hypothesis remains that is consistent with all
the observed examples

For the Enjoy Sport data we can list 973 possible hypotheses

Then we can test each hypothesis to see whether it is consistent with our training
data set or not
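For a space this small the listing can actually be carried out. A sketch, with the EnjoySport attribute domains and training examples assumed for illustration:

```python
from itertools import product

# Attribute domains for the EnjoySport task (assumed for illustration)
domains = [('sunny', 'cloudy', 'rainy'), ('warm', 'cold'), ('normal', 'high'),
           ('strong', 'weak'), ('warm', 'cool'), ('same', 'change')]

examples = [
    (('sunny', 'warm', 'normal', 'strong', 'warm', 'same'),   True),
    (('sunny', 'warm', 'high',   'strong', 'warm', 'same'),   True),
    (('rainy', 'cold', 'high',   'strong', 'warm', 'change'), False),
    (('sunny', 'warm', 'high',   'strong', 'cool', 'change'), True),
]

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

# Enumerate every ∅-free hypothesis (the all-negative hypothesis can never be
# consistent once a positive example is seen) and keep the consistent ones.
version_space = [h for h in product(*[d + ('?',) for d in domains])
                 if all(matches(h, x) == y for x, y in examples)]
print(len(version_space))  # 6
```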

For this data we will be left with the following hypotheses

h1 = {Sunny, Warm, ?, Strong, ?, ?}
h2 = {Sunny, ?, ?, Strong, ?, ?}
h3 = {Sunny, Warm, ?, ?, ?, ?}
h4 = {?, Warm, ?, Strong, ?, ?}
h5 = {Sunny, ?, ?, ?, ?, ?}
h6 = {?, Warm, ?, ?, ?, ?}

Note that the Find-S algorithm is able to find only h1

If insufficient data is available to narrow the version space to a single hypothesis,
then the algorithm can output the entire set of hypotheses consistent with the
observed data

It has the advantage that it guarantees to output all the hypotheses consistent with
the training data

Unfortunately, it requires an exhaustive listing of all hypotheses – an unrealistic
requirement for practical problems
Candidate Elimination Algorithm

The Candidate Elimination algorithm, instead of listing all the possible members of
the version space, employs a much more compact representation

The version space is represented by its most general (maximally general) and most
specific (maximally specific) members

These members form the general and specific boundary sets that delimit the
version space. Every other member of the version space lies between these
boundaries


(Worked trace of the Candidate Elimination algorithm – diagrams not reproduced. One partial rule from a restaurant example on these slides: “If Restaurant is Sam’s and Cost is cheap then reaction …”)

S0 = {∅, ∅, ∅, ∅, ∅, ∅}

G0 = {?, ?, ?, ?, ?, ?}
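A compact sketch of the boundary-set updates, starting from S0 and G0 above. This is simplified (the pruning of redundant members inside S and G is omitted), and the EnjoySport domains and examples are again assumed for illustration:

```python
domains = [('sunny', 'cloudy', 'rainy'), ('warm', 'cold'), ('normal', 'high'),
           ('strong', 'weak'), ('warm', 'cool'), ('same', 'change')]

examples = [
    (('sunny', 'warm', 'normal', 'strong', 'warm', 'same'),   True),
    (('sunny', 'warm', 'high',   'strong', 'warm', 'same'),   True),
    (('rainy', 'cold', 'high',   'strong', 'warm', 'change'), False),
    (('sunny', 'warm', 'high',   'strong', 'cool', 'change'), True),
]

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def geq(h1, h2):  # h1 is at least as general as h2 (None = empty set)
    return all(a == '?' or a == b or b is None for a, b in zip(h1, h2))

def candidate_elimination(examples):
    S = [(None,) * 6]   # most specific boundary: all ∅  (S0)
    G = [('?',) * 6]    # most general boundary: all '?' (G0)
    for x, positive in examples:
        if positive:
            G = [g for g in G if matches(g, x)]          # drop inconsistent g
            new_S = []
            for s in S:
                if matches(s, x):
                    new_S.append(s)
                    continue
                # minimal generalization of s that covers x
                s2 = tuple(v if c is None else (c if c == v else '?')
                           for c, v in zip(s, x))
                if any(geq(g, s2) for g in G):           # stay below G
                    new_S.append(s2)
            S = new_S
        else:
            S = [s for s in S if not matches(s, x)]      # drop inconsistent s
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                # minimal specializations of g that exclude x
                for i, c in enumerate(g):
                    if c != '?':
                        continue
                    for v in domains[i]:
                        if v != x[i]:
                            g2 = g[:i] + (v,) + g[i + 1:]
                            if any(geq(g2, s) for s in S):  # stay above S
                                new_G.append(g2)
            G = new_G
    return S, G

S, G = candidate_elimination(examples)
print(S)  # [('sunny', 'warm', '?', 'strong', '?', '?')]
print(G)  # [('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')]
```

Every other member of the version space lies between these two boundaries, as stated above.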
