
Machine Learning

Bilal Khan
SEARCH IN CONCEPT SPACE


Concept Learning by Induction


• Learning has been classified into several types: deductive, inductive, etc.

• Much of human learning involves acquiring general concepts from specific training examples (this is called inductive learning)



• Example: Concept of ball
* red, round, small
* green, round, small
* red, round, medium



• Each concept can be thought of as a Boolean-valued function whose
value is true for some inputs and false for all the rest
(e.g. a function defined over all the animals, whose value is true for
birds and false for all the other animals)
• This lecture is about the problem of automatically inferring the general definition of some concept, given examples labeled as members or nonmembers of the concept. This task is called concept learning, or approximating (inferring) a Boolean-valued function from examples


• Target Concept to be learnt: “Days on which Ahmed enjoys his favorite water sport”

• Training Examples present (Mitchell, Table 2.1) are:

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
  1      Sunny  Warm     Normal    Strong  Warm   Same      Yes
  2      Sunny  Warm     High      Strong  Warm   Same      Yes
  3      Rainy  Cold     High      Strong  Warm   Change    No
  4      Sunny  Warm     High      Strong  Cool   Change    Yes



• The training examples are described by the values of seven “Attributes”

• The task is to learn to predict the value of the attribute EnjoySport for an arbitrary day, based on the values of its other attributes


Concept (Hypothesis) Representation


Concept Learning by Induction: Hypothesis Representation

• The possible concepts are called Hypotheses and we need an appropriate representation for the hypotheses

• Let the hypothesis be a conjunction of constraints on the attribute-values


• If
    (sky = sunny) ∧ (temp = warm) ∧ (humidity = ?) ∧
    (wind = strong) ∧ (water = ?) ∧ (forecast = same)
  then
    EnjoySport = Yes
  else
    EnjoySport = No

• Alternatively, this can be written as:
    {sunny, warm, ?, strong, ?, same}



• For each attribute, the hypothesis will have either
    ?      Any value is acceptable
    Value  Only that single value is acceptable
    ∅      No value is acceptable


• If some instance (example/observation) satisfies all the constraints of a hypothesis, then it is classified as positive (belonging to the concept)

• The most general hypothesis is {?, ?, ?, ?, ?, ?}
  It would classify every example as a positive example

• The most specific hypothesis is {∅, ∅, ∅, ∅, ∅, ∅}
  It would classify every example as negative
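As a sketch, the matching rule above can be written in a few lines of Python. The attribute values and the use of None to stand in for the ∅ constraint are choices made here for illustration, not part of the lecture:

```python
# A hypothesis is a tuple of constraints: '?' means any value is
# acceptable, a specific string means only that value is acceptable,
# and None stands in for the "no value is acceptable" constraint.
def matches(hypothesis, instance):
    """Classify an instance as positive iff it satisfies every constraint."""
    return all(c == '?' or c == v for c, v in zip(hypothesis, instance))

most_general = ('?', '?', '?', '?', '?', '?')         # accepts everything
most_specific = (None, None, None, None, None, None)  # rejects everything

day = ('sunny', 'warm', 'normal', 'strong', 'warm', 'same')
print(matches(most_general, day))    # True: every instance is positive
print(matches(most_specific, day))   # False: every instance is negative
```

Because None never equals any attribute value, a single ∅ constraint is enough to make the hypothesis reject every instance, which matches the semantics described above.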


By selecting a hypothesis representation, the space of all hypotheses (that the program can ever represent and therefore can ever learn) is implicitly defined

In our example, the instance space X can contain 3·2·2·2·2·2 = 96 distinct instances


If we use the hypothesis representation described above, there are 5·4·4·4·4·4 = 5120 syntactically distinct hypotheses

Since every hypothesis containing even one ∅ classifies every instance as negative, the number of semantically distinct hypotheses is 4·3·3·3·3·3 + 1 = 973
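These counts can be checked with a short Python calculation; the per-attribute value counts are the ones from the EnjoySport example:

```python
from math import prod

values_per_attribute = [3, 2, 2, 2, 2, 2]   # Sky has 3 values, the rest 2

# Instance space: one choice of value per attribute
instances = prod(values_per_attribute)                     # 96

# Syntactically distinct hypotheses: each attribute may also be
# '?' or the empty constraint, giving (n + 2) choices per attribute
syntactic = prod(n + 2 for n in values_per_attribute)      # 5120

# Semantically distinct: all hypotheses with an empty constraint collapse
# into one "everything negative" hypothesis, so count (n + 1) choices
# per attribute and add 1
semantic = prod(n + 1 for n in values_per_attribute) + 1   # 973

print(instances, syntactic, semantic)
```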

Most practical learning tasks involve much larger, sometimes infinite, hypothesis spaces


Search in Concept (Hypothesis) Space


Concept Learning by Induction: Search in Hypotheses Space

Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation

The goal of this search is to find the hypothesis that best fits the training examples


Concept Learning by Induction: Basic Assumption

Once a hypothesis that best fits the training examples is found, we can use it to predict the class label of new examples

The basic assumption while using this hypothesis is:

Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples


Concept Learning by Induction: General to Specific Ordering

If we view learning as a search problem, then it is natural that our study of learning algorithms will examine different strategies for searching the hypothesis space

Many algorithms for concept learning organize the search through the hypothesis space by relying on a general-to-specific ordering of hypotheses


Example:
Consider h1 = {sunny, ?, ?, strong, ?, ?}
         h2 = {sunny, ?, ?, ?, ?, ?}
Any instance classified positive by h1 will also be classified positive by h2 (because h2 imposes fewer constraints on the instance)
Hence h2 is more general than h1, and h1 is more specific than h2
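A sketch of this “more general than or equal to” test in Python, valid under the simplifying assumption that neither hypothesis contains an empty constraint:

```python
def more_general_or_equal(hg, hs):
    """True if every instance accepted by hs is also accepted by hg
    (assumes neither hypothesis uses the empty constraint)."""
    return all(g == '?' or g == s for g, s in zip(hg, hs))

h1 = ('sunny', '?', '?', 'strong', '?', '?')
h2 = ('sunny', '?', '?', '?', '?', '?')

print(more_general_or_equal(h2, h1))   # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))   # False
```

Each constraint of the more general hypothesis must be either ‘?’ or identical to the corresponding constraint of the more specific one; otherwise some instance accepted by hs would be rejected by hg.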


Consider the three hypotheses (Mitchell, Figure 2.1):
h1 = {Sunny, ?, ?, Strong, ?, ?}
h2 = {Sunny, ?, ?, ?, ?, ?}
h3 = {Sunny, ?, ?, ?, Cool, ?}



• Neither h1 nor h3 is more general than the other

• h2 is more general than both h1 and h3

• Note that the “more-general-than” relationship is independent of the target concept. It depends only on which instances satisfy the two hypotheses and not on the classification of those instances according to the target concept


List-then-Eliminate Algorithm


This algorithm first lists all possible hypotheses, then eliminates any hypothesis found inconsistent with any training example

The set of candidate hypotheses thus shrinks as more examples are observed, until only those hypotheses remain that are consistent with all the observed examples



For the EnjoySport data we can list 973 possible hypotheses

Then we can test each hypothesis to see whether it conforms to our training data set or not


For this data we will be left with the following hypotheses

h1 = {Sunny, Warm, ?, Strong, ?, ?}
h2 = {Sunny, ?, ?, Strong, ?, ?}
h3 = {Sunny, Warm, ?, ?, ?, ?}
h4 = {?, Warm, ?, Strong, ?, ?}
h5 = {Sunny, ?, ?, ?, ?, ?}
h6 = {?, Warm, ?, ?, ?, ?}
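A runnable sketch of List-then-Eliminate in Python. The four training examples are assumed to be Mitchell's EnjoySport data, since the table itself appears only as an image in the original slides:

```python
from itertools import product

def matches(h, x):
    """An instance satisfies a hypothesis if every constraint is '?' or equal."""
    return all(c == '?' or c == v for c, v in zip(h, x))

def list_then_eliminate(examples, attribute_values):
    # Start with every semantically distinct hypothesis (the single
    # all-empty hypothesis is omitted, since a positive example exists)
    candidates = list(product(*[vals + ['?'] for vals in attribute_values]))
    # Eliminate any hypothesis inconsistent with some training example
    for x, label in examples:
        candidates = [h for h in candidates if matches(h, x) == label]
    return candidates

attribute_values = [
    ['Sunny', 'Cloudy', 'Rainy'], ['Warm', 'Cold'], ['Normal', 'High'],
    ['Strong', 'Weak'], ['Warm', 'Cool'], ['Same', 'Change'],
]
examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), True),
]
for h in list_then_eliminate(examples, attribute_values):
    print(h)   # prints six surviving hypotheses
```

Starting from 972 candidate hypotheses, the four examples eliminate all but six, matching the h1–h6 listed in the slide.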


If insufficient data is available to narrow the set of hypotheses down to a single consistent hypothesis, then the algorithm can output the entire set of hypotheses consistent with the observed data

It has the advantage that it is guaranteed to output all the hypotheses consistent with the training data

Unfortunately, it requires an exhaustive listing of all hypotheses, an unrealistic requirement for practical problems

Find-S Algorithm


How can we find a hypothesis consistent with the observed training examples?
- A hypothesis is consistent with the training examples if it correctly classifies these examples

A positive training example is an example of the concept to be learnt

Similarly, a negative training example is not an example of the concept


We say that a hypothesis covers a positive training example if it correctly classifies the example as positive

One way is to begin with the most specific possible hypothesis, then generalize it each time it fails to cover a positive training example (i.e. classifies it as negative)

The algorithm based on this method is called Find-S
It finds a maximally specific hypothesis
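Find-S can be sketched in Python as follows. Again, the training data is assumed to be Mitchell's four EnjoySport examples, and None stands in for the ∅ constraint:

```python
def find_s(examples):
    """Return the maximally specific hypothesis consistent with the
    positive training examples."""
    n = len(examples[0][0])
    h = [None] * n                      # start with the most specific hypothesis
    for instance, positive in examples:
        if not positive:
            continue                    # negative examples are ignored
        for i, value in enumerate(instance):
            if h[i] is None:
                h[i] = value            # first positive example: copy its values
            elif h[i] != value:
                h[i] = '?'              # generalize only as far as necessary
    return tuple(h)

examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), True),
]
print(find_s(examples))   # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

On this data the result is the most specific member of the six consistent hypotheses found by List-then-Eliminate.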


The nodes shown in the diagram are the possible hypotheses allowed by our hypothesis representation scheme

Note that our search is guided by the positive examples and we consider only those hypotheses which are consistent with the positive training examples

The search moves from hypothesis to hypothesis, searching from the most specific to progressively more general hypotheses

At each step, the hypothesis is generalized only as far as necessary to cover the new positive example

Therefore, at each stage the hypothesis is the most specific hypothesis consistent with the training examples observed up to this point

Hence, it is called Find-S


Note that the algorithm does not get any help from the negative examples

However, since at each step our current hypothesis is maximally specific, it will never cover (falsely classify as positive) any negative example. In other words, it will always be consistent with each negative training example

However, the data must be noise-free and our hypothesis representation should be such that the true concept can be described by it

Problems with Find-S:

1. No way of knowing if there are other possible target concepts
2. Why prefer the most specific hypothesis?
3. If the training examples are not consistent, the algorithm may fail
4. What if there are several maximally specific consistent hypotheses?


Definition: Version Space

The Version Space is the set of hypotheses consistent with the training examples of a problem

The Find-S algorithm finds one hypothesis present in the Version Space; however, there may be other consistent hypotheses


References

Sections 2.1–2.3 and 2.5.2 of T. Mitchell, Machine Learning, McGraw-Hill, 1997
