You are on page 1of 6

MACHINE LEARNING

CLASS 7
Hypothesis
In a binary classification problem, a hypothesis is a statement or a proposition purporting to
explain a given set of facts or observations.

Hypothesis space
The hypothesis space for a binary classification problem is a set of hypotheses for the problem
that might possibly be returned by it.

Consistency and satisfying


Let x be an example in a binary classification problem
c(x) denote the class label or Target Function assigned to x (c(x) is 1 or 0).
D be a set of training examples for the problem.
h be a hypothesis for the problem
h(x) be the class label assigned to x by the hypothesis h.

Therefore………
(a) We say that the hypothesis h is consistent with the set of training examples D if h(x) = c(x) for all x ∈ D.
(b) We say that an example x satisfies the hypothesis h if h(x) = 1.
Consider the set of observations of a variable x with the associated class labels given in Table below:

X 27 15 23 20 25 17 12 30 6 10
Class 1 0 1 1 1 0 0 1 0 0

Figure: Sample data to illustrate the concept of hypotheses

Hollow dots representing negative examples

10 15 20 25 30

6 12 17 23 27

Solid dots representing positive examples


It appears that the class labeling has been done based on the following rule.
h’′ : IF x ≥ 20 THEN “1” ELSE “0”

h′ is consistent with the training examples

h’ (27) = 1; c(27) = 1; h’(27) = c(27) for x = 5 and x = 28 (not in training data),


h′(5) = 0; h′(28) = 1
h’(15) = 0; c(15) = 0; h’(15) = c(15)

The following proposition also explains the data:


h : IF x ≥ 19 THEN “0” ELSE “1”

It is not enough that the hypothesis explains the given data; it must also predict correctly the
class label of future observations. So we consider a set of such hypotheses and choose the
“best” one. The set of hypotheses can be defined using a parameter, say m, as given below:
hm : IF x ≥ m THEN “1” ELSE ”0”.
axis-aligned rectangle

Consider the problem of assigning the label “family car” or “not family car” to cars. For
convenience, we shall replace the label “family car” by “1” and “not family car” by “0”.
Suppose we choose the features “price (’000 $)” and “power (hp)” as the input
representation for the problem. Further, suppose that there is some reason to believe that
for a car to be a family car, its price and power should be in certain ranges. This
supposition can be formulated in the form of the following proposition:
IF (p1 < price < p2) AND (e1 < power < e2) THEN “1” ELSE ”0”

You might also like