Professional Documents
Culture Documents
For a given set of training data examples stored in a .CSV file, implement
and demonstrate the Candidate-Elimination algorithm to output a description of the set of all
hypotheses consistent with the training examples.
The key idea in the Candidate-Elimination algorithm is to output a description of the set of
all hypotheses consistent with the training examples(version space).
The algorithm represents the set of all hypotheses consistent with the observed training
examples. This subset of all hypotheses is called the version space with respect to the
hypothesis space H and the training examples D, because it contains all plausible versions of
the target concept.
A version space can be represented with its general and specific boundary sets.
The Candidate-Elimination algorithm represents the version space by storing only its most
general members G and its most specific members S.
Given only these two sets S and G, it is possible to enumerate all members of a version
space by generating hypotheses that lie between these two sets in general-to-specific partial
ordering over hypotheses. Every member of the version space lies between these boundaries
An Illustrative Example
Example Sky AirTemp Humidity Wind Water Forecast EnjoySport
1 Sunny Warm Normal Strong Warm Same Yes
2 Sunny Warm High Strong Warm Same Yes
3 Rainy Cold High Strong Warm Change No
4 Sunny Warm High Strong Cool Change Yes
Initializing the S boundary set to contain the most specific (least general)
hypothesis
S0 , , , , ,
When the second training example is observed, it has a similar effect of
generalizing S further to S2, leaving G again unchanged i.e., G2 = G1 = G0
Consider the third training example. This negative example reveals that the G
boundary of the version space is overly general, that is, the hypothesis in G
incorrectly predicts that this new example is a positive example.
The hypothesis in the G boundary must therefore be specialized until it correctly
classifies this new negative example
Consider the fourth training example.
This positive example further generalizes the S boundary of the version space. It
also results in removing one member of the G boundary, because this member
fails to cover the new positive example
After processing these four examples, the boundary sets S4 and G4 delimit the version
space of all hypotheses consistent with the set of incrementally observed training
examples.
• If d is a negative example
• Remove from S any hypothesis inconsistent with d
• For each hypothesis g in G that is not consistent with d
• Remove g from G
• Add to G all minimal specializations h of g such that
• h is consistent with d, and some member of S is more specific than h
• Remove from G any hypothesis that is less general than another hypothesis in G
------------------------------------------------------------------------------------------------------------------------
enumerate(x): The enumerate() function takes a collection (e.g. a tuple) and returns it as an
enumerate object. Syntax: enumerate(iterable, start )
import numpy as np
import pandas as pd
data=pd.DataFrame(data=pd.read_csv("enjoysport2.csv"))
concepts = np.array(data.iloc[:,0:-1])
print(concepts)
target = np.array(data.iloc[:,-1])
print(target)
def learn(concepts, target):
specific_h = concepts[0].copy()
print("initialization of specific_h and general_h")
print(specific_h)
general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
print(general_h)
for i, h in enumerate(concepts):
print("For Loop Starts")
if target[i] == "yes":
print("If instance is Positive ")
for x in range(len(specific_h)):
if h[x]!= specific_h[x]:
specific_h[x] ='?'
general_h[x][x] ='?'
if target[i] == "no":
print("If instance is Negative ")
for x in range(len(specific_h)):
if h[x]!= specific_h[x]:
general_h[x][x] = specific_h[x]
else:
general_h[x][x] = '?'
indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
for i in indices:
general_h.remove(['?', '?', '?', '?', '?', '?'])
return specific_h, general_h
Output:
Final Specific_h:
['sunny' 'warm' '?' 'strong' '?' '?']
Final General_h:
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]