
RULE BASED CLASSIFICATION

Model Rules
    A rule-based classifier uses a set of IF-THEN rules, e.g.:
        IF age = youth AND student = yes THEN buys_computer = yes
    The IF part is the rule antecedent (precondition); the THEN part is the rule consequent.
    Assessment of a rule R: coverage and accuracy
        n_covers  = # of tuples covered by R
        n_correct = # of tuples covered by R and correctly classified by it
        coverage(R) = n_covers / |D|        /* D: training data set */
        accuracy(R) = n_correct / n_covers
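The two measures can be computed directly. A minimal sketch in Python, assuming tuples are represented as dicts; the data set and rule below are illustrative, not from the slides:

```python
# A rule is (antecedent dict, (target attribute, predicted value)).
D = [
    {"age": "youth", "student": "yes", "buys_computer": "yes"},
    {"age": "youth", "student": "no",  "buys_computer": "no"},
    {"age": "youth", "student": "yes", "buys_computer": "no"},
    {"age": "old",   "student": "yes", "buys_computer": "yes"},
]

rule = ({"age": "youth", "student": "yes"}, ("buys_computer", "yes"))

def covers(antecedent, tuple_):
    """True if the tuple satisfies every condition in the antecedent."""
    return all(tuple_[attr] == val for attr, val in antecedent.items())

antecedent, (target, label) = rule
covered = [t for t in D if covers(antecedent, t)]     # n_covers  = 2
correct = [t for t in covered if t[target] == label]  # n_correct = 1

coverage = len(covered) / len(D)        # 2/4 = 0.5
accuracy = len(correct) / len(covered)  # 1/2 = 0.5
```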
Rule Triggering
    A rule is triggered when the input tuple X satisfies its antecedent.
    If several rules are triggered, conflict resolution is needed:
        Size ordering: assign the highest priority to the triggered rule with the toughest requirements, i.e., the largest rule-antecedent size (most attribute tests).
        Rule ordering: rules are prioritized beforehand.
            Class-based ordering: rules for the most prevalent class come first, or classes are ordered by misclassification cost.
            Rule-based ordering: rules are ordered by rule-quality measures; the resulting ordered list (decision list) must be processed strictly in order.
    If no rule is triggered, a default rule fires.
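The rule-ordering case can be sketched as a decision list: the rules are tried strictly in order, the first triggered rule fires, and a default rule handles inputs no rule covers. The rules below are illustrative, not taken from the slides:

```python
# Each rule is (antecedent dict, predicted class); order encodes priority.
rules = [
    ({"age": "youth", "student": "yes"}, "yes"),
    ({"age": "youth"}, "no"),
    ({"age": "mid-age"}, "yes"),
]
DEFAULT_CLASS = "no"  # fires when no rule is triggered

def classify(x, rules, default):
    for antecedent, label in rules:
        if all(x.get(a) == v for a, v in antecedent.items()):
            return label  # first match wins: order matters
    return default

print(classify({"age": "youth", "student": "yes"}, rules, DEFAULT_CLASS))  # yes
print(classify({"age": "old"}, rules, DEFAULT_CLASS))                      # no (default)
```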

Rule Extraction from a Decision Tree
    One rule is created for each path from the root to a leaf.
    Example: rule extraction from the buys_computer decision tree:
        IF age = young AND student = no THEN buys_computer = no
        IF age = young AND student = yes THEN buys_computer = yes
        IF age = mid-age THEN buys_computer = yes
        IF age = old AND credit_rating = excellent THEN buys_computer = yes
        IF age = old AND credit_rating = fair THEN buys_computer = no
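The extraction itself is a walk over all root-to-leaf paths. A sketch, assuming a nested-dict tree encoding (internal nodes map an attribute to its branches, leaves hold the class label); the encoding is an assumption for illustration:

```python
# Tree for the buys_computer example above.
tree = {"age": {
    "young":   {"student": {"no": "no", "yes": "yes"}},
    "mid-age": "yes",
    "old":     {"credit_rating": {"excellent": "yes", "fair": "no"}},
}}

def extract_rules(node, conditions=()):
    if not isinstance(node, dict):            # leaf: emit one rule for this path
        ante = " AND ".join(f"{a} = {v}" for a, v in conditions)
        return [f"IF {ante} THEN buys_computer = {node}"]
    (attr, branches), = node.items()          # single attribute test per node
    rules = []
    for value, subtree in branches.items():   # one branch per attribute value
        rules += extract_rules(subtree, conditions + ((attr, value),))
    return rules

for r in extract_rules(tree):
    print(r)
```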
    The number of extracted rules can be very high, so pruning may be required.
    Rule generalization: for a given rule antecedent, any condition that does not improve the estimated accuracy of the rule can be dropped.
    Side effect of pruning: the pruned rules are no longer guaranteed to be mutually exclusive or exhaustive.


C4.5: Class-Based Ordering for Conflict Resolution
    All rules for a single class are grouped together.
    Class rule sets are ranked to minimize false-positive errors.
    The default class is the one that contains the most training tuples not covered by any rule.
Rule Extraction from the Training Data
    Sequential covering algorithms extract rules directly from the training data; associative classification algorithms may also be used.
    Typical sequential covering algorithms: FOIL (First Order Inductive Learner), AQ, CN2, RIPPER.
    Rules are learned sequentially: each rule for a given class Ci should cover many tuples of Ci but none (or few) of the tuples of the other classes.
    Steps:
        Rules are learned one at a time.
        Each time a rule is learned, the tuples it covers are removed.
        The process repeats on the remaining tuples until a termination condition holds, e.g., there are no more training examples, or the quality of the rule returned falls below a user-specified threshold.
Algorithm: Sequential Covering
Input:  D, Att_vals
Output: a set of IF-THEN rules
Method:
    Rule_set = {}
    for each class c do
        repeat
            Rule = Learn_One_Rule(D, Att_vals, c)  // finds the best rule for class c
            remove tuples covered by Rule from D
            Rule_set = Rule_set + Rule             // accumulate every learned rule
        until terminating condition
    end for
    return Rule_set
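The pseudocode above can be sketched as runnable Python. Tuples are dicts with a "class" key, and Learn_One_Rule is deliberately simplified to pick the best single attribute-value test (real learners such as FOIL or RIPPER grow multi-conjunct rules); the data, attribute names, and min_acc threshold are illustrative assumptions:

```python
def learn_one_rule(data, att_vals, c):
    """Simplified Learn_One_Rule: best single attribute-value test for class c,
    ranked by accuracy, then coverage."""
    best_key, best_rule = None, None
    for attr, values in att_vals.items():
        for v in values:
            covered = [t for t in data if t[attr] == v]
            if not covered:
                continue
            acc = sum(t["class"] == c for t in covered) / len(covered)
            key = (acc, len(covered))
            if best_key is None or key > best_key:
                best_key, best_rule = key, ({attr: v}, c)
    return best_rule

def sequential_covering(D, att_vals, classes, min_acc=0.6):
    rule_set, data = [], list(D)
    for c in classes:
        while True:
            rule = learn_one_rule(data, att_vals, c)
            if rule is None:
                break                              # no coverable tuples remain
            antecedent, label = rule
            covered = [t for t in data
                       if all(t[a] == v for a, v in antecedent.items())]
            acc = sum(t["class"] == label for t in covered) / len(covered)
            if acc < min_acc:
                break                              # rule quality below threshold
            rule_set.append(rule)
            data = [t for t in data if t not in covered]  # remove covered tuples
    return rule_set

D = [
    {"income": "high", "loan_term": "short", "class": "accept"},
    {"income": "high", "loan_term": "long",  "class": "accept"},
    {"income": "low",  "loan_term": "short", "class": "reject"},
    {"income": "low",  "loan_term": "long",  "class": "reject"},
]
att_vals = {"income": ["high", "low"], "loan_term": ["short", "long"]}
rules = sequential_covering(D, att_vals, ["accept", "reject"])
```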

Learn_One_Rule:
    Start with the most general rule possible: an empty condition.
    Add new attribute tests by adopting a greedy depth-first (general-to-specific) strategy: at each step, pick the test that most improves rule quality.
    Example:
        Start with IF _ THEN loan_decision = accept
        Consider IF loan_term = short THEN ... / IF loan_term = long THEN ... / IF income = high THEN ... / IF income = medium THEN ...
        If the best one is IF income = high THEN loan_decision = accept, expand it further.
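The greedy general-to-specific search can be sketched as follows. The loan data, attribute names, and the accuracy-only quality measure are illustrative assumptions (FOIL actually ranks candidate extensions by FOIL-gain):

```python
D = [
    {"income": "high",   "loan_term": "short", "decision": "accept"},
    {"income": "high",   "loan_term": "long",  "decision": "accept"},
    {"income": "medium", "loan_term": "short", "decision": "accept"},
    {"income": "medium", "loan_term": "long",  "decision": "reject"},
    {"income": "low",    "loan_term": "short", "decision": "reject"},
]

def accuracy(antecedent, target):
    """Accuracy of IF antecedent THEN decision = target on D."""
    covered = [t for t in D if all(t[a] == v for a, v in antecedent.items())]
    if not covered:
        return 0.0
    return sum(t["decision"] == target for t in covered) / len(covered)

def learn_one_rule(att_vals, target):
    antecedent = {}                       # most general rule: IF _ THEN target
    best_acc = accuracy(antecedent, target)
    improved = True
    while improved:
        improved = False
        for attr, values in att_vals.items():
            if attr in antecedent:        # each attribute tested at most once
                continue
            for v in values:
                cand = dict(antecedent, **{attr: v})
                acc = accuracy(cand, target)
                if acc > best_acc:        # greedy: keep the best extension seen
                    best_acc, best_cand, improved = acc, cand, True
        if improved:
            antecedent = best_cand
    return antecedent, best_acc

att_vals = {"income": ["high", "medium", "low"], "loan_term": ["short", "long"]}
rule, acc = learn_one_rule(att_vals, "accept")  # IF income = high THEN accept
```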

Rule Quality Measures
    Coverage or accuracy alone is not sufficient; rule-quality measures consider both.
    FOIL-gain (in FOIL and RIPPER) assesses the information gained by extending the rule condition:
        FOIL_Gain = pos' × ( log2( pos' / (pos' + neg') ) − log2( pos / (pos + neg) ) )
    where R is the existing rule, covering pos positive and neg negative tuples, and R' is the extended rule, covering pos' and neg'. It favors rules that have high accuracy and cover many positive tuples.
    Likelihood ratio statistic:
        Likelihood_Ratio = 2 Σ_{i=1}^{m} f_i log( f_i / e_i )
    where f_i is the observed frequency of class i among the tuples covered by the rule and e_i is its expected frequency. The greater this value, the higher the significance of the rule.
    Rule pruning is based on an independent set of pruning tuples:
        FOIL_Prune(R) = (pos − neg) / (pos + neg)
    where pos/neg are the numbers of positive/negative pruning tuples covered by R. If FOIL_Prune is higher for the pruned version of R, prune R.