Lecture: Decision Trees

Introduction
Example
[Figure: example decision tree with a root node that tests Outlook (branches Sunny, Overcast, Rain), a Humidity test (High / Normal), and a Wind test (Strong / Weak).]
Now consider classifying an animal as a cat or a dog from a set of explanatory variables. For example, the first node might ask whether or not the animal likes to play fetch. If the animal does, we follow the edge to the left child node; if not, we follow the edge to the right child node.
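This routing can be sketched in code. The sketch below is not from the lecture; the node structure, feature names, and the sample animal are illustrative assumptions, but it shows how an instance follows edges from the root until it reaches a leaf.

class Leaf:
    """Terminal node that stores a predicted class label."""
    def __init__(self, label):
        self.label = label

class Node:
    """Internal node: tests one yes/no feature and routes the instance."""
    def __init__(self, feature, if_yes, if_no):
        self.feature = feature   # e.g. "plays fetch"
        self.if_yes = if_yes     # child followed when the answer is yes
        self.if_no = if_no       # child followed when the answer is no

def classify(node, animal):
    """Follow edges from the root until a leaf is reached."""
    while isinstance(node, Node):
        node = node.if_yes if animal[node.feature] else node.if_no
    return node.label

# Hypothetical two-level tree for the cat-vs-dog example.
tree = Node("plays fetch",
            if_yes=Node("is grumpy", Leaf("cat"), Leaf("dog")),
            if_no=Leaf("cat"))

print(classify(tree, {"plays fetch": True, "is grumpy": False}))  # -> dog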
Entropy Example:
A single toss of a fair coin has only two outcomes: heads and tails. The probability that the coin will land on heads is 0.5, and the probability that it will land on tails is 0.5, so the entropy of the toss is -(0.5 log2 0.5) - (0.5 log2 0.5) = 1 bit. That is, only one bit is required to represent the two equally probable outcomes, heads and tails.
Entropy
If the coin has the same face on both sides, the variable
representing its outcome has 0 bits of entropy; that is, we are
always certain of the outcome and the variable will never
represent new information.
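Both claims can be checked numerically with the usual definition of entropy, H(X) = -sum p(x) log2 p(x). A minimal sketch, not part of the original slides:

from math import log2

def entropy(probabilities):
    # Shannon entropy in bits: sum of p * log2(1/p) over outcomes with p > 0.
    return sum(p * log2(1 / p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit
print(entropy([1.0, 0.0]))  # coin with the same face on both sides: 0.0 bits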
Tree Construction
Now let's find the explanatory variable that will be most helpful in classifying the animal; that is, the explanatory variable that reduces the entropy the most. We can test the plays fetch explanatory variable and divide the training instances into animals that play fetch and animals that don't. This produces the following two subsets:
The left child node contains a subset of the training data with seven cats and two dogs that do not like to play fetch. The entropy at this node is

H = -(7/9) log2(7/9) - (2/9) log2(2/9) ≈ 0.7642 bits

The right child node contains a subset with one cat and four dogs that do like to play fetch. The entropy at this node is

H = -(1/5) log2(1/5) - (4/5) log2(4/5) ≈ 0.7219 bits
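The same arithmetic, applied to class counts, reproduces the two values above (a small sketch, not from the slides):

from math import log2

def entropy_from_counts(counts):
    # Entropy of a node, from the number of instances of each class it holds.
    total = sum(counts)
    return sum(c / total * log2(total / c) for c in counts if c > 0)

print(round(entropy_from_counts([7, 2]), 4))  # does not play fetch: 0.7642
print(round(entropy_from_counts([1, 4]), 4))  # plays fetch: 0.7219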
We could also divide the instances into animals that prefer cat food and animals that do not, producing another candidate tree.
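The counts for this split can be inferred from the figures quoted below: the child whose animals prefer cat food contains six cats and no dogs, and the other child contains two cats and six dogs. A quick check of the children's entropies, again just a sketch:

from math import log2

def entropy_from_counts(counts):
    total = sum(counts)
    return sum(c / total * log2(total / c) for c in counts if c > 0)

print(entropy_from_counts([6, 0]))             # prefers cat food: 0.0 bits
print(round(entropy_from_counts([2, 6]), 4))   # does not: 0.8113 bits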
Information Gain
Comparing the information gains of all of the candidate tests, the cat food test is still the best, as it yields the greatest information gain.
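The slides' full table is not reproduced here, but the gains for the two tests worked through above can be recomputed from the counts already given. Information gain is the parent's entropy minus the size-weighted average of the children's entropies; the sketch below takes the root to hold all 14 training instances (eight cats and six dogs, as the subset counts above imply).

from math import log2

def entropy_from_counts(counts):
    total = sum(counts)
    return sum(c / total * log2(total / c) for c in counts if c > 0)

def information_gain(parent_counts, children_counts):
    # Parent entropy minus the size-weighted mean of the child entropies.
    n = sum(parent_counts)
    weighted = sum(sum(child) / n * entropy_from_counts(child)
                   for child in children_counts)
    return entropy_from_counts(parent_counts) - weighted

root = [8, 6]  # 8 cats, 6 dogs
print(round(information_gain(root, [[7, 2], [1, 4]]), 4))  # plays fetch: 0.2361
print(round(information_gain(root, [[6, 0], [2, 6]]), 4))  # cat food:   0.5216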
Note that one of the child nodes has an entropy of 0: all of its instances belong to the same class, so it needs no further splitting.
Next Node
Remember that the other node contains eight samples: two cats and six dogs whose favorite food is not cat food (it may be dog food or bacon).
Plays fetch?
From the animals whose favorite food is not cat food, we form two subsets: those that do not like to play fetch and those that do.

Is grumpy?
From the same animals, we form two subsets: those that are not grumpy and those that are grumpy.
Both of the tests will produce the same subsets and create a leaf
node containing one dog and a leaf node containing two cats.
We will arbitrarily choose to test for animals that like dog food.
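Putting the pieces together, the greedy procedure the lecture has been walking through (compute the information gain of every candidate test, split on the best one, and recurse until a node is pure or no tests remain, breaking ties arbitrarily) can be sketched as below. The feature names and rows are illustrative toy data, not the lecture's training table.

from math import log2
from collections import Counter

def entropy(labels):
    total = len(labels)
    return sum(c / total * log2(total / c) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    # Gain of splitting on a yes/no feature.
    parent = entropy(labels)
    weighted = 0.0
    for value in (True, False):
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        if subset:
            weighted += len(subset) / len(labels) * entropy(subset)
    return parent - weighted

def build_tree(rows, labels, features):
    # ID3-style greedy construction; ties are broken by list order.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]   # leaf: majority class
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    children = {}
    for value in (True, False):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        if keep:
            children[value] = build_tree([rows[i] for i in keep],
                                         [labels[i] for i in keep],
                                         [f for f in features if f != best])
        else:
            children[value] = Counter(labels).most_common(1)[0][0]
    return {"test": best, "children": children}

# Illustrative toy data.
rows = [{"plays fetch": False, "likes cat food": True},
        {"plays fetch": True,  "likes cat food": False},
        {"plays fetch": False, "likes cat food": True},
        {"plays fetch": True,  "likes cat food": False}]
labels = ["cat", "dog", "cat", "dog"]
print(build_tree(rows, labels, ["plays fetch", "likes cat food"]))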
C4.5 can also prune trees. Pruning reduces the size of a tree by replacing branches that classify few instances with leaf nodes.
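In practice these algorithms are rarely implemented by hand. As one hedged example, scikit-learn's DecisionTreeClassifier (which implements an optimized CART-style algorithm rather than C4.5) can grow a tree with the entropy criterion and prune it through cost-complexity pruning via the ccp_alpha parameter; the toy data below is illustrative only.

from sklearn.tree import DecisionTreeClassifier, export_text

# Toy yes/no features: [plays fetch, is grumpy, favorite food is cat food]
X = [[0, 1, 1], [1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
y = ["cat", "dog", "cat", "dog", "cat", "dog"]

clf = DecisionTreeClassifier(criterion="entropy",  # split by entropy reduction
                             ccp_alpha=0.01)       # prune small, weak branches
clf.fit(X, y)
print(export_text(clf, feature_names=["plays fetch", "is grumpy", "cat food"]))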
Performance Metrics
1. Overfitting