
Decision Tree Learning

Tamás Horváth
University of Bonn and Fraunhofer IAIS

Slides: partly taken from Stefan Wrobel

[Figure: a decision tree for points on a 4×4 grid, testing y < 1.5, then x < 3.5, x < 1.5, and y < 3.5, each with yes/no branches.]
Predictive Learning: Function Approximation (recap)

© T.Horváth – ILAS Machine Learning, WS16/17

Error Measures


Special Case: Concept Learning


Top-Down Decision Tree Induction (TDIDT)

 CLS [Hunt et al. 66] - the ancestor
 ID3 [Quinlan 79], C4.5, C5.0 [Quinlan 93, 99], J48 [Witten et al. 00]
   ID3: no numeric attributes
   C4.5: fully developed, most popular and widely used
   successor C5.0: ensembles
 CART [Breiman et al. 84]: regression trees
 OC1 [Murthy, Kasif, Salzberg 94]
   linear decision functions in nodes
 SLIQ, SPRINT [Agrawal, Mehta, Rissanen, Shafer 96a, 96b]
   scalable and parallel variants
 VFDT [Domingos, Hulten 00]: sampling
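All of the variants listed above share the same recursive core: pick a test, partition the examples, recurse on each partition. A minimal ID3-style sketch in Python (function and attribute names are illustrative, not taken from any of the systems listed):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(examples, attributes):
    """Attribute whose split maximizes information gain."""
    def gain(a):
        n = len(examples)
        remainder = 0.0
        for v in {ex[a] for ex in examples}:
            part = [ex for ex in examples if ex[a] == v]
            remainder += len(part) / n * entropy([ex["label"] for ex in part])
        return entropy([ex["label"] for ex in examples]) - remainder
    return max(attributes, key=gain)

def tdidt(examples, attributes):
    """Grow a tree top-down; returns a class label or an (attribute, branches) node."""
    labels = [ex["label"] for ex in examples]
    if len(set(labels)) == 1:          # pure node -> leaf
        return labels[0]
    if not attributes:                 # no tests left -> majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(examples, attributes)
    branches = {}
    for v in {ex[a] for ex in examples}:
        part = [ex for ex in examples if ex[a] == v]
        branches[v] = tdidt(part, [b for b in attributes if b != a])
    return (a, branches)
```

On a tiny weather-style toy sample this grows the expected two-level tree, splitting first on the attribute with the higher gain.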


Hypothesis Language: Decision Trees


Example: PlayTennis (Discretized)


Decision Tree for PlayTennis

Outlook
 = Sunny:    Humidity
              = High:   No
              = Normal: Yes
 = Overcast: Yes
 = Rain:     Wind
              = Strong: No
              = Weak:   Yes
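Read mechanically, the tree above is just a chain of attribute tests ending in a class label. A small sketch of how it classifies a day (the nested-dict encoding is my own, not from the slides):

```python
# The PlayTennis tree from the figure: inner nodes are
# (attribute, {value: subtree}) tuples, leaves are class labels.
TREE = ("Outlook", {
    "Sunny":    ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain":     ("Wind", {"Strong": "No", "Weak": "Yes"}),
})

def classify(tree, example):
    """Walk from the root to a leaf, following the example's attribute values."""
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree

# A humid sunny day is classified as "No":
print(classify(TREE, {"Outlook": "Sunny", "Humidity": "High"}))  # -> No
```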
Example (Kirsten/Wrobel/Dahmen 98):
Predicting Suitability of Locations for a Certain Plant Species

Location: Anderer Berg, Nauheim
Coordinates: L4, Blatt 98
Humidity of Soil: hoch (high)
Acidity: neutral
Average Temperature: 9.2 °C
...
Species Found: Rotbuche, Gewöhnliche Esche, Waldgeißblatt, ...

Several hundreds of thousands of such recordings have been made; ca. 14,000 were used in the study.

E.g., predict at which locations “Rotbuche” (copper beech) can grow.
A Decision Tree

Humidity
 = trocken (dry):   Acidity
                     = low:     Temp (≤ 3.5: N,  > 3.5: G)
                     = neutral: Temp (≤ 7.5: G,  > 7.5: N)
                     = high:    G
 = feucht (moist):  Temp (≤ 9: N,  > 9: G)


Outline
 TDIDT algorithm: example (Quinlan, 1986)
 information gain
 expressivity of decision trees
 complexity of constructing optimal decision trees
 overfitting
 summary


Top-Down Induction of Decision Trees


Remark

[Figure: a partially grown tree: an Outlook node with branches Sunny, Overcast, and Rain; only the Overcast branch carries a label, Yes.]


Which Attribute is Best?

A1 = ?  on [29+, 35-]:   T: [21+, 5-]    F: [8+, 30-]
A2 = ?  on [29+, 35-]:   T: [18+, 33-]   F: [11+, 2-]
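Which test is better can be settled numerically with information gain, introduced later in the deck. As a sketch (helper names are mine), applied to the counts above:

```python
import math

def entropy(pos, neg):
    """Binary Shannon entropy of a (pos, neg) class count, in bits."""
    total = pos + neg
    return -sum((c / total) * math.log2(c / total) for c in (pos, neg) if c)

def gain(parent, children):
    """Information gain of splitting `parent` (pos, neg) into `children`."""
    n = sum(p + q for p, q in children)
    remainder = sum((p + q) / n * entropy(p, q) for p, q in children)
    return entropy(*parent) - remainder

g1 = gain((29, 35), [(21, 5), (8, 30)])   # A1: T / F branches
g2 = gain((29, 35), [(18, 33), (11, 2)])  # A2: T / F branches
print(round(g1, 3), round(g2, 3))  # -> 0.266 0.121
```

A1 reduces uncertainty by about 0.27 bits versus about 0.12 bits for A2, so A1 is the better test.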




Algorithm: Example

Candidate splits of the full sample (9+, 5-):

Outlook:      Sunny (2+, 3-)    Overcast (4+, 0-)   Rain (3+, 2-)
Temperature:  Hot (2+, 2-)      Mild (4+, 2-)       Cool (3+, 1-)
Humidity:     High (3+, 4-)     Normal (6+, 1-)
Wind:         Weak (6+, 2-)     Strong (3+, 3-)
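Plugging the four candidate splits above into the information-gain computation (a sketch with helper names of my own) confirms the choice of Outlook as the root:

```python
import math

def entropy(pos, neg):
    """Binary Shannon entropy of a (pos, neg) class count, in bits."""
    total = pos + neg
    return -sum((c / total) * math.log2(c / total) for c in (pos, neg) if c)

def gain(parent, children):
    """Information gain of splitting `parent` (pos, neg) into `children`."""
    n = sum(p + q for p, q in children)
    remainder = sum((p + q) / n * entropy(p, q) for p, q in children)
    return entropy(*parent) - remainder

# (pos, neg) counts per branch, as on the slides:
splits = {
    "Outlook":     [(2, 3), (4, 0), (3, 2)],
    "Temperature": [(2, 2), (4, 2), (3, 1)],
    "Humidity":    [(3, 4), (6, 1)],
    "Wind":        [(6, 2), (3, 3)],
}
gains = {a: gain((9, 5), children) for a, children in splits.items()}
best = max(gains, key=gains.get)
print(best, round(gains[best], 3))  # -> Outlook 0.247
```

Humidity comes second (about 0.152 bits), then Wind (0.048) and Temperature (0.029).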


Algorithm: Example

Outlook
 Sunny:    (2+, 3-)
 Overcast: (4+, 0-)
 Rain:     (3+, 2-)


Algorithm: Example

Outlook
 Sunny:    (2+, 3-)  candidate split Temperature:
             Hot (0+, 2-)   Cool (1+, 0-)   Mild (1+, 1-)
 Overcast: Yes
 Rain:     (3+, 2-)


Algorithm: Example

Outlook
 Sunny:    (2+, 3-)  candidate split Humidity:
             High (0+, 3-)   Normal (2+, 0-)
 Overcast: Yes
 Rain:     (3+, 2-)


Algorithm: Example

Outlook
 = Sunny:    Humidity
              = High:   No
              = Normal: Yes
 = Overcast: Yes
 = Rain:     Wind
              = Weak:   Yes
              = Strong: No


Outline
 TDIDT algorithm: example (Quinlan, 1986)
 information gain
 expressivity of decision trees
 complexity of constructing optimal decision trees
 overfitting
 summary


Information Gain


Shannon Information Content


A Quote from Alfréd Rényi

Alfréd Rényi
1921 – 1970


Shannon Information Content: The Entropy

Claude E. Shannon
1916 – 2001
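As a quick sketch of the quantity named on this slide: the entropy of a distribution p_1, ..., p_n is H = -Σ p_i log2 p_i, measured in bits, with the convention 0 · log 0 = 0.

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum p_i * log2(p_i) in bits (0*log(0) taken as 0)."""
    return -sum(p * math.log2(p) for p in probs if 0 < p < 1)

print(entropy([0.5, 0.5]))            # fair coin -> 1.0 bit
print(entropy([1.0]))                 # certain outcome -> 0 bits
print(round(entropy([0.9, 0.1]), 3))  # biased coin -> 0.469 bits
```

Entropy is maximal for the uniform distribution (e.g. 2 bits for four equally likely outcomes) and zero when one outcome is certain.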


Entropy: Some Remarks


Entropy: Example