
ID3 Decision Tree

APPLIED DATA SCIENCE WITH PYTHON



Decision Tree Algorithm

❖ The decision tree is one of the most widely used and practical methods for supervised learning.
❖ Decision trees are a non-parametric supervised learning method used for both classification and regression tasks.
❖ The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
❖ The decision rules are generally in the form of if-then-else statements (see the sketch below).
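As a quick illustration in Python, the following minimal sketch (assuming scikit-learn is available) fits a small decision tree and prints its learned rules as nested if-then-else statements; the toy data and the criterion="entropy" setting are illustrative assumptions, not part of the slides.

from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: two binary features and a binary target (illustrative only)
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# criterion="entropy" makes the splits follow the ID3-style information gain idea
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

# Print the learned tree as nested if-then-else rules
print(export_text(clf, feature_names=["f1", "f2"]))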



ID3 Decision Tree Algorithm

❖ ID3 is one of the most common decision tree algorithms. It was introduced by Ross Quinlan in 1986, and its name is an acronym for Iterative Dichotomiser 3.
❖ The ID3 algorithm begins with the original set as the root node.
❖ On each iteration, the algorithm goes through every unused attribute of the set and calculates the entropy (or, equivalently, the information gain) of that attribute.
❖ It then selects the attribute with the largest information gain.
❖ The set is then split or partitioned by the selected attribute to produce subsets of the data.
❖ Information gain is a statistical property that measures how well a given attribute separates the training examples according to their target classification.



ID3 ALGORITHM
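The ID3 pseudocode appears on this slide as a figure. The following is a minimal Python sketch of the same procedure; it assumes the training data is a list of dictionaries mapping attribute names to categorical values, and the names entropy, information_gain and id3 are illustrative, not taken from the slides.

import math
from collections import Counter

def entropy(labels):
    # H(S) = -sum over classes of p_i * log2(p_i)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    # Gain = H(S) minus the weighted average entropy of the subsets produced by 'attribute'
    total = len(rows)
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset)
    return base - remainder

def id3(rows, attributes, target):
    labels = [r[target] for r in rows]
    # Stop when the node is pure or no attributes remain; predict the majority class
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Select the attribute with the largest information gain
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    tree = {best: {}}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)
    return tree

# Example call (rows would be the 14-day dataset shown on the next slides):
# tree = id3(rows, ["Outlook", "Temperature", "Humidity", "Windy"], "Play")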



Last 14 days' report from the Meteorological Dept.
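The 14-day table itself appears as an image on this slide. The worked example follows the standard 14-row weather ("play") dataset; the pandas reconstruction below is an assumption, but its yes/no counts for every attribute value match the entropy and information-gain figures computed on the following slides.

import pandas as pd

# Assumed reconstruction of the 14-day report (counts match the later slides)
df = pd.DataFrame({
    "Outlook":     ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
                    "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
    "Temperature": ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool",
                    "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"],
    "Humidity":    ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                    "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
    "Windy":       ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
                    "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
    "Play":        ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                    "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"],
})
print(df["Play"].value_counts())   # 9 Yes, 5 No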



Description of Each Attribute

Decision tree for the concept Play Cricket (figure)
Entropy of Class Label
Since the basic version of the ID3 algorithm deals with the case where classifications are either positive or negative, we can define entropy as:

H(S) = - Σᵢ P(xᵢ) * log2(P(xᵢ))

Suppose S is the class label Play, containing 14 Boolean examples, with 9 positive and 5 negative examples.

Then, the entropy of S relative to this boolean classification is:

H(S) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no)) = - (9/14) * log2(9/14) - (5/14) * log2(5/14)


= 0.41 + 0.53 = 0.94
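The same number can be checked with a few lines of Python (standard library only):

import math

# Entropy of the class label Play: 9 "yes" and 5 "no" out of 14 examples
p_yes, p_no = 9 / 14, 5 / 14
H_S = -p_yes * math.log2(p_yes) - p_no * math.log2(p_no)
print(round(H_S, 2))   # 0.94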



First Attribute: Outlook
Categorical values - sunny, overcast and rain
H(Outlook=sunny) = -(2/5)*log2(2/5) - (3/5)*log2(3/5) = 0.971
H(Outlook=rain) = -(3/5)*log2(3/5)-(2/5)*log2 (2/5) =0.971

H(Outlook=overcast) = -(4/4)*log2(4/4)-0 = 0

Average Entropy Information for Outlook - I(Outlook)


= p(sunny) * H(Outlook=sunny) + p(rain) * H(Outlook=rain) + p(overcast) * H(Outlook=overcast)
= (5/14)*0.971 + (5/14)*0.971 + (4/14)*0= 0.693

Information Gain = H(S) - I(Outlook) = 0.94 - 0.693 = 0.247
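The Outlook calculation in Python, as a minimal sketch that takes the per-value yes/no counts from the table (the entropy helper name is illustrative):

import math

def entropy(pos, neg):
    # Entropy of a subset with 'pos' positive and 'neg' negative examples
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

# (yes, no) counts for each Outlook value, read from the table
counts = {"sunny": (2, 3), "rain": (3, 2), "overcast": (4, 0)}
total = sum(p + n for p, n in counts.values())   # 14 examples

I_outlook = sum(((p + n) / total) * entropy(p, n) for p, n in counts.values())
gain = entropy(9, 5) - I_outlook
print(round(I_outlook, 3), round(gain, 3))   # 0.694 and 0.247 (the slide rounds the first value to 0.693)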



Second Attribute: Temperature

Categorical values - hot, mild, cool

H(Temperature=hot) = -(2/4)*log2 (2/4)-(2/4)*log2 (2/4) = 1


H(Temperature=cool) = -(3/4)*log2 (3/4)-(1/4)*log2 (1/4) = 0.811
H(Temperature=mild) = -(4/6)*log2 (4/6)-(2/6)*log2 (2/6) = 0.9179

Average Entropy Information for Temperature - I(Temperature)

= p(hot)*H(Temperature=hot) + p(mild)*H(Temperature=mild) + p(cool)*H(Temperature=cool)


= (4/14)*1 + (6/14)*0.9179 + (4/14)*0.811= 0.9108

Information Gain = H(S) - I(Temperature) = 0.94 - 0.9108 = 0.0292


Third Attribute: Humidity

Categorical values - high, normal

H(Humidity=high) = -(3/7)*log2(3/7) - (4/7)*log2(4/7) = 0.985
H(Humidity=normal) = -(6/7)*log2(6/7) - (1/7)*log2(1/7) = 0.591
Average Entropy Information for Humidity - I(Humidity)
= p(high)*H(Humidity=high) + p(normal)*H(Humidity=normal)
= (7/14)*0.985 + (7/14)*0.591 = 0.788
Information Gain = H(S) - I(Humidity) = 0.94 - 0.788 = 0.152



Fourth Attribute: Windy

Categorical values - weak, strong

H(Windy=weak) = -(6/8)*log2(6/8) - (2/8)*log2(2/8) = 0.811
H(Windy=strong) = -(3/6)*log2(3/6) - (3/6)*log2(3/6) = 1
Average Entropy Information for Windy - I(Windy)
= p(weak)*H(Windy=weak) + p(strong)*H(Windy=strong)
= (8/14)*0.811 + (6/14)*1 = 0.892
Information Gain = H(S) - I(Windy) = 0.94 - 0.892 = 0.048
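Putting the four attribute calculations together, the short pandas sketch below computes every information gain at once; it assumes the DataFrame df from the earlier dataset sketch, and info_gain is an illustrative helper name.

import math

def info_gain(data, attribute, target="Play"):
    # Gain of 'attribute' = H(target) - weighted entropy of the subsets it produces
    def H(series):
        probs = series.value_counts(normalize=True)
        return -sum(p * math.log2(p) for p in probs if p > 0)
    total = len(data)
    remainder = sum((len(group) / total) * H(group[target])
                    for _, group in data.groupby(attribute))
    return H(data[target]) - remainder

for attr in ["Outlook", "Temperature", "Humidity", "Windy"]:
    print(attr, round(info_gain(df, attr), 3))
# Expected (approx.): Outlook 0.247, Temperature 0.029, Humidity 0.152, Windy 0.048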

Here, the attribute with maximum information gain is Outlook, so it becomes the root node of the decision tree built so far.

When Outlook = Overcast, the subset is a pure class (Yes).

Now, we repeat the same procedure on the subset of rows with Outlook = Sunny, and then on the subset with Outlook = Rain.

Repeat the tree construction on each branch with the remaining attributes (a pandas sketch of this step follows).
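In pandas terms, building the Sunny branch means selecting those rows, dropping Outlook, and recomputing the gains on the remaining attributes; the sketch below reuses the df and the info_gain helper from the earlier sketches.

# Sunny branch: keep only the Outlook = Sunny rows; Outlook is no longer a candidate attribute
sunny = df[df["Outlook"] == "Sunny"].drop(columns=["Outlook"])
print(sunny["Play"].value_counts())   # 2 Yes, 3 No

for attr in ["Temperature", "Humidity", "Windy"]:
    print(attr, round(info_gain(sunny, attr), 3))
# Expected (approx.): Temperature 0.571, Humidity 0.971, Windy 0.020

# The Rain branch is built the same way and is used later in the example
rain = df[df["Outlook"] == "Rain"].drop(columns=["Outlook"])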



Play (2 Yes, 3 No)

Complete entropy of the Sunny subset:
H(Sunny) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
= - (2/5) * log2(2/5) - (3/5) * log2(3/5)
= 0.971



First Attribute: (Sunny, Temperature)
Categorical values: Outlook(Sunny), Temperature(hot, mild, cool)
H(Sunny, Temperature=hot) = -0 - (2/2)*log2(2/2) = 0
H(Sunny, Temperature=cool) = -(1)*log2(1) - 0 = 0
H(Sunny, Temperature=mild) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
Average Entropy Information for Temperature - I(Sunny, Temperature)
= p(Sunny, hot)*H(Sunny, Temperature=hot) + p(Sunny, mild)*H(Sunny, Temperature=mild) + p(Sunny, cool)*H(Sunny, Temperature=cool)
= (2/5)*0 + (2/5)*1 + (1/5)*0 = 0.4
Information Gain = H(Sunny) - I(Sunny, Temperature)
= 0.971 - 0.4 = 0.571
Second Attribute: (Sunny, Humidity)

Categorical values: Outlook(Sunny), Humidity(high, normal)


H(Sunny, Humidity=high)
= - 0 - (3/3)* log2(3/3) = 0
H(Sunny, Humidity=normal)
= -(2/2)* log2(2/2)-0 = 0
Average Entropy Information for Humidity - I(Sunny, Humidity)
= p(Sunny, high)*H(Sunny, Humidity=high) + p(Sunny, normal)*H(Sunny, Humidity=normal)
= (3/5)*0 + (2/5)*0 = 0
Information Gain = H(Sunny) - I(Sunny, Humidity)
= 0.971 - 0 = 0.971



Third Attribute: (Sunny, Windy)

Categorical values: Outlook(Sunny), Windy(weak, strong)

H(Sunny, Windy=weak) = -(1/3)*log2(1/3) - (2/3)*log2(2/3) = 0.918
H(Sunny, Windy=strong) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
Average Entropy Information for Windy - I(Sunny, Windy)
= p(Sunny, weak)*H(Sunny, Windy=weak) + p(Sunny, strong)*H(Sunny, Windy=strong)
= (3/5)*0.918 + (2/5)*1 = 0.9508
Information Gain = H(Sunny) - I(Sunny, Windy) = 0.971 - 0.9508 = 0.0202



Attribute               Gain
(Sunny, Temperature)    0.571
(Sunny, Humidity)       0.971
(Sunny, Windy)          0.0202

Here, the attribute with maximum information gain is Humidity.


So, the decision tree built so far:
When Outlook = Sunny and Humidity = High, it is a pure class of category "No".
When Outlook = Sunny and Humidity = Normal, it is again a pure class of category "Yes".
Therefore, no further calculations are needed for the Sunny branch.



Continue with Outlook = Rain

Play (3 Yes, 2 No)


Complete entropy of the Rain subset:
H(Rain) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
= - (3/5) * log2(3/5) - (2/5) * log2(2/5) = 0.971



First Attribute: (Rain, Temperature)
Categorical values: Outlook(Rain), Temperature(mild, cool)
H(Rain, Temperature=cool) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
H(Rain, Temperature=mild) = -(2/3)*log2(2/3) - (1/3)*log2(1/3) = 0.918
Average Entropy Information for Temperature - I(Rain, Temperature)
= p(Rain, mild)*H(Rain, Temperature=mild) + p(Rain, cool)*H(Rain, Temperature=cool)
= (3/5)*0.918 + (2/5)*1 = 0.9508
Information Gain = H(Rain) - I(Rain, Temperature) = 0.971 - 0.9508 = 0.0202



Second Attribute: (Rain, Windy)
Categorical values: Outlook(Rain), Windy(weak, strong)
H(Rain, Windy=weak) = -(3/3)*log2(3/3) - 0 = 0
H(Rain, Windy=strong) = -0 - (2/2)*log2(2/2) = 0
Average Entropy Information for Windy - I(Rain, Windy)
= p(Rain, weak)*H(Rain, Windy=weak) + p(Rain, strong)*H(Rain, Windy=strong)
= (3/5)*0 + (2/5)*0 = 0
Information Gain = H(Rain) - I(Rain, Windy) = 0.971 - 0 = 0.971



Attribute               Gain
(Rain, Temperature)     0.0202
(Rain, Windy)           0.971

Here, the attribute with maximum information gain is Windy, so the Rain branch splits on Windy.
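The same check for the Rain branch, reusing the rain subset and the info_gain helper from the earlier sketches:

for attr in ["Temperature", "Windy"]:
    print(attr, round(info_gain(rain, attr), 3))
# Expected (approx.): Temperature 0.020, Windy 0.971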



Final ID3 Decision Tree

Five rules are generated (a Python sketch of these rules follows):

1. If Outlook = Sunny and Humidity = High then Don't Play
2. If Outlook = Sunny and Humidity = Normal then Play
3. If Outlook = Overcast then Play
4. If Outlook = Rain and Windy = Strong then Don't Play
5. If Outlook = Rain and Windy = Weak then Play
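Expressed in Python, the final tree is just nested if-then-else statements; the sketch below uses the attribute spellings from the earlier dataset sketch, and the function name is illustrative.

def predict_play(outlook, humidity, windy):
    # Final ID3 tree for the play example, written as if-then-else rules
    if outlook == "Sunny":
        return "Play" if humidity == "Normal" else "Don't Play"
    if outlook == "Overcast":
        return "Play"
    if outlook == "Rain":
        return "Don't Play" if windy == "Strong" else "Play"
    return None   # unseen Outlook value

print(predict_play("Sunny", "High", "Weak"))   # Don't Play
print(predict_play("Rain", "High", "Weak"))    # Play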
Thank You!
C.DEISY August 2021
