
ID3 Decision Tree

APPLIED DATA SCIENCE WITH PYTHON



Decision Tree Algorithm

❖ The decision tree is one of the most widely used and practical methods for supervised learning.
❖ Decision trees are a non-parametric supervised learning method used for both classification and regression tasks.
❖ The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
❖ The decision rules are generally in the form of if-then-else statements (see the sketch below).
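As a quick illustration in Python, the following minimal sketch (assuming scikit-learn is available) fits a small decision tree and prints its learned rules as nested if-then-else statements; the toy data and the criterion="entropy" setting are illustrative assumptions, not part of the slides.

from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: two binary features and a binary target (illustrative only)
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# criterion="entropy" makes the splits follow the ID3-style information gain idea
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

# Print the learned tree as nested if-then-else rules
print(export_text(clf, feature_names=["f1", "f2"]))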



ID3 Decision Tree Algorithm

❖ ID3 is one of the most common decision tree algorithms. It was introduced by Ross Quinlan in 1986, and its name is an acronym for Iterative Dichotomiser 3.
❖ The ID3 algorithm begins with the original set as the root node.
❖ On each iteration, the algorithm goes through every unused attribute of the set and calculates the entropy (or, equivalently, the information gain) of that attribute.
❖ It then selects the attribute with the largest information gain.
❖ The set is then split or partitioned by the selected attribute to produce subsets of the data.
❖ Information gain is a statistical property that measures how well a given attribute separates the training examples according to their target classification.



ID3 ALGORITHM
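The ID3 pseudocode appears on this slide as a figure. The following is a minimal Python sketch of the same procedure; it assumes the training data is a list of dictionaries mapping attribute names to categorical values, and the names entropy, information_gain and id3 are illustrative, not taken from the slides.

import math
from collections import Counter

def entropy(labels):
    # H(S) = -sum over classes of p_i * log2(p_i)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    # Gain = H(S) minus the weighted average entropy of the subsets produced by 'attribute'
    total = len(rows)
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset)
    return base - remainder

def id3(rows, attributes, target):
    labels = [r[target] for r in rows]
    # Stop when the node is pure or no attributes remain; predict the majority class
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Select the attribute with the largest information gain
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    tree = {best: {}}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)
    return tree

# Example call (rows would be the 14-day dataset shown on the next slides):
# tree = id3(rows, ["Outlook", "Temperature", "Humidity", "Windy"], "Play")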



Last 14 days' report from the Meteorological Dept.
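The 14-day table itself appears as an image on this slide. The worked example follows the standard 14-row weather ("play") dataset; the pandas reconstruction below is an assumption, but its yes/no counts for every attribute value match the entropy and information-gain figures computed on the following slides.

import pandas as pd

# Assumed reconstruction of the 14-day report (counts match the later slides)
df = pd.DataFrame({
    "Outlook":     ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
                    "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
    "Temperature": ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool",
                    "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"],
    "Humidity":    ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                    "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
    "Windy":       ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
                    "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
    "Play":        ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                    "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"],
})
print(df["Play"].value_counts())   # 9 Yes, 5 No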



Description of Each Attribute

Decision tree for the concept Play Cricket (figure)
Entropy of Class Label
Since the basic version of the ID3 algorithm deals with the case where classifications are either positive or negative, we can define entropy as:

H(S) = - Σᵢ P(xᵢ) * log2(P(xᵢ))

Suppose S is the class label Play, containing 14 Boolean examples, with 9 positive and 5 negative examples.

Then, the entropy of S relative to this boolean classification is:

H(S) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no)) = - (9/14) * log2(9/14) - (5/14) * log2(5/14)


= 0.41 + 0.53 = 0.94
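The same number can be checked with a few lines of Python (standard library only):

import math

# Entropy of the class label Play: 9 "yes" and 5 "no" out of 14 examples
p_yes, p_no = 9 / 14, 5 / 14
H_S = -p_yes * math.log2(p_yes) - p_no * math.log2(p_no)
print(round(H_S, 2))   # 0.94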



First Attribute: Outlook
Categorical values - sunny, overcast and rain
H(Outlook=sunny) = -(2/5)*log2(2/5) - (3/5)*log2(3/5) = 0.971
H(Outlook=rain) = -(3/5)*log2(3/5)-(2/5)*log2 (2/5) =0.971

H(Outlook=overcast) = -(4/4)*log2(4/4)-0 = 0

Average Entropy Information for Outlook - I(Outlook)


= p(sunny) * H(Outlook=sunny) + p(rain) * H(Outlook=rain) + p(overcast) * H(Outlook=overcast)
= (5/14)*0.971 + (5/14)*0.971 + (4/14)*0= 0.693

Information Gain = H(S) - I(Outlook) = 0.94 - 0.693 = 0.247
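The Outlook calculation in Python, as a minimal sketch that takes the per-value yes/no counts from the table (the entropy helper name is illustrative):

import math

def entropy(pos, neg):
    # Entropy of a subset with 'pos' positive and 'neg' negative examples
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

# (yes, no) counts for each Outlook value, read from the table
counts = {"sunny": (2, 3), "rain": (3, 2), "overcast": (4, 0)}
total = sum(p + n for p, n in counts.values())   # 14 examples

I_outlook = sum(((p + n) / total) * entropy(p, n) for p, n in counts.values())
gain = entropy(9, 5) - I_outlook
print(round(I_outlook, 3), round(gain, 3))   # 0.694 and 0.247 (the slide rounds the first value to 0.693)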



Second Attribute: Temperature

Categorical values - hot, mild, cool

H(Temperature=hot) = -(2/4)*log2 (2/4)-(2/4)*log2 (2/4) = 1


H(Temperature=cool) = -(3/4)*log2 (3/4)-(1/4)*log2 (1/4) = 0.811
H(Temperature=mild) = -(4/6)*log2 (4/6)-(2/6)*log2 (2/6) = 0.9179

Average Entropy Information for Temperature - I(Temperature)

= p(hot)*H(Temperature=hot) + p(mild)*H(Temperature=mild) + p(cool)*H(Temperature=cool)


= (4/14)*1 + (6/14)*0.9179 + (4/14)*0.811= 0.9108

Information Gain = H(S) - I(Temperature) = 0.94 - 0.9108 = 0.0292


Third Attribute: Humidity

Categorical values - high, normal

H(Humidity=high) = -(3/7)*log2(3/7) - (4/7)*log2(4/7) = 0.985
H(Humidity=normal) = -(6/7)*log2(6/7) - (1/7)*log2(1/7) = 0.591
Average Entropy Information for Humidity - I(Humidity)
= p(high)*H(Humidity=high) + p(normal)*H(Humidity=normal)
= (7/14)*0.985 + (7/14)*0.591 = 0.788
Information Gain = H(S) - I(Humidity) = 0.94 - 0.788 = 0.152



Fourth Attribute: Windy

Categorical values - weak, strong

H(Windy=weak) = -(6/8)*log2(6/8) - (2/8)*log2(2/8) = 0.811
H(Windy=strong) = -(3/6)*log2(3/6) - (3/6)*log2(3/6) = 1
Average Entropy Information for Windy - I(Windy)
= p(weak)*H(Windy=weak) + p(strong)*H(Windy=strong)
= (8/14)*0.811 + (6/14)*1 = 0.892
Information Gain = H(S) - I(Windy) = 0.94 - 0.892 = 0.048
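Putting the four attribute calculations together, the short pandas sketch below computes every information gain at once; it assumes the DataFrame df from the earlier dataset sketch, and info_gain is an illustrative helper name.

import math

def info_gain(data, attribute, target="Play"):
    # Gain of 'attribute' = H(target) - weighted entropy of the subsets it produces
    def H(series):
        probs = series.value_counts(normalize=True)
        return -sum(p * math.log2(p) for p in probs if p > 0)
    total = len(data)
    remainder = sum((len(group) / total) * H(group[target])
                    for _, group in data.groupby(attribute))
    return H(data[target]) - remainder

for attr in ["Outlook", "Temperature", "Humidity", "Windy"]:
    print(attr, round(info_gain(df, attr), 3))
# Expected (approx.): Outlook 0.247, Temperature 0.029, Humidity 0.152, Windy 0.048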

Here, the attribute with maximum information gain is Outlook, so it becomes the root node of the decision tree built so far.

When Outlook = Overcast, the subset is a pure class (Yes).

Now, we repeat the same procedure on the subset of rows with Outlook = Sunny, and then on the subset with Outlook = Rain.

Repeat the tree construction on each branch with the remaining attributes (a pandas sketch of this step follows).
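In pandas terms, building the Sunny branch means selecting those rows, dropping Outlook, and recomputing the gains on the remaining attributes; the sketch below reuses the df and the info_gain helper from the earlier sketches.

# Sunny branch: keep only the Outlook = Sunny rows; Outlook is no longer a candidate attribute
sunny = df[df["Outlook"] == "Sunny"].drop(columns=["Outlook"])
print(sunny["Play"].value_counts())   # 2 Yes, 3 No

for attr in ["Temperature", "Humidity", "Windy"]:
    print(attr, round(info_gain(sunny, attr), 3))
# Expected (approx.): Temperature 0.571, Humidity 0.971, Windy 0.020

# The Rain branch is built the same way and is used later in the example
rain = df[df["Outlook"] == "Rain"].drop(columns=["Outlook"])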



Play (2 Yes, 3 No)

Complete entropy of the Sunny subset:
H(Sunny) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
= - (2/5) * log2(2/5) - (3/5) * log2(3/5)
= 0.971



First Attribute: (Sunny, Temperature)
Categorical values: Outlook(Sunny), Temperature(hot, mild, cool)
H(Sunny, Temperature=hot) = -0 - (2/2)*log2(2/2) = 0
H(Sunny, Temperature=cool) = -(1)*log2(1) - 0 = 0
H(Sunny, Temperature=mild) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
Average Entropy Information for Temperature - I(Sunny, Temperature)
= p(Sunny, hot)*H(Sunny, Temperature=hot) + p(Sunny, mild)*H(Sunny, Temperature=mild) + p(Sunny, cool)*H(Sunny, Temperature=cool)
= (2/5)*0 + (2/5)*1 + (1/5)*0 = 0.4
Information Gain = H(Sunny) - I(Sunny, Temperature)
= 0.971 - 0.4 = 0.571
Second Attribute: (Sunny, Humidity)

Categorical values: Outlook(Sunny), Humidity(high, normal)


H(Sunny, Humidity=high)
= - 0 - (3/3)* log2(3/3) = 0
H(Sunny, Humidity=normal)
= -(2/2)* log2(2/2)-0 = 0
Average Entropy Information for Humidity - I(Sunny, Humidity)
= p(Sunny, high)*H(Sunny, Humidity=high) + p(Sunny, normal)*H(Sunny, Humidity=normal)
= (3/5)*0 + (2/5)*0 = 0
Information Gain = H(Sunny) - I(Sunny, Humidity)
= 0.971 - 0 = 0.971



Third Attribute: (Sunny, Windy)

Categorical values: Outlook(Sunny), Windy(weak, strong)

H(Sunny, Windy=weak) = -(1/3)*log2(1/3) - (2/3)*log2(2/3) = 0.918
H(Sunny, Windy=strong) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
Average Entropy Information for Windy - I(Sunny, Windy)
= p(Sunny, weak)*H(Sunny, Windy=weak) + p(Sunny, strong)*H(Sunny, Windy=strong)
= (3/5)*0.918 + (2/5)*1 = 0.9508
Information Gain = H(Sunny) - I(Sunny, Windy) = 0.971 - 0.9508 = 0.0202



Attribute               Gain
(Sunny, Temperature)    0.571
(Sunny, Humidity)       0.971
(Sunny, Windy)          0.0202

Here, the attribute with maximum information gain is Humidity.


So, the decision tree built so far:
When Outlook = Sunny and Humidity = High, it is a pure class of category "No".
When Outlook = Sunny and Humidity = Normal, it is again a pure class of category "Yes".
Therefore, no further calculations are needed for the Sunny branch.



Continue with Outlook = Rain

Play (3 Yes, 2 No)


Complete entropy of the Rain subset:
H(Rain) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
= - (3/5) * log2(3/5) - (2/5) * log2(2/5) = 0.971



First Attribute: (Rain, Temperature)
Categorical values: Outlook(Rain), Temperature(mild, cool)
H(Rain, Temperature=cool) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
H(Rain, Temperature=mild) = -(2/3)*log2(2/3) - (1/3)*log2(1/3) = 0.918
Average Entropy Information for Temperature - I(Rain, Temperature)
= p(Rain, mild)*H(Rain, Temperature=mild) + p(Rain, cool)*H(Rain, Temperature=cool)
= (3/5)*0.918 + (2/5)*1 = 0.9508
Information Gain = H(Rain) - I(Rain, Temperature) = 0.971 - 0.9508 = 0.0202



Second Attribute: (Rain, Windy)
Categorical values: Outlook(Rain), Windy(weak, strong)
H(Rain, Windy=weak) = -(3/3)*log2(3/3) - 0 = 0
H(Rain, Windy=strong) = -0 - (2/2)*log2(2/2) = 0
Average Entropy Information for Windy - I(Rain, Windy)
= p(Rain, weak)*H(Rain, Windy=weak) + p(Rain, strong)*H(Rain, Windy=strong)
= (3/5)*0 + (2/5)*0 = 0
Information Gain = H(Rain) - I(Rain, Windy) = 0.971 - 0 = 0.971



Attribute               Gain
(Rain, Temperature)     0.0202
(Rain, Windy)           0.971

Here, the attribute with maximum information gain is Windy, so the Rain branch splits on Windy.
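The same check for the Rain branch, reusing the rain subset and the info_gain helper from the earlier sketches:

for attr in ["Temperature", "Windy"]:
    print(attr, round(info_gain(rain, attr), 3))
# Expected (approx.): Temperature 0.020, Windy 0.971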



Final ID3 Decision Tree

Five rules are generated (a Python sketch of these rules follows):

1. If Outlook = Sunny and Humidity = High then Don't Play
2. If Outlook = Sunny and Humidity = Normal then Play
3. If Outlook = Overcast then Play
4. If Outlook = Rain and Windy = Strong then Don't Play
5. If Outlook = Rain and Windy = Weak then Play
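Expressed in Python, the final tree is just nested if-then-else statements; the sketch below uses the attribute spellings from the earlier dataset sketch, and the function name is illustrative.

def predict_play(outlook, humidity, windy):
    # Final ID3 tree for the play example, written as if-then-else rules
    if outlook == "Sunny":
        return "Play" if humidity == "Normal" else "Don't Play"
    if outlook == "Overcast":
        return "Play"
    if outlook == "Rain":
        return "Don't Play" if windy == "Strong" else "Play"
    return None   # unseen Outlook value

print(predict_play("Sunny", "High", "Weak"))   # Don't Play
print(predict_play("Rain", "High", "Weak"))    # Play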
Thank You!
C.DEISY August 2021
