
Task 1.

Apply the decision tree algorithm to the following table and construct the decision tree accordingly.

Solution:

Buy_Computer counts: Yes = 9, No = 5 (14 records in total)

Entropy(Buy_Computer) = Entropy(9/14, 5/14)
= Entropy(0.64, 0.36)
= -(0.64 log2 0.64) - (0.36 log2 0.36)
= 0.94
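As a quick check, the same entropy can be computed in a few lines of Python (a minimal sketch; the helper name `entropy` is only illustrative):

    import math

    def entropy(*counts):
        # Shannon entropy (base 2) of a class distribution given as raw counts.
        # Zero counts are skipped, matching the convention 0 log2 0 = 0.
        total = sum(counts)
        return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

    print(round(entropy(9, 5), 3))  # 0.94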

Age vs. Buy_Computer counts:
           Yes   No   Total
Young       2     3     5
Middle      4     0     4
Senior      2     3     5
Total       9     5    14

Entropy(Young) = Entropy(2/5, 3/5)

= -(0.4 log2 0.4) - (0.6 log2 0.6)

= 0.971
Entropy(Middle) = Entropy(4/4, 0/4)
= -(1 log2 1) - (0 log2 0)    (taking 0 log2 0 = 0)

= 0
Entropy(Senior) = Entropy(2/5, 3/5)
= -(0.4 log2 0.4)-(0.6 log2 0.6)

= 0.971
Entropy(Buy_Computer, Age) = P(Young)*E(Young) + P(Middle)*E(Middle) + P(Senior)*E(Senior)
= (5/14)*0.971 + (4/14)*0.0 + (5/14)*0.971
= 0.693
Gain(T,X) = Entropy(T) – Entropy(T,X)
Gain(Buy_Computer, Age) = E(Buy_Computer) – E(Buy_Computer, Age)
= 0.94 – 0.693
= 0.247
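The whole Gain(Buy_Computer, Age) computation fits in one more helper (again a sketch building on the `entropy` function above; the branch counts come straight from the Age table):

    def gain(parent_counts, branches):
        # Information gain = parent entropy minus the weighted entropy of the split.
        # branches holds one (yes, no) count pair per attribute value,
        # e.g. Age -> [(2, 3), (4, 0), (2, 3)] for Young, Middle, Senior.
        total = sum(parent_counts)
        weighted = sum(sum(b) / total * entropy(*b) for b in branches)
        return entropy(*parent_counts) - weighted

    print(round(gain((9, 5), [(2, 3), (4, 0), (2, 3)]), 3))  # 0.247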
Income vs. Buy_Computer counts:
           Yes   No   Total
High        2     2     4
Medium      4     2     6
Low         3     1     4
Total       9     5    14

Entropy(High) = Entropy(2/4, 2/4)


= -(0.5 log2 0.5)-(0.5 log2 0.5)

=1

Entropy(Medium) = Entropy(4/6, 2/6)

= -(0.667 log2 0.667) - (0.333 log2 0.333)

= 0.9183

Entropy(Low) = Entropy(3/4, 1/4)


= -(0.75 log2 0.75)-(0.25 log2 0.25)

= 0.8112

Entropy(Buy_Computer, Income) = P(High)*E(High) + P(Medium)*E(Medium) + P(Low)*E(Low)

= (4/14)*1 + (6/14)*0.9183 + (4/14)*0.8112
= 0.9111

Gain(T,X) = Entropy(T) – Entropy(T,X)


Gain(Buy_Computer, Income) = E(Buy_Computer) – E(Buy_Computer, Income)
= 0.94 – 0.9111
= 0.029
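The same `gain` helper sketched earlier reproduces this value from the Income table's counts:

    print(round(gain((9, 5), [(2, 2), (4, 2), (3, 1)]), 3))  # 0.029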
Student vs. Buy_Computer counts:
           Yes   No   Total
Yes         6     1     7
No          3     4     7
Total       9     5    14

Entropy(Student = Yes) = Entropy(6/7, 1/7)

= -(0.8571 log2 0.8571) - (0.1429 log2 0.1429)

= 0.5917

Entropy(Student = No) = Entropy(3/7, 4/7)

= -(0.4286 log2 0.4286) - (0.5714 log2 0.5714)

= 0.9852

Entropy(Buy_Computer, Student) = P(Yes)*E(Yes) + P(No)*E(No)

= (7/14)*0.5917 + (7/14)*0.9852
= 0.788

Gain(T,X) = Entropy(T) – Entropy(T,X)


Gain(Buy_Computer, Student) = E(Buy_Computer) – E(Buy_Computer, Student)
= 0.94 – 0.788
= 0.152
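Checking with the `gain` helper and the Student table's counts:

    print(round(gain((9, 5), [(6, 1), (3, 4)]), 3))  # 0.152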
Credit_Rating vs. Buy_Computer counts:
           Yes   No   Total
Fair        6     2     8
Excellent   3     3     6
Total       9     5    14

Entropy(Fair) = Entropy(6/8, 2/8)

= -(0.75 log2 0.75) - (0.25 log2 0.25)

= 0.8112

Entropy(Excellent) = Entropy(3/6, 3/6)

= -(0.5 log2 0.5) - (0.5 log2 0.5)

= 1

Entropy(Buy_Computer, Credit_Rating) = P(Fair)*E(Fair) + P(Excellent)*E(Excellent)

= (8/14)*0.8112 + (6/14)*1
= 0.892

Gain(T,X) = Entropy(T) – Entropy(T,X)


Gain(Buy_Computer, Credit_Rating) = E(Buy_Computer) – E(Buy_Computer, Credit_Rating)
= 0.94 – 0.892
= 0.048
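And once more with the Credit_Rating counts:

    print(round(gain((9, 5), [(6, 2), (3, 3)]), 3))  # 0.048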

Choose the attribute with the largest information gain as the decision node, split the dataset along its branches, and repeat the same process on each branch. Gain(Buy_Computer, Age) = 0.247 is the largest information gain, so Age becomes the root of the tree.
Age     Income  Student  Credit_Rating  Buy_Computer
Middle  High    No       Fair           Yes
Middle  Low     Yes      Excellent      Yes
Middle  Medium  No       Excellent      Yes
Middle  High    Yes      Fair           Yes

Since all middle-aged people buy a computer, the Middle branch becomes a leaf labelled Yes. Next, consider the Young records:

Age     Income  Student  Credit_Rating  Buy_Computer
Young   High    No       Fair           No
Young   High    No       Excellent      No
Young   Medium  No       Fair           No
Young   Low     Yes      Fair           Yes
Young   Medium  Yes      Excellent      Yes

Young people who are not students do not buy a computer, while young students always do, so the Young branch splits on Student. Finally, consider the Senior records:
Age     Income  Student  Credit_Rating  Buy_Computer
Senior  Medium  No       Fair           Yes
Senior  Low     Yes      Fair           Yes
Senior  Low     Yes      Excellent      No
Senior  Medium  Yes      Fair           Yes
Senior  Medium  No       Excellent      No

Seniors with an excellent credit rating do not buy a computer, while seniors with a fair credit rating always do, so the Senior branch splits on Credit_Rating. This completes the tree.
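The finished tree can also be written directly as a small classifier (a sketch; the value strings simply follow the tables above):

    def buy_computer(age, student, credit_rating):
        # Root: Age. Middle-aged customers always buy.
        if age == "Middle":
            return "Yes"
        # Young branch splits on Student.
        if age == "Young":
            return "Yes" if student == "Yes" else "No"
        # Senior branch splits on Credit_Rating.
        return "Yes" if credit_rating == "Fair" else "No"

    print(buy_computer("Young", "Yes", "Fair"))  # Yes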
