
Example 6.3: Assess a student's performance during his course of study and predict whether the student will get a job offer in the final year of the course. The training dataset T consists of 10 data instances with the attributes 'CGPA', 'Interactiveness', 'Practical Knowledge' and 'Communication Skills', as shown in Table 6.3. The target class attribute is 'Job Offer'.
Table 6.3: Training Dataset T

S.No.  CGPA  Interactiveness  Practical Knowledge  Communication Skills  Job Offer
1      ≥9    Yes              Very good            Good                  Yes
2      ≥8    No               Good                 Moderate              Yes
3      ≥9    No               Average              Poor                  No
4      <8    No               Average              Good                  No
5      ≥8    Yes              Good                 Moderate              Yes
6      ≥9    Yes              Good                 Moderate              Yes
7      <8    Yes              Good                 Poor                  No
8      ≥9    No               Very good            Good                  Yes
9      ≥8    Yes              Good                 Good                  Yes
10     ≥8    Yes              Average              Good                  Yes
Solution:
Iteration 1:

Step 1: Calculate the entropy for the target class attribute 'Job Offer'.

Entropy_Info(Target Attribute = Job Offer) = Entropy_Info(7, 3)
$= -\frac{7}{10}\log_2\frac{7}{10} - \frac{3}{10}\log_2\frac{3}{10} = 0.3602 + 0.5211 = 0.8813$
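As a quick check, the same entropy can be computed with a few lines of code. This is a minimal sketch in Python (not the book's own code), assuming the class counts are passed as a sequence:

```python
# A minimal entropy helper; class counts are passed as absolute counts.
from math import log2

def entropy(counts):
    """Entropy of a class distribution given by absolute counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

print(round(entropy([7, 3]), 4))  # 0.8813 for the 7 Yes / 3 No target split
```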
Step 2: Calculate the Entropy_Info and Gain (Information_Gain) for each of the attributes in the training dataset.

Table 6.4 shows the number of data instances classified with Job Offer as Yes or No for the attribute CGPA.
Table 6.4: Entropy Information for CGPA

CGPA  Job Offer = Yes  Job Offer = No  Total  Entropy
≥9    3                1               4      0.8113
≥8    4                0               4      0
<8    0                2               2      0

Entropy_Info(T, CGPA)
$= \frac{4}{10}(0.3113 + 0.5) + \frac{4}{10}(0) + \frac{2}{10}(0) = 0.3245$

Gain(CGPA) = 0.8813 - 0.3245
           = 0.5568
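The weighted-entropy-and-gain computation can be sketched the same way, reusing the entropy() helper from the previous snippet and reading the per-value (Yes, No) counts from Table 6.4:

```python
# Sketch of Step 2 for CGPA. Each CGPA value maps to its (Yes, No) counts.
partitions = {">=9": (3, 1), ">=8": (4, 0), "<8": (0, 2)}
total = sum(sum(p) for p in partitions.values())  # 10 instances

# Weighted average entropy of the partitions induced by CGPA
entropy_cgpa = sum((sum(p) / total) * entropy(p) for p in partitions.values())
gain_cgpa = entropy([7, 3]) - entropy_cgpa

print(round(entropy_cgpa, 4))  # 0.3245
print(round(gain_cgpa, 4))     # 0.5568
```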
Table 6.5 shows the number of data instances classified with Job Offer as Yes or No for the
attribute Interactiveness.
Table 6.5: Entropy Information for Interactiveness

Interactiveness  Job Offer = Yes  Job Offer = No  Total  Entropy
Yes              5                1               6      0.6500
No               2                2               4      1.0

Entropy_Info(T, Interactiveness)
$= \frac{6}{10}(0.2192 + 0.4308) + \frac{4}{10}(0.5 + 0.5) = 0.3900 + 0.4000 = 0.7900$

Gain(Interactiveness) = 0.8813 - 0.7900
                      = 0.0913
Table 6.6 shows the number of data instances classified with Job Offer as Yes or No for the attribute Practical Knowledge.
Table 6.6: Entropy Information for Practical Knowledge

Practical Knowledge  Job Offer = Yes  Job Offer = No  Total  Entropy
Very good            2                0               2      0
Average              1                2               3      0.9183
Good                 4                1               5      0.7219

Entropy_Info(T, Practical Knowledge)
$= \frac{2}{10}(0) + \frac{3}{10}(0.5283 + 0.3900) + \frac{5}{10}(0.2575 + 0.4644) = 0 + 0.2755 + 0.3610 = 0.6365$

Gain(Practical Knowledge) = 0.8813 - 0.6365
                          = 0.2448
Table 6.7 shows the number of data instances classified with Job Offer as Yes or No for the
attribute Communication Skills.
Table 6.7: Entropy Information for Communication Skills

Communication Skills  Job Offer = Yes  Job Offer = No  Total  Entropy
Good                  4                1               5      0.7219
Moderate              3                0               3      0
Poor                  0                2               2      0

Entropy_Info(T, Communication Skills)
$= \frac{5}{10}(0.2575 + 0.4644) + \frac{3}{10}(0) + \frac{2}{10}(0) = 0.3610$

Gain(Communication Skills) = 0.8813 - 0.3610
                           = 0.5203
The Gain calculated for all the attributes is shown in Table 6.8.
Table 6.8: Gain

Attributes            Gain
CGPA                  0.5568
Interactiveness       0.0913
Practical Knowledge   0.2448
Communication Skills  0.5203


Step 3: From Table 6.8, choose the attribute for which the entropy is minimum, and therefore the gain is maximum, as the best split attribute.

The best split attribute is CGPA since it has the maximum gain, so we choose CGPA as the root node. There are three distinct values for CGPA: ≥9, ≥8 and <8. The entropy value is 0 for both ≥8 and <8, with all instances classified as Job Offer = Yes for ≥8 and Job Offer = No for <8. Hence, both ≥8 and <8 end in leaf nodes. The tree grows with the subset of instances with CGPA ≥9, as shown in Figure 6.3.
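In code, this selection is simply an argmax over the gain table. The sketch below reuses the values of Table 6.8; the dictionary layout is an assumption for illustration:

```python
# Pick the best split attribute as the argmax of the gains in Table 6.8.
gains = {
    "CGPA": 0.5568,
    "Interactiveness": 0.0913,
    "Practical Knowledge": 0.2448,
    "Communication Skills": 0.5203,
}
best_split = max(gains, key=gains.get)
print(best_split)  # CGPA -> becomes the root node of the tree
```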

[Figure 6.3 shows CGPA as the root node with branches ≥9, ≥8 and <8; the ≥8 branch ends in a 'Job Offer = Yes' leaf, the <8 branch ends in a 'Job Offer = No' leaf, and the ≥9 branch holds the following subset of instances.]

Interactiveness  Practical Knowledge  Communication Skills  Job Offer
Yes              Very good            Good                  Yes
No               Average              Poor                  No
Yes              Good                 Moderate              Yes
No               Very good            Good                  Yes

Figure 6.3: Decision Tree After Iteration 1


Now, continue the same process for the subset of data instances branched with CGPA ≥9.

Iteration 2:

In this iteration, the same process of computing the Entropy_Info and Gain is repeated on this subset of the training set. The subset consists of 4 data instances, as shown in Figure 6.3.

Entropy_Info(T) = Entropy_Info(3, 1)
$= -\frac{3}{4}\log_2\frac{3}{4} - \frac{1}{4}\log_2\frac{1}{4} = 0.3113 + 0.5 = 0.8113$

Entropy_Info(T, Interactiveness)
$= \frac{2}{4}(0) + \frac{2}{4}(0.5 + 0.5) = 0 + 0.5 = 0.5$

Gain(Interactiveness) = 0.8113 - 0.5
                      = 0.3113

Entropy_Info(T, Practical Knowledge) = 0

Gain(Practical Knowledge) = 0.8113

Entropy_Info(T, Communication Skills) = 0

Gain(Communication Skills) = 0.8113

The gain calculated for all the
attributes is shown in Table 6.9.
Table 6.9: Total Gain

Attributes            Gain
Interactiveness       0.3113
Practical Knowledge   0.8113
Communication Skills  0.8113
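These subset gains can be verified with the same helpers. This sketch assumes the entropy() function defined earlier; the (Yes, No) counts are read off the subset table in Figure 6.3:

```python
# Recompute the gains on the CGPA >= 9 subset, reusing entropy() from above.
subset_entropy = entropy([3, 1])  # 3 Yes, 1 No -> 0.8113

def gain(partitions):
    """Gain of an attribute whose values split the subset into `partitions`."""
    total = sum(sum(p) for p in partitions)
    return subset_entropy - sum((sum(p) / total) * entropy(p) for p in partitions)

print(round(gain([(2, 0), (1, 1)]), 4))          # Interactiveness: 0.3113
print(round(gain([(2, 0), (1, 0), (0, 1)]), 4))  # Practical Knowledge: 0.8113
print(round(gain([(2, 0), (1, 0), (0, 1)]), 4))  # Communication Skills: 0.8113
```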
Here, both the attributes 'Practical Knowledge' and 'Communication Skills' have the same gain. So, we can construct the decision tree using either 'Practical Knowledge' or 'Communication Skills'. The final decision tree is shown in Figure 6.4.

[Figure 6.4 shows the final tree: CGPA at the root with branches ≥9, ≥8 and <8. The ≥8 branch is a 'Job Offer = Yes' leaf and the <8 branch is a 'Job Offer = No' leaf. The ≥9 branch tests Practical Knowledge: 'Very good' and 'Good' lead to 'Job Offer = Yes', while 'Average' leads to 'Job Offer = No'.]

Figure 6.4: Final Decision Tree
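For completeness, here is a sketch of how the final tree of Figure 6.4 could be encoded and used to classify a new student. The nested-dictionary encoding and the classify() helper are illustrative assumptions, not the book's notation:

```python
# Illustrative encoding of the final tree: an inner dict maps an attribute
# to its branches; a plain string is a Job Offer leaf label.
tree = {
    "CGPA": {
        ">=8": "Yes",
        "<8": "No",
        ">=9": {"Practical Knowledge": {
            "Very good": "Yes", "Good": "Yes", "Average": "No"}},
    }
}

def classify(node, instance):
    if not isinstance(node, dict):   # reached a leaf: return the class label
        return node
    attribute = next(iter(node))     # attribute tested at this node
    return classify(node[attribute][instance[attribute]], instance)

print(classify(tree, {"CGPA": ">=9", "Practical Knowledge": "Good"}))  # Yes
```

Note that a student with CGPA ≥8 or <8 is classified immediately at a leaf; only the ≥9 branch needs to consult Practical Knowledge.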
