
Example 6.3: Assess a student's performance during his course of study and predict whether the student will get a job offer in the final year of the course. The training dataset T consists of 10 data instances with the attributes 'CGPA', 'Interactiveness', 'Practical Knowledge' and 'Communication Skills', as shown in Table 6.3. The target class attribute is 'Job Offer'.
Table 6.3: Training Dataset T

S.No.  CGPA  Interactiveness  Practical Knowledge  Communication Skills  Job Offer
1      ≥9    Yes              Very good            Good                  Yes
2      ≥8    No               Good                 Moderate              Yes
3      ≥9    No               Average              Poor                  No
4      <8    No               Average              Good                  No
5      ≥8    Yes              Good                 Moderate              Yes
6      ≥9    Yes              Good                 Moderate              Yes
7      <8    Yes              Good                 Poor                  No
8      ≥9    No               Very good            Good                  Yes
9      ≥8    Yes              Good                 Good                  Yes
10     ≥8    Yes              Average              Good                  Yes
Solution:
Iteration 1:

Step 1: Calculate the entropy for the target class attribute 'Job Offer'.

Entropy_Info(Target Attribute = Job Offer) = Entropy_Info(7, 3)
$= -\frac{7}{10}\log_2\frac{7}{10} - \frac{3}{10}\log_2\frac{3}{10} = 0.3602 + 0.5211 = 0.8813$
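As a quick check, the same entropy can be computed with a few lines of code. This is a minimal sketch in Python (not the book's own code), assuming the class counts are passed as a sequence:

```python
# A minimal entropy helper; class counts are passed as absolute counts.
from math import log2

def entropy(counts):
    """Entropy of a class distribution given by absolute counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

print(round(entropy([7, 3]), 4))  # 0.8813 for the 7 Yes / 3 No target split
```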
Step 2: Calculate the Entropy_Info and Gain (Information_Gain) for each of the attributes in the training dataset.

Table 6.4 shows the number of data instances classified with Job Offer as Yes or No for the attribute CGPA.
Table 6.4: Entropy Information for CGPA

CGPA  Job Offer = Yes  Job Offer = No  Total  Entropy
≥9    3                1               4      0.8113
≥8    4                0               4      0
<8    0                2               2      0

Entropy_Info(T, CGPA)
$= \frac{4}{10}(0.3113 + 0.5) + \frac{4}{10}(0) + \frac{2}{10}(0) = 0.3245$

Gain(CGPA) = 0.8813 - 0.3245
           = 0.5568
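The weighted-entropy-and-gain computation can be sketched the same way, reusing the entropy() helper from the previous snippet and reading the per-value (Yes, No) counts from Table 6.4:

```python
# Sketch of Step 2 for CGPA. Each CGPA value maps to its (Yes, No) counts.
partitions = {">=9": (3, 1), ">=8": (4, 0), "<8": (0, 2)}
total = sum(sum(p) for p in partitions.values())  # 10 instances

# Weighted average entropy of the partitions induced by CGPA
entropy_cgpa = sum((sum(p) / total) * entropy(p) for p in partitions.values())
gain_cgpa = entropy([7, 3]) - entropy_cgpa

print(round(entropy_cgpa, 4))  # 0.3245
print(round(gain_cgpa, 4))     # 0.5568
```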
Table 6.5 shows the number of data instances classified with Job Offer as Yes or No for the
attribute Interactiveness.
Table 6.5: Entropy Information for Interactiveness

Interactiveness  Job Offer = Yes  Job Offer = No  Total  Entropy
Yes              5                1               6      0.6500
No               2                2               4      1.0

Entropy_Info(T, Interactiveness)
$= \frac{6}{10}(0.2192 + 0.4308) + \frac{4}{10}(0.5 + 0.5) = 0.3900 + 0.4000 = 0.7900$

Gain(Interactiveness) = 0.8813 - 0.7900
                      = 0.0913
Table 6.6 shows the number of data instances classified with Job Offer as Yes or No for the attribute Practical Knowledge.
Table 6.6: Entropy Information for Practical Knowledge

Practical Knowledge  Job Offer = Yes  Job Offer = No  Total  Entropy
Very good            2                0               2      0
Average              1                2               3      0.9183
Good                 4                1               5      0.7219

Entropy_Info(T, Practical Knowledge)
$= \frac{2}{10}(0) + \frac{3}{10}(0.5283 + 0.3900) + \frac{5}{10}(0.2575 + 0.4644) = 0 + 0.2755 + 0.3610 = 0.6365$

Gain(Practical Knowledge) = 0.8813 - 0.6365
                          = 0.2448
Table 6.7 shows the number of data instances classified with Job Offer as Yes or No for the
attribute Communication Skills.
Table 6.7: Entropy Information for Communication Skills

Communication Skills  Job Offer = Yes  Job Offer = No  Total  Entropy
Good                  4                1               5      0.7219
Moderate              3                0               3      0
Poor                  0                2               2      0

Entropy_Info(T, Communication Skills)
$= \frac{5}{10}(0.2575 + 0.4644) + \frac{3}{10}(0) + \frac{2}{10}(0) = 0.3610$

Gain(Communication Skills) = 0.8813 - 0.3610
                           = 0.5203
The Gain calculated for all the attributes is shown in Table 6.8.
Table 6.8: Gain

Attributes            Gain
CGPA                  0.5568
Interactiveness       0.0913
Practical Knowledge   0.2448
Communication Skills  0.5203


Step 3: From Table 6.8, choose the attribute for which the entropy is minimum, and therefore the gain is maximum, as the best split attribute.

The best split attribute is CGPA since it has the maximum gain, so we choose CGPA as the root node. There are three distinct values for CGPA: ≥9, ≥8 and <8. The entropy value is 0 for both ≥8 and <8, with all instances classified as Job Offer = Yes for ≥8 and Job Offer = No for <8. Hence, both ≥8 and <8 end in leaf nodes. The tree grows with the subset of instances with CGPA ≥9, as shown in Figure 6.3.
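In code, this selection is simply an argmax over the gain table. The sketch below reuses the values of Table 6.8; the dictionary layout is an assumption for illustration:

```python
# Pick the best split attribute as the argmax of the gains in Table 6.8.
gains = {
    "CGPA": 0.5568,
    "Interactiveness": 0.0913,
    "Practical Knowledge": 0.2448,
    "Communication Skills": 0.5203,
}
best_split = max(gains, key=gains.get)
print(best_split)  # CGPA -> becomes the root node of the tree
```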

[Figure 6.3 shows CGPA as the root node with branches ≥9, ≥8 and <8; the ≥8 branch ends in a 'Job Offer = Yes' leaf, the <8 branch ends in a 'Job Offer = No' leaf, and the ≥9 branch holds the following subset of instances.]

Interactiveness  Practical Knowledge  Communication Skills  Job Offer
Yes              Very good            Good                  Yes
No               Average              Poor                  No
Yes              Good                 Moderate              Yes
No               Very good            Good                  Yes

Figure 6.3: Decision Tree After Iteration 1


Now, continue the same process for the subset of data instances branched with CGPA ≥9.

Iteration 2:

In this iteration, the same process of computing the Entropy_Info and Gain is repeated on this subset of the training set. The subset consists of 4 data instances, as shown in Figure 6.3.

Entropy_Info(T) = Entropy_Info(3, 1)
$= -\frac{3}{4}\log_2\frac{3}{4} - \frac{1}{4}\log_2\frac{1}{4} = 0.3113 + 0.5 = 0.8113$

Entropy_Info(T, Interactiveness)
$= \frac{2}{4}(0) + \frac{2}{4}(0.5 + 0.5) = 0 + 0.5 = 0.5$

Gain(Interactiveness) = 0.8113 - 0.5
                      = 0.3113

Entropy_Info(T, Practical Knowledge) = 0

Gain(Practical Knowledge) = 0.8113

Entropy_Info(T, Communication Skills) = 0

Gain(Communication Skills) = 0.8113

The gain calculated for all the
attributes is shown in Table 6.9.
Table 6.9: Total Gain

Attributes            Gain
Interactiveness       0.3113
Practical Knowledge   0.8113
Communication Skills  0.8113
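These subset gains can be verified with the same helpers. This sketch assumes the entropy() function defined earlier; the (Yes, No) counts are read off the subset table in Figure 6.3:

```python
# Recompute the gains on the CGPA >= 9 subset, reusing entropy() from above.
subset_entropy = entropy([3, 1])  # 3 Yes, 1 No -> 0.8113

def gain(partitions):
    """Gain of an attribute whose values split the subset into `partitions`."""
    total = sum(sum(p) for p in partitions)
    return subset_entropy - sum((sum(p) / total) * entropy(p) for p in partitions)

print(round(gain([(2, 0), (1, 1)]), 4))          # Interactiveness: 0.3113
print(round(gain([(2, 0), (1, 0), (0, 1)]), 4))  # Practical Knowledge: 0.8113
print(round(gain([(2, 0), (1, 0), (0, 1)]), 4))  # Communication Skills: 0.8113
```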
Here, both the attributes 'Practical Knowledge' and 'Communication Skills' have the same gain. So, we can construct the decision tree using either 'Practical Knowledge' or 'Communication Skills'. The final decision tree is shown in Figure 6.4.

[Figure 6.4 shows the final tree: CGPA at the root with branches ≥9, ≥8 and <8. The ≥8 branch is a 'Job Offer = Yes' leaf and the <8 branch is a 'Job Offer = No' leaf. The ≥9 branch tests Practical Knowledge: 'Very good' and 'Good' lead to 'Job Offer = Yes', while 'Average' leads to 'Job Offer = No'.]

Figure 6.4: Final Decision Tree
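For completeness, here is a sketch of how the final tree of Figure 6.4 could be encoded and used to classify a new student. The nested-dictionary encoding and the classify() helper are illustrative assumptions, not the book's notation:

```python
# Illustrative encoding of the final tree: an inner dict maps an attribute
# to its branches; a plain string is a Job Offer leaf label.
tree = {
    "CGPA": {
        ">=8": "Yes",
        "<8": "No",
        ">=9": {"Practical Knowledge": {
            "Very good": "Yes", "Good": "Yes", "Average": "No"}},
    }
}

def classify(node, instance):
    if not isinstance(node, dict):   # reached a leaf: return the class label
        return node
    attribute = next(iter(node))     # attribute tested at this node
    return classify(node[attribute][instance[attribute]], instance)

print(classify(tree, {"CGPA": ">=9", "Practical Knowledge": "Good"}))  # Yes
```

Note that a student with CGPA ≥8 or <8 is classified immediately at a leaf; only the ≥9 branch needs to consult Practical Knowledge.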
