
COS3751/203/2/2018

Tutorial Letter 203/2/2018

Techniques of Artificial Intelligence


COS3751

Semester 2

School of Computing

IMPORTANT INFORMATION

This tutorial letter contains model solutions for the self-assessment assignment.

Solution
SELF ASSESSMENT ASSIGNMENT 01

Study material: Chapters 9 and 18. You may skip sections 9.3 and 9.4. You only need to
study sections 18.1, 18.2, and 18.3.

Question 1

Convert the following First-Order Logic (FOL) sentence to clause form:

∀y (∀x P(x) ⇒ ∃x (∀z Q(x, z) ∨ ∀z R(x, y, z)))

• Eliminate implication:
∀y (¬∀x P(x) ∨ ∃x (∀z Q(x, z) ∨ ∀z R(x, y, z)))

• Move ¬ inwards
∀y (∃x ¬P(x) ∨ ∃x (∀z Q(x, z) ∨ ∀z R(x, y, z)))

• Standardize variables: The variable name x is used in the scope of two different quantifiers,
so we change the name of one of them to avoid confusion when we drop the quantifiers. The
same applies for the variable name z.
∀y (∃x ¬P(x) ∨ ∃u (∀z Q(u, z) ∨ ∀v R(u, y, v)))
Note that the scope of (∀z) is the predicate Q(u, z), and the scope of (∀v) is the predicate
R(u, y, v). On the other hand, the scope of (∀y) is the whole formula.

• Skolemize
∀y (¬P(f(y)) ∨ (∀z Q(g(y), z) ∨ ∀v R(g(y), y, v)))
Here f and g are Skolem functions, and their arguments are all the universally quantified
variables in whose scope the existential quantifier appears. In this case, ∃x and ∃u each
appear only in the scope of ∀y. Note that the variable u appears twice in the sentence, so
we replace each occurrence of u with g(y).

• Drop universal quantifiers: At this point, all remaining variables are universally quantified.
Moreover, the sentence is equivalent to one in which all the universal quantifiers have been
moved to the left: ∀y, z, v (¬P(f (y)) ∨ (Q(g(y), z) ∨ R(g(y), y, v )))
We can therefore drop the universal quantifiers: ¬P(f (y)) ∨ Q(g(y), z) ∨ R(g(y), y, v )

The sentence is now in CNF and it has only one conjunct (one clause) consisting of a disjunction
of three literals.

Question 2

Consider the following English statements:

a. Anyone who passes their history exam and who wins the lottery is happy.

b. Anyone who studies or is lucky can pass their exams.


c. John did not study.

d. John is lucky.

e. Anyone who is lucky wins the lottery.

(2.1) Provide a vocabulary for the statements.


Pass(p, s): Predicate. Person p passes subject s.
Win(p, g): Predicate. Person p wins game g.
Happy(p): Predicate. Person p is happy.
Study(p): Predicate. Person p studies.
Lucky(p): Predicate. Person p is lucky.
History : Constant denoting a subject.
Lottery : Constant denoting a game.
John: Constant denoting a person.

(2.2) Translate the above English sentences to FOL statements using the vocabulary you
defined above.

a. ∀p ((Pass(p, History) ∧ Win(p, Lottery)) ⇒ Happy(p))

b. ∀q, r ((Study(q) ∨ Lucky(q)) ⇒ Pass(q, r ))

c. ¬Study (John)

d. Lucky(John)

e. ∀s (Lucky (s) ⇒ Win(s, Lottery))

It is important to standardize the variables (to use different variable names) to avoid
confusion when dropping the universal quantifiers.

(2.3) Convert the FOL statements obtained in 2.2 into clause form.

a. ∀p ((Pass(p, History) ∧ Win(p, Lottery)) ⇒ Happy(p))


≡ ∀p (¬(Pass(p, History) ∧ Win(p, Lottery)) ∨ Happy(p))
≡ ∀p (¬Pass(p, History ) ∨ ¬Win(p, Lottery) ∨ Happy(p))

b. ∀q, r ((Study(q) ∨ Lucky(q)) ⇒ Pass(q, r ))


≡ ∀q, r (¬(Study (q) ∨ Lucky(q)) ∨ Pass(q, r ))
≡ ∀q, r ((¬Study (q) ∧ ¬Lucky(q)) ∨ Pass(q, r ))
≡ (¬Study (q) ∧ ¬Lucky (q)) ∨ Pass(q, r )
But we have to use the distributivity of disjunction, otherwise it is not in clause
form!
≡ (¬Study (q) ∨ Pass(q, r )) ∧ (¬Lucky(q) ∨ Pass(q, r ))
So this sentence is represented by two clauses.

c. No conversion required.

d. No conversion required.

e. ∀s (Lucky (s) ⇒ Win(s, Lottery))
≡ ¬Lucky (s) ∨ Win(s, Lottery)

We thus end up with the following clauses:

1. ¬Pass(p, History ) ∨ ¬Win(p, Lottery) ∨ Happy(p)

2. ¬Study (q) ∨ Pass(q, r )

3. ¬Lucky(q) ∨ Pass(q, r )

4. ¬Study (John)

5. Lucky(John)

6. ¬Lucky(s) ∨ Win(s, Lottery)

Note that universal quantifiers have been dropped because all variables were universally
quantified. Skolem functions are introduced only to remove existential quantifiers. (See
section 9.5 of R&N.)

(2.4) Use resolution refutation to prove that John is happy.


In order to use resolution refutation, we negate the goal, convert the negated goal
to clause form (if necessary), and add the resulting clause(s) to the set of premises.
Make sure you understand why this approach works by reviewing the ground resolution
theorem. Also review section 7.5: remember that α |= β iff (α ∧ ¬β) is unsatisfiable.
In this case, the goal is Happy(John). So the negation of the goal is ¬Happy(John).
We can now resolve the premises (clauses 1 to 6 in Question 2.3 above) together with
the negated goal until the empty clause (Nil) is generated.

1. ¬Pass(p, History ) ∨ ¬Win(p, Lottery) ∨ Happy (p) assumption

2. ¬Study (q) ∨ Pass(q, r ) assumption

3. ¬Lucky(q) ∨ Pass(q, r ) assumption

4. ¬Study (John) assumption

5. Lucky(John) assumption

6. ¬Lucky(s) ∨ Win(s, Lottery) assumption

7. ¬Happy(John) negation of goal

8. ¬Win(p, Lottery) ∨ Happy(p) ∨ ¬Lucky(p) 1&3, {p/q}, {History/r}

9. ¬Win(John, Lottery) ∨ Happy(John) 5&8, {John/p}

10. ¬Win(John, Lottery) 9&7

11. Win(John, Lottery) 5&6, {John/s}

12. ∅ 10&11


We have shown that the negation of the goal together with the premises (all in clause
form) produce a contradiction (empty clause). Therefore the goal Happy(John), which
translates to ‘John is happy’, is a logical consequence of the other sentences.
It is important to show which clauses form part of the resolution to produce the resol-
vent. You must also always show the substitutions.
Remember: given a fact about a constant, one cannot simply replace the constant
with a variable to derive a new (general) statement. For example, suppose we add
Carol to the list of persons in the vocabulary for the above problem. From Lucky(John)
we cannot derive Lucky(x) using {x/John}, and then conclude Lucky(Carol) using {Carol/x}.
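
The first-order proof above relies on unification for the substitutions. As a minimal sketch (not part of the original solution), the same refutation can be checked at the ground level: clauses 1 to 6 and the negated goal with the substitutions for John, History and Lottery already applied, literals written as plain strings with '~' for negation.

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolvents(c1, c2):
    """All binary resolvents of two ground clauses (clauses are frozensets of literals)."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

CLAUSES = {
    frozenset({'~Pass(John,History)', '~Win(John,Lottery)', 'Happy(John)'}),  # 1
    frozenset({'~Study(John)', 'Pass(John,History)'}),                        # 2
    frozenset({'~Lucky(John)', 'Pass(John,History)'}),                        # 3
    frozenset({'~Study(John)'}),                                              # 4
    frozenset({'Lucky(John)'}),                                               # 5
    frozenset({'~Lucky(John)', 'Win(John,Lottery)'}),                         # 6
    frozenset({'~Happy(John)'}),                                              # negated goal
}

def refute(clauses):
    """Saturate the clause set under resolution; True iff the empty clause is derived."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:            # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= known:             # nothing new can be derived
            return False
        known |= new

print(refute(CLAUSES))   # True, so Happy(John) follows from the premises
```

Working with ground instances suffices for this particular proof because every substitution in the derivation binds a variable to John, History or Lottery.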

Question 3

(3.1) Convert the Boolean function in Table 1 into a decision tree:

x1  x2  x3  f(x1, x2, x3)
0   0   0   1
0   0   1   0
0   1   0   1
0   1   1   1
1   0   0   1
1   0   1   0
1   1   0   0
1   1   1   1

Table 1: Boolean function table

No information gain values were given, so it becomes a matter of picking the sequence
in which the variables are to be evaluated. The next step would be to simplify the
decision tree by consolidating equivalent leaf nodes.
Variable order x1, x2, x3:

x1 = 0:
    x2 = 0:
        x3 = 0: 1
        x3 = 1: 0
    x2 = 1:
        x3 = 0: 1
        x3 = 1: 1
x1 = 1:
    x2 = 0:
        x3 = 0: 1
        x3 = 1: 0
    x2 = 1:
        x3 = 0: 0
        x3 = 1: 1

This can be simplified to:

x1 = 0:
    x2 = 0:
        x3 = 0: 1
        x3 = 1: 0
    x2 = 1: 1
x1 = 1:
    x2 = 0:
        x3 = 0: 1
        x3 = 1: 0
    x2 = 1:
        x3 = 0: 0
        x3 = 1: 1

Variable order x1, x3, x2:

x1 = 0:
    x3 = 0: 1
    x3 = 1:
        x2 = 0: 0
        x2 = 1: 1
x1 = 1:
    x3 = 0:
        x2 = 0: 1
        x2 = 1: 0
    x3 = 1:
        x2 = 0: 0
        x2 = 1: 1

Variable order x2, x1, x3:

x2 = 0:
    x1 = 0:
        x3 = 0: 1
        x3 = 1: 0
    x1 = 1:
        x3 = 0: 1
        x3 = 1: 0
x2 = 1:
    x1 = 0: 1
    x1 = 1:
        x3 = 0: 0
        x3 = 1: 1

Variable order x2, x3, x1:

x2 = 0:
    x3 = 0: 1
    x3 = 1: 0
x2 = 1:
    x3 = 0:
        x1 = 0: 1
        x1 = 1: 0
    x3 = 1: 1

Variable order x3, x2, x1:

x3 = 0:
    x2 = 0: 1
    x2 = 1:
        x1 = 0: 1
        x1 = 1: 0
x3 = 1:
    x2 = 0: 0
    x2 = 1: 1

Variable order x3, x1, x2:

x3 = 0:
    x1 = 0: 1
    x1 = 1:
        x2 = 0: 1
        x2 = 1: 0
x3 = 1:
    x1 = 0:
        x2 = 0: 0
        x2 = 1: 1
    x1 = 1:
        x2 = 0: 0
        x2 = 1: 1

(3.2) When we construct a decision tree without the benefit of gain values, the order in
which we evaluate the variables is important. Why?
The order determines which subtrees end up with identical leaf values and can therefore
be consolidated, so different orders produce trees of different sizes: a good order yields a
smaller, more compact tree.
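
To illustrate the point, here is a small sketch (not part of the original solution) that builds the tree for the Boolean function of Table 1 under a chosen variable order, consolidating any test whose two branches are identical, and reports the number of decision nodes per order.

```python
# f(x1, x2, x3) from Table 1, keyed by (x1, x2, x3).
F = {
    (0, 0, 0): 1, (0, 0, 1): 0, (0, 1, 0): 1, (0, 1, 1): 1,
    (1, 0, 0): 1, (1, 0, 1): 0, (1, 1, 0): 0, (1, 1, 1): 1,
}

def build(order, assigned=()):
    """Return a leaf value, or a tuple (variable, subtree_for_0, subtree_for_1)."""
    if len(assigned) == 3:
        point = [None, None, None]
        for var, val in zip(order, assigned):
            point[var] = val
        return F[tuple(point)]
    var = order[len(assigned)]
    low, high = build(order, assigned + (0,)), build(order, assigned + (1,))
    if low == high:                     # both branches agree: the test is redundant
        return low
    return (f'x{var + 1}', low, high)

def size(tree):
    """Number of decision nodes in the (possibly consolidated) tree."""
    return 0 if not isinstance(tree, tuple) else 1 + size(tree[1]) + size(tree[2])

for order in [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 1, 0), (2, 0, 1)]:
    tree = build(order)
    print([f'x{v + 1}' for v in order], size(tree), tree)
```

For example, the order x1, x2, x3 needs six decision nodes after consolidation, while the order x2, x3, x1 needs only four.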

Question 4

The National Credit Act, introduced in South Africa in 2007, places more responsibility on a bank to
determine whether a loan applicant will be able to afford the loan. Blue Bank has a table of information
on 14 loan applications it has received in the past.

Use the information in this table to construct a decision tree that will assist the bank in determining
the risk associated with a new loan application.

No.  Credit history  Debt  Collateral  Income      Risk

1    Bad             High  No          <R15K       High
2    Unknown         High  No          R15K-R35K   High
3    Unknown         Low   No          R15K-R35K   Medium
4    Unknown         Low   No          <R15K       High
5    Unknown         Low   No          >R35K       Low
6    Unknown         Low   Yes         >R35K       Low
7    Bad             Low   No          <R15K       High
8    Bad             Low   Yes         >R35K       Medium
9    Good            Low   No          >R35K       Low
10   Good            High  Yes         >R35K       Low
11   Good            High  No          <R15K       High
12   Good            High  No          R15K-R35K   Medium
13   Good            High  No          >R35K       Low
14   Bad             High  No          R15K-R35K   High

Table 2: Risk information table

In order to determine the information gain of the different attributes of a collection of data, the
calculation of entropy is important: the notion of information gain is defined in terms of entropy.
Entropy can be described as a measure of the impurity of an arbitrary collection of examples. Russell
& Norvig describe entropy as a measure of the uncertainty of a random variable, and mention that
the acquisition of information corresponds to a reduction in entropy: a random variable with only
one value has no uncertainty, so its entropy is defined as zero, and we gain no information by
observing its value.
If splitting a collection on an attribute A leaves every class split in the same proportions as before,
we learn nothing from A and its information gain is 0; if the subsets produced by A are pure (each
contains a single class), A gives the maximum possible gain. For a two-class collection the maximum
entropy value is 1; in general, for a collection with c classes the maximum is log2(c).
Formally, the entropy of a random variable V with values vk, each with probability P(vk), is defined
as:
H(V) = Σ_{k=1..n} P(vk) log2(1/P(vk)) = − Σ_{k=1..n} P(vk) log2 P(vk)    (1)

8
COS3751/203/2/2018

Maybe a more comprehensible way of defining the entropy of any collection S is as follows:
Entropy(S) = − Σ_{i=1..c} pi log2(pi)    (2)

This corresponds to what Russell & Norvig refer to as information content. If the possible answers
vi have probabilities P(vi ) then the information content I of the actual answer is given by:
I(P(v1), ..., P(vn)) = − Σ_{i=1..n} P(vi) log2 P(vi)    (3)

You may see the negative sign as a way of making the result non-negative: each pi is a probability
(0 < pi ≤ 1), so log2(pi) ≤ 0 and each term pi log2(pi) is at most zero.
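
As a minimal sketch of Equation (2), assuming the class counts of a collection are given directly (e.g. [5, 3, 6] for the set S = [5L, 3M, 6H] used below):

```python
from math import log2

def entropy(counts):
    """Entropy of a collection with the given per-class counts (Equation 2)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]   # a zero count contributes 0 to the sum
    return -sum(p * log2(p) for p in probs)

print(entropy([5, 3, 6]))   # ~1.531, the entropy of the full example set computed below
print(entropy([7, 7]))      # 1.0, an evenly split two-class collection
print(entropy([14]))        # 0.0, a pure collection
```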
In what follows we will use the ID3 algorithm to develop our decision tree.
Our collection S of 14 examples has four meaningful attributes (Credit History, Debt, Collateral,
Income) and a three-way classification (Low, Medium, High). The No. attribute is merely an identifier
(every example has a unique value), so it plays no role in the classification and can be ignored
(although we will show the No. attribute in further tables so that we know which examples we are
working with).
For purposes of clarity in the formulae, we will shorten the attribute labels and values as follows:

• Credit History = CH

• Debt = D

• Collateral = C

• Income = I

• Bad = B

• Unknown = U

• Good = G

• High = H

• Low = L

• <R15K = 15K

• R15K-R35K = 15K-35K

• >R35K = 35K

• Medium = M

We will also not repeat calculations at every step: we show the full set of calculations for the
first level of the tree; after that we only show the final result of each calculation.
Summarise the example set S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14} as S = [5L, 3M, 6H].
We start by calculating the entropy of our set of examples:

Entropy(S) = − Σ_{i=1..3} pi log2(pi)
           = −(5/14)log2(5/14) − (3/14)log2(3/14) − (6/14)log2(6/14)
           = −(5/14)(log2 5 − log2 14) − (3/14)(log2 3 − log2 14) − (6/14)(log2 6 − log2 14)
           = 0.531 + 0.476 + 0.524
           = 1.531    (maximum = log2 3 ≈ 1.585)

(On a calculator: log2 x = log x / log 2.)

We have to determine the information gain (IG) of the different attributes in order to select the best
choice for the root node. The information gain measures the expected reduction in entropy: the
higher the IG, the greater the expected reduction in entropy.
In what follows we calculate the information gain for each of the four attributes.
Credit History:
Values(CH) = {B, G, U}
S = [5L, 3M, 6H]
SB ← [0L, 1M, 3H]
SG ← [3L, 1M, 1H]
SU ← [2L, 1M, 2H]

Entropy(SB) = −(0/4)log2(0/4) − (1/4)log2(1/4) − (3/4)log2(3/4) = 0.811
Entropy(SG) = −(3/5)log2(3/5) − (1/5)log2(1/5) − (1/5)log2(1/5) = 1.371
Entropy(SU) = −(2/5)log2(2/5) − (1/5)log2(1/5) − (2/5)log2(2/5) = 1.522

(By convention a term with zero probability contributes nothing: 0 · log2 0 is taken as 0.)

Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|Sv| / |S|) Entropy(Sv)

Gain(S, CH) = Entropy(S) − (4/14)Entropy(SB) − (5/14)Entropy(SG) − (5/14)Entropy(SU)
            = 1.531 − (4/14)(0.811) − (5/14)(1.371) − (5/14)(1.522)
            = 0.266
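
The remaining three attributes are handled in exactly the same way below. As a cross-check, here is a minimal sketch (an assumed Python encoding of Table 2, not part of the original solution) that reproduces the gain of every attribute:

```python
from math import log2
from collections import Counter

EXAMPLES = [
    # (Credit history, Debt, Collateral, Income, Risk) -- rows 1 to 14 of Table 2
    ('Bad', 'High', 'No', '<R15K', 'High'),
    ('Unknown', 'High', 'No', 'R15K-R35K', 'High'),
    ('Unknown', 'Low', 'No', 'R15K-R35K', 'Medium'),
    ('Unknown', 'Low', 'No', '<R15K', 'High'),
    ('Unknown', 'Low', 'No', '>R35K', 'Low'),
    ('Unknown', 'Low', 'Yes', '>R35K', 'Low'),
    ('Bad', 'Low', 'No', '<R15K', 'High'),
    ('Bad', 'Low', 'Yes', '>R35K', 'Medium'),
    ('Good', 'Low', 'No', '>R35K', 'Low'),
    ('Good', 'High', 'Yes', '>R35K', 'Low'),
    ('Good', 'High', 'No', '<R15K', 'High'),
    ('Good', 'High', 'No', 'R15K-R35K', 'Medium'),
    ('Good', 'High', 'No', '>R35K', 'Low'),
    ('Bad', 'High', 'No', 'R15K-R35K', 'High'),
]
ATTRS = {'CH': 0, 'D': 1, 'C': 2, 'I': 3}   # column index of each attribute; column 4 is Risk

def entropy(rows):
    """Entropy of the Risk labels in `rows` (Equation 2)."""
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(r[-1] for r in rows).values())

def gain(rows, attr):
    """Gain(S, A): entropy of S minus the weighted entropy of the subsets split on A."""
    i = ATTRS[attr]
    n = len(rows)
    remainder = 0.0
    for value in {r[i] for r in rows}:
        subset = [r for r in rows if r[i] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(rows) - remainder

for attr in ATTRS:
    print(attr, round(gain(EXAMPLES, attr), 3))   # CH 0.266, D 0.063, C 0.207, I 0.967
```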


Debt:
Values(D) = {L, H}
S = [5L, 3M, 6H]
SL ← [3L, 2M, 2H]
SH ← [2L, 1M, 4H]

Entropy(SL) = −(3/7)log2(3/7) − (2/7)log2(2/7) − (2/7)log2(2/7) = 1.557
Entropy(SH) = −(2/7)log2(2/7) − (1/7)log2(1/7) − (4/7)log2(4/7) = 1.379

Gain(S, D) = Entropy(S) − (7/14)Entropy(SL) − (7/14)Entropy(SH)
           = 1.531 − (7/14)(1.557) − (7/14)(1.379)
           = 0.063

Collateral:
Values(C) = {Y, N}
S = [5L, 3M, 6H]
SY ← [2L, 1M, 0H]
SN ← [3L, 2M, 6H]

Entropy(SY) = −(2/3)log2(2/3) − (1/3)log2(1/3) − (0/3)log2(0/3) = 0.918
Entropy(SN) = −(3/11)log2(3/11) − (2/11)log2(2/11) − (6/11)log2(6/11) = 1.435

Gain(S, C) = Entropy(S) − (3/14)Entropy(SY) − (11/14)Entropy(SN)
           = 1.531 − (3/14)(0.918) − (11/14)(1.435)
           = 0.207

Income:
Values(I) = {15K, 15K-35K, 35K}
S = [5L, 3M, 6H]
S15K ← [0L, 0M, 4H]
S15K-35K ← [0L, 2M, 2H]
S35K ← [5L, 1M, 0H]

Entropy(S15K) = −(0/4)log2(0/4) − (0/4)log2(0/4) − (4/4)log2(4/4) = 0
Entropy(S15K-35K) = −(0/4)log2(0/4) − (2/4)log2(2/4) − (2/4)log2(2/4) = 1
Entropy(S35K) = −(5/6)log2(5/6) − (1/6)log2(1/6) − (0/6)log2(0/6) = 0.650

Gain(S, I) = Entropy(S) − (4/14)Entropy(S15K) − (4/14)Entropy(S15K-35K) − (6/14)Entropy(S35K)
           = 1.531 − (4/14)(0) − (4/14)(1) − (6/14)(0.650)
           = 0.967

Attribute Income provides the highest information gain (0.967), i.e. the best prediction for our target
attribute, Risk. Income becomes the root node of our decision tree.

Income
    <R15K: (to be expanded)
    R15K-R35K: (to be expanded)
    >R35K: (to be expanded)

The ID3 algorithm now performs a recursion with the three subsets of our examples, based on the
three possible values of the Income attribute (<R15K, R15K-R35K, >R35K).
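
As a compact, illustrative sketch of this ID3-style recursion (not the textbook pseudocode verbatim), assuming Table 2 is encoded as the whitespace-separated rows below, the following reproduces the tree developed in the remainder of this question:

```python
from math import log2
from collections import Counter
from pprint import pprint

ROWS = """Bad High No <R15K High
Unknown High No R15K-R35K High
Unknown Low No R15K-R35K Medium
Unknown Low No <R15K High
Unknown Low No >R35K Low
Unknown Low Yes >R35K Low
Bad Low No <R15K High
Bad Low Yes >R35K Medium
Good Low No >R35K Low
Good High Yes >R35K Low
Good High No <R15K High
Good High No R15K-R35K Medium
Good High No >R35K Low
Bad High No R15K-R35K High"""
EXAMPLES = [tuple(line.split()) for line in ROWS.splitlines()]
ATTRS = ['Credit History', 'Debt', 'Collateral', 'Income']   # columns 0..3; column 4 is Risk

def entropy(rows):
    n = len(rows)
    return -sum(c / n * log2(c / n) for c in Counter(r[-1] for r in rows).values())

def gain(rows, i):
    """Information gain of splitting `rows` on attribute column i."""
    n = len(rows)
    remainder = sum(
        len(sub) / n * entropy(sub)
        for v in {r[i] for r in rows}
        for sub in [[r for r in rows if r[i] == v]]
    )
    return entropy(rows) - remainder

def id3(rows, attrs):
    labels = {r[-1] for r in rows}
    if len(labels) == 1:          # pure subset: leaf node
        return labels.pop()
    if not attrs:                 # no attributes left: majority label
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    best = max(attrs, key=lambda i: gain(rows, i))
    rest = [a for a in attrs if a != best]
    return {(ATTRS[best], v): id3([r for r in rows if r[best] == v], rest)
            for v in sorted({r[best] for r in rows})}

pprint(id3(EXAMPLES, [0, 1, 2, 3]))
```

The nested dictionary printed by the sketch has Income at the root, Credit History below the R15K-R35K and >R35K branches, and Debt below Credit History = Unknown, matching the tree built step by step below.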
The first subset, corresponding to Income = {< R15K }, is:

No. Credit History Debt Collateral Income Risk


1 Bad High No < R15k High
4 Unknown Low No < R15k High
7 Bad Low No < R15k High
11 Good High No < R15k High

We notice that the target attribute is the same for all four of our examples, i.e. all four examples
produce the same target value, High. We have our first leaf node, labelled High.


Income
    <R15K: High
    R15K-R35K: (to be expanded)
    >R35K: (to be expanded)

The second subset, corresponding to Income = {R15K − R35K }, is:

No. Credit History Debt Collateral Income Risk


2 Unknown High No R15K-R35K High
3 Unknown Low No R15K-R35K Medium
12 Good High No R15K-R35K Medium
14 Bad High No R15K-R35K High

Our subset S15K-35K = {2, 3, 12, 14} = [2M, 2H] of 4 examples has three meaningful attributes
(Credit History, Debt, Collateral) and a binary classification (Medium, High).
We start by calculating the Entropy for this subset:
Entropy(S15K-35K) = − Σ_{i=1..2} pi log2(pi)
                  = −(2/4)log2(2/4) − (2/4)log2(2/4)
                  = 0.5 + 0.5
                  = 1

Calculate the information gain for the attributes of the subset (entropy for each attribute is calculated
in the same fashion as above).
Credit History:

Gain(S15K-35K, CH) = Entropy(S15K-35K) − (1/4)Entropy(SB) − (1/4)Entropy(SG) − (2/4)Entropy(SU)
                   = 1 − (1/4)(0) − (1/4)(0) − (2/4)(1)
                   = 0.5

Debt:

Gain(S15K-35K, D) = Entropy(S15K-35K) − (1/4)Entropy(SL) − (3/4)Entropy(SH)
                  = 1 − (1/4)(0) − (3/4)(0.918)
                  = 0.312

Collateral:

Gain(S15K-35K, C) = Entropy(S15K-35K) − (0/4)Entropy(SY) − (4/4)Entropy(SN)
                  = 1 − 0 − (4/4)(1)
                  = 0
Attribute Credit History provides the highest Information Gain in this subset (0.5). Thus it becomes
our next decision node.
Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: (to be expanded)
        Good: (to be expanded)
        Unknown: (to be expanded)
    >R35K: (to be expanded)

The third subset, corresponding to Income = {> R35k}, is then used.

No. Credit History Debt Collateral Income Risk


5 Unknown Low No >R35K Low
6 Unknown Low Yes >R35K Low
8 Bad Low Yes >R35K Medium
9 Good Low No >R35K Low
10 Good High Yes >R35K Low
13 Good High No >R35K Low

Our subset S35K = {5, 6, 8, 9, 10, 13} = [5L, 1M] of 6 examples has three meaningful attributes
(Credit History, Debt, Collateral) and a binary classification (Low, Medium).


Start by calculating the Entropy for this subset:

Entropy(S35K) = − Σ_{i=1..2} pi log2(pi)
              = −(5/6)log2(5/6) − (1/6)log2(1/6)
              = 0.65


We now proceed to calculate the information gain for each of the three attributes in the subset:
Credit History:
Values(CH) = {B, G, U}

S35K = [5L, 1M]
SB ← [0L, 1M]
SG ← [3L, 0M]
SU ← [2L, 0M]

Entropy(SB) = −(0/1)log2(0/1) − (1/1)log2(1/1) = 0
Entropy(SG) = −(3/3)log2(3/3) − (0/3)log2(0/3) = 0
Entropy(SU) = −(2/2)log2(2/2) − (0/2)log2(0/2) = 0

Gain(S35K, CH) = Entropy(S35K) − (1/6)Entropy(SB) − (3/6)Entropy(SG) − (2/6)Entropy(SU)
               = 0.65 − (1/6)(0) − (3/6)(0) − (2/6)(0)
               = 0.65

Debt:
Values(D) = {L, H}

S35K = [5L, 1M]
SL ← [3L, 1M]
SH ← [2L, 0M]

Entropy(SL) = −(3/4)log2(3/4) − (1/4)log2(1/4) = 0.811
Entropy(SH) = −(2/2)log2(2/2) − (0/2)log2(0/2) = 0

Gain(S35K, D) = Entropy(S35K) − (4/6)Entropy(SL) − (2/6)Entropy(SH)
              = 0.65 − (4/6)(0.811) − (2/6)(0)
              = 0.109

Collateral:
Values(C) = {Y, N}

S35K = [5L, 1M]
SY ← [2L, 1M]
SN ← [3L, 0M]

Entropy(SY) = −(2/3)log2(2/3) − (1/3)log2(1/3) = 0.918
Entropy(SN) = −(3/3)log2(3/3) − (0/3)log2(0/3) = 0

Gain(S35K, C) = Entropy(S35K) − (3/6)Entropy(SY) − (3/6)Entropy(SN)
              = 0.65 − (3/6)(0.918) − (3/6)(0)
              = 0.191

In summary:

Gain(S35K, CH) = 0.65
Gain(S35K, D) = 0.109
Gain(S35K, C) = 0.191

Attribute Credit History provides the highest Information Gain in this subset and becomes the
decision node on this branch of the decision tree.


Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: (to be expanded)
        Good: (to be expanded)
        Unknown: (to be expanded)
    >R35K: Credit History
        Bad: (to be expanded)
        Good: (to be expanded)
        Unknown: (to be expanded)

The algorithm continues with recursion to the next level. The subset that corresponds to
Income = R15K-R35K and Credit History = Bad contains a single example (14).

No. Credit History Debt Collateral Income Risk


14 Bad High No R15K-R35K High

We have a single example, hence another leaf node: High.

Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: High
        Good: (to be expanded)
        Unknown: (to be expanded)
    >R35K: Credit History
        Bad: (to be expanded)
        Good: (to be expanded)
        Unknown: (to be expanded)

The subset of Credit History that corresponds to the Income = R15K-R35K and Credit History =
Good is also a single example (12).

No.  Credit history  Debt  Collateral  Income      Risk

12   Good            High  No          R15K-R35K   Medium

We have a single example, hence another leaf node: Medium.

Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: High
        Good: Medium
        Unknown: (to be expanded)
    >R35K: Credit History
        Bad: (to be expanded)
        Good: (to be expanded)
        Unknown: (to be expanded)

The subset of Credit History that corresponds to the Income = R15K-R35K and Credit History =
Unknown has two examples (2,3).

No. Credit History Debt Collateral Income Risk


2 Unknown High No R15K-R35K High
3 Unknown Low No R15K-R35K Medium

Our subset SUnknown = {2, 3} = [1M, 1H] of 2 examples has two meaningful attributes (Debt,
Collateral) and a binary classification (Medium, High; shortened for clarity to M, H).

We again calculate the entropy for the subset:

Entropy(SUnknown) = −(1/2)log2(1/2) − (1/2)log2(1/2)
                  = 1

Debt:
Values(D) = {L, H}

SUnknown = [1M, 1H]
SL ← [1M, 0H]
SH ← [0M, 1H]

Entropy(SL) = −(1/1)log2(1/1) − (0/1)log2(0/1) = 0
Entropy(SH) = −(0/1)log2(0/1) − (1/1)log2(1/1) = 0

Gain(SUnknown, D) = Entropy(SUnknown) − (1/2)Entropy(SL) − (1/2)Entropy(SH)
                  = 1 − (1/2)(0) − (1/2)(0)
                  = 1

Collateral:
Values(C) = {N}

SUnknown = [1M, 1H]
SN ← [1M, 1H]

Entropy(SN) = −(1/2)log2(1/2) − (1/2)log2(1/2) = 1

Gain(SUnknown, C) = Entropy(SUnknown) − (2/2)Entropy(SN)
                  = 1 − (2/2)(1)
                  = 0

In summary:

Gain(SUnknown, D) = 1
Gain(SUnknown, C) = 0

Debt gives us perfect information gain, and thus becomes the next decision node.

Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: High
        Good: Medium
        Unknown: Debt
            High: (to be expanded)
            Low: (to be expanded)
    >R35K: Credit History
        Bad: (to be expanded)
        Good: (to be expanded)
        Unknown: (to be expanded)

We return to Credit History (we first finish all the nodes on the same level). The subset that
corresponds to Income = >R35K and Credit History = Bad contains only 1 example (8).

No. Credit History Debt Collateral Income Risk


8 Bad Low Yes >R35K Medium

We thus have another leaf node: Medium.

Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: High
        Good: Medium
        Unknown: Debt
            High: (to be expanded)
            Low: (to be expanded)
    >R35K: Credit History
        Bad: Medium
        Good: (to be expanded)
        Unknown: (to be expanded)


The subset of Credit History that corresponds to Income = >R35K and Credit History = Good
provides three examples (9,10,13):

No. Credit History Debt Collateral Income Risk


9 Good Low No >R35K Low
10 Good High Yes >R35K Low
13 Good High No >R35K Low

Our subset of 3 examples has two meaningful attributes (Debt, Collateral) and a single
classification (Low). Hence we again have a leaf node: Low.

Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: High
        Good: Medium
        Unknown: Debt
            High: (to be expanded)
            Low: (to be expanded)
    >R35K: Credit History
        Bad: Medium
        Good: Low
        Unknown: (to be expanded)

The subset that corresponds to Income = >R35K and Credit History = Unknown provides
2 examples (5, 6).

No. Credit History Debt Collateral Income Risk


5 Unknown Low No >R35K Low
6 Unknown Low Yes >R35K Low

Our subset of 2 examples has two meaningful attributes (Debt, Collateral) and a single
classification (Low). Hence we again have a leaf node: Low.

Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: High
        Good: Medium
        Unknown: Debt
            High: (to be expanded)
            Low: (to be expanded)
    >R35K: Credit History
        Bad: Medium
        Good: Low
        Unknown: Low

We can now do the last set of calculations. With Debt = High we have only one example left (2),
which makes this a leaf node.

No. Credit History Debt Collateral Income Risk


2 Unknown High No R15K-R35K High

Similarly, when Debt = Low we have one example left (3). This creates the final leaf node.

No. Credit History Debt Collateral Income Risk


3 Unknown Low No R15K-R35K Medium

These last two steps complete our decision tree (note that Collateral plays no role in the decision
based on this decision tree).


Income
    <R15K: High
    R15K-R35K: Credit History
        Bad: High
        Good: Medium
        Unknown: Debt
            High: High
            Low: Medium
    >R35K: Credit History
        Bad: Medium
        Good: Low
        Unknown: Low
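
As a usage sketch (a hypothetical helper, not part of the original solution), the completed tree can be written as a classifier, assuming attribute values are passed as the strings used in Table 2:

```python
def assess_risk(credit_history, debt, collateral, income):
    """Risk read off the final decision tree; note that collateral is never tested."""
    if income == '<R15K':
        return 'High'
    if income == 'R15K-R35K':
        if credit_history == 'Bad':
            return 'High'
        if credit_history == 'Good':
            return 'Medium'
        return 'High' if debt == 'High' else 'Medium'   # credit_history == 'Unknown'
    # income == '>R35K'
    return 'Medium' if credit_history == 'Bad' else 'Low'

print(assess_risk('Unknown', 'Low', 'No', 'R15K-R35K'))   # Medium (example 3)
print(assess_risk('Bad', 'Low', 'Yes', '>R35K'))          # Medium (example 8)
```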

Copyright © UNISA 2018 (v2018.2.1)

