Artificial Intelligence with Machine Learning in Java
4-2
Information Entropy

Copyright © 2020, Oracle and/or its affiliates. All rights reserved.

Objectives
• This lesson covers the following objectives:
−Define information entropy
−Understand variance
−Calculate information entropy
−Understand information entropy

Information Entropy
• Information entropy is a concept introduced by the
mathematician Claude Shannon in 1948
• The idea originated from the concept of
entropy (disorder) in statistical thermodynamics, and
refers to uncertainty in data
• Data with a high level of uncertainty (randomness) will
contain more information that can be used

Information Entropy
• Example:
−If given new data about a topic, then there is new
information
−This means that something is learned, and the information
would have high entropy
−If given known data, then no learning takes place
−This information would have low (or zero) entropy

Information Entropy
• Information entropy lets us quantify which split in a
decision tree is best by looking at the variance in the
data
• The formulas used in the following examples look
complex, but they do not require knowledge of
high-level mathematics

Variance
• Variance measures how far a data set is spread out
• The technical definition is
−“The average of the squared differences from the mean”
• Variance will not be calculated by hand here; instead, a
few example data sets are used to illustrate the idea
• There is also an online calculator available:
−http://www.alcula.com/calculators/statistics/variance/

Variance
• Examples
−5,5,5,5,5 has a variance of 0, which means all the numbers are
the same. They do not vary
−5,5,5,5,6 has a variance of 0.16, which means the numbers are
very close. They vary little
−5,5,5,5,2000 has a variance of 636804, which means there is
a much larger spread in the numbers
• A high variance means we do not know much about the
data values; a narrow variance means we can be more
confident about them (see the sketch below)
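As a concrete check of these numbers, here is a minimal Java sketch of population variance; the class and method names are illustrative choices, not part of the course materials:

public class VarianceDemo {
    // Population variance: the average of the squared differences from the mean
    static double variance(double[] data) {
        double mean = 0.0;
        for (double v : data) {
            mean += v;
        }
        mean /= data.length;
        double sumOfSquares = 0.0;
        for (double v : data) {
            sumOfSquares += (v - mean) * (v - mean);
        }
        return sumOfSquares / data.length;
    }

    public static void main(String[] args) {
        System.out.println(variance(new double[]{5, 5, 5, 5, 5}));    // 0.0
        System.out.println(variance(new double[]{5, 5, 5, 5, 6}));    // approx. 0.16
        System.out.println(variance(new double[]{5, 5, 5, 5, 2000})); // 636804.0
    }
}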

Information Entropy
• Information entropy is measured as:
−$\text{Entropy}(S) = -\sum_{x=1}^{n} p(x) \log_2 p(x)$
• This reads as:
−Entropy equals the negative sum over x of p of x multiplied by
log base 2 of p of x
• This entropy equation will return how much
information to expect from some action
• The higher the number, the more information we
obtain
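A minimal Java sketch of this formula follows; the class name EntropyDemo, the method name entropy, and the use of a double[] of class probabilities are illustrative assumptions, not part of the course materials:

public class EntropyDemo {
    // Entropy(S) = -(sum over x of p(x) * log2(p(x)))
    static double entropy(double[] probabilities) {
        double sum = 0.0;
        for (double p : probabilities) {
            if (p > 0) { // by convention, a class with probability 0 contributes nothing
                sum -= p * (Math.log(p) / Math.log(2)); // log2 via change of base
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // A coin weighted 30/70 yields less than 1 bit of entropy
        System.out.println(entropy(new double[]{0.3, 0.7})); // approx. 0.881
    }
}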

Entropy Example
• Consider this example:
−There is a bag of 10 green marbles
−Calculate the entropy of selecting a green marble from the
bag
−P(x) is the probability of picking a green marble from the total
number of marbles
−In this example there are only green marbles, so the
probability will be 10/10 = 1
• Use p(x) = 1 in the entropy equation

Entropy Example
• $\text{Entropy}(S) = -\sum_{x=1}^{n} p(x) \log_2 p(x)$
• In this example n = 1, as there is one color of marble, so p(x) = 1
−$\text{Entropy}(S) = -p(x) \log_2 p(x)$
−$\text{Entropy}(S) = -1 \cdot \log_2 1$
−$\text{Entropy}(S) = -1 \cdot 0$
−$\text{Entropy}(S) = 0$
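Using the illustrative EntropyDemo.entropy method sketched earlier, this case is a one-line check:

System.out.println(EntropyDemo.entropy(new double[]{1.0})); // prints 0.0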

Entropy Example

• $\text{Entropy}(S) = 0$
• We obtain an entropy of 0
• This is the lowest possible entropy: we gain no
information, which means this would not be a good
training set

Entropy Example 2
• There is a bag of 5 green marbles and 5 red marbles
• $\text{Entropy}(S) = -\sum_{x=1}^{n} p(x) \log_2 p(x)$
• In this example, given two types of marbles, n = 2
• It is necessary to apply the equation to both marble colors
• Expanding the sum for a positive (+) class and a negative (−) class gives:

−$\text{Entropy}(S) = -p(+)\log_2 p(+) - p(-)\log_2 p(-)$

Entropy Example 2
• $\text{Entropy}(S) = -p(+)\log_2 p(+) - p(-)\log_2 p(-)$
• For a green marble, this would be the positive value,
and for the red, the negative value
• p(x) will be the probability of that marble color
• We have 5 green out of a total of 10, so the probability
is 5/10

Entropy Example 2
• $\text{Entropy}(S) = -p(\text{green})\log_2 p(\text{green}) - p(\text{red})\log_2 p(\text{red})$
• $\text{Entropy}(S) = -\frac{5}{10}\log_2\frac{5}{10} - \frac{5}{10}\log_2\frac{5}{10}$
• $\text{Entropy}(S) = -(0.5 \cdot -1) - (0.5 \cdot -1)$
• $\text{Entropy}(S) = 0.5 + 0.5$
• $\text{Entropy}(S) = 1$

• This is a higher entropy value, and would be a better
training set for learning (a quick check appears below)
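Using the same illustrative EntropyDemo.entropy helper, this result can be confirmed in one line:

System.out.println(EntropyDemo.entropy(new double[]{0.5, 0.5})); // prints 1.0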

Calculating Logarithms
• A calculator may be able to do the following calculation directly:
−$\log_2 \frac{5}{10}$
• Another option is the change-of-base formula:
−$\log_2 \frac{5}{10} = \frac{\log \frac{5}{10}}{\log 2}$
• The result is -1
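The same change-of-base trick is needed in Java, since java.lang.Math provides natural and base-10 logarithms but no base-2 logarithm; a minimal sketch:

double result = Math.log(5.0 / 10.0) / Math.log(2.0);
System.out.println(result); // prints -1.0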

Entropy Example 3
• Flipping a coin gives 2 outcomes
• Call these heads or tails
• This is referred to as binary output, because it has 2
states
• The probability of a head or a tail is 0.5
• The result is entropy = 1

• $\text{Entropy}(S) = -\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2} = 1$

Entropy Example 3
• What if there is a coin with 2 heads?
• The probability of heads is 1, and entropy = 0
• If the coin has 2 tails, the probability of heads = 0, and
entropy = 0
• Consider coins that are weighted to give a 10% chance
of heads, then a 20% chance, etc.
• The following graph shows entropy values of these
probabilities

Entropy of Coin Toss with Weighted Coins
[Graph: "Entropy On Coin Toss". Entropy (y-axis) plotted against the probability of heads (x-axis); entropy is 0 at probabilities 0 and 1, and peaks at 1 when the probability is 0.5]
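Points on this curve can also be computed with the illustrative EntropyDemo.entropy helper from earlier:

System.out.println(EntropyDemo.entropy(new double[]{0.1, 0.9})); // approx. 0.469
System.out.println(EntropyDemo.entropy(new double[]{0.2, 0.8})); // approx. 0.722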

Entropy of Coin Toss
• So what can be deduced from the coin toss?
• If there is a 2-sided coin with the same side on both,
then the entropy = 0 because the result is already
known
• There is no information to be gained from this
• The highest entropy is 1, which occurs with a fair,
2-sided coin
• This means each toss gives us a full bit of new
information, because the outcome cannot be predicted
in advance

How Much Information
• In the coin toss, there is a maximum of 1 bit of
information
• In other examples this can be greater than 1
• In the coin toss, there are 2 options:

−$\text{Amount of Information} = \log_2 2 = 1 \text{ bit}$

Bits of Information
• With a six-sided die, there are 6 possible results:
−$\text{Amount of Information} = \log_2 6 \approx 2.58 \text{ bits}$
• Rolling a six-sided die returns a maximum of 2.58 bits
of information
• This is the goal of using entropy: to choose the
attribute that will give the most information (a quick
check appears in the sketch below)
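Both figures can be verified with a minimal Java sketch (the variable names are illustrative):

double coinBits = Math.log(2) / Math.log(2); // 1.0 bit for a fair coin toss
double dieBits = Math.log(6) / Math.log(2);  // approx. 2.585 bits for a six-sided die
System.out.println(coinBits + " bit, " + dieBits + " bits");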

Next
• In the next lesson we work through a full example of
the ID3 algorithm
• Make yourself comfortable!

Summary
• In this lesson, you should have learned how to:
−Define information entropy
−Understand variance
−Calculate information entropy
−Understand information entropy
