
IS328 Data Mining

Naïve Bayes Classification


PART II

Tutorial 9 Exercises

Q1 Prediction with all nominal data


The Tennis Game Data
ID  Outlook   Temperature  Humidity  Windy  Class
1   sunny     hot          high      false  No
2   sunny     hot          high      true   No
3   overcast  hot          high      false  Yes
4   rainy     mild         high      false  Yes
5   rainy     cool         normal    false  Yes
6   rainy     cool         normal    true   No
7   overcast  cool         normal    true   Yes
8   sunny     mild         high      false  No
9   sunny     cool         normal    false  Yes
10  rainy     mild         normal    false  Yes
11  sunny     mild         normal    true   Yes
12  overcast  mild         high      true   Yes
13  overcast  hot          normal    false  Yes
14  rainy     mild         high      true   No

Predict the class for the following day:


Outlook = Sunny, Temperature = Cool, Humidity = High, Windy = False
Answers:

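Counts per class:

Outlook    Yes  No    Temperature  Yes  No    Humidity  Yes  No    Windy  Yes  No    Play   Yes    No
Sunny        2   3    Hot            2   2    High        3   4    False    6   2             9     5
Overcast     4   0    Mild           4   2    Normal      6   1    True     3   3
Rainy        3   2    Cool           3   1

Relative frequencies (count / class total):

Outlook    Yes  No    Temperature  Yes  No    Humidity  Yes  No    Windy  Yes  No    Play   Yes    No
Sunny      2/9 3/5    Hot          2/9 2/5    High      3/9 4/5    False  6/9 2/5          9/14  5/14
Overcast   4/9 0/5    Mild         4/9 2/5    Normal    6/9 1/5    True   3/9 3/5
Rainy      3/9 2/5    Cool         3/9 1/5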
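
The counts in this table can be reproduced by tallying the 14 rows of the tennis data. The following is a minimal Python sketch of that tally; the names `ROWS`, `ATTRIBUTES` and `count_table` are illustrative and not part of the tutorial.

```python
from collections import Counter, defaultdict

# The 14 rows of the nominal tennis data:
# (outlook, temperature, humidity, windy, class).
ROWS = [
    ("sunny", "hot", "high", "false", "No"),
    ("sunny", "hot", "high", "true", "No"),
    ("overcast", "hot", "high", "false", "Yes"),
    ("rainy", "mild", "high", "false", "Yes"),
    ("rainy", "cool", "normal", "false", "Yes"),
    ("rainy", "cool", "normal", "true", "No"),
    ("overcast", "cool", "normal", "true", "Yes"),
    ("sunny", "mild", "high", "false", "No"),
    ("sunny", "cool", "normal", "false", "Yes"),
    ("rainy", "mild", "normal", "false", "Yes"),
    ("sunny", "mild", "normal", "true", "Yes"),
    ("overcast", "mild", "high", "true", "Yes"),
    ("overcast", "hot", "normal", "false", "Yes"),
    ("rainy", "mild", "high", "true", "No"),
]
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]


def count_table(rows):
    """Return class totals and, per (attribute, value), the count in each class."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)
    for row in rows:
        label = row[-1]
        # zip stops after the four attributes, so the class column is skipped.
        for name, value in zip(ATTRIBUTES, row):
            value_counts[(name, value)][label] += 1
    return class_counts, value_counts


class_counts, value_counts = count_table(ROWS)
for (name, value), counts in sorted(value_counts.items()):
    print(f"{name:12s}{value:9s} "
          f"Yes {counts['Yes']}/{class_counts['Yes']}  "
          f"No {counts['No']}/{class_counts['No']}")
```

Each printed fraction is the relative frequency of an attribute value within a class, which Naive Bayes uses as the estimate of P(value | class).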

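Each likelihood is the class prior multiplied by the conditional probabilities of Sunny, Cool, High and Windy = False, in that order.

Likelihood of Yes = 9/14 x 2/9 x 3/9 x 3/9 x 6/9
                  = 0.0106

Likelihood of No = 5/14 x 3/5 x 1/5 x 4/5 x 2/5
                 = 0.0137

Since 0.0137 > 0.0106, the prediction is No.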
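
As a check on the arithmetic, the sketch below multiplies the prior by the four conditional probabilities for the query as stated (Sunny, Cool, High, Windy = False). The fractions are counted from the 14 data rows; the names `PRIOR`, `COND` and `likelihood` are illustrative assumptions, not part of the tutorial.

```python
from fractions import Fraction as F

# Class priors: 9 of the 14 days are Yes, 5 are No.
PRIOR = {"Yes": F(9, 14), "No": F(5, 14)}

# P(value | class) for the query values, counted from the data, in the order
# outlook = sunny, temperature = cool, humidity = high, windy = false.
COND = {
    "Yes": [F(2, 9), F(3, 9), F(3, 9), F(6, 9)],
    "No": [F(3, 5), F(1, 5), F(4, 5), F(2, 5)],
}


def likelihood(label):
    """Prior times the product of the conditional probabilities for one class."""
    result = PRIOR[label]
    for p in COND[label]:
        result *= p
    return result


scores = {label: likelihood(label) for label in PRIOR}
total = sum(scores.values())
for label, score in scores.items():
    print(f"{label}: likelihood = {float(score):.4f}, "
          f"normalised = {float(score / total):.3f}")

prediction = max(scores, key=scores.get)
print("prediction:", prediction)
```

Dividing each likelihood by their sum turns the two scores into probabilities that add up to 1; the class with the larger likelihood is the prediction either way.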

Q2 Prediction with some numerical data

The Naive Bayes Classifier for Data Sets with Numerical Attribute Values
• A common way to handle numerical attributes is to assume that, within each class, the values of each numerical attribute follow a normal distribution.

The Tennis Game Data

ID  Outlook   Temperature  Humidity  Windy  Class
1   sunny     85           85        false  No
2   sunny     80           90        true   No
3   overcast  83           86        false  Yes
4   rainy     70           96        false  Yes
5   rainy     68           80        false  Yes
6   rainy     65           70        true   No
7   overcast  64           65        true   Yes
8   sunny     72           95        false  No
9   sunny     69           70        false  Yes
10  rainy     75           80        false  Yes
11  sunny     75           70        true   Yes
12  overcast  72           90        true   Yes
13  overcast  81           75        false  Yes
14  rainy     71           91        true   No

Predict the class for the following day:


Outlook = Sunny, Temperature = 66, Humidity = 86, Windy = False
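For a numerical attribute A_i and class c_j, the conditional probability is taken from the normal density:

$$P(A_i = a \mid c_j) = \frac{1}{\sqrt{2\pi}\,\sigma_{ij}} \exp\left(-\frac{(a - \mu_{ij})^2}{2\sigma_{ij}^2}\right)$$

where \mu_{ij} and \sigma_{ij} are the mean and standard deviation of attribute A_i over the training rows of class c_j.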
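
The normal density formula translates directly into code. The sketch below is a minimal illustration; the function name `gaussian_density` is an assumption, and the example call uses the Temperature mean (73) and standard deviation (6.2) that the handout lists for class Yes.

```python
import math


def gaussian_density(a, mean, std):
    """Normal density with the given mean and standard deviation, evaluated at a."""
    coefficient = 1.0 / (math.sqrt(2.0 * math.pi) * std)
    exponent = -((a - mean) ** 2) / (2.0 * std ** 2)
    return coefficient * math.exp(exponent)


# Temperature = 66 given class Yes (mean 73, standard deviation 6.2).
ty = gaussian_density(66, mean=73, std=6.2)
print(f"{ty:.4f}")
```

The returned value is a density rather than a true probability, but Naive Bayes only compares the products across classes, so it can be used in place of P(A_i = a | c_j).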

Play       Yes    No
Count        9     5
Prior     9/14  5/14

Outlook    Yes  No    Windy  Yes  No
Sunny        2   3    false    6   2
Overcast     4   0    true     3   3
Rainy        3   2

           Temperature      Humidity
           Yes     No       Yes     No
Values      83     85        86     85
            70     80        96     90
            68     65        80     70
            64     72        65     95
            69     71        70     91
            75               80
            75               70
            72               90
            81               75

Mean      73.0   74.6      79.1   86.2
Std Dev    6.2    7.9      10.2    9.7
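
The per-class means and standard deviations can be recomputed from the 14 data rows. The sketch below assumes the sample standard deviation (n - 1 divisor), which is what `statistics.stdev` computes and which reproduces the 6.2 and 7.9 listed for Temperature; the names `ROWS` and `class_stats` are illustrative.

```python
from statistics import mean, stdev

# (temperature, humidity, class) for the 14 rows of the numerical tennis data.
ROWS = [
    (85, 85, "No"), (80, 90, "No"), (83, 86, "Yes"), (70, 96, "Yes"),
    (68, 80, "Yes"), (65, 70, "No"), (64, 65, "Yes"), (72, 95, "No"),
    (69, 70, "Yes"), (75, 80, "Yes"), (75, 70, "Yes"), (72, 90, "Yes"),
    (81, 75, "Yes"), (71, 91, "No"),
]


def class_stats(rows, column, label):
    """Mean and sample standard deviation of one column within one class."""
    values = [row[column] for row in rows if row[-1] == label]
    return mean(values), stdev(values)


stats = {}
for name, column in (("temperature", 0), ("humidity", 1)):
    for label in ("Yes", "No"):
        stats[(name, label)] = class_stats(ROWS, column, label)
        m, s = stats[(name, label)]
        print(f"{name:12s}{label:4s} mean = {m:.1f}  std dev = {s:.1f}")
```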


Use the formula to calculate the following:

TY = P(Temperature = 66 | Yes) = 0.0340
TN = P(Temperature = 66 | No)
HY = P(Humidity = 86 | Yes)
HN = P(Humidity = 86 | No)

LY = Likelihood of Yes = 9/14 x 2/9 x TY x HY x 6/9
LN = Likelihood of No = 5/14 x 3/5 x TN x HN x 2/5

If LY > LN, then the prediction is Yes; otherwise the prediction is No.
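
The whole calculation can be sketched end to end in Python for the query as stated (Sunny, 66, 86, Windy = False). This is an illustrative sketch, not the tutorial's own solution: the helper names are assumptions, the nominal factors are relative frequencies counted from the rows, and the numerical factors use unrounded per-class means and sample standard deviations, so intermediate values can differ from hand calculations in the last digit.

```python
import math
from statistics import mean, stdev

# (outlook, temperature, humidity, windy, class) for the 14 numerical rows.
ROWS = [
    ("sunny", 85, 85, "false", "No"),
    ("sunny", 80, 90, "true", "No"),
    ("overcast", 83, 86, "false", "Yes"),
    ("rainy", 70, 96, "false", "Yes"),
    ("rainy", 68, 80, "false", "Yes"),
    ("rainy", 65, 70, "true", "No"),
    ("overcast", 64, 65, "true", "Yes"),
    ("sunny", 72, 95, "false", "No"),
    ("sunny", 69, 70, "false", "Yes"),
    ("rainy", 75, 80, "false", "Yes"),
    ("sunny", 75, 70, "true", "Yes"),
    ("overcast", 72, 90, "true", "Yes"),
    ("overcast", 81, 75, "false", "Yes"),
    ("rainy", 71, 91, "true", "No"),
]


def gaussian_density(a, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at a."""
    return math.exp(-((a - mu) ** 2) / (2.0 * sigma ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )


def likelihood(rows, label, outlook, temperature, humidity, windy):
    """Prior x P(outlook|c) x f(temperature|c) x f(humidity|c) x P(windy|c)."""
    in_class = [r for r in rows if r[4] == label]
    n = len(in_class)
    score = n / len(rows)                                # prior P(class)
    score *= sum(r[0] == outlook for r in in_class) / n  # nominal: relative frequency
    temps = [r[1] for r in in_class]
    score *= gaussian_density(temperature, mean(temps), stdev(temps))
    hums = [r[2] for r in in_class]
    score *= gaussian_density(humidity, mean(hums), stdev(hums))
    score *= sum(r[3] == windy for r in in_class) / n    # nominal: relative frequency
    return score


query = dict(outlook="sunny", temperature=66, humidity=86, windy="false")
l_yes = likelihood(ROWS, "Yes", **query)
l_no = likelihood(ROWS, "No", **query)
prediction = "Yes" if l_yes > l_no else "No"
print(f"LY = {l_yes:.6f}, LN = {l_no:.6f}, prediction = {prediction}")
```

The two likelihoods are close for this query, so a different Windy value or a change in rounding of the means and standard deviations can change which class wins.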
