
Naïve Bayes Classifiers

For a more in-depth introduction to Naïve Bayes Classifiers and the theory surrounding them, see Andrew's lecture on Probability for Data Miners.

Andrew W. Moore
Professor
School of Computer Science
Carnegie Mellon University
www.cs.cmu.edu/~awm
awm@cs.cmu.edu
412-268-7599

Note to other teachers and users of these slides. Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew's tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.
These notes assume you have already met Bayesian Networks.

Copyright © 2004, Andrew W. Moore
A simple Bayes Net

[Bayes net: node J with arrows to C, Z, and R]

J: Person is a Junior
C: Brought Coat to Classroom
Z: Live in zipcode 15213
R: Saw "Return of the King" more than once
A simple Bayes Net

What parameters are stored in the CPTs of this Bayes Net?

[Bayes net: node J with arrows to C, Z, and R]

J: Person is a Junior
C: Brought Coat to Classroom
Z: Live in zipcode 15213
R: Saw "Return of the King" more than once
A simple Bayes Net

[Bayes net: node J with arrows to C, Z, and R]

J: Person is a Junior
C: Brought Coat to Classroom
Z: Live in zipcode 15213
R: Saw "Return of the King" more than once

P(J) =
P(C|J) =    P(Z|J) =    P(R|J) =
P(C|~J) =   P(Z|~J) =   P(R|~J) =

Suppose we have a database from 20 people who attended a lecture. How could we use that to estimate the values in this CPT?
A simple Bayes Net

[Bayes net: node J with arrows to C, Z, and R]

J: Person is a Junior
C: Brought Coat to Classroom
Z: Live in zipcode 15213
R: Saw "Return of the King" more than once

Suppose we have a database from 20 people who attended a lecture. How could we use that to estimate the values in this CPT?

P(J)    = (# people who walked to school) / (# people in database)
P(C|J)  = (# people who walked to school and brought a coat) / (# people who walked to school)
P(C|~J) = (# coat-bringers who didn't walk to school) / (# people who didn't walk to school)
(and similarly for the P(Z|J), P(Z|~J), P(R|J), and P(R|~J) entries)
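The counting scheme above can be sketched directly in code. The 20-person database from the slide is not given, so the six records below are invented purely for illustration:

```python
# Estimating CPT entries by counting, on an invented toy dataset.
# Each record is (J, C, Z, R) as booleans.
data = [
    # (J,     C,     Z,     R)
    (True,  True,  True,  False),
    (True,  True,  False, True),
    (True,  False, True,  True),
    (False, True,  False, False),
    (False, False, False, True),
    (False, False, True,  False),
]

n = len(data)
n_j = sum(1 for j, c, z, r in data if j)

p_j = n_j / n                                                            # P(J)
p_c_given_j  = sum(1 for j, c, z, r in data if j and c) / n_j            # P(C|J)
p_c_given_nj = sum(1 for j, c, z, r in data if not j and c) / (n - n_j)  # P(C|~J)

print(p_j, p_c_given_j, p_c_given_nj)
```

The remaining four entries (for Z and R) follow the same pattern: restrict to the records with the right J value, then count how often the attribute is true.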
A Naïve Bayes Classifier

[Bayes net: output attribute J with arrows to input attributes C, Z, and R]

J: Walked to School (output attribute)
C: Brought Coat to Classroom (input attribute)
Z: Live in zipcode 15213 (input attribute)
R: Saw "Return of the King" more than once (input attribute)

P(J) =
P(C|J) =    P(Z|J) =    P(R|J) =
P(C|~J) =   P(Z|~J) =   P(R|~J) =

A new person shows up at class wearing an "I live right above the Manor Theater where I saw all the Lord of The Rings Movies every night" overcoat. What is the probability that they are a Junior?
Naïve Bayes Classifier Inference

[Bayes net: node J with arrows to C, Z, and R]

P(J | C ^ Z ^ R) = P(J ^ C ^ Z ^ R) / P(C ^ Z ^ R)

                 = P(J ^ C ^ Z ^ R) / ( P(J ^ C ^ Z ^ R) + P(~J ^ C ^ Z ^ R) )

                 = P(C|J) P(Z|J) P(R|J) P(J) /
                   ( P(C|J) P(Z|J) P(R|J) P(J) + P(C|~J) P(Z|~J) P(R|~J) P(~J) )
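Plugging numbers into the inference formula above is a one-liner. The CPT values below are made up for illustration; the slides never fill them in:

```python
# Inference sketch: P(J | C ^ Z ^ R) with invented CPT values.
p_j = 0.3                                 # P(J)
p_c_j, p_z_j, p_r_j = 0.6, 0.2, 0.5       # P(C|J), P(Z|J), P(R|J)
p_c_nj, p_z_nj, p_r_nj = 0.4, 0.1, 0.1    # P(C|~J), P(Z|~J), P(R|~J)

num   = p_c_j * p_z_j * p_r_j * p_j               # P(J ^ C ^ Z ^ R)
other = p_c_nj * p_z_nj * p_r_nj * (1 - p_j)      # P(~J ^ C ^ Z ^ R)

p_j_given_czr = num / (num + other)
print(round(p_j_given_czr, 3))
```

Note that the denominator is just the sum of the two joint probabilities, so nothing beyond the CPT entries is needed.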
The General Case

[Bayes net: node Y with arrows to X1, X2, ..., Xm]

1. Estimate P(Y=v) as the fraction of records with Y=v.
2. Estimate P(Xi=u | Y=v) as the fraction of "Y=v" records that also have Xi=u.
3. To predict the Y value given observations of all the Xi values, compute

   Y_predict = argmax_v P(Y=v | X1=u1 ^ ... ^ Xm=um)
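The three steps above can be sketched for discrete attributes as follows. This is a minimal illustration, not a production implementation; the four-record dataset and the function names are invented:

```python
# Minimal Naive Bayes for discrete attributes: counting (steps 1-2), then
# argmax prediction (step 3).
from collections import Counter, defaultdict

def fit(records):
    """records: list of (x_tuple, y). Returns (prior, likelihood) tables."""
    class_counts = Counter(y for _, y in records)
    cond = defaultdict(Counter)          # (attribute index, class) -> value counts
    for xs, y in records:
        for i, u in enumerate(xs):
            cond[(i, y)][u] += 1
    n = len(records)
    prior = {v: c / n for v, c in class_counts.items()}          # P(Y=v)
    likelihood = {key: {u: c / class_counts[key[1]] for u, c in cnt.items()}
                  for key, cnt in cond.items()}                  # P(Xi=u | Y=v)
    return prior, likelihood

def predict(prior, likelihood, xs):
    def score(v):
        p = prior[v]
        for i, u in enumerate(xs):
            p *= likelihood.get((i, v), {}).get(u, 0.0)
        return p
    return max(prior, key=score)

records = [((1, 0), 'a'), ((1, 1), 'a'), ((0, 1), 'b'), ((0, 0), 'b')]
prior, lik = fit(records)
print(predict(prior, lik, (1, 0)))  # → 'a'
```

Unseen attribute values get probability 0.0 here, which zeroes the whole product; the "zero probabilities" point a few slides later addresses exactly this.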
Naïve Bayes Classifier

Y_predict = argmax_v P(Y=v | X1=u1 ^ ... ^ Xm=um)

          = argmax_v P(Y=v ^ X1=u1 ^ ... ^ Xm=um) / P(X1=u1 ^ ... ^ Xm=um)

          = argmax_v P(X1=u1 ^ ... ^ Xm=um | Y=v) P(Y=v) / P(X1=u1 ^ ... ^ Xm=um)

          = argmax_v P(X1=u1 ^ ... ^ Xm=um | Y=v) P(Y=v)

Because of the structure of the Bayes Net:

Y_predict = argmax_v P(Y=v) Π_{j=1..m} P(Xj=uj | Y=v)


More Facts About Naïve Bayes Classifiers
• Naïve Bayes Classifiers can be built with real-valued
inputs*
• Rather Technical Complaint: Bayes Classifiers don’t try to
be maximally discriminative---they merely try to honestly
model what’s going on*
• Zero probabilities are painful for Joint and Naïve. A hack
(justifiable with the magic words “Dirichlet Prior”) can
help*.
• Naïve Bayes is wonderfully cheap. And survives 10,000
attributes cheerfully!
*See future Andrew Lectures
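The "Dirichlet Prior" hack for zero probabilities amounts, in its simplest form, to add-one (Laplace) smoothing of the counts. A sketch, with an illustrative smoothing constant alpha:

```python
# Laplace (add-one) smoothing: a simple special case of a Dirichlet prior.
def smoothed(count_u_and_v, count_v, n_values, alpha=1.0):
    # P(Xi=u | Y=v) ≈ (count + alpha) / (total + alpha * number of Xi values)
    return (count_u_and_v + alpha) / (count_v + alpha * n_values)

# A value never seen with class v gets a small nonzero probability instead of
# zeroing out the entire product of conditionals:
print(smoothed(0, 10, 2))   # 1/12 ≈ 0.083, not 0.0
```

With alpha = 0 this reduces to the raw fraction from the earlier "General Case" slide; larger alpha pulls estimates toward uniform.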



What you should know
• How to build a Bayes Classifier
• How to predict with a BC

