
Naïve Bayes Classifiers

For a more in-depth introduction to Naïve Bayes Classifiers and the theory surrounding them, see Andrew's lecture on Probability for Data Miners.

Andrew W. Moore
Professor
School of Computer Science
Carnegie Mellon University
www.cs.cmu.edu/~awm
awm@cs.cmu.edu
412-268-7599

Note to other teachers and users of these slides. Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew's tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.
These notes assume you have already met Bayesian Networks.

Copyright © 2004, Andrew W. Moore
A simple Bayes Net

[Bayes net: node J with arrows to C, Z, and R]

J: Person is a Junior
C: Brought Coat to Classroom
Z: Live in zipcode 15213
R: Saw "Return of the King" more than once
A simple Bayes Net

What parameters are stored in the CPTs of this Bayes Net?

[Bayes net: node J with arrows to C, Z, and R]

J: Person is a Junior
C: Brought Coat to Classroom
Z: Live in zipcode 15213
R: Saw "Return of the King" more than once
A simple Bayes Net

[Bayes net: node J with arrows to C, Z, and R]

J: Person is a Junior
C: Brought Coat to Classroom
Z: Live in zipcode 15213
R: Saw "Return of the King" more than once

P(J) =
P(C|J) =    P(Z|J) =    P(R|J) =
P(C|~J) =   P(Z|~J) =   P(R|~J) =

Suppose we have a database from 20 people who attended a lecture. How could we use that to estimate the values in this CPT?
A simple Bayes Net

[Bayes net: node J with arrows to C, Z, and R]

J: Person is a Junior
C: Brought Coat to Classroom
Z: Live in zipcode 15213
R: Saw "Return of the King" more than once

Suppose we have a database from 20 people who attended a lecture. How could we use that to estimate the values in this CPT?

P(J)    = (# people who walked to school) / (# people in database)
P(C|J)  = (# people who walked to school and brought a coat) / (# people who walked to school)
P(C|~J) = (# coat-bringers who didn't walk to school) / (# people who didn't walk to school)
(and similarly for the P(Z|J), P(Z|~J), P(R|J), and P(R|~J) entries)
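The counting scheme above can be sketched directly in code. The 20-person database from the slide is not given, so the six records below are invented purely for illustration:

```python
# Estimating CPT entries by counting, on an invented toy dataset.
# Each record is (J, C, Z, R) as booleans.
data = [
    # (J,     C,     Z,     R)
    (True,  True,  True,  False),
    (True,  True,  False, True),
    (True,  False, True,  True),
    (False, True,  False, False),
    (False, False, False, True),
    (False, False, True,  False),
]

n = len(data)
n_j = sum(1 for j, c, z, r in data if j)

p_j = n_j / n                                                            # P(J)
p_c_given_j  = sum(1 for j, c, z, r in data if j and c) / n_j            # P(C|J)
p_c_given_nj = sum(1 for j, c, z, r in data if not j and c) / (n - n_j)  # P(C|~J)

print(p_j, p_c_given_j, p_c_given_nj)
```

The remaining four entries (for Z and R) follow the same pattern: restrict to the records with the right J value, then count how often the attribute is true.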
A Naïve Bayes Classifier

[Bayes net: output attribute J with arrows to input attributes C, Z, and R]

J: Walked to School (output attribute)
C: Brought Coat to Classroom (input attribute)
Z: Live in zipcode 15213 (input attribute)
R: Saw "Return of the King" more than once (input attribute)

P(J) =
P(C|J) =    P(Z|J) =    P(R|J) =
P(C|~J) =   P(Z|~J) =   P(R|~J) =

A new person shows up at class wearing an "I live right above the Manor Theater where I saw all the Lord of The Rings Movies every night" overcoat. What is the probability that they are a Junior?
Naïve Bayes Classifier Inference

[Bayes net: node J with arrows to C, Z, and R]

P(J | C ^ Z ^ R) = P(J ^ C ^ Z ^ R) / P(C ^ Z ^ R)

                 = P(J ^ C ^ Z ^ R) / ( P(J ^ C ^ Z ^ R) + P(~J ^ C ^ Z ^ R) )

                 = P(C|J) P(Z|J) P(R|J) P(J) /
                   ( P(C|J) P(Z|J) P(R|J) P(J) + P(C|~J) P(Z|~J) P(R|~J) P(~J) )
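Plugging numbers into the inference formula above is a one-liner. The CPT values below are made up for illustration; the slides never fill them in:

```python
# Inference sketch: P(J | C ^ Z ^ R) with invented CPT values.
p_j = 0.3                                 # P(J)
p_c_j, p_z_j, p_r_j = 0.6, 0.2, 0.5       # P(C|J), P(Z|J), P(R|J)
p_c_nj, p_z_nj, p_r_nj = 0.4, 0.1, 0.1    # P(C|~J), P(Z|~J), P(R|~J)

num   = p_c_j * p_z_j * p_r_j * p_j               # P(J ^ C ^ Z ^ R)
other = p_c_nj * p_z_nj * p_r_nj * (1 - p_j)      # P(~J ^ C ^ Z ^ R)

p_j_given_czr = num / (num + other)
print(round(p_j_given_czr, 3))
```

Note that the denominator is just the sum of the two joint probabilities, so nothing beyond the CPT entries is needed.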
The General Case

[Bayes net: node Y with arrows to X1, X2, ..., Xm]

1. Estimate P(Y=v) as the fraction of records with Y=v.
2. Estimate P(Xi=u | Y=v) as the fraction of "Y=v" records that also have Xi=u.
3. To predict the Y value given observations of all the Xi values, compute

   Y_predict = argmax_v P(Y=v | X1=u1 ^ ... ^ Xm=um)
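The three steps above can be sketched for discrete attributes as follows. This is a minimal illustration, not a production implementation; the four-record dataset and the function names are invented:

```python
# Minimal Naive Bayes for discrete attributes: counting (steps 1-2), then
# argmax prediction (step 3).
from collections import Counter, defaultdict

def fit(records):
    """records: list of (x_tuple, y). Returns (prior, likelihood) tables."""
    class_counts = Counter(y for _, y in records)
    cond = defaultdict(Counter)          # (attribute index, class) -> value counts
    for xs, y in records:
        for i, u in enumerate(xs):
            cond[(i, y)][u] += 1
    n = len(records)
    prior = {v: c / n for v, c in class_counts.items()}          # P(Y=v)
    likelihood = {key: {u: c / class_counts[key[1]] for u, c in cnt.items()}
                  for key, cnt in cond.items()}                  # P(Xi=u | Y=v)
    return prior, likelihood

def predict(prior, likelihood, xs):
    def score(v):
        p = prior[v]
        for i, u in enumerate(xs):
            p *= likelihood.get((i, v), {}).get(u, 0.0)
        return p
    return max(prior, key=score)

records = [((1, 0), 'a'), ((1, 1), 'a'), ((0, 1), 'b'), ((0, 0), 'b')]
prior, lik = fit(records)
print(predict(prior, lik, (1, 0)))  # → 'a'
```

Unseen attribute values get probability 0.0 here, which zeroes the whole product; the "zero probabilities" point a few slides later addresses exactly this.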
Naïve Bayes Classifier

Y_predict = argmax_v P(Y=v | X1=u1 ^ ... ^ Xm=um)

          = argmax_v P(Y=v ^ X1=u1 ^ ... ^ Xm=um) / P(X1=u1 ^ ... ^ Xm=um)

          = argmax_v P(X1=u1 ^ ... ^ Xm=um | Y=v) P(Y=v) / P(X1=u1 ^ ... ^ Xm=um)

          = argmax_v P(X1=u1 ^ ... ^ Xm=um | Y=v) P(Y=v)

Because of the structure of the Bayes Net:

Y_predict = argmax_v P(Y=v) Π_{j=1..m} P(Xj=uj | Y=v)


More Facts About Naïve Bayes Classifiers
• Naïve Bayes Classifiers can be built with real-valued
inputs*
• Rather Technical Complaint: Bayes Classifiers don’t try to
be maximally discriminative---they merely try to honestly
model what’s going on*
• Zero probabilities are painful for Joint and Naïve. A hack
(justifiable with the magic words “Dirichlet Prior”) can
help*.
• Naïve Bayes is wonderfully cheap. And survives 10,000
attributes cheerfully!
*See future Andrew Lectures
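The "Dirichlet Prior" hack for zero probabilities amounts, in its simplest form, to add-one (Laplace) smoothing of the counts. A sketch, with an illustrative smoothing constant alpha:

```python
# Laplace (add-one) smoothing: a simple special case of a Dirichlet prior.
def smoothed(count_u_and_v, count_v, n_values, alpha=1.0):
    # P(Xi=u | Y=v) ≈ (count + alpha) / (total + alpha * number of Xi values)
    return (count_u_and_v + alpha) / (count_v + alpha * n_values)

# A value never seen with class v gets a small nonzero probability instead of
# zeroing out the entire product of conditionals:
print(smoothed(0, 10, 2))   # 1/12 ≈ 0.083, not 0.0
```

With alpha = 0 this reduces to the raw fraction from the earlier "General Case" slide; larger alpha pulls estimates toward uniform.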



What you should know
• How to build a Bayes Classifier
• How to predict with a BC

