
Bayes Decision Theory

Sargur Srihari

CSE 555
Introduction to Pattern Recognition
Reverend Thomas Bayes (1702-1761)

Bayes set out his theory of probability in "An Essay towards solving a Problem
in the Doctrine of Chances", published in the Philosophical Transactions of the
Royal Society of London in 1764. The paper was sent to the Royal Society by
Richard Price, a friend of Bayes, who wrote:
I now send you an essay which I have found among the papers of our deceased friend Mr Bayes,
and which, in my opinion, has great merit... In an introduction which he has writ to this Essay,
he says, that his design at first in thinking on the subject of it was, to find out a method by which
we might judge concerning the probability that an event has to happen, in given circumstances,
upon supposition that we know nothing concerning it but that, under the same circumstances,
it has happened a certain number of times, and failed a certain other number of times.
Bayes Rule
Two classes (A, ~A), single binary-valued feature (X, ~X)

Known data: p(A) = 0.330, p(~A) = 0.670, p(X & A) = 0.248, p(X & ~A) = 0.168

By the conditional probability rule:

$$p(X \mid A) = \frac{p(X \,\&\, A)}{p(A)} = \frac{0.248}{0.330} = 0.7515$$

$$p(X \mid \sim A) = \frac{p(X \,\&\, \sim A)}{p(\sim A)} = \frac{0.168}{0.670} = 0.2507$$

By Bayes rule:

$$P(A \mid X) = \frac{P(X \mid A)\,P(A)}{P(X)} = \frac{P(X \mid A)\,P(A)}{P(X \,\&\, A) + P(X \,\&\, \sim A)} = \frac{P(X \mid A)\,P(A)}{P(X \mid A)\,P(A) + P(X \mid \sim A)\,P(\sim A)}$$

$$= \frac{0.75 \times 0.33}{0.75 \times 0.33 + 0.25 \times 0.67} = \frac{0.2475}{0.2475 + 0.1675} = \frac{0.2475}{0.415} = 0.596$$
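As a quick numerical check, here is a minimal Python sketch of the same computation (the variable names are ours, not from the slides):

```python
# Known joint and marginal probabilities from the slide
p_A, p_not_A = 0.330, 0.670
p_X_and_A, p_X_and_not_A = 0.248, 0.168

# Conditional probability rule: p(X|A) = p(X & A) / p(A)
p_X_given_A = p_X_and_A / p_A              # ~0.7515
p_X_given_not_A = p_X_and_not_A / p_not_A  # ~0.2507

# Bayes rule: P(A|X) = P(X|A) P(A) / P(X), with P(X) = P(X & A) + P(X & ~A)
evidence = p_X_given_A * p_A + p_X_given_not_A * p_not_A
print(p_X_given_A * p_A / evidence)        # ~0.596, matching the slide
```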
Bayes Decision Theory
• Fundamental statistical approach to the problem of
  pattern classification
• Quantifies the trade-offs between classification
  decisions using probabilities and the costs of those
  decisions
• Assumes all relevant probabilities are known



Prior Probabilities
State of nature, prior

• State of nature is a random variable

  P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity)

• Decision rule with only the prior information:

  Decide ω1 if P(ω1) > P(ω2); otherwise decide ω2



Class-conditional Probabilities
p(x | ω1) and p(x | ω2)

• The p.d.f. p(x | ωj) gives the probability of measuring a
  particular feature value x given category ωj.

• For a given feature value x, the two curves describe the
  difference between the populations of the two classes.

• Density functions are normalized, so the area under each
  curve is 1.0.

[Figure: class-conditional p.d.f.s p(x | ω1) and p(x | ω2) plotted against feature x]


Bayes formula to combine prior and
class-conditional probabilities
$$P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\,P(\omega_j)}{p(x)}$$

• In the case of two categories:

$$p(x) = \sum_{j=1}^{2} p(x \mid \omega_j)\,P(\omega_j)$$

• Informally, Bayes rule says:

  posterior = likelihood × prior / evidence
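To make the formula concrete, here is a minimal Python sketch of the two-category computation (the function name and the example numbers are ours, for illustration):

```python
def posteriors(likelihoods, priors):
    """Return P(w_j | x) for each class, given p(x | w_j) and P(w_j)."""
    # evidence: p(x) = sum_j p(x | w_j) P(w_j)
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Example: p(x|w1) = 0.6, p(x|w2) = 0.3 at some x, with priors 2/3 and 1/3
print(posteriors([0.6, 0.3], [2/3, 1/3]))  # -> [0.8, 0.2]
```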
Posterior probabilities

Posterior probabilities for the priors P(ω1) = 2/3, P(ω2) = 1/3.
For x = 14: P(ω1 | x) = 0.08, P(ω2 | x) = 0.92.

[Figures: left, class-conditional p.d.f.s p(x | ωj) vs. feature x;
right, posterior probabilities P(ωj | x) vs. feature x]
Bayes Decision Rule

x is an observation for which:

  if P(ω1 | x) > P(ω2 | x), decide the true state of nature is ω1
  if P(ω1 | x) < P(ω2 | x), decide the true state of nature is ω2

Therefore, whenever we observe a particular x, the probability
of error is:

  P(error | x) = P(ω1 | x) if we decide ω2
  P(error | x) = P(ω2 | x) if we decide ω1



Bayes Decision Rule minimizes
probability of error

Decide ω1 if P(ω1 | x) > P(ω2 | x);


otherwise decide ω2

Therefore:
P(error | x) = min [ P(ω1 | x), P(ω2 | x) ]
(Bayes decision)
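A minimal Python sketch of this minimum-error rule (the function name is ours):

```python
def decide_min_error(posteriors):
    """Pick the class with the largest posterior; return (class index, P(error | x))."""
    best = max(range(len(posteriors)), key=lambda j: posteriors[j])
    return best, 1.0 - posteriors[best]  # for two classes, 1 - max = min

# Posteriors from the earlier figure at x = 14: P(w1|x) = 0.08, P(w2|x) = 0.92
print(decide_min_error([0.08, 0.92]))  # -> decide w2 (index 1), P(error|x) ~ 0.08
```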



Bayes Decision Theory – Continuous
Features

• Generalization of the preceding ideas


• Use of more than one feature
• Use of more than two states of nature
• Allowing actions other than merely deciding the state of
  nature
• Introducing a loss function that is more general than the
  probability of error



Loss Function

• Allowing actions other than classification primarily
  allows the possibility of rejection:

  refusing to make a decision in close or bad cases!

• The loss function states how costly each action is



Loss Function Definition

Let {ω1, ω2, …, ωc} be the set of c states of nature
(or "categories")

Let {α1, α2, …, αa} be the set of a possible actions

Let λ(αi | ωj) be the loss incurred for taking action αi
when the state of nature is ωj



Overall Risk
Overall risk: R = sum of all R(αi | x) for i = 1, …, a

Minimizing R ⇔ minimizing the conditional risk R(αi | x)
for i = 1, …, a

Conditional risk (expected loss for taking action αi):

$$R(\alpha_i \mid x) = \sum_{j=1}^{c} \lambda(\alpha_i \mid \omega_j)\,P(\omega_j \mid x)$$

Select the action αi for which R(αi | x) is minimum.
R is then minimum; this minimum overall risk is called the
Bayes risk, the best performance that can be achieved.
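A minimal Python sketch of the conditional-risk computation (the function names are ours; the zero-one loss matrix is just an example):

```python
def conditional_risks(loss, posteriors):
    """R(a_i | x) = sum_j loss[i][j] * P(w_j | x), for each action a_i."""
    return [sum(l_ij * p for l_ij, p in zip(row, posteriors)) for row in loss]

def bayes_action(loss, posteriors):
    """Index of the action with minimum conditional risk."""
    risks = conditional_risks(loss, posteriors)
    return min(range(len(risks)), key=lambda i: risks[i])

# Two actions, two states of nature; loss[i][j] = lambda(a_i | w_j)
zero_one = [[0, 1],
            [1, 0]]
print(bayes_action(zero_one, [0.3, 0.7]))  # -> 1: take action a2 (decide w2)
```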



Two-category classification
α1 : deciding ω1
α2 : deciding ω2
λij = λ(αi | ωj)
loss incurred for deciding ωi when the true state of nature is ωj

Conditional risk:

R(α1 | x) = λ11P(ω1 | x) + λ12P(ω2 | x)


R(α2 | x) = λ21P(ω1 | x) + λ22P(ω2 | x)
Minimum Risk Decision Rule
Our rule is the following:

  if R(α1 | x) < R(α2 | x), take action α1: "decide ω1"

This results in the equivalent rule: decide ω1 if

  (λ21 − λ11) p(x | ω1) P(ω1) > (λ12 − λ22) p(x | ω2) P(ω2)

and decide ω2 otherwise
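Filling in the algebra behind this equivalence:

$$R(\alpha_1 \mid x) < R(\alpha_2 \mid x) \iff \lambda_{11}P(\omega_1 \mid x) + \lambda_{12}P(\omega_2 \mid x) < \lambda_{21}P(\omega_1 \mid x) + \lambda_{22}P(\omega_2 \mid x)$$

$$\iff (\lambda_{21} - \lambda_{11})\,P(\omega_1 \mid x) > (\lambda_{12} - \lambda_{22})\,P(\omega_2 \mid x)$$

Substituting $P(\omega_j \mid x) = p(x \mid \omega_j)P(\omega_j)/p(x)$ and multiplying both sides by the positive quantity $p(x)$ gives the rule above.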



Likelihood ratio Decision Rule

The preceding rule is equivalent to the following: if

$$\frac{p(x \mid \omega_1)}{p(x \mid \omega_2)} > \frac{\lambda_{12} - \lambda_{22}}{\lambda_{21} - \lambda_{11}} \cdot \frac{P(\omega_2)}{P(\omega_1)}$$

then take action α1 (decide ω1);
otherwise take action α2 (decide ω2)
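A minimal Python sketch of this likelihood-ratio test (the function name is ours; it assumes λ21 > λ11 so the threshold is well defined):

```python
def likelihood_ratio_decision(px_w1, px_w2, prior1, prior2, loss):
    """Return 1 (take a1, decide w1) or 2 (take a2, decide w2).
    loss[i][j] = lambda(a_i | w_j); assumes loss[1][0] > loss[0][0]."""
    threshold = (loss[0][1] - loss[1][1]) / (loss[1][0] - loss[0][0]) * (prior2 / prior1)
    return 1 if px_w1 / px_w2 > threshold else 2

# With zero-one loss and equal priors the threshold is 1: pick the larger likelihood
zero_one = [[0, 1], [1, 0]]
print(likelihood_ratio_decision(0.6, 0.3, 0.5, 0.5, zero_one))  # -> 1
```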
Exercise
Select the optimal decision where:
Ω= {ω1, ω2}
p(x | ω1) ~ N(2, 0.5) (normal distribution)
p(x | ω2) ~ N(1.5, 0.2)

P(ω1) = 2/3
P(ω2) = 1/3

$$\lambda = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$
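One way to work the exercise in Python, a sketch assuming N(μ, σ²) notation (i.e., 0.5 and 0.2 are variances; if they are standard deviations, square them first) and λij = λ(αi | ωj) as defined earlier:

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def optimal_action(x):
    """Minimum-conditional-risk decision between w1 and w2 at observation x."""
    priors = [2 / 3, 1 / 3]
    likelihoods = [normal_pdf(x, 2.0, 0.5), normal_pdf(x, 1.5, 0.2)]
    loss = [[1, 2],   # lambda(a1 | w1), lambda(a1 | w2)
            [3, 4]]   # lambda(a2 | w1), lambda(a2 | w2)
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    post = [l * p / evidence for l, p in zip(likelihoods, priors)]
    risks = [sum(loss[i][j] * post[j] for j in range(2)) for i in range(2)]
    return 1 if risks[0] < risks[1] else 2

print(optimal_action(1.8))  # -> 1 (see note below)
```

Note that for this particular λ, R(α2 | x) − R(α1 | x) = 2P(ω1 | x) + 2P(ω2 | x) = 2 > 0 for every x, so the minimum-risk rule selects α1 (decide ω1) regardless of the observation.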
