
Pattern Classification

All materials in these slides were taken from
Pattern Classification (2nd ed.) by R. O. Duda, P. E. Hart and D. G. Stork,
John Wiley & Sons, 2000, with the permission of the authors and the publisher
Chapter 2 (Part 1):
Bayesian Decision Theory
(Sections 2.1-2.2)

• Introduction
• Bayesian Decision Theory – Continuous Features

Introduction
• The sea bass/salmon example
• State of nature, prior
• State of nature is a random variable
• The catch of salmon and sea bass is equiprobable

• P(ω1) = P(ω2) (uniform priors)

• P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity)

Pattern Classification, Chapter 2 (Part 1)



• Decision rule with only the prior information

• Decide ω1 if P(ω1) > P(ω2); otherwise decide ω2

• Use of the class-conditional information

• P(x | ω1) and P(x | ω2) describe the difference in
lightness between populations of sea bass and salmon


• Posterior, likelihood, evidence

  P(ωj | x) = P(x | ωj) P(ωj) / P(x)

• where, in the case of two categories:

  P(x) = Σ (j = 1 to 2) P(x | ωj) P(ωj)

• Posterior = (Likelihood × Prior) / Evidence
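Bayes' formula above can be sketched in a few lines of Python. This is an illustrative sketch, not from the slides; the likelihood values and the function name `posteriors` are made up for the example.

```python
# Sketch (illustrative numbers) of Bayes' formula for two categories:
# P(w_j | x) = P(x | w_j) P(w_j) / P(x).

def posteriors(likelihoods, priors):
    """Return P(w_j | x) for each class j, given P(x | w_j) and P(w_j)."""
    # Evidence P(x) = sum_j P(x | w_j) P(w_j) acts as the normalizer.
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Example: P(x | w1) = 3.0, P(x | w2) = 1.0 (densities may exceed 1),
# with uniform priors P(w1) = P(w2) = 0.5
post = posteriors([3.0, 1.0], [0.5, 0.5])
print(post)  # [0.75, 0.25] -- posteriors sum to 1
```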



• Decision given the posterior probabilities

x is an observation for which:

if P(ω1 | x) > P(ω2 | x), true state of nature = ω1
if P(ω1 | x) < P(ω2 | x), true state of nature = ω2

Therefore:
whenever we observe a particular x, the probability of error is:
P(error | x) = P(ω1 | x) if we decide ω2
P(error | x) = P(ω2 | x) if we decide ω1

• Minimizing the probability of error

• Decide ω1 if P(ω1 | x) > P(ω2 | x); otherwise decide ω2

Therefore:
P(error | x) = min [P(ω1 | x), P(ω2 | x)]
(Bayes decision)
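The minimum-error-rate rule can be sketched directly. The function name `bayes_decide` and the example posteriors are assumptions for illustration, not from the slides.

```python
# Sketch of the two-class minimum-error-rate (Bayes) decision, assuming the
# posteriors P(w1|x) and P(w2|x) have already been computed.

def bayes_decide(p1, p2):
    """Decide w1 if P(w1|x) > P(w2|x), else w2; also return P(error|x)."""
    decision = 1 if p1 > p2 else 2
    p_error = min(p1, p2)  # P(error|x) = min[P(w1|x), P(w2|x)]
    return decision, p_error

print(bayes_decide(0.75, 0.25))  # (1, 0.25): decide w1, error probability 0.25
```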

Bayesian Decision Theory –
Continuous Features

• Generalization of the preceding ideas:

• Use of more than one feature
• Use of more than two states of nature
• Allowing actions, not only deciding on the state of nature
• Introducing a loss function that is more general than
the probability of error


• Allowing actions other than classification primarily
allows the possibility of rejection

• Refusing to make a decision in close or bad cases!

• The loss function states how costly each action
taken is


Let {ω1, ω2, …, ωc} be the set of c states of nature
(or "categories")

Let {α1, α2, …, αa} be the set of possible actions

Let λ(αi | ωj) be the loss incurred for taking
action αi when the state of nature is ωj

Overall risk
R = sum of all R(αi | x) for i = 1, …, a

Conditional risk:

R(αi | x) = Σ (j = 1 to c) λ(αi | ωj) P(ωj | x),  for i = 1, …, a

Minimizing R ⇔ minimizing R(αi | x) for i = 1, …, a
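The conditional risk and the arg-min rule can be sketched as follows. The loss matrix (zero-one loss), the posteriors, and the function name `conditional_risks` are illustrative assumptions, not from the slides.

```python
# Sketch of the conditional risk R(a_i | x) = sum_j lambda(a_i | w_j) P(w_j | x)
# and of the rule "choose the action whose conditional risk is minimum".

def conditional_risks(loss, posteriors):
    """loss[i][j] = lambda(a_i | w_j); returns R(a_i | x) for each action i."""
    return [sum(lam_ij * p_j for lam_ij, p_j in zip(row, posteriors))
            for row in loss]

loss = [[0.0, 1.0],   # lambda(a1|w1)=0, lambda(a1|w2)=1  (zero-one loss)
        [1.0, 0.0]]   # lambda(a2|w1)=1, lambda(a2|w2)=0
post = [0.75, 0.25]   # assumed posteriors P(w1|x), P(w2|x)
risks = conditional_risks(loss, post)
best = min(range(len(risks)), key=lambda i: risks[i])  # arg-min action index
print(risks, best)  # [0.25, 0.75] 0 -> take action a1
```

With zero-one loss the conditional risk equals the probability of error, so this rule reduces to the minimum-error-rate decision of the previous section.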

Select the action αi for which R(αi | x) is minimum.

R is then minimum, and R in this case is called the
Bayes risk = best performance that can be achieved!


• Two-category classification
α1: deciding ω1
α2: deciding ω2
λij = λ(αi | ωj):
loss incurred for deciding ωi when the true state of nature is ωj

Conditional risk:

R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)
R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)


Our rule is the following:

if R(α1 | x) < R(α2 | x), action α1: "decide ω1" is taken

This results in the equivalent rule: decide ω1 if

(λ21 − λ11) P(x | ω1) P(ω1) > (λ12 − λ22) P(x | ω2) P(ω2)

and decide ω2 otherwise



Likelihood ratio:

The preceding rule is equivalent to the following rule:

if P(x | ω1) / P(x | ω2) > [(λ12 − λ22) / (λ21 − λ11)] · [P(ω2) / P(ω1)]

then take action α1 (decide ω1);
otherwise take action α2 (decide ω2)
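The likelihood-ratio rule above can be sketched as a small function. The function name and the example numbers are illustrative assumptions; note the threshold depends only on the losses and priors, not on x.

```python
# Sketch of the likelihood-ratio form of the two-category Bayes rule: decide
# w1 when P(x|w1)/P(x|w2) exceeds (l12 - l22)/(l21 - l11) * P(w2)/P(w1).

def likelihood_ratio_decide(px_w1, px_w2, priors, lam):
    """lam[i][j] = lambda(a_i | w_j). Returns 1 (decide w1) or 2 (decide w2)."""
    threshold = ((lam[0][1] - lam[1][1]) / (lam[1][0] - lam[0][0])
                 * priors[1] / priors[0])   # independent of x
    return 1 if px_w1 / px_w2 > threshold else 2

# Zero-one loss with equal priors: threshold = 1, so pick the larger likelihood
print(likelihood_ratio_decide(0.6, 0.2, [0.5, 0.5], [[0, 1], [1, 0]]))  # 1
```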


Optimal decision property

"If the likelihood ratio exceeds a threshold value
independent of the input pattern x, we can take
optimal actions"

Exercise

Select the optimal decision where:

Ω = {ω1, ω2}

P(x | ω1) ~ N(2, 0.5) (normal distribution)
P(x | ω2) ~ N(1.5, 0.2)

P(ω1) = 2/3
P(ω2) = 1/3

λ = | 1  2 |
    | 3  4 |