• Idea:
– find the object's parts ("bits"), then declare the object present if the parts found are consistent
• Advantage:
– objects with complex configuration spaces don’t make good
templates
• internal degrees of freedom
• aspect changes
• (possibly) shading
• variations in texture
• etc.
• Write
• Assume
• Notice:
– different patterns may yield different templates with different
probabilities
– different templates may be found in noise with different
probabilities
Figure from “Local grayvalue invariants for image retrieval,” by C. Schmid and R. Mohr,
IEEE Trans. Pattern Analysis and Machine Intelligence, 1997 copyright 1997, IEEE
Computer Vision - A Modern Approach
Set: Recognition by relations
Slides by D.A. Forsyth
• Strategy:
– Face is eyes, nose, mouth, etc. with appropriate relations between
them
– build a specialised detector for each of these (template matching)
and look for groups with the right internal structure
– Once we’ve found enough of a face, there is little uncertainty
about where the other bits could be
• Strategy: compare
• We observe these
measurements
• I have:
– sequence of measurements
– collection of states
– topology
• I want
– state transition probabilities
– measurement emission probabilities
• Straightforward application of EM
– discrete hidden variables give the state for each measurement
– M step is just averaging, etc.
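The EM recipe above can be sketched for a small discrete HMM. This is a minimal illustration rather than the method used in the slides: the names `forward_backward`, `em_step`, `pi`, `A`, and `B` are hypothetical, observations are assumed to be integer symbols, and no scaling is applied, so it only suits short sequences. The topology is encoded by zeros in the transition matrix `A`; EM never turns a zero transition into a nonzero one.

```python
def forward_backward(obs, pi, A, B):
    """E step: posteriors over states (gamma) and state pairs (xi)."""
    T, N = len(obs), len(pi)
    alpha = [[0.0] * N for _ in range(T)]
    beta = [[1.0] * N for _ in range(T)]
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = B[j][obs[t]] * sum(
                alpha[t - 1][i] * A[i][j] for i in range(N))
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
    likelihood = sum(alpha[T - 1])
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(N)]
             for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    return gamma, xi, likelihood


def em_step(obs, pi, A, B):
    """One EM iteration; the M step is just normalised expected counts."""
    N, M = len(pi), len(B[0])
    gamma, xi, likelihood = forward_backward(obs, pi, A, B)
    new_pi = gamma[0][:]
    new_A = [[sum(x[i][j] for x in xi) / sum(g[i] for g in gamma[:-1])
              for j in range(N)] for i in range(N)]
    new_B = [[sum(g[i] for g, o in zip(gamma, obs) if o == k)
              / sum(g[i] for g in gamma)
              for k in range(M)] for i in range(N)]
    # likelihood is that of the parameters passed in, not the updated ones
    return new_pi, new_A, new_B, likelihood


# Usage: a two-state left-to-right topology (state 1 cannot return to state 0).
obs = [0, 0, 1, 0, 1, 1, 1]
pi, A, B = [0.6, 0.4], [[0.7, 0.3], [0.0, 1.0]], [[0.6, 0.4], [0.3, 0.7]]
for _ in range(5):
    pi, A, B, lik = em_step(obs, pi, A, B)
```

Each call returns the likelihood of the parameters it was given, which should not decrease from one iteration to the next.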
User gesturing
Figure from “Real time American sign language recognition using desk and wearable computer
based video,” T. Starner, et al. Proc. Int. Symp. on Computer Vision, 1995, copyright 1995, IEEE
HMMs can be spatial rather than temporal. For example, consider a simple
model where the position of the arm depends on the position of the torso,
and the position of the leg depends on the position of the torso. We can
build a trellis in which each node represents a correspondence between an
image token and a body part, and do dynamic programming (DP) on this trellis.
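A minimal sketch of DP for the torso, arm, and leg model described above. Everything here is an illustrative assumption: the function name `match_parts`, tokens represented as 2D points, appearance costs supplied as a table, and a squared-distance penalty for deviating from an expected offset relative to the torso. Because the arm and leg depend only on the torso, each can be optimised independently once the torso token is fixed, so the search costs on the order of (parts × tokens²) rather than tokens^parts. Note that this simple model does not stop two parts from claiming the same token.

```python
def match_parts(tokens, appearance, offsets):
    """Assign one image token to each body part at minimum total cost.

    tokens:     list of (x, y) token positions
    appearance: appearance[part][k] = cost of token k playing that part
    offsets:    offsets[part] = expected (dx, dy) of a leaf part from the torso
    """
    best_cost, best_assign = float("inf"), None
    for t, (tx, ty) in enumerate(tokens):      # hypothesise the torso token
        cost = appearance["torso"][t]
        assign = {"torso": t}
        for part, (dx, dy) in offsets.items():  # leaves are independent given torso
            def leaf_cost(k):
                x, y = tokens[k]
                return appearance[part][k] + (x - tx - dx) ** 2 + (y - ty - dy) ** 2
            k_best = min(range(len(tokens)), key=leaf_cost)
            cost += leaf_cost(k_best)
            assign[part] = k_best
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


# Usage: four tokens, three of which sit exactly where the model expects them.
tokens = [(0, 0), (5, 0), (0, -6), (20, 20)]
appearance = {part: [0.0] * len(tokens) for part in ("torso", "arm", "leg")}
offsets = {"arm": (5, 0), "leg": (0, -6)}
cost, assign = match_parts(tokens, appearance, offsets)
```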