1. Introduction
2. Human Hand Modeling
3. Feature Selection and Extraction
4. Model-Based Hand Posture Recognition
5. Hand Motion Tracking
6. Conclusion
Refs.
1. Introduction
Hand gestures:
Purposes of human gestures: conversational, controlling, manipulative, and communicative.
2. Human Hand Modeling
The MCP (metacarpophalangeal) joint has two degrees of freedom: flexion/extension and abduction/adduction.
3. Feature Selection and Extraction
High-level features
Fingertips, fingers, joint locations, etc.
Intuitive representation, efficient processing.
Hard to extract.
Low-level features
Colors, contours, edges, silhouette, etc.
Skin color segmentation
Distance metric: Chamfer matching
Easier to obtain; sensitive to finger/palm angles
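The skin-color segmentation step above can be sketched with a common RGB heuristic; the thresholds below are a widely used rule of thumb, not taken from this work:

```python
def is_skin(r, g, b):
    """Classify one RGB pixel as skin using a common heuristic rule
    (thresholds for uniform daylight; the exact values are an assumption)."""
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

def skin_mask(image):
    """image: 2-D list of (r, g, b) tuples -> binary mask of skin pixels."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in image]

# Tiny made-up frame: two skin-like pixels, two background pixels.
frame = [[(200, 120, 90), (30, 30, 30)],
         [(210, 140, 110), (90, 90, 200)]]
print(skin_mask(frame))  # [[1, 0], [1, 0]]
```

The resulting binary mask is what the contour/silhouette features would then be extracted from.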
3. Feature Selection and Extraction
3D features:
Stereo cameras obtain 3-D images.
Depth information helps with cluttered backgrounds.
The acquired surface is matched to the model surface.
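For a rectified stereo pair, depth follows directly from disparity via the standard relation Z = fB/d; a minimal sketch (the focal length and baseline values are made up for illustration):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Z = f * B / d for a rectified stereo pair; larger disparity = closer."""
    if disparity_px <= 0:
        return float('inf')  # unmatched point or point at infinity
    return focal_px * baseline_m / disparity_px

# Example: 700 px focal length, 6 cm baseline, 60 px disparity -> 0.7 m.
print(depth_from_disparity(700.0, 0.06, 60.0))  # 0.7
```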
4. Model-Based Hand Posture Recognition
A hand appears very different at different
orientation or viewpoint
Database approach: Efficient searching and
accurate indexing of image database
Template matching: Chamfer distance
Distance transform (DT)
Approximation of the Euclidean distance in 2-D/3-D.
3x3 distance mask:
b a b
a 0 a
b a b
with a = 3 and b = 4 (the chamfer 3-4 mask).
DT generates a new image in which each pixel value gives the distance to the nearest edge.
Efficient algorithms exist to compute it; the DT is calculated only once per frame.
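The two-pass algorithm above can be sketched in a few lines; this is a minimal pure-Python version on a small binary edge image (values approximate 3x the Euclidean distance, per the 3-4 mask):

```python
INF = 10**9

def chamfer_dt(edges):
    """Two-pass chamfer 3-4 distance transform of a binary edge image.
    edges[y][x] == 1 marks an edge pixel; the result approximates 3x the
    Euclidean distance to the nearest edge."""
    a, b = 3, 4                       # axial / diagonal mask weights
    h, w = len(edges), len(edges[0])
    d = [[0 if edges[y][x] else INF for x in range(w)] for y in range(h)]
    # Forward pass: scan top-left to bottom-right, upper half of the mask.
    for y in range(h):
        for x in range(w):
            for dy, dx, c in ((-1, -1, b), (-1, 0, a), (-1, 1, b), (0, -1, a)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    d[y][x] = min(d[y][x], d[ny][nx] + c)
    # Backward pass: scan bottom-right to top-left, lower half of the mask.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            for dy, dx, c in ((1, 1, b), (1, 0, a), (1, -1, b), (0, 1, a)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    d[y][x] = min(d[y][x], d[ny][nx] + c)
    return d

edges = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
print(chamfer_dt(edges))  # [[4, 3, 4], [3, 0, 3], [4, 3, 4]]
```

Each pass visits every pixel once, which is why the transform is cheap enough to recompute per frame.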
The edge model of the target image is superimposed onto the distance image.
The average (or maximum) of the distance values that the edge model hits gives the Chamfer distance.
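The superimpose-and-average step can be sketched as: look up the precomputed distance image at every edge point of the shifted template and take the mean. A minimal sketch, with a hand-made distance image:

```python
def chamfer_score(dist_image, template_edges, tx, ty):
    """Average distance-image value under the template's edge points,
    with the template shifted by (tx, ty). Lower = better match."""
    vals = [dist_image[y + ty][x + tx] for (x, y) in template_edges]
    return sum(vals) / len(vals)

# Hand-made distance image for a vertical edge at column 2
# (values grow away from the edge).
dist_image = [[2, 1, 0, 1],
              [2, 1, 0, 1],
              [2, 1, 0, 1]]
template = [(0, 0), (0, 1), (0, 2)]   # a vertical edge segment

# Sliding the template across finds the zero-cost alignment.
scores = [chamfer_score(dist_image, template, tx, 0) for tx in range(4)]
print(scores)  # [2.0, 1.0, 0.0, 1.0] -> best match at tx = 2
```

In practice the template is also rotated and scaled; only the cheap lookups are repeated, since the distance image itself is computed once.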
An example of DT image (for the V pose)
Single-frame pose estimation:
Estimation from one image, or from multiple images of different views.
Hand orientation is determined first.
Then search over all possible configurations, given the hand orientation and motion constraints.
Hand pose classification:
The classifier is trained on a large number of labeled poses, which can be generated from artificial 3-D hand models.
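One simple instance of such a classifier is nearest-neighbour matching against synthetically generated labeled poses; the feature vectors below (two joint flexion angles) and the pose labels are made-up stand-ins for illustration:

```python
import math

def nearest_pose(features, training_set):
    """Return the label of the training sample closest in Euclidean
    distance; training_set is a list of (feature_vector, label) pairs."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return min(training_set, key=lambda s: dist(features, s[0]))[1]

# Hypothetical training data rendered from a 3-D hand model:
# (flexion angles of two joints, in radians) -> pose label.
training = [((0.0, 0.0), "open"),
            ((1.5, 1.4), "fist"),
            ((0.1, 1.5), "point")]

print(nearest_pose((1.4, 1.3), training))  # "fist"
```

Real systems use far richer features (e.g. edge templates scored by Chamfer distance), but the train-on-synthetic-poses idea is the same.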
5. Hand Motion Tracking
Pose estimation on frame 0 initializes model-based tracking of subsequent frames.
Bayesian tracking
Multi-resolution partitioning of the state space.
Particle filtering
Approximates arbitrary distributions with a set of random samples.
Deals with clutter and ambiguous situations more effectively, by maintaining multiple hypotheses.
Tree-based filtering and searching
Cluster prototype: a group of similar shape templates.
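The predict/weight/resample cycle of particle filtering can be sketched for a 1-D state; the Gaussian observation model and all parameters below are illustrative assumptions, not values from this work:

```python
import math
import random

def particle_filter_step(particles, observation, motion_std=0.1, obs_std=0.5):
    """One predict / weight / resample cycle for a 1-D state."""
    # Predict: diffuse each hypothesis with random-walk motion noise.
    particles = [p + random.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-(p - observation) ** 2 / (2 * obs_std ** 2))
               for p in particles]
    # Resample: draw particles in proportion to their weights.
    return random.choices(particles, weights=weights, k=len(particles))

random.seed(0)
particles = [random.uniform(0.0, 5.0) for _ in range(1000)]
for obs in (2.0, 2.0, 2.0):              # three identical observations
    particles = particle_filter_step(particles, obs)
estimate = sum(particles) / len(particles)
print(round(estimate, 1))  # close to 2.0
```

Because many hypotheses survive each step, the filter can keep alternative explanations alive through clutter until later observations disambiguate them.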
Hierarchical partitioning of the state space
Challenges:
6. Conclusion
Paper Survey:
Block Diagram of the Prototype
The Proposed Approach
Three phases:
1. Graphical simulation of the hand tracking problem.
2. Tracking with a real video camera, validating the accuracy of the tracking system against the CyberGlove as a reference.
3. Extension to multiple cameras.
Phase 1: Simulation
Study the feasibility of single-camera vision-based hand tracking.
26-DOF 3-D hand model
CyberGlove
Square marker: palm position and orientation (global configuration).
Fingertips: finger posture and joint angles (local configurations).
Phase 1: Simulation (Cont.)
2-D projections are used to estimate the 3-D hand posture,
based on geometric computations and inverse kinematics.
3-D/2-D Posture-to-Feature Transformation
How 3-D model data are projected onto the image plane.
Forward kinematics: 4x4 matrix transformations.
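Forward kinematics with 4x4 homogeneous transforms can be sketched for a planar two-link finger; the link lengths and joint angles below are arbitrary illustration values:

```python
import math

def mat4_mul(A, B):
    """Multiply two 4x4 matrices (row-major nested lists)."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def joint(theta, length):
    """Homogeneous transform for one link: rotate by theta about z,
    then translate `length` along the rotated x axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0, c * length],
            [s,  c, 0, s * length],
            [0,  0, 1, 0],
            [0,  0, 0, 1]]

# Two-link planar finger: chain the per-joint transforms and read the
# fingertip position out of the translation column.
T = mat4_mul(joint(math.pi / 2, 1.0), joint(-math.pi / 2, 1.0))
tip = (T[0][3], T[1][3])
print([round(v, 6) for v in tip])  # [1.0, 1.0]
```

Chaining one such matrix per joint is exactly how the 3-D model points are placed before projection onto the image plane.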
Phase 1: Simulation (Cont.)
2-D/3-D Feature-to-Posture Transformation
2-D marker features => hand posture hypothesis
Pinhole camera model utilized, with perspective geometry and its relevant constraints.
Finger posture: the detected finger markers determine a reachable range for the finger along the camera view direction.
The reachable linear segment is then sampled at constant lengths, and a finger posture hypothesis is calculated for each sample by inverse kinematics (IK).
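The pinhole back-projection and constant-length sampling can be sketched as follows; the focal length, pixel coordinates, and depth range are made-up illustration values:

```python
def project(point, f=500.0):
    """Pinhole projection of a camera-frame 3-D point (X, Y, Z) to pixels."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

def backproject_samples(u, v, z_near, z_far, n, f=500.0):
    """Sample the viewing ray through pixel (u, v) at n evenly spaced
    depths in [z_near, z_far]. Each sample is one fingertip position
    hypothesis that would then be handed to the IK solver."""
    step = (z_far - z_near) / (n - 1)
    return [((u / f) * z, (v / f) * z, z)
            for z in (z_near + i * step for i in range(n))]

samples = backproject_samples(50.0, 25.0, 0.4, 0.8, 5)
# Every sampled 3-D point re-projects to the original marker pixel.
print([tuple(round(c, 6) for c in project(p)) for p in samples])
```

The depth range [z_near, z_far] plays the role of the finger's reachable segment along the view direction; IK then turns each sampled tip position into joint angles.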
Phase 1: Simulation (Cont.)
Thumb: binary search of a lookup table of all feasible end-effector positions.
Other fingers: solved by an error-model analysis technique.
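The thumb lookup can be sketched as a precomputed table of feasible end-effector entries sorted by one key and searched with `bisect`; the (flexion angle, tip distance) pairs below are hypothetical, not measured values:

```python
import bisect

# Hypothetical lookup table, precomputed offline and sorted by angle:
# (thumb flexion angle in radians, tip distance from the palm in cm).
table = [(0.0, 9.0), (0.4, 8.2), (0.8, 6.9), (1.2, 5.1), (1.6, 3.4)]

def nearest_entry(angle):
    """Binary-search the sorted table for the entry closest to `angle`."""
    keys = [k for k, _ in table]
    i = bisect.bisect_left(keys, angle)
    candidates = table[max(0, i - 1):i + 1]
    return min(candidates, key=lambda kv: abs(kv[0] - angle))

print(nearest_entry(0.7))  # (0.8, 6.9)
```

Because the table is sorted, each query costs O(log n) regardless of how densely the feasible positions were sampled.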
Prototype Phase 2 – Facing Reality
Many practical parameters differ from the simulation.
Detection of 2-D features from acquired video frames, using segmented color and silhouette.
Palm: two colored markers (one each on the front and back).
Fingertips: five colored ring markers (one for each finger).
Prototype Phase 3 – Multiple cameras
Camera sensor fusion
Type 1: a posture hypothesis is generated separately for each camera and then validated using the observation models. Useful when the cameras are mobile.
Type 2: the geometrical transformation between the camera coordinate frames is used; the best orientation is used by both models.
Conclusions
The framework presented includes two steps: posture hypothesis and validation.
The framework provides reasonable results compared with the CyberGlove.
Multiple cameras help cover more area and improve tracking accuracy.
Handles intermittent occlusion for a short time.
Future work: 3-D marker-less hand tracking.
Comments:
A prototype of 3-D model-based hand tracking in a general environment with an unconstrained background.
Recognizes dynamic gestures in real time.
A dataglove is used to validate the proposed framework.
Colored markers are used to assist palm and fingertip recognition.
Comments (Cont.):
Lacks palm identification for bare hands.
Hand silhouette and skin color could be used for hand orientation estimation.
Marker-less edge/contour detection for fingertips.
Elbow, arm, and shoulder information may be used to reduce the dimensionality of matching the 3-D hand model.
3D Model-Based Hand Gesture Recognition and Tracking
Questions......
Comments......
Suggestions......