Some Methods for Augmented Reality Applications

Vincent Lepetit, Mustafa Ozuysal, Julien Pilet, Pascal Lagger, Pascal Fua

Vision-Based 3D Tracking


Recursive Tracking
[Figure: successive time steps t = 0, t = 1, t = 2, ...]

Real-Time 3D Object Detection


Runs at 15 Hz

Keypoint Recognition
The general approach [Lowe, Schmid, Mikolajczyk, Matas] is a particular case of classification:
• Pre-processing: make the actual classification easier;
• Search in the database: nearest neighbor classification.

One class per keypoint: the set of the keypoint’s possible appearances under varying perspective, lighting, noise...

Training phase

Classifier
Used at run-time to recognize the keypoints


A New Classifier: Ferns


Presentation on an Example

Ferns: Training
The tests compare the intensities of two pixels around the keypoint:

$f_j = 1$ if $I(d_{j,1}) < I(d_{j,2})$, and $f_j = 0$ otherwise.

Invariant to light changes by any increasing function. Posterior probabilities:
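A minimal sketch of why the tests are unaffected by such lighting changes. The function names, the flattened-patch layout, and the `brighten` mapping are illustrative assumptions, not part of the original implementation. Each test only compares two intensities, so applying the same strictly increasing function to every pixel leaves all outcomes, and therefore the resulting index, unchanged.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One binary test: the two pixel positions to compare, given as indices
// into a flattened patch (this layout is an assumption).
using PixelPair = std::pair<int, int>;

// Concatenate the outcomes of the tests into an integer, one bit per test.
int fern_index(const std::vector<std::uint8_t>& patch,
               const std::vector<PixelPair>& tests) {
    const int n = static_cast<int>(patch.size());
    int index = 0;
    for (const PixelPair& t : tests) {
        assert(t.first >= 0 && t.first < n && t.second >= 0 && t.second < n);
        index <<= 1;
        if (patch[t.first] < patch[t.second]) index++;
    }
    return index;
}

// A hypothetical strictly increasing intensity change: v -> 2 * v + 10.
// Inputs are limited to 122 so that the result still fits in 8 bits.
std::vector<std::uint8_t> brighten(const std::vector<std::uint8_t>& patch) {
    std::vector<std::uint8_t> out;
    for (std::uint8_t v : patch) {
        assert(v <= 122);
        out.push_back(static_cast<std::uint8_t>(2 * v + 10));
    }
    return out;
}
```

For the patch {10, 40, 40, 90, 25, 60} and the tests (0,1), (3,2), (4,5), both the original and the brightened patch give the index 5 (binary 101).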

Ferns: Training
[Figure: binary test outcomes and the corresponding leaf indices for example training patches.]

Ferns: Training

Ferns: Training Results

Ferns: Recognition

Justification
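We are looking for $\arg\max_i P(C = c_i \mid f_1, \ldots, f_N)$, which is proportional to $P(f_1, \ldots, f_N \mid C = c_i)\,P(C = c_i)$, but a complete representation of the joint distribution is infeasible. Naive Bayes ignores the correlation between the tests:

$P(f_1, \ldots, f_N \mid C = c_i) \approx \prod_{j=1}^{N} P(f_j \mid C = c_i)$

Compromise:

$P(f_1, \ldots, f_N \mid C = c_i) \approx \prod_{k=1}^{M} P(F_k \mid C = c_i)$, where $F_k$ is the group of $S$ tests of fern $k$,

i.e. the probabilities stored by the leaves.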

It Really Works


Ferns Implementation
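    // H classes, M ferns, S binary tests per fern.
    // P: one score per class. K: points into the image at the keypoint.
    // D: pixel offsets, two per test. PF: values stored by the leaves.
    for (int i = 0; i < H; i++) P[i] = 0.;
    for (int k = 0; k < M; k++) {                 // for each fern
      int index = 0, *d = D + k * 2 * S;
      for (int j = 0; j < S; j++) {               // for each test of the fern
        index <<= 1;
        if (*(K + d[0]) < *(K + d[1]))            // compare two pixel intensities
          index++;
        d += 2;
      }
      p = PF + k * shift2 + index * shift1;       // values stored at this leaf
      for (int i = 0; i < H; i++) P[i] += p[i];   // accumulate the class scores
    }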

Very simple to implement; no need for orientation or perspective correction; no parameters to tune; very fast.

Takes a lot of memory: M × 2^S × H floating-point values to store.
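As a rough illustration of the memory cost, the sketch below counts one stored value per fern, per leaf, and per class, using the symbols of the implementation listing (M ferns, S tests per fern, H classes). The function name and the choice of `float` for the stored values are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to store one float per (fern, leaf, class):
// M ferns, 2^S leaves per fern, H classes.
std::size_t fern_table_bytes(std::size_t M, std::size_t S, std::size_t H) {
    assert(S < 8 * sizeof(std::size_t));  // 2^S must fit in a size_t
    return M * (std::size_t{1} << S) * H * sizeof(float);
}
```

With the hypothetical values M = 50, S = 11 and H = 200, this gives 20,480,000 values, about 78 MiB with 4-byte floats.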


Feature Harvesting: Estimating the Posterior Probabilities from Video Sequences


Feature Harvesting
Estimate the posterior probabilities from a training video sequence:


Feature Harvesting
With the ferns, we can easily:
- add a class;
- remove a class;
- add samples of a class to refine the classifier.
→ Incremental learning

Detect object in current frame → matches → training examples → update classifier.

• No need to store image patches;
• We can select the keypoints the classifier can recognize.
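A minimal sketch of why these operations are cheap, assuming each fern simply keeps, for every class, a count of how many training samples reached each leaf. The class `FernCounts` and its methods are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Counts for one fern: for each class id, the number of training samples
// that fell into each of the 2^S leaves.
class FernCounts {
public:
    explicit FernCounts(int num_tests) : leaves_(1 << num_tests) {}

    // Adding a sample increments a single counter; a class is created
    // the first time one of its samples is seen.
    void add_sample(int class_id, int leaf) {
        assert(leaf >= 0 && leaf < leaves_);
        auto it = counts_.find(class_id);
        if (it == counts_.end())
            it = counts_.emplace(class_id, std::vector<int>(leaves_, 0)).first;
        it->second[leaf]++;
    }

    // Removing a class discards its counters; other classes are untouched.
    void remove_class(int class_id) { counts_.erase(class_id); }

    int num_classes() const { return static_cast<int>(counts_.size()); }

    int count(int class_id, int leaf) const {
        assert(leaf >= 0 && leaf < leaves_);
        auto it = counts_.find(class_id);
        return it == counts_.end() ? 0 : it->second[leaf];
    }

private:
    int leaves_;
    std::map<int, std::vector<int>> counts_;
};
```

Only the leaf index of each training sample is used, so the image patches themselves do not have to be kept.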

Feature Harvesting Steps

Feature Harvesting

Test Sequence


Application to Deformable Objects

Detecting a Deformable Object


15 frames/sec.


Energy Minimization
$\varepsilon(X, Y) = \varepsilon_C(X, Y) + \lambda_D\,\varepsilon_D(X, Y)$

where $\varepsilon_C$ is the wide-baseline matching term and $\varepsilon_D$ is the regularization term.
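To make the structure of this energy concrete, here is a toy one-dimensional sketch: a data term that pulls the unknowns toward the positions suggested by the matches, plus a weighted term that penalizes bending. The quadratic form of both terms, the roles given to `y` and `target`, and all function names are assumptions made for illustration; they are not the terms used in the actual method.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Data term: squared distance between each unknown y[i] and the position
// target[i] suggested by a match.
double correspondence_energy(const std::vector<double>& y,
                             const std::vector<double>& target) {
    assert(y.size() == target.size());
    double e = 0.0;
    for (std::size_t i = 0; i < y.size(); i++)
        e += (y[i] - target[i]) * (y[i] - target[i]);
    return e;
}

// Regularization term: squared second differences, which vanish when
// neighboring unknowns are evenly spaced.
double deformation_energy(const std::vector<double>& y) {
    double e = 0.0;
    for (std::size_t i = 1; i + 1 < y.size(); i++) {
        const double d = y[i - 1] - 2.0 * y[i] + y[i + 1];
        e += d * d;
    }
    return e;
}

// Total energy: data term plus lambda_d times the regularization term.
double total_energy(const std::vector<double>& y,
                    const std::vector<double>& target, double lambda_d) {
    return correspondence_energy(y, target) + lambda_d * deformation_energy(y);
}
```

For y = {0, 1, 4}, target = {0, 1, 2} and lambda_d = 0.5, the data term is 4, the regularization term is 4, and the total is 6.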

Input Image

Model Image


Realistic Rendering


Application to Geometric and Photometric Calibration



Calibration

• Internal parameters;
• Relative motions;
• Chain the relative motions to get the absolute poses;
• Bundle adjustment.
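Chaining the relative motions amounts to composing rigid transforms. A minimal sketch, assuming each relative motion is given as a 4×4 homogeneous matrix that maps the frame of one view to the frame of the next; this convention and all names are assumptions.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Mat4 = std::array<std::array<double, 4>, 4>;  // homogeneous rigid transform

Mat4 identity() {
    Mat4 m{};  // zero-initialized
    for (int i = 0; i < 4; i++) m[i][i] = 1.0;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c{};
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            for (int k = 0; k < 4; k++) c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Pure translation, used here only to build a simple example.
Mat4 translation(double x, double y, double z) {
    Mat4 m = identity();
    m[0][3] = x;
    m[1][3] = y;
    m[2][3] = z;
    return m;
}

// relative[i] maps view i to view i + 1 (assumed convention).
// Returns the pose of every view expressed with respect to view 0.
std::vector<Mat4> chain_poses(const std::vector<Mat4>& relative) {
    std::vector<Mat4> absolute{identity()};
    for (const Mat4& r : relative)
        absolute.push_back(multiply(r, absolute.back()));
    assert(absolute.size() == relative.size() + 1);
    return absolute;
}
```

Errors accumulate along such a chain, which is one reason to follow it with a bundle adjustment step.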

Photometric Calibration
Make use of the same calibration object. Two methods:
• Dynamic, with Lambertian rendering only;
• Static, allowing shadows and specular reflections.

Using the Geometric Calibration Target to Measure Light
[Figure: a point m on the calibration target, with albedo(m), normal n_t at time t, and observed intensity I(camera, t, m).]

Solve for gain, bias of each camera, and irradiance for observed normals.
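A simplified sketch of one part of this problem: fitting the gain and bias of a single camera by least squares, assuming the linear model observed = gain × predicted + bias, where `predicted` stands for albedo times irradiance and is taken as known. Both the model and the function name are assumptions; the full problem also solves for the irradiance, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Least-squares fit of observed = gain * predicted + bias.
// Returns the pair (gain, bias).
std::pair<double, double> fit_gain_bias(const std::vector<double>& predicted,
                                        const std::vector<double>& observed) {
    assert(predicted.size() == observed.size() && predicted.size() >= 2);
    const double n = static_cast<double>(predicted.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < predicted.size(); i++) {
        sx += predicted[i];
        sy += observed[i];
        sxx += predicted[i] * predicted[i];
        sxy += predicted[i] * observed[i];
    }
    const double denom = n * sxx - sx * sx;
    assert(denom != 0.0);  // the predicted values must not all be equal
    const double gain = (n * sxy - sx * sy) / denom;
    const double bias = (sy - gain * sx) / n;
    return {gain, bias};
}
```

For predicted = {0, 1, 2, 3} and observed = {5, 7, 9, 11}, the fit returns a gain of 2 and a bias of 5.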

Radiance Map Computation

INTERPOLATION

Lambertian material rendering; Updated in real-time.

Using and Updating the Radiance Map


Point Light Sources for AR

DECONVOLUTION

Retrieve point light sources for:
– Cast shadows; – Specular reflections.

Not real-time anymore.


You can download the source code (under GPL) for
• keypoint recognition,
• planar object detection, and
• multi-camera geometric and photometric calibration
at http://cvlab.epfl.ch/software/bazar
Installation on Windows, Linux, and Mac OS X should be easy.

Thank you

References
V. Lepetit and P. Fua. Keypoint recognition using randomized trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9):1465-1479, September 2006.
M. Ozuysal, V. Lepetit, F. Fleuret and P. Fua. Feature Harvesting for Tracking-by-Detection. In European Conference on Computer Vision, Vol. 3, pp. 592-605, 2006.
J. Pilet, V. Lepetit and P. Fua. Fast Non-Rigid Surface Detection, Registration, and Realistic Augmentation. Accepted to International Journal of Computer Vision, 2006.
M. Salzmann, S. Ilic and P. Fua. Physically Valid Shape Parameterization for Monocular 3-D Deformable Surface Tracking. British Machine Vision Conference, 2005.


Estimating [...] from Samples

[Equation: a combination of the empirical probability and a constant value.]

It is easy to prove that this expression for [...] can be estimated as [...] if we model [...] as [...], with [...] the number of samples that verify [...] and u a positive constant (in practice u = 1).
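One common estimator built from sample counts and a positive constant u is additive smoothing. The formula in the sketch below is an assumption of that kind, not necessarily the expression intended here, and the function name is hypothetical. It never returns a zero probability, even for a leaf that no training sample of the class has reached.

```cpp
#include <cassert>
#include <vector>

// Smoothed estimate of the probability of each leaf for one class:
//   (N_leaf + u) / (N_class + L * u)
// where N_leaf is the number of samples of the class that reached the leaf,
// N_class the total number of samples of the class, and L the number of leaves.
std::vector<double> leaf_probabilities(const std::vector<int>& leaf_counts,
                                       double u) {
    assert(!leaf_counts.empty() && u > 0.0);
    long total = 0;
    for (int n : leaf_counts) total += n;
    const double denom =
        static_cast<double>(total) + u * static_cast<double>(leaf_counts.size());
    std::vector<double> p;
    for (int n : leaf_counts) p.push_back((n + u) / denom);
    return p;
}
```

With the counts {3, 0, 1, 0} and u = 1, the estimates are 0.5, 0.125, 0.25 and 0.125, which sum to 1.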

Optimized Locations versus Random Locations: We Can Use Random Tests
Comparison of the recognition rates for 200 keypoints:
[Plot: recognition rate versus number of trees, for tests selected by information gain optimization and for tests selected at random.]


We can use random tests
• For a small number of classes
  – we can try several tests, and
  – retain the best one according to some criterion.
• When the number of classes is large
  – any test does a decent job.

Why it is interesting
• Building the ferns takes no time (except for the estimation of the posterior probabilities);
• Allows incremental learning;
• Simplifies the classifier structure.
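A minimal sketch of what building ferns from random tests can look like: drawing pairs of pixel positions uniformly at random. The flat array layout, the use of patch indices rather than image offsets, and the function name are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Random pixel positions for M ferns of S tests each, two positions per test,
// stored flat: test j of fern k uses D[(k * S + j) * 2] and the next entry.
std::vector<int> random_tests(int M, int S, int patch_pixels, unsigned seed) {
    assert(M > 0 && S > 0 && patch_pixels > 1);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(0, patch_pixels - 1);
    std::vector<int> D;
    D.reserve(static_cast<std::size_t>(M) * S * 2);
    for (int t = 0; t < M * S; t++) {
        const int a = pick(rng);
        int b = pick(rng);
        while (b == a) b = pick(rng);  // comparing a pixel with itself is useless
        D.push_back(a);
        D.push_back(b);
    }
    return D;
}
```

No training data is involved at this stage; the samples are only needed afterwards, to estimate the probabilities stored in the leaves.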



Comparing Trees and Ferns
Trees:
• are organized hierarchically;
• combine posterior distributions additively.

Ferns:
• are flat;
• combine posterior distributions multiplicatively;
• use a finer model when estimating the posterior distributions from examples.
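The difference between additive and multiplicative combination can be seen on a small example. In this sketch both rules are fed the same table of per-structure class distributions, which is a simplification made for illustration; the type and function names are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Table = std::vector<std::vector<double>>;  // [structure][class]

int argmax(const std::vector<double>& v) {
    assert(!v.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < v.size(); i++)
        if (v[i] > v[best]) best = i;
    return static_cast<int>(best);
}

// Additive rule: sum the distributions (averaging, up to a constant factor).
int combine_additively(const Table& per_structure) {
    std::vector<double> score(per_structure.at(0).size(), 0.0);
    for (const auto& dist : per_structure)
        for (std::size_t c = 0; c < score.size(); c++) score[c] += dist[c];
    return argmax(score);
}

// Multiplicative rule: multiply the probabilities, computed here as a sum
// of logarithms to avoid numerical underflow.
int combine_multiplicatively(const Table& per_structure) {
    std::vector<double> score(per_structure.at(0).size(), 0.0);
    for (const auto& dist : per_structure)
        for (std::size_t c = 0; c < score.size(); c++)
            score[c] += std::log(dist[c]);
    return argmax(score);
}
```

With the two distributions {0.70, 0.25, 0.05} and {0.02, 0.40, 0.58}, the additive rule selects class 0 (sums 0.72, 0.65, 0.63) while the multiplicative rule selects class 1 (products 0.014, 0.100, 0.029): a single very low probability is enough to rule out a class under the product.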


The Tree Structure Is Not Needed When the Features Are Taken at Random
[Figure: tests f1, f2 and f3.]

Ferns outperform Trees
• 500 classes.
• No orientation or perspective correction.
[Plot: recognition rate versus number of structures, for ferns and trees.]

Ferns outperform Trees
[Plot: recognition rate versus number of classes, for ferns and trees.]

Linking the Two Approaches
The ferns consider [...] with [...], while the trees consider [...].

Linking the Two Approaches
It can be proved that the two methods are equivalent when the [...] are small:

Comparison with SIFT
Recognition rate
[Plot: number of inliers versus frame index, for ferns and SIFT.]

Comparison with SIFT
Computation time
• SIFT: 1 ms to compute the descriptor of a keypoint (not including the convolutions);
• FERNS: $13.5 \times 10^{-3}$ ms to classify one keypoint into 200 classes.

Naive Feature Tracking
Track keypoints by frame-to-frame matching → training examples → build classifier.
Does not work very well:
• Prone to drift and tracking failures;
• How to select “good” keypoints?
→ Feature harvesting.

Handling Light Changes


3D Shape Recovery from Monocular Video

M. Salzmann, S. Ilic and P. Fua. Physically Valid Shape Parameterization for Monocular 3-D Deformable Surface Tracking. BMVC05.

Database of Feasible Shapes PCA

Bending modes

Extension modes


Vision for the America's Cup

CTI Project

Stretchable Material


Robust Bundle Adjustment
Error evolution

• One pixel reprojection error

Same Camera, 2 Sets of Images


Optimization
• Random initialization;
• Successive minimizations with lower and lower values for r: r0 = 1000 → r = 1 or 2 pixels.
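The coarse-to-fine idea can be sketched on a toy one-dimensional problem, assuming r acts as the radius of a robust estimator: residuals larger than r are ignored. The estimator used here (the mean of the samples within r of the current estimate) and the halving schedule are stand-ins chosen for brevity, not the cost function or schedule of the actual bundle adjustment.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One refinement step: mean of the samples within distance r of the
// current estimate (the estimate is kept if no sample qualifies).
double refine(const std::vector<double>& samples, double estimate, double r) {
    double sum = 0.0;
    int n = 0;
    for (double s : samples) {
        if (std::fabs(s - estimate) <= r) {
            sum += s;
            n++;
        }
    }
    return n > 0 ? sum / n : estimate;
}

// Successive refinements with a smaller and smaller radius, from r0 down
// to r_final, starting from an arbitrary initial estimate.
double coarse_to_fine(const std::vector<double>& samples, double start,
                      double r0, double r_final) {
    assert(r0 >= r_final && r_final > 0.0);
    double estimate = start;
    for (double r = r0; r > r_final; r /= 2.0)
        estimate = refine(samples, estimate, r);
    return refine(samples, estimate, r_final);
}
```

With the samples {10, 10.5, 9.5, 10, 500}, a start at 300, r0 = 1000 and r_final = 1, the large radius first lets every sample contribute; as r shrinks, the outlier at 500 is excluded and the estimate settles at 10.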


Internal Parameters Calibration
[Figure: homography.]

External Parameter Calibration
[Figure: world coordinate system, relative 3D poses and absolute 3D poses, with labels P1 to P4 and A1 to A3.]

Bundle adjustment on: P1, P2, P3, P4, A1, A2 and A3
