1. Introduction

As computer graphics and robot vision become more and more capable, attention is being focussed on complex high quality 3D models and the way they can be acquired.

The visual hull E defined by the silhouette set S_i (see Fig. 1) can be written as:

E = \bigcap_{i=1,\dots,n} C_i,

where C_i = \{\, p \mid P_i(p) \in S_i \,\} is the cone associated with silhouette S_i and projection P_i.
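The intersection of silhouette cones can be sketched as a point-membership test: a 3D point belongs to E if and only if its projection falls inside every silhouette. This is a minimal sketch, assuming hypothetical 3x4 projection matrices and boolean silhouette masks standing in for the calibrated cameras and extracted silhouettes of the paper:

```python
import numpy as np

def in_visual_hull(p, projections, silhouettes):
    """Check whether the 3D point p lies in E = intersection of the cones C_i:
    its projection into every image must fall inside that image's silhouette S_i.
    `projections` are hypothetical 3x4 camera matrices, `silhouettes` boolean masks."""
    ph = np.append(p, 1.0)                      # homogeneous coordinates
    for P, S in zip(projections, silhouettes):
        x, y, w = P @ ph
        u, v = int(round(x / w)), int(round(y / w))
        if not (0 <= v < S.shape[0] and 0 <= u < S.shape[1] and S[v, u]):
            return False                        # outside one cone -> outside E
    return True
```

Evaluating this test on a regular voxel grid, followed by a marching-cubes triangulation, yields a mesh of the visual hull.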
Proceedings of the First International Symposium on 3D Data Processing Visualization and Transmission (3DPVT’02)
0-7695-1521-5/02 $17.00 © 2002 IEEE
visual hull by marching cube triangulation. We can see in Fig. 2 the visual hull obtained for a statuette with an octree depth of 8. The resulting 3D mesh contains 97223 vertices and 194442 triangles.

3. Multi-stereo carving

The carving step will deform the initial model in order to adjust it to the true surface. To do so, we will use the information contained in the object texture. Evidently, if the object has no texture, this information is not available.

The multi-stereo carving procedure can be decomposed into 4 different steps:

- carving candidate detection,
- carving direction selection,
- carving depth estimation,
- depth regularisation.

We describe these 4 different steps in the following subsections.

3.1. Carving candidate detection

To detect candidates for carving among the vertices of the visual hull mesh, we have to evaluate the depth quality of the surface at each one of its vertices. This quality measure is based on the colour coherence of the projections of a vertex into the different images. To measure image coherence, we need to find a way to extract vectors of information from the original data (i.e. the images). This is discussed in section 3.1.2. We also need to define a multi-vector likeness criterion, as follows.

3.1.1 Multi-vector cross-correlation criterion

Let \vec{v}_i, i \in \{1,\dots,n\}, be n different vectors. We would like to measure their likeness. If n = 2, a very well known criterion is the normalized cross-correlation. We can define the normalized vector \vec{n}_i as

\vec{n}_i = \frac{\vec{v}_i - \vec{m}_i}{\|\vec{v}_i - \vec{m}_i\|},

with \vec{m}_i being the vector whose components are the mean of the components of \vec{v}_i. The cross-correlation of the vectors \vec{v}_i and \vec{v}_j is then just defined as the scalar product of the normalized vectors \vec{n}_i, \vec{n}_j:

C_{ij} = \vec{n}_i \cdot \vec{n}_j.

If n > 2, a possible measure of likeness is the mean cross-correlation between all the possible vector pairs:

C_1 = \frac{1}{n(n-1)} \sum_{i,j=1,\, i \neq j}^{n} C_{ij}.

If \vec{s} is the mean vector,

\vec{s} = \frac{1}{n} \sum_{i=1}^{n} \vec{n}_i,

and \vec{s}_i is the partial mean sum vector,

\vec{s}_i = \frac{1}{n-1} \left( n\vec{s} - \vec{n}_i \right),

then

\sum_{i,j=1}^{n} C_{ij} = \sum_{i,j=1}^{n} \vec{n}_i \cdot \vec{n}_j = \left( \sum_{i=1}^{n} \vec{n}_i \right) \cdot \left( \sum_{j=1}^{n} \vec{n}_j \right) = n\vec{s} \cdot n\vec{s} = n^2 \|\vec{s}\|^2,

and C_1 can be written as a function of \vec{s}:

C_1 = \frac{n \|\vec{s}\|^2 - 1}{n-1}.

The major problem with this criterion is that it does not allow eliminating vectors which are very different from the others. This can happen, for example, when there is a highlight in one of the correlation windows. A possible solution can be found if we write C_1 as the mean correlation between each vector \vec{n}_i and its partial mean sum vector \vec{s}_i:

C_1 = \frac{1}{n} \sum_{i=1}^{n} \vec{s}_i \cdot \vec{n}_i.

We can then compute the likeness between every vector and its partial mean sum vector,

p_i = \vec{s}_i \cdot \vec{n}_i, \quad \forall i \in \{1,\dots,n\},

compute the maximum of p_i,
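These criteria translate into a few lines of NumPy. The function names below are ours, assuming each \vec{v}_i is a vector extracted from an image (e.g. a flattened correlation window); the thresholded criterion C_2, built from the p_i and their maximum, keeps only the vectors whose correlation with their partial mean sum exceeds a fraction T of the best one:

```python
import numpy as np

def normalize(V):
    """n_i = (v_i - m_i) / ||v_i - m_i||, with m_i the constant vector
    holding the mean of the components of v_i (one row per vector)."""
    M = V - V.mean(axis=1, keepdims=True)
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def c1(N):
    """Mean pairwise cross-correlation via the closed form
    C1 = (n ||s||^2 - 1) / (n - 1), with s the mean of the rows n_i."""
    n = len(N)
    s = N.mean(axis=0)
    return (n * (s @ s) - 1.0) / (n - 1.0)

def c2(N, T=0.5):
    """Thresholded criterion: correlate each n_i with its partial mean
    sum vector s_i = (n s - n_i) / (n - 1), keep only p_i > T * max(p)."""
    n = len(N)
    s = N.mean(axis=0)
    Si = (n * s - N) / (n - 1.0)        # partial mean sum vectors
    p = np.sum(Si * N, axis=1)          # p_i = s_i . n_i
    keep = p > T * p.max()
    return p[keep].mean()               # average over the K kept vectors
```

When every p_i is positive and T = 0, nothing is discarded and c2 reduces to c1, which is exactly the identity C_1 = (1/n) \sum_i \vec{s}_i \cdot \vec{n}_i.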
Figure 2. Visual hull of a statuette. Some samples of the 72 silhouettes are shown at the bottom.
and only use those vectors \vec{n}_i which verify

p_i > T\, p_{max}, \quad T \in [0,1].

If K is the number of vectors that verify the preceding equation, we can write a second criterion C_2:

C_2 = \frac{1}{K} \sum_{i,\; p_i > T p_{max}} \vec{s}_i \cdot \vec{n}_i.

3.1.2 Evaluation methods

Having chosen a multi-vector criterion, we need to define the way information is extracted from the images. Different authors have used various approaches, but they always use the same basic idea: if a 3D point belongs to the object surface, its projections into the different cameras which really see it (i.e. there is no occlusion) will be closely correlated. A first method consists of projecting the 3D point into the image of each camera, sampling the image colour at each point of projection and measuring the consistency of all the colour samples [3]. But this measure is very sensitive to noise, since we only have one sample per image. To increase robustness, it is preferable to take several samples per image. A way of doing this is to fit a quadric surface to the model [5], and to use the surface-induced mapping between cameras to compute coherence. This procedure has been successfully tested. But thanks to the high resolution of our 3D model, we can assume the surface to be locally flat. This improves speed, as the mapping induced by a plane between two cameras is a homography.

We show in Fig. 3 the result of using the C_2 criterion in a plane-based correlation. We can see the relation between the regions with a weak correlation (black and blue colours) and the concave regions, in particular in the areas close to the ears and the nose.

3.2. Carving direction selection

For every carving candidate we need to define the direction in which to carve. This direction is chosen as the optical ray passing through the optical center of one of the cameras and the 3D point. Ideally, the best camera would have its image plane as perpendicular to the object normal as possible, but unfortunately, at this stage we know neither the right shape of the object nor its normal. As a first try, we could use the visual hull model as an estimation of the object and consider the visual hull normal as the object normal. If the visual hull is close to the true shape, the estimation will be good enough. Otherwise, the visual hull normal may be completely wrong (Fig. 4). A more efficient solution consists of taking the median camera among the set of cameras which actually see the candidate. This method does not depend on the normal estimation and is thus more stable. In some cases, such as in Fig. 4, the method may not work, as the above set of cameras can be void at the bottom of some folds. This is an intrinsic problem of the visual hull.
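The median-camera rule can be sketched as follows. This is a minimal sketch under our own assumptions: the helper name is hypothetical, and camera ids are taken to be ordered along the acquisition circle (the paper's 72-view turntable setup suggests such an ordering), ignoring wrap-around:

```python
def median_camera(visible_ids):
    """Pick the median camera among those that actually see the carving
    candidate. Returns None when the set is void, the intrinsic
    visual-hull failure case illustrated in Fig. 4."""
    if not visible_ids:
        return None
    ids = sorted(visible_ids)       # assumed ordered along the acquisition circle
    return ids[len(ids) // 2]
```

Because the choice depends only on which cameras see the candidate, not on any normal estimate, it remains stable where the visual hull normal is wrong.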
Figure 3. Plane-based correlation results
Figure 4. […] normal variations. Right: The corresponding high resolution mesh.

3.3. Carving depth estimation

For every point, we have to estimate the depth along the carving direction needed to attain the object surface. Unlike the carving candidate detection method, which was an evaluation method, the depth estimation cannot benefit from a mapping-induced surface. For the correlation method we have chosen the easiest solution: a rectangular window based correlation, also called fronto-parallel correlation. The algorithm used for the evaluation of a depth along the carving direction is similar to that of [13]. For every 3D depth, the algorithm computes the projected 2D point for every image, extracts the centered window by resampling the image, and […]

d_{3D}(d_{2D}) = \frac{A\, d_{2D}}{B - C\, d_{2D}}.

Using these two equations, we can deduce the relationship between two different 2D deviations d^i_{2D} and d^j_{2D}: from

d_{3D}(d^j_{2D}) = \frac{A_j\, d^j_{2D}}{B_j - C_j\, d^j_{2D}}

and

d^i_{2D}(d_{3D}) = \frac{B_i\, d_{3D}}{A_i + C_i\, d_{3D}},

we can substitute:

d^i_{2D}(d_{3D}(d^j_{2D})) = \frac{B_i A_j\, d^j_{2D}}{A_i (B_j - C_j\, d^j_{2D}) + C_i A_j\, d^j_{2D}} = \frac{E_{ij}\, d^j_{2D}}{F_{ij} + G_{ij}\, d^j_{2D}}.

This relation is exact¹, whereas in [13] the approximation d^i_{2D} = K\, d^j_{2D} is used. With these equations we can relate in a single coordinate system all the correlation curves […]

¹ In a pinhole camera model context.
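The substitution above can be checked with exact rational arithmetic. The constants below are arbitrary hypothetical values, and the composed coefficients follow by expanding the denominator: E_ij = B_i A_j, F_ij = A_i B_j and G_ij = C_i A_j - A_i C_j:

```python
from fractions import Fraction as F

# Hypothetical per-camera constants for the exact relations
#   d3D(d2D^j) = Aj d / (Bj - Cj d)   and   d2D^i(d3D) = Bi d / (Ai + Ci d)
Ai, Bi, Ci = F(2), F(3), F(1)
Aj, Bj, Cj = F(5), F(7), F(2)

def d3D_from_j(d):
    return Aj * d / (Bj - Cj * d)

def d2D_i(d):
    return Bi * d / (Ai + Ci * d)

# Coefficients of the composed homographic relation d2D^i = E d / (Fc + G d)
E, Fc, G = Bi * Aj, Ai * Bj, Ci * Aj - Ai * Cj

d = F(1, 3)
lhs = d2D_i(d3D_from_j(d))      # exact composition
rhs = E * d / (Fc + G * d)      # closed form
assert lhs == rhs
```

Exact fractions rather than floats make the identity check free of rounding error.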
[…] need to calculate the depth steps which correspond to a local image variation of one pixel. This means that, for a given depth interval, cameras close to the median camera compute a low number of correlations and cameras far from the median camera need a higher number. Logically, the bigger the parallax is, the higher the depth resolution. This property has led us to split the carving procedure into two stages:

- a first stage with a reduced number of cameras near the median camera, for a quick rough initial search in all the depth interval,
- a second stage with the full set of cameras within a smaller interval, for a high-accuracy search.

For the initial search, we use only the 4 nearest cameras to the main camera (Fig. 5). The depth search interval is given by the visual hull, because we know that points must stay inside the visual hull volume. Using cameras with a small parallax allows us to scan a big depth range with a low number of correlations.

Figure 5. Initial search with the 4 closest cameras to the main camera. a) Valid case. b) Non-valid case.

[Plots: "Accurate estimation" correlation curves (correlation vs. pixels).]
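The two-stage procedure above can be sketched as a coarse-to-fine 1D optimisation. Here `score` is a hypothetical callback standing in for the mean correlation at a given depth along the carving direction (few nearest cameras in the first stage, full camera set in the second), and the step counts are illustrative:

```python
import numpy as np

def two_stage_depth_search(score, d_min, d_max, coarse_steps=50, fine_steps=50):
    """Coarse-to-fine search for the depth maximising `score`.
    Stage 1: rough scan of the whole interval [d_min, d_max].
    Stage 2: dense scan of a small interval around the stage-1 maximum."""
    coarse = np.linspace(d_min, d_max, coarse_steps)
    best = coarse[np.argmax([score(d) for d in coarse])]
    step = (d_max - d_min) / (coarse_steps - 1)
    fine = np.linspace(best - step, best + step, fine_steps)   # smaller interval
    return fine[np.argmax([score(d) for d in fine])]
```

In the paper's setting, [d_min, d_max] is bounded by the visual hull, since carved points must stay inside its volume.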
Figure 7. (a) Carved model. (b) Multi-correlation results. (c) Carving depths.
Figure 8. Carved model with filtering.
Figure 9. Depth regularisation.

Figure 10. a) Original image. Below: detail of the neck in a posterior view. b) 3D model with texture mapping from the same point of view.

The total processing time for the full 3D model reconstruction (visual hull construction + carving + texture mapping) is around 10 minutes on a PC with a Pentium 4 processor at 1.4 GHz.

4. Conclusions

We propose a 3D reconstruction method for real objects from a sequence of images. After the construction of the visual hull defined by the object silhouettes, we carve it with a multi-correlation technique. This method works well if the shape of the visual hull is not very different from that of the object, even if the carving depth is big (e.g. the face of the statue). If the mesh deformation is very important (e.g. the back appendix in the model), the results are noisy because there are many points that cannot be carved. We are considering a more global method such as the level set approach [6, 4, 9], which we would like to combine with the present technique in order to control its computational cost.

5. Acknowledgements

[…]

References

[1] B. G. Baumgart. Geometric Modelling for Computer Vision. PhD thesis, Stanford University, 1974.
[2] E. Boyer. Object models from contour sequences. In Proc. ECCV, pages 109–118, 1996.
[3] H. B. C.-E. Liedtke and R. Koch. Shape adaptation for modelling of 3D objects in natural scenes. In IEEE Proc. Computer Vision and Pattern Recognition, Hawaii, 1991.
[4] A. Colosimo, A. Sarti, and S. Tubaro. Image-based object modeling: a multiresolution level-set approach. In IEEE International Conference on Image Processing, volume 2, pages 181–184, 2001.
[5] G. Cross and A. Zisserman. Surface reconstruction from multiple views using apparent contours and surface texture. In A. Leonardis, F. Solina, and R. Bajcsy, editors, NATO Advanced Research Workshop on Confluence of Computer Vision and Computer Graphics, Ljubljana, Slovenia, pages 25–47, 2000.
[6] O. Faugeras and R. Keriven. Variational principles, surface evolution, PDEs, level set methods, and the stereo problem. IEEE Transactions on Image Processing, 7(3):336–344, March 1998.
[7] A. Laurentini. The visual hull concept for silhouette-based image understanding. IEEE Trans. on PAMI, 16(2), 1994.
[8] J. M. Lavest, M. Viala, and M. Dhome. Do we really need an accurate calibration pattern to achieve a reliable camera calibration? In Proc. of the 5th European Conf. on Computer Vision, volume 1, pages 158–174, Germany, 1998.
[9] R. Malladi, J. A. Sethian, and B. Vemuri. Shape modelling with front propagation: A level set approach. IEEE Trans. on PAMI, 17(2):158–175, February 1995.
[10] Y. Matsumoto, K. Fujimura, and T. Kitamura. Shape-from-silhouette/stereo and its application to 3-D digitizer. Lecture Notes in Computer Science, 1568:177–188, 1999.
[11] Y. Matsumoto, H. Terasaki, K. Sugimoto, and T. Arakawa. A portable three-dimensional digitizer. In Int. Conf. on Recent Advances in 3D Imaging and Modeling, pages 197–205, Ottawa, 1997.
[12] W. Niem and J. Wingbermühle. Automatic reconstruction of 3D objects using a mobile monoscopic camera. In Int. Conf. on Recent Advances in 3D Imaging and Modeling, pages 173–181, Ottawa, 1997.
[13] M. Okutomi and T. Kanade. A multiple-baseline stereo. In IEEE Proc. Computer Vision and Pattern Recognition, pages 63–69, 1991.
[14] P. Fua and Y. Leclerc. Object-centered surface reconstruction: Combining multi-image stereo and shading. International Journal of Computer Vision, 16:35–56, September 1995.
[15] F. Schmitt and Y. Yemez. 3D color object reconstruction from 2D image sequences. In IEEE International Conference on Image Processing, Kobe, 1999.