
An Introduction to Stereo Vision and Disparity Computation


Edwin Olson, eolson@mit.edu
Melissa Hao, mhao@mit.edu
Almost all high-end robotics systems are now equipped with pairs of cameras arranged to
provide depth perception. While humans tend to take depth perception for granted,
judging depth is difficult for computers, and remains a subject of ongoing research.

The fundamental process involves finding corresponding points in two different views of
the same scene. Simple triangulation can then be used to determine the distance.
Figure 1. Geometry of stereo vision for one camera configuration.
Figure 1 depicts a typical camera configuration, with the cameras pointing somewhat
inwards. Suppose the coordinates of two corresponding points are $(x_{left}, y_{left})$ and
$(x_{right}, y_{right})$. For cameras that are properly aligned, $y_{left} = y_{right}$. The
disparity is defined to be $x_{right} - x_{left}$. This value can be positive or negative,
depending on the angle of the cameras as well as the distance to the object.
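As a minimal sketch of the definition above, the following Python computes the disparity as x_right - x_left. The depth function goes beyond the text and is an assumption: it uses the standard relation Z = f * B / |d| for idealized parallel cameras, not the inward-pointing configuration of Figure 1, and the parameters focal_px and baseline_m are hypothetical names.

```python
def disparity(x_left, x_right):
    """Disparity as defined in the text: x_right - x_left, in pixels."""
    return x_right - x_left


def depth_parallel(d, focal_px, baseline_m):
    """Depth for idealized parallel cameras (an assumption, not Figure 1).

    Uses Z = f * B / |d|. Zero disparity corresponds to a point at infinity.
    """
    if d == 0:
        return float("inf")
    return focal_px * baseline_m / abs(d)
```

For inward-pointing cameras the zero-disparity point lies at a finite distance, so this relation would need to be replaced by a triangulation that accounts for the camera angle.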
Searching for corresponding points is a recurring problem in machine vision as well as in
image and video compression. Computing stereo disparity is very similar to finding
optimal motion vectors, and approaches for both problems are similar.
The most common approach in both stereo disparity calculations and motion
compensation is to slide a block taken from one image over a second image. This
approach is known as the Block Matching Algorithm. At each possible offset, a
square-sense error is computed. Finding the position where the sub-images are most
similar (and the minimum error occurs) is equivalent to computing the disparity.
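The block matching step described above can be sketched as follows. This is an illustrative sketch, not the authors' code: it assumes images are lists of rows of grayscale values, reads "square-sense error" as the sum of squared differences, and uses hypothetical function names.

```python
def ssd(left, right, row, col_l, col_r, size):
    """Sum of squared differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            diff = left[row + dy][col_l + dx] - right[row + dy][col_r + dx]
            total += diff * diff
    return total


def match_block(left, right, row, col, size):
    """Slide the left block along the same rows of the right image.

    Returns (disparity, errors): disparity is x_right - x_left at the
    offset with minimum error, and errors[c] is the error at right column c.
    """
    width = len(right[0])
    errors = [ssd(left, right, row, col, c, size)
              for c in range(width - size + 1)]
    best_col = min(range(len(errors)), key=errors.__getitem__)
    return best_col - col, errors
```

The list of errors is returned alongside the disparity because the shape of that curve is itself useful, for example when judging how trustworthy the minimum is.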
[Figure 1 labels: left view, right view; focal point; imaging surface (gets reversed image); distant object; nearby object; positive, negative, and zero disparity]
Disparities typically have a small dynamic range (often < 10 pixels) compared to the
actual distances to objects. Therefore, measuring disparities to integral pixel values
results in very low depth resolution. The solution is to measure disparities to subpixel
resolution, with half-pel accuracy being common and quarter-pel used in some systems.
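One way to picture half-pel matching is to upsample each row by a factor of two and search at half-pixel steps. The sketch below is a simplified one-dimensional illustration under stated assumptions: it uses plain linear interpolation as a stand-in for a higher-quality interpolation filter, works on single rows rather than blocks, and all function names are hypothetical.

```python
def upsample2_linear(row):
    """Insert a linearly interpolated sample between each pair of pixels."""
    out = []
    for a, b in zip(row, row[1:]):
        out.append(float(a))
        out.append((a + b) / 2.0)
    out.append(float(row[-1]))
    return out


def half_pel_disparity(left_row, right_row, col, size):
    """1-D block match on 2x-upsampled rows; the result is in pixels."""
    left_up = upsample2_linear(left_row)
    right_up = upsample2_linear(right_row)
    # Block samples keep full-pel spacing (stride 2 in the upsampled row).
    block = [left_up[2 * (col + i)] for i in range(size)]
    best_err, best_pos = None, None
    for pos in range(len(right_up) - 2 * (size - 1)):
        err = sum((block[i] - right_up[pos + 2 * i]) ** 2
                  for i in range(size))
        if best_err is None or err < best_err:
            best_err, best_pos = err, pos
    # Upsampled position pos corresponds to right column pos / 2.
    return best_pos / 2.0 - col
```

Quarter-pel accuracy would follow the same pattern with a factor of four, at a correspondingly higher search cost.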
Finding corresponding points for every pixel in an image is an extremely computationally
expensive task. Consider a straightforward implementation: for every pixel in the left
image, a surrounding block of pixels (often 16x16 or 32x32) is slid across a row from the
right image (which is the same height as the block from the left, but the width of the
whole image). At each position, the square-sense error (or other error metric) is
computed, involving a large number of additions and multiplications.
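To give a rough sense of scale, the cost of the straightforward implementation can be counted directly. The sketch below is back-of-the-envelope arithmetic only: it ignores image borders and subpixel steps, and the 256x256 image size used in the note that follows is a hypothetical example, not a figure from the text.

```python
def squared_differences(width, height, block, search_width):
    """Count squared-difference terms for an exhaustive full-pel search."""
    per_offset = block * block              # one term per pixel in the block
    per_pixel = per_offset * search_width   # block evaluated at every offset
    return width * height * per_pixel
```

For a 256x256 image with 16x16 blocks searched across the full row, squared_differences(256, 256, 16, 256) gives 2**32, or roughly 4.3 billion terms, each needing a subtraction, a multiplication, and an addition.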
Various optimizations intended to reduce the amount of computation have been proposed.
Rather than searching an entire row, a subset of it is usually selected based on an estimate
of the maximum disparity likely to be seen in the data. The search range can also be
dynamically adjusted by exploiting the fact that nearby points are likely to have similar
disparities.
Another class of optimizations relies on the observation that the error, as a
function of horizontal offset (from which disparity is determined), is typically quite
smooth, with a single and dramatic minimum.
Figure 2. Typical error versus horizontal offset curve.
The smoothness of the error curve often makes it possible to find the minimum without
an exhaustive search. For example, one can sample the error curve at a relatively small
number of points, and select the best point(s) for further refinement. Logarithmic
searches, common in motion compensation applications, employ this exact strategy.
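A coarse-to-fine search of this kind can be sketched as follows. This is one simple one-dimensional variant, not a specific published algorithm: error_at is a hypothetical callable returning the matching error at an integer offset, and the search is only reliable when the error curve has a single minimum.

```python
def log_search(error_at, lo, hi):
    """Coarse-to-fine search for the offset minimizing error_at on [lo, hi].

    Assumes the error curve is smooth with a single minimum, as described
    in the text; otherwise it may settle on a local minimum.
    """
    center = (lo + hi) // 2
    step = max(1, (hi - lo + 1) // 2)
    while True:
        # Sample the curve at the center and one step to either side.
        candidates = [c for c in (center - step, center, center + step)
                      if lo <= c <= hi]
        center = min(candidates, key=error_at)
        if step == 1:
            return center
        step = (step + 1) // 2   # halve the step, rounding up
```

Over a range of about 100 offsets this evaluates the error at roughly 20 positions rather than at every one, at the price of failing on curves with several competing minima.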
Using Matlab, we implemented a stereo vision algorithm based on straightforward
block matching. We implemented half-pel accuracy using a ninth order FIR
filter, rather than the lower-quality bilinear filter often used. Performance was not a
concern in our experiments. We typically used a block size of 16x16.
Areas in the test images with sharp edges and distinct features produced extremely sharp
error curves, very similar to Figure 2, yielding excellent disparity estimation accuracy
and consistency. However, areas of relative uniformity produced relatively flat error
curves, resulting in highly erratic disparity estimates.
[Figure 2 labels: horizontal offset; error; horizontal offset corresponding to disparity estimate]
The sharpness of the error curve can be used to produce a confidence estimate. Disparity
estimates from error curves with a sharp and distinct minimum are typically very accurate,
whereas those from flat error curves are less reliable. We used an error variance metric to
estimate confidence: flat error curves have a lower variance and thus a lower confidence
level. The confidence map can be used to mask out error-prone regions of the depth map.
[IMPROVE METHOD OF COMPUTING CONFIDENCE ESTIMATE].
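The text does not give the exact form of the variance metric, so the following is only one plausible reading: confidence is taken to be the population variance of the error values across offsets, and estimates below a hypothetical threshold are masked out.

```python
def error_variance(errors):
    """Population variance of an error curve (one value per offset)."""
    mean = sum(errors) / len(errors)
    return sum((e - mean) ** 2 for e in errors) / len(errors)


def mask_low_confidence(disparities, confidences, threshold):
    """Replace disparities whose confidence is below threshold with None."""
    return [d if c >= threshold else None
            for d, c in zip(disparities, confidences)]
```

A curve with one deep minimum, such as [90, 80, 5, 85, 95], has a far larger variance than a flat one such as [50, 49, 48, 50, 51], so only the first would survive a moderate threshold.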
Noise or under-sampled detail in the source images can lead to poorly correlated left and
right images. Prefiltering can combat this by removing information not present in both
images. [INSERT WHAT YOU DID HERE]
Figure 3. Half-pel depth map on Renault Automobile Part using 16x16 block matching.
Matlab source code and raw images can be obtained from:
http://www.ravenousbirds.com/eolson/[???]
[Figure 3 panels: Left Image, Depth Map, Confidence; the axes of each panel are ticked from 50 to 250.]
