
in Proc. Int'l Conf. Pattern Recognition (ICPR), Vol. I, pages 407-411, Aug. 1996, Vienna

On the Epipolar Geometry Between Two Images With Lens Distortion


Zhengyou Zhang (Zhengyou.Zhang@sophia.inria.fr) INRIA, 2004 Route des Lucioles, BP 93, F-06902 Sophia-Antipolis Cedex, France
Abstract

In order to achieve a 3D reconstruction, either Euclidean or projective, with high precision, one has to consider lens distortion. In almost all work on multiple-view problems in computer vision, a camera is modeled as a pinhole, and lens distortion has usually been corrected off-line. This paper intends to consider lens distortion as an integral part of a camera. We first describe the epipolar geometry between two images with lens distortion. For a point in one image, its corresponding point in the other image should lie on a so-called epipolar curve. We then investigate the possibility of estimating the distortion parameters and the fundamental matrix based on the generalized epipolar constraint. Experimental results with computer simulation show that the distortion parameters can be estimated correctly if the noise in the image points is low and the lens distortion is severe. Otherwise, it is better to treat the cameras as being distortion-free.

Keywords: Camera Calibration, Lens Distortion, Epipolar Geometry, Fundamental Matrix, Epipolar Curve

1 Introduction

Up to now, in almost all work on multiple-view problems in computer vision (with the exception of [10]), a camera has been modeled as linearly projective, i.e. the homogeneous coordinates of a 3D point and those of an image point are related by a 3x4 matrix. This does not imply, though, that lens distortion has never been taken into account in such work. Indeed, distortion has usually been corrected off-line using classical methods, for example by observing straight lines [9, 10], when it is not weak enough to be neglected [5]. Our treatment in this article intends to consider camera distortion as an integral part of the camera.

2 Camera Model with Distortion

Following [1, 2, 9], we model the transformation from 3D world coordinates to camera pixel coordinates as a process of four steps.

Step 1: Rigid transformation from the object world coordinate system (Xw, Yw, Zw) to the camera 3D coordinate system (X, Y, Z):

    [X, Y, Z]ᵀ = R [Xw, Yw, Zw]ᵀ + t .

Step 2: Perspective projection from 3D camera coordinates (X, Y, Z) to ideal image coordinates (x, y) under the pinhole camera model:

    x = f X / Z ,    y = f Y / Z ,

where f is the effective focal length.

Step 3: Lens distortion:

    x = x̂ + δx ,    y = ŷ + δy ,

where (x̂, ŷ) are the distorted, or true, image coordinates on the image plane, and (δx, δy) are the distortion corrections to (x, y). We will return to this point later.

Step 4: Affine transformation from real image coordinates (x̂, ŷ) to frame buffer (pixel) image coordinates (u, v):

    u = x̂ / dx + u0 ,    v = ŷ / dy + v0 ,

where (u0, v0) are the coordinates of the image center (the principal point) in the frame buffer, and dx and dy are the distances between adjacent pixels in the horizontal and vertical directions of the image plane, respectively.

Now let us examine how to model camera distortion. There are mainly two kinds of distortion: radial and decentering [8, 1]. The distortion corrections are expressed in the usual representation as power series in the radial distance r.
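The whole chain of Steps 1 to 4 can be summarized in code. The sketch below is illustrative only: the function name is ours, and Step 3 is inverted to first order while keeping only the k1 radial term, which anticipates the simplification adopted later in this section.

```python
import numpy as np

def project_with_distortion(Pw, R, t, f, k1, dx, dy, u0, v0):
    """Sketch of Steps 1-4: world point -> distorted pixel coordinates,
    keeping only the first radial distortion term k1 (an assumption)."""
    # Step 1: rigid transformation to the camera coordinate system
    X, Y, Z = R @ Pw + t
    # Step 2: pinhole projection to ideal image coordinates
    x, y = f * X / Z, f * Y / Z
    # Step 3: the model gives x = x_hat*(1 + k1*r^2) with r^2 = x_hat^2 + y_hat^2;
    # here it is inverted to first order as x_hat ~ x*(1 - k1*(x^2 + y^2))
    r2 = x * x + y * y
    x_hat, y_hat = x * (1.0 - k1 * r2), y * (1.0 - k1 * r2)
    # Step 4: affine transformation to frame-buffer (pixel) coordinates
    return x_hat / dx + u0, y_hat / dy + v0
```

With k1 = 0 this reduces to the usual pinhole projection.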

Specifically,

    δx = x̂(k1 r² + k2 r⁴ + k3 r⁶ + ...) + [p1(r² + 2x̂²) + 2p2 x̂ŷ](1 + p3 r² + ...) ,    (1)
    δy = ŷ(k1 r² + k2 r⁴ + k3 r⁶ + ...) + [2p1 x̂ŷ + p2(r² + 2ŷ²)](1 + p3 r² + ...) ,    (2)

where r = sqrt(x̂² + ŷ²), k1, k2 and k3 are coefficients of radial distortion, and p1, p2 and p3 are coefficients of decentering distortion.

Under radial (symmetric) distortion, which is caused by imperfect lens shape, the ideal image points are distorted along radial directions from the distortion center (here the same as the principal point). The radial distortion is symmetric (see Fig. 1). Under decentering distortion, which is usually caused by improper lens assembly, the ideal image points are distorted in both radial and tangential directions.

Figure 1. Images under radial distortion; image resolution 512x512; dashed curves are the ideal points. (a) k1 dx² = -5e-07 and k1 dy² = -7e-07; (b) k1 dx² = 5e-07 and k1 dy² = 7e-07

Depending on the lenses used, one needs to choose an appropriate distortion model. Based on the reports in the literature [1, 9], unless one is specifically concerned with the reduction of distortion to very low levels, the distortion function is likely to be totally dominated by the radial components, and especially by the k1 term. Tsai [9], who uses the first two radial terms, claimed that any more elaborate modeling not only would not help (the improvement being negligible compared with sensor quantization), but would also cause numerical instability. This has also been confirmed by Wei and Ma [10, Sect. 3.4]. We thus consider only the first radial term in the sequel, although mathematically there is no reason to do so.

Combining Step 3 and Step 4 yields:

    x = dx (u - u0) [1 + (u - u0)² k1 dx² + (v - v0)² k1 dy²] ,    (3)
    y = dy (v - v0) [1 + (u - u0)² k1 dx² + (v - v0)² k1 dy²] .    (4)

Examining the perspective model in Step 2, it is clear that we cannot determine f, dx and dy simultaneously from visual information. We must specify either f, or one of dx and dy, or the ratio dy/dx, using for example the information provided by the camera manufacturer. For the problem at hand, we assume f = 1, so that (x, y) are exactly the so-called normalized image coordinates.

2.1 Computation of Distorted Coordinates from Ideal Ones

In the above, we provided an expression, (3) and (4), from true pixel image coordinates (u, v) to (ideal) normalized image coordinates. Sometimes we need the inverse computation. We now introduce another notation (ũ, ṽ) for the ideal pixel image coordinates:

    ũ = x / dx    and    ṽ = y / dy .    (5)

It is easy to see that

    ũ = (u - u0) [1 + (u - u0)² k1 dx² + (v - v0)² k1 dy²] ,    (6)
    ṽ = (v - v0) [1 + (u - u0)² k1 dx² + (v - v0)² k1 dy²] .    (7)

Dividing these two equations gives ũ/ṽ = (u - u0)/(v - v0). Substituting (u - u0) = ũ(v - v0)/ṽ into (7) yields

    ṽ = (v - v0) [1 + (v - v0)² k1 dx² ũ²/ṽ² + (v - v0)² k1 dy²] ,

i.e.,

    (v - v0)³ + p (v - v0) + q = 0 ,

where

    p = ṽ² / [k1 (dx² ũ² + dy² ṽ²)]    and    q = -ṽ³ / [k1 (dx² ũ² + dy² ṽ²)] = -ṽ p .

If k1 = 0 (no distortion), there is only one solution: v = ṽ + v0. Let Δ = (q/2)² + (p/3)³. If Δ > 0, there is only one real solution; if Δ = 0, then v = v0, which occurs only when ṽ = 0; if Δ < 0, there are three real solutions, and in general the middle one is the one we need. Once v is solved, the u coordinate is given by u = u0 + ũ(v - v0)/ṽ.

3 Epipolar Constraint Between Two Images with Distortion

Given two points (u, v) and (u′, v′) in correspondence, there is a constraint on them even when the two images exhibit lens distortion. If we consider the ideal pixel coordinates (ũ, ṽ) and (ũ′, ṽ′) given by (6) and (7), then the relation between space coordinates and image coordinates is linearly projective, and there exists a 3x3 matrix F, the so-called fundamental matrix, which relates points between the two images:

    [ũ, ṽ, 1] F [ũ′, ṽ′, 1]ᵀ = 0 ,

where (ũ, ṽ) and (ũ′, ṽ′) are given by (6) and (7) for the first and second image, respectively.
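In code, the mapping (6)-(7) and the bilinear constraint above can be evaluated as follows. This is a minimal sketch: the function names are ours, and both images are assumed to share the same distortion parameters.

```python
import numpy as np

def ideal_pixel_coords(u, v, k1, dx, dy, u0, v0):
    """Eqs (6)-(7): observed pixel coordinates -> ideal pixel
    coordinates (u_tilde, v_tilde), relative to the principal point."""
    du, dv = u - u0, v - v0
    s = 1.0 + du * du * k1 * dx**2 + dv * dv * k1 * dy**2
    return du * s, dv * s

def epipolar_residual(F, m, m_prime, k1, dx, dy, u0, v0):
    """Left-hand side of [u~, v~, 1] F [u~', v~', 1]^T = 0;
    zero for a perfect correspondence."""
    ut, vt = ideal_pixel_coords(*m, k1, dx, dy, u0, v0)
    utp, vtp = ideal_pixel_coords(*m_prime, k1, dx, dy, u0, v0)
    return float(np.array([ut, vt, 1.0]) @ F @ np.array([utp, vtp, 1.0]))
```

For a hypothetical rectified pair under pure lateral motion, F has the form [[0,0,0],[0,0,-1],[0,1,0]] and the residual reduces to ṽ′ - ṽ.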

In terms of the observed pixel coordinates,

    ũ = (u - u0) [1 + (u - u0)² k1 dx² + (v - v0)² k1 dy²] ,
    ṽ = (v - v0) [1 + (u - u0)² k1 dx² + (v - v0)² k1 dy²] ,
    ũ′ = (u′ - u0′) [1 + (u′ - u0′)² k1′ dx′² + (v′ - v0′)² k1′ dy′²] ,
    ṽ′ = (v′ - v0′) [1 + (u′ - u0′)² k1′ dx′² + (v′ - v0′)² k1′ dy′²] .

The reader is referred to [4] for more details on the fundamental matrix. If we go back to the true pixel coordinates, we then have the constraint

    g(u, v, u′, v′) ≜ [ũ, ṽ, 1] F [ũ′, ṽ′, 1]ᵀ = 0 .    (8)

For a point (u′, v′) in the second image, unlike in the case of distortion-free images, its corresponding point in the first image no longer lies on a straight line. In fact, equation (8) describes a cubic curve in (u, v) on which the corresponding point must lie. We call this curve g(u, v; u′, v′) the epipolar curve of the point (u′, v′). Symmetrically, for a point (u, v) in the first image, g(u, v; u′, v′) describes the epipolar curve in (u′, v′) in the second image on which its corresponding point must lie.

4 Solving the Distortion Parameters from Point Matches

In this section, we describe how to estimate the distortion parameters based on the epipolar constraint (8). We consider only radial distortion. Let a denote the vector containing the distortion parameters (see below for the different cases studied), and let f denote the vector containing the parameters of the fundamental matrix. (Recall that a fundamental matrix F has only 7 degrees of freedom; the parameterization described in [3] is used in our experiments.) Given n point matches {(m_i, m_i′)} (i = 1, ..., n) between the two images, we want to know whether we can determine a and f. The problem can be naturally formulated as the minimization problem

    min over a, f of  Σ_i g²(u_i, v_i, u_i′, v_i′) ,    (9)

where g(u, v, u′, v′) is given by (8). In our implementation, instead of solving (9) directly, we have made two modifications:

1. To reduce the dimension of the parameter space, we separate the estimation of a from that of f. The estimation of f is conducted in each optimization step for a, using the Levenberg-Marquardt technique to minimize the distance between points and epipolar lines, as described in [3]. The estimation of a is done with a downhill simplex method [6].

2. Instead of minimizing the algebraic distance g²(u_i, v_i, u_i′, v_i′), which is not physically meaningful, we minimize the Euclidean distance in the image plane between the point (u_i, v_i) and the corresponding epipolar curve. Symmetry with respect to the two images is also ensured in our implementation.

Regarding the parameters of lens distortion, we denote h1 ≜ k1 dx² and h2 ≜ k1 dy². The ratio h2/h1 = (dy/dx)² is related to the aspect ratio of an image pixel. We distinguish different configurations in estimating the distortion parameters:

Configuration 1: a = [h1]. We assume that the two images have the same distortion parameters (e.g. they are taken by the same camera at two different time instants) and that {h2/h1, u0, v0} are known, e.g. the distortion center is at the center of the image (u0 = v0 = 255) and the ratio h2/h1 is given by the camera manufacturer. There is only one parameter to estimate.

Configuration 2: a = [h1, h1′]ᵀ. The two images have independent distortion, but all parameters are known except h1 and h1′.

Configuration 3: a = [h1, h2]ᵀ. The two images are assumed to have the same distortion parameters. All other parameters are given.

There are other configurations, but in this paper we study only the above three.

5 Computer Simulation Results

In this section, we show some experimental results obtained by computer simulation. In the simulations, we use an object consisting of two orthogonal planes with a checker pattern on them. Realistic camera parameters are used to generate images with resolution 512x512. The two cameras differ from each other mainly by a lateral motion. The two distortion-free images are shown in Fig. 2, where the points are indicated by crosses.

Figure 2. A pair of distortion-free images used for the computer simulations

In the sequel, we use the notation explained in Table 1 to measure the quality of the experimental results. The values of Δh1, Δh2, |Δh1| and |Δh2| shown in the tables of this section have been multiplied by 10⁷. The unit of ΔF is percent (%), because the values have been multiplied by 100. For each noise level, N_trials = 30 trials have been conducted, and the results shown are the averages. The first series of experiments was carried out to investigate the noise sensitivity of the distortion estimation for different configurations.
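The idea behind Configuration 1 can be sketched in a self-contained way. The sketch below is a deliberate simplification: the fundamental matrix is assumed known and the single parameter h1 is found by a brute-force grid search over the summed squared algebraic residuals of (8), whereas the paper re-estimates F at each step and minimizes point-to-curve distances with a downhill simplex method. All names and the grid itself are illustrative.

```python
import numpy as np

def residual(F, m, mp, h1, h2, u0=255.0, v0=255.0):
    """Algebraic residual g of Eq. (8), with h1 = k1*dx^2, h2 = k1*dy^2."""
    def ideal(u, v):
        du, dv = u - u0, v - v0
        s = 1.0 + du * du * h1 + dv * dv * h2
        return np.array([du * s, dv * s, 1.0])
    return float(ideal(*m) @ F @ ideal(*mp))

def estimate_h1(F, matches, ratio=1.4):
    """Configuration 1 sketch: search the single parameter h1, with the
    ratio h2/h1 and the distortion center known and F given (a
    simplification of the alternating scheme of Sect. 4)."""
    grid = np.linspace(0.0, 1e-6, 101)
    cost = [sum(residual(F, m, mp, h1, ratio * h1) ** 2 for m, mp in matches)
            for h1 in grid]
    return float(grid[int(np.argmin(cost))])
```

On synthetic distortion-free matches the search correctly drives h1 to zero, illustrating why weak distortion plus noise makes the estimate fragile.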

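The relative error ΔF on the fundamental matrix, as reported in the tables, can be computed as follows. This is a sketch; it assumes the true and estimated matrices have been normalized to a common scale beforehand, since F is defined only up to a scale factor.

```python
import numpy as np

def delta_F(F_true, F_est):
    """Relative Frobenius-norm difference of the estimated fundamental
    matrix, in percent; both matrices assumed on the same scale."""
    return 100.0 * np.linalg.norm(F_true - F_est) / np.linalg.norm(F_true)
```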
Table 1. Notation used in presenting the experimental results

    dis_avg & dis_max : average and maximum distance between the distorted (i.e. really observed) and ideal points
    cor_avg & cor_max : average and maximum distance between the distortion-corrected and ideal points
    Δh1 (Δh2) : average difference between the real h1 (h2) and the estimated ĥ1 (ĥ2), i.e. (1/N_trials) Σ_i (h1 - ĥ1); similarly for Δh1′ (Δh2′)
    |Δh1| (|Δh2|) : average absolute difference between the real and estimated values, i.e. (1/N_trials) Σ_i |h1 - ĥ1|; similarly for |Δh1′| (|Δh2′|)
    ΔF : ||F - F̂|| / ||F|| x 100%, where ||·|| denotes the Frobenius norm of a matrix, and F and F̂ are the true and estimated fundamental matrices

The distortion parameters are: h1 = 5x10⁻⁷, h2 = 7x10⁻⁷ (i.e. h2/h1 = 1.4), u0 = 255, v0 = 255. The two images are assumed to exhibit the same lens distortion. The results are shown in Tables 2 to 4. Without exception, the distortion estimation is very sensitive to noise. When the image points are very noisy (noise level > 0.5 pixels), the results with distortion correction are worse than those without distortion correction.

Table 2. Errors versus different levels of image noise: Configuration 1

noise  dis_avg  dis_max  cor_avg  cor_max    Δh1    |Δh1|    ΔF
 0.0    4.47    21.23     0        0         0       0       0
 0.1    4.43    21.28     0.39     1.67      0.098   0.38    0.14
 0.2    4.44    21.25     1.23     5.66      0.93    1.32    0.24
 0.3    4.47    21.37     2.00     9.23      2.03    2.16    0.49
 0.4    4.50    21.33     3.51    16.42      3.86    3.86    0.84
 0.5    4.53    21.42     5.13    24.16      2.41    5.66    1.19
 0.6    4.58    21.35    11.16    53.23     -4.81   12.5     2.55
 0.7    4.63    21.43    19.00    91.01    -19.6    21.4     4.47

Table 3. Errors versus different levels of image noise: Configuration 2

noise  dis_avg  dis_max  cor_avg  cor_max    Δh1    Δh1′   |Δh1|  |Δh1′|   ΔF
 0.0    4.47    21.23     0        0         0      0       0      0       0
 0.1    4.43    21.32     0.42     1.83      0.27   0.26    0.42   0.41    0.17
 0.2    4.45    21.30     1.32     6.08      1.27   1.27    1.43   1.42    0.37
 0.3    4.46    21.31     1.87     8.58      1.99   2.01    1.99   2.01    0.73
 0.4    4.50    21.33     3.54    16.68      3.93   3.90    3.93   3.90    1.08
 0.5    4.53    21.29     4.25    19.99      4.64   4.70    4.64   4.70    1.06
 0.6    4.57    21.34     5.51    26.02      4.57   4.35    6.11   6.07    1.41
 0.7    4.60    21.33     6.44    30.45      5.44   5.31    7.04   7.18    1.69

Table 4. Errors versus different levels of image noise: Configuration 3

noise  dis_avg  dis_max  cor_avg  cor_max    Δh1    Δh2    |Δh1|  |Δh2|    ΔF
 0.0    4.47    21.23     0        0         0      0       0      0       0
 0.1    4.43    21.28     0.61     2.68     -0.16   1.15    0.78   1.51    0.37
 0.2    4.44    21.32     1.40     6.31     -0.53   3.97    1.43   4.13    1.03
 0.3    4.47    21.28     2.77    12.80     -0.41   8.19    2.35   8.19    2.83
 0.4    4.49    21.19     3.35    15.20     -2.09  11.24    3.20  11.24    4.64
 0.5    4.53    21.42     4.56    21.50     -1.13  14.62    2.78  14.62    7.38
 0.6    4.57    21.23     4.87    23.05      0.10  14.01    2.79  14.07    5.95
 0.7    4.62    21.63     6.41    30.42      3.50  14.98    3.77  14.98    4.73

Figure 3 provides a comparison of the experimental results for the different configurations. It is observed that the three configurations produce similar results (recall that the lens distortion is the same for the two images). When the noise is small, the first configuration gives the best result, but it gives the worst result when the noise level is high.

Figure 3. Comparison of the different configurations of distortion estimation: errors in distortion correction (pixels) versus noise level (pixels), showing cor_avg and cor_max for Configurations 1, 2 and 3

Figure 4 shows the result of one trial of Configuration 3 with noise level 0.3. We can observe a significant reduction of the distortion effect with our technique. However, there is an over-estimation of the lens distortion, and it becomes even stronger as the noise in the image points increases.

Figure 4. Result of one trial of Configuration 3 with noise level 0.3. (a) Original noisy points of the first image superimposed with the ideal points; (b) Distortion-corrected points of the first image superimposed with the ideal points

The second series of experiments was carried out to investigate the precision of the distortion estimation versus different degrees of lens distortion, with a fixed noise level in the image points. The results for Configuration 3 with noise level 0.3 are given in Table 5. The distortion center is fixed at the image center (i.e. u0 = v0 = 255). The distortion coefficients are set as h1 = -d x 10⁻⁶ and h2 = -1.4 d x 10⁻⁶, where d varies from 0.0 to 1.0. The two images are assumed to exhibit the same lens distortion. Clearly, the error in the estimation of distortion (see e.g. cor_avg and cor_max) decreases as the lens distortion increases. When there is only a weak distortion, the result is very poor because of the noise in the image points.

Table 5. Errors versus different degrees of lens distortion d (see text): Configuration 3

  d    dis_avg  dis_max  cor_avg  cor_max    Δh1    Δh2    |Δh1|  |Δh2|    ΔF
 0.0    0.38     1.00     5.93    26.80      1.46  18.80    2.06  18.80    7.46
 0.1    0.94     3.75     4.99    23.11      0.48  16.45    1.41  16.45    5.71
 0.2    1.74     7.57     4.53    21.09      1.15  13.50    2.02  13.67    4.53
 0.3    2.61    11.80     3.82    17.79      0.92  10.84    2.35  11.14    3.43
 0.4    3.51    16.31     3.14    14.74     -0.37   9.97    1.92   9.97    3.10
 0.5    4.47    21.23     2.70    12.94      0.01   7.65    1.63   7.82    2.49
 0.6    5.47    26.87     2.35    10.86     -1.31   7.55    2.08   7.55    2.79
 0.7    6.53    33.10     1.71     7.42     -1.90   5.56    2.16   5.62    2.00
 0.8    7.66    40.24     1.59     6.90     -1.34   4.39    2.27   4.79    1.82
 0.9    8.86    48.61     1.26     5.41     -1.08   3.33    1.59   3.66    1.40
 1.0   10.16    58.51     1.06     4.59     -0.79   2.41    1.33   2.69    1.01

6 Conclusion

In this paper, we have described the epipolar geometry between two images with lens distortion. For a point in one image, its corresponding point in the other image should lie on a curve, instead of on a straight line as in the distortion-free case. We then presented our preliminary results on the possibility of estimating the distortion parameters and the fundamental matrix based on the generalized epipolar constraint. Although more extensive experiments need to be carried out, it appears possible to estimate lens distortion from two images without using any knowledge of the 3D scene, on the condition that the noise in the extracted image points is small enough and that the lens distortion is relatively large. Otherwise, the estimation of the fundamental matrix based on a distortion model can be worse than that based on a distortion-free model. This is in accordance with the results of the classical camera calibration study [7], but our situation may be even worse because no information about 3D positions is available.

Since lens distortion has a stronger effect near the image border than at the image center, we would expect a more reliable estimate of the distortion parameters if more points near the image border were used. Unfortunately, this is not common in the situation studied here, because we need to establish point matches between the two images: a point near the border in one image is likely not to be visible in the other image, or is no longer located near the border. The image pair shown in Sect. 5 is almost the best we can achieve.

In summary, our tentative conclusions are the following: In applications where distortion correction can be done off-line, do it. If image points can be extracted with high precision (< 0.3 pixels) and the lens distortion is relatively large, the technique presented in this paper can be applied. Otherwise, consider the cameras as pinholes, i.e. neglect the lens distortion.

References

[1] D. C. Brown. Close-range camera calibration. Photogrammetric Engineering, 37(8):855-866, 1971.
[2] W. Faig. Calibration of close-range photogrammetry systems: Mathematical formulation. Photogrammetric Engineering and Remote Sensing, 41(12):1479-1486, 1975.
[3] Q.-T. Luong. Matrice Fondamentale et Calibration Visuelle sur l'Environnement: Vers une plus grande autonomie des systèmes robotiques. PhD thesis, Université de Paris-Sud, Centre d'Orsay, Dec. 1992.
[4] Q.-T. Luong and O. Faugeras. An optimization framework for efficient self-calibration and motion determination. In Proceedings of the International Conference on Pattern Recognition, volume I, pages 248-252, Jerusalem, Israel, Oct. 1994. Computer Society Press.
[5] R. Mohr, B. Boufama, and P. Brand. Accurate projective reconstruction. In J. Mundy and A. Zisserman, editors, Applications of Invariance in Computer Vision, volume 825 of Lecture Notes in Computer Science, pages 257-276, Berlin, 1993. Springer-Verlag.
[6] J. Nelder and R. Mead. A simplex method for function minimization. Computer Journal, 7:308-313, 1965.
[7] S.-W. Shih, Y.-P. Hung, and W.-S. Lin. When should we consider lens distortion in camera calibration. Pattern Recognition, 28(3):447-461, 1995.
[8] C. C. Slama, editor. Manual of Photogrammetry. American Society of Photogrammetry, fourth edition, 1980.
[9] R. Y. Tsai. A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE Journal of Robotics and Automation, 3(4):323-344, Aug. 1987.
[10] G. Wei and S. Ma. Implicit and explicit camera calibration: Theory and experiments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5):469-480, 1994.
