
The geometry of two views

Laboratory 3
Aitor Sanchez¹, Alejandro Zarate², Isaac Perez³, Daniel Yuste⁴
Universitat Pompeu Fabra, Barcelona, Spain

aitor.sancheza@e-campus.uab.cat¹, alejandro.zarate@e-campus.uab.cat², ipesanz@gmail.com³,
danielyustegalvez@gmail.com⁴

Abstract— This document contains the work performed during Lab 3 of the M4 module. The deliverable
focuses on computing the fundamental matrix that relates two images using the normalized 8-point algorithm
(an algebraic method) and a robust normalized 8-point algorithm, with an application to photo-sequencing.

INTRODUCTION

The work includes the computation of the fundamental matrix between two images and the
identification of epipolar lines, which are then applied to photo-sequencing analysis. The work
conducted for the first part of this laboratory can be summarized in four high-level steps: keypoint
detection, keypoint matching, inlier identification, and estimation of the fundamental matrix.

It is important to note that keypoint detection is performed using ORB; matching uses a brute-force
matcher (BFMatcher) with the cv.NORM_HAMMING distance and crossCheck enabled for better
results; and inlier identification is performed using RANSAC.
In computer vision, the fundamental matrix F is a 3×3 matrix which relates corresponding points
in stereo images. In epipolar geometry, given homogeneous image coordinates x and x′ of
corresponding points in a stereo image pair, Fx describes a line (an epipolar line) on which the
corresponding point x′ in the other image must lie. That is, every pair of corresponding points
satisfies:

x′ᵀ F x = 0
Being of rank two and determined only up to scale, the fundamental matrix can be estimated
given at least seven point correspondences. Its seven parameters represent the only geometric
information about cameras that can be obtained through point correspondences alone. [1]
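The constraint above can be checked numerically. The following sketch (with a made-up rank-two F, not one estimated from real images) verifies that a point on the epipolar line Fx satisfies it:

```python
import numpy as np

# A made-up skew-symmetric F (3x3 skew-symmetric matrices have rank two).
F = np.array([[0.0, -0.001, 0.01],
              [0.001, 0.0, -0.02],
              [-0.01, 0.02, 0.0]])

x = np.array([100.0, 50.0, 1.0])     # point in the first image (homogeneous)
l = F @ x                            # epipolar line in the second image
x_prime = np.array([2.0, 1.0, 1.0])  # a point chosen to lie on that line
residual = x_prime @ F @ x           # the epipolar constraint x'^T F x
```

For a correct correspondence the residual is zero up to noise; in practice this quantity is the basis of inlier tests.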
1. Fundamental matrix F with the normalized 8-point algorithm

1.1 Direct Linear Transformation (DLT) Algorithm.

In the first task of the Laboratory, we created the function that estimates the fundamental matrix
given a set of point correspondences between a pair of images.

The Direct Linear Transformation (DLT) algorithm yields a homography H from a given set of four 2D-to-2D
point correspondences. Four correspondences are enough, provided that no three of the points are
collinear; if these conditions are not met, H is undetermined.

With the four matches we can estimate two homographies, one for each image, and with them the DLT can
be treated as a least-squares problem that is easy to solve using Singular Value Decomposition (SVD).
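As a sketch of the DLT just described (our own minimal implementation, assuming exact correspondences and no normalization), the least-squares problem is solved by taking the singular vector associated with the smallest singular value:

```python
import numpy as np

def homography_dlt(pts1, pts2):
    """Estimate H (up to scale) from four or more correspondences via the DLT.

    pts1, pts2: (N, 2) arrays of matching points; no three points collinear.
    """
    rows = []
    for (x, y), (u, v) in zip(pts1, pts2):
        # Each correspondence contributes two rows of the design matrix A.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)        # least-squares solution: last row of V^T
    return Vt[-1].reshape(3, 3)
```

For instance, mapping the unit square onto itself recovers the identity up to scale.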

In order to check that the completed function works properly, we used the provided toy example
for which the ground-truth image is known.

1.1.1 Results and conclusions.

● We have been able to apply and understand the theory behind the DLT algorithm.
● We created the fundamental_matrix() and normalization() methods and put into practice the
assert conditions that check whether the fundamental matrix is properly calculated.
● Normalization of the data points is mandatory.
● It is also mandatory to reduce the fundamental matrix to rank two.
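The rank-two conversion mentioned in the last point is usually done by zeroing the smallest singular value; a minimal sketch (our own helper, not the lab code):

```python
import numpy as np

def enforce_rank2(F):
    """Project F onto the nearest rank-2 matrix (zero the smallest singular value)."""
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```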

In real life:

● We expect the matrix A to have rank 8 so that H is determined up to scale.


● The implicit compatibility relationship between inter-frame homographies and the
fundamental matrix can be used directly to compute the fundamental matrix. The
compatibility equation gives six constraints [2] (of which only five are linearly independent);
therefore, at least two homographies are needed to compute the fundamental matrix.
● The problem can be translated into a least-squares problem via DLT and easily solved by
SVD. However, this straightforward method is unstable for inaccurate homographies,
sometimes leading to completely meaningless results.
1.2 Robust estimation of the fundamental matrix.

In this section we estimate the fundamental matrix in a situation where the image
correspondences contain outliers.

The fundamental matrix is a relationship between any two images of the same scene that constrains
where the projection of points from the scene can occur in both images. Given the projection of a scene
point into one of the images the corresponding point in the other image is constrained to a line, helping
the search, and allowing for the detection of wrong correspondences. The relation between
corresponding image points, which the fundamental matrix represents, is referred to as the epipolar
constraint, matching constraint, discrete matching constraint, or incidence relation. [1]

The first step is to compute and visualize the image matches using ORB.

Compute and match pairs of interest points.

To reduce the keypoints to inliers, we wrote the function for the fundamental-matrix
calculation, embedding a RANSAC procedure in the previous DLT algorithm.

Compute inliers.
As we have seen before, we require two homographies in order to calculate F by the approach of a
least-squares problem.

In order to minimize the error, and to avoid divergence when the least-squares calculation is
performed with a (possibly) incorrectly chosen point, the robust estimation of the fundamental
matrix recomputes our first estimate and provides a rectified final F.

To do so, it is recommended to normalize the data, which stabilizes the result and deals with
translation and scaling at the same time.

By doing this, we obtain for each image a normalizing transformation that combines the translation and the scaling.

Apart from normalization, we can use a RANSAC approach in order to be confident that our
solution is good enough, performing N iterations before returning the final result.
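Such a RANSAC loop can be sketched generically; here `estimate_model` and `residual` are placeholders (our own names, not the lab code) for the 8-point estimator and a per-correspondence error such as the Sampson distance:

```python
import numpy as np

def ransac(pts1, pts2, estimate_model, residual, n_iters=1000, thresh=1.0):
    """Fit a model robustly: sample minimal sets, keep the model with most inliers."""
    rng = np.random.default_rng(0)
    best_inliers = np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        sample = rng.choice(len(pts1), size=8, replace=False)  # minimal 8-point set
        model = estimate_model(pts1[sample], pts2[sample])
        inliers = residual(model, pts1, pts2) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final re-estimation using all inliers of the best hypothesis.
    return estimate_model(pts1[best_inliers], pts2[best_inliers]), best_inliers
```

The same wrapper works for any minimal estimator; only `thresh` and `n_iters` (the new hyperparameters) need tuning.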

After the inliers have been identified, the fundamental matrix between the two images is obtained
as follows:

1. Normalize the correspondences (to improve numerical stability).
2. Create the matrix W from the correspondences pi and p′i (at least 8 correspondences are needed).
3. Compute the SVD of the matrix W = UDVᵀ.
4. Create the vector f from the last column of V.
5. Compose the fundamental matrix F.
6. Compute the SVD of the fundamental matrix F = UDVᵀ.
7. Remove the last singular value of D.
8. Recompute the matrix F.
9. Un-normalize F.
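The nine steps above can be sketched in NumPy (a minimal sketch of the procedure; function names are illustrative, not the exact lab code):

```python
import numpy as np

def normalize_points(pts):
    """Step 1: translate to the centroid and scale so the mean distance is sqrt(2)."""
    c = pts.mean(axis=0)
    d = np.sqrt(((pts - c) ** 2).sum(axis=1)).mean()
    s = np.sqrt(2) / d
    T = np.array([[s, 0, -s * c[0]],
                  [0, s, -s * c[1]],
                  [0, 0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def fundamental_matrix(pts1, pts2):
    """Normalized 8-point algorithm; pts1, pts2 are (N, 2) arrays, N >= 8."""
    p1, T1 = normalize_points(pts1)
    p2, T2 = normalize_points(pts2)
    # Step 2: one row of W per correspondence (x'^T F x = 0 written out).
    W = np.column_stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                         p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                         p1[:, 0], p1[:, 1], np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(W)        # steps 3-4: f is the last column of V
    F = Vt[-1].reshape(3, 3)           # step 5
    U, S, Vt = np.linalg.svd(F)        # steps 6-8: enforce rank two
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt
    return T2.T @ F @ T1               # step 9: un-normalize
```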

As recommended in the literature, the error is calculated with the Sampson formula [4],
which is expected to provide better results than the symmetric epipolar error. However, we did
not test other error metrics in this lab.
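For reference, the Sampson error of a single correspondence can be written as a small helper (a sketch; x1 and x2 are homogeneous 3-vectors):

```python
import numpy as np

def sampson_error(F, x1, x2):
    """First-order approximation of the geometric error of x1 <-> x2 under F."""
    Fx1 = F @ x1
    Ftx2 = F.T @ x2
    num = (x2 @ F @ x1) ** 2
    den = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return num / den
```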

1.2.1 Results and conclusions.

● The robust approach minimizes the effect of outliers on the fundamental matrix, thanks to
RANSAC. The results are clear once the inliers are plotted.
● By introducing RANSAC we also introduce new hyperparameters that must be tuned case by
case, and we no longer have a unique closed-form solution.
1.3 Epipolar lines

Now that the fundamental matrix has been estimated, we display some points and their
corresponding epipolar lines.

Given a pair of images, for each point x in one image there exists a corresponding epipolar line l′ in the
other image. Any point x′ in the second image matching the point x must lie on the epipolar line l′. The
epipolar line is the projection in the second image of the ray from the point x through the camera centre
C of the first camera: x → l′.
Thus, there is a map from a point in one image to its corresponding epipolar line in the other image. It is
the nature of this map that will now be explored. It will turn out that this mapping is a (singular)
correlation, that is, a projective mapping from points to lines, which is represented by a matrix F, the
fundamental matrix. [3]
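In code, the map x → l′ is just a matrix-vector product; a small sketch (our own helper), scaled so that |l′ · x′| is the point-to-line distance:

```python
import numpy as np

def epipolar_line(F, x):
    """Epipolar line l' = F x in the second image for a homogeneous point x."""
    l = F @ x
    return l / np.linalg.norm(l[:2])   # scale so that |l . x'| is a distance
```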

For this laboratory, once we have calculated the fundamental matrix for the pair of images, we
proceed to display some points and their corresponding epipolar lines. The figure below shows
the corresponding epipolar lines for the two images analyzed in the laboratory.

Points and their corresponding epipolar lines.

1.3.1 Results and conclusions.

● We can see that the fundamental matrix is a good approximation, since the matches lie on the
corresponding epipolar lines.
● On the other hand, at least for this example image, the method is quite sensitive: if we run
the same code a few times, different solutions are found, due to the probabilistic nature of the
algorithm.
● This can easily be appreciated by looking at the epipole. The epipole is the point of
intersection of the line joining the camera centres (the baseline) with the image plane.

For a method to be robust, the epipole should therefore be located in very similar positions
across different runs of the algorithm.
Even when reducing the RANSAC threshold we observe variation; the way to reduce this
variability would probably be to obtain better correspondences.
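To compare epipole positions across runs, the epipole can be extracted directly as the null vector of F (a sketch using SVD; the helper name is ours):

```python
import numpy as np

def epipole(F):
    """Epipole e in the first image: the vector with F e = 0 (null space of F)."""
    _, _, Vt = np.linalg.svd(F)
    e = Vt[-1]
    # Normalize the homogeneous coordinate unless the epipole is at infinity.
    return e / e[2] if abs(e[2]) > 1e-12 else e
```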

1.3.2 Extra: OpenCV implementation with SIFT and cv.findFundamentalMat()

In order to contrast the results obtained with our own implementation, we performed
epipolar-line identification using cv.findFundamentalMat() with RANSAC, as included in
OpenCV. The RANSAC algorithm in this implementation needs at least 15 points, and the 7-point
algorithm is used. Other parameters included:

● Identification of keypoints and descriptors with SIFT.


● [param1 = 0.1] – Maximum distance from a point to an epipolar line in pixels, beyond which
the point is considered an outlier and is not used for computing the final fundamental matrix.
● [param2 = 0.99] – Specifies a desirable level of confidence (probability) that the estimated
matrix is correct.

Compute and match pairs of interest points.

The following images present the resulting epipolar lines. As can be appreciated, a significant
number of points was identified, and most of the matches appear to be accurate and correspond
well to the geometry of the scene.
Epipolar line identification with cv.findFundamentalMat()
param1 = 0.1 , param2 = 0.99

In the following images, [param1] is increased to 1 and 3 pixels. At 1 pixel most of the matches
still appear accurate and correspond well to the geometry of the scene; at 3 pixels, however,
some of the points no longer match.

Epipolar line identification with cv.findFundamentalMat()


param1 = 1 , param2 = 0.99

Epipolar line identification with cv.findFundamentalMat()


param1 = 3 , param2 = 0.99
2. Application: Photo Sequencing
In this part we compute a simplified version of the algorithm described in the
photo-sequencing paper [5]. Since we do not have two images taken from roughly the same
viewpoint at two different time instants, we manually pick a dynamic point corresponding to a
point on a van (identified by index 'idx1') and the projection of its 3D trajectory in the reference
image. We then compute the projection (onto the reference image) of two points on this 3D
trajectory at two different time instants, corresponding to the times when the two other provided
images were taken. In our case, because we used an ORB point detector, idx1 is 1672.
The first pair of images shows the matches between the images; the second pair shows only the
inliers.

Compute and match pairs of interest points.

Compute inliers.
After the inliers have been identified for the fundamental matrix using RANSAC, the goal is to find
the points of the van in images two and three and represent them in the first image, following
these steps:

● Find the correspondence of the keypoint on the van using the matches from the first step, not
the inliers: there are no inliers on the van because it is moving while its surroundings are not.
● Find the trajectory of the van using the given point and computing the cross product.
● Compute the epipolar lines of the correspondences in images two and three.
● Project these epipolar lines onto image one and find their intersection with the van trajectory
to obtain the future position of the van.
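The trajectory and the intersections above rely on two standard homogeneous-coordinate identities: the line through two points is their cross product, and the intersection of two lines is the cross product of the lines. A minimal sketch:

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two homogeneous points p and q."""
    return np.cross(p, q)

def intersection(l, m):
    """Homogeneous intersection point of two lines l and m."""
    x = np.cross(l, m)
    return x / x[2]   # assumes the lines are not parallel (x[2] != 0)
```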

Points and their corresponding epipolar lines.

2.1 Results and conclusions.

We have used the definition of epipolar lines to identify the straight-line direction of a moving
object, under the previous assumptions that the movement is straight and linear and that the
camera stays in the same position.
3. CONCLUSIONS
After this lab session, we are more aware of the tools at hand when dealing with a given
dataset of images (images from non-calibrated cameras, how to compute homographies, the use
of epipolar lines for pose estimation).

Lab three has been a demanding but profitable piece of work:

● Although we have demonstrated the feasibility of using the DLT algorithm for pose estimation,
alternatives such as EPnP are shown to be more efficient and accurate according to the
literature [Davide Scaramuzza lectures].
● Unfortunately, EPnP, which is up to 10× more accurate and efficient than the DLT algorithm,
can only be used with calibrated cameras. For uncalibrated cameras, DLT is the only available
option.
● We learnt that normalization is required when using the 8-point algorithm.
● To obtain a more robust fundamental matrix, it is a good solution to introduce RANSAC to
remove the outliers (noise) and get better results.
● Epipolar lines can help when you want to know the trajectory of an object, but with
restrictions: the movement must be linear and in the same direction.

References

[1] Fundamental matrix (computer vision). https://en.wikipedia.org/wiki/Fundamental_matrix_(computer_vision)
[2] Q.-T. Luong and O. Faugeras. Determining the fundamental matrix with planes. In IEEE Conference on Computer
Vision and Pattern Recognition, pages 489–494, 1993.
[3] R. Hartley and A. Zisserman. Epipolar Geometry and the Fundamental Matrix.
https://www.robots.ox.ac.uk/~vgg/hzbook/hzbook1/HZepipolar.pdf
[4] Fundamental Matrix Estimation: A Study of Error Criteria. Link to DOI
[5] T. Basha, Y. Moses, and S. Avidan. Photo Sequencing. International Journal of Computer Vision, 110(3), 2014.
