
VISUAL ODOMETRY RESEARCH BASED ON EFFICIENT FEATURE TRACKING
DING Lianghong, WANG Runxiao, LI Tao
School of Mechatronics, Northwestern Polytechnical University, Xi’an 710072, China

Keywords: visual odometry, feature tracking, detection and removal, local positioning

Abstract

Visual odometry can use feature tracking to estimate the motion of a robot from sequential stereo frames. A high-quality feature point set can be obtained by applying various detection and removal methods at each stage of visual odometry. Several such methods are summarized in this paper and applied in our experiments. The results show that these methods perform better than the RANSAC algorithm alone.

1 Introduction
Visual odometry is a local positioning method built on stereo vision and a key technology for the autonomous navigation of robots in unstructured environments; it has been applied to Mars rovers [1-3], lunar rovers and quadruped robots [4-5]. Visual odometry determines the posture and position of a robot by analyzing sequential stereo frames. It mainly includes four steps: feature detection, feature matching, feature tracking and motion estimation. The rotation and translation matrices between the current frame and the previous frame are obtained; they express the variation of the robot's six degrees of freedom in 3D space. Continuous measurement can outline the trajectory of the robot precisely on a 2D or 3D map, which is very important for robot navigation, whether fully autonomous or semi-autonomous. At the same time, visual odometry can be combined with other visual navigation tasks, such as obstacle identification and terrain reconstruction.
Visual odometry uses stereo matching to obtain two 3D coordinate sets of the same feature point set. Motion estimation then finds the optimal R and t matrices that describe the coordinate transformation between the two sets. Stereo matching is performed three times in one visual odometry cycle, so it is the most important part of visual odometry and, in practice, the most difficult process in stereo vision.
Tracking feature points is the key factor in visual odometry. Ideally, once feature points are detected, they survive through all the intermediate processes to the estimation stage. In practice, however, a considerable portion of the feature points are outliers that should not survive to the iterative processing. There are several main causes of outliers:
1) Lens distortion and calibration error cause mismatches.
2) The error of the matching algorithm itself causes mismatches.
3) The search area of the tracking match is too large, so the cross correlation value of a possible outlier can exceed that of the accurate point.
4) Occlusion causes matched points to be missing.
The outliers produced by these causes pass through all the stages, wasting computing resources and interfering with the normal processing of the inliers. During iteration, outliers also slow the convergence and reduce the measurement accuracy. The common approach is to delete outliers with the RANSAC algorithm before iteration, but if the proportion of outliers is too large, RANSAC must run many times to obtain the inlier set, with a greater chance of failure.
This paper discusses how to obtain a good point set before motion estimation. Various detection and removal methods can be applied in every matching process. The experiments show that most of the outliers are detected and deleted by these methods.

2 Basic framework

2.1 Algorithm framework

There are one detection and three matching processes in a cycle of visual odometry. Because the outliers mainly come from mismatches, detection and removal are performed after every matching process in a cycle. The outliers are first detected by constraint-based detection methods and then deleted according to the rules. The algorithm flowchart of visual odometry used in this paper is shown in figure 1.

[Figure 1 shows the pipeline: Harris detection in the previous right image; NCC matching in the previous left image; detection and removal; NCC matching of feature tracking in the current right image; detection and removal; RANSAC coarse measurement; NCC matching in the current left image; detection and removal; L-M iteration estimation; estimation results R and t.]

Figure 1: Flow diagram of visual odometry

2.2 Basic algorithms

The Harris corner detection algorithm extracts the pixels with the highest interest values as feature points in each active area of the scene [6]. Visual odometry can follow the "update often, search less" principle; about 60 points are extracted initially. The Harris algorithm is as follows:
\mu(x, y) = G_s \otimes \begin{bmatrix} I_x^2(x, y) & I_x(x, y) I_y(x, y) \\ I_x(x, y) I_y(x, y) & I_y^2(x, y) \end{bmatrix}    (1)
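For illustration, here is a minimal sketch of the interest measure built from equation (1) in Python with NumPy/SciPy. The Sobel gradients, the smoothing scale sigma and the corner measure det - k * trace^2 with k = 0.04 follow the usual Harris-and-Stephens formulation [6]; the concrete parameter values are our assumptions, not values given in the paper.

import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img: np.ndarray, sigma: float = 1.5, k: float = 0.04) -> np.ndarray:
    """Return a Harris interest value for every pixel of a grayscale image."""
    img = img.astype(np.float64)
    ix = sobel(img, axis=1)          # horizontal gradient I_x
    iy = sobel(img, axis=0)          # vertical gradient I_y
    # Gaussian-smoothed structure matrix entries, i.e. G_s (x) [...] in (1)
    ixx = gaussian_filter(ix * ix, sigma)
    iyy = gaussian_filter(iy * iy, sigma)
    ixy = gaussian_filter(ix * iy, sigma)
    det = ixx * iyy - ixy * ixy      # det(mu)
    trace = ixx + iyy                # trace(mu)
    return det - k * trace ** 2      # classic Harris corner measure

# Usage (assumed): keep the ~60 strongest responses as the initial feature set.
# ys, xs = np.unravel_index(np.argsort(harris_response(img).ravel())[-60:], img.shape)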
The NCC matching algorithm uses a 7 × 7-pixel image patch as the matching template, a size that balances accuracy and timeliness. Within the search area the template must compute a cross correlation value pixel by pixel, so the NCC algorithm costs more time and is not conducive to real-time performance, but it has good accuracy and robustness. The expression is as follows:
\mathrm{NCC}(x, y, u, v) = \frac{\sum_{m,n} T_l \cdot T_r}{\sqrt{\sum_{m,n} T_l^2 \cdot \sum_{m,n} T_r^2}}    (2)
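A sketch of this matching step is given below, assuming a rectified stereo pair so that the search window is a narrow band around the epipolar line. The window half-sizes sx and sy are illustrative assumptions, not the paper's values.

import numpy as np

def ncc(tpl: np.ndarray, cand: np.ndarray) -> float:
    """Normalized cross correlation of two equally sized patches, as in (2)."""
    den = np.sqrt(np.sum(tpl ** 2) * np.sum(cand ** 2))
    return float(np.sum(tpl * cand) / den) if den > 0 else 0.0

def match_point(ref: np.ndarray, tgt: np.ndarray, x: int, y: int,
                sx: int = 40, sy: int = 3, half: int = 3):
    """Best match in `tgt` for point (x, y) of `ref`, using a 7x7 template."""
    tpl = ref[y - half:y + half + 1, x - half:x + half + 1].astype(np.float64)
    best_score, best_uv = -1.0, None
    # scan an epipolar band of +/- sy rows and +/- sx columns
    for v in range(max(y - sy, half), min(y + sy, tgt.shape[0] - half - 1) + 1):
        for u in range(max(x - sx, half), min(x + sx, tgt.shape[1] - half - 1) + 1):
            cand = tgt[v - half:v + half + 1, u - half:u + half + 1].astype(np.float64)
            score = ncc(tpl, cand)
            if score > best_score:
                best_score, best_uv = score, (u, v)
    return best_uv, best_score   # reject the match if best_score is below a threshold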
The RANSAC algorithm first selects an optimal point set, and a unit quaternion method combined with singular value decomposition roughly calculates the R and t matrices. At the same time, the distance between each tracking point and the corresponding point projected by R and t from the previous frame's point set is computed. Feature points whose distance is more than two pixel units are deleted as outliers. This measure ensures that the point set converges quickly in the next iterative step.
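The sketch below illustrates this coarse step on 3D point sets. Each minimal 3-point hypothesis is solved with an SVD (Kabsch) solver, which for this purpose is interchangeable with the unit-quaternion solver the paper mentions; the iteration count is an assumption, and the two-unit rejection threshold from the text is applied here to 3D distances for simplicity.

import numpy as np

def rigid_transform(P: np.ndarray, Q: np.ndarray):
    """Least-squares R, t with Q ~ R @ P + t; P, Q are Nx3 point arrays."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                     # proper rotation (det = +1)
    return R, cq - R @ cp

def ransac_rt(P_prev: np.ndarray, P_cur: np.ndarray,
              iters: int = 200, thresh: float = 2.0):
    """Coarse R, t plus an inlier mask; points beyond `thresh` are outliers."""
    best_inliers = np.zeros(len(P_prev), dtype=bool)
    rng = np.random.default_rng(0)
    for _ in range(iters):
        idx = rng.choice(len(P_prev), 3, replace=False)
        R, t = rigid_transform(P_prev[idx], P_cur[idx])
        err = np.linalg.norm(P_cur - (P_prev @ R.T + t), axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # refit on all inliers; the rejected points are deleted from the set
    R, t = rigid_transform(P_prev[best_inliers], P_cur[best_inliers])
    return R, t, best_inliers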
The ultimate goal of motion estimation is to obtain the optimal R and t matrices that minimize the residual e_j. The L-M iteration algorithm minimizes the sum of squared differences, over all surviving points, between the tracking points and the points transformed by R and t [7]:
e_j = P_{cj} - (R \cdot P_{pj} + t)    (3)

F(x) = \sum_{j=1}^{n} \left\| P_{cj} - (R(x) P_{pj} + t) \right\|^2    (4)

where P_{cj} is the j-th tracking point in the current frame and P_{pj} the corresponding point in the previous frame.
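A sketch of this refinement under (3)-(4), using Levenberg-Marquardt via scipy.optimize.least_squares, is shown below. The Rodrigues (rotation-vector) parametrization of R(x) is our choice for illustration; the paper does not specify one.

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def residuals(x: np.ndarray, P_prev: np.ndarray, P_cur: np.ndarray) -> np.ndarray:
    """Stacked e_j = P_cj - (R * P_pj + t), with x = (rotation vector, t)."""
    R = Rotation.from_rotvec(x[:3]).as_matrix()
    t = x[3:]
    return (P_cur - (P_prev @ R.T + t)).ravel()

def refine_rt(R0: np.ndarray, t0: np.ndarray,
              P_prev: np.ndarray, P_cur: np.ndarray):
    """Start from the RANSAC coarse estimate and minimize F(x) in (4)."""
    x0 = np.concatenate([Rotation.from_matrix(R0).as_rotvec(), t0])
    sol = least_squares(residuals, x0, args=(P_prev, P_cur), method='lm')
    return Rotation.from_rotvec(sol.x[:3]).as_matrix(), sol.x[3:]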
2.3 Detection and removal

Many measures have to be added to visual odometry to reduce the adverse effects of outliers. The accuracy of visual odometry depends on the quality of the feature point set, not on the point quantity. The experiments show that the iteration results are good when the proportion of inliers is over 90%; if it falls below 70%, the convergence results carry a large uncertainty, which is not conducive to precise robot localization. There are several methods to detect and delete the outliers in time throughout the whole process of visual odometry.
1) The epipolar line constraint sets the search range within the same frame. In the normal state the feature points should be distributed strictly along the epipolar line, but because of the system error of the camera the search area is extended by an offset buffer of a few pixels above and below the epipolar line. This eliminates possible mismatches when the feature points do not lie exactly on the epipolar line.
2) The points in the set are arrayed by pixel line, so the disparity difference between consecutive feature points should stay below a certain pixel value, such as 10. If a disparity is much larger or smaller than the others, the corresponding point has a significantly different depth of field compared with the other points, which clearly contradicts the distribution regulation of the points (see the sketch after this list).
3) Feature tracking is the most difficult matching stage, so it produces the most outliers. The rigidity test and the slope test are effective methods here. The relative position between the points in the image is unchanged when the landmarks are all stationary, so the rigidity test can delete outliers that violate this rule. The slope of the line between a point in the current frame and the projection of the corresponding point from the previous frame follows a clear distribution regulation; a point whose slope differs too much can be deleted directly (see the sketch after this list).
4) The cross correlation value of the NCC algorithm can itself be used to delete outliers. In particular, the occlusion problem often occurs when there are a variety of obstacles in the scene. If a point is visible in one image but not in the other, its cross correlation value is usually relatively low. The threshold must be selected cautiously, however, to avoid accidentally deleting inliers.
5) The points' depth of field can detect outliers under linear motion. In forward movement, for example, a point's depth of field in the previous frame is larger than in the current frame, because the points move backward relative to the camera and gradually approach it.
6) The movement range of the feature points can also detect mismatched points. Within one cycle, all the points move almost the same distance, so a point whose displacement is too large or too small can be excluded as an outlier. But when the camera is turning sharply or its roll is too large, this method should be used with caution.
All of the above methods are not necessarily used together in visual odometry. Although strict detection and removal can delete almost all the outliers before the iteration, it can introduce error when a portion of the inliers is deleted incorrectly as outliers. In particular, the same outlier must be deleted synchronously from all the point sets that hold its records: the points in each set are ordered arrays, and the correspondence between all the sets must be kept consistent. These methods for deleting outliers remove the disturbance of mismatches and ensure a good point set at every stage of visual odometry.
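As referenced in rules 2) and 3), here is a sketch of the disparity-continuity check and the slope test. The 10-pixel jump limit comes from the text; the median-based angular tolerance (instead of a raw slope difference, which is unstable for near-vertical lines) is our assumption.

import numpy as np

def disparity_check(disp: np.ndarray, max_jump: float = 10.0) -> np.ndarray:
    """Keep points whose disparity does not jump too far from the previous
    point when the set is ordered by pixel line (far to near)."""
    jumps = np.abs(np.diff(disp, prepend=disp[0]))
    return jumps <= max_jump                  # boolean inlier mask

def slope_check(pts_cur: np.ndarray, pts_proj: np.ndarray,
                tol: float = 0.2) -> np.ndarray:
    """Keep points whose track line (current point -> projected previous
    point) has a direction close to the dominant one. Angle wrap-around at
    +/- pi is ignored for brevity in this sketch."""
    d = pts_proj - pts_cur
    angles = np.arctan2(d[:, 1], d[:, 0])
    return np.abs(angles - np.median(angles)) <= tol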
3 Experiments and results

The 16th and 17th frames of the experiments are selected to describe the process of visual odometry as an example. The right image is the reference image and the left image is the target image.

3.1 First frame

The feature points are detected by the Harris algorithm in the 16r (figure 2). By setting the parameters, the number of feature points can be controlled to meet the expectation. NCC matching then determines the best points in the 16l corresponding to the points in the 16r (figure 3). The epipolar line constraint increases the accuracy of the stereo matching and reduces the computation time. If the cross correlation value of the NCC matching is too low, there may be no correctly matched point because of the occlusion problem. Triangulation then calculates the 3D coordinates of the feature points in the 16th frame (a sketch of this computation follows figure 3). After the point set is obtained, the disparities of consecutive feature points are usually close, because the points are arrayed from far to near. If some point has a significantly different disparity compared with the others, it is considered an outlier and deleted.

Figure 2: Feature detection in the 16r

Figure 3: First matching in the 16l
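The triangulation step has a simple closed form for a rectified stereo pair, as sketched below. The focal length f, baseline b and principal point (cx, cy) are placeholders for the real calibration values, which the paper does not list.

import numpy as np

def triangulate(x_r: np.ndarray, y_r: np.ndarray, disparity: np.ndarray,
                f: float, b: float, cx: float, cy: float) -> np.ndarray:
    """Return Nx3 camera-frame points from right-image pixels and disparity."""
    Z = f * b / disparity              # depth from disparity
    X = (x_r - cx) * Z / f
    Y = (y_r - cy) * Z / f
    return np.stack([X, Y, Z], axis=1)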
3.2 Second frame

After the camera moves a short distance, the feature points in the 16r are tracked in the 17r. This stage is the most important image processing in visual odometry, and most of the outliers are produced here for many reasons. The search area of the NCC matching must be large enough to contain the whole potential matching range, so it costs much time to compute the matching and increases the probability of mismatch. The result of the initial tracking match is not good, as figure 4 shows. The lines between the points in the current frame and the points projected from the previous frame clearly show that there are many outliers in this process (figure 5). The rigidity test and the slope test can detect and delete these outliers (figure 6).
After the tracking match, the feature points must still be matched in the current frame. This process is the same as the first matching. Triangulation calculates the new 3D coordinates of the feature points in the current frame. The points' depth of field differs before and after the motion, so method 5) can detect more of the outliers. The movement range of the feature points can also be computed in this cycle, and points whose displacement is too large or too small are deleted as outliers. Finally, a good point set is obtained by combining all these methods.

Figure 4: Feature tracking in the 17r

Figure 5: Initial feature trajectory of two frames

Figure 6: Feature trajectory after removal
3.3 Motion estimation

Motion estimation can also be divided into two steps: 1) the RANSAC algorithm selects the optimal point set and roughly calculates the R and t matrices (figure 7); 2) L-M iteration computes the accurate results. The results of the L-M iteration clearly show the error between the points from the tracking match and the points obtained from the R and t transformation (figure 8). The dots are the tracking match points and the crosses are the transformed points. Ideally, the dots and the crosses
should completely overlap; however, due to the presence of errors, the R and t matrices are only an approximation.

Figure 7: Final feature trajectory

Figure 8: Comparison of iteration and tracking

3.4 Results

Visual odometry can estimate the robot's six degrees of freedom in 3D space. The trajectory of the robot's center of gravity is obtained by connecting all the coordinate origins from the continuous motion estimation. There are two issues worth noting. Firstly, the R and t matrices describe the transformation of the point set, not of the robot, so R must be replaced by its inverse and t must be negated (see the sketch after figure 10). Secondly, each estimation is based on the previous frame's coordinate system, so visual odometry is a relative motion estimation. Figure 9 shows the trajectory of the camera in the experiment in 3D space, and figure 10 shows the location of the camera on a 2D grid map.
There are 75 motion estimation cycles in the experiments, during which the camera moved about 7 meters. The final positioning error is less than 4%, which meets the requirements of robot local positioning in unstructured environments. To test the robustness of the algorithm, many slightly blurred images were added to the experiments. In some cycles the detection and removal methods proved too strict and many inliers were accidentally deleted, but this had no major impact on the final result.

Figure 9: Trajectory of camera in 3D space

Figure 10: Trajectory of camera in 2D grid map
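A sketch of this trajectory bookkeeping: each cycle's (R, t) transforms the point set, so the camera pose is updated with the inverse transform, and the trajectory is the chain of accumulated origins. The homogeneous-matrix form is our choice of notation.

import numpy as np

def accumulate_poses(rel_transforms):
    """rel_transforms: list of (R, t) point-set transforms, one per cycle.
    Returns the camera positions in the first frame's coordinate system."""
    T_global = np.eye(4)
    trajectory = [T_global[:3, 3].copy()]
    for R, t in rel_transforms:
        T_rel = np.eye(4)
        T_rel[:3, :3] = R
        T_rel[:3, 3] = t
        # camera motion = inverse of the point-set transform (R^T, -R^T t)
        T_global = T_global @ np.linalg.inv(T_rel)
        trajectory.append(T_global[:3, 3].copy())
    return trajectory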
4 Conclusion

For good feature tracking in visual odometry, this paper has focused on detection and removal methods. Several methods can delete the outliers in time at each stage of visual odometry. Although these methods are very effective, that does not mean all the measures should be used together in one cycle. To prevent accidentally deleting inliers, the conditions of use must be set properly according to the motion state. Visual odometry is a very effective local positioning method for robots in unstructured environments. At present it works best together with other positioning sensors, such as an IMU or encoders; in particular, an IMU can remedy the angle measurement error over long distances.

Acknowledgements

This paper is supported by the graduate innovation fund of Northwestern Polytechnical University.
References
[1] Maimone M, Cheng Y, Matthies L. "Two years of visual odometry on the Mars Exploration Rovers", Journal of Field Robotics, special issue on Space Robotics, 24(3): 169-186, (2007).
[2] Johnson A, Goldberg S, Cheng Y, et al. "Robust and efficient stereo feature tracking for visual odometry", IEEE International Conference on Robotics and Automation, Pasadena, CA, USA, 39-46, (2008).
[3] Olson C, Matthies L, Schoppers M, et al. "Robust stereo ego-motion for long distance navigation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head Island, SC, USA, 2: 453-458, (2000).
[4] Howard A. "Real-time stereo visual odometry for autonomous ground vehicles", IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, 3946-3952, (2008).
[5] Buehler M, Playter R, Raibert M. "Robots step outside", International Symposium on Adaptive Motion of Animals and Machines, Ilmenau, Germany, 1-4, (2005).
[6] Harris C, Stephens M. "A combined corner and edge detector", Proceedings of the Fourth Alvey Vision Conference, Manchester, UK, 15: 147-151, (1988).
[7] Johnson A, Matthies L. "Precise image-based motion estimation for autonomous small body exploration", 5th International Symposium on Artificial Intelligence, Robotics and Automation in Space, 627-634, (1999).
[8] Konolige K, Agrawal M, Blas M, et al. "Mapping, navigation, and learning for off-road traversal", Journal of Field Robotics, 26(1): 88-113, (2009).
[9] Nistér D, Naroditsky O, Bergen J. "Visual odometry for ground vehicle applications", Journal of Field Robotics, 23(1): 3-20, (2006).