Professional Documents
Culture Documents
tation that exploits the implicit surfaces. Whereas conventional SLAM systems [3], [15], [16] have demonstrated accurate
GPIS representations [11], [12] can only infer the Euclidean pose estimation, global consistency for long term trajectory,
distance field (EDF) accurately close to the measurements and real-time performance. Due to the difficulties of extracting
(noisy points on the surface), Log-GPIS proposed in our continuous surfaces from a sparse set of points, most of
previous work [13] faithfully models the EDF and gradients by them are not suitable for applications other than localisation.
enforcing Eikonal equation [14] through a simple log trans- In addition, dedicated challenging algorithms are required
formation on the standard GPIS. This representation allows to recover a dense and accurate reconstruction from sparse
us to use Log-GPIS as a single unified representation for points [17].
localisation, mapping and planning multiple robotics tasks, As an alternative representation to achieve sensor tracking
which is the focus of this paper. The smooth and probabilistic and mapping, the seminal work of KinectFusion [4] demon-
nature of Log-GPIS provides an accurate dense surface for strates the performance of camera motion estimation and
mapping given noisy measurements. The extrapolative power dense reconstruction in real-time. To represent the geome-
of Log-GPIS provides information not only near but also far try, KinectFusion uses the truncated signed distance function
away from the measurements, and hence enables estimation (TSDF) [18] and a coarse-to-fine ICP algorithm [19] to
of the full EDF with gradients. This allows motion estimation perform alignment. TSDF, originally proposed in computer
via direct optimisation of the alignment between the global graphics literature, has become a widely used representation
map representation and a new set of measurements. Further- to exploit in odometry and mapping approaches. However,
more, the EDF and its gradients are suitable for trajectory KinectFusion has some drawbacks. First, the algorithm im-
optimisation-based planners. plementation relies on GPU. Second, the distance of the
The contributions of this paper are as follow: TSDF is truncated and generally overestimates the distance
• The Log-GPIS-MOP framework, which achieves incre- within limited viewpoints. Third, the representation can not
mental localisation, mapping and planning based on a be directly used for planning applications.
single continuous and probabilistic representation through Another relevant representation for mapping and odometry
accurate prediction of the EDF and the implicit surface tracking is from RGB-D SLAM [20]. Such as the framework
with gradients. recently proposed in [21], the ElasticFusion represents the map
• A sound formulation for 2D/3D point cloud alignment as a collection of surfels, which are small units with normals
for odometry estimation, where the Log-GPIS is used to and colour. Given a surfel-based representation, ElasticFusion
minimise the point cloud distance to the implicit surface incrementally estimates the sensor pose through the dense
from a set of newly arrived observations following the frame-to-map tracking. No pose-graph optimisation or post-
gradients. processing is required. Such representation generally assists
• A generic mapping approach for organised and unorgan- to compute accurate pose estimation, along with high-quality
ised point clouds that consists in probabilistically fusing reconstruction and a rich understanding of the occupied space.
new measurements into a global Log-GPIS representation However, the representation does not differentiate free space
and is ready for a modified version of marching cubes to from unknown space explicitly, which is of limited applicabil-
recover dense surface reconstruction. ity for autonomous navigation tasks.
• A path planning approach that uses Log-GPIS to avoid In contrast, occupancy-based representations [22] efficiently
collisions with the mapped surfaces following the EDF compute occupancy information, such as free, occupied, and
and gradient information. unknown. It is commonly used for mapping and path planning
Moreover, we present a thorough evaluation of the Log- applications. Many options for occupancy maps have been
GPIS-MOP incremental tasks and overall framework on public presented in the literature varying from discrete [22] [23] [24]
benchmarks, and qualitative and quantitative benchmarking to probabilistic and continuous [25], [26]. For the discrete
with the state-of-the-art. Our work enables sequential and occupancy map, the constructed maps contain a set of grid
concurrent localisation, mapping and planning, focusing in cells with occupancy information of the environment. The
accurate performance as opposed to real-time operations. planning approaches take the occupancy grids as input [27].
The paper structure is organised as follows. The related As for the probabilistic method in [28], the authors propose
work is presented in Section II. Section III explains the the continuous occupancy mapping through viewing Gaussian
background knowledge. Our Log-GPIS representation is dis- processes as implicit functions of occupancy. By doing that, it
cussed in Section IV. The methodology of Log-GPIS-MOP is capable of building the representation with both occupancy
framework is presented in Section V , with odometry in VI, information and implicit surfaces. If naively extended from 2D
mapping VII, and planning VIII. In Section IX, we present a occupancy grids to 3D, however, that requires a huge memory
thorough evaluation of the framework on public benchmarks, consumption. It is difficult to perform fast ray-casting and
and compare the performance to existing solutions. Finally, look-up tabling for any space larger than a room. In particular,
conclusion and future works is given the Section X. the online solution commonly used in 3D is Octomap [23].
This approach uses an octree-based structure to represent
II. R ELATED W ORK occupancy probabilities of cells. The octree structure allows
For visual SLAM, several successful frameworks process the map to deal with large space and immensely decreases the
the input data into a set of sparse features as the repre- amount of memory to represent unknown or free space.
sentation for odometry and mapping. Recent feature-based However, the representation with occupancy information
3
only is not enough for some planning approaches. For instance, discontinuity. In contrast, our paper uses a single and unified
trajectory optimisation-based planners including CHOMP [5] representation generated from depth data only for odometry
and TrajOpt [29], require the robot to build an efficient estimation, continuous dense mapping and planning with no
representation with distance to obstacles and direction infor- aid from any other positioning system.
mation. In addition, having a distance map speeds up collision Gaussian Process implicit surfaces (GPIS) is a continuous
checking of complex circumstances. For example, robot arms and probabilistic representation given noisy sensor obser-
are commonly represented as a set of overlapping spheres. It vations initially proposed to reconstruct surfaces [8]. Later
is convenient to check the distance field in the center of each in [11], the authors used the GPIS as an approximate signed
sphere [5]. distance function (GPIS-SDF) that infers the distance to the
These planners require a distance field, which is not trun- nearest surfaces. However, it only keeps tracking the distance
cated further from surface. ESDF containing distance values information close to the surface as there is a lack of observa-
over the entire exploring space is also necessary. The ESDF tions further away. Based on the Varadhan’s approach [34], our
information can be naturally passed to the collision cost func- previous work in [13] applies the logarithmic transformation
tion to produce collision-free trajectories [5] [29]. Checking to a GPIS with a specific covariance function to recover
the distance values of the neighbors gives the gradient at a the distance field in the space. This method provides the
given point. This allows CHOMP and other planners to follow extra potential of GPIS to accomplish accurate distance field
the gradient direction of the distance to push the robot on the mapping even further away from the surface. However, our
trajectory without collision [30]. previous work does not consider to solve the localisation
Following above requirements, Oleynikova et al [31] pro- problem as assumes the sensor poses are known and do not
posed the Voxblox framework to incrementally generate a actually propose a planning approach that leverages such a
representation for mapping and optimised planning. It adopts representation.
the advantages of TSDF to generate implicit surface as the Catering for the various requirements to have a represen-
dense reconstruction ideal for human visualisation. Based on tation suitable for odometry, mapping and navigation is still
the TSDF, the distance within truncated threshold is relatively an open research problem. Thus, in the paper we present the
accurate. Then it uses the ESDF representation that is nu- unified environment representation that can be used for dense
merically approximated from TSDF to the space within the reconstruction, collision avoidance, while providing additional
observations field of view. The basic unit for ESDF is voxel information needed for sensor’s odometry estimation.
grid, which contains the distance to the nearest obstacles
III. BACKGROUND
and gradient. Given the exploring ESDF, it has the ability
to work with trajectory optimisation methods, such as the In this section we introduce the Eikonal equation, the
trajectory optimization-based local planner [32] proposed by derivation of the distance approximation used in this work and
the same authors to perform online navigation in unknown a brief description of the Gaussian Process Implicit Surfaces.
environments.
Compared to KinectFusion, Voxblox does not use the dis- A. Euclidean Distance Field
tance information of the TSDF or ESDF to perform motion Assume a bounded domain denoted as S ⊂ RD with suit-
estimation. The system still requires ground truth poses as ably smooth boundary ∂S. The ∂S has a constant orientation,
input from a motion capturing system or estimated poses which is given by the the normal n. The nearest distance from
from a visual-inertial odometry framework [33]. To make all x ∈ RD to w ∈ ∂S is expressed as EDF d(x):
the distance play a role in the trajectory estimation, same d(x) = min |x − w| . (1)
authors extended their work to tackle the long term consistent w∈∂S
mapping problem in Voxgraph [6]. It introduces submap-based Suppose the boundary travels continuously along its normal
ESDF and TSDF representations for mapping and navigation. direction to reach at x. Let f (x) be the speed of travel and
Voxgraph adopts the distance information to generate corre- b(x) the arrival time. The minimal amount of time required is
spondence free constraints between submaps. Then through given by the solution of
optimising a pose graph, the relative pose given submaps 1
in a closed loop is computed to assist with the long term |∇b(x)| = (2)
f (x)
drifts. However, Voxgraph system needs an external visual
odometry to convert incoming sensor data into submaps for where | · | represents the Euclidean norm. In the special case
further optimisation and localisation. that travel speed equals to 1, the arrival time b(x) is exactly
The authors proposed a landmark-based representation the distance from x to ∂S, equivalent to d(x). This basic
in [2] for sparse map construction, self localisation and path characteristic of distance function satisfies the solution of the
planning. The representation is a set of image landmarks well-known Eikonal equation [35]:
selecting from a sequence of images. The localisation is |∇d(x)| = 1 , (3)
implemented by computing a flow between the landmarks,
where the Euclidean distance between two points is given by
while at the same time the shortest path is computed along
a straight line. ∇ represents the gradient. Note that (3) is
the selected landmarks. However, sparse landmarks represent
constrained with boundary conditions,
geometric structure properly, but they are not suitable for dense
reconstruction or path planning due to the sparseness and d(x) = 0 and ∂d(x)/∂n = 1 on x ∈∂S. (4)
4
B. Varadhan’s Formula Thus the expression of v(x) in terms of u(x) from (6) is
given by, √
In the robotics literature, recently proposed methods such
v(x) = exp[−u(x)/ t]. (8)
as [31] [36] adopted the concept of wave front propagation
to estimate the distance field efficiently. This approach incre- Substituting v(x) from (8) in (7) leads to
mentally propagates the distance field to the neighbors directly h √ i
from TSDF or from the occupancy map through a voxel grid. v − t∆v = v 1 − |∇u|2 + t∆u = 0, (9)
In this way, compromise has to be made to not obtain the
and this results in the regularised Eikonal equation given by,
EDF directly as further post processing algorithm is required.
√
To recover the distance field exactly, one option is to solve 1 − |∇u|2 + t∆u = 0. (10)
the Eikonal equation. Various methods have been introduced
in the past few decades. Some of these methods take advantage Essentially, (10) demonstrates that the regularised form of the
of the relationships provided by physical constraints, generally Eikonal equation can be effectively solved in a linear way.
discretising the domain through a grid, thereby computing a
solution for each discrete point. However, the main restriction C. Gaussian Process Implicit Surfaces
in exploiting the Eikonal equation (3) is that the equation is Gaussian Process (GP) [38] is a non-parametric stochastic
hyperbolic and non-linear, which makes it difficult to directly method ideal for solving non-linear regression problems. GPIS
solve and recover the accurate distance values. techniques [8], [9], [10] use GP approach to estimate a prob-
Inspired by the idea proposed in the computer graphics abilistic and continuous representation of the implicit surface
literature [14], we adopt a smooth alternative based on Varad- given noisy measurements. Recently as proposed in [11], [12],
han’s distance formula [34] [37]. The intuitive explanation GPIS representation can be used for distance field estimation,
of Varadhan formula in physical perspective is as follows. which provides approximate distance values only near the
Assume a hot needle touching a surface, so that the heat surface.
radiates on the manifold. The heat spreading is treated as Following the formulation in [11], for an unknown function
taking a random walk from the boundary. Assume the time of d that represents the distance to a surface, the implicit surface
random walks to be minimum, the nearest possible trajectory is modelled with zero mean GP and gradient ∇d:
will be taken. This relationship between heat and distance on
d
the manifold S is captured by the Varadhan formula through ∼ GP(0, k̃(x, x0 )). (11)
the heat kernel. The heat kernel denoted as v(x) represents ∇d
the conduction of heat from a source to a target over time. Then, the combined covariance matrix k̃(x, x0 ) with partial
According to the celebrated result of Theorem 2.3 of derivatives can be written as,
Varadhan in [34], the heat kernel v(x) is closely related to
k (x, x0 ) k (x, x0 ) ∇>
the EDF d(x) as follows: 0 x0
k̃ (x, x ) = , (12)
√ ∇x k (x, x0 ) ∇x k (x, x0 ) ∇>x0
d(x) = lim {− t ln[v(x)]} , (5) where k(x, x0 ) is the covariance function, which characterises
t→0
the properties of function d through multiple hyperparameters.
where t is the time and the condition for the equation to hold ∇x and ∇x0 are the partial derivatives of k(x, x0 ) at x and
is that t approaches zero. Consequently, given a relative small x0 [10].
value of t, it is possible to define the approximation of d(x) Consider a set of points corrupted by noise. The distance
as u(x): J
measurements y = {yj = v(xj ) + j }j=1 ⊂ R and the corre-
√
u(x) = − t ln v(x). (6) sponding gradients ∇y ⊂ R are taken at locations xj ∈ RD
and affected by additive Gaussian noise j ∼ N (0, σv2 ). The
In this case, t controls the smoothing property of u(x).
posterior distribution of d at an arbitrary testing point x∗ is
Instead of solving the non-linear solution of the Eikonal
given by d(x∗ ) ∼ N (d¯∗ , V [d∗ ]) with the predictive mean and
equation in terms of d(x), we can solve the homogeneous
variance with derivatives are expressed as follows
screened Poisson partial differential equation (PDE) [37] in
d¯∗
terms of v(x). Screened Poisson exhibits similar properties to > 2 −1 y
= k̃∗ (K̃ + σd I) (13)
the conventional heat equation: ∇d¯∗ ∇y
(1 − t∆)v = (1/t − ∆)v = 0 in S, V [d∗ ]
(7) = k̃(x∗ , x∗ ) − k̃> 2 −1
∗ (K̃ + σd I) k̃∗ . (14)
V [∇d∗ ]
v = 1 on ∂S,
P ∂2 Note that I represents the identity matrix of size J(D + 1) ×
v is to be determined. ∆ = i ∂x 2 is the Laplace operator J(D + 1). k̃∗ represents the covariance vector between the
i
and 1/t is a positive parameter that controls the screening. training points and the testing point x∗ . K̃ is the covariance
Unlike the non-linear Eikonal equation (3), it is important to matrix of the training points J with gradients. Furthermore,
highlight that the screened Poisson equation (7) is linear and k̃(x∗ , x∗ ) is the covariance function between the testing point.
easy to solve. That is to say, the proposed method solves the Most GPIS techniques ensure the accuracy of the recon-
heat equation in terms of v(x) instead of Eikonal equation structed surfaces, but only few of them directly model the
related to d(x). distance field [11], [12]. However, one of the disadvantages of
5
standard GPIS applying inference of the distance field is that We also note that the log-transformation affects the gradient.
only accurately infer the distance near the surface. Given the Taking the partial derivative of (16) respect to the distance
limited training data, it is hard to keep the high precision and leads to: √
continuity of the distance field further from the measurements. − t
∇d¯∗ = ∇v̄∗ , (17)
Our representation, Log-GPIS method [13], aims to overcome v̄∗
this shortcoming. It uses the log transformation of the well- where we can find that the main difference between
√ ∇d¯∗ and
known GPIS model after applying the Varadhan’s equation to ∇v̄∗ is the direction and the scaling factor t/v̄∗ . Typically,
predict the exact Euclidean distance field given minimal and for a standard GPIS, the magnitude of the gradients at the
sparse observations close to surface. With accurate distance surface does not have to be fixed. However, the Eikonal
predicted, we unlock the potential of Log-GPIS to accomplish equation (3) requires that the magnitude of the gradient is
robotic tasks other than mapping, such as Log-GPIS based normalised to 1. Since the gradient of d¯∗ is in the opposite
odometry discussed in section VI and path planning in sec- direction as the gradient of v̄∗ , we simply normalise the
tion VIII. gradient of v̄∗ , and invert the sign to have the d¯∗ .
Based on (17), the predictive variance V [d∗ ] can be directly
IV. L OG -G AUSSIAN P ROCESS I MPLICIT S URFACES recovered in terms of V [v∗ ] using a first-order approximation,
A. Derivation √ √ >
− t − t
As the name suggests, our Log-GPIS [13] representation is V [d∗ ] = V [v∗ ] . (18)
v̄∗ v̄∗
based on applying the logarithmic transformation of Varad-
han’s formula on the GPIS. As discussed above, the main So far the Varadhan’s equation only applies to x ∈ S as
difference between the Eikonal equation with solution d(x) shown in (7). In [13], we showed that this domain can be
and the equation with the heat kernel v(x) as solution, is that extended to x ∈ RD . For completeness we detail here.
heat equation (also known as screened Poisson PDE) is linear Let us consider the complement of the EDF as S c , and
to solve. It is advantageous to model the heat kernel v(x) (i.e. the corresponding heat equation in S c . Given ∂S c = ∂S, the
the exponential of the d(x)) as a GP, instead of the d(x), boundary condition remains the same for both heat equations
in S and S c . Combining two heat equations for x ∈ S and
v ∈ S c , we arrive at:
∼ GP(0, k̃(x, x0 )) . (15)
∇v
v − t∆v = 0 in RD ,
We then use (6) to recover the distance and thus its (19)
v = 1 on ∂S.
name. Essentially, Log-GPIS representation simply takes the
logarithm of a standard GPIS to model v(x). Thus, the linear Note that, this domain change has essentially ‘stitched’ the
heat equation or the screened Poisson PDE constraints can EDFs of S and S c . Thus the information on whether x is inside
be naturally incorporated into GP regression by ensuring that or outside S is lost (i.e. the sign is lost). In our perspective,
the covariance function follows the linear PDE [39], which the advantages of using (19) to approximate the distance field
we will describe how to choose the covariance function for everywhere on RD , losing the sign is a small price to pay for
Log-GPIS shortly in section IV-B. the convenience of the simplicity and accuracy of the distance
Moreover, assuming that the choice of covariance function estimation.
respects (19) already, we estimate v̄(x) and ∇v̄ and its
variance by simply using the GPIS regression equation (13) B. Choice of Covariance Function
and (14) with target measurements set to yi = 1 at locations xi In our previous work [13], we derived a covariance function
on the surface (instead of d¯ and ∇d).
¯ Intuitively, this implies
that exactly satisfies the screened Poisson PDE (19). Inspired
the target values of measurements in terms of the EDF is by Sarkka’s work [39], we noted that the so-called Whittle
log(1) = 0 on the surface boundary. Similarly, this enforces kernel [40], solves exactly the screened Poisson equation (19),
the boundary condition in (19).
|x − x0 |
Subsequently, after the GP inference, we need to use Varad- k (x, x0 ) = Kν (λ |x − x0 |) , (20)
han’s formula (6) to recover the EDF from the predictive mean 2λ
√
and covariance function: where λ = 1/ t and Kν is the modified Bessel function
√ of the second kind of order ν = 1. λ is the characteristic
d¯∗ = − t ln v̄∗ . (16)
length scale hyperparameter. Note that in this case, (5) shows
Remarkably, (16) shows that we can solve the screened that when t gets closer zero it will produce a better distance
Poisson equation by a careful choice of covariance and a approximation, so that relative larger λ will produce a more
simple log-transformation, which justifies the name of Log- accurate EDF. In this paper, hyperparameter optimisation is not
GPIS. Despite its simplicity, the log-transformation has a non- considered. Nevertheless, λ is exactly the hyperparameter of
trivial benefit as the predicted distance will approach infinity the Whittle covariance, and (5) provides the necessary intuition
when querying points further away from the surface unlike to choose a good λ value for the EDF inference. Empirically
the standard GPIS, which will predict distance value of zero we found that λ can be chosen from 5 up to 40. Too large
(falling back to the mean far away from the measurements). In λ will cause numerical issues when querying distance away
other words, Log-GPIS solves the issue of undesirable artifacts from input observations and too small λ value will introduce
in the predicted surface geometry. inaccuracy to distance field around the surface.
6
Furthermore, as the Whittle covariance is |ν − 1| = 0 times that probabilistically represents the EDF and its gradient.
differentiable, it cannot be used for GPIS gradient inference. Finally, a path planning approach uses the Log-GPIS to avoid
Given that the standard Whittle covariance is a special case of the surfaces of the mapped environment using the EDF and
the Matérn family of covariance functions, and Matérn ν = gradient information. Note that Log-GPIS-MOP is based on
3/2 is once differentiable, we proposed in [13] to use this the assumption that the environment has sufficient curvature
latter instead. The Matérn family of covariance functions are to unequivocally recover the pose of the sensor at any given
given by: sensor frame.
1−ν
√ !ν √ !
0 22 2ν 0 2ν 0 VI. O DOMETRY
k(x, x ) = σ |x − x | Kν |x − x |
Γ(ν) l l
Let us consider a depth sensor moving in a static envi-
(21) ronment. The sensor captures a sequence of organised or
where l is the characteristic length scale and σ 2 (the signal unorganised point clouds as measurements of the environment.
variance) are the hyperparameters controlling the generalisa- Let us assume that the measurement noise is independent and
tion information of GP models. Sample functions from Matérn Gaussian distributed. The sensor frame at time ti is denoted as
family are |ν − 1| times differentiable. Thus, ν manages the FCi , where i = (0, ..., L). The world reference FW is given
degree of smoothness. by the sensor frame FC0 at initial the timestamp. The pose
We empirically observed
√ in [13] that a modified version of of FCi in world reference FW is given by the rotation matrix
Matern 3/2 with l = 2νλ is a good approximation of Whittle W
RCi ∈ SO(3) and the translation vector W tCi , combined
covariance with the advantage that it is once differentiable. into the 4 × 4 homogeneous transformation matrix ∈ SE(3)
Given that the modified Matérn 3/2 allows normal prediction as,
and Whittle does not, and the approximation is acceptable,
W
RCi W pCi
W
we propose to use this Matérn version as a trade-off for extra TCi = . (22)
0> 1
information.
Exploiting the accurate EDF with gradients in the space, not Denote the point cloud measurement of the sensor frame
only around the boundary but also further away, establishes the FCi at time ti as Ci Pi and Ci pi,j ∈ R3 the j-th point of Ci Pi
mathematical foundation to solve the pose estimation problem. with j = 0, ..., Ji . This 3D point cloud Ci Pi can be projected
Furthermore, faithfully modelling the EDF with gradients is from FCi to FW using W TCi ,
W C
sufficient for a trajectory optimisation-based planner, which Pi W
i
Pi
requires the distance and gradients to the nearest obstacles = TCi . (23)
1 1
to explore in complex surroundings. Moreover, the GPIS
provides continuous and probabilistic representation for map- Also, let us express W TCi through the concatenation of
W
ping and surface reconstruction. Our Log-GPIS unlocks the TCi−1 and Ci−1 TCi as
potential to provide a unified solution for sensor odometry, W
TCi = W TCi−1 Ci−1 TCi , (24)
surface reconstruction, and safe navigation.
where W TCi−1 is given by the odometry computation at time
ti−1 . We aim to find the W TCi such that the newly arrived
V. L OG -GPIS-MOP OVERVIEW Ci
Pi can be projected to the global world frame.
We present first the incremental formulation that leverages
directly the Log-GPIS. This incremental odometry formulation
is the one used by the incremental mapping and planning
approaches all tightly integrated through the Log-GPIS. Fur-
thermore, we also present the batch optimisation odometry
formulation that can be used as post-processing step to refine
the incremental odometry. Note that the batch formulation uses
Fig. 2. Diagram of Log-GPIS-MOP framework. the sequential poses only and not directly the Log-GPIS.
Let us consider the Log-GPIS as a unified representation for
simultaneous mapping, odometry and planning (MOP) with A. Incremental Formulation
depth sensors. Given the input data from any type of depth In this work, we propose an iterative approach to estimate
sensor, e.g. LiDAR and depth cameras, Log-GPIS-MOP aims the odometry based on the Log-GPIS representation. Our
at incrementally estimate the EDF, while at the same time odometry estimation approach follows the iterative and frame
localise the sensor, and with this information plan its path to transformation estimation aspects behind Iterative Closest
avoid colliding into the surfaces implicitly described by the Point (ICP)[19] but without the CP approach. ICP algorithm
EDF. Fig. 2 shows an overview of the full approach. aligns two point clouds using for instance, point to point or
The pose estimation problem for odometry is formulated as point to plane distance metrics. For each point in the source
iterative alignment optimisation where the Log-GPIS is used point cloud, ICP searches for the closest point in the target
to minimise the distance to the surface from a newly arrived point cloud, which are tentative correspondences. Then, the
point cloud following the gradients. The mapping consists in transformation than aligns both point clouds is computed by
fusing this current point cloud into the existing Log-GPIS iteratively minimising the total distance error of all points
7
correspondences. Searching for the closest points is the most Given the local Log-GPIS for an arbitrary testing point
Ci−1
expensive part in a standard ICP [4], [41]. pi,j , the predictive mean v̄i with derivatives are obtained
Our work proposes instead, a direct query of the distance from (13). Also, as per (16), d¯i can be computed by applying
to the surface based on Log-GPIS to find the alignment the logarithm transformation to v̄i ,
efficiently and accurately without the need of the closest
v̄i
yi−1
points search. Given a fused global representation with the = k̃>
i−1 (K̃ i−1 + σ 2 −1
I) (27)
∇v̄i ∇yi−1
information of all previous frames, the minimum distance to
the surface can be obtained directly from the Log-GPIS, which √
d¯i = − t ln v̄i . (28)
saves us from doing the correspondences search. With the
assumption that the difference between two consecutive frames K̃i−1 is the covariance matrix of the input points and k̃i−1
is relatively small, the transformation is computed iteratively represents the covariance vector between the input points of
by minimising in each step the sum of the Log-GPIS inference frame FCi−1 and the testing point. yi−1 are the measurements
itself (which is equivalent to the distance to the surface in a at ti−1 , i.e. Ci−1 Pi−1 .
point to plane way as the distance gradients are utilised as
well). After the transformation of current frame is applied,
the current point cloud is fused to the global representation ...
using a fusion update method, which will be discussed in
section VII-A. X0 X1 X2 Xi−1 Xi
More formally, let us consider a local Log-GPIS represen-
tation d¯i obtained using the previous frame only and a global Fig. 4. Factor graph representation of the frame to map odometry. Low opacity
Log-GPIS representation d¯W i , which is the result of fusing all nodes is used for those nodes that are not used for current alignment. Black
the previous frames into the global frame. Let us assume the factors are the global Log-GPIS distance constrains.
initial W TC0 is identity and define Xi = {W TC1 , ..., W TCi }
As in any frame to frame odometry, it can easily drift. The
with i = (0, ..., L) as the state to be estimated. The odometry
other downside of this simple alignment is that the local log-
is then formulated incrementally to estimate W TCi with ti
GPIS has to be created at every step and discarded afterwards.
as the current time stamp through a frame to map iterative
This motivates the use of a less subject to drift frame to map
alignment Fig. 3, with the possibility to compute the frame to
alignment, which uses the fused global Log-GPIS.
frame alignment as shown in Fig. 4.
b) Frame-to-Map: Let us consider a sequentially com-
The iterative alignment in both cases, frame to map and
puted global Log-GPIS d¯Wi , which is used frame to map align-
frame to frame, is performed by minimising the total distance
ment to compute the poses as the sensor moves through the
to the surface to find the optimal current pose in an non-linear
environment. Particularising eidis for frame to map odometry
least squares optimisation manner,
(Fig. 4), we have
2 Ji
X̌i = arg min
eidis
,
(25) X
eidis W
d¯W W
TCi Ci pi,j ,
Xi TCi = i (29)
j=0
where eidiscorresponds to the total distance from current point
Ci
cloud Pi to the surface and X̌i is the sequential estimation where the global Log-GPIS d¯W
i is computed from,
of the current pose. More importantly, with this formulation
v̄iW
W
Log-GPIS is directly used to compute this metric from all the W > W 2 −1 yi−1
points in point cloud Ci Pi by querying the distance values. = (k̃i−1 ) (K̃i−1 + σ I) (30)
∇v̄iW W
∇yi−1
a) Frame-to-frame: This alignment is computed between √
the current frame and the previous-only representation d¯i ; the d¯W W
i = − t ln v̄i , (31)
local Log-GPIS. W
where K̃i−1
is the covariance matrix of all the fused point
clouds from 0 to i − 1 in global reference frame W P{0,...,i−1}
... and k̃W
X0 X1 X2 Xi−1 Xi i−1 represents the covariance vector between all merged
W W
points in world frame and the testing point. yi−1 and ∇yi−1
are the measurements and gradients of W P{0,...,i−1} .
Fig. 3. Factor graph representation of the frame to frame odometry. Low Further, to solve the optimisation of the alignment problem,
opacity nodes is used for those nodes that are not used for current alignment.
Black factors are the local Log-GPIS distance constrains. the derivative of the distance with respect of W TCi is desired.
Based on the chain rule, the Jacobian function is as follows,
Particularising (25) for this case (Fig. 3), let us write eidis ∂ d¯W ∂ d¯W
W
i TCi Ci pi,j i
W
TCi Ci pi,j ∂ W TCi Ci pi,j
as: =
∂ (W TCi ) ∂ (W TCi Ci pi,j ) ∂ (W TCi )
Ji
X (32)
eidis Ci−1
TCi = W TCi−1 d¯i Ci−1
TCi Ci pi,j ,
(26) The first term is one of the valuable byproducts of the global
j=0 Log-GPIS, which is the gradient of query point W TCi Ci pi,j :
where the W TCi−1 is given by accumulating the relative poses ∂ d¯W W
i TCi Ci pi,j
from 1 to i − 1. = ∇d¯W W Ci
i ( TCi pi,j ). (33)
∂ (W TCi Ci pi,j )
8
into one point cloud in global reference frame, we aim to build FV move towards the implicit surface and iteratively query
a globally consistent continuous and probabilistic reconstruc- the Log-GPIS until the iso-values are smaller than a threshold
tion. However, a well-known disadvantage of general GPs lies (0.0001m in our experiments).
on the memory allocation and the computational complexity to
invert the covariance matrix of the size of the training points. VIII. P LANNING
Moreover, as pointed out above, we model the joint GPIS
with gradients, which improves the accuracy of the predicted The differentiable nature of Log-GPIS permits straight-
surfaces. Despite this being advantageous as there is no need to forward integration with existing optimisation-based planners.
generate artificial points to direct inside and outside surfaces, In this section, we present how Log-GPIS can be used together
the computational complexity gets worse when considering the with the covariant Hamiltonian optimisation-based motion
gradient inferences. planner (CHOMP) [5] for obstacle avoidance planning.
To reduce the general computational complexity, and Let xr : t 7→ xr (t) be the robot’s trajectory. For simplicity,
thereby improving its scalability, in our previous work [9], we consider the robot’s position in the workspace over time,
we adopted the idea of the scaling GP algorithm to achieve excluding orientation, so that xr (t) ∈ RD . However, the
fast kernel learning [42], and exploited the Structured Kernel framework can be easily extended to cases with orientation
Interpolation framework with Derivatives (D-SKI) [43] to or with multiple joints.
consider the gradients. The D-SKI is an extension development CHOMP minimises an objective functional defined over a
of the Structured Kernel Interpolation method (SKI) [42] and space of trajectories. We consider the conventional objective
the Matrix-Vector Multiplications method (MVMs) [44]. The functional as introduced in [5] for smooth obstacle avoidance:
main idea is to use an approximate kernel function with in- Z T
1 r
ducing points, enabling fast processing through interpolation. C[xr ] ≡ ||ẋ (t)||2 + λc(xr (t))||ẋr (t)||dt. (44)
The input dataset is reprojected onto a grid generated by 0 2
inducing points, which significantly reduces the requirement Here, the first term encourages smoothness by regularising
to compute GPIS. The interpolated kernel function of k (x, x0 ) the velocity. The second term is responsible for obstacle
is formulated as, avoidance, which is enforced by the collision penalty term
c(xr (t)) set as:
K ≈ V K(U, U )V T , (41)
r 1
where U is a uniform grid of inducing points m, and V −d(x (t)) + 2 ,
if d(xr (t)) < 0
r 1
is an N × m sparse matrix of interpolation weights. When c(x (t)) = 2 (d(xr (t)) − )2 , if 0 < d(xr (t)) ≤ (45)
considering gradients and preserving the positive definiteness 0, otherwise.
of the approximate kernel, D-SKI differentiates combined
covariance matrix. The target scaling covariance matrix of CHOMP incrementally minimises the objective functional
k̃(x, x0 ) can be written: C[xr ] using its functional gradient2 . With the objective set
as (44), the functional gradient is given by:
T
V V ∇C[xr ] = −ẍr + ||ẋr || I − ẋr ẋrT ∇c(xr ) − c(xr )κ .
K̃ ≈ K(U, U ) =
∇V ∇V (46)
V K(U, U )V T V K(U, U )(∇V )T
Here, κ = ||ẋr ||−2 I − ẋr ẋrT ẍr is the curvature vector.
, (42)
(∇V )K(U, U )V T (∇V )K(U, U )(∇V )T The gradient of the collision penalty term c(xr ) is given by
where V and ∂V are sparse matrices with assumption of
r
if d(xr ) < 0
quintic interpolation [45]. −∇d(x ),
r
∇c(x ) = (d(xr ) − )∇d(xr ) if 0 < d(xr ) ≤ (47)
0, otherwise.
C. Marching Cubes for Log-GPIS
Given a set of query points for Log-GPIS to obtain the The merit of our framework is that we can use the Log-GPIS
implicit surface values, let us use a modified version of gradient inference equation (17) to efficiently and accurately
marching cubes algorithm [46] to reconstruct a dense map. compute the gradient ∇d(xr ), and, in turn, (46), (47). This is
The marching cube computes and extracts the surface by because our formulation produces the gradients analytically,
connecting points of a constant value (iso-value) within a whereas conventional approaches rely on discrete grid-based
volume of space: approximation [5], [32].
FV = fiso (M, viso ), (43)
IX. E XPERIMENTAL R ESULTS
where M specifies a set of points for querying. viso is the
iso-value at which to compute the surface, specified as a In this section, we illustrate the performance of the proposed
scalar. FV contains the computed faces and vertices of the framework both qualitatively and quantitatively. The frame-
surface. Note that since the lost of the sign, instead of going work is implemented in C++ and Matlab. All experiments were
through exactly zero crossing, our marching cubes sets the run on an 8-core i7 CPU at 2.5GHz.
iso-value as a constant value (0.1m for 2D cases and 0.001m 2 We focus only on gradient computation and omit the exact update process.
for 3D in our experiments). Then, the computed vertices in Interested readers are referred to [5].
10
Fig. 11. A histogram of the angle distribution of the normals in degrees. Top
figure is Log-GPIS and bottom figures is Voxblox.
(a) Raw image of the scene (b) Log-GPIS mesh (c) Voxblox mesh
Fig. 12. 3D odometry and mapping evaluation on the cow and lady dataset. a) shows the raw image of the scene. b) and c) show the reconstructed meshes
of our framework and Voxblox respectively. Note that Voxblox fuses the colour for the final reconstruction. In contrast, we directly take the colour from the
global merged point cloud.
(a) Ground truth distance (b) Log-GPIS distance (c) Voxblox distance (d)
Fig. 13. A horizontal slice (5m × 3m) shows the distance accuracy given Log-GPIS odometry and mapping on the cow and lady dataset. The slice is 0.8m
above the ground. All figures show from the top view.
of-the-art frameworks. Our experiments showed that Log- real-time 3d reconstruction and interaction using a moving depth cam-
GPIS-MOP exhibits comparable results to the state-of-the- era,” in Proc. of the ACM symposium, 2011, pp. 559–568.
[5] M. Zucker, N. Ratliff, A. D. Dragan, M. Pivtoraiko, M. Klingensmith,
art frameworks for localisation, surface mapping and obstacle C. M. Dellin, J. A. Bagnell, and S. S. Srinivasa, “Chomp: Covari-
avoidance, even without using IMU data or loop closure ant hamiltonian optimization for motion planning,” Int. J. of Rob.
detection. Future work lies in loop closure detection to develop Res.(IJRR), pp. 1164–1193, 2013.
a full SLAM solution which will improve long-term robustness [6] V. Reijgwart, A. Millane, H. Oleynikova, R. Siegwart, C. Cadena, and
J. Nieto, “Voxgraph: Globally consistent, volumetric mapping using
in large-scale environments. Further, we would like to recover signed distance function submaps,” IEEE Robotics and Automation
the sign of EDF and develop an efficient implementation that Letters, 2020.
will allow online operation. [7] D. Haehnel, Cyrill Stachniss—Robotics Datasets, Intel Lab, 2019.
[Online]. Available: http://www2.informatik.uni-freiburg.de/∼stachnis/
datasets/
[8] O. Williams and A. Fitzgibbon, “Gaussian process implicit surfaces,”
R EFERENCES 2007.
[1] C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, [9] L. Wu, R. Falque, V. Perez-Puchalt, L. Liu, N. Pietroni, and T. Vidal-
I. Reid, and J. Leonard, “Past, present, and future of simultaneous Calleja, “Skeleton-based conditionally independent gaussian process
localization and mapping: Towards the robust-perception age,” Trans. implicit surfaces for fusion in sparse to dense 3d reconstruction,” IEEE
on Rob., p. 1309–1332, 2016. Robotics and Automation Letters, no. 2, pp. 1532–1539, 2020.
[2] J. Thoma, D. P. Paudel, A. Chhatkuli, T. Probst, and L. V. Gool, [10] W. Martens, Y. Poffet, P. R. Soria, R. Fitch, and S. Sukkarieh, “Geo-
“Mapping, localization and path planning for image-based navigation metric priors for gaussian process implicit surfaces,” IEEE Robotics and
using visual features and map,” in Proc. of CVPR, 2019, pp. 7383–7391. Automation Letters (RA-L), pp. 373–380, 2017.
[3] R. Mur-Artal and J. D. Tardós, “ORB-SLAM2: an open-source SLAM [11] B. Lee, C. Zhang, Z. Huang, and D. D. Lee, “Online continuous
system for monocular, stereo, and RGB-D cameras,” IEEE Transactions mapping using gaussian process implicit surfaces,” in 2019 International
on Robotics (T-RO), vol. 33, no. 5, pp. 1255–1262, 2017. Conference on Robotics and Automation (ICRA), 2019.
[4] S. Izadi, D. Kim, O. Hilliges, D. Molyneaux, R. Newcombe, P. Kohli, [12] J. A. Stork and T. Stoyanov, “Ensemble of sparse gaussian process
J. Shotton, S. Hodges, D. Freeman, A. Davison et al., “Kinectfusion: experts for implicit surface mapping with streaming data,” 2020 IEEE
15