
ISPRS Journal of Photogrammetry and Remote Sensing 196 (2023) 58–72



journal homepage: www.elsevier.com/locate/isprsjprs

A review of non-rigid transformations and learning-based 3D point cloud registration methods
Sara Monji-Azad a,∗, Jürgen Hesser a,b,c,d, Nikolas Löw a
a Mannheim Institute for Intelligent Systems in Medicine (MIISM), Medical Faculty Mannheim, Heidelberg University, Mannheim 68167, Germany
b Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
c Central Institute for Computer Engineering (ZITI), Heidelberg University, Heidelberg, Germany
d CZS Heidelberg Center for Model-Based AI, Heidelberg University, Mannheim, Germany

ARTICLE INFO

Keywords: Point cloud registration; Non-rigid transformation; Quantitative assessments metrics; Robustness; Registration datasets

ABSTRACT

Point cloud registration is a research field where the spatial relationship between two or more sets of points in space is determined. Point clouds are found in multiple applications, such as laser scanning, 3D reconstruction, and time-of-flight imaging, to mention a few. This paper provides a thorough overview of recent advances in learning-based 3D point cloud registration methods with an emphasis on non-rigid transformations. In this respect, the available studies should take various challenges like noise, outliers, different deformation levels, and data incompleteness into account. Therefore, a comparison study on the quantitative assessment metrics and robustness of different approaches is discussed. Furthermore, a comparative study on available datasets is reviewed. This information will help to understand the new range of possibilities and to inspire future research directions.

1. Introduction

Point cloud registration is one of the most fundamental and challenging areas in computer vision. It has a wide range of applications, such as 3D localization (Elbaz et al., 2017), 3D object recognition (Alhamzi et al., 2015), 3D reconstruction (Takimoto et al., 2016), augmented/virtual reality (Mahmood and Han, 2019), generating free-viewpoint videos (Zhang et al., 2021b), tracking (Wang et al., 2019e), and medical imaging (Petricek and Svoboda, 2017). The goal of point cloud registration is to find a transformation between two or more corresponding point sets that minimizes the distance between the transformed source point cloud and the target one (Feng et al., 2019; Ge, 2016). Moreover, the transformation can be defined as a specific transformation model (Besl and McKay, 1992) or as a model-free solution with a displacement vector field (DVF) (Myronenko et al., 2006). Finding the model's optimal parameters is the main task of model-based methods, while estimating the displacement field that transfers the source point set to the target one is the main task of model-free methods (Wang et al., 2019c). A sample of point cloud registration for both 2D and 3D data is shown in Fig. 1.

Traditionally, registration approaches are categorized into rigid and non-rigid transformations. Rigid registration usually estimates rotation and translation parameters, which means the distance between any two points is not changed by the transformation. In contrast, to model a non-rigid transformation, some typical traditional transformation models are considered, such as affine transformation, projection models, and thin-plate spline (TPS) (Chui and Rangarajan, 2003), to mention a few (Bellekens et al., 2015).

Still, in comparison with rigid approaches, estimating the transformation function for non-rigid registration is more challenging. Finding stable corresponding points and using a proper deformation model can be named as the two main challenges of deformable registration (Castellani and Bartoli, 2020).

Rigid and non-rigid point cloud registration, regardless of the transformation function, can be further divided into non-learning and learning-based registration methods. Non-learning registration methods are mostly based on developing an iterative optimization algorithm to estimate the rigid or non-rigid geometric transformation (Horn et al., 2020). In contrast, learning-based methods learn the transformation based on the geometric features of the point clouds.

Point cloud registration thus includes a wide range of topics, which can be divided into different categories. However, this area suffers from challenges such as different deformation levels, different types of noise, different levels of outliers, and

∗ Corresponding author.
E-mail addresses: sara.monjiazad@medma.uni-heidelberg.de (S. Monji-Azad), juergen.hesser@medma.uni-heidelberg.de (J. Hesser),
nikolas.loew@medma.uni-heidelberg.de (N. Löw).

https://doi.org/10.1016/j.isprsjprs.2022.12.023
Received 28 June 2022; Received in revised form 13 December 2022; Accepted 22 December 2022
Available online 3 January 2023
0924-2716/© 2023 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.

data incompleteness (Wang et al., 2020c). Point cloud partial overlapping (Zhao et al., 2021b) and cluttered scenes registration (Tazir et al., 2018) are other challenges that should be handled through registration.

Fig. 1. The output of a point cloud registration method for 2D and 3D point sets. The inputs and outputs of a registration algorithm are shown in the first and second rows, respectively.

There are some existing survey articles on point cloud registration. Each of them studies the available approaches from some specific points of view. A review of point set registration based on pairwise and groupwise registrations is provided in Zhu et al. (2019). Pairwise registration methods use two point sets, while groupwise registration methods take more than two point sets into account. Review papers (Zhang et al., 2020; Tang et al., 2021) have provided an overview of the learning-based point cloud registration methods. Some non-learning methods are studied in Maiseli et al. (2017). In Huang et al. (2021b), both learning and non-learning methods have been studied for non-rigid point clouds from same-source and cross-source domains. Same-source and cross-source refer to the point cloud acquisition approach: when the point sets are captured using similar sensors from different points of view or at various times, the approach is called same-source, while cross-source refers to point sets generated by different types of sensors. In Saiti and Theoharis (2020), a review of 3D multi-modal registration is presented. Multi-modal registration is an area of registration that finds correspondences between different data captured from the same object in different modalities. A study on available learning-based registration methods, which have used different inputs such as 2D/3D meshes, images, voxel grids, and point clouds, is presented in Villena-Martinez et al. (2020). Furthermore, a review of point cloud registration methods in mobile robotics is presented in Pomerleau et al. (2015), and comprehensive studies on laser scanner point clouds are presented in Dong et al. (2020) and Cheng et al. (2018).

As discussed, although there are some existing reviews about registration, each of them presents only some aspects of this problem. Some reviews do not consider the recent advances in deep learning. In another aspect, the reviews which cover novel learning-based methods did not consider significant points, e.g., non-rigid transformations. Focusing on some specific applications, such as medical imaging or laser scanning to mention a few, is another motivation to present a novel review of the point cloud registration topic. Another point that has been studied but can be discussed more comprehensively is how the point cloud registration methods can be evaluated and how they can be robust to certain challenges. Additionally, reviewing the available datasets and their characteristics is another significant matter of the point cloud registration problem that can be addressed in more detail.

Due to all mentioned points, this article focuses on learning approaches as well as non-rigid transformations. This review provides a comprehensive look at the point cloud registration topic from various points of view, which can be summarized as follows:

• Studying learning-based methods: a review of the most well-known and qualified learning-based point cloud registration methods.
• Non-rigid transformation: a study on the existing point cloud registration methods based on non-rigid transformation.
• Review on datasets: a comparative study on available 2D and 3D datasets for point cloud registration and their characteristics.
• Evaluation performance and metrics: a study on evaluation metrics and performance indicators for point cloud registration.
• Challenges and robustness: a study on the challenges of the point cloud registration problem, how methods tackle the challenges, and how robust to challenges they are.

Although tremendous efforts have been made to present a clear view of the point cloud registration problem, there are some aspects that will not be considered in this review. Multi-modal registration methods, partial registration techniques, and multi-view registration approaches are some of them. Nevertheless, the methods that will be discussed in this review can be generalized to the mentioned points.

The rest of this paper is organized as follows: the importance of the study about point cloud registration is discussed as a systematic review in Section 2. Section 3 provides the problem definition of non-rigid point cloud registration. Section 4 outlines a general overview of registration approaches, and Section 5 provides an overview of existing methods. The experimental datasets for point cloud registration are described in Section 6. The evaluation performance and metrics are presented in Section 7. A discussion on point cloud registration and its future outlook is provided in Section 8. Finally, Section 9 concludes the paper.

2. Systematic review

Point cloud registration includes a wide range of research areas and can be used in various applications. As shown in Fig. 2, the number of publications in this research field has increased in recent years. Before discussing the related articles and their approaches, a systematic review is presented in the following. There are different search engines of research repositories which help to provide the statistics of articles based on their topics and years of publication. To elaborate, Dimensions (Dimentions, 2022) is one of the popular sources which collects the results from various databases, e.g., Google Scholar, IEEE Xplore digital library, ScienceDirect, arXiv, and so on. These statistics have been collected using as many articles as possible to show this research area's importance. Given the large amount of data, missing some studies is inevitable.

The number of collected articles shown in Fig. 2 covers both learning and non-learning publications. Different keywords including “point cloud registration” as well as “point cloud alignment”, “point cloud matching”, and so on are used. In this investigation, the publication type is limited to “Articles” and “Proceeding”. Additionally, the field of research is limited to “Information and computing science”, “Artificial intelligence and image processing”, and “Engineering”.

In recent years, using learning approaches to solve computer vision problems has rapidly become more popular. In particular, deep learning methods have been used more often to solve point cloud registration than in the years before 2017. The number of publications between 2017 and 2021 on learning methods based on deep architectures


is shown in Fig. 3. The statistics are collected from Dimensions by using different keywords such as “point cloud registration”, “point cloud alignment” and “point cloud matching” as well as “learning”, “machine learning”, “deep learning”. Finally, a total of 124 articles were studied, which are closely related to learning-based point cloud registration, and then classified according to their network architecture. As shown in Fig. 3, the number of learning-based methods has increased in recent years. Among the available architectures, the convolutional neural network (CNN) and the Graph Neural Network (GNN) are the most frequently used approaches.

Fig. 2. The number of publications per year for both learning and non-learning approaches is demonstrated, and it is shown how the interest in point cloud registration is increasing over the years.

Among all available articles discussed in the systematic review, the methods based on non-rigid transformation are selected to be discussed in more detail. The selected methods can be considered as baseline approaches which have more citations in comparison with the rest of the studies and which are published in reputable journals.

3. Problem definition

Given a source point set $X = \{x_i \mid x_i \in \mathbb{R}^3,\ i = 1, 2, \dots, N\}$ and a target point set $Y = \{y_i \mid y_i \in \mathbb{R}^3,\ i = 1, 2, \dots, M\}$, where $N$ and $M$ are the numbers of points, the goal of non-rigid point cloud registration is to compute the non-rigid transformation function $f : \mathbb{R}^{N \times 3} \to \mathbb{R}^{M \times 3}$, such that the deformed point set $\hat{X} = f(X)$ is as close as possible to the target point set $Y$, i.e., $|Y - \hat{X}| \to 0$.

Recently, there have been various studies on learning-based non-rigid point cloud registration. Commonly, these methods train a model by using training data to predict the non-rigid transformation $f$ with the source and target point clouds $X$ and $Y$ as inputs. Furthermore, there are some other studies that train a network to learn the displacement field. Moreover, a third group of studies relies on learning correspondences for unseen data. Therefore, training the model can be done by searching for model parameters to minimize the loss function or learning the best displacement field by minimizing a distance function.

4. General overview of the existing methods

Considering the available articles, most point cloud registration methods can be addressed from different points of view.

In one categorization, point cloud registration methods can be classified into coarse and fine approaches. To this end, the coarse registration methods compute an initial geometric alignment, while fine registration methods estimate a transformation to solve registration problems as precisely as possible. Iterative Closest Point (ICP) (Besl and McKay, 1992) and the CPD method (Myronenko and Song, 2010) are two common fine approaches.

As the second categorization, feature-based registration methods can be presented. The feature-based methods are mainly used for coarse registration that provides prior estimates for fine methods like ICP. Feature-based approaches can be considered in different phases, which are shown in Fig. 4. These methods extract the key points from the point sets. After extracting feature descriptors, their dissimilarity is calculated. By finding the corresponding points, the equation system can be solved and the transformation parameters can be calculated. There are two additional steps, noise removal and outlier removal, which are optional in this model. Some approaches apply noise removal methods as a pre-processing step to be robust to noise. However, to be robust to outliers, some methods use an outlier-robust approach for finding corresponding points, such as RANSAC (Fischler and Bolles, 1981) and its variants (Fotouhi et al., 2019; Li et al., 2020a, 2021a; Quan and Yang, 2020).

In addition, it is worth mentioning that the schema shown in Fig. 4 can change, because some approaches solve the registration problem without the feature extraction step. Besides the mentioned categorizations, point cloud registration methods can be discussed from different points of view as well:

• Some studies show how robust the methods can be to the different challenges in order to improve the results.
• Another approach is based on the registration area and the search space. Global and local point cloud registration can be discussed in this approach (Zhang et al., 2021a).
• There are several articles that focus on the loss function. Probabilistic methods such as Expectation–Maximization (EM) (Chang and Zwicker, 2009) and the approaches that utilize different dissimilarity or distance metrics are common instances of this group.
• Another group of studies concentrates on optimization issues. Most feature-based approaches can be considered in this group. The approach of these articles can be divided into two categories:
  – Some studies use pre-alignment techniques or an initial guess to prevent the error function from getting stuck in local minima. This can also speed up the convergence rate of optimization, which decreases the running time.
  – Another category consists of optimization techniques in general. Levenberg–Marquardt (Fitzgibbon, 2003) or Graph-cuts (Chang and Zwicker, 2008), to mention a few, are some of the approaches which focus on the optimization problem under different constraints.

Nevertheless, it is worth mentioning that the studied methods, which will be discussed in this review, can fall into more than one of the mentioned categorizations. Due to this fact, the learning-based methods will be categorized based on network architectures. A summary of the studied approaches is presented in Table 1.

5. An overview of learning-based methods

There are some learning architectures that are mostly used in learning-based registration methods, namely CNN, RNN, GNN, and MLP. In the following, an overview of non-rigid registration methods based on learning approaches is given. This article tries to consider the most significant approaches, which have more citations in comparison with the rest of the studies and which are published in reputable journals.

5.1. Convolutional neural network (CNN)

CNN is a class of neural networks that consists of one or more convolutional layers. In other words, CNN is the regularized version of a multi-layer perceptron. Various approaches have been proposed to apply CNN for point cloud registration. However, using CNN for


Fig. 3. An overview of the number of publications for learning-based point cloud registration approaches. The presented statistics include both rigid and non-rigid transformations.

Fig. 4. An overview of feature-based approaches to solve the point cloud registration problem.
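
The stages summarized in Fig. 4 (key points, feature descriptors, correspondence search, and solving for the transformation parameters) can be illustrated with a minimal sketch for the rigid case. It assumes that descriptors are already available, finds correspondences by nearest-neighbor search in descriptor space, and solves for rotation and translation with a closed-form SVD-based least-squares fit. The function names `match_descriptors` and `solve_rigid` are hypothetical, the descriptors are synthetic, and the solver is a standard textbook choice rather than the method of any particular paper reviewed here. Noise and outlier removal (e.g., RANSAC) are omitted.

```python
import numpy as np

def match_descriptors(desc_src, desc_tgt):
    """For each source descriptor, return the index of the nearest target descriptor."""
    # Pairwise squared Euclidean distances between descriptors, shape (N, M).
    d2 = ((desc_src[:, None, :] - desc_tgt[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def solve_rigid(src, tgt):
    """Least-squares rotation R and translation t such that R @ src_i + t ~ tgt_i."""
    c_src, c_tgt = src.mean(axis=0), tgt.mean(axis=0)
    H = (src - c_src).T @ (tgt - c_tgt)  # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Correct a possible reflection so that det(R) = +1.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = c_tgt - R @ c_src
    return R, t

# Toy data: the target is a rotated, shifted, and reordered copy of the source.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
angle = np.pi / 6
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
perm = rng.permutation(len(src))
tgt = (src @ R_true.T + t_true)[perm]

# Assumption: every key point carries a pose-invariant descriptor. It is faked
# here with random vectors shared between corresponding points.
desc_src = rng.normal(size=(50, 8))
desc_tgt = desc_src[perm]

idx = match_descriptors(desc_src, desc_tgt)  # correspondence search
R_est, t_est = solve_rigid(src, tgt[idx])    # transformation parameters
```

With exact correspondences the estimate should reproduce `R_true` and `t_true` up to floating-point error. With real descriptors, wrong matches are expected, which is where the optional outlier-removal step of the pipeline becomes relevant.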

point cloud registration has some challenges. To be more precise, unlike images that are made of a regular grid, point clouds are a set of 3D points without a specific order. Therefore, there are different approaches to overcome these challenges, such as converting the point cloud to a 2D image or 3D volumetric grids (Wu et al., 2019).

ProRegNet is a registration framework based on CNN which is proposed in Fu et al. (2021). It applies biomechanical constraints to improve registration accuracy. ProRegNet has three main steps: volumetric point cloud generation from input images, preparation of the training dataset based on the finite element (FE) method, and the registration learning phase. In the first step, a CNN network is used for MR and TRUS prostate segmentation based on Lei et al. (2019) and Wang et al. (2019b). Then, the segmented ROIs are meshed into tetrahedron elements to extract the volumetric prostate point cloud. For the second step, which is the training dataset generation, the finite element model is defined. To this end, a surface point correspondence is established. Then, surface point registration is done by predicting point cloud deformation based on finite element modeling and biomechanical constraints.

A volume-to-surface registration network (V2S-Net) using a biomechanical model is proposed in Pfeiffer et al. (2020). Here, the CNN performs the registration step without searching for corresponding points. To achieve this purpose, V2S-Net generates random meshes of different organs synthetically to increase the network generalization. To elaborate, V2S-Net registers the preoperative liver volume mesh to the intraoperative point cloud of a partial liver surface. The meshes are used as network input and are presented in the form of distance fields on a regular grid. Learning the surface representation makes it possible to solve the registration problem without a requirement for a prior alignment.

In another non-rigid registration study (Wang et al., 2019a), the authors propose PR-Net for non-rigid registration based on the CNN and learning the formulation of the correlation between source and target point clouds. PR-Net learns the global and local features based on the voxelized point cloud, which is called the shape descriptor tensor in their paper. The authors discuss that the defined shape descriptor tensor is efficient and effective for extracting geometric features of shapes.

In Wei et al. (2016), a non-rigid surface registration is proposed which finds dense correspondences in the form of two depth maps of surfaces. To achieve this purpose, the surface depth map and 204 points of extracted curves are extracted. They report that using feature descriptors on depth map pixels can lead to more accurate results for registration.

DispVoxNets (Shimada et al., 2019) is a non-rigid registration approach that converts point clouds to voxel grid representations. The approach has two main steps: displacement estimation and refinement. The method implements a U-Net-style CNN architecture. In the first step, the large global deformation is predicted. In the second step, the displacement vectors are refined to resolve small displacements.

PFCNN (Yang et al., 2020) is a customized convolutional network that is proposed for different applications such as segmentation, classification, and registration. PFCNN is a mesh grid-based convolution framework. In this article, a new translation structure is proposed which is similar to surface convolutions to learn features for the mentioned applications. For non-rigid registration, PFCNN classifies each input


Table 1
Overview of some learning-based registration methods.

Approach name (Ref) | Year | Network architecture | Code availability | Experimental data
ProRegNet (Fu et al., 2021) | 2021 | CNN | Not yet | Medical dataset
CPD-Net (Wang et al., 2019c) | 2019 | PointNet - MLP | https://github.com/nyummvc/CPD-Net | Chui-Rangarajan (Chui and Rangarajan, 2000), 3D face, 3D cat, ShapeNet (Chair) (Chang et al., 2015)
V2S-Net (Pfeiffer et al., 2020) | 2020 | CNN | https://gitlab.com/ncttsopublic/Volume2SurfaceCNN | Medical dataset
RMA-Net (Feng et al., 2021) | 2021 | RNN | https://github.com/WanquanF/RMA-Net | Human motion (Vlasic et al., 2008), TOSCA (Bronstein et al., 2008), FaceWareHouse (Cao et al., 2013), ModelNet (Wu et al., 2015), SURREAL (Varol et al., 2017)
PR-Net (Wang et al., 2019a) | 2019 | CNN | https://github.com/Lingjing324/PR-Net | Chui-Rangarajan (Chui and Rangarajan, 2000), 3D face, 3D cat
(Hansen and Heinrich, 2021) | 2021 | DGCNN | https://github.com/multimodallearning/deep-geo-reg | Medical dataset
(Wei et al., 2016) | 2016 | CNN | https://github.com/halimacc/DenseHumanBodyCorrespondences | SCAPE (Anguelov et al., 2005), Yobi3D (model search engine, 2022), MIT (Tevs et al., 2012)
DispVoxNets (Shimada et al., 2019) | 2019 | CNN | https://vcai.mpi-inf.mpg.de/projects/DispVoxNets/ | Thin plate (Golyanik et al., 2018), FLAME (Li et al., 2017), Dynamic FAUST (DFAUST) (Bogo et al., 2017), cloth (Bednarik et al., 2018)
GP-Aligner (Wang et al., 2020b) | 2020 | MLP | Not yet | ShapeNet (Chang et al., 2015)
(Hansen et al., 2019) | 2019 | DGCNN | Not yet | Medical dataset
FPT (Baum et al., 2022) | 2022 | PointNet, MLP | Not yet | ModelNet (Wu et al., 2015)
SyNoRiM (Huang et al., 2022a) | 2022 | CNN | https://github.com/huangjh-pub/synorim | Ma et al. (2020), Clothcap (Pons-Moll et al., 2017), 4dcomplete (Li et al., 2021c), Deepdeform (Bozic et al., 2020), SAPIEN (Xiang et al., 2020)
(Trappolini et al., 2021) | 2021 | Transformer | https://github.com/GiovanniTRA/transmatching | SURREAL (Varol et al., 2017), FAUST (Bogo et al., 2014), SHREC'19 (Melzi et al., 2019)
NDP (Li and Harada, 2022b) | 2022 | MLP | https://github.com/rabbityl/DeformationPyramid | 4DMatch/4DLoMatch (Li and Harada, 2022a)
NrtNet (Hu et al., 2022) | 2022 | DGCNN, Transformer | Not yet | SURREAL (Varol et al., 2017), SHREC'19 (Melzi et al., 2019), MIT (Grosse et al., 2009)
PFCNN (Yang et al., 2020) | 2020 | CNN | https://github.com/msraig/pfcnn | FAUST (Bogo et al., 2014)
(Trimech et al., 2020) | 2020 | CNN | Not yet | ModelNet (Wu et al., 2015)
ResNet-LDDMM (Amor et al., 2021) | 2021 | Residual networks | Not yet | SHREC'19 (Melzi et al., 2019)
(Netto and Oliveira, 2022) | 2022 | DGCNN | Not yet | ModelNet (Wu et al., 2015), TOSCA (Bronstein et al., 2008), Human motion (Vlasic et al., 2008)
PR-GAN (Tang and Zhao, 2021) | 2021 | GAN | Not yet | Chui-Rangarajan (Chui and Rangarajan, 2000)
(Shi et al., 2021) | 2021 | PointNet, MLP | Not yet | Medical dataset
FlowNet3D (Liu et al., 2019) | 2019 | PointNet++ | https://github.com/xingyul/flownet3d | FlyingThings3D (Mayer et al., 2016), KITTI (Geiger et al., 2012)
FlowNet3D++ (Wang et al., 2020d) | 2020 | PointNet++ | Not yet | FlyingThings3D (Mayer et al., 2016), KITTI (Geiger et al., 2012), KillingFusion (Slavcheva et al., 2017)
Lepard (Li and Harada, 2022a) | 2022 | Transformer | https://github.com/rabbityl/lepard | 3DMatch (Zeng et al., 2017), 3DLoMatch (Huang et al., 2021a)
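
Several entries in Table 1 list PointNet or MLP backbones. The property that makes such networks applicable to unordered point sets, a per-point MLP with shared weights followed by a symmetric pooling operation, can be sketched as follows. This is a minimal illustration with random, untrained weights and is not the architecture of any method in the table. The function name `global_feature` and the layer sizes are hypothetical.

```python
import numpy as np

def global_feature(points, W1, b1, W2, b2):
    """Shared per-point MLP followed by max pooling over all points."""
    h = np.maximum(points @ W1 + b1, 0.0)  # the same weights are applied to every point
    h = np.maximum(h @ W2 + b2, 0.0)
    return h.max(axis=0)                   # symmetric pooling: independent of point order

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 16)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 32)), rng.normal(size=32)

cloud = rng.normal(size=(100, 3))
shuffled = cloud[rng.permutation(len(cloud))]

f_a = global_feature(cloud, W1, b1, W2, b2)
f_b = global_feature(shuffled, W1, b1, W2, b2)
```

Because the maximum is taken over the set of per-point features, `f_a` and `f_b` should agree up to floating-point error, so reordering the input points does not change the descriptor.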

mesh vertex into its corresponding vertex on the template mesh. This approach is similar to methods like (Fey et al., 2018; Boscaini et al., 2016). In the second step, a pipeline is presented including a sequence of convolutions for PFCNN as well as a two-level network for MDGCNN (Poulenard and Ovsjanikov, 2018). MDGCNN is a patch-based multi-directional geodesic CNN.

A 3D facial expression recognition based on non-rigid point cloud recognition is proposed in Trimech et al. (2020). The approach includes a pre-processing step. To this end, 70 landmarks are extracted as well as 204 points of extracted curves. Furthermore, they use SIFT for 3D keypoint extraction. All mentioned features are used as the input of a non-rigid CPD registration inspired by Trimech et al. (2016). Finally,


Fig. 5. A taxonomy of graph convolutional networks from Zhang et al. (2019). This schema provides two points of view, GCNN in the application areas and GCNN based on convolutional types.
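
A spatial graph convolution in the sense of the taxonomy in Fig. 5 aggregates features over each node's neighborhood. The following minimal sketch builds a KNN graph on a point cloud and applies one edge-convolution-style layer, where each edge feature is computed from a point and the offset to one of its neighbors and the features are max-pooled over the neighbors. The weights are random and untrained, the names `knn_indices` and `edge_conv` are hypothetical, and the layer is a simplified illustration rather than the implementation of any reviewed method.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point (the point itself excluded)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def edge_conv(points, k, W, b):
    """One layer: max over neighbors j of ReLU([x_i, x_j - x_i] @ W + b)."""
    idx = knn_indices(points, k)                         # (N, k)
    center = np.repeat(points[:, None, :], k, axis=1)    # (N, k, 3)
    offset = points[idx] - center                        # (N, k, 3)
    edge_in = np.concatenate([center, offset], axis=2)   # (N, k, 6)
    edge_feat = np.maximum(edge_in @ W + b, 0.0)         # (N, k, C)
    return edge_feat.max(axis=1)                         # (N, C)

rng = np.random.default_rng(0)
cloud = rng.normal(size=(64, 3))
W, b = rng.normal(size=(6, 16)), rng.normal(size=16)

feats = edge_conv(cloud, k=8, W=W, b=b)

# Reordering the points only reorders the rows of the output.
perm = rng.permutation(len(cloud))
feats_perm = edge_conv(cloud[perm], k=8, W=W, b=b)
```

Since the maximum is taken over the neighbor set, the result does not depend on the order in which neighbors are listed, and permuting the input points permutes the output rows accordingly.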

the accuracy of the proposed method is reported for facial expression recognition (FER), while the accuracy of the registration step is not reported.

5.2. Graph convolutional neural network (GCNN)

As discussed before, in the basic CNN, the main elements are convolutional and pooling layers. These components in CNN can be operated on structured data like images. In the literature, two important characteristics are considered for structural data, namely Euclidean and grid-like structures (Bronstein et al., 2017). Considering the non-Euclidean characteristic of graphs, the convolutions and filtering on graphs are not as simple as on images (Zhang et al., 2019). In Shuman et al. (2013), an overview of signal processing on graphs is presented, including different approaches for graph analysis. To present an overview of graph convolutional neural networks, a taxonomy of convolution types and application areas is shown in Fig. 5. This taxonomy is taken from Zhang et al. (2019). Regarding the mentioned taxonomy, several aspects can be considered for GCNN. In this respect, the spectral-based graph convolutions are defined based on the graph Fourier transformation. Spatial-based graph convolutions focus on the nodes and their neighborhood representations. GCNNs are neural network architectures which take the graph structure and neighboring nodes into account in a convolutional manner. Recently, there have been several studies on using GCNNs for point cloud applications. The main idea is to learn geometric features which are extracted from point clouds with neighborhood relations. This information is commonly defined on KNN graphs, semantic labeling, and so on.

A key point-based method is presented in Hansen and Heinrich (2021) for medical purposes. Point clouds are extracted from the input scan pair. Then, edge convolutions as a geometric feature are combined with differentiable loopy belief propagation (LBP) (Ihler et al., 2005). LBP is applied for the regularization of displacements on a KNN graph. In this way, 3D lung registration is solved as a geometric alignment of two point sets. The GCNN structure used in Hansen and Heinrich (2021) is inspired by the EdgeConv network architecture (Wang et al., 2019). EdgeConv is one of the most applicable methods which captures the graph's local geometric structure. EdgeConv generates edge features that represent the relation between the points and their neighbors. This method is robust to the ordering of neighbors and is permutation invariant.

In another study (Hansen et al., 2019) from the same authors as Hansen and Heinrich (2021), another dynamic GCNN is proposed. In the first step, the approach uses EdgeConv and a similar architecture to learn robust correspondences between the source and target point sets. The second step is a probabilistic refinement, in which the learned features are applied to improve the prior probabilities used in the CPD algorithm.

SyNoRiM (Huang et al., 2022a) is a registration approach that uses a point cloud graph as an input of a CNN network. In their problem formulation, a fully connected graph is assumed. The graph vertices are input point clouds and the edges represent the graph connectivity. SyNoRiM computes the pairwise correspondences, which are parameterized using functional maps. Although the SyNoRiM approach could handle the occlusion problem by learning non-orthogonal basis functions to regularize deformations, the approach still suffers from

5.3. MLP and PointNet-based

Multilayer perceptron (MLP) is one of the network architectures which is widely used in studies and is common for learning point cloud features. However, the problem of processing unordered point sets is a fundamental matter in this area. Hence, any network architecture should take this problem into account. To this purpose, the PointNet algorithm (Qi et al., 2017a) is a deep learning architecture based on MLP that is able to directly consume the 3D point cloud. The PointNet algorithm can be used for classification and semantic segmentation tasks by handling the unordered structure of its inputs. Considering the fact that PointNet cannot present high accuracy to pinpoint local changes, PointNet++ (Qi et al., 2017b) is presented to learn local features with increasing contextual scales. PointNet (Qi et al., 2017a) and PointNet++ (Qi et al., 2017b) are used in many learning-based registration methods as the feature descriptor for point clouds. These approaches are pioneering methods for point cloud processing. Their results can be used as the input of point cloud registration methods. PointNet is based on the multilayer perceptron (MLP). Therefore, in contrast to CNN, the input representation is not a challenge. Later, PointNet++ was proposed as an extension of PointNet that provides hierarchical feature aggregation for points. Furthermore, PointNetLK (Aoki et al., 2019) is another well-known method based on shape feature learning and PointNet. PointNetLK is a rigid registration approach that learns the global feature representation of point sets. The approach applies the Lucas–Kanade algorithm (LK) (Lucas et al., 1981) for 3D point cloud registration in an iterative process with PointNet. The method was improved in Li et al. (2021b) by using an analytical Jacobian matrix and decomposing it into feature and warp components. They use PointNet and the LK algorithm in one recurrent neural network.

GP-Aligner (Wang et al., 2020b) is an approach to solve non-rigid groupwise point cloud registration. More specifically, the first step of GP-Aligner is an optimized Group Latent Descriptor (GLD) to characterize the groupwise relation between a group of point sets. The GLD is defined as a Gaussian distribution and is then concatenated with the coordinates of the points. Finally, after optimizing the GLD results, an MLP network is used to learn coherent drifts as the desired transformation.

Neural deformation pyramid (NDP) (Li and Harada, 2022b) is another approach based on the MLP architecture. In NDP, a pyramid architecture is proposed in which each pyramid level contains a multilayer perceptron. Each level of this hierarchical approach takes a sinusoidally encoded 3D point as input and outputs the motion increments from the previous level. The authors show that the frequency of the sinusoidal function can represent non-rigidity. Based on their definition, low frequencies represent rigid motion, while high frequencies produce more fluctuations and are therefore suitable for representing highly non-rigid motion.

In another study, an unsupervised learning method of the geometric non-rigid transformation based on coherent point morph drift (CPD-Net) is proposed in Wang et al. (2019c). CPD-Net has three main steps. Learning the global shape descriptor is the first step based on the
scaling to large scenes (Huang et al., 2022b). PointNet network. For the second step, the method concatenates the


achieved descriptors to the source point cloud. Finally, by using another MLP architecture, the drifts of each source point are learned.

Another approach based on MLP and PointNet is proposed in Shi et al. (2021). This method estimates point drifts by learning the displacement field function, omitting an additional iterative optimization process. Furthermore, FPT (Baum et al., 2022) is a multi-modal biomedical method based on two main steps. The first step is a global feature extractor based on the PointNet architecture and the second step is a point transformer network based on Baum et al. (2021). This network learns a unique displacement vector without any smoothness or coherence constraints.

Another registration method, called FlowNet3D (Liu et al., 2019), learns scene flow in 3D point clouds. In the FlowNet3D article, two applications of the scene flow output are presented: scan registration and motion segmentation. To indicate the motion between two frames, FlowNet3D estimates a translational flow vector for every point in the first frame individually. To this aim, the authors propose two new learning layers. One is the flow embedding layer, which learns the correlation of two point sets. The other is the upconv layer, which learns to propagate features from one point set to the other one. However, in Guo et al. (2020) two problems with the FlowNet3D approach are mentioned. One is that FlowNet3D cannot predict all motion vectors correctly, so the direction of some motion vectors differs significantly from the ground truth. The second problem is that FlowNet3D cannot handle non-static scenes like deformable objects. Hence, FlowNet3D++ (Wang et al., 2020d) is proposed to solve the mentioned problems. It minimizes the angle between the predicted motion vectors and the ground truth ones and, for the second problem, uses a point-to-plane distance loss to achieve better accuracy for dynamic scenes.

5.4. Transformers

A transformer is a deep learning model introduced to process sequential input data in a pioneering work (Vaswani et al., 2017). The transformer mechanism is quite similar to the recurrent neural network (RNN); however, in contrast to an RNN, a transformer processes the whole input at once. As an example, BERT (Devlin et al., 2018) is one of the most important approaches which shows how transformers can be used in the field of natural language processing. For computer vision applications, Dosovitskiy et al. (2020) show transformers’ capability in comparison with CNNs. The pioneering transformer approaches that use point clouds as input are presented in Zhao et al. (2021a) and Guo et al. (2021), to mention a few. Although there are different studies on using transformers for classification or segmentation, the number of studies on point cloud registration based on transformers is limited. One of the first methods using transformers for rigid point cloud registration is deep closest point (DCP) (Wang and Solomon, 2019a). DCP contains three steps. Firstly, a point cloud embedding network for feature extraction is suggested, which is based on dynamic GCNN. In the second step, an attention-based module captures the relationship between source and target point clouds. As the last step, a differentiable singular value decomposition is used to calculate the rigid transformation, which includes rotation and translation. Furthermore, for computing the feature representation, DCP applies a transformer network (Vaswani et al., 2017) to provide global and local information. Since the focus of this article is on non-rigid registration, some registration approaches based on the transformer architecture are discussed in the following.

A non-rigid registration approach is proposed in Trappolini et al. (2021) which is based on an auto-encoder architecture and a transformer network. The transformer is applied as a geometric translator between two point sets. Furthermore, the attention mechanism named surface attention is modified to make it aware of the underlying density of the geometry. Considering the refinement procedure in the transformer network architecture, this approach suffers from long training and post-processing time as well.

NrtNet is another non-rigid registration method based on transformers proposed in Hu et al. (2022). NrtNet uses a dynamic GCNN architecture as a feature extractor and then learns correct correspondences using a transformer-based framework. The authors claim that NrtNet is able to learn correspondences between two dense point clouds. Then, the last module calculates the relative drift of the point pairs to solve the registration problem.

Lepard (Li and Harada, 2022a) is another registration method based on transformers. Lepard is known as a position-aware feature-matching method. The proposed method is built on a fully convolutional feature extractor network, called KPFCN (Thomas et al., 2019). Furthermore, for the matching step, a transformer architecture with self- and cross-attention is used, which is inspired by Vaswani et al. (2017). Finally, a differentiable matching step is performed to solve the registration problem, following Sarlin et al. (2020) and Sun et al. (2021).

5.5. Other network architectures

As discussed, although several deep-learning architectures are most popular in recent studies, the existing approaches are not limited to the mentioned ones. In the following, a few more non-rigid registration methods based on different architectures are discussed, namely, generative adversarial network (GAN), recurrent neural network (RNN), and deep residual networks (ResNet).

GAN was first presented in Goodfellow et al. (2014). The GAN architecture is formed of two neural networks that compete with each other, namely a generator and a discriminator. This contest is in the form of a zero-sum game, in which one agent’s gain is another agent’s loss. The generator produces fake samples, while the discriminator tries to distinguish the generated fake samples from the real data. This process continues until the discriminator can no longer distinguish the fake data. Another GAN version, called conditional GAN, is proposed in Mirza and Osindero (2014). This architecture allows adding some constraints (e.g., class labels) to the generated data, according to the model’s expected outputs.

In Tang and Zhao (2021), a conditional GAN for non-rigid point set registration is proposed, which is called PR-GAN. In the first step, the proposed method uses an autoencoder to overcome the challenge posed by the irregular format of point clouds. The output of the autoencoder is used for the PR-GAN architecture. The generator in PR-GAN aims to generate the parameters of the geometric transformation, while the discriminator forces the generated parameters to register the two point clouds accurately.

RNN is a type of neural network which is designed to interpret sequential information. There are different variants of RNN, e.g., long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997) and gated recurrent unit (GRU) (Cho et al., 2014). They are suitable for processing time series data. However, using an RNN as an architecture for point cloud processing carries some challenges. Firstly, an RNN learns a one-dimensional vector that is not suitable to represent the entire point cloud. One solution is to make a flat representation of the point cloud. However, this can change the data structure and cause the omission of the local structure. Extending the one-vector representation to a two-dimensional representation is another solution. Secondly, because the point cloud is unordered, concatenation cannot be applied directly. One solution is to concatenate the point feature, the state of the neighbor, and the displacement from the neighbor to the point, which can be done for each neighbor separately (Fan and Yang, 2019).

Estimating a non-rigid transformation as a combination of N rigid transformations based on a recurrent network is proposed in Feng et al. (2021) (RMA-Net). RMA-Net defines a skinning weight for each point to show the influence of the rigid transformation. Therefore, the


skinning weight and a rigid transformation are estimated by the RNN at each iteration. The paper shows that the proposed method is appropriate for both rigid and non-rigid transformations.

The residual network (ResNet) (He et al., 2016) is an architecture based on a technique called skip connection. A skip connection skips some layers to connect a layer to a later one, which forms a residual block. This can solve two problems, namely vanishing gradients and degradation (accuracy saturation). For non-rigid point cloud registration, an approach called ResNet-LDDMM is proposed in Amor et al. (2021). ResNet-LDDMM is a ResNet-based diffeomorphic registration approach to register 3D shapes and 3D point clouds. This approach solves the non-stationary ordinary differential equation (ODE) (flow equation) based on Euler’s discretization scheme. Since ResNet speeds up the learning process by reducing the impact of vanishing gradients, the ResNet-LDDMM method takes advantage of this fact to reduce the computation cost.

6. Datasets

To show the accuracy of point cloud registration methods and to study how robust they are to the existing challenges, several datasets with different characteristics have been published. These datasets can provide information such as color and depth of point clouds, or 3D geometric information such as pose, alignments, parts, key points, or symmetry of point clouds. Some datasets provide language-related or semantic annotations of point sets and some others provide segmentation data. It is important to mention that this information is not available for all datasets at the same time. The dataset can be selected depending on which challenges are most important for a method to solve. In addition, the specific characteristics of the datasets allow them to be used for applications other than point cloud registration, such as reconstruction, segmentation, and recognition.

However, despite all these useful details, the lack of some information, e.g., the models’ complexities or deformation levels, makes a fair comparison between the state-of-the-art methods difficult. As a result, the articles have customized the available benchmarks. Depending on which challenges are going to be solved, a new customized dataset is generated. For instance, to report how robust an algorithm is to noise or outliers, some noise or outliers are added to the dataset. As another example, the studied articles generate their own datasets with different deformation levels.

As shown in Table 1, the studied approaches use different datasets. It is worth mentioning that only a few benchmark datasets are available for point cloud registration. However, there are multiple datasets generated for other purposes, like reconstruction or matching, which can be used for registration purposes as well.

The most frequently used datasets of the methods studied in Section 5 are discussed in the following. The datasets can be categorized into two main groups: real-world datasets and synthetic ones. The characteristics of the most used datasets as well as the benchmark ones are provided in Table 2.

Table 2
Some available benchmarks for point cloud registration.
Dataset name (Ref) | Year | Synthetic/Real | Included data | URL
3DLoMatch (Huang et al., 2021a) | 2021 | Real | A collection of 62 scenes, with overlap ratios between 10 and 30 percent | https://overlappredator.github.io/
Faust (Bogo et al., 2014) | 2014 | Real | 300 triangulated meshes of 10 various subjects, five male and five female, each in 30 different poses | http://faust.is.tue.mpg.de/overview
SCAPE (Anguelov et al., 2005) | 2005 | Synthetic | 71 registered meshes of a particular person in different poses | http://ai.stanford.edu/~drago/Projects/scape/scape.html
4DMatch/4DLoMatch (Li and Harada, 2022a) | 2022 | Synthetic | Randomly selected 1761 sequences from DeformingThings4D; 4DLoMatch has low overlap in comparison with 4DMatch | https://github.com/rabbityl/lepard

6.1. Real-world datasets

Datasets that represent objects in a real scene and provide some knowledge about them are called real-world datasets. Usually, they are captured by navigating robots equipped with cameras, LiDAR scanners, RGB-D cameras, or some other depth cameras (Li et al., 2019). A real-world dataset can include large or small-scale areas or indoor/outdoor environments (Handa et al., 2016). In the following, some real-world datasets that are common for point cloud registration in the studied articles are discussed.

Cloth (Bednarik et al., 2018) is one of the used datasets. It contains different states of deformable objects including paper, sweaters, hoodies, cloth, and T-shirts. It presents non-linear deformations between the source and target surfaces. Meshes, normal vectors, and depth maps are available for the Cloth dataset. Deepdeform (Bozic et al., 2020) is another dataset that includes partial views captured with a real RGB-D camera. KITTI (Geiger et al., 2012) is an outdoor real-world dataset captured by using a Velodyne HDL64 3D LiDAR scanner in Karlsruhe, Germany. There are 11 sequences, which include the grayscale dataset, color dataset, Velodyne laser data, calibration files, and ground truth poses. The available information makes this dataset usable for 3D object detection and 3D tracking, too. Furthermore, KillingFusion (Slavcheva et al., 2017) is another dataset that is used for registration, although it is a benchmark for non-rigid 3D reconstruction without correspondences.

SHREC is the name of two different datasets, namely SHREC’19 (Melzi et al., 2019) and SHREC’20 (Dyke et al., 2020), which are widely used both for matching and registration problems. SHREC datasets are generated for shape correspondence under isometric and non-isometric deformations. SHREC’19 (Melzi et al., 2019) includes real scans of humans under various deformations, namely articulating, bending, stretching, and topologically changing, as well as the ground truth correspondences. In comparison with SHREC’19 (Melzi et al., 2019), SHREC’20 (Dyke et al., 2020) includes a set of synthetic models and real-world scans, e.g., four-legged animals.

In the following, some datasets that are known as benchmarks for registration are presented.

3DMatch (Zeng et al., 2017) is a dataset of real-world scenes. It is provided in two categories: keypoint matching benchmark and geometric registration benchmark. The keypoint matching benchmark contains both 2D and 3D RGB-D patches of some base-lined correspondences. Depth and color information is available for this dataset, which includes 62 RGB-D scenes. The geometric registration benchmark also includes real-world RGB-D data scanned by Microsoft Kinect and Intel RealSense. It includes eight sets of scene fragments. Each fragment is a 3D point cloud of a surface made of 50 depth frames. Besides registration, this dataset can be used for reconstruction, model alignment, and surface correspondence. 3DLoMatch (Huang et al., 2021a) is another real-world dataset for the registration of 3D point clouds with overlap ratios between 10 and 30 percent.

Faust (Bogo et al., 2014) is another real-world scene dataset. It contains 300 triangulated meshes of 10 various subjects, five males and


five females. Each subject is scanned in 30 different poses. This dataset represents real human meshes with ground-truth correspondences. A full-body 3D stereo capture system composed of 22 modular scanning units is used to produce the Faust dataset. The number of vertices is about 6890. Dynamic FAUST (DFAUST) (Bogo et al., 2017) is a benchmark for registering human bodies in motion. DFAUST includes scanned meshes of humans in different positions.

6.2. Synthetic datasets

A synthetic dataset is information artificially generated using simulations or computer algorithms. Synthetic datasets can serve as ground truth. They are therefore ideal for applications in which real-world datasets are difficult to generate. For instance, in surgical applications, generating a real-world dataset can be difficult, so synthetic datasets can be an appropriate replacement.

Considering that real-world datasets pose challenges such as noise, camera distortion, etc., synthetic datasets are generated to provide a benchmark for point cloud registration methods. To this aim, several datasets have been generated for different 3D topics, like reconstruction, segmentation, or matching.

One of the oldest popular datasets for deformable object registration is Chui–Rangarajan (Chui and Rangarajan, 2000). Chui–Rangarajan is a synthetic dataset that includes two different point clouds: Chinese characters with 105 points and fish shapes with 98 points.

TOSCA (Bronstein et al., 2008) is another synthetic dataset that collects 3D non-rigid shapes in different poses. It contains 80 objects and the number of vertices for each object is about 50000. The dataset includes shapes of different animals like cats, dogs, wolves, horses, centaurs, and gorillas, and some human figures, namely a female and two different male figures in different poses.

MIT (Tevs et al., 2012) is a synthetic dataset that contains three various characters from different animation sequences. The ground truth of correspondence points is available, which can be used in the matching problem. Furthermore, Thin plate (Golyanik et al., 2018) is a dataset that contains various synthetic isometric surfaces. Also, there is a free 3D model search engine called Yobi3D (model search engine, 2022), which is used in Wei et al. (2016). FlyingThings3D (Mayer et al., 2016) is a large dataset for disparity, optical flow, and scene flow estimation. This dataset is used for learning scene flow in 3D point clouds in some methods like Liu et al. (2019) and Wang et al. (2020d). Although the main goal of these methods is learning scene flow estimation, non-rigid registration is the main step in these algorithms.

One of the most popular topics for registration concerns the human body, human clothing, or human facial expressions. Since these areas need to take deformation difficulties into account, the related datasets are appropriate to show the accuracy of non-rigid registration methods. Human motion (Vlasic et al., 2008) is a dataset that provides articulated mesh animation from multi-view silhouettes. The captured meshes are in full correspondence, which makes the dataset usable for matching problems as well as matching-based registration methods. Some information like texture and deformation transformation is also available. Furthermore, FaceWareHouse (Cao et al., 2013) is a dataset that includes 150 different face shapes with 47 various expressions. Also, FLAME (Li et al., 2017) is another one that consists of different captured human facial expressions, including 10k face meshes. Clothcap (Pons-Moll et al., 2017) is a synthetic dataset that consists of clothed humans scanned by a 3dMD scanner. 4dcomplete (Li et al., 2021c) is a dataset for non-rigid motion estimation which is also used for registration purposes.

SURREAL (Varol et al., 2017) is one of the available benchmarks for matching problems that provides the corresponding ground truth. To this end, the photo-realistic synthetic images and their corresponding ground truth are available, which can be used for learning pixel-wise classification as well as matching and non-rigid registration problems like (Trappolini et al., 2021; Hu et al., 2022). Human part segmentation and depth estimation are other applications for this dataset.

SCAPE (Anguelov et al., 2005) is a benchmark for the matching problem and non-rigid registration that provides full object correspondences. 71 registered meshes of a particular person in different poses are available in this dataset.

4DMatch/4DLoMatch (Li and Harada, 2022a) are two benchmarks for registration and matching problems, for use in both rigid and deformable scenes. 4DMatch is a partial point cloud benchmark while its low overlap version is called 4DLoMatch. 4DMatch is built from the sequences of the Li et al. (2021c) dataset. The ground truth dense correspondence is available as well. The availability of time-varying geometry in both datasets makes it possible to show more challenges in matching and registration applications.

7. Evaluation performance

As discussed before, finding a transformation between two or more point clouds while overcoming the challenges involved is the main goal of point cloud registration methods. Some of these challenges have been mentioned in the previous sections. In this section, the challenges are discussed in detail, and it is shown how methods can evaluate their robustness to them. The performance of point cloud registration methods is reported through different criteria. Quantitative assessment metrics and robustness are the most common criteria. These are explained in detail in the following. Moreover, a summary of the performance evaluation of the studied papers is presented in Table 3.

7.1. Quantitative assessment metrics

The quantitative analysis of an algorithm and the assessment of its performance are the most significant measurements to evaluate a method. As discussed in the problem definition in Section 3, the loss function and distance metric should be minimized. The available point cloud registration methods use different quantitative assessment metrics to show their performance. Some of them are based on distance metrics while others are based on transformation errors. Besides reporting distance and loss function, some articles report matching errors to show how robust the methods are to outliers and noise. These types of error metrics are studied in the robustness section.

The quantitative assessment metrics as well as robustness to the common challenges for learning-based and non-rigid 3D point cloud registration are shown in Fig. 6.

Quantitative assessment metrics based on distance: some methods calculate the distance between the transformed source point clouds and target ones and then report it as a quantitative evaluation (Peterlík et al., 2018). Chamfer distance (CD) is one of the metrics that is widely used in papers. The mean neighbor distance, the Hausdorff distance (HD), the Euclidean distance, the chordal distance, the Earth-Mover distance (EMD), and the Mahalanobis distance are some common distance measurements in the articles. In Fu et al. (2021), the Hausdorff distance (HD) and the mean surface distance (MSD) are used to calculate the distance between two input surfaces. In Liang et al. (2018), a distance metric based on a combination of the distance term and the energy term is used. The registration loss in Pais et al. (2020) is calculated based on misaligned points between the source and the transformed point clouds. In Liu et al. (2019) and Wang et al. (2020d), the L2 distance between the estimated flow vector and the ground truth flow vector is used to report the accuracy. Furthermore, according to the similarity of the source and target point clouds after deformation, the distance loss is defined as L_dis in Hu et al. (2022). There is another distance metric


which is called geodesic distance. A geodesic is a curve representing the shortest path between two points on a surface. Some articles like (Yang et al., 2020; Amor et al., 2021) use the geodesic distance to report their methods’ accuracies.

The root-mean-squared error (RMSE) is another common approach to report the error of the registration method. The mean squared error (MSE) and the mean absolute error (MAE) are two other metrics which can be used to estimate the algorithm’s error. The root-mean-squared distance (RMSD) is a distance metric proposed in Hirose (2020) for evaluating the method. Mean displacement error (MDE) and mean target displacement (MTD) are two other measurements that are introduced in Pfeiffer et al. (2020). Likewise, in Hansen et al. (2019) target registration error (TRE) is proposed to evaluate registration accuracy. This error is called end-point error (EPE) in some articles like FlowNet3D (Liu et al., 2019) and FlowNet3D++ (Wang et al., 2020d). ResNet (Amor et al., 2021) reports the mIoU, which is the class mean intersection over union, as well as the mA, which is the class mean accuracy. Furthermore, ResNet presents the oA to report the overall accuracy. The mean isotropic error (MIE) is another error measurement that is reported in Netto and Oliveira (2022). Furthermore, to present the average angle deviation error, ADE is reported in FlowNet3D++ (Wang et al., 2020d).

Table 3
Evaluation performance and metrics of some learning approaches.
Approach name (Ref) | Year | Evaluation metrics | Robustness
ProRegNet (Fu et al., 2021) | 2021 | Hausdorff distance, MSD, Mean, STD | Outliers, Noise
CPD-Net (Wang et al., 2019c) | 2019 | Chamfer distance | Deformation levels, Noise, Outliers, Incompleteness
V2S-Net (Pfeiffer et al., 2020) | 2020 | MDE, MTD | Noise
RMA-Net (Feng et al., 2021) | 2021 | MSE, RMSE, MAE, Chamfer distance, EMD | Deformation level, Outliers, Incompleteness
(Wei et al., 2016) | 2016 | Mean | Partial overlapping, Outlier, Data density variation, Deformation level
DispVoxNets (Shimada et al., 2019) | 2019 | RMSE, STD | Noise, Deformation level, Outlier, Incompleteness
GP-Aligner (Wang et al., 2020b) | 2020 | Chamfer distance | Noise, Outlier, Incompleteness, Deformation level
(Hansen et al., 2019) | 2019 | TRE | Noise, Outlier
FPT (Baum et al., 2022) | 2022 | RMSE | Partial overlapping
SyNoRiM (Huang et al., 2022a) | 2022 | L2 distance, MAE, EPE, Acc | Noise, Incompleteness, Outlier, Data density variation
(Trappolini et al., 2021) | 2021 | Mean, MGO, Chamfer distance, Max EU, Mean EU | Noise, Data density variation
NDP (Li and Harada, 2022b) | 2022 | EPE, Acc | Outlier, Partial overlapping, Data density variation
NrtNet (Hu et al., 2022) | 2022 | Percentage of correct correspondence, Chamfer distance, L_dis, L_mat | Deformation levels
PFCNN (Yang et al., 2020) | 2020 | Geodesic error, mIoU, mA, oA | Data density variation
ResNet-LDDMM (Amor et al., 2021) | 2021 | Geodesic error | Noise, Incompleteness, Partial overlapping, Articulated deformations, Near-isometric deformations, Non-isometric deformations, Topological and geometric changes
(Netto and Oliveira, 2022) | 2022 | Precision, recall, EPE, MAE, MIE | Noise, Outlier, Incompleteness, Partial overlapping, Data density variation, Deformation levels
PR-GAN (Tang and Zhao, 2021) | 2021 | MSE, Chamfer distance | Noise, Outlier, Deformation levels
(Shi et al., 2021) | 2021 | Chamfer distance | Deformation levels
FlowNet3D (Liu et al., 2019) | 2019 | EPE, Acc, L2 distance | Partial overlapping, Noise, Outlier, Data density variation, Different number of sampled points
FlowNet3D++ (Wang et al., 2020d) | 2020 | EPE, Acc, L2 distance, ADE, Mean | Noise, Outlier, Data density variation
Lepard (Li and Harada, 2022a) | 2022 | IR, FMR, Recall, RRE, RTE | Outlier, Data density variation, Partial overlapping

As discussed before, some articles find correspondences before the registration step. They then usually report the correspondence error as well as transformation and distance errors. Some registration methods like Le et al. (2019), Chaudhury (2020), and Ma et al. (2017) consider the number of matches to report the accuracy of their methods. This measurement is called inlier ratio (IR) in Li and Harada (2022a). In SyNoRiM (Huang et al., 2022a) and NDP (Li and Harada, 2022b), two accuracy measurements are defined, namely Accuracy Strict (AccS) and Accuracy Relaxed (AccR). AccS is the percentage of points whose relative error is less than 5% or whose absolute error is less than 2 cm. AccR is the percentage of points whose relative error is less than 10% or whose absolute error is less than 5 cm. In He et al. (2017), the matching error is calculated based on the corresponding point and the number of iterations. The mean and standard deviation (SD) are two other factors to show the accuracy of a registration algorithm.

Another way to compute the accuracy of a point cloud registration method is using precision and recall measurements. By considering different distance threshold values, the correct match for each point correspondence can be defined. Then, by using this information, precision and recall are calculated (Chaudhury, 2020). In Li and Harada (2022a), the same concept is called feature matching recall (FMR).
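
The correspondence-based measures above can be illustrated with a minimal NumPy sketch. It assumes that the source points have already been warped with the ground-truth transformation, that predicted matches are given as index pairs, and that a match counts as correct when its residual distance is below a threshold `tau`. The function names and the `num_gt_matches` argument are hypothetical and are not taken from any of the cited papers, whose exact definitions may differ.

```python
import numpy as np


def correct_match_mask(src_warped_gt, tgt, matches, tau):
    """Flag each predicted match (i, j) whose residual distance is below tau.

    src_warped_gt: (N, 3) source points warped by the ground-truth transformation.
    tgt:           (M, 3) target points.
    matches:       (K, 2) integer array of (source index, target index) pairs.
    """
    residual = np.linalg.norm(
        src_warped_gt[matches[:, 0]] - tgt[matches[:, 1]], axis=-1
    )
    return residual < tau


def inlier_ratio(src_warped_gt, tgt, matches, tau):
    """Fraction of predicted matches that are correct (assumed definition)."""
    if len(matches) == 0:
        return 0.0
    return float(correct_match_mask(src_warped_gt, tgt, matches, tau).mean())


def precision_recall(src_warped_gt, tgt, matches, tau, num_gt_matches):
    """Precision and recall of the predicted matches for one distance threshold.

    num_gt_matches (hypothetical argument): number of ground-truth
    correspondences that a perfect matcher could have recovered.
    """
    n_correct = int(correct_match_mask(src_warped_gt, tgt, matches, tau).sum())
    precision = n_correct / max(len(matches), 1)
    recall = n_correct / max(num_gt_matches, 1)
    return precision, recall
```

Under these assumptions precision coincides with the inlier ratio, and evaluating `precision_recall` for several values of `tau` gives the threshold-dependent behaviour described in the text.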


Fig. 6. (a) Quantitative assessment metrics (b) Robustness to the common challenges.
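
Several of the distance-based measures summarized in Fig. 6(a) can be sketched in a few lines of NumPy. The sketch below is illustrative only: it assumes one common form of the Chamfer distance (sum of the mean squared nearest-neighbour distances in both directions; other papers use unsquared or differently normalized variants), and it assumes that a point counts as accurate when either its relative error or its absolute error falls below a threshold. The function names and threshold arguments are hypothetical.

```python
import numpy as np


def chamfer_distance(source, target):
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    diff = source[:, None, :] - target[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Mean squared nearest-neighbour distance, source->target plus target->source.
    return float(sq_dist.min(axis=1).mean() + sq_dist.min(axis=0).mean())


def end_point_error(pred_flow, gt_flow):
    """Mean L2 distance between predicted and ground-truth per-point displacements."""
    return float(np.linalg.norm(pred_flow - gt_flow, axis=-1).mean())


def accuracy(pred_flow, gt_flow, rel_thresh, abs_thresh):
    """Share of points with relative error < rel_thresh or absolute error < abs_thresh."""
    err = np.linalg.norm(pred_flow - gt_flow, axis=-1)
    # Guard against division by zero for points with no ground-truth motion.
    rel = err / np.maximum(np.linalg.norm(gt_flow, axis=-1), 1e-12)
    return float(np.mean((rel < rel_thresh) | (err < abs_thresh)))
```

With coordinates in metres, `accuracy(pred, gt, 0.05, 0.02)` and `accuracy(pred, gt, 0.10, 0.05)` would correspond to strict and relaxed settings of the kind described for AccS and AccR. The dense pairwise distance matrix needs O(NM) memory, so a spatial index would be preferable for large point clouds.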

Quantitative assessment metrics based on transformation error: as discussed before, to solve the point cloud registration problem, some methods calculate the transformation error between source and target point sets. For instance, some articles report rotation and translation errors, like Li and Harada (2022a). That paper defines relative rotation error (RRE) and relative translation error (RTE). Some others, like Eckart et al. (2018), report the average Euler angular deviation from the ground truth. The convergence rate is calculated as the algorithm accuracy in Ahmed et al. (2021). The convergence rate definition in that article is based on translation and rotation error. To be more precise, if the error is less than 5 degrees for rotation and 5 cm for translation, the registration is counted as converged. The isotropic error for both rotation and translation (ISO) (Bauer et al., 2021), geometric loss (Huang et al., 2020), the absolute trajectory error (ATE) (Ding and Feng, 2019), and area under the curve (AUC) (Sarode et al., 2019) are some other error evaluation metrics that are used in the mentioned articles. Furthermore, in Hu et al. (2022), a matrix optimization loss, called $L_{mat}$, is used.

As the research has demonstrated, the quantitative assessment metric based on distance is the most important and common measurement to evaluate the point cloud registration problem. To this end, the distance between the transformed source point cloud and the target one is calculated. Chamfer distance (CD) and Hausdorff distance (HD) are two widely used distance functions. In other studies, the registration methods are based on finding the corresponding points. Therefore, calculating the percentage of correct corresponding points besides the registration error, e.g. a distance metric, is a significant measurement to report. Furthermore, reporting the percentage of correct corresponding points can be helpful to study how robust the methods are to noise and outliers. In addition to the two mentioned measurements, when the approaches calculate the transformation error, the geometric loss between the predicted transformation matrix and the ground truth one can be used as a measurement metric.

7.2. Robustness

In the second criterion for evaluating the available approaches, the robustness to the challenges is studied. The different challenges that methods should be robust to are shown in Fig. 6(b). For instance, the data captured using LiDAR, scanners, or real-world cameras suffer from various types of noise. Therefore, some papers report their results on datasets with different levels of noise to show how robust they are to noise. Another important factor is how robust the method is to outliers. In some articles, like Shimada et al. (2019) and Wang et al. (2020a), the methods' robustness to outliers is studied. Robustness to occlusion is shown in Li et al. (2020b), while robustness to incompleteness is evaluated in Bernreiter et al. (2021). For another example, in some applications, the occlusion between the objects in the scene can be a challenge. Partial registration is another difficulty, which happens when only some parts of the object are visible from one point of view. Regarding these aspects, some measurements are used to show how robust a registration method is to various challenges. Some approaches, such as Yang et al. (2016), Ge (2016), and Billings et al. (2015), report the failure or success rate of the correspondences. This can show how robust the method is to outliers and noise. Although partial point cloud registration methods are not considered as the main topic in this survey, some articles, e.g., (Chen et al., 2020; Huang et al., 2020; Min et al., 2021; Wu et al., 2021), show how robust their methods can be to partial registration as well. Some methods, like Li et al. (2020b) and Yang et al. (2018), increase the level of deformation to show that their method can overcome large deformation as well. Robustness to data density variation or the size of data is studied in Ahmed et al. (2021), Min et al. (2021), and Ao et al. (2021).

8. Discussion

Based on the studies shown in this article, some points can be concluded. In the following, first, the existing challenges are discussed, and it is shown how different approaches can be robust to them. Then, an overview of dataset challenges is provided. Finally, different learning architectures for point cloud registration are discussed.

As discussed before, data incompleteness, partial overlapping, noise, and outliers are the challenges discussed most in the articles. Finding the transformation when some correspondences are missing is challenging. To overcome this challenge, the available approaches are categorized into correspondence-free and correspondence-based methods. However, both approaches suffer from some issues. The differences between global features of the two point sets can be the main challenge of correspondence-free methods, while dealing with the lack of some correspondences is the primary problem of correspondence-based methods. Detecting keypoint correspondences can be a solution to overcome this challenge (Wang and Solomon, 2019b). Additionally, using a deep neural network to find the alignment of incomplete shapes can be another approach (Hanocka et al., 2018). Although there are some


suggestions to overcome these challenges, data incompleteness and partial overlapping are still open issues for the point cloud registration problem. Robustness to noise and outliers, as two other challenges, is still studied to evaluate different registration methods. The first approach is to overcome these challenges by improving the feature extraction phase or using different methods to find the corresponding points. For instance, some methods use a feature extractor that is robust to noise, such as RANSAC variants. Therefore, they can improve the registration method's accuracy by finding the correct corresponding points. In contrast, the second group of studies, such as ICP and its variants, can solve the registration problem without feature extraction and different descriptors. Both approaches have their own advantages and disadvantages. Therefore, selecting one of them depends on the circumstances of the problem to be solved.

Although different datasets are used for evaluating point cloud registration methods, a number of challenges, such as deformation level, noise level, etc., are not represented in the available datasets. A benchmark that includes all mentioned challenges is therefore lacking. As a result, the articles generate their own customized datasets with different challenges to evaluate the proposed registration methods. For example, different deformation levels, noise levels, or outlier levels are generated in each study based on the authors' choices. Consequently, without considering how the dataset is generated, it can be impractical to compare the results of the articles just by using the reported numbers. As another challenge, most available datasets do not provide the ground-truth transformation between source and target point clouds. Generating a dataset that presents all challenges, together with a categorization into train and test data, can be considered as future work.

Regarding the various studies that solve point cloud registration with a learning approach, the different architectures can be compared. As discussed, CNNs have achieved some success in improving generalization in image applications by applying convolutional filters on the whole image and reducing the number of parameters. However, using a CNN for unordered 3D data, such as point clouds, poses some challenges, which can be solved if the data is converted to a 2D image or a 3D volumetric grid. Another group of studies is based on PointNet. In comparison with CNNs, the input is then not challenging. However, PointNet has difficulty handling local registration. As a solution, there are some methods that segment the point cloud into different regions and estimate the transformation for each region separately (Mei et al., 2022; Truong et al., 2019). Although this approach can handle local registration, it suffers from segmentation challenges as well as from the difficulty of finding the appropriate transformation model for each region. Having said that, graph neural networks (GNNs) have attracted considerable research interest in recent years. GNNs can achieve good accuracy by learning the neighborhood information of points and representing the local features. For the future of deep learning methods, regarding all mentioned advantages of GNNs, it is not unexpected to see more novel graph-based approaches. Using transformer-based architectures is another recent trend in learning-based registration methods. However, they still suffer from the generalization problem. There are some approaches, like Trappolini et al. (2021), which improve the generalization of the network but still suffer from being time-consuming. Furthermore, some recent studies show that using different representations of the point cloud can be an efficient solution for some of the mentioned challenges of the registration problem. For instance, Peng et al. (2021), as a learning-based method, considers a grid representation like a mesh to tackle the problem of finding correspondences between two surfaces. In addition, in some datasets, additional features like color are available for each point. Using texture and color information as well as point set coordinates can then be another upcoming solution for learning-based registration methods.

9. Conclusion

Point cloud registration is one of the fundamental topics that can be used in many applications, such as 3D reconstruction, pose estimation, augmented/virtual reality, etc. Traditionally, finding the spatial transformation between two corresponding point sets is the goal of point cloud registration methods. The spatial transformation was divided into rigid and non-rigid transformation categories. In this review paper, non-rigid transformations and learning-based 3D point cloud registration methods were studied. Statistics of the recent developments and a comprehensive review of the most important approaches were provided in this article. Some of the existing challenges and the robustness of the methods to them were reviewed. Additionally, the most used datasets and the existing benchmarks that are common in the articles were studied and compared. At the end of the paper, a discussion on challenges and some of their existing solutions, datasets, and different learning architectures was presented.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors of this paper would like to thank Dr. Mehran Fotouhi for his valuable comments and suggestions to improve the work.
The authors gratefully acknowledge the data storage service SDS@hd supported by the Ministry of Science, Research and the Arts Baden-Württemberg (MWK) and the German Research Foundation (DFG) through grant INST 35/1314-1 FUGG and INST 35/1503-1 FUGG.
This work was partially funded by Zentrales Innovationsprogramm Mittelstand (ZIM), Germany under grant KK5044704CS0.

References

Ahmed, M.T., Ziauddin, S., Marshall, J.A., Greenspan, M., 2021. Point cloud registration using virtual interest points from Macaulay's resultant of quadric surfaces. J. Math. Imaging Vision 63 (4), 457–471. http://dx.doi.org/10.1007/s10851-020-01013-z.
Alhamzi, K., Elmogy, M., Barakat, S., 2015. 3D object recognition based on local and global features using point cloud library. Int. J. Adv. Comput. Technol. 7 (3), 43.
Amor, B.B., Arguillère, S., Shao, L., 2021. ResNet-LDDMM: Advancing the LDDMM framework using deep residual networks. arXiv preprint arXiv:2102.07951.
Anguelov, D., Srinivasan, P., Koller, D., Thrun, S., Rodgers, J., Davis, J., 2005. SCAPE: Shape completion and animation of people. ACM Trans. Graph. (ToG) (ISSN: 0730-0301) 24 (3), 408–416. http://dx.doi.org/10.1145/1073204.1073207.
Ao, S., Hu, Q., Yang, B., Markham, A., Guo, Y., 2021. Spinnet: Learning a general surface descriptor for 3D point cloud registration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 11753–11762.
Aoki, Y., Goforth, H., Srivatsan, R.A., Lucey, S., 2019. Pointnetlk: Robust & efficient point cloud registration using pointnet. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 7163–7172. http://dx.doi.org/10.1109/cvpr.2019.00733.
Bauer, D., Patten, T., Vincze, M., 2021. Reagent: Point cloud registration using imitation and reinforcement learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 14586–14594.
Baum, Z.M., Hu, Y., Barratt, D.C., 2021. Real-time multimodal image registration with partial intraoperative point-set data. Med. Image Anal. 74, 102231.
Baum, Z., Ungi, T., Schlenger, C., Hu, Y., Barratt, D.C., 2022. Learning generalized non-rigid multimodal biomedical image registration from generic point set data. In: International Workshop on Advances in Simplifying Medical Ultrasound. Springer, pp. 141–151.
Bednarik, J., Fua, P., Salzmann, M., 2018. Learning to reconstruct texture-less deformable surfaces from a single view. In: 2018 International Conference on 3D Vision. 3DV, IEEE, pp. 606–615.
Bellekens, B., Spruyt, V., Berkvens, R., Penne, R., Weyn, M., 2015. A benchmark survey of rigid 3D point cloud registration algorithms. Int. J. Adv. Intell. Syst. 8, 118–127.
Bernreiter, L., Ott, L., Nieto, J., Siegwart, R., Cadena, C., 2021. PHASER: A robust and correspondence-free global pointcloud registration. IEEE Robot. Autom. Lett. 6 (2), 855–862. http://dx.doi.org/10.1109/LRA.2021.3052418.


Besl, P.J., McKay, N.D., 1992. Method for registration of 3-D shapes. In: Sensor Fusion IV: Control Paradigms and Data Structures, Vol. 1611. Spie, pp. 586–606.
Billings, S.D., Boctor, E.M., Taylor, R.H., 2015. Iterative most-likely point registration (IMLP): A robust algorithm for computing optimal shape alignment. PLoS One 10 (3), e0117688. http://dx.doi.org/10.1371/journal.pone.0117688.
Bogo, F., Romero, J., Loper, M., Black, M.J., 2014. FAUST: Dataset and evaluation for 3D mesh registration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3794–3801.
Bogo, F., Romero, J., Pons-Moll, G., Black, M.J., 2017. Dynamic FAUST: Registering human bodies in motion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 6233–6242.
Boscaini, D., Masci, J., Rodolà, E., Bronstein, M., 2016. Learning shape correspondence with anisotropic convolutional neural networks. Adv. Neural Inf. Process. Syst. 29.
Bozic, A., Zollhofer, M., Theobalt, C., Nießner, M., 2020. Deepdeform: Learning non-rigid rgb-d reconstruction with semi-supervised data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 7002–7012.
Bronstein, A.M., Bronstein, M.M., Kimmel, R., 2008. Numerical Geometry of Non-Rigid Shapes. Springer Science & Business Media.
Bronstein, M.M., Bruna, J., LeCun, Y., Szlam, A., Vandergheynst, P., 2017. Geometric deep learning: Going beyond euclidean data. IEEE Signal Process. Mag. 34 (4), 18–42.
Cao, C., Weng, Y., Zhou, S., Tong, Y., Zhou, K., 2013. Facewarehouse: A 3D facial expression database for visual computing. IEEE Trans. Vis. Comput. Graphics 20 (3), 413–425.
Castellani, U., Bartoli, A., 2020. 3D shape registration. In: 3D Imaging, Analysis and Applications. Springer, pp. 353–411.
Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., et al., 2015. Shapenet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012.
Chang, W., Zwicker, M., 2008. Automatic registration for articulated shapes. Comput. Graph. Forum 27 (5), 1459–1468.
Chang, W., Zwicker, M., 2009. Range scan registration using reduced deformable models. Comput. Graph. Forum 28 (2), 447–456.
Chaudhury, A., 2020. Multilevel optimization for registration of deformable point clouds. IEEE Trans. Image Proc. Publ. IEEE Signal Proc. Soc. PP, http://dx.doi.org/10.1109/TIP.2020.3019649.
Chen, S., Nan, L., Xia, R., Zhao, J., Wonka, P., 2020. PLADE: A plane-based descriptor for point cloud registration with small overlap. IEEE Trans. Geosci. Remote Sens. (ISSN: 0196-2892) 58 (4), 2530–2540. http://dx.doi.org/10.1109/TGRS.2019.2952086.
Cheng, L., Chen, S., Liu, X., Xu, H., Wu, Y., Li, M., Chen, Y., 2018. Registration of laser scanning point clouds: A review. Sensors 18 (5), 1641.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
Chui, H., Rangarajan, A., 2000. A new algorithm for non-rigid point matching. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No. PR00662), Vol. 2. IEEE, pp. 44–51.
Chui, H., Rangarajan, A., 2003. A new point matching algorithm for non-rigid registration. Comput. Vis. Image Underst. 89 (2–3), 114–141.
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K., 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Dimentions, 2022. https://app.dimensions.ai/discover/publication (Last Access: 12 May 2022).
Ding, L., Feng, C., 2019. DeepMapping: Unsupervised map estimation from multiple point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 8650–8659.
Dong, Z., Liang, F., Yang, B., Xu, Y., Zang, Y., Li, J., Wang, Y., Dai, W., Fan, H., Hyyppä, J., et al., 2020. Registration of large-scale terrestrial laser scanner point clouds: A review and benchmark. ISPRS J. Photogramm. Remote Sens. 163, 327–342.
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al., 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
Dyke, R.M., Lai, Y.-K., Rosin, P.L., Zappalà, S., Dykes, S., Guo, D., Li, K., Marin, R., Melzi, S., Yang, J., 2020. Shrec'20: Shape correspondence with non-isometric deformations. Comput. Graph. 92, 28–43.
Eckart, B., Kim, K., Kautz, J., 2018. Fast and accurate point cloud registration using trees of gaussian mixtures. arXiv preprint arXiv:1807.02587.
Elbaz, G., Avraham, T., Fischer, A., 2017. 3D point cloud registration for localization using a deep neural network auto-encoder. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4631–4640.
Fan, H., Yang, Y., 2019. PointRNN: Point recurrent neural network for moving point cloud processing. arXiv preprint arXiv:1910.08287.
Feng, M., Hu, S., Ang, M.H., Lee, G.H., 2019. 2D3D-matchnet: Learning to match keypoints across 2D image and 3D point cloud. In: 2019 International Conference on Robotics and Automation. ICRA, IEEE, http://dx.doi.org/10.1109/icra.2019.8794415.
Feng, W., Zhang, J., Cai, H., Xu, H., Hou, J., Bao, H., 2021. Recurrent multi-view alignment network for unsupervised surface registration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 10297–10307.
Fey, M., Lenssen, J.E., Weichert, F., Müller, H., 2018. Splinecnn: Fast geometric deep learning with continuous b-spline kernels. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 869–877.
Fischler, M.A., Bolles, R.C., 1981. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24 (6), 381–395.
Fitzgibbon, A.W., 2003. Robust registration of 2D and 3D point sets. Image Vis. Comput. 21 (13–14), 1145–1153.
Fotouhi, M., Hekmatian, H., Kashani-Nezhad, M.A., Kasaei, S., 2019. SC-RANSAC: Spatial consistency on RANSAC. Multimedia Tools Appl. 78 (7), 9429–9461.
Fu, Y., Lei, Y., Wang, T., Patel, P., Jani, A.B., Mao, H., Curran, W.J., Liu, T., Yang, X., 2021. Biomechanically constrained non-rigid MR-TRUS prostate registration using deep learning based 3D point cloud matching. Med. Image Anal. 67, 101845. http://dx.doi.org/10.1016/j.media.2020.101845.
Ge, X., 2016. Non-rigid registration of 3D point clouds under isometric deformation. ISPRS J. Photogramm. Remote Sens. (ISSN: 0924-2716) 121, 192–202. http://dx.doi.org/10.1016/j.isprsjprs.2016.09.009.
Geiger, A., Lenz, P., Urtasun, R., 2012. Are we ready for autonomous driving? the kitti vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, pp. 3354–3361.
Golyanik, V., Shimada, S., Varanasi, K., Stricker, D., 2018. Hdm-net: Monocular non-rigid 3D reconstruction with learned deformation model. In: International Conference on Virtual Reality and Augmented Reality. Springer, pp. 51–72.
Grosse, R., Johnson, M.K., Adelson, E.H., Freeman, W.T., 2009. Ground truth dataset and baseline evaluations for intrinsic image algorithms. In: 2009 IEEE 12th International Conference on Computer Vision. IEEE, pp. 2335–2342.
Guo, M.-H., Cai, J.-X., Liu, Z.-N., Mu, T.-J., Martin, R.R., Hu, S.-M., 2021. Pct: Point cloud transformer. Comput. Vis. Media 7 (2), 187–199.
Guo, Y., Wang, H., Hu, Q., Liu, H., Liu, L., Bennamoun, M., 2020. Deep learning for 3D point clouds: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 43 (12), 4338–4364.
Handa, A., Patraucean, V., Badrinarayanan, V., Stent, S., Cipolla, R., 2016. Understanding real world indoor scenes with synthetic data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4077–4085.
Hanocka, R., Fish, N., Wang, Z., Giryes, R., Fleishman, S., Cohen-Or, D., 2018. Alignet: Partial-shape agnostic alignment via unsupervised learning. ACM Trans. Graph. 38 (1), 1–14.
Hansen, L., Dittmer, D., Heinrich, M.P., 2019. Learning deformable point set registration with regularized dynamic graph cnns for large lung motion in copd patients. In: International Workshop on Graph Learning in Medical Imaging. Springer, pp. 53–61.
Hansen, L., Heinrich, M.P., 2021. Deep learning based geometric registration for medical images: How accurate can we get without visual features? In: International Conference on Information Processing in Medical Imaging. Springer, pp. 18–30.
He, Y., Liang, B., Yang, J., Li, S., He, J., 2017. An iterative closest points algorithm for registration of 3D laser scanner point clouds with geometric features. Sensors 17 (8), 1862.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 770–778.
Hirose, O., 2020. Acceleration of non-rigid point set registration with downsampling and Gaussian process regression. IEEE Trans. Pattern Anal. Mach. Intell. 43 (8), 2858–2865.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9 (8), 1735–1780.
Horn, M., Engel, N., Belagiannis, V., Buchholz, M., Dietmayer, K., 2020. DeepCLR: Correspondence-less architecture for deep end-to-end point cloud registration. In: 2020 IEEE 23rd International Conference on Intelligent Transportation Systems. ITSC, IEEE, ISBN: 978-1-7281-4149-7, pp. 1–7. http://dx.doi.org/10.1109/ITSC45102.2020.9294279.
Hu, X., Zhang, D., Chen, J., Wu, Y., Chen, Y., 2022. NrtNet: An unsupervised method for 3D non-rigid point cloud registration based on transformer. Sensors 22 (14), 5128.
Huang, J., Birdal, T., Gojcic, Z., Guibas, L.J., Hu, S.-M., 2022a. Multiway non-rigid point cloud registration via learned functional map synchronization. IEEE Trans. Pattern Anal. Mach. Intell.
Huang, S., Gojcic, Z., Huang, J., Wieser, A., Schindler, K., 2022b. Dynamic 3D scene analysis by point cloud accumulation. arXiv preprint arXiv:2207.12394.
Huang, S., Gojcic, Z., Usvyatsov, M., Wieser, A., Schindler, K., 2021a. Predator: Registration of 3D point clouds with low overlap. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 4267–4276.
Huang, X., Mei, G., Zhang, J., 2020. Feature-metric registration: A fast semi-supervised approach for robust point cloud registration without correspondences. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. CVPR, IEEE, ISBN: 978-1-7281-7168-5, pp. 11363–11371. http://dx.doi.org/10.1109/CVPR42600.2020.01138.


Huang, X., Mei, G., Zhang, J., Abbas, R., 2021b. A comprehensive survey on point cloud registration. arXiv preprint arXiv:2103.02690.
Ihler, A.T., Fisher III, J.W., Willsky, A.S., Chickering, D.M., 2005. Loopy belief propagation: Convergence and effects of message errors. J. Mach. Learn. Res. 6 (5).
Le, H.M., Do, T.-T., Hoang, T., Cheung, N.-M., 2019. SDRSAC: Semidefinite-based randomized approach for robust point cloud registration without correspondences. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. CVPR, IEEE, ISBN: 978-1-7281-3293-8, pp. 124–133. http://dx.doi.org/10.1109/CVPR.2019.00021.
Lei, Y., Tian, S., He, X., Wang, T., Wang, B., Patel, P., Jani, A.B., Mao, H., Curran, W.J., Liu, T., et al., 2019. Ultrasound prostate segmentation based on multidirectional deeply supervised V-Net. Med. Phys. 46 (7), 3194–3206.
Li, T., Bolkart, T., Black, M.J., Li, H., Romero, J., 2017. Learning a model of facial shape and expression from 4D scans. ACM Trans. Graph. 36 (6), 194–1.
Li, Y., Harada, T., 2022a. Lepard: Learning partial point cloud matching in rigid and deformable scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 5554–5564.
Li, Y., Harada, T., 2022b. Non-rigid point cloud registration with neural deformation Pyramid. arXiv preprint arXiv:2205.12796.
Li, J., Hu, Q., Ai, M., 2020a. GESAC: Robust graph enhanced sample consensus for point cloud registration. ISPRS J. Photogramm. Remote Sens. 167, 363–374.
Li, J., Hu, Q., Ai, M., 2021a. Point cloud registration based on one-point Ransac and scale-annealing biweight estimation. IEEE Trans. Geosci. Remote Sens. 59 (11), 9716–9729.
Li, X., Pontes, J.K., Lucey, S., 2021b. Pointnetlk revisited. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 12763–12772.
Li, Y., Takehara, H., Taketomi, T., Zheng, B., Nießner, M., 2021c. 4Dcomplete: Non-rigid motion estimation beyond the observable surface. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 12706–12716.
Li, M., Xu, R.Y., Xin, J., Zhang, K., Jing, J., 2020b. Fast non-rigid points registration with cluster correspondences projection. Signal Process. (ISSN: 01651684) 170, 107425. http://dx.doi.org/10.1016/j.sigpro.2019.107425.
Li, J., Yang, B., Chen, C., Habib, A., 2019. NRLI-UAV: Non-rigid registration of sequential raw laser scans and images for low-cost UAV LiDAR point cloud quality improvement. ISPRS J. Photogramm. Remote Sens. (ISSN: 0924-2716) 158, 123–145. http://dx.doi.org/10.1016/j.isprsjprs.2019.10.009.
Liang, L., Wei, M., Szymczak, A., Petrella, A., Xie, H., Qin, J., Wang, J., Wang, F.L., 2018. Nonrigid iterative closest points for registration of 3D biomedical surfaces. Opt. Lasers Eng. (ISSN: 01438166) 100, 141–154. http://dx.doi.org/10.1016/j.optlaseng.2017.08.005.
Liu, X., Qi, C.R., Guibas, L.J., 2019. Flownet3D: Learning scene flow in 3D point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 529–537.
Lucas, B.D., Kanade, T., et al., 1981. An iterative image registration technique with an application to stereo vision. IJCAI.
Ma, Q., Yang, J., Ranjan, A., Pujades, S., Pons-Moll, G., Tang, S., Black, M.J., 2020. Learning to dress 3D people in generative clothing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 6469–6478.
Ma, J., Zhao, J., Jiang, J., Zhou, H., 2017. Non-rigid point set registration with robust transformation estimation under manifold regularization. In: Thirty-First AAAI Conference on Artificial Intelligence.
Mahmood, B., Han, S., 2019. 3D registration of indoor point clouds for augmented reality. In: Computing in Civil Engineering 2019: Visualization, Information Modeling, and Simulation. American Society of Civil Engineers Reston, VA, pp. 1–8.
Maiseli, B., Gu, Y., Gao, H., 2017. Recent developments and trends in point set registration methods. J. Vis. Commun. Image Represent. 46, 95–106.
Pais, G.D., Ramalingam, S., Govindu, V.M., Nascimento, J.C., Chellappa, R., Miraldo, P., 2020. 3DRegNet: A deep neural network for 3D point registration. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. CVPR, IEEE, ISBN: 978-1-7281-7168-5, pp. 7191–7201. http://dx.doi.org/10.1109/CVPR42600.2020.00722.
Peng, S., Jiang, C., Liao, Y., Niemeyer, M., Pollefeys, M., Geiger, A., 2021. Shape as points: A differentiable poisson solver. Adv. Neural Inf. Process. Syst. 34.
Peterlík, I., Courtecuisse, H., Rohling, R., Abolmaesumi, P., Nguan, C., Cotin, S., Salcudean, S., 2018. Fast elastic registration of soft tissues under large deformations. Med. Image Anal. 45, 24–40. http://dx.doi.org/10.1016/j.media.2017.12.006.
Petricek, T., Svoboda, T., 2017. Point cloud registration from local feature correspondences-evaluation on challenging datasets. PLoS One 12 (11), e0187943. http://dx.doi.org/10.1371/journal.pone.0187943.
Pfeiffer, M., Riediger, C., Leger, S., Kühn, J.-P., Seppelt, D., Hoffmann, R.-T., Weitz, J., Speidel, S., 2020. Non-rigid volume to surface registration using a data-driven biomechanical model. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 724–734.
Pomerleau, F., Colas, F., Siegwart, R., et al., 2015. A review of point cloud registration algorithms for mobile robotics. Found. Trends® Robot. 4 (1), 1–104.
Pons-Moll, G., Pujades, S., Hu, S., Black, M.J., 2017. ClothCap: Seamless 4D clothing capture and retargeting. ACM Trans. Graph. (ToG) 36 (4), 1–15.
Poulenard, A., Ovsjanikov, M., 2018. Multi-directional geodesic neural networks via equivariant convolution. ACM Trans. Graph. 37 (6), 1–14.
Qi, C.R., Su, H., Mo, K., Guibas, L.J., 2017a. Pointnet: Deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 652–660.
Qi, C.R., Yi, L., Su, H., Guibas, L.J., 2017b. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Adv. Neural Inf. Process. Syst. 30.
Quan, S., Yang, J., 2020. Compatibility-guided sampling consensus for 3-d point cloud registration. IEEE Trans. Geosci. Remote Sens. 58 (10), 7380–7392.
Saiti, E., Theoharis, T., 2020. An application independent review of multimodal 3D registration methods. Comput. Graph. 91, 153–178.
Sarlin, P.-E., DeTone, D., Malisiewicz, T., Rabinovich, A., 2020. Superglue: Learning feature matching with graph neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 4938–4947.
Sarode, V., Li, X., Goforth, H., Aoki, Y., Srivatsan, R.A., Lucey, S., Choset, H., 2019. Pcrnet: Point cloud registration network using pointnet encoding. arXiv preprint arXiv:1908.07906.
Shi, J., Wan, P., Chen, F., 2021. An unsupervised non-rigid registration network for fast medical shape alignment. In: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society. EMBC, IEEE, pp. 1887–1890.
Shimada, S., Golyanik, V., Tretschk, E., Stricker, D., Theobalt, C., 2019. Dispvoxnets: Non-rigid point set alignment with supervised learning proxies. In: 2019 International Conference on 3D Vision. 3DV, IEEE, pp. 27–36.
Shuman, D.I., Narang, S.K., Frossard, P., Ortega, A., Vandergheynst, P., 2013. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30 (3), 83–98.
Slavcheva, M., Baust, M., Cremers, D., Ilic, S., 2017. Killingfusion: Non-rigid 3D reconstruction without correspondences. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1386–1395.
Sun, J., Shen, Z., Wang, Y., Bao, H., Zhou, X., 2021. LoFTR: Detector-free local feature matching with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 8922–8931.
Takimoto, R.Y., Tsuzuki, M.d.S.G., Vogelaar, R., de Castro Martins, T., Sato, A.K., Iwao, Y., Gotoh, T., Kagei, S., 2016. 3D reconstruction and multiple point cloud registration using a low precision RGB-D sensor. Mechatronics 35, 11–22.
Mayer, N., Ilg, E., Hausser, P., Fischer, P., Cremers, D., Dosovitskiy, A., Brox, T., 2016. Tang, H., Zhao, Y., 2021. A conditional generative adversarial network for non-rigid
A large dataset to train convolutional networks for disparity, optical flow, and point set registration. In: 2021 IEEE Asia-Pacific Conference on Computer Science
scene flow estimation. In: Proceedings of the IEEE Conference on Computer Vision and Data Engineering. CSDE, IEEE, pp. 1–6.
and Pattern Recognition. pp. 4040–4048. Tang, W., Zou, D., Li, P., 2021. Learning-based point cloud registration: A short review
Mei, G., Huang, X., Zhang, J., Wu, Q., 2022. Partial point cloud registration via soft and evaluation. In: 2021 2nd International Conference on Artificial Intelligence
segmentation. In: 2022 IEEE International Conference on Image Processing. ICIP, in Electronics Engineering. ACM, New York, NY, USA, http://dx.doi.org/10.1145/
IEEE, pp. 681–685. 3460268.3460273.
Melzi, S., Marin, R., Rodolà, E., Castellani, U., Ren, J., Poulenard, A., Wonka, P., Tazir, M.L., Gokhool, T., Checchin, P., Malaterre, L., Trassoudaine, L., 2018. CICP:
Ovsjanikov, M., 2019. Shrec 2019: Matching humans with different connectivity. Cluster iterative closest point for sparse–dense point cloud registration. Robot.
In: Eurographics Workshop on 3D Object Retrieval, Vol. 7. p. 3. Auton. Syst. 108, 66–86.
Min, T., Kim, E., Shim, I., 2021. Geometry guided network for point cloud registration. Tevs, A., Berner, A., Wand, M., Ihrke, I., Bokeloh, M., Kerber, J., Seidel, H.-P., 2012.
IEEE Robot. Autom. Lett. 6 (4), 7270–7277. http://dx.doi.org/10.1109/LRA.2021. Animation cartography—intrinsic reconstruction of shape and motion. ACM Trans.
3097268. Graph. 31 (2), 1–15.
Mirza, M., Osindero, S., 2014. Conditional generative adversarial nets. arXiv preprint Thomas, H., Qi, C.R., Deschaud, J.-E., Marcotegui, B., Goulette, F., Guibas, L.J., 2019.
arXiv:1411.1784. Kpconv: Flexible and deformable convolution for point clouds. In: Proceedings of
model search engine, F.D., 2022. https://youbi3d.com (Last Access: 29 Oct 2022). the IEEE/CVF International Conference on Computer Vision. pp. 6411–6420.
Myronenko, A., Song, X., 2010. Point set registration: Coherent point drift. IEEE Trans. Trappolini, G., Cosmo, L., Moschella, L., Marin, R., Melzi, S., Rodolà, E., 2021. Shape
Pattern Anal. Mach. Intell. 32 (12), 2262–2275. registration in the time of transformers. Adv. Neural Inf. Process. Syst. 34.
Myronenko, A., Song, X., Carreira-Perpinan, M., 2006. Non-rigid point set registration: Trimech, I.H., Maalej, A., Amara, N.E.B., 2016. 3D facial expression recognition
Coherent point drift. Adv. Neural Inf. Process. Syst. 19. using nonrigid CPD registration method. In: 2016 7th International Conference
Netto, G.M., Oliveira, M.M., 2022. Robust point-cloud registration based on dense point on Sciences of Electronics, Technologies of Information and Telecommunications.
matching and probabilistic modeling. Vis. Comput. 38 (9), 3217–3230. SETIT, IEEE, pp. 478–481.

Trimech, I.H., Maalej, A., Amara, N.E.B., 2020. Point-based deep neural network for 3D facial expression recognition. In: 2020 International Conference on Cyberworlds. CW, IEEE, pp. 164–171.
Truong, G., Gilani, S.Z., Islam, S.M.S., Suter, D., 2019. Fast point cloud registration using semantic segmentation. In: 2019 Digital Image Computing: Techniques and Applications. DICTA, IEEE, pp. 1–8.
Varol, G., Romero, J., Martin, X., Mahmood, N., Black, M.J., Laptev, I., Schmid, C., 2017. Learning from synthetic humans. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 109–117.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I., 2017. Attention is all you need. Adv. Neural Inf. Process. Syst. 30.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. In: Proceedings of the International Conference on Neural Information Processing Systems. NIPS.
Villena-Martinez, V., Oprea, S., Saval-Calvo, M., Azorin-Lopez, J., Fuster-Guillo, A., Fisher, R.B., 2020. When deep learning meets data alignment: A review on deep registration networks (DRNs). Appl. Sci. 10 (21), 7524. http://dx.doi.org/10.3390/app10217524.
Vlasic, D., Baran, I., Matusik, W., Popović, J., 2008. Articulated mesh animation from multi-view silhouettes. ACM SIGGRAPH 2008 Papers. ACM, pp. 1–9.
Wang, L., Chen, J., Li, X., Fang, Y., 2019a. Non-rigid point set registration networks. arXiv preprint arXiv:1904.01428.
Wang, B., Lei, Y., Tian, S., Wang, T., Liu, Y., Patel, P., Jani, A.B., Mao, H., Curran, W.J., Liu, T., et al., 2019b. Deeply supervised 3D fully convolutional networks with group dilated convolution for automatic MRI prostate segmentation. Med. Phys. 46 (4), 1707–1718.
Wang, L., Li, X., Chen, J., Fang, Y., 2019c. Coherent point drift networks: Unsupervised learning of non-rigid point set registration. arXiv preprint arXiv:1906.03039.
Wang, L., Li, X., Fang, Y., 2020a. GP-Aligner: Unsupervised non-rigid groupwise point set registration based on optimized group latent descriptor. arXiv preprint arXiv:2007.12979.
Wang, L., Li, X., Fang, Y., 2020b. GP-Aligner: Unsupervised non-rigid groupwise point set registration based on optimized group latent descriptor. arXiv preprint arXiv:2007.12979.
Wang, L., Li, X., Fang, Y., 2020c. Unsupervised learning of 3D point set registration. arXiv preprint arXiv:2006.06200.
Wang, Z., Li, S., Howard-Jenkins, H., Prisacariu, V., Chen, M., 2020d. Flownet3D++: Geometric losses for deep scene flow estimation. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. pp. 91–98.
Wang, Y., Solomon, J., 2019a. Deep closest point: Learning representations for point cloud registration. In: 2019 IEEE/CVF International Conference on Computer Vision. ICCV, IEEE, ISBN: 978-1-7281-4803-8, pp. 3522–3531. http://dx.doi.org/10.1109/ICCV.2019.00362.
Wang, Y., Solomon, J.M., 2019b. Prnet: Self-supervised learning for partial-to-partial registration. Adv. Neural Inf. Process. Syst. 32.
Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M., 2019. Dynamic graph cnn for learning on point clouds. ACM Trans. Graph. (ToG) 38 (5), 1–12.
Wang, Y., Zhang, S., Bai, X., 2019e. A 3D tracking and registration method based on point cloud and visual features for augmented reality aided assembly system. Xibei Gongye Daxue Xuebao/J. Northwestern Polytech. Univ. 37 (1), 143–151.
Wei, L., Huang, Q., Ceylan, D., Vouga, E., Li, H., 2016. Dense human body correspondences using convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1544–1553.
Wu, B., Ma, J., Chen, G., An, P., 2021. Feature interactive representation for point cloud registration. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 5530–5539.
Wu, W., Qi, Z., Fuxin, L., 2019. Pointconv: Deep convolutional networks on 3D point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 9621–9630.
Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J., 2015. 3D shapenets: A deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1912–1920.
Xiang, F., Qin, Y., Mo, K., Xia, Y., Zhu, H., Liu, F., Liu, M., Jiang, H., Yuan, Y., Wang, H., et al., 2020. Sapien: A simulated part-based interactive environment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 11097–11107.
Yang, J., Cao, Z., Zhang, Q., 2016. A fast and robust local descriptor for 3D point cloud registration. Inform. Sci. (ISSN: 00200255) 346–347, 163–179. http://dx.doi.org/10.1016/j.ins.2016.01.095.
Yang, C., Liu, Y., Jiang, X., Zhang, Z., Wei, L., Lai, T., Chen, R., 2018. Non-rigid point set registration via adaptive weighted objective function. IEEE Access 6, 75947–75960. http://dx.doi.org/10.1109/ACCESS.2018.2883689.
Yang, Y., Liu, S., Pan, H., Liu, Y., Tong, X., 2020. PFCNN: Convolutional neural networks on 3D surfaces using parallel frames. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 13578–13587.
Zeng, A., Song, S., Nießner, M., Fisher, M., Xiao, J., Funkhouser, T., 2017. 3Dmatch: Learning local geometric descriptors from rgb-d reconstructions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1802–1811.
Zhang, Z., Chen, G., Wang, X., Shu, M., 2021a. DDRNet: Fast point cloud registration network for large-scale scenes. ISPRS J. Photogramm. Remote Sens. (ISSN: 0924-2716) 175, 184–198. http://dx.doi.org/10.1016/j.isprsjprs.2021.03.003.
Zhang, Z., Dai, Y., Sun, J., 2020. Deep learning based point cloud registration: An overview. Virtual Real. Intell. Hardw. (ISSN: 2096-5796) 2 (3), 222–246. http://dx.doi.org/10.1016/j.vrih.2020.05.002.
Zhang, Z., Sun, J., Dai, Y., Zhou, D., Song, X., He, M., 2021b. A representation separation perspective to correspondences-free unsupervised 3D point cloud registration. IEEE Geosci. Remote Sens. Lett.
Zhang, S., Tong, H., Xu, J., Maciejewski, R., 2019. Graph convolutional networks: A comprehensive review. Comput. Soc. Netw. 6 (1), 1–23.
Zhao, H., Jiang, L., Jia, J., Torr, P.H., Koltun, V., 2021a. Point transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 16259–16268.
Zhao, H., Liang, Z., Wang, C., Yang, M., 2021b. CentroidReg: A global-to-local framework for partial point cloud registration. IEEE Robot. Autom. Lett. 6 (2), 2533–2540. http://dx.doi.org/10.1109/lra.2021.3061369.
Zhu, H., Guo, B., Zou, K., Li, Y., Yuen, K.-V., Mihaylova, L., Leung, H., 2019. A review of point set registration: From pairwise registration to groupwise registration. Sensors (Basel, Switzerland) 19 (5), http://dx.doi.org/10.3390/s19051191.
