Article history: Available online 3 August 2021

Keywords: Lidar; Object detection; SLAM; Change detection; Navigation

Abstract: In the last decade, Light Detection and Ranging (LIDAR) has become a leading technology for detailed and reliable 3D environment perception. This paper gives an overview of the wide applicability of LIDAR sensors from the perspective of signal processing for autonomous driving, including dynamic and static scene analysis, mapping, and situation awareness, functions that point significantly beyond the role of a safe obstacle detector, which was the sole typical function of LIDARs in the pioneering years of driverless vehicles. The paper focuses on a wide range of LIDAR data analysis applications of the last decade; in addition to presenting a state-of-the-art survey, the article also summarizes some open issues and expected directions of development in this field, as well as the future perspectives of LIDAR systems and intelligent LIDAR based information processing.

© 2021 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
https://doi.org/10.1016/j.dsp.2021.103193
Fig. 1. Data sample of a Velodyne HDL-64 RMB Lidar.

Fig. 2. Data of a Riegl VMX-450 Mobile Laser Scanning (MLS) system.
space verification. However, as we will demonstrate in this study, the potential of this technology goes far beyond simple obstacle detection, since the development of LIDAR technology in terms of temporal and spatial resolution and noise elimination has led to more sophisticated 3D measurements for various real-time perception, navigation and mapping problems.

Next, we introduce the reader to the latest exciting results and their background in real-life LIDAR applications.
1.2. Outline of the paper

First, we show the diversity of laser scanner devices used to capture a point cloud of the 3D environment. Next, we present an overview of a wide range of LIDAR-based application modules built on each other, which implement various functions, including object perception, classification, mapping and localization. We also discuss the opportunities in challenging situations such as extreme weather conditions or the availability of low-range single-beam (planar) sensors only. Finally, we show the on-the-fly calibration of the LIDAR and camera system.
2. LIDAR sensors and resources

2.1. LIDAR sensor types

LIDAR equipment offers a rich variety of applications and operational modes: static/mobile platforms, 360°/wide-angle/narrow scans, equidistant scanning resolution/special beam patterns, single echo/multiple echoes. We will see that LIDARs can be used in virtually any area of imaging the world.
Rotating Multi-beam (RMB) Lidar systems provide a 360° field of view of the scene, with a vertical resolution equal to the number of sensor channels, while the horizontal angular resolution depends on the speed of rotation. Although RMB Lidars can produce high frame-rate point cloud videos enabling dynamic event analysis in the 3D space, the measurements have a low spatial density, which quickly decreases as a function of the distance from the sensor, and the point clouds may exhibit particular patterns typical to the sensor characteristics (see Fig. 1). In special cases, only one or a few beams are available.
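As a back-of-the-envelope illustration of this trade-off, the short Python sketch below estimates the horizontal angular resolution and per-scan point count of an RMB sensor; the channel count, rotation rate and pulse rate are example values only, not the specification of any particular product:

```python
# Illustrative calculation of RMB Lidar scan resolution.
# All parameters below are assumed example values.

def rmb_scan_stats(n_channels: int, rotation_hz: float, pulse_rate_hz: float):
    """Return (horizontal resolution in degrees, points per full rotation)."""
    points_per_rotation = pulse_rate_hz / rotation_hz       # firings per 360 deg
    points_per_channel = points_per_rotation / n_channels   # firings per beam
    horiz_res_deg = 360.0 / points_per_channel              # angular step of one beam
    return horiz_res_deg, points_per_rotation

# A 64-channel sensor spinning at 10 Hz with ~1.3 M points/s:
res, pts = rmb_scan_stats(n_channels=64, rotation_hz=10.0, pulse_rate_hz=1.3e6)
print(f"horizontal resolution ~{res:.2f} deg, ~{pts:.0f} points per scan")
# Doubling the rotation speed halves the points per scan, i.e. coarser
# horizontal resolution; the vertical resolution stays fixed at 64 lines.
```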
Mobile laser scanning (MLS) platforms equipped with time-synchronized Lidar sensors and navigation units can rapidly provide very dense and accurate static point clouds of large environments, where the 3D spatial measurements are accurately registered to a geo-referenced global coordinate system (Fig. 2). These point clouds may act as a basis for detailed and up-to-date 3D High Definition (HD) maps of the cities, which can be utilized by self-driving vehicles for navigation.
Another recently emerging technology is the Doppler LIDAR (first mentioned in [2] for wind measurements): e.g., Blackmore (https://blackmoreinc.com) has recently introduced a LIDAR for autonomous driving with velocity or rotation speed data output. Other very recent models, such as the Livox sensors, use advanced non-repetitive scanning patterns to deliver highly accurate details. These scanning patterns provide a relatively high point density even within a short period of time, which builds up to a higher density as the scan duration increases. The currently available models can achieve the same or greater point density as conventional 32-line RMB LIDAR sensors.

2.2. LIDAR resources

Numerous autonomous driving datasets with LIDAR data have been released in recent years. The most important ones are listed in Table 1 together with their purpose. As we can observe, for typical benchmark problems, such as object detection, tracking or Simultaneous Localization and Mapping (SLAM), one can choose between various public resources, corresponding to different sensor characteristics and scenario types. A main challenge in the future, however, will be the timely extension of the available benchmark datasets with reliable measurement and ground truth information, following the appearance of newer and newer LIDAR sensor technologies.
Table 1
LIDAR datasets with different purposes.

Name of the dataset              | Lane detection | Object detection/tracking | Segmentation | Localization and mapping
Stanford Track [3]               |                | X                         |              |
Ford [4]                         |                |                           |              | X
KITTI [5]                        | X              | X                         | X            | X
Málaga urban [6]                 |                |                           |              | X
Oxford RobotCar [7]              |                |                           |              | X
Apolloscape [8]                  | X              | X                         | X            | X
KAIST Urban dataset [9]          |                |                           |              | X
KAIST Multispectral [10]         |                | X                         |              |
Multivehicle Stereo Event [11]   |                |                           |              | X
UTBM RoboCar [12]                |                |                           |              | X
Unsupervised Llamas (Bosch) [13] | X              |                           |              |
PandaSet*                        |                | X                         | X            |
BLVD [14]                        |                | X                         |              |
H3D (Honda) [15]                 |                | X                         |              |
Lyft level 5 [16]                |                | X                         |              |
NuScenes [17]                    |                | X                         | X            |
Waymo Open [18]                  |                | X                         |              |
Argoverse [19]                   |                | X                         |              | X
SZTAKI-Velo64Road [20]           |                | X                         |              |
SZTAKI CityMLS [21]              |                |                           | X            | X

* https://scale.com/open-datasets/pandaset
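As a practical note, several of these datasets (e.g., KITTI [5]) distribute each LIDAR scan as a flat binary file of float32 (x, y, z, reflectance) records; a minimal NumPy loader sketch is given below (the file path is a placeholder):

```python
import numpy as np

def load_kitti_velodyne(path: str) -> np.ndarray:
    """Load one KITTI Velodyne scan: N x 4 array of (x, y, z, reflectance)."""
    scan = np.fromfile(path, dtype=np.float32)
    return scan.reshape(-1, 4)

# Example (placeholder path):
# points = load_kitti_velodyne("0000000000.bin")
# xyz, intensity = points[:, :3], points[:, 3]
```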
3. LIDAR based object perception

Object perception and recognition is a central objective in LIDAR based 3D point cloud processing. Though several 3D object detection and classification approaches can be found in the literature, due to the large differences in the data characteristics obtained by different LIDAR sensors, object perception methods are still strongly sensor dependent, which makes it very challenging to adopt them between different types of LIDAR data.

Since LIDAR sensors provide very accurate 3D geometric information, the localization and shape recognition of objects can be performed more intuitively than in 2D image processing. However, beyond the different sensor data characteristics, several challenges occur in automatic LIDAR-based object detection and classification, such as the sparsity of the data, variable point density, non-uniform sampling, and the fact that in cluttered scenes objects often occlude each other.
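Many detection pipelines therefore normalize the density before further processing; below is a minimal NumPy sketch of voxel-grid downsampling (the 10 cm voxel size is an arbitrary example value), which keeps one centroid per occupied voxel:

```python
import numpy as np

def voxel_downsample(points: np.ndarray, voxel_size: float = 0.1) -> np.ndarray:
    """Keep one representative point (the centroid) per occupied voxel.

    points: N x 3 array; voxel_size: voxel edge length in meters (example value).
    """
    # Integer voxel index of every point
    idx = np.floor(points / voxel_size).astype(np.int64)
    # Group points by voxel and average them
    _, inverse, counts = np.unique(idx, axis=0,
                                   return_inverse=True, return_counts=True)
    inverse = inverse.reshape(-1)  # flatten for robustness across NumPy versions
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)  # accumulate point coordinates per voxel
    return sums / counts[:, None]
```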
4. The limits of usage: low-resolution LIDAR perception and extreme circumstances

The previous section (Sect. 3) showed that Lidar pattern evaluation can result in semantic interpretation; now we will see that even very limited (diluted) information from LIDAR scans can be used for accurate perception. In this section, the limitations of LIDAR sensors, the resulting challenges, and current solutions are discussed.

4.1. Vision with LIDARs of extremely low resolutions

2D range scanners and their applications have a relatively long history in robotics [30]. Automated Guided Vehicles (AGVs) have been using these sensors for decades for safety and navigation purposes. Today, there are products available on the market with extremely high horizontal angular resolution, high scanning frequency, and a safety guarantee from the manufacturer.
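A planar (single-beam) scanner of this kind returns only a vector of ranges over a fan of bearing angles; a small sketch of the standard polar-to-Cartesian conversion follows (the field of view and sample count are example values, typical of planar safety scanners):

```python
import numpy as np

def scan_to_xy(ranges: np.ndarray, fov_deg: float = 270.0) -> np.ndarray:
    """Convert a planar LIDAR scan (equally spaced bearings) to 2D points."""
    angles = np.deg2rad(np.linspace(-fov_deg / 2, fov_deg / 2, ranges.size))
    return np.column_stack((ranges * np.cos(angles), ranges * np.sin(angles)))

# Example with synthetic data: 1081 samples over a 270 degree fan,
# all obstacles placed at 5 m.
xy = scan_to_xy(np.full(1081, 5.0))
```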
Fig. 4. Examples of planar LIDAR sensor and point cloud acquired by a planar LIDAR.
Fig. 8. Labeling result of the proposed 3D CNN based scene segmentation approach (test data provided by Budapest Közút Zrt.).
features. Results are demonstrated on synthetic and indoor data samples, with dense and accurate spatial data and RGB color information.

The Similarity Group Proposal Network (SGPN) [69] uses PointNet++ as a backbone feature extractor, and achieves a performance improvement by adding several extra layers on top of the network structure. However, as noted by the authors, SGPN cannot process large scenes of the order of 10^5 or more points [69], due to its use of a similarity matrix whose size scales quadratically with the number of points. This property is disadvantageous for MLS data processing, where a typical scene may contain over 10^7 points.
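To make the quadratic scaling concrete, the similarity matrix over N points stores N² entries; a quick Python check of the memory footprint alone (assuming float32 entries) shows why 10^5 points is already prohibitive:

```python
# Memory needed for an N x N float32 similarity matrix.
def similarity_matrix_gib(n_points: int) -> float:
    return n_points ** 2 * 4 / 2 ** 30  # 4 bytes per float32 entry

print(f"{similarity_matrix_gib(10 ** 5):.1f} GiB")  # ~37 GiB for 1e5 points
print(f"{similarity_matrix_gib(10 ** 7):.0f} GiB")  # ~3.7e5 GiB for an MLS scene
```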
The Sparse Lattice Network (SPLATNet3D) [70] is a recent technique which is able to deal with large point cloud scenes efficiently by using a Bilateral Convolution Layer (BCL). SPLATNet3D [70] projects the extracted features onto a lattice structure, and applies sparse convolution operations. Similarly to voxel based approaches, the lattice structure implements a discrete scene representation, where one has to address under- and oversegmentation problems depending on the lattice scales.
The C²CNN technique introduced in [21] is based on a two-channel 3D convolutional neural network (CNN), and is specifically designed to segment MLS point clouds into nine different semantic classes, which can be used for high definition city map generation. The main purpose of semantic point labeling is to provide a detailed and reliable background map for self-driving vehicles (SDVs), which indicates the roads and various landmark objects for the navigation and decision support of SDVs. This approach considers several practical aspects of raw MLS sensor data processing, including the presence of diverse urban objects, varying point density, and the strong measurement noise of phantom effects caused by objects moving concurrently with the scanning platform. We evaluate the proposed approach on a manually annotated new MLS benchmark set, and compare our solution to three general reference techniques proposed for semantic point cloud segmentation. A numerical comparison between many of the above mentioned methods is shown in Table 2, using the SZTAKI CityMLS Benchmark Set [21] (http://mplab.sztaki.hu/geocomp/SZTAKI-CityMLS-DB.html).
7. Change detection using onboard Lidar and MLS maps

For self-driving car navigation and environment perception, change detection between the instantly sensed RMB Lidar measurements and the MLS based reference environment model appears as a crucial task, which raises a number of key challenges. Particularly, there is a significant difference in the quality and the density characteristics of the i3D and MLS point clouds, due to a trade-off between the temporal and spatial resolution of the available 3D sensors.

In recent years various techniques have been published for change detection in point clouds; however, the majority of the approaches rely on dense terrestrial laser scanning (TLS) data recorded from static tripod platforms [71,72]. As explained in [71], classification based on the calculation of point-to-point distances may be useful for homogeneous TLS and MLS data, where changes can be detected directly in 3D. However, the point-to-point distance is very sensitive to varying point density, causing degradation in the addressed i3D/MLS cross-platform scenario. Instead, [71] follows a ray tracing and occupancy map based approach with estimated normals for efficient occlusion detection, and point-to-triangle distances for a more robust calculation of the changes. Here the Delaunay triangulation step may prove a critical point, especially in noisy and cluttered segments of the MLS point cloud, which are unavoidably present in a city-scale project. [72] uses a nearest neighbor search across segments of scans: for every point of a segment they perform a fixed radius search of 15 cm in the reference cloud. If for a certain percentage of segment points no neighboring points can be found for at least one segment-to-cloud comparison, the object is labeled there as a moving entity. A method for change detection between MLS point clouds and 2D terrestrial images is discussed in [73]. An approach dealing with purely RMB Lidar measurements is presented in [74], which uses a ray tracing approach with nearest neighbor search. A voxel based occupancy technique is applied in [75], where the authors focus on detecting changes in point clouds captured with different MLS systems. However, the differences in the data quality of the inputs are less significant than in our discussed case.
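As an illustration of the fixed-radius segment test of [72] described above, the following sketch (using SciPy's cKDTree; the 15 cm radius comes from the cited paper, while the match-ratio threshold is an illustrative value, not taken from the paper) flags a segment as changed when too few of its points find reference neighbors:

```python
import numpy as np
from scipy.spatial import cKDTree

def segment_changed(segment_pts, reference_pts,
                    radius=0.15, min_match_ratio=0.5):
    """Flag a segment as a change if most of its points lack reference neighbors.

    radius: 15 cm search radius as in [72]; min_match_ratio is an
    illustrative threshold, not taken from the paper.
    """
    tree = cKDTree(reference_pts)
    # Distance to the nearest reference point for every segment point
    dists, _ = tree.query(segment_pts, k=1)
    matched = np.mean(dists <= radius)
    return matched < min_match_ratio
```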
In [76] the authors showed that change detection can be accelerated if only keyframes are compared to the map or to previous frames, where keyframes are the frames that contain changes with high probability. [76] proposed a solution to find these keyframes by exploiting the mapping residuals.
Table 2
Quantitative comparison of various point cloud segmentation techniques [63], [67], [68], [70] and [21] on the SZTAKI CityMLS benchmark set.

Class      | OG-CNN [63]    | Multi-view [67] | PointNet++ [68] | SPLATNet^xyz_rgb [70] | C²CNN [21]
           | Pr   Rc   F-r  | Pr   Rc   F-r   | Pr   Rc   F-r   | Pr   Rc   F-r         | Pr   Rc   F-r
Phantom    | 85.3 34.7 49.3 | 76.5 45.3 56.9  | 82.3 76.5 79.3  | 83.4 78.2 80.7        | 84.3 85.9 85.1
Pedestrian | 61.2 82.4 70.2 | 57.2 66.8 61.6  | 86.1 81.2 83.6  | 80.4 78.6 79.5        | 85.2 85.3 85.2
Car        | 56.4 89.5 69.2 | 60.2 73.3 66.1  | 80.6 92.7 86.2  | 81.1 89.4 85.0        | 86.4 88.7 87.5
Vegetation | 72.4 83.4 77.5 | 71.7 78.4 74.9  | 91.4 89.7 90.5  | 86.4 87.3 86.8        | 98.2 95.5 96.8
Column     | 88.6 74.3 80.8 | 83.4 76.8 80.0  | 83.4 93.6 88.2  | 84.1 89.2 86.6        | 86.5 89.2 87.8
Tram/Bus   | 91.4 81.6 86.2 | 85.7 83.2 84.4  | 83.1 89.7 86.3  | 79.3 82.1 80.7        | 89.5 96.9 93.0
Furniture  | 72.1 82.4 76.9 | 57.2 89.3 69.7  | 84.8 82.9 83.8  | 82.6 81.3 81.9        | 88.8 78.8 83.5
Overall    | 76.9 74.2 75.5 | 72.5 73.4 72.9  | 85.6 87.5 86.5  | 82.5 83.7 83.0        | 90.4 90.2 90.3

Note: Voxel level Precision (Pr), Recall (Rc) and F-rates (F-r) are given in percent (overall values weighted with class significance).
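For clarity, the Pr/Rc/F-r values in Table 2 follow the standard definitions computed from voxel-level true/false positives and false negatives; a small helper illustrating the formulas:

```python
def precision_recall_f(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard precision, recall and F-score from voxel-level counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_rate = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
    return precision, recall, f_rate

# e.g. the 'Car' row of the C²CNN column (Pr 86.4, Rc 88.7) yields
# F-rate = 2 * 0.864 * 0.887 / (0.864 + 0.887) ≈ 0.875, matching the table.
```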
8. Camera-Lidar calibration
tions, they still rely on the presence of calibration targets, which often have to be arranged in complex setups (e.g., [81] uses 12 checkerboards). Furthermore, during the calibration both the platform and the targets must remain motionless.

On the contrary, target-less approaches rely on features extracted from the observed scene, without using any calibration objects. Some of these methods use motion-based information [86–88] to calibrate the Lidar and the camera, while alternative techniques [78,89] attempt to minimize the calibration errors using only static features.
Among motion-based approaches, Huang and Stachniss [87] improve the accuracy of extrinsic calibration by estimating the motion errors, Shiu and Ahmad [86] approximate the relative motion parameters between consecutive frames, and Shi et al. [90] calculate the sensor motion by jointly minimizing the projection error between the Lidar and the camera residuals. These methods first estimate the trajectories of the camera and Lidar sensors either by visual odometry and scan matching techniques, or by exploiting IMU and GNSS measurements. Thereafter they match the recorded camera and Lidar measurement sequences, assuming that the sensors are rigidly mounted to the platform. However, the accuracy of these techniques strongly depends on the performance of the trajectory estimation, which may suffer from visually featureless regions, low resolution scans [91], the lack of hardware trigger based synchronization between the camera and the Lidar [90], or urban scenes without sufficient GPS coverage.
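The common mathematical core of these motion-based approaches is the classical hand-eye calibration equation, stated explicitly in the title of [86]; in a standard formulation (the frame conventions below are one usual choice, not necessarily those of the cited papers):

```latex
% Hand-eye formulation of motion-based camera-Lidar calibration:
% X is the unknown rigid camera-to-Lidar transform (4x4 homogeneous),
% A_i and B_i are the camera's and the Lidar's relative motions between
% consecutive frames, estimated by visual odometry / scan matching.
A_i X = X B_i, \qquad
A_i = \left(T^{\mathrm{cam}}_{i}\right)^{-1} T^{\mathrm{cam}}_{i+1}, \qquad
B_i = \left(T^{\mathrm{lid}}_{i}\right)^{-1} T^{\mathrm{lid}}_{i+1}
```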
We continue the discussion with single-frame, target-less and feature-based methods. Moghadam et al. [89] attempt to detect correspondences by extracting lines both from the 3D Lidar point cloud and from the 2D image data. While this method proved to be efficient in indoor environments, it requires a large number of line correspondences, a condition that often cannot be satisfied in outdoor scenes. A mutual information based approach has been introduced in [92] to calibrate different range sensors with cameras. Pandey et al. [78] attempt to maximize the mutual information between the camera's grayscale pixel intensities and the Lidar reflectivity values. Based on Lidar reflectivity values and grayscale images, Napier et al. [93] minimize the correlation error between the Lidar and the camera frames. Scaramuzza et al. [94] introduce a new data representation called the Bearing Angle (BA) image, which is generated from the Lidar's range measurements. Using conventional image processing operations, the method searches for correspondences between the BA and the camera image. As a limitation, target-less feature-based methods require a reasonable initial transformation estimate between the measurements of the different sensors [90], and mutual information based matching is sensitive to inhomogeneous point cloud inputs and illumination artifacts, which are frequently occurring problems when using RMB Lidars [78].
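The mutual information objective of [78] can be sketched compactly: project the Lidar points into the image with candidate extrinsics, then score the joint histogram of Lidar reflectivity against grayscale intensity. The following is a simplified illustration, not the authors' implementation; in particular, cam_proj stands for a user-supplied projection routine:

```python
import numpy as np

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 32) -> float:
    """Histogram-based mutual information between two sample vectors."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal over a
    py = p.sum(axis=0, keepdims=True)   # marginal over b
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))

def calibration_score(extrinsics, lidar_pts, reflectivity, image, cam_proj):
    """Score one candidate Lidar-to-camera transform by MI (simplified).

    cam_proj is a hypothetical helper that projects 3D points to integer
    pixel coordinates and returns (uv, valid_mask); it is not part of any
    specific library.
    """
    uv, valid = cam_proj(extrinsics, lidar_pts)   # project points to pixels
    gray = image[uv[valid, 1], uv[valid, 0]]      # sample image intensities
    return mutual_information(reflectivity[valid], gray)

# The extrinsics maximizing calibration_score over a local search
# (e.g., grid or gradient-free optimization) give the calibration estimate.
```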
In [95], the authors proposed a new fully automatic and target-less extrinsic calibration approach between a camera and a rotating multi-beam (RMB) Lidar mounted on a moving car. This technique adopts a structure from motion (SfM) method to generate 3D point clouds from the camera data, which can be matched to the Lidar point clouds; thus, the extrinsic calibration problem is addressed as a registration task in the 3D domain (see Fig. 10). The method consists of two main steps: an object level matching algorithm performing a coarse alignment of the camera and Lidar data, and a fine alignment step that implements a control point based point level registration refinement. The superiority of the method is that it relies only on the raw camera and Lidar sensor streams, without using any external Global Navigation Satellite System (GNSS) or Inertial Measurement Unit (IMU) sensors. Moreover, it is able to automatically calculate the extrinsic calibration parameters between the Lidar and camera sensors on-the-fly, which means we only have to mount the sensors on the top of the vehicle and start driving in a typical urban environment.

Note that there exist a few end-to-end deep learning based camera and Lidar calibration methods [80,96] in the literature, which can automatically estimate the calibration parameters within a bounded parameter range, based on a sufficiently large training dataset. However, the trained models cannot be applied to arbitrary configurations, and re-training is often more resource intensive than applying a conventional calibration approach. In addition, failure case analysis and the analytical estimation of the limits of operation are highly challenging for black box deep learning approaches.

9. Conclusion and future directions

When LIDARs first appeared in autonomous driving systems, they were mainly part of the supporting development toolkit. For a long time, up until recently, it was seriously argued that LIDAR was not needed as a traffic sensor, since (i) it cannot see through bad weather, (ii) its opto-mechanics are vulnerable, and (iii) its price is relatively high. However, today we can get high quality but much cheaper LIDAR sensors, with more robust opto-mechanical solutions; bad weather problems can be partly eliminated, and so on.
This technology cannot be stopped: the intense research carried out in the past decade has shown significant progress. Today LIDARs are indispensable sensors in many autonomous robotic applications, and they will have a definite place in autonomous driving. Nevertheless, there are still many opportunities to further enhance their capabilities in the future by exploiting, e.g., multiple reflection patterns, the Doppler effect, or the reflection intensity. Its main advantage is that LiDAR is the only sensor that gives high resolution at range: the power to recognize objects very finely and very accurately in space, even from afar.

In the methodology of the above tasks, Deep Learning methods trained on huge testing databases are widely used for detection and classification. In 3D calibration and positioning tasks, the classical 3D geometry based mathematical framework, e.g., [97], is still used. However, the results of direct mathematical methods can also be adopted in the data feeding step of Deep Learning training methods, especially in 3D geometry tasks, to obtain a more complete set of input variations.

As the industrial trend of LIDAR sensor development brings, in parallel, a quick increase of the spatial and temporal resolution of the 3D measurement sequences and reduced measurement noise, LIDAR sensors may in the near future act as general tools to reliably capture dense and accurate 3D environmental information for various real-time perception, navigation and mapping problems.

Over the past decade, another defining trend is noticeable: producing smaller and more compact LIDAR devices. These tendencies facilitate the use of LIDAR sensors on small Unmanned Aerial Vehicles (UAVs) [98], enabling further directions of research and many novel applications [99].

CRediT authorship contribution statement

All authors contributed equally to this article.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The research was supported by the Ministry of Innovation and Technology NRDI Office within the framework of the Autonomous Systems National Laboratory Program and by the Hungarian National Science Foundation (NKFIH OTKA) No. K 139485.
References

[1] J.C.F. Diaz, W.E. Carter, R.L. Shrestha, C.L. Glennie, LiDAR Remote Sensing, Springer International Publishing, Cham, 2017, pp. 929–980.
[2] M.L. Chanin, A. Garnier, A. Hauchecorne, J. Porteneuve, A Doppler Lidar for measuring winds in the middle atmosphere, Geophys. Res. Lett. 16 (11) (1989) 1273–1276, https://doi.org/10.1029/GL016i011p01273.
[3] A. Teichman, J. Levinson, S. Thrun, Towards 3D object recognition via classification of arbitrary object tracks, in: 2011 IEEE International Conference on Robotics and Automation, 2011, pp. 4034–4041.
[4] G. Pandey, J.R. McBride, R.M. Eustice, Ford campus vision and Lidar data set, Int. J. Robot. Res. 30 (13) (2011) 1543–1552, https://doi.org/10.1177/0278364911400640.
[5] A. Geiger, P. Lenz, R. Urtasun, Are we ready for autonomous driving? The KITTI vision benchmark suite, in: IEEE CVPR, 2012.
[6] J.-L. Blanco, F.-A. Moreno, J. Gonzalez-Jimenez, The Málaga urban dataset: high-rate stereo and Lidars in a realistic urban scenario, Int. J. Robot. Res. 33 (2) (2014) 207–214, https://doi.org/10.1177/0278364913507326, http://www.mrpt.org/MalagaUrbanDataset.
[7] W. Maddern, G. Pascoe, C. Linegar, P. Newman, 1 year, 1000 km: the Oxford RobotCar dataset, Int. J. Robot. Res. 36 (1) (2017) 3–15, https://doi.org/10.1177/0278364916679498.
[8] Y. Ma, X. Zhu, S. Zhang, R. Yang, W. Wang, D. Manocha, TrafficPredict: trajectory prediction for heterogeneous traffic-agents, in: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, 2019, pp. 6120–6127.
[9] J. Jeong, Y. Cho, Y.-S. Shin, H. Roh, A. Kim, Complex urban dataset with multi-level sensors from highly diverse urban environments, Int. J. Robot. Res. (2019) 0278364919843996.
[10] Y. Choi, N. Kim, S. Hwang, K. Park, J.S. Yoon, K. An, I.S. Kweon, KAIST multispectral day/night dataset for autonomous and assisted driving.
[11] A.Z. Zhu, D. Thakur, T. Özaslan, B. Pfrommer, V. Kumar, K. Daniilidis, The multivehicle stereo event camera dataset: an event camera dataset for 3D perception, IEEE Robot. Autom. Lett. 3 (3) (2018) 2032–2039, https://doi.org/10.1109/LRA.2018.2800793.
[12] Z. Yan, L. Sun, T. Krajnik, Y. Ruichek, EU long-term dataset with multiple sensors for autonomous driving, in: Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020.
[13] K. Behrendt, R. Soussan, Unsupervised labeled lane marker dataset generation using maps, in: Proceedings of the IEEE International Conference on Computer Vision, 2019.
[14] J. Xue, J. Fang, T. Li, B. Zhang, P. Zhang, Z. Ye, J. Dou, BLVD: building a large-scale 5D semantics benchmark for autonomous driving, in: 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 6685–6691.
[15] A. Patil, S. Malla, H. Gang, Y.-T. Chen, The H3D dataset for full-surround 3D multi-object detection and tracking in crowded urban scenes, in: International Conference on Robotics and Automation, 2019.
[16] J. Houston, G. Zuidhof, L. Bergamini, Y. Ye, A. Jain, S. Omari, V. Iglovikov, P. Ondruska, One thousand and one hours: self-driving motion prediction dataset, https://level5.lyft.com/dataset/, 2020.
[17] H. Caesar, V. Bankiti, A.H. Lang, S. Vora, V.E. Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan, O. Beijbom, nuScenes: a multimodal dataset for autonomous driving, arXiv preprint, arXiv:1903.11027.
[18] P. Sun, H. Kretzschmar, X. Dotiwalla, A. Chouard, V. Patnaik, P. Tsui, J. Guo, Y. Zhou, Y. Chai, B. Caine, V. Vasudevan, W. Han, J. Ngiam, H. Zhao, A. Timofeev, S. Ettinger, M. Krivokon, A. Gao, A. Joshi, Y. Zhang, J. Shlens, Z. Chen, D. Anguelov, Scalability in perception for autonomous driving: Waymo open dataset, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[19] M. Chang, J. Lambert, P. Sangkloy, J. Singh, S. Bak, A. Hartnett, D. Wang, P. Carr, S. Lucey, D. Ramanan, J. Hays, Argoverse: 3D tracking and forecasting with rich maps, CoRR, arXiv:1911.02620.
[20] A. Börcs, B. Nagy, C. Benedek, Instant object detection in Lidar point clouds, IEEE Geosci. Remote Sens. Lett. 14 (2017) 992–996.
[21] B. Nagy, C. Benedek, 3D CNN-based semantic labeling approach for mobile laser scanning data, IEEE Sens. J. 19 (21) (2019) 10034–10045, https://doi.org/10.1109/JSEN.2019.2927269.
[22] C. Benedek, D. Molnár, T. Szirányi, A dynamic MRF model for foreground detection on range data sequences of rotating multi-beam Lidar, in: X. Jiang, O.R.P. Bellon, D. Goldgof, T. Oishi (Eds.), Advances in Depth Image Analysis and Applications, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 87–96.
[23] R.B. Rusu, S. Cousins, 3D is here: Point Cloud Library (PCL), in: 2011 IEEE International Conference on Robotics and Automation, 2011, pp. 1–4.
[24] J.-F. Lalonde, N. Vandapel, M. Hebert, Data structures for efficient dynamic processing in 3-D, Int. J. Robot. Res. 26 (8) (2007) 777–796, https://doi.org/10.1177/0278364907079265.
[25] M. Himmelsbach, A. Müller, T. Luettel, H.-J. Wuensche, Lidar-based 3D object perception, in: Proceedings of the 1st International Workshop on Cognition for Technical Systems, 2008.
[26] A. Azim, O. Aycard, Detection, classification and tracking of moving objects in a 3D environment, in: 2012 IEEE Intelligent Vehicles Symposium, 2012, pp. 802–807.
[27] A. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, O. Beijbom, PointPillars: fast encoders for object detection from point clouds, in: Conference on Computer Vision and Pattern Recognition, 2019, pp. 12689–12697.
[28] Y. Zhou, O. Tuzel, VoxelNet: end-to-end learning for point cloud based 3D object detection, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 4490–4499.
[29] T.-H. Kim, T.-H. Park, Placement optimization of multiple Lidar sensors for autonomous vehicles, IEEE Trans. Intell. Transp. Syst. 21 (5) (2020) 2139–2145, https://doi.org/10.1109/TITS.2019.2915087.
[30] K.O. Arras, O.M. Mozos, W. Burgard, Using boosted features for the detection of people in 2D range data, in: IEEE ICRA, 2007.
[31] W. Hess, D. Kohler, H. Rapp, D. Andor, Real-time loop closure in 2D LIDAR SLAM, in: IEEE ICRA, 2016.
[32] L. Kurnianggoro, K.H. Jo, Object classification for LIDAR data using encoded features, in: 2017 10th International Conference on Human System Interactions (HSI), 2017, pp. 49–53.
[33] C. Weinrich, T. Wengefeld, M. Volkhardt, A. Scheidig, H.-M. Gross, Generic Distance-Invariant Features for Detecting People with Walking Aid in 2D Laser Range Data, Springer International Publishing, Cham, 2016, pp. 735–747.
[34] F. Galip, M.H. Sharif, M. Caputcu, S. Uyaver, Recognition of objects from laser scanned data points using SVM, in: 2016 First International Conference on Multimedia and Image Processing (ICMIP), 2016, pp. 28–35.
[35] L. Beyer, A. Hermans, B. Leibe, DROW: real-time deep learning-based wheelchair detection in 2-D range data, IEEE Robot. Autom. Lett. 2 (2) (2017) 585–592, https://doi.org/10.1109/LRA.2016.2645131.
[36] L. Spinello, K.O. Arras, R. Triebel, R. Siegwart, A layered approach to people detection in 3D range data, in: Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI'10, AAAI Press, 2010, pp. 1625–1630.
[37] V. Alvarez-Santos, A. Canedo-Rodriguez, R. Iglesias, X. Pardo, C. Regueiro, M. Fernandez-Delgado, Route learning and reproduction in a tour-guide robot, Robot. Auton. Syst. 63 (Part 2) (2015) 206–213, https://doi.org/10.1016/j.robot.2014.07.013.
[38] Z. Rozsa, T. Sziranyi, Obstacle prediction for automated guided vehicles based on point clouds measured by a tilted Lidar sensor, IEEE Trans. Intell. Transp. Syst. 19 (8) (2018) 2708–2720, https://doi.org/10.1109/TITS.2018.2790264.
[39] I. Sipiran, B. Bustos, Harris 3D: a robust extension of the Harris operator for interest point detection on 3D meshes, Vis. Comput. 27 (11) (2011) 963–976, https://doi.org/10.1007/s00371-011-0610-y.
[40] Z. Rozsa, T. Sziranyi, Object detection from a few Lidar scanning planes, IEEE Trans. Intell. Veh. 4 (4) (2019) 548–560, https://doi.org/10.1109/TIV.2019.2938109.
[41] M. Kutila, P. Pyykönen, W. Ritter, O. Sawade, B. Schäufele, Automotive LIDAR sensor development scenarios for harsh weather conditions, in: IEEE ITSC, 2016.
[42] N. Charron, S. Phillips, S. Waslander, De-noising of Lidar point clouds corrupted by snowfall, in: Conference on Computer and Robot Vision (CRV), 2018, pp. 254–261, https://doi.org/10.1109/CRV.2018.00043.
[43] R. Heinzler, F. Piewak, P. Schindler, W. Stork, CNN-based Lidar point cloud de-noising in adverse weather, IEEE Robot. Autom. Lett. 5 (2) (2020) 2514–2521, https://doi.org/10.1109/LRA.2020.2972865.
[44] R. Heinzler, P. Schindler, J. Seekircher, W. Ritter, W. Stork, Weather influence and classification with automotive Lidar sensors, in: 2019 IEEE Intelligent Vehicles Symposium (IV), 2019, pp. 1527–1534.
[45] M. Kutila, P. Pyykönen, H. Holzhüter, M. Colomb, P. Duthon, Automotive Lidar performance verification in fog and rain, in: 2018 IEEE Intelligent Transportation Systems Conference (ITSC), 2018, pp. 1695–1701.
[46] M. Bijelic, T. Gruber, F. Mannan, F. Kraus, W. Ritter, K. Dietmayer, F. Heide, Seeing through fog without seeing fog: deep multimodal sensor fusion in unseen adverse weather, in: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[47] J. Zhang, S. Singh, LOAM: Lidar odometry and mapping in real-time, in: Robotics: Science and Systems Conference, 2014.
[48] R. Dube, A. Cramariuc, D. Dugas, J. Nieto, R. Siegwart, C. Cadena, SegMap: 3D segment mapping using data-driven descriptors, in: Robotics: Science and Systems Conference, 2018.
[49] T. Shan, B. Englot, D. Meyers, W. Wang, C. Ratti, D. Rus, LIO-SAM: tightly-coupled Lidar inertial odometry via smoothing and mapping, in: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2020, pp. 5135–5142.
[50] T. Shan, B. Englot, C. Ratti, D. Rus, LVI-SAM: tightly-coupled Lidar-visual-inertial odometry via smoothing and mapping, in: IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2021, arXiv preprint, arXiv:2104.10831.
[51] D. Rozenberszki, A.L. Majdik, LOL: Lidar-only odometry and localization in 3D point cloud maps, in: 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 4379–4385.
[52] B. Nagy, C. Benedek, Real-time point cloud alignment for vehicle localization in a high resolution 3D map, in: ECCV Workshops, 2018.
[53] B. Nagy, C. Benedek, 3D CNN based phantom object removing from mobile laser scanning data, in: Int'l Joint Conference on Neural Networks (IJCNN), Anchorage, Alaska, USA, 2017, pp. 4429–4435.
[54] Y. Yu, J. Li, H. Guan, C. Wang, Automated detection of three-dimensional cars in mobile laser scanning point clouds using DBM-Hough-Forests, IEEE Trans. Geosci. Remote Sens. 54 (7) (2016) 4130–4142, https://doi.org/10.1109/TGRS.2016.2537830.
[55] H. Zheng, R. Wang, S. Xu, Recognizing street lighting poles from mobile LiDAR data, IEEE Trans. Geosci. Remote Sens. 55 (1) (2017) 407–420, https://doi.org/10.1109/TGRS.2016.2607521.
[56] B. Wu, B. Yu, W. Yue, S. Shu, W. Tan, C. Hu, Y. Huang, J. Wu, H. Liu, A voxel-based method for automated identification and morphological parameters estimation of individual street trees from mobile laser scanning data, Remote Sens. 5 (2) (2013) 584.
[57] S. Papadimitriou, H. Kitagawa, P.B. Gibbons, C. Faloutsos, LOCI: fast outlier detection using the local correlation integral, in: Int'l Conf. on Data Engineering, Los Alamitos, CA, USA, 2003, pp. 315–326.
[58] S. Sotoodeh, Outlier detection in laser scanner point clouds, in: ISPRS Arch. Photogramm. Remote Sens. and Spatial Inf. Sci., vol. XXXVI-5, 2006, pp. 297–302.
[59] J. Köhler, T. Nöll, G. Reis, D. Stricker, Robust outlier removal from point clouds acquired with structured light, in: Eurographics (Short Papers), Cagliari, Italy, 2012, pp. 21–24.
[60] T. Kanzok, F. Süß, L. Linsen, R. Rosenthal, Efficient removal of inconsistencies in large multi-scan point clouds, in: Int'l Conf. in Central Europe on Computer Graphics, Visualization and Computer Vision, Pilsen, Czech Rep., 2013.
[61] J. Gehrung, M. Hebel, M. Arens, U. Stilla, An approach to extract moving objects from MLS data using a volumetric background representation, in: ISPRS Ann. Photogramm. Remote Sens. and Spatial Inf. Sci., vol. IV-1, 2017.
[62] M. Engelcke, D. Rao, D.Z. Wang, C.H. Tong, I. Posner, Vote3Deep: fast object detection in 3D point clouds using efficient convolutional neural networks, in: IEEE International Conference on Robotics and Automation (ICRA), Singapore, 2017, pp. 1355–1361.
[63] J. Huang, S. You, Point cloud labeling using 3D convolutional neural network, in: International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016, pp. 2670–2675.
[64] H.S. Koppula, A. Anand, T. Joachims, A. Saxena, Semantic labeling of 3D point clouds for indoor scenes, in: Int'l Conf. Neural Inf. Processing Systems (NIPS), Granada, Spain, 2011, pp. 244–252.
[65] T. Hackel, J.D. Wegner, K. Schindler, Fast semantic segmentation of 3D point clouds with strongly varying density, ISPRS Ann. Photogramm. Remote Sens. and Spatial Inf. Sci. III-3.
[66] G. Riegler, A.O. Ulusoy, A. Geiger, OctNet: learning deep 3D representations at high resolutions, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 6620–6629.
[67] G. Pang, U. Neumann, 3D point cloud object detection with multi-view convolutional neural network, in: International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016, pp. 585–590.
[68] C. Qi, L. Yi, H. Su, L. Guibas, PointNet++: deep hierarchical feature learning on point sets in a metric space, in: Conf. NIPS, 2017.
[69] W. Wang, R. Yu, Q. Huang, U. Neumann, SGPN: similarity group proposal network for 3D point cloud instance segmentation, in: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, 2018, pp. 2569–2578.
[70] H. Su, V. Jampani, D. Sun, S. Maji, V. Kalogerakis, M.-H. Yang, J. Kautz, SPLATNet: sparse lattice networks for point cloud processing, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 2530–2539.
[71] W. Xiao, B. Vallet, M. Brédif, N. Paparoditis, Street environment change detection from mobile laser scanning point clouds, ISPRS J. Photogramm. Remote Sens. 107 (2015) 38–49.
[72] A. Schlichting, C. Brenner, Vehicle localization by Lidar point correlation improved by change detection, in: ISPRS International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. XLI-B1, 2016, pp. 703–710.
[73] R. Qin, A. Gruen, 3D change detection at street level using mobile laser scanning point clouds and terrestrial images, ISPRS J. Photogramm. Remote Sens. 90 (2014) 23–35.
[74] J.P. Underwood, D. Gillsjö, T. Bailey, V. Vlaskine, Explicit 3D change detection using ray-tracing in spherical coordinates, in: IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, 2013, pp. 4735–4741.
[75] K. Liu, J. Boehm, C. Alis, Change detection of mobile LIDAR data using cloud computing, in: ISPRS International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. XLI-B3, 2016, pp. 309–313.
[76] Z. Rozsa, M. Golarits, T. Sziranyi, Localization of map changes by exploiting SLAM residuals, in: J. Blanc-Talon, P. Delmas, W. Philips, D. Popescu, P. Scheunders (Eds.), Advanced Concepts for Intelligent Vision Systems, Springer International Publishing, Cham, 2020, pp. 312–324.
[77] B. Gálai, C. Benedek, Change detection in urban streets by a real time Lidar scanner and MLS reference data, in: International Conference on Image Analysis and Recognition (ICIAR), in: Lecture Notes in Computer Science, vol. 10317, Springer, Montreal, Canada, 2017, pp. 210–220.
[78] G. Pandey, J.R. McBride, S. Savarese, R.M. Eustice, Automatic extrinsic calibration of vision and Lidar by maximizing mutual information, J. Field Robot. 32 (2015) 696–722.
[79] Z. Pusztai, I. Eichhardt, L. Hajder, Accurate calibration of multi-Lidar-multi-camera systems, Sensors 18 (2018) 119–152.
[80] G. Iyer, R.K. Ram, J.K. Murthy, K.M. Krishna, CalibNet: geometrically supervised extrinsic calibration using 3D spatial transformer networks, in: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.
[81] A. Geiger, F. Moosmann, O. Car, B. Schuster, Automatic camera and range sensor calibration using a single shot, in: IEEE International Conference on Robotics and Automation, 2012, pp. 3936–3943.
[82] H. Alismail, L. Baker, B. Browning, Automatic calibration of a range sensor and camera system, in: 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission, 2012, pp. 286–292.
[83] Y. Park, S.M. Yun, C.S. Won, K. Cho, K. Um, S. Sim, Calibration between color camera and 3D LIDAR instruments with a polygonal planar board, Sensors (2014).
[84] M. Velas, M. Spanel, Z. Materna, A. Herout, Calibration of RGB camera with Velodyne LiDAR, 2014.
[85] S. Rodriguez-Florez, V. Frémont, P. Bonnifait, Extrinsic calibration between a multi-layer Lidar and a camera, in: IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, 2008, pp. 214–219.
[86] Y.C. Shiu, S. Ahmad, Calibration of wrist-mounted robotic sensors by solving homogeneous transform equations of the form AX=XB, IEEE Trans. Robot. Autom. 5 (1) (1989) 16–29.
[87] K. Huang, C. Stachniss, Extrinsic multi-sensor calibration for mobile robots using the Gauss Helmert model, in: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, pp. 1490–1496.
[88] K.H. Strobl, G. Hirzinger, Optimal hand-eye calibration, in: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2006, pp. 4647–4653.
[89] P. Moghadam, M. Bosse, R. Zlot, Line-based extrinsic calibration of range and image sensors, in: IEEE International Conference on Robotics and Automation, 2013, pp. 3685–3691.
[90] C. Shi, K. Huang, Q. Yu, J. Xiao, H. Lu, C. Xie, Extrinsic calibration and odometry for camera-Lidar systems, IEEE Access 7 (2019) 120106–120116.
[91] O. Józsa, A. Börcs, C. Benedek, Towards 4D virtual city reconstruction from Lidar point cloud sequences, in: ISPRS Workshop on 3D Virtual City Modeling, Regina, Canada, in: ISPRS Annals Photogram. Rem. Sens. and Spat. Inf. Sci., vol. II-3/W1, 2013, pp. 15–20.
[92] R. Wang, F.P. Ferrie, J. Macfarlane, Automatic registration of mobile Lidar and spherical panoramas, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012, pp. 33–40.
[93] A. Napier, P. Corke, P. Newman, Cross-calibration of push-broom 2D LIDARs and cameras in natural scenes, in: IEEE International Conference on Robotics and Automation (ICRA), 2013, pp. 3679–3684.
[94] D. Scaramuzza, A. Harati, R. Siegwart, Extrinsic self calibration of a camera and a 3D laser range finder from natural scenes, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2007, pp. 4164–4169.
[95] B. Nagy, C. Benedek, On-the-fly camera and Lidar calibration, Remote Sens. 12 (7) (2020), https://doi.org/10.3390/rs12071137.
[96] N. Schneider, F. Piewak, C. Stiller, U. Franke, RegNet: multimodal sensor registration using deep neural networks, in: 2017 IEEE Intelligent Vehicles Symposium (IV), 2017, pp. 1803–1810.
[97] D. Barath, J. Matas, Graph-cut RANSAC: local optimization on spatially coherent structures, IEEE Trans. Pattern Anal. Mach. Intell. (2021), https://doi.org/10.1109/TPAMI.2021.3071812.
[98] X. Li, C. Liu, Z. Wang, X. Xie, D. Li, L. Xu, Airborne LiDAR: state-of-the-art of system design, technology and application, Meas. Sci. Technol. 32 (3) (2020) 032002, https://doi.org/10.1088/1361-6501/abc867.
[99] D.V. Nam, K. Gon-Woo, Solid-state LiDAR based-SLAM: a concise review and application, in: 2021 IEEE International Conference on Big Data and Smart Computing (BigComp), 2021, pp. 302–305.

Csaba Benedek received the M.Sc. degree in computer sciences from the Budapest University of Technology and Economics (BME) in 2004, and the Ph.D. degree in image processing from Péter Pázmány Catholic University, Budapest, in 2008. From 2008 to 2009, he was a Post-Doctoral Researcher with INRIA, Sophia-Antipolis, France. He received the D.Sci. degree in 2021 from the Hungarian Academy of Sciences, Budapest. He has been the manager of various national and international research projects in recent years. He is currently a Research Advisor with the Machine Perception Research Laboratory, Institute for Computer Science and Control (SZTAKI), and a Habilitated Associate Professor with Péter Pázmány Catholic University. His research interests include Bayesian image and point cloud segmentation, object extraction, change detection, and GIS data analysis. He is currently the president (2019-2023) of the Hungarian Image Processing and Pattern Recognition Society (KEPAF).

Andras Majdik received his Ph.D. degree for his contribution to the simultaneous localization and mapping (SLAM) of mobile robots. While working on his Ph.D. he was a visiting scholar at the Robotics, Perception and Real Time Group, University of Zaragoza, Spain, and the 3D Vision and Mobile Robotics Research Group, Budapest University of Technology and Economics, Hungary. Next, he was a postdoc at the Robotics and Perception Group, University of Zurich, Switzerland. Currently, he is a senior research fellow with the Machine Perception Research Laboratory, Institute for Computer Science and Control (SZTAKI). His research interests include vision and LIDAR-based localization and mapping of ground and aerial vehicles, and the spatial artificial intelligence of autonomous systems.

Balázs Nagy received his Ph.D. degree in computer science from the Péter Pázmány Catholic University in 2021. He is a research fellow at the Machine Perception Research Laboratory of the Institute for Computer Science and Control (SZTAKI). His research interests include sensor fusion, computer vision, deep learning, robotics and autonomous vehicles.

Zoltan Rozsa received his Ph.D. degree in vehicle and transportation engineering from the Budapest University of Technology and Economics in 2020. He is currently with the Faculty of Transportation Engineering and Vehicle Engineering of the Budapest University of Technology and Economics, and a research fellow at the Machine Perception Research Laboratory of the Institute for Computer Science and Control (SZTAKI). His research interests include automated guided vehicles, machine vision, 3D recognition, and reconstruction.

Tamas Sziranyi received his Ph.D. and D.Sci. degrees in 1991 and 2001 from the Hungarian Academy of Sciences, Budapest. He was appointed to a Full Professor position in 2001 at Pannon University, Veszprem, Hungary, and, in 2004, at the Peter Pazmany Catholic University, Budapest. Presently he is a full professor at the Budapest University of Technology and Economics. He has been a research scientist at the Institute for Computer Science and Control (SZTAKI) since 1992, where he has led the Machine Perception Research Laboratory since 2006. His research activities include machine perception, pattern recognition, texture and motion segmentation, Markov Random Fields and stochastic optimization, remote sensing, surveillance, intelligent networked sensor systems, graph-based clustering, and digital film restoration. He has participated in several prestigious international (ESA, EDA, FP6, FP7, OTKA) projects with his research laboratory. Dr. Sziranyi was the founder and past president (1997 to 2002) of the Hungarian Image Processing and Pattern Recognition Society. He was an Associate Editor of IEEE Trans. Image Processing (2003-2009), and has been an AE of Digital Signal Processing (Elsevier) since 2012. He was honored with the Master Professor award in 2001, the Szechenyi professorship, and the Pro Scientia (Veszprem) award in 2011. He has been a Fellow of both the Int. Assoc. Pattern Recognition (IAPR) and the Hungarian Academy of Engineering since 2008. He has more than 300 publications, including 50 in major scientific journals, and several international patents.