POINT CLOUD PROCESSING WITH HIERARCHICAL GAUSSIAN MIXTURES
Ben Eckart, NVIDIA Research, Learning and Perception Group, 3/20/2019
POINT CLOUD PROCESSING CHALLENGES
Points are non-differentiable, non-probabilistic
Large amounts of often noisy data
Often spatially redundant, with wide-ranging density variance
PREVIOUS APPROACHES
What have people done before?
Discrete Approaches
Voxel Grids/Lists, Octrees, TSDFs
Though efficient, they inherit the same non-differentiable, non-probabilistic problems as point clouds
(Figure: OctoMap)
PREVIOUS APPROACHES
What have people done before?
Continuous Approaches
Gaussian Mixture Models, Gaussian Processes
Though theoretically attractive, in practice they tend to be too slow for many applications
Talk Overview
• Background
– Theory of generative modeling for point clouds
• Single-Layer Model (GMMs)
– GPU-Accelerated Construction Algorithm
– Benefits: Compact and Data-Parallel
– Limitations: Scaling with model size, lack of memory coherence
• Hierarchical Models (HGMMs)
– GPU-Accelerated Construction Algorithm
– Benefits: Fast and Parallelizable on GPU
– Application: Registration
STATISTICAL / GENERATIVE MODELS
Interpret point cloud data (PCD) as an i.i.d. sampling of some unknown latent spatial probability distribution
Generative property: the full joint probability space is represented
Modeling as an MLE Optimization
Parametric Model as a Modified GMM
Interpret point cloud data as an i.i.d. sampling from a small number (J ≪ N) of Gaussian and uniform distributions.
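As a sketch of this formulation (the function name and the uniform-component parameterization here are illustrative assumptions, not the talk's exact model): each point is explained either by one of J Gaussians or by a uniform "noise" component over a bounding volume.

```python
import numpy as np

def mixture_loglik(points, means, covs, weights, uniform_w, volume):
    """Log-likelihood of points under J Gaussians plus a uniform
    outlier component over a bounding volume (hypothetical sketch)."""
    N, d = points.shape
    J = means.shape[0]
    comp = np.empty((N, J + 1))
    for j in range(J):
        diff = points - means[j]                        # (N, d) residuals
        inv = np.linalg.inv(covs[j])
        maha = np.einsum('nd,de,ne->n', diff, inv, diff)
        logdet = np.linalg.slogdet(covs[j])[1]
        comp[:, j] = np.log(weights[j]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
    comp[:, J] = np.log(uniform_w) - np.log(volume)     # uniform "noise" term
    # log-sum-exp over the J+1 components, summed over points
    m = comp.max(axis=1, keepdims=True)
    return float((m.squeeze(1) + np.log(np.exp(comp - m).sum(axis=1))).sum())
```

The uniform component gives outliers a floor probability, so distant noise points do not drag the Gaussian parameters toward them.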
GMM for Point Clouds: Intuition
Point samples representing pieces of the same local geometry can be aggregated into clusters, with the local geometry encoded in the covariance of each cluster.
SOLVING FOR THE MLE GMM PARAMETERS
Typically done via the Expectation Maximization (EM) Algorithm
(Diagram: starting from 𝚯ᵢₙᵢₜ, the E step and M step alternate on the point cloud, updating 𝚯 until convergence at 𝚯_final.)
E Step: A Single Point
For a single point zᵢ, the E step evaluates expectations against all J clusters: O(J) work per point, and O(N) points overall.
The E step is a "probabilistic generalization of K-means clustering."
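The E step can be sketched in vectorized NumPy, computing for each point the posterior over its J possible cluster associations (an illustrative CPU sketch, not the talk's CUDA code):

```python
import numpy as np

def e_step(points, means, covs, weights):
    """gamma[i, j] = p(cluster j | point i): the soft assignment that
    generalizes K-means' hard nearest-centroid rule."""
    N, d = points.shape
    J = means.shape[0]
    logp = np.empty((N, J))
    for j in range(J):
        diff = points - means[j]
        inv = np.linalg.inv(covs[j])
        maha = np.einsum('nd,de,ne->n', diff, inv, diff)
        logdet = np.linalg.slogdet(covs[j])[1]
        logp[:, j] = np.log(weights[j]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
    logp -= logp.max(axis=1, keepdims=True)   # stabilize before exponentiating
    gamma = np.exp(logp)
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma
```

Each row of `gamma` is independent of the others, which is exactly the data parallelism the GPU exploits: one thread (or warp) per point.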
GPU Data Parallelism
GMM Model Limitations
PARALLEL PARTITIONING USING CUDA
Given each point's max expectation and associated cluster index, we can "invert" this index using parallel scans to group together the point IDs having the same partition number:
[0 0 1 0 1 1 1 2 0 2 2 2] ➔ [[0 1 3 8] [2 4 5 6] [7 9 10 11]]  (Clusters 1, 2, 3)
Now we can run a 2D CUDA kernel where
Dimension 1: index into the original point cloud
Dimension 2: cluster of the parent
e.g., 3 clusters, 12 points, 2 threads/threadblock ➔ grid size of (2, 3)
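The label "inversion" can be sketched on the CPU with a stable sort plus a prefix sum, which yields the same grouping that the parallel scans compute on the GPU (illustrative NumPy, not the CUDA implementation):

```python
import numpy as np

def invert_labels(labels, num_clusters):
    """Group point IDs by cluster label. On the GPU this uses parallel
    scans; a stable sort plus cumulative counts gives the same result."""
    labels = np.asarray(labels)
    order = np.argsort(labels, kind='stable')           # point IDs grouped by label
    counts = np.bincount(labels, minlength=num_clusters)
    splits = np.cumsum(counts)[:-1]                     # prefix sum = scan step
    return [list(map(int, g)) for g in np.split(order, splits)]
```

The stable sort preserves the original point ordering within each cluster, matching the example on the slide.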
HGMM COMPLEXITY
Even though we now have 64 clusters, we only need to query 8 clusters for each point (avoiding the computation of all N×J (sparse) expectations).
Due to the 2D CUDA grid and indexing structure, segmenting the points into 64 clusters has the exact same complexity/speed as the original "simple" J=8 GMM.
Thus, we can keep increasing the complexity of the model eightfold while incurring only a linear time penalty.
HGMM ALGORITHM
Small EM algorithms (8 clusters at a time) are recursively performed on increasingly smaller partitions of the point cloud data
E Step: Associate points to clusters
M Step: Update mixture means, covariances, and weights
Partition Step: Before each recursion step, new point partitions are determined by the maximum-likelihood point-cluster associations from the last E Step
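The recursion can be sketched as follows. All names are hypothetical, and `fit_small_gmm` is a deliberately simplified hard-EM (K-means-style) stand-in for the talk's 8-cluster GMM fit:

```python
import numpy as np

def fit_small_gmm(points, J=8, iters=10, rng=None):
    """Toy hard-EM stand-in for the 8-cluster fit: returns means,
    covariances, and each point's max-likelihood cluster index."""
    rng = rng or np.random.default_rng(0)
    means = points[rng.choice(len(points), size=min(J, len(points)), replace=False)]
    for _ in range(iters):
        d2 = ((points[:, None, :] - means[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)                          # E step (hard assignment)
        for j in range(len(means)):                    # M step
            if (assign == j).any():
                means[j] = points[assign == j].mean(0)
    covs = [np.cov(points[assign == j].T) if (assign == j).sum() > 1
            else np.eye(points.shape[1]) for j in range(len(means))]
    return means, covs, assign

def build_hgmm(points, J=8, max_level=3, min_points=16):
    """Partition step: each cluster's points become the data for a
    child J=8 GMM at the next level (assumed sketch of the recursion)."""
    means, covs, assign = fit_small_gmm(points, J)
    node = {'means': means, 'covs': covs, 'children': []}
    if max_level > 1:
        for j in range(len(means)):
            sub = points[assign == j]
            if len(sub) >= min_points:
                node['children'].append(build_hgmm(sub, J, max_level - 1, min_points))
    return node
```

Each recursive call touches only its own partition of the points, which is why the levels can be processed with the same per-point cost as a single J=8 fit.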
HGMM DATA STRUCTURE
(Diagram: a tree of small GMMs. A "Level 2" GMM with J=8 sits at the root; each of its 8 clusters spawns a child J=8 GMM at "Level 3", and each of those spawns another at "Level 4". The structure combines the efficiency benefits of hierarchical structures like the octree with the theoretical benefits of a probabilistic generative model.)
E Step Performance
COMPACTNESS VS FIDELITY
(Plot: Reconstruction Error (PSNR) vs Model Size (kB))
MODELING LARGE POINT CLOUDS
Endeavor snapshots: ~80 GB of point cloud data each
HGMM Level 6: <12 MB
Volume created from stochastically sampled Marching Cubes
Visualization is real-time: ~20 fps on a Titan X
ENDEAVOR DATA: BILLIONS OF POINTS
APPLICATION: RIGID REGISTRATION
Registration as EM with HGMM
MLE over the Space of Rotations and Translations
Goal: maximize the data likelihood over T, given some probability model θ
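One way to write this objective in symbols (a sketch consistent with the surrounding slides, not necessarily the talk's exact notation), with $T$ ranging over rigid transformations:

```latex
\hat{T} \;=\; \arg\max_{T \in SE(3)} \; \sum_{i=1}^{N} \ln p\big(T(z_i) \mid \theta\big)
```

Because $p(\cdot \mid \theta)$ is a mixture, the log of a sum does not decouple over components; EM handles this by alternating soft point-cluster association (E step) with maximization over $T$ (M step).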
Outdoor Urban Velodyne Data
• Velodyne VLP-16
– ~15k pts/frame
– ~10 frames/sec
• Frame-to-frame model-building and registration with overlap estimation
HGMM-Based Registration
• Average frame-to-frame error: 0.0960
Robust Point-to-Plane ICP
• Average frame-to-frame error: 0.1519
• Best result obtained with libpointmatcher
Speed vs Accuracy Trade-Off
Test: random transformations of point cloud pairs while varying the subsampling rate.
DRIVEWORKS (Future Release)
With Velodyne HDL-64E:
~300 FPS on Titan Xp
~30 FPS on Xavier
DNN-BASED STEREO DEPTH MAPS
FINAL REMARKS
Stanford Lounge Dataset (Kinect)
(Side-by-side reconstructions: ICP-based vs. proposed)
Noise Handling
• Test: random (uniform) noise injected at increasing amounts
• Result: mixture components "stick" to geometrically coherent, dense areas, disregarding areas of noise
SAMPLING FOR PROBABILISTIC OCCUPANCY
p̂ = L_Σ p + μ,  ∀ (μ, Σ) ∈ Θ
where L_Σ is the Cholesky factor of Σ (L_Σ L_Σᵀ = Σ) and p ∼ N(0, I)
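The sampling rule above can be sketched directly: pick a mixture component by weight, then transform a standard-normal draw p by the Cholesky factor L_Σ of that component's covariance (illustrative NumPy; function name is an assumption):

```python
import numpy as np

def sample_gmm(means, covs, weights, n, rng=None):
    """Draw n points via p_hat = L p + mu, with L the Cholesky factor
    of Sigma and p ~ N(0, I), after choosing a component by weight."""
    rng = rng or np.random.default_rng(0)
    d = means.shape[1]
    idx = rng.choice(len(means), size=n, p=weights)   # component per sample
    out = np.empty((n, d))
    for i, j in enumerate(idx):
        L = np.linalg.cholesky(covs[j])
        out[i] = L @ rng.standard_normal(d) + means[j]
    return out
```

Sampling many such points and rasterizing them is one way to obtain a probabilistic occupancy estimate from the model.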
MESHING UNDER NOISE
ADAPTIVE MULTI-SCALE
MULTI-SCALE MODELING
Multilevel cross-sections can be adaptively chosen for robustness
E Step: Parallelized Tree Search
Adaptive thresholding finds the most appropriate scale at which to associate point data with the point cloud model.
Point-model associations are found through a parallelized adaptive tree search in CUDA.
A Complexity(·) heuristic decides when to stop descending (its particular definition is omitted here); other suitable heuristics are possible.
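A serial sketch of the adaptive descent (the complexity heuristic, node layout, and nearest-mean stand-in for the full Gaussian likelihood are all hypothetical assumptions):

```python
import numpy as np

def spread(cov):
    """Hypothetical complexity heuristic: largest covariance eigenvalue."""
    return float(np.linalg.eigvalsh(cov)[-1])

def associate(point, node, tau=0.1):
    """Adaptive tree search: at each level pick the most likely cluster
    (nearest mean here), descending only while the chosen cluster is
    still too 'complex' for threshold tau. Returns the index path."""
    path = []
    while True:
        j = int(np.argmin([np.sum((point - m) ** 2) for m in node['means']]))
        path.append(j)
        child = node['children'].get(j)   # children keyed by cluster index
        if child is None or spread(node['covs'][j]) <= tau:
            return path
        node = child
```

On the GPU, one thread performs this descent per point; because the tree is shallow (8-way branching), the search stays cheap and divergence-friendly.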
M-Step: Mahalanobis Estimation
We seek the transformation that maximizes the expected joint log-likelihood of our data and latent associations with respect to the posterior over our current association estimates.
The resulting form (1) is a weighted sum of squared Mahalanobis distances, which is further reduced to (2) by writing it in terms of sufficient statistics M_j.
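In standard EM notation, with γ_ij denoting the E-step posteriors, such an objective has the following form (a hedged sketch; the slide's exact equations (1) and (2) are not reproduced here):

```latex
% Expected complete-data log-likelihood over rigid T:
Q(T) \;=\; \text{const} \;-\; \tfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{J}
  \gamma_{ij}\, \big(T(z_i) - \mu_j\big)^{\!\top} \Sigma_j^{-1} \big(T(z_i) - \mu_j\big)
```

Expanding the quadratic shows that $Q$ depends on the points only through per-cluster moments such as $\sum_i \gamma_{ij}$, $\sum_i \gamma_{ij} z_i$, and $\sum_i \gamma_{ij} z_i z_i^{\top}$, so the M step can optimize over $T$ without revisiting individual points.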