Openvdb: Ken Museth - Nvidia

OPENVDB
KEN MUSETH | NVIDIA
THE PREMIER CONFERENCE & EXHIBITION IN COMPUTER GRAPHICS & INTERACTIVE TECHNIQUES
© 2021 SIGGRAPH. ALL RIGHTS RESERVED.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on
the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Copyright is held by the owner/author(s).
SIGGRAPH '21 Courses, August 09-13, 2021, Virtual Event, USA
ACM 978-1-4503-8361-5/21/08.
10.1145/3450508.3464577
Schedule OpenVDB
• Intro to OpenVDB and NanoVDB

- Ken Museth (NVIDIA, TSC Chair)
• Adoption of NanoVDB in Houdini
- Jeff Lait (SideFX, TSC)
• OpenVDB in Production
- Jeff Budsberg (DWA)
• OpenVDB AX
- Richard Jones (DNeg)
• Multithreading in OpenVDB
- Dan Bailey (ILM, TSC)
Properties of VDB OpenVDB
7897 x 1504 x 5774
• Unbounded
• Sparse/adaptive
• Fast access
• Compact
VDB Highlights OpenVDB
• Unbounded: Virtually infinite index domain

• Compact: Small memory and disk footprints
• Fast random access: O(1) lookup, insertion and deletion
• Fast sequential access: O(1) voxel and node iterators
• Fast I/O: Custom codecs and formats
• Built-in BVH: Native hierarchical acceleration structure
• General: arbitrary value type and topology
Terminology OpenVDB
• Voxel [ = Volume + Pixel ]

- Smallest addressable unit of index space
- Resides at the leaf node level
• Tile
- Larger constant domain of index space
- Resides at the upper (non-leaf) tree levels
• Active state
- All values (voxels and tiles) have a binary state
- Interpretation of state is application-defined
Motivation OpenVDB
VDB
Compact Fast
2D y
outside
inside
Tall
1D tree
Dense representation
Adaptive x
Shallow Tree O(1) Access
VDB topology
Root node
Active bit-mask
Child bit-mask
Internal nodes
Internal nodes Block
Leaf nodes
Sparse representation
Encode
Decode
Values
VDB Tree OpenVDB
Root node Tile values with

(unbounded) active/inactive states
Active Mask
Child Mask
Internal Node 1
Tile values / Child pointers
Active Mask
Child Mask
Internal Node 2
Tile values / Child pointers
Active Mask
Leaf Node Voxels
[K. Museth, SIGGRAPH / ACM TOG, 2013]

Random Access: Top-Down OpenVDB
tree.getValue(x,y,z)=?
Random Access: Bottom-Up OpenVDB
accessor .getValue(x,y,z)=?
{x & ~7, y & ~7, z & ~7}

Masks out 3 LSB!
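The mask expression above can be tried in isolation. The following is a minimal standalone sketch in plain C++ with no OpenVDB dependency; `Coord3`, `leafOrigin` and `sameLeaf` are illustrative names, not library API. It assumes 8x8x8 leaf nodes and two's-complement integers.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-in for a 3D integer index coordinate (not openvdb::Coord).
struct Coord3 { int32_t x, y, z; };

// Clear the 3 least-significant bits of each component. For an 8x8x8 leaf
// this snaps a voxel coordinate to the origin (minimum corner) of its leaf.
// With two's-complement integers this also works for negative coordinates.
inline Coord3 leafOrigin(const Coord3& c)
{
    return { c.x & ~7, c.y & ~7, c.z & ~7 };
}

// Two coordinates lie in the same leaf exactly when their origins match.
// A cached (bottom-up) lookup can make this cheap comparison first.
inline bool sameLeaf(const Coord3& a, const Coord3& b)
{
    const Coord3 oa = leafOrigin(a), ob = leafOrigin(b);
    return oa.x == ob.x && oa.y == ob.y && oa.z == ob.z;
}

// Usage: a few spot checks.
inline void leafOriginExample()
{
    assert(leafOrigin({13, 7, 8}).x == 8);
    assert(leafOrigin({-1, 0, 0}).x == -8);
    assert(sameLeaf({8, 8, 8}, {15, 15, 15}));
}
```

Because the key is a pure bit operation on the coordinate, the cache test costs a few integer instructions regardless of the size of the tree.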
ValueAccessor OpenVDB
• Tiny perfect hash table
- Small footprint
- Fast
• Not thread-safe due to caching
- Mutable, even for read access
- Construct one per thread
Always use it, but …
• Optimal for spatially coherent access
- Avoid truly random access
- Re-use as much as possible
- Never use it for one access only
• Exceptions
- Spatial coherence or reuse is impossible
- Reading from a grid while deleting its nodes
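To make the caching behavior concrete, here is a toy sketch in plain C++. It is not the OpenVDB ValueAccessor and does not use the VDB tree; `ToyGrid` and `ToyAccessor` are hypothetical names, and an ordered map stands in for the expensive top-down lookup. The accessor remembers the last 8x8x8 block it touched, so spatially coherent reads skip the lookup.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>

// Toy sparse grid: 8x8x8 blocks stored in a map keyed by block origin.
// This only mimics the cost of a tree lookup; it is not the VDB structure.
struct ToyGrid
{
    using Key  = std::tuple<int, int, int>;
    using Leaf = std::array<float, 512>;
    std::map<Key, Leaf> leaves;
    float background = 0.0f;
};

// Toy accessor that caches the last block it touched. Even a read mutates
// the cache, which is why one instance must not be shared between threads.
class ToyAccessor
{
public:
    explicit ToyAccessor(const ToyGrid& grid) : mGrid(grid) {}

    float getValue(int x, int y, int z)
    {
        const ToyGrid::Key key(x & ~7, y & ~7, z & ~7);
        if (!mCached || key != mKey) {          // cache miss: full lookup
            ++mLookups;
            const auto it = mGrid.leaves.find(key);
            mLeaf = (it != mGrid.leaves.end()) ? &it->second : nullptr;
            mKey = key;
            mCached = true;
        }
        if (mLeaf == nullptr) return mGrid.background;
        // Offset inside the block (x-major layout chosen for this sketch).
        const std::size_t offset =
            static_cast<std::size_t>(((x & 7) << 6) | ((y & 7) << 3) | (z & 7));
        assert(offset < 512);
        return (*mLeaf)[offset];
    }

    // Number of full map lookups performed so far.
    std::size_t lookups() const { return mLookups; }

private:
    const ToyGrid&       mGrid;
    ToyGrid::Key         mKey{};
    const ToyGrid::Leaf* mLeaf = nullptr;
    bool                 mCached = false;
    std::size_t          mLookups = 0;
};
```

Reading eight neighboring voxels through one `ToyAccessor` costs a single map lookup, while alternating between distant blocks costs one lookup per read. The cached pointer also dangles if the block is erased from the map, which mirrors the exception about reading from a grid while deleting its nodes.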
OpenVDB Tools OpenVDB
More than 100 tools and counting...
• Conversion
• Filtering and Morphology
• Mathematical Transformation
• Combination
• Segmentation
• Finite Differencing
• Poisson Solver
• Interpolators
• Multi-grid tools
• Scattering and gathering
• Geometric Transformation
• Level Set Processing
• Bitwise Boolean Operations
• Bitwise Morphology Operations
• Sparsity Management and Compression
• Ray-marching and Rendering
• Point advection and NN-search
• Statistics and diagnostic tools
• Potential Flow
• Python and JIT combination (AX)
Conversion From Polygons OpenVDB
Polygonal Model Level Set Volume
Resolution: 1051 x 208 x 863

Memory: 172 MB
Multi-threaded and fast ( <100 ms )
Internal self-intersections
Large Toolbox OpenVDB
Convolution &
Morphology Operations Interface Tracking
Smoothing
Constructive Solid Geometry Surface Properties Tools for Fluid Simulation

Level Set Fracture OpenVDB
Ray-tracing by hierarchical DDAs OpenVDB
Direct ray-tracing of level set in ~300 milliseconds
Volume rendering of density in ~1 second
Pixel resolution: 1920x1080
Voxel resolution: 2619x511x2149
Active voxel count: ~200 million
CPU w. 8 cores
Support of Point Data OpenVDB
• PointPartitioner
• PointIndexGrid
• PointDataGrid
Multi-Resolution Grids OpenVDB
Conversion To Polygons OpenVDB
• Adaptive, using local curvature
• Supports region masking and adaptivity field
Level Set: 1051 x 208 x 863 resolution, 172 MB
Uniform Mesh: 4.8M points + 4.8M quads, 129 MB
Adaptive Mesh: 1.7M points + 1.5M quads + 335K triangles, 46 MB
Houdini SOP Nodes OpenVDB
Advect Density, Advect Level Set, Advect Points,

Analysis, Clip, Combine, Convert, Create, Diagnostics,
Fill, Filter, Filter Level Set, Fracture, From Particles, From
Polygons, Metadata, Morph Level Set, Noise, Occlusion
Mask, Platonic, Prune, Rasterize Points, Ray, Read,
Rebuild Level Set, Resample, Sample Points, Scatter,
Sort Points, To Polygons, To Spheres, Transform, Vector
Merge, Vector Split, Visualize, Write
arnold
STOKE MX
3delight
VFX Reference Platform

Semantic Versioning OpenVDB
major.minor.patch
• Patch:
- No change to API, file format or ABI of Grid or its member classes
• Minor:
- Change to API but not Grid ABI; backward-compatible file format
• Major:
- Change to ABI of Grid or non-backward-compatible file format
• Guaranteed Grid and file compatibility with fixed major version
• One major release per year
• Support last three versions on VFX Platform
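The versioning policy above can be written as two small checks. This is a sketch under stated assumptions: `Version`, `gridCompatible` and `apiCompatible` are hypothetical helpers that only restate the rules on the slide, not functions provided by OpenVDB.

```cpp
#include <cassert>

// Hypothetical version triple (major.minor.patch).
struct Version { int majorNum, minorNum, patchNum; };

// Per the policy above, Grid ABI and file format are guaranteed compatible
// for a fixed major version.
inline bool gridCompatible(const Version& a, const Version& b)
{
    return a.majorNum == b.majorNum;
}

// The API may change between minor releases, so API stability is only
// assumed here when both major and minor match (patch releases do not
// change the API).
inline bool apiCompatible(const Version& a, const Version& b)
{
    return a.majorNum == b.majorNum && a.minorNum == b.minorNum;
}

// Usage: 8.0.1 and 8.1.0 share Grid ABI and file format, but not
// necessarily the API.
inline void versionExample()
{
    assert(gridCompatible({8, 0, 1}, {8, 1, 0}));
    assert(!apiCompatible({8, 0, 1}, {8, 1, 0}));
}
```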
A Few Highlights in v8 OpenVDB
• Improvements to cmake build system

• Reduce dependencies on boost and OpenEXR
• Replacing CppUnit with gtest
• FastSweeping tools and Houdini nodes
• Improved dilation
• Filterers handle active tiles
• Sharpening tools
• Dynamic node manager
• AX JIT scripting
• NanoVDB
OpenVDB
NanoVDB: Data Structure NanoVDB
Linear pointer-less VDB tree
Fast copy between host and device
O(1) random and sequential access
Extra meta data (bbox & min/max)
Multiple voxel types: float, Vec3, int, bit
Points with arbitrary attributes
Run-time compression (NEW)
OpenVDB: allocation on insertion
NanoVDB: allocation on construction

NanoVDB: Dependencies NanoVDB
OpenVDB NanoVDB
Implemented in both C++11 and C99

Self-contained (single header)
Optional dependency on OpenVDB
Explicit memory alignment (32byte)
CUDA, DX12, OptiX, OpenGL, OpenCL, Vulkan, GLSL
CPU fallback
NanoVDB: Features NanoVDB
Separate file format

Converters to/from OpenVDB
OpenVDB NanoVDB
Standalone GridBuilder
Checksum and validation tools
0,1,2,3 order interpolation
1st-5th order gradients
Surface curvatures
Interactive ray-tracer
Ray-marching (HDDA)
Ray-marching benchmark NanoVDB
Setup
CPU: 2 x Xeon E5-2696 (22 cores)
GPU: 1 x GeForce RTX 2080 Ti
Volume: 2023 x 911 x 893
Pixels: 2023 x 911
Rays: 1,842,953
OpenVDB: 266 MB
NanoVDB: 259 MB
OpenVDB -> NanoVDB: 27 ms
NanoVDB: Vanilla C++11
Ray-tracer: CUDA 10.2
Timings
CPU (1) - HDDA: 19,833.9 ms
CPU (44) - HDDA: 559.6 ms = 35 x CPU (1)
GPU - HDDA: 16.8 ms = 33 x CPU (44)
CPU (44) + HDDA: 15.5 ms = 36 x CPU (44) - HDDA
GPU + HDDA: 1.5 ms = 11 x CPU (44) + HDDA
1.2 giga-rays per second
Production benchmarks NanoVDB
Arnold v 6.2.1 | Autodesk

Run-time compression NanoVDB
16 Bit floats
32 bit float: 266 MB, 442 FPS

RTX 8000, CUDA 10.2

8 Bit floats

RTX 8000, CUDA 10.2
4 Bit floats

RTX 8000, CUDA 10.2 4 bit float: 54 MB, 525 FPS
adaptive bit rate
32 bit float: 266 MB, 440 FPS 2 bits: 13.0%

4 bit float: 54 MB, 525 FPS 5.3 bits per value, 64 MB, 440 FPS
RTX 8000, CUDA 10.2
4 X memory reduction
NanoVDB NanoVDB
RAY-TRACING LEVEL SET WITH 2.2 BILLION VOXELS: 3662 x 3697 x 3684
VOLUME RENDERING DENSITY WITH 1.5 BILLION VOXELS: 5142 x 1351 x 2449

NanoVDB Viewer NanoVDB
Wil Braithwaite
NanoVDB in FLOW NanoVDB
1276 x 1519 x 1160 Andrew Reidmeyer

Blender NanoVDB
Houdini 18.5: Vellum NanoVDB
25 FPS (sim + collision)

SideFX
Cloth: 83K triangles
Houdini 18.5: Pyro GPU solver NanoVDB
Sadjad Rabiee
git clone https://github.com/AcademySoftwareFoundation/openvdb.git

git checkout feature/nanovdb
Online Material OpenVDB
• http://www.openvdb.org
- Course slides from 2013 - 2021
- Technical papers
- Coding cookbook and doxygen
- vdb assets (points and voxels)
- Houdini hip file with examples
- Build instructions, license, CLA, TSC minutes
• https://github.com/AcademySoftwareFoundation/openvdb
NANOVDB
CASE STUDY IN HOUDINI
NANOVDB OVERVIEW
• What is NanoVDB
• Why NanoVDB
• Case Study: Blind-Data OpenCL Integration
• Example: Particle collisions
• Example: Smoke Sourcing
• Example: Cloth Collisions
• .nvdb files
• Security Concerns
• Digression on Dithering
• Summary

3
WHAT IS NANOVDB?
• OpenVDB compatible
• A Reference Data Layout
• Read Only (-ish)
• Pre-Marshalled (Zero-copy serialization/deserialization)
• Header Only

4
APPLICATION INTEROP WITH OPENVDB
Simulation System -> .vdb files -> Rendering System

5
LIBRARY INTEROP WITH OPENVDB
void *
void *

6
WHY NANOVDB?
Simulation System -> ? -> GPU System

7
WHY NANOVDB?
Simulation System -> NanoVDB -> GPU System

8
CASE STUDY: OPENCL BLIND DATA
#include <nanovdb/util/OpenToNanoVDB.h>

auto gridhandle = nanovdb::openToNanoVDB(grid,
        nanovdb::StatsMode::Default,
        nanovdb::ChecksumMode::Default,
        /*verbose*/0);

9
CASE STUDY: OPENCL BLIND DATA
#include "nanovdb/util/OpenToNanoVDB.h"

template <typename GridType>
void
CEvdbToBuffer(const GridType &grid, cl::Buffer *buffer)
{
    try
    {
        auto gridhandle = nanovdb::openToNanoVDB(grid,
                nanovdb::StatsMode::Default,
                nanovdb::ChecksumMode::Default,
                /*verbose*/0);
        exint gridsize = gridhandle.size();
        CE_Context *context = CE_Context::getContext();
        *buffer = context->allocBuffer(gridsize);
        context->writeBuffer(*buffer, gridsize, gridhandle.data());
    }
    catch (openvdb::Exception &err)
    { throw CE_Error(-1, err.what()); }
}

void
CE_VDBGrid::initFromVDB(const openvdb::GridBase &grid)
{
    UTvdbCallAllType(UTvdbGetGridType(grid), CEvdbToBuffer, grid, &myBuffer);
    UTvdbCallPointType(UTvdbGetGridType(grid), CEvdbToBuffer, grid, &myBuffer);
}

10
PARTICLE COLLISIONS

11
EXAMPLE: SMOKE SOURCING

12
EXAMPLE: CLOTH COLLISION

13
.NVDB FILES?
Do Not Use
14
SECURITY CONCERNS
• Accidental Misuse
bool isValid() const { return DataType::mMagic == NANOVDB_MAGIC_NUMBER; }
uint64_t checksum() const { return DataType::mChecksum; }
• Intentional Misuse
template <typename ValueT>
bool isValid(const NanoGrid<ValueT> &grid, bool detailed, bool verbose)

15
DIGRESSION ON DITHERING
Maximum
Value
Minimum
Value

16
ORIGINAL

17
POSTERIZED

18
ORDERED DITHERING
0 8 2 10
12 4 14 6
3 11 1 9
15 7 13 5
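The threshold matrix above can be applied as follows. This is a minimal sketch in plain C++ that posterizes a value in [0,1] to a single bit per pixel; `ditherOn` and `tileCoverage` are illustrative names, and the half-step offset added to each threshold is an assumption of this sketch, chosen so that a value of k/16 lights exactly k of the 16 pixels in a tile.

```cpp
#include <cassert>

// The 4x4 threshold matrix shown on the slide.
constexpr int kBayer4[4][4] = {
    { 0,  8,  2, 10},
    {12,  4, 14,  6},
    { 3, 11,  1,  9},
    {15,  7, 13,  5},
};

// Posterize `value` (expected in [0,1]) to one bit at pixel (x, y).
// Each pixel compares against its own threshold, tiled every 4 pixels.
inline bool ditherOn(double value, int x, int y)
{
    assert(x >= 0 && y >= 0);
    return value * 16.0 > kBayer4[y & 3][x & 3] + 0.5;
}

// Fraction of lit pixels in one 4x4 tile for a constant input value.
// The spatial average of the 1-bit output approximates the input.
inline double tileCoverage(double value)
{
    int on = 0;
    for (int y = 0; y < 4; ++y)
        for (int x = 0; x < 4; ++x)
            on += ditherOn(value, x, y) ? 1 : 0;
    return on / 16.0;
}
```

A constant input of 0.5 lights half of each tile and 0.25 lights a quarter, so the posterized image keeps the local average even though each pixel stores one bit.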


21
RANDOMIZED DITHER

22
BAYER DITHER

23
MAGIC CUBE

24
RANDOMIZED BAYER DITHER

25
DITHERING: FLOATING POINT ROUNDING
auto *dst = reinterpret_cast<uint16_t*>(data+1);
const double encode = 65535.0/(max - min); // note that double is required!
for (int i=0; ... )
    *dst++ = uint16_t(encode * (*src++ - min) + lut(offset++));
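The quantization loop above can be made self-contained as follows. This is a sketch, not the NanoVDB implementation: `quantize16` and `dequantize16` are hypothetical names, the dither source is passed in as a callable standing in for `lut(offset++)`, and the clamp to 65535 is an extra safety step added here. Inputs are assumed to lie in [min, max].

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Quantize floats in [min, max] to 16 bits. `dither(i)` supplies a per-value
// offset in [0, 1): a constant 0.5 gives round-to-nearest, while a varying
// pattern (ordered or random) trades banding for noise.
template <typename DitherFn>
std::vector<uint16_t> quantize16(const std::vector<float>& src,
                                 float min, float max, DitherFn dither)
{
    assert(max > min);
    // Computed in double, as the comment on the slide requires.
    const double encode = 65535.0 / (double(max) - double(min));
    std::vector<uint16_t> dst;
    dst.reserve(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        const double q = encode * (double(src[i]) - double(min)) + dither(i);
        // Clamp so the conversion to uint16_t can never overflow.
        dst.push_back(static_cast<uint16_t>(std::min(q, 65535.0)));
    }
    return dst;
}

// Map a 16-bit code back to a float in [min, max].
inline float dequantize16(uint16_t q, float min, float max)
{
    return float(double(min) + (double(max) - double(min)) * q / 65535.0);
}
```

With a constant dither of 0.5 the round trip error stays within half a quantization step; swapping in a Bayer or random pattern changes only the callable, not the loop.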

26

for (int i=0; ... )
s eeeeeeee mmmmmmmmmmmmmmmmmmmmmmm
32-bit float layout: 1 sign bit (s), 8 exponent bits (e), 23 mantissa bits (m, numbered 22 down to 0)

27

for (int i=0; ... )
65535 512/513
s eeeeeeee mmmmmmmmmmmmmmmmmmmmmmm
32-bit float layout: 1 sign bit (s), 8 exponent bits (e), 23 mantissa bits (m, numbered 22 down to 0)

28
NANOVDB CONCLUSION
• NanoVDB provides a way to move sparse volumes in non-shared memory architectures efficiently
• NanoVDB is a very fast way to gain read-only VDB support
− No dependencies
− C++, C, OpenCL, GLSL, and more platforms provided
− Command line tools to convert .vdb to .nvdb
− Ideally provide direct support for .vdb when possible to simplify people’s pipelines!
• For algorithms manipulating sparse volumes, use OpenVDB
• For storage and compatibility, use .vdb files

29
OPENVDB
IN PRODUCTION

TALK OVERVIEW
•  Introduction to toolset via Applications

−  Clouds + Atmosphere
−  Liquids
−  Filtering + Morphological Ops
−  Grid analysis
−  Managing complexity
−  Advection + Fluids
−  Retiming + Deformation
−  Extrapolation
−  Visualization + rendering

TOOLS & WORKFLOW EXAMPLES

VOLUME CREATION
mesh density grid topology

VOLUME CREATION
density grid topology
format disk
32 bit 1.9 MB
16 bit 1.3 MB
16 bit blosc 970 KB
ARBITRARY GRIDS
Panels: mesh, density, level set, color, velocity

grid metadata:
•  background value
•  voxel size
•  index to world xform
•  class
•  bbox
•  name
•  value/vector type
...

COMPLEXITY
8.9 M active voxels

1051 x 208 x 862
171 MB
2.7 sec

VOLUME MANIPULATION
Φ += noise( CPT ) Φ = noise( P )

COMBINING GRIDS

CLOUD MODELING
Miller, B., Museth, K., Penney, D. and Bin Zafar, N. Cloud modeling and rendering. Siggraph Talk, 2012
CLOUD MODELING
Lee, D. How to Train for Cloud. Siggraph Asia Course, 2020
CLOUDS
Lee, D. How to Train for Cloud. Siggraph Asia Course, 2020
Miller, B., Museth, K., Penney, D. and Bin Zafar, N. Cloud modeling and rendering. Siggraph Talk, 2012
RASTER PRIMITIVES

RASTER PRIMITIVES

COMPLEXITY
frustum buffer
LOD pyramid

COMPLEXITY
grid type comparison (same voxel count)
frustum ortho
•  resolution near camera:

0.28 (frustum) : 0.45 (ortho)
•  17% ortho voxels not visible

COMPLEXITY
LOD Voxels Disk Render Render Utilization

0 138 M 464 MB 2.5 MB 5.5
1 30 M 102 MB 1.0 MB 3.2
2 4M 13 MB 0.8 MB 2.3
3 30 K 197 KB 0.6 KB 1.3
Lee, D. How to Train for Cloud. Siggraph Asia Course, 2020
ATMOSPHERE TOOLS

LIQUIDS
Budsberg, J., Losure, M., Museth, K., Baer, M. Liquids in The Croods. DigiPro, 2013
Losure, M. Surreal Night Swimming in Home. Siggraph Dailies, 2015
Van Opstal, B., Janin, L., Museth, K. Large Scale Simulation of Water and Ice in Dragon 2, Siggraph Talk, 2014
LEVEL SET FILTERING & MORPHOLOGICAL OPS

Budsberg, J., Losure, M., Museth, K., Baer, M. Liquids in The Croods. DigiPro, 2013
mask

segmentation

smooth blur dilate
sharpen
LEVEL SET SURFACING
~1.1M polys ~177K polys
mask
level set mesh

GRID ANALYSIS
gradient curvature
velocity vorticity

SECONDARY ELEMENTS
Losure, M. Surreal Night Swimming in Home. Siggraph Dailies, 2015
SIMPLIFICATION MASKS
Van Opstal, B., Janin, L., Museth, K. Large Scale Simulation and Surfacing of Water and Ice in How to Train Your Dragon 2, Siggraph Talk, 2014
OPEN MESH COLLISIONS
open mesh
SDF
UDF
SDF

COMPLEXITY

Klar, G., Budsberg, J. et al. Production ready MPM simulations. ACM Siggraph Talks, 2017
ISLAND DETECTION

Klar, G., Budsberg, J. et al. Production ready MPM simulations. ACM Siggraph Talks, 2017
MATERIAL POINT METHOD
Stomakhin, A. et al. A material point method for snow simulation. Siggraph, 2013
Klar, G., Budsberg, J. et al. Production ready MPM simulations. ACM Siggraph Talks, 2017
VDB POINTS
example: 1 M points
format     disk     notes
bgeo.sc    42 MB
bgeo.sc    23 MB    P, v, N (16 bit); Cd (8 bit)
abc        45 MB
usdc       46 MB
vdb        25 MB
vdb        16 MB    P (8 bit fix); v (16 bit trunc); Cd (8 bit unit); N (unit)

VDB POINTS
VDB from Particles Topology to SDF (6x speed!) VDB Points topology

RASTER PRIMITIVES (VECTOR FIELDS)

CONSTRAINED ADVECTION
velocity field constrain SDF

POTENTIAL FLOW
existing simulation
modified simulation

potential flow
PROJECT NON-DIVERGENT

AIR FIELD
rasterize
isolate low
density
density points rasterize velocity project non-divergent update velocity
points

Van Opstal, B. et al. Instafalls: How to train your waterfalls. ACM Siggraph Talks, 2019
AIR FIELD

Van Opstal, B. et al. Instafalls: How to train your waterfalls. ACM Siggraph Talks, 2019
DISTRIBUTED DATA PROCESSING
RAM >250 Gb
4% CPU usage
•  limited resource!
Billions of simulated points Post-processing:
Level sets
Filtering
Morphological ops
Less RAM/CPU
Data Analysis 100% CPU usage
•  fits on average
farm blade
•  lots of resources
•  faster turnaround
Hundreds of waterfalls

DISTRIBUTED DATA PROCESSING
points
partitioned points padded read
open mesh
uniform partition
UDF level set + filtering VDB clip seamless distributed level set
iterative
binary partition
SDF

MORPHING
open mesh, UDF, SDF
•  5 minute-long river simulation
•  art-direct sections independently
•  multiple artists iterate simultaneously

VOLUME STAMPING

VOLUME RETIMING

MASKED EXTRAPOLATION
SDF
extrapolated SDF grids
masks
Museth, K. Novel Algorithm for Sparse and Parallel Fast Sweeping: Efficient Computation of Sparse Signed Distance Fields. Siggraph 2017
MASKED EXTRAPOLATION
source grid
mask
topology to masked sweep extrapolate

level set
Museth, K. Novel Algorithm for Sparse and Parallel Fast Sweeping: Efficient Computation of Sparse Signed Distance Fields. Siggraph 2017
FILTER EXTRAPOLATION
Extrap examples

VOLUME DEFORMATION

VOLUME FRACTURE
Alden, M., Melich, G. and Museth, K. Efficient and seamless volumetric fracture. Siggraph Talk, 2012
PROXY GENERATION
Budsberg, J., Bin Zafar, N., Alden, M. Elastic and Plastic Deformations with Rigid Body Dynamics. Siggraph Talk, 2014
ELASTIC DEFORMATION
Budsberg, J., Bin Zafar, N., Alden, M. Elastic and Plastic Deformations with Rigid Body Dynamics. Siggraph Talk, 2014
VISUALIZATION

Matthews, M. Amorphous: An OpenGL Sparse Volume Renderer. ACM Siggraph Talks, 2012
RENDERING
3D viewer parity across renderers

openvdb_print fire.vdb -m
density float (8,6,8)->(183,142,153)
…
temperature float (8,6,8)->(183,142,153)
background: 0
voxel size: 0.5
index to world:
[0.5, 0, 0, 0]
[0, 0.5, 0, 0]
[0, 0, 0.5, 0]
[-47.5, -4, -40, 1]
amorphous: {"_format":100, "default": {"enabled":fa
{"enabled":true, "field":"temperature”, "gain":1.0, "
[0.0,0.0,0.0,1.0],0.50,0], [0.200, [0.039,0.002,0.0,
class: fog volume
file_bbox_max: [183, 142, 153]
file_bbox_min: [8, 6, 8]
…
…

Matthews, M. Amorphous: An OpenGL Sparse Volume Renderer. ACM Siggraph Talks, 2012
RENDERING
scattering soft shadowing
smooth normals

SPATIAL MAP
sample 1 VDB per frame > lots of animated UDIMs

EMISSIVE LIGHTING

THANKS!
•  Lawrence Lee
•  Baptiste Van Opstal
•  Michael Losure
•  Domin Lee
•  Alex Timchenko

QUESTIONS?
•  Forget to ask something?

•  www.openvdb.org/forum

OPENVDB
IN PRODUCTION

RICHARD JONES
FX R&D @ DNEG
MOTIVATING EXAMPLE
How quickly can we subtract 1 from all voxels in a VDB?
C++ API
Multiple ways e.g. tools::foreach, Leaf/NodeManager and others
    openvdb::tools::foreach(floatgrid->beginValueOn(),
        [&] (const openvdb::FloatGrid::ValueOnIter& iter) {
            iter.modifyValue([&](float& surface) { surface -= 1.f; });
        });
or*
    openvdb::tree::LeafManager<openvdb::FloatTree> manager(floatgrid->tree());
    manager.foreach([&](auto& leaf, size_t) {
        for (auto iter = leaf.beginValueOn(); iter; ++iter) {
            iter.modifyValue([&](float& surface) { surface -= 1.f; });
        }
    });
*for simplicity only running on voxels

Alternatively, as of OpenVDB 8.x, we could use AX
    float@surface -= 1.f;
via
    openvdb::ax::run("float@surface -= 1.f;", *floatgrid);
Ok, nothing too impressive so far. What about performance?

VDB with 96,956,688 voxels
Code                    Time (ms)
tools::foreach          24512
LeafManager::foreach    2359
AX*                     140
AX is 16x-175x faster
*includes JIT compilation!
OUTLINE
What is AX?
AX Code
What does it look like? Is it fast?
How AX works
Points and volumes
What to use it for

Getting started
Building and using the library
Closer look:
How AX optimises for sparsity
WHAT IS AX?
AN EXPRESSION LANGUAGE FOR VDB POINTS AND VOLUMES
Flexible
Extend functionality whilst remaining customisable
Performance-driven
JIT compiled to target compiled C++ performance...or better
Simple to write
"Domain specific" with tailored execution pattern
INCLUDED IN OPENVDB 8.X
AX library
Core functionality for language and execution over points and volumes.
Tools
vdb_ax command line tool for quick access to AX
OpenVDB AX Houdini SOP
WHAT DOES AX CODE LOOK LIKE?
C-style language with simple attribute (@) syntax for manipulating VDB data
    float amp = 0.1, pers = 1, lacun = 1;
    vec3f freq = {0.4f,0.2f,0.5f},
          offset = {1.0f,0.0f,0.0f},
          noise = 0.0f;

    // position based curl simplex noise
    for (int octave = 0; octave < 3; ++octave) {
        vec3f noisePos = getvoxelpws() * freq + offset;
        noise += curlsimplexnoise(noisePos) * amp;

        amp *= pers;
        freq *= lacun;
    }

    // add noise to VDB velocity value
    vec3f@v += noise;
BUT IS IT FAST?
HAVEN'T FORGOTTEN ABOUT THAT 16X-175X SPEEDUP ALREADY HAVE YOU?
AX allows ALL users expert user performance

JIT gives both fast iteration time AND performance of C++
JIT allows host targeted optimisations e.g. SIMD with AVX
AX execution pattern and implementation designed for performance
Now let's see how it works!
HOW IT WORKS
Points
Per point kernel.

@ refers to point attribute.
e.g.
    if (vec3f@P[0] > 0) {
        vec3f@Cd = {0,0,1};
    }
    else {
        vec3f@Cd = {1,0,0};
    }
HOW IT WORKS
Volumes
1 float@a += max(1, float@c);
2 float@b = float@a;
@ refers to the value of a volume at a point in space.

Each volume getting written-to gets its own kernel.
Each kernel only runs over topology of its volume.
E.g. for float@a this becomes*
1 float@a += max(1, point_sample("c", pos));
*Note: After last write to float@a, everything else will be optimised out, removing line 2.
and for float@b this becomes**

1 float a = point_sample("a_cache", pos) + max(1, point_sample("c", pos));
2 float@b = a;
**Note: AX will cache grids that are both written-to and read from to avoid execution ordering issues, meaning float@b reads from a cached
version of volume 'a'.
But what can we use it for?

WHAT TO USE AX FOR
A NON-EXHAUSTIVE LIST
SIMULATION
Update attributes - forces etc.
    v@v += v@force * timestep;
Move points
    v@P += v@v * timestep;
POINT MANIPULATION
Add to groups
    if (vec3f@Cd == {0,0,1}) addtogroup("blue");
Delete points
    if (ingroup("blue")) deletepoint();
VOLUME MODELLING
CSG operations
    f@surface = min(f@surface, f@other);
Noise generation
    f@noise = simplexnoise(getvoxelpws());
RENDERING
Change instanced geometry on points
    i@geoId = rand(i@id) * 10;
Tweak volumes and points
    f@density *= 0.25;
And many more!
GETTING STARTED
Building AX
Same as core OpenVDB + new LLVM dependency

apt-get, brew etc. and CMake
Incorporating AX
    #include <openvdb/openvdb.h>
    #include <openvdb/io/File.h>

    int main()
    {
        openvdb::initialize();

        openvdb::io::File in("sphere.vdb");
        in.open();
        auto grids = in.getGrids();

        // your code goes here

        openvdb::io::File out("sphere_mod.vdb");
        out.write(*grids);

        openvdb::uninitialize();
        return 0;
    }
Incorporating AX in 4 lines
    #include <openvdb/openvdb.h>
    #include <openvdb/io/File.h>

    #include <openvdb_ax/ax.h>                          // 1. Include openvdb_ax/ax.h

    int main()
    {
        openvdb::initialize();

        openvdb::io::File in("sphere.vdb");
        in.open();
        auto grids = in.getGrids();

        openvdb::ax::initialize();                      // 2. Initialize
        openvdb::ax::run("@surface += 0.1;", *grids);   // 3. Run!
        openvdb::ax::uninitialize();                    // 4. Uninitialize

        openvdb::io::File out("sphere_mod.vdb");
        out.write(*grids);

        openvdb::uninitialize();
        return 0;
    }
Ta da! Now you are up and running with AX
CLOSER LOOK
HOW AX OPTIMISES FOR SPARSITY
    if (getcoordx() == 511) float@density = 1.0f;

THE CHALLENGE
ALLOW AX TO WORK WITH SPARSE INPUTS, SEAMLESSLY
Problem:
Want to maintain value access paradigm.
Do not want user have to specify mode for tiles etc.
Avoid densify and prune.
Solution: Active tile-streaming.
Let's take a look.
WHAT WILL CAUSE TILE STREAMING?
Using position (or coordinates)
Reading other volumes
Functions that may vary across voxels i.e. rand() (w/o seed)
Only volumes that depend on the above, will be tile streamed, even in the same
snippet. e.g.
    float@other = 10;
    if (getcoordx() == 511) float@density = 1.0f;
Here, 'density' is streamed but not 'other'.

Similarly, for AX code with none of the above e.g.
    float@density -= 1.0f;
No tile streaming, AX runs over the existing topology.

HOW TILE-STREAMING WORKS
For a sparse grid, with tiles, that is being streamed:
Temporary node hierarchy generated on-the-fly, used in AX kernel, and then
checked for variation/change.
If:
variation detected -> add to tree
value change (coarser nodes same) -> update value
no change -> no update
Result: No difference for user between dense and sparse, without resorting to
densify and prune!
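The three outcomes listed above can be illustrated with a toy model. This is only a sketch of the classification step in plain C++, not how AX is implemented: `TileResult` and `streamTile` are hypothetical names, the tile is one-dimensional for brevity, and the kernel is an ordinary callable taking a coordinate and the old value.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Result of running a kernel over one constant tile.
struct TileResult
{
    bool  changed   = false;    // tile value changed but stayed uniform
    bool  voxelized = false;    // kernel produced varying values
    float tileValue = 0.0f;     // valid when !voxelized
    std::vector<float> voxels;  // valid when voxelized
};

// Evaluate kernel(x, oldValue) at every voxel of a constant tile covering
// [origin, origin + dim), then classify the outcome:
//   variation     -> keep the per-voxel values (the tile must be subdivided)
//   uniform change -> update the single tile value
//   no change     -> leave the tile untouched
template <typename Kernel>
TileResult streamTile(int origin, int dim, float tileValue, Kernel kernel)
{
    assert(dim > 0);
    std::vector<float> tmp(static_cast<std::size_t>(dim));
    bool uniform = true;
    for (std::size_t i = 0; i < tmp.size(); ++i) {
        tmp[i] = kernel(origin + static_cast<int>(i), tileValue);
        uniform = uniform && (tmp[i] == tmp[0]);
    }
    TileResult r;
    if (!uniform) {
        r.voxelized = true;
        r.voxels = std::move(tmp);
    } else {
        r.tileValue = tmp[0];
        r.changed = (tmp[0] != tileValue);
    }
    return r;
}
```

A coordinate-dependent kernel such as `x == 511 ? 1.0f : v` voxelizes only the tile that contains x = 511, while every other tile stays a tile. This toy always evaluates every voxel, whereas the slides state that AX skips streaming entirely for kernels like `float@density -= 1.0f` that cannot vary across a tile.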
OPENVDB AX SUMMARY
New expression language toolkit included in OpenVDB 8.x
Manipulates point attributes and volume values
Extends the utility of VDB and makes it easy to write very fast custom operators
Can be integrated in 4 lines of C++
THANKS FOR LISTENING!
CONTACT
rhj@dneg.com
VDB Mailing-List: openvdb-dev@lists.aswf.io
DOCUMENTATION
https://academysoftwarefoundation.github.io/openvdb/openvdbax.html
Thanks to Nick Avramoussis, the OpenVDB TSC & DNEG

QUESTIONS?
Don't forget to post them and come along to the Q&A.
Multithreading in OpenVDB
Dan Bailey
R&D FX Engineer
Industrial Light and Magic

1
Talk Overview
Sequential Access
● Tree Hierarchy
● Tree Iterators
● Iterator Ranges
● Depth-First vs Breadth-First
● Tree Visitor Methods
● Node Visitor Tool
Random Access
● Direct Access
● Root Node Query
● ValueAccessors
● GetValue Benchmarks
Parallel Constructs
● Foreach Tool (Value)
● Foreach Tool (Leaf)
● LeafManagers
● NodeManagers
● DynamicNodeManagers
● Foreach Benchmarks
Case Studies
● ActiveVoxelCount
● Deactivate
● Merge

2
Tree Hierarchy
Tree: #children = 1
Root Node: std::map, (i, j, k) => Tile / Node*, #children = ~infinite
Internal Node 1: [array] of Tile Value / Node*, #children = 32768
Internal Node 2: [array] of Tile Value / Node*, #children = 4096
Leaf Node: [array] of values (Tile == Value), #children = 512
Tile -> Node: densify / voxelize
Node -> Tile: prune
Figure: a node whose values are all 5.0 shown next to the equivalent tile with value 5.0
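The child counts above follow from the per-axis node dimensions 32, 16 and 8. The arithmetic can be checked with a few constants; this is a sketch in plain C++ whose names (`slotCount`, `kLeafSpan` and so on) are illustrative, not OpenVDB identifiers.

```cpp
#include <cassert>
#include <cstdint>

// log2 of the per-axis dimension of each node level, top to bottom,
// matching the child counts on the slide (32^3, 16^3, 8^3).
constexpr int kLog2Internal1 = 5;
constexpr int kLog2Internal2 = 4;
constexpr int kLog2Leaf      = 3;

// Number of slots (children, tiles or voxels) in a node with the given
// log2 per-axis dimension.
constexpr uint32_t slotCount(int log2Dim) { return 1u << (3 * log2Dim); }

// Number of voxels spanned along each axis by one node of each level.
constexpr int kLeafSpan      = 1 << kLog2Leaf;                    // 8
constexpr int kInternal2Span = kLeafSpan << kLog2Internal2;       // 128
constexpr int kInternal1Span = kInternal2Span << kLog2Internal1;  // 4096

static_assert(slotCount(kLog2Internal1) == 32768, "32^3 slots");
static_assert(slotCount(kLog2Internal2) == 4096,  "16^3 slots");
static_assert(slotCount(kLog2Leaf)      == 512,   "8^3 voxels");
```

Under these assumptions a single child of the root covers 4096 voxels along each axis, which is why a shallow tree of fixed depth can address a very large index space.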

3
Tree Hierarchy

Tree: #children = 1
Root Node: (i, j, k) => Tile / Node*, #children = ~infinite
Internal Node 1: #children = 32768
Internal Node 2: #children = 4096
Leaf Node: #children = 512
Slots hold a [value] or a [child]

Bitmask
Value BitMask    Child BitMask    Meaning
0                0                Inactive Tile
1                0                Active Tile
0                1                Child Node
1                1                Invalid
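The meaning of the two mask bits can be captured in a small decoder. This is a sketch in plain C++; `SlotState`, `decodeSlot` and `decodeSlotFromMasks` are illustrative names, and the 64-bit words stand in for the node bitmasks, which are longer in practice.

```cpp
#include <cassert>
#include <cstdint>

// The four combinations of (value bit, child bit) for one slot.
enum class SlotState { InactiveTile, ActiveTile, ChildNode, Invalid };

// Decode one slot of an internal node from its two mask bits.
inline SlotState decodeSlot(bool valueBit, bool childBit)
{
    if (childBit) return valueBit ? SlotState::Invalid : SlotState::ChildNode;
    return valueBit ? SlotState::ActiveTile : SlotState::InactiveTile;
}

// Same decode, reading bit `slot` from two 64-bit mask words.
inline SlotState decodeSlotFromMasks(std::uint64_t valueMask,
                                     std::uint64_t childMask,
                                     unsigned slot)
{
    assert(slot < 64);
    const bool valueBit = ((valueMask >> slot) & 1u) != 0;
    const bool childBit = ((childMask >> slot) & 1u) != 0;
    return decodeSlot(valueBit, childBit);
}
```

Since the state of a slot is two bits in two separate masks, iterating over only the active tiles or only the children reduces to scanning one mask for set bits.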

4
Tree Iterators

[value] [value]
● beginRootChildren() ● beginChildOn() ● beginChildOn() ● beginChildOn()

● beginRootTiles() ● beginChildOff() ● beginChildOff() ● beginChildOff()
● beginRootDense() ● beginChildAll() ● beginChildAll() ● beginChildAll()
● beginValueOn() ● beginValueOn() ● beginValueOn()
● beginValueOff() ● beginValueOff() ● beginValueOff()
● beginValueAll() ● beginValueAll() ● beginValueAll()
All Inactive Tiles
All Elements + Children
All Elements
All Tiles
All Tiles

5
Tree Iterators

[value] [value]
● beginRootChildren() ● beginChildOn() ● beginChildOn() ● beginChildOn()

● beginRootTiles() ● beginChildOff() ● beginChildOff() ● beginChildOff()
● beginRootDense() ● beginChildAll() ● beginChildAll() ● beginChildAll()
● beginValueOn() ● beginValueOn() ● beginValueOn()
● beginNode()
● beginValueOff() ● beginValueOff() ● beginValueOff()
● beginLeaf()
Hierarchical ● beginValueAll() ● beginValueAll() ● beginValueAll()
● beginValueOn()
Iterators
● beginValueOff() All Inactive Tiles
● beginValueAll() All Elements + Children
All Elements
All Tiles
All Tiles

6
Tree Iterators
Option 1: Manual Iteration
for (auto iter1 = tree.cbeginRootChildren(); iter1; ++iter1) {
    for (auto iter2 = iter1->cbeginChildOn(); iter2; ++iter2) {
        for (auto iter3 = iter2->cbeginChildOn(); iter3; ++iter3) {
            for (auto iter4 = iter3->cbeginValueOn(); iter4; ++iter4) {
                sum += iter4.getValue();
            }
        }
    }
}
Option 2: Leaf Iteration
for (auto leaf = tree.cbeginLeaf(); leaf; ++leaf) {
    for (auto iter = leaf->cbeginValueOn(); iter; ++iter) {
        sum += iter.getValue();
    }
}
Option 3: Value Iteration
for (auto iter = tree.cbeginValueOn(); iter; ++iter) {
    sum += iter.getValue();
}
Disney Cloud - 1.5 billion voxels *
Option 1    6.1s
Option 2    6.4s
Option 3    15.0s
* https://www.disneyanimation.com/data-sets

7
Iterator Ranges
Value Iterator Range: 11.7s
IteratorRange<FloatTree::ValueOnCIter> iterRange(tree.cbeginValueOn());
Leaf Iterator Range: 24ms
IteratorRange<FloatTree::LeafCIter> iterRange(tree.cbeginLeaf());
Node Iterator Range: 23ms
IteratorRange<FloatTree::NodeCIter> iterRange(tree.cbeginNode());

template<typename IterT>
struct IteratorRange
{
    IteratorRange(const IterT& iter, size_t grainSize = 8)
    {
        mSize = 0;
        // Evaluate the iterator to find out the size
        for (IterT it(iter); it.test(); ++mSize, ++it) {}
    }
    ...
};

8
Depth First vs Breadth First
Visit order       Depth First     Breadth First
Root Node         1               1
Internal Nodes    2, 5            2, 3
Leaf Nodes        3, 4, 6, 7      4, 5, 6, 7
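The two visit orders can be reproduced on a toy tree with the same shape as the figure (one root, two internal nodes, four leaves). This is a sketch in plain C++; `ToyNode`, `depthFirst`, `breadthFirst` and `makeSlideTree` are illustrative names and do not correspond to OpenVDB classes.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Minimal tree node: a name and a list of children.
struct ToyNode
{
    std::string name;
    std::vector<ToyNode> children;
};

// Depth first: visit a node, then fully descend into each child in turn.
inline void depthFirst(const ToyNode& node, std::vector<std::string>& out)
{
    out.push_back(node.name);
    for (const ToyNode& child : node.children) depthFirst(child, out);
}

// Breadth first: visit all nodes of one level before the next level.
inline std::vector<std::string> breadthFirst(const ToyNode& root)
{
    std::vector<std::string> out;
    std::queue<const ToyNode*> pending;
    pending.push(&root);
    while (!pending.empty()) {
        const ToyNode* node = pending.front();
        pending.pop();
        assert(node != nullptr);
        out.push_back(node->name);
        for (const ToyNode& child : node->children) pending.push(&child);
    }
    return out;
}

// The 7-node tree from the figure.
inline ToyNode makeSlideTree()
{
    return ToyNode{"root", {
        ToyNode{"internal0", {ToyNode{"leaf0", {}}, ToyNode{"leaf1", {}}}},
        ToyNode{"internal1", {ToyNode{"leaf2", {}}, ToyNode{"leaf3", {}}}},
    }};
}
```

In breadth-first order all nodes of one level are contiguous, which is convenient when each level is to be processed as one parallel range; depth-first order keeps each subtree contiguous instead.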

9
Tree Visitor Methods
● visitActiveBBox(...) ● visitActiveBBox(...) ● visitActiveBBox(...) ● visitActiveBBox(...)

● visit(...) ● visit(...) ● visit(...) ● visit(...)
● visit2(...) ● visit2Node(...) ● visit2Node(...) ● visit2Node(...)
● combine(...) ● visit2(...) ● visit2(...) ● visit2(...)
● combineExtended(...) ● combine(...) ● combine(...) ● combine(...)
● combine2(...) ● combine2(...) ● combine2(...) ● combine2(...)
● combine2Extended(...)
All Tree visitor methods are being deprecated, instead:
● Use tools::visitNodesDepthFirst() [single-threaded]

● Use tree::DynamicNodeManager [multi-threaded]

10
Tree Visit Method
template<typename VisitorOp>
void Tree::visit(VisitorOp& op);   // op: visitor functor

11
Tree Visit Method
template<typename VisitorOp>
void Tree::visit(VisitorOp& op);

struct VisitorOp
{
    // Generic iterator
    template <typename IterT>
    bool operator()(IterT &iter)
    {
        typename IterT::NonConstValueType value;
        // Test what kind of iterator?
        typename IterT::ChildNodeType *child = iter.probeChild(value);
        if (child)
        {
            bool isLeaf = child->getLevel() == 0;
            ...
        }
        // Return true to continue
        return true;
    }
    // Overloads for specific iterators
    bool operator()(FloatTree::LeafNodeType::ChildAllIter &) { return true; }
    bool operator()(FloatTree::LeafNodeType::ChildAllCIter &) { return true; }
};

12
Node Visitor Tool
template <typename TreeT, typename OpT>
size_t tools::visitNodesDepthFirst(TreeT& tree, OpT& op, size_t idx = 0);
// op: visitor functor
// idx: optional starting index
// returns: total nodes (+ starting index)

13
Node Visitor Tool
template <typename TreeT, typename OpT>
size_t tools::visitNodesDepthFirst(TreeT& tree, OpT& op, size_t idx = 0);
Called for every node in the tree in depth-first traversal order

struct OffsetOp
{
    const float offset;
    explicit OffsetOp(float _offset): offset(_offset) { }

    // The second argument is the node index (based on traversal order)
    template<typename NodeT>
    void operator()(NodeT& node, size_t) const
    {
        for (auto iter = node.beginValueOn(); iter; ++iter) {
            iter.setValue(iter.getValue() + offset);
        }
    }
};
Use tree::DynamicNodeManager for a multi-threaded equivalent...

14
Random Access
Raytracing Sampling

15
Direct Access
Tree (i, j, k)
Coord ijk(0, 0, 0);
tree.getValue(ijk);
Root Node
Root Node query is expensive
Internal Node 1
Internal Node 2
Leaf Node

16
Root Node Query
Test different access patterns with a root node containing 1, 8 or 64 tiles

std::vector<Coord> ijks;
// populate ijks
for (const Coord& ijk : ijks) {
    sum += root.getValueDepth(ijk);   // 0 if child exists, -1 otherwise
}

100 million coords
Coalesced Test Case
1 Tile (Direct)      3.68s
8 Tiles (Direct)     7.10s
Interleaved Test Case
1 Tile (Direct)      3.68s
64 Tiles (Direct)    9.02s

17
ValueAccessors
std::vector<Coord> ijks;
// populate ijks
tree::ValueAccessor<FloatTree> valueAccessor(tree);
for (const auto& ijk : ijks) {
    sum += valueAccessor.getValueDepth(ijk);
}

Tree (i, j, k): Root Node → Internal Node 1 → Internal Node 2 → Leaf Node
ValueAccessor cache: Root Node, Internal Node 1, Internal Node 2, Leaf Node

100 million coords

Coalesced Test Case
1 Tile (Direct) 3.68s
1 Tile (Accessor) 2.04s
8 Tiles (Accessor) 1.31s
64 Tiles (Direct) 9.83s

Interleaved Test Case
1 Tile (Direct) 3.68s
1 Tile (Accessor) 2.17s
8 Tiles (Direct) 7.00s
8 Tiles (Accessor) 9.53s
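
The gap between the coalesced and interleaved timings comes down to cache hits. The sketch below is a deliberately simplified model, assuming a single cached 8x8x8 block rather than one cached node per tree level; ToyTree, ToyAccessor and ToyCoord are hypothetical names and not OpenVDB classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct ToyCoord { std::int32_t x, y, z; };

// Hypothetical stand-in for a sparse tree: 8x8x8 blocks keyed by their origin.
class ToyTree {
public:
    using Key = std::tuple<std::int32_t, std::int32_t, std::int32_t>;
    using Block = std::vector<float>;  // 512 voxels

    static Key keyOf(const ToyCoord& c) { return Key{c.x & ~7, c.y & ~7, c.z & ~7}; }
    static std::size_t offsetOf(const ToyCoord& c) {
        return std::size_t(((c.x & 7) << 6) | ((c.y & 7) << 3) | (c.z & 7));
    }
    void setValue(const ToyCoord& c, float v) {
        Block& b = mBlocks[keyOf(c)];
        if (b.empty()) b.assign(512, 0.0f);
        b[offsetOf(c)] = v;
    }
    // "Direct" access: a full lookup from the top on every call.
    const Block* findBlock(const Key& k) const {
        ++lookups;
        auto it = mBlocks.find(k);
        return it == mBlocks.end() ? nullptr : &it->second;
    }
    mutable std::size_t lookups = 0;
private:
    std::map<Key, Block> mBlocks;
};

// Caches the most recently visited block and reuses it while the key matches.
class ToyAccessor {
public:
    explicit ToyAccessor(const ToyTree& tree) : mTree(tree) {}
    float getValue(const ToyCoord& c) {
        const ToyTree::Key k = ToyTree::keyOf(c);
        if (!mValid || k != mKey) {  // cache miss: go back to the tree
            mBlock = mTree.findBlock(k);
            mKey = k;
            mValid = true;
            ++misses;
        } else {
            ++hits;
        }
        return mBlock ? (*mBlock)[ToyTree::offsetOf(c)] : 0.0f;
    }
    std::size_t hits = 0, misses = 0;
private:
    const ToyTree& mTree;
    ToyTree::Key mKey{};
    const ToyTree::Block* mBlock = nullptr;
    bool mValid = false;
};
```

Querying one block repeatedly and then another produces one miss per block, whereas alternating between two blocks misses on every call, so the accessor pays for the cache bookkeeping without any benefit.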

18
GetValue Benchmarks
Manual Iteration 6.1s
Leaf Iteration 6.4s
Value Iteration 15.0s
Sequential Direct 15.3s
Sequential ValueAccessor 5.9s
Interleaved Direct 15.6s
Interleaved ValueAccessor 21.1s

19
Random Access
Raytracing Sampling

20
Foreach Tool
template<typename IterT, typename OpT>
void tools::foreach(const IterT& iter, OpT& op, bool threaded = true, bool shareOp = true);
● iter: generic iterator
● op: foreach functor
● threaded: threading enabled by default

21
Foreach Tool (Value)
auto op = [&](const auto& iter) {
    iter.setValue(iter.getValue() * 2);
};
tools::foreach(tree.beginValueOn(), op); // tree value iterator

Unthreaded 34.8s
1 Thread 96s
2 Threads 59.7s
4 Threads 40.3s
8 Threads 31.2s
16 Threads 27.2s
32 Threads 27.0s

22
Foreach Tool (Leaf)
auto op = [&](const auto& leafIter) {
    for (auto iter = leafIter->beginValueOn(); iter; ++iter) {
        iter.setValue(iter.getValue() * 2);
    }
};
tools::foreach(tree.beginLeaf(), op); // tree leaf iterator

Unthreaded 7.51s
1 Thread 7.51s
2 Threads 4.26s
4 Threads 2.19s
8 Threads 1.14s
16 Threads 0.64s
32 Threads 0.51s

23
LeafManager
template<typename LeafOp>
void LeafManager::foreach(const LeafOp& op, bool threaded = true, size_t grainSize=1);
● foreach(...)
● reduce(...)

struct DoubleOp
{
    using LeafT = FloatTree::LeafNodeType; // float leaf only
    void operator()(LeafT& leaf, size_t idx) const { // idx: leaf index
        for (auto iter = leaf.beginValueOn(); iter; ++iter) {
            iter.setValue(iter.getValue() * 2);
        }
    }
};
tree::LeafManager<FloatTree> leafManager(tree);
DoubleOp op;
leafManager.foreach(op);

Diagram: Root Node → Internal Nodes → Leaf Nodes; only the Leaf Nodes are processed, all in one pass (1).

24
LeafManager
template<typename LeafOp>
void LeafManager::foreach(const LeafOp& op, bool threaded = true, size_t grainSize=1);

struct DoubleOp
{
    using LeafT = FloatTree::LeafNodeType;
    void operator()(LeafT& leaf, size_t idx) const {
        for (auto iter = leaf.beginValueOn(); iter; ++iter) {
            iter.setValue(iter.getValue() * 2);
        }
    }
};
tree::LeafManager<FloatTree> leafManager(tree);
DoubleOp op;
leafManager.foreach(op);

Unthreaded 8.63s
1 Thread 8.13s
2 Threads 4.57s
4 Threads 2.27s
8 Threads 1.16s
16 Threads 0.58s
32 Threads 0.48s
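
The idea behind a leaf manager can be sketched as a flat array of leaf pointers that is split into independent index ranges. The sketch below is an assumption-laden toy: ToyLeaf, ToyLeafManager and ToyDoubleOp are hypothetical, and the ranges are processed serially here, whereas a real scheduler would hand each range to a worker thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct ToyLeaf { std::vector<float> values; };

// Hypothetical stand-in for a leaf manager: a flat array of leaf pointers.
class ToyLeafManager {
public:
    explicit ToyLeafManager(std::vector<ToyLeaf>& leaves) {
        for (ToyLeaf& leaf : leaves) mLeafs.push_back(&leaf);
    }
    std::size_t leafCount() const { return mLeafs.size(); }

    // Split [0, leafCount) into half-open ranges of at most grainSize leaves.
    std::vector<std::pair<std::size_t, std::size_t>> ranges(std::size_t grainSize) const {
        std::vector<std::pair<std::size_t, std::size_t>> out;
        grainSize = std::max<std::size_t>(grainSize, 1);
        for (std::size_t b = 0; b < mLeafs.size(); b += grainSize)
            out.emplace_back(b, std::min(b + grainSize, mLeafs.size()));
        return out;
    }

    // Apply op(leaf, index) to every leaf, one range at a time.
    // The ranges do not overlap, so each could run on its own thread;
    // this sketch simply walks them in order.
    template <typename LeafOp>
    void foreach(const LeafOp& op, std::size_t grainSize = 1) const {
        for (const auto& r : ranges(grainSize))
            for (std::size_t i = r.first; i < r.second; ++i) op(*mLeafs[i], i);
    }
private:
    std::vector<ToyLeaf*> mLeafs;
};

// Doubles every value in a leaf; the leaf index is unused.
struct ToyDoubleOp {
    void operator()(ToyLeaf& leaf, std::size_t) const {
        for (float& v : leaf.values) v *= 2.0f;
    }
};
```

A larger grain size means fewer, bigger ranges, which trades load balancing for lower scheduling overhead.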

25
NodeManager
template<typename NodeOp>
void NodeManager::foreachTopDown(const NodeOp& op, bool threaded = true, size_t grainSize=1);
● foreachTopDown(...)
● foreachBottomUp(...)
● reduceTopDown(...)
● reduceBottomUp(...)
op: node functor

Breadth First traversal, one level at a time
Top Down: Root Node (1), Internal Nodes (2), Leaf Nodes (3)
Bottom Up: the reverse order

26
NodeManager
template<typename NodeOp>
void NodeManager::foreachTopDown(const NodeOp& op, bool threaded = true, size_t grainSize=1);

struct DoubleOp
{
    // Do nothing for RootNode or InternalNode (no node index is passed)
    template <typename OtherNodeType>
    void operator()(OtherNodeType&) const { }
    void operator()(FloatTree::LeafNodeType& leaf) const {
        for (auto iter = leaf.beginValueOn(); iter; ++iter) {
            iter.setValue(iter.getValue() * 2);
        }
    }
};
tree::NodeManager<FloatTree> nodeManager(tree);
DoubleOp op;
nodeManager.foreachTopDown(op);

Breadth First traversal, one level at a time
Top Down: Root Node (1), Internal Nodes (2), Leaf Nodes (3)
Bottom Up: the reverse order

27
NodeManager
template<typename NodeOp>
void NodeManager::foreachTopDown(const NodeOp& op, bool threaded = true, size_t grainSize=1);

struct DoubleOp
{
    template <typename OtherNodeType>
    void operator()(OtherNodeType&) const { }
    void operator()(FloatTree::LeafNodeType& leaf) const {
        for (auto iter = leaf.beginValueOn(); iter; ++iter) {
            iter.setValue(iter.getValue() * 2);
        }
    }
};
tree::NodeManager<FloatTree> nodeManager(tree);
DoubleOp op;
nodeManager.foreachTopDown(op);

Unthreaded 9.06s
1 Thread 9.15s
2 Threads 5.45s
4 Threads 2.55s
8 Threads 1.22s
16 Threads 0.59s
32 Threads 0.45s

28
NodeManager
Tree levels: Root Node → Internal Node 1 → Internal Node 2 → Leaf Node

One method for all three node types

struct Op
{
    // RootNode, InternalNode, LeafNode
    template <typename NodeT>
    void operator()(NodeT& node) const;
};

One method for each node type

template <typename TreeT>
struct Op
{
    using RootNodeT = typename TreeT::RootNodeType;
    using LeafNodeT = typename TreeT::LeafNodeType;
    // RootNode
    void operator()(RootNodeT& root) const;
    // InternalNode
    template <typename InternalNodeT>
    void operator()(InternalNodeT& internal) const;
    // LeafNode
    void operator()(LeafNodeT& leaf) const;
};
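
The per-node-type variant relies on ordinary C++ overload resolution: when a non-template overload and a template both match exactly, the non-template one is chosen. The following sketch demonstrates this with toy stand-in types; ToyLeaf, ToyInternal, ToyRoot, ToyTreeTypes and LabelOp are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical toy node types (stand-ins, not OpenVDB classes).
struct ToyLeaf {};
template <typename ChildT> struct ToyInternal {};
template <typename ChildT> struct ToyRoot {};

// Mimics the nested type names the functor expects from a tree type.
struct ToyTreeTypes {
    using LeafNodeType = ToyLeaf;
    using RootNodeType = ToyRoot<ToyInternal<ToyInternal<ToyLeaf>>>;
};

template <typename TreeT>
struct LabelOp {
    using RootNodeT = typename TreeT::RootNodeType;
    using LeafNodeT = typename TreeT::LeafNodeType;

    // RootNode: non-template overload, preferred for the root type.
    std::string operator()(RootNodeT&) const { return "root"; }

    // InternalNode: the template catches every remaining node type.
    template <typename InternalNodeT>
    std::string operator()(InternalNodeT&) const { return "internal"; }

    // LeafNode: non-template overload, preferred for the leaf type.
    std::string operator()(LeafNodeT&) const { return "leaf"; }
};
```

Because the template is the fallback, both internal levels share one method without the functor having to name their types.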

29
DynamicNodeManager
template<typename NodeOp>
void DynamicNodeManager::foreachTopDown(const NodeOp& op, bool threaded = true, size_t grainSize=1);
● reduceTopDown(...)
op: node functor

● Constructs node arrays lazily instead of up-front
● Primarily designed for topology-changing operations
● Can filter out sub-trees

Top Down, Breadth First: Root Node (1), Internal Nodes (2), lowest level (3)
30
DynamicNodeManager
template<typename NodeOp>
void DynamicNodeManager::foreachTopDown(const NodeOp& op, bool threaded = true, size_t grainSize=1);
● reduceTopDown(...)

struct DoubleOp
{
    // Return false to stop processing sub-nodes.
    // The second argument is the node index (per type).
    template <typename OtherNodeType>
    bool operator()(OtherNodeType&, size_t) const { return true; }
    bool operator()(FloatTree::LeafNodeType& leaf, size_t idx) const {
        for (auto iter = leaf.beginValueOn(); iter; ++iter) {
            iter.setValue(iter.getValue() * 2);
        }
        return true;
    }
};
tree::DynamicNodeManager<FloatTree> nodeManager(tree);
DoubleOp op;
nodeManager.foreachTopDown(op);

Top Down, Breadth First: Root Node (1), Internal Nodes (2), Leaf Nodes (3)

31
DynamicNodeManager
template<typename NodeOp>
void DynamicNodeManager::foreachTopDown(const NodeOp& op, bool threaded = true, size_t grainSize=1);

struct DoubleOp
{
    template <typename OtherNodeType>
    bool operator()(OtherNodeType&, size_t) const { return true; }
    bool operator()(FloatTree::LeafNodeType& leaf, size_t idx) const {
        for (auto iter = leaf.beginValueOn(); iter; ++iter) {
            iter.setValue(iter.getValue() * 2);
        }
        return true;
    }
};
tree::DynamicNodeManager<FloatTree> nodeManager(tree);
DoubleOp op;
nodeManager.foreachTopDown(op);

Unthreaded 8.51s
1 Thread 8.69s
2 Threads 4.72s
4 Threads 2.18s
8 Threads 1.08s
16 Threads 0.53s
32 Threads 0.41s
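
The lazy construction and sub-tree filtering described above can be sketched as a level-by-level traversal in which the next level's node array is built only from nodes whose functor returned true. ToyNode, foreachTopDown (as a free function), RecordOp and makeNode are hypothetical names for this illustration; the sketch is serial and does not reproduce OpenVDB's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy node: NOT an OpenVDB type.
struct ToyNode {
    int id = 0;
    std::vector<std::unique_ptr<ToyNode>> children;
};

// Process the tree one level at a time, top down. The array for the next
// level is built lazily from the children of nodes whose op returned true.
template <typename NodeOp>
void foreachTopDown(ToyNode& root, const NodeOp& op)
{
    std::vector<ToyNode*> level{&root};
    while (!level.empty()) {
        std::vector<ToyNode*> next;
        for (std::size_t idx = 0; idx < level.size(); ++idx) {
            ToyNode* node = level[idx];
            if (!op(*node, idx)) continue;  // false: skip this sub-tree
            for (auto& child : node->children) next.push_back(child.get());
        }
        level.swap(next);
    }
}

// Records the visit order and prunes everything below one chosen node.
struct RecordOp {
    std::vector<int>* visited;
    int pruneId;
    bool operator()(ToyNode& node, std::size_t) const {
        visited->push_back(node.id);
        return node.id != pruneId;
    }
};

// Convenience helper for building small test trees.
inline std::unique_ptr<ToyNode> makeNode(int id) {
    auto n = std::make_unique<ToyNode>();
    n->id = id;
    return n;
}
```

Because child arrays are gathered after each level has been processed, a functor is free to add or remove children of the node it is given before the traversal descends.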

32
Foreach Benchmarks
Foreach Value Unthreaded 34.8s
Foreach Value 32 Threads 27.0s
Foreach Leaf Unthreaded 7.51s
Foreach Leaf 32 Threads 0.51s
LeafManager Unthreaded 8.63s
LeafManager 32 Threads 0.48s
NodeManager Unthreaded 9.06s
NodeManager 32 Threads 0.45s
DynamicNodeManager Unthreaded 8.51s
DynamicNodeManager 32 Threads 0.41s

33
Case Study 1: ActiveVoxelCount
Count all active voxels in the tree

Index64 Tree::activeVoxelCount() const;   // unthreaded depth-first summation

template <typename TreeT>
Index64 tools::countActiveVoxels(const TreeT& tree, bool threaded = true);

tree::DynamicNodeManager<FloatTree> nodeManager(tree);
ActiveVoxelCountOp op;
nodeManager.reduceTopDown(op);

Tree levels: Root Node → Internal Node 1 → Internal Node 2 → Leaf Node

Old Method 203ms
New Method 26ms
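
A reduction of this kind needs a functor that accumulates a partial result and can be joined with another partial result. The sketch below illustrates that shape on toy data, counting each active tile as all the voxels it covers; ToyLeaf, ToyInternal, ToyActiveVoxelCountOp and this countActiveVoxels are hypothetical and are not the library's ActiveVoxelCountOp. The partial results are combined serially here.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy data (stand-ins, not OpenVDB classes).
struct ToyLeaf { std::bitset<512> active; };  // 8x8x8 voxels, one bit each
struct ToyInternal {
    std::uint64_t activeTiles = 0;            // active tiles stored in this node
    std::uint64_t voxelsPerTile = 512;        // voxels covered by one tile
    std::vector<ToyLeaf> leaves;              // child leaf nodes
};

// Reduction functor: accumulates a partial count and can absorb another
// partial result via join(), the shape used by parallel reductions.
struct ToyActiveVoxelCountOp {
    std::uint64_t count = 0;
    void operator()(const ToyInternal& node) { count += node.activeTiles * node.voxelsPerTile; }
    void operator()(const ToyLeaf& leaf) { count += leaf.active.count(); }
    void join(const ToyActiveVoxelCountOp& other) { count += other.count; }
};

inline std::uint64_t countActiveVoxels(const std::vector<ToyInternal>& nodes)
{
    ToyActiveVoxelCountOp total;
    for (const ToyInternal& node : nodes) {
        ToyActiveVoxelCountOp partial;  // one partial result per sub-tree
        partial(node);
        for (const ToyLeaf& leaf : node.leaves) partial(leaf);
        total.join(partial);
    }
    return total.count;
}
```

Since each partial result depends only on its own sub-tree, the sub-trees could be counted on separate threads and joined at the end.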

34
Case Study 2: Deactivate
Mark as inactive any active tiles or voxels in the given tree whose values are equal to value (to within tolerance)

void tools::deactivate(TreeT& tree, const ValueType& value, const ValueType& tolerance = 0, bool threaded = true);

Threaded over leaf nodes
    tools::foreach(tree.beginLeaf(), op);
+
Unthreaded over other nodes (tiles)
    auto it = tree.beginValueOff();
    it.setMaxDepth(tree.treeDepth() - 2);
    tools::foreach(it, op, /*threaded=*/false);

Old Method (Leaf) 1.49s
New Method (Leaf) 1.21s
Old Method (Tile) 0.30s
New Method (Tile) 0.02s

35
Case Study 3: Merge
Given a and b trees, compute a + b per voxel and store result in a

void tools::compSum(TreeT& a, TreeT& b);

Unthreaded visitor method
template <typename CombineOpT>
void Tree::combineExtended(Tree& other, CombineOpT& op);

std::vector<FloatTree*> trees{&a, &b};
tools::SumMergeOp<FloatTree> op(trees, Steal());
tree::DynamicNodeManager<FloatTree> nodeManager(tree);
nodeManager.foreachTopDown(op);

Old Method 9.73s
New Method 0.77s

36
Case Study 3: Merge
100 VDBs (3.12MV) → 1 VDB (317MV)
VDB Combine SOP 3m 3s
VDB Merge SOP 17.1s

37
Conclusion
Access Pattern?
● Random: ValueAccessor
● Predictable: What is most important?
    ● Simplicity: Evaluate Over?
        ● Leaf Nodes: LeafIterator
        ● Tiles: ValueIterator (depth-1)
        ● Nodes: DepthFirstNodeVisitor
    ● Performance: Evaluate Over?
        ● Leaf Nodes: LeafManager
        ● Tiles or Nodes: Does Tree Topology Change?
            ● Yes: DynamicNodeManager
            ● No: NodeManager
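
The decision chart can also be written down as a small lookup function. This encodes one reading of the flattened chart; the enums and the recommend function are hypothetical helpers for illustration and are not part of OpenVDB.

```cpp
#include <cassert>
#include <string>

// Hypothetical enums mirroring the questions in the chart.
enum class Access { Predictable, Random };
enum class Priority { Simplicity, Performance };
enum class Target { LeafNodes, Tiles, Nodes };

// Returns the name of the tool the chart points to for the given answers.
inline std::string recommend(Access access, Priority priority, Target target,
                             bool topologyChanges)
{
    // Random access goes straight to the accessor.
    if (access == Access::Random) return "ValueAccessor";

    // Predictable access, simplicity first: pick an iterator or visitor.
    if (priority == Priority::Simplicity) {
        switch (target) {
            case Target::LeafNodes: return "LeafIterator";
            case Target::Tiles:     return "ValueIterator (depth-1)";
            case Target::Nodes:     return "DepthFirstNodeVisitor";
        }
    }

    // Predictable access, performance first: pick a manager.
    if (target == Target::LeafNodes) return "LeafManager";
    return topologyChanges ? "DynamicNodeManager" : "NodeManager";
}
```

For example, a performance-critical pass over tiles that also changes topology maps to DynamicNodeManager.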

38
Conclusion
Benchmarks: git clone git@github.com:danrbailey/siggraph2021_openvdb.git


39
