Unit III - Chapter 12
Chapter 12: 3D Reconstruction
12.1 Shape from X
Shape from X refers to methods of reconstructing the three-dimensional shape of objects from various sources of information, such as images (photographs), range data (from sensors like LiDAR), or other cues like shading and texture.
12.2 Active Rangefinding
Active rangefinding covers techniques that actively emit energy (such as light or sound) onto a scene and measure the time it takes for the energy to return, allowing for the determination of distances and thus the reconstruction of the scene's geometry.
12.3 Surface Representations
Surface representations are methods for representing the surfaces of objects in three-dimensional space, often using mathematical representations such as meshes, point clouds, or parametric surfaces like splines or NURBS.
12.4 Point-based Representations
Point-based representations model objects as collections of individual points in 3D space, often obtained from sensors or scanning devices. These points can be processed and analyzed for various applications like object recognition, reconstruction, or visualization.
12.1.1 Shape from Shading and Photometric Stereo
Photometric stereo offers a powerful, non-contact method for reconstructing the three-dimensional shape of objects from multiple images captured under controlled lighting conditions.
• How is this possible?
The surface normal at a point on a surface is a vector perpendicular to the surface at that point. In other words, it points directly away from the surface, indicating the direction the surface is facing locally.
• Assumption: uniform albedo
Uniform albedo refers to a property of a surface whose reflectivity or brightness remains constant across its entire area.
• For example, a diffuse (Lambertian) surface has a reflectance map that is the (non-negative) dot product between the unit surface normal n̂ and the light source direction v̂, scaled by the surface albedo ρ:
R(n̂) = ρ max(0, n̂ · v̂)
• Lambertian Surface
Light falling on it is scattered such that the apparent
brightness of the surface to an observer is the same
regardless of the observer's angle of view.
• smoothness constraint:
The smoothness constraint encourages smoothness in the reconstructed surface by penalizing abrupt changes or discontinuities in
surface orientation.
It assumes that neighboring points on the surface should have similar surface orientations to avoid surface irregularities or artifacts.
• integrability constraint
The integrability constraint ensures that the reconstructed surface is integrable, i.e., that its estimated gradient field actually corresponds to a valid depth or height field.
In practice it enforces consistency of the cross-derivatives (∂p/∂y = ∂q/∂x for the gradients p = ∂z/∂x and q = ∂z/∂y), so that the recovered surface normals can be integrated into a physically plausible surface.
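One common way to enforce the integrability constraint is to project an estimated gradient field onto the nearest integrable one in the Fourier domain (the classic Frankot-Chellappa algorithm). Below is a minimal NumPy sketch, assuming gradients p = ∂z/∂x and q = ∂z/∂y are given on a regular grid with periodic boundary conditions:

```python
import numpy as np

def integrate_gradients(p, q):
    """Recover a height field z whose gradients best match (p, q) in a
    least-squares sense, enforcing integrability in the Fourier domain
    (Frankot-Chellappa projection)."""
    h, w = p.shape
    wx = 2 * np.pi * np.fft.fftfreq(w)      # spatial frequencies along x
    wy = 2 * np.pi * np.fft.fftfreq(h)      # spatial frequencies along y
    u, v = np.meshgrid(wx, wy)
    P, Q = np.fft.fft2(p), np.fft.fft2(q)
    denom = u**2 + v**2
    denom[0, 0] = 1.0                       # avoid division by zero at the DC term
    Z = (-1j * u * P - 1j * v * Q) / denom
    Z[0, 0] = 0.0                           # height is recovered only up to an offset
    return np.real(np.fft.ifft2(Z))
```

Because the projection is a simple per-frequency division, this runs in O(n log n) time; the height is recovered up to an unknown constant offset.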
• Photometric stereo
Another way to make shape from shading more
reliable is to use multiple light sources that can be
selectively turned on and off.
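Under the Lambertian model above, photometric stereo reduces to per-pixel linear least squares: stacking the images taken under k ≥ 3 known light directions gives a linear system whose solution is the albedo-scaled normal. A minimal sketch; the light matrix `L` and the image stack are hypothetical example inputs:

```python
import numpy as np

# Hypothetical 3-light setup: each row is a unit light-direction vector.
L = np.array([[0.0, 0.0, 1.0],
              [0.8, 0.0, 0.6],
              [0.0, 0.8, 0.6]])

def photometric_stereo(intensities, lights):
    """Recover per-pixel albedo and surface normals from >= 3 images
    taken under known, distant point lights (Lambertian assumption).

    intensities: (k, h, w) stack of grayscale images
    lights:      (k, 3) unit light directions
    """
    k, h, w = intensities.shape
    I = intensities.reshape(k, -1)                   # (k, h*w)
    # Solve lights @ g = I for g = albedo * normal, in the least-squares sense.
    g, *_ = np.linalg.lstsq(lights, I, rcond=None)   # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)               # per-pixel albedo
    normals = g / np.maximum(albedo, 1e-8)           # unit normals
    return albedo.reshape(h, w), normals.reshape(3, h, w)
```

A real implementation would additionally mask out shadowed pixels (where the Lambertian dot product clamps to zero) before solving.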
12.1.2 Shape from texture
• Shape from texture
Shape from texture infers surface orientation from the foreshortening of regular patterns as the surface slants or bends away from the camera.
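As a toy illustration of foreshortening: a circular texel viewed under orthographic projection becomes an ellipse whose minor-to-major axis ratio equals the cosine of the surface slant. This sketch assumes the ellipse axes have already been measured in the image:

```python
import numpy as np

def slant_from_texel(major_axis, minor_axis):
    """Estimate the surface slant angle from the foreshortening of a
    circular texel, which projects to an ellipse under orthographic
    projection: cos(slant) = minor / major."""
    return np.arccos(np.clip(minor_axis / major_axis, 0.0, 1.0))
```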
12.1.3 Shape from focus
• Shape from focus
A strong cue for object depth is the amount of blur,
which increases as the object’s surface moves away
from the camera’s focusing distance.
• The amount of blur increases in both directions as
you move away from the focus plane. Therefore, it
is necessary to
1. use two or more images captured with different focus
distance settings
2. or to translate the object in depth and look for the
point of maximum sharpness.
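The first strategy above (a focal stack) can be sketched as: compute a per-pixel focus measure for each frame, then assign each pixel the focus distance of the frame in which it appears sharpest. The squared-gradient measure here is the simple approach; frame contents and focus distances are hypothetical inputs:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def depth_from_focus(stack, focus_depths):
    """Assign each pixel the focus distance of the frame in which it is
    sharpest, using a locally averaged squared-gradient focus measure.

    stack:        (k, h, w) images taken at k different focus settings
    focus_depths: (k,) focus distance of each frame
    """
    sharpness = []
    for img in stack:
        gy, gx = np.gradient(img.astype(float))
        # Average the squared gradient over a 3x3 window for robustness.
        sharpness.append(uniform_filter(gx**2 + gy**2, size=3))
    best = np.argmax(np.stack(sharpness), axis=0)   # sharpest frame per pixel
    return np.asarray(focus_depths)[best]
```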
• The magnification of the object can vary as the
focus distance is changed or the object is moved.
This can be modeled either explicitly (making
correspondence more difficult) or using telecentric
optics, which approximate an orthographic camera
and require an aperture in front of the lens.
• The amount of defocus must be reliably estimated.
A simple approach is to average the squared
gradient in a region but this suffers from several
problems, including the image magnification
problem mentioned above. A better solution is to
use carefully designed rational filters.
http://www.cs.columbia.edu/CAVE/projects/depth_defocus/
12.2 Active Rangefinding
• Kinect
• Azure Kinect DK
• Tango
• ARCore
SPIE: Society of Photo-Optical Instrumentation Engineers
• NBA 2K15
• Bad face scan
• NBA 2K19 face scan
12.2.1 Range Data Merging
• Range data merging
The registration (alignment) of partial 3D surface models and their integration into coherent 3D surfaces.
https://www.youtube.com/watch?v=ii2vHBwlmo8
• Sparse Iterative Closest Point (video)
• Sparse Iterative Closest Point (introduction)
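The core of range data registration is the Iterative Closest Point (ICP) algorithm: alternately find nearest-neighbor correspondences and solve for the best rigid transform (here via the SVD-based Kabsch/Procrustes solution). This is a minimal point-to-point sketch, not the sparse variant referenced above:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iters=20):
    """Minimal point-to-point ICP aligning `source` onto `target`.

    source, target: (n, 3) and (m, 3) point arrays.
    Returns R, t such that source @ R.T + t aligns with target.
    """
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(target)
    src = source.copy()
    for _ in range(iters):
        # 1. Correspondences: nearest target point for each source point.
        _, idx = tree.query(src)
        tgt = target[idx]
        # 2. Best rigid transform via the Kabsch / Procrustes solution.
        mu_s, mu_t = src.mean(0), tgt.mean(0)
        H = (src - mu_s).T @ (tgt - mu_t)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ D @ U.T
        t_step = mu_t - R_step @ mu_s
        # 3. Apply the incremental transform and accumulate it.
        src = src @ R_step.T + t_step
        R = R_step @ R
        t = R_step @ t + t_step
    return R, t
```

Like all ICP variants, this only converges to the correct alignment from a reasonably close initial pose; robust versions reject outlier correspondences.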
Stereolithography (SLA) Technology
• Stereolithography
12.2.2 Application: Digital Heritage
• Active range finding technologies, combined with
surface modeling and appearance modeling
techniques (Section 12.7), are widely used in the
fields of archeological and historical preservation,
which often also goes under the name digital
heritage (MacDonald 2006).
• In such applications, detailed 3D models of cultural
objects are acquired and later used for applications
such as analysis, preservation, restoration, and the
production of duplicate artwork
• The Digital Michelangelo project
laser scan of the David
12.3 Surface Representations
• Surface representations
- Triangle Meshes
- Splines (Farin 1992, 1996)
- Subdivision Surfaces (Stollnitz, DeRose, and Salesin 1996; Zorin, Schröder, and Sweldens 1996; Warren and Weimer 2001; Peters and Reif 2008)
Triangle Meshes
• Introduction to subdivision surfaces
• These representations enable not only the creation of highly detailed models but also processing operations such as interpolation (Section 12.3.1), fairing or smoothing, and decimation and simplification (Section 12.3.2).
12.3.1 Surface Interpolation
• One of the most common operations on surfaces is their reconstruction from a set of sparse data constraints, i.e., scattered data interpolation.
Radial Basis Functions (RBFs) are a popular method for interpolating data in multiple dimensions. They are particularly useful for
approximating a function based on scattered data points. The basic idea is to represent the unknown function f(x) as a sum of radially
symmetric basis functions, each centered at different data points.
12.3.2 Surface Simplification
• To approximate a given mesh with one that has
subdivision connectivity, over which a set of
triangular wavelet coefficients can then be
computed. (Eck, DeRose, Duchamp et al. 1995)
• To use sequential edge collapse operations to go
from the original fine-resolution mesh to a coarse
base-level mesh (Hoppe 1996)
12.3.3 Geometry Images
• To make meshes easier to compress and store in a cache-efficient manner.
• To create geometry images by cutting surface meshes along well-chosen lines and "flattening" the resulting representation into a square (Gu, Gortler, and Hoppe 2002).
The concept of point-based representations in computer graphics provides an alternative to traditional polygonal mesh structures,
particularly triangle meshes. This approach relies on using points or particles as the primary means of representing surfaces. Each point in
the representation carries additional attributes to effectively render the surface without needing to connect the points into triangles.
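A common processing step on such point sets is estimating a surface normal (one of the per-point attributes mentioned above) from each point's local neighborhood via PCA: the normal is the smallest principal axis of the k nearest neighbors. A sketch using SciPy's KD-tree:

```python
import numpy as np
from scipy.spatial import cKDTree

def estimate_normals(points, k=8):
    """Estimate a surface normal at each point of a point cloud as the
    smallest principal axis of its k nearest neighbors (local PCA).
    The sign of each normal is arbitrary without further orientation."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    normals = np.empty_like(points)
    for i, nbrs in enumerate(idx):
        nb = points[nbrs] - points[nbrs].mean(0)       # centered neighborhood
        # Eigenvector of the smallest eigenvalue of the local covariance.
        _, vecs = np.linalg.eigh(nb.T @ nb)
        normals[i] = vecs[:, 0]
    return normals
```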
12.5 Volumetric Representations
• A third alternative for modeling 3D surfaces is to
construct 3D volumetric inside–outside functions
• Implicit (inside–outside) functions to represent 3D
shape
12.5.1 Implicit Surfaces and Level Sets
• Implicit surfaces
Implicit surfaces use an indicator (characteristic) function F(x, y, z) to indicate which 3D points are inside (F(x, y, z) < 0) or outside (F(x, y, z) > 0) the object.
The surface itself is represented by the set of points where F(x, y, z) = 0, known as the zero set or level set.
• Implicit (inside-outside) functions of superquadrics:
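The standard superquadric inside-outside function can be written as F(x, y, z) = ((x/a1)^(2/e2) + (y/a2)^(2/e2))^(e2/e1) + (z/a3)^(2/e1), with F < 1 inside, F = 1 on the surface, and F > 1 outside (subtracting 1 recovers the signed convention used above). A direct evaluation:

```python
import numpy as np

def superquadric(x, y, z, a=(1.0, 1.0, 1.0), e1=1.0, e2=1.0):
    """Inside-outside function of a superquadric.

    Returns F(x, y, z): F < 1 inside, F = 1 on the surface, F > 1 outside.
    a = (a1, a2, a3) are the axis lengths; e1 and e2 are the squareness
    exponents (e1 = e2 = 1 gives an ordinary ellipsoid).
    """
    a1, a2, a3 = a
    xy = (np.abs(x / a1) ** (2 / e2) + np.abs(y / a2) ** (2 / e2)) ** (e2 / e1)
    return xy + np.abs(z / a3) ** (2 / e1)
```

Varying e1 and e2 morphs the shape continuously between rounded (ellipsoidal) and box- or star-like forms, which is what makes superquadrics useful as compact part models.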
Figure slides: Conics (2D curves), Quadrics, Superquadrics.
• A different kind of implicit shape model can be constructed by defining a signed distance function over a regular three-dimensional grid, optionally using an octree spline to represent this function more coarsely away from its surface (zero set) (Lavallée and Szeliski 1995; Szeliski and Lavallée 1996; Frisken, Perry, Rockwood et al. 2000; Ohtake, Belyaev, Alexa et al. 2003).
A signed distance function (SDF) provides a particularly useful form of implicit function:
At any point in space, the value of the SDF indicates the shortest distance to the surface.
The sign of the value indicates whether the point is inside (<0) or outside (>0) the object.
On the surface itself, the SDF equals zero.
These functions can be stored directly on a regular grid or more efficiently using hierarchical structures like octrees, which allow coarser representations far from the surface and more detailed representations near it, improving both memory efficiency and computational speed.
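A concrete example: the SDF of a sphere sampled on a regular grid, the simplest volumetric representation of this kind (the grid resolution and radius are arbitrary illustrative choices):

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance to a sphere: negative inside, zero on the
    surface, positive outside."""
    return np.linalg.norm(points - center, axis=-1) - radius

# Sample the SDF on a small regular grid, as in a volumetric representation.
n = 16
coords = np.linspace(-1.0, 1.0, n)
grid = np.stack(np.meshgrid(coords, coords, coords, indexing='ij'), axis=-1)
sdf = sphere_sdf(grid, center=np.zeros(3), radius=0.5)   # (n, n, n) volume
```

The zero crossings of this volume trace the sphere's surface; a mesh can then be extracted with an isosurface algorithm such as marching cubes.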
• Examples of signed distance functions being used
- distance transforms (Section 3.3.3)
- level sets for 2D contour fitting and tracking
(Section 5.1.4)
- volumetric stereo (Section 11.6.1)
- range data merging (Section 12.2.1)
- point-based modeling (Section 12.4)
Distance Transforms: Compute the minimum distance from each point in a space (e.g., pixels in an image) to a set of features
(e.g., edges).
Level Sets for 2D Contour Fitting and Tracking: Level set methods dynamically evolve contours in images, useful for object
tracking and shape recovery.
Volumetric Stereo: Use in 3D reconstruction from multiple stereo images, leveraging the consistency of distance measures across
views.
Range Data Merging: Combine multiple 3D scans into a single model, smoothing out noise and filling gaps in data.
Point-Based Modeling: Convert point clouds to more smooth and continuous surface models using implicit surface techniques.
12.6 Model-based Reconstruction
• 12.6.1 Architecture
• 12.6.2 Heads and faces
• 12.6.3 Application: Facial animation
• 12.6.4 Whole body modeling and tracking
Model-based reconstruction is a sophisticated approach in computer graphics and computer vision where models of objects (often human
bodies or faces) are used to reconstruct shapes and movements from visual data. This approach is particularly prevalent in applications
such as facial animation and whole-body tracking, due to its efficiency and the high quality of results it can achieve.
12.6.1 Architecture
In the context of model-based reconstruction, "architecture" typically refers to the computational and algorithmic frameworks used to support
the reconstruction process. This includes the data structures, algorithms, and interfaces designed to handle specific types of models and the
data acquired from sensors or images.
• Architectural interior modeling
• Lumion 9
• Automated line-based reconstruction
12.6.2 Heads and Faces
Modeling heads and faces involves creating detailed 3D models that can accurately represent the unique features and
expressions of individual faces.
3D head model applications
• head tracking
• face transfer, i.e., replacing one person’s face with
another in a video
• face beautification by warping face images toward
a more attractive “standard”
• face de-identification for privacy protection
• face swapping
3D Head Model Applications
• face swap from Microsoft
• deepfake damage
• Mrdeepfakes
• Fake videos of real people -- and how to spot them
12.6.3 Application: Facial Animation
• Photoreal Digital Actor
• https://www.youtube.com/watch?v=piJ4Zke7EUw
• Tomb Raider
12.6.4 Whole Body Modeling and Tracking
• Background subtraction
• Initialization and detection
• Tracking with flow
• 3D kinematic models
• Probabilistic models
• Adaptive shape modeling
• Activity recognition
• Background Subtraction
- One of the first steps in many (but certainly not
all) human tracking systems is to model the
background in order to extract the moving
foreground objects (silhouettes) corresponding to
people.
- Once silhouettes have been extracted from one or
more cameras, they can then be modeled using
deformable templates or other contour models
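A minimal running-average background model illustrates the idea: maintain an exponentially weighted mean of past frames and mark pixels that deviate from it as foreground. The learning rate and threshold below are illustrative values, not tuned ones:

```python
import numpy as np

def extract_silhouettes(frames, alpha=0.05, thresh=0.1):
    """Running-average background subtraction.

    Each frame is compared against an exponentially weighted mean of
    past frames; pixels differing by more than `thresh` are marked as
    foreground (silhouette). Returns one boolean mask per frame after
    the first.
    """
    background = frames[0].astype(float)
    masks = []
    for frame in frames[1:]:
        frame = frame.astype(float)
        mask = np.abs(frame - background) > thresh             # foreground pixels
        background = (1 - alpha) * background + alpha * frame  # update the model
        masks.append(mask)
    return masks
```

Real systems add per-pixel variance estimates (e.g., mixtures of Gaussians) to handle lighting changes and dynamic backgrounds.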
• Initialization and Detection
- In order to track people in a fully automated
manner, it is necessary to first detect (or re-acquire)
their presence in individual video frames.
- This topic is closely related to pedestrian detection, which is often considered as a kind of object recognition (Mori, Ren, Efros et al. 2004; Felzenszwalb and Huttenlocher 2005; Felzenszwalb, McAllester, and Ramanan 2008).
• pedestrian detection
• 3D Kinematic Models
- A 3D kinematic model specifies the length of each limb in a skeleton as well as the 2D or 3D rotation angles between the limbs or segments.
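Forward kinematics of such a model chains the per-joint rotations to place each limb endpoint. A 2D sketch with relative joint angles (a real tracker would use 3D rotations per joint):

```python
import numpy as np

def forward_kinematics(lengths, angles):
    """2D forward kinematics of a kinematic chain.

    Each joint rotates by angles[i] relative to its parent limb and
    extends a limb of lengths[i]. Returns the (x, y) position of the
    base and every joint along the chain.
    """
    pos = np.zeros(2)
    theta = 0.0
    joints = [pos.copy()]
    for L, a in zip(lengths, angles):
        theta += a                                        # accumulate rotation
        pos = pos + L * np.array([np.cos(theta), np.sin(theta)])
        joints.append(pos.copy())
    return np.array(joints)
```

Fitting such a model to images means searching over the limb lengths and joint angles so that the projected skeleton matches the observed silhouettes or features.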
• Adaptive Shape Modeling
- Another essential component of whole body
modeling and tracking is the fitting of
parameterized shape models to visual data.
• Estimating Human Shape and Pose From a Single
Image
• Original
• 3D Human Pose and Shape from a Single Image
• Activity Recognition
- The final widely studied topic in human modeling
is motion, activity, and action recognition (Bobick
1997; Hu, Tan, Wang et al. 2004; Hilton, Fua, and
Ronfard 2006)
- Examples of actions that are commonly
recognized include walking and running, jumping,
dancing, picking up objects, sitting down and
standing up, and waving
12.7 Recovering Texture Maps and Albedos
Recovering texture maps and albedos is a crucial step in rendering realistic 3D models in computer graphics. This process allows the model to not only have an accurate shape but also display realistic surface details like color, patterns, and other characteristics that are vital for photorealistic rendering.
• Texture Mapping
Texture mapping is a method for adding detail, surface texture, or color to a 3D model. A texture map is an image applied (mapped) to the surface of a shape or polygon. This technique is used to add details without having to increase the polygon count of the model.
• Albedo
The albedo of an object refers to the diffuse reflectance property of the material, independent of lighting. Albedo maps are a type of texture map that solely represents the color information of the material's surface.
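Given a reconstructed geometry with known normals and lighting, a single-image albedo estimate under the Lambertian assumption simply divides the image by the computed shading. This sketch assumes a grayscale image, a per-pixel normal map, and a single known distant light:

```python
import numpy as np

def recover_albedo(image, normals, light, eps=1e-3):
    """Recover a per-pixel albedo map from one grayscale image, assuming
    a Lambertian surface with known per-pixel unit normals and a known
    distant light direction: I = albedo * max(0, n . l)."""
    shading = np.einsum('ijk,k->ij', normals, light)  # n . l at every pixel
    shading = np.clip(shading, eps, None)             # avoid divide-by-zero in shadow
    return image / shading
```

Pixels near shadow boundaries (where the clamped shading is tiny) give unreliable estimates, so practical systems blend albedo samples from multiple views or light conditions.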
12.7.1 Estimating BRDFs