[Figure: Real scenes → Sensing; Synthetic scenes → Modeling; both feed a Representation, which is Rendered for Display]

Rendering
• In a nutshell: what color of pixels should be
shown to the user
• Process (how) depends on
– Input representation
– Desired quality
• Output (what) of process depends on
– Content
– Viewpoint
16.11.2021 3
Rendering and stereo 3D
• Stereo 3D displays require two or more views
from different viewpoints (more in Lecture 5)
• Rendering gives you a single view from a given
viewpoint (our own definition for this course)
– For this lecture, we can consider stereo 3D to be
achieved by “rendering twice or more“
– In some cases, the display method can have an effect
on the rendering process (e.g. in light fields,
holography)
Rendering
• Image based rendering (IBR)
– Real world scenes
– Captured video
– View+depth, multiview, light field
• Computer graphics rendering (CGR)
– Synthetic scenes
– Content via modeling
– Mesh, procedural
Rendering
• Image based rendering (IBR)
Before going to Image-based Rendering: yet another data
representation:
LIGHT FIELD
Plenoptic function (PF)
• Introduced by Adelson and Bergen (1991)
– Plenus (complete) + Optic = Plenoptic
• Continuous function that describes the amount of light
through every point in space in every direction
– Describes the ’light field’
• 7-D function P(θ, φ, λ, t, Vx, Vy, Vz)
– (Vx, Vy, Vz) – location in 3D space
– (θ, φ) – angles determining the direction
– λ – wavelength
– t – time

[Figure: x, y, z axes with a ray leaving the point (Vx, Vy, Vz) in direction (θ, φ)]
8
3D&VR course
Plenoptic function
• Adding simplifications & assumptions
– Light wavelength (λ) can be replaced by RGB
– Time (t) can be dropped

[Figure: the same x, y, z coordinate diagram as on the previous slide]
Two-plane parameterization
• A 4-D approximation of PF, parameterized through two
parallel planes L(u,v,s,t) (also called light slabs)
– Levoy and Hanrahan (1996) – light field
– Gortler et al. (1996) – Lumigraph

[Figure: two parallel planes; a ray is parameterized by its intersections (u, v) and (s, t) with the planes, with sample spacings Δu and Δs]
Two-plane parameterization
• The two planes are naturally interpreted as
camera plane (uv plane) and image plane (st
plane)
• Example: 17 × 17 images
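As a sketch (not from the slides), a two-plane light field like the 17 × 17 example above can be stored as a 4-D array indexed by camera position (u, v) and pixel position (s, t); the array sizes and names here are illustrative:

```python
import numpy as np

# Hypothetical light field: 17 x 17 cameras, each a 32 x 32 RGB image.
U, V, S, T = 17, 17, 32, 32
L = np.zeros((U, V, S, T, 3), dtype=np.float32)  # L(u, v, s, t)

def sample(L, u, v, s, t):
    """Radiance along the ray through (u, v) on the camera plane
    and (s, t) on the image plane (nearest-neighbor lookup)."""
    return L[u, v, s, t]

# The image seen by camera (u, v) is simply the 2-D slice L[u, v].
center_view = L[8, 8]   # shape (32, 32, 3)
```

Fixing the camera indices gives one view; fixing image indices instead gives the angular samples of one ray bundle.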
EPI: Transforming depths to lines
[Figure: EPI with camera axis t and image axis v; a scene point appears at (v1, t1) in one view and (v2, t2) in another, tracing a line]

• Note that in this figure ’t’ is the camera axis and ‘v’ is the
image axis
EPI: Transforming depths to lines
[Figure: the corresponding EPI in (s, u) coordinates]
• Note that in this figure ’t’ is the camera axis and ‘v’ is the
image axis
Epipolar image (EPI)
• Consider a horizontal camera array forming
N images of resolution Rx by Ry
[Figure: seven cameras labeled 1…7 (N = 7)]
• Stack images to get a volume – a slice of that volume
represents an epipolar image (EPI)
[Figure: the N images stacked into a volume; a horizontal slice through the stack is an EPI]
Epipolar image
• 600 instead of 7 cameras
[Figure: the 600-row EPI compared with the 7-row EPI (magnified) from the previous slide]
Summary of Light Field
• 4D approximation of the 7D plenoptic function
• Most often represented by two-plane
parameterization: one plane for cameras and
one plane for images
– Can be simply explained as a stack of 2-D images,
taken by cameras along horizontal and vertical
direction
– If cameras are moving horizontally only: 3D stack
• Slices of the 4D (or 3D) stacks give rise to EPIs
– Slopes of the lines implicitly represent the depth
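The stack-and-slice construction summarized above can be sketched as follows (array shapes are illustrative, not from the slides): stacking N views gives a 3-D volume, and fixing one image row yields an EPI.

```python
import numpy as np

N, Ry, Rx = 7, 48, 64              # cameras, image height, image width
views = np.random.rand(N, Ry, Rx)  # stand-in for N captured grayscale views

volume = np.stack(list(views), axis=0)  # (N, Ry, Rx) image stack

row = Ry // 2
epi = volume[:, row, :]            # one EPI, shape (N, Rx)
# Each scene point traces a line across the EPI; the slope encodes its depth.
```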
IMAGE BASED RENDERING
Image Based Rendering
• The basic image-based formats are in most
cases trivially converted for display
– It would be an exaggeration to call displaying raster
images rendering
– The “vanilla” formats: stereo and multiview
• It is also possible to synthesize new views from
multiple images
Image Based Rendering
Depth Image based rendering
(DIBR)
• Rendering process for the view+depth
representation
– Also applies to multiview+depth
• Given an image and its corresponding depth
map, what does the scene look like from another
viewpoint?
– The view and the depth map should be from the same viewpoint
– (if not, they can be made to be)
Depth Image based
rendering
• Inputs:
– 2D view of the scene
– Depth map of the scene
• Output:
– 2D view of the scene
from a ”virtual” viewpoint
Depth Image based
rendering (3D Warping)
Key idea:
– Assign a color for each depth pixel
– Project them to the world coordinates
– Project them back to the virtual camera
This procedure is called ‘3D warping’
Depth Image based rendering
(3D Warping)
[Figure: a 3D point M(x, y, z); its projection m(u, v) in the reference view is lifted 2D-to-3D, then projected 3D-to-2D to m′(u′, v′) in the target view]
Depth Image based rendering
(3D Warping)
• Let M = (x, y, z, 1)ᵀ be a world point and m = (u, v, 1)ᵀ be its
projection in the reference camera view.
• For a pinhole camera model: z · m = P · M
where z is the depth of M and P (3×4) is the camera projection
matrix: P = K [R(3×3) | T(3×1)].
• The 3D point M can be reconstructed from the image point m
using the inverse projection matrix P⁻¹ and its depth value z.
• Then, the reconstructed 3D point can be projected onto the
virtual image plane using the projection matrix of the virtual
camera, P′.
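The warping equations can be sketched in a few lines (a minimal pinhole example with identity rotations and a purely translated virtual camera; the intrinsics and values are illustrative):

```python
import numpy as np

K = np.array([[500., 0., 320.],    # intrinsics: focal 500 px, center (320, 240)
              [0., 500., 240.],
              [0., 0., 1.]])

def warp(u, v, z, t_virtual):
    """3D warping: lift pixel (u, v) with depth z to a 3D point,
    then project it into a virtual camera translated by t_virtual."""
    m = np.array([u, v, 1.0])
    M = z * np.linalg.inv(K) @ m   # 2D-to-3D, in the reference camera frame
    m2 = K @ (M - t_virtual)       # 3D-to-2D in the virtual camera
    return m2[:2] / m2[2]

# Pixel at the principal point, depth 2 m, camera shifted 0.1 m to the right:
u2, v2 = warp(320.0, 240.0, 2.0, np.array([0.1, 0.0, 0.0]))  # (295.0, 240.0)
```

Moving the camera to the right shifts the pixel to the left, and nearer points shift more, which is exactly the disparity behavior derived on the later slides.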
Depth Image based rendering
Use case: multiview display
• View + depth
– Encode view+depth (RGBD * 2D data)
– Receive & decode
– Apply DIBR with a different translation for each view
required by the display
– Guess some content for the disoccluded areas
• Other possibility: encode & transmit all views
(e.g. 28 * RGB * 2D data)
Depth Image based rendering
Depth Image based rendering
Disparity from depth

[Figure: two rectified cameras with baseline b and focal length f observe a scene point M(x, z) at depth z; M projects to u_l in the left image and u_r in the right image]

• Disparity: d = u_l − u_r
• From similar triangles: d / b = f / z
• Hence: d = b·f / z
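The relation d = b·f / z is easy to check numerically (the baseline, focal length, and depths below are made up for illustration):

```python
def disparity(b, f, z):
    """Disparity between rectified views: d = b * f / z
    (baseline b in meters, focal length f in pixels, depth z in meters)."""
    return b * f / z

d_near = disparity(b=0.1, f=500.0, z=1.0)   # 50 px
d_far  = disparity(b=0.1, f=500.0, z=10.0)  #  5 px: disparity falls off as 1/z
```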
Artifacts
• Artifacts may appear in virtual views
• Mainly caused by depth discontinuities
• The amount of artifacts increases with the
distance of a virtual view from the original view
• There are three major types of artifacts:
– Occlusions
– Cracks
– Ghost Contours
Occlusions
• A single input view can’t have information on the
whole scene (in a general case)
– Areas with no direct line of sight from the camera
won’t be captured
• Changing viewpoint may reveal areas that were
blocked by other objects (occluded)
– Blank areas in the virtual view
Occlusions
Occlusions
• Blanks should be filled with something
– Interpolation from surrounding pixels
– Interpolation of structure
– Similar image patches etc.
• Representations containing layers or multiple
views and depth maps can give correct
information on the occluded areas
– The rendering process gets more complicated when
more inputs are introduced
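A crude version of the first filling strategy (interpolation from surrounding pixels) can be sketched like this; real DIBR hole filling is considerably more careful, typically preferring background pixels:

```python
import numpy as np

def fill_holes(img, hole_mask):
    """Fill disoccluded pixels by horizontal interpolation from the
    nearest valid neighbors (a toy stand-in for real inpainting)."""
    out = img.copy()
    for y in range(img.shape[0]):
        valid = np.where(~hole_mask[y])[0]
        if valid.size == 0:
            continue                         # whole row is a hole
        holes = np.where(hole_mask[y])[0]
        # np.interp clamps to edge values outside the valid range
        out[y, holes] = np.interp(holes, valid, img[y, valid])
    return out

row = np.array([[1.0, 0.0, 0.0, 4.0]])
mask = np.array([[False, True, True, False]])
filled = fill_holes(row, mask)   # holes become 2.0 and 3.0
```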
Occlusions
Cracks

Y. Mao, G. Cheung, and Y. Ji, “On constructing z-dimensional DIBR-synthesized images,” IEEE Trans. Multimedia, vol. 18, no. 8, pp. 1453–1468, Aug. 2016
Multiple cameras
• Consider a horizontal camera array forming
N images of resolution Rx by Ry
[Figure: seven cameras labeled 1…7 (N = 7)]
• Stack images to get a volume – a slice of that volume
represents an epipolar image (EPI)
[Figure: the N images stacked into a volume; a horizontal slice through the stack is an EPI]
Epipolar image
• 600 instead of 7 cameras
[Figure: the 600-row EPI compared with the 7-row EPI (magnified) from the previous slide]
EPI: Transforming depths to lines
[Figure: the corresponding EPI in (s, u) coordinates]
• Note that in this figure ’t’ is the camera axis and ‘v’ is the
image axis
Light Field reconstruction
• Generating full (densely sampled) light field that was captured by sparsely
located cameras
• Intermediate view generation without explicit depth information
[Figure: a camera array (N = 5) captures views at positions t1, t2, …; denser intermediate positions t′1, t′2 are generated between them, each intermediate sample interpolated from its neighboring captured samples: L = f(L11, L12, L21, L22), and likewise L′ = f(L′11, L′12, L′21, L′22)]
Reconstruction by processing EPIs
[Figure: a set of captured views gives a coarsely sampled EPI in (v, t); reconstruction produces a densely sampled EPI with ≤ 1 px disparity between adjacent rows]
3D Graphics vs. Photography
Rendering Pipeline

[Figure: the rendering pipeline: vertices → Vertex Shader (writes gl_Position and per-vertex varying variables) → Primitive Assembly → Geometry Shader → Clipping & Culling → Rasterization (per-fragment varying variables) → Fragment Shader (shaded colors per fragment); uniform variables, material, and texture data feed the shaders]
Rendering Pipeline:
Uniform Variables
• Besides the attribute data, shader can also access
uniform variables
• Uniform variables can only be set in between OpenGL
draw calls and not per vertex
• Used to describe things that do not change from vertex to
vertex
• Can describe e.g.
– the virtual camera parameters that map abstract 3D
coordinates to the actual 2D screen
– position of some light sources in the scene
Rendering Pipeline:
Vertex Shader
• Vertex data is transferred from CPU to GPU and stored in
a vertex buffer
• Each vertex gets processed independently by the vertex
shader
• Vertex shader can perform arbitrary operations on vertices
• The most typical use of the vertex shader is to determine
the final position of the vertices on the screen
• In order to output the transformed vertex position, the
shader must write to the predefined variable gl_Position
Rendering Pipeline:
Varying Variables
• The vertex shader can also output other variables,
called varying variables
• Varying variables are used by the fragment shader to
determine the final color of each fragment covered by
the triangle
• The transformed vertices and their varying variables
are collected by the triangle assembler and grouped
together in triplets
Rendering Pipeline
Uniform Variables
[Figure: vertex buffer with per-vertex attributes → Vertex Shader (uniform variables) → gl_Position and varying variables → Assembler]
Rendering Pipeline:
Vertex Shader Example
#version 330
uniform mat4 uProjectionMatrix;
uniform mat4 uModelViewMatrix;
in vec3 aVertex;
in vec3 aColor;
out vec3 vColor;
void main()
{
gl_Position = uProjectionMatrix * uModelViewMatrix * vec4(aVertex, 1.0);
vColor = aColor;
}
Rendering Pipeline:
Geometry Shader
• A geometry shader is an optional stage of the pipeline
• If present, it takes a whole triangle as an input and
has access to all vertices that make up the triangle
• Vertex adjacency information can be also provided
• The output of a geometry shader can be zero or more
triangles: some triangles can be filtered out or new
triangles can be generated
Rendering Pipeline:
Clipping & Culling
• To reduce workload for the following stages, clipping and
culling tests are performed in order to discard triangles
that are not seen in the current frame
[Figure: the clipping stage detects triangles that lie outside the view volume in screen space; culling discards triangles not facing the viewer, keeping front faces]
Rendering Pipeline:
Rasterization
• Rasterization maps triangles to pixels on the screen,
i.e. defines a set of pixel-size fragments that are part
of a triangle
• For each fragment, the rasterizer computes an
interpolated value of position in screen-space and other
vertex attributes (e.g. color, normal, etc.)
• The value for each varying variable is set by blending
the three values associated with the triangle’s vertices
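The blending of the three vertex values can be sketched with barycentric weights (an illustrative standalone version, not the actual rasterizer, which also applies perspective correction):

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates (w0, w1, w2) of point p in triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    w1 = (d11 * d20 - d01 * d21) / denom
    w2 = (d00 * d21 - d01 * d20) / denom
    return 1.0 - w1 - w2, w1, w2

def interpolate(p, verts, values):
    """Varying-variable value at fragment p, blended from the vertex values."""
    w = barycentric(p, *verts)
    return sum(wi * vi for wi, vi in zip(w, values))

tri = [np.array([0., 0.]), np.array([1., 0.]), np.array([0., 1.])]
value = interpolate(np.array([1/3, 1/3]), tri, [0.0, 1.0, 2.0])  # centroid -> 1.0
```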
Rendering Pipeline
[Figure: assembled triangles with gl_Position and varying variables pass from the Assembler to the Rasterizer, which outputs pixel-size fragments with interpolated varying variables]
Rendering Pipeline:
Fragment Shader
• Fragment shader computes the final color value of each
fragment based on the information passed through
varying and uniform variables
• Takes a single fragment as an input and produces a single
fragment as an output
• By changing fragment shader, we can simulate different
types of light sources and object materials as well as give
high visual complexity to simple geometric objects
Rendering Pipeline
Uniform Variables
[Figure: fragments with interpolated varying variables (and uniform variables) → Fragment Shader → screen color → Frame Buffer]
Rendering Pipeline:
Fragment Shader Example
#version 330
in vec3 vColor;
out vec4 fragColor;
void main(void)
{
fragColor = vec4(vColor.x, vColor.y, vColor.z, 1.0);
}
Rendering Pipeline:
Raster Operations
• Raster operations determine the final color of pixels in
the framebuffer based on the fragments produced by
the fragment shader
• First, a visibility (depth) test is performed to determine if the
fragment is visible and needs to be added to the
framebuffer
• Then, blending is performed to blend the color of the
fragment with the color of the already rendered pixel
Rendering Pipeline:
Z-buffer
• Z-buffer holds distance of object
from the viewport plane
• Determines which fragments are
visible in the given 2D viewport
• Depth of a new fragment is
compared with the value stored in
the Z-buffer
• The color and depth value of the
pixel is overwritten only if the
current fragment is closer
• Note the similarity with view+depth
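The depth test can be sketched in a few lines (an illustrative standalone version; sizes and colors are made up):

```python
import numpy as np

W, H = 4, 4
zbuffer = np.full((H, W), np.inf)   # distance from the viewport plane
color = np.zeros((H, W, 3))         # frame buffer

def write_fragment(x, y, depth, rgb):
    """Z-buffer test: keep the fragment only if it is closer than
    what is already stored for this pixel."""
    if depth < zbuffer[y, x]:
        zbuffer[y, x] = depth
        color[y, x] = rgb

write_fragment(1, 1, 5.0, (1.0, 0.0, 0.0))   # red surface at depth 5
write_fragment(1, 1, 2.0, (0.0, 1.0, 0.0))   # green surface in front: wins
write_fragment(1, 1, 9.0, (0.0, 0.0, 1.0))   # blue surface behind: rejected
```

Note how close this is to view+depth: after rendering, `color` and `zbuffer` together are exactly a view and its depth map.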
Rendering Pipeline:
Alpha Blending
• Blending is performed to blend the color of the fragment
with the color of the already rendered pixel
• Additional alpha-channel (RGBA) is used to describe
opacity/transparency of each pixel
• 1 - fully opaque, 0 - fully transparent, fractional - partially
transparent pixel
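The standard "over" blend used in this step can be sketched as (illustrative):

```python
def blend(src_rgb, src_alpha, dst_rgb):
    """Alpha blending: out = a * src + (1 - a) * dst.
    a = 1 is fully opaque, a = 0 fully transparent."""
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src_rgb, dst_rgb))

# Half-transparent white fragment over a black pixel -> mid gray:
out = blend((1.0, 1.0, 1.0), 0.5, (0.0, 0.0, 0.0))   # (0.5, 0.5, 0.5)
```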
3D CG Rendering Algorithms
• Trying to model optical properties of a material:
– Reflected light
– Refracted light
uniform vec3 uLight;
in vec3 vPosition;
in vec3 vNormal;
in vec3 vColor;
out vec4 fragColor;
void main() {
vec3 toLight = normalize(uLight - vPosition);
vec3 normal = normalize(vNormal);
float diffuse = max(0.0, dot(normal, toLight));
vec3 intensity = vColor * diffuse;
fragColor = vec4(intensity.x, intensity.y, intensity.z, 1.0);
}
CG Shading: Phong
• Reflected light = Diffuse
component + Specular component
– Diffuse according to the Lambertian
model
– Specular component – creates the
shiny patches
• The specular component
– Depends on position of viewer
– Depends on position of the light source
CG Shading: Law of Reflection
The angle of incidence and the angle of reflection
are equal 𝛉𝑖 = 𝛉𝑟
[Figure: surface normal with the incoming toLight direction at angle θi, the reflected direction at angle θr, and the toEye direction at angle φ from the reflected ray]
CG Shading: Law of Reflection
[Figure: incident vector I, reflected vector R, and normal N, decomposed into components A and B]

I = A + B        R = A − B
B = cos θ · N
I = A + cos θ · N    ⇒    A = I − cos θ · N
R = A − cos θ · N = I − 2 · cos θ · N
R = I − 2 · dot(N, I) · N
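The final formula is what GLSL's built-in reflect() computes; as a sketch:

```python
import numpy as np

def reflect(I, N):
    """Mirror direction: R = I - 2 * dot(N, I) * N (N must be unit length)."""
    return I - 2.0 * np.dot(N, I) * N

N = np.array([0.0, 0.0, 1.0])
I = np.array([0.0, 0.0, -1.0])   # ray hitting the surface head-on
R = reflect(I, N)                # bounces straight back: (0, 0, 1)
```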
uniform vec3 uLight;
in vec3 vPosition;
in vec3 vNormal;
void main() {
vec3 toLight = normalize(uLight - vPosition);
vec3 toEye = normalize(-vPosition);
vec3 normal = normalize(vNormal);
vec3 reflected = normalize(reflect(-toLight, normal));
// ... the specular term is then computed from dot(toEye, reflected)
}
Texture
• Typically: representative
samples of desired surfaces
• Often no 1:1 mapping with
geometry like in view+depth
• Photographed, scanned or
drawn
• For certain types, might
also be procedurally
generated, such as marble
1996 vs 2013
Texture mapping
• Could be planar,
spherical, box…
Texture mapping
• Each vertex is associated with texture coordinate
(u,v)
• Varying variables are used to store texture
coordinates
• Texture coordinates are interpolated over each
triangle, giving us coordinates per fragment
• A uniform variable is used to point to the texture
• Fragment shader computes pixel color by
fetching data from the texture
Texture mapping: Example
Fragment shader:
#version 330
uniform vec3 uLight;
uniform sampler2D uNormalMap;
in vec3 vPosition;
in vec3 vColor;
in vec2 texCoord;
out vec4 fragColor;
void main()
{
vec3 toLight = normalize(uLight - vPosition);
vec3 bump = normalize( texture2D(uNormalMap, texCoord).xyz * 2.0 - 1.0);
float diffuse = max(0.0, dot(bump, toLight));
vec3 intensity = vColor * diffuse;
fragColor = vec4(intensity.x, intensity.y, intensity.z, 1.0);
}
Environment Mapping
• Rendering approximate
reflections on shiny surfaces
• Render the environment once
– Store it as an environment map
– Use it as a texture on the object
• Fast to compute, good for real time
– Plausible visual effect
– Assumes a static environment, otherwise the map
should be rerendered for each frame
Environment Mapping:
Cube Maps
• Six square textures
represent the faces
of a large cube
surrounding the
scene (cube map)
• Each texture pixel
represents the color
as seen along one
direction in the
environment
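Picking which of the six faces a lookup direction hits can be sketched as follows (an illustrative stand-alone version of the face-selection step the hardware performs):

```python
def cube_face(d):
    """Select the cube-map face a direction vector d = (x, y, z) hits:
    the face of the largest-magnitude component, signed."""
    x, y, z = d
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"

face = cube_face((0.2, -0.9, 0.1))   # "-y": a mostly downward direction
```

The remaining two components, divided by the dominant one, give the 2-D texture coordinates within that face.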
Environment Mapping:
Cube Map
Fragment shader:
#version 330
uniform samplerCube uTexUnit0;
in vec3 vNormal;
in vec3 vPosition;
out vec4 fragColor;
void main()
{
vec3 normal = normalize(vNormal);
vec3 reflected = reflect(normalize(-vPosition), normal);
vec4 texColor = textureCube(uTexUnit0, reflected);
fragColor = texColor;
}
Image Based Lighting
• Environment map serves
as a light source
• Light Rays projected
from map onto the object
• Color of pixels on the map
determines the color and
brightness of regions on object
• Very realistic results
• Suitable for real time applications
Real time 3D graphics
• In practice, games
– Unreal Engine:
https://www.youtube.com/watch?v=Vh9msqaoJZw
– Unreal Engine vs Unity:
https://www.youtube.com/watch?v=hKYU6Q0KdqM
– Rockstar Advanced Game Engine (RAGE):
https://www.youtube.com/watch?v=yBpBi5ivOoQ
Ray Tracing
• “Traditional” rendering techniques explicitly
mimic effects light causes in the scene
• Ray tracing tries to simulate the behavior of
individual light rays
– Shadows, reflections are produced implicitly
• Different from standard OpenGL pipeline
• In real life
– Ray of light starts from a light source
– Gets reflected and refracted many times on objects
– Arrives at the eye of the observer
Ray Tracing
• Many rays get scattered and absorbed, never
reaching an observer
• To exploit this, ray tracing does the process in
reverse
– Rays which wouldn’t reach the observer won’t be
dealt with at any point
• Rays are shot from the observer to determine if
they directly or indirectly reach light sources
Ray Tracing
• A bunch of light rays is shot from the virtual camera
through every pixel in the viewport (image plane)
• Each time a ray hits a surface three new rays
are generated:
– Shadow
– Reflection
– Refraction
Ray Tracing
• Shadow ray
– from the point where the incoming ray hits the object
directly to the light source
– If there is another object in the path of the shadow ray,
the object is in shadow
• Reflection ray
– simulates reflected part of light of the original ray
• Refraction ray
– simulates refracted part of light
– used only if object is semi-transparent
Ray Tracing
• The main computation in ray tracing is the
intersection of a ray with objects in the scene
• Instead of testing every scene object for
intersection with the ray, auxiliary data structures
(Bounding Volume Hierarchy) can be used to
quickly determine if a set of objects is entirely
missed by the ray
• Simple bounding shapes are used (sphere, box)
• BVH forms a tree structure on a set of geometric
objects
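The core intersection test can be sketched for the simplest bounding shape, a sphere (illustrative; a real tracer runs this kind of test while traversing the BVH tree):

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Distance along the ray to the nearest sphere hit,
    or None if the ray misses (direction must be unit length)."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * c               # discriminant of the quadratic
    if disc < 0.0:
        return None                      # sphere entirely missed
    t = (-b - math.sqrt(disc)) / 2.0     # nearer of the two roots
    return t if t > 0.0 else None

hit = ray_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0)   # hits at t = 4
miss = ray_sphere((0, 0, 0), (0, 1, 0), (0, 0, 5), 1.0)  # None
```

If the bounding sphere of a whole object group is missed, none of the objects inside it need to be tested, which is exactly what the BVH exploits.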
Ray Tracing
• Computationally heavy
– Number of rays grows exponentially
– Needs to be done for every frame
– Most problems are solved and features added by
shooting more and more rays
• To reduce computational complexity, number
of ray bounces is limited
Ray Tracing
• The rays are bouncing recursively until the
contribution of the ray to the source pixel
becomes too weak
• Result depends on:
– Camera
– Light sources
– Objects
• Every time one of them changes, everything has
to be recomputed
(Image: Gilvan Isbiro)
Real Time Ray Tracing
• With the advances in computing power,
ray tracing has become (kinda) feasible at
interactive rates
• Exhaustive raytrace still not an option
– A bit of ray tracing and a bunch of dirty tricks
to make it work
Real Time Ray Tracing
• An insufficient number of rays leads to
noise in the images
Real Time Ray Tracing
Ray Tracing
• Alex Roman
– Above Everything Else, 2010:
http://vimeo.com/15630517
– The Third & The Seventh, 2010:
http://vimeo.com/7809605
• Very realistic
• Very slow
Ray Tracing
General principles of 3D
Graphics
• The idea is always to balance between accurate
modeling and the available resources
• Photorealism is (almost) achievable, but with a
huge resource cost
• The trick is to do it just well enough that the
viewer doesn’t realize they are being tricked
– Martin Mittring: “How to scale down and not get caught – The
Unreal Engine 4 ‘Rivalry’ Demo”, youtu.be/HY62PAsM7eg
General principles of 3D
Graphics
• Most algorithms rely on assumptions to function
efficiently enough, such as
– Lighting is static
– Geometry is static
– Camera position is static or changes slowly
– Viewer is too far to see a difference
• It is up to the programmer to pick the correct
algorithm for the application
Virtual Reality Rendering
• In principle the same as rendering
stereoscopic content
• Additional step is to match camera pose
with the pose of the headset
– Render one view per eye
– Keep the sync with head pose!
• Practical point: has to be much faster
– For smooth viewing, 90 frames per second,
both eyes
VR: Foveated Rendering
• The human eye sees a very limited area in
“full resolution”
– Render that in high quality
• Everything else can be rendered at a
lower quality, increasing frame rate
• Requires that the renderer knows where
the viewer is looking
– Requires extra hardware for tracking
VR: Time Warp
• What if the viewer moves their head after the
frame has been rendered?
– At 60 Hz, 16.7 ms between tracking and
showing
• Head motion can be compensated to a
degree by warping the image
– Depth is easily extracted from the rendered frame
– Texture is the rendered image
→ Sounds like depth image based rendering
GRAPHICS ARCHITECTURE
Graphics System Architecture
Graphics Card
http://michaelgalloy.com/2013/06/11/cpu-vs-gpu-performance.html
Graphics Memory
Nvidia RTX Platform
• You can only cast a limited number of rays from each
pixel in the virtual camera
• So, unless you leave your ray tracer running long
enough to fill in the scene, you have a lot of
unpleasant-looking “bald spots” or “noise”
• If that noise can be reduced separately, you can produce
quality output much faster compared to sending out that
many more rays
• Nvidia uses this technique to create frames more quickly
• Performing it in real time relies heavily on the Tensor “AI”
Cores in its Turing GPUs
Nvidia RTX Platform
• Nvidia Deep Learning Super-Sampling (DLSS) is a
method that uses Nvidia’s supercomputers and a game-
scanning neural network to work out the most efficient
way to perform AI-powered antialiasing
• The supercomputers will work this out using early access
copies of the game, with these instructions then used by
Nvidia’s GPUs
• RT Cores (there isn’t a lot of information out about
them) in particular allow for much faster calculation of
the intersection of a ray with objects stored in a
Bounding Volume Hierarchy (BVH)
MISC STUFF
Viewport