Professional Documents
Culture Documents
Mälardalens University
Michael Carlie
2016-06-15
Bézier surfaces and NURBS surfaces along with basic algebraic implicit surfaces are
implemented in order to test the performance relative to polygonal meshes
approximating the same objects. The parametric surfaces are implemented using an
iterative Newtonian method that converges on the point of intersection using a
bounding volume hierarchy that stores the initial guesses. Research into intersecting
rays with parametric surfaces is surveyed in order to find additional methods that speed
up the computation. The implicit surfaces are implemented using common direct
algebraic methods. All of the intersection tests are implemented using the Embree ray
tracing API as well as a SIMD library in order to achieve interactive framerates on a CPU.
The results show that both Bézier surfaces and NURBS surfaces can achieve interactive
framerates on a CPU using SIMD computation, with Bézier surfaces coming close to the
performance of polygonal counterparts. The implicit surfaces implemented outperform
even the simplest polygonal approximations.
2
Table of Contents
Abstract ............................................................................................................................................................... 2
1. Introduction .................................................................................................................................................. 5
1.1 Problem Definition....................................................................................................................... 5
2. Background ................................................................................................................................................... 5
2.1 Rendering Algorithms ....................................................................................................................... 5
2.2 Ray Tracing ............................................................................................................................................ 6
2.2.1 Illumination ................................................................................................................................... 7
2.3 Bounding Volume Hierarchy ........................................................................................................... 8
2.4 Embree .................................................................................................................................................... 8
2.5 Non-Polygonal Ray Tracing .......................................................................................................... 11
2.5.1 Ray Tracing Implicit Surfaces.............................................................................................. 11
2.5.2 Ray Tracing Parametric Surfaces ....................................................................................... 12
3. Method ......................................................................................................................................................... 13
4. Theory .......................................................................................................................................................... 13
4.1 Implicit Surfaces ............................................................................................................................... 13
4.2 Parametric Surfaces......................................................................................................................... 14
4.3 Continuity ............................................................................................................................................ 15
4.4 Bézier Curves and Surfaces .......................................................................................................... 15
4.4.1 Bézier Curves ............................................................................................................................. 15
4.4.2 Continuity between Bézier Curves .................................................................................... 16
4.4.3 De Casteljau’s Algorithm ....................................................................................................... 17
4.4.4 Bézier Surfaces .......................................................................................................................... 17
4.4.5 Partial Derivatives ................................................................................................................... 18
4.4.6 Matrix Representation ........................................................................................................... 18
4.5 Non-Uniform Rational B-Spline (NURBS) Curves and Surfaces ..................................... 19
4.5.1 NURBS Curve ............................................................................................................................. 19
4.5.2 Basis Functions ......................................................................................................................... 20
4.5.3 Knot Vector ................................................................................................................................. 21
4.5.4 Local Support ............................................................................................................................. 22
4.5.5 Knot Insertion ........................................................................................................................... 22
4.5.6 NURBS Surface .......................................................................................................................... 24
4.5.7 Partial Derivatives ................................................................................................................... 25
5 Ray Tracing Implicit Surfaces .............................................................................................................. 26
5.1 Ray-Sphere Intersection ................................................................................................................ 26
5.2 Ray-Cylinder Intersection ............................................................................................................. 27
6. Ray Tracing Parametric Surfaces....................................................................................................... 28
6.1 Newton Rhapson .............................................................................................................................. 28
6.2 Ray Representation ......................................................................................................................... 29
6.3 Newton’s Method for Ray-Surface Intersection ................................................................... 29
6.4 Bézier Surface Evaluation ............................................................................................................. 31
6.5 Bézier Surface Subdivision ........................................................................................................... 31
6.5.1 Bézier Flatness Test ................................................................................................................ 32
6.6 NURBS Surface Evaluation ............................................................................................................ 33
6.6.1 Cox De-Boor Items ................................................................................................................... 34
6.6.2 Basis-Function Cache.............................................................................................................. 35
6.7 NURBS Subdivision .......................................................................................................................... 36
3
6.7.1 NURBS Flatness Test ............................................................................................................... 36
7. Implementation ........................................................................................................................................ 37
7.1 Implicit Surfaces ............................................................................................................................... 37
7.2 Parametric Surfaces......................................................................................................................... 37
7.2.1 Bezier Surface Preprocessing .............................................................................................. 37
7.2.2 NURBS Surface Preprocessing ............................................................................................ 38
7.2.3 Bounding volume hierarchy ................................................................................................ 40
7.2.4 Intersection Testing ................................................................................................................ 40
7.2.5 Bézier Surface Evaluation ..................................................................................................... 41
7.2.6 NURBS Surface Evaluation ................................................................................................... 41
8. Results .......................................................................................................................................................... 41
8.1 Test System ......................................................................................................................................... 41
8.2 Test Setup ............................................................................................................................................ 41
8.3 Polygonal Meshes ............................................................................................................................. 42
8.3.1 Image Quality ............................................................................................................................. 42
8.3.3 Memory Consumption............................................................................................................ 43
8.4 Implicit Surfaces ............................................................................................................................... 43
8.4.1 Image Quality ............................................................................................................................. 43
8.4.2 Performance ............................................................................................................................... 44
8.4.3 Memory Consumption............................................................................................................ 46
8.5 Bézier Surfaces .................................................................................................................................. 46
8.5.1 Image Quality ............................................................................................................................. 46
8.5.2 Performance ............................................................................................................................... 47
8.5.3 Memory Consumption............................................................................................................ 49
8.6 NURBS ................................................................................................................................................... 49
8.6.1 Image Quality ............................................................................................................................. 49
8.6.2 Performance ............................................................................................................................... 50
8.6.3 Memory Consumption............................................................................................................ 52
9. Discussion ................................................................................................................................................... 52
9.1 Related Research .............................................................................................................................. 53
10. Conclusions .............................................................................................................................................. 54
10.1 Future work ..................................................................................................................................... 54
Bibliography ................................................................................................................................................... 55
4
1. Introduction
Ray Tracing is a rendering technique that produces images by generating rays that are
projected from the camera. Intersection tests are conducted between the rays and the
geometry representations that constitute the scene in order to determine visibility and
the amount of light that is reflected back along the rays towards the camera. Ray tracing
is well suited for rendering photo-realistic images with reflections, refractions and other
lighting techniques since the generated rays simulate the behavior of light. Geometry
representations in rendering applications are typically represented using a set of
adjacent polygonal faces that together represent the geometry. With ray tracing such
polygonal meshes are rendered by conducting intersection tests with each of
the polygonal faces in the mesh, however ray tracing can also be used to render non-
polygonal mathematically defined geometry representations. Many real-world objects
can be represented using such mathematically defined geometries. Examples include
implicit surfaces such as cones, cylinders ellipsoids and spheres, as well as Bézier
surfaces and NURBS. Traditionally these geometries are approximated and represented
using polygonal meshes in rendering applications through a process known as
tessellation, however such methods have problems with unintended artefacts and
limited scalability.
2. Background
2.1 Rendering Algorithms
The fundamental algorithms used to represent images in computer graphics can be
generally classified as either projective or image-space algorithms. Projective algorithms
typically use a process known as rasterization to map the three-dimensional geometric
representations to pixels in a two-dimensional space in a per object fashion. The shading
of the individual objects is performed on each of the objects locally in order to give the
appearance of depth and determine how they interact with light. Image-space
algorithms on the other hand determine the visibility and shading of an object by finding
5
geometries along a line of sight and computing the amount of light that is reflected back
towards the viewer for each pixel. A common type of image-space rendering algorithm is
ray tracing [1].
Computer graphics implementations can be further categorized into offline and real-
time implementations. The purpose of real-time rendering is to create the illusion of
continuity in image rendering and providing interactivity by rapidly rendering each
image based on the user’s input. This cycle of user input followed by rendering should
happen at a fast enough rate to where the viewer is unaware of the fact that each image
is rendered in a discrete fashion. The rate at which images are rendered and displayed is
measured in frames per second (fps) or Hertz (Hz). Offline rendering implementations
on the other hand are not necessarily concerned with the time it takes to render an
image [2].
The rays initially generated from the camera are called the primary rays. Secondary rays
can be generated recursively from the points of intersection with objects in the scene of
the primary rays in order to produce additional effects such as shadows, reflection and
refraction. Rays can be successively generated from points of intersection of prior rays
in order to increase realism or produce additional effects such as reflectivity. The
recursive generation can be halted based on some specified depth limit or when the rays
stop producing significant information. The ray tracing algorithm is suitable for
rendering photo-realistic images since the rays simulate the behavior of light.
6
Figure 1 A demonstration of the ray tracing algorithm with primary rays, shadow rays and a sphere object.
Areas and objects under shadows are determined by generating a ray directed towards a
light source from an initial point of intersection and testing for objects that are between
the initial point of intersection and the light source If an object is detected by the
shadow ray the initial point of intersection is colored in such a way that it appears to be
shaded. Reflection is determined by generating a reflected ray from a previous point of
intersection on a reflective object and determining if there is another object in the path
of the new reflection ray. If another object is detected the color of the point of
intersection of the original reflective object has the color of the object detected by the
reflection ray added to it. [1]
2.2.1 Illumination
Once the visibility of an object in the scene has been determined a method is needed to
determine what color the corresponding pixel should have. The simplest way to color an
object is to simply assign some desired color to that pixel once a successful intersection
test has been conducted. In order to give the appearance of being three-dimensional on
a two dimensional screen an illumination model, simulating the way light interacts with
physical objects, should be implemented. A common illumination model that simulates
light was first described by Phong [4]. For each material in the scene the specular,
diffuse, ambient and shininess constants, denoted 𝐾𝑠 , 𝐾𝑑 , 𝐾𝑎 , 𝛼 respectively, are defined.
Additionally, we have L which is the direction from the point of intersection to the light
point in the scene, the normal N from the surface at the point of intersection, R which is
a vector representing the direction that a reflected ray originating from the light source
would take if it hit the object at the point of intersection and V which is a vector pointing
towards the camera from the point of intersection. The final color I then becomes
7
𝐼 = 𝐾𝑎 𝐼𝑎 + 𝐾𝑑 𝐼𝑑 max(𝑁 ∙ 𝐿, 0) + 𝐾𝑠 𝐼𝑠 max((𝑉 ∙ 𝑅)𝛼 , 0) (4.51)
Other lighting models for interesting effects and a more physically accurate modification
of the Phong model exist [5], however since the focus of this paper is on the intersection
aspect of ray tracing, this basic illumination model is presented for a cursory
background on the subject of illumination.
Various kind of bounding volumes can be used to construct such hierarchies. Spheres,
bounding boxes and axis-aligned bounding boxes being the most common. The type of
volume most suitable for the hierarchy usually depends on the types of objects
contained within the leaf nodes [6].
Other types of accelerated spacial structures exist such as uniform grids, bounding
interval hierarchies, octrees and kd-trees. In [7] a comparative study of all of these types
was conducted. The authors concluded that kd-trees are generally faster than bounding
volume hierarchies but that bounding volume hierarchies have the advantage of lower
memory consumption and a better availability of robust and fast building algorithms.
Both kd-trees and bounding volumes hierarchies were faster than the other alternatives.
2.4 Embree
Embree is a CPU-based ray tracing API consisting of low-level kernels intended to
maximize utilization of modern x86 CPU architectures using Single instruction, multiple
data (SIMD) computation for ray traced rendering implementations. Embree allows for
runtime code selection that optimizes the accelerated data structures, all of which are
variant of bounding volume hierarchies, and build algorithms used in a way that best
suits the hardware capabilities and architecture that the ray tracing implementation is
running on. Both packet and single ray tracing intersection computations are supported
with the ability to dynamically switch between them. Embree has built-in intersection
methods for various primitives but also allows for user-defined intersection testing with
arbitrary geometries [8].
8
Embree supports ray packets of sizes 4, 8 or 16 for both determining intersection and
occlusion. Embree also uses SIMD math library containing type definition for floating-
point variables, double-precision variables, integers and booleans among others that can
contain between 4 and 16 separate values in a SIMD variable which allows for
computing up to 16 simultaneous computations on each of the values in the variables.
Besides these basic types the library also includes second, third and fourth dimensional
vectors that can contain the respective variables in each of the elements allowing for
easily implemented SIMD vector algebra computation. The SIMD library can be used for
computing intersections with user defined geometries by including the library from the
Embree source code, available at [9]. The following table described the types used in this
thesis.
Table 1 The types from the Embree SIMD library used in this thesis
Type Description
Basic types that contain eight floats,
vfloat8
integers and booleans respectively. The
Embree source includes overloads and
vint8
functions for basic arithmetic operations
as well as fused multiply add (FMA)
vbool8 operations, all using Advanced Vector
Extensions (AVX) instrinsics
Vector types that can contain any of the
Vec2
basic types as elements. Includes
overloads for arithmetic operations as
Vec3
well as vector specific operations such as
dot and cross product calculation using
Vec4
AVX instrinsics.
Embree uses a device concept, where an RTCDevice object is instantiated and used for
each ray tracing system implemented in the code. These devises don’t interfere with
each other so that several devices can be used for separate purposes. Embree also
provides an RTCScene object which is a container for a list of geometries of potentially
various types. Scenes can be created with different options depending on whether ray
packets are supported or if the scene is to be dynamic or static.
For a dynamic scene, geometries that are included are enabled by default and can be
disabled, re-enabled or removed. Once a geometry has been included in the scene and
enabled a call to the API to commit the scene will cause a bounding volume hierarchy to
be created for the objects in that particular scene. Objects can later be modified after
which a subsequent call to commit the scene will cause the bounding volume hierarchy
to be created again. On the other hand, a static scene cannot be modified after the initial
call to commit the scene.
Each object that is placed in a scene is given a unique identification number called
geomID by Embree. This number can be used to retrieve, modify or delete the geometry
once it’s in the scene.
9
The method with which Embree creates the bounding volume hierarchy for polygonal
meshes can be altered by specifying one of the following flags when creating a scene.
Table 2 Flags used to set the conditions for bounding volume hierarchy generation in Embree. The flags are
considered suggestions by Embree and may be ignored
Embree also provides its own ray structures for various packet sizes that can be passed
to a scene in order to check for intersection. The structure for a single ray is given in
table 3.
Table 3 the ray object provided by Embree with a description of the associated types.
RTCRay
org Ray origin
dir Ray direction
tnear Start of ray
tfar End of ray
time Used for motion blur
mask Used to mask out geometries
Ng Object normal
u, v Barycentric coordinates of hit
geomID The ID of the hit geometry
primID The ID of the primitive hit
instID Instance ID of hit
For ray packets the data elements of the ray are eight length arrays of their single ray
counterparts. The org, dir and Ng variables are split into three separated eight length
arrays for each dimension for the ray packet structure.
The rays are passed to the scene by calling the function rtcIntersect for a single ray
or rtcIntersect8 for a packet of eight rays along with the scene that the rays are
going to be tested for intersection in. When calling an rtcIntersect function for a ray
packet a valid mask also needs to be created and passed to the function. The valid mask
is an integer array with each of the values initially set to negative one. When one of the
rays in the ray packet intersects an object, Embree sets the corresponding element in the
10
valid mask to zero. When the packet is later tested against another object, any ray which
has its valid mask set to zero is ignored. This also needs to be accounted for manually
when implementing user geometries.
A list of geometries supported by Embree can be found at [10]. Embree uses the Möller-
Trumbore method [11] for finding intersections between triangles and rays.
User geometries are created in Embree by providing a user created intersection and
occlusion function. The functions are passed to Embree using the
rtcNewUserGeometry function which also accepts a pointer to the user data that is
needed for performing the intersection tests. The user function receives the ray or ray
packet and valid mask along with the user geometry and uses this information to fill the
ray with data. The data that is passed to the ray is the geomID of the object that has
been intersected, setting the tfar member of the ray to the intersection point and
setting the normal values of the ray.
Algebraic surfaces such as ellipsoids and cylinders can be formed into a polynomial
function when combined with the equation of a ray. Surfaces such as spheres which
form quadratic equations when represented as polynomial functions can be solved using
the quadratic equation. Symbolic methods can also be used for finding roots of third and
fourth degree polynomials however above that numerical methods need to be applied.
An early attempt, mentioned by Hart, using Descartes rule of signs was implemented in
[15] Interval analysis which is a numerical method for finding roots within error bounds
was used by Mitchell [16]. LG-surfaces proposed by Karla and Barr [17] uses the
Lipschitz constants of the implicit function that guarantees finding the roots of the
polynomial function where L gives the means to find the regions where a function f is
guaranteed to not intersect the surface and G gives the directional derivative along the
ray. Sphere tracing described by Hart in [18] marches along the ray towards the implicit
surface in steps small enough that they are guaranteed not to miss the surface.
More recently real-time results were achieved using a marching algorithm by Singh and
Narayanan. [19]. The method used can render algebraic implicit surface up to an order
of 50 as well as non-algebraic surfaces at real-time rates on a GPU. Knoll et al. [20]
11
presented an algorithm based on interval arithmetic and affine transformation that
reached interactive frame rates on both a CPU and GPU implementations using SIMD
computation.
Initial methods for determining ray intersections with parametric surfaces were first
presented by Kajiya [21] and Toth [22] Both employed different techniques based on
iterative Newtonian methods in order to find the roots of the polynomial equations, with
the former solving the problem of finding the initial values using direct algebraic
methods and the latter using interval analysis. In [23] Sweeny and Bartels presented a
method for b-spline surfaces that preprocessed the surface into a structure of nested
bounding boxes in order to find initial estimates. Martin et al. [24] presented a similar,
faster iterative method, also using bounding boxes for initial estimates that focused on
simplifying previous methods and presenting the method in a way more suitable for
implementation. A further optimization is presented in [25] where only the derivatives
of the rational function are calculated instead of both the non-rational and rational
derivatives as in Martin’s method.
As for subdivision methods for ray tracing parametric surfaces; a method for
intersections with rational Bézier surfaces based on a technique called Bézier clipping
was first introduces by Nishita, Sederbergt and Kakimoto in [26]. The technique takes
advantage of the convex hull properties of Bézier and NURBS surfaces in order to
iteratively remove sections that have not been intersected by the ray. Campagna,
Slusallek and Seidel in [27] presented an improved method based on Nishita’s algorithm.
The method however requires NURBS surfaces to first be converted into simpler Bézier
surfaces in order to work. In [28] the authors detail a problem with the original Bézier
clipping method where multiple equivalent intersections can occur causing redundant
computation and propose a method for reducing the probability of the phenomenon
occurring.
In [30] Abert, Geimer and Mulle take the recursive basis function for NURBS surfaces
and converts it into a non-recursive form optimized for SIMD computation and uses the
Newtonian method to find the roots. Several invariants in the surface evaluation are also
pre-calculated and stored in a cache. The method was able to achieve greater than one
frames per-second results on commodity hardware from 2006. Abert and Geimer also
developed a SIMD based implementation for Bézier surfaces in 2005 [31] which
achieved between 3 – 5 frames per-second for various models. In a master’s thesis from
2010 [32] another implementation based on Newtonian methods with bounding-
volumes for initial value finding running on CUDA compatible GPUs achieved between 3-
5 fps for various models, setups and techniques.
12
In [33] a third alternative for ray tracing parametric surfaces is presented. The method
uses the second order derivative of the surface in order to find the roots of the surface
equation with the ray. The method is able to better handle special cases than Newton’s
method, such as the case where the ray intersects the surface at two points, however
according the authors it takes longer to converge than Newton’s method.
3. Method
Initial research in this paper focuses on the various existing methods for representing
objects and geometries without the usage of polygonal meshes in computer graphics.
Specifically, the non-polygonal representations researched and implemented are ones
for which there exist methods to determine ray-object intersections but are traditionally
tessellated before rendering with rasterization. The geometry representations chosen
for implementation are also constrained by feasibility of completion within the given
time-frame of the thesis and the ability to implement with the tools used. The current or
future practical usage in real-world implementations is also a discriminating factor.
Methods for measuring common performance related quantities such as frames per
second and relative scalability are developed in tandem with the intersection
implementations so that performance can be measured and compared. Literature on the
subject is also evaluated in order to find other types of measurements that are
considered valuable given the context.
Embree is used for developing the respective implementations. The Embree API is used
in order to remove the need for implementing custom accelerated structures and
hardware parallelism. Embree has built in methods for ray tracing various geometry
representations including polygonal meshes but also allows for user defined geometry
representations that can use the accelerated structures and SIMD capabilities making it
ideal for this work. Using an open source API rather than creating custom
implementations also provides an openly available and known point of comparison.
4. Theory
4.1 Implicit Surfaces
An implicit surface is a mathematically defined surface construct defined in three
dimensions by an equation 𝑓(𝑥, 𝑦, 𝑧) = 0. In other words, an implicit surface is defined
by the set of points (𝑥, 𝑦, 𝑧) that satisfy the condition 𝑓(𝑥, 𝑦, 𝑧) = 0.
Implicit surfaces can be used to represent a wide variety of surfaces. For example a
plane as an implicit surface can be defined as 𝑥 + 𝑦 + 𝑧 = 0 and a sphere with radius
r can be defined as 𝑥 2 + 𝑦 2 + 𝑧 2 − 𝑟 2 = 0. Other examples include ellipsoids, cones, tori
and hyperboloids.
13
Figure 2 Implicit surfaces plotted with MATLAB’s isosurface function.
Implicit surface geometries can be bounded, i.e. occupying a finite amount of space such
as in the case of a sphere, or unbounded such as the case of the plane where the surface
extends into infinite space. The degree if an implicit surface is determined by the highest
order variable in the defining polynomial, e.g. a sphere is a second degree or quadratic
implicit surface since the highest degree for a single variable is two, and a torus is a
fourth degree or quartic implicit surface since the highest degree for a single variable is
four. Generally it is only possible to analytically solve implicit surfaces up to the fourth
degree, beyond that alternative methods need to be applied [2].
𝑥(𝑡) = cos(𝑡)
𝑆(𝑡) = { (4.1)
𝑦(𝑡) = sin(𝑡)
Inserting a value for 𝑡 within the defined parametric interval 0 ≤ 𝑡 ≤ 1 will produce a
point (𝑥, 𝑦) on the circle.
𝑥2 + 𝑦2 + 𝑧2 = 𝑟 (4.2)
is defined parametrically as
A set of two parameter values (𝑢, 𝑣) would then produce a point (𝑥, 𝑦, 𝑧) on the sphere.
In short, parametric surfaces are defined by taking the function of the surface and
deriving a form of the function where each of the variables is defined by a couple of
parameter variables (𝑢, 𝑣) which for any value within the defined interval produce a
point on the surface.
(4.4)
14
𝑥(𝑢, 𝑣)
𝑆(𝑢, 𝑣) = {𝑦(𝑢, 𝑣) 𝑢 ∈ [𝑢𝑚𝑖𝑛 , 𝑢𝑚𝑎𝑥 ], 𝑣 ∈ [𝑣𝑚𝑖𝑛 , 𝑣𝑚𝑎𝑥 ]
𝑧(𝑢, 𝑣)
4.3 Continuity
Continuity describes the differentiability or “smoothness” of a set of curves or surfaces.
Continuity is written as 𝐶 𝑛 where 𝑛 is the amount of times that a set of curves are
differentiable at a point. Some examples of levels of continuity including their geometric
interpretation are.
𝐶 −1 : The curves are discontinuous, i.e. the curves don’t connect to each other.
𝐶 0 : The curves connect but the tangents differ at the points where they connect.
𝐶 1 : There is a difference in acceleration at the points where the curves connect.
𝐶 2 : The point at which the curves connect have the same acceleration.
0≤𝑡≤1
Where 𝑛 is the degree of the curve, 𝑃𝑖 are a set of two or three-dimensional control
points, where the amount of control points is one more than the degree of the curve that
they shape. 𝐵𝑖,𝑛 are the blending functions of the curve defined as
15
𝑛!
𝐵(𝑡)𝑖,𝑛 = 𝑡 𝑖 (1 − 𝑡)𝑛−𝑖 𝑖 = 0, … , 𝑛 (4.6)
𝑛! (𝑛 − 𝑖)!
As an example, given two control points, 𝑃0 and 𝑃1 , defining a linear or first degree
curve, equation 4.5 expands to
If the control points are 𝑃0 , 𝑃1 , 𝑃2 , 𝑃3 then the curve is cubic and is given by
As can be seen a change in any of the control points results in a change of the entire
curve; single Bézier curves are thus limited in the shapes that they can represent which
means that a larger set of connected curves is required to represent closed shapes such
as ellipsoids. However, limits also exist in the extent to which Bézier curves can be
connected.
Equation 4.9 entails that in order to achieve 𝐶 1 continuity between two connected
Bézier curves the ratio of the distance between the last two control points of the first
curve and the distance between the two control points of the second curve must be 𝑛/𝑚.
Where 𝑚 is the degree of the first curve and 𝑛 is the degree of the second. Achieving 𝐶 2
continuity or above is not possible for Bézier curves, although control points can always
be positioned such that they at least appear to be continuous to a larger extent.
16
4.4.3 De Casteljau’s Algorithm
Finding a point on a Bézier curve given a parameter value 𝑡 can be achieved by simply
inserting 𝑡 and doing the calculations as per the definition of Bézier curves. The problem
with this method in terms of computation is that it is not very numerically stable. A
more stable approach is given by De Casteljau’s algorithm [34].
The algorithm recursively linearly interpolates a point between each set of two adjacent
control points until the desired value is found. Linearly interpolating a point 𝑡 ∈ [0,1]
between two points 𝑃0 and 𝑃1 is done by
𝑃(𝑡) = 𝑃0 (1 − 𝑡) + 𝑃1 𝑡 (4.10)
The point 𝑃(𝑡) is then the point at parameter value 𝑡 along the line 𝑃0 to 𝑃1 . The way that
De Casteljau’s Algorithm works is that given a set of points 𝑃0 , 𝑃1 … 𝑃𝑛 . Linearly
interpolate a new point at 𝑡 along each successive pair of points 𝑃0 to 𝑃1 ... 𝑃𝑛−1 to 𝑃𝑛 .
This gives a new set of n points. Continue the algorithm with the new set of points which
gives another set of 𝑛 – 1 points. If done 𝑛 times, where 𝑛 is the degree of the curve, the
result is a single point. This point is the point 𝑡 along the original curve.
𝑖 = 0, 1, … , 𝑛
𝑃𝑖,𝑗 = (1 − 𝑡)𝑃𝑖−1,𝑗 + 𝑢𝑃𝑖−1,𝑗+1 { (4.11)
𝑗 = 0,1, … , 𝑛 − 𝑖
Besides finding a point 𝑃(𝑡) on the curve, the algorithm can also be used to divide the
original curve at a point at 𝑡. The algorithm is simply applied at the split point 𝑡 until two
new sets of control points can be retrieved which respectively describe both halves of
the original curve.
An important note to make is that even though the convex hull property of the curve is
always maintained when subdividing; the new control points will become closer and
closer to the curve as higher levels of subdivision are applied, allowing for the creation
of a successively tighter convex hull or control net.
Or more compactly as
𝑛 𝑚
17
Like Bézier curves, Bézier surfaces are shaped by manipulating the associated control
points and don’t generally intersect with the individual control points except for those
situated on the corners of the surface. The surfaces are also contained within the convex
hull formed by the control points. The same rules regarding continuity also apply. De
Casteljau’s algorithm can be applied to a Bézier surface by applying it to each set of
control points in both u and v direction.
𝑛 𝑚
𝑛 𝑚
The normal at the point (𝑢, 𝑣) can be found by normalizing the cross product of the two
partial derivatives.
18
[𝑈] = [𝑢𝑛 𝑢𝑛−1 … 1]
(4.17)
[𝑉] = [𝑣 𝑛 𝑣 𝑛−1 … 1]
−1 3 −3 1
[𝑁] = [ 3 −6 3 0] (4.18)
−3 3 0 0
1 0 0 0
The matrix [𝐶𝑃] contains the respective control points in both directions, making [𝐶𝑃] a
(𝑛 + 1) by (𝑛 + 1) matrix of three-dimensional vectors.
The partial derivatives in both u and v direction can easily be represented in matrix
form as
where
19
Besides the shape being determined by the set of control points and the degree, b-
splines have an associated knot vector which defines the parameter intervals in which
the control points provide local support and at which points the splines of a curve
connect. Mathematically a b-spline curve is defined similarly to a Bezier curve by the
polynomial
Where 𝑃𝑖 are the control points, 𝑛 is the number of control points and 𝐵𝑖 (𝑡) is the basis-
function with respect to the parameter 𝑡.
Each of the individual control points can also have an associated weight 𝑤𝑖 . The weights
can be increase in order for the associated control point to have an increased effect on
the curve or decreased in order to reduce the effect on the curve. A b-spline with
weights is defined by the polynomial
∑𝑛𝑖=0 𝑤𝑖 𝑃𝑖 𝐵𝑖 (𝑡)
𝑓(𝑡) = (4.23)
∑𝑛𝑖=0 𝑤𝑖 𝐵𝑖 (𝑡)
A b-spline curve defined in such a way is called a rational curve given the rational form
of the defining polynomial. In the case where all weights are equal to one, equation 4.23
reduces to equation 4.22, making the non-rational case a special case of the rational one.
The weights of a b-spline curve must never be negative since that could break the
convex hull property of the curve. By multiplying the control points by the weights and
appending the weights to the ends of each control point a more compact form of
equation 4.23 is
Where 𝑃𝑖ℎ denotes that the control points have had their weights appended and
multiplied.
1 𝑖𝑓 𝑢𝑖 ≤ 𝑢 ≤ 𝑢𝑖+1
𝐵𝑖,0 (𝑢) = { (4.25)
0 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
20
𝑢 − 𝑢𝑖 𝑢𝑖+𝑝+1 − 𝑢
𝐵𝑖,𝑝 (𝑢) = 𝐵𝑖,𝑝−1 + 𝐵 (𝑢) (4.26)
𝑢𝑖+𝑝 − 𝑢𝑖 𝑢𝑖+𝑝+1 − 𝑢𝑖+1 𝑖+1,𝑝−1
Here 𝑝 is the degree of the curve and 𝑖 is given by equation 4.23. As can be seen the
basis-functions are evaluated recursively starting from degree 𝑝 down to the 0th degree,
meaning that for an increased degree the depth of the recursive formula increases and
for an increase in control points the amount of basis-functions evaluated increases. The
0
convention 0 = 0 is used. 𝑢𝑖 is the 𝑖 𝑡ℎ element of the curve’s knot vector.
Each of the elements in the knot vector determine the points in the parameter domain at
which the splines of the b-spline curve connect to each other or ‘tie-together’. The
minimum and maximum elements in a knot vector can be any number, including
negative ones, i.e. the magnitude of the vector does not matter. However, it is common to
normalize the elements such that they are between zero and one. The elements in the
knot vector must be set in a non-decreasing order, that is, 𝑢𝑖 ≤ 𝑢𝑖+1 for all elements in
the knot vector meaning that [1 2 3 4 5 6] is a valid knot vector whereas [6 5 4 3 2 1] is
not.
Each successive pair of elements in the knot vector 𝑢𝑖 … 𝑢𝑖+1 are referred to as a knot
span and represent a particular parameter interval. The only valid knot spans are within
the parameter domain 𝑢𝑚𝑖𝑛 ≤ 𝑢 < 𝑢𝑚𝑎𝑥 with 𝑢𝑚𝑖𝑛 = 𝑢𝑝 where 𝑝 is the degree of the
curve and 𝑢𝑚𝑎𝑥 = 𝑢𝑛 where 𝑛 is the number of control points. Any parameter value
outside of this span is not defined for the curve.
A non-empty knot span is any span where 𝑢𝑖 ≠ 𝑢𝑖+1. Such spans define a single
polynomial within the spline and therefore have 𝐶 𝑝 continuity within that span, where 𝑝
is the degree of the curve. Increasing the number of equal knots in a knot span to some
number produces an empty knot span and increases that span’s multiplicity. Increasing
multiplicity decreases the differentiability or continuity of the basis-functions defined by
the span to 𝐶 𝑝−𝑘 where 𝑝 is the degree and 𝑘 is the multiplicity. As an example, take a
cubic curve with four control points and multiplicity k = 4
𝑘=4 𝑘=4
𝑢 ⏞0 0 0 ⏞
⃗ = [0 1 1 1 1] (4.28)
The valid knot span of this curve is 𝑢3 ≤ 𝑢 < 𝑢4 . So 𝑢 is defined between 0 and 1 with a
single polynomial curve of 𝐶 𝑝 continuity. In fact, if setting all of the weights to one for
this particular curve, it is geometrically equivalent to that of a Bézier curve.
21
If we instead take a cubic curve with five control points and the knot vector
Then we have a spline defined by two polynomial curves the first of which is defined
within the parameter 𝑢1 ∈ [0, 0.5] and the second 𝑢2 ∈ [0.5, 1]. The two polynomial
curves meet at the point 𝑢 = 0.5 and have 𝐶 𝑝−𝑘 = 𝐶 2 continuity at that point.
Increasing the multiplicity of the middle knot element would decrease the continuity at
that point but would also necessitate increasing either the degree (which would
consequently make the continuity unaltered) or increasing the number of control points.
Increasing the degree may change the shape of the curve so it is necessary to increase
the amount of control points if maintaining the original form is required. The process of
inserting new knots into a knot vector while maintaining the same original shape is
called knot insertion, described in more detail in section 4.4.5
Knot vectors can be classified into three different types: uniform, open uniform and non-
uniform.
Uniform knot vectors are knot vectors for which the spacing between each of the
elements is equal. An example would be [0 1 2 3 4 5 6] or [0 0.1 0.2 0.3 0.4 0.5 0.6].
Uniform knot vectors can be used to describe closed shapes by making the endpoints
coincide.
An open uniform knot vector has equidistant knots except for the beginning and ending
knots which have multiplicity 𝑝 + 1 where 𝑝 is the degree of the curve. An example of
an open uniform curve would be [0 0 0 0 0.25 0.5 0.75 1 1 1 1] for a cubic spline with
seven control points.
Lastly, non-uniform knot vectors are the most general form of knot vectors where the
only constraint is that the elements of the knot vector are in ascending order. Both
uniform and open uniform knot vectors are special cases of non-uniform knot vectors.
22
𝑛
𝑢
⃗ = [𝑢0 𝑢1 … 𝑢𝑙 𝑢𝑙+1 … ] (4.31)
Inserting a new knot between 𝑢𝑙 and 𝑢𝑙+1 gives the new curve
𝑛+1
𝑄𝑖 = (1 − 𝛼𝑖 )𝑃𝑖−1 + 𝛼𝑖 𝑃𝑖 (4.33)
1 𝑖 ≤𝑙−𝑘+1
0 𝑖 ≥ 𝑙+1
𝛼𝑖 = { 𝑢̅ − 𝑢𝑖 (4.34)
𝑙−𝑘+2≤𝑖 ≤𝑙
𝑢𝑙+𝑘−1 − 𝑢𝑖
where 𝑢̅ is the new knot and 𝑘 is the order of the curve, i.e. the degree of the curve plus
one. Although the new curve has a different knot vector and different set of control
points, it is geometrically equivalent to its previous shape. The algorithm for inserting
one knot at a time is known as Boehm’s algorithm [36]. A more general form that
handles the insertion of multiple knots simultaneously is known as the Oslo algorithm
[37], [38] which is briefly detailed here.
𝑢
⃗ = [𝑢0 𝑢1 … 𝑢𝑛+𝑝+1 ] (4.35)
Inserting an arbitrary amount of new knots will give the new knot vector
Where 𝑚 > 𝑛 and corresponds to the number of inserted knots. The new curve is then
defined by
23
𝑚
Since the new curve is geometrically equivalent to the previous one, the new control
points can be computed based on the identity 𝑓(𝑡) = 𝑔(𝑡) where 𝑓(𝑡) is the old curve
defined by the knot vector 𝑢
⃗ . With this in mind; the new control points are derived using
the equations.
𝑚
𝑝
𝑄𝑗ℎ = ∑ 𝑃𝑗ℎ 𝛼𝑖,𝑗 (4.38)
𝑖=0
0 1 𝑢𝑖 ≤ 𝑣𝑗 ≤ 𝑢𝑖+1
𝛼𝑖,𝑗 ={ (4.39)
0 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
∑𝑛𝑖=0 ∑𝑚
𝑗=0 𝐵𝑖,𝑝 (𝑢)𝐵𝑗,𝑞 (𝑣)𝑃𝑖,𝑗 𝑤𝑖,𝑗
𝑆(𝑢, 𝑣) = (4.41)
∑𝑛𝑖=0 ∑𝑚
𝑗=0 𝐵𝑖,𝑝 (𝑢)𝐵𝑗,𝑞 (𝑣)𝑤𝑖,𝑗
Where p, and q are the respective degrees of the curves. Each curve in the construction
has a separate knot vector and can therefore each have a separate number of control
points. A more succinct form of the equation where the weights have been appended to
and multiplied with the control points is
𝑛 𝑚
ℎ
𝑆(𝑢, 𝑣) = ∑ ∑ 𝐵𝑖,𝑝 (𝑢)𝐵𝑗,𝑞 (𝑣)𝑃𝑖,𝑗 (4.42)
𝑖=0 𝑗=0
All of the properties associated with NURBS curves, such as the convex hull property and
local support, are carried over to the surface form.
24
Figure 5 A cubic NURBS surface with 20 control points, 5 in
v’s direction and 4 in u’s direction.
∑𝑛𝑖=0 ∑𝑚
𝑗=0 𝐵𝑖,𝑝 (𝑢)𝐵𝑗,𝑞 (𝑣)𝑃𝑖,𝑗 𝑤𝑖,𝑗 𝑁(𝑢, 𝑣)
𝑆(𝑢, 𝑣) = 𝑛 𝑚 = (4.43)
∑𝑖=0 ∑𝑗=0 𝐵𝑖,𝑝 (𝑢)𝐵𝑗,𝑞 (𝑣)𝑤𝑖,𝑗 𝐷(𝑢, 𝑣)
Which are the partial derivatives of a NURBS surface in both u and v direction where
𝑛 𝑚
𝑛 𝑚
25
𝑛 𝑚
𝑛 𝑚
𝑝 𝑝
𝐵´𝑖,𝑝 (𝑡) = 𝐵𝑖,𝑝−1 (𝑡) − 𝐵 (𝑡)
𝑡𝑖+𝑝 − 𝑡𝑖 𝑡𝑖+𝑝+1 − 𝑡𝑖+1 𝑖+1,𝑝−1 (4.50)
Note that 𝐵𝑖,𝑝−1 (𝑡) and 𝐵𝑖+1,𝑝−1 (𝑡) in equation 4.50 refer to the original Cox De-Boor
recurrence and not the derivative, i.e. the recursion continues down the original basis
function where only the first layer is changed to the equation 4.50.
(𝑝 − 𝑐)2 − 𝑟 2 = 0 (5.1)
To determine a point of intersection the point 𝑝 is substituted with the ray equation
(𝑜 + 𝑡𝑑 − 𝑐)2 − 𝑟 2 = 0 (5.2)
𝑎𝑡 2 + 𝑏𝑡 + 𝑐 = 0
𝑎 = 𝑑2 (5.4)
𝑏 = 2𝑑(𝑜 − 𝑐)
𝑐 = (𝑜 − 𝑐)2 − 𝑟 2
26
the solution for the equation is
−𝑏 ± √𝑏 2 − 4𝑎𝑐 (5.5)
𝑡 =
2𝑎
Quadratic equations have zero, one or two roots which can be determined by the
discriminant
𝑏 2 − 4𝑎𝑐 (5.6)
If the value of the discriminant is less than zero then the ray does not intersect the
sphere, if the discriminant is equal to zero then the ray grazes the sphere at a single
point and if the discriminant is greater than zero then the ray goes through the sphere.
Two points of intersection can then be determined from the equation; one where the ray
enters the sphere and one where it exits. The point nearest the camera becomes the
point used for determining the color of the corresponding pixel on the screen.
where 𝑞 = (𝑥, 𝑦, 𝑧) is a point on the curve. Substituting the point with the ray equation
𝑞 = 𝑜 + 𝑡𝑑, the cylinder equation can be transformed into the quadric form 𝑎𝑡 2 + 𝑏𝑡 +
𝑐 = 0 where
𝑎 = (𝑑 − (𝑑 ∙ 𝑑𝑐 ) ∙ 𝑑𝑐 )2
−𝑏 ± √𝑏 2 − 4𝑎𝑐
𝑡 = (5.9)
2𝑎
𝑏 2 − 4𝑎𝑐 ≥ 0 (5.10)
entails that the ray intersects the cylinder. In order to have a finite cylinder the ray is
tested to see if it between the two planes determined by the line around which the
27
cylinder revolves by
𝑑𝑐 ∙ (𝑞 − 𝑜𝑐 ) > 0
(5.11)
𝑑𝑐 ∙ (𝑞 − 𝑐) < 0
Where 𝑞 is the ray equation with 𝑡 set to the point of intersection and 𝑐 is the line that
the cylinder revolves around with 𝑡𝑐 set to the length of the cylinder.
Rendering a closed cylinder requires testing for intersection with the planes that bound
the cylinder. The equations for planes contained within the cylinder’s radius 𝑟 are given
by
𝑑𝑐 ∙ (𝑞 − 𝑜𝑐 ) = 0 𝑎𝑛𝑑 (𝑞 − 𝑜𝑐 )2 < 𝑟 2
(5.12)
2 2
𝑑𝑐 ∙ (𝑞 − 𝑐) = 0 𝑎𝑛𝑑 (𝑞 − 𝑐) < 𝑟
Finding the final point of intersection is done by retrieving the non-negative results from
equation 5.9 and the non-negative results from equations 5.12 The smallest of these
values then determines the point of intersection [40].
𝑓(𝑥0 )
𝑥1 = 𝑥0 − (6.1)
𝑓´(𝑥0 )
Where 𝑥0 is the initial guess and 𝑥1 is a new value for 𝑥 closer to the root given that the
iteration converged towards the root.
28
𝑓(𝑥𝑖 )
𝑥𝑖+1 = 𝑥𝑖 − (6.2)
𝑓´(𝑥𝑖 )
With a good initial guess the rate of convergence is quadratic. However, a bad guess or
local extrema may cause the iteration to diverge from the root.
𝑟 = 𝑜 + 𝑡𝑑 (6.3)
Using a technique developed by Kajiya [21] for the Newtonian method, the ray is instead
represented as the intersection of two orthogonal planes each represented as 𝑝1 =
(𝑛1 , 𝑑1 ) and 𝑝2 = (𝑛2 , 𝑑2 ) where 𝑛1 and 𝑛2 are two unit length vectors that are
orthogonal to each other and 𝑑1 , 𝑑2 represent the distance of each plane to the origin.
The original ray representation can be transformed into the two planes by applying the
following
Applying the above rule causes 𝑛1 to always be perpendicular to the ray direction 𝑑. 𝑛2
can then simply be retrieved by
𝑛2 = 𝑛1 × 𝑑 (6.5)
𝑑1 = −𝑛1 ∙ 𝑜
(6.6)
𝑑2 = −𝑛2 ∙ 𝑜
𝑛 ∙ 𝑆(𝑢, 𝑣) + 𝑑1
𝑅(𝑢, 𝑣) = { 1 (6.7)
𝑛2 ∙ 𝑆(𝑢, 𝑣) + 𝑑2
Where 𝑆(𝑢, 𝑣) is the point given by evaluating either the NURBS surface or the Bézier
surface at a point in terms of 𝑢 and 𝑣. In geometric terms |𝑅(𝑢, 𝑣)| can be interpreted as
the distance between the ray and the evaluated surface point 𝑆(𝑢, 𝑣). Since the
29
Newtonian method iteratively approximates the roots, an intersection is reported once
the distance comes under a certain user defined threshold
The iteration is stopped whenever any new u, v values given from the iteration results in
a point further away from the ray
Finding the new 𝑢, 𝑣 values given the old ones is done by applying
𝑢𝑖+1 𝑢𝑖
( ) = ( ) − 𝐽−1 𝑅(𝑢, 𝑣) (6.10)
𝑣𝑖+1 𝑣𝑖
𝑛 ∙ 𝑆 (𝑢, 𝑣) 𝑛1 ∙ 𝑆𝑣 (𝑢, 𝑣)
𝐽=( 1 𝑢 ) (6.11)
𝑛2 ∙ 𝑆𝑢 (𝑢, 𝑣) 𝑛2 ∙ 𝑆𝑣 (𝑢, 𝑣)
and
1 𝑛 ∙ 𝑆 (𝑢, 𝑣) −𝑛1 ∙ 𝑆𝑣 (𝑢, 𝑣)
𝐽−1 = ( 2 𝑣 ) (6.12)
det(𝐽) −𝑛2 ∙ 𝑆𝑢 (𝑢, 𝑣) 𝑛1 ∙ 𝑆𝑢 (𝑢, 𝑣)
det(𝐽) = (𝑛1 ∙ 𝑆𝑢 (𝑢, 𝑣)) ∗ (𝑛2 ∙ 𝑆𝑣 (𝑢, 𝑣)) − (𝑛2 ∙ 𝑆𝑢 (𝑢, 𝑣)) ∗ (𝑛1 ∙ 𝑆𝑣 (𝑢, 𝑣)) (6.13)
Equation 6.10 is as can been seen analogous to the Newton Rhapson formula (equation
6.2)
A problem, noted in [24], may arise if the determinant det(𝐽) is singular, i.e. det(J) ≈ 0.
Such a determinant may make it difficult to determine the inverse Jacobian. The
problem can be circumvented by performing a small random perturbation of the 𝑢, 𝑣
values after checking if the Jacobian is near zero and initiating the next iteration if it is.
30
Where 𝑢0 and 𝑣0 are the initial guesses and 𝑟𝑎𝑛𝑑(0,1) produces a random value
between 0 and 1.
Since the final point retrieved is an approximation and most likely does not lie on the
ray’s path a projection of the point onto the ray is performed using
𝑡 = (𝑆(𝑢, 𝑣) − 𝑜) ∙ 𝑑 (6.15)
As previously mentioned, [𝑈] and [𝑉] are [𝑢𝑛 𝑢𝑛−1 … 1] and [𝑣 𝑛 𝑣 𝑛−1 … 1] respectively,
[𝑁] contains the polynomial coefficients and [𝐶𝑃] is a 4x4 matrix of three-dimensional
vectors.
Since the product [𝑁][𝐶𝑃][𝑁]𝑇 only ever changes when the control points change, it can
be computed during the preprocessing step and stored in a separate matrix. It’s also
worth noting that as long as [𝑈] and [𝑉] are represented in the order described where
the elements start from 𝑢𝑛 to 1 then [𝑁] is equal to its transpose [𝑁]𝑇 and therefore the
transpose need not be calculated.
The surface evaluation begins with the calculation of [𝑈], [𝑉], [𝑈´], and [𝑉´]. After which
the products [𝑈][𝑁][𝐶𝑃][𝑁]𝑇 and [𝑈´][𝑁][𝐶𝑃][𝑁]𝑇 are determined. The calculation then
continues for [𝑈][𝑁][𝐶𝑃][𝑁]𝑇 [𝑉], [𝑈´][𝑁][𝐶𝑃][𝑁]𝑇 [𝑉] and [𝑈][𝑁][𝐶𝑃][𝑁]𝑇 [𝑉´]
Doing the calculation of both the point on the surface as well as the partial derivatives in
the same scope allows for the product [𝑈][𝑁][𝐶𝑃][𝑁]𝑇 to be pre-calculated and used for
the calculation of [𝑈][𝑁][𝐶𝑃][𝑁]𝑇 [𝑉] and [𝑈][𝑁][𝐶𝑃][𝑁]𝑇 [𝑉´] considering the products
that they have in common which decreases the amount of calculations needed compared
with doing the surface evaluation for the point and the derivatives separately.
The results of these calculations are three three-dimensional vectors representing the
point of the surface corresponding to the given 𝑢, 𝑣 values and the partial derivatives in
𝑢 and 𝑣 direction respectively.
31
𝑃(𝑡) = 𝑃0 (1 − 𝑡) + 𝑃1 𝑡 (6.17)
Given the original four control points 𝑃0 , 𝑃1 , 𝑃2 , 𝑃3 and parameter t = 0.5. A new set of
linearly interpolated points 𝑄0 , 𝑄1 , 𝑄2 is given by
𝑄0 = (𝑃0 + 𝑃1 ) ∗ 0.5
𝑄1 = (𝑃1 + 𝑃2 ) ∗ 0.5 (6.19)
𝑄2 = (𝑃2 + 𝑃3 ) ∗ 0.5
Finally, these points along with the original points are used to construct the final set of
2n + 1 points 𝑅0 , 𝑅1 , 𝑅2 , 𝑅3 , 𝑅4 , 𝑅5 , 𝑅6 by
𝑅0 = 𝑃0
𝑅1 = 𝑄0
𝑅2 = (𝑄0 + 𝑄1 ) ∗ 0.5
𝑅4 = (𝑄1 + 𝑄2 ) ∗ 0.5 (6.20)
𝑅3 = (𝑅2 + 𝑅4 ) ∗ 0.5
𝑅5 = 𝑄2
𝑅6 = 𝑃3
The method basically measures the distance from the curve at parameter value 𝑡, to
where a point on the curve at 𝑡 would be if the curve were a straight line. Given the
control points 𝑃0 , 𝑃3 , a straight lined cubic curve is given by
2 1 1 2
𝑆(𝑡) = (1 − 𝑡)3 𝑃0 + 3(1 − 𝑡)2 𝑡 ( 𝑃0 + 𝑃3 ) + 3(1 + 𝑡)𝑡 2 ( 𝑃0 + 𝑃3 ) + 𝑡 3 𝑃3 (6.21)
3 3 3 3
32
2 1 1 2
which is simply a cubic curve with control points 𝑃0 , 3 𝑃0 + 3 𝑃3 , 3 𝑃0 + 3 𝑃3 , 𝑃3 that
describe a straight line between 𝑃0 , 𝑃3 as a function of 𝑡. Given the cubic curve 𝐵(𝑡), the
flatness of which is to be evaluated, the distance between the two curves at 𝑡 is given by
𝑎 = 3𝑃1 − 2𝑃0 − 𝑃3
(6.24)
𝑏 = 3𝑃2 − 𝑃0 − 2𝑃3
The maximum value of (1 − 𝑡)2 𝑡 2 can be shown to be 1/16 within the interval 0 ≤ 𝑡 ≤ 1,
and if 𝑎 = (𝑎𝑥 , 𝑎𝑦 , 𝑎𝑧 ), 𝑏 = (𝑏𝑥 , 𝑏𝑦 , 𝑏𝑧 ), then
2 2 2
‖(1 − 𝑡)𝑎 + 𝑡𝑏‖2 = ((1 − 𝑡)𝑎𝑥 + 𝑡𝑏𝑥 ) + ((1 − 𝑡)𝑎𝑦 + 𝑡𝑏𝑦 ) + ((1 − 𝑡)𝑎𝑧 + 𝑡𝑏𝑧 ) (6.25)
The maximum of each term is given by max(𝑎𝑥2 , 𝑏𝑥2 ), max(𝑎𝑦2 , 𝑏𝑦2 ) and max(𝑎𝑧2 , 𝑏𝑧2 ), which
finally gives an expression for the maximum distance of the distance function 𝑑 2 (𝑡)
within the parameter bounds 0 ≤ 𝑡 ≤ 1 as
1
(max(𝑎𝑥2 , 𝑏𝑥2 ) + max(𝑎𝑦2 , 𝑏𝑦2 ) + max(𝑎𝑧2 , 𝑏𝑧2 )) (6.26)
16
The expression can be used to stop subdividing a Bézier curve once the value of it is
below some user defined threshold.
When evaluating only the non-zero basis functions, equation 4.24 is rewritten as
33
for the valid knot span [𝑡𝑎 , 𝑡𝑎+1 ]. Or for a surface
𝑎 𝑏
ℎ
𝑆(𝑢, 𝑣) = ∑ ∑ 𝑃𝑖,𝑗 𝐵𝑖,𝑝 (𝑢)𝐵𝑗,𝑞 (𝑣) (6.28)
𝑖=𝑎−𝑝 𝑖=𝑏−𝑝
given the valid knot spans [𝑢𝑎 , 𝑢𝑎+1 ] and [𝑣𝑏 , 𝑣𝑏+1 ].
This scheme consequently requires that the valid span index 𝑎 is found, which can easily
be done given a parameter value 𝑡 by iterating through the knot vector until arriving at
the index 𝑎 for which 𝑡𝑎 ≤ 𝑡 < 𝑡𝑎+1 . If no such value exists, then 𝑡 lies outside of the
defined parameter interval.
The first step for precomputing the CDBItems is to rewrite the Cox De-Boor formula into
a form that allows for precomputation of invariant terms. The formula is rewritten as
Where
1
𝑐1 = (6.30)
𝑡𝑖+𝑝 − 𝑡𝑖
1
𝑐2 = − (6.31)
𝑡𝑖+𝑝−1 − 𝑡𝑖+1
These four values 𝑐𝑖 , 𝑖 ∈ [1,4] are the CDBItems computed during a preprocessing step.
The values are precomputed for each of the basis-functions 𝐵𝑖,𝑝 (𝑡) in both u and v
direction except for 𝐵𝑖,0 (𝑡) which is known to be equal to one for both u and v
considering the local support property of the surface.
34
At runtime a basis-function 𝐵𝑖,𝑝 (𝑡) can be calculated as long as the parameter 𝑡 as well as
the two lower degree basis functions 𝐵𝑖+1,𝑝−1 (𝑡) and 𝐵𝑖,𝑝−1 (𝑡) are known using the
CDBItems.
Since every basis-function depends on two lower non-zero basis functions, the lower
order basis-functions are calculated first starting from the 0th degree basis function
which is known to be equal to one. As an example, given a cubic NURBS surface with 16
control points and the knot vectors [0 0 0 0 1 1 1 1] in both directions, the basis-
functions in a single direction that need to be evaluated are 𝐵0,3 (𝑡), 𝐵1,3 (𝑡), 𝐵2,3 (𝑡) and
𝐵3,3 (𝑡). The non-zero dependencies of these basis functions can be represented with the
following triangular scheme.
Since 𝑡 is within the valid parameter interval, 𝐵3,0 (𝑡) is known to be equal to one. The
rest of the basis-functions are calculated using equation 6.32 with the CDBItems for each
basis function in the order
9 8 7 6
5 4 3
2 1
0
The lowest degree basis function, which is known to be equal to one is calculated first.
Second, the two higher degree basis functions that depend on the lowest basis function
are calculated from right to left, then the three higher degree basis functions are
calculated bases on the two lower ones, and so on, until the 𝑝 + 1 basis functions needed
for the surface evaluation are calculated. The same method works for the derivative
calculation as well.
35
6.7 NURBS Subdivision
The Oslo algorithm outlined in section 4.5.5 can be used to recursively subdivide a
NURBS curve or surface as per a method described by Peterson in [42] originally used
for tessellating NURBS.
Splitting a NURBS curve using the Oslo algorithm is done by adding new knots of the
same value in the knot vector, where the number of new knots inserted increases the
multiplicity of that value to the degree of the curve plus one. Then, using the Oslo
algorithm (equation 4.38), a new set of control points describing the exact same curve
can be determined.
To test the straightness of a NURBS curve a line is created between the first control
point and another control point that is significantly far from the first.
Once such a line is found it is stored in a separate vector 𝑇 = 𝑃𝑖 − 𝑃0 . Then the curves
are iterated over again where the flatness of each successive line is determined in
relation to 𝑇 by
‖(𝑃𝑖 − 𝑃0 ) × 𝑇‖ (6.34)
As long as the value is smaller than some user defined threshold for each successive pair
of control points the curve is considered to be flat enough.
Once the flatness of each curve in both directions of the surface has been determined a
final check is made to make sure that the control points are coplanar in order to assure
that the surface is not twisted.
36
A plane is created using the cross product of two adjacent edges of the surface
where 𝐶𝑖,𝑗 is a control point that is one of the corners on the surface.
The distance from the plane P to the fourth corner 𝐶1,1 is then given by
‖𝐶1,1 − 𝑃‖ (6.36)
If the distance is smaller than a user defined value 𝜀 then the surface is not significantly
twisted and subdivision can be halted if the curves are also reasonably flat.
7. Implementation
7.1 Implicit Surfaces
An implicit sphere and implicit cylinder are implemented according to the methods
described in sections 5.1 and 5.2. Both are created with the SIMD library in order to
increase performance.
The control points for the surface can either be defined manually by an array of sixteen
three-dimensional vectors or by loading a model containing the control points. Two
types of files are supported for loading the Bézier models. One is a custom file type
consisting of a list of vertices corresponding to each of the control points along with a
list of indices, with sixteen indices per line, where each line represents a patch in the
model and each index in the line corresponds to one of the vertices. The model is loaded
by first loading all of the vertices and then for each of the patches a new Bézier patch
model is instantiated along with a call for the surface to be preprocessed. Another type
of file is also supported, which is the Wavefront OBJ file type. The OBJ file type supports
the representation of rational and non-rational b-spline surfaces. The structure of a b-
spline OBJ file is similar to that of the custom file in that there is a list of vertices
corresponding to the control points along with lines of sixteen indices per line
corresponding to the patches. The file also contains the knot vectors and degrees of each
of the surfaces. These are ignored in the case of loading Bézier patches since all of the
knot vectors are pre-set to eight length vectors with a multiplicity of four for the first
and last knot vector and all of the degrees are set to three, which as was mentioned in
section 4.5.3, describes a cubic Bézier curve.
The bounds can easily be determined for each patch by iterating thought all of the
control points, finding the minimum and maximum x, y and z values and storing them in
37
Embree’s bounds_o type, which is used by Embree to determine the bound of the
bounding box.
𝑤𝑚 𝑤𝑚 𝑤𝑚 𝑤𝑚
𝑠𝑝𝑙𝑖𝑡(𝑢𝑚 , 𝑣𝑚 , 𝑤𝑚 , ℎ𝑚 ) = ((𝑢𝑚 − , 𝑣𝑚 , , ℎ𝑚 ) , (𝑢𝑚 + , 𝑣𝑚 , , ℎ ))
4 2 4 2 𝑚 (7.1)
= ((𝑢𝑙 , 𝑣𝑙 , 𝑤𝑙 , ℎ𝑙 ), (𝑢𝑟 , 𝑣𝑟 , 𝑤𝑟 , ℎ𝑟 ))
ℎ𝑚 ℎ𝑚 ℎ𝑚 ℎ𝑚
𝑠𝑝𝑙𝑖𝑡(𝑢𝑚 , 𝑣𝑚 , 𝑤𝑚 , ℎ𝑚 ) = ((𝑢𝑚 , 𝑣𝑚 − , 𝑤𝑚 , ) , (𝑢𝑚 , 𝑣𝑚 + , 𝑤𝑚 , ))
4 2 4 2 (7.2)
= ((𝑢𝑡 , 𝑣𝑡 , 𝑤𝑡 , ℎ𝑡 ), (𝑢𝑏 , 𝑣𝑏 , 𝑤𝑏 , ℎ𝑏 ))
float Bezier::flatness (vec3 P0, vec3 P1, vec3 P2, vec3 P3) {
return ux + uy + uz;
Code example 1 Calculating the flatness of a Bézier curve given four control points.
38
Load the control points, knot vectors and degrees either from user defined values
or from files containing several surfaces.
Instantiate one surface cache structure per thread used by the system for both 𝑢
and 𝑣 direction.
Calculate the Cox De-Boor items based on the knot vector.
Subdivide the surface until the maximum subdivision depth is reached or the new
surfaces are flat enough based on a user defined tolerance.
Generate the bounding volume hierarchy from the subdivided surface
The control points, knot vector and degrees can be manually chosen and placed in the
NURBS object structure for any degree and any number of control points in any
direction. It is also possible to load the Bézier patches from the custom file type
described in section 7.2.1 by setting the degrees to three and the knot vectors to
[0 0 0 0 1 1 1 1]. Alternatively, NURBS surfaces can be read from OBJ files.
Since the size of the elements of the surface cache for both the surface and the
derivatives is equal to the degree of the surface plus one they need to be dynamically
instantiated on the heap depending on the degree of the surface. Allocating an object on
the heap for each surface evaluation slows down performance so the surface caches are
allocated during the preprocessing step and stored in the NURBS object since the degree
is known beforehand. However, since the system uses multiple threads to perform the
intersection a surface cache needs to be created for each thread so that there are no
access conflicts and so as to avoid using critical sections which can also slow down
performance. Each thread has an identity number which is used as the index for
retrieving and storing values in the corresponding surface cache.
After allocating the surface caches, the Cox De-Boor items or CDBItems are determined
from the knot vector using the algorithm in code example 2.
The subdivision scheme used for NURBS surfaces is based on a method used for
tessellating NURBS surfaces found in [42] which can also be used for general
subdivision. Basically the method uses the Oslo Algorithm (outlined in section 4.5.5) in
order to split the surface along either direction. The algorithm starts by inserting
multiple knots at a parameter value in the middle of the knot vector so that it has
multiplicity 𝑘 = 𝑑𝑒𝑔𝑟𝑒𝑒 + 1. The new control points that described the exact same
39
surface are then determined using the Oslo algorithm. Once the new control points are
determined, the surface is simply split by creating two new sets of control points ending
and beginning at the control point determined by the knot span where the multiplicity
was set. The flatness is then determined by the method described in section 6.7.1.
The bounding volume is generated in the exact same way as for the Bézier surface. The
only difference is the generation of the initial guesses for the leaf nodes. The initial
guesses for the NURBS leaf nodes is done by dividing the sum of the first and last
elements of the valid knot span by two.
𝑢𝑝 + 𝑢𝑛
𝑢= (7.3)
2
where 𝑝 is the degree and 𝑛 is the number of control points in 𝑢’𝑠 direction. The value of
𝑣 remains the same as the divided surface when dividing in 𝑢’𝑠 direction and vice versa.
Since the computational expense of the parametric surfaces is high relative to traversing
the bounding volumes, a more novel traversal scheme is used. Instead of testing for
intersection for each leaf node that the ray intersects, all of the leaf nodes that the ray
intersects are stored in a cache. The cache is then sorted based on the distance from the
leaf node to the ray’s origin and the intersection tests are conducted starting with the
leaf node closest to the ray’s origin. If no intersection is found, or not all of the rays in
the packet have found an intersection, the intersection tests are continued down the leaf
node cache until either an intersection is found or there are no more leaf nodes.
Figure 6 A Bézier surface subdivided to three different levels using Embree's bounding volume generation.
The boxes are colored after the initial guesses contained in the leaf nodes, with red being in v's direction and
green in u's direction.
40
bounding volume hierarchy in order to generate the initial guesses needed for the
method to converge fast and reliably. The intersection testing uses the SIMD library
from Embree’s source code. This increases the performance of the intersection testing
since eight rays can be tested simultaneously. However the entire ray packet needs to be
taken through all of the iterations until the testing is done for each of the rays in the
packet.
For NURBS surfaces the parameter values are checked to see if they are within the valid
knot span and passed to the surface evaluator if they are. For packets of rays the initial
guesses in a single packet may lie in different knot spans, in those cases it is not possible
to perform the surface evaluation using SIMD. The values are checked to see if the values
are all within the same knot span. If they are then the packet is flagged as coherent and
is passed to the SIMD surface evaluator, if not then the packet is split and each element
in the packet is calculated one by one.
8. Results
8.1 Test System
All performance tests were performed with an Intel Core i7-5960X with 8 physical and 16
logical cores. For the performance evaluation, the system was compiled with Visual
Studio. For memory consumption a build using Xcode was used.
Table 4 Some of the measurements taken by the benchmarking system along with a description.
Name Description
Avg fps The average framerate over the specified amount of frames
Avg ms The average time in milliseconds to render a single frame
The number of rays with a positive intersection test calculated
Screen fill
as a percentage of the amount of primary rays.
41
Besides the values in table 4, the framerate is measured for each second that passes and
stored in the benchmarking tool.
All of the performance tests are measured at a resolution of 1920x1080 where the
camera revolves around the models for 60 second in order to gather accurate average
values and present the performance difference when the screen fill is larger or smaller.
Figure 7 The Teapot model tessellated with different levels of tolerance. From left to right the triangle count
is 9 793, 34 514, 299 035 and 2 839 344 triangles.
Figure 8 The Killaroo model tessellated with different tolerances from the original NURBS representation.
From left to right the triangle count is 91 902, 205 131, 529 127 and 4 611 833 triangles.
For the sphere and cylinder, the meshes are instead subdivided from an original mesh
representation and are shown in figures 9 and 10.
Figure 9 A mesh representation of the sphere. The model is subdivided from an originally course
representation in order to give models with different amounts of triangles. From left to right the triangle
count is 512, 1 984, 7 936 and 31 744 triangles.
42
Figure 10 A mesh representation of the cylinder with different amount of triangles. From left to right the
triangle count is 124, 744, 2 976 and 11 904 triangles.
Table 5 Memory consumption for meshes with a varying amount of triangles. The memory consumed by
polygonal models depends on the amount of triangles that constitute the model.
43
Figure 11 The implicit version of the sphere model. Figure 12 The implicit version of the cylinder
For Implicit surfaces there are no variables that model. As with the sphere there are no variables
alter the quality of the models, rather the model that need to be tuned in order to increase the
remains smooth and of good quality regardless of quality.
any variables.
8.4.2 Performance
Considering the symmetrical shape of the implicit surfaces evaluated, instead of
comparing single implicit surfaces with polygonal counterparts the models are arranged
in a 40 x 40 grid with a total of 1600 objects in each grid that the camera revolves
around while performance measurements are collected. Figures 13, 14 and 15 give the
results for spheres, cylinders and a mixture of both respectively.
35
30
25
Implicit Spheres
20
512 Triangles
15
10
5
0
0 10 20 30 40 50 60
Figure 13 A performance comparison of implicit spheres arranged in a 40 x 40 grid and polygonal meshes of
the same size also placed in a 40 x 40 grid. The polygonal meshes each consist of 512 triangles. The camera is
spun around the grid and the frames per-second are measured at set intervals.
44
Table 6 Additional information for the sphere grid test. The data presented in this table shows average values
accumulated throughout the duration of the performance test.
30
25
20 Implicit Cylinders
15 744 Triangles
10
5
0
0 10 20 30 40 50 60
Figure 14 The grid test performed on the implicit cylinders as well as polygonal approximations consisting of
744 triangles. The camera revolves around the grid while performance measurements are stored.
Table 7 Additional details of the grid test for cylinder models. The values are averages of the values computed
throughout the tests.
45
Mixed grid test
40
35
Frames per-second
30
25 Mix of implicit spheres and
20 cylinders
15 Mix of meshes with 512 and
744 triangles respectively
10
5
0
0 10 20 30 40 50 60
Figure 15 The grid test performed with a mix of implicit cylinders and spheres compared with a mix of
polygonal meshes of spheres and cylinders. The objects are placed one after another in a 40 x 40 grid. The
polygonal sphere consists of 512 triangles while the polygonal cylinder consists of 744 triangles.
Table 8 Additional details for the grid test conducted on a mix of the objects. The values are averages
determined from the values calculated throughout the test in Figure 15.
As can be seen, implicit surfaces generally outperform even the simplest polygonal
approximations of the same objects. The performance becomes relatively close when the
types are mixed as can be seen in table 8, however the meshes involved in each test
aren’t close to the same quality of the implicit surfaces.
46
Figure 16 The Bézier version of the teapot with Figure 17 The Bézier version of the Killaroo model
14 376 Leaf nodes. The subdivision tolerance and with 115 782 Leaf node. The subdivision depth and
maximum subdivision level is set to a level that tolerance is set to a level that produces acceptable
produces acceptable image quality while images quality while maintaining a low amount of
maintaining a low amount of bounding volumes. bounding volumes. .
The image quality and reduction of most artefacts heavily depends on the subdivision
depth and tolerance. Figure 18 shows the increase in image quality when increasing the
subdivision depth for the teapot.
Figure 18 The relative image quality of the Bézier version of the Teapot model with different levels of
subdivision. From left to right the amount of leaf nodes is 64, 256 and 4096
8.5.2 Performance
Figures 19 and 20 show the Bézier versions of the Teapot and Killaroo models compared
with tessellated versions of the same models. As can be seen the Bézier versions
generally perform slower than the polygonal meshes, even when a very large amount of
triangles are involved.
47
Bézier Teapot compared with triangles meshes
35
30
Frames per-second
25
34 514 Triangles
20
299 035 Triangles
15
2 839 344 Triangles
10 Bézier 14 376 Leaf nodes
5
0
0 10 20 30 40 50 60
Figure 19 Performance evaluation of the Bézier Teapot compared with polygonal meshes of the same model
tessellated with different tolerances in order to produce models with different amount of triangles.
Table 9 Additional information for the test represented in Figure 19. The values are averages calculated from
the values in Figure 19.
30
25 205 131 Triangles
48
Table 10 Additional information for the test presented in Figure 20. The values are averages calculated from
the values collected during the test.
Table 11 The memory consumption for the Bézier models with different levels of subdivision. For the teapot
the memory consumption is very low and does not scale greatly with an increase in subdivision depth. On the
other hand the more organic looking and curved Killaroo model produces many more leaf nodes given the
large number of patches in the model
8.6 NURBS
8.6.1 Image Quality
As with the Bézier surface versions, the quality of the NURBS versions of the models
generally depends on the subdivision depth and tolerance. Figures 21 and 22 show the
quality of the models at the subdivision and tolerance levels chosen for the performance
tests.
49
Figure 21 The NURBS version of the Teapot model Figure 22 The NURBS version of the Killaroo model.
with 16 384 bounding volumes. The subdivision The subdivision depth and tolerance is set to a level
depth and tolerance is set to a level that produces that produces acceptable image quality while
acceptable image quality. maintaining a low amount of bounding volumes.
The number of leaf nodes is 180 980
Figure 23 shows the increase in image quality for an increase in subdivision depth.
Figure 23 The image quality of the NURBS teapot relative to the amount of bounding boxes in the model. From
left to right the amount of leaf nodes is 64, 256 and 4096
8.6.2 Performance
Figures 24 and 25 show comparisons between NURBS models of the Teapot and Killaroo
along with tessellated versions of the same models. As can be seen there’s a larger
performance disparity between the NURBS and tessellated Killaroo models than for the
teapot model. The Killaroo model is very detailed with lots of curves and therefore
requires a much larger amount of bounding volumes. Each of the patches that make up
the model are also very large and complex, further decreasing the performance.
50
NURBS Teapot compared with triangles meshes
35
30
25
34 514 Triangles
20
299 035 Triangles
15
2 839 344 Triangles
10 NURBS 16 384 leaf nodes
5
0
0 10 20 30 40 50 60
Figure 24 A performance comparison of the NURBS Teapot compared with tessellated versions of the same
model. The camera rotates around each of the models during 60 seconds while the framerate is recorded
during set intervals.
Table 12 Additional information for the NURBS Teapot performance test in Figure 24. The values are averages
of the data calculated throughout the test.
Figure 25 The NURBS version of the Killaroo model compared with tessellated counterparts. The camera here
is also revolved around the models while framerate is computed at set intervals.
51
Table 13 Complementary information for the NURBS Killaroo performance test in Figure 25. The values are
averages computed from the values in Figure 13 and others.
Table 14 Memory consumption of the NURBS models at different levels of subdivision. As can be seen the
memory consumption for the NURBS Killaroo model remains fairly small even with large levels of subdivision
since only 89 patches are needed to represent the model
9. Discussion
For implicit surfaces, frame-rates are generally higher, even when compared with fairly
crude polygonal approximations. Implicit surfaces also maintain the same image quality
regardless of how close the camera is to the surface, while polygonal meshes do not.
Memory usage for implicit surfaces is also lower even when compared to fairly low
polygonal approximations. The image quality of implicit surfaces is generally very
robust and there are no variables that need to be fine-tuned for the image quality to be
good, at least for the types evaluated in this thesis. Clearly the frame-rates for the
implicit surfaces evaluated are within interactive frame-rates.
52
(figures 19 and 20) for both of the models evaluated, while NURBS are both much
slower and have a larger performance disparity for an increase in the complexity of the
model as can be seen in figures 24 and 25. The reason for the increased performance
disparity for NURBS surfaces is likely due to the fact that ray packets need to be split and
evaluated one by one when the rays are in different knot spans which is a likely
occurrence for large complex patches.
A major advantage that parametric surfaces have over their polygonal counterparts is
lower memory usage. The memory usage is not only generally lower for a number of leaf
nodes that produce acceptable quality, the memory usage also does not need to increase
for better image quality when the camera is close to the surface. Bézier and NURBS
surfaces maintain the same level of smoothness regardless of how close the camera is to
the surface, while polygonal meshes would need to increase the number of triangles
relative to how close the camera is to the model in order to have the same property.
Increasing the number of triangles would however also increase the amount of
information in the model which would increase memory usage.
Good image quality is achievable for Bézier surfaces even with a relatively low amount
of leaf nodes. Generally the teapot model can be represented without noticeable artifacts
for both NURBS and Bézier surfaces even with a fairly low amount of bounding volumes.
For the Killaroo model, good image quality without many artifacts can be achieved for
Bézier surfaces while the NURBS version has some artifacts that are not necessarily
removed when increasing the amount of leaf nodes. This is also likely due to the large
complex patches that constitute the model.
Of course one of the major advantages of ray tracing is the ability to simulate light and
add additional effects such as shadows, reflection and refraction. For NURBS surfaces it
is doubtful that interactive frame-rates can be maintained for complex objects with
additional effects, at least at the resolution evaluated.
As for Bézier surfaces, in a Master’s thesis from 2006 [44], a GPU implementation on a
GeForce 6800 achieved a maximum of two frames per second. In [31] a CPU
implementation was able to achieve around 6 frames per second for the teapot model.
As for other direct comparisons between parametric surfaces and polygonal meshes. In
a 2004 study [45] a comparative study between Bézier surface, Loop subdivision
surfaces and triangle meshes was performed with the Bézier surface having about 50%
of the speed of triangle meshes. No NURBS were tested in the study unfortunately, and
other studies directly comparing parametric surfaces with triangle meshes have not
been found.
53
10. Conclusions
A ray tracing system able to handle Bézier and NURBS surfaces as well as implicit
surfaces was created using Embree in order to determine the performance of each in
comparison with polygonal models representing the same objects. Performance close to
polygonal counterparts for Bézier surfaces was achieved, with NURBS being slower than
Bézier surfaces but still within interactive frame-rates.
The image quality for parametric surfaces can vary greatly depending on the subdivision
implementation. The parameters that determine the level of subdivision are dependent
on the model and need to be fine-tuned in order to optimize results. Having a level of
subdivision that is very high results in high image quality but lowers performance. It is
therefore crucial to have a good adaptive subdivision scheme for both surface types.
The ray tracing system was written with a SIMD library in order to allow for
computation of ray packets with eight rays per packet. The SIMD aspect of the
implementation gave a large increase in performance and allowed for the
representations implemented to reach interactive frame-rates, even at a high resolution
making them suitable for interactive ray tracing, at least without many additional
lighting effects.
The use of ray streams, where the size of the ray packet is decreased as rays in the packet
are determined to not intersect the surface would be of interest since ray packets need to
be taken through all of the newton iterations for parametric surfaces if even one is slow
to converge. Ray streams could also be used to fix the problem with incoherent ray
packets needing to be split for NURBS surface evaluation.
A study that implements additional lighting and other visual effects would also be of great
interest.
As for implicit surfaces, a more general algorithm such as the Sphere-tracing method [18]
or the method based on interval arithmetic by Knoll et.al. [46] would be of great interest.
Although the algebraic surfaces implemented in this paper were considerably faster than
identical triangle meshes, a more general algorithm would likely be slower whilst also
being more useful for modeling general implicit surfaces.
54
Bibliography
[1] K. Suffern, Ray Tracing from the Ground Up, A K Peters ltd, 2007.
[2] T. Akine-Möller, E. Haines and N. Hoffman, Real-Time Rendering, 3rd Edition ed., A
K Peters Ltd, 2008.
[3] T. Whitted, "An improved illumination model for shaded display," ACM SIGGRAPH
Computer Graphics, vol. 13, no. 2, 1979.
[4] B. T. Phong, "Illumination for computer generated images," Communications of the
ACM 18.6, 1975.
[5] J. F. Blinn, "Models of light reflection for computer synthesized pictures.," ACM
SIGGRAPH Computer Graphics., vol. 11, no. 2, 1977.
[6] A. Nguyen, "Diss. Implicit bounding volumes and bounding volume hierarchies,"
2006.
[7] A. L. dos Santos, T. Veronica and J. Lindoso, "Review and Comparative Study of Ray
Traversal Algorithms on a Modern GPU Architecture," in WSCG 2014: Full Papers
Proceedings:, 2014.
[8] I. Wald, et al., "Embree: A kernel framework for efficient cpu ray tracing.," ACM
Transactions on Graphics (TOG) 33.4, p. 143, 2014.
[9] I. Corporation, "Embree," 2016. [Online]. Available:
https://github.com/embree/embree. [Accessed 24 5 2016].
[10] Intel Corporation, "Embree API," 2016. [Online]. Available:
https://embree.github.io/api.html. [Accessed 24 5 2016].
[11] T. Möller and B. Trumbore, "Fast, minimum storage ray/triangle intersection.","
ACM SIGGRAPH 2005 Courses, 2005.
[12] Á. Balázs, M. Guthe and R. Klein, "Efficient trimmed NURBS tessellation," Journal of
WSCG , vol. 12, no. 1, pp. 27-33, 2004.
[13] M. Hall and J. Warren, "Adaptive polygonalization of implicitly defined surfaces,"
Computer Graphics and Applications, IEEE, vol. 10, no. 6, pp. 33-42, 1990.
[14] J. C. Hart, "Ray tracing implicit surfaces," Siggraph 93 Course Notes: Design,
Visualization and Animation of Implicit Surfaces, pp. 1-16, 1993.
[15] P. Hanrahan, "Ray Tracing Algebraic Sur- faces," SIGGRAPH 83, pp. 83-89, 1983.
[16] D. P. Mitchell, "Robust ray intersection with interval arithmetic.," Proceedings of
Graphics Interface, vol. 90, pp. 68-74, 1990.
[17] D. Kalra and A. H. Barr, "Guaranteed ray intersections with implicit surfaces," ACM
SIGGRAPH Computer Graphics, vol. 23, no. 3, 1989.
[18] J. C. Hart, "Sphere tracing: A geometric method for the antialiased ray tracing of
implicit surfaces," The Visual Computer, vol. 12, no. 10, pp. 527-545, 1996.
[19] J. M. Singh and P. J. Narayanan, "Real-time ray tracing of implicit surfaces on the
GPU," IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 2, pp.
261-272, 2010.
55
[20] Knoll, Aaron, et al., "Fast ray tracing of arbitrary implicit surfaces with interval and
affine arithmetic," Computer Graphics Forum, vol. 28, no. 1, pp. 26-40, 2009.
[21] J. T. Kajiya, "Ray tracing parametric patches," ACM, vol. 16, no. 3, pp. 245-256,
1982.
[22] D. L. Toth, "On ray tracing parametric surfaces," ACM SIGGRAPH Computer
Graphics, vol. 19, no. 3, pp. 171-179, 1985.
[23] M. A. Sweeney and R. H. Bartels, "Ray tracing free-form B-spline surfaces,"
Computer Graphics and Applications, IEEE, vol. 6, no. 2, pp. 41-49, 1986.
[24] Martin, William, et al., "Practical ray tracing of trimmed NURBS surfaces," Journal
of Graphics Tools, vol. 5, no. 1, pp. 27-52, 2000.
[25] A. Glazs and A. Sisojevs, "An Efficient Approach to Direct NURBS Surface Rendering
for Ray Tracing," The 19th International Conference on Computer Graphics, vol. 19,
no. Poster Proceedings, pp. 9-12, 2011.
[26] T. Nishita, W. T. Sederbergt and M. Kakimoto, "Ray Tracing Trimmed Rational
Surface Patches," ACM SIGGRAPH Computer Graphics , vol. 24, no. 4, pp. 337-345,
1990.
[27] S. Campagna, P. Slusallek and H.-P. Seidel, "Ray tracing of spline surfaces: Bézier
clipping, Chebyshev boxing, and bounding volume hierarchy–a critical comparison
with new results," The Visual Computer, vol. 13, no. 6, pp. 265-282, 1997.
[28] A. Efremov, H. Vlastimil and H.-P. Seidel, "Robust and numerically stable Bézier
clipping method for ray tracing NURBS surfaces," in Proceedings of the 21st spring
conference on Computer graphics., 2005.
[29] S.-W. Wang, S. Zen-Chung and C. Ruei-Chuan, "An efficient and stable ray tracing
algorithm for parametric surfaces," Journal of Information Science and Engineering,
vol. 18, no. 4, pp. 541-561, 2002.
[30] O. Abert, M. Geimer and S. Mulle, "Direct and fast ray tracing of NURBS surfaces,"
EEE Symposium on Interactive Ray Tracing 2006, pp. 161-168, 2006.
[31] M. Geimer and A. Oliver, "Interactive ray tracing of trimmed bicubic bézier
surfaces without triangulation," Institute for Computational Visualistics, 2005.
[32] Valkering, E. E. J, "Ray Tracing NURBS Surfaces using CUDA. Diss.," Delft University
of Technology, 2010.
[33] T. Park, J. Ji and K. H. Ko, "A second order geometric method for ray/parametric
surface intersection," Computer Aided Geometric Design, vol. 30, no. 8, pp. 795-804,
2013.
[34] W. Boehm and A. Müller, "On de Casteljau's algorithm.," Computer Aided Geometric
Design, vol. 16, no. 7, pp. 587-605, 1999.
[35] K. I. Joy, "A Matrix Representation of a Cubic Bézier Patch," 2000. [Online].
Available: http://www.idav.ucdavis.edu/education/CAGDNotes/Matrix-Cubic-
Bezier-Patch.pdf. [Accessed 24 5 2016].
[36] W. Boehm, "Inserting new knots into B-spline curves," Computer-Aided Design, vol.
12, no. 4, pp. 199-201, 1980.
[37] E. Cohen, T. Lyche and R. Riesenfeld, "Discrete B-splines and subdivision
techniques in computer-aided geometric design and computer graphics.,"
Computer graphics and image processing, vol. 14, no. 2, pp. 87-111, 1980.
56
[38] O. P. Abert, "Interactive ray tracing of NURBS surfaces by using SIMD instructions
and the GPU in parallel.," Diss. Nanyang Technological University, 2005.
[39] P. Shirley and S. Marschner, Fundementals of Computer Graphics, 3rd Edition ed.,
A K Peters Ltd, 2009.
[40] D. Zorin, "Advanced Rendering, Spring 2005, Lecture 2," 1999. [Online]. Available:
http://mrl.nyu.edu/~dzorin/rend05/lecture2.pdf. [Accessed 25 5 2016].
[41] J. Armstrong, "Recursive Subdivision," Technote TN-06-005, 2006.
[42] J. W. Peterson, "Tessellation of NURBS surfaces," in Graphics Gems IV, Boston,
Academic Press, 1994, p. 286–320.
[43] O. Abert, M. Bröcker and R. Spring, "Accelerating Rendering of NURBS Surfaces by
Using Hybrid Ray Tracing," 2008.
[44] J. Löw, "Ray Tracing Bézier Surfaces on GPU," 2006.
[45] C. Benthin, I. Wald and P. Slusallek, "Interactive ray tracing of free-form surfaces,"
in Proceedings of the 3rd international conference on Computer graphics, virtual
reality, visualisation and interaction in Africa, 2004.
[46] A. Knoll, et al., "Fast ray tracing of arbitrary implicit surfaces with interval and
affine arithmetic," Computer Graphics Forum, vol. 28, no. 1, 2009.
[47] S. G. Parker, et. al., "Optix: A general purpose ray tracing engine," ACM Transactions
on Graphics (TOG) 29.4, p. 66, 2010.
[48] D. Kozlov, "FireRays SDK - High Efficiency, High Performance," [Online]. Available:
http://developer.amd.com/tools-and-sdks/graphics-development/firepro-
sdk/firerays-sdk/. [Accessed 6 April 2016].
[49] M. Nunes, "Interactive Ray-Tracing-Coherent Rays.," University of Minho, Braga,
Protugal.
[50] S. Boulos, "Notes on efficient ray tracing.," ACM SIGGRAPH 2005 Courses, 2005.
[51] J. Amanatides and K. Choi, "Ray tracing triangular meshes.," Proceedings of the
Eighth Western Computer Graphics Symposium, pp. 43-52, 1997.
[52] M. Khan, Cubic Bezier Curve and Surface, 2012.
[53] T. DeRose, M. Kass and T. Truong, "Subdivision surfaces in character animation," in
Proceedings of the 25th annual conference on Computer graphics and interactive
techniques, 1998.
57