DH2323 Individual Project Advanced Ray Tracing

Didrik Nordström didrikn@kth.se August 13, 2013

Abstract

A basic light simulator was implemented for rendering 3D models using ray tracing, with the goal of producing realistic images with clearly visible optical phenomena such as indirect illumination, lens aberration and depth of field. An optical system consisting of a thin lens and a light sensor was modeled and tuned to act as a camera in the scene. Support for Lambertian and specular materials was implemented, as well as an area light source. For user convenience, progressive and multithreaded rendering was incorporated in the implementation. The resulting images were noisy but had more realistic lighting than images produced with simpler rendering engines. It was concluded that light simulation using ray tracing is computationally expensive and relies on fast approximate algorithms to be usable in production environments.


1 Introduction
2 Purpose
3 Prerequisites
4 Method
  4.1 Optical system
  4.2 Materials
  4.3 The direction of light
  4.4 Area light source
  4.5 Indirect illumination
  4.6 Tuning the optical system
  4.7 Rendering algorithm
5 Results
  5.1 Progressive rendering
  5.2 Image resolution
  5.3 Depth of field
  5.4 Indirect illumination
6 Conclusions






Introduction

The most common application of computer graphics is real-time rendering of complex 3D scenes; the advanced and expensive graphics cards of today are a strong indication of that. In that setting we want the best visual results, taking whatever shortcuts, cheats and tricks are available, but we must always maintain high frame rates. That is what computer graphics is used for most of the time, almost always in the context of interactive media such as computer games. Sometimes, however, we want the best possible image and accept that it may take a long time for the computer to produce a single image. Architects, product designers, animated movie producers and others sometimes need photorealistic images in order to achieve their goals. Although simplifications cannot be avoided completely, this project is about simulating the basic physics of light. Done properly, this can yield photorealistic images, much more accurate than what can be achieved with real-time graphics.



Purpose

The purpose of this project is to implement formulas from physics that produce visible optical phenomena such as depth of field, aberration and global illumination. Approximations and simplifications are allowed, both to simplify algorithms that can be difficult to implement and to improve performance. The purpose is not to implement existing rendering algorithms. It is rather about discovering relations between physical formulas and computation, and about understanding concepts in optics by using the renderer.




Prerequisites

The project is implemented in C++11, using source code from the other labs in the computer graphics course. The external code base consists of SDL (Simple DirectMedia Layer) for screen and pixel operations, and GLM (OpenGL Mathematics) as a linear algebra library. The main source file consists of a main function, a rendering loop and a function for finding the closest intersection between a ray and the triangles in the scene (with custom lazy evaluation of solutions to the equation system, improving intersection detection speed by up to 200%). Also provided is an instance of the Cornell box, which is used as the rendering subject throughout the lab.

(a) Realtime rasterized render

(b) Raytraced render with shadows

Figure 1: The Cornell box scene used in previous labs



Method

Optical system

The first step in coming closer to a physical model of light is to exchange the pinhole camera, used in the previous labs, for a slightly more advanced optical system. One of the simplest optical systems that can produce an image consists of a rectangular light sensor and a thin lens, which is what we will use. The benefit of the thin lens is that it can be approximated as a flat surface, which enables fast and easy intersection detection. This means that the rays enter and exit the lens at the same point, which is never true in reality, but is a common approximation. The fact that we actually use a lens to focus the light beams automatically gives us depth of field, an optical phenomenon that the simple ray tracer and rasterizer did not have. Tuning of the optical system will be explained later.

Figure 2: Setup of scene and camera, objects not according to scale. The red box is the Cornell box (without objects), the blue disc is the thin lens, the yellow plane is the light sensor, the black line is the optical axis and the green dot is the subject.



$S_1$: Distance between the lens and the subject
$S_2$: Distance between the lens and the light sensor
$R$: Lens radius (not related to curvature)
$L_W$, $L_H$: Dimensions of the light sensor


Thin lens arithmetic

To simulate a lens properly, one should calculate the refraction twice: once when the ray enters the lens and once when it exits. This also requires calculating the surface normals. Fortunately, for a thin lens it is much simpler, both for the programmer and for the computer. Assume the thin lens has focal length $f$. Let $\bar{n}_1$ be the normalized incident ray direction, let $\bar{n}_2$ be the outgoing refracted ray direction and let $\bar{r}$ be the intersection point on the lens, relative to the lens center. Note also that $\bar{n}_1$, $\bar{n}_2$ and $\bar{r}$ need to be translated according to the optical axis. Then $\bar{n}_2$ can be computed easily:

$$\bar{n}_2 = \bar{n}_1 - \frac{\bar{r}}{f} \qquad (1)$$
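Equation (1) can be sketched in a few lines of C++. The `Vec3` struct and the name `refractThinLens` are assumptions made for this illustration (the project itself uses GLM vectors), and the lens is assumed to lie in the plane $z = 0$ with the optical axis along $z$.

```cpp
#include <cassert>

// Minimal 3D vector, used only to keep this sketch self-contained.
struct Vec3 {
    double x, y, z;
};

Vec3 operator-(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

Vec3 operator/(const Vec3& a, double s) {
    return {a.x / s, a.y / s, a.z / s};
}

// Equation (1): n2 = n1 - r / f.
// n1: normalized incident direction.
// r:  hit point on the lens, relative to the lens center.
// f:  focal length of the thin lens.
// The returned direction is not renormalized.
Vec3 refractThinLens(const Vec3& n1, const Vec3& r, double f) {
    assert(f > 0.0);
    return n1 - r / f;
}
```

As a sanity check, a ray travelling along the optical axis direction that hits the lens at $\bar{r} = (0.01, 0, 0)$ with $f = 0.095$ is bent so that it crosses the axis at distance $f$ behind the lens, which is the expected behavior of a focusing lens.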



Materials

In computer graphics, a BRDF (Bidirectional Reflectance Distribution Function) describes, among other things, how light is reflected depending on the angle of the incoming light. The easiest BRDF to implement is Lambertian reflectance, which models a perfectly diffuse surface that reflects light evenly in all directions, regardless of the angle of the incident ray. The outgoing direction is found by rejection sampling: a random vector is drawn within a cube of side 2, and it is accepted if it lies within the unit sphere and its dot product with the surface normal is positive; otherwise a new vector is drawn. As an experiment, another material was created, which is attached to the left wall in the scene. This material first reduces the incoming light strength by half. It incorporates specular reflection, but with a small randomized distortion in order to create the effect of a rippled glass mirror.
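The rejection sampling described above might look roughly like the following sketch. The `Vec3` struct, the function name `sampleHemisphere`, the use of `std::mt19937` and the final normalization of the accepted vector are assumptions for illustration; the report does not state these details.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 {
    double x, y, z;
};

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Draw a random direction in the hemisphere around `normal`.
// Candidates are drawn uniformly in the cube [-1, 1]^3 and rejected unless
// they lie inside the unit sphere and on the same side as the normal.
Vec3 sampleHemisphere(const Vec3& normal, std::mt19937& rng) {
    assert(dot(normal, normal) > 0.0);
    std::uniform_real_distribution<double> coord(-1.0, 1.0);
    for (;;) {
        Vec3 v{coord(rng), coord(rng), coord(rng)};
        double len2 = dot(v, v);
        // The lower bound avoids dividing by a vanishing length below.
        if (len2 > 1e-12 && len2 <= 1.0 && dot(v, normal) > 0.0) {
            double len = std::sqrt(len2);
            // Assumption: the accepted vector is normalized before use.
            return {v.x / len, v.y / len, v.z / len};
        }
    }
}
```

About half of the candidates inside the sphere fall on the wrong side of the surface, and the sphere fills only about 52% of the cube, so on average roughly four candidates are drawn per accepted direction.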


The direction of light

For the ray tracing, it was found that the best direction in which to shoot photons is from the camera towards the scene. Some experiments were conducted with shooting rays from the light source and hoping they would bounce off the geometry and eventually enter the camera, but they yielded weak results, mainly because the lens is so small that it only catches a small fraction of the rays. This reversed model should not be a problem, however, since the optical models used are symmetrical.



Area light source

With the decision to reverse the direction of light, an immediate problem arose: how would randomized rays hit a point light source? The answer is that they would not. Instead, the point light source from previous labs needed to be exchanged for an area light source, which was made quite large and was placed just in front of the cube on the left side. Another benefit of the area light source is that it is more physically correct and thus yields more realistic results; the most significant distinction is probably soft shadows. The light source itself is not rendered, because its size and intensity would dominate the final image.


Indirect illumination

Indirect illumination was an important part of the project. The visual goal is to light up areas that are not directly illuminated by the light source. These areas are instead lit by surrounding bright areas, which in some scenes makes a big difference. This is often referred to as ambient lighting. In the implementation, the rays may bounce a constant number of times, $B$. If they have not reached the light source by then, they are canceled. The only exception is the mirror surface on the left; if a ray hits it, that reflection does not contribute to the count.

Color mixing

Another problem that did not need to be solved in the previous labs was how to compute the color of a ray that bounces off a colored surface. In reality, white light is a mixture of light of many visible wavelengths (colors), and a colored diffuse material reflects many of the wavelengths, some more strongly than others. In the computer's color model, colors are simply a mixture of red, green and blue. It is safe to say that if the color of a surface is red, and it receives a ray of white light, the reflected ray should be red. But if the incident ray is red and the surface color is orange, what color should the reflected ray get? Since the underlying model is not physically correct, it is hard to correct that mistake in this phase. In this implementation the color mixing is performed by taking the element-wise product of the color components. This can be seen in equation 2, where $I$ is the color of the incident ray, $S$ is the color of the surface and $R$ is the color of the reflected ray.

$$R = (I_R S_R,\; I_G S_G,\; I_B S_B) \qquad (2)$$
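A minimal sketch of equation (2), combined with the bounce limit $B$, is shown below. The `Color` struct and the names `mixColor` and `pathColor` are hypothetical, and the path is simplified to a list of surface colors hit before the ray reaches the light; the exception for the mirror surface is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Color {
    double r, g, b;
};

// Equation (2): element-wise product of incident color and surface color.
Color mixColor(const Color& incident, const Color& surface) {
    return {incident.r * surface.r,
            incident.g * surface.g,
            incident.b * surface.b};
}

// Color delivered by a ray that hits the given surfaces and then reaches the
// light source. If more than maxBounces (B) surfaces were needed, the ray is
// canceled and contributes nothing (black).
Color pathColor(const Color& light, const std::vector<Color>& surfaces,
                std::size_t maxBounces) {
    assert(maxBounces >= 1);
    if (surfaces.size() > maxBounces) {
        return {0.0, 0.0, 0.0};
    }
    Color result = light;
    for (const Color& s : surfaces) {
        result = mixColor(result, s);
    }
    return result;
}
```

With this rule, a red incident ray `{1, 0, 0}` on an orange surface `{1, 0.5, 0}` is reflected as pure red, since the surface's green component has nothing to reflect.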


Tuning the optical system

In optical systems like cameras, there are countless possible configurations for photographing an object. To narrow the problem down, some parameters are assigned based on reasonable guesses, which makes it possible to compute the other parameters.

Figure 3: Determining angle of view

First of all, the distance between the lens and the focused object is kept the same as in the previous labs, $S_1 = 2.9$. It is now possible to determine the angle of view, $\alpha$, because the entire box should be visible in the image. If we look at figure 3 we can calculate the angle using trigonometric functions:

$$\alpha = 2 \tan^{-1} \frac{1}{1.9} \approx 55^\circ \qquad (3)$$

And given the formula for angle of view,

$$\alpha = 2 \tan^{-1} \frac{L}{2f} \qquad (4)$$

and a fixed sensor size, e.g. $L_W = L_H = L = 0.1$, we can compute the focal length $f$:

$$\frac{L}{2f} = \frac{1}{1.9} \qquad (5)$$

which gives us $f = 0.095$. Using the formula for thin lenses,

$$\frac{1}{S_1} + \frac{1}{S_2} = \frac{1}{f} \qquad (6)$$

we can subsequently determine $S_2 \approx 0.0982$. These parameters can now remain fixed.
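The tuning steps in equations (3) to (6) can be reproduced numerically with a short sketch. The struct `LensSetup` and the function `tuneOpticalSystem` are names invented for this example; `tanHalfAngle` stands for the ratio $1/1.9$ read from figure 3.

```cpp
#include <cassert>
#include <cmath>

struct LensSetup {
    double alpha;  // angle of view in radians
    double f;      // focal length
    double s2;     // distance between lens and sensor
};

// tanHalfAngle: tangent of half the angle of view (1/1.9 in the report).
// sensorSize:   side length L of the square sensor.
// s1:           distance between the lens and the subject.
LensSetup tuneOpticalSystem(double tanHalfAngle, double sensorSize, double s1) {
    assert(tanHalfAngle > 0.0 && sensorSize > 0.0);
    double alpha = 2.0 * std::atan(tanHalfAngle);  // equation (3)
    double f = sensorSize / (2.0 * tanHalfAngle);  // equations (4) and (5)
    assert(s1 > f);                                // subject beyond focal length
    double s2 = 1.0 / (1.0 / f - 1.0 / s1);        // equation (6)
    return {alpha, f, s2};
}
```

Calling `tuneOpticalSystem(1.0 / 1.9, 0.1, 2.9)` gives an angle of view of about 55.5 degrees, $f = 0.095$ and $S_2 \approx 0.0982$, in agreement with the values above.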


Depth of field

There is only one parameter left in the optical system, the aperture (or lens radius) $R$. In reality, this parameter alters the brightness of the final image, but since all rays are sent through the lens no matter how small it is, the quantity of light passing through does not change. A common way to measure aperture diameter is the f-number $N$, which describes the aperture diameter in relation to the focal length; typical values are $2 \le N \le 16$ for cameras. A higher number means a smaller aperture.

$$2R = \frac{f}{N} \qquad (7)$$

The aperture affects the depth of field, i.e. the distance between the nearest and the furthest objects that are acceptably sharp. The smaller the aperture, the larger the depth of field. We first need to determine what is acceptably sharp. The circle of confusion, $c$, is a measurement of how large an optical spot is allowed to be in order to be perceived as sharp when looking at an image. For the purpose of keeping the project simple we use the common approximation $c = d/1500$, where $d$ is the diagonal of the sensor, in this case $d = \sqrt{2} L$. In the following formulas, $H$ is the hyperfocal distance (which for the purpose of the project is only needed for computing depth of field), $D_N$ is the near limit and $D_F$ is the far limit of the depth of field. Thus, the depth of field is $D_F - D_N$.

$$H = \frac{f^2}{N c} + f \qquad (8)$$

$$D_N = \frac{S_1 H}{H + S_1 - f} \qquad (9)$$

$$D_F = \frac{S_1 H}{H - S_1 - f} \qquad (10)$$

The f-number’s impact on the depth of field can be inspected in table 1.


N     H        DN      DF       DF − DN
2     47.957   2.740   3.093    0.353
4     24.026   2.597   3.313    0.716
8     12.061   2.353   3.858    1.505
16    6.078    1.984   5.717    3.733
32    3.086    1.519   97.934   96.415

Table 1: Depth of field, DF − DN , varying as a function of aperture size which is described by the f-number N . Subject distance S1 = 2.9.
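The rows of table 1 can be recomputed from equations (8) to (10) with a sketch like the one below. The struct `DepthOfField` and the function `depthOfField` are illustrative names, and returning infinity when the far-limit denominator is not positive is an assumption added for robustness, not something the report describes.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

struct DepthOfField {
    double H;          // hyperfocal distance
    double nearLimit;  // D_N
    double farLimit;   // D_F
};

// f: focal length, N: f-number, sensorSide: side L of the square sensor,
// s1: subject distance.
DepthOfField depthOfField(double f, double N, double sensorSide, double s1) {
    assert(f > 0.0 && N > 0.0 && sensorSide > 0.0);
    double d = std::sqrt(2.0) * sensorSide;      // sensor diagonal
    double c = d / 1500.0;                       // circle of confusion
    double H = f * f / (N * c) + f;              // equation (8)
    double nearLimit = s1 * H / (H + s1 - f);    // equation (9)
    double farDenominator = H - s1 - f;
    // Assumption: treat a non-positive denominator as an infinite far limit.
    double farLimit = farDenominator > 0.0
        ? s1 * H / farDenominator                // equation (10)
        : std::numeric_limits<double>::infinity();
    return {H, nearLimit, farLimit};
}
```

For example, `depthOfField(0.095, 2.0, 0.1, 2.9)` yields values close to the first row of the table, and the depth of field follows as `farLimit - nearLimit`.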


Rendering algorithm

The idea of the rendering algorithm is to shoot a ray from a random location on the light sensor, through a random location on the lens, and then compute where it ends up using the material and light models explained previously. When a ray hits the light source, the corresponding cell on the light sensor increases in intensity. A noteworthy observation is that the lens flips the image both horizontally and vertically, which needs to be accounted for during drawing.

Light sensor

The light sensor consists of $p$ cells, one per pixel, usually in the interval $100^2 \le p \le 800^2$, with one float value per color component per cell. These values are incremented with the computed light intensity of each ray that hits the light source. Note that these float values are not limited to be less than one, but can be of arbitrary size.

Progressive rendering

To fully understand the renderer, one needs to be familiar with the distinction between ray tracing and drawing. Ray tracing is the intense computational work of tracing the light rays in the scene; in this respect it also involves the alteration of the light sensor's color values. Drawing is simply the process of traversing the cells in the light sensor and drawing the colors to the screen. Ray tracing and drawing are performed asynchronously, or more precisely, ray tracing is performed in parallel and drawing is done in the main thread at a fixed interval. This allows the render to be progressive, and the user may stop the rendering when satisfied with the result. A longer render duration, $t$, yields more rays traced, $r$, which should improve the result.

Multithreaded ray tracing

The only variables the ray tracing function updates are the sensor cells. This is very convenient, because it makes it easy to delegate the work to multiple CPU cores. This is done using simple C++ threads. Since the threads may update the same cell at the same time, a mutual exclusion lock protects the sensor from concurrent alterations.

Exposure

In order to control the exposure of the image, the total intensity of all rays that have been traced is stored in a global variable and is used for computing the average intensity per cell before each frame is rendered. When a cell is drawn, the color value in the cell is divided by the average per-cell intensity, so that the image is never over- or underexposed. The exposure can also be fine-tuned manually using the arrow keys during rendering.
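The sensor accumulation, the mutual exclusion lock and the automatic exposure can be sketched together as follows. The class `LightSensor`, its member names and the helper `depositInParallel` are hypothetical. Two further assumptions are made: the intensity of a color is taken to be the mean of its three components, and the running total is kept inside the class rather than in a global variable.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

struct Color {
    double r, g, b;
};

class LightSensor {
public:
    LightSensor(std::size_t width, std::size_t height)
        : width_(width), cells_(width * height, Color{0.0, 0.0, 0.0}) {
        assert(width > 0 && height > 0);
    }

    // Called by ray tracing threads when a ray has reached the light source.
    void deposit(std::size_t x, std::size_t y, const Color& c) {
        std::lock_guard<std::mutex> lock(mutex_);
        Color& cell = cells_[y * width_ + x];
        cell.r += c.r;
        cell.g += c.g;
        cell.b += c.b;
        // Assumption: intensity is the mean of the color components.
        totalIntensity_ += (c.r + c.g + c.b) / 3.0;
    }

    // Called by the drawing thread: the cell value divided by the average
    // per-cell intensity, times an optional manual gain.
    Color exposed(std::size_t x, std::size_t y, double gain = 1.0) const {
        std::lock_guard<std::mutex> lock(mutex_);
        double average = totalIntensity_ / static_cast<double>(cells_.size());
        if (average <= 0.0) {
            return {0.0, 0.0, 0.0};  // nothing traced yet
        }
        const Color& cell = cells_[y * width_ + x];
        return {cell.r * gain / average,
                cell.g * gain / average,
                cell.b * gain / average};
    }

private:
    std::size_t width_;
    std::vector<Color> cells_;
    double totalIntensity_ = 0.0;
    mutable std::mutex mutex_;
};

// Stand-in for the ray tracing workers: each thread deposits white light
// into cell (0, 0) a fixed number of times.
void depositInParallel(LightSensor& sensor, unsigned threads, int raysPerThread) {
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&sensor, raysPerThread] {
            for (int i = 0; i < raysPerThread; ++i) {
                sensor.deposit(0, 0, {1.0, 1.0, 1.0});
            }
        });
    }
    for (std::thread& worker : pool) {
        worker.join();
    }
}
```

A single lock around the whole sensor is the simplest design and matches the description above; with many threads it can become a point of contention, in which case per-thread buffers merged at draw time would be one alternative.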




Results

All images were rendered on a 2.4 GHz Intel Core i7 CPU with four physical cores, using one ray tracing thread for each of the eight virtual cores. In figure 4, a typical rendering can be seen. The image is sharp but surfaces are noisy because the rays are randomized. The difference between the reflective material on the left wall and the diffuse surfaces is clearly visible, as are the soft shadows from the red cube, which are a result of the area light source. The rendering time of 71.6 minutes is quite long considering the simplicity of the scene, and compared to commercial photorealistic rendering engines. Output shows that around 9.5% of the rays hit the light source, meaning that more than 90% of the computations did not contribute to the image at all.

Figure 4: $p = 800^2$, $B = 2$, $N = 16$, $r = 12 \times 10^7$, $t = 71.6$ min



Progressive rendering

Just as expected, the duration of the rendering has a significant impact on the final result. In figure 5, the gradual reduction of noise over time can be inspected. Furthermore, the total amount of light in figure 5d is about 15 times higher than in figure 5c, but the images are both at acceptable brightness levels, which confirms that the automatic exposure adjustment is working.

(a) t = 31 s

(b) t = 78 s

(c) t = 4.8 min

(d) t = 71.6 min

Figure 5: The duration of the render has significant impact on the result; longer time leads to less noise.


Image resolution

The greater the resolution, the more cells there are in the sensor, and consequently the more sparsely the lit pixels are spread over the image for a given number of rays. Theoretically, the render time needed to reach an acceptable noise level should scale linearly with the resolution, $p$. In figure 6, visual inspection is consistent with such a relation.

(a) $p = 100^2$

(b) $p = 200^2$

(c) $p = 400^2$

(d) $p = 800^2$

Figure 6: The relationship between rendering resolution p and the noise level. All images have been cropped to 100 × 100 pixels. t = 140 s.



Depth of field

As shown previously in table 1, the depth of field is reduced when the aperture gets larger (smaller f-number). In figure 7, this effect can be inspected. The images confirm that the focus gets shallower. But if we take a closer look at the largest aperture, in figure 7d, only the center of the image is in focus, even though many objects, like the edges of the box, remain within the depth of field. The further away from the image center one looks, the stronger the blur seems. This is probably a result of optical aberration, either because a thin lens is not of ideal shape for focusing light, or because the implementation of the thin lens is a completely flat surface, which becomes a problem when the lens size gets larger.

(a) N = 16, DoF = 3.73

(b) N = 8, DoF = 1.50

(c) N = 4, DoF = 0.72

(d) N = 2, DoF = 0.35

Figure 7: The relationship between the f-number, N , and the depth of field (DoF). A larger aperture (lower f-number) results in a shallower focus.


Indirect illumination

With more bounces of light rays allowed, the rendering time per ray tracing operation increases. Since the light rays lose intensity with every bounce, it makes sense to keep the number of bounces to a minimum. In figure 8, the effect of the indirect illumination can be inspected. Note that $B = 1$ implies that there is only direct illumination, which can be seen in figure 8a where the area to the right of the red cube is pitch black. In the other two images, that area is slightly illuminated. Also, note how the ray hit rate, $\hat{r}$ (the portion of rays that were successful in hitting the light source), increases with $B$.

(a) $B = 1$, $t = 12.7$ min, $\hat{r} = 6.1\%$

(b) $B = 2$, $t = 18.3$ min, $\hat{r} = 9.3\%$

(c) $B = 3$, $t = 21.7$ min, $\hat{r} = 12\%$

Figure 8: Indirect illumination. The number of bounces $B$, and its effect on the final image. All images have $r = 3.0 \times 10^7$ rays traced.




Conclusions

Overall, the results are visually satisfying. The images are more realistic than what was produced in the previous labs. Needless to say, CPU-powered ray tracing is very time consuming. The sample scene used has little geometry; with a more complex scene the rendering time will increase significantly. For production purposes the speed is unacceptably low. There should however be room for substantial performance improvements. One could for example utilize the GPU for matrix computations or incorporate existing algorithms like radiosity and photon mapping. Parallelism is, as shown in the report, highly achievable, which makes it possible to distribute rendering over many CPU cores, and even multiple machines. When creating a photorealistic renderer one constantly needs to choose between physically correct light simulation and clever approximations. Thus, the true challenge is, just as with real-time graphics, to create simplified algorithms that are both fast and accurate. For improved light simulation, there are many optical phenomena that would be interesting to implement, e.g. refractive materials, more physically correct BRDFs, more advanced optical systems for simulation of cameras, customizable aperture shape and diffraction (for star-burst effects) and subsurface scattering.

