362 10.

Mathematical Topics from 3D Graphics

have a much more difficult time understanding even the global illumination
techniques of today, much less those of tomorrow.

10.2 Viewing in 3D
Before we render a scene, we must pick a camera and a window. That is,
we must decide where to render it from (the view position, orientation, and
zoom) and where to render it to (the rectangle on the screen). The output
window is the simpler of the two, and so we will discuss it first.
Section 10.2.1 describes how to specify the output window. Section 10.2.2
discusses the pixel aspect ratio. Section 10.2.3 introduces the view frustum.
Section 10.2.4 describes field of view angles and zoom.

10.2.1 Specifying the Output Window


We don’t have to render our image to the entire screen. For example, in
split-screen multiplayer games, each player is given a portion of the screen.
The output window refers to the portion of the output device where our
image will be rendered. This is shown in Figure 10.2.

Figure 10.2. Specifying the output window


The position of the window is specified by the coordinates of the upper
left-hand pixel (winPosx, winPosy). The integers winResx and winResy are
the dimensions of the window in pixels. Defining it this way, using the size
of the window rather than the coordinates of the lower right-hand corner,
avoids some sticky issues caused by integer pixel coordinates. We are also
careful to distinguish between the size of the window in pixels, and the
physical size of the window. This distinction will become important in
Section 10.2.2.
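
As a minimal sketch of this convention, the window can be stored as an upper left-hand position plus a size in pixels. The class name `OutputWindow`, the helper `split_screen_left_right`, and the field names are illustrative assumptions patterned on winPos and winRes, not part of any particular API. Storing the size makes the "one past the last pixel" bound explicit, which sidesteps the off-by-one questions that an inclusive lower right-hand corner would raise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputWindow:
    """Hypothetical output window: upper-left pixel plus size in pixels."""
    win_pos_x: int  # x coordinate of the upper left-hand pixel
    win_pos_y: int  # y coordinate of the upper left-hand pixel
    win_res_x: int  # width of the window in pixels
    win_res_y: int  # height of the window in pixels

    def contains(self, x: int, y: int) -> bool:
        # Half-open ranges: the first pixel is included, pos + res is excluded.
        return (self.win_pos_x <= x < self.win_pos_x + self.win_res_x
                and self.win_pos_y <= y < self.win_pos_y + self.win_res_y)


def split_screen_left_right(dev_res_x: int, dev_res_y: int):
    """Divide a device into two side-by-side windows with no gap or overlap."""
    half = dev_res_x // 2
    left = OutputWindow(0, 0, half, dev_res_y)
    right = OutputWindow(half, 0, dev_res_x - half, dev_res_y)
    return left, right
```

With a 1280 × 720 device, the left window covers pixel columns 0 through 639 and the right window covers columns 640 through 1279, so every pixel belongs to exactly one player.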
With that said, it is important to realize that we do not necessarily have
to be rendering to the screen at all. We could be rendering into a buffer to
be saved into a .TGA file or as a frame in an .AVI, or we may be rendering
into a texture as a subprocess of the “main” render, to produce a shadow
map, or a reflection, or the image on a monitor in the virtual world. For
these reasons, the term render target is often used to refer to the current
destination of rendering output.

10.2.2 Pixel Aspect Ratio


Regardless of whether we are rendering to the screen or an off-screen buffer,
we must know the aspect ratio of the pixels, which is the ratio of a pixel’s
width to its height. This ratio is often 1:1 (that is, we have “square”
pixels), but this is not always the case! We give some examples below, but
it is common for this assumption to go unquestioned and become the source
of complicated kludges applied in the wrong place, to fix up stretched or
squashed images.
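The formula for computing the aspect ratio is
\[
% Margin note: Computing the pixel aspect ratio
\frac{\mathrm{pixPhys}_x}{\mathrm{pixPhys}_y}
  = \frac{\mathrm{devPhys}_x}{\mathrm{devPhys}_y} \cdot \frac{\mathrm{devRes}_y}{\mathrm{devRes}_x}.
\tag{10.2}
\]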
The notation pixPhys refers to the physical size of a pixel, and devPhys is
the physical height and width of the device on which the image is displayed.
For both quantities, the individual measurements may be unknown, but
that’s OK because the ratio is all we need, and this usually is known. For
example, standard desktop monitors come in all different sizes, but the
viewable area on many older monitors has a ratio of 4:3, meaning it is 33%
wider than it is tall. Another common ratio is 16:9 or wider [7] on
high-definition televisions. The integers devResx and devResy are the number
of pixels in the x and y dimensions. For example, a resolution of 1280 × 720
means that devResx = 1280 and devResy = 720.

[7] Monitor manufacturers must have been overjoyed to find that people perceived a
premium quality to these “widescreen” monitors. Monitor sizes are typically measured
by the diagonal, but costs are more directly tied to number of pixels, which is proportional
to area, not diagonal length. Thus, a 16:9 monitor with the same number of pixels as a
4:3 will have a longer diagonal measurement, which is perceived as a “bigger” monitor.
We’re not sure if the proliferation of monitors with even wider aspect ratios is fueled
more by market forces or marketing forces.

But, as mentioned already, we often deal with square pixels with an
aspect ratio of 1:1. For example, on a desktop monitor with a physical
width:height ratio of 4:3, some common resolutions resulting in square pixel
ratios are 640 × 480, 800 × 600, 1024 × 768, and 1600 × 1200. On 16:9
monitors, common resolutions are 1280 × 720, 1600 × 900, 1920 × 1080. The
aspect ratio 8:5 (more commonly known as 16:10) is also very common for
desktop monitor sizes and televisions. Some common display resolutions
that are 16:10 are 1152 × 720, 1280 × 800, 1440 × 900, 1680 × 1050, and
1920 × 1200. In fact, on the PC, it’s common to just assume a 1:1 pixel ratio,
since obtaining the dimensions of the display device might be impossible.
Console games have it easier in this respect.
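
The following sketch evaluates Equation (10.2) with exact fractions, so that "square pixels" shows up as a ratio of exactly 1. The function name `pixel_aspect_ratio` and its parameter names are assumptions chosen to mirror the notation in the text. The last example in the comments (a 1280 × 1024 mode on a 4:3 display) is a hypothetical case included only to show a non-square result.

```python
from fractions import Fraction


def pixel_aspect_ratio(dev_phys_aspect: Fraction,
                       dev_res_x: int,
                       dev_res_y: int) -> Fraction:
    """Equation (10.2): pixPhys_x / pixPhys_y.

    dev_phys_aspect is devPhys_x / devPhys_y, the physical width:height
    ratio of the device. Only this ratio is needed, not the actual sizes.
    """
    return dev_phys_aspect * Fraction(dev_res_y, dev_res_x)


# Square pixels: the resolution has the same proportions as the device.
#   pixel_aspect_ratio(Fraction(4, 3), 1024, 768)    == 1
#   pixel_aspect_ratio(Fraction(16, 9), 1920, 1080)  == 1
#   pixel_aspect_ratio(Fraction(16, 10), 1152, 720)  == 1
# Non-square pixels: a 5:4 resolution stretched over a 4:3 device.
#   pixel_aspect_ratio(Fraction(4, 3), 1280, 1024)   == Fraction(16, 15)
```

A result greater than 1 means each pixel is physically wider than it is tall, which is exactly the situation that produces stretched images when a 1:1 ratio is assumed.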
Notice that nowhere in these calculations is the size or location of the
window used; the location and size of the rendering window have no bearing
on the physical proportions of a pixel. However, the size of the window
will become important when we discuss field of view in Section 10.2.4, and
the position is important when we map from camera space to screen space
(Section 10.3.5).
At this point, some readers may be wondering how this discussion makes
sense in the context of rendering to a bitmap, where the word “physical”
implied by the variable names pixPhys and devPhys doesn’t apply. In most
of these situations, it’s appropriate simply to act as if the pixel aspect ratio
is 1:1. In some special circumstances, however, you may wish to render
anamorphically, producing a squashed image in the bitmap that will later
be stretched out when the bitmap is used.

10.2.3 The View Frustum


The view frustum is the volume of space that is potentially visible to the
camera. It is shaped like a pyramid with the tip snipped off. An example
of a view frustum is shown in Figure 10.3.
The view frustum is bounded by six planes, known as the clip planes.
The first four of the planes form the sides of the pyramid and are called
the top, left, bottom, and right planes, for obvious reasons. They correspond
to the sides of the output window. The near and far clip planes,
which correspond to certain camera-space values of z, require a bit more
explanation.
The reason for the far clip plane is perhaps easier to understand. It
prevents rendering of objects beyond a certain distance. There are two
practical reasons why a far clip plane is needed. The first is relatively easy
to understand: a far clip plane can limit the number of objects that need
to be rendered in an outdoor environment. The second reason is slightly
more complicated, but essentially it has to do with how the depth buffer
values are assigned. As an example, if the depth buffer entries are 16-bit
fixed point, then the largest depth value that can be stored is 65,535.
The far clip establishes what (floating point) z value in camera space will
correspond to the maximum value that can be stored in the depth buffer.
The motivation for the near clip plane will have to wait until we discuss
clip space in Section 10.3.2.

Figure 10.3. The 3D view frustum

Notice that each of the clipping planes is a plane, with emphasis on the
fact that it extends infinitely. The view volume is the intersection of the
six half-spaces defined by the clip planes.
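
The half-space description translates directly into a containment test: a point is inside the view volume only if it lies on the inner side of all six planes. The sketch below is an illustration under stated assumptions, not a description of how any particular renderer clips. It assumes camera space with +x right, +y up, and +z pointing forward from the camera, represents each plane by the hypothetical tuple `Plane(nx, ny, nz, d)` with the inside defined by n · p + d ≥ 0, and builds a symmetric frustum whose horizontal and vertical fields of view are both 90°, so the side planes are simply |x| ≤ z and |y| ≤ z.

```python
from typing import NamedTuple, Sequence, Tuple


class Plane(NamedTuple):
    """Half-space n . p + d >= 0; the normal (nx, ny, nz) points inward."""
    nx: float
    ny: float
    nz: float
    d: float


def plane_side(plane: Plane, p: Tuple[float, float, float]) -> float:
    # Non-negative when p is on the inner side (normals are not normalized,
    # so this is a sign test rather than a true distance).
    return plane.nx * p[0] + plane.ny * p[1] + plane.nz * p[2] + plane.d


def inside_view_volume(planes: Sequence[Plane],
                       p: Tuple[float, float, float]) -> bool:
    # Intersection of half-spaces: every plane test must pass.
    return all(plane_side(plane, p) >= 0.0 for plane in planes)


def make_frustum_90(near: float, far: float):
    """Six clip planes for a symmetric 90-degree frustum in camera space."""
    return [
        Plane(1.0, 0.0, 1.0, 0.0),    # left:   x >= -z
        Plane(-1.0, 0.0, 1.0, 0.0),   # right:  x <=  z
        Plane(0.0, 1.0, 1.0, 0.0),    # bottom: y >= -z
        Plane(0.0, -1.0, 1.0, 0.0),   # top:    y <=  z
        Plane(0.0, 0.0, 1.0, -near),  # near:   z >= near
        Plane(0.0, 0.0, -1.0, far),   # far:    z <= far
    ]
```

For example, with `make_frustum_90(1.0, 100.0)` the point (0, 0, 10) is inside, while (11, 0, 10) fails the right plane and (0, 0, 150) fails the far plane.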

10.2.4 Field of View and Zoom


A camera has position and orientation, just like any other object in the
world. However, it also has an additional property known as field of view.
Another term you probably know is zoom. Intuitively, you already know
what it means to “zoom in” and “zoom out.” When you zoom in, the
object you are looking at appears bigger on screen, and when you zoom
out, the apparent size of the object is smaller. Let’s see if we can develop
this intuition into a more precise definition.
The field of view (FOV) is the angle that is intercepted by the view
frustum. We actually need two angles: a horizontal field of view, and a
vertical field of view. Let’s drop back to 2D briefly and consider just one of
these angles. Figure 10.4 shows the view frustum from above, illustrating
precisely the angle that the horizontal field of view measures. The labeling
of the axes is illustrative of camera space, which is discussed in Section 10.3.
Zoom measures the ratio of the apparent size of the object relative to a
90° field of view. For example, a zoom of 2.0 means that the object will appear
twice as big on screen as it would if we were using a 90° field of view. So
larger zoom values cause the image on screen to become larger (“zoom in”),
and smaller values for zoom cause the images on screen to become smaller
(“zoom out”).

Figure 10.4. Horizontal field of view

Zoom can be interpreted geometrically as shown in Figure 10.5. Using
some basic trig, we can derive the conversion between zoom and field of
view:
\[
% Margin note: Converting between zoom and field of view
\mathrm{zoom} = \frac{1}{\tan(\mathrm{fov}/2)}, \qquad
\mathrm{fov} = 2 \arctan(1/\mathrm{zoom}).
\tag{10.3}
\]

Figure 10.5. Geometric interpretation of zoom

Notice the inverse relationship between zoom and field of view. As
zoom gets larger, the field of view gets smaller, causing the view frustum
to narrow. It might not seem intuitive at first, but when the view frustum
gets more narrow, the perceived size of visible objects increases.
Field of view is a convenient measurement for humans to use, but as we
discover in Section 10.3.4, zoom is the measurement that we need to feed
into the graphics pipeline.
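
Equation (10.3) is short enough to state directly in code. This is a minimal sketch; the function names `fov_to_zoom` and `zoom_to_fov` are assumptions, and angles are taken in radians to match Python's `math` module.

```python
import math


def fov_to_zoom(fov_radians: float) -> float:
    """zoom = 1 / tan(fov / 2), valid for 0 < fov < pi."""
    return 1.0 / math.tan(fov_radians / 2.0)


def zoom_to_fov(zoom: float) -> float:
    """fov = 2 * arctan(1 / zoom), valid for zoom > 0."""
    return 2.0 * math.atan(1.0 / zoom)


# A 90-degree field of view is the reference case, with zoom 1.0.
# Doubling the zoom to 2.0 narrows the field of view to about 53.13 degrees,
# so zoom and field of view move in opposite directions.
```

Note that doubling the zoom does not halve the angle: the relationship runs through the tangent, which is why zoom rather than the angle itself scales apparent size linearly.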
We need two different field of view angles (or zoom values), one horizontal
and one vertical. We are certainly free to choose any two arbitrary
values we fancy, but if we do not maintain a proper relationship between
these values, then the rendered image will appear stretched. If you’ve ever
watched a movie intended for the wide screen that was simply squashed
anamorphically to fit on a regular TV, or watched content with a 4:3 aspect
on a 16:9 TV in “full” [8] mode, then you have seen this effect.
In order to maintain proper proportions, the zoom values must be inversely
proportional to the physical dimensions of the output window:
\[
% Margin note: The usual relationship between vertical and horizontal zoom
\frac{\mathrm{zoom}_y}{\mathrm{zoom}_x}
  = \frac{\mathrm{winPhys}_x}{\mathrm{winPhys}_y}
  = \text{window aspect ratio}.
\tag{10.4}
\]
The variable winPhys refers to the physical size of the output window. As
indicated in Equation (10.4), even though we don’t usually know the actual
size of the render window, we can determine its aspect ratio. But how do
we do this? Usually, all we know is the resolution (number of pixels) of
the output window. Here’s where the pixel aspect ratio calculations from
Section 10.2.2 come in:
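\[
\begin{aligned}
\frac{\mathrm{zoom}_y}{\mathrm{zoom}_x} = \frac{\mathrm{winPhys}_x}{\mathrm{winPhys}_y}
  &= \frac{\mathrm{winRes}_x}{\mathrm{winRes}_y} \cdot \frac{\mathrm{pixPhys}_x}{\mathrm{pixPhys}_y} \\
  &= \frac{\mathrm{winRes}_x}{\mathrm{winRes}_y} \cdot \frac{\mathrm{devPhys}_x}{\mathrm{devPhys}_y} \cdot \frac{\mathrm{devRes}_y}{\mathrm{devRes}_x}.
\end{aligned}
\tag{10.5}
\]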
In this formula,
• zoom refers to the camera’s zoom values,
• winPhys refers to the physical window size,
• winRes refers to the resolution of the window, in pixels,
• pixPhys refers to the physical dimensions of a pixel,
• devPhys refers to the physical dimensions of the output device. Remember
that we usually don’t know the individual sizes, but we do
know the ratio,
• devRes refers to the resolution of the output device.

[8] While it causes videophiles extreme stress to see an image manhandled this way,
apparently some TV owners prefer a stretched image to the black bars, which give them
the feeling that they are not getting all their money’s worth out of their expensive
new TV.

Many rendering packages allow you to specify only one field of view
angle (or zoom value). When you do this, they automatically compute the
other value for you, assuming you want uniform display proportions. For
example, you may specify the horizontal field of view, and they compute
the vertical field of view for you.
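
What such a package might do internally can be sketched by combining Equations (10.3) and (10.5). This is an assumption about the general approach rather than the behavior of any specific renderer, and the function names and the `pixel_aspect` parameter (pixPhys_x / pixPhys_y, defaulting to square pixels) are illustrative.

```python
import math


def window_aspect_ratio(win_res_x: int, win_res_y: int,
                        pixel_aspect: float = 1.0) -> float:
    """Equation (10.5): winPhys_x / winPhys_y from resolution and pixel shape."""
    return (win_res_x / win_res_y) * pixel_aspect


def vertical_zoom(zoom_x: float, win_res_x: int, win_res_y: int,
                  pixel_aspect: float = 1.0) -> float:
    """zoom_y = zoom_x * window aspect ratio, per Equation (10.4)."""
    return zoom_x * window_aspect_ratio(win_res_x, win_res_y, pixel_aspect)


def vertical_fov_from_horizontal(h_fov_radians: float,
                                 win_res_x: int, win_res_y: int,
                                 pixel_aspect: float = 1.0) -> float:
    """Derive the vertical field of view that keeps proportions uniform."""
    zoom_x = 1.0 / math.tan(h_fov_radians / 2.0)   # Equation (10.3)
    zoom_y = vertical_zoom(zoom_x, win_res_x, win_res_y, pixel_aspect)
    return 2.0 * math.atan(1.0 / zoom_y)           # Equation (10.3)


# A 90-degree horizontal field of view in a 1280 x 720 window with square
# pixels gives zoom_x = 1, zoom_y = 16/9, and a vertical field of view of
# roughly 58.7 degrees.
```

The vertical angle is not simply the horizontal angle scaled by 9/16 (which would be 50.6°); the scaling applies to the zoom values, and the angles follow from the arctangent.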
Now that we know how to describe zoom in a manner suitable for consumption
by a computer, what do we do with these zoom values? They go
into the clip matrix, which is described in Section 10.3.4.

10.2.5 Orthographic Projection


The discussion so far has centered on perspective projection, which is the
most commonly used type of projection, since that’s how our eyes perceive
the world. However, in many situations orthographic projection is also useful.
We introduced orthographic projection in Section 5.3; to briefly review,
in orthographic projection, the lines of projection (the lines that connect
all the points in space that project onto the same screen coordinates) are
parallel, rather than intersecting at a single point. There is no perspective
foreshortening in orthographic projection; an object will appear the same
size on the screen no matter how far away it is, and moving the camera
forward or backward along the viewing direction has no apparent effect so
long as the objects remain in front of the near clip plane.
Figure 10.6 shows a scene rendered from the same position and orientation,
comparing perspective and orthographic projection. On the left,
notice that with perspective projection, parallel lines do not remain parallel,
and the closer grid squares are larger than the ones in the distance.
On the right, under orthographic projection, the grid squares are all the
same size and the grid lines remain parallel.
Orthographic views are very useful for “schematic” views and other
situations where distances and angles need to be measured precisely. Every
modeling tool will support such a view. In a video game, you might use an
orthographic view to render a map or some other HUD element.
For an orthographic projection, it makes no sense to speak of the “field
of view” as an angle, since the view frustum is shaped like a box, not a
pyramid. Rather than defining the x and y dimensions of the view frustum
in terms of two angles, we give two sizes: the physical width and height of
the box.

Figure 10.6. Perspective versus orthographic projection (left panel: perspective projection; right panel: orthographic projection)

The zoom value has a different meaning in orthographic projection compared
to perspective. It is related to the physical size of the frustum box:
\[
% Margin note: Converting between zoom and frustum size in orthographic projection
\mathrm{zoom} = 2/\mathrm{size}, \qquad \mathrm{size} = 2/\mathrm{zoom}.
\]
As with perspective projections, there are two different zoom values, one
for x and one for y, and their ratio must be coordinated with the aspect ratio
of the rendering window in order to avoid producing a “squashed” image.
We developed Equation (10.5) with perspective projection in mind, but this
formula also governs the proper relationship for orthographic projection.
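
The orthographic case can be sketched the same way. Here the caller picks the width of the frustum box, and the height follows from the window's aspect ratio via Equation (10.5). The function names and the choice of width as the free parameter are assumptions made for illustration.

```python
def ortho_zoom_from_size(size: float) -> float:
    """zoom = 2 / size for an orthographic frustum box of the given extent."""
    return 2.0 / size


def ortho_size_from_zoom(zoom: float) -> float:
    """size = 2 / zoom, the inverse of ortho_zoom_from_size."""
    return 2.0 / zoom


def ortho_zooms_for_window(box_width: float,
                           win_res_x: int, win_res_y: int,
                           pixel_aspect: float = 1.0):
    """Return (zoom_x, zoom_y) so the box matches the window's proportions."""
    zoom_x = ortho_zoom_from_size(box_width)
    # zoom_y / zoom_x must equal the window aspect ratio, Equation (10.5).
    zoom_y = zoom_x * (win_res_x / win_res_y) * pixel_aspect
    return zoom_x, zoom_y


# A box 32 units wide shown in a 1280 x 720 window with square pixels has
# zoom_x = 0.0625, and the resulting zoom_y corresponds to a height of
# 18 units, so the box itself is 16:9.
```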

10.3 Coordinate Spaces


This section reviews several important coordinate spaces related to 3D viewing.
Unfortunately, terminology is not consistent in the literature on the
subject, even though the concepts are. Here, we discuss the coordinate
spaces in the order they are encountered as geometry flows through the
graphics pipeline.

10.3.1 Model, World, and Camera Space


The geometry of an object is initially described in object space, which is
a coordinate space local to the object being described (see Section 3.2.2).
