Hardware Accelerated High Quality Reconstruction on PC Graphics Hardware
Markus Hadwiger, Thomas Theußl, Helwig Hauser, Eduard Gröller
accurate. Since numerical considerations alone may not always be appropriate, they also show how to use their framework to design accurate and smooth reconstruction filters [14]. Turkowski [22] used windowed ideal reconstruction filters for image resampling tasks. Theußl et al. [21] used the framework developed by Möller et al. [12] to assess the quality of windowed reconstruction filters and to derive optimal values for the parameters of Kaiser and Gaussian windows.

In the field of hardware-accelerated visualization, Meißner et al. [10] have done a thorough comparison of the four prevalent approaches to volume rendering, one of them being the use of texture-mapping hardware. Hardware-accelerated texture mapping has been used to accelerate volume rendering for quite some time. The original SGI RealityEngine [1] introduced 3D textures. They can be used effectively for rendering volumes at interactive frame rates, as shown by Cabral et al. [3] and Cullip and Neumann [4], for example. These early approaches all stored a preshaded volume in a 3D texture, which had to be recomputed whenever the viewing parameters, lighting conditions, or transfer function were changed. Westermann and Ertl [23] introduced several approaches for accelerating volume rendering with graphics hardware that is capable of three-dimensional texture mapping and supports a color matrix. They were able to render monochrome images of shaded isosurfaces at interactive rates, without explicitly extracting any geometry, and performed shading on the fly, without requiring the volume texture to be updated. Meißner et al. [9] have extended this approach to semi-transparent volume rendering, and are able to apply both classification and colored shading at run time without changing the volume texture itself. Their approach additionally requires the SGI pixel textures extension of OpenGL [20]. Dachille et al. [5] used a mixed-mode approach, applying a transfer function and shading in software while rendering the resulting texture volume in hardware. Many of the approaches presented earlier have later been adapted to standard PC graphics hardware, e.g., by Rezk-Salama et al. [19]. They use only two-dimensional textures instead of three-dimensional ones, exploiting the multi-texturing and multi-stage rasterization capabilities of NVIDIA GeForce graphics cards. The high-quality filtering approach we are proposing can be used in conjunction with many of the methods previously described.

The OpenGL [17] imaging subset (introduced in OpenGL 1.2) offers convolution capabilities, but these are primarily intended for image processing and unfortunately cannot be used for the kind of reconstruction we desire. Furthermore, these features are usually not supported natively by the graphics hardware, especially not by consumer PC graphics accelerators. Hopf and Ertl [6] have shown how to achieve 3D convolution by building upon two-dimensional convolution in OpenGL.

sum:

    g(x) = f[x] \ast h(x) = \sum_{i = \lfloor x \rfloor - m + 1}^{\lfloor x \rfloor + m} f[i]\, h(x - i)    (1)

This equation describes a convolution of the discrete input samples f[x] with a continuous reconstruction filter h(x). In the case of reconstruction, this is essentially a sliding average of the samples and the reconstruction filter. If the function to be reconstructed was bandlimited and sampled properly, using the sinc function as reconstruction filter would result in perfect reconstruction. However, the sinc function is impracticable because of its infinite extent. Therefore, in practice finite approximations to the sinc filter are used, which yield results of varying quality, depending on the properties of these approximations. In equation 1, the (finite) half-width of the filter kernel is denoted by m.

In order to be able to exploit standard graphics hardware for performing this computation, we do not use the evaluation order usually employed in software-based filtering. There, the convolution sum is commonly evaluated in its entirety for a single output sample at a time. That is, all the contributions of neighboring input samples (their values multiplied by the corresponding filter values) are gathered and added up in order to calculate the final value of a certain output sample. This "gathering" of contributions is shown in figure 1(a). The figure uses a simple tent filter as an example. It shows how a single output sample is calculated by adding up two contributions. The first contribution is gathered from the neighboring input sample on the left-hand side, and the second one is gathered from the input sample on the right-hand side. In this example, the convolution results in linear interpolation, due to the tent filter employed. To generate the desired output data in its entirety, this is done for all corresponding resampling points (output sample locations).

Our method uses a different evaluation order. Instead of focusing on a single output sample at any one time, we first calculate the contribution of a single input sample to all corresponding output sample locations (resampling points). That is, we distribute the contribution of an input sample to its neighboring output samples, instead of the other way around. This "distribution" of contributions is shown in figure 1(b). In this case, the final value of a single output sample is only available after all corresponding contributions of input samples have been distributed to it.

We evaluate the convolution sum in the order outlined above, since the distribution of the contributions of a single relative input sample can be done in hardware for all output samples (pixels) simultaneously. The final result is gradually built up over multiple rendering passes. The term relative input sample location denotes a relative offset of an input sample with respect to a set of output samples. In the example of a one-dimensional tent filter (like in figure 1(a)), there are two relative input sample locations. One could
Figure 1: Gathering vs. distribution of input sample contributions (tent filter): (a) Gathering all contributions to a single output sample (b)
Distributing a single input sample’s contribution (tent filter not shown).
be called the "left-hand neighbor," the other the "right-hand neighbor." In the first pass, the contribution of all respective left-hand neighbors is calculated. The second pass then adds the contribution of all right-hand neighbors. Note that the number of passes depends on the filter kernel used; see below.

Thus, the same part of the filter convolution sum is added to the previous result for each pixel at the same time, yielding the final result after all parts have been added up. From this point of view, the graph in figure 1(b) depicts both rendering passes that are necessary for reconstruction with a one-dimensional tent filter, but only with respect to the contribution of a single input sample. The contributions distributed simultaneously in a single pass are depicted in figures 2 and 3, respectively. In the first pass, shown in figure 2, the contributions of all relative left-hand neighbors are distributed. Consequently, the second pass, shown in figure 3, distributes the contributions of all relative right-hand neighbors. Adding up the distributed contributions of these two passes yields the final result for all resampling points (i.e., linearly interpolated output values in this example). In the case of a tent filter (which has width two), this can be seen in figure 1(a) by imagining the filter kernel sliding from the second input sample shown to the third input sample.

From now on we will call a segment of width one from one integer location in a filter kernel to the next an integer filter kernel tile, or simply filter tile. Consequently, a one-dimensional tent filter has two filter tiles, corresponding to the fact that it has width two. Looking again at figure 1(a), we can now say that each filter tile covers exactly one input sample, regardless of where the kernel is actually positioned.
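The two evaluation orders can be sketched in a few lines of Python (a minimal one-dimensional illustration of the idea, not the hardware implementation; the function names and toy data are ours):

```python
import math

def tent(t):
    """Tent filter of width two (half-width m = 1)."""
    return max(0.0, 1.0 - abs(t))

def gather(f, h, m, x):
    """Gathering order of equation (1): sum all contributions to one output sample."""
    fx = math.floor(x)
    return sum(f[i] * h(x - i)
               for i in range(fx - m + 1, fx + m + 1) if 0 <= i < len(f))

def distribute(f, h, m, xs):
    """Distribution order: one pass per filter tile; each pass adds the
    contribution of one relative input sample location to ALL output samples
    at once ('out' plays the role of the accumulating frame buffer)."""
    out = [0.0] * len(xs)
    for p in range(-m + 1, m + 1):          # one rendering pass per filter tile
        for k, x in enumerate(xs):
            i = math.floor(x) + p           # relative input sample for this pass
            if 0 <= i < len(f):
                out[k] += f[i] * h(x - i)   # weight from the current filter tile
    return out

f = [0.0, 2.0, 4.0, 3.0]                    # discrete input samples f[i]
xs = [0.5, 1.5, 2.5]                        # resampling points
assert distribute(f, tent, 1, xs) == [gather(f, tent, 1, x) for x in xs]
print(distribute(f, tent, 1, xs))           # linear interpolation: [1.0, 3.0, 3.5]
```

Both orders produce identical results; the distribution order merely regroups the terms of the sum so that each pass touches every output sample with one relative input sample, which is exactly what a rendering pass over a shifted input texture can do in parallel.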
Figure 6: Filter kernels we have used: (a) Cubic B-spline and Catmull-Rom spline; (b) Blackman windowed sinc (width = 4), depicting also the window itself.
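For reference, the kernels plotted in figure 6 follow the standard textbook definitions; the following Python sketch is our own formulation of those definitions, not code from this work:

```python
import math

def bspline(t):
    """Cubic B-spline, width four (smoothing, non-interpolating)."""
    t = abs(t)
    if t < 1.0:
        return (4.0 - 6.0 * t * t + 3.0 * t ** 3) / 6.0
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0

def catmull_rom(t):
    """Cubic Catmull-Rom spline, width four (interpolating, negative lobes)."""
    t = abs(t)
    if t < 1.0:
        return 1.5 * t ** 3 - 2.5 * t * t + 1.0
    if t < 2.0:
        return -0.5 * t ** 3 + 2.5 * t * t - 4.0 * t + 2.0
    return 0.0

def blackman_sinc(t, tau=2.0):
    """Sinc windowed with a Blackman window of width four (tau = half-width)."""
    if abs(t) >= tau:
        return 0.0
    window = 0.42 + 0.5 * math.cos(math.pi * t / tau) \
                  + 0.08 * math.cos(2.0 * math.pi * t / tau)
    sinc = 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)
    return window * sinc
```

Note that the interpolating kernels (Catmull-Rom, windowed sinc) are one at zero and zero at the other integers, whereas the B-spline peaks at 2/3, which is why it blurs but never rings.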
Figure 7: Slice of MR head data set, reconstructed using: (a) Bilinear interpolation; (b) Blackman windowed sinc; (c) Bicubic Catmull-Rom
spline; (d) Bicubic B-spline.
Table 1: Frame rates for different scenarios. Note that the tricubic case uses the fall-back algorithm.
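Since frame rates are governed by the number of rendering passes, and each filter tile of a separable kernel conceptually costs one pass (multi-texturing can combine several tiles into a single pass, reducing this count), the pass count grows as a power of the kernel width. A small sketch of this relationship (our own illustration, assuming one tile per pass):

```python
def num_passes(width, dims):
    """One rendering pass per filter tile: a separable kernel of the given
    width in each of 'dims' dimensions has width**dims tiles."""
    return width ** dims

print(num_passes(4, 2))  # bicubic kernels on a 2D slice: 16 passes
print(num_passes(4, 3))  # tricubic reconstruction: 64 passes
```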
5 Results

In our work, we are focusing on widely available low-cost PC graphics hardware. Currently, the two premier graphics accelerators in this field are the NVIDIA GeForce 2 and the ATI Radeon. We have implemented and tested our method on a GeForce 2 with 32MB RAM, and a Radeon with 64MB RAM. The graphics API we are using is OpenGL. Although the GeForce 3 is already on the horizon, it is not widely available yet.

The GeForce 2 supports multi-texturing with two 2D textures at the same time and offers the capability to subtract from the contents of the frame buffer. It does not support 3D textures, however. The Radeon supports up to three simultaneous 2D textures, as well as one 3D texture and one 2D texture at the same time. Unfortunately, it does not allow subtracting from the frame buffer. These properties of the graphics hardware we have used lead to a couple of restrictions with respect to which parts of our approach can be implemented on which platform.

As filter kernels we have used a cubic B-spline of width four and a cubic Catmull-Rom spline, also of width four. In addition to these we have also tested a windowed sinc filter kernel.

We have reconstructed object-aligned planar slices on both the GeForce 2 and the Radeon. Figure 7(a) shows a part from a 256x256 slice from a human head MR data set. The image was rendered at a resolution of 600x600. In this figure, reconstruction was done with the hardware-native bilinear interpolation. Thus, interpolation artifacts are clearly visible when viewed on screen. The same slice under the same conditions but with different reconstruction kernels is shown in figures 7(b) (Blackman windowed sinc), 7(c) (cubic Catmull-Rom), and 7(d) (cubic B-spline). The Blackman windowed sinc and Catmull-Rom kernels can only be used on the GeForce, because the Radeon does not support frame buffer subtraction. Figure 9 (colorplate) shows slices reconstructed with different filters together with magnified regions to highlight the differences.

Since the GeForce 2 does not support 3D textures, we have tested our approach for reconstructing arbitrarily oriented slices only on the Radeon. Figure 8 shows a slice from a vertebrae data set with screws. The slice is not aligned with the slice stack, but instead has an arbitrary location and orientation with respect to the texture domain. Since the Radeon does not support multi-texturing with two 3D textures, we have used the fall-back approach for this case outlined in section 3.4.

Table 1 shows some timing results of our test implementation. We have used a Pentium III at 733 MHz with 512MB of RAM, together with the two graphics cards described above (GeForce 2 and Radeon). Note that the timings do not depend on the shape of the filter kernel per se, only on its width.

6 Conclusions and future work

We have presented a general approach for high-quality filtering that is able to exploit hardware acceleration for reconstruction with arbitrary filter kernels. Conceptually, the method is not constrained to a certain dimensionality of the data, or to the shape of the filter kernel. In practice, the limiting factors are the number of rendering passes and the precision of the frame buffer.

Our method is quite general and can be used for many applications. With regard to volume visualization, the reconstruction of object-aligned as well as oblique slices through volumetric data is especially interesting. Reconstruction of slices can also be used for direct volume rendering.

We are exploiting commodity graphics hardware, multi-texturing, and multiple rendering passes. The number of passes is a major factor determining the resulting performance. Therefore, future hardware that supports many textures at the same time, not only 2D textures but also 3D textures, will make the application of our method more feasible for real-time use.

Since hardware that is able to combine multi-texturing with 3D textures is not yet available to us, we are still using a much slower "simulation" of our algorithm for reconstructing oblique slices. Thus, we are really looking forward to the immediate future when such hardware will be available. Still, we would also like to adapt the approach outlined by Rezk-Salama et al. [19] in order to considerably speed up the intermediate solution. We would
Figure 8: Oblique slice through vertebrae data set with screws: (a) Trilinear filter; (b) Part of (a) magnified; (c) Tricubic B-spline filter; (d)
Part of (c) magnified.
also like to experiment with additional filter kernels. Please see http://www.VRVis.at/vis/research/hq-hw-reco/ for high-resolution images and the most up-to-date information on this ongoing work.

[9] M. Meißner, U. Hoffmann, and W. Straßer. Enabling classification and shading for 3D texture mapping based volume rendering. In Proceedings of IEEE Visualization '99, pages 207–214, 1999.

[10] M. Meißner, J. Huang, D. Bartz, K. Müller, and R. Crawfis. A practical evaluation of four popular volume rendering algorithms. In Proceedings of IEEE Symposium on Volume Visualization, pages 81–90, 2000.

[23] R. Westermann and T. Ertl. Efficiently using graphics hardware in volume rendering applications. In Proceedings of SIGGRAPH '98, pages 169–178, 1998.
Figure 9: Slice from MR data set, reconstructed using different filters (right images show shaded portion of left image enlarged): (a) Bilinear
filter; (b) Blackman windowed sinc; (c) Bicubic Catmull-Rom spline; (d) Bicubic B-spline.