Video Processing, Architecture and Standard Design, Philips Consumer Electronics 620A Lorong 1, Toa Payoh, Singapore 319762 emails: Charayaphan.Charoensak@Philips, ecchara@ntu.edu.sg
ABSTRACT This paper presents an efficient FPGA hardware architecture for implementing a digital video processing algorithm that improves picture quality on display devices such as LCD/PDP panels. The algorithm performs dynamic range compression on photographic-quality input video and produces output suitable for a display panel. It is based on the Bilateral filter, a non-iterative smoothing filter that preserves edge information. The proposed architecture offers a good compromise between filtering performance and FPGA resource requirements. The architecture was prototyped in hardware using an FPGA, and the design and simulation were carried out using an FPGA system-level design approach.

Farook Sattar School of Electrical and Electronic Engineering Nanyang Technological University Nanyang Avenue, Singapore 639798 email: efsattar@ntu.edu.sg
The Bilateral filter discussed in this paper offers hardware simplification and is suitable for FPGA implementation. The paper first introduces a technique for dynamic range compression, followed by an introduction to the Bilateral filter. MATLAB simulation results of dynamic range compression are then given. Finally, the verification of the FPGA design is described and the FPGA synthesis results are reported.





There is an increasing demand for research on high-definition (HD) picture quality processing for consumer television applications. Such areas include full-HD picture quality enhancement and LCD/PDP panel picture quality improvement. Consumers expect photorealistic quality from the expensive HD television sets they purchase. Today, the digital signal processors used inside a television set often support a processing data path of up to 14 bits, allowing a very wide signal dynamic range. However, display panels often limit the signal data width to 8 to 10 bits. In addition, the optical properties of the display panel further limit the effective dynamic range. Thus, effective techniques for compressing the dynamic range of the input image for display on a typical panel are desirable. This paper presents our work on an FPGA-based hardware architecture for the implementation of dynamic range compression suitable for video applications. The algorithm uses the Bilateral filter (BF) as an edge-preserving smoothing filter.

One technique that takes a high-dynamic-range input image and compresses its contrast while preserving the details was introduced in [5][6]. The algorithm is based on a two-scale decomposition of the image into a base layer (large-scale features) and a detail layer. Only the base layer has its contrast reduced, so the visible details are preserved. The algorithm requires an edge-preserving filter, that is, a filter that blurs the small variations of a signal (noise or texture detail) but preserves the large discontinuities (edges). The Bilateral filter is commonly used for this purpose and is discussed in more detail in the next section.



The Bilateral filter was first introduced by Smith and Brady under the name “SUSAN” [3] and was later referred to as the “Bilateral filter” [1][4]. The filter replaces each pixel by a weighted average of its neighbors. The weight assigned to each neighbor decreases with both the distance in the image plane and the distance on the intensity axis. It is thus a form of adaptive weighted moving-average filter:
$$ x_l = \frac{\sum_{k \in \eta_l} y_k \, w_1(|y_l - y_k|) \, w_2(\lVert l - k \rVert)}{\sum_{k \in \eta_l} w_1(|y_l - y_k|) \, w_2(\lVert l - k \rVert)} \tag{1} $$



Here, $y_l$ and $x_l$ are the filter input and output values respectively, $l$ and $k$ are the 2-D coordinates of image pixel locations, $\eta_l$ is a neighborhood around $l$, $\lVert \cdot \rVert$ denotes the Euclidean distance, and $w_1(\cdot)$ and $w_2(\cdot)$ are weight functions. $w_1(\cdot)$ is a function of the absolute difference in brightness and $w_2(\cdot)$ is a function of the Euclidean distance. The weight functions are usually chosen as a Gaussian for $w_2(\cdot)$ and an exponential for $w_1(\cdot)$. Consider now the estimation of the original signal $x$ from a distorted signal $y = x + n$, where $n$ is uncorrelated noise. The least mean-square (LMS) estimate is given by the conditional expectation.

$$ w_2(\lVert l - k \rVert) \;\leftrightarrow\; P(\zeta \mid k) $$

This pairing, together with the one for $w_1(\cdot)$, gives the correspondence between the image estimator and the Bilateral filter. Thus, the Bilateral filter may be used as an efficient image estimator. Most implementations of the Bilateral filter use a Gaussian blur function. Its distinguishing characteristic is that it prevents blurring across edges by decreasing the weight of a pixel when its intensity difference from the centre pixel is too large. The Bilateral filter is also attractive for our application of dynamic range compression because it is non-iterative, robust, and relatively simple to realize in hardware.
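As an illustration of equation (1), the following is a minimal sketch in plain Python that evaluates the weighted average directly for every pixel. It is not the FPGA design described later. The parameter names `radius`, `sigma_s` and `tau`, their default values, and the border handling (clamping to the nearest valid pixel) are assumptions made for this example; the Gaussian form of w2 and the exponential form of w1 follow the usual choice mentioned in the text.

```python
import math

def bilateral_filter(y, radius=2, sigma_s=2.0, tau=0.1):
    """Direct evaluation of equation (1) on a 2-D list of floats.

    radius, sigma_s and tau are illustrative parameters, not values
    taken from the paper.
    """
    rows, cols = len(y), len(y[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            num = 0.0  # numerator of (1): sum of y_k * w1 * w2
            den = 0.0  # denominator of (1): sum of w1 * w2
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    # Neighbour k, clamped at the image border (assumption).
                    rk = min(max(r + dr, 0), rows - 1)
                    ck = min(max(c + dc, 0), cols - 1)
                    yk = y[rk][ck]
                    # w2: Gaussian in the spatial distance ||l - k||.
                    w2 = math.exp(-(dr * dr + dc * dc) / (2.0 * sigma_s ** 2))
                    # w1: exponential in the intensity difference |y_l - y_k|.
                    w1 = math.exp(-abs(y[r][c] - yk) / tau)
                    num += yk * w1 * w2
                    den += w1 * w2
            out[r][c] = num / den  # den >= 1 because the centre weight is 1
    return out
```

With these defaults, a unit step edge passes through almost unchanged, because neighbours on the far side of the edge receive a weight of about exp(-10), while small ripples with amplitude well below `tau` are averaged out.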

$$ x = E\{x \mid y\} \tag{2} $$


and the linear solution of this problem is the Wiener filter. Similarly, a locally adaptive Wiener filter is expressed as:
$$ x_l = r_{x_l y} \, R_{yy}^{-1} \, y \tag{3} $$


Here, the set of pixels in $y$ located around position $l$ is denoted $\zeta$. The correlation $\rho_{xy}$ is high when an observation belongs to the same structure and low when it does not. This within-structure correlation is expressed as $\rho_{xy} = (\sigma / \sigma_x)^2$, where $\sigma$ denotes the noise variance and $\sigma_x$ the signal variance. Typically, $\rho_{xy}$ is close to 1 and we may assume:

$$ r_{x_l y} = \begin{cases} \rho_{xy} & \text{if } y_k \in \zeta \\ 0 & \text{otherwise} \end{cases} \tag{4} $$


Since the observations are corrupted by noise, we introduce the probability $P(\zeta \mid y, k)$ that an observed value $y$ at location $k$ belongs to $\zeta$. If we assume constant variance and uncorrelated observations, a formulation similar to the Bilateral filter can be derived, because $R_{yy}$ and its inverse are then diagonal with constant entries. The Wiener filter may then be implemented by the conditional average:
$$ x_l = \frac{1}{K} \sum_{k \in \eta_l} y_k \, P(\zeta \mid y, k) \tag{5} $$


Fig. 1 Bilateral filter. (a) Input image showing small amplitude details (or noise) and high amplitude edge, (b) Gaussian low-pass filter function – w2, (c) intensity function – w1, (d) combined filter function derived from w1 and w2, (e) the final filter output showing the much reduced small amplitude details and the preserved large amplitude edge information.

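where $K = \sum_{k \in \eta_l} P(\zeta \mid y, k)$ is the normalization factor and $\rho_{xy}$ is constant. Applying Bayes' rule, $P(\zeta \mid y, k) = p(y \mid \zeta, k) \, P(\zeta \mid k) / p(y \mid k)$, equation (5) may be expressed in the form of the Bilateral filter:

$$ x_l = \frac{1}{K} \sum_{k \in \eta_l} y_k \, \frac{p(y \mid \zeta, k)}{p(y \mid k)} \, P(\zeta \mid k) \tag{6} $$

The intensity weight then corresponds to the likelihood ratio:

$$ w_1(|y_l - y_k|) \;\leftrightarrow\; \frac{p(y \mid \zeta, k)}{p(y \mid k)} \tag{7} $$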

The previous section described how the Bilateral filter can be used as an edge-preserving low-pass filter. This section discusses the implementation of dynamic range compression in more detail.

Our implementation of dynamic range compression is based on a multi-scale decomposition of the image [5][6]. Typically, a two-scale decomposition is used: the base image (fewer details) is computed using Bilateral filtering, and the detail image is the division of the input intensity by the base layer (please refer to Fig. 2). The base image has its dynamic range (contrast) reduced, while the magnitude of the detail image is unchanged. The final output is the summation of the two components and thus has a reduced overall dynamic range while the detail is preserved. For more details on the algorithm, please refer to the cited references.
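The two-scale decomposition above can be sketched as follows. This is a hedged illustration rather than the implemented design: it assumes the base layer has already been computed by an edge-preserving filter, and it works in the log domain, where the division that defines the detail layer becomes a subtraction and the recombination becomes a sum. The log-domain choice, the function name `compress_dynamic_range`, and the parameters `contrast` and `eps` are assumptions introduced for this example.

```python
import math

def compress_dynamic_range(intensity, base, contrast=0.5, eps=1e-6):
    """Reduce the contrast of the base layer and keep the detail layer.

    intensity, base: 2-D lists of positive floats of equal size.
    contrast: hypothetical scale factor (< 1 compresses the base layer).
    eps: small offset that avoids log(0) (assumption).
    """
    rows, cols = len(intensity), len(intensity[0])
    log_base = [[math.log(base[r][c] + eps) for c in range(cols)]
                for r in range(rows)]
    peak = max(max(row) for row in log_base)  # brightest base value
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Detail = input / base, i.e. a difference of logarithms.
            log_detail = math.log(intensity[r][c] + eps) - log_base[r][c]
            # Scale the base contrast about its peak so highlights stay put.
            log_base_c = peak + contrast * (log_base[r][c] - peak)
            # Recombine: sum in the log domain, then return to linear values.
            out_row.append(math.exp(log_base_c + log_detail))
        out.append(out_row)
    return out
```

For example, with a base layer spanning 1.0 down to 0.01 and `contrast=0.5`, the dark base value is lifted to about 0.1 while the ratio between each pixel and its base value (the detail) is left unchanged.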


Fig. 3 Dynamic range compression result reported in [5]. (a) Input image: a high-resolution image containing a bright zone (near the lamp) and very dark zones (corners), (b) Output image showing much improved lighting distribution throughout the image with details preserved.


In this section, we propose an architecture for efficient implementation of the dynamic range compression algorithm described earlier. Our goal is to maintain low hardware complexity while preserving the algorithm performance. Specifically:
Fig. 2 Two-scale decomposition. (a) Input image, (b) base image computed by performing Bilateral filtering on the input image. Notice here that the small details are smoothed out while the high-contrast edge information is preserved, (c) the detail image



Fig. 3 shows the simulation result of dynamic range compression. Note that the compression was carried out only on the luminance channel. The color image was therefore separated into two components, luminance and color. After the luminance channel was processed, the luminance and color components were converted back to a color image.
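The luminance-only processing described above can be sketched per pixel as below. The text does not state which color space was used in the simulation, so the analogue YUV definition with the common BT.601 luma weights is an assumption, as are the function names and the `luma_fn` callback that stands in for the dynamic range compression step.

```python
def rgb_to_yuv(r, g, b):
    """RGB to YUV using BT.601 luma weights (assumed color space)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Algebraic inverse of rgb_to_yuv."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

def process_colour_pixel(rgb, luma_fn):
    """Apply luma_fn to the luminance only and leave U and V untouched."""
    y, u, v = rgb_to_yuv(*rgb)
    return yuv_to_rgb(luma_fn(y), u, v)
```

Passing the identity function as `luma_fn` returns the original pixel, which gives a quick check that the forward and inverse transforms match before a real compression function is substituted.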


- It should effectively reduce the dynamic range of the image with minimal reduction in picture sharpness (details).
- It should be simple to realize in hardware.
- It should not require a frame buffer, i.e. it should be based on in-stream processing, so that the hardware is practical for real-time video post-processing using low-cost components.
- It should be able to process data at real-time video rates and thus be usable in consumer products.


This section describes the FPGA design of the prototype circuit that implements the dynamic range compression algorithm described earlier. To accelerate FPGA design and verification, the integrated system-level environment System Generator from Xilinx [2] is used. With System Generator, the FPGA design and simulation are carried out using a combination of Simulink and Xilinx blocks. The FPGA functional simulation was carried out in the MATLAB Simulink environment. After successful simulation, synthesizable VHDL code is automatically generated from the models for FPGA synthesis and implementation. The Bilateral filter described in the previous section was implemented. For the spatial weight w2(.), a Gaussian function of variance σ² = 6 and a 15x15 convolution kernel were used. Fig. 3 shows a small portion of the FPGA design, namely the Bilateral filter. Notice the sub-systems “sumval”, “sumweight”, and “15-tap BF filter”. The sub-systems “sumval” and “sumweight” perform the summations:

The result of the FPGA simulation on a gray scale image, 256x256 pixels, is shown in Fig. 4. Fig. 4a is the input image and Fig. 4b is the output. It can be seen that the output image shows improved details in the dark areas while the bright areas do not show clipping. Note that some dark details may appear noisy and thus the amount of compression may have to be adjusted based on some statistics measured on the individual image.



Fig. 4 FPGA simulation of dynamic range compression algorithm. (a) input image, (b) output image




$$ \sum_{k \in \eta_l} y_k \, w_1(|y_l - y_k|) \, w_2(\lVert l - k \rVert) \tag{9} $$

$$ \sum_{k \in \eta_l} w_1(|y_l - y_k|) \, w_2(\lVert l - k \rVert) \tag{10} $$

in equation (1) respectively. The sub-system “15-tap BF filter” is a MAC-based implementation of the 15x15 Bilateral filter.
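To make the roles of the two accumulators concrete, the sketch below models one output pixel of a 15-tap multiply-accumulate (MAC) pass in software. It is only a behavioural illustration under stated assumptions: the names `sumval` and `sumweight` and the values σ² = 6 and 15 taps come from the text, while the 8-bit weight quantisation (`FRAC_BITS`), the exponential form and scale `tau` of the intensity weight, and the one-dimensional window are hypothetical choices, not details of the generated VHDL.

```python
import math

SIGMA2 = 6.0     # spatial variance quoted in the text
TAPS = 15        # kernel width quoted in the text
FRAC_BITS = 8    # hypothetical fixed-point precision of the weight table

def spatial_weights():
    """Quantised Gaussian w2 for tap offsets -7..+7."""
    half = TAPS // 2
    scale = 1 << FRAC_BITS
    return [round(scale * math.exp(-(d * d) / (2.0 * SIGMA2)))
            for d in range(-half, half + 1)]

def mac_pixel(window, w2, tau=16.0):
    """One output pixel: 15 MAC steps followed by a single division."""
    centre = window[TAPS // 2]
    sumval = 0.0     # accumulates y_k * w1 * w2, as in summation (9)
    sumweight = 0.0  # accumulates w1 * w2, as in summation (10)
    for yk, wk in zip(window, w2):
        # Intensity weight w1, assumed exponential with scale tau.
        w1 = math.exp(-abs(centre - yk) / tau)
        sumval += yk * w1 * wk
        sumweight += w1 * wk
    return sumval / sumweight
```

Because each output pixel needs one loop iteration per tap, a single MAC unit of this kind spends 15 clock cycles per pixel, which is the source of the throughput limit discussed with the synthesis results.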

After successful simulation, the VHDL code was automatically generated from the System Generator block sets. The VHDL code was then synthesized using Xilinx ISE 8.1i, targeting the Xilinx Spartan3 family. The optimization setting was for maximum clock speed. Table 1 details the resource requirements of the design. Note that in practice, additional circuitry is needed for the input/output interface and for synchronization. Also, for color images, circuitry for color space conversion is needed. Table 2 shows the reported maximum path delay and the highest FPGA clock frequency. Because a 15-tap MAC-based structure is used for the filter, the maximum achievable pixel rate is only 72.2/15 = 4.8 million pixels/second. This is much lower than the typical rate of 13.5 MHz needed for un-scaled standard definition (SD) television (and higher for HD images). More work is needed on hardware optimization to improve data throughput for real-time application. Multiple video paths and poly-phase filters are considered for future work.
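The throughput figures above can be checked with a few lines of arithmetic. The clock frequency, tap count and SD pixel rate are taken from the text; the estimate of how many parallel filter paths would close the gap assumes ideal linear scaling and is only an illustration, not a result reported by the authors.

```python
import math

F_CLK_MHZ = 72.2      # maximum operating frequency from Table 2
MAC_CYCLES = 15       # clock cycles per output pixel for the 15-tap MAC
SD_PIXEL_MHZ = 13.5   # pixel rate quoted for un-scaled SD television

# Achievable pixel rate with a single MAC-based filter path.
pixel_rate = F_CLK_MHZ / MAC_CYCLES

# How far short of the SD requirement a single path falls.
shortfall = SD_PIXEL_MHZ / pixel_rate

# Parallel paths needed under the assumption of ideal linear scaling.
paths_needed = math.ceil(shortfall)

print(f"pixel rate: {pixel_rate:.2f} Mpixel/s")
print(f"SD requirement is {shortfall:.2f}x higher")
print(f"parallel paths needed (ideal scaling): {paths_needed}")
```

Under that idealised assumption, about three parallel paths would be required for SD video, which is in line with the multiple-video-path approach mentioned as future work.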
Table 1. Resource utilization of the FPGA design for dynamic range compression based on modified Bilateral filter

Fig. 3 More detailed circuit for the modified 7x7 BF and for the weight summation for scaling

During FPGA simulation, the gray scale image data is read sequentially from the MATLAB workspace, and the final result is written back into the workspace. After the simulation is completed, a MATLAB program plots the input image and the filtered output image for comparison.

Resource                           Count
Number of slices for logic         2,150
Number of slices for flip-flops    823
Number of 4-input LUTs             7,022

Table 2. Maximum combinational path delay and operating frequency of the FPGA design

Parameter                              Value
Maximum path delay from/to any node    13.8 ns
Maximum operating frequency            72.2 MHz



In this paper, we present an FPGA design of a dynamic range compression algorithm suitable for real-time video picture quality enhancement. The algorithm is based on separating the image into two components, base and detail. The contrast of the base component is reduced while the detail component is left unchanged. The recombined image thus exhibits improved details when displayed on panels with limited dynamic range. The Bilateral filter is used as the edge-preserving smoothing filter. The filter is suitable because it requires no iteration, is stable, and is relatively simple. The hardware implementation of the algorithm was realized using an FPGA. FPGA functional simulations were carried out to verify the performance of the proposed architecture. After successful simulation, the VHDL code for the design was generated and synthesized, and the estimated FPGA resource requirement is reported. Additional work on testing with more images, design optimization, and real-time demonstration of the system remains for future work. The FPGA design offers potential for real-time video applications. In real color television applications, the algorithm may be realized by first performing a color space transformation, from RGB to YUV or HSV for example, then performing the dynamic range compression on the luminance information, and then inverting the transformation to generate the RGB output.



[1] Elad, M., “On the origin of the bilateral filter and ways to improve it,” IEEE Transactions on Image Processing, pp. 1141-1151, 2002.
[2] Xilinx Inc., System Generator v8.1 for the MathWorks Simulink: Quick Start Guide, 2006.
[3] Smith, S.M., Brady, J.M., “SUSAN – a new approach to low level image processing,” International Journal of Computer Vision 23, pp. 45–78, 1997.
[4] Tomasi, C., Manduchi, R., “Bilateral filtering for gray and color images,” in IEEE Proc. Int. Conf. Computer Vision, pp. 839–846, 1998.
[5] Kimmel, R., Elad, M., Shaked, D., Keshet, R., and Sobel, I., “A variational framework for retinex,” International Journal of Computer Vision, no. 52, pp. 7–23, 2003.
[6] Faugeras, O.D., “Digital image color processing within the framework of a human visual system,” IEEE Trans. on ASSP, vol. 27, pp. 380–393, 1979.



