
Pre-processing of compressed digital video
C. Andrew Segall*, Passant Karunaratne and Aggelos K. Katsaggelos
Image and Video Processing Laboratory, Department of Electrical and Computer Engineering,
Northwestern University, Evanston, IL 60208

ABSTRACT
Pre-processing algorithms improve the performance of a video compression system by removing spurious noise and insignificant features from the original images. This increases compression efficiency and attenuates coding artifacts. Unfortunately, determining the appropriate amount of pre-filtering is a difficult problem, as it depends on both the content of an image and the target bit-rate of the compression algorithm. In this paper, we explore a pre-processing technique that is
loosely coupled to the quantization decisions of a rate control mechanism. This technique results in a pre-processing system
that operates directly on the Displaced Frame Difference (DFD) and is applicable to any standard-compatible compression
system. Results explore the effect of several standard filters on the DFD. An adaptive technique is then considered.

Keywords: pre-processing, image compression, adaptive filtering, displaced frame difference, DFD

1. INTRODUCTION
Digital video compression algorithms reduce the bit-rate requirements for transmitting an image sequence. This minimizes
transmission costs and facilitates a variety of everyday applications. For example, video telephony utilizes the Public
Switched Telephone Network or Internet to assist users in collaborating on projects. Similarly, digital camcorders, DVD
players, digital VCRs and time-shifting devices rely on a combination of hard drives, tapes and optical media to entertain
consumers and provide novel methods for accessing media content. Finally, wireless infrastructure provides the potential for
un-tethered video communication between mobile users.

Compression standards form the core of most compression algorithms. These standards include ITU's H.261, H.263 and
H.263+ as well as MPEG's MPEG-1, MPEG-2, MPEG-4 and MPEG-4v2 [3-8]. While each standard addresses a specific
viewing environment and transmission system, incorporating these standards into an application provides specific benefits to
service providers, manufacturers and consumers. For example, development costs are often reduced. Also, standards-based
compression algorithms allow for inter-operability between competing products. Nevertheless, it is imperative to understand
that video compression standards are not bit exact, in that only the structure of the decoder is specified. How the encoder
generates the bit-stream as well as what the decoder does with the decompressed data is completely under the control of the
system designer. These important tasks are encapsulated within pre-processing, post-processing and rate control algorithms.

Design of the auxiliary components of a compression system greatly influences the quality of decoded images, and many
variants of the rate control and post-processing algorithms have been explored in the literature [11-17, 19]. However, it is
surprising that pre-processing algorithms are rarely investigated within the specific context of digital video. Instead,
traditional noise filtering approaches are often employed [2]. In this paper, we consider the field of pre-processing for video
compression. The goal of the pre-processing algorithm is to modify the original image sequence so that image quality is
maximized for a given bit-rate. This is realized by removing noise and other insignificant image content before compression,
a task that is greatly complicated by the predictive nature of video compression. Here, we consider a novel realization for the
pre-filtering algorithm. This method operates directly on the error residuals of the predictive encoder, which delays the
smoothing decision until any motion compensation is complete.

The rest of the paper is organized as follows. In section 2, a standard-compliant video compression algorithm is introduced.
Previous approaches to the pre-processing problem are then discussed. In section 3, a general approach to the pre-processing
task is outlined. This approach operates on the predictive error residual of the encoder and requires an adaptive structure to
handle changes in coding modes. In section 4, simulations of the proposed algorithm are presented. Experiments incorporate
a variety of filtering techniques into the proposed method. Visual quality metrics and peak signal-to-noise ratios are then calculated. These measurements suggest that the pre-processing method can improve the visual quality of the underlying compression system while also increasing its coding efficiency. This is supported by visual results.

* Correspondence: Email: asegall@northwestern.edu; WWW: http://www.ece.nwu.edu/~asegall/

2. BACKGROUND
Video compression standards utilize two different methods for encoding images. In the first method, an image is said to be
intra-coded. This process starts with the current image and divides it into equally sized blocks. The Discrete Cosine
Transform (DCT) of each block is then calculated, and the transform coefficients are quantized. Quantization discards
information about the image by mapping the DCT coefficients onto a set of equally spaced values. For example, the
coefficients may be restricted to only integers that are divisible by eight. After quantization, the coefficients are transmitted
to the decoder with a sequence of variable length codes. The decoder then reconstructs each block by calculating the inverse-
DCT of the quantized coefficients, and all of the blocks are reassembled to form the decoded image.
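
To make these steps concrete, the following sketch (using NumPy and SciPy, a choice made here rather than in the paper) transforms a single block, maps its coefficients onto equally spaced values, and reconstructs the block as a decoder would. Real encoders additionally apply per-coefficient quantizer matrices and entropy-code the quantized indices.

```python
import numpy as np
from scipy.fft import dctn, idctn

def intra_code_block(block, step=8.0):
    """Illustrative intra-coding of one image block: forward DCT, uniform
    quantization onto multiples of `step`, and decoder-side inverse DCT."""
    coeffs = dctn(block.astype(float), norm='ortho')
    quantized = step * np.round(coeffs / step)   # equally spaced reconstruction levels
    return idctn(quantized, norm='ortho')

# Example: reconstruct a random 8x8 block with a coarse quantizer.
block = np.random.randint(0, 256, size=(8, 8))
reconstructed = intra_code_block(block, step=16.0)
```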

In the second method of video compression, an image is inter-coded. This technique exploits the temporal correlation
between images and increases the amount of compression. Like intra-coding, the process begins by dividing each image into
equally sized blocks. However, an estimate for each block is then found within the previously decoded images. This
estimate is simply an area within a previous image frame that contains pixel intensities similar to the current block. After
identifying the location of the best estimate, the spatial offset between the current block and the estimate is computed. This
information is called a motion vector, and it is transmitted to the decoder. The next step in an inter-coding technique is to
calculate the difference between the estimate and the current block. The error residual, or Displaced Frame Difference
(DFD), is then transformed with the DCT and quantized. At the decoder, the inverse-DCT of the quantized coefficients is
calculated and added to the pixels referenced by the motion vector.
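
A minimal sketch of this inter-coding step is given below. The exhaustive search and the sum-of-absolute-differences criterion are common choices assumed here, not details taken from the paper.

```python
import numpy as np

def motion_estimate(block, reference, top, left, radius=7):
    """Exhaustive block-matching sketch: search the reference frame around
    (top, left) for the area most similar to `block` (smallest sum of
    absolute differences) and return the motion vector and that estimate."""
    h, w = block.shape
    best_mv, best_sad = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > reference.shape[0] or x + w > reference.shape[1]:
                continue
            sad = np.abs(block.astype(int) - reference[y:y+h, x:x+w].astype(int)).sum()
            if sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    dy, dx = best_mv
    estimate = reference[top+dy:top+dy+h, left+dx:left+dx+w]
    return best_mv, estimate

# The displaced frame difference for the block is then simply
#   dfd = block.astype(int) - estimate.astype(int)
# and it is this residual that is transformed, quantized and transmitted.
```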

The general structures of the intra- and inter-coding are quite similar, and the only difference between the two techniques
appears with the motion-compensated estimate. Removing the estimate transforms the inter-frame method into the intra-
frame algorithm. In fact, most video compression systems routinely combine intra- and inter-mode encoding within a single
frame. This is accomplished by coding an image with the inter-mode algorithm but not providing estimates for all of the
blocks. For blocks without an estimate, the error residual is defined as the intensity values of the block. This error residual is
then processed with the DCT and quantized, which intra-codes some of the blocks even though the frame is compressed with
the inter-mode algorithm.

No matter the method of encoding, quantization is the fundamental step in a video compression algorithm. Increasing the
amount of quantization reduces the bit-rate and results in higher compression. Unfortunately, significant quantization also
discards a large amount of image information. This leads to visible errors in the decoded image sequence, which can be
divided into three major types of artifacts – blocking, ringing and temporal flicker. As the goal of any pre-processing
algorithm is to mitigate these visible errors, it is important to understand the origin of each type of artifact.

Independent block-by-block processing results in the first type of coding artifact. These blocking artifacts are most evident
at lower bit-rates, where the encoder removes all high-frequency information from the compressed image blocks. The effect
is a series of smooth image blocks, which makes the block boundaries quite pronounced. For example, at an extremely low
bit-rate, an encoder may transmit only the average (or DC) value of each block. The decoded image becomes piece-wise
constant, with each block taking on its average value.
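
This extreme case is easy to reproduce; the short sketch below keeps only the block averages and yields exactly the piece-wise constant appearance described above.

```python
import numpy as np

def dc_only(image, block_size=8):
    """Extreme low-rate illustration: keep only each block's average (DC)
    value, producing a piece-wise constant image."""
    out = image.astype(float).copy()
    h, w = image.shape
    for i in range(0, h - block_size + 1, block_size):
        for j in range(0, w - block_size + 1, block_size):
            out[i:i+block_size, j:j+block_size] = out[i:i+block_size, j:j+block_size].mean()
    return out
```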

A second type of artifact appears as the bit-rate increases. At these rates, significant high-frequency components are
transmitted to the decoder, while insignificant information is still removed from the compressed image. The definition of
significance that is employed by the standards leads to ringing artifacts. These artifacts appear since the encoder defines
lower-frequency information as more significant. Thus, high-frequency coefficients undergo coarser quantization. Problems
arise with this approach when there are sharp transitions in the original image. Under this condition, a wide range of
frequency information is present. However, only the lower frequency components are transmitted to the decoder. This leads
to a perceivable oscillation, or ringing, in the vicinity of the original transition. This artifact is complicated by edges that
appear in multiple frames of a sequence, where it is sometimes referred to as a mosquito or corona artifact.

The final type of artifact appears when the temporal processing of the image sequence leads to visible errors. This temporal
flicker is often attributed to the rate-control mechanism and is largely independent of the available bit-rate. In most
applications, the goal of the rate-controller is to guide the compression algorithm so that the visual quality of the compressed
image sequence is constant. This is achieved by distributing the available bits across the images in an intelligent fashion.
Unfortunately, real-time processing, buffering and transmission constraints often preclude a rate-controller from realizing this
goal. Instead, the resulting image sequence exhibits a time-varying image quality. As this quality changes along the
temporal axis, the structure of the compression algorithm may become evident. For example, the block-based structure of the
encoder may become quite visible as the quality of each block changes from frame to frame.

Reducing the severity of coding artifacts is the primary goal of the pre-processing algorithm. These techniques modify the
original image sequence before quantization and usually attempt to remove spurious noise and other insignificant features
from the original data. Removing these signal components improves the quality of decoded images, as small changes in the
original image can result in large changes in the quantized representation. Moreover, the faithful representation of any
insignificant data requires bits that could be reallocated to more important parts of the sequence.

While a pre-processing algorithm should operate on the image data before quantization, these techniques can be applied in a
variety of locations within the compression system. Perhaps the most straightforward approach is to position the pre-
processing algorithm entirely before the encoding procedure. In this location, the pre-processor operates directly on the intensity
data of the original image sequence. For example, applying a low-pass filter to the original imagery is a common method of
pre-processing, as it is well suited for removing additive noise. Of course, determining the amount of smoothing is a critical
design parameter, and heuristic methods are often employed. However, a more sound technique is to pose the pre-processing
algorithm as an operational rate-distortion problem [9]. Design of the filter then attempts to maximize the resulting image
quality, given a pre-defined bit-rate.

An alternative location for the pre-processor is inside the decoder. This may seem counterintuitive, as the decoder does
not have knowledge of the original image sequence. However, inter-mode compression requires that the encoder incorporate
a replica of the decoder. This facilitates predictive coding, as all error residuals are calculated relative to the decoded results.
Within this framework, pre-processing methods can modify the predicted information within the encoder. Of course, this
requires the decoder to apply the same operation; otherwise the error residual becomes meaningless. For standard-compliant
encoding, only two compression formats support this type of pre-filtering. In H.261, each predicted block is processed with a
low-pass filter before calculating the error residual [7]. This operation is commonly called "loop filtering". In H.263+, the
decoder filters the block boundaries of the intensity data [8], with the amount of smoothing varying relative to the
quantization parameter.

Pre-processing techniques can also incorporate knowledge of the variable length codes into the filtering decisions. In an
example of this approach, the DCT coefficients of the original image are modified [18]. The goal of the operation is still to
remove insignificant features from the compressed bit-stream. However, the definition for insignificance exploits the
structure of the variable length codes that are utilized by the encoder. Small changes in the DCT coefficient may introduce
significant savings in the bits required for the representation. If the visual impact of these changes is negligible, then the pre-
processing operation should modify the DCT coefficient.

3. APPROACH
While the goal of pre-filtering is to attenuate compression artifacts, it is plausible to argue that pre-filtering algorithms
actually replace the coding errors with a different form of distortion. As an example, many applications utilize a low-pass
filter as the pre-filtering algorithm. This filter is applied to the original images before compression, and the low-pass
operation reduces the entropy of the images and attenuates any noise. This usually results in a reduction of compression
artifacts, but it also produces encoded images that are smoother. Thus, the low-pass filter provides a trade-off between a
smooth image and one dominated by blocking, ringing and temporal flicker.

Introducing a distortion with the pre-filter is not inherently bad, since the filter also removes significant compression
artifacts. However, it is important to realize that pre-filtering is effectively a trade-off between two types of errors. When
excessive filtering is present, coding artifacts are reduced at the expense of significant pre-filtering distortions. Conversely,
when the pre-filtering is minimized, the encoded result is prone to visible compression artifacts but devoid of any errors
introduced by the pre-filter. The trade-off between these two distortions is a critical design criterion, which is controlled by
the amount of filtering applied to the image sequence.

Selecting the appropriate amount of filtering is complicated by the target bit-rate. High bit-rate applications utilize fine
quantizers, which readily reveal excessive smoothing. On the other hand, low bit-rate applications rely on coarse quantizers
that are less susceptible to filtering errors. Thus, selection of the filter parameters becomes dependent on the rate control
mechanism, as the rate control algorithm varies the bit-rate on a block by block basis. This requires an adaptive pre-filter that
changes across the image frame.

Finding the ideal parameters for a pre-filter quickly becomes an iterative procedure. Within this approach, the rate control
algorithm prescribes a target bit-rate, which then motivates an appropriate amount of pre-filtering. Unfortunately, once the
pre-filter is applied, decisions made by the rate control mechanism should be modified. This may require additional
adjustments to the pre-filtering algorithm, which leads to an iterative optimization of both the pre-filter and rate control
mechanisms.

Utilizing an iterative optimization is a realistic technique when the pre-filter and rate control methods are simple. However,
current video compression standards utilize a myriad of inter-frame coding methods, which greatly complicates the iterative
selection of the pre-filtering operator. For example, when inter-coding an image frame, any change to the pre-filtering
operator affects the selection of the motion vectors. This is a very demanding task, as it necessitates finding new estimates
for every block in the current frame. The rate control algorithm then determines the amount of quantization for each block
based on the error residual. Of course, the decision of the rate controller may motivate additional changes in the pre-filter,
which requires the determination of new motion vectors.

To facilitate an iterative approach to selecting the pre-processing parameters, we advocate a pre-filter that is located directly
before the quantization operator of the encoder. This relaxes the computational requirements for selecting parameters, as the
pre-filter no longer affects the calculation of any motion vectors. Instead, the filter operates on inter-coded blocks by
modifying the residual between the block and its estimate. This should not be confused with traditional "loop filtering"
methods, which filter the estimate before calculating the residual. The proposed method can be incorporated into a standard-
compliant video encoder. Within this framework, the pre-processing algorithm processes intra-coded and inter-coded blocks
differently. For intra-coded information, the intensity values of the block are directly modified. For inter-coded data, the
error residual is filtered.
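
The sketch below illustrates this placement under a few assumptions of our own: an 8x8 block grid, a simple averaging operator standing in for whichever filter is actually chosen, and a mask that flags intra-coded blocks so that the two modes can be smoothed with different strengths.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def prefilter_residual(residual, intra_mask, block_size=8, size_intra=3, size_inter=3):
    """Sketch of the proposed placement: the pre-filter sits just before the
    transform/quantizer and operates block by block on the residual, which
    holds intensity values for intra-coded blocks and the DFD for inter-coded
    blocks.  Smoothing is confined to each block, since neighbouring residuals
    may come from unrelated image locations or from a different coding mode."""
    out = residual.astype(float).copy()
    for bi, top in enumerate(range(0, residual.shape[0], block_size)):
        for bj, left in enumerate(range(0, residual.shape[1], block_size)):
            size = size_intra if intra_mask[bi, bj] else size_inter
            blk = out[top:top+block_size, left:left+block_size]
            out[top:top+block_size, left:left+block_size] = uniform_filter(blk, size=size)
    return out
```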

The DFD is an intriguing signal to filter, as it depends on the content of the original signal as well as the estimated values.
Effectively, this allows the filter to attenuate two types of noise with a single operation. As a first type of noise, spurious
features and additive sensor noise are introduced into the DFD by the current image. The structure of this type of noise is
markedly different from the blocking, ringing and temporal flicker artifacts that also appear in the DFD. This second type of
noise is derived from the estimated values. Filtering the DFD leads to a very efficient method for improving the video
sequence, but it is further complicated by the structure of the signal. For example, filtering across the boundaries of the error
residual is poorly motivated, as the estimates for each block may be from different image locations. Furthermore, inter-mode
compression traditionally combines both intra-coded and inter-coded blocks. In this case, smoothing between different types
of blocks is inappropriate, as the error residuals contain very different types of information.

Irrespective of any complications, filtering the DFD provides the possibility for directly coupling the pre-filter to the rate
control mechanism as well as addressing a variety of image artifacts. With these desirable traits, it is highly appropriate to
consider the effect of filtering the DFD on visual quality and compression efficiency. In the next section, we explore DFD
pre-processing with several standard approaches to image filtering. These methods include linear averaging operations, a
variety of order statistic filters and the hybrid alpha-trim operation. Results from these experiments provide insight into the
proposed pre-filtering technique. Additionally, they justify an initial coupling between the rate control algorithm and the pre-
filter. This technique is well suited for most time-critical coding applications and provides a significant motivation for future
work.

4. SIMULATIONS
In this section, we utilize a sequence of 150 CCIR601 image frames to explore the effect of filtering the DFD. Each image in
the sequence is interlaced and contains two fields with a spatial resolution of 720x240 pixels per field. The temporal
resolution of the sequence is 60 frames per second. For a video compression system, we rely on the MPEG-2 standard
operating at 6.0Mbps. The encoder utilizes the TM5 rate control algorithm and is modified to allow direct access to the DFD.
This allows several filtering approaches to be applied to the signal. Results are then visually inspected, and the effect of each filter is described. Additionally, objective measures are utilized to complement the subjective results. These measurements
include the standard peak signal-to-noise ratio (PSNR). A visual quality metric provides more reliable results.
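
For reference, the PSNR reported below is the standard definition; a one-function sketch follows.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Standard peak signal-to-noise ratio in dB between two frames."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```
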
The purpose of a visual quality metric is to incorporate properties of the human visual system into a weighted error
calculation. These weights are dependent on the psycho-visual characteristics of human perception. For example, a viewer is
relatively insensitive to the absolute value of intensity magnitudes. Instead, the differences between luminance values are the
significant observation. This is referred to as Weber's law: if an object is barely visible, then the ratio of the difference between the object and background intensities to the intensity of the background is approximately constant. As a second
example, objects become more difficult to ascertain as the background becomes cluttered. This property is often referred to
as spatial masking. In the context of video compression, this property suggests that ringing artifacts are more visible in
relatively constant image regions.

In this section, we utilize the metric proposed in [20]. In this method, an 8x8 block based DCT is first applied to the original
and test images. Visibility thresholds are then determined for each DCT coefficient by the mapping presented in [1], which
varies relative to the frequency indices of the DCT, the mean luminance of the display, and the number of pixels per degree
of visual angle. These thresholds determine the smallest perceivable DCT error between the compressed and enhanced
imagery and must be adjusted for the average block intensity, as defined in [20]. Perceptual errors are then calculated as the
difference between the original and test images at each DCT frequency divided by the visibility threshold. Values less than
one suggest that errors are not visible, while larger results express more significant artifacts.
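
A simplified sketch of this computation is given below. The `thresholds` array stands in for the luminance- and display-dependent model of [1] and the block-intensity adjustment of [20], both of which are omitted here.

```python
import numpy as np
from scipy.fft import dctn

def perceptual_errors(original, test, thresholds, block_size=8):
    """Simplified block-DCT metric sketch: the DCT difference in every block is
    divided by per-coefficient visibility thresholds (an 8x8 array), so values
    below one are nominally invisible."""
    pooled = []
    for top in range(0, original.shape[0] - block_size + 1, block_size):
        for left in range(0, original.shape[1] - block_size + 1, block_size):
            a = dctn(original[top:top+block_size, left:left+block_size].astype(float), norm='ortho')
            b = dctn(test[top:top+block_size, left:left+block_size].astype(float), norm='ortho')
            pooled.append(np.abs(a - b) / thresholds)
    pooled = np.array(pooled)
    return pooled.mean(), pooled.max()   # e.g. pooled and worst-case visibility
```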

Our first set of experiments considers an image sequence corrupted by additive Gaussian noise. The noise process is white and produces a signal-to-noise ratio of 10 dB. Several basic filter types are integrated into the pre-processing algorithm and include a 3x3 averaging operation, a 3x3 median operation and a 3x3 morphological open-close. In these experiments, only
the DFD of the luminance channel is processed. Additionally, the rate control mechanism is modified so that each frame is
encoded with the same number of bits. A visual example of the three procedures appears in Figure 1. The result without any
pre-filtering is shown in (a), while the effect of the average, median and open-close operators appear in (b), (c) and (d),
respectively. As can be seen from the figure, all of the operators reduce some of the coding artifacts introduced by the noise.
Specifically, the pre-filter removes many of the ringing artifacts present on the monument.
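
The three fixed operators used in this experiment could be realized as follows; the SciPy routines and the omission of block-boundary and mode handling are simplifications made for this sketch.

```python
import numpy as np
from scipy.ndimage import uniform_filter, median_filter, grey_opening, grey_closing

def candidate_filters(dfd, size=3):
    """The three fixed operators compared in this experiment, applied here to a
    luminance DFD array."""
    mean_out = uniform_filter(dfd.astype(float), size=size)
    median_out = median_filter(dfd, size=size)
    open_close_out = grey_closing(grey_opening(dfd, size=(size, size)), size=(size, size))
    return mean_out, median_out, open_close_out
```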

Objective measures support the visual observations. Plots of the visual quality for each experiment are shown in Figure 2.
The visual quality of the mean, median and open-close operators appear in (a), (b) and (c), respectively. In these plots, higher
values for the metric represent increasingly visible artifacts. It is important to note that the mean and median
operators produce very similar results. This agrees with our visual observations. Furthermore, the morphological filter
provides the best performance of the three operators. This can be argued from the visual examples. However, it is in stark
contrast to the PSNR measurements reported in Figure 3. In this figure, the PSNR for the mean, median and open-close
operators appear in (a), (b) and (c), respectively. In each figure, the compressed result without processing is included for
comparison. These measurements do not reflect the improvement in visual quality, which is to be expected. However, notice
the significant difference between the results for the open-close operator and the other filtering methods. The morphological
filter provides the largest decrease in PSNR, which does not correlate with our visual results.

In a second set of experiments, we considered the effect of over-smoothing the DFD. For this application, we increased the
scale of each filter to a 5x5 window. At a target bit-rate of 6Mbps, this provides an overly smooth result. Visual examples
for the experiment appear in Figure 4. The compressed result without any pre-filtering is shown in (a), while the mean,
median and open-close operators appear in (b), (c) and (d), respectively. Unlike the 3x3 case, the open-close operator
produces a very unappealing image, as it is severely plagued by blocking artifacts. This is attributed to the small size of each
image block. Comparing the median and mean operators, we see that the median filter is more successful in preserving
image content. Conversely, the mean filter produces a much smoother image. For example, the averaging operator severely
degrades the lower portion of the monument and bus.

PSNR values suggest a similar conclusion. Figure 5 shows the PSNR for the mean, median and open-close operators
in (a), (b) and (c), respectively. In all of the plots, the result of compressing the sequence without pre-processing is shown for
comparison. Once again, we see that all operators result in a decrease in PSNR. More importantly, the open-close filter
results in the most significant decrease. Unlike the visual quality results, the median and mean filters provide similar PSNR
metrics. This provides an example of how PSNR does not always predict the quality of a video sequence. Visual quality
measurements appear in Figure 6. The visual quality of the mean, median and open-close operators appear in (a), (b) and (c),
respectively. From the figure, we see that the median operator provides the best performance in terms of visual quality. The
mean filter is not quite as impressive, while the morphological operator introduces significant visible errors. This conclusion
completely agrees with our visual inspection.

As a final set of experiments with the noisy data, we explored a loose coupling between the rate control algorithm and the pre-
processing method. This technique was first proposed in [10], where it was developed for low bit-rate applications. The
motivation for utilizing the technique is that most rate control algorithms adjust the quantization parameter by calculating a
local estimate for the variance. Adapting the pre-filter with respect to this same information provides an initial approach for
coupling the pre-processing and rate control algorithms. The method is straightforward and consists of an adaptive Gaussian
kernel that varies relative to the variance within a block. The pre-filter adapts on a block by block basis and utilizes a filter
with the form

 
 
H(i, j) = \frac{1}{Z} \exp\left( -\frac{(i^2 + j^2)\,\sigma_B^2}{2\,\sigma_N^2\,\sigma_Z^2} \right),

where H(i,j) is the pre-filter, Z is a normalizing constant, σ_B is the standard deviation of the current block, σ_N is an estimate of the standard deviation of the noise, and σ_Z is a parameter that defines the maximum amount of smoothing.

Several parameters must be selected for the implementation of the adaptive procedure. In our simulations, we estimate the
variance of the noise by first calculating the variance for every block. These values are then sorted and the smallest 25% of
the values are averaged to provide the noise estimate. This is done for every image frame. The second parameter that must
be chosen limits the maximum amount of smoothing, which is expressed with the parameter σ_Z. Proper selection of this
parameter is complicated by the different coding methods employed in the MPEG-2 encoder. Thus, we allow the parameter
to vary relative to the coding modes. In this configuration, intra-coded blocks may utilize one value for σ_Z, while inter-coded
blocks could rely on a different definition for the parameter.
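
A minimal sketch of the adaptive procedure is given below, assuming the kernel as reconstructed above, an 8x8 block grid, and an illustrative kernel radius; the function names are our own and not from the paper.

```python
import numpy as np

def estimate_noise_std(frame, block_size=8):
    """Per-frame noise estimate used above: block variances are computed,
    sorted, and the smallest 25% are averaged."""
    variances = [frame[i:i+block_size, j:j+block_size].var()
                 for i in range(0, frame.shape[0] - block_size + 1, block_size)
                 for j in range(0, frame.shape[1] - block_size + 1, block_size)]
    variances = np.sort(variances)
    quarter = max(1, len(variances) // 4)
    return float(np.sqrt(variances[:quarter].mean()))

def adaptive_kernel(sigma_b, sigma_n, sigma_z=10.0, radius=2):
    """One block's adaptive Gaussian pre-filter, following the kernel written
    above (as reconstructed here).  The spread is widest, roughly sigma_z, when
    the block deviation is close to the noise estimate, and it shrinks for
    high-activity blocks so that edges are preserved."""
    i, j = np.mgrid[-radius:radius+1, -radius:radius+1]
    sigma_f = sigma_z * sigma_n / max(sigma_b, 1e-6)
    h = np.exp(-(i**2 + j**2) / (2.0 * sigma_f**2))
    return h / h.sum()   # the normalizing constant Z makes the kernel sum to one
```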

In our experiments, twenty different combinations for σ_Z are considered. The most pleasing result corresponds to a value of σ_Z = 10 for both the inter- and intra-mode processing. One image from the sequence is shown in Figure 7. Comparing the
adaptive approach to the previously discussed techniques, we see that the adaptive mechanism provides a viable pre-
processing algorithm. This is apparent around the sharp edges of the bus and monument, where the adaptive filter preserves
the image content. Additionally, the adaptive technique removes the spurious noise within the frame. This results from the
coupling between the rate control and pre-processing algorithms, as the pre-processing technique reduces smoothing when
the quantization step-size is decreased.

Objective measurements also show the improvement. The improvement in visual quality for the entire sequence is shown in
Figure 8. As can be seen from the figure, the adaptive technique provides an improvement in visual quality that is
comparable to the other techniques. The PSNR values for this pre-filter are also encouraging. As illustrated in Figure 9, the
adaptive method actually improves the PSNR of the decoded sequence. This is the only filter in the experiments that
introduces any PSNR improvements.

As a final experiment, we processed the original image sequence without adding noise. This removes the additive noise
components from the DFD, and tests the ability of the pre-filter to remove coding artifacts from the estimated images. The
visual results are presented in Figure 10. In the figure, the compressed image without any pre-filtering appears in (a). The
result of the 3x3 mean and median operators appear in (b) and (c), respectively, and the adaptive technique is presented in
(d). All filter parameters are unchanged from the previous experiments. From the figure, we see that the mean operator provides a relatively smooth image. The median operator and the adaptive technique address this problem and provide a very pleasing result. Specifically, notice that both methods remove some of the blocking artifacts on the monument and several of the severe distortions around the statue's face. More importantly, this filtering does not come at the cost of overly smooth images. For example, the sharp transition along the bus remains intact, especially with the adaptive result.

Visual quality metrics also show the advantage of a DFD pre-filter, and they appear in Figure 11. The visual quality of the
mean, median and adaptive methods are shown in (a), (b) and (c), respectively. Like our previous assessment, the visual
quality metrics illustrate the success of the DFD pre-filter in removing compression artifacts. While the techniques produce
relatively similar results, the visual quality of the median operator is deemed the best. The mean operator introduces a slight blurring in the image sequence, which translates into a larger visible error. This suggests that the pre-filtering algorithm is suitable for removing both additive noise processes and coding artifacts within the predicted estimate.

As a final result, PSNR values for the experiments are provided in Figure 12. The result of processing the sequence with the
mean, median and adaptive techniques appear in (a), (b) and (c), respectively. In all of the plots, the PSNR of the compressed
sequence without pre-processing is included for comparison. As in the previous experiments, the mean and median operators
decrease the PSNR metric. This may reflect an excessive amount of smoothing. More importantly though, the adaptive
technique is able to increase the PSNR of the decoded image sequence. This is quite impressive, as coding artifacts from
previous frames are the only noise components within the DFD. Handling this type of noise provides further motivation for
exploiting a DFD-based pre-processing algorithm, and it motivates further research on this type of filter.

5. REFERENCES
1. A.J. Ahumada and H.A. Peterson, “Luminance-Model-Based DCT Quantization for Color Image Compression,” Human
Vision, Visual Processing and Digital Display, Proceedings SPIE, vol.1913, pp.191-201, 1993.
2. J.C. Brailean, R.P. Kleihorst, S.N. Efstratiadis, A.K. Katsaggelos and R.L. Lagendijk, "Noise Reduction Filters for Dynamic Image Sequences: A Review," Proceedings of the IEEE, vol. 83, no. 9, pp. 1272-1292, Sept. 1995.
3. ISO/IEC JTC1/SC29 International Standard 11172-2, Information technology – Coding of moving pictures and
associated audio for digital storage media at up to about 1,5 Mbit/s—Part 2: Video, 1993.
4. ISO/IEC JTC1/SC29 International Standard 13818-2, Information technology – Generic coding of moving pictures and
associated audio information: Video, 1995.
5. ISO/IEC JTC1/SC29 International Standard 14496-2, Information technology – Generic coding of audio-visual objects:
Visual, 1999.
6. ISO/IEC JTC1/SC29 International Standard 14496-2 AM1, Information technology – Generic coding of audio-visual
objects: Visual, 2000.
7. ITU-T Recommendation H.261, Video Codec for Audio Visual Services at px64 kbits/s, March 1993.
8. ITU-T Recommendation H.263, Video Coding for Low Bitrate Communications, February 1998.
9. L.-J. Lin and A. Ortega, “Perceptually Based Video Rate Control Using Pre-filtering and Predicted Rate-Distortion
Characteristics”, Proceedings of the IEEE International Conference on Image Processing, pp.57-60, Santa Barbara, CA,
Oct. 26-29, 1997.
10. T. Ozcelik, J.C. Brailean, A.K. Katsaggelos, O. Erdogan and C. Auyeund, “Method and Apparatus for Spatially
Adaptive Filtering for Video Encoding,” U.S. Patent 5,764,307, June 9, 1998.
11. B. Ramamurthi and A. Gersho, “Nonlinear space-variant postprocessing of block coded images,” IEEE Transactions on
Acoustics, Speech and Signal Processing, vol.34, no.5, pp.1258-1267, Oct. 1986.
12. H.C. Reeves and J.S. Lim, “Reduction of blocking effects in image coding,” Optical Engineering, vol.23, pp.34-37, Jan.
1984.
13. R. Rosenholtz and A. Zakhor, “Iterative Procedures for Reduction of Blocking Effects in Transform Image Coding,”
IEEE Transactions on Circuits and Systems for Video Technology, vol.2, no.1, pp.91-94, Mar. 1992.
14. K. Sauer, “Enhancement of low bit-rate coded images using edge detection and estimation,” Computer Vision Graphics
and Image Processing: Graphical Models and Image Processing, vol.53, no.1, pp.52-62, Jan. 1991.
15. C.A. Segall and A.K. Katsaggelos, “Enhancement of Compressed Video using Visual Quality Metrics,” Proceedings of
the IEEE International Conference on Image Processing, Vancouver, BC, Canada, Sept. 10-13, 2000.
16. Signal Recovery Techniques for Image and Video Compression and Transmission, A. K. Katsaggelos and N. P.
Galatsanos, editors, Kluwer Academic Publishers, 1998.
17. C.-J. Tsai, P. Karunaratne, N.P. Galatsanos and A.K. Katsaggelos, "A Compressed Video Enhancement Algorithm,"
Proceedings of the International Conf. on Image Processing, Kobe, Japan, Oct. 25-28, 1999.
18. K. Ramchandran and M. Vetterli, “Rate-Distortion Optimal Fast Thresholding with Complete JPEG/MPEG Decoder
Compatibility,” IEEE Transactions on Image Processing, vol.3, no.5, pp.700-704, Sept. 1994.
19. Y. Yang, N.P. Galatsanos and A.K. Katsaggelos, “Regularized Reconstruction to Reduce Blocking Artifacts of Block
Discrete Cosine Transform Compressed Images,” IEEE Transactions on Circuits and Systems for Video Technology,
vol.3, no.6, pp.421-432, Dec. 1993.
20. A.B. Watson, “Image Data Compression having Minimum Perceptual Error,” U.S. Patent 5,426,512, 1995.
Figure 1 Visual example of pre-processing the DFD: (a) compressed result without pre-processing; (b) 3x3 averaging operation; (c) 3x3 median operation, and (d) 3x3 open-close operation. The DFD pre-filter reduces some of the ringing artifacts that are introduced by the noisy sequence.

Figure 2 Magnitude of visual errors appearing in the first experiment: (a) errors remaining
after the 3x3 averaging operation; (b) errors remaining after the 3x3 median operation, and
(c) errors remaining after the 3x3 open-close operation. Larger values for the visual quality
metric correspond to more objectionable artifacts.
Figure 3 PSNR results for the first experiment: (a) 3x3 averaging operation; (b) 3x3 median operation, and (c) 3x3 open-close operation.
The PSNR values for the compressed sequence without pre-processing are provided for comparison and denoted with the circles.

Figure 4 Visual example of pre-processing the DFD: (a) compressed result without pre-processing; (b) 5x5 averaging operation; (c) 5x5
median operation, and (d) 5x5 open-close operation. Excessive smoothing becomes evident as the size of the pre-filter is increased.

Figure 5 PSNR results for the second experiment: (a) 5x5 averaging operation; (b) 5x5 median operation, and (c) 5x5 open-close
operation. The PSNR values for the compressed sequence without pre-processing are provided for comparison and denoted with the
circles.
Figure 6 Magnitude of visual errors appearing in the second experiment: (a) errors
remaining after the 5x5 averaging operation; (b) errors remaining after the 5x5 median
operation, and (c) errors remaining after the 5x5 open-close operation. Larger values for the
visual quality metric correspond to more objectionable artifacts.

Figure 7 Visual example of pre-processing the DFD: (a) compressed result without pre-processing, and (b) the adaptive processing
technique. The adaptive method produces the most visually pleasing image sequence. Blocking and ringing artifacts are reduced, while
significant edges are also preserved.
Figure 8 Magnitude of visual errors appearing after adaptive processing.
Figure 9 PSNR result for the adaptive technique. Values for the compressed sequence are denoted with the circles.

Figure 10 Visual example of pre-processing the DFD when noise is absent: (a) compressed result without pre-processing; (b) 3x3
averaging operation; (c) 3x3 median operation, and (d) adaptive technique. Even when noise is not added to the images, the adaptive
technique reduces blocking and ringing artifacts.
Figure 11 Magnitude of visual errors appearing in the noise free experiment: (a) errors
remaining after the 3x3 averaging operation; (b) errors remaining after the 3x3 median
operation, and (c) errors remaining after the adaptive technique. Large values for the visual
quality metric correspond to more objectionable artifacts.

Figure 12 PSNR results for the noise free experiment: (a) 3x3 averaging operation; (b) 3x3 median operation, and (c) the adaptive
technique. The PSNR values for the compressed sequence without pre-processing are provided for comparison and denoted with the
circles.
