
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 16, NO. 12, DECEMBER 2007

A Fast Image Super-Resolution Algorithm
Using an Adaptive Wiener Filter
Russell Hardie, Senior Member, IEEE

Abstract—A computationally simple super-resolution (SR) algorithm using a type of adaptive Wiener filter is proposed. The algorithm produces an improved resolution image from a sequence of low-resolution (LR) video frames with overlapping field of view. The algorithm uses subpixel registration to position each LR pixel value on a common spatial grid that is referenced to the average position of the input frames. The positions of the LR pixels are not quantized to a finite grid as with some previous techniques. The output high-resolution (HR) pixels are obtained using a weighted sum of LR pixels in a local moving window. Using a statistical model, the weights for each HR pixel are designed to minimize the mean squared error, and they depend on the relative positions of the surrounding LR pixels. Thus, these weights adapt spatially and temporally to changing distributions of LR pixels due to varying motion. Both a global and a spatially varying statistical model are considered here. Since the weights adapt to the distribution of LR pixels, the algorithm is quite robust and will not become unstable when an unfavorable distribution of LR pixels is observed. For translational motion, the algorithm has a low computational complexity and may be readily suitable for real-time and/or near real-time processing applications. With other motion models, the computational complexity goes up significantly. However, regardless of the motion model, the algorithm lends itself to parallel implementation. The efficacy of the proposed algorithm is demonstrated here in a number of experimental results using simulated and real video sequences. A computational analysis is also presented.

Index Terms—Aliasing, image restoration, super resolution, undersampling, video processing.

I. INTRODUCTION

DURING acquisition, digital images are invariably degraded by a number of phenomena that limit image resolution and utility. Factors that can limit the effective resolution of an imaging system may include aliasing due to undersampling, blur, and noise. In many imaging applications, there is a desire to capture a wide field-of-view image with high spatial resolution. This generally leads to the selection of small f-number optics. For diffraction-limited optics, the spatial cutoff frequency of the image in the focal plane is inversely proportional to the f-number [1]. Thus, the detector pitch of the focal plane array (FPA), which samples the image formed by the optics, must be properly matched with the f-number of the optics [1]. In particular, small f-number optics requires an FPA with small detector pitch.

Manuscript received May 17, 2007; revised September 9, 2007. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Peyman Milanfar.
The author is with the Department of Electrical and Computer Engineering, University of Dayton, Dayton, OH 45469-0232 USA (e-mail: rhardie@udayton.edu).
Digital Object Identifier 10.1109/TIP.2007.909416

However, small detector pitch implies small detectors, and these may be starved for photons due to the limited active surface area. Furthermore, sufficiently small detectors may be costly or impractical to produce. For these reasons, many imaging systems are designed to allow some aliasing in an engineering tradeoff.
When aliasing is a significant factor, multiframe super-resolution (SR) enhancement has proven to be an effective postprocessing method to help overcome resolution limitations stemming from nonideal sampling from the FPA [2]. Multiframe SR algorithms use a set of undersampled low-resolution (LR) frames from a video sequence to produce a high-resolution (HR) image. If some type of subpixel motion exists between frames, a unique set of samples is provided by the individual frames. This allows postprocessing techniques to exploit the diversity of scene samples in multiple frames to effectively increase the sampling rate of a given FPA. The SR algorithms also typically seek to reduce blur from spatial detector integration, diffraction from the optics, and other sources.
A variety of SR algorithms have been proposed, including frequency domain approaches, iterative HR image reconstruction techniques, learning-based methods, and interpolation-restoration approaches [2]. The simplest of these SR methods, both computationally and intuitively, are the interpolation-restoration approaches [3]–[17]. In interpolation-restoration methods, registration is used to position the LR pixels from multiple frames onto a common HR grid. Since these LR pixels will generally lie nonuniformly on the common grid, a nonuniform interpolation scheme is typically employed to create uniformly spaced pixel values. This creates an image with an effectively reduced detector pitch, but the spatial integration from the original detector size remains, along with blur from other factors such as diffraction from the optics. For these reasons, the nonuniform interpolation is typically followed by a restoration method to treat the effects of the system point spread function (PSF). Most of these interpolation-restoration SR algorithms completely decouple the interpolation and restoration steps by treating them independently. However, when the motion for a given set of frames is poorly distributed, any nonuniform interpolation result will undoubtedly suffer due to aliasing, and an independent deconvolution process can exaggerate these artifacts. Therefore, integrating the nonuniform interpolation and deconvolution steps can be advantageous.
The partition weighted sum (PWS) SR algorithm in [17] is one interpolation-restoration approach that does combine the interpolation and restoration steps. However, in the PWS SR algorithm, the LR pixels are placed into a discrete HR grid by rounding off their coordinates and placing each LR pixel into the nearest HR grid position.

An offline training procedure is used to obtain the filter weights for the PWS SR filter using training images. The method in [17] also has the option to employ a bank of filters [18]–[20] tuned to different spatial structures in the image based on vector quantization. While forcing the LR pixels onto a discrete HR grid is necessary for the PWS SR filter because of how the partition statistics are estimated, it does effectively introduce a quantization error in the registration that can affect performance. Another interpolation-restoration SR approach that seeks to combine the interpolation and restoration steps is the small-kernel Wiener filter (SKWF) in [16]. This method applies a space-invariant linear filter to the nonuniform samples on the HR grid. The filter impulse response is computed using a frequency domain analysis relying on the shift property of the Fourier transform.

In this paper, we propose a new computationally simple SR algorithm using a type of adaptive Wiener filter (AWF). The proposed AWF SR method combines nonuniform interpolation and restoration into a single weighted sum operation. As with other interpolation-restoration approaches, the AWF SR method populates a common HR grid using the LR frames and the registration information. Unlike that in [17], the proposed method does not require quantization of the LR pixel coordinates to populate a discrete HR grid; here, we make use of the exact LR pixel locations provided by the registration algorithm. The AWF SR algorithm then produces the HR pixel estimates using weighted sums of neighboring LR pixels. The weights for each HR pixel are designed to minimize mean squared error based on the relative spatial locations of the LR pixels. A potentially distinct and optimum set of weights is computed for estimating each HR pixel on the HR grid, because the spatial locations of the LR pixels around each HR pixel are considered individually, allowing the filter weights to be optimized for each HR pixel independently. Thus, the filter weights adapt spatially and temporally to changing spatial distributions of LR pixels on the HR grid and to the local intensity statistics of those LR pixels. This gives the proposed system more degrees of freedom (in the form of distinct filter weights) than the spatially invariant linear filter employed in [16]. Since the weights adapt to the distribution of LR pixels, the method is quite robust and will not become unstable when an unfavorable distribution of LR pixels is observed.

In addition to avoiding the quantization of the LR pixel locations, another major difference between the proposed method and that in [17] is that, here, we use a parametric statistical model for the correlations that ultimately define the filter weights. It is this parametric model that allows the algorithm to handle nonquantized motion parameters. It also gives the system a small number of tuning parameters to control performance and eliminates the need for empirical training images and the computational burden of estimating statistics from those images. Thus, the proposed spatially adaptive algorithm is simpler computationally for both training and filtering than the vector quantization approach of the PWS SR method [17]. Another novel aspect of this work is that we apply a spatially varying statistical model in the spirit of [21] and [22] to control the SR filter weights.

Commonly employed motion models in SR algorithms are translation, rotation, and affine transformations. Some previous work has considered nonglobal motion [23] and general optical flow. However, the needed subpixel accuracy is difficult to achieve when a large number of motion parameters are being estimated with a limited number of pixels. Here, we focus primarily on translational motion, but the proposed system can also be applied to other motion models. In particular, the proposed algorithm is suitable for rotational motion when the PSF is approximately circularly symmetric, and it may also yield useful results for other types of motion. We show that, for global translational motion, the algorithm has a low computational complexity, with or without the spatially varying statistical model, and may be readily suitable for real-time and/or near real-time processing applications. With other motion models, the computational complexity goes up significantly. However, regardless of the motion model, the algorithm lends itself to a parallel implementation. Thus, it may be possible to implement the nontranslational version in real-time or near real-time using a massively parallel hardware architecture.

The remainder of the paper is organized as follows. Section II describes the observation model used as the basis for the proposed algorithm development. The proposed AWF SR algorithm is presented in Section III. An analysis of computational complexity is provided in Section IV, and experimental results are provided in Section V. Finally, conclusions are presented in Section VI.

II. OBSERVATION MODEL

The proposed SR algorithm is based on the observation model shown in Fig. 1. This observation model relates the ideal continuous scene to the observed LR pixels. The block diagram illustrates how the LR frames are formed from the desired 2-D continuous scene and how they relate to the corresponding ideal HR image. The desired scene is shown in Fig. 1 as $d(x, y)$, where $x$ and $y$ are continuous spatial coordinates. This scene first goes through a geometric transformation to account for any scene motion exhibited during the acquisition of the frames, yielding

$d_k(x, y) = \mathcal{T}_{\mathbf{s}_k}\{ d(x, y) \}$   (1)

for $k = 1, 2, \ldots, P$, where $\mathbf{s}_k$ contains the motion parameters associated with frame $k$.

Fig. 1. Observation model relating a desired 2-D continuous scene, d(x, y), to a set of corresponding LR frames.
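To make the observation model concrete, the following is a rough simulation of the Fig. 1 pipeline for one frame, in the translational case and mirroring the simulated-data setup described later in Section V (shift via interpolation, L x L moving-average detector PSF, downsampling, additive Gaussian noise). This is a sketch under those assumptions, not the author's implementation; the shift amounts and noise level are placeholders.

```python
# Rough sketch of the Fig. 1 observation model for one frame (translational
# motion, moving-average detector PSF, downsampling by L, additive noise).
import numpy as np
from scipy import ndimage

def simulate_lr_frame(hr_image, shift_hr, L=4, noise_var=100.0, rng=None):
    """Generate one LR frame from an HR image; shift_hr is (dy, dx) in HR pixels."""
    rng = rng or np.random.default_rng()
    shifted = ndimage.shift(hr_image, shift_hr, order=3, mode="nearest")
    blurred = ndimage.uniform_filter(shifted, size=L)   # L x L detector integration
    lr = blurred[::L, ::L]                              # undersample by L in each direction
    return lr + rng.normal(0.0, np.sqrt(noise_var), lr.shape)
```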

After the geometric transformation, the image is convolved with the system PSF, giving

$f_k(x, y) = d_k(x, y) * h(x, y)$   (2)

where $h(x, y)$ is the system PSF and $*$ denotes 2-D convolution. The PSF can be designed to include diffraction from the optics, spatial detector integration, and possibly other factors such as motion during integration and atmospheric turbulence. The latter two are generally time varying and are not considered here. Note that, if the system is highly undersampled, the contribution of the detector spatial integration is generally far more dominant than the diffraction component. This allows for a relatively simple PSF model determined by the shape of the active area of a detector in the FPA [1]. Modeling the system PSF for undersampled systems is addressed in [1], and we refer the reader to that paper for more details. Note that the radial cutoff frequency associated with the PSF from diffraction-limited optics with a circular exit pupil is [24]

$\rho_c = \dfrac{1}{\lambda\, f/\#}$   (3)

where $f/\#$ is the f-number of the optics and $\lambda$ is the wavelength of light considered. An essential point to make here is that the system PSF bandlimits the image in the focal plane, making it possible to sample at the Nyquist rate with finite detector spacing.

The next step in the observation model is to spatially sample the image based on the detector layout and pitch in the FPA. This gives rise to a set of sampled scene values for each frame. All of the pixel values for frame $k$ can be combined into a single vector using lexicographical notation. However, the system is likely to be undersampled with respect to the Nyquist criterion. Finally, we use an additive noise model to account for electronic and photometric noise sources. This yields the set of measured LR pixels, which can be combined into a single observed data vector $\mathbf{g}$.

Ideally, the HR image would be obtained by sampling $d(x, y)$, first bandlimited to $\rho_c$, at the Nyquist rate. Let this ideal HR image be represented in lexicographical notation with the vector $\mathbf{z}$, where $N$ is the number of HR pixels. This ideal HR image enjoys the full resolution afforded by the optics, with no aliasing, blur, or noise. It is this image we wish to estimate from the observation $\mathbf{g}$. The effective sampling rate for the estimated image is to be greater than that in the observation model in Fig. 1 by $L_x$ in the horizontal direction and $L_y$ in the vertical direction. Ideally, $L_x$ and $L_y$ would be chosen so that the new SR sampling meets or exceeds the Nyquist criterion for the cutoff frequency given in (3). In particular, with detector pitch $T$ in the FPA, the upsampling factors would need to meet the requirement

$\dfrac{L_x}{T} \ge 2\rho_c \quad \text{and} \quad \dfrac{L_y}{T} \ge 2\rho_c$   (4)

to prevent aliasing.

While the model shown in Fig. 1 is a good representation of the physical image acquisition process, a more convenient model for the following algorithm development is shown in Fig. 2. Here, the geometric transformation of the motion model is removed and replaced with a nonuniform sampling process guided by the motion parameters. This is done such that we end up with the same set of samples in $\mathbf{g}$ as those obtained by combining the outputs in Fig. 1. Note that the models in Figs. 1 and 2 are equivalent for translational motion. This is because, for translational motion, the PSF blurring and the geometric transformation commute [13] (i.e., shifting then blurring is equivalent to blurring then shifting). It can be shown that the models are also equivalent for rotational motion if the PSF is circularly symmetric. This follows because the continuous operations of rotation and convolution with a circularly symmetric PSF commute as well (i.e., rotating then blurring with a circularly symmetric PSF is equivalent to blurring then rotating). Thus, for these motion models, the motion and uniform sampling operations in Fig. 1 can be combined into a single nonuniform sampling operation, as shown in Fig. 2. In the following section, we use the observation model in Fig. 2 to develop the proposed AWF SR algorithm.

Fig. 2. Alternative observation model replacing the motion model and the combining of LR frames with a single nonuniform sampling process.

III. PROPOSED AWF SR ALGORITHM

A. System Overview

The goal of the proposed AWF SR algorithm is to use the observed LR frames to form an estimate of the HR image $\mathbf{z}$, denoted $\hat{\mathbf{z}}$. An overview of the AWF SR algorithm is provided by the block diagram in Fig. 3. The first step is to register the frames to a common grid. Here, we employ the gradient-based registration technique described in [1], [25], and [26]. To minimize the impact of noise and aliasing on the registration process, we apply a Gaussian prefilter, with a standard deviation of 1.5 LR pixel spacings, to the input images. Since this method works best for small shifts and rotations, we register each frame to the previous frame and then accumulate these. The resulting motion parameters are stored in $\mathbf{s}_k$ for $k = 1, 2, \ldots, P$. With the motion parameters, it is possible to establish the coordinates of each LR pixel in $\mathbf{g}$ on a common HR grid. The HR grid is aligned with the average frame position, which provides motion stabilization for video processing with no added computational complexity. Note that the proposed system can be used to process video sequences by using a sliding window of $P$ input frames and outputting one HR image for each group of input frames. Of course, it can also be used to generate a single HR still image from an input sequence.

Fig. 3. Overview of the proposed AWF SR algorithm.
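The registration bookkeeping just described (pairwise shifts accumulated, then referenced to the average frame position) reduces to a few lines in the translational case. The sketch below assumes the pairwise shifts have already been estimated; estimate_pairwise_shift would stand in for the gradient-based registration of [1], which is not reproduced here.

```python
# Sketch of the frame-to-frame registration bookkeeping: pairwise shifts are
# accumulated into absolute shifts, then referenced to the average frame
# position so the HR grid is aligned with the mean (stabilized) position.
import numpy as np

def accumulate_shifts(pairwise_shifts):
    """pairwise_shifts: (P-1, 2) array of frame-to-previous-frame shifts."""
    absolute = np.vstack([[0.0, 0.0], np.cumsum(pairwise_shifts, axis=0)])
    return absolute - absolute.mean(axis=0)   # reference to the average position
```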

To estimate the HR pixels, we employ a moving observation window on the HR grid, as shown in Fig. 4. This window spans $W_x$ HR pixel spacings in the horizontal direction and $W_y$ HR pixel spacings in the vertical direction. All of the LR pixels on the HR grid that lie within the span of this observation window are placed into an observation vector $\mathbf{g}_i$, where $i$ is a positional index for the window on the HR grid and $K_i$ is the number of LR pixels within the $i$th observation window. Consider the case of translational motion when $W_x$ and $W_y$ are integer multiples of $L_x$ and $L_y$, respectively. In this case, the number of LR pixels contained in each observation vector is constant and given by

$K = \dfrac{P\, W_x W_y}{L_x L_y}.$   (5)

For each observation window, we form estimates for the HR pixels within a subwindow of the main observation window. Let the size of this subwindow be denoted $D_x \times D_y$, where $D_x \le W_x$ and $D_y \le W_y$. The observation window moves across the HR grid in steps of $D_x$ and $D_y$ in the horizontal and vertical directions, respectively.

Fig. 4. HR grid showing the LR pixel locations for three frames (circles, triangles, and squares). The large box represents the span of an observation window (here shown to include 5 x 5 LR pixel spacings). The small box represents the estimation window containing a set of 6 x 6 HR pixels to estimate based on the LR pixels in the observation window.

The HR pixel estimates within the estimation subwindow are obtained using a weighted sum of the LR pixels in the observation window. This is expressed as

$\hat{\mathbf{d}}_i = \mathbf{W}_i^T \mathbf{g}_i$   (6)

where $\mathbf{W}_i$ is a $K_i \times D_x D_y$ matrix of weights and $\hat{\mathbf{d}}_i$ is the vector of HR pixel estimates. Each column of $\mathbf{W}_i$ contains the weights used for a particular HR pixel within the estimation window. This process is repeated for each position of the observation window on the HR grid, and all of these outputs are combined (i.e., mosaicked) to form an estimate of the ideally sampled image $\mathbf{z}$; this estimate is denoted $\hat{\mathbf{z}}$, as shown in Fig. 3. As will be shown in Section IV, there are computational savings associated with estimating more than one HR pixel per observation window (i.e., letting $D_x D_y > 1$).

The weights in (6) that minimize the mean squared error are found in a fashion similar to that for an FIR Wiener filter. Let $\mathbf{d}_i$ denote the vector of desired HR samples corresponding to the HR grid positions in the estimation window. It can be shown [14], [17], [18] that the weights that minimize the mean squared error are given by

$\mathbf{W}_i = \mathbf{R}_i^{-1} \mathbf{P}_i$   (7)

where $\mathbf{R}_i = E\{\mathbf{g}_i \mathbf{g}_i^T\}$ is the autocorrelation matrix for the observation vector and $\mathbf{P}_i = E\{\mathbf{g}_i \mathbf{d}_i^T\}$ is the cross correlation between the observation vector and the desired vector $\mathbf{d}_i$. Note that, unlike a standard Wiener filter, a potentially unique set of weights can be used to form each HR pixel estimate. Furthermore, these weights vary based on the spatial distribution of the LR pixels within the observation window. In [17], a similar weight matrix was determined using empirical estimates from training images where the LR pixels are forced to lie exactly on HR pixel positions on the HR grid. Here, we do not quantize the shifts and force the LR pixels into a discrete HR grid as done in [17]. This gives the current algorithm an advantage, especially for small upsampling factors where such quantization is significant. Finally, note that, before applying the weights in (7), each column in $\mathbf{W}_i$ is normalized so that it sums to 1 to ensure continuity across observation windows and to avoid potential grid pattern artifacts.
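The weight computation and application in (6)-(7), including the column normalization just described, can be sketched directly. This is a minimal illustration assuming the model matrices R and P have already been filled; it is not the paper's implementation.

```python
# Minimal sketch of (6)-(7): given model autocorrelation R (K x K) and
# cross-correlation P (K x DxDy), compute the MMSE weights and apply them to
# an observation vector g. Columns are normalized to sum to 1, as described.
import numpy as np

def awf_weights(R, P):
    W = np.linalg.solve(R, P)          # W = R^{-1} P, solved without explicit inverse
    return W / W.sum(axis=0)           # normalize each column (one column per HR pixel)

def estimate_hr_pixels(W, g):
    return W.T @ g                     # weighted sums of the LR pixels in the window
```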

B. Global Statistical Model

To begin the development of modeling $\mathbf{R}_i$ and $\mathbf{P}_i$, first let $\mathbf{f}_i$ be the noise-free version of the $i$th observation vector. In other words, $\mathbf{g}_i = \mathbf{f}_i + \mathbf{n}_i$, where $\mathbf{n}_i$ is the random noise vector associated with the samples within observation window $i$. We assume that the noise is a zero-mean uncorrelated noise vector with independent and identically distributed elements of variance $\sigma_n^2$. In this case, it is straightforward to show that the autocorrelation matrix for the observation vector is given by

$\mathbf{R}_i = E\{\mathbf{f}_i \mathbf{f}_i^T\} + \sigma_n^2 \mathbf{I}$   (8)

and the cross-correlation matrix is given by

$\mathbf{P}_i = E\{\mathbf{f}_i \mathbf{d}_i^T\}.$   (9)

Now the problem reduces to modeling the elements of $E\{\mathbf{f}_i \mathbf{f}_i^T\}$ and $E\{\mathbf{f}_i \mathbf{d}_i^T\}$. To do so, consider a wide sense stationary autocorrelation function $r_{dd}(x, y)$ for the desired image $d(x, y)$. The cross-correlation function between $d(x, y)$ and $f(x, y)$ can be expressed in terms of $r_{dd}(x, y)$ as [27]

$r_{df}(x, y) = r_{dd}(x, y) * h(x, y).$   (10)

The autocorrelation of $f(x, y)$ is given by

$r_{ff}(x, y) = r_{dd}(x, y) * h(x, y) * h(-x, -y).$   (11)

With knowledge of the positions of the LR pixels that comprise $\mathbf{g}_i$ (and $\mathbf{f}_i$), and of the horizontal and vertical distances between the LR pixels in the observation window and the HR pixels in the estimation window, one can evaluate (10) at the appropriate coordinates to fill $\mathbf{P}_i$ as defined in (9). Similarly, evaluating (11) at the displacements between all pairs of LR pixels yields $E\{\mathbf{f}_i \mathbf{f}_i^T\}$, and $\mathbf{R}_i$ in turn can be found from (8), since the noise is assumed to be independent. Note that, for translational motion, the horizontal and vertical distances between all of the samples can easily be computed. Thus, as the relative positions of the LR pixels in the observation window change across an HR grid or between HR frames, so do the autocorrelation and cross-correlation matrices. This leads to weights tuned to the specific spatial pattern of LR pixels.

The proposed framework allows for the selection of the desired image autocorrelation. This can be obtained empirically from statistically representative training images or defined using a parametric model. Here, we employ a circularly symmetric parametric autocorrelation model [28] of the form

$r_{dd}(x, y) = \sigma_d^2\, \rho^{\sqrt{x^2 + y^2}}$   (12)

where $\sigma_d^2$ is the variance of the desired image and $\rho$ is a tuning parameter that controls the decay of the autocorrelation with distance. A plot of this autocorrelation function is shown in Fig. 5 for $\sigma_d^2 = 1$ and $\rho = 0.75$. Using (12), we can evaluate (10) and (11) and then sample these continuous functions at coordinates determined by the locations of the LR pixels on the HR grid to fill $\mathbf{R}_i$ and $\mathbf{P}_i$. This eliminates the necessity for training images and gives the system a relatively small number of convenient tuning parameters (the parameters that define the correlation functions). These tuning parameters provide a level of control over the characteristics of the HR image estimate.

Fig. 5. Autocorrelation function model r_dd(x, y) for sigma_d^2 = 1 and rho = 0.75.

Note that, for translational motion, a repeating pattern of spatial locations for the LR pixels occurs on the HR grid. Thus, a weight matrix need only be obtained once for each unique spatial pattern (and not necessarily once per observation vector). The number of unique patterns ranges from 1 to $L_x L_y$, depending on the choice of $D_x$ and $D_y$, as will be addressed in Section IV. This greatly reduces the computational complexity of the algorithm.
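Filling the model matrices from the LR pixel coordinates can be sketched as follows. For brevity, the PSF convolutions in (10)-(11) are omitted here (i.e., h is treated as an impulse, so r_ff and r_df are approximated by r_dd directly); a full implementation would evaluate the PSF-convolved correlation functions, e.g., via a precomputed look-up table. All distances are in HR pixel spacings.

```python
# Sketch of filling R_i and P_i from the model in (12), with the simplifying
# assumption h = delta (the PSF convolutions of (10)-(11) are omitted).
import numpy as np

def r_dd(dx, dy, var_d=1.0, rho=0.75):
    return var_d * rho ** np.sqrt(dx**2 + dy**2)       # model (12)

def fill_R_P(lr_pos, hr_pos, var_d, rho, noise_var):
    """lr_pos: (K, 2) LR pixel coords; hr_pos: (D, 2) HR estimation coords."""
    dx = lr_pos[:, None, 0] - lr_pos[None, :, 0]
    dy = lr_pos[:, None, 1] - lr_pos[None, :, 1]
    R = r_dd(dx, dy, var_d, rho) + noise_var * np.eye(len(lr_pos))   # (8)
    dxp = lr_pos[:, None, 0] - hr_pos[None, :, 0]
    dyp = lr_pos[:, None, 1] - hr_pos[None, :, 1]
    P = r_dd(dxp, dyp, var_d, rho)                                   # (9)-(10)
    return R, P
```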
We have found that this model is representative of empirically estimated image autocorrelation functions for a wide range of natural images (including the images used here).

C. Spatially Varying Statistical Model

While we believe the method described above is very useful, it is based on the assumption that the random processes involved are wide sense stationary. Since many images do not obey this assumption, it may be advantageous in some cases to employ a desired autocorrelation model that is a function of the observation window position. In this case, in addition to adapting to the local spatial distribution of the LR pixels in each observation window, the weights also adapt to the local intensity statistics of those samples. This allows the filter to adapt to changing intensity statistics and varying spatial distributions of the LR pixels on the HR grid. Here, we fix $\rho$ in (12) and allow the local desired image variance parameter to vary with the position of the observation window. The resulting autocorrelation model can be expressed as

$r_{d_i d_i}(x, y) = \sigma_{d_i}^2\, \rho^{\sqrt{x^2 + y^2}}$   (13)

where $\sigma_{d_i}^2$ is the variance of the local region of the desired image that is responsible for producing $\mathbf{g}_i$. To make use of this spatially varying model, we must estimate the model parameter $\sigma_{d_i}^2$ for each observation vector. Given the model parameters, it can be shown that the relationship between $\sigma_{d_i}^2$ (the local desired image variance) and $\sigma_{f_i}^2$ (the variance of the corresponding PSF-degraded image) is given by

$\sigma_{f_i}^2 = c\, \sigma_{d_i}^2$   (14)

where

$c = \left.\rho^{\sqrt{x^2 + y^2}} * h(x, y) * h(-x, -y)\right|_{x=0,\, y=0}$   (15)

follows from evaluating (11) at the origin with the model in (13). Thus, if we can estimate $\sigma_{f_i}^2$, we can then use (14) to form a corresponding estimate of $\sigma_{d_i}^2$.

The filter weights incorporating the local statistics are given by

$\mathbf{W}_i = \left( \bar{\mathbf{R}} + \dfrac{\sigma_n^2}{\hat{\sigma}_{d_i}^2}\, \mathbf{I} \right)^{-1} \bar{\mathbf{P}}$   (16)

where $\hat{\sigma}_{d_i}^2$ is obtained through (14) from the sample variance estimate of the elements of the observation vector $\mathbf{g}_i$. Here, $\bar{\mathbf{R}}$ and $\bar{\mathbf{P}}$ are obtained using the model in (12) with $\sigma_d^2 = 1$, so that one can conveniently obtain the correlations for any $\sigma_{d_i}^2$ as

$E\{\mathbf{f}_i \mathbf{f}_i^T\} = \sigma_{d_i}^2\, \bar{\mathbf{R}}$   (17)

and

$\mathbf{P}_i = \sigma_{d_i}^2\, \bar{\mathbf{P}}$   (18)

respectively. Substituting these into (7) and factoring out $\sigma_{d_i}^2$ yields

$\mathbf{R}_i = \sigma_{d_i}^2 \left( \bar{\mathbf{R}} + \dfrac{\sigma_n^2}{\sigma_{d_i}^2}\, \mathbf{I} \right).$   (19)

The coefficient of the identity matrix is a noise-to-signal ratio (NSR). Thus, letting $\sigma_{d_i}^2$ vary between observation vectors allows the filter weights to adapt to a varying NSR across the HR grid.

A plot showing the empirical relationship between $\sigma_{f_i}$ and $\sigma_{d_i}$ is provided in Fig. 6 for the 8-bit aerial image in Fig. 9(a). These sample standard deviations are obtained from local windows of images derived from the aerial image and its PSF-degraded counterpart. Also shown in Fig. 6 are the linear mapping defined by (14) and a cubic polynomial curve fit of the data. Note that the linear model from (14) is fairly accurate when applied to these data, but the cubic fit is better. Thus, if statistically representative training data are available, it may be advantageous to use such an empirically based polynomial mapping rather than (14) to estimate $\sigma_{d_i}$. This also suggests that the model in (13) may not be able to fully account for the nonstationary characteristics of this image using a constant $\rho$. Attempting to estimate a spatially varying $\rho$ is one area of ongoing investigation.

Fig. 6. Scatter plot showing the empirical relationship between sigma_f and sigma_d based on the 8-bit aerial image in Fig. 9(a).

To help illustrate the impact of the NSR on the filter weights, Fig. 7(a) and (b) shows the weights for NSR = 1.0 and NSR = 0.1, respectively. In these plots, a stem is located at the position of each LR pixel on the HR grid, and the height of each stem is the corresponding weight. Common symbols correspond to LR pixels from a common frame, and the "x" symbol at (0, 0) marks the location of the HR pixel being estimated. Note that, for very noisy data (e.g., NSR = 1.0), the weights are more evenly distributed, producing a smoothing effect. In contrast, the NSR = 0.1 plot shows much more weight given to the LR samples in close proximity to the HR pixel being estimated. It is also interesting to note that, if the spatial location of any of the LR pixels were to change, the weights for all the LR pixels change in an effort to make the best use of the available samples in an MSE sense. Thus, these weights are not simply samples of a fixed underlying interpolation kernel.

Fig. 7. Weights for the case where P = 3, L_x = L_y = 4, W_x = W_y = 12, and h(x, y) = (1/16) rect(x/4, y/4), where x and y are measured in HR pixel spacings. (a) NSR = 1.0. (b) NSR = 0.1.

When a different set of weights is already being computed for each observation vector, including the spatially varying statistical model adds little computational complexity. However, for translational motion, we wish to exploit the repeating spatial pattern of LR pixels and not compute a different set of weights for each observation window. One option is to quantize $\hat{\sigma}_{d_i}^2$ to $Q$ discrete levels. This would require that only $Q$ weight matrices be computed for each unique spatial pattern. This provides a practical way to employ the spatially varying statistical model without the need to compute distinct weights for each observation window.
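The local-NSR machinery can be sketched as follows. The noise-variance subtraction in the sample-variance step, the constant c standing in for (15), and the logarithmically spaced quantization levels are all assumptions made for illustration; the paper's exact estimator and level placement are not specified here.

```python
# Sketch of the spatially varying statistics: a local variance estimate for
# each observation vector is mapped to a desired-image variance via the linear
# map of (14) (with an assumed constant c), then quantized to Q levels so only
# Q weight matrices per spatial pattern are needed.
import numpy as np

def local_nsr(g, noise_var, c=0.5, Q=20, levels=None):
    var_f = max(g.var() - noise_var, 1e-6)     # sample variance, noise removed (assumption)
    var_d = var_f / c                          # invert (14)
    if levels is None:
        levels = np.logspace(-1, 4, Q)         # assumed quantization levels
    var_d = levels[np.argmin(np.abs(levels - var_d))]
    return noise_var / var_d                   # NSR used in (16) and (19)
```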

IV. COMPUTATIONAL COMPLEXITY

In this section, we perform a computational complexity analysis of the AWF SR algorithm (excluding the registration step, which is common to most SR methods). The analysis here is based on the number of floating point multiply and add operations (flops) required to generate an HR output frame. There are two main components of the system: one is computing the appropriate filter weights, and the other is applying those weights to form the output. The main complexity controlling parameters are $P$, $L_x$, $L_y$, and the window dimensions. Filling $\mathbf{R}_i$ and $\mathbf{P}_i$ can be done using a look-up table with interpolation; these operations are neglected in the following analysis.

Let us first consider the case of translational motion where $W_x$ and $W_y$ are integer multiples of $L_x$ and $L_y$, respectively. With only translational motion between the LR frames, the HR grid exhibits a repetitive structure in the region where the LR frames overlap. When $D_x$ and $D_y$ are integer multiples of $L_x$ and $L_y$ (i.e., $D_x/L_x$ and $D_y/L_y$ are positive integers), there is only one spatial distribution of LR pixels encountered as the observation window moves across the HR grid in steps of $D_x$ horizontally and $D_y$ vertically. Otherwise, there exist multiple unique LR pixel distributions for the observation windows moving across the HR grid; as noted in Section III, the number of unique patterns ranges from 1 to $L_x L_y$. This means that only a limited number of distinct weight matrices need to be computed.

The weights can be computed using Cholesky factorization, which requires approximately $K^3/3$ flops to perform the LU decomposition of the autocorrelation matrix and approximately $2K^2$ flops to solve for each set of weights using forward and backward substitution. The number of flops associated with applying the weights to get an output HR frame is approximately $2KN$, where $K$ is given by (5) and $N$ is the total number of HR pixels to be estimated. Thus, considering only translational motion with a single LR pixel distribution, the total flop count is approximately

$C_{\mathrm{AWF}} = \dfrac{K^3}{3} + 2K^2 D_x D_y + 2KN.$   (20)

When $D_x$ and $D_y$ are not integer multiples of $L_x$ and $L_y$, respectively, the flop count becomes

$C_{\mathrm{AWF}} = U\left(\dfrac{K^3}{3} + 2K^2 D_x D_y\right) + 2KN$   (21)

where $U$ is the number of unique LR pixel distributions encountered.

In comparison, the weighted nearest neighbors (WNN) algorithm [3], [4] requires approximately $3lN$ flops for the nonuniform interpolation, where $l$ is the number of neighbors used. It then requires a 2-D FFT, $N$ multiplies, and an inverse 2-D FFT for the restoration step. Using $5N \log_2 N$ as the approximate flop count for the 2-D FFT, the total flop count for the WNN algorithm is approximately

$C_{\mathrm{WNN}} = 3lN + 6N + 10N \log_2 N.$   (22)

Fig. 8 shows the flop counts as a function of the HR image dimension for both the proposed AWF SR and the WNN algorithm using $l = 4$ neighbors. For all of the AWF SR curves, $W_x = W_y = 12$ and $L_x = L_y = 4$. Consider the AWF SR algorithm with parameters $D_x = D_y = 4$ and $Q = 1$. In this case, the AWF SR and WNN algorithms have a comparable complexity. Since the complexity of the WNN algorithm is dominated by the FFT, increasing $D_x$ and $D_y$ favors the AWF SR algorithm in this case.

Fig. 8. Flop counts as a function of the HR image dimension for both the proposed AWF SR and the WNN algorithm.

It should be noted that the computational complexity of the PWS filter implementation in [17] with $M = 1$ partition is essentially the same as that of the AWF with $Q = 1$. However, when the PWS employs multiple partitions, that method has the added burden of performing vector quantization on each of the observation vectors. The vector quantization processing used in [17] also increases the complexity of filter training by requiring the design of a vector quantization codebook. The autocorrelation and cross-correlation statistics must also be estimated empirically from training images prior to filtering using the method in [17]. The amount of training data required grows with the number of partitions (since data for each partition are required). In contrast, $Q$ for the AWF can be increased as desired without the need for training data or fear of poorly conditioned autocorrelation matrices.

When the motion between the LR frames is other than global translational motion, the regularity of the HR grid pattern may not be preserved, and the number of LR pixels in the observation window may vary. In this case, there is little one can do to control the computational complexity. In the worst case scenario, a weight matrix must be computed for every estimation window in the HR grid. That is, the Cholesky factorization of the autocorrelation matrix is performed for every estimation window, and the forward and backward substitution is carried out for every HR pixel. The total number of flops per output pixel is then approximately

$\dfrac{C}{N} = \dfrac{\bar{K}^3}{3 D_x D_y} + 2\bar{K}^2 + 2\bar{K}$   (23)

where $\bar{K}$ is the average number of LR pixels within an observation window. The number of flops is now dominated by the Cholesky factorization and is significantly higher than that for translational motion. However, each observation window can be processed independently in parallel. Thus, this algorithm may be suitable for real-time or near real-time implementation using a massively parallel hardware architecture.
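The flop-count comparison can be reproduced numerically. The expressions below follow (20) and (22) as reconstructed above, and should be treated as approximate; the default parameter values mirror those quoted for Fig. 8.

```python
# Back-of-the-envelope flop counts for the translational case, following the
# reconstructed expressions (20) and (22) above. Approximate by construction.
import numpy as np

def awf_flops(N, P, Wx=12, Wy=12, Lx=4, Ly=4, Dx=4, Dy=4, patterns=1):
    K = P * Wx * Wy // (Lx * Ly)                          # (5)
    weights = patterns * (K**3 / 3 + 2 * K**2 * Dx * Dy)  # Cholesky + substitutions
    return weights + 2 * K * N                            # plus applying the weights

def wnn_flops(N, l=4):
    return 3 * l * N + 6 * N + 10 * N * np.log2(N)        # interpolation + multiply + 2 FFTs

for dim in (256, 512, 1024):
    N = dim * dim
    print(dim, f"AWF: {awf_flops(N, P=16):.3g}", f"WNN: {wnn_flops(N):.3g}")
```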

V. EXPERIMENTAL RESULTS

In this section, we demonstrate the efficacy of the proposed AWF SR algorithm using both simulated image sequences and a real video sequence obtained with a forward looking infrared (FLIR) imager. The simulated sequences are derived from a still-frame aerial image and allow for quantitative error analysis. The FLIR sequence allows us to illustrate the performance of the proposed SR algorithm in a real application. We compare the performance of the AWF SR algorithm to that of several previously published techniques. The benchmark methods consist of the sample-and-add method with Wiener deconvolution (SA+Wiener) [12], the weighted nearest neighbors (WNN) method [3], [4], the SKWF method [16], and the PWS filter approach [17]. These methods were selected because they have a relatively low computational complexity, approximately on par with that of the proposed method. The SA+Wiener is implemented using the software in [29], and the SKWF implementation is the author's interpretation of one method described in [16]. To help put the results in context, we also include some results with the regularized least squares (RLS) image reconstruction algorithm [1]. This serves as a representative of the more computationally demanding iterative SR methods. Single-frame bicubic interpolation is used to provide a performance baseline for evaluating all of the multiframe SR methods.

A. Simulated Data

To generate the simulated video sequences, we use the 8-bit image shown in Fig. 9(a) and follow the steps in the observation model shown in Fig. 1. In particular, the HR image in Fig. 9(a) is translationally shifted by subpixel spacings using bicubic interpolation, blurred with a moving average filter of size $L_x \times L_y$, and then subsampled by $L_x = 4$ and $L_y = 4$ in the horizontal and vertical directions. The moving average filter models the detector integration for square detectors (and neglects diffraction). This is a fairly good approximation for heavily undersampled systems, where detectors are much larger than the diffraction-limited spot size from the optics [1]. A set of 16 frames is obtained in this manner using the random translational shifts shown in Fig. 10. A simulated LR frame with no noise is shown in Fig. 9(c). The same frame with additive noise of variance $\sigma_n^2 = 100$ is shown in Fig. 9(d).

Fig. 9. Images used for error analysis with simulated video sequences. (a) Test image used to create an image sequence with translational motion. (b) Training image used to estimate statistical parameters. (c) Simulated LR frame with L_x = L_y = 4 and no noise. (d) Simulated LR frame with L_x = L_y = 4 and noise of variance sigma_n^2 = 100.

Fig. 10. Translational shifts for 16 simulated LR frames.

The first experiment compares the various SR methods using different numbers of input frames, $P$. The noise variance for these results is $\sigma_n^2 = 100$. For each of the methods, there are one or more tuning parameters related to the NSR that affect system performance. To provide as fair a comparison as possible, we have attempted to optimize these parameters to get the best performance from each algorithm (in an MSE sense) for each $P$. Note that using the additive Gaussian noise variance (if known) may not lead to the lowest errors; the noise-related parameters must account for all noise sources, including that effectively produced by registration errors. Other statistical parameters, such as the polynomial fitting parameters for estimating $\sigma_{d_i}$ for the AWF and the empirical autocorrelation and cross correlation for the PWS, are estimated using the independent training image shown in Fig. 9(b). For the AWF method, we employ a window size of $W_x = W_y = 12$ and an estimation window of size $D_x = D_y = 4$. We employ the same size observation window for the PWS and SKWF methods, and the PWS uses an 8 x 8 partitioning window.
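The error metrics reported below are the standard ones, computed against the known HR truth for the simulated sequences. A straightforward sketch (border handling or any intensity alignment a careful comparison might add is omitted):

```python
# MSE and MAE between the known HR truth and an SR estimate.
import numpy as np

def mse(truth, estimate):
    return np.mean((truth.astype(float) - estimate.astype(float)) ** 2)

def mae(truth, estimate):
    return np.mean(np.abs(truth.astype(float) - estimate.astype(float)))
```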

The MSE and mean absolute error (MAE) as a function of $P$ are shown in Figs. 11 and 12, respectively. The results in Figs. 11 and 12 show that all of the methods improve with an increased number of input frames and that the errors are much lower than for single-frame interpolation. The AWF with the spatially varying statistical model consistently provides the lowest MAE and MSE. The PWS with ten partitions ($M = 10$) produces errors in this study that are higher than those for the new AWF method. We believe this can be partly attributed to the quantization of the LR frame positions to a discrete HR grid used by the PWS. This positional quantization essentially introduces another noise source that hurts performance. Also, the vector quantization used by the PWS does not perform as well with fewer frames or when the HR grid is sparsely populated. The increased error for the WNN and SA+Wiener methods may be partly attributable to how these methods perform the nonuniform interpolation and restoration in two independent steps. For the WNN, the interpolation is based only on the distance of the $l$ nearest LR pixels, using a simple inverse distance weighting. The SKWF uses a spatially invariant filter on the HR grid and does not independently optimize the weights for each HR pixel.

Fig. 11. Mean squared error as a function of the number of LR frames used for the simulated image sequence with additive noise of variance sigma_n^2 = 100.

Fig. 12. Mean absolute error as a function of the number of LR frames used for the simulated image sequence with additive noise of variance sigma_n^2 = 100.

The output images corresponding to the $P = 16$ point in Figs. 11 and 12 are shown in Fig. 13. The output obtained with single-frame bicubic interpolation is shown in Fig. 13(a). The output of the SA+Wiener method is shown in Fig. 13(b) for an NSR of 0.17. The WNN output with an NSR of 0.26 is shown in Fig. 13(c). The SKWF output is shown in Fig. 13(d) using a noise power spectral density value of 0.048 (as defined in [16]). The PWS output with $M = 10$ and a modeled noise variance of 97.14 is shown in Fig. 13(e). Finally, the AWF SR output with $Q = 20$ and a modeled noise variance of 73.37 is shown in Fig. 13(f). It is interesting to note that the image in Fig. 13(f) appears smoother than the others in the flat regions, such as on the roof tops, parking lots, and the field in the lower left. This is due to the fact that these areas have lower variance, and the AWF with $Q = 20$ does more smoothing there and less smoothing in the dynamic areas with higher variance. Weights computed for the AWF with lower NSRs will tend to do more sharpening and less smoothing. When applying the AWF with $Q = 20$ to this dataset, 33.6% of the local windows use weights computed with an NSR in (19) lower than that for the global AWF (some dramatically so). Thus, the adaptive AWF is doing more than just extra smoothing; it is also doing extra sharpening in 33.6% of the local windows.

In order to evaluate how sensitive the SR methods are to LR frame shift pattern variations, we have done a quantitative analysis using 100 independent sets of LR input frames. Each input set consists of 16 frames obtained using the same observation model as that for the results above. The spatial pattern of the LR frames varies randomly between input sets. This is important for video applications, where different groups of frames will have different motion parameters, and we are interested in how robust the SR methods are to the varying shift pattern. The parameters used for the results in Fig. 13 are applied to all 100 input sets. Here we also include the results for the PWS with $M = 1$ (global statistics) using a modeled noise variance of 152.63 and the AWF with $Q = 1$ (global statistics) using a modeled noise variance of 105.68 (as defined in (19)). We also include results for the RLS algorithm for reference using ten iterations [1].

The sample mean and standard deviation of the error metrics are shown in Table I for the SR methods. Table I shows that the AWF with $Q = 20$ not only produces the lowest MSE and MAE, but also the lowest error standard deviations. The PWS with $M = 10$ produces lower errors than the PWS with $M = 1$, but note that the error standard deviation for the PWS goes up and is 2.8 times that of the AWF with $Q = 20$. The higher error and error variance for the PWS compared with the AWF may be partly attributable to the quantization of motion parameters used by the PWS. For some input patterns, there may be little quantization effect if the shifts happen to be near integer multiples of HR pixel spacings; in other cases, the quantization effect can be more pronounced. This effect tends to increase the standard deviation of the error metrics for this approach. The results in Table I may also suggest that the vector quantization based partitioning is more sensitive to the shift pattern than the local NSR approach of the AWF, at least when a moderate to high level of noise is present.
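A harness for this kind of shift-pattern robustness study is simple to set up. The sketch below reuses simulate_lr_frame() and mse() from the earlier sketches; run_sr() is a placeholder standing in for any of the SR methods under test, and the trial count and shift distribution are assumptions matching the description above.

```python
# Monte Carlo study over random shift patterns: report the mean and standard
# deviation of the error metric across realizations.
import numpy as np

def shift_pattern_study(hr_image, run_sr, trials=100, P=16, L=4, rng=None):
    rng = rng or np.random.default_rng(0)
    errors = []
    for _ in range(trials):
        shifts = rng.uniform(0, L, size=(P, 2))            # random subpixel pattern
        frames = [simulate_lr_frame(hr_image, s, L) for s in shifts]
        errors.append(mse(hr_image, run_sr(frames, shifts)))
    return np.mean(errors), np.std(errors)
```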

Fig. 13. Output images for the simulated image sequence with P = 16. (a) Single-frame bicubic interpolation. (b) SA+Wiener. (c) WNN with l = 4. (d) SKWF. (e) PWS with M = 10. (f) AWF with Q = 20.

TABLE I. MSE AND MAE STATISTICS FOR THE VARIOUS SR ALGORITHMS OVER 100 SHIFT PATTERN REALIZATIONS.

B. FLIR Video Data

Here, we apply the various SR algorithms to video frames from a FLIR imager. These data have been provided courtesy of the Sensors Directorate at the Air Force Research Laboratories, Wright Patterson Air Force Base. The FLIR imager is equipped with a 128 x 128 Amber AE-4128 infrared FPA with Indium-Antimonide (InSb) detectors and optics. The Amber FPA has square detectors with a pitch of 0.050 mm (with 80% fill factor), yielding a sampling frequency of 20 cycles/mm. For a 4-micron wavelength, the optical cutoff frequency is 83.3 cycles/mm. Thus, the system is undersampled with respect to the Nyquist criterion. The system PSF used is based on a diffraction-limited model for the optics and the shape of the active area of the detectors, as described in [1]. Fig. 14 shows one LR frame from the sequence. A group of $P = 16$ frames is used, and each LR frame has been cropped to a size of 64 x 64. These frames have translational motion introduced by panning across a scene. The estimated translational shift parameters are shown in Fig. 15. We process the 16 frames to produce an HR image with $L_x = L_y = 5$.

Fig. 14. Single 64 x 64 frame in the FLIR image sequence.

Fig. 15. Estimated translational shifts (in HR pixel spacings) for the FLIR image sequence.

The AWF, PWS, and SKWF all employ a 15 x 15 observation window. The AWF and PWS use a 5 x 5 estimation window, and the PWS uses a 7 x 7 partitioning window. The parameters for all the SR methods are based on optimization with the simulated imagery with a noise variance of $\sigma_n^2 = 1.0$. The outputs for the AWF and the benchmark methods are shown in Fig. 16. The NSRs for the SA+Wiener and WNN are 0.080 and 0.058, respectively. The PWS result is for a modeled noise variance of 16.32, and the AWF result is for a modeled noise variance of 10.0. In the single-frame interpolation shown in Fig. 16(a), note the severe "checkerboard" aliasing artifacts on the roof of the main building. These artifacts are virtually eliminated with all of the multiframe SR methods. The AWF image in Fig. 16(f) appears to have fewer artifacts than the other images.
For each HR pixel estimate. the AWF lends itself to parallel implementation as each observation window could potentially be processed independently in parallel. This not only lowers the computational complexity for training and filtering. the complexity is among the lowest of the published SR algorithms [4]. [12]. 16(f) appears to have fewer artifacts than the other images. (e) PWS with M = 10. In particular. (c) WNN with l = 4. 1998. 3. ) will tend the AWF with spatially varying statistics (i. no vector quantization is needed.. but also provides notably better performance in the experiments conducted here. Outputs for the FLIR image sequence using P = 16 frames and L = L = 5. 247–260. pp. . and more sharpening in the dynamic portion of the image. In fact. VI. 2963 the proposed approach not only adapts to the local spatial pattern of LR pixels. For nontranslational motion. R. the framework proposed here can be applied to other motion. 37. 1. the spatially varying model reduces to the global model. for providing the FLIR video sequence used here. REFERENCES [1] R. Also. Park. the PWS SR method. E. In the quantitative performance comparison conducted here with moderate levels of noise. Hardie. Finally. If no noise is present. such as the RLS [1]. C. [2] S. ACKNOWLEDGMENT The author would like to thank the Sensors Directorate at the Air Force Research Labs. Bognar.e. We believe that the images produced by the AWF also tend to have a subjectively pleasing appearance. K.” Opt. the computational complexity of the proposed method is quite low in comparison to iterative SR algorithms.. the complexity goes up dramatically. no. is that it does not quantize the motion parameters and place the LR pixels on a discrete HR grid. G. vol. Eng. but also the lowest error standard deviation over shift pattern realizations. When considering translational motion. Armstrong. Watson. This robustness to shift pattern makes the AWF attractive for video applications. local NSRs are estimated for each observation window. J. Mag. M. and M. Barnard. Kang. as can be seen in (19). While we have focused on translational motion. Note that the adaptive statistical model is most beneficial in cases where moderate to high noise levels exist. so long as the the PSF and motion models commute [13]. Like the methods in [16] and [17]. [17] and can be readily adjusted based on several filter parameters. C.. K. Instead. no.HARDIE: FAST IMAGE SUPER-RESOLUTION ALGORITHM USING AN ADAPTIVE WIENER FILTER Fig. The AWF image in Fig. “High-resolution image reconstruction from a sequence of rotated and translated frames and its application to an infrared imaging system. E. May 2003. the AWF using the spatially varying statistical model produces not only the lowest errors (MSE and MAE). but the option of a parallel implementation remains. These artifacts are virtually eliminated with all of the multiframe SR methods. A. where a variety of interframe motion patterns may be encountered over time.” IEEE Signal Process. the severe “checkerboard” aliasing artifacts on the roof of the main building. but also to the local NSRs. The estimated HR pixels are formed as a weighted sum of neighboring LR pixels in an observation window than scans across the HR grid. the weights are optimized for the spatial pattern of LR pixels in each observation window relative to the HR pixel. and (f) AWF with Q = 20. and E. Park. (d) SKWF. Wright Patterson Air Force Base. 
Fig. 16. Outputs for the FLIR image sequence using P = 16 frames and L_x = L_y = 5. (a) Single-frame bicubic interpolation. (b) SA+Wiener. (c) WNN with l = 4. (d) SKWF. (e) PWS with M = 10. (f) AWF with Q = 20.

VI. CONCLUSION

We have proposed a novel SR algorithm based on a type of adaptive Wiener filter. Multiple frames are acquired and registered to a common continuous HR grid (without quantizing the motion parameters). The estimated HR pixels are formed as a weighted sum of neighboring LR pixels in an observation window that scans across the HR grid. For each HR pixel estimate, the weights are optimized for the spatial pattern of LR pixels in the observation window relative to that HR pixel. Here we consider both a global statistical model and a spatially varying model. When using the spatially varying statistical model, local NSRs are estimated for each observation window, allowing the AWF SR method to process the LR input frames in a spatially varying manner. In particular, the AWF with spatially varying statistics (i.e., $Q > 1$) will tend to do more smoothing in the low-variance image regions that are dominated by noise, and more sharpening in the dynamic portions of the image. Note that the adaptive statistical model is most beneficial in cases where moderate to high noise levels exist. If no noise is present, the spatially varying model reduces to the global model, as can be seen in (19).

One of the main advantages of the AWF SR algorithm over its predecessor, the PWS SR method, is that it does not quantize the motion parameters and place the LR pixels on a discrete HR grid. In addition, the proposed approach adapts not only to the local spatial pattern of LR pixels, but also to the local NSRs. Also, the filter weights for the AWF SR method are model-based and do not require empirical training images, and no vector quantization is needed. This not only lowers the computational complexity for training and filtering, but also provides notably better performance in the experiments conducted here. In the quantitative performance comparison conducted here with moderate levels of noise, the AWF using the spatially varying statistical model produces not only the lowest errors (MSE and MAE), but also the lowest error standard deviation over shift pattern realizations. This robustness to shift pattern makes the AWF attractive for video applications, where a variety of interframe motion patterns may be encountered over time. We believe that the images produced by the AWF also tend to have a subjectively pleasing appearance.

When considering translational motion, the computational complexity of the proposed method is quite low in comparison to iterative SR algorithms, such as the RLS [1]. In fact, the complexity is among the lowest of the published SR algorithms [4], [12], [16], [17] and can be readily adjusted based on several filter parameters. For nontranslational motion, the complexity goes up dramatically, but the option of a parallel implementation remains. Like the methods in [16] and [17], the AWF lends itself to parallel implementation, as each observation window could potentially be processed independently. While we have focused on translational motion, the framework proposed here can be applied to other motion models, so long as the PSF and motion models commute [13].

ACKNOWLEDGMENT

The author would like to thank the Sensors Directorate at the Air Force Research Labs, Wright Patterson Air Force Base, for providing the FLIR video sequence used here, and Dr. Ordonez for helpful conversations regarding statistical models, as well as the anonymous reviewers for providing feedback to help strengthen this paper.

REFERENCES

[1] R. C. Hardie, K. J. Barnard, J. G. Bognar, E. E. Armstrong, and E. A. Watson, "High-resolution image reconstruction from a sequence of rotated and translated frames and its application to an infrared imaging system," Opt. Eng., vol. 37, no. 1, pp. 247-260, Jan. 1998.
[2] S. C. Park, M. K. Park, and M. G. Kang, "Super-resolution image reconstruction: A technical overview," IEEE Signal Process. Mag., vol. 20, no. 3, pp. 21-36, May 2003.

[3] J. C. Gillette, T. M. Stadtmiller, and R. C. Hardie, "Reduction of aliasing in staring infrared imagers utilizing subpixel techniques," Opt. Eng., vol. 34, no. 11, pp. 3130-3137, Nov. 1995.
[4] M. S. Alam, J. G. Bognar, R. C. Hardie, and B. J. Yasuda, "Infrared image registration using multiple translationally shifted aliased video frames," IEEE Trans. Instrum. Meas., vol. 49, no. 5, pp. 915-923, Oct. 2000.
[5] S. Lertrattanapanich and N. K. Bose, "High resolution image formation from low resolution frames using Delaunay triangulation," IEEE Trans. Image Process., vol. 11, no. 12, pp. 1427-1441, Dec. 2002.
[6] K. Aizawa, T. Komatsu, and T. Saito, "Acquisition of very high resolution images using stereo camera," in Proc. SPIE Visual Communication and Image Processing, Boston, MA, Nov. 1991, pp. 318-328.
[7] K. Sauer and J. Allebach, "Iterative reconstruction of band-limited images from non-uniformly spaced samples," IEEE Trans. Circuits Syst., vol. 34, no. 12, pp. 1497-1505, Dec. 1987.
[8] A. M. Tekalp, M. K. Ozkan, and M. I. Sezan, "High resolution image reconstruction from lower-resolution image sequences and space-varying image restoration," in Proc. ICASSP, San Francisco, CA, Mar. 1992, pp. 169-172.
[9] A. J. Patti, M. I. Sezan, and A. M. Tekalp, "Superresolution video reconstruction with arbitrary sampling lattices and nonzero aperture time," IEEE Trans. Image Process., vol. 6, no. 8, pp. 1064-1076, Aug. 1997.
[10] H. Shekarforoush and R. Chellappa, "Data-driven multi-channel superresolution with application to video sequences," J. Opt. Soc. Amer. A, vol. 16, no. 3, pp. 481-492, Mar. 1999.
[11] N. Nguyen and P. Milanfar, "A wavelet-based interpolation restoration method for superresolution," Circuits, Syst., Signal Process., vol. 19, no. 4, pp. 321-338, Aug. 2000.
[12] M. Elad and Y. Hel-Or, "A fast super-resolution reconstruction algorithm for pure translational motion and common space invariant blur," IEEE Trans. Image Process., vol. 10, no. 8, pp. 1187-1193, Aug. 2001.
[13] S. Farsiu, D. Robinson, M. Elad, and P. Milanfar, "Fast and robust multi-frame super-resolution," IEEE Trans. Image Process., vol. 13, no. 10, pp. 1327-1344, Oct. 2004.
[14] M. Shao, K. E. Barner, and R. C. Hardie, "Subspace partition weighted sum filters for image restoration," IEEE Signal Process. Lett., vol. 12, no. 9, pp. 613-616, Sep. 2005.
[15] T. Q. Pham, L. J. van Vliet, and K. Schutte, "Robust fusion of irregularly sampled data using adaptive normalized convolution," EURASIP J. Appl. Signal Process., vol. 2006, Article ID 83268, pp. 1-12, 2006.
[16] J. Shi, S. E. Reichenbach, and J. D. Howe, "Small-kernel superresolution methods for microscanning imaging systems," Appl. Opt., vol. 45, no. 6, pp. 1203-1214, Feb. 2006.
[17] B. Narayanan, R. C. Hardie, K. E. Barner, and M. Shao, "A computationally efficient super-resolution algorithm for video processing using partition filters," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 5, pp. 621-634, May 2007.
[18] K. E. Barner, A. M. Sarhan, and R. C. Hardie, "Partition-based weighted sum filters for image restoration," IEEE Trans. Image Process., vol. 8, no. 5, pp. 740-745, May 1999.
[19] M. Shao and K. E. Barner, "Partition-based interpolation for image demosaicking and super-resolution reconstruction," Opt. Eng., vol. 44, pp. 107003-1-107003-14, Oct. 2005.
[20] Y. Nie and K. E. Barner, "Optimization of partition-based weighted sum filters and their application to image denoising," IEEE Trans. Image Process., vol. 15, no. 7, pp. 1900-1915, Jul. 2006.
[21] J. S. Lee, "Digital image enhancement and noise filtering by use of local statistics," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-2, no. 2, pp. 165-168, Mar. 1980.
[22] J. S. Lim, Two-Dimensional Signal and Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1990.
[23] T. R. Tuinstra and R. C. Hardie, "High resolution image reconstruction from digital video by exploitation of non-global motion," Opt. Eng., vol. 38, no. 5, May 1999.
[24] J. Goodman, Introduction to Fourier Optics. San Francisco, CA: McGraw-Hill, 1968.
[25] M. Irani and S. Peleg, "Improving resolution by image registration," CVGIP: Graph. Models Image Process., vol. 53, no. 3, pp. 231-239, May 1991.
[26] B. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," presented at the Int. Joint Conf. Artificial Intelligence, Vancouver, BC, Canada, 1981.
[27] C. W. Therrien, Discrete Random Signals and Statistical Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1992.
[28] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989.
[29] S. Farsiu, D. Robinson, and P. Milanfar, Resolution Enhancement Software. [Online]. Available: www.soe.ucsc.edu/~milanfar/SR-Software.htm, 2004.

Russell C. Hardie (S'92-M'92-SM'99) received the B.S. degree (magna cum laude) in engineering science from Loyola College, Baltimore, MD, in 1988, and the M.S. and Ph.D. degrees in electrical engineering from the University of Delaware, Newark, in 1990 and 1992, respectively. He was a Senior Scientist at Earth Satellite Corporation prior to his appointment at the University of Dayton, Dayton, OH, in 1993. He is currently a Full Professor in the Department of Electrical and Computer Engineering, University of Dayton, where he also holds a joint appointment with the Electro-Optics Program. His research interests include a wide variety of topics in the area of digital signal and image processing; his research work has focused on image enhancement and restoration, pattern recognition, and medical image processing.

Along with several collaborators, he received the Rudolf Kingslake Medal and Prize from SPIE in 1998 for work on multiframe image-resolution enhancement algorithms. In 1999, he received the School of Engineering Award of Excellence in Teaching at the University of Dayton, and he was the recipient of the first Professor of the Year Award in 2002 from the student chapter of the IEEE at the University of Dayton. He recently received the University of Dayton's top university-wide teaching award, the 2006 Alumni Award in Teaching. Dr. Hardie has served on the organization committee for the IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing; he also served as Co-Chair of a special session for that workshop in 2001. He has served as a Topical Editor for the Dekker Encyclopedia of Electro-Optics and as a Guest Editor for the EURASIP Journal on Advances in Signal Processing.