Super Resolution Image Using Dictionary Technique

Ghada M. Shaker 1, Alaa A. Hefnawy 1, Fathi E. Abd ElSamie 2 and Moawed I. Dessouky 2
Abstract: This paper addresses the problem of generating a super-resolution (SR) image from a degraded input image, using a hybrid algorithm that forms a dictionary from the low-resolution (LR) image to produce a high-resolution (HR) one. A patch-based, locally adaptive denoising method is proposed, based on clustering the given degraded image into regions of similar geometric structure. We show the effectiveness of sparsity as a regularizing prior, using as features the local weight functions derived from steering kernel regression to perform such clustering effectively. Next, we model each region (or cluster), which need not be spatially contiguous, by learning a basis that best describes the patches within that cluster. This learned basis (or dictionary) is then employed to optimally estimate the underlying pixel values in a kernel regression framework. We illustrate the overall algorithm's capabilities with several examples. The sparse K-SVD algorithm is applied to speed up sparse coding. Comparison with the sparse coding method shows that the sparse dictionary is more compact and effective.
Keywords: Super-Resolution (SR), Sparse Coding, Sparse Representation, Learning-based, Sparse Dictionary.
1. INTRODUCTION
High-resolution images are important in many fields, such as medical image diagnosis, remote sensing, and high-resolution video; one of the main methods of obtaining HR images is SR. SR image reconstruction is currently a very active area of research, as it offers the promise of overcoming some of the inherent resolution limitations of low-cost imaging sensors (cell phone or surveillance cameras), allowing better utilization of the emergent capability of high-resolution displays (e.g., high-definition LCDs) [1].

The SR task is cast as the inverse problem of recovering the original HR image by fusing multiple LR images of the same scene [2], based on assumptions or prior knowledge about the generation model from the HR image to the LR images. However, SR image reconstruction is commonly a severely ill-posed problem because of the insufficient number of LR images, ill-conditioned registration, and unknown blurring operators, so the solution of the reconstruction is not unique. Another approach assumes the LR images are warped, blurred, and down-sampled versions of the corresponding HR image; by modeling this process, the HR image can be inverted from the sequence of LR images. But some of the parameters of the model are hard to define. Another category of SR methods that can overcome this difficulty is learning-based approaches, which use a learned prior to predict the correspondence between LR and HR image patches. Nevertheless, these methods typically need enormous databases of millions of HR and LR patch pairs to make the databases expressive enough.
A learning-based super-resolution method builds an association between high- and low-resolution image patches (or patch features), and uses this association as a prior for SR reconstruction. There are many learning-based SR methods. Freeman et al. [3] assume that an image consists of three frequency components: high, middle, and low (h, m, l). The method is based on the hypothesis that the LR image results from discarding the high-frequency component of the corresponding HR one. The aim of the learning-based method is to restore the high-frequency component from the low- and middle-frequency components by maximizing the probability p(h | m, l). A method that picks the first-order and second-order gradients of the Laplacian and Gaussian pyramids of the image as its feature space may be called the second kind of learning-based super-resolution method. Hertzmann et al. [4] and Efros and Freeman [5] presented a local feature transform method; the idea of image analogies can also be used for image super-resolution, which may be called the third kind of learning-based super-resolution

1 Electronics Research Institute
2 Faculty of Eng., Menofia Univ.
JOURNAL OF COMPUTER SCIENCE AND ENGINEERING, VOLUME 17, ISSUE 2, FEBRUARY 2013
algorithm. In order to break through these limits, Yang et al. [6] presented a learning-based algorithm based on sparse coding, which effectively builds a sparse association between high- and low-resolution image patches and obtains excellent results.

Most learning-based SR methods have two steps [7]: matching K similar feature examples from a training set, then optimization estimation based on the K examples. This approach is restricted by the quality of the K candidates, which limits the freedom of the model and wastes the prior information of the other patches in the training set. During sparse dictionary coding, the training examples consist of feature patches rather than original image patches. We select first-order and second-order gradients of the LR patches as the features, the same as the methods of Chang et al. [8] and Yang et al. [9]. Rubinstein et al. [10] presented a parametric model called the sparse dictionary (SD) to balance efficiency and adaptivity, which decomposes each atom in the dictionary over a basis dictionary. This model has a simple and highly adaptive structure, and is followed by the works in [4], [10-14]. Learning-based super-resolution algorithms have been intensively studied. In this paper we suggest a framework for denoising by learning an appropriate basis function to describe image patches after applying a wavelet transform to the noisy image. The use of such basis functions to describe geometric structure has been explored earlier.
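As noted above, first- and second-order gradients of the LR patches serve as the features. A minimal sketch of such feature extraction follows; the function name is illustrative, and simple finite-difference gradients via NumPy are assumed rather than the authors' exact filters:

```python
import numpy as np

def gradient_features(patch):
    """Stack first- and second-order gradients of an LR patch into one
    feature vector (Chang/Yang-style features; illustrative sketch)."""
    gy, gx = np.gradient(patch.astype(float))  # first-order gradients
    gyy, _ = np.gradient(gy)                   # second-order, vertical
    _, gxx = np.gradient(gx)                   # second-order, horizontal
    return np.concatenate([gx.ravel(), gy.ravel(),
                           gxx.ravel(), gyy.ravel()])

patch = np.arange(25, dtype=float).reshape(5, 5)
feat = gradient_features(patch)  # 4 feature maps of 25 pixels each
```

Each patch thus yields a feature vector four times its pixel count, which is what the dictionary is trained on.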
2. SUPER-RESOLUTION FROM SPARSITY
In recent years, with the advent of high-resolution imaging devices, signal sensors have become increasingly dense in terms of the number of pixels per unit area. This means that, relatively speaking, the overall sensor is increasingly prone to the effects of noise. Hence, denoising remains an important research problem in image processing. Before we deal with the image denoising problem, we first define our observation model as [14]
$y_i = z_i + \eta_i$   (1)
where $z_i$ is the original intensity of the $i$-th pixel, observed as $y_i$ after being corrupted by zero-mean independent identically distributed additive noise $\eta_i$. Many recently introduced denoising methods are patch-based in nature [15], [16]. Hence, it is useful to formulate the observation model in terms of image patches as well. Decomposing the image into overlapping patches, we can also write the data model as [14]

$\mathbf{y}_i = \mathbf{z}_i + \boldsymbol{\eta}_i$   (2)
where $\mathbf{z}_i$ is the original image patch with the $i$-th pixel at its center, written in vectorized form, and $\mathbf{y}_i$ is the observed vectorized patch corrupted by a noise vector $\boldsymbol{\eta}_i$. Denoising the image thus amounts to solving the inverse problem of estimating the pixel intensities $\mathbf{z}_i$. Many linear and nonlinear methods have been proposed to solve this problem. A very effective method that achieves excellent denoising results is the K-SVD algorithm proposed by Aharon et al. [17], [18]. In this method, an optimal overcomplete dictionary of image patches, adapted to the observed noisy data, is first determined. Assuming that each image patch is sparsely representable, denoising is carried out by coding each patch as a linear combination of only a few atoms of the dictionary.
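The sparse-coding step just described, coding each patch over a few dictionary atoms, can be sketched with a greedy orthogonal matching pursuit. This is an illustrative stand-in assuming a given dictionary with unit-norm columns, not the K-SVD implementation itself:

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit: code y with at most
    n_nonzero atoms of dictionary D (columns assumed unit-norm)."""
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit coefficients on the current support by least squares
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = y - D @ coef
    return coef

rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
y = 2.0 * D[:, 3] - 1.5 * D[:, 10]      # a 2-sparse test signal
coef = omp(D, y, n_nonzero=2)
denoised = D @ coef                     # sparse reconstruction of the patch
```

In a full K-SVD scheme this coding step alternates with dictionary updates; here only the coding stage is shown.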
Denoising the image is thus solving the inverse problem of estimating the pixel intensities $\mathbf{z}_i$. A simple patch-based algorithm has been proposed that takes advantage of repeating structures in a given image and performs a weighted averaging of pixels with similar neighborhoods to suppress the noise. We note that denoising is a special case of the regression problem in which samples at all desired pixel locations are given, but these samples are degraded and need to be restored [13]. Kernel regression (KR) is a well-studied method in statistics and signal processing; it is a nonparametric approach that requires minimal assumptions, and hence the framework is one of the appropriate approaches to the regression problem. The steering kernel regression (SKR) method is distinguished by the way it forms the local regression weights, which can be considered a measure of similarity of a group of pixels to a certain pixel or neighborhood of pixels under consideration. The steering kernel (SK) in this particular case can be expressed as [13]:
$$w_{ij} = \frac{\sqrt{\det(\mathbf{C}_j)}}{2\pi h^2}\exp\left\{-\frac{(\mathbf{x}_i - \mathbf{x}_j)^T \mathbf{C}_j (\mathbf{x}_i - \mathbf{x}_j)}{2h^2}\right\}   (3)$$
where $w_{ij}$ describes the similarity of the $j$-th pixel with respect to the $i$-th pixel, $\mathbf{x}_i, \mathbf{x}_j \in \mathbb{R}^2$ denote the locations of the $i$-th and $j$-th pixels, respectively, and $h$ is a global smoothing parameter that controls the support of the steering kernel. The matrix $\mathbf{C}_j$ denotes the symmetric gradient covariance matrix formed from the estimated vertical and horizontal gradients at the $j$-th pixel. It can be expressed as a matrix that allows the Gaussian to align with the underlying image structure through a combination of elongation and rotation operators. Mathematically, it can be expressed in the form [13]
$\mathbf{C}_j = \gamma_j \mathbf{U}_{\theta_j} \mathbf{\Lambda}_j \mathbf{U}_{\theta_j}^T$   (4)
where $\mathbf{U}_{\theta_j}$ represents the rotation operator that aligns the Gaussian to the direction $\theta_j$ of the underlying edge, $\mathbf{\Lambda}_j$ denotes the elongation operator, and $\gamma_j$ acts as a scaling parameter, as shown in Fig. 1. It can be clearly seen how the kernels are representative of the underlying image structure.
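The steering kernel of equations (3) and (4) can be sketched as follows; the function names, and the way the covariance is built from an assumed edge angle, are illustrative:

```python
import numpy as np

def steering_weight(xi, xj, Cj, h):
    """Steering kernel weight between pixel locations xi and xj (eq. (3));
    Cj is the 2x2 gradient covariance at pixel j."""
    d = np.asarray(xi, float) - np.asarray(xj, float)
    return (np.sqrt(np.linalg.det(Cj)) / (2 * np.pi * h**2)
            * np.exp(-d @ Cj @ d / (2 * h**2)))

def covariance_from_angle(theta, elong, gamma):
    """Build Cj = gamma * U Lambda U^T (eq. (4)) from an edge angle,
    an elongation ratio, and a scaling parameter (all illustrative)."""
    c, s = np.cos(theta), np.sin(theta)
    U = np.array([[c, -s], [s, c]])            # rotation operator
    Lam = np.diag([elong, 1.0 / elong])        # elongation operator
    return gamma * U @ Lam @ U.T

# edge oriented at 45 degrees: covariance is large in the gradient
# direction, so weights decay quickly across the edge
Cj = covariance_from_angle(np.pi / 4, elong=4.0, gamma=1.0)
w_edge = steering_weight((0, 0), (1, -1), Cj, h=2.0)  # along the edge
w_grad = steering_weight((0, 0), (1, 1), Cj, h=2.0)   # across the edge
```

Because the weight decays slowly along the edge and quickly across it, the effective kernel footprint elongates along the local image structure.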
Additionally, it can be seen that different locations in the image having different intensities but similar underlying structure still result in similarly shaped kernels. In the SKR method the data is modeled as locally polynomial, where the image is assumed to be sufficiently smooth (locally) to allow fitting of a polynomial of some low degree (usually 0, 1, or 2). We can then rewrite the data model of equation (2) as [14]:

$\mathbf{y}_i = \mathbf{\Phi}\boldsymbol{\beta}_i + \boldsymbol{\eta}_i$   (5)

where the dictionary $\mathbf{\Phi}$ is a matrix whose columns are formed from polynomial basis vectors and $\boldsymbol{\beta}_i$ is the vector of coefficients. This paper presents a novel super-resolution algorithm based on a sparse dictionary, which builds a sparse association between image feature patches to guide super-resolution reconstruction.
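The polynomial dictionary $\mathbf{\Phi}$ of equation (5) can be sketched by evaluating monomials of the patch coordinates up to a chosen order; the construction below is a plausible instance, not necessarily the authors' exact basis:

```python
import numpy as np

def poly_basis(patch_size, order=2):
    """Columns of Phi: monomials of centered pixel coordinates up to
    `order`, evaluated over a patch_size x patch_size grid."""
    r = np.arange(patch_size) - patch_size // 2
    yy, xx = np.meshgrid(r, r, indexing="ij")
    cols = []
    for total in range(order + 1):       # polynomial degree 0..order
        for p in range(total + 1):
            q = total - p
            cols.append((xx**p * yy**q).ravel().astype(float))
    return np.column_stack(cols)

# second-order regression on a 5x5 patch: atoms 1, x, y, x^2, xy, y^2
Phi = poly_basis(5, order=2)             # 25 x 6 basis matrix
```

For order 2 there are six basis vectors (1, x, y, x², xy, y²), so every 25-pixel patch is summarized by six regression coefficients.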
3. PROPOSED METHOD
The weights follow edge directions and dictate the contribution of various pixels in a local neighborhood of the pixel to be denoised. However, this regression framework has two inherent restrictions: the basis functions remain the same (polynomial) over the entire image, and the order of regression is constant for the entire image. These drawbacks force both smooth and textured regions of any image to be reconstructed using the same basis vectors (and, hence, the same order of regression). Our proposed method aims to alleviate these problems by using a regression in which both the type and number of basis vectors are dictated by the given image data. Second-order regression generally leads to improved denoising performance compared to lower orders.
More specifically, once the parameters are estimated, one can proceed to reconstruct a vectorized version of each patch in the image as

$\hat{\mathbf{z}}_i^{(k)} = \mathbf{\Phi}^{(k)}\left(\mathbf{\Phi}^{(k)T}\mathbf{\Phi}^{(k)}\right)^{-1}\mathbf{\Phi}^{(k)T}\mathbf{y}_i$   (8)
This solution is then plugged into (7) and we reformulate the problem as a minimization with respect to $\mathbf{\Phi}^{(k)}$ alone. To simplify the problem further, we enforce the dictionary to be orthonormal, transforming the optimization problem of (7) into [14]:

$\hat{\mathbf{\Phi}}^{(k)} = \arg\min_{\mathbf{\Phi}^{(k)}} \sum_{i\in\Omega_k}\left\|\mathbf{y}_i - \mathbf{\Phi}^{(k)}\mathbf{\Phi}^{(k)T}\mathbf{y}_i\right\|^2$   (9)

Once the dictionary and the coefficient parameters are estimated, we can reconstruct the target patch as:
$\hat{\mathbf{z}}_i = \beta_{i1}^{(k)}\boldsymbol{\phi}_1^{(k)} + \beta_{i2}^{(k)}\boldsymbol{\phi}_2^{(k)} + \cdots + \beta_{im_k}^{(k)}\boldsymbol{\phi}_{m_k}^{(k)} = \mathbf{\Phi}^{(k)}\boldsymbol{\beta}_i^{(k)}, \quad i \in \Omega_k$   (10)
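Under the orthonormality constraint, the minimizer of a criterion like (9) is given by the leading principal components of the cluster's patches, and (10) then expands each patch over the learned atoms. A sketch under these assumptions, with illustrative data:

```python
import numpy as np

def learn_cluster_dictionary(Y, m):
    """Orthonormal dictionary for one cluster: the top-m principal
    directions of the patch matrix Y (patches as columns)."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :m]                      # columns are orthonormal atoms

def reconstruct_patch(Phi, y):
    """Eqs. (8)-(10) with an orthonormal dictionary: coefficients are
    inner products, and the patch estimate is their expansion."""
    beta = Phi.T @ y                     # beta_i^(k)
    return Phi @ beta                    # z_hat = sum_m beta_m * phi_m

rng = np.random.default_rng(1)
atoms = np.linalg.qr(rng.standard_normal((25, 3)))[0]  # true 3-dim subspace
Y = atoms @ rng.standard_normal((3, 40))               # 40 clean patches
Phi = learn_cluster_dictionary(Y, m=3)
z_hat = reconstruct_patch(Phi, Y[:, 0])
```

Since the synthetic patches lie exactly in a three-dimensional subspace, the three-atom dictionary reconstructs them exactly; on noisy data the truncation to m atoms is what suppresses the noise.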
The patches thus estimated are overlapping, so we must optimally combine the overlapping regions in some way to form the final image. However, since the parameters are estimated taking into account the local weights, the pixels in each of the estimated patches have high confidence in regions where the local weights are high. As a result, the patch reconstruction of (10) is more accurate along the edge directions and towards the center of the patch under consideration (that is, wherever the local weights are strong).
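The aggregation of overlapping patch estimates described above can be sketched as a weight-normalized accumulation; the per-pixel weights used here are illustrative stand-ins for the local steering weights:

```python
import numpy as np

def combine_patches(patches, weights, positions, image_shape, patch_size):
    """Weighted averaging of overlapping patch estimates: each pixel's
    final value is the weight-normalized sum of all patch estimates
    covering it (illustrative sketch of the aggregation step)."""
    acc = np.zeros(image_shape)
    wsum = np.zeros(image_shape)
    p = patch_size
    for patch, w, (r, c) in zip(patches, weights, positions):
        acc[r:r + p, c:c + p] += w * patch   # weighted contributions
        wsum[r:r + p, c:c + p] += w          # accumulated weights
    return acc / np.maximum(wsum, 1e-12)     # normalize per pixel

# two overlapping 2x2 constant patches on a 2x3 image
patches = [np.full((2, 2), 1.0), np.full((2, 2), 3.0)]
weights = [np.ones((2, 2)), np.ones((2, 2))]
positions = [(0, 0), (0, 1)]
img = combine_patches(patches, weights, positions, (2, 3), 2)
```

The middle column is covered by both patches and averages their values; non-uniform weights would bias each pixel toward the patch estimate in which it has the higher confidence.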
4. Simulation Results

To validate the proposed method, we performed various experiments. We took some images collected from [13], downsized them to half of the original size, and artificially added zero-mean white Gaussian noise of different standard deviations to produce noisy images. The parameters that can be tuned for our method are the number of clusters (K) for the clustering stage and the smoothing parameter (H).
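The degradation used in these experiments, zero-mean white Gaussian noise at several standard deviations, can be reproduced along the following lines (image values and seed are illustrative):

```python
import numpy as np

def make_noisy(image, sigma, seed=0):
    """Add zero-mean white Gaussian noise of standard deviation sigma,
    the degradation applied to the test images (sketch)."""
    rng = np.random.default_rng(seed)
    return image + rng.normal(0.0, sigma, size=image.shape)

clean = np.full((64, 64), 128.0)
for sigma in (5, 10, 15, 20, 25):      # the noise levels tested
    noisy = make_noisy(clean, sigma, seed=sigma)
```

Each noise level is generated independently so that results at different standard deviations are not correlated.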
For the parrot and man images shown in Table 1,
the method was found to give the best K for each standard deviation (from 5 to 25) when the image was divided into different numbers of clusters (from 1 to 13) according to each case. For each case we calculate the MSE and PSNR values of the image produced by our technique.
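The MSE and PSNR figures reported in the tables can be computed as follows, assuming 8-bit images with peak value 255:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images."""
    return float(np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    return 10.0 * np.log10(peak**2 / mse(a, b))

ref = np.zeros((4, 4))
est = np.full((4, 4), 10.0)   # constant error of 10 gray levels
```

A constant error of 10 gray levels gives an MSE of 100 and a PSNR of about 28.1 dB, in the same range as the table entries for moderate noise.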
Figure 3 illustrates how the MSE varies with K for additive white Gaussian noise with standard deviation varying from 5 to 25, according to Table 1. We vary K from 1 to 13 and, for each case, calculate the MSE of the image produced by our technique. We iterate this process with different values of the standard deviation (σ) from 5 to 25 in steps of 5. The top curve corresponds to the parrot image, where the optimal K (giving the least MSE) is (3, 7, 8, 13, 2) for σ = (5, 10, 15, 20, 25), respectively, and the lower curve corresponds to the man image, where the optimal K is (1, 1, 2, 2, 2) for σ = (5, 10, 15, 20, 25), respectively. The optimal K values are circled in Figure 3.
Table 2 shows the optimized values of H (smoothing parameter) for the parrot and man images with different levels of noise, and compares the MSE values of the noisy image with the denoised image produced by the proposed method for different cases of K and standard deviation.

For the parrot image shown in Fig. 4, the method was found to give the best results when the image (with σ = 25) was divided into 2 clusters with H = 3.2; the figure shows the different levels of quality. For the man image shown in Fig. 5, the method was found to give the best results when the image (with σ = 25) was divided into 2 clusters with H = 2.8; the figure shows the different levels of quality.
Fig. 3. Illustration of how the MSE varies with K for additive white Gaussian noise with standard deviation varying from 5 to 25.
Fig. 5. Comparison of denoising results on the noisy man image corrupted by additive white Gaussian noise of standard deviation 25. (a) Original image, (b) noisy image (MSE = 567.98), (c) proposed method (MSE = 160.62).
Fig. 4. Comparison of denoising results on the noisy parrot image corrupted by additive white Gaussian noise of standard deviation 25. (a) Original image, (b) noisy image (MSE = 571.29), (c) proposed method (MSE = 109.97).
Table 1: MSE and PSNR values of the parrot and man images for each noise standard deviation σ and number of clusters K (columns: K = 1 to 13).

PARROT
 σ=5   MSE   15.24  13.69  13.48  13.56  13.53  14.81  14.88  14.02   14.05   13.65  13.51  13.52  14.39
       PSNR  36.30  34.16  36.84  36.81  36.82  36.43  36.41  36.66   36.66   36.78  36.83  36.82  36.55
 σ=10  MSE   38.89  37.27  37.02  37.43  35.59  35.36  34.73  35.02   35.05   35.28  35.08  34.76  34.90
       PSNR  32.23  32.41  32.44  32.39  32.62  32.65  32.72  32.69   32.69   32.66  32.68  32.72  32.70
 σ=15  MSE   61.32  62.41  62.57  61.36  61.01  60.59  59.96  59.25   60.28   59.59  59.05  60.36  59.86
       PSNR  30.25  30.18  30.17  30.26  30.28  30.31  30.35  30.40   30.33   30.38  30.42  30.32  30.36
 σ=20  MSE   88.35  84.72  86.02  86.05  85.63  84.71  85.64  86.37   84.19   85.38  84.92  85.52  83.47
       PSNR  28.67  28.85  28.78  28.78  28.80  28.85  28.80  28.767  28.878  28.82  28.84  28.81  28.92
 σ=25  MSE   112.3  109.84 113.21 112.95 113.34 114.73 112.20 112.46  111.72  111.49 111.2  111.17 112.16
       PSNR  27.64  27.72  27.59  27.60  27.59  27.53  27.63  27.62   27.65   27.66  27.67  27.67  27.632

MAN
 σ=5   MSE   18.37  18.76  18.92  18.60  18.42  20.87  21.57  21.64   20.85   20.69  22.00  20.18  19.98
       PSNR  35.49  35.39  35.36  35.44  35.48  34.94  34.79  34.78   34.94   34.97  34.71  35.08  35.13
 σ=10  MSE   49.73  50.23  51.07  54.05  53.68  53.93  52.70  52.46   52.29   52.04  52.51  51.92  51.27
       PSNR  31.16  31.12  31.05  30.80  30.83  30.81  30.91  30.93   30.95   30.97  30.93  30.98  31.03
 σ=15  MSE   95.71  86.32  92.91  87.09  87.04  87.44  87.35  88.81   88.11   87.45  88.78  88.60  88.50
       PSNR  28.32  28.77  28.45  28.73  28.73  28.71  28.72  28.65   28.68   28.71  28.65  28.66  28.66
 σ=20  MSE   145.8  138.04 142.40 142.41 141.06 140.61 140.91 140.48  140.10  139.97 139.02 138.50 138.73
       PSNR  26.49  26.73  26.59  26.59  26.64  26.65  26.64  26.65   26.67   26.67  26.7   26.72  26.71
 σ=25  MSE   145.8  138.04 142.29 142.41 141.16 140.63 140.93 141.42  140.11  138.98 138.5  138.94 138.52
       PSNR  26.49  26.73  26.59  26.59  26.64  26.65  26.64  26.63   26.667  26.70  26.70  26.70  26.72
In Table 3, we compare the MSE values of the noisy image with the denoised images produced by the wavelet denoising method and by the proposed method for different cases of K and standard deviation. We note that the wavelet technique performs well for high noise levels (σ = 25, 20, 15), while it fails for low ones (σ = 10, 5), where the MSE increases. In contrast, the proposed technique reduces the amount of noise in all cases. In addition, we observe that the optimal number of clusters differs between the two images.
For the parrot image shown in Fig. 6, the method was found to give the best results when the image (with σ = 25) was divided into 2 clusters with H = 3.2; the figure shows the difference in quality between the wavelet denoising method and the proposed method. For the man image shown in Fig. 7, the method was found to give the best results when the image (with σ = 25) was divided into 2 clusters with H = 2.8; the figure shows the difference in quality between the wavelet denoising method and the proposed method.
However, for illustrative purposes, in this paper we show results using the parameter value that allowed us to achieve the least MSE. Moreover, to eliminate dependence on the random initialization of cluster centers for the K-Means algorithm, we perform clustering using K-Means multiple times and use the clustering that produces the least cost. Apart from this, the bandwidth or smoothing parameter for the steering kernel also needs to be tuned for optimality.
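The multiple-restart K-Means strategy mentioned here can be sketched with a plain Lloyd's algorithm, keeping the run with the least clustering cost; all parameter values are illustrative:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns labels and total squared cost."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random init
    for _ in range(iters):
        # assign each point to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    cost = float(((X - centers[labels]) ** 2).sum())
    return labels, cost

def best_of_restarts(X, k, restarts=5):
    """Run K-Means several times and keep the clustering with the
    least cost, reducing sensitivity to initialization."""
    runs = [kmeans(X, k, seed=s) for s in range(restarts)]
    return min(runs, key=lambda r: r[1])

X = np.vstack([np.zeros((10, 2)), np.ones((10, 2)) * 5])
labels, cost = best_of_restarts(X, k=2)
```

Keeping the least-cost run is a cheap safeguard; variants that also select K automatically are noted in the conclusions as future work.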
Further, we note that the patch-based K-SVD method performs better denoising compared to wavelet denoising, which
failed to decrease the noise in the case of low-noise images, while the proposed method appears to perform better in the same cases.
Table 2: The optimal H (H_opt), optimal K (K_opt), and MSE values for each image and noise standard deviation σ.

Image    σ    H_opt  K_opt  MSE (noisy image)  MSE (proposed method)
parrot   5    3      3      24.98              13.97
         10   3      7      97.82              34.85
         15   2.8    8      214.98             59.66
         20   2.8    10     373.65             85.34
         25   3.2    2      571.29             109.97
man      5    3      1      24.21              18.26
         10   3      1      94.79              49.77
         15   2.8    11     210.11             88.49
         20   3      4      368.49             123.71
         25   2.8    2      567.98             160.62
Table 3: MSE values for the noisy images, the wavelet denoising method, and the proposed method for each image and noise standard deviation σ.

Image    σ    H_opt  K_opt  MSE (noisy)  MSE (wavelet)  MSE (proposed)
parrot   5    3      3      24.98        150.66         13.97
         10   3      7      97.82        162.54         34.85
         15   2.8    8      214.98       181.69         59.66
         20   2.8    10     373.65       207.94         85.34
         25   3.2    2      571.29       240.99         109.97
man      5    3      1      24.21        117.58         18.26
         10   3      1      94.79        129.72         49.77
         15   2.8    11     210.11       149.18         88.49
         20   3      4      368.49       175.79         123.71
         25   2.8    10     567.98       209.44         160.62
5. Conclusions
This paper presents an SR method based on a sparse dictionary, which builds a sparse association between image feature patches and at the same time carries out matching and optimization. Compared with other sparse coding methods, the sparse dictionary is more compact and efficient, and needs fewer examples for the same quality. Comparison with other learning-based super-resolution methods shows that our method is superior in quality and computation. However, there are some possibilities for future improvement. The proposed method uses the wavelet domain for clustering the image, with features that are able to capture the underlying geometry in the presence of noise. A dictionary is learned for each of the clusters and a generalized kernel regression is performed to produce a denoised estimate for each pixel. We then extended our proposed method to images with blurring and images degraded by both noise and blurring, and the proposed method shows satisfactory results. We presented an implementation of the proposed method that consists of four stages, namely wavelet transform, clustering, dictionary learning, and coefficient calculation. However, each of these blocks can be replaced by alternate approaches that satisfy similar objectives. Our framework is evaluated experimentally and compared to some state-of-the-art learning-based SR methods. It can be seen that the performance of the proposed method is viable, qualitatively as well as quantitatively. For optimal performance, it is necessary to tune a few parameters of our framework, such as the number of clusters and the smoothing parameter, although it may be useful to use variants of K-Means that converge to the optimal number of clusters automatically.
Fig. 6. Comparison of denoising results on the noisy parrot image corrupted by additive white Gaussian noise of standard deviation 25. (a) Noisy image (MSE = 571.29), (b) wavelet denoising method (MSE = 240.99), (c) proposed method (MSE = 109.97).
References:
[1] V. Bannore, Iterative-Interpolation Super-Resolution Image Reconstruction, January 2009.
[2] S. Chaudhuri and M. V. Joshi, Motion-Free Super-Resolution, Springer Science+Business Media, Inc., 2005.
[3] W. T. Freeman, E. C. Pasztor, and O. T. Carmichael, "Learning Low-Level Vision," International Journal of Computer Vision, Vol. 40, 2000, pp. 25-47.
[4] A. Hertzmann, C. E. Jacobs, N. Oliver, B. Curless, and D. H. Salesin, "Image Analogies," Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH, Los Angeles, California, 2001, pp. 327-339.
[5] A. A. Efros and W. T. Freeman, "Image Quilting for Texture Synthesis and Transfer," Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '01), New York, 2001.
[6] H. Chang, D.-Y. Yeung, and Y. Xiong, "Super-resolution Through Neighbor Embedding," CVPR, 2004.
[7] Li Min, Li Shi-Hua, Wang Fu, and Le Xiang, "Super-Resolution based on Improved Sparse Coding," 2010.
[8] S. Rajaram, M. S. Gupta, N. Petrovic, and T. S. Huang, "Learning-Based Nonparametric Image Super-Resolution," EURASIP J. Appl. Signal Process., Vol. 2006, 2006.
[9] J. Yang, J. Wright, T. S. Huang, and Y. Ma, "Image super-resolution as sparse representation of raw image patches," IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008.
[10] R. Rubinstein, M. Zibulevsky, and M. Elad, "Double Sparsity: Learning Sparse Dictionaries for Sparse Signal Approximation," IEEE Transactions on Signal Processing, Vol. 58, no. 3, pp. 1553-1564, Mar. 2010.
[11] H. Takeda, S. Farsiu, and P. Milanfar, "Kernel Regression for Image Processing and Reconstruction," IEEE Trans. Image Process., Vol. 16, no. 2, pp. 349-366, Feb. 2007.
[12] J. Mairal, G. Sapiro, and M. Elad, "Learning Multiscale Sparse Representations for Image and Video Restoration," SIAM Multiscale Model. Simul., Vol. 7, no. 1, pp. 214-241, Apr. 2008.
[13] H. Takeda, S. Farsiu, and P. Milanfar, "Kernel Regression for Image Processing and Reconstruction," IEEE Trans. Image Process., Vol. 16, no. 2, pp. 349-366, Feb. 2007.
[14] P. Chatterjee and P. Milanfar, "Clustering-Based Denoising With Locally Learned Dictionaries," IEEE Transactions on Image Processing, Vol. 18, no. 7, July 2009.
[15] A. Buades, B. Coll, and J.-M. Morel, "A Non-Local Algorithm for Image Denoising," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Washington, DC, 2005, vol. 2, pp. 60-65.
[16] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "Image denoising by sparse 3-D transform-domain collaborative filtering," IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080-2095, Aug. 2007.
[17] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311-4322, Nov. 2006.
[18] M. Elad and M. Aharon, "Image Denoising via Sparse and Redundant Representations over Learned Dictionaries," IEEE Trans. Image Process., Vol. 15, no. 12, pp. 3736-3745, Dec. 2006.
[19] J. F. Murray and K. Kreutz-Delgado, "Learning Sparse Overcomplete Codes for Images," J. VLSI Signal Process., Vol. 45, no. 1, pp. 97-110, Nov. 2006.
Fig. 7. Comparison of denoising results on the noisy man image corrupted by additive white Gaussian noise of standard deviation 25. (a) Noisy image (MSE = 567.98), (b) wavelet denoising method (MSE = 209.44), (c) proposed method (MSE = 160.62).