Neurocomputing 216 (2016) 1–17

Contents lists available at ScienceDirect

Neurocomputing
journal homepage: www.elsevier.com/locate/neucom

Rotation expanded dictionary-based single image super-resolution
Tao Li, Xiaohai He n, Qizhi Teng, Xiaoqiang Wu
College of Electronics and Information Engineering, Sichuan University, Chengdu, China

art ic l e i nf o a b s t r a c t

Article history: In this report, issues that affect the performance of the neighbor embedding (NE)-based Super-Resolu-
Received 29 July 2015 tion (SR) method are analyzed. Effective enrichment of the dictionary is a critical factor for the NE-based
Received in revised form SR method. To efficiently enhance the dictionary's expressive capability, a rotation expanded dictionary
12 June 2016
(RED) incorporating the Radon transform (RT) technique is proposed. By representing patch rotations
Accepted 20 June 2016
with a compact scheme, both the search for neighbors and the estimation of rotation angles in the SR
Communicated by: Gaofeng MENG
Available online 20 July 2016 process are significantly simplified. To refine the patch matching accuracy when using the expanded
dictionary, a new level of imaging, known as the middle-resolution (MR) image, is proposed to replace
Keywords: the original low-resolution (LR) image in patch matching. Because MR patches bear more distinguishable
Super-resolution
features, this modification is able to identify neighbors more accurately for the input patches. Lastly, the
Rotation expanded dictionary
effects of a single image SR method based on the MR matching and the rotation expansion are examined
Radon transform
Variance curve in simulations. A comprehensive comparison with several state-of-the-art SR methods demonstrates the
superior performance of the proposed method.
& 2016 Elsevier B.V. All rights reserved.

1. Introduction resolution scale-up factor increases. In the learning-based SR
methods (e.g., [13–24]), a dictionary that stores the relationship
The super-resolution (SR) method is an effective signal pro- between the LR and HR image patches is constructed from a
cessing technique used to overcome resolution limitations in training dataset in advance, which is then used to produce HR
imaging systems. Due to its capability of generating a high-re- images from the input LR images. Additionally, dictionaries with
solution (HR) version of a low-resolution (LR) image or video, SR large scale-up factors can be suitably constructed from the corre-
has found various applications in video surveillance, medical sponding training sets; hence, SR using large scale-up factors is
imaging, entertainment, etc. more feasible in this method.
The classical SR methods [1,2] use multiple LR images with Recently, there has been significant progress in learning-based
sub-pixel misalignment to construct an HR image. Furthermore, SR methods. In [14], the relationship between the LR and HR image
this basic idea is applicable to video SR approaches [3–5] because patches is built in a Markov random field. However, this method is
consecutive video frames can be used as multiple LR images. Un- sensitive to the training datasets. Chang et al. [15] proposed a
like classical methods, the single image SR method is used to neighborhood embedding (NE) algorithm. Based on the assump-
generate an HR image from only one LR image. Thus, it is more tion that the local geometry is invariant at different resolution
flexible and adaptive but ill posed for signal processing. levels of an image, the HR output is generated from a linear
Single image SR techniques can be classified into three primary combination of the HR counterparts of the input patch's neighbors.
categories: interpolation-based SR, reconstruction-based SR and By assuming that the sparse representation coefficients are iden-
learning-based SR. The interpolation-based SR methods (e.g., tical for the LR patches and their HR counterparts, Yang et al. [16]
[6–8]) are fast but suffer from severe blurring and jagged artifacts. proposed a joint dictionary training method to simultaneously
The reconstruction-based SR methods (e.g., [9–12]) rely on the learn the LR and the HR dictionaries. Then, the overall procedure
degradation model and smoothness priors to synthesize an HR in [16] can be further simplified using the series of optimizations
image that, after being degraded by the model, is consistent with proposed in [17]. However, the assumption of the coefficient in-
the original LR image. Because the information from the smooth- variance is often defective in dictionary training when the spatial-
ness priors is limited, the performance deteriorates rapidly as the variant relation and other complex relations are involved. To ad-
dress this problem, several approaches [18–21] have attempted to
n
Corresponding author. release the invariance constraint. Wang et al. [18] proposed a
E-mail address: hxh@scu.edu.cn (X. He). semi-coupled training model where the sparse representation

http://dx.doi.org/10.1016/j.neucom.2016.06.066
0925-2312/& 2016 Elsevier B.V. All rights reserved.

matching is directly conducted between the input LR patches and tween the LR's and HR's nonzero representation coefficients. Peleg et al. thus. The SR re- by incorporating an extra set of training samples and training the construction is used to restore the HR image yh from the observed regression functions from the large sample set instead of the small LR image xl . enhance the expressive capability of the dictionary. Liang et al. Zhang et al. gradual magnification as an initial estimation and combined xl∈Nl and yh ∈Nh . In Section 4. For each . When the in. patch on the LR and HR supports while allowing a linear relation be. Yang et al. an iterative optimization procedure constrained by presented in Section 6. The most straight. Given that the resolution scale-up multiple image priors of different aspects. He et al. Various techniques have been used to mance of example learning-based SR methods. In another two attempts. presentations of the LR and HR patches. gradient histogram and non-local sparsity. a so. they made further improvements [30] notes the additive noise during image acquisition. The are more competent because they incorporate the sparse re. the data fidelity of the SR model is re. the transformation from the HR from pairs of compact LR and HR sub-dictionaries to reconstruct image to the LR image in Eq. [32] proposed a scheme that used deformed patches to problem. sented in Section 7. pressive capability. the NE the use of the optical flow algorithm and assuming a slow variation method [15] introduces the idea of a locally linear embedding field. learning process in [20]. n de- the dictionary atoms. In forward approach to enrich the abundance is to increase the dic. (1) is non-injective. In Section 3. Conventionally. methods that mulated as follows: attempt to reduce the computational complexity at no or limited xl =QHyh +n. [20] both suggested retaining the invariance constraint only factor affecting the SR performance. is maintained for lution based on a rotation expanded dictionary (RED) is proposed in data with different resolutions. they further proposed a non-locally centralized sparse representation (NCSR) model using the nonlocal-sparsity To facilitate the representation in matrix format. images [22]. data similarity. (1) cost of the reconstruction quality are also studied [28–31]. and decimation tion-efficient method using multiple linear mappings obtained in the down-sampling operation. severely ill-posed problem. [29] constructed a pair of LR and HR Q ∈ RNl ×Nh is a decimation matrix representing the down-sampling dictionaries and trained the regression functions that anchor to operator that reduces the image resolution. thus making the SR reconstruction a Dictionary abundance is a critical factor that affects the perfor. the flexibility and adaptation of dance. learning-based SR methods. blurring. deep learning model is formation types are analyzed. By splitting the feature space into numerous subspaces and training a where H ∈ RNh×Nh is the matrix format of a low pass filter that set of simple regression functions. The inverse HR images. However. Lastly. Yang et al. methods such as those in [25–27] the improvement in patch matching accuracy is discussed. [26] proposed an adaptive sparse do- main selection strategy and incorporated two adaptive regular- izations on the local structures and non-local self-similarity of the 2. a specific relationship between the sparse re. the most algorithm that directly used the regression functions to generate common bi-cubic low-pass filter is used as the blurring operator. when the The remainder of this report is organized in the following blurring effect in the degradation model is aggravated. [23] utilized the image priors such as We propose a rotation deformation model to represent our image gradients to constrain the deep learning process. the RED is uous. However. [31] proposed a computa. To reduce possible ambiguity within the difference between them is that the joint dictionary training is LR patches. respectively. [24] applied Cur. Later. Timofte et al. ments is introduced in Section 5. [28] proposed a rapid accounts for the blurring of the HR image. detailed RED-based SR method incorporating all of the improve- presentation of the HR patches directly into the data fidelity. a dictionary that embeds the tionary size. This algorithm justifies the feasi- this report to significantly enhance the dictionary's expressive bility of the NE-based SR method for determining the HR neigh- capabilities at a lower computational cost. To im. The system model of SR problems and the NE-based SR spondence between the LR and the HR patches is more ambig.. Later. The the LR dictionary atoms. Furthermore. HR images. Using the idea of patch deformations. rotation expanded dictionary (RED). The experimental results are Furthermore. the corre. due to An important type of learning-based SR method. i. dictionary. such as the structural factor is an integer q > 1. manner. The system model of the SR problem can be then for- In addition to achieving a better SR performance. constructions. / Neurocomputing 216 (2016) 1–17 coefficients of the LR and the HR patches were not identical but When searching for the nearest neighbors from dictionaries for linearly correlated. follows: pendency between the LR and HR coefficients. image relationships across low and high resolutions is imported to which inevitably adds more computational loads in patch matching. this method has a high complexity and a limited effect on (LLE) [33] algorithm to the SR field. the learning-based SR methods. bors by identifying the similarity between the LR patches. impact on the SR reconstruction. this requires increasing the training dataset. However. Recently. Moreover. process is underdetermined. address this problem. To suppress the blurring cient matching scheme in the RED is also introduced. Dong et al. proposed to make the dictionary more expressive. Nh=q2Nl . which demonstrates that the dictionary enrichment. and the rotation deformation is introduced to exploit the mapping relation between LR and HR determined to be the most effective one. effectively enhancing its ex- prove the reliability of dictionaries. the observed prior [26]. for example. A highly effi- presentations will introduce more errors. We use patch deformations to expand the dictionary abun- variance constraint is relaxed. method are briefly reviewed in Section 2. the report's conclusions are pre- both the data fidelity and multiple priors is used in the SR re. Li et al. example. [19] and Jia the input LR patches. However. [27] proposed a novel SR model that used LR image and its HR source are denoted as two column vectors. In this report. We adopt a new image quality level in patch matching to mi- In these methods. also known as the neighborhood.e. the patch matching accuracy is another key et al. [21] proposed a The primary objectives of this report can be summarized as nonlinear parametric model to formulate the statistical de. Zhang et al. Furthermore. System model images. for presented by a specific relationship between the sparse re. the sizes of the two vectors satisfy self-similarity. provide the necessary constraints for solving the ill-posed SR Zhu et al. velet Support Value Filters to extract reliable structural features.2 T. resulting in different types of SR methods. tigate the feature ambiguity and refine the matching accuracy. due to the unknown noise. Several de- these methods largely improve. learning-based methods. a novel patch matching maneuver is also proposed in achieved by a beta process in [19] but by a coupled dictionary this report to increase the patch matching accuracy.

which will produce a 2D matrix. pose using a Radon transform (RT) to support the simple patch ward but inevitably increases the costs of dictionary construction matching solution for the rotation deformation model. because a rotation causes no loss in quality when modifying the corresponding HR atoms. As mentioned previously. 9. Although zooming in on an atom requires no new information to create the deformed atom. Rotation expanded dictionary and storage of the deformed atoms but also enables analytical 3. the parameter-controlled model not only facilitates the generation 3. flow-based method is used to determine the optimal matched construct the HR image. the directly sampled from the training images. .1. the entire scene in the RT ( r . if the zooming ratio is large. priate HR features that correspond to the K nearest neighbors. 1. deformations that are generated from parameterized models are preferred. Ap. This solution was attempted by Zhu et al. When addressing rectangular patches in the SR problem and parently. deformations can be concisely represented by the deformation δ (∙) is the Dirac delta function. and the mance is highly dependent on the dictionary abundance. thus saving a con. zooming and domains of r and θ . the corresponding HR atom in the dictionary should also be zoomed in by the same ratio to produce the HR output. Alternatively. r is the perpendicular distance parameters instead of the deformation pixels. Radon transform of function f (x. ensuring the RT is isotropic for rotated patches. of the original patch p with an offset of –ϕ in a variable θ such that tremely limited. Accordingly. the atom can be rotated by an arbitrary angle without deteriorating the performance of the SR reconstruction. For the translation deformation. computation complexity involved is still high. which will consequently deteriorate the SR performance. Furthermore. we can pro- Enlarging the dictionary by adding new atoms is straightfor. Then.e. } represents applying RT on the operand.{. the only ambi- guity that appears during rotation occurs at the four corners. Hence. the rotated atoms are expressed by a parameterized model controlled by the rotation angle. between the line L and the coordinate origin. these types example is illustrated in Fig. Moreover. the RT of considerable loss of information. { f (x. summation of the pixels' intensities along the corresponding di- The most basic deformations of an image patch can be cate. similarly.. translation. it requires an increase in the training The RT is a mathematical integral transform consisting of the samples. when an LR atom in the dictionary is zoomed in to match the input image patch. their arbitrarily rotated deformations. 1. { f ( x. y) δ ( r−x cos θ −y sin θ ) original atom is shifted in a certain direction to generate a de- x y (3) formed atom. Here. y ) . which can be atom could also be an economical but effective approach in ex. T. acceptable deformation using their method is limited. the effects of the zooming deformation on expanding the dictionary are still extremely limited. θ ). a rotation ex- from the dictionary and computes the reconstruction weights of panded dictionary is proposed here. expressed as follows: panding the dictionary because no extra training samples are needed. which can be excluded by using a disk mask without losing im- portant atom features. a deformation patch rotated by an angle ϕ will be equal to the RT panding the dictionary using the translation deformation are ex. However. The discrete form can be given as follows: rotation. Therefore. and θ is the angle siderable effort in deformation representation. In computer terms. creating deformed versions of an existing integral of a function f ( x. RT ( r. the patches are compared with the dictionary's LR atoms as well as HR patch is obtained from a weighted combination of the appro. Dictionary expansion using deformations solutions for simple patch matching of the deformed atoms. Radon transform constraints on the local compatibility and smoothness between the adjacent HR patches. In this case. an image w. i. it will be hard to con. Thus. the effects of ex. the quality of the zoomed HR atoms will decrease. the SR perfor. in which the deforma- Unlike the SR methods that use over-complete dictionaries. In the RED. The rotation deformation encounters no significant informa- r tion-loss problems when creating deformed atoms. y) δ ( r−x cos θ−y sin θ ) dxdy (2) tionary expansion more efficient. An optical NE-based SR method uses an example-based dictionary to re. / Neurocomputing 216 (2016) 1–17 3 patch in the input LR image. An formations are determined by the model's parameters. NE [15] finds K nearest neighbors dictionary's expressive capability. y) }= ∫ ∫ f ( x. Moreover. the target HR image is recovered from the HR patches with 3. a disk mask is used struct the deformed atoms online by extrapolation due to the to exclude the four patch corners from the integral. the tions were represented by optical flow parameters.2.t. Thus. y) }= ∑ ∑ f ( x. x thus. [32]. to make the dic. a new scene that is off the scope of the original atom needs to be presented in the deformed atom. in which the atoms are all small patches patches with the optimal deformation parameters. Because the LR atoms are typically rectangular and small in size. the same problem will L y occur for the translation deformation. Therefore. a set of given angles is equivalent to computing the proaches in patch matching. formed by the normal vector of L and the x coordinate axis. rections. a new scene will appear around the original one. where the input LR image the neighbors that minimize the reconstruction error. If we can deform an atom by zooming out.r. Lastly. y ) over straight lines. these where the operator 9. and utilization. applying RT on of deformations also facilitate the use of parameterized ap. Li et al. In contrast to irregular deformations. if the translation range is large. Because the de. over the two gorized into three primary types. the rotation de- formation is the most effective deformation for improving the Fig. θ )=9.

the tions can be implemented by first determining the angle ψ that correlation maximization in Eq. the RT matrix matching has a few advantages.2π ⎤⎦.. RT9 ϕ{p}( r . Before calculating the correlation of the two old to determine the similarity. θ)} ) } r 2 (6) containing the matched patches. The correlation between two RT matrices Clearly. the RT has been widely used. in this report. / Neurocomputing 216 (2016) 1–17 RTp ( r . We can use the idea of dimension reduction for correlation calculation. where dictionary atoms are multiplied by the rotation deformations. the variable θ . matching the deformations of an image patch.3. sions of the other matrix. for the rotated the shifting offset of the second matrix. v (θ ) provides a highly abstract characterization of the is calculated to quantify the similarity of their corresponding image patch over all directions. main. This deduction may not hold vice versa in most cases. Example of a masked patch (a) and its RT variance curves (b). two maneuvers are proposed to address the drawbacks. π ψ θ= 0 (7) arg max ∑ ∑ RT1( r . θ+ψ ). one of the RT matrix. that direction. (7). i. (3) whereas the RTs for all of the the complexity in determining the relative rotational difference deformed patches can be directly deduced from the original patch between the two patches is still similar in both domains because a using a corresponding circular-shift. θ ) RT2 ( r . however. When the rotation deformation is considered. θ )=9. Now. θ−ϕ)=RT9 ϕ{p}( r .e. First. their RT matrices will definitely be similar. i. Furthermore. although gen- Therefore. Similar to the RT similar contents. the complexity by avoiding the generation and storage of every To improve the efficiency of patch matching in the RT domain. 2(a). representing the rotation deformed patches in the erating the rotated versions of an RT matrix by shifting is much RT domain can be extremely simple. which significantly reduces greedy search is used to find the optimal angle offset in both cases. the relative ro. the the offset of –ϕ is a circular-shift of the original matrix RT ( r. Specifically. θ ). θ ) and create a the similarities of two patches.r. π lows: arg max ∑ v1 ( θ ) v2 ( θ +ψ ). similar to that in the pixel domain. then the two matched patches. In the applications of texture analysis [34. search in Eq. a simple scheme can be used to avoid the greedy Then. we can calculate the var- ches in the RT domain provides an alternative method to compare iance over dimension r of an RT matrix RT (r. sufficient condition for matching the two image patches. The representation of the image pat. due to the periodicity of the RT w. Fig. its RT variance rotated versions of one of the patches should be enumerated to curve is plotted in Fig. in the RT domain.4 T. Because the RT of a patch over curves. where the operator Er {∙} represents the mathematical expectation We can assume the similarity in the RT domain is similar to of the operand over r . matching the RT variance curves is a necessary but not matrices should be compared with all the circular-shifting ver. (4) describe the RT matrix of a patch. Additionally. . θ) − E { RT ( r. 3. thus transforming the matrix correlation into a curve Except for the generation and storage complexities. When two patches are similar in curve v (θ ) with the variances as follows: the pixel domain.π ⎤⎦ is symmetric to that over θ∈⎡⎣ π . Because the problem has now been transformed to a curve ψ θ= 0 r (5) correlation. the RT variance curves can still be two patches in the RT domain considering the rotation deforma. The peak aligned curve (c) is also shown. we can first calibrate each curve by aligning their peaks θ∈⎡⎣ 0. patch is calculated using Eq. all of the masked by a disk template is illustrated in Fig. Li et al. the similarity of two patches can be de- patches can be considered similar. θ∈⎡⎣ 0. Only the RT of the original simpler than generating the rotated patches in the pixel domain. when the correlation. Furthermore. θ ) correlation calculation in the RT domain is still on a 2 dimensional along the dimension of θ . the variance of an RT integral curve at a efficient patch matching among these multifold atoms is a critical certain angle is used to characterize the image's texture trait in issue when using the RED. If there is at least one shifting version of it can effectively generate a limited candidate group containing the the second RT matrix that matches with the first one. Therefore. Hence. obtained from the original one by circular shifting. termined by calculating the correlation between their RT variance tational difference between the two patches can be inferred from curves instead of their RT matrices. that in the pixel domain. nevertheless. matching the RT matrices produces a limited candidate group v ( θ )=E r { ( RT ( r. grid. { 9 ϕ{p}} andRT ( r .π ⎤⎦ is sufficient to with a common reference.35]. Patch matching in the RED a scalar metric for each row or column of the RT matrix can be extracted. 2. rotated patch in the pixel domain. (5) can be replaced as follows: maximizes the correlation between the two RT matrices as fol. Accordingly. 2(b). an image patch patches.e. the RT variance curve compare with the other patch to determine whether they contain can be used as a reference for patch matching. Therefore. p However. the origin point of the coordinate. { p} . For example. it can be observed from the illustrated process above where the operator 9 ϕ{∙} represents rotating the patch by an angle that compared to the traditional patch matching in the pixel do- ϕ .t.. the maximum correlation is compared against a thresh. Second. θ )=9.

the angular offsets to align the variance strength of the optimization in Eq. ⋯ . images to LR images. The RT matrix of a left-side. Before we a ones refined using the back-projection algorithm [36] for both the can construct the VCsD Vm from the patches in the MRsD using the method elaborated in Section 3. 2(a) is Dm is the MR sub-dictionary (MRsD). the aligned variance curve of the image patch in Fig. which consists of the angular offsets used to align the patch is generated by flipping it upside down or left side right. The extracting operator Pi : Nh→n is .2. However. Pms = (pdis . Therefore. respectively. Additionally. an MR image ymi is constructed from each Based on the assumption of similar local geometry across different yli using the back-projection algorithm. MRsD. i = 1. For the patches that are de- termined to be similar. thus gen- right mirror. the curves in the VCsD. and ym can then be obtained by minimizing the norm then adjusted.. Then. we can define the obtained image ym as a mid-resolution (MR) image because it provides a new level of re. such as sharp edges and contrasting textures. Rotation expanded dictionaries-based SR arrange them in the most correlated status that they can bear among all their rotation deformations. suppose M. Li et al. This process can eliminate the brightness 2 arg min QHym − xl differences and equalize the feature contrast among all of the MR ym 2 (8) patches. π −θ ). The last two are designed specifically for bustness to noise. In this case. N}. image ydi is created by subtracting the upsampled LR image from Similarly. i. The flowchart of the RED-based SR method is depicted in Fig. scaled up by a bi-cubic interpolation to the size of yhi . a patch matching with the deformations of the MR atoms. is blurred by a bi-cubic low-pass filter and formation. the RED is constructed off-line in advance and produces relative rotation angle ψ between the two similar patches. no similar contents in any case. by a factor s using bi-cubic interpolation. Let the selected patch pairs be denoted by neighbor atoms from the dictionary of input image patches. (8) is insufficient to restrain the curves in the VCsD are recorded to compose the CAsD Θm . (1). M . and the residual pixel powers are then normalized. Thus. ⋯. double the abundance of the RED. the RT variance curve of a mirror.e. Thus. in both dimensions. Dictionary Construction peaks are shifted to the origin are recorded. In this case. θ ). Super-resolution reconstruction remain unknown. The feature ambiguity in the LR do- a fixed number. In this process. tion for patch matching. The difference be- tween the two angular offsets serves as an easy estimation of the First. is simply the the HR one. whereas incorporating the entire HR image will also 4. stores the high-frequency parts of the non-deformed HR patches. i. denoted by used as a criterion to sort the similarities between their HR fea- tures. The back-projection algorithm is then applied on yl to guishable in the MR domain. frequency components in the HR image because we are only concerned about reproducing the missing high-frequency com- ponents from the dictionary.1. The mirror of an image (CAsD). Based on the degradation model in Eq. if the two patches do contain similar contents. formed MR patches. xli is inferred without intensive calculation. ymi ) | i = 1. θ ). 3. which contains the HR differential images and the corresponding MR images. in which the atoms are the the curve correlations are calculated. vmir (θ ) = v (π −θ ). First. which will interfere with the original low-frequency components in the input LR The accuracy of patch matching is another critical factor that testing image. Im ) are then divided into small large. then the two patches contain both the RED and the MR patch matching scheme. Let the refined image be denoted masked by a disk template to exclude the four corners' pixels and by ym . the corresponding HR features are more distin. RTmir ( r. is obtained. ym only partially recovers the high-frequency information lost in the LR image. ample. certain differences in the HR features are hard to distinguish patch pairs. which contains the non-de- illustrated in Fig. con. respectively. For ex. each HR image. the alignment of their principle textures will erase their rotation mismatch and 5. Im )={ (ydi . corresponding curves in the VCsD. four sub-dictionaries. RT ( r. aligned RT variance curves generated from the MR atoms in the As a further extension. The sub-dictionaries are generated from a set of HR training Thus. which provides a new degree-of-freedom in addition to rotation. Similar to the rotation de. Furthermore. vmir (θ ). ⋯ . a mirror deformation in the RT domain can also be downsampled to produce a corresponding LR image xli . the MR image is divided into matching accuracy by determining the patch similarities in the MR patches. given by pmi =Pi ym . After processing all of resolutions. Θm is the calibration angle sub-dictionary cilitates the mirror expansion in the RED. the masked patches are first sub- of the estimation error as follows: tracted by their mean pixel values. ture directions have been aligned with the same coordinate axis.e. / Neurocomputing 216 (2016) 1–17 5 This is equivalent to rotating each patch until their principle tex. RTmir ( r . Furthermore. ( Pds. if a variance curve has several similar peaks. the angular offsets by which the curve 5. is symmetric to the one of the original erating an upsampled LR image yli . pmi ) { s } ) | i = 1. ydi =yhi −yli . The recovered information in In the pre-processing stage. we can increase the patch generate an MR image ym . we propose substituting the LR patches in patch matching with the simply constructed by Dh=Pds and Dm=Pms . an HR differential patch. to enhance the features' representa- directly affects the performance of the NE-based SR methods. shifted to the origin before variance curve sub-dictionary (VCsD).. i. Dh is the HR sub-dictionary (HRsD). thus guaranteeing a fair matching comparison for the Because no prior is imported in the reconstruction process. due to the non-injective transformation from HR ( Id. if the correla. the use of RT curve matching also fa. the MR patches from Dm are first input image and the dictionary. of pairs are randomly selected from main will affect the accuracy in finding and sorting the HR the residuals. artifacts. Lastly. 2(c). the extent of matching features in the LR patches is the HR images in Ih . Next. the mirror deformation can further images Ih in the following process. yhi ∈Ih. This process is used to remove the low- reverse of the original one v (θ ). (7). N . T. A new SR method is proposed in this section that incorporates tion of the two RT curves is still low. After deleting the patches with over-smooth contents. thus. the original LR image is xl . for asymmetric atoms. which produces an LR terpart. relative to their LR features. Vm is the then these peaks are all. a set of image pairs. θ )=RT ( − r . It should be noted that to increase the ro. domain instead of in the LR domain. image yl . The HRsD and the MRsD are To mitigate the ambiguity in representing the HR features. the input LR image xl is first scaled up an MR image contains more detailed features than its LR coun.. solution between the LR and the HR. which siderably reducing the searching complexity in Eq.e. Improving matching accuracy import heavy low-frequency components. and the complete HR features 5. especially when the resolution difference is All of the image pairs in ( Id.

By denoting the location set by with θc∈Θm and c∈ΩCi . of patch p will not be between the input patch and the candidates are calculated from θi adjusted to the same level of patch q.6 T. Ui 0 =K . In the first step. their corresponding HR fea- In the proposed SR method. Flowchart of the proposed rotation expanded dictionary-based single image super-resolution. are determined as the neighbors of pmi . i.. By considering matches. Thus.. *^ { p}. M} where pressed as {p^ }. the set containing the indices patches. Li et al. Given the i-th input patch pmi . During the patch matching stage. the input patch pmi to support a fair as follows: comparison. ⋯ . we can obtain pmi ∈n with i∈L m . pmj ∈Dm (9) s wise patch matching is then conducted on pmi and p^mc . / Neurocomputing 216 (2016) 1–17 Fig. the similarities are evaluated by the correlations be- s tween the curve via and all of the curves from Vm a .r. Because the identified MR neighbors have been properly belongs to the K nearest neighbors of pmi . mk i i i uij ∈{ 0. an aligned RT variance curve RT variance curve matching may also contain non-neighbor pat- via is produced from each MR patch pmi after it is adjusted in the ches. s pmi 9 ∆ θ ik pdk (11) VCsD are selected as the matching candidate set for the input It should be noted that the adjustment operator of the HR patch. intensity-based patch matching. the brightness and the contrast of the rotated MR the rotation deformation. i. the candidate group obtained by the constructing the VCsD and the CAsD. The candidate MR patches from the MRsD. the K most similar MR patches. and the jth MR patch pmj s in the MRsD. s sponding angular offset for the alignment is recorded as θi . c∈ΩCi . The corre. in the second step. L m . t. Denoting the presents adjusting the brightness and the contrast in patch p to neighbor set of the indices by Ω iK . This process can be described as follows: M s { s arg min ∑ uij pmi − *pmi 9 ∆ θ ij pmj { }} 2 . several similar MR neighbors are first rotated by ∆θic to compensate for the rotational mis- generated from the MRsD are found for each pmi . accurate neighbors are determined using pixel-wise comparisons. Eq. 3. Then. the brightness. The pixel- s. k ∈ Ω K and Ω K ⊆ ΩC . the MR neighbors can be ex- s the same level of that in patch q.Ui 2 j=1 s s where p^mc denotes the adjusted candidate MR patch. the pixel domain and transformed into the RT domain. pmc ∈Dm with c∈ΩCi . ⋯ . RT variance curve follows: matching. is slightly different from the previous one * { p} q q of the candidates is denoted by ΩCi .e. Similar to the process in As illustrated in Section 3. where K ≤ C . where ∆θij is the relative rotation offset between pmi tween their pixel intensities. the matching problem can be expressed patches are adjusted w. The similarities ^ are sorted. a weighted superposition of the s HR features of the K nearest neighbors is used to synthesize the HR determines whether the deformed version of the jth MR patch pmj patch. and Ui={ uij | j = 1. and the indices of the first C most similar curves in the p^dk =* { { } }. M} represents a differential angle similarity being measured by the mean squared error (MSE) be- vector for pmi . which can be given as two-step matching scheme. in which uij In the SR reconstruction stage. { p^mc =*pmi 9 ∆ θ ic pmc { s } }.1} denotes a binary selection vector for pmi .e. This is because the atoms in . an n × Nh selection matrix extracting a n × n patch centered at and the CAsD Θm . pdks ∈Dh . Lastly. adjusted during the patch matching. with the where ∆Θi={ ∆θij | j = 1. the operator *q { p} re. thus producing differential angles as ∆θic =θi−θc the i-th location within the image. Then. (9) is solved efficiently using a tures should also be adjusted accordingly. the relative rotations such that the mean value. k∈ΩiK . (10) ∆Θ i.t.

the maximally overlapping technique is used.395 0. Select C candidate and pmi .8668 32. which can be given as follows: neighbors. Lastly.9319 29. (14). suppress the block effects.415 0. which are used to reconstruct 5. 4. scale-up factor s. the synthesized [19].9318 29. end for k ∈ΩiK (13) 12. Back-project on yl to obtain an MR image ym . T. (3). several improvements Input: LR image xl . Conduct curve matching for via in Vm a . 6. Experimental results tween adjacent patches covers an area of n ×( n −1) pixels. 9. The weight wi. for i∈L m do are used as the HR dictionary atoms instead of the direct HR image 4. the overlapped region be. Second. 1 wi.8662 0 5 10 15 20 25 0 5 10 15 20 25 Nearest Neighbor Number K Nearest Neighbor Number K (a) (b) Fig. The peak differential image is combined with the up-sampled LR image yl to signal-to-noise ratio (PSNR) and structural similarity index mea- generate the final HR image yh . 11. k p^dk .8672 PSNR PSNR SSIM SSIM 32.445 0.8666 32. Then. HRsD Dh . reconstructed HR image. k is the calibration angle θi using Eq. overlapped patch location set L m provide a better reference for the proposed method: the training Output: HR image yh dataset is enlarged to the same extent of the proposed method. whereas adjusting the mean RT using Eq. including NE [15]. the HR differential image patches 3.9315 29. this technique is able to achieve an optimized reconstruction quality in patch-based image processing. Obtain the adjusted HR patches p^dk using Eq. and apply the only the high-frequency components. similarity.e.94 0. and restore the HR To avoid the block effect within the HR image caused by the image yh using Eq. MRsD Dm . Compute s the weight is inversely proportional to the MSE between the the weights for p^mk using Eq. patches. rors. Rotation Expanded Dictionary-based SR. 8. the maximally overlapping technique is used to further 2. / Neurocomputing 216 (2016) 1–17 7 32.9316 29. Adjust the brightness and the contrast of pmi . forming an index set Ω iK . (12).. i. VCsD a are further incorporated into the original NE [15] algorithm to Vm .96 0. A þ [30] and DPSR [32]. Here.425 0.8664 32. As proven in [19. . The codes of the baseline methods are down- loaded from their authors’ websites to avoid implementation er- Algorithm 1.95 0. BPJDL averaging the pixels in the overlapped regions. Bi-cubic interpolate xl by a factor of s to generate an image yl . (12) 7.932 29. (b) when the scale-up factor is 3. (13) are superposed based on their individual locations. (11). Li et al. ScSR [16]. s determined by the similarity between the k-th MR neighbor p^ mk 6. An increase in h will diminish the effects of the simi. Synthesize the HR differential patch pdi by a weighted su- pmi can be obtained as follows: perposition using Eq. After single image super-resolution. (10) to generate the adjusted candidate patch p^mc . (6) and circular-shifting. normalizes the weights for pmi . Neighbors that are more similar to s the input should contribute more to the SR reconstruction.435 0. Recover the HR differential image from pdi with i∈L m by averaging over the overlapped regions.∀k∈ΩiK . s pdi = ∑ wi. forming an index set ΩCi . Extract patch pmi centered at locations i from the MR image patches. ⎛ ⎞−1⎛ ⎞ where a higher PSNR indicates smaller pixel errors between the yh =yl +⎜⎜ ∑ PiT Pi ⎟⎟ ⎜⎜ ∑ PiT pdi ⎟⎟. Zeyde [17]. Dh are all HR differential patches. Peleg [21]. hence. Obtain the aligned RT variance curve via and value will add an unexpected DC component. Conduct pixel-wise patch matching to find the K nearest larity difference on the calculated weights.405 0.9317 29.92 0. the HR differential patch pdi for the current patch 10. k= Zi ( exp − pmi − s p^mk 2 ) /h . non-uniform SR performance among the different patches. which maintain better low frequency components in the ym .91 0. 1. CAsD Θm . Considered an important benchmark.37]. which can be expressed as follows: sure (SSIM) [38] are used to evaluate these methods' performance. reconstructed image and the original one whereas a higher SSIM ⎝ i ∈ Lm ⎠ ⎝ i ∈ Lm ⎠ (14) indicates smaller textural and structural distortions of the re- The RED-based SR method is summarized in Algorithm 1. all of the HR differential patches obtained from SR.867 Average PSNR (dB) Average PSNR (dB) Average SSIM Average SSIM 32. the adjacent patch is only one column or row away from the To demonstrate the effectiveness of the proposed RED-based current one. The effects of K on SR performance. Rotate and adjust the MR neighbors pmc s s where h is a tuning scalar controlling the significance of the using Eq. Compute the rotational differential angle ∆θic for the C can- didate neighbors. (13).93 0. (a) when the scale-up factor is 2. Lastly. constructed one. it is compared with several state-of-the-art techniques on Eq. Zi is a factor that neighbors.

6578 27.1736 27.1528 38.0510 32.9547 0. the over-smooth patches. which mulations are obtained by blurring and downsampling the original incorporate various steep edges and several natural textures.0006 29.2769 32.4729 34.9601 0.8524 0.9318 6.7208 38.5148 33.8985 0.9367 0.9158 0.3549 27.8882 0.9371 0.0648 32.6756 35.5897 32. In the SR process.2215 29.9389 0.9763 34.9542 0.2787 34. the selection of K is known as one of the crucial issues for VCsD and the CAsD.9634 0.6194 35.9461 0.9116 35.9293 0.9362 0.8893 0.8683 32.9349 0.7701 32.9679 0.8551 Peppers PSNR(dB) 27. the size of the candidate set is C ¼ 200.8735 0.5852 32.9557 0.9221 0. During the RT variance curve bear distinguishable high-frequency features.6061 31.9593 Average PSNR(dB) 30.5786 SSIM 0.3751 29.9548 Child PSNR(dB) 32.1977 SSIM 0. are further removed. respectively. However.000 MR and HR patch pairs are scale-up factors of 2 and 3. These original HR In the RED-based method.6377 SSIM 0.8878 SSIM 0.1171 SSIM 0.1416 SSIM 0.7728 32.7660 32.4738 38.5950 27.8554 0.4902 38.7267 33. this is performed by trimming the patch pairs in casually selected as K ¼ 10.6076 33.7947 34.8049 0. In the im.9433 32.9309 0.8965 0.9310 SSIM 0.9268 0.9592 0.9146 31.9423 0.9097 0.9229 0. benchmarks.5762 35.3471 29.4499 34.7466 35.7623 30.4838 33.1742 28.9072 0.2342 31.4125 35.6186 32.3919 SSIM 0.5152 32.9303 0.0368 35.5402 33.9147 0.9323 0. Li et al.7465 32.9351 0.9482 0.9382 0.4426 30. When K is too small.0524 34.9320 0.8362 0.3851 31.9325 0. respectively.8869 35.8889 0.2961 SSIM 0.8876 0.9278 0.8488 32.9374 0.8969 0. the randomly selected as the dictionary atoms.9594 0. thus leaving a smaller but more efficient dictionary.4375 34. 50.9584 0.0626 SSIM 0.9723 0.9469 0.9619 Car PSNR(dB) 30.1.7920 31.9580 0.2520 34.3037 28.3665 28. furthermore.9541 0.3856 32.7811 33.4500 29.8521 0.3868 SSIM 0.9343 0.6719 35.8882 0.5852 35.8 T.9738 Lena PSNR(dB) 32.3099 29. when the pre-processed images are down-sampling rates of 1/2 and 1/3 are used in the evaluations for divided into small patches.7194 38.0697 33.5383 31. the performance will of 0° to 180°.9341 0.9384 0.9579 32.9611 0.4899 SSIM 0.9838 32. Experimental setting In the SR performance evaluation.9398 0.8555 0.0911 28.9702 0.9586 0.5918 32.9476 0.4188 34.8469 0.9636 0. matching.4591 34.9365 0.6470 27.8876 31.7658 27.6863 34.9635 0.6043 31.9368 0.4568 30.8820 32.3191 34.9227 0.8886 Hat PSNR(dB) 31.6774 33.6516 33.0191 35.9453 SSIM 0.9235 0.0301 31. become unstable because too few neighbors are vulnerable to . The synthesized LR images in the following SR si- taset contains 17 HR logo images and 34 HR natural images.9494 0.9343 Foreman PSNR(dB) 35.0518 34.9381 0.6670 35.9320 0.9425 0.8644 0. When creating the method.5237 32.4105 37.9244 0.9268 0.8576 0.9177 0. during the intensity-based matching.6513 32.9553 0.9408 0.4706 32.0250 34.6177 27.9396 Wheel PSNR(dB) 29.9641 0.8598 Leaves PSNR(dB) 27.9154 0.9438 Zebra PSNR(dB) 27.9742 0.4372 28. the same dataset found in [32] is images are considered as the actual HR images and used as used as the training resource for dictionary construction.9590 0.9373 0.9401 Parthenon PSNR(dB) 28.8548 0.4078 31.7869 27.9298 0.8541 0.9516 0.3504 32. / Neurocomputing 216 (2016) 1–17 Table 1 Comparison of super-resolution results (scale-up factor ¼ 2).9290 0. A key parameter for the NE-based SR which the MR patch's variance is less than 20.2712 38. Image Measures Bicubic NE [13] ScSR [14] BPJDL [17] Peleg [19] Zeyde [15] A þ [25] DPSR [27] Proposed Butterfly PSNR(dB) 27.9367 0.9296 0.9485 0. the size of the neighbors is plementations.6068 SSIM 0.1121 29.9312 0.9584 0.9383 0.5574 32.9319 House PSNR(dB) 26.3111 28.9353 0.4438 30. images that are different from those in the training dataset are tested.9647 Girl PSNR(dB) 34.9635 0.8539 0.9481 0.9423 31. the angular resolution in RT is 1° over a range the SR performance.3851 34.1715 32.8534 0.9697 0.9370 Flowers PSNR(dB) 30. The da.8089 0.0517 31.8518 0.9156 0.4795 32.5580 34.8560 0.8836 0.3622 37.4938 29.9421 0.3477 32.9612 0.9509 0.9108 0.9366 0.000.9298 0.2625 28.9638 0.8101 31.9669 0.3528 28. The HR images.2907 35.9337 32.1465 31. Thus.5796 32. which may not sizes of 5 × 5 and 7 × 7. The bi-cubic filter is used to blur the images whereas dictionary size is 50.5085 SSIM 0.9379 0.9375 0.9787 32. to guarantee resolution scale-up factors of 2 and 3 are simulated using patch a high dictionary quality.4993 32.4185 32.8359 0.9297 0.9290 0.9376 0.

7123 29.4172 24.8880 0.8742 0.7954 25.8480 0.8430 0.9288 0.7347 29.8773 0.9345 Girl PSNR(dB) 32.4234 28.8966 0.8869 0.7393 0.8507 25. which in. Therefore.7416 0.2577 SSIM 0.0657 26.8076 0.4256 SSIM 0.3077 34.9274 0.8983 28.4523 25.4817 34.7396 0.8752 0.6276 31.8633 0.8179 31.8466 0.7437 Leaves PSNR(dB) 23.8450 0.8309 0.8459 0.8597 26.9054 Lena PSNR(dB) 30.8720 0.6197 33.9261 0.2968 29.8553 0.3479 30.0550 28.0077 27.6214 27.8134 0. which demonstrates an extremely wide feasible zone for the selection of a proper K.0288 26.1091 33.9062 0.7238 29.7608 28.8454 0.9270 Average PSNR(dB) 27.2606 28.7830 28.7386 0.9133 30.8691 0.8282 0.8777 0.8636 0. The effect of K in the proposed component of the color images whereas the color components are method is evaluated over all of the tested images.5756 SSIM 0.4559 28.8003 28.6979 0.9318 0. when K is over-large.8899 0.8825 0.8666 erroneous matches.7095 30.9032 0. This result occurs 6.4939 31.0719 26.5803 SSIM 0.7557 31.7374 0.1774 26.8243 0.8168 30.8267 0.1861 25.8563 0.1968 30.8229 0.0841 SSIM 0.8664 0.4807 33.8255 0.9159 0.8611 0.8809 0.8658 0.8260 0.8785 0.1537 26. Image Measures Bicubic NE [13] ScSR [14] BPJDL [17] Peleg [19] Zeyde [15] A þ [25] DPSR [27] Proposed Butterfly PSNR(dB) 24.8697 31.9331 0.8814 0. the learning-based SR methods gen- expansion in the dictionary provides more highly resembling erally outperform the interpolation-based method because extra atoms to reduce the reconstruction errors under high K values.7863 28.8240 0.6996 33. The average directly interpolated.8608 26.0986 31.1392 25. factors of 2 and 3 are listed in Table 1 and Table 2.9950 28.8276 Hat PSNR(dB) 29.0134 25.0524 25.7373 0.7368 0.3755 35.8809 0.3655 28. As bustness to erroneous matches under low K values whereas the demonstrated in the tables.6411 26. Because human eyes are more sensitive to luminance changes.9256 0.9254 28.5865 SSIM 0.7362 0.8412 28.8304 25.9069 0.7998 25.6777 29.0088 30.4831 32.8524 Foreman PSNR(dB) 32.9296 27. Evaluations on performance primarily because the proposed RED-based SR method enriches the abundance of the dictionary and improves the matching ac.8831 0.8863 Zebra PSNR(dB) 23.9084 Car PSNR(dB) 27.1490 29.0921 SSIM 0.5612 SSIM 0.0527 26.0399 26.8782 25.5283 33.8480 0.7285 0.8783 27.9505 25.6362 28.8601 0.8477 0.7806 28.8794 0.8412 0.8737 0.8282 0.8537 0.8637 0.9901 34.5360 29.5991 30.0002 25.8913 29.9142 0. the results present a good performance of the proposed dictionaries.8738 0. information sourcing from the training images is imported in the Thus. T.8634 0.0816 31.9068 0.7362 0.0372 27.8425 26.6944 26.7993 0.8593 0.7354 0.1574 29. 4.6222 30.8252 31.5211 31.9323 0.8734 0.8845 0.8632 0.2172 SSIM 0. more atoms 0.4126 SSIM 0.8666 Flowers PSNR(dB) 27.8850 0.8456 0.8653 0.6975 28.8077 28.1347 28.8620 27.0919 30.6116 30.9588 SSIM 0.8389 29. respectively.4908 25. with low similarities will be imported as neighbors. / Neurocomputing 216 (2016) 1–17 9 Table 2 Comparison of super-resolution results (scale-up factor ¼ 3).8796 0.2.8452 26.5573 31.0394 28.4314 27.0254 SSIM 0.8281 0.8780 0.5.8027 0.8570 0. The tuning scalar h in the weight calculation is though several baseline improvements have been implemented in . Al- in the simulations.8449 30.8652 0.3590 27.0617 SSIM 0.8546 0.8748 0.9431 25.0382 25.8755 0.7217 0.8616 0. the PSNR and SSIM performance for both scale-up factors of 2 and PSNR and SSIM are only measured in the luminance component.3205 28.9097 0. 3 are depicted in Fig.4290 34.7632 25.0353 25.9073 0. Considered one of the pioneering methods of its kind.8937 0.5924 35. method in selecting a proper K and justify our casual selection of K the NE [15] incorporates the least optimizing modifications.8927 0.5307 28.8810 Parthenon PSNR(dB) 26.8834 0.5372 28. all of the SR methods are only implemented in the luminance creases the reconstruction errors.8929 0.4738 28.6944 0.9109 0.5216 26.8845 0.8198 0. in the objective evaluations.8705 0.0314 26.8026 0.8906 Child PSNR(dB) 29.4873 33.9113 0.2085 35.3400 34.7460 Peppers PSNR(dB) 25.7351 0.5056 26.8798 0. The PSNR and SSIM of the different methods with scale-up curacy.8456 0.4201 33. Li et al.8098 26.8882 0.4077 SSIM 0.8657 0.8802 0.5219 33.9459 31. The improvement in matching accuracy increases the ro.7449 0.8955 0.7963 26.8846 0.8934 Wheel PSNR(dB) 26.1269 29.8909 0.8640 0. however.1804 26.1377 SSIM 0.8929 0.8701 House PSNR(dB) 24.9940 27.8642 0.9273 30.

a similar utilization of the flexibility is ob. (c) NE [13]. its performance is still worse than those of the Thus. (i) DPSR [27]. Although certain modifications have been made to simplify the problem for learning-based SR methods. 5. Compared to the coupled dic. Moreover. the error propagation among methods when the scale-up factor increases. worse. it can be determined that simulations. / Neurocomputing 216 (2016) 1–17 Fig. (h) A þ [25]. respectively. Super-resolution results of image “House” with a scale-up factor of 2. (a) Ground-truth HR. the fea- it a stronger robustness to the increase in the scale-up factor. 6. (e) BPJDL [17]. However. 5 and Fig. (j) The proposed. (d) ScSR [14]. the pro- method. respectively. The results of “Butterfly”. due to the computational complexity of ScSR [16]. which also explains the better performance of the latter less erroneous matches for the SR reconstruction. the virtually used in BPJDL [19] is able to release more power from the dic. (b) Bicubic interpolation. DPSR [32] also achieves better results in the mances under different scale-up factors. Because deformation patches are used to proposed method can be less affected. enriched dictionary is able to provide more similar neighbors with tionary. when the scale-up factor increases. the deformation benefits are often impaired in DPSR methods are illustrated in Fig. these parameters due to the inaccurate estimation is unavoidable. are shown in Figs. it becomes comparatively better when the factor is identifying the best HR atoms will become a more prominent 3. which gives methods. it produces the second best result among the SR methods. the similar performance. because each deformed patch involves the performance degradation is smaller than that of other SR several optical flow parameters. Using the RED to enhance the dictionary's expressive capability tionary learning method in ScSR [16]. By comparing the perfor- expand the dictionary. other SR methods in most cases. In this case. (g) Zeyde [15]. Li et al. . A þ [30] is a rapid SR method that relies on a large amount of scale-up factor is 2. “Zebra” and “Leaves” training samples to learn a set of accurate regression functions. where the [32]. The visual results of “House” and “Wheel” using different Therefore. this simulation. posed method presents a superior performance among all of the served in the statistical prediction model of Peleg [21]. 7–9. ture ambiguity across the low and high resolutions becomes although the performance of Peleg [21] is normal when the scale. Zeyde [17] still presents a refined matching accuracy from the MR patch matching. Furthermore. the deficiency in dictionary matching for up factor is 2. Nevertheless.10 T. the flexible coefficient model and the MR level to refine the matching accuracy. thus. where the scale-up factor is 3. Thus. (f) Peleg [19].

T. (g) Zeyde [15]. (d) ScSR [14]. 6. (c) NE [13]. (d) ScSR [14]. Li et al. . (i) DPSR [27]. (c) NE [13]. (i) DPSR [27]. (e) BPJDL [17]. (b) Bicubic interpolation. (b) Bicubic interpolation. / Neurocomputing 216 (2016) 1–17 11 Fig. (j) The proposed. (a) Ground-truth HR. (h) A þ [25]. (h) A þ [25]. (a) Ground-truth HR. Fig. (f) Peleg [19]. (f) Peleg [19]. Super-resolution results of image “Butterfly” with a scale-up factor of 3. (g) Zeyde [15]. (e) BPJDL [17]. (j) The proposed. Super-resolution results of image “Wheel” with a scale-up factor of 2. 7.

001 in the SSIM are achieved for all these three BPJDL [19]. DPSR [32] and our proposed method all contain an extra differential feature zones.12 T. age is significantly closer to the ground-truth HR image. Benefitting from the expanded these methods in later evaluations. it still generally outperforms all of the other ticeable artifacts around the principal edges while the latter baseline methods. the good visual effects of the proposed constructed HR image. truth counterpart is also shown. (f) Peleg [19]. the PSNR and 0. Zeyde [17] and Peleg [21] obtain more natural edges in methods. the two improvements in the RED and MR patch good HR images with considerably weaker artifacts. DPSR [32] produces a generally clear visual effect but the error propagation in the estimated optical flow fields also in. Except for the overall HR images. original source code. For each re. (c) NE [13]. a feature zone is enlarged and shown to method are more prominent in the differential images. dictionary. we have modified the source codes by removing the interpolation method. average improvements of approximately 0. a normalized differential light differential intensities reflect that our reconstructed HR im- image of the feature zone obtained by subtracting the ground. (d) ScSR [14]. the proposed method. as a compatible and designated part slightly over-smoothens the plain areas and produces prominent of the original framework. Influences of two improvements and dictionary size duces severe artifacts in certain edge areas. Meanwhile. A þ [30] generates In this section. post-process that uses the non-local self-similarities (NLSS) [39] to sistent with the objective results. Therefore. where the further compare the details. 8. to make the comparisons generally have sharper reconstructions than those of the bi-cubic rigidly fair. / Neurocomputing 216 (2016) 1–17 Fig. (j) The proposed. (e) BPJDL [17]. Please refer to the images in actual sizes in the supplemental the overall clearness and sharper edges presented in the re- materials for a better visualization of the results. using the NLSS. which can be used as the baseline performance for its type. After we add basic improvements to the NLSS in these methods. (a) Ground-truth HR. even if the NLSS process is removed from their reconstructed images. The objective results are listed in Table 3. the former three create no. constructed HR images. Super-resolution results of image “Zebra” with a scale-up factor of 3. However. The visual results are essentially con. the NLSS process is still retained in false edges along the real ones. . (h) A þ [25]. In addition to matching of the proposed method are analyzed in detail. (b) Bicubic interpolation. the BPJDL the performance differences can be easier to discern from the [19]. 6. Li et al. It should be noted that in their original source codes.3. The learning-based SR methods further refine the SR quality.1 dB in By improving the dictionaries in the learning process. (g) Zeyde [15]. However. Due to the large effort in constructing a set of regression functions. ScSR [16]. Apparently. the visual effects of NE [15] are generally The simulation over all of the testing images indicates that by good. (i) DPSR [27].

“Flowers”. By compar- images as in Section 6. (g) Zeyde [15]. we can in the following simulations refers to the concrete dictionary size observe average enhancements of 0. Furthermore. In the analysis. (b) Bicubic interpolation. used to train the regression functions. However. / Neurocomputing 216 (2016) 1–17 13 Fig. Moreover. we can observe that the MR patch matching can further improve The curves of the PSNR versus the dictionary size are plotted in the average PSNR by 0. the benchmark method of our type. we can remove the improvement in the MR patch "RED þMR" and the "RED þ LR" are relatively stable over a large matching. thus creating an inferior method denoted by "RED þLR". By tionary size" here refers specifically to the size of the sample set comparing the performance gap between the two "RED" methods. Li et al. speculation is proven by the later evaluations of complexities. dictionary to a much higher level than the one with a concrete sion.52 dB and 0.000 under 20.12 dB by increasing the matching accuracy. both the "RED þMR" and the "RED þ LR" regression functions. Additionally. (h) A þ [25]. as the dictionary size becomes generally achieve better performances compared to the other two small. respectively. the performance deteriorates dramatically.000 at an interval of 2. 10.000. 9. and region starting from a small size. It is clear that the "RED þMR" achieves the achieved using a dictionary size of 5. The evaluated dictionary size ranges the A þ [30] method. which explains methods with a bigger dictionary size of 100. NE [15]. the performances of the Additionally. It should be ing the performance gap between the "RED þLR" and the NE [15] mentioned that for the two "RED" methods.55 dB in the that specifies the size of HRsD whereas for A þ [30]. the "dictionary size" and the gap between the "RED þ LR" and the A þ [30]. the dictionary size is enlarging the concrete dictionary size.000. It should be noted that for for all of the 14 tested images.000. . which primarily result from using the RED. are also merits of reducing the dictionary size can be easily acquired included.000 and 5. due to the use of RT to describe the rotation atoms in an clearly demonstrates the superiority of the rotation expansion on extremely efficient way. Moreover.000. Furthermore. (f) Peleg [19]. even with a fairly small training samples will cause an over-fitting of the learnt neighbor dictionary size of 2.000. T. For comparison. (c) NE [13]. which demonstrates that the the second best method in previous simulations.2 with a scale-up factor of 3. PSNRs. which enhances the expressive capability of the and the MRsD of the proposed RED do not increase after expan. the "dic. therefore. (a) Ground-truth HR. The first three sub-figures are for “Butterfly”. and These comparisons definitely validate the effectiveness of the two “Parthenon”. the lack of best performance in all cases. (e) BPJDL [17].000 above 20. Thus. The experiments are conducted on the same testing without a significant impact on the SR performances. the costs are much lower compared to factor for the learning-based SR methods. this cost also analyzed in this section. Fig. (i) DPSR [27]. the concrete dictionary size of the HRsD the dictionary. A þ [30]. the last one is an average performance improvements proposed in this report. the performance in the previous section is from 2. Considered a critical panding the dictionary. the proposed method is denoted by "RED þMR". This result its poor performance in Fig. (j) The proposed. Super-resolution results of image “Leaves” with a scale-up factor of 3. 10. (d) ScSR [14]. Although it unavoidably requires extra effort in virtually ex- of a concrete expansion of the dictionary. the RED can be observed as a virtual expansion instead size. as indicated in the figures.000 to 100.

0158 31. Li et al.1948 26.0024 29.3293 29.7371 25.0657 26.8585 0.8645 0.8550 0.7351 0. The running times of the dictionary construc.7396 0.7557 31.7123 29.4.7345 0. / Neurocomputing 216 (2016) 1–17 Table 3 Super-resolution results with/without the NLSS process (scale-up factor ¼ 3).9307 0.8867 0.8781 0.5865 SSIM 0. SCDL [18] and Zeyde time ( × 102) for generating HR images. the methods that train over-complete the other methods. require the same order of dictionaries.5056 27.8785 0.0682 29.8937 0. Additionally.5216 26.0 GHz and 16 GB memory) to consumes a relatively long time.3726 33.1924 25.8636 0.8796 0.7357 0.7630 28.6772 31. whereas the PSNRs versus the two types of running (OMP).1490 29.2583 26.8456 0.7379 0. ponent Analysis (PCA) and efficient orthogonal matching pursuit respectively.4201 33. Zeyde [17] ranks as the second fastest in DPSR [32] method.8624 0.9708 26. As a methods are analyzed based on the same configuration in Section special case. DPSR [32] and the proposed method.8304 25.8863 Zebra PSNR(dB) 27.8844 0. except for the DPSR [32].7529 30. which also same computer platform (Intel CPU 4.8642 0.8701 House PSNR(dB) 25.8882 0.4406 29. it is the fastest method when considering times for different methods are plotted in Fig.8906 Child PSNR(dB) 30.8763 0.8503 0.8687 0. Image Measures BPJDL [17] DPSR [27] Proposed w/o wi w/o wi w/o wi Butterfly PSNR(dB) 26.4617 33.7387 0.7095 30.0254 SSIM 0.4568 31.3400 35.8650 0.9588 SSIM 0.0349 29.9067 0.8077 29.8226 0.8772 0.9099 0.4709 27. both the dictionary construction and the SR reconstruction.1269 29.0841 SSIM 0.7155 28. the A þ [30] uses two types of dictionaries and trains 6.8608 26.8934 Wheel PSNR(dB) 28.9221 26.9256 0.5756 SSIM 0.7429 0.8880 0.8460 0.7806 28.6590 30. were performed on the the regression functions in dictionary construction.4450 26. the computational complexities of all of the patches.5211 31.8983 29.8666 Flowers PSNR(dB) 28.8624 0.8810 Parthenon PSNR(dB) 26.9333 0.0479 35.8252 0.8794 0.7437 Leaves PSNR(dB) 25. A þ [30] achieves the shortest processing time better platform with an extra 16 GB of memory is used to run the in SR reconstruction.5095 33.9255 0.0208 29.9459 30.8657 0.7386 0.8252 31.7427 28.7998 25.8783 27.8588 30.8851 0. are typically more time-consuming in dictionary construction compared to those that build dictionaries directly from sampled In this section.8720 30.8890 0.9345 Girl PSNR(dB) 33.5640 27. All of As depicted in Table 4.14 T.9133 30.0524 25.4913 28. BPJDL [19].7638 25.8653 0.9113 0.8666 6.8755 0.0617 SSIM 0.9159 0.8641 0.8524 Foreman PSNR(dB) 34. Table 5 by performing dimension reduction using Principle Com- tions and the SR reconstructions are recorded in Tables 4 and 5.2085 35.8802 0.5612 SSIM 0.1774 28.0921 SSIM 0.0215 25. Computational complexity [17].8258 0.9218 31.8477 0.7460 Peppers PSNR(dB) 25. such as ScSR [16].4077 SSIM 0.8913 0.8439 0.9084 Car PSNR(dB) 28.4126 SSIM 0.5803 SSIM 0.8362 28.4431 35.8863 0.9270 Average PSNR(dB) 28.8777 0.2577 SSIM 0. However.8740 0.8936 0.9243 0.0339 25.8717 0.8916 26.6589 28.8003 28. due .7452 0.9318 0.8869 0.5307 28.2287 34. such as NE [15]. All methods.9054 Lena PSNR(dB) 31.3784 27. A trained functions.5219 33. with the help of well- process the same testing images with a scale-up factor of 3.8734 0.8634 0.8352 26.9940 27.8907 0.4256 SSIM 0.8752 0.8240 0.8601 0.9025 0.2172 SSIM 0.8814 0.8517 26. except DPSR [32].2.7362 0.8267 0.8389 29.8828 31.0288 26.0318 27.9140 0.0353 25. 11.8955 0. The DPSR [32] method.1377 SSIM 0.8276 Hat PSNR(dB) 30.8742 0.8563 0.

6 × 102 1.2 1.3 × 104 6.5 × 10 7. the benchmark method.2 × 102 Parthenon (291 × 459) 7.0 × 102 3.5 3.7 0. (a) “Butterfly”.4 × 102 0.8 × 102 2.6 × 102 0. the dictionary construction complexity. Table 4 Comparison of the running time (s) of dictionary construction process. even though it is processed on a better learning regression functions.6 0.4 3.7 × 102 7.4 × 102 to its enormous calculation requirements in measuring the de.3 × 102 0.3 × 104 6.9 × 104 8.3 × 102 1. the proposed method requires A more comprehensive comparison is illustrated in Fig.3 0.0 × 102 4. provements. The super-resolution performance under different dictionary size. (d) Average over all the testing images.4 × 102 3.9 × 102 Zebra (174 × 150) 1. Li et al.7 × 102 6.000 consumes an average time of .4 × 102 2.3 × 103 Wheel (156 × 204) 1. In this case.6 4.4 × 102 3. 11.. (c) “Parthenon”.1 14.1 0.9 0.6 × 102 6.8 × 102 0.3 × 102 0.6 × 103 2. its PSNR performance is re.0 × 102 1.7 × 103 Girl (258 × 255) 3.2 1.3 0.e. / Neurocomputing 216 (2016) 1–17 15 Fig. 10.9 × 102 9. Although the Zeyde [17] method provements.5 Table 5 Comparison of the running time (s) of SR reconstruction process.9 × 102 1.6 × 102 4. performance of Aþ [30] is supported by an extremely large dic- formation similarity and determining the deformation field.0 × 102 1. the NE [15] method in SR reconstruction.6 × 103 2. Image (Size) NE [13] ScSR [14] BPJDL [17] SCDL [16] Zeyde [15] A þ [25] DPSR [27] Proposed Car (234 × 360) 4. is the tionary size.9 × 102 2. Aþ [30] is considerably faster produce similar PSNR performances.3 × 102 0.3 1.1 5. Methods NE [13] ScSR [14] BPJDL [17] SCDL [16] Zeyde [15] A þ [25] DPSR [27] Proposed 4 5 4 3 Time 6.7 × 102 1.4 × 10 163.8 × 10 2.6 1.6 0.2 × 102 2. Compared to platform. To further evaluate the time efficiency of these im- is the fastest for the entire procedure.2 × 104 1.1 4. where more time for both dictionary construction and SR reconstruction the fastest method is depicted in magenta while the method with due to the extra complexity introduced by the incorporated im- the best quality is depicted in red. i. (b) “Flowers”.2 × 102 Flowers (360 × 498) 9.0 × 10 4.9 × 102 4. T. As a method that produces PSNR results that are and the dictionary size of the proposed method to make them comparable to the proposed method. is unavoidably increased. However.5 × 102 1. because the achieved PSNR with a dictionary size of 100. especially for slowest method in Table 5.1 × 102 Child (255 × 255) 3. we can adjust the dictionary size of the NE [15] method latively low. NE [15].5 × 104 1.3 × 102 0.

(CVPR). 2004. Ben-Ezra. Image Process. Yan. Conclusions [1] S. Supplementary material with a dictionary size of 2.06. 27 (6) (2005) SR methods. Q. the search for neighbors based on MR means and steering kernel regression. S. Caspi. patches greatly enhances the matching accuracy. Z. On single image scale-up using sparse-re- presentations. the tedious and soft-decision estimation. In this [4] T. [6] K. Image super-resolution via sparse re- duced by directly increasing the dictionary size. Song. Image Process. W. Wu. Image Process IEEE Trans. We thank the anonymous reviewers for helpful comments and [19] L. Jones. Xiong. Y. C. Wu. IEEE Trans. are two critical aspects that influence the SR performance. 30 (2015) 147–165. pp. T. Springer. Based on this finding. i. Wang. 2012. Peleg. Mach. Yang. Zeng. H.5  102 s in generating HR images whereas the proposed method Appendix A. image Process. The results in. Li et al. IEEE Trans.. which involves rotations. Lu. Ma. Example-based super-resolution via social images. 18 (5) (2009) 969–981. Nayar. where both the objective measurements and the visual prior model. Nguyen. in: Pro- Acknowledgment ceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).org/10. For such [3] M. . Appl.neucom. I-I. Q. Improving resolution by image registration.T. in: Proceedings of Computer Vision and Pattern Recognition. in: Curves and Surfaces. Xu. Gao. presentation. Image Process. Li. pp. Beta process joint dictionary learning for coupled feature spaces with application to single image super-resolution. Fast and robust multiframe super Considered an important type of learning-based SR method. To fully evaluate the performance of the proposed [14] W. E. 2013.16 T. Li. the MR image. 27 (4) (2005) 531–545. (a) Running time consumed in dictionary construction. Orchard. Wang. [8] X. By introducing the RT to efficiently represent rotation deforma. image pairs. X. Com- method.000 consumes 1. 2012. Yang. M. Z. a new quality level of image is obtained using a back. Huang. Farsiu.S. Protter. X. Thus. M. Y. T. [12] W. Zaretzki. Gradient Profile Sharpness. Liang. W. complexity are also analyzed through simulations. H. This study is supported by the National Natural Sci. the nearest neighbors to construct the desired HR image. Katsaggelos. MR features are more distinguishable than LR ones. Intell. Qi.K. Example-based super-resolution. D. Milanfar. the effects of the dictionary size and the computational put. Tang. Pattern Anal. Fei. Neurocomputing 172 (2016) 38–47. is considerably simpli.e. IEEE Trans. the online version at http://dx. Ren. [10] K. Single Image Super-Resolution Based on performance.K. 11.-Y. Y. M. projection algorithm. pp. 21 (11) (2012) 4544–4556. CVPR dicate that the proposed improvements can reduce the SR reliance 2004. Hung. 61471248).: Image Commun. in: Pro- suggestions. Y. Pasztor. Zhang. Image Process. 19 (11) (2010) 2861–2873. A comprehensive comparison is conducted in si.1016/j. [7] X. Siu. Han. Irani. method. Space-time super-resolution. Wang. Y. To address the feature ambiguities between the LR and HR Trans. X. Liu. group cuts prior. Image Process. solution. Shechtman. [16] J. IEEE 22 (2) (2002) 56–65. Chen. the dictionary abundance and the matching accuracy 977–987. Image Processing. X. [2] M. M. Gong. New edge-directed interpolation. 53 (3) (1991) 231–239. effects demonstrate the superior HR image quality of the proposed [13] Y. [9] S. Video super-resolution using controlled subpixel detector shifts. most effective one. Zhang.-W. Space–time super-resolution with patch report. M. With an improved re. IEEE Trans. M. 887–896. Pan. resolution. Zhang. CVGIP: Graph. A. Super-resolution through neighbor embed- ding. Yeung. 10 (10) (2001) 1521–1527. Dai. a novel RED is proposed.1  102 s. Apparently. 17 (6) (2008) patch matching.066. the proposed improvements can achieve better time efficiency than Supplementary data associated with this article can be found in simply increasing the dictionary size. IEEE Trans. 99 (2015). A. Zomet. Irani. [15] H.C. Tao.-C. Intell. Softcuts: a soft edge smoothness prior for color image super-resolution. expanding the dictionary by deformations is first analyzed. in: Proceedings of the 2004 IEEE Computer Society Conference on. Fast image interpolation using the bilateral filter. [17] R. References 7. IEEE fied. Neurocomputing 162 (2015) 218–233. X. Chang. Elad. Wright. Xu. P. [5] E. Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis. He.R. ceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition ence Foundation of China (Grant no. Elad. thus avoiding the increase in complexity in. IET 6 (7) (2012) 877–890. 2216-2223. Image interpolation by adaptive 2-D autoregressive modeling tions and extracting a variance curve for each patch. (b) Running time consumed in SR reconstruction. improving the [11] Q. Robinson. Single image super-resolution with non-local in the proposed method. S. IEEE Trans. Signal Process. IEEE Trans. Freeman. D. D. Image Process. PSNR versus running time for different SR methods. Q. Graph.T. X. 9. Y. Zeyde. and the study demonstrates that the rotation deformation is the Mach. J. Y.doi. 2004. Teng. Li. 711–730. B. on dictionary size. Image super-resolution employing a spatial adaptive mulations. 345-352. T.D. M. R. pp. He. Pattern Anal. / Neurocomputing 216 (2016) 1–17 Fig. 13 (10) (2004) 1327–1344. [18] S. the NE-based method searches in an example-based dictionary for Model. S.2016.

G. and M. IEEE senior member of the Chinese Institute of Electronics. Image Process. funding of Education Department of China. Tao. H. Intell. Sheikh. Rotation invariant analysis and orientation research interests include image processing. L. S. Ph. Zhang. and software engineering. is a professor at the College of Electronics and In- [21] T. J. and signal processing. degree in the College of Electronics and Information Engineering. Yang. Neurocomputing (2016). Image Process. X. E.C. Zhang. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Mosaicing and Super-Resolution. 22 (4) (2013) 1620–1630. Buades. Yang. 111–126. Image Represent. L. in: Proceedings of 2013 20th IEEE International Conference on Image Processing (ICIP). for image restoration. 2005. Jiao. Technology Progress Award of Sichuan province. and pattern recognition. Zhou. Timofte.R. China. 24 (3) (2015) 846–861. 27 (6) (2005) 1004–1008. China. Commun.S. W. 2004. Yuille.-Y. sparse re- presentation. Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. 2005. D. Wang. Romano. Jafari-Khouzani. D. Z. 60-65. L. in: Computer Vision–ECCV 2014. pattern estimation method for texture classification based on Radon transform and recognition. Yang. China. The core image scanning and comprehensive (CISS) tion for rotation invariant texture analysis. CVPR 2005. Xiao. [25] W. She is currently pursuing her Ph. Simoncelli. Loy. IEEE Trans. Pattern Anal. Jia. J. Patches. degree in applied [32] Y. Image transformation based on learning dictionaries Xiaohai He received his Ph. De Smet. in 2005 and 2008. Proceedings of 2013 IEEE International Conference on Computer Vision (ICCV). pp. 435-439. Saul. H.-f Ma. IEEE Trans. C. A non-local algorithm for image denoising. 2013. Single Image Super-resolution using Deformable physics from Science and Technology University. Learning a deep convolutional network for image Electronic Engineering and the Journal of Data Acqui- super-resolution. volving the development of image scanning systems. M. Shi. His [35] X. V. image Process. degree in communication and information systems. Her research interests include digital Conference on Computer Vision (ICCV). Peleg. in: Proceedings of 2013 IEEE International terprises. J. X.: Publ. Incorporating image priors with age processing. regression for fast super-resolution.K. V. IEEE Trans. 2014. He has participated in multiple projects in- embedding.E. S.-M. Dr. Shi. Coll. Image Process. pp. M. Zhang. Qizhi Teng is a professor at the College of Electronics [26] W. [34] K.E. Bovik. Mach. 2013. 2014. 35 (2) (2013) Engineering in 2002 from Sichuan University. De. Capel. Vis. B. Dong. Gong. [24] S. 1920-1927. X. N. image processing. X. J. Her re- search interests include computer vision. Roweis. Liang. He 367–380. Tang. in: research projects sponsored by various organizations.D. formation Engineering at Sichuan University. X. Elad. Xiaoqiang Wu received his B. Wang. Neuro- computing 194 (2016) 340–347. J. M. 24 (9) (2015) papers especially on image processing and image 2797–2810. B. Tang. National Natural Science Funding of China. image communication. He. T. Guo. IEEE Trans. [39] A. 20 (7) (2011) 1838–1857. degree in Biomedical across image spaces. image and video processing. [38] Z. nationwide en- example-based super-resolution. Image Process. in: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Curvelet Support Value Filters (CSVFs) for Image Super-Resolution. communication. Mach. such as. Timofte. . Li et al. Van Gool. Image Process. Liu. 23 (6) (2014) 2569–2582. IEEE Trans. G. Springer. Morel. Radon transform orientation estima.-x Guo. Zhang. He is an editor of both the Journal of Information and [22] C. Sichuan University. A. Zhang. M. 21 (1) (2010) 29–32. pp. A. Y. Pattern Anal. She has published a book and a large number of modulated sparse representation. Li. J. F.L. Improving K-SVD denoising by post-processing its method-noise. His research interests include im- [23] Y. Science 290 (5500) (2000) 2323–2326. She has been involved in about 30 [28] C. Wang. Zheng. He is currently a senior engineer in the College Pattern Recognition (CVPR). Signal Process. 184–199. Image super-resolution based on structure- tronics. Zhang. pp. K. Natural [29] R. respectively. IEEE Trans. Z. Soc.-H.C. / Neurocomputing 216 (2016) 1–17 17 [20] K. Anchored neighborhood regression for fast Science Funding of Sichuan Province. 2013. 2917-2924. A statistical prediction model based on sparse representa. pattern re- [30] R. Gao. deep convolutional neural networks for image super-resolution. A þ: Adjusted anchored neighborhood cognition.T. Elad. Learning multiple linear mappings for efficient single image super-resolution. [36] D. in: Computer Vision–ACCV 2014. Sichuan University. in: Proceedings of 2014 IEEE Conference on Computer Vision and in 1991. Li. Dr. etc. L. Wu. Wang. of Electronics and Information Engineering. Van Gool. Wang. 561-568. Tao Li received her B. Xiong. Soltanian-Zadeh. pp. sition & Processing. Intell. pp. L. Sichuan [33] S. pp. 2014. Springer. Fast direct super-resolution by simple functions. Teng is a senior member of Chinese Institute of Elec- [27] Y. Dong. Yang. [37] Y. Zhu. X. L. Nonlocally centralized sparse representation and Information Engineering at Sichuan University. Y. X. correlation analysis. Nonlinear dimensionality reduction by locally linear University.D.P. Springer Science & Business Media. Dong. 13 (4) (2004) 600–612. [31] K. He is a tions for single image super-resolution. IEEE analysis system developed by him won the Science and Trans. Cui.