
LOCALITY PRESERVING KSVD FOR NONLINEAR MANIFOLD LEARNING

Yin Zhou, Jinglun Gao, Kenneth E. Barner
University of Delaware, Newark, DE, USA, 19716
{zhouyin, gao, barner}@udel.edu
ABSTRACT

Discovering the intrinsic low-dimensional structure of a high-dimensional observation space (e.g., images, videos) is, in many cases, critical to successful recognition. However, many existing nonlinear manifold learning (NML) algorithms have quadratic or cubic complexity in the number of data points, which makes these algorithms computationally exorbitant in processing real-world large-scale datasets. Randomly selecting a subset of data points is very likely to place NML algorithms at the risk of local optima, leading to poor performance. This paper proposes a novel algorithm called Locality Preserving KSVD (LP-KSVD), which can effectively learn a small number of dictionary atoms as locality-preserving landmark points on the nonlinear manifold. Based on the atoms, the computational complexity of NML algorithms can be greatly reduced while the quality of the low-dimensional embedding is improved. Experimental results show that LP-KSVD successfully preserves the geometric structure of various nonlinear manifolds and that it outperforms state-of-the-art dictionary learning algorithms (MOD, K-SVD and LLC) in our preliminary study on face recognition.

Index Terms— Dimensionality reduction, Manifold learning, Dictionary learning, Sparse coding, Face recognition

1. INTRODUCTION

One of the central tasks in signal analysis and pattern recognition is to seek effective representations of real-world high-dimensional data [1]. Dimensionality reduction is an important technique that discovers the most succinct and intrinsic representation of the original high-dimensional data, allowing more effective learning and prediction. Linear dimensionality reduction algorithms, e.g., Principal Component Analysis (PCA) [2] and Linear Discriminant Analysis (LDA) [3], have been widely applied in the past decades due to their simplicity and efficiency. Such algorithms, however, are typically not capable of exploiting the nonlinear structure of a data manifold and therefore are not suitable for processing complex datasets [4]. In recent years, nonlinear manifold learning (NML) algorithms have been developed to effectively discover the intrinsic low-dimensional structure of a nonlinear manifold; examples include ISOMAP [5], Locally Linear Embedding (LLE) [6], Hessian LLE [7], Laplacian Eigenmap [1], Diffusion map [8], and Local Tangent Space Alignment (LTSA) [9]. However, many current NML algorithms are of quadratic or cubic complexity in the number of data points, which diminishes the applicability of these algorithms to real-world large-scale datasets [10].

Efforts have been made on selecting a subset of the training data as landmark points on the manifold to improve the efficiency of NML algorithms. Landmark points are meaningful points that preserve the local geometric structure of a manifold. Silva and Tenenbaum [11] suggested using a subset of randomly selected data points, which,
[Fig. 1 block diagram: high-dimensional training data → dictionary learning → dictionary atoms D → NLDR → low-dimensional embedding of dictionary atoms → linear reconstruction → low-dimensional embedding of all data; left panel: high-dimensional space, right panel: low-dimensional space.]

Fig. 1: Overview of the proposed method. Given training data in high-dimensional observation space, a representational and locality-preserving dictionary is learned. Then, the low-dimensional embedding of the atoms is computed via some NML algorithm. Finally, using the geometric relationships among training data and the atoms in observation space, the low-dimensional embedding of training data is reconstructed as linear combinations of the low-dimensional embedding of the atoms.

however, is susceptible to local optima, leading to poor performance. The authors of [10] proposed to use LASSO (Least Absolute Shrinkage and Selection Operator) regression for selecting landmark points, which incurs a high computational cost due to the ℓ1 minimization. Learning landmark points effectively thus remains an open and challenging problem.

This paper proposes a novel algorithm, Locality Preserving KSVD (LP-KSVD), to learn a compact, representational, and locality-preserving dictionary. An overview of the proposed method is illustrated in Fig. 1. From the training set, a small number of atoms are learned as landmark points. NML algorithms can very efficiently compute the low-dimensional embedding of these landmark atoms. Then, based on the atoms, the original high-dimensional manifold as well as the low-dimensional embedding can be accurately reconstructed via locally linear representation [12]. Learning a dictionary of landmark atoms has the advantage of being robust to outliers, random initialization, and imbalanced data distributions, and it allows better generalization capability. Our work differs from sparse coding, e.g., K-SVD [13] and the method of optimal directions (MOD) [14], since 1) the employed local coding has a closed-form solution and is more efficient than sparsity-driven algorithms, e.g., Orthogonal Matching Pursuit (OMP) [15]; and 2) the dictionary is optimized for both its representational capability and its locality-preserving capability, in contrast to sparse coding, which only solves for a representational dictionary. Experimental results support the promising performance of LP-KSVD.
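As a concrete illustration of the Fig. 1 pipeline, the sketch below runs its three stages on a synthetic Swiss roll with scikit-learn. It is a minimal sketch of the workflow only, not the LP-KSVD algorithm: k-means centroids stand in for the learned landmark atoms, scikit-learn's LLE transform() stands in for the locally linear reconstruction of the embeddings, and the dataset, K, and all parameter values are illustrative choices.

```python
# Sketch of the Fig. 1 pipeline with stand-in components (not LP-KSVD itself).
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.cluster import KMeans
from sklearn.manifold import LocallyLinearEmbedding

# High-dimensional (here 3-D) observations: N samples.
Y, _ = make_swiss_roll(n_samples=3000, noise=0.05, random_state=0)   # shape (N, 3)

# Stage 1: obtain K landmark points (k-means stands in for LP-KSVD atoms).
K = 300
atoms = KMeans(n_clusters=K, n_init=10, random_state=0).fit(Y).cluster_centers_

# Stage 2: run the NML algorithm on the K atoms only (cheap, since K << N).
lle = LocallyLinearEmbedding(n_neighbors=6, n_components=2, random_state=0)
B = lle.fit_transform(atoms)          # low-dimensional embedding of the atoms

# Stage 3: embed all N samples as linear combinations of atom embeddings.
# sklearn's transform() computes barycentric weights over the nearest atoms,
# analogous in spirit to the local reconstruction codes of Sec. 3.2.1.
H = lle.transform(Y)                  # shape (N, 2): embedding of all data
print(B.shape, H.shape)
```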

2. RELATED WORK

Sparse coding expresses an input signal succinctly by only a few atoms from a dictionary or codebook. This model has been successfully applied to a variety of problems in image processing and computer vision, such as image restoration [16-18], image denoising [13, 19], and image classification [20-22]. Aharon et al. [13] developed K-SVD to efficiently learn an overcomplete dictionary from training data and achieved impressive results on image denoising and image compression. Another similar algorithm is the method of optimal directions (MOD) [14]. Unfortunately, sparse coding algorithms are typically of high computational complexity due to the stage of solving for the sparse coefficients. Moreover, K-SVD only focuses on optimizing a representational dictionary without considering the locality-preserving capability of the dictionary.

Locality-based coding has recently been developed [23, 24]. Yu et al. [23] introduced Local Coordinate Coding (LCC) to approximate nonlinear functions as linear combinations of anchor points and experimentally showed that locality can be more essential than sparsity in representing data distributed on a nonlinear manifold. Nevertheless, their coding strategy is based on a modification of LASSO and hence is still computationally expensive. Wang et al. [24] further proposed Locality-constrained Linear Coding (LLC) as a fast approximation to LCC and achieved state-of-the-art results on image classification. However, little effort has been made on learning a representational and locality-preserving dictionary to improve the performance of NML algorithms.

3. PROPOSED METHOD

3.1. Problem Formulation

Let Y = [y_1, y_2, ..., y_N] ∈ R^{m×N} be the training set, which contains N m-dimensional samples. Suppose the y_i reside on a smooth manifold M of intrinsic dimension n (n ≪ m), which is embedded into R^m. The goal here is the joint achievement of two objectives. The first is establishing a compact dictionary D = [d_1, d_2, ..., d_K] ∈ R^{m×K} such that linear combinations of the d_i can approximate the nonlinear manifold M ⊂ R^m. The second objective is learning the d_i as landmark points, which are capable of preserving the locality on M.

In order to achieve faithful reconstruction, D is estimated based on Y. This, however, only yields a representational D and has no guarantee of preserving the local geometric structure of M. In order to learn the d_i as landmark points, Yu et al. [23] suggested that the d_i be sufficiently close to M. As we have no access to the true M, we further require each d_i to be locally representational with respect to a small patch on M. We will show that, by satisfying this requirement, the proposed LP-KSVD algorithm can effectively achieve the aforementioned two objectives. The dictionary learning problem (LP-KSVD) is thus formalized as

\langle D, X \rangle = \arg\min_{D,X} \|Y - DX\|_F^2, \quad \text{s.t. } x_{ij} = 0 \text{ if } d_i \notin \Omega_\tau(y_j)\ \forall i,j, \quad \mathbf{1}^T x_j = 1\ \forall j    (1)

where the reconstruction error term measures the fitness of D to Y; the matrix X ∈ R^{K×N} contains the N local reconstruction codes, with x_j (the j-th column of X) being the code containing the linear representation coefficients for reconstructing y_j; x_{ij} is the i-th element of x_j; and Ω_τ(y_j) denotes the neighborhood containing the τ nearest dictionary atoms of y_j in terms of Euclidean distance. We call the first constraint the local cardinality constraint, which requires that every training sample can only be reconstructed by its τ nearest-neighbor dictionary atoms. The second constraint allows the reconstruction coefficients to be invariant to translation of the data [6].

3.2. Optimization

For the purpose of efficient learning, the proposed LP-KSVD solves Eq. (1) iteratively by alternating between the two variables: we first fix D and solve for the best coefficient matrix X; next, we continue to minimize the objective function by updating D as well as X jointly [13]. The iterations are terminated if either the objective function value falls below some preset threshold or a maximum number of iterations has been reached.

3.2.1. Solving for local reconstruction codes

Fixing D, which is initialized or learned from the previous iteration, the X defined in Eq. (1) can be obtained by equivalently solving [6]

\min_{x_j} \Big\| y_j - \sum_{i=1}^{K} x_{ij} d_i \Big\|^2, \quad \text{s.t. } x_{ij} = 0 \text{ if } d_i \notin \Omega_\tau(y_j), \quad \mathbf{1}^T x_j = 1    (2)

for each j. Let x̂_j be the subvector containing only the nonzero elements of x_j. The closed-form solution to Eq. (2) is given as [6]

\hat{x}_j = (G + \eta I) \backslash \mathbf{1}    (3)

\hat{x}_j \leftarrow \hat{x}_j \Big/ \sum_i \hat{x}_{ij}    (4)

where G = (Ω_τ − y_j 1^T)^T (Ω_τ − y_j 1^T) is the local covariance matrix (here we write Ω_τ(y_j) as Ω_τ for simplicity), η is a small constant to secure numerical stability, and the operator \ means matrix inversion¹. For a detailed derivation of Eqs. (3)-(4), please refer to [6] or http://www.cs.nyu.edu/~roweis/lle/.

¹We have followed the same notation as [6].
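A minimal NumPy sketch of the closed-form local coding step in Eqs. (2)-(4) is given below, assuming samples and atoms are stored as columns; the function name, the value of η, and the random test data are illustrative choices rather than anything specified in the paper.

```python
import numpy as np

def local_reconstruction_codes(Y, D, tau=2, eta=1e-4):
    """Local codes of Eqs. (2)-(4).

    Each column y_j of Y is reconstructed only by its tau nearest atoms of D,
    with coefficients constrained to sum to one (tau=2 matches Sec. 4.2).
    Y: (m, N) data matrix; D: (m, K) dictionary.  Returns X: (K, N).
    """
    K, N = D.shape[1], Y.shape[1]
    X = np.zeros((K, N))
    for j in range(N):
        y = Y[:, j]
        # Omega_tau(y_j): indices of the tau nearest dictionary atoms.
        idx = np.argsort(np.linalg.norm(D - y[:, None], axis=0))[:tau]
        Z = D[:, idx] - y[:, None]           # columns d_i - y_j
        G = Z.T @ Z                          # local covariance matrix
        x_hat = np.linalg.solve(G + eta * np.eye(tau), np.ones(tau))   # Eq. (3)
        X[idx, j] = x_hat / x_hat.sum()                                # Eq. (4)
    return X

# Tiny usage example with random data (m=20, N=50, K=10).
rng = np.random.default_rng(0)
Y = rng.standard_normal((20, 50))
D = rng.standard_normal((20, 10))
X = local_reconstruction_codes(Y, D, tau=2)
print(np.allclose(X.sum(axis=0), 1.0))       # each code sums to one
```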

3.2.2. Local Dictionary Optimization

In this step, we update D as well as X jointly [13]. We adopt a sequential optimization strategy and optimize each dictionary atom individually. Denote the k-th atom in D as d_k ∈ R^m and the k-th row of X as x_{k*} ∈ R^{1×N}, and let E_k = Y − Σ_{t≠k} d_t x_{t*}. To update d_k, we minimize ‖E_k − d_k x_{k*}‖_F^2 with respect to d_k and x_{k*}. As ‖E_k − d_k x_{k*}‖_F^2 is only affected by the ω = ‖x_{k*}‖_0 nonzero entries of x_{k*}, it can be equivalently minimized by solving

\langle d_k, \tilde{x}_{k*} \rangle = \arg\min \| \tilde{E}_k - d_k \tilde{x}_{k*} \|_F^2    (5)

where Ẽ_k contains the ω error columns of E_k that are associated with the nonzero entries of x_{k*}, and x̃_{k*} ∈ R^{1×ω} is the corresponding succinct row vector. Minimizing Eq. (5) yields a simultaneous update of d_k and x̃_{k*}, but only d_k is of our interest as the output of the algorithm. K-SVD [13] simply treats Eq. (5) as a rank-1 matrix approximation problem and solves for the best shape of d_k. In particular, applying the SVD yields an optimal rank-1 matrix approximation solution as

U \Delta V^T = \tilde{E}_k    (6)

d_k^{new} = u    (7)

\tilde{x}_{k*}^{new} = \Delta(1,1)\, v^T    (8)

where u and v are the first columns of U and V. This, however, has no guarantee of preserving the local geometric structure of M. In order to learn d_k as a landmark point, we update d_k as follows. Recall that the local cardinality constraint requires each y_j to be reconstructed by its τ nearest dictionary atoms. We therefore define Λ as a small patch on M, i.e., a neighborhood set containing the ω training samples that are concurrently choosing d_k as one of their nearest neighbors.

Proposition 1 (Local Dictionary Optimization). Let Λ ∈ R^{m×ω} be the small neighborhood containing the ω training samples that concurrently select d_k ∈ R^m as a nearest neighbor, and let Ẽ_k ∈ R^{m×ω} be the error matrix corresponding to Λ. Define the local representation error (LRE) as Σ_{y_i∈Λ} ‖d_k^{new} − y_i‖_2^2. Then the d_k^{new} and x̃_{k*}^{new} that minimize ‖Ẽ_k − d_k x̃_{k*}‖_F^2 and yield the minimum LRE are given as

U \Delta V^T = \tilde{E}_k    (9)

d_k^{new} = s\,u, \qquad \tilde{x}_{k*}^{new} = \Delta(1,1)\, v^T / s    (10)

where u and v are the first columns of U and V. Note that u is a unit vector indicating the shape (direction) of d_k^{new}, and s is the gain factor (scale) of d_k^{new}.

Proof. With u fixed, the gain factor s of d_k^{new} that minimizes the LRE can be obtained by solving

\min_s \sum_{y_i \in \Lambda} \Big\| s\, \frac{u}{\|u\|_2} - y_i \Big\|_2^2    (11)

and the optimal s is given by

s = \frac{1}{\omega} \sum_{y_i \in \Lambda} \frac{u^T y_i}{\|u\|_2}.    (12)

Without affecting the closest rank-1 matrix approximation to Ẽ_k, simultaneously multiplying the right hand side (RHS) of Eq. (7) by s and dividing the RHS of Eq. (8) by s yields the desired updates in Eq. (10).

By jointly achieving the minimization of Eq. (5) and the best local representation with respect to Λ, we are able to learn a representational and locality-preserving D. As LP-KSVD essentially minimizes the reconstruction error in the same way as K-SVD [13], convergence to a local minimum is guaranteed.

3.3. Test Sample Embedding

Having obtained D ∈ R^{m×K}, its low-dimensional embedding B ∈ R^{n×K} (n < m) can be computed by a particular NML algorithm. Given a test sample z ∈ R^m, its low-dimensional embedding h ∈ R^n can be linearly reconstructed as h = Bx, where x is the local reconstruction code of z over D.
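The atom update of Proposition 1 reduces to a truncated SVD followed by a rescaling, as in the minimal NumPy sketch below; the function name, the random test data, and the guard against a near-zero gain factor are illustrative additions, not part of the paper.

```python
import numpy as np

def update_atom(Y, D, X, k):
    """LP-KSVD update of atom k (Eqs. (5), (9)-(12)).

    Y: (m, N) training data; D: (m, K) dictionary; X: (K, N) local codes.
    Returns the updated atom d_k and its coefficient row restricted to the
    support of x_{k*}.
    """
    support = np.nonzero(X[k, :])[0]          # samples that use atom k (Lambda)
    if support.size == 0:                     # unused atom: leave it unchanged
        return D[:, k], np.zeros(0)
    # Error matrix without atom k, restricted to the omega support columns.
    E = Y - D @ X + np.outer(D[:, k], X[k, :])
    E_tilde = E[:, support]                   # (m, omega)
    # Rank-1 approximation, Eq. (9).
    U, S, Vt = np.linalg.svd(E_tilde, full_matrices=False)
    u = U[:, 0]                               # unit "shape" of the new atom
    # Gain factor of Eq. (12): average projection of the local patch onto u.
    Lam = Y[:, support]
    s = np.mean(u @ Lam) / np.linalg.norm(u)
    if abs(s) < 1e-12:                        # safeguard (not in the paper)
        s = 1.0
    d_new = s * u                             # Eq. (10)
    x_new = S[0] * Vt[0, :] / s               # Eq. (10)
    return d_new, x_new

# Tiny usage example with random data.
rng = np.random.default_rng(0)
m, N, K = 20, 50, 10
Y = rng.standard_normal((m, N))
D = rng.standard_normal((m, K))
X = rng.standard_normal((K, N)) * (rng.random((K, N)) < 0.2)   # sparse codes
d_new, x_new = update_atom(Y, D, X, k=3)
print(d_new.shape, x_new.shape)
```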
4. EXPERIMENTAL RESULTS

In this section, we first demonstrate the capability of the proposed LP-KSVD in preserving the local geometric structure of manifolds by visualizing dimensionality reduction results on several synthetic datasets. Then, the performance of LP-KSVD is further verified over two popular face recognition databases, the Extended YaleB Database [25] and the CMU PIE Database [26].

4.1. Visualization on Synthetic Datasets

The 3-dimensional synthetic datasets² employed are shown in Fig. 2; from the top row to the bottom row are the visualizations (3D manifold and 2D embeddings) of the 3000 training data, K randomly selected training samples, the K dictionary atoms learnt by LP-KSVD, and the 2000 test data. To demonstrate the wide applicability of LP-KSVD to various NML algorithms, Hessian LLE [7], Laplacian Eigenmap [1] and LLE [6] are employed on Swiss Hole, noisy Toroidal Helix and Gaussian, respectively, with the number of nearest neighbors k = 6. Over all these datasets, due to the nonuniform distribution of data points, NML algorithms have difficulty in computing low-dimensional embeddings over a small number of randomly selected training samples, e.g., the non-convex region (hole) in Fig. 2(a), the ring shape and connectivity in Fig. 2(b), and the curvature shape in Fig. 2(c). In contrast, the dictionary atoms are learned as landmark points equidistributed on the original 3D manifold and preserve well the local geometric structure, as shown in Fig. 2. Moreover, the smooth color displayed by the 2D embeddings verifies that test samples are successfully mapped into the subspace spanned by the dictionary atoms.

Fig. 2: Visualizing dimensionality reduction results on synthetic datasets: (a) Swiss Hole (K = 500), (b) Toroidal Helix (K = 300), (c) Gaussian (K = 100), processed by Hessian LLE, Laplacian Eigenmap and LLE, respectively. Rows (top to bottom): 3000 training samples, K randomly selected samples, K codebook atoms, their embeddings, and 2000 test samples.

²Accessible at: http://www.math.ucla.edu/~wittman/mani/

4.2. Face Recognition

The Extended YaleB Database contains 2414 frontal face images of 38 subjects [25]. Following [27], we randomly select half of the images (about 32 per person) for training and the other half for testing. As in [27], we use a subset of the CMU PIE Database [26] containing five near-frontal poses (C05, C07, C09, C27, and C29), in which images are taken under varying conditions of illumination and expression. The subset yields a total of 11554 images with about 170 images per subject. For each subject, a random selection of 130 images is employed to form the training set, and the rest of the database is used for testing. For both databases, the images are normalized to 32 × 32 pixels and are preprocessed by histogram equalization.

We evaluate LP-KSVD and compare it with MOD [14], K-SVD [13] and the recently proposed LLC [24] by classifying test images embedded in the low-dimensional subspace spanned by the dictionary atoms. For all dictionary learning (DL) algorithms, a structured dictionary is learned as D = [D_1, D_2, ..., D_C], where D_i is the sub-dictionary for class i. We initialize all DL algorithms with 8 atoms per subject, which yields a dictionary of 304 atoms for the Extended YaleB Database and a dictionary of 544 atoms for the CMU PIE Database. The parameter τ is set to 2 uniformly. The sparsity factor of OMP [15] is set to 16 for MOD and K-SVD, as recommended in [22]; increasing the sparsity factor can lead to a marginal accuracy improvement but will cause significantly higher computation time [28]. Linear and nonlinear dimensionality reduction are performed by PCA and LLE, respectively. The nearest neighbor (NN) classifier is employed, and we report the recognition rate as the average over 30 repetitions.
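The evaluation protocol just described can be sketched end to end as follows. The code uses stand-ins throughout, since neither the face databases nor an LP-KSVD implementation is assumed available here: scikit-learn's digits dataset replaces the face images, per-class k-means centroids replace the learned sub-dictionaries D_i, and scikit-learn's LLE transform() replaces the local reconstruction codes. Only the protocol (structured dictionary, NML on the atoms, embedding of train/test samples, 1-NN classification) is being illustrated, not the reported numbers.

```python
# Stand-in sketch of the Sec. 4.2 evaluation protocol.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.cluster import KMeans
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)                    # digits as a stand-in dataset
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.5, random_state=0)

# Structured dictionary D = [D_1, ..., D_C]: a few atoms per class
# (k-means centroids stand in for LP-KSVD sub-dictionaries).
atoms = []
for c in np.unique(ytr):
    km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(Xtr[ytr == c])
    atoms.append(km.cluster_centers_)
D = np.vstack(atoms)                                   # (K, m); rows are atoms here

# NML on the atoms only, then embed train/test samples as local linear
# combinations of the atom embeddings (sklearn's transform()).
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=5, random_state=0)
B = lle.fit_transform(D)                               # embedding of the atoms
Htr, Hte = lle.transform(Xtr), lle.transform(Xte)

# Nearest-neighbor classification in the low-dimensional subspace.
clf = KNeighborsClassifier(n_neighbors=1).fit(Htr, ytr)
print("stand-in accuracy:", clf.score(Hte, yte))
```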

Recognition performances of all approaches are illustrated in Fig. 3. Random stands for the result obtained by employing randomly selected training samples as the dictionary; All Train represents the results obtained by performing the NML algorithm on the entire training set. We choose All Train as the baseline method. In Fig. 3(a), we can see that LP-KSVD achieves the highest recognition rate (above 95%) and leads K-SVD and All Train, which rank second, by over 1%. In Fig. 3(c), different DL algorithms display diverse performances, and LP-KSVD yields the same highest accuracy, 96.7%, as that of All Train. As shown in Fig. 3(b) and 3(d), all methods achieve similar accuracies; this is due to the fact that PCA seeks the optimal projection directions that best preserve the data variance and is insensitive to the nonlinear structure of a manifold. Overall, out of the four evaluation scenarios, LP-KSVD outperforms the other methods in recognition rate, which further validates the usability of the proposed method.

Fig. 3: Classification results over two face databases: (a) Extended YaleB with LLE, (b) Extended YaleB with PCA, (c) CMU PIE with LLE, (d) CMU PIE with PCA; recognition rate (%) versus embedding dimension (10 to 290) for LP-KSVD, K-SVD, MOD, LLC, Random and All Train. Results in (a) and (c) are obtained via LLE dimensionality reduction; results in (b) and (d) are obtained via PCA dimensionality reduction. Note: the parameter k of LLE is set to 60 for both the Extended YaleB and CMU PIE databases.

We also evaluate the efficiency of LP-KSVD and compare it with the other methods. For all methods, the overall time is the sum of the dictionary learning (DL) time and the training data embedding (TrDE) time, reported in Table 1. From Table 1, we can see that over large-scale databases (e.g., CMU PIE), the efficiency of LP-KSVD is up to 149.3 times higher than that of All Train. Considering that the compression rate (♯atoms / ♯training samples) for the CMU PIE Database is only 6.2%, the significant saving in computational complexity makes LP-KSVD a very competitive approach.

Table 1: Overall execution time (seconds) over two face databases (rows: All Train, K-SVD [13], MOD [14], LLC [24], LP-KSVD; columns: overall time and speedup relative to All Train, for Extended YaleB and CMU PIE).

5. CONCLUSION AND FUTURE WORK

We propose a novel LP-KSVD algorithm to learn dictionary atoms as landmark points, which can preserve the locality on a nonlinear manifold. Experimental results validate that the proposed method is superior to existing DL algorithms in terms of greatly reducing computational complexity while yielding a higher classification rate. Future work includes refining the local reconstruction coding strategy by incorporating local sparse coding. In addition, LP-KSVD should be further evaluated over more datasets.

6. ACKNOWLEDGEMENT

The authors would like to thank NSF Grant No. 0812458 for funding and thank the reviewers for their helpful comments.

7. REFERENCES

[1] M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neural Computation, vol. 15, no. 6, pp. 1373–1396, 2003.
[2] I. Jolliffe, Principal Component Analysis, Springer-Verlag, New York, 1986.
[3] R. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, vol. 7, pp. 179–188, 1936.
[4] A. Geiger, R. Urtasun, and T. Darrell, "Rank priors for continuous non-linear dimensionality reduction," in CVPR 2009.
[5] J. Tenenbaum, V. de Silva, and J. Langford, "A global geometric framework for nonlinear dimensionality reduction," Science, vol. 290, pp. 2319–2323, 2000.
[6] S. Roweis and L. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, pp. 2323–2326, 2000.
[7] D. Donoho and C. Grimes, "Hessian eigenmaps: New locally linear embedding techniques for high-dimensional data," 2003.
[8] B. Nadler, S. Lafon, R. Coifman, and I. Kevrekidis, "Diffusion maps, spectral clustering and reaction coordinates of dynamical systems," Applied and Computational Harmonic Analysis, vol. 21, pp. 113–127, 2006.
[9] Z. Zhang and H. Zha, "Principal manifolds and nonlinear dimensionality reduction via tangent space alignment," SIAM J. Sci. Comput., vol. 26, no. 1, pp. 313–338, 2004.
[10] J. Silva, J. Marques, and J. Lemos, "Selecting landmark points for sparse manifold learning," in Advances in Neural Information Processing Systems 18, 2006.
[11] V. de Silva and J. Tenenbaum, "Global versus local methods in nonlinear dimensionality reduction," in Advances in Neural Information Processing Systems 15, 2003.
[12] L. Saul and S. Roweis, "Think globally, fit locally: Unsupervised learning of low dimensional manifolds," J. Mach. Learn. Res., vol. 4, pp. 119–155, 2003.
[13] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.
[14] K. Engan, S. Aase, and J. Husoy, "Frame based signal compression using method of optimal directions (MOD)," in ISCAS '99, 1999.
[15] J. Tropp and A. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Trans. Information Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
[16] J. Mairal, M. Elad, and G. Sapiro, "Sparse representation for color image restoration," IEEE Trans. Image Processing, vol. 17, no. 1, pp. 53–69, 2008.
[17] H. Zhang, J. Yang, Y. Zhang, N. Nasrabadi, and T. Huang, "Close the loop: Joint blind image restoration and recognition with sparse representation prior," in ICCV 2011.
[18] S. Xie and S. Rahardja, "An alternating direction method for frame-based image deblurring with balanced regularization," in ICASSP 2012, pp. 1061–1064.
[19] C. Studer and R. Baraniuk, "Dictionary learning from sparsely corrupted or compressed signals," in ICASSP 2012, pp. 3341–3344.
[20] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Trans. PAMI, vol. 31, no. 2, pp. 210–227, Feb. 2009.
[21] I. Ramirez, P. Sprechmann, and G. Sapiro, "Classification and clustering via dictionary learning with structured incoherence and shared features," in CVPR 2010.
[22] Q. Zhang and B. Li, "Discriminative K-SVD for dictionary learning in face recognition," in CVPR 2010.
[23] K. Yu, T. Zhang, and Y. Gong, "Nonlinear learning using local coordinate coding," in NIPS 2009.
[24] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong, "Locality-constrained linear coding for image classification," in CVPR 2010.
[25] K. Lee, J. Ho, and D. Kriegman, "Acquiring linear subspaces for face recognition under variable lighting," IEEE Trans. PAMI, vol. 27, no. 5, pp. 684–698, 2005.
[26] T. Sim, S. Baker, and M. Bsat, "The CMU pose, illumination, and expression database," IEEE Trans. PAMI, vol. 25, no. 12, pp. 1615–1618, 2003.
[27] D. Cai, X. He, and J. Han, "Semi-supervised discriminant analysis," in ICCV 2007, pp. 1–7.
[28] Y. Zhou and K. Barner, "Locality constrained dictionary learning for nonlinear dimensionality reduction," IEEE Signal Processing Letters, vol. 20, no. 4, pp. 335–338, 2013.