Research Article
Incoherent Dictionary Learning Method Based on
Unit Norm Tight Frame and Manifold Optimization for
Sparse Representation
Copyright © 2016 HongZhong Tang et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
Optimizing the mutual coherence of a learned dictionary plays an important role in sparse representation and compressed sensing.
In this paper, an efficient framework is developed to learn an incoherent dictionary for sparse representation. In particular, the
coherence of a previous dictionary (or Gram matrix) is reduced sequentially by finding a new dictionary (or Gram matrix), which
is closest to the reference unit norm tight frame of the previous dictionary (or Gram matrix). The optimization problem can be
solved by restricting the tightness and coherence alternately at each iteration of the algorithm. A distinctive feature of our proposed framework is that the learned dictionary can approximate an equiangular tight frame. Furthermore, manifold
optimization is used to avoid the degeneracy of sparse representation while only reducing the coherence of the learned dictionary.
This can be performed after the dictionary update process rather than during the dictionary update process. Experiments on
synthetic and real audio data show that our proposed methods achieve notably lower coherence, faster running times, and greater robustness than several existing methods.
1. Introduction

In recent years, research on dictionary learning has attracted a lot of attention, because a learned dictionary captures intrinsic features of the training samples in many applications such as denoising [1], compressed sensing [2], pattern recognition, and classification tasks [3–5]. A learned dictionary allows a signal of interest to be represented as a linear combination of relatively few atoms, with representation coefficients that are as sparse as possible. Hence, the problem of dictionary learning can be stated as follows [6]:

$$\operatorname*{argmin}_{D \in \mathcal{D},\, X \in \mathcal{X}} \ \frac{1}{2}\,\|Y - DX\|_F^2 \quad \text{s.t.}\quad \|\mathbf{x}_i\|_0 \le S,\ \ i = 1, 2, \ldots, N, \tag{1}$$

where $Y = \{\mathbf{y}_i\}_{i=1}^{N} \in \mathbb{R}^{M \times N}$ is the training data, $\mathcal{D}$ is the admissible set of all column-normalized dictionaries, $D = \{\mathbf{d}_k\}_{k=1}^{K} \in \mathbb{R}^{M \times K}$ is an overcomplete dictionary ($M < K$), and each column of $D$ is referred to as an atom. $\mathcal{X}$ is the admissible set of all sparse coefficient matrices (i.e., most of the entries are either zero or sufficiently small in magnitude), with $X = \{\mathbf{x}_i\}_{i=1}^{N} \in \mathbb{R}^{K \times N}$, and $S$ is the maximum number of nonzero entries allowed in each $\mathbf{x}_i$.

Equation (1) is not a convex problem in the pair $(D, X)$, so most dictionary learning methods employ alternating optimization over $D$ and $X$. The following two stages are repeated until convergence: sparse coding (updating $X$ with $D$ fixed) and dictionary update (updating $D$ with $X$ fixed).
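For concreteness, the following is a minimal Python/NumPy sketch of this alternating scheme. It is not the authors' implementation: the sparse-coding stage uses a simple orthogonal matching pursuit, the dictionary update uses a MOD-style least-squares step instead of the K-SVD update employed later in the paper, and the function names are ours.

```python
import numpy as np

def omp(D, y, S):
    """Greedy OMP: approximately solve argmin_x ||y - D x||_2 s.t. ||x||_0 <= S."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(S):
        k = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with residual
        if k not in support:
            support.append(k)
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs        # re-fit on the current support
    x[support] = coeffs
    return x

def dictionary_learning(Y, K, S, n_iter=30, seed=0):
    """Alternate sparse coding and a least-squares dictionary update, as in (1)."""
    rng = np.random.default_rng(seed)
    M, N = Y.shape
    D = rng.standard_normal((M, K))
    D /= np.linalg.norm(D, axis=0)                   # column-normalized initial dictionary
    for _ in range(n_iter):
        X = np.column_stack([omp(D, Y[:, i], S) for i in range(N)])   # stage 1: sparse coding
        D = Y @ np.linalg.pinv(X)                                     # stage 2: dictionary update
        D /= np.linalg.norm(D, axis=0) + 1e-12
    return D, X
```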
from incoherent dictionaries and sparse representation. Experiments are carried out on synthetic data and real audio data to illustrate the better performance of our proposed methods.

(3) Organization of Paper. The rest of this paper is organized as follows. Section 2 gives the definitions of mutual coherence and ETFs, after which our proposed algorithms are presented in Section 3. Section 4 gives the details of dictionary optimization by employing Manopt. Section 5 reports on extensive experiments carried out on synthetic data and real audio data. Finally, conclusions are drawn in Section 6.

2. Incoherent Dictionary

2.1. The Mutual Coherence. The mutual coherence of a dictionary is defined as the maximum absolute inner product between distinct atoms [23]:

$$\mu(D) = \max_{k \neq j} \left| \langle \mathbf{d}_k, \mathbf{d}_j \rangle \right|, \tag{5}$$

where $\mathbf{d}_k$ and $\mathbf{d}_j$ denote two different atoms. The coherence measures the similarity between atoms: $\mu(D) \in (0, 1)$, and a dictionary is considered incoherent if $\mu(D)$ is small. Mutual coherence provides an important sufficient condition for guaranteeing exact recovery of sparse signals.

Theorem 1 (see [9, 10]). Let $D \in \mathbb{R}^{M \times K}$ be an overcomplete dictionary with mutual coherence $\mu(D)$, and consider the system $\mathbf{y} = D\mathbf{x}$, where $S$ is the number of nonzero entries in $\mathbf{x}$. If condition (6) is satisfied,

$$S < \frac{1}{2}\left(1 + \frac{1}{\mu(D)}\right), \tag{6}$$

then $\mathbf{x}$ can be recovered using basis pursuit (BP) or orthogonal matching pursuit (OMP).

Theorem 1 shows that an incoherent dictionary is desirable; the best one can hope for is that the mutual coherence reaches the Welch bound.

Theorem 2 (see [25]). Consider an overcomplete dictionary $D \in \mathbb{R}^{M \times K}$ with normalized columns. The coherence satisfies

$$\mu(D) \ge \mu_{\min} = \sqrt{\frac{K - M}{M\,(K - 1)}}. \tag{7}$$

A matrix $F = \{\mathbf{f}_k\}_{k=1}^{K} \in \mathbb{R}^{M \times K}$ is called an equiangular tight frame (ETF) if it satisfies the following three conditions:

(1) Each column has unit norm: $\|\mathbf{f}_k\|_2 = 1$ for $k = 1, 2, \ldots, K$.

(2) The columns are equiangular: for some nonnegative $c$, $|\langle \mathbf{f}_k, \mathbf{f}_j \rangle| = c$ whenever $1 \le k < j \le K$.

(3) The columns form a tight frame, that is, $FF^T = (K/M)\,I_M$,

where $I_M$ is the $M \times M$ identity matrix. It follows that $c$ is the lowest achievable coherence, that $F$ has full row rank, and that its $M$ nonzero singular values are all equal to $\sqrt{K/M}$.

3. Our Proposed Incoherent Dictionary Learning

Frames are an overcomplete counterpart of a basis, and tight frames are an overcomplete counterpart of an orthogonal basis. ETFs generalize the geometric properties of an orthogonal basis [26]. However, ETFs are difficult to construct. In particular, Tropp et al. [27] have demonstrated that the $\alpha$-tight frame is the design closest, in the Frobenius-norm sense, to the solution of the relaxed problem.

Theorem 4 (see [27]). Given a matrix $D \in \mathbb{R}^{M \times K}$ with $M < K$ and singular value decomposition $D = U \Sigma V^T$, the matrix $UV^T$ is called the orthogonal polar factor of $D$. With respect to the Frobenius norm, $\alpha U V^T$ is the $\alpha$-tight frame closest to $D$, and it can also be obtained by computing $\alpha (DD^T)^{-1/2} D$.

We call a given $\alpha$-tight frame a unit norm tight frame (UNTF) if all of its columns satisfy $\|\mathbf{f}_k\|_2 = 1$, in which case $\alpha = \sqrt{K/M}$. The UNTF is employed in our proposed methods because it is the frame closest, in the Frobenius norm, to the computed low-coherence dictionary.

3.1. Our Proposed Incoherent Dictionary Learning Algorithms. To constrain the coherence between atoms, (3) can be reformulated as

$$\operatorname*{argmin}_{D \in \mathcal{D}} \ \frac{1}{2}\,\|Y - DX\|_F^2 \quad \text{s.t.}\quad \mu(D) \le \mu_t, \tag{8}$$

where $D$ is the given dictionary, and the minimization of a matrix nearness problem is employed to solve (9).
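As a concrete illustration of the quantity constrained in (8), the coherence $\mu(D)$ of definition (5) can be computed directly; the following NumPy sketch (function names ours) also evaluates the Welch bound (7) for comparison.

```python
import numpy as np

def mutual_coherence(D):
    """mu(D) from (5): largest absolute inner product between two distinct atoms.
    Assumes the columns of D are normalized to unit l2 norm."""
    G = np.abs(D.T @ D)          # Gram matrix of the unit-norm atoms
    np.fill_diagonal(G, 0.0)     # ignore the trivial <d_k, d_k> = 1 entries
    return float(G.max())

def welch_bound(M, K):
    """Lower bound (7) on the coherence of any M x K unit-norm dictionary."""
    return np.sqrt((K - M) / (M * (K - 1)))

# A dictionary D satisfies the constraint in (8) when mutual_coherence(D) <= mu_t,
# and mu_t can never be pushed below welch_bound(M, K).
```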
In the first algorithm, the coherence of an initial dictionary is reduced sequentially by finding a new dictionary which has lower coherence and is nearest to the previous one. Accordingly, we modify the objective function based on Theorem 4:

$$\operatorname*{argmin}_{D \in \mathcal{D}} \ \left\| D^T \Phi - I \right\|_\infty, \tag{9}$$

where $\Phi$ denotes the reference UNTF of the previous dictionary.

3.1.2. The Improvement of the IP Algorithm. The off-diagonal entries $g_{kj}$ of the Gram matrix $G = D^T D$ represent the coherence between atoms, so another technique for reducing coherence is to operate directly on the entries of the Gram matrix so that they satisfy the following property:

$$\left| g_{kj} \right| \le \mu_t, \ \ 1 \le k, j \le K,\ k \neq j; \qquad g_{kk} = 1, \ \ 1 \le k \le K. \tag{11}$$
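The two ingredients just described can be sketched compactly in NumPy. This is an illustrative sketch rather than the authors' Algorithms 1 and 2: `untf_projection` applies the closest $\alpha$-tight frame of Theorem 4 with $\alpha = \sqrt{K/M}$ followed by column normalization, while `clip_gram` simply hard-clips the off-diagonal Gram entries to the target $\mu_t$, standing in for the shrinkage operation used in the IP-type update, which is not reproduced in this excerpt.

```python
import numpy as np

def untf_projection(D):
    """Project D onto the set of sqrt(K/M)-tight frames (Theorem 4), then renormalize the columns."""
    M, K = D.shape
    U, _, Vt = np.linalg.svd(D, full_matrices=False)
    Phi = np.sqrt(K / M) * (U @ Vt)               # alpha times the orthogonal polar factor of D
    return Phi / np.linalg.norm(Phi, axis=0)      # unit-norm columns -> approximate UNTF

def clip_gram(D, mu_t):
    """Reduce coherence on the Gram side: clip off-diagonal entries of G = D^T D to mu_t,
    restore the unit diagonal, then map back to an M x K dictionary via a rank-M factorization."""
    M, K = D.shape
    G = np.clip(D.T @ D, -mu_t, mu_t)
    np.fill_diagonal(G, 1.0)
    w, V = np.linalg.eigh(G)                      # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:M]                 # keep the M largest eigenvalues (rank-M constraint)
    D_new = (V[:, top] * np.sqrt(np.maximum(w[top], 0.0))).T
    return D_new / np.linalg.norm(D_new, axis=0)
```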
Analogously, the coherence of the Gram matrix can be reduced sequentially by finding a new Gram matrix that is nearest to the previous one. The optimization of (9) can thus be modified into the following problem:

$$\operatorname*{argmin}_{G_e} \ \left\| G_e - G_q \right\|_\infty \quad \text{s.t.}\quad \operatorname{rank}(G_e) = M, \ \ \operatorname{diag}(G_e) = \mathbf{1}, \ \ G_e \succeq 0, \tag{13}$$

where $G_q = \Phi^T \Phi$ is called the reference Gram matrix. The core idea is to operate on the reference Gram matrix $G_q$ instead of on $D^T D$. Our modified algorithm is referred to as UNTF-IP, and the optimization process is described in Algorithm 2.

Firstly, the closest $\alpha$-tight frame $\Phi_0$ is obtained. Normalization is then executed, after which the Gram matrix $G_1 = \Phi_0^T \Phi_0$ is computed. In the $q$th iteration, $\Phi_{q-1}$ is taken, by Theorem 4, as the best-coherence reference for the current dictionary. The shrinkage operation (12) is performed, and an SVD is enforced to obtain rank $M$. The updated Gram matrix is then decomposed to obtain the new dictionary. Lastly, we project the new dictionary onto the UNTF manifold, $\sqrt{K/M}\,(DD^T)^{-1/2} D$, obtaining the next reference UNTF. Consequently, we obtain a dictionary that is effectively tighter and has lower coherence than those obtained with the IP algorithm [22].

4. Dictionary Optimization with Manopt

Only reducing the coherence in (10) and (13) will result in poor performance of the sparse representation. Hence, after (10) and (13) are resolved, we add a dictionary optimization step, based on the orthogonal Procrustes problem (OPP), to maintain good performance of the sparse representation. Equation (1) can be formulated equivalently as an orthogonally constrained minimization as follows:

$$\operatorname*{argmin}_{R \in \mathbb{R}^{M \times M}} \ \frac{1}{2}\,\|Y - RDX\|_F^2 \quad \text{s.t.}\quad R^T R = I. \tag{14}$$

It is clear that $(RD)^T (RD) = D^T R^T R D = D^T D$, so the dictionary optimization in (14) has two advantages: (I) good representation performance can be obtained; (II) the incoherence remains unchanged.

In [14, 22], dictionary rotation (DR) is employed to solve (14), but it is performed inside the iterative dictionary updates of (10) and (13). As demonstrated in [28], Manopt provides efficient algorithms for finding an optimal solution of the OPP, and we build our optimization framework on it.

Let $f(R) = \frac{1}{2}\|Y - RDX\|_F^2$. We treat the orthogonality constraint as a Riemannian submanifold of $\mathbb{R}^{M \times M}$. The purpose of manifold optimization is then to find an optimal solution $R$ of the following model:

$$\operatorname*{argmin}_{R \in \mathcal{M}} \ f(R), \tag{15}$$

where the search space $\mathcal{M}$ is a Riemannian manifold that can be linearized locally at each point $R$ as a tangent space $T_R \mathcal{M}$.
5.2. Incoherent Dictionary Learning for Sparse Representation with Synthetic Data. In this section, we investigate the incoherent dictionary learning performance for sparse representation of synthetic data. The training samples are generated via the underdetermined model $Y = RDX$, where $R \in \mathbb{R}^{M \times M}$ and $D \in \mathbb{R}^{M \times K}$ are generated randomly. The dictionary is constrained to have normalized columns, with $M = 64$ and $K = 256$. The matrix $X \in \mathbb{R}^{K \times N}$ is a sparse matrix with $N = 20{,}000$. The nonzero coefficients are placed at random positions, and their values are drawn from a standard Gaussian distribution. The target coherence $\mu_t$ is set to the Welch bound of 0.1085. Table 1 summarizes the tested methods, which are executed for 30 iterations alternating between dictionary update and dictionary optimization.

Table 1: The tested methods.

Method                  | Sparse coding [24] | Dictionary learning [8] | Dictionary update | Dictionary optimization
Literature [21]         | OMP                | K-SVD                   | INK-SVD           | —
Literature [22]         | OMP                | K-SVD                   | IP                | DR
The proposed method I   | OMP                | K-SVD                   | UNTF-INK-SVD      | Manopt
The proposed method II  | OMP                | K-SVD                   | UNTF-IP           | Manopt

[Figure 1: The mutual coherence of constructed dictionaries; $\mu(D)$ versus iterations (5–30) for INK-SVD, UNTF-INK-SVD, IP, and UNTF-IP, together with the lower bound $\mu_{\min}$.]
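The synthetic setup above can be reproduced with a few lines of NumPy, which also confirms the quoted Welch-bound value. This is a sketch under our own assumptions: the per-column sparsity $S$ is not stated in this excerpt, so S = 8 below is only a placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, N, S = 64, 256, 20_000, 8        # S is a placeholder; the excerpt does not give it

# Welch bound (7) for a 64 x 256 unit-norm dictionary.
mu_min = np.sqrt((K - M) / (M * (K - 1)))
print(f"Welch bound: {mu_min:.4f}")    # -> 0.1085, the target coherence used above

# Random rotation R and random column-normalized dictionary D.
R, _ = np.linalg.qr(rng.standard_normal((M, M)))
D = rng.standard_normal((M, K))
D /= np.linalg.norm(D, axis=0)

# Sparse coefficients: S Gaussian nonzeros at random positions in each of the N columns.
X = np.zeros((K, N))
for i in range(N):
    support = rng.choice(K, size=S, replace=False)
    X[support, i] = rng.standard_normal(S)

Y = R @ D @ X                          # training samples
```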
[Figure 3: The mutual coherence and SNR with incoherent dictionary learning methods; left panel: $\mu(D)$ versus iterations (5–30); right panel: SNR (dB) versus iterations (5–30).]

[Figure 4: The ratio between the SNR and $\mu(D)$ with incoherent dictionary learning methods, versus iterations (5–30), for INK-SVD, UNTF-INK-SVD, IP, and UNTF-IP.]

The training samples are extracted from a 16 kHz guitar recording. Furthermore, all columns of the initial dictionary are selected randomly from the training samples and are normalized. In this simulation, the target coherence $\mu_t$ is varied from 0.05 to 0.5 with a step size of 0.05. The tested methods are the same as those in Table 1; each is executed 10 times, and average results are taken. The termination criterion is that the target coherence is satisfied. We then evaluate our proposed incoherent dictionary learning methods by computing the mutual coherence and the SNR.

[Figure 5: Comparing incoherent dictionary learning methods on audio data; SNR (dB) versus $\mu(D)$ (0.05–0.55) for INK-SVD, IPR, the proposed method I, the proposed method II, and the lower bound $\mu_{\min}$.]
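The excerpt does not spell out how the SNR is computed. A common choice, assumed in the sketch below, is the reconstruction signal-to-noise ratio of the sparse approximation, which can be paired with the mutual_coherence helper defined earlier.

```python
import numpy as np

def reconstruction_snr(Y, D, X):
    """SNR in dB of the sparse approximation D @ X of Y.
    This definition is an assumption; the paper's exact SNR formula is not given in the excerpt."""
    error = Y - D @ X
    return 10.0 * np.log10(np.sum(Y ** 2) / np.sum(error ** 2))

# Evaluation sketch for one learned pair (D, X):
#   print(mutual_coherence(D), reconstruction_snr(Y, D, X))
```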
As shown in Figure 5, the standard deviation over the 10 runs is also plotted, and the results are consistent across tests. When the target coherence is below 0.3, the proposed method II in Table 1 (UNTF-IP followed by Manopt) performs best among the compared methods and approaches the lower bound. However, when the target coherence is above 0.3, the SNR of [21] is the highest, followed by that of our proposed method I. Table 2 reports the computational running times. The key idea behind [21] is to symmetrically decrease the correlation of each pair of atoms with high coherence using a greedy method. Therefore, when the target coherence is higher, the number of atom pairs to be decorrelated decreases dramatically, and the computation time drops accordingly, as shown in the first row of Table 2. Unlike [22], the most important benefit of our proposed methods is better computational efficiency when the target mutual coherence is very low, because Manopt can be run after the dictionary update process rather than during it. Compared with the prior methods, the present experimental results indicate that our learned dictionaries have lower coherence while retaining a reasonable level of sparse representation performance.

6. Conclusion

In this paper, we developed an incoherent dictionary learning framework for sparse representation in which the coherence of the learned dictionary (or its Gram matrix) is reduced by projecting onto a reference unit norm tight frame, and the representation quality is preserved by a subsequent Manopt-based dictionary optimization. In future work, we will further improve the proposed algorithms and apply our proposed methods to other domains.

Competing Interests

The authors declare that they have no competing interests.
References

[12] "... matrices for compressed sensing using orthogonal bases and its extension for biorthogonal bases case," Digital Signal Processing, vol. 27, no. 1, pp. 12–22, 2014.
[13] E. V. Tsiligianni, L. P. Kondi, and A. K. Katsaggelos, “Construc- a matlab toolbox for optimization on manifolds,” The Journal of
tion of incoherent unit norm tight frames with application to Machine Learning Research, vol. 15, no. 1, pp. 1455–1459, 2014.
compressed sensing,” IEEE Transactions on Information Theory,
vol. 60, no. 4, pp. 2319–2330, 2014.
[14] C. Rusu and N. González-Prelcic, “Designing incoherent frames
through convex techniques for optimized compressed sensing,”
IEEE Transactions on Signal Processing, vol. 64, no. 9, pp. 2334–
2344, 2016.
[15] M. Yaghoobi, L. Daudet, and M. E. Davies, “Parametric dic-
tionary design for sparse coding,” IEEE Transactions on Signal
Processing, vol. 57, no. 12, pp. 4800–4810, 2009.
[16] M. Yaghoobi, L. Daudet, and M. E. Davies, “Structured and
incoherent parametric dictionary design,” in Proceedings of the
IEEE International Conference on Acoustics, Speech, and Signal
Processing (ICASSP ’10), pp. 5486–5489, IEEE, Dallas, Tex, USA,
March 2010.
[17] I. Ramirez, F. Lecumberry, and G. Sapiro, “Sparse modeling with
universal priors and learned incoherent dictionaries,” in Pro-
ceedings of the IEEE International Workshop on Computational
Advances in Multi-Sensor Adaptive Processing (CAMSAP ’09),
pp. 197–200, IEEE Signal Processing Society, December 2009.
[18] C. D. Sigg, T. Dikk, and J. M. Buhmann, “Learning dictionaries
with bounded self-coherence,” IEEE Signal Processing Letters,
vol. 19, no. 12, pp. 861–864, 2012.
[19] V. Abolghasemi, S. Ferdowsi, and S. Sanei, “Fast and incoherent
dictionary learning algorithms with application to fMRI,” Sig-
nal, Image and Video Processing, vol. 9, no. 1, pp. 147–158, 2015.
[20] C. Bao, Y. Quan, and H. Ji, “A convergent incoherent dictionary
learning algorithm for sparse coding,” in Computer Vision—
ECCV 2014, vol. 8694 of Lecture Notes in Computer Science, pp.
302–316, Springer, New York, NY, USA, 2014, Proceedings of the
IEEE Conference on Computer Vision—ECCV.
[21] B. Mailhé, D. Barchiesi, and M. D. Plumbley, “INK-SVD:
learning incoherent dictionaries for sparse representations,” in
Proceedings of the IEEE International Conference on Acoustics,
Speech, and Signal Processing (ICASSP ’12), pp. 3573–3576,
Kyoto, Japan, March 2012.
[22] D. Barchiesi and M. D. Plumbley, “Learning incoherent dictio-
naries for sparse approximation using iterative projections and
rotations,” IEEE Transactions on Signal Processing, vol. 61, no. 8,
pp. 2055–2065, 2013.
[23] D. L. Donoho and X. Huo, “Uncertainty principles and ideal
atomic decomposition,” IEEE Transactions on Information The-
ory, vol. 47, no. 7, pp. 2845–2862, 2001.
[24] J. A. Tropp and A. C. Gilbert, “Signal recovery from random
measurements via orthogonal matching pursuit,” IEEE Trans-
actions on Information Theory, vol. 53, no. 12, pp. 4655–4666,
2007.
[25] L. R. Welch, “Lower bounds on the maximum cross correlation
of signals,” IEEE Transactions on Information Theory, vol. 20, no.
3, pp. 397–399, 1974.
[26] M. A. Sustik, J. A. Tropp, I. S. Dhillon, and R. W. Heath Jr., “On
the existence of equiangular tight frames,” Linear Algebra and
Its Applications, vol. 426, no. 2-3, pp. 619–635, 2007.
[27] J. A. Tropp, I. S. Dhillon, R. W. Heath Jr., and T. Strohmer, “Designing structured tight frames via an alternating projection method,” IEEE Transactions on Information Theory, vol. 51, no. 1, pp. 188–209, 2005.
[28] N. Boumal, Optimization and estimation on manifolds [Ph.D. thesis], Princeton University, 2014.
[29] N. Boumal, B. Mishra, P.-A. Absil, and R. Sepulchre, “Manopt, a Matlab toolbox for optimization on manifolds,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1455–1459, 2014.