Justin Romberg
Georgia Tech, ECE
Collaborators:
M. Salman Asif
Aurèle Balavoine
Chris Rozell
Tsinghua University
October 18, 2013
Beijing, China
Goal: a dynamical framework for sparse recovery
Agenda
We will look at dynamical reconstruction in two different contexts:
(with M. Salman Asif)
• Recursive least squares with a new measurement w = φ^T x + noise

Compute the new estimate using a rank-one update:

x̂_1 = (Φ^T Φ + φ φ^T)^{-1} (Φ^T y + φ w)
    = x̂_0 + K_1 (w − φ^T x̂_0)

where

K_1 = (Φ^T Φ)^{-1} φ (1 + φ^T (Φ^T Φ)^{-1} φ)^{-1}
With the previous inverse in hand, the update has the cost of a
few matrix-vector multiplies
Classical: The Kalman filter
Linear dynamical system for state evolution and measurement:
yt = Φt xt + et
xt+1 = Ft xt + dt
\[
\begin{bmatrix}
I & 0 & 0 & 0 & \cdots \\
\Phi_1 & 0 & 0 & 0 & \cdots \\
-F_1 & I & 0 & 0 & \cdots \\
0 & \Phi_2 & 0 & 0 & \cdots \\
0 & -F_2 & I & 0 & \cdots \\
0 & 0 & \Phi_3 & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots &
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \end{bmatrix}
=
\begin{bmatrix} F_0 x_0 \\ y_1 \\ 0 \\ y_2 \\ 0 \\ y_3 \\ \vdots \end{bmatrix}
\]
As time marches on, we add both rows and columns.
Least-squares problem:

min_{x_1, x_2, ...} Σ_t σ_t ‖Φ_t x_t − y_t‖_2^2 + λ_t ‖x_t − F_{t−1} x_{t−1}‖_2^2
Classical: The Kalman filter
yt = Φt xt + et
xt+1 = Ft xt + dt
Least-squares problem:

min_{x_1, x_2, ...} Σ_t σ_t ‖Φ_t x_t − y_t‖_2^2 + λ_t ‖x_t − F_{t−1} x_{t−1}‖_2^2
vk = Fk x̂k|k
Kk+1 = (Fk Pk FkT + I)ΦTk+1 (Φk+1 (Fk Pk FkT + I)ΦTk+1 + I)−1
x̂k+1|k+1 = vk + Kk+1 (yk+1 − Φk+1 vk )
Pk+1 = (I − Kk+1 Φk+1 )(Fk Pk FkT + I)
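A minimal numpy sketch of one predict/update cycle of this recursion, with the identity noise covariances used on the slide; sizes and variable names are illustrative:

```python
import numpy as np

# One predict/update cycle of the Kalman recursion with unit covariances.
rng = np.random.default_rng(1)
n, m = 4, 3
F = 0.9 * np.eye(n)                    # state transition F_k
Phi_next = rng.standard_normal((m, n)) # measurement matrix Phi_{k+1}

x_hat = rng.standard_normal(n)         # current estimate x_hat_{k|k}
P = np.eye(n)                          # current error covariance P_k

# Predict
v = F @ x_hat                          # v_k = F_k x_hat_{k|k}
P_pred = F @ P @ F.T + np.eye(n)       # F_k P_k F_k^T + I

# Update with the new measurement y_{k+1}
y_next = rng.standard_normal(m)
K = P_pred @ Phi_next.T @ np.linalg.inv(Phi_next @ P_pred @ Phi_next.T + np.eye(m))
x_hat = v + K @ (y_next - Phi_next @ v)
P = (np.eye(n) - K @ Phi_next) @ P_pred
```

Each cycle costs a few small matrix products; the posterior covariance P stays symmetric positive definite.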
Dynamic sparse recovery: ℓ1 filtering

min_x ‖Wx‖_1 + (1/2)‖Φx − y‖_2^2

min_x λ‖x‖_1 + (1/2)‖Φx − (1 − ε)y_old − ε y_new‖_2^2,   take ε from 0 → 1

Optimality conditions:

Φ_Γ^T (Φx − (1 − ε)y_old − ε y_new) = −λ sign(x_Γ)
‖Φ_Γc^T (Φx − (1 − ε)y_old − ε y_new)‖_∞ < λ

Γ = active support

Update direction:

∂x = −(Φ_Γ^T Φ_Γ)^{-1} Φ_Γ^T (y_old − y_new)   on Γ
∂x = 0                                          off Γ
Path from old solution to new solution

Γ = support of the current solution. Move in the direction

∂x = −(Φ_Γ^T Φ_Γ)^{-1} Φ_Γ^T (y_old − y_new)   on Γ
∂x = 0                                          off Γ

until the support changes, or one of these constraints is violated:

|φ_γ^T (Φ(x + ∂x) − (1 − ε)y_old − ε y_new)| < λ   for all γ ∈ Γc

Homotopy parameter ε: 0 → 1. The path from the old solution x̂_1 to the new solution x̂_2 is piecewise linear, parameterized by ε.
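Once the active support is known, the update direction above is cheap to compute. A minimal numpy sketch (support, sizes, and names all illustrative):

```python
import numpy as np

# Update direction for the l1-filtering homotopy: nonzero only on the
# active support Gamma of the current solution.
rng = np.random.default_rng(2)
M, N = 20, 40
Phi = rng.standard_normal((M, N))
y_old = rng.standard_normal(M)
y_new = rng.standard_normal(M)

Gamma = np.array([3, 11, 27])          # active support (illustrative)
PhiG = Phi[:, Gamma]

dx = np.zeros(N)
dx[Gamma] = -np.linalg.solve(PhiG.T @ PhiG, PhiG.T @ (y_old - y_new))
```

The linear solve is only |Γ| × |Γ|, which is what makes each homotopy step inexpensive.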
Sparse innovations

(Example: House image.)
Numerical experiments: time-varying sparse signals
(Results table, garbled in extraction: for each method and signal type, roughly the average number of matrix-vector products with Φ and Φ^T, and the average CPU time to solve.)
[Asif and R. 2009]
Adding a measurement

Model: y = Φx + e, with a new measurement w = ⟨φ, x⟩ + e_new.

Optimality conditions:

Φ_Γ^T (Φx − y) + (⟨φ, x⟩ − w)φ_Γ = −λ sign(x_Γ)
‖Φ_Γc^T (Φx − y) + (⟨φ, x⟩ − w)φ_Γc‖_∞ < λ

From the critical point x_0, the update direction is

∂x = (w − ⟨φ, x_0⟩) · (Φ_Γ^T Φ_Γ + φ_Γ φ_Γ^T)^{-1} φ_Γ   on Γ
∂x = 0                                                    off Γ

The path from the old solution x̂_1 to the new solution x̂_2 is piecewise linear, parameterized by the homotopy parameter ε: 0 → 1.
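A minimal numpy sketch of this measurement-addition direction; the support, measurement value, and names are illustrative:

```python
import numpy as np

# Update direction when a new measurement w = <phi, x> is added.
rng = np.random.default_rng(3)
M, N = 20, 40
Phi = rng.standard_normal((M, N))
phi = rng.standard_normal(N)

Gamma = np.array([4, 12, 30])          # support of the current solution x0
x0 = np.zeros(N)
x0[Gamma] = [1.0, -0.7, 0.4]
w = 0.3                                # new measurement value (illustrative)

# (Phi_G^T Phi_G + phi_G phi_G^T) restricted to the active support
A = Phi[:, Gamma].T @ Phi[:, Gamma] + np.outer(phi[Gamma], phi[Gamma])
dx = np.zeros(N)
dx[Gamma] = (w - phi @ x0) * np.linalg.solve(A, phi[Gamma])
```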
Numerical experiments: adding a measurement

N = 1024, M = 512, sparsity S = 100. Initial problem: N = 1024, M = 512, K = M/5 (Gaussian). Add P new measurements via sequential updates, and compare the average number of matrix-vector products per update.
Reweighted ℓ1

Weighted ℓ1 reconstruction:

min_x Σ_k w_k |x_k| + (1/2)‖Φx − y‖_2^2  =  min_x ‖Wx‖_1 + (1/2)‖Φx − y‖_2^2

Solve this iteratively, adapting the weights to the previous solution:

w_k = λ / (|x_k^old| + c)
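The reweighting rule is one line of numpy; the values of λ, c, and the previous solution below are illustrative:

```python
import numpy as np

# w_k = lam / (|x_k_old| + c): coefficients that were large in the previous
# solution are penalized less on the next solve; (near-)zero coefficients
# get weight close to lam / c.
lam, c = 1.0, 1e-2                     # illustrative values
x_old = np.array([0.0, 2.0, -0.5, 0.0])
w = lam / (np.abs(x_old) + c)
# zero entries get weight lam/c = 100; the entry at 2.0 gets about 0.5
```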
(Figure: (a) original two-pulse signal and reconstructions via (b) ℓ1 synthesis, (c) ℓ1 analysis, (d) reweighted ℓ1 analysis.)
(from Boyd, Candes, Wakin ’08)
Changing the weights
Optimality conditions:
Numerical Experiments

Sparse signal of length N recovered from M Gaussian measurements, comparing:
• ADP-H (adaptive weighting via homotopy) vs. SpaRSA [Wright et al 2007]
• IRW-H (iterative reweighting via homotopy) vs. YALL1 [Yang et al 2007]
A general, flexible homotopy framework

The formulations above
• depend critically on maintaining optimality
• are very efficient when the solutions are close

Streaming measurements for evolving signals require some type of predict-and-update framework.
We want to solve

min_x ‖Wx‖_1 + (1/2)‖Φx − y‖_2^2

Initial guess/prediction: v. Solve

min_x ‖Wx‖_1 + (1/2)‖Φx − y‖_2^2 + (1 − ε)u^T x

for ε: 0 → 1. Taking

u = −Wz − Φ^T (Φv − y)

for some z ∈ ∂(‖v‖_1) makes v optimal for ε = 0.
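A quick numpy check of this warm-start construction (sizes, weights, and the prediction v are illustrative): the subgradient of the ε = 0 objective vanishes at x = v.

```python
import numpy as np

# With u = -W z - Phi^T (Phi v - y) and z a subgradient of ||v||_1, the
# subgradient of ||W x||_1 + 0.5 ||Phi x - y||^2 + u^T x is zero at x = v.
rng = np.random.default_rng(4)
M, N = 15, 30
Phi = rng.standard_normal((M, N))
y = rng.standard_normal(M)
W = np.diag(rng.uniform(0.5, 2.0, N))  # diagonal weights

v = np.zeros(N)                        # prediction (illustrative)
v[[2, 9, 20]] = rng.standard_normal(3)
z = np.sign(v)                         # a valid subgradient of ||v||_1

u = -W @ z - Phi.T @ (Phi @ v - y)

g = W @ z + Phi.T @ (Phi @ v - y) + u  # subgradient at x = v
assert np.allclose(g, 0)
```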
Moving from the warm-start to the solution

min_x ‖Wx‖_1 + (1/2)‖Φx − y‖_2^2 + (1 − ε)u^T x

The optimality conditions are

Φ_Γ^T (Φx − y) + (1 − ε)u_Γ = −W_Γ sign(x_Γ)
|φ_γ^T (Φx − y) + (1 − ε)u_γ| < w_γ   for all γ ∈ Γc

We move in the direction

∂x = (Φ_Γ^T Φ_Γ)^{-1} u_Γ   on Γ
∂x = 0                      on Γc

until a component shrinks to zero or a constraint is violated, yielding a new Γ.
Streaming basis: lapped orthogonal transform (LOT)

(Plots: LOT basis windows over overlapping intervals, and a streaming signal reconstruction example.)
Streaming sparse recovery

• Observations: y_t = Φ_t x_t + e_t
• Sparse representation: x[n] = Σ_{p,k} α_{p,k} ψ_{p,k}[n]

(Diagram: measurements = measurement matrices × signal + error; the signal over the active interval is represented in LOT bases, with the LOT windows and coefficients shown.)
Streaming sparse recovery

Iteratively reconstruct the signal over a sliding (active) interval: form u from your prediction, then take ε: 0 → 1 in

min_α ‖Wα‖_1 + (1/2)‖Φ̄Ψ̃α − ỹ‖_2^2 + (1 − ε)u^T α

where Ψ̃ and ỹ account for edge effects.

(Diagram: the overlapping system matrix, sparse coefficient vector, and error; the system is divided into two parts.)
Streaming signal recovery: simulation

Observation/evolution model:

y_t = Φ_t x_t + e_t
x_{t+1} = F_t x_t + d_t

We solve

min_α Σ_t ‖W_t α_t‖_1 + (1/2)‖Φ_t Ψ_t α_t − y_t‖_2^2 + (1/2)‖F_{t−1} Ψ_{t−1} α_{t−1} − Ψ_t α_t‖_2^2
(formulation similar to Vaswani 08, Carmi et al 09, Angelosante et al 09, Zainel et al 10, Charles et al 11)
using

min_α ‖Wα‖_1 + (1/2)‖Φ̄Ψ̃α − ỹ‖_2^2 + (1/2)‖F̄Ψ̃α − q̃‖_2^2 + (1 − ε)u^T α
Dynamic signal recovery: simulation results

(Comparison of implementation platforms: digital multiply-and-accumulate versus analog vector-matrix multiplier, trading off time constant and power consumption against accuracy and dynamic range.)
Dynamical systems for sparse recovery
There are simple systems of nonlinear differential equations that settle to
the solution of
min_x λ‖x‖_1 + (1/2)‖Φx − y‖_2^2

or more generally

min_x λ Σ_{n=1}^{N} C(x[n]) + (1/2)‖Φx − y‖_2^2
(Block diagram of the LCA network: each node n has internal state u_n(t) driven by φ_n^T y, output x_n(t) = T(u_n(t)) with u_n − x_n ∈ λ ∂C(x_n), and the nodes inhibit one another through the inner products ⟨φ_i, φ_j⟩.)
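A minimal Euler simulation of these dynamics for the ℓ1 cost C(x) = |x|, where the activation T is the soft threshold; step size, problem sizes, and λ are illustrative:

```python
import numpy as np

# Euler simulation of the LCA dynamics:
#   tau * du/dt = -u + x - Phi^T (Phi x - y),   x = T_lambda(u),
# with T_lambda the soft-threshold activation for the l1 cost.
rng = np.random.default_rng(5)
M, N = 20, 40
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[[5, 17, 33]] = [1.0, -2.0, 1.5]
y = Phi @ x_true

lam, tau, dt = 0.05, 1.0, 0.1

def soft(u):
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

u = np.zeros(N)                        # start from rest
for _ in range(5000):
    x = soft(u)
    u += (dt / tau) * (-u + x - Phi.T @ (Phi @ x - y))

# At the fixed point, x satisfies the KKT conditions of the l1 problem:
x = soft(u)
assert np.max(np.abs(Phi.T @ (y - Phi @ x))) <= lam + 1e-3
```

At a fixed point, u − x = Φ^T(y − Φx), which is exactly the subgradient condition for the ℓ1 problem; the final assertion checks this numerically.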
Key questions
(The same LCA network diagram, settling from the initial state x(0) to the minimizer x*.)

min_x Σ_n C(x_n) + (1/2)‖Φx − y‖_2^2

• Uniform convergence
• Convergence speed (general)
• Convergence speed for sparse recovery via ℓ1 minimization
LCA convergence

Assumptions:
1. u − a ∈ λ∂C(a)
2. x = T_λ(u) = 0 for |u| ≤ λ, and f(u) for |u| > λ

Then ẋ(t) → 0 as t → ∞, and x(t) → x*, u(t) → u* as t → ∞.
LCA convergence

Assumptions:
1. u − a ∈ λ∂C(a)
2. x = T_λ(u) = 0 for |u| ≤ λ, and f(u) for |u| > λ
4. f(·) is subanalytic
5. f′(u) ≤ α
LCA convergence

x(t) → x*, u(t) → u* as t → ∞

(Example: Φ = [DCT I], M = 256, N = 512; convergence plot.)
Convergence: exponential (of a sort)

With x̃(t) = x(t) − x*, if αδ < 1 (where f′(u) ≤ α), then the LCA converges exponentially to a unique fixed point:

‖u(t) − u*‖_2 ≤ κ_0 e^{−(1−αδ)t/τ}

Of course, this depends on not too many nodes being active at any one time ...
Activation in proper subsets for ℓ1

If Φ is a "random compressed sensing matrix" and

M ≥ Const · S^2 log(N/S),

then for reasonably small values of λ and starting from rest,

Γ(t) ⊂ Γ*

That is, only subsets of the true support are ever active. Under the weaker condition

M ≥ Const · S log(N/S),

the active set satisfies |Γ(t)| ≤ 2|Γ*|.
M. Asif and J. Romberg, "Dynamic updating for ℓ1 minimization," IEEE Journal of Selected Topics in Signal Processing, April 2010.
M. Asif and J. Romberg, "Fast and accurate algorithms for re-weighted ℓ1-norm minimization," submitted to IEEE Transactions on Signal Processing, July 2012.
M. Asif and J. Romberg, "Sparse recovery of streaming signals using ℓ1 homotopy," submitted to IEEE Transactions on Signal Processing, June 2013.
A. Balavoine, J. Romberg, and C. Rozell, "Convergence and Rate Analysis of Neural Networks for Sparse Approximation," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 9, pp. 1377-1389, September 2012.
A. Balavoine, J. Romberg, and C. Rozell, "Convergence Speed of a Dynamical System for Sparse Recovery," to appear in IEEE Transactions on Signal Processing, late 2013.
A. Balavoine, J. Romberg, and C. Rozell, "Convergence of a neural network for sparse approximation using the nonsmooth Łojasiewicz inequality," Proc. of Int. Joint Conf. on Neural Networks, August 2013.
http://users.ece.gatech.edu/~justin/Publications.html