
A Tutorial on Compressive Sensing

Simon Foucart
Drexel University / University of Georgia

CIMPA13
New Trends in Applied Harmonic Analysis
Mar del Plata, Argentina, 5-16 August 2013

This minicourse acts as an invitation to the elegant theory of
Compressive Sensing. It aims at giving a solid overview of its
fundamental mathematical aspects. Its content is partly based on a
recent textbook coauthored with Holger Rauhut, A Mathematical
Introduction to Compressive Sensing (Birkhäuser, 2013).
Part 1: The Standard Problem and First Algorithms

The lecture introduces the question of sparse recovery, establishes
its theoretical limits, and presents an algorithm achieving these
limits in an idealized situation. In order to treat more realistic
situations, other algorithms are necessary. Basis Pursuit (that is,
$\ell_1$-minimization) and Orthogonal Matching Pursuit make their
first appearance. Their success is proved using the concept of
coherence of a matrix.
Keywords

- Sparsity: essential
- Randomness: nothing better so far (measurement process)
- Optimization: preferred, but competitive alternatives are available
  (reconstruction process)
The Standard Compressive Sensing Problem

x : unknown signal of interest in $K^N$,
y : measurement vector in $K^m$ with $m \ll N$,
s : sparsity of x, $s = \mathrm{card}\{ j \in \{1, \ldots, N\} : x_j \neq 0 \}$.

Find concrete sensing/recovery protocols, i.e., find
- measurement matrices $A : x \in K^N \mapsto y \in K^m$,
- reconstruction maps $\Delta : y \in K^m \mapsto x \in K^N$,
such that

    $\Delta(Ax) = x$ for any s-sparse vector $x \in K^N$.

In realistic situations, two issues to consider:
- Stability: x not exactly sparse but compressible,
- Robustness: measurement error in $y = Ax + e$.
A Selection of Applications

- Magnetic resonance imaging
  [Figure: left, traditional MRI reconstruction; right, compressive
  sensing reconstruction (courtesy of M. Lustig and S. Vasanawala).]
- Sampling theory
  [Figure: time-domain signal with 16 samples.]
- Error correction
- and many more...
$\ell_0$-Minimization

Since

    $\|x\|_p^p := \sum_{j=1}^{N} |x_j|^p \xrightarrow[p \to 0]{} \sum_{j=1}^{N} 1_{\{x_j \neq 0\}}$,

the notation $\|x\|_0$ [sic] has become usual for

    $\|x\|_0 := \mathrm{card}(\mathrm{supp}(x))$, where $\mathrm{supp}(x) := \{ j \in [N] : x_j \neq 0 \}$.

For an s-sparse $x \in K^N$, observe the equivalence of
- x is the unique s-sparse solution of $Az = y$ with $y = Ax$,
- x can be reconstructed as the unique solution of

    $(P_0)\qquad \underset{z \in K^N}{\mathrm{minimize}}\ \|z\|_0 \quad \text{subject to } Az = y.$

This is a combinatorial problem, NP-hard in general.
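
To make the combinatorial cost concrete, here is a minimal brute-force
sketch of $(P_0)$ (the function name and tolerance are illustrative
assumptions; real A and y are assumed): it examines every support of
size up to s_max, which is exactly the exponential search one wants to
avoid.

```python
import numpy as np
from itertools import combinations

def l0_minimize(A, y, s_max, tol=1e-10):
    """Brute-force (P0): try every support of size 1, ..., s_max
    and return the sparsest z with Az = y (up to tol), if any."""
    m, N = A.shape
    if np.linalg.norm(y) < tol:
        return np.zeros(N)                 # the 0-sparse solution
    for s in range(1, s_max + 1):
        for S in map(list, combinations(range(N), s)):
            zS = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
            if np.linalg.norm(A[:, S] @ zS - y) < tol:
                z = np.zeros(N)
                z[S] = zS
                return z                   # sparsest solution found
    return None                            # no s_max-sparse solution
```

The loop visits up to $\sum_{s \le s_{\max}} \binom{N}{s}$ supports,
which is why $(P_0)$ is intractable at scale.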


Minimal Number of Measurements

Given $A \in K^{m \times N}$, the following are equivalent:

1. Every s-sparse x is the unique s-sparse solution of $Az = Ax$,
2. $\ker A \cap \{ z \in K^N : \|z\|_0 \le 2s \} = \{0\}$,
3. For any $S \subset [N]$ with $\mathrm{card}(S) \le 2s$, the matrix $A_S$ is injective,
4. Every set of 2s columns of A is linearly independent.

As a consequence, exact recovery of every s-sparse vector forces

    $m \ge 2s$.

This can be achieved using partial Vandermonde matrices, as sketched below.
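
A quick numerical illustration of the last point (a sketch; the nodes
and sizes are arbitrary choices, not from the slides): with m = 2s rows
of powers at distinct nodes, any 2s columns form an invertible
Vandermonde block.

```python
import numpy as np

# partial Vandermonde matrix: row j holds t_1^j, ..., t_N^j for
# j = 0, ..., 2s-1, with distinct nodes t_k; any 2s of its columns
# form a square Vandermonde matrix, hence are linearly independent
N, s = 12, 3
t = np.linspace(0.1, 1.0, N)                 # any distinct nodes work
A = np.vander(t, 2 * s, increasing=True).T   # shape (2s, N) = (6, 12)
rng = np.random.default_rng(0)
S = rng.choice(N, 2 * s, replace=False)      # a random set of 2s columns
print(np.linalg.matrix_rank(A[:, S]))        # prints 6, i.e., full rank
```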


Exact s-Sparse Recovery from 2s Fourier Measurements

Identify an s-sparse $x \in C^N$ with a function x on $\{0, 1, \ldots, N-1\}$
with support S, $\mathrm{card}(S) = s$. Consider the 2s Fourier coefficients

    $\hat{x}(j) = \sum_{k=0}^{N-1} x(k)\, e^{-i 2\pi jk/N}, \qquad 0 \le j \le 2s - 1.$

Consider a trigonometric polynomial vanishing exactly on S, i.e.,

    $p(t) := \prod_{k \in S} \big( 1 - e^{-i 2\pi k/N}\, e^{i 2\pi t/N} \big).$

Since $p \cdot x \equiv 0$, discrete convolution gives

    $0 = (\hat{p} * \hat{x})(j) = \sum_{k=0}^{N-1} \hat{p}(k)\, \hat{x}(j-k), \qquad 0 \le j \le N - 1.$

Note that $\hat{p}(0) = 1$ and that $\hat{p}(k) = 0$ for $k > s$. The
equations for $j = s, \ldots, 2s - 1$ translate into a Toeplitz system
with unknowns $\hat{p}(1), \ldots, \hat{p}(s)$. This determines $\hat{p}$,
hence p, then S (as the zero set of p), and finally x (by least squares
on S).
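
The whole argument fits in a few lines of numpy. The following is a
minimal sketch assuming exact (noiseless) measurements; the Toeplitz
solve, the root search over the N-th roots of unity, and the final
least-squares step mirror the three stages above.

```python
import numpy as np

def recover_from_2s_fourier(xhat, s, N):
    """Recover an s-sparse x in C^N from its first 2s Fourier
    coefficients xhat[0..2s-1], assuming exact measurements."""
    # Toeplitz system for the annihilating filter:
    # sum_{k=1}^{s} phat(k) xhat(j-k) = -xhat(j) for j = s, ..., 2s-1
    M = np.array([[xhat[j - k] for k in range(1, s + 1)]
                  for j in range(s, 2 * s)])
    phat = np.concatenate(([1.0], np.linalg.solve(M, -xhat[s:2 * s])))
    # S is where P(z) = sum_k phat(k) z^k vanishes on the roots of unity
    z = np.exp(2j * np.pi * np.arange(N) / N)
    vals = np.polyval(phat[::-1], z)    # polyval wants leading coeff first
    S = np.sort(np.argsort(np.abs(vals))[:s])
    # recover the values on S by least squares against the measurements
    F = np.exp(-2j * np.pi * np.outer(np.arange(2 * s), S) / N)
    x = np.zeros(N, dtype=complex)
    x[S] = np.linalg.lstsq(F, xhat, rcond=None)[0]
    return x

# sanity check on a random 3-sparse signal
N, s = 64, 3
rng = np.random.default_rng(0)
x0 = np.zeros(N, dtype=complex)
x0[rng.choice(N, size=s, replace=False)] = (rng.standard_normal(s)
                                            + 1j * rng.standard_normal(s))
assert np.allclose(recover_from_2s_fourier(np.fft.fft(x0)[:2 * s], s, N), x0)
```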
Optimization and Greedy Strategies
$\ell_1$-Minimization (Basis Pursuit)

Replace $(P_0)$ by

    $(P_1)\qquad \underset{z \in K^N}{\mathrm{minimize}}\ \|z\|_1 \quad \text{subject to } Az = y.$

- Geometric intuition
- Unique $\ell_1$-minimizers are at most m-sparse (when K = R)
- Convex optimization program, hence solvable in practice
- In the real setting, recast as the linear optimization program
  (see the sketch below)

    $\underset{c, z \in R^N}{\mathrm{minimize}}\ \sum_{j=1}^{N} c_j \quad \text{subject to } Az = y \text{ and } -c_j \le z_j \le c_j.$

- In the complex setting, recast as a second-order cone program
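
Here is a minimal sketch of the linear-program recast using
scipy.optimize.linprog (the helper name basis_pursuit_lp and the test
sizes are illustrative assumptions, not from the slides).

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit_lp(A, y):
    """Solve (P1): min ||z||_1 s.t. Az = y, via the LP recast above,
    for real A and y."""
    m, N = A.shape
    # decision variables [z; c] in R^{2N}; objective is sum_j c_j
    cost = np.concatenate([np.zeros(N), np.ones(N)])
    # inequality constraints  z - c <= 0  and  -z - c <= 0
    I = np.eye(N)
    A_ub = np.block([[I, -I], [-I, -I]])
    b_ub = np.zeros(2 * N)
    # equality constraint Az = y (c does not enter it)
    A_eq = np.hstack([A, np.zeros((m, N))])
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * N + [(0, None)] * N)
    return res.x[:N]

# recover a 5-sparse vector from m = 40 Gaussian measurements
rng = np.random.default_rng(1)
N, m, s = 200, 40, 5
A = rng.standard_normal((m, N)) / np.sqrt(m)
x0 = np.zeros(N)
x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
z = basis_pursuit_lp(A, A @ x0)
print(np.max(np.abs(z - x0)))   # typically tiny: exact recovery
```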


Basis Pursuit — Null Space Property

$\Delta_1(Ax) = x$ for every vector x supported on S if and only if

    (NSP)   $\|u_S\|_1 < \|u_{\overline{S}}\|_1$ for all $u \in \ker A \setminus \{0\}$.

For real measurement matrices, the real and complex NSPs read

    $\sum_{j \in S} |u_j| < \sum_{\ell \in \overline{S}} |u_\ell|$ for all $u \in \ker_R A \setminus \{0\}$,

    $\sum_{j \in S} \sqrt{v_j^2 + w_j^2} < \sum_{\ell \in \overline{S}} \sqrt{v_\ell^2 + w_\ell^2}$ for all $(v, w) \in (\ker_R A)^2 \setminus \{0\}$.

Real and complex NSPs are in fact equivalent.


Orthogonal Matching Pursuit

Starting with $S^0 = \emptyset$ and $x^0 = 0$, iterate

    (OMP1)   $S^{n+1} = S^n \cup \{j^{n+1}\}$, where $j^{n+1} := \underset{j \in [N]}{\mathrm{argmax}}\, |(A^*(y - Ax^n))_j|$,

    (OMP2)   $x^{n+1} = \underset{z \in C^N}{\mathrm{argmin}}\ \{ \|y - Az\|_2 : \mathrm{supp}(z) \subseteq S^{n+1} \}$.

- The norm of the residual decreases according to

    $\|y - Ax^{n+1}\|_2^2 \le \|y - Ax^n\|_2^2 - |(A^*(y - Ax^n))_{j^{n+1}}|^2.$

- Every vector $x \neq 0$ supported on S, $\mathrm{card}(S) = s$, is
  recovered from $y = Ax$ after at most s iterations of OMP if and only
  if $A_S$ is injective and

    (ERC)   $\underset{j \in S}{\max}\, |(A^* r)_j| > \underset{\ell \in \overline{S}}{\max}\, |(A^* r)_\ell|$

  for all $r \neq 0$ in $\{ Az : \mathrm{supp}(z) \subseteq S \}$.
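
For concreteness, a minimal numpy sketch of (OMP1)-(OMP2), run for a
prescribed number of iterations (stopping rules and column
normalization are left to the caller):

```python
import numpy as np

def omp(A, y, n_iter):
    """Orthogonal Matching Pursuit: greedy index selection (OMP1)
    followed by least squares on the enlarged support (OMP2)."""
    N = A.shape[1]
    S = []                                  # support set S^n
    x = np.zeros(N, dtype=complex)          # iterate x^n
    for _ in range(n_iter):
        r = y - A @ x                       # current residual
        j = int(np.argmax(np.abs(A.conj().T @ r)))   # (OMP1)
        S.append(j)
        x = np.zeros(N, dtype=complex)      # (OMP2) on S^{n+1}
        x[S] = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
    return x
```

Under the Exact Recovery Condition, s iterations return x exactly; in
practice one stops once the residual norm falls below a tolerance.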
First Recovery Guarantees
Coherence

For a matrix with $\ell_2$-normalized columns $a_1, \ldots, a_N$, define

    $\mu := \underset{i \neq j}{\max}\, |\langle a_i, a_j \rangle|.$

As a rule, the smaller the coherence, the better. However, the Welch
bound reads

    $\mu \ge \sqrt{\dfrac{N - m}{m(N - 1)}}.$

- The Welch bound is achieved at, and only at, equiangular tight frames
- Deterministic matrices with coherence $\mu \le c/\sqrt{m}$ exist
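
Computing µ is a one-liner on the Gram matrix; the following sketch
also compares a random Gaussian matrix against the Welch bound (the
sizes are arbitrary choices):

```python
import numpy as np

def coherence(A):
    """Coherence of A after l2-normalizing its columns."""
    B = A / np.linalg.norm(A, axis=0)
    G = np.abs(B.conj().T @ B)       # magnitudes of inner products
    np.fill_diagonal(G, 0)           # ignore |<a_i, a_i>| = 1
    return G.max()

# a random Gaussian matrix lies above the Welch bound
rng = np.random.default_rng(2)
m, N = 60, 240
A = rng.standard_normal((m, N))
welch = np.sqrt((N - m) / (m * (N - 1)))
print(coherence(A), ">=", welch)
```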
Recovery Conditions using Coherence

- Every s-sparse $x \in C^N$ is recovered from $y = Ax \in C^m$ via at
  most s iterations of OMP provided

    $\mu < \dfrac{1}{2s - 1}.$

- Every s-sparse $x \in C^N$ is recovered from $y = Ax \in C^m$ via
  Basis Pursuit provided

    $\mu < \dfrac{1}{2s - 1}.$

- In fact, the Exact Recovery Condition can be rephrased as

    $\|A_S^{\dagger} A_{\overline{S}}\|_{1 \to 1} < 1,$

  and this implies the Null Space Property.
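
As a quick numerical check of the first condition (reusing coherence()
and omp() from the sketches above; the matrix [I | F] below is a
standard low-coherence example with $\mu = 1/\sqrt{m}$, chosen here for
illustration rather than taken from the slides):

```python
import numpy as np

# columns of the identity next to the normalized DFT: coherence 1/sqrt(m)
m = 64
F = np.fft.fft(np.eye(m)) / np.sqrt(m)
A = np.hstack([np.eye(m), F])

s = 3                                    # 1/8 < 1/(2s-1) = 1/5 holds
x0 = np.zeros(2 * m, dtype=complex)
x0[[5, 70, 100]] = [1.0, -2.0, 0.5j]     # an arbitrary 3-sparse vector
assert coherence(A) < 1 / (2 * s - 1)
assert np.allclose(omp(A, A @ x0, s), x0)   # OMP recovers x0 exactly
```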


Summary

The coherence conditions for s-sparse recovery are of the type

    $\mu \le \dfrac{c}{s},$

while the Welch bound (for $N \ge 2m$, say) reads

    $\mu \ge \dfrac{c'}{\sqrt{m}}.$

Combining the two gives $c'/\sqrt{m} \le c/s$, so arguments based on
coherence necessitate

    $m \ge C s^2.$

This is far from the ideal linear scaling of m in s...

Next, we will introduce new tools to break this quadratic barrier
(with random matrices).
