
Math 479/579 Homework #5

Due March 24th


1. (Lp Norms and Sparsity) In this problem we consider a simple version of the compressed
   sensing problem. We want to find the ~x = (x1, x2)^T in R^2 that solves the following
   problem:

       minimize    ||~x||_p^p = |x1|^p + |x2|^p
       subject to  a*x1 + b*x2 = c.                                             (1)

   (a) Suppose that all of the constants a, b and c are non-zero. Show that

           ~xa := (c/a, 0)^T    and    ~xb := (0, c/b)^T

       are the only sparse solutions to a*x1 + b*x2 = c (that is, the only solutions where
       at least one component of ~x is zero).
   (b) Assume that b > a > 0 and c > 0. Show that ~xb is the solution to (1) with p = 1 (so,
       the L1 problem gives the sparsest solution). In other words, show that

           c/b = ||~xb||_1 <= ||~x||_1 = |x1| + |x2|                            (2)

       for any vector ~x that has a*x1 + b*x2 = c. (Hint: First solve the equation
       a*x1 + b*x2 = c for x2 in terms of x1 and substitute into ||~x||_1, then directly
       check (2) in each of the three possible cases x1 <= 0, 0 < x1 < c/a and x1 >= c/a.)
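Before writing the proof, it can help to sanity-check (2) numerically. The Python sketch below (not part of the assignment) picks illustrative constants a = 1, b = 2, c = 2 satisfying b > a > 0 and c > 0, scans the constraint line, and confirms that the minimum of the L1 norm is attained at the sparse solution ~xb:

```python
import numpy as np

# Illustrative constants with b > a > 0 and c > 0 (chosen for this sketch only).
a, b, c = 1.0, 2.0, 2.0

# Parametrize the constraint line a*x1 + b*x2 = c by x1 and evaluate ||x||_1.
x1 = np.linspace(-5.0, 5.0, 100001)
x2 = (c - a * x1) / b
l1 = np.abs(x1) + np.abs(x2)

# The sparse solution xb = (0, c/b) should attain the minimum L1 norm, so the
# two printed values should agree (both equal c/b = 1 here).
print(np.min(l1))
print(c / b)
```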

   (c) By performing a similar substitution for p = 2, we obtain the unconstrained problem

           min_{x1 in R} ||~x||_2^2 = x1^2 + ((c - a*x1)/b)^2.

       Show that neither of the sparse solutions ~xa nor ~xb is a solution of this problem.
       (Hint: show that the first derivative test fails at x1 = c/a and x1 = 0.)
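As a quick numerical check of the hint (again not part of the assignment), the sketch below evaluates the derivative of the substituted objective at the two sparse candidates, with the same illustrative constants a = 1, b = 2, c = 2; both values come out non-zero, so the first derivative test fails:

```python
import numpy as np

# Illustrative constants with b > a > 0 and c > 0 (for this sketch only).
a, b, c = 1.0, 2.0, 2.0

def g(x1):
    """The substituted objective ||x||_2^2 = x1^2 + ((c - a*x1)/b)^2."""
    return x1**2 + ((c - a * x1) / b) ** 2

def dg(x1):
    """Its derivative: 2*x1 - (2*a/b^2)*(c - a*x1)."""
    return 2 * x1 - (2 * a / b**2) * (c - a * x1)

# Neither sparse candidate is a critical point: g'(0) and g'(c/a) are non-zero.
print(dg(0.0), dg(c / a))
```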
   (d) Repeat part (c) using the Lp norm

           ||~x||_p^p = |x1|^p + |(c - a*x1)/b|^p

       for any power p > 1. You may use the following formula for the derivative without
       proof:

           (|x|^p)' =   p*|x|^(p-1)   if x >= 0,
                       -p*|x|^(p-1)   if x < 0.

       In other words, if p >= 1 then only the L1 norm will give us sparse solutions.


   (e) If p < 1, show that the Lp norm

           ||~x||_p^p := |x1|^p + |x2|^p

       is not convex. (Hint: By the third homework, it is enough to show that the function
       f(x1) := |x1|^p can have a negative second derivative at some point x1 > 0.) When
       combined with part (d), this shows that p = 1 is the only possible choice if we want
       to have a convex minimization that gives sparse solutions.
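The hint can also be checked numerically. The sketch below (not part of the assignment) estimates the second derivative of f(x1) = |x1|^p by a central finite difference at the illustrative point p = 1/2, x1 = 1, and compares it with the exact value p*(p-1)*x1^(p-2); both are negative, consistent with non-convexity:

```python
# For 0 < p < 1 the function f(x1) = |x1|^p has f''(x1) = p*(p-1)*x1^(p-2) < 0
# for x1 > 0. A quick finite-difference check at p = 1/2 and x1 = 1 (values
# chosen for this sketch only):
p, x1, h = 0.5, 1.0, 1e-4

f = lambda x: abs(x) ** p
second_diff = (f(x1 + h) - 2 * f(x1) + f(x1 - h)) / h**2

print(second_diff)                  # finite-difference estimate of f''(1)
print(p * (p - 1) * x1 ** (p - 2))  # exact value, -0.25, which is negative
```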

2. (Coding I) In this exercise we finally perform classification with more than two classes.
   (a) By following the procedure outlined in the sixth tutorial, perform a three-way
       classification of the 4s, 7s and 9s by using MNIST SMALL as your training set. Use
       the Gaussian kernel

           M(~x, ~y) = e^(-||~x - ~y||^2 / (2*sigma^2))

       to form the M and N matrices, with the same parameter sigma = 5 from the last
       assignment. Use C = 7.5 in your kernel SVM function.
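The course code is MATLAB, but the kernel computation itself is easy to sanity-check. Below is a NumPy sketch of the same Gaussian kernel matrix; the function name and the vectorized norm expansion are ours, not the tutorial's:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=5.0):
    """Gaussian kernel matrix K[i, j] = exp(-||X[i] - Y[j]||^2 / (2*sigma^2)).

    X is (n, d) and Y is (m, d); the result is (n, m). The expansion
    ||x - y||^2 = ||x||^2 - 2<x, y> + ||y||^2 avoids an explicit double loop.
    """
    sq = (
        np.sum(X**2, axis=1)[:, None]
        - 2.0 * X @ Y.T
        + np.sum(Y**2, axis=1)[None, :]
    )
    return np.exp(-sq / (2.0 * sigma**2))

# Tiny smoke test on random data: the diagonal of K(X, X) should be all ones.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
K = gaussian_kernel(X, X, sigma=5.0)
print(np.diag(K))
```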

The remaining portions of this exercise illustrate how to use kernel SVM for simple voice
recognition:
   (b) Use the commands

           classes = [1,5,9,15,21];
           [voice,lab] = loadISOLET(classes,'train');
           [voice_t,lab_t] = loadISOLET(classes,'test');

to load the vowels A,E,I,O,U from the training portion and the testing portion of the
ISOLET data set.
   (c) Use the Gaussian kernel

           M(~x, ~y) = e^(-||~x - ~y||^2 / (2*sigma^2))

       with sigma = 9.0 to form the M and N matrices needed for kernel SVM.

(d) For each value of C = .01, .1, 1, 10, 100, use these M and N matrices together with the
multiclass kernel SVM code above to classify each vocal recording in the test set as one
of the vowels A,E,I,O,U.
(e) For each value of C = .01, .1, 1, 10, 100, use the ACCURACY.m file to report the corresponding classification accuracy of your code.
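ACCURACY.m is a course file and its exact interface is not shown here, but the quantity it reports is presumably the fraction of predicted labels that match the true test labels. A minimal Python stand-in of that idea (our own, hypothetical version):

```python
import numpy as np

def accuracy(pred, labels):
    """Fraction of entries where the predicted label equals the true label."""
    pred = np.asarray(pred)
    labels = np.asarray(labels)
    return np.mean(pred == labels)

# Example: 3 of 4 predictions match the true labels, so accuracy is 0.75.
print(accuracy([1, 5, 9, 5], [1, 5, 9, 9]))
```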
(f) Use the commands
1
2
3

classes = [1,5,9,15,21,12,14,18,19,20];
[voice,lab] = loadISOLET(classes,'train');
[voice t,lab t] = loadISOLET(classes,'test');

to load the vowels A,E,I,O,U along with the Wheel of Fortune consonants R,S,T,L,N
from the training portion and the testing portion of the ISOLET data set. Then repeat parts (c,d,e) above for these ten classes. Report accuracy for each value of C =
.01, .1, 1, 10, 100 as before. Roughly speaking, this illustrates the (unsurprising) fact that
the accuracy of our classification algorithm will get worse if we have more classes. (You
can try all of the letters, too, but have fun waiting).
3. (Coding II) In this exercise we will use LASSO.m to perform a few compressed sensing
   experiments. The goal is to demonstrate the relationship between the number of
   measurements m we must take and the level s of sparsity in order to achieve exact
   recovery.
(a) By following the procedure outlined in the seventh tutorial, turn the SVMClassify.m
function into a function LASSO.m that implements the compressed sensing minimization
algorithm.
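The tutorial builds LASSO.m out of SVMClassify.m, and that code is not reproduced here. As a point of reference only, the following is a generic Python sketch of the underlying L1 minimization, solved by ISTA (proximal gradient); the function name, signature, and algorithm choice are ours, not the course's:

```python
import numpy as np

def lasso_ista(A, b, lam, tol=1e-6, max_iter=5000):
    """Minimize (1/2)*||A x - b||^2 + lam*||x||_1 by ISTA (proximal gradient).

    A generic sketch of the LASSO minimization, not the course's LASSO.m.
    """
    # Step size 1/L, where L = ||A||_2^2 bounds the gradient's Lipschitz constant.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(max_iter):
        grad = A.T @ (A @ x - b)
        z = x - grad / L
        # Soft-thresholding: the proximal operator of (lam/L)*||.||_1.
        x_new = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
        if np.linalg.norm(x_new - x) <= tol * max(np.linalg.norm(x), 1.0):
            return x_new
        x = x_new
    return x

# Tiny closed-form check: with A = I the LASSO solution is the soft-thresholding
# of b, so b = (3, -0.5, 0) with lam = 1 should give (2, 0, 0).
print(lasso_ista(np.eye(3), np.array([3.0, -0.5, 0.0]), lam=1.0))
```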

   (b) Use the file CSData.m (on BeachBoard) and the command

           [A,b,x_ex] = CSData(25,500,2); %m = 25, n = 500, s = 2

       to generate synthetic compressed sensing data, then use LASSO.m with lambda = .0001
       and TOL = .000001, i.e. the command

           [x_est] = LASSO(A,b,.0001);

       to perform the recovery process. Was the recovery process a success? In other words,
       was

           norm(x_ex-x_est)/norm(x_ex)

       smaller than .01?


(c) Repeat part (b) nine more times using exactly the same parameters (m, n, s) = (25, 500, 2),
and count the total number of successes your algorithm achieves across all 10 independent
runs.
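To see why counting successes is interesting, note that a non-sparse recovery method fails this test badly. The sketch below uses a hypothetical Python stand-in for CSData.m and "recovers" with the minimum-norm least-squares solution, which is dense and therefore should fail essentially every trial; this is exactly the point of problem 1 and the reason LASSO is used instead:

```python
import numpy as np

def cs_data(m, n, s, rng):
    """Stand-in for CSData.m: random Gaussian A (m x n) and an s-sparse x_ex."""
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x_ex = np.zeros(n)
    x_ex[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
    return A, A @ x_ex, x_ex

def is_success(x_ex, x_est):
    """The success criterion from part (b): relative error below .01."""
    return np.linalg.norm(x_ex - x_est) / np.linalg.norm(x_ex) < 0.01

# Count successes over 10 independent trials, recovering with the (dense)
# minimum-norm least-squares solution instead of LASSO; expect 0 successes.
rng = np.random.default_rng(0)
successes = 0
for _ in range(10):
    A, b, x_ex = cs_data(25, 500, 2, rng)
    x_est = np.linalg.pinv(A) @ b
    successes += int(is_success(x_ex, x_est))
print(successes, "successes out of 10")
```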
   (d) Repeat parts (b) and (c) for (m, n, s) = (25, 500, 4) and (m, n, s) = (25, 500, 10)
       (so, ten independent trials for each value of s = 2, 4, 10 with 25 measurements).
       Report the number of successes (out of 10) that you achieve at each of these three
       levels of sparsity.
   (e) Finally, increase the number of measurements to m = 100 and use s = 10, 22 and
       s = 30 in your experiments. Report the number of successes (out of 10) that you
       achieve at each of these three levels of sparsity for m = 100 measurements.