
CS205 Homework #3 Solutions

Problem 1
Consider an $n \times n$ matrix $A$.
1. Show that if $A$ has distinct eigenvalues, all the corresponding eigenvectors are linearly independent.
2. Show that if $A$ has a full set of eigenvectors (i.e. any eigenvalue with multiplicity $k$ has $k$ corresponding linearly independent eigenvectors), it can be written as $A = Q\Lambda Q^{-1}$, where $\Lambda$ is a diagonal matrix of $A$'s eigenvalues and $Q$'s columns are $A$'s eigenvectors. Hint: show that $AQ = Q\Lambda$ and that $Q$ is invertible.
3. If $A$ is symmetric, show that any two eigenvectors corresponding to different eigenvalues are orthogonal.
4. If $A$ is symmetric, show that it has a full set of eigenvectors. Hint: If $(\lambda, q)$ is an eigenvalue/eigenvector pair ($q$ normalized) and $\lambda$ is of multiplicity $k > 1$, show that $A - \lambda qq^T$ has an eigenvalue of $\lambda$ with multiplicity $k - 1$. To show that, consider the Householder matrix $H$ such that $Hq = e_1$ and note that $HAH^{-1} = HAH$ and $A$ are similar.
5. If $A$ is symmetric, show that it can be written as $A = Q\Lambda Q^T$ for an orthogonal matrix $Q$. (You may use the result of (4) even if you didn't prove it.)
Solution
1. Assume that the eigenvectors $q_1, q_2, \ldots, q_n$ are not linearly independent. Therefore, among these eigenvectors there is one (say $q_1$) which can be written as a linear combination of some of the others (say $q_2, \ldots, q_k$) where $k \le n$. Without loss of generality, we can assume that $q_2, \ldots, q_k$ are linearly independent (we can keep removing vectors from a linearly dependent set until it becomes independent); therefore the decomposition of $q_1$ into a linear combination $q_1 = \sum_{i=2}^{k} \alpha_i q_i$ is unique. However,

$$q_1 = \sum_{i=2}^{k} \alpha_i q_i \;\Rightarrow\; Aq_1 = A\left(\sum_{i=2}^{k} \alpha_i q_i\right) \;\Rightarrow\; \lambda_1 q_1 = \sum_{i=2}^{k} \alpha_i \lambda_i q_i \;\Rightarrow\; q_1 = \sum_{i=2}^{k} \frac{\alpha_i \lambda_i}{\lambda_1} q_i$$

In the last step above, we assumed that $\lambda_1 \ne 0$. This is valid, since otherwise the same equation would indicate that there is a linear combination of $q_2, \ldots, q_k$ that is equal to zero (which contradicts their linear independence). From the last equation above and the uniqueness of the decomposition of $q_1$ into a linear combination of $q_2, \ldots, q_k$ we have $\frac{\alpha_i \lambda_i}{\lambda_1} = \alpha_i$ for $i = 2, \ldots, k$. There must be an $\alpha_{i_0} \ne 0$ with $2 \le i_0 \le k$, otherwise we would have $q_1 = 0$. Thus $\lambda_1 = \lambda_{i_0}$, which is a contradiction.
2. Since $A$ has a full set of eigenvectors, we can write $Aq_i = \lambda_i q_i$, $i = 1, \ldots, n$. We can collect all these equations into the matrix equation $AQ = Q\Lambda$, where $Q$ has the $n$ eigenvectors as columns, and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. From (1) we have that the columns of $Q$ are linearly independent, therefore $Q$ is invertible. Thus $AQ = Q\Lambda \Rightarrow A = Q\Lambda Q^{-1}$.
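As a quick numerical sanity check (not part of the original solution; the NumPy snippet below is only an illustrative sketch), we can verify $AQ = Q\Lambda$ and $A = Q\Lambda Q^{-1}$ for a random matrix, which almost surely has distinct eigenvalues:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))        # a generic random matrix has distinct eigenvalues

lam, Q = np.linalg.eig(A)              # eigenvalues lam, eigenvectors as columns of Q
L = np.diag(lam)

print(np.allclose(A @ Q, Q @ L))                   # AQ = Q*Lambda
print(np.allclose(A, Q @ L @ np.linalg.inv(Q)))    # A  = Q*Lambda*Q^{-1}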
3. If $\lambda_1, \lambda_2$ are two distinct eigenvalues and $q_1, q_2$ are their corresponding eigenvectors (there could be several eigenvectors per eigenvalue, if they have multiplicity higher than 1), then

$$q_1^T A q_2 = q_1^T (A q_2) = \lambda_2 (q_1^T q_2)$$
$$q_1^T A q_2 = q_1^T A^T q_2 = (A q_1)^T q_2 = \lambda_1 (q_1^T q_2)$$
$$\Rightarrow\; \lambda_1 (q_1^T q_2) = \lambda_2 (q_1^T q_2) \;\Rightarrow\; (\lambda_1 - \lambda_2)(q_1^T q_2) = 0$$

and since $\lambda_1 \ne \lambda_2$, it follows that $q_1^T q_2 = 0$.
4. In the review session we saw that for a symmetric matrix $A$, if $\lambda$ is a multiple eigenvalue and $q$ one of its associated (normalized) eigenvectors, then the matrix $A - \lambda qq^T$ reduces the multiplicity of $\lambda$ by 1, moving $q$ from the space of eigenvectors of $\lambda$ into the nullspace (or the space of eigenvectors of zero).

Assume that the eigenvalue $\lambda$ with multiplicity $k$ has only $l$ linearly independent eigenvectors, where $l < k$, say $q_1, q_2, \ldots, q_l$. Then, using the previous result, the matrix $A - \lambda \sum_{i=1}^{l} q_i q_i^T$ has $\lambda$ as an eigenvalue with multiplicity $k - l$, and the vectors $q_1, q_2, \ldots, q_l$ belong to the set of eigenvectors associated with the zero eigenvalue. Since $\lambda$ is an eigenvalue of this matrix, there must be a vector $q$ such that $\left(A - \lambda \sum_{i=1}^{l} q_i q_i^T\right) q = \lambda q$. Furthermore, $q$ must be orthogonal to all the vectors $q_1, q_2, \ldots, q_l$, since they are eigenvectors of a different eigenvalue (the zero eigenvalue). Therefore

$$\left(A - \lambda \sum_{i=1}^{l} q_i q_i^T\right) q = \lambda q \;\Rightarrow\; Aq - \lambda \sum_{i=1}^{l} q_i (q_i^T q) = \lambda q \;\Rightarrow\; Aq = \lambda q$$

This means that $q$ is an eigenvector of the original matrix $A$ associated with $\lambda$ and is orthogonal to all of $q_1, q_2, \ldots, q_l$, which contradicts our hypothesis that there exist only $l < k$ linearly independent eigenvectors.
5. From (4) we have that $A$ has a full set of eigenvectors. We can assume that the eigenvectors of that set corresponding to the same eigenvalue are orthogonal to each other (we can use the Gram-Schmidt algorithm to make an orthogonal set from a linearly independent set). We also know from (3) that eigenvectors corresponding to different eigenvalues are orthogonal. Therefore, we can find a full, orthonormal set of eigenvectors for $A$. Under that premise, the matrix $Q$ containing all the eigenvectors as its columns is orthogonal, that is $Q^T Q = I \Rightarrow Q^{-1} = Q^T$. Finally, using (2) we have

$$A = Q\Lambda Q^{-1} = Q\Lambda Q^T$$
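As an illustrative aside (again not part of the original solution), numpy.linalg.eigh returns an orthonormal set of eigenvectors for a symmetric matrix, so the factorization $A = Q\Lambda Q^T$ can be checked directly:

import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
A = (B + B.T) / 2                       # a random symmetric matrix

lam, Q = np.linalg.eigh(A)              # orthonormal eigenvectors, real eigenvalues
print(np.allclose(Q.T @ Q, np.eye(6)))             # Q is orthogonal
print(np.allclose(A, Q @ np.diag(lam) @ Q.T))      # A = Q*Lambda*Q^T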
Problem 2
[Adapted from Heath 4.23] Let $A$ be an $n \times n$ matrix with eigenvalues $\lambda_1, \ldots, \lambda_n$. Recall that these are the roots of the characteristic polynomial of $A$, defined as $f(\lambda) \triangleq \det(A - \lambda I)$. Also, we define the multiplicity of an eigenvalue to be its degree as a root of the characteristic polynomial.
1. Show that the determinant of $A$ is equal to the product of its eigenvalues, i.e. $\det(A) = \prod_{j=1}^{n} \lambda_j$.
2. The trace of a matrix is defined to be the sum of its diagonal entries, i.e., $\mathrm{trace}(A) = \sum_{j=1}^{n} a_{jj}$. Show that the trace of $A$ is equal to the sum of its eigenvalues, i.e. $\mathrm{trace}(A) = \sum_{j=1}^{n} \lambda_j$.
3. Recall a matrix $B$ is similar to $A$ if $B = T^{-1}AT$ for a non-singular matrix $T$. Show that two similar matrices have the same trace and determinant.
4. Consider the matrix

$$A = \begin{pmatrix} -\frac{a_{m-1}}{a_m} & -\frac{a_{m-2}}{a_m} & \cdots & -\frac{a_1}{a_m} & -\frac{a_0}{a_m} \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}$$

What is $A$'s characteristic polynomial? Describe how you can use the power method to find the largest root (in magnitude) of an arbitrary polynomial.
Solution
1. Since $(-1)^n$ is the coefficient of the highest order term and $\lambda_1, \ldots, \lambda_n$ are the roots of $f$, we can write $f(\lambda) = (-1)^n (\lambda - \lambda_1) \cdots (\lambda - \lambda_n)$. If we evaluate the characteristic polynomial at zero we get $f(0) = \det(A - 0 \cdot I) = \det A$. Also:

$$\det A = f(0) = (-1)^n (0 - \lambda_1) \cdots (0 - \lambda_n) = (-1)^{2n} \prod_{i=1}^{n} \lambda_i = \prod_{i=1}^{n} \lambda_i$$
2. Consider the cofactor expansion of $\det(A - \lambda I)$, which gives a sum of terms, each of which is a product of $n$ factors comprising one entry from each row and each column. Consider the term containing all the diagonal entries, $(a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda)$. The coefficient of $\lambda^{n-1}$ in this term is $(-1)^n \left(-\sum_{i=1}^{n} a_{ii}\right) = (-1)^{n+1}\,\mathrm{trace}(A)$. Observe that this is the only term that contributes to order $\lambda^{n-1}$ (if you don't use the diagonal entry for some row or column in the determinant expansion, you lose at least two factors of $\lambda$, so you can't get up to order $\lambda^{n-1}$). Thus the coefficient of the $\lambda^{n-1}$ term of $f(\lambda)$ is $(-1)^{n+1}\,\mathrm{trace}(A)$. On the other hand, $f(\lambda) = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n)$, whose $\lambda^{n-1}$ coefficient is $(-1)^n \left(-\sum_{i=1}^{n} \lambda_i\right) = (-1)^{n+1} \sum_{i=1}^{n} \lambda_i$. Equating the two coefficients gives $\mathrm{trace}(A) = \sum_{i=1}^{n} \lambda_i$.
3. Two matrices that are similar have the same eigenvalues. Thus the formulas given above for the trace and the determinant, which depend only on the eigenvalues, yield the same values. Alternatively, recall that $\det(AB) = \det A \det B$, so

$$\det(B^{-1}AB) = (\det B^{-1})(\det A)(\det B) = (\det A)\det(B^{-1}B) = \det A \cdot \det I = \det A$$

And $\mathrm{trace}(AB) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ji} = \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} a_{ji} = \mathrm{trace}(BA)$. Thus we have

$$\mathrm{trace}((B^{-1}A)B) = \mathrm{trace}(B B^{-1} A) = \mathrm{trace}(A)$$
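The three facts above are easy to confirm numerically; the following sketch (illustrative only, not part of the original solution) checks $\det(A) = \prod_i \lambda_i$, $\mathrm{trace}(A) = \sum_i \lambda_i$, and their invariance under similarity:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
lam = np.linalg.eigvals(A)

print(np.isclose(np.linalg.det(A), np.prod(lam)))     # det(A) = product of eigenvalues
print(np.isclose(np.trace(A), np.sum(lam)))           # trace(A) = sum of eigenvalues

T = rng.standard_normal((5, 5))                       # a generic T is invertible
B = np.linalg.inv(T) @ A @ T                          # B is similar to A
print(np.isclose(np.trace(B), np.trace(A)))
print(np.isclose(np.linalg.det(B), np.linalg.det(A)))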
4. We can use the cofactor expansion of $\det(A - \lambda I)$ along the top row. Expanding about $i = 1, j = 1$ we get the term $\left(-\frac{a_{m-1}}{a_m} - \lambda\right)(-\lambda)^{m-1}$, because the submatrix $(A - \lambda I)(2\!:\!m, 2\!:\!m)$ is triangular with $-\lambda$ on the diagonal. Then for $i = 1, j = k$ with $2 \le k \le m$ we get the term $(-1)^{1+k}\left(-\frac{a_{m-k}}{a_m}\right)(-\lambda)^{m-k} = (-1)^m \frac{a_{m-k}}{a_m}\lambda^{m-k}$, because the corresponding submatrix is block triangular with $m-k$ diagonal entries equal to $-\lambda$ and $k-1$ equal to $1$. Collecting terms gives the equation

$$\det(A - \lambda I) = (-1)^m \left( \lambda^m + \frac{a_{m-1}}{a_m} \lambda^{m-1} + \sum_{k=2}^{m} \frac{a_{m-k}}{a_m} \lambda^{m-k} \right) = 0$$

The roots of this polynomial are unchanged if we multiply it by $(-1)^m a_m$, which gives

$$a_m \lambda^m + a_{m-1} \lambda^{m-1} + \cdots + a_1 \lambda + a_0 = 0$$

so the characteristic polynomial of $A$ is (up to a nonzero scalar factor) the original polynomial. To find the roots of a polynomial, simply form the matrix given in the problem statement: the eigenvalues of this matrix are the roots of your polynomial. Use the power method to get the first eigenvalue, which is also the largest root in magnitude. Then use deflation to form a new matrix and use the power method again to extract the second eigenvalue (and root). After $m$ runs of the power method all the roots are recovered.
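A minimal sketch of this procedure (the helper names companion and power_method are illustrative, not from the assignment) builds the matrix above for a sample polynomial and runs the power method to recover its largest root:

import numpy as np

def companion(coeffs):
    # coeffs = [a_m, a_{m-1}, ..., a_1, a_0] with a_m != 0
    a = np.asarray(coeffs, dtype=float)
    m = len(a) - 1
    C = np.zeros((m, m))
    C[0, :] = -a[1:] / a[0]             # first row: -a_{m-1}/a_m, ..., -a_0/a_m
    C[1:, :-1] = np.eye(m - 1)          # ones on the subdiagonal
    return C

def power_method(A, iters=500):
    x = np.random.default_rng(3).standard_normal(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x /= np.linalg.norm(x)          # normalize to avoid overflow
    return x @ A @ x                    # Rayleigh quotient estimate of the dominant eigenvalue

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3); the largest root is 3
C = companion([1.0, -6.0, 11.0, -6.0])
print(power_method(C))                  # approximately 3.0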
Problem 3
Let $A$ be an $m \times n$ matrix and $A = U\Sigma V^T$ its singular value decomposition.
1. Show that $\|A\|_2 = \|\Sigma\|_2$.
2. Show that if $m \ge n$ then for all $x \in \mathbb{R}^n$

$$\sigma_{\min} \le \frac{\|Ax\|_2}{\|x\|_2} \le \sigma_{\max}$$

where $\sigma_{\min}, \sigma_{\max}$ are the smallest and largest singular values, respectively.
3. The Frobenius norm of a matrix $A$ is defined as $\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}$. Show that $\|A\|_F = \sqrt{\sum_{i=1}^{p} \sigma_i^2}$ where $p = \min\{m, n\}$.
4. If $A$ is an $m \times n$ matrix and $b$ is an $m$-vector, prove that the solution $x$ of minimum Euclidean norm to the least squares problem $Ax \cong b$ is given by

$$x = \sum_{\sigma_i \ne 0} \frac{u_i^T b}{\sigma_i} v_i$$

where $u_i, v_i$ are the columns of $U$ and $V$, respectively.
5. Show that the columns of $U$ corresponding to non-zero singular values form an orthogonal basis of the column space of $A$. What space do the columns of $U$ corresponding to zero singular values span?
Solution
1. If $Q$ is an orthogonal matrix, then

$$\|Qx\|_2^2 = (Qx)^T(Qx) = x^T Q^T Q x = x^T x = \|x\|_2^2 \;\Rightarrow\; \|Qx\|_2 = \|x\|_2$$

Consequently,

$$\frac{\|Ax\|_2}{\|x\|_2} = \frac{\|U\Sigma V^T x\|_2}{\|x\|_2} = \frac{\|\Sigma V^T x\|_2}{\|x\|_2} = \frac{\|\Sigma V^T x\|_2}{\|V^T x\|_2} = \frac{\|\Sigma y\|_2}{\|y\|_2} \qquad (1)$$

where $y = V^T x$. Since $V^T$ is a nonsingular matrix, as $x$ ranges over the entire space $\mathbb{R}^n \setminus \{0\}$, so does $y$. Therefore

$$\max_{x \ne 0} \frac{\|Ax\|_2}{\|x\|_2} = \max_{y \ne 0} \frac{\|\Sigma y\|_2}{\|y\|_2} \;\Rightarrow\; \|A\|_2 = \|\Sigma\|_2$$
2. Based on (1) we also have

$$\min_{x \ne 0} \frac{\|Ax\|_2}{\|x\|_2} = \min_{y \ne 0} \frac{\|\Sigma y\|_2}{\|y\|_2}$$

Therefore

$$\min_{y \ne 0} \frac{\|\Sigma y\|_2}{\|y\|_2} \;\le\; \frac{\|Ax\|_2}{\|x\|_2} \;\le\; \max_{y \ne 0} \frac{\|\Sigma y\|_2}{\|y\|_2}$$

However,

$$\frac{\|\Sigma y\|_2^2}{\|y\|_2^2} = \frac{\sum_{i=1}^{n} \sigma_i^2 y_i^2}{\sum_{i=1}^{n} y_i^2} \le \frac{\sum_{i=1}^{n} \sigma_{\max}^2 y_i^2}{\sum_{i=1}^{n} y_i^2} = \sigma_{\max}^2 \;\Rightarrow\; \frac{\|\Sigma y\|_2}{\|y\|_2} \le \sigma_{\max}$$

$$\frac{\|\Sigma y\|_2^2}{\|y\|_2^2} = \frac{\sum_{i=1}^{n} \sigma_i^2 y_i^2}{\sum_{i=1}^{n} y_i^2} \ge \frac{\sum_{i=1}^{n} \sigma_{\min}^2 y_i^2}{\sum_{i=1}^{n} y_i^2} = \sigma_{\min}^2 \;\Rightarrow\; \frac{\|\Sigma y\|_2}{\|y\|_2} \ge \sigma_{\min}$$

Furthermore, if $\sigma_{\min} = \sigma_i$ and $\sigma_{\max} = \sigma_j$, then

$$\frac{\|\Sigma e_i\|_2}{\|e_i\|_2} = \sigma_i = \sigma_{\min}, \qquad \frac{\|\Sigma e_j\|_2}{\|e_j\|_2} = \sigma_j = \sigma_{\max}$$

Therefore

$$\sigma_{\min} \le \frac{\|Ax\|_2}{\|x\|_2} \le \sigma_{\max}$$

and equality is actually attainable on either side.
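As an illustrative numerical check (not part of the original solution): $\|A\|_2$ agrees with the largest singular value, and the ratio $\|Ax\|_2/\|x\|_2$ stays between $\sigma_{\min}$ and $\sigma_{\max}$:

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 5))                  # m = 8 >= n = 5
s = np.linalg.svd(A, compute_uv=False)           # singular values in descending order

print(np.isclose(np.linalg.norm(A, 2), s[0]))    # ||A||_2 = ||Sigma||_2 = sigma_max

x = rng.standard_normal(5)
ratio = np.linalg.norm(A @ x) / np.linalg.norm(x)
print(s[-1] <= ratio <= s[0])                    # sigma_min <= ||Ax||_2/||x||_2 <= sigma_max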
3. We have that $a_{ij} = [A]_{ij} = [U\Sigma V^T]_{ij} = \sum_k \sigma_k u_{ik} v_{jk}$. Therefore

$$\|A\|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} \left( \sum_{k=1}^{p} \sigma_k u_{ik} v_{jk} \right) \left( \sum_{l=1}^{p} \sigma_l u_{il} v_{jl} \right) = \sum_{k,l=1}^{p} \sigma_k \sigma_l \left( \sum_{i=1}^{m} u_{ik} u_{il} \right) \left( \sum_{j=1}^{n} v_{jk} v_{jl} \right)$$

However, $\sum_{i=1}^{m} u_{ik} u_{il}$ is the dot product between the $k$-th and $l$-th columns of $U$ and therefore equals $1$ when $k = l$ and $0$ otherwise. Similarly, $\sum_{j=1}^{n} v_{jk} v_{jl}$ is the dot product between the $k$-th and $l$-th columns of $V$ and therefore equals $1$ when $k = l$ and $0$ otherwise. Thus

$$\|A\|_F^2 = \sum_{k,l=1}^{p} \sigma_k \sigma_l \,\delta_{kl}\,\delta_{kl} = \sum_{k=1}^{p} \sigma_k^2 \;\Rightarrow\; \|A\|_F = \sqrt{\sum_{k=1}^{p} \sigma_k^2}$$

where $\delta_{ij} = 1$ if $i = j$ and $0$ otherwise.
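This identity is also easy to confirm numerically (illustrative sketch only, not part of the original solution):

import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((7, 4))
s = np.linalg.svd(A, compute_uv=False)

print(np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2))))   # ||A||_F = sqrt(sum sigma_i^2)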
4. Any least squares solution must satisfy the normal equations. It might not be unique, though, if $A$ does not have full column rank, in which case we will show that the given formula provides one of the least squares solutions (they all lead to the same minimum for the residual). We can rewrite the normal equations as

$$A^T A x_0 = A^T b \;\Rightarrow\; V \Sigma^T U^T U \Sigma V^T x_0 = V \Sigma^T U^T b \;\Rightarrow\; \Sigma^T \Sigma V^T x_0 = \Sigma^T U^T b$$

We can show that $x = \sum_{\sigma_i \ne 0} \frac{u_i^T b}{\sigma_i} v_i$ satisfies the last equality as follows:

$$\Sigma^T \Sigma V^T x = \Sigma^T \Sigma V^T \left( \sum_{\sigma_i \ne 0} \frac{u_i^T b}{\sigma_i} v_i \right) = \sum_{\sigma_i \ne 0} \frac{u_i^T b}{\sigma_i} \left( \Sigma^T \Sigma V^T v_i \right) = \sum_{\sigma_i \ne 0} \frac{u_i^T b}{\sigma_i} \,\sigma_i^2 e_i = \sum_{\sigma_i \ne 0} \sigma_i \left( u_i^T b \right) e_i = \sum_{\text{all } i} \sigma_i \left( u_i^T b \right) e_i = \Sigma^T U^T b$$

(the terms with $\sigma_i = 0$ contribute nothing, so they can be added back in without changing the sum). Finally, any other least squares solution differs from $x$ by a vector in the nullspace of $A$, which is spanned by the $v_i$ with $\sigma_i = 0$; since $x$ is a combination of the $v_i$ with $\sigma_i \ne 0$, it is orthogonal to that nullspace, and by the Pythagorean theorem $x$ is the least squares solution of minimum Euclidean norm.
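The formula can be checked numerically; the sketch below (illustrative only, with an arbitrarily chosen rank-deficient test matrix) compares it against numpy.linalg.lstsq, which also returns the minimum-norm least squares solution:

import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((8, 5))
A[:, 4] = A[:, 0] + A[:, 1]              # force a rank-deficient A (rank 4)
b = rng.standard_normal(8)

U, s, Vt = np.linalg.svd(A)              # full SVD: U is 8x8, s has length 5, Vt is 5x5
x = np.zeros(5)
for i in range(len(s)):
    if s[i] > 1e-10:                     # skip (numerically) zero singular values
        x += (U[:, i] @ b / s[i]) * Vt[i, :]

x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_ref))             # both give the minimum-norm least squares solution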
5. Using the singular value decomposition we can write

$$A = \sum_{i=1}^{p} \sigma_i u_i v_i^T$$

where $\sigma_1, \sigma_2, \ldots, \sigma_p$, $p \le \min(m, n)$, are the nonzero singular values of $A$. Let $x \in \mathbb{R}^n$ be an arbitrary vector. Then

$$Ax = \left( \sum_{i=1}^{p} \sigma_i u_i v_i^T \right) x = \sum_{i=1}^{p} \left( \sigma_i v_i^T x \right) u_i$$

Therefore, any vector in the column space of $A$ is contained in $\mathrm{span}\{u_1, u_2, \ldots, u_p\}$. Additionally, each of $u_1, u_2, \ldots, u_p$ is in the column space of $A$, since $u_k = \frac{1}{\sigma_k} A v_k$. Therefore, those two vector spaces coincide, and since the $u_i$ are orthonormal they form an orthogonal basis of the column space of $A$.

The columns of $U$ corresponding to zero singular values span the orthogonal complement of the space spanned by $u_1, u_2, \ldots, u_p$. That is, they span the nullspace of $A^T$.
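A small numerical illustration of this last point (not part of the original solution, with arbitrarily chosen dimensions): the columns of $U$ with nonzero singular values span the same space as the columns of $A$, while the remaining columns of $U$ are annihilated by $A^T$:

import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 3))          # full column rank, so 3 nonzero singular values
U, s, Vt = np.linalg.svd(A)

Ur, U0 = U[:, :3], U[:, 3:]              # columns for nonzero / zero singular values
print(np.linalg.matrix_rank(np.hstack([A, Ur])) == 3)   # Ur adds nothing beyond col(A)
print(np.allclose(A.T @ U0, 0))                          # U0 lies in null(A^T)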