
Matrix Theory, Math6304

Lecture Notes from October 11, 2012


taken by Da Zheng

4 Variational characterization of eigenvalues, continued


We recall from last class that given a Hermitian matrix, we can obtain its largest (resp. smallest)
eigenvalue by maximizing (resp. minimizing) the corresponding quadratic form over all the unit
vectors. In fact, due to the following theorem by Courant and Fischer, we can obtain any
eigenvalue of a Hermitian matrix through the "min-max" or "max-min" formula.
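As a quick numerical illustration of the recalled fact (not part of the original notes; the matrix size, sample count, and seed below are arbitrary choices), one can check with NumPy that Rayleigh quotients of random vectors never escape the interval between the smallest and largest eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random 4x4 Hermitian matrix (arbitrary illustrative choice).
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2
eigvals = np.linalg.eigvalsh(A)  # eigenvalues in increasing order

def rayleigh(A, x):
    """Rayleigh quotient <Ax, x> / ||x||^2 (real for Hermitian A)."""
    return (x.conj() @ A @ x).real / (x.conj() @ x).real

# Sample many random nonzero vectors; their quotients stay inside
# [lambda_1, lambda_n], as maximizing/minimizing over unit vectors predicts.
quotients = [rayleigh(A, rng.standard_normal(4) + 1j * rng.standard_normal(4))
             for _ in range(1000)]
assert eigvals[0] <= min(quotients) + 1e-9
assert max(quotients) <= eigvals[-1] + 1e-9
```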

4.2 The Courant-Fischer Theorem


4.2.1 Theorem (Courant-Fischer). Suppose A ∈ Mn is Hermitian, and for each 1 ≤ k ≤ n,
let {Skα }α∈Ik denote the set of all k−dimensional linear subspaces of Cn . Also, enumerate the n
eigenvalues λ1 , · · · , λn (counting multiplicity) in increasing order, i.e. λ1 ≤ · · · ≤ λn . Then, we
have

(i).
\[
\min_{\alpha \in I_k} \; \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k,
\]

(ii).
\[
\max_{\alpha \in I_{n-k+1}} \; \min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
\]

Before starting the proof, we denote by u_1, ..., u_n an orthonormal set of eigenvectors
of A corresponding to λ_1, ..., λ_n, respectively.
Proof. (i). First, let W = span{u_k, ..., u_n}; then dim W = n−k+1. So, for any k-dimensional
subspace S_k^α, we have dim(S_k^α ∩ W) ≥ 1. This follows from the identity dim(S_k^α + W) =
dim S_k^α + dim W − dim(S_k^α ∩ W), together with the fact that dim(S_k^α + W) ≤ n.
Now, choose x ∈ S_k^α ∩ W \ {0}. Note that x = Σ_{j=k}^{n} ⟨x, u_j⟩ u_j and A u_j = λ_j u_j; it then
follows that

\[
\begin{aligned}
\frac{\langle Ax, x\rangle}{\|x\|^2}
&= \frac{\left\langle \sum_{j=k}^{n} \lambda_j \langle x, u_j\rangle u_j,\, x \right\rangle}{\|x\|^2} \\
&= \frac{\sum_{j=k}^{n} \lambda_j \langle x, u_j\rangle \langle u_j, x\rangle}{\|x\|^2} \\
&= \frac{\sum_{j=k}^{n} \lambda_j |\langle x, u_j\rangle|^2}{\|x\|^2} \\
&\ge \lambda_k \, \frac{\sum_{j=k}^{n} |\langle x, u_j\rangle|^2}{\|x\|^2} \\
&= \lambda_k.
\end{aligned}
\]

Here, we use the fact that λ_k ≤ λ_{k+1} ≤ ... ≤ λ_n and ‖x‖² = Σ_{j=k}^{n} |⟨x, u_j⟩|².
Thus, for any S_k^α,
\[
\sup_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \ge \lambda_k.
\]

Indeed, for any S_k^α, it is easy to check that
\[
\sup_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \sup_{x \in S_k^\alpha,\, \|x\|=1} \langle Ax, x\rangle,
\]
and since {x ∈ S_k^α : ‖x‖ = 1} is compact, the supremum is attained. So we have


\[
\max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \sup_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \ge \lambda_k.
\]

On the other hand, consider the particular k-dimensional subspace S_k^α = span{u_1, ..., u_k}.
Then, for any x ∈ S_k^α \ {0},
\[
\frac{\langle Ax, x\rangle}{\|x\|^2} = \frac{\sum_{j=1}^{k} \lambda_j |\langle x, u_j\rangle|^2}{\|x\|^2} \le \lambda_k.
\]
Hence, choosing x = u_k, we obtain \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
This also implies that the minimum of \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} over all α ∈ I_k is attained, and
we conclude that
\[
\min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
\]

(ii). The proof of this formula follows from the same idea as (i), and we shall omit the similar
details. We first choose W = span{u_1, ..., u_k}, so dim W = k. Then, for any subspace S_{n-k+1}^α,
dim(S_{n-k+1}^α ∩ W) ≥ 1. Next, choosing any x ∈ S_{n-k+1}^α ∩ W \ {0}, we have \frac{\langle Ax, x\rangle}{\|x\|^2} ≤ λ_k, and therefore
\[
\min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \le \lambda_k.
\]

Now, we again choose the particular S_{n-k+1}^α = span{u_k, ..., u_n}, and this gives
\[
\min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
\]

So finally we conclude that
\[
\max_{\alpha \in I_{n-k+1}} \min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
\]
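The minimizing subspace exhibited in the proof of (i), span{u_1, ..., u_k}, can be checked numerically. The following sketch (not part of the original notes; matrix size and seed are arbitrary) compresses A to that subspace and verifies that the maximal Rayleigh quotient there equals λ_k:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Random Hermitian test matrix (illustrative choice).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2
lam, U = np.linalg.eigh(A)  # lam increasing, columns of U orthonormal

for k in range(1, n + 1):
    Uk = U[:, :k]                # orthonormal basis of span{u_1, ..., u_k}
    Ak = Uk.conj().T @ A @ Uk    # compression of A to the subspace
    # The max Rayleigh quotient over the subspace is the largest
    # eigenvalue of the compression, which should be lambda_k.
    assert np.isclose(np.linalg.eigvalsh(Ak)[-1], lam[k - 1])
```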

4.2.2 Remark. We can compare this result with Theorem 4.2.11 in Horn and Johnson's "Matrix
Analysis", which states and proves the "min-max" and "max-min" formulae in terms of vectors
rather than subspaces, but the idea is essentially the same.

4.3 Eigenvalue estimates for sums of matrices


Next, we shall introduce several theorems and corollaries that can be considered consequences
of the Courant-Fischer theorem. The first theorem, due to Weyl, gives lower and
upper bounds for the kth eigenvalue of A + B.

4.3.3 Theorem (Weyl). Let A, B ∈ M_n both be Hermitian, and let {λ_j(A)}_{j=1}^n, {λ_j(B)}_{j=1}^n and
{λ_j(A+B)}_{j=1}^n denote the eigenvalues of A, B, and A+B in increasing order, respectively.
Then, for any 1 ≤ k ≤ n,

λk (A) + λ1 (B) ≤ λk (A + B) ≤ λk (A) + λn (B).

Notice that by symmetry, we naturally also have

λk (B) + λ1 (A) ≤ λk (A + B) ≤ λk (B) + λn (A).

Proof. By the Rayleigh-Ritz theorem, we know that
\[
\lambda_1(B) \le \frac{\langle Bx, x\rangle}{\|x\|^2} \le \lambda_n(B), \quad \text{for } x \ne 0.
\]

Then, by the Courant-Fischer theorem, for any 1 ≤ k ≤ n,
\[
\begin{aligned}
\lambda_k(A+B) &= \min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle (A+B)x, x\rangle}{\|x\|^2} \\
&= \min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \left( \frac{\langle Ax, x\rangle}{\|x\|^2} + \frac{\langle Bx, x\rangle}{\|x\|^2} \right) \\
&\ge \min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \left( \frac{\langle Ax, x\rangle}{\|x\|^2} + \lambda_1(B) \right) \\
&= \lambda_1(B) + \min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \\
&= \lambda_k(A) + \lambda_1(B).
\end{aligned}
\]

For the other inequality, we apply a similar argument and immediately obtain λ_k(A + B) ≤
λ_k(A) + λ_n(B), which finishes the proof.
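Weyl's bounds are easy to test numerically. The following sketch (an illustration, not part of the notes; sizes and seed are arbitrary) checks both inequalities for a random Hermitian pair:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

def rand_hermitian(n):
    """Random n x n Hermitian matrix (illustrative helper)."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

A, B = rand_hermitian(n), rand_hermitian(n)
la = np.linalg.eigvalsh(A)       # increasing order
lb = np.linalg.eigvalsh(B)
lab = np.linalg.eigvalsh(A + B)

# lambda_k(A) + lambda_1(B) <= lambda_k(A+B) <= lambda_k(A) + lambda_n(B)
for k in range(n):
    assert la[k] + lb[0] <= lab[k] + 1e-9
    assert lab[k] <= la[k] + lb[-1] + 1e-9
```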

4.3.4 Remark. It is interesting to mention that for some special B, the lower and upper bounds
in the theorem we have just proved can be attained. For example, let {u_1, ..., u_n} be
the orthonormal set of eigenvectors of A with A u_k = λ_k u_k for 1 ≤ k ≤ n, where λ_1, ..., λ_n are
listed in increasing order, and consider the rank-one perturbation B = a u_k u_k^*, where a > 0 or
a < 0. Then A = U D U^*, where D = diag(λ_1, ..., λ_n), and it is easy to see that B = U D' U^*,
where D' = diag(0, ..., 0, a, 0, ..., 0) (a appears in the kth place). If a > 0 and λ_k + a ≤ λ_{k+1}
(or k = n), so that the shifted eigenvalue remains in the kth position, then we have
λ_k(A + B) = λ_k(A) + λ_n(B) = λ_k(A) + a, and the upper bound is attained; similarly, if a < 0
and λ_{k-1} ≤ λ_k + a (or k = 1), then λ_k(A + B) = λ_k(A) + λ_1(B) = λ_k(A) + a.
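A minimal numerical instance of this remark (not part of the notes; the diagonal A is chosen purely for convenience, and a = 0.5 is smaller than the spectral gap so the shifted eigenvalue stays in the kth position):

```python
import numpy as np

# A with simple spectrum 1 < 2 < 3 < 4; perturb the eigenvalue at
# 0-based index k = 1 (i.e. lambda_2 = 2) by a = 0.5 < gap.
A = np.diag([1.0, 2.0, 3.0, 4.0])
k = 1
a = 0.5
u = np.zeros(4)
u[k] = 1.0                   # eigenvector u_k of the diagonal A
B = a * np.outer(u, u)       # rank-one perturbation a * u_k u_k^*

lam_A = np.linalg.eigvalsh(A)
lam_AB = np.linalg.eigvalsh(A + B)

# Upper Weyl bound attained: lambda_k(A+B) = lambda_k(A) + a.
assert np.isclose(lam_AB[k], lam_A[k] + a)
```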
The following corollary is called the monotonicity theorem, which refines the lower bound in
Weyl’s theorem by assuming B is positive semidefinite.
4.3.5 Definition. B ∈ M_n is called positive semidefinite if it is Hermitian and ⟨Bx, x⟩ ≥ 0 for
all x ∈ C^n.
Notice that ⟨Bx, x⟩ ≥ 0 for all x ∈ C^n indeed implies that B is Hermitian. In the real case,
however, ⟨Bx, x⟩ ≥ 0 for all x ∈ R^n does not force symmetry, so we have to impose additionally
that B is symmetric.
4.3.6 Corollary (Weyl). Adopt all the assumptions and notation of Weyl's theorem above;
if we further suppose that B is positive semidefinite, then

λk (A) ≤ λk (A + B), for all 1 ≤ k ≤ n.

Proof. Since B is positive semidefinite, λj (B) ≥ 0 for all 1 ≤ j ≤ n, so the corollary follows
from Weyl’s theorem directly.
The following theorem, called the interlacing theorem, discusses the relationship between the
eigenvalues of a Hermitian matrix and those of a rank-one perturbation of it. This is again an
application of the Courant-Fischer theorem.
4.3.7 Theorem. Let A ∈ M_n be Hermitian, z ∈ C^n, and let {λ_j(A)} and {λ_j(A ± zz^*)} both be
enumerated in increasing order. Then
(i).
λk (A ± zz ∗ ) ≤ λk+1 (A) ≤ λk+2 (A ± zz ∗ ), for 1 ≤ k ≤ n − 2,

(ii).
λk (A) ≤ λk+1 (A ± zz ∗ ) ≤ λk+2 (A), for 1 ≤ k ≤ n − 2.

Proof. By the Courant-Fischer theorem,
\[
\begin{aligned}
\lambda_{k+2}(A \pm zz^*) &= \min_{\alpha \in I_{k+2}} \max_{x \in S_{k+2}^\alpha \setminus \{0\}} \frac{\langle (A \pm zz^*)x, x\rangle}{\|x\|^2} \\
&= \min_{\alpha \in I_{k+2}} \max_{x \in S_{k+2}^\alpha \setminus \{0\}} \left( \frac{\langle Ax, x\rangle}{\|x\|^2} \pm \frac{|\langle x, z\rangle|^2}{\|x\|^2} \right) \\
&\ge \min_{\alpha \in I_{k+2}} \max_{\substack{x \in S_{k+2}^\alpha \setminus \{0\} \\ x \perp z}} \frac{\langle Ax, x\rangle}{\|x\|^2} \qquad (*) \\
&\ge \min_{\substack{\alpha \in I_{k+1} \\ z \perp S_{k+1}^\alpha}} \max_{x \in S_{k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \qquad (**) \\
&\ge \min_{\alpha \in I_{k+1}} \max_{x \in S_{k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \qquad ({*}{*}{*}) \\
&= \lambda_{k+1}(A).
\end{aligned}
\]
α
Here, the inequality (*) and (***) are trivial. For inequality (**), note that x ∈ Sk+2 \ {0}
α ⊥ α
and x ⊥ z is equivalent to x ∈ Sk+2 ∩ (span{z}) , and again by the equality dim(Sk + W ) =
dim Skα + dim W − dim(Skα ∩ W ), we know that dim(Sk+2 α
∩ (span{z})⊥ ) = k + 2 or k + 1.
α ⊥
Then, we see that for each Sk+2 ∩ (span{z}) , we can extract a k + 1−dimensional subspace
α α
Sk+1 such that z ⊥ Sk+1 , and therefore

\[
\max_{\substack{x \in S_{k+2}^\alpha \setminus \{0\} \\ x \perp z}} \frac{\langle Ax, x\rangle}{\|x\|^2} \ge \max_{x \in S_{k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2},
\]
because on the right side we maximize over a subspace of the set on the left.
Now, (**) becomes clear: since we ultimately take the minimum, and each of the maximums
in (*) over which we minimize dominates some maximum appearing in (**), the inequality ≥
in (**) must hold.
For the other inequality of (i), we apply the analogous argument, again using the "max-min"
formula in the Courant-Fischer theorem:
\[
\begin{aligned}
\lambda_k(A \pm zz^*) &= \max_{\alpha \in I_{n-k+1}} \min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle (A \pm zz^*)x, x\rangle}{\|x\|^2} \\
&\le \max_{\alpha \in I_{n-k+1}} \min_{\substack{x \in S_{n-k+1}^\alpha \setminus \{0\} \\ x \perp z}} \frac{\langle (A \pm zz^*)x, x\rangle}{\|x\|^2} \\
&= \max_{\alpha \in I_{n-k+1}} \min_{\substack{x \in S_{n-k+1}^\alpha \setminus \{0\} \\ x \perp z}} \frac{\langle Ax, x\rangle}{\|x\|^2} \\
&\le \max_{\substack{\alpha \in I_{n-k} \\ z \perp S_{n-k}^\alpha}} \min_{x \in S_{n-k}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \qquad (\Delta) \\
&\le \max_{\alpha \in I_{n-k}} \min_{x \in S_{n-k}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \\
&= \lambda_{k+1}(A).
\end{aligned}
\]

Here, inequality (Δ) follows from an argument similar to the one for (**). Thus, we have proved (i).
(ii). This is indeed a direct corollary of (i), obtained by applying (i) to A ± zz^* with the
perturbation ∓zz^* and shifting the indices. So the proof of the interlacing theorem is done.
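The interlacing inequalities in (i) can also be observed numerically. A sketch (an illustration, not part of the notes; size and seed are arbitrary) for both signs of the rank-one perturbation:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6

# Random Hermitian A and rank-one direction z (illustrative choices).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)

lam = np.linalg.eigvalsh(A)  # eigenvalues of A, increasing
for sign in (+1, -1):
    mu = np.linalg.eigvalsh(A + sign * np.outer(z, z.conj()))
    # 1-based statement lambda_k(A±zz*) <= lambda_{k+1}(A) <= lambda_{k+2}(A±zz*)
    # for 1 <= k <= n-2, written with 0-based indices below.
    for k in range(n - 2):
        assert mu[k] <= lam[k + 1] + 1e-9
        assert lam[k + 1] <= mu[k + 2] + 1e-9
```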
