
Matrix Theory, Math6304

Lecture Notes from October 11, 2012


taken by Da Zheng

4 Variational characterization of eigenvalues, continued


We recall from last class that given a Hermitian matrix, we can obtain its largest (resp. smallest)
eigenvalue by maximizing (resp. minimizing) the corresponding quadratic form over all the unit
vectors. In fact, due to the following theorem by Courant and Fischer, we can obtain any
eigenvalue of a Hermitian matrix through the "min-max" or "max-min" formula.
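As a quick numerical illustration of the recalled fact (not part of the original notes; the matrix size, sample count, and seed below are arbitrary choices), one can check with NumPy that Rayleigh quotients of random vectors never escape the interval between the smallest and largest eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random 4x4 Hermitian matrix (arbitrary illustrative choice).
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2
eigvals = np.linalg.eigvalsh(A)  # eigenvalues in increasing order

def rayleigh(A, x):
    """Rayleigh quotient <Ax, x> / ||x||^2 (real for Hermitian A)."""
    return (x.conj() @ A @ x).real / (x.conj() @ x).real

# Sample many random nonzero vectors; their quotients stay inside
# [lambda_1, lambda_n], as maximizing/minimizing over unit vectors predicts.
quotients = [rayleigh(A, rng.standard_normal(4) + 1j * rng.standard_normal(4))
             for _ in range(1000)]
assert eigvals[0] <= min(quotients) + 1e-9
assert max(quotients) <= eigvals[-1] + 1e-9
```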

4.2 The Courant-Fischer Theorem


4.2.1 Theorem (Courant-Fischer). Suppose A ∈ Mn is Hermitian, and for each 1 ≤ k ≤ n,
let {Skα }α∈Ik denote the set of all k−dimensional linear subspaces of Cn . Also, enumerate the n
eigenvalues λ1 , · · · , λn (counting multiplicity) in increasing order, i.e. λ1 ≤ · · · ≤ λn . Then, we
have

(i).
\[
\min_{\alpha \in I_k} \; \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k,
\]

(ii).
\[
\max_{\alpha \in I_{n-k+1}} \; \min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
\]

Before starting the proof, we denote by u_1, ..., u_n an orthonormal set of eigenvectors
of A corresponding to λ_1, ..., λ_n, respectively.
Proof. (i). First, let W = span{u_k, ..., u_n}; then dim W = n−k+1. So, for any k-dimensional
subspace S_k^α, we have dim(S_k^α ∩ W) ≥ 1. This follows from the identity dim(S_k^α + W) =
dim S_k^α + dim W − dim(S_k^α ∩ W), together with the fact that dim(S_k^α + W) ≤ n.
Now, choose x ∈ S_k^α ∩ W \ {0}. Note that x = Σ_{j=k}^{n} ⟨x, u_j⟩ u_j and A u_j = λ_j u_j; it then
follows that

\[
\begin{aligned}
\frac{\langle Ax, x\rangle}{\|x\|^2}
&= \frac{\left\langle \sum_{j=k}^{n} \lambda_j \langle x, u_j\rangle u_j,\, x \right\rangle}{\|x\|^2} \\
&= \frac{\sum_{j=k}^{n} \lambda_j \langle x, u_j\rangle \langle u_j, x\rangle}{\|x\|^2} \\
&= \frac{\sum_{j=k}^{n} \lambda_j |\langle x, u_j\rangle|^2}{\|x\|^2} \\
&\ge \lambda_k \, \frac{\sum_{j=k}^{n} |\langle x, u_j\rangle|^2}{\|x\|^2} \\
&= \lambda_k.
\end{aligned}
\]

Here, we use the fact that λ_k ≤ λ_{k+1} ≤ ... ≤ λ_n and ‖x‖² = Σ_{j=k}^{n} |⟨x, u_j⟩|².
Thus, for any S_k^α,
\[
\sup_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \ge \lambda_k.
\]

Indeed, for any S_k^α, it is easy to check that
\[
\sup_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \sup_{x \in S_k^\alpha,\, \|x\|=1} \langle Ax, x\rangle,
\]
and since {x ∈ S_k^α : ‖x‖ = 1} is compact, the supremum is attained. So we have


\[
\max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \sup_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \ge \lambda_k.
\]

On the other hand, consider the particular k-dimensional subspace S_k^α = span{u_1, ..., u_k}.
Then, for any x ∈ S_k^α \ {0},
\[
\frac{\langle Ax, x\rangle}{\|x\|^2} = \frac{\sum_{j=1}^{k} \lambda_j |\langle x, u_j\rangle|^2}{\|x\|^2} \le \lambda_k.
\]
Hence, choosing x = u_k, we obtain \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
This also implies that the minimum of \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} over all α ∈ I_k is attained, and
we conclude that
\[
\min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
\]

(ii). The proof of this formula follows from the same idea as (i), and we shall omit the similar
details. We first choose W = span{u_1, ..., u_k}, so dim W = k. Then, for any subspace S_{n-k+1}^α,
dim(S_{n-k+1}^α ∩ W) ≥ 1. Next, choosing any x ∈ S_{n-k+1}^α ∩ W \ {0}, we have \frac{\langle Ax, x\rangle}{\|x\|^2} ≤ λ_k, and therefore
\[
\min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \le \lambda_k.
\]

Now, we again choose the particular S_{n-k+1}^α = span{u_k, ..., u_n}, and this gives
\[
\min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
\]

So finally we conclude that
\[
\max_{\alpha \in I_{n-k+1}} \min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} = \lambda_k.
\]
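The minimizing subspace exhibited in the proof of (i), span{u_1, ..., u_k}, can be checked numerically. The following sketch (not part of the original notes; matrix size and seed are arbitrary) compresses A to that subspace and verifies that the maximal Rayleigh quotient there equals λ_k:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Random Hermitian test matrix (illustrative choice).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2
lam, U = np.linalg.eigh(A)  # lam increasing, columns of U orthonormal

for k in range(1, n + 1):
    Uk = U[:, :k]                # orthonormal basis of span{u_1, ..., u_k}
    Ak = Uk.conj().T @ A @ Uk    # compression of A to the subspace
    # The max Rayleigh quotient over the subspace is the largest
    # eigenvalue of the compression, which should be lambda_k.
    assert np.isclose(np.linalg.eigvalsh(Ak)[-1], lam[k - 1])
```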

4.2.2 Remark. We can compare this result with Theorem 4.2.11 in Horn and Johnson's "Matrix
Analysis", which states and proves the "min-max" and "max-min" formulae in terms of vectors
rather than subspaces, but the idea is essentially the same.

4.3 Eigenvalue estimates for sums of matrices


Next, we shall introduce several theorems and corollaries that can be considered consequences
of the Courant-Fischer theorem. The first theorem, due to Weyl, gives lower and
upper bounds for the kth eigenvalue of A + B.

4.3.3 Theorem (Weyl). Let A, B ∈ M_n both be Hermitian, and let {λ_j(A)}_{j=1}^n, {λ_j(B)}_{j=1}^n and
{λ_j(A+B)}_{j=1}^n denote the eigenvalues of A, B, and A+B in increasing order, respectively.
Then, for any 1 ≤ k ≤ n,

λk (A) + λ1 (B) ≤ λk (A + B) ≤ λk (A) + λn (B).

Notice that by symmetry, we naturally also have

λk (B) + λ1 (A) ≤ λk (A + B) ≤ λk (B) + λn (A).

Proof. By the Rayleigh-Ritz theorem, we know that
\[
\lambda_1(B) \le \frac{\langle Bx, x\rangle}{\|x\|^2} \le \lambda_n(B), \quad \text{for } x \ne 0.
\]

Then, by the Courant-Fischer theorem, for any 1 ≤ k ≤ n,
\[
\begin{aligned}
\lambda_k(A+B) &= \min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle (A+B)x, x\rangle}{\|x\|^2} \\
&= \min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \left( \frac{\langle Ax, x\rangle}{\|x\|^2} + \frac{\langle Bx, x\rangle}{\|x\|^2} \right) \\
&\ge \min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \left( \frac{\langle Ax, x\rangle}{\|x\|^2} + \lambda_1(B) \right) \\
&= \lambda_1(B) + \min_{\alpha \in I_k} \max_{x \in S_k^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \\
&= \lambda_k(A) + \lambda_1(B).
\end{aligned}
\]

For the other inequality, we apply a similar argument and immediately obtain λ_k(A + B) ≤
λ_k(A) + λ_n(B), which finishes the proof.
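Weyl's bounds are easy to test numerically. The following sketch (an illustration, not part of the notes; sizes and seed are arbitrary) checks both inequalities for a random Hermitian pair:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

def rand_hermitian(n):
    """Random n x n Hermitian matrix (illustrative helper)."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

A, B = rand_hermitian(n), rand_hermitian(n)
la = np.linalg.eigvalsh(A)       # increasing order
lb = np.linalg.eigvalsh(B)
lab = np.linalg.eigvalsh(A + B)

# lambda_k(A) + lambda_1(B) <= lambda_k(A+B) <= lambda_k(A) + lambda_n(B)
for k in range(n):
    assert la[k] + lb[0] <= lab[k] + 1e-9
    assert lab[k] <= la[k] + lb[-1] + 1e-9
```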

4.3.4 Remark. It is interesting to mention that for some special B, the lower and upper bounds
in the theorem we have just proved can be attained. For example, let {u_1, ..., u_n} be
the orthonormal set of eigenvectors of A with A u_k = λ_k u_k for 1 ≤ k ≤ n, where λ_1, ..., λ_n are
listed in increasing order, and consider the rank-one perturbation B = a u_k u_k^*, where a > 0 or
a < 0. Then A = U D U^*, where D = diag(λ_1, ..., λ_n), and it is easy to see that B = U D' U^*,
where D' = diag(0, ..., 0, a, 0, ..., 0) (a appears in the kth place). If a > 0 and λ_k + a ≤ λ_{k+1}
(or k = n), so that the shifted eigenvalue remains in the kth position, then we have
λ_k(A + B) = λ_k(A) + λ_n(B) = λ_k(A) + a, and the upper bound is attained; similarly, if a < 0
and λ_{k-1} ≤ λ_k + a (or k = 1), then λ_k(A + B) = λ_k(A) + λ_1(B) = λ_k(A) + a.
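A minimal numerical instance of this remark (not part of the notes; the diagonal A is chosen purely for convenience, and a = 0.5 is smaller than the spectral gap so the shifted eigenvalue stays in the kth position):

```python
import numpy as np

# A with simple spectrum 1 < 2 < 3 < 4; perturb the eigenvalue at
# 0-based index k = 1 (i.e. lambda_2 = 2) by a = 0.5 < gap.
A = np.diag([1.0, 2.0, 3.0, 4.0])
k = 1
a = 0.5
u = np.zeros(4)
u[k] = 1.0                   # eigenvector u_k of the diagonal A
B = a * np.outer(u, u)       # rank-one perturbation a * u_k u_k^*

lam_A = np.linalg.eigvalsh(A)
lam_AB = np.linalg.eigvalsh(A + B)

# Upper Weyl bound attained: lambda_k(A+B) = lambda_k(A) + a.
assert np.isclose(lam_AB[k], lam_A[k] + a)
```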
The following corollary is called the monotonicity theorem, which refines the lower bound in
Weyl’s theorem by assuming B is positive semidefinite.
4.3.5 Definition. B ∈ M_n is called positive semidefinite if it is Hermitian and ⟨Bx, x⟩ ≥ 0 for
all x ∈ C^n.
Notice that ⟨Bx, x⟩ ≥ 0 for all x ∈ C^n indeed implies that B is Hermitian. In the real case,
however, ⟨Bx, x⟩ ≥ 0 for all x ∈ R^n does not force symmetry, so we have to impose additionally
that B is symmetric.
4.3.6 Corollary (Weyl). Adopt all the assumptions and notation of Weyl's theorem above;
if we further suppose that B is positive semidefinite, then

λk (A) ≤ λk (A + B), for all 1 ≤ k ≤ n.

Proof. Since B is positive semidefinite, λj (B) ≥ 0 for all 1 ≤ j ≤ n, so the corollary follows
from Weyl’s theorem directly.
The following theorem, called the interlacing theorem, discusses the relationship between the
eigenvalues of a Hermitian matrix and those of a rank-one perturbation of it. This is again an
application of the Courant-Fischer theorem.
4.3.7 Theorem. Let A ∈ M_n be Hermitian, z ∈ C^n, and let {λ_j(A)} and {λ_j(A ± zz^*)} both be
enumerated in increasing order. Then
(i).
λk (A ± zz ∗ ) ≤ λk+1 (A) ≤ λk+2 (A ± zz ∗ ), for 1 ≤ k ≤ n − 2,

(ii).
λk (A) ≤ λk+1 (A ± zz ∗ ) ≤ λk+2 (A), for 1 ≤ k ≤ n − 2.

Proof. By the Courant-Fischer theorem,
\[
\begin{aligned}
\lambda_{k+2}(A \pm zz^*) &= \min_{\alpha \in I_{k+2}} \max_{x \in S_{k+2}^\alpha \setminus \{0\}} \frac{\langle (A \pm zz^*)x, x\rangle}{\|x\|^2} \\
&= \min_{\alpha \in I_{k+2}} \max_{x \in S_{k+2}^\alpha \setminus \{0\}} \left( \frac{\langle Ax, x\rangle}{\|x\|^2} \pm \frac{|\langle x, z\rangle|^2}{\|x\|^2} \right) \\
&\ge \min_{\alpha \in I_{k+2}} \max_{\substack{x \in S_{k+2}^\alpha \setminus \{0\} \\ x \perp z}} \frac{\langle Ax, x\rangle}{\|x\|^2} \qquad (*) \\
&\ge \min_{\substack{\alpha \in I_{k+1} \\ z \perp S_{k+1}^\alpha}} \max_{x \in S_{k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \qquad (**) \\
&\ge \min_{\alpha \in I_{k+1}} \max_{x \in S_{k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \qquad ({*}{*}{*}) \\
&= \lambda_{k+1}(A).
\end{aligned}
\]
α
Here, the inequality (*) and (***) are trivial. For inequality (**), note that x ∈ Sk+2 \ {0}
α ⊥ α
and x ⊥ z is equivalent to x ∈ Sk+2 ∩ (span{z}) , and again by the equality dim(Sk + W ) =
dim Skα + dim W − dim(Skα ∩ W ), we know that dim(Sk+2 α
∩ (span{z})⊥ ) = k + 2 or k + 1.
α ⊥
Then, we see that for each Sk+2 ∩ (span{z}) , we can extract a k + 1−dimensional subspace
α α
Sk+1 such that z ⊥ Sk+1 , and therefore

\[
\max_{\substack{x \in S_{k+2}^\alpha \setminus \{0\} \\ x \perp z}} \frac{\langle Ax, x\rangle}{\|x\|^2} \ge \max_{x \in S_{k+1}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2},
\]
because on the right side we maximize over a subspace of the set on the left.
Now, (**) becomes clear: since we ultimately take the minimum, and each of the maximums
in (*) over which we minimize dominates some maximum appearing in (**), the inequality ≥
in (**) must hold.
For the other inequality of (i), we apply the analogous argument, again using the "max-min"
formula in the Courant-Fischer theorem:
\[
\begin{aligned}
\lambda_k(A \pm zz^*) &= \max_{\alpha \in I_{n-k+1}} \min_{x \in S_{n-k+1}^\alpha \setminus \{0\}} \frac{\langle (A \pm zz^*)x, x\rangle}{\|x\|^2} \\
&\le \max_{\alpha \in I_{n-k+1}} \min_{\substack{x \in S_{n-k+1}^\alpha \setminus \{0\} \\ x \perp z}} \frac{\langle (A \pm zz^*)x, x\rangle}{\|x\|^2} \\
&= \max_{\alpha \in I_{n-k+1}} \min_{\substack{x \in S_{n-k+1}^\alpha \setminus \{0\} \\ x \perp z}} \frac{\langle Ax, x\rangle}{\|x\|^2} \\
&\le \max_{\substack{\alpha \in I_{n-k} \\ z \perp S_{n-k}^\alpha}} \min_{x \in S_{n-k}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \qquad (\Delta) \\
&\le \max_{\alpha \in I_{n-k}} \min_{x \in S_{n-k}^\alpha \setminus \{0\}} \frac{\langle Ax, x\rangle}{\|x\|^2} \\
&= \lambda_{k+1}(A).
\end{aligned}
\]

Here, inequality (Δ) follows from an argument similar to the one for (**). Thus, we have proved (i).
(ii). This is indeed a direct corollary of (i), obtained by applying (i) to A ± zz^* with the
perturbation ∓zz^* and shifting the indices. So the proof of the interlacing theorem is done.
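The interlacing inequalities in (i) can also be observed numerically. A sketch (an illustration, not part of the notes; size and seed are arbitrary) for both signs of the rank-one perturbation:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6

# Random Hermitian A and rank-one direction z (illustrative choices).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)

lam = np.linalg.eigvalsh(A)  # eigenvalues of A, increasing
for sign in (+1, -1):
    mu = np.linalg.eigvalsh(A + sign * np.outer(z, z.conj()))
    # 1-based statement lambda_k(A±zz*) <= lambda_{k+1}(A) <= lambda_{k+2}(A±zz*)
    # for 1 <= k <= n-2, written with 0-based indices below.
    for k in range(n - 2):
        assert mu[k] <= lam[k + 1] + 1e-9
        assert lam[k + 1] <= mu[k + 2] + 1e-9
```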
