
STAT C206A / MATH C223A: Stein's method and applications

Lecture 32
Lecture date: Nov 9, 2007 Scribe: Guy Bresler

1 Matrix Norms

In this lecture we prove central limit theorems for functions of a random matrix with
Gaussian entries. We begin by reviewing two matrix norms, and some basic properties and
inequalities.

1. Suppose A is an n × n real matrix. The operator norm of A is defined as
\[ \|A\| = \sup_{\|x\|=1} \|Ax\|, \qquad x \in \mathbb{R}^n. \]
Alternatively,
\[ \|A\| = \sqrt{\lambda_{\max}(A^T A)}, \]
where $\lambda_{\max}(M)$ is the maximum eigenvalue of the matrix $M$.

Basic properties include:
\[ \|A + B\| \le \|A\| + \|B\|, \qquad \|\alpha A\| = |\alpha|\,\|A\|, \qquad \|AB\| \le \|A\|\,\|B\|. \]

2. The Hilbert-Schmidt (alternatively called the Schur, Euclidean, or Frobenius) norm is defined as
\[ \|A\|_{HS} = \sqrt{\sum_{i,j} a_{ij}^2} = \sqrt{\operatorname{Tr}(A^T A)}. \]
Clearly,
\[ \|A\|_{HS} = \sqrt{\text{sum of the eigenvalues of } A^T A}, \]
which implies that
\[ \|A\| \le \|A\|_{HS} \le \sqrt{n}\,\|A\|. \]
Of course, $\|A\|_{HS}$ also satisfies the usual properties of a norm.
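
These two norms and the inequality between them are easy to check numerically. The following is a minimal sketch (assuming numpy is available; the matrix, its size, and the random seed are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))

# Operator norm: the largest singular value, i.e. sqrt of lambda_max(A^T A).
op = np.linalg.norm(A, 2)
# Hilbert-Schmidt (Frobenius) norm: sqrt of the sum of squared entries.
hs = np.linalg.norm(A, 'fro')

# Both norms agree with their eigenvalue characterizations.
evals = np.linalg.eigvalsh(A.T @ A)
assert np.isclose(op, np.sqrt(evals.max()))
assert np.isclose(hs, np.sqrt(evals.sum()))

# Sandwich inequality: ||A|| <= ||A||_HS <= sqrt(n) ||A||.
assert op <= hs + 1e-12 and hs <= np.sqrt(n) * op + 1e-12
print(op, hs, np.sqrt(n) * op)
```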

Proposition 1 The following inequality holds:
\[ \|AB\|_{HS} \le \|A\|\,\|B\|_{HS}. \]

Proof: Let $b_1, \ldots, b_n$ denote the columns of $B$. Then
\[ \|AB\|_{HS}^2 = \sum_{i=1}^n \|Ab_i\|^2 \le \|A\|^2 \sum_{i=1}^n \|b_i\|^2 = \|A\|^2\,\|B\|_{HS}^2. \qquad \Box \]
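
A quick numerical sanity check of Proposition 1, and of the column-wise identity used in its proof (again a sketch assuming numpy; the matrices are arbitrary Gaussian samples):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# Proposition 1: ||AB||_HS <= ||A|| ||B||_HS.
lhs = np.linalg.norm(A @ B, 'fro')
rhs = np.linalg.norm(A, 2) * np.linalg.norm(B, 'fro')
assert lhs <= rhs + 1e-9

# The decomposition from the proof: ||AB||_HS^2 = sum_i ||A b_i||^2 over columns b_i of B.
col_norms_sq = np.linalg.norm(A @ B, axis=0) ** 2
assert np.isclose(col_norms_sq.sum(), lhs ** 2)
```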

3. A simple matrix inequality follows from the Cauchy-Schwarz inequality:
\[ |\operatorname{Tr}(AB)| = \Big| \sum_{i,j} a_{ij} b_{ji} \Big| \le \|A\|_{HS}\,\|B\|_{HS}. \]

4. Combining the proposition above with observation 3 gives the inequality
\[ |\operatorname{Tr}(ACBD)| \le \|AC\|_{HS}\,\|BD\|_{HS} \le \|A\|\,\|B\|\,\|C\|_{HS}\,\|D\|_{HS}. \]
More generally, for any two distinct indices $i$ and $j$, it holds that
\[ |\operatorname{Tr}(A_1 A_2 \cdots A_k)| \le \|A_i\|_{HS}\,\|A_j\|_{HS} \prod_{l \ne i,j} \|A_l\|. \]
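
Both trace inequalities can be spot-checked in the same way (an illustrative sketch, not part of the argument; the matrices and sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
A, B, C, D = (rng.standard_normal((n, n)) for _ in range(4))

# Observation 3: |Tr(AB)| <= ||A||_HS ||B||_HS (Cauchy-Schwarz on the entries).
assert abs(np.trace(A @ B)) <= np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro')

# Observation 4: |Tr(ACBD)| <= ||A|| ||B|| ||C||_HS ||D||_HS.
lhs = abs(np.trace(A @ C @ B @ D))
rhs = (np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
       * np.linalg.norm(C, 'fro') * np.linalg.norm(D, 'fro'))
assert lhs <= rhs
```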

Next, recall the theorem from last lecture:

Theorem 2 Let $X_1, \ldots, X_n$ be i.i.d. $N(0,1)$ random variables, let $f \in C^2(\mathbb{R}^n)$, and set $W = f(X)$ with $E W = 0$. Then
\[ d_{TV}\big(\mathcal{L}(W), N(0, \sigma^2)\big) \le \frac{10}{\sigma^2} \left( E\|\operatorname{Hess} f(X)\|^4 \; E\|\nabla f(X)\|^4 \right)^{1/4}, \]
where $\sigma^2 = \operatorname{Var}(W)$.

We will use this theorem to study the Gaussian random matrix.

2 CLT for Tr(A^k)

Suppose
\[ A = \frac{1}{\sqrt{N}} (X_{ij})_{1 \le i,j \le N}, \]
where $X_{ij} \overset{\text{i.i.d.}}{\sim} N(0,1)$. Fix a positive integer $k$. We would like a CLT for $\operatorname{Tr}(A^k)$.

To begin, note that
\[ \operatorname{Tr}(A^k) = \frac{1}{N^{k/2}} \sum_{1 \le i_1, i_2, \ldots, i_k \le N} X_{i_1,i_2} X_{i_2,i_3} \cdots X_{i_{k-1},i_k} X_{i_k,i_1}. \tag{1} \]

It turns out that the usual dependency graph theorem fails for k ≥ 3, so a more powerful
method must be used.
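
Equation (1) is simply the entrywise expansion of the trace of a matrix power; the brute-force sketch below confirms it for a small instance (assuming numpy; N, k, and the seed are arbitrary):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)
N, k = 5, 3
X = rng.standard_normal((N, N))
A = X / np.sqrt(N)

# Left side of (1): the trace of A^k computed directly.
direct = np.trace(np.linalg.matrix_power(A, k))

# Right side of (1): cyclic products of the unscaled entries, divided by N^(k/2).
total = 0.0
for idx in product(range(N), repeat=k):
    term = 1.0
    for a in range(k):
        term *= X[idx[a], idx[(a + 1) % k]]   # X_{i_a, i_{a+1}}, with i_{k+1} = i_1
    total += term
total /= N ** (k / 2)

assert np.isclose(direct, total)
```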

Exercise 3 Find a dependency graph theorem that works for all k.

In order to apply Theorem 2, we identify
\[ X = (X_{11}, X_{12}, \ldots, X_{1N}, X_{21}, X_{22}, \ldots, X_{NN}), \]
and $f(X) = \operatorname{Tr}(A^k)$. Now


\[ \frac{\partial f}{\partial x_{ij}} = \frac{\partial}{\partial x_{ij}} \operatorname{Tr}(A^k) \overset{(a)}{=} \operatorname{Tr}\!\left( \sum_{r=0}^{k-1} A^r \frac{\partial A}{\partial x_{ij}} A^{k-1-r} \right) \overset{(b)}{=} k \operatorname{Tr}\!\left( \frac{\partial A}{\partial x_{ij}} A^{k-1} \right), \tag{2} \]
where (a) follows from the fact that for two matrices $A$ and $B$, $\frac{\partial}{\partial x}(AB) = \frac{\partial A}{\partial x} B + A \frac{\partial B}{\partial x}$, and (b) from moving the trace inside the sum and using $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$. But

\[ \frac{\partial A}{\partial x_{ij}} = \frac{1}{\sqrt{N}} e_i e_j^T, \]
where $e_i$ is the $i$th standard basis vector, i.e. the vector of all zeros with a 1 in the $i$th position.

Thus
\begin{align*}
\frac{\partial f}{\partial x_{ij}} &= \frac{k}{\sqrt{N}} \operatorname{Tr}(e_i e_j^T A^{k-1}) \\
&= \frac{k}{\sqrt{N}} \operatorname{Tr}(e_j^T A^{k-1} e_i) \\
&= \frac{k}{\sqrt{N}} (A^{k-1})_{ji}.
\end{align*}
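
This derivative formula can be checked against a central finite difference (an illustrative sketch; the indices i, j, the step size, and the matrix size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
N, k = 6, 4
X = rng.standard_normal((N, N))

def f(M):
    # f(M) = Tr(A^k) with A = M / sqrt(N).
    return np.trace(np.linalg.matrix_power(M / np.sqrt(N), k))

i, j, h = 1, 3, 1e-6
Xp = X.copy(); Xp[i, j] += h
Xm = X.copy(); Xm[i, j] -= h
numeric = (f(Xp) - f(Xm)) / (2 * h)          # central finite difference

A = X / np.sqrt(N)
analytic = k / np.sqrt(N) * np.linalg.matrix_power(A, k - 1)[j, i]
assert np.isclose(numeric, analytic, rtol=1e-4, atol=1e-6)
```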

This allows us to calculate
\begin{align}
\|\nabla f(X)\|^2 = \sum_{i,j} \left( \frac{\partial f}{\partial x_{ij}} \right)^2 &= \frac{k^2}{N} \sum_{i,j} (A^{k-1})_{ji}^2 \nonumber \\
&= \frac{k^2}{N} \|A^{k-1}\|_{HS}^2 \tag{3} \\
&\le \frac{k^2}{N} \, N \|A^{k-1}\|^2 \nonumber \\
&\le k^2 \|A\|^{2(k-1)}. \nonumber
\end{align}

Lemma 4 For every $p \in \mathbb{Z}^+$,
\[ E\|A\|^p \le C(p), \]
where $C(p)$ is a constant independent of $N$.

Proof: The proof is essentially as follows. For a positive definite random matrix $B$, $\|B\| = \lambda_{\max}(B)$. Thus
\[ E\|B\|^p = E\lambda_{\max}^p \le \big(E\lambda_{\max}^{pm}\big)^{1/m} \le \big(E \operatorname{Tr}(B^{pm})\big)^{1/m} \quad \text{for any } m. \]
Now let $m \to \infty$ suitably with $N$. (For our $A$, which need not be positive definite, one applies this to $B = A^T A$, since $\|A\|^2 = \lambda_{\max}(A^T A)$.) $\Box$
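
Although the proof is only sketched, the conclusion of Lemma 4 is easy to probe by simulation: Monte Carlo estimates of $E\|A\|^p$ should stabilize as $N$ grows, since the operator norm of the rescaled Gaussian matrix concentrates near 2. (A rough sketch assuming numpy; p, the sample size, and the values of N are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(5)
p, reps = 4, 50

for N in (20, 50, 100, 200):
    norms = [np.linalg.norm(rng.standard_normal((N, N)) / np.sqrt(N), 2) ** p
             for _ in range(reps)]
    # Estimate of E||A||^p; it should stay bounded (roughly 2^p) as N grows.
    print(N, np.mean(norms))
```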

This shows that $E\|\nabla f(X)\|^2 = O(1)$, and hence the Poincaré inequality (for a standard Gaussian vector $X$, $\operatorname{Var}(f(X)) \le E\|\nabla f(X)\|^2$) implies that $\operatorname{Var}(f(X)) = O(1)$.

Exercise 5 Show that any two terms in the sum of equation (1) have non-negative covariance.

The exercise implies that
\[ \operatorname{Var}(f(X)) \ge \frac{1}{N^k} \sum_{1 \le i_1, \ldots, i_k \le N} \operatorname{Var}\big(X_{i_1,i_2} \cdots X_{i_k,i_1}\big) \ge C(k) > 0. \]
Recalling the result of Theorem 2,

\[ d_{TV}\big(\mathcal{L}(W), N(0, \sigma^2)\big) \le \frac{10}{\sigma^2} \left( E\|\operatorname{Hess} f(X)\|^4 \; E\|\nabla f(X)\|^4 \right)^{1/4}, \]
we see that $\sigma^2 = \operatorname{Var}(f(X)) \ge C(k)$, and from equation (3) and the fact noted above, $E\|\nabla f(X)\|^4 \le C(k)$. Therefore it remains only to show that $E\|\operatorname{Hess} f(X)\|^4 \to 0$ in order to prove the desired central limit theorem.

We have
\[ \frac{\partial f}{\partial x_{ij}} = k \operatorname{Tr}\!\left( \frac{\partial A}{\partial x_{ij}} A^{k-1} \right), \]
and
\[ \frac{\partial^2 f}{\partial x_{pq}\,\partial x_{ij}} = k \operatorname{Tr}\!\left( \sum_{r=0}^{k-2} \frac{\partial A}{\partial x_{ij}} A^r \frac{\partial A}{\partial x_{pq}} A^{k-r-2} \right). \]

Fact about matrix norms: If $A$ is a symmetric, real matrix, then
\[ \|A\| = \sup_{\|x\|=\|y\|=1} |x^T A y|. \]

Now, $\operatorname{Hess} f(X)$ is an $N^2 \times N^2$ symmetric, real matrix:
\[ \|\operatorname{Hess} f(X)\| = \sup\left\{ \sum_{i,j,p,q} \frac{\partial^2 f}{\partial x_{ij}\,\partial x_{pq}}\, c_{ij} d_{pq} \;:\; \sum_{i,j} c_{ij}^2 = 1,\ \sum_{p,q} d_{pq}^2 = 1 \right\}. \]

Let $C = (c_{ij})$ and $D = (d_{pq})$ be two matrices with $\|C\|_{HS} = \|D\|_{HS} = 1$. Fix $0 \le r \le k-2$. Then
\[ \sum_{i,j,p,q} c_{ij} d_{pq} \operatorname{Tr}\!\left( \frac{\partial A}{\partial x_{ij}} A^r \frac{\partial A}{\partial x_{pq}} A^{k-2-r} \right) = \frac{1}{N} \sum_{i,j,p,q} c_{ij} d_{pq} \operatorname{Tr}\big(e_i e_j^T A^r e_p e_q^T A^{k-2-r}\big) = \frac{1}{N} \operatorname{Tr}\big(C A^r D A^{k-2-r}\big), \]
where we used the fact that $\sum_{i,j} c_{ij} e_i e_j^T = C$ and similarly for $D$.

Now
\[ \big|\operatorname{Tr}(C A^r D A^{k-2-r})\big| \le \|A\|^{k-2}\,\|C\|_{HS}\,\|D\|_{HS} = \|A\|^{k-2}. \]
Thus
\[ \|\operatorname{Hess} f(X)\| \le \frac{k(k-1)\,\|A\|^{k-2}}{N}. \]
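
As a sanity check, one can assemble the $N^2 \times N^2$ Hessian from the second-derivative formula above and compare its operator norm with this bound (a small numerical sketch assuming numpy; N and k are kept small so the Hessian fits in memory):

```python
import numpy as np

rng = np.random.default_rng(6)
N, k = 8, 4
A = rng.standard_normal((N, N)) / np.sqrt(N)
powers = [np.linalg.matrix_power(A, r) for r in range(k)]   # A^0, ..., A^{k-1}

# Entries of Hess f(X) for f(X) = Tr(A^k), A = X / sqrt(N):
# d^2 f / dx_pq dx_ij = (k/N) * sum_{r=0}^{k-2} (A^r)_{jp} (A^{k-2-r})_{qi}.
H = np.zeros((N * N, N * N))
for i in range(N):
    for j in range(N):
        for p in range(N):
            for q in range(N):
                s = sum(powers[r][j, p] * powers[k - 2 - r][q, i]
                        for r in range(k - 1))
                H[i * N + j, p * N + q] = k / N * s

bound = k * (k - 1) * np.linalg.norm(A, 2) ** (k - 2) / N
assert np.linalg.norm(H, 2) <= bound + 1e-9
print(np.linalg.norm(H, 2), bound)
```
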
Combining, we get the desired result:

\[ d_{TV}\big(\operatorname{Tr}(A^k),\, N(0, \sigma^2)\big) \le \frac{C(k)}{N}. \]
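
A rough Monte Carlo illustration of this conclusion (not part of the proof; N, k, and the number of samples are arbitrary): the centered samples of $\operatorname{Tr}(A^k)$ should show stable variance and roughly Gaussian skewness and kurtosis.

```python
import numpy as np

rng = np.random.default_rng(7)
N, k, reps = 60, 3, 1000

# Monte Carlo samples of Tr(A^k) for the Gaussian random matrix A.
samples = np.array([
    np.trace(np.linalg.matrix_power(rng.standard_normal((N, N)) / np.sqrt(N), k))
    for _ in range(reps)
])

w = samples - samples.mean()
z = w / w.std()
# Simple normality diagnostics: for a Gaussian limit, skewness ~ 0 and kurtosis ~ 3.
print("variance:", w.var())
print("skewness:", np.mean(z ** 3))
print("kurtosis:", np.mean(z ** 4))
```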

