
STAT C206A / MATH C223A: Stein's method and applications

Lecture 32
Lecture date: Nov 9, 2007 Scribe: Guy Bresler

1 Matrix Norms

In this lecture we prove central limit theorems for functions of a random matrix with
Gaussian entries. We begin by reviewing two matrix norms, and some basic properties and
inequalities.

1. Suppose A is an n × n real matrix. The operator norm of A is defined as
\[ \|A\| = \sup_{\|x\|=1} \|Ax\|, \qquad x \in \mathbb{R}^n. \]
Alternatively,
\[ \|A\| = \sqrt{\lambda_{\max}(A^T A)}, \]
where $\lambda_{\max}(M)$ is the maximum eigenvalue of the matrix $M$.

Basic properties include:
\[ \|A + B\| \le \|A\| + \|B\|, \qquad \|\alpha A\| = |\alpha|\,\|A\|, \qquad \|AB\| \le \|A\|\,\|B\|. \]

2. The Hilbert-Schmidt (alternatively called the Schur, Euclidean, or Frobenius) norm is defined as
\[ \|A\|_{HS} = \sqrt{\sum_{i,j} a_{ij}^2} = \sqrt{\operatorname{Tr}(A^T A)}. \]
Clearly,
\[ \|A\|_{HS} = \sqrt{\text{sum of the eigenvalues of } A^T A}, \]
which implies that
\[ \|A\| \le \|A\|_{HS} \le \sqrt{n}\,\|A\|. \]
Of course, $\|A\|_{HS}$ also satisfies the usual properties of a norm.
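
These two norms and the inequality between them are easy to check numerically. The following is a minimal sketch (assuming numpy is available; the matrix, its size, and the random seed are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))

# Operator norm: the largest singular value, i.e. sqrt of lambda_max(A^T A).
op = np.linalg.norm(A, 2)
# Hilbert-Schmidt (Frobenius) norm: sqrt of the sum of squared entries.
hs = np.linalg.norm(A, 'fro')

# Both norms agree with their eigenvalue characterizations.
evals = np.linalg.eigvalsh(A.T @ A)
assert np.isclose(op, np.sqrt(evals.max()))
assert np.isclose(hs, np.sqrt(evals.sum()))

# Sandwich inequality: ||A|| <= ||A||_HS <= sqrt(n) ||A||.
assert op <= hs + 1e-12 and hs <= np.sqrt(n) * op + 1e-12
print(op, hs, np.sqrt(n) * op)
```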

Proposition 1 The following inequality holds:
\[ \|AB\|_{HS} \le \|A\|\,\|B\|_{HS}. \]

Proof: Let $b_1, \ldots, b_n$ denote the columns of $B$. Then
\[ \|AB\|_{HS}^2 = \sum_{i=1}^n \|Ab_i\|^2 \le \|A\|^2 \sum_{i=1}^n \|b_i\|^2 = \|A\|^2\,\|B\|_{HS}^2. \qquad \Box \]
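
A quick numerical sanity check of Proposition 1, and of the column-wise identity used in its proof (again a sketch assuming numpy; the matrices are arbitrary Gaussian samples):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# Proposition 1: ||AB||_HS <= ||A|| ||B||_HS.
lhs = np.linalg.norm(A @ B, 'fro')
rhs = np.linalg.norm(A, 2) * np.linalg.norm(B, 'fro')
assert lhs <= rhs + 1e-9

# The decomposition from the proof: ||AB||_HS^2 = sum_i ||A b_i||^2 over columns b_i of B.
col_norms_sq = np.linalg.norm(A @ B, axis=0) ** 2
assert np.isclose(col_norms_sq.sum(), lhs ** 2)
```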

3. A simple matrix inequality follows from the Cauchy-Schwarz inequality:
\[ |\operatorname{Tr}(AB)| = \Big| \sum_{i,j} a_{ij} b_{ji} \Big| \le \|A\|_{HS}\,\|B\|_{HS}. \]

4. Combining the proposition above with observation 3 gives the inequality
\[ |\operatorname{Tr}(ACBD)| \le \|AC\|_{HS}\,\|BD\|_{HS} \le \|A\|\,\|B\|\,\|C\|_{HS}\,\|D\|_{HS}. \]
More generally, for any two distinct indices $i$ and $j$, it holds that
\[ |\operatorname{Tr}(A_1 A_2 \cdots A_k)| \le \|A_i\|_{HS}\,\|A_j\|_{HS} \prod_{l \ne i,j} \|A_l\|. \]
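
Both trace inequalities can be spot-checked in the same way (an illustrative sketch, not part of the argument; the matrices and sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
A, B, C, D = (rng.standard_normal((n, n)) for _ in range(4))

# Observation 3: |Tr(AB)| <= ||A||_HS ||B||_HS (Cauchy-Schwarz on the entries).
assert abs(np.trace(A @ B)) <= np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro')

# Observation 4: |Tr(ACBD)| <= ||A|| ||B|| ||C||_HS ||D||_HS.
lhs = abs(np.trace(A @ C @ B @ D))
rhs = (np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
       * np.linalg.norm(C, 'fro') * np.linalg.norm(D, 'fro'))
assert lhs <= rhs
```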

Next, recall the theorem from last lecture:

Theorem 2 Let $X_1, \ldots, X_n$ be i.i.d. $N(0,1)$ random variables, let $f \in C^2(\mathbb{R}^n)$, and set $W = f(X)$ with $E W = 0$. Then
\[ d_{TV}\big(\mathcal{L}(W), N(0, \sigma^2)\big) \le \frac{10}{\sigma^2} \left( E\|\operatorname{Hess} f(X)\|^4 \; E\|\nabla f(X)\|^4 \right)^{1/4}, \]
where $\sigma^2 = \operatorname{Var}(W)$.

We will use this theorem to study the Gaussian random matrix.

2 CLT for Tr(A^k)

Suppose
\[ A = \frac{1}{\sqrt{N}} (X_{ij})_{1 \le i,j \le N}, \]
where $X_{ij} \overset{\text{i.i.d.}}{\sim} N(0,1)$. Fix a positive integer $k$. We would like a CLT for $\operatorname{Tr}(A^k)$.

To begin, note that
\[ \operatorname{Tr}(A^k) = \frac{1}{N^{k/2}} \sum_{1 \le i_1, i_2, \ldots, i_k \le N} X_{i_1,i_2} X_{i_2,i_3} \cdots X_{i_{k-1},i_k} X_{i_k,i_1}. \tag{1} \]

It turns out that the usual dependency graph theorem fails for k ≥ 3, so a more powerful
method must be used.
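
Equation (1) is simply the entrywise expansion of the trace of a matrix power; the brute-force sketch below confirms it for a small instance (assuming numpy; N, k, and the seed are arbitrary):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)
N, k = 5, 3
X = rng.standard_normal((N, N))
A = X / np.sqrt(N)

# Left side of (1): the trace of A^k computed directly.
direct = np.trace(np.linalg.matrix_power(A, k))

# Right side of (1): cyclic products of the unscaled entries, divided by N^(k/2).
total = 0.0
for idx in product(range(N), repeat=k):
    term = 1.0
    for a in range(k):
        term *= X[idx[a], idx[(a + 1) % k]]   # X_{i_a, i_{a+1}}, with i_{k+1} = i_1
    total += term
total /= N ** (k / 2)

assert np.isclose(direct, total)
```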

Exercise 3 Find a dependency graph theorem that works for all k.

In order to apply Theorem 2, we identify
\[ X = (X_{11}, X_{12}, \ldots, X_{1N}, X_{21}, X_{22}, \ldots, X_{NN}), \]
and $f(X) = \operatorname{Tr}(A^k)$. Now


\[ \frac{\partial f}{\partial x_{ij}} = \frac{\partial}{\partial x_{ij}} \operatorname{Tr}(A^k) \overset{(a)}{=} \operatorname{Tr}\!\left( \sum_{r=0}^{k-1} A^r \frac{\partial A}{\partial x_{ij}} A^{k-1-r} \right) \overset{(b)}{=} k \operatorname{Tr}\!\left( \frac{\partial A}{\partial x_{ij}} A^{k-1} \right), \tag{2} \]
where (a) follows from the fact that for two matrices $A$ and $B$, $\frac{\partial}{\partial x}(AB) = \frac{\partial A}{\partial x} B + A \frac{\partial B}{\partial x}$, and (b) from moving the trace inside the sum and using $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$. But

\[ \frac{\partial A}{\partial x_{ij}} = \frac{1}{\sqrt{N}} e_i e_j^T, \]
where $e_i$ is the $i$th standard basis vector, i.e. the vector of all zeros with a 1 in the $i$th position.

Thus
\begin{align*}
\frac{\partial f}{\partial x_{ij}} &= \frac{k}{\sqrt{N}} \operatorname{Tr}(e_i e_j^T A^{k-1}) \\
&= \frac{k}{\sqrt{N}} \operatorname{Tr}(e_j^T A^{k-1} e_i) \\
&= \frac{k}{\sqrt{N}} (A^{k-1})_{ji}.
\end{align*}
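
This derivative formula can be checked against a central finite difference (an illustrative sketch; the indices i, j, the step size, and the matrix size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
N, k = 6, 4
X = rng.standard_normal((N, N))

def f(M):
    # f(M) = Tr(A^k) with A = M / sqrt(N).
    return np.trace(np.linalg.matrix_power(M / np.sqrt(N), k))

i, j, h = 1, 3, 1e-6
Xp = X.copy(); Xp[i, j] += h
Xm = X.copy(); Xm[i, j] -= h
numeric = (f(Xp) - f(Xm)) / (2 * h)          # central finite difference

A = X / np.sqrt(N)
analytic = k / np.sqrt(N) * np.linalg.matrix_power(A, k - 1)[j, i]
assert np.isclose(numeric, analytic, rtol=1e-4, atol=1e-6)
```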

This allows us to calculate
\begin{align}
\|\nabla f(X)\|^2 = \sum_{i,j} \left( \frac{\partial f}{\partial x_{ij}} \right)^2 &= \frac{k^2}{N} \sum_{i,j} (A^{k-1})_{ji}^2 \nonumber \\
&= \frac{k^2}{N} \|A^{k-1}\|_{HS}^2 \tag{3} \\
&\le \frac{k^2}{N} \, N \|A^{k-1}\|^2 \nonumber \\
&\le k^2 \|A\|^{2(k-1)}. \nonumber
\end{align}

Lemma 4 For every $p \in \mathbb{Z}^+$,
\[ E\|A\|^p \le C(p), \]
where $C(p)$ is a constant independent of $N$.

Proof: The proof is essentially as follows. For a positive definite random matrix $B$, $\|B\| = \lambda_{\max}(B)$. Thus
\[ E\|B\|^p = E\lambda_{\max}^p \le \big(E\lambda_{\max}^{pm}\big)^{1/m} \le \big(E \operatorname{Tr}(B^{pm})\big)^{1/m} \quad \text{for any } m. \]
Now let $m \to \infty$ suitably with $N$. (For our $A$, which need not be positive definite, one applies this to $B = A^T A$, since $\|A\|^2 = \lambda_{\max}(A^T A)$.) $\Box$
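
Although the proof is only sketched, the conclusion of Lemma 4 is easy to probe by simulation: Monte Carlo estimates of $E\|A\|^p$ should stabilize as $N$ grows, since the operator norm of the rescaled Gaussian matrix concentrates near 2. (A rough sketch assuming numpy; p, the sample size, and the values of N are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(5)
p, reps = 4, 50

for N in (20, 50, 100, 200):
    norms = [np.linalg.norm(rng.standard_normal((N, N)) / np.sqrt(N), 2) ** p
             for _ in range(reps)]
    # Estimate of E||A||^p; it should stay bounded (roughly 2^p) as N grows.
    print(N, np.mean(norms))
```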

This shows that $E\|\nabla f(X)\|^2 = O(1)$, and hence the Poincaré inequality (for a standard Gaussian vector $X$, $\operatorname{Var}(f(X)) \le E\|\nabla f(X)\|^2$) implies that $\operatorname{Var}(f(X)) = O(1)$.

Exercise 5 Show that any two terms in the sum of equation (1) have non-negative covariance.

The exercise implies that
\[ \operatorname{Var}(f(X)) \ge \frac{1}{N^k} \sum_{1 \le i_1, \ldots, i_k \le N} \operatorname{Var}\big(X_{i_1,i_2} \cdots X_{i_k,i_1}\big) \ge C(k) > 0. \]
Recalling the result of Theorem 2,

\[ d_{TV}\big(\mathcal{L}(W), N(0, \sigma^2)\big) \le \frac{10}{\sigma^2} \left( E\|\operatorname{Hess} f(X)\|^4 \; E\|\nabla f(X)\|^4 \right)^{1/4}, \]
we see that $\sigma^2 = \operatorname{Var}(f(X)) \ge C(k)$, and from equation (3) and the fact noted above, $E\|\nabla f(X)\|^4 \le C(k)$. Therefore it remains only to show that $E\|\operatorname{Hess} f(X)\|^4 \to 0$ in order to prove the desired central limit theorem.

We have
\[ \frac{\partial f}{\partial x_{ij}} = k \operatorname{Tr}\!\left( \frac{\partial A}{\partial x_{ij}} A^{k-1} \right), \]
and
\[ \frac{\partial^2 f}{\partial x_{pq}\,\partial x_{ij}} = k \operatorname{Tr}\!\left( \sum_{r=0}^{k-2} \frac{\partial A}{\partial x_{ij}} A^r \frac{\partial A}{\partial x_{pq}} A^{k-r-2} \right). \]

Fact about matrix norms: If $A$ is a symmetric, real matrix, then
\[ \|A\| = \sup_{\|x\|=\|y\|=1} |x^T A y|. \]

Now, $\operatorname{Hess} f(X)$ is an $N^2 \times N^2$ symmetric, real matrix:
\[ \|\operatorname{Hess} f(X)\| = \sup\left\{ \sum_{i,j,p,q} \frac{\partial^2 f}{\partial x_{ij}\,\partial x_{pq}}\, c_{ij} d_{pq} \;:\; \sum_{i,j} c_{ij}^2 = 1,\ \sum_{p,q} d_{pq}^2 = 1 \right\}. \]

Let $C = (c_{ij})$ and $D = (d_{pq})$ be two matrices with $\|C\|_{HS} = \|D\|_{HS} = 1$. Fix $0 \le r \le k-2$. Then
\[ \sum_{i,j,p,q} c_{ij} d_{pq} \operatorname{Tr}\!\left( \frac{\partial A}{\partial x_{ij}} A^r \frac{\partial A}{\partial x_{pq}} A^{k-2-r} \right) = \frac{1}{N} \sum_{i,j,p,q} c_{ij} d_{pq} \operatorname{Tr}\big(e_i e_j^T A^r e_p e_q^T A^{k-2-r}\big) = \frac{1}{N} \operatorname{Tr}\big(C A^r D A^{k-2-r}\big), \]
where we used the fact that $\sum_{i,j} c_{ij} e_i e_j^T = C$ and similarly for $D$.

Now
\[ \big|\operatorname{Tr}(C A^r D A^{k-2-r})\big| \le \|A\|^{k-2}\,\|C\|_{HS}\,\|D\|_{HS} = \|A\|^{k-2}. \]
Thus
\[ \|\operatorname{Hess} f(X)\| \le \frac{k(k-1)\,\|A\|^{k-2}}{N}. \]
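
As a sanity check, one can assemble the $N^2 \times N^2$ Hessian from the second-derivative formula above and compare its operator norm with this bound (a small numerical sketch assuming numpy; N and k are kept small so the Hessian fits in memory):

```python
import numpy as np

rng = np.random.default_rng(6)
N, k = 8, 4
A = rng.standard_normal((N, N)) / np.sqrt(N)
powers = [np.linalg.matrix_power(A, r) for r in range(k)]   # A^0, ..., A^{k-1}

# Entries of Hess f(X) for f(X) = Tr(A^k), A = X / sqrt(N):
# d^2 f / dx_pq dx_ij = (k/N) * sum_{r=0}^{k-2} (A^r)_{jp} (A^{k-2-r})_{qi}.
H = np.zeros((N * N, N * N))
for i in range(N):
    for j in range(N):
        for p in range(N):
            for q in range(N):
                s = sum(powers[r][j, p] * powers[k - 2 - r][q, i]
                        for r in range(k - 1))
                H[i * N + j, p * N + q] = k / N * s

bound = k * (k - 1) * np.linalg.norm(A, 2) ** (k - 2) / N
assert np.linalg.norm(H, 2) <= bound + 1e-9
print(np.linalg.norm(H, 2), bound)
```
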
Combining, we get the desired result:

\[ d_{TV}\big(\operatorname{Tr}(A^k),\, N(0, \sigma^2)\big) \le \frac{C(k)}{N}. \]
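
A rough Monte Carlo illustration of this conclusion (not part of the proof; N, k, and the number of samples are arbitrary): the centered samples of $\operatorname{Tr}(A^k)$ should show stable variance and roughly Gaussian skewness and kurtosis.

```python
import numpy as np

rng = np.random.default_rng(7)
N, k, reps = 60, 3, 1000

# Monte Carlo samples of Tr(A^k) for the Gaussian random matrix A.
samples = np.array([
    np.trace(np.linalg.matrix_power(rng.standard_normal((N, N)) / np.sqrt(N), k))
    for _ in range(reps)
])

w = samples - samples.mean()
z = w / w.std()
# Simple normality diagnostics: for a Gaussian limit, skewness ~ 0 and kurtosis ~ 3.
print("variance:", w.var())
print("skewness:", np.mean(z ** 3))
print("kurtosis:", np.mean(z ** 4))
```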

