
Cochran's Theorem and Analysis of Variance

Copyright 2012 © Dan Nettleton (Iowa State University), Statistics 611
Theorem 5.1 (Cochran's Theorem):

Suppose $Y \sim N(\mu, \sigma^2 I)$ and $A_1, \ldots, A_k$ are symmetric and idempotent matrices with $\mathrm{rank}(A_i) = s_i$ $\forall\, i = 1, \ldots, k$. Then

$$\sum_{i=1}^{k} A_i = \underset{n \times n}{I} \implies \frac{1}{\sigma^2} Y' A_i Y \ (i = 1, \ldots, k) \text{ are independently distributed as } \chi^2_{s_i}(\phi_i),$$

with

$$\phi_i = \frac{1}{2\sigma^2} \mu' A_i \mu, \qquad \sum_{i=1}^{k} s_i = n.$$

Proof of Theorem 5.1:

By Result 5.15,

$$\frac{1}{\sigma^2} Y' A_i Y \sim \chi^2_{s_i}(\phi_i) \quad \text{with} \quad \phi_i = \frac{1}{2\sigma^2} \mu' A_i \mu$$

$\because$

$$\frac{1}{\sigma^2} A_i \,(\sigma^2 I) = A_i$$

is idempotent and has rank $s_i$ $\forall\, i = 1, \ldots, k$.

Also,

$$\sum_{i=1}^{k} s_i = \sum_{i=1}^{k} \mathrm{rank}(A_i) = \sum_{i=1}^{k} \mathrm{trace}(A_i) = \mathrm{trace}\!\left(\sum_{i=1}^{k} A_i\right) = \mathrm{trace}\left(\underset{n \times n}{I}\right) = n,$$

where $\mathrm{rank}(A_i) = \mathrm{trace}(A_i)$ because each $A_i$ is idempotent.

By Lemma 5.1, $\exists\ \underset{n \times s_i}{G_i} \ni$

$$G_i G_i' = A_i, \qquad G_i' G_i = \underset{s_i \times s_i}{I} \quad \forall\, i = 1, \ldots, k.$$

Now let $G = [G_1, \ldots, G_k]$.

$\because$ $G_i$ is $n \times s_i$ $\forall\, i = 1, \ldots, k$ and $\sum_{i=1}^{k} s_i = n$, it follows that $G$ is $n \times n$.

Moreover,

$$GG' = \begin{bmatrix} G_1 & \cdots & G_k \end{bmatrix} \begin{bmatrix} G_1' \\ \vdots \\ G_k' \end{bmatrix} = \sum_{i=1}^{k} G_i G_i' = \sum_{i=1}^{k} A_i = \underset{n \times n}{I}.$$

Thus, $G$ has $G'$ as its inverse; i.e., $G^{-1} = G'$, and hence $G'G = I$.

Now we have

$$I = G'G = \begin{bmatrix} G_1' \\ \vdots \\ G_k' \end{bmatrix} \begin{bmatrix} G_1 & \cdots & G_k \end{bmatrix} = \begin{bmatrix} G_1'G_1 & G_1'G_2 & \cdots & G_1'G_k \\ G_2'G_1 & G_2'G_2 & \cdots & G_2'G_k \\ \vdots & \vdots & \ddots & \vdots \\ G_k'G_1 & G_k'G_2 & \cdots & G_k'G_k \end{bmatrix}.$$

$\therefore\ G_i'G_j = 0 \ \forall\, i \neq j.$
$\therefore\ G_i G_i' G_j G_j' = 0 \ \forall\, i \neq j$
$\therefore\ A_i A_j = 0 \ \forall\, i \neq j$
$\therefore\ \sigma^2 A_i A_j = 0 \ \forall\, i \neq j$
$\therefore\ A_i (\sigma^2 I) A_j = 0 \ \forall\, i \neq j$
$\therefore$ By Corollary 5.4, each pair of quadratic forms $Y'A_iY/\sigma^2$ and $Y'A_jY/\sigma^2$ with $i \neq j$ is independent.

However, we can prove more than pairwise independence.

$$\begin{bmatrix} G_1'Y \\ \vdots \\ G_k'Y \end{bmatrix} = \begin{bmatrix} G_1' \\ \vdots \\ G_k' \end{bmatrix} Y = \begin{bmatrix} G_1 & \cdots & G_k \end{bmatrix}' Y = G'Y \sim N\big(G'\mu,\ G'(\sigma^2 I)G = \sigma^2 G'G = \sigma^2 I\big).$$

By Result 5.4, $G_1'Y, \ldots, G_k'Y$ are mutually independent.

$G_1'Y, \ldots, G_k'Y$ mutually independent

$\implies \|G_1'Y\|^2, \ldots, \|G_k'Y\|^2$ mutually independent
$\implies Y'G_1G_1'Y, \ldots, Y'G_kG_k'Y$ mutually independent
$\implies Y'A_1Y/\sigma^2, \ldots, Y'A_kY/\sigma^2$ mutually independent.

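A minimal simulation sketch of the construction used in this proof (an addition, not part of the original notes; it assumes numpy and scipy are available): take the column blocks of a random orthogonal matrix as the $G_i$, set $A_i = G_iG_i'$, and check that the scaled quadratic forms behave like independent chi-squares in the central case.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, sizes = 12, [3, 4, 5]                   # s_1 + s_2 + s_3 = n

# Column blocks of a random orthogonal matrix play the role of the G_i,
# so A_i = G_i G_i' is symmetric idempotent with rank s_i and sum_i A_i = I.
Q = np.linalg.qr(rng.normal(size=(n, n)))[0]
blocks = np.split(np.arange(n), np.cumsum(sizes)[:-1])
A = [Q[:, b] @ Q[:, b].T for b in blocks]
assert np.allclose(sum(A), np.eye(n))

mu, sigma, reps = np.zeros(n), 1.5, 50000  # central case: every phi_i = 0
Y = rng.normal(mu, sigma, size=(reps, n))

# Q_i = Y'A_iY / sigma^2 should be (mutually independent) chi^2_{s_i}.
Qforms = [np.einsum('ri,ij,rj->r', Y, Ai, Y) / sigma**2 for Ai in A]
for Qi, si in zip(Qforms, sizes):
    # sample mean should be near s_i; KS p-value should not be tiny
    print(si, Qi.mean(), stats.kstest(Qi, 'chi2', args=(si,)).pvalue)
print(np.corrcoef(Qforms)[0, 1:])          # near zero, consistent with independence
```
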
Example:

Suppose $Y_1, \ldots, Y_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$.

Find the joint distribution of $n\bar{Y}_\cdot^2$ and $\sum_{i=1}^{n}(Y_i - \bar{Y}_\cdot)^2$.

Let $A_1 = P_{\mathbf{1}} = \mathbf{1}(\mathbf{1}'\mathbf{1})^-\mathbf{1}' = \frac{1}{n}\mathbf{1}\mathbf{1}'$.

Let $A_2 = I - P_{\mathbf{1}} = I - \frac{1}{n}\mathbf{1}\mathbf{1}'$.

Then $\mathrm{rank}(A_1) = 1$ and $\mathrm{rank}(A_2) = n - 1$.

Also, $A_1$ and $A_2$ are each symmetric and idempotent matrices $\ni$

$$A_1 + A_2 = P_{\mathbf{1}} + I - P_{\mathbf{1}} = I.$$

Let $Y = (Y_1, \ldots, Y_n)' \ni \mathrm{E}(Y) = \mu\mathbf{1}$, $\mathrm{Var}(Y) = \sigma^2 I$.

Cochran's Theorem $\implies \frac{1}{\sigma^2} Y' A_i Y \overset{\text{IND}}{\sim} \chi^2_{s_i}(\phi_i)$, where

$$s_i = \mathrm{rank}(A_i) \quad \text{and} \quad \phi_i = \frac{1}{2\sigma^2}\mu' A_i \mu = \frac{\mu^2}{2\sigma^2}\mathbf{1}' A_i \mathbf{1}.$$

For $i = 1$, we have

$$\frac{1}{\sigma^2}\, Y'\mathbf{1}\mathbf{1}'Y/n = \frac{n}{\sigma^2}\bar{Y}_\cdot^2, \qquad s_1 = 1, \qquad \text{and} \qquad \phi_1 = \frac{\mu^2}{2\sigma^2}\mathbf{1}'\left(\frac{1}{n}\mathbf{1}\mathbf{1}'\right)\mathbf{1} = \frac{n\mu^2}{2\sigma^2}.$$

For $i = 2$, we have

$$\frac{1}{\sigma^2} Y'A_2Y = \frac{1}{\sigma^2} Y'A_2'A_2Y = \frac{1}{\sigma^2}\|A_2Y\|^2 = \frac{1}{\sigma^2}\left\|\left(I - \frac{1}{n}\mathbf{1}\mathbf{1}'\right)Y\right\|^2 = \frac{1}{\sigma^2}\|Y - \mathbf{1}\bar{Y}_\cdot\|^2 = \frac{1}{\sigma^2}\sum_{i=1}^{n}(Y_i - \bar{Y}_\cdot)^2.$$

Also, $s_2 = n - 1$ and

$$\phi_2 = \frac{\mu^2}{2\sigma^2}\mathbf{1}'\left(I - \frac{1}{n}\mathbf{1}\mathbf{1}'\right)\mathbf{1} = \frac{\mu^2}{2\sigma^2}\left(\mathbf{1}'\mathbf{1} - \frac{1}{n}\mathbf{1}'\mathbf{1}\mathbf{1}'\mathbf{1}\right) = \frac{\mu^2}{2\sigma^2}\left(n - \frac{n^2}{n}\right) = 0.$$

Thus,

$$n\bar{Y}_\cdot^2 \sim \sigma^2\chi^2_1\!\left(\frac{n\mu^2}{2\sigma^2}\right)$$

independent of

$$\sum_{i=1}^{n}(Y_i - \bar{Y}_\cdot)^2 \sim \sigma^2\chi^2_{n-1}.$$

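A quick simulation check of this joint-distribution claim (an addition, not in the original slides; note that scipy's `ncx2` parameterizes noncentrality as $\lambda = n\mu^2/\sigma^2 = 2\phi_1$, i.e., without the factor of $\tfrac{1}{2}$ used in these notes):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, mu, sigma, reps = 8, 1.0, 2.0, 50000

Y = rng.normal(mu, sigma, size=(reps, n))
ybar = Y.mean(axis=1)
q1 = n * ybar**2                              # n * Ybar^2
q2 = ((Y - ybar[:, None])**2).sum(axis=1)     # sum_i (Y_i - Ybar)^2

lam = n * mu**2 / sigma**2                    # scipy's noncentrality = 2 * phi_1
print(stats.kstest(q1 / sigma**2, 'ncx2', args=(1, lam)).pvalue)   # noncentral chi^2_1
print(stats.kstest(q2 / sigma**2, 'chi2', args=(n - 1,)).pvalue)   # chi^2_{n-1}
print(np.corrcoef(q1, q2)[0, 1])              # near zero, consistent with independence
```
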
It follows that

$$\frac{n\bar{Y}_\cdot^2/\sigma^2}{\dfrac{\sum_{i=1}^{n}(Y_i - \bar{Y}_\cdot)^2}{n-1}\Big/\sigma^2} = \frac{n\bar{Y}_\cdot^2}{\dfrac{\sum_{i=1}^{n}(Y_i - \bar{Y}_\cdot)^2}{n-1}} \sim F_{1,\,n-1}\!\left(\frac{n\mu^2}{2\sigma^2}\right).$$

If $\mu = 0$, then

$$\frac{n\bar{Y}_\cdot^2}{\sum_{i=1}^{n}(Y_i - \bar{Y}_\cdot)^2/(n-1)} \sim F_{1,\,n-1}.$$

Thus, we can test $H_0: \mu = 0$ by comparing $\dfrac{n\bar{Y}_\cdot^2}{\sum_{i=1}^{n}(Y_i - \bar{Y}_\cdot)^2/(n-1)}$ to the $F_{1,\,n-1}$ distribution and rejecting $H_0$ for large values.

Note that

$$\frac{n\bar{Y}_\cdot^2}{\sum_{i=1}^{n}(Y_i - \bar{Y}_\cdot)^2/(n-1)} = t^2, \quad \text{where} \quad t = \frac{\bar{Y}_\cdot}{\sqrt{s^2/n}},$$

the usual $t$-statistic for testing $H_0: \mu = 0$.

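A short numerical confirmation of this identity (an addition, not in the original slides):

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(0.5, 2.0, size=8)

s2 = y.var(ddof=1)                     # usual sample variance
t = y.mean() / np.sqrt(s2 / len(y))    # one-sample t statistic for H0: mu = 0
F = len(y) * y.mean()**2 / s2          # the F statistic from the slide above

print(t**2, F)                         # equal up to floating-point rounding
```
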
Example: ANalysis Of VAriance (ANOVA)

Consider $Y = X\beta + \varepsilon$, where

$$X = [X_1, X_2, \ldots, X_m], \qquad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{bmatrix}, \qquad \varepsilon \sim N(0, \sigma^2 I).$$

Let $P_j = P_{[X_1, \ldots, X_j]}$ for $j = 1, \ldots, m$.

Let

$$\begin{aligned}
A_1 &= P_1 \\
A_2 &= P_2 - P_1 \\
A_3 &= P_3 - P_2 \\
&\ \ \vdots \\
A_m &= P_m - P_{m-1} \\
A_{m+1} &= I - P_m.
\end{aligned}$$

Note that $\sum_{j=1}^{m+1} A_j = I$.

Then $A_j$ is symmetric and idempotent $\forall\, j = 1, \ldots, m+1$, and

$$\begin{aligned}
s_1 &= \mathrm{rank}(A_1) = \mathrm{rank}(P_1) \\
s_2 &= \mathrm{rank}(A_2) = \mathrm{rank}(P_2) - \mathrm{rank}(P_1) \\
&\ \ \vdots \\
s_m &= \mathrm{rank}(A_m) = \mathrm{rank}(P_m) - \mathrm{rank}(P_{m-1}) \\
s_{m+1} &= \mathrm{rank}(A_{m+1}) = \mathrm{rank}(I) - \mathrm{rank}(P_m) = n - \mathrm{rank}(X).
\end{aligned}$$

It follows that

$$\frac{1}{\sigma^2} Y'A_jY \overset{\text{IND}}{\sim} \chi^2_{s_j}(\phi_j) \quad \forall\, j = 1, \ldots, m+1, \qquad \text{where} \quad \phi_j = \frac{1}{2\sigma^2}\beta'X'A_jX\beta.$$

Note

$$\phi_{m+1} = \frac{1}{2\sigma^2}\beta'X'(I - P_X)X\beta = \frac{1}{2\sigma^2}\beta'X'(X - P_XX)\beta = \frac{1}{2\sigma^2}\beta'X'(X - X)\beta = 0.$$

Thus,

$$\frac{1}{\sigma^2} Y'A_{m+1}Y \sim \chi^2_{n - \mathrm{rank}(X)}$$

and

$$F_j = \frac{Y'A_jY/s_j}{Y'A_{m+1}Y/(n - \mathrm{rank}(X))} \sim F_{s_j,\,n-\mathrm{rank}(X)}\!\left(\frac{1}{2\sigma^2}\beta'X'A_jX\beta\right) \quad \forall\, j = 1, \ldots, m.$$

We can assemble the ANOVA table as below:

| Source    | Sum of Squares | DF        | Mean Square          | Expected Mean Square               | F        |
|-----------|----------------|-----------|----------------------|------------------------------------|----------|
| $A_1$     | $Y'A_1Y$       | $s_1$     | $Y'A_1Y/s_1$         | $\sigma^2 + \beta'X'A_1X\beta/s_1$ | $F_1$    |
| $\vdots$  | $\vdots$       | $\vdots$  | $\vdots$             | $\vdots$                           | $\vdots$ |
| $A_m$     | $Y'A_mY$       | $s_m$     | $Y'A_mY/s_m$         | $\sigma^2 + \beta'X'A_mX\beta/s_m$ | $F_m$    |
| $A_{m+1}$ | $Y'A_{m+1}Y$   | $s_{m+1}$ | $Y'A_{m+1}Y/s_{m+1}$ | $\sigma^2$                         |          |
| $I$       | $Y'IY$         | $n$       |                      |                                    |          |

This ANOVA table contains sequential (a.k.a. type I) sums of squares.

$A_{m+1}$ corresponds to "error."

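A sketch of assembling this table numerically via the projection differences $A_j = P_j - P_{j-1}$ (an addition, not from the original notes; the column blocks below are made-up illustrative data):

```python
import numpy as np

def proj(X):
    """Orthogonal projector onto C(X), via the Moore-Penrose pseudoinverse."""
    return X @ np.linalg.pinv(X)

rng = np.random.default_rng(4)
n = 30
# Hypothetical blocks: intercept, one covariate, then a two-column block.
X_blocks = [np.ones((n, 1)), rng.normal(size=(n, 1)), rng.normal(size=(n, 2))]
X = np.hstack(X_blocks)
Y = X @ rng.normal(size=X.shape[1]) + rng.normal(size=n)

P_prev = np.zeros((n, n))
rows = []
for j in range(1, len(X_blocks) + 1):
    P_j = proj(np.hstack(X_blocks[:j]))
    A_j = P_j - P_prev                       # A_j = P_j - P_{j-1}, with P_0 = 0
    ss = Y @ A_j @ Y                         # sequential (type I) sum of squares
    df = int(round(np.trace(A_j)))           # rank = trace for idempotent A_j
    rows.append((f"A{j}", ss, df))
    P_prev = P_j
A_err = np.eye(n) - P_prev                   # A_{m+1} = I - P_m ("error")
rows.append(("error", Y @ A_err @ Y, n - int(round(np.trace(P_prev)))))

for name, ss, df in rows:
    print(f"{name:5s}  SS = {ss:9.4f}  df = {df:2d}  MS = {ss / df:9.4f}")
```
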
We can use $F_j$ to test

$$H_{0j}: \frac{1}{2\sigma^2}\beta'X'A_jX\beta = 0 \iff H_{0j}: \beta'X'A_j'A_jX\beta = 0 \iff H_{0j}: A_jX\beta = 0,$$

where the first equivalence uses $A_j = A_j'A_j$ (symmetric and idempotent) and the second uses $\beta'X'A_j'A_jX\beta = \|A_jX\beta\|^2$.

Now

$$A_jX\beta = A_j[X_1, \ldots, X_m]\begin{bmatrix} \beta_1 \\ \vdots \\ \beta_m \end{bmatrix} = A_j\sum_{i=1}^{m} X_i\beta_i = \sum_{i=1}^{m} A_jX_i\beta_i = \sum_{i=1}^{m}(P_j - P_{j-1})X_i\beta_i \quad \forall\, j = 1, \ldots, m \ (\text{where } P_0 = 0).$$

Recall $P_j = P_{[X_1, \ldots, X_j]}$.

Thus, $P_jX_i = X_i$ $\forall\, i \le j$.

It follows that

$$(P_j - P_{j-1})X_i = 0 \quad \text{whenever } i \le j - 1.$$

Therefore

$$A_jX\beta = \sum_{i=1}^{m}(P_j - P_{j-1})X_i\beta_i = \sum_{i=j}^{m}(P_j - P_{j-1})X_i\beta_i.$$

For $j = m$, this simplifies to

$$A_mX\beta = (P_m - P_{m-1})X_m\beta_m = (I - P_{m-1})P_mX_m\beta_m = (I - P_{m-1})X_m\beta_m.$$

Now

$$\begin{aligned}
(I - P_{m-1})X_m\beta_m = 0 &\iff X_m\beta_m \in \mathcal{N}(I - P_{m-1}) \\
&\iff X_m\beta_m \in \mathcal{C}(P_{m-1}) \\
&\iff X_m\beta_m \in \mathcal{C}([X_1, \ldots, X_{m-1}]).
\end{aligned}$$

Now

$$X_m\beta_m \in \mathcal{C}([X_1, \ldots, X_{m-1}]) \iff \mathrm{E}(Y) = X\beta = \sum_{i=1}^{m} X_i\beta_i \in \mathcal{C}([X_1, \ldots, X_{m-1}])$$

$\iff$ the explanatory variables in $X_m$ are irrelevant in the presence of $X_1, \ldots, X_{m-1}$.

For the special case where $X$ is of full column rank,

$$X_m\beta_m \in \mathcal{C}([X_1, \ldots, X_{m-1}]) \iff X_m\beta_m = 0 \iff \beta_m = 0.$$

Thus, in this full column rank case, $F_m$ tests $H_{0m}: \beta_m = 0$.

Explain what $F_j$ tests for the special case where $X_k'X_{k^*} = 0$ $\forall\, k \neq k^*$.

Now suppose $X_k'X_{k^*} = 0$ $\forall\, k \neq k^*$.

Then

$$P_jX_i = \begin{cases} X_i & \text{for } i \le j \\ 0 & \text{for } i > j \end{cases}$$

$\because$

$$[X_1, \ldots, X_j]'X_i = \begin{bmatrix} X_1'X_i \\ \vdots \\ X_j'X_i \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \quad \text{for } i > j.$$

It follows that

$$A_jX\beta = \sum_{i=1}^{m}(P_j - P_{j-1})X_i\beta_i = P_jX_j\beta_j = X_j\beta_j.$$

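A small numerical check that $A_jX\beta$ collapses to $X_j\beta_j$ under this orthogonality condition (an addition, not in the original slides; the blocks are made mutually orthogonal by construction via QR):

```python
import numpy as np

def proj(X):
    return X @ np.linalg.pinv(X)   # orthogonal projector onto C(X)

rng = np.random.default_rng(5)
n = 12
# Mutually orthogonal column blocks, so X_k' X_k* = 0 for all k != k*.
Q = np.linalg.qr(rng.normal(size=(n, 5)))[0]
X_blocks = [Q[:, :1], Q[:, 1:3], Q[:, 3:5]]
betas = [rng.normal(size=b.shape[1]) for b in X_blocks]
Xbeta = sum(b @ g for b, g in zip(X_blocks, betas))

P_prev = np.zeros((n, n))
for j, (b, g) in enumerate(zip(X_blocks, betas), start=1):
    P_j = proj(np.hstack(X_blocks[:j]))
    A_j = P_j - P_prev
    print(j, np.allclose(A_j @ Xbeta, b @ g))   # True: A_j X beta = X_j beta_j
    P_prev = P_j
```
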
Thus, for the orthogonal case, $F_j$ can be used to test

$$H_{0j}: X_j\beta_j = 0 \quad \forall\, j = 1, \ldots, m.$$

If $X$ has full column rank in addition to the orthogonality condition, $F_j$ tests

$$H_{0j}: \beta_j = 0 \quad \forall\, j = 1, \ldots, m.$$
