nȳ² = n((1/n) j⊤y)² = (1/n)(j⊤y)(j⊤y) = (1/n)(y⊤j)(j⊤y) = y⊤((1/n) jj⊤)y = y⊤ (1/n)J y

and

∑_{i=1}^{n} (yi − ȳ)² = ∑_{i=1}^{n} yi² − nȳ² = y⊤Iy − y⊤(1/n)J y = y⊤(I − (1/n)J)y.

I = (I − (1/n)J) + (1/n)J.
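These identities are easy to confirm numerically. Below is a quick sketch in Python with NumPy (the data vector is an arbitrary choice of my own; any y works):

```python
import numpy as np

# Numerical check of the quadratic-form identities above.
y = np.array([2.0, 5.0, 1.0, 4.0, 3.0])
n = len(y)
J = np.ones((n, n))   # matrix of all ones
I = np.eye(n)

ybar = y.mean()
lhs_ss = y @ (I - J / n) @ y        # y'(I - (1/n)J)y
rhs_ss = np.sum((y - ybar) ** 2)    # sum of squared deviations

lhs_mean = y @ (J / n) @ y          # y'((1/n)J)y
rhs_mean = n * ybar ** 2            # n * ybar^2
print(lhs_ss, rhs_ss, lhs_mean, rhs_mean)
```

Both pairs agree to floating-point precision, matching the algebra above.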
((1/n)J)((1/n)J) = (1/n²)(JJ) = (1/n²)(nJ) = (1/n)J,

and

(I − (1/n)J)(I − (1/n)J) = I − (2/n)J + ((1/n)J)² = I − (2/n)J + (1/n)J = I − (1/n)J,

and (b)

(I − (1/n)J)((1/n)J) = (1/n)J − ((1/n)J)² = (1/n)J − (1/n)J = O.
E(y⊤Ay) = tr(AΣ) + μ⊤Aμ.
www.math.louisville.edu/~rsgill01/668/Ch_5_Notes.html 1/7
10/3/2020 Chapter 5: Distribution of Quadratic Forms
E(y⊤Ay) = E(tr(y⊤Ay))
= E(tr(Ayy⊤))
= tr(E(Ayy⊤))
= tr(A E(yy⊤))
= tr(A(Σ + μμ⊤))
= tr(AΣ + Aμμ⊤)
= tr(AΣ) + tr(Aμμ⊤)
= tr(AΣ) + tr(μ⊤Aμ)
= tr(AΣ) + μ⊤Aμ.
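As a sanity check on this result, here is a Monte Carlo sketch in Python; the matrices A, Σ and mean μ below are arbitrary illustrative values of my own choosing, not from the notes:

```python
import numpy as np

# Monte Carlo check of E(y'Ay) = tr(A Sigma) + mu'A mu.
rng = np.random.default_rng(668)
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
A = np.array([[1.0, 0.2, 0.0],
              [0.2, 2.0, 0.1],
              [0.0, 0.1, 0.5]])   # symmetric

y = rng.multivariate_normal(mu, Sigma, size=200_000)
# y'Ay for each simulated row of y
mc_mean = np.mean(np.einsum('ij,jk,ik->i', y, A, y))
theory = np.trace(A @ Sigma) + mu @ A @ mu
print(mc_mean, theory)
```

The simulated average and the theoretical value agree to within Monte Carlo error.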
Theorem 5.2.2 (p.111): If A is an m × n matrix of constants, and x and y are m- and n-dimensional random vectors such that

E[x; y] = [μx; μy]  and  cov[x; y] = [Σxx, Σxy; Σyx, Σyy]

(writing the stacked vector x over y, with ";" separating block rows), then

E(x⊤Ay) = tr(AΣyx) + μx⊤Aμy.
Example 5.2.1: Suppose that (x1, y1), …, (xn, yn) is a random sample such that E[xi; yi] = [μx; μy] and cov[xi; yi] = [σx², σxy; σxy, σy²], and let

sxy = (1/(n − 1)) ∑_{i=1}^{n} (xi − x̄)(yi − ȳ).

Show that E(sxy) = σxy.
Answer: Let x = (x1, …, xn)⊤ and y = (y1, …, yn)⊤. Then E[x; y] = [μx j; μy j] and cov[x; y] = [σx² I, σxy I; σxy I, σy² I]. Since

sxy = (1/(n − 1)) x⊤(I − (1/n)J) y,

E(sxy) = (1/(n − 1)) {tr((I − (1/n)J) σxy I) + (μx j)⊤(I − (1/n)J)(μy j)}
= (1/(n − 1)) {σxy tr(I − (1/n)J) + μx μy j⊤(I − (1/n) jj⊤)j}
= (1/(n − 1)) {σxy (tr(I) − (1/n) tr(J)) + μx μy (j⊤j − (1/n) j⊤jj⊤j)}
= (1/(n − 1)) {σxy (n − (1/n)·n) + μx μy (n − (1/n)·n²)}
= (1/(n − 1)) {σxy (n − 1) + 0}
= σxy.
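A simulation sketch of Example 5.2.1 in Python (the parameter values are my own illustrative choices): averaged over many replicated samples, sxy should be close to σxy, i.e. the sample covariance is unbiased.

```python
import numpy as np

# Unbiasedness of the sample covariance s_xy: E(s_xy) = sigma_xy.
rng = np.random.default_rng(52)
mu = np.array([1.0, 2.0])
Sigma = np.array([[4.0, 1.5],    # sigma_x^2 = 4, sigma_xy = 1.5
                  [1.5, 3.0]])   # sigma_y^2 = 3
n, reps = 10, 200_000

xy = rng.multivariate_normal(mu, Sigma, size=(reps, n))
x, y = xy[..., 0], xy[..., 1]
# s_xy for each replicated sample of size n
sxy = np.sum((x - x.mean(axis=1, keepdims=True)) *
             (y - y.mean(axis=1, keepdims=True)), axis=1) / (n - 1)
print(sxy.mean())   # close to sigma_xy = 1.5
```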
Theorem 5.2.3 (p.108): If A is a p × p matrix of constants and y ∼ Np(μ, Σ), then the moment generating function of y⊤Ay is

My⊤Ay(t) = det(I − 2tAΣ)^{−1/2} exp(−μ⊤(I − (I − 2tAΣ)^{−1})Σ^{−1}μ/2).
var(y⊤Ay) = 2 tr(AΣAΣ) + 4μ⊤AΣAμ.

cov(y, y⊤Ay) = 2ΣAμ.
Proof: We have

cov(y, y⊤Ay) = E[(y − μ)(y⊤Ay − tr(AΣ) − μ⊤Aμ)]
= E[(y − μ)((y − μ)⊤A(y − μ) + 2(y − μ)⊤Aμ − tr(AΣ))]
= E[(y − μ)(y − μ)⊤A(y − μ)] + 2E[(y − μ)(y − μ)⊤]Aμ − E(y − μ) tr(AΣ)
= E[Σ^{1/2}zz⊤Σ^{1/2}AΣ^{1/2}z] + 2ΣAμ − 0,

where z = Σ^{−1/2}(y − μ) ∼ Np(0, I). Letting B = Σ^{1/2}AΣ^{1/2}, it follows that

E[zz⊤Bz] = E[(z1, …, zp)⊤ ∑_{i=1}^{p} ∑_{j=1}^{p} bij zi zj],

so the kth component of E[zz⊤Bz] is E[∑_{i=1}^{p} ∑_{j=1}^{p} bij zk zi zj]. Since z1, …, zp are independent with mean 0, every term with i ≠ k or j ≠ k has expectation 0, leaving

bkk E(zk³) = bkk ∫_{−∞}^{∞} z³ (1/√(2π)) e^{−z²/2} dz = 0.

Thus, E(Σ^{1/2}zz⊤Σ^{1/2}AΣ^{1/2}z) = 0, which implies that cov(y, y⊤Ay) = 2ΣAμ.
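A Monte Carlo sketch of this covariance identity in Python (again with arbitrary illustrative A, μ, Σ of my own choosing):

```python
import numpy as np

# Monte Carlo check of cov(y, y'Ay) = 2 Sigma A mu.
rng = np.random.default_rng(113)
mu = np.array([1.0, -1.0, 2.0])
Sigma = np.array([[1.0, 0.2, 0.0],
                  [0.2, 2.0, 0.4],
                  [0.0, 0.4, 1.0]])
A = np.array([[2.0, 0.0, 0.5],
              [0.0, 1.0, 0.0],
              [0.5, 0.0, 1.0]])   # symmetric

y = rng.multivariate_normal(mu, Sigma, size=500_000)
q = np.einsum('ij,jk,ik->i', y, A, y)        # quadratic form y'Ay per row
mc_cov = ((y - y.mean(axis=0)) * (q - q.mean())[:, None]).mean(axis=0)
theory = 2 * Sigma @ A @ mu
print(mc_cov, theory)
```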
and noncentrality parameter λ = (1/2) ∑_{i=1}^{n} μi² = μ⊤μ/2. We sometimes write v ∼ χ²(n, λ).
Definition 5.3.2 (p.112): If y1, …, yn are independent N(0, 1) random variables, then the probability distribution of v = ∑_{i=1}^{n} yi² = y⊤y is called a chi-square distribution with n degrees of freedom and we can write v ∼ χ²(n).
Theorem 5.3.1 (p.114): If v ∼ χ²(n, λ), then

E(v) = n + 2λ,
var(v) = 2n + 8λ,
Mv(t) = (1 − 2t)^{−n/2} exp(−λ[1 − 1/(1 − 2t)]).
Proof: These statements follow from Theorem 5.2.1, Theorem 5.2.4, and Theorem 5.2.3, respectively. For instance, with A = Σ = I, Theorem 5.2.4 gives

var(v) = 2 tr(I) + 4μ⊤μ = 2n + 4(2λ) = 2n + 8λ.
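The moment formulas can be cross-checked against SciPy's noncentral chi-square, keeping in mind that SciPy's noncentrality argument nc is μ⊤μ = 2λ rather than λ (a sketch, assuming scipy is available; n and λ are my own example values):

```python
from scipy.stats import ncx2

# Cross-check of E(v) = n + 2*lambda and var(v) = 2n + 8*lambda.
n, lam = 7, 1.5
mean, var = ncx2.stats(df=n, nc=2 * lam, moments='mv')
print(mean, var)   # 10.0 and 26.0
```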
∑_{i=1}^{k} vi ∼ χ²(∑_{i=1}^{k} ni, ∑_{i=1}^{k} λi).

By independence and Theorem 5.3.1, the moment generating function of ∑_{i=1}^{k} vi is

∏_{i=1}^{k} Mvi(t) = (1 − 2t)^{−∑_{i=1}^{k} ni/2} exp(−∑_{i=1}^{k} λi [1 − 1/(1 − 2t)]).

This is the moment generating function of a χ²(∑_{i=1}^{k} ni, ∑_{i=1}^{k} λi) distribution so the result holds based on Theorem 4.3.3(a).
numerator, q degrees of freedom in the denominator, and noncentrality parameter λ. We sometimes write z ∼ F(p, q, λ).

Definition 5.4.4 (p.114): If u ∼ χ²(p) and v ∼ χ²(q) are independent random variables, then the probability distribution of w = (u/p)/(v/q) is called an F distribution with p degrees of freedom in the numerator and q degrees of freedom in the denominator.
Answer: Here

P(∑_{i=1}^{4} yi² > ∑_{j=1}^{25} zj²) = P( [(1/4) ∑_{i=1}^{4} (yi/3)²] / [(1/25) ∑_{j=1}^{25} zj²] > 25/36 ),

where ∑_{i=1}^{4} (yi/3)² = (1/9) ∑_{i=1}^{4} yi² ∼ χ²(4, λ = (1/2) ∑_{i=1}^{4} (1/3)² = 2/9) by Theorem 5.3.1 and ∑_{j=1}^{25} zj² ∼ χ²(25) by Theorem 5.3.2 (which are independent since they are functions of independent random vectors). This probability can be computed using the R function pf as follows. The arguments specifying the degrees of freedom are df1 and df2, the noncentrality parameter is specified by ncp (except R's noncentrality parameter is μ⊤Aμ = 2λ), and the option lower.tail=FALSE tells R to compute the probability that the F-ratio is larger than 25/36.
pf(25/36,df1=4,df2=25,ncp=2*2/9,lower.tail=FALSE)
## [1] 0.6503005
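For comparison, the same upper-tail probability can be computed with SciPy's noncentral F distribution (a sketch, assuming scipy is available; as with R's ncp, SciPy's nc argument is 2λ):

```python
from scipy.stats import ncf

# Upper-tail probability of a noncentral F(4, 25, nc = 2*lambda = 4/9)
# evaluated at 25/36, matching the pf call above.
p = ncf.sf(25/36, dfn=4, dfd=25, nc=2 * 2/9)
print(p)   # should agree with R's 0.6503005
```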
We can simulate these sums many times using the rnorm function to verify that our answer looks reasonable.
set.seed(159847)
numberOfSimulations=10000000
leftSum=rep(0,numberOfSimulations)
rightSum=rep(0,numberOfSimulations)
for (i in 1:numberOfSimulations){
y=rnorm(4,mean=1,sd=3)
z=rnorm(25)
leftSum[i]=sum(y^2)
rightSum[i]=sum(z^2)
}
mean(leftSum > rightSum)
## [1] 0.6502779
Proof: Let ω1, …, ωp be the eigenvalues of AΣ. Then the eigenvalues of I − 2tAΣ are 1 − 2tωi for i = 1, …, p. If we choose t small enough so that |2tωi| < 1 for all i, then

1/(1 − 2tωi) = 1 + ∑_{k=1}^{∞} (2t)^k ωi^k

and

(I − 2tAΣ)^{−1} = I + ∑_{k=1}^{∞} (2t)^k (AΣ)^k (see p.50).

Since AΣ is idempotent, Theorem 2.13.2 implies that r of the ω's equal 1 and the other p − r ω's equal 0. So, the moment generating function of y⊤Ay is

My⊤Ay(t) = det(I − 2tAΣ)^{−1/2} exp(−μ⊤(I − (I − 2tAΣ)^{−1})Σ^{−1}μ/2)
= (∏_{i=1}^{p} (1 − 2tωi))^{−1/2} exp(−μ⊤(−∑_{k=1}^{∞} (2t)^k AΣ)Σ^{−1}μ/2)
= ((1 − 2t)^r)^{−1/2} exp(−(μ⊤Aμ/2)(−∑_{k=1}^{∞} (2t)^k))
= (1 − 2t)^{−r/2} exp(−(μ⊤Aμ/2)(1 − 1/(1 − 2t))),

which, by Theorem 5.3.1, is the moment generating function of a χ²(r, μ⊤Aμ/2) distribution.
Example 5.5.1: Suppose that y1, …, yn is a random sample from a N(μ, σ²) distribution. Show that

∑_{i=1}^{n} (yi − ȳ)² / σ² ∼ χ²(n − 1).
Answer: Here y = (y1, …, yn)⊤ ∼ Nn(μj, σ²I) and
∑_{i=1}^{n} (yi − ȳ)² / σ² = y⊤ ((I − (1/n)J)/σ²) y.

By Theorem 5.1.1(a), I − (1/n)J is idempotent, so all of its eigenvalues are either 0 or 1 and its rank equals its trace

tr(I − (1/n)J) = tr(I) − (1/n) tr(J) = n − (1/n)·n = n − 1,

and the noncentrality parameter is

λ = (1/2)(μj)⊤((1/σ²)(I − (1/n)J))(μj)
= (μ²/(2σ²)) j⊤(I − (1/n)J)j
= (μ²/(2σ²)) (j⊤j − (1/n) j⊤jj⊤j)
= (μ²/(2σ²)) (n − (1/n)·n²)
= 0.

So, by Theorem 5.5.1, ∑_{i=1}^{n} (yi − ȳ)²/σ² is a chi-square random variable with rank(I − (1/n)J) = n − 1 degrees of freedom.
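The rank computation in Example 5.5.1 can also be checked numerically: the centering matrix I − (1/n)J is idempotent with n − 1 eigenvalues equal to 1 and one equal to 0 (a Python sketch with n = 6, my own choice):

```python
import numpy as np

# Spectrum of the centering matrix C = I - (1/n)J: since C is
# idempotent, its trace equals its rank, which here is n - 1.
n = 6
C = np.eye(n) - np.ones((n, n)) / n
eig = np.sort(np.linalg.eigvalsh(C))    # ascending: one 0, then n-1 ones
rank = int(round(eig.sum()))            # trace = rank for idempotent C
print(eig, rank)
```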
Compare this with the proof of Theorem L4.1(c) from MATH 667.
Theorem 5.6.2 (p.120): If y ∼ Np(μ, Σ) and A and B are p × p symmetric matrices of constants, then y⊤Ay and y⊤By are independent if and only if AΣB = O.
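A small numerical illustration of Theorem 5.6.2 with Σ = I (the design matrix X below is an arbitrary full-rank matrix of my own choosing): for a projection matrix H, taking A = H and B = I − H gives AΣB = H(I − H) = O, so y⊤Hy and y⊤(I − H)y are independent.

```python
import numpy as np

# H is the projection (hat) matrix onto the column space of X;
# H(I - H) = O is the condition AΣB = O of Theorem 5.6.2 when Σ = I.
rng = np.random.default_rng(7)
X = rng.standard_normal((10, 3))
H = X @ np.linalg.inv(X.T @ X) @ X.T
prod = H @ (np.eye(10) - H)
print(np.max(np.abs(prod)))   # zero up to rounding error
```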
Answer: Note that (1/σ)y ∼ Np((1/σ)μ, I). Since H is idempotent with rank r and I − H is idempotent with rank p − r,

((1/σ)y)⊤H((1/σ)y) = (1/σ²) y⊤Hy ∼ χ²(r, μ⊤Hμ/(2σ²))

and

((1/σ)y)⊤(I − H)((1/σ)y) = (1/σ²) y⊤(I − H)y ∼ χ²(p − r) since μ⊤(I − H)μ = 0.

By Theorem 5.6.2, y⊤Hy and y⊤(I − H)y are independent since H(I − H) = O. So, by Definition 5.4.3, we see that

[y⊤Hy/r] / [y⊤(I − H)y/(p − r)] = [((1/σ²) y⊤Hy)/r] / [((1/σ²) y⊤(I − H)y)/(p − r)] ∼ F(r, p − r, μ⊤Hμ/(2σ²)).
y⊤y = ∑_{i=1}^{k} y⊤Ai y.

Then (1/σ²) y⊤Ai y ∼ χ²(ri, μ⊤Ai μ/(2σ²)) for i = 1, …, k and y⊤A1 y, …, y⊤Ak y are mutually independent if

Ai Aj = O for all i ≠ j

or

n = ∑_{i=1}^{k} ri.