Chapter 5: Distribution of Quadratic Forms
Notes for MATH 668 based on Linear Models in Statistics by Alvin C. Rencher and G. Bruce Schaalje, second
edition, Wiley, 2008.
January 30, 2018
ȳ² = ((1/n) j⊤y)((1/n) j⊤y) = (1/n²)(y⊤j)(j⊤y) = (1/n²) y⊤(jj⊤)y = (1/n²) y⊤Jy

and

∑_{i=1}^{n} (yi − ȳ)² = ∑_{i=1}^{n} yi² − nȳ² = y⊤Iy − (1/n) y⊤Jy = y⊤(I − (1/n)J)y.

Note also that

I = (I − (1/n)J) + (1/n)J.
(a) ((1/n)J)((1/n)J) = (1/n²)(JJ) = (1/n²)(nJ) = (1/n)J and

(I − (1/n)J)(I − (1/n)J) = I − (2/n)J + ((1/n)J)((1/n)J) = I − (2/n)J + (1/n)J = I − (1/n)J,

and (b) (I − (1/n)J)((1/n)J) = (1/n)J − ((1/n)J)((1/n)J) = (1/n)J − (1/n)J = O.
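These matrix identities are easy to spot-check numerically. A quick sketch in R (the choice n = 5 is arbitrary):

```r
# Numerical check of (a) and (b): (1/n)J and I - (1/n)J are idempotent,
# and their product is the zero matrix.  The choice n = 5 is arbitrary.
n <- 5
I <- diag(n)
J <- matrix(1, n, n)   # J = j j', the n x n matrix of ones
C <- I - J/n           # the centering matrix

aJ <- max(abs((J/n) %*% (J/n) - J/n))   # (1/n J)(1/n J) = (1/n)J
aC <- max(abs(C %*% C - C))             # (I - 1/n J)^2 = I - 1/n J
b  <- max(abs(C %*% (J/n)))             # (I - 1/n J)(1/n J) = O
c(aJ, aC, b)                            # all numerically zero
```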
Theorem 5.2.1: If y is a random vector with E(y) = μ and cov(y) = Σ, and A is a symmetric matrix of constants, then

E(y⊤Ay) = tr(AΣ) + μ⊤Aμ.
Proof:

E(y⊤Ay) = E(tr(y⊤Ay))
= E(tr(Ayy⊤))
= tr(E(Ayy⊤))
= tr(AE(yy⊤))
= tr(A(Σ + μμ⊤))
= tr(AΣ + Aμμ⊤)
= tr(AΣ) + tr(Aμμ⊤)
= tr(AΣ) + tr(μ⊤Aμ)
= tr(AΣ) + μ⊤Aμ.
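The expectation formula can be checked by Monte Carlo simulation; here is a sketch in R, where A, μ, and Σ are arbitrary illustrative choices:

```r
# Simulate y ~ N(mu, Sigma) and compare the average of y'Ay with
# tr(A Sigma) + mu'A mu.  A, mu, Sigma below are arbitrary choices.
set.seed(1)
p <- 3
A     <- matrix(c(2, 1, 0,  1, 3, 1,  0, 1, 2), p, p)   # symmetric
mu    <- c(1, -1, 2)
Sigma <- diag(p) + 0.5          # compound-symmetric covariance matrix
R     <- chol(Sigma)            # Sigma = t(R) %*% R

N <- 200000
Z <- matrix(rnorm(p * N), p, N)
Y <- mu + t(R) %*% Z            # each column is one draw from N(mu, Sigma)
q <- colSums(Y * (A %*% Y))     # y'Ay for each draw

mean(q)                                             # simulated mean
sum(diag(A %*% Sigma)) + drop(t(mu) %*% A %*% mu)   # tr(A Sigma) + mu'A mu
```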
Theorem 5.2.2 (p.111): If A is an m × n matrix of constants, and x and y are m- and n-dimensional random vectors such that

E ( x ) = ( μx )   and   cov ( x ) = ( Σxx  Σxy )
  ( y )   ( μy )             ( y )   ( Σyx  Σyy ) ,

then

E(x⊤Ay) = tr(AΣyx) + μx⊤Aμy.
Example 5.2.1: Suppose that (x1, y1), …, (xn, yn) is a random sample such that

E ( xi ) = ( μx )   and   cov ( xi ) = ( σx²   σxy )
  ( yi )   ( μy )             ( yi )   ( σxy   σy² ) ,

and let sxy = (1/(n − 1)) ∑_{i=1}^{n}(xi − x̄)(yi − ȳ). Show that E(sxy) = σxy.
Answer: Let x = (x1, …, xn)⊤ and y = (y1, …, yn)⊤. Then

E ( x ) = ( μx j )   and   cov ( x ) = ( σx² I   σxy I )
  ( y )   ( μy j )             ( y )   ( σxy I   σy² I ) .

Since

sxy = (1/(n − 1)) x⊤(I − (1/n)J)y,

Theorem 5.2.2 gives

E(sxy) = (1/(n − 1)) {σxy tr(I − (1/n)J) + μx μy j⊤(I − (1/n)jj⊤)j}
= (1/(n − 1)) {σxy (tr(I) − (1/n)tr(J)) + μx μy (j⊤j − (1/n)j⊤jj⊤j)}
= (1/(n − 1)) {σxy (n − (1/n)n) + μx μy (n − (1/n)n²)}
= (1/(n − 1)) {σxy (n − 1) + 0}
= σxy.
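The matrix form of sxy used above can be verified deterministically in R against the definitional sum and R's built-in cov (the data below are arbitrary):

```r
# Check the matrix form of s_xy against its definition on arbitrary data.
set.seed(2)
n <- 10
x <- rnorm(n, mean =  3)
y <- rnorm(n, mean = -1)
J <- matrix(1, n, n)

sxy_matrix <- drop(t(x) %*% (diag(n) - J/n) %*% y) / (n - 1)
sxy_sum    <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)
c(sxy_matrix, sxy_sum, cov(x, y))   # all three agree
```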
Theorem 5.2.3 (p.108): If A is a p × p matrix of constants and y ∼ Np (μ, Σ) , then the moment generating function of y⊤ Ay is
M_{y⊤Ay}(t) = det(I − 2tAΣ)^{−1/2} e^{−μ⊤(I − (I − 2tAΣ)^{−1})Σ^{−1}μ/2}.
Theorem 5.2.4: If A is a p × p symmetric matrix of constants and y ∼ Np(μ, Σ), then

var(y⊤Ay) = 2tr(AΣAΣ) + 4μ⊤AΣAμ.

Also,

cov(y, y⊤Ay) = 2ΣAμ.
Proof: We have

cov(y, y⊤Ay) = E[(y − μ)(y⊤Ay − tr(AΣ) − μ⊤Aμ)]
= E[(y − μ)((y − μ)⊤A(y − μ) + 2(y − μ)⊤Aμ − tr(AΣ))]
= E[(y − μ)(y − μ)⊤A(y − μ)] + 2E[(y − μ)(y − μ)⊤]Aμ − E(y − μ)tr(AΣ)
= Σ^{1/2}E(zz⊤Bz) + 2ΣAμ − 0,

where z = Σ^{−1/2}(y − μ) ∼ Np(0, I) and B = Σ^{1/2}AΣ^{1/2}. It follows that
E(zz⊤Bz) = E [ (z1, …, zp)⊤ ∑_{i=1}^{p}∑_{j=1}^{p} bij zi zj ],

so the k-th component of E(zz⊤Bz) is

∑_{i=1}^{p}∑_{j=1}^{p} bij E(zk zi zj) = bkk E(zk³),

since E(zk zi zj) = 0 unless i = j = k (the zi are independent with mean zero), and

E(zk³) = ∫_{−∞}^{∞} z³ (1/√(2π)) e^{−z²/2} dz = 0.
Thus, E(Σ^{1/2}zz⊤Σ^{1/2}AΣ^{1/2}z) = 0, which implies that cov(y, y⊤Ay) = 2ΣAμ.
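Both the variance and covariance formulas can be checked by simulation; a sketch in R, with A, μ, and Σ arbitrary illustrative choices:

```r
# Monte Carlo check of var(y'Ay) = 2 tr(A Sigma A Sigma) + 4 mu'A Sigma A mu
# and cov(y, y'Ay) = 2 Sigma A mu.  A, mu, Sigma below are arbitrary choices.
set.seed(3)
p <- 3
A     <- matrix(c(2, 1, 0,  1, 3, 1,  0, 1, 2), p, p)   # symmetric
mu    <- c(1, -1, 2)
Sigma <- diag(p) + 0.5
R     <- chol(Sigma)            # Sigma = t(R) %*% R

N <- 500000
Z <- matrix(rnorm(p * N), p, N)
Y <- mu + t(R) %*% Z            # columns are draws from N(mu, Sigma)
q <- colSums(Y * (A %*% Y))     # y'Ay for each draw

M <- A %*% Sigma
var_theory <- 2 * sum(diag(M %*% M)) + 4 * drop(t(mu) %*% M %*% A %*% mu)
cov_theory <- 2 * drop(Sigma %*% A %*% mu)
cov_sim    <- drop(Y %*% q) / N - rowMeans(Y) * mean(q)

c(var(q), var_theory)        # simulated vs. theoretical variance
cbind(cov_sim, cov_theory)   # simulated vs. theoretical covariance vector
```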
is called a chi-square distribution with n degrees of freedom and we can write v ∼ χ²(n).
Theorem 5.3.1 (p.114): If v ∼ χ²(n, λ), then

E(v) = n + 2λ and var(v) = 2n + 8λ.
Proof: These statements follow from Theorem 5.2.1, Theorem 5.2.4, and Theorem 5.2.3, respectively. For instance, with A = Σ = I ,
Theorem 5.2.4 gives
var(v) = 2tr(I) + 4μ⊤μ = 2n + 4(2λ) = 2n + 8λ.
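These moments can be spot-checked in R. Recall (as noted in R Example 5.4.1 below) that R parameterizes the noncentral chi-square by ncp = 2λ in the notation of these notes; n = 4 and λ = 1.5 are arbitrary choices:

```r
# Simulation check of E(v) = n + 2*lambda and var(v) = 2n + 8*lambda.
# R's rchisq uses ncp = 2*lambda in the notation of these notes.
set.seed(5)
n <- 4; lambda <- 1.5
v <- rchisq(1000000, df = n, ncp = 2 * lambda)
mean(v)   # close to n + 2*lambda = 7
var(v)    # close to 2*n + 8*lambda = 20
```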
Taking the product of the moment generating functions of the independent terms,

M(t) = ∏_{i=1}^{k} (1 − 2t)^{−ni/2} e^{−λi[1−1/(1−2t)]}
= (1 − 2t)^{−∑_{i=1}^{k} ni/2} e^{−∑_{i=1}^{k} λi[1−1/(1−2t)]}.

This is the moment generating function of a χ²(∑_{i=1}^{k} ni, ∑_{i=1}^{k} λi) distribution so the result holds based on Theorem 4.3.3(a).
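The additivity result can be spot-checked in R by matching the first two moments (recall that R's ncp equals 2λ in the notation of these notes; the degrees of freedom and λ values below are arbitrary):

```r
# Simulation check: independent noncentral chi-squares add, with both the
# degrees of freedom and the noncentrality parameters summing.
set.seed(6)
N <- 1000000
v <- rchisq(N, df = 3, ncp = 2 * 0.5) + rchisq(N, df = 5, ncp = 2 * 1.0)
mean(v)   # close to (3 + 5) + 2*(0.5 + 1.0) = 11
var(v)    # close to 2*(3 + 5) + 8*(0.5 + 1.0) = 28
```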
Definition 5.4.2 (p.116): If z ∼ N(μ, 1) and u ∼ χ²(p) are independent random variables, then the probability distribution of t = z/√(u/p) is called a noncentral t distribution with p degrees of freedom and noncentrality parameter μ. We sometimes write t ∼ t(p, μ).
Definition 5.4.3: If u ∼ χ²(p, λ) and v ∼ χ²(q) are independent random variables, then the probability distribution of w = (u/p)/(v/q) is called a noncentral F distribution with p degrees of freedom in the numerator, q degrees of freedom in the denominator, and noncentrality parameter λ. We sometimes write w ∼ F(p, q, λ).

Definition 5.4.4 (p.114): If u ∼ χ²(p) and v ∼ χ²(q) are independent random variables, then the probability distribution of w = (u/p)/(v/q) is called an F distribution with p degrees of freedom in the numerator and q degrees of freedom in the denominator and we write w ∼ F(p, q).
R Example 5.4.1: Suppose that y1, …, y4 are iid N(1, 9) random variables, z1, …, z25 are iid N(0, 1) random variables, and the yi are independent of the zj. Find P(∑_{i=1}^{4} yi² > ∑_{j=1}^{25} zj²).

Answer: Here

P (∑_{i=1}^{4} yi² > ∑_{j=1}^{25} zj²) = P ( [(1/4) ∑_{i=1}^{4} (yi/3)²] / [(1/25) ∑_{j=1}^{25} zj²] > 25/36 ),
where ∑_{i=1}^{4}(yi/3)² = (1/9)∑_{i=1}^{4} yi² ∼ χ²(4, λ = (1/2)∑_{i=1}^{4}(1/3)² = 2/9) by Theorem 5.3.1 and ∑_{j=1}^{25} zj² ∼ χ²(25) by Theorem 5.3.2 (which are independent since they are functions of independent random vectors). This probability can be computed using the R function pf as follows. The arguments specifying the degrees of freedom are df1 and df2, the noncentrality parameter is specified by ncp (except R's noncentrality parameter is μ⊤Aμ = 2λ), and the option lower.tail=FALSE tells R to compute the probability that the F-ratio is larger than 25/36.
pf(25/36,df1=4,df2=25,ncp=2*2/9,lower.tail=FALSE)
## [1] 0.6503005
We can simulate these sums many times using the rnorm function to verify that our answer looks reasonable.
set.seed(159847)
numberOfSimulations=10000000
leftSum=rep(0,numberOfSimulations)
rightSum=rep(0,numberOfSimulations)
for (i in 1:numberOfSimulations){
y=rnorm(4,mean=1,sd=3)
z=rnorm(25)
leftSum[i]=sum(y^2)
rightSum[i]=sum(z^2)
}
mean(leftSum > rightSum)
## [1] 0.6502779
Theorem 5.5.1: If y ∼ Np(μ, Σ) and A is a symmetric matrix of constants with rank r, then y⊤Ay ∼ χ²(r, λ) with λ = μ⊤Aμ/2 if and only if AΣ is idempotent.
Proof: Let ω1 , … , ωp be the eigenvalues of AΣ . Then the eigenvalues of I − 2tAΣ are 1 − 2tωi for i = 1, … , p . If we choose t
small enough so that |2tωi | < 1 for all i , then
1/(1 − 2tωi) = 1 + ∑_{k=1}^{∞} (2t)^k ωi^k
and

(I − 2tAΣ)^{−1} = I + ∑_{k=1}^{∞} (2t)^k (AΣ)^k (see p.50).
Since AΣ is idempotent, Theorem 2.13.2 implies that r of the ω’s equal 1 and the other p − r ω ’s equal 0. So, the moment generating
function of y⊤ Ay is
M_{y⊤Ay}(t) = det(I − 2tAΣ)^{−1/2} e^{−μ⊤(I − (I − 2tAΣ)^{−1})Σ^{−1}μ/2}
= (∏_{i=1}^{p} (1 − 2tωi))^{−1/2} e^{−μ⊤(−∑_{k=1}^{∞}(2t)^k AΣ)Σ^{−1}μ/2}
= ((1 − 2t)^r)^{−1/2} e^{−(μ⊤Aμ/2)(−∑_{k=1}^{∞}(2t)^k)}
= (1 − 2t)^{−r/2} e^{−(μ⊤Aμ/2)(1 − 1/(1 − 2t))},

which is the moment generating function of a χ²(r, μ⊤Aμ/2) distribution.
Example: Suppose y1, …, yn are iid N(μ, σ²), so that y ∼ Nn(μj, σ²I). Then

∑_{i=1}^{n}(yi − ȳ)² / σ² = y⊤ ((1/σ²)(I − (1/n)J)) y.
By Theorem 5.1.1(a), I − (1/n)J is idempotent, so all of its eigenvalues are either 0 or 1 and its rank equals the number of eigenvalues which equal 1. The trace of I − (1/n)J is

tr(I − (1/n)J) = tr(I) − (1/n)tr(J) = n − (1/n)n = n − 1,

so rank(I − (1/n)J) = n − 1. The noncentrality parameter is
n
1 ′
1 1
λ = (μj) ( (I − J)) (μj)
2
2 σ n
2
μ 1
⊤
= j (I − J)j
2 n
2
μ 1
⊤ ⊤ ⊤
= (j j − j jj j)
2 n
2
μ 1 2
= (n − n )
2 n
= 0.
So, by Theorem 5.5.1, ∑_{i=1}^{n}(yi − ȳ)²/σ² is a chi-square random variable with rank(I − (1/n)J) = n − 1 degrees of freedom.
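This classical fact can be spot-checked by simulation in R (n, μ, and σ below are arbitrary choices):

```r
# Simulation check that sum((y_i - ybar)^2)/sigma^2 ~ chi-square(n - 1).
set.seed(7)
n <- 6; mu <- 2; sigma <- 3
w <- replicate(100000, {
  y <- rnorm(n, mean = mu, sd = sigma)
  sum((y - mean(y))^2) / sigma^2
})
mean(w)   # close to n - 1 = 5
var(w)    # close to 2*(n - 1) = 10
```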
Compare this with the proof of Theorem L4.1(c) from MATH 667.
Suppose y ∼ Np(μ, σ²I), so that (1/σ)y ∼ Np((1/σ)μ, I). Since H is idempotent with rank r and I − H is idempotent with rank p − r, Theorem 5.5.1 implies that

((1/σ)y)⊤ H ((1/σ)y) = (1/σ²) y⊤Hy ∼ χ²(r, (1/(2σ²)) μ⊤Hμ)

and

((1/σ)y)⊤ (I − H) ((1/σ)y) = (1/σ²) y⊤(I − H)y ∼ χ²(p − r) since μ⊤(I − H)μ = 0.
By Theorem 5.6.2, y⊤ Hy and y⊤ (I − H)y are independent since H(I − H) = O . So, by Definition 5.4.3, we see that
(y⊤Hy/r) / (y⊤(I − H)y/(p − r)) = (((1/σ²) y⊤Hy)/r) / (((1/σ²) y⊤(I − H)y)/(p − r)) ∼ F (r, p − r, (1/(2σ²)) μ⊤Hμ).
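As a concrete instance, take H = (1/n)J with y ∼ Nn(μj, σ²I): then r = rank(H) = 1, y⊤Hy = nȳ², y⊤(I − H)y = ∑(yi − ȳ)², and μ⊤Hμ = nμ², so R's ncp is nμ²/σ². A quick R simulation (n, μ, σ arbitrary):

```r
# F-ratio check with H = (1/n)J: the ratio below should follow a
# noncentral F(1, n-1) distribution with R ncp = n*mu^2/sigma^2.
set.seed(8)
n <- 8; mu <- 0.7; sigma <- 2
f <- replicate(100000, {
  y <- rnorm(n, mean = mu, sd = sigma)
  (n * mean(y)^2 / 1) / (sum((y - mean(y))^2) / (n - 1))
})
mean(f > 1)   # simulated tail probability
pf(1, df1 = 1, df2 = n - 1, ncp = n * mu^2 / sigma^2, lower.tail = FALSE)
```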
y⊤y = ∑_{i=1}^{k} y⊤Ai y.

Then (1/σ²) y⊤Ai y ∼ χ²(ri, (1/(2σ²)) μ⊤Ai μ) for i = 1, …, k, and y⊤A1y, …, y⊤Aky are mutually independent, if and only if at least one of the following holds:

Ai Aj = O for all i ≠ j, or

n = ∑_{i=1}^{k} ri.