
HSTS215: MULTIVARIATE METHODS

Chapter 3: Sampling Distributions

UNIVERSITY OF ZIMBABWE
May 17, 2021

1 Introduction
Def: A random vector is a vector whose elements are random variables.
Def: The vector $E(\mathbf{X}) = \boldsymbol{\mu} = (\mu_1, \mu_2, \dots, \mu_p)'$ is called the population mean vector. The matrix $E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' = \mathrm{Var}(\mathbf{X}) = \Sigma$ is called the variance-covariance matrix or the dispersion matrix.
Def: Let $\mathbf{X}$ be a $p$-component random vector and $\mathbf{Y}$ be a $q$-component random vector. Then the covariance between $\mathbf{X}$ and $\mathbf{Y}$ is given by
$$\mathrm{Cov}(\mathbf{X}, \mathbf{Y}) = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{Y} - \boldsymbol{\alpha})',$$
where $\boldsymbol{\mu} = E(\mathbf{X})$ and $\boldsymbol{\alpha} = E(\mathbf{Y})$.
The correlation between the $i$th and $j$th variables is given by
$$\rho_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\sigma_{jj}}},$$
and the matrix $\boldsymbol{\rho} = [\rho_{ij}]$ with $\rho_{ii} = 1$, $i = 1, 2, \dots, p$, is called the population correlation matrix.
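The correlation matrix can be obtained from $\Sigma$ by scaling each entry, i.e. $\boldsymbol{\rho} = D^{-1/2}\Sigma D^{-1/2}$ with $D = \mathrm{diag}(\Sigma)$. A minimal sketch in Python/NumPy, using a hypothetical $\Sigma$ (the matrix values are illustrative, not from the notes):

```python
import numpy as np

# rho_ij = sigma_ij / sqrt(sigma_ii * sigma_jj); Sigma is a made-up example.
Sigma = np.array([[4.0,  1.0,  2.0],
                  [1.0,  9.0, -3.0],
                  [2.0, -3.0, 16.0]])

d = np.sqrt(np.diag(Sigma))      # standard deviations sqrt(sigma_ii)
rho = Sigma / np.outer(d, d)     # entrywise sigma_ij / sqrt(sigma_ii sigma_jj)

print(rho)                       # the diagonal entries are exactly 1
```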

2 Useful results
For random vectors $\mathbf{X}$ and $\mathbf{Y}$ with means $\boldsymbol{\mu}$ and $\boldsymbol{\alpha}$, constant matrices $A$ and $B$, and constant vectors $\mathbf{a}$ and $\mathbf{b}$ (all of conformable dimensions), the following hold (a numerical check of results 4 and 5 is sketched after this list):

1. $E(\mathbf{X} + \mathbf{Y}) = E(\mathbf{X}) + E(\mathbf{Y})$ and $E(A\mathbf{X}B) = A\,E(\mathbf{X})\,B$.

2. $\sigma_{ij} = \mathrm{Cov}(X_i, X_j)$ for $i \neq j$, and $\sigma_{ii} = \mathrm{Var}(X_i) = \sigma_i^2$.

3. $\Sigma = E(\mathbf{X}\mathbf{X}') - \boldsymbol{\mu}\boldsymbol{\mu}'$.

4. $\mathrm{Var}(\mathbf{a}'\mathbf{X}) = \mathbf{a}'\,\mathrm{Var}(\mathbf{X})\,\mathbf{a} = \sum_i \sum_j a_i a_j \sigma_{ij}$.

5. $\mathrm{Var}(A\mathbf{X} + \mathbf{b}) = A\,\mathrm{Var}(\mathbf{X})\,A'$, since $\mathrm{Var}(\mathbf{b}) = 0$.

6. $\mathrm{Cov}(\mathbf{X}, \mathbf{X}) = \mathrm{Var}(\mathbf{X})$.

7. $\mathrm{Cov}(\mathbf{X}, \mathbf{Y}) = \mathrm{Cov}(\mathbf{Y}, \mathbf{X})'$.

8. $\mathrm{Cov}(\mathbf{X}_1 + \mathbf{X}_2, \mathbf{Y}) = \mathrm{Cov}(\mathbf{X}_1, \mathbf{Y}) + \mathrm{Cov}(\mathbf{X}_2, \mathbf{Y})$.

9. If $p = q$, $\mathrm{Var}(\mathbf{X} + \mathbf{Y}) = \mathrm{Var}(\mathbf{X}) + \mathrm{Cov}(\mathbf{X}, \mathbf{Y}) + \mathrm{Cov}(\mathbf{Y}, \mathbf{X}) + \mathrm{Var}(\mathbf{Y})$.

10. If $\mathbf{X}$ and $\mathbf{Y}$ are independent, then $\mathrm{Cov}(\mathbf{X}, \mathbf{Y}) = 0$.
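A quick numerical check of results 4 and 5, using arbitrary illustrative values for $\Sigma$, $\mathbf{a}$ and $A$ (none of these come from the notes):

```python
import numpy as np

# Illustrative values; any symmetric positive-definite Sigma and conformable a, A will do.
Sigma = np.array([[4.0, 1.0],
                  [1.0, 9.0]])
a = np.array([2.0, -1.0])
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

# Result 4: Var(a'X) = a' Sigma a = sum_i sum_j a_i a_j sigma_ij
quad_form = a @ Sigma @ a
double_sum = sum(a[i] * a[j] * Sigma[i, j]
                 for i in range(2) for j in range(2))
print(quad_form, double_sum)     # both equal 16 - 4 + 9 = 21

# Result 5: Var(AX + b) = A Sigma A' (the constant shift b drops out)
print(A @ Sigma @ A.T)
```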
Note that:

1. The measurements of the $p$ variables on the same individual will usually be correlated, whilst measurements from different individuals must be independent.

2. Violation of the independence assumption can have a serious impact on the quality of statistical inference.
3 The sample statistics $\bar{\mathbf{x}}$ and $S$ as estimators of the respective population parameters

Let $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_n$ be a random sample from a joint distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$. Then $E(\bar{\mathbf{x}}) = \boldsymbol{\mu} = (\mu_1, \mu_2, \dots, \mu_p)'$ and
$$\mathrm{Cov}(\bar{\mathbf{x}}) = \frac{1}{n}\Sigma \quad \text{(the population covariance divided by the sample size)}.$$
For the covariance matrix $S_n$ (with divisor $n$),
$$E(S_n) = \frac{n-1}{n}\Sigma = \Sigma - \frac{1}{n}\Sigma.$$
Proof
Note that $\bar{\mathbf{x}} = \frac{\mathbf{X}_1 + \mathbf{X}_2 + \cdots + \mathbf{X}_n}{n}$, so
$$\begin{aligned}
E(\bar{\mathbf{x}}) &= E\left(\frac{\mathbf{X}_1 + \mathbf{X}_2 + \cdots + \mathbf{X}_n}{n}\right) \\
&= E(\tfrac{1}{n}\mathbf{X}_1) + E(\tfrac{1}{n}\mathbf{X}_2) + \cdots + E(\tfrac{1}{n}\mathbf{X}_n) \\
&= \tfrac{1}{n}E(\mathbf{X}_1) + \tfrac{1}{n}E(\mathbf{X}_2) + \cdots + \tfrac{1}{n}E(\mathbf{X}_n) \\
&= \tfrac{1}{n}\boldsymbol{\mu}_1 + \tfrac{1}{n}\boldsymbol{\mu}_2 + \cdots + \tfrac{1}{n}\boldsymbol{\mu}_n,
\end{aligned}$$
but since the sample is from a common joint distribution, it implies that $\boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \cdots = \boldsymbol{\mu}_n = \boldsymbol{\mu}$.
$$\therefore E(\bar{\mathbf{x}}) = \boldsymbol{\mu}.$$

$$\begin{aligned}
\mathrm{Cov}(\bar{\mathbf{x}}) &= E\left[(\bar{\mathbf{x}} - E(\bar{\mathbf{x}}))(\bar{\mathbf{x}} - E(\bar{\mathbf{x}}))'\right] \\
&= E\left[(\bar{\mathbf{x}} - \boldsymbol{\mu})(\bar{\mathbf{x}} - \boldsymbol{\mu})'\right] \quad \text{since } E(\bar{\mathbf{x}}) = \boldsymbol{\mu} \\
&= E\left[\left(\tfrac{1}{n}\textstyle\sum_j \mathbf{X}_j - \boldsymbol{\mu}\right)\left(\tfrac{1}{n}\textstyle\sum_j \mathbf{X}_j - \boldsymbol{\mu}\right)'\right] \\
&= E\left[\tfrac{1}{n}\textstyle\sum_j (\mathbf{X}_j - \boldsymbol{\mu})\;\tfrac{1}{n}\textstyle\sum_l (\mathbf{X}_l - \boldsymbol{\mu})'\right] \quad \text{since } \boldsymbol{\mu} = \tfrac{1}{n}\textstyle\sum \boldsymbol{\mu} \\
&= E\left[\tfrac{1}{n^2}\textstyle\sum_{j=1}^n \sum_{l=1}^n (\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_l - \boldsymbol{\mu})'\right] \\
&= \tfrac{1}{n^2}\textstyle\sum_{j=1}^n \sum_{l=1}^n E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_l - \boldsymbol{\mu})'
\end{aligned}$$
(summation and expectation are linear operators, so their order of operation can be interchanged). For $j \neq l$, each term $E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_l - \boldsymbol{\mu})' = 0$, because the covariance between $\mathbf{X}_j$ and $\mathbf{X}_l$ is zero since these are independent.
$$\therefore \mathrm{Cov}(\bar{\mathbf{x}}) = \tfrac{1}{n^2}\textstyle\sum_{j=1}^n E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_j - \boldsymbol{\mu})'.$$
Since $\Sigma = E(\mathbf{X}_j - \boldsymbol{\mu})(\mathbf{X}_j - \boldsymbol{\mu})'$ is the common population covariance matrix for each $\mathbf{X}_j$, we have
$$\mathrm{Cov}(\bar{\mathbf{x}}) = \tfrac{1}{n^2}(\Sigma + \Sigma + \cdots + \Sigma) = \tfrac{1}{n^2}\,n\Sigma = \tfrac{1}{n}\Sigma.$$
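The result $\mathrm{Cov}(\bar{\mathbf{x}}) = \frac{1}{n}\Sigma$ can also be illustrated by simulation. A sketch assuming multivariate normal data (the distribution, $\boldsymbol{\mu}$, $\Sigma$, sample size and replication count are all illustrative choices, not from the notes):

```python
import numpy as np

# Draw many independent samples of size n, compute xbar for each, and compare
# the empirical covariance of the xbar's against Sigma / n.
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 9.0]])
n, reps = 20, 20_000

xbars = np.array([rng.multivariate_normal(mu, Sigma, size=n).mean(axis=0)
                  for _ in range(reps)])

print(np.cov(xbars, rowvar=False))   # approximately Sigma / n
print(Sigma / n)
```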

Question
Show that the sample covariance matrix $S$ is an unbiased estimator of $\Sigma$, i.e. $E(S) = \Sigma$.
Soln
$$\begin{aligned}
E(S) &= E\left[\frac{1}{n-1}\sum_{j=1}^n (\mathbf{X}_j - \bar{\mathbf{x}})(\mathbf{X}_j - \bar{\mathbf{x}})'\right] \\
&= \frac{1}{n-1}E\left[\sum_{j=1}^n \left(\mathbf{X}_j\mathbf{X}_j' - \mathbf{X}_j\bar{\mathbf{x}}' - \bar{\mathbf{x}}\mathbf{X}_j' + \bar{\mathbf{x}}\bar{\mathbf{x}}'\right)\right] \\
&= \frac{1}{n-1}E\left[\sum_{j=1}^n \mathbf{X}_j\mathbf{X}_j' - \sum_{j=1}^n \mathbf{X}_j\bar{\mathbf{x}}' - \sum_{j=1}^n \bar{\mathbf{x}}\mathbf{X}_j' + \sum_{j=1}^n \bar{\mathbf{x}}\bar{\mathbf{x}}'\right],
\end{aligned}$$
but $\sum_{j=1}^n \mathbf{X}_j\bar{\mathbf{x}}' = \left(\sum_{j=1}^n \mathbf{X}_j\right)\bar{\mathbf{x}}'$ and $\frac{\sum_{j=1}^n \mathbf{X}_j}{n} = \bar{\mathbf{x}}$,
$$\therefore \sum_{j=1}^n \bar{\mathbf{x}}\mathbf{X}_j' = \sum_{j=1}^n \mathbf{X}_j\bar{\mathbf{x}}' = \sum_{j=1}^n \bar{\mathbf{x}}\bar{\mathbf{x}}' = n\bar{\mathbf{x}}\bar{\mathbf{x}}',$$
so that
$$E(S) = \frac{1}{n-1}\left[\sum_{j=1}^n E(\mathbf{X}_j\mathbf{X}_j') - nE(\bar{\mathbf{x}}\bar{\mathbf{x}}')\right].$$
Since $\mathrm{Var}(\bar{\mathbf{x}}) = E(\bar{\mathbf{x}}\bar{\mathbf{x}}') - \boldsymbol{\mu}\boldsymbol{\mu}'$,
$$\Rightarrow E(\bar{\mathbf{x}}\bar{\mathbf{x}}') = \mathrm{Var}(\bar{\mathbf{x}}) + \boldsymbol{\mu}\boldsymbol{\mu}' = \tfrac{1}{n}\Sigma + \boldsymbol{\mu}\boldsymbol{\mu}'.$$
Since $\mathrm{Var}(\mathbf{X}_j) = E(\mathbf{X}_j\mathbf{X}_j') - \boldsymbol{\mu}\boldsymbol{\mu}'$,
$$\Rightarrow E(\mathbf{X}_j\mathbf{X}_j') = \mathrm{Var}(\mathbf{X}_j) + \boldsymbol{\mu}\boldsymbol{\mu}' = \Sigma + \boldsymbol{\mu}\boldsymbol{\mu}'.$$
It then follows that
$$\begin{aligned}
E(S) &= \frac{1}{n-1}\left[\sum_{j=1}^n (\Sigma + \boldsymbol{\mu}\boldsymbol{\mu}') - n\left(\tfrac{1}{n}\Sigma + \boldsymbol{\mu}\boldsymbol{\mu}'\right)\right] \\
&= \frac{1}{n-1}\left[n\Sigma + n\boldsymbol{\mu}\boldsymbol{\mu}' - \Sigma - n\boldsymbol{\mu}\boldsymbol{\mu}'\right] \\
&= \frac{1}{n-1}\left[n\Sigma - \Sigma\right] \\
&= \frac{n-1}{n-1}\Sigma \\
&= \Sigma.
\end{aligned}$$
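The contrast between the unbiased $S$ (divisor $n-1$) and the biased $S_n$ (divisor $n$) also shows up clearly in simulation. A sketch under the same illustrative multivariate normal assumption as above:

```python
import numpy as np

# Average S (divisor n-1) and S_n (divisor n) over many samples:
# the first should be near Sigma, the second near ((n-1)/n) Sigma.
rng = np.random.default_rng(1)
mu = np.zeros(2)
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
n, reps = 5, 20_000

S_sum = np.zeros((2, 2))
Sn_sum = np.zeros((2, 2))
for _ in range(reps):
    X = rng.multivariate_normal(mu, Sigma, size=n)    # n rows, p = 2 columns
    S_sum += np.cov(X, rowvar=False)                  # divisor n - 1
    Sn_sum += np.cov(X, rowvar=False, bias=True)      # divisor n

print(S_sum / reps)     # approximately Sigma
print(Sn_sum / reps)    # approximately (4/5) Sigma, since n = 5
```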

4 Matrix approach to computing $\bar{\mathbf{x}}$ and $S$

Suppose the data matrix is given by
$$X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2n} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{p1} & x_{p2} & \cdots & x_{pj} & \cdots & x_{pn}
\end{pmatrix}.$$

Then the mean vector can be computed from
$$\bar{\mathbf{x}} = \frac{1}{n}\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2n} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{p1} & x_{p2} & \cdots & x_{pj} & \cdots & x_{pn}
\end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix},$$
i.e.
$$\bar{\mathbf{x}} = \frac{1}{n}X\mathbf{1},$$
where $\mathbf{1}$ is an $n \times 1$ vector of ones.

Post-multiplying the equation above by $\mathbf{1}'$ gives
$$\bar{\mathbf{x}}\mathbf{1}' = \frac{1}{n}X\mathbf{1}\mathbf{1}' = \begin{pmatrix}
\bar{x}_1 & \bar{x}_1 & \cdots & \bar{x}_1 \\
\bar{x}_2 & \bar{x}_2 & \cdots & \bar{x}_2 \\
\vdots & \vdots & & \vdots \\
\bar{x}_p & \bar{x}_p & \cdots & \bar{x}_p
\end{pmatrix},$$
then
$$X - \frac{1}{n}X\mathbf{1}\mathbf{1}' = \begin{pmatrix}
x_{11} - \bar{x}_1 & x_{12} - \bar{x}_1 & \cdots & x_{1n} - \bar{x}_1 \\
x_{21} - \bar{x}_2 & x_{22} - \bar{x}_2 & \cdots & x_{2n} - \bar{x}_2 \\
\vdots & \vdots & & \vdots \\
x_{p1} - \bar{x}_p & x_{p2} - \bar{x}_p & \cdots & x_{pn} - \bar{x}_p
\end{pmatrix}.$$
By definition,
$$(n-1)S = \left(X - \tfrac{1}{n}X\mathbf{1}\mathbf{1}'\right)\left(X - \tfrac{1}{n}X\mathbf{1}\mathbf{1}'\right)'$$
$$= \begin{pmatrix}
\sum_{i=1}^n (x_{1i} - \bar{x}_1)(x_{1i} - \bar{x}_1) & \sum_{i=1}^n (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2) & \cdots & \sum_{i=1}^n (x_{1i} - \bar{x}_1)(x_{pi} - \bar{x}_p) \\
\sum_{i=1}^n (x_{2i} - \bar{x}_2)(x_{1i} - \bar{x}_1) & \sum_{i=1}^n (x_{2i} - \bar{x}_2)(x_{2i} - \bar{x}_2) & \cdots & \sum_{i=1}^n (x_{2i} - \bar{x}_2)(x_{pi} - \bar{x}_p) \\
\vdots & \vdots & & \vdots \\
\sum_{i=1}^n (x_{pi} - \bar{x}_p)(x_{1i} - \bar{x}_1) & \sum_{i=1}^n (x_{pi} - \bar{x}_p)(x_{2i} - \bar{x}_2) & \cdots & \sum_{i=1}^n (x_{pi} - \bar{x}_p)(x_{pi} - \bar{x}_p)
\end{pmatrix}$$
$$\begin{aligned}
&= \left(X - \tfrac{1}{n}X\mathbf{1}\mathbf{1}'\right)\left(X' - \tfrac{1}{n}\mathbf{1}\mathbf{1}'X'\right) \\
&= X\left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}'\right)\left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}'\right)X' \\
&= X\left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}'\right)X',
\end{aligned}$$
since $\mathbf{1}'\mathbf{1} = n$ makes $I - \tfrac{1}{n}\mathbf{1}\mathbf{1}'$ idempotent.
$$\therefore \bar{\mathbf{x}} = \frac{1}{n}X\mathbf{1} \quad \text{and} \quad S = \frac{1}{n-1}X\left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}'\right)X'.$$
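These matrix formulas translate directly into NumPy. A sketch using the small $2 \times 3$ data matrix from the example in the next section (note that, as in these notes, rows are variables and columns are observations, which matches NumPy's np.cov default):

```python
import numpy as np

# p x n data matrix: rows are variables, columns are observations.
X = np.array([[9.0, 5.0, 1.0],
              [1.0, 3.0, 2.0]])
p, n = X.shape
one = np.ones((n, 1))

xbar = (X @ one) / n                  # xbar = (1/n) X 1
H = np.eye(n) - (one @ one.T) / n     # centering matrix I - (1/n) 1 1'
S = (X @ H @ X.T) / (n - 1)           # S = (1/(n-1)) X (I - (1/n) 1 1') X'

print(xbar.ravel())    # [5. 2.]
print(S)               # [[16. -2.], [-2. 1.]]
print(np.cov(X))       # agrees: np.cov also treats rows as variables
```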
0 1

5 Sample values of linear combinations of variables
Suppose that you are given two linear combinations $\mathbf{b}'\mathbf{X}$ and $\mathbf{c}'\mathbf{X}$, where $\mathbf{b}'\mathbf{X} = b_1X_1 + b_2X_2 + \cdots + b_pX_p$, whose observed value on the $j$th trial is $\mathbf{b}'\mathbf{x}_j = b_1x_{1j} + b_2x_{2j} + \cdots + b_px_{pj}$, $j = 1, 2, \dots, n$.
$$\text{Sample mean} = \frac{\mathbf{b}'\mathbf{x}_1 + \mathbf{b}'\mathbf{x}_2 + \cdots + \mathbf{b}'\mathbf{x}_n}{n} = \mathbf{b}'\,\frac{(\mathbf{x}_1 + \mathbf{x}_2 + \cdots + \mathbf{x}_n)}{n} = \mathbf{b}'\bar{\mathbf{x}}.$$
Since $(\mathbf{b}'\mathbf{x}_j - \mathbf{b}'\bar{\mathbf{x}})^2 = (\mathbf{b}'(\mathbf{x}_j - \bar{\mathbf{x}}))^2 = \mathbf{b}'(\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})'\mathbf{b}$,
$$\begin{aligned}
\text{Sample variance} &= \frac{(\mathbf{b}'(\mathbf{x}_1 - \bar{\mathbf{x}}))^2 + (\mathbf{b}'(\mathbf{x}_2 - \bar{\mathbf{x}}))^2 + \cdots + (\mathbf{b}'(\mathbf{x}_n - \bar{\mathbf{x}}))^2}{n-1} \\
&= \frac{\mathbf{b}'(\mathbf{x}_1 - \bar{\mathbf{x}})(\mathbf{x}_1 - \bar{\mathbf{x}})'\mathbf{b} + \mathbf{b}'(\mathbf{x}_2 - \bar{\mathbf{x}})(\mathbf{x}_2 - \bar{\mathbf{x}})'\mathbf{b} + \cdots + \mathbf{b}'(\mathbf{x}_n - \bar{\mathbf{x}})(\mathbf{x}_n - \bar{\mathbf{x}})'\mathbf{b}}{n-1} \\
&= \mathbf{b}'\,\frac{\left[(\mathbf{x}_1 - \bar{\mathbf{x}})(\mathbf{x}_1 - \bar{\mathbf{x}})' + (\mathbf{x}_2 - \bar{\mathbf{x}})(\mathbf{x}_2 - \bar{\mathbf{x}})' + \cdots + (\mathbf{x}_n - \bar{\mathbf{x}})(\mathbf{x}_n - \bar{\mathbf{x}})'\right]}{n-1}\,\mathbf{b} \\
&= \mathbf{b}'S\mathbf{b}.
\end{aligned}$$

Similarly, the sample mean of $\mathbf{c}'\mathbf{X}$ is $\mathbf{c}'\bar{\mathbf{x}}$ and the sample variance of $\mathbf{c}'\mathbf{X}$ is $\mathbf{c}'S\mathbf{c}$.
The sample covariance of $\mathbf{b}'\mathbf{X}$ and $\mathbf{c}'\mathbf{X}$ is
$$\begin{aligned}
&= \frac{(\mathbf{b}'\mathbf{x}_1 - \mathbf{b}'\bar{\mathbf{x}})(\mathbf{c}'\mathbf{x}_1 - \mathbf{c}'\bar{\mathbf{x}}) + (\mathbf{b}'\mathbf{x}_2 - \mathbf{b}'\bar{\mathbf{x}})(\mathbf{c}'\mathbf{x}_2 - \mathbf{c}'\bar{\mathbf{x}}) + \cdots + (\mathbf{b}'\mathbf{x}_n - \mathbf{b}'\bar{\mathbf{x}})(\mathbf{c}'\mathbf{x}_n - \mathbf{c}'\bar{\mathbf{x}})}{n-1} \\
&= \frac{\mathbf{b}'(\mathbf{x}_1 - \bar{\mathbf{x}})(\mathbf{x}_1 - \bar{\mathbf{x}})'\mathbf{c} + \mathbf{b}'(\mathbf{x}_2 - \bar{\mathbf{x}})(\mathbf{x}_2 - \bar{\mathbf{x}})'\mathbf{c} + \cdots + \mathbf{b}'(\mathbf{x}_n - \bar{\mathbf{x}})(\mathbf{x}_n - \bar{\mathbf{x}})'\mathbf{c}}{n-1} \\
&= \mathbf{b}'\,\frac{\left[(\mathbf{x}_1 - \bar{\mathbf{x}})(\mathbf{x}_1 - \bar{\mathbf{x}})' + (\mathbf{x}_2 - \bar{\mathbf{x}})(\mathbf{x}_2 - \bar{\mathbf{x}})' + \cdots + (\mathbf{x}_n - \bar{\mathbf{x}})(\mathbf{x}_n - \bar{\mathbf{x}})'\right]}{n-1}\,\mathbf{c} \\
&= \mathbf{b}'S\mathbf{c}.
\end{aligned}$$
Similarly, the population mean of $\mathbf{c}'\mathbf{X}$ is $E(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\boldsymbol{\mu}$, the population variance of $\mathbf{c}'\mathbf{X}$ is $\mathbf{c}'\Sigma\mathbf{c}$, and the population covariance of $\mathbf{b}'\mathbf{X}$ and $\mathbf{c}'\mathbf{X}$ is $\mathbf{b}'\Sigma\mathbf{c}$.

Example
Consider the data matrix $A = \begin{pmatrix} 9 & 5 & 1 \\ 1 & 3 & 2 \end{pmatrix}$.
We have $n = 3$ observations on $p = 2$ variables $X_1$ and $X_2$. Form the linear combinations
$$\mathbf{b}'\mathbf{X} = \begin{pmatrix} 2 & 3 \end{pmatrix}\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = 2X_1 + 3X_2$$
$$\mathbf{c}'\mathbf{X} = \begin{pmatrix} -1 & 2 \end{pmatrix}\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = -X_1 + 2X_2$$
Calculate the sample means, variances and covariance of $\mathbf{b}'\mathbf{X}$ and $\mathbf{c}'\mathbf{X}$.
Solution
$$\text{Sample mean of } \mathbf{b}'\mathbf{X} = \mathbf{b}'\bar{\mathbf{x}} = \begin{pmatrix} 2 & 3 \end{pmatrix}\begin{pmatrix} \frac{9+5+1}{3} \\ \frac{1+3+2}{3} \end{pmatrix} = \begin{pmatrix} 2 & 3 \end{pmatrix}\begin{pmatrix} 5 \\ 2 \end{pmatrix} = 10 + 6 = 16$$
$$\text{Sample mean of } \mathbf{c}'\mathbf{X} = \mathbf{c}'\bar{\mathbf{x}} = \begin{pmatrix} -1 & 2 \end{pmatrix}\begin{pmatrix} 5 \\ 2 \end{pmatrix} = -5 + 4 = -1$$
$$S = \begin{pmatrix} 16 & -2 \\ -2 & 1 \end{pmatrix}$$
$$\text{Sample variance of } \mathbf{b}'\mathbf{X} = \mathbf{b}'S\mathbf{b} = \begin{pmatrix} 2 & 3 \end{pmatrix}\begin{pmatrix} 16 & -2 \\ -2 & 1 \end{pmatrix}\begin{pmatrix} 2 \\ 3 \end{pmatrix} = 49$$
$$\text{Sample variance of } \mathbf{c}'\mathbf{X} = \mathbf{c}'S\mathbf{c} = \begin{pmatrix} -1 & 2 \end{pmatrix}\begin{pmatrix} 16 & -2 \\ -2 & 1 \end{pmatrix}\begin{pmatrix} -1 \\ 2 \end{pmatrix} = 28$$
$$\text{Sample covariance of } \mathbf{b}'\mathbf{X} \text{ and } \mathbf{c}'\mathbf{X} = \mathbf{b}'S\mathbf{c} = \begin{pmatrix} 2 & 3 \end{pmatrix}\begin{pmatrix} 16 & -2 \\ -2 & 1 \end{pmatrix}\begin{pmatrix} -1 \\ 2 \end{pmatrix} = -28$$
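As a check, the same numbers fall out of a few lines of NumPy (a sketch; $\bar{\mathbf{x}}$ and $S$ are taken from the solution above):

```python
import numpy as np

xbar = np.array([5.0, 2.0])
S = np.array([[16.0, -2.0],
              [-2.0,  1.0]])
b = np.array([2.0, 3.0])
c = np.array([-1.0, 2.0])

print(b @ xbar)    # 16.0  sample mean of b'X
print(c @ xbar)    # -1.0  sample mean of c'X
print(b @ S @ b)   # 49.0  sample variance of b'X
print(c @ S @ c)   # 28.0  sample variance of c'X
print(b @ S @ c)   # -28.0 sample covariance of b'X and c'X
```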
