
Analysis of Variance and Design of Experiments
Results from Matrix Theory and Random Variables

Lecture 2
Random Vectors and Linear Estimation

Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur

Slides can be downloaded from http://home.iitk.ac.in/~shalab/sp1
Random vectors:
Let $Y_1, Y_2, \ldots, Y_n$ be $n$ random variables; then $Y = (Y_1, Y_2, \ldots, Y_n)'$ is
called a random vector.

• The mean vector of $Y$ is
  $E(Y) = (E(Y_1), E(Y_2), \ldots, E(Y_n))'.$

• The covariance matrix or dispersion matrix of $Y$ is

  $Var(Y) = \begin{pmatrix} Var(Y_1) & Cov(Y_1, Y_2) & \cdots & Cov(Y_1, Y_n) \\ Cov(Y_2, Y_1) & Var(Y_2) & \cdots & Cov(Y_2, Y_n) \\ \vdots & \vdots & \ddots & \vdots \\ Cov(Y_n, Y_1) & Cov(Y_n, Y_2) & \cdots & Var(Y_n) \end{pmatrix}$

  which is a symmetric matrix.


Random vectors:
• If $Y_1, Y_2, \ldots, Y_n$ are independently distributed, then the
  covariance matrix is a diagonal matrix.

• If, in addition, $Var(Y_i) = \sigma^2$ for all $i = 1, 2, \ldots, n$, then $Var(Y) = \sigma^2 I_n$.

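As a numerical illustration (a minimal NumPy sketch; the dimensions and standard deviations are arbitrary choices), the sample covariance matrix of a vector with independent components is close to diagonal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 10,000 draws of a 3-dimensional random vector with
# independent components; the true covariance matrix is diagonal.
n_draws, n = 10_000, 3
sigma = np.array([1.0, 2.0, 0.5])          # component standard deviations
Y = rng.standard_normal((n_draws, n)) * sigma

print(Y.mean(axis=0))                      # estimate of E(Y), near 0
print(np.cov(Y, rowvar=False).round(2))    # near diag(1, 4, 0.25)
```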
Linear function of random variables:
If $Y_1, Y_2, \ldots, Y_n$ are $n$ random variables and $k_1, k_2, \ldots, k_n$ are scalars,
then $\sum_{i=1}^{n} k_i Y_i$ is called a linear function of the random variables
$Y_1, Y_2, \ldots, Y_n$.

If $Y = (Y_1, Y_2, \ldots, Y_n)'$ and $K = (k_1, k_2, \ldots, k_n)'$, then $K'Y = \sum_{i=1}^{n} k_i Y_i$,

• the mean of $K'Y$ is $E(K'Y) = K'E(Y) = \sum_{i=1}^{n} k_i E(Y_i)$, and

• the variance of $K'Y$ is $Var(K'Y) = K' Var(Y) K$.

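A quick simulation check of these two identities (a sketch; the chosen $\mu$, $Var(Y)$ and $K$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# A correlated 2-dimensional random vector Y and a coefficient vector K.
mu = np.array([1.0, 2.0])
V = np.array([[2.0, 0.6],
              [0.6, 1.0]])                 # Var(Y)
K = np.array([3.0, -1.0])

Y = rng.multivariate_normal(mu, V, size=100_000)
KY = Y @ K                                 # realisations of K'Y

print(KY.mean(), K @ mu)                   # E(K'Y) vs K'E(Y)
print(KY.var(), K @ V @ K)                 # Var(K'Y) vs K'Var(Y)K
```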
Multivariate normal distribution:
A random vector $Y = (Y_1, Y_2, \ldots, Y_n)'$ has a multivariate normal
distribution with mean vector $\mu = (\mu_1, \mu_2, \ldots, \mu_n)'$ and dispersion
matrix $\Sigma$ if its probability density function is

$f(Y \mid \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} (Y - \mu)' \Sigma^{-1} (Y - \mu) \right]$

assuming $\Sigma$ is a nonsingular matrix.

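The density formula can be checked against SciPy's multivariate normal (a sketch; the chosen $\mu$, $\Sigma$ and evaluation point are arbitrary):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
y = np.array([0.5, 0.5])

# Density from the formula on this slide...
n = len(mu)
d = y - mu
manual = np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)) / \
         ((2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma)))

# ...and from SciPy; the two should agree.
print(manual, multivariate_normal.pdf(y, mean=mu, cov=Sigma))
```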
Chi‐square distribution:
• If $Y_1, Y_2, \ldots, Y_k$ are independently distributed random variables
  following the normal distribution with common mean 0 and
  common variance 1, then the distribution of $\sum_{i=1}^{k} Y_i^2$ is called
  the $\chi^2$-distribution with $k$ degrees of freedom.

• The probability density function of the $\chi^2$-distribution with $k$
  degrees of freedom is given as

  $f_{\chi^2}(x) = \frac{1}{\Gamma(k/2)\, 2^{k/2}}\, x^{\frac{k}{2} - 1} \exp\left( -\frac{x}{2} \right); \quad 0 \le x < \infty.$

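A simulation sketch (the choice $k = 5$ is arbitrary): the sum of $k$ squared standard normals matches the $\chi^2_k$ distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
k = 5

# Sum of squares of k independent N(0, 1) variables.
U = (rng.standard_normal((200_000, k)) ** 2).sum(axis=1)

# The empirical cdf should match the chi-square(k) cdf.
print(np.mean(U <= 4.0), stats.chi2.cdf(4.0, df=k))
```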
Chi‐square distribution:
• If $Y_1, Y_2, \ldots, Y_k$ are independently distributed following the
  normal distribution with common mean 0 and common
  variance $\sigma^2$, then $\frac{1}{\sigma^2} \sum_{i=1}^{k} Y_i^2$ has a $\chi^2$-distribution with $k$ degrees
  of freedom.

• If the random variables $Y_1, Y_2, \ldots, Y_k$ are normally distributed
  with non-null means $\mu_1, \mu_2, \ldots, \mu_k$ but common variance 1,
  then the distribution of $\sum_{i=1}^{k} Y_i^2$ is the noncentral $\chi^2$-distribution
  with $k$ degrees of freedom and noncentrality parameter
  $\lambda = \sum_{i=1}^{k} \mu_i^2.$
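A sketch of the noncentral case (the means are arbitrary): squaring shifted normals and summing reproduces the noncentral $\chi^2$ cdf:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mu = np.array([1.0, -0.5, 2.0])           # non-null means
k, lam = len(mu), np.sum(mu ** 2)         # noncentrality = sum of mu_i^2

U = ((rng.standard_normal((200_000, k)) + mu) ** 2).sum(axis=1)

# Empirical cdf vs the noncentral chi-square cdf with k degrees of
# freedom and noncentrality parameter lam.
print(np.mean(U <= 6.0), stats.ncx2.cdf(6.0, df=k, nc=lam))
```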
Chi‐square distribution:
• If $Y_1, Y_2, \ldots, Y_k$ are independently distributed following the
  normal distribution with means $\mu_1, \mu_2, \ldots, \mu_k$ but common
  variance $\sigma^2$, then $\frac{1}{\sigma^2} \sum_{i=1}^{k} Y_i^2$ has a noncentral $\chi^2$-distribution
  with $k$ degrees of freedom and noncentrality parameter
  $\lambda = \frac{1}{\sigma^2} \sum_{i=1}^{k} \mu_i^2.$

• If $U$ has a Chi-square distribution with $k$ degrees of freedom,
  then $E(U) = k$ and $Var(U) = 2k.$

Chi‐square distribution:
• If $U$ has a noncentral Chi-square distribution with $k$ degrees
  of freedom and noncentrality parameter $\lambda$, then
  $E(U) = k + \lambda$ and $Var(U) = 2k + 4\lambda.$

• If $U_1, U_2, \ldots, U_k$ are independently distributed random
  variables with each $U_i$ having a noncentral Chi-square
  distribution with $n_i$ degrees of freedom and noncentrality
  parameter $\lambda_i$, $i = 1, 2, \ldots, k$, then $\sum_{i=1}^{k} U_i$ has a noncentral
  Chi-square distribution with $\sum_{i=1}^{k} n_i$ degrees of freedom and
  noncentrality parameter $\sum_{i=1}^{k} \lambda_i.$
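These moment formulas can be verified directly (a sketch with arbitrary $k$ and $\lambda$, using SciPy's noncentral chi-square):

```python
from scipy import stats

k, lam = 4, 2.5
mean, var = stats.ncx2.stats(df=k, nc=lam, moments='mv')
print(mean, k + lam)         # E(U) = k + lambda
print(var, 2 * k + 4 * lam)  # Var(U) = 2k + 4*lambda
```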
Chi‐square distribution:
• Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution
  with mean vector $\mu$ and positive definite covariance matrix $\Sigma$.
  Then $X'AX$ is distributed as noncentral $\chi^2$ with $k$ degrees
  of freedom and noncentrality parameter $\mu'A\mu$ if and only if
  $A\Sigma$ is an idempotent matrix of rank $k$.

Chi‐square distribution:
Suppose the two quadratic forms
• $X'A_1X$ is distributed as $\chi^2$ with $n_1$ degrees of freedom
  and noncentrality parameter $\mu'A_1\mu$, and
• $X'A_2X$ is distributed as $\chi^2$ with $n_2$ degrees of freedom
  and noncentrality parameter $\mu'A_2\mu$.
Then $X'A_1X$ and $X'A_2X$ are independently distributed if
$A_1 \Sigma A_2 = 0.$

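A simulation sketch of both results for the special case $\Sigma = I$ and a constant mean vector (arbitrary choices): the projection matrices $A_1 = J/n$ (onto the ones vector) and $A_2 = I - J/n$ are idempotent with $A_1 A_2 = 0$, so $X'A_1X$ is noncentral $\chi^2$ and the two forms are independent:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 4
mu = np.full(n, 1.5)

# Projection matrices: onto the span of the ones vector, and onto its
# orthogonal complement.  Both are idempotent and A1 @ A2 = 0.
J = np.ones((n, n)) / n
A1, A2 = J, np.eye(n) - J

X = rng.standard_normal((200_000, n)) + mu
Q1 = np.einsum('ij,jk,ik->i', X, A1, X)    # X'A1X, rank(A1) = 1
Q2 = np.einsum('ij,jk,ik->i', X, A2, X)    # X'A2X, rank(A2) = n - 1

print(np.mean(Q1 <= 10.0), stats.ncx2.cdf(10.0, df=1, nc=mu @ A1 @ mu))
print(np.corrcoef(Q1, Q2)[0, 1])           # near 0: independent forms
```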
t-distribution:
• If
  $X$ has a normal distribution with mean 0 and variance 1,
  $Y$ has a $\chi^2$-distribution with $n$ degrees of freedom, and
  $X$ and $Y$ are independent random variables,

  then the distribution of the statistic $T = \frac{X}{\sqrt{Y/n}}$ is called the
  t-distribution with $n$ degrees of freedom.

The probability density function of $T$ is

$f_T(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\, \Gamma\left(\frac{n}{2}\right)} \left( 1 + \frac{t^2}{n} \right)^{-\frac{n+1}{2}}; \quad -\infty < t < \infty.$
t-distribution:
• If the mean of $X$ is $\delta \neq 0$, then the distribution of $\frac{X}{\sqrt{Y/n}}$
  is called the noncentral t-distribution with $n$ degrees of
  freedom and noncentrality parameter $\delta$.
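A sketch constructing $T$ from its definition and comparing with the $t_n$ cdf (the choice of $n$ is arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 8

Z = rng.standard_normal(200_000)                              # N(0, 1)
Y = stats.chi2.rvs(df=n, size=200_000, random_state=rng)      # chi^2_n
T = Z / np.sqrt(Y / n)                    # the statistic on this slide

print(np.mean(T <= 1.0), stats.t.cdf(1.0, df=n))
```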
F-distribution:

• If $X$ and $Y$ are independent random variables with
  $\chi^2$-distributions with $m$ and $n$ degrees of freedom respectively,
  then the distribution of the statistic $F = \frac{X/m}{Y/n}$ is called
  the F-distribution with $m$ and $n$ degrees of freedom. The
  probability density function of $F$ is

$f_F(f) = \frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right) \Gamma\left(\frac{n}{2}\right)} \left(\frac{m}{n}\right)^{m/2} f^{\frac{m}{2} - 1} \left( 1 + \frac{m}{n} f \right)^{-\frac{m+n}{2}}; \quad 0 < f < \infty.$
F-distribution:
• If $X$ has a noncentral Chi-square distribution with $m$ degrees
  of freedom and noncentrality parameter $\lambda$, $Y$ has a
  $\chi^2$-distribution with $n$ degrees of freedom, and $X$ and $Y$ are
  independent random variables, then the distribution of
  $F = \frac{X/m}{Y/n}$ is the noncentral F-distribution with $m$ and $n$
  degrees of freedom and noncentrality parameter $\lambda$.

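A sketch of the noncentral case built from its definition (arbitrary $m$, $n$ and $\lambda$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
m, n, lam = 3, 10, 2.0

# Build F from a noncentral chi-square and an independent chi-square.
X = stats.ncx2.rvs(df=m, nc=lam, size=200_000, random_state=rng)
Y = stats.chi2.rvs(df=n, size=200_000, random_state=rng)
F = (X / m) / (Y / n)

# Compare with SciPy's noncentral F distribution.
print(np.mean(F <= 2.0), stats.ncf.cdf(2.0, dfn=m, dfd=n, nc=lam))
```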
Linear model:
Suppose there are $n$ observations. In the linear model, we assume
that these observations are the values taken by $n$ random
variables $Y_1, Y_2, \ldots, Y_n$ satisfying the following conditions:

1. $E(Y_i)$ is a linear combination of $p$ unknown parameters $\beta_1, \beta_2, \ldots, \beta_p$:
   $E(Y_i) = x_{i1}\beta_1 + x_{i2}\beta_2 + \ldots + x_{ip}\beta_p, \quad i = 1, 2, \ldots, n$
   where the $x_{ij}$'s are known constants.

2. $Y_1, Y_2, \ldots, Y_n$ are uncorrelated and normally distributed with
   variance $Var(Y_i) = \sigma^2$.

Linear model:
The linear model can be rewritten by introducing independent
normal random variables following $N(0, \sigma^2)$, as
$Y_i = x_{i1}\beta_1 + x_{i2}\beta_2 + \ldots + x_{ip}\beta_p + \varepsilon_i, \quad i = 1, 2, \ldots, n.$

These equations can be written using matrix notation as
$Y = X\beta + \varepsilon$
where
• $Y$ is an $n \times 1$ vector of observations,
• $X$ is an $n \times p$ matrix of $n$ observations on each of the
  variables $X_1, X_2, \ldots, X_p$,
• $\beta$ is a $p \times 1$ vector of parameters, and
• $\varepsilon$ is an $n \times 1$ vector of random error components with
  $\varepsilon \sim N(0, \sigma^2 I_n).$
Linear model:
Here
• $Y$ is called the study or dependent variable,
• $X_1, X_2, \ldots, X_p$ are called explanatory or independent variables, and
• $\beta_1, \beta_2, \ldots, \beta_p$ are called the regression coefficients.

Alternatively, since $Y \sim N(X\beta, \sigma^2 I)$, the linear model can also be
expressed in expectation form as a normal random variable $Y$ with
$E(Y) = X\beta,$
$Var(Y) = \sigma^2 I.$

Note that $\beta$ and $\sigma^2$ are unknown but $X$ is known.
Estimable functions:
A linear parametric function $\lambda'\beta$ of the parameters is said to be
an estimable parametric function, or estimable, if there exists a
linear function $\ell'Y$ of $Y = (Y_1, Y_2, \ldots, Y_n)'$
such that
$E(\ell'Y) = \lambda'\beta$
with $\ell = (\ell_1, \ell_2, \ldots, \ell_n)'$ and $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_p)'$ being vectors of
known scalars.

Best linear unbiased estimates (BLUE):
The minimum variance unbiased linear estimate $\ell'Y$ of an
estimable function $\lambda'\beta$ is called the best linear unbiased
estimate (BLUE) of $\lambda'\beta$.

• Suppose $\ell_1'Y$ and $\ell_2'Y$ are the BLUEs of $\lambda_1'\beta$ and $\lambda_2'\beta$
  respectively.
  Then $(a_1\ell_1 + a_2\ell_2)'Y$ is the BLUE of $(a_1\lambda_1 + a_2\lambda_2)'\beta$.

• If $\lambda'\beta$ is estimable, its best estimate is $\lambda'\hat{\beta}$ where $\hat{\beta}$
  is any solution of the normal equations $X'X\beta = X'Y$.

Least squares estimation:
The least-squares estimate of $\beta$ in $Y = X\beta + \varepsilon$ is the value
of $\beta$ which minimizes the error sum of squares $\varepsilon'\varepsilon$.

Let $S = \varepsilon'\varepsilon = (Y - X\beta)'(Y - X\beta)$
      $= Y'Y - 2\beta'X'Y + \beta'X'X\beta.$

Minimizing $S$ with respect to $\beta$ involves
$\frac{\partial S}{\partial \beta} = 0$
$\Rightarrow X'X\beta = X'Y$
which is termed the normal equation.

This normal equation has a unique solution given by
$\hat{\beta} = (X'X)^{-1} X'Y$
assuming rank$(X) = p$.
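A minimal NumPy sketch (simulated data with an arbitrary true $\beta$) solving the normal equation:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 100, 3
beta = np.array([2.0, -1.0, 0.5])

X = rng.standard_normal((n, p))            # full column rank (rank p)
Y = X @ beta + rng.normal(scale=0.3, size=n)

# Solve the normal equations X'X beta = X'Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)                            # close to the true beta

# np.linalg.lstsq gives the same answer and is numerically safer.
print(np.linalg.lstsq(X, Y, rcond=None)[0])
```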
Least squares estimation:
Note that $\frac{\partial^2 S}{\partial \beta \, \partial \beta'} = 2X'X$ is a positive definite matrix.

So $\hat{\beta} = (X'X)^{-1} X'Y$ is the value of $\beta$ which minimizes $S$ and
is termed the ordinary least squares estimator of $\beta$.

• In this case, $\beta_1, \beta_2, \ldots, \beta_p$ are estimable and consequently all
  linear parametric functions are estimable.

• $E(\hat{\beta}) = (X'X)^{-1} X' E(Y) = (X'X)^{-1} X'X\beta = \beta$

• $Var(\hat{\beta}) = (X'X)^{-1} X' Var(Y) X (X'X)^{-1} = \sigma^2 (X'X)^{-1}$


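The unbiasedness and covariance formulas can be checked by redrawing the errors many times for a fixed design (a simulation sketch with arbitrary settings):

```python
import numpy as np

rng = np.random.default_rng(8)
n, p, sigma = 50, 2, 0.5
beta = np.array([1.0, -2.0])
X = rng.standard_normal((n, p))
XtX_inv = np.linalg.inv(X.T @ X)

# Repeatedly redraw the errors for the fixed design matrix X and
# recompute the OLS estimate each time.
B = np.array([np.linalg.solve(X.T @ X, X.T @ (X @ beta +
              rng.normal(scale=sigma, size=n))) for _ in range(20_000)])

print(B.mean(axis=0))                      # near beta (unbiasedness)
print(np.cov(B, rowvar=False))             # near sigma^2 (X'X)^{-1}
print(sigma ** 2 * XtX_inv)
```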
Least squares estimation:
• If $\lambda'\hat{\beta}$ and $\mu'\hat{\beta}$ are the estimates of $\lambda'\beta$ and $\mu'\beta$
  respectively, then

  – $Var(\lambda'\hat{\beta}) = \lambda' Var(\hat{\beta}) \lambda = \sigma^2 [\lambda'(X'X)^{-1}\lambda]$

  – $Cov(\lambda'\hat{\beta}, \mu'\hat{\beta}) = \sigma^2 [\lambda'(X'X)^{-1}\mu].$

• $Y - X\hat{\beta}$ is called the residual vector.

• $E(Y - X\hat{\beta}) = 0.$

Linear model with correlated observations:
In the linear model $Y = X\beta + \varepsilon$ with $E(\varepsilon) = 0$, $Var(\varepsilon) = \sigma^2 \Sigma$ and
$\varepsilon$ normally distributed, we find
$E(Y) = X\beta, \quad Var(Y) = \sigma^2 \Sigma.$

Assuming $\Sigma$ to be positive definite, we can write
$\Sigma^{-1} = P'P$
where $P$ is a nonsingular matrix. Premultiplying $Y = X\beta + \varepsilon$
by $P$, we get $PY = PX\beta + P\varepsilon$,
or $Y^* = X^*\beta + \varepsilon^*$
where $Y^* = PY$, $X^* = PX$ and $\varepsilon^* = P\varepsilon$.

Note that in this model $E(\varepsilon^*) = 0$ and $Var(\varepsilon^*) = \sigma^2 I$.
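A sketch of the transformation (with $\sigma^2 = 1$ and an arbitrary positive definite $\Sigma$), using a Cholesky factor of $\Sigma^{-1}$ to obtain $P$:

```python
import numpy as np

rng = np.random.default_rng(9)

# An arbitrary positive definite Sigma for 3 correlated errors.
Sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 1.0, 0.5],
                  [0.2, 0.5, 1.0]])

# Factor Sigma^{-1} = P'P: take L lower triangular with L L' = Sigma^{-1},
# then P = L' satisfies P'P = Sigma^{-1}.
L = np.linalg.cholesky(np.linalg.inv(Sigma))
P = L.T

# The transformed errors P @ eps have identity covariance.
eps = rng.multivariate_normal(np.zeros(3), Sigma, size=200_000)
print(np.cov(eps @ P.T, rowvar=False).round(3))   # near the identity
```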
Distribution of $\ell'Y$:
In the linear model $Y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I)$, consider a linear
function $\ell'Y$, which is normally distributed with
$E(\ell'Y) = \ell'X\beta,$
$Var(\ell'Y) = \sigma^2 (\ell'\ell).$

Then
$\frac{\ell'Y}{\sigma \sqrt{\ell'\ell}} \sim N\left( \frac{\ell'X\beta}{\sigma \sqrt{\ell'\ell}},\, 1 \right).$

Further, $\frac{(\ell'Y)^2}{\sigma^2\, \ell'\ell}$ has a noncentral Chi-square distribution with
one degree of freedom and noncentrality parameter $\frac{(\ell'X\beta)^2}{\sigma^2\, \ell'\ell}.$
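A simulation sketch of the noncentral $\chi^2$ claim (arbitrary $\ell$, $X\beta$ and $\sigma$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
n, sigma = 5, 2.0
Xb = np.array([1.0, 0.5, -0.2, 0.0, 1.5])      # plays the role of X @ beta
l = np.array([1.0, -1.0, 2.0, 0.5, 0.0])

Y = Xb + rng.normal(scale=sigma, size=(200_000, n))
U = (Y @ l) ** 2 / (sigma ** 2 * (l @ l))      # (l'Y)^2 / (sigma^2 l'l)

lam = (l @ Xb) ** 2 / (sigma ** 2 * (l @ l))   # noncentrality parameter
print(np.mean(U <= 1.0), stats.ncx2.cdf(1.0, df=1, nc=lam))
```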
Degrees of freedom:
A linear function $\ell'Y$ of the observations ($\ell \neq 0$) is said to carry
one degree of freedom.

A set of linear functions $L'Y$, where $L$ is an $r \times n$ matrix, is said to
have $M$ degrees of freedom if there exist $M$ linearly independent
functions in the set and no more.

Alternatively, the degrees of freedom carried by the set $L'Y$
equals rank$(L)$.

When the set $L'Y$ are the estimates of $\Lambda'\beta$, the degrees of
freedom of the set will also be called the degrees of freedom
for the estimates of $\Lambda'\beta$.
Sum of squares:
If $\ell'Y$ is a linear function of the observations, then the projection
of $Y$ on $\ell$ is the vector $\frac{\ell'Y}{\ell'\ell}\, \ell$.

The squared length of this projection, $\frac{(\ell'Y)^2}{\ell'\ell}$, is called the
sum of squares (SS) due to $\ell'Y$.

Since $\ell'Y$ has one degree of freedom, the SS due to $\ell'Y$
has one degree of freedom.

Sum of squares:
The sums of squares and the degrees of freedom arising out of
mutually orthogonal sets of functions can be added together
to give the sum of squares and degrees of freedom for the set
of all the functions together, and vice versa.

Sum of squares:
Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution with
mean vector $\mu$ and positive definite covariance matrix $\Sigma$.
Consider the two quadratic forms:
• $X'A_1X$ is distributed as $\chi^2$ with $n_1$ degrees of freedom and
  noncentrality parameter $\mu'A_1\mu$, and
• $X'A_2X$ is distributed as $\chi^2$ with $n_2$ degrees of freedom and
  noncentrality parameter $\mu'A_2\mu$.

Then $X'A_1X$ and $X'A_2X$ are independently distributed if $A_1 \Sigma A_2 = 0$.

Fisher‐Cochran theorem:
If $X = (X_1, X_2, \ldots, X_n)'$ has a multivariate normal distribution with
mean vector $\mu$ and positive definite covariance matrix $\Sigma$, and
$X'\Sigma^{-1}X = Q_1 + Q_2 + \ldots + Q_k$

where $Q_i = X'A_iX$ with rank$(A_i) = N_i$, $i = 1, 2, \ldots, k$, then the $Q_i$'s
are independently distributed as noncentral Chi-square with $N_i$
degrees of freedom and noncentrality parameters $\mu'A_i\mu$

if and only if $\sum_{i=1}^{k} N_i = n$, in which case

$\mu'\Sigma^{-1}\mu = \sum_{i=1}^{k} \mu'A_i\mu.$
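A numerical sketch of the rank condition for the special case $\Sigma = I$ (the design matrix is an arbitrary illustration): the hat matrix $H$ and $I - H$ are idempotent, their ranks add to $n$, and the corresponding quadratic forms decompose $y'y$:

```python
import numpy as np

rng = np.random.default_rng(11)
n, p = 20, 3
X = rng.standard_normal((n, p))

# Hat matrix H projects onto the column space of X; H and I - H are
# idempotent and orthogonal to each other (here Sigma = I).
H = X @ np.linalg.solve(X.T @ X, X.T)
A1, A2 = H, np.eye(n) - H

print(np.linalg.matrix_rank(A1), np.linalg.matrix_rank(A2))  # p and n - p
print(np.linalg.matrix_rank(A1) + np.linalg.matrix_rank(A2) == n)

y = rng.standard_normal(n)
print(np.allclose(y @ y, y @ A1 @ y + y @ A2 @ y))           # Q1 + Q2
```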
Derivatives of quadratic and linear forms:
Let $X = (x_1, x_2, \ldots, x_n)'$ and let $f(X)$ be any function of the $n$
independent variables $x_1, x_2, \ldots, x_n$. Then

$\frac{\partial f(X)}{\partial X} = \left( \frac{\partial f(X)}{\partial x_1}, \frac{\partial f(X)}{\partial x_2}, \ldots, \frac{\partial f(X)}{\partial x_n} \right)'.$

If $K = (k_1, k_2, \ldots, k_n)'$ is a vector of constants, then $\frac{\partial K'X}{\partial X} = K.$

If $A$ is an $n \times n$ matrix, then $\frac{\partial X'AX}{\partial X} = (A + A')X,$
which reduces to $2AX$ when $A$ is symmetric.
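Both derivative formulas can be verified by central finite differences (a sketch with arbitrary $K$, $A$ and evaluation point):

```python
import numpy as np

rng = np.random.default_rng(12)
n = 4
x = rng.standard_normal(n)
K = rng.standard_normal(n)
A = rng.standard_normal((n, n))            # a general (non-symmetric) A

def num_grad(f, x, h=1e-6):
    # Central finite differences, one coordinate at a time.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

print(np.allclose(num_grad(lambda v: K @ v, x), K))              # = K
print(np.allclose(num_grad(lambda v: v @ A @ v, x), (A + A.T) @ x))
```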
Independence of linear and quadratic forms:
• Let $Y$ be an $n \times 1$ vector having a multivariate normal
  distribution $N(\mu, I)$ and let $B$ be an $m \times n$ matrix.
  Then the $m \times 1$ linear form $BY$ is independent of the
  quadratic form $Y'AY$ if $BA = 0$, where $A$ is a symmetric
  matrix of known elements.

• Let $Y$ be an $n \times 1$ vector having a multivariate normal
  distribution $N(\mu, \Sigma)$ with rank$(\Sigma) = n$.
  If $B\Sigma A = 0$, then the quadratic form $Y'AY$ is independent
  of the linear form $BY$, where $B$ is an $m \times n$ matrix.

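A closing simulation sketch of the first result: with $B$ extracting the sample mean and $A$ giving the sum of squares about the mean, $BA = 0$, and the two forms show no correlation in simulation:

```python
import numpy as np

rng = np.random.default_rng(13)
n = 5
mu = np.full(n, 2.0)

# B extracts the sample mean; A gives the sum of squares about the mean.
B = np.ones((1, n)) / n
A = np.eye(n) - np.ones((n, n)) / n
print(np.allclose(B @ A, 0))               # BA = 0 (here Sigma = I)

Y = rng.standard_normal((200_000, n)) + mu
lin = (Y @ B.T).ravel()                    # BY: the sample means
quad = np.einsum('ij,jk,ik->i', Y, A, Y)   # Y'AY: SS about the mean

print(np.corrcoef(lin, quad)[0, 1])        # near 0, as independence implies
```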
