
Stats 346.3 / Stats 848.3

Multivariate Data Analysis


Instructor: W.H. Laverty

Office: 235 McLean Hall

Phone: 966-6096

Lectures: MWF 12:30 pm – 1:20 pm, Arts 104

Evaluation: Assignments, Term tests – 40%
Final Examination – 60%

Dates for midterm tests:
1. Friday, February 08
2. Friday, March 22

Each test and the Final Exam are Open Book.
Students are allowed to bring in notes, texts, formula sheets, and calculators (laptop computers).
Text:
Stat 346 – Multivariate Statistical Methods – Donald Morrison

Not required – I will give a list of other useful texts that will be in the library.
Bibliography

1. Cooley, W.W., and Lohnes, P.R. (1962). Multivariate Procedures for the Behavioural Sciences, Wiley, New York.
2. Fienberg, S. (1980). Analysis of Cross-Classified Data, MIT Press, Cambridge, Mass.
3. Fingleton, B. (1984). Models for Category Counts, Cambridge University Press.
4. Johnson, R.A., and Wichern, D.W. Applied Multivariate Statistical Analysis, Prentice Hall.
5. Morrison, D.F. (1976). Multivariate Statistical Methods, McGraw-Hill, New York.
6. Seal, H. (1968). Multivariate Statistical Analysis for Biologists, Methuen, London.
7. Agresti, A. (1990). Categorical Data Analysis, Wiley, New York.


• The lectures will be given in PowerPoint.
• They are now posted on the Stats 346 web page.
Course Outline

Introduction

Review of Linear Algebra and Matrix Analysis          Chapter 2

Review of Linear Statistical Theory                   Chapter 1

Multivariate Normal distribution                      Chapter 3
• Multivariate Data plots
• Correlation – sample estimates and tests
• Canonical Correlation

Mean Vectors and Covariance matrices                  Chapter 4
• Single sample procedures
• Two sample procedures
• Profile Analysis

Multivariate Analysis of Variance (MANOVA)            Chapter 5

Classification and Discrimination                     Chapter 6
• Discriminant Analysis
• Logistic Regression (if time permits)
• Cluster Analysis

The structure of correlation                          Chapter 9
• Principal Components Analysis (PCA)
• Factor Analysis

Multivariate Multiple Regression (if time permits)    References TBA

Discrete Multivariate Analysis (if time permits)      References TBA
Introduction

Multivariate Data
• We have collected data for each case in the sample or population on not just one variable but on several variables – X1, X2, ..., Xp.
• This is the likely situation – very rarely do you collect data on a single variable.
• The variables may be
1. Discrete (Categorical)
2. Continuous (Numerical)
• The variables may be
1. Dependent (Response variables)
2. Independent (Predictor variables)
A chart illustrating Statistical Procedures

                                          Independent variables
Dependent         Categorical                  Continuous                    Continuous & Categorical
variables
--------------------------------------------------------------------------------------------------------
Categorical       Multiway frequency           Discriminant Analysis         Discriminant Analysis
                  analysis (Log-Linear
                  Model)

Continuous        ANOVA (single dep. var.)     MULTIPLE REGRESSION           ANACOVA (single dep. var.)
                  MANOVA (mult. dep. var.)     (single dep. var.)            MANACOVA (mult. dep. var.)
                                               MULTIVARIATE MULTIPLE
                                               REGRESSION (mult. dep. var.)

Continuous &      ??                           ??                            ??
Categorical
Multivariate Techniques
Multivariate techniques can be classified as follows:

1. Techniques that are direct analogues of univariate procedures.
• These are univariate techniques that are then generalized to the multivariate situation.
• e.g. the two independent sample t test, generalized to Hotelling's T² test
• ANOVA (Analysis of Variance), generalized to MANOVA (Multivariate Analysis of Variance)

2. Techniques that are purely multivariate procedures.
• Correlation, Partial correlation, Multiple correlation, Canonical Correlation
• Principal Components Analysis, Factor Analysis
- These are techniques for studying complicated correlation structure amongst a collection of variables.

3. Techniques for which a univariate procedure could exist, but which become much more interesting in the multivariate setting.
• Cluster Analysis and Classification
- Here we try to identify subpopulations from the data.
• Discriminant Analysis
- In Discriminant Analysis, we attempt to use a collection of variables to identify the unknown population to which a case belongs.
An Example:

A survey was given to 132 students:
• Male = 35
• Female = 97
They rated, on a Likert scale (1 to 5), their agreement with each of 40 statements.
All statements are related to the Meaning of Life.
Questions and Statements
1. How religious/spiritual would you say you are?
2. To have trustworthy and intimate friend(s)
3. To have a fulfilling career
4. To be closely connected to family
5. To share values/beliefs with others in your close circle or
community
6. To have and raise children
7. To continually set short and long-term, achievable goals for
yourself
8. To feel satisfied with yourself (feel good about yourself)
9. To live up to the expectations of family and close friends
10. To contribute to world peace
Statements - continued
11. To be involved in an intimate relationship with a significant
person
12. To give of yourself to others.
13. To be able to plan and take time for leisure.
14. To act on your own personal beliefs, despite outside pressure.
15. To be seen as physically attractive.
16. To feel confident in choosing new experiences to better
yourself.
17. To care about the state of the physical/natural environment.
18. To take responsibility for your mistakes.
19. To make restitution for your mistakes, if necessary.
20. To be involved with social or political causes.
21. To keep up with media and popular-culture trends.
22. To adhere to religious practices based on tradition or rituals.
23. To use your own creativity in a way that you believe is
worthwhile.
24. The meaning of life is found in understanding one's ultimate
purpose for life.
25. The meaning of life can be discovered through intentionally
living a life that glorifies a Spiritual being.
26. There is a reason for everything that happens.
27. Obtaining things in life that are material and tangible is only
part of discovering the meaning of life.
28. People unearth the same basic values when attempting to find
the meaning of life.
29. It is more important to cultivate character than to be consumed
with outward rewards or awards.
30. Some aims or goals in life are more valuable than other goals.
31. The purpose of life lies in promoting the ends of truth, beauty,
and goodness.
32. A meaningful life is one that contributes to the well-being of
others.
33. The meaning of life is the same as a happy life.
34. The meaning of life is found in realizing my potential.
35. Life has purpose only in the everyday details of living.
36. There is no one universal way of obtaining a meaningful life
for all people.
37. People passionately desire different things. Obtaining these
things contributes to making life more meaningful for them.
38. What contributes to a meaningful life varies according to each
person (or group).
39. Lives can be meaningful even without the existence of a God
or spiritual realm.
40. Our lives have no significance, but we must live as if they do.
Cluster Analysis of n = 132 university students using responses from the Meaning of Life questionnaire (40 questions)

[Figure: dendrogram of the 132 cases; vertical axis: Linkage Distance, 0 to 80]

Fig. 1. Dendrogram showing clustering using Ward's method of Euclidean distances


Discriminant Analysis of n = 132 university students into the three identified populations

[Figure: scatter plot of the three clusters (Religious, Semi-Religious, Humanistic) on F1 (Discriminant function 1, ranging from Non-religious to Religious) versus F2 (Discriminant function 2, ranging from Pessimistic to Optimistic)]

Fig. 4. Cluster map


A Review of Linear Algebra
(with some additions)

Matrix Algebra

Definition
An n × m matrix, A, is a rectangular array of elements:

              [ a11  a12  ...  a1m ]
A = (aij)  =  [ a21  a22  ...  a2m ]
              [ ...                ]
              [ an1  an2  ...  anm ]

n = # of rows
m = # of columns
dimensions = n × m
Definition
A vector, v, of dimension n is an n × 1 matrix (a rectangular array of elements):

      [ v1 ]
v  =  [ v2 ]
      [ .. ]
      [ vn ]

Vectors will be column vectors (they may also be row vectors).
A vector, v, of dimension n can be thought of as a point in n-dimensional space, with coordinates (v1, v2, ..., vn) along the axes.

[Figure: a point v = (v1, v2, v3)' plotted in 3-dimensional space]
Matrix Operations

Addition
Let A = (aij) and B = (bij) denote two n × m matrices. Then the sum, A + B, is the matrix

A + B = (aij + bij) = [ a11+b11  a12+b12  ...  a1m+b1m ]
                      [ a21+b21  a22+b22  ...  a2m+b2m ]
                      [   ...                          ]
                      [ an1+bn1  an2+bn2  ...  anm+bnm ]

The dimensions of A and B are both required to be n × m.
Scalar Multiplication
Let A = (aij) denote an n × m matrix and let c be any scalar. Then cA is the matrix

cA = (c aij) = [ ca11  ca12  ...  ca1m ]
               [ ca21  ca22  ...  ca2m ]
               [  ...                  ]
               [ can1  can2  ...  canm ]
Addition for vectors: v + w = (v1 + w1, v2 + w2, ..., vn + wn)'. Geometrically, v + w is the far corner of the parallelogram formed by v and w.

Scalar multiplication for vectors: cv = (cv1, cv2, ..., cvn)'. Geometrically, cv stretches (or shrinks, and for c < 0 reverses) v along its own direction.
Matrix multiplication
Let A = (aij) denote an n × m matrix and B = (bjl) denote an m × k matrix. Then the n × k matrix C = (cil), where

        m
cil  =  Σ  aij bjl ,
       j=1

is called the product of A and B and is denoted by A·B.

In the case that A = (aij) is an n × m matrix and B = v = (vj) is an m × 1 vector, then w = A·v = (wi), where

       m
wi  =  Σ  aij vj ,
      j=1

is an n × 1 vector. Geometrically, multiplication by A maps the point v to the point w = Av.
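The operations above can be checked numerically; the following is an illustrative sketch using numpy (an assumed tool here, not part of the course materials):

```python
import numpy as np

# Two 2 x 3 matrices: addition is elementwise and requires equal dimensions
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
B = np.array([[6., 5., 4.],
              [3., 2., 1.]])
S = A + B          # (A + B)_ij = a_ij + b_ij
cA = 3 * A         # scalar multiplication: (cA)_ij = c * a_ij

# Matrix product: A is 2 x 3, so the right factor must have 3 rows
C = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
P = A @ C          # c_il = sum_j a_ij * b_jl, result is 2 x 2

# A matrix times a vector gives a vector: w_i = sum_j a_ij * v_j
v = np.array([1., 1., 1.])
w = A @ v          # here w = (1+2+3, 4+5+6)' = (6, 15)'
```

Note that `A @ C` is defined while `C @ B` would not be (a 3 × 2 matrix cannot multiply a 2 × 3 matrix from the left unless the inner dimensions match, which here they do only in that order).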
Definition
An n × n identity matrix, I, is the square matrix

             [ 1  0  ...  0 ]
I  =  In  =  [ 0  1  ...  0 ]
             [ ...          ]
             [ 0  0  ...  1 ]

Note:
1. AI = A
2. IA = A
Definition (the inverse of an n × n matrix)
Let A = (aij) denote an n × n matrix, and let B denote an n × n matrix such that

AB = BA = I.

If the matrix B exists, then A is called invertible; B is called the inverse of A and is denoted by A^{-1}.
The Woodbury Theorem

(A + BCD)^{-1} = A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1},

where the inverses A^{-1}, C^{-1} and (C^{-1} + DA^{-1}B)^{-1} exist.

Proof:
Let H = A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}.
Then all we need to show is that H(A + BCD) = (A + BCD)H = I.

H(A + BCD)
  = [A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}](A + BCD)
  = A^{-1}A - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}A
    + A^{-1}BCD - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}BCD
  = I + A^{-1}BCD - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}(I + DA^{-1}BC)D
  = I + A^{-1}BCD - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}(C^{-1} + DA^{-1}B)CD
  = I + A^{-1}BCD - A^{-1}BCD
  = I
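The identity can be spot-checked numerically. A minimal sketch with numpy (an assumed tool; the matrices and seed are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 2
A = np.eye(n) + np.diag(rng.random(n))        # well-conditioned n x n matrix
B = 0.1 * rng.standard_normal((n, k))         # small factors keep A + BCD invertible
C = np.eye(k) + np.diag(rng.random(k))        # invertible k x k matrix
D = 0.1 * rng.standard_normal((k, n))

Ainv, Cinv = np.linalg.inv(A), np.linalg.inv(C)

# Right-hand side of the Woodbury identity
woodbury = Ainv - Ainv @ B @ np.linalg.inv(Cinv + D @ Ainv @ B) @ D @ Ainv
direct = np.linalg.inv(A + B @ C @ D)         # left-hand side, computed directly
err = np.abs(woodbury - direct).max()
```

The practical point of the theorem: when A^{-1} is cheap (e.g. A diagonal) and k is small, the right-hand side only requires inverting a k × k matrix.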
The Woodbury theorem can be used to find the inverse of some patterned matrices.

Example: Find the inverse of the n × n matrix

[ b  a  ...  a ]
[ a  b  ...  a ]   =  (b - a) I  +  a 1 1'   =   A + BCD,
[ ...          ]
[ a  a  ...  b ]

where 1 = (1, 1, ..., 1)' and

A = (b - a) I,   B = 1,   C = (a)  (a 1 × 1 matrix),   D = 1'.

Hence A^{-1} = [1/(b - a)] I, C^{-1} = (1/a), and

C^{-1} + DA^{-1}B = 1/a + 1'[1/(b - a) I]1 = 1/a + n/(b - a)
                  = [b - a + an] / [a(b - a)]
                  = [b + (n - 1)a] / [a(b - a)].
Thus

(C^{-1} + DA^{-1}B)^{-1} = a(b - a) / [b + (n - 1)a].

Now, using the Woodbury theorem,

(A + BCD)^{-1} = A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}
              = [1/(b - a)] I - [1/(b - a)] 1 {a(b - a)/[b + (n - 1)a]} 1' [1/(b - a)]
              = [1/(b - a)] I - {a / [(b - a)(b + (n - 1)a)]} 1 1'.

Thus

[ b  a  ...  a ]^{-1}    [ c  d  ...  d ]
[ a  b  ...  a ]      =  [ d  c  ...  d ]
[ ...          ]         [ ...          ]
[ a  a  ...  b ]         [ d  d  ...  c ]

where

d = -a / [(b - a)(b + (n - 1)a)]

and

c = 1/(b - a) - a / [(b - a)(b + (n - 1)a)]
  = [1/(b - a)] [1 - a/(b + (n - 1)a)]
  = [b + (n - 2)a] / [(b - a)(b + (n - 1)a)].
Note: for n = 2,

d = -a / [(b - a)(b + a)] = -a / (b² - a²)

and

c = [1/(b - a)] · b/(b + a) = b / (b² - a²).

Thus

[ b  a ]^{-1}                  [  b  -a ]
[ a  b ]      =  1/(b² - a²)   [ -a   b ]
Also

[ b  a  ...  a ] [ c  d  ...  d ]
[ a  b  ...  a ] [ d  c  ...  d ]
[ ...          ] [ ...          ]
[ a  a  ...  b ] [ d  d  ...  c ]

has diagonal entries bc + (n - 1)ad and off-diagonal entries bd + ac + (n - 2)ad. Now, with

d = -a / [(b - a)(b + (n - 1)a)]   and   c = [b + (n - 2)a] / [(b - a)(b + (n - 1)a)],

bc + (n - 1)ad = [b(b + (n - 2)a) - (n - 1)a²] / [(b - a)(b + (n - 1)a)]
               = [b² + (n - 2)ab - (n - 1)a²] / [(b - a)(b + (n - 1)a)]
               = [(b - a)(b + (n - 1)a)] / [(b - a)(b + (n - 1)a)] = 1

and

bd + ac + (n - 2)ad = [-ab + a(b + (n - 2)a) - (n - 2)a²] / [(b - a)(b + (n - 1)a)] = 0.

This verifies that we have calculated the inverse.
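The closed-form inverse above can be coded directly and compared against a general-purpose inverse. An illustrative numpy sketch (the values of a, b, n are arbitrary):

```python
import numpy as np

def pattern_inverse(a, b, n):
    """Closed-form inverse of the n x n matrix with b on the diagonal
    and a everywhere off the diagonal: c on the diagonal, d off it."""
    d = -a / ((b - a) * (b + (n - 1) * a))
    c = (b + (n - 2) * a) / ((b - a) * (b + (n - 1) * a))
    # (c - d) I + d 11' has c on the diagonal and d off the diagonal
    return (c - d) * np.eye(n) + d * np.ones((n, n))

a, b, n = 2.0, 5.0, 4
M = (b - a) * np.eye(n) + a * np.ones((n, n))   # b on diagonal, a elsewhere
err = np.abs(pattern_inverse(a, b, n) - np.linalg.inv(M)).max()
```

Matrices of this form (equal diagonal, equal off-diagonal) arise later as equicorrelation covariance matrices, so having their inverse in closed form is genuinely useful.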
Block Matrices

Let the n × m matrix

       [ A11  A12 ]   q rows, then n - q rows
A   =  [ A21  A22 ]   (p columns, then m - p columns)
n×m

be partitioned into sub-matrices A11, A12, A21, A22. Similarly partition the m × k matrix

       [ B11  B12 ]   p rows, then m - p rows
B   =  [ B21  B22 ]   (l columns, then k - l columns)
m×k

Product of Blocked Matrices

Then

A·B = [ A11  A12 ] [ B11  B12 ]  =  [ A11B11 + A12B21   A11B12 + A12B22 ]
      [ A21  A22 ] [ B21  B22 ]     [ A21B11 + A22B21   A21B12 + A22B22 ]
The Inverse of Blocked Matrices

Let the n × n matrix

       [ A11  A12 ]   p rows, then n - p rows
A   =  [ A21  A22 ]   (p columns, then n - p columns)
n×n

be partitioned into sub-matrices A11, A12, A21, A22, and similarly partition the n × n matrix

       [ B11  B12 ]   p rows, then n - p rows
B   =  [ B21  B22 ]   (p columns, then n - p columns)
n×n

Suppose that B = A^{-1}. Then

A·B = [ A11  A12 ] [ B11  B12 ]  =  [ A11B11 + A12B21   A11B12 + A12B22 ]  =  [ Ip  0     ]
      [ A21  A22 ] [ B21  B22 ]     [ A21B11 + A22B21   A21B12 + A22B22 ]     [ 0   In-p  ]
Hence

A11 B11 + A12 B21 = I    (1)
A11 B12 + A12 B22 = 0    (2)
A21 B11 + A22 B21 = 0    (3)
A21 B12 + A22 B22 = I    (4)

From (3): A22^{-1} A21 B11 + B21 = 0, or B21 = -A22^{-1} A21 B11.

Substituting in (1): (A11 - A12 A22^{-1} A21) B11 = I,

or

B11 = (A11 - A12 A22^{-1} A21)^{-1}
    = A11^{-1} + A11^{-1} A12 (A22 - A21 A11^{-1} A12)^{-1} A21 A11^{-1},

using the Woodbury theorem.

Similarly

B22 = (A22 - A21 A11^{-1} A12)^{-1}
    = A22^{-1} + A22^{-1} A21 (A11 - A12 A22^{-1} A21)^{-1} A12 A22^{-1}.

From (3) again,

B21 = -A22^{-1} A21 B11 = -A22^{-1} A21 (A11 - A12 A22^{-1} A21)^{-1},

and similarly, from (2),

B12 = -A11^{-1} A12 B22 = -A11^{-1} A12 (A22 - A21 A11^{-1} A12)^{-1}.
Summarizing:

Let   A  =  [ A11  A12 ]    and suppose that    A^{-1} = B = [ B11  B12 ]
            [ A21  A22 ]                                     [ B21  B22 ]

(with A11 and B11 of dimension p × p). Then

B11 = (A11 - A12 A22^{-1} A21)^{-1} = A11^{-1} + A11^{-1} A12 (A22 - A21 A11^{-1} A12)^{-1} A21 A11^{-1}

B22 = (A22 - A21 A11^{-1} A12)^{-1} = A22^{-1} + A22^{-1} A21 (A11 - A12 A22^{-1} A21)^{-1} A12 A22^{-1}

B12 = -A11^{-1} A12 B22 = -A11^{-1} A12 (A22 - A21 A11^{-1} A12)^{-1}

B21 = -A22^{-1} A21 B11 = -A22^{-1} A21 (A11 - A12 A22^{-1} A21)^{-1}
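The four block formulas can be verified numerically against a direct inverse. An illustrative numpy sketch (the random matrix and block sizes are arbitrary; adding a multiple of the identity just keeps all the required inverses well defined):

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 3, 2                     # A11 is p x p, A22 is q x q
A = rng.standard_normal((p + q, p + q)) + 5 * np.eye(p + q)
A11, A12 = A[:p, :p], A[:p, p:]
A21, A22 = A[p:, :p], A[p:, p:]
inv = np.linalg.inv

# Block formulas from the summary above
B11 = inv(A11 - A12 @ inv(A22) @ A21)
B22 = inv(A22 - A21 @ inv(A11) @ A12)
B12 = -inv(A11) @ A12 @ B22
B21 = -inv(A22) @ A21 @ B11

B = np.block([[B11, B12], [B21, B22]])
err = np.abs(B - inv(A)).max()
```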
Example

         [ aI  bI ]      e.g. for p = 2:   [ a 0 b 0 ]
Let  A = [ cI  dI ]                        [ 0 a 0 b ]
                                           [ c 0 d 0 ]
(each block is p × p).                     [ 0 c 0 d ]

Find A^{-1} = B = [ B11  B12 ]
                  [ B21  B22 ]

Here A11 = aI, A12 = bI, A21 = cI, A22 = dI, so

B11 = (aI - bI (1/d)I cI)^{-1} = (a - bc/d)^{-1} I = [d/(ad - bc)] I

B22 = (dI - cI (1/a)I bI)^{-1} = (d - bc/a)^{-1} I = [a/(ad - bc)] I

B12 = -A11^{-1} A12 B22 = -(1/a)I (bI) [a/(ad - bc)] I = -[b/(ad - bc)] I

B21 = -A22^{-1} A21 B11 = -(1/d)I (cI) [d/(ad - bc)] I = -[c/(ad - bc)] I

hence

A^{-1} = 1/(ad - bc) [  dI  -bI ]
                     [ -cI   aI ]
The transpose of a matrix
Consider the n × m matrix A = (aij). Then the m × n matrix A' (also denoted by A^T),

              [ a11  a21  ...  an1 ]
A' = (aji) =  [ a12  a22  ...  an2 ]
              [ ...                ]
              [ a1m  a2m  ...  anm ],

is called the transpose of A.
Symmetric Matrices
• An n × n matrix, A, is said to be symmetric if A' = A.

Note:
(A')' = A
(AB)' = B'A'
(AB)^{-1} = B^{-1}A^{-1}
(A')^{-1} = (A^{-1})'
The trace and the determinant of a square matrix
Let A = (aij) denote an n × n matrix. Then

           n
tr(A)  =   Σ  aii .
          i=1

Also, |A| = det(A), the determinant of A, can be computed by cofactor expansion along any row i:

        n
|A|  =  Σ  aij Aij ,
       j=1

where Aij = cofactor of aij = (-1)^{i+j} times the determinant of the matrix obtained by deleting the ith row and jth column of A.

For a 2 × 2 matrix:

det [ a11  a12 ]  =  a11 a22 - a12 a21 .
    [ a21  a22 ]
Some properties

1. |I| = 1,  tr(I) = n

2. |AB| = |A||B|,  tr(AB) = tr(BA)

3. |A^{-1}| = 1/|A|

4. | A11  A12 |  =  |A11| |A22 - A21 A11^{-1} A12|  =  |A22| |A11 - A12 A22^{-1} A21|
   | A21  A22 |

   = |A11| |A22|  if A12 = 0 or A21 = 0
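Each of these properties is easy to check numerically. An illustrative numpy sketch (random matrices, arbitrary seed):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# Property 2: |AB| = |A||B| and tr(AB) = tr(BA)
det_prod_err = abs(np.linalg.det(A @ B) - np.linalg.det(A) * np.linalg.det(B))
trace_cyc_err = abs(np.trace(A @ B) - np.trace(B @ A))

# Property 3: |A^{-1}| = 1/|A|
det_inv_err = abs(np.linalg.det(np.linalg.inv(A)) - 1 / np.linalg.det(A))

# Property 4, block-triangular case: determinant factors when A21 = 0
A11 = rng.standard_normal((2, 2))
A22 = rng.standard_normal((2, 2))
A12 = rng.standard_normal((2, 2))
T = np.block([[A11, A12], [np.zeros((2, 2)), A22]])
det_block_err = abs(np.linalg.det(T) - np.linalg.det(A11) * np.linalg.det(A22))
```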
Some additional Linear Algebra

Inner product of vectors
Let x and y denote two p × 1 vectors. Then

                          [ y1 ]                            p
x'y = (x1, ..., xp)       [ .. ]  =  x1 y1 + ... + xp yp =  Σ  xi yi .
                          [ yp ]                           i=1

Note: √(x'x) = √(x1² + ... + xp²) = the length of x.

Let θ denote the angle between x and y. Then

cos θ = x'y / [√(x'x) √(y'y)].

Note: if x'y = 0, then cos θ = 0 and θ = π/2. Thus if x'y = 0, then x and y are orthogonal (perpendicular).
Special Types of Matrices

1. Orthogonal matrices
– A matrix P is orthogonal if P'P = PP' = I.
– In this case P^{-1} = P'.
– Also, the rows (and columns) of P have length 1 and are orthogonal to each other.

Suppose P is an orthogonal matrix, so P'P = PP' = I. Let x and y denote p × 1 vectors, and let u = Px and v = Py. Then

u'v = (Px)'(Py) = x'P'Py = x'y
u'u = (Px)'(Px) = x'P'Px = x'x

Orthogonal transformations preserve lengths and angles – rotations about the origin and reflections.
Example
The following matrix P is orthogonal:

    [ 1/√3   1/√3   1/√3  ]
P = [ 1/√2  -1/√2   0     ]
    [ 1/√6   1/√6  -2/√6  ]
Special Types of Matrices (continued)

2. Positive definite matrices
– A symmetric matrix, A, is called positive definite if

x'Ax = a11 x1² + ... + ann xn² + 2a12 x1 x2 + ... + 2a_{n-1,n} x_{n-1} xn > 0

for all x ≠ 0.
– A symmetric matrix, A, is called positive semi-definite if

x'Ax ≥ 0 for all x.

If the matrix A is positive definite, then the set of points x that satisfy x'Ax = c (where c > 0) forms the surface of an n-dimensional ellipsoid centered at the origin, 0.
Theorem. The (symmetric) matrix A is positive definite if and only if

|A1| > 0, |A2| > 0, |A3| > 0, ..., |An| > 0,

where the Ak are the leading principal sub-matrices:

                  [ a11  a12 ]        [ a11  a12  a13 ]
A1 = (a11),  A2 = [ a12  a22 ],  A3 = [ a12  a22  a23 ],  ...,  An = A.
                                      [ a13  a23  a33 ]
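This criterion can be checked against the eigenvalue characterization of positive definiteness (all eigenvalues positive, proved below). An illustrative numpy sketch; the symmetric test matrix is the one used in the eigenvalue example later in these notes:

```python
import numpy as np

def leading_minors_positive(A):
    """Check positive definiteness of a symmetric matrix via the
    leading principal minors |A1|, ..., |An| all being positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

A = np.array([[5., 4., 2.],
              [4., 10., 1.],
              [2., 1., 2.]])           # symmetric
pd_by_minors = leading_minors_positive(A)
pd_by_eigs = bool(np.all(np.linalg.eigvalsh(A) > 0))

# A counterexample: |A1| = 1 > 0 but |A2| = 1 - 4 = -3 < 0
N = np.array([[1., 2.],
              [2., 1.]])
npd = leading_minors_positive(N)
```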
Special Types of Matrices (continued)

3. Idempotent matrices
– A symmetric matrix, E, is called idempotent if EE = E.
– Idempotent matrices project vectors onto a linear subspace: applying E a second time changes nothing,

E(Ex) = (EE)x = Ex.
Definition
Let A be an n × n matrix. Let x and λ be such that

Ax = λx  with  x ≠ 0.

Then λ is called an eigenvalue of A, and x is called an eigenvector of A.

Note:

(A - λI)x = 0.

If |A - λI| ≠ 0, then x = (A - λI)^{-1} 0 = 0. Thus

|A - λI| = 0

is the condition for an eigenvalue:

               [ a11 - λ   ...    a1n    ]
|A - λI| = det [   ...                   ]  =  0,
               [ an1       ...  ann - λ  ]

a polynomial of degree n in λ. Hence there are n possible eigenvalues λ1, ..., λn.

Theorem. If the matrix A is symmetric, then the eigenvalues of A, λ1, ..., λn, are real.

Theorem. If the matrix A is positive definite, then the eigenvalues of A, λ1, ..., λn, are positive.

Proof. A is positive definite if x'Ax > 0 for all x ≠ 0. Let λ and x be an eigenvalue and corresponding eigenvector of A. Then Ax = λx, so

x'Ax = λ x'x,  or  λ = x'Ax / x'x > 0.
Theorem. If the matrix A is symmetric with eigenvalues λ1, ..., λn and corresponding eigenvectors x1, ..., xn (i.e. A xi = λi xi), and if λi ≠ λj, then xi'xj = 0.

Proof. Note that

xj'A xi = λi xj'xi   and   xi'A xj = λj xi'xj.

Since A is symmetric, these two scalars are equal (each is the transpose of the other), so

0 = (λi - λj) xi'xj,

and hence xi'xj = 0.
Theorem. If the matrix A is symmetric with distinct eigenvalues λ1, ..., λn and corresponding eigenvectors x1, ..., xn, normalized so that xi'xi = 1, then

A = λ1 x1x1' + ... + λn xnxn'

                   [ λ1       0  ] [ x1' ]
  = (x1, ..., xn)  [    ...      ] [ ... ]  =  P D P'.
                   [ 0       λn  ] [ xn' ]

Proof. Note that xi'xi = 1 and xi'xj = 0 if i ≠ j. Hence, with P = (x1, ..., xn),

        [ x1' ]                   [ x1'x1  ...  x1'xn ]
P'P  =  [ ... ] (x1, ..., xn)  =  [  ...              ]  =  I,
        [ xn' ]                   [ xn'x1  ...  xn'xn ]

so P is an orthogonal matrix: P' = P^{-1} and PP' = P'P = I. Thus

               [ x1' ]
I = PP' = (x1, ..., xn) [ ... ]  =  x1x1' + ... + xnxn'.
               [ xn' ]

Now A xi = λi xi, so A xixi' = λi xixi', and summing,

A(x1x1' + ... + xnxn') = λ1 x1x1' + ... + λn xnxn',

i.e. A = λ1 x1x1' + ... + λn xnxn'.
Comment
The previous result is also true if the eigenvalues are not distinct. Namely, if the matrix A is symmetric with eigenvalues λ1, ..., λn and corresponding eigenvectors of unit length x1, ..., xn, then

A = λ1 x1x1' + ... + λn xnxn' = P D P'.
An algorithm for computing eigenvectors and eigenvalues of positive definite matrices

• Generally, to compute the eigenvalues of a matrix we need to first solve, for all values of λ, the equation

|A - λI| = 0  (a polynomial of degree n in λ).

• Then, for each eigenvalue λ, solve the equation Ax = λx for the eigenvector x.

Recall that if A is positive definite, then

A = λ1 x1x1' + ... + λn xnxn',

where x1, x2, ..., xn are the orthogonal eigenvectors of length 1 (i.e. xi'xi = 1 and xi'xj = 0 if i ≠ j), and λ1 ≥ λ2 ≥ ... ≥ λn > 0 are the eigenvalues.

It can be shown that

A² = λ1² x1x1' + ... + λn² xnxn'

and, more generally, that

A^m = λ1^m x1x1' + ... + λn^m xnxn'
    = λ1^m [ x1x1' + (λ2/λ1)^m x2x2' + ... + (λn/λ1)^m xnxn' ].

Thus, for large values of m,

A^m ≈ (a constant) x1x1'.

The algorithm
1. Compute powers of A: A², A⁴, A⁸, A¹⁶, ...
2. Rescale (so that the largest element is 1, say).
3. Continue until there is no change. The resulting matrix will be A^m ≈ c x1x1'.
4. Find b so that A^m ≈ b b' (= c x1x1').
5. Find x1 = [1/√(b'b)] b and λ1 using A x1 = λ1 x1.

To find x2 and λ2, note that

A - λ1 x1x1' = λ2 x2x2' + ... + λn xnxn'.

6. Repeat steps 1 to 5 with the above matrix to find x2 and λ2.
7. Continue to find x3 and λ3, x4 and λ4, ..., xn and λn.
Example

    [ 5   4  2 ]
A = [ 4  10  1 ]
    [ 2   1  2 ]

                       1           2           3
eigenvalue       12.54461    3.589204    0.866182
eigenvector      0.496986    0.677344    0.542412
                 0.849957   -0.50594    -0.14698
                 0.174869    0.534074   -0.82716

(the ith column lists the eigenvector corresponding to the ith eigenvalue)
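The algorithm above can be sketched in code. This is a minimal power-iteration variant (repeatedly multiplying a vector by A and rescaling, rather than literally squaring A, which converges to the same dominant eigenvector); the iteration count is an arbitrary choice:

```python
import numpy as np

def power_method(A, iters=200):
    """Dominant eigenpair by repeated multiplication and rescaling."""
    b = np.ones(A.shape[0])
    for _ in range(iters):
        b = A @ b
        b = b / np.abs(b).max()       # rescale so the largest element is 1
    x = b / np.linalg.norm(b)         # unit-length eigenvector
    lam = x @ A @ x                   # since A x = lambda x and x'x = 1
    return lam, x

A = np.array([[5., 4., 2.],
              [4., 10., 1.],
              [2., 1., 2.]])
lam1, x1 = power_method(A)            # should recover lambda_1 = 12.54461
```

Convergence is geometric with ratio λ2/λ1 (here about 0.29), so a couple of hundred iterations is far more than enough.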
Differentiation with respect to a vector, matrix

Differentiation with respect to a vector
Let x denote a p × 1 vector, and let f(x) denote a function of the components of x. Then

             [ ∂f(x)/∂x1 ]
df(x)/dx  =  [    ...    ]
             [ ∂f(x)/∂xp ]
Rules

1. Suppose f(x) = a'x = a1 x1 + ... + ap xp. Then

df(x)/dx = (a1, ..., ap)' = a.

2. Suppose (with A symmetric)

f(x) = x'Ax = a11 x1² + ... + app xp² + 2a12 x1 x2 + 2a13 x1 x3 + ... + 2a_{p-1,p} x_{p-1} xp.

Then

df(x)/dx = 2Ax,

i.e. ∂f(x)/∂xi = 2ai1 x1 + ... + 2aii xi + ... + 2aip xp.
Example

1. Determine when f(x) = x'Ax + b'x + c is a maximum or minimum.

Solution:

df(x)/dx = 2Ax + b = 0,  or  x = -(1/2) A^{-1} b.

2. Determine when f(x) = x'Ax is a maximum subject to x'x = 1. Assume A is a positive definite matrix.

Solution: Let

g(x) = x'Ax + λ(1 - x'x),

where λ is the Lagrange multiplier. Then

dg(x)/dx = 2Ax - 2λx = 0,  or  Ax = λx.

This shows that x is an eigenvector of A, and

f(x) = x'Ax = λ x'x = λ.

Thus f is maximized at the eigenvector x of A associated with the largest eigenvalue, λ.
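The second result can be spot-checked numerically: no unit vector should beat the top eigenvector. An illustrative numpy sketch (matrix, seed, and sample count are arbitrary choices):

```python
import numpy as np

A = np.array([[5., 4., 2.],
              [4., 10., 1.],
              [2., 1., 2.]])            # positive definite

# Maximize x'Ax subject to x'x = 1: the maximum is the largest
# eigenvalue, attained at the corresponding unit eigenvector.
lam, X = np.linalg.eigh(A)               # eigenvalues in ascending order
x_star = X[:, -1]                        # eigenvector for the largest eigenvalue
max_val = x_star @ A @ x_star            # equals lam[-1]

# Random unit vectors never exceed it (numerical spot check)
rng = np.random.default_rng(3)
samples = rng.standard_normal((1000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
best_random = max(u @ A @ u for u in samples)
```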
Differentiation with respect to a matrix
Let X denote a q × p matrix, and let f(X) denote a function of the components of X. Then

                            [ ∂f(X)/∂x11  ...  ∂f(X)/∂x1p ]
df(X)/dX = (∂f(X)/∂xij)  =  [    ...                      ]
                            [ ∂f(X)/∂xq1  ...  ∂f(X)/∂xqp ]
Example
Let X denote a p × p matrix, and let f(X) = ln |X|. Then

d ln|X| / dX = (X^{-1})'   (= X^{-1} when X is symmetric).

Solution: Expanding along row i,

|X| = xi1 Xi1 + ... + xij Xij + ... + xip Xip,

where the Xij are cofactors. Hence

∂ ln|X| / ∂xij = Xij / |X| = the (j, i)th element of X^{-1}.
Example
Let X and A denote p × p matrices, and let f(X) = tr(AX). Then

d tr(AX) / dX = A'.

Solution:

            p   p
tr(AX)  =   Σ   Σ  aik xki ,
           i=1 k=1

so

∂ tr(AX) / ∂xij = aji.
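Both matrix-derivative formulas can be confirmed against finite differences. An illustrative numpy sketch (the perturbation X = I + small noise keeps |X| > 0 so ln|X| is defined; the helper `numerical_grad` is introduced here for the check, not part of the course material):

```python
import numpy as np

def numerical_grad(f, X, h=1e-6):
    """Central finite-difference gradient of a scalar function of a matrix."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Xp, Xm = X.copy(), X.copy()
            Xp[i, j] += h
            Xm[i, j] -= h
            G[i, j] = (f(Xp) - f(Xm)) / (2 * h)
    return G

rng = np.random.default_rng(4)
n = 3
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))
A = rng.standard_normal((n, n))

# d ln|X| / dX = (X^{-1})'
g1 = numerical_grad(lambda M: np.log(np.linalg.det(M)), X)
err1 = np.abs(g1 - np.linalg.inv(X).T).max()

# d tr(AX) / dX = A'
g2 = numerical_grad(lambda M: np.trace(A @ M), X)
err2 = np.abs(g2 - A.T).max()
```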
Differentiation of a matrix of functions
Let U = (uij) denote a q × p matrix of functions of x. Then

                      [ du11/dx  ...  du1p/dx ]
dU/dx = (duij/dx)  =  [   ...                 ]
                      [ duq1/dx  ...  duqp/dx ]
Rules:

1. d(aU)/dx = a dU/dx

2. d(U + V)/dx = dU/dx + dV/dx

3. d(UV)/dx = (dU/dx) V + U (dV/dx)

4. dU^{-1}/dx = -U^{-1} (dU/dx) U^{-1}

Proof: U^{-1}U = I, so

(dU^{-1}/dx) U + U^{-1} (dU/dx) = 0,
(dU^{-1}/dx) U = -U^{-1} (dU/dx),
dU^{-1}/dx = -U^{-1} (dU/dx) U^{-1}.

5. d tr(AU)/dx = tr(A dU/dx)

Proof:

            p   p
tr(AU)  =   Σ   Σ  aik uki ,
           i=1 k=1

so

∂ tr(AU)/∂x = Σ Σ aik ∂uki/∂x = tr(A dU/dx).

6. d tr(AU^{-1})/dx = -tr(A U^{-1} (dU/dx) U^{-1})

7. ∂ tr(AX^{-1})/∂xij = -tr(E^{(ij)} X^{-1} A X^{-1}),

where E^{(kl)} = (ers) with ers = 1 if r = k and s = l, and 0 otherwise.

Proof:

∂ tr(AX^{-1})/∂xij = -tr(A X^{-1} (∂X/∂xij) X^{-1})
                   = -tr(A X^{-1} E^{(ij)} X^{-1})
                   = -tr(E^{(ij)} X^{-1} A X^{-1}).

8. d tr(AX^{-1})/dX = -(X^{-1} A X^{-1})'   (= -X^{-1} A X^{-1} when X and A are symmetric).
The Generalized Inverse of a matrix

Recall
B (denoted by A^{-1}) is called the inverse of A if AB = BA = I.

• A^{-1} does not exist for all matrices A.
• A^{-1} exists only if A is a square matrix and |A| ≠ 0.
• If A^{-1} exists, then the system of linear equations Ax = b has the unique solution x = A^{-1} b.
Definition
B (denoted by A^-) is called the generalized inverse (Moore–Penrose inverse) of A if
1. ABA = A
2. BAB = B
3. (AB)' = AB
4. (BA)' = BA

Note: A^- is unique.
Proof: Let B1 and B2 both satisfy, for i = 1, 2,
1. A Bi A = A
2. Bi A Bi = Bi
3. (A Bi)' = A Bi
4. (Bi A)' = Bi A

Hence

B1 = B1AB1 = B1AB2AB1 = B1(AB2)'(AB1)'
   = B1B2'A'B1'A' = B1B2'A' = B1AB2 = B1AB2AB2
   = (B1A)(B2A)B2 = (B1A)'(B2A)'B2 = A'B1'A'B2'B2
   = A'B2'B2 = (B2A)'B2 = B2AB2 = B2.

The general solution of a system of equations Ax = b

The general solution is

x = A^- b + (I - A^- A) z,

where z is arbitrary.

Verification: Suppose a solution exists, i.e. A x0 = b for some x0. Let x = A^- b + (I - A^- A) z. Then

Ax = A[A^- b + (I - A^- A) z]
   = AA^- b + (A - AA^- A) z
   = AA^- b
   = AA^- A x0 = A x0 = b.
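The general-solution formula is easy to exercise on an underdetermined system, where many solutions exist and each choice of z picks one. An illustrative numpy sketch (random system, arbitrary seed; `np.linalg.pinv` computes the Moore–Penrose inverse):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((2, 4))        # 2 equations, 4 unknowns: underdetermined
b = rng.standard_normal(2)

Ag = np.linalg.pinv(A)                 # Moore-Penrose generalized inverse A^-

# General solution: x = A^- b + (I - A^- A) z, z arbitrary
I = np.eye(4)
errs = []
for _ in range(5):
    z = rng.standard_normal(4)
    x = Ag @ b + (I - Ag @ A) @ z      # a different solution for each z
    errs.append(np.abs(A @ x - b).max())
max_err = max(errs)
```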
Calculation of the Moore–Penrose g-inverse

Let A be a p × q matrix of full column rank q (q < p). Then

A^- = (A'A)^{-1} A'.

Proof:

A^- A = (A'A)^{-1} A'A = I,

thus

A A^- A = A I = A   and   A^- A A^- = I A^- = A^-.

Also, A^- A = I is symmetric, and

A A^- = A (A'A)^{-1} A'

is symmetric.
Let B be a p × q matrix of full row rank p (p < q). Then

B^- = B'(BB')^{-1}.

Proof:

B B^- = B B'(BB')^{-1} = I,

thus

B B^- B = I B = B   and   B^- B B^- = B^- I = B^-.

Also, B B^- = I is symmetric, and

B^- B = B'(BB')^{-1} B

is symmetric.
Let C be a p × q matrix of rank k < min(p, q). Then C = AB, where A is a p × k matrix of rank k and B is a k × q matrix of rank k, and

C^- = B'(BB')^{-1} (A'A)^{-1} A'.

Proof:

C C^- = AB B'(BB')^{-1} (A'A)^{-1} A' = A (A'A)^{-1} A'

is symmetric, as is

C^- C = B'(BB')^{-1} (A'A)^{-1} A' AB = B'(BB')^{-1} B.

Also,

C C^- C = A (A'A)^{-1} A' AB = AB = C

and

C^- C C^- = B'(BB')^{-1} B B'(BB')^{-1} (A'A)^{-1} A'
          = B'(BB')^{-1} (A'A)^{-1} A' = C^-.
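The rank-factorization formula can be verified against all four defining conditions, and against numpy's built-in Moore–Penrose inverse. An illustrative sketch (random factors of rank k, arbitrary seed):

```python
import numpy as np

rng = np.random.default_rng(6)
p, q, k = 4, 5, 2
A = rng.standard_normal((p, k))        # p x k of rank k (almost surely)
B = rng.standard_normal((k, q))        # k x q of rank k (almost surely)
C = A @ B                              # p x q of rank k

inv = np.linalg.inv
Cg = B.T @ inv(B @ B.T) @ inv(A.T @ A) @ A.T    # C^- from the factorization

# The four Moore-Penrose conditions
c1 = np.abs(C @ Cg @ C - C).max()      # C C^- C = C
c2 = np.abs(Cg @ C @ Cg - Cg).max()    # C^- C C^- = C^-
c3 = np.abs((C @ Cg).T - C @ Cg).max() # C C^- symmetric
c4 = np.abs((Cg @ C).T - Cg @ C).max() # C^- C symmetric

# Uniqueness: the formula must agree with numpy's pinv
pinv_err = np.abs(Cg - np.linalg.pinv(C)).max()
```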
References
1. Searle, S.R. Matrix Algebra Useful for Statistics, Wiley.
2. Carroll, J.D., and Green, P.E. Mathematical Tools for Applied Multivariate Analysis.
