A1 Matrix-Algebra
A matrix is a rectangular or quadratic (square) array of numbers. The format or “order” of A is given by the number n of rows and the number m of columns,
O(A) = n × m.
Fact:
Two matrices are identical if they have identical format and identical numbers at each place (i, j), namely
A = B ⇔ a_ij = b_ij for all i ∈ {1, …, n}, j ∈ {1, …, m}.
With permission from the author: Extract from "Erik W. Grafarend: Linear and
nonlinear models. Fixed effects, random effects, and mixed models"
A + B = B + A (commutativity)
(A + B) + C = A + (B + C) (associativity)
Compatibility (distributivity):
(α + β)A = αA + βA
α(A + B) = αA + αB
(A + B)′ = A′ + B′.
3(ii) “Kronecker-Zehfuss-product”
A = [a_ij], O(A) = n × m
B = [b_ij], O(B) = k × l
C := B ⊗ A = [c_ij], B ⊗ A := [b_ij A], O(C) = O(B ⊗ A) = kn × lm.
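The block structure of the Kronecker–Zehfuss product can be checked numerically. This is a minimal sketch (not from the original text); the matrices A and B are illustrative, with NumPy's `kron` implementing B ⊗ A = [b_ij A]:

```python
import numpy as np

A = np.arange(1, 7).reshape(2, 3)        # O(A) = n x m = 2 x 3
B = np.array([[0, 1], [2, 3], [4, 5]])   # O(B) = k x l = 3 x 2

C = np.kron(B, A)                        # B (x) A = [b_ij * A]
print(C.shape)                           # (6, 6): kn x lm = (3*2) x (2*3)

# the (i, j) block of C of size n x m is b_ij * A
assert np.allclose(C[:2, :3], B[0, 0] * A)
```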
3(iii) “Khatri-Rao-product”
(of two rectangular matrices of identical column number)
A = [a_1, …, a_m], O(A) = n × m
B = [b_1, …, b_m], O(B) = k × m
C := B ⊙ A := [b_1 ⊗ a_1, …, b_m ⊗ a_m], O(C) = kn × m.
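The Khatri–Rao product is simply a column-wise Kronecker product. A minimal sketch (names are illustrative, not from the text):

```python
import numpy as np

def khatri_rao(B, A):
    """Column-wise Kronecker product: [b_1 (x) a_1, ..., b_m (x) a_m]."""
    assert A.shape[1] == B.shape[1], "identical column number required"
    return np.vstack([np.kron(B[:, j], A[:, j]) for j in range(A.shape[1])]).T

A = np.arange(6.0).reshape(2, 3)   # n x m = 2 x 3
B = np.arange(9.0).reshape(3, 3)   # k x m = 3 x 3
C = khatri_rao(B, A)
print(C.shape)                     # (6, 3): kn x m
assert np.allclose(C[:, 0], np.kron(B[:, 0], A[:, 0]))
```

SciPy versions that provide `scipy.linalg.khatri_rao` compute the same column-wise Kronecker product.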
3(iv) “Hadamard-product”
(of two rectangular matrices of the same order; elementwise product)
G = [g_ij], O(G) = n × m
H = [h_ij], O(H) = n × m
G ∗ H := [g_ij h_ij], O(G ∗ H) = n × m.
The existence of the product A·B does not imply the existence of the product B·A. If both products exist, they are in general not equal. Two quadratic matrices A and B for which A·B = B·A holds are called commutative.
Laws
(i) (A·B)·C = A·(B·C)
A·(B + C) = A·B + A·C
(A + B)·C = A·C + B·C
(A·B)′ = B′·A′.
(ii)
(A ⊗ B) ⊗ C = A ⊗ (B ⊗ C) = A ⊗ B ⊗ C
(A + B) ⊗ C = (A ⊗ C) + (B ⊗ C)
A ⊗ (B + C) = (A ⊗ B) + (A ⊗ C)
(A ⊗ B)·(C ⊗ D) = (A·C) ⊗ (B·D)
(A ⊗ B)′ = A′ ⊗ B′.
(iii)
(A ⊙ B) ⊙ C = A ⊙ (B ⊙ C) = A ⊙ B ⊙ C
(A + B) ⊙ C = (A ⊙ C) + (B ⊙ C)
A ⊙ (B + C) = (A ⊙ B) + (A ⊙ C)
(A ⊗ C)·(B ⊙ D) = (A·B) ⊙ (C·D)
A ⊙ (B·D) = (A ⊙ B)·D, if d_ij = 0 for i ≠ j.
(iv) A ∗ B = B ∗ A
(A ∗ B) ∗ C = A ∗ (B ∗ C) = A ∗ B ∗ C
(A + B) ∗ C = (A ∗ C) + (B ∗ C)
(A_1 ⊗ B_1 ⊗ C_1) ∗ (A_2 ⊗ B_2 ⊗ C_2) = (A_1 ∗ A_2) ⊗ (B_1 ∗ B_2) ⊗ (C_1 ∗ C_2)
(D·A) ∗ (B·D) = D·(A ∗ B)·D, if d_ij = 0 for i ≠ j
(A ∗ B)′ = A′ ∗ B′.
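Several of these laws are easy to verify numerically. A minimal sketch (random matrices are illustrative, not from the text) checking the mixed-product law for the Kronecker–Zehfuss product, the transposition law, and the commutativity of the Hadamard product:

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.normal(size=(3, 2)), rng.normal(size=(4, 5))
C, D = rng.normal(size=(2, 3)), rng.normal(size=(5, 4))

# mixed-product law: (A (x) B).(C (x) D) = (A.C) (x) (B.D)
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
assert np.allclose(lhs, rhs)

# transposition law: (A (x) B)' = A' (x) B'
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))

# the Hadamard product commutes
G, H = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
assert np.allclose(G * H, H * G)
```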
A2 Special Matrices
We will collect special matrices: symmetric, antisymmetric, diagonal, unity, zero, idempotent, normal, orthogonal, orthonormal, positive-definite and positive-semidefinite, and special orthonormal matrices, for instance of type Helmert or of type Hankel.
diagonal: a_ij = 0 for all i ≠ j
unity: I_n (n × n): a_ij = 1 for i = j, a_ij = 0 for i ≠ j
zero matrix: 0 (n × n): a_ij = 0 for all i, j ∈ {1, …, n}
triangular:
upper: a_ij = 0 for i > j
lower: a_ij = 0 for i < j.
{X = [ x_1  x_2
       x_3  x_4 ] ∈ ℝ^{2×2} | x_1² + x_2² = 1, x_3² + x_4² = 1, x_1 x_3 + x_2 x_4 = 0, x_1 x_4 − x_2 x_3 = 1}

(i) X = [ cos φ   sin φ
          −sin φ  cos φ ] ∈ ℝ^{2×2}, φ ∈ [0, 2π]
is a trigonometric representation of X ∈ SO(2).

(ii) X = [ x            √(1 − x²)
           −√(1 − x²)   x         ] ∈ ℝ^{2×2}, x ∈ [−1, 1]

(iii) X = [ (1 − x²)/(1 + x²)   2x/(1 + x²)
            −2x/(1 + x²)        (1 − x²)/(1 + x²) ] ∈ ℝ^{2×2}, x ∈ ℝ.
The atlas of the special orthogonal group SO(n) has at least four distinct charts, and there is one with exactly four charts (“minimal atlas”: Lusternik–Schnirelmann category).
(i) X = (I_n − S)(I_n + S)⁻¹,
where S = −S′ is a skew matrix (antisymmetric matrix), is called a Cayley–Lipschitz representation of X ∈ SO(n).
(n!/[2(n − 2)!] = n(n − 1)/2 is the number of independent parameters/coordinates of X.)
(ii) If each of the matrices R_1, …, R_k is an n × n orthonormal matrix, then their product
R_1 R_2 ⋯ R_{k−1} R_k ∈ SO(n).
Let a = [a_1, …, a_n] represent any row vector whose elements are all nonzero (a_i ≠ 0 for i ∈ {1, …, n}). Suppose that we require an n × n orthonormal matrix, one row of which is proportional to a. In what follows one such matrix R is derived.
Let r_1, …, r_n represent the rows of R and take the first row r_1 to be the row of R that is proportional to a. Take the second row r_2 to be proportional to the n-dimensional row vector
[a_1, −a_1²/a_2, 0, 0, …, 0], (H2)
the third row r_3 proportional to
[a_1, a_2, −(a_1² + a_2²)/a_3, 0, 0, …, 0] (H3)
and more generally the second through nth rows r_2, …, r_n proportional to
[a_1, a_2, …, a_{k−1}, −(Σ_{i=1}^{k−1} a_i²)/a_k, 0, 0, …, 0] (Hk)
for k ∈ {2, …, n}.
(kth row) r_k = [a_k² / ((Σ_{i=1}^{k−1} a_i²)(Σ_{i=1}^{k} a_i²))]^{1/2} · [a_1, a_2, …, a_{k−1}, −(Σ_{i=1}^{k−1} a_i²)/a_k, 0, 0, …, 0],

(nth row) r_n = [a_n² / ((Σ_{i=1}^{n−1} a_i²)(Σ_{i=1}^{n} a_i²))]^{1/2} · [a_1, a_2, …, a_{n−1}, −(Σ_{i=1}^{n−1} a_i²)/a_n].
The recipe is complicated. When a = [1, 1, …, 1, 1], the Helmert factors in the 1st row, …, kth row, …, nth row simplify to
r_1 = n^{−1/2} [1, 1, …, 1, 1]
r_k = [k(k − 1)]^{−1/2} [1, 1, …, 1, 1 − k, 0, …, 0]
r_n = [n(n − 1)]^{−1/2} [1, 1, …, 1, 1 − n].
The matrix
R = [r_1; r_2; …; r_k; …; r_{n−1}; r_n] ∈ SO(n)
is known as the Helmert matrix of order n. (Alternatively the transpose of such a matrix is called the Helmert matrix.)
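The Helmert construction for a = [1, …, 1] is short to code. A minimal sketch (not from the original text), with the sign convention that the last nonzero entry of row k is negative, followed by a check that the rows are orthonormal:

```python
import numpy as np

def helmert(n):
    """Helmert matrix for a = [1, ..., 1]: first row 1/sqrt(n),
    k-th row (k >= 2) proportional to [1, ..., 1, -(k-1), 0, ..., 0]."""
    R = np.zeros((n, n))
    R[0, :] = 1.0 / np.sqrt(n)
    for k in range(2, n + 1):
        norm = np.sqrt(k * (k - 1))
        R[k - 1, :k - 1] = 1.0 / norm
        R[k - 1, k - 1] = -(k - 1) / norm
    return R

R = helmert(4)
assert np.allclose(R @ R.T, np.eye(4))   # orthonormal rows
```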
R_4 =
[ 1/2     1/2     1/2     1/2
  1/√2   −1/√2    0       0
  1/√6    1/√6   −2/√6    0
  1/√12   1/√12   1/√12  −3/√12 ] ∈ SO(4).

R_n =
[ 1/√n    1/√n    1/√n   ⋯  1/√n    1/√n
  1/√2   −1/√2    0      ⋯  0       0
  1/√6    1/√6   −2/√6   ⋯  0       0
  ⋮
  1/√((n−1)(n−2))  ⋯  1/√((n−1)(n−2))  −(n−2)/√((n−1)(n−2))  0
  1/√(n(n−1))      ⋯  1/√(n(n−1))       1/√(n(n−1))          (1−n)/√(n(n−1)) ] ∈ SO(n).
Check that the rows are orthogonal and normalized. An example is the nth row:
(n − 1) · [1/√(n(n−1))]² + [(1 − n)/√(n(n−1))]²
= (n − 1)/(n(n − 1)) + (1 − 2n + n²)/(n(n − 1))
= (n² − n)/(n(n − 1)) = 1.
A matrix A = [a_ij], O(A) = n × m, whose entries
a_11, a_21, …, a_n1, a_n2, …, a_nm
depend only on the sum of their indices,
a_ij = a_{i−1, j+1} (constant along each antidiagonal),
is a Hankel matrix.
Definition (Vandermonde matrix):
Vandermonde matrix: V ∈ ℝ^{n×n},
V := [ 1          1          ⋯  1
       x_1        x_2        ⋯  x_n
       ⋮
       x_1^{n−1}  x_2^{n−1}  ⋯  x_n^{n−1} ],
det V = Π_{i>j} (x_i − x_j).
For n = 3:
V := [ 1     1     1
       x_1   x_2   x_3
       x_1²  x_2²  x_3² ],  det V = (x_2 − x_1)(x_3 − x_2)(x_3 − x_1).
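The determinant formula can be verified directly. A minimal sketch (the sample points are illustrative, not from the text), using NumPy's `vander` with increasing powers so the rows match the layout above after transposition:

```python
import numpy as np

x = np.array([2.0, 3.0, 5.0])
n = len(x)
V = np.vander(x, increasing=True).T     # V[i, j] = x_j ** i, as in the text
det_formula = np.prod([x[i] - x[j] for i in range(n) for j in range(i)])
print(det_formula)                      # (3-2)(5-2)(5-3) = 6.0
assert np.isclose(np.linalg.det(V), det_formula)
```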
det P = (Π_{i=1}^n λ_i)(Π_{i=1}^n x_i)(det V)²,
P = [a_1, a_2, a_3] =
[ Σ λ_i x_i    Σ λ_i x_i²   Σ λ_i x_i³
  Σ λ_i x_i²   Σ λ_i x_i³   Σ λ_i x_i⁴
  Σ λ_i x_i³   Σ λ_i x_i⁴   Σ λ_i x_i⁵ ].
A3 Scalar Measures and Inverse Matrices
Example (commutation matrix)
n = m = 2:
K_4 = [ 1 0 0 0
        0 0 1 0
        0 1 0 0
        0 0 0 1 ] = K_4′.
n = 3, m = 2:
K_32 = [ 1 0 0 0 0 0
         0 0 0 1 0 0
         0 1 0 0 0 0
         0 0 0 0 1 0
         0 0 1 0 0 0
         0 0 0 0 0 1 ].
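The defining property K vec(A) = vec(A′) pins down every entry of the commutation matrix. A minimal sketch (not from the original text) that constructs K for arbitrary n, m and reproduces both examples:

```python
import numpy as np

def commutation(n, m):
    """Commutation matrix K with K vec(A) = vec(A') for every n x m A."""
    K = np.zeros((m * n, n * m))
    for i in range(n):
        for j in range(m):
            # vec(A)[j*n + i] = A[i, j] must land at vec(A')[i*m + j]
            K[i * m + j, j * n + i] = 1.0
    return K

A = np.arange(6.0).reshape(3, 2)
K = commutation(3, 2)
assert np.allclose(K @ A.flatten('F'), A.T.flatten('F'))  # K vec(A) = vec(A')

K4 = commutation(2, 2)
assert np.allclose(K4, K4.T)   # K_4 is symmetric, as in the text
```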
Vectors for which there exist scalars x_1, …, x_n, not all zero, such that their linear combination with these coefficients vanishes are called linearly dependent.
Let A be a rectangular matrix of the order O(A) = n × m. The column rank of the matrix A is the largest number of linearly independent columns, while the row rank is the largest number of linearly independent rows. Actually the column rank of the matrix A is identical to its row rank; this common value is called the rank of the matrix,
rk A.
Obviously,
rk A ≤ min{n, m}.
If rk A = n holds, we say that the matrix A has full row rank. In contrast, if the rank identity rk A = m holds, we say that the matrix A has full column rank.
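The equality of column rank and row rank, and the bound rk A ≤ min{n, m}, can be illustrated with a small example (a minimal sketch, not from the original text):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])      # second row = 2 * first row
r = np.linalg.matrix_rank(A)
print(r)                             # 1: column rank = row rank = rk A
assert r == np.linalg.matrix_rank(A.T)   # rank is invariant under transposition
assert r <= min(A.shape)
```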
Facts
A matrix A has the column space
R(A)
spanned by its column vectors. The dimension of this vector space is dim R(A) = rk A. In particular,
R(A) = R(AA′)
holds.
(i) (A·B)·C = A·(B·C) (associativity).
The Cayley inverse A⁻¹ is both a left and a right inverse, and it is unique.
A := [ A_11   A_12
       A_12′  A_22 ],  A_11 = A_11′, A_22 = A_22′.

S_11 := A_22 − A_12′ A_11⁻¹ A_12 and S_22 := A_11 − A_12 A_22⁻¹ A_12′
are formed from minors determined by properly chosen rows and columns of the matrix A and are called “Schur complements”. With them,

A⁻¹ = [ A_11   A_12  ]⁻¹ = [ [I + A_11⁻¹ A_12 S_11⁻¹ A_12′] A_11⁻¹   −A_11⁻¹ A_12 S_11⁻¹
        A_12′  A_22 ]       [ −S_11⁻¹ A_12′ A_11⁻¹                    S_11⁻¹             ]
if A_11⁻¹ exists,

A⁻¹ = [ A_11   A_12  ]⁻¹ = [ S_22⁻¹                   −S_22⁻¹ A_12 A_22⁻¹
        A_12′  A_22 ]       [ −A_22⁻¹ A_12′ S_22⁻¹    [I + A_22⁻¹ A_12′ S_22⁻¹ A_12] A_22⁻¹ ]
if A_22⁻¹ exists.
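The partitioned inverse via the Schur complement can be verified against a direct inversion. A minimal sketch (the random symmetric positive-definite test matrix is illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5 * np.eye(5)          # symmetric positive definite
A11, A12, A22 = A[:3, :3], A[:3, 3:], A[3:, 3:]

S11 = A22 - A12.T @ np.linalg.inv(A11) @ A12   # Schur complement of A11
iA11, iS11 = np.linalg.inv(A11), np.linalg.inv(S11)

B11 = iA11 + iA11 @ A12 @ iS11 @ A12.T @ iA11
B12 = -iA11 @ A12 @ iS11
Ainv = np.block([[B11, B12], [B12.T, iS11]])
assert np.allclose(Ainv, np.linalg.inv(A))
```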
The formulae S_11 and S_22 were first used by J. Schur (1917). The term “Schur complements” was introduced by E. Haynsworth (1968). A. Albert (1969) replaced the Cayley inverse A⁻¹ by the Moore–Penrose inverse A⁺. For a survey we recommend R. W. Cottle (1974), D. V. Ouellette (1981) and D. Carlson (1986).
Proof:
For the proof of the “inverse partitioned matrix” A⁻¹ (Cayley inverse) of the partitioned matrix A of full rank we apply Gauss elimination (without pivoting).
A A⁻¹ = A⁻¹ A = I

A = [ A_11   A_12
      A_12′  A_22 ],  A_11 = A_11′ ∈ ℝ^{m×m}, A_12 ∈ ℝ^{m×l}, A_22 = A_22′ ∈ ℝ^{l×l}

A⁻¹ = [ B_11   B_12
        B_12′  B_22 ],  B_11 = B_11′ ∈ ℝ^{m×m}, B_12 ∈ ℝ^{m×l}, B_22 = B_22′ ∈ ℝ^{l×l}

A A⁻¹ = A⁻¹ A = I ⇔
A_11 B_11 + A_12 B_12′ = I_m (1)
A_11 B_12 + A_12 B_22 = 0 (2)
A_12′ B_11 + A_22 B_12′ = 0 (3)
A_12′ B_12 + A_22 B_22 = I_l (4).

Case (i): A_11⁻¹ exists

“forward step”
A_11 B_11 + A_12 B_12′ = I_m (first equation: multiply by A_12′ A_11⁻¹)
A_12′ B_11 + A_22 B_12′ = 0 (third equation)
⇒ A_12′ B_11 + A_12′ A_11⁻¹ A_12 B_12′ = A_12′ A_11⁻¹
⇒ (A_22 − A_12′ A_11⁻¹ A_12) B_12′ = −A_12′ A_11⁻¹
⇒ B_12′ = −(A_22 − A_12′ A_11⁻¹ A_12)⁻¹ A_12′ A_11⁻¹ = −S_11⁻¹ A_12′ A_11⁻¹
or B_12 = −A_11⁻¹ A_12 S_11⁻¹.
Note the “Schur complement” S_11 := A_22 − A_12′ A_11⁻¹ A_12.

“backward step”
A_11 B_11 + A_12 B_12′ = I_m
⇒ B_11 = A_11⁻¹ (I_m − A_12 B_12′)
⇒ B_11 = [I_m + A_11⁻¹ A_12 (A_22 − A_12′ A_11⁻¹ A_12)⁻¹ A_12′] A_11⁻¹
⇒ B_11 = A_11⁻¹ + A_11⁻¹ A_12 S_11⁻¹ A_12′ A_11⁻¹,
B_22 = (A_22 − A_12′ A_11⁻¹ A_12)⁻¹
B_22 = S_11⁻¹.

Case (ii): A_22⁻¹ exists

“forward step”
A_11 B_12 + A_12 B_22 = 0 (second equation)
A_12′ B_12 + A_22 B_22 = I_l (fourth equation: multiply by A_12 A_22⁻¹)
⇒ A_12 A_22⁻¹ A_12′ B_12 + A_12 B_22 = A_12 A_22⁻¹
⇒ (A_11 − A_12 A_22⁻¹ A_12′) B_12 = −A_12 A_22⁻¹
⇒ B_12 = −(A_11 − A_12 A_22⁻¹ A_12′)⁻¹ A_12 A_22⁻¹ = −S_22⁻¹ A_12 A_22⁻¹.
Note the “Schur complement” S_22 := A_11 − A_12 A_22⁻¹ A_12′.

“backward step”
A_12′ B_12 + A_22 B_22 = I_l
⇒ B_22 = A_22⁻¹ (I_l − A_12′ B_12)
⇒ B_22 = [I_l + A_22⁻¹ A_12′ S_22⁻¹ A_12] A_22⁻¹
⇒ B_22 = A_22⁻¹ + A_22⁻¹ A_12′ S_22⁻¹ A_12 A_22⁻¹,
B_11 = (A_11 − A_12 A_22⁻¹ A_12′)⁻¹
B_11 = S_22⁻¹.
The representations {B_11, B_12, B_21 = B_12′, B_22} in terms of {A_11, A_12, A_21 = A_12′, A_22} have been derived by T. Banachiewicz (1937). Generalizations are due to T. Ando (1979), R. A. Brualdi and H. Schneider (1963), F. Burns, D. Carlson, E. Haynsworth and T. Markham (1974), D. Carlson (1980), C. D. Meyer (1973), S. K. Mitra (1982), and C. K. Li and R. Mathias (2000).
We leave the proof of the following fact as an exercise.
A := [ A_11  A_12
       A_21  A_22 ].

A⁻¹ = [ A_11  A_12 ]⁻¹ = [ A_11⁻¹ + A_11⁻¹ A_12 S_11⁻¹ A_21 A_11⁻¹   −A_11⁻¹ A_12 S_11⁻¹
        A_21  A_22 ]      [ −S_11⁻¹ A_21 A_11⁻¹                       S_11⁻¹             ],
if A_11⁻¹ exists (S_11 := A_22 − A_21 A_11⁻¹ A_12);

A⁻¹ = [ A_11  A_12 ]⁻¹ = [ S_22⁻¹                  −S_22⁻¹ A_12 A_22⁻¹
        A_21  A_22 ]      [ −A_22⁻¹ A_21 S_22⁻¹     A_22⁻¹ + A_22⁻¹ A_21 S_22⁻¹ A_12 A_22⁻¹ ],
if A_22⁻¹ exists (S_22 := A_11 − A_12 A_22⁻¹ A_21).
(1992). (s10) has been noted by W. J. Duncan (1944) and L. Guttman (1946). The result is directly derived from the identity
(A + CBD)(A + CBD)⁻¹ = I.
Certain results follow directly from their definitions.
Facts (inverses):
(i) (A·B)⁻¹ = B⁻¹·A⁻¹
(ii) (A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹
(iii) A positive definite ⇒ A⁻¹ positive definite
(iv) If (A ⊗ B)⁻¹, (A ∗ B)⁻¹ and (A⁻¹ ∗ B⁻¹) are positive definite, then (A⁻¹ ∗ B⁻¹) − (A ∗ B)⁻¹ is positive semidefinite, as well as (A⁻¹ ∗ A) − I and I − (A⁻¹ ∗ A)⁻¹.
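Facts (i) and (ii) are easy to confirm numerically. A minimal sketch (the invertible test matrices are illustrative, not from the text):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])          # det = 5, invertible
C = np.array([[0.0, 1.0], [2.0, 1.0]])          # det = -2, invertible
B = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 4.0],
              [1.0, 0.0, 1.0]])                 # det = 9, invertible

# (i) (A . C)^-1 = C^-1 . A^-1  (order reverses)
assert np.allclose(np.linalg.inv(A @ C),
                   np.linalg.inv(C) @ np.linalg.inv(A))

# (ii) (A (x) B)^-1 = A^-1 (x) B^-1  (order is preserved)
assert np.allclose(np.linalg.inv(np.kron(A, B)),
                   np.kron(np.linalg.inv(A), np.linalg.inv(B)))
```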
plays a similar role as a second scalar measure. Here the summation extends over all permutations (j_1, …, j_n) of the set of integers (1, …, n); σ(j_1, …, j_n) is the number of transpositions which transform (1, …, n) into (j_1, …, j_n).
Laws (determinant)
(i) |αA| = αⁿ |A| for an arbitrary scalar α
(ii) |A·B| = |A| · |B|
(iii) |A ⊗ B| = |A|^m |B|^n for an arbitrary m × m matrix B
(iv) |A′| = |A|
(v) |A′A|^{1/2} = |A| if A′A is positive definite
(vi) |A⁻¹| = |A|⁻¹ if A⁻¹ exists
(vii) |A| = 0 ⇔ A is singular (A⁻¹ does not exist)
(viii) |A| = 0 if A is idempotent, A ≠ I
(ix) |A| = Π_{i=1}^n a_ii if A is diagonal or a triangular matrix
(x) 0 < |A| ≤ Π_{i=1}^n a_ii = |A ∗ I| if A is positive definite
(xi) |A| · |B| ≤ |A| Π_{i=1}^n b_ii ≤ |A ∗ B| if A and B are positive definite
(xii) A = [ A_11  A_12
            A_21  A_22 ],  A_11 ∈ ℝ^{m_1×m_1}, rk A_11 = m_1; A_22 ∈ ℝ^{m_2×m_2}, rk A_22 = m_2:
|A| = det A_11 · det(A_22 − A_21 A_11⁻¹ A_12) = det A_22 · det(A_11 − A_12 A_22⁻¹ A_21).
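Laws (ii), (iii) and the Hadamard bound (x) can be checked on small matrices. A minimal sketch (the test matrices are illustrative, not from the text):

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])      # n x n = 2 x 2, positive definite
B = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 2.0]])             # m x m = 3 x 3
C = np.array([[1.0, 2.0], [0.0, 1.0]])

dA, dB = np.linalg.det(A), np.linalg.det(B)
# (ii) |A . C| = |A| |C|
assert np.isclose(np.linalg.det(A @ C), dA * np.linalg.det(C))
# (iii) |A (x) B| = |A|^m |B|^n, here m = 3, n = 2
assert np.isclose(np.linalg.det(np.kron(A, B)), dA**3 * dB**2)
# (x) Hadamard: 0 < |A| <= a_11 a_22 = |A * I| for positive definite A
assert 0 < dA <= A[0, 0] * A[1, 1]
```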
In correspondence, the W-weighted vector (semi)norm is
||x||_W := (x′ W x)^{1/2}.
A4 Vector-valued Matrix Forms
A = [a_ij] = [a_ji] = A′ ⇒ vech A := [a_11, …, a_n1, a_22, …, a_n2, …, a_nn]′.
(iii) Let A be a quadratic, antisymmetric matrix, A = −A′, of order O(A) = n × n. Then veck A (“vec-skew”) is the [n(n − 1)/2] × 1 vector which is generated by stacking columnwise those matrix elements which lie below the diagonal.
A = [a_ij] = [−a_ji] = −A′ ⇒ veck A := [a_21, …, a_n1, a_32, …, a_n2, …, a_{n,n−1}]′.
Examples
(i) A = [ a  b  c
          d  e  f ] ⇒ vec A = [a, d, b, e, c, f]′
(ii) A = [ a  b  c
           b  d  e
           c  e  f ] = A′ ⇒ vech A = [a, b, c, d, e, f]′
(iii) A = [ 0  −a  −b  −c
            a   0  −d  −e
            b   d   0  −f
            c   e   f   0 ] = −A′ ⇒ veck A = [a, b, c, d, e, f]′.
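The three vectorization operators are small loops over columns. A minimal sketch (not from the original text), following the columnwise stacking conventions above:

```python
import numpy as np

def vec(A):   # stack all columns
    return A.flatten('F')

def vech(A):  # elements on and below the diagonal, columnwise (A symmetric)
    return np.concatenate([A[j:, j] for j in range(A.shape[0])])

def veck(A):  # elements strictly below the diagonal, columnwise (A antisymmetric)
    return np.concatenate([A[j + 1:, j] for j in range(A.shape[0] - 1)])

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
print(vec(A))            # [1. 4. 2. 5. 3. 6.]
S = np.array([[1.0, 2.0], [2.0, 3.0]])
assert np.allclose(vech(S), [1.0, 2.0, 3.0])
K = np.array([[0.0, -7.0], [7.0, 0.0]])
assert np.allclose(veck(K), [7.0])
```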
Useful identities relating scalar-valued and vector-valued measures of matrices will be reported finally.
A5 Eigenvalues and Eigenvectors
Λ^{1/2} := Diag(λ_1^{1/2}, …, λ_r^{1/2}).
The synthetic relation of the matrix A is
A = G_1 Λ G_1′ = U_1 Λ_1 U_1′.
The pseudoinverse has a peculiar representation if we introduce the matrices U_1, U_1′ and Λ_1.
Definition (pseudoinverse):
If we use the representation of the matrix A of type A = G_1 Λ G_1′ = U_1 Λ_1 U_1′, then
A⁺ := U_1 Λ_1⁻¹ U_1′.
The diagonal matrices on the right side have different formats m × m and m × n.
(iii) The original n × m matrix A can be decomposed according to
U′ A V = [ Λ  0
           0  0 ],  O(U′AV) = n × m,
with the r × r diagonal matrix
Λ := Diag(λ_1, …, λ_r).
At the end we compute the eigenvalues and eigenvectors which relate to the variational problem x′Ax = extr subject to the condition x′x = 1, namely
x′Ax − λ(x′x − 1) = extr over x, λ.
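The singular value decomposition and the resulting pseudoinverse are directly available in NumPy. A minimal sketch (the 3 × 2 example matrix is illustrative, not from the text):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 2.0]])            # n x m = 3 x 2, rank r = 2

U, s, Vt = np.linalg.svd(A)           # U' A V has Diag(s) in its upper block
print(s)                              # singular values [2. 1.]

Ap = np.linalg.pinv(A)                # pseudoinverse computed via the SVD
assert np.allclose(A @ Ap @ A, A)     # reproducing property
assert np.allclose(Ap @ A @ Ap, Ap)   # reflexivity
```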
A6 Generalized Inverses
Because the inversion by Cayley inversion is only possible for quadratic nonsingular matrices, we introduce a slightly more general definition in order to invert arbitrary matrices A of the order O(A) = n × m by so-called generalized inverses, or g-inverses for short.
An m × n matrix G is called a g-inverse of the matrix A if it fulfils the equation
A G A = A
in the sense of Cayley multiplication. Such g-inverses always exist; they are unique if and only if A is a nonsingular quadratic matrix. In this case
G = A⁻¹ if A is invertible;
in other cases we use the notation
G = A⁻ if A⁻¹ does not exist.
For the rank of all g-inverses the inequality
r := rk A ≤ rk A⁻ ≤ min{n, m}
holds. In reverse, for any number d in this interval there exists a g-inverse A⁻ such that
d = rk A⁻ = dim R(A⁻).
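For a singular matrix the defining equation AGA = A can be checked with the pseudoinverse, which is one particular g-inverse. A minimal sketch (not from the original text):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])            # singular: rk A = 1

G = np.linalg.pinv(A)                 # the pseudoinverse is one g-inverse
assert np.allclose(A @ G @ A, A)      # A G A = A
assert (np.linalg.matrix_rank(A)
        <= np.linalg.matrix_rank(G)
        <= min(A.shape))              # rk A <= rk A^- <= min{n, m}
```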
The g-inverses (A′A)⁻ of A′A and (AA′)⁻ of AA′ can be used to generate special g-inverses of A. For instance,
A_ℓ⁻ := (A′A)⁻ A′ and A_m⁻ := A′ (AA′)⁻
have the special reproducing properties
A (A′A)⁻ A′ A = A A_ℓ⁻ A = A
and
A A′ (AA′)⁻ A = A A_m⁻ A = A,
which can be generalized, in case W and S are positive semidefinite matrices, to
W A (A′WA)⁻ A′ W A = W A
A S A′ (ASA′)⁻ A S = A S,
where the matrices
W A (A′WA)⁻ A′ W and S A′ (ASA′)⁻ A S
are independent of the choice of the g-inverses (A′WA)⁻ and (ASA′)⁻.
A beautiful interpretation of the various g-inverses is based on the fact that the matrices
(A A⁻)(A A⁻) = (A A⁻ A) A⁻ = A A⁻ and (A⁻ A)(A⁻ A) = A⁻ (A A⁻ A) = A⁻ A
are idempotent and can therefore be geometrically interpreted as projections. The image of A A⁻, namely
R(A A⁻) = R(A) = {Ax | x ∈ ℝ^m} ⊂ ℝⁿ,
independent of the special rank of the g-inverses A⁻, which are determined by the subspaces R(A⁻A) and R(AA⁻), respectively:
R(AA⁻) with its complement N(AA⁻) in ℝⁿ, and R(A⁻A) with its complement N(A⁻A) in ℝᵐ.
if both the g-inverses (AA′)⁻ and (A′A)⁻ exist. The Moore–Penrose inverse fulfils the Penrose equations:
(i) A A⁺ A = A (g-inverse)
(ii) A⁺ A A⁺ = A⁺ (reflexivity)
(iii) A A⁺ = (A A⁺)′
(iv) A⁺ A = (A⁺ A)′
(symmetry due to orthogonal projection).
[ A    E_G′ ]⁻¹ = [ A⁺                 E_F′(E_F E_F′)⁻¹ ]
[ E_F  0    ]      [ (E_G E_G′)⁻¹ E_G   0               ]
with the pseudoinverse A⁺ on the upper left side. Details can be derived from A. Ben-Israel and T. N. E. Greville (1974, p. 228).
If the null-space bases are normalized in the sense of
E_F E_F′ = I_{m−r}, E_G E_G′ = I_{n−r},
then, because of
E_F′(E_F E_F′)⁻¹ = E_F′
and
(E_G E_G′)⁻¹ E_G = E_G,
[ A    E_G′ ]⁻¹ = [ A⁺   E_F′ ]
[ E_F  0    ]      [ E_G  0   ].
These formulas gain a special structure if the matrix A is symmetric of order O(A) = m × m. In this case
E_G = E_F =: E, O(E) = (m − r) × m, rk E = m − r,
and
[ A  E′ ]⁻¹ = [ A⁺           E′(E E′)⁻¹ ]
[ E  0  ]      [ (E E′)⁻¹ E   0          ].
On the basis of the relation E A = 0 there follows
I_m − A A⁺ = E′(E E′)⁻¹ E
(A + E′E)[A⁺ + E′(E E′ E E′)⁻¹ E] = I_m
and with the projection (S-transformation)
A⁺ A = A A⁺ = I_m − E′(E E′)⁻¹ E = (A + E′E)⁻¹ A
and
A⁺ = (A + E′E)⁻¹ − E′(E E′ E E′)⁻¹ E
(pseudoinverse of A),
R(A⁺A) = R(AA⁺) = R(A⁺) = R(A) = N(E).
R(A_rs⁻ A) = R(A A_rs⁻).
For the special case of a symmetric and positive semidefinite m × m matrix A the matrix sets U and V are reduced to one. Based on the matrix decomposition
A = [U_1, U_2] [ L  0
                 0  0 ] [ U_1′
                          U_2′ ] = U_1 L U_1′,
we find the different g-inverses listed as follows.
(i) g-inverse
A⁻ = [U_1, U_2] [ L⁻¹   L_12
                  L_21  L_22 ] [ U_1′
                                 U_2′ ] (L_12, L_21, L_22 arbitrary)
(ii) reflexive g-inverse
A_r⁻ = [U_1, U_2] [ L⁻¹   L_12
                    L_21  L_21 L L_12 ] [ U_1′
                                          U_2′ ]
(iii) reflexive and symmetric g-inverse
A_rs⁻ = [U_1, U_2] [ L⁻¹    L_12
                     L_12′  L_12′ L L_12 ] [ U_1′
                                             U_2′ ]
(iv) pseudoinverse
A⁺ = [U_1, U_2] [ L⁻¹  0
                  0    0 ] [ U_1′
                             U_2′ ] = U_1 L⁻¹ U_1′,
A⁺ = (A + U_2 U_2′)⁻¹ − U_2 U_2′.
The main target of our discussion of various g-inverses is the easy handling of representations of solutions of arbitrary linear equations and their characterizations.
We depart from the solution of a consistent system of linear equations:
A x = c, O(A) = n × m, c ∈ R(A) ⇔ x = A⁻ c for any g-inverse A⁻.
x = A⁻ c is the general solution of such a linear system of equations as A⁻ varies over all g-inverses. If we use one special g-inverse, we can represent the general solution by
x = A⁻ c + (I_m − A⁻ A) z for all z ∈ ℝᵐ,
since the subspaces N(A) and R(I_m − A⁻ A) are identical. We test the consistency of our system by means of the identity
A A⁻ c = c:
c is mapped by the projection A A⁻ onto itself.
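A minimal numerical sketch of the consistency test and the general solution (the rank-one system is illustrative, not from the text), using the pseudoinverse as the special g-inverse:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])        # rk A = 1
c = np.array([6.0, 12.0])              # consistent: c in R(A)

Am = np.linalg.pinv(A)                 # one particular g-inverse
assert np.allclose(A @ Am @ c, c)      # consistency test A A^- c = c

# general solution x = A^- c + (I - A^- A) z, any z
z = np.array([1.0, -1.0, 2.0])
x = Am @ c + (np.eye(3) - Am @ A) @ z
assert np.allclose(A @ x, c)
```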
Similarly we solve the matrix equation A X B = C by the consistency test: the existence of the solution is granted by the identity
A A⁻ C B⁻ B = C,
and the general solution reads
X = A⁻ C B⁻ + Z − A⁻ A Z B B⁻,
where Z is an arbitrary matrix of suitable order. We can use arbitrary g-inverses A⁻ and B⁻, for instance the pseudoinverses A⁺ and B⁺; for Z = 0 the solution then coincides with two-sided orthogonal projections.
How can we reduce the matrix equation A X B = C to a vector equation? The vec-operator is the door opener:
A X B = C ⇔ (B′ ⊗ A) vec X = vec C.
The general solution of our matrix equation reads
vec X = (B′ ⊗ A)⁻ vec C + [I − (B′ ⊗ A)⁻ (B′ ⊗ A)] vec Z.
Here we can use the identity
(A ⊗ B)⁻ = A⁻ ⊗ B⁻,
generated by two g-inverses of the Kronecker–Zehfuss product.
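The vec identity reduces AXB = C to an ordinary linear system. A minimal sketch (the invertible 2 × 2 factors are illustrative, not from the text), with column-major flattening playing the role of vec:

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[1.0, 2.0], [0.0, 1.0]])   # det = 1
B = np.array([[3.0, 0.0], [1.0, 2.0]])   # det = 6
X_true = rng.normal(size=(2, 2))
C = A @ X_true @ B

# AXB = C  <=>  (B' (x) A) vec X = vec C
vecX = np.linalg.solve(np.kron(B.T, A), C.flatten('F'))
assert np.allclose(vecX, X_true.flatten('F'))
```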
At the end we solve the more general equation A x = B y of consistent type R(B) ⊂ R(A).
Lemma (consistent system of homogeneous equations A x = B y):
Given the homogeneous system of linear equations A x = B y for y constrained by B y ∈ R(A). Then the solution x = L y can be given under the condition
R(A) ⊃ R(B).
In this case the matrix L may be decomposed as
L = A⁻ B for a certain g-inverse A⁻.
Appendix B: Matrix Analysis
A short version of matrix analysis is presented. Arbitrary derivatives of scalar-valued, vector-valued and matrix-valued vector and matrix functions of functionally independent variables are defined. Extensions for differentiating symmetric and antisymmetric matrices are given. Special examples for functionally dependent matrix variables are reviewed.
B1 Derivatives of Scalar-valued and Vector-valued Vector Functions
Here we present the analysis of differentiating scalar-valued and vector-valued
vector functions enriched by examples.
Definition (derivative of a scalar-valued vector function):
Let a scalar-valued function f(x) of a vector x with O(x′) = 1 × m (row vector) be given; then we call
D f(x) = [D_1 f(x), …, D_m f(x)] =: ∂f/∂x′
the first derivative of f(x) with respect to x.
J = [∂f_ij/∂x_kl] ∈ ℝ^{n×q×m×p}
is four-dimensional and automatically outside the usual frame of matrix algebra of two-dimensional arrays. By means of the operations vec F and vec X we will vectorize the matrices F and X. Accordingly we will take advantage of the derivative of the vector vec F(X) with respect to (vec X)′, which yields the matrix J_F, a two-dimensional array.
B2 Derivatives of Trace Forms
Examples
(i) f(x) = x′Ax = a_11 x_1² + (a_12 + a_21) x_1 x_2 + a_22 x_2²
D f(x) = [D_1 f(x), D_2 f(x)] = ∂f/∂x′
= [2 a_11 x_1 + (a_12 + a_21) x_2 | (a_12 + a_21) x_1 + 2 a_22 x_2] = x′(A + A′)
(ii) f(x) = A x = [ a_11 x_1 + a_12 x_2
                    a_21 x_1 + a_22 x_2 ]
J_F = D f(x) = ∂f/∂x′ = A = [ a_11  a_12
                              a_21  a_22 ],
O(J_F) = 2 × 2.
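The closed-form derivative x′(A + A′) can be confirmed against central finite differences. A minimal sketch (the matrix and evaluation point are illustrative, not from the text):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
x = np.array([0.5, -1.0])

f = lambda v: v @ A @ v                    # f(x) = x'Ax
grad_formula = x @ (A + A.T)               # Df(x) = x'(A + A')

eps = 1e-6                                 # central finite differences
grad_num = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                     for e in np.eye(2)])
assert np.allclose(grad_formula, grad_num, atol=1e-5)
```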
∂ tr(AX)/∂(vech X)′ = [vech(A + A′ − Diag(a_11, …, a_nn))]′, if the n × n matrix X is symmetric;
∂ tr(AX)/∂(veck X)′ = [veck(A′ − A)]′, if the n × n matrix X is antisymmetric.
For instance
A = [ a_11  a_12
      a_21  a_22 ],  X = [ x_11  x_12
                           x_21  x_22 ].
Case #1: “the matrix X consists of functionally independent elements”
tr(AX) = a_11 x_11 + a_12 x_21 + a_21 x_12 + a_22 x_22,
∂ tr(AX)/∂X = [ a_11  a_21
                a_12  a_22 ] = A′.
Case #2: “the n × n matrix X is symmetric: X = X′”
tr(AX) = a_11 x_11 + (a_12 + a_21) x_21 + a_22 x_22, dx_12 = dx_21,
X = [ x_11  x_21
      x_21  x_22 ],
∂ tr(AX)/∂X = [ a_11          a_12 + a_21
                a_12 + a_21   a_22        ] = A + A′ − Diag(a_11, …, a_nn).
Case #3: “the n × n matrix X is antisymmetric: X = −X′”
∂ tr(AX)/∂X = [ 0             a_21 − a_12
                a_12 − a_21   0           ] = A′ − A.
Let us now assume that the matrix X of variables x_ij always consists of functionally independent elements. We note some useful identities of first derivatives.
Scalar-valued functions of vectors:
∂(a′x)/∂x′ = a′ (B1)
∂(x′Ax)/∂x′ = x′(A + A′). (B2)
∂ tr X/∂X = I, if X is quadratic; (B7)
especially:
∂ tr X/∂(vec X)′ = (vec I)′.
B4 Derivatives of a Vector/Matrix Function of a Vector/Matrix
∂|AXBXC|/∂X = B′X′A′ adj(AXBXC)′ C′ + A′ adj(AXBXC)′ C′X′B′; (B11)
∂|X′BX|/∂X = BX adj(X′BX) + B′X adj(X′BX)′;
especially:
∂|X|²/∂(vec X)′ = (vec[X adj(X²) + adj(X²) X])′.
B6 Matrix-valued Derivatives of Symmetric or Antisymmetric Matrix Functions
∂(vec X′X)/∂(vec X)′ = (I_{p²} + K_{pp})(I_p ⊗ X′) for all X ∈ ℝ^{m×p}
∂(vec X⁻¹)/∂(vec X)′ = −(X⁻¹)′ ⊗ X⁻¹ if X is non-singular
∂(vec Xⁿ)/∂(vec X)′ = Σ_{j=1}^{n} (X′)^{n−j} ⊗ X^{j−1} for all n ∈ ℕ, if X is a square matrix,
as the Kronecker–Zehfuss product of variables X and Y is well defined. Then the identities of the first differential and the first derivative follow:
∂ vec(X ⊗ Y)/∂(vec X)′ = (I_p ⊗ K_{qm} ⊗ I_n)(I_{mp} ⊗ vec Y),
∂ vec(X ⊗ Y)/∂(vec Y)′ = (I_p ⊗ K_{qm} ⊗ I_n)(vec X ⊗ I_{nq}).
tr(AX) = (a_12 − a_21) x_21
∂ tr(AX)/∂(veck X)′ = ∂ tr(AX)/∂x_21 = a_12 − a_21,
∂ tr(AX)/∂(veck X)′ = [veck(A′ − A)]′ = [veck(∂ tr(AX)/∂X)]′.
B7 Higher order derivatives
The definition of second derivatives of matrix functions can be motivated as follows. The matrices F = [f_ij] ∈ ℝ^{n×q} and X = [x_kl] ∈ ℝ^{m×p} are elements of two-dimensional arrays. In contrast, the array of second derivatives
[∂²f_ij/(∂x_kl ∂x_pq)] ∈ ℝ^{n×q×m×p×m×p}
is six-dimensional and beyond the common matrix algebra of two-dimensional arrays. The following operations map a six-dimensional array of second derivatives to a two-dimensional array.
(i) vec F(X) is the vectorized form of the matrix-valued function;
(ii) vec X is the vectorized form of the variable matrix;
(iii) the Kronecker–Zehfuss product
∂/∂(vec X)′ ⊗ ∂/∂(vec X)′
vectorizes the matrix of second derivatives;
(iv) the formal product of the 1 × m²p² row vector of second derivatives with the nq × 1 column vector vec F(X) leads to an nq × m²p² Hesse matrix of second derivatives.
Again we assume that the vector of variables x and the matrix of variables X consist of functionally independent elements. If this is not the case we must, according to the chain rule, apply an alternative differential calculus similar to the first derivative, with case studies of symmetric and antisymmetric variable matrices.
Examples:
(i) f(x) = x′Ax = a_11 x_1² + (a_12 + a_21) x_1 x_2 + a_22 x_2²
D f(x) = ∂f/∂x′ = [2 a_11 x_1 + (a_12 + a_21) x_2 | (a_12 + a_21) x_1 + 2 a_22 x_2]
(ii) f(x) = A x = [ a_11 x_1 + a_12 x_2
                    a_21 x_1 + a_22 x_2 ]
D f(x) = ∂f/∂x′ = A = [ a_11  a_12
                        a_21  a_22 ]
D D f(x) = ∂²f/(∂x′ ⊗ ∂x′) = [ 0 0 0 0
                               0 0 0 0 ],  O(D D f(x)) = 2 × 4.
H_F = ∂/∂(vec X)′ ⊗ ∂/∂(vec X)′ vec F(X) = [∂/∂x_11, ∂/∂x_21, ∂/∂x_12, ∂/∂x_22] ⊗ J_F =
[ 2 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0
  0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0
  0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0
  0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 2 ],
O(H_F) = 4 × 16.
At the end we want to define the derivative of order l of a matrix-valued matrix function, whose structure is derived from the postulate of a suitable array.
Definition (l-th derivative of a matrix-valued matrix function):
Let F(X) be an n × q matrix-valued function of an m × p matrix of functionally independent variables X. The nq × mˡpˡ matrix of the l-th derivative is defined by
Dˡ F(X) := ∂/∂(vec X)′ ⊗ ⋯ ⊗ ∂/∂(vec X)′ (l times) vec F(X) for all l ∈ ℕ.
Appendix C: Lagrange Multipliers
rk(∂F_i/∂x_k) = r ≤ m. (C3)
C1 A first way to solve the problem
The r side conditions

\mathbf{F}(\mathbf{x}_1, \mathbf{x}_2) = \begin{bmatrix} F_1(x_1,\dots,x_{m-r}; x_{m-r+1},\dots,x_m) \\ F_2(x_1,\dots,x_{m-r}; x_{m-r+1},\dots,x_m) \\ \vdots \\ F_{r-1}(x_1,\dots,x_{m-r}; x_{m-r+1},\dots,x_m) \\ F_r(x_1,\dots,x_{m-r}; x_{m-r+1},\dots,x_m) \end{bmatrix} \quad (C6)

form a continuously differentiable function with \mathbf{F}(\mathbf{x}_1, \mathbf{x}_2) = 0. In case of a non-vanishing Jacobi determinant j, or equivalently a Jacobi matrix J of rank r,

j := \det J \neq 0 \ \text{ or } \ \operatorname{rk} J = r, \quad J := \frac{\partial(F_1,\dots,F_r)}{\partial(x_{m-r+1},\dots,x_m)}, \quad (C7)

there exist neighborhoods U := U(\mathbf{x}_1) \subset \mathbb{R}^{m-r} and V := U(\mathbf{x}_2) \subset \mathbb{R}^{r} such that for every \mathbf{x}_1 \in U the equation \mathbf{F}(\mathbf{x}_1, \mathbf{x}_2) = 0 has exactly one solution in V:

\mathbf{x}_2 = \mathbf{G}(\mathbf{x}_1) \quad\text{or}\quad
\begin{aligned}
x_{m-r+1} &= G_1(x_1,\dots,x_{m-r})\\
x_{m-r+2} &= G_2(x_1,\dots,x_{m-r})\\
&\ \vdots\\
x_{m-1} &= G_{r-1}(x_1,\dots,x_{m-r})\\
x_m &= G_r(x_1,\dots,x_{m-r}).
\end{aligned} \quad (C8)
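The local solvability guaranteed by (C7)-(C8) can be made concrete with the side conditions of Example C1 below. The sketch assumes sympy (not part of the text) and solves F1 = 0 and F2 = 0 explicitly for x2 = (y, z) in terms of x1 = x.

```python
import sympy as sp

# hedged sketch (sympy assumed): solve the side conditions of Example C1,
# F1 = x^2 + 2y^2 - 1 = 0 and F2 = 3x - 4z = 0, locally for (y, z) = G(x)
x, y, z = sp.symbols('x y z', real=True)
F1 = x**2 + 2*y**2 - 1   # elliptic cylinder
F2 = 3*x - 4*z           # plane

y_branches = sp.solve(F1, y)    # the two local branches y = +-G1(x)
z_branch = sp.solve(F2, z)[0]   # z = G2(x) = 3x/4

print(y_branches, z_branch)
```

The two branches for y mirror the sign ambiguity of the square root in (C8): the implicit function theorem is local, so each branch is a valid G on its own domain.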
:Example C1:
J\left(\frac{\partial F_i}{\partial x_j}\right) = \begin{bmatrix} 2x & 4y & 0 \\ 3 & 0 & -4 \end{bmatrix}, \quad \operatorname{rk} J\ (x \neq 0 \text{ or } y \neq 0) = r = 2

F_1(x_1, x_2, x_3) = Z(x,y,z) = 0 \;\Rightarrow\; y_{1,2} = \pm\tfrac{1}{2}\sqrt{2}\,\sqrt{1-x^2}

F_2(x_1, x_2, x_3) = E(x,y,z) = 0 \;\Rightarrow\; z = \tfrac{3}{4}x

f_1(x_1, x_2, x_3) = f_1(x,y,z) = f\!\left(x,\ +\tfrac{1}{2}\sqrt{2}\,\sqrt{1-x^2},\ \tfrac{3}{4}x\right)

f_2(x_1, x_2, x_3) = f_2(x,y,z) = f\!\left(x,\ -\tfrac{1}{2}\sqrt{2}\,\sqrt{1-x^2},\ \tfrac{3}{4}x\right)
f_1'(x) = \tfrac{7}{4} - \tfrac{1}{2}\sqrt{2}\,\frac{x}{\sqrt{1-x^2}} = 0 \;\Rightarrow\; x_1 = \tfrac{1}{3}

f_2'(x) = \tfrac{7}{4} + \tfrac{1}{2}\sqrt{2}\,\frac{x}{\sqrt{1-x^2}} = 0 \;\Rightarrow\; x_2 = -\tfrac{1}{3}

f_1(\tfrac{1}{3}) = \tfrac{5}{4} \ (\text{maximum}), \quad f_2(-\tfrac{1}{3}) = -\tfrac{5}{4} \ (\text{minimum}).

At the position x = -1/3, y = -2/3, z = -1/4 we find a global minimum, while at the position x = 1/3, y = 2/3, z = 1/4 we find a global maximum.
F_1(x_1, x_2, x_3) = Z(x,y,z) = x^2 + 2y^2 - 1 = 0
F_2(x_1, x_2, x_3) = E(x,y,z) = 3x - 4z = 0

representing an elliptic cylinder and a plane. In this case the (m-r)-dimensional surface F is the intersection manifold of the elliptic cylinder and the plane, namely the m-r = 1 dimensional manifold in \mathbb{R}^3, a "spatial curve". Secondly, the risk function f(x_1,\dots,x_m) = extr generates an (m-1)-dimensional surface f, which is a special level surface. The level parameter of the (m-1)-dimensional surface f should be extremal. In our Example C1 the risk function can be interpreted as the plane

f(x_1, x_2, x_3) = f(x,y,z) = x + y + z.
We summarize our result in Lemma C2:

\operatorname{grad} f(p) = \sum_{i=1}^{r} \lambda_i \operatorname{grad} F_i(p).
:proof:
With permission from the author: Extract from "Erik W. Grafarend: Linear and
nonlinear models. Fixed effects, random effects, and mixed models"
C1 A first way to solve the problem 537
t_k(p) := \left.\frac{\partial \mathbf{x}}{\partial x_k}\right|_p \in T_p F \quad (k = 1,\dots, m-r).
:Example C2:
Let the m - r = 2 dimensional level surface F of the sphere \mathbb{S}^2_r \subset \mathbb{R}^3 of radius r ("level parameter r^2") be given by the side condition

F(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 - r^2 = 0.
:Normal space:

n(p) = \operatorname{grad} F(p) = \frac{\partial F}{\partial x_1}\mathbf{e}_1 + \frac{\partial F}{\partial x_2}\mathbf{e}_2 + \frac{\partial F}{\partial x_3}\mathbf{e}_3 = [\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3]\begin{bmatrix} 2x_1 \\ 2x_2 \\ 2x_3 \end{bmatrix}_p.

The orthogonal vectors [\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3] span \mathbb{R}^3. The normal space is generated by n(p). From

x_1^2 + x_2^2 + x_3^2 - r^2 = 0 \quad\text{and}\quad \left(\frac{\partial F}{\partial x_j}\right) = [2x_1\ \ 2x_2\ \ 2x_3], \quad \operatorname{rk}\left(\frac{\partial F}{\partial x_j}\right) = 1

we obtain

x_3 = G(x_1, x_2) = +\sqrt{r^2 - (x_1^2 + x_2^2)}.

The negative root leads into another domain of the sphere; here holds the domain 0 \le x_1 \le r, 0 \le x_2 \le r, r^2 - (x_1^2 + x_2^2) \ge 0. Hence

\mathbf{x}(p) = \mathbf{e}_1 x_1 + \mathbf{e}_2 x_2 + \mathbf{e}_3\sqrt{r^2 - (x_1^2 + x_2^2)},
t_1(p) = \frac{\partial \mathbf{x}}{\partial x_1}(p) = \mathbf{e}_1 - \mathbf{e}_3\frac{x_1}{\sqrt{r^2-(x_1^2+x_2^2)}} = [\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3]\begin{bmatrix} 1 \\ 0 \\ -\dfrac{x_1}{\sqrt{r^2-(x_1^2+x_2^2)}} \end{bmatrix}

t_2(p) = \frac{\partial \mathbf{x}}{\partial x_2}(p) = \mathbf{e}_2 - \mathbf{e}_3\frac{x_2}{\sqrt{r^2-(x_1^2+x_2^2)}} = [\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3]\begin{bmatrix} 0 \\ 1 \\ -\dfrac{x_2}{\sqrt{r^2-(x_1^2+x_2^2)}} \end{bmatrix},
F_r(x_1,\dots,x_m) = 0,

where the Lagrange multipliers \lambda_i are the coordinates of the vector \operatorname{grad} f(p) in the basis \operatorname{grad} F_i(p).
:Example C3:
Let us assume that a point X \in \mathbb{R}^3 is given. Unknown is the point on the m - r = 2 dimensional level surface F of type sphere \mathbb{S}^2_r \subset \mathbb{R}^3 which is at extremal distance, either minimal or maximal, from the point X. The distance function \|X - x\|^2 for X \in \mathbb{R}^3 and x \in \mathbb{S}^2_r describes the risk function

f(x_1, x_2, x_3) = (X_1 - x_1)^2 + (X_2 - x_2)^2 + (X_3 - x_3)^2 = \operatorname*{extr}_{x_1, x_2, x_3},

and

n(p) := \operatorname{grad} F(p) = [\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3]\begin{bmatrix} 2x_1 \\ 2x_2 \\ 2x_3 \end{bmatrix}

is an element of the normal space N_p F. The normal equation reads

\operatorname{grad} f(p) = \lambda \operatorname{grad} F(p),

in general

\operatorname{grad} f(p) = \sum_{i=1}^{r} \lambda_i \operatorname{grad} F_i(p).
\mathcal{L}(x_1,\dots,x_m; \lambda_1,\dots,\lambda_r) = f(x_1,\dots,x_m) + \sum_{i=1}^{r}\lambda_i F_i(x_1,\dots,x_m) = \operatorname*{extr}_{x_1,\dots,x_m;\ \lambda_1,\dots,\lambda_r}

\frac{\partial\mathcal{L}}{\partial x_j} = \frac{\partial f}{\partial x_j} + \sum_{i=1}^{r}\lambda_i\frac{\partial F_i}{\partial x_j} = 0 \quad (j = 1,\dots,m)

\frac{\partial\mathcal{L}}{\partial \lambda_i} = F_i(x_j) = 0 \quad (i = 1,\dots,r).
:Example C4:
We continue our third example by solving the alternative system of equations.
\mathcal{L}(x_1, x_2, x_3; \lambda) = (X_1 - x_1)^2 + (X_2 - x_2)^2 + (X_3 - x_3)^2 + \lambda(x_1^2 + x_2^2 + x_3^2 - r^2) = \operatorname*{extr}_{x_1, x_2, x_3;\ \lambda}

\frac{\partial\mathcal{L}}{\partial x_j} = -2(X_j - x_j) + 2\lambda x_j = 0

\frac{\partial\mathcal{L}}{\partial\lambda} = x_1^2 + x_2^2 + x_3^2 - r^2 = 0

x_1 = \frac{X_1}{1+\lambda}; \quad x_2 = \frac{X_2}{1+\lambda}; \quad x_3 = \frac{X_3}{1+\lambda}

x_1^2 + x_2^2 + x_3^2 - r^2 = 0 \;\Rightarrow\; \frac{X_1^2 + X_2^2 + X_3^2}{(1+\lambda)^2} - r^2 = 0 \;\Rightarrow\; (1+\lambda)^2 r^2 - (X_1^2 + X_2^2 + X_3^2) = 0

(1+\lambda)^2 = \frac{X_1^2 + X_2^2 + X_3^2}{r^2} \;\Rightarrow\; (1+\lambda)_{1,2} = \pm\frac{1}{r}\sqrt{X_1^2 + X_2^2 + X_3^2}

\lambda_{1,2} = -1 \pm \frac{1}{r}\sqrt{X_1^2 + X_2^2 + X_3^2}

(x_1)_{1,2} = \pm\frac{rX_1}{\sqrt{X_1^2 + X_2^2 + X_3^2}}, \quad (x_2)_{1,2} = \pm\frac{rX_2}{\sqrt{X_1^2 + X_2^2 + X_3^2}}, \quad (x_3)_{1,2} = \pm\frac{rX_3}{\sqrt{X_1^2 + X_2^2 + X_3^2}}.
H(\lambda) = \left(\frac{\partial^2\mathcal{L}}{\partial x_j \partial x_k}\right) = \left(\delta_{jk}\,2(1+\lambda)\right) = 2(1+\lambda)\,\mathbf{I}_3

H(1+\lambda_1 > 0) > 0 \ (\text{minimum}), \qquad H(1+\lambda_2 < 0) < 0 \ (\text{maximum}):

(x_1, x_2, x_3)_1 is the point of minimum, (x_1, x_2, x_3)_2 is the point of maximum.
Our example illustrates how we can find the global optimum under side condi-
tions by means of the technique of Lagrange multipliers.
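The closed-form result of Example C4 lends itself to a numerical cross-check. The following sketch assumes numpy and an arbitrary sample point X (both assumptions for illustration): the extremal points on the sphere are x = ±rX/‖X‖, and stationarity requires grad f to be parallel to grad F.

```python
import numpy as np

# hedged numerical check of Example C4 (numpy and the sample data are
# assumptions): extremal points on the sphere of radius r are +- r*X/||X||
X = np.array([1.0, 2.0, 2.0])    # given point, ||X|| = 3
r = 2.0                          # sphere radius

norm = np.linalg.norm(X)
x_min = r * X / norm             # 1 + lambda_1 = ||X||/r > 0 -> minimum
x_max = -r * X / norm            # 1 + lambda_2 = -||X||/r < 0 -> maximum

# stationarity: grad f = -2(X - x) must be parallel to grad F = 2x
for p in (x_min, x_max):
    assert np.allclose(np.cross(X - p, p), 0.0)

# the minimum point is indeed the nearer of the two
assert np.linalg.norm(X - x_min) < np.linalg.norm(X - x_max)
print(x_min)
```

For this sample, ‖X‖ = 3 and r = 2, so the nearest point is (2/3)X and its distance to X is ‖X‖ − r = 1.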
:Example C5:
Search for the global extremum of the function f(x_1, x_2, x_3) subject to two side conditions F_1(x_1, x_2, x_3) = 0 and F_2(x_1, x_2, x_3) = 0, namely

f(x_1, x_2, x_3) = f(x,y,z) = x + y + z \quad (\text{plane})

F_1(x_1, x_2, x_3) = Z(x,y,z) := x^2 + 2y^2 - 1 = 0 \quad (\text{elliptic cylinder})

F_2(x_1, x_2, x_3) = E(x,y,z) := 3x - 4z = 0 \quad (\text{plane})

J\left(\frac{\partial F_i}{\partial x_j}\right) = \begin{bmatrix} 2x & 4y & 0 \\ 3 & 0 & -4 \end{bmatrix}, \quad \operatorname{rk} J\ (x \neq 0 \text{ or } y \neq 0) = r = 2.
:Variational Problem:
\mathcal{L}(x_1, x_2, x_3; \lambda_1, \lambda_2) = \mathcal{L}(x,y,z; \lambda, \mu) = x + y + z + \lambda(x^2 + 2y^2 - 1) + \mu(3x - 4z) = \operatorname*{extr}_{x_1, x_2, x_3;\ \lambda, \mu}

\frac{\partial\mathcal{L}}{\partial x} = 1 + 2\lambda x + 3\mu = 0

\frac{\partial\mathcal{L}}{\partial y} = 1 + 4\lambda y = 0 \;\Rightarrow\; \lambda = -\frac{1}{4y}

\frac{\partial\mathcal{L}}{\partial z} = 1 - 4\mu = 0 \;\Rightarrow\; \mu = \frac{1}{4}

x^2 + 2y^2 - 1 = 0

3x - 4z = 0.
We multiply the first equation \partial\mathcal{L}/\partial x by 4y, the second equation \partial\mathcal{L}/\partial y by (-2x), the third equation \partial\mathcal{L}/\partial z by 3y, and add:

4y + 8\lambda xy + 12\mu y - 2x - 8\lambda xy + 3y - 12\mu y = 0 \;\Rightarrow\; 7y - 2x = 0.
H(\lambda) = \left(\frac{\partial^2\mathcal{L}}{\partial x_j \partial x_k}\right) = \begin{bmatrix} 2\lambda & 0 & 0 \\ 0 & 4\lambda & 0 \\ 0 & 0 & 0 \end{bmatrix}

H(\lambda_1 = -\tfrac{3}{8}) = \begin{bmatrix} -\tfrac{3}{4} & 0 & 0 \\ 0 & -\tfrac{3}{2} & 0 \\ 0 & 0 & 0 \end{bmatrix} \le 0 \ (\text{maximum}), \qquad H(\lambda_2 = \tfrac{3}{8}) = \begin{bmatrix} \tfrac{3}{4} & 0 & 0 \\ 0 & \tfrac{3}{2} & 0 \\ 0 & 0 & 0 \end{bmatrix} \ge 0 \ (\text{minimum})

(x, y, z; \lambda, \mu)_1 = (\tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{4}; -\tfrac{3}{8}, \tfrac{1}{4}) is the restricted maximal solution point; (x, y, z; \lambda, \mu)_2 = (-\tfrac{1}{3}, -\tfrac{2}{3}, -\tfrac{1}{4}; \tfrac{3}{8}, \tfrac{1}{4}) is the restricted minimal solution point.
The geometric interpretation of the Hesse matrix follows from E. Grafarend and
P. Lohle (1991).
The matrix of second derivatives H decides whether at the point (x_1, x_2, x_3; \lambda, \mu)_{1,2} we have a maximum or a minimum.
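The stationarity system and the Hessian of Example C5 can be generated symbolically. The sketch below assumes sympy (not part of the text); it reproduces the three gradient equations and the matrix H = diag(2λ, 4λ, 0) used in the definiteness test above.

```python
import sympy as sp

# hedged sketch (sympy assumed): Lagrangian of Example C5, its gradient
# (the stationarity equations) and its matrix of second derivatives
x, y, z = sp.symbols('x y z', real=True)
lam, mu = sp.symbols('lambda mu', real=True)

f = x + y + z                    # risk function (plane)
F1 = x**2 + 2*y**2 - 1           # elliptic cylinder
F2 = 3*x - 4*z                   # plane
L = f + lam*F1 + mu*F2           # Lagrangian

grad = [sp.diff(L, v) for v in (x, y, z)]   # stationarity equations
H = sp.hessian(L, (x, y, z))                # second-derivative matrix

print(grad)
print(H)
```

Since f, F1 and F2 are at most quadratic, H depends only on λ, which is exactly why its definiteness is governed by the sign of λ at each solution point.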