
AE6207 QUANTITATIVE METHODS

MATRIX ALGEBRA
Trimester 2, 2021 - 2022

Matrix Algebra: Preliminaries


A matrix is a rectangular array with m rows and n columns, denoted as follows:

$$A = (a_{ij})_{m\times n} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

If $A = (a_{ij})_{m\times n}$, $B = (b_{ij})_{m\times n}$ and $\alpha$ is a scalar, we define

1. $A \pm B = (a_{ij} \pm b_{ij})_{m\times n}$
2. $\alpha A = (\alpha a_{ij})_{m\times n}$

If $A = (a_{ij})_{m\times n}$ and $B = (b_{ij})_{n\times p}$, then $C = AB = (c_{ij})_{m\times p}$, where

$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$$

Note that multiplication of two matrices is defined only when the pre-matrix has the same number of columns as the post-matrix has rows.

If A, B and C are matrices whose dimensions are conformable for the following additions and multiplications, then

1. $(AB)C = A(BC)$
2. $A(B + C) = AB + AC$
3. $(A + B)C = AC + BC$

Unlike real numbers, matrices are not commutative, i.e. AB is in general not equal to BA.

Further,

1. $AB = 0$ does not imply that $A = 0$ or $B = 0$
2. $AB = AC$ and $A \neq 0$ do not imply $B = C$

A system of linear equations can be written concisely using matrices:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \tag{1}$$

can be expressed as $Ax = b$, where

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix},\quad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},\quad b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$

A square matrix has an equal number of rows and columns.

If A is square and n is a positive integer, then we define

$$A^n = \underbrace{AA\cdots A}_{n\text{ times}} \tag{2}$$

For diagonal matrices, the nth power is easily obtained:

$$D = \begin{pmatrix} d_{11} & 0 & \cdots & 0 \\ 0 & d_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_{nn} \end{pmatrix} \;\Rightarrow\; D^m = \begin{pmatrix} d_{11}^m & 0 & \cdots & 0 \\ 0 & d_{22}^m & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_{nn}^m \end{pmatrix} \tag{3}$$

The identity matrix of order n (denoted $I_n$ or just $I$) is an $n\times n$ matrix with 1's down the diagonal and zeros everywhere else:

$$I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \tag{4}$$
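As an added illustration (numpy is not part of the original notes), the rules above can be checked numerically; a minimal sketch:

```python
# Check conformable products, associativity, distributivity and the
# failure of commutativity for small matrices.
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[0., 1.], [1., 0.]])
C = np.array([[2., 0.], [0., 3.]])

print(np.allclose((A @ B) @ C, A @ (B @ C)))    # (AB)C = A(BC) -> True
print(np.allclose(A @ (B + C), A @ B + A @ C))  # A(B+C) = AB+AC -> True
print(np.allclose(A @ B, B @ A))                # AB = BA in general? -> False
print(np.allclose(A @ np.eye(2), A))            # AI = A -> True
```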
If A is $m\times n$, then

$$AI_n = A = I_mA \tag{5}$$

A symmetric matrix A such that $AA = A$ is said to be idempotent.

If $A = (a_{ij})_{m\times n}$, the transpose of A, denoted by $A'$, is defined as $A' = (a_{ji})_{n\times m}$.

Transposition follows the following rules:

1. $(A')' = A$
2. $(A + B)' = A' + B'$
3. $(\alpha A)' = \alpha A'$
4. $(AB)' = B'A'$

A square matrix such that $A = A'$ is said to be symmetric.

Determinants and Matrix Inverses

The determinant of an $n\times n$ matrix A, denoted by $|A|$, is defined as

$$|A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} \tag{6}$$

where $A_{ij}$ is referred to as the (ij)th cofactor, defined as $(-1)^{i+j}$ times the determinant of the $(n-1)\times(n-1)$ submatrix obtained by deleting the ith row and jth column of A:

$$A_{ij} = (-1)^{i+j}\begin{vmatrix} a_{11} & \cdots & a_{1,j-1} & a_{1,j+1} & \cdots & a_{1n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{i-1,1} & \cdots & a_{i-1,j-1} & a_{i-1,j+1} & \cdots & a_{i-1,n} \\ a_{i+1,1} & \cdots & a_{i+1,j-1} & a_{i+1,j+1} & \cdots & a_{i+1,n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n,j-1} & a_{n,j+1} & \cdots & a_{nn} \end{vmatrix}$$

By this definition, the determinant of a $2\times 2$ matrix is given by

$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} \tag{7}$$

Equation (6) is referred to as the cofactor expansion of $|A|$ along the ith row.

The following rules for manipulating determinants are often useful:

1. If two rows (or two columns) of A are interchanged, the determinant changes sign but its absolute value remains unchanged.
2. If all the elements in a single row (or column) of A are multiplied by a number c, the determinant is multiplied by c.
3. If two of the rows (or columns) of A are proportional, then $|A| = 0$.
4. The determinant remains unchanged if a multiple of one row (or one column) is added to another row (or column).
5. $|A'| = |A|$
6. $|AB| = |A||B|$, where A and B are square matrices of the same dimension
7. $|A + B| \neq |A| + |B|$ (usually)

The inverse $A^{-1}$ of an $n\times n$ matrix A has the following properties:

$$B = A^{-1} \iff AB = I_n \iff BA = I_n \tag{8}$$

$$A^{-1}\text{ exists} \iff |A| \neq 0 \tag{9}$$

If $A = (a_{ij})_{n\times n}$ and $|A| \neq 0$, the unique inverse of A is given by

$$A^{-1} = \frac{1}{|A|}\,\mathrm{adj}(A) \tag{10}$$

where

$$\mathrm{adj}(A) = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{pmatrix}' \tag{11}$$

and $A_{ij}$ is the (ij)th cofactor of A.

For a $2\times 2$ matrix, this result gives

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \tag{12}$$

provided $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \neq 0$.
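A small numerical check of equations (7), (10) and (12), added here as an illustration with numpy:

```python
# Verify |A|, the adjugate formula A^{-1} = adj(A)/|A|, and AB = I.
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]          # a11*a22 - a12*a21
adj = np.array([[A[1, 1], -A[0, 1]],
                [-A[1, 0], A[0, 0]]])                # transposed cofactors
print(np.isclose(det, np.linalg.det(A)))             # True
print(np.allclose(adj / det, np.linalg.inv(A)))      # True
print(np.allclose(A @ np.linalg.inv(A), np.eye(2)))  # AB = I -> True
```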
The following rules for inverses hold, provided the relevant inverses exist:

1. $(A^{-1})^{-1} = A$
2. $(AB)^{-1} = B^{-1}A^{-1}$
3. $(A')^{-1} = (A^{-1})'$
4. $(\alpha A)^{-1} = \alpha^{-1}A^{-1}$

A linear system of n equations and n unknowns,

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n \end{aligned} \tag{13}$$

has a unique solution if and only if $|A| = |(a_{ij})_{n\times n}| \neq 0$. The solution is then

$$x_j = \frac{|A_j|}{|A|},\quad j = 1, 2, \ldots, n \tag{14}$$

where the determinant

$$|A_j| = \begin{vmatrix} a_{11} & \cdots & a_{1,j-1} & b_1 & a_{1,j+1} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2,j-1} & b_2 & a_{2,j+1} & \cdots & a_{2n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n,j-1} & b_n & a_{n,j+1} & \cdots & a_{nn} \end{vmatrix} \tag{15}$$

is obtained by replacing the jth column of $|A|$ by the column whose components are $b_1, b_2, \ldots, b_n$.

If $b_j = 0$, $j = 1, 2, \ldots, n$ in equation (15), the system can be written as $Ax = 0$, and it is said to be homogeneous.

A homogeneous system of linear equations will always have the trivial solution $x_1 = x_2 = \cdots = x_n = 0$.

A homogeneous system of linear equations will have non-trivial solutions iff $|A| = 0$.

Vectors

A vector is an ordered m-tuple of real numbers. Denote vectors by bold lowercase letters, e.g. a, b, x, y. Bold small letters denote column vectors, e.g.

$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix},\quad a = \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix},\quad b = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix} \tag{16}$$

Row vectors are denoted by $a', b', x', y'$.

The number of elements in a vector is referred to as the order of the vector. A vector of order 1 is a scalar.

The null vector, denoted by $\mathbf{0}$, contains zeros as its elements, and the sum vector, denoted by $\iota$, contains ones as its elements.

Sum of two vectors of the same order:

$$x + y = \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_m + y_m \end{pmatrix} \tag{17}$$

Scalar multiplication of a vector:

$$\alpha x = \begin{pmatrix} \alpha x_1 \\ \vdots \\ \alpha x_m \end{pmatrix} \tag{18}$$

Two vectors, y and x, are collinear if $y = \alpha x$ for some scalar $\alpha \neq 0$.

The inner product, or dot product, of two vectors of the same order:

$$x'y = \sum_{i=1}^{m} x_iy_i \tag{19}$$

The norm of a vector x is

$$\|x\| = \sqrt{x'x} \tag{20}$$

If $\|x\| = 1$, x is said to be a normalized vector.

Cauchy-Schwarz inequality:

$$(x'y)^2 \leq (x'x)(y'y) \tag{21}$$

Proof: for any $x, y \in \mathbb{R}^m$ and any scalar $\alpha$,

$$\|x - \alpha y\|^2 \geq 0 \;\Rightarrow\; (x - \alpha y)'(x - \alpha y) \geq 0 \;\Rightarrow\; x'x - 2\alpha\, x'y + \alpha^2\, y'y \geq 0$$

Set $\alpha = \dfrac{x'y}{y'y}$:

$$x'x - \frac{2(x'y)^2}{y'y} + \frac{(x'y)^2}{(y'y)^2}\,y'y \geq 0 \;\Rightarrow\; x'x - \frac{(x'y)^2}{y'y} \geq 0 \;\Rightarrow\; (x'y)^2 \leq (x'x)(y'y)$$
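An added numpy sketch of Cramer's rule (14) and of the Cauchy-Schwarz inequality (21); in practice np.linalg.solve is the standard way to solve such systems:

```python
# Cramer's rule: replace the j-th column of A by b and take determinants.
import numpy as np

A = np.array([[2., 1.], [1., 3.]])
b = np.array([3., 5.])

x = np.empty(2)
for j in range(2):
    Aj = A.copy()
    Aj[:, j] = b                               # build |A_j| as in (15)
    x[j] = np.linalg.det(Aj) / np.linalg.det(A)

print(np.allclose(x, np.linalg.solve(A, b)))   # True

u, v = np.array([1., 2., 3.]), np.array([4., 0., 1.])
print((u @ v) ** 2 <= (u @ u) * (v @ v))       # Cauchy-Schwarz -> True
```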
Two vectors are orthogonal, denoted by $x \perp y$, iff $x'y = 0$.

In addition, if $\|x\| = \|y\| = 1$, they are said to be orthonormal.

For example, the unit vectors $e_1 = (1, 0, \ldots, 0)'$, $e_2 = (0, 1, \ldots, 0)'$, ..., $e_m = (0, 0, \ldots, 1)'$ in m-dimensional Euclidean space are orthonormal.

A linear combination of the vectors: $\lambda_1x_1 + \lambda_2x_2 + \cdots + \lambda_nx_n$

Linear dependence: $\lambda_1x_1 + \lambda_2x_2 + \cdots + \lambda_nx_n = 0$ for $\lambda_1, \lambda_2, \ldots, \lambda_n$ not all zero.

Vector Space

A vector space, V, is any set of vectors that is closed under addition and scalar multiplication. Further,

1. for all $x \in V$, there exists a vector, denoted by 0, such that $x + 0 = x$;
2. for any $x \in V$, there exists a vector, denoted by $-x$, such that $x + (-x) = 0$.

A linearly independent set of vectors that span a vector space is known as a basis of that vector space.

The maximum number of linearly independent vectors in a vector space is known as the dimension of the vector space.

A subspace is a subset of a vector space and is itself a vector space, e.g. the vector space spanned by the independent vectors $x = (x_1, x_2, 0)'$ and $y = (y_1, y_2, 0)'$ is a subspace of $\mathbb{R}^3$.

Example: Let $\mathcal{A}$ be the subspace of $\mathbb{R}^4$ spanned by the columns of the matrix

$$A = \begin{pmatrix} 1 & 2 & 3 \\ -2 & 3 & 8 \\ 5 & 1 & -3 \\ 3 & 4 & 5 \end{pmatrix}$$

(a) Find a basis of $\mathcal{A}$.
(b) What is the dimension of $\mathcal{A}$?
(c) Extend the basis of $\mathcal{A}$ to a basis in $\mathbb{R}^4$.

Solution:
(a) Let $a_1, a_2, a_3$ denote the columns of A. It is clear that $a_1$ and $a_2$ are independent (otherwise they would be proportional). But $a_1 - 2a_2 + a_3 = 0$, so that $\{a_1, a_2, a_3\}$ are linearly dependent. Hence $\{a_1, a_2\}$ is a basis of $\mathcal{A}$.
(b) $\dim(\mathcal{A}) = 2$
(c) There are infinitely many solutions. We need to find two vectors, say $v_1$ and $v_2$, such that they are not collinear with each other and with $a_1$ and $a_2$. A simple solution is $v_1 = (0, 0, 1, 0)'$ and $v_2 = (0, 0, 0, 1)'$.

Partitioned Matrices and Their Inverses

When dealing with high order matrices, it is sometimes convenient to subdivide the matrices into submatrices; this is called partitioning the matrices.

Consider the $3\times 5$ matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \end{pmatrix} \tag{22}$$

It can be partitioned in a number of ways, for example

$$A = \left(\begin{array}{cc|ccc} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \end{array}\right) = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \tag{23}$$

or

$$A = \left(\begin{array}{c|cccc} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \end{array}\right) = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \tag{24}$$

One can perform standard matrix operations on partitioned matrices, treating the submatrices as if they were ordinary matrix elements. This requires obeying the rules for sums, differences and products. For example,

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \pm \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11} \pm B_{11} & A_{12} \pm B_{12} \\ A_{21} \pm B_{21} & A_{22} \pm B_{22} \end{pmatrix} \tag{25}$$

provided the dimensions of the submatrices are conformable for matrix addition/subtraction.
Further,

$$\alpha\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} \alpha A_{11} & \alpha A_{12} \\ \alpha A_{21} & \alpha A_{22} \end{pmatrix} \tag{26}$$

If A and B are partitioned conformably, then

$$AB = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix} \tag{27}$$

Inverting large square matrices is often made easier using partitioning. Consider an $n\times n$ matrix A which has an inverse and is partitioned as follows:

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \tag{28}$$

where $A_{11}$ is a $k\times k$ submatrix with an inverse. Then it can be shown that

$$A^{-1} = \begin{pmatrix} A_{11}^{-1} + A_{11}^{-1}A_{12}\Delta^{-1}A_{21}A_{11}^{-1} & -A_{11}^{-1}A_{12}\Delta^{-1} \\ -\Delta^{-1}A_{21}A_{11}^{-1} & \Delta^{-1} \end{pmatrix} \tag{29}$$

where $\Delta = A_{22} - A_{21}A_{11}^{-1}A_{12}$.

Similarly, if both $A^{-1}$ and $A_{22}^{-1}$ exist, then

$$A^{-1} = \begin{pmatrix} \tilde\Delta^{-1} & -\tilde\Delta^{-1}A_{12}A_{22}^{-1} \\ -A_{22}^{-1}A_{21}\tilde\Delta^{-1} & A_{22}^{-1} + A_{22}^{-1}A_{21}\tilde\Delta^{-1}A_{12}A_{22}^{-1} \end{pmatrix} \tag{30}$$

where $\tilde\Delta = A_{11} - A_{12}A_{22}^{-1}A_{21}$.

Two useful formulas for the determinant of an $n\times n$ matrix A partitioned as in (28) are:

If $A_{11}^{-1}$ exists, then

$$\begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} = |A_{11}|\,|A_{22} - A_{21}A_{11}^{-1}A_{12}| \tag{31}$$

If $A_{22}^{-1}$ exists, then

$$\begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} = |A_{22}|\,|A_{11} - A_{12}A_{22}^{-1}A_{21}| \tag{32}$$

Example: Compute $A^{-1}$ when

$$A = \left(\begin{array}{cc|ccc} 2 & 3 & 0 & 0 & 0 \\ 3 & 4 & 0 & 0 & 0 \\ \hline 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 5 & 7 & 0 & 0 & 1 \end{array}\right) = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

Solution: Using equation (30), $\tilde\Delta = A_{11}$ (since $A_{12} = 0$), and

$$\tilde\Delta^{-1} = A_{11}^{-1} = \begin{pmatrix} -4 & 3 \\ 3 & -2 \end{pmatrix}$$

Then

$$-A_{22}^{-1}A_{21}\tilde\Delta^{-1} = -I_3\begin{pmatrix} 1 & 1 \\ 1 & 1 \\ 5 & 7 \end{pmatrix}\begin{pmatrix} -4 & 3 \\ 3 & -2 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ 1 & -1 \\ -1 & -1 \end{pmatrix}$$

Hence,

$$A^{-1} = \left(\begin{array}{cc|ccc} -4 & 3 & 0 & 0 & 0 \\ 3 & -2 & 0 & 0 & 0 \\ \hline 1 & -1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 1 & 0 \\ -1 & -1 & 0 & 0 & 1 \end{array}\right)$$

Example: If P and Q are invertible square matrices, prove that

$$\begin{pmatrix} P & R \\ 0 & Q \end{pmatrix}^{-1} = \begin{pmatrix} P^{-1} & -P^{-1}RQ^{-1} \\ 0 & Q^{-1} \end{pmatrix}$$

Solution: Using equation (29) where $A_{11} = P$, $A_{12} = R$, $A_{21} = 0$ and $A_{22} = Q$, we obtain the result.
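The block-inverse example above can be verified numerically; an added numpy sketch using formula (30) with $A_{12} = 0$:

```python
# Verify the partitioned inverse of the 5x5 example against numpy.
import numpy as np

A11 = np.array([[2., 3.], [3., 4.]])
A21 = np.array([[1., 1.], [1., 1.], [5., 7.]])
A22 = np.eye(3)
A = np.block([[A11, np.zeros((2, 3))], [A21, A22]])

Dt = np.linalg.inv(A11)                     # Delta-tilde^{-1} since A12 = 0
A22i = np.linalg.inv(A22)
Ainv = np.block([[Dt, np.zeros((2, 3))],
                 [-A22i @ A21 @ Dt, A22i]]) # equation (30) with A12 = 0
print(np.allclose(Ainv, np.linalg.inv(A)))  # True
print(Dt)                                   # [[-4.  3.] [ 3. -2.]]
```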
Tutorial Exercise No. 1

Question 1
If $x, y \in \mathbb{R}^m$, show that $\|x + y\| \leq \|x\| + \|y\|$.

Question 2
The angle $\theta$ between two non-zero vectors x and y is defined by

$$\cos\theta = \frac{x'y}{\|x\|\,\|y\|},\qquad 0 \leq \theta \leq \pi$$

(a) What is the angle between x and $-x$?
(b) What is the angle between two orthogonal vectors?

Question 3
If x, y and z are linearly independent vectors, are $x + y$, $x + z$ and $y + z$ also linearly independent?

Question 4
Expand the matrix product

$$X = ([AB + (CD)'][(EF)^{-1} + GH])'$$

Assume that all matrices are square and E and F are nonsingular.

Question 5
For the matrix

$$X' = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 4 & 2 & 3 & 5 \end{bmatrix}$$

compute $P = X(X'X)^{-1}X'$ and $M = (I - P)$. Verify that $MP = 0$. Let

$$Q = \begin{bmatrix} 1 & 3 \\ 2 & 8 \end{bmatrix}$$

Compute P and M based on XQ instead of X.

Question 6
Let A be any square matrix whose columns are $[a_1, a_2, \ldots, a_M]$ and let B be any rearrangement of the columns of the $M\times M$ identity matrix. What operation is performed by the multiplication AB? By the multiplication BA?

Question 7
Let $A = \{a_{ij}\}$ represent an $m\times n$ matrix and $B = \{b_{ij}\}$ a $p\times m$ matrix.
(a) Let $x = \{x_i\}$ represent an n-dimensional column vector. Show that the ith element of the p-dimensional column vector BAx is $\sum_{j=1}^{m}\sum_{k=1}^{n} b_{ij}a_{jk}x_k$.
(b) Let $x = \{x_i\}$ represent an n-dimensional column vector and $C = \{c_{ij}\}$ a $q\times p$ matrix. Express the ith element of the q-dimensional column vector CBAx in terms of the elements of A, B, C and x.

Question 8
Show that the product AB of two $n\times n$ symmetric matrices A and B is itself symmetric if and only if A and B commute.

Question 9
Show that $A'$ and $B'$ commute if A and B commute.

Question 10
Let

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}\quad\text{and}\quad X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

where $|A| \neq 0$. Show that

$$|A + XX'| = \begin{vmatrix} 1 & x_1 & \cdots & x_n \\ -x_1 & a_{11} & \cdots & a_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ -x_n & a_{n1} & \cdots & a_{nn} \end{vmatrix} = |A|(1 + X'A^{-1}X)$$

Linear Independence

The n vectors $a_1, a_2, \ldots, a_n$ in $\mathbb{R}^m$ are said to be linearly dependent if there exist numbers $c_1, c_2, \ldots, c_n$, not all zeros, such that

$$c_1a_1 + c_2a_2 + \cdots + c_na_n = 0 \tag{33}$$

If this equation holds only when $c_1 = c_2 = \cdots = c_n = 0$, then the vectors are said to be linearly independent.

So, a linearly dependent set of vectors is such that one vector in the set can be expressed as a linear combination of the other vectors. Conversely, a linearly independent set of vectors is such that no vector in the set can be expressed as a linear combination of the other vectors.

Consider the general system of m equations in n unknowns:

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ &\;\;\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m \end{aligned} \tag{34}$$

which can be expressed in vector form:

$$x_1a_1 + \cdots + x_na_n = b \tag{35}$$

Here, $a_1, \ldots, a_n$ are the column vectors of coefficients, and b is the column vector with components $b_1, \ldots, b_m$.

Suppose (34) has two solutions $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_n)$. Then

$$u_1a_1 + \cdots + u_na_n = b\quad\text{and}\quad v_1a_1 + \cdots + v_na_n = b$$

Subtracting the second equation from the first yields

$$(u_1 - v_1)a_1 + \cdots + (u_n - v_n)a_n = 0,\quad\text{i.e.}\quad c_1a_1 + \cdots + c_na_n = 0$$

where $c_i = u_i - v_i$.
The two solutions are different if and only if not all $c_1, \ldots, c_n$ are zeros. Hence if (34) has more than one solution, then the column vectors $a_1, \ldots, a_n$ are linearly dependent. Equivalently, if the column vectors $a_1, \ldots, a_n$ are linearly independent, then (34) has at most one solution. Without saying more about the right-hand side vector b, however, we cannot know if there are any solutions at all, in general.

For an $n\times n$ matrix A with column vectors $a_1, \ldots, a_n$:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix},\quad\text{where } a_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{nj} \end{pmatrix}, \tag{36}$$

the n column vectors are linearly independent if and only if $|A| \neq 0$.

Example: Suppose a, b and c are three linearly independent vectors in $\mathbb{R}^n$. Are $a - b$, $b - c$ and $a - c$ linearly independent?

Solution: Suppose $c_1(a - b) + c_2(b - c) + c_3(a - c) = 0$. Rearranging, we get $(c_1 + c_3)a + (-c_1 + c_2)b + (-c_2 - c_3)c = 0$. Since a, b and c are linearly independent, $c_1 + c_3 = 0$, $-c_1 + c_2 = 0$ and $-c_2 - c_3 = 0$. These are satisfied (for example) when $c_1 = c_2 = 1$ and $c_3 = -1$, so $a - b$, $b - c$ and $a - c$ are linearly dependent.

The Rank of a Matrix

The rank of a matrix A, denoted by r(A), is the maximum number of linearly independent column vectors in A. If A is the 0 matrix, we assign r(A) = 0.

The rank of a matrix gives very useful information. For example, it determines the existence and multiplicity of solutions to linear systems of equations.

Let A be a square matrix of order n. Because the matrix has n columns, its rank cannot exceed n. Further, since the n column vectors are linearly independent iff $|A| \neq 0$, we can conclude that a square matrix A of order n has rank n iff $|A| \neq 0$.

For any real matrix A, the column and row ranks of A are equal.

The rank of a matrix can be characterized in terms of its nonvanishing minors. In general, a minor of order k in A is obtained by deleting all but k rows and k columns, and then taking the determinant of the resulting $k\times k$ matrix.

Example: Describe all the minors of the matrix

$$A = \begin{pmatrix} 1 & 0 & 2 & 1 \\ 0 & 2 & 4 & 2 \\ 0 & 2 & 2 & 1 \end{pmatrix}$$

Solution: Because there are 3 rows, there are minors of orders 1, 2 and 3. There are

(a) 4 minors of order 3. These are obtained by deleting any one of the 4 columns:

$$\begin{vmatrix} 1 & 0 & 2 \\ 0 & 2 & 4 \\ 0 & 2 & 2 \end{vmatrix};\;\; \begin{vmatrix} 1 & 0 & 1 \\ 0 & 2 & 2 \\ 0 & 2 & 1 \end{vmatrix};\;\; \begin{vmatrix} 1 & 2 & 1 \\ 0 & 4 & 2 \\ 0 & 2 & 1 \end{vmatrix};\;\; \begin{vmatrix} 0 & 2 & 1 \\ 2 & 4 & 2 \\ 2 & 2 & 1 \end{vmatrix}$$

(b) 18 minors of order 2. These are obtained by deleting one row and two columns in all possible ways. For example, four of these are

$$\begin{vmatrix} 1 & 0 \\ 0 & 2 \end{vmatrix};\;\; \begin{vmatrix} 2 & 4 \\ 2 & 2 \end{vmatrix};\;\; \begin{vmatrix} 2 & 1 \\ 4 & 2 \end{vmatrix};\;\; \begin{vmatrix} 0 & 2 \\ 2 & 2 \end{vmatrix}$$

(c) 12 minors of order 1. These are the 12 individual elements of A.

The rank r(A) of a matrix A is equal to the order of the largest minor of A that is non-zero.

If A is a square matrix of order n, then the largest minor of A is $|A|$ itself. So r(A) = n iff $|A| \neq 0$.

The rank of a matrix A is equal to the rank of its transpose: $r(A) = r(A')$. It follows therefore that the rank of a matrix can also be characterized as the maximal number of linearly independent rows of A.

The rank of a matrix is not affected by elementary operations on the matrix. Elementary row (column) operations are:
(a) interchanging two rows (columns)
(b) multiplying each element of a row (column) by any scalar $\alpha \neq 0$
(c) if $i \neq j$, adding to each element of the ith row (column) $\alpha$ times the corresponding element of the jth row (column)

Often it is easier to determine the number of independent rows/columns of a matrix after performing some elementary operations on the matrix.

Example. Find the rank of $\begin{pmatrix} 1 & 2 & 3 & 2 \\ 2 & 3 & 5 & 1 \\ 1 & 3 & 4 & 5 \end{pmatrix}$.
Solution. Perform the following elementary row operations to get the resulting matrices:

$$\begin{pmatrix} 1 & 2 & 3 & 2 \\ 2 & 3 & 5 & 1 \\ 1 & 3 & 4 & 5 \end{pmatrix} \xrightarrow[\;(R_1\times(-1)) + R_3\;]{(R_1\times(-2)) + R_2} \begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & -1 & -1 & -3 \\ 0 & 1 & 1 & 3 \end{pmatrix} \xrightarrow{R_2 + R_3} \begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & -1 & -1 & -3 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

The rank of the last matrix is obviously 2 because there are precisely two linearly independent rows. So the original matrix has rank 2.

Consider the general linear system of m simultaneous equations in n unknowns, which can also be expressed in matrix notation as:

$$Ax = b \tag{37}$$

where A is the $m\times n$ coefficient matrix.

Define a new $m\times(n+1)$ matrix $A_b$, called the augmented matrix, that contains A in the first n columns and b in column $n+1$:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}\quad\text{and}\quad A_b = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{pmatrix}$$

The relationship between the ranks of A and $A_b$ determines whether equation (37) has a solution.

A necessary and sufficient condition for a linear system of equations to be consistent (that is, to have at least one solution) is that the rank of the coefficient matrix is equal to the rank of the augmented matrix, i.e.

$$Ax = b \text{ has a solution} \iff r(A) = r(A_b) \tag{38}$$

Example. Determine if the system of equations:

$$2x_1 - x_2 = 3 \tag{39}$$
$$4x_1 - 2x_2 = 5 \tag{40}$$

has a solution.

Solution. Here $A = \begin{pmatrix} 2 & -1 \\ 4 & -2 \end{pmatrix}$ and $A_b = \begin{pmatrix} 2 & -1 & 3 \\ 4 & -2 & 5 \end{pmatrix}$. Since $|A| = 0$, $r(A) < 2$. Because A is not the null matrix, it follows that r(A) = 1. But $r(A_b) = 2$ because the minor obtained by deleting the first column is non-zero. Thus, $r(A) \neq r(A_b)$, so the system does not have a solution. To see this, multiply the first equation by 2 to obtain $4x_1 - 2x_2 = 6$, which is inconsistent with the second equation.

Suppose that equation (37) has solutions and that $r(A) = r(A_b) = k < m$. Since the rank of A is less than the number of equations, there are $m - k$ "superfluous" equations in the system. That is, these equations are not required to find the solutions of the system. If we choose any collection of k equations corresponding to k linearly independent rows of A, then the solution to these equations will also satisfy the remaining $m - k$ equations.

Now, suppose that (37) has solutions and that $r(A) = r(A_b) = k < n$, that is, the rank of A is less than the number of variables in the system. Then there exist $n - k$ of the variables that can be chosen freely, whereas the remaining k variables are uniquely determined by the choice of these $n - k$ free variables. We say the system has $n - k$ degrees of freedom.
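The consistency condition (38) is easy to check numerically; an added numpy sketch using the system (39)-(40):

```python
# Consistency test: the system Ax = b is solvable iff r(A) = r(A_b).
import numpy as np

A = np.array([[2., -1.], [4., -2.]])
b = np.array([[3.], [5.]])
Ab = np.hstack([A, b])               # augmented matrix

rA = np.linalg.matrix_rank(A)        # 1
rAb = np.linalg.matrix_rank(Ab)      # 2
print(rA, rAb, rA == rAb)            # 1 2 False -> no solution
```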
Example. Determine whether the following system of equations has any solutions, and if it has, find the number of degrees of freedom.

$$\begin{aligned} x_1 + x_2 - 2x_3 + x_4 + 3x_5 &= 1 \\ 2x_1 - x_2 + 2x_3 + 2x_4 + 6x_5 &= 2 \\ 3x_1 + 5x_2 - 10x_3 - 3x_4 - 9x_5 &= -3 \\ 3x_1 + 2x_2 - 4x_3 - 3x_4 - 9x_5 &= -3 \end{aligned} \tag{41}$$

Solution. Here

$$A = \begin{pmatrix} 1 & 1 & -2 & 1 & 3 \\ 2 & -1 & 2 & 2 & 6 \\ 3 & 5 & -10 & -3 & -9 \\ 3 & 2 & -4 & -3 & -9 \end{pmatrix}\quad\text{and}\quad A_b = \begin{pmatrix} 1 & 1 & -2 & 1 & 3 & 1 \\ 2 & -1 & 2 & 2 & 6 & 2 \\ 3 & 5 & -10 & -3 & -9 & -3 \\ 3 & 2 & -4 & -3 & -9 & -3 \end{pmatrix}$$

We know that $r(A_b) \geq r(A)$. All minors of order 4 in $A_b$ are equal to 0, so $r(A_b) \leq 3$. Now, there are minors of order 3 in A that are different from 0. For example, the minor formed by the first, third and fourth columns and the first, second and fourth rows is different from 0, because

$$\begin{vmatrix} 1 & -2 & 1 \\ 2 & 2 & 2 \\ 3 & -4 & -3 \end{vmatrix} = -36$$

Hence, r(A) = 3. Because $r(A) \leq r(A_b) \leq 3$ and r(A) = 3, $r(A_b) = 3$. Since $r(A) = r(A_b)$, the system has solutions. There is one superfluous equation, as m = 4 and k = 3. Also, since n = 5, there are 2 degrees of freedom.

Next, we find all the solutions to the system of equations. Using the non-zero minor above, we can write the equation system in terms of 3 independent equations in the form:

$$\begin{aligned} x_1 - 2x_3 + x_4 + x_2 + 3x_5 &= 1 \\ 2x_1 + 2x_3 + 2x_4 - x_2 + 6x_5 &= 2 \\ 3x_1 - 4x_3 - 3x_4 + 2x_2 - 9x_5 &= -3 \end{aligned}$$

or, in matrix form, as

$$\begin{pmatrix} 1 & -2 & 1 \\ 2 & 2 & 2 \\ 3 & -4 & -3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_3 \\ x_4 \end{pmatrix} + \begin{pmatrix} 1 & 3 \\ -1 & 6 \\ 2 & -9 \end{pmatrix}\begin{pmatrix} x_2 \\ x_5 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix}$$

$$\begin{pmatrix} x_1 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 1 & -2 & 1 \\ 2 & 2 & 2 \\ 3 & -4 & -3 \end{pmatrix}^{-1}\left\{\begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix} - \begin{pmatrix} 1 & 3 \\ -1 & 6 \\ 2 & -9 \end{pmatrix}\begin{pmatrix} x_2 \\ x_5 \end{pmatrix}\right\}$$

It is easily verified that

$$\begin{pmatrix} 1 & -2 & 1 \\ 2 & 2 & 2 \\ 3 & -4 & -3 \end{pmatrix}^{-1} = \frac{1}{18}\begin{pmatrix} -1 & 5 & 3 \\ -6 & 3 & 0 \\ 7 & 1 & -3 \end{pmatrix}$$

Hence,

$$\begin{pmatrix} x_1 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} + \begin{pmatrix} 0 \\ \tfrac{1}{2}x_2 \\ -3x_5 \end{pmatrix} = \begin{pmatrix} 0 \\ \tfrac{1}{2}x_2 \\ 1 - 3x_5 \end{pmatrix}$$

Given $x_2$ and $x_5$, the values for $x_1$, $x_3$ and $x_4$ are uniquely determined. This confirms that there are 2 degrees of freedom.

The rank of a matrix A satisfies the following:

1. $0 \leq r(A) \leq \min(m, n)$: the rank of A cannot be negative and it cannot exceed the smaller of the number of rows or columns.
2. $r(A) = 0 \iff A = 0$. If r(A) = 0, then there are no linearly independent columns. Hence, A = 0. Conversely, if A = 0, then there are no linearly independent columns. Hence, r(A) = 0.
3. $r(I_n) = n$: the identity matrix of order n has n linearly independent columns, since $I_nx = 0$ implies $x = 0$.
4. $r(\alpha A) = r(A)$ if $\alpha \neq 0$: multiplying the columns of A by a nonzero constant $\alpha$ does not change the dependence of the columns. Hence, the maximum number of independent columns of $\alpha A$ is equal to that of A.

A real $m\times n$ matrix A can be viewed as a collection of n columns in $\mathbb{R}^m$, or m rows in $\mathbb{R}^n$.

Two subspaces are associated with A:

1. the column space of A, denoted by $\mathrm{col}\,A = \{x \in \mathbb{R}^m : x = Ay \text{ for some } y \in \mathbb{R}^n\}$
2. the kernel (or null space) of A, denoted by $\ker A = \{y \in \mathbb{R}^n : Ay = 0\}$

Similarly, two subspaces are associated with $A'$:

1. $\mathrm{col}\,A' = \{y \in \mathbb{R}^n : y = A'x \text{ for some } x \in \mathbb{R}^m\}$
2. $\ker A' = \{x \in \mathbb{R}^m : A'x = 0\}$

The kernels are commonly known as orthogonal complements:

1. $\mathrm{col}^\perp(A) = \{x \in \mathbb{R}^m : x \perp A\} = \ker A'$
2. $\mathrm{col}^\perp(A') = \{y \in \mathbb{R}^n : y \perp A'\} = \ker A$

An important result relating to the rank of a matrix A is

$$r(A) = r(AA') = r(A'A) \tag{42}$$

To prove this result, we first establish the following:
(a) $\ker A' = \ker AA'$
(b) $\mathrm{col}^\perp A = \mathrm{col}^\perp AA'$
(c) $\mathrm{col}\,A = \mathrm{col}\,AA'$

Proof:
(a) If $x \in \ker A'$, then $A'x = 0$. Then $AA'x = 0$, so that $x \in \ker AA'$. Conversely, if $x \in \ker AA'$, then $AA'x = 0$ and $x'AA'x = 0 \Rightarrow A'x = 0$, so that $x \in \ker A'$.
(b) Since $\ker A' = \mathrm{col}^\perp A$ and $\ker AA' = \mathrm{col}^\perp AA'$, the result follows immediately.
(c) $\mathrm{col}\,A = (\mathrm{col}^\perp A)^\perp = (\mathrm{col}^\perp AA')^\perp = \mathrm{col}\,AA'$

From result (c) above, it immediately follows that $r(A) = r(AA')$. Further, since $r(A) = r(A')$, (c) implies $r(A) = r(A'A)$.

Another important result is that if A is $m\times n$ and B is a square matrix of rank n, then $r(AB) = r(A)$. This can be reasoned as follows. Every column of AB is a linear combination of the columns of A. Further, since B is of full rank, these linear combinations of the columns of A are independent. So the column space of AB is the same as that of A; hence we have

$$r(AB) = r(A) \tag{43}$$

In a product matrix $C = AB$, every column of C is a linear combination of the columns of A. So the dimension of the column space of C cannot exceed that of the column space of A. Similarly, every row of C is a linear combination of the rows of B, so the dimension of the row space of C cannot exceed that of the row space of B. Since the number of independent columns equals the number of independent rows of a matrix, we have

$$r(C) = r(AB) \leq \min(r(A), r(B)) \tag{44}$$

Every $m\times n$ matrix A of rank k can be written as $A = BC'$, where B ($m\times k$) and C ($n\times k$) both have rank k.

Since A has rank k, there are k independent columns, say $b_1, \ldots, b_k$, such that each column $a_i$ of A is a linear combination of $b_1, \ldots, b_k$, i.e.

$$a_i = \sum_{j=1}^{k} c_{ij}b_j,\quad i = 1, 2, \ldots, n$$

Stacking the $a_i$ ($i = 1, 2, \ldots, n$) side by side, we get

$$A = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix} = \begin{pmatrix} \sum_j c_{1j}b_j & \sum_j c_{2j}b_j & \cdots & \sum_j c_{nj}b_j \end{pmatrix} = \begin{pmatrix} b_1 & b_2 & \cdots & b_k \end{pmatrix}\begin{pmatrix} c_{11} & c_{21} & \cdots & c_{n1} \\ c_{12} & c_{22} & \cdots & c_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ c_{1k} & c_{2k} & \cdots & c_{nk} \end{pmatrix} = BC'$$

where

$$C = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1k} \\ c_{21} & c_{22} & \cdots & c_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nk} \end{pmatrix}$$

That C must have rank k can be shown as follows:

$$r(A) = r(BC') \leq \min(r(B), r(C')) \leq \min(k, r(C')) \tag{45}$$

Since r(A) = k, it follows from (45) that $r(C') \geq k$, and since $C'$ has k rows, $r(C') \leq k$. These two conditions, $r(C') \geq k$ and $r(C') \leq k$, imply that $r(C') = k = r(C)$.

Every $m\times n$ matrix A of rank k can be written as a sum of k matrices, each of rank one:

$$A = BC' = \sum_{j=1}^{k} b_jc_j'$$

where $c_j$ is the jth column of C.

Eigenvalues and Eigenvectors

If A is an $n\times n$ matrix, then a scalar $\lambda$ is an eigenvalue of A if there is a nonzero vector x in $\mathbb{R}^n$ such that

$$Ax = \lambda x \tag{46}$$

Then x is an eigenvector of A (associated with $\lambda$).

Eigenvalues and eigenvectors are also called characteristic roots and characteristic vectors, respectively.

It should be noted that if x is an eigenvector with eigenvalue $\lambda$, then $\alpha x$ is another eigenvector for every scalar $\alpha \neq 0$. Thus, eigenvectors are not unique. To obtain unique eigenvectors, it is often required that the eigenvectors are of unit length, known as orthonormalizing the eigenvectors, i.e. $x'x = 1$.

Here are some general properties of eigenvalues and eigenvectors:

- The eigenvalues of symmetric matrices are real.
- Eigenvectors corresponding to different eigenvalues of symmetric matrices are orthogonal.
- The sum of the eigenvalues of an $n\times n$ matrix A is equal to the sum of its diagonal elements, known as the trace, tr(A).
- The product of the eigenvalues of an $n\times n$ matrix A is equal to the determinant of A.
- If D is a diagonal matrix with diagonal elements $\{d_1, d_2, \ldots, d_n\}$, then the eigenvalues of D are just the diagonal elements. Further, if $e_j$ denotes the jth unit vector in $\mathbb{R}^n$, having all components 0 except for the jth component which is 1, then any nonzero multiple of $e_j$ is an eigenvector associated with the eigenvalue $d_j$.
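The trace and determinant properties in the list above are easy to confirm numerically; an added numpy sketch:

```python
# Sum of eigenvalues = tr(A); product of eigenvalues = |A|.
import numpy as np

A = np.array([[1., 2.], [2., 5.]])
lam = np.linalg.eigvals(A)
print(np.isclose(lam.sum(), np.trace(A)))        # True
print(np.isclose(lam.prod(), np.linalg.det(A)))  # True
```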
Example: Find the eigenvalues and eigenvectors of the matrix $A = \begin{pmatrix} 2 & 3 \\ 3 & 6 \end{pmatrix}$.

Solution: The eigenvalue equation is

$$Ax = \lambda x \quad\Longleftrightarrow\quad (A - \lambda I)x = 0 \tag{47}$$

A nontrivial solution ($x \neq 0$) of Equation (47) exists iff

$$|A - \lambda I| = 0$$

i.e.

$$\begin{vmatrix} 2 - \lambda & 3 \\ 3 & 6 - \lambda \end{vmatrix} = 0$$

$$(2 - \lambda)(6 - \lambda) - (3)(3) = 0$$
$$\lambda^2 - 8\lambda + 3 = 0$$
$$\lambda = \frac{8 \pm \sqrt{(-8)^2 - (4)(3)}}{2} = 0.3944,\; 7.6056$$

Thus, the eigenvalues are $\lambda_1 = 0.3944$ and $\lambda_2 = 7.6056$. The corresponding eigenvectors are determined as follows.

$$\left\{\begin{pmatrix} 2 & 3 \\ 3 & 6 \end{pmatrix} - \begin{pmatrix} 0.3944 & 0 \\ 0 & 0.3944 \end{pmatrix}\right\}\begin{pmatrix} x_{11} \\ x_{12} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

$$1.6056x_{11} + 3x_{12} = 0 \tag{48}$$
$$3x_{11} + 5.6056x_{12} = 0 \tag{49}$$

Note that (48) and (49) are not independent; hence

$$x_{12} = -\frac{1.6056}{3}x_{11} = -0.5352x_{11}$$

The eigenvector corresponding to $\lambda_1$ is therefore $x_1 = t\begin{pmatrix} 1 \\ -0.5352 \end{pmatrix}$, $t \neq 0$.

Similarly, the eigenvector corresponding to $\lambda_2 = 7.6056$ can be determined as follows:

$$\left\{\begin{pmatrix} 2 & 3 \\ 3 & 6 \end{pmatrix} - \begin{pmatrix} 7.6056 & 0 \\ 0 & 7.6056 \end{pmatrix}\right\}\begin{pmatrix} x_{21} \\ x_{22} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

$$-5.6056x_{21} + 3x_{22} = 0 \tag{50}$$
$$3x_{21} - 1.6056x_{22} = 0 \tag{51}$$

$$x_{22} = \frac{5.6056}{3}x_{21} = 1.8685x_{21}$$

The eigenvector corresponding to $\lambda_2$ is therefore $x_2 = t\begin{pmatrix} 1 \\ 1.8685 \end{pmatrix}$, $t \neq 0$.

If orthonormalized eigenvectors are required, then $x_1$ and $x_2$ need to satisfy two further conditions, $x_1'x_1 = 1$ and $x_2'x_2 = 1$. Using these conditions, we obtain:

$$x_1'x_1 = t^2(1 + 0.5352^2) = 1 \Rightarrow t = 0.8817$$
$$x_2'x_2 = t^2(1 + 1.8685^2) = 1 \Rightarrow t = 0.4719$$

Hence, the orthonormalized eigenvectors are $x_1 = \begin{pmatrix} 0.8817 \\ -0.4719 \end{pmatrix}$, $x_2 = \begin{pmatrix} 0.4719 \\ 0.8817 \end{pmatrix}$.

A square matrix A is said to be orthogonal iff $A' = A^{-1}$, that is, its transpose is equal to its inverse.

Diagonalization

Suppose A and P are $n\times n$ matrices and P is invertible. Then A and $P^{-1}AP$ have the same eigenvalues, because

$$|P^{-1}AP - \lambda I| = |P^{-1}AP - \lambda P^{-1}P| = |P^{-1}(A - \lambda I)P| = |P^{-1}||A - \lambda I||P| = |A - \lambda I|$$

A and $B = P^{-1}AP$ are said to be similar matrices.

As shown above, similar matrices have the same set of eigenvalues (with the same multiplicities), but they do not have the same set of eigenvectors. This can be shown as follows. If $Ax = \lambda x$, then $PBP^{-1}x = \lambda x \Rightarrow B(P^{-1}x) = \lambda(P^{-1}x)$, i.e. the eigenvector of B associated with $\lambda$ is $P^{-1}x$, whereas the eigenvector of A associated with $\lambda$ is x.

An $n\times n$ matrix A is diagonalizable if there exists an invertible $n\times n$ matrix P and a diagonal matrix D such that

$$P^{-1}AP = D \tag{52}$$

Since the diagonal elements of a diagonal matrix are its eigenvalues, if A is diagonalizable then the diagonal elements of D are its eigenvalues.
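The worked example and the similarity relation (52) can be reproduced with numpy; an added sketch:

```python
# Eigen-decomposition of the symmetric example matrix and the check
# P^{-1} A P = diag(lambda_1, lambda_2).
import numpy as np

A = np.array([[2., 3.], [3., 6.]])
lam, P = np.linalg.eigh(A)      # symmetric case: orthonormal eigenvectors
print(lam)                      # [0.3944 7.6056], i.e. 4 -/+ sqrt(13)
print(np.allclose(np.linalg.inv(P) @ A @ P, np.diag(lam)))  # True
print(np.allclose(P.T, np.linalg.inv(P)))   # P is orthogonal -> True
```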
An $n\times n$ matrix A is diagonalizable iff it has a set of n linearly independent eigenvectors $x_1, x_2, \ldots, x_n$. In that case,

$$P^{-1}AP = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \tag{53}$$

where P is the matrix with $x_1, x_2, \ldots, x_n$ as its columns, and $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the corresponding eigenvalues.

Since for symmetric matrices the eigenvalues are real and the eigenvectors are orthogonal, symmetric matrices are diagonalizable.

Quadratic Form

A quadratic form in n variables is a function Q of the form

$$Q(x_1, \ldots, x_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j = a_{11}x_1^2 + a_{12}x_1x_2 + \cdots + a_{nn}x_n^2 \tag{54}$$

where the $a_{ij}$ are constants.

Suppose we define $x = (x_1, x_2, \ldots, x_n)'$ and $A = (a_{ij})_{n\times n}$, symmetric. Then Q can be expressed as

$$Q = x'Ax \tag{55}$$

A is called the symmetric matrix associated with Q, and Q is called a symmetric quadratic form.

Example. Write $Q(x_1, x_2, x_3) = 3x_1^2 + 6x_1x_3 + x_2^2 - 4x_2x_3 + 8x_3^2$ as a symmetric quadratic form.

Solution. Note that Q can be expressed as

$$Q = 3x_1^2 + 0\cdot x_1x_2 + 3x_1x_3 + 0\cdot x_2x_1 + x_2^2 - 2x_2x_3 + 3x_3x_1 - 2x_3x_2 + 8x_3^2$$

Then $Q = x'Ax$, where $A = \begin{pmatrix} 3 & 0 & 3 \\ 0 & 1 & -2 \\ 3 & -2 & 8 \end{pmatrix}$ and $x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$.

Often we are interested in conditions that ensure that Q(x) has the same sign for all x.

A quadratic form $Q(x) = x'Ax$, as well as its associated symmetric matrix A, are said to be positive definite (p.d.), positive semidefinite (p.s.d.), negative definite (n.d.) or negative semidefinite (n.s.d.) according as

$$Q(x) > 0;\quad Q(x) \geq 0;\quad Q(x) < 0;\quad Q(x) \leq 0$$

for all $x \neq 0$. The quadratic form Q(x) is indefinite if there exist vectors $x^*$ and $y^*$ such that $Q(x^*) < 0$ and $Q(y^*) > 0$. Thus, an indefinite quadratic form can assume both negative and positive values.

Let $A = (a_{ij})_{n\times n}$ be an $n\times n$ matrix. An arbitrary principal minor of order k, denoted by $\Delta_k$, is the determinant of the matrix obtained by deleting $n - k$ rows and the corresponding $n - k$ columns in A. The determinant $|A|$ itself is a principal minor (no row or column is deleted).

A principal minor is called a leading principal minor of order k ($1 \leq k \leq n$), denoted by $D_k$, if it is the determinant of the submatrix that consists of the first ('leading') k rows and columns of A.

Consider the quadratic form

$$Q(x_1, \ldots, x_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j \tag{56}$$

1. Q is positive definite $\iff D_k > 0$ for $k = 1, \ldots, n$
2. Q is positive semidefinite $\iff \Delta_k \geq 0$ for all principal minors of order $k = 1, \ldots, n$
3. Q is negative definite $\iff (-1)^kD_k > 0$ for $k = 1, \ldots, n$
4. Q is negative semidefinite $\iff (-1)^k\Delta_k \geq 0$ for all principal minors of order $k = 1, \ldots, n$

Example. Determine the definiteness of
(a) $Q = 3x_1^2 + 6x_1x_3 + x_2^2 - 4x_2x_3 + 8x_3^2$
(b) $Q = -x_1^2 + 6x_1x_2 - 9x_2^2 - 2x_3^2$

Solution. It makes sense to check the leading principal minors first, in case the matrix turns out to be definite, instead of semidefinite.
(a) The associated symmetric matrix is

$$A = \begin{pmatrix} 3 & 0 & 3 \\ 0 & 1 & -2 \\ 3 & -2 & 8 \end{pmatrix},$$

and the leading principal minors are

$$D_1 = 3;\quad D_2 = \begin{vmatrix} 3 & 0 \\ 0 & 1 \end{vmatrix} = 3;\quad D_3 = \begin{vmatrix} 3 & 0 & 3 \\ 0 & 1 & -2 \\ 3 & -2 & 8 \end{vmatrix} = 3$$

Since $D_k > 0$ for $k = 1, 2, 3$, Q is positive definite.

(b) The associated symmetric matrix is

$$A = \begin{pmatrix} -1 & 3 & 0 \\ 3 & -9 & 0 \\ 0 & 0 & -2 \end{pmatrix}.$$

The leading principal minors are $D_1 = -1$, $D_2 = 0$, $D_3 = 0$. It follows that Q is not positive definite, neither is it positive semidefinite nor negative definite. To determine if it is negative semidefinite, we need to examine all the principal minors of A.

Let $\Delta_k^{(m)}$ denote the mth principal minor of order k. Then, for A we have

$$\Delta_1^{(1)} = -1;\quad \Delta_1^{(2)} = -9;\quad \Delta_1^{(3)} = -2;\quad\text{so } (-1)^1\Delta_1^{(m)} > 0 \text{ for } m = 1, 2, 3$$

$$\Delta_2^{(1)} = \begin{vmatrix} -1 & 3 \\ 3 & -9 \end{vmatrix} = 0;\quad \Delta_2^{(2)} = \begin{vmatrix} -1 & 0 \\ 0 & -2 \end{vmatrix} = 2;\quad \Delta_2^{(3)} = \begin{vmatrix} -9 & 0 \\ 0 & -2 \end{vmatrix} = 18;\quad\text{hence } (-1)^2\Delta_2^{(m)} \geq 0$$

$$\Delta_3^{(1)} = D_3 = 0;\quad\text{hence } (-1)^3\Delta_3^{(1)} \geq 0$$

Therefore, we have $(-1)^k\Delta_k \geq 0$ for all principal minors $\Delta_k$ ($k = 1, 2, 3$). It follows that Q is negative semidefinite.

An alternative way of finding the definiteness of a matrix is to determine the signs of the eigenvalues of the associated matrix.

Let $Q = x'Ax$ be a quadratic form, where the matrix A is symmetric, and let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the (real) eigenvalues of A. Then

1. Q is positive definite $\iff \lambda_1 > 0, \ldots, \lambda_n > 0$
2. Q is positive semidefinite $\iff \lambda_1 \geq 0, \ldots, \lambda_n \geq 0$
3. Q is negative definite $\iff \lambda_1 < 0, \ldots, \lambda_n < 0$
4. Q is negative semidefinite $\iff \lambda_1 \leq 0, \ldots, \lambda_n \leq 0$
5. Q is indefinite $\iff$ A has eigenvalues with opposite signs.
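An added numpy check of examples (a) and (b) via the eigenvalue criterion, together with the leading principal minors of (a):

```python
# Definiteness from eigenvalue signs (and leading principal minors).
import numpy as np

Aa = np.array([[3., 0., 3.], [0., 1., -2.], [3., -2., 8.]])
Ab = np.array([[-1., 3., 0.], [3., -9., 0.], [0., 0., -2.]])

print(np.all(np.linalg.eigvalsh(Aa) > 0))    # True -> positive definite
print(np.all(np.linalg.eigvalsh(Ab) <= 1e-12))  # True -> negative semidefinite
# Leading principal minors D1, D2, D3 of Aa:
print([np.linalg.det(Aa[:k, :k]) for k in (1, 2, 3)])  # approx [3, 3, 3]
```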


Tutorial Exercise No. 2

Question 1
Consider the following matrix M:

$$M = \begin{pmatrix} \tfrac{1}{3} & \tfrac{1}{2} & \tfrac{1}{6} & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

By suitably partitioning M, determine $M^{100} = \underbrace{MM\cdots M}_{100\text{ times}}$.

Question 2
Compute $A^{-1}$ for

$$A = \begin{pmatrix} 2 & 3 & 0 & 0 & 0 \\ 3 & 4 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 5 & 7 & 0 & 0 & 1 \end{pmatrix}$$

Question 3
Suppose that $a, b, c \in \mathbb{R}^3$ are all different from 0, and that $a \perp b$, $a \perp c$, $b \perp c$. Prove that a, b and c are linearly independent.

Question 4
Calculate $|A|$, tr(A) and $A^{-1}$ for

$$A = \begin{bmatrix} 1 & 4 & 7 \\ 3 & 2 & 5 \\ 5 & 2 & 8 \end{bmatrix}$$

Question 5
For the matrix

$$X' = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 4 & 2 & 3 & 5 \end{bmatrix}$$

compute $P = X(X'X)^{-1}X'$ and $M = (I - P)$. Verify that $MP = 0$. Let

$$Q = \begin{bmatrix} 1 & 3 \\ 2 & 8 \end{bmatrix}$$

What are the characteristic roots of M and P?

Question 6
Compute the characteristic roots of

$$A = \begin{bmatrix} 2 & 4 & 3 \\ 4 & 8 & 6 \\ 3 & 6 & 5 \end{bmatrix}$$
Question 7
Suppose

$$A = \begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix}$$

has the eigenvectors

$$v_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix};\quad v_2 = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix};\quad v_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

with associated eigenvalues $\lambda_1 = 3$, $\lambda_2 = 1$ and $\lambda_3 = 4$. Determine A.

Question 8
Express each of the following quadratic forms in terms of vectors and symmetric matrices:
(a) $x^2 + 2xy + y^2$
(b) $ax^2 + bxy + cy^2$
(c) $3x_1^2 - 2x_1x_2 + 3x_1x_3 + x_2^2 + 3x_3^2$

Question 9
Determine the rank of the following matrix:

$$\begin{pmatrix} 1 & 2 & 1 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \\ 2 & 5 & 2 & 0 \end{pmatrix}$$

Question 10
Prove that $\lambda$ is an eigenvalue of A iff $\lambda$ is an eigenvalue of $A'$.

Question 11
Suppose A is a square matrix and let $\lambda$ be an eigenvalue of A. Prove that if $|A| \neq 0$, then $\lambda \neq 0$. In this case show that $1/\lambda$ is an eigenvalue of $A^{-1}$.

Kronecker Product and Vectorization

Let A be an $m\times n$ matrix and B be a $p\times q$ matrix. Then the $mp\times nq$ matrix defined by

$$\begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{pmatrix} \tag{57}$$

is called the Kronecker product of A and B, and is denoted by $A\otimes B$. Note that the Kronecker product is defined for any pair of matrices A and B, regardless of their dimensions.

Let A be an $m\times n$ matrix and $a_i$ its ith column. Then vec(A) is defined as

$$\mathrm{vec}(A) = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} \tag{58}$$

Example. Let

$$A = \begin{pmatrix} 2 & 5 & 2 \\ 0 & 6 & 3 \end{pmatrix};\quad B = \begin{pmatrix} 2 & 4 & 1 \\ 3 & 5 & 0 \end{pmatrix};\quad e = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

(a) Compute $I_2\otimes A$ and $A\otimes I_2$.
(b) Compute $A'\otimes B$.
(c) Compute $A\otimes e$ and $A\otimes e'$.

Solution. (a)

$$I_2\otimes A = \begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix} = \begin{pmatrix} 2 & 5 & 2 & 0 & 0 & 0 \\ 0 & 6 & 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 5 & 2 \\ 0 & 0 & 0 & 0 & 6 & 3 \end{pmatrix}$$

$$A\otimes I_2 = \begin{pmatrix} 2I_2 & 5I_2 & 2I_2 \\ 0I_2 & 6I_2 & 3I_2 \end{pmatrix} = \begin{pmatrix} 2 & 0 & 5 & 0 & 2 & 0 \\ 0 & 2 & 0 & 5 & 0 & 2 \\ 0 & 0 & 6 & 0 & 3 & 0 \\ 0 & 0 & 0 & 6 & 0 & 3 \end{pmatrix}$$

(b)

$$A'\otimes B = \begin{pmatrix} 2B & 0B \\ 5B & 6B \\ 2B & 3B \end{pmatrix} = \begin{pmatrix} 4 & 8 & 2 & 0 & 0 & 0 \\ 6 & 10 & 0 & 0 & 0 & 0 \\ 10 & 20 & 5 & 12 & 24 & 6 \\ 15 & 25 & 0 & 18 & 30 & 0 \\ 4 & 8 & 2 & 6 & 12 & 3 \\ 6 & 10 & 0 & 9 & 15 & 0 \end{pmatrix}$$

(c)

$$A\otimes e = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 2 & 5 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 6 & 3 \end{pmatrix};\qquad A\otimes e' = \begin{pmatrix} 0 & 0 & 2 & 0 & 0 & 5 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 & 6 & 0 & 0 & 3 \end{pmatrix}$$
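The example can be reproduced with numpy's built-in Kronecker product; an added sketch (note that vec stacks columns, which corresponds to column-major, 'F', order):

```python
# np.kron computes A (x) B; reshape(..., order="F") gives vec(A).
import numpy as np

A = np.array([[2., 5., 2.], [0., 6., 3.]])
B = np.array([[2., 4., 1.], [3., 5., 0.]])
e = np.array([[0.], [0.], [1.]])

print(np.kron(np.eye(2), A))        # I2 (x) A, the block-diagonal matrix
print(np.kron(A.T, B))              # A' (x) B as in part (b)
print(np.kron(A, e))                # A (x) e as in part (c)
print(A.reshape(-1, order="F"))     # vec(A) = [2. 0. 5. 6. 2. 3.]
```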
The Kronecker product satisfies the following rules:

1. $(A_1 + A_2)\otimes B = (A_1\otimes B) + (A_2\otimes B)$ ($A_1$ and $A_2$ of the same dimension)
2. $A\otimes(B_1 + B_2) = (A\otimes B_1) + (A\otimes B_2)$ ($B_1$ and $B_2$ of the same dimension)
3. $\alpha A\otimes\beta B = \alpha\beta(A\otimes B)$
4. $(A\otimes B)(C\otimes D) = (AC\otimes BD)$
5. $A\otimes B \neq B\otimes A$, in general
6. $A\otimes(B\otimes C) = (A\otimes B)\otimes C$
7. $\mathrm{diag}(A\otimes B) = \mathrm{diag}(A)\otimes\mathrm{diag}(B)$ (A and B square)
8. $(A\otimes B)' = A'\otimes B'$
9. $\mathrm{tr}(A\otimes B) = (\mathrm{tr}\,A)(\mathrm{tr}\,B)$
10. $A\otimes B$ is idempotent if A and B are idempotent.
11. $(A\otimes B)^{-1} = A^{-1}\otimes B^{-1}$ when A and B are both nonsingular, not necessarily of the same dimension.
12. If A is a partitioned matrix $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, then $A\otimes B = \begin{pmatrix} A_{11}\otimes B & A_{12}\otimes B \\ A_{21}\otimes B & A_{22}\otimes B \end{pmatrix}$
13. Let A and B be square matrices of dimensions m and n, respectively. Then $|A\otimes B| = |A|^n|B|^m$
14. $r(A\otimes B) = r(A)r(B)$

Example. Show that $(A\otimes B)(C\otimes D) = (AC\otimes BD)$.

Solution. Let $A_{m\times n}$, $B_{q\times r}$, $C_{n\times p}$, $D_{r\times s}$ be given matrices. Note that AC and BD are defined since A and B have as many columns as C and D have rows, respectively. Further, $A\otimes B$ has nr columns and $C\otimes D$ has nr rows, so $(A\otimes B)(C\otimes D)$ is also defined.

$$(A\otimes B)(C\otimes D) = \begin{pmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{pmatrix}\begin{pmatrix} c_{11}D & \cdots & c_{1p}D \\ \vdots & \ddots & \vdots \\ c_{n1}D & \cdots & c_{np}D \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^n a_{1i}c_{i1}BD & \cdots & \sum_{i=1}^n a_{1i}c_{ip}BD \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^n a_{mi}c_{i1}BD & \cdots & \sum_{i=1}^n a_{mi}c_{ip}BD \end{pmatrix}$$

$$= \begin{pmatrix} (AC)_{11}BD & \cdots & (AC)_{1p}BD \\ \vdots & \ddots & \vdots \\ (AC)_{m1}BD & \cdots & (AC)_{mp}BD \end{pmatrix} = AC\otimes BD$$

Example. Show that $A\otimes B$ is idempotent if A and B are idempotent.

Solution. $(A\otimes B)(A\otimes B) = (AA\otimes BB) = (A\otimes B)$

Example. Show that $(A\otimes B)^{-1} = A^{-1}\otimes B^{-1}$ when A and B are both nonsingular, not necessarily of the same dimension.

Solution. $(A\otimes B)(A^{-1}\otimes B^{-1}) = (AA^{-1}\otimes BB^{-1}) = (I_m\otimes I_n) = I_{mn}$. So, $(A^{-1}\otimes B^{-1}) = (A\otimes B)^{-1}$.

Example. Let A be an $m\times m$ symmetric matrix with eigenvalues $\lambda_1, \ldots, \lambda_m$ and let B be an $n\times n$ symmetric matrix with eigenvalues $\mu_1, \ldots, \mu_n$. Show that the mn eigenvalues of $A\otimes B$ are $\lambda_i\mu_j$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$).

Solution. There exist orthogonal matrices P and Q such that $P'AP = \Lambda$ and $Q'BQ = M$, where $\Lambda$ and $M$ are diagonal matrices whose diagonal elements are the eigenvalues of A and B, respectively. This gives

$$(P'\otimes Q')(A\otimes B)(P\otimes Q) = \Lambda\otimes M \tag{59}$$

Since $P' = P^{-1}$ and $Q' = Q^{-1}$, equation (59) can be written as

$$(P^{-1}\otimes Q^{-1})(A\otimes B)(P\otimes Q) = \Lambda\otimes M \tag{60}$$
$$(P\otimes Q)^{-1}(A\otimes B)(P\otimes Q) = \Lambda\otimes M \tag{61}$$

$(A\otimes B)$ and $\Lambda\otimes M$ are similar matrices; therefore, they have the same eigenvalues. $\Lambda\otimes M$ is a diagonal matrix and hence its eigenvalues are equal to its diagonal elements $\lambda_i\mu_j$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$).

This result holds even when A and B are not symmetric.

Example. Show that $|A\otimes B| = |A|^n|B|^m$.

Solution. The determinant is the product of the eigenvalues, so

$$|A\otimes B| = \prod_{i=1}^{m}\prod_{j=1}^{n}\lambda_i\mu_j = \prod_{i=1}^{m}\Bigl(\lambda_i^n\prod_{j=1}^{n}\mu_j\Bigr) \tag{62}$$

$$= \prod_{i=1}^{m}\bigl(\lambda_i^n|B|\bigr) = |B|^m\prod_{i=1}^{m}\lambda_i^n = |A|^n|B|^m \tag{63}$$

Example. Show that $r(A\otimes B) = r(A)r(B)$.

Solution.

$$r(A\otimes B) = r[(A\otimes B)(A\otimes B)'] = r(AA'\otimes BB')$$

The matrix $AA'\otimes BB'$ is symmetric, so its rank is equal to the number of nonzero eigenvalues. Now, the eigenvalues of $AA'\otimes BB'$ are $\{\lambda_i\mu_j\}$, where $\{\lambda_i\}$ are the eigenvalues of
$AA'$ and $\{\mu_j\}$ are the eigenvalues of $BB'$. The eigenvalue $\lambda_i\mu_j$ is nonzero if and only if both $\lambda_i$ and $\mu_j$ are nonzero. Hence, the number of nonzero eigenvalues of $AA'\otimes BB'$ equals the product of the number of nonzero eigenvalues of $AA'$ and the number of nonzero eigenvalues of $BB'$. This implies $r(A\otimes B) = r(AA')r(BB') = r(A)r(B)$.

The vec operator has the following properties:

1. $\mathrm{vec}(A + B) = \mathrm{vec}\,A + \mathrm{vec}\,B$
2. If A and B are of the same dimension, then $\mathrm{vec}\,A = \mathrm{vec}\,B \iff A = B$
3. $\mathrm{vec}(\alpha A) = \alpha\,\mathrm{vec}\,A$
4. For any vector a, $\mathrm{vec}\,a = \mathrm{vec}\,a'$
5. For any two vectors a and b, $\mathrm{vec}(ab') = b\otimes a$
6. $\mathrm{vec}(ABC) = (C'\otimes A)\mathrm{vec}\,B$ whenever the product ABC is defined
7. For any two matrices of the same dimension, $(\mathrm{vec}\,A)'\mathrm{vec}\,B = \mathrm{tr}(A'B)$

Exercise.
(a) Show that for any two vectors a and b, $\mathrm{vec}\,ab' = b\otimes a$.
(b) Use this fact to establish $\mathrm{vec}(ABC) = (C'\otimes A)\mathrm{vec}\,B$ whenever the product ABC is defined.

Solution.
(a) Let $b = (b_1, \ldots, b_n)'$. Then

$$ab' = \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix}\begin{pmatrix} b_1 & \cdots & b_n \end{pmatrix} = \begin{pmatrix} a_1b_1 & \cdots & a_1b_n \\ \vdots & \ddots & \vdots \\ a_mb_1 & \cdots & a_mb_n \end{pmatrix} = (b_1a, \ldots, b_na)$$

$$\mathrm{vec}\,ab' = \mathrm{vec}(b_1a, \ldots, b_na) = \begin{pmatrix} b_1a \\ \vdots \\ b_na \end{pmatrix} = b\otimes a$$

(b) Let $B = (b_1\ \cdots\ b_n)$ be an $m\times n$ matrix and let the n columns of $I_n$ be denoted by $e_1, \ldots, e_n$. Then B can be written as $B = \sum_{i=1}^{n} b_ie_i'$. Hence, using (a),

$$\mathrm{vec}\,ABC = \mathrm{vec}\sum_{i=1}^{n} Ab_ie_i'C = \sum_{i=1}^{n}\mathrm{vec}\bigl((Ab_i)(C'e_i)'\bigr) = \sum_{i=1}^{n}(C'e_i)\otimes(Ab_i) = (C'\otimes A)\sum_{i=1}^{n}(e_i\otimes b_i)$$

$$= (C'\otimes A)\sum_{i=1}^{n}\mathrm{vec}\,b_ie_i' = (C'\otimes A)\,\mathrm{vec}\sum_{i=1}^{n} b_ie_i' = (C'\otimes A)\,\mathrm{vec}\,B$$

Exercise. For any two matrices of the same order, show that

$$(\mathrm{vec}\,A)'\mathrm{vec}\,B = \mathrm{tr}\,A'B \tag{64}$$

Solution. Let $A = (a_{ij})$ and $B = (b_{ij})$. Then

$$(\mathrm{vec}\,A)'\mathrm{vec}\,B = \sum_j\sum_i a_{ij}b_{ij} = \sum_j\sum_i (A')_{ji}b_{ij} = \sum_j (A'B)_{jj} = \mathrm{tr}\,A'B$$

Matrix Differentiation

In a set of linear functions $y = Ax$, each element $y_i$ of y is $y_i = a_i'x$, where $a_i'$ is the ith row of A. Therefore,

$$\frac{\partial y_i}{\partial x} = \begin{pmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{pmatrix}' = \text{transpose of the ith row of } A$$

Hence,

$$\frac{\partial Ax}{\partial x} = \begin{pmatrix} \dfrac{\partial y_1}{\partial x} & \dfrac{\partial y_2}{\partial x} & \cdots & \dfrac{\partial y_n}{\partial x} \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix} = A'$$

By the same reasoning,

$$\frac{\partial Ax}{\partial x'} = \begin{pmatrix} \partial y_1/\partial x' \\ \partial y_2/\partial x' \\ \vdots \\ \partial y_n/\partial x' \end{pmatrix} = \begin{pmatrix} a_1' \\ a_2' \\ \vdots \\ a_n' \end{pmatrix} = A$$

A quadratic form is written

$$x'Ax = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j$$

For example, with $A = \begin{pmatrix} 1 & 3 \\ 3 & 4 \end{pmatrix}$,

$$x'Ax = x_1^2 + 4x_2^2 + 6x_1x_2$$

Then,

$$\frac{\partial x'Ax}{\partial x} = \begin{pmatrix} 2x_1 + 6x_2 \\ 6x_1 + 8x_2 \end{pmatrix} = \begin{pmatrix} 2 & 6 \\ 6 & 8 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 2Ax$$

In general this result holds when A is symmetric. If A is not symmetric, then

$$\frac{\partial x'Ax}{\partial x} = (A + A')x$$
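The gradient formula can be verified with a finite-difference check; an added numpy sketch:

```python
# Central-difference check of d(x'Ax)/dx = (A + A')x (= 2Ax if symmetric).
import numpy as np

A = np.array([[1., 3.], [3., 4.]])
x = np.array([0.7, -1.2])
f = lambda z: z @ A @ z

h, grad = 1e-6, np.empty(2)
for i in range(2):
    d = np.zeros(2); d[i] = h
    grad[i] = (f(x + d) - f(x - d)) / (2 * h)

print(np.allclose(grad, (A + A.T) @ x))   # True
```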
Returning to the preceding double summation, we find that for each term $a_{ij}$ the associated term is $x_ix_j$. Therefore

$$\frac{\partial x'Ax}{\partial a_{ij}} = x_ix_j$$

The square matrix $xx'$ has ijth element $x_ix_j$. Hence

$$\frac{\partial x'Ax}{\partial A} = xx'$$

From above, the determinant of A is $|A| = \sum_{j=1}^{n}(-1)^{i+j}a_{ij}|A_{ij}|$, where $|A_{ij}|$ is the minor obtained by deleting row i and column j. Then

$$\frac{\partial|A|}{\partial a_{ij}} = (-1)^{i+j}|A_{ij}| = c_{ij}$$

where $c_{ij}$ is the ijth cofactor of A. Hence

$$\frac{\partial\ln|A|}{\partial a_{ij}} = \frac{1}{|A|}\frac{\partial|A|}{\partial a_{ij}} = \frac{c_{ij}}{|A|}$$

Therefore,

$$\frac{\partial\ln|A|}{\partial A} = \frac{(c_{ij})}{|A|} = \left(\frac{\mathrm{adj}(A)}{|A|}\right)' = (A^{-1})' = A^{-1}$$

if A is symmetric.

Optimization

Unconstrained Optimization

Consider finding the value x which maximizes/minimizes a function f(x). The necessary condition for an optimum is $f'(x) = 0$.

For a maximum the function must be concave; for a minimum it must be convex. The sufficient condition for an optimum is

$$f''(x) < 0\ \text{for a maximum};\qquad f''(x) > 0\ \text{for a minimum}$$

For optimizing a function of several variables, the first-order conditions are

$$\frac{\partial f(x)}{\partial x} = 0 \tag{65}$$

The second order condition for an optimum is that

$$H = \frac{\partial^2 f(x)}{\partial x\,\partial x'} \tag{66}$$

must be positive definite for a minimum and negative definite for a maximum.

An example:

$$\max_x\; R = a'x - x'Ax$$

where $a' = \begin{pmatrix} 5 & 4 & 2 \end{pmatrix}$ and $A = \begin{bmatrix} 2 & 1 & 3 \\ 1 & 3 & 2 \\ 3 & 2 & 5 \end{bmatrix}$.

Solution. From above,

$$\frac{\partial R}{\partial x} = a - 2Ax = \begin{bmatrix} 5 \\ 4 \\ 2 \end{bmatrix} - \begin{bmatrix} 4 & 2 & 6 \\ 2 & 6 & 4 \\ 6 & 4 & 10 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

Setting $\partial R/\partial x = 0$ and solving for x, we obtain

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 & 2 & 6 \\ 2 & 6 & 4 \\ 6 & 4 & 10 \end{bmatrix}^{-1}\begin{bmatrix} 5 \\ 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 11.25 \\ 1.75 \\ -7.25 \end{bmatrix} \tag{67}$$

The sufficient condition is that

$$\frac{\partial^2 R}{\partial x\,\partial x'} = -2A = \begin{bmatrix} -4 & -2 & -6 \\ -2 & -6 & -4 \\ -6 & -4 & -10 \end{bmatrix}$$

must be negative definite. The three characteristic roots of this matrix are $-15.746$, $-4$ and $-0.254$. Hence the matrix is negative definite.
f 00 (x) < 0 for a maximum
f 00 (x) > 0 for a minimum Constrained Optimization
To maximize a function subject to certain constraints,
For optimizing a function of several variables, the …rst- we can use the Lagrange multipliers procedure.
order conditions are

max f (x) (68)


x
@f (x)
=0 (65)
@x subject to
The second order condition for an optimum is that
c1 (x) = 0

@ 2 f (x) c2 (x) = 0 (69)


H= (66) ..
@x@x0
.
must be positive de…nite for a minimum and negative def-
inite for a maximum. cJ (x) = 0

17
Form the Lagrangean function

$$L(x, \lambda) = f(x) + \sum_{j=1}^{J}\lambda_jc_j(x) = f(x) + \lambda'c(x) \tag{70}$$

First order conditions:

$$\frac{\partial L}{\partial x} = \frac{\partial f(x)}{\partial x} + \frac{\partial[\lambda'c(x)]}{\partial x} = 0 \tag{71}$$

$$\frac{\partial L}{\partial\lambda} = c(x) = 0 \tag{72}$$

The second term on the rhs in $\partial L/\partial x$ is

$$\frac{\partial[\lambda'c(x)]}{\partial x} = \frac{\partial[c(x)'\lambda]}{\partial x} = \left(\frac{\partial c(x)}{\partial x'}\right)'\lambda = C'\lambda \tag{73}$$

where C is the matrix of derivatives of the constraints with respect to $x'$. The jth row of the $J\times n$ matrix C is the vector of derivatives of the jth constraint, $c_j(x)$, with respect to $x'$.

Collecting terms, the first-order conditions are

$$\frac{\partial L}{\partial x} = \frac{\partial f(x)}{\partial x} + C'\lambda = 0 \tag{74}$$

$$\frac{\partial L}{\partial\lambda} = c(x) = 0 \tag{75}$$

In the unconstrained solution, we have $\partial f(x)/\partial x = 0$. The constrained solution, however, has $\partial f(x)/\partial x = -C'\lambda$, which will not equal 0 unless $\lambda = 0$. This implies:

1. The constrained solution cannot be superior to the unconstrained solution. This results from the non-zero gradient at the constrained solution.
2. If the Lagrange multipliers are zero, then the constrained solution will equal the unconstrained solution.

Continuing the example above, suppose we add the constraints

$$x_1 - x_2 + x_3 = 0 \tag{76}$$
$$x_1 + x_2 + x_3 = 0 \tag{77}$$

In this case, the constraints can be expressed as $Cx = 0$, where

$$C = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{78}$$

The Lagrangean is

$$L(x, \lambda) = a'x - x'Ax + \lambda'Cx \tag{79}$$

Differentiating $L(x, \lambda)$ gives us the necessary conditions

$$a - 2Ax + C'\lambda = 0 \tag{80}$$

$$Cx = 0 \tag{81}$$

These may be written as

$$\begin{bmatrix} 2A & -C' \\ C & 0 \end{bmatrix}\begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} a \\ 0 \end{bmatrix} \tag{82}$$

Using the partitioned inverse produces the solutions

$$\lambda = -[CA^{-1}C']^{-1}CA^{-1}a \tag{83}$$

$$x = \frac{1}{2}A^{-1}[I - C'(CA^{-1}C')^{-1}CA^{-1}]a \tag{84}$$

Substituting for the matrices $A^{-1}$, C and a yields the solutions:

$$x^* = \begin{bmatrix} 1.5 \\ 0 \\ -1.5 \end{bmatrix};\qquad \lambda^* = \begin{bmatrix} -0.5 \\ -7.5 \end{bmatrix} \tag{85}$$

By inserting the unconstrained and constrained solutions into the original function, we get the two optimum values:

Unconstrained maximum: 24.375
Constrained maximum: 2.25
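The bordered system (82) can also be solved directly; an added numpy sketch that reproduces (85):

```python
# Solve [2A, -C'; C, 0][x; lambda] = [a; 0] in one linear solve.
import numpy as np

a = np.array([5., 4., 2.])
A = np.array([[2., 1., 3.], [1., 3., 2.], [3., 2., 5.]])
C = np.array([[1., -1., 1.], [1., 1., 1.]])

K = np.block([[2 * A, -C.T], [C, np.zeros((2, 2))]])
sol = np.linalg.solve(K, np.concatenate([a, np.zeros(2)]))
x, lam = sol[:3], sol[3:]
print(x)                    # [ 1.5  0.  -1.5]
print(lam)                  # [-0.5 -7.5]
print(a @ x - x @ A @ x)    # 2.25, the constrained maximum
```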


Local Second Order Conditions

Suppose that $f(x) = f(x_1, \ldots, x_n)$ is defined on a set S in $\mathbb{R}^n$ and that $x^*$ is an interior stationary point. Assume also that f is twice differentiable in an open ball around $x^*$. Define the $(n\times n)$ matrix of second-order derivatives, called the Hessian matrix, $f''(x) = \bigl(f_{ij}''(x)\bigr)$. Let $D_k(x)$, $k = 1, \ldots, n$, be the n leading principal minors of the Hessian matrix, where

$$D_k(x) = \begin{vmatrix} f_{11}''(x) & f_{12}''(x) & \cdots & f_{1k}''(x) \\ f_{21}''(x) & f_{22}''(x) & \cdots & f_{2k}''(x) \\ \vdots & \vdots & \ddots & \vdots \\ f_{k1}''(x) & f_{k2}''(x) & \cdots & f_{kk}''(x) \end{vmatrix},\quad k = 1, 2, \ldots, n \tag{86}$$
Then:
(a) $D_k(x^*) > 0$, $k = 1, \ldots, n$ $\Rightarrow$ $x^*$ is a local minimum point
(b) $(-1)^kD_k(x^*) > 0$, $k = 1, \ldots, n$ $\Rightarrow$ $x^*$ is a local maximum point
(c) $D_n(x^*) \neq 0$ and neither (a) nor (b) is satisfied $\Rightarrow$ $x^*$ is a saddle point

Consider the following constrained optimization problem

$$\text{local max(min) } f(x) = f(x_1, \ldots, x_n) \tag{87}$$

$$\text{subject to } g_j(x) = b_j,\quad j = 1, \ldots, m\;\; (m < n) \tag{88}$$

The Lagrangian is given by

$$L(x) = f(x) - \sum_{j=1}^{m}\lambda_j(g_j(x) - b_j) \tag{89}$$

Define the bordered Hessian

$$B_r(x) = \begin{vmatrix} 0 & \cdots & 0 & \dfrac{\partial g_1(x)}{\partial x_1} & \cdots & \dfrac{\partial g_1(x)}{\partial x_r} \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & \dfrac{\partial g_m(x)}{\partial x_1} & \cdots & \dfrac{\partial g_m(x)}{\partial x_r} \\ \dfrac{\partial g_1(x)}{\partial x_1} & \cdots & \dfrac{\partial g_m(x)}{\partial x_1} & L_{11}''(x) & \cdots & L_{1r}''(x) \\ \vdots & & \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial g_1(x)}{\partial x_r} & \cdots & \dfrac{\partial g_m(x)}{\partial x_r} & L_{r1}''(x) & \cdots & L_{rr}''(x) \end{vmatrix} \tag{90}$$

Let $x^*$ be an interior point that satisfies the first-order conditions:

$$\frac{\partial L(x)}{\partial x_i} = \frac{\partial f(x)}{\partial x_i} - \sum_{j=1}^{m}\lambda_j\frac{\partial g_j(x)}{\partial x_i} = 0,\quad i = 1, \ldots, n \tag{91}$$

$$g_j(x) = b_j,\quad j = 1, \ldots, m \tag{92}$$

Then

(a) If $(-1)^mB_r(x^*) > 0$ for $r = m+1, \ldots, n$, then $x^*$ solves the local minimization problem;
(b) If $(-1)^rB_r(x^*) > 0$ for $r = m+1, \ldots, n$, then $x^*$ solves the local maximization problem.

Multivariate Normal Distribution

The k-element random vector X is said to have a multivariate normal distribution, denoted by $X \sim N(\mu, \Sigma)$, if its pdf is given by

$$f(x) = \frac{1}{(2\pi)^{k/2}|\Sigma|^{1/2}}\exp\left\{-\frac{1}{2}(x - \mu)'\Sigma^{-1}(x - \mu)\right\} \tag{93}$$

$\mu$ and $\Sigma$ are, respectively, the mean and variance-covariance matrix of X.

Each variable in X has a marginal distribution that is univariate normal, that is, $X_i \sim N(\mu_i, \sigma_i^2)$ for $i = 1, \ldots, k$, where $\mu_i$ and $\sigma_i^2$ are the ith element of $\mu$ and the ith diagonal element of $\Sigma$, respectively.

An important special case of Eq. (93) occurs when all the X's have the same variance $\sigma^2$ and are pairwise uncorrelated. Then $\Sigma = \sigma^2I$, giving $|\Sigma| = \sigma^{2k}$ and $\Sigma^{-1} = \frac{1}{\sigma^2}I$. The multivariate pdf then simplifies to

$$f(x) = \frac{1}{(2\pi\sigma^2)^{k/2}}\exp\left\{-\frac{1}{2\sigma^2}(x - \mu)'(x - \mu)\right\} = \prod_{i=1}^{k}\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{1}{2\sigma^2}(x_i - \mu_i)^2\right\} = f(x_1)f(x_2)\cdots f(x_k) \tag{94}$$

Thus, the multivariate density is the product of the marginal densities, that is, the X's are distributed independently of one another. Zero correlations between normally distributed variables imply statistical independence. This result does not necessarily hold for variables that are not normally distributed.

Distributions of Quadratic Forms

Suppose $x \sim N(0, I)$, that is, the k variables in x have independent (standard) normal distributions, each with mean zero and unit variance.

The sum of squares $x'x = \sum_{i=1}^{k}x_i^2 \sim \chi^2_k$. Note that this is a particular quadratic form, $x'Ix$.

Suppose now that $x \sim N(0, \sigma^2I)$. Then

$$x'\bigl(\sigma^2I\bigr)^{-1}x = \frac{x_1^2}{\sigma^2} + \frac{x_2^2}{\sigma^2} + \cdots + \frac{x_k^2}{\sigma^2} \sim \chi^2_k$$

Similarly, it can be shown that if $x \sim N(0, \Sigma)$ where $\Sigma$ is a positive definite matrix, $x'\Sigma^{-1}x \sim \chi^2_k$.

Consider the quadratic form $x'Ax$ where $x \sim N(0, I)$ and A is a symmetric, idempotent matrix of rank $r \leq k$. Then it can be shown that $x'Ax \sim \chi^2_r$.

Suppose $x \sim N(0, \sigma^2I)$ and we have two quadratic forms $x'Ax$ and $x'Bx$, where A and B are symmetric idempotent matrices. We seek the condition for the two forms to be independently distributed.
Because the matrices are symmetric idempotent,

$$x'Ax = x'A'Ax = (Ax)'Ax$$

and similarly

$$x'Bx = (Bx)'Bx$$

If each of the variables in Ax has zero correlation with each variable in Bx, these variables will be distributed independently of one another, and hence any function of the one set of variables, such as $x'Ax$, will be distributed independently of any function of the other set, such as $x'Bx$.

The covariances between the variables in Ax and Bx are given by

$$E\bigl[(Ax)(Bx)'\bigr] = E(Axx'B) = \sigma^2AB$$

These covariances are all zero iff $AB = 0$.

In the same manner, we can show that if $x \sim N(0, \sigma^2I)$, $x'Ax$ is a quadratic form with A symmetric idempotent of order k, and Lx is an r-element vector where L is $r\times k$, then a necessary and sufficient condition for independence of the quadratic and linear forms is $LA = 0$.
Tutorial Exercise No. 3

Question 1
Show that
(a) $\mathrm{diag}(A\otimes B) = \mathrm{diag}(A)\otimes\mathrm{diag}(B)$
(b) For any two vectors a and b, not necessarily of the same order, $a\otimes b' = ab' = b'\otimes a$
(c) $(A\otimes B)' = A'\otimes B'$
(d) $\mathrm{tr}(A\otimes B) = \mathrm{tr}(A)\mathrm{tr}(B)$
(e) If x is an eigenvector of A and y is an eigenvector of B, then $x\otimes y$ is an eigenvector of $A\otimes B$. Is it true that every eigenvector of $A\otimes B$ is of the form $x\otimes y$, where x is an eigenvector of A and y is an eigenvector of B?

Question 2
Show that

$$\mathrm{tr}(ABCD) = (\mathrm{vec}\,D')'(C'\otimes A)\mathrm{vec}\,B = (\mathrm{vec}\,D)'(A\otimes C')\mathrm{vec}\,B'$$

when the product ABCD is defined and square.

Question 3
Let $F = (f_{ij})$ and $G = (g_{ij})$ represent $p\times q$ matrices of continuously differentiable functions of a vector $x = (x_1, \ldots, x_m)'$ of m variables. Show that

$$\frac{\partial(aF + bG)}{\partial x_k} = a\frac{\partial F}{\partial x_k} + b\frac{\partial G}{\partial x_k}$$

where a and b are constants.

Question 4
Let $X = (x_{ij})_{m\times n}$ be an $m\times n$ matrix and $A = (a_{ij})_{n\times m}$ an $n\times m$ matrix of constants. Show that

$$\frac{\partial\,\mathrm{tr}(AX)}{\partial X} = A'$$

If X is symmetric ($m = n$), show that

$$\frac{\partial\,\mathrm{tr}(AX)}{\partial X} = A + A' - \mathrm{diag}(A)$$

where

$$\mathrm{diag}(A) = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{mm} \end{pmatrix}$$

Question 5
Let $h(x) = \bigl(h_1(x), \ldots, h_n(x)\bigr)'$ represent an $n\times 1$ vector of continuously differentiable functions of $x = (x_1, \ldots, x_m)'$, and let g(y) represent a continuously differentiable function of $y = (y_1, \ldots, y_n)'$. Further, let f be the composite function defined by $f(x) = g[h(x)]$. Show that

$$\frac{\partial f}{\partial x_j} = \frac{\partial g}{\partial y'}\frac{\partial h}{\partial x_j}$$

Question 6
The result in Question 5 can be generalized by taking $g(y) = \bigl(g_1(y), \ldots, g_p(y)\bigr)'$ to be a $p\times 1$ vector of continuously differentiable functions and $f(x) = \bigl(g_1[h(x)], \ldots, g_p[h(x)]\bigr)'$ to be a $p\times 1$ vector of composite functions. Show that

$$\frac{\partial f}{\partial x'} = \frac{\partial g}{\partial y'}\frac{\partial h}{\partial x'}$$

Question 7
The result in Question 5 can be generalized in another direction. Let $H = (h_{ij})$ be an $n\times r$ matrix of continuously differentiable functions of x, and g a function of the $n\times r$ matrix $Y = (y_{ij})$. Further, let f be the composite function defined by $f(x) = g(Y) = g[H(x)]$. Show that
$$\frac{\partial f}{\partial x_j} = \mathrm{tr}\left[\left(\frac{\partial g}{\partial Y}\right)'\frac{\partial H}{\partial x_j}\right]$$

Question 8
The function $f(x_1, x_2, x_3) = x_1^2 + x_2^2 + 3x_3^2 - x_1x_2 + 2x_1x_3 + x_2x_3$ has one stationary point. Show that it is a local minimum point.

Question 9
Solve the problem

$$\max f(x, y, z) = x + 2z$$

subject to

$$g_1(x, y, z) = x + y + z = 1$$
$$g_2(x, y, z) = x^2 + y^2 + z = \frac{7}{4}$$

Question 10
Let $x \sim N(0, \Sigma)$. Show that $x'\Sigma^{-1}x \sim \chi^2_k$.

Question 11
Let $x \sim N(0, I)$ and let A be a symmetric, idempotent matrix of rank $r \leq k$. Show that $x'Ax \sim \chi^2_r$.