Pseudo-inverse of a Matrix
A symmetric p × p matrix A is positive definite if, for all x ≠ 0 in Rp,

       xT Ax > 0.

Such a matrix A has the following properties.
1. Each r × r (1 ≤ r ≤ p) leading principal submatrix

       Ar = [ a11 a12 · · · a1r ]
            [ a21 a22 · · · a2r ]
            [  .    .  · · ·  . ]
            [ ar1 ar2 · · · arr ]

   is positive definite.
2. The p eigenvalues of A, λ1, . . . , λp, are positive. Conversely, if all the eigenvalues of a symmetric matrix B are positive, then B is positive definite.
3. A has a Cholesky factorization

       A = L LT,                                  (1)

   where L is lower triangular.

4. A has a symmetric square root S:

       A = S S.                                   (2)

5. A has the spectral decomposition

       A = V D V T,                               (3)
where

       D = diag(λ1, . . . , λp) = [ λ1  0  · · ·  0 ]
                                  [  0 λ2  · · ·  0 ]
                                  [  .  .  · · ·  . ]
                                  [  0  0  · · · λp ]

is the diagonal matrix composed of the eigenvalues of A, and V is an orthogonal matrix:

       V T V = V V T = I.
6. As A = V D V T,

       |A| = |V D V T| = |V| |D| |V T| = |V|2 |D| = |D| > 0,

   since |V| = ±1 (so |V|2 = 1) and |D| = λ1 λ2 · · · λp > 0, by 2 and 5.
7. By 6., as |A| > 0, A is non-singular; that is, the inverse of A, A−1, exists, with A A−1 = A−1 A = I. In fact,

       A−1 = (V D V T)−1 = V D−1 V T,

   since V −1 = V T.
9. For x ∈ Rp, x ≠ 0,

       min_{1≤i≤p} λi ≤ (xT Ax)/(xT x) ≤ max_{1≤i≤p} λi.
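These properties are easy to check numerically. Below is a minimal sketch, assuming NumPy; the matrix A is an arbitrary positive definite example chosen for illustration, not one from these notes. It verifies properties 2, 3, 5, 6, and 9.

    import numpy as np

    # An arbitrary symmetric positive definite matrix, chosen for illustration.
    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

    # Property 2: all eigenvalues are positive.
    lam, V = np.linalg.eigh(A)
    assert (lam > 0).all()

    # Property 3 (eq. 1): Cholesky factorization A = L L^T.
    L = np.linalg.cholesky(A)
    assert np.allclose(L @ L.T, A)

    # Property 5 (eq. 3): spectral decomposition A = V D V^T with V orthogonal.
    assert np.allclose(V @ np.diag(lam) @ V.T, A)
    assert np.allclose(V.T @ V, np.eye(3))

    # Property 6: det(A) equals the product of the eigenvalues and is positive.
    assert np.isclose(np.linalg.det(A), lam.prod())

    # Property 9: the Rayleigh quotient lies between the extreme eigenvalues.
    x = np.random.default_rng(0).standard_normal(3)
    q = x @ A @ x / (x @ x)
    assert lam.min() <= q + 1e-12 and q <= lam.max() + 1e-12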
TEST FOR POSITIVE AND NEGATIVE DEFINITENESS

We want a computationally simple test for a symmetric matrix to induce a positive definite quadratic
form. We first treat the case of 2 × 2 matrices where the result is simple. Then, we present the conditions
for n × n symmetric matrices to be positive definite. Finally, we state the corresponding condition for the
symmetric matrix to be negative definite or neither. Before treating these cases, we recall the relationship
between the eigenvalues and the determinant and trace of a matrix.
For a matrix A, the determinant and trace are the product and sum of the eigenvalues:

       det(A) = λ1 · · · λn, and
       tr(A) = λ1 + · · · + λn,

where λ1, . . . , λn are the n eigenvalues of A. (Here we list an eigenvalue twice if it has multiplicity two, etc.)
1. TWO BY TWO MATRICES
Let

       A = [ a b ]
           [ b c ]

be a general 2 × 2 symmetric matrix. We will see in general that the quadratic form
for A is positive definite if and only if all the eigenvalues are positive. Since det(A) = λ1 λ2, it is necessary
that the determinant of A be positive. On the other hand if the determinant is positive, then either (i) both
eigenvalues are positive, or (ii) both eigenvalues are negative. Since tr(A) = λ1 + λ2 , if det(A) > 0 and
tr(A) > 0 then both eigenvalues must be positive. We want to give this in a slightly different form that is
more like what we get in the n × n case. If det(A) = ac − b2 > 0, then ac > b2 ≥ 0, and a and c must
have the same sign. Thus det(A) > 0 and tr(A) > 0 is equivalent to the condition that det(A) > 0 and a > 0. Therefore, a necessary and sufficient condition for the quadratic form of a symmetric 2 × 2 matrix to be positive definite is that det(A) > 0 and a > 0.
We want to see the connection between the condition on A to be positive definite and completion of the
squares.
       Q(x, y) = (x, y) A (x, y)T
               = a x2 + 2b x y + c y2
               = a (x + (b/a) y)2 + ((ac − b2)/a) y2.
This expresses the quadratic form as a sum of two squares by means of "completion of the squares". If a > 0 and det(A) > 0, then both these coefficients are positive and the form is positive definite. It can also be checked that a and (ac − b2)/a are the pivots when A is row reduced. We can summarize these two results in the following theorem.
Theorem 1. Let A be a 2 × 2 symmetric matrix and Q(x) = xT Ax the related quadratic form. The
following conditions are equivalent:
(i) Q(x) is positive definite.
(ii) Both eigenvalues of A are positive.
(iii) Both a x2 and (x, y)A(x, y)T are positive definite.
(iv) Both det(A) > 0 and a > 0.
(v) Both the pivots obtained without row exchanges or scalar multiplications of rows are positive.
(vi) By completion of the squares, Q(x) can be represented as a sum of two squares, with both positive
coefficients.
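As a numerical illustration of the equivalences in Theorem 1, here is a minimal sketch assuming NumPy; the entries a, b, c are arbitrary illustrative values, not from these notes.

    import numpy as np

    # Arbitrary sample entries for a symmetric 2 x 2 matrix.
    a, b, c = 3.0, 1.0, 2.0
    A = np.array([[a, b], [b, c]])

    # (ii): both eigenvalues positive.
    cond_ii = bool((np.linalg.eigvalsh(A) > 0).all())

    # (iv): det(A) > 0 and a > 0.
    cond_iv = np.linalg.det(A) > 0 and a > 0

    # (v): the pivots from row reduction are a and (ac - b^2)/a.
    cond_v = a > 0 and (a * c - b * b) / a > 0

    print(cond_ii, cond_iv, cond_v)   # True True True -- the conditions agree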
The following theorem gives conditions for the quadratic form to be positive definite in terms of the determinants of the leading principal k × k submatrices Ak of A.
Theorem 2. Let A be an n × n symmetric matrix and Q(x) = xT Ax the related quadratic form. The
following conditions are equivalent:
(i) Q(x) is positive definite.
(ii) All the eigenvalues of A are positive.
(iii) For each 1 ≤ k ≤ n, the quadratic form associated to Ak is positive definite.
(iv) The determinants det(Ak) > 0 for 1 ≤ k ≤ n.
(v) All the pivots obtained without row exchanges or scalar multiplications of rows are positive.
(vi) By completion of the squares, Q(x) can be represented as a sum of squares, with all positive coefficients,

       Q(x1, . . . , xn) = (x1, . . . , xn) UT D U (x1, . . . , xn)T
                        = p1 (x1 + u1,2 x2 + · · · + u1,n xn)2
                        + p2 (x2 + u2,3 x3 + · · · + u2,n xn)2
                        + · · · + pn xn2.
Proof. Diagonalizing A = V D V T and setting y = V T x gives Q(x) = λ1 y12 + · · · + λn yn2. From this representation, it is clear that Q is positive definite if and only if all the eigenvalues are positive, i.e., conditions (i) and (ii) are equivalent.
Assume Q is positive definite. Then for any 1 ≤ k ≤ n,

       0 < Q(x1, . . . , xk, 0, . . . , 0)
         = (x1, . . . , xk, 0, . . . , 0) A (x1, . . . , xk, 0, . . . , 0)T
         = (x1, . . . , xk) Ak (x1, . . . , xk)T

for all (x1, . . . , xk) ≠ 0. This shows that (i) implies (iii).
Assume (iii). Then all the eigenvalues of Ak must be positive since (i) and (ii) are equivalent for Ak .
Notice that the eigenvalues of Ak are not necessarily eigenvalues of A. Therefore the determinant of Ak is
positive since it is the product of its eigenvalues. This is true for all k, so this shows that (iii) implies (iv).
Assume (iv). When A is row reduced, it also row reduces all the Ak since we do not perform any row
exchanges. Therefore the pivots of the Ak are pivots of A. Also, the determinant of Ak is the product of the
first k pivots, det(Ak) = p1 · · · pk. Therefore

       pk = (p1 · · · pk)/(p1 · · · pk−1) = det(Ak)/det(Ak−1) > 0

for all k. This proves (v).
Now assume (v). Row reduction can be realized by matrix multiplication on the left by a lower triangular
matrix. Therefore, we can write A = LDU where D is the diagonal matrix made up of the pivots, L is
lower triangular with ones on the diagonal, and U is upper triangular with ones on the diagonal. Since A is
symmetric, LDU = A = AT = UT DLT . It can then be shown that UT = L. Therefore,
       Q(x1, . . . , xn) = (x1, . . . , xn) UT D U (x1, . . . , xn)T
                        = p1 (x1 + u1,2 x2 + · · · + u1,n xn)2
                        + p2 (x2 + u2,3 x3 + · · · + u2,n xn)2
                        + · · · + pn xn2.
Thus, we can “complete the squares”, expressing Q as the sum of squares with the pivots as the coefficients.
If the pivots are all positive, then all the coefficients pi are positive. Thus (v) implies (vi). Note that z = Ux is a non-orthonormal change of basis that makes the quadratic form diagonal.
If Q(x) can be written as a sum of squares of the above form with positive coefficients, then Q(x) > 0 for all x ≠ 0, so the quadratic form is positive definite. Thus, (vi) implies (i).
Example 3. Let

       A = [  2 −1  0 ]
           [ −1  2 −1 ]
           [  0 −1  2 ].
The eigenvalues are 2 and 2 ± √2, which are all positive, showing that the quadratic form induced by A is positive definite. (Notice that these eigenvalues are not especially easy to calculate.)
We can row reduce to represent A as the product of lower triangular, diagonal, and upper triangular
matrices.
       A = [   1    0   0 ] [ 2  0   0  ] [ 1 −1/2   0  ]
           [ −1/2   1   0 ] [ 0 3/2  0  ] [ 0   1  −2/3 ]
           [   0  −2/3  1 ] [ 0  0  4/3 ] [ 0   0    1  ].
Since the pivots on the diagonal are all positive, the quadratic form induced by A is positive definite, and

       xT Ax = 2 (x1 − (1/2) x2)2 + (3/2) (x2 − (2/3) x3)2 + (4/3) x32.
The principal submatrices and their determinants are

       A1 = (2),           det(A1) = 2 > 0,

       A2 = [  2 −1 ]
            [ −1  2 ],     det(A2) = 3 > 0,

       A3 = A,             det(A3) = det(A) = 2 · (3/2) · (4/3) = 4 > 0.
Since these are all positive, the quadratic form induced by A is positive definite.
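These checks are easy to reproduce numerically. A minimal sketch, assuming NumPy:

    import numpy as np

    A = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])

    # Eigenvalues: 2 - sqrt(2), 2, 2 + sqrt(2), all positive.
    print(np.linalg.eigvalsh(A))                     # [0.586..., 2.0, 3.414...]

    # Leading principal minors det(A_k): 2, 3, 4, all positive.
    minors = [np.linalg.det(A[:k, :k]) for k in (1, 2, 3)]
    print(minors)

    # Pivots p_k = det(A_k)/det(A_{k-1}): 2, 3/2, 4/3, all positive.
    pivots = [minors[0], minors[1] / minors[0], minors[2] / minors[1]]
    print(pivots)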
4. PROBLEMS
1. Decide whether the following matrices are positive definite, negative definite, or neither:
       (a) [  2 −1 −1 ]      (b) [  2 −1 −1 ]
           [ −1  2 −1 ]          [ −1  2  1 ]
           [ −1 −1  2 ]          [ −1  1  2 ]

       (c) [ 1 2 3 ]          (d) [ 1  2  0  0 ]
           [ 2 5 4 ]              [ 2  6 −2  0 ]
           [ 3 4 9 ]              [ 0 −2  5 −2 ]
                                  [ 0  0 −2  3 ]
The Moore-Penrose Pseudoinverse (Math 33A: Laub)

Theorem: Given any A ∈ IRm×n, there exists a unique matrix G ∈ IRn×m (called the Moore-Penrose pseudoinverse of A and written A+) satisfying the four Penrose conditions:

(P1) AGA = A
(P2) GAG = G
(P3) (AG)T = AG
(P4) (GA)T = GA
Note that the above theorem is not constructive. But it does provide a checkable cri-
terion, i.e., given a matrix G that purports to be the pseudoinverse of A, one need simply
verify the four Penrose conditions (P1)–(P4) above. This verification is often relatively
straightforward.
" #
1
Example: Consider A = . Verify directly that A+ = [ 15 , 25 ]. Note that other left
2
inverses (for example, A−L = [3 , −1]) satisfy properties (P1), (P2), and (P4) but not (P3).
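This verification can also be done numerically. A minimal sketch, assuming NumPy; the helper penrose_ok is introduced here for illustration and is not part of the original notes.

    import numpy as np

    def penrose_ok(A, G):
        """Check each of the four Penrose conditions (P1)-(P4)."""
        return (np.allclose(A @ G @ A, A),         # (P1)
                np.allclose(G @ A @ G, G),         # (P2)
                np.allclose((A @ G).T, A @ G),     # (P3)
                np.allclose((G @ A).T, G @ A))     # (P4)

    A  = np.array([[1.0], [2.0]])        # the 2 x 1 matrix from the example
    G  = np.array([[1/5, 2/5]])          # the claimed pseudoinverse A+
    GL = np.array([[3.0, -1.0]])         # another left inverse of A

    print(penrose_ok(A, G))    # (True, True, True, True)
    print(penrose_ok(A, GL))   # (True, True, False, True) -- (P3) fails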
Still another characterization of A+ is given in the following theorem, whose proof can be found on p. 19 of Albert, A., Regression and the Moore-Penrose Pseudoinverse, Academic Press, New York, 1972. We refer to this as the "limit definition of the pseudoinverse."

Theorem: A+ = lim_{δ→0} (AT A + δ2 I)−1 AT = lim_{δ→0} AT (AAT + δ2 I)−1.
2 Examples
Each of the following can be derived or verified by using the above theorems or characterizations.
Example 1: A+ = AT (AAT)−1 if A is onto, i.e., has linearly independent rows (A is right invertible).

Example 2: A+ = (AT A)−1 AT if A is 1-1, i.e., has linearly independent columns (A is left invertible).
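Both formulas are easy to confirm against a library pseudoinverse. A minimal sketch, assuming NumPy; the matrices are arbitrary full-rank examples.

    import numpy as np

    # A 1-1 matrix (linearly independent columns): Example 2 applies.
    A = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
    assert np.allclose(np.linalg.inv(A.T @ A) @ A.T, np.linalg.pinv(A))

    # An onto matrix (linearly independent rows): Example 1 applies.
    B = A.T
    assert np.allclose(B.T @ np.linalg.inv(B @ B.T), np.linalg.pinv(B))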
3 Some Properties
Theorem: Let A ∈ IRm×n and suppose U ∈ IRm×m , V ∈ IRn×n are orthogonal (M is
orthogonal if M T = M −1 ). Then
(U AV )+ = V T A+ U T .
Proof: Simply verify that the expression above does indeed satisfy each of the four Penrose
conditions.
Theorem: (A+)+ = A.

Theorem: (AT)+ = (A+)T.

Both of the above two results can be proved using the limit definition of the pseudoinverse. The proof of the first result is not particularly easy, nor does it have the virtue of being especially illuminating. The interested reader can consult the proof in Albert, p. 27. The proof of the second result is as follows:
       (AT)+ = lim_{δ→0} (AAT + δ2 I)−1 A
             = lim_{δ→0} [AT (AAT + δ2 I)−1]T
             = [lim_{δ→0} AT (AAT + δ2 I)−1]T
             = (A+)T.
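The limit definition also suggests a practical approximation: for small δ, the regularized inverse is already close to A+. A minimal sketch, assuming NumPy, with δ fixed at a small value rather than taken to zero:

    import numpy as np

    # A rank-deficient matrix, so no one-sided inverse exists.
    A = np.array([[1.0, 0.0],
                  [1.0, 0.0]])

    delta = 1e-6
    # Regularized formula from the limit definition; approaches A+ as delta -> 0.
    G = np.linalg.inv(A.T @ A + delta**2 * np.eye(2)) @ A.T

    print(G)                   # approx [[0.5, 0.5], [0.0, 0.0]]
    print(np.linalg.pinv(A))   # the exact pseudoinverse, for comparison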
Note now that by combining the last two theorems we can, in theory at least, compute the Moore-Penrose pseudoinverse of any matrix (since AAT and AT A are symmetric). Alternatively, we could compute the pseudoinverse by first computing the SVD of A as A = U Σ V T and then, by the first theorem of this section, A+ = V Σ+ U T, where

       Σ+ = [ S−1 0 ]
            [  0  0 ].

This is the way it's done in Matlab; the command is called pinv.
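A sketch of the SVD route, assuming NumPy; the tolerance tol and the helper pinv_svd are illustrative choices, not part of the original notes.

    import numpy as np

    def pinv_svd(A, tol=1e-12):
        """Pseudoinverse via the SVD: invert only the nonzero singular values."""
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        s_plus = np.array([1.0 / x if x > tol else 0.0 for x in s])
        return Vt.T @ np.diag(s_plus) @ U.T

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
    assert np.allclose(pinv_svd(A), np.linalg.pinv(A))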
Theorem: Suppose A ∈ IRm×n , b ∈ IRm . Then R(b) ⊆ R(A) if and only if AA+ b = b.
Proof: Suppose R(b) ⊆ R(A). Take arbitrary γ ∈ IR so that γb ∈ R(b) ⊆ R(A). Then
there exists a vector v ∈ IRn such that Av = γb. Thus we have
γb = Av = AA+ Av = AA+ γb
where one of the Penrose properties is used above. Since γ ∈ IR was arbitrary, we have
shown that b = AA+ b. To prove the converse, assume now that AA+ b = b. Then it is clear
that b ∈ R(b) and hence
b = AA+ b ∈ R(A) .
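The criterion is easy to apply in practice. A minimal sketch, assuming NumPy; the matrix and vectors are arbitrary illustrative choices.

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 0.0]])          # R(A) = span{(1, 1)^T}
    Ap = np.linalg.pinv(A)

    b_in  = np.array([3.0, 3.0])        # lies in R(A)
    b_out = np.array([1.0, 0.0])        # does not lie in R(A)

    print(np.allclose(A @ Ap @ b_in, b_in))    # True:  AA+b = b
    print(np.allclose(A @ Ap @ b_out, b_out))  # False: AA+b != b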
We close with some of the principal results concerning existence and uniqueness of solutions
to the general matrix linear system Ax = b, i.e., the solution of m equations in n unknowns.
Theorem: (Existence) The system Ax = b has a solution if and only if R(b) ⊆ R(A) or, equivalently, if and only if AA+b = b.

Proof: The subspace inclusion criterion follows essentially from the definition of the range of a matrix. The matrix criterion is from the previous theorem.
Theorem: (Solution) Let A ∈ IRm×n, b ∈ IRm, and suppose that AA+b = b. Then any vector of the form

       x = A+b + (I − A+A)y, where y ∈ IRn is arbitrary,     (4)
is a solution of
Ax = b. (5)
Furthermore, all solutions of (5) are of this form.
Proof: If x = A+b + (I − A+A)y, then

       Ax = AA+b + (A − AA+A)y = AA+b = b,

using Penrose condition (P1) and the hypothesis AA+b = b, so every vector of the form (4) is a solution of (5). Conversely, if x is any solution of (5), then A+Ax = A+b, so

       x = A+b + (x − A+Ax) = A+b + (I − A+A)x,

and this is clearly of the form (4).

Theorem: (Uniqueness) The solution (4) of Ax = b is unique if and only if A+A = I or, equivalently, if and only if A is 1-1.

Proof: The first equivalence is immediate from the form of the general solution in (4). The second follows by noting that the n × n matrix A+A = I only if r = n, where r = rank(A) (recall r ≤ n). But rank(A) = n if and only if A is 1-1 or N(A) = 0.
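The solution formula (4) can be checked numerically. A minimal sketch, assuming NumPy; the system below is an arbitrary underdetermined example.

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, 1.0]])     # 2 x 3: more unknowns than equations
    b = np.array([1.0, 1.0])
    Ap = np.linalg.pinv(A)
    assert np.allclose(A @ Ap @ b, b)   # existence criterion AA+b = b holds

    # Any choice of y in (4) yields a solution of Ax = b.
    rng = np.random.default_rng(1)
    for _ in range(3):
        y = rng.standard_normal(3)
        x = Ap @ b + (np.eye(3) - Ap @ A) @ y
        assert np.allclose(A @ x, b)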
EXERCISES:
" #
1 1
1. Use the limit definition of the pseudoinverse to compute the pseudoinverse of .
2 2
2. If x, y ∈ IRn, show that (x yT)+ = (xT x)+ (yT y)+ y xT.
3. For A ∈ IRm×n , prove that R(A) = R(AAT ) using only definitions and elementary
properties of the Moore-Penrose pseudoinverse.
The Nullspace of a Matrix
The solution sets of homogeneous linear systems provide an important source of
vector spaces. Let A be an m by n matrix, and consider the homogeneous system

       Ax = 0.

Since A is m by n, the set of all vectors x which satisfy this equation forms a subset of Rn. (This subset is nonempty, since it clearly contains the zero vector: x = 0 always satisfies Ax = 0.) This subset actually forms a subspace of Rn, called the nullspace of the matrix A and denoted N(A). To prove that N(A) is a subspace of Rn, closure under both addition and scalar multiplication must be established. If x1 and x2 are in N(A), then, by definition, Ax1 = 0 and Ax2 = 0. Adding these equations yields

       A(x1 + x2) = Ax1 + Ax2 = 0 + 0 = 0,

so x1 + x2 is in N(A) as well. Similarly, for any scalar k, A(k x1) = k(Ax1) = k 0 = 0, so k x1 is in N(A), and N(A) is closed under scalar multiplication.
Example: The set of solutions of a homogeneous system with a given 2 by 4 coefficient matrix forms a subspace of Rn for some n. State the value of n and explicitly determine this subspace.

Since the coefficient matrix is 2 by 4, x must be a 4-vector. Thus, n = 4: the nullspace of this matrix is a subspace of R4. To determine this subspace, the equation is solved by first row-reducing the given matrix. If x3 and x4 are taken as free variables, the rows of the reduced matrix express x1 and x2 in terms of x3 and x4, and the set of solutions of the given homogeneous system can then be written as the set of all linear combinations of two basis vectors, one for each free variable.
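Since the example's coefficient matrix is not reproduced above, the sketch below uses an arbitrary 2 by 4 stand-in to illustrate the computation, assuming SymPy:

    from sympy import Matrix

    # Arbitrary 2 x 4 coefficient matrix, standing in for the original example.
    A = Matrix([[1, 2, 0, 1],
                [0, 0, 1, 3]])

    # nullspace() row-reduces A and returns one basis vector per free variable.
    for v in A.nullspace():
        print(v.T)    # Matrix([[-2, 1, 0, 0]]) and Matrix([[-1, 0, -3, 1]])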
In another example, the coefficient matrix row-reduces to a 2 by 2 upper triangular matrix with nonzero entries on the diagonal. The second row implies that x2 = 0, and back-substituting this into the first row implies that x1 = 0 also. Since the only solution of Ax = 0 is x = 0, the nullspace of A consists of the zero vector alone. This subspace, {0}, is called the trivial subspace (of R2).
In a final example, the bottom row of the coefficient matrix contains only zeros, so x2 can be taken as a free variable. The first row then gives x1 in terms of x2, so the nullspace consists of all scalar multiples of a single vector: a line through the origin in R2.