Positive Definite Matrices

Notes on Linear Algebra
Chia-Ping Chen
Department of Computer Science and Engineering
National Sun Yat-Sen University
Kaohsiung, Taiwan ROC
Positive Definite Matrices – p. 1
Introduction
The signs of the eigenvalues of a matrix can be
important.
For example, in a system of differential equations,
the signs determine whether the system is stable.
The signs can also be related to the minima, maxima,
and saddle points of a multi-variate function.
Expansion of a Function
For a "well-behaved" function of $x, y$, the value near the origin can be written as
$$f(x, y) = f(0, 0) + \frac{\partial f}{\partial x}x + \frac{\partial f}{\partial y}y + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}x^2 + \frac{\partial^2 f}{\partial x \partial y}xy + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}y^2 + \cdots,$$
where the derivatives are evaluated at the origin.
Near a point $(\alpha, \beta)$, the value can be written as
$$f(x, y) = f(\alpha, \beta) + \frac{\partial f}{\partial x}(x - \alpha) + \frac{\partial f}{\partial y}(y - \beta) + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(x - \alpha)^2 + \frac{\partial^2 f}{\partial x \partial y}(x - \alpha)(y - \beta) + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(y - \beta)^2 + \cdots,$$
where the derivatives are evaluated at $(\alpha, \beta)$.
Stationary Points
By definition, a point $(\alpha, \beta)$ is stationary if the first-order partial derivatives evaluated at $(\alpha, \beta)$ are all 0.
A stationary point is a maximum, a minimum, or a saddle point.
It is a maximum if the values at all nearby points are not greater.
It is a minimum if the values at all nearby points are not smaller.
It is a saddle point if it is neither a maximum nor a minimum.
How do we decide?
Expansion Near a Stationary Point
The expansion of a function near a stationary point is
$$f(x, y) = f(\alpha, \beta) + \frac{\partial f}{\partial x}(x - \alpha) + \frac{\partial f}{\partial y}(y - \beta) + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(x - \alpha)^2 + \frac{\partial^2 f}{\partial x \partial y}(x - \alpha)(y - \beta) + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(y - \beta)^2 + \cdots$$
Since the first-order derivatives vanish at a stationary point, this reduces to
$$f(x, y) = f_0 + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(\delta x)^2 + \frac{\partial^2 f}{\partial x \partial y}(\delta x)(\delta y) + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(\delta y)^2,$$
where $f_0 = f(\alpha, \beta)$, $\delta x = x - \alpha$, and $\delta y = y - \beta$.
Hessian Matrix
The Hessian matrix at a point $\mathbf{x} = (x, y)$ is defined by
$$H(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2mm] \dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}$$
It follows that near a stationary point $\mathbf{x}$,
$$\delta f = \frac{1}{2}\,\delta\mathbf{x}^T H(\mathbf{x})\,\delta\mathbf{x}.$$
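The classification of a stationary point by its Hessian can be sketched numerically. This is a minimal illustration, not part of the original slides; the example function $f(x, y) = x^2 + xy + 2y^2$ is chosen arbitrarily, and its Hessian is constant.

```python
import numpy as np

# Hessian of the illustrative quadratic f(x, y) = x^2 + x*y + 2*y^2
H = np.array([[2.0, 1.0],
              [1.0, 4.0]])

# eigenvalues of a symmetric matrix, in ascending order
eigvals = np.linalg.eigvalsh(H)

if np.all(eigvals > 0):
    kind = "minimum"        # positive definite Hessian
elif np.all(eigvals < 0):
    kind = "maximum"        # negative definite Hessian
elif np.any(eigvals > 0) and np.any(eigvals < 0):
    kind = "saddle point"   # indefinite Hessian
else:
    kind = "degenerate"     # some zero eigenvalue: the test is inconclusive

print(kind)
```

Here both eigenvalues are positive, so the origin is a minimum of this particular $f$.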
Quadratic Form
By definition, a function $f$ is quadratic in variables $x, y$ if $f$ has the form
$$f(x, y) = ax^2 + 2bxy + cy^2.$$
$f(x, y)$ can be expressed via a real symmetric matrix:
$$f(x, y) = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.$$
Note that the expansion of a function near a stationary point is quadratic in the variables $\delta\mathbf{x} = (\delta x, \delta y)$.
Positive/Negative (Semi-)Definite
By definition, a quadratic function f is positive definite
if its value is positive for all points except for the origin.
f is said to be positive semidefinite if the values are
non-negative.
f is said to be negative definite if the values are
negative except for the origin.
f is said to be negative semidefinite if the values are
non-positive.
Necessary Conditions
For $f(x, y) = ax^2 + 2bxy + cy^2$ to be positive definite, $a > 0$ and $c > 0$.
This can be shown by looking at the points $(1, 0)$ and $(0, 1)$.
But these are merely necessary conditions, as the following example demonstrates:
$$f_0(x, y) = x^2 - 4xy + y^2, \qquad f_0(1, 1) = -2 < 0.$$
Sufficient Condition
We can express $f$ as a sum of squares by completing the square:
$$f(x, y) = ax^2 + 2bxy + cy^2 = a\left(x + \frac{b}{a}y\right)^2 + \left(c - \frac{b^2}{a}\right)y^2.$$
Since squares are non-negative, we see from the above that $f$ is positive definite if (sufficient condition)
$$a > 0, \quad ac > b^2.$$
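The 2-by-2 test can be packaged as a one-line check. This is an illustrative sketch (the function name and the two example forms are not from the slides):

```python
# f(x, y) = a x^2 + 2 b x y + c y^2 is positive definite
# iff a > 0 and a*c > b^2 (which also forces c > 0)
def is_positive_definite_2x2(a, b, c):
    return a > 0 and a * c > b * b

# f = 2x^2 + 2xy + 3y^2: a=2, b=1, c=3, and 2*3 > 1^2
print(is_positive_definite_2x2(2, 1, 3))

# f = x^2 - 4xy + y^2: a=1, b=-2, c=1, but 1*1 < (-2)^2
print(is_positive_definite_2x2(1, -2, 1))
```

The second call rejects exactly the counterexample from the "Necessary Conditions" slide, even though $a > 0$ and $c > 0$ there.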
Negative Definite
$f(x, y) = ax^2 + 2bxy + cy^2$ is negative definite if $f(x, y) < 0$ other than at $(x, y) = (0, 0)$.
$f$ is negative definite iff $-f$ is positive definite.
A sufficient condition for negative definiteness is
$$(-a) > 0, \quad (-a)(-c) > (-b)^2.$$
Equivalently, getting rid of the negative signs,
$$a < 0, \quad ac > b^2.$$
example and diagram
Singular Case
We have a singular case if
$$ac = b^2.$$
If $a > 0$, $f$ is still non-negative everywhere, since
$$f(x, y) = a\left(x + \frac{b}{a}y\right)^2.$$
The surface $z = f(x, y)$ degenerates from a bowl to a valley, along the line $ax + by = 0$.
$f$ is said to be positive semidefinite (psd) if $a > 0$ and negative semidefinite if $a < 0$.
example and diagram
Saddle Point
The remaining case is when
$$ac < b^2.$$
In this case, $(0, 0)$ is a saddle point.
Along one direction $(0, 0)$ is a minimum, and along another direction $(0, 0)$ is a maximum.
$f$ is said to be indefinite.
Example and Diagram
Let
$$f(x, y) = x^2 - y^2.$$
Note
$$ac = -1 < 0 = b^2.$$
The point $(0, 0)$ is a minimum along the $x$-axis and a maximum along the $y$-axis.
Matrix and Quadratic Form
A matrix $A$ defines a quadratic form,
$$\mathbf{x}^T A \mathbf{x} = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \sum_{i,j=1}^{n} a_{ij} x_i x_j.$$
Note that for $i \neq j$, the coefficient of $x_i x_j$ is $a_{ij} + a_{ji}$.
Given a quadratic form, we make $A$ symmetric by requiring $a_{ij} = a_{ji}$.
The positive-definiteness of a matrix is defined via $f(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$.
Positive Definite Matrices
A real symmetric matrix $A$ is said to be positive definite if $\mathbf{x}^T A \mathbf{x} > 0$ except for $\mathbf{x} = \mathbf{0}$.
From the earlier discussion, in the two-dimensional case,
$$A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$$
is positive definite iff $a > 0$ and $ac > b^2$.
We generalize to the case that $A$ is $n \times n$.
Conditions for Positive Definiteness
The following are equivalent conditions for positive
definiteness.
All eigenvalues are positive.
All determinants of the principal submatrices are
positive.
All pivots are positive.
example
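The three equivalent conditions can be checked side by side with NumPy. This is a sketch on an arbitrary example matrix (not one from the slides); the pivots are computed from the leading minors via $d_k = |A_k|/|A_{k-1}|$, as derived on a later slide.

```python
import numpy as np

# an illustrative positive definite matrix
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

# condition 1: all eigenvalues positive
eig_ok = np.all(np.linalg.eigvalsh(A) > 0)

# condition 2: all leading principal minors |A_k| positive
minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
minor_ok = all(m > 0 for m in minors)

# condition 3: all pivots positive, using d_k = |A_k| / |A_{k-1}|
pivots = [minors[0]] + [minors[k] / minors[k - 1] for k in range(1, len(minors))]
pivot_ok = all(d > 0 for d in pivots)

print(eig_ok, minor_ok, pivot_ok)
```

For this matrix the minors are 2, 3, 4, so the pivots are 2, 3/2, 4/3, and all three tests agree.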
Positive Eigenvalues
If a matrix is positive definite, then it has all positive eigenvalues.
Let $\mathbf{x}_i$ be an eigenvector with eigenvalue $\lambda_i$. Then
$$\mathbf{x}_i^T A \mathbf{x}_i = \lambda_i (\mathbf{x}_i^T \mathbf{x}_i) > 0 \;\Rightarrow\; \lambda_i > 0.$$
Conversely, if all eigenvalues are positive, then the matrix is positive definite.
Via a complete set of eigenvectors, with $\mathbf{c} = Q^T \mathbf{x}$,
$$\mathbf{x}^T A \mathbf{x} = \mathbf{x}^T Q \Lambda Q^T \mathbf{x} = \sum_i c_i^2 \lambda_i > 0.$$
example
Positive Determinants
If $A$ is positive definite, then the leading principal submatrices $A_k$ have positive determinants.
$A_k$ is positive definite. To prove this, let $\mathbf{x}$ be a vector with the last $n - k$ components being 0:
$$\mathbf{x}^T A \mathbf{x} = \begin{bmatrix} \mathbf{x}_k^T & \mathbf{0} \end{bmatrix} \begin{bmatrix} A_k & * \\ * & * \end{bmatrix} \begin{bmatrix} \mathbf{x}_k \\ \mathbf{0} \end{bmatrix} = \mathbf{x}_k^T A_k \mathbf{x}_k > 0.$$
Since all eigenvalues of $A_k$ are positive,
$$|A_k| = \prod_i \lambda_i > 0.$$
example
Positive Pivots
Suppose $A$ is positive definite; then all pivots $d_i$ are positive:
$$d_k = \frac{|A_k|}{|A_{k-1}|} > 0,$$
since $|A_k| > 0$ for all $k$.
Conversely, if $d_i > 0$ for all $i$, then $A$ is positive definite:
$$A = LDL^T \;\Rightarrow\; \mathbf{x}^T A \mathbf{x} = \sum_i d_i (L^T \mathbf{x})_i^2 > 0.$$
example
Relation to Least Squares
For a least squares problem $R\mathbf{x} = \mathbf{b}$ we solve the normal equation
$$R^T R \bar{\mathbf{x}} = R^T \mathbf{b}.$$
Note that the matrix $A = R^T R$ is symmetric.
(theorem) $A$ is positive definite if $R$ has linearly independent columns:
$$\mathbf{x}^T A \mathbf{x} = \mathbf{x}^T R^T R \mathbf{x} = (R\mathbf{x})^T (R\mathbf{x}) \begin{cases} = 0, & \mathbf{x} = \mathbf{0}, \\ > 0, & \mathbf{x} \neq \mathbf{0}. \end{cases}$$
example
Cholesky Decomposition
(Cholesky decomposition) A positive definite matrix is the product of a lower-triangular matrix and its transpose.
A special case comes from the LDU decomposition:
$$A = LDL^T = LD^{1/2} D^{1/2} L^T = R^T R.$$
There are infinitely many ways to decompose a positive definite $A = R^T R$. In fact, $R' = QR$, where $Q$ is orthogonal, also satisfies
$$A = R'^T R' = R^T Q^T Q R = R^T R.$$
example
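A quick numerical sketch of both points, using NumPy on an arbitrary 2-by-2 positive definite matrix (the matrix and the particular rotation $Q$ are illustrative, not from the slides):

```python
import numpy as np

# positive definite: 4 > 0 and 4*3 > 2^2
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# numpy returns the lower-triangular factor L with A = L @ L.T,
# so R = L.T matches the A = R^T R convention used above
L = np.linalg.cholesky(A)
R = L.T
print(np.allclose(R.T @ R, A))

# any orthogonal Q gives another factor R' = Q R with A = R'^T R'
Q = np.array([[0.0, -1.0],
              [1.0, 0.0]])   # a 90-degree rotation
R2 = Q @ R
print(np.allclose(R2.T @ R2, A))
```

Note that `R2` is no longer triangular; the Cholesky factor is the unique triangular choice with positive diagonal.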
Ellipsoids in n Dimensions
Consider the equation $\mathbf{x}^T A \mathbf{x} = 1$, where $A$ is positive definite.
If $A$ is diagonal, the graph is easily seen to be an ellipsoid.
If $A$ is not diagonal, then using the eigenvectors as the columns of $Q$, we have
$$\mathbf{x}^T A \mathbf{x} = \mathbf{x}^T Q \Lambda Q^T \mathbf{x} = \mathbf{y}^T \Lambda \mathbf{y} = \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2.$$
$y_i = \mathbf{q}_i^T \mathbf{x}$ is the component of $\mathbf{x}$ along the $i$th eigenvector $\mathbf{q}_i$.
Principal Axes
The axes of the ellipsoid defined by $\mathbf{x}^T A \mathbf{x} = 1$ point along the eigenvectors $\mathbf{q}_i$ of $A$.
They are called principal axes.
The principal axes are mutually orthogonal.
The semi-axis along $\mathbf{q}_i$ has length $1/\sqrt{\lambda_i}$.
example
Semidefinite Matrices
A matrix is said to be positive semidefinite if $\mathbf{x}^T A \mathbf{x} \geq 0$ for all $\mathbf{x}$.
Each of the following conditions is equivalent to positive semidefiniteness.
All eigenvalues are non-negative.
$|A_k| \geq 0$ for all principal submatrices $A_k$ (here all principal submatrices must be checked, not only the leading ones).
All pivots are non-negative.
$A = R^T R$ for some $R$.
Example
The following matrix is positive semidefinite.
$$A = \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}$$
quadratic form:
$$\mathbf{x}^T A \mathbf{x} = (x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_3 - x_1)^2 \geq 0.$$
eigenvalues: $0, 3, 3 \geq 0$; pivots: $2, \frac{3}{2} \geq 0$.
determinants: $|A_1| = 2$, $|A_2| = 3$, $|A_3| = 0 \geq 0$.
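The claims on this slide are easy to verify numerically; a minimal check of the eigenvalues and of the sum-of-squares identity (the random test point is illustrative):

```python
import numpy as np

A = np.array([[2.0, -1.0, -1.0],
              [-1.0, 2.0, -1.0],
              [-1.0, -1.0, 2.0]])

# eigenvalues in ascending order: 0, 3, 3
eigvals = np.linalg.eigvalsh(A)
print(np.round(eigvals, 6))

# the quadratic form equals (x1-x2)^2 + (x2-x3)^2 + (x3-x1)^2
rng = np.random.default_rng(0)
x = rng.standard_normal(3)
q_matrix = x @ A @ x
q_squares = (x[0] - x[1])**2 + (x[1] - x[2])**2 + (x[2] - x[0])**2
print(np.isclose(q_matrix, q_squares))
```

The zero eigenvalue (eigenvector $(1, 1, 1)$) is why the matrix is only semidefinite, not definite.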
Congruence Transform
The congruence transform of $A$ by a non-singular $C$ is defined by
$$A \to C^T A C.$$
The congruence transform is related to the quadratic form:
$$\mathbf{x} = C\mathbf{y} \;\Rightarrow\; \mathbf{x}^T A \mathbf{x} = \mathbf{y}^T C^T A C \mathbf{y}.$$
If $A$ is symmetric, so is its congruence transform $C^T A C$.
example
Sylvester’s Law
The signs of the eigenvalues are invariant under a congruence transform.
(proof) For simplicity, suppose $A$ is nonsingular. Let
$$C = QR, \qquad C(t) = tQ + (1 - t)QR.$$
The eigenvalues of $C(t)^T A C(t)$ change gradually as we vary $t$ from 0 to 1, but they are never 0 since $C(t) = Q(tI + (1 - t)R)$ is never singular. So the signs of the eigenvalues never change.
Since $Q^T A Q$ and $A$ have the same eigenvalues, the law is proved.
Signs of Pivots
The LDU decomposition of a symmetric matrix $A$ is
$$A = LDU = U^T D U.$$
So $A$ is a congruence transform of $D$.
As a result of Sylvester’s law, the signs of the pivots (the eigenvalues of $D$) agree with the signs of the eigenvalues of $A$.
example
Locating Eigenvalues
The relation between pivots and eigenvalues can be used to locate eigenvalues.
First, note that if $A$ has an eigenvalue $\lambda$, then $A - cI$ has the eigenvalue $\lambda - c$ with the same eigenvector:
$$A\mathbf{x} = \lambda \mathbf{x} \;\Rightarrow\; (A - cI)\mathbf{x} = (\lambda - c)\mathbf{x}.$$
Example
Consider
$$A = \begin{bmatrix} 3 & 3 & 0 \\ 3 & 10 & 7 \\ 0 & 7 & 8 \end{bmatrix}, \qquad B = A - 2I = \begin{bmatrix} 1 & 3 & 0 \\ 3 & 8 & 7 \\ 0 & 7 & 6 \end{bmatrix}.$$
$B$ has a negative pivot, so it has a negative eigenvalue.
$A$ is positive definite.
It follows that, for the smallest eigenvalue $\lambda_A$ of $A$,
$$\lambda_A > 0, \quad \lambda_B = \lambda_A - 2 < 0 \;\Rightarrow\; 0 < \lambda_A < 2.$$
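The bracketing argument can be confirmed directly with NumPy (an illustrative check, not part of the slides):

```python
import numpy as np

A = np.array([[3.0, 3.0, 0.0],
              [3.0, 10.0, 7.0],
              [0.0, 7.0, 8.0]])
B = A - 2 * np.eye(3)

eig_A = np.linalg.eigvalsh(A)   # ascending order
eig_B = np.linalg.eigvalsh(B)

print(np.all(eig_A > 0))        # A is positive definite
print(eig_B[0] < 0)             # B has a negative eigenvalue
print(0 < eig_A[0] < 2)         # so A's smallest eigenvalue lies in (0, 2)
```

Each eigenvalue of $B$ is the corresponding eigenvalue of $A$ shifted down by 2, so the sign pattern of $B$'s pivots locates $A$'s smallest eigenvalue without computing it.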
Generalized Eigenvalue Problem
A generalized eigenvalue problem has the form
$$A\mathbf{x} = \lambda M \mathbf{x},$$
where $A$, $M$ are given matrices.
We consider only the case that $A$ is symmetric and $M$ is positive definite.
example
Equivalent Eigenvalue Problem
A generalized eigenvalue problem can be converted to an equivalent ordinary eigenvalue problem.
We can write $M = R^T R$, where $R$ is invertible.
Let $\mathbf{y} = R\mathbf{x}$:
$$A\mathbf{x} = \lambda M \mathbf{x} = \lambda R^T R \mathbf{x} \;\Rightarrow\; A R^{-1} \mathbf{y} = \lambda R^T \mathbf{y}.$$
Let $C = R^{-1}$, so $(R^T)^{-1} = C^T$. Then
$$C^T A C \mathbf{y} = \lambda \mathbf{y}.$$
This is an equivalent eigenvalue problem: same eigenvalues, related eigenvectors $\mathbf{x} = C\mathbf{y}$.
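The conversion above can be carried out step by step in NumPy. A minimal sketch with arbitrary illustrative matrices $A$ and $M$ (taking $R$ from the Cholesky factorization of $M$):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])    # symmetric
M = np.array([[4.0, 0.0],
              [0.0, 1.0]])    # positive definite

R = np.linalg.cholesky(M).T   # M = R^T R, with R upper triangular
C = np.linalg.inv(R)

# ordinary symmetric eigenproblem for C^T A C
lam, Y = np.linalg.eigh(C.T @ A @ C)
X = C @ Y                     # generalized eigenvectors x = C y

# each column satisfies A x = lambda M x
for j in range(2):
    print(np.allclose(A @ X[:, j], lam[j] * (M @ X[:, j])))
```

The columns of `X` are also $M$-orthonormal, which is the property stated on the next slide.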
Properties
Since $C^T A C$ is symmetric, the eigenvectors $\mathbf{y}_j$ can be made orthonormal.
The eigenvectors $\mathbf{x}_j$ are $M$-orthonormal, i.e.,
$$\mathbf{x}_i^T M \mathbf{x}_j = \mathbf{x}_i^T R^T R \mathbf{x}_j = \mathbf{y}_i^T \mathbf{y}_j = \delta_{ij}.$$
example
Simultaneous Diagonalization
Both $M$ and $A$ can be diagonalized by the generalized eigenvectors $\mathbf{x}_i$:
$$\mathbf{x}_i^T M \mathbf{x}_j = \mathbf{y}_i^T \mathbf{y}_j = \delta_{ij}, \qquad \mathbf{x}_i^T A \mathbf{x}_j = \lambda_j \mathbf{x}_i^T M \mathbf{x}_j = \lambda_j \delta_{ij}.$$
That is, using the $\mathbf{x}_i$ as the columns of $S$, we have
$$S^T A S = \Lambda, \qquad S^T M S = I.$$
example
Note these are congruence transforms to diagonal matrices rather than similarity transforms, as $S^T$ is used, not $S^{-1}$.
Singular Value Decomposition
The singular value decomposition (SVD) of a matrix $A$ is defined by
$$A = U \Sigma V^T,$$
where $U$, $\Sigma$, $V$ are related to the matrices $A^T A$ and $A A^T$.
Here $A$ is not limited to be a square matrix.
SVD Theorem
Any $m \times n$ real matrix $A$ with rank $r$ can be factored as
$$A = U \Sigma V^T = (\text{orthogonal})(\text{diagonal})(\text{orthogonal}).$$
$U$ is an eigenvector matrix of $A A^T$.
$V$ is an eigenvector matrix of $A^T A$.
$\Sigma$ contains the $r$ singular values on the diagonal.
A singular value is the square root of a non-zero eigenvalue of $A^T A$.
Proof
Suppose $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are orthonormal eigenvectors of $A^T A$, with eigenvalues in non-increasing order:
$$\mathbf{v}_i^T A^T A \mathbf{v}_j = \lambda_j \mathbf{v}_i^T \mathbf{v}_j = \lambda_j \delta_{ij}, \qquad \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n.$$
Since $A^T A$ has the same nullspace as $A$, there are $r$ non-zero eigenvalues.
These non-zero eigenvalues are positive since $A^T A$ is positive semidefinite.
Define the singular value for each positive eigenvalue:
$$\sigma_j = \sqrt{\lambda_j}, \qquad j = 1, \ldots, r.$$
Proof (continued)
$\mathbf{u}_j = \dfrac{A \mathbf{v}_j}{\sigma_j}$, $j = 1, \ldots, r$, is an eigenvector of $A A^T$, and the $\mathbf{u}_j$ are orthonormal:
$$A A^T \mathbf{u}_j = \frac{A A^T A \mathbf{v}_j}{\sigma_j} = \lambda_j \frac{A \mathbf{v}_j}{\sigma_j} = \lambda_j \mathbf{u}_j, \qquad \mathbf{u}_i^T \mathbf{u}_j = \delta_{ij}.$$
Construct $V$ with the $\mathbf{v}$'s, and $U$ with the $\mathbf{u}$'s and eigenvectors for 0:
$$(U^T A V)_{ij} = \mathbf{u}_i^T A \mathbf{v}_j = \begin{cases} 0 & \text{if } j > r, \\ \sigma_j \mathbf{u}_i^T \mathbf{u}_j = \sigma_j \delta_{ij} & \text{if } j \leq r. \end{cases}$$
That is, $U^T A V = \Sigma$. So $A = U \Sigma V^T$.
Remarks
$A$ multiplied by a column of $V$ produces a multiple of a column of $U$:
$$AV = U\Sigma, \quad \text{or} \quad A\mathbf{v}_j = \sigma_j \mathbf{u}_j.$$
$U$ is the eigenvector matrix of $A A^T$ and $V$ is the eigenvector matrix of $A^T A$:
$$A A^T = U \Sigma \Sigma^T U^T; \qquad A^T A = V \Sigma^T \Sigma V^T.$$
The non-zero eigenvalues of $A A^T$ and $A^T A$ are the same. They appear in $\Sigma \Sigma^T$.
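These relations can be checked with NumPy's SVD routine; a minimal sketch on a small 2-by-3 matrix (chosen for illustration):

```python
import numpy as np

A = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0]])

# full SVD: U is 2x2, s holds the singular values, Vt is 3x3
U, s, Vt = np.linalg.svd(A)
print(np.round(s, 6))   # singular values sqrt(3) and 1

# A v_j = sigma_j u_j for each singular pair
for j in range(len(s)):
    print(np.allclose(A @ Vt[j], s[j] * U[:, j]))

# the nonzero eigenvalues of A A^T are the squared singular values
print(np.allclose(np.linalg.eigvalsh(A @ A.T), np.sort(s**2)))
```

Note that `np.linalg.svd` returns $V^T$ (as `Vt`), not $V$, so row `Vt[j]` is the right singular vector $\mathbf{v}_j$.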
Example
Consider
$$A = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}, \qquad A^T A = \begin{bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix}.$$
The singular values are $\sqrt{3}, 1$.
Finding the $\mathbf{v}_i$ and $\mathbf{u}_i$, one has
$$A = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \sqrt{3} & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1/\sqrt{6} & -2/\sqrt{6} & 1/\sqrt{6} \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \end{bmatrix}.$$
Applications of SVD
Through the SVD, we can represent a matrix as a sum of rank-1 matrices:
$$A = U \Sigma V^T = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T + \cdots + \sigma_r \mathbf{u}_r \mathbf{v}_r^T.$$
Suppose we have a $1000 \times 1000$ matrix, for a total of $10^6$ entries. Using the above expansion and keeping only the 50 most significant terms requires only $50(1 + 1000 + 1000)$ numbers, a space saving of almost 90%.
This is commonly used in image processing.
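A truncation of this kind is easy to sketch with NumPy. The matrix size and the number of kept terms below are scaled down for illustration; the spectral-norm error equals the first discarded singular value (the Eckart–Young theorem, which the slides do not state explicitly):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))

U, s, Vt = np.linalg.svd(A)

# keep the k most significant rank-1 terms
k = 20
A_k = (U[:, :k] * s[:k]) @ Vt[:k]

# storage: k*(1 + m + n) numbers instead of m*n
print(k * (1 + 100 + 100), "numbers instead of", 100 * 100)

# the 2-norm error of the truncation is the (k+1)-th singular value
err = np.linalg.norm(A - A_k, 2)
print(np.isclose(err, s[k]))
```

For an image, the same code applies with `A` holding pixel intensities; the kept terms capture the dominant structure.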
SVD for Image
Pseudo-Inverse
Consider the normal equation
$$A^T A \hat{\mathbf{x}} = A^T \mathbf{b}.$$
$A^T A$ is not always invertible, and $\hat{\mathbf{x}}$ is not unique.
Among all solutions, we denote the one with the minimum length by $\mathbf{x}^+$:
$$A^T A \mathbf{x}^+ = A^T \mathbf{b}.$$
The matrix that produces $\mathbf{x}^+$ from $\mathbf{b}$ is called the pseudo-inverse of $A$, denoted by $A^+$:
$$\mathbf{x}^+ = A^+ \mathbf{b}.$$
Pseudoinverse and SVD
$A^+$ is related to the SVD $A = U \Sigma V^T$ by
$$A^+ = V \Sigma^+ U^T.$$
Note $\Sigma^+$ is the $n \times m$ matrix with diagonal entries $\frac{1}{\sigma_1}, \ldots, \frac{1}{\sigma_r}$ and all other entries 0.
If $A$ is invertible, then
$$A A^+ = U \Sigma V^T V \Sigma^+ U^T = I \;\Rightarrow\; A^+ = A^{-1}.$$
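The formula $A^+ = V \Sigma^+ U^T$ can be implemented directly and compared against NumPy's built-in `pinv`. A sketch on a deliberately singular 2-by-2 matrix (the matrix and the tolerance `1e-12` are illustrative choices):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])    # singular, so no ordinary inverse exists

U, s, Vt = np.linalg.svd(A)

# Sigma^+: invert the nonzero singular values, leave the zeros as zeros
s_plus = np.array([1.0 / x if x > 1e-12 else 0.0 for x in s])
A_plus = Vt.T @ np.diag(s_plus) @ U.T

print(np.allclose(A_plus, np.linalg.pinv(A)))
```

The zero singular values must be left at zero rather than inverted; that is exactly what makes $\mathbf{x}^+ = A^+\mathbf{b}$ the minimum-length solution.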
Minimum Length
Consider $|A\mathbf{x} - \mathbf{b}|$.
Multiplying by $U^T$ leaves the length unchanged:
$$|A\mathbf{x} - \mathbf{b}| = |U \Sigma V^T \mathbf{x} - \mathbf{b}| = |\Sigma V^T \mathbf{x} - U^T \mathbf{b}| = |\Sigma \mathbf{y} - U^T \mathbf{b}|.$$
$\mathbf{x}$ and $\mathbf{y}$ have the same length since $\mathbf{y} = V^T \mathbf{x} = V^{-1} \mathbf{x}$.
$\Sigma \mathbf{y}$ has at most $r$ non-zero components, which are equated to the corresponding components of $U^T \mathbf{b}$. The other components of $\mathbf{y}$ are set to 0 to minimize $|\mathbf{y}|$, so $\mathbf{y}^+ = \Sigma^+ U^T \mathbf{b}$.
The minimum-length least-squares solution for $\mathbf{x}$ is
$$\mathbf{x}^+ = V \mathbf{y}^+ = V \Sigma^+ U^T \mathbf{b} = A^+ \mathbf{b}.$$
Different Perspectives
We have solved several classes of problems, including
systems of linear equations, and
eigenvalue problems.
The same solutions can be achieved by completely different problem formulations.
Optimization Problem for Ax = b
If $A$ is positive definite, then
$$P(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T A \mathbf{x} - \mathbf{x}^T \mathbf{b}$$
reaches its minimum where
$$A\mathbf{x} = \mathbf{b}.$$
This is proved by showing that, when $A\mathbf{x} = \mathbf{b}$,
$$P(\mathbf{y}) - P(\mathbf{x}) = \frac{1}{2}(\mathbf{y} - \mathbf{x})^T A (\mathbf{y} - \mathbf{x}) \geq 0.$$
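A quick numerical sanity check of this equivalence (the matrix, right-hand side, and random probing are all illustrative):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])    # positive definite
b = np.array([1.0, 0.0])

def P(x):
    return 0.5 * x @ A @ x - x @ b

x_star = np.linalg.solve(A, b)   # the point where A x = b

# P at random perturbations never drops below P(x_star)
rng = np.random.default_rng(1)
ok = all(P(x_star + rng.standard_normal(2)) >= P(x_star) for _ in range(100))
print(ok)
```

Sampling is only a spot check, of course; the identity $P(\mathbf{y}) - P(\mathbf{x}) = \frac{1}{2}(\mathbf{y}-\mathbf{x})^T A (\mathbf{y}-\mathbf{x})$ is what guarantees the minimum.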
$A^T A x = A^T b$
The least-squares problem for $Ax = b$ has the flavor of a minimization.
The function to be minimized is
$$E(\mathbf{x}) = |A\mathbf{x} - \mathbf{b}|^2 = (A\mathbf{x} - \mathbf{b})^T (A\mathbf{x} - \mathbf{b}) = \mathbf{x}^T A^T A \mathbf{x} - \mathbf{x}^T A^T \mathbf{b} - \mathbf{b}^T A \mathbf{x} + \mathbf{b}^T \mathbf{b}.$$
$E(\mathbf{x})$ has essentially the same form as $P(\mathbf{x})$, with $A' = A^T A$ and $\mathbf{b}' = A^T \mathbf{b}$. The minimum is achieved where
$$A' \mathbf{x} = \mathbf{b}', \quad \text{i.e.,} \quad A^T A \mathbf{x} = A^T \mathbf{b}.$$
Rayleigh’s Quotient
The Rayleigh quotient is defined by
$$R(\mathbf{x}) = \frac{\mathbf{x}^T A \mathbf{x}}{\mathbf{x}^T \mathbf{x}},$$
where $A$ is real symmetric.
The smallest eigenvalue for
$$A\mathbf{x} = \lambda \mathbf{x}$$
can be found by minimizing $R(\mathbf{x})$.
Rayleigh’s Principle
The minimum value of the Rayleigh quotient $R(\mathbf{x})$ is the smallest eigenvalue $\lambda_1$ of $A$, achieved by the corresponding eigenvector $\mathbf{x}_1$.
This follows from (substituting $\mathbf{x} = Q\mathbf{y}$)
$$R(\mathbf{x}) = \frac{(Q\mathbf{y})^T A (Q\mathbf{y})}{(Q\mathbf{y})^T (Q\mathbf{y})} = \frac{\mathbf{y}^T \Lambda \mathbf{y}}{\mathbf{y}^T \mathbf{y}} = \frac{\lambda_1 y_1^2 + \cdots + \lambda_n y_n^2}{y_1^2 + \cdots + y_n^2} \geq \frac{\lambda_1 (y_1^2 + \cdots + y_n^2)}{y_1^2 + \cdots + y_n^2} = \lambda_1.$$
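Rayleigh's principle is easy to probe numerically; a sketch on an arbitrary symmetric test matrix (the matrix and the random sampling are illustrative):

```python
import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])   # symmetric

def rayleigh(x):
    return (x @ A @ x) / (x @ x)

lam_min = np.linalg.eigvalsh(A)[0]

# the quotient is bounded below by the smallest eigenvalue...
rng = np.random.default_rng(2)
ok = all(rayleigh(rng.standard_normal(3)) >= lam_min - 1e-12
         for _ in range(1000))
print(ok)

# ...and attains it at the corresponding eigenvector
x1 = np.linalg.eigh(A)[1][:, 0]
print(np.isclose(rayleigh(x1), lam_min))
```

The same idea underlies iterative eigensolvers: minimizing $R(\mathbf{x})$ over trial vectors drives them toward $\mathbf{x}_1$.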
