# Positive Definite Matrices

Notes on Linear Algebra
Chia-Ping Chen
Department of Computer Science and Engineering
National Sun Yat-Sen University
Kaohsiung, Taiwan ROC
Positive Definite Matrices – p. 1
Introduction
The signs of the eigenvalues of a matrix can be
important.
For example, in a system of differential equations,
the signs determine whether the system is stable.
The signs can also be related to the minima, maxima,
and saddle points of a multi-variate function.
Expansion of a Function
For a “well-behaved” function of x, y, the value near the
origin can be written as
f(x, y) = f(0) + (∂f/∂x)x + (∂f/∂y)y
          + (1/2)(∂²f/∂x²)x² + (∂²f/∂x∂y)xy + (1/2)(∂²f/∂y²)y² + · · · ,

where the derivatives are evaluated at the origin.
Near a point (α, β), the value can be written as
f(x, y) = f(α, β) + (∂f/∂x)(x − α) + (∂f/∂y)(y − β)
          + (1/2)(∂²f/∂x²)(x − α)² + (∂²f/∂x∂y)(x − α)(y − β)
          + (1/2)(∂²f/∂y²)(y − β)² + · · · ,

where the derivatives are evaluated at (α, β).
Stationary Points
By definition, a point (α, β) is stationary if the first-order
partial derivatives evaluated at (α, β) are all 0.
A stationary point is either a maximum, a minimum, or a saddle point.
It is a maximum if the functional values of all nearby
points are not greater.
It is a minimum if the functional values of all nearby
points are not smaller.
It is a saddle point if it is neither a maximum nor a
minimum.
How do we decide?
Expansion Near a Stationary Point
The expansion of a function near a stationary point (α, β) is

f(x, y) = f(α, β) + (∂f/∂x)(x − α) + (∂f/∂y)(y − β)
          + (1/2)(∂²f/∂x²)(x − α)² + (∂²f/∂x∂y)(x − α)(y − β)
          + (1/2)(∂²f/∂y²)(y − β)² + · · ·
        = f₀ + (1/2)(∂²f/∂x²)(δx)² + (∂²f/∂x∂y)(δx)(δy) + (1/2)(∂²f/∂y²)(δy)²,

since the first-order partial derivatives vanish at a stationary point;
here f₀ = f(α, β), δx = x − α, and δy = y − β.
Hessian Matrix
The Hessian matrix at a point x = (x, y) is defined by

H(x) = [ ∂²f/∂x²    ∂²f/∂x∂y ]
       [ ∂²f/∂x∂y   ∂²f/∂y²  ]

It follows that near a stationary point x,

δf = (1/2) δxᵀ H(x) δx.
Quadratic Form
By definition, a function f is quadratic in variables x, y
if f has the following form:

f(x, y) = ax² + 2bxy + cy².

f(x, y) can be expressed by a real symmetric matrix,

f(x, y) = [x  y] [ a  b ] [ x ]
                 [ b  c ] [ y ] .

Note that the expansion of a function near a stationary
point is quadratic in the variables δx = (δx, δy).
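The agreement between the polynomial and the matrix expression can be checked numerically. A minimal NumPy sketch; the coefficients a, b, c and the test point are illustrative choices, not values from the notes:

```python
import numpy as np

# Hypothetical coefficients for f(x, y) = a x^2 + 2 b x y + c y^2.
a, b, c = 2.0, 1.0, 3.0
A = np.array([[a, b],
              [b, c]])

def f(x, y):
    """Quadratic form evaluated directly as a polynomial."""
    return a * x**2 + 2 * b * x * y + c * y**2

def f_matrix(x, y):
    """Same value via the symmetric-matrix expression [x y] A [x y]^T."""
    v = np.array([x, y])
    return v @ A @ v

# The two expressions agree at any point, e.g. (x, y) = (1.5, -0.5).
print(f(1.5, -0.5), f_matrix(1.5, -0.5))
```

The off-diagonal entries split the cross coefficient 2b symmetrically, which is exactly the symmetrization aᵢⱼ = aⱼᵢ discussed later.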
Positive/Negative (Semi-)Definite
By definition, a quadratic function f is positive definite
if its value is positive for all points except for the origin.
f is said to be positive semidefinite if the values are
non-negative.
f is said to be negative definite if the values are
negative except for the origin.
f is said to be negative semidefinite if the values are
non-positive.
Necessary Conditions
For f(x, y) = ax² + 2bxy + cy² to be p.d., a > 0 and c > 0.
This can be shown by looking at the points (1, 0) and (0, 1).
But these are merely necessary conditions, as the
following example demonstrates:

f₀(x, y) = x² − 4xy + y², with f₀(1, 1) = −2 < 0.
Sufficient Condition
We can express f using squares by

f(x, y) = ax² + 2bxy + cy² = a(x + (b/a)y)² + (c − b²/a)y².

Since a square is non-negative, we see from above that
f is positive definite if (sufficient condition)

a > 0, ac > b².
Negative Definite
f(x, y) = ax² + 2bxy + cy² is negative definite if
f(x, y) < 0 other than at (x, y) = (0, 0).
f is negative definite iff −f is positive definite.
A sufficient condition for negative definiteness is

(−a) > 0, (−a)(−c) > (−b)².

Equivalently, getting rid of negative signs,

a < 0, ac > b².

example and diagram
Singular Case
We have a singular case if

ac = b².

If a > 0, f is still non-negative everywhere, since

f(x, y) = a(x + (b/a)y)².

The surface z = f(x, y) degenerates from a bowl to a
valley, along the line ax + by = 0.
f is said to be positive semidefinite (psd) if a > 0 and
negative semidefinite if a < 0.
example and diagram
Saddle Point
The remaining case is when

ac < b².

In this case, (0, 0) is a saddle point.
Along one direction (0, 0) is a minimum, and along
another direction (0, 0) is a maximum.
f is said to be indefinite.
Example and Diagram
Let

f(x, y) = x² − y².

Note

ac = −1 < 0 = b².

The point (0, 0) is a minimum along the x-axis, a
maximum along the y-axis.
Matrix and Quadratic Form
A matrix A defines a quadratic form,

xᵀAx = [x₁ … xₙ] [ a₁₁ … a₁ₙ ] [ x₁ ]
                 [  ⋮   ⋱   ⋮ ] [  ⋮ ]  =  Σᵢ,ⱼ aᵢⱼ xᵢ xⱼ  (i, j = 1, …, n).
                 [ aₙ₁ … aₙₙ ] [ xₙ ]

Note that for i ≠ j, the coefficient of xᵢxⱼ in the sum is aᵢⱼ + aⱼᵢ.
Given a quadratic form, we make A symmetric by
requiring aᵢⱼ = aⱼᵢ.
The positive-definiteness of a matrix is defined via
f(x) = xᵀAx.
Positive Definite Matrices
A real symmetric matrix A is said to be positive definite
if xᵀAx > 0 except for x = 0.
From the earlier discussion, we have, in the
two-dimensional case,

A = [ a  b ]
    [ b  c ]

is positive definite iff a > 0 and ac > b².
We generalize to the case that A is n × n.
Conditions for Positive Definiteness
The following are equivalent conditions for positive
definiteness.
All eigenvalues are positive.
All determinants of the leading principal submatrices are
positive.
All pivots are positive.
example
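The three tests can be run side by side. A NumPy sketch on an illustrative symmetric matrix (this particular matrix is an assumption, not one from the notes); the pivots are recovered from the Cholesky factor, whose squared diagonal entries equal the pivots of elimination:

```python
import numpy as np

# An illustrative symmetric matrix to test for positive definiteness.
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

# 1. All eigenvalues positive.
eigs = np.linalg.eigvalsh(A)

# 2. All determinants of the leading principal submatrices A_k positive.
minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

# 3. All pivots positive: Cholesky succeeds only for positive definite A,
#    and the squares of its diagonal entries are the elimination pivots.
L = np.linalg.cholesky(A)
pivots = np.diag(L) ** 2

print(eigs, minors, pivots)
```

All three lists come out strictly positive, confirming positive definiteness by each criterion at once.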
Positive Eigenvalues
If a matrix is positive definite, then it has all positive
eigenvalues.
Let xᵢ be an eigenvector with eigenvalue λᵢ. Then

xᵢᵀAxᵢ = λᵢ(xᵢᵀxᵢ) > 0 ⇒ λᵢ > 0.

Conversely, if all eigenvalues are positive, then the
matrix is positive definite.
Via a complete set of eigenvectors,

xᵀAx = xᵀQΛQᵀx = Σᵢ cᵢ²λᵢ > 0,

where c = Qᵀx ≠ 0 for x ≠ 0.
example
Positive Determinants
If A is positive definite, then the leading principal
submatrices Aₖ have positive determinants.
Aₖ is positive definite. To prove this, let x be a vector
with the last n − k components being 0:

xᵀAx = [xₖᵀ  0] [ Aₖ  ∗ ] [ xₖ ]
                [ ∗   ∗ ] [ 0  ]  =  xₖᵀAₖxₖ > 0.

Since all eigenvalues of Aₖ are positive,

|Aₖ| = Πᵢ λᵢ > 0.

example
Positive Pivots
Suppose A is positive definite; then all pivots dᵢ are
positive:

dₖ = |Aₖ| / |Aₖ₋₁| > 0,

since |Aₖ| > 0 for all k.
Conversely, if dᵢ > 0 for all i, then A is positive definite:

A = LDLᵀ ⇒ xᵀAx = Σᵢ dᵢ(Lᵀx)ᵢ² > 0.

example
Relation to Least Squares
For a least squares problem Rx = b we solve the
normal equation

RᵀRx̄ = Rᵀb.

Note that the matrix A = RᵀR is symmetric.
(theorem) A is positive definite if R has linearly
independent columns.

xᵀAx = xᵀRᵀRx = (Rx)ᵀ(Rx) { = 0, x = 0,
                           { > 0, x ≠ 0.

example
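The theorem is easy to probe numerically. A sketch with an illustrative random R of full column rank (the dimensions and seed are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tall matrix R; random Gaussian entries give linearly independent
# columns with probability 1 (illustrative instance, not from the notes).
R = rng.standard_normal((5, 3))
A = R.T @ R

# A is symmetric, and x^T A x = |Rx|^2 > 0 for any nonzero x.
x = rng.standard_normal(3)
quad = x @ A @ x
print(np.allclose(A, A.T), quad > 0, np.linalg.eigvalsh(A).min() > 0)
```

If R had a nontrivial nullspace, any x in it would give xᵀAx = 0, which is exactly why independent columns are needed.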
Cholesky Decomposition
(Cholesky decomposition) A positive definite matrix is
the product of a lower-triangular matrix and its
transpose.
A special case comes from the LDU decomposition,

A = LDLᵀ = LD^(1/2)D^(1/2)Lᵀ = RᵀR, with R = D^(1/2)Lᵀ.

There are infinitely many ways to decompose a positive
definite A = RᵀR. In fact, R′ = QR, where Q is
orthogonal, also satisfies A = R′ᵀR′.
example
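A short NumPy sketch on an illustrative positive definite matrix (the matrix and the particular orthogonal Q are assumptions). Note that NumPy returns the lower-triangular factor L with A = LLᵀ, so R = Lᵀ gives the A = RᵀR form used here:

```python
import numpy as np

# Illustrative positive definite matrix (not from the notes).
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# np.linalg.cholesky returns lower-triangular L with A = L L^T.
L = np.linalg.cholesky(A)
R = L.T
print(np.allclose(A, R.T @ R))

# Non-uniqueness: for any orthogonal Q, R' = Q R also satisfies
# A = R'^T R' since Q^T Q = I.
Q = np.array([[0.0, 1.0],
              [1.0, 0.0]])  # a simple orthogonal (permutation) matrix
R2 = Q @ R
print(np.allclose(A, R2.T @ R2))
```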
Ellipsoids in n Dimensions
Consider the equation xᵀAx = 1, where A is p.d.
If A is diagonal, the graph is easily seen to be an ellipsoid.
If A is not diagonal, then using the eigenvectors as
columns of Q, we have

xᵀAx = xᵀQΛQᵀx = yᵀΛy = λ₁y₁² + · · · + λₙyₙ².

yᵢ = qᵢᵀx is the component of x along the ith
eigenvector qᵢ.
Principal Axes
The axes of the ellipsoid defined by xᵀAx = 1 point
along the eigenvectors qᵢ of A.
They are called principal axes.
The principal axes are mutually orthogonal.
The length of the semi-axis along qᵢ is 1/√λᵢ.
example
Semidefinite Matrices
A matrix is said to be positive semidefinite if
xᵀAx ≥ 0 for all x.
Each of the following conditions is equivalent to
positive semidefiniteness.
All eigenvalues are non-negative.
|Aₖ| ≥ 0 for all principal submatrices Aₖ.
All pivots are non-negative.
A = RᵀR for some R.
Example
The following matrix is positive semidefinite.

A = [  2 −1 −1 ]
    [ −1  2 −1 ]
    [ −1 −1  2 ]

quadratic form: xᵀAx = (x₁ − x₂)² + (x₂ − x₃)² + (x₃ − x₁)² ≥ 0.
eigenvalues: 0, 3, 3 ≥ 0; pivots: 2, 3/2 ≥ 0.
determinants: |A₁| = 2, |A₂| = 3, |A₃| = 0 ≥ 0.
Congruence Transform
The congruence transform of A by a non-singular C is
defined by

A → CᵀAC.

The congruence transform is related to the quadratic form:

x = Cy ⇒ xᵀAx = yᵀCᵀACy.

If A is symmetric, so is its congruence transform
CᵀAC.
example
Sylvester’s Law
The signs of eigenvalues are invariant under a
congruence transform.
(proof) For simplicity, suppose A is nonsingular. Let

C = QR, C(t) = tQ + (1 − t)QR.

The eigenvalues of C(t)ᵀAC(t) change gradually as
we vary t from 0 to 1, but they are never 0 since
C(t) = Q(tI + (1 − t)R) is never singular. So the signs
of the eigenvalues never change.
Since QᵀAQ and A have the same eigenvalues, the
law is proved.
Signs of Pivots
The LDU decomposition of a symmetric matrix A is

A = LDU = UᵀDU.

So A is a congruence transform of D.
As a result of Sylvester’s law, the signs of the
pivots (the eigenvalues of D) agree with the signs of the
eigenvalues of A.
example
Locating Eigenvalues
The relation between pivots and eigenvalues can be
used to locate eigenvalues.
First, note that if A has an eigenvalue λ, then A − cI
has the eigenvalue λ − c with the same eigenvector:

Ax = λx ⇒ (A − cI)x = (λ − c)x.
Example
Consider

A = [ 3  3  0 ]                  [ 1  3  0 ]
    [ 3 10  7 ] ,  B = A − 2I =  [ 3  8  7 ] .
    [ 0  7  8 ]                  [ 0  7  6 ]

B has a negative pivot, so it has a negative eigenvalue.
A is positive definite.
It follows that, for the corresponding eigenvalue,

λ_A > 0, λ_B = λ_A − 2 < 0 ⇒ 0 < λ_A < 2.
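This pivot argument can be reproduced numerically with the matrices above. The `pivots` helper below is an illustrative elimination routine (it assumes no row exchanges are needed, which holds for these matrices):

```python
import numpy as np

A = np.array([[3.0, 3.0, 0.0],
              [3.0, 10.0, 7.0],
              [0.0, 7.0, 8.0]])
B = A - 2.0 * np.eye(3)

def pivots(M):
    """Pivots from Gaussian elimination without row exchanges."""
    M = M.astype(float).copy()
    p = []
    for k in range(len(M)):
        p.append(M[k, k])
        for i in range(k + 1, len(M)):
            M[i] -= M[i, k] / M[k, k] * M[k]
    return p

print(pivots(A))   # all positive, so A is positive definite
print(pivots(B))   # one negative pivot, so B has a negative eigenvalue
print(np.linalg.eigvalsh(A))  # hence A has an eigenvalue between 0 and 2
```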
Generalized Eigenvalue Problem
A generalized eigenvalue problem has the form

Ax = λMx,

where A, M are given matrices.
We consider only the case that A is symmetric and M
is positive definite.
example
Equivalent Eigenvalue Problem
A generalized eigenvalue problem can be converted to
an equivalent ordinary eigenvalue problem.
We can write M = RᵀR, where R is invertible.
Let y = Rx:

Ax = λMx = λRᵀRx ⇒ AR⁻¹y = λRᵀy.

Let C = R⁻¹, so (Rᵀ)⁻¹ = Cᵀ. Then

CᵀACy = λy.

This is an equivalent eigenvalue problem: same
eigenvalues, related eigenvectors x = Cy.
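The conversion can be sketched in NumPy. The matrices A and M below are illustrative assumptions, not examples from the notes; M = RᵀR comes from Cholesky (NumPy returns the lower factor L with M = LLᵀ, so R = Lᵀ):

```python
import numpy as np

# Illustrative symmetric A and positive definite M.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
M = np.array([[4.0, 1.0],
              [1.0, 2.0]])

R = np.linalg.cholesky(M).T      # M = R^T R
C = np.linalg.inv(R)

# Equivalent ordinary symmetric problem: C^T A C y = lambda y.
lam, Y = np.linalg.eigh(C.T @ A @ C)

# Recover generalized eigenvectors x = C y and verify A x = lambda M x.
X = C @ Y
print(np.allclose(A @ X, (M @ X) * lam))
```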
Properties
Since CᵀAC is symmetric, the eigenvectors yⱼ can be
made orthonormal.
The eigenvectors xⱼ are then M-orthonormal, i.e.,

xᵢᵀMxⱼ = xᵢᵀRᵀRxⱼ = yᵢᵀyⱼ = δᵢⱼ.

example
Simultaneous Diagonalization
Both M and A can be diagonalized by the generalized
eigenvectors xᵢ:

xᵢᵀMxⱼ = yᵢᵀyⱼ = δᵢⱼ,  xᵢᵀAxⱼ = λⱼxᵢᵀMxⱼ = λⱼδᵢⱼ.

That is, using the xᵢ as the columns of S, we have

SᵀAS = Λ, SᵀMS = I.

example
Note that these are congruence transforms to diagonal
matrices rather than similarity transforms, as Sᵀ is
used, not S⁻¹.
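A sketch of the simultaneous diagonalization with illustrative matrices (the same kind of assumed A and M as above, not from the notes):

```python
import numpy as np

# Illustrative symmetric A and positive definite M.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
M = np.array([[4.0, 1.0],
              [1.0, 2.0]])

R = np.linalg.cholesky(M).T           # M = R^T R
C = np.linalg.inv(R)
lam, Y = np.linalg.eigh(C.T @ A @ C)  # orthonormal y_j
S = C @ Y                             # generalized eigenvectors as columns

# Congruence transforms by S diagonalize both matrices at once.
print(np.allclose(S.T @ M @ S, np.eye(2)))
print(np.allclose(S.T @ A @ S, np.diag(lam)))
```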
Singular Value Decomposition
The singular value decomposition (SVD) of a matrix A
is defined by

A = UΣVᵀ,

where U, Σ, V are related to the matrices AᵀA and AAᵀ.
Here A is not limited to be a square matrix.
SVD Theorem
Any m × n real matrix A with rank r can be factored as

A = UΣVᵀ = (orthogonal)(diagonal)(orthogonal).

U is an eigenvector matrix of AAᵀ.
V is an eigenvector matrix of AᵀA.
Σ contains the r singular values on the diagonal.
A singular value is the square root of a non-zero
eigenvalue of AᵀA.
Proof
Suppose v₁, . . . , vₙ are orthonormal eigenvectors of
AᵀA, with eigenvalues in non-increasing order:

vᵢᵀAᵀAvⱼ = λⱼvᵢᵀvⱼ = λⱼδᵢⱼ,  λ₁ ≥ λ₂ ≥ · · · ≥ λₙ.

Since AᵀA has the same nullspace as A, there are r
non-zero eigenvalues.
These non-zero eigenvalues are positive since AᵀA is
p.s.d.
Define the singular value for a positive eigenvalue:

σⱼ = √λⱼ,  j = 1, . . . , r.
Proof
uⱼ = Avⱼ/σⱼ, j = 1, . . . , r, is an eigenvector of AAᵀ, and
the uⱼ are orthonormal:

AAᵀuⱼ = AAᵀAvⱼ/σⱼ = λⱼAvⱼ/σⱼ = λⱼuⱼ,  uᵢᵀuⱼ = δᵢⱼ.

Construct V with the v’s, and U with the u’s and
eigenvectors for 0:

(UᵀAV)ᵢⱼ = uᵢᵀAvⱼ = { 0,                 j > r,
                     { σⱼuᵢᵀuⱼ = σⱼδᵢⱼ,  j ≤ r.

That is, UᵀAV = Σ. So A = UΣVᵀ.
Remarks
A multiplied by a column of V produces a multiple of a
column of U:

AV = UΣ, or Avⱼ = σⱼuⱼ.

U is the eigenvector matrix of AAᵀ and V is the
eigenvector matrix of AᵀA:

AAᵀ = UΣΣᵀUᵀ;  AᵀA = VΣᵀΣVᵀ.

The non-zero eigenvalues of AAᵀ and AᵀA are the
same. They are in ΣΣᵀ.
Example
Consider

A = [ −1  1  0 ] .   AᵀA = [  1 −1  0 ]
    [  0 −1  1 ]           [ −1  2 −1 ] .
                           [  0 −1  1 ]

The singular values are √3, 1.
Finding the vᵢ and uᵢ, one has

A = (1/√2) [ −1  1 ]  [ √3  0  0 ]  [  1 −2  1 ] (/√6)
           [  1  1 ]  [  0  1  0 ]  [ −1  0  1 ] (/√2)
                                    [  1  1  1 ] (/√3)

where each row of the last factor is divided by the indicated norm.
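The example can be checked directly with NumPy's SVD routine, which returns U, the singular values in descending order, and Vᵀ:

```python
import numpy as np

A = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
print(s)  # singular values sqrt(3) and 1, as computed by hand above

# Reconstruct A from the factors (s has length 2, Vt is 3 x 3).
print(np.allclose(A, (U * s) @ Vt[:2]))
```

The signs of individual columns of U and V may differ from the hand computation; the factorization is only unique up to such sign choices.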
Applications of SVD
Through SVD, we can represent a matrix as a sum of
rank-1 matrices:

A = UΣVᵀ = σ₁u₁v₁ᵀ + · · · + σᵣuᵣvᵣᵀ.

Suppose we have a 1000 × 1000 matrix, for a total of 10⁶
entries. Using the above expansion and keeping only the
50 most significant terms would require only
50(1 + 1000 + 1000) numbers, a saving of almost 90% of
the space.
commonly used in image processing
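A small compression sketch. The "image" here is a synthetic stand-in (a smooth 100 × 100 surface plus noise, an illustrative assumption), so that a few rank-1 terms capture most of it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "image": smooth low-rank structure plus small noise.
x = np.linspace(0, 1, 100)
img = np.outer(np.sin(3 * x), np.cos(2 * x)) \
      + 0.01 * rng.standard_normal((100, 100))

U, s, Vt = np.linalg.svd(img)

# Keep only the k most significant rank-1 terms sigma_i u_i v_i^T.
k = 5
approx = (U[:, :k] * s[:k]) @ Vt[:k]

# Storage: k * (1 + m + n) numbers instead of m * n.
stored = k * (1 + 100 + 100)
rel_err = np.linalg.norm(img - approx) / np.linalg.norm(img)
print(stored, rel_err)
```

The stored count is about 10% of the full 10⁴ entries here, while the relative error stays small because the discarded singular values carry only noise.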
SVD for Image
Pseudo-Inverse
Consider the normal equation

AᵀAx̂ = Aᵀb.

AᵀA is not always invertible, and then x̂ is not unique.
Among all solutions, we denote the one with the
minimum length by x⁺:

AᵀAx⁺ = Aᵀb.

The matrix that produces x⁺ from b is called the
pseudo-inverse of A, denoted by A⁺:

x⁺ = A⁺b.
Pseudoinverse and SVD
A⁺ is related to the SVD A = UΣVᵀ by

A⁺ = VΣ⁺Uᵀ.

Note Σ⁺ is the n × m matrix with diagonal entries
1/σ₁, . . . , 1/σᵣ and all other entries 0.
If A is invertible, then

AA⁺ = UΣVᵀVΣ⁺Uᵀ = I ⇒ A⁺ = A⁻¹.
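The construction can be sketched on an illustrative rank-deficient matrix (an assumption, chosen so that AᵀA is singular and the pseudo-inverse genuinely differs from an ordinary inverse):

```python
import numpy as np

# Rank-1 matrix, so A^T A is singular.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
b = np.array([1.0, 0.0])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))        # numerical rank

# Sigma^+ : transposed shape, nonzero singular values inverted.
S_plus = np.zeros((2, 2))
S_plus[:r, :r] = np.diag(1.0 / s[:r])

A_plus = Vt.T @ S_plus @ U.T
print(np.allclose(A_plus, np.linalg.pinv(A)))

# x+ solves the normal equation (with minimum length among solutions).
x_plus = A_plus @ b
print(np.allclose(A.T @ A @ x_plus, A.T @ b))
```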
Minimum Length
Consider |Ax − b|.
Multiplying by Uᵀ leaves the length unchanged:

|Ax − b| = |UΣVᵀx − b| = |ΣVᵀx − Uᵀb| = |Σy − Uᵀb|.

x and y have the same length since y = Vᵀx = V⁻¹x.
Σy has at most r non-zero components, which are
equated to the corresponding components of Uᵀb. The
other components of y are set to 0 to minimize |y|, so
y⁺ = Σ⁺Uᵀb.
The minimum-length least-squares solution for x is

x⁺ = Vy⁺ = VΣ⁺Uᵀb = A⁺b.
Different Perspectives
We have solved several classes of problems, including
systems of linear equations and
eigenvalue problems.
The same solutions can be achieved by completely
different problem formulations.
Optimization Problem for Ax = b
If A is positive definite, then

P(x) = (1/2)xᵀAx − xᵀb

reaches its minimum where

Ax = b.

This is proved by showing

P(y) − P(x) = (1/2)(y − x)ᵀA(y − x) ≥ 0.
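A quick numerical sanity check of the claim (the positive definite A, right-hand side b, and trial points are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative positive definite A and right-hand side b.
A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
b = np.array([1.0, 1.0])

def P(x):
    return 0.5 * x @ A @ x - x @ b

x_star = np.linalg.solve(A, b)   # the solution of Ax = b

# P at the solution is no larger than P at random trial points.
trials = rng.standard_normal((1000, 2))
print(all(P(x_star) <= P(y) + 1e-12 for y in trials))
```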
AᵀAx = Aᵀb
The least-squares problem for Ax = b has a flavor of
minimization.
The function to be minimized is

E(x) = |Ax − b|² = (Ax − b)ᵀ(Ax − b)
     = xᵀAᵀAx − xᵀAᵀb − bᵀAx + bᵀb.

E(x) has essentially the same form as P(x). The
minimum is achieved where A′x = b′ with A′ = AᵀA
and b′ = Aᵀb, i.e., AᵀAx = Aᵀb.
Rayleigh’s Quotient
The Rayleigh quotient is defined by

R(x) = xᵀAx / xᵀx,

where A is real symmetric.
The smallest eigenvalue for

Ax = λx

can be found by minimizing R(x).
Rayleigh’s Principle
The minimum value of the Rayleigh quotient R(x) is
the smallest eigenvalue λ₁ of A, achieved by the
corresponding eigenvector x₁.
This follows from

R(x) = (Qy)ᵀA(Qy) / (Qy)ᵀ(Qy) = yᵀΛy / yᵀy
     = (λ₁y₁² + · · · + λₙyₙ²) / (y₁² + · · · + yₙ²)
     ≥ λ₁(y₁² + · · · + yₙ²) / (y₁² + · · · + yₙ²) = λ₁.
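Rayleigh's principle can be tested numerically. A sketch with an illustrative real symmetric matrix and random sample directions (both assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative real symmetric matrix.
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

def R(x):
    return (x @ A @ x) / (x @ x)

lam = np.linalg.eigvalsh(A)       # ascending; lam[0] is the smallest

# The quotient never drops below lam[0] ...
samples = rng.standard_normal((1000, 3))
print(all(R(x) >= lam[0] - 1e-12 for x in samples))

# ... and attains it at the corresponding eigenvector.
v1 = np.linalg.eigh(A)[1][:, 0]
print(np.isclose(R(v1), lam[0]))
```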