
EE506 – Engineering Mathematics

Lecture 12
Quadratic Forms, Positive Definiteness and Singular Value
Decomposition

Dr. Adeem Aslam


Assistant Professor

Department of Electrical Engineering


University of Engineering and Technology, Lahore, Pakistan

October 24, 2021

Outline

1 Quadratic Forms
Principal Axis Theorem

2 Positive Definiteness
Negative Definiteness
Singular Case and Saddle Point
Higher Dimensions
Quadratic Form for a Random Function

3 Singular Value Decomposition


Remarks

Quadratic Forms

A quadratic form Q in the components x_1, x_2, ..., x_n of a vector x is a sum of
n^2 terms of the form

    Q = x^T A x = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k
      = a_{11} x_1^2 + a_{12} x_1 x_2 + ... + a_{1n} x_1 x_n
      + a_{21} x_2 x_1 + a_{22} x_2^2 + ... + a_{2n} x_2 x_n
      + ...
      + a_{n1} x_n x_1 + a_{n2} x_n x_2 + ... + a_{nn} x_n^2,        (1)

where A_{n×n} = [a_{jk}] is called the coefficient matrix of the form.

Since Q = Q^T (Q being a scalar),

    x^T A x = (x^T A x)^T = x^T A^T x,        (2)

so A and A^T define the same quadratic form; hence A can be taken to be symmetric, A = A^T.

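As an illustration (added here, not part of the original slides), the following Python sketch evaluates a quadratic form both as the matrix product x^T A x and as the double sum in (1); the matrix and vector entries are arbitrary example values.

import numpy as np

# Arbitrary symmetric coefficient matrix and vector (example values only)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
x = np.array([1.0, -2.0, 0.5])

# Matrix form of the quadratic form, Q = x^T A x
Q_matrix = x @ A @ x

# Double-sum form, Q = sum_j sum_k a_jk x_j x_k, as in (1)
Q_sum = sum(A[j, k] * x[j] * x[k]
            for j in range(A.shape[0]) for k in range(A.shape[1]))

print(Q_matrix, Q_sum)  # both evaluate to the same number
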
Quadratic Forms

If A is not symmetric, we can take off-diagonal terms together in pairs and write the
result as a sum of two equal terms, which is equivalent to replacing A with (A + A^T)/2,
as illustrated in the example below.

Example: Consider

    x^T A x = [x_1  x_2] \begin{bmatrix} 3 & 4 \\ 6 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
            = 3x_1^2 + 4x_1 x_2 + 6x_1 x_2 + 2x_2^2
            = 3x_1^2 + 10x_1 x_2 + 2x_2^2 = 3x_1^2 + 5x_1 x_2 + 5x_1 x_2 + 2x_2^2.

Then,

    x^T C x = [x_1  x_2] \begin{bmatrix} 3 & 5 \\ 5 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x^T A x,        C = (A + A^T)/2.

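A small Python check (added for illustration, not in the slides) of the example above: the non-symmetric A = [[3, 4], [6, 2]] and its symmetric part C = (A + A^T)/2 give the same quadratic form for any x.

import numpy as np

A = np.array([[3.0, 4.0],
              [6.0, 2.0]])   # non-symmetric coefficient matrix from the example
C = (A + A.T) / 2            # symmetric part, here [[3, 5], [5, 2]]

rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.standard_normal(2)
    # x^T A x and x^T C x agree because the antisymmetric part of A contributes zero
    assert np.isclose(x @ A @ x, x @ C @ x)
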
Quadratic Form – Principal Axis Theorem

Statement:

The substitution

    x = X y,        (3)

transforms the quadratic form

    Q = x^T A x = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k,    a_{jk} = a_{kj},        (4)

to the principal axis form or the canonical form, given by

    Q = y^T D y = λ_1 y_1^2 + λ_2 y_2^2 + ... + λ_n y_n^2 = \sum_{j=1}^{n} λ_j y_j^2,        (5)

where D = diag(λ_1, λ_2, ..., λ_n), λ_1, λ_2, ..., λ_n are the (not necessarily distinct) eigenvalues of the (symmetric) matrix A, and X is the orthogonal matrix with the orthonormal eigenvectors of A, i.e., x_1, x_2, ..., x_n, as its column vectors.

Quadratic Form – Principal Axis Theorem
Proof:

It is known that

    A x_1 = λ_1 x_1,  A x_2 = λ_2 x_2,  ...,  A x_n = λ_n x_n,

where x_1, x_2, ..., x_n are the orthonormal eigenvectors of A. Then, from the similarity
transformation,

    D = X^{-1} A X = X^T A X,    X = [x_1  x_2  ...  x_n]
    ⇒ Q = x^T A x = x^T X D X^{-1} x = x^T X D X^T x.

Let

    X^T x ≜ y ⇒ X^{-1} x = y ⇒ x = X y
    ⇒ x^T X = (X^T x)^T = (X^{-1} x)^T = y^T.

Hence,

    Q = x^T X D X^T x = y^T D y = λ_1 y_1^2 + λ_2 y_2^2 + ... + λ_n y_n^2.        (6)

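A numerical sketch of the principal axis transformation (added here, not from the slides): numpy.linalg.eigh returns the eigenvalues and an orthogonal matrix of eigenvectors of a symmetric matrix, and the substitution y = X^T x reduces the form to a weighted sum of squares as in (6). The matrix and vector are example values.

import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])        # symmetric coefficient matrix (example values)
lam, X = np.linalg.eigh(A)        # eigenvalues and orthonormal eigenvectors (columns of X)

x = np.array([2.0, -1.0])
y = X.T @ x                       # substitution x = X y, i.e. y = X^T x

Q_original = x @ A @ x            # x^T A x
Q_principal = np.sum(lam * y**2)  # λ_1 y_1^2 + λ_2 y_2^2
assert np.isclose(Q_original, Q_principal)
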
Positive Definiteness

A function, e.g., f(x_1, x_2) in 2D, is called positive definite if it is strictly positive
at all points except at the origin where it is zero¹.
Such functions are expressed by the following quadratic form

    f(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2,        (7)

in which the coefficients a, b, c must be constrained to ensure f(x_1, x_2) > 0.
To derive these constraints, we can rewrite f(x_1, x_2) as

    f(x_1, x_2) = a x_1^2 + (b^2/a) x_2^2 + 2b x_1 x_2 - (b^2/a) x_2^2 + c x_2^2
                = a ( x_1^2 + (b^2/a^2) x_2^2 + 2(b/a) x_1 x_2 ) + ( c - b^2/a ) x_2^2
                = a ( x_1 + (b/a) x_2 )^2 + ( c - b^2/a ) x_2^2.        (8)

¹ Origin is the stationary point of the function, where the function has its minimum.
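
A brief symbolic check (added, not part of the slides) of identity (8), using sympy with symbolic a, b, c (a assumed non-zero).

import sympy as sp

x1, x2, a, b, c = sp.symbols('x1 x2 a b c')

f = a*x1**2 + 2*b*x1*x2 + c*x2**2                      # quadratic form (7)
completed = a*(x1 + (b/a)*x2)**2 + (c - b**2/a)*x2**2  # completed-square form (8)

# The difference expands to zero, confirming (8)
assert sp.expand(f - completed) == 0
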
Positive Definiteness

Because of the complete square in the two terms, f(x_1, x_2) > 0 iff

    a > 0,    c - b^2/a > 0  ⇒  ac - b^2 > 0  ⇒  ac > b^2  ⇒  c > 0.        (9)

For coefficients a, b, c defined as in (9), f(x_1, x_2) > 0 at all points except at the
origin where it has a minimum², i.e.,

    ∂f/∂x_1 = 2a x_1 + 2b x_2 = 0 at (0, 0),    ∂f/∂x_2 = 2b x_1 + 2c x_2 = 0 at (0, 0),
    ∂²f/∂x_1² = 2a > 0,    ∂²f/∂x_1∂x_2 = ∂²f/∂x_2∂x_1 = 2b,³    ∂²f/∂x_2² = 2c > 0.        (10)

² The matrix of second derivatives is positive definite, as explained later!
³ Assuming continuous partial derivatives ∂f/∂x_1, ∂f/∂x_2.
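
An illustrative Python check (added here) of the conditions in (9) and (10): for sample coefficients, the coefficient test a > 0, ac > b^2 agrees with the Hessian [[2a, 2b], [2b, 2c]] having only positive eigenvalues. The helper name is my own.

import numpy as np

def is_positive_definite_2d(a, b, c):
    """Conditions from (9): a > 0 and ac - b^2 > 0."""
    return a > 0 and a * c - b**2 > 0

a, b, c = 2.0, 1.0, 3.0           # example coefficients
H = np.array([[2*a, 2*b],
              [2*b, 2*c]])        # Hessian of f at the origin, from (10)

# Both tests agree: coefficient conditions and positive eigenvalues of the Hessian
print(is_positive_definite_2d(a, b, c), np.all(np.linalg.eigvalsh(H) > 0))
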
Negative Definiteness
Test for Maximum:
f(x_1, x_2) has a maximum whenever -f(x_1, x_2) has a minimum. Hence,

    -f(x_1, x_2) = -a x_1^2 - 2b x_1 x_2 - c x_2^2
                 = -a x_1^2 - (b^2/a) x_2^2 - 2b x_1 x_2 + (b^2/a) x_2^2 - c x_2^2
                 = -a ( x_1^2 + (b^2/a^2) x_2^2 + 2(b/a) x_1 x_2 ) + ( b^2/a - c ) x_2^2
                 = -a ( x_1 + (b/a) x_2 )^2 - ( c - b^2/a ) x_2^2        (11)

has a minimum when

    -a > 0,  ac - b^2 > 0  ⇒  a < 0,  ac > b^2  ⇒  c < 0;        (12)

then

    f(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2 = a ( x_1 + (b/a) x_2 )^2 + ( c - b^2/a ) x_2^2 < 0        (13)

at all points except at the origin, where it has a maximum, and is called negative
definite.
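
A short numerical illustration (added, not in the slides): for coefficients satisfying (12), the coefficient matrix has only negative eigenvalues, and its negation is positive definite.

import numpy as np

a, b, c = -3.0, 1.0, -2.0                  # satisfies a < 0 and ac > b^2
A = np.array([[a, b],
              [b, c]])

print(np.linalg.eigvalsh(A))               # both eigenvalues are negative
print(np.all(np.linalg.eigvalsh(-A) > 0))  # True: -A is positive definite
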
Singular Case and Saddle Point

If ac = b^2, then the second term in (8) becomes zero, i.e.,

    f(x_1, x_2) = a ( x_1 + (b/a) x_2 )^2,        (14)

which is:
1. positive semidefinite if a > 0 ⇒ c > 0, ∵ c = b^2/a,
2. negative semidefinite if a < 0 ⇒ c < 0, ∵ c = b^2/a.
The prefix semi indicates that f(x_1, x_2) can be zero at points other than the
origin, e.g., at x_1 = b, x_2 = -a.
If ac < b^2, i.e., ac - b^2 < 0, which can also happen when a and c have opposite
signs, then the stationary point (in general the critical point⁴) is called a saddle
point of f(x_1, x_2), and the form is neither positive (semi)definite nor negative (semi)definite.

⁴ A critical point is either a point of minimum/maximum or a point of inflection.


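A small numerical illustration (added here): a semidefinite form that vanishes at a point other than the origin, and an indefinite form that takes both signs (saddle). The coefficient choices are examples only.

# Semidefinite case: ac = b^2, e.g. a = 1, b = 2, c = 4
a, b, c = 1.0, 2.0, 4.0
f = lambda x1, x2: a*x1**2 + 2*b*x1*x2 + c*x2**2
print(f(b, -a))                  # 0.0 at (x1, x2) = (b, -a) = (2, -1), away from the origin

# Saddle case: ac < b^2, e.g. a = 1, b = 0, c = -1
g = lambda x1, x2: x1**2 - x2**2
print(g(1.0, 0.0), g(0.0, 1.0))  # +1.0 and -1.0: both signs occur, so the form is indefinite
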
Positive Definiteness in Higher Dimensions
The function

    f(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2

can be seen to be given by the following matrix quadratic form

    x^T A x = [x_1  x_2] \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},        (15)

in terms of which we can rewrite the conditions for definiteness, by realizing that

    a = A_1 = |A_1|,    ac - b^2 = |A|,        (16)

as follows:
1. Positive definiteness: |A_1| > 0, |A| > 0 ⇒ λ_1, λ_2 > 0, because c > 0 and |A| = λ_1 λ_2 > 0, trace(A) = λ_1 + λ_2 = a + c > 0.
2. Negative definiteness: |A_1| < 0, |A| > 0 ⇒ λ_1, λ_2 < 0, because c < 0 and |A| = λ_1 λ_2 > 0, trace(A) = λ_1 + λ_2 = a + c < 0.
3. Positive semidefiniteness: |A_1| > 0, |A| = 0 ⇒ λ_1 = 0, λ_2 > 0 or λ_1 > 0, λ_2 = 0, because c ≥ 0 and |A| = λ_1 λ_2 = 0, trace(A) = λ_1 + λ_2 = a + c > 0.
4. Negative semidefiniteness: |A_1| < 0, |A| = 0 ⇒ λ_1 = 0, λ_2 < 0 or λ_1 < 0, λ_2 = 0, because c ≤ 0 and |A| = λ_1 λ_2 = 0, trace(A) = λ_1 + λ_2 = a + c < 0.
5. Saddle point: |A| < 0, which indicates that the two eigenvalues have opposite signs.
The corresponding matrix A is called positive definite, negative definite, positive semidefinite, negative semidefinite, and indefinite, respectively.
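
An illustrative classifier (added; the function name and test cases are my own) based on the 2x2 criteria above, cross-checked against the eigenvalues. It assumes a ≠ 0, as the criteria above do.

import numpy as np

def classify_2x2(a, b, c):
    """Classify [[a, b], [b, c]] from |A_1| = a and |A| = ac - b^2 (assumes a != 0)."""
    det = a*c - b**2
    if det > 0:
        return 'positive definite' if a > 0 else 'negative definite'
    if det == 0:
        return 'positive semidefinite' if a > 0 else 'negative semidefinite'
    return 'indefinite (saddle point)'

for a, b, c in [(2, 1, 3), (-2, 1, -3), (1, 2, 4), (1, 3, 2)]:
    eigs = np.linalg.eigvalsh(np.array([[a, b], [b, c]], dtype=float))
    print(classify_2x2(a, b, c), eigs)
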
Positive Definiteness in Higher Dimensions

Furthermore, it can be observed that

    \begin{bmatrix} a & b \\ b & c \end{bmatrix}
    \xrightarrow{R_2 \leftarrow R_2 - (b/a) R_1}
    \begin{bmatrix} a & b \\ 0 & (ac - b^2)/a \end{bmatrix},
    \qquad
    \begin{bmatrix} a & b \\ b & c \end{bmatrix}
    = \underbrace{\begin{bmatrix} 1 & 0 \\ b/a & 1 \end{bmatrix}}_{L}
      \underbrace{\begin{bmatrix} a & 0 \\ 0 & (ac - b^2)/a \end{bmatrix}}_{D}
      \underbrace{\begin{bmatrix} 1 & b/a \\ 0 & 1 \end{bmatrix}}_{U = L^T},        (17)

in which

    d_1 = a,    d_2 = (ac - b^2)/a        (18)

are called the pivots of the matrix A.
Hence,

    f(x_1, x_2) = a ( x_1 + (b/a) x_2 )^2 + ((ac - b^2)/a) x_2^2 = x^T A x

1. is positive definite if d_1, d_2 > 0.
2. is negative definite if d_1, d_2 < 0.
3. is positive semidefinite if d_2 = 0, d_1 > 0.
4. is negative semidefinite if d_2 = 0, d_1 < 0.
5. has a saddle point if d_2 < 0.

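A sketch (added here) that computes the pivots of a symmetric matrix by Gaussian elimination without row exchanges and compares them with d_1 = a, d_2 = (ac - b^2)/a from (18); the helper name is my own.

import numpy as np

def pivots(A):
    """Pivots from Gaussian elimination without row exchanges (assumes all pivots are non-zero)."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    for k in range(n - 1):
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
    return np.diag(U)

a, b, c = 2.0, 1.0, 3.0
A = np.array([[a, b],
              [b, c]])
print(pivots(A))              # [2.  2.5], i.e. [a, (ac - b^2)/a]
print(np.all(pivots(A) > 0))  # True: positive pivots, positive definite
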
Positive Definiteness in Higher Dimensions
Extending these ideas to a function of n variables, i.e.,

    f(x_1, x_2, ..., x_n) = x^T A x
      = [x_1  x_2  ...  x_n]
        \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{12} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix}
        \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
      = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k,        (19)

the quadratic function f and the symmetric matrix A are positive definite if each of the
following necessary and sufficient conditions holds:
1. x^T A x > 0 for all non-zero vectors x (definition).
2. The eigenvalues of A are positive, i.e., λ_j > 0, j = 1, 2, ..., n.
   Since A x_j = λ_j x_j, condition 1 gives x_j^T A x_j = λ_j x_j^T x_j = λ_j ||x_j||^2 > 0, and because ||x_j||^2 > 0
   for the non-zero eigenvector x_j, it follows that λ_j > 0, j = 1, 2, ..., n.
   Conversely, if λ_j > 0, j = 1, 2, ..., n, then since symmetric matrices have a full set
   of orthonormal eigenvectors, any x ∈ R^n can be written as

       x = \sum_{j=1}^{n} c_j x_j  ⇒  A x = \sum_{j=1}^{n} c_j A x_j = \sum_{j=1}^{n} c_j λ_j x_j
       ⇒  x^T A x = \sum_{j=1}^{n} c_j^2 λ_j > 0    (∵ x_j^T x_k = δ_{jk}).
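
A numpy sketch (added, not from the slides) of conditions 1 and 2: a matrix built as B^T B + I is positive definite by construction, its eigenvalues are all positive, and x^T A x > 0 for random non-zero x.

import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B.T @ B + np.eye(4)        # symmetric positive definite by construction

# Condition 2: all eigenvalues positive
print(np.all(np.linalg.eigvalsh(A) > 0))

# Condition 1 (spot check): x^T A x > 0 for random non-zero x
for _ in range(5):
    x = rng.standard_normal(4)
    assert x @ A @ x > 0
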
Positive Definiteness in Higher Dimensions
3. All upper-left submatrices A_k have positive determinants.
   For x^T A x > 0: λ_j > 0, j = 1, 2, ..., n ⇒ |A| = \prod_{j=1}^{n} λ_j > 0.
   Now, considering those vectors whose last n - k components are zero, their quadratic
   form is written as

       x^T A x = [x_k^T  0^T] \begin{bmatrix} A_k & * \\ * & * \end{bmatrix} \begin{bmatrix} x_k \\ 0 \end{bmatrix} ≡ x_k^T A_k x_k > 0

   for k = 1, 2, ..., n, which suggests that A_k is positive definite and thus its
   eigenvalues λ'_j are positive. This in turn establishes that

       |A_k| = \prod_{j=1}^{k} λ'_j > 0.

4. All pivots (without row exchanges) are positive.
   For x^T A x > 0: λ_j > 0, j = 1, 2, ..., n ⇒ |A_k| > 0, k = 1, 2, ..., n. Since

       d_k ≜ \begin{cases} |A_1|, & k = 1 \\ |A_k| / |A_{k-1}|, & k = 2, ..., n, \end{cases}

   we conclude that the pivots of a positive definite matrix are positive.


Positive Definiteness in Higher Dimensions

Conversely, we note that the pivots are the numbers that multiply the complete squares which make up x^T A x. Hence, we need d_k > 0, k = 1, 2, . . . , n, to render the quadratic form positive everywhere except at the origin.

5. There exists a matrix R with linearly independent columns such that A = R^T R.

$$x^T A x = x^T R^T R x = (Rx)^T (Rx) = \|Rx\|^2 > 0,$$

because x ≠ 0 and

$$Rx = \sum_{j=1}^{n} x_j r_j = 0$$

only when all x_j, j = 1, 2, . . . , n, are zero, due to the linearly independent columns r_j of R.

Lecture 12 15/26
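As a quick numerical illustration of the equivalent tests above (a minimal sketch assuming NumPy; the symmetric matrix A, the tolerance-free checks, and the variable names are arbitrary illustrative choices, not taken from the lecture):

```python
import numpy as np

# Illustrative symmetric matrix (an arbitrary choice; it happens to be positive definite).
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

# Test 2: all eigenvalues positive (eigvalsh exploits symmetry).
eigvals = np.linalg.eigvalsh(A)
print("eigenvalues:", eigvals, "all positive:", bool(np.all(eigvals > 0)))

# Test 3: all upper-left (leading principal) determinants |A_k| positive.
leading_dets = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
print("leading determinants:", leading_dets)

# Test 4: pivots d_1 = |A_1|, d_k = |A_k| / |A_{k-1}| positive.
pivots = [leading_dets[0]] + [leading_dets[k] / leading_dets[k - 1]
                              for k in range(1, len(leading_dets))]
print("pivots:", pivots)

# Test 5: A = R^T R with independent columns; the Cholesky factor L gives R = L^T,
# and np.linalg.cholesky raises LinAlgError when A is not positive definite.
R = np.linalg.cholesky(A).T
print("A equals R^T R:", bool(np.allclose(A, R.T @ R)))
```

All four computed checks agree for this matrix, which is exactly what the equivalence of conditions 1-5 predicts.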
Positive Semidefiniteness in Higher Dimensions

The symmetric matrix in the quadratic form given in (19) is positive semidefinite if each of the following necessary and sufficient conditions holds:
1. x^T A x ≥ 0 for all x ≠ 0. (definition)
2. Eigenvalues of A are non-negative, i.e., λ_j ≥ 0, j = 1, 2, . . . , n.
3. No principal submatrix has a negative determinant.⁵

If only upper-left submatrices were checked for positive semidefiniteness, we would not be able to distinguish between two matrices all of whose upper-left determinants are zero, e.g.,

$$\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \text{ is positive semidefinite,} \qquad \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix} \text{ is negative semidefinite,}$$

yet both of these matrices have zero determinants for all upper-left submatrices.

4. No pivots d_k, k = 1, 2, . . . , n, are negative.

5. There exists a matrix R, with possibly linearly dependent columns, such that A = R^T R.

5 A principal submatrix is a square submatrix obtained by deleting the same set of rows and the corresponding columns of a matrix.
Lecture 12 16/26
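A small numerical sketch of this pitfall (assuming NumPy; the two 2 × 2 matrices are the ones quoted above, and the printed quantities are illustrative choices):

```python
import numpy as np
from itertools import combinations

B_pos = np.array([[0.0, 0.0], [0.0,  1.0]])   # positive semidefinite
B_neg = np.array([[0.0, 0.0], [0.0, -1.0]])   # negative semidefinite

for name, B in (("B_pos", B_pos), ("B_neg", B_neg)):
    # Leading (upper-left) minors |B_1|, |B_2|: zero for both matrices, hence inconclusive.
    leading = [np.linalg.det(B[:k, :k]) for k in (1, 2)]
    # Principal minors: keep an index set S of rows and the SAME set of columns.
    principal = [np.linalg.det(B[np.ix_(S, S)])
                 for r in (1, 2) for S in combinations(range(2), r)]
    print(name,
          "leading minors:", np.round(leading, 12),
          "principal minors:", np.round(principal, 12),
          "eigenvalues:", np.linalg.eigvalsh(B))
```

The 1 × 1 principal minor taken from the second diagonal entry exposes the −1 in the second matrix, which the leading minors alone cannot see.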
Quadratic Form for a Random Function

Finally, we note that any twice continuously differentiable function F(x_1, x_2, . . . , x_n) has an associated quadratic form whose matrix A = [a_jk] (the Hessian of F) is such that

$$a_{jk} = \frac{\partial^2 F}{\partial x_j \partial x_k} = a_{kj}. \tag{20}$$

The function F has a minimum6 when the resulting quadratic form is positive
definite.

6 at the stationary point where all of its first derivatives are zero.
Lecture 12 17/26
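As a rough numerical sketch of this test (assuming NumPy; the function F, the stationary point, the step size h, and the helper name hessian are all illustrative choices):

```python
import numpy as np

def F(x):
    # Illustrative function with a stationary point at the origin.
    x1, x2 = x
    return x1**2 - x1*x2 + 2*x2**2

def hessian(f, x, h=1e-5):
    """Finite-difference approximation of the matrix a_jk = d^2 f / (dx_j dx_k)."""
    n = len(x)
    H = np.zeros((n, n))
    for j in range(n):
        for k in range(n):
            e_j, e_k = np.eye(n)[j] * h, np.eye(n)[k] * h
            H[j, k] = (f(x + e_j + e_k) - f(x + e_j - e_k)
                       - f(x - e_j + e_k) + f(x - e_j - e_k)) / (4 * h**2)
    return H

x0 = np.array([0.0, 0.0])            # stationary point of F
H = hessian(F, x0)
print("Hessian:\n", H)
print("eigenvalues:", np.linalg.eigvalsh(H))
print("local minimum:", bool(np.all(np.linalg.eigvalsh(H) > 0)))
```

Here all Hessian eigenvalues come out positive, so the quadratic form is positive definite and the stationary point is a minimum, in line with the statement above.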
Singular Value Decomposition
Any m × n matrix A can be factored into

$$A = U \Sigma V^T \quad \left(U \Sigma V^H \text{ for complex matrices}\right), \tag{21}$$

where U_{m×m} and V_{n×n} (and thus V^T) are orthogonal and Σ_{m×n} is a diagonal (rectangular) matrix.
Columns of U are eigenvectors of AA^T and columns of V are eigenvectors of A^T A.
The r singular values on the diagonal of Σ are the square roots of the non-zero eigenvalues of both AA^T and A^T A.

Proof:
Let A_{m×n} = [a_{ij}] be a matrix with Rank(A) = r. Then, A^T A is symmetric and positive semidefinite, i.e.,

$$\left(A^T A\right)^T = A^T A \quad \left(A^T A \text{ is symmetric}\right) \quad \text{and} \quad x^T \left(A^T A\right) x = x^T A^T A x = (Ax)^T (Ax) = \|Ax\|^2 = \|y\|^2 \ge 0,$$

where

$$y = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n,$$

which may equal 0 for non-zero x_j if the columns of A, i.e., a_j, are linearly dependent.

Lecture 12 18/26
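Before the proof, the factorization (21) can be checked numerically; a minimal sketch assuming NumPy, where the 3 × 2 matrix A is an arbitrary illustrative choice:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])          # arbitrary 3 x 2 example, rank 2

U, s, Vt = np.linalg.svd(A)         # U is 3x3, s holds the singular values, Vt is V^T (2x2)
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print("A equals U Sigma V^T:", bool(np.allclose(A, U @ Sigma @ Vt)))
# The squared singular values are the non-zero eigenvalues of both A^T A and A A^T.
print("s^2:           ", s**2)
print("eig(A^T A):    ", np.sort(np.linalg.eigvalsh(A.T @ A))[::-1])
print("eig(A A^T):    ", np.sort(np.linalg.eigvalsh(A @ A.T))[::-1])
```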
Singular Value Decomposition

Hence, eigenvectors of A^T A are orthogonal (chosen to be orthonormal by normalization) and are given by

$$\left(A^T A\right) v_j = \lambda_j v_j = \sigma_j^2 v_j, \quad j = 1, 2, \ldots, n, \tag{22}$$

where λ_j are the real eigenvalues⁷ of A^T A and are non-negative, so that

$$\lambda_j = \sigma_j^2 \quad (\because \lambda_j \ge 0), \quad j = 1, 2, \ldots, n. \tag{23}$$

As a result, the matrix A^T A has the following similarity transformation

$$\left(A^T A\right)_{n \times n} = V D V^{-1} = V D V^T = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_n^T \end{bmatrix}, \tag{24}$$

where V is the matrix of orthonormal eigenvectors of A^T A and is orthogonal.

7 because A^T A is symmetric!
Lecture 12 19/26
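A short sketch of (22)-(24), assuming NumPy and reusing an arbitrary example matrix; np.linalg.eigh returns the orthonormal eigenvectors (in ascending eigenvalue order) that form V:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

# Orthonormal eigenvectors of the symmetric matrix A^T A (columns of V).
lam, V = np.linalg.eigh(A.T @ A)
D = np.diag(lam)

print("V orthogonal:        ", bool(np.allclose(V.T @ V, np.eye(V.shape[1]))))
print("A^T A == V D V^T:    ", bool(np.allclose(A.T @ A, V @ D @ V.T)))
print("eigenvalues (sigma^2) non-negative:", bool(np.all(lam >= -1e-12)))
```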
Singular Value Decomposition
It must be noted that A and A^T A have the same rank, equal to r: if Ax = 0, then A^T Ax = 0, and conversely, if A^T Ax = 0, then ‖Ax‖² = x^T A^T Ax = 0, i.e., Ax = 0. Hence A and A^T A have the same null space, and therefore

dim N(A) = dim N(A^T A)
# of cols. of A − Rank(A) = # of cols. of A^T A − Rank(A^T A)
Rank(A) = Rank(A^T A)    (∵ # of cols. of A = # of cols. of A^T A).    (25)

Now, A_{m×n} represents a linear transformation F : R^n → R^m, and v_j ∈ R^n for j = 1, 2, . . . , n constitute an orthonormal basis set for R^n. Then

$$A v_i = u'_i, \tag{26}$$

where u'_i ∈ R^m is the image of v_i ∈ R^n under F.

Premultiplying (26) with A^T and then with A results in the following eigenvalue equation

$$A^T A v_i = A^T u'_i \;\Rightarrow\; \lambda_i v_i = A^T u'_i \;\Rightarrow\; A A^T u'_i = \lambda_i A v_i = \lambda_i u'_i, \quad i = 1, 2, \ldots, m. \tag{27}$$

Because (AA^T)^T = AA^T, the u'_i are orthogonal eigenvectors of AA^T. Furthermore, since AA^T is a square matrix of size m × m, i = 1, 2, . . . , m in (27).
Lecture 12 20/26
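The rank argument in (25) (and the analogous one for AA^T later) can be sketched numerically; a minimal NumPy example with a deliberately rank-deficient matrix, chosen here only for illustration:

```python
import numpy as np

# Rank-1 example: the second row is a multiple of the first, so Rank(A) = 1.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

print("rank(A)     =", np.linalg.matrix_rank(A))
print("rank(A^T A) =", np.linalg.matrix_rank(A.T @ A))
print("rank(A A^T) =", np.linalg.matrix_rank(A @ A.T))
```

All three ranks agree, matching Rank(A) = Rank(A^T A) (and, by the later argument, Rank(AA^T)).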
Singular Value Decomposition
The inner product of each u'_i with itself can be found as

$$\langle u'_i, u'_i \rangle = {u'_i}^T u'_i = (A v_i)^T A v_i = v_i^T A^T A v_i = v_i^T \lambda_i v_i = \lambda_i, \tag{28}$$

which enables us to define an orthonormal basis as

$$u_i \triangleq \frac{u'_i}{\sqrt{\lambda_i}} = \frac{u'_i}{\sigma_i} \;\Rightarrow\; \langle u_i, u_i \rangle = 1. \tag{29}$$

Hence, the eigenvalue problem in (27) can be reformulated as

$$A v_i = \sigma_i u_i \;\Rightarrow\; A A^T u_i = \lambda_i u_i = \sigma_i^2 u_i, \quad i = 1, 2, \ldots, m. \tag{30}$$

Moreover, λ_i = σ_i², i = 1, 2, . . . , m, are non-negative and real, as established earlier, which implies that AA^T is positive semidefinite, as can be verified as follows:

$$x^T A A^T x = \left(A^T x\right)^T \left(A^T x\right) = \left\|A^T x\right\|^2 = \|z\|^2 \ge 0,$$

where

$$z = x_1 \hat{a}_1 + x_2 \hat{a}_2 + \cdots + x_m \hat{a}_m,$$

which may equal 0 for non-zero x_i if the rows of A, i.e., \hat{a}_i, are linearly dependent.
Lecture 12 21/26
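The normalization in (29)-(30) can be sketched as follows (assuming NumPy; the example matrix has full column rank so every σ_i is non-zero, and the variable names are illustrative):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

# Eigenvectors v_i of A^T A, sorted so the largest eigenvalue comes first.
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
sigma = np.sqrt(lam)

# u_i = A v_i / sigma_i  (broadcasting divides column i by sigma_i).
U_r = A @ V / sigma
print("columns of U_r orthonormal:", bool(np.allclose(U_r.T @ U_r, np.eye(2))))
# Each u_i is an eigenvector of A A^T with eigenvalue sigma_i^2.
print("A A^T u_i == sigma_i^2 u_i:", bool(np.allclose(A @ A.T @ U_r, U_r * lam)))
```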
Singular Value Decomposition

Hence, the matrix AA^T has the following similarity transformation

$$\left(A A^T\right)_{m \times m} = U D U^{-1} = U D U^T = \begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix} \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_m \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_m^T \end{bmatrix}, \tag{31}$$

where U is the matrix of orthonormal eigenvectors of AA^T and is orthogonal.

It must be noted that A and AA^T have the same rank, equal to r: if A^T y = 0, then AA^T y = 0, and conversely, if AA^T y = 0, then ‖A^T y‖² = y^T AA^T y = 0, i.e., A^T y = 0. Hence A^T and AA^T have the same null space, and therefore

dim N(A^T) = dim N(AA^T)
# of cols. of A^T − Rank(A^T) = # of cols. of AA^T − Rank(AA^T)
Rank(A^T) = Rank(AA^T)    (∵ # of cols. of A^T = # of cols. of AA^T)
Rank(A) = Rank(AA^T)      (∵ Rank(A) = Rank(A^T)).    (32)




Lecture 12 22/26
Singular Value Decomposition
If Rank(A) = r = Rank(A^T A) = Rank(AA^T), then

$$A^T A v_j = \begin{cases} \sigma_j^2 v_j, & j = 1, 2, \ldots, r \\ 0, & j = r+1, \ldots, n, \end{cases} \tag{33}$$

and

$$A v_i = \begin{cases} \sigma_i u_i, & i = 1, 2, \ldots, r \\ 0, & i = r+1, \ldots, n, \end{cases} \qquad \left(\because A v_i = \sigma_i u_i \Rightarrow A A^T u_i = \sigma_i^2 u_i\right). \tag{34}$$

Writing (34) in matrix form, we get

$$A V = \begin{bmatrix} A v_1 & A v_2 & \cdots & A v_r & A v_{r+1} & \cdots & A v_n \end{bmatrix}
= \underbrace{\begin{bmatrix} u_1 & u_2 & \cdots & u_r & u_{r+1} & \cdots & u_m \end{bmatrix}}_{U}
\underbrace{\begin{bmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_r & & \\ & & & 0 & \\ & & & & \ddots \end{bmatrix}}_{\Sigma_{m \times n}}, \tag{35}$$

Lecture 12 23/26
Singular Value Decomposition
which can be rearranged, using V^{-1} = V^T, to obtain the singular value decomposition as

$$A = U \Sigma V^{-1} = U \Sigma V^T. \tag{36}$$

Intuitive Picture
The SVD of an m × n matrix A represents the linear transformation of a vector x ∈ R^n into Ax ∈ R^m, i.e., for any vector x ∈ R^n and an orthonormal basis set v_j ∈ R^n, j = 1, 2, . . . , n, we can write

$$x = \sum_{j=1}^{n} c_j v_j,$$

where c_j ≜ ⟨x, v_j⟩ = x^T v_j because ⟨v_j, v_k⟩ = δ_{j,k}, and so, noting that x^T v_j = v_j^T x,

$$A x = A \sum_{j=1}^{n} \langle x, v_j \rangle v_j = \sum_{j=1}^{n} \left(x^T v_j\right) A v_j = \sum_{j=1}^{n} \left(x^T v_j\right) \sigma_j u_j = \sum_{j=1}^{r} \left(v_j^T x\right) \sigma_j u_j = \left(\sum_{j=1}^{r} u_j \sigma_j v_j^T\right) x \;\Rightarrow\; A = \sum_{j=1}^{r} u_j \sigma_j v_j^T = U \Sigma V^T.$$

Lecture 12 24/26
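A minimal sketch of the outer-product form A = Σ_{j=1}^r σ_j u_j v_j^T derived above (assuming NumPy; the example matrix and the tolerance 1e-12 are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))                    # numerical rank

# Sum of r rank-one pieces sigma_j * u_j * v_j^T.
A_sum = sum(s[j] * np.outer(U[:, j], Vt[j, :]) for j in range(r))
print("A == sum_j sigma_j u_j v_j^T:", bool(np.allclose(A, A_sum)))

# Keeping only the largest term gives the best rank-1 approximation of A in the 2-norm
# (Eckart-Young); the remaining error equals the next singular value.
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])
print("rank-1 approximation error:", np.linalg.norm(A - A1, 2), "  sigma_2:", s[1])
```

This outer-product view is what makes truncated SVD useful for low-rank approximation: dropping the terms with the smallest σ_j discards the least of A.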
Remarks on SVD

1. If A is positive definite, then x^T A x > 0, A = A^T, and A is square of size n × n. In this case,

$$A^T A v_j = \sigma_j^2 v_j = A^2 v_j \;\Rightarrow\; A v_j = \sigma_j v_j, \quad j = 1, 2, \ldots, n,$$

and

$$A A^T u_j = \sigma_j^2 u_j = A^2 u_j \;\Rightarrow\; A u_j = \sigma_j u_j, \quad j = 1, 2, \ldots, n.$$

Hence, U = V and A = UΣV^T = XDX^T, where

$$X = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}, \qquad D = \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_n \end{bmatrix},$$

and the σ_j are now the eigenvalues of A.

For any other symmetric matrix, A^T A v_j = σ_j² v_j = A² v_j ⇒ A v_j = ±σ_j v_j = σ_j u_j, j = 1, 2, . . . , n, which gives u_j = ±v_j (U and V agree column-wise up to sign) and Σ containing σ_j = |λ_j|, the absolute values of the eigenvalues of A, on its diagonal, even though some eigenvalues of A may be negative.

Lecture 12 25/26
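A small sketch of the last point in Remark 1 (assuming NumPy; the symmetric indefinite 2 × 2 matrix below is an arbitrary choice with eigenvalues 3 and −1):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])          # symmetric, eigenvalues 3 and -1 (indefinite)

lam, X = np.linalg.eigh(A)
U, s, Vt = np.linalg.svd(A)

print("eigenvalues:     ", np.sort(lam)[::-1])   # [ 3., -1.]
print("singular values: ", s)                    # [ 3.,  1.] = |eigenvalues|
# Column-wise, u_j = +/- v_j for a symmetric matrix.
print("U == +/- V columnwise:",
      all(np.allclose(U[:, j], Vt.T[:, j]) or np.allclose(U[:, j], -Vt.T[:, j])
          for j in range(2)))
```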
Remarks on SVD

2. The first r columns of U form an orthonormal basis for the column space of A, i.e., R(A).

3. The last m − r columns of U form an orthonormal basis for the left nullspace of A, i.e., N(A^T).
4. The first r columns of V form an orthonormal basis for the row space of A, i.e., R(A^T).
5. The last n − r columns of V form an orthonormal basis for the nullspace of A, i.e., N(A).

Prove the statements in remarks 2-5 as a homework exercise!

Lecture 12 26/26
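Before proving them, remarks 2-5 can also be checked numerically; a sketch assuming NumPy, with a rank-1 example (m = 2, n = 3, r = 1) chosen only for illustration:

```python
import numpy as np

# Rank-1 example: m = 2, n = 3, r = 1.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))
V = Vt.T

# Last n - r columns of V span the nullspace of A:  A v = 0.
print("A @ V[:, r:] ~ 0:      ", bool(np.allclose(A @ V[:, r:], 0)))
# Last m - r columns of U span the left nullspace:  A^T u = 0.
print("A^T @ U[:, r:] ~ 0:    ", bool(np.allclose(A.T @ U[:, r:], 0)))
# First r columns of U span the column space of A: projecting A onto them changes nothing.
proj = U[:, :r] @ U[:, :r].T
print("columns of A in span of U[:, :r]:", bool(np.allclose(proj @ A, A)))
```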
