
EE506 – Engineering Mathematics

Lecture 12
Quadratic Forms, Positive Definiteness and Singular Value
Decomposition

Dr. Adeem Aslam


Assistant Professor

Department of Electrical Engineering


University of Engineering and Technology, Lahore, Pakistan

October 24, 2021

Outline

1 Quadratic Forms
Principal Axis Theorem

2 Positive Definiteness
Negative Definiteness
Singular Case and Saddle Point
Higher Dimensions
Quadratic Form for a Random Function

3 Singular Value Decomposition


Remarks

Quadratic Forms

A quadratic form Q in the components x_1, x_2, ..., x_n of a vector x is a sum of
n^2 terms of the form

    Q = x^T A x = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k
      = a_{11} x_1^2 + a_{12} x_1 x_2 + ... + a_{1n} x_1 x_n
      + a_{21} x_2 x_1 + a_{22} x_2^2 + ... + a_{2n} x_2 x_n
      + ...
      + a_{n1} x_n x_1 + a_{n2} x_n x_2 + ... + a_{nn} x_n^2,        (1)

where A_{n×n} = [a_{jk}] is called the coefficient matrix of the form.

Since Q = Q^T (Q being a scalar),

    x^T A x = (x^T A x)^T = x^T A^T x,        (2)

so A and A^T define the same quadratic form; hence A can be taken to be symmetric, A = A^T.

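As an illustration (added here, not part of the original slides), the following Python sketch evaluates a quadratic form both as the matrix product x^T A x and as the double sum in (1); the matrix and vector entries are arbitrary example values.

import numpy as np

# Arbitrary symmetric coefficient matrix and vector (example values only)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
x = np.array([1.0, -2.0, 0.5])

# Matrix form of the quadratic form, Q = x^T A x
Q_matrix = x @ A @ x

# Double-sum form, Q = sum_j sum_k a_jk x_j x_k, as in (1)
Q_sum = sum(A[j, k] * x[j] * x[k]
            for j in range(A.shape[0]) for k in range(A.shape[1]))

print(Q_matrix, Q_sum)  # both evaluate to the same number
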
Quadratic Forms

If A is not symmetric, we can take off-diagonal terms together in pairs and write the
result as a sum of two equal terms, which is equivalent to replacing A with (A + A^T)/2,
as illustrated in the example below.

Example: Consider

    x^T A x = [x_1  x_2] \begin{bmatrix} 3 & 4 \\ 6 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
            = 3x_1^2 + 4x_1 x_2 + 6x_1 x_2 + 2x_2^2
            = 3x_1^2 + 10x_1 x_2 + 2x_2^2 = 3x_1^2 + 5x_1 x_2 + 5x_1 x_2 + 2x_2^2.

Then,

    x^T C x = [x_1  x_2] \begin{bmatrix} 3 & 5 \\ 5 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x^T A x,        C = (A + A^T)/2.

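A small Python check (added for illustration, not in the slides) of the example above: the non-symmetric A = [[3, 4], [6, 2]] and its symmetric part C = (A + A^T)/2 give the same quadratic form for any x.

import numpy as np

A = np.array([[3.0, 4.0],
              [6.0, 2.0]])   # non-symmetric coefficient matrix from the example
C = (A + A.T) / 2            # symmetric part, here [[3, 5], [5, 2]]

rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.standard_normal(2)
    # x^T A x and x^T C x agree because the antisymmetric part of A contributes zero
    assert np.isclose(x @ A @ x, x @ C @ x)
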
Quadratic Form – Principal Axis Theorem

Statement:

The substitution

    x = X y,        (3)

transforms the quadratic form

    Q = x^T A x = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k,    a_{jk} = a_{kj},        (4)

to the principal axis form or the canonical form, given by

    Q = y^T D y = λ_1 y_1^2 + λ_2 y_2^2 + ... + λ_n y_n^2 = \sum_{j=1}^{n} λ_j y_j^2,        (5)

where D = diag(λ_1, λ_2, ..., λ_n), λ_1, λ_2, ..., λ_n are the (not necessarily distinct) eigenvalues of the (symmetric) matrix A, and X is the orthogonal matrix with the orthonormal eigenvectors of A, i.e., x_1, x_2, ..., x_n, as its column vectors.

Quadratic Form – Principal Axis Theorem
Proof:

It is known that

    A x_1 = λ_1 x_1,  A x_2 = λ_2 x_2,  ...,  A x_n = λ_n x_n,

where x_1, x_2, ..., x_n are the orthonormal eigenvectors of A. Then, from the similarity
transformation,

    D = X^{-1} A X = X^T A X,    X = [x_1  x_2  ...  x_n]
    ⇒ Q = x^T A x = x^T X D X^{-1} x = x^T X D X^T x.

Let

    X^T x ≜ y ⇒ X^{-1} x = y ⇒ x = X y
    ⇒ x^T X = (X^T x)^T = (X^{-1} x)^T = y^T.

Hence,

    Q = x^T X D X^T x = y^T D y = λ_1 y_1^2 + λ_2 y_2^2 + ... + λ_n y_n^2.        (6)

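A numerical sketch of the principal axis transformation (added here, not from the slides): numpy.linalg.eigh returns the eigenvalues and an orthogonal matrix of eigenvectors of a symmetric matrix, and the substitution y = X^T x reduces the form to a weighted sum of squares as in (6). The matrix and vector are example values.

import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])        # symmetric coefficient matrix (example values)
lam, X = np.linalg.eigh(A)        # eigenvalues and orthonormal eigenvectors (columns of X)

x = np.array([2.0, -1.0])
y = X.T @ x                       # substitution x = X y, i.e. y = X^T x

Q_original = x @ A @ x            # x^T A x
Q_principal = np.sum(lam * y**2)  # λ_1 y_1^2 + λ_2 y_2^2
assert np.isclose(Q_original, Q_principal)
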
Positive Definiteness

A function, e.g., f(x_1, x_2) in 2D, is called positive definite if it is strictly positive
at all points except at the origin where it is zero¹.
Such functions are expressed by the following quadratic form

    f(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2,        (7)

in which the coefficients a, b, c must be constrained to ensure f(x_1, x_2) > 0.
To derive these constraints, we can rewrite f(x_1, x_2) as

    f(x_1, x_2) = a x_1^2 + (b^2/a) x_2^2 + 2b x_1 x_2 - (b^2/a) x_2^2 + c x_2^2
                = a ( x_1^2 + (b^2/a^2) x_2^2 + 2(b/a) x_1 x_2 ) + ( c - b^2/a ) x_2^2
                = a ( x_1 + (b/a) x_2 )^2 + ( c - b^2/a ) x_2^2.        (8)

¹ Origin is the stationary point of the function, where the function has its minimum.
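
A brief symbolic check (added, not part of the slides) of identity (8), using sympy with symbolic a, b, c (a assumed non-zero).

import sympy as sp

x1, x2, a, b, c = sp.symbols('x1 x2 a b c')

f = a*x1**2 + 2*b*x1*x2 + c*x2**2                      # quadratic form (7)
completed = a*(x1 + (b/a)*x2)**2 + (c - b**2/a)*x2**2  # completed-square form (8)

# The difference expands to zero, confirming (8)
assert sp.expand(f - completed) == 0
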
Positive Definiteness

Because of the complete square in the two terms, f(x_1, x_2) > 0 iff

    a > 0,    c - b^2/a > 0  ⇒  ac - b^2 > 0  ⇒  ac > b^2  ⇒  c > 0.        (9)

For coefficients a, b, c defined as in (9), f(x_1, x_2) > 0 at all points except at the
origin where it has a minimum², i.e.,

    ∂f/∂x_1 = 2a x_1 + 2b x_2 = 0 at (0, 0),    ∂f/∂x_2 = 2b x_1 + 2c x_2 = 0 at (0, 0),
    ∂²f/∂x_1² = 2a > 0,    ∂²f/∂x_1∂x_2 = ∂²f/∂x_2∂x_1 = 2b,³    ∂²f/∂x_2² = 2c > 0.        (10)

² The matrix of second derivatives is positive definite, as explained later!
³ Assuming continuous partial derivatives ∂f/∂x_1, ∂f/∂x_2.
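
An illustrative Python check (added here) of the conditions in (9) and (10): for sample coefficients, the coefficient test a > 0, ac > b^2 agrees with the Hessian [[2a, 2b], [2b, 2c]] having only positive eigenvalues. The helper name is my own.

import numpy as np

def is_positive_definite_2d(a, b, c):
    """Conditions from (9): a > 0 and ac - b^2 > 0."""
    return a > 0 and a * c - b**2 > 0

a, b, c = 2.0, 1.0, 3.0           # example coefficients
H = np.array([[2*a, 2*b],
              [2*b, 2*c]])        # Hessian of f at the origin, from (10)

# Both tests agree: coefficient conditions and positive eigenvalues of the Hessian
print(is_positive_definite_2d(a, b, c), np.all(np.linalg.eigvalsh(H) > 0))
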
Negative Definiteness
Test for Maximum:
f(x_1, x_2) has a maximum whenever -f(x_1, x_2) has a minimum. Hence,

    -f(x_1, x_2) = -a x_1^2 - 2b x_1 x_2 - c x_2^2
                 = -a x_1^2 - (b^2/a) x_2^2 - 2b x_1 x_2 + (b^2/a) x_2^2 - c x_2^2
                 = -a ( x_1^2 + (b^2/a^2) x_2^2 + 2(b/a) x_1 x_2 ) + ( b^2/a - c ) x_2^2
                 = -a ( x_1 + (b/a) x_2 )^2 - ( c - b^2/a ) x_2^2        (11)

has a minimum when

    -a > 0,  ac - b^2 > 0  ⇒  a < 0,  ac > b^2  ⇒  c < 0;        (12)

then

    f(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2 = a ( x_1 + (b/a) x_2 )^2 + ( c - b^2/a ) x_2^2 < 0        (13)

at all points except at the origin, where it has a maximum, and is called negative
definite.
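
A short numerical illustration (added, not in the slides): for coefficients satisfying (12), the coefficient matrix has only negative eigenvalues, and its negation is positive definite.

import numpy as np

a, b, c = -3.0, 1.0, -2.0                  # satisfies a < 0 and ac > b^2
A = np.array([[a, b],
              [b, c]])

print(np.linalg.eigvalsh(A))               # both eigenvalues are negative
print(np.all(np.linalg.eigvalsh(-A) > 0))  # True: -A is positive definite
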
Singular Case and Saddle Point

If ac = b^2, then the second term in (8) becomes zero, i.e.,

    f(x_1, x_2) = a ( x_1 + (b/a) x_2 )^2,        (14)

which is:
1. positive semidefinite if a > 0 ⇒ c > 0, ∵ c = b^2/a,
2. negative semidefinite if a < 0 ⇒ c < 0, ∵ c = b^2/a.
The prefix semi indicates that f(x_1, x_2) can be zero at points other than the
origin, e.g., at x_1 = b, x_2 = -a.
If ac < b^2, i.e., ac - b^2 < 0, which can also happen when a and c have opposite
signs, then the stationary point (in general the critical point⁴) is called a saddle
point of f(x_1, x_2), and the form is neither positive (semi)definite nor negative (semi)definite.

⁴ A critical point is either a point of minimum/maximum or a point of inflection.


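A small numerical illustration (added here): a semidefinite form that vanishes at a point other than the origin, and an indefinite form that takes both signs (saddle). The coefficient choices are examples only.

# Semidefinite case: ac = b^2, e.g. a = 1, b = 2, c = 4
a, b, c = 1.0, 2.0, 4.0
f = lambda x1, x2: a*x1**2 + 2*b*x1*x2 + c*x2**2
print(f(b, -a))                  # 0.0 at (x1, x2) = (b, -a) = (2, -1), away from the origin

# Saddle case: ac < b^2, e.g. a = 1, b = 0, c = -1
g = lambda x1, x2: x1**2 - x2**2
print(g(1.0, 0.0), g(0.0, 1.0))  # +1.0 and -1.0: both signs occur, so the form is indefinite
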
Positive Definiteness in Higher Dimensions
The function

    f(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2

can be seen to be given by the following matrix quadratic form

    x^T A x = [x_1  x_2] \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},        (15)

in terms of which we can rewrite the conditions for definiteness, by realizing that

    a = A_1 = |A_1|,    ac - b^2 = |A|,        (16)

as follows:
1. Positive definiteness: |A_1| > 0, |A| > 0 ⇒ λ_1, λ_2 > 0, because c > 0 and |A| = λ_1 λ_2 > 0, trace(A) = λ_1 + λ_2 = a + c > 0.
2. Negative definiteness: |A_1| < 0, |A| > 0 ⇒ λ_1, λ_2 < 0, because c < 0 and |A| = λ_1 λ_2 > 0, trace(A) = λ_1 + λ_2 = a + c < 0.
3. Positive semidefiniteness: |A_1| > 0, |A| = 0 ⇒ λ_1 = 0, λ_2 > 0 or λ_1 > 0, λ_2 = 0, because c ≥ 0 and |A| = λ_1 λ_2 = 0, trace(A) = λ_1 + λ_2 = a + c > 0.
4. Negative semidefiniteness: |A_1| < 0, |A| = 0 ⇒ λ_1 = 0, λ_2 < 0 or λ_1 < 0, λ_2 = 0, because c ≤ 0 and |A| = λ_1 λ_2 = 0, trace(A) = λ_1 + λ_2 = a + c < 0.
5. Saddle point: |A| < 0, which indicates that the two eigenvalues have opposite signs.
The corresponding matrix A is called positive definite, negative definite, positive semidefinite, negative semidefinite, and indefinite, respectively.
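
An illustrative classifier (added; the function name and test cases are my own) based on the 2x2 criteria above, cross-checked against the eigenvalues. It assumes a ≠ 0, as the criteria above do.

import numpy as np

def classify_2x2(a, b, c):
    """Classify [[a, b], [b, c]] from |A_1| = a and |A| = ac - b^2 (assumes a != 0)."""
    det = a*c - b**2
    if det > 0:
        return 'positive definite' if a > 0 else 'negative definite'
    if det == 0:
        return 'positive semidefinite' if a > 0 else 'negative semidefinite'
    return 'indefinite (saddle point)'

for a, b, c in [(2, 1, 3), (-2, 1, -3), (1, 2, 4), (1, 3, 2)]:
    eigs = np.linalg.eigvalsh(np.array([[a, b], [b, c]], dtype=float))
    print(classify_2x2(a, b, c), eigs)
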
Positive Definiteness in Higher Dimensions

Furthermore, it can be observed that

    \begin{bmatrix} a & b \\ b & c \end{bmatrix}
    \xrightarrow{R_2 \leftarrow R_2 - (b/a) R_1}
    \begin{bmatrix} a & b \\ 0 & (ac - b^2)/a \end{bmatrix},
    \qquad
    \begin{bmatrix} a & b \\ b & c \end{bmatrix}
    = \underbrace{\begin{bmatrix} 1 & 0 \\ b/a & 1 \end{bmatrix}}_{L}
      \underbrace{\begin{bmatrix} a & 0 \\ 0 & (ac - b^2)/a \end{bmatrix}}_{D}
      \underbrace{\begin{bmatrix} 1 & b/a \\ 0 & 1 \end{bmatrix}}_{U = L^T},        (17)

in which

    d_1 = a,    d_2 = (ac - b^2)/a        (18)

are called the pivots of the matrix A.
Hence,

    f(x_1, x_2) = a ( x_1 + (b/a) x_2 )^2 + ((ac - b^2)/a) x_2^2 = x^T A x

1. is positive definite if d_1, d_2 > 0.
2. is negative definite if d_1, d_2 < 0.
3. is positive semidefinite if d_2 = 0, d_1 > 0.
4. is negative semidefinite if d_2 = 0, d_1 < 0.
5. has a saddle point if d_2 < 0.

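A sketch (added here) that computes the pivots of a symmetric matrix by Gaussian elimination without row exchanges and compares them with d_1 = a, d_2 = (ac - b^2)/a from (18); the helper name is my own.

import numpy as np

def pivots(A):
    """Pivots from Gaussian elimination without row exchanges (assumes all pivots are non-zero)."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    for k in range(n - 1):
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
    return np.diag(U)

a, b, c = 2.0, 1.0, 3.0
A = np.array([[a, b],
              [b, c]])
print(pivots(A))              # [2.  2.5], i.e. [a, (ac - b^2)/a]
print(np.all(pivots(A) > 0))  # True: positive pivots, positive definite
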
Positive Definiteness in Higher Dimensions
Extending these ideas to a function of n variables, i.e.,

    f(x_1, x_2, ..., x_n) = x^T A x
      = [x_1  x_2  ...  x_n]
        \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{12} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix}
        \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
      = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k,        (19)

the quadratic function f and the symmetric matrix A are positive definite if each of the
following necessary and sufficient conditions holds:
1. x^T A x > 0 for all non-zero vectors x (definition).
2. The eigenvalues of A are positive, i.e., λ_j > 0, j = 1, 2, ..., n.
   Since A x_j = λ_j x_j, condition 1 gives x_j^T A x_j = λ_j x_j^T x_j = λ_j ||x_j||^2 > 0, and because ||x_j||^2 > 0
   for the non-zero eigenvector x_j, it follows that λ_j > 0, j = 1, 2, ..., n.
   Conversely, if λ_j > 0, j = 1, 2, ..., n, then since symmetric matrices have a full set
   of orthonormal eigenvectors, any x ∈ R^n can be written as

       x = \sum_{j=1}^{n} c_j x_j  ⇒  A x = \sum_{j=1}^{n} c_j A x_j = \sum_{j=1}^{n} c_j λ_j x_j
       ⇒  x^T A x = \sum_{j=1}^{n} c_j^2 λ_j > 0    (∵ x_j^T x_k = δ_{jk}).
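
A numpy sketch (added, not from the slides) of conditions 1 and 2: a matrix built as B^T B + I is positive definite by construction, its eigenvalues are all positive, and x^T A x > 0 for random non-zero x.

import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B.T @ B + np.eye(4)        # symmetric positive definite by construction

# Condition 2: all eigenvalues positive
print(np.all(np.linalg.eigvalsh(A) > 0))

# Condition 1 (spot check): x^T A x > 0 for random non-zero x
for _ in range(5):
    x = rng.standard_normal(4)
    assert x @ A @ x > 0
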
Positive Definiteness in Higher Dimensions
3. All upper-left submatrices A_k have positive determinants.
   For x^T A x > 0: λ_j > 0, j = 1, 2, ..., n ⇒ |A| = \prod_{j=1}^{n} λ_j > 0.
   Now, considering those vectors whose last n - k components are zero, their quadratic
   form is written as

       x^T A x = [x_k^T  0^T] \begin{bmatrix} A_k & * \\ * & * \end{bmatrix} \begin{bmatrix} x_k \\ 0 \end{bmatrix} ≡ x_k^T A_k x_k > 0

   for k = 1, 2, ..., n, which suggests that A_k is positive definite and thus its
   eigenvalues λ'_j are positive. This in turn establishes that

       |A_k| = \prod_{j=1}^{k} λ'_j > 0.

4. All pivots (without row exchanges) are positive.
   For x^T A x > 0: λ_j > 0, j = 1, 2, ..., n ⇒ |A_k| > 0, k = 1, 2, ..., n. Since

       d_k ≜ \begin{cases} |A_1|, & k = 1 \\ |A_k| / |A_{k-1}|, & k = 2, ..., n, \end{cases}

   we conclude that the pivots of a positive definite matrix are positive.


Positive Definiteness in Higher Dimensions

Conversely, we note that the pivots are the numbers that multiply the complete squares which make up x^T A x. Hence, we need d_k > 0, k = 1, 2, . . . , n, to render the quadratic form positive everywhere except at the origin.

5. There exists a matrix R with linearly independent columns such that A = R^T R.

$$x^T A x = x^T R^T R x = (Rx)^T (Rx) = \|Rx\|^2 > 0,$$

because x ≠ 0 and

$$Rx = \sum_{j=1}^{n} x_j r_j = 0$$

only when all x_j, j = 1, 2, . . . , n, are zero, due to the linearly independent columns r_j of R.

Lecture 12 15/26
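As a quick numerical illustration of the equivalent tests above (a minimal sketch assuming NumPy; the symmetric matrix A, the tolerance-free checks, and the variable names are arbitrary illustrative choices, not taken from the lecture):

```python
import numpy as np

# Illustrative symmetric matrix (an arbitrary choice; it happens to be positive definite).
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

# Test 2: all eigenvalues positive (eigvalsh exploits symmetry).
eigvals = np.linalg.eigvalsh(A)
print("eigenvalues:", eigvals, "all positive:", bool(np.all(eigvals > 0)))

# Test 3: all upper-left (leading principal) determinants |A_k| positive.
leading_dets = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
print("leading determinants:", leading_dets)

# Test 4: pivots d_1 = |A_1|, d_k = |A_k| / |A_{k-1}| positive.
pivots = [leading_dets[0]] + [leading_dets[k] / leading_dets[k - 1]
                              for k in range(1, len(leading_dets))]
print("pivots:", pivots)

# Test 5: A = R^T R with independent columns; the Cholesky factor L gives R = L^T,
# and np.linalg.cholesky raises LinAlgError when A is not positive definite.
R = np.linalg.cholesky(A).T
print("A equals R^T R:", bool(np.allclose(A, R.T @ R)))
```

All four computed checks agree for this matrix, which is exactly what the equivalence of conditions 1-5 predicts.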
Positive Semidefiniteness in Higher Dimensions

The symmetric matrix in the quadratic form given in (19) is positive semidefinite if each of the following necessary and sufficient conditions holds:
1. x^T A x ≥ 0 for all x ≠ 0. (definition)
2. Eigenvalues of A are non-negative, i.e., λ_j ≥ 0, j = 1, 2, . . . , n.
3. No principal submatrix has a negative determinant.⁵

If only upper-left submatrices were checked for positive semidefiniteness, we would not be able to distinguish between two matrices all of whose upper-left determinants are zero, e.g.,

$$\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \text{ is positive semidefinite,} \qquad \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix} \text{ is negative semidefinite,}$$

yet both of these matrices have zero determinants for all upper-left submatrices.

4. No pivots d_k, k = 1, 2, . . . , n, are negative.

5. There exists a matrix R, with possibly linearly dependent columns, such that A = R^T R.

5 A principal submatrix is a square submatrix obtained by deleting the same set of rows and the corresponding columns of a matrix.
Lecture 12 16/26
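A small numerical sketch of this pitfall (assuming NumPy; the two 2 × 2 matrices are the ones quoted above, and the printed quantities are illustrative choices):

```python
import numpy as np
from itertools import combinations

B_pos = np.array([[0.0, 0.0], [0.0,  1.0]])   # positive semidefinite
B_neg = np.array([[0.0, 0.0], [0.0, -1.0]])   # negative semidefinite

for name, B in (("B_pos", B_pos), ("B_neg", B_neg)):
    # Leading (upper-left) minors |B_1|, |B_2|: zero for both matrices, hence inconclusive.
    leading = [np.linalg.det(B[:k, :k]) for k in (1, 2)]
    # Principal minors: keep an index set S of rows and the SAME set of columns.
    principal = [np.linalg.det(B[np.ix_(S, S)])
                 for r in (1, 2) for S in combinations(range(2), r)]
    print(name,
          "leading minors:", np.round(leading, 12),
          "principal minors:", np.round(principal, 12),
          "eigenvalues:", np.linalg.eigvalsh(B))
```

The 1 × 1 principal minor taken from the second diagonal entry exposes the −1 in the second matrix, which the leading minors alone cannot see.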
Quadratic Form for a Random Function

Finally, we note that any twice continuously differentiable function F(x_1, x_2, . . . , x_n) has an associated quadratic form whose matrix A = [a_jk] (the Hessian of F) is such that

$$a_{jk} = \frac{\partial^2 F}{\partial x_j \partial x_k} = a_{kj}. \tag{20}$$

The function F has a minimum6 when the resulting quadratic form is positive
definite.

6 at the stationary point where all of its first derivatives are zero.
Lecture 12 17/26
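As a rough numerical sketch of this test (assuming NumPy; the function F, the stationary point, the step size h, and the helper name hessian are all illustrative choices):

```python
import numpy as np

def F(x):
    # Illustrative function with a stationary point at the origin.
    x1, x2 = x
    return x1**2 - x1*x2 + 2*x2**2

def hessian(f, x, h=1e-5):
    """Finite-difference approximation of the matrix a_jk = d^2 f / (dx_j dx_k)."""
    n = len(x)
    H = np.zeros((n, n))
    for j in range(n):
        for k in range(n):
            e_j, e_k = np.eye(n)[j] * h, np.eye(n)[k] * h
            H[j, k] = (f(x + e_j + e_k) - f(x + e_j - e_k)
                       - f(x - e_j + e_k) + f(x - e_j - e_k)) / (4 * h**2)
    return H

x0 = np.array([0.0, 0.0])            # stationary point of F
H = hessian(F, x0)
print("Hessian:\n", H)
print("eigenvalues:", np.linalg.eigvalsh(H))
print("local minimum:", bool(np.all(np.linalg.eigvalsh(H) > 0)))
```

Here all Hessian eigenvalues come out positive, so the quadratic form is positive definite and the stationary point is a minimum, in line with the statement above.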
Singular Value Decomposition
Any m × n matrix A can be factored into

$$A = U \Sigma V^T \quad \left(U \Sigma V^H \text{ for complex matrices}\right), \tag{21}$$

where U_{m×m} and V_{n×n} (and thus V^T) are orthogonal and Σ_{m×n} is a diagonal (rectangular) matrix.
Columns of U are eigenvectors of AA^T and columns of V are eigenvectors of A^T A.
The r singular values on the diagonal of Σ are the square roots of the non-zero eigenvalues of both AA^T and A^T A.

Proof:
Let A_{m×n} = [a_{ij}] be a matrix with Rank(A) = r. Then, A^T A is symmetric and positive semidefinite, i.e.,

$$\left(A^T A\right)^T = A^T A \quad \left(A^T A \text{ is symmetric}\right) \quad \text{and} \quad x^T \left(A^T A\right) x = x^T A^T A x = (Ax)^T (Ax) = \|Ax\|^2 = \|y\|^2 \ge 0,$$

where

$$y = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n,$$

which may equal 0 for non-zero x_j if the columns of A, i.e., a_j, are linearly dependent.

Lecture 12 18/26
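Before the proof, the factorization (21) can be checked numerically; a minimal sketch assuming NumPy, where the 3 × 2 matrix A is an arbitrary illustrative choice:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])          # arbitrary 3 x 2 example, rank 2

U, s, Vt = np.linalg.svd(A)         # U is 3x3, s holds the singular values, Vt is V^T (2x2)
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print("A equals U Sigma V^T:", bool(np.allclose(A, U @ Sigma @ Vt)))
# The squared singular values are the non-zero eigenvalues of both A^T A and A A^T.
print("s^2:           ", s**2)
print("eig(A^T A):    ", np.sort(np.linalg.eigvalsh(A.T @ A))[::-1])
print("eig(A A^T):    ", np.sort(np.linalg.eigvalsh(A @ A.T))[::-1])
```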
Singular Value Decomposition

Hence, eigenvectors of A^T A are orthogonal (chosen to be orthonormal by normalization) and are given by

$$\left(A^T A\right) v_j = \lambda_j v_j = \sigma_j^2 v_j, \quad j = 1, 2, \ldots, n, \tag{22}$$

where λ_j are the real eigenvalues⁷ of A^T A and are non-negative, so that

$$\lambda_j = \sigma_j^2 \quad (\because \lambda_j \ge 0), \quad j = 1, 2, \ldots, n. \tag{23}$$

As a result, the matrix A^T A has the following similarity transformation

$$\left(A^T A\right)_{n \times n} = V D V^{-1} = V D V^T = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_n^T \end{bmatrix}, \tag{24}$$

where V is the matrix of orthonormal eigenvectors of A^T A and is orthogonal.

7 because A^T A is symmetric!
Lecture 12 19/26
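A short sketch of (22)-(24), assuming NumPy and reusing an arbitrary example matrix; np.linalg.eigh returns the orthonormal eigenvectors (in ascending eigenvalue order) that form V:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

# Orthonormal eigenvectors of the symmetric matrix A^T A (columns of V).
lam, V = np.linalg.eigh(A.T @ A)
D = np.diag(lam)

print("V orthogonal:        ", bool(np.allclose(V.T @ V, np.eye(V.shape[1]))))
print("A^T A == V D V^T:    ", bool(np.allclose(A.T @ A, V @ D @ V.T)))
print("eigenvalues (sigma^2) non-negative:", bool(np.all(lam >= -1e-12)))
```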
Singular Value Decomposition
It must be noted that A and A^T A have the same rank, equal to r: if Ax = 0, then A^T Ax = 0, and conversely, if A^T Ax = 0, then ‖Ax‖² = x^T A^T Ax = 0, i.e., Ax = 0. Hence A and A^T A have the same null space, and therefore

dim N(A) = dim N(A^T A)
# of cols. of A − Rank(A) = # of cols. of A^T A − Rank(A^T A)
Rank(A) = Rank(A^T A)    (∵ # of cols. of A = # of cols. of A^T A).    (25)

Now, A_{m×n} represents a linear transformation F : R^n → R^m, and v_j ∈ R^n for j = 1, 2, . . . , n constitute an orthonormal basis set for R^n. Then

$$A v_i = u'_i, \tag{26}$$

where u'_i ∈ R^m is the image of v_i ∈ R^n under F.

Premultiplying (26) with A^T and then with A results in the following eigenvalue equation

$$A^T A v_i = A^T u'_i \;\Rightarrow\; \lambda_i v_i = A^T u'_i \;\Rightarrow\; A A^T u'_i = \lambda_i A v_i = \lambda_i u'_i, \quad i = 1, 2, \ldots, m. \tag{27}$$

Because (AA^T)^T = AA^T, the u'_i are orthogonal eigenvectors of AA^T. Furthermore, since AA^T is a square matrix of size m × m, i = 1, 2, . . . , m in (27).
Lecture 12 20/26
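The rank argument in (25) (and the analogous one for AA^T later) can be sketched numerically; a minimal NumPy example with a deliberately rank-deficient matrix, chosen here only for illustration:

```python
import numpy as np

# Rank-1 example: the second row is a multiple of the first, so Rank(A) = 1.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

print("rank(A)     =", np.linalg.matrix_rank(A))
print("rank(A^T A) =", np.linalg.matrix_rank(A.T @ A))
print("rank(A A^T) =", np.linalg.matrix_rank(A @ A.T))
```

All three ranks agree, matching Rank(A) = Rank(A^T A) (and, by the later argument, Rank(AA^T)).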
Singular Value Decomposition
The inner product of each u'_i with itself can be found as

$$\langle u'_i, u'_i \rangle = {u'_i}^T u'_i = (A v_i)^T A v_i = v_i^T A^T A v_i = v_i^T \lambda_i v_i = \lambda_i, \tag{28}$$

which enables us to define an orthonormal basis as

$$u_i \triangleq \frac{u'_i}{\sqrt{\lambda_i}} = \frac{u'_i}{\sigma_i} \;\Rightarrow\; \langle u_i, u_i \rangle = 1. \tag{29}$$

Hence, the eigenvalue problem in (27) can be reformulated as

$$A v_i = \sigma_i u_i \;\Rightarrow\; A A^T u_i = \lambda_i u_i = \sigma_i^2 u_i, \quad i = 1, 2, \ldots, m. \tag{30}$$

Moreover, λ_i = σ_i², i = 1, 2, . . . , m, are non-negative and real, as established earlier, which implies that AA^T is positive semidefinite, as can be verified as follows:

$$x^T A A^T x = \left(A^T x\right)^T \left(A^T x\right) = \left\|A^T x\right\|^2 = \|z\|^2 \ge 0,$$

where

$$z = x_1 \hat{a}_1 + x_2 \hat{a}_2 + \cdots + x_m \hat{a}_m,$$

which may equal 0 for non-zero x_i if the rows of A, i.e., \hat{a}_i, are linearly dependent.
Lecture 12 21/26
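The normalization in (29)-(30) can be sketched as follows (assuming NumPy; the example matrix has full column rank so every σ_i is non-zero, and the variable names are illustrative):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

# Eigenvectors v_i of A^T A, sorted so the largest eigenvalue comes first.
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
sigma = np.sqrt(lam)

# u_i = A v_i / sigma_i  (broadcasting divides column i by sigma_i).
U_r = A @ V / sigma
print("columns of U_r orthonormal:", bool(np.allclose(U_r.T @ U_r, np.eye(2))))
# Each u_i is an eigenvector of A A^T with eigenvalue sigma_i^2.
print("A A^T u_i == sigma_i^2 u_i:", bool(np.allclose(A @ A.T @ U_r, U_r * lam)))
```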
Singular Value Decomposition

Hence, the matrix AA^T has the following similarity transformation

$$\left(A A^T\right)_{m \times m} = U D U^{-1} = U D U^T = \begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix} \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_m \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_m^T \end{bmatrix}, \tag{31}$$

where U is the matrix of orthonormal eigenvectors of AA^T and is orthogonal.

It must be noted that A and AA^T have the same rank, equal to r: if A^T y = 0, then AA^T y = 0, and conversely, if AA^T y = 0, then ‖A^T y‖² = y^T AA^T y = 0, i.e., A^T y = 0. Hence A^T and AA^T have the same null space, and therefore

dim N(A^T) = dim N(AA^T)
# of cols. of A^T − Rank(A^T) = # of cols. of AA^T − Rank(AA^T)
Rank(A^T) = Rank(AA^T)    (∵ # of cols. of A^T = # of cols. of AA^T)
Rank(A) = Rank(AA^T)      (∵ Rank(A) = Rank(A^T)).    (32)




Lecture 12 22/26
Singular Value Decomposition
If Rank(A) = r = Rank(A^T A) = Rank(AA^T), then

$$A^T A v_j = \begin{cases} \sigma_j^2 v_j, & j = 1, 2, \ldots, r \\ 0, & j = r+1, \ldots, n, \end{cases} \tag{33}$$

and

$$A v_i = \begin{cases} \sigma_i u_i, & i = 1, 2, \ldots, r \\ 0, & i = r+1, \ldots, n, \end{cases} \qquad \left(\because A v_i = \sigma_i u_i \Rightarrow A A^T u_i = \sigma_i^2 u_i\right). \tag{34}$$

Writing (34) in matrix form, we get

$$A V = \begin{bmatrix} A v_1 & A v_2 & \cdots & A v_r & A v_{r+1} & \cdots & A v_n \end{bmatrix}
= \underbrace{\begin{bmatrix} u_1 & u_2 & \cdots & u_r & u_{r+1} & \cdots & u_m \end{bmatrix}}_{U}
\underbrace{\begin{bmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_r & & \\ & & & 0 & \\ & & & & \ddots \end{bmatrix}}_{\Sigma_{m \times n}}, \tag{35}$$

Lecture 12 23/26
Singular Value Decomposition
which can be rearranged, using V^{-1} = V^T, to obtain the singular value decomposition as

$$A = U \Sigma V^{-1} = U \Sigma V^T. \tag{36}$$

Intuitive Picture
The SVD of an m × n matrix A represents the linear transformation of a vector x ∈ R^n into Ax ∈ R^m, i.e., for any vector x ∈ R^n and an orthonormal basis set v_j ∈ R^n, j = 1, 2, . . . , n, we can write

$$x = \sum_{j=1}^{n} c_j v_j,$$

where c_j ≜ ⟨x, v_j⟩ = x^T v_j because ⟨v_j, v_k⟩ = δ_{j,k}, and so, noting that x^T v_j = v_j^T x,

$$A x = A \sum_{j=1}^{n} \langle x, v_j \rangle v_j = \sum_{j=1}^{n} \left(x^T v_j\right) A v_j = \sum_{j=1}^{n} \left(x^T v_j\right) \sigma_j u_j = \sum_{j=1}^{r} \left(v_j^T x\right) \sigma_j u_j = \left(\sum_{j=1}^{r} u_j \sigma_j v_j^T\right) x \;\Rightarrow\; A = \sum_{j=1}^{r} u_j \sigma_j v_j^T = U \Sigma V^T.$$

Lecture 12 24/26
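A minimal sketch of the outer-product form A = Σ_{j=1}^r σ_j u_j v_j^T derived above (assuming NumPy; the example matrix and the tolerance 1e-12 are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))                    # numerical rank

# Sum of r rank-one pieces sigma_j * u_j * v_j^T.
A_sum = sum(s[j] * np.outer(U[:, j], Vt[j, :]) for j in range(r))
print("A == sum_j sigma_j u_j v_j^T:", bool(np.allclose(A, A_sum)))

# Keeping only the largest term gives the best rank-1 approximation of A in the 2-norm
# (Eckart-Young); the remaining error equals the next singular value.
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])
print("rank-1 approximation error:", np.linalg.norm(A - A1, 2), "  sigma_2:", s[1])
```

This outer-product view is what makes truncated SVD useful for low-rank approximation: dropping the terms with the smallest σ_j discards the least of A.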
Remarks on SVD

1. If A is positive definite, then x^T A x > 0, A = A^T, and A is square of size n × n. In this case,

$$A^T A v_j = \sigma_j^2 v_j = A^2 v_j \;\Rightarrow\; A v_j = \sigma_j v_j, \quad j = 1, 2, \ldots, n,$$

and

$$A A^T u_j = \sigma_j^2 u_j = A^2 u_j \;\Rightarrow\; A u_j = \sigma_j u_j, \quad j = 1, 2, \ldots, n.$$

Hence, U = V and A = UΣV^T = XDX^T, where

$$X = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}, \qquad D = \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_n \end{bmatrix},$$

and the σ_j are now the eigenvalues of A.

For any other symmetric matrix, A^T A v_j = σ_j² v_j = A² v_j ⇒ A v_j = ±σ_j v_j = σ_j u_j, j = 1, 2, . . . , n, which gives u_j = ±v_j (U and V agree column-wise up to sign) and Σ containing σ_j = |λ_j|, the absolute values of the eigenvalues of A, on its diagonal, even though some eigenvalues of A may be negative.

Lecture 12 25/26
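A small sketch of the last point in Remark 1 (assuming NumPy; the symmetric indefinite 2 × 2 matrix below is an arbitrary choice with eigenvalues 3 and −1):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])          # symmetric, eigenvalues 3 and -1 (indefinite)

lam, X = np.linalg.eigh(A)
U, s, Vt = np.linalg.svd(A)

print("eigenvalues:     ", np.sort(lam)[::-1])   # [ 3., -1.]
print("singular values: ", s)                    # [ 3.,  1.] = |eigenvalues|
# Column-wise, u_j = +/- v_j for a symmetric matrix.
print("U == +/- V columnwise:",
      all(np.allclose(U[:, j], Vt.T[:, j]) or np.allclose(U[:, j], -Vt.T[:, j])
          for j in range(2)))
```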
Remarks on SVD

2. The first r columns of U form an orthonormal basis for the column space of A, i.e., R(A).

3. The last m − r columns of U form an orthonormal basis for the left nullspace of A, i.e., N(A^T).
4. The first r columns of V form an orthonormal basis for the row space of A, i.e., R(A^T).
5. The last n − r columns of V form an orthonormal basis for the nullspace of A, i.e., N(A).

Prove the statements in remarks 2-5 as a homework exercise!

Lecture 12 26/26
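Before proving them, remarks 2-5 can also be checked numerically; a sketch assuming NumPy, with a rank-1 example (m = 2, n = 3, r = 1) chosen only for illustration:

```python
import numpy as np

# Rank-1 example: m = 2, n = 3, r = 1.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))
V = Vt.T

# Last n - r columns of V span the nullspace of A:  A v = 0.
print("A @ V[:, r:] ~ 0:      ", bool(np.allclose(A @ V[:, r:], 0)))
# Last m - r columns of U span the left nullspace:  A^T u = 0.
print("A^T @ U[:, r:] ~ 0:    ", bool(np.allclose(A.T @ U[:, r:], 0)))
# First r columns of U span the column space of A: projecting A onto them changes nothing.
proj = U[:, :r] @ U[:, :r].T
print("columns of A in span of U[:, :r]:", bool(np.allclose(proj @ A, A)))
```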
