
An Introduction to Applied
Linear Algebra in Signal Processing


Univ.-Prof. Dr.-Ing. Wolfgang Utschick

Technische Universität München

Summer 2020
© 2020 Univ.-Prof. Dr.-Ing. Wolfgang Utschick

Circulation of this document to other parties without the written consent of the author is forbidden.

Email: utschick@tum.de

Layout by LaTeX 2ε
A few starting remarks

This appendix contains a short introduction to finite dimensional vector spaces and linear operators, also
called matrices. It contains the most important basic tools for classical signal processing and its applications.
The content is not taken directly from a well-known textbook, but reflects the author’s view after more
than 20 years of teaching and research in this field.
Nevertheless, it has been strongly influenced by several textbooks, in particular the books of Prof. Gilbert
Strang, MIT:
- Linear Algebra and Its Applications, Academic Press (1976). Second Edition : Harcourt Brace Jovanovich
(1980). Third Edition : Brooks/Cole (1988). Fourth Edition : Brooks/Cole/Cengage (2006).
- Introduction to Linear Algebra, Wellesley-Cambridge Press (1993). Second Edition (1998). Third Edition
(2003). Fourth Edition (2009).

3
Part I

Vector Spaces

4
1. Vector Spaces

1.1 Definition

Definition. A Vector Space is a set S with structural properties, such that for every x, y, z ∈ S and
α, β ∈ C (or R) the following holds,

Commutativity : x + y = y + x (1.1)
Associativity : (x + y) + z = x + (y + z) and (αβ)x = α(βx) (1.2)
Distributivity : α(x + y) = αx + αy and (α + β)x = αx + βx (1.3)
Additive Identity : ∃0 : x + 0 = 0 + x = x, for all x ∈ S (1.4)
Additive Inverse : for every x ∈ S ∃ − x ∈ S : x + (−x) = (−x) + x = 0 (1.5)
Multiplicative Identity : for every x ∈ S : 1 · x = x. (1.6)

5
In this tutorial, we consider the Complex Vector Space S ≡ CN of dimension N ∈ N, with α ∈ C
and vectors x ∈ CN with the following definitions,
Definition. Elementwise

x ∈ CN with [x]i = xi = Re {xi } + j Im {xi } ∈ C, (1.7)

and

\[
\alpha x + \beta y
= \alpha \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
+ \beta \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}
= \begin{bmatrix} \alpha x_1 + \beta y_1 \\ \alpha x_2 + \beta y_2 \\ \vdots \\ \alpha x_N + \beta y_N \end{bmatrix}.
\]

Definition. Vector, Transpose Vector, Hermitian Vector

\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}, \qquad
x^T = \begin{bmatrix} x_1 & x_2 & \cdots & x_N \end{bmatrix}, \qquad
x^H = (x^*)^T = \begin{bmatrix} x_1^* & x_2^* & \cdots & x_N^* \end{bmatrix},
\]

with x∗i = (Re {xi } + j Im {xi })∗ = Re {xi } − j Im {xi }.
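As a quick numerical illustration of these elementwise definitions, the following minimal NumPy sketch (values and variable names are illustrative only) builds a complex vector and compares the plain transpose xT with the Hermitian transpose xH:

```python
import numpy as np

# an illustrative complex vector x in C^3
x = np.array([1 + 2j, 3 - 1j, -2 + 0.5j])

x_T = x.reshape(1, -1)          # row vector x^T (no conjugation)
x_H = x.conj().reshape(1, -1)   # Hermitian vector x^H = (x^*)^T

print(x_T)
print(x_H)
print(np.real(x), np.imag(x))   # Re{x_i} and Im{x_i}
```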

6
1.2 Properties

Definition. A Subspace U ⊂ CN is a nonempty subset of CN with the following properties,

for all x, y ∈ U : x + y ∈ U (1.8)


for all x ∈ U, α ∈ C : αx ∈ U. (1.9)

The Linear Combination of vectors x1 , . . . , xL ∈ CN is given by the weighted sum

\[
\sum_{i=1}^{L} \alpha_i x_i = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_L x_L
\tag{1.10}
\]

with α1 , α2 , . . . , αL ∈ C.
The vectors x1 , . . . , xL ∈ CN are Linearly Independent, if

\[
\sum_{i=1}^{L} \alpha_i x_i = 0 \;\Leftrightarrow\; \alpha_1 = \alpha_2 = \cdots = \alpha_L = 0.
\tag{1.11}
\]

7
Definition. The Span of vectors x1 , . . . , xL ∈ CN is the set of all Linear Combinations of
x1 , . . . , xL over α1 , α2 , . . . , αL ∈ C,

\[
\operatorname{span}\left[ x_1 \; x_2 \; \cdots \; x_L \right]
= \left\{ \sum_{i=1}^{L} \alpha_i x_i \;\middle|\; \alpha_1, \alpha_2, \ldots, \alpha_L \in \mathbb{C} \right\}.
\tag{1.12}
\]

[Figure: vectors x1 , x′1 , x2 , x′2 illustrating a span.]

Definition. The Dimension of a Vector Space (Subspace) U is the maximal number of linearly
independent vectors that can be found in U.

8
1.3 Example: Spatial Sampling Using a Uniform Linear Array (ULA)

Spatial Sampling of an impinging planar wavefront with Direction Of Arrival (DoA) θ by
means of a Uniform Linear Antenna Array (ULA) with M antenna elements.

[Figure: ULA with antenna elements x0 (t), x1 (t), x2 (t), . . . spaced by d, and a wavefront impinging from direction θi relative to the array coordinates rx , ry .]

In the following, we assume W > 1 impinging planar wavefronts with θ1 , . . . , θW .

9
The Received Signal Vector at the ULA at time instant t ∈ R is equal to the superposition signal

\[
x(t) = \sum_{i=1}^{W} \xi_i a_i s_i(t) + \eta(t) \in \mathbb{C}^M,
\tag{1.13}
\]

with the signal at the mth antenna element given by

\[
x_m(t) = \sum_{i=1}^{W} \xi_i \, \mathrm{e}^{-\mathrm{j} 2\pi \sin(\theta_i) \frac{md}{\lambda_i}} s_i(t) + \eta_m(t),
\qquad m = 0, \ldots, M-1.
\tag{1.14}
\]

Here, d and λi denote the distance between two adjacent antenna elements and the wavelength of the
ith assumed Narrowband Signal, respectively. The received signal vector is corrupted by a noise vector η(t).
The ξi ∈ R represents the attenuation which the transmitted signal si (t) ∈ R experiences over the
transmission path. W.l.o.g. we assume a Uniform Linear Array (ULA) with M antenna elements, i.e.

\[
a_i = \begin{bmatrix} \alpha_i^0 \\ \alpha_i^1 \\ \vdots \\ \alpha_i^{M-1} \end{bmatrix},
\quad \text{with} \quad
\alpha_i^m = \mathrm{e}^{\mathrm{j}\, k_i^T r_m}, \quad
k_i = \frac{2\pi}{\lambda_i} \begin{bmatrix} -\sin\theta_i \\ -\cos\theta_i \end{bmatrix}
\quad \text{and} \quad
r_m = \begin{bmatrix} md \\ 0 \end{bmatrix}.
\tag{1.15}
\]

10
From a different perspective, the Received Signal Vector (neglecting noise) can be written as

\[
\begin{bmatrix} x_0(t) \\ x_1(t) \\ \vdots \\ x_{M-1}(t) \end{bmatrix}
=
\begin{bmatrix}
\xi_1 \mathrm{e}^{\mathrm{j} k_1^T r_0} & \xi_2 \mathrm{e}^{\mathrm{j} k_2^T r_0} & \cdots & \xi_W \mathrm{e}^{\mathrm{j} k_W^T r_0} \\
\xi_1 \mathrm{e}^{\mathrm{j} k_1^T r_1} & \xi_2 \mathrm{e}^{\mathrm{j} k_2^T r_1} & \cdots & \xi_W \mathrm{e}^{\mathrm{j} k_W^T r_1} \\
\vdots & \vdots & & \vdots \\
\xi_1 \mathrm{e}^{\mathrm{j} k_1^T r_{M-1}} & \xi_2 \mathrm{e}^{\mathrm{j} k_2^T r_{M-1}} & \cdots & \xi_W \mathrm{e}^{\mathrm{j} k_W^T r_{M-1}}
\end{bmatrix}
\begin{bmatrix} s_1(t) \\ s_2(t) \\ \vdots \\ s_W(t) \end{bmatrix},
\]

or, equivalently, x(t) = Hs(t), with

x(t) = [x0 (t) x1 (t) · · · xM −1 (t)]T ,
H = [h1 h2 · · · hW ],
hi = ξi [ej kTi r0 ej kTi r1 · · · ej kTi rM −1 ]T , i = 1, . . . , W,
s(t) = [s1 (t) s2 (t) · · · sW (t)]T .

Note. The Received Signal Vector x(t) (neglecting noise) is an element of the Subspace
H = span [h1 h2 · · · hW ] = range [H] at any time t ∈ R.
The Subspace H is called the Array Manifold.
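To make the subspace statement concrete, here is a small NumPy sketch of the model (1.13)–(1.15); the number of antennas, the DoAs, the unit attenuations ξi = 1, and the common wavelength are assumptions chosen only for illustration:

```python
import numpy as np

def steering_vector(theta, M, d, lam):
    """ULA steering vector with [a]_m = exp(-j*2*pi*sin(theta)*m*d/lam)."""
    m = np.arange(M)
    return np.exp(-1j * 2 * np.pi * np.sin(theta) * m * d / lam)

M, d, lam = 8, 0.5, 1.0                     # 8 antennas, half-wavelength spacing (assumed)
thetas = np.deg2rad([-20.0, 10.0, 45.0])    # W = 3 illustrative directions of arrival
H = np.column_stack([steering_vector(t, M, d, lam) for t in thetas])

s = np.array([1.0 + 0.3j, -0.5j, 0.8])      # source signals s(t) at one time instant
x = H @ s                                   # noise-free received vector x(t) = H s(t)

# x(t) lies in the array manifold: projecting onto range[H] reproduces it
P_H = H @ np.linalg.pinv(H)
print(np.allclose(P_H @ x, x))              # True
```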

11
2. Norm and Inner Product

2.1 Norm

Definition. A Norm on CN is a positive real-valued function ‖•‖ : CN → R+ with the following
properties for every x, y ∈ CN and α ∈ C,

Positive Definiteness : ‖x‖ ≥ 0, with ‖x‖ = 0 ⇔ x = 0 (2.1)
Positive Scalability : ‖αx‖ = |α| ‖x‖ (2.2)
Triangle Inequality : ‖x + y‖ ≤ ‖x‖ + ‖y‖ . (2.3)

The Standard Norm of any x ∈ CN is given by
\(\|x\| = \big(\sum_{i=1}^{N} |x_i|^2\big)^{1/2} = \big(\sum_{i=1}^{N} x_i^* x_i\big)^{1/2} = (x^H x)^{1/2}\).
Note. The norm of differences of vectors provides a metric on CN .

12
2.2 Inner Product

Definition. An Inner Product on CN is a function ⟨•, •⟩ : CN × CN → C with the following
properties for every x, y, z ∈ CN and α ∈ C,

Distributivity : ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩ (2.4)
Linearity in the first argument : ⟨αx, y⟩ = α⟨x, y⟩ (2.5)
Hermitian Symmetry : ⟨x, y⟩∗ = ⟨y, x⟩ (2.6)
Positive Definiteness : ⟨x, x⟩ ≥ 0, with ⟨x, x⟩ = 0 ⇔ x = 0. (2.7)

The Standard Inner Product of vectors x, y, z ∈ CN is given by ⟨x, y⟩ = Σ_{i=1}^{N} xi yi∗ = yH x, with

zH (x + y) = zH x + zH y
yH (αx) = α yH x
(yH x)∗ = xH y
xH x ≥ 0, with xH x = 0 ⇔ x = 0.
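A short NumPy check of the standard inner product and the induced standard norm (the vectors are illustrative):

```python
import numpy as np

x = np.array([1 + 1j, 2 - 1j, 0.5j])
y = np.array([2 - 1j, 1j, 1.0])

inner = y.conj() @ x                     # <x, y> = y^H x
norm_x = np.sqrt((x.conj() @ x).real)    # ||x|| = (x^H x)^{1/2}

print(inner, norm_x)
print(np.isclose(norm_x, np.linalg.norm(x)))   # matches NumPy's built-in norm
```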

13
Definition. Vectors x, y ∈ CN are Orthogonal, i.e. x ⊥ y, if and only if ⟨x, y⟩ = 0.
Vectors x, y ∈ CN are Orthogonal with respect to the Standard Inner Product on CN , if
yH x = Σ_{i=1}^{N} xi yi∗ = 0, i.e.,

Σ_{i=1}^{N} ( Re {xi } Re {yi } + Im {xi } Im {yi } ) = 0
Σ_{i=1}^{N} ( Re {xi } Im {yi } − Im {xi } Re {yi } ) = 0.

Definition. A set of vectors x1 , x2 , . . . , xL is an Orthogonal Set, if xi ⊥ xj for every i, j = 1, . . . , L
with i ≠ j.

Definition. An Orthogonal Set of vectors x1 , x2 , . . . , xL is an Orthonormal Set, if ‖xi ‖ = 1
for every i = 1, . . . , L.

Definition. A vector x is Orthogonal to the set of vectors x1 , x2 , . . . , xL , if x ⊥ xi for every i =
1, . . . , L, i.e., x ⊥ span [x1 x2 · · · xL ].

Definition. Two subspaces X and Y of CN are Orthogonal Subspaces, if x ⊥ y for every x ∈ X
and y ∈ Y.

Definition. The Orthogonal Complement X⊥ of a subspace X of CN consists of all elements
x ∈ CN which are orthogonal to every vector in X.
The Orthogonal Complement X⊥ of a subspace X of CN is again a subspace of CN .

14
2.3 Norms Induced by Inner Products

Any inner product induces a norm; e.g., the Standard Inner Product on CN , viz. ⟨x, x⟩ = xH x,
induces the Standard Norm on CN , i.e. ‖x‖ = (xH x)^{1/2} = √⟨x, x⟩.

The following laws hold:

Triangle Law.
‖x + y‖ ≤ ‖x‖ + ‖y‖.
Pythagoras Law.
x ⊥ y ⇔ ‖x + y‖2 = ‖x‖2 + ‖y‖2 .
Parallelogram Law.
‖x + y‖2 + ‖x − y‖2 = 2 ( ‖x‖2 + ‖y‖2 ).
Cauchy–Schwarz Inequality.
|yH x| ≤ ‖x‖ ‖y‖.

15
3. Inner Product Spaces

3.1 Hilbert Space CN

Definition. A Hilbert Space is a Normed Vector Space S with an Inner Product fulfilling
the Completeness Property, which means that every Cauchy sequence of vectors in S converges to a
vector in S.
Any Finite-Dimensional Normed Vector Space with an Inner Product, including the Vector Space CN
with the Standard Inner Product

⟨x, y⟩ = Σ_{i=1}^{N} xi yi∗ , (3.1)

(trivially) fulfills the Completeness Property. Thus the Vector Space CN is a Hilbert Space.

Note. In Finite-Dimensional Vector Spaces the concept of the Hilbert Space offers no added
value compared to Inner Product Spaces.

16
Counterexample.

A counterexample is the Infinite-Dimensional normed vector space L2 (R) of Square-Integrable
Functions with

‖f‖2 < ∞, (3.2)

where the norm is induced by the inner product of two functions f (t) and g(t) according to

⟨f, g⟩ = ∫_{−∞}^{∞} f (t) g∗ (t) dt. (3.3)

The proper handling of L2 (R) requires the use of Lebesgue Integration, which is a different
concept of integration in contrast to Riemann Integration.
The L2 (R) is a Hilbert Space for which the Completeness Property is not trivially shown.
Furthermore, the Infinite-Dimensional vector space requires a careful treatment of properties which are
taken for granted in the Finite-Dimensional case, e.g., when considering different types of basis functions,
which are all equivalent in finite-dimensional vector spaces.

17
3.2 Basis Functions
Definition. A set of vectors u1 , u2 , . . . , uB ∈ CN is a Basis of U, if

for every x ∈ U : x ∈ span [u1 u2 · · · uB ] , i.e., (3.4)

x = Σ_{i=1}^{B} αi ui and the respective set of weights α1 , α2 , . . . , αB is unique. (3.5)

[Figure: basis vectors u1 , u2 of a subspace U ⊂ CN , together with vectors x and x′ .]
18
Definition. A set of vectors u1 , u2 , . . . , uB ∈ CN is a Riesz Basis for U, if

(1) it is a basis for U and (3.6)

(2) there exist Stability Values 0 < λmin ≤ λmax < ∞ (3.7)

such that for every x ∈ U with x = Σ_{i=1}^{B} αi ui the following property is fulfilled: (3.8)

λmin ‖x‖2 ≤ Σ_{i=1}^{B} |αi |2 ≤ λmax ‖x‖2 . (3.9)

In contrast to infinite-dimensional vector spaces, any basis of CN is also a Riesz Basis with proper
Stability Values.
Nevertheless, even for CN it turns out to be numerically advantageous if the Stability Values λmin and
λmax are of similar size.

19
Definition. A set of vectors u1 , u2 , . . . , uB ∈ CN is an Orthonormal Basis (ONB) for U when

(1) it is a basis for U and (3.10)

(2) it is an Orthonormal Set, i.e. (3.11)

uHj ui = 0 if i ≠ j and ‖ui ‖ = 1, for all i, j = 1, . . . , B. (3.12)

An Orthonormal Basis (ONB) is always a Riesz Basis with Stability Values λmin = λmax = 1,
i.e., for every x ∈ U with x = Σ_{i=1}^{B} αi ui

‖x‖2 ≤ Σ_{i=1}^{B} |αi |2 ≤ ‖x‖2 ⇔ Σ_{i=1}^{B} |αi |2 = ‖x‖2 . (3.13)

20
3.3 Orthonormal Basis Expansion

Given an ONB u1 , u2 , . . . , uB ∈ CN for U, then for every x ∈ U with x = Σ_{i=1}^{B} αi ui and for every
y ∈ U with y = Σ_{j=1}^{B} βj uj , we obtain

Analysis : αi = uHi x (3.14)

Synthesis : x = Σ_{i=1}^{B} ui uHi x = ( Σ_{i=1}^{B} ui uHi ) x, where Σ_{i=1}^{B} ui uHi is the Projector P U (3.15)

Parseval Identity : ‖x‖2 = Σ_{i=1}^{B} |αi |2 , (3.16)

and

⟨x, y⟩ = yH x = ( Σ_{j=1}^{B} βj uj )H ( Σ_{i=1}^{B} αi ui ) = Σ_{j=1}^{B} Σ_{i=1}^{B} βj∗ αi uHj ui = Σ_{i=1}^{B} αi βi∗ . (3.17)
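The analysis/synthesis steps and the Parseval Identity can be verified numerically; a minimal sketch, assuming the ONB is obtained from an economy-size QR factorization of an illustrative random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
N, B = 6, 3

A = rng.normal(size=(N, B)) + 1j * rng.normal(size=(N, B))
U, _ = np.linalg.qr(A)                   # columns u_1, ..., u_B form an ONB of U = range[A]

x = U @ (rng.normal(size=B) + 1j * rng.normal(size=B))   # some x in the subspace U

alpha = U.conj().T @ x                   # Analysis:  alpha_i = u_i^H x
x_rec = U @ alpha                        # Synthesis: x = sum_i alpha_i u_i
print(np.allclose(x_rec, x))
print(np.isclose(np.sum(np.abs(alpha)**2), np.linalg.norm(x)**2))   # Parseval Identity
```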

21
3.4 Orthogonalization
Given a non-orthogonal basis h1 , h2 , . . . , hB ∈ CN , an ONB u1 , u2 , . . . , uB ∈ CN can be found by the
Gram-Schmidt Orthogonalization method, i.e.,

1st Step : u1 = h1 / ‖h1 ‖ (3.18)

ith Step : ui = ( hi − Σ_{j=1}^{i−1} uj uHj hi ) / ‖ hi − Σ_{j=1}^{i−1} uj uHj hi ‖ , (3.19)

[Figure: the second step u2 ∝ h2 − u1 uH1 h2 for two vectors h1 , h2 .]

with span [h1 h2 · · · hi ] = span [u1 u2 · · · ui ] for all i = 1, . . . , B.
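A direct, unoptimized implementation of the steps (3.18)–(3.19) could look as follows; as the note on the QR-Decomposition below points out, classical Gram-Schmidt is meant here for illustration rather than for numerically robust computations:

```python
import numpy as np

def gram_schmidt(H):
    """Orthonormalize the (linearly independent) columns of H via classical Gram-Schmidt."""
    U = np.zeros(H.shape, dtype=complex)
    for i in range(H.shape[1]):
        v = H[:, i].astype(complex)
        for j in range(i):
            v = v - U[:, j] * (U[:, j].conj() @ H[:, i])   # subtract u_j u_j^H h_i
        U[:, i] = v / np.linalg.norm(v)
    return U

rng = np.random.default_rng(2)
H = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))
U = gram_schmidt(H)
print(np.allclose(U.conj().T @ U, np.eye(3)))   # the columns form an orthonormal set
```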

22
QR-Decomposition.
Consequently,

\[
[h_1, h_2, \ldots, h_B] = [u_1, u_2, \ldots, u_B]
\begin{bmatrix}
r_{1,1} & r_{1,2} & \cdots & r_{1,B} \\
0 & r_{2,2} & \cdots & r_{2,B} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & r_{B,B}
\end{bmatrix},
\]

i.e., H = U R, with rj,i = uHj hi according to the Analysis step (3.14).

Note. By renaming U as Q, the matrix decomposition H = QR equals the so-called QR-
Decomposition. However, Gram-Schmidt Orthogonalization is not appropriate for numer-
ically computing the QR-Decomposition and would not be applied for this task.
Obviously, the QR-Decomposition of the adjoint matrix H H = Q′ R′ provides the LQ-
Decomposition H = LQ with LH = R′ and QH = Q′ .
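In practice the QR-Decomposition is computed with Householder reflections, e.g., via numpy.linalg.qr; a small sketch with an illustrative random matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
H = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))

Q, R = np.linalg.qr(H)                    # economy-size QR via Householder reflections
print(np.allclose(Q @ R, H))              # H = Q R
print(np.allclose(Q.conj().T @ Q, np.eye(3)))
print(np.allclose(np.triu(R), R))         # R is upper triangular, R = Q^H H
```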

23
Part II

Linear Operators

24
4. Linear Operators

4.1 Definition

Definition. A Linear Operator is a function A : S → S′ between Vector Spaces S and S′ , such
that for all x, y ∈ S and α ∈ C (or R) the following holds,

Additivity : A(x + y) = Ax + Ay (4.1)


Scalability : A(αx) = αAx. (4.2)

In this tutorial, we only consider Linear Operators between finite-dimensional Complex Vector
Spaces CN and CM with α ∈ C. Consequently, a Linear Operator A is represented by a Matrix
A : CN → CM .

25
Derivation.
Assume {un }N_{n=1} and {u′m }M_{m=1} are Orthonormal Bases of an abstract N-dimensional vector space
S 1 and an abstract M-dimensional vector space S′ , respectively, and A : S → S′ is a Linear Operator
with y = Ax for all x ∈ S and y ∈ S′ .

Then, with x = [x1 , . . . , xN ]T and y = [y1 , . . . , yM ]T representing the coordinates of x and y with respect
to the bases {un }N_{n=1} and {u′m }M_{m=1} , respectively, we obtain the Matrix A with y = Ax as follows:

ym = ⟨y, u′m ⟩ = ⟨Ax, u′m ⟩ = ⟨ A Σ_{n=1}^{N} xn un , u′m ⟩ = Σ_{n=1}^{N} ⟨Aun , u′m ⟩ xn . (4.3)

Consequently, the elements of matrix A are obtained by

[A]m,n ≜ ⟨Aun , u′m ⟩ and [AT ]n,m ≜ ⟨Aun , u′m ⟩. (4.4)

1 For instance, the vector space of bandlimited periodic signals.

26
A : S → S′ , u1 ↦ Au1 = A1,1 u′1 + A2,1 u′2 ,
A1,1 ≜ ⟨Au1 , u′1 ⟩, A2,1 ≜ ⟨Au1 , u′2 ⟩.

[Figure: the image Au1 ∈ S′ of the basis vector u1 ∈ S, decomposed into its components A1,1 u′1 and A2,1 u′2 .]
27
4.2 Elementwise Perspective on Matrices

A ∈ CM ×N with [A]m,n = am,n = Re {am,n } + j Im {am,n } ∈ C, (4.5)

and

\[
A x =
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,N} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,N} \\
\vdots & \vdots & & \vdots \\
a_{M,1} & a_{M,2} & \cdots & a_{M,N}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
=
\begin{bmatrix}
a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,N} x_N \\
a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,N} x_N \\
\vdots \\
a_{M,1} x_1 + a_{M,2} x_2 + \cdots + a_{M,N} x_N
\end{bmatrix}
=
\begin{bmatrix}
\sum_{n=1}^{N} a_{1,n} x_n \\
\sum_{n=1}^{N} a_{2,n} x_n \\
\vdots \\
\sum_{n=1}^{N} a_{M,n} x_n
\end{bmatrix}.
\]

28
Definition. Matrix, Transpose Matrix, Adjoint Matrix

\[
A = \begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,N} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,N} \\
\vdots & \vdots & & \vdots \\
a_{M,1} & a_{M,2} & \cdots & a_{M,N}
\end{bmatrix}, \qquad
A^T = \begin{bmatrix}
a_{1,1} & a_{2,1} & \cdots & a_{M,1} \\
a_{1,2} & a_{2,2} & \cdots & a_{M,2} \\
\vdots & \vdots & & \vdots \\
a_{1,N} & a_{2,N} & \cdots & a_{M,N}
\end{bmatrix}, \qquad
A^H = \begin{bmatrix}
a_{1,1}^* & a_{2,1}^* & \cdots & a_{M,1}^* \\
a_{1,2}^* & a_{2,2}^* & \cdots & a_{M,2}^* \\
\vdots & \vdots & & \vdots \\
a_{1,N}^* & a_{2,N}^* & \cdots & a_{M,N}^*
\end{bmatrix}.
\]

29
The Adjoint Matrix AH ∈ CN ×M of a matrix A ∈ CM ×N has the constituting property

⟨Ax, y⟩ = ⟨x, AH y⟩ ⇔ yH Ax = (AH y)H x, ∀x ∈ CN , y ∈ CM . (4.6)

Matrix Columns, Matrix Rows

\[
A = [a_1 \; a_2 \; \cdots \; a_N] \quad \text{with columns} \quad a_n = \begin{bmatrix} a_{1,n} \\ a_{2,n} \\ \vdots \\ a_{M,n} \end{bmatrix},
\]
\[
A^T = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_N^T \end{bmatrix} \quad \text{with rows} \quad a_n^T = [a_{1,n} \; a_{2,n} \; \cdots \; a_{M,n}],
\]
\[
A^H = \begin{bmatrix} a_1^H \\ a_2^H \\ \vdots \\ a_N^H \end{bmatrix} \quad \text{with rows} \quad a_n^H = [a_{1,n}^* \; a_{2,n}^* \; \cdots \; a_{M,n}^*],
\]

with (AT )T = A and (AH )H = A.

30
Definition. Identity Matrix, Zero Matrix, Inverse Matrix, Selection Vector

\[
I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \operatorname{diag}[1 \; 1 \; \cdots \; 1],
\qquad
0 = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix},
\]

A−1 A = I and A + (−A) = 0,

ei = [0 · · · 0 1 0 · · · 0]T ∈ CN with i − 1 zeros before and N − i zeros after the single one, and

\[
A e_i = \begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,N} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,N} \\
\vdots & \vdots & & \vdots \\
a_{M,1} & a_{M,2} & \cdots & a_{M,N}
\end{bmatrix} e_i
= \begin{bmatrix} a_{1,i} \\ a_{2,i} \\ \vdots \\ a_{M,i} \end{bmatrix} = a_i ,
\]

i.e., the selection vector ei picks the ith column of A.

31
4.3 Fundamental Subspace Perspective on Matrices

Any Linear Operator A : CN → CM is related to Four Fundamental Subspaces, viz.

Range Space of A : range [A] = { Ax | x ∈ CN } (4.7)
Null Space of A : null [A] = { x ∈ CN | Ax = 0 } (4.8)
Orthogonal Complement of range [A] : range [A]⊥ = null [AH ] (4.9)
Orthogonal Complement of null [A] : null [A]⊥ = range [AH ] . (4.10)

[Figure: the null space null [A] ⊂ CN is mapped to 0 ∈ CM , while CN is mapped onto range [A] ⊂ CM .]
32
4.4 Eigenvectors and Eigenvalues

Definition. An Eigenvector of a matrix A ∈ CM ×M is a nonzero vector v ∈ CM such that

Av = λv for some λ ∈ C. (4.11)

The constant λ is the Eigenvalue to the Eigenvector v. The v and its corresponding λ form a
so-called Eigenpair of the matrix A.
For finite-dimensional matrices the Set of Eigenvalues is discrete; the maximum number of different
eigenvalues is M.

The Eigenvalues of Self-Adjoint Matrices AH = A are real-valued and the Eigenvectors
form an ONB.
Proof.

(1) λ vH v = vH (λv) = vH Av = vH AH v = (Av)H v = (λv)H v = λ∗ vH v ⇒ λ = λ∗

(2) λi vHj vi = vHj (λi vi ) = vHj Avi = vHj AH vi = (Avj )H vi = (λj vj )H vi = λ∗j vHj vi ⇒ vHj vi = 0,

since λi and λj are real-valued (1st part) and different for i ≠ j by assumption.
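A minimal NumPy check of this statement for a randomly generated self-adjoint matrix (the matrix is illustrative; numpy.linalg.eigh is tailored to self-adjoint inputs):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = B + B.conj().T                               # A^H = A, a self-adjoint matrix

lam, V = np.linalg.eigh(A)
print(lam.dtype)                                 # real-valued eigenvalues
print(np.allclose(V.conj().T @ V, np.eye(4)))    # eigenvectors form an ONB
print(np.allclose(A @ V, V @ np.diag(lam)))      # A v_i = lambda_i v_i
```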

33
5. Orthogonal Projectors

5.1 Best Approximation and Orthogonal Projection


The Projection Operator P : CM → CM is related to the Best Approximation problem, i.e.,
find x̂ ∈ U ⊂ CM that is closest to a given x ∈ CM . As a result, we obtain x̂ = P x with x − P x ⊥ U.

[Figure: the orthogonal projection x̂ = P x of x ∈ CM onto the subspace U, with the error x − x̂ lying in U⊥ .]
34
According to the Closest Point Theorem1 we obtain

Existence : ∃x̂ ∈ U such that ‖x − x̂‖ ≤ ‖x − s‖ for every s ∈ U (5.1)
Orthogonality : x − x̂ ⊥ U (5.2)
Linearity : x̂ = P x and P only depends on U (5.3)
Idempotency : P 2 = P (5.4)
Self-Adjointness : P H = P . (5.5)

Note.
Projection Operators are always Idempotent.
Orthogonal Projectors must be Idempotent and Self-Adjoint.
Oblique Projectors (non-orthogonal projectors) are only idempotent, i.e., P 2 = P , but Self-
Adjointness does not hold.
Oblique Projectors are not considered in this tutorial.

1 The Closest Point Theorem applies to the category of Convex Sets, to which Linear Subspaces belong.

35
Orthogonal Projection on Subspaces.
Given a pair of matrices A : CN → CM and B : CM → CN , where A is a Left-Inverse of B, i.e.,
AB = I, then

BA is a Projector onto range [B] ⊂ CN , (5.6)

since (BA)2 = BABA = B(AB)A = BA.


If BA is also self-adjoint, then

BA is an Orthogonal Projector onto range [B] ⊂ CN , (5.7)

since (BA)2 = BABA = B(AB)A = BA and (BA)H = BA.

Example. Assume the column vectors of U form an ONB of U; then P = U U H is the Orthogonal
Projector onto U. For a proof, substitute

A = UH and B = U . (5.8)
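A short numerical sketch of this example: an ONB U of a two-dimensional subspace of C5 is generated (illustratively via QR), and P = U U H is checked to be idempotent and self-adjoint:

```python
import numpy as np

rng = np.random.default_rng(5)
U, _ = np.linalg.qr(rng.normal(size=(5, 2)) + 1j * rng.normal(size=(5, 2)))

P = U @ U.conj().T                       # orthogonal projector onto span of the columns of U
print(np.allclose(P @ P, P))             # idempotent
print(np.allclose(P.conj().T, P))        # self-adjoint

x = rng.normal(size=5) + 1j * rng.normal(size=5)
x_hat = P @ x
print(np.allclose(U.conj().T @ (x - x_hat), 0))   # x - Px is orthogonal to the subspace
```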

36
Orthogonal Projection via Pseudo-Inverse.
If (AH A)−1 exists, then A+left = (AH A)−1 AH is a Left-Inverse of A, i.e.,

(1) A A+left = A (AH A)−1 AH is the Orthogonal Projector onto range [A] (5.9)

with the so-called Pseudo-Inverse A+left .

Otherwise, if (AAH )−1 exists, then A+right = AH (AAH )−1 is a Right-Inverse of A, i.e.,

(2) A+right A = AH (AAH )−1 A is the Orthogonal Projector onto range [AH ] (5.10)

with the Pseudo-Inverse A+right .
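For a tall matrix with full column rank, the left pseudo-inverse coincides with the Moore-Penrose pseudo-inverse provided by numpy.linalg.pinv; a sketch with an illustrative random matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(6, 3)) + 1j * rng.normal(size=(6, 3))   # tall, full column rank (a.s.)

A_left = np.linalg.inv(A.conj().T @ A) @ A.conj().T          # A+left = (A^H A)^{-1} A^H
print(np.allclose(A_left @ A, np.eye(3)))                    # left inverse
print(np.allclose(A_left, np.linalg.pinv(A)))                # equals the Moore-Penrose pinv

P = A @ A_left                                               # orthogonal projector onto range[A]
print(np.allclose(P @ P, P), np.allclose(P.conj().T, P))
```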

37
5.2 Example: Least Squares Regression

In Least Squares Regression (LS) the Linear Regressor, i.e., the inference of t, is based on
the minimization of the sum of squared errors between M observations yi and the outcomes ŷi of the
Linear Model2 ŷ = xT t with x and t ∈ CN .
The related Optimization Problem is equal to

\[
\min_{t \in \mathbb{R}^N} \sum_{i=1}^{M} (y_i - x_i^T t)^2
\;\Leftrightarrow\;
\min_{t \in \mathbb{R}^N} \left\| \begin{bmatrix} y_1 - x_1^T t \\ \vdots \\ y_M - x_M^T t \end{bmatrix} \right\|_2^2
\;\Leftrightarrow\;
\min_{t \in \mathbb{R}^N} \| y - X t \|_2^2 .
\tag{5.11}
\]

From a Subspace Perspective, we search for a vector ŷ = XtLS ∈ range [X] ⊂ CM , which is the
Best Approximation of y ∈ CM , i.e., min_{ŷ∈range[X]} ‖y − ŷ‖2 with

y − ŷ ⊥ range [X] ⇔ y − ŷ ∈ null [X H ] ⇔ X H (y − ŷ) = 0. (5.12)

The resulting vector solves X H y − X H X tLS = 0 and thus (under favorable conditions)

tLS = (X H X)−1 X H y. (5.13)

2 The affine case y = xT t + t0 can be treated similarly by introducing y = x′T t′ , with t′ = [tT t0 ]T and x′ = [xT 1]T , and therefore N ′ = N + 1.

38
[Figure: least squares regression of the observations yi (◦) at positions x1 , . . . , x4 together with the fitted values ŷi (•) on the regression line.]
39
Alternatively, from an Orthogonal Projection Perspective, the Best Approximation ŷ is
found by the Orthogonal Projection onto range [X].
The appropriate Left-Inverse of X ∈ CM ×N in order to determine the Orthogonal Projector
onto range [X] is given by the Pseudo-Inverse X + = (X H X)−1 X H , i.e.,

ŷ = X X + y = X (X H X)−1 X H y. (5.14)

Taking into account the estimation model ŷ = xT t and ŷ = Xt, we consequently again obtain

tLS = (X H X)−1 X H y. (5.15)

Definition. In order to obtain a feasible solution, Favorable Conditions are required. To this end,
the existence of the Inverse Matrix of X H X is required, i.e., M ≥ N linearly independent measurement
vectors xi must be available.
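A compact NumPy sketch of (5.13)/(5.15), assuming a real-valued tall data matrix X with full column rank and an illustrative noisy linear model; NumPy's least squares solver is the numerically preferable route compared to forming the normal equations explicitly:

```python
import numpy as np

rng = np.random.default_rng(7)
M, N = 50, 3
X = rng.normal(size=(M, N))                      # M >= N, full column rank (a.s.)
t_true = np.array([1.0, -2.0, 0.5])
y = X @ t_true + 0.1 * rng.normal(size=M)        # noisy observations of the linear model

t_normal = np.linalg.solve(X.T @ X, X.T @ y)     # (X^H X)^{-1} X^H y, cf. (5.13)
t_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)  # NumPy's least squares solver
print(np.allclose(t_normal, t_lstsq))

y_hat = X @ t_lstsq                              # orthogonal projection of y onto range[X]
print(np.allclose(X.T @ (y - y_hat), 0, atol=1e-10))   # residual is orthogonal to range[X]
```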

40
5.3 Example: Minimum Norm Solution of a Linear System of Equations

In contrast to Least Squares Problems, where X ∈ CM ×N is typically a Tall Matrix (M > N)
and y = Xt is Overdetermined, we now consider the Underdetermined case (Wide Matrices,
M < N).
Underdetermined cases of y = Xt (Wide Matrices) typically suffer from the Non-Uniqueness
of their solutions, since adding an arbitrary element of null [X] to a known solution t∗ still results in
a valid solution of y = Xt.
An appropriate solution to this issue is provided by a regularization of the problem, e.g., by introducing
a penalty on the norm of the solution, which here leads to the Minimum Norm Solution:

min_{t∈CN} ‖t‖2 subject to y − Xt = 0 (5.16)

⇔ min_{t∈null[X]⊥} ‖y − Xt‖2 , (5.17)

where the Minimum Norm Condition has been taken into account by excluding any portion of the
solution t which lies in the Nullspace of matrix X.

41
 
Since we are obviously searching for topt ∈ null [X]⊥ and null [X]⊥ ≡ range [X H ], the desired solution
can be parameterized by means of

topt = X H z (5.18)

with z ∈ CM .
Under favorable conditions this leads to

y = X X H z ⇔ z = (X X H )−1 y, (5.19)

and thus to

topt = X H z = X H (X X H )−1 y = X + y. (5.20)

The solution can be interpreted as the orthogonal projection of an arbitrary solution t onto topt via the
orthogonal projection

topt = X + X t = X H (X X H )−1 X t, (5.21)

with the alternative definition of the Pseudo-Inverse for Wide Matrices.

The Pseudo-Inverse X + = X H (X X H )−1 is the Right-Inverse of X and thus X + X is a Pro-
jector.
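A minimal numerical sketch of the minimum norm solution (5.20) for an illustrative wide matrix with full row rank; it also verifies that it coincides with X+ y and that other valid solutions have a larger norm:

```python
import numpy as np

rng = np.random.default_rng(8)
M, N = 3, 6
X = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))   # wide, full row rank (a.s.)
y = rng.normal(size=M) + 1j * rng.normal(size=M)

t_mn = X.conj().T @ np.linalg.solve(X @ X.conj().T, y)       # t_opt = X^H (X X^H)^{-1} y
print(np.allclose(X @ t_mn, y))                              # solves y = X t
print(np.allclose(t_mn, np.linalg.pinv(X) @ y))              # equals X^+ y

# adding a null-space component keeps y = X t but increases the norm
null_part = (np.eye(N) - np.linalg.pinv(X) @ X) @ (rng.normal(size=N) + 0j)
t_other = t_mn + null_part
print(np.allclose(X @ t_other, y), np.linalg.norm(t_other) >= np.linalg.norm(t_mn))
```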

42
5.4 Weighted Sum of Disjoint Orthogonal Projectors

Assume a rank-R matrix A : CM → CM that can be represented by means of a Weighted Sum of
rank-Ri Disjoint Orthogonal Projectors P i : CM → CM ,

A = Σ_{i=1}^{P} wi P i , (5.22)

with rank R = Σ_{i=1}^{P} Ri and range [P i ] ⊥ range [P j ] for all i ≠ j, i, j = 1, . . . , P , i.e., all projectors are
mutually orthogonal to each other.

A weighted sum of disjoint orthogonal projectors A : CM → CM referring to (5.22) is a Normal Matrix
with

A AH = AH A. (5.23)

The respective projectors and weights depend on A.

Note. The weights take Complex Values, i.e., wi ∈ C .

43
Special Matrices

Definition. A : CM → CM referring to (5.22) is a Self-Adjoint Matrix when

AH = A. (5.24)

Self-Adjoint Matrices are also Normal Matrices.


Note. The weights strictly take Real Values, i.e., wi ∈ R .

Definition. A : CM → CM referring to (5.22) is a Positive Definite Matrix when it is Self-Adjoint3 and

xH Ax ≥ 0, for all x ∈ CM . (5.25)

Note. The weights strictly take Nonnegative Values, i.e., wi ≥ 0 .


3 There exist more general definitions of Positive Definite Matrices.

44
Eigenpairs of Normal Matrices

Normal Matrices A are Linear Combinations of Orthogonal Projectors, i.e., A = Σ_{i=1}^{P} wi P i .

Consequently, any uj ∈ range [P j ] and its corresponding weight wj form an Eigenpair of A, since for
every element of range [P j ] we obtain P j uj = uj by definition and P i uj = 0 for i ≠ j due to the
orthogonality of the projectors, and thus

A uj = ( Σ_{i=1}^{P} wi P i ) uj (5.26)
     = Σ_{i=1}^{P} wi (P i uj ) = wj uj . (5.27)

Obviously, each wj and uj form an Eigenpair consisting of an Eigenvalue and an Eigenvector of
the matrix:

A uj = wj uj , for all j = 1, . . . , P. (5.28)

45
5.5 Spectral Theorem of Normal Matrices

Since (1) Normal Matrices A are Linear Combinations of Orthogonal Projectors P i and
any uj ∈ range [P j ] and its corresponding weight wj form an Eigenpair of A, and
(2) since any P i can be represented as a Linear Combination of rank-1 projectors ui,j uHi,j with respect
to the elements of an ONB {ui,j }_{j=1}^{Ri} , with

range [P i ] = range [U i ] = span [ui,1 · · · ui,Ri ] , (5.29)

where U i = [ui,1 , . . . , ui,Ri ] and Ri is the dimension of range [P i ], we obtain the main result of the
Spectral Theorem of Normal Matrices,

\[
\begin{aligned}
A &= \sum_{i=1}^{P} w_i P_i \\
  &= \sum_{i=1}^{P} w_i U_i U_i^H = \sum_{i=1}^{P} \sum_{j=1}^{R_i} w_i\, u_{i,j} u_{i,j}^H \\
  &= \begin{bmatrix} U_1 & \cdots & U_P \end{bmatrix}
     \begin{bmatrix}
       w_1 I_{R_1 \times R_1} & \cdots & 0 \\
       \vdots & \ddots & \vdots \\
       0 & \cdots & w_P I_{R_P \times R_P}
     \end{bmatrix}
     \begin{bmatrix} U_1^H \\ \vdots \\ U_P^H \end{bmatrix}.
\end{aligned}
\tag{5.30--5.32}
\]

46
6. Singular Value Decomposition

6.1 Derivation and Properties of the SVD

Any Compact Linear Operator A can be decomposed by means of the Singular Value Decom-
position (SVD), and any Finite-Dimensional Matrix A : CN → CM is a compact linear operator.
(1) Since the product of matrices AH A—the so-called Gramian Matrix—is Normal, Self-Adjoint
and Positive Definite,1 the Spectral Theorem can be applied such that

AH A = Σ_{i=1}^{P} λi P i = Σ_{i=1}^{R} λi vi vHi , (6.1)

with rank R = R1 + · · · + RP ≤ min(M, N) and at least P different weights λi > 0.

The vectors {vi }R_{i=1} form an ONB of range [AH ].

1 (AH A)H = AH A and vH AH Av = vH λv = λ vH v = λ ‖v‖2 ≥ 0.

47
(2) Given the Orthonormal Vectors {vi }R_{i=1} , the vectors {ui }R_{i=1} , with

\[
u_i \triangleq \lambda_i^{-\frac{1}{2}} A v_i ,
\tag{6.2}
\]

again constitute an ONB:

\[
u_j^H u_i = \big(\lambda_j^{-\frac{1}{2}} A v_j\big)^H \big(\lambda_i^{-\frac{1}{2}} A v_i\big)
= \lambda_j^{-\frac{1}{2}} \lambda_i^{-\frac{1}{2}} v_j^H A^H A v_i
= \lambda_j^{-\frac{1}{2}} \lambda_i^{-\frac{1}{2}} \lambda_i v_j^H v_i
= \begin{cases} 1 & ; \; i = j \\ 0 & ; \; \text{otherwise.} \end{cases}
\]

Definition.
The elements of an ONB {vi }R_{i=1} with Eigenvalues {λi }R_{i=1} of a Gramian Matrix AH A are called
Right Singular Vectors of A.
The elements of the ONB {ui = (1/√λi ) A vi }R_{i=1} associated with the Eigenvalues {λi }R_{i=1} of the
Gramian Matrix AH A are called the corresponding Left Singular Vectors of A.

48
The Singular Value Decomposition (SVD) of an arbitrary matrix A : CN → CM is constituted
by

A = Σ_{i=1}^{R} σi ui vHi , with matrix rank R, (6.3)

uHj ui = δi,j and vHj vi = δi,j and σi > 0, i, j = 1, . . . , R, (6.4)

with δi,j = 1 when i = j and zero otherwise.

The set of Left Singular Vectors {ui }R_{i=1} associated with nonzero Singular Values σi > 0 provides an ONB
for the Image Space range [A] ⊂ CM of the matrix A, i.e.,

range [A] = span [u1 u2 · · · uR ] . (6.5)

Correspondingly, the Right Singular Vectors {vi }R_{i=1} associated with nonzero Singular Values σi > 0
provide an ONB for the Image Space range [AH ] ⊂ CN of the matrix AH , which is equal to the
Orthogonal Complement of the Null Space of the matrix A, i.e.,

range [AH ] = null [A]⊥ = span [v1 v2 · · · vR ] . (6.6)

49
Derivation.
Applying the matrix A to a vector x can be equivalently expressed by

A x = A (x∥ + x⊥ ) = A x∥ ,

with x∥ ∈ range [AH ] ≡ null [A]⊥ and x⊥ ∈ null [A].
Since {vi }R_{i=1} forms an ONB for range [AH ] ≡ null [A]⊥ , we obtain

\[
A x = A x_{\parallel}
= A \sum_{i=1}^{R} v_i v_i^H x_{\parallel}
= \sum_{i=1}^{R} A v_i v_i^H x_{\parallel}
= \sum_{i=1}^{R} \sqrt{\lambda_i}\, u_i v_i^H x_{\parallel}
= \sum_{i=1}^{R} \sqrt{\lambda_i}\, u_i v_i^H x,
\]

with σi = √λi .

50
6.2 Sorted Matrix Representation

By sorting the Singular Values according to

σ1 ≥ σ2 ≥ · · · ≥ σR with R = rank [A] ,

we obtain the Sorted Matrix Representation of the SVD,

A = U Σ V H , with (6.7)

U = [u1 u2 · · · uR ] ∈ CM ×R , (6.8)

Σ = diag [σ1 σ2 · · · σR ] ∈ CR×R , (6.9)

V H = [v1 v2 · · · vR ]H ∈ CR×N . (6.10)
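The sorted representation is exactly what numpy.linalg.svd returns; a short sketch with an illustrative random matrix, also checking that the squared singular values are the eigenvalues of the Gramian AH A:

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))

U, s, Vh = np.linalg.svd(A, full_matrices=False)   # economy SVD, sigma_1 >= sigma_2 >= ...
print(np.allclose(U @ np.diag(s) @ Vh, A))         # A = U Sigma V^H
print(np.all(s[:-1] >= s[1:]))                     # sorted singular values

lam = np.linalg.eigvalsh(A.conj().T @ A)[::-1]     # eigenvalues of the Gramian, descending
print(np.allclose(s**2, lam))
```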

51
Extended Sorted Matrix Representation. By completing the ONB of the Left Singular Vec-
tors (column vectors of U ) and Right Singular Vectors (column vectors of V ) by their Orthog-
onal Complements, and taking into account that Q = min(M, N),

U⊥_{R<Q} = [uR+1 uR+2 · · · uQ ] and U⊥_{Q<M} = [uQ+1 uQ+2 · · · uM ] (6.11)
V⊥_{R<Q} = [vR+1 vR+2 · · · vQ ] and V⊥_{Q<N} = [vQ+1 vQ+2 · · · vN ] (6.12)

range [U ] ⊕ range [U⊥_{R<Q} ] ⊕ range [U⊥_{Q<M} ] = CM and (6.13)
range [V ] ⊕ range [V⊥_{R<Q} ] ⊕ range [V⊥_{Q<N} ] = CN , (6.14)

we obtain the Extended Sorted Matrix Representation of the SVD of matrices with maxi-
mum rank (R = Q):

\[
A = \begin{cases}
\begin{bmatrix} U , & U^{\perp}_{Q<M} \end{bmatrix}
\begin{bmatrix} \Sigma \\ 0 \end{bmatrix} V^H ; & M \ge N \text{ (Tall)} \\[1ex]
U \,\Sigma\, V^H ; & M = N \text{ (Square)} \\[1ex]
U \begin{bmatrix} \Sigma & 0 \end{bmatrix}
\begin{bmatrix} V^H \\ V^{\perp,H}_{Q<N} \end{bmatrix} ; & M \le N \text{ (Wide)}.
\end{cases}
\tag{6.15}
\]

52
SVD of Matrices without Maximum Rank.
For Tall Matrices (M ≥ N) with linearly dependent columns or Wide Matrices (M ≤ N) with linearly
dependent rows, i.e., with

rank [A] < Q = min{M, N}, (6.16)

we obtain the cases

\[
A_{M>N} = \begin{bmatrix} U, & U^{\perp}_{R<Q}, & U^{\perp}_{Q<M} \end{bmatrix}
\begin{bmatrix}
\Sigma_{R\times R} & 0_{R \times N-R} \\
0_{N-R \times R} & 0_{N-R \times N-R} \\
0_{M-N \times R} & 0_{M-N \times N-R}
\end{bmatrix}
\begin{bmatrix} V^H \\ V^{\perp,H}_{R<Q} \end{bmatrix}
\tag{6.17}
\]
\[
A_{M=N} = \begin{bmatrix} U, & U^{\perp}_{R<Q} \end{bmatrix}
\begin{bmatrix}
\Sigma_{R\times R} & 0_{R \times N-R} \\
0_{N-R \times R} & 0_{N-R \times N-R}
\end{bmatrix}
\begin{bmatrix} V^H \\ V^{\perp,H}_{R<Q} \end{bmatrix}
\tag{6.18}
\]
\[
A_{M<N} = \begin{bmatrix} U, & U^{\perp}_{R<Q} \end{bmatrix}
\begin{bmatrix}
\Sigma_{R\times R} & 0_{R \times M-R} & 0_{R \times N-M} \\
0_{M-R \times R} & 0_{M-R \times M-R} & 0_{M-R \times N-M}
\end{bmatrix}
\begin{bmatrix} V^H \\ V^{\perp,H}_{R<Q} \\ V^{\perp,H}_{Q<N} \end{bmatrix}.
\tag{6.19}
\]

53
6.3 Fundamental Subspaces (continued)

Any Linear Operator A : CN → CM is related to Four Fundamental Subspaces, viz.

Image Space of A : range [A] = { Ax | x ∈ CN } (6.20)
Null Space of A : null [A] = { x ∈ CN | Ax = 0 } (6.21)
Orthogonal Complement of range [A] : range [A]⊥ = null [AH ] (6.22)
Orthogonal Complement of null [A] : null [A]⊥ = range [AH ] . (6.23)

[Figure: the null space null [A] ⊂ CN is mapped to 0 ∈ CM , while CN is mapped onto range [A] ⊂ CM .]

54
ONB.
The column vectors of U ext = [U , U⊥ ] with U⊥ = [U⊥_{R<Q} , U⊥_{Q<M} ] and V ext = [V , V⊥ ] with
V⊥ = [V⊥_{R<Q} , V⊥_{Q<N} ] from the Extended SVD of a rank-R matrix A ∈ CM ×N constitute the
Orthonormal Bases for the Four Fundamental Subspaces range [A], range [A]⊥ , null [A], and
null [A]⊥ , viz.

range [A] = span [U ext ei | i = 1, . . . , R] (6.24)
range [A]⊥ = span [U ext ei | i = R + 1, . . . , Q] ⊕ span [U ext ei | i = Q + 1, . . . , M] (6.25)
null [A] = span [V ext ei | i = R + 1, . . . , Q] ⊕ span [V ext ei | i = Q + 1, . . . , N] (6.26)
null [A]⊥ = span [V ext ei | i = 1, . . . , R] . (6.27)

The Four Fundamental Subspaces of the matrices A : CN → CM and AH : CM → CN form
Orthogonal Complements of the N- and M-dimensional Vector Spaces:

CN = null [A] ⊕ range [AH ] , (6.28)
CM = range [A] ⊕ null [AH ] . (6.29)
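The full (extended) SVD directly delivers ONBs for all Four Fundamental Subspaces; a hedged NumPy sketch, assuming an illustrative rank-deficient matrix and a simple numerical rank threshold:

```python
import numpy as np

rng = np.random.default_rng(10)
M, N, R = 6, 5, 3
A = (rng.normal(size=(M, R)) + 1j * rng.normal(size=(M, R))) \
    @ (rng.normal(size=(R, N)) + 1j * rng.normal(size=(R, N)))   # rank-R matrix (a.s.)

U_ext, s, Vh_ext = np.linalg.svd(A)         # full SVD: U_ext is M x M, Vh_ext is N x N
r = int(np.sum(s > 1e-10 * s[0]))           # numerical rank

col_space  = U_ext[:, :r]                   # ONB of range[A]
left_null  = U_ext[:, r:]                   # ONB of range[A]^perp = null[A^H]
row_space  = Vh_ext[:r, :].conj().T         # ONB of null[A]^perp = range[A^H]
null_space = Vh_ext[r:, :].conj().T         # ONB of null[A]

print(r == R)
print(np.allclose(A @ null_space, 0))            # A maps null[A] to 0
print(np.allclose(A.conj().T @ left_null, 0))    # A^H maps null[A^H] to 0
```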

55
6.4 Example: Direction of Arrival Estimation

Array Manifolds
We consider the well known MUSIC (Multiple Signal Classification) algorithm for estimating
the Direction of Arrival (DoA) of impinging wavefronts on a Uniform Linear Array (ULA).2
In Section 1.3 it has been shown that the Received Signal Vector x(t) is always an element of the
Array Manifold H and thus must be Orthogonal to the so-called (pure) Noise Subspace.
The (pure) Noise Subspace N = H⊥ only considers the part of the noise vectors which is orthogonal to
the Array Manifold H.

x(t) = Hs(t) + η(t) and x(t) − η(t) ∈ range [H] ≡ H (noise free). (6.30)

2 For further studies the interested reader may also refer to H. Krim, M. Viberg, Two Decades of Array Signal Processing Research, IEEE Signal Processing Magazine, 13(4):67–94, July 1996, and H.L. Van Trees, Optimum Array Processing, Wiley, 2002.

56
[Figure: ULA geometry as in Section 1.3, with]

x(t) = [x0 (t) x1 (t) · · · xM −1 (t)]T ,
H = [h1 h2 · · · hW ],
hi = ξi [ej kTi r0 ej kTi r1 · · · ej kTi rM −1 ]T ,
kTi rm = −2π sin(θi ) md / λi ,
s(t) = [s1 (t) s2 (t) · · · sW (t)]T ,

and

\[
x_m(t) - \eta_m(t) = \sum_{i=1}^{W} \xi_i \, \mathrm{e}^{-\mathrm{j} 2\pi \sin(\theta_i) \frac{md}{\lambda_i}} s_i(t),
\qquad m = 0, \ldots, M-1.
\]

57
Estimation of the Array Manifold.
The principal idea of the MUSIC algorithm for estimating the DoA parameter θi of an impinging wavefront
is to test the property h(θi ) ∈ range [H] = H, or alternatively

h(θ) ⊥ N, (6.31)

with the so-called Pure Noise Subspace N = H⊥ , which only considers the part of the noise vectors
which is orthogonal to range [H].
The computation of an estimate of N requires a Basis of the Array Manifold H. Although the
subspace H is unknown, based on the observation that any (noise-free) Received Signal Vector
satisfies x(t) − η(t) ∈ H, we conclude that

H = range [X − N ] with X = [x1 x2 · · · xS ] , N = [n1 n2 · · · nS ] , (6.32)

where xi = x(ti ) and ni = η(ti ) are appropriately sampled realizations of the Received Signal
Vector x(t) and the Noise Signal Vector η(t) at time instances t1 , t2 , . . . , tS and S ≥ M.

58
A comparison of the Extended SVDs of the unknown Tall Matrix H and the known Wide Matrix
X provides ONBs for H and N:

\[
H = \underbrace{\begin{bmatrix} U_H & U_H^{\perp} \end{bmatrix}}_{M\times M}
\underbrace{\begin{bmatrix} \Sigma_H & 0 \\ 0 & 0 \end{bmatrix}}_{M\times M}
\underbrace{\begin{bmatrix} V_H^H \\ V_H^{\perp,H} \end{bmatrix}}_{M\times W},
\qquad
X = \underbrace{\begin{bmatrix} U_X & U_X^{\perp} \end{bmatrix}}_{M\times M}
\underbrace{\begin{bmatrix} \Sigma_X & 0 \\ 0 & \sigma^2 I \end{bmatrix}}_{M\times M}
\underbrace{\begin{bmatrix} V_X^H \\ V_X^{\perp,H} \end{bmatrix}}_{M\times S},
\]

with H = range [H] = range [U H ] = range [X − N ] = range [U X ] (6.33)

and N = range [H]⊥ = null [H H ] = range [U⊥H ] = range [U⊥X ] . (6.34)

Note. Given that the Noise Variance σ 2 is weaker than the Singular Values σX,i of Σ X , the matrix
U X can be computed without any knowledge of the θi , i = 1, . . . , W , by simply ordering the Singular
Values of the SVD.

59
MUSIC Spectrum
The so-called MUSIC Spectrum is a Pseudo-Spectrum which is based on the observation that
whenever the Inner Product between h(θ) and every basis vector of the Pure Noise Subspace
N vanishes, the parameter θ must be one of the DoAs to be determined.
Estimates of the θi , i = 1, . . . , W , can be found as those arguments θ for which the MUSIC Spectrum
ϕ(θ) takes locally maximum values:3

\[
\varphi_{\text{MUSIC}}(\theta)
= \frac{1}{\big\| U_X^{\perp,H} h(\theta) \big\|^2}
= \frac{1}{h^H(\theta)\, U_X^{\perp} U_X^{\perp,H} h(\theta)}
= \frac{1}{h^H(\theta) \left( I - U_X U_X^H \right) h(\theta)} .
\tag{6.35}
\]

In the following numerical example, we compare MUSIC with

- Standard Beamforming (BF) with ϕBF (θ) = (1/M ) hH (θ) R̂x h(θ), and
- Capon Beamforming (CAP) with ϕCAP (θ) = ( hH (θ) R̂x−1 h(θ) )−1 .

Here, the matrix R̂x is the Sample Covariance Matrix of Rx = E[ x xH ], i.e., R̂x = (1/S) Σ_{i=1}^{S} xi xHi .

3 I − U X UHX = U X UHX + U⊥X U⊥,HX − U X UHX = U⊥X U⊥,HX .
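A compact NumPy sketch of the MUSIC pseudo-spectrum (6.35); the DoAs, unit attenuations, signal statistics, and the crude peak picking are illustrative assumptions, not part of the derivation above:

```python
import numpy as np

rng = np.random.default_rng(11)
M, d, lam, W, S, sigma = 10, 0.5, 1.0, 4, 100, 0.1      # sigma^2 = 0.01 noise variance
doas = np.deg2rad([-40.0, -10.0, 20.0, 55.0])           # illustrative true DoAs

def h(theta):
    return np.exp(-1j * 2 * np.pi * np.sin(theta) * np.arange(M) * d / lam)

H = np.column_stack([h(t) for t in doas])
S_sig = (rng.normal(size=(W, S)) + 1j * rng.normal(size=(W, S))) / np.sqrt(2)
noise = sigma * (rng.normal(size=(M, S)) + 1j * rng.normal(size=(M, S))) / np.sqrt(2)
X = H @ S_sig + noise                                   # S sampled received signal vectors

U_X, _, _ = np.linalg.svd(X)                            # singular values come sorted
U_sig = U_X[:, :W]                                      # ONB estimate of the array manifold
P_noise = np.eye(M) - U_sig @ U_sig.conj().T            # projector onto the noise subspace

thetas = np.deg2rad(np.linspace(-90, 90, 721))
spec = np.array([1.0 / np.real(h(t).conj() @ P_noise @ h(t)) for t in thetas])

# crude peak picking, only to show that the maxima sit near the true DoAs
loc = np.where((spec[1:-1] > spec[:-2]) & (spec[1:-1] > spec[2:]))[0] + 1
top = loc[np.argsort(spec[loc])[-W:]]
print(np.sort(np.rad2deg(thetas[top])))
```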

60
Normalized Pseudo-Spectra of Beamforming (BF), Capon Beamforming (CAP), and MUSIC
for M = 10 antennas, each ξi = 1, noise variance σ2 = 0.01, S = 100 samples, and W = 4 different DoAs.

[Figure: normalized pseudo-spectra ϕ(θ) of MUSIC, BF, and CAP over θ ∈ [−90°, 90°].]

61
