Chapter 2

Linear Algebra and Matrix Analysis

Contents

2.1 Introduction
2.2 Range Space, Null Space and Matrix Rank
2.3 Eigenvalue Decomposition
    2.3.1 General matrices
    2.3.2 Hermitian matrices
2.4 Singular Value Decomposition and Projection Operator
2.5 Positive (Semi)Definite Matrices
2.6 Matrices with Special Structure
2.7 Matrix Inversion Lemmas
2.8 Systems of Linear Equations
    2.8.1 Consistent systems
    2.8.2 Inconsistent systems
2.9 The Multivariate Normal Distribution
    2.9.1 Linear transformations
    2.9.2 Diagonalising transformations
    2.9.3 Quadratic forms
    2.9.4 Complex Normal Distribution

2.1 Introduction
In digital signal processing, many problems are modelled with linear-algebra equations. As a consequence, it is classical to manipulate expressions such as:

Y = A X + N

where X is the input signal matrix, A is a matrix resulting from the modelling process, N is the noise matrix or noise vector, and Y is the measured output of the system.

It is then important to introduce some classical matrix analysis tools to solve AX = B. In addition, since N is a noise matrix or noise vector, the multivariate normal distribution has to be introduced.

This chapter was written using mainly the references [1] and [2].

2.2 Range Space, Null Space and Matrix Rank
A matrix is denoted by a bold capital symbol, for example A or Σ; a vector is denoted by a bold lowercase symbol, for example α or v. Let A be an m × n matrix with complex-valued elements, A ∈ C^{m×n}, let (.)^T and (.)^* denote respectively the transpose and conjugate operators, and let (.)^H = ((.)^T)^* denote the conjugate transpose. The determinant of a matrix is denoted by |.| or det(.).

Definition 1. The range space of A, also called the column space, is the subspace spanned by (i.e. all linear combinations of) the columns of A:

R(A) = { α ∈ C^{m×1} | α = Aβ for β ∈ C^{n×1} }   (2.2.1)

The range space of A^T is usually called the row space of A.

Definition 2. Equivalent definitions of the rank r = rank(A) of A follow:

(i) r is equal to the maximum number of linearly independent columns of A. The latter number is by definition the dimension of R(A):

r = dim R(A)   (2.2.2)

(ii) r is equal to the maximum number of linearly independent rows of A:

r = dim R(A^T) = dim R(A^H)   (2.2.3)

(iii) r is the dimension of the nonzero determinant of maximum size that can be built from the elements of A.

Definition 3. The null space of A, also called kernel, is the following subspace:

N(A) = { β ∈ C^{n×1} | Aβ = 0 }   (2.2.4)

Definition 4. A is said to be:
• rank deficient whenever r < min(m, n);
• full column rank if r = n ≤ m;
• full row rank if r = m ≤ n;
• nonsingular if r = m = n.

Proposition 5. rank(A) ≤ min(m, n).

Proposition 6. Let A ∈ C^{m×n} and B ∈ C^{n×p} be two conformable matrices of rank r_A and r_B respectively. Then:

rank(AB) ≤ min(r_A, r_B)   (2.2.5)

Proof. Using the definition of the rank, the premultiplication of B by A cannot increase the number of linearly independent columns of B, which means that rank(AB) ≤ r_B. Similarly, the postmultiplication of A by B cannot increase the number of linearly independent columns of A^T, hence rank(AB) ≤ r_A.

In particular, if A ∈ C^{m×n} is given by

A = Σ_{k=1}^N x_k y_k^H

where x_k ∈ C^{m×1} and y_k ∈ C^{n×1}, then rank(A) ≤ N. Indeed, since A can be written as A = [x_1 ⋯ x_N][y_1 ⋯ y_N]^H, the result follows from Proposition 6.

Proposition 7. Premultiplication or postmultiplication of A by a nonsingular matrix does not change the rank of A.

Proof. This directly follows from the definition of rank(A), because the aforementioned multiplications do not change the number of linearly independent columns (or rows) of A.
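As a small numerical illustration of Proposition 6, the following NumPy sketch builds two random low-rank matrices and checks that rank(AB) ≤ min(rank(A), rank(B)); the sizes and ranks used here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build A (6x5) of rank 3 and B (5x4) of rank 2 as products of thin factors.
A = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 5))
B = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

rA = np.linalg.matrix_rank(A)
rB = np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)

print(rA, rB, rAB)                 # 3 2 2
assert rAB <= min(rA, rB)          # Proposition 6
```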

Proposition 8. Let A ∈ C^{m×n} with n ≤ m, let B ∈ C^{n×p}, and let

rank(A) = n   (2.2.6)

Then

rank(AB) = rank(B)   (2.2.7)

Proof. By Proposition 6, rank(AB) ≤ rank(B). The assumption (2.2.6) implies that A contains a nonsingular n × n submatrix, the multiplication of which by B gives a block of AB of rank equal to rank(B); hence rank(AB) ≥ rank(B). However, rank(AB) ≤ rank(B), and hence (2.2.7) follows.

2.3 Eigenvalue Decomposition

Definition 9. A matrix U ∈ C^{m×m} is said to be unitary (orthogonal if U is real valued) whenever

U^H U = U U^H = I

If U ∈ C^{m×n}, with m > n, is such that U^H U = I, then we say that U is semi-unitary.

Definition 10. A matrix A ∈ C^{m×m} is said to be Hermitian if A^H = A. In the real-valued case, such an A is said to be symmetric.

2.3.1 General matrices

Definition 11. A scalar λ ∈ C and a (nonzero) vector x ∈ C^{m×1} are an eigenvalue and its associated eigenvector of a matrix A ∈ C^{m×m} if

A x = λ x   (2.3.1)

The pair (λ, x) is called an eigenpair. In particular, an eigenvalue λ is a solution of the so-called characteristic equation of A:

|A − λI| = 0   (2.3.2)

and x is a vector in N(A − λI).

Observe that if {(λ_i, x_i)}_{i=1}^p are p eigenpairs of A (with p ≤ m), then we can write the defining equations A x_i = λ_i x_i (i = 1, ..., p) in the following compact form:

A X = X Λ   (2.3.3)

where X = [x_1 ⋯ x_p] and Λ = diag(λ_1, ..., λ_p).

Proposition 12. Let (λ, x) be an eigenpair of A ∈ C^{m×m}. If B = A + αI, with α ∈ C, then (λ + α, x) is an eigenpair of B.

Proof. Obvious.

Proposition 13. The matrices A and B = Q^{-1} A Q, where Q is any nonsingular matrix, share the same eigenvalues. B is said to be related to A by a similarity transformation.

Proof. |B − λI| = |Q^{-1}(A − λI)Q| = |Q^{-1}| |A − λI| |Q| = 0 is equivalent to |A − λI| = 0.

Definition 14. The trace of a square matrix A ∈ C^{m×m} is defined as

tr(A) = Σ_{i=1}^m A_ii   (2.3.4)
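The next NumPy sketch illustrates Propositions 12 and 13 on an arbitrary symmetric test matrix: adding αI shifts every eigenvalue by α, and a similarity transformation leaves the spectrum unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4
M = rng.standard_normal((m, m))
A = M + M.T                               # symmetric, so its eigenvalues are real
Q = rng.standard_normal((m, m))           # almost surely nonsingular
alpha = 2.5

eig_A = np.sort(np.linalg.eigvalsh(A))
eig_shift = np.sort(np.linalg.eigvalsh(A + alpha * np.eye(m)))
eig_sim = np.sort(np.linalg.eigvals(np.linalg.inv(Q) @ A @ Q).real)

assert np.allclose(eig_shift, eig_A + alpha)   # Proposition 12: spectrum shifted by alpha
assert np.allclose(eig_sim, eig_A)             # Proposition 13: similarity keeps the spectrum
```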

Proposition 15. If {λ_i}_{i=1}^m are the eigenvalues of A ∈ C^{m×m}, then

tr(A) = Σ_{i=1}^m λ_i   (2.3.5)

Proposition 16. Let A ∈ C^{m×n} and B ∈ C^{n×m}. Then

tr(AB) = tr(BA)   (2.3.6)

In other words, while the matrix product is not commutative, the trace is invariant to commuting the factors in a matrix product.

Proof. A straightforward calculation based on the definition of the trace and the diagonal elements of a matrix product.

Lemma 17. Let A ∈ C^{m×n} and B ∈ C^{n×m}. Then:

|I − AB| = |I − BA|   (2.3.7)

Proof. We can write:

\begin{bmatrix} I & A \\ 0 & I \end{bmatrix} \begin{bmatrix} I & -A \\ -B & I \end{bmatrix} = \begin{bmatrix} I - AB & 0 \\ -B & I \end{bmatrix}   (2.3.8)

\begin{bmatrix} I & 0 \\ B & I \end{bmatrix} \begin{bmatrix} I & -A \\ -B & I \end{bmatrix} = \begin{bmatrix} I & -A \\ 0 & I - BA \end{bmatrix}   (2.3.9)

As the left-hand sides of (2.3.8) and (2.3.9) have the same determinant, the right-hand side matrices also have the same determinant, namely |I − AB| and |I − BA| respectively.

Proposition 18. Let A, B ∈ C^{m×m} and let α ∈ C. Then

|AB| = |A| |B|  and  |αA| = α^m |A|

Proposition 19. Let A ∈ C^{m×n} and B ∈ C^{n×m}. Then the non-zero eigenvalues of AB and BA are identical.

Proof. Let λ ≠ 0 be an eigenvalue of AB. Then

0 = |AB − λI| = λ^m |AB/λ − I| = λ^m |BA/λ − I| = λ^{m−n} |BA − λI|

This result is obtained thanks to Lemma 17 and Proposition 18. We can then conclude that λ is also an eigenvalue of BA.

2.3.2 Hermitian matrices

An important property of the class of Hermitian matrices, which does not necessarily hold for general matrices, is the following.

Proposition 20.
(i) All eigenvalues of A = A^H ∈ C^{m×m} are real valued.
(ii) The m eigenvectors of A = A^H ∈ C^{m×m} form an orthogonal set; equivalently, the matrix whose columns are the eigenvectors of A is unitary.

It follows from (i) and (ii) and from equation (2.3.3) that for a Hermitian matrix we can write:

A U = U Λ

where U U^H = U^H U = I and the diagonal elements of Λ are real numbers. Equivalently:

A = U Λ U^H   (2.3.10)

which is the so-called Eigenvalue Decomposition (EVD) of A = A^H. Interestingly, the EVD of a Hermitian matrix is a special case of the Singular Value Decomposition (SVD) of a general matrix, discussed in the next section.
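A short NumPy check, on arbitrary rectangular test matrices, of Proposition 16, Lemma 17 and Proposition 19.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, m))

# Proposition 16: tr(AB) = tr(BA)
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# Lemma 17: |I - AB| = |I - BA|
assert np.isclose(np.linalg.det(np.eye(m) - A @ B),
                  np.linalg.det(np.eye(n) - B @ A))

# Proposition 19: the non-zero eigenvalues of AB and BA coincide.
eig_AB = np.linalg.eigvals(A @ B)       # m eigenvalues, m - n of them numerically zero
eig_BA = np.linalg.eigvals(B @ A)       # n eigenvalues
print(np.sort_complex(eig_AB[np.abs(eig_AB) > 1e-8]))
print(np.sort_complex(eig_BA))          # same values, up to numerical error
```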

Another useful result associated with Hermitian matrices is the following.

Proposition 21. Let A = A^H ∈ C^{m×m}, let v ∈ C^{m×1} (v ≠ 0), and let the eigenvalues of A be arranged in a decreasing order: λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_m. Then:

λ_m ≤ (v^H A v) / (v^H v) ≤ λ_1   (2.3.11)

The ratio in (2.3.11) is called the Rayleigh quotient. As this ratio is invariant to the multiplication of v by any non-null complex number, we can rewrite (2.3.11) in the form:

λ_m ≤ v^H A v ≤ λ_1  for any v ∈ C^{m×1} with v^H v = 1   (2.3.12)

The equalities in (2.3.12) are evidently achieved when v is equal to the eigenvectors of A associated with λ_m and λ_1, respectively.

Proof. Let the EVD of A be given by (2.3.10), A = U Λ U^H, and let

w = U^H v = [w_1 ⋯ w_m]^T

Then we need to prove that

λ_m ≤ w^H Λ w = Σ_{k=1}^m λ_k |w_k|^2 ≤ λ_1

for any w ∈ C^{m×1} satisfying w^H w = Σ_{k=1}^m |w_k|^2 = 1. However, this is readily verified as follows:

λ_1 − Σ_{k=1}^m λ_k |w_k|^2 = Σ_{k=1}^m (λ_1 − λ_k) |w_k|^2 ≥ 0

and

Σ_{k=1}^m λ_k |w_k|^2 − λ_m = Σ_{k=1}^m (λ_k − λ_m) |w_k|^2 ≥ 0

which concludes the proof.
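A minimal NumPy check of the Rayleigh-quotient bounds (2.3.11), using an arbitrary Hermitian test matrix and random vectors.

```python
import numpy as np

rng = np.random.default_rng(3)
m = 6
M = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
A = (M + M.conj().T) / 2                      # Hermitian test matrix
lam = np.linalg.eigvalsh(A)                   # real eigenvalues, ascending
lam_min, lam_max = lam[0], lam[-1]

for _ in range(1000):
    v = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    rq = (v.conj() @ A @ v).real / (v.conj() @ v).real
    assert lam_min - 1e-10 <= rq <= lam_max + 1e-10   # Proposition 21
```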

The following result extends the previous proposition.

Proposition 22. Let A = A^H ∈ C^{m×m} have its eigenvalues ordered as λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_m, and let V ∈ C^{m×n}, with m > n, be a semi-unitary matrix (i.e. V^H V = I). Then:

Σ_{k=m−n+1}^m λ_k ≤ tr(V^H A V) ≤ Σ_{k=1}^n λ_k   (2.3.13)

where the equalities are achieved, for instance, when the columns of V are the eigenvectors of A corresponding to (λ_{m−n+1}, ..., λ_m) and, respectively, to (λ_1, ..., λ_n). The ratio

tr(V^H A V) / tr(V^H V) = tr(V^H A V) / n

is sometimes called the extended Rayleigh quotient.

Proof. Let the EVD of A be given by (2.3.10), A = U Λ U^H, and let S = U^H V denote the m × n matrix whose rows are s_1^H, ..., s_m^H (hence s_k^H is the k-th row of S). By making use of this notation, we can write:

tr(V^H A V) = tr(V^H U Λ U^H V) = tr(S^H Λ S) = tr(Λ S S^H) = Σ_{k=1}^m λ_k c_k   (2.3.14)

where c_k ≜ s_k^H s_k obviously satisfy

c_k ≥ 0,  k = 1, ..., m   (2.3.15)

and

Σ_{k=1}^m c_k = tr(S S^H) = tr(S^H S) = tr(V^H U U^H V) = tr(V^H V) = tr(I) = n   (2.3.16)

Furthermore, let G ∈ C^{m×(m−n)} be such that the matrix [S G] is unitary, and let g_k^H denote the k-th row of G. Then, by construction,

s_k^H s_k + g_k^H g_k = 1  ⟹  c_k = 1 − g_k^H g_k ≤ 1

hence

c_k ≤ 1,  k = 1, ..., m   (2.3.17)

Finally, by combining (2.3.14) with (2.3.15)–(2.3.17), we can readily verify that tr(V^H A V) satisfies (2.3.13), where the equalities are achieved for c_1 = ⋯ = c_{m−n} = 0, c_{m−n+1} = ⋯ = c_m = 1 and, respectively, c_1 = ⋯ = c_n = 1, c_{n+1} = ⋯ = c_m = 0. These conditions on {c_k} are satisfied if, for example, S is equal to [0 I]^T and [I 0]^T, respectively. With this observation, the proof is concluded. Proposition 21 is clearly a special case of Proposition 22.

2.4 Singular Value Decomposition and Projection Operator

Applications which employ the SVD include computing the pseudoinverse, least squares fitting of data, matrix approximation, and determining the rank, range and null space of a matrix.

For any matrix A ∈ C^{m×n} there exist unitary matrices U ∈ C^{m×m} and V ∈ C^{n×n} and a diagonal matrix Σ ∈ R^{m×n} with non-negative diagonal elements, such that

A = U Σ V^H   (2.4.1)

By appropriate permutation, the diagonal elements of Σ can be arranged in a decreasing order: σ_1 ≥ σ_2 ≥ ⋯ ≥ σ_{min(m,n)}. The factorisation (2.4.1) is called the Singular Value Decomposition (SVD) of A, and its existence is a significant result from both a theoretical and a practical standpoint. We reiterate that the matrices in equation (2.4.1) satisfy:

U^H U = U U^H = I  (m × m)
V^H V = V V^H = I  (n × n)
Σ_ij = σ_i ≥ 0 for i = j,  Σ_ij = 0 for i ≠ j
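The SVD in (2.4.1) is available numerically; the following NumPy sketch (arbitrary complex test matrix) verifies the factorisation, the unitarity of U and V, and the ordering of the singular values.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

U, s, Vh = np.linalg.svd(A)            # full SVD: U is m x m, Vh is n x n
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)             # embed the singular values in an m x n matrix

assert np.allclose(A, U @ Sigma @ Vh)                  # A = U Sigma V^H
assert np.allclose(U.conj().T @ U, np.eye(m))          # U unitary
assert np.allclose(Vh @ Vh.conj().T, np.eye(n))        # V unitary
assert np.all(s[:-1] >= s[1:])                         # decreasing singular values
```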

The following terminology is most commonly associated with the SVD:

• The left singular vectors of the matrix A are the columns of U. These singular vectors are also the eigenvectors of A A^H.
• The right singular vectors of the matrix A are the columns of V. These singular vectors are also the eigenvectors of A^H A.
• The singular values of A are the diagonal elements {σ_i} of Σ. Note that {σ_i} are the square roots of the largest min(m, n) eigenvalues of A A^H or A^H A.
• The singular triple of A is the triple (σ_k, u_k, v_k), where u_k is the k-th column of U and v_k is the k-th column of V.

If rank(A) = r ≤ min(m, n), then it can be shown that:

σ_k > 0 for k = 1, ..., r,   σ_k = 0 for k = r + 1, ..., min(m, n)

The factorisation of A in (2.4.1) has a number of interesting corollaries, readily derived by using the SVD, which complement the discussion on the range and null space in Section 2.2. For a matrix of rank r, the SVD can be written as:

A = [U_1 U_2] \begin{bmatrix} Σ_1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1^H \\ V_2^H \end{bmatrix} = U_1 Σ_1 V_1^H   (2.4.2)

where U_1 has r columns, U_2 has m − r columns, V_1^H has r rows, V_2^H has n − r rows, and Σ_1 ∈ R^{r×r} is non-singular.

Proposition 23. Consider the SVD of A ∈ C^{m×n} in (2.4.2). Then:
(i) U_1 is an orthogonal basis of R(A);
(ii) U_2 is an orthogonal basis of N(A^H);
(iii) V_1 is an orthogonal basis of R(A^H);
(iv) V_2 is an orthogonal basis of N(A).

Proof. To prove (i) and (ii), we need to show that

R(A) = R(U_1)   (2.4.3)

and, respectively,

N(A^H) = R(U_2)   (2.4.4)

To show (2.4.3): α ∈ R(A) ⇒ there exists β such that α = Aβ. Then α = U_1 Σ_1 V_1^H β = U_1 γ, and so α ∈ R(U_1); hence R(A) ⊂ R(U_1). Now, for α ∈ R(U_1), there exists β such that α = U_1 β. From (2.4.2), it follows that U_1 = A V_1 Σ_1^{-1}, so

α = A (V_1 Σ_1^{-1} β) = A ρ

and then α ∈ R(A), that is to say R(U_1) ⊂ R(A). But we also had R(A) ⊂ R(U_1), which leads to (2.4.3).

The same philosophy applies to the demonstration of (2.4.4):

α ∈ N(A^H) ⇒ A^H α = 0 ⇒ V_1 Σ_1 U_1^H α = 0 ⇒ Σ_1^{-1} V_1^H V_1 Σ_1 U_1^H α = 0 ⇒ U_1^H α = 0

Now, since [U_1 U_2] is nonsingular, any vector α can be written as α = U_1 γ + U_2 β, and 0 = U_1^H α = U_1^H U_1 γ + U_1^H U_2 β = γ, so γ = 0 and thus α = U_2 β. We can then conclude that N(A^H) ⊂ R(U_2). Conversely, α ∈ R(U_2) ⇒ there exists β such that α = U_2 β; then A^H α = V_1 Σ_1 U_1^H U_2 β = 0, so α ∈ N(A^H), which leads to (2.4.4).

We can see that (iii) and (iv) follow from the properties (i) and (ii) applied to A^H.
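A numerical illustration of Proposition 23 on an arbitrary rank-2 matrix: the leading left singular vectors span R(A), while the trailing ones lie in N(A^H).

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, r = 6, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank r by construction

U, s, Vh = np.linalg.svd(A)
U1, U2 = U[:, :r], U[:, r:]

# (i) every column of A lies in R(U1): projecting onto R(U1) leaves A unchanged.
assert np.allclose(U1 @ U1.T @ A, A)

# (ii) the columns of U2 are in N(A^H): A^H U2 = 0 (A is real here, so A^H = A^T).
assert np.allclose(A.T @ U2, 0)

print("numerical rank:", np.sum(s > 1e-10))   # 2
```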

Proposition 24. For any A ∈ C^{m×n}, the subspaces R(A) and N(A^H) are orthogonal to each other and together they span C^m. For this reason, we say that N(A^H) is the orthogonal complement of R(A) in C^m, and vice versa. In particular we have:

dim R(A) = r   (2.4.5)

dim N(A^H) = m − r   (2.4.6)

(recall that dim R(A) = dim R(A^H) = r).

Proof. Direct corollary of Proposition 23.

Definition 25. Let y ∈ C^{m×1} be an arbitrary vector. By definition, the orthogonal projector onto R(A) is the matrix Π which is such that:
(i) R(Π) = R(A);
(ii) the Euclidean distance between y and Πy ∈ R(A) is minimum over R(A).
Hereafter, ‖x‖^2 = x^H x denotes the (squared) Euclidean vector norm.

Proposition 26. Let A ∈ C^{m×n}. The orthogonal projector onto R(A) is given by:

Π = U_1 U_1^H   (2.4.7)

whereas the orthogonal projector onto N(A^H) is

Π^⊥ = I − U_1 U_1^H = U_2 U_2^H   (2.4.8)

Proof. As R(A) = R(U_1), we can find the vector in R(A) that is of minimal distance from y by solving the problem:

min_β ‖y − U_1 β‖^2   (2.4.9)

Because

‖y − U_1 β‖^2 = (β^H − y^H U_1)(β − U_1^H y) + y^H (I − U_1 U_1^H) y = ‖β − U_1^H y‖^2 + ‖U_2^H y‖^2

it readily follows that the solution to the minimization problem (2.4.9) is given by β = U_1^H y. Hence the vector U_1 U_1^H y is the orthogonal projection of y onto R(A), and the minimum distance from y to R(A) is ‖U_2^H y‖. This proves (2.4.7). Note that for the projection of y onto R(A) the error vector is

y − U_1 U_1^H y = U_2 U_2^H y

which is in R(U_2) and is therefore orthogonal to R(A) by Proposition 23. For this reason, Π is given the name orthogonal projector in Definition 25 and Proposition 26. Finally, (2.4.8) follows immediately from (2.4.7) and the fact that N(A^H) = R(U_2).

As an aside, we remark that the orthogonal projectors in (2.4.7) and (2.4.8) are idempotent matrices, see Definition 27 below. Also, observe by making use of Proposition 19 that the idempotent matrix in (2.4.7) has r eigenvalues equal to 1 and (m − r) eigenvalues equal to zero. This is a general property of idempotent matrices: their eigenvalues are either zero or one.

Definition 27. The matrix A ∈ C^{m×m} is idempotent if

A^2 = A   (2.4.10)
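A small NumPy sketch of Proposition 26 and the idempotency remark: Π = U1 U1^H projects onto R(A), satisfies Π² = Π, and has r unit eigenvalues and m − r zero eigenvalues. The rank-2 test matrix is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n, r = 6, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank r

U, s, Vh = np.linalg.svd(A)
U1 = U[:, :r]
Pi = U1 @ U1.T                         # orthogonal projector onto R(A)

y = rng.standard_normal(m)
y_proj = Pi @ y

assert np.allclose(Pi @ Pi, Pi)                       # idempotent
assert np.allclose(A.T @ (y - y_proj), 0)             # residual orthogonal to R(A)
print(np.round(np.linalg.eigvalsh(Pi), 6))            # r ones, m - r zeros
```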

Finally, we present a result that, even alone, would be enough to make the SVD an essential matrix analysis tool. Let A ∈ C^{m×n}, with elements A_ij, and let

‖A‖^2 = tr(A^H A) = Σ_{i=1}^m Σ_{j=1}^n |A_ij|^2 = Σ_{k=1}^{min(m,n)} σ_k^2   (2.4.11)

denote the square of the so-called Frobenius norm. Let the SVD of A (with the singular values arranged in a decreasing order) be given by:

A = [U_1 U_2] \begin{bmatrix} Σ_1 & 0 \\ 0 & Σ_2 \end{bmatrix} \begin{bmatrix} V_1^H \\ V_2^H \end{bmatrix}   (2.4.12)

where U_1 has p columns, U_2 has m − p columns, V_1^H has p rows, V_2^H has n − p rows, and p ≤ min(m, n) is an integer.

Proposition 28. The best rank-p approximant of A in the Frobenius norm metric, that is the solution of

min_B ‖A − B‖^2  subject to  rank(B) = p   (2.4.13)

is given by

B_0 = U_1 Σ_1 V_1^H   (2.4.14)

Furthermore, B_0 above is the unique solution to the approximation problem (2.4.13) if and only if σ_p > σ_{p+1}.
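Before the proof, a brief numerical illustration on an arbitrary matrix: the truncated SVD gives the best rank-p approximant, and the squared Frobenius error equals the sum of the discarded squared singular values.

```python
import numpy as np

rng = np.random.default_rng(7)
m, n, p = 8, 6, 2
A = rng.standard_normal((m, n))

U, s, Vh = np.linalg.svd(A, full_matrices=False)
B0 = U[:, :p] @ np.diag(s[:p]) @ Vh[:p, :]        # best rank-p approximant (2.4.14)

err2 = np.linalg.norm(A - B0, "fro") ** 2
assert np.isclose(err2, np.sum(s[p:] ** 2))       # ||A - B0||^2 = sum_{k>p} sigma_k^2

# Any other rank-p matrix does at least as badly (a random trial, not a proof).
C = rng.standard_normal((m, p)) @ rng.standard_normal((p, n))
assert np.linalg.norm(A - C, "fro") ** 2 >= err2
```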

Proof. It follows from Proposition 8 and (2.4.5) that we can parametrise B in (2.4.13) as:

B = C D^H   (2.4.15)

where C ∈ C^{m×p} and D ∈ C^{n×p} are full column rank matrices. The previous parametrisation of B is of course non-unique but, as we will see, this fact does not introduce any problem. By making use of (2.4.15), we can rewrite the problem (2.4.13) as:

min_{C,D} ‖A − C D^H‖^2,  rank(C) = rank(D) = p   (2.4.16)

The reparametrised problem is essentially constraint free. Indeed, the full column rank condition that must be satisfied by C and D can be easily handled, as seen below.

First we minimise (2.4.16) with respect to D. To that end, observe that:

‖A − C D^H‖^2 = tr{ [D − A^H C (C^H C)^{-1}] C^H C [D^H − (C^H C)^{-1} C^H A] } + tr{ A^H [I − C (C^H C)^{-1} C^H] A }   (2.4.17)

By result (iii) in Definition 29 in the next section, the matrix [D − A^H C (C^H C)^{-1}] C^H C [D − A^H C (C^H C)^{-1}]^H is positive semidefinite for any D. This observation implies that (2.4.17) is minimised with respect to D for:

D_0 = A^H C (C^H C)^{-1}   (2.4.18)

and the corresponding minimum value of (2.4.17) is given by

tr{ A^H [I − C (C^H C)^{-1} C^H] A }   (2.4.19)

Next we minimise (2.4.19) with respect to C. Let S ∈ C^{m×p} denote an orthogonal basis of R(C), that is, S^H S = I and S = CΓ for some nonsingular p × p matrix Γ. It is then straightforward to verify that

I − C (C^H C)^{-1} C^H = I − S S^H   (2.4.20)

By combining (2.4.19) and (2.4.20), we can restate the problem of minimising (2.4.19) with respect to C as:

max_{S, S^H S = I} tr(S^H A A^H S)   (2.4.21)

The solution to (2.4.21) follows from Proposition 22: the maximising S is given by S_0 = U_1, which yields

C_0 = U_1 Γ^{-1}   (2.4.22)

It follows that:

B_0 = C_0 D_0^H = C_0 (C_0^H C_0)^{-1} C_0^H A = S_0 S_0^H A = U_1 U_1^H (U_1 Σ_1 V_1^H + U_2 Σ_2 V_2^H) = U_1 Σ_1 V_1^H

Furthermore, we observe that the minimum value of the Frobenius distance in (2.4.13) is given by

‖A − B_0‖^2 = ‖U_2 Σ_2 V_2^H‖^2 = Σ_{k=p+1}^{min(m,n)} σ_k^2

If σ_p > σ_{p+1}, then the best rank-p approximant B_0 derived above is unique. Otherwise it is not unique: whenever σ_p = σ_{p+1}, we can obtain B_0 by using either the singular vectors associated with σ_p or those associated with σ_{p+1}, which will generally lead to different solutions.

2.5 Positive (Semi)Definite matrices

Let A = A^H ∈ C^{m×m} be a Hermitian matrix, and let {λ_k}_{k=1}^m denote its eigenvalues.

Definition 29. We say that A is positive semi-definite (psd) or positive definite (pd) if any of the following equivalent conditions hold true:
(i) λ_k ≥ 0 (λ_k > 0 for pd) for k = 1, ..., m.
(ii) α^H A α ≥ 0 (α^H A α > 0 for pd) for any non-zero vector α ∈ C^{m×1}.
(iii) There exists a matrix C such that

A = C C^H  (with rank(C) = m for pd)   (2.5.1)

(iv) |A(i_1, ..., i_k)| ≥ 0 (> 0 for pd) for all k = 1, ..., m and all indices i_1, ..., i_k ∈ [1, m], where A(i_1, ..., i_k) is the submatrix formed from A by eliminating the i_1, ..., i_k rows and columns of A. (A(i_1, ..., i_k) is called a principal submatrix of A.)

Of the previous defining conditions, (iv) is apparently the most involved. The condition for A to be positive definite can be simplified to requiring that |A(k + 1, ..., m)| > 0 (for k = 1, ..., m − 1) and |A| > 0, where A(k + 1, ..., m) is called a leading submatrix of A. This simplified test is named the Sylvester criterion.

Definition 30. Any matrix C that satisfies

A = C C^H   (2.5.2)

is called a square root of A. Sometimes such a C is denoted C = A^{1/2}. There are then an infinite number of square roots: if C is a square root of A, then so is CB for any unitary matrix B. Two often-used particular choices of square roots are:
(i) Hermitian square root: C = C^H. In this case we can simply write (2.5.2) as A = C^2. The Hermitian square root is unique and is given by

C = U Λ^{1/2} U^H   (2.5.3)

because, A being Hermitian positive semi-definite, A = U Λ^{1/2} Λ^{1/2} U^H = U Λ^{1/2} U^H U Λ^{1/2} U^H = C C^H.
(ii) Cholesky factor: if C is lower triangular with nonnegative diagonal elements, then C is called the Cholesky factor of A. If A is positive definite, the Cholesky factor is unique. This factorisation is also called the Cholesky decomposition, a triangular (LU-type) decomposition.
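A minimal NumPy sketch of Definition 30 on an arbitrary positive definite matrix: the lower-triangular Cholesky factor and the Hermitian square root built from the EVD are both square roots of A.

```python
import numpy as np

rng = np.random.default_rng(8)
m = 4
M = rng.standard_normal((m, m))
A = M @ M.T + m * np.eye(m)              # symmetric positive definite test matrix

# Cholesky factor: lower triangular C with A = C C^H
C = np.linalg.cholesky(A)
assert np.allclose(C @ C.T, A)
assert np.allclose(C, np.tril(C))

# Hermitian square root: C_h = U diag(sqrt(lambda)) U^H, so that C_h @ C_h = A
lam, U = np.linalg.eigh(A)
C_h = U @ np.diag(np.sqrt(lam)) @ U.T
assert np.allclose(C_h @ C_h, A)
```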

2.6 Matrices with special structure

Definition 31. A matrix A ∈ C^{m×n} is called:
• Toeplitz when A_{i,j} = a_{i−j}, i.e. when its elements are constant along each diagonal;
• Hankel when A_{i,j} = a_{i+j}, i.e. when its elements are constant along each anti-diagonal.

Toeplitz matrices are also persymmetric (symmetric about their anti-diagonal). Examples:

A = \begin{bmatrix} a_0 & a_{-1} & a_{-2} & \cdots & a_{1-n} \\ a_1 & a_0 & a_{-1} & \ddots & \vdots \\ a_2 & a_1 & a_0 & \ddots & a_{-2} \\ \vdots & \ddots & \ddots & \ddots & a_{-1} \\ a_{m-1} & \cdots & a_2 & a_1 & a_0 \end{bmatrix}

is a Toeplitz matrix, and

A = \begin{bmatrix} a & b & c \\ b & c & d \\ c & d & e \\ d & e & f \end{bmatrix}

is a Hankel matrix.

The discrete convolution operation can be constructed as a matrix multiplication, where one of the inputs is converted into a Toeplitz matrix (a numerical sketch is given at the end of this section). More precisely, the convolution of a filter h of length m with a signal x of length n can be formulated as y(k) = Σ_j h(k − j) x(j), or in matrix form y = H x with

H = \begin{bmatrix} h_0 & 0 & \cdots & 0 \\ h_1 & h_0 & \ddots & \vdots \\ h_2 & h_1 & \ddots & 0 \\ \vdots & h_2 & \ddots & h_0 \\ h_{m-1} & \vdots & \ddots & h_1 \\ 0 & h_{m-1} & \ddots & h_2 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & h_{m-1} \end{bmatrix},
\qquad
x = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix}

Definition 32. A matrix A ∈ C^{m×n} is called Vandermonde if it has the following structure:

A = \begin{bmatrix} 1 & \cdots & 1 \\ z_1 & \cdots & z_n \\ \vdots & & \vdots \\ z_1^{m-1} & \cdots & z_n^{m-1} \end{bmatrix}   (2.6.1)

where the z_k ∈ C are usually assumed to be distinct.

Proposition 33. The eigenvectors of a symmetric Toeplitz matrix A ∈ R^{m×m} are either symmetric or skew-symmetric. More precisely, if J denotes the exchange (or reversal) matrix, with ones on the anti-diagonal and zeros elsewhere,

J = \begin{bmatrix} 0 & \cdots & 0 & 1 \\ \vdots & & 1 & 0 \\ 0 & \iddots & & \vdots \\ 1 & 0 & \cdots & 0 \end{bmatrix}

and if x is an eigenvector of A, then either x = Jx or x = −Jx. Symmetric Toeplitz matrices are both centrosymmetric and bisymmetric.
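The numerical sketch announced above: the full convolution matrix H is assembled column by column from shifted copies of h, and Hx reproduces np.convolve(h, x); the filter and signal values are arbitrary.

```python
import numpy as np

h = np.array([1.0, -2.0, 3.0])          # filter, length m
x = np.array([4.0, 0.5, -1.0, 2.0])     # signal, length n
m, n = len(h), len(x)

# Full convolution matrix: column j is h shifted down by j positions.
H = np.zeros((m + n - 1, n))
for j in range(n):
    H[j:j + m, j] = h

y = H @ x
assert np.allclose(y, np.convolve(h, x))
print(y)
```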

2.7 Matrix inversion Lemmas

Proposition 34. Let A ∈ C^{m×m}, B ∈ C^{n×n}, C ∈ C^{m×n}, D ∈ C^{n×m}. Provided that the matrix inverses appearing below exist,

\begin{bmatrix} A & C \\ D & B \end{bmatrix}^{-1}
= \begin{bmatrix} A^{-1} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} -A^{-1} C \\ I \end{bmatrix} (B − D A^{-1} C)^{-1} \begin{bmatrix} -D A^{-1} & I \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & B^{-1} \end{bmatrix} + \begin{bmatrix} I \\ -B^{-1} D \end{bmatrix} (A − C B^{-1} D)^{-1} \begin{bmatrix} I & -C B^{-1} \end{bmatrix}

Lemma 35 (Matrix Inversion Lemma). By identification of the terms involved in Proposition 34, we directly obtain: let A ∈ C^{m×m}, B ∈ C^{n×n}, C ∈ C^{m×n}, D ∈ C^{n×m}, and provided that the matrix inverses appearing below exist,

(A − C B^{-1} D)^{-1} = A^{-1} + A^{-1} C (B − D A^{-1} C)^{-1} D A^{-1}

2.8 Systems of linear equations

Let A ∈ C^{m×n}, B ∈ C^{m×p}, and X ∈ C^{n×p}. A general system of linear equations in X can be written as:

A X = B   (2.8.1)

where the matrices A and B are given and where X is the unknown matrix. We say that:
• (2.8.1) is exactly determined whenever m = n;
• (2.8.1) is overdetermined when m > n;
• (2.8.1) is underdetermined when m < n.

In the following, we first examine the case where (2.8.1) has an exact solution, and then the case where (2.8.1) cannot be exactly satisfied.

2.8.1 Consistent systems

Proposition 36. The linear system (2.8.1) is consistent, that is it admits an exact solution X, if and only if

rank([A B]) = rank(A)   (2.8.2)

or equivalently R(B) ⊂ R(A).

Proposition 37. The system of linear equations (2.8.1) has a unique solution if and only if (2.8.2) holds and A has full column rank:

rank(A) = n ≤ m   (2.8.3)

Proposition 38. Let X_0 be a particular solution to (2.8.1). Then the set of all solutions to (2.8.1) is given by:

X = X_0 + Δ   (2.8.4)

where Δ ∈ C^{n×p} is any matrix whose columns are in N(A).

Proposition 39. Consider a linear system that satisfies the consistency condition in (2.8.2). Let A have rank r ≤ min(m, n), and let

A = [U_1 U_2] \begin{bmatrix} Σ_1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1^H \\ V_2^H \end{bmatrix} = U_1 Σ_1 V_1^H   (2.8.5)

denote the SVD of A (U_1 has r columns and Σ_1 is nonsingular). Then

X_0 = V_1 Σ_1^{-1} U_1^H B   (2.8.6)

is the minimum Frobenius norm solution of (2.8.1), in the sense that

‖X_0‖^2 ≤ ‖X‖^2   (2.8.7)

for any solution X of (2.8.1).
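A NumPy illustration of Proposition 39 for an arbitrary underdetermined (consistent) system; np.linalg.pinv implements the pseudo-inverse V_1 Σ_1^{-1} U_1^H defined formally in the next definition.

```python
import numpy as np

rng = np.random.default_rng(9)
m, n = 3, 6                                  # fewer equations than unknowns
A = rng.standard_normal((m, n))              # full row rank (almost surely), so any b is consistent
b = rng.standard_normal(m)

x0 = np.linalg.pinv(A) @ b                   # minimum-norm solution (2.8.6)
assert np.allclose(A @ x0, b)

# Other solutions differ from x0 by a null-space component and have a larger norm.
_, _, Vh = np.linalg.svd(A)
N = Vh[m:, :].T                              # basis of N(A)
for _ in range(5):
    x = x0 + N @ rng.standard_normal(n - m)
    assert np.allclose(A @ x, b)
    assert np.linalg.norm(x) >= np.linalg.norm(x0) - 1e-12
```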

Definition 40. The matrix

A^† = V_1 Σ_1^{-1} U_1^H   (2.8.8)

is the so-called Moore–Penrose pseudo-inverse (or generalised inverse) of A. It can be shown that A^† is the unique solution to the following set of equations:

A A^† A = A,   A^† A A^† = A^†,   and A^† A and A A^† are Hermitian.

Evidently, when A is square and nonsingular we have A^† = A^{-1}, which motivates the name of pseudo-inverse given to A^†.

Definition 41. The condition number of A is

cond(A) = σ_1 / σ_n   (2.8.9)

where σ_1 and σ_n are respectively the largest and the smallest singular values of A.

Remark. The computation of a solution to (2.8.1) is an important problem. This solution must be numerically stable; that is to say, it must be as insensitive as possible to perturbations (for example, the limited resolution of the computer). The systems of linear equations that appear in applications are quite often perturbed, so in effect what we can hope to compute is an exact solution to a slightly perturbed system of linear equations:

(A + Δ_A)(X + Δ_X) = (B + Δ_B)   (2.8.10)

One can show that the perturbations Δ_A and Δ_B in (2.8.10) are retrieved in Δ_X multiplied by a proportionality factor given by the condition number cond(A). If this ratio is small (not much greater than one), the problem is said to be "well-conditioned"; if the ratio is large, the problem is said to be "ill-conditioned".

2.8.2 Inconsistent systems

The systems of linear equations introduced in (2.8.1) quite often cannot be exactly satisfied. Such systems are said to be inconsistent; frequently they are overdetermined and have a matrix A with full column rank:

rank(A) = n ≤ m   (2.8.11)

In what follows, we present two approaches to obtain an approximate solution to an inconsistent system of linear equations AX = B under the condition (2.8.11).

Definition 42. The least squares (LS) approximate solution to (2.8.1) is given by the minimiser X_LS of the following criterion:

‖A X − B‖^2   (2.8.12)

Proposition 43. The LS solution to (2.8.12) is given by:

X_LS = (A^H A)^{-1} A^H B   (2.8.13)

The inverse matrix in the above equation exists in view of (2.8.11).

Remark. Note that X_LS should not be computed by directly evaluating (2.8.13) as it stands. The reason is, again, the condition number and its effect on the perturbed system: forming A^H A squares the condition number of the problem. That is why the QR decomposition is preferred.

For any matrix A satisfying (2.8.11) there exist a unitary matrix Q ∈ C^{m×m} and a nonsingular upper-triangular matrix R ∈ C^{n×n} such that

A = Q \begin{bmatrix} R \\ 0 \end{bmatrix} = [Q_1 Q_2] \begin{bmatrix} R \\ 0 \end{bmatrix}   (2.8.14)

where Q_1 has n columns and Q_2 has m − n columns. This factorisation of A is called the QR decomposition. Inserting (2.8.14) into (2.8.13), we obtain X_LS = R^{-1} Q_1^H B. Hence, once the QR decomposition of A has been performed, X_LS can be conveniently obtained as the solution of the triangular system of linear equations:

R X_LS = Q_1^H B   (2.8.15)
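A NumPy sketch comparing the two least-squares computations on a random overdetermined system: the normal-equation formula (2.8.13) and the QR route (2.8.14)–(2.8.15) give the same X_LS, with the QR route preferred numerically.

```python
import numpy as np

rng = np.random.default_rng(10)
m, n = 20, 4
A = rng.standard_normal((m, n))            # full column rank (almost surely)
b = rng.standard_normal(m)

# Normal equations (2.8.13): fine here, but forming A^T A squares cond(A).
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# QR route (2.8.14)-(2.8.15): A = Q1 R, then solve the triangular system R x = Q1^H b.
Q1, R = np.linalg.qr(A)                    # "reduced" QR: Q1 is m x n, R is n x n
x_qr = np.linalg.solve(R, Q1.T @ b)        # R is triangular; a dedicated triangular solver could also be used

assert np.allclose(x_normal, x_qr)
assert np.allclose(x_qr, np.linalg.lstsq(A, b, rcond=None)[0])
print("cond(A) =", np.linalg.cond(A), " cond(A^T A) =", np.linalg.cond(A.T @ A))
```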

2.9 The multivariate normal distribution

The multivariate normal distribution is the most important distribution in science and engineering. Let X = (X_1 X_2 ⋯ X_n)^T denote an n × 1 random vector. The mean value of X is

m = E[X] = E[(X_1 X_2 ⋯ X_n)^T] = (m_1 m_2 ⋯ m_n)^T,  m_i = E[X_i]   (2.9.1)

so m is the vector of the means. The covariance matrix of X is

R = E[(X − m)(X − m)^T] = {r_ij},  r_ij = E[(X_i − m_i)(X_j − m_j)]   (2.9.2)

A random vector X is said to be multivariate normal if its density function is

p(x) = (2π)^{−n/2} (det R)^{−1/2} exp[ −(1/2)(x − m)^T R^{-1} (x − m) ]   (2.9.3)

Remark. X denotes a random vector and x its realisation.

The characteristic function of X is the multidimensional Fourier transform of the density function:

φ(ω) = E[e^{−jω^T X}] = ∫ (2π)^{−n/2} (det R)^{−1/2} exp[ −jω^T x − (1/2)(x − m)^T R^{-1}(x − m) ] dx   (2.9.4)

and, after some calculus:

φ(ω) = exp[ −jω^T m − (1/2) ω^T R ω ]
     = exp[ −(1/2) m^T R^{-1} m ] exp[ −(1/2) (ω + jR^{-1}m)^T R (ω + jR^{-1}m) ]   (2.9.5)

The characteristic function itself is a multivariate normal function of the frequency variable ω.

2.9.1 Linear transformations

Let Y be a linear transformation of a multivariate normal random variable:

Y = A^T X

where A ∈ R^{n×m} with m ≤ n, so that Y is an m × 1 random vector. Then Y is multivariate normal, distributed as:

Y ∼ N(A^T m, A^T R A)   (2.9.6)

The requirement that A^T R A be nonsingular is the requirement that R be nonsingular and A have rank m. If m > n, then A^T R A is singular, indicating that at least m − n components of Y are linearly dependent.

Remark. Linear transformations make it possible to synthesise normally distributed random vectors with a prescribed mean and covariance.
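A NumPy sketch of the synthesis remark: independent standard normal samples are mapped through a linear transformation to obtain samples with a prescribed mean and covariance (an arbitrary 3 × 3 example); the sample statistics approach m and R.

```python
import numpy as np

rng = np.random.default_rng(11)
m_vec = np.array([1.0, -2.0, 0.5])
L = np.array([[2.0, 0.0, 0.0],
              [0.6, 1.0, 0.0],
              [-0.3, 0.4, 0.8]])
R = L @ L.T                                  # target covariance (positive definite)

N = 200_000
Z = rng.standard_normal((N, 3))              # independent N(0, 1) samples
X = Z @ L.T + m_vec                          # each row ~ N(m, R) since cov(L z) = L L^T

print(np.round(X.mean(axis=0), 2))           # close to m_vec
print(np.round(np.cov(X, rowvar=False), 2))  # close to R
```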

2.9.2 Diagonalising transformations

The covariance matrix R is symmetric and nonnegative definite. Therefore there exists an orthogonal matrix U such that:

U^T R U = diag(λ_1^2, ..., λ_n^2)   (2.9.7)

It follows that the vector Y = U^T X is distributed as:

Y ∼ N(U^T m, diag(λ_1^2, ..., λ_n^2))   (2.9.8)

We say that the random variables Y_1, Y_2, ..., Y_n are uncorrelated because:

E[(Y − U^T m)(Y − U^T m)^T] = U^T R U = diag(λ_1^2, ..., λ_n^2)   (2.9.9)

Writing the density for Y, we see that the Y_k are independent normal random variables with means (U^T m)_k and variances λ_k^2:

p(y) = Π_{k=1}^n (2πλ_k^2)^{−1/2} exp[ −(1/(2λ_k^2)) ( y_k − (U^T m)_k )^2 ]   (2.9.10)

This approach provides a way to generate independent random variables. The transformation Y = U^T X is called a Karhunen–Loève transform.

2.9.3 Quadratic forms

Linear functions of multivariate normal vectors remain multivariate normal. But what about quadratic functions? In some very important cases the quadratic function has a χ^2 distribution (read "chi-squared"). Let X denote a N(m, R) random vector. The distribution of the quadratic form

Q = (X − m)^T R^{-1} (X − m)   (2.9.11)

is a chi-squared distribution with n degrees of freedom, denoted χ_n^2. The density function for Q is

p(q) = 1 / (Γ(n/2) 2^{n/2}) · q^{(n/2)−1} e^{−q/2},  q ≥ 0   (2.9.12)

2.9.4 Complex Normal Distribution

Let x and y be random vectors in R^n such that vec[x y] is a 2n-dimensional normal random vector. The complex random vector z = x + jy then has a complex normal distribution. This distribution can be described with 3 parameters:

µ = E[z],   R = E[(z − µ)(z − µ)^H],   C = E[(z − µ)(z − µ)^T]

where
• the location parameter (mean) vector µ can be an arbitrary n-dimensional complex vector;
• the covariance matrix R must be Hermitian;
• the relation matrix C should be symmetric.

Moreover, the matrix P = R^* − C^H R^{-1} C is also non-negative definite. The density function of a complex normal distribution can be computed as:

f(z) = \frac{1}{\pi^n \sqrt{\det(R)\det(P)}} \exp\left\{ -\frac{1}{2} \begin{bmatrix} (z − µ)^H & (z − µ)^T \end{bmatrix} \begin{bmatrix} R & C \\ C^H & R^* \end{bmatrix}^{-1} \begin{bmatrix} z − µ \\ (z − µ)^* \end{bmatrix} \right\}
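A small NumPy sketch of the three parameters µ, R and C: a complex vector is built from jointly Gaussian (and here deliberately correlated) real and imaginary parts, and the sample covariance and relation matrices are estimated; since the construction is not circular, C is non-zero.

```python
import numpy as np

rng = np.random.default_rng(12)
n, N = 2, 200_000

# Jointly Gaussian real and imaginary parts with some cross-correlation.
L = rng.standard_normal((2 * n, 2 * n)) / 2 + np.eye(2 * n)
xy = rng.standard_normal((N, 2 * n)) @ L.T
x, y = xy[:, :n], xy[:, n:]
z = x + 1j * y                               # samples of a (generally non-circular) complex normal vector

mu_hat = z.mean(axis=0)
zc = z - mu_hat
R_hat = zc.T @ zc.conj() / N                 # covariance  E[(z-mu)(z-mu)^H]
C_hat = zc.T @ zc / N                        # relation    E[(z-mu)(z-mu)^T]

print(np.round(R_hat, 2))    # Hermitian
print(np.round(C_hat, 2))    # symmetric, non-zero => z is not circular
```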

A circular symmetric complex normal distribution corresponds to the case of a null relation matrix, C = 0. If z = x + jy is circular complex normal, then the vector vec[x y] is multivariate normal with the following structure:

\begin{bmatrix} x \\ y \end{bmatrix} ∼ N\left( \begin{bmatrix} Re(µ) \\ Im(µ) \end{bmatrix}, \frac{1}{2}\begin{bmatrix} Re(R) & −Im(R) \\ Im(R) & Re(R) \end{bmatrix} \right)

Remark. z is said to be centered if it has zero mean, and circular if E[(z − µ)(z − µ)^T] = 0. In the scalar case, if z = x + jy ∈ C is centered circular complex normal with variance σ^2, then x and y are two independent random variables with the same normal distribution N(0, σ^2/2), and

f(z) = \frac{1}{\pi σ^2} \exp\left( −\frac{z^* z}{σ^2} \right) = \frac{1}{\pi σ^2} \exp\left( −\frac{|z|^2}{σ^2} \right)
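A minimal numerical check of the scalar circular case, with the arbitrary choice σ = 2: drawing x and y independently from N(0, σ²/2) gives E[|z|²] ≈ σ² and E[z²] ≈ 0.

```python
import numpy as np

rng = np.random.default_rng(13)
sigma = 2.0
N = 500_000

x = rng.normal(0.0, sigma / np.sqrt(2), N)   # standard deviation sqrt(sigma^2 / 2)
y = rng.normal(0.0, sigma / np.sqrt(2), N)
z = x + 1j * y                               # scalar circular complex normal samples

print(np.mean(np.abs(z) ** 2))               # ~ sigma^2 = 4
print(np.mean(z ** 2))                       # ~ 0: null relation parameter, i.e. circularity
```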

Bibliography

[1] P. Stoica and R. Moses, Introduction to Spectral Analysis, Prentice Hall, 1997.

[2] K. B. Petersen and M. S. Pedersen, The Matrix Cookbook, http://matrixcookbook.com, 2008.
