The QR Algorithm
and other methods to compute the eigenvalues of complex matrices
"[The QR algorithm is] one of the most remarkable algorithms in numerical mathematics."
(Strang)
"Indeed it is quite remarkable that an algorithm, which is both effective and easy to describe, has resisted, and stoutly continues to resist, a full mathematical analysis, to such an extent that no proof of convergence in the most general case (the matrix not Hermitian; the QR algorithm with shifts) exists at the present, at the same time that no counter-example to convergence exists."
(Philippe G. Ciarlet)
Abstract
This work deals with variants of the power method and the QR method for determining the eigenvalues of complex matrices. First, definitions and properties concerning matrices, the eigenvalue problem and matrix decompositions are presented. Then the methods are discussed thoroughly. All presented algorithms were implemented and tested in Java.
Introduction
At first glance the eigenvalue problem seems quite simple. The eigenvalues of a matrix A are exactly the roots of its characteristic polynomial pA, and in C one can find all the roots of pA. So why all the excitement?
The reason is the numerical instability of the determinant function. In practice one rather finds the zeroes of a polynomial by applying the QR algorithm to its companion matrix (whose eigenvalues are exactly the zeroes of the polynomial) than forms the characteristic polynomial of a matrix and computes its zeroes.
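As an illustrative sketch (restricted to real coefficients), the companion matrix of a monic polynomial p(x) = x^n + c[n-1]x^(n-1) + ... + c[0] can be built directly:

// Illustrative sketch: companion matrix of the monic polynomial
// p(x) = x^n + c[n-1] x^(n-1) + ... + c[1] x + c[0], real coefficients.
// Its eigenvalues are exactly the zeroes of p.
static double[][] companion(double[] c) {
    int n = c.length;
    double[][] A = new double[n][n];
    for (int i = 1; i < n; i++)
        A[i][i - 1] = 1.0;        // ones on the subdiagonal
    for (int i = 0; i < n; i++)
        A[i][n - 1] = -c[i];      // negated coefficients in the last column
    return A;
}

Applying the QR algorithm to this matrix then yields the roots of p without ever evaluating a determinant.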
The correspondence between (arbitrary) polynomials and the eigenvalue problem also shows that there can be no explicit formulas for the eigenvalues of matrices larger than 4×4. Hence, one has to use iterative methods. The QR method is a simple algorithm that does exactly this, and it can be improved to make it efficient. It does have some disadvantages, however: it has problems with multiple eigenvalues and with eigenvalues of equal modulus. Although one can try to work around these problems, there is no all-embracing algorithm that solves the eigenvalue problem.
Remarks
The reader needs basic knowledge in linear algebra and analysis (as gained in the first or second
year of studying mathematics).
A "•" indicates a definition; propositions and theorems are marked by a separate symbol. Throughout the paper K can be taken to be R or C.
The notation
( • • )
( · • )
stands for a 2×2 matrix A with arbitrary entries A11, A12, A22 and zero entry A21.
Table of Contents
1 Matrices
 1.1 Basic Definitions
 1.2 Matrix Norms
5 Algorithms
 5.1 Householder Transformations
 5.2 Givens Transformations
 5.3 QR Decomposition
 5.4 Hessenberg Matrices
  5.4.1 Definition and Properties
  5.4.2 Hessenberg Reduction
  5.4.3 QR Decomposition of Hessenberg Matrices
 5.5 The Power Methods
  5.5.1 The Simple Power Method
  5.5.2 The Inverse Power Method
 5.6 The QR Method
  5.6.1 Definition and Properties
  5.6.2 QR Method with Hessenberg Reduction
  5.6.3 QR Method with Shifts and Decoupling
A Proofs
B Bibliographical Reference
1 Matrices
1.1 Basic Definitions
Let us denote by:
• Mnl(K) the set of all n×l matrices over K
• Mn(K) the set of all n×n matrices over K (i.e. Mn(K) := Mnn(K))
• In the n×n identity matrix
• 0nl the n×l zero matrix.
For abbreviation let Mn denote Mn (C).
One defines:
• GLn (K) := {A ∈ Mn (K) : A regular } = general linear group
• SLn (K) := {A ∈ Mn (K) : det A = 1} = special linear group
• On (K) := {A ∈ Mn (K) : A orthogonal } = orthogonal group
• SOn (K) := {A ∈ Mn (K) : A orthogonal, det A = 1} = special orthogonal group
• Sym(K) := {A ∈ Mn (K) : A symmetric }
• Hn := {A ∈ Mn(C) : A Hermitian}
• Un := {A ∈ Mn(C) : A unitary}
• USn := {A ∈ Mn(C) : A unitary, det A = 1}
• HPDn := {A ∈ Mn(C) : A Hermitian, positive definite}
• SPDn := {A ∈ Mn(R) : A symmetric, positive definite}
A matrix A ∈ Mn (K) is called:
• lower triangular :⇔ Aij = 0 whenever i < j
• upper triangular :⇔ Aij = 0 whenever i > j
• diagonal :⇔ Aij = 0 whenever i ≠ j
• reducible :⇔ a nontrivial partition {1, ..., n} = I ∪ J exists s.t. Aij = 0 whenever (i, j) ∈ I × J
• irreducible :⇔ A is not reducible
• projection :⇔ A is idempotent (A2 = A)
• permutation matrix :⇔ Aij = δi,s(j) for some permutation s ∈ Sn
Let us denote by:
• diag(d1, ..., dn) the diagonal n×n matrix D s.t. Dii = di
• Ps the permutation matrix related to the permutation s ∈ Sn
Permutation matrices are real, orthogonal and have exactly one nonzero entry in every row and column. Further: P_s^{-1} = P_{s^{-1}} = P_s^T.
A ∈ Mn is reducible iff a permutation matrix P exists s.t.
P^{-1} A P = ( B         C )
             ( 0_{p,n−p} D ).
Every projection P is characterized by its range M := Im P and kernel W := Ker P. We say that P projects onto M along W. It holds that Cn = M ⊕ W. Conversely, if Cn = M ⊕ W, then there is a unique projection onto M along W.
Every matrix A ∈ Mnl(K) can be considered as a linear map from K^l to K^n (and vice versa) via x ↦ Ax.
One defines:
• null space (kernel) Ker A := {x ∈ Cn : Ax = 0}
• range space Im A := ACn = {Ax : x ∈ Cn}
• trace tr A := A11 + ... + Ann
One has the following characterization of the various norms:
(1) ‖A‖1 = max_j Σ_i |Aij|
(2) ‖A‖∞ = max_i Σ_j |Aij| = ‖A^H‖1
(3) ‖A‖_(2) = ‖A‖F = (tr A^H A)^{1/2} = (Σ_{i,j} |Aij|²)^{1/2}
• One defines for A ∈ GLn(K) the condition number relative to inversion as κ_p(A) := ‖A‖_p ‖A^{-1}‖_p.
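As an illustrative sketch, the formulas (1)-(3) translate directly into code (real case):

// Illustrative sketch of formulas (1)-(3) for a real matrix.
static double norm1(double[][] A) {          // max column sum
    double max = 0;
    for (int j = 0; j < A[0].length; j++) {
        double s = 0;
        for (double[] row : A) s += Math.abs(row[j]);
        max = Math.max(max, s);
    }
    return max;
}

static double normInf(double[][] A) {        // max row sum
    double max = 0;
    for (double[] row : A) {
        double s = 0;
        for (double x : row) s += Math.abs(x);
        max = Math.max(max, s);
    }
    return max;
}

static double normF(double[][] A) {          // Frobenius norm
    double s = 0;
    for (double[] row : A) for (double x : row) s += x * x;
    return Math.sqrt(s);
}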
An eigenvalue λ is called:
• defective :⇔ g_λ^A < m_λ^A
• nondefective :⇔ g_λ^A = m_λ^A
• simple :⇔ m_λ^A = 1
• geometrically simple :⇔ g_λ^A = 1
(where m_λ^A denotes the algebraic and g_λ^A the geometric multiplicity of λ)
3 Matrix Decompositions and Factorizations
This section presents the most important matrix decompositions.
3.1 LU Decomposition
• An LU decomposition of A ∈ Mn(K) is a pair (L, U) of matrices L, U ∈ Mn(K) s.t.
(i) L is unit lower triangular (i.e. Lii = 1)
(ii) U is upper triangular
(iii) A = LU
A ∈ GLn(K) admits a (unique) LU decomposition iff its leading principal minors are nonzero, i.e. iff det A[1 : p][1 : p] ≠ 0 for p = 1, ..., n.
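A minimal illustrative sketch of computing such a factorization (Doolittle's scheme, real case, no pivoting) is:

// Illustrative sketch: Doolittle LU without pivoting (real case).
// It succeeds exactly when all leading principal minors are nonzero.
static double[][][] lu(double[][] A) {
    int n = A.length;
    double[][] L = new double[n][n], U = new double[n][n];
    for (int i = 0; i < n; i++) L[i][i] = 1.0;   // L unit lower triangular
    for (int k = 0; k < n; k++) {
        for (int j = k; j < n; j++) {            // row k of U
            double s = A[k][j];
            for (int m = 0; m < k; m++) s -= L[k][m] * U[m][j];
            U[k][j] = s;
        }
        for (int i = k + 1; i < n; i++) {        // column k of L
            double s = A[i][k];
            for (int m = 0; m < k; m++) s -= L[i][m] * U[m][k];
            L[i][k] = s / U[k][k];               // fails iff a leading minor vanishes
        }
    }
    return new double[][][]{L, U};
}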
3.3 QR Decomposition
• A QR decomposition of A ∈ Mn (K) is a pair (Q, R) of matrices Q, R ∈ Mn (K) s.t.
(i) Q is unitary
(ii) R is upper triangular with positive diagonal entries (Rii > 0)
(iii) A = QR
Every A ∈ GLn (K) admits a (unique) QR decomposition.
3.5 SVD - Singular Value Decomposition
• An SV decomposition of A ∈ Mnl(K) is a triple (U, D, V) of matrices U ∈ Mn(K), D ∈ Mnl(K), V ∈ Ml(K) s.t.
(i) U, V are unitary
(ii) D = diag (σ1 , . . . , σp ) with σ1 ≥ ... ≥ σp ≥ 0 and p = min(n, l)
(iii) A = U DV H
Every A ∈ Mnl(K) admits a (nonunique) SV decomposition. The σ1, ..., σp are uniquely determined by A (and thus equal in all SVDs). In fact σ_i(A)² = σ_i(A^H)² = λ_i(AA^H) = λ_i(A^H A) for i = 1, ..., p.
The diagonal entries of the Jordan blocks are eigenvalues of A. The number and dimensions of the Jordan blocks corresponding to an eigenvalue are unique, although P can be chosen s.t. they appear in any order.
4 Sensitivity of the Eigenvalue Problem, Perturbation Theory
The family of all eigenvalues of an A ∈ Mn can be regarded as an unordered n-tuple of complex numbers. All these tuples form the space Cn_sym = Cn/∼ with (a1, ..., an) ∼ (b1, ..., bn) :⇔ ∃s ∈ Sn : bi = a_{s(i)}, i = 1, ..., n. It is a quotient space of Cn and therefore carries an induced metric:
(1) d(M1, M2) := min_{a∈M1, b∈M2} ‖a − b‖∞ where M1 = [(a1, ..., an)]_∼, M2 = [(b1, ..., bn)]_∼
One can easily verify that this is equal to the following
• spectral variation (= optimal matching distance) of A, B ∈ Mn:
(2) d(σ(A), σ(B)) := inf_{s∈Sn} max_j |λ_j(A) − λ_{s(j)}(B)| (Sn is the group of permutations on {1, ..., n})
Another distance function for the spectra of two matrices is now introduced:
• For closed subsets A, B ⊂ C one defines s(A, B) = supa∈A dist (a, B) = supa∈A inf b∈B |a − b|.
Then one defines the Hausdorff distance of A and B as h(A, B) := max(s(A, B), s(B, A)).
One can think of the spectrum of a matrix as a subset of C. This defines the Hausdorff distance
of the spectra of two matrices.
For arbitrary A, B ∈ Mn :
(3) h(σ(A), σ(B)) ≤ d(σ(A), σ(B))
Only for n = 2 are the two distances equal.
With the notation of spectral variation this can be formulated more elegantly:
Let A, B ∈ Mn. Then ∀α > 0 ∃ε > 0 s.t.
(5) ‖A − B‖ < ε ⇒ d(σ(A), σ(B)) < α
The following theorem answers the question how well the diagonal entries of a matrix approximate its eigenvalues:
Gershgorin Theorems:
1. If A ∈ Mn then σ(A) ⊆ S_R ∩ S_C where
(6) S_R = ⋃_{i=1}^n R_i, R_i = {z ∈ C : |z − Aii| ≤ Σ_{j=1, j≠i}^n |Aij|}
(7) S_C = ⋃_{j=1}^n C_j, C_j = {z ∈ C : |z − Ajj| ≤ Σ_{i=1, i≠j}^n |Aij|}
The R_i are called row Gershgorin circles, the C_j column Gershgorin circles.
2. If S1 = ⋃_{i=1}^m R_i, S2 = ⋃_{i=m+1}^n R_i and S1 ∩ S2 = ∅, then S1 contains exactly m eigenvalues of A (counted with algebraic multiplicity).
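An illustrative sketch for computing the row circles (6) in the real case, each disc returned as a (center, radius) pair:

// Illustrative sketch: row Gershgorin circles (6) for a real matrix.
// Every eigenvalue of A lies in the union of these discs.
static double[][] rowGershgorin(double[][] A) {
    int n = A.length;
    double[][] discs = new double[n][2];
    for (int i = 0; i < n; i++) {
        double r = 0;
        for (int j = 0; j < n; j++)
            if (j != i) r += Math.abs(A[i][j]);
        discs[i][0] = A[i][i];   // center A_ii
        discs[i][1] = r;         // radius: off-diagonal row sum
    }
    return discs;
}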
The next theorems establish bounds for the variation of the spectra of two matrices:
If A, B ∈ Mn then:
(8) h(σ(A), σ(B)) ≤ (‖A‖2 + ‖B‖2)^{1−1/n} ‖A − B‖2^{1/n}
(9) d(σ(A), σ(B)) ≤ 4 (‖A‖2 + ‖B‖2)^{1−1/n} ‖A − B‖2^{1/n}
This means roughly that perturbations of order ε in A lead to perturbations of order κ(λ)ε in λ. Thus, if κ(λ) is small then λ is regarded as well-conditioned.
eigenvectors x_i, y_i (i.e. Ax_i = λ_i x_i, y_i^H A = λ_i y_i^H, ‖x_i‖2 = ‖y_i‖2 = 1, i = 1, ..., n). Further, let A(ε) = A + εE be a perturbation of A with ‖E‖2 = 1. Then there exist, in a neighborhood of zero, differentiable functions x_i(ε), y_i(ε) and λ_i(ε) with:
(18) A(ε) x_i(ε) = λ_i(ε) x_i(ε)
(19) y_i(ε)^H A(ε) = λ_i(ε) y_i(ε)^H
(20) ‖x_i(ε)‖2 = ‖y_i(ε)‖2 = 1, λ_i(0) = λ_i, x_i(0) = x_i, y_i(0) = y_i
(21) ‖x_k(ε) − x_k‖2 ≤ ε/(min_{j≠k} |λ_k − λ_j|) ‖E‖2 + O(ε²) = κ(x_k) ε ‖E‖2 + O(ε²)
This means that the sensitivity of xk depends upon the separation of λk from the other eigen-
values.
• Under the assumptions above one defines the separation of T11 and T22 as:
(23) sep(T11, T22) = min_{X≠0} ‖T11 X − X T22‖F / ‖X‖F
From [1]: Suppose that (22) holds and that for any matrix E ∈ Mn we partition U H EU as
follows:
(24) U^H E U = ( E11 E12 )
               ( E21 E22 ),  E11 ∈ Mp, E22 ∈ Mn−p
If δ = sep(T11, T22) − ‖E11‖2 − ‖E22‖2 > 0 and
(25) ‖E21‖2 (‖T12‖2 + ‖E21‖2) ≤ δ²/4
then there exists a P ∈ Mn−p,p s.t.
(26) ‖P‖2 ≤ 2 ‖E21‖2 / δ
and the columns of Û = (U1 + U2 P )(Ip + P H P )−1/2 form an orthonormal basis for a subspace
that is invariant for A + E.
Let A ∈ Mn be normal and B ∈ Mn. If ‖A − B‖2 ≤ (1/2) min_{λ,µ∈σ(A), λ≠µ} |λ − µ|, then:
(31) d(σ(A), σ(B)) ≤ ‖A − B‖2
For Hermitian matrices one can formulate some strong perturbation theorems. But let us
first consider a characterization of the eigenvalues of a Hermitian matrix. One therefore introduces Rayleigh quotients:
• For A ∈ Mn one defines the Rayleigh quotient (function) as R_A : Cn \ {0} → C, v ↦ R_A(v) = (v^H A v)/(v^H v).
If A ∈ Mn is Hermitian then RA (v) is real and RA (αv) = RA (v) for α ∈ C \ {0}.
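An illustrative sketch of evaluating R_A(v) for a real symmetric A:

// Illustrative sketch: Rayleigh quotient R_A(v) = (v^T A v)/(v^T v)
// for a real symmetric A; the value is then real, as noted above.
static double rayleigh(double[][] A, double[] v) {
    double num = 0, den = 0;
    for (int i = 0; i < v.length; i++) {
        double Av = 0;
        for (int j = 0; j < v.length; j++) Av += A[i][j] * v[j];
        num += v[i] * Av;   // accumulates v^T A v
        den += v[i] * v[i]; // accumulates v^T v
    }
    return num / den;
}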
Now one can give the characterizations of the eigenvalues of a Hermitian matrix:
Let A ∈ Mn be Hermitian with eigenvalues λ1 ≥ ... ≥ λn and associated eigenvectors x_i that form an orthonormal basis of Cn. With the notation V_k := span{x1, ..., xk}, V_0 = {0} and 𝒱_k := {V subspace of Cn : dim V = k}, 𝒱_0 = {V_0} one has:
(32) λk = R_A(xk)
(33) λk = min_{v∈V_k, v≠0} R_A(v) = min_{v∈V_k, ‖v‖2=1} v^H A v
(34) λk = max_{v⊥V_{k−1}, v≠0} R_A(v)
(35) λk = max_{W∈𝒱_k} min_{v∈W, v≠0} R_A(v) = max_{W∈𝒱_k} min_{v∈W, ‖v‖2=1} v^H A v
(36) λk = min_{W∈𝒱_{k−1}} max_{v⊥W, v≠0} R_A(v) = min_{W∈𝒱_{n−k+1}} max_{v∈W, ‖v‖2=1} v^H A v
Equations (35) and (36) are also called the Minimax Principle. Furthermore:
(37) R_A(Cn \ {0}) = {R_A(v) : v ∈ Cn, v ≠ 0} = [λn, λ1] ⊂ R
Lidskii's Theorem: Let A, B ∈ Mn be Hermitian. Then for all 1 ≤ i1 < ... < ik ≤ n:
(44) Σ_{j=1}^k λ^↓_{i_j}(A + B) ≤ Σ_{j=1}^k λ^↓_{i_j}(A) + Σ_{j=1}^k λ^↓_j(B)
Let A ∈ Mn be Hermitian and B ∈ Mn skew-Hermitian. Then for 2 ≤ p ≤ ∞ inequalities (46) and (47) hold, while for 1 ≤ p ≤ 2 inequalities (48) and (49) hold:
(46) ‖Eig^|↓|(A) − Eig^|↑|(B)‖_(p) ≤ ‖A − B‖_(p) ≤ 2^{1/2 − 1/p} ‖Eig^|↓|(A) − Eig^|↓|(B)‖_(p)
(47) ‖Eig^|↓|(A) − Eig^|↑|(B)‖_(p) ≤ ‖Eig(A) − Eig_s(B)‖_(p) ≤ ‖Eig^|↓|(A) − Eig^|↓|(B)‖_(p) for all s ∈ Sn
(48) 2^{1/2 − 1/p} ‖Eig^|↓|(A) − Eig^|↓|(B)‖_(p) ≤ ‖A − B‖_(p) ≤ ‖Eig^|↓|(A) − Eig^|↑|(B)‖_(p)
(49) ‖Eig^|↓|(A) − Eig^|↓|(B)‖_(p) ≤ ‖Eig(A) − Eig_s(B)‖_(p) ≤ ‖Eig^|↓|(A) − Eig^|↑|(B)‖_(p) for all s ∈ Sn
Further:
(50) (1/√2) |||Eig^|↓|(A) − Eig^|↓|(B)||| ≤ |||A − B||| ≤ √2 |||Eig^|↓|(A) − Eig^|↓|(B)|||
4.5 Residuals
• Let (λ̂, x̂) be an estimate of an eigenvalue/eigenvector pair of A ∈ Mn. Then one defines its residual as:
(52) r̂ = Ax̂ − λ̂x̂
The next two propositions give a posteriori estimates for eigenvalues and eigenvectors of a
Hermitian matrix:
Let A ∈ Mn be Hermitian and let r̂ be the residual of the estimated eigenvector/eigenvalue
pair (λ̂, x̂). Then:
(53) min_{µ∈σ(A)} |λ̂ − µ| ≤ ‖r̂‖2 / ‖x̂‖2
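An illustrative sketch evaluating the bound (53) in the real symmetric case:

// Illustrative sketch of the bound (53), real symmetric case: some
// eigenvalue of A lies within ||r||_2 / ||x||_2 of the estimate.
static double residualBound(double[][] A, double[] xHat, double lambdaHat) {
    double rr = 0, xx = 0;
    for (int i = 0; i < xHat.length; i++) {
        double Ax = 0;
        for (int j = 0; j < xHat.length; j++) Ax += A[i][j] * xHat[j];
        double r = Ax - lambdaHat * xHat[i];  // residual (52)
        rr += r * r;
        xx += xHat[i] * xHat[i];
    }
    return Math.sqrt(rr / xx);
}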
5 Algorithms
Algorithms are presented in pseudo-Java code. The assignment of objects (like vectors) should not be understood as the assignment of references, but as copying the object's contents.
The Householder transformation can be used to zero all but one entry of a vector:
Assume x = (x1, ..., xn)^T ∈ Cn and 1 ≤ i ≤ n are fixed. Set:
(2) v = (x1, ..., x_{i−1}, (1 ± ‖x‖2/|x_i|) x_i, x_{i+1}, ..., xn)^T = x ± ‖x‖2 (x_i/|x_i|) e_i if x_i ≠ 0
(3) v = (x1, ..., x_{i−1}, ±‖x‖2 c, x_{i+1}, ..., xn)^T = x ± ‖x‖2 c e_i for some c ∈ T if x_i = 0
Then Hn(v) x = ∓‖x‖2 (x_i/|x_i|) e_i and Hn(v) x = ∓‖x‖2 c e_i, respectively.
Proof: see appendix
• Define h : Cn → Cn as:
(4) h(x) := 0 if x2 = ... = xn = 0;
    h(x) := x + ‖x‖2 e1 if x1 = 0, (x2, ..., xn) ≠ 0;
    h(x) := x + ‖x‖2 (x1/|x1|) e1 if x1 ≠ 0, (x2, ..., xn) ≠ 0.
If x is the i-th column vector of X then B = Hn(h(x)) X will have zero entries B_{2,i} = ... = B_{n,i} = 0. One uses this to zero some entries of a matrix. First some important algorithms are presented.
One uses the special structure of Hn(v) to compute the Householder pre- and post-multiplications Hn(v)A and AHn(v) (for A ∈ Mnl and A ∈ Mln, respectively) efficiently. The complexity is then ≤ [8nl + 2n ; 8nl + 2n + 2l ; 1]_R. An ordinary matrix multiplication would have complexity [4n²l ; 4n²l]_R.
requires: x ∈ Cn
returns: h(x)
complexity ≤ [2n + 1 ; 2n + 4 ; 1 ; 1]_R
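A minimal sketch of this routine, restricted to the real case (so the factor x1/|x1| reduces to the sign of x1):

// Minimal sketch of h(x) from definition (4), real case only;
// sign(x1) is taken as +1 when x1 = 0.
static double[] householderVector(double[] x) {
    int n = x.length;
    double[] h = new double[n];
    double tail = 0;
    for (int i = 1; i < n; i++) tail += x[i] * x[i];
    if (tail == 0) return h;                  // x2 = ... = xn = 0: h = 0
    double norm = Math.sqrt(x[0] * x[0] + tail);
    System.arraycopy(x, 0, h, 0, n);
    h[0] += (x[0] >= 0 ? 1.0 : -1.0) * norm;  // h = x + sign(x1) ||x||_2 e1
    return h;
}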
requires: A ∈ Mnl, v ∈ Cn
returns: Hn(v)^H A (= Hn(v)A)
complexity ≤ [8nl + 2n ; 8nl + 2n + 2l ; 1]_R

requires: A ∈ Mln, v ∈ Cn
returns: A Hn(v)
complexity ≤ [8nl + 2n ; 8nl + 2n + 2l ; 1]_R
Consider a matrix A ∈ Mnl(C). One can zero the elements A_{i1+1,j}, ..., A_{i2,j} for 1 ≤ i1 < i2 ≤ n. To do so, compute the Householder vector v for A[i1 : i2][j] and pad it with zeros to obtain a fitting vector h of length n: h = (0, ..., 0, v^T, 0, ..., 0)^T with i1 − 1 leading and n − i2 trailing zeros. One can easily verify that:
(6) Hn(h) = ( I_{i1−1}                      )
            (          H_{i2−i1+1}(v)       )
            (                      I_{n−i2} )
Therefore the matrix B = Hn(h) A has zero entries B_{i1+1,j} = ... = B_{i2,j} = 0.
Due to roundoff errors it can happen that the entries B_{i1+1,j}, ..., B_{i2,j} come out only very small instead of exactly zero. The next algorithm zeros the entries (i1 + 1, j), ..., (i2, j) of a given matrix by pre-multiplying it with the Householder matrix; the entries are then explicitly set to zero. The vector h returns the Householder vector of the performed transformation for later use.
requires: A ∈ Mnl, 1 ≤ j ≤ l, 1 ≤ i1 < i2 ≤ n
returns: [Hn(h)A, h] where h is chosen s.t. (Hn(h)A)[i1 + 1 : i2][j] = 0
complexity ≤ [8st + 4s + 1 ; 8st + 4s + 2t + 4 ; 1 ; 1]_R where s = i2 − i1 + 1, t = n − j + 1
    return [A, h];
}
5.2 Givens Transformations
• For c, s ∈ C with |c|² + |s|² = 1 the n×n matrix Gn(p, q, c, s) (1 ≤ q < p ≤ n) is called a Givens matrix:
(7) Gn(p, q, c, s) is the identity matrix In with the entries (q, q), (q, p), (p, q), (p, p) replaced by c, s, −s, c, respectively.
The Givens transformation can be used to zero one specific entry of a vector:
Assume (a, b)^T ∈ C² and set c = a/√(|a|² + |b|²), s = −b/√(|a|² + |b|²). Then:
(8) ( c s ; −s c )^H (a, b)^T = (•, ·)^T
requires: a, b ∈ C
returns: [c, s] s.t. ( c s ; −s c )^H (a, b)^T = (•, ·)^T
complexity ≤ [3 ; 8 ; 1 ; 1]_R
[C, C] getGivensNumbers(C a, C b) {
    if (b == 0)
        return [1; 0];               // nothing to zero: use the identity rotation
    double f = 1. / √(|a|² + |b|²);
    return [f·a; −f·b];              // c = a/r, s = −b/r with r = √(|a|² + |b|²)
}
Consider the product B = Gn(p, q, c, s)^H A where A ∈ Mnl. Then only the p-th and q-th rows of A change, i.e.
(9) B_ij = s A_qj + c A_pj if i = p,
    B_ij = c A_qj − s A_pj if i = q,
    B_ij = A_ij otherwise.
Similarly, in the product A Gn(p, q, c, s) only the p-th and q-th columns change. One uses this to compute the Givens pre- and post-multiplication:
Listing 6: Givens Pre-Multiplication
requires: A ∈ Mnl; c, s ∈ C; 1 ≤ q < p ≤ n
returns: Gn(p, q, c, s)^H A
complexity ≤ [12l ; 16l]_R

requires: A ∈ Mnl; c, s ∈ C; 1 ≤ q < p ≤ n
returns: A Gn(p, q, c, s)
complexity ≤ [12l ; 16l]_R
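A minimal sketch of the pre-multiplication update (9), real case and 0-based indices:

// Illustrative sketch of the row update (9) for B = G_n(p,q,c,s)^H A,
// real case; only rows p and q change (updated in place).
static void applyGivensRows(double[][] A, int p, int q, double c, double s) {
    for (int j = 0; j < A[0].length; j++) {
        double aq = A[q][j], ap = A[p][j];
        A[p][j] = s * aq + c * ap;   // row p, cf. (9)
        A[q][j] = c * aq - s * ap;   // row q, cf. (9)
    }
}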
Because of roundoff errors we need another method to zero a specific entry of a matrix:
requires: A ∈ Mnl; 1 ≤ q < p ≤ n
returns: [Gn(p, q, c, s)^H A, c, s] where [c, s] = getGivensNumbers(A_qq, A_pq)
complexity ≤ [12l + 3 ; 16l + 8 ; 1 ; 1]_R
5.3 QR Decomposition
One uses the Householder transformation to compute the (generalized) QR decomposition of
a matrix:
Listing 9: QR Decomposition
requires: A ∈ Mnl
returns: [U, R] s.t. (U, R) is the (generalized) QR decomposition of A
complexity ≤ [11n³ + 3n ; 11n³ + 3n² + 5n ; 2n − 2 ; n − 1]_R for n = l
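A compact sketch of a Householder QR in the real square case (the signs of the Rii are not normalized, hence "generalized"):

// Sketch: Householder QR for a real square matrix. Returns {Q, R} with
// Q orthogonal, R upper triangular and A = Q R.
static double[][][] qrDecompose(double[][] A) {
    int n = A.length;
    double[][] R = new double[n][], Q = new double[n][n];
    for (int i = 0; i < n; i++) { R[i] = A[i].clone(); Q[i][i] = 1.0; }
    for (int k = 0; k < n - 1; k++) {
        // Householder vector v zeroing R[k+1..n-1][k]
        double norm = 0;
        for (int i = k; i < n; i++) norm += R[i][k] * R[i][k];
        norm = Math.sqrt(norm);
        if (norm == 0) continue;               // column already zero
        double[] v = new double[n];
        for (int i = k; i < n; i++) v[i] = R[i][k];
        v[k] += (R[k][k] >= 0 ? 1.0 : -1.0) * norm;
        double vv = 0;
        for (int i = k; i < n; i++) vv += v[i] * v[i];
        // R = H R with H = I - 2 v v^T / (v^T v); only rows k..n-1 change
        for (int j = 0; j < n; j++) {
            double d = 0;
            for (int i = k; i < n; i++) d += v[i] * R[i][j];
            for (int i = k; i < n; i++) R[i][j] -= 2 * d / vv * v[i];
        }
        // Q = Q H; only columns k..n-1 change
        for (int i = 0; i < n; i++) {
            double d = 0;
            for (int j = k; j < n; j++) d += Q[i][j] * v[j];
            for (int j = k; j < n; j++) Q[i][j] -= 2 * d / vv * v[j];
        }
    }
    return new double[][][]{Q, R};
}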
        H = getTimesHouseMatPost(H, h);
        U = getTimesHouseMatPost(U, h);
    }
    return [H, U];
}
Requires: H ∈ Mnl in Hessenberg form
Returns: [U, R] s.t. (U, R) is the (generalized) QR decomposition of H
complexity ≤ [24n² − 21n ; 32n² − 24n ; n − 1 ; n − 1]_R for n = l
(11) q^(k) = A^k q^(0) / ‖A^k q^(0)‖2 = a1 λ1^k (x1 + s^(k)) / ‖a1 λ1^k (x1 + s^(k))‖2 = (a1/|a1|) (λ1/|λ1|)^k (x1 + s^(k)) / ‖x1 + s^(k)‖2
with s^(k) → 0 for k → ∞. This means that q^(k) tends to lie in span{x1}. Further:
(12) ‖A q^(k)‖2 → r_σ(A) (k → ∞)
(13) q^(k)H A q^(k) → λ1 (k → ∞)
(14) q^(k) = A q^(k−1) / ‖A q^(k−1)‖2
(15) λ^(k) = (q^(k))^H A q^(k)
Then:
(16) dist(span{q^(k)}, span{x1}) = O(|λ2/λ1|^k)
(17) |λ1 − λ^(k)| = O(|λ2/λ1|^k)
It follows that this method is well suited when |λ2|/|λ1| is small. A possible stopping criterion is to monitor the differences between λ^(k) and λ^(k−1) and to stop when |λ^(k) − λ^(k−1)| is small. Another stopping criterion uses the residual r^(k) = A q^(k) − λ^(k) q^(k):
(18) |λ1 − λ^(k)| ≈ ‖r^(k)‖2 / |w^(k)H q^(k)|
where the w^(k) are approximate left eigenvectors, w^(k) = (A^H)^k w^(0) / ‖(A^H)^k w^(0)‖2. The necessity to compute w^(k) nearly doubles the cost of the power method.
An algorithm that uses the first stopping criterion is given in the following listing:
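A minimal sketch of such a routine in the real case, with the needed helper functions included:

// Illustrative sketch: simple power method, iteration (14)-(15), with
// the stopping criterion |lambda^(k) - lambda^(k-1)| < tol (real case).
static double[] multiply(double[][] A, double[] x) {
    double[] y = new double[A.length];
    for (int i = 0; i < A.length; i++)
        for (int j = 0; j < x.length; j++)
            y[i] += A[i][j] * x[j];
    return y;
}

static double[] normalize(double[] x) {
    double s = 0;
    for (double xi : x) s += xi * xi;
    double norm = Math.sqrt(s);
    double[] y = new double[x.length];
    for (int i = 0; i < x.length; i++) y[i] = x[i] / norm;
    return y;
}

static double dot(double[] a, double[] b) {
    double s = 0;
    for (int i = 0; i < a.length; i++) s += a[i] * b[i];
    return s;
}

static double powerMethod(double[][] A, double[] q0, double tol, int maxIter) {
    double[] q = normalize(q0);               // q^(0)
    double lambda = dot(q, multiply(A, q));   // lambda^(0), cf. (15)
    for (int k = 1; k < maxIter; k++) {
        q = normalize(multiply(A, q));        // q^(k) = A q^(k-1) / ||A q^(k-1)||_2, cf. (14)
        double lambdaNew = dot(q, multiply(A, q));
        if (Math.abs(lambdaNew - lambda) < tol) return lambdaNew;
        lambda = lambdaNew;
    }
    return lambda;
}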
If the simple power method returns the approximate eigenvalue/eigenvector pair (λ, q) for (A − µIn)^{−1}, then (1/λ + µ, q) is an approximate eigenvalue/eigenvector pair for A.
Listing 13: Inverse Power Method
Requires: A ∈ Mn
Returns: A^(k) that was constructed as indicated above
complexity ≤ [15n³ + 3n ; 15n³ + 3n² + 5n ; 2n − 2 ; n − 1]_R per iteration
Matrix qr(Matrix A) {
    Matrix B = A;
    Matrix U, R = 0;
    for (int k = 0; k < 1000; k++) {
        [U, R] = getQR(B);   // factor B = U R
        B = R U;             // multiply in reverse order: B ← R U
    }
    return B;
}
5.6.2 QR Method with Hessenberg Reduction
The QR method can be improved by first reducing A to Hessenberg form; the QR decomposition can then be computed much faster. An algorithm is given in the following listing:
Requires: A ∈ Mn
Returns: A^(k) that was constructed as indicated above
complexity ≤ [24n² ; 32n² ; n − 1 ; n − 1]_R per iteration
Matrix qrHess(Matrix A) {
    Matrix U0, H = 0;
    C c, s;
    [H, U0] = getHessenberg(A);       // reduce A to Hessenberg form once
    for (int k = 0; k < 100; k++) {
        // fast QR step: a Hessenberg matrix needs only n-1 Givens rotations
        for (int j = 1; j < n; j++) {
            [H, c, s] = applyGivens(H, j + 1, j);        // zero H[j+1][j]; returns c, s
            H = timesGivensMatPost(H, j + 1, j, c, s);   // post-multiply by the same rotation
        }
    }
    return H;
}
H^(k) − ρI = U R      // QR factorization
H^(k+1) = R U + ρI
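An illustrative sketch of one such shifted step in the real case, assuming a QR routine like the qrDecompose sketch above and the Rayleigh quotient shift ρ = H[n−1][n−1]:

// Illustrative sketch of one shifted QR step (real case):
//   H - rho*I = U R,   H_next = R U + rho*I
static double[][] shiftedQrStep(double[][] H, double rho) {
    int n = H.length;
    double[][] B = new double[n][];
    for (int i = 0; i < n; i++) {
        B[i] = H[i].clone();
        B[i][i] -= rho;                        // H - rho*I
    }
    double[][][] qr = qrDecompose(B);          // H - rho*I = U R
    double[][] U = qr[0], R = qr[1];
    double[][] Hnext = new double[n][n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double s = 0;
            for (int m = 0; m < n; m++) s += R[i][m] * U[m][j];
            Hnext[i][j] = s + (i == j ? rho : 0.0);   // R U + rho*I
        }
    return Hnext;
}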
A Proofs
Proof of Theorem 3 in Section 5.1 (for convenience: ‖·‖ = ‖·‖2):
Case x_i ≠ 0: v^H v = ‖v‖² = |x1|² + ... + |(1 ± ‖x‖2/|x_i|) x_i|² + ... + |x_n|²
B Bibliographical Reference
[1] Gene H. Golub, Charles F. Van Loan: "Matrix Computations", 2nd Edition, The Johns Hopkins University Press, 1989.
[2] Denis Serre: "Matrices: Theory and Applications", Graduate Texts in Mathematics, Springer-Verlag New York, 2002.
[3] Philippe G. Ciarlet: "Introduction to Numerical Linear Algebra and Optimisation", Cambridge University Press, 1989.
[4] Alfio Quarteroni, Riccardo Sacco, Fausto Saleri: "Numerical Mathematics", Texts in Applied Mathematics, Springer-Verlag New York, 2000.
[5] Rajendra Bhatia: "Matrix Analysis", Graduate Texts in Mathematics, Springer-Verlag New York, 1997.