
Krylov space methods

Name/ Surname: Dionysios Zelios

Email: dionisis.zel@gmail.com

Course: Computational Physics (FK8002)

CONTENTS

Description of the problem

Introduction
  i. Arnoldi algorithm
  ii. Lanczos algorithm
  iii. Time evolution of our system

Results
  i. Arnoldi
  ii. Lanczos
  iii. Time comparison between the algorithms
  iv. Time evolution of our system

References

Description of the problem


To begin with, we will investigate Krylov space methods for the diagonalization of a matrix.
Krylov space methods transform the original matrix into one of much lower order, which
can then easily be diagonalized. We will implement the Arnoldi and Lanczos algorithms in
Matlab and then apply them to a harmonic-oscillator Hamiltonian with an extra potential
term added (this problem has already been solved in assignment 4 with the shifted inverse
power method).
Moreover, we will use Krylov space methods to calculate the exponential of the
Hamiltonian matrix and thereby construct the time evolution of the aforementioned
quantum mechanical system.

Introduction
An intuitive method for finding an eigenvalue (specifically the largest eigenvalue) of a
given m x m matrix A is power iteration. Starting with an initial random vector x, this
method calculates Ax, A^2 x, A^3 x, ... iteratively, storing and normalizing the result into x on
every turn. This sequence converges to the eigenvector corresponding to the largest
eigenvalue, lambda_1.
However, much potentially useful computation is wasted by using only the final
result, A^{n-1} x. This suggests that instead we form the so-called Krylov matrix:

K_n = (x, Ax, A^2 x, ..., A^{n-1} x)

The columns of this matrix are not orthogonal, but in principle we can extract an
orthogonal basis via a method such as Gram-Schmidt orthogonalization. The resulting
vectors are a basis of the Krylov subspace, K_n.
We may expect the vectors of this basis to give good approximations of the eigenvectors
corresponding to the largest eigenvalues, for the same reason that A^{n-1} x approximates
the dominant eigenvector.
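The power iteration just described can be sketched in a few lines. The report's own code is in Matlab; the following NumPy version, with a small illustrative 2x2 matrix (not taken from the report), is only a sketch of the idea:

```python
import numpy as np

def power_iteration(A, num_iters=200, seed=0):
    """Approximate the dominant eigenpair of A by repeated multiplication."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    for _ in range(num_iters):
        x = A @ x                      # one step of the sequence Ax, A^2 x, A^3 x, ...
        x /= np.linalg.norm(x)         # normalize the result on every turn
    lam = x @ (A @ x)                  # Rayleigh-quotient estimate of lambda_1
    return lam, x

# illustrative example: dominant eigenvalue of a small symmetric matrix
A = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, x = power_iteration(A)            # lam approaches (5 + sqrt(5))/2
```

Only the final normalized vector x is kept here; the Krylov methods below instead retain the whole sequence of iterates.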
The process described above is intuitive. Unfortunately, it is also unstable. This is where
the Arnoldi iteration enters.

Arnoldi algorithm
Iterative algorithms compute a sequence of vectors that hopefully converges to an
eigenvector. The most basic iteration is the power method, where x_0 is a starting
guess and a sequence x_k is computed by:

x_k = A x_{k-1}    (1)

After many iterations x_k will tend to the eigenvector corresponding to the eigenvalue
lambda_1 that is largest in absolute value, provided there is only one such eigenvalue.
We get interesting algorithms if we save all vectors in the sequence (1)
and form the Krylov subspace:

K(A, x_0) = span{x_0, A x_0, A^2 x_0, ..., A^{k-1} x_0}    (2)

(Here the brackets mean the linear space spanned by the columns given.)

Then we can write A K_n = K_n C_n, where C_n is an upper Hessenberg matrix. (An upper
Hessenberg matrix is a matrix whose elements obey the rule a_ij = 0 for i > j+1.)
In order to obtain a better conditioned basis for span(K_n), we compute the QR
factorization:
Q_n R_n = K_n, so that Q_n^H A Q_n = R_n C_n R_n^{-1} = H, where H is again an upper Hessenberg matrix.
Equating the k-th columns on each side of the equation A Q_n = Q_n H, we have the recurrence
relation:

A q_k = h_{1k} q_1 + ... + h_{kk} q_k + h_{k+1,k} q_{k+1}

relating q_{k+1} to the preceding vectors q_1, ..., q_k.

Continuing, we premultiply by q_j^H and use orthonormality to obtain: h_jk = q_j^H A q_k,
j = 1, ..., k.
These relationships yield the Arnoldi iteration, which produces the unitary matrix Q_n and the
upper Hessenberg matrix H_n using only matrix-vector multiplications by A and inner products of
vectors. Below we present a flow chart of the steps that we have followed in order to
create the Arnoldi algorithm.

1. Start with q_1 = x/||x||_2,
   where x is an arbitrary
   non-zero starting vector.

For k = 1, 2, ...
2. u_k = A q_k

3. For j = 1, ..., k
   (1) h_{j,k} = q_j^H u_k
   (2) u_k = u_k - q_j h_{j,k}

4. h_{k+1,k} = ||u_k||_2.
   If h_{k+1,k} = 0, then stop!

5. q_{k+1} = u_k / h_{k+1,k}
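The flow chart translates directly into code. The report implements it in Matlab; the following NumPy sketch follows the same steps, with a hypothetical random symmetric test matrix:

```python
import numpy as np

def arnoldi(A, x, k):
    """k Arnoldi steps starting from vector x. Returns Q (m x (k+1), orthonormal
    columns) and H ((k+1) x k, upper Hessenberg) with A @ Q[:, :k] = Q @ H."""
    m = A.shape[0]
    Q = np.zeros((m, k + 1), dtype=complex)
    H = np.zeros((k + 1, k), dtype=complex)
    Q[:, 0] = x / np.linalg.norm(x)                 # step 1
    for j in range(k):
        u = A @ Q[:, j]                             # step 2
        for i in range(j + 1):                      # step 3: orthogonalize
            H[i, j] = np.vdot(Q[:, i], u)
            u = u - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(u)
        if H[j + 1, j] == 0:                        # step 4: breakdown, stop
            return Q[:, :j + 1], H[:j + 1, :j + 1]
        Q[:, j + 1] = u / H[j + 1, j]               # step 5
    return Q, H

# Ritz values: eigenvalues of the square part of H (last row removed)
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50))
A = A + A.T                          # hypothetical symmetric test matrix
Q, H = arnoldi(A, rng.standard_normal(50), 20)
ritz = np.linalg.eigvals(H[:-1, :])
```

Removing the last row of H before calling the eigensolver mirrors what the report does with Matlab's eig below.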

If Q_k = [q_1 ... q_k], then H_k = Q_k^H A Q_k is an upper Hessenberg matrix. The eigenvalues of H_k
are called Ritz values and they are approximate eigenvalues of the matrix A. Ritz vectors
are given by Q_k y, where y is an eigenvector of the matrix H_k, and they are the
approximate eigenvectors of matrix A. Eigenvectors of H_k must be computed by another
method such as QR iteration (in our project, we have used the built-in Matlab command
eig, which obtains the eigenvalues from a Schur decomposition).
It is often observed in practice that some of the Ritz eigenvalues converge to eigenvalues
of A. Since Hn is n-by-n, it has at most n eigenvalues, and not all eigenvalues of A can be
approximated. Typically, the Ritz eigenvalues converge to the extreme eigenvalues of A.
This can be related to the characterization of Hn as the matrix whose characteristic
polynomial minimizes ||p(A)q1|| in the following way. A good way to get p(A) small is to
choose the polynomial p such that p(x) is small whenever x is an eigenvalue of A. Hence,
the zeros of p (and thus the Ritz eigenvalues) will be close to the eigenvalues of A.
However, the details are not fully understood yet. This is in contrast to the case
where A is symmetric. In that situation, the Arnoldi iteration becomes the Lanczos
iteration, for which the theory is more complete.

Arnoldi iteration is fairly expensive in work and storage because each new vector q_k must
be orthogonalized against all previous columns of Q_k, and all of them must be stored for that
purpose. Ritz values and vectors are often good approximations to eigenvalues and
eigenvectors of A after relatively few iterations (20-50).

Lanczos algorithm
In order to decrease the work and storage dramatically, we use the Lanczos algorithm. If the
matrix is symmetric or Hermitian, the recurrence has only three terms and H_k is
tridiagonal (and is therefore usually denoted T_k). Below we present a flow chart of the steps that we have
followed in order to create the Lanczos algorithm.

1. q_0 = 0
   b_0 = 0
   x_0 = arbitrary non-zero starting vector
   q_1 = x_0/||x_0||_2

2. For k = 1, 2, ...
   u_k = A q_k
   a_k = q_k^H u_k
   u_k = u_k - b_{k-1} q_{k-1} - a_k q_k

3. b_k = ||u_k||_2

4. If b_k = 0, then stop!

5. q_{k+1} = u_k / b_k
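As with Arnoldi, the flow chart maps directly onto code. The report's implementation is in Matlab; this NumPy sketch (with a hypothetical random symmetric test matrix) keeps only the three-term recurrence and assembles the tridiagonal T_k:

```python
import numpy as np

def lanczos(A, x, k):
    """k Lanczos steps for a symmetric A. Returns Q (m x k) with orthonormal
    columns and the k x k symmetric tridiagonal matrix T."""
    m = A.shape[0]
    Q = np.zeros((m, k))
    alpha = np.zeros(k)                    # diagonal entries a_k
    beta = np.zeros(k)                     # subdiagonal entries b_k
    Q[:, 0] = x / np.linalg.norm(x)
    q_prev = np.zeros(m)                   # q_0 = 0
    b_prev = 0.0                           # b_0 = 0
    for j in range(k):
        u = A @ Q[:, j]
        alpha[j] = Q[:, j] @ u
        u = u - b_prev * q_prev - alpha[j] * Q[:, j]   # three-term recurrence
        beta[j] = np.linalg.norm(u)
        if beta[j] == 0 or j == k - 1:     # b_k = 0: invariant subspace found
            break
        q_prev, b_prev = Q[:, j], beta[j]
        Q[:, j + 1] = u / beta[j]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return Q, T

# Ritz values come from the small tridiagonal matrix
rng = np.random.default_rng(2)
A = rng.standard_normal((60, 60))
A = (A + A.T) / 2                          # hypothetical symmetric test matrix
Q, T = lanczos(A, rng.standard_normal(60), 25)
ritz = np.linalg.eigvalsh(T)
```

Note that, unlike Arnoldi, each step only touches q_k and q_{k-1}, which is where the savings in work and storage come from; no reorthogonalization is done here, so the orthogonality loss discussed below applies.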

a_k and b_k are the diagonal and subdiagonal entries of the symmetric tridiagonal matrix T_k.
As with Arnoldi, Lanczos iteration does not produce eigenvalues and eigenvectors directly, but only
the tridiagonal matrix T_k, whose eigenvalues and eigenvectors must be computed by another
method to obtain the Ritz values and vectors. If b_k = 0, the algorithm appears to break
down, but in that case an invariant subspace has already been identified (i.e. the eigenvalues and
eigenvectors are already exact at that point).

In principle, if the Lanczos algorithm were run until k = n, the resulting tridiagonal matrix would be
orthogonally similar to the matrix A. In practice, it was proved by Christopher Paige in his
thesis (1970) that loss of orthogonality happens precisely when the first eigenvalue
converges. As the calculations are performed in floating-point arithmetic where inaccuracy
is inevitable, orthogonality is quickly lost, and in some cases the new vector can even
be linearly dependent on the set already constructed. As a result, some of the
eigenvalues of the resulting tridiagonal matrix may not be approximations to those of the original
matrix. Therefore, the Lanczos algorithm is not very stable.
This problem can be overcome by reorthogonalizing the vectors as needed, but the expense can
be substantial. Alternatively, we can ignore the problem, in which case the algorithm still
produces good eigenvalue approximations, but multiple copies of some eigenvalues may
be generated.

Time evolution of our system


If we know the wave function of a system at a certain time t, we can find it at a later time
with the help of the time-dependent Schrödinger equation

i hbar dPsi(t)/dt = H(t) Psi(t)    (1)

as

Psi(t + Dt) = e^{-(i/hbar) int_t^{t+Dt} H(t') dt'} Psi(t)    (2)

The so-called time-propagation operator can, to first order in Dt, be approximated as

e^{-(i/hbar) int_t^{t+Dt} H(t') dt'} ~ e^{-(i/hbar) H(t) Dt}    (3)

Still, though, we have the operator H(t) in the exponent. If we have a complete, but still
finite, set of solutions to H(t) (for a specific time t),

H(t) |phi_i> = eps_i |phi_i>    (4)

we can use this to effectively take the exponential of H(t) as

e^{-(i/hbar) H(t) Dt} = sum_i e^{-(i/hbar) eps_i Dt} |phi_i><phi_i|    (5)
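Equations (4)-(5) in code: applying exp(-i H Dt / hbar) to a state via the eigendecomposition of H. This NumPy sketch sets hbar = 1 and uses a small hypothetical Hermitian matrix; the report's actual matrices are the discretized Hamiltonians below:

```python
import numpy as np

def apply_exp_H(H, psi, dt):
    """Apply exp(-i*H*dt) to psi using H|phi_i> = eps_i|phi_i> (hbar = 1)."""
    eps, phi = np.linalg.eigh(H)                # eq (4): eigenpairs of Hermitian H
    c = phi.conj().T @ psi                      # coefficients <phi_i|psi>
    return phi @ (np.exp(-1j * eps * dt) * c)   # eq (5) applied to psi

# small hypothetical example
H = np.array([[1.0, 0.5], [0.5, 2.0]])
psi = np.array([1.0, 0.0], dtype=complex)
psi_dt = apply_exp_H(H, psi, 0.1)               # unitary: the norm is conserved
```

Because the propagator is unitary, stepping forward by dt and back by -dt recovers the original state, which makes a convenient sanity check.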

Starting for example in the ground state of the harmonic oscillator (at t_0), Psi_0(t_0), we can
consider an additional potential V(t) (such that V(t) = 0 for t <= t_0). The time-dependent
Hamiltonian is thus

H(t) = H(x) + V(x, t)    (6)

where H is the time-independent harmonic-oscillator Hamiltonian. For V(x, t) one can
take

V(x, t) = sin(w t) V(x),  0 <= t <= pi/w    (7)

i.e. V(x, t) is non-zero only between t = 0 and t = pi/w. The term V(x) can be the bump in
Assignment 4.
We then use a time grid and get

Psi(t_{n+1}) = sum_i e^{-(i/hbar) eps_i(t_{n+1}) Dt} |phi_i^{n+1}><phi_i^{n+1}|Psi(t_n)>    (8)

The time grid has to be chosen with small enough steps to capture the dynamics. To get
the set |phi_i^{n+1}> and its eigenvalues, one may of course diagonalize the full H(t_{n+1}) matrix,
which has to be done in every step. A more efficient way is to use the Krylov space obtained by
working with H(t_{n+1}) on Psi_n, i.e.

Psi_n, H(t_{n+1}) Psi_n, H^2(t_{n+1}) Psi_n, H^3(t_{n+1}) Psi_n, ...

and use the Lanczos algorithm to get the set |phi_i^{n+1}> and its eigenvalues. We still have to do
this in every time step, but now the matrix is just of the size of the Krylov space. Since we
use the solution at the previous time step to construct the space, we can hope that the (small) set
we obtain is adequate: that we emphasize the part of the full space spanned by H(t_{n+1}) that is
important for the time evolution of Psi_n and neglect less important parts.

Results
To begin with, we apply the Arnoldi & Lanczos algorithms to the harmonic-oscillator
Hamiltonian with an extra potential term added:

H = -hbar^2/(2m) d^2/dx^2 + (1/2) m w^2 x^2 + V(x)    (1)

We will start with a simple form of the extra potential:

V(x) = C_1 e^{-C_2 x^2}    (2)

where C_1 = 20 and C_2 = 0.5 are numerical constants.


The extra potential is thus a bump in the middle of the harmonic oscillator potential.
Below, we present the graph of the potential:

Above, we have used a grid in the interval [-7,7] and a step size h = 0.1.
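For reference, the discretized Hamiltonian of eqs. (1)-(2) can be set up as follows. The report works in Matlab; this NumPy sketch assumes units with hbar = m = w = 1 (the report does not state its units) and the three-point central-difference second derivative:

```python
import numpy as np

h = 0.1                                         # grid step
x = np.arange(-7.0, 7.0 + h / 2, h)             # 141 points on [-7, 7]
V = 0.5 * x**2 + 20.0 * np.exp(-0.5 * x**2)     # oscillator + bump (C1=20, C2=0.5)
n = x.size

# H = -1/2 d^2/dx^2 + V with the central difference
# f''(x) ~ (f(x-h) - 2 f(x) + f(x+h)) / h^2  ->  symmetric tridiagonal matrix
off = -0.5 / h**2 * np.ones(n - 1)
H = np.diag(1.0 / h**2 + V) + np.diag(off, 1) + np.diag(off, -1)

E = np.linalg.eigvalsh(H)        # reference spectrum from full diagonalization
```

The resulting 141 x 141 symmetric matrix is the one fed to the Arnoldi and Lanczos routines; E plays the role of the eig reference spectrum used in the tables below.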

In order to apply the Arnoldi & Lanczos algorithms to this problem, we choose an initial vector q,
which takes random integer values from [1,10] and whose size is determined by the size of our
initial Hamiltonian matrix. In this case, that means that q has 141 rows and 1 column.
Using the flow chart of the Arnoldi algorithm that we have described above, we get two
matrices, Q_k = [q_1 ... q_k] and H_k = Q_k^H A Q_k, which is an upper Hessenberg matrix.
The eigenvalues of H_k are computed by the built-in Matlab command eig. These are the
Ritz values, which are approximations of the eigenvalues of our initial Hamiltonian
matrix. We note that when we calculate the matrix H_k, it has 142 rows and 141
columns. In order to use the eig command we need a square matrix, hence we
remove the last row.
Moreover, in order to calculate the corresponding eigenvectors, we multiply the matrix
Q_k by the corresponding eigenvector of the matrix H_k.

Arnoldi
For 40 iterations, we get a good approximation of the first eigenvalue, lambda_1 = 5.1822. Hence,
below we present the first solution for our potential and compare it with the one given
by the built-in Matlab command eig:

We notice that even though we get a good approximation of the first eigenvalue, in the
middle of the spectrum the eigenvalue approximations are not so good. This can partly be
explained by the fact that we get only 41 eigenvalues instead of 141. In order to have a
better view of our result, we present in the following table some of the eigenvalues that
we obtained, together with the eigenvalue of the built-in Matlab command eig to which
each one corresponds.

# state Arnoldi | Eigenvalue Arnoldi | # state eig | Eigenvalue eig
       1        |       5.1822       |      1      |     5.1822
       2        |       7.4789       |      2      |     7.4303
       3        |       9.5476       |      3      |     9.5412
       4        |      13.3676       |      4      |    13.4966
       5        |      16.0584       |     12      |    15.3817
       6        |      18.5041       |     15      |    18.7751
      15        |      85.7669       |     55      |    84.5857
      19        |     120.8626       |     68      |   119.9923
      28        |     212.4741       |     99      |   211.6348
      41        |     224.4503       |    104      |   224.8900

Hence, we can conclude that we obtain very good approximations at the beginning and at the
end of the spectrum. This is made clearer by the following graph, which shows the eigenvalues
that we have calculated with the Arnoldi algorithm and those generated by the built-in
Matlab command eig.

Lanczos
In our next step, we follow the same procedure as described above for the Lanczos
method. We start with 40 iterations. For the first eigenvalue we obtain
5.1822 (as expected), and the corresponding eigenvector is presented below:

In the following table, we compare the values that we got from our algorithm with those of the
built-in Matlab command eig, as we did before:

# state Lanczos | Eigenvalue Lanczos | # state eig | Eigenvalue eig
       1        |       5.1822       |      1      |     5.1822
       2        |       7.4350       |      2      |     7.4303
       3        |       9.5445       |      3      |     9.5417
       4        |      13.4689       |      4      |    13.4966
      10        |      39.0985       |     33      |    38.9009
      15        |      89.9717       |     57      |    89.7022
      20        |     139.6471       |     75      |   140.6537
      30        |     238.1887       |    109      |   237.0860
      40        |     284.1597       |    141      |   284.1615

We notice that we get very good approximations for the first 3-4
eigenvalues and also for the last ones. Hence, we can conclude that when we use this algorithm
on a symmetric matrix, we can obtain very good approximations of a few of the lowest eigenvalues
and of many of the highest eigenvalues.

Moreover, the more iterations we let our algorithm run, the better the results for the
lowest and highest eigenvalues. We chose to let our algorithm run for only 40
iterations, since we noticed that we obtain satisfactory results with this number of iterations.
Below we present a table where we compare the eigenvalues for different numbers of
iterations:

# eigenvalue | Expected value | 40 iter.  | 50 iter.  | 60 iter.  | 70 iter.
      1      |     5.1822     |   5.1822  |   5.1822  |   5.1822  |   5.1822
      4      |    13.4966     |  13.4689  |  13.5078  |  13.4971  |  13.4967
     32      |    37.3456     |  39.0985  |  33.7400  |  37.0993  |  36.2141
     73      |   134.6757     | 139.6471  | 131.8173  | 132.9627  | 135.4785
    107      |   232.3495     | 229.2186  | 233.5147  | 231.1329  | 235.1561
    141      |   284.1615     | 284.1597  | 284.1615  | 284.1615  | 284.1615

We notice that when letting our algorithm run for more than 50 iterations, we lose some accuracy
in the eigenvalues that we found, but on the other hand we obtain more results. Furthermore, we
see that for the lowest and highest eigenvalues, our algorithm converges very fast to the
desired eigenvalue. The main problem in convergence is found in the middle of the spectrum.
If eigenvalues are needed in the middle of the spectrum, let's say near s, then the algorithm
can be applied to the shift-and-invert matrix (A - sI)^{-1}. In this way, we obtain the eigenvalues near the point s.
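The shift-and-invert idea can be tried with SciPy's eigsh, which runs Lanczos on (A - sI)^{-1} internally when a shift sigma is supplied; the diagonal test matrix here is hypothetical, chosen so that the answer is known:

```python
import numpy as np
from scipy.sparse.linalg import eigsh

# Eigenvalues of (A - s*I)^{-1} are 1/(lambda_i - s), so eigenvalues of A
# near s become the extreme ones, which Lanczos finds after few iterations.
A = np.diag(np.arange(1.0, 101.0))           # known spectrum 1, 2, ..., 100
s = 50.3
vals, vecs = eigsh(A, k=3, sigma=s)          # the 3 eigenvalues closest to s
```

With this spectrum the three eigenvalues closest to s = 50.3 are 49, 50, and 51; the price of the transformation is that each Lanczos step now requires a linear solve with (A - sI) instead of a matrix-vector product.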

Furthermore, we present some graphs in which we vary the number of iterations and
check whether the eigenvalues that we have found with the Lanczos algorithm coincide with the
ones expected from the built-in Matlab function eig.

From the graphs above, we verify that our algorithm (Lanczos) converges very fast to the
first and last few eigenvalues. The more iterations we let our program run, the more
eigenvalues we get, and as a result the eigenvalues in the middle of the spectrum start to coincide
with the expected values, albeit at a slower pace.

Time comparison between the algorithms


In assignment 3, we used the shifted inverse power routine in order to calculate
the eigenvalue and the corresponding eigenvector of a Hermitian matrix. Now we will
compare the shifted inverse power iteration, Arnoldi, and Lanczos algorithms with respect to
the time that each of them needs in order to calculate eigenvalues and eigenvectors.

Matrix size | Shifted inverse power iteration, time (sec) | Arnoldi, time (sec) | Lanczos, time (sec)
  141x141   |   0.0137  |  0.0633  |  0.0158
  467x467   |   0.1210  |  0.0972  |  0.0769
  701x701   |   0.1411  |  0.0951  |  0.0651
 1401x1401  |   0.8758  |  0.3097  |  0.2138
 2801x2801  |   6.7129  |  1.1123  |  1.0213
 4667x4776  |  32.4801  | 14.0167  |  2.6940

As we can see, the Lanczos algorithm is the fastest method to calculate eigenvalues and their
corresponding eigenvectors. This is clearest for large matrices.
We note that in the above table, we have used 40 iterations for the Arnoldi algorithm, 40
iterations for the Lanczos algorithm, and 4 iterations for the shifted inverse power
iteration.
Moreover, using the shifted inverse power iteration, we have only calculated the first
eigenvalue and its corresponding eigenvector in the time shown in the table. On the other
hand, using the Arnoldi and Lanczos methods, we have calculated 40 eigenvalues and their
corresponding eigenvectors. This is the reason why, for a small matrix (141x141), the
shifted inverse power iteration is faster than the other two methods. Hence,
we conclude that the Arnoldi & Lanczos methods are even more efficient than the above
table shows.

Time evolution of our system

Our first step is to create the new potential. Since we already have the potential from the
time-independent harmonic-oscillator Hamiltonian, and we also know the potential with
the bump from assignment 4, we conclude that our potential will be given by the formula

V(x, t) = 0.5 x^2 + sin(w t) (c_1 e^{-c_2 x^2}),  0 <= t <= pi/w

where c_1 = 20 and c_2 = 1/2.

The time grid has to be chosen with small enough steps to capture the dynamics. Hence,
initially we split the time interval [0, pi/2] into 141 points (time step 0.01). For the x-axis grid,
we have used the interval [-7,7] and a step size h = 0.1. As a result, we obtain a matrix in which
each row shows how each point of the potential changes with time.
In our next step, we calculate the eigenvalues and eigenvectors of our Hamiltonian matrix, so we
get two matrices with dimensions (141,141). We store the matrix of the eigenvectors in a matrix
called Phi(t_n). Hence, in order to get the next set (of eigenvectors and eigenvalues), we diagonalize
the full Hermitian matrix again for the next time value. Doing this iteration over the whole time grid
and using the equation

Psi(t_{n+1}) = sum_i e^{-(i/hbar) eps_i(t_{n+1}) Dt} |phi_i^{n+1}><phi_i^{n+1}|Psi(t_n)>

where <phi_i^{n+1}|Psi(t_n)> is the inner product of the currently calculated eigenvectors and the
wave function of the previous step, we calculate the time evolution of our problem.
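A single time step of eq. (8) built on a small Lanczos Krylov space, instead of a full diagonalization, could look as follows. This NumPy sketch sets hbar = 1 and uses a hypothetical 2x2 Hamiltonian, chosen so that k = 2 spans the full space and the step is exact:

```python
import numpy as np

def krylov_step(H, psi, dt, k=10):
    """Advance psi by exp(-i*H*dt) using a k-dimensional Krylov space built
    from psi by Lanczos (hbar = 1, H real symmetric; no reorthogonalization)."""
    n = psi.size
    Q = np.zeros((n, k), dtype=complex)
    T = np.zeros((k, k))
    nrm = np.linalg.norm(psi)
    Q[:, 0] = psi / nrm
    for j in range(k):
        u = H @ Q[:, j]
        T[j, j] = np.real(np.vdot(Q[:, j], u))
        u = u - T[j, j] * Q[:, j]
        if j > 0:
            u = u - T[j - 1, j] * Q[:, j - 1]
        b = np.linalg.norm(u)
        if j == k - 1 or b < 1e-12:
            k = j + 1                          # effective Krylov dimension
            break
        T[j, j + 1] = T[j + 1, j] = b
        Q[:, j + 1] = u / b
    eps, y = np.linalg.eigh(T[:k, :k])         # diagonalize the small matrix
    # psi = nrm * Q e_1, so propagate e_1 in the eigenbasis of T (eq. (8))
    c = y @ (np.exp(-1j * eps * dt) * y[0, :])
    return nrm * (Q[:, :k] @ c)

# hypothetical 2x2 Hamiltonian: k = 2 spans the full space, so the step is exact
H = np.array([[1.0, 0.5], [0.5, 2.0]])
psi0 = np.array([1.0, 0.0], dtype=complex)
psi1 = krylov_step(H, psi0, dt=0.1, k=2)
```

For the 141-point Hamiltonian of the report, a Krylov dimension of order 10 per step would replace each 141 x 141 diagonalization by one of a k x k tridiagonal matrix.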

References

1) Axel Ruhe, "Topics in numerical linear algebra".
2) Eva Lindroth, lecture notes, Computational Physics course (FK8002).
3) http://en.wikipedia.org/wiki/Lanczos_algorithm
4) http://en.wikipedia.org/wiki/Arnoldi_iteration