
Fast Eigenvalue Solutions

Techniques
! Steepest Descent/Conjugate Gradient
! Davidson/Lanczos
! Car-Parrinello
Where HF/DFT calculations spend time

SCF loop: Guess ρ → Form H → Diagonalize ⇒ ρ → Did ρ change? (Yes: repeat; No: done)
! Form H: O(N4); cutoffs reduce this to O(N2)
! Diagonalize: O(N3), difficult to reduce; use Lanczos or Conjugate Gradient
! Currently only important if N > 2000
Rayleigh-Ritz Variational Principle

Can always map a minimization problem into a diagonalization problem, and vice-versa:

λ1 = min_u ⟨u, Au⟩ / ⟨u, u⟩
This isn’t surprising to QMers, since we normally
interchange the two
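
As a quick numerical illustration (not from the original slides), a small numpy sketch: the Rayleigh quotient of any trial vector u bounds the lowest eigenvalue of a symmetric A from above, and minimizing it over u recovers λ1. The test matrix and its size are arbitrary assumptions.

import numpy as np

# Hypothetical small symmetric test matrix (stand-in for H).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
A = 0.5 * (A + A.T)                      # symmetrize

def rayleigh_quotient(A, u):
    return (u @ A @ u) / (u @ u)         # <u, Au> / <u, u>

lam1 = np.linalg.eigvalsh(A)[0]          # exact lowest eigenvalue
u = rng.standard_normal(50)              # arbitrary trial vector
print(rayleigh_quotient(A, u), ">=", lam1)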
Iterative Eigenvector Optimization

Ron already covered this…


Eigenvalue problem
Hx = Ex
(H-D)x = (EI-D)x, where D is the diagonal of H
x = (EI-D)-1(H-D)x
Define an iterative approach
x(n+1) = (E(n)I-D)-1(H-D)x(n)
E(n) = x(n)’Hx(n)/x(n)’x(n)
Factor out the correction
x(n+1) = x(n) + d(n)
d(n) = (E(n)I-D)-1(H-E(n)I)x(n)
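
A minimal numpy sketch of this iteration (an illustration, not part of the original slides), applied to a hypothetical diagonally dominant matrix. The correction d is also projected orthogonal to the current x, as the band-at-a-time scheme later in these notes does, since the raw update can otherwise be dominated by the component along x itself.

import numpy as np

rng = np.random.default_rng(1)
N = 200
# Hypothetical diagonally dominant symmetric H (an assumption for the demo).
H = np.diag(np.linspace(1.0, 20.0, N)) + 0.01 * rng.standard_normal((N, N))
H = 0.5 * (H + H.T)
D = np.diag(H)

x = np.zeros(N)
x[0] = 1.0                               # cheap guess: lowest diagonal element

for n in range(30):
    E = x @ H @ x                        # E(n) = x'Hx/x'x (x kept normalized)
    r = H @ x - E * x                    # (H - E(n)I) x(n)
    denom = E - D
    denom[np.abs(denom) < 1e-8] = 1e-8   # guard against division by ~0
    d = r / denom                        # d(n) = (E(n)I - D)^-1 (H - E(n)I) x(n)
    d -= (x @ d) * x                     # drop the component along x itself
    x = x + d                            # x(n+1) = x(n) + d(n)
    x /= np.linalg.norm(x)

print("iterated E:", x @ H @ x, " exact lowest:", np.linalg.eigvalsh(H)[0])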
The Residual Vector d

The residual vector d(n) = (E(n)I-D)-1(H-E(n)I)x(n)


! Represents the direction of greatest change for the
eigenvector x.
! Known as the “steepest descent vector” (although this term
is also used for (H-E(n)I)x(n))
We have a direction, but we don’t know how far to
move.
! Use Steepest Descent or Conjugate Gradient optimization
Steepest descent optimization
1. Find d1
2. Minimize in 1-d along d1
3. Find d2
4. Minimize in 1-d along d2
5. Etc.

Tends to be slow, because you overshoot the minimum in each direction.
Conjugate Gradient optimization
1. Find d1
2. Minimize in 1-d along d1
3. Find “conjugate direction” d2: d2∇d1 = 0
4. Minimize along d2
5. Etc.

CG is normally much faster than SD
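
A hedged side-by-side sketch (not from the slides) of SD versus CG for minimizing the Rayleigh quotient; the conjugate direction uses a Polak-Ribière-style coefficient, and each 1-d minimization uses the θ parametrization described on the next slide. The matrix, sizes, and iteration counts are arbitrary choices.

import numpy as np

rng = np.random.default_rng(2)
N = 300
H = np.diag(np.linspace(0.0, 10.0, N)) + 0.05 * rng.standard_normal((N, N))
H = 0.5 * (H + H.T)

def line_min(H, x, d):
    # Exact minimum of <x(t)|H|x(t)> with x(t) = x cos(t) + d sin(t),
    # assuming x and d are orthonormal.
    hxx, hdd, hxd = x @ H @ x, d @ H @ d, x @ H @ d
    t = 0.5 * np.arctan2(-2.0 * hxd, hdd - hxx)
    return x * np.cos(t) + d * np.sin(t)

def lowest(H, n_iter, use_cg):
    x = rng.standard_normal(N)
    x /= np.linalg.norm(x)
    d_prev = g_prev = None
    for _ in range(n_iter):
        g = H @ x - (x @ H @ x) * x           # steepest-descent (residual) vector
        if use_cg and d_prev is not None:
            beta = max(0.0, g @ (g - g_prev) / (g_prev @ g_prev))  # Polak-Ribiere
            d = g + beta * d_prev             # conjugate direction
        else:
            d = g                             # plain steepest descent
        d_prev, g_prev = d, g
        d = d - (x @ d) * x                   # keep the search direction orthogonal to x
        nd = np.linalg.norm(d)
        if nd < 1e-12:
            break
        x = line_min(H, x, d / nd)
    return x @ H @ x

exact = np.linalg.eigvalsh(H)[0]
print("SD error:", lowest(H, 100, use_cg=False) - exact)
print("CG error:", lowest(H, 100, use_cg=True) - exact)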
One-dimensional optimization

Given a direction d (SD or CG), how do we optimize?
! x(n+1) = x(n)cos(θ) + d sin(θ)
! Finding the optimal θ normally only requires a few points.
! Trig functions preserve normalization
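
A small sketch of the few-point θ fit (an illustration; the function names are invented): because x and d are orthonormal, E(θ) for a quadratic energy has the form a + b cos2θ + c sin2θ, so three energy evaluations determine the optimum exactly.

import numpy as np

def theta_line_search(energy, x, d, thetas=(0.0, np.pi / 8, np.pi / 4)):
    # Fit E(theta) = a + b*cos(2 theta) + c*sin(2 theta) from a few samples
    # and return the minimizing theta.  Assumes x, d orthonormal and energy(v)
    # quadratic in v (true for <v|H|v>).
    samples = [energy(x * np.cos(t) + d * np.sin(t)) for t in thetas]
    M = np.array([[1.0, np.cos(2 * t), np.sin(2 * t)] for t in thetas])
    a, b, c = np.linalg.solve(M, samples)
    return 0.5 * np.arctan2(-c, -b)          # minimum of b*cos(2t) + c*sin(2t)

# Tiny usage example with a hypothetical 2x2 "Hamiltonian".
H = np.array([[2.0, 0.3], [0.3, 1.0]])
x = np.array([1.0, 0.0])
d = np.array([0.0, 1.0])
t = theta_line_search(lambda v: v @ H @ v, x, d)
x_new = x * np.cos(t) + d * np.sin(t)        # still normalized
print(t, x_new @ H @ x_new, np.linalg.eigvalsh(H)[0])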
Orbital/band at a time optimization
For each band, iterate until converged:
1. Trial eigenvector for the band
2. Calculate the steepest descent vector
3. Orthogonalize to all bands
4. Determine the conjugate direction
5. Orthogonalize to the present band and normalize
6. Calculate the KS energy for several θ
7. Compute the optimum θ
8. Compute the new eigenvector
Davidson Diagonalization

Optimize eigenvectors in a guess subspace

Also uses the residual vector d(n) = (E(n)I-D)-1(H-E(n)I)x(n)
! Rather than CG or SD, use d to augment the subspace
! A “Lanczos”-style technique
Derived for Configuration Interaction
! N ~ 10,000
! K = 1 (one root)
Lanczos methods can be unstable
! Davidson is normally more stable, because of the subspace projection.
Subspace Optimization

Rather than solve a problem in a large space (N ~ 10,000), project onto a smaller space (n ~ 50).
! The small space has to contain the dominant features of the large space

Example: Water with 1M basis functions


! Only have 5 occupied states
" 1s orbital on O
" 2 O lone pairs
" 2 OH bonds
! Project onto a space that spans these five orbitals
! Add onto this space until converged
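
A minimal numpy illustration of the projection idea (sizes and the test matrix are arbitrary assumptions): diagonalizing the small projected matrix G = B^T H B already approximates the lowest levels when the guess vectors B roughly span them.

import numpy as np

rng = np.random.default_rng(3)
N, n = 1000, 5                       # big space, small guess subspace
H = np.diag(np.linspace(-10.0, 50.0, N)) + 0.1 * rng.standard_normal((N, N))
H = 0.5 * (H + H.T)

# Guess vectors: unit vectors on the 5 lowest diagonal entries (already orthonormal).
B = np.eye(N)[:, :n]
G = B.T @ H @ B                      # n x n projected Hamiltonian
approx = np.linalg.eigvalsh(G)       # cheap: dense diagonalization of a 5x5
exact = np.linalg.eigvalsh(H)[:n]    # expensive reference
print(np.round(approx, 3))
print(np.round(exact, 3))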
Davidson #1: Subspace Transform
1. Choose an initial set of guess vectors {bi, i=1,ng}
2. Transform H into the subspace spanned by {bi}:
G = B^T H B
3. Compute the eigenvectors v and eigenvalues λ of G
Dense technique (ng is small)

If B fully spans the desired eigenvector, one of the eigenvalues λk will be the exact one.
Otherwise, we can iteratively optimize B to make it better.
How do we do this? Using the idea of the residual vector d that we saw earlier.
Davidson #2: Optimizing the subspace

4. Compute the residual vector dk as before:
dk = (λkI-D)-1(H-λkI)xk, where xk = Bvk is the current approximate eigenvector
5. Orthogonalize d to all of the {bi}; normalize d.
6. Set bng+1 = d
7. Form the additional row and column of G:
Gng+1,i = bng+1^T H bi
8. Diagonalize G, as before.
9. If not converged, go to #4.
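
Putting steps 1-9 together, a hedged numpy sketch of Davidson for the single lowest eigenpair of a symmetric, diagonally dominant matrix; the guess count, tolerance, and test matrix are arbitrary choices.

import numpy as np

def davidson_lowest(H, n_guess=4, max_iter=50, tol=1e-8):
    # Sketch of the Davidson scheme from these slides, for one lowest eigenpair.
    N = H.shape[0]
    D = np.diag(H)
    order = np.argsort(D)
    B = np.eye(N)[:, order[:n_guess]]            # 1. guesses: lowest diagonal elements
    for _ in range(max_iter):
        G = B.T @ H @ B                          # 2. project into the subspace
        lam, V = np.linalg.eigh(G)               # 3. small dense diagonalization
        lam_k, v_k = lam[0], V[:, 0]
        x = B @ v_k                              #    approximate eigenvector in the big space
        r = H @ x - lam_k * x                    # 4. residual
        if np.linalg.norm(r) < tol:
            return lam_k, x
        denom = lam_k - D
        denom[np.abs(denom) < 1e-8] = 1e-8       #    avoid division by ~0
        d = r / denom                            #    correction vector
        d -= B @ (B.T @ d)                       # 5. orthogonalize to the subspace
        d /= np.linalg.norm(d)
        B = np.hstack([B, d[:, None]])           # 6.-7. augment the subspace, repeat
    return lam_k, x

# Hypothetical test matrix (an assumption for the demo).
rng = np.random.default_rng(4)
N = 500
H = np.diag(np.arange(N, dtype=float)) + 0.01 * rng.standard_normal((N, N))
H = 0.5 * (H + H.T)
lam, x = davidson_lowest(H)
print(lam, np.linalg.eigvalsh(H)[0])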
Davidson Timings
Timing Caveats

These timings represent a best-case scenario!!
! 1 eigenvalue, lots of functions
! Your mileage may vary!!!
Higher roots with Davidson

If we want more than one eigenvalue/vector, the guess vectors {b} converged after computing one are normally a good guess for another.
Start with the lowest one, compute to highest.
Not a good method to compute all
eigenvalues/vectors
! End up solving the full problem multiple times!
Compare/Contrast CG & Davidson

Both use a subspace transform
! We normally have the previous iteration’s eigenvectors, and we may as well use them
Both use the residual vector d
! CG uses d to compute the conjugate direction
! Dav uses d to augment the subspace

Conventional wisdom
! Davidson is faster
! CG is more stable
! Many more people use CG than Davidson
Caveats

1. Use a dense technique when n < 500, always.
dsyevx from LAPACK is very hard to beat!
Dense techniques are much more stable than sparse.
2. Keep an eye on nk/N
nk is the number of desired roots, N is the size of the
matrix.
If nk/N > 0.25, probably best to use dense technique.
3. Precondition!
Multiplying by the last iteration’s eigenvectors is a good
form of preconditioning.
Pure Lanczos Techniques (Zunger, ’94)
1. Start with a random u1
2. Solve βi+1ui+1 = Hui – αiui – βiui-1
αi = <ui|H|ui>
βi+1 determined by the normalization condition on ui+1
3. Diagonalize the tridiagonal matrix

    | α1  β2              |
    | β2  α2  β3          |
    |     β3   ⋱    βM    |
    |          βM   αM    |
Lanczos #2

4. Check to see how many eigenvalues are converged; add vectors ui until the requisite number of states are converged.
5. The kth eigenvector is given by
Vk = Σi bki ui, where bk is the kth eigenvector of the tridiagonal matrix
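
A hedged numpy sketch of steps 1-5 (the test matrix and the number of Lanczos steps are arbitrary choices). It deliberately performs no re-orthogonalization, so it inherits the instabilities noted on the next slide.

import numpy as np

def lanczos_lowest(H, m):
    # Build the m-step Lanczos tridiagonal matrix for symmetric H and return
    # its lowest eigenvalue plus the corresponding Ritz vector V1 = sum_i b1i ui.
    N = H.shape[0]
    rng = np.random.default_rng(5)
    u = rng.standard_normal(N)
    u /= np.linalg.norm(u)                # 1. random u1
    u_prev, beta = np.zeros(N), 0.0
    alphas, betas, U = [], [], [u]
    for i in range(m):
        w = H @ u - beta * u_prev         # 2. three-term recursion
        alpha = u @ w                     #    alpha_i = <ui|H|ui>
        w -= alpha * u
        alphas.append(alpha)
        if i < m - 1:
            beta = np.linalg.norm(w)      #    beta_{i+1} from normalizing u_{i+1}
            betas.append(beta)
            u_prev, u = u, w / beta
            U.append(u)
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    theta, S = np.linalg.eigh(T)          # 3. diagonalize the tridiagonal matrix
    V1 = np.array(U).T @ S[:, 0]          # 5. Vk = sum_i bki ui (k = lowest)
    return theta[0], V1

# Hypothetical dense test matrix (an assumption for the demo).
rng = np.random.default_rng(6)
N = 1000
H = np.diag(np.linspace(-5.0, 5.0, N)) + 0.01 * rng.standard_normal((N, N))
H = 0.5 * (H + H.T)
lam, V1 = lanczos_lowest(H, 150)
print(lam, V1 @ H @ V1 / (V1 @ V1), np.linalg.eigvalsh(H)[0])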
Pure Lanczos notes

Lanczos has a reputation of being unstable
! Can also produce duplicate eigenvalues/vectors
" Must project out
! Can become non-orthogonal
! Numerical errors often grow during iterations
Car-Parrinello Propagator Technique

Different approach
! Rather than minimizing the problem, propagate solutions

Born-Oppenheimer approximation
! Nuclei move much more slowly than electrons
! Fix nuclei, minimize electrons

Car-Parrinello technique does not solve for the Born-Oppenheimer surface
! Propagated electrons can be shown to lie reasonably close to the BO surface
Scheme for QM-MD

1. Fix the nuclei, solve for the wave function
2. Compute the forces on the nuclei due to the wave function
3. Propagate the nuclei using Verlet-type scheme
4. Go to #1
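
A toy illustration of this loop (entirely an assumption, not the original scheme): one nuclear coordinate R, a hypothetical 2x2 model electronic Hamiltonian H(R), Hellmann-Feynman forces, and velocity-Verlet propagation.

import numpy as np

def H_el(R):
    # Hypothetical 2x2 model "electronic Hamiltonian" for one nuclear coordinate R.
    return np.array([[R**2, 0.5], [0.5, 2.0 - R]])

def dH_dR(R):
    return np.array([[2.0 * R, 0.0], [0.0, -1.0]])

def ground_state(R):
    w, v = np.linalg.eigh(H_el(R))        # step 1: solve for the wave function
    return w[0], v[:, 0]

def force(R, psi):
    return -psi @ dH_dR(R) @ psi          # step 2: Hellmann-Feynman force on the nucleus

M, dt = 100.0, 0.05                        # model nuclear mass and time step (assumptions)
R, v = 1.0, 0.0
E0, psi = ground_state(R)
F = force(R, psi)
for step in range(200):                    # steps 3-4: velocity-Verlet loop
    R += v * dt + 0.5 * (F / M) * dt**2
    E0, psi = ground_state(R)              # re-solve the electrons at the new geometry
    F_new = force(R, psi)
    v += 0.5 * (F + F_new) / M * dt
    F = F_new
print("final R, E0:", R, E0)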
Car-Parrinello

Start from the observation that diagonalization and energy minimization are interchangeable
Rather than minimizing the electronic energy, propagate the electrons in the same way one propagates the nuclear motions
! Electrons must have fictitious masses
! Treat as dynamic variables
! Much faster than diagonalization
Can also dynamically damp the electrons to obtain the Born-Oppenheimer solution
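
A toy sketch of the damped fictitious electron dynamics (the fictitious mass, friction, time step, and test matrix are all arbitrary assumptions): at fixed geometry, the coefficient vector relaxes toward the lowest eigenvector, i.e. the Born-Oppenheimer solution.

import numpy as np

rng = np.random.default_rng(7)
N = 100
H = np.diag(np.linspace(0.0, 5.0, N)) + 0.02 * rng.standard_normal((N, N))
H = 0.5 * (H + H.T)

mu, gamma, dt = 1.0, 0.3, 0.05            # fictitious mass, damping, time step
c = rng.standard_normal(N)
c /= np.linalg.norm(c)
cdot = np.zeros(N)

for step in range(5000):
    E = c @ H @ c
    f = -(H @ c - E * c)                  # "force" on the electronic coefficients
    cdot += dt * (f - gamma * cdot) / mu  # damped fictitious Newtonian dynamics
    c += dt * cdot
    c /= np.linalg.norm(c)                # crude constraint: keep c normalized

print("fictitious-dynamics E:", c @ H @ c, " exact lowest:", np.linalg.eigvalsh(H)[0])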
References

! “Iterative minimization techniques for ab initio total energy calculations.” Payne, Teter, Allan, Arias, Joannopoulos. RMP 64, 1045 (1992).
! “Iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices.” Davidson. J. Comp. Phys. 17, 87 (1975).
! “Asymptotic convergence for iterative optimization in electronic structure.” Lippert, Sears. PRB 61, 12772 (2000).
! “Large scale electronic structure calculations using the Lanczos method.” Wang, Zunger. Comp. Mat. Sci. 2, 326 (1994).
! “Unified approach for MD and DFT.” Car, Parrinello. PRL 55, 2471 (1985).
