
Algorithms and Computational Aspects of DFT Calculations
Part II
Juan Meza and Chao Yang
High Performance Computing Research
Lawrence Berkeley National Laboratory
IMA Tutorial
Mathematical and Computational Approaches to Quantum Chemistry
Institute for Mathematics and its Applications, University of Minnesota
September 26-27, 2008
Juan Meza (LBNL) Algorithms and Computational Aspects of DFT Calculations September 27, 2008 1 / 37
1. Goals and Motivation
2. Review of Equations
3. Plane Wave DFT Computational Components
4. Parallelization Strategies
5. Future Computational Challenges
   Linear Scaling Methods
   Parallelism Issues
6. Software
   Available Codes
   KSSOLV
7. Summary
Goals
1. The Role of Computation
2. Review Equations and Solution Techniques
3. Discuss Major Computational Aspects of Plane Wave DFT codes
4. Present Some Parallelization Issues
5. Highlight Computational Challenges
Materials by design
Advances in density functional theory coupled with multinode
computational clusters now enable accurate simulation of the behavior
of multi-thousand atom complexes that mediate the electronic and ionic
transfers of solar energy conversion. These new and emerging nanoscience
capabilities bring a fundamental understanding of the atomic and
molecular processes of solar energy utilization within reach.
Source: Basic Research Needs for Solar Energy Utilization, Report of the BES
Workshop on Solar Energy Utilization, April 18-21, 2005
DFT codes are widely used for science applications
9470 nodes; 19,480 cores
13 Tflops/s SSP (100 Tflops/s peak)
Upgrade to QuadCore (355 Tflops/s peak)
DFT methods account for 75% of
the materials sciences simulations at
NERSC, totaling over 5 Million
hours of computer time in 2006
We can now simulate some realistic structures
The charge density of a 15,000 atom quantum dot, Si$_{13607}$H$_{2236}$. Using 2048
processors at NERSC the calculation took about 5 hours.
The calculated dipole moment of a 2633 atom CdSe quantum rod,
Cd$_{961}$Se$_{724}$H$_{948}$. Using 2560 processors at NERSC the calculation took
about 30 hours.
Kohn-Sham Equations
Recall our goal is to find the ground state energy by minimizing the
Kohn-Sham total energy, $E_{total}$
Leads to:
Kohn-Sham equations
$$H\psi_i = \epsilon_i \psi_i, \quad i = 1, 2, \ldots, n_e$$
$$H = -\frac{1}{2}\nabla^2 + V(\rho(r)), \qquad V(\rho(r)) = V_{ext}(r) + \int \frac{\rho(r')}{|r - r'|}\,dr' + V_{xc}(\rho)$$
Nonlinear eigenvalue problem since the Hamiltonian, $H$, depends on $\psi$
through the charge density, $\rho$
Discretized Kohn-Sham Equations
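KKT conditions
$$\nabla_X L(X, \Lambda) = 0, \qquad X^{*}X = I_{n_e}.$$
Discretized Kohn-Sham equations can now be written as:
$$H(X)X = X\Lambda, \qquad X^{*}X = I_{n_e}.$$
Kohn-Sham Hamiltonian given by:
$$H(X) = \frac{1}{2}L + V(X), \qquad V(X) = V_{ext} + \mathrm{Diag}\left(L^{\dagger}\rho(X)\right) + \mathrm{Diag}\, g_{xc}(\rho(X))$$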
The SCF Iteration
SCF cycle (diagram): $V(\rho(r))$ → $\left[-\frac{1}{2}\nabla^2 + V(\rho(r))\right]\psi_i = E_i\psi_i$ → $\{\psi_i\}_{i=1,\ldots,n_e}$ → $\rho(r) = \sum_{i}^{n_e} |\psi_i(r)|^2$ → $V(\rho(r))$
1. Given an initial charge density $\rho$, compute a potential $V_k(\rho(r))$
2. Solve the linear eigenvalue problem for the $\psi_i$, $i = 1, \ldots, n_e$
3. Compute the new charge density $\rho$
4. Update $\rho$ using your favorite mixing scheme
5. Compute $V_{k+1}$ and repeat until converged
Overall computational complexity is $O(N n_e^2)$ due to linear algebra
Major computational components
CG method
Orthogonalization
Computation of potentials
3D FFT
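The five SCF steps above can be sketched on a toy problem. The following is a minimal illustration only, not code from any of the packages discussed: it assumes a hypothetical two-site model with one orbital, a hopping term `hop`, an external potential `v_ext`, and a density-dependent term `u * rho` standing in for the Hartree and exchange-correlation potentials. All names and parameter values are assumptions made for the example, and simple linear mixing is used as the "favorite mixing scheme".

```python
import math

def lowest_state(a, b, d):
    """Lowest eigenpair of the symmetric 2x2 matrix [[a, b], [b, d]], b != 0."""
    lam = 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    vx, vy = b, lam - a                       # unnormalized eigenvector
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)

def density_from(rho, v_ext, hop, u):
    """Build H(rho), solve the linear eigenproblem, return the new density."""
    a = v_ext[0] + u * rho[0]                 # potential on site 0
    d = v_ext[1] + u * rho[1]                 # potential on site 1
    lam, psi = lowest_state(a, hop, d)
    return lam, [psi[0] ** 2, psi[1] ** 2]    # rho = |psi|^2 for one orbital

def scf(v_ext=(0.0, 0.5), hop=-1.0, u=1.0, beta=0.5, tol=1e-12, max_iter=200):
    rho = [0.5, 0.5]                          # step 1: initial charge density
    for it in range(1, max_iter + 1):
        eig, rho_out = density_from(rho, v_ext, hop, u)   # steps 2 and 3
        resid = max(abs(o - r) for o, r in zip(rho_out, rho))
        if resid < tol:                       # input and output densities agree
            return eig, rho, it
        # step 4: simple linear mixing of old and new densities
        rho = [(1 - beta) * r + beta * o for r, o in zip(rho, rho_out)]
    raise RuntimeError("SCF did not converge")

eig, rho, iterations = scf()
```

In a real plane-wave code the 2x2 solve is replaced by an iterative eigensolver for the $n_e$ lowest states, and the mixing step is usually more sophisticated than the linear mixing assumed here.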
What Are the Computational Issues?
DFT methods account for 75% of the material science simulations at NERSC
Parallel efficiencies can be quite high
  parallelizing over the plane-wave basis can scale to 1000 processors
  parallelizing over the plane-wave basis and wavefunction index can scale to 10,000 processors
Most codes still based on $O(N^3)$ algorithms
Not systematically improvable
Inadequate for strong and/or non-local correlations
Parallel efficiencies can be difficult to achieve; 10-20% parallel efficiency is
not uncommon
Major Computational Components of Plane Wave DFT Codes
Eigenvalue solver
Orthogonalization
3D FFTs
Computation of potentials
Eigenvalue Solver
Need to solve one $N \times n_e$ linear eigenvalue problem at each SCF iteration
The size of $N$ can easily be 10,000 to 100,000
Only need the $n_e$ ($\sim$ number of atoms) lowest eigenvalues and corresponding
eigenvectors
Called "diagonalization" in chemistry/materials science circles
Various approaches including CG, Grassmann CG, residual minimization
Distinction is usually made between all-band and band-by-band methods, which
correspond to solving for all eigenvectors simultaneously vs. solving for one
eigenvector at a time. We would call this blocked vs. unblocked
Use of optimized BLAS3 routines can significantly improve
performance
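To make the all-band (blocked) idea concrete, here is a small sketch that computes the $n_e$ lowest eigenpairs of a model Hamiltonian by updating the whole block of vectors in each sweep. It deliberately swaps in plain shifted subspace iteration with Gram-Schmidt, not the CG or residual-minimization methods named above, and the model matrix (a 1D second-difference operator), the shift, and the iteration count are all assumptions chosen for illustration.

```python
import math

def apply_h(x):
    """Apply the model Hamiltonian tridiag(-1, 2, -1) to a vector."""
    n = len(x)
    return [2.0 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(block):
    """Modified Gram-Schmidt on a list of vectors."""
    out = []
    for v in block:
        v = list(v)
        for q in out:
            c = dot(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        nrm = math.sqrt(dot(v, v))
        out.append([vi / nrm for vi in v])
    return out

def lowest_eigenpairs(n, n_e, shift=4.0, iters=600):
    """Blocked iteration: all n_e vectors are updated together in each sweep."""
    block = orthonormalize(
        [[1.0 / (i + j + 1) for i in range(n)] for j in range(n_e)])
    for _ in range(iters):
        # Multiply by (shift*I - H) so the lowest eigenvalues of H dominate;
        # shift = 4 bounds the spectrum of this model matrix from above.
        block = [[shift * xi - hi for xi, hi in zip(x, apply_h(x))]
                 for x in block]
        block = orthonormalize(block)
    values = [dot(x, apply_h(x)) for x in block]   # Rayleigh quotients
    return values, block

values, vectors = lowest_eigenpairs(n=8, n_e=2)
```

A band-by-band variant would instead converge one vector fully before starting the next. Production codes store the block as a dense $N \times n_e$ matrix so that the products become matrix-matrix (BLAS3) operations.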
Orthogonalization
Due to physical constraints, the electronic wavefunctions must be
orthonormal
This adds a constraint to the KS equations in the form of $X^{*}X = I_{n_e}$
Can be time consuming for large systems
Complexity is $O(N n_e^2)$, where $N$ is the size of the discretization and $n_e$ is
the number of electrons
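One common way to enforce the orthonormality constraint is through the overlap matrix $S = X^{T}X$ and its Cholesky factor; forming $S$ takes $n_e^2$ inner products of length $N$, which is where the $O(N n_e^2)$ cost comes from. The sketch below assumes real-valued vectors and uses hypothetical helper names. It illustrates the technique and is not taken from any particular code.

```python
import math

def overlap(cols):
    """S = X^T X for X stored as a list of n_e column vectors of length N."""
    return [[sum(a * b for a, b in zip(u, v)) for v in cols] for u in cols]

def cholesky(s):
    """Lower-triangular L with S = L L^T (S symmetric positive definite)."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low

def orthonormalize(cols):
    """Return Q = X L^{-T}, so that Q^T Q = I and span(Q) = span(X)."""
    low = cholesky(overlap(cols))
    out = []
    for j, x in enumerate(cols):
        # Column-wise solve: q_j = (x_j - sum_{k<j} L[j][k] q_k) / L[j][j]
        v = list(x)
        for k in range(j):
            v = [vi - low[j][k] * qi for vi, qi in zip(v, out[k])]
        out.append([vi / low[j][j] for vi in v])
    return out

# Three linearly independent vectors of length N = 6 (arbitrary test data)
x_cols = [[1.0] * 6,
          [float(t) for t in range(6)],
          [float(t * t) for t in range(6)]]
q_cols = orthonormalize(x_cols)
```

In practice both the overlap product and the triangular solve are dense matrix operations, so they map onto BLAS3 routines.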
FFTs
Recall that the kinetic energy operator takes on a particularly simple form in
Fourier space (also called G-space)
Most DFT codes take advantage of this fact by converting from real space to
G-space for computation of the Hamiltonian
Since systems are usually 3D, codes need to compute the 3D FFTs through a
series of 1D FFTs
This has a consequence both in the total amount of work and when trying to
parallelize the codes
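As an illustration of why G-space is convenient, the sketch below applies the kinetic operator $-\frac{1}{2}\,d^2/dx^2$ to a function on a 1D periodic grid by multiplying its Fourier coefficients by $G^2/2$. It is a one-dimensional toy with a naive $O(n^2)$ transform written out so the example is self-contained; a real code would use an optimized 3D FFT library, and the function names are assumptions.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive discrete Fourier transform; sign=+1 is the unnormalized inverse."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * t * k / n)
                for t in range(n))
            for k in range(n)]

def apply_kinetic(psi, length):
    """Apply -1/2 d^2/dx^2 on a periodic grid of the given length."""
    n = len(psi)
    coeffs = dft(psi)                          # real space -> G-space
    scaled = []
    for k, c in enumerate(coeffs):
        m = k if k <= n // 2 else k - n        # signed frequency index
        g = 2.0 * math.pi * m / length         # wave number G
        scaled.append(0.5 * g * g * c)         # kinetic operator is diagonal
    return [v.real / n for v in dft(scaled, sign=+1)]   # back to real space

n, length = 16, 2.0 * math.pi
psi = [math.sin(3.0 * length * t / n) for t in range(n)]
t_psi = apply_kinetic(psi, length)             # should equal (3^2 / 2) * psi
```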
Computation of potentials
The Hartree potential, $V_{Hartree} = \int \frac{\rho(r')}{|r - r'|}\,dr'$, can be computed in several ways
The calculation can be posed as the solution of a Poisson problem.
Fast Poisson solvers or multigrid can also be used
Because the potential can be viewed as a convolution, it can also be computed
using FFTs
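The FFT route can be sketched as follows: in Fourier space the Poisson equation $\nabla^2 V = -4\pi\rho$ becomes $\hat{V}(G) = 4\pi\hat{\rho}(G)/G^2$. The example below is a 1D periodic toy under stated assumptions: the $G = 0$ term is dropped, which corresponds to assuming a neutralizing background, and a naive transform stands in for a real FFT. Function names are hypothetical.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive discrete Fourier transform; sign=+1 is the unnormalized inverse."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * t * k / n)
                for t in range(n))
            for k in range(n)]

def hartree_potential(rho, length):
    """Solve -V'' = 4*pi*rho on a periodic grid by division in G-space."""
    n = len(rho)
    rho_g = dft(rho)
    v_g = [0j] * n                             # G = 0 component left at zero
    for k in range(1, n):
        m = k if k <= n // 2 else k - n        # signed frequency index
        g = 2.0 * math.pi * m / length
        v_g[k] = 4.0 * math.pi * rho_g[k] / (g * g)
    return [v.real / n for v in dft(v_g, sign=+1)]

n, length = 16, 2.0 * math.pi
rho = [math.cos(2.0 * length * t / n) for t in range(n)]
v_h = hartree_potential(rho, length)           # should equal pi * rho here
```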
Parallel Calculations Milestones
1991: Silicon surface reconstruction (7x7), Meiko I860, 64 processors (Stich, Payne,
King-Smith, Lin, Clarke)
1998: FeMn alloys (exchange bias), Cray T3E, 1500 processors; first > 1 Tflop
simulation, Gordon Bell prize (Ujfalussy, Stocks, Canning, Y. Wang, Shelton
et al.)
2005: 1000-atom molybdenum simulation with Qbox, BlueGene/L at LLNL with
32,000 processors (F. Gygi et al.)
2008: Band-gap calculation of a 13,824-atom ZnTeO alloy proposed as a new solar
cell material. Used 131,072 processors on Blue Gene/P at ANL and achieved
107.5 Tflops/s
Parallelization Strategies
Parallel across k-points: not useful for large systems, as the number of k-points is usually small
Parallel over electrons: number of processors limited by number of electrons
Parallel over the plane-wave basis, $n_g$: most commonly used in
plane-wave codes
Parallelization of DFT codes is nontrivial and most codes cannot scale to
large numbers of processors with even moderate efficiencies.
30% parallel efficiency is usually considered very good
Parallelization issues for Hartree-Fock codes are similar, especially for SCF
Parallelization of 3D FFT
3D FFTs are computed via 3 sets of 1D
FFTs and 2 transposes
Most of the communication is in the global
transpose, (b) to (c) in the figure
Ratio of flops/comm $\sim \log N$
Many FFTs are computed at the same
time to avoid latency issues
Only non-zero elements
computed/communicated
For details see (Canning et al.):
http://www.nersc.gov/projects/paratec/
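The "three sets of 1D FFTs and two transposes" structure can be written down serially as a sketch. The code below is an illustration under assumptions, not the PARATEC implementation: it uses nested Python lists, a naive 1D transform, and a cyclic axis rotation to play the role of the global transpose. It does not model the communication itself or the optimization of skipping zero elements. Because only two transposes are performed, the result is left in a permuted layout, indexed `[kz][kx][ky]`.

```python
import cmath
import math

def dft(x):
    """Naive 1D discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * t * k / n) for t in range(n))
            for k in range(n)]

def fft_last_axis(a):
    """One set of 1D transforms: transform every line along the last axis."""
    return [[dft(row) for row in plane] for plane in a]

def rotate(a):
    """Transpose step: b[j][k][i] = a[i][j][k]."""
    n1, n2, n3 = len(a), len(a[0]), len(a[0][0])
    return [[[a[i][j][k] for i in range(n1)] for k in range(n3)]
            for j in range(n2)]

def fft3d(a):
    a = fft_last_axis(a)       # 1D transforms along z; layout (x, y, z)
    a = rotate(a)              # transpose 1; layout (y, z, x)
    a = fft_last_axis(a)       # 1D transforms along x
    a = rotate(a)              # transpose 2; layout (z, x, y)
    return fft_last_axis(a)    # 1D transforms along y; indexed [kz][kx][ky]

n1, n2, n3 = 2, 3, 4
data = [[[float((i + 2 * j + 3 * k) % 5) for k in range(n3)]
         for j in range(n2)] for i in range(n1)]
result = fft3d(data)
```

In a distributed setting each rank would own a slab or pencil of the grid, and each `rotate` would become an all-to-all exchange, which is why the transposes dominate the communication cost.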
Linear Scaling Electronic Structure Methods
Goal is to reduce the computational work from $O(N^3)$ to $O(N)$
Quantum mechanical effects are near-sighted, e.g. treat the computation of
the exchange-correlation potential locally
Need to introduce the concept of a localization region: the quantity of interest
is computed inside the region and assumed to vanish outside it
Six strategies for taking advantage of this fact (see Goedecker (1999)):
1. Fermi operator expansion
2. Fermi operator projection
3. Divide-and-conquer
4. Density-matrix minimization
5. Orbital minimization approach
6. Optimal basis density-matrix minimization
LS3DF
Based on Divide-and-Conquer approach
Divide a large system into smaller sub-domains that can be solved
independently, then stitch the sub-domains back together again
Classical electrostatic interactions are long-ranged, i.e. solve one global
Poisson equation
Requires minimal communication between the sub-domains
Artificial boundary effects due to sub-dividing domains can be cancelled out
Based on ideas from fragment molecular method
We call our method Linear Scaling 3D Fragment or LS3DF [1]
[1] L.W. Wang, Z. Zhao, J. Meza, LBNL-61691 (2006)
Parallelism Issues
Figure: IBM Cell Blade. Same processor as found in
a Sony Playstation 3
Multi-core and many-core is the
wave of the future
Current algorithms are difficult to parallelize with high
efficiency
Many quantum chemistry codes do
not parallelize well for even medium-scale
parallelism
Electronic Structure Codes
ABINIT www.abinit.org
PARATEC www.nersc.gov/projects/paratec
PEtot hpcrd.lbl.gov/linwang/PEtot/PEtot.html
PWscf www.pwscf.org
NWChem www.emsl.pnl.gov/docs/nwchem/nwchem.html
Q-Chem www.q-chem.com/
Quantum Espresso www.quantum-espresso.org
Socorro dft.sandia.gov/Socorro
VASP cms.mpi.univie.ac.at/vasp
Many, many more; apologies if your favorite code was not listed
KSSOLV Matlab package
KSSOLV: Matlab code for solving the Kohn-Sham equations
Open source package
Handles SCF, DCM, Trust Region
Example problems to get started with
Object-oriented design - easy to extend
Good starting point for students
Beta version of KSSOLV available, ask one of us for more information!
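Example: SiH$_4$
% Atom types and atom list (one Si, four H)
a1 = Atom('Si');
a2 = Atom('H');
alist = [a1 a2 a2 a2 a2];
% Atomic coordinates, one row per atom (remaining rows elided on the slide)
xyzlist= [
0.0 0.0 0.0
1.61 1.61 1.61
... ];
% Build the molecule; C is the supercell, defined elsewhere
mol = Molecule();
mol = set(mol,'supercell',C);
mol = set(mol,'atomlist',alist);
mol = set(mol,'xyzlist' ,xyzlist);
mol = set(mol,'ecut', 25);      % kinetic energy cutoff
mol = set(mol,'name','SiH4');
...
isosurface(rho);                % plot the computed charge density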
Convergence
[Etot, X, vtot, rho] = scf(mol);
[Etot, X, vtot, rho] = dcm(mol);
Charge Density
isosurface(rho);
Example: Pt$_6$Ni$_2$O
cell:
19.59 0.0 0.0
...
sampling size: n1 = 96, n2 = 48, n3 = 48
atoms and coordinates:
1 Pt 1.3 -0.180 -0.015
...
7 Ni 8.4 0.003 3.069
8 Ni 8.5 7.998 7.762
9 O 14.9 2.644 1.511
number of electrons : 86
spin type : 1
kinetic energy cutoff: 60.0
Comparison of DCM vs. SCF
Summary
Described most common PW DFT computational components
Overview of standard numerical methods used
Brief introduction into some parallelization issues
Listed some computational challenges
Introduced KSSOLV, Matlab package for solving KS equations
References
Aron J. Cohen, Paula Mori-Sánchez, Weitao Yang, Insights into Current
Limitations of Density Functional Theory, Science, Vol. 321, no. 5890, pp.
792-794 (2008).
F. Gygi, R. K. Yates, J. Lorenz, E. W. Draeger, F. Franchetti, C. W.
Ueberhuber, B. R. de Supinski, S. Kral, J. A. Gunnels, J. C. Sexton,
Proceedings of the 2005 ACM/IEEE conference on Supercomputing (2005).
S. Goedecker, Linear Scaling Electronic Structure Methods, Rev. Mod. Phys.
71, 1085 (1999).
Curtis L. Janssen and Ida M.B. Nielsen, Parallel Computing in Quantum
Chemistry, CRC Press (2008).