You are on page 1of 38

AMS526: Numerical Analysis I (Numerical Linear Algebra)

Lecture 23: Krylov Subspace Methods for Asymmetric Systems; Stationary Iterative Methods; Multigrid Method Xiangmin Jiao
SUNY Stony Brook

Xiangmin Jiao

Numerical Analysis I

1 / 38

Outline

Krylov Subspace Methods for Asymmetric Systems Smoothing Eect of Stationary Iterative Methods Multigrid Method Motivation Key Ideas of Multigrid More Advanced Topics

Xiangmin Jiao

Numerical Analysis I

2 / 38

Minimizing Residual
CG only works for SPD matrices. It minimizes x x 1 T (x ) = 2 x Ax x T b
A

and

There have been many proposed extensions to nonsymmetric matrices, GMRES, BiCG, etc. GMRES (Generalized Minimal RESiduals) is one of most well known The basic idea of GMRES is to nd xn Kn that minimizes rn = b Axn This can be viewed as a least squares problem: Find a vector c s.t. AKn c b is minimized, where Kn is the m n Krylov matrix composed of basis vectors of Kn , and AKn = Ab A2 b An b

Orthogonal basis is often used, produced by Arnoldi iteration


Xiangmin Jiao Numerical Analysis I 3 / 38

Review: Arnoldi Iteration


Let Qn = [q1 | q2 | | qn ] be m n matrix with rst n columns of Q n be (n + 1) n upper-left section of H and H 1 Start by picking a random q1 and then determine q2 and H n can be written as The nth columns of AQn = Qn+1 H Aqn = h1n q1 + + hnn qn + hn+1,n qn+1 Algorithm: Arnoldi Iteration given random nonzero b, let q1 = b/ b for n = 1 to 1, 2, 3, . . . v = Aqn for j = 1 to n hjn = qj v v = v hjn qj hn+1,n = v qn+1 = v /hn+1,n
Xiangmin Jiao Numerical Analysis I 4 / 38

Minimal Residual with Orthogonal Basis


Let Qn be Krylov matrix whose columns q1 , q2 , . . . span the successive Krylov subspaces Instead of nd xn = Kn c , nd xn = Qn y which minimizes AQn y b n , so For Arnoldi iteration, we showed that AQn = Qn+1 H n y b = minimum Qn+1 H
Left multiplication by Qn +1 does not change the norm, so n y Qn H +1 b = minimum b = b e , so Finally, by construction, Qn 1 +1

n y b e1 = minimum. H

Xiangmin Jiao

Numerical Analysis I

5 / 38

The GMRES Algorithm

Algorithm: GMRES q1 = b/ b for n = 1 to 1, 2, 3, . . . Step n of Arnoldi iteration n y b e1 = rn Find y to minimize H xn = Qn y The residual rn does not need to be computed explicitly from xn

Xiangmin Jiao

Numerical Analysis I

6 / 38

The GMRES Algorithm

Least squares problem has Hessenberg structure, solved with QR n factorization of H n is constructed from scratch, then it costs If QR factorization of H 2 O (n ) ops, due to Hessenberg structure n can be updated from that of H n 1 , However, QR factorization of H using Givens rotation within O (n) work However, memory and cost grow with n. In practice, restart the algorithm by clearing accumulated data. This might stagnate the method

Xiangmin Jiao

Numerical Analysis I

7 / 38

GMRES and Polynomial Approximation


GMRES can be interpreted as nding polynomial pn Pn for n = 1, 2, 3, . . . where Pn = {polynomial p of degree n with p (0) = 1} such that pn (A)b is minimized Note that r = b AKn c , where AKn c = c1 A + c2 A2 + + cn An b and r = 1 c1 A c2 A2 cn An b. In other words, pn (z ) = 1 z (c1 + c2 z + + cn1 z n1 ) Invariance of GMRES
Scale invariance: If we change A A and b b , then rn rn Invariance under unitary similarity transformations: If change A UAU for some unitary matrix U and b Ub , then rn U rn
Xiangmin Jiao Numerical Analysis I 8 / 38

Convergence of GMRES

GMRES converges monotonically and it converges after at most m steps Based on a polynomial analysis, diagonalizable A = V V 1 converges as rn (V ) inf sup |pn (i )| pn Pn (A) b i In other words, if A is not far from normal (i.e., eigenvectors are nearly orthogonal), and if properly normalized degree n polynomials can be found whose size on the spectrum (A) decreases quickly with n, then GMRES converges quickly

Xiangmin Jiao

Numerical Analysis I

9 / 38

Other Krylov Subspace Methods


CG on the Normal Equations (CGN)
Solve A Ax = A b using Conjugate Gradients Poor convergence due to squared condition number (i.e., (A A) = (A)2 ) One advantage is that it applies least squares problems without modication

BiConjugate Gradients (BCG/BiCG)


Makes residuals orthogonal to another Krylov subspace, based on A It can be implemented with three-term recurrences, so memory requirements is smaller Convergence sometimes comparable to GMRES, but unpredictable

Conjugate Gradients Squared (CGS)


Avoids multiplication by A in BCG, sometimes twice as fast convergence as BCG

Quasi-Minimal Residuals (QMR) and Stabilized BiCG (Bi-CGSTAB)


Variants of BiCG with more regular convergence
Xiangmin Jiao Numerical Analysis I 10 / 38

MINRES: Minimum residual


Solve Ax = b or (AsI )x = b. The matrix AsI must be symmetric but it may be denite or indenite or singular Scalar s is a shifting parameter it may be any number. The method is based on Lanczos tridiagonalization MINRES is solving one of the least-squares problems minimize Ax b or (AsI )x b References
C. C. Paige and M. A. Saunders (1975). Solution of sparse indenite systems of linear equations, SIAM J. Numerical Analysis, Vol. 12, 617-629 http://www.stanford.edu/group/SOL/software/minres.html

Xiangmin Jiao

Numerical Analysis I

11 / 38

LSQR
It solves Ax = b, or minimizes Ax b Ax b 2 + 2 x 2
2

or minimize

A may be square or rectangular (over-determined or under-determined), and may have any rank The method is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying MINRES to the normal equation (AT A + 2 I )x = AT b, but has better numerical properties, especially if A is ill-conditioned LSQR reduces r monotonically (where r = bAx if = 0) References:
Paige, C. C. and M. A. Saunders, LSQR: An Algorithm for Sparse Linear Equations And Sparse Least Squares, ACM Trans. Math. Soft., Vol.8, 1982, pp.43-71 http://www.stanford.edu/group/SOL/software/lsqr.html

Xiangmin Jiao

Numerical Analysis I

12 / 38

Outline

Krylov Subspace Methods for Asymmetric Systems Smoothing Eect of Stationary Iterative Methods Multigrid Method Motivation Key Ideas of Multigrid More Advanced Topics

Xiangmin Jiao

Numerical Analysis I

13 / 38

Stationary Iterative Methods

Stationary iterative methods can be interpreted as a xed point iteration obtained by matrix splitting. Let A = M N and rk = b Axk we can obtain xk +1 = M 1 Nxk + M 1 b xk +1 = xk + M 1 rk Dierent choices of splitting will lead to various schemes (1) (2)

Xiangmin Jiao

Numerical Analysis I

14 / 38

Stationary Iterative Methods

These iteration schemes work for a wide range of problems They can often be implemented without forming the matrix explicitly. However, they have slow convergence

Example
For 2D Poisson equation, the spectral radius of Jacobi iteration matrix is cos n 1O The number of iterations required to achieve is O (n2 ln 1 ).
1 n2

Xiangmin Jiao

Numerical Analysis I

15 / 38

Stationary Iterative Methods

After 5 Jacobi iterations on a Poisson equation, error decreases very slowly.


Xiangmin Jiao Numerical Analysis I 16 / 38

Smoothing Eect

The reason behind this phenomenon is the smoothing property of stationary iterative methods This property is one of the foundations for multigrid methods To illustrate the idea we apply iterative methods to the homogeneous system with initial guess vk Au = 0 vk is chosen as (vk )j = sin (Fourier modes)
jk n

,1

n 1, 1

n1

Xiangmin Jiao

Numerical Analysis I

17 / 38

Smoothing Eect
The modes in the lower half of the spectrum, with wavenumbers in the range 1 k < n 2 are called low frequency or smooth modes. The modes in the upper half of the spectrum, with called high frequency modes or oscillatory modes.
n 2

n 1 are

Xiangmin Jiao

Numerical Analysis I

18 / 38

Smoothing Eect

Weighted Jacobi applied on 1-D model problem with 64 points with initial guess v1 ,v3 and v6
Xiangmin Jiao Numerical Analysis I 19 / 38

Smoothing Eect

Gauss Seidel applied on 1-D model problem with 64 points with initial guess v1 ,v3 and v6

Xiangmin Jiao

Numerical Analysis I

20 / 38

Smoothing Eect

Oscillatory modes are eliminated quickly. Smooth modes remain relatively unchanged. Errors for the model problem can be decomposed using these Fourier modes. After several iterations, high frequency components will disappear and the error becomes smooth.

Xiangmin Jiao

Numerical Analysis I

21 / 38

Smoothing Eect

If we project a smooth wave directly onto a coarser grid, it becomes more oscillatory.

Xiangmin Jiao

Numerical Analysis I

22 / 38

Smoothing Eect

What does it imply? If we can move the error to a coarser grid, iterations will be more eective! Even if the error does not become more oscillatory, relaxing on the coarse grid is simply cheaper We may consider using coarse grids

Xiangmin Jiao

Numerical Analysis I

23 / 38

Outline

Krylov Subspace Methods for Asymmetric Systems Smoothing Eect of Stationary Iterative Methods Multigrid Method Motivation Key Ideas of Multigrid More Advanced Topics

Xiangmin Jiao

Numerical Analysis I

24 / 38

Nested Iteration

We can solve problems on coarse grids to obtain better initial guesses: Relax Au = f on a very coarse grid 8h to obtain an initial guess for the next ner grid 4h . Relax Au = f on grid 4h to obtain an initial guess for 2h . Relax Au = f on grid 2h to obtain an initial guess for h . Relax Au = f on h to obtain a nal approximation to the solution.

Xiangmin Jiao

Numerical Analysis I

25 / 38

Correction Scheme

From our previous observations, error becomes smooth after relaxations. If we move the error to a coarser grid, it becomes oscillatory and iterations are eective What problem should we be solving then? The residual equation Ae = f Av = r , where v is approximate solution of u

Xiangmin Jiao

Numerical Analysis I

26 / 38

Correction Scheme

Why residual equation? We want to relax the error directly since it becomes oscillatory on coarse level If we can solve the residual equation accurately then the real solution u can be obtained by u = v + e . Relaxation on the original equation Au = f with arbitrary initial guess v is equivalent to relaxing on the residual equation Ae = r with specic initial guess e = 0

Xiangmin Jiao

Numerical Analysis I

27 / 38

Two-Grid Correction Scheme


The basic form of the multigrid method is dened as the following two-grid correction scheme [Briggs et al., Multigrid Tutorial]: v h MG v h , f h
1

Presmoothing: relax 1 times on Ah u h = f h on h with initial guess v h. Restriction: compute the ne-grid residual r h = f h Ah v h and restrict it to the coarse grid by r 2h = Rr h . Coarse Grid Solving: either solve A2h e 2h = r 2h or relax 3 times with initial guess 0 on 2h . Prolongation: interpolate the coarse-grid error to the ne grid by e h = Pe 2h and correct the ne-grid approximation by v h v h + e h . Postsmoothing: Relax 2 times on Ah u h = f h on h with initial guess v h .
Xiangmin Jiao Numerical Analysis I 28 / 38

Presmoothing

Presmoothing: relax 1 times on Ah u h = f h on h with initial guess v h . We apply 1 steps of iterations on the original linear system This step is known as the presmoothing step After iterations errors e h will become smooth and it will appear oscillatory on 2h We then approximate the residual equation on 2h

Xiangmin Jiao

Numerical Analysis I

29 / 38

Restriction
Restriction: compute the ne-grid residual r h = f h Ah v h and restrict it to the coarse grid by r 2h = Rr h . We restrict the residual onto 2h
h or Restriction operator can be chosen as injection vj2h = v2 j 1 h h + vh full-weighting vj2h = 4 (v2j 1 + 2v2 j 2j + 1 )

Figure: Restriction by full weighting

Xiangmin Jiao

Numerical Analysis I

30 / 38

Solving on coarse level

Coarse Grid Solving: either solve A2h e 2h = r 2h or relax 3 times with initial guess 0 on 2h . We obtain A2h by rediscretizing the PDE on 2h A2h e 2h = r 2h is an approximation of Ah e h = r h on 2h Iterative methods are eective as e 2h becomes oscillatory Iterations are also cheaper as there are less grid points

Xiangmin Jiao

Numerical Analysis I

31 / 38

Prolongation
Prolongation: interpolate the coarse-grid error to the ne grid by e h = Pe 2h and correct the ne-grid approximation by v h v h + e h . After e 2h is obtained, we interpolate it back to h and update error. Prolongation operator can be chosen as linear interpolation 1 2h h = v 2h , v h 2h v2 j j 2j +1 = 2 (vj + vj +1 )

Figure: Prolongation by linear interpolation

Xiangmin Jiao

Numerical Analysis I

32 / 38

Postsmoothing

Postsmoothing: relax 2 times on Ah u h = f h on h with initial guess v h . We apply 2 steps of iterations on the original linear system This step is known as the postmoothing step Errors will be further reduced

Xiangmin Jiao

Numerical Analysis I

33 / 38

Two-Grid Correction Scheme

Iteration on ne grid leaves smooth errors e h and they appear to be oscillatory on coarse grid as e 2h . Iteration on coarse grid then solve e 2h eectively and e 2h will become a good approximation of e h after interpolation. Finally with the correction step v h v h + e h , we will obtain a solution very close to u . Errors which cannot be eliminated eectively by iterations are removed by coarse grid correction.

Xiangmin Jiao

Numerical Analysis I

34 / 38

Toward Multigrids

In our description, we assume e 2h on coarse level is solved accurately Practically a few steps of iterations can not guarantee sucient accuracy of e 2h We may apply two-grid idea recursively on subsequent levels We can recursively solve problems on coarser levels and use them as initial guesses on ne levels

Xiangmin Jiao

Numerical Analysis I

35 / 38

V-cycle and Full Multigrid scheme

Figure: V-cycle and FMG scheme


Xiangmin Jiao Numerical Analysis I 36 / 38

Classical Algebraic Multigrid Method

Smooth errors are dened algebraically as ek +1 ek which leads us to Ae 0 Dene interpolation operator P Coarsening is performed by a greedy maximum independent set algorithm on weighted graph Restriction is chosen as the transpose of interpolation and A2h = P T Ah P Current AMG development focuses on improving coarsening strategy and interpolation formula

Xiangmin Jiao

Numerical Analysis I

37 / 38

Numerical Results: Poisson Equation


Poisson equation discretized using third-order generalized nite dierence method. The resulting matrix is asymmetric Comparison of algebraic multigrid method, MATLABs built in direct solver and GMRES to problems of various sizes AMG TOL = 1010 0.0081 seconds 0.081 seconds 0.95 seconds 11.79 seconds Direct Solver 0.0093 seconds 0.21 seconds 8.14 seconds out of memory GMRES(10) 0.020 seconds 0.28 seconds 7.15 seconds 470.88 seconds

10 10 10 20 20 20 40 40 40 80 80 80

Xiangmin Jiao

Numerical Analysis I

38 / 38