
Introduction to Simulation - Lecture 21

Boundary Value Problems - Solving 3-D Finite-Difference Problems

Jacob White

Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy

Outline

Reminder about FEM and F-D
  1-D example
Finite difference matrices in 1, 2 and 3D
Gaussian elimination costs
Krylov method
  Communication lower bound
  Preconditioners based on improving communication

Heat Flow

1-D Example - Normalized 1-D Equation

The temperature equation for the bar normalizes to a 1-D Poisson equation:

  ∂²T(x)/∂x² = -h_s(x)   →   -∂²u(x)/∂x² = f(x)

or, in subscript notation,

  -u_xx(x) = f(x)

SMA-HPC 2002 MIT


FD Matrix
properties

1-D Poisson - Finite Differences

Grid: x_0  x_1  x_2  ...  x_n  x_{n+1},  with u_0 = u_{n+1} = 0.

Centered-difference approximation of -u_xx = f at node j:

  -( u_{j+1} - 2 u_j + u_{j-1} ) / Δx² = f(x_j)

In matrix form:

          [  2  -1   0  ...   0 ] [ u_1 ]   [ f(x_1) ]
    1     [ -1   2  -1  ...   0 ] [  .  ]   [   .    ]
   ---    [  .   .   .   .    . ] [  .  ] = [   .    ]
   Δx²    [  0  ... -1   2  -1  ] [  .  ]   [   .    ]
          [  0  ...  0  -1   2  ] [ u_n ]   [ f(x_n) ]
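The tridiagonal system above is easy to build and sanity-check numerically. A minimal NumPy sketch (the grid size and the choice f(x) = π² sin(πx), whose exact solution of -u'' = f is u = sin(πx), are illustrative assumptions):

```python
import numpy as np

def poisson_1d(n):
    """Build the 1-D finite-difference Poisson matrix (1/dx^2) * tridiag(-1, 2, -1)."""
    dx = 1.0 / (n + 1)
    A = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / dx**2
    return A

# Solve -u'' = f with f = pi^2 sin(pi x); the exact solution is u = sin(pi x)
n = 100
x = np.linspace(0, 1, n + 2)[1:-1]              # interior grid points
f = np.pi**2 * np.sin(np.pi * x)
u = np.linalg.solve(poisson_1d(n), f)
print(np.max(np.abs(u - np.sin(np.pi * x))))    # O(dx^2) discretization error
```

The discretization error shrinks by 4x each time the grid spacing is halved, as expected for a centered second difference.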

Using Basis
Functions

Residual Equation

Partial differential equation form:

  -∂²u/∂x² = f,   u(0) = 0,  u(1) = 0

Basis function representation:

  u(x) ≈ u_h(x) = Σ_{i=1}^n α_i φ_i(x),   φ_i = basis functions

Plug the basis function representation into the equation:

  R(x) = Σ_{i=1}^n α_i ( d²φ_i(x)/dx² ) + f(x)

Using Basis
Functions

Basis Weights - Galerkin Scheme

Force the residual to be orthogonal to the basis functions:

  ∫_0^1 φ_l(x) R(x) dx = 0,   l ∈ {1, ..., n}

This generates n equations in n unknowns:

  ∫_0^1 φ_l(x) [ Σ_{i=1}^n α_i ( d²φ_i(x)/dx² ) + f(x) ] dx = 0,   l ∈ {1, ..., n}

Using Basis
Functions

Basis Weights - Galerkin with Integration by Parts

Integrating by parts (the boundary terms vanish since φ_l(0) = φ_l(1) = 0) leaves only first derivatives of the basis functions:

  -∫_0^1 ( dφ_l(x)/dx ) ( Σ_{i=1}^n α_i dφ_i(x)/dx ) dx + ∫_0^1 φ_l(x) f(x) dx = 0,   l ∈ {1, ..., n}
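With piecewise-linear "hat" basis functions on a uniform grid, the integrals above have closed forms and the Galerkin system reduces to the same tridiagonal matrix as finite differences. A sketch under those assumptions (the lumped load b_l ≈ Δx f(x_l) is an approximation of ∫ φ_l f dx):

```python
import numpy as np

def fem_hat_system(n, f):
    """Assemble the Galerkin system for -u'' = f with piecewise-linear
    hat functions on a uniform grid with n interior nodes."""
    dx = 1.0 / (n + 1)
    # Stiffness: integral of phi_l' * phi_i' dx  ->  tridiag(-1, 2, -1) / dx
    K = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / dx
    x = dx * np.arange(1, n + 1)
    b = dx * f(x)          # lumped approximation of integral phi_l(x) f(x) dx
    return K, b, x

K, b, x = fem_hat_system(100, lambda x: np.pi**2 * np.sin(np.pi * x))
alpha = np.linalg.solve(K, b)      # nodal values: u_h(x_i) = alpha_i
print(np.max(np.abs(alpha - np.sin(np.pi * x))))
```

Note that K/dx equals the finite-difference matrix (1/Δx²) tridiag(-1, 2, -1), so the two discretizations coincide here.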

Structural Analysis of Automobiles

Equations:
  Force-displacement relationships for mechanical elements (plates, beams, shells) and sum of forces = 0.
  Partial differential equations of continuum mechanics.

Drag Force Analysis of Aircraft

Equations:
  Navier-Stokes partial differential equations.

Engine Thermal Analysis

Equations:
  The Poisson partial differential equation.

FD Matrix
properties

2-D Discretized Problem - Discretized Poisson

Grid numbering: x_1 ... x_m along the first row, x_{m+1} starting the second row, ..., x_{2m}, and so on; node x_j has horizontal neighbors x_{j-1}, x_{j+1} and vertical neighbors x_{j-m}, x_{j+m}.

  -( u_{j+1} - 2u_j + u_{j-1} ) / (Δx)²  -  ( u_{j+m} - 2u_j + u_{j-m} ) / (Δy)²  =  f(x_j)

(the first difference quotient approximates u_xx, the second u_yy)

FD Matrix
properties

2-D Discretized Problem - Matrix Nonzeros, 5x5 grid example

[figure: pentadiagonal nonzero pattern - main diagonal, first off-diagonals, and off-diagonals at distance m = 5]

FD Matrix
properties

3-D Discretization - Discretized Poisson

Node x_j has neighbors x_{j-1}, x_{j+1} in x, x_{j-m}, x_{j+m} in y, and x_{j-m²}, x_{j+m²} in z:

  -( u_{j+1} - 2u_j + u_{j-1} ) / (Δx)²
  -( u_{j+m} - 2u_j + u_{j-m} ) / (Δy)²
  -( u_{j+m²} - 2u_j + u_{j-m²} ) / (Δz)²  =  f(x_j)

(the three difference quotients approximate u_xx, u_yy and u_zz)

FD Matrix
properties

3-D Discretization - Matrix nonzeros, m = 4 example

[figure: 7-diagonal nonzero pattern - main diagonal, off-diagonals at distances 1, m, and m²]

FD Matrix
properties

Summary - Numerical Properties

Matrix is irreducibly diagonally dominant:

  |A_ii| ≥ Σ_{j≠i} |A_ij|

Each row is strictly diagonally dominant, or path-connected to a strictly diagonally dominant row.

Matrix is symmetric positive definite.

Assuming uniform discretization, the (scaled) diagonal entries are:

  1-D: Δ² A_ii = 2,   2-D: Δ² A_ii = 4,   3-D: Δ² A_ii = 6

FD Matrix
properties

Summary - Structural Properties

Matrices in 3-D are LARGE:
  1-D: m x m,   2-D: m² x m²,   3-D: m³ x m³
  A 100x100x100 grid in 3-D gives a 1 million x 1 million matrix.

Matrices are very sparse. Nonzeros per row:
  1-D: ≤ 3,   2-D: ≤ 5,   3-D: ≤ 7

Matrices are banded:
  1-D: A_ij = 0 for |i - j| > 1
  2-D: A_ij = 0 for |i - j| > m
  3-D: A_ij = 0 for |i - j| > m²

GE Basics

Triangularizing - Picture

  [ A11 A12 A13 A14 ]        [ A11 A12 A13 A14 ]
  [ A21 A22 A23 A24 ]   →    [  0  A22 A23 A24 ]
  [ A31 A32 A33 A34 ]        [  0   0  A33 A34 ]
  [ A41 A42 A43 A44 ]        [  0   0   0  A44 ]

(each elimination step zeros one column below the diagonal and updates the remaining entries)

GE Basics

Triangularizing - Algorithm

For i = 1 to n-1 {                  For each pivot row
    For j = i+1 to n {              For each row below the pivot
        For k = i+1 to n {          For each element beyond the pivot
            A_jk ← A_jk - (A_ji / A_ii) * A_ik      A_ji / A_ii is the multiplier
        }
    }
}

Work:
  Form n-1 reciprocals (pivots)
  Form Σ_{i=1}^{n-1} (n - i) ≈ n²/2 multipliers
  Perform Σ_{i=1}^{n-1} (n - i)² ≈ n³/3 multiply-adds (≈ 2n³/3 flops)
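The triple loop above translates directly into code. A minimal dense-GE sketch (no pivoting, which is safe for the diagonally dominant FD matrices here; the 3x3 test matrix is illustrative):

```python
import numpy as np

def gaussian_eliminate(A, b):
    """Row-reduce A to upper-triangular form, then back-substitute.
    No pivoting: fine for diagonally dominant FD matrices."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for i in range(n - 1):                  # for each pivot
        for j in range(i + 1, n):           # each row below the pivot
            mult = A[j, i] / A[i, i]        # multiplier A_ji / A_ii
            A[j, i + 1:] -= mult * A[i, i + 1:]
            b[j] -= mult * b[i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):          # back-substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2.0, -1, 0], [-1, 2, -1], [0, -1, 2]])
b = np.array([1.0, 0, 1])
print(gaussian_eliminate(A, b))    # matches np.linalg.solve(A, b)
```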

GE Basics

Complexity of GE

  1-D: O(n³) = O(m³);    100-point grid → O(10⁶) ops
  2-D: O(n³) = O(m⁶);    100x100 grid → O(10¹²) ops
  3-D: O(n³) = O(m⁹);    100x100x100 grid → O(10¹⁸) ops

For 2-D and 3-D problems we need a faster solver!

GE Basics

Triangularizing - Banded GE Algorithm

For a matrix with bandwidth b, all nonzeros lie within b of the diagonal, so elimination only needs to touch entries inside the band:

For i = 1 to n-1 {
    For j = i+1 to min(i+b-1, n) {
        For k = i+1 to min(i+b-1, n) {
            A_jk ← A_jk - (A_ji / A_ii) * A_ik
        }
    }
}

Work: Σ_{i=1}^{n-1} ( min(b-1, n-i) )² = O(b²n) multiply-adds
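The band-limited loops can be checked against a dense solve. A sketch assuming all nonzeros lie within |i - j| < bw of the diagonal (the 1-D Poisson matrix, with bw = 2, is the test case):

```python
import numpy as np

def banded_ge_solve(A, b, bw):
    """Gaussian elimination that only updates entries within the band
    (A_ij = 0 for |i - j| >= bw is assumed), so work is O(bw^2 * n)."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for i in range(n - 1):
        hi = min(i + bw, n)                 # band edge: entries beyond are zero
        for j in range(i + 1, hi):
            mult = A[j, i] / A[i, i]
            A[j, i:hi] -= mult * A[i, i:hi]
            b[j] -= mult * b[i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):          # back-substitution within the band
        hi = min(i + bw, n)
        x[i] = (b[i] - A[i, i + 1:hi] @ x[i + 1:hi]) / A[i, i]
    return x

# 1-D Poisson matrix has bandwidth 2 (entries with |i - j| > 1 are zero)
n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = banded_ge_solve(A, b, bw=2)
print(np.max(np.abs(A @ x - b)))
```

Fill-in during banded elimination stays inside the band, which is why the band edge `hi` never needs to grow.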

GE Basics

Complexity of Banded GE

  1-D: b = O(1),  O(b²n) = O(m);    100-point grid → O(10²) ops
  2-D: b = m,     O(b²n) = O(m⁴);   100x100 grid → O(10⁸) ops
  3-D: b = m²,    O(b²n) = O(m⁷);   100x100x100 grid → O(10¹⁴) ops

For 3-D problems we need a faster solver!

The World According to Krylov

Preconditioning

Start with Ax = b; form PAx = Pb.

Determine the Krylov subspace from the initial residual r⁰ = Pb - PAx⁰:

  Krylov subspace = span{ r⁰, PAr⁰, ..., (PA)^k r⁰ }

Select the solution from the Krylov subspace:

  x^{k+1} = x⁰ + y^k,   y^k ∈ span{ r⁰, PAr⁰, ..., (PA)^k r⁰ }

GCR picks the residual-minimizing y^k.
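A minimal GCR sketch along these lines (P defaults to the identity; the diagonal preconditioner used at the bottom anticipates the next slide; tolerances and sizes are illustrative):

```python
import numpy as np

def gcr(A, b, P=None, tol=1e-10, max_iter=200):
    """Minimal GCR: minimize the residual of (PA)x = Pb over the
    growing Krylov subspace. P = None means no preconditioner."""
    n = len(b)
    apply_P = (lambda v: v) if P is None else (lambda v: P @ v)
    x = np.zeros(n)
    r = apply_P(b)                     # r0 = Pb - PAx0, with x0 = 0
    ps, Aps = [], []                   # search directions and their PA-images
    for k in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        p = r.copy()
        Ap = apply_P(A @ p)
        for pj, Apj in zip(ps, Aps):   # orthogonalize Ap against earlier Apj
            beta = Ap @ Apj
            p -= beta * pj
            Ap -= beta * Apj
        nrm = np.linalg.norm(Ap)
        if nrm < 1e-14:
            break                      # Krylov space exhausted
        p /= nrm
        Ap /= nrm
        alpha = r @ Ap                 # residual-minimizing step length
        x += alpha * p
        r -= alpha * Ap
        ps.append(p)
        Aps.append(Ap)
    return x

A = 2 * np.eye(20) - np.eye(20, k=1) - np.eye(20, k=-1)
b = np.ones(20)
x = gcr(A, b, P=np.diag(1 / np.diag(A)))   # diagonal preconditioner
print(np.max(np.abs(A @ x - b)))
```

Storing every direction makes each iteration more expensive than the last; practical codes restart or truncate.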

Krylov Methods

Preconditioning - Diagonal Preconditioners

Let A = D + A_nd (diagonal plus nondiagonal part), and apply GCR to

  D⁻¹A x = ( I + D⁻¹A_nd ) x = D⁻¹b

The inverse of a diagonal is cheap to compute, and this usually improves convergence.

Krylov Methods

Convergence Analysis - Optimality of the GCR Polynomial

GCR optimality property:

  ‖r^{k+1}‖ ≤ ‖ρ̃_{k+1}(PA) r⁰‖,  where ρ̃_{k+1} is any (k+1)st-order polynomial such that ρ̃_{k+1}(0) = 1

Therefore any polynomial which satisfies the constraints can be used to get an upper bound on ‖r^{k+1}‖ / ‖r⁰‖.

Residual polynomial picture for the heat-conducting-bar matrix, no loss to air (n = 10):

  [figure: residual polynomial ρ_k evaluated at the eigenvalues λ_i]

Keep ρ_k(λ_i) as small as possible: easier if the eigenvalues are clustered!

The World According to Krylov

Krylov Vectors for the Diagonally Preconditioned A - 1-D Discretized PDE

[figure: x_exact (1 digit) = (0.9, -0.7, 0.6, -0.5, 0.4, -0.3, 0.1); b = r⁰; successive Krylov vectors D⁻¹A r⁰, (D⁻¹A)² r⁰, ... each reach one grid point further than the last]

  D⁻¹A =  [  1   -0.5    0   ...    0  ]
          [ -0.5   1   -0.5  ...    0  ]
          [  0   -0.5    1   ...    .  ]
          [  .     .     .    .   -0.5 ]
          [  0    ...    0  -0.5    1  ]

The World According to Krylov

Krylov Vectors for the Diagonally Preconditioned A - Communication Lower Bound

Communication lower bound for m grid points: if r⁰ = e₁ (nonzero only in the first entry), then (D⁻¹A)^k r⁰ becomes nonzero in the mth entry only after k = m iterations. Since

  x^{k+1} = x⁰ + Σ_{j=0}^k α_j (D⁻¹A)^j r⁰

at least m iterations are needed before x_m^{k+1} can change: each multiplication by D⁻¹A moves information by only one grid point.
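The one-grid-point-per-iteration behavior is easy to observe: repeatedly applying D⁻¹A to e₁ makes exactly one more entry nonzero each time. A small sketch (the grid size m = 10 is arbitrary):

```python
import numpy as np

m = 10
A = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
DinvA = A / 2.0                     # diagonal of A is 2, so D^{-1} A = A / 2

v = np.zeros(m)
v[0] = 1.0                          # r0 = e1: nonzero only in the first entry
for k in range(m):
    print(k, np.count_nonzero(v))   # each multiply spreads one more entry
    v = DinvA @ v

# After k products, entries beyond index k are still exactly zero:
# the Krylov iterate cannot reach grid point m until k = m - 1 products.
```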

The World According to Krylov

Krylov Vectors for the Diagonally Preconditioned A - Two-Dimensional Case

For an m x m grid (m² unknowns), if r⁰ = [1, 0, ..., 0]ᵀ, it takes 2m = O(m) iterations before x^{k+1}_{m²} (the opposite corner of the grid) can be affected.

The World According to Krylov

Convergence for GCR - Eigenanalysis

For the 1-D discretized PDE, D⁻¹A is the tridiagonal matrix with 1 on the diagonal and -0.5 on the off-diagonals. Recall its eigenvalues:

  λ_k( D⁻¹A ) = 1 - cos( kπ/(m+1) ),   k = 1, ..., m
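The eigenvalue formula can be checked directly against numerically computed eigenvalues (m = 8 is an arbitrary small size):

```python
import numpy as np

m = 8
A = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
DinvA = A / 2.0                     # symmetric here, since D = 2I

computed = np.sort(np.linalg.eigvalsh(DinvA))
formula = np.sort(1 - np.cos(np.arange(1, m + 1) * np.pi / (m + 1)))
print(np.max(np.abs(computed - formula)))   # agreement to machine precision
```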

The World According to Krylov

Convergence for GCR - Eigenanalysis

For D⁻¹A:

  λ_max / λ_min = ( 1 - cos( mπ/(m+1) ) ) / ( 1 - cos( π/(m+1) ) ) = O(m²)

Number of iterations for GCR to achieve ‖r^k‖ / ‖r⁰‖ ≤ convergence tolerance ε (using the standard condition-number bound with κ = λ_max / λ_min):

  k ≈ log ε / log( (√κ - 1)/(√κ + 1) ) = O(√κ) = O(m)

GCR achieves the communication lower bound O(m)!

The World According to Krylov

Work for Banded Gaussian Elimination, Sparse GE and GCR

  Dimension | Banded GE | Sparse GE | GCR
  ----------|-----------|-----------|------
  1-D       | O(m)      | O(m)      | O(m²)
  2-D       | O(m⁴)     | O(m³)     | O(m³)
  3-D       | O(m⁷)     | O(m⁶)     | O(m⁴)

GCR is faster than banded GE in 2 and 3 dimensions.

It could be faster still: the 3-D matrix has only m³ nonzeros.
Must defeat the communication lower bound!

The World According to Krylov

How to Get Faster-Converging GCR

Preconditioning is the only hope:
  GCR already achieves the communication lower bound for a diagonally preconditioned A.
  The preconditioner must accelerate communication: multiplying by PA must move values by more than one grid point.

Preconditioning Approaches

Gauss-Seidel Preconditioning - Physical View

1-D discretized PDE, one Gauss-Seidel relaxation sweep (left to right):

  u_0   u_1^(new)   u_2^(old)   u_3^(old)   ...        (u_1 computed from u_0 and u_2^(old))
  u_0   u_1^(new)   u_2^(new)   u_3^(old)   ...        (u_2 computed from u_1^(new) and u_3^(old))
  ...
  ...   u_{n-1}^(new)   u_n^(new)   u_{n+1}            (sweep reaches the right boundary)

Each iteration of Gauss-Seidel relaxation moves data across the grid.

Preconditioning Approaches

Gauss-Seidel Preconditioning - Krylov Vectors

With the preconditioner P = (D + L)⁻¹, the Krylov vectors for the 1-D discretized PDE are (D+L)⁻¹b, (D+L)⁻¹A (D+L)⁻¹b, ...

[figure: x_exact (1 digit) = (0.9, -0.7, 0.6, -0.5, 0.4, -0.3, 0.1); nonzero pattern (X = nonzero) of successive Krylov vectors - each vector fills in all the way to the right immediately, but grows only one entry to the left per iteration]

Gauss-Seidel communicates quickly in only one direction.

Preconditioning Approaches

Gauss-Seidel Preconditioning - Symmetric Gauss-Seidel

Forward sweep (left to right):

  u_0   u_1^(new)   u_2^(old)   ...   →   ...   u_{n-1}^(new)   u_n^(new)   u_{n+1}

Backward sweep (right to left):

  u_0   ...   u_{n-2}^(new)   u_{n-1}^(newer)   u_n^(newer)   u_{n+1}   →   ...   u_2^(newer)   u_1^(newer)

This symmetric Gauss-Seidel preconditioner communicates in both directions.

Preconditioning Approaches

Gauss-Seidel Preconditioning - Symmetric Gauss-Seidel

Derivation of the SGS iteration equations:

  Forward sweep (half step):   ( D + L ) x^{k+1/2} + U x^k = b
  Backward sweep (half step):  ( D + U ) x^{k+1} + L x^{k+1/2} = b

Eliminating x^{k+1/2}:

  x^{k+1} = (D+U)⁻¹ L (D+L)⁻¹ U x^k + (D+U)⁻¹ b - (D+U)⁻¹ L (D+L)⁻¹ b

Equivalently, in preconditioned-residual form:

  x^{k+1} = x^k - (D+U)⁻¹ D (D+L)⁻¹ A x^k + (D+U)⁻¹ D (D+L)⁻¹ b
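The preconditioned-residual form above can be applied with two triangular solves per iteration. A sketch (dense `np.linalg.solve` stands in for the triangular sweeps; problem size and iteration count are illustrative):

```python
import numpy as np

def sgs_precondition(A, r):
    """Apply M^{-1} r = (D+U)^{-1} D (D+L)^{-1} r, the symmetric
    Gauss-Seidel preconditioner, via two triangular solves."""
    D = np.diag(np.diag(A))
    L = np.tril(A, -1)
    U = np.triu(A, 1)
    y = np.linalg.solve(D + L, r)            # forward (lower-triangular) sweep
    return np.linalg.solve(D + U, D @ y)     # backward (upper-triangular) sweep

# Stationary SGS iteration: x_{k+1} = x_k + M^{-1} (b - A x_k)
n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = np.zeros(n)
for _ in range(400):
    x = x + sgs_precondition(A, b - A @ x)
print(np.max(np.abs(A @ x - b)))
```

In a Krylov method, `sgs_precondition` would be applied to the residual each iteration rather than used as a stationary iteration.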

Preconditioning Approaches

Block Diagonal Preconditioners - Line Schemes

[figure: m x m grid with the rows grouped into lines; the corresponding m² x m² matrix has tridiagonal diagonal blocks]

Tridiagonal matrices factor quickly.
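Factoring a tridiagonal system takes only O(n) work, which is what makes line preconditioners cheap. A sketch of the standard O(n) tridiagonal solve (the array layout is an assumption: `a` = subdiagonal, `d` = diagonal, `c` = superdiagonal, with `a[0]` and `c[-1]` unused):

```python
import numpy as np

def tridiag_solve(a, d, c, b):
    """Solve a tridiagonal system in O(n) by forward elimination
    and back-substitution (the Thomas algorithm)."""
    n = len(d)
    d = d.astype(float).copy()
    b = b.astype(float).copy()
    for i in range(1, n):                    # forward elimination
        w = a[i] / d[i - 1]
        d[i] -= w * c[i - 1]
        b[i] -= w * b[i - 1]
    x = np.zeros(n)
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):           # back-substitution
        x[i] = (b[i] - c[i] * x[i + 1]) / d[i]
    return x

n = 50
a = -np.ones(n)
d = 2 * np.ones(n)
c = -np.ones(n)
b = np.ones(n)
x = tridiag_solve(a, d, c, b)
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
print(np.max(np.abs(A @ x - b)))
```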

Preconditioning Approaches

Block Diagonal Preconditioners - Line Schemes

Problem: line preconditioners communicate rapidly in only one direction.

Solution: do lines first in x, then in y.

The preconditioner is then two tridiagonal solves, with a variable reordering in between.

Preconditioning Approaches

Block Diagonal Preconditioners - Domain Decomposition

Approach: break the domain into small blocks, each with the same number of grid points.

The trade-off: fewer blocks means faster convergence, but more costly iterates.

Preconditioning Approaches

Block Diagonal Preconditioners - Domain Decomposition

Split the m x m grid (m points per side) into (m/l)² blocks of l x l grid points each; the block index runs over the (m/l)² blocks.

  Block cost: factoring the (m/l)² grids of l x l points costs O(m²l) total with sparse GE.
  GCR iters: the communication bound gives O(m/l) iterations.
  Suggests insensitivity to l: the algorithm is O(m³).

Do you have to refactor every GCR iteration?

Preconditioning Approaches

Seidelized Block Diagonal Preconditioners - Line Schemes

[figure: m x m grid and m² x m² matrix; the preconditioner keeps the blocks below the diagonal as well (block Gauss-Seidel), so each tridiagonal line solve uses already-updated values from the neighboring lines]

Preconditioning Approaches

Overlapping Domain Preconditioners - Line-Based Schemes

[figure: m x m grid and m² x m² matrix with overlapping line blocks]

Bigger systems to solve, but can have faster convergence on tougher problems (not just Poisson).

Preconditioning Approaches

Incomplete Factorization Schemes - Outline

Reminder about Gaussian elimination
  Computational steps
Fill-in for sparse matrices
  Greatly increases factorization cost
  Fill-in in a 2-D grid
Incomplete factorization idea

Sparse Matrices

Fill-In - Example

  Matrix nonzero structure       Matrix after one GE step

  [ X  X  X ]                    [ X  X  X ]
  [ X  X  0 ]          →         [ X  X  F ]
  [ X  0  X ]                    [ X  F  X ]

  X = nonzero,  F = fill-in
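Fill-in can be counted symbolically by running elimination on the boolean nonzero pattern alone, with no arithmetic. A sketch using the 3x3 pattern from this slide:

```python
import numpy as np

def count_fill_ins(pattern):
    """Run symbolic Gaussian elimination on a boolean nonzero pattern,
    returning the number of fill-ins (zeros that become nonzero)."""
    S = pattern.copy()
    n = S.shape[0]
    fills = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if S[j, i]:                          # row j is updated by pivot row i
                for k in range(i + 1, n):
                    if S[i, k] and not S[j, k]:
                        S[j, k] = True           # structural fill-in
                        fills += 1
    return fills

# The slide's example: eliminating the first column fills in the two zeros
pattern = np.array([[1, 1, 1],
                    [1, 1, 0],
                    [1, 0, 1]], dtype=bool)
print(count_fill_ins(pattern))   # 2 fill-ins
```

Running this on a 2-D FD pattern shows the band filling in completely, which is exactly the cost incomplete factorization avoids.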

Sparse Matrices

Fill-In - Second Example: Fill-Ins Propagate

[figure: elimination on a sparse pattern where the zeros filled in during step 1 create new fill-ins during step 2]

Fill-ins from step 1 result in fill-ins in step 2.

Sparse Matrices

Fill-In - Pattern of a Filled-In Matrix

[figure: filled-in matrix - very sparse leading rows and columns, dense trailing block]

Fill-In - Unfactored and Factored Random Matrix

[figures: nonzero pattern of a random sparse matrix before and after factoring]

Factoring 2-D finite-difference matrices: generated fill-in makes factorization expensive.


Preconditioning Approaches

Incomplete Factorization Schemes - Key Idea

THROW AWAY FILL-INS!
  Throw away all fill-ins
  Throw away only fill-ins with small values
  Throw away fill-ins produced by other fill-ins
  Throw away fill-ins produced by fill-ins of other fill-ins, etc.

Summary

3-D BVP examples: aerodynamics, continuum mechanics, heat flow
Finite difference matrices in 1, 2 and 3D
Gaussian elimination costs
Krylov method
  Communication lower bound
  Preconditioners based on improving communication
