
Frontal and multi-frontal solvers: orderings, elimination trees, refinement trees

Maciej Paszynski
Department of Computer Science
AGH University of Science and Technology, Krakow, Poland
maciej.paszynski@agh.edu.pl
http://www.ki.agh.edu.pl/en/research-groups/a2s
EXEMPLARY SIMPLE 1D NUMERICAL PROBLEM

Find temperature distribution such that

Finite difference discretization


TRIDIAGONAL MATRIX FOR EXEMPLARY 1D PROBLEM
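
As a minimal sketch of such a tridiagonal system, the code below builds the finite difference matrix with interior rows (1/h^2, -2/h^2, 1/h^2). The model equation u'' = f on [0, 1], the Dirichlet values stored as identity rows, and the name `tridiagonal_system` are assumptions made for illustration; the slides do not fix them here.

```python
import numpy as np

def tridiagonal_system(n, f=lambda x: 0.0, u_left=0.0, u_right=1.0):
    """Finite-difference system for u'' = f on [0, 1] with n intervals.

    Assumed setup: Dirichlet values at both ends, stored as identity rows.
    """
    h = 1.0 / n
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    A[0, 0], rhs[0] = 1.0, u_left
    A[n, n], rhs[n] = 1.0, u_right
    for i in range(1, n):
        # Interior row: (u[i-1] - 2 u[i] + u[i+1]) / h^2 = f(x_i)
        A[i, i - 1:i + 2] = np.array([1.0, -2.0, 1.0]) / h**2
        rhs[i] = f(i * h)
    return A, rhs

A, rhs = tridiagonal_system(4)
u = np.linalg.solve(A, rhs)  # reference solution from a dense solver
```

With f = 0 the discrete solution is the straight line between the two boundary values, which gives a quick sanity check.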
FRONTAL SOLVER

Front matrix

Frontal matrix focuses on forward elimination of the first row


FRONTAL SOLVER

Front matrix

2nd row = 2nd row – 1/h^2 * 1st row


FRONTAL SOLVER

Front matrix

The first row has been eliminated


Frontal matrix focuses on the elimination of the second row
FRONTAL SOLVER

Front matrix

3rd row = 3rd row – [1/h^2] / [-2/h^2] * 2nd row
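
The row operations above can be sketched as a loop in which the front slides down the matrix one row at a time. This is an illustrative sketch only: the function name `frontal_solve_tridiagonal`, the 5x5 test system with h = 0.25, and the identity boundary rows are assumptions, and the full matrix is kept in memory for clarity although a real frontal solver would hold only the front.

```python
import numpy as np

def frontal_solve_tridiagonal(A, rhs):
    """Forward elimination with a sliding two-row front, then back substitution."""
    U = np.array(A, dtype=float)
    y = np.array(rhs, dtype=float)
    n = len(y)
    for k in range(n - 1):
        # The front holds rows k and k+1; row k is fully assembled and is the pivot row.
        factor = U[k + 1, k] / U[k, k]
        U[k + 1, k:k + 3] -= factor * U[k, k:k + 3]
        y[k + 1] -= factor * y[k]
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        # Back substitution on the resulting upper triangular matrix.
        x[k] = (y[k] - U[k, k + 1:] @ x[k + 1:]) / U[k, k]
    return x

h = 0.25
A = (np.diag(np.full(5, -2.0)) + np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)) / h**2
A[0, :] = [1, 0, 0, 0, 0]  # assumed Dirichlet row
A[4, :] = [0, 0, 0, 0, 1]  # assumed Dirichlet row
rhs = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
x = frontal_solve_tridiagonal(A, rhs)
```

The first two passes of the loop reproduce the factors 1/h^2 and [1/h^2] / [-2/h^2] quoted on the slides.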


ELEMENT-WISE DECOMPOSITION

Construct multiple frontal matrices in such a way that they sum up to the full matrix.
Variables must be split into parts.
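
A small sketch of this decomposition, under the assumption that every interval contributes the same 2x2 matrix (1/h^2) * [[-1, 1], [1, -1]], so that each interior diagonal entry -2/h^2 is split into two halves. Boundary rows are left untreated here; that simplification is mine, not the slides'.

```python
import numpy as np

n = 4                      # number of intervals (elements)
h = 1.0 / n
element = np.array([[-1.0, 1.0],
                    [1.0, -1.0]]) / h**2   # assumed per-element frontal matrix

full = np.zeros((n + 1, n + 1))
for e in range(n):
    # Element e couples unknowns e and e+1; overlapping diagonal entries add up.
    full[e:e + 2, e:e + 2] += element
```

Each interior row of `full` ends up as (1/h^2, -2/h^2, 1/h^2), so the element matrices do sum to the tridiagonal matrix away from the boundary.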
MULTI-FRONTAL SOLVER ALGORITHM

First all frontal matrices are constructed


MULTI-FRONTAL SOLVER ALGORITHM

First and second frontal matrices are summed up into a new 3x3 frontal matrix
Its first and second rows are fully assembled
MULTI-FRONTAL SOLVER ALGORITHM

First column is eliminated by using first row


2nd row = 2nd row – [1/h^2] * 1st row
MULTI-FRONTAL SOLVER ALGORITHM

Third and fourth frontal matrices are summed up into a new 3x3 frontal matrix
Now only the second row is fully assembled
MULTI-FRONTAL SOLVER ALGORITHM

Change of the ordering: the fully assembled row is moved to the first position


MULTI-FRONTAL SOLVER ALGORITHM

Eliminate entries below the diagonal

2nd row = 2nd row – [1/h^2] / [-2/h^2] * 1st row
3rd row = 3rd row – [1/h^2] / [-2/h^2] * 1st row
MULTI-FRONTAL SOLVER ALGORITHM

Why can I subtract the fully assembled 1st row from the not fully assembled 2nd and 3rd rows?
MULTI-FRONTAL SOLVER ALGORITHM
This is because subtraction and addition are interchangeable
(now I am subtracting from the 2nd, not fully assembled row,
and later I will add the remaining part).
Moreover, the 1st row, which is used for the subtractions,
contains only zeros in the other columns, so the entries that are
still to be assembled are not affected.
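
The interchangeability argument can be checked numerically. In the sketch below, `F` is a reordered 3x3 front whose first row is fully assembled, and `R` stands for the contributions that arrive later; both matrices and the helper `eliminate_first` are hypothetical, chosen with h = 0.25 to match the factors used on the slides.

```python
import numpy as np

def eliminate_first(M):
    """Zero out the first column below the diagonal using the first row."""
    M = np.array(M, dtype=float)
    for i in range(1, M.shape[0]):
        M[i, :] -= M[i, 0] / M[0, 0] * M[0, :]
    return M

# Front after reordering: row 0 is fully assembled, rows 1 and 2 are not.
F = np.array([[-32.0, 16.0, 16.0],
              [16.0, -16.0, 0.0],
              [16.0, 0.0, -16.0]])
# Later contributions touch only the not fully assembled unknowns,
# so their first row and first column are zero.
R = np.array([[0.0, 0.0, 0.0],
              [0.0, -16.0, 0.0],
              [0.0, 0.0, -16.0]])

eliminate_then_add = eliminate_first(F) + R
add_then_eliminate = eliminate_first(F + R)
```

Both orders give the same matrix because the multipliers depend only on the first row and first column, which `R` does not change.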
MULTI-FRONTAL SOLVER ALGORITHM

Fifth and sixth frontal matrices are summed up into a new 3x3 frontal matrix
Now the second and third rows are fully assembled

MULTI-FRONTAL SOLVER ALGORITHM
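[Figure: elimination tree; pairs of frontal matrices are merged (+) at intermediate nodes, up to the root]

Continue until the root node is reached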
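
The whole procedure can be sketched recursively for the 1D problem. The code below is an illustration under stated assumptions, not the author's implementation: the model equation u'' = f with Dirichlet values at both ends, the per-element matrix (1/h^2) * [[-1, 1], [1, -1]], the halved right-hand side per element, and the names `multifrontal_1d`, `forward` and `back` are all hypothetical.

```python
import numpy as np

def multifrontal_1d(n, f, u_left, u_right):
    """Solve u'' = f on [0, 1] with n elements by merging fronts up a binary tree."""
    h = 1.0 / n
    u = np.zeros(n + 1)

    def forward(lo, hi):
        # Returns the 2x2 Schur complement on unknowns (lo, hi), its right-hand
        # side, and a closure that performs backward substitution for the subtree.
        if hi - lo == 1:  # leaf: one element
            S = np.array([[-1.0, 1.0], [1.0, -1.0]]) / h**2
            g = np.array([f(lo * h), f(hi * h)]) / 2.0
            return S, g, lambda: None
        mid = (lo + hi) // 2
        SL, gL, back_left = forward(lo, mid)
        SR, gR, back_right = forward(mid, hi)
        # Merge the two children into a 3x3 front ordered (mid, lo, hi),
        # so the fully assembled unknown comes first.
        F, r = np.zeros((3, 3)), np.zeros(3)
        left, right = [1, 0], [0, 2]
        F[np.ix_(left, left)] += SL
        F[np.ix_(right, right)] += SR
        r[left] += gL
        r[right] += gR
        pivot_row, pivot_rhs = F[0].copy(), r[0]
        for i in (1, 2):
            factor = F[i, 0] / F[0, 0]
            F[i] -= factor * F[0]
            r[i] -= factor * r[0]

        def back():
            u[mid] = (pivot_rhs - pivot_row[1] * u[lo] - pivot_row[2] * u[hi]) / pivot_row[0]
            back_left()
            back_right()

        return F[1:, 1:].copy(), r[1:].copy(), back

    _, _, back = forward(0, n)
    # Root: with Dirichlet data at both ends the root problem reduces to the boundary values.
    u[0], u[n] = u_left, u_right
    back()
    return u

u = multifrontal_1d(8, lambda x: 2.0, 0.0, 1.0)
```

For f = 2 with u(0) = 0 and u(1) = 1 the exact solution is x^2, which the three-point stencil reproduces exactly, so the result can be compared with `np.linspace(0, 1, 9)**2`.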


PARALLEL MULTI-FRONTAL SOLVER ALGORITHM

Processor 1 Processor 2 Processor 3 Processor 4 Processor 5 Processor 6

All frontal matrices are generated at the same time


PARALLEL MULTI-FRONTAL SOLVER ALGORITHM

Processor 1 Processor 2 Processor 3 Processor 4 Processor 5 Processor 6

Summing up and elimination are executed at the same time over different pairs of frontal matrices
PARALLEL MULTI-FRONTAL SOLVER ALGORITHM
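[Figure: elimination tree; pairs of frontal matrices are merged (+) at intermediate nodes, up to the root]

Processor 1 Processor 2 Processor 3 Processor 4 Processor 5 Processor 6

The algorithm is recursively repeated until we reach the root of the tree
The algorithm results in an upper triangular matrix stored in a distributed manner
Computational complexity (all nodes of one tree level processed at the same time) = height of the tree = log N (where N = #unknowns-1)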
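
To make the log N count concrete, the sketch below counts the parallel merge levels needed to reduce a given number of frontal matrices to a single root, assuming that all pairs on one level are processed at the same time. The function name `merge_levels` is hypothetical.

```python
def merge_levels(num_fronts):
    """Number of pairwise-merge levels needed to reach a single root front."""
    levels = 0
    while num_fronts > 1:
        # One parallel step: every pair is merged at once; an odd front is carried up.
        num_fronts = (num_fronts + 1) // 2
        levels += 1
    return levels

levels_for_six = merge_levels(6)        # 6 -> 3 -> 2 -> 1
levels_for_1024 = merge_levels(1024)
```

For six frontal matrices this gives three levels, matching the three-level tree drawn on the slides; for 1024 it gives ten, i.e. log2 of the number of leaves.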
GENERALIZATION TO 1D FINITE ELEMENT METHOD

Strong formulation

Find $u \in C^2(0,l)$ such that

\[
-\bigl(a(x)\,u'(x)\bigr)' + b(x)\,u'(x) + c(x)\,u(x) = f(x)
\]
\[
u(0) = 0, \qquad a(l)\,u'(l) + \beta\,u(l) = \gamma
\]

Weak formulation

Find $u \in V$ such that

\[
\int_0^l \bigl(a\,u'v' + b\,u'v + c\,u\,v\bigr)\,dx + \beta\,u(l)\,v(l)
= \int_0^l f\,v\,dx + \gamma\,v(l) \qquad \forall v \in V
\]

that is, $b(u,v) = l(v)$ for all $v \in V = \{\, v \in H^1(0,l) : v(0) = 0 \,\}$.
GENERALIZATION TO 1D FINITE ELEMENT METHOD

Finite element method discretization

\[
u \approx \sum_{i=1}^{N} a_i e_i, \quad v = e_j: \qquad
\sum_{i=1}^{N} a_i\, b(e_i, e_j) = l(e_j), \quad j = 1,\dots,N
\]
\[
b(e_i, e_j) = \int_0^l \bigl(a\,e_i' e_j' + b\,e_i' e_j + c\,e_i e_j\bigr)\,dx + \beta\,e_i(l)\,e_j(l)
\]
\[
l(e_j) = \int_0^l f\,e_j\,dx + \gamma\,e_j(l)
\]

Exemplary basis functions for [0,l] = [0,1], for two finite elements

\[
e_1(x) = \begin{cases} 1 - 2x & x \in (0, 0.5) \\ 0 & x \in (0.5, 1) \end{cases}
\qquad
e_2(x) = \begin{cases} 2x & x \in (0, 0.5) \\ 2 - 2x & x \in (0.5, 1) \end{cases}
\qquad
e_3(x) = \begin{cases} 0 & x \in (0, 0.5) \\ 2x - 1 & x \in (0.5, 1) \end{cases}
\]
\[
\frac{de_1}{dx} = \begin{cases} -2 & x \in (0, 0.5) \\ 0 & x \in (0.5, 1) \end{cases}
\qquad
\frac{de_2}{dx} = \begin{cases} 2 & x \in (0, 0.5) \\ -2 & x \in (0.5, 1) \end{cases}
\qquad
\frac{de_3}{dx} = \begin{cases} 0 & x \in (0, 0.5) \\ 2 & x \in (0.5, 1) \end{cases}
\]
GENERALIZATION TO 1D FINITE ELEMENT METHOD

\[
\begin{bmatrix}
b(e_1,e_1) & b(e_2,e_1) & b(e_3,e_1) \\
b(e_1,e_2) & b(e_2,e_2) & b(e_3,e_2) \\
b(e_1,e_3) & b(e_2,e_3) & b(e_3,e_3)
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} l(e_1) \\ l(e_2) \\ l(e_3) \end{bmatrix}
\]

Exemplary basis functions for [0,l] = [0,1], for two finite elements $K_1 = (0, 0.5)$ and $K_2 = (0.5, 1)$, written with local shape functions:

\[
e_1 = \chi_1^{K_1}, \qquad e_2 = \chi_1^{K_2} + \chi_2^{K_1}, \qquad e_3 = \chi_2^{K_2}
\]

Off-diagonal entries:

\[
b(e_1,e_3) = \int_0^1 \bigl(a\,e_1' e_3' + b\,e_1' e_3 + c\,e_1 e_3\bigr)\,dx + \beta\,e_1(l)\,e_3(l) = 0
\]
\[
b(e_1,e_2) = \int_0^1 \bigl(a\,e_1' e_2' + b\,e_1' e_2 + c\,e_1 e_2\bigr)\,dx + \beta\,e_1(l)\,e_2(l)
= \int_0^{0.5} \Bigl( a\,\frac{d\chi_1^{K_1}}{dx}\frac{d\chi_2^{K_1}}{dx} + b\,\frac{d\chi_1^{K_1}}{dx}\,\chi_2^{K_1} + c\,\chi_1^{K_1}\chi_2^{K_1} \Bigr)\,dx
\]
\[
b(e_2,e_3) = \int_0^1 \bigl(a\,e_2' e_3' + b\,e_2' e_3 + c\,e_2 e_3\bigr)\,dx + \beta\,e_2(l)\,e_3(l)
= \int_{0.5}^{1} \Bigl( a\,\frac{d\chi_1^{K_2}}{dx}\frac{d\chi_2^{K_2}}{dx} + b\,\frac{d\chi_1^{K_2}}{dx}\,\chi_2^{K_2} + c\,\chi_1^{K_2}\chi_2^{K_2} \Bigr)\,dx
\]
GENERALIZATION TO 1D FINITE ELEMENT METHOD

Diagonal entries of the 3x3 system for the exemplary basis functions on [0,l] = [0,1], for two finite elements, with

\[
e_1 = \chi_1^{K_1}, \qquad e_2 = \chi_1^{K_2} + \chi_2^{K_1}, \qquad e_3 = \chi_2^{K_2}
\]
\[
b(e_1,e_1) = \int_0^1 \bigl(a\,e_1' e_1' + b\,e_1' e_1 + c\,e_1 e_1\bigr)\,dx + \beta\,e_1(l)\,e_1(l)
= \int_0^{0.5} \Bigl( a \Bigl(\frac{d\chi_1^{K_1}}{dx}\Bigr)^2 + b\,\frac{d\chi_1^{K_1}}{dx}\,\chi_1^{K_1} + c\,\bigl(\chi_1^{K_1}\bigr)^2 \Bigr)\,dx
\]
\[
b(e_2,e_2) = \int_0^1 \bigl(a\,e_2' e_2' + b\,e_2' e_2 + c\,e_2 e_2\bigr)\,dx + \beta\,e_2(l)\,e_2(l)
\]
\[
\phantom{b(e_2,e_2)} = \int_0^{0.5} \Bigl( a \Bigl(\frac{d\chi_2^{K_1}}{dx}\Bigr)^2 + b\,\frac{d\chi_2^{K_1}}{dx}\,\chi_2^{K_1} + c\,\bigl(\chi_2^{K_1}\bigr)^2 \Bigr)\,dx
+ \int_{0.5}^{1} \Bigl( a \Bigl(\frac{d\chi_1^{K_2}}{dx}\Bigr)^2 + b\,\frac{d\chi_1^{K_2}}{dx}\,\chi_1^{K_2} + c\,\bigl(\chi_1^{K_2}\bigr)^2 \Bigr)\,dx
\]
\[
b(e_3,e_3) = \int_0^1 \bigl(a\,e_3' e_3' + b\,e_3' e_3 + c\,e_3 e_3\bigr)\,dx + \beta\,e_3(l)\,e_3(l)
= \int_{0.5}^{1} \Bigl( a \Bigl(\frac{d\chi_2^{K_2}}{dx}\Bigr)^2 + b\,\frac{d\chi_2^{K_2}}{dx}\,\chi_2^{K_2} + c\,\bigl(\chi_2^{K_2}\bigr)^2 \Bigr)\,dx + \beta
\]
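GENERALIZATION TO 1D FINITE ELEMENT METHOD

\[
\begin{bmatrix}
b(e_1,e_1) & b(e_2,e_1) & b(e_3,e_1) \\
b(e_1,e_2) & b(e_2,e_2) & b(e_3,e_2) \\
b(e_1,e_3) & b(e_2,e_3) & b(e_3,e_3)
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} l(e_1) \\ l(e_2) \\ l(e_3) \end{bmatrix}
\]

In terms of the local shape functions this system reads

\[
\begin{bmatrix}
b(\chi_1^{K_1},\chi_1^{K_1}) & b(\chi_2^{K_1},\chi_1^{K_1}) & 0 \\
b(\chi_1^{K_1},\chi_2^{K_1}) & b(\chi_2^{K_1},\chi_2^{K_1}) + b(\chi_1^{K_2},\chi_1^{K_2}) & b(\chi_2^{K_2},\chi_1^{K_2}) \\
0 & b(\chi_1^{K_2},\chi_2^{K_2}) & b(\chi_2^{K_2},\chi_2^{K_2})
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} l(\chi_1^{K_1}) \\ l(\chi_2^{K_1}) + l(\chi_1^{K_2}) \\ l(\chi_2^{K_2}) \end{bmatrix}
\]

Notice that when we switch from finite difference to finite elements,
it only changes the local systems of equations at tree nodes

\[
\begin{bmatrix}
b(\chi_1^{K},\chi_1^{K}) & b(\chi_2^{K},\chi_1^{K}) \\
b(\chi_1^{K},\chi_2^{K}) & b(\chi_2^{K},\chi_2^{K})
\end{bmatrix}
\begin{bmatrix} x_1^{K} \\ x_2^{K} \end{bmatrix}
=
\begin{bmatrix} l(\chi_1^{K}) \\ l(\chi_2^{K}) \end{bmatrix}
\]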
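
A short sketch of how the global matrix is summed from the 2x2 local systems. It assumes linear shape functions and constant coefficients a, b, c on every element (an assumption of this example, not of the slides), for which the local integrals have closed forms; `local_matrix` and `assemble` are hypothetical names.

```python
import numpy as np

def local_matrix(h, a=1.0, b=0.0, c=0.0):
    """Entry [j, i] holds b(chi_i, chi_j) on an element of length h, without the boundary term."""
    stiffness = a / h * np.array([[1.0, -1.0], [-1.0, 1.0]])       # a * chi_i' * chi_j'
    convection = b / 2.0 * np.array([[-1.0, 1.0], [-1.0, 1.0]])    # b * chi_i' * chi_j
    mass = c * h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])        # c * chi_i * chi_j
    return stiffness + convection + mass

def assemble(num_elements, length=1.0, beta=0.0, **coeffs):
    """Sum the local matrices into the global one, element by element."""
    h = length / num_elements
    B = np.zeros((num_elements + 1, num_elements + 1))
    for k in range(num_elements):
        B[k:k + 2, k:k + 2] += local_matrix(h, **coeffs)
    B[-1, -1] += beta  # boundary term beta * e_N(l) * e_N(l), with e_N(l) = 1
    return B

B = assemble(2)  # two elements on [0, 1], a = 1, b = c = beta = 0
```

The middle diagonal entry of `B` is the sum of one entry from each element, which is exactly the pattern b(chi_2^{K_1}, chi_2^{K_1}) + b(chi_1^{K_2}, chi_1^{K_2}) in the matrix above.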
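GENERALIZATION TO 1D FINITE ELEMENT METHOD

Global basis functions are composed of local shape functions, e.g.

\[
e_2 = \chi_1^{K_2} + \chi_2^{K_1}
\]
\[
b(\chi_i^{K}, \chi_j^{K}) = \int_K \bigl( a\,{\chi_i^{K}}'\,{\chi_j^{K}}' + b\,{\chi_i^{K}}'\,\chi_j^{K} + c\,\chi_i^{K}\chi_j^{K} \bigr)\,dx + \beta\,\chi_i^{K}(l)\,\chi_j^{K}(l)
\]
\[
l(\chi_j^{K}) = \int_K f\,\chi_j^{K}\,dx + \gamma\,\chi_j^{K}(l)
\]

Local system of equations generated over the element K

\[
\begin{bmatrix}
b(\chi_1^{K},\chi_1^{K}) & b(\chi_2^{K},\chi_1^{K}) \\
b(\chi_1^{K},\chi_2^{K}) & b(\chi_2^{K},\chi_2^{K})
\end{bmatrix}
\begin{bmatrix} x_1^{K} \\ x_2^{K} \end{bmatrix}
=
\begin{bmatrix} l(\chi_1^{K}) \\ l(\chi_2^{K}) \end{bmatrix}
\]
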
GENERALIZATION TO 1D FINITE ELEMENT METHOD

[Figure: elimination tree; each leaf holds the 2x2 local system of equations generated over its element K]

Notice that when we switch from finite difference to finite elements,
it only changes the local systems of equations at tree nodes
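2D hp FINITE ELEMENT METHOD

\[
u \approx \sum_{i=1}^{15} u_{hp}^{i}\, e_{hp}^{i}
\]

We seek the solution u of some weak form of a PDE as a linear combination of shape functions $e_{hp}^{i}$ spread over the finite element mesh

[Figure: finite element mesh with the shape functions $e_{hp}^{1}, \dots, e_{hp}^{5}$ and $e_{hp}^{7}, e_{hp}^{8}, e_{hp}^{9}$ labeled]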
2D hp FINITE ELEMENT METHOD

[Figure: finite element mesh with the shape functions $e_{hp}^{1}, \dots, e_{hp}^{5}$ and $e_{hp}^{7}, e_{hp}^{8}, e_{hp}^{9}$ labeled]

The coefficients $u_{hp}^{i}$ (called "degrees of freedom", d.o.f.) are obtained by solving the system of linear equations

\[
\sum_{i=1}^{15} u_{hp}^{i}\, b(e_{hp}^{i}, e_{hp}^{j}) = l(e_{hp}^{j}), \qquad j = 1,\dots,15
\]

which is the finite element discretization of a weak (variational) form of the PDE, where $b(e_{hp}^{i}, e_{hp}^{j})$ and $l(e_{hp}^{j})$ are some integrals of the shape functions $e_{hp}^{i}, e_{hp}^{j}$
FRONTAL SOLVER
SOLUTION BASED ON LINEAR ORDER OF ELEMENTS

Partial forward elimination, O(6*9^2)

Generates frontal matrix for the first element,
eliminates fully assembled degrees of freedom
FRONTAL SOLVER
SOLUTION BASED ON LINEAR ORDER OF ELEMENTS

Full forward elimination

Generates frontal matrix for the second element,
merges with the current frontal matrix,
eliminates fully assembled degrees of freedom
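
Partial forward elimination can be sketched as Gaussian elimination that stops after the fully assembled unknowns, leaving a Schur complement for the remaining ones. The sketch assumes the fully assembled degrees of freedom are ordered first and that no pivoting is needed; `partial_eliminate` and the random test front are hypothetical.

```python
import numpy as np

def partial_eliminate(F, r, k):
    """Eliminate the first k (fully assembled) unknowns of the front F.

    Returns the k eliminated rows with their right-hand side, and the
    Schur complement with its right-hand side for the remaining unknowns.
    """
    F = np.array(F, dtype=float)
    r = np.array(r, dtype=float)
    for p in range(k):
        for i in range(p + 1, F.shape[0]):
            factor = F[i, p] / F[p, p]
            F[i, p:] -= factor * F[p, p:]
            r[i] -= factor * r[p]
    return F[:k], r[:k], F[k:, k:], r[k:]

rng = np.random.default_rng(0)
G = rng.random((5, 5))
M = G @ G.T + 5.0 * np.eye(5)   # symmetric positive definite test front
rows, rows_rhs, schur, schur_rhs = partial_eliminate(M, np.ones(5), 2)
```

Each of the k pivots updates at most m^2 entries of an m x m front, which is consistent in form with the O(6*9^2) count quoted for the first element.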
MULTI-FRONTAL SOLVER
SOLUTION BASED ON THE ELIMINATION TREE

Partial forward elimination
MULTI-FRONTAL SOLVER
SOLUTION BASED ON THE ELIMINATION TREE

Full forward elimination of the interface problem matrix
COMPARISON OF COSTS

Number of operations for partial forward elimination


CONSTRUCTION OF THE ELIMINATION TREE
BASED ON THE HISTORY OF MESH REFINEMENTS

For any initial mesh, the elimination tree can be created using the
nested dissection algorithm.
CONSTRUCTION OF THE ELIMINATION TREE
BASED ON THE HISTORY OF MESH REFINEMENTS

The elimination tree created for the initial mesh is updated when the mesh is refined
(elimination tree is constructed dynamically, during mesh refinements)
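
A toy sketch of a tree that grows with mesh refinements. It assumes that refining an element simply attaches son nodes to the corresponding leaf and that elimination proceeds from the leaves up (post-order); the class `Node` and its methods are hypothetical, not taken from the hp code.

```python
class Node:
    """One node of the elimination tree; leaves correspond to active elements."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def refine(self, parts=2):
        # Breaking an element turns its leaf into a parent of `parts` new leaves.
        self.children = [Node(f"{self.name}.{i}") for i in range(parts)]
        return self.children

    def elimination_order(self):
        # Post-order traversal: sons are processed before their parent.
        order = []
        for child in self.children:
            order.extend(child.elimination_order())
        order.append(self.name)
        return order

root = Node("root")
left, right = root.refine()   # tree for the initial two-element mesh
left.refine()                 # one mesh refinement updates the tree in place
order = root.elimination_order()
```

Only the refined leaf changes; the rest of the tree built for the initial mesh is reused.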
CONSTRUCTION OF THE REFINEMENT TREES
ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• Local matrices for active elements (leaves of the elimination tree) are created
ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• Interior degrees of freedom are eliminated at every leaf


ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• Schur complement contributions are merged at parent level


ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• Fully assembled degrees of freedom are eliminated at the parent nodes level
(degrees of freedom related to common edges shared by both son elements
can be eliminated now)
ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• Contributions to Schur complement are merged again at the next level


ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• Fully assembled degrees of freedom are eliminated


ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• Finally, Schur complement contributions are merged at the tree root node
ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• The common interface problem is solved at the tree root node
(the size of the common interface problem corresponds to the size of
the cross-section of the entire domain)
ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• The solution obtained at root node is utilized at son nodes.


• The backward substitution is executed
ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• The solution obtained at the second level nodes is utilized at their son nodes.
• The backward substitution is executed
ELIMINATIONS BASED ON REFINEMENT TREES

Level of
initial mesh elements

Level of
refinement trees

• The solution obtained at the third level nodes is utilized at leaf nodes.
• The backward substitution is executed on the level of leaf nodes
PERFORMANCE OF THE 1st VERSION OF THE DISTRIBUTED
MEMORY MULTI-FRONTAL PARALLEL SOLVER
1,482,570 degrees of freedom (68,826,475 non-zeros)

- Embedded into hp code
+ Outperforms MUMPS for a large number of processors
- Slower than MUMPS for a low number of processors
Profiling showed that the time-consuming part for a low number of processors
was actually the merging of matrices,
performed at the level of unknowns, with moving of matrix entries.
Remedy: switch to the level of nodes and do not touch matrix entries; work with pointers
PAPERS
http://home.agh.edu.pl/~paszynsk/Publications.html
Maciej Paszyński, David Pardo, Carlos Torres-Verdin, Leszek Demkowicz,
Victor Calo
A PARALLEL DIRECT SOLVER FOR SELF-ADAPTIVE HP FINITE
ELEMENT METHOD
Journal of Parallel and Distributed Computing, 70, 3 (2013) 270-281

Anna Paszynska, Maciej Paszynski, Konrad Jopek, Maciej Wozniak, Damian Goik,
Piotr Gurgul, Hassan AbouEisha, Mikhail Moshkov, Victor Calo, Andrew Lenharth,
Donald Nguyen, Keshav Pingali
QUASI-OPTIMAL ELIMINATION TREES FOR 2D GRIDS WITH
SINGULARITIES
Scientific Programming, (2015) Article ID 303024, 1-18

Maciej Paszynski, David Pardo, Anna Paszynska, Leszek Demkowicz
OUT-OF-CORE MULTI-FRONTAL SOLVER FOR MULTI-PHYSICS
HP ADAPTIVE PROBLEMS
Procedia Computer Science, 4 (2011) 1788-1797
