
Adjoint Eq Compressible, easy Incompressible, medium Legacy incompressible, hard Wishlist for AD tools Summary

Using AD to derive adjoints for large, legacy CFD solvers

J.-D. Müller
Queen Mary, University of London

collaborators:
D. Jones, F. Christakopoulos, S. Xu, QMUL
S. Bayyuk, ESI, Huntsville AL


© Jens-Dominik Müller, 2013


Fluid optimisation in industrial application

Testcases of the FlowHead FP7 project

FlowHead: http://flowhead.sems.qmul.ac.uk


How to satisfy these requirements

• Large number of design variables, acceptable run times


• Gradient-based optimisation
• Adjoint solvers
• Complete differentiation of metrics and parametrisation
• Fully coupled KKT solvers, ’one-shot’
• Multi-level methods, including the design
• Able to deal with complex geometry
• Automatic parametrisation
• Node-based parametrisations
• Morphing-based parametrisation
• CAD-based parametrisation
• Robust mesh deformation algorithms
• Integration into the complete design process
• Complete sensitivity computation, $\partial J / \partial \alpha$
• Integration into the design chain:
• CAD-based design

Outline

Solving the Adjoint Flow Equations

Compressible adjoint codes using AD (easy)

Development of incompressible adjoint solvers (medium)

AD on legacy, incompressible CFD codes (hard)

Wishlist for AD tools

Summary


Background on Adjoint Design Optimisation


Navier Stokes equations, fixed-point iteration to steady state
(typical compressible flow discretisation):

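\[ R(U(\alpha), \alpha) = 0 \]

Linearisation with respect to a design (control) variable α:

\[ \frac{\partial R}{\partial U} \frac{\partial U}{\partial \alpha} = -\frac{\partial R}{\partial \alpha}, \]
\[ A u = f. \]

Sensitivity of an objective function L with respect to α:

\[ \frac{dL}{d\alpha} = \frac{\partial L}{\partial \alpha} + \frac{\partial L}{\partial U} \frac{\partial U}{\partial \alpha} = \frac{\partial L}{\partial \alpha} + g^T u = \frac{\partial L}{\partial \alpha} + g^T A^{-1} f \]

$\partial L / \partial \alpha$ is directly computable; $g^T u$ requires an expensive solve for the
perturbation flow field $u$ for each $\alpha_i$.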

The Adjoint Equations


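Regroup the terms in the sensitivity computation:

\[ \frac{dL}{d\alpha} = \frac{\partial L}{\partial \alpha} + g^T A^{-1} f = \frac{\partial L}{\partial \alpha} + \left(A^{-T} g\right)^T f = \frac{\partial L}{\partial \alpha} + v^T f \]

This leads to the definition of the adjoint equation:

\[ A^T v = g, \]
\[ \left(\frac{\partial R}{\partial U}\right)^T \left(\frac{\partial L}{\partial R}\right)^T = \left(\frac{\partial L}{\partial U}\right)^T. \]

From this follows the Adjoint Equivalence:

\[ g^T u = (A^T v)^T u = v^T A u = v^T f \]

Using $v^T f$ needs a single solve of $A^T v = g$ and the evaluation of
$f_i$ for each $\alpha_i$.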
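
The adjoint equivalence can be checked numerically on a toy system. The following is a minimal sketch only: the 2x2 matrix A, the vector g and the three right-hand sides f_i are made-up numbers standing in for ∂R/∂U, (∂L/∂U)^T and −∂R/∂α_i, and the helper names are hypothetical.

```python
# Minimal sketch: adjoint equivalence g^T u = v^T f on a made-up 2x2 system.

def solve2(M, rhs):
    """Solve the 2x2 system M x = rhs with Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [(rhs[0] * d - b * rhs[1]) / det,
            (a * rhs[1] - c * rhs[0]) / det]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

A = [[4.0, 1.0], [2.0, 3.0]]                      # stands in for dR/dU
g = [1.0, -2.0]                                   # stands in for (dL/dU)^T
f_list = [[1.0, 0.0], [0.5, 2.0], [-1.0, 1.0]]    # f_i = -dR/dalpha_i

# Tangent route: one linear solve A u_i = f_i per design variable.
tangent = [dot(g, solve2(A, f)) for f in f_list]

# Adjoint route: a single solve A^T v = g, then one dot product per f_i.
v = solve2(transpose(A), g)
adjoint = [dot(v, f) for f in f_list]

print(tangent)
print(adjoint)
```

Both lists agree to round-off, but the tangent route needs three solves and the adjoint route only one, which is the point of the regrouping above.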

Applications of adjoint solutions


The adjoint solution can directly express the sensitivity of a single
cost function, e.g. drag, w.r.t. many design variables, e.g. normal
surface displacement of each mesh point.
To reduce drag: push in where red, pull out where blue.

(FlowHead project: adjoint solver ICON, testcase and solution VW)



How to get an adjoint solver


Continuous approach: derive the adjoint equations, then discretise.
First codes are simple to achieve, but stability is difficult to
guarantee, although the approach can benefit from additional
stability due to re-discretisation.
Discrete approach: discretise the equations, then differentiate and
transpose the algorithm.
Can be applied as a strict procedure and is exactly
verifiable. Guarantees that primal and adjoint
Jacobians A are exactly transposed, which guarantees
stability (or instability, if the primal is not stable).
Automatic Differentiation: use the discrete approach, but use a
software tool to perform the derivation and
transposition.
Makes the process automatable via
Makefile/compiler, but good performance can be
difficult to achieve.

Forward mode AD: tangent linearisation

• A program with input x and output y is a sequence of
operations $F_i$; the derivatives $E_i$ of each operation can be
chained up:

\[ y = F_k(F_{k-1}(\cdots F_1(x) \cdots)) \]
\[ \dot{y} = E_k(E_{k-1}(\cdots E_1(\dot{x}) \cdots)) \]

• This produces the matrix-vector product of the exact
derivative with a weighting or direction $\dot{x}$:

\[ \dot{y} = \nabla F(x)\, \dot{x} = A \dot{x}. \]

• Computing the full gradient for an x with m dimensions
costs m evaluations of $A\dot{x}$, one per input direction.
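
The chaining of derivatives can be illustrated with a small dual-number class. This is a sketch of forward mode by operator overloading, not of what a source-transformation tool generates; the class `Dual`, the helper `sin` and the test function `F` are hypothetical names introduced for illustration.

```python
import math

class Dual:
    """A value together with its directional derivative (tangent)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def sin(x):
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def F(x1, x2):
    # made-up test function: y = x1*x2 + sin(x1)
    return x1 * x2 + sin(x1)

x = (1.5, -2.0)

# One tangent evaluation per input direction: m = 2 evaluations for the gradient.
grad = []
for i in range(len(x)):
    seeds = [Dual(xj, 1.0 if j == i else 0.0) for j, xj in enumerate(x)]
    grad.append(F(*seeds).dot)

print(grad)   # analytically [x2 + cos(x1), x1]
```

Each pass through `F` carries one direction ẋ, so the cost of the full gradient grows linearly with the number of inputs.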


Reverse mode AD: the adjoint

• Alternatively, transpose this matrix-vector product by
traversing the code in reverse:

\[ y = F_k(F_{k-1}(\cdots F_1(x) \cdots)) \]
\[ \bar{x} = E_1^T(\cdots E_{k-1}^T(E_k^T(\bar{y})) \cdots) = A^T \bar{y} \]

• The result is the matrix-vector product of a weighting or
direction $\bar{y}$ with the gradient:

\[ \bar{x}^T = \bar{y}^T \nabla F(x), \qquad \bar{x} = A^T \bar{y}. \]

• For n outputs, we need to evaluate $A^T \bar{y}$ n times.
• If the number of outputs n is smaller than the number of
inputs m, using adjoint gradients is more efficient.
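
A hand-written reverse sweep for a short chain shows the transposition. This is a sketch under stated assumptions: the three operations F1, F2, F3 are made up, and the function names `forward`, `tangent` and `adjoint` are hypothetical.

```python
import math

def forward(x):
    """Primal y = F3(F2(F1(x))); intermediates are kept for the reverse sweep."""
    a = [x[0] * x[1], x[1] + x[2]]        # F1: R^3 -> R^2
    b = [math.sin(a[0]), a[0] * a[1]]     # F2: R^2 -> R^2
    y = b[0] + b[1] ** 2                  # F3: R^2 -> R
    return y, (a, b)

def tangent(x, xd):
    """Forward mode: ydot = E3(E2(E1(xdot)))."""
    _, (a, b) = forward(x)
    ad = [xd[0] * x[1] + x[0] * xd[1], xd[1] + xd[2]]
    bd = [math.cos(a[0]) * ad[0], ad[0] * a[1] + a[0] * ad[1]]
    return bd[0] + 2.0 * b[1] * bd[1]

def adjoint(x, yb):
    """Reverse mode: xbar = E1^T(E2^T(E3^T(ybar))), operations in reverse order."""
    _, (a, b) = forward(x)
    bb = [yb, 2.0 * b[1] * yb]                                    # E3^T
    ab = [math.cos(a[0]) * bb[0] + a[1] * bb[1], a[0] * bb[1]]    # E2^T
    return [x[1] * ab[0], x[0] * ab[0] + ab[1], ab[1]]            # E1^T

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

x = [0.7, -1.2, 2.0]
xbar = adjoint(x, 1.0)     # full gradient (m = 3 inputs) from one reverse sweep
cols = [tangent(x, [1.0 if j == i else 0.0 for j in range(3)]) for i in range(3)]

# Dot-product test: ybar * ydot must equal xbar . xdot for any direction xdot.
xd = [0.3, -0.5, 0.2]
ydot = tangent(x, xd)
print(xbar, cols, ydot, dot(xbar, xd))
```

With one output and three inputs, the reverse sweep delivers in one pass what forward mode needs three passes for.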


Tangent-linear code with AD


• Typically the innermost loop of an explicit compressible CFD
code is

    call initialise_flow ( ←U )
    call metrics ( →X, ←Nrm )
    do nIter = 1,mIt
      call residual ( →U, →Nrm, ←R )
      call update ( →R, ←U )
    end do
    call cost_fun ( →U, →Nrm, ←J )

• The tangent version then is

    call initialise_flow_d ( ←U, ←U̇ )
    call metrics_d ( →X, →Ẋ, ←Nrm, ←Ṅrm )
    do nIter = 1,mIt
      call residual_d ( →U, →U̇, →Nrm, →Ṅrm, ←R, ←Ṙ )
      call update_d ( →R, →Ṙ, ←U, ←U̇ )
    end do
    call cost_fun_d ( →U, →U̇, →Nrm, →Ṅrm, ←J, ←J̇ )
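
The structure of the tangent loop can be mimicked on a toy problem. The sketch below assumes a made-up two-unknown residual R(U, α) and a cost J = U_0² + U_1; the routine names follow the slide, but the bodies are hypothetical and hand-differentiated rather than tool-generated.

```python
# Toy tangent-linear time-stepping: primal and tangent advance side by side.

def residual(U, alpha):
    return [U[0] + 0.1 * U[0] ** 3 - alpha, U[1] - 0.5 * U[0]]

def residual_d(U, Ud, alpha, alphad):
    R = residual(U, alpha)
    Rd = [(1.0 + 0.3 * U[0] ** 2) * Ud[0] - alphad, Ud[1] - 0.5 * Ud[0]]
    return R, Rd

def update(R, U, omega=0.5):
    # explicit update U <- U - omega * R; linear, so update_d has the same form
    return [u - omega * r for u, r in zip(U, R)]

def cost_fun(U):
    return U[0] ** 2 + U[1]

def cost_fun_d(U, Ud):
    return cost_fun(U), 2.0 * U[0] * Ud[0] + Ud[1]

def primal(alpha, n_iter=200):
    U = [0.0, 0.0]
    for _ in range(n_iter):
        U = update(residual(U, alpha), U)
    return cost_fun(U)

def tangent(alpha, alphad=1.0, n_iter=200):
    U, Ud = [0.0, 0.0], [0.0, 0.0]
    for _ in range(n_iter):
        R, Rd = residual_d(U, Ud, alpha, alphad)
        U, Ud = update(R, U), update(Rd, Ud)
    return cost_fun_d(U, Ud)

J, Jd = tangent(1.0)
h = 1e-6
fd = (primal(1.0 + h) - primal(1.0 - h)) / (2.0 * h)   # finite-difference check
print(J, Jd, fd)
```

The tangent derivative and the central finite difference should agree to several digits, which is the kind of check used later in the sensitivity validation tables.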


Timestepping using “Full AD”


• Typically the innermost loop of an explicit compressible CFD
code is

    call initialise_flow ( ←U )
    call metrics ( →X, ←Nrm )
    do nIter = 1,mIt
      call residual ( →U, →Nrm, ←R )
      call update ( →R, ←U )
    end do
    call cost_fun ( →U, →Nrm, ←J )

with →inputs and ←outputs.

• ‘Brute-force’ application of reverse-mode AD:

    Ū = 0; J̄ = 1
    call cost_fun_b ( →U, ←Ū, →Nrm, ←N̄rm, ←J, →J̄ )
    do nIter = mIt,1,-1
      call update_b ( →R, ←R̄, ←U, →Ū )
      call residual_b ( →U, ←Ū, →Nrm, ←N̄rm, ←R, →R̄ )
    end do
    call metrics_b ( →X, ←X̄, ←Nrm, →N̄rm )

Use of the primal time-stepping


To solve $A^T v = g$, re-use the time-stepping sequence of the CFD
solver (the primal):
• evaluate the adjoint residual R̄ = A^T Ū by calling the reverse-differentiated
residual (res_b) with input Ū and output R̄:

    call cost_fun_b ( →U, ←g, →Nrm, ←N̄rm, ←J, 1 )
    do nIter = 1,mIt
      call residual_r ( →U, ←R̄, →Nrm, ←R, →Ū )
      R̄ = R̄ + g
      call update ( →R̄, ←Ū )
    end do
    call residual_nrm ( →U, →Nrm, ←N̄rm, ←R, →Ū )
    call metrics_b ( →X, ←X̄, ←Nrm, →N̄rm )

• Note that R̄ and Ū are swapped in residual_r, and that we call the
standard update, not the transposed update.
• As A and A^T share eigenvalues, this procedure will converge
at the identical rate as the primal, as does the equivalent ’full
AD’ version with reversed sequence.
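
The idea of re-using the primal time-stepping for the transposed system can be sketched on a small linear model. Everything here is an assumption made for illustration: the 3x3 non-symmetric matrix A, the vectors f and g, the relaxation factor and the helper names. The same explicit update is applied once to A u = f and once to A^T v = g, and the gradient estimates g^T u_n and v_n^T f are compared at every iteration.

```python
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def timestep(M, rhs, n_iter, omega=0.2):
    """Explicit update x <- x - omega * (M x - rhs); returns all iterates."""
    x = [0.0] * len(rhs)
    history = []
    for _ in range(n_iter):
        res = [r - b for r, b in zip(matvec(M, x), rhs)]
        x = [xi - omega * ri for xi, ri in zip(x, res)]
        history.append(x)
    return history

A = [[3.0, 1.0, 0.0],
     [-1.0, 2.0, 0.5],
     [0.0, -0.5, 2.5]]      # made-up, non-symmetric system matrix
f = [1.0, 0.0, -1.0]
g = [0.5, 1.0, 2.0]

# Tangent: iterate on A u = f, gradient estimate g^T u_n.
tangent_hist = [dot(g, u) for u in timestep(A, f, 200)]
# Adjoint: same update applied to A^T v = g, gradient estimate v_n^T f.
adjoint_hist = [dot(v, f) for v in timestep(transpose(A), g, 200)]

for n in (0, 1, 2, 199):
    print(n + 1, tangent_hist[n], adjoint_hist[n])
```

For this linear model with zero initial guess, the two histories coincide to round-off at every iteration, not only at convergence, which mirrors the behaviour reported in the gradient convergence comparison.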

Gradient convergence comparison, tangent vs. adjoint

It.      Tangent linearisation    Altered adjoint timestep
1        1.5693931118856          1.5693931118856
2        2.5791035355540          2.5791035355540
3        3.2301257117945          3.2301257117945
4        3.6606833916720          3.6606833916720
5        3.9529170141718          3.9529170141718
6        4.1559630103599          4.1559630103599
7        4.2996992666559          4.2996992666559
8        4.4027343761762          4.4027343761762
...      ...                      ...
10000    3.0305888221497          3.0305888221497
...      ...                      ...


Performance of discrete adjoint

Cost of evaluating the primal (CFD) is 1 unit.

Solver                   Cost
Primal (CFD)             1.0
Pseudo-ts. adjoint       2.6337738
‘Brute force’ adjoint    4.2138729


Time-stepping for fully coupled schemes

• Compressible schemes are typically fully coupled, i.e. a
complete residual vector is computed and all its components are
coupled.
• The system matrices for primal, tangent-linear A and adjoint
A^T have the same eigenvalues, so the same time-stepping
schemes and preconditioners are optimal.
• All non-symmetric elements, e.g. low-Mach
preconditioners or Block-Jacobi spectral timestepping, need to
be transposed to maintain convergence properties.
• All symmetric parts, such as explicit updates and multi-grid, can
be kept as is and do not need to be reversed.


Adjoint convergence
NACA 0012 aerofoil, Ma = 0.43 and α = 2°, 4-level multi-grid,
adjoint w.r.t. angle of attack.

[Figure: convergence history, logRMS (0 to -16) against iterations on the finest grid (0 to 12000), for Single grid - Primal, Single grid - Adjoint, Multigrid - Primal and Multigrid - Adjoint.]


Use of AD for small, fully coupled CFD codes

• The Finite Volume formulation leads to a very simple structure
of the system: R(U) = 0, where R is the control volume
residual obtained as a sum of fluxes that have to balance.
• The call-tree is simple and can be split by hand into 3
branches: cost_fun, residual, metrics.
• For efficiency, some code (e.g. residual) is differentiated twice,
w.r.t. different variables.
• Assembling these 3 branches in a hand-coded iterative loop
typically allows primal code to be reused for preconditioning.
• Very effective code can be produced, with memory and
runtime performance equal to continuous adjoints, allowing
“hot-starts” with arbitrary initial adjoint solutions.


mgOpt: in-house compressible flow solver of QMUL


Characteristics of mgOpt:
• node-centred, unstructured finite-volume code written in F90,
• MUSCL extrapolation for 2nd order accuracy with limiters,
• Spalart-Allmaras turbulence model,
• Multi-stage Runge-Kutta time-stepping with geometric
multi-grid.
Adjoint mgOpt:
• differentiate flux functions, preconditioners and metrics
separately,
• assemble the routines in the same time-stepping routine as for
the primal,
• performance of the adjoint code is about 2x primal in CPU
and memory,
• first and second derivatives.

Sensitivity Validation

AUSM+ up method              1st-order accuracy   2nd-order accuracy
First derivative   FD        0.453321242571       0.619166853538
                   Tangent   0.453321292403       0.619166824022
                   Adjoint   0.453321292403       0.619166824022
Second derivative  FD        138.3578951408       142.2067719136
                   ToT       138.3578906648       142.2184126185
                   ToR       138.3578906648       142.2184126185
Table: Gradient comparison on NACA 0012, using AUSM+ up.

Roe method                   1st-order accuracy   2nd-order accuracy
First derivative   FD        0.488457739739       0.623140574551
                   Tangent   0.488457833838       0.623140717437

Ahmed body

Figure: Adjoint solution of Ahmed body


Onera M6

Figure: Adjoint solution on the Onera M6 wing



Surface sensitivities


Design: Formula front wing


Cost function: downforce, constrained with drag

Design vectors on Formula front wing.



Design: Formula Front Wing

Step 1


The equations of incompressible flow: a saddle-point problem

The incompressible flow equations have a saddle-point form:

\[ \frac{\partial u_i}{\partial t} = -u_j \frac{\partial u_i}{\partial x_j} - \frac{1}{\rho} \frac{\partial p}{\partial x_i} + \nu \frac{\partial^2 u_i}{\partial x_j^2} \qquad \text{(momentum)} \]
\[ 0 = \frac{\partial u_i}{\partial x_i} + \frac{\partial u_j}{\partial x_j} + \frac{\partial u_k}{\partial x_k} \qquad \text{(continuity)} \]

• There is no evolution equation for the pressure.
• The continuity equation is a constraint rather than an evolution equation.

Standard SIMPLE pressure-correction scheme:
• implicit block solves for each of $u'_1$, $u'_2$, $u'_3$ with a predicted
pressure $p'$, higher-order terms lagged in time,
• then solve a pressure-correction equation for $\delta p$,
• then update $u_1$, $u_2$, $u_3$, $p$ using under-relaxation.

Incompressible pressure-correction algorithm (SIMPLE)


The SIMPLE algorithm in 2-D:

    call initialise_flow ( ←U, ←P )
    call metrics ( →X, ←Nrm )
    do nIter = 1,mIt
      call solve_u ( →U, →P, →Nrm, ←u’ )
      call solve_v ( →U, →P, →Nrm, ←v’ )
      call solve_p ( →U, →P, →Nrm, ←p’ )
      call corr ( →u’, →v’, →p’, ←U, ←P )
    end do
    call cost_fun ( →U, →P, →Nrm, ←J )

Reverse-mode differentiation then gives:

    Ū = 0; P̄ = 0; J̄ = 1
    call cost_fun_b ( →U, ←Ū, →P, ←P̄, →Nrm, ←N̄rm, ←J, →J̄ )
    do nIter = mIt,1,-1
      call corr_b ( →u’, ←ū’, →v’, ←v̄’, →p’, ←p̄’, ←U, →Ū, ←P, →P̄ )
      call solve_p_b ( →U, ←Ū, →P, ←P̄, →Nrm, ←N̄rm, ←p’, →p̄’ )
      call solve_v_b ( →U, ←Ū, →P, ←P̄, →Nrm, ←N̄rm, ←v’, →v̄’ )
      call solve_u_b ( →U, ←Ū, →P, ←P̄, →Nrm, ←N̄rm, ←u’, →ū’ )
    end do
    call metrics_b ( →X, ←X̄, ←Nrm, →N̄rm )


Development of incompressible adjoint solvers

• In-house code gpde is a compact incompressible CFD code
(5,000 lines) written as a test-bed for developing adjoint N-S
fields.
• Written in Fortran 90/95, using its more modern programming features.
• The algorithm computes laminar/turbulent flow through
complex 2-D/3-D geometries.
• Code design mimics typical CFD code setup and algorithms,
exploiting support by automatic differentiation tools to the
maximum.
• Via the makefile, either the primal, the primal with tangent or
the primal with adjoint can be built automatically.
• Run-time ratio of adjoint over primal is approximately 2.


Validation case: VW S-Bend

• Simplified vehicle climatisation duct.


• Uniform flow at inlet, shape modifications only in the bend.
• Objective: total pressure loss. Sensitivities are computed
w.r.t. vertex coordinates.
• Turbulent viscosity is approximated using the Spalart-Allmaras
model. For y+ > 11.225 the standard wall function
approximates the near-wall turbulent viscosity.


Duct flow case with turb. model, ReH = 60

[Figure: (a) convergence history of primal [LR] and adjoint [RL], max. residual 1-norm (1 to 1e-07) against iteration (0 to 400); (b) fluid speed.]


Duct flow case with turb. model, ReH = 60

(c) Bottom view (d) Top view


Duct flow case with turb. model, ReH = 600

[Figure: (e) convergence history of primal [LR] and adjoint [RL], max. residual 1-norm (1 to 1e-07) against iteration (0 to 400); (f) fluid speed.]


Duct flow case with turb. model, ReH = 600

(g) Bottom view (h) Top view


Industrial incompressible adjoint codes


• A number of industrial CFD packages intend to have
(Star-CCM+) or already have (Fluent 13, OpenFOAM 2.0)
adjoint versions.
• So far, none of the current sensitivity implementations uses
automatic differentiation (AD) to generate the sensitivity
algorithm.
• Bischof et al. (2007) differentiated Fluent IV in tangent mode.
• Here we apply forward and reverse AD to an incompressible commercial
flow solver, ESI’s ACE+.
• Original / full source code size: 3000 files, 1.1M lines /
40MB of source code.
• Reduced kernel code size: 680 files, 230K lines / 8MB of
source code, with some Fortran 90 features suppressed or
eliminated.

Differentiation with Tapenade

• The target language in this work is Fortran 90/95, which is
well supported by the source-transformation AD tool
Tapenade.
• An alternative to differentiation by source transformation is
operator overloading. Both techniques are investigated in the
EU-funded project About Flow.
• We like source transformation: the resulting code can be
easily inspected, troubleshooting is much easier, and the primal can be
amended to produce better AD’ed code.
• A selection of AD tools is presented on autodiff.org.


Pre- and post-processing methodology

• To aid the construction of the sensitivity algorithm, source
code pre- and post-processing is performed either side of
differentiation.
• Pre- and post-processing involve:
1. Pre-processing: remove dead code in the original source code via C
preprocessor pragmas.
2. Post-processing: reorganise modules, types and procedures in the generated
source code, in addition to optimising fixed-point iterators and
introducing library functions where appropriate (such as the
sparse linear solver).
• Source code processing is where the bulk of the work lies in
order to successfully generate adjoint algorithms using AD,
but the tools are very difficult to implement.


Legacy incompressible, hard: flow-graph


Pre-processing steps

1. Elimination of parts of the code not to be differentiated:
physically cut out models (e.g. icing, solidification) and
functionality (array management) which are not to be
differentiated. Wrap code that is to be kept but not differentiated,
e.g. certain i/o functions, with pragmas. About 5
man-months.
2. Create a ’fake’ top level that hides global data structures, in
particular a global array of pointers to all allocated fields.
Replace references to that global array with equivalent array references.
3. Removal of complex boundary condition routines that involve
separate iterations, such as “fixed mass flow”.
4. Collapse modules into a single one to aid post-processing.


Pre-processing

• The most complex task is to identify and tag, within the entire
program source code, the relevant source code which needs to be
differentiated.
• For large programs (100,000+ lines) this becomes tedious and
error-prone to perform manually.
• By making use of the call-graph generated by the AD tool
Tapenade, the tedium of tracing dependencies is removed.
• Knowledge of the code is still required to decide which dependent
routines are non-essential and so can be ignored.


Pre-processing: example of call tree to be pruned


Pre-processing: primal source equipped with pragmas


module matrix_utils_m
contains

#ifndef STRIP_AD
  !! passive
  subroutine print_matrix(mat)
    ...
  end subroutine
#endif

  !! active non-differentiable
  function coo_to_csr(i, j, ja, ia) result(kij)
#ifndef STRIP_AD
    ...
#else
    kij = 0 ! retain a trivial dependency
#endif
  end function

  !! active differentiable
  subroutine matrix_setup(aij, phi, ja, ia, res)
    ...
#ifndef STRIP
    if (use_advanced_feature) call adv_feature()
#endif
  end subroutine
end module

Post-processing

This process is automated via a program which parses the broad
features of the source code and modifies it according to in-built
rules, such as:
• Use the augmented primal module in its derivative module
• Remove derivative derived data types
• Strip repeated primal subroutines in derivative modules
• Redirect the differentiated linear solver calls to its manually
constructed derivative
• Only record (and restore) the last iteration of fixed-point
iterations
Having access to the parse tree in order to modify the source code
would enable much more robust processing.


Post-processing of AD’ed code

1. Inherit original modules into their generated derivative modules.
2. Purge generated code of all definitions of primal-equivalent routines
and data (except private data).
3. Ensure that all references to primal routines and data refer to
original code and not to equivalent generated code.
4. Identify and remove generated derivative type definitions and replace
associated declarations using that type with the original type.
5. For the adjoint of linear system solvers, use the hand-coded
alternative (this must be properly converged at each invocation).
6. In adjoint code, identify fixed-point iterations and reconfigure them to
record once and restore once the active primal variables.
7. In adjoint code, identify active quantities which can be assumed to
behave like constants and remove their associated adjoint
computation.
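
The effect of recording only the converged state of a fixed-point iteration can be sketched on a scalar model. This is an illustration under stated assumptions, not the post-processing tool itself: the contraction G(x, α) = 0.5 cos(x) + α, the cost J = x², and all function names are hypothetical.

```python
import math

def G(x, alpha):
    # made-up contraction with |dG/dx| <= 0.5
    return 0.5 * math.cos(x) + alpha

def dG_dx(x):
    return -0.5 * math.sin(x)

def cost(alpha, n_iter=60):
    x = 0.0
    for _ in range(n_iter):
        x = G(x, alpha)
    return x * x

def adjoint_full_tape(alpha, n_iter=60):
    """'Brute-force' reverse mode: every iterate is recorded and restored."""
    xs = [0.0]
    for _ in range(n_iter):
        xs.append(G(xs[-1], alpha))
    xb, alphab = 2.0 * xs[-1], 0.0          # seed from J = x^2
    for k in range(n_iter - 1, -1, -1):
        alphab += xb                        # dG/dalpha = 1
        xb = dG_dx(xs[k]) * xb
    return alphab

def adjoint_last_state(alpha, n_iter=60):
    """Record only the converged state, then iterate the adjoint fixed point."""
    x = 0.0
    for _ in range(n_iter):
        x = G(x, alpha)
    xb = 0.0
    for _ in range(n_iter):
        xb = 2.0 * x + dG_dx(x) * xb        # xbar = dJ/dx + (dG/dx)^T xbar
    return xb                               # alphabar = (dG/dalpha)^T xbar = xbar

alpha = 0.3
g_tape = adjoint_full_tape(alpha)
g_fp = adjoint_last_state(alpha)
h = 1e-6
fd = (cost(alpha + h) - cost(alpha - h)) / (2.0 * h)
print(g_tape, g_fp, fd)
```

Once the primal is converged, both variants return the same gradient, but the second stores a single state instead of the whole iteration history.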


ACE+ differentiation

• Reduced kernel code size: 680 files, 230K lines / 8MB of
source code.
• Using Tapenade version 3.6 (September 2011).
• Memory: 500-900MB RAM, 2.0-2.5GB VM.
• CPU time: 10 minutes on an Intel i5 equivalent for
tangent-linear mode.


tangent-linear ACE+, S-Bend testcase

S-Bend testcase, tangent-linear discrete solution, pressure


tangent-linear ACE+, S-Bend testcase

Pressure sensitivity to inlet variation


tangent-linear ACE+, Ahmed body

Pressure sensitivity to inlet variation


Adjoint runtime performance


Current status: the code differentiates, compiles and runs; the adjoint
solution is yet to be validated.
“Out of the box” performance for adjoint ACE+:

Test case         Memory (primal)   Memory (adjoint)   Runtime (primal)   Runtime (adjoint)
2-D channel I     1.2MB             10.5MB             43s                257s
2-D channel II    21.5MB            112.0MB            1.3h               5.4h
3-D S-Bend        1.7GB             8.3GB              12.7h              45.5h
3-D Ahmed Body    3.2GB             15.6GB             15.2h              52.3h


Dear Santa ...


For Christmas I would like ..
For my next project I would like:
• An option to dump the Call Tree, to help with pruning.
• A ’disregard’ family of options:
• Don’t provide differentiated code, but parse to complete the
Call Tree.
• Don’t provide diff’ed code, and accept my word that these
outputs depend on these inputs.
• Ah, just forget about this bit of code. Trust me, it does
nothing.
• Single modules, so we can have functions diff’ed w.r.t.
various outputs.
• Easy linear solver pragmas: The next bit of code solves
Ax = b, and I want Aẋ = ḃ, or A^T x̄ = b̄.
• Better support for global container data-structures: Pushing
the array under this pointer is enough: you don’t need to push
the whole array of all pointers.
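
What the linear solver wish amounts to can be sketched with hand-coded derivatives of a solve. The sketch assumes a passive (non-differentiated) matrix A and uses one particular convention: x̄ is the incoming adjoint of the solution and the transposed solve returns the adjoint b̄ of the right-hand side. The 2x2 data and the names `solve_d` and `solve_b` are hypothetical.

```python
def solve2(M, rhs):
    """Stand-in for the linear solver: 2x2 system via Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [(rhs[0] * d - b * rhs[1]) / det,
            (a * rhs[1] - c * rhs[0]) / det]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def solve_d(A, b, bd):
    """Tangent of x = A^{-1} b for passive A: call the same solver on bdot."""
    return solve2(A, b), solve2(A, bd)

def solve_b(A, xb):
    """Adjoint for passive A: one call of the solver with the transposed matrix."""
    return solve2(transpose(A), xb)

A = [[4.0, 1.0], [2.0, 3.0]]
b, bd = [1.0, 2.0], [0.3, -0.7]
xb = [1.5, 0.5]

x, xd = solve_d(A, b, bd)
bb = solve_b(A, xb)

# Dot-product test: xbar . xdot must equal bbar . bdot.
print(dot(xb, xd), dot(bb, bd))
```

Neither derivative differentiates through the solver's internals; both simply call the solver again, which is why a pragma marking the solve would be enough for the tool.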

Summary

• AD works really well for simple codes, where the ratio of volume of
code to (willingness%permission) to change source is small.
Fully coupled timestepping schemes of the form
R(U) = 0 are easy and will give such good performance that
you wonder why anyone tries anything else.
• Add a more complex block-solving time-stepping scheme with
lagged r.h.s., and things get more tricky. You can still get
good performance, especially if your algorithm does lots of
fixed-point iterations, but it needs more work.
• You have a legacy code which you are not allowed to change
much? Still worth using AD, we think. But you’re looking at a
significant setup time. After 4 years we’re finally getting close.


Acknowledgements

Thanks to the team around Tapenade.

The research in FlowHead was funded by the European
Commission under theme ’IDEAS’ SST.2007-RTD-1.
http://flowhead.sems.qmul.ac.uk

About Flow is funded by the European Commission under theme
’People’ FP7-PEOPLE-2012-ITN 317006.
http://aboutflow.sems.qmul.ac.uk
