PETSc Tutorial

Numerical Software Libraries for
the Scalable Solution of PDEs
Satish Balay
Kris Buschelman
Bill Gropp
Lois Curfman McInnes
Barry Smith

Mathematics and Computer Science Division
Argonne National Laboratory
http://www.mcs.anl.gov/petsc
Intended for use with version 2.0.29 of PETSc

2 of 132

Tutorial Objectives
• Introduce the Portable, Extensible Toolkit for
Scientific Computation (PETSc)
• Demonstrate how to write a complete parallel
implicit PDE solver using PETSc
• Learn about PETSc interfaces to other packages
• Show how to learn more about PETSc

3 of 132

The Role of PETSc
• Developing parallel, non-trivial PDE solvers that
deliver high performance is still difficult, and
requires months (or even years) of concentrated
effort.
• PETSc is a toolkit that can ease these difficulties
and reduce the development time, but it is not a
black-box PDE solver nor a silver bullet.

4 of 132

What is PETSc?
• A freely available and supported research code
– Available via http://www.mcs.anl.gov/petsc
– Usable from Fortran 77/90, C, and C++
– Hyperlinked documentation and manual pages for all routines
– Many tutorial-style examples
– Support via email: petsc-maint@mcs.anl.gov
• Portable to any parallel system supporting MPI, including
– Tightly coupled systems
• Cray T3E, SGI Origin, IBM SP, HP 9000, Sun Enterprise
– Loosely coupled systems, e.g., networks of workstations
• Compaq, HP, IBM, SGI, Sun
• PCs running Linux or NT
• PETSc history
– Begun in September 1991
– Now: over 4,000 downloads of version 2.0
• PETSc funding and support
– Department of Energy: MICS Program, DOE2000
– National Science Foundation: Multidisciplinary Challenge Program, CISE

5 of 132

PETSc Concepts
• How to specify the mathematics of the problem
– Data objects
• vectors, matrices
• How to solve the problem
– Solvers
• linear, nonlinear, and time stepping (ODE) solvers
• Parallel computing complications
– Parallel data layout
• structured and unstructured meshes

6 of 132

Tutorial Topics
• Getting started
– sample results
– programming paradigm
• Data objects
– vectors (e.g., field variables)
– matrices (e.g., sparse Jacobians)
• Viewers
– object information
– visualization
• Solvers
– linear
– nonlinear
– timestepping (and ODEs)
• Data layout and ghost values
– structured and unstructured mesh problems
– partitioning and coloring
• Putting it all together
– a complete example
• Debugging and error handling
• Profiling and performance tuning
• Extensibility issues
• Using PETSc with other software packages

7 of 132

Tutorial Topics: Using PETSc with Other Packages
• PVODE – ODE integrator
– A. Hindmarsh et al. – http://www.llnl.gov/CASC/PVODE
• ILUDTP – drop tolerance ILU
– Y. Saad – http://www.cs.umn.edu/~saad
• ParMETIS – parallel partitioner
– G. Karypis – http://www.cs.umn.edu/~karypis
• Overture – composite mesh PDE package
– D. Brown, W. Henshaw, and D. Quinlan – http://www.llnl.gov/CASC/Overture
• SAMRAI – AMR package
– S. Kohn, X. Garaizar, R. Hornung, and S. Smith – http://www.llnl.gov/CASC/SAMRAI
• SPAI – sparse approximate inverse preconditioner
– S. Barnard and M. Grote – http://www.sam.math.ethz.ch/~grote/spai
• Matlab – http://www.mathworks.com
• TAO – optimization software
– S. Benson, L. C. McInnes, and J. Moré – http://www.mcs.anl.gov/tao

8 of 132

Tutorial Approach
From the perspective of an application programmer:
• Beginner (1)
– basic functionality, intended for use by most programmers
• Intermediate (2)
– selecting options, performance evaluation and tuning
• Advanced (3)
– user-defined customization of algorithms and data structures
• Developer (4)
– advanced customizations, intended primarily for use by library developers

1 beginner, 2 intermediate, 3 advanced, 4 developer – markers used only in this tutorial

9 of 132

Incremental Application Improvement
• Beginner
– Get the application “up and walking”
• Intermediate
– Experiment with options
– Determine opportunities for improvement
• Advanced
– Extend algorithms and/or data structures as needed
• Developer
– Consider interface and efficiency issues for integration and interoperability of multiple toolkits
• Full tutorials available at http://www.mcs.anl.gov/petsc/docs/tutorials

10 of 132

Structure of PETSc
PETSc PDE Application Codes
  ODE Integrators                                 Visualization
  Nonlinear Solvers, Unconstrained Minimization   Interface
  Linear Solvers: Preconditioners + Krylov Methods
  Object-Oriented Matrices, Vectors, Indices      Grid Management
  Profiling Interface
  Computation and Communication Kernels: MPI, MPI-IO, BLAS, LAPACK

11 of 132

PETSc Numerical Components
Nonlinear Solvers: Newton-based methods (line search, trust region), other
Time Steppers: Euler, Backward Euler, pseudo-timestepping, other
Krylov Subspace Methods: GMRES, CG, CGS, Bi-CG-STAB, TFQMR, Richardson, Chebychev, other
Preconditioners: Additive Schwarz, Block Jacobi, Jacobi, ILU, ICC, LU (sequential only), others
Matrices: Compressed Sparse Row (AIJ), Blocked Compressed Sparse Row (BAIJ), Block Diagonal (BDIAG), Dense, other
Distributed Arrays
Index Sets: Indices, Block Indices, Stride, other
Vectors

12 of 132

Flow of Control for PDE Solution
Main Routine
→ Timestepping Solvers (TS)
→ Nonlinear Solvers (SNES)
→ Linear Solvers (SLES): PC, KSP
User code: Application Initialization, Function Evaluation, Jacobian Evaluation, Post-Processing
PETSc code: TS, SNES, SLES, PC, KSP

13 of 132

Flow of Control for PDE Solution
(with other tools: Overture, SAMRAI, SPAI, ILUDTP, PVODE)
Main Routine
→ Timestepping Solvers (TS), with PVODE
→ Nonlinear Solvers (SNES)
→ Linear Solvers (SLES): PC (with SPAI), KSP (with ILUDTP)
User code: Application Initialization (with Overture, SAMRAI), Function Evaluation, Jacobian Evaluation, Post-Processing

14 of 132

Levels of Abstraction in Mathematical Software
• Application-specific interface
– Programmer manipulates objects associated with the application
• High-level mathematics interface
– Programmer manipulates mathematical objects, such as PDEs and boundary conditions
• Algorithmic and discrete mathematics interface (PETSc emphasis)
– Programmer manipulates mathematical objects (sparse matrices, nonlinear equations), algorithmic objects (solvers), and discrete geometry (meshes)
• Low-level computational kernels
– e.g., BLAS-type operations

15 of 132

Basic PETSc Components
• Data Objects
– Vec (vectors) and Mat (matrices)
– Viewers
• Solvers
– Linear Systems
– Nonlinear Systems
– Timestepping
• Data Layout and Ghost Values
– Structured Mesh
– Unstructured Mesh

16 of 132

PETSc Programming Aids
• Correctness Debugging
– Automatic generation of tracebacks
– Detecting memory corruption and leaks
– Optional user-defined error handlers
• Performance Debugging
– Integrated profiling using -log_summary
– Profiling by stages of an application
– User-defined events

17 of 132

The PETSc Programming Model
• Goals
– Portable, runs everywhere
– Performance
– Scalable parallelism
• Approach
– Distributed memory, “shared-nothing”
• Requires only a compiler (single node or processor)
• Access to data on remote machines through MPI
– Can still exploit “compiler discovered” parallelism on each node (e.g., SMP)
– Hide within parallel objects the details of the communication
– User orchestrates communication at a higher abstract level than message passing

e.g..g.. VecCreate(MPI_Comm comm. int M.18 of 132 Collectivity • MPI communicators (MPI_Comm) specify collectivity (processors involved in a computation) • All PETSc creation routines for solver and data objects are collective with respect to a communicator. they must be called in the same order on each processor . while others are not. e. Vec *x) • Some operations are collective. – collective: VecNorm( ) – not collective: VecGetLocalSize() • If a sequence of collective routines is used. int m.

19 of 132

Hello World
#include "petsc.h"
int main( int argc, char *argv[] )
{
  PetscInitialize( &argc, &argv, NULL, NULL );
  PetscPrintf( PETSC_COMM_WORLD, "Hello World\n" );
  PetscFinalize();
  return 0;
}

20 of 132

Hello World (Fortran)
      program main
      integer ierr, rank
#include "include/finclude/petsc.h"
      call PetscInitialize( PETSC_NULL_CHARACTER, ierr )
      call MPI_Comm_rank( PETSC_COMM_WORLD, rank, ierr )
      if (rank .eq. 0) then
        print *, 'Hello World'
      endif
      call PetscFinalize(ierr)
      end

21 of 132

Fancier Hello World
#include "petsc.h"
int main( int argc, char *argv[] )
{
  int rank;
  PetscInitialize( &argc, &argv, NULL, NULL );
  MPI_Comm_rank( PETSC_COMM_WORLD, &rank );
  PetscSynchronizedPrintf( PETSC_COMM_WORLD, "Hello World from %d\n", rank );
  PetscFinalize();
  return 0;
}

22 of 132

Solver Definitions: For Our Purposes
• Explicit: Field variables are updated using neighbor information (no global linear or nonlinear solves)
• Semi-implicit: Some subsets of variables (e.g., pressure) are updated with global solves
• Implicit: Most or all variables are updated in a single global linear or nonlinear solve

23 of 132

Focus On Implicit Methods
• Explicit and semi-implicit are easier cases
• No direct PETSc support for
– ADI-type schemes
– spectral methods
– particle-type methods

24 of 132

Numerical Methods Paradigm
• Encapsulate the latest numerical algorithms in a consistent, application-friendly manner
• Use mathematical and algorithmic objects, not low-level programming language objects
• Application code focuses on mathematics of the global problem, not parallel programming details

25 of 132

Data Objects
• Vectors (Vec)
– focus: field data arising in nonlinear PDEs
• Matrices (Mat)
– focus: linear operators arising in nonlinear PDEs (i.e., Jacobians)
• Object creation (beginner)
• Object assembly (beginner)
• Setting options (intermediate)
• Viewing (intermediate)
• User-defined customizations (advanced)

tutorial outline: data objects

26 of 132

Vectors
• Fundamental objects for storing field solutions, right-hand sides, etc.
• VecCreateMPI(..., Vec *)
– MPI_Comm – processors that share the vector
– number of elements local to this processor
– total number of elements
• Each process locally owns a subvector of contiguously numbered global indices
[diagram: a vector partitioned contiguously across proc 0 ... proc 4]

beginner

data objects: vectors

27 of 132

Vector Assembly
• VecSetValues(Vec, …)
– number of entries to insert/add
– indices of entries
– values to add
– mode: [INSERT_VALUES, ADD_VALUES]
• VecAssemblyBegin(Vec)
• VecAssemblyEnd(Vec)

beginner

data objects: vectors
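For concreteness, a minimal sketch of this assembly sequence. The argument order follows the 2.0-era VecSetValues calling convention; the indices and values here are hypothetical, and x is a previously created vector.

/* Sketch: insert two entries, possibly into locations owned by
   another process; the assembly calls move the data if needed. */
int    idx[2]  = {0, 3};        /* global indices (hypothetical) */
Scalar vals[2] = {1.0, 2.0};    /* values to insert */
VecSetValues(x, 2, idx, vals, INSERT_VALUES);
VecAssemblyBegin(x);            /* begins any needed communication */
VecAssemblyEnd(x);              /* vector is now ready for use */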

28 of 132

Parallel Matrix and Vector Assembly
• Processors may generate any entries in vectors and matrices
• Entries need not be generated on the processor on which they ultimately will be stored
• PETSc automatically moves data during the assembly process if necessary

beginner

data objects: vectors and matrices

29 of 132

Selected Vector Operations
Function Name                               Operation
VecAXPY(Scalar *a, Vec x, Vec y)            y = y + a*x
VecAYPX(Scalar *a, Vec x, Vec y)            y = x + a*y
VecWAXPY(Scalar *a, Vec x, Vec y, Vec w)    w = a*x + y
VecScale(Scalar *a, Vec x)                  x = a*x
VecCopy(Vec x, Vec y)                       y = x
VecPointwiseMult(Vec x, Vec y, Vec w)       w_i = x_i * y_i
VecMax(Vec x, int *idx, double *r)          r = max x_i
VecShift(Scalar *s, Vec x)                  x_i = s + x_i
VecAbs(Vec x)                               x_i = |x_i|
VecNorm(Vec x, NormType type, double *r)    r = ||x||

beginner

data objects: vectors
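A short usage sketch of two of these operations, for previously created vectors x and y. Note the 2.0-era convention of passing scalar arguments by pointer, visible in the table above; NORM_2 is assumed to be the two-norm flag.

/* Sketch: y = y + 2*x, then r = ||y||_2 */
Scalar alpha = 2.0;
double r;
VecAXPY(&alpha, x, y);
VecNorm(y, NORM_2, &r);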

30 of 132

Simple Example Programs
Location: petsc/src/sys/examples/tutorials/
• E ex2.c – synchronized printing (1)
Location: petsc/src/vec/examples/tutorials/
• E ex1.c, ex1f.F, ex1f90.F – basic vector routines (1)
• E ex3.c, ex3f.F – parallel vector layout (1)
And many more examples ...
E – on-line exercise; 1 beginner

data objects: vectors

31 of 132

Sparse Matrices
• Fundamental objects for storing linear operators (e.g., Jacobians)
• MatCreateMPIAIJ(…, Mat *)
– MPI_Comm – processors that share the matrix
– number of local rows and columns
– number of global rows and columns
– optional storage pre-allocation information

beginner

data objects: matrices

32 of 132

Parallel Matrix Distribution
Each process locally owns a submatrix of contiguously numbered global rows.
[diagram: rows of a matrix partitioned across proc 0 ... proc 4; proc 3's locally owned rows highlighted]
MatGetOwnershipRange(Mat A, int *rstart, int *rend)
– rstart: first locally owned row of global matrix
– rend - 1: last locally owned row of global matrix

beginner

data objects: matrices

33 of 132

Matrix Assembly
• MatSetValues(Mat, …)
– number of rows to insert/add
– indices of rows and columns
– number of columns to insert/add
– values to add
– mode: [INSERT_VALUES, ADD_VALUES]
• MatAssemblyBegin(Mat)
• MatAssemblyEnd(Mat)

beginner

data objects: matrices
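A minimal sketch of row-wise assembly for a 1-D Laplacian, combining MatGetOwnershipRange( ) from the previous slide with the calls above. Here n is the global dimension (an assumption declared elsewhere), and MAT_FINAL_ASSEMBLY is the standard assembly flag.

/* Sketch: each process assembles the rows it owns (a common,
   though not required, pattern). */
int    rstart, rend, i, nc, cols[3];
Scalar v[3];
MatGetOwnershipRange(A, &rstart, &rend);
for (i = rstart; i < rend; i++) {
  nc = 0;
  if (i > 0)   { cols[nc] = i-1; v[nc] = -1.0; nc++; }
  cols[nc] = i; v[nc] = 2.0; nc++;
  if (i < n-1) { cols[nc] = i+1; v[nc] = -1.0; nc++; }
  MatSetValues(A, 1, &i, nc, cols, v, INSERT_VALUES);
}
MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);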

34 of 132

Blocked Sparse Matrices
• For multi-component PDEs
• MatCreateMPIBAIJ(…, Mat *)
– MPI_Comm – processors that share the matrix
– block size
– number of local rows and columns
– number of global rows and columns
– optional storage pre-allocation information

beginner

data objects: matrices

35 of 132

Blocking: Performance Benefits
• 3D compressible Euler code
• Block size 5
• IBM Power2
[bar chart: MFlop/sec for matrix-vector products and triangular solves; the blocked (BAIJ) format is substantially faster than the basic (AIJ) format]
More issues and details discussed in Performance Tuning section

beginner

data objects: matrices

36 of 132

Viewers
• Printing information about solver and data objects (beginner)
• Visualization of field and matrix data (beginner)
• Binary output of vector and matrix data (intermediate)

tutorial outline: viewers

37 of 132

Viewer Concepts
• Information about PETSc objects
– runtime choices for solvers, nonzero info for matrices, etc.
• Data for later use in restarts or external tools
– vector fields, matrix contents
– various formats (ASCII, binary)
• Visualization
– simple x-window graphics
• vector fields
• matrix sparsity structure

beginner

viewers

38 of 132

Viewing Vector Fields
• VecView(Vec x, Viewer v);
• Default viewers
– ASCII (sequential): VIEWER_STDOUT_SELF
– ASCII (parallel): VIEWER_STDOUT_WORLD
– X-windows: VIEWER_DRAW_WORLD
• Default ASCII formats
– VIEWER_FORMAT_ASCII_DEFAULT
– VIEWER_FORMAT_ASCII_MATLAB
– VIEWER_FORMAT_ASCII_COMMON
– VIEWER_FORMAT_ASCII_INFO
– etc.
Solution components, viewed with runtime option -snes_vecmonitor: velocity u, velocity v, vorticity ζ, temperature T

beginner

viewers

39 of 132

Viewing Matrix Data
• MatView(Mat A, Viewer v);
• Runtime options available after matrix assembly
– -mat_view_info • info about matrix assembly
– -mat_view_draw • sparsity structure
– -mat_view • data in ASCII
– etc.

beginner

viewers

40 of 132

Solvers: Usage Concepts
Solver Classes
• Linear (SLES)
• Nonlinear (SNES)
• Timestepping (TS)
Usage Concepts
• Context variables
• Solver options
• Callback routines
• Customization

important concepts

tutorial outline: solvers

41 of 132

Linear PDE Solution
Main Routine
→ PETSc Linear Solvers (SLES): Solve Ax = b (PC, KSP)
User code: Application Initialization, Evaluation of A and b, Post-Processing

beginner

solvers: linear

42 of 132

Linear Solvers
Goal: Support the solution of linear systems, Ax = b, particularly for sparse, parallel problems arising within PDE-based models
User provides:
– Code to evaluate A, b

beginner

solvers: linear

43 of 132

Sample Linear Application: Exterior Helmholtz Problem
−∇²u − k²u = 0
lim_{r→∞} r^(1/2) (∂u/∂r + iku) = 0
Solution components: Real and Imaginary parts of u
Collaborators: H. M. Atassi, D. E. Keyes, L. C. McInnes, R. Susan-Resiga

beginner

solvers: linear

44 of 132

Helmholtz: The Linear System
• Logically regular grid, parallelized with DAs
• Finite element discretization (bilinear quads)
• Nonreflecting exterior BC (via DtN map)
• Matrix sparsity structure (option: -mat_view_draw)
– natural ordering (close-up) vs. nested dissection ordering

beginner

solvers: linear

45 of 132

Linear Solvers (SLES)
SLES: Scalable Linear Equations Solvers
• Application code interface (beginner)
• Choosing the solver (beginner)
• Setting algorithmic options (intermediate)
• Viewing the solver (intermediate)
• Determining and monitoring convergence (intermediate)
• Providing a different preconditioner matrix (intermediate)
• Matrix-free solvers (advanced)
• User-defined customizations (advanced)

tutorial outline: solvers: linear

46 of 132

Context Variables
• Are the key to solver organization
• Contain the complete state of an algorithm, including
– parameters (e.g., convergence tolerance)
– functions that run the algorithm (e.g., convergence monitoring routine)
– information about the current state (e.g., iteration number)

beginner

solvers: linear

47 of 132

Creating the SLES Context
• C/C++ version
ierr = SLESCreate(MPI_COMM_WORLD,&sles);
• Fortran version
call SLESCreate(MPI_COMM_WORLD,sles,ierr)
• Provides an identical user interface for all linear solvers
– uniprocessor and parallel
– real and complex numbers

beginner

solvers: linear

48 of 132

Linear Solvers in PETSc 2.0
Krylov Methods (KSP)
• Conjugate Gradient
• GMRES
• CG-Squared
• Bi-CG-stab
• Transpose-free QMR
• etc.
Preconditioners (PC)
• Block Jacobi
• Overlapping Additive Schwarz
• ICC, ILU via BlockSolve95
• ILU(k), LU (sequential only)
• etc.

beginner

solvers: linear

49 of 132

Basic Linear Solver Code (C/C++)
SLES sles;    /* linear solver context */
Mat  A;       /* matrix */
Vec  x, b;    /* solution, RHS vectors */
int  n, its;  /* problem dimension, number of iterations */

MatCreate(MPI_COMM_WORLD,n,n,&A);  /* assemble matrix */
VecCreate(MPI_COMM_WORLD,n,&x);
VecDuplicate(x,&b);                /* assemble RHS vector */

SLESCreate(MPI_COMM_WORLD,&sles);
SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN);
SLESSetFromOptions(sles);
SLESSolve(sles,b,x,&its);

beginner

solvers: linear

50 of 132

Basic Linear Solver Code (Fortran)
      SLES    sles
      Mat     A
      Vec     x, b
      integer n, its, ierr

      call MatCreate(MPI_COMM_WORLD,n,n,A,ierr)
      call VecCreate(MPI_COMM_WORLD,n,x,ierr)
      call VecDuplicate(x,b,ierr)
C     then assemble matrix and right-hand-side vector
      call SLESCreate(MPI_COMM_WORLD,sles,ierr)
      call SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN,ierr)
      call SLESSetFromOptions(sles,ierr)
      call SLESSolve(sles,b,x,its,ierr)

beginner

solvers: linear

51 of 132

Setting Solver Options at Runtime
• -ksp_type [cg,gmres,bcgs,tfqmr,…] (1)
• -pc_type [lu,ilu,jacobi,sor,asm,…] (1)
• -ksp_max_it <max_iters> (2)
• -ksp_gmres_restart <restart> (2)
• -pc_asm_overlap <overlap> (2)
• -pc_asm_type [basic,restrict,interpolate,none] (2)
• etc.

1 beginner, 2 intermediate

solvers: linear

52 of 132

Linear Solvers: Monitoring Convergence
• -ksp_monitor – Prints preconditioned residual norm (1)
• -ksp_xmonitor – Plots preconditioned residual norm (1)
• -ksp_truemonitor – Prints true residual norm || b-Ax || (2)
• -ksp_xtruemonitor – Plots true residual norm || b-Ax || (2)
• User-defined monitors, using callbacks (3)

1 beginner, 2 intermediate, 3 advanced

solvers: linear
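A sketch of such a user-defined monitor, registered with KSPSetMonitor( ). The callback signature int (*)(KSP,int,double,void*) is an assumption based on the 2.0-era manual pages; MyKSPMonitor is a hypothetical name.

/* Sketch: print the residual norm at each Krylov iteration. */
int MyKSPMonitor(KSP ksp, int it, double rnorm, void *ctx)
{
  PetscPrintf(PETSC_COMM_WORLD, "iteration %d: residual norm %g\n", it, rnorm);
  return 0;
}
/* ... after obtaining the KSP via SLESGetKSP(sles,&ksp): */
KSPSetMonitor(ksp, MyKSPMonitor, PETSC_NULL);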

53 of 132

Helmholtz: Scalability
128x512 grid, wave number = 13, IBM SP
GMRES(30) / Restricted Additive Schwarz
1 block per proc, 1-cell overlap, ILU(1) subdomain solver

Procs   Iterations   Time (Sec)   Speedup
1       221          163.01       -
2       222          81.06        2.0
4       224          37.36        4.4
8       228          19.49        8.4
16      229          10.85        15.0
32      230          6.37         25.6

beginner

solvers: linear

54 of 132

SLES: Review of Basic Usage
• SLESCreate( ) – Create SLES context
• SLESSetOperators( ) – Set linear operators
• SLESSetFromOptions( ) – Set runtime solver options for [SLES, KSP, PC]
• SLESSolve( ) – Run linear solver
• SLESView( ) – View solver options actually used at runtime (alternative: -sles_view)
• SLESDestroy( ) – Destroy solver

beginner

solvers: linear

55 of 132

SLES: Review of Selected Preconditioner Options
Functionality                  Procedural Interface    Runtime Option
Set preconditioner type        PCSetType( )            -pc_type [lu,ilu,jacobi,sor,asm,…]  (1)
Set level of fill for ILU      PCILULevels( )          -pc_ilu_levels <levels>  (2)
Set SOR iterations             PCSORSetIterations( )   -pc_sor_its <its>  (2)
Set SOR parameter              PCSORSetOmega( )        -pc_sor_omega <omega>  (2)
Set additive Schwarz variant   PCASMSetType( )         -pc_asm_type [basic,restrict,interpolate,none]  (2)
Set subdomain solver options   PCGetSubSLES( )         -sub_pc_type <pctype>, -sub_ksp_type <ksptype>, -sub_ksp_rtol <rtol>  (2)
And many more options...

1 beginner, 2 intermediate

solvers: linear: preconditioners

56 of 132

SLES: Review of Selected Krylov Method Options
Functionality                             Procedural Interface              Runtime Option
Set Krylov method                         KSPSetType( )                     -ksp_type [cg,gmres,bcgs,tfqmr,cgs,…]  (1)
Set monitoring routine                    KSPSetMonitor( )                  -ksp_monitor, -ksp_xmonitor, -ksp_truemonitor, -ksp_xtruemonitor  (1)
Set convergence tolerances                KSPSetTolerances( )               -ksp_rtol <rt>, -ksp_atol <at>, -ksp_max_its <its>  (1)
Set GMRES restart parameter               KSPGMRESSetRestart( )             -ksp_gmres_restart <restart>  (2)
Set orthogonalization routine for GMRES   KSPGMRESSetOrthogonalization( )   -ksp_unmodifiedgramschmidt, -ksp_irorthog  (2)
And many more options...

1 beginner, 2 intermediate

solvers: linear: Krylov methods

57 of 132

SLES: Example Programs
Location: petsc/src/sles/examples/tutorials/
• ex1.c, ex1f.F – basic uniprocessor codes (1)
• E ex23.c – basic parallel code (1)
• ex11.c – using complex numbers (2)
• ex4.c – using different linear system and preconditioner matrices (2)
• ex9.c – repeatedly solving different linear systems (2)
• ex15.c – setting a user-defined preconditioner (3)
• E ex22.c – 3D Laplacian using multigrid (3)
And many more examples ...
E – on-line exercise; 1 beginner, 2 intermediate, 3 advanced

solvers: linear

58 of 132

Nonlinear Solvers (SNES)
SNES: Scalable Nonlinear Equations Solvers
• Application code interface (beginner)
• Choosing the solver (beginner)
• Setting algorithmic options (intermediate)
• Viewing the solver (intermediate)
• Determining and monitoring convergence (intermediate)
• Matrix-free solvers (advanced)
• User-defined customizations (advanced)

tutorial outline: solvers: nonlinear

59 of 132

Nonlinear PDE Solution
Main Routine
→ Nonlinear Solvers (SNES): Solve F(u) = 0
→ Linear Solvers (SLES): PC, KSP
User code: Application Initialization, Function Evaluation, Jacobian Evaluation, Post-Processing

beginner

solvers: nonlinear

60 of 132

Nonlinear Solvers
Goal: For problems arising from PDEs, support the general solution of F(u) = 0
User provides:
– Code to evaluate F(u)
– Code to evaluate Jacobian of F(u) (optional)
• or use sparse finite difference approximation
• or use automatic differentiation (coming soon!)

beginner

solvers: nonlinear

61 of 132

Nonlinear Solvers (SNES)
• Newton-based methods, including
– Line search strategies
– Trust region approaches
– Pseudo-transient continuation
– Matrix-free variants
• User can customize all phases of the solution process

beginner

solvers: nonlinear

62 of 132

Sample Nonlinear Application:
Driven Cavity Problem

• Velocity-vorticity
formulation
• Flow driven by lid and/or
buoyancy
• Logically regular grid,
parallelized with DAs
• Finite difference
discretization
• source code:
petsc/src/snes/examples/tutorials/ex8.c

beginner

Solution Components

velocity: u

vorticity: ζ

velocity: v

temperature: T

Application code author: D. E. Keyes

solvers: nonlinear

63 of 132

Basic Nonlinear Solver Code (C/C++)
SNES snes;                    /* nonlinear solver context */
Mat  J;                       /* Jacobian matrix */
Vec  x, F;                    /* solution, residual vectors */
int  n, its;                  /* problem dimension, number of iterations */
ApplicationCtx usercontext;   /* user-defined application context */

...

MatCreate(MPI_COMM_WORLD,n,n,&J);
VecCreate(MPI_COMM_WORLD,n,&x);
VecDuplicate(x,&F);
SNESCreate(MPI_COMM_WORLD,SNES_NONLINEAR_EQUATIONS,&snes);
SNESSetFunction(snes,F,EvaluateFunction,usercontext);
SNESSetJacobian(snes,J,EvaluateJacobian,usercontext);
SNESSetFromOptions(snes);
SNESSolve(snes,x,&its);

beginner

solvers: nonlinear

64 of 132

Basic Nonlinear Solver Code (Fortran)
SNES    snes
Mat     J
Vec     x, F
int     n, its

...

call MatCreate(MPI_COMM_WORLD,n,n,J,ierr)
call VecCreate(MPI_COMM_WORLD,n,x,ierr)
call VecDuplicate(x,F,ierr)
call SNESCreate(MPI_COMM_WORLD,
&                SNES_NONLINEAR_EQUATIONS,snes,ierr)
call SNESSetFunction(snes,F,EvaluateFunction,PETSC_NULL,ierr)
call SNESSetJacobian(snes,J,EvaluateJacobian,PETSC_NULL,ierr)
call SNESSetFromOptions(snes,ierr)
call SNESSolve(snes,x,its,ierr)

beginner

solvers: nonlinear

65 of 132

Solvers Based on Callbacks
• User provides routines to perform actions that the library requires. For example,
– SNESSetFunction(SNES,...)
• uservector – vector to store function values
• userfunction – name of the user’s function
• usercontext – pointer to private data for the user’s function
• Now, whenever the library needs to evaluate the user’s nonlinear function, the solver may call the application code directly with its own local state.
• usercontext: serves as an application context object. Data are handled through such opaque objects; the library never sees irrelevant application data.

important concept

beginner

solvers: nonlinear
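A sketch of what such a callback looks like. The signature int (*)(SNES,Vec,Vec,void*) is an assumption based on the 2.0-era examples; AppCtx and EvaluateFunction are hypothetical names, and F is the residual vector created in the code on the previous slides.

typedef struct { double lambda; } AppCtx;   /* private application data */

int EvaluateFunction(SNES snes, Vec x, Vec f, void *ptr)
{
  AppCtx *user = (AppCtx *) ptr;  /* the context registered below */
  /* ... compute f = F(x), e.g., via VecGetArray() on x and f,
     using fields of *user as needed ... */
  return 0;
}

/* registration: */
AppCtx user;
SNESSetFunction(snes, F, EvaluateFunction, &user);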

66 of 132

Uniform access to all linear and nonlinear solvers
• -ksp_type [cg,gmres,bcgs,tfqmr,…] (1)
• -pc_type [lu,ilu,jacobi,sor,asm,…] (1)
• -snes_type [ls,tr,…] (1)
• -snes_line_search <line search method> (2)
• -sles_ls <parameters> (2)
• -snes_convergence <tolerance> (2)
• etc.

1 beginner, 2 intermediate

solvers: nonlinear

67 of 132

SNES: Review of Basic Usage
• SNESCreate( ) – Create SNES context
• SNESSetFunction( ) – Set function eval. routine
• SNESSetJacobian( ) – Set Jacobian eval. routine
• SNESSetFromOptions( ) – Set runtime solver options for [SNES, SLES, KSP, PC]
• SNESSolve( ) – Run nonlinear solver
• SNESView( ) – View solver options actually used at runtime (alternative: -snes_view)
• SNESDestroy( ) – Destroy solver

beginner

solvers: nonlinear

68 of 132

SNES: Review of Selected Options
Functionality                Procedural Interface                       Runtime Option
Set nonlinear solver         SNESSetType( )                             -snes_type [ls,tr,umls,umtr,…]  (1)
Set monitoring routine       SNESSetMonitor( )                          -snes_monitor, -snes_xmonitor, …  (1)
Set convergence tolerances   SNESSetTolerances( )                       -snes_rtol <rt> -snes_atol <at> -snes_max_its <its>  (1)
Set line search routine      SNESSetLineSearch( )                       -snes_eq_ls [cubic,quadratic,…]  (1)
View solver options          SNESView( )                                -snes_view  (1)
Set linear solver options    SNESGetSLES( ), SLESGetKSP( ), SLESGetPC( )  -ksp_type <ksptype> -ksp_rtol <krt> -pc_type <pctype> …  (2)
And many more options...

1 beginner, 2 intermediate

solvers: nonlinear

69 of 132

SNES: Example Programs
Location: petsc/src/snes/examples/tutorials/
• ex1.c, ex1f.F – basic uniprocessor codes (1)
• E ex4.c, ex4f.F – uniprocessor nonlinear PDE (1 DoF per node) (1)
• E ex5.c, ex5f.F, ex5f90.F – parallel nonlinear PDE (1 DoF per node) (2)
• E ex18.c – parallel radiative transport problem with multigrid (2)
• E ex19.c – parallel driven cavity problem with multigrid (2)
And many more examples ...
E – on-line exercise; 1 beginner, 2 intermediate

solvers: nonlinear

70 of 132

Timestepping Solvers (TS) (and ODE Integrators)
• Application code interface (beginner)
• Choosing the solver (beginner)
• Setting algorithmic options (intermediate)
• Viewing the solver (intermediate)
• Determining and monitoring convergence (intermediate)
• User-defined customizations (advanced)

tutorial outline: solvers: timestepping

71 of 132

Time-Dependent PDE Solution
Main Routine
→ Timestepping Solvers (TS): Solve Ut = F(U,Ux,Uxx)
→ Nonlinear Solvers (SNES)
→ Linear Solvers (SLES): PC, KSP
User code: Application Initialization, Function Evaluation, Jacobian Evaluation, Post-Processing

beginner

solvers: timestepping

72 of 132

Timestepping Solvers
Goal: Support the (real and pseudo) time evolution of PDE systems
Ut = F(U,Ux,Uxx,t)
User provides:
– Code to evaluate F(U,Ux,Uxx,t)
– Code to evaluate Jacobian of F(U,Ux,Uxx,t)
• or use sparse finite difference approximation
• or use automatic differentiation (coming soon!)

beginner

solvers: timestepping

73 of 132

Sample Timestepping Application: Burger’s Equation
Ut = U Ux + ε Uxx
U(0,x) = sin(2πx)
U(t,0) = U(t,1)

beginner

solvers: timestepping

74 of 132

Actual Local Function Code
Ut = F(t,U) = Ui (Ui+1 - Ui-1)/(2h) + ε (Ui+1 - 2Ui + Ui-1)/(h*h)

      do 10, i=1,localsize
         F(i) = (.5/h)*u(i)*(u(i+1)-u(i-1))
     &          + (e/(h*h))*(u(i+1) - 2.0*u(i) + u(i-1))
 10   continue

beginner

solvers: timestepping

75 of 132

Timestepping Solvers
• Euler
• Backward Euler
• Pseudo-transient continuation
• Interface to PVODE, a sophisticated parallel ODE solver package by Hindmarsh et al. of LLNL
– Adams
– BDF

beginner

solvers: timestepping

76 of 132

Timestepping Solvers
• Allow full access to all of the PETSc
– nonlinear solvers
– linear solvers
– distributed arrays, matrix assembly tools, etc.
• User can customize all phases of the solution process

beginner

solvers: timestepping

77 of 132

TS: Review of Basic Usage
• TSCreate( ) – Create TS context
• TSSetRHSFunction( ) – Set function eval. routine
• TSSetRHSJacobian( ) – Set Jacobian eval. routine
• TSSetFromOptions( ) – Set runtime solver options for [TS, SNES, SLES, KSP, PC]
• TSSolve( ) – Run timestepping solver
• TSView( ) – View solver options actually used at runtime (alternative: -ts_view)
• TSDestroy( ) – Destroy solver

beginner

solvers: timestepping
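Putting these calls in order, a minimal lifecycle sketch. The routine names are those listed on this slide and on the next one; the argument lists and the user routines RHSFunction/RHSJacobian are assumptions, so consult the man pages for your version.

TS  ts;
int steps;
TSCreate(MPI_COMM_WORLD, TS_NONLINEAR, &ts);
TSSetRHSFunction(ts, RHSFunction, &usercontext);   /* user routine for F(t,U) */
TSSetRHSJacobian(ts, J, J, RHSJacobian, &usercontext);
TSSetDuration(ts, 1000, 1.0);     /* at most 1000 steps, to time 1.0 */
TSSetFromOptions(ts);
TSSolve(ts, &steps);              /* run the timestepping solver */
TSView(ts, VIEWER_STDOUT_WORLD);  /* report options actually used */
TSDestroy(ts);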

78 of 132

TS: Review of Selected Options
Functionality                    Procedural Interface                                       Runtime Option
Set timestepping solver          TSSetType( )                                               -ts_type [euler,beuler,pseudo,…]  (1)
Set monitoring routine           TSSetMonitor( )                                            -ts_monitor, -ts_xmonitor, …  (1)
Set timestep duration            TSSetDuration( )                                           -ts_max_steps <maxsteps> -ts_max_time <maxtime>  (1)
View solver options              TSView( )                                                  -ts_view  (1)
Set timestepping solver options  TSGetSNES( ), SNESGetSLES( ), SLESGetKSP( ), SLESGetPC( )  -snes_monitor -snes_rtol <rt> -ksp_type <ksptype> -ksp_rtol <rt> -pc_type <pctype> …  (2)
And many more options...

1 beginner, 2 intermediate

solvers: timestepping

79 of 132

TS: Example Programs
Location: petsc/src/ts/examples/tutorials/
• ex1.c, ex1f.F – basic uniprocessor codes (time-dependent nonlinear PDE) (1)
• E ex2.c, ex2f.F – basic parallel codes (time-dependent nonlinear PDE) (1)
• ex3.c – uniprocessor heat equation (2)
• ex4.c – parallel heat equation (2)
And many more examples ...
E – on-line exercise; 1 beginner, 2 intermediate

solvers: timestepping

80 of 132

Mesh Definitions: For Our Purposes
• Structured: Determine neighbor relationships purely from logical I, J, K coordinates
• Semi-Structured: In well-defined regions, determine neighbor relationships purely from logical I, J, K coordinates
• Unstructured: Do not explicitly use logical I, J, K coordinates

tutorial introduction

81 of 132

Structured Meshes
• PETSc support provided via DA objects

tutorial introduction

82 of 132

Unstructured Meshes
• PETSc support provided via VecScatter objects

tutorial introduction

83 of 132

Semi-Structured Meshes
• No explicit PETSc support
– OVERTURE-PETSc for composite meshes
– SAMRAI-PETSc for AMR

tutorial introduction

84 of 132

Data Layout and Ghost Values: Usage Concepts
Managing field data layout and required ghost values is the key to high performance of most PDE-based parallel programs.
Mesh Types
• Structured – DA objects
• Unstructured – VecScatter objects
Usage Concepts
• Geometric data
• Data structure creation
• Ghost point updates
• Local numerical computation

important concepts

tutorial outline: data layout

85 of 132

Ghost Values
Local node

Ghost node

Ghost values: To evaluate a local function f(x), each process requires its local portion of the vector x as well as its ghost values, or bordering portions of x that are owned by neighboring processes.
beginner

data layout

86 of 132

Communication and Physical Discretization

                        Geometric        Data Structure       Ghost Point        Ghost Point          Local Numerical
                        Data             Creation             Data Structures    Updates              Computation
structured meshes (1)   stencil          DACreate( )          DA, AO             DAGlobalToLocal( )   Loops over I,J,K
                        [implicit]                                                                    indices
unstructured meshes (2) elements,        VecScatterCreate( )  VecScatter, AO     VecScatter( )        Loops over
                        edges, vertices                                                               entities

1 beginner, 2 intermediate

data layout

87 of 132

DA: Parallel Data Layout and Ghost Values for Structured Meshes
• Local and global indices (beginner)
• Local and global vectors (beginner)
• DA creation (beginner)
• Ghost point updates (intermediate)
• Viewing (intermediate)

tutorial outline: data layout: distributed arrays

88 of 132

Communication and Physical Discretization: Structured Meshes
• Geometric data: stencil [implicit]
• Data structure creation: DACreate( )
• Ghost point data structures: DA, AO
• Ghost point updates: DAGlobalToLocal( )
• Local numerical computation: loops over I,J,K indices

beginner

data layout: distributed arrays

89 of 132

Global and Local Representations
[figure: local nodes vs. ghost nodes on two processes]
• Global: each process stores a unique local set of vertices (and each vertex is owned by exactly one process)
• Local: each process stores a unique local set of vertices as well as ghost nodes from neighboring processes

beginner

data layout: distributed arrays

…) beginner beginner data layout: data layout: distributed arrays distributed arrays .Distributed Array: object containing information about vector layout across the processes and communication of ghost values • Form a DA – DACreateXX(…..90 of 132 Logically Regular Meshes • DA .…) – DAGlobalToLocalEnd(DA.DA *) • Update ghostpoints – DAGlobalToLocalBegin(DA.

91 of 132

Distributed Arrays
Data layout and ghost values
[figure: ghost regions for proc 0, proc 1, proc 10 under a box-type stencil vs. a star-type stencil]

beginner

data layout: distributed arrays

92 of 132

Vectors and DAs
• The DA object contains information about the data layout and ghost values, but not the actual field data, which is contained in PETSc vectors
• Global vector: parallel
– each process stores a unique local portion
– DACreateGlobalVector(DA da, Vec *gvec);
• Local work vector: sequential
– each processor stores its local portion plus ghost values
– DACreateLocalVector(DA da, Vec *lvec);
– uses “natural” local numbering of indices (0,1,…nlocal-1)

beginner

data layout: distributed arrays

93 of 132

DACreate1d(…, *DA)
• MPI_Comm – processors containing array
• DA_STENCIL_[BOX,STAR]
• DA_[NONPERIODIC,XPERIODIC]
• number of grid points in x-direction
• degrees of freedom per node
• stencil width
• …

beginner

data layout: distributed arrays

94 of 132

DACreate2d(…, *DA)
• …
• DA_[NON,X,Y,XY]PERIODIC
• number of grid points in x- and y-directions
• processors in x- and y-directions
• degrees of freedom per node
• stencil width
• …
And similarly for DACreate3d()

beginner

data layout: distributed arrays
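For example, a sketch of creating a 2-D DA and its associated vectors. The exact DACreate2d argument order is taken from the 2.0-era man page as best recalled and should be checked against your version.

/* 100x100 grid, star stencil, 1 DoF per node, stencil width 1;
   PETSc chooses the process decomposition. */
DA  da;
Vec gvec, lvec;
DACreate2d(MPI_COMM_WORLD, DA_NONPERIODIC, DA_STENCIL_STAR,
           100, 100, PETSC_DECIDE, PETSC_DECIDE, 1, 1,
           PETSC_NULL, PETSC_NULL, &da);
DACreateGlobalVector(da, &gvec);
DACreateLocalVector(da, &lvec);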

95 of 132

Updating the Local Representation
Two-step process that enables overlapping computation and communication
• DAGlobalToLocalBegin(DA, …)
– Vec global_vec
– INSERT_VALUES or ADD_VALUES
– Vec local_vec
• DAGlobalToLocalEnd(DA, …)

beginner

data layout: distributed arrays
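A sketch of the two-step update in context, using the da, gvec, and lvec objects from the DACreate2d sketch above.

/* Refresh ghost values in lvec from the global vector gvec. */
DAGlobalToLocalBegin(da, gvec, INSERT_VALUES, lvec);
/* ... purely local work can be overlapped with the communication ... */
DAGlobalToLocalEnd(da, gvec, INSERT_VALUES, lvec);
/* lvec now holds the local portion plus up-to-date ghost values */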

96 of 132

Unstructured Meshes
• Setting up communication patterns is much more complicated than the structured case due to
– mesh dependence
– discretization dependence

beginner

data layout: vector scatters

97 of 132

Sample Differences Among Discretizations
• Cell-centered
• Vertex-centered
• Cell and vertex centered (e.g., staggered grids)
• Mixed triangles and quadrilaterals

beginner

data layout: vector scatters

98 of 132

Communication and Physical Discretization

                      Geometric        Data Structure       Ghost Point        Ghost Point          Local Numerical
                      Data             Creation             Data Structures    Updates              Computation
structured mesh (1)   stencil          DACreate( )          DA, AO             DAGlobalToLocal( )   Loops over I,J,K
                      [implicit]                                                                    indices
unstructured mesh (2) elements,        VecScatterCreate( )  VecScatter, AO     VecScatter( )        Loops over
                      edges, vertices                                                               entities

1 beginner, 2 intermediate

data layout

99 of 132

Driven Cavity Model
Example code: petsc/src/snes/examples/tutorials/ex8.c
• Velocity-vorticity formulation, with flow driven by lid and/or buoyancy
• Finite difference discretization with 4 DoF per mesh point
Solution components: velocity u, velocity v, vorticity ζ, temperature T, i.e., [u,v,ζ,T]

1 beginner, 2 intermediate

solvers: nonlinear

100 of 132

Driven Cavity Program
• Part A: Parallel data layout
• Part B: Nonlinear solver creation, setup, and usage
• Part C: Nonlinear function evaluation
– ghost point updates
– local function computation
• Part D: Jacobian evaluation
– default colored finite differencing approximation
• Experimentation

1 beginner, 2 intermediate

solvers: nonlinear

101 of 132

Driven Cavity Solution Approach
Main Routine (A: parallel data layout)
→ Nonlinear Solvers (SNES): Solve F(u) = 0 (B: solver creation, setup, usage)
→ Linear Solvers (SLES): PC, KSP
User code: Application Initialization, Function Evaluation (C), Jacobian Evaluation (D), Post-Processing
PETSc code: SNES, SLES

solvers:
nonlinear

102 of 132

Driven Cavity:
Running the program (1)
Matrix-free Jacobian approximation with no preconditioning
(via -snes_mf) … does not use explicit Jacobian evaluation
• 1 processor: (thermally-driven flow)
– mpirun -np 1 ex8 -snes_mf -snes_monitor -grashof 1000.0 -lidvelocity 0.0

• 2 processors, view DA (and pausing for mouse input):
– mpirun -np 2 ex8 -snes_mf -snes_monitor -da_view_draw -draw_pause -1

• View contour plots of converging iterates
– mpirun ex8 -snes_mf -snes_monitor -snes_vecmonitor

beginner

solvers: nonlinear

103 of 132

Debugging and Error Handling
• Automatic generation of tracebacks (beginner)
• Detecting memory corruption and leaks (beginner)
• Optional user-defined error handlers (developer)

tutorial outline: debugging and errors
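The automatic tracebacks rely on each call's return code being checked; a minimal sketch of the standard idiom (CHKERRQ/CHKERRA also appear in the Overture examples later in this deck; x, y, and r here are hypothetical).

int ierr;
ierr = VecDuplicate(x, &y);    CHKERRQ(ierr);  /* on error: record file/line, return up the stack */
ierr = VecNorm(y, NORM_2, &r); CHKERRQ(ierr);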

104 of 132

Sample Error Traceback
[screenshot: traceback showing breakdown in ILU factorization due to a zero pivot]

beginner

debugging and errors

105 of 132

Sample Memory Corruption Error
[screenshot]

beginner

debugging and errors

106 of 132

Sample Out-of-Memory Error
[screenshot]

beginner

debugging and errors

107 of 132

Sample Floating Point Error
[screenshot]

beginner

debugging and errors

108 of 132

Profiling and Performance Tuning
Profiling:
• Integrated profiling using -log_summary (beginner)
• Profiling by stages of an application (intermediate)
• User-defined events (intermediate)
Performance Tuning:
• Matrix optimizations (intermediate)
• Application optimizations (intermediate)
• Algorithmic tuning (advanced)

tutorial outline: profiling and performance tuning

109 of 132

Profiling
• Integrated monitoring of
– time
– floating-point performance
– memory usage
– communication
• All PETSc events are logged if compiled with -DPETSC_LOG (default); can also profile application code segments
• Print summary data with option: -log_summary
• See supplementary handout with summary data

beginner

profiling and performance tuning
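A sketch of profiling an application code segment with a user-defined event. The PLogEvent* names follow the 2.0-era users manual as best recalled; treat the argument lists as assumptions.

int USER_EVENT;
PLogEventRegister(&USER_EVENT, "User computation", "Red");
PLogEventBegin(USER_EVENT, 0, 0, 0, 0);
/* ... application code segment to be timed ... */
PLogEventEnd(USER_EVENT, 0, 0, 0, 0);
/* the event then appears in the -log_summary output */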

110 of 132

Conclusion
• Summary (beginner)
• New features (beginner)
• Interfacing with other packages (beginner)
• Extensibility issues (developer)
• References (beginner)

tutorial outline: conclusion

111 of 132

Summary
• Using callbacks to set up the problems for ODE and nonlinear solvers
• Managing data layout and ghost point communication with DAs and VecScatters
• Evaluating parallel functions and Jacobians
• Consistent profiling and error handling

112 of 132

Multigrid Support: Recently Simplified for Structured Grids
Linear example:
• 3-dim linear problem on a mesh of dimensions mx x my x mz
• stencil width = sw, degrees of freedom per point = dof
• using piecewise linear interpolation
• ComputeRHS() and ComputeMatrix() are user-provided functions

DAMG *damg;
DAMGCreate(comm,nlevels,NULL,&damg);
DAMGSetGrid(damg,3,DA_NONPERIODIC,DA_STENCIL_STAR,mx,my,mz,sw,dof);
DAMGSetSLES(damg,ComputeRHS,ComputeMatrix);
DAMGSolve(damg);
solution = DAMGGetx(damg);

All standard SLES, PC and MG options apply.

113 of 132

Multigrid Support
Nonlinear example:
• 3-dim nonlinear problem on a mesh of dimensions mx x my x mz
• stencil width = sw, degrees of freedom per point = dof
• using piecewise linear interpolation
• ComputeFunc() and ComputeJacobian() are user-provided functions

DAMG *damg;
DAMGCreate(comm,nlevels,NULL,&damg);
DAMGSetGrid(damg,3,DA_NONPERIODIC,DA_STENCIL_STAR,mx,my,mz,sw,dof);
DAMGSetSNES(damg,ComputeFunc,ComputeJacobian);
DAMGSolve(damg);
solution = DAMGGetx(damg);

All standard SNES, SLES, PC and MG options apply.

114 of 132

Using PETSc with Other Packages: Overture
• Overture is a framework for generating discretizations of PDEs on composite grids.
• PETSc can be used as a “black box” linear equation solver (a nonlinear equation solver is under development).
• Advanced features of PETSc, such as the runtime options database, profiling, debugging info, etc., can be exploited through explicit calls to the PETSc API.

software interfacing: Overture

115 of 132

Overture Essentials
– Read the grid:
CompositeGrid cg;
getFromADataBase(cg,nameOfOGFile);
cg.update();
– Create differential operators for the grid:
int stencilSize = pow(3,cg.numberOfDimensions())+1;
CompositeGridOperators ops(cg);
ops.setStencilSize(stencilSize);
– Create grid functions to hold matrix and vector values
– Attach the operators to the grid functions
– Assign values to the grid functions
– Create an Oges (Overlapping Grid Equation Solver) object to solve the system

software interfacing: Overture

116 of 132

Constructing Matrix Coefficients
Laplace operator with Dirichlet BC’s:
– Make a grid function to hold the matrix coefficients:
Range all;
realCompositeGridFunction coeff(cg,stencilSize,all,all,all);
– Designate this grid function for holding matrix coefficients:
coeff.setIsACoefficientMatrix(TRUE,stencilSize);
– Attach operators to this grid function:
coeff.setOperators(ops);
– Get the coefficients for the Laplace operator:
coeff=ops.laplacianCoefficients();
– Fill in the coefficients for the boundary conditions:
coeff.applyBoundaryConditionCoefficients(0,0,dirichlet,allBoundaries);
– Fill in extrapolation coefficients for defining ghost cells:
coeff.applyBoundaryConditionCoefficients(0,0,extrapolate,allBoundaries);
coeff.finishBoundaryConditions();

software interfacing: Overture

117 of 132

Simple Usage of PETSc through Oges
PETSc API can be hidden from the user
– Make the solver:
Oges solver(cg);
– Set solver parameters:
solver.set(OgesParameters::THEsolverType,OgesParameters::PETSc);
solver.set(gmres);
solver.set(blockJacobiPreconditioner);
– Solve the system:
solver.solve(sol,rhs);
– Hides explicit matrix and vector conversions
– Allows easy swapping of solver types (i.e., PETSc, Yale, SLAP, etc.)

software interfacing: Overture

118 of 132

Advanced Usage of PETSc with Oges
Exposing the PETSc API to the user
– Set up PETSc:
PetscInitialize(&argc,&argv,…);
PCRegister(“MyPC”,…);
– Build a PETScEquationSolver via Oges:
solver.parameters.set(OgesParameters::THEsolverType,OgesParameters::PETSc);
solver.buildEquationSolver(solver.parameters.solver);
– Get a pointer to the PETScEquationSolver:
pes=(PETScEquationSolver*)solver.equationSolver[solver.parameters.solver];
– Use Oges for matrix and vector conversions:
solver.formMatrix();
solver.formRhsAndSolutionVectors(sol,rhs);
– Use PETSc API directly:
PCSetType(pes->pc,MyPC);
SLESSolve(pes->sles,pes->brhs,pes->xsol,&its);
– Use Oges to convert vector into GridFunction:
solver.storeSolutionIntoGridFunction();

software interfacing: Overture

119 of 132

PETSc-Overture Black Box Example
#include "Overture.h"
#include "CompositeGridOperators.h"
#include "Oges.h"

int main()
{
  printf("This is Overture's Primer example7.C");
  // Read in Composite Grid generated by Ogen:
  String nameOfOGFile="TheGrid.hdf";
  CompositeGrid cg;
  getFromADataBase(cg,nameOfOGFile);
  cg.update();
  // Make some differential operators:
  CompositeGridOperators op(cg);
  int stencilSize=pow(3,cg.numberOfDimensions())+1;
  op.setStencilSize(stencilSize);
  // Make grid functions to hold vector coefficients:
  realCompositeGridFunction u(cg), f(cg);
  // Assign the right hand side vector coefficients: ...
  // Make a grid function to hold the matrix coefficients:
  Range all;
  realCompositeGridFunction coeff(cg,stencilSize,all,all,all);
  // Designate this grid function for holding matrix coefficients:
  coeff.setIsACoefficientMatrix(TRUE,stencilSize);
  // Attach operators to this grid function:
  coeff.setOperators(op);
  // Get the coefficients for the Laplace operator:
  coeff=op.laplacianCoefficients();
  // Fill in the coefficients for the boundary conditions:
  coeff.applyBoundaryConditionCoefficients(0,0,BCTypes::dirichlet,BCTypes::allBoundaries);
  // Fill in extrapolation coefficients for ghost line:
  coeff.applyBoundaryConditionCoefficients(0,0,BCTypes::extrapolate,BCTypes::allBoundaries);
  coeff.finishBoundaryConditions();
  // Create an Overlapping Grid Equation Solver:
  Oges solver(cg);
  // Tell Oges to use PETSc:
  solver.set(OgesParameters::THEsolverType,OgesParameters::PETSc);
  // Tell Oges which preconditioner and Krylov solver to use:
  solver.set(blockJacobiPreconditioner);
  solver.set(gmres);
  // Prescribe the location of the matrix coefficients:
  solver.setCoefficientArray( coeff );
  // Solve the system:
  solver.solve( u, f );
  // Display the solution using Overture's ASCII format:
  u.display();
  return(0);
}

software interfacing: Overture

120 of 132

Advanced PETSc Usage in Overture
#include "mpi.h"
#include "Overture.h"
#include "CompositeGridOperators.h"
#include "Oges.h"
#include "petscpc.h"

EXTERN_C_BEGIN
extern int CreateMyPC(PC);
EXTERN_C_END

char help[] = "This is Overture's Primer example7.C using \
advanced PETSc features. \n Use of the Preconditioner \
'MyPC' is enabled via the option \n \t -pc_type MyPC";

int main(int argc,char *argv[])
{
  int ierr = PetscInitialize(&argc,&argv,0,help);
  // Allow PETSc to select a Preconditioner I wrote,
  // enabling use of the runtime option: -pc_type MyPC
  ierr = PCRegister("MyPC",0,"CreateMyPC",CreateMyPC); CHKERRA(ierr);
  // Determine file with runtime option -file:
  PetscTruth flag;
  String nameOfOGFile="TheGrid.hdf";
  ierr = OptionsGetString(0,"-file",(char*)nameOfOGFile,…,&flag); CHKERRA(ierr);
  // Read in Composite Grid generated by Ogen:
  CompositeGrid cg;
  getFromADataBase(cg,nameOfOGFile);
  cg.update();
  // Make some differential operators: ...
  // Make grid functions to hold vector coefficients: ...
  // Make a grid function to hold the matrix coefficients: ...
  // Create an Overlapping Grid Equation Solver:
  Oges solver(cg);
  // Tell Oges to use PETSc:
  solver.set(OgesParameters::THEsolverType,OgesParameters::PETSc);
  // Tell Oges which preconditioner and Krylov solver to use:
  solver.set(blockJacobiPreconditioner);
  solver.set(gmres);
  // Allow command line arguments to supercede the above:
  solver.setCommandLineArguments(argc,argv);
  // Prescribe the location of the matrix coefficients:
  solver.setCoefficientArray(coeff);
  // Solve the system:
  solver.solve( u, f );
  // Access PETSc Data Structures:
  PETScEquationSolver &pes = *(PETScEquationSolver *) solver.equationSolver[OgesParameters::PETSc];
  // View the actual (PETSc) matrix generated by Overture:
  ierr = MatView(pes.Amx,VIEWER_STDOUT_SELF); CHKERRA(ierr);
  // Display the solution using Overture's ASCII format:
  u.display();
  PetscFinalize();
  return(0);
}

software interfacing: Overture

121 of 132

Using PETSc with Other Packages
ILUDTP – Drop Tolerance ILU
• Use PETSc SeqAIJ or MPIAIJ (for block Jacobi or ASM) matrix formats
• -pc_ilu_use_drop_tolerance <dt,dtcol,maxrowcount>
– dt – drop tolerance
– dtcol – tolerance for column pivot
– maxrowcount – maximum number of nonzeros kept per row

software interfacing: ILUDTP

122 of 132

Using PETSc with Other Packages
ParMETIS – Graph Partitioning
• Use PETSc MPIAIJ or MPIAdj matrix formats
• MatPartitioningCreate(MPI_Comm, MatPartitioning ctx)
• MatPartitioningSetAdjacency(ctx, matrix)
• Optional – MatPartitioningSetVertexWeights(ctx, weights)
• MatPartitioningSetFromOptions(ctx)
• MatPartitioningApply(ctx, IS *partitioning)

software interfacing: ParMETIS
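A sketch following the call sequence above. Argument types are inferred from the slide; the create call is assumed to return the context through a pointer, and A is a hypothetical, previously assembled MPIAIJ or MPIAdj matrix.

MatPartitioning ctx;
IS              partitioning;   /* new process assignment for each local row */
MatPartitioningCreate(MPI_COMM_WORLD, &ctx);
MatPartitioningSetAdjacency(ctx, A);
MatPartitioningSetFromOptions(ctx);     /* e.g., select ParMETIS at runtime */
MatPartitioningApply(ctx, &partitioning);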

123 of 132

Using PETSc with Other Packages
PVODE – ODE Integrator
• TSCreate(MPI_Comm, TS_NONLINEAR, &ts)
• TSSetType(ts, TS_PVODE)
• …, regular TS functions
• TSPVODESetType(ts, PVODE_ADAMS)
• …, other PVODE options
• TSSetFromOptions(ts) – accepts PVODE options

software interfacing: PVODE

124 of 132

Using PETSc with Other Packages
SPAI – Sparse Approximate Inverse
• PCSetType(pc, PCSPAI)
• PCSPAISetXXX(pc, …) – set SPAI options
• PCSetFromOptions(pc) – accepts SPAI options

software interfacing: SPAI

125 of 132

Using PETSc with Other Packages
Matlab
• PetscMatlabEngineCreate(MPI_Comm, machinename, PetscMatlabEngine eng)
• PetscMatlabEnginePut(eng, PetscObject obj)
– Vector
– Matrix
• PetscMatlabEngineEvaluate(eng, ”R = QR(A);”)
• PetscMatlabEngineGet(eng, PetscObject obj)

software interfacing: Matlab
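A sketch of round-tripping a vector through the Matlab engine, using the calls listed on this slide. The machine name and the exact argument types (engine returned through a pointer, objects cast to PetscObject) are assumptions; x is a hypothetical, previously created vector.

PetscMatlabEngine eng;
PetscMatlabEngineCreate(PETSC_COMM_WORLD, "localhost", &eng);
PetscMatlabEnginePut(eng, (PetscObject) x);     /* send vector x to Matlab */
PetscMatlabEngineEvaluate(eng, "x = 2*x;");     /* operate on it in Matlab */
PetscMatlabEngineGet(eng, (PetscObject) x);     /* retrieve the result */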

126 of 132

Using PETSc with Other Packages
SAMRAI
• SAMRAI provides an infrastructure for solving PDEs using adaptive mesh refinement with structured grids.
• SAMRAI developers wrote a new class of PETSc vectors that uses SAMRAI data structures and methods.
• This enables use of the PETSc matrix-free linear and nonlinear solvers.

software interfacing: SAMRAI

127 of 132

Sample Usage of SAMRAI with PETSc
Exposes PETSc API to the user
– Make a SAMRAI Vector:
Samrai_Vector = new SAMRAIVectorReal2<double>(…);
– Generate vector coefficients using SAMRAI
– Create the PETSc Vector object wrapper for the SAMRAI Vector:
Vec PETSc_Vector = createPETScVector(Samrai_Vector);
Both PETSc_Vector and Samrai_Vector refer to the same data
– Use PETSc API to solve the system:
SNESCreate(…);
SNESSolve(…);

software interfacing: SAMRAI

128 of 132

Using PETSc with Other Packages
TAO
• The Toolkit for Advanced Optimization (TAO) provides software for large-scale optimization problems, including
– unconstrained optimization
– bound constrained optimization
– nonlinear least squares
– nonlinearly constrained optimization
• TAO uses abstractions for vectors, matrices, linear solvers, etc.; currently PETSc provides these implementations.
• TAO interface is similar to that of PETSc
• See tutorial by S. Benson, L. C. McInnes, and J. Moré, available via http://www.mcs.anl.gov/tao

software interfacing: TAO

129 of 132

TAO Interface
TAO_SOLVER tao;              /* optimization solver */
Vec        x, g;             /* solution and gradient vectors */
ApplicationCtx usercontext;  /* user-defined context */

TaoInitialize();
/* Initialize Application -- Create variable and gradient vectors x and g */
...
TaoCreate(MPI_COMM_WORLD,"tao_lmvm",&tao);
TaoSetFunctionGradient(tao,x,g,FctGrad,(void*)&usercontext);
TaoSolve(tao);
TaoDestroy(tao);
/* Finalize application -- Destroy vectors x and g */
...
TaoFinalize();

Similar Fortran interface, e.g., call TaoCreate(...)

software interfacing: TAO

130 of 132

Extensibility Issues
• Most PETSc objects are designed to allow one to “drop in” a new implementation with a new set of data structures (similar to implementing a new class in C++).
• Heavily commented example codes include
– Krylov methods: petsc/src/sles/ksp/impls/cg
– preconditioners: petsc/src/sles/pc/impls/jacobi
• Feel free to discuss more details with us in person.

131 of 132

Caveats Revisited
• Developing parallel, non-trivial PDE solvers that deliver high performance is still difficult, and requires months (or even years) of concentrated effort.
• PETSc is a toolkit that can ease these difficulties and reduce the development time, but it is not a black-box PDE solver nor a silver bullet.
• Users are invited to interact directly with us regarding correctness or performance issues by writing to petsc-maint@mcs.anl.gov.

132 of 132

References
• Documentation: http://www.mcs.anl.gov/petsc/docs
– Example codes: docs/exercises/main.htm
• MPI: http://www.mpi-forum.org
• Using MPI (2nd Edition), by Gropp, Lusk, and Skjellum
• Domain Decomposition, by Smith, Bjorstad, and Gropp