Numerical Solution Algorithms for Compressible Flows

Lecture Notes

Hrvoje Jasak
Faculty of Mechanical Engineering and Naval Architecture, University of Zagreb, Croatia

Academic Year 2006-2007

Course prepared for the Aerospace Engineering Program
Tempus NUSIC Project JEP-18085-2003
© 2006 Hrvoje Jasak, Wikki Ltd. All rights reserved.


I Introduction to Modern CFD

1 Introduction
2 Introduction: CFD in Aeronautical Applications
    2.1 Modern Aircraft Design and CFD
    2.2 Scope of Computational Efforts
    2.3 Finite Volume or Finite Element?
3 CFD in Automotive Applications

II The Finite Volume Method

4 Mesh Handling
    4.1 Introduction
    4.2 Complex Geometry Requirements
    4.3 Mesh Structure and Organisation
    4.4 Manual Meshing: Airfoils
    4.5 Adaptive Mesh Refinement
    4.6 Dynamic Mesh Handling
5 Transport Equation in the Standard Form
    5.1 Introduction
    5.2 Scalar Transport Equation in the Standard Form
        5.2.1 Reynolds Transport Theorem
        5.2.2 Diffusive Transport
    5.3 Initial and Boundary Conditions
    5.4 Physical Bounds in Solution Variables
    5.5 Complex Equations: Introducing Non-Linearity
    5.6 Inter-Equation Coupling
6 Polyhedral Finite Volume Method
    6.1 Introduction
    6.2 Properties of a Discretisation Method
    6.3 Discretisation of the Scalar Transport Equation
    6.4 Face Addressing
    6.5 Operator Discretisation
        6.5.1 Temporal Derivative
        6.5.2 Second Derivative in Time
        6.5.3 Evaluation of the Gradient
        6.5.4 Convection Term
        6.5.5 Diffusion Term
        6.5.6 Source and Sink Terms
    6.6 Numerical Boundary Conditions
    6.7 Time-Marching Approach
    6.8 Equation Discretisation
    6.9 Convection Differencing Schemes
    6.10 Examples
7 Algebraic Linear System and Linear Solver Technology
    7.1 Structure and Formulation of the Linear System
    7.2 Matrix Storage Formats
    7.3 Linear Solver Technology
        7.3.1 Direct Solver on Sparse Matrices
        7.3.2 Simple Iterative Solvers
        7.3.3 Algebraic Multigrid
    7.4 Parallelisation and Vectorisation
8 Solution Methods for Coupled Equation Sets
    8.1 Examining the Coupling in Equation Sets
    8.2 Examples of Systems of Simultaneous Equations
    8.3 Solution Strategy for Coupled Sets
        8.3.1 Segregated Approach
        8.3.2 Fully Coupled Approach
    8.4 Matrix Structure for Coupled Algorithms
    8.5 Coupling in Model Equation Sets
    8.6 Special Coupling Algorithms

III Numerical Simulation of Fluid Flows

9 Governing Equations of Fluid Flow
    9.1 Compressible Navier-Stokes Equations
    9.2 Flow Classification based on Flow Speed
    9.3 Steady-State or Transient
    9.4 Incompressible Formulation
    9.5 Inviscid Formulation
    9.6 Potential Flow Formulation
    9.7 Turbulent Flow Approximations
        9.7.1 Direct Numerical Simulation
        9.7.2 Reynolds Averaging Approach
        9.7.3 Large Eddy Simulation
10 Pressure-Velocity Coupling
    10.1 Nature of Pressure-Velocity Coupling
    10.2 Density-Based Block Solver
    10.3 Pressure-Based Block Solver
        10.3.1 Gradient and Divergence Operator
        10.3.2 Block Solution Techniques for a Pressure-Based Solver
    10.4 Segregated Pressure-Based Solver
        10.4.1 Derivation of the Pressure Equation
        10.4.2 SIMPLE Algorithm and Related Methods
        10.4.3 PISO Algorithm
        10.4.4 Pressure Checkerboarding Problem
        10.4.5 Staggered and Collocated Variable Arrangement
        10.4.6 Pressure Boundary Conditions and Global Continuity
11 Compressible Pressure-Based Solver
    11.1 Handling Compressibility Effects in Pressure-Based Solvers
    11.2 Derivation of the Pressure Equation in Compressible Flows
    11.3 Pressure-Velocity-Energy Coupling
    11.4 Additional Coupled Equations
    11.5 Comparison of Pressure-Based and Density-Based Solvers
12 Turbulence Modelling for Aeronautical Applications
    12.1 Nature and Importance of Turbulence
    12.2 Direct Numerical Simulation of Turbulence
    12.3 Reynolds-Averaged Turbulence Models
        12.3.1 Eddy Viscosity Models
        12.3.2 Reynolds Transport Models
        12.3.3 Near-Wall Effects
        12.3.4 Transient RANS Simulations
    12.4 Large Eddy Simulation
    12.5 Choosing a Turbulence Model
        12.5.1 Turbulence Models in Airfoil Simulations
        12.5.2 Turbulence Models in Bluff-Body Aerodynamics
    12.6 Future of Turbulence Modelling in Industrial Applications
13 Large-Scale Computations
    13.1 Background
        13.1.1 Computer Power in Engineering Applications
    13.2 Classification of Computer Platforms
    13.3 Domain Decomposition Approach
        13.3.1 Components
        13.3.2 Parallel Algorithms
14 Fluid-Structure Interaction
    14.1 Scope of Simulations
    14.2 Coupling Approach
    14.3 Discretisation of FSI Systems
    14.4 Examples

Part I Introduction to Modern CFD

Chapter 1 Introduction
Computational Fluid Dynamics
• Definition of CFD, from Versteeg and Malalasekera, “An Introduction to Computational Fluid Dynamics”: “Computational Fluid Dynamics or CFD is the analysis of systems involving fluid flow, heat transfer and associated phenomena such as chemical reactions by means of computer-based simulation.”
• CFD is also a subset of Computational Continuum Mechanics: fundamentally identical numerical simulation technology is used for many sets of similar partial differential equations
– Numerical stress analysis
– Electromagnetics, including low- and high-frequency phenomena
– Weather prediction and global oceanic/atmosphere circulation models
– Large scale systems: galactic dynamics and star formation
– Complex heat and mass transfer systems
– Fluid-structure interaction and similar coupled systems
• In all cases, the equations are very similar: they capture conservation of mass, momentum and energy and the associated transport phenomena



Chapter 2 Introduction: CFD in Aeronautical Applications
2.1 Modern Aircraft Design and CFD

In this section we will explore the role and history of Computational Fluid Dynamics in the aerospace industry. Problems of aerospace design led the technological push for a long period in the 20th century, dictating areas of research and, together with nuclear research, expanding the use of numerical modelling. Even today, aerospace and related technology (e.g. rocket design) is considered sufficiently sensitive for world powers to restrict other governments’ access to the latest designs; however, major parts of the technology may be considered more than 50 years old. Example: Chuck Yeager and the first supersonic flight, 1947.

In NASA, the new push towards manned space exploration and the reach-out towards a manned Mars mission involve mainly fluid dynamics challenges. The mission requirements beyond Earth’s orbit are more or less settled.

New work in aircraft design concentrates on optimising the existing technology, with very few revolutionary new ideas. The main area of work still follows the traditional “functional decomposition” of an airplane: wings for lift, rudder for steering, body as useful volume. However, new simulation techniques allow us to revisit functionally interesting alternatives: the flying wing configuration was revived in the B-2 bomber, some 50 years after the first attempt.

Introduction
• The aerospace industry is the first and most prevalent user of numerical techniques, including Computational Fluid Dynamics (CFD)
• Early beginnings of CFD in the early 1960s
• First successes came to prominence in the 1970s
• The creation of the CFD-service industry started in the 1980s
• The CFD industry expanded significantly in the 1990s
• First fully computer-based design process for external aerodynamics design in a commercial aircraft: Airbus A380 in the 2000s
• In most phases of the process, it was the aerospace industry driving the CFD development to answer its needs

Early adoption of numerical modelling in aerospace applications has brought with it some interesting consequences: living with expensive computer hardware and limited memory space led to simplified modelling techniques and carefully tuned solution algorithms for the set of problems under consideration. Example: algebraic (zero-equation) turbulence models for airfoil calculations, e.g. Baldwin-Lomax.

Aerospace Industry and CFD
• Use of CFD is no longer in question: it is definitely used throughout the design process
• Questions on fidelity and accuracy: can we get sufficiently reliable results?
• Roll-out of CFD continues with more complex requirements, increase of computer power and applicability of new methods (optimisation)
• Some problems still raise the question of how much is gained for the extra cost: how large is the difference in result quality between steady 2-D RANS and 3-D LES for single airfoil design?

• Challenges in aircraft design are moving elsewhere: systems integration, control components (e.g. electro-hydraulics), packaging, computer support and “battlefield integration”, advanced materials (single-crystal turbine blades) etc.

The only truly revolutionary new technology on its way to (military) aircraft is the scram-jet engine: an air-breathing jet engine without a compressor or a turbine, where shock management in supersonic flow is used to create the necessary compression.

State of the market
• Boeing and Airbus totally dominate the commercial airliner market. A number of smaller players operate on the edges and in the business/regional jet business: ATR, Gulfstream, Raytheon etc.
• Military situation a bit more diverse: BAE Systems, Lockheed Martin, Sukhoi and a number of smaller manufacturers
• I also count missile systems and aircraft engine manufacturers (General Electric, Rolls Royce, Pratt and Whitney)
• NASA. In its latest budget statement, NASA claims that all of its critical problems are associated with managing fluid flow: the space bit in the middle is not nearly that difficult.
• High-speed car aerodynamics: highly specialised, very rich, with very clear requirements. Often bundled with aerospace: wings are turned upside down, creating down-force instead of lift; concerns about drag are very different from those in the standard car industry. However, the aerodynamics problem is much more complex than in aircraft. Example: proximity to the ground brings important boundary layer effects and requires organising a much more complex flow pattern
• Aircraft design also includes other flow physics and auxiliary component simulations

In order to understand the requirements of various uses of CFD, let us introduce a simple flow classification, based on how tightly the flow field is managed.

My flow classification
• Smooth flows: engineering machinery specifically organises the flow for maximum stability and efficiency. Design conditions are clearly defined and their variation is relatively small. Fractional changes in flow characteristics have profound performance effects (detached flow, small recirculations, turbulence). Example: aircraft at cruising speed, turbo-machinery blades, yacht design
• Rough flows: flow regime is uncertain, the main object of design is not flow management, but flow may still have a critical effect on performance. Example: electronics cooling, passenger compartment comfort in aircraft, swimsuits of Olympic swimmers
• In aerospace, we mostly deal with smooth flows

Scope of Simulations
• Traditionally, experimental studies in aerospace are important, but full-scale models are more and more out of the question. This creates ideal scope for numerical studies
• Questions we look to answer with numerical simulation techniques range from simple lift and drag studies to extremely complex physical problems: stall characteristics, stability in manoeuvres, sensitivity and robust design, optimisation, aero-acoustic noise. A number of new techniques stem from the use of CFD in aerospace and are still spreading through the rest of CFD and industry
• The “baseline” physics involved is relatively simple: compressible Newtonian flows of an ideal gas
• ... complications are easy to add: incompressible to hypersonic flow regime

Speed Range      Mach Number
low subsonic     < 0.3
high subsonic    0.3 - 0.6
transonic        0.6 - 1.1
supersonic       1 - 5
hypersonic       > 5

[Figure: flight speed u (m/s) plotted against log(Re), locating dust, gliders, model airplanes, general aviation and jet aircraft]
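The speed ranges tabulated above can be turned into a small classifier. The following Python sketch is an illustration only and not part of the original notes: it assumes air behaves as an ideal gas with constant γ = 1.4 and R = 287.05 J/(kg K), computes the speed of sound as a = sqrt(γ R T), and maps the resulting Mach number onto the table. All function names are hypothetical. Because the tabulated transonic (0.6 - 1.1) and supersonic (1 - 5) ranges overlap, the sketch arbitrarily assigns the overlap to the transonic range.

```python
import math

GAMMA_AIR = 1.4    # ratio of specific heats for air (assumed constant)
R_AIR = 287.05     # specific gas constant of air, J/(kg K) (assumed)


def speed_of_sound(temperature_k):
    """Speed of sound in an ideal gas, a = sqrt(gamma * R * T), in m/s."""
    return math.sqrt(GAMMA_AIR * R_AIR * temperature_k)


def mach_number(velocity, temperature_k):
    """Mach number for a flow speed in m/s and a static temperature in K."""
    return velocity / speed_of_sound(temperature_k)


def speed_range(mach):
    """Map a Mach number onto the speed ranges of the table.

    The overlap between transonic and supersonic (Ma = 1 to 1.1)
    is assigned to the transonic range; this is a choice made here.
    """
    if mach < 0.3:
        return "low subsonic"
    if mach < 0.6:
        return "high subsonic"
    if mach <= 1.1:
        return "transonic"
    if mach <= 5.0:
        return "supersonic"
    return "hypersonic"
```

For example, a flight speed of 250 m/s at a static temperature of 216.65 K gives a Mach number of about 0.85, which this sketch places in the transonic range.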



• Even in the simplest flows, we do not have an easy job: turbulence complicates the situation immensely! The problem of turbulence modelling for engineering applications is still unsolved; however, the physics is straightforward and well understood
• Away from the baseline, physics can get considerably more complex: combustion, de-icing, multi-phase flow etc.
• There is significant penetration of general-purpose CFD tools into the aerospace companies, but this is still considered a massive untapped market from the commercial CFD point of view. It is unclear whether general-purpose tools will be sufficiently good to do the job.

Numerical simulation software
• You don’t do CFD without computers! Early efforts with pieces of paper and rooms of people date from the UK Meteorological Office, running large scale weather forecasting simulations
• In the last 10 years, CFD performance and use have been coming together
– Computer power is a cheap commodity. Massively parallel computers are commonplace today and can be easily handled in software
– In aerospace, understanding the physics is typically not a problem
– Numerical methods cleaned up of systemic errors and gross failures
– Sufficient experience in research departments
– Validation against “trusted” experimental data
– Understanding of simplifications and assumptions
• In other industries, roll-out of numerical simulation tools is limited by experience. Phases of integration of CFD in the design process:
1. Research and development departments: validation and assessment of capabilities. Typically involves detailed study of old designs or production pieces and comparison with available measured data.
2. Pre-design: experimenting with early prototypes and new ideas away from the current development line
3. Design and pre-production: new product development.
4. Production: optimisation of existing components and incremental development of the running design
• In aerospace design, it is no longer sufficient to make a plane fly
– Economy, fuel consumption
– Government regulations: noise and pollution levels. Example: noise pollution caused by the supersonic shock wave on the ground killed supersonic flight! Simulation objective: dissipate the shock between the plane and the ground
– Passenger comfort. Includes both oscillatory and non-oscillatory flows around the aircraft, as well as cabin heating and air-conditioning. Example: Boeing 747-300 with a wiggly tail
– Military application requirements: agile manoeuvring system and unstable aerodynamic configurations
– In some cases, aerodynamics design does not dominate: instead, it is necessary to make a bad aerodynamic shape fly. Example: F-117 stealth bomber. This is also a good example of what happens when the numerical simulation software (in this case, the simulation of the electromagnetic signature) cannot handle a traditional engineering shape of an aircraft. Note that the F-117 is an old aircraft: scheduled to be retired from the US Air Force by the end of 2008.

The process of performing a CFD simulation has evolved through the years, with maturing numerical simulation tools and the transition of the design work from the drafting board to a Computer-Aided Design (CAD) system. This is still ongoing: over the last few years, providers of CAD solutions have started talking about Product Lifecycle Management (PLM) solutions, moving the complete cycle under a single IT-based system.
Even in a relatively modern field like CFD, legacy practices still act as a stumbling block. Example: traditionally, a meshing tool and a CFD solver are two separate software components; add to this the problem of transferring geometrical data from a CAD package into a mesher and attempting to run an optimisation simulation in this manner. From the moves in the market, it is expected that software convergence may happen over the next 5-10 years (one generation of CFD software tools). Note that the similar problem in structural analysis has already been overcome: compared to fluid flow, the physics involved in structural simulation is significantly simpler.

Phases of a CFD Simulation
• Description of the geometry. Airfoil curve data, CAD surface or anywhere in between. External aerodynamics = geometry of interest located in a large domain (atmosphere)
• Extraction of the fluid domain. In cases where a CAD description is given, a considerable amount of clean-up may be required. This is not easily done: there are no reliable automatic tools.
• Mesh generation. Based on the given fluid domain, a computational mesh is created. Tools range from manual (points and cells), through semi-automatic (block splitting, template geometries, surface wrapping, i.e. adaptation of a mesh template to a given surface), to fully automatic (tetrahedral and hexahedral/polyhedral automatic mesh generators). Mesh generation is the most demanding and time-consuming process today. Significant push to automatic tools. In spite of automatic tools, there is room for engineering judgement, as a quality solution can be obtained more cheaply by constructing a quality mesh. A good mesh takes into account what the solution should look like.
• Physics setup. Select the governing equations and specify the material properties and boundary conditions involved. Second level of engineering judgement: how much does the knowledge of detailed material behaviour improve the final result? Example: specific heat capacity of water as a function of temperature; thermal expansion coefficient of water as a function of temperature T.
• Boundary condition setup. This includes both the location and type of boundary conditions used. The role of boundary conditions is to include the influence of the environment on the solution. In “big box” cases, this is easier than in other engineering simulations
• Solver setup and simulation. Choice of discretisation parameters and numerical solution procedure: differencing schemes, relaxation parameters, multigrid, convergence tolerance etc.
• Data post-processing and analysis of results. Not always straightforward.
– Integral studies. In simple lift and drag studies, we could be looking at a small number of integral properties.


– Flow organisation, where global characteristics of the flow are controlled to achieve stability or a desired pattern
– Management of detailed flow structure. Example: remove the vortex depositing dirt on a part of the windshield
– Sensitivity and robust design studies. These effects usually cannot be seen in results without experience, or they require specialised simulations.

Advanced visualisation tools are a part of the game: they provide a way of managing the wealth of data.

20 years ago, leading CFD tools were developed at Universities, centred around strong research groups and attracting significant funding from the industry. As a response to deployment problems, large aerospace companies developed their own research teams and in-house expertise. Today, CFD software development at Universities is winding down significantly: the components a good research platform requires are substantial and very few groups can afford to finance the effort (there is little research value and few publishable results in writing a new “known technology” CFD code). The majority of groups rely on commercial CFD software to do their research.

In-house software development in large companies suffers a similar fate: the work that can be done by commercial software is migrated to commercial CFD codes and sometimes even outsourced. Apart from financial pressures, this is related to the software development, maintenance and validation work required to keep in-house codes working and up-to-date with technology.

The above pushes large-scale software work to commercial organisations, which have grown from small companies in the late 1980s and early 1990s to large organisations. The current trend towards large packages that can satisfy all simulation needs of all customers under the same hood acts as a counter-weight to this state of affairs. Specialised software for specialised needs and technology components giving competitive advantage will be kept separate.

CFD Software Development
• Small experimental codes: playing around with physics and numerical methods
• In-house “general” CFD solver development
• In-house custom-written software for specific purposes: e.g. wing-nacelle engine system, turbine blade optimisation, simulation of unstable manoeuvres in military jets, calculation of directional derivatives and solution stability, matching computations with measured data sets etc.
– Complex and tuned panel method codes
– Simplified physics, e.g. potential flow and boundary layer codes
– Hooked-up mesh generation and parametrisation
– Special purpose codes: sensitivity, aero-acoustics etc.
– In-house development kept secret: competitive advantage. Example: Pratt & Whitney material properties databases
• Government-sponsored (National Labs) developments
• General-purpose CFD packages: from a fridge to a stealth plane
• University research codes; public-domain software
• “Write-your-own” CFD solver
• Software getting increasingly complex: you need a PhD to join the game

Market situation: Aerospace CFD
• Aerospace is atypical for the general CFD picture: an early adopter with lots of experience in-house and specific tools targeted to applications
• In-house codes are extremely important and integrated into the design process. However, they are currently approaching “vintage” status
• Example: Boeing is dominated by multi-block structured solvers, which currently hinders development. Airbus came in later and developed unstructured solvers in-house, with a massive competitive advantage
• There are problems with in-house codes: development effort more complex, people with knowledge move on, process of acceptance and validation very long
• Simulation software needs to become more user-friendly and closer to the CAE line. This implies extra work apart from “raw” solver capability, which is not easily handled in-house.
• Additional CAD-related requirements and the cost of keeping CFD development teams in-house open the room for commercial general-purpose CFD packages
• There also exist a number of consortium or government-sponsored codes. Example: NASA (USA), DLR (Germany)

Remaining Challenges
• Mesh generation, especially parallel mesh generation
• Handling massively parallel simulations
• Integration into the CAD-based design process
• Fluid-structure interaction and aeroacoustics
• On the cusp between two generations of general-purpose CFD solvers: procedural programming in Fortran and C against object orientation
• The push for bigger, faster, more accurate simulations in external aerodynamics is not so strong in the aerospace market: meshes are already sufficiently large. Also, there is extensive experience of the required size of the model, mesh resolution and locally fine meshes from the days when computer power was expensive
• In aircraft engine design, the opposite is the case. ASC Project (Advanced Simulation and Computing), US Dept of Energy, Los Alamos, Livermore, Sandia, Stanford University and other partners
– Tip-to-toe simulation of a turbo-fan aircraft engine, including fan, turbo compressor, combustion chambers and turbine. Preferred modelling technique: Large Eddy Simulation
– Integrated Multicode Simulation Framework
– As a part of the project, the world’s biggest parallel computers have been built:
∗ ASC Red, Sandia, 1996
∗ ASC Blue, Livermore, Los Alamos, 1998
∗ ASC White, Livermore, 2001
∗ ASC Q, Los Alamos, 2003
∗ ASC Red Storm, Sandia, 2004, 40 TeraOps
∗ ASC Purple, Livermore, 2005, 100 TeraOps
∗ ASC Blue Gene, Livermore, Los Alamos: 130 000 CPUs and 360 TeraFlops performance
– For comparison, ASC Linux, a 960-node Linux box with 1920 processors and 3.8 TB, produces a peak performance of 9.2 TeraFlops
– The idea of doing a complete engine is somewhat abandoned: not enough power for LES on compressor or turbine. Using a combined RANS/LES simulation approach with coupling on interfaces.

2.2 Scope of Computational Efforts

The level and fidelity of numerical simulation is tailored to the design process: it covers everything from preliminary design tools running in 1-2 seconds to full transient CFD studies for complex physics simulations. The use of analytical and “pedestrian” methods in early design phases cannot be ignored: laying out the initial set-up of a jet engine compressor is done using precisely the techniques taught in University turbomachinery courses. Once the basic design is laid down, more detailed tools are used to satisfy design requirements and optimise the performance.

Aerodynamic Drag
• Drag varies with the velocity squared: major influence at aerospace speeds. Small improvements in drag lead to considerable advances: a 15% drag reduction on the Airbus A340-300B would yield a 12% fuel saving, other parameters being constant (Mertens, 1998)
• Chasing drag improvements in highly optimised shapes is only of marginal interest

[Figure: drag coefficients Cd of basic shapes]

Shape              Cd
Sphere             0.47
Half sphere        0.42
Cone               0.50
Cube               1.05
Angled cube        0.80
Long cylinder      0.82
Short cylinder     1.15
Streamlined body   0.04

• Simulations include functional subset cases, e.g. airfoils, wings, tail configurations, nacelle-to-wing assembly, but also full aircraft models
• Subjects of interest include shock-boundary layer interaction: the effect of shocks on standard turbulence model predictions is still in question.
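The statement that drag varies with the velocity squared, together with the drag coefficients listed for basic shapes, can be illustrated with the standard definition of the drag coefficient, D = 1/2 ρ u² Cd A. The Python sketch below is an added illustration, not part of the original notes: the function and dictionary names are hypothetical, and the air density and frontal area used in the usage note are assumed values chosen only for demonstration.

```python
# Drag coefficients of basic shapes, as quoted in the notes.
DRAG_COEFFICIENTS = {
    "sphere": 0.47,
    "half sphere": 0.42,
    "cone": 0.50,
    "cube": 1.05,
    "angled cube": 0.80,
    "long cylinder": 0.82,
    "short cylinder": 1.15,
    "streamlined body": 0.04,
}


def drag_force(cd, density, velocity, frontal_area):
    """Drag force in N from D = 0.5 * rho * u^2 * Cd * A (SI units)."""
    return 0.5 * density * velocity ** 2 * cd * frontal_area


def drag_ratio(shape_a, shape_b):
    """Ratio of drag between two shapes at equal speed, density and area."""
    return DRAG_COEFFICIENTS[shape_a] / DRAG_COEFFICIENTS[shape_b]
```

With an assumed air density of 1.225 kg/m³ and a frontal area of 1 m², `drag_force(DRAG_COEFFICIENTS["sphere"], 1.225, 10.0, 1.0)` evaluates the drag on a sphere at 10 m/s; doubling the speed multiplies the result by four. `drag_ratio("cube", "streamlined body")` shows how much a bluff shape costs relative to a streamlined one at the same conditions.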

High-Lift Aerodynamics
• High-lift wing configuration is very important: lower take-off and landing speed, higher pay-load etc.
• Study of multi-element airfoil configuration: high flow curvature, flow separation, wakes from upstream elements, laminar-to-turbulent boundary layer transition etc.
• High-lift devices added to wings include flaps and slats (common), but also leading edge extensions, vortex generators and blown flaps
• The subject of control is boundary layer management and flow stability (avoiding stall)

• Looking at Formula 1 aerodynamics, many similar devices can be found

Unsteady Aerodynamics
• In most cases, aerodynamic flows are considered steady-state: flight at cruising speed, steady-state lift-off configuration etc.
• Unsteady effects are sometimes critical, both in the oscillatory and the non-oscillatory regime
• Oscillatory instability: dynamic stall on helicopter rotor blades in forward flight; vortex shedding behind bluff bodies
• Non-oscillatory flows: flow separation at high angle of attack. Turbulence effects are critical for accurate modelling
• Unsteady transonic effects, moving or oscillating shock studies: significant effect on the performance, especially in cases of high-speed helicopter rotor blades
• Unsteady aerodynamics is closely related to aero-elasticity. Sources of unsteadiness are mechanically generated: flutter

Rotary Aerodynamics
• Simulation of helicopter rotor blades is usually considered a specialised area of research: special assumptions and modelling regime
• Study of dynamic stall, blade-vortex interaction, blade-to-blade interaction, blade tip effects and transonic flow effects
• Similar effects, but at lower speeds, can be found in other devices, e.g. wind turbines, propeller design, turbo-machinery

High-Speed Aerodynamics
• At high speed, the equation of state and ideal gas assumptions break down. In other aspects, the flow becomes easier to handle. Generally refers to speeds of Ma = 5 and above
• For high speed, and due to the real gas effects, we speak of aerothermodynamics rather than aerodynamics.
• Regimes of hypersonic flow: separation is done based on the choice of equation of state


– Perfect gas. Flow regime still Mach number independent, but there are problems with adiabatic wall conditions
– Two-temperature ideal gas. Rotational and vibrational motion of the molecules needs to be separated and leads to two-temperature models. Used in supersonic nozzle design
– Dissociated gas. Multi-molecular gases begin to dissociate at the bow shock of the body.
– Ionised gas. The ionised electron population of the stagnated flow becomes significant, and the electrons must be modelled separately: electron temperature. Effect important at speeds of 10-12 km/s

Rudder and Steering Diagrams
• In automated steering/targeting systems, the aircraft/missile is controlled by a computer: given target or flight path
• Automatic control systems rely on diagrams showing the response to steering commands: in practice, large look-up tables or fitted functional data. Consider the case of a rotating missile with 2 × 4 control surfaces.
• The steering data is created by computation: combinations of control configurations with lift, drag, pitch, yaw orientation and force response. This typically involves 5-10 thousand simulations, done automatically on massively parallel computers. Automatic mesh generation, massive parallelism and controlled accuracy are essential.
aftosmis/home.html

Internal Flows and Auxiliary Devices
• Internal flows: incompressible, low speed, aerodynamic forces typically of no consequence
• Example: passenger compartment comfort, heating, cooling and ventilation. Closer to “standard” CFD and usually handled by general-purpose CFD packages

Stability and Robust Design
• Stability analysis takes into account the effects of uncertainty (noise) in the input parameters. Example: how much will the lift coefficient on the airfoil change with a 5% change in the angle of attack?
– Away from stall point: lift is stable to small changes in conditions



– At stall: catastrophic change
– What about a NACA 0012 (symmetric airfoil profile) at zero angle of attack?
• Stability of the solution to small perturbations can be examined in different ways:
– Lots of simulations: detailed analysis, lots of work
– Special numerical techniques: forward derivatives, adjoint equations (continuous and discrete), Proper Orthogonal Decomposition methods
• All of the above are extensively used in aerospace simulations. However, looking at results is not easy: one needs to understand their meaning
• Robust design studies
– Under normal circumstances, we look to maximise the performance of a device in absolute terms. Example: maximum lift in multi-element airfoils
– In reality, requirements are different: consider aircraft landing in a storm, where angle of attack is not constant. Thus, the optimisation process should account for uncertainty of the input parameters and provide stable performance across the range.
– Such effects typically lead to different optimisation results: envelope of performance instead of maximum lift
• Matching of computations with experimental data in combined experimental and numerical studies. Example: unknown flow pattern at the entry of the jet engine combustor, but measured pressure and temperature data available at the outlet.

Fluid-Structure Interaction
• The first step in modelling is to choose the domain of interest. In simple situations, this will cover only a single material or a single governing law. Unfortunately, this is not always the case
• Example: wing flutter
– Aerodynamic forces from fluid flow determine the load on the wing. The wing itself is an elastic structure and deforms under load
– Deflection of the elastic wing changes the flow geometry: a new solution produces different surface load
– Interaction between the two may be stable or unstable: flutter



• Fluid-structure simulations involve both the fluid and solid domain. Care must be given to the coupling methods and stability of the algorithm
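The load-deflection-load cycle described for the elastic wing can be illustrated on a deliberately simple model. The sketch below is not a flutter solver: it assumes a rigid section on a linear torsional spring with a linear lift curve, and alternates a "fluid" step with a "structure" step in a fixed-point loop. All function names and numbers are illustrative assumptions. The loop converges only while the ratio of aerodynamic to structural stiffness stays below one, a static analogue of the stable and unstable interaction mentioned above.

```python
def aerodynamic_moment(twist, alpha0, aero_stiffness):
    """'Fluid' step: moment on the section for the current elastic twist."""
    return aero_stiffness * (alpha0 + twist)


def structural_twist(moment, spring_stiffness):
    """'Structure' step: twist of a linear torsional spring under the moment."""
    return moment / spring_stiffness


def couple(alpha0, aero_stiffness, spring_stiffness, tol=1e-10, max_iter=200):
    """Staggered fixed-point coupling; returns (twist, iterations, converged)."""
    twist = 0.0
    for iteration in range(1, max_iter + 1):
        moment = aerodynamic_moment(twist, alpha0, aero_stiffness)
        new_twist = structural_twist(moment, spring_stiffness)
        if abs(new_twist - twist) < tol:
            return new_twist, iteration, True
        if abs(new_twist) > 1.0e6:  # runaway deflection: coupling is unstable
            return new_twist, iteration, False
        twist = new_twist
    return twist, max_iter, False


# Hypothetical stiffness values: ratio 0.4 (stable) and 1.2 (unstable)
stable = couple(alpha0=0.05, aero_stiffness=40.0, spring_stiffness=100.0)
unstable = couple(alpha0=0.05, aero_stiffness=120.0, spring_stiffness=100.0)
```

In this model the converged twist can be checked against the closed form r·alpha0/(1 − r), where r is the stiffness ratio. Real fluid-structure solvers replace both steps with full field solutions and usually need under-relaxation or stronger coupling.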


2.3 Finite Volume or Finite Element?

Two sets of numerical techniques handling computational continuum mechanics dominate the field: the Finite Volume Method (FVM) and the Finite Element Method (FEM). One can clearly show both are based on the same principles and are closely mathematically related. Various variants and generalisations can also be devised, but so far their impact has been limited. Some deserve a mention:
• Discontinuous Galerkin discretisation provides a common framework for the FVM and FEM. It combines the conservative flux formulation which is a basis of the FVM with the elemental shape function and a weak formulation of the FEM. One of the interesting uses would be a formal higher-order extension beyond second-order integrals. So far, the most important use is the generalisation of mathematical machinery underpinning both methods
• Lattice Gas and Lattice Boltzmann methods claim to simulate the flow equations from basic principles of molecular dynamics instead of using the continuum equations. Clearly, averaging over a sufficient number of lattice operations will yield the original PDEs and produce the required solution. Attractions of this method follow from simplifications of lattice operations to very primitive accuracy (e.g. 3 velocity levels) and simplifications in complex geometry handling

Numerical Techniques in Aerospace Simulations: Spatial Techniques
• Finite Difference Method (FDM): really appropriate only for structured meshes; no conservation properties. Not used commercially. An important use of FDM is in aero-acoustic simulations, where high-order discretisation is essential (e.g. 6th order in space and 10th order in time). Problems with high-order boundary conditions.
• Finite Volume Method: dominates the fluid simulation arena
• Finite Element Method. No particular reason why it cannot be used; however, the bulk of the numerical method development was targeted at the FVM. As a result, some techniques and solution methodology are not suitable for fluid flow. I do not know any FEM fluid flow aerospace solvers, but FEM dominates the structural analysis arena
• Discontinuous Galerkin: a formal unification of the FEM and FVM ideas. Strongly conservative and consistent, but extensions are still impractical (control of matrix properties, solution techniques etc.). Consider it work-in-progress
• Monte Carlo Methods: extensively used in low-density high-speed aerodynamics (Space Shuttle re-entry). Techniques are specialised for high efficiency
• Spectral techniques: special purposes only. Extremely efficient and accurate for “box in a box” and cyclic matching simulations, e.g. DNS

Handling Temporal Variation
• Steady state: no temporal discretisation required
• Time domain: bulk of transient flow simulations
• Frequency domain: special purposes. Example: in turbo-machinery simulations, it is possible to extract the dominant frequencies. Instead of solving a time-dependent problem, a series of steady simulations is set up, each for a selected frequency (effects of the temporal derivative now convert into a source/sink term). The time-dependent behaviour is recovered from the combination of frequency solutions.

Simplified Flow Solvers in Industrial Use
• It is not always necessary to run a full Navier-Stokes solver to obtain usable results. Also, the simulation time is sometimes critical: approximate result now.
• Panel method. Combination of sources, sinks, doublets and vortex elements used to assemble a “zero streamline” form which represents the body. Extremely fast and, with experience, capable of producing indicative solutions.
• Potential Flow Solvers. Incompressible formulation considered too basic. However, the compressible potential formulation, or even a transient compressible potential, can be very useful. The main effect missing in the simplified form is the viscosity effect in the boundary layer: effective change of shape for the potential region. A potential flow solver can be used to accelerate the solution to steady-state for a more complex solver: initialisation of the solution
• Potential Flow with Boundary Layer Correction. Here, a combination of the compressible potential and boundary layer correction takes into account the near-wall effect: the geometry is corrected for displacement thickness in the boundary layer



• Euler Flow Solver. Neglects the viscous effects but the compressibility physics can be handled in full.
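The “zero streamline” idea behind the panel method can be seen in the simplest textbook case: a uniform stream plus a single doublet, whose zero streamline is a circle. The sketch below only verifies this analytic building block; it is not a panel code, which would distribute many such elements over the body and solve for their strengths. Function names and numerical values are illustrative assumptions.

```python
import math


def stream_function(x, y, u_inf, radius):
    """Stream function of a uniform stream (along x) plus a doublet at the origin.

    The doublet strength is chosen so that psi = 0 on the circle r = radius.
    """
    r2 = x * x + y * y
    return u_inf * y * (1.0 - radius * radius / r2)


def surface_cp(theta):
    """Pressure coefficient on the circle from Bernoulli: 1 - 4 sin^2(theta)."""
    return 1.0 - 4.0 * math.sin(theta) ** 2


u_inf, radius = 10.0, 0.5

# Sample the stream function on the body contour: it should vanish there
psi_on_body = [
    stream_function(radius * math.cos(t), radius * math.sin(t), u_inf, radius)
    for t in (0.1 * k for k in range(1, 63))
]
max_psi_on_body = max(abs(p) for p in psi_on_body)
```

The value of `max_psi_on_body` stays at round-off level, confirming that the circle is a streamline of the combined flow, so the elementary solutions together "represent the body".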

Chapter 3 CFD in Automotive Applications
CFD Methodology
• Numerous automotive components involve fluid flow and require optimisation. This opens a wide area of potential CFD use in the automotive industry
• CFD approaches the problem of fluid flow from fundamental equations: no problem-specific or industry-specific simplification
• A critical step involves complex geometry handling: it is essential to capture real geometrical features of the engineering component under consideration
• Traditional applications involve incompressible turbulent flow of Newtonian fluids
• While most people think of automotive CFD in terms of external aerodynamics simulations, the reality of industrial CFD use is significantly different

Automotive CFD Today
• In numbers of users in automotive companies, CFD today is second only to CAD packages
• In some areas, CFD replaces experiments

– Engine coolant jackets
– Under-hood thermal management
– Passenger compartment comfort


• In comparison with CFD, experimental studies are expensive, carry limited information and it is difficult to achieve sufficient turn-over
• The biggest obstacle is validation: can CFD results be trusted?

• In other areas, CFD is insufficiently accurate for complete design studies



– Required accuracy is beyond the current state of physical modelling (especially turbulence modelling)
– Simulation cost is prohibitive or turn-around is too slow
– Flow physics is too complex: incomplete modelling or insufficient understanding of detailed physical processes
– In some cases, combined 1-D/3-D studies capture the physics without resorting to complete 3-D study
• Examples:
– Prediction of the lift and drag coefficient on a car body
– In-cylinder simulations in an internal combustion engine
– Complete internal combustion engine system: air intake, turbo-charger, engine ports and valves, in-cylinder flow, exhaust and gas after-treatment
• CFD can still contribute: parametric study (trends), reduced experimental work etc.
• Numerical modelling is particularly useful in understanding the flow or looking for qualitative improvements: e.g. optimisation of vehicle soiling pattern on windows

Examples of External Aerodynamics Simulations




• CFD is used across the industry, at various levels of sophistication

• Impact of simulations and reliance on numerical methods is greatest in areas that were not studied in detail beforehand

• Considerable use in cases where it is difficult to quantify the results in simple terms like the lift and drag coefficient

– Flow organisation, stability and optimisation
– Detailed look at the flow field, especially in complex geometry
– Optimisation of secondary effects: fuel-air mixture preparation



CFD Capabilities in 1980s: Early Adoption in Aerospace Industry
• Historically, early efforts in CFD involve simplified equations and simulations relevant for aerospace industry
• Experience in achieving best results with limited computational resources: attention given to solution acceleration techniques
• Application-specific physical models
– Linearised potential equations, Hess and Smith, Douglas Aircraft 1966
– 3-D panel codes developed by Boeing, Lockheed, Douglas and others in 1968
– Specific turbulence models for aerospace flows, e.g. Baldwin-Lomax
– Coupled boundary layer-potential flow solver, Euler flow solver
• Capabilities beyond steady-state compressible flow were very limited



Early Automotive CFD Simulations
• First efforts aimed at simplified external aerodynamics (1985-1988)
• . . . but airfoil assumptions are not necessarily applicable
• Joint numerical and experimental studies: validation of numerical techniques and simulation tools, qualitative results, analysis of flow patterns and similar

• It is quickly recognised that the needs of automotive industry and (potential) capabilities of CFD solvers are well beyond contemporary experimental work
• Focus of early numerical work is on performance-critical components: internal combustion engines and external aerodynamics
• Geometry and flow conditions are simplified to help with simulation set-up

Example: Intake Valve and Manifold
• 2-D steady-state incompressible turbulent fluid flow
• Axi-symmetric geometry with a straight intake manifold and fixed valve lift



• Simulation by Perić, Imperial College London 1985

Automotive CFD in 1990s: Expanding Computer Power and Validated Models
• Numerical modelling is moving towards product design
– Improvements in computer performance: reduced hardware cost, Moore’s law
– Improved physical modelling and numerics: fundamental problems with flow, turbulence and discretisation are resolved
– Sufficient validation and experience accumulated over 10 years



• Notable improvement in geometrical handling: realistic 3-D geometry
• Graphical post-processing tools and animations: easier solution analysis
• Mesh generation for complex geometry is a bottle-neck: need better tools

Expansion of Automotive CFD
• Increase in computer performance drives the expansion of CFD into new areas by reducing simulation turn-over time
• Massively parallel computers provide the equivalent of the largest supercomputers at prices affordable in an industrial environment (1000s of CPUs)

Physical Modelling
• New physical models quickly find their use, e.g. free surface flows
• Looking at more complex systems in transient mode and in 3-D: simulation of a multi-cylinder engine, with dynamic effects in the intake and exhaust system
• Computing power brings in new areas of simulation and physical modelling paradigms. Example: Large Eddy Simulation (LES) of turbulent flows

Integration into a CAE Environment
• Computer-Aided Design software is the basis of automotive industry
• Historically, mesh generation and CFD software are developed separately and outside of the CAD environment, but the work flow is CAD based!
• Current trend looks to seamlessly include CFD capabilities in CAD

Summary: Automotive CFD Today
• CFD is successfully used across automotive product development
• Initial “landing target” of external aerodynamics and in-cylinder engine simulation still not reached (!): sufficient accuracy is difficult to achieve

Lessons Learned
• The success of CFD in automotive simulation is based on meeting industry needs rather than choosing problems we may simulate: find a critical broken process and offer a solution
• Numerical simulation tools will be adopted only when they fit the product development process: robust, accurate and validated solver, rapid turn-over
• Experimental and numerical work complement each other even if sufficient accuracy for predictive simulations cannot be achieved
– Validation of simulation results ↔ understanding experimental set-up
– Parametric studies: speeding up experimental turn-over
• True impact of simulation tools is beyond the obvious uses: industry will drive the research effort to answer its needs

Part II The Finite Volume Method

Chapter 4 Mesh Handling
4.1 Introduction

When presenting a continuum mechanics problem for computer simulation, one needs to establish not only the mathematical model but also the computational domain. While the choice of physics is relatively general, numerical description of the domain of interest is considerably more complex. Looking at the area of external aerodynamics in aerospace, compressible Navier-Stokes equations for an ideal gas will typically suffice, while the wealth of geometrical shapes defies even basic classification. In most cases, the shape of the spatial domain is of primary interest: capturing it in all relevant detail is essential.

In transient simulations, handling the temporal axis is considerably simpler. Due to the uni-directional nature of interaction, it is sufficient to split the time interval into a finite number of time-steps and march the solution forward in time.

It quickly becomes clear that fidelity of geometrical description of an engineering object plays an important role. For example, in a heat exchanger, it is necessary to capture active surface area with some precision in order to correctly calculate the total heat transfer. At the same time, it is a question of engineering judgement to decide which geometrical features are important for the result and which may be omitted.

A computational mesh splits the space into a finite number of elements (cells, control volumes or similar), bounded by faces and supported by points. Computational locations are placed in the cells or on the points in a regular manner. The idea of mesh support is to discretise the governing equations over each cell and handle cell-to-cell interaction. Some mesh validity criteria follow directly from the above:
• Computational cells should not overlap;
• Computational cells should completely fill the domain of interest.
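The two validity criteria are easiest to see in one dimension, where a cell is an interval. The following is a minimal sketch under that assumption; the function name, the tuple layout and the tolerance are hypothetical, and a 3-D check would work on cell volumes and face connectivity instead.

```python
def check_mesh_1d(cells, domain, tol=1e-12):
    """Check a 1-D mesh given as (x_left, x_right) intervals.

    Returns (no_overlap, fills_domain) for the domain (a, b).
    """
    ordered = sorted(cells)
    no_overlap = True
    no_gaps = True
    reach = ordered[0][1]  # right-most coordinate covered so far
    for left, right in ordered[1:]:
        if left < reach - tol:   # cell starts inside already covered space
            no_overlap = False
        if left > reach + tol:   # uncovered space before this cell
            no_gaps = False
        reach = max(reach, right)
    starts_at_a = abs(ordered[0][0] - domain[0]) < tol
    ends_at_b = abs(reach - domain[1]) < tol
    return no_overlap, (no_gaps and starts_at_a and ends_at_b)


valid = check_mesh_1d([(0.0, 0.25), (0.25, 0.5), (0.5, 1.0)], (0.0, 1.0))
overlapping = check_mesh_1d([(0.0, 0.6), (0.5, 1.0)], (0.0, 1.0))
with_gap = check_mesh_1d([(0.0, 0.4), (0.5, 1.0)], (0.0, 1.0))
```

Only the first mesh satisfies both criteria; the second fills the domain but its cells overlap, and the third leaves part of the domain uncovered.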



Every discretisation method brings its own mesh validity criteria and measures of mesh quality. In general terms, a mesh that is visually pleasing is also likely to support a quality solution. Our second concern is the interaction between the mesh resolution and (known or implied) solution characteristics. Features such as shocks, boundary layers and mixing planes require higher resolution than a “far field” section of the domain. Construction of a quality mesh is usually a question of experience and use of quality mesh generation tools. An ideal mesh would be one that distributes the discretisation error uniformly over the solution volume and produces a “user-independent” (or, more precisely, user-experience-independent) result. The quest for fast and robust automatic mesh generators iteratively sensitised to the solution is still ongoing.


4.2 Complex Geometry Requirements

Computational Mesh
• A computational mesh represents a description of the spatial domain in the simulation: external shape of the domain and highlighted regions of interest, with increased mesh resolution
• Mesh-less methods are possible (though not popular): the issue of describing the domain of interest to the computer still remains
• Mesh generation is the current bottle-neck in CFD simulations. Fully automatic mesh generators are getting better and are routinely used. At the same time, requirements on rapid and high-quality meshing and massively increased mesh size are becoming a problem
• Routinely used mesh size today
– Small mesh for model experimentation and quick games: 100 to 50k cells. Fast turn-around and qualitative results. Note that a number of flow organisation problems may be solved on this mesh resolution
– 2-D geometry: 10k to 1 million cells. Low-Re turbulent simulations may require more, due to near-wall mesh resolution requirements
– 3-D geometry: 50k to several million cells
– Complex geometry, 3-D, industrial size: 100k to 10-50 million cells. Varies considerably depending on geometry and physics, steady/transient flow etc.
– Large Eddy Simulation (LES): 3-D, transient, 1-10 million cells. LES requires very long transient runs and averaging (20-50k time steps), which keeps the mesh resolution down



– Full car aerodynamics, Formula 1: 20-200 million cells for routine use. Large simulations under discussion: 1 billion cells!
• On very large meshes, problems with the current generation of CFD software become a limiting factor: missing parallel mesh generation, data file read/write, post-processing of results, hardware and software prices

Handling Complex Geometry
• In aerospace applications, geometrical information is usually available before the simulation. In general, this is not the case: for simple applications, a mesh may be the only available description of the geometry
• Domain description is much easier in 2-D: real complications can only be seen in 3-D meshes
• Geometrical data formats
– 2-D boundary shape: airfoils. Usually a detailed map of x−y locations on the surface. Sometimes defined as curve data
database.html
– Stereo Lithographic Surface (STL): a surface is represented by a set of triangular facets. Resolution can be automatically adjusted to capture the surface curvature or control points. Creation of STL is usually available from CAD packages
– Native CAD description: Initial Graphics Exchange Specification (IGES), solid model etc. In most cases, the surface is represented by Non-Uniform Rational B-Splines or approximated by quadric surfaces. Typically, both are too expensive for the manipulations required in mesh generation and are either avoided or simplified
• Geometry clean-up. Very rarely is the CAD description built specifically for CFD; in most cases, CAD surfaces (wing, body, nacelle) are assembled from various sources, with varying quality and imperfect matching. Surface clean-up is time-consuming and not trivial. In some cases, the mesh generator may be less sensitive to errors in surface description, which simplifies the clean-up
• Feature removal. CAD description or STL surface may contain a level of detail too fine to be captured by the desired mesh size, causing trouble with 3-D mesh generation. Feature removal creates an approximation of the original geometry with the desired level of detail
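A faceted surface of the STL kind is simple to inspect numerically, which is one reason clean-up tools work on it. The sketch below assumes facets are stored as triples of (x, y, z) points with consistent outward orientation; the function names are hypothetical and no STL file parsing is attempted. It computes the total area and the sum of facet area vectors, which vanishes for a closed surface (a necessary, not sufficient, condition).

```python
import math


def sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def facet_area_vector(a, b, c):
    """Area vector of a triangle: half the cross product of two edges."""
    n = cross(sub(b, a), sub(c, a))
    return (0.5 * n[0], 0.5 * n[1], 0.5 * n[2])


def surface_summary(facets):
    """Return (total area, sum of area vectors) for a list of triangles."""
    total_area = 0.0
    closure = [0.0, 0.0, 0.0]
    for a, b, c in facets:
        s = facet_area_vector(a, b, c)
        total_area += math.sqrt(s[0] ** 2 + s[1] ** 2 + s[2] ** 2)
        for k in range(3):
            closure[k] += s[k]
    return total_area, tuple(closure)


# Unit tetrahedron with outward-oriented facets
o, a, b, c = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
tetrahedron = [(o, b, a), (o, a, c), (o, c, b), (a, b, c)]
area, closure = surface_summary(tetrahedron)
```

Dropping a facet from the list leaves a hole and a non-zero `closure` vector, which is the kind of defect surface clean-up has to find and repair.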

Surface Mesh Generation


• In cases where the surface description is not discrete, a surface mesh may be created first
• STL surface is already a mesh. It may be necessary to additionally split the surface for easier imposition of boundary conditions: inlet, outlet, symmetry plane etc.
• Surface mesh is usually triangular or quadrilateral. There are potential issues with capturing surface curvature: surface mesh will be considered “sufficiently fine”

Volume Mesh Generation
• The main role of the volume mesh is to capture the 3-D geometry
• The cells should not overlap and should completely fill the computational domain. Additionally, some convexness criteria (FVM) or a library of pre-defined cell shapes (FEM) is included.
• Computational mesh defines the location and distribution of solution points (vertices, cells etc.). Thus, filling the domain with the mesh is not sufficient: ideally some aspects of the solution should be taken into account.
• A-priori knowledge of the solution is useful in mesh generation. Trying to locate the regions of high mesh resolution (“fine mesh”) to capture critical parts of the solution: shocks, boundary layers and similar
• Quality of the mesh is critical for a good solution and is not measured only in mesh resolution
• Mesh quality measures depend on the discretisation method
– Cell aspect ratio
– Non-orthogonality
– Skewness
– Cell distortion from ideal shape
– . . . etc.
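Two of the quality measures listed above can be written down in a few lines. The sketch assumes one common finite-volume definition of non-orthogonality, the angle between the face normal and the line joining the two neighbouring cell centres, and an aspect ratio based on the bounding dimensions of a cell. Exact definitions vary between codes, so both functions are illustrative rather than a reference implementation.

```python
import math


def non_orthogonality_deg(centre_p, centre_n, face_normal):
    """Angle (degrees) between the face normal and the centre-to-centre vector."""
    d = [n - p for p, n in zip(centre_p, centre_n)]
    dot = sum(di * si for di, si in zip(d, face_normal))
    norm_d = math.sqrt(sum(di * di for di in d))
    norm_s = math.sqrt(sum(si * si for si in face_normal))
    # Clamp to guard against round-off pushing the cosine outside [-1, 1]
    cos_angle = max(-1.0, min(1.0, dot / (norm_d * norm_s)))
    return math.degrees(math.acos(cos_angle))


def aspect_ratio(dx, dy, dz):
    """Ratio of the longest to the shortest cell dimension."""
    return max(dx, dy, dz) / min(dx, dy, dz)


# Orthogonal pair: the centre-to-centre line is aligned with the face normal
orthogonal = non_orthogonality_deg((0, 0, 0), (1, 0, 0), (1, 0, 0))
# Sheared pair: the neighbour centre is offset along y
sheared = non_orthogonality_deg((0, 0, 0), (1, 1, 0), (1, 0, 0))
# Flat boundary-layer cell
flat_cell = aspect_ratio(1.0e-2, 1.0e-5, 1.0e-2)
```

The orthogonal pair gives zero and the sheared pair 45 degrees; a flat near-wall cell shows how large aspect ratios arise naturally in boundary layers.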

4.3 Mesh Structure and Organisation

Influence of Mesh Structure
• Some numerical solution techniques require specific mesh types. Example: Cartesian meshes for high-order finite difference method
• Supported mesh structure may severely limit the use of a chosen discretisation method
• With mesh generation as a bottle-neck, it makes sense to generalise the solver to be extremely flexible on the meshing side, simplifying the most difficult part of the simulation process

Cartesian Mesh
• x − y − z mesh aligned with the coordinate system. May be defined by 2 points and resolution in 3 directions
• Mesh addressing (cells to neighbour cells, cells to points, points to neighbour points etc.) can be calculated on the fly given the mesh dimension
• Simple to define, efficient and can be used with any type of discretisation
• Severe limitation on the geometry that can be handled: a box within a box
• Extensions may include blocked-out cells or staircase boundaries
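Calculating the addressing on the fly amounts to index arithmetic. The sketch below assumes cells are numbered with the x index running fastest; the function names and the ordering convention are illustrative choices, not a prescribed layout.

```python
def cell_index(i, j, k, nx, ny):
    """Flat cell label from (i, j, k), with i running fastest."""
    return i + nx * (j + ny * k)


def cell_ijk(index, nx, ny):
    """Inverse of cell_index: recover (i, j, k) from the flat label."""
    k, remainder = divmod(index, nx * ny)
    j, i = divmod(remainder, nx)
    return i, j, k


def face_neighbours(index, nx, ny, nz):
    """Labels of the cells sharing a face with the given cell."""
    i, j, k = cell_ijk(index, nx, ny)
    neighbours = []
    for di, dj, dk in ((-1, 0, 0), (1, 0, 0), (0, -1, 0),
                       (0, 1, 0), (0, 0, -1), (0, 0, 1)):
        ii, jj, kk = i + di, j + dj, k + dk
        if 0 <= ii < nx and 0 <= jj < ny and 0 <= kk < nz:
            neighbours.append(cell_index(ii, jj, kk, nx, ny))
    return neighbours


nx, ny, nz = 4, 3, 2
label = cell_index(1, 2, 1, nx, ny)
corner_neighbours = face_neighbours(0, nx, ny, nz)
```

No connectivity is stored anywhere: the three mesh dimensions are enough to answer every addressing query, which is the source of both the efficiency and the geometric inflexibility of Cartesian meshes.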

Structured Body-Fitted Mesh
• Body-fitted meshes originate from the non-orthogonal curvilinear coordinate system approach. The case-specific coordinate system is created to fit the boundary



• The mesh is hexahedral and regularly connected. Real geometry can be captured but with insufficient control over local mesh resolution • The use of contravariant coordinates for the solution vectors was quickly abandoned

Multi-Block Mesh
• Mesh created as a combination of multiple body-fitted blocks. All blocks and cells are still hexahedral
• In FVM, special coding is done on block interfaces, where the mesh connectivity cannot be implicitly established
• Much more control over mesh grading and local resolution. However, mesh generation in 3-D for relatively complex shapes is still hard and time-consuming: meshes need to match



Unstructured Shape-Consistent Mesh
• At this stage, all meshes are hand-built. A complex 3-D mesh could take 2-3 months to construct
• Block connectivity above introduces the concept of storing mesh connectivity rather than calculating it: unstructured mesh
• Loose definition of connectivity allows more freedom: hexahedral and degenerate hexahedral cells (prisms, pyramids, wedges etc.) allow easier meshing



• From the numerical simulation point of view, this is a major step forward. Geometries of industrial interest can now be tackled with a detailed description, which satisfies the design engineer • At this stage, numerical simulation in an industrial setting really takes off. Handling airfoils and single wing or even wing-fuselage assembly is not too difficult. Hand-built meshes for a complete aircraft are still quite difficult

Tetrahedral and Hybrid Tet-Hex Meshes
• Tetrahedral meshes are not good from the numerics point of view
• . . . but they could be generated automatically!
• If a solver can support tetrahedral meshes, mesh generation time for complex geometry reduces from weeks to hours.
• Great saving in mesh generation effort, faster turn-around of simulations and geometrical variation; mesh sensitivity studies can be performed on realistic geometries
• Tetrahedra are particularly poor in boundary layers close to walls. A hybrid mesh is built by creating a layered hexahedral mesh next to the wall. The rest of the domain is filled with tetrahedra. A combined tet-hex mesh is a great improvement in quality
• On the negative side, cell count for a tetrahedral mesh of equivalent resolution is higher than for hexahedra. A part of the price is paid in lower accuracy of the solver on tetrahedra: limited neighbourhood connectivity.

• Tetrahedral mesh generation techniques
– Advancing front method: starting from the boundary triangulation, insert tetrahedra from the live front using priority lists
– Delaunay triangulation: point insertion and re-triangulation. The initial mesh is created by triangulating the boundary. New points are added in a way which improves the quality of the most distorted triangles and creates a convex hull around each point
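The re-triangulation step of a Delaunay method rests on one geometric test: a triangle is acceptable only if no other point lies inside its circumcircle. A minimal 2-D sketch of that predicate follows. It uses plain floating-point arithmetic, whereas production mesh generators typically rely on robust or exact predicates; the function names are illustrative.

```python
def orientation(a, b, c):
    """Twice the signed area of triangle (a, b, c); positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circumcircle of triangle (a, b, c)."""
    rows = []
    for p in (a, b, c):
        dx, dy = p[0] - d[0], p[1] - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    if orientation(a, b, c) < 0:  # make the test independent of vertex order
        det = -det
    return det > 0


triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
inside = in_circumcircle(*triangle, (0.5, 0.5))   # circumcentre of the triangle
outside = in_circumcircle(*triangle, (2.0, 2.0))
```

When a newly inserted point fails this test for existing triangles, those triangles are removed and the cavity is re-triangulated around the new point.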

Overset and Chimera Meshes
• Used for cases where a simple solver is used for complex cases or parts of geometry move relative to each other
• Each part is meshed in a simple manner and over-set on a background mesh. In regions of overlap, special discretisation practices couple the solution
• Chimera approach is numerically problematic: issues of coupling, conservation and accuracy in overlap regions.



Polyhedral Mesh Support
• In spite of automatic generation techniques, tetrahedral meshes are not of sufficient quality for industrial use. On the other hand, automatic hexahedral mesh generation has proven to be extremely challenging
• Finite Volume discretisation is not actually dependent on the cell shape: unlike FEM, there are no pre-defined shape functions and transformation tensors. This brings the possibility of polyhedral mesh support
• Finite Volume discretisation algorithm is reformulated into loops over cells and faces (still doing the same job)
• Polyhedral meshes are considerably better than tetrahedra, can be manipulated to be predominantly hexahedral, orthogonal and regular and can be created automatically
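A loop over faces makes the independence from cell shape concrete: each face only needs to know the two cells it separates. The sketch below assumes a face-based addressing with an owner cell for every face and a neighbour cell for internal faces, internal faces listed first, and face fluxes positive when leaving the owner. These conventions and names are assumptions for illustration.

```python
def net_flux_per_cell(n_cells, owner, neighbour, face_flux):
    """Sum face fluxes into cells with a single loop over faces.

    Faces 0 .. len(neighbour)-1 are internal; the remaining faces are boundary
    faces and have an owner only. The number of faces per cell is never needed.
    """
    net = [0.0] * n_cells
    for face, flux in enumerate(face_flux):
        net[owner[face]] += flux            # flux leaves the owner
        if face < len(neighbour):
            net[neighbour[face]] -= flux    # and enters the neighbour
    return net


# Four cells in a row: three internal faces, then inlet and outlet faces
owner = [0, 1, 2, 0, 3]
neighbour = [1, 2, 3]

uniform = net_flux_per_cell(4, owner, neighbour, [2.0, 2.0, 2.0, -2.0, 2.0])
expanding = net_flux_per_cell(4, owner, neighbour, [1.0, 2.0, 3.0, 0.0, 4.0])
```

The same loop works unchanged for tetrahedra, hexahedra or general polyhedra. Because every internal flux is added once and subtracted once, the sum over all cells equals the sum of boundary fluxes, which is the discrete statement of conservation.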


4.4 Manual Meshing: Airfoils

Mesh Structure for 2-D Airfoils
• Manual meshing of airfoil profiles really belongs to the past; it is still indicative to show how mesh handling governs the use of CFD
• O-mesh: NACA0012 example
• C-mesh: NACA32012 example, prettier in raeProfile
• H-mesh



• Hybrid mesh structure: triangular mesh with prismatic layers: twoElement
• Adapting to the geometry: transfinite mapping techniques
• Adapting to the solution: shock capturing with r-refinement
• Meshing multi-element airfoil configurations

Mesh Generation by Partial Differential Equation
• Transfinite mapping operation can be viewed as a solution of the Laplace equation. Thus, a mesh can be created by solving an equation
• Mesh grading can be controlled by sizing functions: Laplace equation with variable coefficients
• An equivalent formulation exists for controlling mesh orthogonality
• This approach to mesh generation is useful in parametric studies, where a large number of similar geometries needs to be simulated. An initial template mesh is built and adjusted to the correct shape
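Creating a mesh by solving an equation can be sketched in its simplest form: with boundary nodes held fixed, the interior node coordinates of a structured grid are relaxed until each satisfies a discrete Laplace equation in index space. The constant-coefficient form, the Gauss-Seidel sweeps and all names below are assumptions for illustration; practical elliptic generators add the variable coefficients (sizing functions) and orthogonality control mentioned above.

```python
def laplace_smooth(x, y, sweeps=500):
    """Relax interior nodes of a structured grid towards the Laplace solution.

    x[i][j], y[i][j] hold node coordinates; boundary nodes are left untouched.
    Each sweep replaces an interior node by the average of its four neighbours.
    """
    ni, nj = len(x), len(x[0])
    for _ in range(sweeps):
        for i in range(1, ni - 1):
            for j in range(1, nj - 1):
                x[i][j] = 0.25 * (x[i - 1][j] + x[i + 1][j]
                                  + x[i][j - 1] + x[i][j + 1])
                y[i][j] = 0.25 * (y[i - 1][j] + y[i + 1][j]
                                  + y[i][j - 1] + y[i][j + 1])
    return x, y


# Unit square with uniformly spaced boundary nodes and two displaced interior nodes
n = 5
x = [[i / (n - 1) for _ in range(n)] for i in range(n)]
y = [[j / (n - 1) for j in range(n)] for _ in range(n)]
x[2][2] += 0.2
y[1][3] -= 0.15

laplace_smooth(x, y)
max_dev = max(
    max(abs(x[i][j] - i / (n - 1)), abs(y[i][j] - j / (n - 1)))
    for i in range(n) for j in range(n)
)
```

For this boundary the smooth solution is the uniform grid, so `max_dev` falls to round-off level. Moving boundary nodes to a new shape and re-running the relaxation is the template-mesh adjustment used in parametric studies.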


Polyhedral Mesh Generation
• Tessellated mesh
– The Delaunay triangulation algorithm introduces points on proximity rules. During the creation of the mesh, a dual mesh of convex polyhedra is created and can be extracted by a post-processing operation
– Interaction of the tessellated mesh with the boundary needs to be recovered after polyhedral mesh assembly
– Local control of mesh size is achieved in the same way as in tetrahedral meshes





[Figure: tessellated mesh; label: Voronoi vertex]





• Cut hexahedral and cut polyhedral mesh
– Most of mesh generation is straightforward: filling space with non-overlapping cells. Even close to boundaries, it is easy to build a high quality layered structure
– Problematic parts of mesh generation are related to interaction of advancing generation surfaces or boundary interaction in complex corners or regions where the mesh resolution does not match the level of detail on the boundary description.
– Cut cell technology creates a rough background mesh, either uniform hexahedral or capturing major features of the geometry. The mesh inside of the domain is kept and the one interacting with the boundary surface is adjusted or cut by the surface
– In some cases, the background mesh resolution can be automatically adjusted around the surface to match the local resolution requirements
– Meshes are good quality and can be generated rapidly. Prismatic boundary layers may also be added. In some cases, background mesh adjustment or concave cell corrections are required.

Examples
• 3rd AIAA CFD Drag Prediction Workshop


4.5 Adaptive Mesh Refinement

From the above examples it can be seen how the structure and quality of the mesh influences the solution. In first approximation, the number and distribution of computational points determines our picture of the solution even in the absence of computational errors. In places where the solution varies rapidly or complex physical processes occur, it is advisable to locally increase the density of computational points.

Putting the resolution requirement on a firmer basis, one may note that every discretisation method aimed at continuum mechanics postulates a local variation of the solution between the computational points. The largest source of discretisation error is the discrepancy between the postulated and actual field variation. Grouping computational points closer together reduces the difference between the prescribed and actual variation in the solution, reducing the discretisation error.

Mesh Resolution
• Mesh structure specifies where the computational points are located. Discretisation practice postulates the shape of solution between the computational points, which is the main source of discretisation error
• A sensible meshing strategy requires high resolution in regions of interest instead of uniformly distributing points in the domain. This implies some knowledge of the solution during mesh generation.
• The same can be achieved in an iterative way
1. Create initial mesh and initial solution
2. Examine the solution from the point of view of accuracy or resolution in “regions of interest”
3. Based on the available solution, adjust mesh resolution in order to improve the solution in the selected parts of the domain
4. Repeat until sufficient accuracy is achieved or computer resources are exhausted
• Performing mesh improvement by hand is tedious and time-consuming. For an automatic procedure, two questions need to be answered:
– Where to refine the mesh (adjust resolution)?
– How to change the mesh to achieve the required accuracy?

Types of Mesh Refinement
• Global refinement: mesh sensitivity studies
• h-refinement: introducing new computational points in regions of interest
• r-refinement: re-organise the existing points such that more points fall into the region of interest



• p-refinement: enriching the space of shape functions in order to capture the solution more closely
• Mesh refinement cannot be done indiscriminately: locally refined meshes typically introduce increased mesh-induced errors as well. The trick is to locate the regions of poor mesh away from the regions of interest

Error- or Indicator-Driven Adaptivity
• In strongly shocked flows, it is relatively easy to identify regions of interest: shocks, boundary layers, contact discontinuities. In more complex situations or in the presence of flow features of different strength, this is much more difficult. Mesh-induced discretisation errors (poor mesh quality or insufficient resolution) also need to be taken into account.
• A region of interest can usually be recognised by high gradients: rapidly varying solution
• Error indicators: highlight regions of interest. Example: magnitude of the second pressure gradient, Mach number distribution etc.
• Error estimates: apart from the spatial information (error distribution), they provide guidance on the absolute error level

Adjusting to Original Boundary Shape
• Traditionally, mesh adaptation was a part of the CFD solver instead of the mesh generator. In cases where the refinement algorithm resorts to cell splitting, we may end up with a faceted surface representation instead of a smooth surface, which compromises the results.
• Solution: the geometrical description of the boundary needs to be available from the solver instead of trying to recover the data from the original (coarse) mesh
• A further step is related to the specification of boundary conditions. In, for example, wind tunnel simulations, the velocity and turbulence at the inlet plane are known from measured data and interpolated onto the inlet patch of the mesh. Ideally, the boundary condition should be associated with space or with the boundary description, avoiding problems with interpolation. This leads to issues of CAD integration, which are beyond our scope

Examples of Automatic Meshing and Adaptivity
• Supersonic flow, h-refinement
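The iterative procedure described in this section (solve, examine, refine, repeat) can be sketched in one dimension. This is a minimal illustration, not the algorithm of any particular solver: the "solution" is a stand-in analytic front, the indicator (solution jump across a cell) and the 30 % refinement fraction are assumptions chosen for the example, and the function names are hypothetical.

```python
import numpy as np

def front(x):
    """Stand-in 'solution' with a steep front at x = 0.5 (no flow solver involved)."""
    return np.tanh(50.0 * (x - 0.5))

def refine_once(x, solution, fraction=0.3):
    """h-refinement: split the cells whose indicator lies in the top `fraction`."""
    phi = solution(x)
    # Assumed indicator: magnitude of the solution jump across each cell
    indicator = np.abs(np.diff(phi))
    threshold = np.quantile(indicator, 1.0 - fraction)
    flagged = indicator >= threshold
    # New computational points are placed at the midpoints of flagged cells
    midpoints = 0.5 * (x[:-1] + x[1:])[flagged]
    return np.sort(np.concatenate([x, midpoints]))

x = np.linspace(0.0, 1.0, 11)      # initial uniform mesh
for _ in range(4):                 # repeat: examine the solution, adjust resolution
    x = refine_once(x, front)

spacing = np.diff(x)
smallest_cell_location = x[np.argmin(spacing)]
```

After a few passes the smallest cells cluster around the front while the mesh far from it keeps its original spacing, which is the behaviour the text asks of a sensible meshing strategy.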

4.6 Dynamic Mesh Handling











Many relevant simulations in continuum mechanics involve cases where the shape of the computational domain changes during the simulation, either in a manner prescribed up front or as a function of the solution. As we will show later, handling such cases generalises the discretisation practice to some form of “Arbitrary Lagrangian-Eulerian” practice, combining the view from the Lagrangian and Eulerian reference frames. This is usually termed dynamic mesh handling and comes in a number of different guises. From the point of view of mesh handling, we can recognise two distinct situations:
• Mesh deformation, where the structure and connectivity of the mesh remain unchanged, but the position of points supporting its shape changes. Mesh deformation is characterised by the fact that the number of points, faces, cells and boundary faces remains constant, as does the connectivity between the shapes;
• In a topologically changing mesh, the number of points, faces and cells or their connectivity varies during the simulation.
It will be shown that standard discretisation methods handle cases of mesh deformation without loss of accuracy, while topological changes may (depending on the algorithm) involve solution re-mapping, with associated interpolation or data redistribution errors. Thus, mesh deformation is usually preferred, unless it implies excessive mesh-induced discretisation errors. Typical examples of dynamic mesh handling in aerospace applications include moving flap and slat simulation, aircraft landing, bomb or missile release (opening of the ordnance bay), multi-stage turbomachinery simulations with rotor-stator interaction etc.

Moving Deforming Mesh
• There exist cases where the shape of the domain varies during the calculation. Boundary motion may be prescribed in advance as a part of the case setup or be a part of the solution itself
• The internal mesh influences mainly the discretisation error: it is the external shape of the domain which carries the major influence. A moving deforming mesh algorithm will allow the domain to change its shape during the simulation and preserve its validity
• Shape changes are performed by point motion: the connectivity and structure of the mesh remain unchanged

Topological Mesh Changes
• In cases of extreme shape change, a moving deforming mesh is not sufficiently flexible: deforming the mesh to accommodate extreme boundary deformation would introduce high discretisation errors
• Mesh motion can be accommodated by adding or removing computational cells to accommodate the boundary deformation. This is associated with higher discretisation errors and complications in the algorithm, but is sometimes essential
• Common types of topological changes:
– Attach/detach boundary
– Cell layer addition/removal
– Sliding interface



• Typically, a combination of several topological changes will be used together to achieve more complex mesh changes
• Example: in-cylinder simulations in internal combustion engines
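Mesh deformation in its simplest form can be illustrated on a one-dimensional mesh. The sketch below assumes a hypothetical motion law in which the displacement of the right boundary is distributed linearly over the internal points; real mesh motion solvers use more elaborate strategies. It only demonstrates the defining property stated above: points move, while the number of points and the connectivity stay fixed.

```python
import numpy as np

def deform_mesh(points, boundary_displacement):
    """Move points so that the right boundary shifts; connectivity is untouched."""
    x0, x1 = points[0], points[-1]
    # Assumed motion law: displacement grows linearly from 0 (left) to full (right)
    weight = (points - x0) / (x1 - x0)
    return points + weight * boundary_displacement

points = np.linspace(0.0, 1.0, 6)                       # 6 points, 5 cells
cells = [(i, i + 1) for i in range(len(points) - 1)]    # connectivity as point pairs

moved = deform_mesh(points, boundary_displacement=-0.4)  # right boundary moves inwards
volumes = np.diff(moved)                                 # cell sizes after the motion
```

The list `cells` is never modified: the same point pairs describe the mesh before and after the motion, and all cell sizes stay positive as long as the boundary displacement does not collapse the domain.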



Chapter 5 Transport Equation in the Standard Form
5.1 Introduction

The importance of a scalar transport equation in the standard form lies in the fact that it contains typical forms of rate-of-change, transport and volume source/sink terms present in continuum governing laws. These include convective transport, based on the convective velocity field, gradient-driven diffusive transport, rate-of-change terms and localised volume sources and sinks. Understanding the behaviour of various terms and their interaction will help the reader comprehend even the most complex physical models. Governing equations of physical interest regularly take the form of the scalar transport equation. The derivation and modelling rationale is straightforward: the rate of change and convection terms follow directly from the Reynolds Transport Theorem, while the diffusive transport is the simplest gradient-based model of surface sources and sinks. A good example of generalisation of the scalar transport equation is the density-based compressible flow solver, often written as a “scalar” transport of a composite variable [ρ, ρu, ρE]. In what follows, we will offer a brief overview of the background and derivation of the scalar transport equation, its initial and boundary conditions and various often-encountered generalisations.


Scalar Transport Equation in the Standard Form

Background
• Scalar transport equation in the standard form will be our model for discretisation. Conservation laws governing continuum mechanics adhere to the standard form, which makes it a good example
• Standard form is not the only one available: modelled equations may be more complex or some source/sink terms can be recognised as transport. This leads to other forms, but the basics are still the same
• Moving away from physics, almost identical equations can be found in other areas: for example financial modelling
• The common factor for all equations under consideration is the same set of operators: temporal derivative, gradient, divergence, Laplacian, curl, as well as various source and sink terms

Nomenclature
• Scalar, vector, tensor represent a property in a point. In the equations under consideration, we will need tensors only up to second order
– Scalars in lowercase: $a$
– Vectors in bold: $\mathbf{a} = a_i$
– Tensors in bold capitals: $\mathbf{A} = A_{ij}$
• All vectors will be written in the global Cartesian coordinate system and in 3-D space
• Inner and outer product of vectors and tensors. Vector notation will be used; feel free to shadow in the Einstein notation in the notes and I will help
– Scalar product (scalar times vector): $a\mathbf{b} = a\,b_i$
– Inner vector product, producing a scalar: $\mathbf{a}\cdot\mathbf{b} = a_i b_i$
– Outer vector product, producing a second rank tensor: $\mathbf{a}\mathbf{b} = a_i b_j$
– Inner product of a vector and a tensor (mind the index)
∗ product from the left: $\mathbf{a}\cdot\mathbf{C} = a_i C_{ij}$
∗ product from the right: $\mathbf{C}\cdot\mathbf{a} = a_j C_{ij}$
• Field algebra
– Continuum mechanics deals with field variables: according to the continuity assumption, a variable (e.g. pressure) is defined in each point in space for each moment in time
– I will use φ as a name for the generic variable



– From the field definition $\phi = \phi(\mathbf{x}, t)$, which means that we can define the spatial and temporal derivative
• Divergence and gradient
– For convenience, we need to define the gradient operator ∇ to extract the spatial component of the derivative as a vector. Formally this would be
$$ \nabla = \frac{\partial}{\partial \mathbf{x}} = \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \tag{5.1} $$
Thus, for a scalar φ, ∇φ is a vector
$$ \nabla\phi = \frac{\partial \phi}{\partial \mathbf{x}} \tag{5.2} $$
• If we imagine φ defined in a 2-D space as a 2-D surface, for each point the gradient vector points in the direction of the steepest ascent, i.e. up the slope
• For vector and tensor fields, we define the inner and outer product with the gradient operator. Please pay attention to the definition of the gradient: multiplication from the left!
• Gradient operator for a vector u creates a second rank tensor
$$ \nabla\mathbf{u} = \frac{\partial}{\partial x_i}\,u_j = \frac{\partial u_j}{\partial x_i} \tag{5.3} $$
• Divergence operator for a vector u creates a scalar
$$ \nabla\cdot\mathbf{u} = \frac{\partial u_i}{\partial x_i} \tag{5.4} $$
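The index conventions of the nomenclature can be checked numerically. The sketch below uses NumPy's `einsum`, whose subscript strings mirror the Einstein notation; the sample values are arbitrary and the tensor is deliberately non-symmetric so that the products from the left and from the right differ.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
C = np.arange(9.0).reshape(3, 3)       # C_ij, deliberately non-symmetric

inner = np.einsum("i,i->", a, b)       # a.b = a_i b_i   (scalar)
outer = np.einsum("i,j->ij", a, b)     # ab  = a_i b_j   (second rank tensor)
left = np.einsum("i,ij->j", a, C)      # a.C = a_i C_ij  (sum over the first index)
right = np.einsum("j,ij->i", a, C)     # C.a = a_j C_ij  (sum over the second index)
```

For a symmetric tensor `left` and `right` would coincide, which is why the warning to "mind the index" only bites for general tensors such as the velocity gradient.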


Reynolds Transport Theorem

Reynolds Transport Theorem is a mathematical derivation of the relationship between the Lagrangian and Eulerian analysis framework. It is essential to recognise that it involves no simplifications or modelling but it establishes the basis for the Eulerian view of the continuum. It is sometimes tempting to dwell on the interaction of Lagrangian particles and look at generalising their behaviour to the continuum level: after all, matter itself is composed of discrete particles rather than “field variables”. In fact, classical physics has already covered this in the kinetic theory of gases, where the continuum behaviour and transition scales are established from basic principles. However, scales of engineering interest today are sufficiently removed from the mean free path to warrant the use of continuum mechanics in most engineering disciplines for some time to come.

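Reynolds Transport Theorem
• Reynolds transport theorem is a first step to assembling the standard transport equation
• Examine a region of space: a Control Volume (CV)

[Figure: control volume with surface element dS, outward unit normal n and outflow through the surface]

The rate of change of a general property φ in the system is equal to the rate of change of φ in the control volume plus the rate of net outflow of φ through the surface of the control volume. Mathematically:
$$ \frac{d}{dt}\int_{V_m} \phi \, dV = \int_{V_m} \frac{\partial \phi}{\partial t}\, dV + \oint_{S_m} \phi\,(\mathbf{n}\cdot\mathbf{u})\, dS \tag{5.5} $$
$$ \frac{d}{dt}\int_{V_m} \phi \, dV = \int_{V_m} \left[ \frac{\partial \phi}{\partial t} + \nabla\cdot(\phi\mathbf{u}) \right] dV \tag{5.6} $$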

• Here u represents the convective velocity: flux going in is negative (u•n < 0). The convective velocity in general terms can be considered as a coordinate transformation.
• u is also a function of space and time: our coordinate transformation is not trivial. Examples: “solid body motion”, solid rotation, cases where u is not divergence-free

Sources and Sinks
• Apart from convection (above), we can have local sources and sinks of φ.
• Volume source: distributed through the volume, e.g. gravity
• Surface source: acts on the external surface S, e.g. heating. Typically modelled using gradient-based models

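[Figure: control volume V with volume source Qv, surface source qs, inflow, outflow, surface element dS and outward unit normal n]

$$ \frac{d}{dt}\int_{V_m} \phi \, dV = \int_{V_m} q_v \, dV - \oint_{S_m} (\mathbf{n}\cdot\mathbf{q}_s)\, dS \tag{5.7} $$
$$ \frac{\partial \phi}{\partial t} + \nabla\cdot(\phi\mathbf{u}) = q_v - \nabla\cdot\mathbf{q}_s \tag{5.8} $$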
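The step from the integral balance to the differential form is not spelled out in the notes. A sketch of the missing reasoning, assuming the fields are smooth enough for the Gauss (divergence) theorem to apply: replace the left-hand side of the source balance by the volume-integral form of the Reynolds transport theorem and convert the surface source into a volume integral.

```latex
\begin{align*}
\int_{V_m} \left[ \frac{\partial \phi}{\partial t} + \nabla\cdot(\phi\mathbf{u}) \right] dV
  &= \int_{V_m} q_v \, dV - \oint_{S_m} (\mathbf{n}\cdot\mathbf{q}_s)\, dS \\
  &= \int_{V_m} \left( q_v - \nabla\cdot\mathbf{q}_s \right) dV
\end{align*}
```

Since the volume is arbitrary, the two integrands must be equal at every point, which gives the differential form $\partial\phi/\partial t + \nabla\cdot(\phi\mathbf{u}) = q_v - \nabla\cdot\mathbf{q}_s$.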



Diffusive Transport

Gradient-based transport plays a very different role from the Reynolds Transport Theorem terms derived above. One should keep in mind that diffusion is a physical model for the behaviour of surface terms rather than a result of direct mathematical manipulation. However, its generality and special mathematical properties are much deeper. Gradient-based transport is observed regularly in many physical phenomena, from conductive heat transfer to equilibration of species concentration. It can be seen as the effect of molecular dynamics on the macro-scale, in the presence of sufficient scale separation.

Diffusive Transport
• Gradient-based transport is a model for surface source/sink terms
• Consider a case where φ is a concentration of a scalar variable and a closed domain. Diffusion transport says that φ will be transported from regions of high concentration to regions of low concentration until the concentration is uniform everywhere.
• Taking into account that ∇φ points up the concentration slope, and the transport will be in the opposite direction, we can define the following diffusion model
$$ \mathbf{q}_s = -\gamma \nabla\phi, \tag{5.9} $$
where γ is the diffusivity.
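The closed-domain thought experiment above can be reproduced with a few lines of code. The sketch below is an assumption-laden illustration rather than the discretisation developed later in the notes: a uniform 1-D mesh, explicit time-stepping with a time-step chosen inside the usual explicit stability limit, and zero flux on both boundaries to represent the closed domain.

```python
import numpy as np

def diffuse(phi, gamma, dx, dt, steps):
    """Explicit 1-D diffusion in a closed domain (zero flux at both ends)."""
    phi = phi.copy()
    for _ in range(steps):
        # Face flux q_s = -gamma * d(phi)/dx on internal faces; boundary faces carry none
        flux = np.zeros(phi.size + 1)
        flux[1:-1] = -gamma * np.diff(phi) / dx
        # Cell balance: d(phi)/dt = -(flux_out - flux_in) / dx
        phi -= dt * np.diff(flux) / dx
    return phi

n, gamma = 20, 1.0
dx = 1.0 / n
dt = 0.4 * dx**2 / gamma       # inside the explicit limit dt <= dx^2 / (2 gamma)
phi0 = np.zeros(n)
phi0[: n // 2] = 1.0           # high concentration in the left half of the domain

phi = diffuse(phi0, gamma, dx, dt, steps=5000)
```

The concentration flows down the gradient until it is uniform, and because every face flux leaves one cell and enters its neighbour, the total amount of φ in the domain does not change.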

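Generic Transport Equation
• Assembling the above yields the transport equation in the standard form
$$ \underbrace{\frac{\partial \phi}{\partial t}}_{\text{temporal derivative}} + \underbrace{\nabla\cdot(\phi\mathbf{u})}_{\text{convection term}} - \underbrace{\nabla\cdot(\gamma\nabla\phi)}_{\text{diffusion term}} = \underbrace{q_v}_{\text{source term}} \tag{5.10} $$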

• Temporal derivative represents inertia of the system
• Convection term represents the convective transport by the prescribed velocity field (coordinate transformation). The term is hyperbolic in nature: information comes from the vicinity, defined by the direction of the convection velocity
• Diffusion term represents gradient transport. This is an elliptic term: every point in the domain feels the influence of every other point instantaneously
• Sources and sinks account for non-transport effects: local volume production and destruction of φ

Conservation Equations
• As promised, conservation equations in continuum mechanics follow the above form
• Conservation of mass: continuity equation
$$ \frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0 \tag{5.11} $$
• Conservation of linear momentum
$$ \frac{\partial (\rho\mathbf{u})}{\partial t} + \nabla\cdot(\rho\mathbf{u}\mathbf{u}) = \rho\mathbf{g} + \nabla\cdot\boldsymbol{\sigma} \tag{5.12} $$
• Energy conservation equation
$$ \frac{\partial (\rho e)}{\partial t} + \nabla\cdot(\rho e\,\mathbf{u}) = \rho\,\mathbf{g}\cdot\mathbf{u} + \nabla\cdot(\boldsymbol{\sigma}\cdot\mathbf{u}) - \nabla\cdot\mathbf{q} + \rho Q \tag{5.13} $$

5.3 Initial and Boundary Conditions




The role of boundary conditions is to isolate the system under consideration from the external environment. Location and type of boundary conditions depend on our knowledge about flow and physical conditions and their influence on the solution. Boundary conditions can be classified as numerical and physical boundary conditions. Numerical boundary conditions can be considered at the equation level. Main types are the fixed value or Dirichlet condition, zero (Neumann) or fixed gradient condition (flux condition) and a mixed or Robin condition. Physical boundary conditions are related to the model under consideration and involve combinations of individual equations under consideration. This reduces to numerical boundary conditions between various equations, where the fixed value, flux or their combination are updated in a physically meaningful sense. Examples of physical boundary conditions in fluid flows are flow inlets and outlets, walls, symmetry planes and far field conditions.

Boundary Conditions
• The role of boundary conditions is to isolate the system from the rest of the Universe. Without them, we would have to model everything
• Position of boundaries and specified condition requires engineering judgement. Badly placed boundaries will compromise the solution or cause “numerical problems”. Example: locating an outlet boundary across a recirculation zone.
• Incorporating the knowledge of boundary conditions from experimental studies or other sources into a simulation is not trivial: it is not sufficient to pick up some arbitrary data and force it on a simulation. Choices need to be based on physical understanding of the system

Numerical Boundary Conditions
• Dirichlet condition: fixed boundary value of φ
• Neumann: zero gradient or no flux condition: $\mathbf{n}\cdot\mathbf{q}_s = 0$
• Fixed gradient or fixed flux condition: $\mathbf{n}\cdot\mathbf{q}_s = q_b$. Generalisation of the Neumann condition
• Mixed condition: linear combination of the value and gradient condition
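The numerical boundary conditions listed above can be illustrated by the value each of them implies on a boundary face, given the value in the adjacent cell. This is a sketch under stated assumptions: the function name and arguments are hypothetical, the face value is extrapolated linearly from the cell centre, and the blending form used for the mixed condition is one common choice rather than the only possible one. For the diffusion model used in these notes, a prescribed flux $q_b$ corresponds to a normal gradient of $-q_b/\gamma$.

```python
def boundary_value(kind, phi_P, d, value=0.0, gradient=0.0, weight=0.5):
    """Face value of phi on a boundary face, given the adjacent cell value phi_P.

    d is the distance from the cell centre to the face along the outward normal.
    """
    if kind == "dirichlet":         # fixed value
        return value
    if kind == "neumann":           # zero gradient: the face takes the cell value
        return phi_P
    if kind == "fixed_gradient":    # prescribed normal gradient
        return phi_P + d * gradient
    if kind == "mixed":             # assumed blend: weight=1 is Dirichlet, weight=0 is fixed gradient
        return weight * value + (1.0 - weight) * (phi_P + d * gradient)
    raise ValueError(f"unknown boundary condition: {kind}")

phi_P, d = 2.0, 0.1
fixed = boundary_value("dirichlet", phi_P, d, value=5.0)
zero_grad = boundary_value("neumann", phi_P, d)
fixed_grad = boundary_value("fixed_gradient", phi_P, d, gradient=10.0)
blended = boundary_value("mixed", phi_P, d, value=5.0, gradient=10.0, weight=0.5)
```

Note that only the Dirichlet value is independent of `phi_P`; the other conditions depend on the current solution in the cell.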

More Numerical Conditions
• More numerical conditions, related to simplifications in the shape or size of the computational domain. The idea is to limit or decrease the size of the computational domain (saving on the cell count) by using the properties of the solution and boundary conditions
• Symmetry plane. In cases where the geometry and boundary conditions are symmetric and the flow is steady (or the equation is linear in the symmetrical direction), only a section of the problem may be modelled. The simplification will not work if the expected flow pattern is not symmetric as well: manoeuvring aircraft, cross-wind etc.

• Cyclic and periodic conditions. In cases of repeating geometry (e.g. tube bundle heat exchangers) or fully developed conditions, the size of domain can be reduced by modelling only a representative segment of the geometry. In order to account for periodicity, a “self-coupled” condition can be set up on the boundary. In special cases, a jump condition can be specified for variables that do not exhibit cyclic behaviour. Example: pressure in fully developed channel flow

[Figure: reduced computational domains using symmetry planes; labels: symmetry plane, wall, developed inlet profile, dimensions H and 0.5D]

• Implicit implementation of the condition (depending on the current value) improves the numerical properties of the condition
• A more general (re-mapping) form of the condition can also be specified, but not in the implicit form

Physical Boundary Conditions
• Currently, we are dealing with a passive transport of a scalar variable: physical meaning of the boundary condition is trivial
• In case of coupled equation sets or a clear physical meaning, it is useful to associate physically meaningful names to the sets of boundary conditions for individual equations. Examples
– Subsonic velocity inlet: fixed value velocity, zero gradient pressure, fixed temperature
– Supersonic outlet: all variables zero gradient
– Heated wall: fixed value velocity, zero gradient pressure, fixed gradient temperature (fixed heat flux)

Initial Condition
• Boundary conditions are only a part of problem specification. Initial conditions specify the variation of each solution variable in space. In some cases, this may be irrelevant:
– Steady-state simulation result should not depend on the initial condition
– In oscillatory transient cases (e.g. vortex shedding), the initial condition is irrelevant
• . . . but in other simulations it is essential: relaxation problems
• Initial field should in principle satisfy the governing equation and physical bounds. Importance of this will depend on the robustness of the algorithm. Example: initialise the flow simulations using the potential flow solver to satisfy continuity. In practice, robust solvers only care about physical bounds


Physical Bounds in Solution Variables

An important property of physical variables is their natural bounds. Examples here would include kinetic energy, which always remains positive; species concentration, bounded between 0 and 1 (100 %); and many others. Physical bounds may be implied from the nature of the variable, but also from the differential equations governing the system. A good test of understanding of the equation systems involves the analysis of boundedness from the source and sink terms and their interaction.

Enforcing Physical Bounds
• When transport equations are assembled, they represent real physical properties. A set of equations under consideration relies on the fact that physical variables obey certain bounds: if the bounds are violated, the system exhibits unrealistic behaviour
• Examples of variables with physical bounds
– Negative density value: −3 kg/m³
– Negative absolute temperature
– Negative kinetic energy (or turbulent kinetic energy)
– Concentration value below zero or above one: two-phase flow, using a scalar concentration φ to indicate the presence of fluid A
$$ \phi = 1.05, \quad \rho_1 = 1\ \mathrm{kg/m^3}, \quad \rho_2 = 1000\ \mathrm{kg/m^3} $$
$$ \rho = \phi\,\rho_1 + (1-\phi)\,\rho_2 = 1.05 \cdot 1 + (1 - 1.05)\cdot 1000 = 1.05 - 50 = -48.95\ \mathrm{kg/m^3} $$
• Physical bounds on solution variables are easily established. However, our task is not only to recognise this in the original equations but to enforce it during the iterative solution process. If at any stage we obtain a locally negative density, the convergence of the iterative algorithm will be disrupted: this is not trivial
• For vector and tensor variables, the physical bounds are not as straightforward and may be more difficult to enforce
• Diffusion coefficient and stability. An example of how the iterative process breaks down is a case of negative diffusion introducing positive feed-back in the system. The diffusion model, $\mathbf{q}_s = -\gamma\nabla\phi$, assumes a positive value of γ: the gradient transport will act to decrease the maximum value of φ in the domain and tend towards the uniform distribution. For negative γ, the process is reversed and φ is accumulated at the location of highest φ, which tends to infinity in an unstable manner. If you encounter cases where γ is genuinely negative (e.g. financial modelling equations), there is still a way to solve them: marching in time backwards!
• Bounding source and sink terms. Looking at a scalar variable with bounds, e.g. 0 ≤ φ ≤ 1, governed by a generic transport equation, a sanity check can be performed on the volumetric source term: as φ approaches its bounds, the value of $q_v$ must tend to zero. This is how the form of the differential equation preserves the sanity of the variable; the same property needs to be achieved in the discretised form of the equation
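The two-phase density example can be reproduced directly. The sketch below evaluates the mixture density for a slightly overshooting concentration and for the same value clipped back to its bounds; the function names are hypothetical, and clipping is shown only to make the point about bounds, not as a recommended (or conservative) remedy.

```python
def mixture_density(phi, rho1=1.0, rho2=1000.0):
    """rho = phi*rho1 + (1 - phi)*rho2, with phi the concentration of fluid A."""
    return phi * rho1 + (1.0 - phi) * rho2

def clip_to_bounds(phi, lower=0.0, upper=1.0):
    """Crude a-posteriori bounding, for illustration only."""
    return min(max(phi, lower), upper)

overshoot = 1.05                                          # 5 % above the physical bound
rho_unbounded = mixture_density(overshoot)                # about -48.95 kg/m^3
rho_bounded = mixture_density(clip_to_bounds(overshoot))  # 1.0 kg/m^3
```

A 5 % overshoot in φ is enough to make the density negative because of the large density ratio between the two fluids.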

Examples
• Convection-dominated problems
• Diffusion problems
• Negative diffusion coefficient
• Convection-diffusion and Peclet number
• Source and sink terms: preserving the boundedness
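The negative diffusion coefficient case from the list above can be tried out on a small periodic problem. This is a sketch with assumed parameters (mesh size, time-step, number of steps, sinusoidal initial bump); it compares a few explicit steps with positive and negative γ starting from the same field.

```python
import numpy as np

def step(phi, gamma, dx, dt):
    """One explicit step of d(phi)/dt = d/dx(gamma d(phi)/dx) on a periodic 1-D mesh."""
    laplacian = (np.roll(phi, -1) - 2.0 * phi + np.roll(phi, 1)) / dx**2
    return phi + dt * gamma * laplacian

n = 32
dx = 1.0 / n
dt = 0.2 * dx**2
x = (np.arange(n) + 0.5) * dx
bump = 0.5 + 0.1 * np.sin(2.0 * np.pi * x)

smooth, sharp = bump.copy(), bump.copy()
for _ in range(50):
    smooth = step(smooth, +1.0, dx, dt)   # positive gamma: the peak decays
    sharp = step(sharp, -1.0, dx, dt)     # negative gamma: the peak grows
```

With negative γ the shortest wavelengths grow fastest, so running the loop much longer lets round-off noise take over the solution; this is the unstable positive feed-back described in the text.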


Complex Equations: Introducing Non-Linearity

Scalar transport equation in its standard form represents a relatively simple physical system, including convective and diffusive transport and linearised source and sink terms. The equation is sufficiently easy to fathom to provide a number of analytical solutions (e.g. line source in cross flow) but does not capture the richness and complexity of many real-life phenomena. We shall now look at a series of seemingly simple modifications to the form of various terms and their effect. An important property of good discretisation is to enforce physical bounds on all relevant variables not only on convergence but also on intermediate solutions in the iterative process.

Vector and Tensor Transport
• A transport equation for a vector or tensor quantity is very similar to the scalar form: φ becomes d. However, having d as a transported variable allows the introduction of some interesting new terms
– Variable convected by itself: $\nabla\cdot(\mathbf{d}\,\mathbf{d})$
– Laplace transpose: $\nabla\cdot\left[\gamma(\nabla\mathbf{d})^T\right]$
– Divergence (trace): $\lambda\mathbf{I}\,\nabla\cdot\mathbf{d}$
• The tricky terms will introduce non-linearity or inter-component coupling and produce interesting solutions
• For now, we can consider the question of coupling: are the components of the transported vector coupled or decoupled?

Multiple Convection or Diffusion Terms
• Some equations can contain multiple transport terms, sometimes disguised as sources or sinks. Recognising the real nature of the term is critical in its correct numerical treatment
$$ \frac{\partial (\rho b)}{\partial t} + \nabla\cdot(\rho\mathbf{u}\, b) = -\rho S_t\, |\nabla b| \tag{5.14} $$
• Multiple diffusion terms can appear in the same variable or in a different one, e.g. $\nabla\cdot(\gamma\nabla b)$ in the equation for φ. Diffusion terms in the same variable can be combined into a single term; the ones in a different variable require special treatment.

Non-Linear Transport
• The non-linearity in convection, $\nabla\cdot(\mathbf{u}\,\mathbf{u})$, is the most interesting term in the Navier-Stokes equations. The complete wealth of interaction in incompressible flows stems from this term. This includes all turbulent interaction: in nature, this is an inertial effect
• In compressible flows, additional effects related to inter-equation coupling appear: shocks, contact discontinuities.
• Another form of non-linearity introduces the diffusion coefficient γ as a direct or indirect function of the solution: much less interesting

Non-Linear Source and Sink Terms
• As mentioned before, for bounded scalar variables, source and sink terms need to tend to zero as φ approaches its bounds. Therefore, cases where $q_v$ is a function of φ are a rule rather than an exception
• $q_v = q_v(\phi)$ usually leads to the decomposition of the term into a source and a sink. This strictly only makes sense when φ is bounded below by zero and has no upper bound, but it is instructive. The linearisation is only first-order, i.e. $q_u$ and $q_p$ can still depend on φ.
$$ q_v = q_u - q_p\,\phi, \tag{5.15} $$
where both $q_u \ge 0$ and $q_p \ge 0$. This kind of linearisation also follows from numerical considerations and will be re-visited later.
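The numerical motivation for splitting the source into $q_u - q_p\phi$ can be previewed on a single cell with no transport. The sketch below is an assumption-based illustration (first-order Euler time-stepping, hypothetical function names and values): it compares evaluating the sink with the old value of φ against evaluating it with the new value.

```python
def explicit_step(phi, q_u, q_p, dt):
    """Sink evaluated with the old value: phi can be driven below zero."""
    return phi + dt * (q_u - q_p * phi)

def implicit_sink_step(phi, q_u, q_p, dt):
    """Sink evaluated with the new value: phi_new = (phi + dt*q_u) / (1 + dt*q_p)."""
    return (phi + dt * q_u) / (1.0 + dt * q_p)

phi_old, q_u, q_p, dt = 1.0, 0.0, 50.0, 0.1     # strong sink, large time-step

phi_explicit = explicit_step(phi_old, q_u, q_p, dt)        # 1 - 5 = -4
phi_implicit = implicit_sink_step(phi_old, q_u, q_p, dt)   # 1 / 6
```

With $q_u \ge 0$ and $q_p \ge 0$ the implicit form is a ratio of non-negative numbers, so a non-negative φ stays non-negative for any time-step.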




Inter-Equation Coupling

True complexity of physical processes in engineering is rarely seen from single transport equations, be they linear or non-linear. It is the interaction between multiple physical phenomena that represents a true challenge. Consider for example how the simplicity of a mass continuity equation enforces constraints on the momentum transfer in incompressible fluid flow. Adding to that the dependence of material properties on state variables further increases the complexity. Many sets of coupled differential equations stem not only from basic physical principles, but from the need to describe very complex physical systems in simpler terms. Good examples would include combustion and turbulence models. Here, the “complete” physics of interest may or may not be understood, but is too complex to be captured in its entirety. Therefore, a modeller chooses some representative variables (e.g. turbulent length scale, eddy turn-over time, laminar flame speed) and incorporates their interaction in a set of coupled partial differential equations.

Coupled Equations Sets
• Inter-equation coupling introduces additional complexity: a set of physical phenomena which depend on each other.
• Complexity, strength of coupling and non-linearity vary wildly, to the level of inability to handle certain models numerically. The most difficult ones involve separation of scales, where the fastest interaction (e.g. chemical reaction) occurs at time-scales several orders of magnitude faster than the slowest (e.g. turbulent fluid flow)

Example: Two Coupled Scalar Equations
• k − ε model of turbulence:
– k: turbulence kinetic energy
– ε: dissipation of turbulence kinetic energy
– u: velocity. Consider it fixed for the moment
– $C_\mu$, $C_1$, $C_2$: model coefficients.
– k-equation:
$$ \frac{\partial k}{\partial t} + \nabla\cdot(\mathbf{u}\,k) - \nabla\cdot(\mu_t \nabla k) = G - \epsilon, \tag{5.16} $$
where
$$ \mu_t = C_\mu \frac{k^2}{\epsilon} \tag{5.17} $$
and
$$ G = \mu_t \left[\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right] : \nabla\mathbf{u}. \tag{5.18} $$
– ε-equation:
$$ \frac{\partial \epsilon}{\partial t} + \nabla\cdot(\mathbf{u}\,\epsilon) - \nabla\cdot(\mu_t \nabla \epsilon) = C_1\, G\, \frac{\epsilon}{k} - C_2\, \frac{\epsilon^2}{k}, \tag{5.19} $$
• The coupling looks very complex but is benign: the most critical part is the treatment of sink terms to preserve the boundedness during an iterative solution sequence. Example: sink term in the k-equation
$$ \epsilon = \frac{\epsilon^{old}}{k^{old}}\, k^{new} \tag{5.20} $$

Exercise
Examine the boundedness of the above equations for k and ε, given a prescribed velocity field, by analysing various source and sink terms.
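As a starting point for the exercise, the sink treatment can be tried on the k − ε system with all transport terms dropped (a single cell, homogeneous turbulence). This is a sketch under stated assumptions: first-order implicit time-stepping, a hypothetical function name, the commonly quoted coefficient values 1.44 and 1.92, and the same old-over-new trick applied to the sink of the ε-equation, which the notes do not spell out.

```python
def k_epsilon_step(k, eps, G, dt, C1=1.44, C2=1.92):
    """One time-step of the 0-D (no transport) k-epsilon system with implicit sinks."""
    # k-equation sink written as (eps_old / k_old) * k_new
    k_new = (k + dt * G) / (1.0 + dt * eps / k)
    # epsilon-equation sink C2*eps^2/k written as (C2 * eps_old / k_new) * eps_new
    eps_new = (eps + dt * C1 * G * eps / k_new) / (1.0 + dt * C2 * eps / k_new)
    return k_new, eps_new

k, eps = 1.0, 10.0
history = []
for _ in range(20):
    k, eps = k_epsilon_step(k, eps, G=0.0, dt=1.0)   # decaying turbulence, coarse time-step
    history.append((k, eps))

# For comparison: one explicit step of dk/dt = G - eps from the same initial state
k_explicit = 1.0 + 1.0 * (0.0 - 10.0)   # -9, an unphysical negative kinetic energy
```

Both variables remain positive for every step even though the time-step is far too large for an explicit update, which is the boundedness property the exercise asks you to examine term by term.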



Chapter 6 Polyhedral Finite Volume Method
6.1 Introduction

In this chapter we will lay out the HJ HERE!!!


Properties of a Discretisation Method

Discretisation
• Generic transport equation can very rarely be solved analytically: this is why we resort to numerical methods
• Discretisation is a process of representing the differential equation we wish to solve by a set of algebraic expressions of equivalent properties (typically a matrix)
• Two forms of discretisation operators. We shall use a divergence operator as an example.
– Calculus. Given a vector field u, produce a scalar field of ∇•u
– Method. For a given divergence operator ∇•, create a set of matrix coefficients that represent ∇•u for any given u
• The Calculus form can be easily obtained from the Method (by evaluating the expression), but this is not computationally efficient

Properties
A discretised form of the equation needs to consistently represent the original equation
1. Consistency: when the mesh spacing tends to zero, the discretisation should become exact
2. Stability: a solution method is stable if it does not magnify the errors that appear during the numerical solution process
3. Convergence: the solution of the discretised equations should tend to the exact solution of the differential equation as the mesh spacing tends to zero
4. Conservation: at steady-state and in the absence of sources and sinks the amount of a conserved quantity leaving the system is equal to the amount entering it
5. Boundedness: for variables that possess physical (sanity) bounds, boundedness should be preserved in the discretised form
6. Realisability: discretised version of the model should be such that solutions obtained numerically are physically realistic
7. Accuracy: produce the best possible solution on a given mesh
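The distinction between the Calculus and Method forms of an operator can be made concrete in one dimension, where the divergence reduces to d/dx. The sketch below assumes a uniform periodic mesh and central differencing, and uses a dense matrix purely for readability; the function names are hypothetical.

```python
import numpy as np

def div_calculus(u, dx):
    """'Calculus' form: evaluate du/dx directly from a given field."""
    return (np.roll(u, -1) - np.roll(u, 1)) / (2.0 * dx)

def div_method(n, dx):
    """'Method' form: matrix coefficients representing d/dx for any field of size n."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = 1.0 / (2.0 * dx)
        A[i, (i - 1) % n] = -1.0 / (2.0 * dx)
    return A

n = 16
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx
u = np.sin(2.0 * np.pi * x)

A = div_method(n, dx)            # assembled without any knowledge of u
explicit = div_calculus(u, dx)   # requires u
from_matrix = A @ u              # Calculus result recovered by evaluating the Method form
```

Evaluating `A @ u` reproduces the explicit result but costs a matrix assembly and a matrix-vector product, which is the inefficiency the text refers to. The matrix form is what allows the unknown field to be solved for rather than merely evaluated.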


Discretisation of the Scalar Transport Equation

We shall now review the technique of second-order Finite Volume discretisation on polyhedral meshes. After specifying the spatial and temporal distribution, we shall visit a number of operators and present their explicit and implicit form.

Discretisation Methodology
1. We shall assemble the discretisation on a per-operator basis: visit each operator in turn and describe a strategy for evaluating the term explicitly and discretising it
2. Describe space and time: a computational mesh for the spatial domain and time-steps covering the time interval
3. Postulate spatial and temporal variation of φ required for a discrete representation of field data
4. Integrate the operator over a cell
5. Use the spatial and temporal variation to interpret the operator in discrete terms

Representation of a Field Variable
• Equations we operate on work on fields: before we start, we need a discrete representation of the field
• Main solution variable will be stored at the cell centroid: collocated cell-centred finite volume method. Boundary data will be stored on face centres of boundary faces
• For some purposes, e.g. face flux, different data is required; in this case it will be a field over all faces in the mesh
• Spatial variation can be used for interpolation in general: post-processing tools typically use point-based data.

Nomenclature: Computational Cell

[Figure: polyhedral computational cell P with neighbour N, face f, face area vector sf, delta vector df, position vector rP and coordinate axes]

• The figure shows a convex polyhedral cell bounded by a set of convex polygons
• Cell volume is denoted by $V_P$
• Point P is the computational point located at cell centroid $\mathbf{x}_P$. The definition of the centroid reads:
$$ \int_{V_P} (\mathbf{x} - \mathbf{x}_P)\, dV = \mathbf{0}. \tag{6.1} $$
• For the cell, there is one neighbouring cell across each face. Neighbour cell and cell centre will be marked with N.
• Delta vector for the face f is defined as
$$ \mathbf{d}_f = \overline{PN} \tag{6.2} $$


• The face centre f is defined in the equivalent manner, using the centroid rule:
$$ \int_{S_f} (\mathbf{x} - \mathbf{x}_f)\, dS = \mathbf{0}. $$
• Face area vector $\mathbf{s}_f$ is a surface normal vector whose magnitude is equal to the area of the face. The face is numerically never flat, so the face centroid and area are calculated from the integrals.
$$ \mathbf{s}_f = \int_{S_f} \mathbf{n}\, dS. $$
• The fact that the face centroid does not necessarily lie on the plane of the face is not worrying: we are dealing with surface-integrated quantities. However, we shall require the cell centroid to lie within the cell
• In practice, cell volume and face area are calculated by decomposition into triangles and pyramids
• Types of faces in a mesh
– Internal face, between two cells
– Boundary face, adjacent to one cell only and pointing outwards of the computational domain
• When operating on a single cell, assume that all face area vectors $\mathbf{s}_f$ point outwards of cell P

Spatial and Temporal Variation
• Postulating spatial variation of φ: second order discretisation in space
$$ \phi(\mathbf{x}) = \phi_P + (\mathbf{x} - \mathbf{x}_P)\cdot(\nabla\phi)_P $$
This expression is given for each individual cell. Here, $\phi_P = \phi(\mathbf{x}_P)$.
• Postulating linear variation in time: second order in time
$$ \phi(t + \Delta t) = \phi^t + \Delta t \left(\frac{\partial \phi}{\partial t}\right)^t $$
where $\phi^t = \phi(t)$
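The triangle decomposition mentioned above can be sketched for a single polygonal face. The details below are assumptions made for illustration (triangles fanned around the average of the vertices, centroid weighted by triangle area, hypothetical function name); the notes only state that a decomposition into triangles is used.

```python
import numpy as np

def face_area_and_centroid(points):
    """Area vector s_f and centroid x_f of a polygonal face via triangle decomposition.

    points: (n, 3) array of vertices ordered around the face.
    """
    centre = points.mean(axis=0)             # estimated centre, apex of every triangle
    s_f = np.zeros(3)
    weighted_centroid = np.zeros(3)
    total_area = 0.0
    for p, q in zip(points, np.roll(points, -1, axis=0)):
        tri_area_vec = 0.5 * np.cross(p - centre, q - centre)
        tri_area = np.linalg.norm(tri_area_vec)
        s_f += tri_area_vec                                # sum of triangle area vectors
        weighted_centroid += tri_area * (centre + p + q) / 3.0
        total_area += tri_area
    return s_f, weighted_centroid / total_area

square = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
s_f, x_f = face_area_and_centroid(square)    # expected: s_f = (0, 0, 1), x_f = (0.5, 0.5, 0)

warped = square.copy()
warped[2, 2] = 0.2                           # lift one vertex: the face is no longer flat
s_warped, x_warped = face_area_and_centroid(warped)
```

The same routine handles the warped face without modification, which is the practical reason for working with integrated quantities instead of assuming flat faces.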



6.3 Discretisation of the Scalar Transport Equation


Polyhedral Mesh Support
• In FVM, we have specified the “shape function” without reference to the actual cell shape (tetrahedron, prism, brick, wedge). The variation is always linear. Doing polyhedral Finite Volume should be straightforward!
• In contrast, FEM specifies various forms of shape function for various shapes and provides options for higher order elements. Example: 27-node brick. However, I am not aware of a possible FEM formulation which is shape independent

Volume and Surface Integrals
• Discretisation is based on the integral form of the transport equation over each cell
$$\int_{V_P} \frac{\partial\phi}{\partial t}\, dV + \oint_S \phi\, (\mathbf{n} \cdot \mathbf{u})\, dS - \oint_S \gamma\, (\mathbf{n} \cdot \nabla\phi)\, dS = \int_{V_P} q_v\, dV$$
• Each term contains a volume or surface integral. Evaluate the integrals, using the prescribed variation in space
• Volume integral
$$\int_{V_P} \phi\, dV = \int_{V_P} \left[\phi_P + (\mathbf{x} - \mathbf{x}_P) \cdot (\nabla\phi)_P\right] dV = \phi_P \int_{V_P} dV + (\nabla\phi)_P \cdot \int_{V_P} (\mathbf{x} - \mathbf{x}_P)\, dV = \phi_P V_P$$
• Surface integral splits into a sum over faces and evaluates in the same manner
$$\oint_S \mathbf{n}\, \phi\, dS = \sum_f \int_{S_f} \mathbf{n}\, \phi_f\, dS_f = \sum_f \int_{S_f} \mathbf{n} \left[\phi_f + (\mathbf{x} - \mathbf{x}_f) \cdot (\nabla\phi)_f\right] dS_f = \sum_f \mathbf{s}_f \phi_f$$

• The above integrals show how the assumption of linear variation of φ and the selection of P in the centroid eliminate the second part of the integral and create second-order discretisation




Face Addressing

Software Organisation • Assuming that φf depends on the values of φ in the two cells around the face, P and N, let us attempt to calculate a surface integral for the complete mesh. Attention will be given on how the mesh structure influences the algorithm • Structured mesh. Introducing compass notation: East, West, North, South • The index of E, W , N and S can be calculated from the index of P : n + 1, n − 1, n + colDim, n − colDim





• Looping structure
– Option 1: For all cells, visit East, West, North, South and sum up the values. Not too good: each face value is calculated twice. Also, poor optimisation for vector computers: we want to do a relatively short operation for lots and lots of cells
– Option 2:
∗ For all cells, do East face and add to P and E
∗ For all cells, do North face and add to P and N
Better, but stumbles on the boundary. Nasty tricks, like “zero-volume boundary cells” on the W and S side of the domain.
– OK, I can do a box. How about implementing a boundary condition: on E, W, N and S. Ugly!
• Block-structured mesh. Same kind of looping as above

6.4 Face Addressing


– On connections between blocks, the connectivity is no longer “regular”, e.g. on the right side I can get a N cell of another block
– Solution: repeat the code for discretisation and boundary conditions for all possible block-to-block connections
• Repeated code is very bad for your health: needs to be changed consistently, much more scope for errors, boring and difficult to keep running properly.
• Tetrahedral mesh. Similar to structured mesh.
– A critical difference to above is that in a tetrahedral mesh we cannot calculate the neighbouring indices because the mesh is irregular. Thus, cell-to-cell connectivity needs to be calculated during mesh generation or at the beginning of the simulation and stored. Example: for each tetrahedron, store 4 indices of neighbour cells across 4 faces in order.

• Unstructured mesh. We can treat a block structured mesh in the same manner: forget about blocks and store neighbour indices for each cell. Much better: no code duplication.
• Mixed cell types. When mixed types are present, we will re-use the unstructured mesh idea, but with holes: a tetrahedron only has 4 neighbours and a brick has six
– Option 1: For all cells, visit all neighbours. Whoops: short loop inside a long loop AND all face values calculated twice
– Option 2:
∗ For all neighbours, up to max number of neighbours
∗ For all cells
∗ . . . do the work if there is a neighbour
Works, but not too happy: I have to check if the neighbour is present

Face Addressing


• Thinking about the above, all I want to do is to visit all cell faces and then all boundary faces. For an internal face, do the operation and put the result into the two cells around the face
• Orient face from P to N: add to P and subtract from N (because the face area vector points the wrong way)
• Addressing slightly different: for each internal face, record the left and right (owner and neighbour) cell index. Owner will be the first one in the cell list
• Much cleaner, compact addressing, fast and efficient (some cache hit issues are hidden but we can work on that)
• Most importantly, it no longer matters how many faces there are in the cell: nothing special is required for polyhedral cells

Gauss’ theorem
• Gauss’ theorem is a tool we will use for handling the volume integrals of divergence and gradient operators
• Divergence form
$$\int_{V_P} \nabla \cdot \mathbf{a}\, dV = \oint_S d\mathbf{s} \cdot \mathbf{a}$$
• Gradient form
$$\int_{V_P} \nabla\phi\, dV = \oint_S d\mathbf{s}\, \phi$$
• Note how the face area vector operates from the same side as the gradient operator: this fits with our definition of the gradient of a vector field
• In the rest of the analysis, we shall look at the problem face by face. A diagram of a face is given below for 2-D. Working with vectors will ensure no changes are required when we need to switch from 2-D to 3-D.
• A non-orthogonal case will be considered: vectors d and s are not parallel

[Figure: face f between cell centres P and N in 2-D, with face area vector s and delta vector d.]
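The face-based loop described under Face Addressing can be sketched as follows. This is a minimal sketch under stated assumptions, not code from these notes: the names `FaceMesh`, `boundaryOwner` and `sumFaceValues` are hypothetical, and boundary faces are assumed to be kept in a separate list, each touching a single cell.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical face-addressed mesh: one owner/neighbour pair per internal face
struct FaceMesh
{
    std::size_t nCells;
    std::vector<std::size_t> owner;          // cell P of each internal face
    std::vector<std::size_t> neighbour;      // cell N of each internal face
    std::vector<std::size_t> boundaryOwner;  // the only cell next to each boundary face
};

// Sum a face quantity (e.g. a flux oriented from P to N) into the cells
std::vector<double> sumFaceValues
(
    const FaceMesh& mesh,
    const std::vector<double>& internalFaceValue,
    const std::vector<double>& boundaryFaceValue
)
{
    assert(mesh.owner.size() == mesh.neighbour.size());
    assert(internalFaceValue.size() == mesh.owner.size());
    assert(boundaryFaceValue.size() == mesh.boundaryOwner.size());

    std::vector<double> cellSum(mesh.nCells, 0.0);

    // Internal faces: each face is visited once and updates two cells
    for (std::size_t f = 0; f < mesh.owner.size(); ++f)
    {
        cellSum[mesh.owner[f]] += internalFaceValue[f];
        cellSum[mesh.neighbour[f]] -= internalFaceValue[f];  // area vector points into N
    }

    // Boundary faces point out of the domain and touch one cell only
    for (std::size_t f = 0; f < mesh.boundaryOwner.size(); ++f)
    {
        cellSum[mesh.boundaryOwner[f]] += boundaryFaceValue[f];
    }

    return cellSum;
}
```

Note that the loop never asks how many faces a cell has, which is the property that makes polyhedral cells come for free.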

6.5 Operator Discretisation



Operator Discretisation

In the following section we will look at the discrete representation of various operators. Operators which do not interact can be looked at in isolation and will be considered in the increased order of complexity.


Temporal Derivative

Time derivative captures the rate-of-change of φ. We only need to handle the volume integral.
• Using the prescribed temporal variation in a point, defining time-step size ∆t
• $t_{new} = t_{old} + \Delta t$, defining time levels $\phi^n$ and $\phi^o$
$$\phi^o = \phi(t = t_{old}) \qquad (6.10)$$
$$\phi^n = \phi(t = t_{new})$$
• Temporal derivative, first and second order approximation
$$\frac{\partial\phi}{\partial t} = \frac{\phi^n - \phi^o}{\Delta t}$$
$$\frac{\partial\phi}{\partial t} = \frac{\frac{3}{2}\phi^n - 2\phi^o + \frac{1}{2}\phi^{oo}}{\Delta t}$$
• Thus, with volume integral:
$$\int_{V_P} \frac{\partial\phi}{\partial t}\, dV = \frac{\phi^n - \phi^o}{\Delta t}\, V_P$$
• Calculus: given $\phi^n$, $\phi^o$ and ∆t create a field of the time derivative of φ
• Method: matrix representation. Since $\partial\phi/\partial t$ in cell P depends on $\phi_P$, the matrix will only have a diagonal contribution and a source
– Diagonal value: $a_P = \frac{V_P}{\Delta t}$
– Source contribution: $r_P = \frac{V_P\, \phi^o}{\Delta t}$
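The diagonal and source contributions of the temporal derivative can be sketched as below. This is an illustration only: the struct `CellEquation` and both function names are hypothetical, and the second-order variant is simply the backward formula above rearranged so that the new-time value multiplies the diagonal, assuming a constant time step.

```cpp
#include <cassert>

// Contribution of one term to the row of cell P: diag*phi_P^n = source
struct CellEquation
{
    double diag;    // a_P
    double source;  // r_P
};

// First-order (Euler implicit) time derivative: V_P (phi^n - phi^o)/dt
CellEquation eulerImplicitDdt(double volume, double deltaT, double phiOld)
{
    assert(deltaT > 0.0);
    return CellEquation{volume/deltaT, volume*phiOld/deltaT};
}

// Second-order backward time derivative:
// V_P (3/2 phi^n - 2 phi^o + 1/2 phi^oo)/dt, constant time step assumed
CellEquation backwardDdt
(
    double volume,
    double deltaT,
    double phiOld,
    double phiOldOld
)
{
    assert(deltaT > 0.0);
    return CellEquation
    {
        1.5*volume/deltaT,
        volume*(2.0*phiOld - 0.5*phiOldOld)/deltaT
    };
}
```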




Second Derivative in Time

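Second derivative in time
• This term will appear when we try to do stress analysis
• Very similar to the above: second temporal derivative is calculated using two old-time levels of φ
$$\frac{\partial^2\phi}{\partial t^2} = \frac{\phi^n - 2\phi^o + \phi^{oo}}{\Delta t^2},$$
where $\phi^n = \phi(t + \Delta t)$, $\phi^o = \phi(t)$ and $\phi^{oo} = \phi(t - \Delta t)$.
• One can also construct a second-order accurate form of $\frac{\partial^2\phi}{\partial t^2}$ using three “old-time” levels ($\phi^{ooo} = \phi(t - 2\Delta t)$):
$$\frac{\partial^2\phi}{\partial t^2} = \frac{2\phi^n - 5\phi^o + 4\phi^{oo} - \phi^{ooo}}{\Delta t^2}.$$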


• Exercise: what needs to be done if the time step is not constant between the two old time levels?


Evaluation of the Gradient

Gauss’ Theorem
• Evaluation of the gradient is a direct application of the Gauss’ Theorem
$$\int_{V_P} \nabla\phi\, dV = \oint_S d\mathbf{s}\, \phi$$
• Discretised form splits into a sum of face integrals
$$\oint_S \mathbf{n}\, \phi\, dS = \sum_f \mathbf{s}_f \phi_f$$
• It still remains to evaluate the face value of φ. Consistently with second-order discretisation, we shall assume linear variation between P and N
$$\phi_f = f_x \phi_P + (1 - f_x)\phi_N \qquad (6.17)$$
• Gradient evaluation is almost exclusively used as a calculus operation
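A sketch of the Gauss gradient as a calculus operation is given below. It assumes that the cell gradient is recovered as $(\nabla\phi)_P = \frac{1}{V_P}\sum_f \mathbf{s}_f \phi_f$, which follows from the volume integral of a linearly varying field; the function name `gaussGradient` and all argument names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Hypothetical helper: (1/V_P) * sum_f s_f phi_f for every cell
std::vector<Vec3> gaussGradient
(
    const std::vector<std::size_t>& owner,      // cell P of each internal face
    const std::vector<std::size_t>& neighbour,  // cell N of each internal face
    const std::vector<Vec3>& faceArea,          // s_f, pointing from P to N
    const std::vector<double>& fx,              // interpolation weight per face
    const std::vector<std::size_t>& bOwner,     // cell next to each boundary face
    const std::vector<Vec3>& bFaceArea,         // s_f, pointing out of the domain
    const std::vector<double>& bPhi,            // boundary face values
    const std::vector<double>& volume,          // V_P
    const std::vector<double>& phi              // cell-centre values
)
{
    assert(volume.size() == phi.size());
    assert(owner.size() == neighbour.size());

    std::vector<Vec3> grad(phi.size(), Vec3{{0.0, 0.0, 0.0}});

    // Internal faces: linear interpolation, add to P and subtract from N
    for (std::size_t f = 0; f < owner.size(); ++f)
    {
        const std::size_t P = owner[f];
        const std::size_t N = neighbour[f];
        const double phiF = fx[f]*phi[P] + (1.0 - fx[f])*phi[N];
        for (int i = 0; i < 3; ++i)
        {
            grad[P][i] += faceArea[f][i]*phiF;
            grad[N][i] -= faceArea[f][i]*phiF;  // s_f points into N
        }
    }

    // Boundary faces use the stored boundary value directly
    for (std::size_t f = 0; f < bOwner.size(); ++f)
    {
        for (int i = 0; i < 3; ++i)
        {
            grad[bOwner[f]][i] += bFaceArea[f][i]*bPhi[f];
        }
    }

    // Divide the face sum by the cell volume
    for (std::size_t c = 0; c < phi.size(); ++c)
    {
        for (int i = 0; i < 3; ++i)
        {
            grad[c][i] /= volume[c];
        }
    }

    return grad;
}
```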



Least Squares Fit
• On highly distorted meshes, accuracy of Gauss gradients is compromised
• Least squares fit uses a set of neighbouring points without reference to cell geometry to assemble the gradient
• Assuming a linear variation of a general variable φ, the error at N is:
$$e_N = \phi_N - \left(\phi_P + \mathbf{d}_N \cdot (\nabla\phi)_P\right)$$
Minimising the least square error:
$$e_P^2 = \sum_N (w_N e_N)^2$$
with the weighting function
$$w_N = \frac{1}{|\mathbf{d}_N|} \qquad (6.20)$$
leads to the following expression:
$$(\nabla\phi)_P = \sum_N w_N^2\, \mathbf{G}^{-1} \cdot \mathbf{d}_N\, (\phi_N - \phi_P),$$
where G is a 3 × 3 symmetric matrix:
$$\mathbf{G} = \sum_N w_N^2\, \mathbf{d}_N \mathbf{d}_N$$


• This produces a second-order accurate gradient irrespective of the arrangement of the neighbouring points
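The least squares expressions above can be sketched for a single cell as follows. The names `inverse` and `leastSquaresGradient` are hypothetical, and the 3 × 3 inverse is written out with the adjugate formula purely to keep the example self-contained; it assumes the neighbours span all three directions so that G is invertible.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Invert a 3x3 matrix using the adjugate formula
Mat3 inverse(const Mat3& m)
{
    const double det =
        m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
      - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
      + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0]);
    assert(std::fabs(det) > 0.0);  // neighbours must span all three directions

    Mat3 inv;
    inv[0][0] =  (m[1][1]*m[2][2] - m[1][2]*m[2][1])/det;
    inv[0][1] = -(m[0][1]*m[2][2] - m[0][2]*m[2][1])/det;
    inv[0][2] =  (m[0][1]*m[1][2] - m[0][2]*m[1][1])/det;
    inv[1][0] = -(m[1][0]*m[2][2] - m[1][2]*m[2][0])/det;
    inv[1][1] =  (m[0][0]*m[2][2] - m[0][2]*m[2][0])/det;
    inv[1][2] = -(m[0][0]*m[1][2] - m[0][2]*m[1][0])/det;
    inv[2][0] =  (m[1][0]*m[2][1] - m[1][1]*m[2][0])/det;
    inv[2][1] = -(m[0][0]*m[2][1] - m[0][1]*m[2][0])/det;
    inv[2][2] =  (m[0][0]*m[1][1] - m[0][1]*m[1][0])/det;
    return inv;
}

// Gradient in cell P from neighbour offsets d_N and differences phi_N - phi_P
Vec3 leastSquaresGradient
(
    const std::vector<Vec3>& d,
    const std::vector<double>& dPhi
)
{
    assert(d.size() == dPhi.size());

    Mat3 G{};   // G = sum_N w_N^2 d_N d_N, zero-initialised
    Vec3 rhs{}; // sum_N w_N^2 d_N (phi_N - phi_P)

    for (std::size_t n = 0; n < d.size(); ++n)
    {
        // w_N^2 = 1/|d_N|^2
        const double w2 =
            1.0/(d[n][0]*d[n][0] + d[n][1]*d[n][1] + d[n][2]*d[n][2]);
        for (int i = 0; i < 3; ++i)
        {
            rhs[i] += w2*d[n][i]*dPhi[n];
            for (int j = 0; j < 3; ++j)
            {
                G[i][j] += w2*d[n][i]*d[n][j];
            }
        }
    }

    const Mat3 Ginv = inverse(G);
    Vec3 grad{};
    for (int i = 0; i < 3; ++i)
    {
        for (int j = 0; j < 3; ++j)
        {
            grad[i] += Ginv[i][j]*rhs[j];
        }
    }
    return grad;
}
```

For a field that really is linear, the fit reproduces the gradient regardless of how irregularly the neighbours are placed, which is the property claimed in the last bullet.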


Convection Term

Convection term captures the transport by the convective velocity. In general terms, convection can be seen as a “coordinate transformation”: information (variable) is carried by the flow field from one region to another. A concept of upwind or downwind direction is needed to understand the convective process.

Convection Operator and Face Flux
• Convection operator splits into a sum of face integrals
• Two different ways of writing the same term: integral and differential form
$$\oint_S \phi\, (\mathbf{n} \cdot \mathbf{u})\, dS = \int_{V_P} \nabla \cdot (\phi \mathbf{u})\, dV$$
• Integration follows the same path as before
$$\oint_S \phi\, (\mathbf{n} \cdot \mathbf{u})\, dS = \sum_f \phi_f\, (\mathbf{s}_f \cdot \mathbf{u}_f) = \sum_f \phi_f F$$
where $\phi_f$ is the face value of φ and $F = \mathbf{s}_f \cdot \mathbf{u}_f$ is the face flux
• In general, face flux is a face field giving the measure of the flow through the face. In some algorithms, it may come from different expressions, depending on the overall algorithm
• Primary unknowns are the cell centre values, not face values
• In order to close the system, we need a way of evaluating $\phi_f$ from the cell values $\phi_P$ and $\phi_N$: face interpolation

Face Interpolation Schemes
• Simplest face interpolation: central differencing. Second-order accurate, but causes oscillations
$$\phi_f = f_x \phi_P + (1 - f_x)\phi_N \qquad (6.25)$$
where $f_x = \overline{fN}/\overline{PN}$
• Upwind differencing: taking into account the transportive property of the term: information comes from upstream. No oscillations, but smears the solution
$$F \phi_f = \max(F, 0)\, \phi_P + \min(F, 0)\, \phi_N \qquad (6.26)$$
i.e. $\phi_f = \phi_P$ for $F \geq 0$ and $\phi_f = \phi_N$ otherwise
• There exists a large number of schemes, trying to achieve good accuracy without causing oscillations: e.g. TVD and NVD families:
$$\phi_f = f(\phi_P, \phi_N, F, \ldots) \qquad (6.27)$$











• We shall re-visit the schemes with examples
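The two basic interpolation schemes can be written as small functions. This is a sketch for illustration; the function names are hypothetical, and the third function shows the max/min form, which gives the convected quantity $F\phi_f$ rather than $\phi_f$ itself.

```cpp
#include <algorithm>
#include <cassert>

// Central differencing: linear interpolation with weight fx = |fN|/|PN|
double centralDifferencing(double fx, double phiP, double phiN)
{
    assert(fx >= 0.0 && fx <= 1.0);
    return fx*phiP + (1.0 - fx)*phiN;
}

// Upwind differencing: take the value from the cell the flux comes from
double upwindDifferencing(double flux, double phiP, double phiN)
{
    return flux >= 0.0 ? phiP : phiN;
}

// Convected quantity F*phi_f for upwind differencing, in max/min form
double upwindFaceFlux(double flux, double phiP, double phiN)
{
    return std::max(flux, 0.0)*phiP + std::min(flux, 0.0)*phiN;
}
```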



Matrix Coefficients
• In the convection term, $\phi_f$ depends on the values of φ in two computational points: P and N.
• Therefore, the solution in P will depend on the solution in N and vice versa, which means we’ve got an off-diagonal coefficient in the matrix. In the case of central differencing on a uniform mesh, a contribution for a face f is
– Diagonal value: $a_P = \frac{1}{2} F$
– Off-diagonal value: $a_N = \frac{1}{2} F$

– Source contribution: in our case, nothing. However, some other schemes may have additional (gradient-based) correction terms – Note that, in general the P -to-N coefficient will be different from the N-to-P coefficient: the matrix is asymmetric
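The asymmetry of the convection matrix can be made concrete by writing out what one internal face contributes to the rows of its owner and neighbour cells. In this sketch the struct `FaceCoeffs` and both function names are hypothetical; the neighbour row sees the same face with the flux reversed, because the face area vector points into N.

```cpp
#include <algorithm>
#include <cassert>

// Coefficients contributed by one internal face to the two cell equations
struct FaceCoeffs
{
    double ownerDiag;         // row P, multiplies phi_P
    double ownerOffDiag;      // row P, multiplies phi_N
    double neighbourDiag;     // row N, multiplies phi_N
    double neighbourOffDiag;  // row N, multiplies phi_P
};

// Central differencing: +F*phi_f in row P, -F*phi_f in row N
FaceCoeffs centralConvection(double F, double fx)
{
    assert(fx >= 0.0 && fx <= 1.0);
    return FaceCoeffs{fx*F, (1.0 - fx)*F, -(1.0 - fx)*F, -fx*F};
}

// Upwind differencing: only the upstream cell value is used
FaceCoeffs upwindConvection(double F)
{
    return FaceCoeffs
    {
        std::max(F, 0.0),
        std::min(F, 0.0),
        -std::min(F, 0.0),
        -std::max(F, 0.0)
    };
}
```

With `fx = 0.5` the owner row reproduces the values $\frac{1}{2}F$ quoted above, while the N-to-P coefficient has the opposite sign.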


Diffusion Term

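Diffusion term captures the gradient transport

Diffusion Operator
• Integration same as before
$$\oint_S \gamma\, (\mathbf{n} \cdot \nabla\phi)\, dS = \sum_f \int_{S_f} \gamma\, (\mathbf{n} \cdot \nabla\phi)\, dS = \sum_f \gamma_f\, \mathbf{s}_f \cdot (\nabla\phi)_f$$
• $\gamma_f$ evaluated from cell values using central differencing
• Evaluation of the face-normal gradient. If s and $\mathbf{d}_f = \overline{PN}$ are aligned, use difference across the face
$$\mathbf{s}_f \cdot (\nabla\phi)_f = |\mathbf{s}_f|\, \frac{\phi_N - \phi_P}{|\mathbf{d}_f|} \qquad (6.28)$$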

• This is the component of the gradient in the direction of the df vector • For non-orthogonal meshes, a correction term may be necessary

Matrix Coefficients
• For an orthogonal mesh, a contribution for a face f is
– Diagonal value: $a_P = -\gamma_f \frac{|\mathbf{s}_f|}{|\mathbf{d}_f|}$
– Off-diagonal value: $a_N = \gamma_f \frac{|\mathbf{s}_f|}{|\mathbf{d}_f|}$
– Source contribution: for orthogonal meshes, nothing. Non-orthogonal correction will produce a source
– The P-to-N and N-to-P coefficients are identical: symmetric matrix

Non-Orthogonal Correction
• We wish to keep the part with coefficient creation as above even for non-orthogonal meshes . . . but this would not be correct
• Solution: add a correction
– Decompose the s vector into a component parallel with d and the rest.
– For the parallel component, same as above
– Correction = $\mathbf{k} \cdot (\nabla\phi)_f$. The missing gradient will be calculated at cell centres and interpolated, just as $\gamma_f$ above

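[Figure: non-orthogonal face f between P and N; the face area vector s is decomposed into ∆, parallel with d, and the correction k.]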
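Written out, the decomposition described above reads as follows. The first two lines only restate the bullets; the last line is one possible choice of the parallel component (often called the over-relaxed decomposition) and is an assumption added here for illustration, since these notes do not fix a particular choice.

```latex
\begin{align}
  \mathbf{s}_f &= \boldsymbol{\Delta}_f + \mathbf{k}_f,
  \qquad \boldsymbol{\Delta}_f \parallel \mathbf{d}_f \\
  \mathbf{s}_f \cdot (\nabla\phi)_f
  &= |\boldsymbol{\Delta}_f|\, \frac{\phi_N - \phi_P}{|\mathbf{d}_f|}
   + \mathbf{k}_f \cdot (\nabla\phi)_f \\
  \boldsymbol{\Delta}_f
  &= \frac{\mathbf{d}_f\, |\mathbf{s}_f|^2}{\mathbf{d}_f \cdot \mathbf{s}_f}
  \qquad \text{(assumed choice)}
\end{align}
```

The first term on the right-hand side creates the same symmetric coefficients as on an orthogonal mesh, with $|\mathbf{s}_f|$ replaced by $|\boldsymbol{\Delta}_f|$; the second term goes into the source. On an orthogonal mesh $\mathbf{d}_f \parallel \mathbf{s}_f$, so $\boldsymbol{\Delta}_f = \mathbf{s}_f$, $\mathbf{k}_f = \mathbf{0}$ and equation (6.28) is recovered.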


Source and Sink Terms

Source and sink terms are integrated over the volume
$$\int_{V_P} q_v\, dV = q_v V_P$$


• In general, qv may be a function of space and time, the solution itself, other variables and can be quite complex. In complex physics cases, the source term can carry the main interaction in the system. Example: complex chemistry mechanisms. We shall for the moment consider only a simple case.

6.6 Numerical Boundary Conditions


• Typically, linearisation with respect to φ is performed to promote stability and boundedness
$$q_v(\phi) = q_u + q_d\, \phi$$
where $q_d = \frac{\partial q_v(\phi)}{\partial\phi}$, and the case where $q_d < 0$ (sink) is treated separately

Matrix Coefficients • Source and sink terms do not depend on the neighbourhood – Diagonal value created for qd < 0: “boosting diagonal dominance” – Explicit source contribution: qu
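A sketch of the linearised source treatment follows. Several points are assumptions made for the example rather than statements from the notes: the row is taken to be written as $a_P \phi_P + \sum_N a_N \phi_N = r$ with a positive diagonal, the source sits on the right-hand side, and a positive $q_d$ is evaluated explicitly with the currently available φ. The names `CellEquation` and `addLinearisedSource` are hypothetical.

```cpp
#include <cassert>

// One row of the system a_P*phi_P + sum_N a_N*phi_N = r (assumed convention)
struct CellEquation
{
    double diag;    // a_P
    double source;  // r
};

// Add the source q_v = q_u + q_d*phi, integrated over a cell of volume V_P
void addLinearisedSource
(
    CellEquation& eqn,
    double qu,
    double qd,
    double volume,
    double phiCurrent
)
{
    assert(volume > 0.0);

    if (qd < 0.0)
    {
        // Sink: treat implicitly, which boosts the diagonal
        eqn.diag -= qd*volume;
        eqn.source += qu*volume;
    }
    else
    {
        // Source: treat explicitly using the currently available phi
        eqn.source += (qu + qd*phiCurrent)*volume;
    }
}
```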


Numerical Boundary Conditions

Implementation of Numerical Boundary Conditions
• Boundary conditions will contribute to the discretisation through the prescribed boundary behaviour
• Boundary condition is specified for the whole equation
• . . . but we will study them term by term to make the problem simpler

Dirichlet Condition: Fixed Boundary Value
• Boundary condition specifies $\phi_f = \phi_b$
• Convection term: fixed contribution $F \phi_b$. Source contribution only
• Diffusion term: need to evaluate the near-boundary gradient
$$\mathbf{n} \cdot (\nabla\phi)_b = \frac{\phi_b - \phi_P}{|\mathbf{d}_b|} \qquad (6.31)$$

This produces a source and a diagonal contribution • What about source, sink, rate of change?

Neumann and Gradient Condition
• Boundary condition specifies the near-wall gradient $\mathbf{n} \cdot (\nabla\phi)_b = g_b$
• Convection term: evaluate the boundary value of φ from the internal value and the known gradient
$$\phi_b = \phi_P + \mathbf{d}_b \cdot (\nabla\phi)_b = \phi_P + |\mathbf{d}_b|\, g_b \qquad (6.32)$$

Use the evaluated boundary value as the face value. This creates a source and a diagonal contribution
• Diffusion term: the boundary-normal gradient $g_b$ can be used directly. Source contribution only

Mixed Condition
• Combination of the above
• Very easy: α times Dirichlet plus (1 − α) times Neumann.

Symmetry Plane
• The above boundary conditions were the same for scalars, vectors and tensors. On a symmetry plane, there will be a different condition on a scalar (zero gradient) and on a vector (zero normal component and zero-gradient condition on the tangential component)
• For scalars, the surface-normal gradient is zero
• For vectors, draw a “ghost cell” on the opposite side of the boundary with the value the same as in P but with mirror transformation and do the discretisation as usual
• Note: symmetry plane boundary condition for a vector couples the components

Cyclic, Periodic and Other Coupled Conditions
• Cyclic and periodic boundary conditions couple near-boundary cells to cells on another boundary
• A coordinate transformation is applied between the two sides: N to N′ and vice-versa; the rest of the discretisation is performed as if this is an internal face of the mesh
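The Dirichlet, Neumann and mixed conditions can be sketched for the diffusion term as follows. The contribution of one boundary face to $\sum_f \gamma_f\, \mathbf{s}_f \cdot (\nabla\phi)_f$ is returned in the form `diag*phiP + source`, using the same sign convention as the diffusion coefficients earlier in this chapter. The struct and function names are hypothetical.

```cpp
#include <cassert>

// Boundary face contribution to the diffusion operator: diag*phi_P + source
struct BoundaryContribution
{
    double diag;
    double source;
};

// Dirichlet: gamma*|s_b|*(phi_b - phi_P)/|d_b| gives a diagonal and a source
BoundaryContribution fixedValueDiffusion
(
    double gamma,
    double area,
    double dist,
    double phiB
)
{
    assert(dist > 0.0);
    const double c = gamma*area/dist;
    return BoundaryContribution{-c, c*phiB};
}

// Neumann: gamma*|s_b|*g_b is known, so there is a source only
BoundaryContribution fixedGradientDiffusion(double gamma, double area, double gB)
{
    return BoundaryContribution{0.0, gamma*area*gB};
}

// Mixed: alpha times Dirichlet plus (1 - alpha) times Neumann
BoundaryContribution mixedDiffusion
(
    double alpha,
    const BoundaryContribution& dirichlet,
    const BoundaryContribution& neumann
)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    return BoundaryContribution
    {
        alpha*dirichlet.diag + (1.0 - alpha)*neumann.diag,
        alpha*dirichlet.source + (1.0 - alpha)*neumann.source
    };
}
```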

6.7 Time-Marching Approach








Time-Marching Approach

Time Advancement
• Having completed the discretisation of all operators we can now evolve the solution in time
• There are two basic types of time advancement: implicit and explicit schemes. Properties of the algorithm critically depend on this choice, but both are useful under given circumstances
• There is a number of methods, with slightly different properties, e.g. fractional step methods
• Temporal accuracy depends on the choice of scheme and time step size
• Steady-state simulations
– If equations are linear, this can be solved in one go!
– For non-linear equations or special discretisation practices, relaxation methods are used, which show characteristics of time integration (we are free to re-define the meaning of time)

Explicit Schemes
• The algorithm uses the calculus approach, sometimes said to operate on residuals
• In other words, the expressions are evaluated using the currently available φ and the new φ is obtained from the time term
• Courant number limit is the major limitation of explicit methods: information can only propagate at the order of cell size; otherwise the algorithm is unstable
• Quick and efficient, no additional storage
• Very bad for elliptic behaviour
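A minimal sketch of an explicit scheme is given below for 1-D convection with upwind differencing on a uniform periodic mesh and constant positive velocity; the setup and the function name `explicitUpwindStep` are assumptions chosen to keep the example short. The Courant number is $Co = u\, \Delta t / \Delta x$, and the assertion encodes the stability limit of this particular scheme.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One explicit time step: phi^n_i = phi^o_i - Co*(phi^o_i - phi^o_{i-1})
std::vector<double> explicitUpwindStep
(
    const std::vector<double>& phiOld,
    double courant
)
{
    // Explicit upwind is stable only for 0 <= Co <= 1
    assert(courant >= 0.0 && courant <= 1.0);

    const std::size_t n = phiOld.size();
    std::vector<double> phiNew(n, 0.0);

    for (std::size_t i = 0; i < n; ++i)
    {
        // Periodic domain: the cell upstream of cell 0 is the last cell
        const double upstream = phiOld[(i + n - 1) % n];

        // Everything on the right-hand side uses old-time values only
        phiNew[i] = phiOld[i] - courant*(phiOld[i] - upstream);
    }

    return phiNew;
}
```

At a Courant number of exactly one the profile is shifted by one cell per step; smaller values smear it, which is the numerical diffusion of upwind differencing.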

Implicit Schemes
• The algorithm is based on the method (matrix) approach: each term is expressed in matrix form and the resulting linear system is solved
• A new solution takes into account the new values in the complete domain: ideal for elliptic problems
• Implicitness removes the Courant number limitation: we can take larger time-steps
• Substantial additional storage: matrix coefficients!


Equation Discretisation

• The equation we are trying to solve is simply a collection of terms: therefore, assemble the contribution from each term
• Initial condition. Specifies the initial distribution of φ
• . . . and we are ready to look at examples!


Convection Differencing Schemes

• Testing differencing schemes on standard profiles • Simple second-order discretisation: upwind differencing, central differencing, blended differencing, NVD schemes • First-order scheme: Upwind differencing. Take into account the transport direction • Exercise: how does all this relate to the discretisation of the Euler equation described in the previous lectures?



• Forms of convection discretisation and kinds of error they introduce • Positive and negative diffusion terms • Temporal discretisation: first and second-order, implicit or explicit discretisation

Chapter 7 Algebraic Linear System and Linear Solver Technology
7.1 Structure and Formulation of the Linear System

Matrix Assembly
• Assembling the terms from the discretisation method
– Time derivative: φ depends on old value
– Convection: u provided; $\phi_f$ depends on $\phi_P$ and $\phi_N$
– Diffusion: $\mathbf{s}_f \cdot (\nabla\phi)_f$ depends on $\phi_P$ and $\phi_N$
• Thus, the value of the solution in a point depends on the values around it: this is always the case. For each computational point, we will create an equation
$$a_P \phi_P + \sum_N a_N \phi_N = r$$
where N denotes the neighbourhood of a computational point
– Every time $\phi_P$ depends on itself, add contribution into $a_P$
– Every time $\phi_P$ depends on $\phi_N$, add contribution into $a_N$
– Other contributions into r
– Examples of matrix structure
∗ Structured mesh, Finite Volume
∗ Unstructured mesh, Finite Volume
∗ 2-D linear quad elements, Finite Element
∗ 2-D linear triangular elements, Finite Element



Implicit and Explicit Methods
– Explicit method: $\phi_P^n$ depends on the old neighbour values $\phi_N^o$
∗ Visit each cell, and using available $\phi^o$ calculate
$$\phi_P^n = \frac{r - \sum_N a_N \phi_N^o}{a_P}$$
∗ No additional information needed
∗ Fast and efficient; however, poses the Courant number limitation: the information about boundary conditions is propagated very slowly and poses a limitation on the time-step size
– Implicit method: $\phi_P^n$ depends on the new neighbour values $\phi_N^n$
$$\phi_P^n = \frac{r - \sum_N a_N \phi_N^n}{a_P}$$
∗ Each cell value of φ for the “new” level depends on others: all equations need to be solved simultaneously

Linear System: Nomenclature
• Equations form a linear system or a matrix
$$[A][\phi] = [r] \qquad (7.4)$$
where [A] contains the matrix coefficients, [φ] is the value of $\phi_P$ in all cells and [r] is the right-hand-side
• [A] is potentially very big: N cells × N cells
• This is a square matrix: the number of equations equals the number of unknowns
• . . . but very few coefficients are non-zero. The matrix connectivity is always local, potentially leading to storage savings if a good format can be found
• What about non-linearity?


Matrix Storage Formats

Storing Matrix Coefficients
• Dense matrix format. All matrix coefficients are stored, typically in a two-dimensional array

7.2 Matrix Storage Formats


– Diagonal coefficients: $a_{ii}$, off-diagonal coefficients: $a_{ij}$
– Convenient for small matrices and direct solver use
– Matrix coefficients represent a large chunk of memory: efficient operations imply memory management optimisation
– It is impossible to say if the matrix is symmetric or not without floating point comparisons
• Sparse matrix format. Only non-zero coefficients will be stored
– Considerable savings in memory
– Need a mechanism to indicate the position of non-zero coefficients
– This is a static format, which imposes limitations on the operations: if a coefficient is originally zero, it is very expensive to set its value: recalculating the format. This is usually termed a zero fill-in condition
– Searching for coefficients is out of the question: need to formulate sparse matrix algorithms

Sparse Matrix Storage
• Compressed row format. Operate on a row-by-row basis. Diagonal coefficients may or may not be stored separately
– Coefficients stored in a single 1-D array. Coefficients are ordered in a row-by-row structure
– Addressing in two arrays: row start and column array
– The column array records the column index for each coefficient. Size of column array equal to the number of stored coefficients (the off-diagonal coefficients only, if the diagonal is stored separately)
– The row array records the start and end of each row in the column array. Thus, row i has got coefficients from row[i] up to, but not including, row[i + 1]. Size of row array equal to number of rows + 1
– Coding [b] = [A] [x] with compressed row addressing

    #include <vector>

    // [b] = [A][x] with compressed row addressing
    void vectorProduct(std::vector<double>& b, const std::vector<double>& x,
                       const std::vector<int>& row, const std::vector<int>& col,
                       const std::vector<double>& coeffs)
    {
        const int count = int(row.size()) - 1;  // number of rows
        for (int n = 0; n < count; n++)
        {
            b[n] = 0;  // start each row sum from zero
            for (int ip = row[n]; ip < row[n+1]; ip++)
            {
                b[n] += coeffs[ip]*x[col[ip]];  // accumulate, do not overwrite
            }
        }
    }



– Good for cases where coefficients are present in each row
– Symmetric matrix cannot be recognised easily
• Arrow format. Arbitrary sparse format. Diagonal coefficients typically stored separately
– Coefficients stored in 2-3 arrays: diagonal, upper triangle, lower triangle (if needed)
– Diagonal addressing implied
– Off-diagonal addressing stored in 2 arrays: owner or row index array and neighbour or column index array. Size of addressing arrays equal to the number of off-diagonal coefficients
– The matrix structure (fill-in) is assumed to be symmetric: presence of $a_{ij}$ implies the presence of $a_{ji}$
– If the matrix coefficients are symmetric, only the upper triangle is stored: a symmetric matrix is easily recognised and only half of the coefficients are stored
– Coding [b] = [A] [x] with arrow addressing

    #include <vector>

    // [b] = [A][x] with arrow addressing
    void vectorProduct(std::vector<double>& b, const std::vector<double>& x,
                       const std::vector<double>& diagCoeffs,
                       const std::vector<double>& upperCoeffs,
                       const std::vector<double>& lowerCoeffs,
                       const std::vector<int>& owner, const std::vector<int>& neighbour)
    {
        // Diagonal contribution: addressing is implied
        for (int i = 0; i < int(diagCoeffs.size()); i++)
        {
            b[i] = diagCoeffs[i]*x[i];
        }
        // Off-diagonal contributions: accumulate, do not overwrite
        for (int n = 0; n < int(upperCoeffs.size()); n++)
        {
            int c0 = owner[n];
            int c1 = neighbour[n];
            b[c0] += upperCoeffs[n]*x[c1];
            b[c1] += lowerCoeffs[n]*x[c0];
        }
    }

Matrix Format and Discretisation Method
• Relationship between the FV mesh and a matrix:
– A cell value depends on other cell values only if the two cells share a face. Therefore, a correspondence exists between the off-diagonal matrix coefficients and the mesh structure
– In practice, the matrix is assembled by looping through the mesh

7.3 Linear Solver Technology


• Finite Element matrix assembly – Connectivity depends on the shape function and point-to-cell connectivity in the mesh – In assembly, a local matrix is assembled and then inserted into the global matrix – Clever FEM implementations talk about the kinds of assembly without the need for searching: a critical part of the algorithm
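The correspondence between mesh faces and off-diagonal coefficients can be illustrated by assembling a diffusion matrix in arrow format directly from the face list. This is a sketch under stated assumptions: the names `ArrowMatrix`, `assembleDiffusion` and `multiply` are hypothetical, the mesh is taken to be orthogonal, and `faceCoeff` stands for $\gamma_f |\mathbf{s}_f| / |\mathbf{d}_f|$ on each internal face.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Arrow format: diagonal plus one upper and one lower coefficient per face
struct ArrowMatrix
{
    std::vector<double> diag, upper, lower;
    std::vector<std::size_t> owner, neighbour;
};

// Assemble the diffusion operator by looping over internal faces
ArrowMatrix assembleDiffusion
(
    std::size_t nCells,
    const std::vector<std::size_t>& owner,
    const std::vector<std::size_t>& neighbour,
    const std::vector<double>& faceCoeff
)
{
    assert(owner.size() == neighbour.size());
    assert(owner.size() == faceCoeff.size());

    ArrowMatrix A;
    A.diag.assign(nCells, 0.0);
    A.upper = faceCoeff;  // a_N seen from the owner row
    A.lower = faceCoeff;  // a_N seen from the neighbour row: symmetric
    A.owner = owner;
    A.neighbour = neighbour;

    for (std::size_t f = 0; f < owner.size(); ++f)
    {
        // Each face adds -gamma_f |s_f|/|d_f| to both diagonals
        A.diag[owner[f]] -= faceCoeff[f];
        A.diag[neighbour[f]] -= faceCoeff[f];
    }
    return A;
}

// b = A x, visiting each off-diagonal pair once
std::vector<double> multiply(const ArrowMatrix& A, const std::vector<double>& x)
{
    assert(x.size() == A.diag.size());
    std::vector<double> b(x.size(), 0.0);
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        b[i] = A.diag[i]*x[i];
    }
    for (std::size_t f = 0; f < A.owner.size(); ++f)
    {
        b[A.owner[f]] += A.upper[f]*x[A.neighbour[f]];
        b[A.neighbour[f]] += A.lower[f]*x[A.owner[f]];
    }
    return b;
}
```

No searching for coefficient positions is needed at any point: the face index is the off-diagonal index.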


Linear Solver Technology

The Role of a Linear Solver
• Good (implicit) numerical simulation software will spend 50-90 percent of CPU time inverting matrices: performance of linear solvers is absolutely critical for the performance of the solver
• Like in the case of mesh generation, we will couple the characteristics of a discretisation method and the solution algorithm with the linear solver
• Only a combination of a discretisation method and a linear solver will result in a useful solver. Typically, properties of discretisation will be set up in a way that allows the choice of an efficient solver

Solution Approach
• Direct solver. The solver algorithm will perform a given number of operations, after which a solution will be obtained
• Iterative solver. The algorithm will start from an initial solution and perform a number of operations which will result in an improved solution. Iterative solvers may be variants of the direct solution algorithm with special characteristics
• Explicit method. New solution depends on currently available values of the variables. The matrix itself is not required or assembled; in reality, the algorithm reduces to point-Jacobi or Gauss-Seidel sweeps

Direct or Iterative Solver
• Direct solvers: expensive in storage and CPU time but can handle any sort of matrix
• Iterative solvers: work by starting from an initial guess and improving the solution. However, they require matrices with “special” properties



• For large problems, iterative solvers are the only option
• Fortunately, the FVM matrices are ideally suited (read: carefully constructed) for use with iterative solvers

Full or Partial Convergence
• When we are working on linear problems with linear discretisation in steady-state, the solution algorithm will only use a single solver call. This is very quick and very rare: linear systems are easy to simulate
• Example: linear stress analysis. In some FEM implementations, the direct solver will be used exclusively for matrices under a given size
• In cases of coupled or non-linear partial differential equations, the solution algorithm will iterate over the non-linearity. Therefore, the intermediate solution will only be used to update the non-linear parameters.
• With this in mind, we can choose to use partial convergence, update the non-linearity and solve again: capability of obtaining an intermediate solution at a fraction of the cost becomes beneficial
• Moreover, in iterative procedures or time-marching simulations, it is quite easy to provide a good initial guess for the new solution: the solution from the previous iteration or time-step. This further improves the efficiency of the algorithm
• Historically, in partial convergence cases, FEM solvers use tighter tolerances than FVM: 6 orders of magnitude for FEM vs. 1-2 orders of magnitude for the FVM


Direct Solver on Sparse Matrices

Properties of Direct Solvers • The most important property from the numerical point of view is that the number of operations required for the solution is known and intermediate solutions are of no interest • Matrix fill-in. When operating on a large sparse matrix like the one from discretisation methods, the direct solver will create entries for coefficients that were not previously present. As a consequence, formal matrix storage requirement for a direct solver is a full matrix for a complete system: huge! This is something that needs to be handled in a special way



• Advantage of direct solvers is that they can handle any sort of well-posed linear system
• In reality, we additionally have to worry about pollution by the round-off error. This is partially taken into account through the details of the solution algorithm, but for really bad matrices this cannot be helped

Gaussian Elimination
• Gaussian elimination is the easiest direct solver: standard mathematics. Elimination is performed by combining row coefficients until the matrix becomes triangular. The elimination step is followed by backward substitution to obtain the solution.
• Pivoting: in order to control the round-off error, equations are chosen for elimination based on the size of the central (diagonal) coefficient
• Combination of matrix rows leads to fill-in
• Gaussian elimination is one of the cases of L-U decomposition solvers and is rarely used in practice
• The number of operations in direct solvers scales with the number of equations cubed: very expensive!

Multi-Frontal Solver
• When handling very sparse systems, the fill-in is very problematic: it leads to a large increase in storage size and accounts for the bulk of operations
• Window approach: modern implementation of direct solvers
– Looking at the structure of the sparse system, it can be established that the equation for $\phi_P$ depends only on a small subset of other nodes: in principle, it should be possible to eliminate the equation for P just by looking at a small subset of the complete matrix
– If all equations under elimination have overlapping regions of zero off-diagonal coefficients, there will be no fill-in in the shared regions of zeros!
– Idea: Instead of operating on the complete matrix, create an active window for elimination. The window will sweep over the matrix, adding equations one by one and performing elimination immediately
– The window matrix will be dense, but much smaller than the complete matrix. The triangular matrix (needed for back-substitution) can be stored in a sparse format



• The window approach may reduce the cost of direct solvers by several orders of magnitude: acceptable for medium-sized systems. The number of operations scales roughly with $N M^2$, where N is the number of equations and M is the maximum size of the solution window

Implementing Direct Solvers
• The first step in the implementation is control of the window size: the window changes its width dynamically and in the worst case may be the size of the complete matrix
• Maximum size of the window depends on the matrix connectivity and ordering of equations. Special optimisation software is used to control the window size: matrix renumbering and ordering heuristics
• Example: ordering of a Cartesian matrix for minimisation of the band
• Most expensive operation in the multi-frontal solver is the calculation of the Schur complement: the difference between the trivial and optimised operation can be a factor of 10000! In practice, you will not attempt this (cache hit rate and processor-specific pre-fetch operations)
• Basic Linear Algebra Subprograms (BLAS) library: special assembly code implementation for matrix manipulation. Code is optimised by hand and sometimes written specially for a processor architecture. It is unlikely that a handwritten code for the same operation achieves more than 10 % of the efficiency of BLAS. A good implementation can now be measured by how much time the code spends on operations outside of BLAS.
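As a point of reference for the direct solvers discussed above, dense Gaussian elimination with partial pivoting and backward substitution can be sketched as follows. This is a textbook illustration on a full matrix, not the multi-frontal algorithm; the alias `Matrix` and the function name `gaussSolve` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Solve [A][x] = [b]; A and b are taken by value because they are overwritten
std::vector<double> gaussSolve(Matrix A, std::vector<double> b)
{
    const std::size_t n = b.size();
    assert(A.size() == n);

    for (std::size_t k = 0; k < n; ++k)
    {
        // Partial pivoting: pick the row with the largest coefficient in column k
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
        {
            if (std::fabs(A[i][k]) > std::fabs(A[p][k])) p = i;
        }
        assert(std::fabs(A[p][k]) > 0.0);  // otherwise the matrix is singular
        std::swap(A[k], A[p]);
        std::swap(b[k], b[p]);

        // Eliminate column k from all rows below the pivot row
        for (std::size_t i = k + 1; i < n; ++i)
        {
            const double factor = A[i][k]/A[k][k];
            for (std::size_t j = k; j < n; ++j)
            {
                A[i][j] -= factor*A[k][j];
            }
            b[i] -= factor*b[k];
        }
    }

    // Backward substitution on the upper triangular system
    std::vector<double> x(n, 0.0);
    for (std::size_t i = n; i-- > 0;)
    {
        double sum = b[i];
        for (std::size_t j = i + 1; j < n; ++j)
        {
            sum -= A[i][j]*x[j];
        }
        x[i] = sum/A[i][i];
    }
    return x;
}
```

The three nested loops in the elimination step are where the cubic operation count comes from.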


Simple Iterative Solvers

Iterative solvers • Performance of iterative solvers depends on the matrix characteristics. The solver operates by incrementally improving the solution, which leads to the concept of error propagation: if the error is augmented in the iterative process, the solver diverges • The easiest way of analysing the error is in terms of eigen-spectrum of the matrix • One categorisation of iterative solvers is based on their smoothing characteristics:



– Smoothers, or smoothing algorithms, guarantee that the approximate solution after each solver iteration will be closer to the exact solution than all previous approximations. An example of a smoother would be the Gauss-Seidel algorithm
– For rougheners, this is not the case: in the iterative sequence, the solution can temporarily move away from the exact solution, followed by a series of convergence steps

Matrix Properties
• A matrix is sparse if it contains only a few non-zero elements
• A sparse matrix is banded if its non-zero coefficients are grouped in a stripe around the diagonal
• A sparse matrix has a multi-diagonal structure if its non-zero off-diagonal coefficients form a regular diagonal pattern
• A symmetric matrix is equal to its transpose
$$[A] = [A]^T \qquad (7.5)$$
• A matrix is positive definite if for every $[\phi] \neq [0]$
$$[\phi]^T [A] [\phi] > 0 \qquad (7.6)$$
• A matrix is diagonally dominant if in each row the sum of off-diagonal coefficient magnitudes is equal to or smaller than the diagonal coefficient
$$a_{ii} \geq \sum_{j \neq i} |a_{ij}|$$
and for at least one i
$$a_{ii} > \sum_{j \neq i} |a_{ij}|$$
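The diagonal dominance condition can be checked row by row. The sketch below uses a dense matrix purely for brevity, and the function name `isDiagonallyDominant` is hypothetical; it follows the definition above literally, comparing the diagonal coefficient itself with the sum of off-diagonal magnitudes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// True if a_ii >= sum_{j != i} |a_ij| in every row and > in at least one row
bool isDiagonallyDominant(const std::vector<std::vector<double>>& A)
{
    bool strictInOneRow = false;

    for (std::size_t i = 0; i < A.size(); ++i)
    {
        assert(A[i].size() == A.size());  // square matrix expected

        double offDiagSum = 0.0;
        for (std::size_t j = 0; j < A.size(); ++j)
        {
            if (j != i) offDiagSum += std::fabs(A[i][j]);
        }

        if (A[i][i] < offDiagSum) return false;
        if (A[i][i] > offDiagSum) strictInOneRow = true;
    }

    return strictInOneRow;
}
```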


Residual • Matrix form of the system we are trying to solve is [A][φ] = [r] (7.9)



• The exact solution can be obtained by inverting the matrix [A]:
$$[\phi] = [A]^{-1} [r] \qquad (7.10)$$
This is how direct solvers operate: the number of operations required for the inversion of [A] is fixed and until the inverse is constructed we cannot get [φ]
• Iterative solvers start from an approximate solution $[\phi]^0$ and generate a set of solution estimates $[\phi]^k$, where k is the iteration counter
• Quality of the solution estimate is measured through a residual, or error e:
$$[e] = [r] - [A][\phi]^k \qquad (7.11)$$
The residual is a vector showing how far the current estimate $[\phi]^k$ is from the exact solution [φ]. Note that for [φ], [e] will be zero
• [e] defines a value for every equation (row) in [A]: we need a better way to measure it. A residual norm ||r|| can be assembled in many ways, but usually
$$||r|| = \sum_j |r_j|$$
where $r_j$ here denotes the residual in row j, i.e. the j-th entry of [e], not of the right-hand side [r]. In CFD software, the residual norm is normalised further for easier comparison between the equations etc.
• Convergence of the iterative solver is usually measured in terms of residual reduction. When
$$\frac{||r^k||}{||r^0||} < \epsilon \qquad (7.13)$$
the matrix is considered to be solved.

Examples of Simple Solvers
• The general idea of iterative solvers is to replace [A] with a matrix that is easy to invert and approximates [A] and use this to obtain the new solution
• Point-Jacobi solution
• Gauss-Seidel solver
• Tri-diagonal system and generalisation to 5- or 7-diagonal matrices



• Propagation of information in simple iterative solvers. Point-Jacobi propagates the “data” one equation at a time: very slow. For Gauss-Seidel, the information propagation depends on the matrix ordering and sweep direction. In practice, forward and reverse sweeps are alternated
• Krylov space solvers
  – Looking at the direct solver, we can imagine that it operates in N-dimensional space, where N is the number of equations, and searches for a point which minimises the residual
  – In Gaussian elimination, we visit each direction of the N-dimensional space and eliminate it from further consideration
  – The idea of Krylov space solvers is that an approximate solution can be found more efficiently if we look for search directions more intelligently. A residual vector [e] at each point contains the “direction” we should search in; additionally, we would like to always search in a direction orthogonal to all previous search directions
  – On their own, Krylov space solvers are poor; however, when matrix preconditioning is used, we can assemble efficient methods. This is an example of an iterative roughener
  – In terms of performance, the number of operations in Krylov space solvers scales with N log(N), where N is the number of unknowns
  – For more details, see Shewchuk: An Introduction to the Conjugate Gradient Method Without the Agonizing Pain
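To make the residual-reduction criterion and the difference between a simple sweep solver and a Krylov solver concrete, the following Python sketch solves a small 1-D Laplace-type system with Gauss-Seidel and with unpreconditioned conjugate gradients. It is a minimal illustration under stated assumptions: the dense matrix storage, the test matrix, the tolerance and all function names are chosen for this example and are not taken from any particular CFD code.

```python
def make_laplace(n):
    """Dense 1-D Laplace-type matrix: 2 on the diagonal, -1 next to it."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def residual_norm(A, x, r):
    """Sum of magnitudes of the residual [e] = [r] - [A][x]."""
    return sum(abs(ri - axi) for ri, axi in zip(r, matvec(A, x)))


def gauss_seidel(A, r, tol=1e-6, max_iter=10000):
    n = len(r)
    x = [0.0] * n
    norm0 = residual_norm(A, x, r)
    for k in range(1, max_iter + 1):
        for i in range(n):  # forward sweep: newest values are used at once
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (r[i] - s) / A[i][i]
        if residual_norm(A, x, r) < tol * norm0:
            return x, k
    return x, max_iter


def conjugate_gradient(A, r, tol=1e-6, max_iter=10000):
    """Unpreconditioned CG for a symmetric positive definite matrix."""
    x = [0.0] * len(r)
    e = list(r)               # residual of the zero initial guess
    p = list(e)               # first search direction
    norm0 = sum(abs(v) for v in e)
    ee = sum(v * v for v in e)
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = ee / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        e = [ei - alpha * api for ei, api in zip(e, Ap)]
        if sum(abs(v) for v in e) < tol * norm0:
            return x, k
        ee_new = sum(v * v for v in e)
        p = [ei + (ee_new / ee) * pi for ei, pi in zip(e, p)]
        ee = ee_new
    return x, max_iter


n = 20
A = make_laplace(n)
r = [1.0] * n
x_gs, it_gs = gauss_seidel(A, r)
x_cg, it_cg = conjugate_gradient(A, r)
print("Gauss-Seidel sweeps:", it_gs)
print("Conjugate gradient iterations:", it_cg)
```

Both solvers stop on the same relative residual reduction. On this small symmetric system the Krylov solver needs far fewer iterations than the sweep solver, even without preconditioning; on realistic CFD matrices a preconditioner would normally be added.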


Algebraic Multigrid

Basic Idea of Multigrid

• Operation of a multigrid solver relies on the fact that a high-frequency error is easy to eliminate: consider the operation of the Gauss-Seidel algorithm
• Once the high-frequency error is removed, iterative convergence slows down. At the same time, the error that looks smooth on the current mesh will behave as high-frequency on a coarser mesh
• If the mesh is coarser, the error is eliminated both faster and in fewer iterations
• Thus, in multigrid the solution is mapped through a series of coarse levels, each of the levels being responsible for a “band” of error

Algebraic Multigrid (AMG)

• When performing CFD operations, we can readily assemble a multigrid algorithm by creating a series of coarse grids. This in itself is not trivial: convexness of cells, issues with boundary conditions, etc.
• In terms of matrices and linear solvers, the same principle should apply: our matrices come from discretisation! However, it would be impractical to build a series of coarse meshes just to solve a system of linear equations
• At the same time, we can readily recognise that all the information about the coarse mesh (and therefore the coarse matrix) already exists in the fine mesh!
• Example: assembling the convection, diffusion and source operator on the imaginary coarse mesh directly from the data on a fine mesh
• Algebraic multigrid generalises this idea: a coarse matrix is created directly from the fine matrix
• An alternative view of multigrid is the propagation of information from one boundary to another. In elliptic systems, each point in the solution depends on every other point. Thus, it is critical to transfer the boundary condition influences to each point in the domain, which multigrid does efficiently

Algebraic Multigrid Operations

• Matrix coarsening. This is roughly equivalent to the creation of coarse mesh cells. Two main approaches are:
  – Aggregative multigrid (AAMG). Equations are grouped into clusters in a manner similar to grouping fine cells to form a coarse cell. The grouping pattern is based on the strength of off-diagonal coefficients
  – Selective multigrid (SAMG). In selective multigrid, the equations are separated into two groups: the coarse and fine equations. Selection rules specify that no two coarse points should be connected to each other, creating a maximum possible set. Fine equations form a fine-to-coarse interpolation method (restriction matrix), [r], which is used to form the coarse system
• Restriction of residual handles the transfer of information from fine to coarse levels. A fine residual, containing the smooth error component, is restricted and used as the r.h.s. (right-hand-side) of the coarse system

• Prolongation of correction. Once the coarse system is solved, the coarse correction is prolongated to the fine level and added to the solution. Interpolation introduces aliasing errors, which can be efficiently removed by smoothing on the fine level
• Multigrid smoothers. The bulk of multigrid work is performed by transferring the error and correction through the multigrid levels. Smoothers only act to remove high-frequency error: simple and quick. Smoothing can be applied on each level:
  – Before the restriction of the residual, called pre-smoothing
  – After the coarse correction has been added, called post-smoothing
• Algorithmically, post-smoothing is more efficient
• Cycle types. Based on the above, AMG can be considered a two-level solver. In practice, the “coarse level” solution is also assembled using multigrid, leading to multi-level systems
• The most important multigrid cycle types are
  – V-cycle: residual reduction is performed all the way to the coarsest level, followed by prolongation and post-smoothing. Mathematically, it is possible to show that the V-cycle is optimal and leads to a solution algorithm where the number of operations scales linearly with the number of unknowns
  – Flex cycle. Here, the creation of coarse levels is done on demand, when the smoother stops converging efficiently
• Other cycles, e.g. W-cycle or F-cycle, are variations on the V-cycle theme
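The sequence of operations described above (pre-smoothing, restriction of the residual, coarse solution, prolongation of the correction, post-smoothing) can be sketched as a two-level cycle in Python. This is an illustration under simplifying assumptions, not a production AMG: the aggregates are fixed pairs of neighbouring equations rather than being chosen from coefficient strength, the coarse system is "solved" by a large number of Gauss-Seidel sweeps, and all names are hypothetical.

```python
def laplace(n):
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def residual(A, x, b):
    return [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]


def l1_norm(v):
    return sum(abs(vi) for vi in v)


def gs_sweeps(A, x, b, sweeps):
    """Gauss-Seidel smoother: returns a new list, the input is not modified."""
    x = list(x)
    n = len(b)
    for _ in range(sweeps):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x


def coarsen(A, agg):
    """Coarse matrix: add up fine coefficients between pairs of aggregates."""
    nc = max(agg) + 1
    Ac = [[0.0] * nc for _ in range(nc)]
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            Ac[agg[i]][agg[j]] += a
    return Ac


def two_grid_cycle(A, Ac, agg, x, b):
    x = gs_sweeps(A, x, b, 1)                        # pre-smoothing
    e = residual(A, x, b)
    ec = [0.0] * len(Ac)
    for i, ei in enumerate(e):                       # restriction: sum per aggregate
        ec[agg[i]] += ei
    corr = gs_sweeps(Ac, [0.0] * len(Ac), ec, 200)   # near-exact coarse solve
    x = [xi + corr[agg[i]] for i, xi in enumerate(x)]  # prolongation
    return gs_sweeps(A, x, b, 1)                     # post-smoothing


n = 16
A = laplace(n)
b = [1.0] * n
agg = [i // 2 for i in range(n)]   # assumed aggregates: neighbouring pairs
Ac = coarsen(A, agg)
tol = 1e-6
norm0 = l1_norm(residual(A, [0.0] * n, b))

x_mg, cycles = [0.0] * n, 0
while l1_norm(residual(A, x_mg, b)) > tol * norm0 and cycles < 1000:
    x_mg = two_grid_cycle(A, Ac, agg, x_mg, b)
    cycles += 1

x_gs, sweeps = [0.0] * n, 0
while l1_norm(residual(A, x_gs, b)) > tol * norm0 and sweeps < 10000:
    x_gs = gs_sweeps(A, x_gs, b, 1)
    sweeps += 1

print("two-level cycles:", cycles, " plain Gauss-Seidel sweeps:", sweeps)
```

Replacing the coarse solve by a recursive call to the same cycle would turn this two-level sketch into a V-cycle.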


7.4 Parallelisation and Vectorisation

Solver Performance

• Time spent in the solvers is a significant fraction of the total simulation time. Therefore, the efficiency of solvers and the choice of algorithm are critical for the overall performance
• We can make the simulation run faster either by devising a better solution algorithm (hard!) or by performing operations and handling data faster
• The subject here is rarely the solution algorithm itself: the design of solvers is typically left to mathematicians. Instead, we are looking for operations that can be efficiently executed on computers


• When designing solvers to work on high-performance computers, the two main “devices” we have at our disposal are:
  – Vector registers
  – Multiple CPUs
• Other (and configurable) structures for efficient execution include pipelining and short vector optimisation, but the principle is the same

Vector Operations

• We can simplify numerous solver operations into vector-matrix multiply operations of the form
    c = a*x + b
  This is what the computer does for us
• An operation like the above uses computer resources in 3 ways
  – Configuring the registers
  – Fetching the data
  – Performing the operation
• The idea of vector computers is to perform this operation simultaneously on a large amount of data, defined by the vector length (e.g. 256 or 1024 operations together)
• An efficient algorithm should therefore perform the same operation on a large data-set, without if-statements, function calls, data inter-dependency etc.
• For practical purposes, vector computers are (currently) dead: however, lessons on vector programming are extremely useful on current-generation chips

Parallelisation

• The idea of parallelisation is to split the large loop
    for (int i = 0; i < N; i++)
    {
        c[i] = a[i]*x[i] + b[i];
    }
  between a number of CPUs, with each CPU responsible for its own part
• Problem decomposition can be done in several ways
  – Algorithmic decomposition, or decomposition over the numerical procedure, with each CPU being responsible for its own part of the algorithm
  – Decomposition over time steps, or time decomposition
  – Domain decomposition, where each CPU is responsible for its part of the computational domain
• Fine-grain decomposition decomposes the solver on a loop-by-loop basis: typically done by the compiler
• A critical part of the parallel solution approach is to ensure that every CPU has approximately the same amount of work; otherwise, CPUs end up waiting for each other
• Iterative solvers parallelise well: operations have weak data dependency and few synchronisation points. It is relatively easy to establish the necessary communication pattern for data dependency between CPUs
• In direct solvers, the problem is more serious: multiple solution windows can propagate the solution front independently on each CPU, but problems arise when two windows on two separate CPUs need to merge
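The splitting of the multiply-add loop between workers, with an approximately equal share of work for each, can be sketched in Python as follows. This is a conceptual illustration only: the helper names are hypothetical, and Python threads are used purely to show the decomposition pattern (in the standard CPython interpreter they will not speed up this loop; a production code would typically use message passing or compiled threads).

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, n_parts):
    """Split range(n) into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(n, n_parts)
    chunks, start = [], 0
    for part in range(n_parts):
        size = base + (1 if part < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks


def multiply_add_chunk(a, x, b, c, start, end):
    """Work of one 'CPU': c[i] = a[i]*x[i] + b[i] on its own index range."""
    for i in range(start, end):
        c[i] = a[i] * x[i] + b[i]


def parallel_multiply_add(a, x, b, n_workers=4):
    n = len(a)
    c = [0.0] * n
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(multiply_add_chunk, a, x, b, c, start, end)
                   for start, end in split_range(n, n_workers)]
        for future in futures:
            future.result()   # synchronisation point: wait for every worker
    return c


a = [float(i) for i in range(10)]
x = [2.0] * 10
b = [1.0] * 10
print(split_range(10, 4))
print(parallel_multiply_add(a, x, b))
```

Each worker writes only to its own index range, so there is no data dependency between workers and the single synchronisation point is the final wait.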



Chapter 8 Solution Methods for Coupled Equation Sets
8.1 Examining the Coupling in Equation Sets

Nature of Coupling

• The nature of coupling is not usually examined in general terms: all our equations look very similar
• Additionally, the nature and strength of coupling depends not only on the equation but also on the state of the system and material properties. Example: change of viscosity in the fluid flow equations. Typically, such changes are described in terms of dimensionless groups, e.g. Reynolds number Re
• In principle, difficult systems of equations encompass a large range of space and time-scales. In fact, the equations are not the culprit: we are trying to assemble the solution on an inappropriate scale
• Inappropriate scale is usually chosen for efficiency: the actual scale of the physical phenomenon may be very fast and lead to extremely long simulation times
• Example: chemical reactions in fully premixed flames


8.2 Examples of Systems of Simultaneous Equations

In the next paragraphs, we shall review several mathematical models from the point of view of equation interaction.

Porous Media: Darcy’s Equation

• Darcy’s Law:
\[ \mathbf{u} = -\gamma \nabla p \tag{8.1} \]
• Darcy’s law, combined with the mass conservation equation for an incompressible liquid, creates the Laplace equation which controls the system
\[ \nabla \bullet (\gamma \nabla p) = 0 \tag{8.2} \]
The velocity field is obtained from the pressure distribution in a post-processing step.
• Cases where γ is a scalar field represent uniform flow resistance in all directions: isotropic porous medium
• For directed Darcy’s law, the flow resistance may depend on the spatial direction, e.g. “flow straighteners”. This produces an orthotropic resistance tensor:
\[ \mathbf{u} = -\boldsymbol{\gamma} \bullet \nabla p \tag{8.3} \]
where
\[ \boldsymbol{\gamma} = \begin{bmatrix} \gamma_{xx} & 0 & 0 \\ 0 & \gamma_{yy} & 0 \\ 0 & 0 & \gamma_{zz} \end{bmatrix} \tag{8.4} \]

• A general form, where γ is a full symmetric tensor, is also possible. In a further generalisation of Darcy’s law, the resistance tensor is a function of the local velocity

Linear Stress Analysis

• Solution variable: displacement vector d
\[ \frac{\partial^2 (\rho \mathbf{d})}{\partial t^2} - \nabla \bullet \left[ \mu \nabla \mathbf{d} + \mu (\nabla \mathbf{d})^T + \lambda \mathbf{I} \, \mathrm{tr}(\nabla \mathbf{d}) \right] = \rho \mathbf{f} \tag{8.5} \]
• The equation is assembled by substituting the linear stress-strain relationship into the momentum (force balance) equation:
\[ \boldsymbol{\sigma} = 2 \mu \boldsymbol{\varepsilon} + \lambda \, \mathrm{tr}(\boldsymbol{\varepsilon}) \, \mathbf{I} \tag{8.6} \]
and
\[ \boldsymbol{\varepsilon} = \frac{1}{2} \left[ \nabla \mathbf{d} + (\nabla \mathbf{d})^T \right] \tag{8.7} \]
• Displacement is a vector variable and the equation is linear


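Incompressible Navier-Stokes Equations

• Solution variables: velocity u and pressure p
• Momentum equation:
\[ \frac{\partial \mathbf{u}}{\partial t} + \nabla \bullet (\mathbf{u}\mathbf{u}) - \nabla \bullet (\nu \nabla \mathbf{u}) = -\nabla p \tag{8.8} \]
• Continuity equation:
\[ \nabla \bullet \mathbf{u} = 0 \tag{8.9} \]
• ν is the kinematic viscosity and p is the kinematic pressure

Compressible Navier-Stokes Equations

• Solution variables: density ρ, momentum ρu and energy ρe
• Continuity equation:
\[ \frac{\partial \rho}{\partial t} + \nabla \bullet (\rho \mathbf{u}) = 0 \tag{8.10} \]
• Momentum equation:
\[ \frac{\partial (\rho \mathbf{u})}{\partial t} + \nabla \bullet (\rho \mathbf{u}\mathbf{u}) - \nabla \bullet \left[ \mu \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right) \right] = \rho \mathbf{g} - \nabla \left( P + \frac{2}{3} \mu \nabla \bullet \mathbf{u} \right) \tag{8.11} \]
• Energy equation:
\[ \frac{\partial (\rho e)}{\partial t} + \nabla \bullet (\rho e \mathbf{u}) - \nabla \bullet (\lambda \nabla T) = \rho \mathbf{g} \bullet \mathbf{u} - \nabla \bullet (P \mathbf{u}) - \nabla \bullet \left( \frac{2}{3} \mu (\nabla \bullet \mathbf{u}) \mathbf{u} \right) + \nabla \bullet \left[ \mu \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right) \bullet \mathbf{u} \right] + \rho Q \tag{8.12} \]
• Equation of state
\[ \rho = \rho(P, T) \tag{8.13} \]
• The transport coefficients λ and µ are also functions of the thermodynamic state variables:
\[ \lambda = \lambda(P, T) \tag{8.14} \]
\[ \mu = \mu(P, T) \tag{8.15} \]
• Pressure or density formulation?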

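k − ε Turbulence Model

• Solution variables: turbulence kinetic energy k and its dissipation ε
• k-equation:
\[ \frac{\partial k}{\partial t} + \nabla \bullet (\mathbf{u} k) - \nabla \bullet (\mu_t \nabla k) = G - \epsilon \tag{8.16} \]
with
\[ \mu_t = C_\mu \frac{k^2}{\epsilon} \tag{8.17} \]
\[ G = \mu_t \left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right] : \nabla \mathbf{u} \tag{8.18} \]
• ε-equation:
\[ \frac{\partial \epsilon}{\partial t} + \nabla \bullet (\mathbf{u} \epsilon) - \nabla \bullet (\mu_t \nabla \epsilon) = C_1 \frac{\epsilon}{k} G - C_2 \frac{\epsilon^2}{k} \tag{8.19} \]

Chemical Reactions

• Example set of chemical reactions
    3 C1 → C2 + 2 C3 + 9 H    (8.20)
    C2 → AH + 2 H    (8.21)
    15 C3 → 2 C12 A7 + C2 + 21 CH + 66 H    (8.22)
• Solution variables: species concentrations C1, C2 and C3
• Transport equations for species
\[ \frac{\partial C_1}{\partial t} + \nabla \bullet (\rho C_1 \mathbf{u}) - \nabla \bullet (\gamma_c \nabla C_1) = -3 C_1 S_1 + \frac{2}{3} C_2 S_2 + \frac{44}{15} C_3 S_3 + \frac{4}{45} C_2 S_2 \tag{8.23} \]
\[ \frac{\partial C_2}{\partial t} + \nabla \bullet (\rho C_2 \mathbf{u}) - \nabla \bullet (\gamma_c \nabla C_2) = -2 C_2 S_2 \tag{8.24} \]
\[ \frac{\partial C_3}{\partial t} + \nabla \bullet (\rho C_3 \mathbf{u}) - \nabla \bullet (\gamma_c \nabla C_3) = -\frac{22}{5} C_3 S_3 + \frac{2}{15} C_2 S_2 \tag{8.25} \]
• Arrhenius law (reaction rate):
\[ S_i(T) = A \exp\left( \frac{-E_i}{R T} \right) \tag{8.26} \]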
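The strong temperature dependence of the Arrhenius reaction rate, which is one source of stiffness in reacting systems, can be seen from a short Python sketch. The pre-exponential factor, the activation energy and the two temperatures below are hypothetical values chosen only for illustration.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J/(mol K)


def arrhenius_rate(pre_exponential, activation_energy, temperature):
    """Reaction rate S(T) = A exp(-E / (R T)).

    activation_energy in J/mol, temperature in K; the units of the result
    follow the units of the pre-exponential factor A.
    """
    return pre_exponential * math.exp(-activation_energy / (R_GAS * temperature))


# Hypothetical reaction data (assumed values, not from the notes)
A_FACTOR = 1.0e6
E_ACT = 80.0e3  # J/mol

rate_cold = arrhenius_rate(A_FACTOR, E_ACT, 1000.0)
rate_hot = arrhenius_rate(A_FACTOR, E_ACT, 1200.0)
print("rate ratio for a 20% temperature increase:", rate_hot / rate_cold)
```

For these assumed values, a 20% increase in temperature raises the reaction rate by roughly a factor of five, so small temperature changes feed back strongly into the species source terms.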

8.3 Solution Strategy for Coupled Sets

We shall review the options for handling coupled vector variables or coupled equation sets in a numerical solution algorithm.

• Coupled solution algorithms are designed to handle systems of equations in the most efficient way possible
• The option of solving all equations together always exists, but it is very expensive and in most cases unnecessary
• The objective is to treat “important” and “nice” terms implicitly and handle the coupling algorithmically whenever possible
• Numerically well behaved terms help with the stability of discretisation
  – Time derivative: inertial behaviour
  – Diffusion: smoothing: no new minima or maxima are introduced
  – Convection: coordinate transformation
  – Linear and bounded sources and sinks: control of boundedness


Segregated Approach

Segregated Solution Technique

• In the segregated approach, the set of equations is solved one at a time. The coupling terms are evaluated from the currently available solution and lagged
• For vector equations, vector components are solved individually. Component-to-component coupling terms are lagged (source/sink) by one iteration
• In algorithmic terms, the segregated solver corresponds to successive substitution: there is no guarantee a converged solution can be reached
• Equation segregation makes smaller matrices: one for each component. Matrices are solved one at a time, re-using the storage arrays, and are usually identical for all components (apart from the source/sink terms)
• Equation segregation is not always desirable: it may convert a linear component-coupled problem into a non-linear one and require iterations

Under-Relaxation

• In order to improve the convergence, we sometimes use under-relaxation. Here, only a part of the correction is added, potentially slowing down convergence but increasing stability
• Types of under-relaxation
  – Explicit under-relaxation: when a new solution φ_p is obtained, the value for the next iteration will only use a part of the correction
\[ \phi_{new} = \phi_{old} + \alpha (\phi_p - \phi_{old}) \tag{8.27} \]
    where 0 < α < 1
  – Implicit under-relaxation. When a linear equation for φ_P is formed, the diagonal is boosted and an appropriate correction is added to the r.h.s.:
\[ \frac{a_P}{\alpha} \phi_P + \sum_N a_N \phi_N = R + \frac{1 - \alpha}{\alpha} a_P \phi_P^{old} \tag{8.28} \]
  – When convergence is reached, φ_P = φ_P^old and the two terms cancel out
  – This form of under-relaxation is equivalent to time-stepping, but the “time step size” is not equal for all cells in the mesh
• Note that under-relaxation may sometimes be counter-intuitive or slow down the solution process.
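The combined effect of equation segregation (successive substitution with lagged coupling) and explicit under-relaxation can be illustrated with a deliberately tiny Python example. The two-equation system below is hypothetical and was chosen so that the unrelaxed segregated iteration diverges; it is a sketch of the idea, not a model of any particular flow solver.

```python
def segregated_solve(alpha, n_iter=60):
    """Solve the coupled pair
           x + 3 y = 4
          -x +   y = 0
    one equation at a time, with explicit under-relaxation factor alpha.
    The exact solution is x = y = 1.
    """
    x, y = 0.0, 0.0
    for _ in range(n_iter):
        x_star = 4.0 - 3.0 * y          # equation 1, coupling to y is lagged
        x = x + alpha * (x_star - x)    # add only a part of the correction
        y_star = x                      # equation 2, using the latest x
        y = y + alpha * (y_star - y)
    return x, y


x_relaxed, y_relaxed = segregated_solve(alpha=0.5)
x_plain, y_plain = segregated_solve(alpha=1.0)
print("alpha = 0.5:", x_relaxed, y_relaxed)
print("alpha = 1.0:", x_plain, y_plain)
```

Without relaxation (α = 1) the error in y is multiplied by −3 in every pass and the iteration diverges, although the linear system itself is perfectly well posed. With α = 0.5 the same sequence of segregated solves converges to the exact solution.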


Fully Coupled Approach

Block Matrix

• For cases of strong coupling between the components of a vector, the components can be solved as a block variable: (ux, uy, uz) will appear as variables in the same linear system
• In spite of the fact that the system is much larger, the coupling pattern still exists: components of u in cell P may be coupled to other components in the same point or to vector components in the neighbouring cell
• With this in mind, we can still keep the sparse addressing defined by the mesh: if a variable is a vector, a tensorial diagonal coefficient couples the vector components in the same cell. A tensorial off-diagonal coefficient couples the components of uP to all components of uN, which covers all possibilities
• For a multi-variable block solution like the compressible Navier-Stokes system above, the same trick is used: the cell variable consists of (ρ, ρu, ρe) and the coupling can be described by a 5 × 5 matrix coefficient
• Important disadvantages of a block coupled system are
  – Large linear system: several variables are handled together
  – Different kinds of physics can be present, e.g. the transport-dominated momentum equation and the elliptic pressure equation. At matrix level, it is impossible to separate them, which makes the system more difficult to solve

Nature of Coupling

• Block matrix represents complete coupling for a block variable
• We can examine cases of partial coupling by looking at degenerate forms of the coefficients. This will reveal special cases of coupling where alternatives to a fully coupled solution approach may be considered


8.4 Matrix Structure for Coupled Algorithms

Matrix Connectivity and Mesh Structure

• Irrespective of the level of coupling, the FVM dictates that a cell value will depend only on the values in surrounding cells
• We still have freedom to organise the matrix by ordering entries for various components of φ. Also, the matrix connectivity pattern may be changed by reordering the computational points



• Example: block-coupled vector equation (ux, uy, uz)
  – Per-variable organisation: first ux for all cells, followed by uy and uz. Ordering of each sub-list matches the cell ordering.
\[ a_P = \begin{bmatrix} [u_x \leftrightarrow u_x] & [u_x \leftrightarrow u_y] & [u_x \leftrightarrow u_z] \\ [u_y \leftrightarrow u_x] & [u_y \leftrightarrow u_y] & [u_y \leftrightarrow u_z] \\ [u_z \leftrightarrow u_x] & [u_z \leftrightarrow u_y] & [u_z \leftrightarrow u_z] \end{bmatrix} \tag{8.29} \]
    Diagonal blocks, e.g. [ux ↔ ux], have the size equal to the number of computational points and contain the coupling within the single component. All matrix coefficients are scalars. Off-diagonal blocks represent variable-to-variable coupling.
  – Per-cell organisation: (ux, uy, uz) for each cell. A single numbering space for all cells, but each individual coefficient is more complex: it contains the complete coupling
• Both choices have advantages, and the choice depends on software infrastructure and matrix assembly methods. In order to illustrate the nature of coupling, we shall choose per-cell organisation

Coupling Coefficient

• Consider a linear dependence between two vectors m and n. We can write a general form as
\[ \mathbf{m} = \mathbf{A} \mathbf{n} \tag{8.30} \]
We shall evaluate the shape of A for various levels of coupling. We shall think of A as a matrix coefficient in the block matrix. The diagonal matrix entry is termed AP and the off-diagonal AN. Matrix connectivity is dictated by the mesh structure
• Component-wise coupling describes the case where mx depends only on nx, my on ny and mz on nz
  1. Scalar component-wise coupling
  2. Vector component-wise coupling
  3. Full (block) coupling
• Explicit methods do not feature here because it is not necessary to express them in terms of matrix coefficients
• For reference, the linear equation for each cell featuring in the matrix reads
\[ \mathbf{A}_P \mathbf{m}_P + \sum_N \mathbf{A}_N \mathbf{m}_N = \mathbf{R} \tag{8.31} \]
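The two ways of organising a block-coupled vector unknown differ only in how a (cell, component) pair is mapped to a position in the global list of unknowns. The following Python sketch shows both index maps and the permutation between them; the function names and the tiny three-cell example are assumptions made for illustration.

```python
N_COMP = 3  # vector components (ux, uy, uz)


def per_variable_index(cell, comp, n_cells):
    """Per-variable organisation: all ux first, then all uy, then all uz."""
    return comp * n_cells + cell


def per_cell_index(cell, comp):
    """Per-cell organisation: (ux, uy, uz) of cell 0, then of cell 1, ..."""
    return cell * N_COMP + comp


def reorder_to_per_cell(per_variable_values, n_cells):
    """Permute a list of unknowns from per-variable to per-cell ordering."""
    out = [None] * (N_COMP * n_cells)
    for cell in range(n_cells):
        for comp in range(N_COMP):
            src = per_variable_index(cell, comp, n_cells)
            out[per_cell_index(cell, comp)] = per_variable_values[src]
    return out


n_cells = 3
per_variable = ["ux0", "ux1", "ux2",
                "uy0", "uy1", "uy2",
                "uz0", "uz1", "uz2"]
print(reorder_to_per_cell(per_variable, n_cells))
```

Applying the same permutation to the rows and columns of the matrix changes its layout (large scalar blocks versus small dense 3 × 3 coefficients) but not the linear system being solved.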


Scalar-Implicit Coupling

• In scalar implicit coupling, components of m at P do not depend on each other. Thus, AP and AN are diagonal tensors:
\[ \mathbf{A} = \begin{bmatrix} a_{xx} & 0 & 0 \\ 0 & a_{yy} & 0 \\ 0 & 0 & a_{zz} \end{bmatrix} \tag{8.32} \]
• In most terms
\[ a_{xx} = a_{yy} = a_{zz} = a \tag{8.33} \]
or
\[ \mathbf{A} = a \mathbf{I} \tag{8.34} \]
• In this case, the “block system” represents 3 equations written together but not interacting: the block notation for the system is misleading for the level of coupling present in discretisation
• This leads towards a segregated method: we have three independent equations written together. Lack of off-diagonal coefficients indicates the absence of component-to-component coupling
• Example of scalar coefficient terms: temporal derivative, diagonal and off-diagonal of convection and diffusion with scalar diffusivity

Block-Point Implicit Coupling

• In block-point implicit coupling, the components of a vector variable m depend on each other in the same computational point, but each individual component depends only on the neighbouring value of the same component
• Thus:
  – In point P, mx depends on itself, my and mz. Thus, the diagonal coefficient AP would be a full 3 × 3 matrix
\[ \mathbf{A}_P = \begin{bmatrix} a_{xx} & a_{xy} & a_{xz} \\ a_{yx} & a_{yy} & a_{yz} \\ a_{zx} & a_{zy} & a_{zz} \end{bmatrix} \tag{8.35} \]



  – In the off-diagonal, mx for location P will depend only on mx at N, creating a diagonal-only coefficient.
\[ \mathbf{A}_N = \begin{bmatrix} a_{xx} & 0 & 0 \\ 0 & a_{yy} & 0 \\ 0 & 0 & a_{zz} \end{bmatrix} \tag{8.36} \]
  – As before, in most cases, the diagonal components are identical.
\[ \mathbf{A}_N = a \mathbf{I} \tag{8.37} \]
    The first form is typical for anisotropic porous media.
  – In this situation, the “transport” part of the system (as depicted by AN) exhibits segregated behaviour, combined with a point-coupled problem for each computational point

Scalar-Point Vector-Implicit Coupling

• In the third combination, local point components of m are decoupled, but the coupling to the neighbouring locations is complete. Thus
\[ \mathbf{A}_P = \begin{bmatrix} a_{xx} & 0 & 0 \\ 0 & a_{yy} & 0 \\ 0 & 0 & a_{zz} \end{bmatrix} \tag{8.38} \]
and
\[ \mathbf{A}_N = \begin{bmatrix} a_{xx} & a_{xy} & a_{xz} \\ a_{yx} & a_{yy} & a_{yz} \\ a_{zx} & a_{zy} & a_{zz} \end{bmatrix} \tag{8.39} \]
• Such cases are relatively rare and typically appear from tensorial diffusion problems and in some cases of rotational coupling

Full Block Coupling

• In full block coupling, each component of m depends on all other components both in the local and neighbouring computational points. Thus, both the diagonal and off-diagonal coefficients take full tensor form:
\[ \mathbf{A}_P = \begin{bmatrix} a_{xx} & a_{xy} & a_{xz} \\ a_{yx} & a_{yy} & a_{yz} \\ a_{zx} & a_{zy} & a_{zz} \end{bmatrix} \tag{8.40} \]
\[ \mathbf{A}_N = \begin{bmatrix} a_{xx} & a_{xy} & a_{xz} \\ a_{yx} & a_{yy} & a_{yz} \\ a_{zx} & a_{zy} & a_{zz} \end{bmatrix} \tag{8.41} \]
(note that component values will be different between the two)

• This is the most complex form of coupling, where “everything is related to everything else” [Lenin]

Composite Variables

• In some equations, the system will be coupled not only across the components of vectors and tensors, but also across different variables. In such cases, we may write a composite variable formulation, where all equations are grouped together into a single equation
• The fact that a composite variable is not a Cartesian tensor needs to be kept in mind. Calculation of gradients, divergence etc. is no longer trivial: the physical meaning of the field needs to be taken into account
• Example: compressible Navier-Stokes equations
\[ \mathbf{U} = \begin{bmatrix} \rho \\ \rho \mathbf{u} \\ \rho e \end{bmatrix} \tag{8.42} \]
• Note that U above holds 5 scalar values: 1 for the density, 3 momentum components (ρux, ρuy, ρuz) and one for energy
• This tactic makes sense only if the variables are strongly coupled to each other. Thus, full block coupling typically appears for such systems

Non-Linear Coupling

• Additional complications will arise for cases where the matrix coefficients are also a function of the solution: non-linearity
• Example: convection term in the momentum equation ∇•(u u). Here, components of AP and AN depend on the solution itself, thus creating a non-linear system
• Standard methods, like the Newton linearisation, require the evaluation of the Jacobian, which is complex and costly. In reality, simple linearisation is used most often: evaluate AP and AN based on the current value of u and re-calculate u.

Saddle Block Systems

• A system of equations central to our interest (incompressible Navier-Stokes equations) has a worrying property: wrong equations!


  – Unknowns: velocity vector u (3 vector components) and pressure p (scalar)
  – Equations: momentum equation (3 vector components)
\[ \frac{\partial \mathbf{u}}{\partial t} + \nabla \bullet (\mathbf{u}\mathbf{u}) - \nabla \bullet (\nu \nabla \mathbf{u}) = -\nabla p \tag{8.43} \]
  – Continuity equation:
\[ \nabla \bullet \mathbf{u} = 0 \tag{8.44} \]
  – The continuity equation sets a condition on the velocity divergence ∇•u, which is a scalar: this makes it a scalar equation
  – Formally, we have one vector equation, one vector unknown and one scalar equation
  – . . . but the scalar equation is given in terms of u and not p!!!
• This kind of system is termed the saddle-point system: equations that govern p do not depend on it. Formally, we can write the system as follows:
\[ \begin{bmatrix} [A_u] & [\nabla(.)] \\ [\nabla \bullet (.)] & [0] \end{bmatrix} \begin{bmatrix} \mathbf{u} \\ p \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{8.45} \]
Note the absence of entries for p on the matrix diagonal! Off-diagonal blocks actually represent the discretised form of the gradient and divergence operator, multiplied by p and u, respectively. The diagonal block [Au] contains the discretised form of the momentum equation, excluding the pressure gradient term
• While there exists a large set of zero diagonal entries, this matrix can be solved. However, a naive solution method would require a direct linear equation solver, making it extremely expensive. We shall look for cheaper and faster solution methods
• In compressible flows, the density-pressure relationship replaces the zero diagonal block. However, as we approach the incompressibility limit, the system approaches the saddle-point form
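One way to see that a saddle-point system is solvable despite its zero diagonal block is block elimination: the momentum block is used to eliminate the velocity, which leaves a scalar equation for the pressure. The Python sketch below does this for a deliberately tiny, hypothetical system (two velocity unknowns, one pressure unknown, diagonal momentum block, non-zero momentum source f). It illustrates the matrix structure only and is not the algorithm of any specific pressure-velocity coupling method.

```python
def solve_saddle_point(a_diag, grad, div, f):
    """Solve  [A_u  G] [u]   [f]
              [D    0] [p] = [0]
    for a diagonal A_u (given as a_diag) and a single pressure unknown.

    grad (G) and div (D) are lists with one entry per velocity unknown.
    """
    inv_a = [1.0 / a for a in a_diag]
    # Pressure equation: (D A_u^-1 G) p = D A_u^-1 f
    schur = sum(d * ia * g for d, ia, g in zip(div, inv_a, grad))
    rhs = sum(d * ia * fi for d, ia, fi in zip(div, inv_a, f))
    p = rhs / schur
    # Velocity recovered from the momentum equation with the known pressure
    u = [ia * (fi - g * p) for ia, fi, g in zip(inv_a, f, grad)]
    return u, p


# Hypothetical coefficients chosen for illustration
a_diag = [2.0, 4.0]
grad = [1.0, -1.0]
div = [-1.0, 1.0]
f = [2.0, 8.0]

u, p = solve_saddle_point(a_diag, grad, div, f)
print("u =", u, " p =", p)
print("discrete divergence:", sum(d * ui for d, ui in zip(div, u)))
```

The scalar `schur` plays the role of a pressure-equation coefficient: it is assembled entirely from the momentum, gradient and divergence blocks, which is why no direct entry for p is needed on the diagonal.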


8.5 Coupling in Model Equation Sets

Porous Media: Darcy’s Equation

• Solution is governed by the Laplace equation: easy, simple and cheap to solve
• The nature of the equation dictates that every point in the domain influences every other point: elliptic nature of the equation. This can be seen in the operation of iterative solvers: a large number of sweeps is needed because the information is global
• For directed resistance, γ may be different in different directions, but the above still holds

Linear Stress Analysis

• The equation is linear and easy to solve. No convection term = symmetric matrix
• The significant new term in the system is ∇•[µ(∇d)^T]. It can be shown that it represents rotation, coupling the components of d to each other
• In solid body rotation, the components of the vector change together: strong inter-dependency of vector components
• Note that a segregated solution approach is very detrimental in this case. This would imply decoupling the vector components of d and lagging cross-component coupling. As a result, an initially linear problem is “non-linearised”, potentially massively increasing solution cost

Incompressible Navier-Stokes

• Velocity coupled to itself: non-linear convection term
• Pressure coupled to velocity in a linear way
• Notes on the form of the pressure
  – Stress term is modelled using the velocity gradient ∇u
  – Pressure is the spherical part of the stress tensor
  – The continuity equation specifies the condition on the divergence of velocity, which is the trace of the gradient tensor
  – Thus, the role of the pressure is to make sure the velocity is divergence free
• Simple solution methods will not work due to a zero diagonal block in pressure equations: need specialised pressure-velocity coupling algorithms

Compressible Navier-Stokes

• Complex coupling:
  – Density appears in the momentum equation and velocity in the continuity equation
  – Compressibility effect (speed of sound) changes the nature of the density-momentum coupling
  – Energy affects density through the equation of state, with feed-back both directly through the density and the momentum
• Close coupling between the equations is recognised in the block form. Rewriting the same equations to emphasise strong coupling:
\[ \frac{\partial \mathbf{U}}{\partial t} + \nabla \bullet \mathbf{F} - \nabla \bullet \mathbf{V} = 0 \tag{8.46} \]
where the solution variable U is:
\[ \mathbf{U} = \begin{bmatrix} \rho \\ \rho \mathbf{u} \\ \rho e \end{bmatrix} \tag{8.47} \]
the convective flux F is:
\[ \mathbf{F} = \begin{bmatrix} \rho \mathbf{u} \\ \rho \mathbf{u}\mathbf{u} + p \mathbf{I} \\ (\rho e + p) \mathbf{u} \end{bmatrix} \tag{8.48} \]
and the diffusive flux V reads:
\[ \mathbf{V} = \begin{bmatrix} 0 \\ \boldsymbol{\sigma} \\ \boldsymbol{\sigma} \bullet \mathbf{u} - \mathbf{q} \end{bmatrix} \tag{8.49} \]
• The above emphasises the fact that the face flux of the system (mass, momentum, energy) needs to be evaluated together: it depends on (ρ, u, e)_left and (ρ, u, e)_right
• At the same time, the coupled system hides the issues with the coupling at low speed. For example, a pressure difference of 3 to 5 Pa can drive a significant amount of flow. The associated density difference (air at atmospheric conditions) is of the order of 5 × 10^−5 kg/m^3 at a mean density of 1.176829 kg/m^3, which causes numerical problems
• Note that in the limit of incompressibility, decoupling between density and pressure complicates the numerical approach
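The order of magnitude of the density difference quoted above can be checked with the ideal gas law, ρ = P/(R T). The Python sketch below assumes a specific gas constant of 287 J/(kg K) and a temperature of 300 K for air; these two values are assumptions of this example (they happen to reproduce the quoted mean density at standard atmospheric pressure), and the ideal gas law stands in for the general equation of state ρ = ρ(P, T).

```python
R_AIR = 287.0      # J/(kg K), assumed specific gas constant of air
T_AIR = 300.0      # K, assumed temperature
P_ATM = 101325.0   # Pa, standard atmospheric pressure


def density(pressure, temperature=T_AIR):
    """Ideal gas density rho = P / (R T)."""
    return pressure / (R_AIR * temperature)


rho_mean = density(P_ATM)
print("mean density [kg/m^3]:", rho_mean)
for dp in (3.0, 5.0):
    d_rho = density(P_ATM + dp) - rho_mean
    print("dp =", dp, "Pa -> d_rho =", d_rho, "kg/m^3,",
          "relative change =", d_rho / rho_mean)
```

The relative density change is a few parts in 10^5, so a density-based formulation has to resolve the driving force of the flow in the fifth significant digit of ρ; this is the loss of accuracy the text refers to.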

k − ε Turbulence Model

• Both equations are source-dominated, with relatively short time-scales: turbulence transported from elsewhere quickly dissipates
• Left on its own (no mean shear), the system quickly tends to the “no turbulence” solution: k = 0, ε = 0
• In most turbulence models, local balance of turbulence production and destruction dominates over the transport: equations are said to be source-dominated. This makes them easy to solve: local effect
• Equation coupling is highly non-linear. Generation term
\[ G = C_\mu \frac{k^2}{\epsilon} \left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right] : \nabla \mathbf{u} \tag{8.50} \]
ε-equation sources and sinks:
\[ S_\epsilon = C_1 \frac{\epsilon}{k} G - C_2 \frac{\epsilon^2}{k} \tag{8.51} \]
Note the various k^2 and ε^2 terms in the equations!
• Non-linearity is further (massively) complicated by the introduction of the momentum equation, influenced through the effective viscosity µ_eff = µ + µ_t and
\[ \mu_t = C_\mu \frac{k^2}{\epsilon} \tag{8.52} \]
• In segregated solution methods, the two equations are solved consecutively without major coupling problems. In reality, either k or ε will over-shoot and stabilise the system
• In external aerodynamics (aerospace) flows with coupled solvers and large time-steps, it sometimes pays to solve the equations in a coupled manner. However, the nature of the equations indicates the largest benefit from local source coupling, followed by a transport step: see Multi-Step Approach below

Chemical Reactions

• Coupling is dependent on the reaction rate. For systems with fast reactions, interaction between local quantities may totally dominate
• Stiffness and behaviour of the system critically depend on the choice of reactions, species (or pseudo-species) and the time-step
• In most cases, the system is source-dominated, but inter-equation coupling issues may be extremely severe. Depending on the problem, use of non-linear stiff system solvers may be required




Special Coupling Algorithms

• For significant equation sets like fluid flow or magneto-hydrodynamics, we can also devise special solution algorithms based on the detailed understanding of the physics. These may be orders of magnitude faster or more memory-efficient than the above approaches
• Examples of such algorithms are multi-step algorithms for chemical reactions and pressure-velocity coupling algorithms like SIMPLE and PISO in fluid flows

Multi-Step Approach

• In chemical reactions, it regularly happens that the system of reaction rates creates a strongly coupled and non-linear system that requires a non-linear solver
• At the same time, the transport part of the system is easy to solve. However, a combination of non-linear source coupling and transport would result in a very large and strongly non-linear system
• Such systems are solved in 2 steps:
  – Reaction step. Solution of the local non-linear coupling with frozen transport terms: one system per computational point. The system captures all coupled species and resolves local effects
  – Transport step. Once the coupling is resolved, the reaction terms are frozen and the transport is solved in the standard manner
• If necessary, the steps can be repeated until convergence

Pressure-Velocity Coupling

• Pressure-velocity coupling algorithms stem from the incompressible Navier-Stokes equations and separate into 2 parts:
  – Assembly of the pressure equation from the divergence condition
  – Coupling between the momentum and pressure equations
• Variants of pressure-velocity coupling tend to agree on the formulation of the pressure equation but differ in the way the coupling is established, as will be presented in future chapters
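The two-step procedure described under Multi-Step Approach above (a local reaction step with frozen transport, followed by a transport step with frozen reaction) can be sketched for a single species in one dimension. Everything specific in this Python example is an assumption made for illustration: the reaction dc/dt = −k c^2, the explicit diffusion update with zero-gradient ends, and the numerical values.

```python
def reaction_step(c, k, dt):
    """Local step, one independent problem per cell: integrate
    dc/dt = -k c^2 exactly over dt while transport is frozen."""
    return [ci / (1.0 + k * ci * dt) for ci in c]


def transport_step(c, diff_number):
    """Transport step with the reaction frozen: explicit 1-D diffusion with
    zero-gradient ends. diff_number = gamma*dt/dx^2 should not exceed 0.5."""
    n = len(c)
    new = list(c)
    for i in range(n):
        left = c[i - 1] if i > 0 else c[i]
        right = c[i + 1] if i < n - 1 else c[i]
        new[i] = c[i] + diff_number * (left - 2.0 * c[i] + right)
    return new


def multi_step_solve(c0, k, dt, diff_number, n_steps):
    c = list(c0)
    for _ in range(n_steps):
        c = reaction_step(c, k, dt)          # resolve the local coupling first
        c = transport_step(c, diff_number)   # then transport the result
    return c


# Uniform initial field: transport has nothing to do, so the result must
# match the analytical reaction solution c0 / (1 + k c0 t)
c_uniform = multi_step_solve([2.0] * 5, k=1.5, dt=0.1, diff_number=0.25, n_steps=10)
print(c_uniform)

# Non-uniform initial field: reaction and transport both act
c_peak = multi_step_solve([0.0, 0.0, 4.0, 0.0, 0.0], k=1.5, dt=0.1,
                          diff_number=0.25, n_steps=10)
print(c_peak)
```

With several species, `reaction_step` would become a small stiff non-linear solve per cell, while `transport_step` would stay a standard linear transport solution for each species.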

Part III Numerical Simulation of Fluid Flows

Chapter 9 Governing Equations of Fluid Flow
In this chapter, we will revisit the governing equations of fluid flow and the various levels of simplification used in engineering practice. Some simplifications are voluntary (e.g. steady-state) and some follow from the physical behaviour or flow characteristics (e.g. incompressible flow, turbulence). All simplified forms and levels of approximation shown below are used in fluid flow simulations. Simpler forms are not only quick and easy to compute, but can also be used as an initial guess for a more complete level of approximation.


Compressible Navier-Stokes Equations

• Solution variables: density $\rho$, momentum $\rho\mathbf{u}$ and energy $\rho e$
• Continuity equation:
\[ \frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0 \tag{9.1} \]

– Rate of change and convection: mass transport. The two terms are sometimes grouped into a substantial derivative
– Mass sources and sinks would appear on the r.h.s.
– Note the absence of a diffusion term: mass does not diffuse
– Coupling with the momentum equation: rate of change of $\rho$ depends on the divergence of $\rho\mathbf{u}$
• Momentum equation:
\[ \frac{\partial(\rho\mathbf{u})}{\partial t} + \nabla\cdot(\rho\mathbf{u}\mathbf{u}) - \nabla\cdot\left[\mu\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right)\right] = \rho\mathbf{g} - \nabla\left(P + \frac{2}{3}\mu\nabla\cdot\mathbf{u}\right) \tag{9.2} \]
– Substantial derivative
– Non-linear convection term: $\nabla\cdot(\rho\mathbf{u}\mathbf{u})$. This term provides the wealth of interaction in fluid flows
– Diffusion term contains viscous effects
• Energy equation:
\[ \frac{\partial(\rho e)}{\partial t} + \nabla\cdot(\rho e\mathbf{u}) - \nabla\cdot(\lambda\nabla T) = \rho\mathbf{g}\cdot\mathbf{u} - \nabla\cdot(P\mathbf{u}) - \nabla\cdot\left(\frac{2}{3}\mu(\nabla\cdot\mathbf{u})\mathbf{u}\right) + \nabla\cdot\left[\mu\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right)\cdot\mathbf{u}\right] + \rho Q \tag{9.3} \]
– Note that the diffusion term is given in terms of temperature $T$, not energy: for non-constant material properties, this may be problematic
– r.h.s. contains a number of terms related to the work from the stress tensor
– Weaker coupling to the rest of the system: $e$ and $T$ influence $\rho$ and $\mathbf{u}$ through the equation of state
• Equation of state:
\[ \rho = \rho(P, T) \tag{9.4} \]
– Relationship between density $\rho$ and pressure $P$
• Transport coefficients $\lambda$ and $\mu$ are also functions of the thermodynamic state variables:
\[ \lambda = \lambda(P, T) \tag{9.5} \]
\[ \mu = \mu(P, T) \tag{9.6} \]

– Properties of real gases and liquids are rarely used in tabular form. Instead, measured data is curve-fitted by standard sources: JANAF, NIST, etc.
– Variation of material properties is usually a smooth function and does not introduce significant non-linear problems. Issues sometimes occur when the state changes significantly in a single time-step. Here, the initial guess for the new state may be far away from the solution, causing an excessive number of search iterations


Flow Classification based on Flow Speed

• Flow-related compressibility effects are measured by comparing the flow speed with the speed of sound



• Velocities to compare are the convective velocity and the speed with which a weak pressure wave travels through the medium
• When the convective speed reaches and exceeds the speed of sound, the mode of propagation of information changes significantly: shocks

Speed Range       Mach Number
low subsonic      < 0.3
high subsonic     0.3 − 0.6
transonic         0.6 − 1.1
supersonic        1 − 5
hypersonic        > 5

Low Subsonic Flow
• Pressure changes driving the flow are sufficiently slow to cause minimal changes in the density
• As a consequence, flow may be considered constant density, allowing all equations to be divided through by the density and setting
\[ \frac{\partial\rho}{\partial t} = 0 \]
• In special cases, effects like buoyancy-driven flow can be modelled in the same way: driving force from buoyancy is treated as a body force without changing the density

High Subsonic Flow
• Flow-induced density variation is significant, but without transonic flow pockets. In other words, the convective effects in the pressure distribution are significant but not dominating
• A similar situation appears in flows where engineering machinery is designed to increase the pressure (density) mechanically. Example: internal combustion engine (compression-expansion)
• This formulation is sometimes called the variable density formulation

Transonic Flow
• Inlet/outlet conditions are typically subsonic, but with pockets of supersonic flow
• In some parts of the flow, the convective effects are dominant
• Because of the mix of elliptic and hyperbolic nature, transonic cases are usually the most difficult to compute

Supersonic Flow

• Boundary conditions are typically supersonic, with pockets of subsonic flow. Subsonic regions are usually captured close to walls or moving obstacles

Hypersonic Flow
• At very high speeds, the simple formulation of the equation of state breaks down and more complex laws are needed
• Apart from the increasingly complex equation of state, the flow is basically supersonic, with the same limitations on the specification of boundary conditions
• Forms of equation of state:
– Perfect gas. Flow regime still Mach number independent, but there are problems with adiabatic wall conditions
– Two-temperature ideal gas. Rotational and vibrational motion of the molecules needs to be separated and leads to two-temperature models. Used in supersonic nozzle design
– Dissociated gas. Multi-molecular gases begin to dissociate at the bow shock of the body
– Ionised gas. The ionised electron population of the stagnated flow becomes significant, and the electrons must be modelled separately: electron temperature. Effect important at speeds of 10-12 km/s
• This flow regime is encountered in engineering machinery where the speed of sound is reduced (rarefied gas), or in space vehicle re-entry aerodynamics
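The speed ranges listed in this section can be captured in a small helper. This is only a sketch: the function names are ours, and because the quoted transonic (0.6 to 1.1) and supersonic (1 to 5) ranges overlap, the code assumes that the overlap is assigned to the transonic regime.

```python
def mach_number(flow_speed, speed_of_sound):
    """Mach number: convective speed relative to the speed of sound."""
    return flow_speed / speed_of_sound


def classify_flow(mach):
    """Map a Mach number onto the speed ranges of the table above.

    Assumption: values between 1 and 1.1 are reported as transonic."""
    if mach < 0.0:
        raise ValueError("Mach number must be non-negative")
    if mach < 0.3:
        return "low subsonic"
    if mach < 0.6:
        return "high subsonic"
    if mach < 1.1:
        return "transonic"
    if mach <= 5.0:
        return "supersonic"
    return "hypersonic"


# Illustrative values: 68 m/s in air with an assumed speed of sound of 340 m/s
regime = classify_flow(mach_number(68.0, 340.0))
```

The limits are indicative rather than sharp: as the text notes, what matters physically is whether supersonic pockets and shocks appear in the flow.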


Steady-State or Transient

• In engineering machinery, and especially in fluid flow simulations, we are regularly interested in the mean or time-averaged properties. Example: mean lift and drag on an airfoil or the mean pressure drop in a pipe. Physically, such simulations should involve calculating a time-dependent flow and performing an appropriate averaging procedure, as is the case in experimental studies
• Operations on the mathematical equations governing the system allow a different approach: assemble the equations for time-averaged (instead of instantaneous) properties and solve them. In principle, this should provide a mean (time-averaged) solution without further manipulation



• Unfortunately, in engineering practice, steady-state approximation is used indiscriminately: having an aircraft flying at cruising speed and altitude, with constant atmospheric conditions does not imply that the flow is steady or even that lift and drag remain constant • In true steady-state simulations, the value of time derivative in all equations reduces to zero. However, forcing this on cases where it will not physically happen leads to numerical problems, including “lack of convergence” • Example: approximations and numerical difficulties of steady state: vortex shedding behind a cylinder in laminar flow • For some transient cases, with a well ordered time response, additional time-response simplifications are possible. Example: frequency-based decomposition in turbomachinery simulations, where frequency is determined from the number of stator and rotor passages


Incompressible Formulation

• Decoupling the dependence of density on pressure, also resulting in the decoupling of the energy equation from the rest of the system
• Equations can be solved both in the velocity-density or velocity-pressure formulation
– Velocity-density formulation does not formally allow for Ma = 0 (or $c = \infty$), but strictly speaking this limit is never reached. In practice, matrix preconditioning techniques are used to overcome zero diagonal coefficients
– Velocity-pressure formulation does not suffer from the low-Ma limit, but performs considerably worse at high Ma number
\[ \frac{\partial\mathbf{u}}{\partial t} + \nabla\cdot(\mathbf{u}\mathbf{u}) - \nabla\cdot(\nu\nabla\mathbf{u}) = -\nabla p \tag{9.7} \]
\[ \nabla\cdot\mathbf{u} = 0 \tag{9.8} \]


Inviscid Formulation

• Relative influence of convective and viscous effects is measured by the Reynolds number (Re). • Inviscid formulation implies infinite Re number. In reality, viscous effects are only important in the vicinity of walls. Also, this simplification would have important effects on turbulence dynamics, described below



• A popular simplified form of equations used in the past is a combination of an inviscid flow solver in the far field coupled with a boundary layer solver in the near-wall region


Potential Flow Formulation

• Fast turnaround: panel method simulations, sometimes coupled with a boundary layer solver • Still useful in engineering practice: initialisation of the flow field, speeding up convergence


Turbulent Flow Approximations

Turbulent Flow
• Navier-Stokes equations represent fluid flow in all necessary detail. However, the span of scales in the flow is considerable
• Nature of turbulent flow is such that it is possible to separate the mean signal from turbulence interaction
• Example: turbulent flow around Airbus A380
– Largest scale of interest is based on the scale of engineering machinery: overall length (79.4 m), wing span (79.8 m). In practice, the wake behind the aircraft is also of interest
– In turbulent flows, energy is introduced into large scales and, through the process of vortex stretching, transferred into smaller scales. Most dissipation of turbulence energy into heat happens at the smallest scales
– The size of the smallest scale of interest is estimated from the size of a vortex which would dissipate the energy it contains in one revolution. The scale depends on Re number, but an estimate would be obtained from the Kolmogorov micro-scale:
\[ \eta = \left(\frac{\nu^3}{\epsilon}\right)^{\frac{1}{4}} \]
where $\eta$ is the scale, $\nu$ is the kinematic viscosity and $\epsilon$ is the dissipation rate (equal to the production rate). For our case, this will be well below a millimetre; additionally, include the requirement for time-accurate simulation and averaging
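To give a feeling for the numbers, the Kolmogorov estimate can be evaluated directly. The values below are assumptions chosen for illustration only (a kinematic viscosity typical of air and a hypothetical dissipation rate); they are not data for the A380 case.

```python
def kolmogorov_scale(nu, epsilon):
    """Kolmogorov micro-scale eta = (nu^3 / epsilon)^(1/4), in metres."""
    return (nu ** 3 / epsilon) ** 0.25


nu_air = 1.5e-5        # kinematic viscosity [m^2/s], assumed value for air
epsilon = 10.0         # dissipation rate [m^2/s^3], hypothetical value
aircraft_length = 79.4 # largest scale of interest [m], from the text

eta = kolmogorov_scale(nu_air, epsilon)
scale_ratio = aircraft_length / eta   # span of scales in one direction
cells_3d = scale_ratio ** 3           # rough cell count if every scale were resolved
```

With these assumed inputs the smallest scale is a fraction of a millimetre, and the one-directional scale ratio is of the order of hundreds of thousands, which must still be cubed for a 3-D mesh.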



• In order to resolve the flow in all of its detail, the full range of scales needs to be simulated. The range of scales in turbulent flow at high Re is well beyond the capabilities of modern computers, which leads to turbulence modelling

Level of Approximation
• Direct Numerical Simulation (DNS). Full range of scales is simulated: 3-D and time-dependent simulation, with the need for averaging
• Reynolds Averaged Navier-Stokes Equations (RANS). Velocity and pressure (density) are decomposed into the mean and oscillating component
\[ \mathbf{u} = \overline{\mathbf{u}} + \mathbf{u}' \tag{9.10} \]
\[ p = \overline{p} + p' \tag{9.11} \]

Substituting the above into the Navier-Stokes equations and averaging yields the equations in terms of the mean properties $\overline{\mathbf{u}}$ and $\overline{p}$; the remaining correlation of the fluctuations introduces a closure problem.
• Large Eddy Simulation (LES). LES recognises the fact that turbulence on larger scales depends on the geometry and flow details, while smaller scales act mainly as the energy sink. By nature, smaller scales are more isotropic and homogeneous and thus easier to model. Therefore, we shall aim to decompose the flow into larger scales, which are resolved, and model the effect of smaller scales. Simulation is 3-D and time-resolved and requires averaging.
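The origin of the closure problem in the Reynolds decomposition can be shown on two synthetic signals. The sketch below is our own illustration with made-up sinusoidal "fluctuations": it checks that the mean of the fluctuation vanishes, while the mean of a product differs from the product of the means by exactly the correlation of the fluctuations.

```python
import math


def mean(samples):
    """Arithmetic average, standing in for the time average."""
    return sum(samples) / len(samples)


n = 1000
phase = [2.0 * math.pi * k / n for k in range(n)]       # one full period
u = [2.0 + 0.5 * math.sin(p) for p in phase]            # synthetic signal u
v = [1.0 + 0.3 * math.sin(p) for p in phase]            # synthetic signal v

u_mean, v_mean = mean(u), mean(v)
u_fluct = [x - u_mean for x in u]                       # u' = u - mean(u)
v_fluct = [x - v_mean for x in v]                       # v' = v - mean(v)

mean_of_product = mean([a * b for a, b in zip(u, v)])
product_of_means = u_mean * v_mean
correlation = mean([a * b for a, b in zip(u_fluct, v_fluct)])  # mean(u'v')
```

The extra term `correlation` is the scalar analogue of the Reynolds stress: it cannot be computed from the mean values alone, which is why a turbulence model is needed.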


Direct Numerical Simulation

• Main source of comparison data for simple and canonical flows (e.g. homogeneous isotropic turbulence, incompressible and compressible turbulent boundary layer, simple geometries)
• DNS has completely replaced experimental methods at this level because it provides complete information and the numerics have proven sufficiently accurate
• Current push towards compressible flows and simple chemical reactions, e.g. interaction between turbulent mixing and flame wrinkling in premixed combustion
• Typical level of discretisation accuracy: 6th order in space and 10th order in time. Critical for accurate high-order correlation data
• Extremely expensive simulations: pushing the limits of computing power




Reynolds Averaging Approach

Reynolds Averaging
• Reynolds averaging removes a significant component of unsteady behaviour: all transient effects that can be described as “turbulence” are removed by the manipulation of equations
• Note that $\overline{\mathbf{u}}$ and $\overline{p}$ are still time-dependent (separation of scales): time-dependent RANS
• It is now possible to solve directly for the properties of engineering interest: mean flow field, mean drag etc. For cases which are 2-D in the mean, it makes sense to perform 2-D simulations irrespective of the nature of turbulence
• A turbulence model is required for closure: it describes the effect of the unresolved scales on the resolved flow based on resolved flow characteristics
• This is a substantial reduction in simulation cost and has allowed the industrial adoption of CFD. RANS models are the mainstay of industrial CFD and are likely to remain so until the next change in computing power of approximately 2 orders of magnitude
• Turbulence models are just models (!) and their physical justification is often more limited than for the fundamental equations
\[ \frac{\partial\overline{\mathbf{u}}}{\partial t} + \nabla\cdot(\overline{\mathbf{u}}\,\overline{\mathbf{u}}) - \nabla\cdot(\nu\nabla\overline{\mathbf{u}}) = -\nabla\overline{p} + \nabla\cdot\mathbf{R} \tag{9.12} \]
\[ \nabla\cdot\overline{\mathbf{u}} = 0 \tag{9.13} \]
Here, $\mathbf{R}$ is the Reynolds stress tensor:
\[ \mathbf{R} = \overline{\mathbf{u}'\mathbf{u}'} \tag{9.14} \]

Reynolds Stress Closure Models
• Eddy viscosity models. Models are based on reasoning similar to Prandtl’s theory:
\[ \mathbf{R} = \nu_t\left[\nabla\overline{\mathbf{u}} + (\nabla\overline{\mathbf{u}})^T\right] \tag{9.15} \]

where $\nu_t$ is the eddy viscosity. In short, the formula specifies that the Reynolds stress tensor is aligned with the velocity gradient. Eddy viscosity is assembled through dimensional analysis, based on a characteristic length- and time-scale



• Second and higher order closure. Instead of assembling $\mathbf{R}$ based on the velocity gradient, a transport equation for the Reynolds stress is assembled by manipulating the momentum equation. However, this leads to a higher-order closure problem (new terms in the Reynolds stress transport equation) with additional uncertainty
• Near-wall treatment. Regions of sharp velocity gradients near the wall are the most demanding: high mesh resolution, controlling cell aspect ratio and time-step. Two modelling approaches:
– Integration to the wall, also known as low-Re turbulence models. Near-wall region is resolved in full detail, with the associated space resolution requirements
– Wall functions, where the region of high gradients is bridged with a special model which compensates for unresolved gradients. Model assumes equilibrium behaviour near the wall (attached fully developed flow) and significantly influences the result
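The eddy viscosity closure of Eqn. (9.15) is simple to evaluate once the eddy viscosity is known. The sketch below assumes the k-epsilon form of Eqn. (8.52) written in kinematic terms, with the commonly quoted coefficient value 0.09; the flow values (k, epsilon and the shear rate) are hypothetical and the function names are ours.

```python
C_MU = 0.09  # commonly quoted k-epsilon coefficient; treated here as an assumption


def eddy_viscosity(k, epsilon, c_mu=C_MU):
    """Kinematic eddy viscosity nu_t = C_mu * k^2 / epsilon."""
    return c_mu * k * k / epsilon


def reynolds_stress(nu_t, grad_u):
    """R = nu_t * (grad_u + grad_u^T) for a square velocity-gradient tensor."""
    n = len(grad_u)
    return [[nu_t * (grad_u[i][j] + grad_u[j][i]) for j in range(n)] for i in range(n)]


# Simple shear flow: only one non-zero velocity gradient component (10 1/s)
grad_u = [
    [0.0, 10.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

nu_t = eddy_viscosity(k=0.5, epsilon=2.0)   # hypothetical k [m^2/s^2] and epsilon [m^2/s^3]
R = reynolds_stress(nu_t, grad_u)
```

For simple shear the model returns only a shear stress and zero normal components, which illustrates the statement that the modelled Reynolds stress is aligned with the velocity gradient.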


Large Eddy Simulation

The first step in the Large Eddy Simulation (LES) modelling approach is the separation of the instantaneous value of a variable into the resolved and unresolved (modelled) component.

Mathematical Machinery
• Scale separation operation is achieved through filtering. Imagine a separation of space into small pockets and performing local averaging. The averaging operation is mathematically defined as:
\[ \overline{\mathbf{u}} = \int G(\mathbf{x}, \mathbf{x}')\,\mathbf{u}(\mathbf{x}')\,d\mathbf{x}', \tag{9.16} \]

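where $G(\mathbf{x}, \mathbf{x}')$ is the localised filter function. This can be interpreted as a local spatial average
• Effect of filtering the Navier-Stokes equations is very similar to the Reynolds averaging, but the meaning of the filtered values is considerably different
• Simulation remains 3-D and unsteady, with the need for averaging. However, demands for spatial and temporal resolution are considerably reduced, due to the fact that the smallest scales are to be modelled
\[ \frac{\partial\overline{\mathbf{u}}}{\partial t} + \nabla\cdot(\overline{\mathbf{u}}\,\overline{\mathbf{u}}) - \nabla\cdot(\nu\nabla\overline{\mathbf{u}}) = -\nabla\overline{p} + \nabla\cdot\boldsymbol{\tau} \tag{9.17} \]
\[ \nabla\cdot\overline{\mathbf{u}} = 0 \]
Here, $\boldsymbol{\tau}$ is the sub-grid stress tensor, arising from the fact that $\overline{\mathbf{u}\mathbf{u}} \neq \overline{\mathbf{u}}\,\overline{\mathbf{u}}$:
\[ \boldsymbol{\tau} = \overline{\mathbf{u}\mathbf{u}} - \overline{\mathbf{u}}\,\overline{\mathbf{u}} \tag{9.19} \]
\[ \phantom{\boldsymbol{\tau}} = \overline{(\overline{\mathbf{u}} + \mathbf{u}')(\overline{\mathbf{u}} + \mathbf{u}')} - \overline{\mathbf{u}}\,\overline{\mathbf{u}} \tag{9.20} \]
\[ \phantom{\boldsymbol{\tau}} = \left(\overline{\overline{\mathbf{u}}\,\overline{\mathbf{u}}} - \overline{\mathbf{u}}\,\overline{\mathbf{u}}\right) + \left(\overline{\overline{\mathbf{u}}\,\mathbf{u}'} + \overline{\mathbf{u}'\,\overline{\mathbf{u}}}\right) + \overline{\mathbf{u}'\mathbf{u}'} \tag{9.21} \]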

(The three groups of terms are, respectively, the Leonard stress, the grid-to-subgrid energy transfer and the sub-grid Reynolds stress.)

Sub-Grid Scale (SGS) Model
• The idea of LES is to separate the scales of turbulence such that only small scales are modelled, whereas energetic and geometrical scales are resolved by simulation. Small scale turbulence is closer to isotropic and homogeneous, making it easier to model
• A number of modelling paradigms exist, based on different ways of extracting the information about sub-grid scales (SGS). Since the main role of SGS models is to remove the energy of the resolved scales, the overall result is only weakly influenced by the SGS model, provided the correct rate of energy removal is accounted for
• In practice, most SGS models are based on eddy viscosity, sometimes with additional transport or back-scatter effects

Numerical Model and Simulation Framework
• Numerical errors introduced by discretisation are typically diffusive in nature. In other words, the discretisation error will act as additional diffusivity in the system
• At the same time, it is the role of the SGS model to control the energy dissipation at the correct physical rate: this implies the importance of reducing numerical errors to a minimum
• The older school of LES required the same accuracy of spatial and temporal discretisation as in DNS. Recent studies show this is excessive: higher moments are typically not of interest
• On balance, good second-order discretisation and unstructured mesh handling for complex geometries provide a good compromise between accuracy, speed and resolution requirements
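The filtering operation of Eqn. (9.16) and the origin of the sub-grid stress can be illustrated in one dimension. The sketch below is a toy example under our own assumptions: a periodic signal made of one large and one small scale, and a top-hat (box) filter spanning five mesh points as the filter function G.

```python
import math


def box_filter(field, half_width):
    """Top-hat filter on a periodic 1-D mesh: local average over
    2*half_width + 1 neighbouring points."""
    n = len(field)
    width = 2 * half_width + 1
    return [
        sum(field[(i + j) % n] for j in range(-half_width, half_width + 1)) / width
        for i in range(n)
    ]


n = 64
x = [2.0 * math.pi * i / n for i in range(n)]
# Synthetic "velocity": a large-scale wave plus a small-scale wave
u = [math.sin(xi) + 0.3 * math.sin(12.0 * xi) for xi in x]

u_bar = box_filter(u, 2)                         # resolved (filtered) field
uu_bar = box_filter([a * a for a in u], 2)       # filtered product
tau = [a - b * b for a, b in zip(uu_bar, u_bar)] # 1-D analogue of the sub-grid stress
```

The filter leaves the large-scale wave almost untouched and strongly damps the small-scale one; `tau` is non-zero wherever small-scale content was removed, which is the contribution an SGS model has to supply.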

Chapter 10 Pressure-Velocity Coupling
In this chapter, we shall examine the nature of pressure-velocity coupling and review numerical algorithms that handle the fluid flow equations in the most efficient manner. The algorithms can be divided into pressure-based and density-based algorithms, with segregated and coupled solution methods.


Nature of Pressure-Velocity Coupling

Discretisation Procedure for Fluid Flow Equations
• In previous chapters, we have presented a discretisation procedure for transport equations for scalars and vectors. Additionally, we have presented a method for handling coupled equation sets and linear equation solver technology
• In density-based algorithms, the methodology is satisfactory: solving a single transport equation for a block variable, where the flux of mass, momentum and energy depends on the complete set of state variables
• However, the machinery does not seem to be complete for pressure-based systems. We shall examine this further, starting from the incompressible Navier-Stokes equations, extend it to compressible flow and compare with the density-based solvers

Momentum Equation
• Momentum equation is in the standard form and the discretisation of individual terms is clear. This is the incompressible form, assuming $\rho$ = const. and $\nabla\cdot\mathbf{u} = 0$ (demonstrate):
\[ \frac{\partial\mathbf{u}}{\partial t} + \nabla\cdot(\mathbf{u}\mathbf{u}) - \nabla\cdot(\nu\nabla\mathbf{u}) = -\nabla p \tag{10.1} \]



• The non-linearity of the convection term, $\nabla\cdot(\mathbf{u}\mathbf{u})$, can be easily handled by an iterative algorithm, until a converged solution is reached
• The limiting factor is the pressure gradient: $\nabla p$ appears as the source term and for known $p$ there would be no issues

Continuity Equation
• Continuity equation states that mass will neither be created nor destroyed. In incompressible flows, by definition $\rho$ = const., resulting in the incompressible form of the continuity equation:
\[ \nabla\cdot\mathbf{u} = 0 \tag{10.2} \]

• Note: this is a scalar field equation in spite of the fact that $\mathbf{u}$ is a vector field!

Pressure – Momentum Interaction
• Counting the equations and unknowns, the system seems well posed: 1 vector and 1 scalar field governed by 1 vector and 1 scalar equation
• Linear coupling exists between the momentum equation and continuity. Note that $\mathbf{u}$ is a vector variable governed by the vector equation. Continuity equation imposes an additional criterion on velocity divergence ($\nabla\cdot\mathbf{u}$). This is an example of a scalar constraint on a vector variable, as $\nabla\cdot\mathbf{u}$ is a scalar
• Non-linear $\mathbf{u}-\mathbf{u}$ interaction in the convection is unlikely to cause trouble: use an iterative solution technique. In practice
\[ \nabla\cdot(\mathbf{u}\mathbf{u}) \approx \nabla\cdot(\mathbf{u}^o\mathbf{u}^n) \tag{10.3} \]
where $\mathbf{u}^o$ is the currently available solution or an initial guess and $\mathbf{u}^n$ is the “new” solution. The algorithm cycles until $\mathbf{u}^o = \mathbf{u}^n$

Continuity Equation and the Role of Pressure
• There is no obvious way of assembling the pressure equation, which is at the root of the problem. The available equation expresses the divergence-free condition on the velocity field.
\[ \nabla\cdot\mathbf{u} = 0 \tag{10.4} \]



• Examining the role of the pressure, it turns out that the spherical part of the stress tensor, extracted in the pressure term, directly relates to the above condition on the velocity. Viscous stress is modelled on the basis of the velocity gradient:
\[ \boldsymbol{\sigma} = -p\mathbf{I} + \mu\left[\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right], \tag{10.5} \]
postulating the equivalence between the mechanical and thermodynamic pressure. Therefore, the pressure term is related to $\mathrm{tr}(\nabla\mathbf{u}) = \nabla\cdot\mathbf{u}$, which appears in the continuity equation. In other words, the pressure distribution should be such that the pressure gradient in the momentum equation enforces the divergence-free condition on the velocity field.
• If the pressure distribution is known, the problem of pressure-velocity coupling is resolved. However, it is clear that pressure and velocity will be closely coupled to each other.
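The lagged treatment of the convection term in Eqn. (10.3), where one factor is taken from the available solution and the other is solved for, can be illustrated on a scalar analogue. The example is our own and is not a flow solver: the quadratic equation u*u + u = 6 (exact positive root u = 2) is linearised as u_old*u_new + u_new = 6 and cycled until the old and new values agree.

```python
def picard_solve(rhs, initial_guess, tol=1e-10, max_iter=200):
    """Solve u*u + u = rhs by lagging one factor of the quadratic term.

    Each sweep solves the linear equation (u_old + 1) * u_new = rhs."""
    u_old = initial_guess
    for iteration in range(1, max_iter + 1):
        u_new = rhs / (u_old + 1.0)
        if abs(u_new - u_old) < tol:
            return u_new, iteration
        u_old = u_new
    raise RuntimeError("Picard iteration did not converge")


solution, iterations = picard_solve(rhs=6.0, initial_guess=1.0)
```

Every sweep only requires a linear solve, at the price of repeating the sweep several times. The same idea carries over to the discretised momentum equation, where each sweep is a sparse linear system.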


Density-Based Block Solver

Density-Based Algorithm
• In previous lectures, we have shown a block coupled form of the density-based flow solver. Noting that all governing equations fit into the standard form and all variables are fully coupled, the compressible Navier-Stokes system can be written as:
\[ \frac{\partial U}{\partial t} + \nabla\cdot F - \nabla\cdot V = 0 \tag{10.6} \]
where the solution variable $U$ is:
\[ U = \begin{bmatrix} \rho \\ \rho\mathbf{u} \\ \rho e \end{bmatrix} \]
• In the above, pressure appears in the convective flux $F$:
\[ F = \begin{bmatrix} \rho\mathbf{u} \\ \rho\mathbf{u}\mathbf{u} + p\mathbf{I} \\ \rho(e + p)\mathbf{u} \end{bmatrix} \]



• Standard (Roe flux) compressible Navier-Stokes solver will evaluate F for each cell face directly from the state (U) left and right from the face, using approximate Riemann solver techniques



• Looking at the second row of the flux expression we can recognise the convective contribution and the pressure driving force (note $\nabla\cdot(p\mathbf{I}) = \nabla p$). In high-speed flows, the first component is considerably larger than the second
• In the low-speed limit, a pressure difference of 3-5 Pa can drive considerable flow; however, in this case, the pressure gradient will dominate. As shown before, this implies a density change of approximately $5\times10^{-5}$ kg/m³ for the mean density of 1 kg/m³. An equivalent calculation for a liquid (water) would produce an even more extreme result (due to the higher speed of sound)
• The equation governing pressure effects in this case is the continuity equation, through density transport and the equation of state. Therefore, for accurate pressure data we need to capture density changes of the order of $1\times10^{-5}$, with a reference level of 1, together with velocity changes of the order of 1 and an energy level of $2\times10^{5}$ ($e = \rho C_v T$). Note that all properties are closely coupled, which means that matrix coefficients vary to extreme levels
• The speed of sound in general is given as
\[ c = \sqrt{\frac{\partial p}{\partial\rho}} \tag{10.9} \]
Infinite speed of sound (incompressible fluid) implies decoupling between density and pressure
• As a consequence of decoupling, a density-based solver cannot handle the incompressible limit. In practice, very low Ma number flow can be achieved, either through matrix preconditioning or by introducing artificial compressibility

Explicit and Implicit Compressible Flow Solver
• The relationship that prescribes $F$ as a function of $U_P$ and $U_N$ is complex and non-linear: calculating characteristic wave speed and propagation. It is therefore natural to evaluate the flux $F$ and advance the simulation explicitly:
\[ U^n = U^o - \Delta t\,(\nabla\cdot F - \nabla\cdot V) = U^o - \Delta t\,R \tag{10.10} \]
Here, $R$ is the convection-diffusion residual (a higher-order time-integration technique may also be used)
• This leads to a fundamentally explicit time-integration method, with the associated Courant number (Co) limit: time-step is limited by the size of the smallest cell
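The Courant number limit of an explicit update of the form of Eqn. (10.10) can be demonstrated on the simplest possible case. The sketch below is a stand-in of our own choosing, not the compressible solver of the notes: linear advection of a square pulse on a periodic 1-D mesh with first-order upwind fluxes and explicit Euler time-stepping, run once below and once above Co = 1.

```python
def advect_upwind(field, courant, n_steps):
    """Explicit Euler + first-order upwind for du/dt + a du/dx = 0 (a > 0)
    on a periodic 1-D mesh; courant = a*dt/dx."""
    for _ in range(n_steps):
        field = [
            field[i] - courant * (field[i] - field[i - 1])  # field[-1] wraps around
            for i in range(len(field))
        ]
    return field


initial = [1.0 if 10 <= i < 20 else 0.0 for i in range(50)]

stable = advect_upwind(initial, courant=0.8, n_steps=100)    # within the Co limit
unstable = advect_upwind(initial, courant=1.5, n_steps=100)  # violates the Co limit

largest_stable = max(abs(v) for v in stable)
largest_unstable = max(abs(v) for v in unstable)
```

Below the limit the solution stays bounded between 0 and 1; above it the short-wavelength components are amplified every step and the result becomes meaningless. Since Co scales with the time-step divided by the cell size, the smallest cell dictates the step for the whole mesh.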



• The time-step limitation is in reality so severe that it renders the code useless: for steady-state simulations, we need to achieve acceleration by a factor of 100-10 000
• Solution acceleration techniques require faster information transfer in order to approach steady-state more rapidly. We will examine two:
– Implicit solver
– Geometric multigrid

Solution Acceleration Techniques
• Implicit solver
– Implicit compressible solver is based on the same flux evaluation technique as the explicit solver, but generalising the form of the flux expression to create matrix coefficients
\[ F = F(U_P, U_N) = \frac{\partial F}{\partial U_P}U_P + \frac{\partial F}{\partial U_N}U_N + D \tag{10.11} \]
\[ \phantom{F} = A_P\cdot U_P + A_N\cdot U_N + D \tag{10.12} \]
– Here, each matrix coefficient is a full 5 × 5 matrix, calculated as a Jacobian, and $D$ is the explicit correction. Linearisation may be done in several ways, with different levels of approximation
\[ A = \frac{\partial F}{\partial U} = \frac{\partial\left(\rho\mathbf{u},\ \rho\mathbf{u}\mathbf{u} + p\mathbf{I},\ \rho(e+p)\mathbf{u}\right)}{\partial\left(\rho,\ \rho\mathbf{u},\ \rho e\right)} \]
– With the help of flux Jacobians, we have created an implicit system of equations, which relaxes the Co number criterion, but not to the desired level. However, this is a very useful first step
• Multigrid acceleration
– Geometric multigrid is based on a curious fact: as the mesh gets coarser, the Co number limit becomes less strict, allowing the simulation to advance in larger time-steps, so that a steady-state solution is reached in fewer time-steps
– The problem we have solved on a coarse grid is physically identical to its fine-grid equivalent. It should therefore be possible to “solve” the coarse-grid problem and use the solution as the initial guess for its fine-grid equivalent



– Full Approximation Storage (FAS) Multigrid performs this process on several levels simultaneously, using a hierarchy of coarse grids. This allows us to use a very large Co number (100-1 000 or higher) without falling foul of the Co criterion: a significant part of information transfer occurs on coarse grids without violating the stability criterion
– An additional complication in multigrid simulation is the requirement for a hierarchy of coarse grids for the geometry of interest. Further problems are related to the geometric representation and specification of boundary conditions on coarse grids
– In practice, coarse grids are assembled by agglomerating fine grid cells into clusters
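The reason coarse levels relax the time-step restriction can be seen from cell sizes alone. The sketch below is a deliberately simplified illustration under our own assumptions (1-D mesh, pairwise agglomeration, a single assumed wave speed); it is not an implementation of FAS multigrid, only of the agglomeration idea and the resulting explicit time-step limit per level.

```python
def agglomerate(cell_sizes):
    """Build the next coarser level by merging neighbouring 1-D cells in pairs."""
    merged = [cell_sizes[i] + cell_sizes[i + 1] for i in range(0, len(cell_sizes) - 1, 2)]
    if len(cell_sizes) % 2:
        merged.append(cell_sizes[-1])  # odd cell out is carried over unchanged
    return merged


def explicit_time_step(cell_sizes, wave_speed, courant=1.0):
    """Largest time-step allowed by the Courant limit on the smallest cell."""
    return courant * min(cell_sizes) / wave_speed


# Graded fine mesh: 16 cells, smallest 1 mm, growth ratio 1.2 (illustrative values)
fine = [0.001 * 1.2 ** i for i in range(16)]

levels = [fine]
while len(levels[-1]) > 2:
    levels.append(agglomerate(levels[-1]))

time_steps = [explicit_time_step(level, wave_speed=340.0) for level in levels]
```

Each coarser level admits a larger stable time-step, so information travels across the domain in fewer steps there, while the fine level retains the spatial accuracy.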


Pressure-Based Block Solver

Rationale
• We have shown there exists a fundamental limitation of density-based solvers close to the incompressibility limit. At the same time, according to the flow classification by Ma number, for Ma < 0.3 the compressibility effects are negligible. This covers a large proportion of flow regimes
• Idea: assemble a solution algorithm capable of handling the low Mach number limit and extend it to compressible flow. Formally, such a method should be able to simulate the flow at all speeds
• A critical part here is handling the incompressibility limit: this is what we will examine below

Block Pressure-Momentum Solution
• Looking at basic discretisation techniques, we can handle the momentum equation without any problems, apart from the pressure gradient term. If pressure were known, its gradient could be easily evaluated; however, we need to create an implicit form of the operator
• The same applies for the velocity divergence term with an additional complication: $\nabla\cdot\mathbf{u}$ needs to be expressed in terms of pressure as a working variable
• This technique leads to the saddle-point system mentioned above


Gradient and Divergence Operator

We shall now repeat the discretisation of the gradient and divergence terms given above, this time attempting to assemble an implicit form



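Gradient Operator
• We shall only show the discretisation for the Gauss gradient; least squares and other techniques can be assembled in an equivalent manner
• Discretised form of the Gauss theorem splits into a sum of face integrals
\[ \int_V \nabla\phi\,dV = \oint_S \mathbf{n}\,\phi\,dS = \sum_f \mathbf{s}_f\,\phi_f \]
• It still remains to evaluate the face value of $\phi$. Consistently with second-order discretisation, we shall assume linear variation between $P$ and $N$
\[ \phi_f = f_x\,\phi_P + (1 - f_x)\,\phi_N \]
• Assembling the above, the gradient can be written as follows
\[ \nabla\phi \approx \mathbf{a}_P\,\phi_P + \sum_N \mathbf{a}_N\,\phi_N \]
where
\[ \mathbf{a}_N = \frac{(1 - f_x)\,\mathbf{s}_f}{V_P} \quad\text{and}\quad \mathbf{a}_P = \frac{1}{V_P}\sum_f f_x\,\mathbf{s}_f \]
• Note that both $\mathbf{a}_P$ and $\mathbf{a}_N$ are vectors: multiplying a scalar field $\phi$ produces a gradient (vector field)
• For a uniform mesh ($f_x$ = const.), $\mathbf{a}_P = 0$! This is because for a closed cell $\sum_f \mathbf{s}_f = 0$

Divergence Operator
• The divergence operator is assembled in an equivalent manner. A divergence of a vector field $\mathbf{u}$ is evaluated as follows:
\[ \int_V \nabla\cdot\mathbf{u}\,dV = \oint_S \mathbf{n}\cdot\mathbf{u}\,dS = \sum_f \mathbf{s}_f\cdot\mathbf{u}_f \]
• Equivalent to the gradient operator discretisation, it follows:
\[ \nabla\cdot\mathbf{u} \approx \mathbf{a}_P\cdot\mathbf{u}_P + \sum_N \mathbf{a}_N\cdot\mathbf{u}_N \]
where
\[ \mathbf{a}_N = \frac{(1 - f_x)\,\mathbf{s}_f}{V_P} \quad\text{and}\quad \mathbf{a}_P = \frac{1}{V_P}\sum_f f_x\,\mathbf{s}_f \]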

• Note that the coefficients are equivalent to the gradient operator, but here we have the inner product of two vectors, producing a scalar
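The gradient coefficients a_P and a_N can be checked on a one-dimensional cell, where the face area vectors reduce to signed numbers. The sketch below is our own minimal example: a cell with a west and an east face of unit area, interpolation weights f_x as defined above, and a linear test field whose gradient should be recovered exactly on a uniform mesh.

```python
def gradient_coefficients(faces, volume):
    """Implicit Gauss-gradient coefficients for one cell.

    faces: list of (s_f, f_x) pairs, with s_f the signed face area
    (x-component of the face area vector) and f_x the weight of the
    cell value in the face interpolation."""
    a_p = sum(f_x * s_f for s_f, f_x in faces) / volume
    a_n = [(1.0 - f_x) * s_f / volume for s_f, f_x in faces]
    return a_p, a_n


dx = 0.1
uniform_faces = [(-1.0, 0.5), (1.0, 0.5)]   # west and east face, uniform mesh
a_p, a_n = gradient_coefficients(uniform_faces, volume=dx)

# Linear field phi = 3x sampled at the west neighbour, the cell and the east neighbour
phi_w, phi_p, phi_e = 2.7, 3.0, 3.3
grad_phi = a_p * phi_p + a_n[0] * phi_w + a_n[1] * phi_e

# Non-uniform interpolation weights: the diagonal coefficient no longer vanishes
a_p_graded, a_n_graded = gradient_coefficients([(-1.0, 0.6), (1.0, 0.4)], volume=dx)
```

On the uniform mesh the diagonal coefficient is zero and the formula reduces to the familiar central difference. The same zero diagonal reappears in the pressure block of the coupled system, which is what makes it awkward for simple iterative solvers.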


Block Solution Techniques for a Pressure-Based Solver

Pressure-Based Block Solver
• Discretisation of the gradient and divergence operator above allows us to assemble the block pressure-velocity system as promised
• The system can be readily solved using a direct solver (note the zeros on the diagonal of the pressure matrix). However, this is massively expensive and we need to find a better way to handle the system

Solver Technology
• Zero diagonal entries exclude a majority of iterative solvers: any Gauss-Seidel technique is excluded
• There exists a set of iterative techniques for saddle-point systems which may be of use. Typically, they combine a Krylov-space solver (operating on residual vectors) with special preconditioners for saddle-point systems
• We shall examine one such technique below, as a part of the derivation of the pressure equation




Segregated Pressure-Based Solver

Segregated Solution Procedure
• Currently, a pressure-based block solver does not look very attractive: a large matrix, with a combination of variables and equations of different nature, and uncertain performance of linear equation solvers
• A step forward could be achieved by deriving a “proper” equation governing pressure and assembling a coupling algorithm. In this way, momentum and pressure could be solved separately (1/4 of the storage requirement of the block- or density-based solver) and handled by an external coupling algorithm
• In any case, the first step would be a derivation of the pressure equation, which will be examined below


Derivation of the Pressure Equation

Pressure Equation as a Schur Complement
• Consider a general block matrix system $M$, consisting of 4 block matrices, $A$, $B$, $C$ and $D$, which are respectively $p \times p$, $p \times q$, $q \times p$ and $q \times q$ matrices and $A$ is invertible:
\[ M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \tag{10.23} \]
• This structure will arise naturally when trying to solve a block system of equations
\[ Ax + By = a \tag{10.24} \]
\[ Cx + Dy = b \tag{10.25} \]
• The Schur complement arises when trying to eliminate $x$ from the system using partial Gaussian elimination by multiplying the first row with $A^{-1}$:
\[ A^{-1}Ax + A^{-1}By = A^{-1}a \tag{10.26} \]
and
\[ x = A^{-1}a - A^{-1}By. \tag{10.27} \]
Substituting the above into the second row:
\[ (D - CA^{-1}B)\,y = b - CA^{-1}a \tag{10.28} \]
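The elimination leading to Eqn. (10.28) can be verified on a very small system. In the sketch below, the block sizes and numbers are arbitrary choices of ours: A is a 2 × 2 diagonal matrix (so its inverse is trivial), B is a column, C is a row and D is a scalar set to zero. Exact rational arithmetic from the standard library is used so that the result can be checked without round-off.

```python
from fractions import Fraction as Fr


def solve_by_schur(a_diag, b_col, c_row, d, rhs_a, rhs_b):
    """Solve [A B; C D][x; y] = [a; b] with diagonal A and scalar D
    using the Schur complement (D - C A^-1 B)."""
    a_inv = [1 / a for a in a_diag]
    schur = d - sum(c * ai * b for c, ai, b in zip(c_row, a_inv, b_col))
    reduced_rhs = rhs_b - sum(c * ai * r for c, ai, r in zip(c_row, a_inv, rhs_a))
    y = reduced_rhs / schur                                         # Eqn. (10.28)
    x = [ai * (r - b * y) for ai, r, b in zip(a_inv, rhs_a, b_col)]  # Eqn. (10.27)
    return x, y


a_diag = [Fr(4), Fr(2)]
b_col = [Fr(1), Fr(-1)]
c_row = [Fr(1), Fr(-1)]
d = Fr(0)
rhs_a = [Fr(3), Fr(1)]
rhs_b = Fr(0)

x, y = solve_by_schur(a_diag, b_col, c_row, d, rhs_a, rhs_b)

# Residuals of the original (un-eliminated) block system
row_1 = [a * xi + b * y - r for a, xi, b, r in zip(a_diag, x, b_col, rhs_a)]
row_2 = sum(c * xi for c, xi in zip(c_row, x)) + d * y - rhs_b
```

Although D is zero, the reduced equation for y is well defined because the Schur complement is not: this is the mechanism by which an equation for pressure is obtained from a system that has no pressure term in the continuity equation.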



• Let us repeat the same set of operations on the block form of the pressure-velocity system, attempting to assemble a pressure equation. Note that the operators in the block system could be considered both as differential operators and in a discretised form
$$ \begin{bmatrix} [A_u] & [\nabla(.)] \\ [\nabla\cdot(.)] & [0] \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{10.29} $$
• Formally, this leads to the following form of the pressure equation:
$$ [\nabla\cdot(.)][A_u^{-1}][\nabla(.)][p] = 0 \tag{10.30} $$
Here, $[A_u^{-1}]$ represents the inverse of the momentum matrix in the discretised form, which acts as diffusivity in the Laplace equation for the pressure.
• From the above, it is clear that the governing equation for the pressure is a Laplacian, with the momentum matrix acting as a diffusion coefficient. However, the form of the operator is very inconvenient:
– While $[A_u]$ is a sparse matrix, its inverse is likely to be dense
– Discretised forms of the divergence and gradient operators are sparse and well-behaved. However, a triple product with $[A_u^{-1}]$ would result in a dense matrix, making it expensive to solve
• The above can be remedied by decomposing the momentum matrix before the triple product into the diagonal part and off-diagonal matrix:
$$ [A_u] = [D_u] + [LU_u], \tag{10.31} $$
where $[D_u]$ only contains diagonal entries. $[D_u]$ is easy to invert and will preserve the sparseness pattern in the triple product. Revisiting Eqn. (10.29) before the formation of the Schur complement and moving the off-diagonal component of $[A_u]$ onto the r.h.s. yields:
$$ \begin{bmatrix} [D_u] & [\nabla(.)] \\ [\nabla\cdot(.)] & [0] \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix} = \begin{bmatrix} -[LU_u][u] \\ 0 \end{bmatrix} \tag{10.32} $$
A revised formulation of the pressure equation via the Schur complement yields:
$$ [\nabla\cdot(.)][D_u^{-1}][\nabla(.)][p] = -[\nabla\cdot(.)][D_u^{-1}][LU_u][u] \tag{10.33} $$
where the minus sign is carried over from the r.h.s. of Eqn. (10.32). In both cases, matrix $[D_u^{-1}]$ is simple to assemble.


• It follows that the pressure equation is a Poisson equation with the diagonal part of the discretised momentum acting as diffusivity and the divergence of the velocity on the r.h.s.
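The sparsity argument can be illustrated on a small 1-D example. This is a sketch under stated assumptions: the tridiagonal "momentum matrix", the two-point gradient and its negative transpose used as divergence are hypothetical stand-ins, not the discretisation used in the notes. The triple product with the full inverse fills the matrix, while the triple product with the inverse of the diagonal keeps the compact (tridiagonal) Laplacian pattern.

```python
import numpy as np

n = 8          # number of pressure unknowns (cells)
m = n - 1      # number of velocity unknowns (interior faces)

# Hypothetical sparse momentum matrix: tridiagonal and diagonally dominant
A_u = 4.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

# Two-point gradient, (G p)_j = p[j+1] - p[j], and matching divergence
G = np.eye(m, n, k=1) - np.eye(m, n)
Div = -G.T

D_u = np.diag(np.diag(A_u))   # diagonal part of the momentum matrix

exact = Div @ np.linalg.inv(A_u) @ G    # triple product with the full inverse
approx = Div @ np.linalg.inv(D_u) @ G   # triple product with the diagonal only

def count_nonzeros(matrix, tol=1e-12):
    """Number of entries with magnitude above tol."""
    return int(np.sum(np.abs(matrix) > tol))

nnz_exact = count_nonzeros(exact)
nnz_approx = count_nonzeros(approx)
```

For this toy case `nnz_approx` equals `3 * n - 2`, the entry count of a tridiagonal matrix, while `nnz_exact` is considerably larger because the inverse of the tridiagonal matrix is dense.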

10.4 Segregated Pressure-Based Solver


Derivation of the Pressure Equation
• We shall now rewrite the above derivation formally without resorting to the assembly of the Schur complement in order to show the identical result
• We shall start by discretising the momentum equation using the techniques described before. For the purposes of derivation, the pressure gradient term will remain in the differential form. For each CV, the discretised momentum equation yields:
$$ a_P^u u_P + \sum_N a_N^u u_N = r - \nabla p \tag{10.34} $$
For simplicity, we shall introduce the $H(u)$ operator, containing the off-diagonal part of the momentum matrix and any associated r.h.s. contributions:
$$ H(u) = r - \sum_N a_N^u u_N \tag{10.35} $$
Using the above, it follows:
$$ a_P^u u_P = H(u) - \nabla p \tag{10.36} $$
and
$$ u_P = (a_P^u)^{-1} \left( H(u) - \nabla p \right) \tag{10.37} $$
• Substituting the expression for $u_P$ into the incompressible continuity equation $\nabla\cdot u = 0$ yields
$$ \nabla\cdot\left[ (a_P^u)^{-1} \nabla p \right] = \nabla\cdot\left( (a_P^u)^{-1} H(u) \right) \tag{10.38} $$
We have again arrived at the identical form of the pressure equation
• Note the implied decomposition of the momentum matrix into the diagonal and off-diagonal contribution, where $a_P^u$ is a coefficient in the $[D_u]$ matrix and $H(u)$ corresponds to the product $-[LU_u][u]$ (plus any r.h.s. contributions), both appearing in the previous derivation

Assembling Conservative Fluxes
• Pressure equation has been derived from the continuity condition and the role of pressure is to guarantee a divergence-free velocity field



• Looking at the discretised form of the continuity equation
$$ \nabla\cdot u = \sum_f s_f \cdot u = \sum_f F \tag{10.39} $$
where $F$ is the face flux
$$ F = s_f \cdot u \tag{10.40} $$
Therefore, conservative face flux should be created from the solution of the pressure equation. If we substitute the expression for $u$ into the flux equation, it follows:
$$ F = -(a_P^u)^{-1} s_f \cdot \nabla p + (a_P^u)^{-1} s_f \cdot H(u) \tag{10.41} $$
• A part of the above, $(a_P^u)^{-1} s_f \cdot \nabla p$, appears during the discretisation of the Laplacian, for each face. This is discretised as follows:
$$ (a_P^u)^{-1} s_f \cdot \nabla p = (a_P^u)^{-1} \frac{|s_f|}{|d|} (p_N - p_P) = a_N^p (p_N - p_P) \tag{10.42} $$
Here, $a_N^p = (a_P^u)^{-1} \frac{|s_f|}{|d|}$ is equal to the off-diagonal matrix coefficient in the pressure Laplacian
• Note that in order for the face flux to be conservative, assembly of the flux must be completely consistent with the assembly of the pressure equation (e.g. non-orthogonal correction)
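The consistency requirement can be demonstrated on a toy problem. The sketch below assumes a periodic 1-D mesh with unit face areas and made-up values for $(a_P^u)^{-1}$ and $H(u)$; the variable names (`rA`, `HbyA_f`, `a_N`) are illustrative only. The pressure matrix is assembled from the same $a_N^p$ coefficients that are later used in the flux, and the resulting face fluxes sum to zero for every cell.

```python
import numpy as np

n = 6            # cells on a periodic 1-D mesh, unit face area |s_f| = 1
d = 1.0          # distance |d| between neighbouring cell centres
rA = np.array([0.5, 0.4, 0.6, 0.5, 0.7, 0.45])   # hypothetical (a_P^u)^{-1}
H = np.array([1.0, -0.5, 2.0, 0.3, -1.2, 0.8])   # hypothetical H(u)

nxt = (np.arange(n) + 1) % n   # neighbour across the +x face of cell i
prv = (np.arange(n) - 1) % n   # neighbour across the -x face of cell i

# Face i lies between cell i and cell nxt[i]
HbyA_f = 0.5 * (rA * H + rA[nxt] * H[nxt])   # face value of (a_P^u)^{-1} s_f . H(u)
a_N = 0.5 * (rA + rA[nxt]) / d               # a_N^p = (a_P^u)^{-1}_f |s_f| / |d|

# Pressure equation per cell: sum_f a_N^p (p_N - p_P) = sum_f (HbyA)_f (outward)
L = np.zeros((n, n))
rhs = np.zeros(n)
for i in range(n):
    L[i, nxt[i]] += a_N[i]
    L[i, prv[i]] += a_N[prv[i]]
    L[i, i] -= a_N[i] + a_N[prv[i]]
    rhs[i] = HbyA_f[i] - HbyA_f[prv[i]]

# The pressure level is undetermined on a periodic mesh: pin it in cell 0
L[0, :] = 0.0
L[0, 0] = 1.0
rhs[0] = 0.0
p = np.linalg.solve(L, rhs)

# Conservative flux, assembled with the same a_N^p as the pressure matrix
F = HbyA_f - a_N * (p[nxt] - p)
net_flux = F - F[prv]          # outflow minus inflow for each cell
```

If the flux were instead assembled from an interpolated cell-centred pressure gradient, `net_flux` would in general not vanish, which is the inconsistency the last bullet warns about.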


SIMPLE Algorithm and Related Methods

SIMPLE Algorithm
• This is the earliest pressure-velocity coupling algorithm: Patankar and Spalding, 1972 (Imperial College London)
• SIMPLE: Semi-Implicit Method for Pressure-Linked Equations
• Sequence of operations:
1. Guess the pressure field $p^*$
2. Solve the momentum equation using the guessed pressure. This step is called momentum predictor
$$ a_P^u u_P = H(u) - \nabla p^* \tag{10.43} $$
3. Calculate the new pressure based on the velocity field. This is called a pressure correction step
$$ \nabla\cdot\left[ (a_P^u)^{-1} \nabla p \right] = \nabla\cdot\left( (a_P^u)^{-1} H(u) \right) \tag{10.44} $$
4. Based on the pressure solution, assemble conservative face flux $F$, consistent with Eqns. (10.41) and (10.42)
$$ F = (a_P^u)^{-1} s_f \cdot H(u) - a_N^p (p_N - p_P) \tag{10.45} $$
5. Repeat to convergence
• Corrected velocity field may be obtained by substituting the new pressure field into the momentum equation:
$$ u_P = (a_P^u)^{-1} \left( H(u) - \nabla p \right) \tag{10.46} $$

Under-Relaxation
• The algorithm in its base form produces a series of corrections on $u$ and $p$. Unfortunately, in the above form it will diverge!
• Divergence is due to the fact that the pressure correction contains both the pressure as a physical variable and a component which forces the discrete fluxes to become conservative
• In order to achieve convergence, under-relaxation is used:
$$ p^{**} = p^* + \alpha_P (p - p^*) \tag{10.47} $$
and
$$ u^{**} = u^* + \alpha_U (u - u^*) \tag{10.48} $$
where $p$ and $u$ are the solutions of the pressure and momentum equations and $u^*$ and $p^*$ represent a series of pressure and velocity approximations. Note that in practice momentum under-relaxation is implicit and pressure (elliptic equation) is under-relaxed explicitly
$$ \frac{a_P^u}{\alpha_U} u_P = H(u) - \nabla p^* + \frac{1 - \alpha_U}{\alpha_U} a_P^u u_P^* \tag{10.49} $$
• $\alpha_P$ and $\alpha_U$ are the pressure and velocity under-relaxation factors. Some guidelines for choosing under-relaxation are
$$ 0 < \alpha_P \le 1 \tag{10.50} $$
$$ 0 < \alpha_U \le 1 \tag{10.51} $$
$$ \alpha_P + \alpha_U \approx 1 \tag{10.52} $$
or the standard set (guidance only!!!)
$$ \alpha_P = 0.2 \tag{10.53} $$
$$ \alpha_U = 0.8 \tag{10.54} $$
• Under-relaxation dampens the oscillation in the pressure-velocity coupling and is very efficient in stabilising the algorithm
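The two flavours of under-relaxation in Eqns. (10.47) and (10.49) can be sketched as follows. This is an illustration on a small made-up linear system, not a flow solver: the function names and the matrix are assumptions. The point of the example is that implicit under-relaxation changes the path to the solution but not the converged solution itself.

```python
import numpy as np

def under_relax_implicit(A, b, x_prev, alpha):
    """Implicit under-relaxation in the spirit of Eqn. (10.49):
    the diagonal a_P becomes a_P / alpha and the r.h.s. gains
    (1 - alpha) / alpha * a_P * x_prev."""
    a_P = np.diag(A).copy()
    factor = (1.0 - alpha) / alpha
    A_rel = A + np.diag(factor * a_P)
    b_rel = b + factor * a_P * x_prev
    return A_rel, b_rel

def under_relax_explicit(p_prev, p_new, alpha):
    """Explicit under-relaxation, Eqn. (10.47)."""
    return p_prev + alpha * (p_new - p_prev)

# Made-up system standing in for a discretised momentum equation
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x_exact = np.linalg.solve(A, b)

# Repeated relaxed solves approach the solution of the unrelaxed system
x = np.zeros(3)
for _ in range(50):
    A_rel, b_rel = under_relax_implicit(A, b, x, alpha=0.8)
    x = np.linalg.solve(A_rel, b_rel)

p_relaxed = under_relax_explicit(1.0, 2.0, alpha=0.2)
```

Implicit relaxation also increases the diagonal dominance of the matrix, which is one reason it is preferred for the momentum equation.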


PISO Algorithm

Pressure Correction Equation
• SIMPLE algorithm prescribes that the momentum predictor will be solved using the available pressure field. The role of pressure in the momentum equation is to ensure that the velocity field is divergence-free
• After the first momentum solution, the velocity field is not divergence-free: we used a guessed pressure field
• Therefore, the pressure field after the first pressure corrector will contain two parts
– Physical pressure, consistent with the global flow field
– A “pressure correction” component, which enforces the continuity and counter-balances the error in the initial pressure guess
Only the first component should be built into the physical pressure field
• In SIMPLE, this is handled by severely under-relaxing the pressure

Under-Relaxation and PISO
• Having 2 under-relaxation coefficients which balance each other is very inconvenient: they are difficult to tune
• The idea of PISO is as follows:
– Pressure-velocity system contains 2 complex coupling terms
∗ Non-linear convection term, containing $u - u$ coupling
∗ Linear pressure-velocity coupling
– At low Courant number, Co (small time-step), the pressure-velocity coupling is much stronger than the non-linear coupling
– It is therefore possible to repeat a number of pressure correctors without updating the discretisation of the momentum equation (using the new fluxes)
– In such a setup, the first pressure corrector will create a conservative velocity field, while the second and following will establish the pressure distribution
• Since multiple pressure correctors are used with a single momentum equation, it is no longer necessary to under-relax the pressure. In steady-state simulations, the system is stabilised by momentum under-relaxation
• On the negative side, derivation of PISO is based on the assumption that momentum discretisation may be safely frozen through a series of pressure correctors, which is true only at small time-steps

PISO Algorithm
• PISO is very useful in simulations where the time-step is controlled by external issues and temporal accuracy is important. In such cases, the assumption of slow variation of the non-linear terms holds and the cost of repeated momentum assembly and solution can be safely avoided. Example: Large Eddy Simulation
• Sequence of operations:
1. Use the available pressure field $p^*$ from the previous corrector or time-step. Conservative fluxes corresponding to $p^*$ are also available
2. Discretise the momentum equation with the available flux field
3. Solve the momentum equation using the guessed pressure. This step is called momentum predictor
$$ a_P^u u_P = H(u) - \nabla p^* \tag{10.55} $$
4. Calculate the new pressure based on the velocity field. This is called a pressure correction step
$$ \nabla\cdot\left[ (a_P^u)^{-1} \nabla p \right] = \nabla\cdot\left( (a_P^u)^{-1} H(u) \right) \tag{10.56} $$
5. Based on the pressure solution, assemble conservative face flux $F$, consistent with Eqns. (10.41) and (10.42)
$$ F = (a_P^u)^{-1} s_f \cdot H(u) - a_N^p (p_N - p_P) \tag{10.57} $$
6. Explicitly update cell-centred velocity field with the assembled momentum coefficients
$$ u_P = (a_P^u)^{-1} \left( H(u) - \nabla p \right) \tag{10.58} $$
7. Return to step 4 if convergence is not reached
8. Proceed from step 1 for a new time-step
• A functional equivalent of the PISO algorithm is also used as a preconditioner in Krylov space saddle-point solvers
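The control flow of steps 1 to 7 can be sketched as below. This is only a skeleton under stated assumptions: `momentum_predictor` and `pressure_corrector` are hypothetical callables supplied by the user, the "residual" returned by the corrector is an assumed convergence measure, and the cap on the number of correctors is an illustrative safeguard. The essential feature is that the momentum equation is assembled and solved once, while the pressure corrector is repeated.

```python
def piso_time_step(u, p, flux, momentum_predictor, pressure_corrector,
                   tolerance=1e-6, max_correctors=4):
    """One PISO time-step.

    momentum_predictor(u, p, flux) -> u
        steps 2-3: discretise and solve momentum with the available p and flux
    pressure_corrector(u, p) -> (u, p, flux, residual)
        steps 4-6: solve pressure, assemble flux, explicitly update u;
        'residual' is a hypothetical convergence measure
    """
    u = momentum_predictor(u, p, flux)       # momentum is solved only once
    n_correctors = 0
    residual = float("inf")
    while residual > tolerance and n_correctors < max_correctors:
        u, p, flux, residual = pressure_corrector(u, p)   # step 7: repeat
        n_correctors += 1
    return u, p, flux, n_correctors

# Toy stand-ins (not a flow solver): each corrector halves a fake residual
calls = []

def toy_predictor(u, p, flux):
    calls.append("momentum")
    return u

def make_toy_corrector():
    state = {"residual": 1.0}
    def corrector(u, p):
        calls.append("pressure")
        state["residual"] *= 0.5
        return u, p, u, state["residual"]
    return corrector

u, p, flux, n_used = piso_time_step(1.0, 0.0, 1.0, toy_predictor,
                                    make_toy_corrector(), tolerance=0.2)
```

With the toy stand-ins the fake residual drops to 0.5, 0.25 and 0.125, so three correctors run after a single momentum predictor.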




Pressure Checkerboarding Problem

Checkerboarded Pressure Distribution
• In early variants of pressure-velocity coupling algorithms an interesting error was noticed, completely invalidating the results: pressure checkerboarding. A pressure field with a 1-cell oscillation seemed to satisfy the discretised equations just as well as a uniform field. An algorithm which cannot discriminate between a uniform and a checkerboarded pressure distribution is useless for practical purposes. We shall now examine the cause and possible solutions for the checkerboarding problem

Figure 10.1: Checkerboarded pressure distribution.

Figure 10.2: Checkerboarded pressure distribution.

Checkerboarding Error
• As shown above, the derived form of the pressure equation contains a Laplace operator
$$ \nabla\cdot\left[ (a_P^u)^{-1} \nabla p \right] = \nabla\cdot\left( (a_P^u)^{-1} H(u) \right) $$
• We have also derived the matrix equivalent of the pressure equation using the Schur complement in the following form (the minus sign follows from the r.h.s. of Eqn. (10.32)):
$$ [\nabla\cdot(.)][D_u^{-1}][\nabla(.)][p] = -[\nabla\cdot(.)][D_u^{-1}][LU_u][u] $$
In both cases, $(a_P^u)^{-1}$ or $[D_u^{-1}]$ acts as a diffusion coefficient and can be safely neglected as a pre-factor in the analysis that follows
• The matrix equivalent can, as a triple product, be read as follows:
– Create the discretisation for the gradient term
– Interpolate it to the face (and multiply by the diffusion)
– Assemble the divergence term with the interpolated pressure gradient
• An equivalent procedure can be seen when taking the (discrete) divergence of the discretised momentum equation:
$$ u_P = (a_P^u)^{-1} H(u) - (a_P^u)^{-1} \nabla p \qquad / \; \nabla\cdot \tag{10.59} $$
Here, the last term may require the interpolation of the pressure gradient.

Computational Molecule
• The cause of checkerboarding error becomes clear when we examine the implied discretised form.
• A cell-centred gradient is evaluated using the values in neighbouring cells. Note that for $(\nabla p)_P$ the cell centre $P$ does not appear in the discretisation

Figure 10.3: Cell-centred gradient.
• A divergence operator requires the gradient to be interpolated to the cell face in order to assemble the divergence term. Symmetrically, on the opposite face, the interpolated gradient will use four computational points around the face
• Points around the cell $P$ will appear in computational molecules for both interpolated gradients appearing in the $\nabla\cdot$ operator for cell $P$. Since the face area vectors point in opposite directions for the two faces, the coefficients for the intermediate points will exactly cancel out!

Figure 10.4: Interpolated gradient.
• As a result of coefficient cancellation in intermediate points, the computational molecule for the assembled Laplace operator does not feature the points immediately to the left and right of $P$, but it still forms a valid discretisation of the Laplacian!

Figure 10.5: Laplace operator with interpolated gradients.
• Looking at the above it becomes clear why checkerboarding occurs: if we evaluate the Laplacian using every other cell, a checkerboarded pressure field appears as uniform and there is no correction to make

Figure 10.6: Comparison of computational molecules for the Laplace operator (Laplace operator with interpolated gradients; standard Laplace operator).
• Comparison of the two computational molecules clearly demonstrates the problem and the way a standard discretisation of a Laplacian overcomes the difficulty: a compact computational molecule of the standard discretisation leaves no room for checkerboarding errors
• The solution to the problem is clearly related to the rearrangement of the computational molecule in the pressure Laplacian to compact support and will be examined below.
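The cancellation described above is easy to reproduce in 1-D. The sketch below assumes a uniform periodic mesh and central differencing with linear interpolation; the function names are illustrative. A checkerboarded field is invisible both to the cell-centred gradient and to the Laplacian built from interpolated gradients, while the compact Laplacian responds to it strongly.

```python
import numpy as np

n, h = 8, 1.0                      # even number of cells, uniform spacing
cells = np.arange(n)
p_checker = (-1.0) ** cells        # +1, -1, +1, ... checkerboard field

def cell_gradient(f):
    """Cell-centred gradient: uses both neighbours, but not the cell itself."""
    return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * h)

def laplacian_interpolated(f):
    """Divergence of face-interpolated cell gradients (wide molecule)."""
    g = cell_gradient(f)
    g_face = 0.5 * (g + np.roll(g, -1))          # face between i and i+1
    return (g_face - np.roll(g_face, 1)) / h

def laplacian_compact(f):
    """Standard three-point Laplacian (compact molecule)."""
    return (np.roll(f, -1) - 2.0 * f + np.roll(f, 1)) / h**2

grad_seen = cell_gradient(p_checker)
wide = laplacian_interpolated(p_checker)
compact = laplacian_compact(p_checker)
```

Here `grad_seen` and `wide` are identically zero, so an algorithm based on them has no correction to make, whereas `compact` equals `-4 * p_checker / h**2` and therefore acts to remove the oscillation.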


Staggered and Collocated Variable Arrangement

Staggered Variable Arrangement
• The issue of checkerboarding arises from the fact that interpolated velocity in the divergence operator contains a cell-centred pressure gradient. This results in an expanded molecule for the discretised Laplacian
• At the time, the FVM was strictly a (2-D) structured mesh technique and the offered solution was to stagger the computational locations where $p$ and $u$ are stored.

Figure 10.7: Staggered variable arrangement (storage locations of $u_x$, $u_y$ and $p$).
• Note that components of the velocity vectors are now stored in separate locations and both are staggered: they formally represent face flux as well as the velocity component
• With the above, no interpolation is necessary and the pressure Laplacian appears with compact support
• Unfortunately, the staggered variable arrangement is useless on any but the simplest of meshes: for all other shapes the problem would be either under- or over-constrained. A more general solution is required
• There exists a pressure-flux formulation but this is beyond our scope at this time

Collocated Variable Arrangement
• The second approach to resolving the checkerboarding problem is to recognise that the issue boils down to the calculation of the face-based pressure gradient. In the original form (above), the face pressure gradient is obtained by interpolation:
$$ (\nabla p)_f = f_x (\nabla p)_P + (1 - f_x)(\nabla p)_N \tag{10.60} $$
The face gradient is then used in the dot-product with the face area vector, $s \cdot (\nabla p)_f$
• In the discretisation of the Laplace operator we have also come across the expression $s \cdot (\nabla p)_f$, which was discretised as follows:
$$ s \cdot (\nabla p)_f = \frac{|s|}{|d|} (p_N - p_P) \tag{10.61} $$
This formula results in compact support of the Laplacian and resolves the problem
• We can arrive at the collocated formulation in several ways:
– Delayed discretisation of the pressure gradient. Recognising that the pressure equation contains a Laplace operator, we shall delay the discretisation of the $\nabla p$ term in the momentum equation. Once the pressure equation is assembled, the Laplace operator is discretised in the usual way
– Rhie-Chow interpolation. In order to manufacture the coefficients for compact pressure support, we will create a special formula for velocity interpolation, which will separate the gradient term. Thus:
$$ u_f = f_x u_P + (1 - f_x) u_N + (a_P^u)_f^{-1} \, \hat{n} \left[ \hat{n} \cdot (\nabla p)_f - \frac{p_N - p_P}{|d|} \right] \tag{10.62} $$
Here, $(a_P^u)_f^{-1}$ is the face interpolate of the diagonal coefficient of the momentum equation, $\hat{n}$ is a unit-normal vector in the direction of interest (parallel with the direction of interpolation, $d$) and the expression in brackets represents two ways of evaluating the face-based pressure gradient
∗ Interpolated cell-centred pressure gradient:
$$ (\nabla p)_f = f_x (\nabla p)_P + (1 - f_x)(\nabla p)_N \tag{10.63} $$
∗ Face-normal gradient
$$ \hat{n} \cdot (\nabla p)_f = \frac{p_N - p_P}{|d|} \tag{10.64} $$
– This term is introduced to remove the interpolated form of the gradient and replace it with a compact support, thus removing the cause of checkerboarding
• Rhie-Chow interpolation (1983) marked a major step forward in CFD: truly complex geometries could now be handled, as well as allowing for hybrid mesh types, embedded refinement and a number of other techniques
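A 1-D illustration of Eqn. (10.62) follows. It is a sketch assuming a uniform periodic mesh, so that $f_x = 0.5$ and $\hat{n}$ reduces to $+1$; the function name and test fields are illustrative. With plain linear interpolation a checkerboarded pressure leaves the face velocity untouched, while the Rhie-Chow term makes the face velocity respond to it.

```python
import numpy as np

def rhie_chow_face_velocity(u, p, rA, d):
    """Face velocity following Eqn. (10.62) on a uniform periodic 1-D mesh.

    u, p : cell-centred velocity and pressure
    rA   : cell-centred (a_P^u)^{-1}
    d    : distance between neighbouring cell centres
    Face i lies between cell i (P) and cell i+1 (N); f_x = 0.5, n_hat = +1.
    """
    nxt = (np.arange(u.size) + 1) % u.size
    grad_p = (np.roll(p, -1) - np.roll(p, 1)) / (2.0 * d)   # cell-centred gradient
    u_lin = 0.5 * (u + u[nxt])                     # plain linear interpolation
    rA_f = 0.5 * (rA + rA[nxt])                    # face interpolate of (a_P^u)^{-1}
    grad_interp = 0.5 * (grad_p + grad_p[nxt])     # Eqn. (10.63)
    grad_compact = (p[nxt] - p) / d                # Eqn. (10.64)
    return u_lin + rA_f * (grad_interp - grad_compact)

n, d = 8, 1.0
u = np.zeros(n)
rA = np.ones(n)
p_checker = (-1.0) ** np.arange(n)
p_uniform = np.full(n, 3.0)

u_f_checker = rhie_chow_face_velocity(u, p_checker, rA, d)
u_f_uniform = rhie_chow_face_velocity(u, p_uniform, rA, d)
```

For the checkerboarded field the interpolated gradient vanishes but the compact gradient does not, so `u_f_checker` equals `2 * p_checker`; for the uniform field both gradients vanish and the face velocity is the plain interpolate.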


Pressure Boundary Conditions and Global Continuity

Pressure and Velocity Boundary Condition
• Momentum and pressure equations form a coupled set of equations. A consequence of this is a coupled behaviour of their boundary conditions: the prescribed conditions on $u$ and $p$ need to act in unison. If this is not the case, the pressure-velocity system may be ill-posed and have no solution
• The easiest way of examining the nature of boundary condition coupling is based on the semi-discretised form of the momentum equation:
$$ u_P = (a_P^u)^{-1} \left( H(u) - \nabla p \right) \tag{10.65} $$
1. On boundaries where $u$ is prescribed, the value of pressure on the boundary is a part of the solution and cannot be enforced
2. If a boundary value of $p$ is given, the pressure gradient will balance the flow rate: thus, the flow rate is a part of the solution and cannot be enforced
• There exists a profusion of pressure and velocity boundary conditions, e.g. fixed pressure inlet, pressure drop etc., which seem to invalidate the above. However, for stable discretisation the actual implementation of the boundary condition will obey the above rules, with a wrapping added for user convenience
• Example: a fixed pressure inlet boundary condition will internally act as a fixed velocity boundary condition. However, the value of fixed velocity will be adjusted such that the pressure value (obtained as a part of the solution) tends towards the one specified by the user

Enforcing Global Continuity
• Note that the pressure equation is derived from a global continuity condition
$$ \nabla\cdot u = 0 \tag{10.66} $$
This condition should be satisfied for each cell and for the domain as a whole
• Looking at the formulation of the pressure-velocity system in incompressible flows, we can establish that the absolute pressure level does not appear in the equations: it is the pressure gradient that drives the flow
• In some situations it is possible to have a set of boundary conditions where the pressure level is not determined by the boundary conditions. In such cases, two corrections are needed:
– Indeterminate pressure level implies a zero eigenvalue in the pressure matrix. In order to resolve such problems, the level of pressure will be artificially fixed in one computational point
– In order for the continuity equation to be satisfied for each cell, it also needs to be satisfied for the complete domain. When a pressure level is fixed by a boundary condition, global continuity will be enforced as a part of the pressure solution. However, when this is not the case, one needs to explicitly satisfy the condition before solving the pressure equation, otherwise the singular system has no solution.
• Adjusting global continuity
1. Sum up the magnitude of all fluxes entering the domain
$$ F_{in} = \sum |F|; \quad F < 0 \tag{10.67} $$
2. Separately, sum up all the fluxes leaving the domain
$$ F_{out} = \sum |F|; \quad F > 0 \tag{10.68} $$
3. Adjust the out-going fluxes such that $F_{in} = F_{out}$
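The three steps can be sketched as below. The notes do not say how the out-going fluxes are adjusted; uniform scaling of all outflow faces is assumed here purely for illustration, and the function name is hypothetical.

```python
import numpy as np

def adjust_global_continuity(boundary_flux):
    """Scale out-going boundary fluxes so that total outflow matches inflow.

    boundary_flux : face fluxes on the domain boundary,
                    F < 0 entering the domain, F > 0 leaving it.
    Uniform scaling of the outflow is an assumption of this sketch.
    """
    F = np.asarray(boundary_flux, dtype=float)
    F_in = np.sum(np.abs(F[F < 0]))     # step 1: magnitude of inflow
    F_out = np.sum(F[F > 0])            # step 2: total outflow
    if F_out == 0.0:
        raise ValueError("no out-going flux available for adjustment")
    adjusted = F.copy()
    adjusted[F > 0] *= F_in / F_out     # step 3: enforce F_in = F_out
    return adjusted

flux = np.array([-2.0, -1.0, 1.5, 2.5])   # inflow 3.0, outflow 4.0
balanced = adjust_global_continuity(flux)
```

The inflow faces are left untouched, the two outflow faces are scaled by 3/4, and the boundary fluxes then sum to zero.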

Chapter 11 Compressible Pressure-Based Solver
11.1 Handling Compressibility Effects in Pressure-Based Solvers

In this Chapter we shall repeat the derivation of the pressure-based solver for compressible flows. The idea behind the derivation is that a pressure-based algorithm with pressure-velocity coupling does not suffer from a singularity in the incompressible limit and may behave better across the range of speeds. Memory usage for a segregated solver is also considerably lower than for a coupled one, which may be useful in large-scale simulations. The issue that remains to be resolved is the derivation of the pressure equation and the momentum-pressure-energy coupling procedure

Compressibility Effects
• Compressible form of the continuity equation introduces density into the system
$$ \frac{\partial \rho}{\partial t} + \nabla\cdot(\rho u) = 0 \tag{11.1} $$
• In the analysis, we shall attempt to derive the equation set in general terms. For external aerodynamics, it is typical to use the ideal gas law as the constitutive relation connecting pressure $P$ and density $\rho$:
$$ \rho = \frac{P}{RT} = \psi P \tag{11.2} $$
where $\psi$ is compressibility:
$$ \psi = \frac{1}{RT} \tag{11.3} $$
The principle is the same for more general expressions. In this case, presence of density also couples in the energy equation because temperature $T$ appears in the constitutive relation
$$ \frac{\partial(\rho e)}{\partial t} + \nabla\cdot(\rho e u) - \nabla\cdot(\lambda \nabla T) = \rho g \cdot u - \nabla\cdot(P u) - \nabla\cdot\left( \frac{2}{3} \mu (\nabla\cdot u)\, u \right) + \nabla\cdot\left[ \mu \left( \nabla u + (\nabla u)^T \right) \cdot u \right] + \rho Q \tag{11.4} $$
• Momentum equation is in a form very similar to before: note the presence of (non-constant) density in all terms. Also, unlike the incompressible form, we shall now deal with dynamic pressure and viscosity in the place of their kinematic equivalents
$$ \frac{\partial(\rho u)}{\partial t} + \nabla\cdot(\rho u u) - \nabla\cdot\left[ \mu \left( \nabla u + (\nabla u)^T \right) \right] = \rho g - \nabla\left( P + \frac{2}{3} \mu \nabla\cdot u \right) \tag{11.5} $$
• In the incompressible form, the $\nabla\cdot\left[ \mu (\nabla u)^T \right]$ term was dropped due to $\nabla\cdot u = 0$:
$$ \nabla\cdot\left[ \mu (\nabla u)^T \right] = \nabla u \cdot \nabla \mu + \mu \nabla(\nabla\cdot u) \tag{11.6} $$
where the first term disappears for $\mu = const.$ and the second for $\nabla\cdot u = 0$. In compressible flows, this is not the case and the term remains


Derivation of the Pressure Equation in Compressible Flows

Compressible Pressure Equation
• The basic idea in the derivation is identical to the incompressible formulation: we shall use the semi-discretised form of the momentum equation
$$ a_P^u u_P = H(u) - \nabla P \tag{11.7} $$
and
$$ u_P = (a_P^u)^{-1} \left( H(u) - \nabla P \right) \tag{11.8} $$
• Substituting this into the continuity equation will not yield the pressure equation directly: we need to handle the density-pressure relation
• The first step is the transformation of the rate-of-change term. Using the chain rule on $\rho = \rho(P, \ldots)$, it follows:
$$ \frac{\partial \rho}{\partial t} = \frac{\partial \rho}{\partial P} \frac{\partial P}{\partial t} \tag{11.9} $$
From the ideal gas law, it follows
$$ \frac{\partial \rho}{\partial P} = \psi \tag{11.10} $$
• Looking at the divergence term, we will substitute the expression for $u$ and try to present $\rho$ in terms of $P$ as appropriate
$$ \nabla\cdot(\rho u) = \nabla\cdot\left[ \rho (a_P^u)^{-1} H(u) \right] - \nabla\cdot\left[ \rho (a_P^u)^{-1} \nabla P \right] \tag{11.11} $$
• The first term is under divergence and we will attempt to convert it into a convection term. Using $\rho = \psi P$, it follows:
$$ \nabla\cdot\left[ \rho (a_P^u)^{-1} H(u) \right] = \nabla\cdot\left[ \psi P (a_P^u)^{-1} H(u) \right] = \nabla\cdot(F_p P) \tag{11.12} $$
where $F_p$ is the flux featuring in the convective effects in the pressure.
$$ F_p = \psi (a_P^u)^{-1} H(u) \tag{11.13} $$
• The second term produces a Laplace operator similar to the incompressible form and needs to be preserved. The working variable is pressure and we will leave the term in the current form. Note the additional $\rho$ pre-factor, which will remain untouched; otherwise the term would be a non-linear function of $P$
• Combining the above, we reach the compressible form of the pressure equation:
$$ \frac{\partial(\psi P)}{\partial t} + \nabla\cdot\left[ \psi (a_P^u)^{-1} H(u)\, P \right] - \nabla\cdot\left[ \rho (a_P^u)^{-1} \nabla P \right] = 0 \tag{11.14} $$
• A pleasant surprise is that the pressure equation is in standard form: it consists of rate of change, convection and diffusion terms. This is good news: discretisation of a standard form can be handled in a stable, accurate and bounded manner. Note, however, that the flux $F_p$ is not a volume/mass flux as was the case before
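The role of the compressibility $\psi$ in Eqns. (11.2), (11.3) and (11.10) can be checked with a few lines of code. The sketch assumes air as the working gas with $R = 287$ J/(kg K), and sea-level conditions as the sample state; neither value comes from the notes.

```python
R_AIR = 287.0   # specific gas constant for air in J/(kg K); assumed working gas

def compressibility(T, R=R_AIR):
    """psi = 1 / (R T), Eqn. (11.3)."""
    return 1.0 / (R * T)

def density(P, T, R=R_AIR):
    """rho = psi * P, Eqn. (11.2)."""
    return compressibility(T, R) * P

P0, T0 = 101325.0, 288.15      # sample state: sea-level pressure and temperature
psi = compressibility(T0)
rho = density(P0, T0)

# Finite-difference check of Eqn. (11.10): d(rho)/dP at constant T equals psi
dP = 10.0
drho_dP = (density(P0 + dP, T0) - density(P0 - dP, T0)) / (2.0 * dP)
```

Because $\rho$ is linear in $P$ at fixed temperature, the finite-difference slope reproduces $\psi$ to rounding error; a change in temperature changes $\psi$ and hence the coefficients of the pressure equation.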




Pressure-Velocity-Energy Coupling

Discretised Pressure-Velocity System
• Let us review the set of equations for the compressible system
• Discretisation of the momentum equation is performed in the standard way. Pressure gradient term is left in a differential form:
$$ a_P^u u_P = H(u) - \nabla P \tag{11.15} $$
• Using the elements of the momentum equation, a sonic flux is assembled as:
$$ F_p = \psi (a_P^u)^{-1} H(u) \tag{11.16} $$
• Pressure equation is derived by substituting the expression for $u$ and expressing density in terms of pressure
$$ \frac{\partial(\psi P)}{\partial t} + \nabla\cdot(F_p P) - \nabla\cdot\left[ \rho (a_P^u)^{-1} \nabla P \right] = 0 \tag{11.17} $$
• The face flux expression is assembled in a similar way as before
$$ F = s_f \cdot \left[ \psi (a_P^u)^{-1} H(u) \right]_f P_f - \rho (a_P^u)^{-1} s_f \cdot \nabla P \tag{11.18} $$
and is evaluated from the pressure solution
• Density can be evaluated either from the constitutive relation:
$$ \rho = \frac{P}{RT} = \psi P \tag{11.19} $$
or from the continuity equation. Note that at this stage the face flux (= velocity field) is known and the equation can be explicitly evaluated for $\rho$
• Depending on the kind of physics and the level of coupling, the energy equation may or may not be added to the above. It is in standard form but contains source and sink terms which need to be considered with care

Coupling Algorithm
• The pressure-velocity coupling issue in compressible flows is identical to its incompressible equivalent: in order to solve the momentum equation, we need to know the pressure, whose role is to impose the continuity constraint on the velocity
• In the limit of zero Ma number, the pressure equation reduces to its incompressible form
• With this in mind, we can re-use the incompressible coupling algorithms: SIMPLE and PISO
• In cases of rapidly changing temperature distribution (because of the changes in source/sink terms in the energy equation), changing temperature will considerably change the compressibility $\psi$. For correct results, coupling between pressure and temperature needs to be preserved and the energy equation is added into the loop

Boundary Conditions
• We have shown that for incompressible flows boundary conditions on pressure and velocity are not independent: the two equations are coupled and a badly posed set of boundary conditions may result in an ill-defined system
• In compressible flows, we need to account for 3 variables ($\rho$, $u$, $e$) handled together. The issue is the same: the number of prescribed values at the boundary depends on the number of characteristics pointing into the domain:
– Supersonic inlet: 3 variables are specified
– Subsonic inlet: 2 variables
– Subsonic outlet: 1 variable
– Supersonic outlet: no variables
• Inappropriate specification of boundary conditions or location of boundaries may result in an ill-defined problem: numerical garbage
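The boundary condition count listed above can be written as a small lookup, which is convenient as a sanity check when setting up a case. The function name and the use of the boundary-normal Mach number as the switch are assumptions of this sketch; the exactly sonic case is lumped with the subsonic one here for simplicity.

```python
def prescribed_boundary_values(mach_normal, is_inlet):
    """Number of variables (out of rho, u, e) to prescribe at a boundary.

    mach_normal : magnitude of the boundary-normal Mach number
    is_inlet    : True if the flow enters the domain through the boundary
    The count equals the number of characteristics pointing into the domain.
    """
    supersonic = mach_normal > 1.0
    if is_inlet:
        return 3 if supersonic else 2
    return 0 if supersonic else 1

counts = {
    "supersonic inlet": prescribed_boundary_values(2.0, True),
    "subsonic inlet": prescribed_boundary_values(0.3, True),
    "subsonic outlet": prescribed_boundary_values(0.3, False),
    "supersonic outlet": prescribed_boundary_values(2.0, False),
}
```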


Additional Coupled Equations

Coupling to Other Equations
• Compared with the importance and strength of pressure-velocity (or pressure-velocity-energy) coupling, other equations that appear in the system are coupled more loosely
• We shall consider two typical sets of equations: turbulence and chemical reactions

Turbulence
• Simple turbulence models are based on the Boussinesq approximation, where $\mu_t$ acts as turbulent viscosity. Coupling of turbulence to the momentum equation is relatively benign: the Laplace operator will handle it without trouble
• In all cases, momentum to turbulence coupling will thus be handled in a segregated manner
• In 2-equation models, the coupling between the two equations may be strong (depending on the model formulation). Thus, turbulence equations may be solved together; keep in mind that only linear coupling may be made implicit
• A special case is the Reynolds stress transport model (RSTM): the momentum equation is formally saddle-point with respect to $R$; $R$ is governed by its own equation. In most cases, it is sufficient to handle RSTM models as an explicit extension of the reduced 2-equation model (note that $k = \frac{1}{2} \mathrm{tr}(R)$). From time to time, the model will blow up, but careful discretisation usually handles it sufficiently well

Chemistry and Species
• Chemical species equations are coupled to pressure and temperature, but more strongly coupled to each other. Coupling to the rest of the system is through material properties (which depend on the chemical composition of the fluid) and temperature.
• Only in rare cases is it possible to solve chemistry in a segregated manner: a coupled chemistry solver is preferred
• The second option is a 2-step strategy. A local equilibrium solution is sought for chemical reactions using an ordinary differential equation (ODE) solver, which is followed by a segregated transport step


Comparison of Pressure-Based and Density Based Solvers

Density-Based Solver
• Coupled equations are solved together: flux formulation enforces the coupling and entropy condition
• The solver is explicit and non-linear in nature: propagating waves. Extension to an implicit solver is approximate and done through linearisation
• Limitations on the Courant number are handled specially: multigrid is a favoured acceleration technique
• Problems exist at the incompressibility limit: the formulation breaks down

Pressure-Based Solver
• Equation set is decoupled and each equation is solved in turn: segregated solver approach
• Equation coupling is handled by evaluating the coupling terms from the available solution and updating equations in an iteration loop
• Density equation is reformulated as an equation for the pressure. In the incompressible limit, it reduces to the pressure-velocity system described above: incompressible flows are handled naturally
• Equation segregation implies that matrices are created and inverted one at a time, re-using the storage released from the previous equation. This results in a considerably lower overall storage requirement
• Flux calculation is performed one equation at a time, consistent with the segregated approach. As a consequence, the entropy condition is regularly violated (!)

Variable Density or Transonic Formulation
• To follow the discussion, note that the cost of solving an elliptic equation (characterised by a symmetric matrix) is half of the equivalent cost for the asymmetric solver
• For low Mach number or variable compressibility flows, it is known in advance that the pressure equation is dominated by the Laplace operator. The discretised version of it creates a symmetric matrix
• In subsonic high-Ma or transonic flows, convection becomes more important. However, the changed nature of the equation (transport is local) makes it easier to solve
• Variable compressibility formulation handles the convection explicitly: the matrix remains symmetric and the total cost is reduced with minimal impact on accuracy



Chapter 12 Turbulence Modelling for Aeronautical Applications
12.1 Nature and Importance of Turbulence

Why Model Turbulence?
• The physics of turbulence is completely understood and described in all its detail: turbulent fluid flow is strictly governed by the Navier-Stokes equations
• . . . but we do not like the answer very much!
– Turbulence spans wide spatial and temporal scales
– When described in terms of vortices (= eddies), non-linear interaction is complex
– Because of non-linear interactions and correlated nature, it cannot be attacked statistically
– It is not easy to assemble the results of full turbulent interaction and describe them in a way relevant for engineering simulations: we are more interested in mean properties of physical relevance
• In spite of its complexity, there are a number of analytical, order-of-magnitude and quantitative results for simple turbulent flows. Some of them are extremely useful in model formulation
• Mathematically, after more than 100 years of trying, we are nowhere near to describing turbulence the way we wish to

Handling Turbulent Flows
• Turbulence is an irregular, disorderly, non-stationary, three-dimensional, highly non-linear, irreversible stochastic phenomenon
• Characteristics of turbulent flows (Tennekes and Lumley: First Course in Turbulence)
– Randomness, meaning disorder and non-repeatability
– Vorticality: high concentration and intensity of vorticity
– Non-linearity and three-dimensionality
– Continuity of Eddy Structure, reflected in a continuous spectrum of fluctuations over a range of frequencies
– Energy cascade, irreversibility and dissipativeness
– Intermittency: turbulence can occupy only parts of the flow domain
– High diffusivity of momentum, energy, species etc.
– Self-preservation and self-similarity: in simple flows, turbulence structure depends only on local environment
• Turbulence is characterised by higher diffusion rates: increase in drag, mixing, energy diffusion. In engineering machinery, this is sometimes welcome and sometimes detrimental to the performance
• Laminar-turbulent transition is a process where laminar flow naturally and without external influence becomes turbulent. Example: instability of free shear flows

Vortex Dynamics and Energy Cascade
• A useful way of looking at turbulence is vortex dynamics.
– Large-scale vortices are created by the flow. Through the process of vortex stretching vortices are broken up into smaller vortices. This moves the energy from large to smaller scales
– Energy dissipation in the system scales with the velocity gradient, which is largest in small vortices

[Figure: energy cascade plotted against wavenumber. Labels: Energy scales, Taylor scale, Inertial range, Dissipation, Kolmogorov scale.]

• The abscissa of the above is expressed in terms of wavenumber: how many vortices fit into the space
• Thus, we can recognise several parts of the energy cascade:
– Large scale vortices, influenced by the shape of flow domain and global flow field. Large scale turbulence is problematic: it is difficult to decide which of it is a coherent structure and which is actually turbulence
– Energy-containing vortices, which contain the highest part of the turbulent kinetic energy. This scale is described by the Taylor scale
– Inertial scale, where vortex stretching can be described by inertial effects of vortex breakup
– Small vortices, which contain a low proportion of overall energy, but contribute most of the dissipation. This is also the smallest relevant scale in turbulent flows, characterised by the Kolmogorov micro-scale
• Note that all of the turbulence kinetic energy eventually ends up dissipated as heat, predominantly in small structures

Turbulence Modelling
• The business of turbulence modelling can be described as: We are trying to find approximate simplified solutions for the Navier-Stokes equations in a manner that either describes turbulence in terms of mean properties or limits the spatial/temporal resolution requirements associated with the full model


Turbulence Modelling for Aeronautical Applications

• Turbulence modelling is therefore about manipulating equations and creating closed models in the form that allows us to simulate turbulence interaction under our own conditions. For example, a set of equations describing mean properties would allow us to perform steady-state simulations when only mean properties are of interest • We shall here examine three modelling frameworks – Direct Numerical Simulation (DNS) – Reynolds-Averaged Navier-Stokes Equations (RANS), including eddy viscosity models and higher moment closure. For compressible flows with significant compressibility effects, the averaging is actually of the Favre type – Large Eddy Simulation (LES)


Direct Numerical Simulation of Turbulence

Direct Numerical Simulation
• DNS is, strictly speaking, not a turbulence model at all: we will simulate all scales of interest in a well-resolved transient mode with sufficient spatial and temporal resolution
• In order to perform the simulation well, it is necessary to ensure sufficient spatial and temporal resolution:
  – Spatial resolution: vortices smaller than the Kolmogorov scale will dissipate their energy before a full turn. Smaller flow features are of no interest; Kolmogorov scale is a function of the Re number
  – Temporal resolution is also related to Kolmogorov scale, but may be adjusted for temporal accuracy
• Computer resource requirements are immense: we can really handle only relatively modest Re numbers and very simple geometry
• . . . but this is the best way of gathering detailed information on turbulent interaction: mean properties, first and second moments, two-point correlations etc. in full fields
• In order to secure accurate higher moments, special high-order numerics is used (e.g. sixth or tenth order), which ensures that higher moments are not polluted numerically. An alternative are spectral methods, using Fourier modes or Chebyshev polynomials as a discretisation base

12.3 Reynolds-Averaged Turbulence Models


• DNS simulations involve simple geometries and lots of averaging. Data is assembled into large databases and typically used for validation or tuning of “proper” turbulence models
• DNS on engineering geometries is beyond reach: the benefit of more complete fluid flow data is not balanced by the massive cost involved in producing it
• Current research frontier: compressible turbulence with basic chemical reactions, e.g. mixing of hydrogen and oxygen with combustion; buoyancy-driven flows


Reynolds-Averaged Turbulence Models

Reynolds Averaging
• The rationale for Reynolds averaging is that we are not interested in the part of flow solution that can be described as “turbulent fluctuations”: instead, it is the mean (velocity, pressure, lift, drag) that is of interest. Looking at turbulent flow, it may be steady in the mean in spite of turbulent fluctuations. If this is so, and we manage to derive the equations for the mean properties directly, we may reduce the cost by orders of magnitude:
  – It is no longer necessary to perform transient simulation and assemble the averages: we are solving for average properties directly
  – Spatial resolution requirement is no longer governed by the Kolmogorov micro-scale! We can tackle high Reynolds numbers and determine the resolution based on required engineering accuracy

Reynolds Averaged Navier-Stokes Equations
• Repeating from above: decompose $u$ and $p$ into a mean and fluctuating component:
\[ u = \bar{u} + u' \tag{12.1} \]
\[ p = \bar{p} + p' \tag{12.2} \]

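• Substitute the above into the original equations and average. All terms containing the product of a mean and a fluctuating value average to zero and are eliminated
\[ \frac{\partial \bar{u}}{\partial t} + \nabla\cdot(\bar{u}\,\bar{u}) - \nabla\cdot(\nu \nabla \bar{u}) = -\nabla \bar{p} + \nabla\cdot(\overline{u' u'}) \tag{12.3} \]
\[ \nabla\cdot\bar{u} = 0 \tag{12.4} \]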



• One new term: the Reynolds stress tensor:
\[ R = \overline{u' u'} \tag{12.5} \]

$R$ is a second rank symmetric tensor. We have seen something similar when the continuum mechanics equations were assembled, but with clear separation of scales: molecular interaction is described as diffusion

Modelling Paradigms
• In order to close the system, we need to describe the unknown value, $R$, as a function of the solution. Two ways of doing this are:
  1. Write an algebraic function, resulting in eddy viscosity models
\[ R = f(\bar{u}, \bar{p}) \tag{12.6} \]
  2. Add more differential equations, i.e. a transport equation for $R$, producing Reynolds Transport Models. A note of warning: as we keep introducing new equations, the above problem will recur. In the end, option 1 will need to be used at some level of closure
• Both options are in use today, but the first one massively outweighs the second in practicality


Eddy Viscosity Models

Dimensional Analysis
• Looking at $R$, the starting point is to find an appropriate symmetric second rank tensor. Remember that the term appears in the momentum equation under divergence and acts as diffusion of momentum
• Based on this, the second rank tensor is the symmetric velocity gradient $S$:
\[ R = f(S) \tag{12.7} \]
where
\[ S = \frac{1}{2}\left[\nabla \bar{u} + (\nabla \bar{u})^T\right] \tag{12.8} \]
Under divergence, this will produce a $\nabla\cdot(\nabla \bar{u})$ kind of term, which makes physical sense and is numerically well behaved
• Using dimensional analysis, it turns out that we need a pre-factor with dimensions of viscosity, $[m^2/s]$ as for laminar flows, and because of its equivalence with laminar viscosity we may call it turbulent viscosity $\nu_t$



• The problem reduces to finding $\nu_t$ as a function of the solution. Looking at dimensions, we need a length and time-scale, either postulated or calculated. On second thought, it makes more sense to use velocity scale $U$ and length-scale $\Delta$
• We can think of the velocity scale as the size of $u'$ and length-scale as the size of energy-containing vortices. Thus:
\[ R = \nu_t \, \frac{1}{2}\left[\nabla \bar{u} + (\nabla \bar{u})^T\right] \tag{12.9} \]
and
\[ \nu_t = A \, U \, \Delta \tag{12.10} \]
where $A$ is a dimensionless constant allowing us to tune the model to the actual physical behaviour

Velocity and Length Scale
• Velocity scale is relatively easy: it represents the strength of turbulent fluctuations. Thus, $U \approx |u'|$. Additionally, it is easy to derive the equation for turbulence kinetic energy $k$:
\[ k = \frac{3}{2} u'^2 \tag{12.11} \]
directly from the momentum equation in the following form:
\[ \frac{\partial k}{\partial t} + \nabla\cdot(\bar{u} k) - \nabla\cdot[(\nu_{eff})\nabla k] = \nu_t \left[\frac{1}{2}\left(\nabla \bar{u} + \nabla \bar{u}^T\right)\right]^2 - \epsilon \tag{12.12} \]
Here $\epsilon$ is turbulent dissipation which contains the length scale:
\[ \epsilon = C_\epsilon \frac{k^{3/2}}{\Delta} \tag{12.13} \]

Zero and One-Equation Models
• Zero equation model: assume local equilibrium in the $k$ equation above, i.e. generation balances dissipation, with no transport. The problem reduces to the specification of length-scale. Example: Smagorinsky model
\[ \nu_t = (C_S \Delta)^2 |S| \tag{12.14} \]
where $C_S$ is the Smagorinsky “constant”. The model is actually in active use (!) but not in this manner: see below
• One equation model: solve the $k$ equation and use an algebraic equation for the length scale. Example: length-scale for airfoil simulations can be determined from the distance to the wall

Two-Equation Model

• Two-equation models are the work-horse of engineering simulations today. Using the $k$ equation from above, the system is closed by forming an equation for turbulent dissipation $\epsilon$ and modelling its generation and destruction terms
• Other choices also exist. For example, the Wilcox model uses the inverse of the eddy turnover time, $\omega$, as the second variable, claiming better behaviour near the wall and easier modelling
• Two-equation models are popular because they account for transport of both the velocity and length-scale and can be tuned to return several canonical results

Standard $k-\epsilon$ Model
• This is the most popular 2-equation model, now on its way out. There exist a number of minor variants, but the basic idea is the same
• Turbulence kinetic energy equation
\[ \frac{\partial k}{\partial t} + \nabla\cdot(\bar{u} k) - \nabla\cdot[(\nu_{eff})\nabla k] = G - \epsilon \tag{12.15} \]
where
\[ G = \nu_t \left[\frac{1}{2}\left(\nabla \bar{u} + \nabla \bar{u}^T\right)\right]^2 \tag{12.16} \]
• Dissipation of turbulence kinetic energy equation
\[ \frac{\partial \epsilon}{\partial t} + \nabla\cdot(\bar{u} \epsilon) - \nabla\cdot[(\nu_{eff})\nabla \epsilon] = C_1 \frac{\epsilon}{k} G - C_2 \frac{\epsilon^2}{k} \tag{12.17} \]
• Turbulent viscosity
\[ \nu_t = C_\mu \frac{k^2}{\epsilon} \tag{12.18} \]
• Reynolds stress
\[ R = \nu_t \, \frac{1}{2}\left(\nabla \bar{u} + \nabla \bar{u}^T\right) \tag{12.19} \]

• Model constants are tuned to canonical flows. Which?




Reynolds Transport Models

Background
• Transport equation for Reynolds stress $R = f(\bar{u}, \bar{p})$ is derived in a manner similar to the derivation of the Reynolds-averaged Navier-Stokes equation. We encounter a number of terms which are physically difficult to understand (a pre-requisite for the modelling)
• Again the most difficult term is the destruction of $R$, which will be handled by solving its own equation: it is unreasonable to expect a postulated or equilibrium length-scale to be satisfactory
• Analytical form of the (scalar) turbulence destruction equation is even more complex: in full compressible form it contains over 70 terms
• The closure problem can be further extended by writing out equations for higher moments etc. but “natural” closure is never achieved: the number of new terms expands much faster than the number of equations

Modelling Reynolds Stress Equation
• Briefly looking at the modelling of the $R$ and $\epsilon$ equations, physical understanding of various terms is relatively weak and uninteresting. As a result, terms are grouped into three categories
  – Generation terms
  – Redistribution terms
  – Destruction terms
Each category is then modelled as a whole
• Original closure dates from the 1970s and, in spite of considerable research efforts, it has always contained problems
• Currently, Reynolds transport models are used only in situations where it is a-priori known that eddy viscosity models fail. Example: cyclone simulations

Standard Closure
• Reynolds stress transport equation
\[ \frac{\partial R}{\partial t} + \nabla\cdot(\bar{u} R) - \nabla\cdot[(\alpha_R \nu_t + \nu_l)\nabla R] = P - C_1 \frac{\epsilon}{k} R + \frac{2}{3}(C_1 - 1) I \epsilon - C_2 \,\mathrm{dev}(P) + W \tag{12.20} \]
where
  – $P$ is the production term
\[ P = -R \cdot \left[\nabla \bar{u} + (\nabla \bar{u})^T\right] \tag{12.21} \]
  – $\nu_t$ is the turbulent viscosity
\[ \nu_t = C_\mu \frac{k^2}{\epsilon} \tag{12.22} \]
  and $k$ is the turbulent kinetic energy
\[ k = \frac{1}{2} \mathrm{tr}(R) \tag{12.23} \]
  – $W$ is the wall reflection term(s)
  – $G$ is the (scalar) generation term
\[ G = \frac{1}{2} \mathrm{tr}(P) \tag{12.24} \]
• Dissipation equation: $\epsilon$ is still a scalar
\[ \frac{\partial \epsilon}{\partial t} + \nabla\cdot(\bar{u} \epsilon) - \nabla\cdot[(\alpha_\epsilon \nu_t + \nu_l)\nabla \epsilon] = C_1 \frac{\epsilon}{k} G - C_2 \frac{\epsilon^2}{k} \tag{12.25} \]
  – $P$ is the production term
\[ P = -R \cdot \left[\nabla \bar{u} + (\nabla \bar{u})^T\right] \tag{12.26} \]
  – $G$ is the (scalar) generation term
\[ G = \frac{1}{2} \mathrm{tr}(P) \tag{12.27} \]

Comparing Reynolds Closure with Eddy Viscosity Models
• Eddy viscosity implies that the Reynolds stress tensor is aligned with the velocity gradient
\[ R = \nu_t \, \frac{1}{2}\left[\nabla \bar{u} + (\nabla \bar{u})^T\right] \tag{12.28} \]
This would represent local equilibrium: compare with equilibrium assumptions for $k$ and $\epsilon$ above
• In cases where the two tensors are not aligned, Reynolds closure results are considerably better
• . . . but at a considerable cost increase: more turbulence equations, more serious coupling with the momentum equation




Near-Wall Effects

Turbulence Near the Wall
• Principal problem of turbulence next to the wall is the inverted energy cascade: small vortices are rolled up and ejected from the wall. Here, small vortices create big ones, which is not accounted for in the standard modelling approach
• Presence of the wall constrains the vortices, giving them orientation: effect on turbulent length-scales
• Most seriously of all, both velocity and turbulence properties contain very steep gradients near the wall. Boundary layers at high Re are extremely thin. Additionally, turbulent length-scale exhibits complex behaviour: in order for the model to work well, all of this needs to be resolved in the simulation

Resolved Boundary Layers
• Low-Re Turbulence Models are based on the idea that all details of turbulent flow (in the mean: this is still RANS!) will be resolved
• In order to achieve this, damping functions are introduced in the near-wall region and tuned to actual (measured, DNS) near-wall behaviours
• Examples of such models are: Launder-Sharma, Lam-Bremhorst $k-\epsilon$
• Near-wall resolution requirements and boundary conditions depend on the actual model, but range from $y^+ = 0.01$ to $0.1$ for the first node, with grading away from the wall. This is a massive resolution requirement!
• If the resolution requirement is not satisfied, models will typically blow up. On stabilisation, velocity profile and wall drag will be wrong

Wall Functions
• In engineering simulations, we are typically not interested in the details of the near-wall region. Instead, we need to know the drag
• This allows us to bridge the troublesome region near the wall with a coarse mesh and replace it with an equilibrium model for attached flows: wall functions
• Wall functions bridge the problematic near-wall region, accounting for drag increase and turbulence. A typical resolution requirement is $y^+ = 30$ to $50$, but coarser meshes can also be used



• This is a simple equilibrium model for a fully developed attached boundary layer. It will cause loss of accuracy in non-equilibrium boundary layers, but it will still produce a result
• Wall functions split the region of operation below and above $y^+ = 11.6$ and revert to laminar flow below it. Here, increased mesh resolution may result in less accurate drag prediction: this is not a well-behaved model
• Advanced wall functions may include effects of adverse pressure gradient and similar but are still a very crude model
• Note that wall functions are used with high-Re bulk turbulence models, reducing the need for high resolution next to the wall

What Can a Low-Re Model Do For Me?
• With decreasing Re number, turbulence energy spectrum loses its inertial range and regularity: energy is not moved smoothly from larger scales to smaller; importance of dissipation spreads to lower wavenumber
• Low-Re models are aimed at capturing the details of the near-wall flow, characterised by lower Re
• However, near-wall turbulence is nothing like low-Re bulk flow: this is to do with the presence and effect of the wall, not the loss of turbulence structure
• A low-Re turbulence model is not appropriate for low-Re flows away from the wall: the results will be wrong!


Transient RANS Simulations

Concept of Transient RANS
• RANS equations are derived by separating the variable into the mean and fluctuation around it. In simple situations, this implies a well-defined meaning: mean is (well,) mean, implying time-independence, and the fluctuation carries the transient component
• In many physical simulations, having a time-independent mean makes no sense: consider a flow simulation in an internal combustion engine. Here, we will change the mean into an ensemble average (over a number of identical experiments) and allow the mean to be time-dependent
• In other cases, the difference between the mean and fluctuation may become even more complex: consider a vortex shedding behind a cylinder at high Re, where large shed vortices break up into turbulence further downstream

12.4 Large Eddy Simulation


• Idea of RANS here is recovered through separation of scales, where large scales are included in the time-dependence of the mean and turbulence is modelled as before. It is postulated that there exists separation of scales between the mean (= coherent structures) and turbulence

Using Transient RANS
• Transient RANS is a great step forward in the fidelity of modelling. Consider a flow behind an automobile, with counter-rotating vortices in the wake and various other unsteady effects. Treating it as “steady” implies excessive damping, typically done through first-order numerics because the simulation does not naturally converge to steady-state
• Simulations can still be 2-D where appropriate and the answer is typically analysed in terms of a mean and coherent structure behaviour
• RANS equations are assembled as before, using a transient Navier-Stokes simulation. Usually, no averaging is involved

Transitional Flows
• Phenomena of transition are extremely difficult to model: as shown before, a low-Re turbulence model would be a particularly bad choice
• The flow consists of a mixture of laminar pockets and various levels of turbulence, with laminar-to-turbulent transition within it
• Apart from the fact that a low-Re flow is difficult to model in RANS, an additional problem stems from the fact that $k = \epsilon = 0$ is a solution to the model: thus if no initial or boundary turbulence is given, transition will not take place
• Introducing intermittency equation: to handle this, a RANS model is augmented by an equation marking the presence of turbulence
• Transition models are hardly available: basically, a set of correlations is packed as transport equations. Details of proper boundary conditions, well-posedness of the model, user-controlled parameters and model limitations are badly understood. A better approach is needed!


Large Eddy Simulation

Deriving LES Equations



• Idea of LES comes from the fact that large-scale turbulence strongly depends on the mean, geometry and boundary conditions, making it case-dependent and difficult to model. Small-scale turbulence is close to homogeneous and isotropic, its main role is energy removal from the system, it is almost universal (Re dependence) and generally not of interest
• Mesh resolution requirements are imposed by the small scales, which are not of interest anyway
• In LES we shall therefore simulate the coherent structures and large-scale turbulence and model small-scale effects
• For this purpose, we need to make the equations understand scale, using equation filtering: a variable is decomposed into large scales which are solved for and modelled small scales. To help with the modelling, we wish to capture a part of the inertial range and model the (universal) high wavenumber part of the spectrum
• Unlike transient RANS, an LES simulation still captures a part of turbulence dynamics: a simulation must be 3-D and transient, with the results obtained by averaging

Filtered Navier-Stokes Equations
• Equation filtering is mathematically defined as:
\[ \bar{u} = \int G(x, x')\, u(x')\, dx' \tag{12.29} \]
where $G(x, x')$ is the localised filter function
• Various forms of the filter functions can be used: local Gaussian distribution, top-hat etc. with minimal differences. The important principle is localisation
• After filtering, the equation set looks very similar to RANS, but the meaning is considerably different
\[ \frac{\partial \bar{u}}{\partial t} + \nabla\cdot(\bar{u}\,\bar{u}) - \nabla\cdot(\nu \nabla \bar{u}) = -\nabla \bar{p} + \nabla\cdot\tau \tag{12.30} \]
\[ \nabla\cdot\bar{u} = 0 \tag{12.31} \]
with
\[ \tau = \left(\overline{\bar{u}\,\bar{u}} - \bar{u}\,\bar{u}\right) + \left(\overline{\bar{u}\,u'} + \overline{u'\,\bar{u}}\right) + \overline{u' u'} = L + C + B \tag{12.32} \]



• The first term, $L$, is called the Leonard stress. It represents the interaction between two resolved scale eddies to produce small scale turbulence
• The second term, $C$ (cross term), contains the interaction between resolved and small scale eddies. It can transfer energy in either direction but on average follows the energy cascade
• The third term represents interaction between two small eddies to create a resolved eddy. $B$ (backscatter) represents energy transfer from small to large scales

Sub-Grid Scale (SGS) Modelling
• The scene in LES has been set to ensure that simple turbulence models work well: small-scale turbulence is close to homogeneous and isotropic
• The length-scale is related to the separation between resolved and unresolved scales: therefore, it is related to the filter width
• In LES, implicit filtering is used: separation between resolved and unresolved scales depends on mesh resolution. Filter size is therefore calculated as a measure of mesh resolution and results are interpreted accordingly
• Typical models in use are of the Smagorinsky type, with fixed or dynamic coefficients. In most models, all three terms are handled together
• Advanced models introduce some transport effects by solving a sub-grid $k$-equation, use double filtering to find out more about the sub-grid scale or create a “structural picture” of sub-grid turbulence from resolved scales
• Amazingly, most models work very well: it is only important to remove the correct amount of energy from resolved scales

LES Inlet and Boundary Conditions
• In the past, research on LES has centred on sub-grid scale modelling and the problem can be considered to be resolved
• Two problematic areas in LES are the inlet conditions and near-wall treatment
• Modelling near-wall turbulence
  – A basic assumption of LES is energy transfer from large towards smaller scales, with the bulk of dissipation taking place in small vortices. Near the wall, the situation is reversed: small vortices and streaks are rolled up on the wall and ejected into the bulk



  – Reversed direction of the energy cascade violates the modelling paradigm. In principle, the near-wall region should be resolved in full detail, with massive resolution requirements
  – A number of modelling approaches to overcome the problem exist: structural SGS models (guessing the sub-grid scale flow structure), dynamic SGS models, approaches inspired by the wall function treatment and Detached Eddy Simulation
• Inlet boundary condition. On inlet boundaries, flow conditions are typically known in the mean, or (if we are lucky) with $u'$ and turbulence length-scale. An important property of turbulence is the energy cascade: correlation between various scales and vortex structures. The inlet condition should contain the “real” turbulence interaction and it is not immediately clear how to do this

Energy-Conserving Numerics
• For accurate LES simulation, it is critical to correctly predict the amount of energy removal from resolved scales into sub-grid. This is the role of an SGS model
• In order for the SGS model to perform its job, it is critical that the rest of the implementation does not introduce dissipative errors: we need energy-conserving numerics
• Errors introduced by spatial and temporal discretisation must not interfere with the modelling
• In short, good RANS numerics is not necessarily sufficient for LES simulations. In RANS, a desire for steady-state and performance of RANS models masks poor numerics; in LES this is clearly not the case

Averaging and Post-Processing
• Understanding LES results is different from looking at steady or transient RANS: we have at our disposal a combination of instantaneous fields and averaged results
• Resolved LES fields contain a combination of mean (in the RANS sense) and large-scale turbulence. Therefore, they are extremely useful in studying the details of flow structure
• The length of simulation, number of averaging steps etc. is studied in terms of converging averages: for statistically steady simulations, averages must converge!

12.5 Choosing a Turbulence Model


• A good LES code will provide a set of on-the-fly averaging tools to assemble data of interest during the run
• Flow instability and actual vortex dynamics will be more visible in the instantaneous field
• Data post-processing
  – It is no longer trivial to look at and understand the LES results, especially in terms of vortex interaction: we typically use special derived fields, e.g. enstrophy (squared magnitude of the curl of velocity), invariants of the strain tensor etc.
  – Looking at LES results takes some experience and patience: data sets will be very large


Choosing a Turbulence Model

Background
• There exists a wide range of turbulence models in various approaches to the problem. The role of a good engineer is to choose the best for the problem at hand
• Important factors are the goal of simulation, available computer resources and required accuracy
• In what follows, we will give a short overview of “traditional” choices


Turbulence Models in Airfoil Simulations

Single and Multiple Airfoils • Simulations typically done in steady-state and 2-D • Objective of simulation is mainly lift/drag and stall characteristics • This automatically implies 2-D steady-state RANS. Moreover, region of interest is close to the surface of the airfoil; the bulk flow is simple • Presence of the wall allows for simple prescription of length-scale

New Challenges

• Laminar-to-turbulent transition occurs along the airfoil; in multiple airfoil configuration, upstream components trigger transition downstream
• In order to handle transition, new models are being developed (currently: useless!)
• Problematic region is also found around the trailing edge: flow detachment
• LES is prohibitively expensive: from steady-state 2-D RANS to unsteady 3-D with averaging

Choice of Models
• Zero-equation and one-equation turbulence models for aeronautics
• Baldwin-Lomax and Cebeci-Smith models are the usual choices. Spalart-Allmaras model represents the “new generation” and across all models the performance is very good
• This is a very popular set of cases for low-Re RANS models
• 2-equation models are also used regularly. A very popular model is the $k-\omega$ because of its performance close to the wall


Turbulence Models in Bluff-Body Aerodynamics

Background
• Bluff body flows (e.g. complete aircraft, automobile, submarine) are considerably more complex, both in the structure of boundary layers and in the wake
• Abandoning local equilibrium: transport of turbulence and length-scale
• A standard choice of model would be 2-equation RANS with wall functions. Currently moving to transient RANS

Choice of Models
• The $k-\epsilon$ model and its variants and the $k-\omega$ model represent the normal industrial choice. There are still issues with mesh resolution for full car/aeroplane aerodynamics: meshes for steady RANS with wall functions can be of the order of 100 million cells and larger

12.6 Future of Turbulence Modelling in Industrial Applications


• Low-Re formulations for wall-bounded flows are not popular: excessive mesh resolution for realistic geometric shapes
• Study of instabilities and aero-acoustic effects is moving steadily to LES. Typically, only a part of the geometry is modelled and coupled to the global (RANS) model for boundary conditions. Examples: bomb bay in aeroplanes or wing mirrors in automobiles


Future of Turbulence Modelling in Industrial Applications

Future Trends
• Future trends are quite clear: moving from RANS modelling to LES on a case-by-case basis, depending on problems with current models and available computer resources
• RANS is recognised as insufficient in principle because of the decomposition into mean and fluctuation. Also, models are too diffusive to capture detailed flow dynamics. Research in RANS is scaled down to industrial support; everything else is moving to LES
• Transient RANS is a stop-gap solution until LES becomes available at reasonable cost
• DNS remains out of reach for all engineering use, but provides a very good base for model development and testing



Chapter 13 Large-Scale Computations
13.1 Background

In this chapter, a computing background for CFD simulations in engineering will be examined. In 1965, Gordon Moore, Director of Fairchild Semiconductor’s Research and Development Laboratories, wrote an article on the future development of the semiconductor industry; the trend he described is commonly summarised as computing power at fixed cost doubling every 18 months. Increasing computer power is the driving force behind the expansion of numerical simulation tools. Every new level of performance brings a possibility of tackling new problems, using more advanced models or achieving higher simulation fidelity.
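Taking the 18-month doubling at face value (an idealisation, not a measured figure), the growth in computing power $P$ at fixed cost over $t$ years can be worked out as:

```latex
\[
  \frac{P(t)}{P(0)} = 2^{\,t/1.5},
  \qquad
  \frac{P(10)}{P(0)} = 2^{\,10/1.5} = 2^{6.67} \approx 100
\]
```

In other words, under this assumption a simulation that is just feasible today becomes roughly a hundred times cheaper within a decade.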


Computer Power in Engineering Applications

Background • CFD simulations are among the largest users of CPU time in the world. Even for a relative novice, it is easy to devise and set up a very large simulation that would yield relevant results • Other computational fields with similar level of requirements include – Numerical weather forecasting. Currently at the level of first-order models and correlations tuned to the mesh size. Large facilities and efforts at the UK Met Office and in Japan – Computational chemistry: detailed atom-level study of chemical reactions from first principles – Global climate modelling. This includes ocean and atmosphere models, vapour in atmosphere and polar ice caps effects. Example: global climate model facility (“Earth Simulator”)



  – Direct numerical simulation of turbulence, mainly as replacement for experimental studies
• In all cases, the point is how to achieve the maximum with the available computing resources rather than how to perform the largest simulation. A small simulation with equivalent speed, accuracy etc. is preferred

Simulation Time
• Typical simulation time depends on available resources, object of simulation and required accuracy. Recently, the issue of optimal use of computer resources has come into play: running a trivial simulation on a supercomputer is not fair game
• Example: parametric studies, optimisation and robust design in engineering. Here, the point is to achieve optimal performance of engineering equipment by in-depth analysis. Optimisation algorithms will perform hundreds of related simulations with subtle changes in geometrical and flow setup details in order to achieve a multi-objective optimum. Each simulation on its own can be manageable, but we need several hundreds!
• In many cases, the limiting factor is not feasibility, but time to market: a Formula 1 car must be ready for the next race (or next season)
• Reducing simulation time:
  4. Algorithmic improvements: faster, more accurate numerics, time-stepping algorithms
  3. Linear solver speed. Numerical solution of large systems of algebraic equations is still under development. Having in mind that a good solver spends 50-80 % of solution time inverting matrices, this is a very important research area. Interaction with computer hardware (how does the solver fit onto a supercomputer to use it to the best of its abilities) is critical
  2. Physical modelling. A typical role of a model is to describe complex physics of small scales in a manner which is easier to simulate. Better models provide sufficient accuracy for available resource
  1. User expertise. The best way of reducing simulation time is an experienced user. Physical understanding of the problem, modelling, properties of numerics and required accuracy allows the user to optimally allocate computer resources. Conversely, there is no better way of producing useless results or wasting computer resources than applying numerical tools without understanding.

13.2 Classification of Computer Platforms


Scope
• Our objective is to examine the architecture requirements, performance and limitations of large-scale CFD simulations today
• There is no need to understand the details of high performance programming or parallel communications algorithms: we wish to know what parallelism means, how to use it and how it affects solver infrastructure
• Crucially, we will examine the mode of operation of parallel computers, the choice of algorithms and their tuning
• The first step is classification of high-performance computer platforms in use today


Classification of Computer Platforms

High Performance Computers
• Basic classification of high performance architecture depends on how instructions and data are handled in the computer (Flynn, 1972). Thus:
  – SISD: single instruction, single data
  – SIMD: single instruction, multiple data
  – MISD: multiple instruction, single data
  – MIMD: multiple instruction, multiple data
• The above covers all possibilities. SISD is no longer considered high performance. In short, SISD is a very basic processing unit (a toaster?)
• We shall concentrate on SIMD, also called a vector computer, and MIMD, known as a parallel computer. MISD is sometimes termed pipelining and is considered a “hardware optimisation” rather than a programming technique

Vector Computers
• Computationally intensive part of CFD algorithms involves performing identical operations on large sets of data. Example: calculation of face values from cell centres for gradient calculation in cell-centred FVM:
\[ \phi_f = f_x \phi_P + (1 - f_x)\phi_N \tag{13.1} \]
• $\phi_P$ and $\phi_N$ belong to the same array over all cells. The result, $\phi_f$, belongs to an array over all faces. Subscripts $P$, $N$ and $f$ will be cell and face indices:


Large-Scale Computations

const labelList& owner = mesh.owner(); const labelList& neighbour = mesh.neighbour(); const scalarField& fx = mesh.weights(); for (label i = 0; i < phiFace.size(); i++) { phiFace[i] = fx[i]*phiCell[owner[i]] + (1 - fx[i])*phiCell[neighbour[i]]; }
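The loop above relies on the mesh classes of the solver library. As a minimal self-contained sketch of the same operation, the version below replaces the library types with `std::vector`; the function name `interpolateToFaces` and the use of `std::size_t` for cell indices are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Interpolate cell-centred values to internal faces:
//     phiFace[i] = fx[i]*phiCell[owner[i]] + (1 - fx[i])*phiCell[neighbour[i]]
// owner/neighbour hold the two cell indices of each face, fx the weights
std::vector<double> interpolateToFaces
(
    const std::vector<double>& phiCell,
    const std::vector<std::size_t>& owner,
    const std::vector<std::size_t>& neighbour,
    const std::vector<double>& fx
)
{
    assert(owner.size() == fx.size());
    assert(neighbour.size() == fx.size());

    std::vector<double> phiFace(fx.size());

    // Same operation for every face, contiguous result array, no branches
    // and no dependence between iterations
    for (std::size_t i = 0; i < phiFace.size(); ++i)
    {
        phiFace[i] =
            fx[i]*phiCell[owner[i]]
          + (1.0 - fx[i])*phiCell[neighbour[i]];
    }

    return phiFace;
}
```

For three cells in a row with values 1, 3 and 7 and weights 0.5 and 0.25, the two face values would be 2 and 6.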

• Performing an operation like this consists of several parts
  – (Splitting up the operation into bits managed by the floating point unit)
  – Setting up the instruction registers, e.g. a = b + c * d
  – Fetching the data (memory, primary cache, secondary cache, registers)
  – Performing the operation
• In vector computers, the idea is that performing the same operation over a large set can be made faster: create special hardware with lots of identical (floating point) units under unified control
  1. Set up instruction registers. This is done only once for the complete data set
  2. Assume the data is located in a contiguous memory space. Fetching the start of the list grabs the whole list
  3. Perform the operation on a large data set simultaneously

More Vector Computers
• The number of joint units is called the vector length. It specifies how many operations can be performed together. Typical sizes would be 256 or 1024: potentially very fast!
• Some care is required in programming. Examples:
  – Do-if structure
    for (label i = 0; i < phiFace.size(); i++)
    {
        // Branch inside the loop: a decision is taken for every face
        if (fx[i] < 0.33)
        {
            phiFace[i] = 0.5*(phiCell[owner[i]] + phiCell[neighbour[i]]);
        }
        else
        {
            phiFace[i] =
                fx[i]*phiCell[owner[i]]
              + (1 - fx[i])*phiCell[neighbour[i]];
        }
    }


  This kills performance: a decision is required at each index. Reorganise to execute the complete loop twice and then combine the results. Minimum performance loss: 50%!
  – Data dependency

    for (label i = 0; i < phiFace.size(); i++)
    {
        // Each update reads an entry that another iteration may have just written
        phiCell[i] -= fx[i]*phiCell[owner[i]];
    }

  Values of phiCell depend on each other: if this happens within a single vector length, we have a serious problem!
• Today, vector computers are considered “very 1970s”. The principle works, but loss of performance due to poor programming or compiler problems is massive
• Compilers and hardware are custom-built: one cannot use off-the-shelf components, making the computers very expensive indeed
• However, the lesson on vectorisation is critical for understanding high-performance computing. Modern CPUs will automatically and internally attempt to configure themselves as vector machines (with a vector length of 10-20, for example). If the code is written vector-safe and the compiler is good, there will be a substantial jump in performance
• There is a chance that vector machines will make a come-back: the principle of operation is sound but we need to make sure things are done more cleverly and automatically

Parallel Computers
• Recognising that vector computers perform their magic by doing many operations simultaneously, we can attempt something similar: can a room full of individual CPUs be made to work together as a single large machine?



• The idea of massive parallelism is that a large loop (e.g. the cell-face loop above) could be executed much faster if it is split into bits and each part is given to a separate CPU unit to execute. Since all operations are the same, there is formally no problem in doing the decomposition
• Taking a step back, we may generalise: a complete simulation can be split into separate bits, where each bit is given to a separate computer. Solution of separate problems is then algorithmically coupled together to create a solution of the complete problem

Parallel Computer Architecture
• Similar to high-performance architecture, parallel computers differ in how each node (CPU) can see and access data (memory) on other nodes. The basic types are:
  – Shared memory machines, where a single node can see the complete memory (also called addressing space) with “no cost overhead”
  – Distributed memory machines, where each node represents a self-contained unit, with local CPU, memory and disk storage. Communication with other nodes involves network access and is associated with considerable overhead compared to local memory access
• In reality, even shared memory machines have variable access speed and special architecture: other approaches do not scale well to 1000s of nodes. Example: CC-NUMA (Cache Coherent Non-Uniform Memory Access)
• For distributed memory machines, a single node can be an off-the-shelf PC or a server node. Individual components are very cheap, the approach scales well and is limited by the speed of (network) communication. This is the cheapest way of creating extreme computing power from standard components
• Truly massively parallel supercomputers are an architectural mixture of local quasi-shared memory and fast-networked distributed memory nodes. Writing software for such machines is a completely new challenge

Coarse- and Fine-Grain Parallelisation
• We can approach the problem of parallelism at two levels



  – In coarse-grain parallelisation, the simulation is split into a number of parts and their inter-dependence is handled algorithmically. The main property of coarse-grain parallelisation is algorithmic impact. The solution algorithm itself needs to account for multiple domains and program parallel support
  – Fine-grain parallelisation operates on a loop-by-loop level. Here, each loop is analysed in terms of data and dependency and, where appropriate, it may be split among various processors. Fine-grain action can be performed by the compiler, especially if the communications impact is limited (e.g. shared memory computers)
• In CFD, this usually involves domain decomposition: the computational domain (mesh) is split into several parts (one for each processor). This corresponds to coarse-grain parallelisation
[Figure: a global domain decomposed into Subdomain 1, Subdomain 2, Subdomain 3 and Subdomain 4]
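As a rough illustration of domain decomposition, the sketch below allocates cells to processors in contiguous blocks and counts the faces that end up between two sub-domains. Both function names are hypothetical, and the block allocation is only the simplest possible choice: it balances the cell count but makes no attempt to minimise the interface size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Allocate nCells cells to nProcs processors in contiguous blocks whose
// sizes differ by at most one cell
std::vector<int> blockDecompose(std::size_t nCells, std::size_t nProcs)
{
    assert(nProcs > 0);

    std::vector<int> cellProc(nCells);
    for (std::size_t c = 0; c < nCells; ++c)
    {
        cellProc[c] = static_cast<int>((c*nProcs)/nCells);
    }
    return cellProc;
}

// Count faces whose two cells ended up on different processors.
// These become inter-processor boundary faces and require communication
std::size_t countInterProcessorFaces
(
    const std::vector<std::size_t>& owner,
    const std::vector<std::size_t>& neighbour,
    const std::vector<int>& cellProc
)
{
    assert(owner.size() == neighbour.size());

    std::size_t nCut = 0;
    for (std::size_t f = 0; f < owner.size(); ++f)
    {
        if (cellProc[owner[f]] != cellProc[neighbour[f]])
        {
            ++nCut;
        }
    }
    return nCut;
}
```

For a one-dimensional row of 10 cells split over 4 processors, this gives blocks of 3, 2, 3 and 2 cells and 3 inter-processor faces.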

• While fine-grain parallelisation sounds interesting, the current generation of compilers is not sufficiently clever for complex parallelisation jobs. Examples include algorithmic changes in linear equation solvers to balance local work with communications: this action cannot be performed by the compiler

Build Your Own Supercomputer
• In the age of commodity computing, the price of individual components is falling: processors, memory chips, motherboards, networking components and hard disks are commodity components
• A distributed memory computer can be built up to medium size (dozens of compute nodes) without concern: the balance of computing power and communication speed is acceptable. Such machines are usually called Beowulf clusters
• As a result, parallel machines have become immensely popular and are used regularly even for medium-size simulations




13.3 Domain Decomposition Approach

In this section we will review the impact of parallel domain decomposition on various parts of the algorithm. The starting point is a computational mesh decomposed into a number of sub-domains.



Functionality
• In order to perform a parallel FVM simulation, the following steps are performed:
  – The computational domain is split up into meshes, each associated with a single processor. This consists of 2 parts:
    ∗ Allocation of cells to processors
    ∗ Physical decomposition of the mesh in the native solver format
    Optimisation of communications is important: it scales with the surface of inter-processor interfaces and the number of connections. Both should be minimised. This step is termed domain decomposition
  – A mechanism for data transfer between processors needs to be devised. Ideally, this should be done in a generic manner, to facilitate porting of the solver between various parallel platforms: a standard interface to a communications package
  – The solution algorithm needs to be analysed to establish the inter-dependence and points of synchronisation
• Additionally, we need a handling system for a distributed data set, simulation start-up and shut-down and data analysis tools
• Keep in mind that during a single run we may wish to change the number of available CPUs and may wish to resume or perform data analysis on a single node

Parallel Communication Protocols
• Today, Message Passing Interface (MPI) is a de-facto standard. A programmer does not write custom communications routines. The standard is open and has several public domain implementations
• On large or specialist machines, the hardware vendor will re-implement or tune the message passing protocol to the machine, but the programming interface is fixed



• Modes of communication
  – Pairwise data exchange, where processors communicate with each other in pairs
  – Global synchronisation points: e.g. global sum. Typically executed as a tree-structured gather-scatter operation
• Communication time is influenced by 2 components
  – Latency, or the time interval required to establish a communication channel
  – Bandwidth, or the amount of data per second that can be transferred by the system

Mesh Partitioning Tools
• The role of a mesh partitioner is to allocate each computational point (cell) to a CPU. In doing so, we need to account for:
  – Load balance: all processing units should have approximately the same amount of work between communication and synchronisation points
  – Minimum communication, relative to local work. Performing local computations is orders of magnitude faster than communicating the data
• Achieving the above is not trivial, especially if the computing load varies during the calculation

Handling Parallel Computations and Data Sets
• The purpose of parallel machines is to massively scale up computational facilities. As a result, the amount of data handled and preparation work is not trivial
• Parallel post-processing is a requirement. Regularly, the only machine capable of handling simulation data is the one on which the computation has been performed. For efficient data analysis, all post-processing operations also need to be performed in parallel and presented to the user in a single display or under a single heading: parallelisation is required beyond the solver
• On truly large cases, mesh generation is also an issue: it is impossible to build a complete geometry as a single model. Parallel mesh generation is still under development
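The two components of communication time and the load balance requirement discussed in this section can be put into numbers with a very simple model. The sketch below is an assumption for illustration, not a measured characteristic of any machine: a message costs the latency plus its size divided by the bandwidth, and load imbalance is the largest sub-domain relative to the mean. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Two-parameter cost model for one message:
//     time = latency + size/bandwidth
// latency in seconds, bandwidth in bytes per second
double messageTime(double latency, double bandwidth, double nBytes)
{
    assert(bandwidth > 0.0);
    return latency + nBytes/bandwidth;
}

// Load imbalance: most heavily loaded processor relative to the mean.
// A value of 1.0 means perfect balance; all processors wait for the slowest
double loadImbalance(const std::vector<std::size_t>& cellsPerProc)
{
    assert(!cellsPerProc.empty());

    const double total =
        std::accumulate(cellsPerProc.begin(), cellsPerProc.end(), 0.0);
    assert(total > 0.0);

    const double maxLoad = static_cast<double>
    (
        *std::max_element(cellsPerProc.begin(), cellsPerProc.end())
    );

    return maxLoad*static_cast<double>(cellsPerProc.size())/total;
}
```

With a hypothetical latency of 1e-6 s and bandwidth of 1e9 bytes/s, sending 8000 bytes as 1000 separate 8-byte messages is over a hundred times slower than sending them as a single message, which is why interface data is packed into buffers before it is sent.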




Parallel Algorithms

Finally, let us consider parallelisation of three components of a CFD algorithm for illustration purposes.

Mesh Support
• For purposes of algorithmic analysis, we shall recognise that each cell belongs to one and only one processor
• Mesh faces can be grouped as follows
  – Internal faces, within a single processor mesh
  – Boundary faces
  – Inter-processor boundary faces: faces that used to be internal but are now separate and represented on 2 CPUs. No face may belong to more than 2 sub-domains
• Algorithmically, there is no change for internal and boundary faces. This is the source of parallel speed-up. Our challenge is to repeat the operations for faces on inter-processor boundaries

Gradient Calculation
• Using Gauss’ theorem, we need to evaluate face values of the variable. For internal faces, this is done through interpolation:

$$\phi_f = f_x \phi_P + (1 - f_x)\,\phi_N \qquad (13.2)$$

  Once calculated, the face value may be re-used until the cell-centred $\phi$ changes
• In parallel, $\phi_P$ and $\phi_N$ live on different processors. Assuming $\phi_P$ is local, $\phi_N$ can be fetched through communication: this is a once-per-solution cost, obtained by pairwise communication
• Note that all processors perform identical duties: thus, for a processor boundary between domains A and B, evaluation of face values can be done in 3 steps:
  1. Collect internal cell values from the local domain and send them to the neighbouring processor
  2. Receive neighbour values from the neighbouring processor
  3. Evaluate the local face value using interpolation
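The three steps above can be sketched without a communication library by keeping both sides of the interface in one program and handing the buffers over directly. The `ProcessorPatch` structure and the function names below are hypothetical; in a real solver step 2 would be a pairwise message exchange. Note that the weight stored on each side refers to its own local cell, so the two sides hold $f_x$ and $1 - f_x$ respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One side of an inter-processor boundary (hypothetical data layout)
struct ProcessorPatch
{
    std::vector<std::size_t> faceCells;  // local cell next to each interface face
    std::vector<double> weights;         // interpolation weight of the local cell
};

// Step 1: collect local cell values next to the interface into a send buffer
std::vector<double> packInternal
(
    const ProcessorPatch& patch,
    const std::vector<double>& phiCell
)
{
    std::vector<double> buffer(patch.faceCells.size());
    for (std::size_t i = 0; i < buffer.size(); ++i)
    {
        buffer[i] = phiCell[patch.faceCells[i]];
    }
    return buffer;
}

// Step 3: interpolate from the local values and the values received from
// the neighbouring processor (step 2 is the exchange of the two buffers)
std::vector<double> interpolatePatch
(
    const ProcessorPatch& patch,
    const std::vector<double>& local,
    const std::vector<double>& received
)
{
    assert(local.size() == patch.weights.size());
    assert(received.size() == patch.weights.size());

    std::vector<double> phiFace(local.size());
    for (std::size_t i = 0; i < phiFace.size(); ++i)
    {
        const double w = patch.weights[i];
        phiFace[i] = w*local[i] + (1.0 - w)*received[i];
    }
    return phiFace;
}
```

Calling `packInternal` on both sides, swapping the two buffers and then calling `interpolatePatch` on each side should give the same face values on domains A and B.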



Discretisation Routines: Matrix Assembly
• Similar to gradient calculation above, assembly of matrix coefficients on parallel boundaries can be done using simple pairwise communication
• In order to assemble the coefficient, we need geometrical information and some interpolated data: all readily available, maybe with some communication
• Example: off-diagonal coefficient of a Laplace operator

$$a_N = \gamma_f \frac{|\mathbf{s}_f|}{|\mathbf{d}_f|} \qquad (13.3)$$

  where $\gamma_f$ is the interpolated diffusion coefficient (see above). In the actual implementation, geometry is calculated locally and interpolation factors are cached to minimise communication
• Discretisation of a convection term is similarly simple
• Note: it is critical that both sides of a parallel interface calculate the identical coefficient. If consistency is not ensured, the simulation will fail
• Sources, sinks and temporal schemes all remain unchanged: each cell belongs to only one processor

Linear Equation Solvers
• The major impact of parallelism in linear equation solvers is in the choice of algorithm. For example, direct solver technology does not parallelise well, and is typically not used in parallel. Only algorithms that can operate on a fixed local matrix slice created by local discretisation will give acceptable performance
• In terms of code organisation, each sub-domain creates its own numbering space: locally, equation numbering always starts with zero and one cannot rely on global numbering: it breaks parallel efficiency
• With this in mind, coefficients related to parallel interfaces need to be kept separate and multiplied through in a separate matrix update
• Impact of parallel boundaries will be seen in:
  – Every matrix-vector multiplication operation
  – Every Gauss-Seidel or similar smoothing sweep
  . . . but nowhere else!
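A matrix-vector product with a separate interface update might look as follows. This is a sketch under stated assumptions: `LocalMatrix` is a hypothetical minimal face-addressed storage with local numbering starting from zero, and the neighbour values in `xReceived` are assumed to have arrived already through pairwise communication.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local matrix slice in face addressing (hypothetical minimal layout)
struct LocalMatrix
{
    std::vector<double> diag;            // one entry per local cell
    std::vector<double> lower;           // one entry per internal face
    std::vector<double> upper;           // one entry per internal face
    std::vector<std::size_t> owner;      // owner cell of each internal face
    std::vector<std::size_t> neighbour;  // neighbour cell of each internal face
};

// Purely local part of y = A x: identical to the serial code
std::vector<double> multiplyLocal
(
    const LocalMatrix& A,
    const std::vector<double>& x
)
{
    assert(x.size() == A.diag.size());

    std::vector<double> y(x.size());
    for (std::size_t c = 0; c < x.size(); ++c)
    {
        y[c] = A.diag[c]*x[c];
    }
    for (std::size_t f = 0; f < A.owner.size(); ++f)
    {
        y[A.owner[f]] += A.upper[f]*x[A.neighbour[f]];
        y[A.neighbour[f]] += A.lower[f]*x[A.owner[f]];
    }
    return y;
}

// Separate update for a parallel interface: the interface coefficients are
// stored apart from the local matrix and multiply the received values
void addInterfaceContribution
(
    std::vector<double>& y,
    const std::vector<std::size_t>& faceCells,
    const std::vector<double>& interfaceCoeffs,
    const std::vector<double>& xReceived
)
{
    assert(faceCells.size() == interfaceCoeffs.size());
    assert(faceCells.size() == xReceived.size());

    for (std::size_t f = 0; f < faceCells.size(); ++f)
    {
        y[faceCells[f]] += interfaceCoeffs[f]*xReceived[f];
    }
}
```

For a four-cell one-dimensional Laplacian with coefficients (-1, 2, -1) split into two sub-domains of two cells each, the two local products plus their interface updates reproduce the serial product.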

• Identical serial and parallel operation
  – If serial and parallel execution needs to be identical to the level of machine tolerance, additional care needs to be taken: algorithmically, the order of operations needs to be the same
  – This complicates algorithms, but is typically not required. Only large (badly behaved) meteorological models pose such requirements
  – Under normal circumstances, parallel implementation of linear equation solvers will provide results which vary from the serial version at the level of machine tolerance

Synchronisation
• Parallel domain decomposition solvers operate such that all processors follow an identical execution path in the code. In order to achieve this, some decisions and control parameters need to be synchronised across all processors
• Example: convergence tolerance. If one of the processors decides convergence is reached and others do not, they will attempt to continue with iterations and the simulation will lock up waiting for communication
• Global reduce operations synchronise decision-making and appear throughout high-level code
• Communication in a global reduce is of gather-scatter type: all CPUs send their data to CPU 0, which combines the data and broadcasts it back
• Actual implementation is more clever and controlled by the parallel communication protocol
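The convergence example can be sketched with a serial stand-in for the global reduce. Each entry of the vector below plays the role of the value held by one processor; `treeSum` combines the entries pairwise in a tree pattern, and every processor would then compare the same global value against the tolerance. The function names are hypothetical; in an MPI code this role is typically played by a collective call such as `MPI_Allreduce`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tree-structured global sum over the values held by each processor.
// Serial stand-in for the gather phase of a global reduce: in every round,
// each remaining processor adds the value of a partner 'stride' ranks away
double treeSum(std::vector<double> values)
{
    assert(!values.empty());

    for (std::size_t stride = 1; stride < values.size(); stride *= 2)
    {
        for (std::size_t i = 0; i + stride < values.size(); i += 2*stride)
        {
            values[i] += values[i + stride];
        }
    }

    // Processor 0 now holds the total, which would be broadcast back
    return values[0];
}

// All processors test the same reduced residual, so all take the same
// decision and follow the same execution path
bool globallyConverged
(
    const std::vector<double>& localResiduals,
    double tolerance
)
{
    return treeSum(localResiduals) < tolerance;
}
```

With local residuals of 1e-7, 5e-6 and 2e-7 and a tolerance of 1e-6, two of the three processors would stop if they decided on their own, while the reduced value keeps all three iterating.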

Chapter 14 Fluid-Structure Interaction
14.1 Scope of Simulations

A majority of simulation examples shown so far concentrate on a single physical phenomenon or set of equations in a domain. There also exists a set of coupled problems, where governing equations are zoned but still closely coupled.

Fluid-Structure Interaction (FSI)
• A number of engineering devices operate by combining various physical effects in a closely coupled manner. In such cases, it is insufficient to examine each effect in isolation, ignoring the coupling; regularly it is precisely the coupling that needs to be considered
• Example: heat exchanger
  – Fluid flow inside of the pipe, heated by combustion gases outside. From the point of view of flow analysis, two domains are “independent” of each other
  – Even in a trivial case, coupling exists: material properties are a function of temperature and heat transfer is the basic effect we need to consider
  – Adding a solid component with finite heat capacity and conductivity which separates two fluids completes the system. Thus:
    ∗ Liquid flow inside the pipe (water). Navier-Stokes equations + energy equation
    ∗ Reacting mixture of combustion gases outside the pipe. Navier-Stokes equations + additional equation sets depending on interest: turbulence, combustion etc., including energy equation
    ∗ Metal pipe wall, with conductivity and heat capacity. Heat transfer equation within the pipe wall; thermal stress analysis



  – Note that the energy equation is solved in all parts in a strongly coupled manner: single equation encompassing all heat transfer physics
• From above, it follows that a number of equation sets will be solved together, with some equations covering multiple parts of the domain: fluid-structure interaction
• This does not necessarily involve only fluids and structures: we can speak of multi-physics or, more accurately: physics!
• With this in mind, “single-physics” simulations are a simplification of a complete machine, where the influence of other components is neglected or handled by prescribed boundary conditions

Components of Fluid-Structure Interaction Simulation
• In order to perform an FSI simulation, we first need to handle each bit of physics separately: ideally in a single simulation code
• Simulations should be performed side-by-side and allow for coupling effects
• Care should be taken to isolate parts of the simulation depending on the nature of coupling and engineering judgement. Example: fan-to-afterburner analysis of a jet engine:
  – Turbo-fan
  – Compressor
  – Fuel supply and injection system
  – Combustion chambers
  – Turbine
  – Afterburner
  and
  – Fluid flow and heat transfer
  – Structural integrity: thermal and structural stresses
  – Vibration modes, natural frequencies, modes of excitation
• Analysis of the coupling allows us to judge which effects are important and which should be solved together



Choice of Model and Discretisation Method
• An additional set of problems arises from physical modelling. Example: Reynolds-averaged Navier-Stokes (RANS) for the compressor, coupled to Large Eddy Simulation (LES) for the combustion chambers
• Ideally, physical modelling and discretisation are chosen to solve the local equation set in the best possible way: if the local solution is insufficiently accurate, coupling will not be captured either
• Coupling problems will follow and can be expressed in two levels
  – Physical model coupling. Various combinations of physical models are more or less suited for coupled simulations. Decision on the mode of coupling or additional “coupling physics” is made on a case-by-case basis
    ∗ Example: magneto-hydrodynamics. Additional body force term in the momentum equation. Two-way coupling caused by magnetic effects of the conductive fluid in motion
    ∗ Example: LES to RANS turbulence model coupling. RANS requires mean turbulence properties from the upstream model; they will be provided using averaged LES data
  – Coupling discretisation models, where various or inconsistent discretisation methods are combined together. The easiest way of achieving the coupling is through data exchange on coupled boundary conditions

Coupling Data
• Consider a case of wing flutter: fluid flow around an elastic wing
  – Fluid flow creates forces on the wing surface. Since the wing is not rigid, forces result in a deflection of the wing
  – Wing deflection changes the shape of the fluid domain in the critical region: next to the wing. Details of the flow field, including lift and drag forces, change, feeding back to the interaction
  – Adding a transient effect and natural frequency of oscillation for the structure further complicates the problem
• In the example above, fluid forces are transferred to the solid, followed by transfer of displacement onto the boundary of the fluid domain
• Note that in structural simulations domain motion is determined as a part of the solution. In fluids, deformation of the domain needs to be handled separately
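The separate handling of fluid domain deformation can be illustrated on a one-dimensional mesh. In the sketch below the last node follows the displacement received from the structure, the first node stays fixed and the interior nodes are moved in proportion to their distance from the fixed end. This linear distribution and the name `deformMesh` are assumptions chosen for brevity; practical mesh motion algorithms are considerably more elaborate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Move the nodes of a 1-D fluid mesh after the wall (last node) has been
// displaced by the structure. The first node is fixed; interior nodes are
// displaced linearly between zero and the wall displacement
std::vector<double> deformMesh
(
    const std::vector<double>& nodes,
    double wallDisplacement
)
{
    assert(nodes.size() >= 2);

    const double x0 = nodes.front();
    const double length = nodes.back() - x0;
    assert(length != 0.0);

    std::vector<double> moved(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i)
    {
        moved[i] = nodes[i] + wallDisplacement*(nodes[i] - x0)/length;
    }
    return moved;
}
```

For nodes at 0, 0.5 and 1 and a wall displacement of 0.2, the nodes move to approximately 0, 0.6 and 1.2: the boundary follows the structure and the interior mesh keeps its relative spacing.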




14.2 Coupling Approach

Level of Coupling
• Some level of coupling exists in every physical situation. Engineering judgement decides if coupling is critical for the performance or can be safely neglected
• Level of coupling
  – Decoupled simulations. Each physical phenomenon can be studied in isolation, using boundary conditions or material properties to handle the dependence on external phenomena. Feed-back effects are small or limited
  – Explicit coupling approach. Two simulations are executed side-by-side, exchanging boundary data in a stationary or transient mode. Dynamic coupling effects can be captured, but with uncertainties in accuracy of simulation. Capable of simulating weakly coupled phenomena. This is currently state-of-the-art for industrial fluid-structure simulations
  – Implicit coupling: single matrix. Here, multiple physical phenomena are discretised separately and coupling is also described in an implicit manner. All matrices are combined into a single linear system and solved in a coupled manner. Block implicit solution is more stable than explicit coupling, but poses requirements on software design: need to access matrix data directly. Currently used in conjugate heat transfer simulations
  – Single equation approach. Recognising the fact that the equation sets represent identical conservation equations and only governing laws vary from material to material, we can describe the complete system as a single equation. Governing equations are rewritten in a consistent and compatible manner with a single working variable. Single equation represents the closest possible coupling. However, there are issues with consistency on interfaces and simulation accuracy in regions of rapidly changing solution (e.g. boundary layers). Resulting equations are not necessarily known in type or well behaved and may require special solution algorithms. This mode of coupling is a current research topic
• In many engineering situations, software limitations are a significant factor: when tools cannot handle all the physics or software design does not allow choice of level of coupling, we are forced to use simplifications



• In such cases, engineering judgement is used after the simulation: how can we interpret the results or study the problem in a decoupled manner?
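The difference in stability between explicit and implicit coupling described in this section can be seen on a toy problem. The model below is an assumption made purely for illustration: the “structure” is a linear spring of stiffness k, and the “fluid” exerts a force F0 - c x that drops linearly with the deflection x. Explicit coupling runs the two models one after the other and exchanges data; implicit coupling assembles both equations into one 2x2 system.

```cpp
#include <cassert>

// Hypothetical linear "fluid": force on the structure drops as it deflects
double fluidForce(double F0, double c, double x)
{
    return F0 - c*x;
}

// Hypothetical linear "structure": spring of stiffness k
double structureDeflection(double k, double F)
{
    return F/k;
}

// Explicit coupling: the two models are evaluated one after the other,
// exchanging force and deflection in every coupling iteration
double explicitCoupling(double F0, double c, double k, int nIter)
{
    double x = 0.0;
    for (int i = 0; i < nIter; ++i)
    {
        const double F = fluidForce(F0, c, x);
        x = structureDeflection(k, F);
    }
    return x;
}

// Implicit coupling: both equations form a single linear system
//     k*x - F = 0
//     c*x + F = F0
// solved here for x by Cramer's rule
double implicitCoupling(double F0, double c, double k)
{
    const double det = k*1.0 - (-1.0)*c;   // equals k + c
    assert(det != 0.0);

    return (0.0*1.0 - (-1.0)*F0)/det;      // equals F0/(k + c)
}
```

In this model the explicit iteration multiplies its error by -c/k in every exchange. For c/k < 1 both approaches agree; for c/k > 1 the explicit exchange diverges while the coupled system still returns x = F0/(k + c).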


14.3 Discretisation of FSI Systems

• FVM both sides; FEM, both sides
• FVM fluid flow + FEM stress analysis
• Data mapping and integral quantity corrections
• Single equation approach

Choice of Discretisation
• Ideally, discretisation for each set of equations is chosen for optimal accuracy and efficiency
• In cases of FSI, this would usually employ the FEM for structural analysis and FVM for fluid flow (why?)
• In explicit or implicit coupling, one needs to describe (boundary) data transfer between the two: interpolation
• Additional care is required for implicit solution: are the methods compatible and what are the properties of a coupled system?

Data Mapping
• Boundary data mapping involves interpolation. This, by necessity, includes a discretisation error: one set of data points describing a continuous field is translated into a different set
• Example: FSI, with transfer of forces from fluid to structure
  – At the completion of the fluid flow step, we can calculate forces (pressure + shear) on the wall boundary. The force is available for each boundary face of the fluid domain
  – Wall pressure represents external load onto the structure. However, the discrete representation of the structural mesh and the location of its solution points are not identical to those of the fluid mesh: interpolation is needed
  – It is critical that integral properties (total force) are preserved: typically done by global re-scaling of the profile
• In FEM, one can find terms like profile-conserving or flux-conserving interpolation. In reality, we need both
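The mapping with global re-scaling described above might be sketched as follows for a one-dimensional wall. The function names, the piecewise-linear interpolation and the clamping at the ends are assumptions for illustration. Fluid face centres `xs` are assumed to be strictly increasing, and areas are stored per face or element so that the total force is the sum of pressure times area.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Piecewise-linear interpolation of (xs, ps) at x, clamped at both ends.
// xs must be strictly increasing
double interpolateLinear
(
    const std::vector<double>& xs,
    const std::vector<double>& ps,
    double x
)
{
    assert(xs.size() == ps.size() && !xs.empty());

    if (x <= xs.front()) return ps.front();
    if (x >= xs.back()) return ps.back();

    // First source point strictly to the right of x
    const std::size_t i = static_cast<std::size_t>
    (
        std::upper_bound(xs.begin(), xs.end(), x) - xs.begin()
    );
    const double t = (x - xs[i - 1])/(xs[i] - xs[i - 1]);
    return (1.0 - t)*ps[i - 1] + t*ps[i];
}

// Integral quantity: total force = sum of pressure times area
double totalForce(const std::vector<double>& p, const std::vector<double>& area)
{
    assert(p.size() == area.size());

    double F = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i)
    {
        F += p[i]*area[i];
    }
    return F;
}

// Map fluid wall pressure onto structural points, then re-scale the mapped
// profile globally so that the total force is preserved
std::vector<double> mapPressure
(
    const std::vector<double>& xs,      // fluid face centres
    const std::vector<double>& ps,      // fluid face pressures
    const std::vector<double>& areaS,   // fluid face areas
    const std::vector<double>& xt,      // structural point locations
    const std::vector<double>& areaT    // structural element areas
)
{
    assert(xt.size() == areaT.size());

    std::vector<double> pt(xt.size());
    for (std::size_t j = 0; j < xt.size(); ++j)
    {
        pt[j] = interpolateLinear(xs, ps, xt[j]);
    }

    const double Ft = totalForce(pt, areaT);
    assert(Ft != 0.0);

    const double scale = totalForce(ps, areaS)/Ft;
    for (std::size_t j = 0; j < pt.size(); ++j)
    {
        pt[j] *= scale;
    }
    return pt;
}
```

The interpolation step keeps the shape of the profile and the re-scaling step restores the integral, which is one simple way of asking for both properties at once. A single global factor is the bluntest option: it changes every mapped value by the same ratio.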

